Diffstat (limited to 'src/runtime')
1029 files changed, 217631 insertions, 0 deletions
diff --git a/src/runtime/HACKING.md b/src/runtime/HACKING.md new file mode 100644 index 0000000..ce0b42a --- /dev/null +++ b/src/runtime/HACKING.md @@ -0,0 +1,332 @@ +This is a living document and at times it will be out of date. It is +intended to articulate how programming in the Go runtime differs from +writing normal Go. It focuses on pervasive concepts rather than +details of particular interfaces. + +Scheduler structures +==================== + +The scheduler manages three types of resources that pervade the +runtime: Gs, Ms, and Ps. It's important to understand these even if +you're not working on the scheduler. + +Gs, Ms, Ps +---------- + +A "G" is simply a goroutine. It's represented by type `g`. When a +goroutine exits, its `g` object is returned to a pool of free `g`s and +can later be reused for some other goroutine. + +An "M" is an OS thread that can be executing user Go code, runtime +code, a system call, or be idle. It's represented by type `m`. There +can be any number of Ms at a time since any number of threads may be +blocked in system calls. + +Finally, a "P" represents the resources required to execute user Go +code, such as scheduler and memory allocator state. It's represented +by type `p`. There are exactly `GOMAXPROCS` Ps. A P can be thought of +like a CPU in the OS scheduler and the contents of the `p` type like +per-CPU state. This is a good place to put state that needs to be +sharded for efficiency, but doesn't need to be per-thread or +per-goroutine. + +The scheduler's job is to match up a G (the code to execute), an M +(where to execute it), and a P (the rights and resources to execute +it). When an M stops executing user Go code, for example by entering a +system call, it returns its P to the idle P pool. In order to resume +executing user Go code, for example on return from a system call, it +must acquire a P from the idle pool. 
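Reviewer's illustration (not part of the patch): the G/M/P relationship described above can be partly observed from ordinary user code. The sketch below assumes only the public `runtime` API; `schedulerSnapshot` is a hypothetical helper name. It reports the number of Ps (the `GOMAXPROCS` setting) and the number of live Gs. The number of Ms is not exposed through this API.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// schedulerSnapshot (hypothetical helper) reports the number of Ps and the
// number of goroutines that currently exist, as seen from user code.
func schedulerSnapshot() (ps, gs int) {
	// GOMAXPROCS(0) queries the current setting without changing it.
	return runtime.GOMAXPROCS(0), runtime.NumGoroutine()
}

func main() {
	var wg sync.WaitGroup
	release := make(chan struct{})

	// Create many more Gs than there are Ps; each one blocks on the channel
	// until it is released.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release
		}()
	}

	ps, gs := schedulerSnapshot()
	fmt.Printf("Ps=%d Gs=%d\n", ps, gs)

	close(release)
	wg.Wait()
}
```

The point of the sketch is only that the count of Gs is independent of the count of Ps: the scheduler multiplexes however many goroutines exist onto at most `GOMAXPROCS` Ps at a time.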
+ +All `g`, `m`, and `p` objects are heap allocated, but are never freed, +so their memory remains type stable. As a result, the runtime can +avoid write barriers in the depths of the scheduler. + +`getg()` and `getg().m.curg` +---------------------------- + +To get the current user `g`, use `getg().m.curg`. + +`getg()` alone returns the current `g`, but when executing on the +system or signal stacks, this will return the current M's "g0" or +"gsignal", respectively. This is usually not what you want. + +To determine if you're running on the user stack or the system stack, +use `getg() == getg().m.curg`. + +Stacks +====== + +Every non-dead G has a *user stack* associated with it, which is what +user Go code executes on. User stacks start small (e.g., 2K) and grow +or shrink dynamically. + +Every M has a *system stack* associated with it (also known as the M's +"g0" stack because it's implemented as a stub G) and, on Unix +platforms, a *signal stack* (also known as the M's "gsignal" stack). +System and signal stacks cannot grow, but are large enough to execute +runtime and cgo code (8K in a pure Go binary; system-allocated in a +cgo binary). + +Runtime code often temporarily switches to the system stack using +`systemstack`, `mcall`, or `asmcgocall` to perform tasks that must not +be preempted, that must not grow the user stack, or that switch user +goroutines. Code running on the system stack is implicitly +non-preemptible and the garbage collector does not scan system stacks. +While running on the system stack, the current user stack is not used +for execution. + +nosplit functions +----------------- + +Most functions start with a prologue that inspects the stack pointer +and the current G's stack bound and calls `morestack` if the stack +needs to grow. + +Functions can be marked `//go:nosplit` (or `NOSPLIT` in assembly) to +indicate that they should not get this prologue. 
This has several +uses: + +- Functions that must run on the user stack, but must not call into + stack growth, for example because this would cause a deadlock, or + because they have untyped words on the stack. + +- Functions that must not be preempted on entry. + +- Functions that may run without a valid G. For example, functions + that run in early runtime start-up, or that may be entered from C + code such as cgo callbacks or the signal handler. + +Splittable functions ensure there's some amount of space on the stack +for nosplit functions to run in and the linker checks that any static +chain of nosplit function calls cannot exceed this bound. + +Any function with a `//go:nosplit` annotation should explain why it is +nosplit in its documentation comment. + +Error handling and reporting +============================ + +Errors that can reasonably be recovered from in user code should use +`panic` like usual. However, there are some situations where `panic` +will cause an immediate fatal error, such as when called on the system +stack or when called during `mallocgc`. + +Most errors in the runtime are not recoverable. For these, use +`throw`, which dumps the traceback and immediately terminates the +process. In general, `throw` should be passed a string constant to +avoid allocating in perilous situations. By convention, additional +details are printed before `throw` using `print` or `println` and the +messages are prefixed with "runtime:". + +For unrecoverable errors where user code is expected to be at fault for the +failure (such as racing map writes), use `fatal`. + +For runtime error debugging, it may be useful to run with `GOTRACEBACK=system` +or `GOTRACEBACK=crash`. The output of `panic` and `fatal` is as described by +`GOTRACEBACK`. The output of `throw` always includes runtime frames, metadata +and all goroutines regardless of `GOTRACEBACK` (i.e., equivalent to +`GOTRACEBACK=system`). Whether `throw` crashes or not is still controlled by +`GOTRACEBACK`. 
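Reviewer's illustration (not part of the patch): the distinction above between recoverable `panic` and unrecoverable `throw`/`fatal` is visible from user code. The sketch below shows only the recoverable side, using a write to a nil map, which the runtime reports as an ordinary run-time panic. `tryNilMapWrite` is a hypothetical name. The unrecoverable side (for example, racing map writes, which the text says use `fatal`) is deliberately not demonstrated, since it would terminate the process.

```go
package main

import "fmt"

// tryNilMapWrite (hypothetical helper) triggers a run-time panic by writing
// to a nil map and returns the value obtained from recover.
func tryNilMapWrite() (recovered any) {
	defer func() { recovered = recover() }()

	var m map[string]int
	m["k"] = 1 // run-time panic: assignment to entry in nil map
	return nil
}

func main() {
	if r := tryNilMapWrite(); r != nil {
		fmt.Println("recovered:", r)
	}
}
```

A deferred `recover` has no such effect on errors raised with `throw` or `fatal`; those end the process regardless of deferred functions.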
+ +Synchronization +=============== + +The runtime has multiple synchronization mechanisms. They differ in +semantics and, in particular, in whether they interact with the +goroutine scheduler or the OS scheduler. + +The simplest is `mutex`, which is manipulated using `lock` and +`unlock`. This should be used to protect shared structures for short +periods. Blocking on a `mutex` directly blocks the M, without +interacting with the Go scheduler. This means it is safe to use from +the lowest levels of the runtime, but also prevents any associated G +and P from being rescheduled. `rwmutex` is similar. + +For one-shot notifications, use `note`, which provides `notesleep` and +`notewakeup`. Unlike traditional UNIX `sleep`/`wakeup`, `note`s are +race-free, so `notesleep` returns immediately if the `notewakeup` has +already happened. A `note` can be reset after use with `noteclear`, +which must not race with a sleep or wakeup. Like `mutex`, blocking on +a `note` blocks the M. However, there are different ways to sleep on a +`note`:`notesleep` also prevents rescheduling of any associated G and +P, while `notetsleepg` acts like a blocking system call that allows +the P to be reused to run another G. This is still less efficient than +blocking the G directly since it consumes an M. + +To interact directly with the goroutine scheduler, use `gopark` and +`goready`. `gopark` parks the current goroutine—putting it in the +"waiting" state and removing it from the scheduler's run queue—and +schedules another goroutine on the current M/P. `goready` puts a +parked goroutine back in the "runnable" state and adds it to the run +queue. 
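Reviewer's illustration (not part of the patch): the race-free, one-shot semantics described above for `note` can be mimicked in user code with a channel that is closed exactly once. This is an analogy only, not how the runtime implements `note`; the `oneShot` type and its methods are hypothetical. One important difference: blocking on a channel parks the goroutine, so in terms of the summary table this sketch behaves like the park row rather than the note row.

```go
package main

import (
	"fmt"
	"sync"
)

// oneShot is a hypothetical user-level analogue of a one-shot notification.
type oneShot struct {
	once sync.Once
	ch   chan struct{}
}

func newOneShot() *oneShot {
	return &oneShot{ch: make(chan struct{})}
}

// wake delivers the notification. The sync.Once is a convenience of this
// sketch that makes repeated calls harmless.
func (o *oneShot) wake() {
	o.once.Do(func() { close(o.ch) })
}

// sleep blocks until wake has been called. If wake already happened, the
// receive on the closed channel returns immediately, so no wakeup is lost.
func (o *oneShot) sleep() {
	<-o.ch
}

func main() {
	o := newOneShot()
	o.wake()  // the wakeup happens first...
	o.sleep() // ...and the later sleep still returns immediately
	fmt.Println("woken")
}
```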
+ +In summary, + +<table> +<tr><th></th><th colspan="3">Blocks</th></tr> +<tr><th>Interface</th><th>G</th><th>M</th><th>P</th></tr> +<tr><td>(rw)mutex</td><td>Y</td><td>Y</td><td>Y</td></tr> +<tr><td>note</td><td>Y</td><td>Y</td><td>Y/N</td></tr> +<tr><td>park</td><td>Y</td><td>N</td><td>N</td></tr> +</table> + +Atomics +======= + +The runtime uses its own atomics package at `runtime/internal/atomic`. +This corresponds to `sync/atomic`, but functions have different names +for historical reasons and there are a few additional functions needed +by the runtime. + +In general, we think hard about the uses of atomics in the runtime and +try to avoid unnecessary atomic operations. If access to a variable is +sometimes protected by another synchronization mechanism, the +already-protected accesses generally don't need to be atomic. There +are several reasons for this: + +1. Using non-atomic or atomic access where appropriate makes the code + more self-documenting. Atomic access to a variable implies there's + somewhere else that may concurrently access the variable. + +2. Non-atomic access allows for automatic race detection. The runtime + doesn't currently have a race detector, but it may in the future. + Atomic access defeats the race detector, while non-atomic access + allows the race detector to check your assumptions. + +3. Non-atomic access may improve performance. + +Of course, any non-atomic access to a shared variable should be +documented to explain how that access is protected. + +Some common patterns that mix atomic and non-atomic access are: + +* Read-mostly variables where updates are protected by a lock. Within + the locked region, reads do not need to be atomic, but the write + does. Outside the locked region, reads need to be atomic. + +* Reads that only happen during STW, where no writes can happen during + STW, do not need to be atomic. + +That said, the advice from the Go memory model stands: "Don't be +[too] clever." 
The performance of the runtime matters, but its +robustness matters more. + +Unmanaged memory +================ + +In general, the runtime tries to use regular heap allocation. However, +in some cases the runtime must allocate objects outside of the garbage +collected heap, in *unmanaged memory*. This is necessary if the +objects are part of the memory manager itself or if they must be +allocated in situations where the caller may not have a P. + +There are three mechanisms for allocating unmanaged memory: + +* sysAlloc obtains memory directly from the OS. This comes in whole + multiples of the system page size, but it can be freed with sysFree. + +* persistentalloc combines multiple smaller allocations into a single + sysAlloc to avoid fragmentation. However, there is no way to free + persistentalloced objects (hence the name). + +* fixalloc is a SLAB-style allocator that allocates objects of a fixed + size. fixalloced objects can be freed, but this memory can only be + reused by the same fixalloc pool, so it can only be reused for + objects of the same type. + +In general, types that are allocated using any of these should be +marked as not in heap by embedding `runtime/internal/sys.NotInHeap`. + +Objects that are allocated in unmanaged memory **must not** contain +heap pointers unless the following rules are also obeyed: + +1. Any pointers from unmanaged memory to the heap must be garbage + collection roots. More specifically, any pointer must either be + accessible through a global variable or be added as an explicit + garbage collection root in `runtime.markroot`. + +2. If the memory is reused, the heap pointers must be zero-initialized + before they become visible as GC roots. Otherwise, the GC may + observe stale heap pointers. See "Zero-initialization versus + zeroing". 
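Reviewer's illustration (not part of the patch): the fixed-size free-list idea behind `fixalloc`, together with rule 2 above (zero pointers before reused memory becomes visible again), can be sketched in ordinary Go. Everything here is hypothetical (`node`, `fixedPool`, `alloc`, `release`), and the objects live on the regular garbage-collected heap, so this is not unmanaged memory and not the runtime's implementation. It shows only the reuse pattern.

```go
package main

import "fmt"

// node is the single fixed-size object type managed by the pool.
type node struct {
	value int
	next  *node // links free objects while they sit on the free list
}

// fixedPool is a hypothetical free-list allocator for node objects.
// Freed objects can only be reused for other node objects.
type fixedPool struct {
	free  *node
	inuse int
}

// alloc returns a zeroed node, reusing a freed one when available.
func (p *fixedPool) alloc() *node {
	n := p.free
	if n != nil {
		p.free = n.next
	} else {
		// Assumption of this sketch: fall back to the ordinary heap.
		// A real unmanaged allocator would obtain memory elsewhere.
		n = new(node)
	}
	// Clear stale contents (including the free-list pointer) before the
	// object becomes visible to the caller again.
	*n = node{}
	p.inuse++
	return n
}

// release puts n back on the free list for later reuse.
func (p *fixedPool) release(n *node) {
	n.next = p.free
	p.free = n
	p.inuse--
}

func main() {
	var p fixedPool
	a := p.alloc()
	a.value = 42
	p.release(a)

	b := p.alloc()
	fmt.Println(a == b, b.value, b.next == nil)
}
```

Because `alloc` clears the object on every reuse, a caller never observes the previous `value` or a leftover free-list link.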
+ +Zero-initialization versus zeroing +================================== + +There are two types of zeroing in the runtime, depending on whether +the memory is already initialized to a type-safe state. + +If memory is not in a type-safe state, meaning it potentially contains +"garbage" because it was just allocated and it is being initialized +for first use, then it must be *zero-initialized* using +`memclrNoHeapPointers` or non-pointer writes. This does not perform +write barriers. + +If memory is already in a type-safe state and is simply being set to +the zero value, this must be done using regular writes, `typedmemclr`, +or `memclrHasPointers`. This performs write barriers. + +Runtime-only compiler directives +================================ + +In addition to the "//go:" directives documented in "go doc compile", +the compiler supports additional directives only in the runtime. + +go:systemstack +-------------- + +`go:systemstack` indicates that a function must run on the system +stack. This is checked dynamically by a special function prologue. + +go:nowritebarrier +----------------- + +`go:nowritebarrier` directs the compiler to emit an error if the +following function contains any write barriers. (It *does not* +suppress the generation of write barriers; it is simply an assertion.) + +Usually you want `go:nowritebarrierrec`. `go:nowritebarrier` is +primarily useful in situations where it's "nice" not to have write +barriers, but not required for correctness. + +go:nowritebarrierrec and go:yeswritebarrierrec +---------------------------------------------- + +`go:nowritebarrierrec` directs the compiler to emit an error if the +following function or any function it calls recursively, up to a +`go:yeswritebarrierrec`, contains a write barrier. + +Logically, the compiler floods the call graph starting from each +`go:nowritebarrierrec` function and produces an error if it encounters +a function containing a write barrier. 
This flood stops at +`go:yeswritebarrierrec` functions. + +`go:nowritebarrierrec` is used in the implementation of the write +barrier to prevent infinite loops. + +Both directives are used in the scheduler. The write barrier requires +an active P (`getg().m.p != nil`) and scheduler code often runs +without an active P. In this case, `go:nowritebarrierrec` is used on +functions that release the P or may run without a P and +`go:yeswritebarrierrec` is used when code re-acquires an active P. +Since these are function-level annotations, code that releases or +acquires a P may need to be split across two functions. + +go:uintptrkeepalive +------------------- + +The //go:uintptrkeepalive directive must be followed by a function declaration. + +It specifies that the function's uintptr arguments may be pointer values that +have been converted to uintptr and must be kept alive for the duration of the +call, even though from the types alone it would appear that the object is no +longer needed during the call. + +This directive is similar to //go:uintptrescapes, but it does not force +arguments to escape. Since stack growth does not understand these arguments, +this directive must be used with //go:nosplit (in the marked function and all +transitive calls) to prevent stack growth. + +The conversion from pointer to uintptr must appear in the argument list of any +call to this function. This directive is used for some low-level system call +implementations. diff --git a/src/runtime/Makefile b/src/runtime/Makefile new file mode 100644 index 0000000..55087de --- /dev/null +++ b/src/runtime/Makefile @@ -0,0 +1,5 @@ +# Copyright 2009 The Go Authors. All rights reserved. +# Use of this source code is governed by a BSD-style +# license that can be found in the LICENSE file. 
+ +include ../Make.dist diff --git a/src/runtime/abi_test.go b/src/runtime/abi_test.go new file mode 100644 index 0000000..0c9488a --- /dev/null +++ b/src/runtime/abi_test.go @@ -0,0 +1,112 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build goexperiment.regabiargs + +// This file contains tests specific to making sure the register ABI +// works in a bunch of contexts in the runtime. + +package runtime_test + +import ( + "internal/abi" + "internal/testenv" + "os" + "os/exec" + "runtime" + "strings" + "testing" + "time" +) + +var regConfirmRun chan int + +//go:registerparams +func regFinalizerPointer(v *Tint) (int, float32, [10]byte) { + regConfirmRun <- *(*int)(v) + return 5151, 4.0, [10]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} +} + +//go:registerparams +func regFinalizerIface(v Tinter) (int, float32, [10]byte) { + regConfirmRun <- *(*int)(v.(*Tint)) + return 5151, 4.0, [10]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} +} + +func TestFinalizerRegisterABI(t *testing.T) { + testenv.MustHaveExec(t) + + // Actually run the test in a subprocess because we don't want + // finalizers from other tests interfering. + if os.Getenv("TEST_FINALIZER_REGABI") != "1" { + cmd := testenv.CleanCmdEnv(exec.Command(os.Args[0], "-test.run=TestFinalizerRegisterABI", "-test.v")) + cmd.Env = append(cmd.Env, "TEST_FINALIZER_REGABI=1") + out, err := cmd.CombinedOutput() + if !strings.Contains(string(out), "PASS\n") || err != nil { + t.Fatalf("%s\n(exit status %v)", string(out), err) + } + return + } + + // Optimistically clear any latent finalizers from e.g. the testing + // package before continuing. + // + // It's possible that a finalizer only becomes available to run + // after this point, which would interfere with the test and could + // cause a crash, but because we're running in a separate process + // it's extremely unlikely. 
+ runtime.GC() + runtime.GC() + + // fing will only pick the new IntRegArgs up if it's currently + // sleeping and wakes up, so wait for it to go to sleep. + success := false + for i := 0; i < 100; i++ { + if runtime.FinalizerGAsleep() { + success = true + break + } + time.Sleep(20 * time.Millisecond) + } + if !success { + t.Fatal("finalizer not asleep?") + } + + argRegsBefore := runtime.SetIntArgRegs(abi.IntArgRegs) + defer runtime.SetIntArgRegs(argRegsBefore) + + tests := []struct { + name string + fin any + confirmValue int + }{ + {"Pointer", regFinalizerPointer, -1}, + {"Interface", regFinalizerIface, -2}, + } + for i := range tests { + test := &tests[i] + t.Run(test.name, func(t *testing.T) { + regConfirmRun = make(chan int) + + x := new(Tint) + *x = (Tint)(test.confirmValue) + runtime.SetFinalizer(x, test.fin) + + runtime.KeepAlive(x) + + // Queue the finalizer. + runtime.GC() + runtime.GC() + + select { + case <-time.After(time.Second): + t.Fatal("finalizer failed to execute") + case gotVal := <-regConfirmRun: + if gotVal != test.confirmValue { + t.Fatalf("wrong finalizer executed? got %d, want %d", gotVal, test.confirmValue) + } + } + }) + } +} diff --git a/src/runtime/alg.go b/src/runtime/alg.go new file mode 100644 index 0000000..2a413ee --- /dev/null +++ b/src/runtime/alg.go @@ -0,0 +1,353 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/cpu" + "internal/goarch" + "unsafe" +) + +const ( + c0 = uintptr((8-goarch.PtrSize)/4*2860486313 + (goarch.PtrSize-4)/4*33054211828000289) + c1 = uintptr((8-goarch.PtrSize)/4*3267000013 + (goarch.PtrSize-4)/4*23344194077549503) +) + +func memhash0(p unsafe.Pointer, h uintptr) uintptr { + return h +} + +func memhash8(p unsafe.Pointer, h uintptr) uintptr { + return memhash(p, h, 1) +} + +func memhash16(p unsafe.Pointer, h uintptr) uintptr { + return memhash(p, h, 2) +} + +func memhash128(p unsafe.Pointer, h uintptr) uintptr { + return memhash(p, h, 16) +} + +//go:nosplit +func memhash_varlen(p unsafe.Pointer, h uintptr) uintptr { + ptr := getclosureptr() + size := *(*uintptr)(unsafe.Pointer(ptr + unsafe.Sizeof(h))) + return memhash(p, h, size) +} + +// runtime variable to check if the processor we're running on +// actually supports the instructions used by the AES-based +// hash implementation. +var useAeshash bool + +// in asm_*.s +func memhash(p unsafe.Pointer, h, s uintptr) uintptr +func memhash32(p unsafe.Pointer, h uintptr) uintptr +func memhash64(p unsafe.Pointer, h uintptr) uintptr +func strhash(p unsafe.Pointer, h uintptr) uintptr + +func strhashFallback(a unsafe.Pointer, h uintptr) uintptr { + x := (*stringStruct)(a) + return memhashFallback(x.str, h, uintptr(x.len)) +} + +// NOTE: Because NaN != NaN, a map can contain any +// number of (mostly useless) entries keyed with NaNs. +// To avoid long hash chains, we assign a random number +// as the hash value for a NaN. 
+ +func f32hash(p unsafe.Pointer, h uintptr) uintptr { + f := *(*float32)(p) + switch { + case f == 0: + return c1 * (c0 ^ h) // +0, -0 + case f != f: + return c1 * (c0 ^ h ^ uintptr(fastrand())) // any kind of NaN + default: + return memhash(p, h, 4) + } +} + +func f64hash(p unsafe.Pointer, h uintptr) uintptr { + f := *(*float64)(p) + switch { + case f == 0: + return c1 * (c0 ^ h) // +0, -0 + case f != f: + return c1 * (c0 ^ h ^ uintptr(fastrand())) // any kind of NaN + default: + return memhash(p, h, 8) + } +} + +func c64hash(p unsafe.Pointer, h uintptr) uintptr { + x := (*[2]float32)(p) + return f32hash(unsafe.Pointer(&x[1]), f32hash(unsafe.Pointer(&x[0]), h)) +} + +func c128hash(p unsafe.Pointer, h uintptr) uintptr { + x := (*[2]float64)(p) + return f64hash(unsafe.Pointer(&x[1]), f64hash(unsafe.Pointer(&x[0]), h)) +} + +func interhash(p unsafe.Pointer, h uintptr) uintptr { + a := (*iface)(p) + tab := a.tab + if tab == nil { + return h + } + t := tab._type + if t.equal == nil { + // Check hashability here. We could do this check inside + // typehash, but we want to report the topmost type in + // the error text (e.g. in a struct with a field of slice type + // we want to report the struct, not the slice). + panic(errorString("hash of unhashable type " + t.string())) + } + if isDirectIface(t) { + return c1 * typehash(t, unsafe.Pointer(&a.data), h^c0) + } else { + return c1 * typehash(t, a.data, h^c0) + } +} + +func nilinterhash(p unsafe.Pointer, h uintptr) uintptr { + a := (*eface)(p) + t := a._type + if t == nil { + return h + } + if t.equal == nil { + // See comment in interhash above. + panic(errorString("hash of unhashable type " + t.string())) + } + if isDirectIface(t) { + return c1 * typehash(t, unsafe.Pointer(&a.data), h^c0) + } else { + return c1 * typehash(t, a.data, h^c0) + } +} + +// typehash computes the hash of the object of type t at address p. +// h is the seed. +// This function is seldom used. 
Most maps use for hashing either +// fixed functions (e.g. f32hash) or compiler-generated functions +// (e.g. for a type like struct { x, y string }). This implementation +// is slower but more general and is used for hashing interface types +// (called from interhash or nilinterhash, above) or for hashing in +// maps generated by reflect.MapOf (reflect_typehash, below). +// Note: this function must match the compiler generated +// functions exactly. See issue 37716. +func typehash(t *_type, p unsafe.Pointer, h uintptr) uintptr { + if t.tflag&tflagRegularMemory != 0 { + // Handle ptr sizes specially, see issue 37086. + switch t.size { + case 4: + return memhash32(p, h) + case 8: + return memhash64(p, h) + default: + return memhash(p, h, t.size) + } + } + switch t.kind & kindMask { + case kindFloat32: + return f32hash(p, h) + case kindFloat64: + return f64hash(p, h) + case kindComplex64: + return c64hash(p, h) + case kindComplex128: + return c128hash(p, h) + case kindString: + return strhash(p, h) + case kindInterface: + i := (*interfacetype)(unsafe.Pointer(t)) + if len(i.mhdr) == 0 { + return nilinterhash(p, h) + } + return interhash(p, h) + case kindArray: + a := (*arraytype)(unsafe.Pointer(t)) + for i := uintptr(0); i < a.len; i++ { + h = typehash(a.elem, add(p, i*a.elem.size), h) + } + return h + case kindStruct: + s := (*structtype)(unsafe.Pointer(t)) + for _, f := range s.fields { + if f.name.isBlank() { + continue + } + h = typehash(f.typ, add(p, f.offset), h) + } + return h + default: + // Should never happen, as typehash should only be called + // with comparable types. 
+ panic(errorString("hash of unhashable type " + t.string())) + } +} + +//go:linkname reflect_typehash reflect.typehash +func reflect_typehash(t *_type, p unsafe.Pointer, h uintptr) uintptr { + return typehash(t, p, h) +} + +func memequal0(p, q unsafe.Pointer) bool { + return true +} +func memequal8(p, q unsafe.Pointer) bool { + return *(*int8)(p) == *(*int8)(q) +} +func memequal16(p, q unsafe.Pointer) bool { + return *(*int16)(p) == *(*int16)(q) +} +func memequal32(p, q unsafe.Pointer) bool { + return *(*int32)(p) == *(*int32)(q) +} +func memequal64(p, q unsafe.Pointer) bool { + return *(*int64)(p) == *(*int64)(q) +} +func memequal128(p, q unsafe.Pointer) bool { + return *(*[2]int64)(p) == *(*[2]int64)(q) +} +func f32equal(p, q unsafe.Pointer) bool { + return *(*float32)(p) == *(*float32)(q) +} +func f64equal(p, q unsafe.Pointer) bool { + return *(*float64)(p) == *(*float64)(q) +} +func c64equal(p, q unsafe.Pointer) bool { + return *(*complex64)(p) == *(*complex64)(q) +} +func c128equal(p, q unsafe.Pointer) bool { + return *(*complex128)(p) == *(*complex128)(q) +} +func strequal(p, q unsafe.Pointer) bool { + return *(*string)(p) == *(*string)(q) +} +func interequal(p, q unsafe.Pointer) bool { + x := *(*iface)(p) + y := *(*iface)(q) + return x.tab == y.tab && ifaceeq(x.tab, x.data, y.data) +} +func nilinterequal(p, q unsafe.Pointer) bool { + x := *(*eface)(p) + y := *(*eface)(q) + return x._type == y._type && efaceeq(x._type, x.data, y.data) +} +func efaceeq(t *_type, x, y unsafe.Pointer) bool { + if t == nil { + return true + } + eq := t.equal + if eq == nil { + panic(errorString("comparing uncomparable type " + t.string())) + } + if isDirectIface(t) { + // Direct interface types are ptr, chan, map, func, and single-element structs/arrays thereof. + // Maps and funcs are not comparable, so they can't reach here. + // Ptrs, chans, and single-element items can be compared directly using ==. 
+ return x == y + } + return eq(x, y) +} +func ifaceeq(tab *itab, x, y unsafe.Pointer) bool { + if tab == nil { + return true + } + t := tab._type + eq := t.equal + if eq == nil { + panic(errorString("comparing uncomparable type " + t.string())) + } + if isDirectIface(t) { + // See comment in efaceeq. + return x == y + } + return eq(x, y) +} + +// Testing adapters for hash quality tests (see hash_test.go) +func stringHash(s string, seed uintptr) uintptr { + return strhash(noescape(unsafe.Pointer(&s)), seed) +} + +func bytesHash(b []byte, seed uintptr) uintptr { + s := (*slice)(unsafe.Pointer(&b)) + return memhash(s.array, seed, uintptr(s.len)) +} + +func int32Hash(i uint32, seed uintptr) uintptr { + return memhash32(noescape(unsafe.Pointer(&i)), seed) +} + +func int64Hash(i uint64, seed uintptr) uintptr { + return memhash64(noescape(unsafe.Pointer(&i)), seed) +} + +func efaceHash(i any, seed uintptr) uintptr { + return nilinterhash(noescape(unsafe.Pointer(&i)), seed) +} + +func ifaceHash(i interface { + F() +}, seed uintptr) uintptr { + return interhash(noescape(unsafe.Pointer(&i)), seed) +} + +const hashRandomBytes = goarch.PtrSize / 4 * 64 + +// used in asm_{386,amd64,arm64}.s to seed the hash function +var aeskeysched [hashRandomBytes]byte + +// used in hash{32,64}.go to seed the hash function +var hashkey [4]uintptr + +func alginit() { + // Install AES hash algorithms if the instructions needed are present. + if (GOARCH == "386" || GOARCH == "amd64") && + cpu.X86.HasAES && // AESENC + cpu.X86.HasSSSE3 && // PSHUFB + cpu.X86.HasSSE41 { // PINSR{D,Q} + initAlgAES() + return + } + if GOARCH == "arm64" && cpu.ARM64.HasAES { + initAlgAES() + return + } + getRandomData((*[len(hashkey) * goarch.PtrSize]byte)(unsafe.Pointer(&hashkey))[:]) + hashkey[0] |= 1 // make sure these numbers are odd + hashkey[1] |= 1 + hashkey[2] |= 1 + hashkey[3] |= 1 +} + +func initAlgAES() { + useAeshash = true + // Initialize with random data so hash collisions will be hard to engineer. 
+ getRandomData(aeskeysched[:]) +} + +// Note: These routines perform the read with a native endianness. +func readUnaligned32(p unsafe.Pointer) uint32 { + q := (*[4]byte)(p) + if goarch.BigEndian { + return uint32(q[3]) | uint32(q[2])<<8 | uint32(q[1])<<16 | uint32(q[0])<<24 + } + return uint32(q[0]) | uint32(q[1])<<8 | uint32(q[2])<<16 | uint32(q[3])<<24 +} + +func readUnaligned64(p unsafe.Pointer) uint64 { + q := (*[8]byte)(p) + if goarch.BigEndian { + return uint64(q[7]) | uint64(q[6])<<8 | uint64(q[5])<<16 | uint64(q[4])<<24 | + uint64(q[3])<<32 | uint64(q[2])<<40 | uint64(q[1])<<48 | uint64(q[0])<<56 + } + return uint64(q[0]) | uint64(q[1])<<8 | uint64(q[2])<<16 | uint64(q[3])<<24 | uint64(q[4])<<32 | uint64(q[5])<<40 | uint64(q[6])<<48 | uint64(q[7])<<56 +} diff --git a/src/runtime/align_runtime_test.go b/src/runtime/align_runtime_test.go new file mode 100644 index 0000000..d78b0b2 --- /dev/null +++ b/src/runtime/align_runtime_test.go @@ -0,0 +1,51 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file lives in the runtime package +// so we can get access to the runtime guts. +// The rest of the implementation of this test is in align_test.go. + +package runtime + +import "unsafe" + +// AtomicFields is the set of fields on which we perform 64-bit atomic +// operations (all the *64 operations in runtime/internal/atomic). 
+var AtomicFields = []uintptr{ + unsafe.Offsetof(m{}.procid), + unsafe.Offsetof(p{}.gcFractionalMarkTime), + unsafe.Offsetof(profBuf{}.overflow), + unsafe.Offsetof(profBuf{}.overflowTime), + unsafe.Offsetof(heapStatsDelta{}.tinyAllocCount), + unsafe.Offsetof(heapStatsDelta{}.smallAllocCount), + unsafe.Offsetof(heapStatsDelta{}.smallFreeCount), + unsafe.Offsetof(heapStatsDelta{}.largeAlloc), + unsafe.Offsetof(heapStatsDelta{}.largeAllocCount), + unsafe.Offsetof(heapStatsDelta{}.largeFree), + unsafe.Offsetof(heapStatsDelta{}.largeFreeCount), + unsafe.Offsetof(heapStatsDelta{}.committed), + unsafe.Offsetof(heapStatsDelta{}.released), + unsafe.Offsetof(heapStatsDelta{}.inHeap), + unsafe.Offsetof(heapStatsDelta{}.inStacks), + unsafe.Offsetof(heapStatsDelta{}.inPtrScalarBits), + unsafe.Offsetof(heapStatsDelta{}.inWorkBufs), + unsafe.Offsetof(lfnode{}.next), + unsafe.Offsetof(mstats{}.last_gc_nanotime), + unsafe.Offsetof(mstats{}.last_gc_unix), + unsafe.Offsetof(workType{}.bytesMarked), +} + +// AtomicVariables is the set of global variables on which we perform +// 64-bit atomic operations. +var AtomicVariables = []unsafe.Pointer{ + unsafe.Pointer(&ncgocall), + unsafe.Pointer(&test_z64), + unsafe.Pointer(&blockprofilerate), + unsafe.Pointer(&mutexprofilerate), + unsafe.Pointer(&gcController), + unsafe.Pointer(&memstats), + unsafe.Pointer(&sched), + unsafe.Pointer(&ticks), + unsafe.Pointer(&work), +} diff --git a/src/runtime/align_test.go b/src/runtime/align_test.go new file mode 100644 index 0000000..5f225d6 --- /dev/null +++ b/src/runtime/align_test.go @@ -0,0 +1,200 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "go/ast" + "go/build" + "go/importer" + "go/parser" + "go/printer" + "go/token" + "go/types" + "internal/testenv" + "os" + "regexp" + "runtime" + "strings" + "testing" +) + +// Check that 64-bit fields on which we apply atomic operations +// are aligned to 8 bytes. This can be a problem on 32-bit systems. +func TestAtomicAlignment(t *testing.T) { + testenv.MustHaveGoBuild(t) // go command needed to resolve std .a files for importer.Default(). + + // Read the code making the tables above, to see which fields and + // variables we are currently checking. + checked := map[string]bool{} + x, err := os.ReadFile("./align_runtime_test.go") + if err != nil { + t.Fatalf("read failed: %v", err) + } + fieldDesc := map[int]string{} + r := regexp.MustCompile(`unsafe[.]Offsetof[(](\w+){}[.](\w+)[)]`) + matches := r.FindAllStringSubmatch(string(x), -1) + for i, v := range matches { + checked["field runtime."+v[1]+"."+v[2]] = true + fieldDesc[i] = v[1] + "." + v[2] + } + varDesc := map[int]string{} + r = regexp.MustCompile(`unsafe[.]Pointer[(]&(\w+)[)]`) + matches = r.FindAllStringSubmatch(string(x), -1) + for i, v := range matches { + checked["var "+v[1]] = true + varDesc[i] = v[1] + } + + // Check all of our alignemnts. This is the actual core of the test. + for i, d := range runtime.AtomicFields { + if d%8 != 0 { + t.Errorf("field alignment of %s failed: offset is %d", fieldDesc[i], d) + } + } + for i, p := range runtime.AtomicVariables { + if uintptr(p)%8 != 0 { + t.Errorf("variable alignment of %s failed: address is %x", varDesc[i], p) + } + } + + // The code above is the actual test. The code below attempts to check + // that the tables used by the code above are exhaustive. + + // Parse the whole runtime package, checking that arguments of + // appropriate atomic operations are in the list above. 
+ fset := token.NewFileSet() + m, err := parser.ParseDir(fset, ".", nil, 0) + if err != nil { + t.Fatalf("parsing runtime failed: %v", err) + } + pkg := m["runtime"] // Note: ignore runtime_test and main packages + + // Filter files by those for the current architecture/os being tested. + fileMap := map[string]bool{} + for _, f := range buildableFiles(t, ".") { + fileMap[f] = true + } + var files []*ast.File + for fname, f := range pkg.Files { + if fileMap[fname] { + files = append(files, f) + } + } + + // Call go/types to analyze the runtime package. + var info types.Info + info.Types = map[ast.Expr]types.TypeAndValue{} + conf := types.Config{Importer: importer.Default()} + _, err = conf.Check("runtime", fset, files, &info) + if err != nil { + t.Fatalf("typechecking runtime failed: %v", err) + } + + // Analyze all atomic.*64 callsites. + v := Visitor{t: t, fset: fset, types: info.Types, checked: checked} + ast.Walk(&v, pkg) +} + +type Visitor struct { + fset *token.FileSet + types map[ast.Expr]types.TypeAndValue + checked map[string]bool + t *testing.T +} + +func (v *Visitor) Visit(n ast.Node) ast.Visitor { + c, ok := n.(*ast.CallExpr) + if !ok { + return v + } + f, ok := c.Fun.(*ast.SelectorExpr) + if !ok { + return v + } + p, ok := f.X.(*ast.Ident) + if !ok { + return v + } + if p.Name != "atomic" { + return v + } + if !strings.HasSuffix(f.Sel.Name, "64") { + return v + } + + a := c.Args[0] + + // This is a call to atomic.XXX64(a, ...). Make sure a is aligned to 8 bytes. + // XXX = one of Load, Store, Cas, etc. + // The arg we care about the alignment of is always the first one. + + if u, ok := a.(*ast.UnaryExpr); ok && u.Op == token.AND { + v.checkAddr(u.X) + return v + } + + // Other cases there's nothing we can check. Assume we're ok. + v.t.Logf("unchecked atomic operation %s %v", v.fset.Position(n.Pos()), v.print(n)) + + return v +} + +// checkAddr checks to make sure n is a properly aligned address for a 64-bit atomic operation. 
+func (v *Visitor) checkAddr(n ast.Node) { + switch n := n.(type) { + case *ast.IndexExpr: + // Alignment of an array element is the same as the whole array. + v.checkAddr(n.X) + return + case *ast.Ident: + key := "var " + v.print(n) + if !v.checked[key] { + v.t.Errorf("unchecked variable %s %s", v.fset.Position(n.Pos()), key) + } + return + case *ast.SelectorExpr: + t := v.types[n.X].Type + if t == nil { + // Not sure what is happening here, go/types fails to + // type the selector arg on some platforms. + return + } + if p, ok := t.(*types.Pointer); ok { + // Note: we assume here that the pointer p in p.foo is properly + // aligned. We just check that foo is at a properly aligned offset. + t = p.Elem() + } else { + v.checkAddr(n.X) + } + if t.Underlying() == t { + v.t.Errorf("analysis can't handle unnamed type %s %v", v.fset.Position(n.Pos()), t) + } + key := "field " + t.String() + "." + n.Sel.Name + if !v.checked[key] { + v.t.Errorf("unchecked field %s %s", v.fset.Position(n.Pos()), key) + } + default: + v.t.Errorf("unchecked atomic address %s %v", v.fset.Position(n.Pos()), v.print(n)) + + } +} + +func (v *Visitor) print(n ast.Node) string { + var b strings.Builder + printer.Fprint(&b, v.fset, n) + return b.String() +} + +// buildableFiles returns the list of files in the given directory +// that are actually used for the build, given GOOS/GOARCH restrictions. +func buildableFiles(t *testing.T, dir string) []string { + ctxt := build.Default + ctxt.CgoEnabled = true + pkg, err := ctxt.ImportDir(dir, 0) + if err != nil { + t.Fatalf("can't find buildable files: %v", err) + } + return pkg.GoFiles +} diff --git a/src/runtime/arena.go b/src/runtime/arena.go new file mode 100644 index 0000000..c338d30 --- /dev/null +++ b/src/runtime/arena.go @@ -0,0 +1,1003 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Implementation of (safe) user arenas. 
+// +// This file contains the implementation of user arenas wherein Go values can +// be manually allocated and freed in bulk. The act of manually freeing memory, +// potentially before a GC cycle, means that a garbage collection cycle can be +// delayed, improving efficiency by reducing GC cycle frequency. There are other +// potential efficiency benefits, such as improved locality and access to a more +// efficient allocation strategy. +// +// What makes the arenas here safe is that once they are freed, accessing the +// arena's memory will cause an explicit program fault, and the arena's address +// space will not be reused until no more pointers into it are found. There's one +// exception to this: if an arena allocated memory that isn't exhausted, it's placed +// back into a pool for reuse. This means that a crash is not always guaranteed. +// +// While this may seem unsafe, it still prevents memory corruption, and is in fact +// necessary in order to make new(T) a valid implementation of arenas. Such a property +// is desirable to allow for a trivial implementation. (It also avoids complexities +// that arise from synchronization with the GC when trying to set the arena chunks to +// fault while the GC is active.) +// +// The implementation works in layers. At the bottom, arenas are managed in chunks. +// Each chunk must be a multiple of the heap arena size, or the heap arena size must +// be divisible by the arena chunk size. The address space for each chunk, and each +// corresponding heapArena for that address space, are eternally reserved for use as +// arena chunks. That is, they can never be used for the general heap. Each chunk +// is also represented by a single mspan, and is modeled as a single large heap +// allocation. It must be, because each chunk contains ordinary Go values that may +// point into the heap, so it must be scanned just like any other object.
Any +// pointer into a chunk will therefore always cause the whole chunk to be scanned +// while its corresponding arena is still live. +// +// Chunks may be allocated either from new memory mapped by the OS on our behalf, +// or by reusing old freed chunks. When chunks are freed, their underlying memory +// is returned to the OS, set to fault on access, and may not be reused until the +// program doesn't point into the chunk anymore (the code refers to this state as +// "quarantined"), a property checked by the GC. +// +// The sweeper handles moving chunks out of this quarantine state to be ready for +// reuse. When the chunk is placed into the quarantine state, its corresponding +// span is marked as noscan so that the GC doesn't try to scan memory that would +// cause a fault. +// +// At the next layer are the user arenas themselves. They consist of a single +// active chunk which new Go values are bump-allocated into and a list of chunks +// that were exhausted when allocating into the arena. Once the arena is freed, +// it frees all full chunks it references, and places the active one onto a reuse +// list for a future arena to use. Each arena keeps its list of referenced chunks +// explicitly live until it is freed. Each user arena also maps to an object which +// has a finalizer attached that ensures the arena's chunks are all freed even if +// the arena itself is never explicitly freed. +// +// Pointer-ful memory is bump-allocated from low addresses to high addresses in each +// chunk, while pointer-free memory is bump-allocated from high address to low +// addresses. The reason for this is to take advantage of a GC optimization wherein +// the GC will stop scanning an object when there are no more pointers in it, which +// also allows us to elide clearing the heap bitmap for pointer-free Go values +// allocated into arenas. +// +// Note that arenas are not safe to use concurrently. +// +// In summary, there are 2 resources: arenas, and arena chunks. 
They exist in the +// following lifecycle: +// +// (1) A new arena is created via newArena. +// (2) Chunks are allocated to hold memory allocated into the arena with new or slice. +// (a) Chunks are first allocated from the reuse list of partially-used chunks. +// (b) If there are no such chunks, then chunks on the ready list are taken. +// (c) Failing all the above, memory for a new chunk is mapped. +// (3) The arena is freed, or all references to it are dropped, triggering its finalizer. +// (a) If the GC is not active, exhausted chunks are set to fault and placed on a +// quarantine list. +// (b) If the GC is active, exhausted chunks are placed on a fault list and will +// go through step (a) at a later point in time. +// (c) Any remaining partially-used chunk is placed on a reuse list. +// (4) Once no more pointers are found into quarantined arena chunks, the sweeper +// takes these chunks out of quarantine and places them on the ready list. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/math" + "unsafe" +) + +// Functions starting with arena_ are meant to be exported to downstream users +// of arenas. They should wrap these functions in a higher-level API. +// +// The underlying arena and its resources are managed through an opaque unsafe.Pointer. + +// arena_newArena is a wrapper around newUserArena. +// +//go:linkname arena_newArena arena.runtime_arena_newArena +func arena_newArena() unsafe.Pointer { + return unsafe.Pointer(newUserArena()) +} + +// arena_arena_New is a wrapper around (*userArena).new, except that typ +// is an any (must be a *_type, still) and typ must be a type descriptor +// for a pointer to the type to actually be allocated, i.e. pass a *T +// to allocate a T. This is necessary because this function returns a *T.
+// +//go:linkname arena_arena_New arena.runtime_arena_arena_New +func arena_arena_New(arena unsafe.Pointer, typ any) any { + t := (*_type)(efaceOf(&typ).data) + if t.kind&kindMask != kindPtr { + throw("arena_New: non-pointer type") + } + te := (*ptrtype)(unsafe.Pointer(t)).elem + x := ((*userArena)(arena)).new(te) + var result any + e := efaceOf(&result) + e._type = t + e.data = x + return result +} + +// arena_arena_Slice is a wrapper around (*userArena).slice. +// +//go:linkname arena_arena_Slice arena.runtime_arena_arena_Slice +func arena_arena_Slice(arena unsafe.Pointer, slice any, cap int) { + ((*userArena)(arena)).slice(slice, cap) +} + +// arena_arena_Free is a wrapper around (*userArena).free. +// +//go:linkname arena_arena_Free arena.runtime_arena_arena_Free +func arena_arena_Free(arena unsafe.Pointer) { + ((*userArena)(arena)).free() +} + +// arena_heapify takes a value that lives in an arena and makes a copy +// of it on the heap. Values that don't live in an arena are returned unmodified. +// +//go:linkname arena_heapify arena.runtime_arena_heapify +func arena_heapify(s any) any { + var v unsafe.Pointer + e := efaceOf(&s) + t := e._type + switch t.kind & kindMask { + case kindString: + v = stringStructOf((*string)(e.data)).str + case kindSlice: + v = (*slice)(e.data).array + case kindPtr: + v = e.data + default: + panic("arena: Clone only supports pointers, slices, and strings") + } + span := spanOf(uintptr(v)) + if span == nil || !span.isUserArenaChunk { + // Not stored in a user arena chunk. + return s + } + // Heap-allocate storage for a copy. 
+ var x any + switch t.kind & kindMask { + case kindString: + s1 := s.(string) + s2, b := rawstring(len(s1)) + copy(b, s1) + x = s2 + case kindSlice: + len := (*slice)(e.data).len + et := (*slicetype)(unsafe.Pointer(t)).elem + sl := new(slice) + *sl = slice{makeslicecopy(et, len, len, (*slice)(e.data).array), len, len} + xe := efaceOf(&x) + xe._type = t + xe.data = unsafe.Pointer(sl) + case kindPtr: + et := (*ptrtype)(unsafe.Pointer(t)).elem + e2 := newobject(et) + typedmemmove(et, e2, e.data) + xe := efaceOf(&x) + xe._type = t + xe.data = e2 + } + return x +} + +const ( + // userArenaChunkBytes is the size of a user arena chunk. + userArenaChunkBytesMax = 8 << 20 + userArenaChunkBytes = uintptr(int64(userArenaChunkBytesMax-heapArenaBytes)&(int64(userArenaChunkBytesMax-heapArenaBytes)>>63) + heapArenaBytes) // min(userArenaChunkBytesMax, heapArenaBytes) + + // userArenaChunkPages is the number of pages a user arena chunk uses. + userArenaChunkPages = userArenaChunkBytes / pageSize + + // userArenaChunkMaxAllocBytes is the maximum size of an object that can + // be allocated from an arena. This number is chosen to cap worst-case + // fragmentation of user arenas to 25%. Larger allocations are redirected + // to the heap. 
+ userArenaChunkMaxAllocBytes = userArenaChunkBytes / 4 +) + +func init() { + if userArenaChunkPages*pageSize != userArenaChunkBytes { + throw("user arena chunk size is not a multiple of the page size") + } + if userArenaChunkBytes%physPageSize != 0 { + throw("user arena chunk size is not a multiple of the physical page size") + } + if userArenaChunkBytes < heapArenaBytes { + if heapArenaBytes%userArenaChunkBytes != 0 { + throw("user arena chunk size is smaller than a heap arena, but doesn't divide it") + } + } else { + if userArenaChunkBytes%heapArenaBytes != 0 { + throw("user arena chunk size is larger than a heap arena, but not a multiple") + } + } + lockInit(&userArenaState.lock, lockRankUserArenaState) +} + +type userArena struct { + // fullList is a list of full chunks that do not have enough free memory left, and + // that we'll free once this user arena is freed. + // + // Can't use mSpanList here because it's not-in-heap. + fullList *mspan + + // active is the user arena chunk we're currently allocating into. + active *mspan + + // refs is a set of references to the arena chunks so that they're kept alive. + // + // The last reference in the list always refers to active, while the rest of + // them correspond to fullList. Specifically, the head of fullList is the + // second-to-last one, fullList.next is the third-to-last, and so on. + // + // In other words, every time a new chunk becomes active, it's appended to this + // list. + refs []unsafe.Pointer + + // defunct is true if free has been called on this arena. + // + // This is just a best-effort way to discover a concurrent allocation + // and free. Also used to detect a double-free. + defunct atomic.Bool +} + +// newUserArena creates a new userArena ready to be used.
+func newUserArena() *userArena { + a := new(userArena) + SetFinalizer(a, func(a *userArena) { + // If arena handle is dropped without being freed, then call + // free on the arena, so the arena chunks are never reclaimed + // by the garbage collector. + a.free() + }) + a.refill() + return a +} + +// new allocates a new object of the provided type into the arena, and returns +// its pointer. +// +// This operation is not safe to call concurrently with other operations on the +// same arena. +func (a *userArena) new(typ *_type) unsafe.Pointer { + return a.alloc(typ, -1) +} + +// slice allocates a new slice backing store. slice must be a pointer to a slice +// (i.e. *[]T), because userArenaSlice will update the slice directly. +// +// cap determines the capacity of the slice backing store and must be non-negative. +// +// This operation is not safe to call concurrently with other operations on the +// same arena. +func (a *userArena) slice(sl any, cap int) { + if cap < 0 { + panic("userArena.slice: negative cap") + } + i := efaceOf(&sl) + typ := i._type + if typ.kind&kindMask != kindPtr { + panic("slice result of non-ptr type") + } + typ = (*ptrtype)(unsafe.Pointer(typ)).elem + if typ.kind&kindMask != kindSlice { + panic("slice of non-ptr-to-slice type") + } + typ = (*slicetype)(unsafe.Pointer(typ)).elem + // t is now the element type of the slice we want to allocate. + + *((*slice)(i.data)) = slice{a.alloc(typ, cap), cap, cap} +} + +// free returns the userArena's chunks back to mheap and marks it as defunct. +// +// Must be called at most once for any given arena. +// +// This operation is not safe to call concurrently with other operations on the +// same arena. +func (a *userArena) free() { + // Check for a double-free. + if a.defunct.Load() { + panic("arena double free") + } + + // Mark ourselves as defunct. + a.defunct.Store(true) + SetFinalizer(a, nil) + + // Free all the full arenas. 
+ // + // The refs on this list are in reverse order from the second-to-last. + s := a.fullList + i := len(a.refs) - 2 + for s != nil { + a.fullList = s.next + s.next = nil + freeUserArenaChunk(s, a.refs[i]) + s = a.fullList + i-- + } + if a.fullList != nil || i >= 0 { + // There's still something left on the full list, or we + // failed to actually iterate over the entire refs list. + throw("full list doesn't match refs list in length") + } + + // Put the active chunk onto the reuse list. + // + // Note that active's reference is always the last reference in refs. + s = a.active + if s != nil { + if raceenabled || msanenabled || asanenabled { + // Don't reuse arenas with sanitizers enabled. We want to catch + // any use-after-free errors aggressively. + freeUserArenaChunk(s, a.refs[len(a.refs)-1]) + } else { + lock(&userArenaState.lock) + userArenaState.reuse = append(userArenaState.reuse, liveUserArenaChunk{s, a.refs[len(a.refs)-1]}) + unlock(&userArenaState.lock) + } + } + // nil out a.active so that a race with freeing will more likely cause a crash. + a.active = nil + a.refs = nil +} + +// alloc reserves space in the current chunk or calls refill and reserves space +// in a new chunk. If cap is negative, the type will be taken literally, otherwise +// it will be considered as an element type for a slice backing store with capacity +// cap. +func (a *userArena) alloc(typ *_type, cap int) unsafe.Pointer { + s := a.active + var x unsafe.Pointer + for { + x = s.userArenaNextFree(typ, cap) + if x != nil { + break + } + s = a.refill() + } + return x +} + +// refill inserts the current arena chunk onto the full list and obtains a new +// one, either from the partial list or allocating a new one, both from mheap. +func (a *userArena) refill() *mspan { + // If there's an active chunk, assume it's full. 
+ s := a.active + if s != nil { + if s.userArenaChunkFree.size() > userArenaChunkMaxAllocBytes { + // It's difficult to tell when we're actually out of memory + // in a chunk because the allocation that failed may still leave + // some free space available. However, that amount of free space + // should never exceed the maximum allocation size. + throw("wasted too much memory in an arena chunk") + } + s.next = a.fullList + a.fullList = s + a.active = nil + s = nil + } + var x unsafe.Pointer + + // Check the partially-used list. + lock(&userArenaState.lock) + if len(userArenaState.reuse) > 0 { + // Pick off the last arena chunk from the list. + n := len(userArenaState.reuse) - 1 + x = userArenaState.reuse[n].x + s = userArenaState.reuse[n].mspan + userArenaState.reuse[n].x = nil + userArenaState.reuse[n].mspan = nil + userArenaState.reuse = userArenaState.reuse[:n] + } + unlock(&userArenaState.lock) + if s == nil { + // Allocate a new one. + x, s = newUserArenaChunk() + if s == nil { + throw("out of memory") + } + } + a.refs = append(a.refs, x) + a.active = s + return s +} + +type liveUserArenaChunk struct { + *mspan // Must represent a user arena chunk. + + // Reference to mspan.base() to keep the chunk alive. + x unsafe.Pointer +} + +var userArenaState struct { + lock mutex + + // reuse contains a list of partially-used and already-live + // user arena chunks that can be quickly reused for another + // arena. + // + // Protected by lock. + reuse []liveUserArenaChunk + + // fault contains full user arena chunks that need to be faulted. + // + // Protected by lock. + fault []liveUserArenaChunk +} + +// userArenaNextFree reserves space in the user arena for an item of the specified +// type. If cap is not -1, this is for an array of cap elements of type t. +func (s *mspan) userArenaNextFree(typ *_type, cap int) unsafe.Pointer { + size := typ.size + if cap > 0 { + if size > ^uintptr(0)/uintptr(cap) { + // Overflow. 
+ throw("out of memory") + } + size *= uintptr(cap) + } + if size == 0 || cap == 0 { + return unsafe.Pointer(&zerobase) + } + if size > userArenaChunkMaxAllocBytes { + // Redirect allocations that don't fit into a chunk well directly + // from the heap. + if cap >= 0 { + return newarray(typ, cap) + } + return newobject(typ) + } + + // Prevent preemption as we set up the space for a new object. + // + // Act like we're allocating. + mp := acquirem() + if mp.mallocing != 0 { + throw("malloc deadlock") + } + if mp.gsignal == getg() { + throw("malloc during signal") + } + mp.mallocing = 1 + + var ptr unsafe.Pointer + if typ.ptrdata == 0 { + // Allocate pointer-less objects from the tail end of the chunk. + v, ok := s.userArenaChunkFree.takeFromBack(size, typ.align) + if ok { + ptr = unsafe.Pointer(v) + } + } else { + v, ok := s.userArenaChunkFree.takeFromFront(size, typ.align) + if ok { + ptr = unsafe.Pointer(v) + } + } + if ptr == nil { + // Failed to allocate. + mp.mallocing = 0 + releasem(mp) + return nil + } + if s.needzero != 0 { + throw("arena chunk needs zeroing, but should already be zeroed") + } + // Set up heap bitmap and do extra accounting. + if typ.ptrdata != 0 { + if cap >= 0 { + userArenaHeapBitsSetSliceType(typ, cap, ptr, s.base()) + } else { + userArenaHeapBitsSetType(typ, ptr, s.base()) + } + c := getMCache(mp) + if c == nil { + throw("mallocgc called without a P or outside bootstrapping") + } + if cap > 0 { + c.scanAlloc += size - (typ.size - typ.ptrdata) + } else { + c.scanAlloc += typ.ptrdata + } + } + + // Ensure that the stores above that initialize x to + // type-safe memory and set the heap bits occur before + // the caller can make ptr observable to the garbage + // collector. Otherwise, on weakly ordered machines, + // the garbage collector could follow a pointer to x, + // but see uninitialized memory or stale heap bits. 
+ publicationBarrier() + + mp.mallocing = 0 + releasem(mp) + + return ptr +} + +// userArenaHeapBitsSetType is the equivalent of heapBitsSetType but for +// non-slice-backing-store Go values allocated in a user arena chunk. It +// sets up the heap bitmap for the value with type typ allocated at address ptr. +// base is the base address of the arena chunk. +func userArenaHeapBitsSetType(typ *_type, ptr unsafe.Pointer, base uintptr) { + h := writeHeapBitsForAddr(uintptr(ptr)) + + // Our last allocation might have ended right at a noMorePtrs mark, + // which we would not have erased. We need to erase that mark here, + // because we're going to start adding new heap bitmap bits. + // We only need to clear one mark, because below we make sure to + // pad out the bits with zeroes and only write one noMorePtrs bit + // for each new object. + // (This is only necessary at noMorePtrs boundaries, as noMorePtrs + // marks within an object allocated with newAt will be erased by + // the normal writeHeapBitsForAddr mechanism.) + // + // Note that we skip this if this is the first allocation in the + // arena because there's definitely no previous noMorePtrs mark + // (in fact, we *must* do this, because we're going to try to back + // up a pointer to fix this up). + if uintptr(ptr)%(8*goarch.PtrSize*goarch.PtrSize) == 0 && uintptr(ptr) != base { + // Back up one pointer and rewrite that pointer. That will + // cause the writeHeapBits implementation to clear the + // noMorePtrs bit we need to clear. + r := heapBitsForAddr(uintptr(ptr)-goarch.PtrSize, goarch.PtrSize) + _, p := r.next() + b := uintptr(0) + if p == uintptr(ptr)-goarch.PtrSize { + b = 1 + } + h = writeHeapBitsForAddr(uintptr(ptr) - goarch.PtrSize) + h = h.write(b, 1) + } + + p := typ.gcdata // start of 1-bit pointer mask (or GC program) + var gcProgBits uintptr + if typ.kind&kindGCProg != 0 { + // Expand gc program, using the object itself for storage. 
+ gcProgBits = runGCProg(addb(p, 4), (*byte)(ptr)) + p = (*byte)(ptr) + } + nb := typ.ptrdata / goarch.PtrSize + + for i := uintptr(0); i < nb; i += ptrBits { + k := nb - i + if k > ptrBits { + k = ptrBits + } + h = h.write(readUintptr(addb(p, i/8)), k) + } + // Note: we call pad here to ensure we emit explicit 0 bits + // for the pointerless tail of the object. This ensures that + // there's only a single noMorePtrs mark for the next object + // to clear. We don't need to do this to clear stale noMorePtrs + // markers from previous uses because arena chunk pointer bitmaps + // are always fully cleared when reused. + h = h.pad(typ.size - typ.ptrdata) + h.flush(uintptr(ptr), typ.size) + + if typ.kind&kindGCProg != 0 { + // Zero out temporary ptrmask buffer inside object. + memclrNoHeapPointers(ptr, (gcProgBits+7)/8) + } + + // Double-check that the bitmap was written out correctly. + // + // Derived from heapBitsSetType. + const doubleCheck = false + if doubleCheck { + size := typ.size + x := uintptr(ptr) + h := heapBitsForAddr(x, size) + for i := uintptr(0); i < size; i += goarch.PtrSize { + // Compute the pointer bit we want at offset i. + want := false + off := i % typ.size + if off < typ.ptrdata { + j := off / goarch.PtrSize + want = *addb(typ.gcdata, j/8)>>(j%8)&1 != 0 + } + if want { + var addr uintptr + h, addr = h.next() + if addr != x+i { + throw("userArenaHeapBitsSetType: pointer entry not correct") + } + } + } + if _, addr := h.next(); addr != 0 { + throw("userArenaHeapBitsSetType: extra pointer") + } + } +} + +// userArenaHeapBitsSetSliceType is the equivalent of heapBitsSetType but for +// Go slice backing store values allocated in a user arena chunk. It sets up the +// heap bitmap for n consecutive values with type typ allocated at address ptr. 
+func userArenaHeapBitsSetSliceType(typ *_type, n int, ptr unsafe.Pointer, base uintptr) { + mem, overflow := math.MulUintptr(typ.size, uintptr(n)) + if overflow || n < 0 || mem > maxAlloc { + panic(plainError("runtime: allocation size out of range")) + } + for i := 0; i < n; i++ { + userArenaHeapBitsSetType(typ, add(ptr, uintptr(i)*typ.size), base) + } +} + +// newUserArenaChunk allocates a user arena chunk, which maps to a single +// heap arena and single span. Returns a pointer to the base of the chunk +// (this is really important: we need to keep the chunk alive) and the span. +func newUserArenaChunk() (unsafe.Pointer, *mspan) { + if gcphase == _GCmarktermination { + throw("newUserArenaChunk called with gcphase == _GCmarktermination") + } + + // Deduct assist credit. Because user arena chunks are modeled as one + // giant heap object which counts toward heapLive, we're obligated to + // assist the GC proportionally (and it's worth noting that the arena + // does represent additional work for the GC, but we also have no idea + // what that looks like until we actually allocate things into the + // arena). + deductAssistCredit(userArenaChunkBytes) + + // Set mp.mallocing to keep from being preempted by GC. + mp := acquirem() + if mp.mallocing != 0 { + throw("malloc deadlock") + } + if mp.gsignal == getg() { + throw("malloc during signal") + } + mp.mallocing = 1 + + // Allocate a new user arena. + var span *mspan + systemstack(func() { + span = mheap_.allocUserArenaChunk() + }) + if span == nil { + throw("out of memory") + } + x := unsafe.Pointer(span.base()) + + // Allocate black during GC. + // All slots hold nil so no scanning is needed. + // This may be racing with GC so do it atomically if there can be + // a race marking the bit. + if gcphase != _GCoff { + gcmarknewobject(span, span.base(), span.elemsize) + } + + if raceenabled { + // TODO(mknyszek): Track individual objects. 
+ racemalloc(unsafe.Pointer(span.base()), span.elemsize) + } + + if msanenabled { + // TODO(mknyszek): Track individual objects. + msanmalloc(unsafe.Pointer(span.base()), span.elemsize) + } + + if asanenabled { + // TODO(mknyszek): Track individual objects. + rzSize := computeRZlog(span.elemsize) + span.elemsize -= rzSize + span.limit -= rzSize + span.userArenaChunkFree = makeAddrRange(span.base(), span.limit) + asanpoison(unsafe.Pointer(span.limit), span.npages*pageSize-span.elemsize) + asanunpoison(unsafe.Pointer(span.base()), span.elemsize) + } + + if rate := MemProfileRate; rate > 0 { + c := getMCache(mp) + if c == nil { + throw("newUserArenaChunk called without a P or outside bootstrapping") + } + // Note cache c only valid while m acquired; see #47302 + if rate != 1 && userArenaChunkBytes < c.nextSample { + c.nextSample -= userArenaChunkBytes + } else { + profilealloc(mp, unsafe.Pointer(span.base()), userArenaChunkBytes) + } + } + mp.mallocing = 0 + releasem(mp) + + // Again, because this chunk counts toward heapLive, potentially trigger a GC. + if t := (gcTrigger{kind: gcTriggerHeap}); t.test() { + gcStart(t) + } + + if debug.malloc { + if debug.allocfreetrace != 0 { + tracealloc(unsafe.Pointer(span.base()), userArenaChunkBytes, nil) + } + + if inittrace.active && inittrace.id == getg().goid { + // Init functions are executed sequentially in a single goroutine. + inittrace.bytes += uint64(userArenaChunkBytes) + } + } + + // Double-check it's aligned to the physical page size. Based on the current + // implementation this is trivially true, but it need not be in the future. + // However, if it's not aligned to the physical page size then we can't properly + // set it to fault later. + if uintptr(x)%physPageSize != 0 { + throw("user arena chunk is not aligned to the physical page size") + } + + return x, span +} + +// isUnusedUserArenaChunk indicates that the arena chunk has been set to fault +// and doesn't contain any scannable memory anymore. 
However, it might still be +// mSpanInUse as it sits on the quarantine list, since it needs to be swept. +// +// This is not safe to execute unless the caller has ownership of the mspan or +// the world is stopped (preemption is prevented while the relevant state changes). +// +// This is really only meant to be used by accounting tests in the runtime to +// distinguish when a span shouldn't be counted (since mSpanInUse might not be +// enough). +func (s *mspan) isUnusedUserArenaChunk() bool { + return s.isUserArenaChunk && s.spanclass == makeSpanClass(0, true) +} + +// setUserArenaChunkToFault sets the address space for the user arena chunk to fault +// and releases any underlying memory resources. +// +// Must be in a non-preemptible state to ensure the consistency of statistics +// exported to MemStats. +func (s *mspan) setUserArenaChunkToFault() { + if !s.isUserArenaChunk { + throw("invalid span in heapArena for user arena") + } + if s.npages*pageSize != userArenaChunkBytes { + throw("span on userArena.faultList has invalid size") + } + + // Update the span class to be noscan. What we want to happen is that + // any pointer into the span keeps it from getting recycled, so we want + // the mark bit to get set, but we're about to set the address space to fault, + // so we have to prevent the GC from scanning this memory. + // + // It's OK to set it here because (1) a GC isn't in progress, so the scanning code + // won't make a bad decision, (2) we're currently non-preemptible and in the runtime, + // so a GC is blocked from starting. We might race with sweeping, which could + // put it on the "wrong" sweep list, but really don't care because the chunk is + // treated as a large object span and there's no meaningful difference between scan + // and noscan large objects in the sweeper. The STW at the start of the GC acts as a + // barrier for this update. 
+ s.spanclass = makeSpanClass(0, true) + + // Actually set the arena chunk to fault, so we'll get dangling pointer errors. + // sysFault currently uses a method on each OS that forces it to evacuate all + // memory backing the chunk. + sysFault(unsafe.Pointer(s.base()), s.npages*pageSize) + + // Everything on the list is counted as in-use, however sysFault transitions to + // Reserved, not Prepared, so we skip updating heapFree or heapReleased and just + // remove the memory from the total altogether; it's just address space now. + gcController.heapInUse.add(-int64(s.npages * pageSize)) + + // Count this as a free of an object right now as opposed to when + // the span gets off the quarantine list. The main reason is so that the + // amount of bytes allocated doesn't exceed how much is counted as + // "mapped ready," which could cause a deadlock in the pacer. + gcController.totalFree.Add(int64(s.npages * pageSize)) + + // Update consistent stats to match. + // + // We're non-preemptible, so it's safe to update consistent stats (our P + // won't change out from under us). + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.committed, -int64(s.npages*pageSize)) + atomic.Xaddint64(&stats.inHeap, -int64(s.npages*pageSize)) + atomic.Xadd64(&stats.largeFreeCount, 1) + atomic.Xadd64(&stats.largeFree, int64(s.npages*pageSize)) + memstats.heapStats.release() + + // This counts as a free, so update heapLive. + gcController.update(-int64(s.npages*pageSize), 0) + + // Mark it as free for the race detector. + if raceenabled { + racefree(unsafe.Pointer(s.base()), s.elemsize) + } + + systemstack(func() { + // Add the user arena to the quarantine list. + lock(&mheap_.lock) + mheap_.userArena.quarantineList.insert(s) + unlock(&mheap_.lock) + }) +} + +// inUserArenaChunk returns true if p points to a user arena chunk. 
+func inUserArenaChunk(p uintptr) bool { + s := spanOf(p) + if s == nil { + return false + } + return s.isUserArenaChunk +} + +// freeUserArenaChunk releases the user arena represented by s back to the runtime. +// +// x must be a live pointer within s. +// +// The runtime will set the user arena to fault once it's safe (the GC is no longer running) +// and then once the user arena is no longer referenced by the application, will allow it to +// be reused. +func freeUserArenaChunk(s *mspan, x unsafe.Pointer) { + if !s.isUserArenaChunk { + throw("span is not for a user arena") + } + if s.npages*pageSize != userArenaChunkBytes { + throw("invalid user arena span size") + } + + // Mark the region as free to various sanitizers immediately instead + // of handling them at sweep time. + if raceenabled { + racefree(unsafe.Pointer(s.base()), s.elemsize) + } + if msanenabled { + msanfree(unsafe.Pointer(s.base()), s.elemsize) + } + if asanenabled { + asanpoison(unsafe.Pointer(s.base()), s.elemsize) + } + + // Make ourselves non-preemptible as we manipulate state and statistics. + // + // Also required by setUserArenaChunkToFault. + mp := acquirem() + + // We can only set user arenas to fault if we're in the _GCoff phase. + if gcphase == _GCoff { + lock(&userArenaState.lock) + faultList := userArenaState.fault + userArenaState.fault = nil + unlock(&userArenaState.lock) + + s.setUserArenaChunkToFault() + for _, lc := range faultList { + lc.mspan.setUserArenaChunkToFault() + } + + // Until the chunks are set to fault, keep them alive via the fault list. + KeepAlive(x) + KeepAlive(faultList) + } else { + // Put the user arena on the fault list. + lock(&userArenaState.lock) + userArenaState.fault = append(userArenaState.fault, liveUserArenaChunk{s, x}) + unlock(&userArenaState.lock) + } + releasem(mp) +} + +// allocUserArenaChunk attempts to reuse a free user arena chunk represented +// as a span. 
+// +// Must be in a non-preemptible state to ensure the consistency of statistics +// exported to MemStats. +// +// Acquires the heap lock. Must run on the system stack for that reason. +// +//go:systemstack +func (h *mheap) allocUserArenaChunk() *mspan { + var s *mspan + var base uintptr + + // First check the free list. + lock(&h.lock) + if !h.userArena.readyList.isEmpty() { + s = h.userArena.readyList.first + h.userArena.readyList.remove(s) + base = s.base() + } else { + // Free list was empty, so allocate a new arena. + hintList := &h.userArena.arenaHints + if raceenabled { + // In race mode just use the regular heap hints. We might fragment + // the address space, but the race detector requires that the heap + // is mapped contiguously. + hintList = &h.arenaHints + } + v, size := h.sysAlloc(userArenaChunkBytes, hintList, false) + if size%userArenaChunkBytes != 0 { + throw("sysAlloc size is not divisible by userArenaChunkBytes") + } + if size > userArenaChunkBytes { + // We got more than we asked for. This can happen if + // heapArenaSize > userArenaChunkSize, or if sysAlloc just returns + // some extra as a result of trying to find an aligned region. + // + // Divide it up and put it on the ready list. + for i := uintptr(userArenaChunkBytes); i < size; i += userArenaChunkBytes { + s := h.allocMSpanLocked() + s.init(uintptr(v)+i, userArenaChunkPages) + h.userArena.readyList.insertBack(s) + } + size = userArenaChunkBytes + } + base = uintptr(v) + if base == 0 { + // Out of memory. + unlock(&h.lock) + return nil + } + s = h.allocMSpanLocked() + } + unlock(&h.lock) + + // sysAlloc returns Reserved address space, and any span we're + // reusing is set to fault (so, also Reserved), so transition + // it to Prepared and then Ready. + // + // Unlike (*mheap).grow, just map in everything that we + // asked for. We're likely going to use it all. 
+ sysMap(unsafe.Pointer(base), userArenaChunkBytes, &gcController.heapReleased) + sysUsed(unsafe.Pointer(base), userArenaChunkBytes, userArenaChunkBytes) + + // Model the user arena as a heap span for a large object. + spc := makeSpanClass(0, false) + h.initSpan(s, spanAllocHeap, spc, base, userArenaChunkPages) + s.isUserArenaChunk = true + + // Account for this new arena chunk memory. + gcController.heapInUse.add(int64(userArenaChunkBytes)) + gcController.heapReleased.add(-int64(userArenaChunkBytes)) + + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.inHeap, int64(userArenaChunkBytes)) + atomic.Xaddint64(&stats.committed, int64(userArenaChunkBytes)) + + // Model the arena as a single large malloc. + atomic.Xadd64(&stats.largeAlloc, int64(userArenaChunkBytes)) + atomic.Xadd64(&stats.largeAllocCount, 1) + memstats.heapStats.release() + + // Count the alloc in inconsistent, internal stats. + gcController.totalAlloc.Add(int64(userArenaChunkBytes)) + + // Update heapLive. + gcController.update(int64(userArenaChunkBytes), 0) + + // Put the large span in the mcentral swept list so that it's + // visible to the background sweeper. + h.central[spc].mcentral.fullSwept(h.sweepgen).push(s) + s.limit = s.base() + userArenaChunkBytes + s.freeindex = 1 + s.allocCount = 1 + + // This must clear the entire heap bitmap so that it's safe + // to allocate noscan data without writing anything out. + s.initHeapBits(true) + + // Clear the span preemptively. It's an arena chunk, so let's assume + // everything is going to be used. + // + // This also seems to make a massive difference as to whether or + // not Linux decides to back this memory with transparent huge + // pages. There's latency involved in this zeroing, but the hugepage + // gains are almost always worth it. Note: it's important that we + // clear even if it's freshly mapped and we know there's no point + // to zeroing as *that* is the critical signal to use huge pages. 
+ memclrNoHeapPointers(unsafe.Pointer(s.base()), s.elemsize) + s.needzero = 0 + + s.freeIndexForScan = 1 + + // Set up the range for allocation. + s.userArenaChunkFree = makeAddrRange(base, s.limit) + return s +} diff --git a/src/runtime/arena_test.go b/src/runtime/arena_test.go new file mode 100644 index 0000000..7e121ad --- /dev/null +++ b/src/runtime/arena_test.go @@ -0,0 +1,529 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "internal/goarch" + "reflect" + . "runtime" + "runtime/debug" + "runtime/internal/atomic" + "testing" + "time" + "unsafe" +) + +type smallScalar struct { + X uintptr +} +type smallPointer struct { + X *smallPointer +} +type smallPointerMix struct { + A *smallPointer + B byte + C *smallPointer + D [11]byte +} +type mediumScalarEven [8192]byte +type mediumScalarOdd [3321]byte +type mediumPointerEven [1024]*smallPointer +type mediumPointerOdd [1023]*smallPointer + +type largeScalar [UserArenaChunkBytes + 1]byte +type largePointer [UserArenaChunkBytes/unsafe.Sizeof(&smallPointer{}) + 1]*smallPointer + +func TestUserArena(t *testing.T) { + // Set GOMAXPROCS to 2 so we don't run too many of these + // tests in parallel. + defer GOMAXPROCS(GOMAXPROCS(2)) + + // Start a subtest so that we can clean up after any parallel tests within. 
+ t.Run("Alloc", func(t *testing.T) { + ss := &smallScalar{5} + runSubTestUserArenaNew(t, ss, true) + + sp := &smallPointer{new(smallPointer)} + runSubTestUserArenaNew(t, sp, true) + + spm := &smallPointerMix{sp, 5, nil, [11]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}} + runSubTestUserArenaNew(t, spm, true) + + mse := new(mediumScalarEven) + for i := range mse { + mse[i] = 121 + } + runSubTestUserArenaNew(t, mse, true) + + mso := new(mediumScalarOdd) + for i := range mso { + mso[i] = 122 + } + runSubTestUserArenaNew(t, mso, true) + + mpe := new(mediumPointerEven) + for i := range mpe { + mpe[i] = sp + } + runSubTestUserArenaNew(t, mpe, true) + + mpo := new(mediumPointerOdd) + for i := range mpo { + mpo[i] = sp + } + runSubTestUserArenaNew(t, mpo, true) + + ls := new(largeScalar) + for i := range ls { + ls[i] = 123 + } + // Not in parallel because we don't want to hold this large allocation live. + runSubTestUserArenaNew(t, ls, false) + + lp := new(largePointer) + for i := range lp { + lp[i] = sp + } + // Not in parallel because we don't want to hold this large allocation live. + runSubTestUserArenaNew(t, lp, false) + + sss := make([]smallScalar, 25) + for i := range sss { + sss[i] = smallScalar{12} + } + runSubTestUserArenaSlice(t, sss, true) + + mpos := make([]mediumPointerOdd, 5) + for i := range mpos { + mpos[i] = *mpo + } + runSubTestUserArenaSlice(t, mpos, true) + + sps := make([]smallPointer, UserArenaChunkBytes/unsafe.Sizeof(smallPointer{})+1) + for i := range sps { + sps[i] = *sp + } + // Not in parallel because we don't want to hold this large allocation live. + runSubTestUserArenaSlice(t, sps, false) + + // Test zero-sized types. 
+ t.Run("struct{}", func(t *testing.T) { + arena := NewUserArena() + var x any + x = (*struct{})(nil) + arena.New(&x) + if v := unsafe.Pointer(x.(*struct{})); v != ZeroBase { + t.Errorf("expected zero-sized type to be allocated as zerobase: got %x, want %x", v, ZeroBase) + } + arena.Free() + }) + t.Run("[]struct{}", func(t *testing.T) { + arena := NewUserArena() + var sl []struct{} + arena.Slice(&sl, 10) + if v := unsafe.Pointer(&sl[0]); v != ZeroBase { + t.Errorf("expected zero-sized type to be allocated as zerobase: got %x, want %x", v, ZeroBase) + } + arena.Free() + }) + t.Run("[]int (cap 0)", func(t *testing.T) { + arena := NewUserArena() + var sl []int + arena.Slice(&sl, 0) + if len(sl) != 0 { + t.Errorf("expected requested zero-sized slice to still have zero length: got %x, want 0", len(sl)) + } + arena.Free() + }) + }) + + // Run a GC cycle to get any arenas off the quarantine list. + GC() + + if n := GlobalWaitingArenaChunks(); n != 0 { + t.Errorf("expected zero waiting arena chunks, found %d", n) + } +} + +func runSubTestUserArenaNew[S comparable](t *testing.T, value *S, parallel bool) { + t.Run(reflect.TypeOf(value).Elem().Name(), func(t *testing.T) { + if parallel { + t.Parallel() + } + + // Allocate and write data, enough to exhaust the arena. + // + // This is an underestimate, likely leaving some space in the arena. That's a good thing, + // because it gives us coverage of boundary cases. + n := int(UserArenaChunkBytes / unsafe.Sizeof(*value)) + if n == 0 { + n = 1 + } + + // Create a new arena and do a bunch of operations on it. + arena := NewUserArena() + + arenaValues := make([]*S, 0, n) + for j := 0; j < n; j++ { + var x any + x = (*S)(nil) + arena.New(&x) + s := x.(*S) + *s = *value + arenaValues = append(arenaValues, s) + } + // Check integrity of allocated data. + for _, s := range arenaValues { + if *s != *value { + t.Errorf("failed integrity check: got %#v, want %#v", *s, *value) + } + } + + // Release the arena. 
+ arena.Free() + }) +} + +func runSubTestUserArenaSlice[S comparable](t *testing.T, value []S, parallel bool) { + t.Run("[]"+reflect.TypeOf(value).Elem().Name(), func(t *testing.T) { + if parallel { + t.Parallel() + } + + // Allocate and write data, enough to exhaust the arena. + // + // This is an underestimate, likely leaving some space in the arena. That's a good thing, + // because it gives us coverage of boundary cases. + n := int(UserArenaChunkBytes / (unsafe.Sizeof(*new(S)) * uintptr(cap(value)))) + if n == 0 { + n = 1 + } + + // Create a new arena and do a bunch of operations on it. + arena := NewUserArena() + + arenaValues := make([][]S, 0, n) + for j := 0; j < n; j++ { + var sl []S + arena.Slice(&sl, cap(value)) + copy(sl, value) + arenaValues = append(arenaValues, sl) + } + // Check integrity of allocated data. + for _, sl := range arenaValues { + for i := range sl { + got := sl[i] + want := value[i] + if got != want { + t.Errorf("failed integrity check: got %#v, want %#v at index %d", got, want, i) + } + } + } + + // Release the arena. + arena.Free() + }) +} + +func TestUserArenaLiveness(t *testing.T) { + t.Run("Free", func(t *testing.T) { + testUserArenaLiveness(t, false) + }) + t.Run("Finalizer", func(t *testing.T) { + testUserArenaLiveness(t, true) + }) +} + +func testUserArenaLiveness(t *testing.T, useArenaFinalizer bool) { + // Disable the GC so that there's zero chance we try doing anything arena related *during* + // a mark phase, since otherwise a bunch of arenas could end up on the fault list. + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + + // Defensively ensure that any full arena chunks leftover from previous tests have been cleared. + GC() + GC() + + arena := NewUserArena() + + // Allocate a few pointer-ful but un-initialized objects so that later we can + // place a reference to heap object at a more interesting location. 
+ for i := 0; i < 3; i++ { + var x any + x = (*mediumPointerOdd)(nil) + arena.New(&x) + } + + var x any + x = (*smallPointerMix)(nil) + arena.New(&x) + v := x.(*smallPointerMix) + + var safeToFinalize atomic.Bool + var finalized atomic.Bool + v.C = new(smallPointer) + SetFinalizer(v.C, func(_ *smallPointer) { + if !safeToFinalize.Load() { + t.Error("finalized arena-referenced object unexpectedly") + } + finalized.Store(true) + }) + + // Make sure it stays alive. + GC() + GC() + + // In order to ensure the object can be freed, we now need to make sure to use + // the entire arena. Exhaust the rest of the arena. + + for i := 0; i < int(UserArenaChunkBytes/unsafe.Sizeof(mediumScalarEven{})); i++ { + var x any + x = (*mediumScalarEven)(nil) + arena.New(&x) + } + + // Make sure it stays alive again. + GC() + GC() + + v = nil + + safeToFinalize.Store(true) + if useArenaFinalizer { + arena = nil + + // Try to queue the arena finalizer. + GC() + GC() + + // In order for the finalizer we actually want to run to execute, + // we need to make sure this one runs first. + if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) { + t.Fatal("finalizer queue was never emptied") + } + } else { + // Free the arena explicitly. + arena.Free() + } + + // Try to queue the object's finalizer that we set earlier. + GC() + GC() + + if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) { + t.Fatal("finalizer queue was never emptied") + } + if !finalized.Load() { + t.Error("expected arena-referenced object to be finalized") + } +} + +func TestUserArenaClearsPointerBits(t *testing.T) { + // This is a regression test for a serious issue wherein if pointer bits + // aren't properly cleared, it's possible to allocate scalar data down + // into a previously pointer-ful area, causing misinterpretation by the GC. + + // Create a large object, grab a pointer into it, and free it. 
+ x := new([8 << 20]byte) + xp := uintptr(unsafe.Pointer(&x[124])) + var finalized atomic.Bool + SetFinalizer(x, func(_ *[8 << 20]byte) { + finalized.Store(true) + }) + + // Write three chunks worth of pointer data. Three gives us a + // high likelihood that when we write 2 later, we'll get the behavior + // we want. + a := NewUserArena() + for i := 0; i < int(UserArenaChunkBytes/goarch.PtrSize*3); i++ { + var x any + x = (*smallPointer)(nil) + a.New(&x) + } + a.Free() + + // Recycle the arena chunks. + GC() + GC() + + a = NewUserArena() + for i := 0; i < int(UserArenaChunkBytes/goarch.PtrSize*2); i++ { + var x any + x = (*smallScalar)(nil) + a.New(&x) + v := x.(*smallScalar) + // Write a pointer that should not keep x alive. + *v = smallScalar{xp} + } + KeepAlive(x) + x = nil + + // Try to free x. + GC() + GC() + + if !BlockUntilEmptyFinalizerQueue(int64(2 * time.Second)) { + t.Fatal("finalizer queue was never emptied") + } + if !finalized.Load() { + t.Fatal("heap allocation kept alive through non-pointer reference") + } + + // Clean up the arena. + a.Free() + GC() + GC() +} + +func TestUserArenaCloneString(t *testing.T) { + a := NewUserArena() + + // A static string (not on heap or arena) + var s = "abcdefghij" + + // Create a byte slice in the arena, initialize it with s + var b []byte + a.Slice(&b, len(s)) + copy(b, s) + + // Create a string as using the same memory as the byte slice, hence in + // the arena. This could be an arena API, but hasn't really been needed + // yet. + var as string + asHeader := (*reflect.StringHeader)(unsafe.Pointer(&as)) + asHeader.Data = (*reflect.SliceHeader)(unsafe.Pointer(&b)).Data + asHeader.Len = len(b) + + // Clone should make a copy of as, since it is in the arena. 
+ asCopy := UserArenaClone(as) + if (*reflect.StringHeader)(unsafe.Pointer(&as)).Data == (*reflect.StringHeader)(unsafe.Pointer(&asCopy)).Data { + t.Error("Clone did not make a copy") + } + + // Clone should make a copy of subAs, since subAs is just part of as and so is in the arena. + subAs := as[1:3] + subAsCopy := UserArenaClone(subAs) + if (*reflect.StringHeader)(unsafe.Pointer(&subAs)).Data == (*reflect.StringHeader)(unsafe.Pointer(&subAsCopy)).Data { + t.Error("Clone did not make a copy") + } + if len(subAs) != len(subAsCopy) { + t.Errorf("Clone made an incorrect copy (bad length): %d -> %d", len(subAs), len(subAsCopy)) + } else { + for i := range subAs { + if subAs[i] != subAsCopy[i] { + t.Errorf("Clone made an incorrect copy (data at index %d): %d -> %d", i, subAs[i], subAsCopy[i]) + } + } + } + + // Clone should not make a copy of doubleAs, since doubleAs will be on the heap. + doubleAs := as + as + doubleAsCopy := UserArenaClone(doubleAs) + if (*reflect.StringHeader)(unsafe.Pointer(&doubleAs)).Data != (*reflect.StringHeader)(unsafe.Pointer(&doubleAsCopy)).Data { + t.Error("Clone should not have made a copy") + } + + // Clone should not make a copy of s, since s is a static string. + sCopy := UserArenaClone(s) + if (*reflect.StringHeader)(unsafe.Pointer(&s)).Data != (*reflect.StringHeader)(unsafe.Pointer(&sCopy)).Data { + t.Error("Clone should not have made a copy") + } + + a.Free() +} + +func TestUserArenaClonePointer(t *testing.T) { + a := NewUserArena() + + // Clone should not make a copy of a heap-allocated smallScalar. + x := Escape(new(smallScalar)) + xCopy := UserArenaClone(x) + if unsafe.Pointer(x) != unsafe.Pointer(xCopy) { + t.Errorf("Clone should not have made a copy: %#v -> %#v", x, xCopy) + } + + // Clone should make a copy of an arena-allocated smallScalar. 
+ var i any + i = (*smallScalar)(nil) + a.New(&i) + xArena := i.(*smallScalar) + xArenaCopy := UserArenaClone(xArena) + if unsafe.Pointer(xArena) == unsafe.Pointer(xArenaCopy) { + t.Errorf("Clone should have made a copy: %#v -> %#v", xArena, xArenaCopy) + } + if *xArena != *xArenaCopy { + t.Errorf("Clone made an incorrect copy: %#v -> %#v", *xArena, *xArenaCopy) + } + + a.Free() +} + +func TestUserArenaCloneSlice(t *testing.T) { + a := NewUserArena() + + // A static string (not on heap or arena) + var s = "klmnopqrstuv" + + // Create a byte slice in the arena, initialize it with s + var b []byte + a.Slice(&b, len(s)) + copy(b, s) + + // Clone should make a copy of b, since it is in the arena. + bCopy := UserArenaClone(b) + if unsafe.Pointer(&b[0]) == unsafe.Pointer(&bCopy[0]) { + t.Errorf("Clone did not make a copy: %#v -> %#v", b, bCopy) + } + if len(b) != len(bCopy) { + t.Errorf("Clone made an incorrect copy (bad length): %d -> %d", len(b), len(bCopy)) + } else { + for i := range b { + if b[i] != bCopy[i] { + t.Errorf("Clone made an incorrect copy (data at index %d): %d -> %d", i, b[i], bCopy[i]) + } + } + } + + // Clone should make a copy of bSub, since bSub is just part of b and so is in the arena. + bSub := b[1:3] + bSubCopy := UserArenaClone(bSub) + if unsafe.Pointer(&bSub[0]) == unsafe.Pointer(&bSubCopy[0]) { + t.Errorf("Clone did not make a copy: %#v -> %#v", bSub, bSubCopy) + } + if len(bSub) != len(bSubCopy) { + t.Errorf("Clone made an incorrect copy (bad length): %d -> %d", len(bSub), len(bSubCopy)) + } else { + for i := range bSub { + if bSub[i] != bSubCopy[i] { + t.Errorf("Clone made an incorrect copy (data at index %d): %d -> %d", i, bSub[i], bSubCopy[i]) + } + } + } + + // Clone should not make a copy of bNotArena, since it will not be in an arena. 
+ bNotArena := make([]byte, len(s)) + copy(bNotArena, s) + bNotArenaCopy := UserArenaClone(bNotArena) + if unsafe.Pointer(&bNotArena[0]) != unsafe.Pointer(&bNotArenaCopy[0]) { + t.Error("Clone should not have made a copy") + } + + a.Free() +} + +func TestUserArenaClonePanic(t *testing.T) { + var s string + func() { + x := smallScalar{2} + defer func() { + if v := recover(); v != nil { + s = v.(string) + } + }() + UserArenaClone(x) + }() + if s == "" { + t.Errorf("expected panic from Clone") + } +} diff --git a/src/runtime/asan.go b/src/runtime/asan.go new file mode 100644 index 0000000..25b8327 --- /dev/null +++ b/src/runtime/asan.go @@ -0,0 +1,67 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build asan + +package runtime + +import ( + "unsafe" +) + +// Public address sanitizer API. +func ASanRead(addr unsafe.Pointer, len int) { + sp := getcallersp() + pc := getcallerpc() + doasanread(addr, uintptr(len), sp, pc) +} + +func ASanWrite(addr unsafe.Pointer, len int) { + sp := getcallersp() + pc := getcallerpc() + doasanwrite(addr, uintptr(len), sp, pc) +} + +// Private interface for the runtime. +const asanenabled = true + +// asan{read,write} are nosplit because they may be called between +// fork and exec, when the stack must not grow. See issue #50391. 
+ +//go:nosplit +func asanread(addr unsafe.Pointer, sz uintptr) { + sp := getcallersp() + pc := getcallerpc() + doasanread(addr, sz, sp, pc) +} + +//go:nosplit +func asanwrite(addr unsafe.Pointer, sz uintptr) { + sp := getcallersp() + pc := getcallerpc() + doasanwrite(addr, sz, sp, pc) +} + +//go:noescape +func doasanread(addr unsafe.Pointer, sz, sp, pc uintptr) + +//go:noescape +func doasanwrite(addr unsafe.Pointer, sz, sp, pc uintptr) + +//go:noescape +func asanunpoison(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func asanpoison(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func asanregisterglobals(addr unsafe.Pointer, n uintptr) + +// These are called from asan_GOARCH.s +// +//go:cgo_import_static __asan_read_go +//go:cgo_import_static __asan_write_go +//go:cgo_import_static __asan_unpoison_go +//go:cgo_import_static __asan_poison_go +//go:cgo_import_static __asan_register_globals_go diff --git a/src/runtime/asan/asan.go b/src/runtime/asan/asan.go new file mode 100644 index 0000000..25f15ae --- /dev/null +++ b/src/runtime/asan/asan.go @@ -0,0 +1,76 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build asan && linux && (arm64 || amd64 || riscv64 || ppc64le) + +package asan + +/* +#cgo CFLAGS: -fsanitize=address +#cgo LDFLAGS: -fsanitize=address + +#include <stdbool.h> +#include <stdint.h> +#include <sanitizer/asan_interface.h> + +void __asan_read_go(void *addr, uintptr_t sz, void *sp, void *pc) { + if (__asan_region_is_poisoned(addr, sz)) { + __asan_report_error(pc, 0, sp, addr, false, sz); + } +} + +void __asan_write_go(void *addr, uintptr_t sz, void *sp, void *pc) { + if (__asan_region_is_poisoned(addr, sz)) { + __asan_report_error(pc, 0, sp, addr, true, sz); + } +} + +void __asan_unpoison_go(void *addr, uintptr_t sz) { + __asan_unpoison_memory_region(addr, sz); +} + +void __asan_poison_go(void *addr, uintptr_t sz) { + __asan_poison_memory_region(addr, sz); +} + +// Keep in sync with the definition in compiler-rt +// https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/asan_interface_internal.h#L41 +// This structure is used to describe the source location of +// a place where global was defined. +struct _asan_global_source_location { + const char *filename; + int line_no; + int column_no; +}; + +// Keep in sync with the definition in compiler-rt +// https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/asan_interface_internal.h#L48 +// So far, the current implementation is only compatible with the ASan library from version v7 to v9. +// https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/asan/asan_init_version.h +// This structure describes an instrumented global variable. +// +// TODO: If a later version of the ASan library changes __asan_global or __asan_global_source_location +// structure, we need to make the same changes. 
+struct _asan_global { + uintptr_t beg; + uintptr_t size; + uintptr_t size_with_redzone; + const char *name; + const char *module_name; + uintptr_t has_dynamic_init; + struct _asan_global_source_location *location; + uintptr_t odr_indicator; +}; + + +extern void __asan_register_globals(void*, long int); + +// Register global variables. +// The 'globals' is an array of structures describing 'n' globals. +void __asan_register_globals_go(void *addr, uintptr_t n) { + struct _asan_global *globals = (struct _asan_global *)(addr); + __asan_register_globals(globals, n); +} +*/ +import "C" diff --git a/src/runtime/asan0.go b/src/runtime/asan0.go new file mode 100644 index 0000000..0948786 --- /dev/null +++ b/src/runtime/asan0.go @@ -0,0 +1,23 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !asan + +// Dummy ASan support API, used when not built with -asan. + +package runtime + +import ( + "unsafe" +) + +const asanenabled = false + +// Because asanenabled is false, none of these functions should be called. + +func asanread(addr unsafe.Pointer, sz uintptr) { throw("asan") } +func asanwrite(addr unsafe.Pointer, sz uintptr) { throw("asan") } +func asanunpoison(addr unsafe.Pointer, sz uintptr) { throw("asan") } +func asanpoison(addr unsafe.Pointer, sz uintptr) { throw("asan") } +func asanregisterglobals(addr unsafe.Pointer, sz uintptr) { throw("asan") } diff --git a/src/runtime/asan_amd64.s b/src/runtime/asan_amd64.s new file mode 100644 index 0000000..0489aa8 --- /dev/null +++ b/src/runtime/asan_amd64.s @@ -0,0 +1,91 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build asan + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// This is like race_amd64.s, but for the asan calls. 
+// See race_amd64.s for detailed comments. + +#ifdef GOOS_windows +#define RARG0 CX +#define RARG1 DX +#define RARG2 R8 +#define RARG3 R9 +#else +#define RARG0 DI +#define RARG1 SI +#define RARG2 DX +#define RARG3 CX +#endif + +// Called from instrumented code. +// func runtime·doasanread(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanread(SB), NOSPLIT, $0-32 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + MOVQ sp+16(FP), RARG2 + MOVQ pc+24(FP), RARG3 + // void __asan_read_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVQ $__asan_read_go(SB), AX + JMP asancall<>(SB) + +// func runtime·doasanwrite(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanwrite(SB), NOSPLIT, $0-32 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + MOVQ sp+16(FP), RARG2 + MOVQ pc+24(FP), RARG3 + // void __asan_write_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVQ $__asan_write_go(SB), AX + JMP asancall<>(SB) + +// func runtime·asanunpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanunpoison(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __asan_unpoison_go(void *addr, uintptr_t sz); + MOVQ $__asan_unpoison_go(SB), AX + JMP asancall<>(SB) + +// func runtime·asanpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanpoison(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __asan_poison_go(void *addr, uintptr_t sz); + MOVQ $__asan_poison_go(SB), AX + JMP asancall<>(SB) + +// func runtime·asanregisterglobals(addr unsafe.Pointer, n uintptr) +TEXT runtime·asanregisterglobals(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __asan_register_globals_go(void *addr, uintptr_t n); + MOVD $__asan_register_globals_go(SB), AX + JMP asancall<>(SB) + +// Switches SP to g0 stack and calls (AX). Arguments already set. 
+TEXT asancall<>(SB), NOSPLIT, $0-0 + get_tls(R12) + MOVQ g(R12), R14 + MOVQ SP, R12 // callee-saved, preserved across the CALL + CMPQ R14, $0 + JE call // no g; still on a system stack + + MOVQ g_m(R14), R13 + // Switch to g0 stack. + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + CALL AX + MOVQ R12, SP + RET diff --git a/src/runtime/asan_arm64.s b/src/runtime/asan_arm64.s new file mode 100644 index 0000000..697c982 --- /dev/null +++ b/src/runtime/asan_arm64.s @@ -0,0 +1,76 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build asan + +#include "go_asm.h" +#include "textflag.h" + +#define RARG0 R0 +#define RARG1 R1 +#define RARG2 R2 +#define RARG3 R3 +#define FARG R4 + +// Called from instrumented code. +// func runtime·doasanread(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanread(SB), NOSPLIT, $0-32 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + MOVD sp+16(FP), RARG2 + MOVD pc+24(FP), RARG3 + // void __asan_read_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVD $__asan_read_go(SB), FARG + JMP asancall<>(SB) + +// func runtime·doasanwrite(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanwrite(SB), NOSPLIT, $0-32 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + MOVD sp+16(FP), RARG2 + MOVD pc+24(FP), RARG3 + // void __asan_write_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVD $__asan_write_go(SB), FARG + JMP asancall<>(SB) + +// func runtime·asanunpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanunpoison(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __asan_unpoison_go(void *addr, uintptr_t sz); + MOVD $__asan_unpoison_go(SB), FARG + JMP asancall<>(SB) + +// func runtime·asanpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanpoison(SB), 
NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __asan_poison_go(void *addr, uintptr_t sz); + MOVD $__asan_poison_go(SB), FARG + JMP asancall<>(SB) + +// func runtime·asanregisterglobals(addr unsafe.Pointer, n uintptr) +TEXT runtime·asanregisterglobals(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __asan_register_globals_go(void *addr, uintptr_t n); + MOVD $__asan_register_globals_go(SB), FARG + JMP asancall<>(SB) + +// Switches SP to g0 stack and calls (FARG). Arguments already set. +TEXT asancall<>(SB), NOSPLIT, $0-0 + MOVD RSP, R19 // callee-saved + CBZ g, g0stack // no g, still on a system stack + MOVD g_m(g), R10 + MOVD m_g0(R10), R11 + CMP R11, g + BEQ g0stack + + MOVD (g_sched+gobuf_sp)(R11), R5 + MOVD R5, RSP + +g0stack: + BL (FARG) + MOVD R19, RSP + RET diff --git a/src/runtime/asan_ppc64le.s b/src/runtime/asan_ppc64le.s new file mode 100644 index 0000000..d13301a --- /dev/null +++ b/src/runtime/asan_ppc64le.s @@ -0,0 +1,87 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build asan + +#include "go_asm.h" +#include "textflag.h" + +#define RARG0 R3 +#define RARG1 R4 +#define RARG2 R5 +#define RARG3 R6 +#define FARG R12 + +// Called from instrumented code. 
+// func runtime·doasanread(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanread(SB),NOSPLIT|NOFRAME,$0-32 + MOVD addr+0(FP), RARG0 + MOVD sz+8(FP), RARG1 + MOVD sp+16(FP), RARG2 + MOVD pc+24(FP), RARG3 + // void __asan_read_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVD $__asan_read_go(SB), FARG + BR asancall<>(SB) + +// func runtime·doasanwrite(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanwrite(SB),NOSPLIT|NOFRAME,$0-32 + MOVD addr+0(FP), RARG0 + MOVD sz+8(FP), RARG1 + MOVD sp+16(FP), RARG2 + MOVD pc+24(FP), RARG3 + // void __asan_write_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOVD $__asan_write_go(SB), FARG + BR asancall<>(SB) + +// func runtime·asanunpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanunpoison(SB),NOSPLIT|NOFRAME,$0-16 + MOVD addr+0(FP), RARG0 + MOVD sz+8(FP), RARG1 + // void __asan_unpoison_go(void *addr, uintptr_t sz); + MOVD $__asan_unpoison_go(SB), FARG + BR asancall<>(SB) + +// func runtime·asanpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanpoison(SB),NOSPLIT|NOFRAME,$0-16 + MOVD addr+0(FP), RARG0 + MOVD sz+8(FP), RARG1 + // void __asan_poison_go(void *addr, uintptr_t sz); + MOVD $__asan_poison_go(SB), FARG + BR asancall<>(SB) + +// func runtime·asanregisterglobals(addr unsafe.Pointer, n uintptr) +TEXT runtime·asanregisterglobals(SB),NOSPLIT|NOFRAME,$0-16 + MOVD addr+0(FP), RARG0 + MOVD n+8(FP), RARG1 + // void __asan_register_globals_go(void *addr, uintptr_t n); + MOVD $__asan_register_globals_go(SB), FARG + BR asancall<>(SB) + +// Switches SP to g0 stack and calls (FARG). Arguments already set. +TEXT asancall<>(SB), NOSPLIT, $0-0 + // LR saved in generated prologue + // Get info from the current goroutine + MOVD runtime·tls_g(SB), R10 // g offset in TLS + MOVD 0(R10), g + MOVD g_m(g), R7 // m for g + MOVD R1, R16 // callee-saved, preserved across C call + MOVD m_g0(R7), R10 // g0 for m + CMP R10, g // same g0? 
+ BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R10), R1 // switch R1 +call: + // prepare frame for C ABI + SUB $32, R1 // create frame for callee saving LR, CR, R2 etc. + RLDCR $0, R1, $~15, R1 // align SP to 16 bytes + MOVD FARG, CTR // address of function to be called + MOVD R0, 0(R1) // clear back chain pointer + BL (CTR) + MOVD $0, R0 // C code can clobber R0 set it back to 0 + MOVD R16, R1 // restore R1; + MOVD runtime·tls_g(SB), R10 // find correct g + MOVD 0(R10), g + RET + +// tls_g, g value for each thread in TLS +GLOBL runtime·tls_g+0(SB), TLSBSS+DUPOK, $8 diff --git a/src/runtime/asan_riscv64.s b/src/runtime/asan_riscv64.s new file mode 100644 index 0000000..6fcd94d --- /dev/null +++ b/src/runtime/asan_riscv64.s @@ -0,0 +1,68 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build asan + +#include "go_asm.h" +#include "textflag.h" + +// Called from instrumented code. 
+// func runtime·doasanread(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanread(SB), NOSPLIT, $0-32 + MOV addr+0(FP), X10 + MOV sz+8(FP), X11 + MOV sp+16(FP), X12 + MOV pc+24(FP), X13 + // void __asan_read_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOV $__asan_read_go(SB), X14 + JMP asancall<>(SB) + +// func runtime·doasanwrite(addr unsafe.Pointer, sz, sp, pc uintptr) +TEXT runtime·doasanwrite(SB), NOSPLIT, $0-32 + MOV addr+0(FP), X10 + MOV sz+8(FP), X11 + MOV sp+16(FP), X12 + MOV pc+24(FP), X13 + // void __asan_write_go(void *addr, uintptr_t sz, void *sp, void *pc); + MOV $__asan_write_go(SB), X14 + JMP asancall<>(SB) + +// func runtime·asanunpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanunpoison(SB), NOSPLIT, $0-16 + MOV addr+0(FP), X10 + MOV sz+8(FP), X11 + // void __asan_unpoison_go(void *addr, uintptr_t sz); + MOV $__asan_unpoison_go(SB), X14 + JMP asancall<>(SB) + +// func runtime·asanpoison(addr unsafe.Pointer, sz uintptr) +TEXT runtime·asanpoison(SB), NOSPLIT, $0-16 + MOV addr+0(FP), X10 + MOV sz+8(FP), X11 + // void __asan_poison_go(void *addr, uintptr_t sz); + MOV $__asan_poison_go(SB), X14 + JMP asancall<>(SB) + +// func runtime·asanregisterglobals(addr unsafe.Pointer, n uintptr) +TEXT runtime·asanregisterglobals(SB), NOSPLIT, $0-16 + MOV addr+0(FP), X10 + MOV n+8(FP), X11 + // void __asan_register_globals_go(void *addr, uintptr_t n); + MOV $__asan_register_globals_go(SB), X14 + JMP asancall<>(SB) + +// Switches SP to g0 stack and calls (X14). Arguments already set. +TEXT asancall<>(SB), NOSPLIT, $0-0 + MOV X2, X8 // callee-saved + BEQZ g, g0stack // no g, still on a system stack + MOV g_m(g), X21 + MOV m_g0(X21), X21 + BEQ X21, g, g0stack + + MOV (g_sched+gobuf_sp)(X21), X2 + +g0stack: + JALR RA, X14 + MOV X8, X2 + RET diff --git a/src/runtime/asm.s b/src/runtime/asm.s new file mode 100644 index 0000000..84d56de --- /dev/null +++ b/src/runtime/asm.s @@ -0,0 +1,10 @@ +// Copyright 2014 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +#ifndef GOARCH_amd64 +TEXT ·sigpanic0(SB),NOSPLIT,$0-0 + JMP ·sigpanic<ABIInternal>(SB) +#endif diff --git a/src/runtime/asm_386.s b/src/runtime/asm_386.s new file mode 100644 index 0000000..e16880c --- /dev/null +++ b/src/runtime/asm_386.s @@ -0,0 +1,1579 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// _rt0_386 is common startup code for most 386 systems when using +// internal linking. This is the entry point for the program from the +// kernel for an ordinary -buildmode=exe program. The stack holds the +// number of arguments and the C-style argv. +TEXT _rt0_386(SB),NOSPLIT,$8 + MOVL 8(SP), AX // argc + LEAL 12(SP), BX // argv + MOVL AX, 0(SP) + MOVL BX, 4(SP) + JMP runtime·rt0_go(SB) + +// _rt0_386_lib is common startup code for most 386 systems when +// using -buildmode=c-archive or -buildmode=c-shared. The linker will +// arrange to invoke this function as a global constructor (for +// c-archive) or when the shared library is loaded (for c-shared). +// We expect argc and argv to be passed on the stack following the +// usual C ABI. +TEXT _rt0_386_lib(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + PUSHL BX + PUSHL SI + PUSHL DI + + MOVL 8(BP), AX + MOVL AX, _rt0_386_lib_argc<>(SB) + MOVL 12(BP), AX + MOVL AX, _rt0_386_lib_argv<>(SB) + + // Synchronous initialization. + CALL runtime·libpreinit(SB) + + SUBL $8, SP + + // Create a new thread to do the runtime initialization. + MOVL _cgo_sys_thread_create(SB), AX + TESTL AX, AX + JZ nocgo + + // Align stack to call C function. + // We moved SP to BP above, but BP was clobbered by the libpreinit call. 
+ MOVL SP, BP + ANDL $~15, SP + + MOVL $_rt0_386_lib_go(SB), BX + MOVL BX, 0(SP) + MOVL $0, 4(SP) + + CALL AX + + MOVL BP, SP + + JMP restore + +nocgo: + MOVL $0x800000, 0(SP) // stacksize = 8192KB + MOVL $_rt0_386_lib_go(SB), AX + MOVL AX, 4(SP) // fn + CALL runtime·newosproc0(SB) + +restore: + ADDL $8, SP + POPL DI + POPL SI + POPL BX + POPL BP + RET + +// _rt0_386_lib_go initializes the Go runtime. +// This is started in a separate thread by _rt0_386_lib. +TEXT _rt0_386_lib_go(SB),NOSPLIT,$8 + MOVL _rt0_386_lib_argc<>(SB), AX + MOVL AX, 0(SP) + MOVL _rt0_386_lib_argv<>(SB), AX + MOVL AX, 4(SP) + JMP runtime·rt0_go(SB) + +DATA _rt0_386_lib_argc<>(SB)/4, $0 +GLOBL _rt0_386_lib_argc<>(SB),NOPTR, $4 +DATA _rt0_386_lib_argv<>(SB)/4, $0 +GLOBL _rt0_386_lib_argv<>(SB),NOPTR, $4 + +TEXT runtime·rt0_go(SB),NOSPLIT|NOFRAME|TOPFRAME,$0 + // Copy arguments forward on an even stack. + // Users of this function jump to it, they don't call it. + MOVL 0(SP), AX + MOVL 4(SP), BX + SUBL $128, SP // plenty of scratch + ANDL $~15, SP + MOVL AX, 120(SP) // save argc, argv away + MOVL BX, 124(SP) + + // set default stack bounds. + // _cgo_init may update stackguard. + MOVL $runtime·g0(SB), BP + LEAL (-64*1024+104)(SP), BX + MOVL BX, g_stackguard0(BP) + MOVL BX, g_stackguard1(BP) + MOVL BX, (g_stack+stack_lo)(BP) + MOVL SP, (g_stack+stack_hi)(BP) + + // find out information about the processor we're on + // first see if CPUID instruction is supported. + PUSHFL + PUSHFL + XORL $(1<<21), 0(SP) // flip ID bit + POPFL + PUSHFL + POPL AX + XORL 0(SP), AX + POPFL // restore EFLAGS + TESTL $(1<<21), AX + JNE has_cpuid + +bad_proc: // show that the program requires MMX. 
+ MOVL $2, 0(SP) + MOVL $bad_proc_msg<>(SB), 4(SP) + MOVL $0x3d, 8(SP) + CALL runtime·write(SB) + MOVL $1, 0(SP) + CALL runtime·exit(SB) + CALL runtime·abort(SB) + +has_cpuid: + MOVL $0, AX + CPUID + MOVL AX, SI + CMPL AX, $0 + JE nocpuinfo + + CMPL BX, $0x756E6547 // "Genu" + JNE notintel + CMPL DX, $0x49656E69 // "ineI" + JNE notintel + CMPL CX, $0x6C65746E // "ntel" + JNE notintel + MOVB $1, runtime·isIntel(SB) +notintel: + + // Load EAX=1 cpuid flags + MOVL $1, AX + CPUID + MOVL CX, DI // Move to global variable clobbers CX when generating PIC + MOVL AX, runtime·processorVersionInfo(SB) + + // Check for MMX support + TESTL $(1<<23), DX // MMX + JZ bad_proc + +nocpuinfo: + // if there is an _cgo_init, call it to let it + // initialize and to set up GS. if not, + // we set up GS ourselves. + MOVL _cgo_init(SB), AX + TESTL AX, AX + JZ needtls +#ifdef GOOS_android + // arg 4: TLS base, stored in slot 0 (Android's TLS_SLOT_SELF). + // Compensate for tls_g (+8). + MOVL -8(TLS), BX + MOVL BX, 12(SP) + MOVL $runtime·tls_g(SB), 8(SP) // arg 3: &tls_g +#else + MOVL $0, BX + MOVL BX, 12(SP) // arg 3,4: not used when using platform's TLS + MOVL BX, 8(SP) +#endif + MOVL $setg_gcc<>(SB), BX + MOVL BX, 4(SP) // arg 2: setg_gcc + MOVL BP, 0(SP) // arg 1: g0 + CALL AX + + // update stackguard after _cgo_init + MOVL $runtime·g0(SB), CX + MOVL (g_stack+stack_lo)(CX), AX + ADDL $const__StackGuard, AX + MOVL AX, g_stackguard0(CX) + MOVL AX, g_stackguard1(CX) + +#ifndef GOOS_windows + // skip runtime·ldt0setup(SB) and tls test after _cgo_init for non-windows + JMP ok +#endif +needtls: +#ifdef GOOS_openbsd + // skip runtime·ldt0setup(SB) and tls test on OpenBSD in all cases + JMP ok +#endif +#ifdef GOOS_plan9 + // skip runtime·ldt0setup(SB) and tls test on Plan 9 in all cases + JMP ok +#endif + + // set up %gs + CALL ldt0setup<>(SB) + + // store through it, to make sure it works + get_tls(BX) + MOVL $0x123, g(BX) + MOVL runtime·m0+m_tls(SB), AX + CMPL AX, $0x123 + JEQ ok + MOVL AX, 0 
// abort +ok: + // set up m and g "registers" + get_tls(BX) + LEAL runtime·g0(SB), DX + MOVL DX, g(BX) + LEAL runtime·m0(SB), AX + + // save m->g0 = g0 + MOVL DX, m_g0(AX) + // save g0->m = m0 + MOVL AX, g_m(DX) + + CALL runtime·emptyfunc(SB) // fault if stack check is wrong + + // convention is D is always cleared + CLD + + CALL runtime·check(SB) + + // saved argc, argv + MOVL 120(SP), AX + MOVL AX, 0(SP) + MOVL 124(SP), AX + MOVL AX, 4(SP) + CALL runtime·args(SB) + CALL runtime·osinit(SB) + CALL runtime·schedinit(SB) + + // create a new goroutine to start program + PUSHL $runtime·mainPC(SB) // entry + CALL runtime·newproc(SB) + POPL AX + + // start this M + CALL runtime·mstart(SB) + + CALL runtime·abort(SB) + RET + +DATA bad_proc_msg<>+0x00(SB)/61, $"This program can only be run on processors with MMX support.\n" +GLOBL bad_proc_msg<>(SB), RODATA, $61 + +DATA runtime·mainPC+0(SB)/4,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$4 + +TEXT runtime·breakpoint(SB),NOSPLIT,$0-0 + INT $3 + RET + +TEXT runtime·asminit(SB),NOSPLIT,$0-0 + // Linux and MinGW start the FPU in extended double precision. + // Other operating systems use double precision. + // Change to double precision to match them, + // and to match other hardware that only has double. 
+ FLDCW runtime·controlWord64(SB) + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + CALL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT, $0-4 + MOVL buf+0(FP), BX // gobuf + MOVL gobuf_g(BX), DX + MOVL 0(DX), CX // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT, $0 + get_tls(CX) + MOVL DX, g(CX) + MOVL gobuf_sp(BX), SP // restore SP + MOVL gobuf_ret(BX), AX + MOVL gobuf_ctxt(BX), DX + MOVL $0, gobuf_sp(BX) // clear to help garbage collector + MOVL $0, gobuf_ret(BX) + MOVL $0, gobuf_ctxt(BX) + MOVL gobuf_pc(BX), BX + JMP BX + +// func mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. +TEXT runtime·mcall(SB), NOSPLIT, $0-4 + MOVL fn+0(FP), DI + + get_tls(DX) + MOVL g(DX), AX // save state in g->sched + MOVL 0(SP), BX // caller's PC + MOVL BX, (g_sched+gobuf_pc)(AX) + LEAL fn+0(FP), BX // caller's SP + MOVL BX, (g_sched+gobuf_sp)(AX) + + // switch to m->g0 & its stack, call fn + MOVL g(DX), BX + MOVL g_m(BX), BX + MOVL m_g0(BX), SI + CMPL SI, AX // if g == m->g0 call badmcall + JNE 3(PC) + MOVL $runtime·badmcall(SB), AX + JMP AX + MOVL SI, g(DX) // g = m->g0 + MOVL (g_sched+gobuf_sp)(SI), SP // sp = m->g0->sched.sp + PUSHL AX + MOVL DI, DX + MOVL 0(DI), DI + CALL DI + POPL AX + MOVL $runtime·badmcall2(SB), AX + JMP AX + RET + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). 
+TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-4 + MOVL fn+0(FP), DI // DI = fn + get_tls(CX) + MOVL g(CX), AX // AX = g + MOVL g_m(AX), BX // BX = m + + CMPL AX, m_gsignal(BX) + JEQ noswitch + + MOVL m_g0(BX), DX // DX = g0 + CMPL AX, DX + JEQ noswitch + + CMPL AX, m_curg(BX) + JNE bad + + // switch stacks + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. + CALL gosave_systemstack_switch<>(SB) + + // switch to g0 + get_tls(CX) + MOVL DX, g(CX) + MOVL (g_sched+gobuf_sp)(DX), BX + MOVL BX, SP + + // call target function + MOVL DI, DX + MOVL 0(DI), DI + CALL DI + + // switch back to g + get_tls(CX) + MOVL g(CX), AX + MOVL g_m(AX), BX + MOVL m_curg(BX), AX + MOVL AX, g(CX) + MOVL (g_sched+gobuf_sp)(AX), SP + MOVL $0, (g_sched+gobuf_sp)(AX) + RET + +noswitch: + // already on system stack; tail call the function + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVL DI, DX + MOVL 0(DI), DI + JMP DI + +bad: + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVL $runtime·badsystemstack(SB), AX + CALL AX + INT $3 + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT,$0-0 + // Cannot grow scheduler stack (m->g0). + get_tls(CX) + MOVL g(CX), BX + MOVL g_m(BX), BX + MOVL m_g0(BX), SI + CMPL g(CX), SI + JNE 3(PC) + CALL runtime·badmorestackg0(SB) + CALL runtime·abort(SB) + + // Cannot grow signal stack. 
+ MOVL m_gsignal(BX), SI + CMPL g(CX), SI + JNE 3(PC) + CALL runtime·badmorestackgsignal(SB) + CALL runtime·abort(SB) + + // Called from f. + // Set m->morebuf to f's caller. + NOP SP // tell vet SP changed - stop checking offsets + MOVL 4(SP), DI // f's caller's PC + MOVL DI, (m_morebuf+gobuf_pc)(BX) + LEAL 8(SP), CX // f's caller's SP + MOVL CX, (m_morebuf+gobuf_sp)(BX) + get_tls(CX) + MOVL g(CX), SI + MOVL SI, (m_morebuf+gobuf_g)(BX) + + // Set g->sched to context in f. + MOVL 0(SP), AX // f's PC + MOVL AX, (g_sched+gobuf_pc)(SI) + LEAL 4(SP), AX // f's SP + MOVL AX, (g_sched+gobuf_sp)(SI) + MOVL DX, (g_sched+gobuf_ctxt)(SI) + + // Call newstack on m->g0's stack. + MOVL m_g0(BX), BP + MOVL BP, g(CX) + MOVL (g_sched+gobuf_sp)(BP), AX + MOVL -4(AX), BX // fault if CALL would, before smashing SP + MOVL AX, SP + CALL runtime·newstack(SB) + CALL runtime·abort(SB) // crash if newstack returns + RET + +TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0-0 + MOVL $0, DX + JMP runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + CMPL CX, $MAXSIZE; \ + JA 3(PC); \ + MOVL $NAME(SB), AX; \ + JMP AX +// Note: can't just "JMP NAME(SB)" - bad inlining results. 
+ +TEXT ·reflectcall(SB), NOSPLIT, $0-28 + MOVL frameSize+20(FP), CX + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVL $runtime·badreflectcall(SB), AX + JMP AX + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-28; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVL stackArgs+8(FP), SI; \ + MOVL stackArgsSize+12(FP), CX; \ + MOVL SP, DI; \ + REP;MOVSB; \ + /* call function */ \ + MOVL f+4(FP), DX; \ + MOVL (DX), AX; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + CALL AX; \ + /* copy return values back */ \ + MOVL stackArgsType+0(FP), DX; \ + MOVL stackArgs+8(FP), DI; \ + MOVL stackArgsSize+12(FP), CX; \ + MOVL stackRetOffset+16(FP), BX; \ + MOVL SP, SI; \ + ADDL BX, DI; \ + ADDL BX, SI; \ + SUBL BX, CX; \ + CALL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. 
It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $20-0 + MOVL DX, 0(SP) + MOVL DI, 4(SP) + MOVL SI, 8(SP) + MOVL CX, 12(SP) + MOVL $0, 16(SP) + CALL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + MOVL cycles+0(FP), AX +again: + PAUSE + SUBL $1, AX + JNZ again + RET + +TEXT ·publicationBarrier(SB),NOSPLIT,$0-0 + // Stores are already ordered on x86, so this is just a + // compile barrier. + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT,$0 + PUSHL AX + PUSHL BX + get_tls(BX) + MOVL g(BX), BX + LEAL arg+0(FP), AX + MOVL AX, (g_sched+gobuf_sp)(BX) + MOVL $runtime·systemstack_switch(SB), AX + MOVL AX, (g_sched+gobuf_pc)(BX) + MOVL $0, (g_sched+gobuf_ret)(BX) + // Assert ctxt is zero. See func save. + MOVL (g_sched+gobuf_ctxt)(BX), AX + TESTL AX, AX + JZ 2(PC) + CALL runtime·abort(SB) + POPL BX + POPL AX + RET + +// func asmcgocall_no_g(fn, arg unsafe.Pointer) +// Call fn(arg) aligned appropriately for the gcc ABI. 
+// Called on a system stack, and there may be no g yet (during needm). +TEXT ·asmcgocall_no_g(SB),NOSPLIT,$0-8 + MOVL fn+0(FP), AX + MOVL arg+4(FP), BX + MOVL SP, DX + SUBL $32, SP + ANDL $~15, SP // alignment, perhaps unnecessary + MOVL DX, 8(SP) // save old SP + MOVL BX, 0(SP) // first argument in x86-32 ABI + CALL AX + MOVL 8(SP), DX + MOVL DX, SP + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-12 + MOVL fn+0(FP), AX + MOVL arg+4(FP), BX + + MOVL SP, DX + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + get_tls(CX) + MOVL g(CX), DI + CMPL DI, $0 + JEQ nosave // Don't even have a G yet. + MOVL g_m(DI), BP + CMPL DI, m_gsignal(BP) + JEQ noswitch + MOVL m_g0(BP), SI + CMPL DI, SI + JEQ noswitch + CALL gosave_systemstack_switch<>(SB) + get_tls(CX) + MOVL SI, g(CX) + MOVL (g_sched+gobuf_sp)(SI), SP + +noswitch: + // Now on a scheduling stack (a pthread-created stack). + SUBL $32, SP + ANDL $~15, SP // alignment, perhaps unnecessary + MOVL DI, 8(SP) // save g + MOVL (g_stack+stack_hi)(DI), DI + SUBL DX, DI + MOVL DI, 4(SP) // save depth in stack (can't just save SP, as stack might be copied during a callback) + MOVL BX, 0(SP) // first argument in x86-32 ABI + CALL AX + + // Restore registers, g, stack pointer. + get_tls(CX) + MOVL 8(SP), DI + MOVL (g_stack+stack_hi)(DI), SI + SUBL 4(SP), SI + MOVL DI, g(CX) + MOVL SI, SP + + MOVL AX, ret+8(FP) + RET +nosave: + // Now on a scheduling stack (a pthread-created stack). 
+ SUBL $32, SP + ANDL $~15, SP // alignment, perhaps unnecessary + MOVL DX, 4(SP) // save original stack pointer + MOVL BX, 0(SP) // first argument in x86-32 ABI + CALL AX + + MOVL 4(SP), CX // restore original stack pointer + MOVL CX, SP + MOVL AX, ret+8(FP) + RET + +// cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$12-12 // Frame size must match commented places below + NO_LOCAL_POINTERS + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call through AX. + get_tls(CX) +#ifdef GOOS_windows + MOVL $0, BP + CMPL CX, $0 + JEQ 2(PC) // TODO +#endif + MOVL g(CX), BP + CMPL BP, $0 + JEQ needm + MOVL g_m(BP), BP + MOVL BP, savedm-4(SP) // saved copy of oldm + JMP havem +needm: + MOVL $runtime·needm(SB), AX + CALL AX + MOVL $0, savedm-4(SP) // dropm on return + get_tls(CX) + MOVL g(CX), BP + MOVL g_m(BP), BP + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVL m_g0(BP), SI + MOVL SP, (g_sched+gobuf_sp)(SI) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. 
+ // NOTE: unwindm knows that the saved g->sched.sp is at 0(SP). + MOVL m_g0(BP), SI + MOVL (g_sched+gobuf_sp)(SI), AX + MOVL AX, 0(SP) + MOVL SP, (g_sched+gobuf_sp)(SI) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVL m_curg(BP), SI + MOVL SI, g(CX) + MOVL (g_sched+gobuf_sp)(SI), DI // prepare stack as DI + MOVL (g_sched+gobuf_pc)(SI), BP + MOVL BP, -4(DI) // "push" return PC on the g stack + // Gather our arguments into registers. + MOVL fn+0(FP), AX + MOVL frame+4(FP), BX + MOVL ctxt+8(FP), CX + LEAL -(4+12)(DI), SP // Must match declared frame size + MOVL AX, 0(SP) + MOVL BX, 4(SP) + MOVL CX, 8(SP) + CALL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + get_tls(CX) + MOVL g(CX), SI + MOVL 12(SP), BP // Must match declared frame size + MOVL BP, (g_sched+gobuf_pc)(SI) + LEAL (12+4)(SP), DI // Must match declared frame size + MOVL DI, (g_sched+gobuf_sp)(SI) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVL g(CX), BP + MOVL g_m(BP), BP + MOVL m_g0(BP), SI + MOVL SI, g(CX) + MOVL (g_sched+gobuf_sp)(SI), SP + MOVL 0(SP), AX + MOVL AX, (g_sched+gobuf_sp)(SI) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. 
Since the call is over, return it with dropm. + MOVL savedm-4(SP), DX + CMPL DX, $0 + JNE 3(PC) + MOVL $runtime·dropm(SB), AX + CALL AX + + // Done! + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-4 + MOVL gg+0(FP), BX +#ifdef GOOS_windows + CMPL BX, $0 + JNE settls + MOVL $0, 0x14(FS) + RET +settls: + MOVL g_m(BX), AX + LEAL m_tls(AX), AX + MOVL AX, 0x14(FS) +#endif + get_tls(CX) + MOVL BX, g(CX) + RET + +// void setg_gcc(G*); set g. for use by gcc +TEXT setg_gcc<>(SB), NOSPLIT, $0 + get_tls(AX) + MOVL gg+0(FP), DX + MOVL DX, g(AX) + RET + +TEXT runtime·abort(SB),NOSPLIT,$0-0 + INT $3 +loop: + JMP loop + +// check that SP is in range [g->stack.lo, g->stack.hi) +TEXT runtime·stackcheck(SB), NOSPLIT, $0-0 + get_tls(CX) + MOVL g(CX), AX + CMPL (g_stack+stack_hi)(AX), SP + JHI 2(PC) + CALL runtime·abort(SB) + CMPL SP, (g_stack+stack_lo)(AX) + JHI 2(PC) + CALL runtime·abort(SB) + RET + +// func cputicks() int64 +TEXT runtime·cputicks(SB),NOSPLIT,$0-8 + // LFENCE/MFENCE instruction support is dependent on SSE2. + // When no SSE2 support is present do not enforce any serialization + // since using CPUID to serialize the instruction stream is + // very costly. +#ifdef GO386_softfloat + JMP rdtsc // no fence instructions available +#endif + CMPB internal∕cpu·X86+const_offsetX86HasRDTSCP(SB), $1 + JNE fences + // Instruction stream serializing RDTSCP is supported. + // RDTSCP is supported by Intel Nehalem (2008) and + // AMD K8 Rev. F (2006) and newer. + RDTSCP +done: + MOVL AX, ret_lo+0(FP) + MOVL DX, ret_hi+4(FP) + RET +fences: + // MFENCE is instruction stream serializing and flushes the + // store buffers on AMD. The serialization semantics of LFENCE on AMD + // are dependent on MSR C001_1029 and CPU generation. + // LFENCE on Intel does wait for all previous instructions to have executed. 
+ // Intel recommends MFENCE;LFENCE in its manuals before RDTSC to have all + // previous instructions executed and all previous loads and stores to be globally visible. + // Using MFENCE;LFENCE here aligns the serializing properties without + // runtime detection of CPU manufacturer. + MFENCE + LFENCE +rdtsc: + RDTSC + JMP done + +TEXT ldt0setup<>(SB),NOSPLIT,$16-0 + // set up ldt 7 to point at m0.tls + // ldt 1 would be fine on Linux, but on OS X, 7 is as low as we can go. + // the entry number is just a hint. setldt will set up GS with what it used. + MOVL $7, 0(SP) + LEAL runtime·m0+m_tls(SB), AX + MOVL AX, 4(SP) + MOVL $32, 8(SP) // sizeof(tls array) + CALL runtime·setldt(SB) + RET + +TEXT runtime·emptyfunc(SB),0,$0-0 + RET + +// hash function using AES hardware instructions +TEXT runtime·memhash(SB),NOSPLIT,$0-16 + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVL p+0(FP), AX // ptr to data + MOVL s+8(FP), BX // size + LEAL ret+12(FP), DX + JMP aeshashbody<>(SB) +noaes: + JMP runtime·memhashFallback(SB) + +TEXT runtime·strhash(SB),NOSPLIT,$0-12 + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVL p+0(FP), AX // ptr to string object + MOVL 4(AX), BX // length of string + MOVL (AX), AX // string data + LEAL ret+8(FP), DX + JMP aeshashbody<>(SB) +noaes: + JMP runtime·strhashFallback(SB) + +// AX: data +// BX: length +// DX: address to put return value +TEXT aeshashbody<>(SB),NOSPLIT,$0-0 + MOVL h+4(FP), X0 // 32 bits of per-table hash seed + PINSRW $4, BX, X0 // 16 bits of length + PSHUFHW $0, X0, X0 // replace size with its low 2 bytes repeated 4 times + MOVO X0, X1 // save unscrambled seed + PXOR runtime·aeskeysched(SB), X0 // xor in per-process seed + AESENC X0, X0 // scramble seed + + CMPL BX, $16 + JB aes0to15 + JE aes16 + CMPL BX, $32 + JBE aes17to32 + CMPL BX, $64 + JBE aes33to64 + JMP aes65plus + +aes0to15: + TESTL BX, BX + JE aes0 + + ADDL $16, AX + TESTW $0xff0, AX + JE endofpage + + // 16 bytes loaded at this address won't cross + // a page boundary, so
we can load it directly. + MOVOU -16(AX), X1 + ADDL BX, BX + PAND masks<>(SB)(BX*8), X1 + +final1: + PXOR X0, X1 // xor data with seed + AESENC X1, X1 // scramble combo 3 times + AESENC X1, X1 + AESENC X1, X1 + MOVL X1, (DX) + RET + +endofpage: + // address ends in 1111xxxx. Might be up against + // a page boundary, so load ending at last byte. + // Then shift bytes down using pshufb. + MOVOU -32(AX)(BX*1), X1 + ADDL BX, BX + PSHUFB shifts<>(SB)(BX*8), X1 + JMP final1 + +aes0: + // Return scrambled input seed + AESENC X0, X0 + MOVL X0, (DX) + RET + +aes16: + MOVOU (AX), X1 + JMP final1 + +aes17to32: + // make second starting seed + PXOR runtime·aeskeysched+16(SB), X1 + AESENC X1, X1 + + // load data to be hashed + MOVOU (AX), X2 + MOVOU -16(AX)(BX*1), X3 + + // xor with seed + PXOR X0, X2 + PXOR X1, X3 + + // scramble 3 times + AESENC X2, X2 + AESENC X3, X3 + AESENC X2, X2 + AESENC X3, X3 + AESENC X2, X2 + AESENC X3, X3 + + // combine results + PXOR X3, X2 + MOVL X2, (DX) + RET + +aes33to64: + // make 3 more starting seeds + MOVO X1, X2 + MOVO X1, X3 + PXOR runtime·aeskeysched+16(SB), X1 + PXOR runtime·aeskeysched+32(SB), X2 + PXOR runtime·aeskeysched+48(SB), X3 + AESENC X1, X1 + AESENC X2, X2 + AESENC X3, X3 + + MOVOU (AX), X4 + MOVOU 16(AX), X5 + MOVOU -32(AX)(BX*1), X6 + MOVOU -16(AX)(BX*1), X7 + + PXOR X0, X4 + PXOR X1, X5 + PXOR X2, X6 + PXOR X3, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + PXOR X6, X4 + PXOR X7, X5 + PXOR X5, X4 + MOVL X4, (DX) + RET + +aes65plus: + // make 3 more starting seeds + MOVO X1, X2 + MOVO X1, X3 + PXOR runtime·aeskeysched+16(SB), X1 + PXOR runtime·aeskeysched+32(SB), X2 + PXOR runtime·aeskeysched+48(SB), X3 + AESENC X1, X1 + AESENC X2, X2 + AESENC X3, X3 + + // start with last (possibly overlapping) block + MOVOU -64(AX)(BX*1), X4 + MOVOU -48(AX)(BX*1), X5 + MOVOU 
-32(AX)(BX*1), X6 + MOVOU -16(AX)(BX*1), X7 + + // scramble state once + AESENC X0, X4 + AESENC X1, X5 + AESENC X2, X6 + AESENC X3, X7 + + // compute number of remaining 64-byte blocks + DECL BX + SHRL $6, BX + +aesloop: + // scramble state, xor in a block + MOVOU (AX), X0 + MOVOU 16(AX), X1 + MOVOU 32(AX), X2 + MOVOU 48(AX), X3 + AESENC X0, X4 + AESENC X1, X5 + AESENC X2, X6 + AESENC X3, X7 + + // scramble state + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + ADDL $64, AX + DECL BX + JNE aesloop + + // 3 more scrambles to finish + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + PXOR X6, X4 + PXOR X7, X5 + PXOR X5, X4 + MOVL X4, (DX) + RET + +TEXT runtime·memhash32(SB),NOSPLIT,$0-12 + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVL p+0(FP), AX // ptr to data + MOVL h+4(FP), X0 // seed + PINSRD $1, (AX), X0 // data + AESENC runtime·aeskeysched+0(SB), X0 + AESENC runtime·aeskeysched+16(SB), X0 + AESENC runtime·aeskeysched+32(SB), X0 + MOVL X0, ret+8(FP) + RET +noaes: + JMP runtime·memhash32Fallback(SB) + +TEXT runtime·memhash64(SB),NOSPLIT,$0-12 + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVL p+0(FP), AX // ptr to data + MOVQ (AX), X0 // data + PINSRD $2, h+4(FP), X0 // seed + AESENC runtime·aeskeysched+0(SB), X0 + AESENC runtime·aeskeysched+16(SB), X0 + AESENC runtime·aeskeysched+32(SB), X0 + MOVL X0, ret+8(FP) + RET +noaes: + JMP runtime·memhash64Fallback(SB) + +// simple mask to get rid of data in the high part of the register. 
+DATA masks<>+0x00(SB)/4, $0x00000000 +DATA masks<>+0x04(SB)/4, $0x00000000 +DATA masks<>+0x08(SB)/4, $0x00000000 +DATA masks<>+0x0c(SB)/4, $0x00000000 + +DATA masks<>+0x10(SB)/4, $0x000000ff +DATA masks<>+0x14(SB)/4, $0x00000000 +DATA masks<>+0x18(SB)/4, $0x00000000 +DATA masks<>+0x1c(SB)/4, $0x00000000 + +DATA masks<>+0x20(SB)/4, $0x0000ffff +DATA masks<>+0x24(SB)/4, $0x00000000 +DATA masks<>+0x28(SB)/4, $0x00000000 +DATA masks<>+0x2c(SB)/4, $0x00000000 + +DATA masks<>+0x30(SB)/4, $0x00ffffff +DATA masks<>+0x34(SB)/4, $0x00000000 +DATA masks<>+0x38(SB)/4, $0x00000000 +DATA masks<>+0x3c(SB)/4, $0x00000000 + +DATA masks<>+0x40(SB)/4, $0xffffffff +DATA masks<>+0x44(SB)/4, $0x00000000 +DATA masks<>+0x48(SB)/4, $0x00000000 +DATA masks<>+0x4c(SB)/4, $0x00000000 + +DATA masks<>+0x50(SB)/4, $0xffffffff +DATA masks<>+0x54(SB)/4, $0x000000ff +DATA masks<>+0x58(SB)/4, $0x00000000 +DATA masks<>+0x5c(SB)/4, $0x00000000 + +DATA masks<>+0x60(SB)/4, $0xffffffff +DATA masks<>+0x64(SB)/4, $0x0000ffff +DATA masks<>+0x68(SB)/4, $0x00000000 +DATA masks<>+0x6c(SB)/4, $0x00000000 + +DATA masks<>+0x70(SB)/4, $0xffffffff +DATA masks<>+0x74(SB)/4, $0x00ffffff +DATA masks<>+0x78(SB)/4, $0x00000000 +DATA masks<>+0x7c(SB)/4, $0x00000000 + +DATA masks<>+0x80(SB)/4, $0xffffffff +DATA masks<>+0x84(SB)/4, $0xffffffff +DATA masks<>+0x88(SB)/4, $0x00000000 +DATA masks<>+0x8c(SB)/4, $0x00000000 + +DATA masks<>+0x90(SB)/4, $0xffffffff +DATA masks<>+0x94(SB)/4, $0xffffffff +DATA masks<>+0x98(SB)/4, $0x000000ff +DATA masks<>+0x9c(SB)/4, $0x00000000 + +DATA masks<>+0xa0(SB)/4, $0xffffffff +DATA masks<>+0xa4(SB)/4, $0xffffffff +DATA masks<>+0xa8(SB)/4, $0x0000ffff +DATA masks<>+0xac(SB)/4, $0x00000000 + +DATA masks<>+0xb0(SB)/4, $0xffffffff +DATA masks<>+0xb4(SB)/4, $0xffffffff +DATA masks<>+0xb8(SB)/4, $0x00ffffff +DATA masks<>+0xbc(SB)/4, $0x00000000 + +DATA masks<>+0xc0(SB)/4, $0xffffffff +DATA masks<>+0xc4(SB)/4, $0xffffffff +DATA masks<>+0xc8(SB)/4, $0xffffffff +DATA masks<>+0xcc(SB)/4, $0x00000000 
+ +DATA masks<>+0xd0(SB)/4, $0xffffffff +DATA masks<>+0xd4(SB)/4, $0xffffffff +DATA masks<>+0xd8(SB)/4, $0xffffffff +DATA masks<>+0xdc(SB)/4, $0x000000ff + +DATA masks<>+0xe0(SB)/4, $0xffffffff +DATA masks<>+0xe4(SB)/4, $0xffffffff +DATA masks<>+0xe8(SB)/4, $0xffffffff +DATA masks<>+0xec(SB)/4, $0x0000ffff + +DATA masks<>+0xf0(SB)/4, $0xffffffff +DATA masks<>+0xf4(SB)/4, $0xffffffff +DATA masks<>+0xf8(SB)/4, $0xffffffff +DATA masks<>+0xfc(SB)/4, $0x00ffffff + +GLOBL masks<>(SB),RODATA,$256 + +// these are arguments to pshufb. They move data down from +// the high bytes of the register to the low bytes of the register. +// index is how many bytes to move. +DATA shifts<>+0x00(SB)/4, $0x00000000 +DATA shifts<>+0x04(SB)/4, $0x00000000 +DATA shifts<>+0x08(SB)/4, $0x00000000 +DATA shifts<>+0x0c(SB)/4, $0x00000000 + +DATA shifts<>+0x10(SB)/4, $0xffffff0f +DATA shifts<>+0x14(SB)/4, $0xffffffff +DATA shifts<>+0x18(SB)/4, $0xffffffff +DATA shifts<>+0x1c(SB)/4, $0xffffffff + +DATA shifts<>+0x20(SB)/4, $0xffff0f0e +DATA shifts<>+0x24(SB)/4, $0xffffffff +DATA shifts<>+0x28(SB)/4, $0xffffffff +DATA shifts<>+0x2c(SB)/4, $0xffffffff + +DATA shifts<>+0x30(SB)/4, $0xff0f0e0d +DATA shifts<>+0x34(SB)/4, $0xffffffff +DATA shifts<>+0x38(SB)/4, $0xffffffff +DATA shifts<>+0x3c(SB)/4, $0xffffffff + +DATA shifts<>+0x40(SB)/4, $0x0f0e0d0c +DATA shifts<>+0x44(SB)/4, $0xffffffff +DATA shifts<>+0x48(SB)/4, $0xffffffff +DATA shifts<>+0x4c(SB)/4, $0xffffffff + +DATA shifts<>+0x50(SB)/4, $0x0e0d0c0b +DATA shifts<>+0x54(SB)/4, $0xffffff0f +DATA shifts<>+0x58(SB)/4, $0xffffffff +DATA shifts<>+0x5c(SB)/4, $0xffffffff + +DATA shifts<>+0x60(SB)/4, $0x0d0c0b0a +DATA shifts<>+0x64(SB)/4, $0xffff0f0e +DATA shifts<>+0x68(SB)/4, $0xffffffff +DATA shifts<>+0x6c(SB)/4, $0xffffffff + +DATA shifts<>+0x70(SB)/4, $0x0c0b0a09 +DATA shifts<>+0x74(SB)/4, $0xff0f0e0d +DATA shifts<>+0x78(SB)/4, $0xffffffff +DATA shifts<>+0x7c(SB)/4, $0xffffffff + +DATA shifts<>+0x80(SB)/4, $0x0b0a0908 +DATA shifts<>+0x84(SB)/4, 
$0x0f0e0d0c +DATA shifts<>+0x88(SB)/4, $0xffffffff +DATA shifts<>+0x8c(SB)/4, $0xffffffff + +DATA shifts<>+0x90(SB)/4, $0x0a090807 +DATA shifts<>+0x94(SB)/4, $0x0e0d0c0b +DATA shifts<>+0x98(SB)/4, $0xffffff0f +DATA shifts<>+0x9c(SB)/4, $0xffffffff + +DATA shifts<>+0xa0(SB)/4, $0x09080706 +DATA shifts<>+0xa4(SB)/4, $0x0d0c0b0a +DATA shifts<>+0xa8(SB)/4, $0xffff0f0e +DATA shifts<>+0xac(SB)/4, $0xffffffff + +DATA shifts<>+0xb0(SB)/4, $0x08070605 +DATA shifts<>+0xb4(SB)/4, $0x0c0b0a09 +DATA shifts<>+0xb8(SB)/4, $0xff0f0e0d +DATA shifts<>+0xbc(SB)/4, $0xffffffff + +DATA shifts<>+0xc0(SB)/4, $0x07060504 +DATA shifts<>+0xc4(SB)/4, $0x0b0a0908 +DATA shifts<>+0xc8(SB)/4, $0x0f0e0d0c +DATA shifts<>+0xcc(SB)/4, $0xffffffff + +DATA shifts<>+0xd0(SB)/4, $0x06050403 +DATA shifts<>+0xd4(SB)/4, $0x0a090807 +DATA shifts<>+0xd8(SB)/4, $0x0e0d0c0b +DATA shifts<>+0xdc(SB)/4, $0xffffff0f + +DATA shifts<>+0xe0(SB)/4, $0x05040302 +DATA shifts<>+0xe4(SB)/4, $0x09080706 +DATA shifts<>+0xe8(SB)/4, $0x0d0c0b0a +DATA shifts<>+0xec(SB)/4, $0xffff0f0e + +DATA shifts<>+0xf0(SB)/4, $0x04030201 +DATA shifts<>+0xf4(SB)/4, $0x08070605 +DATA shifts<>+0xf8(SB)/4, $0x0c0b0a09 +DATA shifts<>+0xfc(SB)/4, $0xff0f0e0d + +GLOBL shifts<>(SB),RODATA,$256 + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + // check that masks<>(SB) and shifts<>(SB) are aligned to 16-byte + MOVL $masks<>(SB), AX + MOVL $shifts<>(SB), BX + ORL BX, AX + TESTL $15, AX + SETEQ ret+0(FP) + RET + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVL $0, AX + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$0 + get_tls(CX) + MOVL g(CX), AX + MOVL g_m(AX), AX + MOVL m_curg(AX), AX + MOVL (g_stack+stack_hi)(AX), AX + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. 
+TEXT runtime·goexit(SB),NOSPLIT|TOPFRAME,$0-0 + BYTE $0x90 // NOP + CALL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + BYTE $0x90 // NOP + +// Add a module's moduledata to the linked list of moduledata objects. This +// is called from .init_array by a function generated in the linker and so +// follows the platform ABI wrt register preservation -- it only touches AX, +// CX (implicitly) and DX, but it does not follow the ABI wrt arguments: +// instead the pointer to the moduledata is passed in AX. +TEXT runtime·addmoduledata(SB),NOSPLIT,$0-0 + MOVL runtime·lastmoduledatap(SB), DX + MOVL AX, moduledata_next(DX) + MOVL AX, runtime·lastmoduledatap(SB) + RET + +TEXT runtime·uint32tofloat64(SB),NOSPLIT,$8-12 + MOVL a+0(FP), AX + MOVL AX, 0(SP) + MOVL $0, 4(SP) + FMOVV 0(SP), F0 + FMOVDP F0, ret+4(FP) + RET + +TEXT runtime·float64touint32(SB),NOSPLIT,$12-12 + FMOVD a+0(FP), F0 + FSTCW 0(SP) + FLDCW runtime·controlWord64trunc(SB) + FMOVVP F0, 4(SP) + FLDCW 0(SP) + MOVL 4(SP), AX + MOVL AX, ret+8(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - DI is the destination of the write +// - AX is the value being written at DI +// It clobbers FLAGS. It does not clobber any general-purpose registers, +// but may clobber others (e.g., SSE registers). +TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$28 + // Save the registers clobbered by the fast path. This is slightly + // faster than having the caller spill these. + MOVL CX, 20(SP) + MOVL BX, 24(SP) + // TODO: Consider passing g.m.p in as an argument so they can be shared + // across a sequence of write barriers. + get_tls(BX) + MOVL g(BX), BX + MOVL g_m(BX), BX + MOVL m_p(BX), BX + MOVL (p_wbBuf+wbBuf_next)(BX), CX + // Increment wbBuf.next position. + LEAL 8(CX), CX + MOVL CX, (p_wbBuf+wbBuf_next)(BX) + CMPL CX, (p_wbBuf+wbBuf_end)(BX) + // Record the write. 
+ MOVL AX, -8(CX) // Record value + MOVL (DI), BX // TODO: This turns bad writes into bad reads. + MOVL BX, -4(CX) // Record *slot + // Is the buffer full? (flags set in CMPL above) + JEQ flush +ret: + MOVL 20(SP), CX + MOVL 24(SP), BX + // Do the write. + MOVL AX, (DI) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + MOVL DI, 0(SP) // Also first argument to wbBufFlush + MOVL AX, 4(SP) // Also second argument to wbBufFlush + // BX already saved + // CX already saved + MOVL DX, 8(SP) + MOVL BP, 12(SP) + MOVL SI, 16(SP) + // DI already saved + + // This takes arguments DI and AX + CALL runtime·wbBufFlush(SB) + + MOVL 0(SP), DI + MOVL 4(SP), AX + MOVL 8(SP), DX + MOVL 12(SP), BP + MOVL 16(SP), SI + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments is allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces.
+TEXT runtime·panicIndex(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-8 + MOVL DX, x+0(FP) + MOVL BX, y+4(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT runtime·panicSlice3AlenU(SB),NOSPLIT,$0-8 + MOVL DX, x+0(FP) + MOVL BX, y+4(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-8 + MOVL DX, x+0(FP) + MOVL BX, y+4(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-8 + MOVL DX, x+0(FP) + MOVL BX, y+4(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-8 + MOVL CX, x+0(FP) + MOVL DX, y+4(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-8 + MOVL AX, x+0(FP) + MOVL CX, y+4(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-8 + MOVL DX, x+0(FP) + MOVL BX, y+4(FP) + JMP 
runtime·goPanicSliceConvert(SB) + +// Extended versions for 64-bit indexes. +TEXT runtime·panicExtendIndex(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendIndex(SB) +TEXT runtime·panicExtendIndexU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendIndexU(SB) +TEXT runtime·panicExtendSliceAlen(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSliceAlen(SB) +TEXT runtime·panicExtendSliceAlenU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSliceAlenU(SB) +TEXT runtime·panicExtendSliceAcap(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSliceAcap(SB) +TEXT runtime·panicExtendSliceAcapU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSliceAcapU(SB) +TEXT runtime·panicExtendSliceB(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendSliceB(SB) +TEXT runtime·panicExtendSliceBU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendSliceBU(SB) +TEXT runtime·panicExtendSlice3Alen(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL DX, lo+4(FP) + MOVL BX, y+8(FP) + JMP runtime·goPanicExtendSlice3Alen(SB) +TEXT runtime·panicExtendSlice3AlenU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL DX, lo+4(FP) + MOVL BX, y+8(FP) + JMP runtime·goPanicExtendSlice3AlenU(SB) +TEXT runtime·panicExtendSlice3Acap(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL DX, lo+4(FP) + MOVL BX, y+8(FP) + JMP runtime·goPanicExtendSlice3Acap(SB) +TEXT runtime·panicExtendSlice3AcapU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL DX, lo+4(FP) + MOVL BX, y+8(FP) + JMP runtime·goPanicExtendSlice3AcapU(SB) +TEXT runtime·panicExtendSlice3B(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + 
MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSlice3B(SB) +TEXT runtime·panicExtendSlice3BU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL CX, lo+4(FP) + MOVL DX, y+8(FP) + JMP runtime·goPanicExtendSlice3BU(SB) +TEXT runtime·panicExtendSlice3C(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendSlice3C(SB) +TEXT runtime·panicExtendSlice3CU(SB),NOSPLIT,$0-12 + MOVL SI, hi+0(FP) + MOVL AX, lo+4(FP) + MOVL CX, y+8(FP) + JMP runtime·goPanicExtendSlice3CU(SB) + +#ifdef GOOS_android +// Use the free TLS_SLOT_APP slot #2 on Android Q. +// Earlier androids are set up in gcc_android.c. +DATA runtime·tls_g+0(SB)/4, $8 +GLOBL runtime·tls_g+0(SB), NOPTR, $4 +#endif diff --git a/src/runtime/asm_amd64.h b/src/runtime/asm_amd64.h new file mode 100644 index 0000000..f7a8896 --- /dev/null +++ b/src/runtime/asm_amd64.h @@ -0,0 +1,25 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Define features that are guaranteed to be supported by setting the GOAMD64 variable. +// If a feature is supported, there's no need to check it at runtime every time. + +#ifdef GOAMD64_v2 +#define hasPOPCNT +#define hasSSE42 +#endif + +#ifdef GOAMD64_v3 +#define hasAVX +#define hasAVX2 +#define hasPOPCNT +#define hasSSE42 +#endif + +#ifdef GOAMD64_v4 +#define hasAVX +#define hasAVX2 +#define hasPOPCNT +#define hasSSE42 +#endif diff --git a/src/runtime/asm_amd64.s b/src/runtime/asm_amd64.s new file mode 100644 index 0000000..13c8de4 --- /dev/null +++ b/src/runtime/asm_amd64.s @@ -0,0 +1,2066 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +// _rt0_amd64 is common startup code for most amd64 systems when using +// internal linking. This is the entry point for the program from the +// kernel for an ordinary -buildmode=exe program. The stack holds the +// number of arguments and the C-style argv. +TEXT _rt0_amd64(SB),NOSPLIT,$-8 + MOVQ 0(SP), DI // argc + LEAQ 8(SP), SI // argv + JMP runtime·rt0_go(SB) + +// main is common startup code for most amd64 systems when using +// external linking. The C startup code will call the symbol "main" +// passing argc and argv in the usual C ABI registers DI and SI. +TEXT main(SB),NOSPLIT,$-8 + JMP runtime·rt0_go(SB) + +// _rt0_amd64_lib is common startup code for most amd64 systems when +// using -buildmode=c-archive or -buildmode=c-shared. The linker will +// arrange to invoke this function as a global constructor (for +// c-archive) or when the shared library is loaded (for c-shared). +// We expect argc and argv to be passed in the usual C ABI registers +// DI and SI. +TEXT _rt0_amd64_lib(SB),NOSPLIT,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + MOVQ DI, _rt0_amd64_lib_argc<>(SB) + MOVQ SI, _rt0_amd64_lib_argv<>(SB) + + // Synchronous initialization. + CALL runtime·libpreinit(SB) + + // Create a new thread to finish Go runtime initialization. + MOVQ _cgo_sys_thread_create(SB), AX + TESTQ AX, AX + JZ nocgo + + // We're calling back to C. + // Align stack per ELF ABI requirements. + MOVQ SP, BX // Callee-save in C ABI + ANDQ $~15, SP + MOVQ $_rt0_amd64_lib_go(SB), DI + MOVQ $0, SI + CALL AX + MOVQ BX, SP + JMP restore + +nocgo: + ADJSP $16 + MOVQ $0x800000, 0(SP) // stacksize + MOVQ $_rt0_amd64_lib_go(SB), AX + MOVQ AX, 8(SP) // fn + CALL runtime·newosproc0(SB) + ADJSP $-16 + +restore: + POP_REGS_HOST_TO_ABI0() + RET + +// _rt0_amd64_lib_go initializes the Go runtime. 
+// This is started in a separate thread by _rt0_amd64_lib. +TEXT _rt0_amd64_lib_go(SB),NOSPLIT,$0 + MOVQ _rt0_amd64_lib_argc<>(SB), DI + MOVQ _rt0_amd64_lib_argv<>(SB), SI + JMP runtime·rt0_go(SB) + +DATA _rt0_amd64_lib_argc<>(SB)/8, $0 +GLOBL _rt0_amd64_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_amd64_lib_argv<>(SB)/8, $0 +GLOBL _rt0_amd64_lib_argv<>(SB),NOPTR, $8 + +#ifdef GOAMD64_v2 +DATA bad_cpu_msg<>+0x00(SB)/84, $"This program can only be run on AMD64 processors with v2 microarchitecture support.\n" +#endif + +#ifdef GOAMD64_v3 +DATA bad_cpu_msg<>+0x00(SB)/84, $"This program can only be run on AMD64 processors with v3 microarchitecture support.\n" +#endif + +#ifdef GOAMD64_v4 +DATA bad_cpu_msg<>+0x00(SB)/84, $"This program can only be run on AMD64 processors with v4 microarchitecture support.\n" +#endif + +GLOBL bad_cpu_msg<>(SB), RODATA, $84 + +// Define a list of AMD64 microarchitecture level features +// https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels + + // SSE3 SSSE3 CMPXCHG16B SSE4.1 SSE4.2 POPCNT +#define V2_FEATURES_CX (1 << 0 | 1 << 9 | 1 << 13 | 1 << 19 | 1 << 20 | 1 << 23) + // LAHF/SAHF +#define V2_EXT_FEATURES_CX (1 << 0) + // FMA MOVBE OSXSAVE AVX F16C +#define V3_FEATURES_CX (V2_FEATURES_CX | 1 << 12 | 1 << 22 | 1 << 27 | 1 << 28 | 1 << 29) + // ABM (for LZCNT) +#define V3_EXT_FEATURES_CX (V2_EXT_FEATURES_CX | 1 << 5) + // BMI1 AVX2 BMI2 +#define V3_EXT_FEATURES_BX (1 << 3 | 1 << 5 | 1 << 8) + // XMM YMM +#define V3_OS_SUPPORT_AX (1 << 1 | 1 << 2) + +#define V4_FEATURES_CX V3_FEATURES_CX + +#define V4_EXT_FEATURES_CX V3_EXT_FEATURES_CX + // AVX512F AVX512DQ AVX512CD AVX512BW AVX512VL +#define V4_EXT_FEATURES_BX (V3_EXT_FEATURES_BX | 1 << 16 | 1 << 17 | 1 << 28 | 1 << 30 | 1 << 31) + // OPMASK ZMM +#define V4_OS_SUPPORT_AX (V3_OS_SUPPORT_AX | 1 << 5 | (1 << 6 | 1 << 7)) + +#ifdef GOAMD64_v2 +#define NEED_MAX_CPUID 0x80000001 +#define NEED_FEATURES_CX V2_FEATURES_CX +#define NEED_EXT_FEATURES_CX V2_EXT_FEATURES_CX +#endif + +#ifdef
GOAMD64_v3 +#define NEED_MAX_CPUID 0x80000001 +#define NEED_FEATURES_CX V3_FEATURES_CX +#define NEED_EXT_FEATURES_CX V3_EXT_FEATURES_CX +#define NEED_EXT_FEATURES_BX V3_EXT_FEATURES_BX +#define NEED_OS_SUPPORT_AX V3_OS_SUPPORT_AX +#endif + +#ifdef GOAMD64_v4 +#define NEED_MAX_CPUID 0x80000001 +#define NEED_FEATURES_CX V4_FEATURES_CX +#define NEED_EXT_FEATURES_CX V4_EXT_FEATURES_CX +#define NEED_EXT_FEATURES_BX V4_EXT_FEATURES_BX + +// Darwin requires a different approach to check AVX512 support, see CL 285572. +#ifdef GOOS_darwin +#define NEED_OS_SUPPORT_AX V3_OS_SUPPORT_AX +// These values are from: +// https://github.com/apple/darwin-xnu/blob/xnu-4570.1.46/osfmk/i386/cpu_capabilities.h +#define commpage64_base_address 0x00007fffffe00000 +#define commpage64_cpu_capabilities64 (commpage64_base_address+0x010) +#define commpage64_version (commpage64_base_address+0x01E) +#define hasAVX512F 0x0000004000000000 +#define hasAVX512CD 0x0000008000000000 +#define hasAVX512DQ 0x0000010000000000 +#define hasAVX512BW 0x0000020000000000 +#define hasAVX512VL 0x0000100000000000 +#define NEED_DARWIN_SUPPORT (hasAVX512F | hasAVX512DQ | hasAVX512CD | hasAVX512BW | hasAVX512VL) +#else +#define NEED_OS_SUPPORT_AX V4_OS_SUPPORT_AX +#endif + +#endif + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // copy arguments forward on an even stack + MOVQ DI, AX // argc + MOVQ SI, BX // argv + SUBQ $(5*8), SP // 3args 2auto + ANDQ $~15, SP + MOVQ AX, 24(SP) + MOVQ BX, 32(SP) + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. 
+ MOVQ $runtime·g0(SB), DI + LEAQ (-64*1024+104)(SP), BX + MOVQ BX, g_stackguard0(DI) + MOVQ BX, g_stackguard1(DI) + MOVQ BX, (g_stack+stack_lo)(DI) + MOVQ SP, (g_stack+stack_hi)(DI) + + // find out information about the processor we're on + MOVL $0, AX + CPUID + CMPL AX, $0 + JE nocpuinfo + + CMPL BX, $0x756E6547 // "Genu" + JNE notintel + CMPL DX, $0x49656E69 // "ineI" + JNE notintel + CMPL CX, $0x6C65746E // "ntel" + JNE notintel + MOVB $1, runtime·isIntel(SB) + +notintel: + // Load EAX=1 cpuid flags + MOVL $1, AX + CPUID + MOVL AX, runtime·processorVersionInfo(SB) + +nocpuinfo: + // if there is an _cgo_init, call it. + MOVQ _cgo_init(SB), AX + TESTQ AX, AX + JZ needtls + // arg 1: g0, already in DI + MOVQ $setg_gcc<>(SB), SI // arg 2: setg_gcc + MOVQ $0, DX // arg 3, 4: not used when using platform's TLS + MOVQ $0, CX +#ifdef GOOS_android + MOVQ $runtime·tls_g(SB), DX // arg 3: &tls_g + // arg 4: TLS base, stored in slot 0 (Android's TLS_SLOT_SELF). + // Compensate for tls_g (+16). + MOVQ -16(TLS), CX +#endif +#ifdef GOOS_windows + MOVQ $runtime·tls_g(SB), DX // arg 3: &tls_g + // Adjust for the Win64 calling convention. 
+ MOVQ CX, R9 // arg 4 + MOVQ DX, R8 // arg 3 + MOVQ SI, DX // arg 2 + MOVQ DI, CX // arg 1 +#endif + CALL AX + + // update stackguard after _cgo_init + MOVQ $runtime·g0(SB), CX + MOVQ (g_stack+stack_lo)(CX), AX + ADDQ $const__StackGuard, AX + MOVQ AX, g_stackguard0(CX) + MOVQ AX, g_stackguard1(CX) + +#ifndef GOOS_windows + JMP ok +#endif +needtls: +#ifdef GOOS_plan9 + // skip TLS setup on Plan 9 + JMP ok +#endif +#ifdef GOOS_solaris + // skip TLS setup on Solaris + JMP ok +#endif +#ifdef GOOS_illumos + // skip TLS setup on illumos + JMP ok +#endif +#ifdef GOOS_darwin + // skip TLS setup on Darwin + JMP ok +#endif +#ifdef GOOS_openbsd + // skip TLS setup on OpenBSD + JMP ok +#endif + +#ifdef GOOS_windows + CALL runtime·wintls(SB) +#endif + + LEAQ runtime·m0+m_tls(SB), DI + CALL runtime·settls(SB) + + // store through it, to make sure it works + get_tls(BX) + MOVQ $0x123, g(BX) + MOVQ runtime·m0+m_tls(SB), AX + CMPQ AX, $0x123 + JEQ 2(PC) + CALL runtime·abort(SB) +ok: + // set the per-goroutine and per-mach "registers" + get_tls(BX) + LEAQ runtime·g0(SB), CX + MOVQ CX, g(BX) + LEAQ runtime·m0(SB), AX + + // save m->g0 = g0 + MOVQ CX, m_g0(AX) + // save m0 to g0->m + MOVQ AX, g_m(CX) + + CLD // convention is D is always left cleared + + // Check GOAMD64 requirements + // We need to do this after setting up TLS, so that + // we can report an error if there is a failure. See issue 49586.
+#ifdef NEED_FEATURES_CX + MOVL $0, AX + CPUID + CMPL AX, $0 + JE bad_cpu + MOVL $1, AX + CPUID + ANDL $NEED_FEATURES_CX, CX + CMPL CX, $NEED_FEATURES_CX + JNE bad_cpu +#endif + +#ifdef NEED_MAX_CPUID + MOVL $0x80000000, AX + CPUID + CMPL AX, $NEED_MAX_CPUID + JL bad_cpu +#endif + +#ifdef NEED_EXT_FEATURES_BX + MOVL $7, AX + MOVL $0, CX + CPUID + ANDL $NEED_EXT_FEATURES_BX, BX + CMPL BX, $NEED_EXT_FEATURES_BX + JNE bad_cpu +#endif + +#ifdef NEED_EXT_FEATURES_CX + MOVL $0x80000001, AX + CPUID + ANDL $NEED_EXT_FEATURES_CX, CX + CMPL CX, $NEED_EXT_FEATURES_CX + JNE bad_cpu +#endif + +#ifdef NEED_OS_SUPPORT_AX + XORL CX, CX + XGETBV + ANDL $NEED_OS_SUPPORT_AX, AX + CMPL AX, $NEED_OS_SUPPORT_AX + JNE bad_cpu +#endif + +#ifdef NEED_DARWIN_SUPPORT + MOVQ $commpage64_version, BX + CMPW (BX), $13 // cpu_capabilities64 undefined in versions < 13 + JL bad_cpu + MOVQ $commpage64_cpu_capabilities64, BX + MOVQ (BX), BX + MOVQ $NEED_DARWIN_SUPPORT, CX + ANDQ CX, BX + CMPQ BX, CX + JNE bad_cpu +#endif + + CALL runtime·check(SB) + + MOVL 24(SP), AX // copy argc + MOVL AX, 0(SP) + MOVQ 32(SP), AX // copy argv + MOVQ AX, 8(SP) + CALL runtime·args(SB) + CALL runtime·osinit(SB) + CALL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVQ $runtime·mainPC(SB), AX // entry + PUSHQ AX + CALL runtime·newproc(SB) + POPQ AX + + // start this M + CALL runtime·mstart(SB) + + CALL runtime·abort(SB) // mstart should never return + RET + +bad_cpu: // show that the program requires a certain microarchitecture level. + MOVQ $2, 0(SP) + MOVQ $bad_cpu_msg<>(SB), AX + MOVQ AX, 8(SP) + MOVQ $84, 16(SP) + CALL runtime·write(SB) + MOVQ $1, 0(SP) + CALL runtime·exit(SB) + CALL runtime·abort(SB) + RET + + // Prevent dead-code elimination of debugCallV2, which is + // intended to be called by debuggers. + MOVQ $runtime·debugCallV2<ABIInternal>(SB), AX + RET + +// mainPC is a function value for runtime.main, to be passed to newproc. 
+// The reference to runtime.main is made via ABIInternal, since the +// actual function (not the ABI0 wrapper) is needed by newproc. +DATA runtime·mainPC+0(SB)/8,$runtime·main<ABIInternal>(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +TEXT runtime·breakpoint(SB),NOSPLIT,$0-0 + BYTE $0xcc + RET + +TEXT runtime·asminit(SB),NOSPLIT,$0-0 + // No per-thread init. + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + CALL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// func gogo(buf *gobuf) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT, $0-8 + MOVQ buf+0(FP), BX // gobuf + MOVQ gobuf_g(BX), DX + MOVQ 0(DX), CX // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT, $0 + get_tls(CX) + MOVQ DX, g(CX) + MOVQ DX, R14 // set the g register + MOVQ gobuf_sp(BX), SP // restore SP + MOVQ gobuf_ret(BX), AX + MOVQ gobuf_ctxt(BX), DX + MOVQ gobuf_bp(BX), BP + MOVQ $0, gobuf_sp(BX) // clear to help garbage collector + MOVQ $0, gobuf_ret(BX) + MOVQ $0, gobuf_ctxt(BX) + MOVQ $0, gobuf_bp(BX) + MOVQ gobuf_pc(BX), BX + JMP BX + +// func mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. 
+TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT, $0-8 + MOVQ AX, DX // DX = fn + + // save state in g->sched + MOVQ 0(SP), BX // caller's PC + MOVQ BX, (g_sched+gobuf_pc)(R14) + LEAQ fn+0(FP), BX // caller's SP + MOVQ BX, (g_sched+gobuf_sp)(R14) + MOVQ BP, (g_sched+gobuf_bp)(R14) + + // switch to m->g0 & its stack, call fn + MOVQ g_m(R14), BX + MOVQ m_g0(BX), SI // SI = g.m.g0 + CMPQ SI, R14 // if g == m->g0 call badmcall + JNE goodm + JMP runtime·badmcall(SB) +goodm: + MOVQ R14, AX // AX (and arg 0) = g + MOVQ SI, R14 // g = g.m.g0 + get_tls(CX) // Set G in TLS + MOVQ R14, g(CX) + MOVQ (g_sched+gobuf_sp)(R14), SP // sp = g0.sched.sp + PUSHQ AX // open up space for fn's arg spill slot + MOVQ 0(DX), R12 + CALL R12 // fn(g) + POPQ AX + JMP runtime·badmcall2(SB) + RET + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVQ fn+0(FP), DI // DI = fn + get_tls(CX) + MOVQ g(CX), AX // AX = g + MOVQ g_m(AX), BX // BX = m + + CMPQ AX, m_gsignal(BX) + JEQ noswitch + + MOVQ m_g0(BX), DX // DX = g0 + CMPQ AX, DX + JEQ noswitch + + CMPQ AX, m_curg(BX) + JNE bad + + // switch stacks + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ CALL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVQ DX, g(CX) + MOVQ DX, R14 // set the g register + MOVQ (g_sched+gobuf_sp)(DX), BX + MOVQ BX, SP + + // call target function + MOVQ DI, DX + MOVQ 0(DI), DI + CALL DI + + // switch back to g + get_tls(CX) + MOVQ g(CX), AX + MOVQ g_m(AX), BX + MOVQ m_curg(BX), AX + MOVQ AX, g(CX) + MOVQ (g_sched+gobuf_sp)(AX), SP + MOVQ $0, (g_sched+gobuf_sp)(AX) + RET + +noswitch: + // already on m stack; tail call the function + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVQ DI, DX + MOVQ 0(DI), DI + JMP DI + +bad: + // Bad: g is not gsignal, not g0, not curg. What is it? + MOVQ $runtime·badsystemstack(SB), AX + CALL AX + INT $3 + + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT,$0-0 + // Cannot grow scheduler stack (m->g0). + get_tls(CX) + MOVQ g(CX), BX + MOVQ g_m(BX), BX + MOVQ m_g0(BX), SI + CMPQ g(CX), SI + JNE 3(PC) + CALL runtime·badmorestackg0(SB) + CALL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVQ m_gsignal(BX), SI + CMPQ g(CX), SI + JNE 3(PC) + CALL runtime·badmorestackgsignal(SB) + CALL runtime·abort(SB) + + // Called from f. + // Set m->morebuf to f's caller. + NOP SP // tell vet SP changed - stop checking offsets + MOVQ 8(SP), AX // f's caller's PC + MOVQ AX, (m_morebuf+gobuf_pc)(BX) + LEAQ 16(SP), AX // f's caller's SP + MOVQ AX, (m_morebuf+gobuf_sp)(BX) + get_tls(CX) + MOVQ g(CX), SI + MOVQ SI, (m_morebuf+gobuf_g)(BX) + + // Set g->sched to context in f. 
+ MOVQ 0(SP), AX // f's PC + MOVQ AX, (g_sched+gobuf_pc)(SI) + LEAQ 8(SP), AX // f's SP + MOVQ AX, (g_sched+gobuf_sp)(SI) + MOVQ BP, (g_sched+gobuf_bp)(SI) + MOVQ DX, (g_sched+gobuf_ctxt)(SI) + + // Call newstack on m->g0's stack. + MOVQ m_g0(BX), BX + MOVQ BX, g(CX) + MOVQ (g_sched+gobuf_sp)(BX), SP + CALL runtime·newstack(SB) + CALL runtime·abort(SB) // crash if newstack returns + RET + +// morestack but not preserving ctxt. +TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0 + MOVL $0, DX + JMP runtime·morestack(SB) + +// spillArgs stores return values from registers to a *internal/abi.RegArgs in R12. +TEXT ·spillArgs(SB),NOSPLIT,$0-0 + MOVQ AX, 0(R12) + MOVQ BX, 8(R12) + MOVQ CX, 16(R12) + MOVQ DI, 24(R12) + MOVQ SI, 32(R12) + MOVQ R8, 40(R12) + MOVQ R9, 48(R12) + MOVQ R10, 56(R12) + MOVQ R11, 64(R12) + MOVQ X0, 72(R12) + MOVQ X1, 80(R12) + MOVQ X2, 88(R12) + MOVQ X3, 96(R12) + MOVQ X4, 104(R12) + MOVQ X5, 112(R12) + MOVQ X6, 120(R12) + MOVQ X7, 128(R12) + MOVQ X8, 136(R12) + MOVQ X9, 144(R12) + MOVQ X10, 152(R12) + MOVQ X11, 160(R12) + MOVQ X12, 168(R12) + MOVQ X13, 176(R12) + MOVQ X14, 184(R12) + RET + +// unspillArgs loads args into registers from a *internal/abi.RegArgs in R12. +TEXT ·unspillArgs(SB),NOSPLIT,$0-0 + MOVQ 0(R12), AX + MOVQ 8(R12), BX + MOVQ 16(R12), CX + MOVQ 24(R12), DI + MOVQ 32(R12), SI + MOVQ 40(R12), R8 + MOVQ 48(R12), R9 + MOVQ 56(R12), R10 + MOVQ 64(R12), R11 + MOVQ 72(R12), X0 + MOVQ 80(R12), X1 + MOVQ 88(R12), X2 + MOVQ 96(R12), X3 + MOVQ 104(R12), X4 + MOVQ 112(R12), X5 + MOVQ 120(R12), X6 + MOVQ 128(R12), X7 + MOVQ 136(R12), X8 + MOVQ 144(R12), X9 + MOVQ 152(R12), X10 + MOVQ 160(R12), X11 + MOVQ 168(R12), X12 + MOVQ 176(R12), X13 + MOVQ 184(R12), X14 + RET + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). 
+// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + CMPQ CX, $MAXSIZE; \ + JA 3(PC); \ + MOVQ $NAME(SB), AX; \ + JMP AX +// Note: can't just "JMP NAME(SB)" - bad inlining results. + +TEXT ·reflectcall(SB), NOSPLIT, $0-48 + MOVLQZX frameSize+32(FP), CX + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVQ $runtime·badreflectcall(SB), AX + JMP AX + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVQ stackArgs+16(FP), SI; \ + MOVLQZX stackArgsSize+24(FP), CX; \ + MOVQ SP, DI; \ + REP;MOVSB; \ + /* set up argument registers */ \ + MOVQ regArgs+40(FP), R12; \ + CALL ·unspillArgs(SB); \ + /* call function */ \ + MOVQ f+8(FP), DX; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + MOVQ (DX), R12; \ + CALL R12; \ + /* copy register return 
values back */ \ + MOVQ regArgs+40(FP), R12; \ + CALL ·spillArgs(SB); \ + MOVLQZX stackArgsSize+24(FP), CX; \ + MOVLQZX stackRetOffset+28(FP), BX; \ + MOVQ stackArgs+16(FP), DI; \ + MOVQ stackArgsType+0(FP), DX; \ + MOVQ SP, SI; \ + ADDQ BX, DI; \ + ADDQ BX, SI; \ + SUBQ BX, CX; \ + CALL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $40-0 + NO_LOCAL_POINTERS + MOVQ DX, 0(SP) + MOVQ DI, 8(SP) + MOVQ SI, 16(SP) + MOVQ CX, 24(SP) + MOVQ R12, 32(SP) + CALL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + MOVL cycles+0(FP), AX +again: + PAUSE + SUBL $1, AX + JNZ again + RET + + +TEXT ·publicationBarrier(SB),NOSPLIT,$0-0 + // Stores are already ordered on x86, so this is just a + // compile barrier. + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R9. 
+TEXT gosave_systemstack_switch<>(SB),NOSPLIT,$0 + MOVQ $runtime·systemstack_switch(SB), R9 + MOVQ R9, (g_sched+gobuf_pc)(R14) + LEAQ 8(SP), R9 + MOVQ R9, (g_sched+gobuf_sp)(R14) + MOVQ $0, (g_sched+gobuf_ret)(R14) + MOVQ BP, (g_sched+gobuf_bp)(R14) + // Assert ctxt is zero. See func save. + MOVQ (g_sched+gobuf_ctxt)(R14), R9 + TESTQ R9, R9 + JZ 2(PC) + CALL runtime·abort(SB) + RET + +// func asmcgocall_no_g(fn, arg unsafe.Pointer) +// Call fn(arg) aligned appropriately for the gcc ABI. +// Called on a system stack, and there may be no g yet (during needm). +TEXT ·asmcgocall_no_g(SB),NOSPLIT,$0-16 + MOVQ fn+0(FP), AX + MOVQ arg+8(FP), BX + MOVQ SP, DX + SUBQ $32, SP + ANDQ $~15, SP // alignment + MOVQ DX, 8(SP) + MOVQ BX, DI // DI = first argument in AMD64 ABI + MOVQ BX, CX // CX = first argument in Win64 + CALL AX + MOVQ 8(SP), DX + MOVQ DX, SP + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOVQ fn+0(FP), AX + MOVQ arg+8(FP), BX + + MOVQ SP, DX + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + get_tls(CX) + MOVQ g(CX), DI + CMPQ DI, $0 + JEQ nosave + MOVQ g_m(DI), R8 + MOVQ m_gsignal(R8), SI + CMPQ DI, SI + JEQ nosave + MOVQ m_g0(R8), SI + CMPQ DI, SI + JEQ nosave + + // Switch to system stack. + CALL gosave_systemstack_switch<>(SB) + MOVQ SI, g(CX) + MOVQ (g_sched+gobuf_sp)(SI), SP + + // Now on a scheduling stack (a pthread-created stack). + // Make sure we have enough room for 4 stack-backed fast-call + // registers as per windows amd64 calling convention. 
+ SUBQ $64, SP + ANDQ $~15, SP // alignment for gcc ABI + MOVQ DI, 48(SP) // save g + MOVQ (g_stack+stack_hi)(DI), DI + SUBQ DX, DI + MOVQ DI, 40(SP) // save depth in stack (can't just save SP, as stack might be copied during a callback) + MOVQ BX, DI // DI = first argument in AMD64 ABI + MOVQ BX, CX // CX = first argument in Win64 + CALL AX + + // Restore registers, g, stack pointer. + get_tls(CX) + MOVQ 48(SP), DI + MOVQ (g_stack+stack_hi)(DI), SI + SUBQ 40(SP), SI + MOVQ DI, g(CX) + MOVQ SI, SP + + MOVL AX, ret+16(FP) + RET + +nosave: + // Running on a system stack, perhaps even without a g. + // Having no g can happen during thread creation or thread teardown + // (see needm/dropm on Solaris, for example). + // This code is like the above sequence but without saving/restoring g + // and without worrying about the stack moving out from under us + // (because we're on a system stack, not a goroutine stack). + // The above code could be used directly if already on a system stack, + // but then the only path through this code would be a rare case on Solaris. + // Using this code for all "already on system stack" calls exercises it more, + // which should help keep it correct. + SUBQ $64, SP + ANDQ $~15, SP + MOVQ $0, 48(SP) // where above code stores g, in case someone looks during debugging + MOVQ DX, 40(SP) // save original stack pointer + MOVQ BX, DI // DI = first argument in AMD64 ABI + MOVQ BX, CX // CX = first argument in Win64 + CALL AX + MOVQ 40(SP), SI // restore original stack pointer + MOVQ SI, SP + MOVL AX, ret+16(FP) + RET + +#ifdef GOOS_windows +// Dummy TLS that's used on Windows so that we don't crash trying +// to restore the G register in needm. needm and its callees are +// very careful never to actually use the G, the TLS just can't be +// unset since we're in Go code. +GLOBL zeroTLS<>(SB),RODATA,$const_tlsSize +#endif + +// func cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. 
+TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one m for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call through AX. + get_tls(CX) +#ifdef GOOS_windows + MOVL $0, BX + CMPQ CX, $0 + JEQ 2(PC) +#endif + MOVQ g(CX), BX + CMPQ BX, $0 + JEQ needm + MOVQ g_m(BX), BX + MOVQ BX, savedm-8(SP) // saved copy of oldm + JMP havem +needm: +#ifdef GOOS_windows + // Set up a dummy TLS value. needm is careful not to use it, + // but it needs to be there to prevent autogenerated code from + // crashing when it loads from it. + // We don't need to clear it or anything later because needm + // will set up TLS properly. + MOVQ $zeroTLS<>(SB), DI + CALL runtime·settls(SB) +#endif + // On some platforms (Windows) we cannot call needm through + // an ABI wrapper because there's no TLS set up, and the ABI + // wrapper will try to restore the G register (R14) from TLS. + // Clear X15 because Go expects it and we're not calling + // through a wrapper, but otherwise avoid setting the G + // register in the wrapper and call needm directly. It + // takes no arguments and doesn't return any values so + // there's no need to handle that. Clear R14 so that there's + // a bad value in there, in case needm tries to use it. + XORPS X15, X15 + XORQ R14, R14 + MOVQ $runtime·needm<ABIInternal>(SB), AX + CALL AX + MOVQ $0, savedm-8(SP) // dropm on return + get_tls(CX) + MOVQ g(CX), BX + MOVQ g_m(BX), BX + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. 
That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVQ m_g0(BX), SI + MOVQ SP, (g_sched+gobuf_sp)(SI) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 0(SP). + MOVQ m_g0(BX), SI + MOVQ (g_sched+gobuf_sp)(SI), AX + MOVQ AX, 0(SP) + MOVQ SP, (g_sched+gobuf_sp)(SI) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVQ m_curg(BX), SI + MOVQ SI, g(CX) + MOVQ (g_sched+gobuf_sp)(SI), DI // prepare stack as DI + MOVQ (g_sched+gobuf_pc)(SI), BX + MOVQ BX, -8(DI) // "push" return PC on the g stack + // Gather our arguments into registers. 
+ MOVQ fn+0(FP), BX + MOVQ frame+8(FP), CX + MOVQ ctxt+16(FP), DX + // Compute the size of the frame, including return PC and, if + // GOEXPERIMENT=framepointer, the saved base pointer + LEAQ fn+0(FP), AX + SUBQ SP, AX // AX is our actual frame size + SUBQ AX, DI // Allocate the same frame size on the g stack + MOVQ DI, SP + + MOVQ BX, 0(SP) + MOVQ CX, 8(SP) + MOVQ DX, 16(SP) + MOVQ $runtime·cgocallbackg(SB), AX + CALL AX // indirect call to bypass nosplit check. We're on a different stack now. + + // Compute the size of the frame again. FP and SP have + // completely different values here than they did above, + // but only their difference matters. + LEAQ fn+0(FP), AX + SUBQ SP, AX + + // Restore g->sched (== m->curg->sched) from saved values. + get_tls(CX) + MOVQ g(CX), SI + MOVQ SP, DI + ADDQ AX, DI + MOVQ -8(DI), BX + MOVQ BX, (g_sched+gobuf_pc)(SI) + MOVQ DI, (g_sched+gobuf_sp)(SI) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVQ g(CX), BX + MOVQ g_m(BX), BX + MOVQ m_g0(BX), SI + MOVQ SI, g(CX) + MOVQ (g_sched+gobuf_sp)(SI), SP + MOVQ 0(SP), AX + MOVQ AX, (g_sched+gobuf_sp)(SI) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVQ savedm-8(SP), BX + CMPQ BX, $0 + JNE done + MOVQ $runtime·dropm(SB), AX + CALL AX +#ifdef GOOS_windows + // We need to clear the TLS pointer in case the next + // thread that comes into Go tries to reuse that space + // but uses the same M. + XORQ DI, DI + CALL runtime·settls(SB) +#endif +done: + + // Done! + RET + +// func setg(gg *g) +// set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVQ gg+0(FP), BX + get_tls(CX) + MOVQ BX, g(CX) + RET + +// void setg_gcc(G*); set g called from gcc. 
+TEXT setg_gcc<>(SB),NOSPLIT,$0 + get_tls(AX) + MOVQ DI, g(AX) + MOVQ DI, R14 // set the g register + RET + +TEXT runtime·abort(SB),NOSPLIT,$0-0 + INT $3 +loop: + JMP loop + +// check that SP is in range [g->stack.lo, g->stack.hi) +TEXT runtime·stackcheck(SB), NOSPLIT, $0-0 + get_tls(CX) + MOVQ g(CX), AX + CMPQ (g_stack+stack_hi)(AX), SP + JHI 2(PC) + CALL runtime·abort(SB) + CMPQ SP, (g_stack+stack_lo)(AX) + JHI 2(PC) + CALL runtime·abort(SB) + RET + +// func cputicks() int64 +TEXT runtime·cputicks(SB),NOSPLIT,$0-0 + CMPB internal∕cpu·X86+const_offsetX86HasRDTSCP(SB), $1 + JNE fences + // Instruction stream serializing RDTSCP is supported. + // RDTSCP is supported by Intel Nehalem (2008) and + // AMD K8 Rev. F (2006) and newer. + RDTSCP +done: + SHLQ $32, DX + ADDQ DX, AX + MOVQ AX, ret+0(FP) + RET +fences: + // MFENCE is instruction stream serializing and flushes the + // store buffers on AMD. The serialization semantics of LFENCE on AMD + // are dependent on MSR C001_1029 and CPU generation. + // LFENCE on Intel does wait for all previous instructions to have executed. + // Intel recommends MFENCE;LFENCE in its manuals before RDTSC to have all + // previous instructions executed and all previous loads and stores globally visible. + // Using MFENCE;LFENCE here aligns the serializing properties without + // runtime detection of CPU manufacturer.
+ MFENCE + LFENCE + RDTSC + JMP done + +// func memhash(p unsafe.Pointer, h, s uintptr) uintptr +// hash function using AES hardware instructions +TEXT runtime·memhash<ABIInternal>(SB),NOSPLIT,$0-32 + // AX = ptr to data + // BX = seed + // CX = size + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + JMP aeshashbody<>(SB) +noaes: + JMP runtime·memhashFallback<ABIInternal>(SB) + +// func strhash(p unsafe.Pointer, h uintptr) uintptr +TEXT runtime·strhash<ABIInternal>(SB),NOSPLIT,$0-24 + // AX = ptr to string struct + // BX = seed + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVQ 8(AX), CX // length of string + MOVQ (AX), AX // string data + JMP aeshashbody<>(SB) +noaes: + JMP runtime·strhashFallback<ABIInternal>(SB) + +// AX: data +// BX: hash seed +// CX: length +// At return: AX = return value +TEXT aeshashbody<>(SB),NOSPLIT,$0-0 + // Fill an SSE register with our seeds. + MOVQ BX, X0 // 64 bits of per-table hash seed + PINSRW $4, CX, X0 // 16 bits of length + PSHUFHW $0, X0, X0 // repeat length 4 times total + MOVO X0, X1 // save unscrambled seed + PXOR runtime·aeskeysched(SB), X0 // xor in per-process seed + AESENC X0, X0 // scramble seed + + CMPQ CX, $16 + JB aes0to15 + JE aes16 + CMPQ CX, $32 + JBE aes17to32 + CMPQ CX, $64 + JBE aes33to64 + CMPQ CX, $128 + JBE aes65to128 + JMP aes129plus + +aes0to15: + TESTQ CX, CX + JE aes0 + + ADDQ $16, AX + TESTW $0xff0, AX + JE endofpage + + // 16 bytes loaded at this address won't cross + // a page boundary, so we can load it directly. + MOVOU -16(AX), X1 + ADDQ CX, CX + MOVQ $masks<>(SB), AX + PAND (AX)(CX*8), X1 +final1: + PXOR X0, X1 // xor data with seed + AESENC X1, X1 // scramble combo 3 times + AESENC X1, X1 + AESENC X1, X1 + MOVQ X1, AX // return X1 + RET + +endofpage: + // address ends in 1111xxxx. Might be up against + // a page boundary, so load ending at last byte. + // Then shift bytes down using pshufb. 
+ MOVOU -32(AX)(CX*1), X1 + ADDQ CX, CX + MOVQ $shifts<>(SB), AX + PSHUFB (AX)(CX*8), X1 + JMP final1 + +aes0: + // Return scrambled input seed + AESENC X0, X0 + MOVQ X0, AX // return X0 + RET + +aes16: + MOVOU (AX), X1 + JMP final1 + +aes17to32: + // make second starting seed + PXOR runtime·aeskeysched+16(SB), X1 + AESENC X1, X1 + + // load data to be hashed + MOVOU (AX), X2 + MOVOU -16(AX)(CX*1), X3 + + // xor with seed + PXOR X0, X2 + PXOR X1, X3 + + // scramble 3 times + AESENC X2, X2 + AESENC X3, X3 + AESENC X2, X2 + AESENC X3, X3 + AESENC X2, X2 + AESENC X3, X3 + + // combine results + PXOR X3, X2 + MOVQ X2, AX // return X2 + RET + +aes33to64: + // make 3 more starting seeds + MOVO X1, X2 + MOVO X1, X3 + PXOR runtime·aeskeysched+16(SB), X1 + PXOR runtime·aeskeysched+32(SB), X2 + PXOR runtime·aeskeysched+48(SB), X3 + AESENC X1, X1 + AESENC X2, X2 + AESENC X3, X3 + + MOVOU (AX), X4 + MOVOU 16(AX), X5 + MOVOU -32(AX)(CX*1), X6 + MOVOU -16(AX)(CX*1), X7 + + PXOR X0, X4 + PXOR X1, X5 + PXOR X2, X6 + PXOR X3, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + PXOR X6, X4 + PXOR X7, X5 + PXOR X5, X4 + MOVQ X4, AX // return X4 + RET + +aes65to128: + // make 7 more starting seeds + MOVO X1, X2 + MOVO X1, X3 + MOVO X1, X4 + MOVO X1, X5 + MOVO X1, X6 + MOVO X1, X7 + PXOR runtime·aeskeysched+16(SB), X1 + PXOR runtime·aeskeysched+32(SB), X2 + PXOR runtime·aeskeysched+48(SB), X3 + PXOR runtime·aeskeysched+64(SB), X4 + PXOR runtime·aeskeysched+80(SB), X5 + PXOR runtime·aeskeysched+96(SB), X6 + PXOR runtime·aeskeysched+112(SB), X7 + AESENC X1, X1 + AESENC X2, X2 + AESENC X3, X3 + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + // load data + MOVOU (AX), X8 + MOVOU 16(AX), X9 + MOVOU 32(AX), X10 + MOVOU 48(AX), X11 + MOVOU -64(AX)(CX*1), X12 + MOVOU -48(AX)(CX*1), X13 + MOVOU -32(AX)(CX*1), X14 + MOVOU 
-16(AX)(CX*1), X15 + + // xor with seed + PXOR X0, X8 + PXOR X1, X9 + PXOR X2, X10 + PXOR X3, X11 + PXOR X4, X12 + PXOR X5, X13 + PXOR X6, X14 + PXOR X7, X15 + + // scramble 3 times + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + + // combine results + PXOR X12, X8 + PXOR X13, X9 + PXOR X14, X10 + PXOR X15, X11 + PXOR X10, X8 + PXOR X11, X9 + PXOR X9, X8 + // X15 must be zero on return + PXOR X15, X15 + MOVQ X8, AX // return X8 + RET + +aes129plus: + // make 7 more starting seeds + MOVO X1, X2 + MOVO X1, X3 + MOVO X1, X4 + MOVO X1, X5 + MOVO X1, X6 + MOVO X1, X7 + PXOR runtime·aeskeysched+16(SB), X1 + PXOR runtime·aeskeysched+32(SB), X2 + PXOR runtime·aeskeysched+48(SB), X3 + PXOR runtime·aeskeysched+64(SB), X4 + PXOR runtime·aeskeysched+80(SB), X5 + PXOR runtime·aeskeysched+96(SB), X6 + PXOR runtime·aeskeysched+112(SB), X7 + AESENC X1, X1 + AESENC X2, X2 + AESENC X3, X3 + AESENC X4, X4 + AESENC X5, X5 + AESENC X6, X6 + AESENC X7, X7 + + // start with last (possibly overlapping) block + MOVOU -128(AX)(CX*1), X8 + MOVOU -112(AX)(CX*1), X9 + MOVOU -96(AX)(CX*1), X10 + MOVOU -80(AX)(CX*1), X11 + MOVOU -64(AX)(CX*1), X12 + MOVOU -48(AX)(CX*1), X13 + MOVOU -32(AX)(CX*1), X14 + MOVOU -16(AX)(CX*1), X15 + + // xor in seed + PXOR X0, X8 + PXOR X1, X9 + PXOR X2, X10 + PXOR X3, X11 + PXOR X4, X12 + PXOR X5, X13 + PXOR X6, X14 + PXOR X7, X15 + + // compute number of remaining 128-byte blocks + DECQ CX + SHRQ $7, CX + +aesloop: + // scramble state + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + + // scramble 
state, xor in a block + MOVOU (AX), X0 + MOVOU 16(AX), X1 + MOVOU 32(AX), X2 + MOVOU 48(AX), X3 + AESENC X0, X8 + AESENC X1, X9 + AESENC X2, X10 + AESENC X3, X11 + MOVOU 64(AX), X4 + MOVOU 80(AX), X5 + MOVOU 96(AX), X6 + MOVOU 112(AX), X7 + AESENC X4, X12 + AESENC X5, X13 + AESENC X6, X14 + AESENC X7, X15 + + ADDQ $128, AX + DECQ CX + JNE aesloop + + // 3 more scrambles to finish + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + AESENC X8, X8 + AESENC X9, X9 + AESENC X10, X10 + AESENC X11, X11 + AESENC X12, X12 + AESENC X13, X13 + AESENC X14, X14 + AESENC X15, X15 + + PXOR X12, X8 + PXOR X13, X9 + PXOR X14, X10 + PXOR X15, X11 + PXOR X10, X8 + PXOR X11, X9 + PXOR X9, X8 + // X15 must be zero on return + PXOR X15, X15 + MOVQ X8, AX // return X8 + RET + +// func memhash32(p unsafe.Pointer, h uintptr) uintptr +// ABIInternal for performance. +TEXT runtime·memhash32<ABIInternal>(SB),NOSPLIT,$0-24 + // AX = ptr to data + // BX = seed + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVQ BX, X0 // X0 = seed + PINSRD $2, (AX), X0 // data + AESENC runtime·aeskeysched+0(SB), X0 + AESENC runtime·aeskeysched+16(SB), X0 + AESENC runtime·aeskeysched+32(SB), X0 + MOVQ X0, AX // return X0 + RET +noaes: + JMP runtime·memhash32Fallback<ABIInternal>(SB) + +// func memhash64(p unsafe.Pointer, h uintptr) uintptr +// ABIInternal for performance. 
+TEXT runtime·memhash64<ABIInternal>(SB),NOSPLIT,$0-24 + // AX = ptr to data + // BX = seed + CMPB runtime·useAeshash(SB), $0 + JEQ noaes + MOVQ BX, X0 // X0 = seed + PINSRQ $1, (AX), X0 // data + AESENC runtime·aeskeysched+0(SB), X0 + AESENC runtime·aeskeysched+16(SB), X0 + AESENC runtime·aeskeysched+32(SB), X0 + MOVQ X0, AX // return X0 + RET +noaes: + JMP runtime·memhash64Fallback<ABIInternal>(SB) + +// simple mask to get rid of data in the high part of the register. +DATA masks<>+0x00(SB)/8, $0x0000000000000000 +DATA masks<>+0x08(SB)/8, $0x0000000000000000 +DATA masks<>+0x10(SB)/8, $0x00000000000000ff +DATA masks<>+0x18(SB)/8, $0x0000000000000000 +DATA masks<>+0x20(SB)/8, $0x000000000000ffff +DATA masks<>+0x28(SB)/8, $0x0000000000000000 +DATA masks<>+0x30(SB)/8, $0x0000000000ffffff +DATA masks<>+0x38(SB)/8, $0x0000000000000000 +DATA masks<>+0x40(SB)/8, $0x00000000ffffffff +DATA masks<>+0x48(SB)/8, $0x0000000000000000 +DATA masks<>+0x50(SB)/8, $0x000000ffffffffff +DATA masks<>+0x58(SB)/8, $0x0000000000000000 +DATA masks<>+0x60(SB)/8, $0x0000ffffffffffff +DATA masks<>+0x68(SB)/8, $0x0000000000000000 +DATA masks<>+0x70(SB)/8, $0x00ffffffffffffff +DATA masks<>+0x78(SB)/8, $0x0000000000000000 +DATA masks<>+0x80(SB)/8, $0xffffffffffffffff +DATA masks<>+0x88(SB)/8, $0x0000000000000000 +DATA masks<>+0x90(SB)/8, $0xffffffffffffffff +DATA masks<>+0x98(SB)/8, $0x00000000000000ff +DATA masks<>+0xa0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xa8(SB)/8, $0x000000000000ffff +DATA masks<>+0xb0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xb8(SB)/8, $0x0000000000ffffff +DATA masks<>+0xc0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xc8(SB)/8, $0x00000000ffffffff +DATA masks<>+0xd0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xd8(SB)/8, $0x000000ffffffffff +DATA masks<>+0xe0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xe8(SB)/8, $0x0000ffffffffffff +DATA masks<>+0xf0(SB)/8, $0xffffffffffffffff +DATA masks<>+0xf8(SB)/8, $0x00ffffffffffffff +GLOBL masks<>(SB),RODATA,$256 + +// func checkASM() 
bool +TEXT ·checkASM(SB),NOSPLIT,$0-1 + // check that masks<>(SB) and shifts<>(SB) are aligned to 16-byte + MOVQ $masks<>(SB), AX + MOVQ $shifts<>(SB), BX + ORQ BX, AX + TESTQ $15, AX + SETEQ ret+0(FP) + RET + +// these are arguments to pshufb. They move data down from +// the high bytes of the register to the low bytes of the register. +// index is how many bytes to move. +DATA shifts<>+0x00(SB)/8, $0x0000000000000000 +DATA shifts<>+0x08(SB)/8, $0x0000000000000000 +DATA shifts<>+0x10(SB)/8, $0xffffffffffffff0f +DATA shifts<>+0x18(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x20(SB)/8, $0xffffffffffff0f0e +DATA shifts<>+0x28(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x30(SB)/8, $0xffffffffff0f0e0d +DATA shifts<>+0x38(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x40(SB)/8, $0xffffffff0f0e0d0c +DATA shifts<>+0x48(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x50(SB)/8, $0xffffff0f0e0d0c0b +DATA shifts<>+0x58(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x60(SB)/8, $0xffff0f0e0d0c0b0a +DATA shifts<>+0x68(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x70(SB)/8, $0xff0f0e0d0c0b0a09 +DATA shifts<>+0x78(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x80(SB)/8, $0x0f0e0d0c0b0a0908 +DATA shifts<>+0x88(SB)/8, $0xffffffffffffffff +DATA shifts<>+0x90(SB)/8, $0x0e0d0c0b0a090807 +DATA shifts<>+0x98(SB)/8, $0xffffffffffffff0f +DATA shifts<>+0xa0(SB)/8, $0x0d0c0b0a09080706 +DATA shifts<>+0xa8(SB)/8, $0xffffffffffff0f0e +DATA shifts<>+0xb0(SB)/8, $0x0c0b0a0908070605 +DATA shifts<>+0xb8(SB)/8, $0xffffffffff0f0e0d +DATA shifts<>+0xc0(SB)/8, $0x0b0a090807060504 +DATA shifts<>+0xc8(SB)/8, $0xffffffff0f0e0d0c +DATA shifts<>+0xd0(SB)/8, $0x0a09080706050403 +DATA shifts<>+0xd8(SB)/8, $0xffffff0f0e0d0c0b +DATA shifts<>+0xe0(SB)/8, $0x0908070605040302 +DATA shifts<>+0xe8(SB)/8, $0xffff0f0e0d0c0b0a +DATA shifts<>+0xf0(SB)/8, $0x0807060504030201 +DATA shifts<>+0xf8(SB)/8, $0xff0f0e0d0c0b0a09 +GLOBL shifts<>(SB),RODATA,$256 + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVL $0, AX + RET + + +// Called from cgo 
wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$0 + get_tls(CX) + MOVQ g(CX), AX + MOVQ g_m(AX), AX + MOVQ m_curg(AX), AX + MOVQ (g_stack+stack_hi)(AX), AX + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|TOPFRAME,$0-0 + BYTE $0x90 // NOP + CALL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + BYTE $0x90 // NOP + +// This is called from .init_array and follows the platform, not Go, ABI. +TEXT runtime·addmoduledata(SB),NOSPLIT,$0-0 + PUSHQ R15 // The access to global variables below implicitly uses R15, which is callee-save + MOVQ runtime·lastmoduledatap(SB), AX + MOVQ DI, moduledata_next(AX) + MOVQ DI, runtime·lastmoduledatap(SB) + POPQ R15 + RET + +// Initialize special registers then jump to sigpanic. +// This function is injected from the signal handler for panicking +// signals. It is quite painful to set X15 in the signal context, +// so we do it here. +TEXT ·sigpanic0(SB),NOSPLIT,$0-0 + get_tls(R14) + MOVQ g(R14), R14 +#ifndef GOOS_plan9 + XORPS X15, X15 +#endif + JMP ·sigpanic<ABIInternal>(SB) + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - DI is the destination of the write +// - AX is the value being written at DI +// It clobbers FLAGS. It does not clobber any general-purpose registers, +// but may clobber others (e.g., SSE registers). +// Defined as ABIInternal since it does not use the stack-based Go ABI. +TEXT runtime·gcWriteBarrier<ABIInternal>(SB),NOSPLIT,$112 + // Save the registers clobbered by the fast path. This is slightly + // faster than having the caller spill these. + MOVQ R12, 96(SP) + MOVQ R13, 104(SP) + // TODO: Consider passing g.m.p in as an argument so they can be shared + // across a sequence of write barriers. 
+ MOVQ g_m(R14), R13 + MOVQ m_p(R13), R13 + MOVQ (p_wbBuf+wbBuf_next)(R13), R12 + // Increment wbBuf.next position. + LEAQ 16(R12), R12 + MOVQ R12, (p_wbBuf+wbBuf_next)(R13) + CMPQ R12, (p_wbBuf+wbBuf_end)(R13) + // Record the write. + MOVQ AX, -16(R12) // Record value + // Note: This turns bad pointer writes into bad + // pointer reads, which could be confusing. We could avoid + // reading from obviously bad pointers, which would + // take care of the vast majority of these. We could + // patch this up in the signal handler, or use XCHG to + // combine the read and the write. + MOVQ (DI), R13 + MOVQ R13, -8(R12) // Record *slot + // Is the buffer full? (flags set in CMPQ above) + JEQ flush +ret: + MOVQ 96(SP), R12 + MOVQ 104(SP), R13 + // Do the write. + MOVQ AX, (DI) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + // It is possible for wbBufFlush to clobber other registers + // (e.g., SSE registers), but the compiler takes care of saving + // those in the caller if necessary. This strikes a balance + // with registers that are likely to be used. + // + // We don't have type information for these, but all code under + // here is NOSPLIT, so nothing will observe these. + // + // TODO: We could strike a different balance; e.g., saving X0 + // and not saving GP registers that are less likely to be used. 
+ MOVQ DI, 0(SP) // Also first argument to wbBufFlush + MOVQ AX, 8(SP) // Also second argument to wbBufFlush + MOVQ BX, 16(SP) + MOVQ CX, 24(SP) + MOVQ DX, 32(SP) + // DI already saved + MOVQ SI, 40(SP) + MOVQ BP, 48(SP) + MOVQ R8, 56(SP) + MOVQ R9, 64(SP) + MOVQ R10, 72(SP) + MOVQ R11, 80(SP) + // R12 already saved + // R13 already saved + // R14 is g + MOVQ R15, 88(SP) + + // This takes arguments DI and AX + CALL runtime·wbBufFlush(SB) + + MOVQ 0(SP), DI + MOVQ 8(SP), AX + MOVQ 16(SP), BX + MOVQ 24(SP), CX + MOVQ 32(SP), DX + MOVQ 40(SP), SI + MOVQ 48(SP), BP + MOVQ 56(SP), R8 + MOVQ 64(SP), R9 + MOVQ 72(SP), R10 + MOVQ 80(SP), R11 + MOVQ 88(SP), R15 + JMP ret + +// gcWriteBarrierCX is gcWriteBarrier, but with args in DI and CX. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierCX<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ CX, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ CX, AX + RET + +// gcWriteBarrierDX is gcWriteBarrier, but with args in DI and DX. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierDX<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ DX, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ DX, AX + RET + +// gcWriteBarrierBX is gcWriteBarrier, but with args in DI and BX. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierBX<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ BX, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ BX, AX + RET + +// gcWriteBarrierBP is gcWriteBarrier, but with args in DI and BP. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierBP<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ BP, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ BP, AX + RET + +// gcWriteBarrierSI is gcWriteBarrier, but with args in DI and SI. +// Defined as ABIInternal since it does not use the stable Go ABI. 
+TEXT runtime·gcWriteBarrierSI<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ SI, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ SI, AX + RET + +// gcWriteBarrierR8 is gcWriteBarrier, but with args in DI and R8. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierR8<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ R8, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ R8, AX + RET + +// gcWriteBarrierR9 is gcWriteBarrier, but with args in DI and R9. +// Defined as ABIInternal since it does not use the stable Go ABI. +TEXT runtime·gcWriteBarrierR9<ABIInternal>(SB),NOSPLIT,$0 + XCHGQ R9, AX + CALL runtime·gcWriteBarrier<ABIInternal>(SB) + XCHGQ R9, AX + RET + +DATA debugCallFrameTooLarge<>+0x00(SB)/20, $"call frame too large" +GLOBL debugCallFrameTooLarge<>(SB), RODATA, $20 // Size duplicated below + +// debugCallV2 is the entry point for debugger-injected function +// calls on running goroutines. It informs the runtime that a +// debug call has been injected and creates a call frame for the +// debugger to fill in. +// +// To inject a function call, a debugger should: +// 1. Check that the goroutine is in state _Grunning and that +// there are at least 256 bytes free on the stack. +// 2. Push the current PC on the stack (updating SP). +// 3. Write the desired argument frame size at SP-16 (using the SP +// after step 2). +// 4. Save all machine registers (including flags and XMM registers) +// so they can be restored later by the debugger. +// 5. Set the PC to debugCallV2 and resume execution. +// +// If the goroutine is in state _Grunnable, then it's not generally +// safe to inject a call because it may return out via other runtime +// operations. Instead, the debugger should unwind the stack to find +// the return to non-runtime code, add a temporary breakpoint there, +// and inject the call once that breakpoint is hit. +// +// If the goroutine is in any other state, it's not safe to inject a call. 
+// +// This function communicates back to the debugger by setting R12 and +// invoking INT3 to raise a breakpoint signal. See the comments in the +// implementation for the protocol the debugger is expected to +// follow. InjectDebugCall in the runtime tests demonstrates this protocol. +// +// The debugger must ensure that any pointers passed to the function +// obey escape analysis requirements. Specifically, it must not pass +// a stack pointer to an escaping argument. debugCallV2 cannot check +// this invariant. +// +// This is ABIInternal because Go code injects its PC directly into new +// goroutine stacks. +TEXT runtime·debugCallV2<ABIInternal>(SB),NOSPLIT,$152-0 + // Save all registers that may contain pointers so they can be + // conservatively scanned. + // + // We can't do anything that might clobber any of these + // registers before this. + MOVQ R15, r15-(14*8+8)(SP) + MOVQ R14, r14-(13*8+8)(SP) + MOVQ R13, r13-(12*8+8)(SP) + MOVQ R12, r12-(11*8+8)(SP) + MOVQ R11, r11-(10*8+8)(SP) + MOVQ R10, r10-(9*8+8)(SP) + MOVQ R9, r9-(8*8+8)(SP) + MOVQ R8, r8-(7*8+8)(SP) + MOVQ DI, di-(6*8+8)(SP) + MOVQ SI, si-(5*8+8)(SP) + MOVQ BP, bp-(4*8+8)(SP) + MOVQ BX, bx-(3*8+8)(SP) + MOVQ DX, dx-(2*8+8)(SP) + // Save the frame size before we clobber it. Either of the last + // saves could clobber this depending on whether there's a saved BP. + MOVQ frameSize-24(FP), DX // aka -16(RSP) before prologue + MOVQ CX, cx-(1*8+8)(SP) + MOVQ AX, ax-(0*8+8)(SP) + + // Save the argument frame size. + MOVQ DX, frameSize-128(SP) + + // Perform a safe-point check. + MOVQ retpc-8(FP), AX // Caller's PC + MOVQ AX, 0(SP) + CALL runtime·debugCallCheck(SB) + MOVQ 8(SP), AX + TESTQ AX, AX + JZ good + // The safety check failed. Put the reason string at the top + // of the stack. + MOVQ AX, 0(SP) + MOVQ 16(SP), AX + MOVQ AX, 8(SP) + // Set R12 to 8 and invoke INT3. The debugger should get the + // reason a call can't be injected from the top of the stack + // and resume execution. 
+ MOVQ $8, R12 + BYTE $0xcc + JMP restore + +good: + // Registers are saved and it's safe to make a call. + // Open up a call frame, moving the stack if necessary. + // + // Once the frame is allocated, this will set R12 to 0 and + // invoke INT3. The debugger should write the argument + // frame for the call at SP, set up argument registers, push + // the trapping PC on the stack, set the PC to the function to + // call, set RDX to point to the closure (if a closure call), + // and resume execution. + // + // If the function returns, this will set R12 to 1 and invoke + // INT3. The debugger can then inspect any return value saved + // on the stack at SP and in registers and resume execution again. + // + // If the function panics, this will set R12 to 2 and invoke INT3. + // The interface{} value of the panic will be at SP. The debugger + // can inspect the panic value and resume execution again. +#define DEBUG_CALL_DISPATCH(NAME,MAXSIZE) \ + CMPQ AX, $MAXSIZE; \ + JA 5(PC); \ + MOVQ $NAME(SB), AX; \ + MOVQ AX, 0(SP); \ + CALL runtime·debugCallWrap(SB); \ + JMP restore + + MOVQ frameSize-128(SP), AX + DEBUG_CALL_DISPATCH(debugCall32<>, 32) + DEBUG_CALL_DISPATCH(debugCall64<>, 64) + DEBUG_CALL_DISPATCH(debugCall128<>, 128) + DEBUG_CALL_DISPATCH(debugCall256<>, 256) + DEBUG_CALL_DISPATCH(debugCall512<>, 512) + DEBUG_CALL_DISPATCH(debugCall1024<>, 1024) + DEBUG_CALL_DISPATCH(debugCall2048<>, 2048) + DEBUG_CALL_DISPATCH(debugCall4096<>, 4096) + DEBUG_CALL_DISPATCH(debugCall8192<>, 8192) + DEBUG_CALL_DISPATCH(debugCall16384<>, 16384) + DEBUG_CALL_DISPATCH(debugCall32768<>, 32768) + DEBUG_CALL_DISPATCH(debugCall65536<>, 65536) + // The frame size is too large. Report the error. + MOVQ $debugCallFrameTooLarge<>(SB), AX + MOVQ AX, 0(SP) + MOVQ $20, 8(SP) // length of debugCallFrameTooLarge string + MOVQ $8, R12 + BYTE $0xcc + JMP restore + +restore: + // Calls and failures resume here. + // + // Set R12 to 16 and invoke INT3. 
The debugger should restore + // all registers except RIP and RSP and resume execution. + MOVQ $16, R12 + BYTE $0xcc + // We must not modify flags after this point. + + // Restore pointer-containing registers, which may have been + // modified from the debugger's copy by stack copying. + MOVQ ax-(0*8+8)(SP), AX + MOVQ cx-(1*8+8)(SP), CX + MOVQ dx-(2*8+8)(SP), DX + MOVQ bx-(3*8+8)(SP), BX + MOVQ bp-(4*8+8)(SP), BP + MOVQ si-(5*8+8)(SP), SI + MOVQ di-(6*8+8)(SP), DI + MOVQ r8-(7*8+8)(SP), R8 + MOVQ r9-(8*8+8)(SP), R9 + MOVQ r10-(9*8+8)(SP), R10 + MOVQ r11-(10*8+8)(SP), R11 + MOVQ r12-(11*8+8)(SP), R12 + MOVQ r13-(12*8+8)(SP), R13 + MOVQ r14-(13*8+8)(SP), R14 + MOVQ r15-(14*8+8)(SP), R15 + + RET + +// runtime.debugCallCheck assumes that functions defined with the +// DEBUG_CALL_FN macro are safe points to inject calls. +#define DEBUG_CALL_FN(NAME,MAXSIZE) \ +TEXT NAME(SB),WRAPPER,$MAXSIZE-0; \ + NO_LOCAL_POINTERS; \ + MOVQ $0, R12; \ + BYTE $0xcc; \ + MOVQ $1, R12; \ + BYTE $0xcc; \ + RET +DEBUG_CALL_FN(debugCall32<>, 32) +DEBUG_CALL_FN(debugCall64<>, 64) +DEBUG_CALL_FN(debugCall128<>, 128) +DEBUG_CALL_FN(debugCall256<>, 256) +DEBUG_CALL_FN(debugCall512<>, 512) +DEBUG_CALL_FN(debugCall1024<>, 1024) +DEBUG_CALL_FN(debugCall2048<>, 2048) +DEBUG_CALL_FN(debugCall4096<>, 4096) +DEBUG_CALL_FN(debugCall8192<>, 8192) +DEBUG_CALL_FN(debugCall16384<>, 16384) +DEBUG_CALL_FN(debugCall32768<>, 32768) +DEBUG_CALL_FN(debugCall65536<>, 65536) + +// func debugCallPanicked(val interface{}) +TEXT runtime·debugCallPanicked(SB),NOSPLIT,$16-16 + // Copy the panic value to the top of stack. + MOVQ val_type+0(FP), AX + MOVQ AX, 0(SP) + MOVQ val_data+8(FP), AX + MOVQ AX, 8(SP) + MOVQ $2, R12 + BYTE $0xcc + RET + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments are allocated +// in the caller's stack frame. 
These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. +// Defined as ABIInternal since they do not use the stack-based Go ABI. +TEXT runtime·panicIndex<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicIndex<ABIInternal>(SB) +TEXT runtime·panicIndexU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicIndexU<ABIInternal>(SB) +TEXT runtime·panicSliceAlen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP runtime·goPanicSliceAlen<ABIInternal>(SB) +TEXT runtime·panicSliceAlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP runtime·goPanicSliceAlenU<ABIInternal>(SB) +TEXT runtime·panicSliceAcap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP runtime·goPanicSliceAcap<ABIInternal>(SB) +TEXT runtime·panicSliceAcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP runtime·goPanicSliceAcapU<ABIInternal>(SB) +TEXT runtime·panicSliceB<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicSliceB<ABIInternal>(SB) +TEXT runtime·panicSliceBU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicSliceBU<ABIInternal>(SB) +TEXT runtime·panicSlice3Alen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ DX, AX + JMP runtime·goPanicSlice3Alen<ABIInternal>(SB) +TEXT runtime·panicSlice3AlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ DX, AX + JMP runtime·goPanicSlice3AlenU<ABIInternal>(SB) +TEXT runtime·panicSlice3Acap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ DX, AX + JMP runtime·goPanicSlice3Acap<ABIInternal>(SB) +TEXT runtime·panicSlice3AcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ DX, AX + JMP runtime·goPanicSlice3AcapU<ABIInternal>(SB) +TEXT runtime·panicSlice3B<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP runtime·goPanicSlice3B<ABIInternal>(SB) +TEXT runtime·panicSlice3BU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, AX + MOVQ DX, BX + JMP 
runtime·goPanicSlice3BU<ABIInternal>(SB) +TEXT runtime·panicSlice3C<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicSlice3C<ABIInternal>(SB) +TEXT runtime·panicSlice3CU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ CX, BX + JMP runtime·goPanicSlice3CU<ABIInternal>(SB) +TEXT runtime·panicSliceConvert<ABIInternal>(SB),NOSPLIT,$0-16 + MOVQ DX, AX + JMP runtime·goPanicSliceConvert<ABIInternal>(SB) + +#ifdef GOOS_android +// Use the free TLS_SLOT_APP slot #2 on Android Q. +// Earlier androids are set up in gcc_android.c. +DATA runtime·tls_g+0(SB)/8, $16 +GLOBL runtime·tls_g+0(SB), NOPTR, $8 +#endif +#ifdef GOOS_windows +GLOBL runtime·tls_g+0(SB), NOPTR, $8 +#endif + +// The compiler and assembler's -spectre=ret mode rewrites +// all indirect CALL AX / JMP AX instructions to be +// CALL retpolineAX / JMP retpolineAX. +// See https://support.google.com/faqs/answer/7625886. +#define RETPOLINE(reg) \ + /* CALL setup */ BYTE $0xE8; BYTE $(2+2); BYTE $0; BYTE $0; BYTE $0; \ + /* nospec: */ \ + /* PAUSE */ BYTE $0xF3; BYTE $0x90; \ + /* JMP nospec */ BYTE $0xEB; BYTE $-(2+2); \ + /* setup: */ \ + /* MOVQ AX, 0(SP) */ BYTE $0x48|((reg&8)>>1); BYTE $0x89; \ + BYTE $0x04|((reg&7)<<3); BYTE $0x24; \ + /* RET */ BYTE $0xC3 + +TEXT runtime·retpolineAX(SB),NOSPLIT,$0; RETPOLINE(0) +TEXT runtime·retpolineCX(SB),NOSPLIT,$0; RETPOLINE(1) +TEXT runtime·retpolineDX(SB),NOSPLIT,$0; RETPOLINE(2) +TEXT runtime·retpolineBX(SB),NOSPLIT,$0; RETPOLINE(3) +/* SP is 4, can't happen / magic encodings */ +TEXT runtime·retpolineBP(SB),NOSPLIT,$0; RETPOLINE(5) +TEXT runtime·retpolineSI(SB),NOSPLIT,$0; RETPOLINE(6) +TEXT runtime·retpolineDI(SB),NOSPLIT,$0; RETPOLINE(7) +TEXT runtime·retpolineR8(SB),NOSPLIT,$0; RETPOLINE(8) +TEXT runtime·retpolineR9(SB),NOSPLIT,$0; RETPOLINE(9) +TEXT runtime·retpolineR10(SB),NOSPLIT,$0; RETPOLINE(10) +TEXT runtime·retpolineR11(SB),NOSPLIT,$0; RETPOLINE(11) +TEXT runtime·retpolineR12(SB),NOSPLIT,$0; RETPOLINE(12) +TEXT runtime·retpolineR13(SB),NOSPLIT,$0; 
RETPOLINE(13) +TEXT runtime·retpolineR14(SB),NOSPLIT,$0; RETPOLINE(14) +TEXT runtime·retpolineR15(SB),NOSPLIT,$0; RETPOLINE(15) diff --git a/src/runtime/asm_arm.s b/src/runtime/asm_arm.s new file mode 100644 index 0000000..591ef2a --- /dev/null +++ b/src/runtime/asm_arm.s @@ -0,0 +1,1083 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// _rt0_arm is common startup code for most ARM systems when using +// internal linking. This is the entry point for the program from the +// kernel for an ordinary -buildmode=exe program. The stack holds the +// number of arguments and the C-style argv. +TEXT _rt0_arm(SB),NOSPLIT|NOFRAME,$0 + MOVW (R13), R0 // argc + MOVW $4(R13), R1 // argv + B runtime·rt0_go(SB) + +// main is common startup code for most ARM systems when using +// external linking. The C startup code will call the symbol "main" +// passing argc and argv in the usual C ABI registers R0 and R1. +TEXT main(SB),NOSPLIT|NOFRAME,$0 + B runtime·rt0_go(SB) + +// _rt0_arm_lib is common startup code for most ARM systems when +// using -buildmode=c-archive or -buildmode=c-shared. The linker will +// arrange to invoke this function as a global constructor (for +// c-archive) or when the shared library is loaded (for c-shared). +// We expect argc and argv to be passed in the usual C ABI registers +// R0 and R1. +TEXT _rt0_arm_lib(SB),NOSPLIT,$104 + // Preserve callee-save registers. Raspberry Pi's dlopen(), for example, + // actually cares that R11 is preserved. + MOVW R4, 12(R13) + MOVW R5, 16(R13) + MOVW R6, 20(R13) + MOVW R7, 24(R13) + MOVW R8, 28(R13) + MOVW g, 32(R13) + MOVW R11, 36(R13) + + // Skip floating point registers on GOARM < 6. 
+ MOVB runtime·goarm(SB), R11 + CMP $6, R11 + BLT skipfpsave + MOVD F8, (40+8*0)(R13) + MOVD F9, (40+8*1)(R13) + MOVD F10, (40+8*2)(R13) + MOVD F11, (40+8*3)(R13) + MOVD F12, (40+8*4)(R13) + MOVD F13, (40+8*5)(R13) + MOVD F14, (40+8*6)(R13) + MOVD F15, (40+8*7)(R13) +skipfpsave: + // Save argc/argv. + MOVW R0, _rt0_arm_lib_argc<>(SB) + MOVW R1, _rt0_arm_lib_argv<>(SB) + + MOVW $0, g // Initialize g. + + // Synchronous initialization. + CALL runtime·libpreinit(SB) + + // Create a new thread to do the runtime initialization. + MOVW _cgo_sys_thread_create(SB), R2 + CMP $0, R2 + BEQ nocgo + MOVW $_rt0_arm_lib_go<>(SB), R0 + MOVW $0, R1 + BL (R2) + B rr +nocgo: + MOVW $0x800000, R0 // stacksize = 8192KB + MOVW $_rt0_arm_lib_go<>(SB), R1 // fn + MOVW R0, 4(R13) + MOVW R1, 8(R13) + BL runtime·newosproc0(SB) +rr: + // Restore callee-save registers and return. + MOVB runtime·goarm(SB), R11 + CMP $6, R11 + BLT skipfprest + MOVD (40+8*0)(R13), F8 + MOVD (40+8*1)(R13), F9 + MOVD (40+8*2)(R13), F10 + MOVD (40+8*3)(R13), F11 + MOVD (40+8*4)(R13), F12 + MOVD (40+8*5)(R13), F13 + MOVD (40+8*6)(R13), F14 + MOVD (40+8*7)(R13), F15 +skipfprest: + MOVW 12(R13), R4 + MOVW 16(R13), R5 + MOVW 20(R13), R6 + MOVW 24(R13), R7 + MOVW 28(R13), R8 + MOVW 32(R13), g + MOVW 36(R13), R11 + RET + +// _rt0_arm_lib_go initializes the Go runtime. +// This is started in a separate thread by _rt0_arm_lib. +TEXT _rt0_arm_lib_go<>(SB),NOSPLIT,$8 + MOVW _rt0_arm_lib_argc<>(SB), R0 + MOVW _rt0_arm_lib_argv<>(SB), R1 + B runtime·rt0_go(SB) + +DATA _rt0_arm_lib_argc<>(SB)/4,$0 +GLOBL _rt0_arm_lib_argc<>(SB),NOPTR,$4 +DATA _rt0_arm_lib_argv<>(SB)/4,$0 +GLOBL _rt0_arm_lib_argv<>(SB),NOPTR,$4 + +// using NOFRAME means do not save LR on stack. +// argc is in R0, argv is in R1. 
+TEXT runtime·rt0_go(SB),NOSPLIT|NOFRAME|TOPFRAME,$0 + MOVW $0xcafebabe, R12 + + // copy arguments forward on an even stack + // use R13 instead of SP to avoid linker rewriting the offsets + SUB $64, R13 // plenty of scratch + AND $~7, R13 + MOVW R0, 60(R13) // save argc, argv away + MOVW R1, 64(R13) + + // set up g register + // g is R10 + MOVW $runtime·g0(SB), g + MOVW $runtime·m0(SB), R8 + + // save m->g0 = g0 + MOVW g, m_g0(R8) + // save g->m = m0 + MOVW R8, g_m(g) + + // create istack out of the OS stack + // (1MB of system stack is available on iOS and Android) + MOVW $(-64*1024+104)(R13), R0 + MOVW R0, g_stackguard0(g) + MOVW R0, g_stackguard1(g) + MOVW R0, (g_stack+stack_lo)(g) + MOVW R13, (g_stack+stack_hi)(g) + + BL runtime·emptyfunc(SB) // fault if stack check is wrong + +#ifdef GOOS_openbsd + // Save g to TLS so that it is available from signal trampoline. + BL runtime·save_g(SB) +#endif + + BL runtime·_initcgo(SB) // will clobber R0-R3 + + // update stackguard after _cgo_init + MOVW (g_stack+stack_lo)(g), R0 + ADD $const__StackGuard, R0 + MOVW R0, g_stackguard0(g) + MOVW R0, g_stackguard1(g) + + BL runtime·check(SB) + + // saved argc, argv + MOVW 60(R13), R0 + MOVW R0, 4(R13) + MOVW 64(R13), R1 + MOVW R1, 8(R13) + BL runtime·args(SB) + BL runtime·checkgoarm(SB) + BL runtime·osinit(SB) + BL runtime·schedinit(SB) + + // create a new goroutine to start program + SUB $8, R13 + MOVW $runtime·mainPC(SB), R0 + MOVW R0, 4(R13) // arg 1: fn + MOVW $0, R0 + MOVW R0, 0(R13) // dummy LR + BL runtime·newproc(SB) + ADD $8, R13 // pop args and LR + + // start this M + BL runtime·mstart(SB) + + MOVW $1234, R0 + MOVW $1000, R1 + MOVW R0, (R1) // fail hard + +DATA runtime·mainPC+0(SB)/4,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$4 + +TEXT runtime·breakpoint(SB),NOSPLIT,$0-0 + // gdb won't skip this breakpoint instruction automatically, + // so you must manually "set $pc+=4" to skip it and continue. 
+#ifdef GOOS_plan9 + WORD $0xD1200070 // undefined instruction used as armv5 breakpoint in Plan 9 +#else + WORD $0xe7f001f0 // undefined instruction that gdb understands is a software breakpoint +#endif + RET + +TEXT runtime·asminit(SB),NOSPLIT,$0-0 + // disable runfast (flush-to-zero) mode of vfp if runtime.goarm > 5 + MOVB runtime·goarm(SB), R11 + CMP $5, R11 + BLE 4(PC) + WORD $0xeef1ba10 // vmrs r11, fpscr + BIC $(1<<24), R11 + WORD $0xeee1ba10 // vmsr fpscr, r11 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + BL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB),NOSPLIT|NOFRAME,$0-4 + MOVW buf+0(FP), R1 + MOVW gobuf_g(R1), R0 + MOVW 0(R0), R2 // make sure g != nil + B gogo<>(SB) + +TEXT gogo<>(SB),NOSPLIT|NOFRAME,$0 + BL setg<>(SB) + MOVW gobuf_sp(R1), R13 // restore SP==R13 + MOVW gobuf_lr(R1), LR + MOVW gobuf_ret(R1), R0 + MOVW gobuf_ctxt(R1), R7 + MOVW $0, R11 + MOVW R11, gobuf_sp(R1) // clear to help garbage collector + MOVW R11, gobuf_ret(R1) + MOVW R11, gobuf_lr(R1) + MOVW R11, gobuf_ctxt(R1) + MOVW gobuf_pc(R1), R11 + CMP R11, R11 // set condition codes for == test, needed by stack split + B (R11) + +// func mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. +TEXT runtime·mcall(SB),NOSPLIT|NOFRAME,$0-4 + // Save caller state in g->sched. + MOVW R13, (g_sched+gobuf_sp)(g) + MOVW LR, (g_sched+gobuf_pc)(g) + MOVW $0, R11 + MOVW R11, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. 
+ MOVW g, R1 + MOVW g_m(g), R8 + MOVW m_g0(R8), R0 + BL setg<>(SB) + CMP g, R1 + B.NE 2(PC) + B runtime·badmcall(SB) + MOVW fn+0(FP), R0 + MOVW (g_sched+gobuf_sp)(g), R13 + SUB $8, R13 + MOVW R1, 4(R13) + MOVW R0, R7 + MOVW 0(R0), R0 + BL (R0) + B runtime·badmcall2(SB) + RET + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB),NOSPLIT,$0-0 + MOVW $0, R0 + BL (R0) // clobber lr to ensure push {lr} is kept + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB),NOSPLIT,$0-4 + MOVW fn+0(FP), R0 // R0 = fn + MOVW g_m(g), R1 // R1 = m + + MOVW m_gsignal(R1), R2 // R2 = gsignal + CMP g, R2 + B.EQ noswitch + + MOVW m_g0(R1), R2 // R2 = g0 + CMP g, R2 + B.EQ noswitch + + MOVW m_curg(R1), R3 + CMP g, R3 + B.EQ switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVW $runtime·badsystemstack(SB), R0 + BL (R0) + B runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. + BL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVW R0, R5 + MOVW R2, R0 + BL setg<>(SB) + MOVW R5, R0 + MOVW (g_sched+gobuf_sp)(R2), R13 + + // call target function + MOVW R0, R7 + MOVW 0(R0), R0 + BL (R0) + + // switch back to g + MOVW g_m(g), R1 + MOVW m_curg(R1), R0 + BL setg<>(SB) + MOVW (g_sched+gobuf_sp)(g), R13 + MOVW $0, R3 + MOVW R3, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. 
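The three comparisons at the top of `systemstack` (against `gsignal`, `g0`, and `curg`) amount to a small classification of the current G. The Go sketch below models only that classification; the trimmed `g` and `m` types, `stackAction`, and `systemstackAction` are hypothetical and do not mirror the runtime's real structures.

```go
package main

import "fmt"

// g and m are trimmed, hypothetical stand-ins for the runtime types.
type g struct{ name string }

type m struct {
	g0, gsignal, curg *g
}

type stackAction int

const (
	callInPlace stackAction = iota // already on a system or signal stack
	switchToG0                     // on the user stack: switch, call, switch back
	badStack                       // g is none of the above
)

// systemstackAction mirrors the order of checks in the assembly:
// gsignal, then g0, then curg.
func systemstackAction(cur *g, mp *m) stackAction {
	switch cur {
	case mp.gsignal, mp.g0:
		return callInPlace
	case mp.curg:
		return switchToG0
	default:
		return badStack
	}
}

func main() {
	mp := &m{g0: &g{"g0"}, gsignal: &g{"gsignal"}, curg: &g{"user"}}
	fmt.Println(systemstackAction(mp.curg, mp) == switchToG0)
	fmt.Println(systemstackAction(mp.g0, mp) == callInPlace)
	fmt.Println(systemstackAction(&g{"stray"}, mp) == badStack)
}
```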
+ MOVW R0, R7 + MOVW 0(R0), R0 + MOVW.P 4(R13), R14 // restore LR + B (R0) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// R3 prolog's LR +// using NOFRAME means do not save LR on stack. +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVW g_m(g), R8 + MOVW m_g0(R8), R4 + CMP g, R4 + BNE 3(PC) + BL runtime·badmorestackg0(SB) + B runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVW m_gsignal(R8), R4 + CMP g, R4 + BNE 3(PC) + BL runtime·badmorestackgsignal(SB) + B runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOVW R13, (g_sched+gobuf_sp)(g) + MOVW LR, (g_sched+gobuf_pc)(g) + MOVW R3, (g_sched+gobuf_lr)(g) + MOVW R7, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOVW R3, (m_morebuf+gobuf_pc)(R8) // f's caller's PC + MOVW R13, (m_morebuf+gobuf_sp)(R8) // f's caller's SP + MOVW g, (m_morebuf+gobuf_g)(R8) + + // Call newstack on m->g0's stack. + MOVW m_g0(R8), R0 + BL setg<>(SB) + MOVW (g_sched+gobuf_sp)(g), R13 + MOVW $0, R0 + MOVW.W R0, -4(R13) // create a call frame on g0 (saved LR) + BL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + RET + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R3), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. 
(See issue 54332) + MOVW R13, R13 + + MOVW $0, R7 + B runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + CMP $MAXSIZE, R0; \ + B.HI 3(PC); \ + MOVW $NAME(SB), R1; \ + B (R1) + +TEXT ·reflectcall(SB),NOSPLIT|NOFRAME,$0-28 + MOVW frameSize+20(FP), R0 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVW $runtime·badreflectcall(SB), R1 + B (R1) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-28; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVW stackArgs+8(FP), R0; \ + MOVW stackArgsSize+12(FP), R2; \ + ADD $4, R13, R1; \ + CMP $0, R2; \ + B.EQ 5(PC); \ + 
MOVBU.P 1(R0), R5; \ + MOVBU.P R5, 1(R1); \ + SUB $1, R2, R2; \ + B -5(PC); \ + /* call function */ \ + MOVW f+4(FP), R7; \ + MOVW (R7), R0; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + BL (R0); \ + /* copy return values back */ \ + MOVW stackArgsType+0(FP), R4; \ + MOVW stackArgs+8(FP), R0; \ + MOVW stackArgsSize+12(FP), R2; \ + MOVW stackArgsRetOffset+16(FP), R3; \ + ADD $4, R13, R1; \ + ADD R3, R1; \ + ADD R3, R0; \ + SUB R3, R2; \ + BL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $20-0 + MOVW R4, 4(R13) + MOVW R0, 8(R13) + MOVW R1, 12(R13) + MOVW R2, 16(R13) + MOVW $0, R7 + MOVW R7, 20(R13) + BL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R11. 
+TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVW $runtime·systemstack_switch(SB), R11 + ADD $4, R11 // get past push {lr} + MOVW R11, (g_sched+gobuf_pc)(g) + MOVW R13, (g_sched+gobuf_sp)(g) + MOVW $0, R11 + MOVW R11, (g_sched+gobuf_lr)(g) + MOVW R11, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOVW (g_sched+gobuf_ctxt)(g), R11 + TST R11, R11 + B.EQ 2(PC) + BL runtime·abort(SB) + RET + +// func asmcgocall_no_g(fn, arg unsafe.Pointer) +// Call fn(arg) aligned appropriately for the gcc ABI. +// Called on a system stack, and there may be no g yet (during needm). +TEXT ·asmcgocall_no_g(SB),NOSPLIT,$0-8 + MOVW fn+0(FP), R1 + MOVW arg+4(FP), R0 + MOVW R13, R2 + SUB $32, R13 + BIC $0x7, R13 // alignment for gcc ABI + MOVW R2, 8(R13) + BL (R1) + MOVW 8(R13), R2 + MOVW R2, R13 + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-12 + MOVW fn+0(FP), R1 + MOVW arg+4(FP), R0 + + MOVW R13, R2 + CMP $0, g + BEQ nosave + MOVW g, R4 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOVW g_m(g), R8 + MOVW m_gsignal(R8), R3 + CMP R3, g + BEQ nosave + MOVW m_g0(R8), R3 + CMP R3, g + BEQ nosave + BL gosave_systemstack_switch<>(SB) + MOVW R0, R5 + MOVW R3, R0 + BL setg<>(SB) + MOVW R5, R0 + MOVW (g_sched+gobuf_sp)(g), R13 + + // Now on a scheduling stack (a pthread-created stack). + SUB $24, R13 + BIC $0x7, R13 // alignment for gcc ABI + MOVW R4, 20(R13) // save old g + MOVW (g_stack+stack_hi)(R4), R4 + SUB R2, R4 + MOVW R4, 16(R13) // save depth in stack (can't just save SP, as stack might be copied during a callback) + BL (R1) + + // Restore registers, g, stack pointer. 
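`asmcgocall` saves the depth below `stack.hi` instead of the raw SP because, as its comment notes, the goroutine stack might be copied during a callback. A minimal Go sketch of that arithmetic follows; the `stack` type, `saveDepth`, `restoreSP`, and the addresses are made up for illustration.

```go
package main

import "fmt"

// stack is a hypothetical stand-in for the runtime's stack bounds.
type stack struct{ lo, hi uintptr }

// saveDepth records how far SP sits below the top of the stack.
func saveDepth(s stack, sp uintptr) uintptr { return s.hi - sp }

// restoreSP recomputes SP from the (possibly new) stack bounds.
func restoreSP(s stack, depth uintptr) uintptr { return s.hi - depth }

func main() {
	old := stack{lo: 0x1000, hi: 0x2000}
	sp := uintptr(0x1f00)
	depth := saveDepth(old, sp)

	// Assume a callback grew the goroutine stack and it now lives elsewhere.
	moved := stack{lo: 0x8000, hi: 0xa000}
	fmt.Printf("depth=%#x new sp=%#x\n", depth, restoreSP(moved, depth))
}
```

A saved absolute SP would still point into the old stack after the move, whereas the depth stays valid relative to the new `stack.hi`.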
+ MOVW R0, R5 + MOVW 20(R13), R0 + BL setg<>(SB) + MOVW (g_stack+stack_hi)(g), R1 + MOVW 16(R13), R2 + SUB R2, R1 + MOVW R5, R0 + MOVW R1, R13 + + MOVW R0, ret+8(FP) + RET + +nosave: + // Running on a system stack, perhaps even without a g. + // Having no g can happen during thread creation or thread teardown + // (see needm/dropm on Solaris, for example). + // This code is like the above sequence but without saving/restoring g + // and without worrying about the stack moving out from under us + // (because we're on a system stack, not a goroutine stack). + // The above code could be used directly if already on a system stack, + // but then the only path through this code would be a rare case on Solaris. + // Using this code for all "already on system stack" calls exercises it more, + // which should help keep it correct. + SUB $24, R13 + BIC $0x7, R13 // alignment for gcc ABI + // save null g in case someone looks during debugging. + MOVW $0, R4 + MOVW R4, 20(R13) + MOVW R2, 16(R13) // Save old stack pointer. + BL (R1) + // Restore stack pointer. + MOVW 16(R13), R2 + MOVW R2, R13 + MOVW R0, ret+8(FP) + RET + +// cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$12-12 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. +#ifdef GOOS_openbsd + BL runtime·load_g(SB) +#else + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + BL.NE runtime·load_g(SB) +#endif + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + CMP $0, g + B.EQ needm + + MOVW g_m(g), R8 + MOVW R8, savedm-4(SP) + B havem + +needm: + MOVW g, savedm-4(SP) // g is zero, so is m. 
+ MOVW $runtime·needm(SB), R0 + BL (R0) + + // Set m->g0->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVW g_m(g), R8 + MOVW m_g0(R8), R3 + MOVW R13, (g_sched+gobuf_sp)(R3) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 4(R13) aka savedsp-12(SP). + MOVW m_g0(R8), R3 + MOVW (g_sched+gobuf_sp)(R3), R4 + MOVW R4, savedsp-12(SP) // must match frame size + MOVW R13, (g_sched+gobuf_sp)(R3) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. 
+ MOVW m_curg(R8), R0 + BL setg<>(SB) + MOVW (g_sched+gobuf_sp)(g), R4 // prepare stack as R4 + MOVW (g_sched+gobuf_pc)(g), R5 + MOVW R5, -(12+4)(R4) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOVW fn+0(FP), R1 + MOVW frame+4(FP), R2 + MOVW ctxt+8(FP), R3 + MOVW $-(12+4)(R4), R13 // switch stack; must match frame size + MOVW R1, 4(R13) + MOVW R2, 8(R13) + MOVW R3, 12(R13) + BL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOVW 0(R13), R5 + MOVW R5, (g_sched+gobuf_pc)(g) + MOVW $(12+4)(R13), R4 // must match frame size + MOVW R4, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVW g_m(g), R8 + MOVW m_g0(R8), R0 + BL setg<>(SB) + MOVW (g_sched+gobuf_sp)(g), R13 + MOVW savedsp-12(SP), R4 // must match frame size + MOVW R4, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVW savedm-4(SP), R6 + CMP $0, R6 + B.NE 3(PC) + MOVW $runtime·dropm(SB), R0 + BL (R0) + + // Done! + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB),NOSPLIT|NOFRAME,$0-4 + MOVW gg+0(FP), R0 + B setg<>(SB) + +TEXT setg<>(SB),NOSPLIT|NOFRAME,$0-0 + MOVW R0, g + + // Save g to thread-local storage. +#ifdef GOOS_windows + B runtime·save_g(SB) +#else +#ifdef GOOS_openbsd + B runtime·save_g(SB) +#else + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + B.EQ 2(PC) + B runtime·save_g(SB) + + MOVW g, R0 + RET +#endif +#endif + +TEXT runtime·emptyfunc(SB),0,$0-0 + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVW $0, R0 + MOVW (R0), R1 + +// armPublicationBarrier is a native store/store barrier for ARMv7+. +// On earlier ARM revisions, armPublicationBarrier is a no-op. +// This will not work on SMP ARMv6 machines, if any are in use. 
+// To implement publicationBarrier in sys_$GOOS_arm.s using the native +// instructions, use: +// +// TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 +// B runtime·armPublicationBarrier(SB) +// +TEXT runtime·armPublicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + DMB MB_ST + RET + +// AES hashing not implemented for ARM +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-16 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB),NOSPLIT,$0 + MOVW $0, R0 + RET + +TEXT runtime·procyield(SB),NOSPLIT|NOFRAME,$0 + MOVW cycles+0(FP), R1 + MOVW $0, R0 +yieldloop: + WORD $0xe320f001 // YIELD (NOP pre-ARMv6K) + CMP R0, R1 + B.NE 2(PC) + RET + SUB $1, R1 + B yieldloop + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$8 + // R11 and g register are clobbered by load_g. They are + // callee-save in the gcc calling convention, so save them here. + MOVW R11, saveR11-4(SP) + MOVW g, saveG-8(SP) + + BL runtime·load_g(SB) + MOVW g_m(g), R0 + MOVW m_curg(R0), R0 + MOVW (g_stack+stack_hi)(R0), R0 + + MOVW saveG-8(SP), g + MOVW saveR11-4(SP), R11 + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + MOVW R0, R0 // NOP + BL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + MOVW R0, R0 // NOP + +// x -> x/1000000, x%1000000, called from Go with args, results on stack. 
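The split described in the comment above (x to x/1000000 and x%1000000) can be cross-checked in Go. This sketch mirrors the multiply-and-shift sequence of the `usplitR0` routine in this file, taking the constant 1125899907 and the 18-bit shift of the high word from that routine. The names `usplitMagic` and `usplitModel` are hypothetical.

```go
package main

import "fmt"

// usplitMagic is the multiplier loaded by usplitR0 (about 2^50 / 1000000).
const usplitMagic = 1125899907

// usplitModel computes x/1000000 and x%1000000 without a divide,
// following the same steps as the assembly.
func usplitModel(x uint32) (q, r uint32) {
	// MULLU: keep the high 32 bits of the 64-bit product.
	hi := uint32((uint64(x) * usplitMagic) >> 32)
	// MOVW R0>>18, R0: the remaining shift, 50 bits in total.
	q = hi >> 18
	// MULU and SUB: remainder is x minus q*1000000.
	r = x - q*1000000
	return q, r
}

func main() {
	for _, x := range []uint32{0, 999999, 1000000, 1234567890, 4294967295} {
		q, r := usplitModel(x)
		fmt.Println(x, q, r, q == x/1000000 && r == x%1000000)
	}
}
```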
+TEXT runtime·usplit(SB),NOSPLIT,$0-12 + MOVW x+0(FP), R0 + CALL runtime·usplitR0(SB) + MOVW R0, q+4(FP) + MOVW R1, r+8(FP) + RET + +// R0, R1 = R0/1000000, R0%1000000 +TEXT runtime·usplitR0(SB),NOSPLIT,$0 + // magic multiply to avoid software divide without available m. + // see output of go tool compile -S for x/1000000. + MOVW R0, R3 + MOVW $1125899907, R1 + MULLU R1, R0, (R0, R1) + MOVW R0>>18, R0 + MOVW $1000000, R1 + MULU R0, R1 + SUB R1, R3, R1 + RET + +// This is called from .init_array and follows the platform, not Go, ABI. +TEXT runtime·addmoduledata(SB),NOSPLIT,$0-0 + MOVW R9, saver9-4(SP) // The access to global variables below implicitly uses R9, which is callee-save + MOVW R11, saver11-8(SP) // Likewise, R11 is the temp register, but callee-save in C ABI + MOVW runtime·lastmoduledatap(SB), R1 + MOVW R0, moduledata_next(R1) + MOVW R0, runtime·lastmoduledatap(SB) + MOVW saver11-8(SP), R11 + MOVW saver9-4(SP), R9 + RET + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R3 + MOVB R3, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R2 is the destination of the write +// - R3 is the value being written at R2 +// It clobbers condition codes. +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). +// The act of CALLing gcWriteBarrier will clobber R14 (LR). +TEXT runtime·gcWriteBarrier(SB),NOSPLIT|NOFRAME,$0 + // Save the registers clobbered by the fast path. + MOVM.DB.W [R0,R1], (R13) + MOVW g_m(g), R0 + MOVW m_p(R0), R0 + MOVW (p_wbBuf+wbBuf_next)(R0), R1 + // Increment wbBuf.next position. + ADD $8, R1 + MOVW R1, (p_wbBuf+wbBuf_next)(R0) + MOVW (p_wbBuf+wbBuf_end)(R0), R0 + CMP R1, R0 + // Record the write. + MOVW R3, -8(R1) // Record value + MOVW (R2), R0 // TODO: This turns bad writes into bad reads. + MOVW R0, -4(R1) // Record *slot + // Is the buffer full? 
(flags set in CMP above) + B.EQ flush +ret: + MOVM.IA.W (R13), [R0,R1] + // Do the write. + MOVW R3, (R2) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + // + // R0 and R1 were saved at entry. + // R10 is g, so preserved. + // R11 is linker temp, so no need to save. + // R13 is stack pointer. + // R15 is PC. + // + // This also sets up R2 and R3 as the arguments to wbBufFlush. + MOVM.DB.W [R2-R9,R12], (R13) + // Save R14 (LR) because the fast path above doesn't save it, + // but needs it to RET. This is after the MOVM so it appears below + // the arguments in the stack frame. + MOVM.DB.W [R14], (R13) + + // This takes arguments R2 and R3. + CALL runtime·wbBufFlush(SB) + + MOVM.IA.W (R13), [R14] + MOVM.IA.W (R13), [R2-R9,R12] + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments is allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. 
+TEXT runtime·panicIndex(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT runtime·panicSlice3AlenU(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-8 + MOVW R0, x+0(FP) + MOVW R1, y+4(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP 
runtime·goPanicSliceConvert(SB) + +// Extended versions for 64-bit indexes. +TEXT runtime·panicExtendIndex(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendIndex(SB) +TEXT runtime·panicExtendIndexU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendIndexU(SB) +TEXT runtime·panicExtendSliceAlen(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSliceAlen(SB) +TEXT runtime·panicExtendSliceAlenU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSliceAlenU(SB) +TEXT runtime·panicExtendSliceAcap(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSliceAcap(SB) +TEXT runtime·panicExtendSliceAcapU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSliceAcapU(SB) +TEXT runtime·panicExtendSliceB(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendSliceB(SB) +TEXT runtime·panicExtendSliceBU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendSliceBU(SB) +TEXT runtime·panicExtendSlice3Alen(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3Alen(SB) +TEXT runtime·panicExtendSlice3AlenU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3AlenU(SB) +TEXT runtime·panicExtendSlice3Acap(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3Acap(SB) +TEXT runtime·panicExtendSlice3AcapU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3AcapU(SB) +TEXT runtime·panicExtendSlice3B(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + 
MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSlice3B(SB) +TEXT runtime·panicExtendSlice3BU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSlice3BU(SB) +TEXT runtime·panicExtendSlice3C(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendSlice3C(SB) +TEXT runtime·panicExtendSlice3CU(SB),NOSPLIT,$0-12 + MOVW R4, hi+0(FP) + MOVW R0, lo+4(FP) + MOVW R1, y+8(FP) + JMP runtime·goPanicExtendSlice3CU(SB) diff --git a/src/runtime/asm_arm64.s b/src/runtime/asm_arm64.s new file mode 100644 index 0000000..7eb5bcf --- /dev/null +++ b/src/runtime/asm_arm64.s @@ -0,0 +1,1525 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "tls_arm64.h" +#include "funcdata.h" +#include "textflag.h" + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // SP = stack; R0 = argc; R1 = argv + + SUB $32, RSP + MOVW R0, 8(RSP) // argc + MOVD R1, 16(RSP) // argv + +#ifdef TLS_darwin + // Initialize TLS. + MOVD ZR, g // clear g, make sure it's not junk. + SUB $32, RSP + MRS_TPIDR_R0 + AND $~7, R0 + MOVD R0, 16(RSP) // arg2: TLS base + MOVD $runtime·tls_g(SB), R2 + MOVD R2, 8(RSP) // arg1: &tlsg + BL ·tlsinit(SB) + ADD $32, RSP +#endif + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOVD $runtime·g0(SB), g + MOVD RSP, R7 + MOVD $(-64*1024)(R7), R0 + MOVD R0, g_stackguard0(g) + MOVD R0, g_stackguard1(g) + MOVD R0, (g_stack+stack_lo)(g) + MOVD R7, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. 
+ MOVD _cgo_init(SB), R12 + CBZ R12, nocgo + +#ifdef GOOS_android + MRS_TPIDR_R0 // load TLS base pointer + MOVD R0, R3 // arg 3: TLS base pointer + MOVD $runtime·tls_g(SB), R2 // arg 2: &tls_g +#else + MOVD $0, R2 // arg 2: not used when using platform's TLS +#endif + MOVD $setg_gcc<>(SB), R1 // arg 1: setg + MOVD g, R0 // arg 0: G + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R12) + ADD $16, RSP + +nocgo: + BL runtime·save_g(SB) + // update stackguard after _cgo_init + MOVD (g_stack+stack_lo)(g), R0 + ADD $const__StackGuard, R0 + MOVD R0, g_stackguard0(g) + MOVD R0, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVD $runtime·m0(SB), R0 + + // save m->g0 = g0 + MOVD g, m_g0(R0) + // save m0 to g0->m + MOVD R0, g_m(g) + + BL runtime·check(SB) + +#ifdef GOOS_windows + BL runtime·wintls(SB) +#endif + + MOVW 8(RSP), R0 // copy argc + MOVW R0, -8(RSP) + MOVD 16(RSP), R0 // copy argv + MOVD R0, 0(RSP) + BL runtime·args(SB) + BL runtime·osinit(SB) + BL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVD $runtime·mainPC(SB), R0 // entry + SUB $16, RSP + MOVD R0, 8(RSP) // arg + MOVD $0, 0(RSP) // dummy LR + BL runtime·newproc(SB) + ADD $16, RSP + + // start this M + BL runtime·mstart(SB) + + // Prevent dead-code elimination of debugCallV2, which is + // intended to be called by debuggers. + MOVD $runtime·debugCallV2<ABIInternal>(SB), R0 + + MOVD $0, R0 + MOVD R0, (R0) // boom + UNDEF + +DATA runtime·mainPC+0(SB)/8,$runtime·main<ABIInternal>(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +// Windows ARM64 needs an immediate 0xf000 argument. +// See go.dev/issues/53837. 
+#define BREAK \ +#ifdef GOOS_windows \ + BRK $0xf000 \ +#else \ + BRK \ +#endif \ + + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + BREAK + RET + +TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + BL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOVD buf+0(FP), R5 + MOVD gobuf_g(R5), R6 + MOVD 0(R6), R4 // make sure g != nil + B gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOVD R6, g + BL runtime·save_g(SB) + + MOVD gobuf_sp(R5), R0 + MOVD R0, RSP + MOVD gobuf_bp(R5), R29 + MOVD gobuf_lr(R5), LR + MOVD gobuf_ret(R5), R0 + MOVD gobuf_ctxt(R5), R26 + MOVD $0, gobuf_sp(R5) + MOVD $0, gobuf_bp(R5) + MOVD $0, gobuf_ret(R5) + MOVD $0, gobuf_lr(R5) + MOVD $0, gobuf_ctxt(R5) + CMP ZR, ZR // set condition codes for == test, needed by stack split + MOVD gobuf_pc(R5), R6 + B (R6) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. +TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-8 + MOVD R0, R26 // context + + // Save caller state in g->sched + MOVD RSP, R0 + MOVD R0, (g_sched+gobuf_sp)(g) + MOVD R29, (g_sched+gobuf_bp)(g) + MOVD LR, (g_sched+gobuf_pc)(g) + MOVD $0, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. + MOVD g, R3 + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + CMP g, R3 + BNE 2(PC) + B runtime·badmcall(SB) + + MOVD (g_sched+gobuf_sp)(g), R0 + MOVD R0, RSP // sp = m->g0->sched.sp + MOVD (g_sched+gobuf_bp)(g), R29 + MOVD R3, R0 // arg = g + MOVD $0, -16(RSP) // dummy LR + SUB $16, RSP + MOVD 0(R26), R4 // code pointer + BL (R4) + B runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. 
We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + UNDEF + BL (LR) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVD fn+0(FP), R3 // R3 = fn + MOVD R3, R26 // context + MOVD g_m(g), R4 // R4 = m + + MOVD m_gsignal(R4), R5 // R5 = gsignal + CMP g, R5 + BEQ noswitch + + MOVD m_g0(R4), R5 // R5 = g0 + CMP g, R5 + BEQ noswitch + + MOVD m_curg(R4), R6 + CMP g, R6 + BEQ switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVD $runtime·badsystemstack(SB), R3 + BL (R3) + B runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. + BL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVD R5, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R3 + MOVD R3, RSP + MOVD (g_sched+gobuf_bp)(g), R29 + + // call target function + MOVD 0(R26), R3 // code pointer + BL (R3) + + // switch back to g + MOVD g_m(g), R3 + MOVD m_curg(R3), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R0 + MOVD R0, RSP + MOVD (g_sched+gobuf_bp)(g), R29 + MOVD $0, (g_sched+gobuf_sp)(g) + MOVD $0, (g_sched+gobuf_bp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVD 0(R26), R3 // code pointer + MOVD.P 16(RSP), R30 // restore LR + SUB $8, RSP, R29 // restore FP + B (R3) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. 
+// Caller has already loaded: +// R3 prolog's LR (R30) +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVD g_m(g), R8 + MOVD m_g0(R8), R4 + CMP g, R4 + BNE 3(PC) + BL runtime·badmorestackg0(SB) + B runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVD m_gsignal(R8), R4 + CMP g, R4 + BNE 3(PC) + BL runtime·badmorestackgsignal(SB) + B runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f + MOVD RSP, R0 + MOVD R0, (g_sched+gobuf_sp)(g) + MOVD R29, (g_sched+gobuf_bp)(g) + MOVD LR, (g_sched+gobuf_pc)(g) + MOVD R3, (g_sched+gobuf_lr)(g) + MOVD R26, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's callers. + MOVD R3, (m_morebuf+gobuf_pc)(R8) // f's caller's PC + MOVD RSP, R0 + MOVD R0, (m_morebuf+gobuf_sp)(R8) // f's caller's RSP + MOVD g, (m_morebuf+gobuf_g)(R8) + + // Call newstack on m->g0's stack. + MOVD m_g0(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R0 + MOVD R0, RSP + MOVD (g_sched+gobuf_bp)(g), R29 + MOVD.W $0, -16(RSP) // create a call frame on g0 (saved LR; keep 16-aligned) + BL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R3), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. 
(See issue 54332) + MOVD RSP, RSP + + MOVW $0, R26 + B runtime·morestack(SB) + +// spillArgs stores return values from registers to a *internal/abi.RegArgs in R20. +TEXT ·spillArgs(SB),NOSPLIT,$0-0 + STP (R0, R1), (0*8)(R20) + STP (R2, R3), (2*8)(R20) + STP (R4, R5), (4*8)(R20) + STP (R6, R7), (6*8)(R20) + STP (R8, R9), (8*8)(R20) + STP (R10, R11), (10*8)(R20) + STP (R12, R13), (12*8)(R20) + STP (R14, R15), (14*8)(R20) + FSTPD (F0, F1), (16*8)(R20) + FSTPD (F2, F3), (18*8)(R20) + FSTPD (F4, F5), (20*8)(R20) + FSTPD (F6, F7), (22*8)(R20) + FSTPD (F8, F9), (24*8)(R20) + FSTPD (F10, F11), (26*8)(R20) + FSTPD (F12, F13), (28*8)(R20) + FSTPD (F14, F15), (30*8)(R20) + RET + +// unspillArgs loads args into registers from a *internal/abi.RegArgs in R20. +TEXT ·unspillArgs(SB),NOSPLIT,$0-0 + LDP (0*8)(R20), (R0, R1) + LDP (2*8)(R20), (R2, R3) + LDP (4*8)(R20), (R4, R5) + LDP (6*8)(R20), (R6, R7) + LDP (8*8)(R20), (R8, R9) + LDP (10*8)(R20), (R10, R11) + LDP (12*8)(R20), (R12, R13) + LDP (14*8)(R20), (R14, R15) + FLDPD (16*8)(R20), (F0, F1) + FLDPD (18*8)(R20), (F2, F3) + FLDPD (20*8)(R20), (F4, F5) + FLDPD (22*8)(R20), (F6, F7) + FLDPD (24*8)(R20), (F8, F9) + FLDPD (26*8)(R20), (F10, F11) + FLDPD (28*8)(R20), (F12, F13) + FLDPD (30*8)(R20), (F14, F15) + RET + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOVD $MAXSIZE, R27; \ + CMP R27, R16; \ + BGT 3(PC); \ + MOVD $NAME(SB), R27; \ + B (R27) +// Note: can't just "B NAME(SB)" - bad inlining results. 
+ +TEXT ·reflectcall(SB), NOSPLIT|NOFRAME, $0-48 + MOVWU frameSize+32(FP), R16 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVD $runtime·badreflectcall(SB), R0 + B (R0) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVD stackArgs+16(FP), R3; \ + MOVWU stackArgsSize+24(FP), R4; \ + ADD $8, RSP, R5; \ + BIC $0xf, R4, R6; \ + CBZ R6, 6(PC); \ + /* if R6=(argsize&~15) != 0 */ \ + ADD R6, R5, R6; \ + /* copy 16 bytes at a time */ \ + LDP.P 16(R3), (R7, R8); \ + STP.P (R7, R8), 16(R5); \ + CMP R5, R6; \ + BNE -3(PC); \ + AND $0xf, R4, R6; \ + CBZ R6, 6(PC); \ + /* if R6=(argsize&15) != 0 */ \ + ADD R6, R5, R6; \ + /* copy 1 byte at a time for the rest */ \ + MOVBU.P 1(R3), R7; \ + MOVBU.P R7, 1(R5); \ + CMP R5, R6; \ + BNE -3(PC); \ + /* set up argument registers */ \ + MOVD regArgs+40(FP), R20; \ + CALL ·unspillArgs(SB); \ + /* call function */ \ + MOVD f+8(FP), R26; \ + MOVD (R26), R20; \ + PCDATA 
$PCDATA_StackMapIndex, $0; \ + BL (R20); \ + /* copy return values back */ \ + MOVD regArgs+40(FP), R20; \ + CALL ·spillArgs(SB); \ + MOVD stackArgsType+0(FP), R7; \ + MOVD stackArgs+16(FP), R3; \ + MOVWU stackArgsSize+24(FP), R4; \ + MOVWU stackRetOffset+28(FP), R6; \ + ADD $8, RSP, R5; \ + ADD R6, R5; \ + ADD R6, R3; \ + SUB R6, R4; \ + BL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $48-0 + NO_LOCAL_POINTERS + STP (R7, R3), 8(RSP) + STP (R5, R4), 24(RSP) + MOVD R20, 40(RSP) + BL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +// func memhash32(p unsafe.Pointer, h uintptr) uintptr +TEXT runtime·memhash32<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + MOVB runtime·useAeshash(SB), R10 + CBZ R10, noaes + MOVD $runtime·aeskeysched+0(SB), R3 + + VEOR V0.B16, V0.B16, V0.B16 + VLD1 (R3), [V2.B16] + VLD1 (R0), V0.S[1] + VMOV R1, V0.S[0] + + AESE V2.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V2.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V2.B16, V0.B16 + + VMOV V0.D[0], R0 + RET +noaes: + B 
runtime·memhash32Fallback<ABIInternal>(SB) + +// func memhash64(p unsafe.Pointer, h uintptr) uintptr +TEXT runtime·memhash64<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + MOVB runtime·useAeshash(SB), R10 + CBZ R10, noaes + MOVD $runtime·aeskeysched+0(SB), R3 + + VEOR V0.B16, V0.B16, V0.B16 + VLD1 (R3), [V2.B16] + VLD1 (R0), V0.D[1] + VMOV R1, V0.D[0] + + AESE V2.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V2.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V2.B16, V0.B16 + + VMOV V0.D[0], R0 + RET +noaes: + B runtime·memhash64Fallback<ABIInternal>(SB) + +// func memhash(p unsafe.Pointer, h, size uintptr) uintptr +TEXT runtime·memhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-32 + MOVB runtime·useAeshash(SB), R10 + CBZ R10, noaes + B aeshashbody<>(SB) +noaes: + B runtime·memhashFallback<ABIInternal>(SB) + +// func strhash(p unsafe.Pointer, h uintptr) uintptr +TEXT runtime·strhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + MOVB runtime·useAeshash(SB), R10 + CBZ R10, noaes + LDP (R0), (R0, R2) // string data / length + B aeshashbody<>(SB) +noaes: + B runtime·strhashFallback<ABIInternal>(SB) + +// R0: data +// R1: seed data +// R2: length +// At return, R0 = return value +TEXT aeshashbody<>(SB),NOSPLIT|NOFRAME,$0 + VEOR V30.B16, V30.B16, V30.B16 + VMOV R1, V30.D[0] + VMOV R2, V30.D[1] // load length into seed + + MOVD $runtime·aeskeysched+0(SB), R4 + VLD1.P 16(R4), [V0.B16] + AESE V30.B16, V0.B16 + AESMC V0.B16, V0.B16 + CMP $16, R2 + BLO aes0to15 + BEQ aes16 + CMP $32, R2 + BLS aes17to32 + CMP $64, R2 + BLS aes33to64 + CMP $128, R2 + BLS aes65to128 + B aes129plus + +aes0to15: + CBZ R2, aes0 + VEOR V2.B16, V2.B16, V2.B16 + TBZ $3, R2, less_than_8 + VLD1.P 8(R0), V2.D[0] + +less_than_8: + TBZ $2, R2, less_than_4 + VLD1.P 4(R0), V2.S[2] + +less_than_4: + TBZ $1, R2, less_than_2 + VLD1.P 2(R0), V2.H[6] + +less_than_2: + TBZ $0, R2, done + VLD1 (R0), V2.B[14] +done: + AESE V0.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V0.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V0.B16, V2.B16 + + VMOV V2.D[0], 
R0 + RET + +aes0: + VMOV V0.D[0], R0 + RET + +aes16: + VLD1 (R0), [V2.B16] + B done + +aes17to32: + // make second seed + VLD1 (R4), [V1.B16] + AESE V30.B16, V1.B16 + AESMC V1.B16, V1.B16 + SUB $16, R2, R10 + VLD1.P (R0)(R10), [V2.B16] + VLD1 (R0), [V3.B16] + + AESE V0.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V1.B16, V3.B16 + AESMC V3.B16, V3.B16 + + AESE V0.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V1.B16, V3.B16 + AESMC V3.B16, V3.B16 + + AESE V0.B16, V2.B16 + AESE V1.B16, V3.B16 + + VEOR V3.B16, V2.B16, V2.B16 + + VMOV V2.D[0], R0 + RET + +aes33to64: + VLD1 (R4), [V1.B16, V2.B16, V3.B16] + AESE V30.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V30.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V30.B16, V3.B16 + AESMC V3.B16, V3.B16 + SUB $32, R2, R10 + + VLD1.P (R0)(R10), [V4.B16, V5.B16] + VLD1 (R0), [V6.B16, V7.B16] + + AESE V0.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V1.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V2.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V3.B16, V7.B16 + AESMC V7.B16, V7.B16 + + AESE V0.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V1.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V2.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V3.B16, V7.B16 + AESMC V7.B16, V7.B16 + + AESE V0.B16, V4.B16 + AESE V1.B16, V5.B16 + AESE V2.B16, V6.B16 + AESE V3.B16, V7.B16 + + VEOR V6.B16, V4.B16, V4.B16 + VEOR V7.B16, V5.B16, V5.B16 + VEOR V5.B16, V4.B16, V4.B16 + + VMOV V4.D[0], R0 + RET + +aes65to128: + VLD1.P 64(R4), [V1.B16, V2.B16, V3.B16, V4.B16] + VLD1 (R4), [V5.B16, V6.B16, V7.B16] + AESE V30.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V30.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V30.B16, V3.B16 + AESMC V3.B16, V3.B16 + AESE V30.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V30.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V30.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V30.B16, V7.B16 + AESMC V7.B16, V7.B16 + + SUB $64, R2, R10 + VLD1.P (R0)(R10), [V8.B16, V9.B16, V10.B16, V11.B16] + VLD1 (R0), [V12.B16, V13.B16, V14.B16, V15.B16] + AESE V0.B16, V8.B16 + AESMC V8.B16, V8.B16 + 
AESE V1.B16, V9.B16 + AESMC V9.B16, V9.B16 + AESE V2.B16, V10.B16 + AESMC V10.B16, V10.B16 + AESE V3.B16, V11.B16 + AESMC V11.B16, V11.B16 + AESE V4.B16, V12.B16 + AESMC V12.B16, V12.B16 + AESE V5.B16, V13.B16 + AESMC V13.B16, V13.B16 + AESE V6.B16, V14.B16 + AESMC V14.B16, V14.B16 + AESE V7.B16, V15.B16 + AESMC V15.B16, V15.B16 + + AESE V0.B16, V8.B16 + AESMC V8.B16, V8.B16 + AESE V1.B16, V9.B16 + AESMC V9.B16, V9.B16 + AESE V2.B16, V10.B16 + AESMC V10.B16, V10.B16 + AESE V3.B16, V11.B16 + AESMC V11.B16, V11.B16 + AESE V4.B16, V12.B16 + AESMC V12.B16, V12.B16 + AESE V5.B16, V13.B16 + AESMC V13.B16, V13.B16 + AESE V6.B16, V14.B16 + AESMC V14.B16, V14.B16 + AESE V7.B16, V15.B16 + AESMC V15.B16, V15.B16 + + AESE V0.B16, V8.B16 + AESE V1.B16, V9.B16 + AESE V2.B16, V10.B16 + AESE V3.B16, V11.B16 + AESE V4.B16, V12.B16 + AESE V5.B16, V13.B16 + AESE V6.B16, V14.B16 + AESE V7.B16, V15.B16 + + VEOR V12.B16, V8.B16, V8.B16 + VEOR V13.B16, V9.B16, V9.B16 + VEOR V14.B16, V10.B16, V10.B16 + VEOR V15.B16, V11.B16, V11.B16 + VEOR V10.B16, V8.B16, V8.B16 + VEOR V11.B16, V9.B16, V9.B16 + VEOR V9.B16, V8.B16, V8.B16 + + VMOV V8.D[0], R0 + RET + +aes129plus: + PRFM (R0), PLDL1KEEP + VLD1.P 64(R4), [V1.B16, V2.B16, V3.B16, V4.B16] + VLD1 (R4), [V5.B16, V6.B16, V7.B16] + AESE V30.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V30.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V30.B16, V3.B16 + AESMC V3.B16, V3.B16 + AESE V30.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V30.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V30.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V30.B16, V7.B16 + AESMC V7.B16, V7.B16 + ADD R0, R2, R10 + SUB $128, R10, R10 + VLD1.P 64(R10), [V8.B16, V9.B16, V10.B16, V11.B16] + VLD1 (R10), [V12.B16, V13.B16, V14.B16, V15.B16] + SUB $1, R2, R2 + LSR $7, R2, R2 + +aesloop: + AESE V8.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V9.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V10.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V11.B16, V3.B16 + AESMC V3.B16, V3.B16 + AESE V12.B16, V4.B16 + AESMC 
V4.B16, V4.B16 + AESE V13.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V14.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V15.B16, V7.B16 + AESMC V7.B16, V7.B16 + + VLD1.P 64(R0), [V8.B16, V9.B16, V10.B16, V11.B16] + AESE V8.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V9.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V10.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V11.B16, V3.B16 + AESMC V3.B16, V3.B16 + + VLD1.P 64(R0), [V12.B16, V13.B16, V14.B16, V15.B16] + AESE V12.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V13.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V14.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V15.B16, V7.B16 + AESMC V7.B16, V7.B16 + SUB $1, R2, R2 + CBNZ R2, aesloop + + AESE V8.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V9.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V10.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V11.B16, V3.B16 + AESMC V3.B16, V3.B16 + AESE V12.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V13.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V14.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V15.B16, V7.B16 + AESMC V7.B16, V7.B16 + + AESE V8.B16, V0.B16 + AESMC V0.B16, V0.B16 + AESE V9.B16, V1.B16 + AESMC V1.B16, V1.B16 + AESE V10.B16, V2.B16 + AESMC V2.B16, V2.B16 + AESE V11.B16, V3.B16 + AESMC V3.B16, V3.B16 + AESE V12.B16, V4.B16 + AESMC V4.B16, V4.B16 + AESE V13.B16, V5.B16 + AESMC V5.B16, V5.B16 + AESE V14.B16, V6.B16 + AESMC V6.B16, V6.B16 + AESE V15.B16, V7.B16 + AESMC V7.B16, V7.B16 + + AESE V8.B16, V0.B16 + AESE V9.B16, V1.B16 + AESE V10.B16, V2.B16 + AESE V11.B16, V3.B16 + AESE V12.B16, V4.B16 + AESE V13.B16, V5.B16 + AESE V14.B16, V6.B16 + AESE V15.B16, V7.B16 + + VEOR V0.B16, V1.B16, V0.B16 + VEOR V2.B16, V3.B16, V2.B16 + VEOR V4.B16, V5.B16, V4.B16 + VEOR V6.B16, V7.B16, V6.B16 + VEOR V0.B16, V2.B16, V0.B16 + VEOR V4.B16, V6.B16, V4.B16 + VEOR V4.B16, V0.B16, V0.B16 + + VMOV V0.D[0], R0 + RET + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + MOVWU cycles+0(FP), R0 +again: + YIELD + SUBW $1, R0 + CBNZ R0, again + RET + +// Save state of caller into g->sched, +// but 
using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R0. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·systemstack_switch(SB), R0 + ADD $8, R0 // get past prologue + MOVD R0, (g_sched+gobuf_pc)(g) + MOVD RSP, R0 + MOVD R0, (g_sched+gobuf_sp)(g) + MOVD R29, (g_sched+gobuf_bp)(g) + MOVD $0, (g_sched+gobuf_lr)(g) + MOVD $0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOVD (g_sched+gobuf_ctxt)(g), R0 + CBZ R0, 2(PC) + CALL runtime·abort(SB) + RET + +// func asmcgocall_no_g(fn, arg unsafe.Pointer) +// Call fn(arg) aligned appropriately for the gcc ABI. +// Called on a system stack, and there may be no g yet (during needm). +TEXT ·asmcgocall_no_g(SB),NOSPLIT,$0-16 + MOVD fn+0(FP), R1 + MOVD arg+8(FP), R0 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R1) + ADD $16, RSP // skip over saved frame pointer below RSP + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOVD fn+0(FP), R1 + MOVD arg+8(FP), R0 + + MOVD RSP, R2 // save original stack pointer + CBZ g, nosave + MOVD g, R4 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOVD g_m(g), R8 + MOVD m_gsignal(R8), R3 + CMP R3, g + BEQ nosave + MOVD m_g0(R8), R3 + CMP R3, g + BEQ nosave + + // Switch to system stack. + MOVD R0, R9 // gosave_systemstack_switch<> and save_g might clobber R0 + BL gosave_systemstack_switch<>(SB) + MOVD R3, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R0 + MOVD R0, RSP + MOVD (g_sched+gobuf_bp)(g), R29 + MOVD R9, R0 + + // Now on a scheduling stack (a pthread-created stack). 
+ // Save room for two of our pointers /*, plus 32 bytes of callee + // save area that lives on the caller stack. */ + MOVD RSP, R13 + SUB $16, R13 + MOVD R13, RSP + MOVD R4, 0(RSP) // save old g on stack + MOVD (g_stack+stack_hi)(R4), R4 + SUB R2, R4 + MOVD R4, 8(RSP) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + BL (R1) + MOVD R0, R9 + + // Restore g, stack pointer. R0 is errno, so don't touch it + MOVD 0(RSP), g + BL runtime·save_g(SB) + MOVD (g_stack+stack_hi)(g), R5 + MOVD 8(RSP), R6 + SUB R6, R5 + MOVD R9, R0 + MOVD R5, RSP + + MOVW R0, ret+16(FP) + RET + +nosave: + // Running on a system stack, perhaps even without a g. + // Having no g can happen during thread creation or thread teardown + // (see needm/dropm on Solaris, for example). + // This code is like the above sequence but without saving/restoring g + // and without worrying about the stack moving out from under us + // (because we're on a system stack, not a goroutine stack). + // The above code could be used directly if already on a system stack, + // but then the only path through this code would be a rare case on Solaris. + // Using this code for all "already on system stack" calls exercises it more, + // which should help keep it correct. + MOVD RSP, R13 + SUB $16, R13 + MOVD R13, RSP + MOVD $0, R4 + MOVD R4, 0(RSP) // Where above code stores g, in case someone looks during debugging. + MOVD R2, 8(RSP) // Save original stack pointer. + BL (R1) + // Restore stack pointer. + MOVD 8(RSP), R2 + MOVD R2, RSP + MOVD R0, ret+16(FP) + RET + +// cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load g from thread-local storage. + BL runtime·load_g(SB) + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. 
+ // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + CBZ g, needm + + MOVD g_m(g), R8 + MOVD R8, savedm-8(SP) + B havem + +needm: + MOVD g, savedm-8(SP) // g is zero, so is m. + MOVD $runtime·needm(SB), R0 + BL (R0) + + // Set m->g0->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVD g_m(g), R8 + MOVD m_g0(R8), R3 + MOVD RSP, R0 + MOVD R0, (g_sched+gobuf_sp)(R3) + MOVD R29, (g_sched+gobuf_bp)(R3) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 16(RSP) aka savedsp-16(SP). + // Beware that the frame size is actually 32+16. + MOVD m_g0(R8), R3 + MOVD (g_sched+gobuf_sp)(R3), R4 + MOVD R4, savedsp-16(SP) + MOVD RSP, R0 + MOVD R0, (g_sched+gobuf_sp)(R3) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. 
+ // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVD m_curg(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R4 // prepare stack as R4 + MOVD (g_sched+gobuf_pc)(g), R5 + MOVD R5, -48(R4) + MOVD (g_sched+gobuf_bp)(g), R5 + MOVD R5, -56(R4) + // Gather our arguments into registers. + MOVD fn+0(FP), R1 + MOVD frame+8(FP), R2 + MOVD ctxt+16(FP), R3 + MOVD $-48(R4), R0 // maintain 16-byte SP alignment + MOVD R0, RSP // switch stack + MOVD R1, 8(RSP) + MOVD R2, 16(RSP) + MOVD R3, 24(RSP) + MOVD $runtime·cgocallbackg(SB), R0 + CALL (R0) // indirect call to bypass nosplit check. We're on a different stack now. + + // Restore g->sched (== m->curg->sched) from saved values. + MOVD 0(RSP), R5 + MOVD R5, (g_sched+gobuf_pc)(g) + MOVD RSP, R4 + ADD $48, R4, R4 + MOVD R4, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R0 + MOVD R0, RSP + MOVD savedsp-16(SP), R4 + MOVD R4, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVD savedm-8(SP), R6 + CBNZ R6, droppedm + MOVD $runtime·dropm(SB), R0 + BL (R0) +droppedm: + + // Done! + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$24 + // g (R28) and REGTMP (R27) might be clobbered by load_g. They + // are callee-save in the gcc calling convention, so save them. 
+ MOVD R27, savedR27-8(SP) + MOVD g, saveG-16(SP) + + BL runtime·load_g(SB) + MOVD g_m(g), R0 + MOVD m_curg(R0), R0 + MOVD (g_stack+stack_hi)(R0), R0 + + MOVD saveG-16(SP), g + MOVD savedR28-8(SP), R27 + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVD gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + BL runtime·save_g(SB) + RET + +// void setg_gcc(G*); set g called from gcc +TEXT setg_gcc<>(SB),NOSPLIT,$8 + MOVD R0, g + MOVD R27, savedR27-8(SP) + BL runtime·save_g(SB) + MOVD savedR27-8(SP), R27 + RET + +TEXT runtime·emptyfunc(SB),0,$0-0 + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVD ZR, R0 + MOVD (R0), R0 + UNDEF + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVW $0, R0 + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + MOVD R0, R0 // NOP + BL runtime·goexit1(SB) // does not return + +// This is called from .init_array and follows the platform, not Go, ABI. +TEXT runtime·addmoduledata(SB),NOSPLIT,$0-0 + SUB $0x10, RSP + MOVD R27, 8(RSP) // The access to global variables below implicitly uses R27, which is callee-save + MOVD runtime·lastmoduledatap(SB), R1 + MOVD R0, moduledata_next(R1) + MOVD R0, runtime·lastmoduledatap(SB) + MOVD 8(RSP), R27 + ADD $0x10, RSP + RET + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R3 + MOVB R3, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R2 is the destination of the write +// - R3 is the value being written at R2 +// It clobbers condition codes. +// It does not clobber any general-purpose registers, +// but may clobber others (e.g., floating point registers) +// The act of CALLing gcWriteBarrier will clobber R30 (LR). 
+// +// Defined as ABIInternal since the compiler generates ABIInternal +// calls to it directly and it does not use the stack-based Go ABI. +TEXT runtime·gcWriteBarrier<ABIInternal>(SB),NOSPLIT,$200 + // Save the registers clobbered by the fast path. + STP (R0, R1), 184(RSP) + MOVD g_m(g), R0 + MOVD m_p(R0), R0 + MOVD (p_wbBuf+wbBuf_next)(R0), R1 + // Increment wbBuf.next position. + ADD $16, R1 + MOVD R1, (p_wbBuf+wbBuf_next)(R0) + MOVD (p_wbBuf+wbBuf_end)(R0), R0 + CMP R1, R0 + // Record the write. + MOVD R3, -16(R1) // Record value + MOVD (R2), R0 // TODO: This turns bad writes into bad reads. + MOVD R0, -8(R1) // Record *slot + // Is the buffer full? (flags set in CMP above) + BEQ flush +ret: + LDP 184(RSP), (R0, R1) + // Do the write. + MOVD R3, (R2) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + // R0 and R1 already saved + STP (R2, R3), 1*8(RSP) // Also first and second arguments to wbBufFlush + STP (R4, R5), 3*8(RSP) + STP (R6, R7), 5*8(RSP) + STP (R8, R9), 7*8(RSP) + STP (R10, R11), 9*8(RSP) + STP (R12, R13), 11*8(RSP) + STP (R14, R15), 13*8(RSP) + // R16, R17 may be clobbered by linker trampoline + // R18 is unused. + STP (R19, R20), 15*8(RSP) + STP (R21, R22), 17*8(RSP) + STP (R23, R24), 19*8(RSP) + STP (R25, R26), 21*8(RSP) + // R27 is temp register. + // R28 is g. + // R29 is frame pointer (unused). + // R30 is LR, which was saved by the prologue. + // R31 is SP. + + // This takes arguments R2 and R3. 
+ CALL runtime·wbBufFlush(SB) + LDP 1*8(RSP), (R2, R3) + LDP 3*8(RSP), (R4, R5) + LDP 5*8(RSP), (R6, R7) + LDP 7*8(RSP), (R8, R9) + LDP 9*8(RSP), (R10, R11) + LDP 11*8(RSP), (R12, R13) + LDP 13*8(RSP), (R14, R15) + LDP 15*8(RSP), (R19, R20) + LDP 17*8(RSP), (R21, R22) + LDP 19*8(RSP), (R23, R24) + LDP 21*8(RSP), (R25, R26) + JMP ret + +DATA debugCallFrameTooLarge<>+0x00(SB)/20, $"call frame too large" +GLOBL debugCallFrameTooLarge<>(SB), RODATA, $20 // Size duplicated below + +// debugCallV2 is the entry point for debugger-injected function +// calls on running goroutines. It informs the runtime that a +// debug call has been injected and creates a call frame for the +// debugger to fill in. +// +// To inject a function call, a debugger should: +// 1. Check that the goroutine is in state _Grunning and that +// there are at least 288 bytes free on the stack. +// 2. Set SP as SP-16. +// 3. Store the current LR in (SP) (using the SP after step 2). +// 4. Store the current PC in the LR register. +// 5. Write the desired argument frame size at SP-16 +// 6. Save all machine registers (including flags and fpsimd registers) +// so they can be restored later by the debugger. +// 7. Set the PC to debugCallV2 and resume execution. +// +// If the goroutine is in state _Grunnable, then it's not generally +// safe to inject a call because it may return out via other runtime +// operations. Instead, the debugger should unwind the stack to find +// the return to non-runtime code, add a temporary breakpoint there, +// and inject the call once that breakpoint is hit. +// +// If the goroutine is in any other state, it's not safe to inject a call. +// +// This function communicates back to the debugger by setting R20 and +// invoking BRK to raise a breakpoint signal. Note that the signal PC of +// the signal triggered by the BRK instruction is the PC where the signal +// is trapped, not the next PC, so to resume execution, the debugger needs +// to set the signal PC to PC+4. 
See the comments in the implementation for +// the protocol the debugger is expected to follow. InjectDebugCall in the +// runtime tests demonstrates this protocol. +// +// The debugger must ensure that any pointers passed to the function +// obey escape analysis requirements. Specifically, it must not pass +// a stack pointer to an escaping argument. debugCallV2 cannot check +// this invariant. +// +// This is ABIInternal because Go code injects its PC directly into new +// goroutine stacks. +TEXT runtime·debugCallV2<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-0 + STP (R29, R30), -280(RSP) + SUB $272, RSP, RSP + SUB $8, RSP, R29 + // Save all registers that may contain pointers so they can be + // conservatively scanned. + // + // We can't do anything that might clobber any of these + // registers before this. + STP (R27, g), (30*8)(RSP) + STP (R25, R26), (28*8)(RSP) + STP (R23, R24), (26*8)(RSP) + STP (R21, R22), (24*8)(RSP) + STP (R19, R20), (22*8)(RSP) + STP (R16, R17), (20*8)(RSP) + STP (R14, R15), (18*8)(RSP) + STP (R12, R13), (16*8)(RSP) + STP (R10, R11), (14*8)(RSP) + STP (R8, R9), (12*8)(RSP) + STP (R6, R7), (10*8)(RSP) + STP (R4, R5), (8*8)(RSP) + STP (R2, R3), (6*8)(RSP) + STP (R0, R1), (4*8)(RSP) + + // Perform a safe-point check. + MOVD R30, 8(RSP) // Caller's PC + CALL runtime·debugCallCheck(SB) + MOVD 16(RSP), R0 + CBZ R0, good + + // The safety check failed. Put the reason string at the top + // of the stack. + MOVD R0, 8(RSP) + MOVD 24(RSP), R0 + MOVD R0, 16(RSP) + + // Set R20 to 8 and invoke BRK. The debugger should get the + // reason a call can't be injected from SP+8 and resume execution. + MOVD $8, R20 + BREAK + JMP restore + +good: + // Registers are saved and it's safe to make a call. + // Open up a call frame, moving the stack if necessary. + // + // Once the frame is allocated, this will set R20 to 0 and + // invoke BRK. 
The debugger should write the argument + // frame for the call at SP+8, set up argument registers, + // set the LR as the signal PC + 4, set the PC to the function + // to call, set R26 to point to the closure (if a closure call), + // and resume execution. + // + // If the function returns, this will set R20 to 1 and invoke + // BRK. The debugger can then inspect any return value saved + // on the stack at SP+8 and in registers. To resume execution, + // the debugger should restore the LR from (SP). + // + // If the function panics, this will set R20 to 2 and invoke BRK. + // The interface{} value of the panic will be at SP+8. The debugger + // can inspect the panic value and resume execution again. +#define DEBUG_CALL_DISPATCH(NAME,MAXSIZE) \ + CMP $MAXSIZE, R0; \ + BGT 5(PC); \ + MOVD $NAME(SB), R0; \ + MOVD R0, 8(RSP); \ + CALL runtime·debugCallWrap(SB); \ + JMP restore + + MOVD 256(RSP), R0 // the argument frame size + DEBUG_CALL_DISPATCH(debugCall32<>, 32) + DEBUG_CALL_DISPATCH(debugCall64<>, 64) + DEBUG_CALL_DISPATCH(debugCall128<>, 128) + DEBUG_CALL_DISPATCH(debugCall256<>, 256) + DEBUG_CALL_DISPATCH(debugCall512<>, 512) + DEBUG_CALL_DISPATCH(debugCall1024<>, 1024) + DEBUG_CALL_DISPATCH(debugCall2048<>, 2048) + DEBUG_CALL_DISPATCH(debugCall4096<>, 4096) + DEBUG_CALL_DISPATCH(debugCall8192<>, 8192) + DEBUG_CALL_DISPATCH(debugCall16384<>, 16384) + DEBUG_CALL_DISPATCH(debugCall32768<>, 32768) + DEBUG_CALL_DISPATCH(debugCall65536<>, 65536) + // The frame size is too large. Report the error. + MOVD $debugCallFrameTooLarge<>(SB), R0 + MOVD R0, 8(RSP) + MOVD $20, R0 + MOVD R0, 16(RSP) // length of debugCallFrameTooLarge string + MOVD $8, R20 + BREAK + JMP restore + +restore: + // Calls and failures resume here. + // + // Set R20 to 16 and invoke BRK. The debugger should restore + // all registers except for PC and RSP and resume execution. + MOVD $16, R20 + BREAK + // We must not modify flags after this point. 
+ + // Restore pointer-containing registers, which may have been + // modified from the debugger's copy by stack copying. + LDP (30*8)(RSP), (R27, g) + LDP (28*8)(RSP), (R25, R26) + LDP (26*8)(RSP), (R23, R24) + LDP (24*8)(RSP), (R21, R22) + LDP (22*8)(RSP), (R19, R20) + LDP (20*8)(RSP), (R16, R17) + LDP (18*8)(RSP), (R14, R15) + LDP (16*8)(RSP), (R12, R13) + LDP (14*8)(RSP), (R10, R11) + LDP (12*8)(RSP), (R8, R9) + LDP (10*8)(RSP), (R6, R7) + LDP (8*8)(RSP), (R4, R5) + LDP (6*8)(RSP), (R2, R3) + LDP (4*8)(RSP), (R0, R1) + + LDP -8(RSP), (R29, R27) + ADD $288, RSP, RSP // Add 16 more bytes, see saveSigContext + MOVD -16(RSP), R30 // restore old lr + JMP (R27) + +// runtime.debugCallCheck assumes that functions defined with the +// DEBUG_CALL_FN macro are safe points to inject calls. +#define DEBUG_CALL_FN(NAME,MAXSIZE) \ +TEXT NAME(SB),WRAPPER,$MAXSIZE-0; \ + NO_LOCAL_POINTERS; \ + MOVD $0, R20; \ + BREAK; \ + MOVD $1, R20; \ + BREAK; \ + RET +DEBUG_CALL_FN(debugCall32<>, 32) +DEBUG_CALL_FN(debugCall64<>, 64) +DEBUG_CALL_FN(debugCall128<>, 128) +DEBUG_CALL_FN(debugCall256<>, 256) +DEBUG_CALL_FN(debugCall512<>, 512) +DEBUG_CALL_FN(debugCall1024<>, 1024) +DEBUG_CALL_FN(debugCall2048<>, 2048) +DEBUG_CALL_FN(debugCall4096<>, 4096) +DEBUG_CALL_FN(debugCall8192<>, 8192) +DEBUG_CALL_FN(debugCall16384<>, 16384) +DEBUG_CALL_FN(debugCall32768<>, 32768) +DEBUG_CALL_FN(debugCall65536<>, 65536) + +// func debugCallPanicked(val interface{}) +TEXT runtime·debugCallPanicked(SB),NOSPLIT,$16-16 + // Copy the panic value to the top of stack at SP+8. + MOVD val_type+0(FP), R0 + MOVD R0, 8(RSP) + MOVD val_data+8(FP), R0 + MOVD R0, 16(RSP) + MOVD $2, R20 + BREAK + RET + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments are allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. 
+// The tail call makes these stubs disappear in backtraces. +// +// Defined as ABIInternal since the compiler generates ABIInternal +// calls to it directly and it does not use the stack-based Go ABI. +TEXT runtime·panicIndex<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicIndex<ABIInternal>(SB) +TEXT runtime·panicIndexU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicIndexU<ABIInternal>(SB) +TEXT runtime·panicSliceAlen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSliceAlen<ABIInternal>(SB) +TEXT runtime·panicSliceAlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSliceAlenU<ABIInternal>(SB) +TEXT runtime·panicSliceAcap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSliceAcap<ABIInternal>(SB) +TEXT runtime·panicSliceAcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSliceAcapU<ABIInternal>(SB) +TEXT runtime·panicSliceB<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSliceB<ABIInternal>(SB) +TEXT runtime·panicSliceBU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSliceBU<ABIInternal>(SB) +TEXT runtime·panicSlice3Alen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R2, R0 + MOVD R3, R1 + JMP runtime·goPanicSlice3Alen<ABIInternal>(SB) +TEXT runtime·panicSlice3AlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R2, R0 + MOVD R3, R1 + JMP runtime·goPanicSlice3AlenU<ABIInternal>(SB) +TEXT runtime·panicSlice3Acap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R2, R0 + MOVD R3, R1 + JMP runtime·goPanicSlice3Acap<ABIInternal>(SB) +TEXT runtime·panicSlice3AcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R2, R0 + MOVD R3, R1 + JMP runtime·goPanicSlice3AcapU<ABIInternal>(SB) +TEXT runtime·panicSlice3B<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSlice3B<ABIInternal>(SB) +TEXT runtime·panicSlice3BU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R1, R0 + MOVD R2, R1 + JMP runtime·goPanicSlice3BU<ABIInternal>(SB) +TEXT 
runtime·panicSlice3C<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSlice3C<ABIInternal>(SB) +TEXT runtime·panicSlice3CU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSlice3CU<ABIInternal>(SB) +TEXT runtime·panicSliceConvert<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R2, R0 + MOVD R3, R1 + JMP runtime·goPanicSliceConvert<ABIInternal>(SB) diff --git a/src/runtime/asm_loong64.s b/src/runtime/asm_loong64.s new file mode 100644 index 0000000..a6ccd19 --- /dev/null +++ b/src/runtime/asm_loong64.s @@ -0,0 +1,792 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +#define REGCTXT R29 + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // R3 = stack; R4 = argc; R5 = argv + + ADDV $-24, R3 + MOVW R4, 8(R3) // argc + MOVV R5, 16(R3) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOVV $runtime·g0(SB), g + MOVV $(-64*1024), R30 + ADDV R30, R3, R19 + MOVV R19, g_stackguard0(g) + MOVV R19, g_stackguard1(g) + MOVV R19, (g_stack+stack_lo)(g) + MOVV R3, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. 
+ MOVV _cgo_init(SB), R25 + BEQ R25, nocgo + + MOVV R0, R7 // arg 3: not used + MOVV R0, R6 // arg 2: not used + MOVV $setg_gcc<>(SB), R5 // arg 1: setg + MOVV g, R4 // arg 0: G + JAL (R25) + +nocgo: + // update stackguard after _cgo_init + MOVV (g_stack+stack_lo)(g), R19 + ADDV $const__StackGuard, R19 + MOVV R19, g_stackguard0(g) + MOVV R19, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVV $runtime·m0(SB), R19 + + // save m->g0 = g0 + MOVV g, m_g0(R19) + // save m0 to g0->m + MOVV R19, g_m(g) + + JAL runtime·check(SB) + + // args are already prepared + JAL runtime·args(SB) + JAL runtime·osinit(SB) + JAL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVV $runtime·mainPC(SB), R19 // entry + ADDV $-16, R3 + MOVV R19, 8(R3) + MOVV R0, 0(R3) + JAL runtime·newproc(SB) + ADDV $16, R3 + + // start this M + JAL runtime·mstart(SB) + + MOVV R0, 1(R0) + RET + +DATA runtime·mainPC+0(SB)/8,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + BREAK + RET + +TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + JAL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOVV buf+0(FP), R4 + MOVV gobuf_g(R4), R5 + MOVV 0(R5), R0 // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOVV R5, g + JAL runtime·save_g(SB) + + MOVV gobuf_sp(R4), R3 + MOVV gobuf_lr(R4), R1 + MOVV gobuf_ret(R4), R19 + MOVV gobuf_ctxt(R4), REGCTXT + MOVV R0, gobuf_sp(R4) + MOVV R0, gobuf_ret(R4) + MOVV R0, gobuf_lr(R4) + MOVV R0, gobuf_ctxt(R4) + MOVV gobuf_pc(R4), R6 + JMP (R6) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. 
+TEXT runtime·mcall(SB), NOSPLIT|NOFRAME, $0-8 + // Save caller state in g->sched + MOVV R3, (g_sched+gobuf_sp)(g) + MOVV R1, (g_sched+gobuf_pc)(g) + MOVV R0, (g_sched+gobuf_lr)(g) + MOVV g, (g_sched+gobuf_g)(g) + + // Switch to m->g0 & its stack, call fn. + MOVV g, R19 + MOVV g_m(g), R4 + MOVV m_g0(R4), g + JAL runtime·save_g(SB) + BNE g, R19, 2(PC) + JMP runtime·badmcall(SB) + MOVV fn+0(FP), REGCTXT // context + MOVV 0(REGCTXT), R5 // code pointer + MOVV (g_sched+gobuf_sp)(g), R3 // sp = m->g0->sched.sp + ADDV $-16, R3 + MOVV R19, 8(R3) + MOVV R0, 0(R3) + JAL (R5) + JMP runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + UNDEF + JAL (R1) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVV fn+0(FP), R19 // R19 = fn + MOVV R19, REGCTXT // context + MOVV g_m(g), R4 // R4 = m + + MOVV m_gsignal(R4), R5 // R5 = gsignal + BEQ g, R5, noswitch + + MOVV m_g0(R4), R5 // R5 = g0 + BEQ g, R5, noswitch + + MOVV m_curg(R4), R6 + BEQ g, R6, switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVV $runtime·badsystemstack(SB), R7 + JAL (R7) + JAL runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ JAL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVV R5, g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R19 + // make it look like mstart called systemstack on g0, to stop traceback + ADDV $-8, R19 + MOVV $runtime·mstart(SB), R6 + MOVV R6, 0(R19) + MOVV R19, R3 + + // call target function + MOVV 0(REGCTXT), R6 // code pointer + JAL (R6) + + // switch back to g + MOVV g_m(g), R4 + MOVV m_curg(R4), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R3 + MOVV R0, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVV 0(REGCTXT), R4 // code pointer + MOVV 0(R3), R1 // restore LR + ADDV $8, R3 + JMP (R4) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Caller has already loaded: +// loong64: R5: LR +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVV g_m(g), R7 + MOVV m_g0(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackg0(SB) + JAL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVV m_gsignal(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackgsignal(SB) + JAL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOVV R3, (g_sched+gobuf_sp)(g) + MOVV R1, (g_sched+gobuf_pc)(g) + MOVV R5, (g_sched+gobuf_lr)(g) + MOVV REGCTXT, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOVV R5, (m_morebuf+gobuf_pc)(R7) // f's caller's PC + MOVV R3, (m_morebuf+gobuf_sp)(R7) // f's caller's SP + MOVV g, (m_morebuf+gobuf_g)(R7) + + // Call newstack on m->g0's stack. 
+ MOVV m_g0(R7), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R3 + // Create a stack frame on g0 to call newstack. + MOVV R0, -8(R3) // Zero saved LR in frame + ADDV $-8, R3 + JAL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + MOVV R0, REGCTXT + JMP runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(argtype *_type, f *FuncVal, arg *byte, argsize, retoffset uint32). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOVV $MAXSIZE, R30; \ + SGTU R19, R30, R30; \ + BNE R30, 3(PC); \ + MOVV $NAME(SB), R4; \ + JMP (R4) +// Note: can't just "BR NAME(SB)" - bad inlining results. 
+ +TEXT ·reflectcall(SB), NOSPLIT|NOFRAME, $0-48 + MOVWU stackArgsSize+24(FP), R19 + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVV $runtime·badreflectcall(SB), R4 + JMP (R4) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-24; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVV arg+16(FP), R4; \ + MOVWU argsize+24(FP), R5; \ + MOVV R3, R12; \ + ADDV $8, R12; \ + ADDV R12, R5; \ + BEQ R12, R5, 6(PC); \ + MOVBU (R4), R6; \ + ADDV $1, R4; \ + MOVBU R6, (R12); \ + ADDV $1, R12; \ + JMP -5(PC); \ + /* call function */ \ + MOVV f+8(FP), REGCTXT; \ + MOVV (REGCTXT), R6; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + JAL (R6); \ + /* copy return values back */ \ + MOVV argtype+0(FP), R7; \ + MOVV arg+16(FP), R4; \ + MOVWU n+24(FP), R5; \ + MOVWU retoffset+28(FP), R6; \ + ADDV $8, R3, R12; \ + ADDV R6, R12; \ + ADDV R6, R4; \ + SUBVU R6, R5; \ + JAL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. 
This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $32-0 + MOVV R7, 8(R3) + MOVV R4, 16(R3) + MOVV R12, 24(R3) + MOVV R5, 32(R3) + JAL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + RET + +// Save state of caller into g->sched. +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R19. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVV $runtime·systemstack_switch(SB), R19 + ADDV $8, R19 + MOVV R19, (g_sched+gobuf_pc)(g) + MOVV R3, (g_sched+gobuf_sp)(g) + MOVV R0, (g_sched+gobuf_lr)(g) + MOVV R0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOVV (g_sched+gobuf_ctxt)(g), R19 + BEQ R19, 2(PC) + JAL runtime·abort(SB) + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. 
+TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOVV fn+0(FP), R25 + MOVV arg+8(FP), R4 + + MOVV R3, R12 // save original stack pointer + MOVV g, R13 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. + MOVV g_m(g), R5 + MOVV m_gsignal(R5), R6 + BEQ R6, g, g0 + MOVV m_g0(R5), R6 + BEQ R6, g, g0 + + JAL gosave_systemstack_switch<>(SB) + MOVV R6, g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R3 + + // Now on a scheduling stack (a pthread-created stack). +g0: + // Save room for two of our pointers. + ADDV $-16, R3 + MOVV R13, 0(R3) // save old g on stack + MOVV (g_stack+stack_hi)(R13), R13 + SUBVU R12, R13 + MOVV R13, 8(R3) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + JAL (R25) + + // Restore g, stack pointer. R4 is return value. + MOVV 0(R3), g + JAL runtime·save_g(SB) + MOVV (g_stack+stack_hi)(g), R5 + MOVV 8(R3), R6 + SUBVU R6, R5 + MOVV R5, R3 + + MOVW R4, ret+16(FP) + RET + +// func cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. + MOVB runtime·iscgo(SB), R19 + BEQ R19, nocgo + JAL runtime·load_g(SB) +nocgo: + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + BEQ g, needm + + MOVV g_m(g), R12 + MOVV R12, savedm-8(SP) + JMP havem + +needm: + MOVV g, savedm-8(SP) // g is zero, so is m. + MOVV $runtime·needm(SB), R4 + JAL (R4) + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. 
+ // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVV g_m(g), R12 + MOVV m_g0(R12), R19 + MOVV R3, (g_sched+gobuf_sp)(R19) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 8(R29) aka savedsp-16(SP). + MOVV m_g0(R12), R19 + MOVV (g_sched+gobuf_sp)(R19), R13 + MOVV R13, savedsp-24(SP) // must match frame size + MOVV R3, (g_sched+gobuf_sp)(R19) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the stack. + // This has the added benefit that it looks to the traceback + // routine like cgocallbackg is going to return to that + // PC (because the frame we allocate below has the same + // size as cgocallback_gofunc's frame declared above) + // so that the traceback will seamlessly trace back into + // the earlier calls. 
+ MOVV m_curg(R12), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R13 // prepare stack as R13 + MOVV (g_sched+gobuf_pc)(g), R4 + MOVV R4, -(24+8)(R13) // "saved LR"; must match frame size + MOVV fn+0(FP), R5 + MOVV frame+8(FP), R6 + MOVV ctxt+16(FP), R7 + MOVV $-(24+8)(R13), R3 + MOVV R5, 8(R3) + MOVV R6, 16(R3) + MOVV R7, 24(R3) + JAL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOVV 0(R3), R4 + MOVV R4, (g_sched+gobuf_pc)(g) + MOVV $(24+8)(R3), R13 // must match frame size + MOVV R13, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVV g_m(g), R12 + MOVV m_g0(R12), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R3 + MOVV savedsp-24(SP), R13 // must match frame size + MOVV R13, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVV savedm-8(SP), R12 + BNE R12, droppedm + MOVV $runtime·dropm(SB), R4 + JAL (R4) +droppedm: + + // Done! + RET + +// void setg(G*); set g. for use by needm. 
+TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVV gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + JAL runtime·save_g(SB) + RET + +// void setg_gcc(G*); set g called from gcc with g in R19 +TEXT setg_gcc<>(SB),NOSPLIT,$0-0 + MOVV R19, g + JAL runtime·save_g(SB) + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVW (R0), R0 + UNDEF + +// AES hashing not implemented for loong64 +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVW $0, R19 + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$16 + // g (R22) and REGTMP (R30) might be clobbered by load_g. They + // are callee-save in the gcc calling convention, so save them. + MOVV R30, savedREGTMP-16(SP) + MOVV g, savedG-8(SP) + + JAL runtime·load_g(SB) + MOVV g_m(g), R19 + MOVV m_curg(R19), R19 + MOVV (g_stack+stack_hi)(R19), R4 // return value in R4 + + MOVV savedG-8(SP), g + MOVV savedREGTMP-16(SP), R30 + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + NOR R0, R0 // NOP + JAL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + NOR R0, R0 // NOP + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R19 + MOVB R19, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R27 is the destination of the write +// - R28 is the value being written at R27. +// It clobbers R30 (the linker temp register). 
+// The act of CALLing gcWriteBarrier will clobber R1 (LR). +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). +TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$216 + // Save the registers clobbered by the fast path. + MOVV R19, 208(R3) + MOVV R13, 216(R3) + MOVV g_m(g), R19 + MOVV m_p(R19), R19 + MOVV (p_wbBuf+wbBuf_next)(R19), R13 + // Increment wbBuf.next position. + ADDV $16, R13 + MOVV R13, (p_wbBuf+wbBuf_next)(R19) + MOVV (p_wbBuf+wbBuf_end)(R19), R19 + MOVV R19, R30 // R30 is linker temp register + // Record the write. + MOVV R28, -16(R13) // Record value + MOVV (R27), R19 // TODO: This turns bad writes into bad reads. + MOVV R19, -8(R13) // Record *slot + // Is the buffer full? + BEQ R13, R30, flush +ret: + MOVV 208(R3), R19 + MOVV 216(R3), R13 + // Do the write. + MOVV R28, (R27) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + MOVV R27, 8(R3) // Also first argument to wbBufFlush + MOVV R28, 16(R3) // Also second argument to wbBufFlush + // R1 is LR, which was saved by the prologue. + MOVV R2, 24(R3) + // R3 is SP. + MOVV R4, 32(R3) + MOVV R5, 40(R3) + MOVV R6, 48(R3) + MOVV R7, 56(R3) + MOVV R8, 64(R3) + MOVV R9, 72(R3) + MOVV R10, 80(R3) + MOVV R11, 88(R3) + MOVV R12, 96(R3) + // R13 already saved + MOVV R14, 104(R3) + MOVV R15, 112(R3) + MOVV R16, 120(R3) + MOVV R17, 128(R3) + MOVV R18, 136(R3) + // R19 already saved + MOVV R20, 144(R3) + MOVV R21, 152(R3) + // R22 is g. + MOVV R23, 160(R3) + MOVV R24, 168(R3) + MOVV R25, 176(R3) + MOVV R26, 184(R3) + // R27 already saved + // R28 already saved. + MOVV R29, 192(R3) + // R30 is tmp register. + MOVV R31, 200(R3) + + + // This takes arguments R27 and R28. 
+ CALL runtime·wbBufFlush(SB) + + MOVV 8(R3), R27 + MOVV 16(R3), R28 + MOVV 24(R3), R2 + MOVV 32(R3), R4 + MOVV 40(R3), R5 + MOVV 48(R3), R6 + MOVV 56(R3), R7 + MOVV 64(R3), R8 + MOVV 72(R3), R9 + MOVV 80(R3), R10 + MOVV 88(R3), R11 + MOVV 96(R3), R12 + MOVV 104(R3), R14 + MOVV 112(R3), R15 + MOVV 120(R3), R16 + MOVV 128(R3), R17 + MOVV 136(R3), R18 + MOVV 144(R3), R20 + MOVV 152(R3), R21 + MOVV 160(R3), R23 + MOVV 168(R3), R24 + MOVV 176(R3), R25 + MOVV 184(R3), R26 + MOVV 192(R3), R29 + MOVV 200(R3), R31 + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments is allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. +TEXT runtime·panicIndex(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-16 + MOVV R17, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT
runtime·panicSlice3AlenU(SB),NOSPLIT,$0-16 + MOVV R17, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-16 + MOVV R17, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-16 + MOVV R17, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-16 + MOVV R18, x+0(FP) + MOVV R17, y+8(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-16 + MOVV R19, x+0(FP) + MOVV R18, y+8(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-16 + MOVV R17, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSliceConvert(SB) diff --git a/src/runtime/asm_mips64x.s b/src/runtime/asm_mips64x.s new file mode 100644 index 0000000..1abadb9 --- /dev/null +++ b/src/runtime/asm_mips64x.s @@ -0,0 +1,804 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +#define REGCTXT R22 + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // R29 = stack; R4 = argc; R5 = argv + + ADDV $-24, R29 + MOVW R4, 8(R29) // argc + MOVV R5, 16(R29) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOVV $runtime·g0(SB), g + MOVV $(-64*1024), R23 + ADDV R23, R29, R1 + MOVV R1, g_stackguard0(g) + MOVV R1, g_stackguard1(g) + MOVV R1, (g_stack+stack_lo)(g) + MOVV R29, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. 
+ MOVV _cgo_init(SB), R25 + BEQ R25, nocgo + + MOVV R0, R7 // arg 3: not used + MOVV R0, R6 // arg 2: not used + MOVV $setg_gcc<>(SB), R5 // arg 1: setg + MOVV g, R4 // arg 0: G + JAL (R25) + +nocgo: + // update stackguard after _cgo_init + MOVV (g_stack+stack_lo)(g), R1 + ADDV $const__StackGuard, R1 + MOVV R1, g_stackguard0(g) + MOVV R1, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVV $runtime·m0(SB), R1 + + // save m->g0 = g0 + MOVV g, m_g0(R1) + // save m0 to g0->m + MOVV R1, g_m(g) + + JAL runtime·check(SB) + + // args are already prepared + JAL runtime·args(SB) + JAL runtime·osinit(SB) + JAL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVV $runtime·mainPC(SB), R1 // entry + ADDV $-16, R29 + MOVV R1, 8(R29) + MOVV R0, 0(R29) + JAL runtime·newproc(SB) + ADDV $16, R29 + + // start this M + JAL runtime·mstart(SB) + + MOVV R0, 1(R0) + RET + +DATA runtime·mainPC+0(SB)/8,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + MOVV R0, 2(R0) // TODO: TD + RET + +TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + JAL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOVV buf+0(FP), R3 + MOVV gobuf_g(R3), R4 + MOVV 0(R4), R0 // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOVV R4, g + JAL runtime·save_g(SB) + + MOVV 0(g), R2 + MOVV gobuf_sp(R3), R29 + MOVV gobuf_lr(R3), R31 + MOVV gobuf_ret(R3), R1 + MOVV gobuf_ctxt(R3), REGCTXT + MOVV R0, gobuf_sp(R3) + MOVV R0, gobuf_ret(R3) + MOVV R0, gobuf_lr(R3) + MOVV R0, gobuf_ctxt(R3) + MOVV gobuf_pc(R3), R4 + JMP (R4) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. 
+TEXT runtime·mcall(SB), NOSPLIT|NOFRAME, $0-8 + // Save caller state in g->sched + MOVV R29, (g_sched+gobuf_sp)(g) + MOVV R31, (g_sched+gobuf_pc)(g) + MOVV R0, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. + MOVV g, R1 + MOVV g_m(g), R3 + MOVV m_g0(R3), g + JAL runtime·save_g(SB) + BNE g, R1, 2(PC) + JMP runtime·badmcall(SB) + MOVV fn+0(FP), REGCTXT // context + MOVV 0(REGCTXT), R4 // code pointer + MOVV (g_sched+gobuf_sp)(g), R29 // sp = m->g0->sched.sp + ADDV $-16, R29 + MOVV R1, 8(R29) + MOVV R0, 0(R29) + JAL (R4) + JMP runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + UNDEF + JAL (R31) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVV fn+0(FP), R1 // R1 = fn + MOVV R1, REGCTXT // context + MOVV g_m(g), R2 // R2 = m + + MOVV m_gsignal(R2), R3 // R3 = gsignal + BEQ g, R3, noswitch + + MOVV m_g0(R2), R3 // R3 = g0 + BEQ g, R3, noswitch + + MOVV m_curg(R2), R4 + BEQ g, R4, switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVV $runtime·badsystemstack(SB), R4 + JAL (R4) + JAL runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ JAL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVV R3, g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R1 + MOVV R1, R29 + + // call target function + MOVV 0(REGCTXT), R4 // code pointer + JAL (R4) + + // switch back to g + MOVV g_m(g), R1 + MOVV m_curg(R1), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R29 + MOVV R0, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVV 0(REGCTXT), R4 // code pointer + MOVV 0(R29), R31 // restore LR + ADDV $8, R29 + JMP (R4) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Caller has already loaded: +// R1: framesize, R2: argsize, R3: LR +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVV g_m(g), R7 + MOVV m_g0(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackg0(SB) + JAL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVV m_gsignal(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackgsignal(SB) + JAL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOVV R29, (g_sched+gobuf_sp)(g) + MOVV R31, (g_sched+gobuf_pc)(g) + MOVV R3, (g_sched+gobuf_lr)(g) + MOVV REGCTXT, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOVV R3, (m_morebuf+gobuf_pc)(R7) // f's caller's PC + MOVV R29, (m_morebuf+gobuf_sp)(R7) // f's caller's SP + MOVV g, (m_morebuf+gobuf_g)(R7) + + // Call newstack on m->g0's stack. + MOVV m_g0(R7), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R29 + // Create a stack frame on g0 to call newstack. 
+ MOVV R0, -8(R29) // Zero saved LR in frame + ADDV $-8, R29 + JAL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R3), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. (See issue 54332) + MOVV R29, R29 + + MOVV R0, REGCTXT + JMP runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOVV $MAXSIZE, R23; \ + SGTU R1, R23, R23; \ + BNE R23, 3(PC); \ + MOVV $NAME(SB), R4; \ + JMP (R4) +// Note: can't just "BR NAME(SB)" - bad inlining results. 
+ +TEXT ·reflectcall(SB), NOSPLIT|NOFRAME, $0-48 + MOVWU frameSize+32(FP), R1 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVV $runtime·badreflectcall(SB), R4 + JMP (R4) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVV stackArgs+16(FP), R1; \ + MOVWU stackArgsSize+24(FP), R2; \ + MOVV R29, R3; \ + ADDV $8, R3; \ + ADDV R3, R2; \ + BEQ R3, R2, 6(PC); \ + MOVBU (R1), R4; \ + ADDV $1, R1; \ + MOVBU R4, (R3); \ + ADDV $1, R3; \ + JMP -5(PC); \ + /* call function */ \ + MOVV f+8(FP), REGCTXT; \ + MOVV (REGCTXT), R4; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + JAL (R4); \ + /* copy return values back */ \ + MOVV stackArgsType+0(FP), R5; \ + MOVV stackArgs+16(FP), R1; \ + MOVWU stackArgsSize+24(FP), R2; \ + MOVWU stackRetOffset+28(FP), R4; \ + ADDV $8, R29, R3; \ + ADDV R4, R3; \ + ADDV R4, R1; \ + SUBVU R4, R2; \ + JAL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. 
This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $40-0 + MOVV R5, 8(R29) + MOVV R1, 16(R29) + MOVV R3, 24(R29) + MOVV R2, 32(R29) + MOVV $0, 40(R29) + JAL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R1. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVV $runtime·systemstack_switch(SB), R1 + ADDV $8, R1 // get past prologue + MOVV R1, (g_sched+gobuf_pc)(g) + MOVV R29, (g_sched+gobuf_sp)(g) + MOVV R0, (g_sched+gobuf_lr)(g) + MOVV R0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOVV (g_sched+gobuf_ctxt)(g), R1 + BEQ R1, 2(PC) + JAL runtime·abort(SB) + RET + +// func asmcgocall_no_g(fn, arg unsafe.Pointer) +// Call fn(arg) aligned appropriately for the gcc ABI. +// Called on a system stack, and there may be no g yet (during needm). 
+TEXT ·asmcgocall_no_g(SB),NOSPLIT,$0-16 + MOVV fn+0(FP), R25 + MOVV arg+8(FP), R4 + JAL (R25) + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOVV fn+0(FP), R25 + MOVV arg+8(FP), R4 + + MOVV R29, R3 // save original stack pointer + MOVV g, R2 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOVV g_m(g), R5 + MOVV m_gsignal(R5), R6 + BEQ R6, g, g0 + MOVV m_g0(R5), R6 + BEQ R6, g, g0 + + JAL gosave_systemstack_switch<>(SB) + MOVV R6, g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R29 + + // Now on a scheduling stack (a pthread-created stack). +g0: + // Save room for two of our pointers. + ADDV $-16, R29 + MOVV R2, 0(R29) // save old g on stack + MOVV (g_stack+stack_hi)(R2), R2 + SUBVU R3, R2 + MOVV R2, 8(R29) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + JAL (R25) + + // Restore g, stack pointer. R2 is return value. + MOVV 0(R29), g + JAL runtime·save_g(SB) + MOVV (g_stack+stack_hi)(g), R5 + MOVV 8(R29), R6 + SUBVU R6, R5 + MOVV R5, R29 + + MOVW R2, ret+16(FP) + RET + +// func cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. + MOVB runtime·iscgo(SB), R1 + BEQ R1, nocgo + JAL runtime·load_g(SB) +nocgo: + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. 
+ BEQ g, needm + + MOVV g_m(g), R3 + MOVV R3, savedm-8(SP) + JMP havem + +needm: + MOVV g, savedm-8(SP) // g is zero, so is m. + MOVV $runtime·needm(SB), R4 + JAL (R4) + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVV g_m(g), R3 + MOVV m_g0(R3), R1 + MOVV R29, (g_sched+gobuf_sp)(R1) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 8(R29) aka savedsp-16(SP). + MOVV m_g0(R3), R1 + MOVV (g_sched+gobuf_sp)(R1), R2 + MOVV R2, savedsp-24(SP) // must match frame size + MOVV R29, (g_sched+gobuf_sp)(R1) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. 
+ MOVV m_curg(R3), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R2 // prepare stack as R2 + MOVV (g_sched+gobuf_pc)(g), R4 + MOVV R4, -(24+8)(R2) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOVV fn+0(FP), R5 + MOVV frame+8(FP), R6 + MOVV ctxt+16(FP), R7 + MOVV $-(24+8)(R2), R29 // switch stack; must match frame size + MOVV R5, 8(R29) + MOVV R6, 16(R29) + MOVV R7, 24(R29) + JAL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOVV 0(R29), R4 + MOVV R4, (g_sched+gobuf_pc)(g) + MOVV $(24+8)(R29), R2 // must match frame size + MOVV R2, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVV g_m(g), R3 + MOVV m_g0(R3), g + JAL runtime·save_g(SB) + MOVV (g_sched+gobuf_sp)(g), R29 + MOVV savedsp-24(SP), R2 // must match frame size + MOVV R2, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVV savedm-8(SP), R3 + BNE R3, droppedm + MOVV $runtime·dropm(SB), R4 + JAL (R4) +droppedm: + + // Done! + RET + +// void setg(G*); set g. for use by needm. 
+TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVV gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + JAL runtime·save_g(SB) + RET + +// void setg_gcc(G*); set g called from gcc with g in R1 +TEXT setg_gcc<>(SB),NOSPLIT,$0-0 + MOVV R1, g + JAL runtime·save_g(SB) + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVW (R0), R0 + UNDEF + +// AES hashing not implemented for mips64 +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVW $0, R1 + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$16 + // g (R30) and REGTMP (R23) might be clobbered by load_g. They + // are callee-save in the gcc calling convention, so save them. + MOVV R23, savedR23-16(SP) + MOVV g, savedG-8(SP) + + JAL runtime·load_g(SB) + MOVV g_m(g), R1 + MOVV m_curg(R1), R1 + MOVV (g_stack+stack_hi)(R1), R2 // return value in R2 + + MOVV savedG-8(SP), g + MOVV savedR23-16(SP), R23 + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + NOR R0, R0 // NOP + JAL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + NOR R0, R0 // NOP + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R1 + MOVB R1, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R20 is the destination of the write +// - R21 is the value being written at R20. +// It clobbers R23 (the linker temp register). 
+// The act of CALLing gcWriteBarrier will clobber R31 (LR). +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). +TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$192 + // Save the registers clobbered by the fast path. + MOVV R1, 184(R29) + MOVV R2, 192(R29) + MOVV g_m(g), R1 + MOVV m_p(R1), R1 + MOVV (p_wbBuf+wbBuf_next)(R1), R2 + // Increment wbBuf.next position. + ADDV $16, R2 + MOVV R2, (p_wbBuf+wbBuf_next)(R1) + MOVV (p_wbBuf+wbBuf_end)(R1), R1 + MOVV R1, R23 // R23 is linker temp register + // Record the write. + MOVV R21, -16(R2) // Record value + MOVV (R20), R1 // TODO: This turns bad writes into bad reads. + MOVV R1, -8(R2) // Record *slot + // Is the buffer full? + BEQ R2, R23, flush +ret: + MOVV 184(R29), R1 + MOVV 192(R29), R2 + // Do the write. + MOVV R21, (R20) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + MOVV R20, 8(R29) // Also first argument to wbBufFlush + MOVV R21, 16(R29) // Also second argument to wbBufFlush + // R1 already saved + // R2 already saved + MOVV R3, 24(R29) + MOVV R4, 32(R29) + MOVV R5, 40(R29) + MOVV R6, 48(R29) + MOVV R7, 56(R29) + MOVV R8, 64(R29) + MOVV R9, 72(R29) + MOVV R10, 80(R29) + MOVV R11, 88(R29) + MOVV R12, 96(R29) + MOVV R13, 104(R29) + MOVV R14, 112(R29) + MOVV R15, 120(R29) + MOVV R16, 128(R29) + MOVV R17, 136(R29) + MOVV R18, 144(R29) + MOVV R19, 152(R29) + // R20 already saved + // R21 already saved. + MOVV R22, 160(R29) + // R23 is tmp register. + MOVV R24, 168(R29) + MOVV R25, 176(R29) + // R26 is reserved by kernel. + // R27 is reserved by kernel. + // R28 is REGSB (not modified by Go code). + // R29 is SP. + // R30 is g. + // R31 is LR, which was saved by the prologue. + + // This takes arguments R20 and R21. 
+ CALL runtime·wbBufFlush(SB) + + MOVV 8(R29), R20 + MOVV 16(R29), R21 + MOVV 24(R29), R3 + MOVV 32(R29), R4 + MOVV 40(R29), R5 + MOVV 48(R29), R6 + MOVV 56(R29), R7 + MOVV 64(R29), R8 + MOVV 72(R29), R9 + MOVV 80(R29), R10 + MOVV 88(R29), R11 + MOVV 96(R29), R12 + MOVV 104(R29), R13 + MOVV 112(R29), R14 + MOVV 120(R29), R15 + MOVV 128(R29), R16 + MOVV 136(R29), R17 + MOVV 144(R29), R18 + MOVV 152(R29), R19 + MOVV 160(R29), R22 + MOVV 168(R29), R24 + MOVV 176(R29), R25 + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments is allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. +TEXT runtime·panicIndex(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-16 + MOVV R3, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT runtime·panicSlice3AlenU(SB),NOSPLIT,$0-16 + MOVV R3, x+0(FP) +
MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-16 + MOVV R3, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-16 + MOVV R3, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-16 + MOVV R2, x+0(FP) + MOVV R3, y+8(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-16 + MOVV R1, x+0(FP) + MOVV R2, y+8(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-16 + MOVV R3, x+0(FP) + MOVV R4, y+8(FP) + JMP runtime·goPanicSliceConvert(SB) diff --git a/src/runtime/asm_mipsx.s b/src/runtime/asm_mipsx.s new file mode 100644 index 0000000..877c1bb --- /dev/null +++ b/src/runtime/asm_mipsx.s @@ -0,0 +1,882 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +#define REGCTXT R22 + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // R29 = stack; R4 = argc; R5 = argv + + ADDU $-12, R29 + MOVW R4, 4(R29) // argc + MOVW R5, 8(R29) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOVW $runtime·g0(SB), g + MOVW $(-64*1024), R23 + ADD R23, R29, R1 + MOVW R1, g_stackguard0(g) + MOVW R1, g_stackguard1(g) + MOVW R1, (g_stack+stack_lo)(g) + MOVW R29, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. 
+ MOVW _cgo_init(SB), R25 + BEQ R25, nocgo + ADDU $-16, R29 + MOVW R0, R7 // arg 3: not used + MOVW R0, R6 // arg 2: not used + MOVW $setg_gcc<>(SB), R5 // arg 1: setg + MOVW g, R4 // arg 0: G + JAL (R25) + ADDU $16, R29 + +nocgo: + // update stackguard after _cgo_init + MOVW (g_stack+stack_lo)(g), R1 + ADD $const__StackGuard, R1 + MOVW R1, g_stackguard0(g) + MOVW R1, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVW $runtime·m0(SB), R1 + + // save m->g0 = g0 + MOVW g, m_g0(R1) + // save m0 to g0->m + MOVW R1, g_m(g) + + JAL runtime·check(SB) + + // args are already prepared + JAL runtime·args(SB) + JAL runtime·osinit(SB) + JAL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVW $runtime·mainPC(SB), R1 // entry + ADDU $-8, R29 + MOVW R1, 4(R29) + MOVW R0, 0(R29) + JAL runtime·newproc(SB) + ADDU $8, R29 + + // start this M + JAL runtime·mstart(SB) + + UNDEF + RET + +DATA runtime·mainPC+0(SB)/4,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$4 + +TEXT runtime·breakpoint(SB),NOSPLIT,$0-0 + BREAK + RET + +TEXT runtime·asminit(SB),NOSPLIT,$0-0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + JAL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB),NOSPLIT|NOFRAME,$0-4 + MOVW buf+0(FP), R3 + MOVW gobuf_g(R3), R4 + MOVW 0(R4), R5 // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB),NOSPLIT|NOFRAME,$0 + MOVW R4, g + JAL runtime·save_g(SB) + MOVW gobuf_sp(R3), R29 + MOVW gobuf_lr(R3), R31 + MOVW gobuf_ret(R3), R1 + MOVW gobuf_ctxt(R3), REGCTXT + MOVW R0, gobuf_sp(R3) + MOVW R0, gobuf_ret(R3) + MOVW R0, gobuf_lr(R3) + MOVW R0, gobuf_ctxt(R3) + MOVW gobuf_pc(R3), R4 + JMP (R4) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. 
+TEXT runtime·mcall(SB),NOSPLIT|NOFRAME,$0-4 + // Save caller state in g->sched + MOVW R29, (g_sched+gobuf_sp)(g) + MOVW R31, (g_sched+gobuf_pc)(g) + MOVW R0, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. + MOVW g, R1 + MOVW g_m(g), R3 + MOVW m_g0(R3), g + JAL runtime·save_g(SB) + BNE g, R1, 2(PC) + JMP runtime·badmcall(SB) + MOVW fn+0(FP), REGCTXT // context + MOVW 0(REGCTXT), R4 // code pointer + MOVW (g_sched+gobuf_sp)(g), R29 // sp = m->g0->sched.sp + ADDU $-8, R29 // make room for 1 arg and fake LR + MOVW R1, 4(R29) + MOVW R0, 0(R29) + JAL (R4) + JMP runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB),NOSPLIT,$0-0 + UNDEF + JAL (R31) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB),NOSPLIT,$0-4 + MOVW fn+0(FP), R1 // R1 = fn + MOVW R1, REGCTXT // context + MOVW g_m(g), R2 // R2 = m + + MOVW m_gsignal(R2), R3 // R3 = gsignal + BEQ g, R3, noswitch + + MOVW m_g0(R2), R3 // R3 = g0 + BEQ g, R3, noswitch + + MOVW m_curg(R2), R4 + BEQ g, R4, switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVW $runtime·badsystemstack(SB), R4 + JAL (R4) + JAL runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ JAL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVW R3, g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R1 + MOVW R1, R29 + + // call target function + MOVW 0(REGCTXT), R4 // code pointer + JAL (R4) + + // switch back to g + MOVW g_m(g), R1 + MOVW m_curg(R1), g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R29 + MOVW R0, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVW 0(REGCTXT), R4 // code pointer + MOVW 0(R29), R31 // restore LR + ADD $4, R29 + JMP (R4) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Caller has already loaded: +// R1: framesize, R2: argsize, R3: LR +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVW g_m(g), R7 + MOVW m_g0(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackg0(SB) + JAL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVW m_gsignal(R7), R8 + BNE g, R8, 3(PC) + JAL runtime·badmorestackgsignal(SB) + JAL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOVW R29, (g_sched+gobuf_sp)(g) + MOVW R31, (g_sched+gobuf_pc)(g) + MOVW R3, (g_sched+gobuf_lr)(g) + MOVW REGCTXT, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOVW R3, (m_morebuf+gobuf_pc)(R7) // f's caller's PC + MOVW R29, (m_morebuf+gobuf_sp)(R7) // f's caller's SP + MOVW g, (m_morebuf+gobuf_g)(R7) + + // Call newstack on m->g0's stack. + MOVW m_g0(R7), g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R29 + // Create a stack frame on g0 to call newstack. 
+ MOVW R0, -4(R29) // Zero saved LR in frame + ADDU $-4, R29 + JAL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R3), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. (See issue 54332) + MOVW R29, R29 + + MOVW R0, REGCTXT + JMP runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. 
+ +#define DISPATCH(NAME,MAXSIZE) \ + MOVW $MAXSIZE, R23; \ + SGTU R1, R23, R23; \ + BNE R23, 3(PC); \ + MOVW $NAME(SB), R4; \ + JMP (R4) + +TEXT ·reflectcall(SB),NOSPLIT|NOFRAME,$0-28 + MOVW frameSize+20(FP), R1 + + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVW $runtime·badreflectcall(SB), R4 + JMP (R4) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB),WRAPPER,$MAXSIZE-28; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVW stackArgs+8(FP), R1; \ + MOVW stackArgsSize+12(FP), R2; \ + MOVW R29, R3; \ + ADDU $4, R3; \ + ADDU R3, R2; \ + BEQ R3, R2, 6(PC); \ + MOVBU (R1), R4; \ + ADDU $1, R1; \ + MOVBU R4, (R3); \ + ADDU $1, R3; \ + JMP -5(PC); \ + /* call function */ \ + MOVW f+4(FP), REGCTXT; \ + MOVW (REGCTXT), R4; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + JAL (R4); \ + /* copy return values back */ \ + MOVW stackArgsType+0(FP), R5; \ + MOVW stackArgs+8(FP), R1; \ + MOVW stackArgsSize+12(FP), R2; \ + MOVW stackRetOffset+16(FP), R4; \ + ADDU $4, R29, R3; \ + ADDU R4, R3; \ + 
ADDU R4, R1; \ + SUBU R4, R2; \ + JAL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $20-0 + MOVW R5, 4(R29) + MOVW R1, 8(R29) + MOVW R3, 12(R29) + MOVW R2, 16(R29) + MOVW $0, 20(R29) + JAL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT,$0-4 + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R1. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVW $runtime·systemstack_switch(SB), R1 + ADDU $8, R1 // get past prologue + MOVW R1, (g_sched+gobuf_pc)(g) + MOVW R29, (g_sched+gobuf_sp)(g) + MOVW R0, (g_sched+gobuf_lr)(g) + MOVW R0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. 
+ MOVW (g_sched+gobuf_ctxt)(g), R1 + BEQ R1, 2(PC) + JAL runtime·abort(SB) + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-12 + MOVW fn+0(FP), R25 + MOVW arg+4(FP), R4 + + MOVW R29, R3 // save original stack pointer + MOVW g, R2 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOVW g_m(g), R5 + MOVW m_gsignal(R5), R6 + BEQ R6, g, g0 + MOVW m_g0(R5), R6 + BEQ R6, g, g0 + + JAL gosave_systemstack_switch<>(SB) + MOVW R6, g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R29 + + // Now on a scheduling stack (a pthread-created stack). +g0: + // Save room for two of our pointers and O32 frame. + ADDU $-24, R29 + AND $~7, R29 // O32 ABI expects 8-byte aligned stack on function entry + MOVW R2, 16(R29) // save old g on stack + MOVW (g_stack+stack_hi)(R2), R2 + SUBU R3, R2 + MOVW R2, 20(R29) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + JAL (R25) + + // Restore g, stack pointer. R2 is return value. + MOVW 16(R29), g + JAL runtime·save_g(SB) + MOVW (g_stack+stack_hi)(g), R5 + MOVW 20(R29), R6 + SUBU R6, R5 + MOVW R5, R29 + + MOVW R2, ret+8(FP) + RET + +// cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$12-12 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. + MOVB runtime·iscgo(SB), R1 + BEQ R1, nocgo + JAL runtime·load_g(SB) +nocgo: + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. 
Hide the call from + // the linker analysis by using an indirect call. + BEQ g, needm + + MOVW g_m(g), R3 + MOVW R3, savedm-4(SP) + JMP havem + +needm: + MOVW g, savedm-4(SP) // g is zero, so is m. + MOVW $runtime·needm(SB), R4 + JAL (R4) + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVW g_m(g), R3 + MOVW m_g0(R3), R1 + MOVW R29, (g_sched+gobuf_sp)(R1) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 4(R29) aka savedsp-8(SP). + MOVW m_g0(R3), R1 + MOVW (g_sched+gobuf_sp)(R1), R2 + MOVW R2, savedsp-12(SP) // must match frame size + MOVW R29, (g_sched+gobuf_sp)(R1) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. 
+ // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVW m_curg(R3), g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R2 // prepare stack as R2 + MOVW (g_sched+gobuf_pc)(g), R4 + MOVW R4, -(12+4)(R2) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOVW fn+0(FP), R5 + MOVW frame+4(FP), R6 + MOVW ctxt+8(FP), R7 + MOVW $-(12+4)(R2), R29 // switch stack; must match frame size + MOVW R5, 4(R29) + MOVW R6, 8(R29) + MOVW R7, 12(R29) + JAL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOVW 0(R29), R4 + MOVW R4, (g_sched+gobuf_pc)(g) + MOVW $(12+4)(R29), R2 // must match frame size + MOVW R2, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVW g_m(g), R3 + MOVW m_g0(R3), g + JAL runtime·save_g(SB) + MOVW (g_sched+gobuf_sp)(g), R29 + MOVW savedsp-12(SP), R2 // must match frame size + MOVW R2, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVW savedm-4(SP), R3 + BNE R3, droppedm + MOVW $runtime·dropm(SB), R4 + JAL (R4) +droppedm: + + // Done! + RET + +// void setg(G*); set g. for use by needm. +// This only happens if iscgo, so jump straight to save_g +TEXT runtime·setg(SB),NOSPLIT,$0-4 + MOVW gg+0(FP), g + JAL runtime·save_g(SB) + RET + +// void setg_gcc(G*); set g in C TLS. +// Must obey the gcc calling convention. 
+TEXT setg_gcc<>(SB),NOSPLIT,$0 + MOVW R4, g + JAL runtime·save_g(SB) + RET + +TEXT runtime·abort(SB),NOSPLIT,$0-0 + UNDEF + +// AES hashing not implemented for mips +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-16 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-12 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB),NOSPLIT,$0 + MOVW $0, R1 + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT|NOFRAME,$0 + // g (R30), R3 and REGTMP (R23) might be clobbered by load_g. R30 and R23 + // are callee-save in the gcc calling convention, so save them. + MOVW R23, R8 + MOVW g, R9 + MOVW R31, R10 // this call frame does not save LR + + JAL runtime·load_g(SB) + MOVW g_m(g), R1 + MOVW m_curg(R1), R1 + MOVW (g_stack+stack_hi)(R1), R2 // return value in R2 + + MOVW R8, R23 + MOVW R9, g + MOVW R10, R31 + + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + NOR R0, R0 // NOP + JAL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + NOR R0, R0 // NOP + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R1 + MOVB R1, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R20 is the destination of the write +// - R21 is the value being written at R20. +// It clobbers R23 (the linker temp register). +// The act of CALLing gcWriteBarrier will clobber R31 (LR). +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). 
+TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$104 + // Save the registers clobbered by the fast path. + MOVW R1, 100(R29) + MOVW R2, 104(R29) + MOVW g_m(g), R1 + MOVW m_p(R1), R1 + MOVW (p_wbBuf+wbBuf_next)(R1), R2 + // Increment wbBuf.next position. + ADD $8, R2 + MOVW R2, (p_wbBuf+wbBuf_next)(R1) + MOVW (p_wbBuf+wbBuf_end)(R1), R1 + MOVW R1, R23 // R23 is linker temp register + // Record the write. + MOVW R21, -8(R2) // Record value + MOVW (R20), R1 // TODO: This turns bad writes into bad reads. + MOVW R1, -4(R2) // Record *slot + // Is the buffer full? + BEQ R2, R23, flush +ret: + MOVW 100(R29), R1 + MOVW 104(R29), R2 + // Do the write. + MOVW R21, (R20) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + MOVW R20, 4(R29) // Also first argument to wbBufFlush + MOVW R21, 8(R29) // Also second argument to wbBufFlush + // R1 already saved + // R2 already saved + MOVW R3, 12(R29) + MOVW R4, 16(R29) + MOVW R5, 20(R29) + MOVW R6, 24(R29) + MOVW R7, 28(R29) + MOVW R8, 32(R29) + MOVW R9, 36(R29) + MOVW R10, 40(R29) + MOVW R11, 44(R29) + MOVW R12, 48(R29) + MOVW R13, 52(R29) + MOVW R14, 56(R29) + MOVW R15, 60(R29) + MOVW R16, 64(R29) + MOVW R17, 68(R29) + MOVW R18, 72(R29) + MOVW R19, 76(R29) + MOVW R20, 80(R29) + // R21 already saved + // R22 already saved. + MOVW R22, 84(R29) + // R23 is tmp register. + MOVW R24, 88(R29) + MOVW R25, 92(R29) + // R26 is reserved by kernel. + // R27 is reserved by kernel. + MOVW R28, 96(R29) + // R29 is SP. + // R30 is g. + // R31 is LR, which was saved by the prologue. + + // This takes arguments R20 and R21. 
+ CALL runtime·wbBufFlush(SB) + + MOVW 4(R29), R20 + MOVW 8(R29), R21 + MOVW 12(R29), R3 + MOVW 16(R29), R4 + MOVW 20(R29), R5 + MOVW 24(R29), R6 + MOVW 28(R29), R7 + MOVW 32(R29), R8 + MOVW 36(R29), R9 + MOVW 40(R29), R10 + MOVW 44(R29), R11 + MOVW 48(R29), R12 + MOVW 52(R29), R13 + MOVW 56(R29), R14 + MOVW 60(R29), R15 + MOVW 64(R29), R16 + MOVW 68(R29), R17 + MOVW 72(R29), R18 + MOVW 76(R29), R19 + MOVW 80(R29), R20 + MOVW 84(R29), R22 + MOVW 88(R29), R24 + MOVW 92(R29), R25 + MOVW 96(R29), R28 + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments are allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. +TEXT runtime·panicIndex(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-8 + MOVW R3, x+0(FP) + MOVW R4, y+4(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT runtime·panicSlice3AlenU(SB),NOSPLIT,$0-8 + 
MOVW R3, x+0(FP) + MOVW R4, y+4(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-8 + MOVW R3, x+0(FP) + MOVW R4, y+4(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-8 + MOVW R3, x+0(FP) + MOVW R4, y+4(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-8 + MOVW R2, x+0(FP) + MOVW R3, y+4(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-8 + MOVW R1, x+0(FP) + MOVW R2, y+4(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-8 + MOVW R3, x+0(FP) + MOVW R4, y+4(FP) + JMP runtime·goPanicSliceConvert(SB) + +// Extended versions for 64-bit indexes. +TEXT runtime·panicExtendIndex(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendIndex(SB) +TEXT runtime·panicExtendIndexU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendIndexU(SB) +TEXT runtime·panicExtendSliceAlen(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSliceAlen(SB) +TEXT runtime·panicExtendSliceAlenU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSliceAlenU(SB) +TEXT runtime·panicExtendSliceAcap(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSliceAcap(SB) +TEXT runtime·panicExtendSliceAcapU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSliceAcapU(SB) +TEXT runtime·panicExtendSliceB(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP 
runtime·goPanicExtendSliceB(SB) +TEXT runtime·panicExtendSliceBU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSliceBU(SB) +TEXT runtime·panicExtendSlice3Alen(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R3, lo+4(FP) + MOVW R4, y+8(FP) + JMP runtime·goPanicExtendSlice3Alen(SB) +TEXT runtime·panicExtendSlice3AlenU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R3, lo+4(FP) + MOVW R4, y+8(FP) + JMP runtime·goPanicExtendSlice3AlenU(SB) +TEXT runtime·panicExtendSlice3Acap(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R3, lo+4(FP) + MOVW R4, y+8(FP) + JMP runtime·goPanicExtendSlice3Acap(SB) +TEXT runtime·panicExtendSlice3AcapU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R3, lo+4(FP) + MOVW R4, y+8(FP) + JMP runtime·goPanicExtendSlice3AcapU(SB) +TEXT runtime·panicExtendSlice3B(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3B(SB) +TEXT runtime·panicExtendSlice3BU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R2, lo+4(FP) + MOVW R3, y+8(FP) + JMP runtime·goPanicExtendSlice3BU(SB) +TEXT runtime·panicExtendSlice3C(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSlice3C(SB) +TEXT runtime·panicExtendSlice3CU(SB),NOSPLIT,$0-12 + MOVW R5, hi+0(FP) + MOVW R1, lo+4(FP) + MOVW R2, y+8(FP) + JMP runtime·goPanicExtendSlice3CU(SB) diff --git a/src/runtime/asm_ppc64x.h b/src/runtime/asm_ppc64x.h new file mode 100644 index 0000000..5e55055 --- /dev/null +++ b/src/runtime/asm_ppc64x.h @@ -0,0 +1,25 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// FIXED_FRAME defines the size of the fixed part of a stack frame. 
A stack +// frame looks like this: +// +// +---------------------+ +// | local variable area | +// +---------------------+ +// | argument area | +// +---------------------+ <- R1+FIXED_FRAME +// | fixed area | +// +---------------------+ <- R1 +// +// So a function that sets up a stack frame at all uses as least FIXED_FRAME +// bytes of stack. This mostly affects assembly that calls other functions +// with arguments (the arguments should be stored at FIXED_FRAME+0(R1), +// FIXED_FRAME+8(R1) etc) and some other low-level places. +// +// The reason for using a constant is to make supporting PIC easier (although +// we only support PIC on ppc64le which has a minimum 32 bytes of stack frame, +// and currently always use that much, PIC on ppc64 would need to use 48). + +#define FIXED_FRAME 32 diff --git a/src/runtime/asm_ppc64x.s b/src/runtime/asm_ppc64x.s new file mode 100644 index 0000000..61ff17a --- /dev/null +++ b/src/runtime/asm_ppc64x.s @@ -0,0 +1,1266 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" +#include "asm_ppc64x.h" + +#ifdef GOOS_aix +#define cgoCalleeStackSize 48 +#else +#define cgoCalleeStackSize 32 +#endif + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // R1 = stack; R3 = argc; R4 = argv; R13 = C TLS base pointer + + // initialize essential registers + BL runtime·reginit(SB) + + SUB $(FIXED_FRAME+16), R1 + MOVD R2, 24(R1) // stash the TOC pointer away again now we've created a new frame + MOVW R3, FIXED_FRAME+0(R1) // argc + MOVD R4, FIXED_FRAME+8(R1) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. 
+ MOVD $runtime·g0(SB), g + BL runtime·save_g(SB) + MOVD $(-64*1024), R31 + ADD R31, R1, R3 + MOVD R3, g_stackguard0(g) + MOVD R3, g_stackguard1(g) + MOVD R3, (g_stack+stack_lo)(g) + MOVD R1, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. + MOVD _cgo_init(SB), R12 + CMP R0, R12 + BEQ nocgo +#ifdef GOARCH_ppc64 + // ppc64 use elf ABI v1. we must get the real entry address from + // first slot of the function descriptor before call. + MOVD 8(R12), R2 + MOVD (R12), R12 +#endif + MOVD R12, CTR // r12 = "global function entry point" + MOVD R13, R5 // arg 2: TLS base pointer + MOVD $setg_gcc<>(SB), R4 // arg 1: setg + MOVD g, R3 // arg 0: G + // C functions expect 32 (48 for AIX) bytes of space on caller + // stack frame and a 16-byte aligned R1 + MOVD R1, R14 // save current stack + SUB $cgoCalleeStackSize, R1 // reserve the callee area + RLDCR $0, R1, $~15, R1 // 16-byte align + BL (CTR) // may clobber R0, R3-R12 + MOVD R14, R1 // restore stack +#ifndef GOOS_aix + MOVD 24(R1), R2 +#endif + XOR R0, R0 // fix R0 + +nocgo: + // update stackguard after _cgo_init + MOVD (g_stack+stack_lo)(g), R3 + ADD $const__StackGuard, R3 + MOVD R3, g_stackguard0(g) + MOVD R3, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVD $runtime·m0(SB), R3 + + // save m->g0 = g0 + MOVD g, m_g0(R3) + // save m0 to g0->m + MOVD R3, g_m(g) + + BL runtime·check(SB) + + // args are already prepared + BL runtime·args(SB) + BL runtime·osinit(SB) + BL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVD $runtime·mainPC(SB), R3 // entry + MOVDU R3, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + BL runtime·newproc(SB) + ADD $(8+FIXED_FRAME), R1 + + // start this M + BL runtime·mstart(SB) + + MOVD R0, 0(R0) + RET + +DATA runtime·mainPC+0(SB)/8,$runtime·main<ABIInternal>(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + TW $31, R0, R0 + RET + 
+TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +// Any changes must be reflected to runtime/cgo/gcc_aix_ppc64.S:.crosscall_ppc64 +TEXT _cgo_reginit(SB),NOSPLIT|NOFRAME,$0-0 + // crosscall_ppc64 and crosscall2 need to reginit, but can't + // get at the 'runtime.reginit' symbol. + BR runtime·reginit(SB) + +TEXT runtime·reginit(SB),NOSPLIT|NOFRAME,$0-0 + // set R0 to zero, it's expected by the toolchain + XOR R0, R0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + BL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOVD buf+0(FP), R5 + MOVD gobuf_g(R5), R6 + MOVD 0(R6), R4 // make sure g != nil + BR gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOVD R6, g + BL runtime·save_g(SB) + + MOVD gobuf_sp(R5), R1 + MOVD gobuf_lr(R5), R31 +#ifndef GOOS_aix + MOVD 24(R1), R2 // restore R2 +#endif + MOVD R31, LR + MOVD gobuf_ret(R5), R3 + MOVD gobuf_ctxt(R5), R11 + MOVD R0, gobuf_sp(R5) + MOVD R0, gobuf_ret(R5) + MOVD R0, gobuf_lr(R5) + MOVD R0, gobuf_ctxt(R5) + CMP R0, R0 // set condition codes for == test, needed by stack split + MOVD gobuf_pc(R5), R12 + MOVD R12, CTR + BR (CTR) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. +TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-8 + // Save caller state in g->sched + // R11 should be safe across save_g?? + MOVD R3, R11 + MOVD R1, (g_sched+gobuf_sp)(g) + MOVD LR, R31 + MOVD R31, (g_sched+gobuf_pc)(g) + MOVD R0, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. 
+ MOVD g, R3 + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + CMP g, R3 + BNE 2(PC) + BR runtime·badmcall(SB) + MOVD 0(R11), R12 // code pointer + MOVD R12, CTR + MOVD (g_sched+gobuf_sp)(g), R1 // sp = m->g0->sched.sp + // Don't need to do anything special for regabiargs here + // R3 is g; stack is set anyway + MOVDU R3, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + MOVDU R0, -8(R1) + BL (CTR) + MOVD 24(R1), R2 + BR runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + // We have several undefs here so that 16 bytes past + // $runtime·systemstack_switch lies within them whether or not the + // instructions that derive r2 from r12 are there. + UNDEF + UNDEF + UNDEF + BL (LR) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVD fn+0(FP), R3 // R3 = fn + MOVD R3, R11 // context + MOVD g_m(g), R4 // R4 = m + + MOVD m_gsignal(R4), R5 // R5 = gsignal + CMP g, R5 + BEQ noswitch + + MOVD m_g0(R4), R5 // R5 = g0 + CMP g, R5 + BEQ noswitch + + MOVD m_curg(R4), R6 + CMP g, R6 + BEQ switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOVD $runtime·badsystemstack(SB), R12 + MOVD R12, CTR + BL (CTR) + BL runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ BL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVD R5, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R1 + + // call target function + MOVD 0(R11), R12 // code pointer + MOVD R12, CTR + BL (CTR) + + // restore TOC pointer. It seems unlikely that we will use systemstack + // to call a function defined in another module, but the results of + // doing so would be so confusing that it's worth doing this. + MOVD g_m(g), R3 + MOVD m_curg(R3), g + MOVD (g_sched+gobuf_sp)(g), R3 +#ifndef GOOS_aix + MOVD 24(R3), R2 +#endif + // switch back to g + MOVD g_m(g), R3 + MOVD m_curg(R3), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R1 + MOVD R0, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // On other arches we do a tail call here, but it appears to be + // impossible to tail call a function pointer in shared mode on + // ppc64 because the caller is responsible for restoring the TOC. + MOVD 0(R11), R12 // code pointer + MOVD R12, CTR + BL (CTR) +#ifndef GOOS_aix + MOVD 24(R1), R2 +#endif + RET + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Caller has already loaded: +// R3: framesize, R4: argsize, R5: LR +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVD g_m(g), R7 + MOVD m_g0(R7), R8 + CMP g, R8 + BNE 3(PC) + BL runtime·badmorestackg0(SB) + BL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVD m_gsignal(R7), R8 + CMP g, R8 + BNE 3(PC) + BL runtime·badmorestackgsignal(SB) + BL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. 
+ MOVD R1, (g_sched+gobuf_sp)(g) + MOVD LR, R8 + MOVD R8, (g_sched+gobuf_pc)(g) + MOVD R5, (g_sched+gobuf_lr)(g) + MOVD R11, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOVD R5, (m_morebuf+gobuf_pc)(R7) // f's caller's PC + MOVD R1, (m_morebuf+gobuf_sp)(R7) // f's caller's SP + MOVD g, (m_morebuf+gobuf_g)(R7) + + // Call newstack on m->g0's stack. + MOVD m_g0(R7), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R1 + MOVDU R0, -(FIXED_FRAME+0)(R1) // create a call frame on g0 + BL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R5), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. (See issue 54332) + // Use OR R0, R1 instead of MOVD R1, R1 as the MOVD instruction + // has a special effect on Power8,9,10 by lowering the thread + // priority and causing a slowdown in execution time + + OR R0, R1 + MOVD R0, R11 + BR runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOVD $MAXSIZE, R31; \ + CMP R3, R31; \ + BGT 4(PC); \ + MOVD $NAME(SB), R12; \ + MOVD R12, CTR; \ + BR (CTR) +// Note: can't just "BR NAME(SB)" - bad inlining results.
+ +TEXT ·reflectcall(SB), NOSPLIT|NOFRAME, $0-48 + MOVWZ frameSize+32(FP), R3 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVD $runtime·badreflectcall(SB), R12 + MOVD R12, CTR + BR (CTR) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVD stackArgs+16(FP), R3; \ + MOVWZ stackArgsSize+24(FP), R4; \ + MOVD R1, R5; \ + CMP R4, $8; \ + BLT tailsetup; \ + /* copy 8 at a time if possible */ \ + ADD $(FIXED_FRAME-8), R5; \ + SUB $8, R3; \ +top: \ + MOVDU 8(R3), R7; \ + MOVDU R7, 8(R5); \ + SUB $8, R4; \ + CMP R4, $8; \ + BGE top; \ + /* handle remaining bytes */ \ + CMP $0, R4; \ + BEQ callfn; \ + ADD $7, R3; \ + ADD $7, R5; \ + BR tail; \ +tailsetup: \ + CMP $0, R4; \ + BEQ callfn; \ + ADD $(FIXED_FRAME-1), R5; \ + SUB $1, R3; \ +tail: \ + MOVBU 1(R3), R6; \ + MOVBU R6, 1(R5); \ + SUB $1, R4; \ + CMP $0, R4; \ + BGT tail; \ +callfn: \ + /* call function */ \ + MOVD f+8(FP), R11; \ +#ifdef GOOS_aix \ + /* AIX 
won't trigger a SIGSEGV if R11 = nil */ \ + /* So it manually triggers it */ \ + CMP R0, R11 \ + BNE 2(PC) \ + MOVD R0, 0(R0) \ +#endif \ + MOVD regArgs+40(FP), R20; \ + BL runtime·unspillArgs(SB); \ + MOVD (R11), R12; \ + MOVD R12, CTR; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + BL (CTR); \ +#ifndef GOOS_aix \ + MOVD 24(R1), R2; \ +#endif \ + /* copy return values back */ \ + MOVD regArgs+40(FP), R20; \ + BL runtime·spillArgs(SB); \ + MOVD stackArgsType+0(FP), R7; \ + MOVD stackArgs+16(FP), R3; \ + MOVWZ stackArgsSize+24(FP), R4; \ + MOVWZ stackRetOffset+28(FP), R6; \ + ADD $FIXED_FRAME, R1, R5; \ + ADD R6, R5; \ + ADD R6, R3; \ + SUB R6, R4; \ + BL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $40-0 + NO_LOCAL_POINTERS + MOVD R7, FIXED_FRAME+0(R1) + MOVD R3, FIXED_FRAME+8(R1) + MOVD R5, FIXED_FRAME+16(R1) + MOVD R4, FIXED_FRAME+24(R1) + MOVD R20, FIXED_FRAME+32(R1) + BL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·procyield(SB),NOSPLIT|NOFRAME,$0-4 + MOVW 
cycles+0(FP), R7 + // POWER does not have a pause/yield instruction equivalent. + // Instead, we can lower the program priority by setting the + // Program Priority Register prior to the wait loop and set it + // back to default afterwards. On Linux, the default priority is + // medium-low. For details, see page 837 of the ISA 3.0. + OR R1, R1, R1 // Set PPR priority to low +again: + SUB $1, R7 + CMP $0, R7 + BNE again + OR R6, R6, R6 // Set PPR priority back to medium-low + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R31. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·systemstack_switch(SB), R31 + ADD $16, R31 // get past prologue (including r2-setting instructions when they're there) + MOVD R31, (g_sched+gobuf_pc)(g) + MOVD R1, (g_sched+gobuf_sp)(g) + MOVD R0, (g_sched+gobuf_lr)(g) + MOVD R0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOVD (g_sched+gobuf_ctxt)(g), R31 + CMP R0, R31 + BEQ 2(PC) + BL runtime·abort(SB) + RET + +#ifdef GOOS_aix +#define asmcgocallSaveOffset cgoCalleeStackSize + 8 +#else +#define asmcgocallSaveOffset cgoCalleeStackSize +#endif + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOVD fn+0(FP), R3 + MOVD arg+8(FP), R4 + + MOVD R1, R7 // save original stack pointer + MOVD g, R5 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. 
+ MOVD g_m(g), R8 + MOVD m_gsignal(R8), R6 + CMP R6, g + BEQ g0 + MOVD m_g0(R8), R6 + CMP R6, g + BEQ g0 + BL gosave_systemstack_switch<>(SB) + MOVD R6, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R1 + + // Now on a scheduling stack (a pthread-created stack). +g0: +#ifdef GOOS_aix + // Create a fake LR to improve backtrace. + MOVD $runtime·asmcgocall(SB), R6 + MOVD R6, 16(R1) + // AIX also save one argument on the stack. + SUB $8, R1 +#endif + // Save room for two of our pointers, plus the callee + // save area that lives on the caller stack. + SUB $(asmcgocallSaveOffset+16), R1 + RLDCR $0, R1, $~15, R1 // 16-byte alignment for gcc ABI + MOVD R5, (asmcgocallSaveOffset+8)(R1)// save old g on stack + MOVD (g_stack+stack_hi)(R5), R5 + SUB R7, R5 + MOVD R5, asmcgocallSaveOffset(R1) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) +#ifdef GOOS_aix + MOVD R7, 0(R1) // Save frame pointer to allow manual backtrace with gdb +#else + MOVD R0, 0(R1) // clear back chain pointer (TODO can we give it real back trace information?) +#endif + // This is a "global call", so put the global entry point in r12 + MOVD R3, R12 + +#ifdef GOARCH_ppc64 + // ppc64 use elf ABI v1. we must get the real entry address from + // first slot of the function descriptor before call. + // Same for AIX. + MOVD 8(R12), R2 + MOVD (R12), R12 +#endif + MOVD R12, CTR + MOVD R4, R3 // arg in r3 + BL (CTR) + // C code can clobber R0, so set it back to 0. F27-F31 are + // callee save, so we don't need to recover those. + XOR R0, R0 + // Restore g, stack pointer, toc pointer. + // R3 is errno, so don't touch it + MOVD (asmcgocallSaveOffset+8)(R1), g + MOVD (g_stack+stack_hi)(g), R5 + MOVD asmcgocallSaveOffset(R1), R6 + SUB R6, R5 +#ifndef GOOS_aix + MOVD 24(R5), R2 +#endif + MOVD R5, R1 + BL runtime·save_g(SB) + + MOVW R3, ret+16(FP) + RET + +// func cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. 
+TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. + MOVBZ runtime·iscgo(SB), R3 + CMP R3, $0 + BEQ nocgo + BL runtime·load_g(SB) +nocgo: + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + CMP g, $0 + BEQ needm + + MOVD g_m(g), R8 + MOVD R8, savedm-8(SP) + BR havem + +needm: + MOVD g, savedm-8(SP) // g is zero, so is m. + MOVD $runtime·needm(SB), R12 + MOVD R12, CTR + BL (CTR) + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVD g_m(g), R8 + MOVD m_g0(R8), R3 + MOVD R1, (g_sched+gobuf_sp)(R3) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 8(R1) aka savedsp-16(SP). + MOVD m_g0(R8), R3 + MOVD (g_sched+gobuf_sp)(R3), R4 + MOVD R4, savedsp-24(SP) // must match frame size + MOVD R1, (g_sched+gobuf_sp)(R3) + + // Switch to m->curg stack and call runtime.cgocallbackg. 
+ // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVD m_curg(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R4 // prepare stack as R4 + MOVD (g_sched+gobuf_pc)(g), R5 + MOVD R5, -(24+FIXED_FRAME)(R4) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOVD fn+0(FP), R5 + MOVD frame+8(FP), R6 + MOVD ctxt+16(FP), R7 + MOVD $-(24+FIXED_FRAME)(R4), R1 // switch stack; must match frame size + MOVD R5, FIXED_FRAME+0(R1) + MOVD R6, FIXED_FRAME+8(R1) + MOVD R7, FIXED_FRAME+16(R1) + + MOVD $runtime·cgocallbackg(SB), R12 + MOVD R12, CTR + CALL (CTR) // indirect call to bypass nosplit check. We're on a different stack now. + + // Restore g->sched (== m->curg->sched) from saved values. + MOVD 0(R1), R5 + MOVD R5, (g_sched+gobuf_pc)(g) + MOVD $(24+FIXED_FRAME)(R1), R4 // must match frame size + MOVD R4, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R1 + MOVD savedsp-24(SP), R4 // must match frame size + MOVD R4, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. 
+ MOVD savedm-8(SP), R6 + CMP R6, $0 + BNE droppedm + MOVD $runtime·dropm(SB), R12 + MOVD R12, CTR + BL (CTR) +droppedm: + + // Done! + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVD gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + BL runtime·save_g(SB) + RET + +#ifdef GOARCH_ppc64 +#ifdef GOOS_aix +DATA setg_gcc<>+0(SB)/8, $_setg_gcc<>(SB) +DATA setg_gcc<>+8(SB)/8, $TOC(SB) +DATA setg_gcc<>+16(SB)/8, $0 +GLOBL setg_gcc<>(SB), NOPTR, $24 +#else +TEXT setg_gcc<>(SB),NOSPLIT|NOFRAME,$0-0 + DWORD $_setg_gcc<>(SB) + DWORD $0 + DWORD $0 +#endif +#endif + +// void setg_gcc(G*); set g in C TLS. +// Must obey the gcc calling convention. +#ifdef GOARCH_ppc64le +TEXT setg_gcc<>(SB),NOSPLIT|NOFRAME,$0-0 +#else +TEXT _setg_gcc<>(SB),NOSPLIT|NOFRAME,$0-0 +#endif + // The standard prologue clobbers R31, which is callee-save in + // the C ABI, so we have to use $-8-0 and save LR ourselves. + MOVD LR, R4 + // Also save g and R31, since they're callee-save in C ABI + MOVD R31, R5 + MOVD g, R6 + + MOVD R3, g + BL runtime·save_g(SB) + + MOVD R6, g + MOVD R5, R31 + MOVD R4, LR + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVW (R0), R0 + UNDEF + +#define TBR 268 + +// int64 runtime·cputicks(void) +TEXT runtime·cputicks(SB),NOSPLIT,$0-8 + MOVD SPR(TBR), R3 + MOVD R3, ret+0(FP) + RET + +// spillArgs stores return values from registers to a *internal/abi.RegArgs in R20. 
+TEXT runtime·spillArgs(SB),NOSPLIT,$0-0 + MOVD R3, 0(R20) + MOVD R4, 8(R20) + MOVD R5, 16(R20) + MOVD R6, 24(R20) + MOVD R7, 32(R20) + MOVD R8, 40(R20) + MOVD R9, 48(R20) + MOVD R10, 56(R20) + MOVD R14, 64(R20) + MOVD R15, 72(R20) + MOVD R16, 80(R20) + MOVD R17, 88(R20) + FMOVD F1, 96(R20) + FMOVD F2, 104(R20) + FMOVD F3, 112(R20) + FMOVD F4, 120(R20) + FMOVD F5, 128(R20) + FMOVD F6, 136(R20) + FMOVD F7, 144(R20) + FMOVD F8, 152(R20) + FMOVD F9, 160(R20) + FMOVD F10, 168(R20) + FMOVD F11, 176(R20) + FMOVD F12, 184(R20) + RET + +// unspillArgs loads args into registers from a *internal/abi.RegArgs in R20. +TEXT runtime·unspillArgs(SB),NOSPLIT,$0-0 + MOVD 0(R20), R3 + MOVD 8(R20), R4 + MOVD 16(R20), R5 + MOVD 24(R20), R6 + MOVD 32(R20), R7 + MOVD 40(R20), R8 + MOVD 48(R20), R9 + MOVD 56(R20), R10 + MOVD 64(R20), R14 + MOVD 72(R20), R15 + MOVD 80(R20), R16 + MOVD 88(R20), R17 + FMOVD 96(R20), F1 + FMOVD 104(R20), F2 + FMOVD 112(R20), F3 + FMOVD 120(R20), F4 + FMOVD 128(R20), F5 + FMOVD 136(R20), F6 + FMOVD 144(R20), F7 + FMOVD 152(R20), F8 + FMOVD 160(R20), F9 + FMOVD 168(R20), F10 + FMOVD 176(R20), F11 + FMOVD 184(R20), F12 + RET + +// AES hashing not implemented for ppc64 +TEXT runtime·memhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback<ABIInternal>(SB) +TEXT runtime·strhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback<ABIInternal>(SB) +TEXT runtime·memhash32<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback<ABIInternal>(SB) +TEXT runtime·memhash64<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback<ABIInternal>(SB) + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVW $0, R3 + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +#ifdef GOOS_aix +// On AIX, _cgo_topofstack is defined in runtime/cgo, because it must +// be a longcall in order to prevent trampolines from ld. 
+TEXT __cgo_topofstack(SB),NOSPLIT|NOFRAME,$0 +#else +TEXT _cgo_topofstack(SB),NOSPLIT|NOFRAME,$0 +#endif + // g (R30) and R31 are callee-save in the C ABI, so save them + MOVD g, R4 + MOVD R31, R5 + MOVD LR, R6 + + BL runtime·load_g(SB) // clobbers g (R30), R31 + MOVD g_m(g), R3 + MOVD m_curg(R3), R3 + MOVD (g_stack+stack_hi)(R3), R3 + + MOVD R4, g + MOVD R5, R31 + MOVD R6, LR + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. +// +// When dynamically linking Go, it can be returned to from a function +// implemented in a different module and so needs to reload the TOC pointer +// from the stack (although this function declares that it does not set up a +// frame, newproc1 does in fact allocate one for goexit and saves the TOC +// pointer in the correct place). +// goexit+_PCQuantum is halfway through the usual global entry point prologue +// that derives r2 from r12 which is a bit silly, but not harmful. +TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + MOVD 24(R1), R2 + BL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + MOVD R0, R0 // NOP + +// prepGoExitFrame saves the current TOC pointer (i.e. the TOC pointer for the +// module containing runtime) to the frame that goexit will execute in when +// the goroutine exits. It's implemented in assembly mainly because that's the +// easiest way to get access to R2. +TEXT runtime·prepGoExitFrame(SB),NOSPLIT,$0-8 + MOVD sp+0(FP), R3 + MOVD R2, 24(R3) + RET + +TEXT runtime·addmoduledata(SB),NOSPLIT|NOFRAME,$0-0 + ADD $-8, R1 + MOVD R31, 0(R1) + MOVD runtime·lastmoduledatap(SB), R4 + MOVD R3, moduledata_next(R4) + MOVD R3, runtime·lastmoduledatap(SB) + MOVD 0(R1), R31 + ADD $8, R1 + RET + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVW $1, R3 + MOVB R3, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI.
It takes two arguments: +// - R20 is the destination of the write +// - R21 is the value being written at R20. +// It clobbers condition codes. +// It does not clobber R0 through R17 (except special registers), +// but may clobber any other register, *including* R31. +TEXT runtime·gcWriteBarrier<ABIInternal>(SB),NOSPLIT,$112 + // The standard prologue clobbers R31. + // We use R18 and R19 as scratch registers. + MOVD g_m(g), R18 + MOVD m_p(R18), R18 + MOVD (p_wbBuf+wbBuf_next)(R18), R19 + // Increment wbBuf.next position. + ADD $16, R19 + MOVD R19, (p_wbBuf+wbBuf_next)(R18) + MOVD (p_wbBuf+wbBuf_end)(R18), R18 + CMP R18, R19 + // Record the write. + MOVD R21, -16(R19) // Record value + MOVD (R20), R18 // TODO: This turns bad writes into bad reads. + MOVD R18, -8(R19) // Record *slot + // Is the buffer full? (flags set in CMP above) + BEQ flush +ret: + // Do the write. + MOVD R21, (R20) + RET + +flush: + // Save registers R0 through R15 since these were not saved by the caller. + // We don't save all registers on ppc64 because it takes too much space. + MOVD R20, (FIXED_FRAME+0)(R1) // Also first argument to wbBufFlush + MOVD R21, (FIXED_FRAME+8)(R1) // Also second argument to wbBufFlush + // R0 is always 0, so no need to spill. + // R1 is SP. + // R2 is SB. + MOVD R3, (FIXED_FRAME+16)(R1) + MOVD R4, (FIXED_FRAME+24)(R1) + MOVD R5, (FIXED_FRAME+32)(R1) + MOVD R6, (FIXED_FRAME+40)(R1) + MOVD R7, (FIXED_FRAME+48)(R1) + MOVD R8, (FIXED_FRAME+56)(R1) + MOVD R9, (FIXED_FRAME+64)(R1) + MOVD R10, (FIXED_FRAME+72)(R1) + // R11, R12 may be clobbered by external-linker-inserted trampoline + // R13 is REGTLS + MOVD R14, (FIXED_FRAME+80)(R1) + MOVD R15, (FIXED_FRAME+88)(R1) + MOVD R16, (FIXED_FRAME+96)(R1) + MOVD R17, (FIXED_FRAME+104)(R1) + + // This takes arguments R20 and R21. 
+ CALL runtime·wbBufFlush(SB) + + MOVD (FIXED_FRAME+0)(R1), R20 + MOVD (FIXED_FRAME+8)(R1), R21 + MOVD (FIXED_FRAME+16)(R1), R3 + MOVD (FIXED_FRAME+24)(R1), R4 + MOVD (FIXED_FRAME+32)(R1), R5 + MOVD (FIXED_FRAME+40)(R1), R6 + MOVD (FIXED_FRAME+48)(R1), R7 + MOVD (FIXED_FRAME+56)(R1), R8 + MOVD (FIXED_FRAME+64)(R1), R9 + MOVD (FIXED_FRAME+72)(R1), R10 + MOVD (FIXED_FRAME+80)(R1), R14 + MOVD (FIXED_FRAME+88)(R1), R15 + MOVD (FIXED_FRAME+96)(R1), R16 + MOVD (FIXED_FRAME+104)(R1), R17 + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments are allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. +TEXT runtime·panicIndex<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicIndex<ABIInternal>(SB) +TEXT runtime·panicIndexU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicIndexU<ABIInternal>(SB) +TEXT runtime·panicSliceAlen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSliceAlen<ABIInternal>(SB) +TEXT runtime·panicSliceAlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSliceAlenU<ABIInternal>(SB) +TEXT runtime·panicSliceAcap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSliceAcap<ABIInternal>(SB) +TEXT runtime·panicSliceAcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSliceAcapU<ABIInternal>(SB) +TEXT runtime·panicSliceB<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSliceB<ABIInternal>(SB) +TEXT runtime·panicSliceBU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSliceBU<ABIInternal>(SB) +TEXT runtime·panicSlice3Alen<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R5, R3 + MOVD R6, R4 + JMP runtime·goPanicSlice3Alen<ABIInternal>(SB) +TEXT 
runtime·panicSlice3AlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R5, R3 + MOVD R6, R4 + JMP runtime·goPanicSlice3AlenU<ABIInternal>(SB) +TEXT runtime·panicSlice3Acap<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R5, R3 + MOVD R6, R4 + JMP runtime·goPanicSlice3Acap<ABIInternal>(SB) +TEXT runtime·panicSlice3AcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R5, R3 + MOVD R6, R4 + JMP runtime·goPanicSlice3AcapU<ABIInternal>(SB) +TEXT runtime·panicSlice3B<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSlice3B<ABIInternal>(SB) +TEXT runtime·panicSlice3BU<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R4, R3 + MOVD R5, R4 + JMP runtime·goPanicSlice3BU<ABIInternal>(SB) +TEXT runtime·panicSlice3C<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSlice3C<ABIInternal>(SB) +TEXT runtime·panicSlice3CU<ABIInternal>(SB),NOSPLIT,$0-16 + JMP runtime·goPanicSlice3CU<ABIInternal>(SB) +TEXT runtime·panicSliceConvert<ABIInternal>(SB),NOSPLIT,$0-16 + MOVD R5, R3 + MOVD R6, R4 + JMP runtime·goPanicSliceConvert<ABIInternal>(SB) + +// These functions are used when internal linking cgo with external +// objects compiled with the -Os on gcc. They reduce prologue/epilogue +// size by deferring preservation of callee-save registers to a shared +// function. These are defined in PPC64 ELFv2 2.3.3 (but also present +// in ELFv1) +// +// These appear unused, but the linker will redirect calls to functions +// like _savegpr0_14 or _restgpr1_14 to runtime.elf_savegpr0 or +// runtime.elf_restgpr1 with an appropriate offset based on the number +// register operations required when linking external objects which +// make these calls. For GPR/FPR saves, the minimum register value is +// 14, for VR it is 20. +// +// These are only used when linking such cgo code internally. Note, R12 +// and R0 may be used in different ways than regular ELF compliant +// functions. 
+TEXT runtime·elf_savegpr0(SB),NOSPLIT|NOFRAME,$0 + // R0 holds the LR of the caller's caller, R1 holds save location + MOVD R14, -144(R1) + MOVD R15, -136(R1) + MOVD R16, -128(R1) + MOVD R17, -120(R1) + MOVD R18, -112(R1) + MOVD R19, -104(R1) + MOVD R20, -96(R1) + MOVD R21, -88(R1) + MOVD R22, -80(R1) + MOVD R23, -72(R1) + MOVD R24, -64(R1) + MOVD R25, -56(R1) + MOVD R26, -48(R1) + MOVD R27, -40(R1) + MOVD R28, -32(R1) + MOVD R29, -24(R1) + MOVD g, -16(R1) + MOVD R31, -8(R1) + MOVD R0, 16(R1) + RET +TEXT runtime·elf_restgpr0(SB),NOSPLIT|NOFRAME,$0 + // R1 holds save location. This returns to the LR saved on stack (bypassing the caller) + MOVD -144(R1), R14 + MOVD -136(R1), R15 + MOVD -128(R1), R16 + MOVD -120(R1), R17 + MOVD -112(R1), R18 + MOVD -104(R1), R19 + MOVD -96(R1), R20 + MOVD -88(R1), R21 + MOVD -80(R1), R22 + MOVD -72(R1), R23 + MOVD -64(R1), R24 + MOVD -56(R1), R25 + MOVD -48(R1), R26 + MOVD -40(R1), R27 + MOVD -32(R1), R28 + MOVD -24(R1), R29 + MOVD -16(R1), g + MOVD -8(R1), R31 + MOVD 16(R1), R0 // Load and return to saved LR + MOVD R0, LR + RET +TEXT runtime·elf_savegpr1(SB),NOSPLIT|NOFRAME,$0 + // R12 holds the save location + MOVD R14, -144(R12) + MOVD R15, -136(R12) + MOVD R16, -128(R12) + MOVD R17, -120(R12) + MOVD R18, -112(R12) + MOVD R19, -104(R12) + MOVD R20, -96(R12) + MOVD R21, -88(R12) + MOVD R22, -80(R12) + MOVD R23, -72(R12) + MOVD R24, -64(R12) + MOVD R25, -56(R12) + MOVD R26, -48(R12) + MOVD R27, -40(R12) + MOVD R28, -32(R12) + MOVD R29, -24(R12) + MOVD g, -16(R12) + MOVD R31, -8(R12) + RET +TEXT runtime·elf_restgpr1(SB),NOSPLIT|NOFRAME,$0 + // R12 holds the save location + MOVD -144(R12), R14 + MOVD -136(R12), R15 + MOVD -128(R12), R16 + MOVD -120(R12), R17 + MOVD -112(R12), R18 + MOVD -104(R12), R19 + MOVD -96(R12), R20 + MOVD -88(R12), R21 + MOVD -80(R12), R22 + MOVD -72(R12), R23 + MOVD -64(R12), R24 + MOVD -56(R12), R25 + MOVD -48(R12), R26 + MOVD -40(R12), R27 + MOVD -32(R12), R28 + MOVD -24(R12), R29 + MOVD -16(R12), g + MOVD 
-8(R12), R31 + RET +TEXT runtime·elf_savefpr(SB),NOSPLIT|NOFRAME,$0 + // R0 holds the LR of the caller's caller, R1 holds save location + FMOVD F14, -144(R1) + FMOVD F15, -136(R1) + FMOVD F16, -128(R1) + FMOVD F17, -120(R1) + FMOVD F18, -112(R1) + FMOVD F19, -104(R1) + FMOVD F20, -96(R1) + FMOVD F21, -88(R1) + FMOVD F22, -80(R1) + FMOVD F23, -72(R1) + FMOVD F24, -64(R1) + FMOVD F25, -56(R1) + FMOVD F26, -48(R1) + FMOVD F27, -40(R1) + FMOVD F28, -32(R1) + FMOVD F29, -24(R1) + FMOVD F30, -16(R1) + FMOVD F31, -8(R1) + MOVD R0, 16(R1) + RET +TEXT runtime·elf_restfpr(SB),NOSPLIT|NOFRAME,$0 + // R1 holds save location. This returns to the LR saved on stack (bypassing the caller) + FMOVD -144(R1), F14 + FMOVD -136(R1), F15 + FMOVD -128(R1), F16 + FMOVD -120(R1), F17 + FMOVD -112(R1), F18 + FMOVD -104(R1), F19 + FMOVD -96(R1), F20 + FMOVD -88(R1), F21 + FMOVD -80(R1), F22 + FMOVD -72(R1), F23 + FMOVD -64(R1), F24 + FMOVD -56(R1), F25 + FMOVD -48(R1), F26 + FMOVD -40(R1), F27 + FMOVD -32(R1), F28 + FMOVD -24(R1), F29 + FMOVD -16(R1), F30 + FMOVD -8(R1), F31 + MOVD 16(R1), R0 // Load and return to saved LR + MOVD R0, LR + RET +TEXT runtime·elf_savevr(SB),NOSPLIT|NOFRAME,$0 + // R0 holds the save location, R12 is clobbered + MOVD $-192, R12 + STVX V20, (R0+R12) + MOVD $-176, R12 + STVX V21, (R0+R12) + MOVD $-160, R12 + STVX V22, (R0+R12) + MOVD $-144, R12 + STVX V23, (R0+R12) + MOVD $-128, R12 + STVX V24, (R0+R12) + MOVD $-112, R12 + STVX V25, (R0+R12) + MOVD $-96, R12 + STVX V26, (R0+R12) + MOVD $-80, R12 + STVX V27, (R0+R12) + MOVD $-64, R12 + STVX V28, (R0+R12) + MOVD $-48, R12 + STVX V29, (R0+R12) + MOVD $-32, R12 + STVX V30, (R0+R12) + MOVD $-16, R12 + STVX V31, (R0+R12) + RET +TEXT runtime·elf_restvr(SB),NOSPLIT|NOFRAME,$0 + // R0 holds the save location, R12 is clobbered + MOVD $-192, R12 + LVX (R0+R12), V20 + MOVD $-176, R12 + LVX (R0+R12), V21 + MOVD $-160, R12 + LVX (R0+R12), V22 + MOVD $-144, R12 + LVX (R0+R12), V23 + MOVD $-128, R12 + LVX (R0+R12), V24 + MOVD 
$-112, R12 + LVX (R0+R12), V25 + MOVD $-96, R12 + LVX (R0+R12), V26 + MOVD $-80, R12 + LVX (R0+R12), V27 + MOVD $-64, R12 + LVX (R0+R12), V28 + MOVD $-48, R12 + LVX (R0+R12), V29 + MOVD $-32, R12 + LVX (R0+R12), V30 + MOVD $-16, R12 + LVX (R0+R12), V31 + RET diff --git a/src/runtime/asm_riscv64.s b/src/runtime/asm_riscv64.s new file mode 100644 index 0000000..31b81ae --- /dev/null +++ b/src/runtime/asm_riscv64.s @@ -0,0 +1,892 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "funcdata.h" +#include "textflag.h" + +// func rt0_go() +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // X2 = stack; A0 = argc; A1 = argv + ADD $-24, X2 + MOV A0, 8(X2) // argc + MOV A1, 16(X2) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOV $runtime·g0(SB), g + MOV $(-64*1024), T0 + ADD T0, X2, T1 + MOV T1, g_stackguard0(g) + MOV T1, g_stackguard1(g) + MOV T1, (g_stack+stack_lo)(g) + MOV X2, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI. 
+ MOV _cgo_init(SB), T0 + BEQ T0, ZERO, nocgo + + MOV ZERO, A3 // arg 3: not used + MOV ZERO, A2 // arg 2: not used + MOV $setg_gcc<>(SB), A1 // arg 1: setg + MOV g, A0 // arg 0: G + JALR RA, T0 + +nocgo: + // update stackguard after _cgo_init + MOV (g_stack+stack_lo)(g), T0 + ADD $const__StackGuard, T0 + MOV T0, g_stackguard0(g) + MOV T0, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOV $runtime·m0(SB), T0 + + // save m->g0 = g0 + MOV g, m_g0(T0) + // save m0 to g0->m + MOV T0, g_m(g) + + CALL runtime·check(SB) + + // args are already prepared + CALL runtime·args(SB) + CALL runtime·osinit(SB) + CALL runtime·schedinit(SB) + + // create a new goroutine to start program + MOV $runtime·mainPC(SB), T0 // entry + ADD $-16, X2 + MOV T0, 8(X2) + MOV ZERO, 0(X2) + CALL runtime·newproc(SB) + ADD $16, X2 + + // start this M + CALL runtime·mstart(SB) + + WORD $0 // crash if reached + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + CALL runtime·mstart0(SB) + RET // not reached + +// void setg_gcc(G*); set g called from gcc with g in A0 +TEXT setg_gcc<>(SB),NOSPLIT,$0-0 + MOV A0, g + CALL runtime·save_g(SB) + RET + +// func cputicks() int64 +TEXT runtime·cputicks(SB),NOSPLIT,$0-8 + // RDTIME to emulate cpu ticks + // RDCYCLE reads counter that is per HART(core) based + // according to the riscv manual, see issue 46737 + RDTIME A0 + MOV A0, ret+0(FP) + RET + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). 
+TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + UNDEF + JALR RA, ZERO // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOV fn+0(FP), CTXT // CTXT = fn + MOV g_m(g), T0 // T0 = m + + MOV m_gsignal(T0), T1 // T1 = gsignal + BEQ g, T1, noswitch + + MOV m_g0(T0), T1 // T1 = g0 + BEQ g, T1, noswitch + + MOV m_curg(T0), T2 + BEQ g, T2, switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. + MOV $runtime·badsystemstack(SB), T1 + JALR RA, T1 + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. + CALL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOV T1, g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), T0 + MOV T0, X2 + + // call target function + MOV 0(CTXT), T1 // code pointer + JALR RA, T1 + + // switch back to g + MOV g_m(g), T0 + MOV m_curg(T0), g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), X2 + MOV ZERO, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOV 0(CTXT), T1 // code pointer + ADD $8, X2 + JMP (T1) + +TEXT runtime·getcallerpc(SB),NOSPLIT|NOFRAME,$0-8 + MOV 0(X2), T0 // LR saved by caller + MOV T0, ret+0(FP) + RET + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Called with return address (i.e. caller's PC) in X5 (aka T0), +// and the LR register contains the caller's LR. +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. + +// func morestack() +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). 
+ MOV g_m(g), A0 + MOV m_g0(A0), A1 + BNE g, A1, 3(PC) + CALL runtime·badmorestackg0(SB) + CALL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOV m_gsignal(A0), A1 + BNE g, A1, 3(PC) + CALL runtime·badmorestackgsignal(SB) + CALL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOV X2, (g_sched+gobuf_sp)(g) + MOV T0, (g_sched+gobuf_pc)(g) + MOV RA, (g_sched+gobuf_lr)(g) + MOV CTXT, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. + MOV RA, (m_morebuf+gobuf_pc)(A0) // f's caller's PC + MOV X2, (m_morebuf+gobuf_sp)(A0) // f's caller's SP + MOV g, (m_morebuf+gobuf_g)(A0) + + // Call newstack on m->g0's stack. + MOV m_g0(A0), g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), X2 + // Create a stack frame on g0 to call newstack. + MOV ZERO, -8(X2) // Zero saved LR in frame + ADD $-8, X2 + CALL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +// func morestack_noctxt() +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register, and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. 
(See issue 54332) + MOV X2, X2 + + MOV ZERO, CTXT + JMP runtime·morestack(SB) + +// AES hashing not implemented for riscv64 +TEXT runtime·memhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback<ABIInternal>(SB) +TEXT runtime·strhash<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback<ABIInternal>(SB) +TEXT runtime·memhash32<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback<ABIInternal>(SB) +TEXT runtime·memhash64<ABIInternal>(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback<ABIInternal>(SB) + +// func return0() +TEXT runtime·return0(SB), NOSPLIT, $0 + MOV $0, A0 + RET + +// restore state from Gobuf; longjmp + +// func gogo(buf *gobuf) +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOV buf+0(FP), T0 + MOV gobuf_g(T0), T1 + MOV 0(T1), ZERO // make sure g != nil + JMP gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOV T1, g + CALL runtime·save_g(SB) + + MOV gobuf_sp(T0), X2 + MOV gobuf_lr(T0), RA + MOV gobuf_ret(T0), A0 + MOV gobuf_ctxt(T0), CTXT + MOV ZERO, gobuf_sp(T0) + MOV ZERO, gobuf_ret(T0) + MOV ZERO, gobuf_lr(T0) + MOV ZERO, gobuf_ctxt(T0) + MOV gobuf_pc(T0), T0 + JALR ZERO, T0 + +// func procyield(cycles uint32) +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + RET + +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. + +// func mcall(fn func(*g)) +TEXT runtime·mcall<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-8 + MOV X10, CTXT + + // Save caller state in g->sched + MOV X2, (g_sched+gobuf_sp)(g) + MOV RA, (g_sched+gobuf_pc)(g) + MOV ZERO, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. 
+ MOV g, X10 + MOV g_m(g), T1 + MOV m_g0(T1), g + CALL runtime·save_g(SB) + BNE g, X10, 2(PC) + JMP runtime·badmcall(SB) + MOV 0(CTXT), T1 // code pointer + MOV (g_sched+gobuf_sp)(g), X2 // sp = m->g0->sched.sp + // we don't need special macro for regabi since arg0(X10) = g + ADD $-16, X2 + MOV X10, 8(X2) // setup g + MOV ZERO, 0(X2) // clear return address + JALR RA, T1 + JMP runtime·badmcall2(SB) + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes X31. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOV $runtime·systemstack_switch(SB), X31 + ADD $8, X31 // get past prologue + MOV X31, (g_sched+gobuf_pc)(g) + MOV X2, (g_sched+gobuf_sp)(g) + MOV ZERO, (g_sched+gobuf_lr)(g) + MOV ZERO, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. See func save. + MOV (g_sched+gobuf_ctxt)(g), X31 + BEQ ZERO, X31, 2(PC) + CALL runtime·abort(SB) + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + MOV fn+0(FP), X5 + MOV arg+8(FP), X10 + + MOV X2, X8 // save original stack pointer + MOV g, X9 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOV g_m(g), X6 + MOV m_gsignal(X6), X7 + BEQ X7, g, g0 + MOV m_g0(X6), X7 + BEQ X7, g, g0 + + CALL gosave_systemstack_switch<>(SB) + MOV X7, g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), X2 + + // Now on a scheduling stack (a pthread-created stack). +g0: + // Save room for two of our pointers. 
+ ADD $-16, X2 + MOV X9, 0(X2) // save old g on stack + MOV (g_stack+stack_hi)(X9), X9 + SUB X8, X9, X8 + MOV X8, 8(X2) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + + JALR RA, (X5) + + // Restore g, stack pointer. X10 is return value. + MOV 0(X2), g + CALL runtime·save_g(SB) + MOV (g_stack+stack_hi)(g), X5 + MOV 8(X2), X6 + SUB X6, X5, X6 + MOV X6, X2 + + MOVW X10, ret+16(FP) + RET + +// func asminit() +TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOV $MAXSIZE, T1 \ + BLTU T1, T0, 3(PC) \ + MOV $NAME(SB), T2; \ + JALR ZERO, T2 +// Note: can't just "BR NAME(SB)" - bad inlining results. + +// func call(stackArgsType *rtype, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +TEXT reflect·call(SB), NOSPLIT, $0-0 + JMP ·reflectcall(SB) + +// func call(stackArgsType *_type, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). 
+TEXT ·reflectcall(SB), NOSPLIT|NOFRAME, $0-48 + MOVWU frameSize+32(FP), T0 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOV $runtime·badreflectcall(SB), T2 + JALR ZERO, T2 + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOV stackArgs+16(FP), A1; \ + MOVWU stackArgsSize+24(FP), A2; \ + MOV X2, A3; \ + ADD $8, A3; \ + ADD A3, A2; \ + BEQ A3, A2, 6(PC); \ + MOVBU (A1), A4; \ + ADD $1, A1; \ + MOVB A4, (A3); \ + ADD $1, A3; \ + JMP -5(PC); \ + /* set up argument registers */ \ + MOV regArgs+40(FP), X25; \ + CALL ·unspillArgs(SB); \ + /* call function */ \ + MOV f+8(FP), CTXT; \ + MOV (CTXT), X25; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + JALR RA, X25; \ + /* copy return values back */ \ + MOV regArgs+40(FP), X25; \ + CALL ·spillArgs(SB); \ + MOV stackArgsType+0(FP), A5; \ + MOV stackArgs+16(FP), A1; \ + MOVWU stackArgsSize+24(FP), A2; \ + MOVWU stackRetOffset+28(FP), A4; \ + ADD $8, X2, A3; \ + ADD A4, A3; \ + 
ADD A4, A1; \ + SUB A4, A2; \ + CALL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $40-0 + NO_LOCAL_POINTERS + MOV A5, 8(X2) + MOV A1, 16(X2) + MOV A3, 24(X2) + MOV A2, 32(X2) + MOV X25, 40(X2) + CALL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT,$8 + // g (X27) and REG_TMP (X31) might be clobbered by load_g. + // X27 is callee-save in the gcc calling convention, so save it. + MOV g, savedX27-8(SP) + + CALL runtime·load_g(SB) + MOV g_m(g), X5 + MOV m_curg(X5), X5 + MOV (g_stack+stack_hi)(X5), X10 // return value in X10 + + MOV savedX27-8(SP), g + RET + +// func goexit(neverCallThisFunction) +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. 
+TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + MOV ZERO, ZERO // NOP + JMP runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + MOV ZERO, ZERO // NOP + +// func cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. + MOVBU runtime·iscgo(SB), X5 + BEQ ZERO, X5, nocgo + CALL runtime·load_g(SB) +nocgo: + + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + BEQ ZERO, g, needm + + MOV g_m(g), X5 + MOV X5, savedm-8(SP) + JMP havem + +needm: + MOV g, savedm-8(SP) // g is zero, so is m. + MOV $runtime·needm(SB), X6 + JALR RA, X6 + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOV g_m(g), X5 + MOV m_g0(X5), X6 + MOV X2, (g_sched+gobuf_sp)(X6) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 8(X2) aka savedsp-24(SP). 
+ MOV m_g0(X5), X6 + MOV (g_sched+gobuf_sp)(X6), X7 + MOV X7, savedsp-24(SP) // must match frame size + MOV X2, (g_sched+gobuf_sp)(X6) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. + // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOV m_curg(X5), g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), X6 // prepare stack as X6 + MOV (g_sched+gobuf_pc)(g), X7 + MOV X7, -(24+8)(X6) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOV fn+0(FP), X7 + MOV frame+8(FP), X8 + MOV ctxt+16(FP), X9 + MOV $-(24+8)(X6), X2 // switch stack; must match frame size + MOV X7, 8(X2) + MOV X8, 16(X2) + MOV X9, 24(X2) + CALL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOV 0(X2), X7 + MOV X7, (g_sched+gobuf_pc)(g) + MOV $(24+8)(X2), X6 // must match frame size + MOV X6, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOV g_m(g), X5 + MOV m_g0(X5), g + CALL runtime·save_g(SB) + MOV (g_sched+gobuf_sp)(g), X2 + MOV savedsp-24(SP), X6 // must match frame size + MOV X6, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. 
+ MOV savedm-8(SP), X5 + BNE ZERO, X5, droppedm + MOV $runtime·dropm(SB), X6 + JALR RA, X6 +droppedm: + + // Done! + RET + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + EBREAK + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + EBREAK + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOV gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + CALL runtime·save_g(SB) + RET + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOV $1, T0 + MOV T0, ret+0(FP) + RET + +// spillArgs stores return values from registers to a *internal/abi.RegArgs in X25. +TEXT ·spillArgs(SB),NOSPLIT,$0-0 + MOV X10, (0*8)(X25) + MOV X11, (1*8)(X25) + MOV X12, (2*8)(X25) + MOV X13, (3*8)(X25) + MOV X14, (4*8)(X25) + MOV X15, (5*8)(X25) + MOV X16, (6*8)(X25) + MOV X17, (7*8)(X25) + MOV X8, (8*8)(X25) + MOV X9, (9*8)(X25) + MOV X18, (10*8)(X25) + MOV X19, (11*8)(X25) + MOV X20, (12*8)(X25) + MOV X21, (13*8)(X25) + MOV X22, (14*8)(X25) + MOV X23, (15*8)(X25) + MOVD F10, (16*8)(X25) + MOVD F11, (17*8)(X25) + MOVD F12, (18*8)(X25) + MOVD F13, (19*8)(X25) + MOVD F14, (20*8)(X25) + MOVD F15, (21*8)(X25) + MOVD F16, (22*8)(X25) + MOVD F17, (23*8)(X25) + MOVD F8, (24*8)(X25) + MOVD F9, (25*8)(X25) + MOVD F18, (26*8)(X25) + MOVD F19, (27*8)(X25) + MOVD F20, (28*8)(X25) + MOVD F21, (29*8)(X25) + MOVD F22, (30*8)(X25) + MOVD F23, (31*8)(X25) + RET + +// unspillArgs loads args into registers from a *internal/abi.RegArgs in X25. 
+TEXT ·unspillArgs(SB),NOSPLIT,$0-0 + MOV (0*8)(X25), X10 + MOV (1*8)(X25), X11 + MOV (2*8)(X25), X12 + MOV (3*8)(X25), X13 + MOV (4*8)(X25), X14 + MOV (5*8)(X25), X15 + MOV (6*8)(X25), X16 + MOV (7*8)(X25), X17 + MOV (8*8)(X25), X8 + MOV (9*8)(X25), X9 + MOV (10*8)(X25), X18 + MOV (11*8)(X25), X19 + MOV (12*8)(X25), X20 + MOV (13*8)(X25), X21 + MOV (14*8)(X25), X22 + MOV (15*8)(X25), X23 + MOVD (16*8)(X25), F10 + MOVD (17*8)(X25), F11 + MOVD (18*8)(X25), F12 + MOVD (19*8)(X25), F13 + MOVD (20*8)(X25), F14 + MOVD (21*8)(X25), F15 + MOVD (22*8)(X25), F16 + MOVD (23*8)(X25), F17 + MOVD (24*8)(X25), F8 + MOVD (25*8)(X25), F9 + MOVD (26*8)(X25), F18 + MOVD (27*8)(X25), F19 + MOVD (28*8)(X25), F20 + MOVD (29*8)(X25), F21 + MOVD (30*8)(X25), F22 + MOVD (31*8)(X25), F23 + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - T0 is the destination of the write +// - T1 is the value being written at T0. +// It clobbers X31 (the linker temp register - REG_TMP). +// The act of CALLing gcWriteBarrier will clobber RA (LR). +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). +TEXT runtime·gcWriteBarrier<ABIInternal>(SB),NOSPLIT,$208 + // Save the registers clobbered by the fast path. + MOV A0, 24*8(X2) + MOV A1, 25*8(X2) + MOV g_m(g), A0 + MOV m_p(A0), A0 + MOV (p_wbBuf+wbBuf_next)(A0), A1 + // Increment wbBuf.next position. + ADD $16, A1 + MOV A1, (p_wbBuf+wbBuf_next)(A0) + MOV (p_wbBuf+wbBuf_end)(A0), A0 + MOV A0, T6 // T6 is linker temp register (REG_TMP) + // Record the write. + MOV T1, -16(A1) // Record value + MOV (T0), A0 // TODO: This turns bad writes into bad reads. + MOV A0, -8(A1) // Record *slot + // Is the buffer full? + BEQ A1, T6, flush +ret: + MOV 24*8(X2), A0 + MOV 25*8(X2), A1 + // Do the write.
+ MOV T1, (T0) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + MOV T0, 1*8(X2) // Also first argument to wbBufFlush + MOV T1, 2*8(X2) // Also second argument to wbBufFlush + // X0 is zero register + // X1 is LR, saved by prologue + // X2 is SP + // X3 is GP + // X4 is TP + // X5 is first arg to wbBufFlush (T0) + // X6 is second arg to wbBufFlush (T1) + MOV X7, 3*8(X2) + MOV X8, 4*8(X2) + MOV X9, 5*8(X2) + // X10 already saved (A0) + // X11 already saved (A1) + MOV X12, 6*8(X2) + MOV X13, 7*8(X2) + MOV X14, 8*8(X2) + MOV X15, 9*8(X2) + MOV X16, 10*8(X2) + MOV X17, 11*8(X2) + MOV X18, 12*8(X2) + MOV X19, 13*8(X2) + MOV X20, 14*8(X2) + MOV X21, 15*8(X2) + MOV X22, 16*8(X2) + MOV X23, 17*8(X2) + MOV X24, 18*8(X2) + MOV X25, 19*8(X2) + MOV X26, 20*8(X2) + // X27 is g. + MOV X28, 21*8(X2) + MOV X29, 22*8(X2) + MOV X30, 23*8(X2) + // X31 is tmp register. + + // This takes arguments T0 and T1. + CALL runtime·wbBufFlush(SB) + + MOV 1*8(X2), T0 + MOV 2*8(X2), T1 + MOV 3*8(X2), X7 + MOV 4*8(X2), X8 + MOV 5*8(X2), X9 + MOV 6*8(X2), X12 + MOV 7*8(X2), X13 + MOV 8*8(X2), X14 + MOV 9*8(X2), X15 + MOV 10*8(X2), X16 + MOV 11*8(X2), X17 + MOV 12*8(X2), X18 + MOV 13*8(X2), X19 + MOV 14*8(X2), X20 + MOV 15*8(X2), X21 + MOV 16*8(X2), X22 + MOV 17*8(X2), X23 + MOV 18*8(X2), X24 + MOV 19*8(X2), X25 + MOV 20*8(X2), X26 + MOV 21*8(X2), X28 + MOV 22*8(X2), X29 + MOV 23*8(X2), X30 + + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers (ssa/gen/RISCV64Ops.go), but the space for those +// arguments is allocated in the caller's stack frame. +// These stubs write the args into that stack space and then tail call to the +// corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces.
+TEXT runtime·panicIndex<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP runtime·goPanicIndex<ABIInternal>(SB) +TEXT runtime·panicIndexU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP runtime·goPanicIndexU<ABIInternal>(SB) +TEXT runtime·panicSliceAlen<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSliceAlen<ABIInternal>(SB) +TEXT runtime·panicSliceAlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSliceAlenU<ABIInternal>(SB) +TEXT runtime·panicSliceAcap<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSliceAcap<ABIInternal>(SB) +TEXT runtime·panicSliceAcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSliceAcapU<ABIInternal>(SB) +TEXT runtime·panicSliceB<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP runtime·goPanicSliceB<ABIInternal>(SB) +TEXT runtime·panicSliceBU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP runtime·goPanicSliceBU<ABIInternal>(SB) +TEXT runtime·panicSlice3Alen<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T2, X10 + MOV T3, X11 + JMP runtime·goPanicSlice3Alen<ABIInternal>(SB) +TEXT runtime·panicSlice3AlenU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T2, X10 + MOV T3, X11 + JMP runtime·goPanicSlice3AlenU<ABIInternal>(SB) +TEXT runtime·panicSlice3Acap<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T2, X10 + MOV T3, X11 + JMP runtime·goPanicSlice3Acap<ABIInternal>(SB) +TEXT runtime·panicSlice3AcapU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T2, X10 + MOV T3, X11 + JMP runtime·goPanicSlice3AcapU<ABIInternal>(SB) +TEXT runtime·panicSlice3B<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSlice3B<ABIInternal>(SB) +TEXT runtime·panicSlice3BU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T1, X10 + MOV T2, X11 + JMP runtime·goPanicSlice3BU<ABIInternal>(SB) +TEXT runtime·panicSlice3C<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP 
runtime·goPanicSlice3C<ABIInternal>(SB) +TEXT runtime·panicSlice3CU<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T0, X10 + MOV T1, X11 + JMP runtime·goPanicSlice3CU<ABIInternal>(SB) +TEXT runtime·panicSliceConvert<ABIInternal>(SB),NOSPLIT,$0-16 + MOV T2, X10 + MOV T3, X11 + JMP runtime·goPanicSliceConvert<ABIInternal>(SB) + +DATA runtime·mainPC+0(SB)/8,$runtime·main<ABIInternal>(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 diff --git a/src/runtime/asm_s390x.s b/src/runtime/asm_s390x.s new file mode 100644 index 0000000..334e1aa --- /dev/null +++ b/src/runtime/asm_s390x.s @@ -0,0 +1,904 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// _rt0_s390x_lib is common startup code for s390x systems when +// using -buildmode=c-archive or -buildmode=c-shared. The linker will +// arrange to invoke this function as a global constructor (for +// c-archive) or when the shared library is loaded (for c-shared). +// We expect argc and argv to be passed in the usual C ABI registers +// R2 and R3. +TEXT _rt0_s390x_lib(SB), NOSPLIT|NOFRAME, $0 + STMG R6, R15, 48(R15) + MOVD R2, _rt0_s390x_lib_argc<>(SB) + MOVD R3, _rt0_s390x_lib_argv<>(SB) + + // Save R6-R15 in the register save area of the calling function. + STMG R6, R15, 48(R15) + + // Allocate 80 bytes on the stack. + MOVD $-80(R15), R15 + + // Save F8-F15 in our stack frame. + FMOVD F8, 16(R15) + FMOVD F9, 24(R15) + FMOVD F10, 32(R15) + FMOVD F11, 40(R15) + FMOVD F12, 48(R15) + FMOVD F13, 56(R15) + FMOVD F14, 64(R15) + FMOVD F15, 72(R15) + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R1 + BL R1 + + // Create a new thread to finish Go runtime initialization. 
+ MOVD _cgo_sys_thread_create(SB), R1 + CMP R1, $0 + BEQ nocgo + MOVD $_rt0_s390x_lib_go(SB), R2 + MOVD $0, R3 + BL R1 + BR restore + +nocgo: + MOVD $0x800000, R1 // stacksize + MOVD R1, 0(R15) + MOVD $_rt0_s390x_lib_go(SB), R1 + MOVD R1, 8(R15) // fn + MOVD $runtime·newosproc(SB), R1 + BL R1 + +restore: + // Restore F8-F15 from our stack frame. + FMOVD 16(R15), F8 + FMOVD 24(R15), F9 + FMOVD 32(R15), F10 + FMOVD 40(R15), F11 + FMOVD 48(R15), F12 + FMOVD 56(R15), F13 + FMOVD 64(R15), F14 + FMOVD 72(R15), F15 + MOVD $80(R15), R15 + + // Restore R6-R15. + LMG 48(R15), R6, R15 + RET + +// _rt0_s390x_lib_go initializes the Go runtime. +// This is started in a separate thread by _rt0_s390x_lib. +TEXT _rt0_s390x_lib_go(SB), NOSPLIT|NOFRAME, $0 + MOVD _rt0_s390x_lib_argc<>(SB), R2 + MOVD _rt0_s390x_lib_argv<>(SB), R3 + MOVD $runtime·rt0_go(SB), R1 + BR R1 + +DATA _rt0_s390x_lib_argc<>(SB)/8, $0 +GLOBL _rt0_s390x_lib_argc<>(SB), NOPTR, $8 +DATA _rt0_s390x_lib_argv<>(SB)/8, $0 +GLOBL _rt0_s390x_lib_argv<>(SB), NOPTR, $8 + +TEXT runtime·rt0_go(SB),NOSPLIT|TOPFRAME,$0 + // R2 = argc; R3 = argv; R11 = temp; R13 = g; R15 = stack pointer + // C TLS base pointer in AR0:AR1 + + // initialize essential registers + XOR R0, R0 + + SUB $24, R15 + MOVW R2, 8(R15) // argc + MOVD R3, 16(R15) // argv + + // create istack out of the given (operating system) stack. + // _cgo_init may update stackguard. + MOVD $runtime·g0(SB), g + MOVD R15, R11 + SUB $(64*1024), R11 + MOVD R11, g_stackguard0(g) + MOVD R11, g_stackguard1(g) + MOVD R11, (g_stack+stack_lo)(g) + MOVD R15, (g_stack+stack_hi)(g) + + // if there is a _cgo_init, call it using the gcc ABI.
+ MOVD _cgo_init(SB), R11 + CMPBEQ R11, $0, nocgo + MOVW AR0, R4 // (AR0 << 32 | AR1) is the TLS base pointer; MOVD is translated to EAR + SLD $32, R4, R4 + MOVW AR1, R4 // arg 2: TLS base pointer + MOVD $setg_gcc<>(SB), R3 // arg 1: setg + MOVD g, R2 // arg 0: G + // C functions expect 160 bytes of space on caller stack frame + // and an 8-byte aligned stack pointer + MOVD R15, R9 // save current stack (R9 is preserved in the Linux ABI) + SUB $160, R15 // reserve 160 bytes + MOVD $~7, R6 + AND R6, R15 // 8-byte align + BL R11 // this call clobbers volatile registers according to Linux ABI (R0-R5, R14) + MOVD R9, R15 // restore stack + XOR R0, R0 // zero R0 + +nocgo: + // update stackguard after _cgo_init + MOVD (g_stack+stack_lo)(g), R2 + ADD $const__StackGuard, R2 + MOVD R2, g_stackguard0(g) + MOVD R2, g_stackguard1(g) + + // set the per-goroutine and per-mach "registers" + MOVD $runtime·m0(SB), R2 + + // save m->g0 = g0 + MOVD g, m_g0(R2) + // save m0 to g0->m + MOVD R2, g_m(g) + + BL runtime·check(SB) + + // argc/argv are already prepared on stack + BL runtime·args(SB) + BL runtime·osinit(SB) + BL runtime·schedinit(SB) + + // create a new goroutine to start program + MOVD $runtime·mainPC(SB), R2 // entry + SUB $16, R15 + MOVD R2, 8(R15) + MOVD $0, 0(R15) + BL runtime·newproc(SB) + ADD $16, R15 + + // start this M + BL runtime·mstart(SB) + + MOVD $0, 1(R0) + RET + +DATA runtime·mainPC+0(SB)/8,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +TEXT runtime·breakpoint(SB),NOSPLIT|NOFRAME,$0-0 + MOVD $0, 2(R0) + RET + +TEXT runtime·asminit(SB),NOSPLIT|NOFRAME,$0-0 + RET + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + CALL runtime·mstart0(SB) + RET // not reached + +/* + * go-routine + */ + +// void gogo(Gobuf*) +// restore state from Gobuf; longjmp +TEXT runtime·gogo(SB), NOSPLIT|NOFRAME, $0-8 + MOVD buf+0(FP), R5 + MOVD gobuf_g(R5), R6 + MOVD 0(R6), R7 // make sure g != nil + BR gogo<>(SB) + +TEXT gogo<>(SB), NOSPLIT|NOFRAME, $0 + MOVD R6, g + BL 
runtime·save_g(SB) + + MOVD 0(g), R4 + MOVD gobuf_sp(R5), R15 + MOVD gobuf_lr(R5), LR + MOVD gobuf_ret(R5), R3 + MOVD gobuf_ctxt(R5), R12 + MOVD $0, gobuf_sp(R5) + MOVD $0, gobuf_ret(R5) + MOVD $0, gobuf_lr(R5) + MOVD $0, gobuf_ctxt(R5) + CMP R0, R0 // set condition codes for == test, needed by stack split + MOVD gobuf_pc(R5), R6 + BR (R6) + +// void mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. +TEXT runtime·mcall(SB), NOSPLIT, $-8-8 + // Save caller state in g->sched + MOVD R15, (g_sched+gobuf_sp)(g) + MOVD LR, (g_sched+gobuf_pc)(g) + MOVD $0, (g_sched+gobuf_lr)(g) + + // Switch to m->g0 & its stack, call fn. + MOVD g, R3 + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + CMP g, R3 + BNE 2(PC) + BR runtime·badmcall(SB) + MOVD fn+0(FP), R12 // context + MOVD 0(R12), R4 // code pointer + MOVD (g_sched+gobuf_sp)(g), R15 // sp = m->g0->sched.sp + SUB $16, R15 + MOVD R3, 8(R15) + MOVD $0, 0(R15) + BL (R4) + BR runtime·badmcall2(SB) + +// systemstack_switch is a dummy routine that systemstack leaves at the bottom +// of the G stack. We need to distinguish the routine that +// lives at the bottom of the G stack from the one that lives +// at the top of the system stack because the one at the top of +// the system stack terminates the stack walk (see topofstack()). +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + UNDEF + BL (LR) // make sure this function is not leaf + RET + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + MOVD fn+0(FP), R3 // R3 = fn + MOVD R3, R12 // context + MOVD g_m(g), R4 // R4 = m + + MOVD m_gsignal(R4), R5 // R5 = gsignal + CMPBEQ g, R5, noswitch + + MOVD m_g0(R4), R5 // R5 = g0 + CMPBEQ g, R5, noswitch + + MOVD m_curg(R4), R6 + CMPBEQ g, R6, switch + + // Bad: g is not gsignal, not g0, not curg. What is it? + // Hide call from linker nosplit analysis. 
+ MOVD $runtime·badsystemstack(SB), R3 + BL (R3) + BL runtime·abort(SB) + +switch: + // save our state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. + BL gosave_systemstack_switch<>(SB) + + // switch to g0 + MOVD R5, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R15 + + // call target function + MOVD 0(R12), R3 // code pointer + BL (R3) + + // switch back to g + MOVD g_m(g), R3 + MOVD m_curg(R3), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R15 + MOVD $0, (g_sched+gobuf_sp)(g) + RET + +noswitch: + // already on m stack, just call directly + // Using a tail call here cleans up tracebacks since we won't stop + // at an intermediate systemstack. + MOVD 0(R12), R3 // code pointer + MOVD 0(R15), LR // restore LR + ADD $8, R15 + BR (R3) + +/* + * support for morestack + */ + +// Called during function prolog when more stack is needed. +// Caller has already loaded: +// R3: framesize, R4: argsize, R5: LR +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB),NOSPLIT|NOFRAME,$0-0 + // Cannot grow scheduler stack (m->g0). + MOVD g_m(g), R7 + MOVD m_g0(R7), R8 + CMPBNE g, R8, 3(PC) + BL runtime·badmorestackg0(SB) + BL runtime·abort(SB) + + // Cannot grow signal stack (m->gsignal). + MOVD m_gsignal(R7), R8 + CMP g, R8 + BNE 3(PC) + BL runtime·badmorestackgsignal(SB) + BL runtime·abort(SB) + + // Called from f. + // Set g->sched to context in f. + MOVD R15, (g_sched+gobuf_sp)(g) + MOVD LR, R8 + MOVD R8, (g_sched+gobuf_pc)(g) + MOVD R5, (g_sched+gobuf_lr)(g) + MOVD R12, (g_sched+gobuf_ctxt)(g) + + // Called from f. + // Set m->morebuf to f's caller. 
+ MOVD R5, (m_morebuf+gobuf_pc)(R7) // f's caller's PC + MOVD R15, (m_morebuf+gobuf_sp)(R7) // f's caller's SP + MOVD g, (m_morebuf+gobuf_g)(R7) + + // Call newstack on m->g0's stack. + MOVD m_g0(R7), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R15 + // Create a stack frame on g0 to call newstack. + MOVD $0, -8(R15) // Zero saved LR in frame + SUB $8, R15 + BL runtime·newstack(SB) + + // Not reached, but make sure the return PC from the call to newstack + // is still in this function, and not the beginning of the next. + UNDEF + +TEXT runtime·morestack_noctxt(SB),NOSPLIT|NOFRAME,$0-0 + // Force SPWRITE. This function doesn't actually write SP, + // but it is called with a special calling convention where + // the caller doesn't save LR on stack but passes it as a + // register (R5), and the unwinder currently doesn't understand. + // Make it SPWRITE to stop unwinding. (See issue 54332) + MOVD R15, R15 + + MOVD $0, R12 + BR runtime·morestack(SB) + +// reflectcall: call a function with the given argument list +// func call(stackArgsType *_type, f *FuncVal, stackArgs *byte, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs). +// we don't have variable-sized frames, so we use a small number +// of constant-sized-frame functions to encode a few bits of size in the pc. +// Caution: ugly multiline assembly macros in your future! + +#define DISPATCH(NAME,MAXSIZE) \ + MOVD $MAXSIZE, R4; \ + CMP R3, R4; \ + BGT 3(PC); \ + MOVD $NAME(SB), R5; \ + BR (R5) +// Note: can't just "BR NAME(SB)" - bad inlining results. 
+ +TEXT ·reflectcall(SB), NOSPLIT, $-8-48 + MOVWZ frameSize+32(FP), R3 + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + MOVD $runtime·badreflectcall(SB), R5 + BR (R5) + +#define CALLFN(NAME,MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + /* copy arguments to stack */ \ + MOVD stackArgs+16(FP), R4; \ + MOVWZ stackArgsSize+24(FP), R5; \ + MOVD $stack-MAXSIZE(SP), R6; \ +loopArgs: /* copy 256 bytes at a time */ \ + CMP R5, $256; \ + BLT tailArgs; \ + SUB $256, R5; \ + MVC $256, 0(R4), 0(R6); \ + MOVD $256(R4), R4; \ + MOVD $256(R6), R6; \ + BR loopArgs; \ +tailArgs: /* copy remaining bytes */ \ + CMP R5, $0; \ + BEQ callFunction; \ + SUB $1, R5; \ + EXRL $callfnMVC<>(SB), R5; \ +callFunction: \ + MOVD f+8(FP), R12; \ + MOVD (R12), R8; \ + PCDATA $PCDATA_StackMapIndex, $0; \ + BL (R8); \ + /* copy return values back */ \ + MOVD stackArgsType+0(FP), R7; \ + MOVD stackArgs+16(FP), R6; \ + MOVWZ stackArgsSize+24(FP), R5; \ + MOVD $stack-MAXSIZE(SP), R4; \ + MOVWZ 
stackRetOffset+28(FP), R1; \ + ADD R1, R4; \ + ADD R1, R6; \ + SUB R1, R5; \ + BL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. +TEXT callRet<>(SB), NOSPLIT, $40-0 + MOVD R7, 8(R15) + MOVD R6, 16(R15) + MOVD R4, 24(R15) + MOVD R5, 32(R15) + MOVD $0, 40(R15) + BL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +// Not a function: target for EXRL (execute relative long) instruction. +TEXT callfnMVC<>(SB),NOSPLIT|NOFRAME,$0-0 + MVC $1, 0(R4), 0(R6) + +TEXT runtime·procyield(SB),NOSPLIT,$0-0 + RET + +// Save state of caller into g->sched, +// but using fake PC from systemstack_switch. +// Must only be called from functions with no locals ($0) +// or else unwinding from systemstack_switch is incorrect. +// Smashes R1. +TEXT gosave_systemstack_switch<>(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·systemstack_switch(SB), R1 + ADD $16, R1 // get past prologue + MOVD R1, (g_sched+gobuf_pc)(g) + MOVD R15, (g_sched+gobuf_sp)(g) + MOVD $0, (g_sched+gobuf_lr)(g) + MOVD $0, (g_sched+gobuf_ret)(g) + // Assert ctxt is zero. 
See func save. + MOVD (g_sched+gobuf_ctxt)(g), R1 + CMPBEQ R1, $0, 2(PC) + BL runtime·abort(SB) + RET + +// func asmcgocall(fn, arg unsafe.Pointer) int32 +// Call fn(arg) on the scheduler stack, +// aligned appropriately for the gcc ABI. +// See cgocall.go for more details. +TEXT ·asmcgocall(SB),NOSPLIT,$0-20 + // R2 = argc; R3 = argv; R11 = temp; R13 = g; R15 = stack pointer + // C TLS base pointer in AR0:AR1 + MOVD fn+0(FP), R3 + MOVD arg+8(FP), R4 + + MOVD R15, R2 // save original stack pointer + MOVD g, R5 + + // Figure out if we need to switch to m->g0 stack. + // We get called to create new OS threads too, and those + // come in on the m->g0 stack already. Or we might already + // be on the m->gsignal stack. + MOVD g_m(g), R6 + MOVD m_gsignal(R6), R7 + CMPBEQ R7, g, g0 + MOVD m_g0(R6), R7 + CMPBEQ R7, g, g0 + BL gosave_systemstack_switch<>(SB) + MOVD R7, g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R15 + + // Now on a scheduling stack (a pthread-created stack). +g0: + // Save room for two of our pointers, plus 160 bytes of callee + // save area that lives on the caller stack. + SUB $176, R15 + MOVD $~7, R6 + AND R6, R15 // 8-byte alignment for gcc ABI + MOVD R5, 168(R15) // save old g on stack + MOVD (g_stack+stack_hi)(R5), R5 + SUB R2, R5 + MOVD R5, 160(R15) // save depth in old g stack (can't just save SP, as stack might be copied during a callback) + MOVD $0, 0(R15) // clear back chain pointer (TODO can we give it real back trace information?) + MOVD R4, R2 // arg in R2 + BL R3 // can clobber: R0-R5, R14, F0-F3, F5, F7-F15 + + XOR R0, R0 // set R0 back to 0. + // Restore g, stack pointer. + MOVD 168(R15), g + BL runtime·save_g(SB) + MOVD (g_stack+stack_hi)(g), R5 + MOVD 160(R15), R6 + SUB R6, R5 + MOVD R5, R15 + + MOVW R2, ret+16(FP) + RET + +// cgocallback(fn, frame unsafe.Pointer, ctxt uintptr) +// See cgocall.go for more details. +TEXT ·cgocallback(SB),NOSPLIT,$24-24 + NO_LOCAL_POINTERS + + // Load m and g from thread-local storage. 
+ MOVB runtime·iscgo(SB), R3 + CMPBEQ R3, $0, nocgo + BL runtime·load_g(SB) + +nocgo: + // If g is nil, Go did not create the current thread. + // Call needm to obtain one for temporary use. + // In this case, we're running on the thread stack, so there's + // lots of space, but the linker doesn't know. Hide the call from + // the linker analysis by using an indirect call. + CMPBEQ g, $0, needm + + MOVD g_m(g), R8 + MOVD R8, savedm-8(SP) + BR havem + +needm: + MOVD g, savedm-8(SP) // g is zero, so is m. + MOVD $runtime·needm(SB), R3 + BL (R3) + + // Set m->sched.sp = SP, so that if a panic happens + // during the function we are about to execute, it will + // have a valid SP to run on the g0 stack. + // The next few lines (after the havem label) + // will save this SP onto the stack and then write + // the same SP back to m->sched.sp. That seems redundant, + // but if an unrecovered panic happens, unwindm will + // restore the g->sched.sp from the stack location + // and then systemstack will try to use it. If we don't set it here, + // that restored SP will be uninitialized (typically 0) and + // will not be usable. + MOVD g_m(g), R8 + MOVD m_g0(R8), R3 + MOVD R15, (g_sched+gobuf_sp)(R3) + +havem: + // Now there's a valid m, and we're running on its m->g0. + // Save current m->g0->sched.sp on stack and then set it to SP. + // Save current sp in m->g0->sched.sp in preparation for + // switch back to m->curg stack. + // NOTE: unwindm knows that the saved g->sched.sp is at 8(R1) aka savedsp-16(SP). + MOVD m_g0(R8), R3 + MOVD (g_sched+gobuf_sp)(R3), R4 + MOVD R4, savedsp-24(SP) // must match frame size + MOVD R15, (g_sched+gobuf_sp)(R3) + + // Switch to m->curg stack and call runtime.cgocallbackg. + // Because we are taking over the execution of m->curg + // but *not* resuming what had been running, we need to + // save that information (m->curg->sched) so we can restore it. 
+ // We can restore m->curg->sched.sp easily, because calling + // runtime.cgocallbackg leaves SP unchanged upon return. + // To save m->curg->sched.pc, we push it onto the curg stack and + // open a frame the same size as cgocallback's g0 frame. + // Once we switch to the curg stack, the pushed PC will appear + // to be the return PC of cgocallback, so that the traceback + // will seamlessly trace back into the earlier calls. + MOVD m_curg(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R4 // prepare stack as R4 + MOVD (g_sched+gobuf_pc)(g), R5 + MOVD R5, -(24+8)(R4) // "saved LR"; must match frame size + // Gather our arguments into registers. + MOVD fn+0(FP), R1 + MOVD frame+8(FP), R2 + MOVD ctxt+16(FP), R3 + MOVD $-(24+8)(R4), R15 // switch stack; must match frame size + MOVD R1, 8(R15) + MOVD R2, 16(R15) + MOVD R3, 24(R15) + BL runtime·cgocallbackg(SB) + + // Restore g->sched (== m->curg->sched) from saved values. + MOVD 0(R15), R5 + MOVD R5, (g_sched+gobuf_pc)(g) + MOVD $(24+8)(R15), R4 // must match frame size + MOVD R4, (g_sched+gobuf_sp)(g) + + // Switch back to m->g0's stack and restore m->g0->sched.sp. + // (Unlike m->curg, the g0 goroutine never uses sched.pc, + // so we do not have to restore it.) + MOVD g_m(g), R8 + MOVD m_g0(R8), g + BL runtime·save_g(SB) + MOVD (g_sched+gobuf_sp)(g), R15 + MOVD savedsp-24(SP), R4 // must match frame size + MOVD R4, (g_sched+gobuf_sp)(g) + + // If the m on entry was nil, we called needm above to borrow an m + // for the duration of the call. Since the call is over, return it with dropm. + MOVD savedm-8(SP), R6 + CMPBNE R6, $0, droppedm + MOVD $runtime·dropm(SB), R3 + BL (R3) +droppedm: + + // Done! + RET + +// void setg(G*); set g. for use by needm. +TEXT runtime·setg(SB), NOSPLIT, $0-8 + MOVD gg+0(FP), g + // This only happens if iscgo, so jump straight to save_g + BL runtime·save_g(SB) + RET + +// void setg_gcc(G*); set g in C TLS. +// Must obey the gcc calling convention. 
+TEXT setg_gcc<>(SB),NOSPLIT|NOFRAME,$0-0 + // The standard prologue clobbers LR (R14), which is callee-save in + // the C ABI, so we have to use NOFRAME and save LR ourselves. + MOVD LR, R1 + // Also save g, R10, and R11 since they're callee-save in C ABI + MOVD R10, R3 + MOVD g, R4 + MOVD R11, R5 + + MOVD R2, g + BL runtime·save_g(SB) + + MOVD R5, R11 + MOVD R4, g + MOVD R3, R10 + MOVD R1, LR + RET + +TEXT runtime·abort(SB),NOSPLIT|NOFRAME,$0-0 + MOVW (R0), R0 + UNDEF + +// int64 runtime·cputicks(void) +TEXT runtime·cputicks(SB),NOSPLIT,$0-8 + // The TOD clock on s390 counts from the year 1900 in ~250ps intervals. + // This means that since about 1972 the msb has been set, making the + // result of a call to STORE CLOCK (stck) a negative number. + // We clear the msb to make it positive. + STCK ret+0(FP) // serialises before and after call + MOVD ret+0(FP), R3 // R3 will wrap to 0 in the year 2043 + SLD $1, R3 + SRD $1, R3 + MOVD R3, ret+0(FP) + RET + +// AES hashing not implemented for s390x +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB), NOSPLIT, $0 + MOVW $0, R3 + RET + +// Called from cgo wrappers, this function returns g->m->curg.stack.hi. +// Must obey the gcc calling convention. +TEXT _cgo_topofstack(SB),NOSPLIT|NOFRAME,$0 + // g (R13), R10, R11 and LR (R14) are callee-save in the C ABI, so save them + MOVD g, R1 + MOVD R10, R3 + MOVD LR, R4 + MOVD R11, R5 + + BL runtime·load_g(SB) // clobbers g (R13), R10, R11 + MOVD g_m(g), R2 + MOVD m_curg(R2), R2 + MOVD (g_stack+stack_hi)(R2), R2 + + MOVD R1, g + MOVD R3, R10 + MOVD R4, LR + MOVD R5, R11 + RET + +// The top-most function running on a goroutine +// returns to goexit+PCQuantum. 
+TEXT runtime·goexit(SB),NOSPLIT|NOFRAME|TOPFRAME,$0-0 + BYTE $0x07; BYTE $0x00; // 2-byte nop + BL runtime·goexit1(SB) // does not return + // traceback from goexit1 must hit code range of goexit + BYTE $0x07; BYTE $0x00; // 2-byte nop + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + // Stores are already ordered on s390x, so this is just a + // compile barrier. + RET + +// This is called from .init_array and follows the platform, not Go, ABI. +// We are overly conservative. We could only save the registers we use. +// However, since this function is only called once per loaded module +// performance is unimportant. +TEXT runtime·addmoduledata(SB),NOSPLIT|NOFRAME,$0-0 + // Save R6-R15 in the register save area of the calling function. + // Don't bother saving F8-F15 as we aren't doing any calls. + STMG R6, R15, 48(R15) + + // append the argument (passed in R2, as per the ELF ABI) to the + // moduledata linked list. + MOVD runtime·lastmoduledatap(SB), R1 + MOVD R2, moduledata_next(R1) + MOVD R2, runtime·lastmoduledatap(SB) + + // Restore R6-R15. + LMG 48(R15), R6, R15 + RET + +TEXT ·checkASM(SB),NOSPLIT,$0-1 + MOVB $1, ret+0(FP) + RET + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. It takes two arguments: +// - R2 is the destination of the write +// - R3 is the value being written at R2. +// It clobbers R10 (the temp register) and R1 (used by PLT stub). +// It does not clobber any other general-purpose registers, +// but may clobber others (e.g., floating point registers). +TEXT runtime·gcWriteBarrier(SB),NOSPLIT,$96 + // Save the registers clobbered by the fast path. + MOVD R4, 96(R15) + MOVD g_m(g), R1 + MOVD m_p(R1), R1 + // Increment wbBuf.next position. + MOVD $16, R4 + ADD (p_wbBuf+wbBuf_next)(R1), R4 + MOVD R4, (p_wbBuf+wbBuf_next)(R1) + MOVD (p_wbBuf+wbBuf_end)(R1), R1 + // Record the write. 
+ MOVD R3, -16(R4) // Record value + MOVD (R2), R10 // TODO: This turns bad writes into bad reads. + MOVD R10, -8(R4) // Record *slot + // Is the buffer full? + CMPBEQ R4, R1, flush +ret: + MOVD 96(R15), R4 + // Do the write. + MOVD R3, (R2) + RET + +flush: + // Save all general purpose registers since these could be + // clobbered by wbBufFlush and were not saved by the caller. + STMG R2, R3, 8(R15) // set R2 and R3 as arguments for wbBufFlush + MOVD R0, 24(R15) + // R1 already saved. + // R4 already saved. + STMG R5, R12, 32(R15) // save R5 - R12 + // R13 is g. + // R14 is LR. + // R15 is SP. + + // This takes arguments R2 and R3. + CALL runtime·wbBufFlush(SB) + + LMG 8(R15), R2, R3 // restore R2 - R3 + MOVD 24(R15), R0 // restore R0 + LMG 32(R15), R5, R12 // restore R5 - R12 + JMP ret + +// Note: these functions use a special calling convention to save generated code space. +// Arguments are passed in registers, but the space for those arguments are allocated +// in the caller's stack frame. These stubs write the args into that stack space and +// then tail call to the corresponding runtime handler. +// The tail call makes these stubs disappear in backtraces. 
+TEXT runtime·panicIndex(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicIndex(SB) +TEXT runtime·panicIndexU(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicIndexU(SB) +TEXT runtime·panicSliceAlen(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSliceAlen(SB) +TEXT runtime·panicSliceAlenU(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSliceAlenU(SB) +TEXT runtime·panicSliceAcap(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSliceAcap(SB) +TEXT runtime·panicSliceAcapU(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSliceAcapU(SB) +TEXT runtime·panicSliceB(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicSliceB(SB) +TEXT runtime·panicSliceBU(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicSliceBU(SB) +TEXT runtime·panicSlice3Alen(SB),NOSPLIT,$0-16 + MOVD R2, x+0(FP) + MOVD R3, y+8(FP) + JMP runtime·goPanicSlice3Alen(SB) +TEXT runtime·panicSlice3AlenU(SB),NOSPLIT,$0-16 + MOVD R2, x+0(FP) + MOVD R3, y+8(FP) + JMP runtime·goPanicSlice3AlenU(SB) +TEXT runtime·panicSlice3Acap(SB),NOSPLIT,$0-16 + MOVD R2, x+0(FP) + MOVD R3, y+8(FP) + JMP runtime·goPanicSlice3Acap(SB) +TEXT runtime·panicSlice3AcapU(SB),NOSPLIT,$0-16 + MOVD R2, x+0(FP) + MOVD R3, y+8(FP) + JMP runtime·goPanicSlice3AcapU(SB) +TEXT runtime·panicSlice3B(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSlice3B(SB) +TEXT runtime·panicSlice3BU(SB),NOSPLIT,$0-16 + MOVD R1, x+0(FP) + MOVD R2, y+8(FP) + JMP runtime·goPanicSlice3BU(SB) +TEXT runtime·panicSlice3C(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicSlice3C(SB) +TEXT runtime·panicSlice3CU(SB),NOSPLIT,$0-16 + MOVD R0, x+0(FP) + MOVD R1, y+8(FP) + JMP runtime·goPanicSlice3CU(SB) +TEXT runtime·panicSliceConvert(SB),NOSPLIT,$0-16 + MOVD R2, x+0(FP) + MOVD R3, y+8(FP) + JMP 
runtime·goPanicSliceConvert(SB) diff --git a/src/runtime/asm_wasm.s b/src/runtime/asm_wasm.s new file mode 100644 index 0000000..e075c72 --- /dev/null +++ b/src/runtime/asm_wasm.s @@ -0,0 +1,445 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +TEXT runtime·rt0_go(SB), NOSPLIT|NOFRAME|TOPFRAME, $0 + // save m->g0 = g0 + MOVD $runtime·g0(SB), runtime·m0+m_g0(SB) + // save m0 to g0->m + MOVD $runtime·m0(SB), runtime·g0+g_m(SB) + // set g to g0 + MOVD $runtime·g0(SB), g + CALLNORESUME runtime·check(SB) + CALLNORESUME runtime·args(SB) + CALLNORESUME runtime·osinit(SB) + CALLNORESUME runtime·schedinit(SB) + MOVD $runtime·mainPC(SB), 0(SP) + CALLNORESUME runtime·newproc(SB) + CALL runtime·mstart(SB) // WebAssembly stack will unwind when switching to another goroutine + UNDEF + +TEXT runtime·mstart(SB),NOSPLIT|TOPFRAME,$0 + CALL runtime·mstart0(SB) + RET // not reached + +DATA runtime·mainPC+0(SB)/8,$runtime·main(SB) +GLOBL runtime·mainPC(SB),RODATA,$8 + +// func checkASM() bool +TEXT ·checkASM(SB), NOSPLIT, $0-1 + MOVB $1, ret+0(FP) + RET + +TEXT runtime·gogo(SB), NOSPLIT, $0-8 + MOVD buf+0(FP), R0 + MOVD gobuf_g(R0), R1 + MOVD 0(R1), R2 // make sure g != nil + MOVD R1, g + MOVD gobuf_sp(R0), SP + + // Put target PC at -8(SP), wasm_pc_f_loop will pick it up + Get SP + I32Const $8 + I32Sub + I64Load gobuf_pc(R0) + I64Store $0 + + MOVD gobuf_ret(R0), RET0 + MOVD gobuf_ctxt(R0), CTXT + // clear to help garbage collector + MOVD $0, gobuf_sp(R0) + MOVD $0, gobuf_ret(R0) + MOVD $0, gobuf_ctxt(R0) + + I32Const $1 + Return + +// func mcall(fn func(*g)) +// Switch to m->g0's stack, call fn(g). +// Fn must never return. It should gogo(&g->sched) +// to keep running g. 
+TEXT runtime·mcall(SB), NOSPLIT, $0-8 + // CTXT = fn + MOVD fn+0(FP), CTXT + // R1 = g.m + MOVD g_m(g), R1 + // R2 = g0 + MOVD m_g0(R1), R2 + + // save state in g->sched + MOVD 0(SP), g_sched+gobuf_pc(g) // caller's PC + MOVD $fn+0(FP), g_sched+gobuf_sp(g) // caller's SP + + // if g == g0 call badmcall + Get g + Get R2 + I64Eq + If + JMP runtime·badmcall(SB) + End + + // switch to g0's stack + I64Load (g_sched+gobuf_sp)(R2) + I64Const $8 + I64Sub + I32WrapI64 + Set SP + + // set arg to current g + MOVD g, 0(SP) + + // switch to g0 + MOVD R2, g + + // call fn + Get CTXT + I32WrapI64 + I64Load $0 + CALL + + Get SP + I32Const $8 + I32Add + Set SP + + JMP runtime·badmcall2(SB) + +// func systemstack(fn func()) +TEXT runtime·systemstack(SB), NOSPLIT, $0-8 + // R0 = fn + MOVD fn+0(FP), R0 + // R1 = g.m + MOVD g_m(g), R1 + // R2 = g0 + MOVD m_g0(R1), R2 + + // if g == g0 + Get g + Get R2 + I64Eq + If + // no switch: + MOVD R0, CTXT + + Get CTXT + I32WrapI64 + I64Load $0 + JMP + End + + // if g != m.curg + Get g + I64Load m_curg(R1) + I64Ne + If + CALLNORESUME runtime·badsystemstack(SB) + End + + // switch: + + // save state in g->sched. Pretend to + // be systemstack_switch if the G stack is scanned. 
+ MOVD $runtime·systemstack_switch(SB), g_sched+gobuf_pc(g) + + MOVD SP, g_sched+gobuf_sp(g) + + // switch to g0 + MOVD R2, g + + // make it look like mstart called systemstack on g0, to stop traceback + I64Load (g_sched+gobuf_sp)(R2) + I64Const $8 + I64Sub + Set R3 + + MOVD $runtime·mstart(SB), 0(R3) + MOVD R3, SP + + // call fn + MOVD R0, CTXT + + Get CTXT + I32WrapI64 + I64Load $0 + CALL + + // switch back to g + MOVD g_m(g), R1 + MOVD m_curg(R1), R2 + MOVD R2, g + MOVD g_sched+gobuf_sp(R2), SP + MOVD $0, g_sched+gobuf_sp(R2) + RET + +TEXT runtime·systemstack_switch(SB), NOSPLIT, $0-0 + RET + +// AES hashing not implemented for wasm +TEXT runtime·memhash(SB),NOSPLIT|NOFRAME,$0-32 + JMP runtime·memhashFallback(SB) +TEXT runtime·strhash(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·strhashFallback(SB) +TEXT runtime·memhash32(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash32Fallback(SB) +TEXT runtime·memhash64(SB),NOSPLIT|NOFRAME,$0-24 + JMP runtime·memhash64Fallback(SB) + +TEXT runtime·return0(SB), NOSPLIT, $0-0 + MOVD $0, RET0 + RET + +TEXT runtime·asminit(SB), NOSPLIT, $0-0 + // No per-thread init. + RET + +TEXT ·publicationBarrier(SB), NOSPLIT, $0-0 + RET + +TEXT runtime·procyield(SB), NOSPLIT, $0-0 // FIXME + RET + +TEXT runtime·breakpoint(SB), NOSPLIT, $0-0 + UNDEF + +// Called during function prolog when more stack is needed. +// +// The traceback routines see morestack on a g0 as being +// the top of a stack (for example, morestack calling newstack +// calling the scheduler calling newm calling gc), so we must +// record an argument size. For that purpose, it has no arguments. +TEXT runtime·morestack(SB), NOSPLIT, $0-0 + // R1 = g.m + MOVD g_m(g), R1 + + // R2 = g0 + MOVD m_g0(R1), R2 + + // Cannot grow scheduler stack (m->g0). + Get g + Get R1 + I64Eq + If + CALLNORESUME runtime·badmorestackg0(SB) + End + + // Cannot grow signal stack (m->gsignal). 
+ Get g + I64Load m_gsignal(R1) + I64Eq + If + CALLNORESUME runtime·badmorestackgsignal(SB) + End + + // Called from f. + // Set m->morebuf to f's caller. + NOP SP // tell vet SP changed - stop checking offsets + MOVD 8(SP), m_morebuf+gobuf_pc(R1) + MOVD $16(SP), m_morebuf+gobuf_sp(R1) // f's caller's SP + MOVD g, m_morebuf+gobuf_g(R1) + + // Set g->sched to context in f. + MOVD 0(SP), g_sched+gobuf_pc(g) + MOVD $8(SP), g_sched+gobuf_sp(g) // f's SP + MOVD CTXT, g_sched+gobuf_ctxt(g) + + // Call newstack on m->g0's stack. + MOVD R2, g + MOVD g_sched+gobuf_sp(R2), SP + CALL runtime·newstack(SB) + UNDEF // crash if newstack returns + +// morestack but not preserving ctxt. +TEXT runtime·morestack_noctxt(SB),NOSPLIT,$0 + MOVD $0, CTXT + JMP runtime·morestack(SB) + +TEXT ·asmcgocall(SB), NOSPLIT, $0-0 + UNDEF + +#define DISPATCH(NAME, MAXSIZE) \ + Get R0; \ + I64Const $MAXSIZE; \ + I64LeU; \ + If; \ + JMP NAME(SB); \ + End + +TEXT ·reflectcall(SB), NOSPLIT, $0-48 + I64Load fn+8(FP) + I64Eqz + If + CALLNORESUME runtime·sigpanic<ABIInternal>(SB) + End + + MOVW frameSize+32(FP), R0 + + DISPATCH(runtime·call16, 16) + DISPATCH(runtime·call32, 32) + DISPATCH(runtime·call64, 64) + DISPATCH(runtime·call128, 128) + DISPATCH(runtime·call256, 256) + DISPATCH(runtime·call512, 512) + DISPATCH(runtime·call1024, 1024) + DISPATCH(runtime·call2048, 2048) + DISPATCH(runtime·call4096, 4096) + DISPATCH(runtime·call8192, 8192) + DISPATCH(runtime·call16384, 16384) + DISPATCH(runtime·call32768, 32768) + DISPATCH(runtime·call65536, 65536) + DISPATCH(runtime·call131072, 131072) + DISPATCH(runtime·call262144, 262144) + DISPATCH(runtime·call524288, 524288) + DISPATCH(runtime·call1048576, 1048576) + DISPATCH(runtime·call2097152, 2097152) + DISPATCH(runtime·call4194304, 4194304) + DISPATCH(runtime·call8388608, 8388608) + DISPATCH(runtime·call16777216, 16777216) + DISPATCH(runtime·call33554432, 33554432) + DISPATCH(runtime·call67108864, 67108864) + DISPATCH(runtime·call134217728, 134217728) + 
DISPATCH(runtime·call268435456, 268435456) + DISPATCH(runtime·call536870912, 536870912) + DISPATCH(runtime·call1073741824, 1073741824) + JMP runtime·badreflectcall(SB) + +#define CALLFN(NAME, MAXSIZE) \ +TEXT NAME(SB), WRAPPER, $MAXSIZE-48; \ + NO_LOCAL_POINTERS; \ + MOVW stackArgsSize+24(FP), R0; \ + \ + Get R0; \ + I64Eqz; \ + Not; \ + If; \ + Get SP; \ + I64Load stackArgs+16(FP); \ + I32WrapI64; \ + I64Load stackArgsSize+24(FP); \ + I32WrapI64; \ + MemoryCopy; \ + End; \ + \ + MOVD f+8(FP), CTXT; \ + Get CTXT; \ + I32WrapI64; \ + I64Load $0; \ + CALL; \ + \ + I64Load32U stackRetOffset+28(FP); \ + Set R0; \ + \ + MOVD stackArgsType+0(FP), RET0; \ + \ + I64Load stackArgs+16(FP); \ + Get R0; \ + I64Add; \ + Set RET1; \ + \ + Get SP; \ + I64ExtendI32U; \ + Get R0; \ + I64Add; \ + Set RET2; \ + \ + I64Load32U stackArgsSize+24(FP); \ + Get R0; \ + I64Sub; \ + Set RET3; \ + \ + CALL callRet<>(SB); \ + RET + +// callRet copies return values back at the end of call*. This is a +// separate function so it can allocate stack space for the arguments +// to reflectcallmove. It does not follow the Go ABI; it expects its +// arguments in registers. 
+TEXT callRet<>(SB), NOSPLIT, $40-0 + NO_LOCAL_POINTERS + MOVD RET0, 0(SP) + MOVD RET1, 8(SP) + MOVD RET2, 16(SP) + MOVD RET3, 24(SP) + MOVD $0, 32(SP) + CALL runtime·reflectcallmove(SB) + RET + +CALLFN(·call16, 16) +CALLFN(·call32, 32) +CALLFN(·call64, 64) +CALLFN(·call128, 128) +CALLFN(·call256, 256) +CALLFN(·call512, 512) +CALLFN(·call1024, 1024) +CALLFN(·call2048, 2048) +CALLFN(·call4096, 4096) +CALLFN(·call8192, 8192) +CALLFN(·call16384, 16384) +CALLFN(·call32768, 32768) +CALLFN(·call65536, 65536) +CALLFN(·call131072, 131072) +CALLFN(·call262144, 262144) +CALLFN(·call524288, 524288) +CALLFN(·call1048576, 1048576) +CALLFN(·call2097152, 2097152) +CALLFN(·call4194304, 4194304) +CALLFN(·call8388608, 8388608) +CALLFN(·call16777216, 16777216) +CALLFN(·call33554432, 33554432) +CALLFN(·call67108864, 67108864) +CALLFN(·call134217728, 134217728) +CALLFN(·call268435456, 268435456) +CALLFN(·call536870912, 536870912) +CALLFN(·call1073741824, 1073741824) + +TEXT runtime·goexit(SB), NOSPLIT|TOPFRAME, $0-0 + NOP // first PC of goexit is skipped + CALL runtime·goexit1(SB) // does not return + UNDEF + +TEXT runtime·cgocallback(SB), NOSPLIT, $0-24 + UNDEF + +// gcWriteBarrier performs a heap pointer write and informs the GC. +// +// gcWriteBarrier does NOT follow the Go ABI. 
It has two WebAssembly parameters: +// R0: the destination of the write (i64) +// R1: the value being written (i64) +TEXT runtime·gcWriteBarrier(SB), NOSPLIT, $16 + // R3 = g.m + MOVD g_m(g), R3 + // R4 = p + MOVD m_p(R3), R4 + // R5 = wbBuf.next + MOVD p_wbBuf+wbBuf_next(R4), R5 + + // Record value + MOVD R1, 0(R5) + // Record *slot + MOVD (R0), 8(R5) + + // Increment wbBuf.next + Get R5 + I64Const $16 + I64Add + Set R5 + MOVD R5, p_wbBuf+wbBuf_next(R4) + + Get R5 + I64Load (p_wbBuf+wbBuf_end)(R4) + I64Eq + If + // Flush + MOVD R0, 0(SP) + MOVD R1, 8(SP) + CALLNORESUME runtime·wbBufFlush(SB) + End + + // Do the write + MOVD R1, (R0) + + RET diff --git a/src/runtime/atomic_arm64.s b/src/runtime/atomic_arm64.s new file mode 100644 index 0000000..21b4d8c --- /dev/null +++ b/src/runtime/atomic_arm64.s @@ -0,0 +1,9 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + DMB $0xe // DMB ST + RET diff --git a/src/runtime/atomic_loong64.s b/src/runtime/atomic_loong64.s new file mode 100644 index 0000000..4818a82 --- /dev/null +++ b/src/runtime/atomic_loong64.s @@ -0,0 +1,9 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + DBAR + RET diff --git a/src/runtime/atomic_mips64x.s b/src/runtime/atomic_mips64x.s new file mode 100644 index 0000000..dd6380c --- /dev/null +++ b/src/runtime/atomic_mips64x.s @@ -0,0 +1,13 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build mips64 || mips64le + +#include "textflag.h" + +#define SYNC WORD $0xf + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + SYNC + RET diff --git a/src/runtime/atomic_mipsx.s b/src/runtime/atomic_mipsx.s new file mode 100644 index 0000000..ac255fe --- /dev/null +++ b/src/runtime/atomic_mipsx.s @@ -0,0 +1,11 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +#include "textflag.h" + +TEXT ·publicationBarrier(SB),NOSPLIT,$0 + SYNC + RET diff --git a/src/runtime/atomic_pointer.go b/src/runtime/atomic_pointer.go new file mode 100644 index 0000000..25e0e65 --- /dev/null +++ b/src/runtime/atomic_pointer.go @@ -0,0 +1,98 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// These functions cannot have go:noescape annotations, +// because while ptr does not escape, new does. +// If new is marked as not escaping, the compiler will make incorrect +// escape analysis decisions about the pointer value being stored. + +// atomicwb performs a write barrier before an atomic pointer write. +// The caller should guard the call with "if writeBarrier.enabled". +// +//go:nosplit +func atomicwb(ptr *unsafe.Pointer, new unsafe.Pointer) { + slot := (*uintptr)(unsafe.Pointer(ptr)) + if !getg().m.p.ptr().wbBuf.putFast(*slot, uintptr(new)) { + wbBufFlush(slot, uintptr(new)) + } +} + +// atomicstorep performs *ptr = new atomically and invokes a write barrier. 
+// +//go:nosplit +func atomicstorep(ptr unsafe.Pointer, new unsafe.Pointer) { + if writeBarrier.enabled { + atomicwb((*unsafe.Pointer)(ptr), new) + } + atomic.StorepNoWB(noescape(ptr), new) +} + +// atomic_storePointer is the implementation of runtime/internal/atomic.UnsafePointer.Store +// (like StoreNoWB but with the write barrier). +// +//go:nosplit +//go:linkname atomic_storePointer runtime/internal/atomic.storePointer +func atomic_storePointer(ptr *unsafe.Pointer, new unsafe.Pointer) { + atomicstorep(unsafe.Pointer(ptr), new) +} + +// atomic_casPointer is the implementation of runtime/internal/atomic.UnsafePointer.CompareAndSwap +// (like CompareAndSwapNoWB but with the write barrier). +// +//go:nosplit +//go:linkname atomic_casPointer runtime/internal/atomic.casPointer +func atomic_casPointer(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool { + if writeBarrier.enabled { + atomicwb(ptr, new) + } + return atomic.Casp1(ptr, old, new) +} + +// Like above, but implement in terms of sync/atomic's uintptr operations. +// We cannot just call the runtime routines, because the race detector expects +// to be able to intercept the sync/atomic forms but not the runtime forms.
+ +//go:linkname sync_atomic_StoreUintptr sync/atomic.StoreUintptr +func sync_atomic_StoreUintptr(ptr *uintptr, new uintptr) + +//go:linkname sync_atomic_StorePointer sync/atomic.StorePointer +//go:nosplit +func sync_atomic_StorePointer(ptr *unsafe.Pointer, new unsafe.Pointer) { + if writeBarrier.enabled { + atomicwb(ptr, new) + } + sync_atomic_StoreUintptr((*uintptr)(unsafe.Pointer(ptr)), uintptr(new)) +} + +//go:linkname sync_atomic_SwapUintptr sync/atomic.SwapUintptr +func sync_atomic_SwapUintptr(ptr *uintptr, new uintptr) uintptr + +//go:linkname sync_atomic_SwapPointer sync/atomic.SwapPointer +//go:nosplit +func sync_atomic_SwapPointer(ptr *unsafe.Pointer, new unsafe.Pointer) unsafe.Pointer { + if writeBarrier.enabled { + atomicwb(ptr, new) + } + old := unsafe.Pointer(sync_atomic_SwapUintptr((*uintptr)(noescape(unsafe.Pointer(ptr))), uintptr(new))) + return old +} + +//go:linkname sync_atomic_CompareAndSwapUintptr sync/atomic.CompareAndSwapUintptr +func sync_atomic_CompareAndSwapUintptr(ptr *uintptr, old, new uintptr) bool + +//go:linkname sync_atomic_CompareAndSwapPointer sync/atomic.CompareAndSwapPointer +//go:nosplit +func sync_atomic_CompareAndSwapPointer(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool { + if writeBarrier.enabled { + atomicwb(ptr, new) + } + return sync_atomic_CompareAndSwapUintptr((*uintptr)(noescape(unsafe.Pointer(ptr))), uintptr(old), uintptr(new)) +} diff --git a/src/runtime/atomic_ppc64x.s b/src/runtime/atomic_ppc64x.s new file mode 100644 index 0000000..4742b6c --- /dev/null +++ b/src/runtime/atomic_ppc64x.s @@ -0,0 +1,14 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "textflag.h" + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + // LWSYNC is the "export" barrier recommended by Power ISA + // v2.07 book II, appendix B.2.2.2. 
+ // LWSYNC is a load/load, load/store, and store/store barrier. + LWSYNC + RET diff --git a/src/runtime/atomic_riscv64.s b/src/runtime/atomic_riscv64.s new file mode 100644 index 0000000..544a7c5 --- /dev/null +++ b/src/runtime/atomic_riscv64.s @@ -0,0 +1,10 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// func publicationBarrier() +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + FENCE + RET diff --git a/src/runtime/auxv_none.go b/src/runtime/auxv_none.go new file mode 100644 index 0000000..5d473ca --- /dev/null +++ b/src/runtime/auxv_none.go @@ -0,0 +1,10 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !linux && !darwin && !dragonfly && !freebsd && !netbsd && !solaris + +package runtime + +func sysargs(argc int32, argv **byte) { +} diff --git a/src/runtime/callers_test.go b/src/runtime/callers_test.go new file mode 100644 index 0000000..d245cbd --- /dev/null +++ b/src/runtime/callers_test.go @@ -0,0 +1,341 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "reflect" + "runtime" + "strings" + "testing" +) + +func f1(pan bool) []uintptr { + return f2(pan) // line 15 +} + +func f2(pan bool) []uintptr { + return f3(pan) // line 19 +} + +func f3(pan bool) []uintptr { + if pan { + panic("f3") // line 24 + } + ret := make([]uintptr, 20) + return ret[:runtime.Callers(0, ret)] // line 27 +} + +func testCallers(t *testing.T, pcs []uintptr, pan bool) { + m := make(map[string]int, len(pcs)) + frames := runtime.CallersFrames(pcs) + for { + frame, more := frames.Next() + if frame.Function != "" { + m[frame.Function] = frame.Line + } + if !more { + break + } + } + + var seen []string + for k := range m { + seen = append(seen, k) + } + t.Logf("functions seen: %s", strings.Join(seen, " ")) + + var f3Line int + if pan { + f3Line = 24 + } else { + f3Line = 27 + } + want := []struct { + name string + line int + }{ + {"f1", 15}, + {"f2", 19}, + {"f3", f3Line}, + } + for _, w := range want { + if got := m["runtime_test."+w.name]; got != w.line { + t.Errorf("%s is line %d, want %d", w.name, got, w.line) + } + } +} + +func testCallersEqual(t *testing.T, pcs []uintptr, want []string) { + t.Helper() + + got := make([]string, 0, len(want)) + + frames := runtime.CallersFrames(pcs) + for { + frame, more := frames.Next() + if !more || len(got) >= len(want) { + break + } + got = append(got, frame.Function) + } + if !reflect.DeepEqual(want, got) { + t.Fatalf("wanted %v, got %v", want, got) + } +} + +func TestCallers(t *testing.T) { + testCallers(t, f1(false), false) +} + +func TestCallersPanic(t *testing.T) { + // Make sure we don't have any extra frames on the stack (due to + // open-coded defer processing) + want := []string{"runtime.Callers", "runtime_test.TestCallersPanic.func1", + "runtime.gopanic", "runtime_test.f3", "runtime_test.f2", "runtime_test.f1", + "runtime_test.TestCallersPanic"} + + defer func() { + if r := recover(); r == nil { + t.Fatal("did not panic") + } + pcs := make([]uintptr, 20) + pcs = 
pcs[:runtime.Callers(0, pcs)] + testCallers(t, pcs, true) + testCallersEqual(t, pcs, want) + }() + f1(true) +} + +func TestCallersDoublePanic(t *testing.T) { + // Make sure we don't have any extra frames on the stack (due to + // open-coded defer processing) + want := []string{"runtime.Callers", "runtime_test.TestCallersDoublePanic.func1.1", + "runtime.gopanic", "runtime_test.TestCallersDoublePanic.func1", "runtime.gopanic", "runtime_test.TestCallersDoublePanic"} + + defer func() { + defer func() { + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + if recover() == nil { + t.Fatal("did not panic") + } + testCallersEqual(t, pcs, want) + }() + if recover() == nil { + t.Fatal("did not panic") + } + panic(2) + }() + panic(1) +} + +// Test that a defer after a successful recovery looks like it is called directly +// from the function with the defers. +func TestCallersAfterRecovery(t *testing.T) { + want := []string{"runtime.Callers", "runtime_test.TestCallersAfterRecovery.func1", "runtime_test.TestCallersAfterRecovery"} + + defer func() { + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + }() + defer func() { + if recover() == nil { + t.Fatal("did not recover from panic") + } + }() + panic(1) +} + +func TestCallersAbortedPanic(t *testing.T) { + want := []string{"runtime.Callers", "runtime_test.TestCallersAbortedPanic.func2", "runtime_test.TestCallersAbortedPanic"} + + defer func() { + r := recover() + if r != nil { + t.Fatalf("should be no panic remaining to recover") + } + }() + + defer func() { + // panic2 was aborted/replaced by panic1, so when panic2 was + // recovered, there is no remaining panic on the stack. 
+ pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + }() + defer func() { + r := recover() + if r != "panic2" { + t.Fatalf("got %v, wanted %v", r, "panic2") + } + }() + defer func() { + // panic2 aborts/replaces panic1, because it is a recursive panic + // that is not recovered within the defer function called by + // panic1 panicking sequence + panic("panic2") + }() + panic("panic1") +} + +func TestCallersAbortedPanic2(t *testing.T) { + want := []string{"runtime.Callers", "runtime_test.TestCallersAbortedPanic2.func2", "runtime_test.TestCallersAbortedPanic2"} + defer func() { + r := recover() + if r != nil { + t.Fatalf("should be no panic remaining to recover") + } + }() + defer func() { + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + }() + func() { + defer func() { + r := recover() + if r != "panic2" { + t.Fatalf("got %v, wanted %v", r, "panic2") + } + }() + func() { + defer func() { + // Again, panic2 aborts/replaces panic1 + panic("panic2") + }() + panic("panic1") + }() + }() +} + +func TestCallersNilPointerPanic(t *testing.T) { + // Make sure we don't have any extra frames on the stack (due to + // open-coded defer processing) + want := []string{"runtime.Callers", "runtime_test.TestCallersNilPointerPanic.func1", + "runtime.gopanic", "runtime.panicmem", "runtime.sigpanic", + "runtime_test.TestCallersNilPointerPanic"} + + defer func() { + if r := recover(); r == nil { + t.Fatal("did not panic") + } + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + }() + var p *int + if *p == 3 { + t.Fatal("did not see nil pointer panic") + } +} + +func TestCallersDivZeroPanic(t *testing.T) { + // Make sure we don't have any extra frames on the stack (due to + // open-coded defer processing) + want := []string{"runtime.Callers", "runtime_test.TestCallersDivZeroPanic.func1", + "runtime.gopanic", "runtime.panicdivide", + 
"runtime_test.TestCallersDivZeroPanic"} + + defer func() { + if r := recover(); r == nil { + t.Fatal("did not panic") + } + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + }() + var n int + if 5/n == 1 { + t.Fatal("did not see divide-by-zero panic") + } +} + +func TestCallersDeferNilFuncPanic(t *testing.T) { + // Make sure we don't have any extra frames on the stack. We cut off the check + // at runtime.sigpanic, because non-open-coded defers (which may be used in + // non-opt or race checker mode) include an extra 'deferreturn' frame (which is + // where the nil pointer deref happens). + state := 1 + want := []string{"runtime.Callers", "runtime_test.TestCallersDeferNilFuncPanic.func1", + "runtime.gopanic", "runtime.panicmem", "runtime.sigpanic"} + + defer func() { + if r := recover(); r == nil { + t.Fatal("did not panic") + } + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + if state == 1 { + t.Fatal("nil defer func panicked at defer time rather than function exit time") + } + + }() + var f func() + defer f() + // Use the value of 'state' to make sure nil defer func f causes panic at + // function exit, rather than at the defer statement. + state = 2 +} + +// Same test, but forcing non-open-coded defer by putting the defer in a loop.
See +// issue #36050 +func TestCallersDeferNilFuncPanicWithLoop(t *testing.T) { + state := 1 + want := []string{"runtime.Callers", "runtime_test.TestCallersDeferNilFuncPanicWithLoop.func1", + "runtime.gopanic", "runtime.panicmem", "runtime.sigpanic", "runtime.deferreturn", "runtime_test.TestCallersDeferNilFuncPanicWithLoop"} + + defer func() { + if r := recover(); r == nil { + t.Fatal("did not panic") + } + pcs := make([]uintptr, 20) + pcs = pcs[:runtime.Callers(0, pcs)] + testCallersEqual(t, pcs, want) + if state == 1 { + t.Fatal("nil defer func panicked at defer time rather than function exit time") + } + + }() + + for i := 0; i < 1; i++ { + var f func() + defer f() + } + // Use the value of 'state' to make sure nil defer func f causes panic at + // function exit, rather than at the defer statement. + state = 2 +} + +// issue #51988 +// Func.Endlineno was lost when instantiating generic functions, leading to incorrect +// stack trace positions. +func TestCallersEndlineno(t *testing.T) { + testNormalEndlineno(t) + testGenericEndlineno[int](t) +} + +func testNormalEndlineno(t *testing.T) { + defer testCallerLine(t, callerLine(t, 0)+1) +} + +func testGenericEndlineno[_ any](t *testing.T) { + defer testCallerLine(t, callerLine(t, 0)+1) +} + +func testCallerLine(t *testing.T, want int) { + if have := callerLine(t, 1); have != want { + t.Errorf("callerLine(1) returned %d, but want %d\n", have, want) + } +} + +func callerLine(t *testing.T, skip int) int { + _, _, line, ok := runtime.Caller(skip + 1) + if !ok { + t.Fatalf("runtime.Caller(%d) failed", skip+1) + } + return line +} diff --git a/src/runtime/cgo.go b/src/runtime/cgo.go new file mode 100644 index 0000000..d904682 --- /dev/null +++ b/src/runtime/cgo.go @@ -0,0 +1,54 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +//go:cgo_export_static main + +// Filled in by runtime/cgo when linked into binary. + +//go:linkname _cgo_init _cgo_init +//go:linkname _cgo_thread_start _cgo_thread_start +//go:linkname _cgo_sys_thread_create _cgo_sys_thread_create +//go:linkname _cgo_notify_runtime_init_done _cgo_notify_runtime_init_done +//go:linkname _cgo_callers _cgo_callers +//go:linkname _cgo_set_context_function _cgo_set_context_function +//go:linkname _cgo_yield _cgo_yield + +var ( + _cgo_init unsafe.Pointer + _cgo_thread_start unsafe.Pointer + _cgo_sys_thread_create unsafe.Pointer + _cgo_notify_runtime_init_done unsafe.Pointer + _cgo_callers unsafe.Pointer + _cgo_set_context_function unsafe.Pointer + _cgo_yield unsafe.Pointer +) + +// iscgo is set to true by the runtime/cgo package +var iscgo bool + +// cgoHasExtraM is set on startup when an extra M is created for cgo. +// The extra M must be created before any C/C++ code calls cgocallback. +var cgoHasExtraM bool + +// cgoUse is called by cgo-generated code (using go:linkname to get at +// an unexported name). The calls serve two purposes: +// 1) they are opaque to escape analysis, so the argument is considered to +// escape to the heap. +// 2) they keep the argument alive until the call site; the call is emitted after +// the end of the (presumed) use of the argument by C. +// cgoUse should not actually be called (see cgoAlwaysFalse). +func cgoUse(any) { throw("cgoUse should not be called") } + +// cgoAlwaysFalse is a boolean value that is always false. +// The cgo-generated code says if cgoAlwaysFalse { cgoUse(p) }. +// The compiler cannot see that cgoAlwaysFalse is always false, +// so it emits the test and keeps the call, giving the desired +// escape analysis result. The test is cheaper than the call. 
+var cgoAlwaysFalse bool + +var cgo_yield = &_cgo_yield diff --git a/src/runtime/cgo/abi_amd64.h b/src/runtime/cgo/abi_amd64.h new file mode 100644 index 0000000..9949435 --- /dev/null +++ b/src/runtime/cgo/abi_amd64.h @@ -0,0 +1,99 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Macros for transitioning from the host ABI to Go ABI0. +// +// These save the frame pointer, so in general, functions that use +// these should have zero frame size to suppress the automatic frame +// pointer, though it's harmless to not do this. + +#ifdef GOOS_windows + +// REGS_HOST_TO_ABI0_STACK is the stack bytes used by +// PUSH_REGS_HOST_TO_ABI0. +#define REGS_HOST_TO_ABI0_STACK (28*8 + 8) + +// PUSH_REGS_HOST_TO_ABI0 prepares for transitioning from +// the host ABI to Go ABI0 code. It saves all registers that are +// callee-save in the host ABI and caller-save in Go ABI0 and prepares +// for entry to Go. +// +// Save DI SI BP BX R12 R13 R14 R15 X6-X15 registers and the DF flag. +// Clear the DF flag for the Go ABI. +// MXCSR matches the Go ABI, so we don't have to set that, +// and Go doesn't modify it, so we don't have to save it. 
+#define PUSH_REGS_HOST_TO_ABI0() \ + PUSHFQ \ + CLD \ + ADJSP $(REGS_HOST_TO_ABI0_STACK - 8) \ + MOVQ DI, (0*0)(SP) \ + MOVQ SI, (1*8)(SP) \ + MOVQ BP, (2*8)(SP) \ + MOVQ BX, (3*8)(SP) \ + MOVQ R12, (4*8)(SP) \ + MOVQ R13, (5*8)(SP) \ + MOVQ R14, (6*8)(SP) \ + MOVQ R15, (7*8)(SP) \ + MOVUPS X6, (8*8)(SP) \ + MOVUPS X7, (10*8)(SP) \ + MOVUPS X8, (12*8)(SP) \ + MOVUPS X9, (14*8)(SP) \ + MOVUPS X10, (16*8)(SP) \ + MOVUPS X11, (18*8)(SP) \ + MOVUPS X12, (20*8)(SP) \ + MOVUPS X13, (22*8)(SP) \ + MOVUPS X14, (24*8)(SP) \ + MOVUPS X15, (26*8)(SP) + +#define POP_REGS_HOST_TO_ABI0() \ + MOVQ (0*0)(SP), DI \ + MOVQ (1*8)(SP), SI \ + MOVQ (2*8)(SP), BP \ + MOVQ (3*8)(SP), BX \ + MOVQ (4*8)(SP), R12 \ + MOVQ (5*8)(SP), R13 \ + MOVQ (6*8)(SP), R14 \ + MOVQ (7*8)(SP), R15 \ + MOVUPS (8*8)(SP), X6 \ + MOVUPS (10*8)(SP), X7 \ + MOVUPS (12*8)(SP), X8 \ + MOVUPS (14*8)(SP), X9 \ + MOVUPS (16*8)(SP), X10 \ + MOVUPS (18*8)(SP), X11 \ + MOVUPS (20*8)(SP), X12 \ + MOVUPS (22*8)(SP), X13 \ + MOVUPS (24*8)(SP), X14 \ + MOVUPS (26*8)(SP), X15 \ + ADJSP $-(REGS_HOST_TO_ABI0_STACK - 8) \ + POPFQ + +#else +// SysV ABI + +#define REGS_HOST_TO_ABI0_STACK (6*8) + +// SysV MXCSR matches the Go ABI, so we don't have to set that, +// and Go doesn't modify it, so we don't have to save it. +// Both SysV and Go require DF to be cleared, so that's already clear. +// The SysV and Go frame pointer conventions are compatible. 
+#define PUSH_REGS_HOST_TO_ABI0() \ + ADJSP $(REGS_HOST_TO_ABI0_STACK) \ + MOVQ BP, (5*8)(SP) \ + LEAQ (5*8)(SP), BP \ + MOVQ BX, (0*8)(SP) \ + MOVQ R12, (1*8)(SP) \ + MOVQ R13, (2*8)(SP) \ + MOVQ R14, (3*8)(SP) \ + MOVQ R15, (4*8)(SP) + +#define POP_REGS_HOST_TO_ABI0() \ + MOVQ (0*8)(SP), BX \ + MOVQ (1*8)(SP), R12 \ + MOVQ (2*8)(SP), R13 \ + MOVQ (3*8)(SP), R14 \ + MOVQ (4*8)(SP), R15 \ + MOVQ (5*8)(SP), BP \ + ADJSP $-(REGS_HOST_TO_ABI0_STACK) + +#endif diff --git a/src/runtime/cgo/abi_arm64.h b/src/runtime/cgo/abi_arm64.h new file mode 100644 index 0000000..e2b5e6d --- /dev/null +++ b/src/runtime/cgo/abi_arm64.h @@ -0,0 +1,43 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Macros for transitioning from the host ABI to Go ABI0. +// +// These macros save and restore the callee-saved registers +// from the stack, but they don't adjust stack pointer, so +// the user should prepare stack space in advance. +// SAVE_R19_TO_R28(offset) saves R19 ~ R28 to the stack space +// of ((offset)+0*8)(RSP) ~ ((offset)+9*8)(RSP). +// +// SAVE_F8_TO_F15(offset) saves F8 ~ F15 to the stack space +// of ((offset)+0*8)(RSP) ~ ((offset)+7*8)(RSP). +// +// R29 is not saved because Go will save and restore it. 
+ +#define SAVE_R19_TO_R28(offset) \ + STP (R19, R20), ((offset)+0*8)(RSP) \ + STP (R21, R22), ((offset)+2*8)(RSP) \ + STP (R23, R24), ((offset)+4*8)(RSP) \ + STP (R25, R26), ((offset)+6*8)(RSP) \ + STP (R27, g), ((offset)+8*8)(RSP) + +#define RESTORE_R19_TO_R28(offset) \ + LDP ((offset)+0*8)(RSP), (R19, R20) \ + LDP ((offset)+2*8)(RSP), (R21, R22) \ + LDP ((offset)+4*8)(RSP), (R23, R24) \ + LDP ((offset)+6*8)(RSP), (R25, R26) \ + LDP ((offset)+8*8)(RSP), (R27, g) /* R28 */ + +#define SAVE_F8_TO_F15(offset) \ + FSTPD (F8, F9), ((offset)+0*8)(RSP) \ + FSTPD (F10, F11), ((offset)+2*8)(RSP) \ + FSTPD (F12, F13), ((offset)+4*8)(RSP) \ + FSTPD (F14, F15), ((offset)+6*8)(RSP) + +#define RESTORE_F8_TO_F15(offset) \ + FLDPD ((offset)+0*8)(RSP), (F8, F9) \ + FLDPD ((offset)+2*8)(RSP), (F10, F11) \ + FLDPD ((offset)+4*8)(RSP), (F12, F13) \ + FLDPD ((offset)+6*8)(RSP), (F14, F15) + diff --git a/src/runtime/cgo/asm_386.s b/src/runtime/cgo/asm_386.s new file mode 100644 index 0000000..2e7e951 --- /dev/null +++ b/src/runtime/cgo/asm_386.s @@ -0,0 +1,29 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. 
+TEXT crosscall2(SB),NOSPLIT,$28-16 + MOVL BP, 24(SP) + MOVL BX, 20(SP) + MOVL SI, 16(SP) + MOVL DI, 12(SP) + + MOVL ctxt+12(FP), AX + MOVL AX, 8(SP) + MOVL a+4(FP), AX + MOVL AX, 4(SP) + MOVL fn+0(FP), AX + MOVL AX, 0(SP) + CALL runtime·cgocallback(SB) + + MOVL 12(SP), DI + MOVL 16(SP), SI + MOVL 20(SP), BX + MOVL 24(SP), BP + RET diff --git a/src/runtime/cgo/asm_amd64.s b/src/runtime/cgo/asm_amd64.s new file mode 100644 index 0000000..386299c --- /dev/null +++ b/src/runtime/cgo/asm_amd64.s @@ -0,0 +1,34 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "abi_amd64.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +// This signature is known to SWIG, so we can't change it. +TEXT crosscall2(SB),NOSPLIT,$0-0 + PUSH_REGS_HOST_TO_ABI0() + + // Make room for arguments to cgocallback. + ADJSP $0x18 +#ifndef GOOS_windows + MOVQ DI, 0x0(SP) /* fn */ + MOVQ SI, 0x8(SP) /* arg */ + // Skip n in DX. + MOVQ CX, 0x10(SP) /* ctxt */ +#else + MOVQ CX, 0x0(SP) /* fn */ + MOVQ DX, 0x8(SP) /* arg */ + // Skip n in R8. + MOVQ R9, 0x10(SP) /* ctxt */ +#endif + + CALL runtime·cgocallback(SB) + + ADJSP $-0x18 + POP_REGS_HOST_TO_ABI0() + RET diff --git a/src/runtime/cgo/asm_arm.s b/src/runtime/cgo/asm_arm.s new file mode 100644 index 0000000..ea55e17 --- /dev/null +++ b/src/runtime/cgo/asm_arm.s @@ -0,0 +1,56 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. 
+// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + SUB $(8*9), R13 // Reserve space for the floating point registers. + // The C arguments arrive in R0, R1, R2, and R3. We want to + // pass R0, R1, and R3 to Go, so we push those on the stack. + // Also, save C callee-save registers R4-R12. + MOVM.WP [R0, R1, R3, R4, R5, R6, R7, R8, R9, g, R11, R12], (R13) + // Finally, save the link register R14. This also puts the + // arguments we pushed for cgocallback where they need to be, + // starting at 4(R13). + MOVW.W R14, -4(R13) + + // Skip floating point registers on GOARM < 6. + MOVB runtime·goarm(SB), R11 + CMP $6, R11 + BLT skipfpsave + MOVD F8, (13*4+8*1)(R13) + MOVD F9, (13*4+8*2)(R13) + MOVD F10, (13*4+8*3)(R13) + MOVD F11, (13*4+8*4)(R13) + MOVD F12, (13*4+8*5)(R13) + MOVD F13, (13*4+8*6)(R13) + MOVD F14, (13*4+8*7)(R13) + MOVD F15, (13*4+8*8)(R13) + +skipfpsave: + BL runtime·load_g(SB) + // We set up the arguments to cgocallback when saving registers above. + BL runtime·cgocallback(SB) + + MOVB runtime·goarm(SB), R11 + CMP $6, R11 + BLT skipfprest + MOVD (13*4+8*1)(R13), F8 + MOVD (13*4+8*2)(R13), F9 + MOVD (13*4+8*3)(R13), F10 + MOVD (13*4+8*4)(R13), F11 + MOVD (13*4+8*5)(R13), F12 + MOVD (13*4+8*6)(R13), F13 + MOVD (13*4+8*7)(R13), F14 + MOVD (13*4+8*8)(R13), F15 + +skipfprest: + MOVW.P 4(R13), R14 + MOVM.IAW (R13), [R0, R1, R3, R4, R5, R6, R7, R8, R9, g, R11, R12] + ADD $(8*9), R13 + MOVW R14, R15 diff --git a/src/runtime/cgo/asm_arm64.s b/src/runtime/cgo/asm_arm64.s new file mode 100644 index 0000000..e808ded --- /dev/null +++ b/src/runtime/cgo/asm_arm64.s @@ -0,0 +1,37 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" +#include "abi_arm64.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + /* + * We still need to save all callee save register as before, and then + * push 3 args for fn (R0, R1, R3), skipping R2. + * Also note that at procedure entry in gc world, 8(RSP) will be the + * first arg. + */ + SUB $(8*24), RSP + STP (R0, R1), (8*1)(RSP) + MOVD R3, (8*3)(RSP) + + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + STP (R29, R30), (8*22)(RSP) + + + // Initialize Go ABI environment + BL runtime·load_g(SB) + BL runtime·cgocallback(SB) + + RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + LDP (8*22)(RSP), (R29, R30) + + ADD $(8*24), RSP + RET diff --git a/src/runtime/cgo/asm_loong64.s b/src/runtime/cgo/asm_loong64.s new file mode 100644 index 0000000..961a3dd --- /dev/null +++ b/src/runtime/cgo/asm_loong64.s @@ -0,0 +1,67 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + /* + * We still need to save all callee save register as before, and then + * push 3 args for fn (R4, R5, R7), skipping R6. + * Also note that at procedure entry in gc world, 8(R29) will be the + * first arg. 
+ */ + + ADDV $(-8*22), R3 + MOVV R4, (8*1)(R3) // fn unsafe.Pointer + MOVV R5, (8*2)(R3) // a unsafe.Pointer + MOVV R7, (8*3)(R3) // ctxt uintptr + MOVV R23, (8*4)(R3) + MOVV R24, (8*5)(R3) + MOVV R25, (8*6)(R3) + MOVV R26, (8*7)(R3) + MOVV R27, (8*8)(R3) + MOVV R28, (8*9)(R3) + MOVV R29, (8*10)(R3) + MOVV R30, (8*11)(R3) + MOVV g, (8*12)(R3) + MOVV R1, (8*13)(R3) + MOVD F24, (8*14)(R3) + MOVD F25, (8*15)(R3) + MOVD F26, (8*16)(R3) + MOVD F27, (8*17)(R3) + MOVD F28, (8*18)(R3) + MOVD F29, (8*19)(R3) + MOVD F30, (8*20)(R3) + MOVD F31, (8*21)(R3) + + // Initialize Go ABI environment + JAL runtime·load_g(SB) + + JAL runtime·cgocallback(SB) + + MOVV (8*4)(R3), R23 + MOVV (8*5)(R3), R24 + MOVV (8*6)(R3), R25 + MOVV (8*7)(R3), R26 + MOVV (8*8)(R3), R27 + MOVV (8*9)(R3), R28 + MOVV (8*10)(R3), R29 + MOVV (8*11)(R3), R30 + MOVV (8*12)(R3), g + MOVV (8*13)(R3), R1 + MOVD (8*14)(R3), F24 + MOVD (8*15)(R3), F25 + MOVD (8*16)(R3), F26 + MOVD (8*17)(R3), F27 + MOVD (8*18)(R3), F28 + MOVD (8*19)(R3), F29 + MOVD (8*20)(R3), F30 + MOVD (8*21)(R3), F31 + ADDV $(8*22), R3 + + RET diff --git a/src/runtime/cgo/asm_mips64x.s b/src/runtime/cgo/asm_mips64x.s new file mode 100644 index 0000000..ba94807 --- /dev/null +++ b/src/runtime/cgo/asm_mips64x.s @@ -0,0 +1,83 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le +// +build mips64 mips64le + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + /* + * We still need to save all callee save register as before, and then + * push 3 args for fn (R4, R5, R7), skipping R6. 
+ * Also note that at procedure entry in gc world, 8(R29) will be the + * first arg. + */ +#ifndef GOMIPS64_softfloat + ADDV $(-8*23), R29 +#else + ADDV $(-8*15), R29 +#endif + MOVV R4, (8*1)(R29) // fn unsafe.Pointer + MOVV R5, (8*2)(R29) // a unsafe.Pointer + MOVV R7, (8*3)(R29) // ctxt uintptr + MOVV R16, (8*4)(R29) + MOVV R17, (8*5)(R29) + MOVV R18, (8*6)(R29) + MOVV R19, (8*7)(R29) + MOVV R20, (8*8)(R29) + MOVV R21, (8*9)(R29) + MOVV R22, (8*10)(R29) + MOVV R23, (8*11)(R29) + MOVV RSB, (8*12)(R29) + MOVV g, (8*13)(R29) + MOVV R31, (8*14)(R29) +#ifndef GOMIPS64_softfloat + MOVD F24, (8*15)(R29) + MOVD F25, (8*16)(R29) + MOVD F26, (8*17)(R29) + MOVD F27, (8*18)(R29) + MOVD F28, (8*19)(R29) + MOVD F29, (8*20)(R29) + MOVD F30, (8*21)(R29) + MOVD F31, (8*22)(R29) +#endif + // Initialize Go ABI environment + // prepare SB register = PC & 0xffffffff00000000 + BGEZAL R0, 1(PC) + SRLV $32, R31, RSB + SLLV $32, RSB + JAL runtime·load_g(SB) + + JAL runtime·cgocallback(SB) + + MOVV (8*4)(R29), R16 + MOVV (8*5)(R29), R17 + MOVV (8*6)(R29), R18 + MOVV (8*7)(R29), R19 + MOVV (8*8)(R29), R20 + MOVV (8*9)(R29), R21 + MOVV (8*10)(R29), R22 + MOVV (8*11)(R29), R23 + MOVV (8*12)(R29), RSB + MOVV (8*13)(R29), g + MOVV (8*14)(R29), R31 +#ifndef GOMIPS64_softfloat + MOVD (8*15)(R29), F24 + MOVD (8*16)(R29), F25 + MOVD (8*17)(R29), F26 + MOVD (8*18)(R29), F27 + MOVD (8*19)(R29), F28 + MOVD (8*20)(R29), F29 + MOVD (8*21)(R29), F30 + MOVD (8*22)(R29), F31 + ADDV $(8*23), R29 +#else + ADDV $(8*15), R29 +#endif + RET diff --git a/src/runtime/cgo/asm_mipsx.s b/src/runtime/cgo/asm_mipsx.s new file mode 100644 index 0000000..fd5d78e --- /dev/null +++ b/src/runtime/cgo/asm_mipsx.s @@ -0,0 +1,76 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle +// +build mips mipsle + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. 
+// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + /* + * We still need to save all callee save register as before, and then + * push 3 args for fn (R4, R5, R7), skipping R6. + * Also note that at procedure entry in gc world, 4(R29) will be the + * first arg. + */ + + // Space for 9 caller-saved GPR + LR + 6 caller-saved FPR. + // O32 ABI allows us to smash 16 bytes argument area of caller frame. +#ifndef GOMIPS_softfloat + SUBU $(4*14+8*6-16), R29 +#else + SUBU $(4*14-16), R29 // For soft-float, no FPR. +#endif + MOVW R4, (4*1)(R29) // fn unsafe.Pointer + MOVW R5, (4*2)(R29) // a unsafe.Pointer + MOVW R7, (4*3)(R29) // ctxt uintptr + MOVW R16, (4*4)(R29) + MOVW R17, (4*5)(R29) + MOVW R18, (4*6)(R29) + MOVW R19, (4*7)(R29) + MOVW R20, (4*8)(R29) + MOVW R21, (4*9)(R29) + MOVW R22, (4*10)(R29) + MOVW R23, (4*11)(R29) + MOVW g, (4*12)(R29) + MOVW R31, (4*13)(R29) +#ifndef GOMIPS_softfloat + MOVD F20, (4*14)(R29) + MOVD F22, (4*14+8*1)(R29) + MOVD F24, (4*14+8*2)(R29) + MOVD F26, (4*14+8*3)(R29) + MOVD F28, (4*14+8*4)(R29) + MOVD F30, (4*14+8*5)(R29) +#endif + JAL runtime·load_g(SB) + + JAL runtime·cgocallback(SB) + + MOVW (4*4)(R29), R16 + MOVW (4*5)(R29), R17 + MOVW (4*6)(R29), R18 + MOVW (4*7)(R29), R19 + MOVW (4*8)(R29), R20 + MOVW (4*9)(R29), R21 + MOVW (4*10)(R29), R22 + MOVW (4*11)(R29), R23 + MOVW (4*12)(R29), g + MOVW (4*13)(R29), R31 +#ifndef GOMIPS_softfloat + MOVD (4*14)(R29), F20 + MOVD (4*14+8*1)(R29), F22 + MOVD (4*14+8*2)(R29), F24 + MOVD (4*14+8*3)(R29), F26 + MOVD (4*14+8*4)(R29), F28 + MOVD (4*14+8*5)(R29), F30 + + ADDU $(4*14+8*6-16), R29 +#else + ADDU $(4*14-16), R29 +#endif + RET diff --git a/src/runtime/cgo/asm_ppc64x.s b/src/runtime/cgo/asm_ppc64x.s new file mode 100644 index 0000000..187b2d4 --- /dev/null +++ b/src/runtime/cgo/asm_ppc64x.s @@ -0,0 +1,139 @@ 
+// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le +// +build ppc64 ppc64le + +#include "textflag.h" +#include "asm_ppc64x.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +// The value of R2 is saved on the new stack frame, and not +// the caller's frame due to issue #43228. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + // Start with standard C stack frame layout and linkage + MOVD LR, R0 + MOVD R0, 16(R1) // Save LR in caller's frame + MOVW CR, R0 // Save CR in caller's frame + MOVW R0, 8(R1) + + BL saveregs2<>(SB) + + MOVDU R1, (-288-3*8-FIXED_FRAME)(R1) + // Save the caller's R2 + MOVD R2, 24(R1) + + // Initialize Go ABI environment + BL runtime·reginit(SB) + BL runtime·load_g(SB) + +#ifdef GOARCH_ppc64 + // ppc64 uses ELF ABI v1, so we must get the real entry address from + // the first slot of the function descriptor before the call. + // Same for AIX. 
+ MOVD 8(R3), R2 + MOVD (R3), R3 +#endif + MOVD R3, FIXED_FRAME+0(R1) // fn unsafe.Pointer + MOVD R4, FIXED_FRAME+8(R1) // a unsafe.Pointer + // Skip R5 = n uint32 + MOVD R6, FIXED_FRAME+16(R1) // ctxt uintptr + BL runtime·cgocallback(SB) + + // Restore the caller's R2 + MOVD 24(R1), R2 + ADD $(288+3*8+FIXED_FRAME), R1 + + BL restoreregs2<>(SB) + + MOVW 8(R1), R0 + MOVFL R0, $0xff + MOVD 16(R1), R0 + MOVD R0, LR + RET + +TEXT saveregs2<>(SB),NOSPLIT|NOFRAME,$0 + // O=-288; for R in R{14..31}; do echo "\tMOVD\t$R, $O(R1)"|sed s/R30/g/; ((O+=8)); done; for F in F{14..31}; do echo "\tFMOVD\t$F, $O(R1)"; ((O+=8)); done + MOVD R14, -288(R1) + MOVD R15, -280(R1) + MOVD R16, -272(R1) + MOVD R17, -264(R1) + MOVD R18, -256(R1) + MOVD R19, -248(R1) + MOVD R20, -240(R1) + MOVD R21, -232(R1) + MOVD R22, -224(R1) + MOVD R23, -216(R1) + MOVD R24, -208(R1) + MOVD R25, -200(R1) + MOVD R26, -192(R1) + MOVD R27, -184(R1) + MOVD R28, -176(R1) + MOVD R29, -168(R1) + MOVD g, -160(R1) + MOVD R31, -152(R1) + FMOVD F14, -144(R1) + FMOVD F15, -136(R1) + FMOVD F16, -128(R1) + FMOVD F17, -120(R1) + FMOVD F18, -112(R1) + FMOVD F19, -104(R1) + FMOVD F20, -96(R1) + FMOVD F21, -88(R1) + FMOVD F22, -80(R1) + FMOVD F23, -72(R1) + FMOVD F24, -64(R1) + FMOVD F25, -56(R1) + FMOVD F26, -48(R1) + FMOVD F27, -40(R1) + FMOVD F28, -32(R1) + FMOVD F29, -24(R1) + FMOVD F30, -16(R1) + FMOVD F31, -8(R1) + + RET + +TEXT restoreregs2<>(SB),NOSPLIT|NOFRAME,$0 + // O=-288; for R in R{14..31}; do echo "\tMOVD\t$O(R1), $R"|sed s/R30/g/; ((O+=8)); done; for F in F{14..31}; do echo "\tFMOVD\t$O(R1), $F"; ((O+=8)); done + MOVD -288(R1), R14 + MOVD -280(R1), R15 + MOVD -272(R1), R16 + MOVD -264(R1), R17 + MOVD -256(R1), R18 + MOVD -248(R1), R19 + MOVD -240(R1), R20 + MOVD -232(R1), R21 + MOVD -224(R1), R22 + MOVD -216(R1), R23 + MOVD -208(R1), R24 + MOVD -200(R1), R25 + MOVD -192(R1), R26 + MOVD -184(R1), R27 + MOVD -176(R1), R28 + MOVD -168(R1), R29 + MOVD -160(R1), g + MOVD -152(R1), R31 + FMOVD -144(R1), F14 + FMOVD 
-136(R1), F15 + FMOVD -128(R1), F16 + FMOVD -120(R1), F17 + FMOVD -112(R1), F18 + FMOVD -104(R1), F19 + FMOVD -96(R1), F20 + FMOVD -88(R1), F21 + FMOVD -80(R1), F22 + FMOVD -72(R1), F23 + FMOVD -64(R1), F24 + FMOVD -56(R1), F25 + FMOVD -48(R1), F26 + FMOVD -40(R1), F27 + FMOVD -32(R1), F28 + FMOVD -24(R1), F29 + FMOVD -16(R1), F30 + FMOVD -8(R1), F31 + + RET diff --git a/src/runtime/cgo/asm_riscv64.s b/src/runtime/cgo/asm_riscv64.s new file mode 100644 index 0000000..45151bf --- /dev/null +++ b/src/runtime/cgo/asm_riscv64.s @@ -0,0 +1,78 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. +TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + /* + * Push arguments for fn (X10, X11, X13), along with all callee-save + * registers. Note that at procedure entry the first argument is at + * 8(X2). 
+ */ + ADD $(-8*29), X2 + MOV X10, (8*1)(X2) // fn unsafe.Pointer + MOV X11, (8*2)(X2) // a unsafe.Pointer + MOV X13, (8*3)(X2) // ctxt uintptr + MOV X8, (8*4)(X2) + MOV X9, (8*5)(X2) + MOV X18, (8*6)(X2) + MOV X19, (8*7)(X2) + MOV X20, (8*8)(X2) + MOV X21, (8*9)(X2) + MOV X22, (8*10)(X2) + MOV X23, (8*11)(X2) + MOV X24, (8*12)(X2) + MOV X25, (8*13)(X2) + MOV X26, (8*14)(X2) + MOV g, (8*15)(X2) + MOV X1, (8*16)(X2) + MOVD F8, (8*17)(X2) + MOVD F9, (8*18)(X2) + MOVD F18, (8*19)(X2) + MOVD F19, (8*20)(X2) + MOVD F20, (8*21)(X2) + MOVD F21, (8*22)(X2) + MOVD F22, (8*23)(X2) + MOVD F23, (8*24)(X2) + MOVD F24, (8*25)(X2) + MOVD F25, (8*26)(X2) + MOVD F26, (8*27)(X2) + MOVD F27, (8*28)(X2) + + // Initialize Go ABI environment + CALL runtime·load_g(SB) + CALL runtime·cgocallback(SB) + + MOV (8*4)(X2), X8 + MOV (8*5)(X2), X9 + MOV (8*6)(X2), X18 + MOV (8*7)(X2), X19 + MOV (8*8)(X2), X20 + MOV (8*9)(X2), X21 + MOV (8*10)(X2), X22 + MOV (8*11)(X2), X23 + MOV (8*12)(X2), X24 + MOV (8*13)(X2), X25 + MOV (8*14)(X2), X26 + MOV (8*15)(X2), g + MOV (8*16)(X2), X1 + MOVD (8*17)(X2), F8 + MOVD (8*18)(X2), F9 + MOVD (8*19)(X2), F18 + MOVD (8*20)(X2), F19 + MOVD (8*21)(X2), F20 + MOVD (8*22)(X2), F21 + MOVD (8*23)(X2), F22 + MOVD (8*24)(X2), F23 + MOVD (8*25)(X2), F24 + MOVD (8*26)(X2), F25 + MOVD (8*27)(X2), F26 + MOVD (8*28)(X2), F27 + ADD $(8*29), X2 + + RET diff --git a/src/runtime/cgo/asm_s390x.s b/src/runtime/cgo/asm_s390x.s new file mode 100644 index 0000000..8bf16e7 --- /dev/null +++ b/src/runtime/cgo/asm_s390x.s @@ -0,0 +1,55 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Called by C code generated by cmd/cgo. +// func crosscall2(fn, a unsafe.Pointer, n int32, ctxt uintptr) +// Saves C callee-saved registers and calls cgocallback with three arguments. +// fn is the PC of a func(a unsafe.Pointer) function. 
+TEXT crosscall2(SB),NOSPLIT|NOFRAME,$0 + // Start with standard C stack frame layout and linkage. + + // Save R6-R15 in the register save area of the calling function. + STMG R6, R15, 48(R15) + + // Allocate 96 bytes on the stack. + MOVD $-96(R15), R15 + + // Save F8-F15 in our stack frame. + FMOVD F8, 32(R15) + FMOVD F9, 40(R15) + FMOVD F10, 48(R15) + FMOVD F11, 56(R15) + FMOVD F12, 64(R15) + FMOVD F13, 72(R15) + FMOVD F14, 80(R15) + FMOVD F15, 88(R15) + + // Initialize Go ABI environment. + BL runtime·load_g(SB) + + MOVD R2, 8(R15) // fn unsafe.Pointer + MOVD R3, 16(R15) // a unsafe.Pointer + // Skip R4 = n uint32 + MOVD R5, 24(R15) // ctxt uintptr + BL runtime·cgocallback(SB) + + FMOVD 32(R15), F8 + FMOVD 40(R15), F9 + FMOVD 48(R15), F10 + FMOVD 56(R15), F11 + FMOVD 64(R15), F12 + FMOVD 72(R15), F13 + FMOVD 80(R15), F14 + FMOVD 88(R15), F15 + + // De-allocate stack frame. + MOVD $96(R15), R15 + + // Restore R6-R15. + LMG 48(R15), R6, R15 + + RET + diff --git a/src/runtime/cgo/asm_wasm.s b/src/runtime/cgo/asm_wasm.s new file mode 100644 index 0000000..cb140eb --- /dev/null +++ b/src/runtime/cgo/asm_wasm.s @@ -0,0 +1,8 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT crosscall2(SB), NOSPLIT, $0 + UNDEF diff --git a/src/runtime/cgo/callbacks.go b/src/runtime/cgo/callbacks.go new file mode 100644 index 0000000..e7c8ef3 --- /dev/null +++ b/src/runtime/cgo/callbacks.go @@ -0,0 +1,107 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package cgo + +import "unsafe" + +// These utility functions are available to be called from code +// compiled with gcc via crosscall2. 
+ +// The declaration of crosscall2 is: +// void crosscall2(void (*fn)(void *), void *, int); +// +// We need to export the symbol crosscall2 in order to support +// callbacks from shared libraries. This applies regardless of +// linking mode. +// +// Compatibility note: SWIG uses crosscall2 in exactly one situation: +// to call _cgo_panic using the pattern shown below. We need to keep +// that pattern working. In particular, crosscall2 actually takes four +// arguments, but it works to call it with three arguments when +// calling _cgo_panic. +// +//go:cgo_export_static crosscall2 +//go:cgo_export_dynamic crosscall2 + +// Panic. The argument is converted into a Go string. + +// Call like this in code compiled with gcc: +// struct { const char *p; } a; +// a.p = /* string to pass to panic */; +// crosscall2(_cgo_panic, &a, sizeof a); +// /* The function call will not return. */ + +// TODO: We should export a regular C function to panic, change SWIG +// to use that instead of the above pattern, and then we can drop +// backwards-compatibility from crosscall2 and stop exporting it. + +//go:linkname _runtime_cgo_panic_internal runtime._cgo_panic_internal +func _runtime_cgo_panic_internal(p *byte) + +//go:linkname _cgo_panic _cgo_panic +//go:cgo_export_static _cgo_panic +//go:cgo_export_dynamic _cgo_panic +func _cgo_panic(a *struct{ cstr *byte }) { + _runtime_cgo_panic_internal(a.cstr) +} + +//go:cgo_import_static x_cgo_init +//go:linkname x_cgo_init x_cgo_init +//go:linkname _cgo_init _cgo_init +var x_cgo_init byte +var _cgo_init = &x_cgo_init + +//go:cgo_import_static x_cgo_thread_start +//go:linkname x_cgo_thread_start x_cgo_thread_start +//go:linkname _cgo_thread_start _cgo_thread_start +var x_cgo_thread_start byte +var _cgo_thread_start = &x_cgo_thread_start + +// Creates a new system thread without updating any Go state. +// +// This method is invoked during shared library loading to create a new OS +// thread to perform the runtime initialization. 
This method is similar to +// _cgo_sys_thread_start except that it doesn't update any Go state. + +//go:cgo_import_static x_cgo_sys_thread_create +//go:linkname x_cgo_sys_thread_create x_cgo_sys_thread_create +//go:linkname _cgo_sys_thread_create _cgo_sys_thread_create +var x_cgo_sys_thread_create byte +var _cgo_sys_thread_create = &x_cgo_sys_thread_create + +// Notifies that the runtime has been initialized. +// +// We currently block at every CGO entry point (via _cgo_wait_runtime_init_done) +// to ensure that the runtime has been initialized before the CGO call is +// executed. This is necessary for shared libraries where we kickoff runtime +// initialization in a separate thread and return without waiting for this +// thread to complete the init. + +//go:cgo_import_static x_cgo_notify_runtime_init_done +//go:linkname x_cgo_notify_runtime_init_done x_cgo_notify_runtime_init_done +//go:linkname _cgo_notify_runtime_init_done _cgo_notify_runtime_init_done +var x_cgo_notify_runtime_init_done byte +var _cgo_notify_runtime_init_done = &x_cgo_notify_runtime_init_done + +// Sets the traceback context function. See runtime.SetCgoTraceback. + +//go:cgo_import_static x_cgo_set_context_function +//go:linkname x_cgo_set_context_function x_cgo_set_context_function +//go:linkname _cgo_set_context_function _cgo_set_context_function +var x_cgo_set_context_function byte +var _cgo_set_context_function = &x_cgo_set_context_function + +// Calls a libc function to execute background work injected via libc +// interceptors, such as processing pending signals under the thread +// sanitizer. +// +// Left as a nil pointer if no libc interceptors are expected. 
+ +//go:cgo_import_static _cgo_yield +//go:linkname _cgo_yield _cgo_yield +var _cgo_yield unsafe.Pointer + +//go:cgo_export_static _cgo_topofstack +//go:cgo_export_dynamic _cgo_topofstack diff --git a/src/runtime/cgo/callbacks_aix.go b/src/runtime/cgo/callbacks_aix.go new file mode 100644 index 0000000..8f756fb --- /dev/null +++ b/src/runtime/cgo/callbacks_aix.go @@ -0,0 +1,12 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package cgo + +// These functions must be exported in order to perform +// longcall on cgo programs (cf gcc_aix_ppc64.c). +// +//go:cgo_export_static __cgo_topofstack +//go:cgo_export_static runtime.rt0_go +//go:cgo_export_static _rt0_ppc64_aix_lib diff --git a/src/runtime/cgo/callbacks_traceback.go b/src/runtime/cgo/callbacks_traceback.go new file mode 100644 index 0000000..dae31a8 --- /dev/null +++ b/src/runtime/cgo/callbacks_traceback.go @@ -0,0 +1,17 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build darwin || linux + +package cgo + +import _ "unsafe" // for go:linkname + +// Calls the traceback function passed to SetCgoTraceback. + +//go:cgo_import_static x_cgo_callers +//go:linkname x_cgo_callers x_cgo_callers +//go:linkname _cgo_callers _cgo_callers +var x_cgo_callers byte +var _cgo_callers = &x_cgo_callers diff --git a/src/runtime/cgo/cgo.go b/src/runtime/cgo/cgo.go new file mode 100644 index 0000000..b8473e5 --- /dev/null +++ b/src/runtime/cgo/cgo.go @@ -0,0 +1,40 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +/* +Package cgo contains runtime support for code generated +by the cgo tool. See the documentation for the cgo command +for details on using cgo. 
+*/ +package cgo + +/* + +#cgo darwin,!arm64 LDFLAGS: -lpthread +#cgo darwin,arm64 LDFLAGS: -framework CoreFoundation +#cgo dragonfly LDFLAGS: -lpthread +#cgo freebsd LDFLAGS: -lpthread +#cgo android LDFLAGS: -llog +#cgo !android,linux LDFLAGS: -lpthread +#cgo netbsd LDFLAGS: -lpthread +#cgo openbsd LDFLAGS: -lpthread +#cgo aix LDFLAGS: -Wl,-berok +#cgo solaris LDFLAGS: -lxnet +#cgo solaris LDFLAGS: -lsocket + +// We use -fno-stack-protector because internal linking won't find +// the support functions. See issues #52919 and #54313. +#cgo CFLAGS: -Wall -Werror -fno-stack-protector + +#cgo solaris CPPFLAGS: -D_POSIX_PTHREAD_SEMANTICS + +*/ +import "C" + +import "runtime/internal/sys" + +// Incomplete is used specifically for the semantics of incomplete C types. +type Incomplete struct { + _ sys.NotInHeap +} diff --git a/src/runtime/cgo/dragonfly.go b/src/runtime/cgo/dragonfly.go new file mode 100644 index 0000000..36d70e3 --- /dev/null +++ b/src/runtime/cgo/dragonfly.go @@ -0,0 +1,19 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build dragonfly + +package cgo + +import _ "unsafe" // for go:linkname + +// Supply environ and __progname, because we don't +// link against the standard DragonFly crt0.o and the +// libc dynamic library needs them. + +//go:linkname _environ environ +//go:linkname _progname __progname + +var _environ uintptr +var _progname uintptr diff --git a/src/runtime/cgo/freebsd.go b/src/runtime/cgo/freebsd.go new file mode 100644 index 0000000..2d9f624 --- /dev/null +++ b/src/runtime/cgo/freebsd.go @@ -0,0 +1,22 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build freebsd + +package cgo + +import _ "unsafe" // for go:linkname + +// Supply environ and __progname, because we don't +// link against the standard FreeBSD crt0.o and the +// libc dynamic library needs them. + +//go:linkname _environ environ +//go:linkname _progname __progname + +//go:cgo_export_dynamic environ +//go:cgo_export_dynamic __progname + +var _environ uintptr +var _progname uintptr diff --git a/src/runtime/cgo/gcc_386.S b/src/runtime/cgo/gcc_386.S new file mode 100644 index 0000000..5e6d715 --- /dev/null +++ b/src/runtime/cgo/gcc_386.S @@ -0,0 +1,42 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_386.S" + +/* + * Apple still insists on underscore prefixes for C function names. + */ +#if defined(__APPLE__) || defined(_WIN32) +#define EXT(s) _##s +#else +#define EXT(s) s +#endif + +/* + * void crosscall_386(void (*fn)(void)) + * + * Calling into the 8c tool chain, where all registers are caller save. + * Called from standard x86 ABI, where %ebp, %ebx, %esi, + * and %edi are callee-save, so they must be saved explicitly. + */ +.globl EXT(crosscall_386) +EXT(crosscall_386): + pushl %ebp + movl %esp, %ebp + pushl %ebx + pushl %esi + pushl %edi + + movl 8(%ebp), %eax /* fn */ + call *%eax + + popl %edi + popl %esi + popl %ebx + popl %ebp + ret + +#ifdef __ELF__ +.section .note.GNU-stack,"",@progbits +#endif diff --git a/src/runtime/cgo/gcc_aix_ppc64.S b/src/runtime/cgo/gcc_aix_ppc64.S new file mode 100644 index 0000000..a77363e --- /dev/null +++ b/src/runtime/cgo/gcc_aix_ppc64.S @@ -0,0 +1,135 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build ppc64 +// +build aix + +.file "gcc_aix_ppc64.S" + +/* + * void crosscall_ppc64(void (*fn)(void), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. + * Called from standard ppc64 C ABI, where r2, r14-r31, f14-f31 are + * callee-save, so they must be saved explicitly. + * AIX has a special assembly syntax and keywords that can be mixed with + * Linux assembly. + */ + .toc + .csect .text[PR] + .globl crosscall_ppc64 + .globl .crosscall_ppc64 + .csect crosscall_ppc64[DS] +crosscall_ppc64: + .llong .crosscall_ppc64, TOC[tc0], 0 + .csect .text[PR] +.crosscall_ppc64: + // Start with standard C stack frame layout and linkage + mflr 0 + std 0, 16(1) // Save LR in caller's frame + std 2, 40(1) // Save TOC in caller's frame + bl saveregs + stdu 1, -296(1) + + // Set up Go ABI constant registers + // Must match _cgo_reginit in runtime package. + xor 0, 0, 0 + + // Restore g pointer (r30 in Go ABI, which may have been clobbered by C) + mr 30, 4 + + // Call fn + mr 12, 3 + mtctr 12 + bctrl + + addi 1, 1, 296 + bl restoreregs + ld 2, 40(1) + ld 0, 16(1) + mtlr 0 + blr + +saveregs: + // Save callee-save registers + // O=-288; for R in {14..31}; do echo "\tstd\t$R, $O(1)"; ((O+=8)); done; for F in f{14..31}; do echo "\tstfd\t$F, $O(1)"; ((O+=8)); done + std 14, -288(1) + std 15, -280(1) + std 16, -272(1) + std 17, -264(1) + std 18, -256(1) + std 19, -248(1) + std 20, -240(1) + std 21, -232(1) + std 22, -224(1) + std 23, -216(1) + std 24, -208(1) + std 25, -200(1) + std 26, -192(1) + std 27, -184(1) + std 28, -176(1) + std 29, -168(1) + std 30, -160(1) + std 31, -152(1) + stfd 14, -144(1) + stfd 15, -136(1) + stfd 16, -128(1) + stfd 17, -120(1) + stfd 18, -112(1) + stfd 19, -104(1) + stfd 20, -96(1) + stfd 21, -88(1) + stfd 22, -80(1) + stfd 23, -72(1) + stfd 24, -64(1) + stfd 25, -56(1) + stfd 26, -48(1) + stfd 27, -40(1) + stfd 28, -32(1) + stfd 29, -24(1) + stfd 30, -16(1) + stfd 31, -8(1) + + blr + +restoreregs: + // O=-288; for R 
in {14..31}; do echo "\tld\t$R, $O(1)"; ((O+=8)); done; for F in {14..31}; do echo "\tlfd\t$F, $O(1)"; ((O+=8)); done + ld 14, -288(1) + ld 15, -280(1) + ld 16, -272(1) + ld 17, -264(1) + ld 18, -256(1) + ld 19, -248(1) + ld 20, -240(1) + ld 21, -232(1) + ld 22, -224(1) + ld 23, -216(1) + ld 24, -208(1) + ld 25, -200(1) + ld 26, -192(1) + ld 27, -184(1) + ld 28, -176(1) + ld 29, -168(1) + ld 30, -160(1) + ld 31, -152(1) + lfd 14, -144(1) + lfd 15, -136(1) + lfd 16, -128(1) + lfd 17, -120(1) + lfd 18, -112(1) + lfd 19, -104(1) + lfd 20, -96(1) + lfd 21, -88(1) + lfd 22, -80(1) + lfd 23, -72(1) + lfd 24, -64(1) + lfd 25, -56(1) + lfd 26, -48(1) + lfd 27, -40(1) + lfd 28, -32(1) + lfd 29, -24(1) + lfd 30, -16(1) + lfd 31, -8(1) + + blr diff --git a/src/runtime/cgo/gcc_aix_ppc64.c b/src/runtime/cgo/gcc_aix_ppc64.c new file mode 100644 index 0000000..f4f50b8 --- /dev/null +++ b/src/runtime/cgo/gcc_aix_ppc64.c @@ -0,0 +1,38 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build aix +// +build ppc64 ppc64le + +/* + * On AIX, calls to _cgo_topofstack and Go main are forced to be longcalls. + * Without this, ld might add trampolines in the middle of the .text section + * to reach these functions, which are normally declared in the runtime package. + */ +extern int __attribute__((longcall)) __cgo_topofstack(void); +extern int __attribute__((longcall)) runtime_rt0_go(int argc, char **argv); +extern void __attribute__((longcall)) _rt0_ppc64_aix_lib(void); + +int _cgo_topofstack(void) { + return __cgo_topofstack(); +} + +int main(int argc, char **argv) { + return runtime_rt0_go(argc, argv); +} + +static void libinit(void) __attribute__ ((constructor)); + +/* + * libinit aims to replace the .init_array section, which isn't available on AIX. + * Using __attribute__ ((constructor)) lets gcc handle this instead of + * adding special code in cmd/link. 
+ * However, it will be called for every Go program that uses cgo. + * Inside _rt0_ppc64_aix_lib(), runtime.isarchive is checked in order + * to know if this program is a c-archive or a simple cgo program. + * If it's not set, _rt0_ppc64_aix_lib() returns directly. + */ +static void libinit() { + _rt0_ppc64_aix_lib(); +} diff --git a/src/runtime/cgo/gcc_amd64.S b/src/runtime/cgo/gcc_amd64.S new file mode 100644 index 0000000..5a1629e --- /dev/null +++ b/src/runtime/cgo/gcc_amd64.S @@ -0,0 +1,55 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_amd64.S" + +/* + * Apple still insists on underscore prefixes for C function names. + */ +#if defined(__APPLE__) +#define EXT(s) _##s +#else +#define EXT(s) s +#endif + +/* + * void crosscall_amd64(void (*fn)(void), void (*setg_gcc)(void*), void *g) + * + * Calling into the 6c tool chain, where all registers are caller save. + * Called from standard x86-64 ABI, where %rbx, %rbp, %r12-%r15 + * are callee-save so they must be saved explicitly. + * The standard x86-64 ABI passes the three arguments fn, setg_gcc, g + * in %rdi, %rsi, %rdx. + */ +.globl EXT(crosscall_amd64) +EXT(crosscall_amd64): + pushq %rbx + pushq %rbp + pushq %r12 + pushq %r13 + pushq %r14 + pushq %r15 + +#if defined(_WIN64) + movq %r8, %rdi /* arg of setg_gcc */ + call *%rdx /* setg_gcc */ + call *%rcx /* fn */ +#else + movq %rdi, %rbx + movq %rdx, %rdi /* arg of setg_gcc */ + call *%rsi /* setg_gcc */ + call *%rbx /* fn */ +#endif + + popq %r15 + popq %r14 + popq %r13 + popq %r12 + popq %rbp + popq %rbx + ret + +#ifdef __ELF__ +.section .note.GNU-stack,"",@progbits +#endif diff --git a/src/runtime/cgo/gcc_android.c b/src/runtime/cgo/gcc_android.c new file mode 100644 index 0000000..7ea2135 --- /dev/null +++ b/src/runtime/cgo/gcc_android.c @@ -0,0 +1,90 @@ +// Copyright 2014 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <stdarg.h> +#include <android/log.h> +#include <pthread.h> +#include <dlfcn.h> +#include "libcgo.h" + +void +fatalf(const char* format, ...) +{ + va_list ap; + + // Write to both stderr and logcat. + // + // When running from an .apk, /dev/stderr and /dev/stdout + // redirect to /dev/null. And when running a test binary + // via adb shell, it's easy to miss logcat. + + fprintf(stderr, "runtime/cgo: "); + va_start(ap, format); + vfprintf(stderr, format, ap); + va_end(ap); + fprintf(stderr, "\n"); + + va_start(ap, format); + __android_log_vprint(ANDROID_LOG_FATAL, "runtime/cgo", format, ap); + va_end(ap); + + abort(); +} + +// Truncated to a different magic value on 32-bit; that's ok. +#define magic1 (0x23581321345589ULL) + +// From https://android.googlesource.com/platform/bionic/+/refs/heads/android10-tests-release/libc/private/bionic_asm_tls.h#69. +#define TLS_SLOT_APP 2 + +// inittls allocates a thread-local storage slot for g. +// +// It finds the first available slot using pthread_key_create and uses +// it as the offset value for runtime.tls_g. +static void +inittls(void **tlsg, void **tlsbase) +{ + pthread_key_t k; + int i, err; + void *handle, *get_ver, *off; + + // Check for Android Q where we can use the free TLS_SLOT_APP slot. + handle = dlopen("libc.so", RTLD_LAZY); + if (handle == NULL) { + fatalf("inittls: failed to dlopen main program"); + return; + } + // android_get_device_api_level is introduced in Android Q, so its mere presence + // is enough. + get_ver = dlsym(handle, "android_get_device_api_level"); + dlclose(handle); + if (get_ver != NULL) { + off = (void *)(TLS_SLOT_APP*sizeof(void *)); + // tlsg is initialized to Q's free TLS slot. Verify it while we're here. 
+ if (*tlsg != off) { + fatalf("tlsg offset wrong, got %ld want %ld\n", *tlsg, off); + } + return; + } + + err = pthread_key_create(&k, nil); + if(err != 0) { + fatalf("pthread_key_create failed: %d", err); + } + pthread_setspecific(k, (void*)magic1); + // If thread local slots are laid out as we expect, our magic word will + // be located at some low offset from tlsbase. However, just in case something went + // wrong, the search is limited to sensible offsets. PTHREAD_KEYS_MAX was the + // original limit, but issue 19472 made a higher limit necessary. + for (i=0; i<384; i++) { + if (*(tlsbase+i) == (void*)magic1) { + *tlsg = (void*)(i*sizeof(void *)); + pthread_setspecific(k, 0); + return; + } + } + fatalf("inittls: could not find pthread key"); +} + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase) = inittls; diff --git a/src/runtime/cgo/gcc_arm.S b/src/runtime/cgo/gcc_arm.S new file mode 100644 index 0000000..6e8c14a --- /dev/null +++ b/src/runtime/cgo/gcc_arm.S @@ -0,0 +1,44 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_arm.S" + +/* + * Apple still insists on underscore prefixes for C function names. + */ +#if defined(__APPLE__) +#define EXT(s) _##s +#else +#define EXT(s) s +#endif + +// Apple's ld64 wants 4-byte alignment for ARM code sections. +// .align in both Apple as and GNU as treat n as aligning to 2**n bytes. +.align 2 + +/* + * void crosscall_arm1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the 5c tool chain, where all registers are caller save. + * Called from standard ARM EABI, where r4-r11 are callee-save, so they + * must be saved explicitly. 
+ */ +.globl EXT(crosscall_arm1) +EXT(crosscall_arm1): + push {r4, r5, r6, r7, r8, r9, r10, r11, ip, lr} + mov r4, r0 + mov r5, r1 + mov r0, r2 + + // Because the assembler might target an earlier revision of the ISA + // by default, we encode BLX as a .word. + .word 0xe12fff35 // blx r5 // setg(g) + .word 0xe12fff34 // blx r4 // fn() + + pop {r4, r5, r6, r7, r8, r9, r10, r11, ip, pc} + + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_arm64.S b/src/runtime/cgo/gcc_arm64.S new file mode 100644 index 0000000..865f67c --- /dev/null +++ b/src/runtime/cgo/gcc_arm64.S @@ -0,0 +1,84 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_arm64.S" + +/* + * Apple still insists on underscore prefixes for C function names. + */ +#if defined(__APPLE__) +#define EXT(s) _##s +#else +#define EXT(s) s +#endif + +// Apple's ld64 wants 4-byte alignment for ARM code sections. +// .align in both Apple as and GNU as treat n as aligning to 2**n bytes. +.align 2 + +/* + * void crosscall1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. + * Called from the standard ARM64 C ABI (AAPCS64), where x19-x29 are + * callee-save, so they must be saved explicitly, along with x30 (LR). + */ +.globl EXT(crosscall1) +EXT(crosscall1): + .cfi_startproc + stp x29, x30, [sp, #-96]! 
+ .cfi_def_cfa_offset 96 + .cfi_offset 29, -96 + .cfi_offset 30, -88 + mov x29, sp + .cfi_def_cfa_register 29 + stp x19, x20, [sp, #80] + .cfi_offset 19, -16 + .cfi_offset 20, -8 + stp x21, x22, [sp, #64] + .cfi_offset 21, -32 + .cfi_offset 22, -24 + stp x23, x24, [sp, #48] + .cfi_offset 23, -48 + .cfi_offset 24, -40 + stp x25, x26, [sp, #32] + .cfi_offset 25, -64 + .cfi_offset 26, -56 + stp x27, x28, [sp, #16] + .cfi_offset 27, -80 + .cfi_offset 28, -72 + + mov x19, x0 + mov x20, x1 + mov x0, x2 + + blr x20 + blr x19 + + ldp x27, x28, [sp, #16] + .cfi_restore 27 + .cfi_restore 28 + ldp x25, x26, [sp, #32] + .cfi_restore 25 + .cfi_restore 26 + ldp x23, x24, [sp, #48] + .cfi_restore 23 + .cfi_restore 24 + ldp x21, x22, [sp, #64] + .cfi_restore 21 + .cfi_restore 22 + ldp x19, x20, [sp, #80] + .cfi_restore 19 + .cfi_restore 20 + ldp x29, x30, [sp], #96 + .cfi_restore 29 + .cfi_restore 30 + .cfi_def_cfa 31, 0 + ret + .cfi_endproc + + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_context.c b/src/runtime/cgo/gcc_context.c new file mode 100644 index 0000000..5fc0abb --- /dev/null +++ b/src/runtime/cgo/gcc_context.c @@ -0,0 +1,21 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build cgo +// +build aix darwin dragonfly freebsd linux netbsd openbsd solaris windows + +#include "libcgo.h" + +// Releases the cgo traceback context. +void _cgo_release_context(uintptr_t ctxt) { + void (*pfn)(struct context_arg*); + + pfn = _cgo_get_context_function(); + if (ctxt != 0 && pfn != nil) { + struct context_arg arg; + + arg.Context = ctxt; + (*pfn)(&arg); + } +} diff --git a/src/runtime/cgo/gcc_darwin_amd64.c b/src/runtime/cgo/gcc_darwin_amd64.c new file mode 100644 index 0000000..955b81d --- /dev/null +++ b/src/runtime/cgo/gcc_darwin_amd64.c @@ -0,0 +1,63 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <string.h> /* for strerror */ +#include <pthread.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + size_t size; + + setg_gcc = setg; + + size = pthread_get_stacksize_np(pthread_self()); + g->stacklo = (uintptr)&size - size + 4096; +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + size = pthread_get_stacksize_np(pthread_self()); + pthread_attr_init(&attr); + pthread_attr_setstacksize(&attr, size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_darwin_arm64.c b/src/runtime/cgo/gcc_darwin_arm64.c new file mode 100644 index 0000000..5b77a42 --- /dev/null +++ b/src/runtime/cgo/gcc_darwin_arm64.c @@ -0,0 +1,142 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <limits.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> /* for strerror */ +#include <sys/param.h> +#include <unistd.h> +#include <stdlib.h> + +#include "libcgo.h" +#include "libcgo_unix.h" + +#include <TargetConditionals.h> + +#if TARGET_OS_IPHONE +#include <CoreFoundation/CFBundle.h> +#include <CoreFoundation/CFString.h> +#endif + +static void *threadentry(void*); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + //fprintf(stderr, "runtime/cgo: _cgo_sys_thread_start: fn=%p, g=%p\n", ts->fn, ts->g); // debug + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + size = pthread_get_stacksize_np(pthread_self()); + pthread_attr_init(&attr); + pthread_attr_setstacksize(&attr, size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + +#if TARGET_OS_IPHONE + darwin_arm_init_thread_exception_port(); +#endif + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +#if TARGET_OS_IPHONE + +// init_working_dir sets the current working directory to the app root. +// By default ios/arm64 processes start in "/". +static void +init_working_dir() +{ + CFBundleRef bundle = CFBundleGetMainBundle(); + if (bundle == NULL) { + fprintf(stderr, "runtime/cgo: no main bundle\n"); + return; + } + CFURLRef url_ref = CFBundleCopyResourceURL(bundle, CFSTR("Info"), CFSTR("plist"), NULL); + if (url_ref == NULL) { + // No Info.plist found. It can happen on Corellium virtual devices. 
+ return; + } + CFStringRef url_str_ref = CFURLGetString(url_ref); + char buf[MAXPATHLEN]; + Boolean res = CFStringGetCString(url_str_ref, buf, sizeof(buf), kCFStringEncodingUTF8); + CFRelease(url_ref); + if (!res) { + fprintf(stderr, "runtime/cgo: cannot get URL string\n"); + return; + } + + // url is of the form "file:///path/to/Info.plist". + // strip it down to the working directory "/path/to". + int url_len = strlen(buf); + if (url_len < sizeof("file://")+sizeof("/Info.plist")) { + fprintf(stderr, "runtime/cgo: bad URL: %s\n", buf); + return; + } + buf[url_len-sizeof("/Info.plist")+1] = 0; + char *dir = &buf[0] + sizeof("file://")-1; + + if (chdir(dir) != 0) { + fprintf(stderr, "runtime/cgo: chdir(%s) failed\n", dir); + } + + // The test harness in go_ios_exec passes the relative working directory + // in the GoExecWrapperWorkingDirectory property of the app bundle. + CFStringRef wd_ref = CFBundleGetValueForInfoDictionaryKey(bundle, CFSTR("GoExecWrapperWorkingDirectory")); + if (wd_ref != NULL) { + if (!CFStringGetCString(wd_ref, buf, sizeof(buf), kCFStringEncodingUTF8)) { + fprintf(stderr, "runtime/cgo: cannot get GoExecWrapperWorkingDirectory string\n"); + return; + } + if (chdir(buf) != 0) { + fprintf(stderr, "runtime/cgo: chdir(%s) failed\n", buf); + } + } +} + +#endif // TARGET_OS_IPHONE + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + size_t size; + + //fprintf(stderr, "x_cgo_init = %p\n", &x_cgo_init); // aid debugging in presence of ASLR + setg_gcc = setg; + size = pthread_get_stacksize_np(pthread_self()); + g->stacklo = (uintptr)&size - size + 4096; + +#if TARGET_OS_IPHONE + darwin_arm_init_mach_exception_handler(); + darwin_arm_init_thread_exception_port(); + init_working_dir(); +#endif +} diff --git a/src/runtime/cgo/gcc_dragonfly_amd64.c b/src/runtime/cgo/gcc_dragonfly_amd64.c new file mode 100644 index 0000000..0003414 --- /dev/null +++ b/src/runtime/cgo/gcc_dragonfly_amd64.c @@ -0,0 +1,66 @@ +// Copyright 2009 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_fatalf.c b/src/runtime/cgo/gcc_fatalf.c new file mode 100644 index 0000000..597e750 --- /dev/null +++ b/src/runtime/cgo/gcc_fatalf.c @@ -0,0 +1,23 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build aix !android,linux freebsd + +#include <stdarg.h> +#include <stdio.h> +#include <stdlib.h> +#include "libcgo.h" + +void +fatalf(const char* format, ...) 
+{ + va_list ap; + + fprintf(stderr, "runtime/cgo: "); + va_start(ap, format); + vfprintf(stderr, format, ap); + va_end(ap); + fprintf(stderr, "\n"); + abort(); +} diff --git a/src/runtime/cgo/gcc_freebsd_386.c b/src/runtime/cgo/gcc_freebsd_386.c new file mode 100644 index 0000000..9097a2a --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_386.c @@ -0,0 +1,71 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + /* + * Set specific keys. 
+ */ + setg_gcc((void*)ts.g); + + crosscall_386(ts.fn); + return nil; +} diff --git a/src/runtime/cgo/gcc_freebsd_amd64.c b/src/runtime/cgo/gcc_freebsd_amd64.c new file mode 100644 index 0000000..6071ec3 --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_amd64.c @@ -0,0 +1,74 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <errno.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t *attr; + size_t size; + + // Deal with memory sanitizer/clang interaction. + // See gcc_linux_amd64.c for details. + setg_gcc = setg; + attr = (pthread_attr_t*)malloc(sizeof *attr); + if (attr == NULL) { + fatalf("malloc failed: %s", strerror(errno)); + } + pthread_attr_init(attr); + pthread_attr_getstacksize(attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(attr); + free(attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + _cgo_tsan_acquire(); + free(v); + _cgo_tsan_release(); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_freebsd_arm.c b/src/runtime/cgo/gcc_freebsd_arm.c new file mode 100644 index 0000000..5f89978 --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_arm.c @@ -0,0 +1,77 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <machine/sysarch.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +#ifdef ARM_TP_ADDRESS +// ARM_TP_ADDRESS is (ARM_VECTORS_HIGH + 0x1000) or 0xffff1000 +// and is known to runtime.read_tls_fallback. Verify it with +// cpp. +#if ARM_TP_ADDRESS != 0xffff1000 +#error Wrong ARM_TP_ADDRESS! +#endif +#endif + +static void *threadentry(void*); + +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall_arm1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_arm1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_freebsd_arm64.c b/src/runtime/cgo/gcc_freebsd_arm64.c new file mode 100644 index 0000000..dd8f888 --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_arm64.c @@ -0,0 +1,68 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <errno.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_freebsd_riscv64.c b/src/runtime/cgo/gcc_freebsd_riscv64.c new file mode 100644 index 0000000..6ce5e65 --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_riscv64.c @@ -0,0 +1,67 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <errno.h> +#include <sys/signalvar.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + SIGFILLSET(ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_freebsd_sigaction.c b/src/runtime/cgo/gcc_freebsd_sigaction.c new file mode 100644 index 0000000..98b122d --- /dev/null +++ b/src/runtime/cgo/gcc_freebsd_sigaction.c @@ -0,0 +1,80 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build freebsd,amd64 + +#include <errno.h> +#include <stddef.h> +#include <stdint.h> +#include <string.h> +#include <signal.h> + +#include "libcgo.h" + +// go_sigaction_t is a C version of the sigactiont struct from +// os_freebsd.go. This definition — and its conversion to and from struct +// sigaction — are specific to freebsd/amd64. 
+typedef struct { + uint32_t __bits[_SIG_WORDS]; +} go_sigset_t; +typedef struct { + uintptr_t handler; + int32_t flags; + go_sigset_t mask; +} go_sigaction_t; + +int32_t +x_cgo_sigaction(intptr_t signum, const go_sigaction_t *goact, go_sigaction_t *oldgoact) { + int32_t ret; + struct sigaction act; + struct sigaction oldact; + size_t i; + + _cgo_tsan_acquire(); + + memset(&act, 0, sizeof act); + memset(&oldact, 0, sizeof oldact); + + if (goact) { + if (goact->flags & SA_SIGINFO) { + act.sa_sigaction = (void(*)(int, siginfo_t*, void*))(goact->handler); + } else { + act.sa_handler = (void(*)(int))(goact->handler); + } + sigemptyset(&act.sa_mask); + for (i = 0; i < 8 * sizeof(goact->mask); i++) { + if (goact->mask.__bits[i/32] & ((uint32_t)(1)<<(i&31))) { + sigaddset(&act.sa_mask, i+1); + } + } + act.sa_flags = goact->flags; + } + + ret = sigaction(signum, goact ? &act : NULL, oldgoact ? &oldact : NULL); + if (ret == -1) { + // runtime.sigaction expects _cgo_sigaction to return errno on error. + _cgo_tsan_release(); + return errno; + } + + if (oldgoact) { + if (oldact.sa_flags & SA_SIGINFO) { + oldgoact->handler = (uintptr_t)(oldact.sa_sigaction); + } else { + oldgoact->handler = (uintptr_t)(oldact.sa_handler); + } + for (i = 0 ; i < _SIG_WORDS; i++) { + oldgoact->mask.__bits[i] = 0; + } + for (i = 0; i < 8 * sizeof(oldgoact->mask); i++) { + if (sigismember(&oldact.sa_mask, i+1) == 1) { + oldgoact->mask.__bits[i/32] |= (uint32_t)(1)<<(i&31); + } + } + oldgoact->flags = oldact.sa_flags; + } + + _cgo_tsan_release(); + return ret; +} diff --git a/src/runtime/cgo/gcc_libinit.c b/src/runtime/cgo/gcc_libinit.c new file mode 100644 index 0000000..3304d95 --- /dev/null +++ b/src/runtime/cgo/gcc_libinit.c @@ -0,0 +1,113 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build cgo +// +build aix darwin dragonfly freebsd linux netbsd openbsd solaris + +#include <pthread.h> +#include <errno.h> +#include <stdio.h> +#include <stdlib.h> +#include <string.h> // strerror +#include <time.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static pthread_cond_t runtime_init_cond = PTHREAD_COND_INITIALIZER; +static pthread_mutex_t runtime_init_mu = PTHREAD_MUTEX_INITIALIZER; +static int runtime_init_done; + +// The context function, used when tracing back C calls into Go. +static void (*cgo_context_function)(struct context_arg*); + +void +x_cgo_sys_thread_create(void* (*func)(void*), void* arg) { + pthread_t p; + int err = _cgo_try_pthread_create(&p, NULL, func, arg); + if (err != 0) { + fprintf(stderr, "pthread_create failed: %s", strerror(err)); + abort(); + } +} + +uintptr_t +_cgo_wait_runtime_init_done(void) { + void (*pfn)(struct context_arg*); + + pthread_mutex_lock(&runtime_init_mu); + while (runtime_init_done == 0) { + pthread_cond_wait(&runtime_init_cond, &runtime_init_mu); + } + + // TODO(iant): For the case of a new C thread calling into Go, such + // as when using -buildmode=c-archive, we know that Go runtime + // initialization is complete but we do not know that all Go init + // functions have been run. We should not fetch cgo_context_function + // until they have been, because that is where a call to + // SetCgoTraceback is likely to occur. We are going to wait for Go + // initialization to be complete anyhow, later, by waiting for + // main_init_done to be closed in cgocallbackg1. We should wait here + // instead. See also issue #15943. 
+ pfn = cgo_context_function; + + pthread_mutex_unlock(&runtime_init_mu); + if (pfn != nil) { + struct context_arg arg; + + arg.Context = 0; + (*pfn)(&arg); + return arg.Context; + } + return 0; +} + +void +x_cgo_notify_runtime_init_done(void* dummy __attribute__ ((unused))) { + pthread_mutex_lock(&runtime_init_mu); + runtime_init_done = 1; + pthread_cond_broadcast(&runtime_init_cond); + pthread_mutex_unlock(&runtime_init_mu); +} + +// Sets the context function to call to record the traceback context +// when calling a Go function from C code. Called from runtime.SetCgoTraceback. +void x_cgo_set_context_function(void (*context)(struct context_arg*)) { + pthread_mutex_lock(&runtime_init_mu); + cgo_context_function = context; + pthread_mutex_unlock(&runtime_init_mu); +} + +// Gets the context function. +void (*(_cgo_get_context_function(void)))(struct context_arg*) { + void (*ret)(struct context_arg*); + + pthread_mutex_lock(&runtime_init_mu); + ret = cgo_context_function; + pthread_mutex_unlock(&runtime_init_mu); + return ret; +} + +// _cgo_try_pthread_create retries pthread_create if it fails with +// EAGAIN. +int +_cgo_try_pthread_create(pthread_t* thread, const pthread_attr_t* attr, void* (*pfn)(void*), void* arg) { + int tries; + int err; + struct timespec ts; + + for (tries = 0; tries < 20; tries++) { + err = pthread_create(thread, attr, pfn, arg); + if (err == 0) { + pthread_detach(*thread); + return 0; + } + if (err != EAGAIN) { + return err; + } + ts.tv_sec = 0; + ts.tv_nsec = (tries + 1) * 1000 * 1000; // Milliseconds. + nanosleep(&ts, nil); + } + return EAGAIN; +} diff --git a/src/runtime/cgo/gcc_libinit_windows.c b/src/runtime/cgo/gcc_libinit_windows.c new file mode 100644 index 0000000..2b5896b --- /dev/null +++ b/src/runtime/cgo/gcc_libinit_windows.c @@ -0,0 +1,151 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build cgo + +#define WIN32_LEAN_AND_MEAN +#include <windows.h> +#include <process.h> + +#include <stdio.h> +#include <stdlib.h> +#include <errno.h> + +#include "libcgo.h" +#include "libcgo_windows.h" + +// Ensure there's one symbol marked __declspec(dllexport). +// If there are no exported symbols, the unfortunate behavior of +// the binutils linker is to also strip the relocations table, +// resulting in non-PIE binary. The other option is the +// --export-all-symbols flag, but we don't need to export all symbols +// and this may overflow the export table (#40795). +// See https://sourceware.org/bugzilla/show_bug.cgi?id=19011 +__declspec(dllexport) int _cgo_dummy_export; + +static volatile LONG runtime_init_once_gate = 0; +static volatile LONG runtime_init_once_done = 0; + +static CRITICAL_SECTION runtime_init_cs; + +static HANDLE runtime_init_wait; +static int runtime_init_done; + +// Pre-initialize the runtime synchronization objects +void +_cgo_preinit_init() { + runtime_init_wait = CreateEvent(NULL, TRUE, FALSE, NULL); + if (runtime_init_wait == NULL) { + fprintf(stderr, "runtime: failed to create runtime initialization wait event.\n"); + abort(); + } + + InitializeCriticalSection(&runtime_init_cs); +} + +// Make sure that the preinit sequence has run. +void +_cgo_maybe_run_preinit() { + if (!InterlockedExchangeAdd(&runtime_init_once_done, 0)) { + if (InterlockedIncrement(&runtime_init_once_gate) == 1) { + _cgo_preinit_init(); + InterlockedIncrement(&runtime_init_once_done); + } else { + // Decrement to avoid overflow. 
+ InterlockedDecrement(&runtime_init_once_gate); + while(!InterlockedExchangeAdd(&runtime_init_once_done, 0)) { + Sleep(0); + } + } + } +} + +void +x_cgo_sys_thread_create(void (*func)(void*), void* arg) { + _cgo_beginthread(func, arg); +} + +int +_cgo_is_runtime_initialized() { + EnterCriticalSection(&runtime_init_cs); + int status = runtime_init_done; + LeaveCriticalSection(&runtime_init_cs); + return status; +} + +uintptr_t +_cgo_wait_runtime_init_done(void) { + void (*pfn)(struct context_arg*); + + _cgo_maybe_run_preinit(); + while (!_cgo_is_runtime_initialized()) { + WaitForSingleObject(runtime_init_wait, INFINITE); + } + pfn = _cgo_get_context_function(); + if (pfn != nil) { + struct context_arg arg; + + arg.Context = 0; + (*pfn)(&arg); + return arg.Context; + } + return 0; +} + +void +x_cgo_notify_runtime_init_done(void* dummy) { + _cgo_maybe_run_preinit(); + + EnterCriticalSection(&runtime_init_cs); + runtime_init_done = 1; + LeaveCriticalSection(&runtime_init_cs); + + if (!SetEvent(runtime_init_wait)) { + fprintf(stderr, "runtime: failed to signal runtime initialization complete.\n"); + abort(); + } +} + +// The context function, used when tracing back C calls into Go. +static void (*cgo_context_function)(struct context_arg*); + +// Sets the context function to call to record the traceback context +// when calling a Go function from C code. Called from runtime.SetCgoTraceback. +void x_cgo_set_context_function(void (*context)(struct context_arg*)) { + EnterCriticalSection(&runtime_init_cs); + cgo_context_function = context; + LeaveCriticalSection(&runtime_init_cs); +} + +// Gets the context function. 
+void (*(_cgo_get_context_function(void)))(struct context_arg*) { + void (*ret)(struct context_arg*); + + EnterCriticalSection(&runtime_init_cs); + ret = cgo_context_function; + LeaveCriticalSection(&runtime_init_cs); + return ret; +} + +void _cgo_beginthread(void (*func)(void*), void* arg) { + int tries; + uintptr_t thandle; + + for (tries = 0; tries < 20; tries++) { + thandle = _beginthread(func, 0, arg); + if (thandle == -1 && errno == EACCES) { + // "Insufficient resources", try again in a bit. + // + // Note that the first Sleep(0) is a yield. + Sleep(tries); // milliseconds + continue; + } else if (thandle == -1) { + break; + } + return; // Success! + } + + fprintf(stderr, "runtime: failed to create new OS thread (%d)\n", errno); + abort(); +} diff --git a/src/runtime/cgo/gcc_linux_386.c b/src/runtime/cgo/gcc_linux_386.c new file mode 100644 index 0000000..0ce9359 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_386.c @@ -0,0 +1,74 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); +static void (*setg_gcc)(void*); + +// This will be set in gcc_android.c for android-specific customization. 
+void (*x_cgo_inittls)(void **tlsg, void **tlsbase) __attribute__((common)); + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + /* + * Set specific keys. + */ + setg_gcc((void*)ts.g); + + crosscall_386(ts.fn); + return nil; +} diff --git a/src/runtime/cgo/gcc_linux_amd64.c b/src/runtime/cgo/gcc_linux_amd64.c new file mode 100644 index 0000000..fb164c1 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_amd64.c @@ -0,0 +1,96 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <pthread.h> +#include <errno.h> +#include <string.h> // strerror +#include <signal.h> +#include <stdlib.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +// This will be set in gcc_android.c for android-specific customization. 
+void (*x_cgo_inittls)(void **tlsg, void **tlsbase) __attribute__((common)); + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t *attr; + size_t size; + + /* The memory sanitizer distributed with versions of clang + before 3.8 has a bug: if you call mmap before malloc, mmap + may return an address that is later overwritten by the msan + library. Avoid this problem by forcing a call to malloc + here, before we ever call malloc. + + This is only required for the memory sanitizer, so it's + unfortunate that we always run it. It should be possible + to remove this when we no longer care about versions of + clang before 3.8. The test for this is + misc/cgo/testsanitizers. + + GCC works hard to eliminate a seemingly unnecessary call to + malloc, so we actually use the memory we allocate. */ + + setg_gcc = setg; + attr = (pthread_attr_t*)malloc(sizeof *attr); + if (attr == NULL) { + fatalf("malloc failed: %s", strerror(errno)); + } + pthread_attr_init(attr); + pthread_attr_getstacksize(attr, &size); + g->stacklo = (uintptr)__builtin_frame_address(0) - size + 4096; + if (g->stacklo >= g->stackhi) + fatalf("bad stack bounds: lo=%p hi=%p\n", g->stacklo, g->stackhi); + pthread_attr_destroy(attr); + free(attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + _cgo_tsan_acquire(); + free(v); + _cgo_tsan_release(); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_linux_arm.c b/src/runtime/cgo/gcc_linux_arm.c new file mode 100644 index 0000000..5e97a9e --- /dev/null +++ b/src/runtime/cgo/gcc_linux_arm.c @@ -0,0 +1,69 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase) __attribute__((common)); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall_arm1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_arm1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_arm64.c b/src/runtime/cgo/gcc_linux_arm64.c new file mode 100644 index 0000000..dac45e4 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_arm64.c @@ -0,0 +1,91 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <pthread.h> +#include <errno.h> +#include <string.h> +#include <signal.h> +#include <stdlib.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase) __attribute__((common)); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t *attr; + size_t size; + + /* The memory sanitizer distributed with versions of clang + before 3.8 has a bug: if you call mmap before malloc, mmap + may return an address that is later overwritten by the msan + library. Avoid this problem by forcing a call to malloc + here, before we ever call mmap. + + This is only required for the memory sanitizer, so it's + unfortunate that we always run it. It should be possible + to remove this when we no longer care about versions of + clang before 3.8. The test for this is + misc/cgo/testsanitizers. + + GCC works hard to eliminate a seemingly unnecessary call to + malloc, so we actually use the memory we allocate. */ + + setg_gcc = setg; + attr = (pthread_attr_t*)malloc(sizeof *attr); + if (attr == NULL) { + fatalf("malloc failed: %s", strerror(errno)); + } + pthread_attr_init(attr); + pthread_attr_getstacksize(attr, &size); + g->stacklo = (uintptr)&size - size + 4096; + pthread_attr_destroy(attr); + free(attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_loong64.c b/src/runtime/cgo/gcc_linux_loong64.c new file mode 100644 index 0000000..96a06eb --- /dev/null +++ b/src/runtime/cgo/gcc_linux_loong64.c @@ -0,0 +1,69 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_mips64x.c b/src/runtime/cgo/gcc_linux_mips64x.c new file mode 100644 index 0000000..3ea29b0 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_mips64x.c @@ -0,0 +1,73 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build cgo +// +build linux +// +build mips64 mips64le + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_mipsx.c b/src/runtime/cgo/gcc_linux_mipsx.c new file mode 100644 index 0000000..3b60a0e --- /dev/null +++ b/src/runtime/cgo/gcc_linux_mipsx.c @@ -0,0 +1,74 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build cgo +// +build linux +// +build mips mipsle + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_ppc64x.S b/src/runtime/cgo/gcc_linux_ppc64x.S new file mode 100644 index 0000000..957ef3a --- /dev/null +++ b/src/runtime/cgo/gcc_linux_ppc64x.S @@ -0,0 +1,140 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build ppc64 ppc64le +// +build linux + +.file "gcc_linux_ppc64x.S" + +/* + * Apple still insists on underscore prefixes for C function names. 
+ */ +#if defined(__APPLE__) +#define EXT(s) _##s +#else +#define EXT(s) s +#endif + +/* + * void crosscall_ppc64(void (*fn)(void), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. + * Called from standard ppc64 C ABI, where r2, r14-r31, f14-f31 are + * callee-save, so they must be saved explicitly. + */ +.globl EXT(crosscall_ppc64) +EXT(crosscall_ppc64): + // Start with standard C stack frame layout and linkage + mflr %r0 + std %r0, 16(%r1) // Save LR in caller's frame + std %r2, 24(%r1) // Save TOC in caller's frame + bl saveregs + stdu %r1, -296(%r1) + + // Set up Go ABI constant registers + bl _cgo_reginit + nop + + // Restore g pointer (r30 in Go ABI, which may have been clobbered by C) + mr %r30, %r4 + + // Call fn + mr %r12, %r3 + mtctr %r3 + bctrl + + addi %r1, %r1, 296 + bl restoreregs + ld %r2, 24(%r1) + ld %r0, 16(%r1) + mtlr %r0 + blr + +saveregs: + // Save callee-save registers + // O=-288; for R in %r{14..31}; do echo "\tstd\t$R, $O(%r1)"; ((O+=8)); done; for F in f{14..31}; do echo "\tstfd\t$F, $O(%r1)"; ((O+=8)); done + std %r14, -288(%r1) + std %r15, -280(%r1) + std %r16, -272(%r1) + std %r17, -264(%r1) + std %r18, -256(%r1) + std %r19, -248(%r1) + std %r20, -240(%r1) + std %r21, -232(%r1) + std %r22, -224(%r1) + std %r23, -216(%r1) + std %r24, -208(%r1) + std %r25, -200(%r1) + std %r26, -192(%r1) + std %r27, -184(%r1) + std %r28, -176(%r1) + std %r29, -168(%r1) + std %r30, -160(%r1) + std %r31, -152(%r1) + stfd %f14, -144(%r1) + stfd %f15, -136(%r1) + stfd %f16, -128(%r1) + stfd %f17, -120(%r1) + stfd %f18, -112(%r1) + stfd %f19, -104(%r1) + stfd %f20, -96(%r1) + stfd %f21, -88(%r1) + stfd %f22, -80(%r1) + stfd %f23, -72(%r1) + stfd %f24, -64(%r1) + stfd %f25, -56(%r1) + stfd %f26, -48(%r1) + stfd %f27, -40(%r1) + stfd %f28, -32(%r1) + stfd %f29, -24(%r1) + stfd %f30, -16(%r1) + stfd %f31, -8(%r1) + + blr + +restoreregs: + // O=-288; for R in %r{14..31}; do echo "\tld\t$R, $O(%r1)"; ((O+=8)); done; for F in 
%f{14..31}; do echo "\tlfd\t$F, $O(%r1)"; ((O+=8)); done + ld %r14, -288(%r1) + ld %r15, -280(%r1) + ld %r16, -272(%r1) + ld %r17, -264(%r1) + ld %r18, -256(%r1) + ld %r19, -248(%r1) + ld %r20, -240(%r1) + ld %r21, -232(%r1) + ld %r22, -224(%r1) + ld %r23, -216(%r1) + ld %r24, -208(%r1) + ld %r25, -200(%r1) + ld %r26, -192(%r1) + ld %r27, -184(%r1) + ld %r28, -176(%r1) + ld %r29, -168(%r1) + ld %r30, -160(%r1) + ld %r31, -152(%r1) + lfd %f14, -144(%r1) + lfd %f15, -136(%r1) + lfd %f16, -128(%r1) + lfd %f17, -120(%r1) + lfd %f18, -112(%r1) + lfd %f19, -104(%r1) + lfd %f20, -96(%r1) + lfd %f21, -88(%r1) + lfd %f22, -80(%r1) + lfd %f23, -72(%r1) + lfd %f24, -64(%r1) + lfd %f25, -56(%r1) + lfd %f26, -48(%r1) + lfd %f27, -40(%r1) + lfd %f28, -32(%r1) + lfd %f29, -24(%r1) + lfd %f30, -16(%r1) + lfd %f31, -8(%r1) + + blr + + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_linux_riscv64.c b/src/runtime/cgo/gcc_linux_riscv64.c new file mode 100644 index 0000000..99c2866 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_riscv64.c @@ -0,0 +1,69 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); + + if (x_cgo_inittls) { + x_cgo_inittls(tlsg, tlsbase); + } +} diff --git a/src/runtime/cgo/gcc_linux_s390x.c b/src/runtime/cgo/gcc_linux_s390x.c new file mode 100644 index 0000000..bb60048 --- /dev/null +++ b/src/runtime/cgo/gcc_linux_s390x.c @@ -0,0 +1,69 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall_s390x(void (*fn)(void), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + // Save g for this thread in C TLS + setg_gcc((void*)ts.g); + + crosscall_s390x(ts.fn, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_loong64.S b/src/runtime/cgo/gcc_loong64.S new file mode 100644 index 0000000..6b7668f --- /dev/null +++ b/src/runtime/cgo/gcc_loong64.S @@ -0,0 +1,67 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_loong64.S" + +/* + * void crosscall1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. 
+ * Called from standard lp64d ABI, where $r1, $r3, $r23-$r30, and $f24-$f31 + * are callee-save, so they must be saved explicitly, along with $r1 (LR). + */ +.globl crosscall1 +crosscall1: + addi.d $r3, $r3, -160 + st.d $r1, $r3, 0 + st.d $r23, $r3, 8 + st.d $r24, $r3, 16 + st.d $r25, $r3, 24 + st.d $r26, $r3, 32 + st.d $r27, $r3, 40 + st.d $r28, $r3, 48 + st.d $r29, $r3, 56 + st.d $r30, $r3, 64 + st.d $r2, $r3, 72 + st.d $r22, $r3, 80 + fst.d $f24, $r3, 88 + fst.d $f25, $r3, 96 + fst.d $f26, $r3, 104 + fst.d $f27, $r3, 112 + fst.d $f28, $r3, 120 + fst.d $f29, $r3, 128 + fst.d $f30, $r3, 136 + fst.d $f31, $r3, 144 + + move $r18, $r4 // save R4 + move $r19, $r6 + jirl $r1, $r5, 0 // call setg_gcc (clobbers R4) + jirl $r1, $r18, 0 // call fn + + ld.d $r23, $r3, 8 + ld.d $r24, $r3, 16 + ld.d $r25, $r3, 24 + ld.d $r26, $r3, 32 + ld.d $r27, $r3, 40 + ld.d $r28, $r3, 48 + ld.d $r29, $r3, 56 + ld.d $r30, $r3, 64 + ld.d $r2, $r3, 72 + ld.d $r22, $r3, 80 + fld.d $f24, $r3, 88 + fld.d $f25, $r3, 96 + fld.d $f26, $r3, 104 + fld.d $f27, $r3, 112 + fld.d $f28, $r3, 120 + fld.d $f29, $r3, 128 + fld.d $f30, $r3, 136 + fld.d $f31, $r3, 144 + ld.d $r1, $r3, 0 + addi.d $r3, $r3, 160 + jirl $r0, $r1, 0 + + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_mips64x.S b/src/runtime/cgo/gcc_mips64x.S new file mode 100644 index 0000000..ec24d71 --- /dev/null +++ b/src/runtime/cgo/gcc_mips64x.S @@ -0,0 +1,89 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build mips64 mips64le + +.file "gcc_mips64x.S" + +/* + * void crosscall1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. + * Called from standard MIPS N64 ABI, where $16-$23, $28, $30, and $f24-$f31 + * are callee-save, so they must be saved explicitly, along with $31 (LR). 
+ */ +.globl crosscall1 +.set noat +crosscall1: +#ifndef __mips_soft_float + daddiu $29, $29, -160 +#else + daddiu $29, $29, -96 // For soft-float, no need to make room for FP registers +#endif + sd $31, 0($29) + sd $16, 8($29) + sd $17, 16($29) + sd $18, 24($29) + sd $19, 32($29) + sd $20, 40($29) + sd $21, 48($29) + sd $22, 56($29) + sd $23, 64($29) + sd $28, 72($29) + sd $30, 80($29) +#ifndef __mips_soft_float + sdc1 $f24, 88($29) + sdc1 $f25, 96($29) + sdc1 $f26, 104($29) + sdc1 $f27, 112($29) + sdc1 $f28, 120($29) + sdc1 $f29, 128($29) + sdc1 $f30, 136($29) + sdc1 $f31, 144($29) +#endif + + // prepare SB register = pc & 0xffffffff00000000 + bal 1f +1: + dsrl $28, $31, 32 + dsll $28, $28, 32 + + move $20, $4 // save R4 + move $1, $6 + jalr $5 // call setg_gcc (clobbers R4) + jalr $20 // call fn + + ld $16, 8($29) + ld $17, 16($29) + ld $18, 24($29) + ld $19, 32($29) + ld $20, 40($29) + ld $21, 48($29) + ld $22, 56($29) + ld $23, 64($29) + ld $28, 72($29) + ld $30, 80($29) +#ifndef __mips_soft_float + ldc1 $f24, 88($29) + ldc1 $f25, 96($29) + ldc1 $f26, 104($29) + ldc1 $f27, 112($29) + ldc1 $f28, 120($29) + ldc1 $f29, 128($29) + ldc1 $f30, 136($29) + ldc1 $f31, 144($29) +#endif + ld $31, 0($29) +#ifndef __mips_soft_float + daddiu $29, $29, 160 +#else + daddiu $29, $29, 96 +#endif + jr $31 + +.set at + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_mipsx.S b/src/runtime/cgo/gcc_mipsx.S new file mode 100644 index 0000000..2867f6a --- /dev/null +++ b/src/runtime/cgo/gcc_mipsx.S @@ -0,0 +1,77 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build mips mipsle + +.file "gcc_mipsx.S" + +/* + * void crosscall1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. 
+ * Called from standard MIPS O32 ABI, where $16-$23, $30, and $f20-$f31 + * are callee-save, so they must be saved explicitly, along with $31 (LR). + */ +.globl crosscall1 +.set noat +crosscall1: +#ifndef __mips_soft_float + addiu $29, $29, -88 +#else + addiu $29, $29, -40 // For soft-float, no need to make room for FP registers +#endif + sw $31, 0($29) + sw $16, 4($29) + sw $17, 8($29) + sw $18, 12($29) + sw $19, 16($29) + sw $20, 20($29) + sw $21, 24($29) + sw $22, 28($29) + sw $23, 32($29) + sw $30, 36($29) + +#ifndef __mips_soft_float + sdc1 $f20, 40($29) + sdc1 $f22, 48($29) + sdc1 $f24, 56($29) + sdc1 $f26, 64($29) + sdc1 $f28, 72($29) + sdc1 $f30, 80($29) +#endif + move $20, $4 // save R4 + move $4, $6 + jalr $5 // call setg_gcc + jalr $20 // call fn + + lw $16, 4($29) + lw $17, 8($29) + lw $18, 12($29) + lw $19, 16($29) + lw $20, 20($29) + lw $21, 24($29) + lw $22, 28($29) + lw $23, 32($29) + lw $30, 36($29) +#ifndef __mips_soft_float + ldc1 $f20, 40($29) + ldc1 $f22, 48($29) + ldc1 $f24, 56($29) + ldc1 $f26, 64($29) + ldc1 $f28, 72($29) + ldc1 $f30, 80($29) +#endif + lw $31, 0($29) +#ifndef __mips_soft_float + addiu $29, $29, 88 +#else + addiu $29, $29, 40 +#endif + jr $31 + +.set at + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_mmap.c b/src/runtime/cgo/gcc_mmap.c new file mode 100644 index 0000000..83d857f --- /dev/null +++ b/src/runtime/cgo/gcc_mmap.c @@ -0,0 +1,39 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build linux,amd64 linux,arm64 linux,ppc64le freebsd,amd64 + +#include <errno.h> +#include <stdint.h> +#include <stdlib.h> +#include <sys/mman.h> + +#include "libcgo.h" + +uintptr_t +x_cgo_mmap(void *addr, uintptr_t length, int32_t prot, int32_t flags, int32_t fd, uint32_t offset) { + void *p; + + _cgo_tsan_acquire(); + p = mmap(addr, length, prot, flags, fd, offset); + _cgo_tsan_release(); + if (p == MAP_FAILED) { + /* This is what the Go code expects on failure. */ + return (uintptr_t)errno; + } + return (uintptr_t)p; +} + +void +x_cgo_munmap(void *addr, uintptr_t length) { + int r; + + _cgo_tsan_acquire(); + r = munmap(addr, length); + _cgo_tsan_release(); + if (r < 0) { + /* The Go runtime is not prepared for munmap to fail. */ + abort(); + } +} diff --git a/src/runtime/cgo/gcc_netbsd_386.c b/src/runtime/cgo/gcc_netbsd_386.c new file mode 100644 index 0000000..5495f0f --- /dev/null +++ b/src/runtime/cgo/gcc_netbsd_386.c @@ -0,0 +1,82 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. 
+ ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + stack_t ss; + + ts = *(ThreadStart*)v; + free(v); + + /* + * Set specific keys. + */ + setg_gcc((void*)ts.g); + + // On NetBSD, a new thread inherits the signal stack of the + // creating thread. That confuses minit, so we remove that + // signal stack here before calling the regular mstart. It's + // a bit baroque to remove a signal stack here only to add one + // in minit, but it's a simple change that keeps NetBSD + // working like other OS's. At this point all signals are + // blocked, so there is no race. + memset(&ss, 0, sizeof ss); + ss.ss_flags = SS_DISABLE; + sigaltstack(&ss, nil); + + crosscall_386(ts.fn); + return nil; +} diff --git a/src/runtime/cgo/gcc_netbsd_amd64.c b/src/runtime/cgo/gcc_netbsd_amd64.c new file mode 100644 index 0000000..9f4b031 --- /dev/null +++ b/src/runtime/cgo/gcc_netbsd_amd64.c @@ -0,0 +1,78 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + stack_t ss; + + ts = *(ThreadStart*)v; + free(v); + + // On NetBSD, a new thread inherits the signal stack of the + // creating thread. That confuses minit, so we remove that + // signal stack here before calling the regular mstart. It's + // a bit baroque to remove a signal stack here only to add one + // in minit, but it's a simple change that keeps NetBSD + // working like other OS's. At this point all signals are + // blocked, so there is no race. + memset(&ss, 0, sizeof ss); + ss.ss_flags = SS_DISABLE; + sigaltstack(&ss, nil); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_netbsd_arm.c b/src/runtime/cgo/gcc_netbsd_arm.c new file mode 100644 index 0000000..b0c80ea --- /dev/null +++ b/src/runtime/cgo/gcc_netbsd_arm.c @@ -0,0 +1,79 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall_arm1(void (*fn)(void), void (*setg_gcc)(void*), void *g); +static void* +threadentry(void *v) +{ + ThreadStart ts; + stack_t ss; + + ts = *(ThreadStart*)v; + free(v); + + // On NetBSD, a new thread inherits the signal stack of the + // creating thread. That confuses minit, so we remove that + // signal stack here before calling the regular mstart. It's + // a bit baroque to remove a signal stack here only to add one + // in minit, but it's a simple change that keeps NetBSD + // working like other OS's. At this point all signals are + // blocked, so there is no race. 
+ memset(&ss, 0, sizeof ss); + ss.ss_flags = SS_DISABLE; + sigaltstack(&ss, nil); + + crosscall_arm1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_netbsd_arm64.c b/src/runtime/cgo/gcc_netbsd_arm64.c new file mode 100644 index 0000000..694116c --- /dev/null +++ b/src/runtime/cgo/gcc_netbsd_arm64.c @@ -0,0 +1,80 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + stack_t ss; + + ts = *(ThreadStart*)v; + free(v); + + // On NetBSD, a new thread inherits the signal stack of the + // creating thread. That confuses minit, so we remove that + // signal stack here before calling the regular mstart. 
It's + // a bit baroque to remove a signal stack here only to add one + // in minit, but it's a simple change that keeps NetBSD + // working like other OS's. At this point all signals are + // blocked, so there is no race. + memset(&ss, 0, sizeof ss); + ss.ss_flags = SS_DISABLE; + sigaltstack(&ss, nil); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_openbsd_386.c b/src/runtime/cgo/gcc_openbsd_386.c new file mode 100644 index 0000000..127a1b6 --- /dev/null +++ b/src/runtime/cgo/gcc_openbsd_386.c @@ -0,0 +1,70 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + /* + * Set specific keys. 
+ */ + setg_gcc((void*)ts.g); + + crosscall_386(ts.fn); + return nil; +} diff --git a/src/runtime/cgo/gcc_openbsd_amd64.c b/src/runtime/cgo/gcc_openbsd_amd64.c new file mode 100644 index 0000000..09d2750 --- /dev/null +++ b/src/runtime/cgo/gcc_openbsd_amd64.c @@ -0,0 +1,65 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_openbsd_arm.c b/src/runtime/cgo/gcc_openbsd_arm.c new file mode 100644 index 0000000..9a5757f --- /dev/null +++ b/src/runtime/cgo/gcc_openbsd_arm.c @@ -0,0 +1,67 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall_arm1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_arm1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_openbsd_arm64.c b/src/runtime/cgo/gcc_openbsd_arm64.c new file mode 100644 index 0000000..abf9f66 --- /dev/null +++ b/src/runtime/cgo/gcc_openbsd_arm64.c @@ -0,0 +1,67 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_openbsd_mips64.c b/src/runtime/cgo/gcc_openbsd_mips64.c new file mode 100644 index 0000000..79f039a --- /dev/null +++ b/src/runtime/cgo/gcc_openbsd_mips64.c @@ -0,0 +1,67 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <sys/types.h> +#include <pthread.h> +#include <signal.h> +#include <string.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_ppc64x.c b/src/runtime/cgo/gcc_ppc64x.c new file mode 100644 index 0000000..9cb6e0c --- /dev/null +++ b/src/runtime/cgo/gcc_ppc64x.c @@ -0,0 +1,71 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build ppc64 ppc64le + +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void *threadentry(void*); + +void (*x_cgo_inittls)(void **tlsg, void **tlsbase); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsbase) +{ + pthread_attr_t attr; + size_t size; + + setg_gcc = setg; + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + g->stacklo = (uintptr)&attr - size + 4096; + pthread_attr_destroy(&attr); +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + pthread_attr_getstacksize(&attr, &size); + // Leave stacklo=0 and set stackhi=size; mstart will do the rest. + ts->g->stackhi = size; + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fatalf("pthread_create failed: %s", strerror(err)); + } +} + +extern void crosscall_ppc64(void (*fn)(void), void *g); + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + // Save g for this thread in C TLS + setg_gcc((void*)ts.g); + + crosscall_ppc64(ts.fn, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_riscv64.S b/src/runtime/cgo/gcc_riscv64.S new file mode 100644 index 0000000..8f07649 --- /dev/null +++ b/src/runtime/cgo/gcc_riscv64.S @@ -0,0 +1,82 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_riscv64.S" + +/* + * void crosscall1(void (*fn)(void), void (*setg_gcc)(void *g), void *g) + * + * Calling into the gc tool chain, where all registers are caller save. 
+ * Called from standard RISCV ELF psABI, where x8-x9, x18-x27, f8-f9 and + * f18-f27 are callee-save, so they must be saved explicitly, along with + * x1 (LR). + */ +.globl crosscall1 +crosscall1: + sd x1, -200(sp) + addi sp, sp, -200 + sd x8, 8(sp) + sd x9, 16(sp) + sd x18, 24(sp) + sd x19, 32(sp) + sd x20, 40(sp) + sd x21, 48(sp) + sd x22, 56(sp) + sd x23, 64(sp) + sd x24, 72(sp) + sd x25, 80(sp) + sd x26, 88(sp) + sd x27, 96(sp) + fsd f8, 104(sp) + fsd f9, 112(sp) + fsd f18, 120(sp) + fsd f19, 128(sp) + fsd f20, 136(sp) + fsd f21, 144(sp) + fsd f22, 152(sp) + fsd f23, 160(sp) + fsd f24, 168(sp) + fsd f25, 176(sp) + fsd f26, 184(sp) + fsd f27, 192(sp) + + // a0 = *fn, a1 = *setg_gcc, a2 = *g + mv s1, a0 + mv s0, a1 + mv a0, a2 + jalr ra, s0 // call setg_gcc (clobbers x30 aka g) + jalr ra, s1 // call fn + + ld x1, 0(sp) + ld x8, 8(sp) + ld x9, 16(sp) + ld x18, 24(sp) + ld x19, 32(sp) + ld x20, 40(sp) + ld x21, 48(sp) + ld x22, 56(sp) + ld x23, 64(sp) + ld x24, 72(sp) + ld x25, 80(sp) + ld x26, 88(sp) + ld x27, 96(sp) + fld f8, 104(sp) + fld f9, 112(sp) + fld f18, 120(sp) + fld f19, 128(sp) + fld f20, 136(sp) + fld f21, 144(sp) + fld f22, 152(sp) + fld f23, 160(sp) + fld f24, 168(sp) + fld f25, 176(sp) + fld f26, 184(sp) + fld f27, 192(sp) + addi sp, sp, 200 + + jr ra + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_s390x.S b/src/runtime/cgo/gcc_s390x.S new file mode 100644 index 0000000..8bd30fe --- /dev/null +++ b/src/runtime/cgo/gcc_s390x.S @@ -0,0 +1,58 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +.file "gcc_s390x.S" + +/* + * void crosscall_s390x(void (*fn)(void), void *g) + * + * Calling into the go tool chain, where all registers are caller save. + * Called from standard s390x C ABI, where r6-r13, r15, and f8-f15 are + * callee-save, so they must be saved explicitly. 
+ */ +.globl crosscall_s390x +crosscall_s390x: + /* save r6-r15 in the register save area of the calling function */ + stmg %r6, %r15, 48(%r15) + + /* allocate 64 bytes of stack space to save f8-f15 */ + lay %r15, -64(%r15) + + /* save callee-saved floating point registers */ + std %f8, 0(%r15) + std %f9, 8(%r15) + std %f10, 16(%r15) + std %f11, 24(%r15) + std %f12, 32(%r15) + std %f13, 40(%r15) + std %f14, 48(%r15) + std %f15, 56(%r15) + + /* restore g pointer */ + lgr %r13, %r3 + + /* call fn */ + basr %r14, %r2 + + /* restore floating point registers */ + ld %f8, 0(%r15) + ld %f9, 8(%r15) + ld %f10, 16(%r15) + ld %f11, 24(%r15) + ld %f12, 32(%r15) + ld %f13, 40(%r15) + ld %f14, 48(%r15) + ld %f15, 56(%r15) + + /* de-allocate stack frame */ + la %r15, 64(%r15) + + /* restore general purpose registers */ + lmg %r6, %r15, 48(%r15) + + br %r14 /* restored by lmg */ + +#ifdef __ELF__ +.section .note.GNU-stack,"",%progbits +#endif diff --git a/src/runtime/cgo/gcc_setenv.c b/src/runtime/cgo/gcc_setenv.c new file mode 100644 index 0000000..d4f7983 --- /dev/null +++ b/src/runtime/cgo/gcc_setenv.c @@ -0,0 +1,28 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build cgo +// +build aix darwin dragonfly freebsd linux netbsd openbsd solaris + +#include "libcgo.h" + +#include <stdlib.h> + +/* Stub for calling setenv */ +void +x_cgo_setenv(char **arg) +{ + _cgo_tsan_acquire(); + setenv(arg[0], arg[1], 1); + _cgo_tsan_release(); +} + +/* Stub for calling unsetenv */ +void +x_cgo_unsetenv(char **arg) +{ + _cgo_tsan_acquire(); + unsetenv(arg[0]); + _cgo_tsan_release(); +} diff --git a/src/runtime/cgo/gcc_sigaction.c b/src/runtime/cgo/gcc_sigaction.c new file mode 100644 index 0000000..fcf1e50 --- /dev/null +++ b/src/runtime/cgo/gcc_sigaction.c @@ -0,0 +1,82 @@ +// Copyright 2016 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build linux,amd64 linux,arm64 linux,ppc64le + +#include <errno.h> +#include <stddef.h> +#include <stdint.h> +#include <string.h> +#include <signal.h> + +#include "libcgo.h" + +// go_sigaction_t is a C version of the sigactiont struct from +// defs_linux_amd64.go. This definition — and its conversion to and from struct +// sigaction — are specific to linux/amd64. +typedef struct { + uintptr_t handler; + uint64_t flags; + uintptr_t restorer; + uint64_t mask; +} go_sigaction_t; + +// SA_RESTORER is part of the kernel interface. +// This is Linux i386/amd64 specific. +#ifndef SA_RESTORER +#define SA_RESTORER 0x4000000 +#endif + +int32_t +x_cgo_sigaction(intptr_t signum, const go_sigaction_t *goact, go_sigaction_t *oldgoact) { + int32_t ret; + struct sigaction act; + struct sigaction oldact; + size_t i; + + _cgo_tsan_acquire(); + + memset(&act, 0, sizeof act); + memset(&oldact, 0, sizeof oldact); + + if (goact) { + if (goact->flags & SA_SIGINFO) { + act.sa_sigaction = (void(*)(int, siginfo_t*, void*))(goact->handler); + } else { + act.sa_handler = (void(*)(int))(goact->handler); + } + sigemptyset(&act.sa_mask); + for (i = 0; i < 8 * sizeof(goact->mask); i++) { + if (goact->mask & ((uint64_t)(1)<<i)) { + sigaddset(&act.sa_mask, (int)(i+1)); + } + } + act.sa_flags = (int)(goact->flags & ~(uint64_t)SA_RESTORER); + } + + ret = sigaction((int)signum, goact ? &act : NULL, oldgoact ? &oldact : NULL); + if (ret == -1) { + // runtime.rt_sigaction expects _cgo_sigaction to return errno on error. 
+ _cgo_tsan_release(); + return errno; + } + + if (oldgoact) { + if (oldact.sa_flags & SA_SIGINFO) { + oldgoact->handler = (uintptr_t)(oldact.sa_sigaction); + } else { + oldgoact->handler = (uintptr_t)(oldact.sa_handler); + } + oldgoact->mask = 0; + for (i = 0; i < 8 * sizeof(oldgoact->mask); i++) { + if (sigismember(&oldact.sa_mask, (int)(i+1)) == 1) { + oldgoact->mask |= (uint64_t)(1)<<i; + } + } + oldgoact->flags = (uint64_t)oldact.sa_flags; + } + + _cgo_tsan_release(); + return ret; +} diff --git a/src/runtime/cgo/gcc_signal2_ios_arm64.c b/src/runtime/cgo/gcc_signal2_ios_arm64.c new file mode 100644 index 0000000..5b8a18f --- /dev/null +++ b/src/runtime/cgo/gcc_signal2_ios_arm64.c @@ -0,0 +1,11 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build lldb + +// Used by gcc_signal_darwin_arm64.c when doing the test build during cgo. +// We hope that for real binaries the definition provided by Go will take precedence +// and the linker will drop this .o file altogether, which is why this definition +// is all by itself in its own file. +void __attribute__((weak)) xx_cgo_panicmem(void) {} diff --git a/src/runtime/cgo/gcc_signal_ios_arm64.c b/src/runtime/cgo/gcc_signal_ios_arm64.c new file mode 100644 index 0000000..6519edd --- /dev/null +++ b/src/runtime/cgo/gcc_signal_ios_arm64.c @@ -0,0 +1,213 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Emulation of the Unix signal SIGSEGV. +// +// On iOS, Go tests and apps under development are run by lldb. +// The debugger uses a task-level exception handler to intercept signals. +// Despite having a 'handle' mechanism like gdb, lldb will not allow a +// SIGSEGV to pass to the running program. 
For Go, this means we cannot +// generate a panic, which cannot be recovered, and so tests fail. +// +// We work around this by registering a thread-level mach exception handler +// and intercepting EXC_BAD_ACCESS. The kernel offers thread handlers a +// chance to resolve exceptions before the task handler, so we can generate +// the panic and avoid lldb's SIGSEGV handler. +// +// The dist tool enables this by build flag when testing. + +// +build lldb + +#include <limits.h> +#include <pthread.h> +#include <stdio.h> +#include <signal.h> +#include <stdlib.h> +#include <unistd.h> + +#include <mach/arm/thread_status.h> +#include <mach/exception_types.h> +#include <mach/mach.h> +#include <mach/mach_init.h> +#include <mach/mach_port.h> +#include <mach/thread_act.h> +#include <mach/thread_status.h> + +#include "libcgo.h" +#include "libcgo_unix.h" + +void xx_cgo_panicmem(void); +uintptr_t x_cgo_panicmem = (uintptr_t)xx_cgo_panicmem; + +static pthread_mutex_t mach_exception_handler_port_set_mu; +static mach_port_t mach_exception_handler_port_set = MACH_PORT_NULL; + +kern_return_t +catch_exception_raise( + mach_port_t exception_port, + mach_port_t thread, + mach_port_t task, + exception_type_t exception, + exception_data_t code_vector, + mach_msg_type_number_t code_count) +{ + kern_return_t ret; + arm_unified_thread_state_t thread_state; + mach_msg_type_number_t state_count = ARM_UNIFIED_THREAD_STATE_COUNT; + + // Returning KERN_SUCCESS intercepts the exception. + // + // Returning KERN_FAILURE lets the exception fall through to the + // next handler, which is the standard signal emulation code + // registered on the task port. 
+ + if (exception != EXC_BAD_ACCESS) { + return KERN_FAILURE; + } + + ret = thread_get_state(thread, ARM_UNIFIED_THREAD_STATE, (thread_state_t)&thread_state, &state_count); + if (ret) { + fprintf(stderr, "runtime/cgo: thread_get_state failed: %d\n", ret); + abort(); + } + + // Bounce call to sigpanic through asm that makes it look like + // we call sigpanic directly from the faulting code. +#ifdef __arm64__ + thread_state.ts_64.__x[1] = thread_state.ts_64.__lr; + thread_state.ts_64.__x[2] = thread_state.ts_64.__pc; + thread_state.ts_64.__pc = x_cgo_panicmem; +#else + thread_state.ts_32.__r[1] = thread_state.ts_32.__lr; + thread_state.ts_32.__r[2] = thread_state.ts_32.__pc; + thread_state.ts_32.__pc = x_cgo_panicmem; +#endif + + if (0) { + // Useful debugging logic when panicmem is broken. + // + // Sends the first SIGSEGV and lets lldb catch the + // second one, avoiding a loop that locks up iOS + // devices requiring a hard reboot. + fprintf(stderr, "runtime/cgo: caught exc_bad_access\n"); + fprintf(stderr, "__lr = %llx\n", thread_state.ts_64.__lr); + fprintf(stderr, "__pc = %llx\n", thread_state.ts_64.__pc); + static int pass1 = 0; + if (pass1) { + return KERN_FAILURE; + } + pass1 = 1; + } + + ret = thread_set_state(thread, ARM_UNIFIED_THREAD_STATE, (thread_state_t)&thread_state, state_count); + if (ret) { + fprintf(stderr, "runtime/cgo: thread_set_state failed: %d\n", ret); + abort(); + } + + return KERN_SUCCESS; +} + +void +darwin_arm_init_thread_exception_port() +{ + // Called by each new OS thread to bind its EXC_BAD_ACCESS exception + // to mach_exception_handler_port_set. 
+ int ret; + mach_port_t port = MACH_PORT_NULL; + + ret = mach_port_allocate(mach_task_self(), MACH_PORT_RIGHT_RECEIVE, &port); + if (ret) { + fprintf(stderr, "runtime/cgo: mach_port_allocate failed: %d\n", ret); + abort(); + } + ret = mach_port_insert_right( + mach_task_self(), + port, + port, + MACH_MSG_TYPE_MAKE_SEND); + if (ret) { + fprintf(stderr, "runtime/cgo: mach_port_insert_right failed: %d\n", ret); + abort(); + } + + ret = thread_set_exception_ports( + mach_thread_self(), + EXC_MASK_BAD_ACCESS, + port, + EXCEPTION_DEFAULT, + THREAD_STATE_NONE); + if (ret) { + fprintf(stderr, "runtime/cgo: thread_set_exception_ports failed: %d\n", ret); + abort(); + } + + ret = pthread_mutex_lock(&mach_exception_handler_port_set_mu); + if (ret) { + fprintf(stderr, "runtime/cgo: pthread_mutex_lock failed: %d\n", ret); + abort(); + } + ret = mach_port_move_member( + mach_task_self(), + port, + mach_exception_handler_port_set); + if (ret) { + fprintf(stderr, "runtime/cgo: mach_port_move_member failed: %d\n", ret); + abort(); + } + ret = pthread_mutex_unlock(&mach_exception_handler_port_set_mu); + if (ret) { + fprintf(stderr, "runtime/cgo: pthread_mutex_unlock failed: %d\n", ret); + abort(); + } +} + +static void* +mach_exception_handler(void *port) +{ + // Calls catch_exception_raise. + extern boolean_t exc_server(); + mach_msg_server(exc_server, 2048, (mach_port_t)port, 0); + abort(); // never returns +} + +void +darwin_arm_init_mach_exception_handler() +{ + pthread_mutex_init(&mach_exception_handler_port_set_mu, NULL); + + // Called once per process to initialize a mach port server, listening + // for EXC_BAD_ACCESS thread exceptions. 
+ int ret; + pthread_t thr = NULL; + pthread_attr_t attr; + sigset_t ign, oset; + + ret = mach_port_allocate( + mach_task_self(), + MACH_PORT_RIGHT_PORT_SET, + &mach_exception_handler_port_set); + if (ret) { + fprintf(stderr, "runtime/cgo: mach_port_allocate failed for port_set: %d\n", ret); + abort(); + } + + // Block all signals to the exception handler thread + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + // Start a thread to handle exceptions. + uintptr_t port_set = (uintptr_t)mach_exception_handler_port_set; + pthread_attr_init(&attr); + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); + ret = _cgo_try_pthread_create(&thr, &attr, mach_exception_handler, (void*)port_set); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (ret) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %d\n", ret); + abort(); + } + pthread_attr_destroy(&attr); +} diff --git a/src/runtime/cgo/gcc_signal_ios_nolldb.c b/src/runtime/cgo/gcc_signal_ios_nolldb.c new file mode 100644 index 0000000..cfa4025 --- /dev/null +++ b/src/runtime/cgo/gcc_signal_ios_nolldb.c @@ -0,0 +1,12 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build !lldb +// +build ios +// +build arm64 + +#include <stdint.h> + +void darwin_arm_init_thread_exception_port() {} +void darwin_arm_init_mach_exception_handler() {} diff --git a/src/runtime/cgo/gcc_solaris_amd64.c b/src/runtime/cgo/gcc_solaris_amd64.c new file mode 100644 index 0000000..e89e844 --- /dev/null +++ b/src/runtime/cgo/gcc_solaris_amd64.c @@ -0,0 +1,77 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include <pthread.h> +#include <string.h> +#include <signal.h> +#include <ucontext.h> +#include "libcgo.h" +#include "libcgo_unix.h" + +static void* threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + ucontext_t ctx; + + setg_gcc = setg; + if (getcontext(&ctx) != 0) + perror("runtime/cgo: getcontext failed"); + g->stacklo = (uintptr_t)ctx.uc_stack.ss_sp; + + // Solaris processes report a tiny stack when run with "ulimit -s unlimited". + // Correct that as best we can: assume it's at least 1 MB. + // See golang.org/issue/12210. + if(ctx.uc_stack.ss_size < 1024*1024) + g->stacklo -= 1024*1024 - ctx.uc_stack.ss_size; +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + pthread_attr_t attr; + sigset_t ign, oset; + pthread_t p; + void *base; + size_t size; + int err; + + sigfillset(&ign); + pthread_sigmask(SIG_SETMASK, &ign, &oset); + + pthread_attr_init(&attr); + + if (pthread_attr_getstack(&attr, &base, &size) != 0) + perror("runtime/cgo: pthread_attr_getstack failed"); + if (size == 0) { + ts->g->stackhi = 2 << 20; + if (pthread_attr_setstack(&attr, NULL, ts->g->stackhi) != 0) + perror("runtime/cgo: pthread_attr_setstack failed"); + } else { + ts->g->stackhi = size; + } + pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); + err = _cgo_try_pthread_create(&p, &attr, threadentry, ts); + + pthread_sigmask(SIG_SETMASK, &oset, nil); + + if (err != 0) { + fprintf(stderr, "runtime/cgo: pthread_create failed: %s\n", strerror(err)); + abort(); + } +} + +static void* +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); + return nil; +} diff --git a/src/runtime/cgo/gcc_traceback.c b/src/runtime/cgo/gcc_traceback.c new file mode 100644 index 0000000..6e9470c --- /dev/null +++ b/src/runtime/cgo/gcc_traceback.c @@ -0,0 +1,44 @@ +// Copyright 2016 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build cgo,darwin cgo,linux + +#include <stdint.h> +#include "libcgo.h" + +#ifndef __has_feature +#define __has_feature(x) 0 +#endif + +#if __has_feature(memory_sanitizer) +#include <sanitizer/msan_interface.h> +#endif + +// Call the user's traceback function and then call sigtramp. +// The runtime signal handler will jump to this code. +// We do it this way so that the user's traceback function will be called +// by a C function with proper unwind info. +void +x_cgo_callers(uintptr_t sig, void *info, void *context, void (*cgoTraceback)(struct cgoTracebackArg*), uintptr_t* cgoCallers, void (*sigtramp)(uintptr_t, void*, void*)) { + struct cgoTracebackArg arg; + + arg.Context = 0; + arg.SigContext = (uintptr_t)(context); + arg.Buf = cgoCallers; + arg.Max = 32; // must match len(runtime.cgoCallers) + +#if __has_feature(memory_sanitizer) + // This function is called directly from the signal handler. + // The arguments are passed in registers, so whether msan + // considers cgoCallers to be initialized depends on whether + // it considers the appropriate register to be initialized. + // That can cause false reports in rare cases. + // Explicitly unpoison the memory to avoid that. + // See issue #47543 for more details. + __msan_unpoison(&arg, sizeof arg); +#endif + + (*cgoTraceback)(&arg); + sigtramp(sig, info, context); +} diff --git a/src/runtime/cgo/gcc_util.c b/src/runtime/cgo/gcc_util.c new file mode 100644 index 0000000..3fcb48c --- /dev/null +++ b/src/runtime/cgo/gcc_util.c @@ -0,0 +1,69 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "libcgo.h" + +/* Stub for creating a new thread */ +void +x_cgo_thread_start(ThreadStart *arg) +{ + ThreadStart *ts; + + /* Make our own copy that can persist after we return. 
*/ + _cgo_tsan_acquire(); + ts = malloc(sizeof *ts); + _cgo_tsan_release(); + if(ts == nil) { + fprintf(stderr, "runtime/cgo: out of memory in thread_start\n"); + abort(); + } + *ts = *arg; + + _cgo_sys_thread_start(ts); /* OS-dependent half */ +} + +#ifndef CGO_TSAN +void(* const _cgo_yield)() = NULL; +#else + +#include <string.h> + +char x_cgo_yield_strncpy_src = 0; +char x_cgo_yield_strncpy_dst = 0; +size_t x_cgo_yield_strncpy_n = 0; + +/* +Stub for allowing libc interceptors to execute. + +_cgo_yield is set to NULL if we do not expect libc interceptors to exist. +*/ +static void +x_cgo_yield() +{ + /* + The libc function(s) we call here must form a no-op and include at least one + call that triggers TSAN to process pending asynchronous signals. + + sleep(0) would be fine, but it's not portable C (so it would need more header + guards). + free(NULL) has a fast-path special case in TSAN, so it doesn't + trigger signal delivery. + free(malloc(0)) would work (triggering the interceptors in malloc), but + it also runs a bunch of user-supplied malloc hooks. + + So we choose strncpy(_, _, 0): it requires an extra header, + but it's standard and should be very efficient. + + GCC 7 has an unfortunate habit of optimizing out strncpy calls (see + https://golang.org/issue/21196), so the arguments here need to be global + variables with external linkage in order to ensure that the call traps all the + way down into libc. + */ + strncpy(&x_cgo_yield_strncpy_dst, &x_cgo_yield_strncpy_src, + x_cgo_yield_strncpy_n); +} + +void(* const _cgo_yield)() = &x_cgo_yield; + +#endif /* GO_TSAN */ diff --git a/src/runtime/cgo/gcc_windows_386.c b/src/runtime/cgo/gcc_windows_386.c new file mode 100644 index 0000000..56fbaac --- /dev/null +++ b/src/runtime/cgo/gcc_windows_386.c @@ -0,0 +1,49 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#define WIN32_LEAN_AND_MEAN +#include <windows.h> +#include <process.h> +#include <stdlib.h> +#include <stdio.h> +#include <errno.h> +#include "libcgo.h" +#include "libcgo_windows.h" + +static void threadentry(void*); + +void +x_cgo_init(G *g) +{ +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + _cgo_beginthread(threadentry, ts); +} + +static void +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + // minit queries stack bounds from the OS. + + /* + * Set specific keys in thread local storage. + */ + asm volatile ( + "movl %0, %%fs:0x14\n" // MOVL tls0, 0x14(FS) + "movl %%fs:0x14, %%eax\n" // MOVL 0x14(FS), tmp + "movl %1, 0(%%eax)\n" // MOVL g, 0(FS) + :: "r"(ts.tls), "r"(ts.g) : "%eax" + ); + + crosscall_386(ts.fn); +} diff --git a/src/runtime/cgo/gcc_windows_amd64.c b/src/runtime/cgo/gcc_windows_amd64.c new file mode 100644 index 0000000..3ff3c64 --- /dev/null +++ b/src/runtime/cgo/gcc_windows_amd64.c @@ -0,0 +1,51 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#define WIN32_LEAN_AND_MEAN +#include <windows.h> +#include <process.h> +#include <stdlib.h> +#include <stdio.h> +#include <errno.h> +#include "libcgo.h" +#include "libcgo_windows.h" + +static void threadentry(void*); +static void (*setg_gcc)(void*); +static DWORD *tls_g; + +void +x_cgo_init(G *g, void (*setg)(void*), void **tlsg, void **tlsbase) +{ + setg_gcc = setg; + tls_g = (DWORD *)tlsg; +} + + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + _cgo_beginthread(threadentry, ts); +} + +static void +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + // minit queries stack bounds from the OS. + + /* + * Set specific keys in thread local storage. 
+ */ + asm volatile ( + "movq %0, %%gs:0(%1)\n" // MOVL tls0, 0(tls_g)(GS) + :: "r"(ts.tls), "r"(*tls_g) + ); + + crosscall_amd64(ts.fn, setg_gcc, (void*)ts.g); +} diff --git a/src/runtime/cgo/gcc_windows_arm64.c b/src/runtime/cgo/gcc_windows_arm64.c new file mode 100644 index 0000000..8f113cc --- /dev/null +++ b/src/runtime/cgo/gcc_windows_arm64.c @@ -0,0 +1,40 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#define WIN32_LEAN_AND_MEAN +#include <windows.h> +#include <process.h> +#include <stdlib.h> +#include <stdio.h> +#include <errno.h> +#include "libcgo.h" +#include "libcgo_windows.h" + +static void threadentry(void*); +static void (*setg_gcc)(void*); + +void +x_cgo_init(G *g, void (*setg)(void*)) +{ + setg_gcc = setg; +} + +void +_cgo_sys_thread_start(ThreadStart *ts) +{ + _cgo_beginthread(threadentry, ts); +} + +extern void crosscall1(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +static void +threadentry(void *v) +{ + ThreadStart ts; + + ts = *(ThreadStart*)v; + free(v); + + crosscall1(ts.fn, setg_gcc, (void *)ts.g); +} diff --git a/src/runtime/cgo/handle.go b/src/runtime/cgo/handle.go new file mode 100644 index 0000000..d711900 --- /dev/null +++ b/src/runtime/cgo/handle.go @@ -0,0 +1,144 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package cgo + +import ( + "sync" + "sync/atomic" +) + +// Handle provides a way to pass values that contain Go pointers +// (pointers to memory allocated by Go) between Go and C without +// breaking the cgo pointer passing rules. A Handle is an integer +// value that can represent any Go value. A Handle can be passed +// through C and back to Go, and Go code can use the Handle to +// retrieve the original Go value. 
+// +// The underlying type of Handle is guaranteed to fit in an integer type +// that is large enough to hold the bit pattern of any pointer. The zero +// value of a Handle is not valid, and thus is safe to use as a sentinel +// in C APIs. +// +// For instance, on the Go side: +// +// package main +// +// /* +// #include <stdint.h> // for uintptr_t +// +// extern void MyGoPrint(uintptr_t handle); +// void myprint(uintptr_t handle); +// */ +// import "C" +// import "runtime/cgo" +// +// //export MyGoPrint +// func MyGoPrint(handle C.uintptr_t) { +// h := cgo.Handle(handle) +// val := h.Value().(string) +// println(val) +// h.Delete() +// } +// +// func main() { +// val := "hello Go" +// C.myprint(C.uintptr_t(cgo.NewHandle(val))) +// // Output: hello Go +// } +// +// and on the C side: +// +// #include <stdint.h> // for uintptr_t +// +// // A Go function +// extern void MyGoPrint(uintptr_t handle); +// +// // A C function +// void myprint(uintptr_t handle) { +// MyGoPrint(handle); +// } +// +// Some C functions accept a void* argument that points to an arbitrary +// data value supplied by the caller. It is not safe to coerce a cgo.Handle +// (an integer) to a Go unsafe.Pointer, but instead we can pass the address +// of the cgo.Handle to the void* parameter, as in this variant of the +// previous example: +// +// package main +// +// /* +// extern void MyGoPrint(void *context); +// static inline void myprint(void *context) { +// MyGoPrint(context); +// } +// */ +// import "C" +// import ( +// "runtime/cgo" +// "unsafe" +// ) +// +// //export MyGoPrint +// func MyGoPrint(context unsafe.Pointer) { +// h := *(*cgo.Handle)(context) +// val := h.Value().(string) +// println(val) +// h.Delete() +// } +// +// func main() { +// val := "hello Go" +// h := cgo.NewHandle(val) +// C.myprint(unsafe.Pointer(&h)) +// // Output: hello Go +// } +type Handle uintptr + +// NewHandle returns a handle for a given value. +// +// The handle is valid until the program calls Delete on it. 
The handle +// uses resources, and this package assumes that C code may hold on to +// the handle, so a program must explicitly call Delete when the handle +// is no longer needed. +// +// The intended use is to pass the returned handle to C code, which +// passes it back to Go, which calls Value. +func NewHandle(v any) Handle { + h := atomic.AddUintptr(&handleIdx, 1) + if h == 0 { + panic("runtime/cgo: ran out of handle space") + } + + handles.Store(h, v) + return Handle(h) +} + +// Value returns the associated Go value for a valid handle. +// +// The method panics if the handle is invalid. +func (h Handle) Value() any { + v, ok := handles.Load(uintptr(h)) + if !ok { + panic("runtime/cgo: misuse of an invalid Handle") + } + return v +} + +// Delete invalidates a handle. This method should only be called once +// the program no longer needs to pass the handle to C and the C code +// no longer has a copy of the handle value. +// +// The method panics if the handle is invalid. +func (h Handle) Delete() { + _, ok := handles.LoadAndDelete(uintptr(h)) + if !ok { + panic("runtime/cgo: misuse of an invalid Handle") + } +} + +var ( + handles = sync.Map{} // map[Handle]interface{} + handleIdx uintptr // atomic +) diff --git a/src/runtime/cgo/handle_test.go b/src/runtime/cgo/handle_test.go new file mode 100644 index 0000000..b341c8e --- /dev/null +++ b/src/runtime/cgo/handle_test.go @@ -0,0 +1,103 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package cgo + +import ( + "reflect" + "testing" +) + +func TestHandle(t *testing.T) { + v := 42 + + tests := []struct { + v1 any + v2 any + }{ + {v1: v, v2: v}, + {v1: &v, v2: &v}, + {v1: nil, v2: nil}, + } + + for _, tt := range tests { + h1 := NewHandle(tt.v1) + h2 := NewHandle(tt.v2) + + if uintptr(h1) == 0 || uintptr(h2) == 0 { + t.Fatalf("NewHandle returns zero") + } + + if uintptr(h1) == uintptr(h2) { + t.Fatalf("Duplicated Go values should have different handles, but got equal") + } + + h1v := h1.Value() + h2v := h2.Value() + if !reflect.DeepEqual(h1v, h2v) || !reflect.DeepEqual(h1v, tt.v1) { + t.Fatalf("Value of a Handle got wrong, got %+v %+v, want %+v", h1v, h2v, tt.v1) + } + + h1.Delete() + h2.Delete() + } + + siz := 0 + handles.Range(func(k, v any) bool { + siz++ + return true + }) + if siz != 0 { + t.Fatalf("handles are not cleared, got %d, want %d", siz, 0) + } +} + +func TestInvalidHandle(t *testing.T) { + t.Run("zero", func(t *testing.T) { + h := Handle(0) + + defer func() { + if r := recover(); r != nil { + return + } + t.Fatalf("Delete of zero handle did not trigger a panic") + }() + + h.Delete() + }) + + t.Run("invalid", func(t *testing.T) { + h := NewHandle(42) + + defer func() { + if r := recover(); r != nil { + h.Delete() + return + } + t.Fatalf("Invalid handle did not trigger a panic") + }() + + Handle(h + 1).Delete() + }) +} + +func BenchmarkHandle(b *testing.B) { + b.Run("non-concurrent", func(b *testing.B) { + for i := 0; i < b.N; i++ { + h := NewHandle(i) + _ = h.Value() + h.Delete() + } + }) + b.Run("concurrent", func(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + var v int + for pb.Next() { + h := NewHandle(v) + _ = h.Value() + h.Delete() + } + }) + }) +} diff --git a/src/runtime/cgo/iscgo.go b/src/runtime/cgo/iscgo.go new file mode 100644 index 0000000..e12d0f4 --- /dev/null +++ b/src/runtime/cgo/iscgo.go @@ -0,0 +1,17 @@ +// Copyright 2010 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The runtime package contains an uninitialized definition +// for runtime·iscgo. Override it to tell the runtime we're here. +// There are various function pointers that should be set too, +// but those depend on dynamic linker magic to get initialized +// correctly, and sometimes they break. This variable is a +// backup: it depends only on old C style static linking rules. + +package cgo + +import _ "unsafe" // for go:linkname + +//go:linkname _iscgo runtime.iscgo +var _iscgo bool = true diff --git a/src/runtime/cgo/libcgo.h b/src/runtime/cgo/libcgo.h new file mode 100644 index 0000000..af4960e --- /dev/null +++ b/src/runtime/cgo/libcgo.h @@ -0,0 +1,151 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <stdint.h> +#include <stdlib.h> +#include <stdio.h> + +#undef nil +#define nil ((void*)0) +#define nelem(x) (sizeof(x)/sizeof((x)[0])) + +typedef uint32_t uint32; +typedef uint64_t uint64; +typedef uintptr_t uintptr; + +/* + * The beginning of the per-goroutine structure, + * as defined in ../pkg/runtime/runtime.h. + * Just enough to edit these two fields. + */ +typedef struct G G; +struct G +{ + uintptr stacklo; + uintptr stackhi; +}; + +/* + * Arguments to the _cgo_thread_start call. + * Also known to ../pkg/runtime/runtime.h. + */ +typedef struct ThreadStart ThreadStart; +struct ThreadStart +{ + G *g; + uintptr *tls; + void (*fn)(void); +}; + +/* + * Called by 5c/6c/8c world. + * Makes a local copy of the ThreadStart and + * calls _cgo_sys_thread_start(ts). + */ +extern void (*_cgo_thread_start)(ThreadStart *ts); + +/* + * Creates a new operating system thread without updating any Go state + * (OS dependent). 
+ */ +extern void (*_cgo_sys_thread_create)(void* (*func)(void*), void* arg); + +/* + * Creates the new operating system thread (OS, arch dependent). + */ +void _cgo_sys_thread_start(ThreadStart *ts); + +/* + * Waits for the Go runtime to be initialized (OS dependent). + * If runtime.SetCgoTraceback is used to set a context function, + * calls the context function and returns the context value. + */ +uintptr_t _cgo_wait_runtime_init_done(void); + +/* + * Call fn in the 6c world. + */ +void crosscall_amd64(void (*fn)(void), void (*setg_gcc)(void*), void *g); + +/* + * Call fn in the 8c world. + */ +void crosscall_386(void (*fn)(void)); + +/* + * Prints error then calls abort. For linux and android. + */ +void fatalf(const char* format, ...); + +/* + * Registers the current mach thread port for EXC_BAD_ACCESS processing. + */ +void darwin_arm_init_thread_exception_port(void); + +/* + * Starts a mach message server processing EXC_BAD_ACCESS. + */ +void darwin_arm_init_mach_exception_handler(void); + +/* + * The cgo context function. See runtime.SetCgoTraceback. + */ +struct context_arg { + uintptr_t Context; +}; +extern void (*(_cgo_get_context_function(void)))(struct context_arg*); + +/* + * The argument for the cgo traceback callback. See runtime.SetCgoTraceback. + */ +struct cgoTracebackArg { + uintptr_t Context; + uintptr_t SigContext; + uintptr_t* Buf; + uintptr_t Max; +}; + +/* + * TSAN support. This is only useful when building with + * CGO_CFLAGS="-fsanitize=thread" CGO_LDFLAGS="-fsanitize=thread" go install + */ +#undef CGO_TSAN +#if defined(__has_feature) +# if __has_feature(thread_sanitizer) +# define CGO_TSAN +# endif +#elif defined(__SANITIZE_THREAD__) +# define CGO_TSAN +#endif + +#ifdef CGO_TSAN + +// These must match the definitions in yesTsanProlog in cmd/cgo/out.go. +// In general we should call _cgo_tsan_acquire when we enter C code, +// and call _cgo_tsan_release when we return to Go code. 
+// This is only necessary when calling code that might be instrumented +// by TSAN, which mostly means system library calls that TSAN intercepts. +// See the comment in cmd/cgo/out.go for more details. + +long long _cgo_sync __attribute__ ((common)); + +extern void __tsan_acquire(void*); +extern void __tsan_release(void*); + +__attribute__ ((unused)) +static void _cgo_tsan_acquire() { + __tsan_acquire(&_cgo_sync); +} + +__attribute__ ((unused)) +static void _cgo_tsan_release() { + __tsan_release(&_cgo_sync); +} + +#else // !defined(CGO_TSAN) + +#define _cgo_tsan_acquire() +#define _cgo_tsan_release() + +#endif // !defined(CGO_TSAN) diff --git a/src/runtime/cgo/libcgo_unix.h b/src/runtime/cgo/libcgo_unix.h new file mode 100644 index 0000000..a56a366 --- /dev/null +++ b/src/runtime/cgo/libcgo_unix.h @@ -0,0 +1,15 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +/* + * Call pthread_create, retrying on EAGAIN. + */ +extern int _cgo_try_pthread_create(pthread_t*, const pthread_attr_t*, void* (*)(void*), void*); + +/* + * Same as _cgo_try_pthread_create, but passing on the pthread_create function. + * Only defined on OpenBSD. + */ +extern int _cgo_openbsd_try_pthread_create(int (*)(pthread_t*, const pthread_attr_t*, void *(*pfn)(void*), void*), + pthread_t*, const pthread_attr_t*, void* (*)(void*), void* arg); diff --git a/src/runtime/cgo/libcgo_windows.h b/src/runtime/cgo/libcgo_windows.h new file mode 100644 index 0000000..33d7637 --- /dev/null +++ b/src/runtime/cgo/libcgo_windows.h @@ -0,0 +1,6 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Call _beginthread, aborting on failure. 
+void _cgo_beginthread(void (*func)(void*), void* arg); diff --git a/src/runtime/cgo/linux.go b/src/runtime/cgo/linux.go new file mode 100644 index 0000000..1d6fe03 --- /dev/null +++ b/src/runtime/cgo/linux.go @@ -0,0 +1,74 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Linux system call wrappers that provide POSIX semantics through the +// corresponding cgo->libc (nptl) wrappers for various system calls. + +//go:build linux + +package cgo + +import "unsafe" + +// Each of the following entries is needed to ensure that the +// syscall.syscall_linux code can conditionally call these +// function pointers: +// +// 1. find the C-defined function start +// 2. force the local byte alias to be mapped to that location +// 3. map the Go pointer to the function to the syscall package + +//go:cgo_import_static _cgo_libc_setegid +//go:linkname _cgo_libc_setegid _cgo_libc_setegid +//go:linkname cgo_libc_setegid syscall.cgo_libc_setegid +var _cgo_libc_setegid byte +var cgo_libc_setegid = unsafe.Pointer(&_cgo_libc_setegid) + +//go:cgo_import_static _cgo_libc_seteuid +//go:linkname _cgo_libc_seteuid _cgo_libc_seteuid +//go:linkname cgo_libc_seteuid syscall.cgo_libc_seteuid +var _cgo_libc_seteuid byte +var cgo_libc_seteuid = unsafe.Pointer(&_cgo_libc_seteuid) + +//go:cgo_import_static _cgo_libc_setregid +//go:linkname _cgo_libc_setregid _cgo_libc_setregid +//go:linkname cgo_libc_setregid syscall.cgo_libc_setregid +var _cgo_libc_setregid byte +var cgo_libc_setregid = unsafe.Pointer(&_cgo_libc_setregid) + +//go:cgo_import_static _cgo_libc_setresgid +//go:linkname _cgo_libc_setresgid _cgo_libc_setresgid +//go:linkname cgo_libc_setresgid syscall.cgo_libc_setresgid +var _cgo_libc_setresgid byte +var cgo_libc_setresgid = unsafe.Pointer(&_cgo_libc_setresgid) + +//go:cgo_import_static _cgo_libc_setresuid +//go:linkname _cgo_libc_setresuid _cgo_libc_setresuid 
+//go:linkname cgo_libc_setresuid syscall.cgo_libc_setresuid +var _cgo_libc_setresuid byte +var cgo_libc_setresuid = unsafe.Pointer(&_cgo_libc_setresuid) + +//go:cgo_import_static _cgo_libc_setreuid +//go:linkname _cgo_libc_setreuid _cgo_libc_setreuid +//go:linkname cgo_libc_setreuid syscall.cgo_libc_setreuid +var _cgo_libc_setreuid byte +var cgo_libc_setreuid = unsafe.Pointer(&_cgo_libc_setreuid) + +//go:cgo_import_static _cgo_libc_setgroups +//go:linkname _cgo_libc_setgroups _cgo_libc_setgroups +//go:linkname cgo_libc_setgroups syscall.cgo_libc_setgroups +var _cgo_libc_setgroups byte +var cgo_libc_setgroups = unsafe.Pointer(&_cgo_libc_setgroups) + +//go:cgo_import_static _cgo_libc_setgid +//go:linkname _cgo_libc_setgid _cgo_libc_setgid +//go:linkname cgo_libc_setgid syscall.cgo_libc_setgid +var _cgo_libc_setgid byte +var cgo_libc_setgid = unsafe.Pointer(&_cgo_libc_setgid) + +//go:cgo_import_static _cgo_libc_setuid +//go:linkname _cgo_libc_setuid _cgo_libc_setuid +//go:linkname cgo_libc_setuid syscall.cgo_libc_setuid +var _cgo_libc_setuid byte +var cgo_libc_setuid = unsafe.Pointer(&_cgo_libc_setuid) diff --git a/src/runtime/cgo/linux_syscall.c b/src/runtime/cgo/linux_syscall.c new file mode 100644 index 0000000..59761c8 --- /dev/null +++ b/src/runtime/cgo/linux_syscall.c @@ -0,0 +1,85 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build linux + +#ifndef _GNU_SOURCE // setres[ug]id() API. +#define _GNU_SOURCE +#endif + +#include <grp.h> +#include <sys/types.h> +#include <unistd.h> +#include <errno.h> +#include "libcgo.h" + +/* + * Assumed POSIX compliant libc system call wrappers. For linux, the + * glibc/nptl/setxid mechanism ensures that POSIX semantics are + * honored for all pthreads (by default), and this in turn with cgo + * ensures that all Go threads launched with cgo are kept in sync for + * these function calls. 
+ */ + +// argset_t matches runtime/cgocall.go:argset. +typedef struct { + uintptr_t* args; + uintptr_t retval; +} argset_t; + +// libc backed posix-compliant syscalls. + +#define SET_RETVAL(fn) \ + uintptr_t ret = (uintptr_t) fn ; \ + if (ret == (uintptr_t) -1) { \ + x->retval = (uintptr_t) errno; \ + } else \ + x->retval = ret + +void +_cgo_libc_setegid(argset_t* x) { + SET_RETVAL(setegid((gid_t) x->args[0])); +} + +void +_cgo_libc_seteuid(argset_t* x) { + SET_RETVAL(seteuid((uid_t) x->args[0])); +} + +void +_cgo_libc_setgid(argset_t* x) { + SET_RETVAL(setgid((gid_t) x->args[0])); +} + +void +_cgo_libc_setgroups(argset_t* x) { + SET_RETVAL(setgroups((size_t) x->args[0], (const gid_t *) x->args[1])); +} + +void +_cgo_libc_setregid(argset_t* x) { + SET_RETVAL(setregid((gid_t) x->args[0], (gid_t) x->args[1])); +} + +void +_cgo_libc_setresgid(argset_t* x) { + SET_RETVAL(setresgid((gid_t) x->args[0], (gid_t) x->args[1], + (gid_t) x->args[2])); +} + +void +_cgo_libc_setresuid(argset_t* x) { + SET_RETVAL(setresuid((uid_t) x->args[0], (uid_t) x->args[1], + (uid_t) x->args[2])); +} + +void +_cgo_libc_setreuid(argset_t* x) { + SET_RETVAL(setreuid((uid_t) x->args[0], (uid_t) x->args[1])); +} + +void +_cgo_libc_setuid(argset_t* x) { + SET_RETVAL(setuid((uid_t) x->args[0])); +} diff --git a/src/runtime/cgo/mmap.go b/src/runtime/cgo/mmap.go new file mode 100644 index 0000000..2f7e83b --- /dev/null +++ b/src/runtime/cgo/mmap.go @@ -0,0 +1,31 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && amd64) || (linux && arm64) || (freebsd && amd64) + +package cgo + +// Import "unsafe" because we use go:linkname. +import _ "unsafe" + +// When using cgo, call the C library for mmap, so that we call into +// any sanitizer interceptors. This supports using the memory +// sanitizer with Go programs. 
The memory sanitizer only applies to +// C/C++ code; this permits that code to see the Go code as normal +// program addresses that have been initialized. + +// To support interceptors that look for both mmap and munmap, +// also call the C library for munmap. + +//go:cgo_import_static x_cgo_mmap +//go:linkname x_cgo_mmap x_cgo_mmap +//go:linkname _cgo_mmap _cgo_mmap +var x_cgo_mmap byte +var _cgo_mmap = &x_cgo_mmap + +//go:cgo_import_static x_cgo_munmap +//go:linkname x_cgo_munmap x_cgo_munmap +//go:linkname _cgo_munmap _cgo_munmap +var x_cgo_munmap byte +var _cgo_munmap = &x_cgo_munmap diff --git a/src/runtime/cgo/netbsd.go b/src/runtime/cgo/netbsd.go new file mode 100644 index 0000000..8a8018b --- /dev/null +++ b/src/runtime/cgo/netbsd.go @@ -0,0 +1,21 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build netbsd + +package cgo + +import _ "unsafe" // for go:linkname + +// Supply environ and __progname, because we don't +// link against the standard NetBSD crt0.o and the +// libc dynamic library needs them. + +//go:linkname _environ environ +//go:linkname _progname __progname +//go:linkname ___ps_strings __ps_strings + +var _environ uintptr +var _progname uintptr +var ___ps_strings uintptr diff --git a/src/runtime/cgo/openbsd.go b/src/runtime/cgo/openbsd.go new file mode 100644 index 0000000..26b62fb --- /dev/null +++ b/src/runtime/cgo/openbsd.go @@ -0,0 +1,21 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd + +package cgo + +import _ "unsafe" // for go:linkname + +// Supply __guard_local because we don't link against the standard +// OpenBSD crt0.o and the libc dynamic library needs it. 
+ +//go:linkname _guard_local __guard_local + +var _guard_local uintptr + +// This is normally marked as hidden and placed in the +// .openbsd.randomdata section. +// +//go:cgo_export_dynamic __guard_local __guard_local diff --git a/src/runtime/cgo/setenv.go b/src/runtime/cgo/setenv.go new file mode 100644 index 0000000..2247cb2 --- /dev/null +++ b/src/runtime/cgo/setenv.go @@ -0,0 +1,21 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package cgo + +import _ "unsafe" // for go:linkname + +//go:cgo_import_static x_cgo_setenv +//go:linkname x_cgo_setenv x_cgo_setenv +//go:linkname _cgo_setenv runtime._cgo_setenv +var x_cgo_setenv byte +var _cgo_setenv = &x_cgo_setenv + +//go:cgo_import_static x_cgo_unsetenv +//go:linkname x_cgo_unsetenv x_cgo_unsetenv +//go:linkname _cgo_unsetenv runtime._cgo_unsetenv +var x_cgo_unsetenv byte +var _cgo_unsetenv = &x_cgo_unsetenv diff --git a/src/runtime/cgo/sigaction.go b/src/runtime/cgo/sigaction.go new file mode 100644 index 0000000..dc714f7 --- /dev/null +++ b/src/runtime/cgo/sigaction.go @@ -0,0 +1,22 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && amd64) || (freebsd && amd64) || (linux && arm64) || (linux && ppc64le) + +package cgo + +// Import "unsafe" because we use go:linkname. +import _ "unsafe" + +// When using cgo, call the C library for sigaction, so that we call into +// any sanitizer interceptors. This supports using the sanitizers +// with Go programs. The thread and memory sanitizers only apply to +// C/C++ code; this permits that code to see the Go runtime's existing signal +// handlers when registering new signal handlers for the process. 
+ +//go:cgo_import_static x_cgo_sigaction +//go:linkname x_cgo_sigaction x_cgo_sigaction +//go:linkname _cgo_sigaction _cgo_sigaction +var x_cgo_sigaction byte +var _cgo_sigaction = &x_cgo_sigaction diff --git a/src/runtime/cgo/signal_ios_arm64.go b/src/runtime/cgo/signal_ios_arm64.go new file mode 100644 index 0000000..3425c44 --- /dev/null +++ b/src/runtime/cgo/signal_ios_arm64.go @@ -0,0 +1,10 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package cgo + +import _ "unsafe" + +//go:cgo_export_static xx_cgo_panicmem xx_cgo_panicmem +func xx_cgo_panicmem() diff --git a/src/runtime/cgo/signal_ios_arm64.s b/src/runtime/cgo/signal_ios_arm64.s new file mode 100644 index 0000000..1ae00d1 --- /dev/null +++ b/src/runtime/cgo/signal_ios_arm64.s @@ -0,0 +1,56 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// xx_cgo_panicmem is the entrypoint for SIGSEGV as intercepted via a +// mach thread port as EXC_BAD_ACCESS. As the segfault may have happened +// in C code, we first need to load_g then call xx_cgo_panicmem. +// +// R1 - LR at moment of fault +// R2 - PC at moment of fault +TEXT xx_cgo_panicmem(SB),NOSPLIT|NOFRAME,$0 + // If in external C code, we need to load the g register. + BL runtime·load_g(SB) + CMP $0, g + BNE ongothread + + // On a foreign thread. + // TODO(crawshaw): call badsignal + MOVD.W $0, -16(RSP) + MOVW $139, R1 + MOVW R1, 8(RSP) + B runtime·exit(SB) + +ongothread: + // Trigger a SIGSEGV panic. + // + // The goal is to arrange the stack so it looks like the runtime + // function sigpanic was called from the PC that faulted. It has + // to be sigpanic, as the stack unwinding code in traceback.go + // looks explicitly for it. 
+ // + // To do this we call into runtime·setsigsegv, which sets the + // appropriate state inside the g object. We give it the faulting + // PC on the stack, then put it in the LR before calling sigpanic. + + // Build a 32-byte stack frame for us for this call. + // Saved LR (none available) is at the bottom, + // then the PC argument for setsigsegv, + // then a copy of the LR for us to restore. + MOVD.W $0, -32(RSP) + MOVD R1, 8(RSP) + MOVD R2, 16(RSP) + BL runtime·setsigsegv(SB) + MOVD 8(RSP), R1 + MOVD 16(RSP), R2 + + // Build a 16-byte stack frame for the simulated + // call to sigpanic, by taking 16 bytes away from the + // 32-byte stack frame above. + // The saved LR in this frame is the LR at time of fault, + // and the LR on entry to sigpanic is the PC at time of fault. + MOVD.W R1, 16(RSP) + MOVD R2, R30 + B runtime·sigpanic(SB) diff --git a/src/runtime/cgo_mmap.go b/src/runtime/cgo_mmap.go new file mode 100644 index 0000000..30660f7 --- /dev/null +++ b/src/runtime/cgo_mmap.go @@ -0,0 +1,70 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Support for memory sanitizer. See runtime/cgo/mmap.go. + +//go:build (linux && amd64) || (linux && arm64) || (freebsd && amd64) + +package runtime + +import "unsafe" + +// _cgo_mmap is filled in by runtime/cgo when it is linked into the +// program, so it is only non-nil when using cgo. +// +//go:linkname _cgo_mmap _cgo_mmap +var _cgo_mmap unsafe.Pointer + +// _cgo_munmap is filled in by runtime/cgo when it is linked into the +// program, so it is only non-nil when using cgo. +// +//go:linkname _cgo_munmap _cgo_munmap +var _cgo_munmap unsafe.Pointer + +// mmap is used to route the mmap system call through C code when using cgo, to +// support sanitizer interceptors. 
Don't allow stack splits, since this function +// (used by sysAlloc) is called in a lot of low-level parts of the runtime and +// callers often assume it won't acquire any locks. +// +//go:nosplit +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (unsafe.Pointer, int) { + if _cgo_mmap != nil { + // Make ret a uintptr so that writing to it in the + // function literal does not trigger a write barrier. + // A write barrier here could break because of the way + // that mmap uses the same value both as a pointer and + // an errno value. + var ret uintptr + systemstack(func() { + ret = callCgoMmap(addr, n, prot, flags, fd, off) + }) + if ret < 4096 { + return nil, int(ret) + } + return unsafe.Pointer(ret), 0 + } + return sysMmap(addr, n, prot, flags, fd, off) +} + +func munmap(addr unsafe.Pointer, n uintptr) { + if _cgo_munmap != nil { + systemstack(func() { callCgoMunmap(addr, n) }) + return + } + sysMunmap(addr, n) +} + +// sysMmap calls the mmap system call. It is implemented in assembly. +func sysMmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (p unsafe.Pointer, err int) + +// callCgoMmap calls the mmap function in the runtime/cgo package +// using the GCC calling convention. It is implemented in assembly. +func callCgoMmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) uintptr + +// sysMunmap calls the munmap system call. It is implemented in assembly. +func sysMunmap(addr unsafe.Pointer, n uintptr) + +// callCgoMunmap calls the munmap function in the runtime/cgo package +// using the GCC calling convention. It is implemented in assembly. +func callCgoMunmap(addr unsafe.Pointer, n uintptr) diff --git a/src/runtime/cgo_ppc64x.go b/src/runtime/cgo_ppc64x.go new file mode 100644 index 0000000..c723213 --- /dev/null +++ b/src/runtime/cgo_ppc64x.go @@ -0,0 +1,13 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +package runtime + +// crosscall_ppc64 calls into the runtime to set up the registers the +// Go runtime expects and so the symbol it calls needs to be exported +// for external linking to work. +// +//go:cgo_export_static _cgo_reginit diff --git a/src/runtime/cgo_sigaction.go b/src/runtime/cgo_sigaction.go new file mode 100644 index 0000000..9500c52 --- /dev/null +++ b/src/runtime/cgo_sigaction.go @@ -0,0 +1,94 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Support for sanitizers. See runtime/cgo/sigaction.go. + +//go:build (linux && amd64) || (freebsd && amd64) || (linux && arm64) || (linux && ppc64le) + +package runtime + +import "unsafe" + +// _cgo_sigaction is filled in by runtime/cgo when it is linked into the +// program, so it is only non-nil when using cgo. +// +//go:linkname _cgo_sigaction _cgo_sigaction +var _cgo_sigaction unsafe.Pointer + +//go:nosplit +//go:nowritebarrierrec +func sigaction(sig uint32, new, old *sigactiont) { + // racewalk.go avoids adding sanitizing instrumentation to package runtime, + // but we might be calling into instrumented C functions here, + // so we need the pointer parameters to be properly marked. + // + // Mark the input as having been written before the call + // and the output as read after. + if msanenabled && new != nil { + msanwrite(unsafe.Pointer(new), unsafe.Sizeof(*new)) + } + if asanenabled && new != nil { + asanwrite(unsafe.Pointer(new), unsafe.Sizeof(*new)) + } + if _cgo_sigaction == nil || inForkedChild { + sysSigaction(sig, new, old) + } else { + // We need to call _cgo_sigaction, which means we need a big enough stack + // for C. 
To complicate matters, we may be in libpreinit (before the + // runtime has been initialized) or in an asynchronous signal handler (with + // the current thread in transition between goroutines, or with the g0 + // system stack already in use). + + var ret int32 + + var g *g + if mainStarted { + g = getg() + } + sp := uintptr(unsafe.Pointer(&sig)) + switch { + case g == nil: + // No g: we're on a C stack or a signal stack. + ret = callCgoSigaction(uintptr(sig), new, old) + case sp < g.stack.lo || sp >= g.stack.hi: + // We're no longer on g's stack, so we must be handling a signal. It's + // possible that we interrupted the thread during a transition between g + // and g0, so we should stay on the current stack to avoid corrupting g0. + ret = callCgoSigaction(uintptr(sig), new, old) + default: + // We're running on g's stack, so either we're not in a signal handler or + // the signal handler has set the correct g. If we're on gsignal or g0, + // systemstack will make the call directly; otherwise, it will switch to + // g0 to ensure we have enough room to call a libc function. + // + // The function literal that we pass to systemstack is not nosplit, but + // that's ok: we'll be running on a fresh, clean system stack so the stack + // check will always succeed anyway. + systemstack(func() { + ret = callCgoSigaction(uintptr(sig), new, old) + }) + } + + const EINVAL = 22 + if ret == EINVAL { + // libc reserves certain signals — normally 32-33 — for pthreads, and + // returns EINVAL for sigaction calls on those signals. If we get EINVAL, + // fall back to making the syscall directly. + sysSigaction(sig, new, old) + } + } + + if msanenabled && old != nil { + msanread(unsafe.Pointer(old), unsafe.Sizeof(*old)) + } + if asanenabled && old != nil { + asanread(unsafe.Pointer(old), unsafe.Sizeof(*old)) + } +} + +// callCgoSigaction calls the sigaction function in the runtime/cgo package +// using the GCC calling convention. It is implemented in assembly. 
+// +//go:noescape +func callCgoSigaction(sig uintptr, new, old *sigactiont) int32 diff --git a/src/runtime/cgocall.go b/src/runtime/cgocall.go new file mode 100644 index 0000000..9c75280 --- /dev/null +++ b/src/runtime/cgocall.go @@ -0,0 +1,641 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Cgo call and callback support. +// +// To call into the C function f from Go, the cgo-generated code calls +// runtime.cgocall(_cgo_Cfunc_f, frame), where _cgo_Cfunc_f is a +// gcc-compiled function written by cgo. +// +// runtime.cgocall (below) calls entersyscall so as not to block +// other goroutines or the garbage collector, and then calls +// runtime.asmcgocall(_cgo_Cfunc_f, frame). +// +// runtime.asmcgocall (in asm_$GOARCH.s) switches to the m->g0 stack +// (assumed to be an operating system-allocated stack, so safe to run +// gcc-compiled code on) and calls _cgo_Cfunc_f(frame). +// +// _cgo_Cfunc_f invokes the actual C function f with arguments +// taken from the frame structure, records the results in the frame, +// and returns to runtime.asmcgocall. +// +// After it regains control, runtime.asmcgocall switches back to the +// original g (m->curg)'s stack and returns to runtime.cgocall. +// +// After it regains control, runtime.cgocall calls exitsyscall, which blocks +// until this m can run Go code without violating the $GOMAXPROCS limit, +// and then unlocks g from m. +// +// The above description skipped over the possibility of the gcc-compiled +// function f calling back into Go. If that happens, we continue down +// the rabbit hole during the execution of f. +// +// To make it possible for gcc-compiled C code to call a Go function p.GoF, +// cgo writes a gcc-compiled function named GoF (not p.GoF, since gcc doesn't +// know about packages). The gcc-compiled C function f calls GoF. 
+// +// GoF initializes "frame", a structure containing all of its +// arguments and slots for p.GoF's results. It calls +// crosscall2(_cgoexp_GoF, frame, framesize, ctxt) using the gcc ABI. +// +// crosscall2 (in cgo/asm_$GOARCH.s) is a four-argument adapter from +// the gcc function call ABI to the gc function call ABI. At this +// point we're in the Go runtime, but we're still running on m.g0's +// stack and outside the $GOMAXPROCS limit. crosscall2 calls +// runtime.cgocallback(_cgoexp_GoF, frame, ctxt) using the gc ABI. +// (crosscall2's framesize argument is no longer used, but there's one +// case where SWIG calls crosscall2 directly and expects to pass this +// argument. See _cgo_panic.) +// +// runtime.cgocallback (in asm_$GOARCH.s) switches from m.g0's stack +// to the original g (m.curg)'s stack, on which it calls +// runtime.cgocallbackg(_cgoexp_GoF, frame, ctxt). As part of the +// stack switch, runtime.cgocallback saves the current SP as +// m.g0.sched.sp, so that any use of m.g0's stack during the execution +// of the callback will be done below the existing stack frames. +// Before overwriting m.g0.sched.sp, it pushes the old value on the +// m.g0 stack, so that it can be restored later. +// +// runtime.cgocallbackg (below) is now running on a real goroutine +// stack (not an m.g0 stack). First it calls runtime.exitsyscall, which will +// block until the $GOMAXPROCS limit allows running this goroutine. +// Once exitsyscall has returned, it is safe to do things like call the memory +// allocator or invoke the Go callback function. runtime.cgocallbackg +// first defers a function to unwind m.g0.sched.sp, so that if p.GoF +// panics, m.g0.sched.sp will be restored to its old value: the m.g0 stack +// and the m.curg stack will be unwound in lock step. +// Then it calls _cgoexp_GoF(frame). +// +// _cgoexp_GoF, which was generated by cmd/cgo, unpacks the arguments +// from frame, calls p.GoF, writes the results back to frame, and +// returns. 
Now we start unwinding this whole process. +// +// runtime.cgocallbackg pops but does not execute the deferred +// function to unwind m.g0.sched.sp, calls runtime.entersyscall, and +// returns to runtime.cgocallback. +// +// After it regains control, runtime.cgocallback switches back to +// m.g0's stack (the pointer is still in m.g0.sched.sp), restores the old +// m.g0.sched.sp value from the stack, and returns to crosscall2. +// +// crosscall2 restores the callee-save registers for gcc and returns +// to GoF, which unpacks any result values and returns to f. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +// Addresses collected in a cgo backtrace when crashing. +// Length must match arg.Max in x_cgo_callers in runtime/cgo/gcc_traceback.c. +type cgoCallers [32]uintptr + +// argset matches runtime/cgo/linux_syscall.c:argset_t +type argset struct { + args unsafe.Pointer + retval uintptr +} + +// wrapper for syscall package to call cgocall for libc (cgo) calls. +// +//go:linkname syscall_cgocaller syscall.cgocaller +//go:nosplit +//go:uintptrescapes +func syscall_cgocaller(fn unsafe.Pointer, args ...uintptr) uintptr { + as := argset{args: unsafe.Pointer(&args[0])} + cgocall(fn, unsafe.Pointer(&as)) + return as.retval +} + +var ncgocall uint64 // number of cgo calls in total for dead m + +// Call from Go to C. +// +// This must be nosplit because it's used for syscalls on some +// platforms. Syscalls may have untyped arguments on the stack, so +// it's not safe to grow or scan the stack. +// +//go:nosplit +func cgocall(fn, arg unsafe.Pointer) int32 { + if !iscgo && GOOS != "solaris" && GOOS != "illumos" && GOOS != "windows" { + throw("cgocall unavailable") + } + + if fn == nil { + throw("cgocall nil") + } + + if raceenabled { + racereleasemerge(unsafe.Pointer(&racecgosync)) + } + + mp := getg().m + mp.ncgocall++ + mp.ncgo++ + + // Reset traceback. 
+ mp.cgoCallers[0] = 0 + + // Announce we are entering a system call + // so that the scheduler knows to create another + // M to run goroutines while we are in the + // foreign code. + // + // The call to asmcgocall is guaranteed not to + // grow the stack and does not allocate memory, + // so it is safe to call while "in a system call", outside + // the $GOMAXPROCS accounting. + // + // fn may call back into Go code, in which case we'll exit the + // "system call", run the Go code (which may grow the stack), + // and then re-enter the "system call" reusing the PC and SP + // saved by entersyscall here. + entersyscall() + + // Tell asynchronous preemption that we're entering external + // code. We do this after entersyscall because this may block + // and cause an async preemption to fail, but at this point a + // sync preemption will succeed (though this is not a matter + // of correctness). + osPreemptExtEnter(mp) + + mp.incgo = true + errno := asmcgocall(fn, arg) + + // Update accounting before exitsyscall because exitsyscall may + // reschedule us on to a different M. + mp.incgo = false + mp.ncgo-- + + osPreemptExtExit(mp) + + exitsyscall() + + // Note that raceacquire must be called only after exitsyscall has + // wired this M to a P. + if raceenabled { + raceacquire(unsafe.Pointer(&racecgosync)) + } + + // From the garbage collector's perspective, time can move + // backwards in the sequence above. If there's a callback into + // Go code, GC will see this function at the call to + // asmcgocall. When the Go call later returns to C, the + // syscall PC/SP is rolled back and the GC sees this function + // back at the call to entersyscall. Normally, fn and arg + // would be live at entersyscall and dead at asmcgocall, so if + // time moved backwards, GC would see these arguments as dead + // and then live. Prevent these undead arguments from crashing + // GC by forcing them to stay live across this time warp. 
+ KeepAlive(fn) + KeepAlive(arg) + KeepAlive(mp) + + return errno +} + +// Call from C back to Go. fn must point to an ABIInternal Go entry-point. +// +//go:nosplit +func cgocallbackg(fn, frame unsafe.Pointer, ctxt uintptr) { + gp := getg() + if gp != gp.m.curg { + println("runtime: bad g in cgocallback") + exit(2) + } + + // The call from C is on gp.m's g0 stack, so we must ensure + // that we stay on that M. We have to do this before calling + // exitsyscall, since it would otherwise be free to move us to + // a different M. The matching unlockOSThread call is deferred in cgocallbackg1. + lockOSThread() + + checkm := gp.m + + // Save current syscall parameters, so m.syscall can be + // used again if the callback decides to make a syscall. + syscall := gp.m.syscall + + // entersyscall saves the caller's SP to allow the GC to trace the Go + // stack. However, since we're returning to an earlier stack frame and + // need to pair with the entersyscall() call made by cgocall, we must + // save syscall* and let reentersyscall restore them. + savedsp := unsafe.Pointer(gp.syscallsp) + savedpc := gp.syscallpc + exitsyscall() // coming out of cgo call + gp.m.incgo = false + + osPreemptExtExit(gp.m) + + cgocallbackg1(fn, frame, ctxt) // will call unlockOSThread + + // At this point unlockOSThread has been called. + // The following code must not change to a different m. + // This is enforced by checking incgo in the schedule function. + + gp.m.incgo = true + + if gp.m != checkm { + throw("m changed unexpectedly in cgocallbackg") + } + + osPreemptExtEnter(gp.m) + + // going back to cgo call + reentersyscall(savedpc, uintptr(savedsp)) + + gp.m.syscall = syscall +} + +func cgocallbackg1(fn, frame unsafe.Pointer, ctxt uintptr) { + gp := getg() + + // When we return, undo the call to lockOSThread in cgocallbackg. + // We must still stay on the same m.
+ defer unlockOSThread() + + if gp.m.needextram || extraMWaiters.Load() > 0 { + gp.m.needextram = false + systemstack(newextram) + } + + if ctxt != 0 { + s := append(gp.cgoCtxt, ctxt) + + // Now we need to set gp.cgoCtxt = s, but we could get + // a SIGPROF signal while manipulating the slice, and + // the SIGPROF handler could pick up gp.cgoCtxt while + // tracing up the stack. We need to ensure that the + // handler always sees a valid slice, so set the + // values in an order such that it always does. + p := (*slice)(unsafe.Pointer(&gp.cgoCtxt)) + atomicstorep(unsafe.Pointer(&p.array), unsafe.Pointer(&s[0])) + p.cap = cap(s) + p.len = len(s) + + defer func(gp *g) { + // Decrease the length of the slice by one, safely. + p := (*slice)(unsafe.Pointer(&gp.cgoCtxt)) + p.len-- + }(gp) + } + + if gp.m.ncgo == 0 { + // The C call to Go came from a thread not currently running + // any Go. In the case of -buildmode=c-archive or c-shared, + // this call may be coming in before package initialization + // is complete. Wait until it is. + <-main_init_done + } + + // Check whether the profiler needs to be turned on or off; this route to + // run Go code does not use runtime.execute, so bypasses the check there. + hz := sched.profilehz + if gp.m.profilehz != hz { + setThreadCPUProfiler(hz) + } + + // Add entry to defer stack in case of panic. + restore := true + defer unwindm(&restore) + + if raceenabled { + raceacquire(unsafe.Pointer(&racecgosync)) + } + + // Invoke callback. This function is generated by cmd/cgo and + // will unpack the argument frame and call the Go function. + var cb func(frame unsafe.Pointer) + cbFV := funcval{uintptr(fn)} + *(*unsafe.Pointer)(unsafe.Pointer(&cb)) = noescape(unsafe.Pointer(&cbFV)) + cb(frame) + + if raceenabled { + racereleasemerge(unsafe.Pointer(&racecgosync)) + } + + // Do not unwind m->g0->sched.sp. + // Our caller, cgocallback, will do that. 
+ restore = false +} + +func unwindm(restore *bool) { + if *restore { + // Restore sp saved by cgocallback during + // unwind of g's stack (see comment at top of file). + mp := acquirem() + sched := &mp.g0.sched + sched.sp = *(*uintptr)(unsafe.Pointer(sched.sp + alignUp(sys.MinFrameSize, sys.StackAlign))) + + // Do the accounting that cgocall will not have a chance to do + // during an unwind. + // + // In the case where a Go call originates from C, ncgo is 0 + // and there is no matching cgocall to end. + if mp.ncgo > 0 { + mp.incgo = false + mp.ncgo-- + osPreemptExtExit(mp) + } + + releasem(mp) + } +} + +// called from assembly. +func badcgocallback() { + throw("misaligned stack in cgocallback") +} + +// called from (incomplete) assembly. +func cgounimpl() { + throw("cgo not implemented") +} + +var racecgosync uint64 // represents possible synchronization in C code + +// Pointer checking for cgo code. + +// We want to detect all cases where a program that does not use +// unsafe makes a cgo call passing a Go pointer to memory that +// contains a Go pointer. Here a Go pointer is defined as a pointer +// to memory allocated by the Go runtime. Programs that use unsafe +// can evade this restriction easily, so we don't try to catch them. +// The cgo program will rewrite all possibly bad pointer arguments to +// call cgoCheckPointer, where we can catch cases of a Go pointer +// pointing to a Go pointer. + +// Complicating matters, taking the address of a slice or array +// element permits the C program to access all elements of the slice +// or array. In that case we will see a pointer to a single element, +// but we need to check the entire data structure. + +// The cgoCheckPointer call takes additional arguments indicating that +// it was called on an address expression. An additional argument of +// true means that it only needs to check a single element. 
An +// additional argument of a slice or array means that it needs to +// check the entire slice/array, but nothing else. Otherwise, the +// pointer could be anything, and we check the entire heap object, +// which is conservative but safe. + +// When and if we implement a moving garbage collector, +// cgoCheckPointer will pin the pointer for the duration of the cgo +// call. (This is necessary but not sufficient; the cgo program will +// also have to change to pin Go pointers that cannot point to Go +// pointers.) + +// cgoCheckPointer checks if the argument contains a Go pointer that +// points to a Go pointer, and panics if it does. +func cgoCheckPointer(ptr any, arg any) { + if debug.cgocheck == 0 { + return + } + + ep := efaceOf(&ptr) + t := ep._type + + top := true + if arg != nil && (t.kind&kindMask == kindPtr || t.kind&kindMask == kindUnsafePointer) { + p := ep.data + if t.kind&kindDirectIface == 0 { + p = *(*unsafe.Pointer)(p) + } + if p == nil || !cgoIsGoPointer(p) { + return + } + aep := efaceOf(&arg) + switch aep._type.kind & kindMask { + case kindBool: + if t.kind&kindMask == kindUnsafePointer { + // We don't know the type of the element. + break + } + pt := (*ptrtype)(unsafe.Pointer(t)) + cgoCheckArg(pt.elem, p, true, false, cgoCheckPointerFail) + return + case kindSlice: + // Check the slice rather than the pointer. + ep = aep + t = ep._type + case kindArray: + // Check the array rather than the pointer. + // Pass top as false since we have a pointer + // to the array. + ep = aep + t = ep._type + top = false + default: + throw("can't happen") + } + } + + cgoCheckArg(t, ep.data, t.kind&kindDirectIface == 0, top, cgoCheckPointerFail) +} + +const cgoCheckPointerFail = "cgo argument has Go pointer to Go pointer" +const cgoResultFail = "cgo result has Go pointer" + +// cgoCheckArg is the real work of cgoCheckPointer. The argument p +// is either a pointer to the value (of type t), or the value itself, +// depending on indir. 
The top parameter is whether we are at the top +// level, where Go pointers are allowed. +func cgoCheckArg(t *_type, p unsafe.Pointer, indir, top bool, msg string) { + if t.ptrdata == 0 || p == nil { + // If the type has no pointers there is nothing to do. + return + } + + switch t.kind & kindMask { + default: + throw("can't happen") + case kindArray: + at := (*arraytype)(unsafe.Pointer(t)) + if !indir { + if at.len != 1 { + throw("can't happen") + } + cgoCheckArg(at.elem, p, at.elem.kind&kindDirectIface == 0, top, msg) + return + } + for i := uintptr(0); i < at.len; i++ { + cgoCheckArg(at.elem, p, true, top, msg) + p = add(p, at.elem.size) + } + case kindChan, kindMap: + // These types contain internal pointers that will + // always be allocated in the Go heap. It's never OK + // to pass them to C. + panic(errorString(msg)) + case kindFunc: + if indir { + p = *(*unsafe.Pointer)(p) + } + if !cgoIsGoPointer(p) { + return + } + panic(errorString(msg)) + case kindInterface: + it := *(**_type)(p) + if it == nil { + return + } + // A type known at compile time is OK since it's + // constant. A type not known at compile time will be + // in the heap and will not be OK. 
+ if inheap(uintptr(unsafe.Pointer(it))) { + panic(errorString(msg)) + } + p = *(*unsafe.Pointer)(add(p, goarch.PtrSize)) + if !cgoIsGoPointer(p) { + return + } + if !top { + panic(errorString(msg)) + } + cgoCheckArg(it, p, it.kind&kindDirectIface == 0, false, msg) + case kindSlice: + st := (*slicetype)(unsafe.Pointer(t)) + s := (*slice)(p) + p = s.array + if p == nil || !cgoIsGoPointer(p) { + return + } + if !top { + panic(errorString(msg)) + } + if st.elem.ptrdata == 0 { + return + } + for i := 0; i < s.cap; i++ { + cgoCheckArg(st.elem, p, true, false, msg) + p = add(p, st.elem.size) + } + case kindString: + ss := (*stringStruct)(p) + if !cgoIsGoPointer(ss.str) { + return + } + if !top { + panic(errorString(msg)) + } + case kindStruct: + st := (*structtype)(unsafe.Pointer(t)) + if !indir { + if len(st.fields) != 1 { + throw("can't happen") + } + cgoCheckArg(st.fields[0].typ, p, st.fields[0].typ.kind&kindDirectIface == 0, top, msg) + return + } + for _, f := range st.fields { + if f.typ.ptrdata == 0 { + continue + } + cgoCheckArg(f.typ, add(p, f.offset), true, top, msg) + } + case kindPtr, kindUnsafePointer: + if indir { + p = *(*unsafe.Pointer)(p) + if p == nil { + return + } + } + + if !cgoIsGoPointer(p) { + return + } + if !top { + panic(errorString(msg)) + } + + cgoCheckUnknownPointer(p, msg) + } +} + +// cgoCheckUnknownPointer is called for an arbitrary pointer into Go +// memory. It checks whether that Go memory contains any other +// pointer into Go memory. If it does, we panic. +// The return values are unused but useful to see in panic tracebacks. 
+func cgoCheckUnknownPointer(p unsafe.Pointer, msg string) (base, i uintptr) { + if inheap(uintptr(p)) { + b, span, _ := findObject(uintptr(p), 0, 0) + base = b + if base == 0 { + return + } + n := span.elemsize + hbits := heapBitsForAddr(base, n) + for { + var addr uintptr + if hbits, addr = hbits.next(); addr == 0 { + break + } + if cgoIsGoPointer(*(*unsafe.Pointer)(unsafe.Pointer(addr))) { + panic(errorString(msg)) + } + } + + return + } + + for _, datap := range activeModules() { + if cgoInRange(p, datap.data, datap.edata) || cgoInRange(p, datap.bss, datap.ebss) { + // We have no way to know the size of the object. + // We have to assume that it might contain a pointer. + panic(errorString(msg)) + } + // In the text or noptr sections, we know that the + // pointer does not point to a Go pointer. + } + + return +} + +// cgoIsGoPointer reports whether the pointer is a Go pointer--a +// pointer to Go memory. We only care about Go memory that might +// contain pointers. +// +//go:nosplit +//go:nowritebarrierrec +func cgoIsGoPointer(p unsafe.Pointer) bool { + if p == nil { + return false + } + + if inHeapOrStack(uintptr(p)) { + return true + } + + for _, datap := range activeModules() { + if cgoInRange(p, datap.data, datap.edata) || cgoInRange(p, datap.bss, datap.ebss) { + return true + } + } + + return false +} + +// cgoInRange reports whether p is between start and end. +// +//go:nosplit +//go:nowritebarrierrec +func cgoInRange(p unsafe.Pointer, start, end uintptr) bool { + return start <= uintptr(p) && uintptr(p) < end +} + +// cgoCheckResult is called to check the result parameter of an +// exported Go function. It panics if the result is or contains a Go +// pointer. 
+func cgoCheckResult(val any) { + if debug.cgocheck == 0 { + return + } + + ep := efaceOf(&val) + t := ep._type + cgoCheckArg(t, ep.data, t.kind&kindDirectIface == 0, false, cgoResultFail) +} diff --git a/src/runtime/cgocallback.go b/src/runtime/cgocallback.go new file mode 100644 index 0000000..59953f1 --- /dev/null +++ b/src/runtime/cgocallback.go @@ -0,0 +1,13 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// These functions are called from C code via cgo/callbacks.go. + +// Panic. + +func _cgo_panic_internal(p *byte) { + panic(gostringnocopy(p)) +} diff --git a/src/runtime/cgocheck.go b/src/runtime/cgocheck.go new file mode 100644 index 0000000..84e7516 --- /dev/null +++ b/src/runtime/cgocheck.go @@ -0,0 +1,268 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Code to check that pointer writes follow the cgo rules. +// These functions are invoked via the write barrier when debug.cgocheck > 1. + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +const cgoWriteBarrierFail = "Go pointer stored into non-Go memory" + +// cgoCheckWriteBarrier is called whenever a pointer is stored into memory. +// It throws if the program is storing a Go pointer into non-Go memory. +// +// This is called from the write barrier, so its entire call tree must +// be nosplit. +// +//go:nosplit +//go:nowritebarrier +func cgoCheckWriteBarrier(dst *uintptr, src uintptr) { + if !cgoIsGoPointer(unsafe.Pointer(src)) { + return + } + if cgoIsGoPointer(unsafe.Pointer(dst)) { + return + } + + // If we are running on the system stack then dst might be an + // address on the stack, which is OK. 
+ gp := getg() + if gp == gp.m.g0 || gp == gp.m.gsignal { + return + } + + // Allocating memory can write to various mfixalloc structs + // that look like they are non-Go memory. + if gp.m.mallocing != 0 { + return + } + + // It's OK if writing to memory allocated by persistentalloc. + // Do this check last because it is more expensive and rarely true. + // If it is false the expense doesn't matter since we are crashing. + if inPersistentAlloc(uintptr(unsafe.Pointer(dst))) { + return + } + + systemstack(func() { + println("write of Go pointer", hex(src), "to non-Go memory", hex(uintptr(unsafe.Pointer(dst)))) + throw(cgoWriteBarrierFail) + }) +} + +// cgoCheckMemmove is called when moving a block of memory. +// dst and src point off bytes into the value to copy. +// size is the number of bytes to copy. +// It throws if the program is copying a block that contains a Go pointer +// into non-Go memory. +// +//go:nosplit +//go:nowritebarrier +func cgoCheckMemmove(typ *_type, dst, src unsafe.Pointer, off, size uintptr) { + if typ.ptrdata == 0 { + return + } + if !cgoIsGoPointer(src) { + return + } + if cgoIsGoPointer(dst) { + return + } + cgoCheckTypedBlock(typ, src, off, size) +} + +// cgoCheckSliceCopy is called when copying n elements of a slice. +// src and dst are pointers to the first element of the slice. +// typ is the element type of the slice. +// It throws if the program is copying slice elements that contain Go pointers +// into non-Go memory. +// +//go:nosplit +//go:nowritebarrier +func cgoCheckSliceCopy(typ *_type, dst, src unsafe.Pointer, n int) { + if typ.ptrdata == 0 { + return + } + if !cgoIsGoPointer(src) { + return + } + if cgoIsGoPointer(dst) { + return + } + p := src + for i := 0; i < n; i++ { + cgoCheckTypedBlock(typ, p, 0, typ.size) + p = add(p, typ.size) + } +} + +// cgoCheckTypedBlock checks the block of memory at src, for up to size bytes, +// and throws if it finds a Go pointer. 
The type of the memory is typ, +// and src is off bytes into that type. +// +//go:nosplit +//go:nowritebarrier +func cgoCheckTypedBlock(typ *_type, src unsafe.Pointer, off, size uintptr) { + // Anything past typ.ptrdata is not a pointer. + if typ.ptrdata <= off { + return + } + if ptrdataSize := typ.ptrdata - off; size > ptrdataSize { + size = ptrdataSize + } + + if typ.kind&kindGCProg == 0 { + cgoCheckBits(src, typ.gcdata, off, size) + return + } + + // The type has a GC program. Try to find GC bits somewhere else. + for _, datap := range activeModules() { + if cgoInRange(src, datap.data, datap.edata) { + doff := uintptr(src) - datap.data + cgoCheckBits(add(src, -doff), datap.gcdatamask.bytedata, off+doff, size) + return + } + if cgoInRange(src, datap.bss, datap.ebss) { + boff := uintptr(src) - datap.bss + cgoCheckBits(add(src, -boff), datap.gcbssmask.bytedata, off+boff, size) + return + } + } + + s := spanOfUnchecked(uintptr(src)) + if s.state.get() == mSpanManual { + // There are no heap bits for value stored on the stack. + // For a channel receive src might be on the stack of some + // other goroutine, so we can't unwind the stack even if + // we wanted to. + // We can't expand the GC program without extra storage + // space we can't easily get. + // Fortunately we have the type information. + systemstack(func() { + cgoCheckUsingType(typ, src, off, size) + }) + return + } + + // src must be in the regular heap. + + hbits := heapBitsForAddr(uintptr(src), size) + for { + var addr uintptr + if hbits, addr = hbits.next(); addr == 0 { + break + } + v := *(*unsafe.Pointer)(unsafe.Pointer(addr)) + if cgoIsGoPointer(v) { + throw(cgoWriteBarrierFail) + } + } +} + +// cgoCheckBits checks the block of memory at src, for up to size +// bytes, and throws if it finds a Go pointer. The gcbits mark each +// pointer value. The src pointer is off bytes into the gcbits. 
+// +//go:nosplit +//go:nowritebarrier +func cgoCheckBits(src unsafe.Pointer, gcbits *byte, off, size uintptr) { + skipMask := off / goarch.PtrSize / 8 + skipBytes := skipMask * goarch.PtrSize * 8 + ptrmask := addb(gcbits, skipMask) + src = add(src, skipBytes) + off -= skipBytes + size += off + var bits uint32 + for i := uintptr(0); i < size; i += goarch.PtrSize { + if i&(goarch.PtrSize*8-1) == 0 { + bits = uint32(*ptrmask) + ptrmask = addb(ptrmask, 1) + } else { + bits >>= 1 + } + if off > 0 { + off -= goarch.PtrSize + } else { + if bits&1 != 0 { + v := *(*unsafe.Pointer)(add(src, i)) + if cgoIsGoPointer(v) { + throw(cgoWriteBarrierFail) + } + } + } + } +} + +// cgoCheckUsingType is like cgoCheckTypedBlock, but is a last ditch +// fall back to look for pointers in src using the type information. +// We only use this when looking at a value on the stack when the type +// uses a GC program, because otherwise it's more efficient to use the +// GC bits. This is called on the system stack. +// +//go:nowritebarrier +//go:systemstack +func cgoCheckUsingType(typ *_type, src unsafe.Pointer, off, size uintptr) { + if typ.ptrdata == 0 { + return + } + + // Anything past typ.ptrdata is not a pointer. 
+ if typ.ptrdata <= off { + return + } + if ptrdataSize := typ.ptrdata - off; size > ptrdataSize { + size = ptrdataSize + } + + if typ.kind&kindGCProg == 0 { + cgoCheckBits(src, typ.gcdata, off, size) + return + } + switch typ.kind & kindMask { + default: + throw("can't happen") + case kindArray: + at := (*arraytype)(unsafe.Pointer(typ)) + for i := uintptr(0); i < at.len; i++ { + if off < at.elem.size { + cgoCheckUsingType(at.elem, src, off, size) + } + src = add(src, at.elem.size) + skipped := off + if skipped > at.elem.size { + skipped = at.elem.size + } + checked := at.elem.size - skipped + off -= skipped + if size <= checked { + return + } + size -= checked + } + case kindStruct: + st := (*structtype)(unsafe.Pointer(typ)) + for _, f := range st.fields { + if off < f.typ.size { + cgoCheckUsingType(f.typ, src, off, size) + } + src = add(src, f.typ.size) + skipped := off + if skipped > f.typ.size { + skipped = f.typ.size + } + checked := f.typ.size - skipped + off -= skipped + if size <= checked { + return + } + size -= checked + } + } +} diff --git a/src/runtime/chan.go b/src/runtime/chan.go new file mode 100644 index 0000000..6a0ad35 --- /dev/null +++ b/src/runtime/chan.go @@ -0,0 +1,851 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// This file contains the implementation of Go channels. + +// Invariants: +// At least one of c.sendq and c.recvq is empty, +// except for the case of an unbuffered channel with a single goroutine +// blocked on it for both sending and receiving using a select statement, +// in which case the length of c.sendq and c.recvq is limited only by the +// size of the select statement. +// +// For buffered channels, also: +// c.qcount > 0 implies that c.recvq is empty. +// c.qcount < c.dataqsiz implies that c.sendq is empty. 
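Editorial aside, not part of the patch: the buffered-channel invariants above have a visible counterpart in ordinary Go code. The sketch below assumes only the language-level behavior of `len`, `cap`, and a `select` with a `default` case (the usual way to attempt a send without blocking); `trySend` is a hypothetical helper introduced for illustration and does not exist in the runtime. While the buffer has free space a send is accepted immediately, so no sender needs to queue; once the buffer is full and no receiver is waiting, a non-blocking send is refused.

```go
package main

import "fmt"

// trySend is a hypothetical helper (not a runtime function): it attempts a
// non-blocking send and reports whether the value was accepted.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		// The send could not proceed right now.
		return false
	}
}

func main() {
	ch := make(chan int, 2) // buffered channel with capacity 2

	// The buffer has free space, so both sends are accepted without blocking.
	fmt.Println(trySend(ch, 1), len(ch), cap(ch)) // true 1 2
	fmt.Println(trySend(ch, 2), len(ch), cap(ch)) // true 2 2

	// The buffer is full and no receiver is waiting, so the send is refused.
	fmt.Println(trySend(ch, 3), len(ch), cap(ch)) // false 2 2

	// Receiving one element frees a slot, so the next send is accepted again.
	<-ch
	fmt.Println(trySend(ch, 3), len(ch), cap(ch)) // true 2 2
}
```

Everything runs on a single goroutine, so the printed values are deterministic. The example observes only what the language exposes; it cannot inspect the wait queues themselves.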
+ +import ( + "internal/abi" + "runtime/internal/atomic" + "runtime/internal/math" + "unsafe" +) + +const ( + maxAlign = 8 + hchanSize = unsafe.Sizeof(hchan{}) + uintptr(-int(unsafe.Sizeof(hchan{}))&(maxAlign-1)) + debugChan = false +) + +type hchan struct { + qcount uint // total data in the queue + dataqsiz uint // size of the circular queue + buf unsafe.Pointer // points to an array of dataqsiz elements + elemsize uint16 + closed uint32 + elemtype *_type // element type + sendx uint // send index + recvx uint // receive index + recvq waitq // list of recv waiters + sendq waitq // list of send waiters + + // lock protects all fields in hchan, as well as several + // fields in sudogs blocked on this channel. + // + // Do not change another G's status while holding this lock + // (in particular, do not ready a G), as this can deadlock + // with stack shrinking. + lock mutex +} + +type waitq struct { + first *sudog + last *sudog +} + +//go:linkname reflect_makechan reflect.makechan +func reflect_makechan(t *chantype, size int) *hchan { + return makechan(t, size) +} + +func makechan64(t *chantype, size int64) *hchan { + if int64(int(size)) != size { + panic(plainError("makechan: size out of range")) + } + + return makechan(t, int(size)) +} + +func makechan(t *chantype, size int) *hchan { + elem := t.elem + + // compiler checks this but be safe. + if elem.size >= 1<<16 { + throw("makechan: invalid channel element type") + } + if hchanSize%maxAlign != 0 || elem.align > maxAlign { + throw("makechan: bad alignment") + } + + mem, overflow := math.MulUintptr(elem.size, uintptr(size)) + if overflow || mem > maxAlloc-hchanSize || size < 0 { + panic(plainError("makechan: size out of range")) + } + + // Hchan does not contain pointers interesting for GC when elements stored in buf do not contain pointers. + // buf points into the same allocation, elemtype is persistent. + // SudoG's are referenced from their owning thread so they can't be collected. 
+ // TODO(dvyukov,rlh): Rethink when collector can move allocated objects. + var c *hchan + switch { + case mem == 0: + // Queue or element size is zero. + c = (*hchan)(mallocgc(hchanSize, nil, true)) + // Race detector uses this location for synchronization. + c.buf = c.raceaddr() + case elem.ptrdata == 0: + // Elements do not contain pointers. + // Allocate hchan and buf in one call. + c = (*hchan)(mallocgc(hchanSize+mem, nil, true)) + c.buf = add(unsafe.Pointer(c), hchanSize) + default: + // Elements contain pointers. + c = new(hchan) + c.buf = mallocgc(mem, elem, true) + } + + c.elemsize = uint16(elem.size) + c.elemtype = elem + c.dataqsiz = uint(size) + lockInit(&c.lock, lockRankHchan) + + if debugChan { + print("makechan: chan=", c, "; elemsize=", elem.size, "; dataqsiz=", size, "\n") + } + return c +} + +// chanbuf(c, i) is a pointer to the i'th slot in the buffer. +func chanbuf(c *hchan, i uint) unsafe.Pointer { + return add(c.buf, uintptr(i)*uintptr(c.elemsize)) +} + +// full reports whether a send on c would block (that is, the channel is full). +// It uses a single word-sized read of mutable state, so although +// the answer is instantaneously true, the correct answer may have changed +// by the time the calling function receives the return value. +func full(c *hchan) bool { + // c.dataqsiz is immutable (never written after the channel is created) + // so it is safe to read at any time during channel operation. + if c.dataqsiz == 0 { + // Assumes that a pointer read is relaxed-atomic. + return c.recvq.first == nil + } + // Assumes that a uint read is relaxed-atomic. + return c.qcount == c.dataqsiz +} + +// entry point for c <- x from compiled code. +// +//go:nosplit +func chansend1(c *hchan, elem unsafe.Pointer) { + chansend(c, elem, true, getcallerpc()) +} + +/* + * generic single channel send/recv + * If block is false, + * then the protocol will not + * sleep but return if it could + * not complete.
+ * + * sleep can wake up with g.param == nil + * when a channel involved in the sleep has + * been closed. it is easiest to loop and re-run + * the operation; we'll see that it's now closed. + */ +func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool { + if c == nil { + if !block { + return false + } + gopark(nil, nil, waitReasonChanSendNilChan, traceEvGoStop, 2) + throw("unreachable") + } + + if debugChan { + print("chansend: chan=", c, "\n") + } + + if raceenabled { + racereadpc(c.raceaddr(), callerpc, abi.FuncPCABIInternal(chansend)) + } + + // Fast path: check for failed non-blocking operation without acquiring the lock. + // + // After observing that the channel is not closed, we observe that the channel is + // not ready for sending. Each of these observations is a single word-sized read + // (first c.closed and second full()). + // Because a closed channel cannot transition from 'ready for sending' to + // 'not ready for sending', even if the channel is closed between the two observations, + // they imply a moment between the two when the channel was both not yet closed + // and not ready for sending. We behave as if we observed the channel at that moment, + // and report that the send cannot proceed. + // + // It is okay if the reads are reordered here: if we observe that the channel is not + // ready for sending and then observe that it is not closed, that implies that the + // channel wasn't closed during the first observation. However, nothing here + // guarantees forward progress. We rely on the side effects of lock release in + // chanrecv() and closechan() to update this thread's view of c.closed and full(). + if !block && c.closed == 0 && full(c) { + return false + } + + var t0 int64 + if blockprofilerate > 0 { + t0 = cputicks() + } + + lock(&c.lock) + + if c.closed != 0 { + unlock(&c.lock) + panic(plainError("send on closed channel")) + } + + if sg := c.recvq.dequeue(); sg != nil { + // Found a waiting receiver. 
We pass the value we want to send + // directly to the receiver, bypassing the channel buffer (if any). + send(c, sg, ep, func() { unlock(&c.lock) }, 3) + return true + } + + if c.qcount < c.dataqsiz { + // Space is available in the channel buffer. Enqueue the element to send. + qp := chanbuf(c, c.sendx) + if raceenabled { + racenotify(c, c.sendx, nil) + } + typedmemmove(c.elemtype, qp, ep) + c.sendx++ + if c.sendx == c.dataqsiz { + c.sendx = 0 + } + c.qcount++ + unlock(&c.lock) + return true + } + + if !block { + unlock(&c.lock) + return false + } + + // Block on the channel. Some receiver will complete our operation for us. + gp := getg() + mysg := acquireSudog() + mysg.releasetime = 0 + if t0 != 0 { + mysg.releasetime = -1 + } + // No stack splits between assigning elem and enqueuing mysg + // on gp.waiting where copystack can find it. + mysg.elem = ep + mysg.waitlink = nil + mysg.g = gp + mysg.isSelect = false + mysg.c = c + gp.waiting = mysg + gp.param = nil + c.sendq.enqueue(mysg) + // Signal to anyone trying to shrink our stack that we're about + // to park on a channel. The window between when this G's status + // changes and when we set gp.activeStackChans is not safe for + // stack shrinking. + gp.parkingOnChan.Store(true) + gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanSend, traceEvGoBlockSend, 2) + // Ensure the value being sent is kept alive until the + // receiver copies it out. The sudog has a pointer to the + // stack object, but sudogs aren't considered as roots of the + // stack tracer. + KeepAlive(ep) + + // someone woke us up. 
+ if mysg != gp.waiting { + throw("G waiting list is corrupted") + } + gp.waiting = nil + gp.activeStackChans = false + closed := !mysg.success + gp.param = nil + if mysg.releasetime > 0 { + blockevent(mysg.releasetime-t0, 2) + } + mysg.c = nil + releaseSudog(mysg) + if closed { + if c.closed == 0 { + throw("chansend: spurious wakeup") + } + panic(plainError("send on closed channel")) + } + return true +} + +// send processes a send operation on an empty channel c. +// The value ep sent by the sender is copied to the receiver sg. +// The receiver is then woken up to go on its merry way. +// Channel c must be empty and locked. send unlocks c with unlockf. +// sg must already be dequeued from c. +// ep must be non-nil and point to the heap or the caller's stack. +func send(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) { + if raceenabled { + if c.dataqsiz == 0 { + racesync(c, sg) + } else { + // Pretend we go through the buffer, even though + // we copy directly. Note that we need to increment + // the head/tail locations only when raceenabled. + racenotify(c, c.recvx, nil) + racenotify(c, c.recvx, sg) + c.recvx++ + if c.recvx == c.dataqsiz { + c.recvx = 0 + } + c.sendx = c.recvx // c.sendx = (c.sendx+1) % c.dataqsiz + } + } + if sg.elem != nil { + sendDirect(c.elemtype, sg, ep) + sg.elem = nil + } + gp := sg.g + unlockf() + gp.param = unsafe.Pointer(sg) + sg.success = true + if sg.releasetime != 0 { + sg.releasetime = cputicks() + } + goready(gp, skip+1) +} + +// Sends and receives on unbuffered or empty-buffered channels are the +// only operations where one running goroutine writes to the stack of +// another running goroutine. The GC assumes that stack writes only +// happen when the goroutine is running and are only done by that +// goroutine. Using a write barrier is sufficient to make up for +// violating that assumption, but the write barrier has to work. 
+// typedmemmove will call bulkBarrierPreWrite, but the target bytes +// are not in the heap, so that will not help. We arrange to call +// memmove and typeBitsBulkBarrier instead. + +func sendDirect(t *_type, sg *sudog, src unsafe.Pointer) { + // src is on our stack, dst is a slot on another stack. + + // Once we read sg.elem out of sg, it will no longer + // be updated if the destination's stack gets copied (shrunk). + // So make sure that no preemption points can happen between read & use. + dst := sg.elem + typeBitsBulkBarrier(t, uintptr(dst), uintptr(src), t.size) + // No need for cgo write barrier checks because dst is always + // Go memory. + memmove(dst, src, t.size) +} + +func recvDirect(t *_type, sg *sudog, dst unsafe.Pointer) { + // dst is on our stack or the heap, src is on another stack. + // The channel is locked, so src will not move during this + // operation. + src := sg.elem + typeBitsBulkBarrier(t, uintptr(dst), uintptr(src), t.size) + memmove(dst, src, t.size) +} + +func closechan(c *hchan) { + if c == nil { + panic(plainError("close of nil channel")) + } + + lock(&c.lock) + if c.closed != 0 { + unlock(&c.lock) + panic(plainError("close of closed channel")) + } + + if raceenabled { + callerpc := getcallerpc() + racewritepc(c.raceaddr(), callerpc, abi.FuncPCABIInternal(closechan)) + racerelease(c.raceaddr()) + } + + c.closed = 1 + + var glist gList + + // release all readers + for { + sg := c.recvq.dequeue() + if sg == nil { + break + } + if sg.elem != nil { + typedmemclr(c.elemtype, sg.elem) + sg.elem = nil + } + if sg.releasetime != 0 { + sg.releasetime = cputicks() + } + gp := sg.g + gp.param = unsafe.Pointer(sg) + sg.success = false + if raceenabled { + raceacquireg(gp, c.raceaddr()) + } + glist.push(gp) + } + + // release all writers (they will panic) + for { + sg := c.sendq.dequeue() + if sg == nil { + break + } + sg.elem = nil + if sg.releasetime != 0 { + sg.releasetime = cputicks() + } + gp := sg.g + gp.param = unsafe.Pointer(sg) + 
sg.success = false + if raceenabled { + raceacquireg(gp, c.raceaddr()) + } + glist.push(gp) + } + unlock(&c.lock) + + // Ready all Gs now that we've dropped the channel lock. + for !glist.empty() { + gp := glist.pop() + gp.schedlink = 0 + goready(gp, 3) + } +} + +// empty reports whether a read from c would block (that is, the channel is +// empty). It uses a single atomic read of mutable state. +func empty(c *hchan) bool { + // c.dataqsiz is immutable. + if c.dataqsiz == 0 { + return atomic.Loadp(unsafe.Pointer(&c.sendq.first)) == nil + } + return atomic.Loaduint(&c.qcount) == 0 +} + +// entry points for <- c from compiled code. +// +//go:nosplit +func chanrecv1(c *hchan, elem unsafe.Pointer) { + chanrecv(c, elem, true) +} + +//go:nosplit +func chanrecv2(c *hchan, elem unsafe.Pointer) (received bool) { + _, received = chanrecv(c, elem, true) + return +} + +// chanrecv receives on channel c and writes the received data to ep. +// ep may be nil, in which case received data is ignored. +// If block == false and no elements are available, returns (false, false). +// Otherwise, if c is closed, zeros *ep and returns (true, false). +// Otherwise, fills in *ep with an element and returns (true, true). +// A non-nil ep must point to the heap or the caller's stack. +func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) { + // raceenabled: don't need to check ep, as it is always on the stack + // or is new memory allocated by reflect. + + if debugChan { + print("chanrecv: chan=", c, "\n") + } + + if c == nil { + if !block { + return + } + gopark(nil, nil, waitReasonChanReceiveNilChan, traceEvGoStop, 2) + throw("unreachable") + } + + // Fast path: check for failed non-blocking operation without acquiring the lock. + if !block && empty(c) { + // After observing that the channel is not ready for receiving, we observe whether the + // channel is closed. + // + // Reordering of these checks could lead to incorrect behavior when racing with a close. 
+ // For example, if the channel was open and not empty, was closed, and then drained, + // reordered reads could incorrectly indicate "open and empty". To prevent reordering, + // we use atomic loads for both checks, and rely on emptying and closing to happen in + // separate critical sections under the same lock. This assumption fails when closing + // an unbuffered channel with a blocked send, but that is an error condition anyway. + if atomic.Load(&c.closed) == 0 { + // Because a channel cannot be reopened, the later observation of the channel + // being not closed implies that it was also not closed at the moment of the + // first observation. We behave as if we observed the channel at that moment + // and report that the receive cannot proceed. + return + } + // The channel is irreversibly closed. Re-check whether the channel has any pending data + // to receive, which could have arrived between the empty and closed checks above. + // Sequential consistency is also required here, when racing with such a send. + if empty(c) { + // The channel is irreversibly closed and empty. + if raceenabled { + raceacquire(c.raceaddr()) + } + if ep != nil { + typedmemclr(c.elemtype, ep) + } + return true, false + } + } + + var t0 int64 + if blockprofilerate > 0 { + t0 = cputicks() + } + + lock(&c.lock) + + if c.closed != 0 { + if c.qcount == 0 { + if raceenabled { + raceacquire(c.raceaddr()) + } + unlock(&c.lock) + if ep != nil { + typedmemclr(c.elemtype, ep) + } + return true, false + } + // The channel has been closed, but the channel's buffer still has data. + } else { + // The channel is not closed; check for a waiting sender. + if sg := c.sendq.dequeue(); sg != nil { + // Found a waiting sender. If buffer is size 0, receive value + // directly from sender. Otherwise, receive from head of queue + // and add sender's value to the tail of the queue (both map to + // the same buffer slot because the queue is full).
+ recv(c, sg, ep, func() { unlock(&c.lock) }, 3) + return true, true + } + } + + if c.qcount > 0 { + // Receive directly from queue + qp := chanbuf(c, c.recvx) + if raceenabled { + racenotify(c, c.recvx, nil) + } + if ep != nil { + typedmemmove(c.elemtype, ep, qp) + } + typedmemclr(c.elemtype, qp) + c.recvx++ + if c.recvx == c.dataqsiz { + c.recvx = 0 + } + c.qcount-- + unlock(&c.lock) + return true, true + } + + if !block { + unlock(&c.lock) + return false, false + } + + // no sender available: block on this channel. + gp := getg() + mysg := acquireSudog() + mysg.releasetime = 0 + if t0 != 0 { + mysg.releasetime = -1 + } + // No stack splits between assigning elem and enqueuing mysg + // on gp.waiting where copystack can find it. + mysg.elem = ep + mysg.waitlink = nil + gp.waiting = mysg + mysg.g = gp + mysg.isSelect = false + mysg.c = c + gp.param = nil + c.recvq.enqueue(mysg) + // Signal to anyone trying to shrink our stack that we're about + // to park on a channel. The window between when this G's status + // changes and when we set gp.activeStackChans is not safe for + // stack shrinking. + gp.parkingOnChan.Store(true) + gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanReceive, traceEvGoBlockRecv, 2) + + // someone woke us up + if mysg != gp.waiting { + throw("G waiting list is corrupted") + } + gp.waiting = nil + gp.activeStackChans = false + if mysg.releasetime > 0 { + blockevent(mysg.releasetime-t0, 2) + } + success := mysg.success + gp.param = nil + mysg.c = nil + releaseSudog(mysg) + return true, success +} + +// recv processes a receive operation on a full channel c. +// There are 2 parts: +// 1. The value sent by the sender sg is put into the channel +// and the sender is woken up to go on its merry way. +// 2. The value received by the receiver (the current G) is +// written to ep. +// +// For synchronous channels, both values are the same. 
+// For asynchronous channels, the receiver gets its data from +// the channel buffer and the sender's data is put in the +// channel buffer. +// Channel c must be full and locked. recv unlocks c with unlockf. +// sg must already be dequeued from c. +// A non-nil ep must point to the heap or the caller's stack. +func recv(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) { + if c.dataqsiz == 0 { + if raceenabled { + racesync(c, sg) + } + if ep != nil { + // copy data from sender + recvDirect(c.elemtype, sg, ep) + } + } else { + // Queue is full. Take the item at the + // head of the queue. Make the sender enqueue + // its item at the tail of the queue. Since the + // queue is full, those are both the same slot. + qp := chanbuf(c, c.recvx) + if raceenabled { + racenotify(c, c.recvx, nil) + racenotify(c, c.recvx, sg) + } + // copy data from queue to receiver + if ep != nil { + typedmemmove(c.elemtype, ep, qp) + } + // copy data from sender to queue + typedmemmove(c.elemtype, qp, sg.elem) + c.recvx++ + if c.recvx == c.dataqsiz { + c.recvx = 0 + } + c.sendx = c.recvx // c.sendx = (c.sendx+1) % c.dataqsiz + } + sg.elem = nil + gp := sg.g + unlockf() + gp.param = unsafe.Pointer(sg) + sg.success = true + if sg.releasetime != 0 { + sg.releasetime = cputicks() + } + goready(gp, skip+1) +} + +func chanparkcommit(gp *g, chanLock unsafe.Pointer) bool { + // There are unlocked sudogs that point into gp's stack. Stack + // copying must lock the channels of those sudogs. + // Set activeStackChans here instead of before we try parking + // because we could self-deadlock in stack growth on the + // channel lock. + gp.activeStackChans = true + // Mark that it's safe for stack shrinking to occur now, + // because any thread acquiring this G's stack for shrinking + // is guaranteed to observe activeStackChans after this store. + gp.parkingOnChan.Store(false) + // Make sure we unlock after setting activeStackChans and + // unsetting parkingOnChan. 
The moment we unlock chanLock + // we risk gp getting readied by a channel operation and + // so gp could continue running before everything before + // the unlock is visible (even to gp itself). + unlock((*mutex)(chanLock)) + return true +} + +// compiler implements +// +// select { +// case c <- v: +// ... foo +// default: +// ... bar +// } +// +// as +// +// if selectnbsend(c, v) { +// ... foo +// } else { +// ... bar +// } +func selectnbsend(c *hchan, elem unsafe.Pointer) (selected bool) { + return chansend(c, elem, false, getcallerpc()) +} + +// compiler implements +// +// select { +// case v, ok = <-c: +// ... foo +// default: +// ... bar +// } +// +// as +// +// if selected, ok = selectnbrecv(&v, c); selected { +// ... foo +// } else { +// ... bar +// } +func selectnbrecv(elem unsafe.Pointer, c *hchan) (selected, received bool) { + return chanrecv(c, elem, false) +} + +//go:linkname reflect_chansend reflect.chansend +func reflect_chansend(c *hchan, elem unsafe.Pointer, nb bool) (selected bool) { + return chansend(c, elem, !nb, getcallerpc()) +} + +//go:linkname reflect_chanrecv reflect.chanrecv +func reflect_chanrecv(c *hchan, nb bool, elem unsafe.Pointer) (selected bool, received bool) { + return chanrecv(c, elem, !nb) +} + +//go:linkname reflect_chanlen reflect.chanlen +func reflect_chanlen(c *hchan) int { + if c == nil { + return 0 + } + return int(c.qcount) +} + +//go:linkname reflectlite_chanlen internal/reflectlite.chanlen +func reflectlite_chanlen(c *hchan) int { + if c == nil { + return 0 + } + return int(c.qcount) +} + +//go:linkname reflect_chancap reflect.chancap +func reflect_chancap(c *hchan) int { + if c == nil { + return 0 + } + return int(c.dataqsiz) +} + +//go:linkname reflect_chanclose reflect.chanclose +func reflect_chanclose(c *hchan) { + closechan(c) +} + +func (q *waitq) enqueue(sgp *sudog) { + sgp.next = nil + x := q.last + if x == nil { + sgp.prev = nil + q.first = sgp + q.last = sgp + return + } + sgp.prev = x + x.next = sgp + q.last 
= sgp +} + +func (q *waitq) dequeue() *sudog { + for { + sgp := q.first + if sgp == nil { + return nil + } + y := sgp.next + if y == nil { + q.first = nil + q.last = nil + } else { + y.prev = nil + q.first = y + sgp.next = nil // mark as removed (see dequeueSudoG) + } + + // if a goroutine was put on this queue because of a + // select, there is a small window between the goroutine + // being woken up by a different case and it grabbing the + // channel locks. Once it has the lock + // it removes itself from the queue, so we won't see it after that. + // We use a flag in the G struct to tell us when someone + // else has won the race to signal this goroutine but the goroutine + // hasn't removed itself from the queue yet. + if sgp.isSelect && !sgp.g.selectDone.CompareAndSwap(0, 1) { + continue + } + + return sgp + } +} + +func (c *hchan) raceaddr() unsafe.Pointer { + // Treat read-like and write-like operations on the channel to + // happen at this address. Avoid using the address of qcount + // or dataqsiz, because the len() and cap() builtins read + // those addresses, and we don't want them racing with + // operations like close(). + return unsafe.Pointer(&c.buf) +} + +func racesync(c *hchan, sg *sudog) { + racerelease(chanbuf(c, 0)) + raceacquireg(sg.g, chanbuf(c, 0)) + racereleaseg(sg.g, chanbuf(c, 0)) + raceacquire(chanbuf(c, 0)) +} + +// Notify the race detector of a send or receive involving buffer entry idx +// and a channel c or its communicating partner sg. +// This function handles the special case of c.elemsize==0. +func racenotify(c *hchan, idx uint, sg *sudog) { + // We could have passed the unsafe.Pointer corresponding to entry idx + // instead of idx itself. However, in a future version of this function, + // we can use idx to better handle the case of elemsize==0. 
+ // A future improvement to the detector is to call TSan with c and idx: + // this way, Go will continue to not allocate buffer entries for channels + // of elemsize==0, yet the race detector can be made to handle multiple + // sync objects underneath the hood (one sync object per idx) + qp := chanbuf(c, idx) + // When elemsize==0, we don't allocate a full buffer for the channel. + // Instead of individual buffer entries, the race detector uses the + // c.buf as the only buffer entry. This simplification prevents us from + // following the memory model's happens-before rules (rules that are + // implemented in racereleaseacquire). Instead, we accumulate happens-before + // information in the synchronization object associated with c.buf. + if c.elemsize == 0 { + if sg == nil { + raceacquire(qp) + racerelease(qp) + } else { + raceacquireg(sg.g, qp) + racereleaseg(sg.g, qp) + } + } else { + if sg == nil { + racereleaseacquire(qp) + } else { + racereleaseacquireg(sg.g, qp) + } + } +} diff --git a/src/runtime/chan_test.go b/src/runtime/chan_test.go new file mode 100644 index 0000000..256f976 --- /dev/null +++ b/src/runtime/chan_test.go @@ -0,0 +1,1221 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "internal/testenv" + "math" + "runtime" + "sync" + "sync/atomic" + "testing" + "time" +) + +func TestChan(t *testing.T) { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(4)) + N := 200 + if testing.Short() { + N = 20 + } + for chanCap := 0; chanCap < N; chanCap++ { + { + // Ensure that receive from empty chan blocks.
+ c := make(chan int, chanCap) + recv1 := false + go func() { + _ = <-c + recv1 = true + }() + recv2 := false + go func() { + _, _ = <-c + recv2 = true + }() + time.Sleep(time.Millisecond) + if recv1 || recv2 { + t.Fatalf("chan[%d]: receive from empty chan", chanCap) + } + // Ensure that non-blocking receive does not block. + select { + case _ = <-c: + t.Fatalf("chan[%d]: receive from empty chan", chanCap) + default: + } + select { + case _, _ = <-c: + t.Fatalf("chan[%d]: receive from empty chan", chanCap) + default: + } + c <- 0 + c <- 0 + } + + { + // Ensure that send to full chan blocks. + c := make(chan int, chanCap) + for i := 0; i < chanCap; i++ { + c <- i + } + sent := uint32(0) + go func() { + c <- 0 + atomic.StoreUint32(&sent, 1) + }() + time.Sleep(time.Millisecond) + if atomic.LoadUint32(&sent) != 0 { + t.Fatalf("chan[%d]: send to full chan", chanCap) + } + // Ensure that non-blocking send does not block. + select { + case c <- 0: + t.Fatalf("chan[%d]: send to full chan", chanCap) + default: + } + <-c + } + + { + // Ensure that we receive 0 from closed chan. + c := make(chan int, chanCap) + for i := 0; i < chanCap; i++ { + c <- i + } + close(c) + for i := 0; i < chanCap; i++ { + v := <-c + if v != i { + t.Fatalf("chan[%d]: received %v, expected %v", chanCap, v, i) + } + } + if v := <-c; v != 0 { + t.Fatalf("chan[%d]: received %v, expected %v", chanCap, v, 0) + } + if v, ok := <-c; v != 0 || ok { + t.Fatalf("chan[%d]: received %v/%v, expected %v/%v", chanCap, v, ok, 0, false) + } + } + + { + // Ensure that close unblocks receive. + c := make(chan int, chanCap) + done := make(chan bool) + go func() { + v, ok := <-c + done <- v == 0 && ok == false + }() + time.Sleep(time.Millisecond) + close(c) + if !<-done { + t.Fatalf("chan[%d]: received non zero from closed chan", chanCap) + } + } + + { + // Send 100 integers, + // ensure that we receive them non-corrupted in FIFO order. 
+ c := make(chan int, chanCap) + go func() { + for i := 0; i < 100; i++ { + c <- i + } + }() + for i := 0; i < 100; i++ { + v := <-c + if v != i { + t.Fatalf("chan[%d]: received %v, expected %v", chanCap, v, i) + } + } + + // Same, but using recv2. + go func() { + for i := 0; i < 100; i++ { + c <- i + } + }() + for i := 0; i < 100; i++ { + v, ok := <-c + if !ok { + t.Fatalf("chan[%d]: receive failed, expected %v", chanCap, i) + } + if v != i { + t.Fatalf("chan[%d]: received %v, expected %v", chanCap, v, i) + } + } + + // Send 1000 integers in 4 goroutines, + // ensure that we receive what we send. + const P = 4 + const L = 1000 + for p := 0; p < P; p++ { + go func() { + for i := 0; i < L; i++ { + c <- i + } + }() + } + done := make(chan map[int]int) + for p := 0; p < P; p++ { + go func() { + recv := make(map[int]int) + for i := 0; i < L; i++ { + v := <-c + recv[v] = recv[v] + 1 + } + done <- recv + }() + } + recv := make(map[int]int) + for p := 0; p < P; p++ { + for k, v := range <-done { + recv[k] = recv[k] + v + } + } + if len(recv) != L { + t.Fatalf("chan[%d]: received %v values, expected %v", chanCap, len(recv), L) + } + for _, v := range recv { + if v != P { + t.Fatalf("chan[%d]: received %v values, expected %v", chanCap, v, P) + } + } + } + + { + // Test len/cap. 
+ c := make(chan int, chanCap) + if len(c) != 0 || cap(c) != chanCap { + t.Fatalf("chan[%d]: bad len/cap, expect %v/%v, got %v/%v", chanCap, 0, chanCap, len(c), cap(c)) + } + for i := 0; i < chanCap; i++ { + c <- i + } + if len(c) != chanCap || cap(c) != chanCap { + t.Fatalf("chan[%d]: bad len/cap, expect %v/%v, got %v/%v", chanCap, chanCap, chanCap, len(c), cap(c)) + } + } + + } +} + +func TestNonblockRecvRace(t *testing.T) { + n := 10000 + if testing.Short() { + n = 100 + } + for i := 0; i < n; i++ { + c := make(chan int, 1) + c <- 1 + go func() { + select { + case <-c: + default: + t.Error("chan is not ready") + } + }() + close(c) + <-c + if t.Failed() { + return + } + } +} + +// This test checks that select acts on the state of the channels at one +// moment in the execution, not over a smeared time window. +// In the test, one goroutine does: +// +// create c1, c2 +// make c1 ready for receiving +// create second goroutine +// make c2 ready for receiving +// make c1 no longer ready for receiving (if possible) +// +// The second goroutine does a non-blocking select receiving from c1 and c2. +// From the time the second goroutine is created, at least one of c1 and c2 +// is always ready for receiving, so the select in the second goroutine must +// always receive from one or the other. It must never execute the default case. +func TestNonblockSelectRace(t *testing.T) { + n := 100000 + if testing.Short() { + n = 1000 + } + done := make(chan bool, 1) + for i := 0; i < n; i++ { + c1 := make(chan int, 1) + c2 := make(chan int, 1) + c1 <- 1 + go func() { + select { + case <-c1: + case <-c2: + default: + done <- false + return + } + done <- true + }() + c2 <- 1 + select { + case <-c1: + default: + } + if !<-done { + t.Fatal("no chan is ready") + } + } +} + +// Same as TestNonblockSelectRace, but close(c2) replaces c2 <- 1. 
+func TestNonblockSelectRace2(t *testing.T) { + n := 100000 + if testing.Short() { + n = 1000 + } + done := make(chan bool, 1) + for i := 0; i < n; i++ { + c1 := make(chan int, 1) + c2 := make(chan int) + c1 <- 1 + go func() { + select { + case <-c1: + case <-c2: + default: + done <- false + return + } + done <- true + }() + close(c2) + select { + case <-c1: + default: + } + if !<-done { + t.Fatal("no chan is ready") + } + } +} + +func TestSelfSelect(t *testing.T) { + // Ensure that send/recv on the same chan in select + // does not crash nor deadlock. + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(2)) + for _, chanCap := range []int{0, 10} { + var wg sync.WaitGroup + wg.Add(2) + c := make(chan int, chanCap) + for p := 0; p < 2; p++ { + p := p + go func() { + defer wg.Done() + for i := 0; i < 1000; i++ { + if p == 0 || i%2 == 0 { + select { + case c <- p: + case v := <-c: + if chanCap == 0 && v == p { + t.Errorf("self receive") + return + } + } + } else { + select { + case v := <-c: + if chanCap == 0 && v == p { + t.Errorf("self receive") + return + } + case c <- p: + } + } + } + }() + } + wg.Wait() + } +} + +func TestSelectStress(t *testing.T) { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(10)) + var c [4]chan int + c[0] = make(chan int) + c[1] = make(chan int) + c[2] = make(chan int, 2) + c[3] = make(chan int, 3) + N := int(1e5) + if testing.Short() { + N /= 10 + } + // There are 4 goroutines that send N values on each of the chans, + // + 4 goroutines that receive N values on each of the chans, + // + 1 goroutine that sends N values on each of the chans in a single select, + // + 1 goroutine that receives N values on each of the chans in a single select. + // All these sends, receives and selects interact chaotically at runtime, + // but we are careful that this whole construct does not deadlock. 
+ var wg sync.WaitGroup + wg.Add(10) + for k := 0; k < 4; k++ { + k := k + go func() { + for i := 0; i < N; i++ { + c[k] <- 0 + } + wg.Done() + }() + go func() { + for i := 0; i < N; i++ { + <-c[k] + } + wg.Done() + }() + } + go func() { + var n [4]int + c1 := c + for i := 0; i < 4*N; i++ { + select { + case c1[3] <- 0: + n[3]++ + if n[3] == N { + c1[3] = nil + } + case c1[2] <- 0: + n[2]++ + if n[2] == N { + c1[2] = nil + } + case c1[0] <- 0: + n[0]++ + if n[0] == N { + c1[0] = nil + } + case c1[1] <- 0: + n[1]++ + if n[1] == N { + c1[1] = nil + } + } + } + wg.Done() + }() + go func() { + var n [4]int + c1 := c + for i := 0; i < 4*N; i++ { + select { + case <-c1[0]: + n[0]++ + if n[0] == N { + c1[0] = nil + } + case <-c1[1]: + n[1]++ + if n[1] == N { + c1[1] = nil + } + case <-c1[2]: + n[2]++ + if n[2] == N { + c1[2] = nil + } + case <-c1[3]: + n[3]++ + if n[3] == N { + c1[3] = nil + } + } + } + wg.Done() + }() + wg.Wait() +} + +func TestSelectFairness(t *testing.T) { + const trials = 10000 + if runtime.GOOS == "linux" && runtime.GOARCH == "ppc64le" { + testenv.SkipFlaky(t, 22047) + } + c1 := make(chan byte, trials+1) + c2 := make(chan byte, trials+1) + for i := 0; i < trials+1; i++ { + c1 <- 1 + c2 <- 2 + } + c3 := make(chan byte) + c4 := make(chan byte) + out := make(chan byte) + done := make(chan byte) + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + for { + var b byte + select { + case b = <-c3: + case b = <-c4: + case b = <-c1: + case b = <-c2: + } + select { + case out <- b: + case <-done: + return + } + } + }() + cnt1, cnt2 := 0, 0 + for i := 0; i < trials; i++ { + switch b := <-out; b { + case 1: + cnt1++ + case 2: + cnt2++ + default: + t.Fatalf("unexpected value %d on channel", b) + } + } + // If the select in the goroutine is fair, + // cnt1 and cnt2 should be about the same value. + // With 10,000 trials, the expected margin of error at + // a confidence level of six nines is 4.891676 / (2 * Sqrt(10000)). 
+ r := float64(cnt1) / trials + e := math.Abs(r - 0.5) + t.Log(cnt1, cnt2, r, e) + if e > 4.891676/(2*math.Sqrt(trials)) { + t.Errorf("unfair select: in %d trials, results were %d, %d", trials, cnt1, cnt2) + } + close(done) + wg.Wait() +} + +func TestChanSendInterface(t *testing.T) { + type mt struct{} + m := &mt{} + c := make(chan any, 1) + c <- m + select { + case c <- m: + default: + } + select { + case c <- m: + case c <- &mt{}: + default: + } +} + +func TestPseudoRandomSend(t *testing.T) { + n := 100 + for _, chanCap := range []int{0, n} { + c := make(chan int, chanCap) + l := make([]int, n) + var m sync.Mutex + m.Lock() + go func() { + for i := 0; i < n; i++ { + runtime.Gosched() + l[i] = <-c + } + m.Unlock() + }() + for i := 0; i < n; i++ { + select { + case c <- 1: + case c <- 0: + } + } + m.Lock() // wait + n0 := 0 + n1 := 0 + for _, i := range l { + n0 += (i + 1) % 2 + n1 += i + } + if n0 <= n/10 || n1 <= n/10 { + t.Errorf("Want pseudorandom, got %d zeros and %d ones (chan cap %d)", n0, n1, chanCap) + } + } +} + +func TestMultiConsumer(t *testing.T) { + const nwork = 23 + const niter = 271828 + + pn := []int{2, 3, 7, 11, 13, 17, 19, 23, 27, 31} + + q := make(chan int, nwork*3) + r := make(chan int, nwork*3) + + // workers + var wg sync.WaitGroup + for i := 0; i < nwork; i++ { + wg.Add(1) + go func(w int) { + for v := range q { + // mess with the fifo-ish nature of range + if pn[w%len(pn)] == v { + runtime.Gosched() + } + r <- v + } + wg.Done() + }(i) + } + + // feeder & closer + expect := 0 + go func() { + for i := 0; i < niter; i++ { + v := pn[i%len(pn)] + expect += v + q <- v + } + close(q) // no more work + wg.Wait() // workers done + close(r) // ... 
so there can be no more results + }() + + // consume & check + n := 0 + s := 0 + for v := range r { + n++ + s += v + } + if n != niter || s != expect { + t.Errorf("Expected sum %d (got %d) from %d iter (saw %d)", + expect, s, niter, n) + } +} + +func TestShrinkStackDuringBlockedSend(t *testing.T) { + // make sure that channel operations still work when we are + // blocked on a channel send and we shrink the stack. + // NOTE: this test probably won't fail unless stack1.go:stackDebug + // is set to >= 1. + const n = 10 + c := make(chan int) + done := make(chan struct{}) + + go func() { + for i := 0; i < n; i++ { + c <- i + // use lots of stack, briefly. + stackGrowthRecursive(20) + } + done <- struct{}{} + }() + + for i := 0; i < n; i++ { + x := <-c + if x != i { + t.Errorf("bad channel read: want %d, got %d", i, x) + } + // Waste some time so sender can finish using lots of stack + // and block in channel send. + time.Sleep(1 * time.Millisecond) + // trigger GC which will shrink the stack of the sender. + runtime.GC() + } + <-done +} + +func TestNoShrinkStackWhileParking(t *testing.T) { + if runtime.GOOS == "netbsd" && runtime.GOARCH == "arm64" { + testenv.SkipFlaky(t, 49382) + } + if runtime.GOOS == "openbsd" { + testenv.SkipFlaky(t, 51482) + } + + // The goal of this test is to trigger a "racy sudog adjustment" + // throw. Basically, there's a window between when a goroutine + // becomes available for preemption for stack scanning (and thus, + // stack shrinking) but before the goroutine has fully parked on a + // channel. See issue 40641 for more details on the problem. + // + // The way we try to induce this failure is to set up two + // goroutines: a sender and a receiver that communicate across + // a channel. We try to set up a situation where the sender + // grows its stack temporarily then *fully* blocks on a channel + // often. 
Meanwhile a GC is triggered so that we try to get a + // mark worker to shrink the sender's stack and race with the + // sender parking. + // + // Unfortunately the race window here is so small that we + // either need a ridiculous number of iterations, or we add + // "usleep(1000)" to park_m, just before the unlockf call. + const n = 10 + send := func(c chan<- int, done chan struct{}) { + for i := 0; i < n; i++ { + c <- i + // Use lots of stack briefly so that + // the GC is going to want to shrink us + // when it scans us. Make sure not to + // do any function calls otherwise + // in order to avoid us shrinking ourselves + // when we're preempted. + stackGrowthRecursive(20) + } + done <- struct{}{} + } + recv := func(c <-chan int, done chan struct{}) { + for i := 0; i < n; i++ { + // Sleep here so that the sender always + // fully blocks. + time.Sleep(10 * time.Microsecond) + <-c + } + done <- struct{}{} + } + for i := 0; i < n*20; i++ { + c := make(chan int) + done := make(chan struct{}) + go recv(c, done) + go send(c, done) + // Wait a little bit before triggering + // the GC to make sure the sender and + // receiver have gotten into their groove. + time.Sleep(50 * time.Microsecond) + runtime.GC() + <-done + <-done + } +} + +func TestSelectDuplicateChannel(t *testing.T) { + // This test makes sure we can queue a G on + // the same channel multiple times. + c := make(chan int) + d := make(chan int) + e := make(chan int) + + // goroutine A + go func() { + select { + case <-c: + case <-c: + case <-d: + } + e <- 9 + }() + time.Sleep(time.Millisecond) // make sure goroutine A gets queued first on c + + // goroutine B + go func() { + <-c + }() + time.Sleep(time.Millisecond) // make sure goroutine B gets queued on c before continuing + + d <- 7 // wake up A, it dequeues itself from c. This operation used to corrupt c.recvq. + <-e // A tells us it's done + c <- 8 // wake up B. 
This operation used to fail because c.recvq was corrupted (it tries to wake up an already running G instead of B) +} + +func TestSelectStackAdjust(t *testing.T) { + // Test that channel receive slots that contain local stack + // pointers are adjusted correctly by stack shrinking. + c := make(chan *int) + d := make(chan *int) + ready1 := make(chan bool) + ready2 := make(chan bool) + + f := func(ready chan bool, dup bool) { + // Temporarily grow the stack to 10K. + stackGrowthRecursive((10 << 10) / (128 * 8)) + + // We're ready to trigger GC and stack shrink. + ready <- true + + val := 42 + var cx *int + cx = &val + + var c2 chan *int + var d2 chan *int + if dup { + c2 = c + d2 = d + } + + // Receive from d. cx won't be affected. + select { + case cx = <-c: + case <-c2: + case <-d: + case <-d2: + } + + // Check that pointer in cx was adjusted correctly. + if cx != &val { + t.Error("cx no longer points to val") + } else if val != 42 { + t.Error("val changed") + } else { + *cx = 43 + if val != 43 { + t.Error("changing *cx failed to change val") + } + } + ready <- true + } + + go f(ready1, false) + go f(ready2, true) + + // Let the goroutines get into the select. + <-ready1 + <-ready2 + time.Sleep(10 * time.Millisecond) + + // Force concurrent GC to shrink the stacks. + runtime.GC() + + // Wake selects. 
+ close(d) + <-ready1 + <-ready2 +} + +type struct0 struct{} + +func BenchmarkMakeChan(b *testing.B) { + b.Run("Byte", func(b *testing.B) { + var x chan byte + for i := 0; i < b.N; i++ { + x = make(chan byte, 8) + } + close(x) + }) + b.Run("Int", func(b *testing.B) { + var x chan int + for i := 0; i < b.N; i++ { + x = make(chan int, 8) + } + close(x) + }) + b.Run("Ptr", func(b *testing.B) { + var x chan *byte + for i := 0; i < b.N; i++ { + x = make(chan *byte, 8) + } + close(x) + }) + b.Run("Struct", func(b *testing.B) { + b.Run("0", func(b *testing.B) { + var x chan struct0 + for i := 0; i < b.N; i++ { + x = make(chan struct0, 8) + } + close(x) + }) + b.Run("32", func(b *testing.B) { + var x chan struct32 + for i := 0; i < b.N; i++ { + x = make(chan struct32, 8) + } + close(x) + }) + b.Run("40", func(b *testing.B) { + var x chan struct40 + for i := 0; i < b.N; i++ { + x = make(chan struct40, 8) + } + close(x) + }) + }) +} + +func BenchmarkChanNonblocking(b *testing.B) { + myc := make(chan int) + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + select { + case <-myc: + default: + } + } + }) +} + +func BenchmarkSelectUncontended(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + myc1 := make(chan int, 1) + myc2 := make(chan int, 1) + myc1 <- 0 + for pb.Next() { + select { + case <-myc1: + myc2 <- 0 + case <-myc2: + myc1 <- 0 + } + } + }) +} + +func BenchmarkSelectSyncContended(b *testing.B) { + myc1 := make(chan int) + myc2 := make(chan int) + myc3 := make(chan int) + done := make(chan int) + b.RunParallel(func(pb *testing.PB) { + go func() { + for { + select { + case myc1 <- 0: + case myc2 <- 0: + case myc3 <- 0: + case <-done: + return + } + } + }() + for pb.Next() { + select { + case <-myc1: + case <-myc2: + case <-myc3: + } + } + }) + close(done) +} + +func BenchmarkSelectAsyncContended(b *testing.B) { + procs := runtime.GOMAXPROCS(0) + myc1 := make(chan int, procs) + myc2 := make(chan int, procs) + b.RunParallel(func(pb *testing.PB) { + myc1 <- 0 
+ for pb.Next() { + select { + case <-myc1: + myc2 <- 0 + case <-myc2: + myc1 <- 0 + } + } + }) +} + +func BenchmarkSelectNonblock(b *testing.B) { + myc1 := make(chan int) + myc2 := make(chan int) + myc3 := make(chan int, 1) + myc4 := make(chan int, 1) + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + select { + case <-myc1: + default: + } + select { + case myc2 <- 0: + default: + } + select { + case <-myc3: + default: + } + select { + case myc4 <- 0: + default: + } + } + }) +} + +func BenchmarkChanUncontended(b *testing.B) { + const C = 100 + b.RunParallel(func(pb *testing.PB) { + myc := make(chan int, C) + for pb.Next() { + for i := 0; i < C; i++ { + myc <- 0 + } + for i := 0; i < C; i++ { + <-myc + } + } + }) +} + +func BenchmarkChanContended(b *testing.B) { + const C = 100 + myc := make(chan int, C*runtime.GOMAXPROCS(0)) + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + for i := 0; i < C; i++ { + myc <- 0 + } + for i := 0; i < C; i++ { + <-myc + } + } + }) +} + +func benchmarkChanSync(b *testing.B, work int) { + const CallsPerSched = 1000 + procs := 2 + N := int32(b.N / CallsPerSched / procs * procs) + c := make(chan bool, procs) + myc := make(chan int) + for p := 0; p < procs; p++ { + go func() { + for { + i := atomic.AddInt32(&N, -1) + if i < 0 { + break + } + for g := 0; g < CallsPerSched; g++ { + if i%2 == 0 { + <-myc + localWork(work) + myc <- 0 + localWork(work) + } else { + myc <- 0 + localWork(work) + <-myc + localWork(work) + } + } + } + c <- true + }() + } + for p := 0; p < procs; p++ { + <-c + } +} + +func BenchmarkChanSync(b *testing.B) { + benchmarkChanSync(b, 0) +} + +func BenchmarkChanSyncWork(b *testing.B) { + benchmarkChanSync(b, 1000) +} + +func benchmarkChanProdCons(b *testing.B, chanSize, localWork int) { + const CallsPerSched = 1000 + procs := runtime.GOMAXPROCS(-1) + N := int32(b.N / CallsPerSched) + c := make(chan bool, 2*procs) + myc := make(chan int, chanSize) + for p := 0; p < procs; p++ { + go func() { + foo := 0 + 
for atomic.AddInt32(&N, -1) >= 0 { + for g := 0; g < CallsPerSched; g++ { + for i := 0; i < localWork; i++ { + foo *= 2 + foo /= 2 + } + myc <- 1 + } + } + myc <- 0 + c <- foo == 42 + }() + go func() { + foo := 0 + for { + v := <-myc + if v == 0 { + break + } + for i := 0; i < localWork; i++ { + foo *= 2 + foo /= 2 + } + } + c <- foo == 42 + }() + } + for p := 0; p < procs; p++ { + <-c + <-c + } +} + +func BenchmarkChanProdCons0(b *testing.B) { + benchmarkChanProdCons(b, 0, 0) +} + +func BenchmarkChanProdCons10(b *testing.B) { + benchmarkChanProdCons(b, 10, 0) +} + +func BenchmarkChanProdCons100(b *testing.B) { + benchmarkChanProdCons(b, 100, 0) +} + +func BenchmarkChanProdConsWork0(b *testing.B) { + benchmarkChanProdCons(b, 0, 100) +} + +func BenchmarkChanProdConsWork10(b *testing.B) { + benchmarkChanProdCons(b, 10, 100) +} + +func BenchmarkChanProdConsWork100(b *testing.B) { + benchmarkChanProdCons(b, 100, 100) +} + +func BenchmarkSelectProdCons(b *testing.B) { + const CallsPerSched = 1000 + procs := runtime.GOMAXPROCS(-1) + N := int32(b.N / CallsPerSched) + c := make(chan bool, 2*procs) + myc := make(chan int, 128) + myclose := make(chan bool) + for p := 0; p < procs; p++ { + go func() { + // Producer: sends to myc. + foo := 0 + // Intended to not fire during benchmarking. + mytimer := time.After(time.Hour) + for atomic.AddInt32(&N, -1) >= 0 { + for g := 0; g < CallsPerSched; g++ { + // Model some local work. + for i := 0; i < 100; i++ { + foo *= 2 + foo /= 2 + } + select { + case myc <- 1: + case <-mytimer: + case <-myclose: + } + } + } + myc <- 0 + c <- foo == 42 + }() + go func() { + // Consumer: receives from myc. + foo := 0 + // Intended to not fire during benchmarking. + mytimer := time.After(time.Hour) + loop: + for { + select { + case v := <-myc: + if v == 0 { + break loop + } + case <-mytimer: + case <-myclose: + } + // Model some local work. 
+ for i := 0; i < 100; i++ { + foo *= 2 + foo /= 2 + } + } + c <- foo == 42 + }() + } + for p := 0; p < procs; p++ { + <-c + <-c + } +} + +func BenchmarkReceiveDataFromClosedChan(b *testing.B) { + count := b.N + ch := make(chan struct{}, count) + for i := 0; i < count; i++ { + ch <- struct{}{} + } + close(ch) + + b.ResetTimer() + for range ch { + } +} + +func BenchmarkChanCreation(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + myc := make(chan int, 1) + myc <- 0 + <-myc + } + }) +} + +func BenchmarkChanSem(b *testing.B) { + type Empty struct{} + myc := make(chan Empty, runtime.GOMAXPROCS(0)) + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + myc <- Empty{} + <-myc + } + }) +} + +func BenchmarkChanPopular(b *testing.B) { + const n = 1000 + c := make(chan bool) + var a []chan bool + var wg sync.WaitGroup + wg.Add(n) + for j := 0; j < n; j++ { + d := make(chan bool) + a = append(a, d) + go func() { + for i := 0; i < b.N; i++ { + select { + case <-c: + case <-d: + } + } + wg.Done() + }() + } + for i := 0; i < b.N; i++ { + for _, d := range a { + d <- true + } + } + wg.Wait() +} + +func BenchmarkChanClosed(b *testing.B) { + c := make(chan struct{}) + close(c) + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + select { + case <-c: + default: + b.Error("Unreachable") + } + } + }) +} + +var ( + alwaysFalse = false + workSink = 0 +) + +func localWork(w int) { + foo := 0 + for i := 0; i < w; i++ { + foo /= (foo + 1) + } + if alwaysFalse { + workSink += foo + } +} diff --git a/src/runtime/chanbarrier_test.go b/src/runtime/chanbarrier_test.go new file mode 100644 index 0000000..d479574 --- /dev/null +++ b/src/runtime/chanbarrier_test.go @@ -0,0 +1,83 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "runtime" + "sync" + "testing" +) + +type response struct { +} + +type myError struct { +} + +func (myError) Error() string { return "" } + +func doRequest(useSelect bool) (*response, error) { + type async struct { + resp *response + err error + } + ch := make(chan *async, 0) + done := make(chan struct{}, 0) + + if useSelect { + go func() { + select { + case ch <- &async{resp: nil, err: myError{}}: + case <-done: + } + }() + } else { + go func() { + ch <- &async{resp: nil, err: myError{}} + }() + } + + r := <-ch + runtime.Gosched() + return r.resp, r.err +} + +func TestChanSendSelectBarrier(t *testing.T) { + testChanSendBarrier(true) +} + +func TestChanSendBarrier(t *testing.T) { + testChanSendBarrier(false) +} + +func testChanSendBarrier(useSelect bool) { + var wg sync.WaitGroup + var globalMu sync.Mutex + outer := 100 + inner := 100000 + if testing.Short() || runtime.GOARCH == "wasm" { + outer = 10 + inner = 1000 + } + for i := 0; i < outer; i++ { + wg.Add(1) + go func() { + defer wg.Done() + var garbage []byte + for j := 0; j < inner; j++ { + _, err := doRequest(useSelect) + _, ok := err.(myError) + if !ok { + panic(1) + } + garbage = make([]byte, 1<<10) + } + globalMu.Lock() + global = garbage + globalMu.Unlock() + }() + } + wg.Wait() +} diff --git a/src/runtime/checkptr.go b/src/runtime/checkptr.go new file mode 100644 index 0000000..2d4afd5 --- /dev/null +++ b/src/runtime/checkptr.go @@ -0,0 +1,109 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +func checkptrAlignment(p unsafe.Pointer, elem *_type, n uintptr) { + // nil pointer is always suitably aligned (#47430). + if p == nil { + return + } + + // Check that (*[n]elem)(p) is appropriately aligned. + // Note that we allow unaligned pointers if the types they point to contain + // no pointers themselves. 
See issue 37298. + // TODO(mdempsky): What about fieldAlign? + if elem.ptrdata != 0 && uintptr(p)&(uintptr(elem.align)-1) != 0 { + throw("checkptr: misaligned pointer conversion") + } + + // Check that (*[n]elem)(p) doesn't straddle multiple heap objects. + // TODO(mdempsky): Fix #46938 so we don't need to worry about overflow here. + if checkptrStraddles(p, n*elem.size) { + throw("checkptr: converted pointer straddles multiple allocations") + } +} + +// checkptrStraddles reports whether the first size-bytes of memory +// addressed by ptr is known to straddle more than one Go allocation. +func checkptrStraddles(ptr unsafe.Pointer, size uintptr) bool { + if size <= 1 { + return false + } + + // Check that add(ptr, size-1) won't overflow. This avoids the risk + // of producing an illegal pointer value (assuming ptr is legal). + if uintptr(ptr) >= -(size - 1) { + return true + } + end := add(ptr, size-1) + + // TODO(mdempsky): Detect when [ptr, end] contains Go allocations, + // but neither ptr nor end point into one themselves. + + return checkptrBase(ptr) != checkptrBase(end) +} + +func checkptrArithmetic(p unsafe.Pointer, originals []unsafe.Pointer) { + if 0 < uintptr(p) && uintptr(p) < minLegalPointer { + throw("checkptr: pointer arithmetic computed bad pointer value") + } + + // Check that if the computed pointer p points into a heap + // object, then one of the original pointers must have pointed + // into the same object. + base := checkptrBase(p) + if base == 0 { + return + } + + for _, original := range originals { + if base == checkptrBase(original) { + return + } + } + + throw("checkptr: pointer arithmetic result points to invalid allocation") +} + +// checkptrBase returns the base address for the allocation containing +// the address p. +// +// Importantly, if p1 and p2 point into the same variable, then +// checkptrBase(p1) == checkptrBase(p2). 
However, the converse/inverse +// is not necessarily true as allocations can have trailing padding, +// and multiple variables may be packed into a single allocation. +func checkptrBase(p unsafe.Pointer) uintptr { + // stack + if gp := getg(); gp.stack.lo <= uintptr(p) && uintptr(p) < gp.stack.hi { + // TODO(mdempsky): Walk the stack to identify the + // specific stack frame or even stack object that p + // points into. + // + // In the mean time, use "1" as a pseudo-address to + // represent the stack. This is an invalid address on + // all platforms, so it's guaranteed to be distinct + // from any of the addresses we might return below. + return 1 + } + + // heap (must check after stack because of #35068) + if base, _, _ := findObject(uintptr(p), 0, 0); base != 0 { + return base + } + + // data or bss + for _, datap := range activeModules() { + if datap.data <= uintptr(p) && uintptr(p) < datap.edata { + return datap.data + } + if datap.bss <= uintptr(p) && uintptr(p) < datap.ebss { + return datap.bss + } + } + + return 0 +} diff --git a/src/runtime/checkptr_test.go b/src/runtime/checkptr_test.go new file mode 100644 index 0000000..811c0f0 --- /dev/null +++ b/src/runtime/checkptr_test.go @@ -0,0 +1,108 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "internal/testenv" + "os/exec" + "strings" + "testing" +) + +func TestCheckPtr(t *testing.T) { + // This test requires rebuilding packages with -d=checkptr=1, + // so it's somewhat slow. 
+ if testing.Short() { + t.Skip("skipping test in -short mode") + } + + t.Parallel() + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprog", "-gcflags=all=-d=checkptr=1") + if err != nil { + t.Fatal(err) + } + + testCases := []struct { + cmd string + want string + }{ + {"CheckPtrAlignmentPtr", "fatal error: checkptr: misaligned pointer conversion\n"}, + {"CheckPtrAlignmentNoPtr", ""}, + {"CheckPtrAlignmentNilPtr", ""}, + {"CheckPtrArithmetic", "fatal error: checkptr: pointer arithmetic result points to invalid allocation\n"}, + {"CheckPtrArithmetic2", "fatal error: checkptr: pointer arithmetic result points to invalid allocation\n"}, + {"CheckPtrSize", "fatal error: checkptr: converted pointer straddles multiple allocations\n"}, + {"CheckPtrSmall", "fatal error: checkptr: pointer arithmetic computed bad pointer value\n"}, + {"CheckPtrSliceOK", ""}, + {"CheckPtrSliceFail", "fatal error: checkptr: unsafe.Slice result straddles multiple allocations\n"}, + {"CheckPtrStringOK", ""}, + {"CheckPtrStringFail", "fatal error: checkptr: unsafe.String result straddles multiple allocations\n"}, + } + + for _, tc := range testCases { + tc := tc + t.Run(tc.cmd, func(t *testing.T) { + t.Parallel() + got, err := testenv.CleanCmdEnv(exec.Command(exe, tc.cmd)).CombinedOutput() + if err != nil { + t.Log(err) + } + if tc.want == "" { + if len(got) > 0 { + t.Errorf("output:\n%s\nwant no output", got) + } + return + } + if !strings.HasPrefix(string(got), tc.want) { + t.Errorf("output:\n%s\n\nwant output starting with: %s", got, tc.want) + } + }) + } +} + +func TestCheckPtr2(t *testing.T) { + // This test requires rebuilding packages with -d=checkptr=2, + // so it's somewhat slow. 
+ if testing.Short() { + t.Skip("skipping test in -short mode") + } + + t.Parallel() + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprog", "-gcflags=all=-d=checkptr=2") + if err != nil { + t.Fatal(err) + } + + testCases := []struct { + cmd string + want string + }{ + {"CheckPtrAlignmentNested", "fatal error: checkptr: converted pointer straddles multiple allocations\n"}, + } + + for _, tc := range testCases { + tc := tc + t.Run(tc.cmd, func(t *testing.T) { + t.Parallel() + got, err := testenv.CleanCmdEnv(exec.Command(exe, tc.cmd)).CombinedOutput() + if err != nil { + t.Log(err) + } + if tc.want == "" { + if len(got) > 0 { + t.Errorf("output:\n%s\nwant no output", got) + } + return + } + if !strings.HasPrefix(string(got), tc.want) { + t.Errorf("output:\n%s\n\nwant output starting with: %s", got, tc.want) + } + }) + } +} diff --git a/src/runtime/closure_test.go b/src/runtime/closure_test.go new file mode 100644 index 0000000..741c932 --- /dev/null +++ b/src/runtime/closure_test.go @@ -0,0 +1,54 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import "testing" + +var s int + +func BenchmarkCallClosure(b *testing.B) { + for i := 0; i < b.N; i++ { + s += func(ii int) int { return 2 * ii }(i) + } +} + +func BenchmarkCallClosure1(b *testing.B) { + for i := 0; i < b.N; i++ { + j := i + s += func(ii int) int { return 2*ii + j }(i) + } +} + +var ss *int + +func BenchmarkCallClosure2(b *testing.B) { + for i := 0; i < b.N; i++ { + j := i + s += func() int { + ss = &j + return 2 + }() + } +} + +func addr1(x int) *int { + return func() *int { return &x }() +} + +func BenchmarkCallClosure3(b *testing.B) { + for i := 0; i < b.N; i++ { + ss = addr1(i) + } +} + +func addr2() (x int, p *int) { + return 0, func() *int { return &x }() +} + +func BenchmarkCallClosure4(b *testing.B) { + for i := 0; i < b.N; i++ { + _, ss = addr2() + } +} diff --git a/src/runtime/compiler.go b/src/runtime/compiler.go new file mode 100644 index 0000000..f430a27 --- /dev/null +++ b/src/runtime/compiler.go @@ -0,0 +1,12 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// Compiler is the name of the compiler toolchain that built the +// running binary. Known toolchains are: +// +// gc Also known as cmd/compile. +// gccgo The gccgo front end, part of the GCC compiler suite. +const Compiler = "gc" diff --git a/src/runtime/complex.go b/src/runtime/complex.go new file mode 100644 index 0000000..07c596f --- /dev/null +++ b/src/runtime/complex.go @@ -0,0 +1,61 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// inf2one returns a signed 1 if f is an infinity and a signed 0 otherwise. +// The sign of the result is the sign of f. 
+func inf2one(f float64) float64 { + g := 0.0 + if isInf(f) { + g = 1.0 + } + return copysign(g, f) +} + +func complex128div(n complex128, m complex128) complex128 { + var e, f float64 // complex(e, f) = n/m + + // Algorithm for robust complex division as described in + // Robert L. Smith: Algorithm 116: Complex division. Commun. ACM 5(8): 435 (1962). + if abs(real(m)) >= abs(imag(m)) { + ratio := imag(m) / real(m) + denom := real(m) + ratio*imag(m) + e = (real(n) + imag(n)*ratio) / denom + f = (imag(n) - real(n)*ratio) / denom + } else { + ratio := real(m) / imag(m) + denom := imag(m) + ratio*real(m) + e = (real(n)*ratio + imag(n)) / denom + f = (imag(n)*ratio - real(n)) / denom + } + + if isNaN(e) && isNaN(f) { + // Correct final result to infinities and zeros if applicable. + // Matches C99: ISO/IEC 9899:1999 - G.5.1 Multiplicative operators. + + a, b := real(n), imag(n) + c, d := real(m), imag(m) + + switch { + case m == 0 && (!isNaN(a) || !isNaN(b)): + e = copysign(inf, c) * a + f = copysign(inf, c) * b + + case (isInf(a) || isInf(b)) && isFinite(c) && isFinite(d): + a = inf2one(a) + b = inf2one(b) + e = inf * (a*c + b*d) + f = inf * (b*c - a*d) + + case (isInf(c) || isInf(d)) && isFinite(a) && isFinite(b): + c = inf2one(c) + d = inf2one(d) + e = 0 * (a*c + b*d) + f = 0 * (b*c - a*d) + } + } + + return complex(e, f) +} diff --git a/src/runtime/complex_test.go b/src/runtime/complex_test.go new file mode 100644 index 0000000..f41e6a3 --- /dev/null +++ b/src/runtime/complex_test.go @@ -0,0 +1,67 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "math/cmplx" + "testing" +) + +var result complex128 + +func BenchmarkComplex128DivNormal(b *testing.B) { + d := 15 + 2i + n := 32 + 3i + res := 0i + for i := 0; i < b.N; i++ { + n += 0.1i + res += n / d + } + result = res +} + +func BenchmarkComplex128DivNisNaN(b *testing.B) { + d := cmplx.NaN() + n := 32 + 3i + res := 0i + for i := 0; i < b.N; i++ { + n += 0.1i + res += n / d + } + result = res +} + +func BenchmarkComplex128DivDisNaN(b *testing.B) { + d := 15 + 2i + n := cmplx.NaN() + res := 0i + for i := 0; i < b.N; i++ { + d += 0.1i + res += n / d + } + result = res +} + +func BenchmarkComplex128DivNisInf(b *testing.B) { + d := 15 + 2i + n := cmplx.Inf() + res := 0i + for i := 0; i < b.N; i++ { + d += 0.1i + res += n / d + } + result = res +} + +func BenchmarkComplex128DivDisInf(b *testing.B) { + d := cmplx.Inf() + n := 32 + 3i + res := 0i + for i := 0; i < b.N; i++ { + n += 0.1i + res += n / d + } + result = res +} diff --git a/src/runtime/conv_wasm_test.go b/src/runtime/conv_wasm_test.go new file mode 100644 index 0000000..5054fca --- /dev/null +++ b/src/runtime/conv_wasm_test.go @@ -0,0 +1,128 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "testing" +) + +var res int64 +var ures uint64 + +func TestFloatTruncation(t *testing.T) { + testdata := []struct { + input float64 + convInt64 int64 + convUInt64 uint64 + overflow bool + }{ + // max +- 1 + { + input: 0x7fffffffffffffff, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + // For out-of-bounds conversion, the result is implementation-dependent. + // This test verifies the implementation of wasm architecture. 
+ { + input: 0x8000000000000000, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: 0x7ffffffffffffffe, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + // neg max +- 1 + { + input: -0x8000000000000000, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: -0x8000000000000001, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: -0x7fffffffffffffff, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + // trunc point +- 1 + { + input: 0x7ffffffffffffdff, + convInt64: 0x7ffffffffffffc00, + convUInt64: 0x7ffffffffffffc00, + }, + { + input: 0x7ffffffffffffe00, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: 0x7ffffffffffffdfe, + convInt64: 0x7ffffffffffffc00, + convUInt64: 0x7ffffffffffffc00, + }, + // neg trunc point +- 1 + { + input: -0x7ffffffffffffdff, + convInt64: -0x7ffffffffffffc00, + convUInt64: 0x8000000000000000, + }, + { + input: -0x7ffffffffffffe00, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: -0x7ffffffffffffdfe, + convInt64: -0x7ffffffffffffc00, + convUInt64: 0x8000000000000000, + }, + // umax +- 1 + { + input: 0xffffffffffffffff, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: 0x10000000000000000, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: 0xfffffffffffffffe, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + // umax trunc +- 1 + { + input: 0xfffffffffffffbff, + convInt64: -0x8000000000000000, + convUInt64: 0xfffffffffffff800, + }, + { + input: 0xfffffffffffffc00, + convInt64: -0x8000000000000000, + convUInt64: 0x8000000000000000, + }, + { + input: 0xfffffffffffffbfe, + convInt64: -0x8000000000000000, + convUInt64: 0xfffffffffffff800, + }, + } + for _, item := range testdata { + if got, want := int64(item.input), item.convInt64; 
got != want { + t.Errorf("int64(%f): got %x, want %x", item.input, got, want) + } + if got, want := uint64(item.input), item.convUInt64; got != want { + t.Errorf("uint64(%f): got %x, want %x", item.input, got, want) + } + } +} diff --git a/src/runtime/coverage/apis.go b/src/runtime/coverage/apis.go new file mode 100644 index 0000000..7d851f9 --- /dev/null +++ b/src/runtime/coverage/apis.go @@ -0,0 +1,178 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package coverage + +import ( + "fmt" + "internal/coverage" + "io" + "reflect" + "sync/atomic" + "unsafe" +) + +// WriteMetaDir writes a coverage meta-data file for the currently +// running program to the directory specified in 'dir'. An error will +// be returned if the operation can't be completed successfully (for +// example, if the currently running program was not built with +// "-cover", or if the directory does not exist). +func WriteMetaDir(dir string) error { + if !finalHashComputed { + return fmt.Errorf("error: no meta-data available (binary not built with -cover?)") + } + return emitMetaDataToDirectory(dir, getCovMetaList()) +} + +// WriteMeta writes the meta-data content (the payload that would +// normally be emitted to a meta-data file) for the currently running +// program to the writer 'w'. An error will be returned if the +// operation can't be completed successfully (for example, if the +// currently running program was not built with "-cover", or if a +// write fails).
+func WriteMeta(w io.Writer) error { + if w == nil { + return fmt.Errorf("error: nil writer in WriteMeta") + } + if !finalHashComputed { + return fmt.Errorf("error: no meta-data available (binary not built with -cover?)") + } + ml := getCovMetaList() + return writeMetaData(w, ml, cmode, cgran, finalHash) +} + +// WriteCountersDir writes a coverage counter-data file for the +// currently running program to the directory specified in 'dir'. An +// error will be returned if the operation can't be completed +// successfully (for example, if the currently running program was not +// built with "-cover", or if the directory does not exist). The +// counter data written will be a snapshot taken at the point of the +// call. +func WriteCountersDir(dir string) error { + return emitCounterDataToDirectory(dir) +} + +// WriteCounters writes coverage counter-data content for +// the currently running program to the writer 'w'. An error will be +// returned if the operation can't be completed successfully (for +// example, if the currently running program was not built with +// "-cover", or if a write fails). The counter data written will be a +// snapshot taken at the point of the invocation. +func WriteCounters(w io.Writer) error { + if w == nil { + return fmt.Errorf("error: nil writer in WriteCounters") + } + // Ask the runtime for the list of coverage counter symbols. + cl := getCovCounterList() + if len(cl) == 0 { + return fmt.Errorf("program not built with -cover") + } + if !finalHashComputed { + return fmt.Errorf("meta-data not written yet, unable to write counter data") + } + + pm := getCovPkgMap() + s := &emitState{ + counterlist: cl, + pkgmap: pm, + } + return s.emitCounterDataToWriter(w) +} + +// ClearCounters clears/resets all coverage counter variables in the +// currently running program. It returns an error if the program in +// question was not built with the "-cover" flag. 
Clearing of coverage +// counters is also not supported for programs not using atomic +// counter mode (see more detailed comments below for the rationale +// here). +func ClearCounters() error { + cl := getCovCounterList() + if len(cl) == 0 { + return fmt.Errorf("program not built with -cover") + } + if cmode != coverage.CtrModeAtomic { + return fmt.Errorf("ClearCounters invoked for program build with -covermode=%s (please use -covermode=atomic)", cmode.String()) + } + + // Implementation note: this function would be faster and simpler + // if we could just zero out the entire counter array, but for the + // moment we go through and zero out just the slots in the array + // corresponding to the counter values. We do this to avoid the + // following bad scenario: suppose that a user builds their Go + // program with "-cover", and that program has a function (call it + // main.XYZ) that invokes ClearCounters: + // + // func XYZ() { + // ... do some stuff ... + // coverage.ClearCounters() + // if someCondition { <<--- HERE + // ... + // } + // } + // + // At the point where ClearCounters executes, main.XYZ has not yet + // finished running, thus as soon as the call returns the line + // marked "HERE" above will trigger the writing of a non-zero + // value into main.XYZ's counter slab. However since we've just + // finished clearing the entire counter segment, we will have lost + // the values in the prolog portion of main.XYZ's counter slab + // (nctrs, pkgid, funcid). This means that later on at the end of + // program execution as we walk through the entire counter array + // for the program looking for executed functions, we'll zoom past + // main.XYZ's prolog (which was zero'd) and hit the non-zero + // counter value corresponding to the "HERE" block, which will + // then be interpreted as the start of another live function. + // Things will go downhill from there. 
+ // + // This same scenario is also a potential risk if the program is + // running on an architecture that permits reordering of + // writes/stores, since the inconsistency described above could + // arise here. Example scenario: + // + // func ABC() { + // ... // prolog + // if alwaysTrue() { + // XYZ() // counter update here + // } + // } + // + // In the instrumented version of ABC, the prolog of the function + // will contain a series of stores to the initial portion of the + // counter array to write number-of-counters, pkgid, funcid. Later + // in the function there is also a store to increment a counter + // for the block containing the call to XYZ(). If the CPU is + // allowed to reorder stores and decides to issue the XYZ store + // before the prolog stores, this could be observable as an + // inconsistency similar to the one above. Hence the requirement + // for atomic counter mode: according to package atomic docs, + // "...operations that happen in a specific order on one thread, + // will always be observed to happen in exactly that order by + // another thread". Thus we can be sure that there will be no + // inconsistency when reading the counter array from the thread + // running ClearCounters. + + var sd []atomic.Uint32 + + bufHdr := (*reflect.SliceHeader)(unsafe.Pointer(&sd)) + for _, c := range cl { + bufHdr.Data = uintptr(unsafe.Pointer(c.Counters)) + bufHdr.Len = int(c.Len) + bufHdr.Cap = int(c.Len) + for i := 0; i < len(sd); i++ { + // Skip ahead until the next non-zero value. + sdi := sd[i].Load() + if sdi == 0 { + continue + } + // We found a function that was executed; clear its counters. + nCtrs := sdi + for j := 0; j < int(nCtrs); j++ { + sd[i+coverage.FirstCtrOffset+j].Store(0) + } + // Move to next function. 
+ i += coverage.FirstCtrOffset + int(nCtrs) - 1 + } + } + return nil +} diff --git a/src/runtime/coverage/dummy.s b/src/runtime/coverage/dummy.s new file mode 100644 index 0000000..7592859 --- /dev/null +++ b/src/runtime/coverage/dummy.s @@ -0,0 +1,8 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The runtime package uses //go:linkname to push a few functions into this +// package but we still need a .s file so the Go tool does not pass -complete +// to 'go tool compile' so the latter does not complain about Go functions +// with no bodies. diff --git a/src/runtime/coverage/emit.go b/src/runtime/coverage/emit.go new file mode 100644 index 0000000..2aed99c --- /dev/null +++ b/src/runtime/coverage/emit.go @@ -0,0 +1,667 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package coverage + +import ( + "crypto/md5" + "fmt" + "internal/coverage" + "internal/coverage/encodecounter" + "internal/coverage/encodemeta" + "internal/coverage/rtcov" + "io" + "os" + "path/filepath" + "reflect" + "runtime" + "sync/atomic" + "time" + "unsafe" +) + +// This file contains functions that support the writing of data files +// emitted at the end of code coverage testing runs, from instrumented +// executables. + +// getCovMetaList returns a list of meta-data blobs registered +// for the currently executing instrumented program. It is defined in the +// runtime. +func getCovMetaList() []rtcov.CovMetaBlob + +// getCovCounterList returns a list of counter-data blobs registered +// for the currently executing instrumented program. It is defined in the +// runtime. 
+func getCovCounterList() []rtcov.CovCounterBlob + +// getCovPkgMap returns a map storing the remapped package IDs for +// hard-coded runtime packages (see internal/coverage/pkgid.go for +// more on why hard-coded package IDs are needed). This function +// is defined in the runtime. +func getCovPkgMap() map[int]int + +// emitState holds useful state information during the emit process. +// +// When an instrumented program finishes execution and starts the +// process of writing out coverage data, it's possible that a +// meta-data file already exists in the output directory. In +// this case openOutputFiles() below will leave the 'mf' field +// as nil. If a new meta-data file is needed, field 'mfname' will be +// the final desired path of the meta file, 'mftmp' will be a +// temporary file, and 'mf' will be an open os.File pointer for +// 'mftmp'. The meta-data file payload will be written to 'mf', the +// temp file will then be closed and renamed (from 'mftmp' to +// 'mfname'), so as to ensure that the meta-data file is created +// atomically; we want this so that things work smoothly in cases +// where there are several instances of a given instrumented program +// all terminating at the same time and trying to create meta-data +// files simultaneously. +// +// For counter data files there is less chance of a collision, hence +// openOutputFiles() stores the counter data file path in 'cfname' and +// then places the *os.File into 'cf'. 
+type emitState struct { + mfname string // path of final meta-data output file + mftmp string // path to meta-data temp file (if needed) + mf *os.File // open os.File for meta-data temp file + cfname string // path of final counter data file + cftmp string // path to counter data temp file + cf *os.File // open os.File for counter data file + outdir string // output directory + + // List of meta-data symbols obtained from the runtime + metalist []rtcov.CovMetaBlob + + // List of counter-data symbols obtained from the runtime + counterlist []rtcov.CovCounterBlob + + // Table to use for remapping hard-coded pkg ids. + pkgmap map[int]int + + // emit debug trace output + debug bool +} + +var ( + // finalHash is computed at init time from the list of meta-data + // symbols registered during init. It is used both for writing the + // meta-data file and counter-data files. + finalHash [16]byte + // Set to true when we've computed finalHash + finalMetaLen. + finalHashComputed bool + // Total meta-data length. + finalMetaLen uint64 + // Records whether we've already attempted to write meta-data. + metaDataEmitAttempted bool + // Counter mode for this instrumented program run. + cmode coverage.CounterMode + // Counter granularity for this instrumented program run. + cgran coverage.CounterGranularity + // Cached value of GOCOVERDIR environment variable. + goCoverDir string + // Copy of os.Args made at init time, converted into map format. + capturedOsArgs map[string]string + // Flag used in tests to signal that coverage data already written. + covProfileAlreadyEmitted bool +) + +// fileType is used to select between counter-data files and +// meta-data files. +type fileType int + +const ( + noFile = 1 << iota + metaDataFile + counterDataFile +) + +// emitMetaData emits the meta-data output file for this coverage run. +// This entry point is intended to be invoked by the compiler from +// an instrumented program's main package init func. 
+func emitMetaData() { + if covProfileAlreadyEmitted { + return + } + ml, err := prepareForMetaEmit() + if err != nil { + fmt.Fprintf(os.Stderr, "error: coverage meta-data prep failed: %v\n", err) + if os.Getenv("GOCOVERDEBUG") != "" { + panic("meta-data write failure") + } + } + if len(ml) == 0 { + fmt.Fprintf(os.Stderr, "program not built with -cover\n") + return + } + + goCoverDir = os.Getenv("GOCOVERDIR") + if goCoverDir == "" { + fmt.Fprintf(os.Stderr, "warning: GOCOVERDIR not set, no coverage data emitted\n") + return + } + + if err := emitMetaDataToDirectory(goCoverDir, ml); err != nil { + fmt.Fprintf(os.Stderr, "error: coverage meta-data emit failed: %v\n", err) + if os.Getenv("GOCOVERDEBUG") != "" { + panic("meta-data write failure") + } + } +} + +func modeClash(m coverage.CounterMode) bool { + if m == coverage.CtrModeRegOnly || m == coverage.CtrModeTestMain { + return false + } + if cmode == coverage.CtrModeInvalid { + cmode = m + return false + } + return cmode != m +} + +func granClash(g coverage.CounterGranularity) bool { + if cgran == coverage.CtrGranularityInvalid { + cgran = g + return false + } + return cgran != g +} + +// prepareForMetaEmit performs preparatory steps needed prior to +// emitting a meta-data file, notably computing a final hash of +// all meta-data blobs and capturing os args. +func prepareForMetaEmit() ([]rtcov.CovMetaBlob, error) { + // Ask the runtime for the list of coverage meta-data symbols. + ml := getCovMetaList() + + // In the normal case (go build -o prog.exe ... ; ./prog.exe) + // len(ml) will always be non-zero, but we check here since at + // some point this function will be reachable via user-callable + // APIs (for example, to write out coverage data from a server + // program that doesn't ever call os.Exit). 
+ if len(ml) == 0 { + return nil, nil + } + + s := &emitState{ + metalist: ml, + debug: os.Getenv("GOCOVERDEBUG") != "", + } + + // Capture os.Args now so as to avoid issues if args + // are rewritten during program execution. + capturedOsArgs = captureOsArgs() + + if s.debug { + fmt.Fprintf(os.Stderr, "=+= GOCOVERDIR is %s\n", os.Getenv("GOCOVERDIR")) + fmt.Fprintf(os.Stderr, "=+= contents of covmetalist:\n") + for k, b := range ml { + fmt.Fprintf(os.Stderr, "=+= slot: %d path: %s ", k, b.PkgPath) + if b.PkgID != -1 { + fmt.Fprintf(os.Stderr, " hcid: %d", b.PkgID) + } + fmt.Fprintf(os.Stderr, "\n") + } + pm := getCovPkgMap() + fmt.Fprintf(os.Stderr, "=+= remap table:\n") + for from, to := range pm { + fmt.Fprintf(os.Stderr, "=+= from %d to %d\n", + uint32(from), uint32(to)) + } + } + + h := md5.New() + tlen := uint64(unsafe.Sizeof(coverage.MetaFileHeader{})) + for _, entry := range ml { + if _, err := h.Write(entry.Hash[:]); err != nil { + return nil, err + } + tlen += uint64(entry.Len) + ecm := coverage.CounterMode(entry.CounterMode) + if modeClash(ecm) { + return nil, fmt.Errorf("coverage counter mode clash: package %s uses mode=%s, but package %s uses mode=%s\n", ml[0].PkgPath, cmode, entry.PkgPath, ecm) + } + ecg := coverage.CounterGranularity(entry.CounterGranularity) + if granClash(ecg) { + return nil, fmt.Errorf("coverage counter granularity clash: package %s uses gran=%s, but package %s uses gran=%s\n", ml[0].PkgPath, cgran, entry.PkgPath, ecg) + } + } + + // Hash mode and granularity as well. + h.Write([]byte(cmode.String())) + h.Write([]byte(cgran.String())) + + // Compute final digest. + fh := h.Sum(nil) + copy(finalHash[:], fh) + finalHashComputed = true + finalMetaLen = tlen + + return ml, nil +} + +// emitMetaDataToDirectory emits the meta-data output file to the specified +// directory, returning an error if something went wrong. 
+func emitMetaDataToDirectory(outdir string, ml []rtcov.CovMetaBlob) error { + ml, err := prepareForMetaEmit() + if err != nil { + return err + } + if len(ml) == 0 { + return nil + } + + metaDataEmitAttempted = true + + s := &emitState{ + metalist: ml, + debug: os.Getenv("GOCOVERDEBUG") != "", + outdir: outdir, + } + + // Open output files. + if err := s.openOutputFiles(finalHash, finalMetaLen, metaDataFile); err != nil { + return err + } + + // Emit meta-data file only if needed (may already be present). + if s.needMetaDataFile() { + if err := s.emitMetaDataFile(finalHash, finalMetaLen); err != nil { + return err + } + } + return nil +} + +// emitCounterData emits the counter data output file for this coverage run. +// This entry point is intended to be invoked by the runtime when an +// instrumented program is terminating or calling os.Exit(). +func emitCounterData() { + if goCoverDir == "" || !finalHashComputed || covProfileAlreadyEmitted { + return + } + if err := emitCounterDataToDirectory(goCoverDir); err != nil { + fmt.Fprintf(os.Stderr, "error: coverage counter data emit failed: %v\n", err) + if os.Getenv("GOCOVERDEBUG") != "" { + panic("counter-data write failure") + } + } +} + +// emitCounterDataToDirectory emits the counter-data output file for +// this coverage run to the specified directory. +func emitCounterDataToDirectory(outdir string) error { + // Ask the runtime for the list of coverage counter symbols. + cl := getCovCounterList() + if len(cl) == 0 { + // no work to do here. + return nil + } + + if !finalHashComputed { + return fmt.Errorf("error: meta-data not available (binary not built with -cover?)") + } + + // Ask the runtime for the package ID remap table. + pm := getCovPkgMap() + s := &emitState{ + counterlist: cl, + pkgmap: pm, + outdir: outdir, + debug: os.Getenv("GOCOVERDEBUG") != "", + } + + // Open output file. 
+ if err := s.openOutputFiles(finalHash, finalMetaLen, counterDataFile); err != nil { + return err + } + if s.cf == nil { + return fmt.Errorf("counter data output file open failed (no additional info)") + } + + // Emit counter data file. + if err := s.emitCounterDataFile(finalHash, s.cf); err != nil { + return err + } + if err := s.cf.Close(); err != nil { + return fmt.Errorf("closing counter data file: %v", err) + } + + // Counter file has now been closed. Rename the temp to the + // final desired path. + if err := os.Rename(s.cftmp, s.cfname); err != nil { + return fmt.Errorf("writing %s: rename from %s failed: %v\n", s.cfname, s.cftmp, err) + } + + return nil +} + +// emitCounterDataToWriter emits counter data for this coverage run to an io.Writer. +func (s *emitState) emitCounterDataToWriter(w io.Writer) error { + if err := s.emitCounterDataFile(finalHash, w); err != nil { + return err + } + return nil +} + +// openMetaFile determines whether we need to emit a meta-data output +// file, or whether we can reuse the existing file in the coverage out +// dir. It updates mfname/mftmp/mf fields in 's', returning an error +// if something went wrong. See the comment on the emitState type +// definition above for more on how file opening is managed. +func (s *emitState) openMetaFile(metaHash [16]byte, metaLen uint64) error { + + // Stat the meta-data output file to see if it already exists. + fn := fmt.Sprintf("%s.%x", coverage.MetaFilePref, metaHash) + s.mfname = filepath.Join(s.outdir, fn) + fi, err := os.Stat(s.mfname) + if err != nil || fi.Size() != int64(metaLen) { + // We need a new meta-file. + tname := "tmp." + fn + fmt.Sprintf("%d", time.Now().UnixNano()) + s.mftmp = filepath.Join(s.outdir, tname) + s.mf, err = os.Create(s.mftmp) + if err != nil { + return fmt.Errorf("creating meta-data file %s: %v", s.mftmp, err) + } + } + return nil +} + +// openCounterFile opens an output file for the counter data portion +// of a test coverage run. 
It updates the 'cfname' and 'cf' fields in +// 's', returning an error if something went wrong. +func (s *emitState) openCounterFile(metaHash [16]byte) error { + processID := os.Getpid() + fn := fmt.Sprintf(coverage.CounterFileTempl, coverage.CounterFilePref, metaHash, processID, time.Now().UnixNano()) + s.cfname = filepath.Join(s.outdir, fn) + s.cftmp = filepath.Join(s.outdir, "tmp."+fn) + var err error + s.cf, err = os.Create(s.cftmp) + if err != nil { + return fmt.Errorf("creating counter data file %s: %v", s.cftmp, err) + } + return nil +} + +// openOutputFiles opens output files in preparation for emitting +// coverage data. In the case of the meta-data file, openOutputFiles +// may determine that we can reuse an existing meta-data file in the +// outdir, in which case it will leave the 'mf' field in the state +// struct as nil. If a new meta-file is needed, the field 'mfname' +// will be the final desired path of the meta file, 'mftmp' will be a +// temporary file, and 'mf' will be an open os.File pointer for +// 'mftmp'. The idea is that the client/caller will write content into +// 'mf', close it, and then rename 'mftmp' to 'mfname'. This function +// also opens the counter data output file, setting 'cf' and 'cfname' +// in the state struct. 
+func (s *emitState) openOutputFiles(metaHash [16]byte, metaLen uint64, which fileType) error { + fi, err := os.Stat(s.outdir) + if err != nil { + return fmt.Errorf("output directory %q inaccessible (err: %v); no coverage data written", s.outdir, err) + } + if !fi.IsDir() { + return fmt.Errorf("output directory %q not a directory; no coverage data written", s.outdir) + } + + if (which & metaDataFile) != 0 { + if err := s.openMetaFile(metaHash, metaLen); err != nil { + return err + } + } + if (which & counterDataFile) != 0 { + if err := s.openCounterFile(metaHash); err != nil { + return err + } + } + return nil +} + +// emitMetaDataFile emits coverage meta-data to a previously opened +// temporary file (s.mftmp), then renames the generated file to the +// final path (s.mfname). +func (s *emitState) emitMetaDataFile(finalHash [16]byte, tlen uint64) error { + if err := writeMetaData(s.mf, s.metalist, cmode, cgran, finalHash); err != nil { + return fmt.Errorf("writing %s: %v\n", s.mftmp, err) + } + if err := s.mf.Close(); err != nil { + return fmt.Errorf("closing meta data temp file: %v", err) + } + + // Temp file has now been flushed and closed. Rename the temp to the + // final desired path. + if err := os.Rename(s.mftmp, s.mfname); err != nil { + return fmt.Errorf("writing %s: rename from %s failed: %v\n", s.mfname, s.mftmp, err) + } + + return nil +} + +// needMetaDataFile returns TRUE if we need to emit a meta-data file +// for this program run. It should be used only after +// openOutputFiles() has been invoked. 
+func (s *emitState) needMetaDataFile() bool { + return s.mf != nil +} + +func writeMetaData(w io.Writer, metalist []rtcov.CovMetaBlob, cmode coverage.CounterMode, gran coverage.CounterGranularity, finalHash [16]byte) error { + mfw := encodemeta.NewCoverageMetaFileWriter("<io.Writer>", w) + + // Note: "sd" is re-initialized on each iteration of the loop + // below, and would normally be declared inside the loop, but + // it is placed here for escape analysis reasons, since we + // capture it in bufHdr. + var sd []byte + bufHdr := (*reflect.SliceHeader)(unsafe.Pointer(&sd)) + + var blobs [][]byte + for _, e := range metalist { + bufHdr.Data = uintptr(unsafe.Pointer(e.P)) + bufHdr.Len = int(e.Len) + bufHdr.Cap = int(e.Len) + blobs = append(blobs, sd) + } + return mfw.Write(finalHash, blobs, cmode, gran) +} + +func (s *emitState) NumFuncs() (int, error) { + var sd []atomic.Uint32 + bufHdr := (*reflect.SliceHeader)(unsafe.Pointer(&sd)) + + totalFuncs := 0 + for _, c := range s.counterlist { + bufHdr.Data = uintptr(unsafe.Pointer(c.Counters)) + bufHdr.Len = int(c.Len) + bufHdr.Cap = int(c.Len) + for i := 0; i < len(sd); i++ { + // Skip ahead until the next non-zero value. + sdi := sd[i].Load() + if sdi == 0 { + continue + } + + // We found a function that was executed. + nCtrs := sdi + + // Check to make sure that we have at least one live + // counter. See the implementation note in ClearCoverageCounters + // for a description of why this is needed. + isLive := false + st := i + coverage.FirstCtrOffset + counters := sd[st : st+int(nCtrs)] + for i := 0; i < len(counters); i++ { + if counters[i].Load() != 0 { + isLive = true + break + } + } + if !isLive { + // Skip this function. + i += coverage.FirstCtrOffset + int(nCtrs) - 1 + continue + } + + totalFuncs++ + + // Move to the next function. 
+ i += coverage.FirstCtrOffset + int(nCtrs) - 1 + } + } + return totalFuncs, nil +} + +func (s *emitState) VisitFuncs(f encodecounter.CounterVisitorFn) error { + var sd []atomic.Uint32 + var tcounters []uint32 + bufHdr := (*reflect.SliceHeader)(unsafe.Pointer(&sd)) + + rdCounters := func(actrs []atomic.Uint32, ctrs []uint32) []uint32 { + ctrs = ctrs[:0] + for i := range actrs { + ctrs = append(ctrs, actrs[i].Load()) + } + return ctrs + } + + dpkg := uint32(0) + for _, c := range s.counterlist { + bufHdr.Data = uintptr(unsafe.Pointer(c.Counters)) + bufHdr.Len = int(c.Len) + bufHdr.Cap = int(c.Len) + for i := 0; i < len(sd); i++ { + // Skip ahead until the next non-zero value. + sdi := sd[i].Load() + if sdi == 0 { + continue + } + + // We found a function that was executed. + nCtrs := sd[i+coverage.NumCtrsOffset].Load() + pkgId := sd[i+coverage.PkgIdOffset].Load() + funcId := sd[i+coverage.FuncIdOffset].Load() + cst := i + coverage.FirstCtrOffset + counters := sd[cst : cst+int(nCtrs)] + + // Check to make sure that we have at least one live + // counter. See the implementation note in ClearCoverageCounters + // for a description of why this is needed. + isLive := false + for i := 0; i < len(counters); i++ { + if counters[i].Load() != 0 { + isLive = true + break + } + } + if !isLive { + // Skip this function. + i += coverage.FirstCtrOffset + int(nCtrs) - 1 + continue + } + + if s.debug { + if pkgId != dpkg { + dpkg = pkgId + fmt.Fprintf(os.Stderr, "\n=+= %d: pk=%d visit live fcn", + i, pkgId) + } + fmt.Fprintf(os.Stderr, " {i=%d F%d NC%d}", i, funcId, nCtrs) + } + + // Vet and/or fix up package ID. A package ID of zero + // indicates that there is some new package X that is a + // runtime dependency, and this package has code that + // executes before its corresponding init package runs. + // This is a fatal error that we should only see during + // Go development (e.g. tip). 
+ ipk := int32(pkgId) + if ipk == 0 { + fmt.Fprintf(os.Stderr, "\n") + reportErrorInHardcodedList(int32(i), ipk, funcId, nCtrs) + } else if ipk < 0 { + if newId, ok := s.pkgmap[int(ipk)]; ok { + pkgId = uint32(newId) + } else { + fmt.Fprintf(os.Stderr, "\n") + reportErrorInHardcodedList(int32(i), ipk, funcId, nCtrs) + } + } else { + // The package ID value stored in the counter array + // has 1 added to it (so as to preclude the + // possibility of a zero value; see + // runtime.addCovMeta), so subtract off 1 here to form + // the real package ID. + pkgId-- + } + + tcounters = rdCounters(counters, tcounters) + if err := f(pkgId, funcId, tcounters); err != nil { + return err + } + + // Skip over this function. + i += coverage.FirstCtrOffset + int(nCtrs) - 1 + } + if s.debug { + fmt.Fprintf(os.Stderr, "\n") + } + } + return nil +} + +// captureOsArgs converts os.Args into the format we use to store +// this info in the counter data file (counter data file "args" +// section is a generic key-value collection). See the 'args' section +// in internal/coverage/defs.go for more info. The args map +// is also used to capture GOOS + GOARCH values. +func captureOsArgs() map[string]string { + m := make(map[string]string) + m["argc"] = fmt.Sprintf("%d", len(os.Args)) + for k, a := range os.Args { + m[fmt.Sprintf("argv%d", k)] = a + } + m["GOOS"] = runtime.GOOS + m["GOARCH"] = runtime.GOARCH + return m +} + +// emitCounterDataFile emits the counter data portion of a +// coverage output file (to the io.Writer 'w'). +func (s *emitState) emitCounterDataFile(finalHash [16]byte, w io.Writer) error { + cfw := encodecounter.NewCoverageDataWriter(w, coverage.CtrULeb128) + if err := cfw.Write(finalHash, capturedOsArgs, s); err != nil { + return err + } + return nil +} + +// markProfileEmitted signals the runtime/coverage machinery that +// coverage data output files have already been written out, and there +// is no need to take any additional action at exit time. 
This +// function is called (via linknamed reference) from the +// coverage-related boilerplate code in _testmain.go emitted for go +// unit tests. +func markProfileEmitted(val bool) { + covProfileAlreadyEmitted = val +} + +func reportErrorInHardcodedList(slot, pkgID int32, fnID, nCtrs uint32) { + metaList := getCovMetaList() + pkgMap := getCovPkgMap() + + println("internal error in coverage meta-data tracking:") + println("encountered bad pkgID:", pkgID, " at slot:", slot, + " fnID:", fnID, " numCtrs:", nCtrs) + println("list of hard-coded runtime package IDs needs revising.") + println("[see the comment on the 'rtPkgs' var in ") + println(" <goroot>/src/internal/coverage/pkgid.go]") + println("registered list:") + for k, b := range metaList { + print("slot: ", k, " path='", b.PkgPath, "' ") + if b.PkgID != -1 { + print(" hard-coded id: ", b.PkgID) + } + println("") + } + println("remap table:") + for from, to := range pkgMap { + println("from ", from, " to ", to) + } +} diff --git a/src/runtime/coverage/emitdata_test.go b/src/runtime/coverage/emitdata_test.go new file mode 100644 index 0000000..3839e44 --- /dev/null +++ b/src/runtime/coverage/emitdata_test.go @@ -0,0 +1,451 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package coverage + +import ( + "fmt" + "internal/coverage" + "internal/goexperiment" + "internal/platform" + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "runtime" + "strings" + "testing" +) + +// Set to true for debugging (linux only). 
+const fixedTestDir = false + +func TestCoverageApis(t *testing.T) { + if testing.Short() { + t.Skipf("skipping test: too long for short mode") + } + if !goexperiment.CoverageRedesign { + t.Skipf("skipping new coverage tests (experiment not enabled)") + } + testenv.MustHaveGoBuild(t) + dir := t.TempDir() + if fixedTestDir { + dir = "/tmp/qqqzzz" + os.RemoveAll(dir) + mkdir(t, dir) + } + + // Build harness. + bdir := mkdir(t, filepath.Join(dir, "build")) + hargs := []string{"-cover", "-coverpkg=all"} + if testing.CoverMode() != "" { + hargs = append(hargs, "-covermode="+testing.CoverMode()) + } + harnessPath := buildHarness(t, bdir, hargs) + + t.Logf("harness path is %s", harnessPath) + + // Sub-tests for each API we want to inspect, plus + // extras for error testing. + t.Run("emitToDir", func(t *testing.T) { + t.Parallel() + testEmitToDir(t, harnessPath, dir) + }) + t.Run("emitToWriter", func(t *testing.T) { + t.Parallel() + testEmitToWriter(t, harnessPath, dir) + }) + t.Run("emitToNonexistentDir", func(t *testing.T) { + t.Parallel() + testEmitToNonexistentDir(t, harnessPath, dir) + }) + t.Run("emitToNilWriter", func(t *testing.T) { + t.Parallel() + testEmitToNilWriter(t, harnessPath, dir) + }) + t.Run("emitToFailingWriter", func(t *testing.T) { + t.Parallel() + testEmitToFailingWriter(t, harnessPath, dir) + }) + t.Run("emitWithCounterClear", func(t *testing.T) { + t.Parallel() + testEmitWithCounterClear(t, harnessPath, dir) + }) + +} + +// upmergeCoverData helps improve coverage data for this package +// itself. If this test itself is being invoked with "-cover", then +// what we'd like is for package coverage data (that is, coverage for +// routines in "runtime/coverage") to be incorporated into the test +// run from the "harness.exe" runs we've just done. We can accomplish +// this by doing a merge from the harness gocoverdir's to the test +// gocoverdir. 
+func upmergeCoverData(t *testing.T, gocoverdir string) { + if testing.CoverMode() == "" { + return + } + testGoCoverDir := os.Getenv("GOCOVERDIR") + if testGoCoverDir == "" { + return + } + args := []string{"tool", "covdata", "merge", "-pkg=runtime/coverage", + "-o", testGoCoverDir, "-i", gocoverdir} + t.Logf("up-merge of covdata from %s to %s", gocoverdir, testGoCoverDir) + t.Logf("executing: go %+v", args) + cmd := exec.Command(testenv.GoToolPath(t), args...) + if b, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("covdata merge failed (%v): %s", err, b) + } +} + +// buildHarness builds the helper program "harness.exe". +func buildHarness(t *testing.T, dir string, opts []string) string { + harnessPath := filepath.Join(dir, "harness.exe") + harnessSrc := filepath.Join("testdata", "harness.go") + args := []string{"build", "-o", harnessPath} + args = append(args, opts...) + args = append(args, harnessSrc) + //t.Logf("harness build: go %+v\n", args) + cmd := exec.Command(testenv.GoToolPath(t), args...) + if b, err := cmd.CombinedOutput(); err != nil { + t.Fatalf("build failed (%v): %s", err, b) + } + return harnessPath +} + +func mkdir(t *testing.T, d string) string { + t.Helper() + if err := os.Mkdir(d, 0777); err != nil { + t.Fatalf("mkdir failed: %v", err) + } + return d +} + +// updateGoCoverDir updates the specified environment 'env' to set +// GOCOVERDIR to 'gcd' (if setGoCoverDir is TRUE) or removes +// GOCOVERDIR from the environment (if setGoCoverDir is false). 
+func updateGoCoverDir(env []string, gcd string, setGoCoverDir bool) []string { + rv := []string{} + found := false + for _, v := range env { + if strings.HasPrefix(v, "GOCOVERDIR=") { + if !setGoCoverDir { + continue + } + v = "GOCOVERDIR=" + gcd + found = true + } + rv = append(rv, v) + } + if !found && setGoCoverDir { + rv = append(rv, "GOCOVERDIR="+gcd) + } + return rv +} + +func runHarness(t *testing.T, harnessPath string, tp string, setGoCoverDir bool, rdir, edir string) (string, error) { + t.Logf("running: %s -tp %s -o %s with rdir=%s and GOCOVERDIR=%v", harnessPath, tp, edir, rdir, setGoCoverDir) + cmd := exec.Command(harnessPath, "-tp", tp, "-o", edir) + cmd.Dir = rdir + cmd.Env = updateGoCoverDir(os.Environ(), rdir, setGoCoverDir) + b, err := cmd.CombinedOutput() + //t.Logf("harness run output: %s\n", string(b)) + return string(b), err +} + +func testForSpecificFunctions(t *testing.T, dir string, want []string, avoid []string) string { + args := []string{"tool", "covdata", "debugdump", + "-live", "-pkg=command-line-arguments", "-i=" + dir} + t.Logf("running: go %v\n", args) + cmd := exec.Command(testenv.GoToolPath(t), args...) + b, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("'go tool covdata failed (%v): %s", err, b) + } + output := string(b) + rval := "" + for _, f := range want { + wf := "Func: " + f + "\n" + if strings.Contains(output, wf) { + continue + } + rval += fmt.Sprintf("error: output should contain %q but does not\n", wf) + } + for _, f := range avoid { + wf := "Func: " + f + "\n" + if strings.Contains(output, wf) { + rval += fmt.Sprintf("error: output should not contain %q but does\n", wf) + } + } + if rval != "" { + t.Logf("=-= begin output:\n" + output + "\n=-= end output\n") + } + return rval +} + +func withAndWithoutRunner(f func(setit bool, tag string)) { + // Run 'f' with and without GOCOVERDIR set. 
+ for i := 0; i < 2; i++ { + tag := "x" + setGoCoverDir := true + if i == 0 { + setGoCoverDir = false + tag = "y" + } + f(setGoCoverDir, tag) + } +} + +func mktestdirs(t *testing.T, tag, tp, dir string) (string, string) { + t.Helper() + rdir := mkdir(t, filepath.Join(dir, tp+"-rdir-"+tag)) + edir := mkdir(t, filepath.Join(dir, tp+"-edir-"+tag)) + return rdir, edir +} + +func testEmitToDir(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + tp := "emitToDir" + rdir, edir := mktestdirs(t, tag, tp, dir) + output, err := runHarness(t, harnessPath, tp, + setGoCoverDir, rdir, edir) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp emitDir': %v", err) + } + + // Just check to make sure meta-data file and counter data file were + // written. Another alternative would be to run "go tool covdata" + // or equivalent, but for now, this is what we've got. + dents, err := os.ReadDir(edir) + if err != nil { + t.Fatalf("os.ReadDir(%s) failed: %v", edir, err) + } + mfc := 0 + cdc := 0 + for _, e := range dents { + if e.IsDir() { + continue + } + if strings.HasPrefix(e.Name(), coverage.MetaFilePref) { + mfc++ + } else if strings.HasPrefix(e.Name(), coverage.CounterFilePref) { + cdc++ + } + } + wantmf := 1 + wantcf := 1 + if mfc != wantmf { + t.Errorf("EmitToDir: want %d meta-data files, got %d\n", wantmf, mfc) + } + if cdc != wantcf { + t.Errorf("EmitToDir: want %d counter-data files, got %d\n", wantcf, cdc) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitToWriter(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + tp := "emitToWriter" + rdir, edir := mktestdirs(t, tag, tp, dir) + output, err := runHarness(t, harnessPath, tp, setGoCoverDir, rdir, edir) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + want := []string{"main", tp} + avoid := []string{"final"} 
+ if msg := testForSpecificFunctions(t, edir, want, avoid); msg != "" { + t.Errorf("coverage data from %q output match failed: %s", tp, msg) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitToNonexistentDir(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + tp := "emitToNonexistentDir" + rdir, edir := mktestdirs(t, tag, tp, dir) + output, err := runHarness(t, harnessPath, tp, setGoCoverDir, rdir, edir) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitToUnwritableDir(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + + tp := "emitToUnwritableDir" + rdir, edir := mktestdirs(t, tag, tp, dir) + + // Make edir unwritable. + if err := os.Chmod(edir, 0555); err != nil { + t.Fatalf("chmod failed: %v", err) + } + defer os.Chmod(edir, 0777) + + output, err := runHarness(t, harnessPath, tp, setGoCoverDir, rdir, edir) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitToNilWriter(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + tp := "emitToNilWriter" + rdir, edir := mktestdirs(t, tag, tp, dir) + output, err := runHarness(t, harnessPath, tp, setGoCoverDir, rdir, edir) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitToFailingWriter(t *testing.T, harnessPath string, dir string) { + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + tp := "emitToFailingWriter" + rdir, edir := mktestdirs(t, tag, tp, dir) + output, err := runHarness(t, harnessPath, tp, setGoCoverDir, rdir, edir) 
+ if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + upmergeCoverData(t, edir) + upmergeCoverData(t, rdir) + }) +} + +func testEmitWithCounterClear(t *testing.T, harnessPath string, dir string) { + // Ensure that we have two versions of the harness: one built with + // -covermode=atomic and one built with -covermode=set (we need + // both modes to test all of the functionality). + var nonatomicHarnessPath, atomicHarnessPath string + if testing.CoverMode() != "atomic" { + nonatomicHarnessPath = harnessPath + bdir2 := mkdir(t, filepath.Join(dir, "build2")) + hargs := []string{"-covermode=atomic", "-coverpkg=all"} + atomicHarnessPath = buildHarness(t, bdir2, hargs) + } else { + atomicHarnessPath = harnessPath + mode := "set" + if testing.CoverMode() != "" && testing.CoverMode() != "atomic" { + mode = testing.CoverMode() + } + // Build a special nonatomic covermode version of the harness + // (we need both modes to test all of the functionality). + bdir2 := mkdir(t, filepath.Join(dir, "build2")) + hargs := []string{"-covermode=" + mode, "-coverpkg=all"} + nonatomicHarnessPath = buildHarness(t, bdir2, hargs) + } + + withAndWithoutRunner(func(setGoCoverDir bool, tag string) { + // First a run with the nonatomic harness path, which we + // expect to fail. + tp := "emitWithCounterClear" + rdir1, edir1 := mktestdirs(t, tag, tp+"1", dir) + output, err := runHarness(t, nonatomicHarnessPath, tp, + setGoCoverDir, rdir1, edir1) + if err == nil { + t.Logf("%s", output) + t.Fatalf("running '%s -tp %s': unexpected success", + nonatomicHarnessPath, tp) + } + + // Next a run with the atomic harness path, which we + // expect to succeed. 
+ rdir2, edir2 := mktestdirs(t, tag, tp+"2", dir) + output, err = runHarness(t, atomicHarnessPath, tp, + setGoCoverDir, rdir2, edir2) + if err != nil { + t.Logf("%s", output) + t.Fatalf("running 'harness -tp %s': %v", tp, err) + } + want := []string{tp, "postClear"} + avoid := []string{"preClear", "main", "final"} + if msg := testForSpecificFunctions(t, edir2, want, avoid); msg != "" { + t.Logf("%s", output) + t.Errorf("coverage data from %q output match failed: %s", tp, msg) + } + + if testing.CoverMode() == "atomic" { + upmergeCoverData(t, edir2) + upmergeCoverData(t, rdir2) + } else { + upmergeCoverData(t, edir1) + upmergeCoverData(t, rdir1) + } + }) +} + +func TestApisOnNocoverBinary(t *testing.T) { + if testing.Short() { + t.Skipf("skipping test: too long for short mode") + } + testenv.MustHaveGoBuild(t) + dir := t.TempDir() + + // Build harness with no -cover. + bdir := mkdir(t, filepath.Join(dir, "nocover")) + edir := mkdir(t, filepath.Join(dir, "emitDirNo")) + harnessPath := buildHarness(t, bdir, nil) + output, err := runHarness(t, harnessPath, "emitToDir", false, edir, edir) + if err == nil { + t.Fatalf("expected error on TestApisOnNocoverBinary harness run") + } + const want = "not built with -cover" + if !strings.Contains(output, want) { + t.Errorf("error output does not contain %q: %s", want, output) + } +} + +func TestIssue56006EmitDataRaceCoverRunningGoroutine(t *testing.T) { + if testing.Short() { + t.Skipf("skipping test: too long for short mode") + } + if !goexperiment.CoverageRedesign { + t.Skipf("skipping new coverage tests (experiment not enabled)") + } + + // This test requires "go test -race -cover", meaning that we need + // go build, go run, and "-race" support. 
+ testenv.MustHaveGoRun(t) + if !platform.RaceDetectorSupported(runtime.GOOS, runtime.GOARCH) || + !testenv.HasCGO() { + t.Skip("skipped due to lack of race detector support / CGO") + } + + // This will run a program with -cover and -race where we have a + // goroutine still running (and updating counters) at the point where + // the test runtime is trying to write out counter data. + cmd := exec.Command(testenv.GoToolPath(t), "test", "-cover", "-race") + cmd.Dir = filepath.Join("testdata", "issue56006") + b, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("go test -cover -race failed: %v", err) + } + + // Don't want to see any data races in output. + avoid := []string{"DATA RACE"} + for _, no := range avoid { + if strings.Contains(string(b), no) { + t.Logf("%s\n", string(b)) + t.Fatalf("found %s in test output, not permitted", no) + } + } +} diff --git a/src/runtime/coverage/hooks.go b/src/runtime/coverage/hooks.go new file mode 100644 index 0000000..a9fbf9d --- /dev/null +++ b/src/runtime/coverage/hooks.go @@ -0,0 +1,42 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package coverage + +import _ "unsafe" + +// initHook is invoked from the main package "init" routine in +// programs built with "-cover". This function is intended to be +// called only by the compiler. +// +// If 'istest' is false, it indicates we're building a regular program +// ("go build -cover ..."), in which case we immediately try to write +// out the meta-data file, and register emitCounterData as an exit +// hook. +// +// If 'istest' is true (indicating that the program in question is a +// Go test binary), then we tentatively queue up both emitMetaData and +// emitCounterData as exit hooks. In the normal case (e.g. 
regular "go +// test -cover" run) the testmain.go boilerplate will run at the end +// of the test, write out the coverage percentage, and then invoke +// markProfileEmitted() to indicate that no more work needs to be +// done. If however that call is never made, this is a sign that the +// test binary is being used as a replacement binary for the tool +// being tested, hence we do want to run exit hooks when the program +// terminates. +func initHook(istest bool) { + // Note: hooks are run in reverse registration order, so + // register the counter data hook before the meta-data hook + // (in the case where two hooks are needed). + runOnNonZeroExit := true + runtime_addExitHook(emitCounterData, runOnNonZeroExit) + if istest { + runtime_addExitHook(emitMetaData, runOnNonZeroExit) + } else { + emitMetaData() + } +} + +//go:linkname runtime_addExitHook runtime.addExitHook +func runtime_addExitHook(f func(), runOnNonZeroExit bool) diff --git a/src/runtime/coverage/testdata/harness.go b/src/runtime/coverage/testdata/harness.go new file mode 100644 index 0000000..5c87e4c --- /dev/null +++ b/src/runtime/coverage/testdata/harness.go @@ -0,0 +1,259 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "flag" + "fmt" + "internal/coverage/slicewriter" + "io" + "io/ioutil" + "log" + "path/filepath" + "runtime/coverage" + "strings" +) + +var verbflag = flag.Int("v", 0, "Verbose trace output level") +var testpointflag = flag.String("tp", "", "Testpoint to run") +var outdirflag = flag.String("o", "", "Output dir into which to emit") + +func emitToWriter() { + log.SetPrefix("emitToWriter: ") + var slwm slicewriter.WriteSeeker + if err := coverage.WriteMeta(&slwm); err != nil { + log.Fatalf("error: WriteMeta returns %v", err) + } + mf := filepath.Join(*outdirflag, "covmeta.0abcdef") + if err := ioutil.WriteFile(mf, slwm.BytesWritten(), 0666); err != nil { + log.Fatalf("error: writing %s: %v", mf, err) + } + var slwc slicewriter.WriteSeeker + if err := coverage.WriteCounters(&slwc); err != nil { + log.Fatalf("error: WriteCounters returns %v", err) + } + cf := filepath.Join(*outdirflag, "covcounters.0abcdef.99.77") + if err := ioutil.WriteFile(cf, slwc.BytesWritten(), 0666); err != nil { + log.Fatalf("error: writing %s: %v", cf, err) + } +} + +func emitToDir() { + log.SetPrefix("emitToDir: ") + if err := coverage.WriteMetaDir(*outdirflag); err != nil { + log.Fatalf("error: WriteMetaDir returns %v", err) + } + if err := coverage.WriteCountersDir(*outdirflag); err != nil { + log.Fatalf("error: WriteCountersDir returns %v", err) + } +} + +func emitToNonexistentDir() { + log.SetPrefix("emitToNonexistentDir: ") + + want := []string{ + "no such file or directory", // linux-ish + "system cannot find the file specified", // windows + "does not exist", // plan9 + } + + checkWant := func(which string, got string) { + found := false + for _, w := range want { + if strings.Contains(got, w) { + found = true + break + } + } + if !found { + log.Fatalf("%s emit to bad dir: got error:\n %v\nwanted error with one of:\n %+v", which, got, want) + } + } + + // Mangle the output directory to produce something nonexistent. 
+ mangled := *outdirflag + "_MANGLED" + if err := coverage.WriteMetaDir(mangled); err == nil { + log.Fatal("expected error from WriteMetaDir to nonexistent dir") + } else { + got := fmt.Sprintf("%v", err) + checkWant("meta data", got) + } + + // Now try to emit counter data file to a bad dir. + if err := coverage.WriteCountersDir(mangled); err == nil { + log.Fatal("expected error emitting counter data to bad dir") + } else { + got := fmt.Sprintf("%v", err) + checkWant("counter data", got) + } +} + +func emitToUnwritableDir() { + log.SetPrefix("emitToUnwritableDir: ") + + want := "permission denied" + + if err := coverage.WriteMetaDir(*outdirflag); err == nil { + log.Fatal("expected error from WriteMetaDir to unwritable dir") + } else { + got := fmt.Sprintf("%v", err) + if !strings.Contains(got, want) { + log.Fatalf("meta-data emit to unwritable dir: wanted error containing %q got %q", want, got) + } + } + + // Similarly with writing counter data. + if err := coverage.WriteCountersDir(*outdirflag); err == nil { + log.Fatal("expected error emitting counter data to unwritable dir") + } else { + got := fmt.Sprintf("%v", err) + if !strings.Contains(got, want) { + log.Fatalf("emitting counter data to unwritable dir: wanted error containing %q got %q", want, got) + } + } +} + +func emitToNilWriter() { + log.SetPrefix("emitToNilWriter: ") + want := "nil writer" + var bad io.WriteSeeker + if err := coverage.WriteMeta(bad); err == nil { + log.Fatal("expected error passing nil writer for meta emit") + } else { + got := fmt.Sprintf("%v", err) + if !strings.Contains(got, want) { + log.Fatalf("emitting meta-data passing nil writer: wanted error containing %q got %q", want, got) + } + } + + if err := coverage.WriteCounters(bad); err == nil { + log.Fatal("expected error passing nil writer for counter emit") + } else { + got := fmt.Sprintf("%v", err) + if !strings.Contains(got, want) { + log.Fatalf("emitting counter data passing nil writer: wanted error containing %q got %q", want, 
got) + } + } +} + +type failingWriter struct { + writeCount int + writeLimit int + slws slicewriter.WriteSeeker +} + +func (f *failingWriter) Write(p []byte) (n int, err error) { + c := f.writeCount + f.writeCount++ + if f.writeLimit < 0 || c < f.writeLimit { + return f.slws.Write(p) + } + return 0, fmt.Errorf("manufactured write error") +} + +func (f *failingWriter) Seek(offset int64, whence int) (int64, error) { + return f.slws.Seek(offset, whence) +} + +func (f *failingWriter) reset(lim int) { + f.writeCount = 0 + f.writeLimit = lim + f.slws = slicewriter.WriteSeeker{} +} + +func writeStressTest(tag string, testf func(testf *failingWriter) error) { + // Invoke the function initially without the write limit + // set, to capture the number of writes performed. + fw := &failingWriter{writeLimit: -1} + testf(fw) + + // Now that we know how many writes are going to happen, run the + // function repeatedly, each time with a Write operation set to + // fail at a new spot. The goal here is to make sure that: + // A) an error is reported, and B) nothing crashes. + tot := fw.writeCount + for i := 0; i < tot; i++ { + fw.reset(i) + err := testf(fw) + if err == nil { + log.Fatalf("no error from write %d tag %s", i, tag) + } + } +} + +func postClear() int { + return 42 +} + +func preClear() int { + return 42 +} + +// This test is designed to ensure that write errors are properly +// handled by the code that writes out coverage data. It repeatedly +// invokes the 'emit to writer' apis using a specially crafted writer +// that captures the total number of expected writes, then replays the +// execution N times with a manufactured write error at the +// appropriate spot. 
+func emitToFailingWriter() { + log.SetPrefix("emitToFailingWriter: ") + + writeStressTest("emit-meta", func(f *failingWriter) error { + return coverage.WriteMeta(f) + }) + writeStressTest("emit-counter", func(f *failingWriter) error { + return coverage.WriteCounters(f) + }) +} + +func emitWithCounterClear() { + log.SetPrefix("emitWithCounterClear: ") + preClear() + if err := coverage.ClearCounters(); err != nil { + log.Fatalf("clear failed: %v", err) + } + postClear() + if err := coverage.WriteMetaDir(*outdirflag); err != nil { + log.Fatalf("error: WriteMetaDir returns %v", err) + } + if err := coverage.WriteCountersDir(*outdirflag); err != nil { + log.Fatalf("error: WriteCountersDir returns %v", err) + } +} + +func final() int { + println("I run last.") + return 43 +} + +func main() { + log.SetFlags(0) + flag.Parse() + if *testpointflag == "" { + log.Fatalf("error: no testpoint (use -tp flag)") + } + if *outdirflag == "" { + log.Fatalf("error: no output dir specified (use -o flag)") + } + switch *testpointflag { + case "emitToDir": + emitToDir() + case "emitToWriter": + emitToWriter() + case "emitToNonexistentDir": + emitToNonexistentDir() + case "emitToUnwritableDir": + emitToUnwritableDir() + case "emitToNilWriter": + emitToNilWriter() + case "emitToFailingWriter": + emitToFailingWriter() + case "emitWithCounterClear": + emitWithCounterClear() + default: + log.Fatalf("error: unknown testpoint %q", *testpointflag) + } + final() +} diff --git a/src/runtime/coverage/testdata/issue56006/repro.go b/src/runtime/coverage/testdata/issue56006/repro.go new file mode 100644 index 0000000..60a4925 --- /dev/null +++ b/src/runtime/coverage/testdata/issue56006/repro.go @@ -0,0 +1,26 @@ +package main + +//go:noinline +func blah(x int) int { + if x != 0 { + return x + 42 + } + return x - 42 +} + +func main() { + go infloop() + println(blah(1) + blah(0)) +} + +var G int + +func infloop() { + for { + G += blah(1) + G += blah(0) + if G > 10000 { + G = 0 + } + } +} diff --git 
a/src/runtime/coverage/testdata/issue56006/repro_test.go b/src/runtime/coverage/testdata/issue56006/repro_test.go new file mode 100644 index 0000000..674d819 --- /dev/null +++ b/src/runtime/coverage/testdata/issue56006/repro_test.go @@ -0,0 +1,8 @@ +package main + +import "testing" + +func TestSomething(t *testing.T) { + go infloop() + println(blah(1) + blah(0)) +} diff --git a/src/runtime/coverage/testsupport.go b/src/runtime/coverage/testsupport.go new file mode 100644 index 0000000..a481bbb --- /dev/null +++ b/src/runtime/coverage/testsupport.go @@ -0,0 +1,234 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package coverage + +import ( + "fmt" + "internal/coverage" + "internal/coverage/calloc" + "internal/coverage/cformat" + "internal/coverage/cmerge" + "internal/coverage/decodecounter" + "internal/coverage/decodemeta" + "internal/coverage/pods" + "io" + "os" + "strings" +) + +// processCoverTestDir is called (via a linknamed reference) from +// testmain code when "go test -cover" is in effect. It is not +// intended to be used other than internally by the Go command's +// generated code. +func processCoverTestDir(dir string, cfile string, cm string, cpkg string) error { + return processCoverTestDirInternal(dir, cfile, cm, cpkg, os.Stdout) +} + +// processCoverTestDirInternal is an io.Writer version of processCoverTestDir, +// exposed for unit testing. +func processCoverTestDirInternal(dir string, cfile string, cm string, cpkg string, w io.Writer) error { + cmode := coverage.ParseCounterMode(cm) + if cmode == coverage.CtrModeInvalid { + return fmt.Errorf("invalid counter mode %q", cm) + } + + // Emit meta-data and counter data. + ml := getCovMetaList() + if len(ml) == 0 { + // This corresponds to the case where we have a package that + // contains test code but no functions (which is fine). In this + // case there is no need to emit anything. 
+ } else { + if err := emitMetaDataToDirectory(dir, ml); err != nil { + return err + } + if err := emitCounterDataToDirectory(dir); err != nil { + return err + } + } + + // Collect pods from test run. For the majority of cases we would + // expect to see a single pod here, but allow for multiple pods in + // case the test harness is doing extra work to collect data files + // from builds that it kicks off as part of the testing. + podlist, err := pods.CollectPods([]string{dir}, false) + if err != nil { + return fmt.Errorf("reading from %s: %v", dir, err) + } + + // Open text output file if appropriate. + var tf *os.File + var tfClosed bool + if cfile != "" { + var err error + tf, err = os.Create(cfile) + if err != nil { + return fmt.Errorf("internal error: opening coverage data output file %q: %v", cfile, err) + } + defer func() { + if !tfClosed { + tfClosed = true + tf.Close() + } + }() + } + + // Read/process the pods. + ts := &tstate{ + cm: &cmerge.Merger{}, + cf: cformat.NewFormatter(cmode), + cmode: cmode, + } + // Generate the expected hash string based on the final meta-data + // hash for this test, then look only for pods that refer to that + // hash (just in case there are multiple instrumented executables + // in play). See issue #57924 for more on this. + hashstring := fmt.Sprintf("%x", finalHash) + for _, p := range podlist { + if !strings.Contains(p.MetaFile, hashstring) { + continue + } + if err := ts.processPod(p); err != nil { + return err + } + } + + // Emit percent. + if err := ts.cf.EmitPercent(w, cpkg, true); err != nil { + return err + } + + // Emit text output. 
+ if tf != nil { + if err := ts.cf.EmitTextual(tf); err != nil { + return err + } + tfClosed = true + if err := tf.Close(); err != nil { + return fmt.Errorf("closing %s: %v", cfile, err) + } + } + + return nil +} + +type tstate struct { + calloc.BatchCounterAlloc + cm *cmerge.Merger + cf *cformat.Formatter + cmode coverage.CounterMode +} + +// processPod reads coverage counter data for a specific pod. +func (ts *tstate) processPod(p pods.Pod) error { + // Open meta-data file + f, err := os.Open(p.MetaFile) + if err != nil { + return fmt.Errorf("unable to open meta-data file %s: %v", p.MetaFile, err) + } + defer func() { + f.Close() + }() + var mfr *decodemeta.CoverageMetaFileReader + mfr, err = decodemeta.NewCoverageMetaFileReader(f, nil) + if err != nil { + return fmt.Errorf("error reading meta-data file %s: %v", p.MetaFile, err) + } + newmode := mfr.CounterMode() + if newmode != ts.cmode { + return fmt.Errorf("internal error: counter mode clash: %q from test harness, %q from data file %s", ts.cmode.String(), newmode.String(), p.MetaFile) + } + newgran := mfr.CounterGranularity() + if err := ts.cm.SetModeAndGranularity(p.MetaFile, cmode, newgran); err != nil { + return err + } + + // A map to store counter data, indexed by pkgid/fnid tuple. + pmm := make(map[pkfunc][]uint32) + + // Helper to read a single counter data file. + readcdf := func(cdf string) error { + cf, err := os.Open(cdf) + if err != nil { + return fmt.Errorf("opening counter data file %s: %s", cdf, err) + } + defer cf.Close() + var cdr *decodecounter.CounterDataReader + cdr, err = decodecounter.NewCounterDataReader(cdf, cf) + if err != nil { + return fmt.Errorf("reading counter data file %s: %s", cdf, err) + } + var data decodecounter.FuncPayload + for { + ok, err := cdr.NextFunc(&data) + if err != nil { + return fmt.Errorf("reading counter data file %s: %v", cdf, err) + } + if !ok { + break + } + + // NB: sanity check on pkg and func IDs? 
+ key := pkfunc{pk: data.PkgIdx, fcn: data.FuncIdx} + if prev, found := pmm[key]; found { + // Note: no overflow reporting here. + if err, _ := ts.cm.MergeCounters(data.Counters, prev); err != nil { + return fmt.Errorf("processing counter data file %s: %v", cdf, err) + } + } + c := ts.AllocateCounters(len(data.Counters)) + copy(c, data.Counters) + pmm[key] = c + } + return nil + } + + // Read counter data files. + for _, cdf := range p.CounterDataFiles { + if err := readcdf(cdf); err != nil { + return err + } + } + + // Visit meta-data file. + np := uint32(mfr.NumPackages()) + payload := []byte{} + for pkIdx := uint32(0); pkIdx < np; pkIdx++ { + var pd *decodemeta.CoverageMetaDataDecoder + pd, payload, err = mfr.GetPackageDecoder(pkIdx, payload) + if err != nil { + return fmt.Errorf("reading pkg %d from meta-file %s: %s", pkIdx, p.MetaFile, err) + } + ts.cf.SetPackage(pd.PackagePath()) + var fd coverage.FuncDesc + nf := pd.NumFuncs() + for fnIdx := uint32(0); fnIdx < nf; fnIdx++ { + if err := pd.ReadFunc(fnIdx, &fd); err != nil { + return fmt.Errorf("reading meta-data file %s: %v", + p.MetaFile, err) + } + key := pkfunc{pk: pkIdx, fcn: fnIdx} + counters, haveCounters := pmm[key] + for i := 0; i < len(fd.Units); i++ { + u := fd.Units[i] + // Skip units with non-zero parent (no way to represent + // these in the existing format). + if u.Parent != 0 { + continue + } + count := uint32(0) + if haveCounters { + count = counters[i] + } + ts.cf.AddUnit(fd.Srcfile, fd.Funcname, fd.Lit, u, count) + } + } + } + return nil +} + +type pkfunc struct { + pk, fcn uint32 +} diff --git a/src/runtime/coverage/ts_test.go b/src/runtime/coverage/ts_test.go new file mode 100644 index 0000000..b826058 --- /dev/null +++ b/src/runtime/coverage/ts_test.go @@ -0,0 +1,58 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package coverage + +import ( + "internal/goexperiment" + "os" + "path/filepath" + "strings" + "testing" + _ "unsafe" +) + +//go:linkname testing_testGoCoverDir testing.testGoCoverDir +func testing_testGoCoverDir() string + +// TestTestSupport does a basic verification of the functionality in +// runtime/coverage.processCoverTestDir (doing this here as opposed to +// relying on other test paths will provide a better signal when +// running "go test -cover" for this package). +func TestTestSupport(t *testing.T) { + if !goexperiment.CoverageRedesign { + return + } + if testing.CoverMode() == "" { + return + } + t.Logf("testing.testGoCoverDir() returns %s mode=%s\n", + testing_testGoCoverDir(), testing.CoverMode()) + + textfile := filepath.Join(t.TempDir(), "file.txt") + var sb strings.Builder + err := processCoverTestDirInternal(testing_testGoCoverDir(), textfile, + testing.CoverMode(), "", &sb) + if err != nil { + t.Fatalf("bad: %v", err) + } + + // Check for existence of text file. + if inf, err := os.Open(textfile); err != nil { + t.Fatalf("problems opening text file %s: %v", textfile, err) + } else { + inf.Close() + } + + // Check for percent output with expected tokens. + strout := sb.String() + want1 := "runtime/coverage" + want2 := "of statements" + if !strings.Contains(strout, want1) || + !strings.Contains(strout, want2) { + t.Logf("output from run: %s\n", strout) + t.Fatalf("percent output missing key tokens: %q and %q", + want1, want2) + } +} diff --git a/src/runtime/covercounter.go b/src/runtime/covercounter.go new file mode 100644 index 0000000..72842bd --- /dev/null +++ b/src/runtime/covercounter.go @@ -0,0 +1,26 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/coverage/rtcov" + "unsafe" +) + +//go:linkname runtime_coverage_getCovCounterList runtime/coverage.getCovCounterList +func runtime_coverage_getCovCounterList() []rtcov.CovCounterBlob { + res := []rtcov.CovCounterBlob{} + u32sz := unsafe.Sizeof(uint32(0)) + for datap := &firstmoduledata; datap != nil; datap = datap.next { + if datap.covctrs == datap.ecovctrs { + continue + } + res = append(res, rtcov.CovCounterBlob{ + Counters: (*uint32)(unsafe.Pointer(datap.covctrs)), + Len: uint64((datap.ecovctrs - datap.covctrs) / u32sz), + }) + } + return res +} diff --git a/src/runtime/covermeta.go b/src/runtime/covermeta.go new file mode 100644 index 0000000..54ef42a --- /dev/null +++ b/src/runtime/covermeta.go @@ -0,0 +1,72 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/coverage/rtcov" + "unsafe" +) + +// covMeta is the top-level container for bits of state related to +// code coverage meta-data in the runtime. +var covMeta struct { + // metaList contains the list of currently registered meta-data + // blobs for the running program. + metaList []rtcov.CovMetaBlob + + // pkgMap records mappings from hard-coded package IDs to + // slots in the metaList above. + pkgMap map[int]int + + // Set to true if we discover a package mapping glitch. 
+ hardCodedListNeedsUpdating bool +} + +// addCovMeta is invoked during package "init" functions by the +// compiler when compiling for coverage instrumentation; here 'p' is a +// meta-data blob of length 'dlen' for the package in question, 'hash' +// is a compiler-computed md5.sum for the blob, 'pkpath' is the +// package path, 'pkid' is the hard-coded ID that the compiler is +// using for the package (or -1 if the compiler doesn't think a +// hard-coded ID is needed), and 'cmode'/'cgran' are the coverage +// counter mode and granularity requested by the user. Return value is +// the ID for the package for use by the package code itself. +func addCovMeta(p unsafe.Pointer, dlen uint32, hash [16]byte, pkpath string, pkid int, cmode uint8, cgran uint8) uint32 { + slot := len(covMeta.metaList) + covMeta.metaList = append(covMeta.metaList, + rtcov.CovMetaBlob{ + P: (*byte)(p), + Len: dlen, + Hash: hash, + PkgPath: pkpath, + PkgID: pkid, + CounterMode: cmode, + CounterGranularity: cgran, + }) + if pkid != -1 { + if covMeta.pkgMap == nil { + covMeta.pkgMap = make(map[int]int) + } + if _, ok := covMeta.pkgMap[pkid]; ok { + throw("runtime.addCovMeta: coverage package map collision") + } + // Record the real slot (position on meta-list) for this + // package; we'll use the map to fix things up later on. + covMeta.pkgMap[pkid] = slot + } + + // ID zero is reserved as invalid. + return uint32(slot + 1) +} + +//go:linkname runtime_coverage_getCovMetaList runtime/coverage.getCovMetaList +func runtime_coverage_getCovMetaList() []rtcov.CovMetaBlob { + return covMeta.metaList +} + +//go:linkname runtime_coverage_getCovPkgMap runtime/coverage.getCovPkgMap +func runtime_coverage_getCovPkgMap() map[int]int { + return covMeta.pkgMap +} diff --git a/src/runtime/cpuflags.go b/src/runtime/cpuflags.go new file mode 100644 index 0000000..bbe93c5 --- /dev/null +++ b/src/runtime/cpuflags.go @@ -0,0 +1,34 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/cpu" + "unsafe" +) + +// Offsets into internal/cpu records for use in assembly. +const ( + offsetX86HasAVX = unsafe.Offsetof(cpu.X86.HasAVX) + offsetX86HasAVX2 = unsafe.Offsetof(cpu.X86.HasAVX2) + offsetX86HasERMS = unsafe.Offsetof(cpu.X86.HasERMS) + offsetX86HasRDTSCP = unsafe.Offsetof(cpu.X86.HasRDTSCP) + + offsetARMHasIDIVA = unsafe.Offsetof(cpu.ARM.HasIDIVA) + + offsetMIPS64XHasMSA = unsafe.Offsetof(cpu.MIPS64X.HasMSA) +) + +var ( + // Set in runtime.cpuinit. + // TODO: deprecate these; use internal/cpu directly. + x86HasPOPCNT bool + x86HasSSE41 bool + x86HasFMA bool + + armHasVFPv4 bool + + arm64HasATOMICS bool +) diff --git a/src/runtime/cpuflags_amd64.go b/src/runtime/cpuflags_amd64.go new file mode 100644 index 0000000..8cca4bc --- /dev/null +++ b/src/runtime/cpuflags_amd64.go @@ -0,0 +1,24 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/cpu" +) + +var useAVXmemmove bool + +func init() { + // Let's remove stepping and reserved fields + processor := processorVersionInfo & 0x0FFF3FF0 + + isIntelBridgeFamily := isIntel && + processor == 0x206A0 || + processor == 0x206D0 || + processor == 0x306A0 || + processor == 0x306E0 + + useAVXmemmove = cpu.X86.HasAVX && !isIntelBridgeFamily +} diff --git a/src/runtime/cpuflags_arm64.go b/src/runtime/cpuflags_arm64.go new file mode 100644 index 0000000..a0f1d11 --- /dev/null +++ b/src/runtime/cpuflags_arm64.go @@ -0,0 +1,17 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/cpu" +) + +var arm64UseAlignedLoads bool + +func init() { + if cpu.ARM64.IsNeoverseN1 || cpu.ARM64.IsNeoverseV1 { + arm64UseAlignedLoads = true + } +} diff --git a/src/runtime/cpuprof.go b/src/runtime/cpuprof.go new file mode 100644 index 0000000..0d7eeac --- /dev/null +++ b/src/runtime/cpuprof.go @@ -0,0 +1,241 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// CPU profiling. +// +// The signal handler for the profiling clock tick adds a new stack trace +// to a log of recent traces. The log is read by a user goroutine that +// turns it into formatted profile data. If the reader does not keep up +// with the log, those writes will be recorded as a count of lost records. +// The actual profile buffer is in profbuf.go. + +package runtime + +import ( + "internal/abi" + "runtime/internal/sys" + "unsafe" +) + +const ( + maxCPUProfStack = 64 + + // profBufWordCount is the size of the CPU profile buffer's storage for the + // header and stack of each sample, measured in 64-bit words. Every sample + // has a required header of two words. With a small additional header (a + // word or two) and stacks at the profiler's maximum length of 64 frames, + // that capacity can support 1900 samples or 19 thread-seconds at a 100 Hz + // sample rate, at a cost of 1 MiB. + profBufWordCount = 1 << 17 + // profBufTagCount is the size of the CPU profile buffer's storage for the + // goroutine tags associated with each sample. A capacity of 1<<14 means + // room for 16k samples, or 160 thread-seconds at a 100 Hz sample rate. + profBufTagCount = 1 << 14 +) + +type cpuProfile struct { + lock mutex + on bool // profiling is on + log *profBuf // profile events written here + + // extra holds extra stacks accumulated in addNonGo + // corresponding to profiling signals arriving on + // non-Go-created threads. 
Those stacks are written + // to log the next time a normal Go thread gets the + // signal handler. + // Assuming the stacks are 2 words each (we don't get + // a full traceback from those threads), plus one word + // size for framing, 100 Hz profiling would generate + // 300 words per second. + // Hopefully a normal Go thread will get the profiling + // signal at least once every few seconds. + extra [1000]uintptr + numExtra int + lostExtra uint64 // count of frames lost because extra is full + lostAtomic uint64 // count of frames lost because of being in atomic64 on mips/arm; updated racily +} + +var cpuprof cpuProfile + +// SetCPUProfileRate sets the CPU profiling rate to hz samples per second. +// If hz <= 0, SetCPUProfileRate turns off profiling. +// If the profiler is on, the rate cannot be changed without first turning it off. +// +// Most clients should use the runtime/pprof package or +// the testing package's -test.cpuprofile flag instead of calling +// SetCPUProfileRate directly. +func SetCPUProfileRate(hz int) { + // Clamp hz to something reasonable. + if hz < 0 { + hz = 0 + } + if hz > 1000000 { + hz = 1000000 + } + + lock(&cpuprof.lock) + if hz > 0 { + if cpuprof.on || cpuprof.log != nil { + print("runtime: cannot set cpu profile rate until previous profile has finished.\n") + unlock(&cpuprof.lock) + return + } + + cpuprof.on = true + cpuprof.log = newProfBuf(1, profBufWordCount, profBufTagCount) + hdr := [1]uint64{uint64(hz)} + cpuprof.log.write(nil, nanotime(), hdr[:], nil) + setcpuprofilerate(int32(hz)) + } else if cpuprof.on { + setcpuprofilerate(0) + cpuprof.on = false + cpuprof.addExtra() + cpuprof.log.close() + } + unlock(&cpuprof.lock) +} + +// add adds the stack trace to the profile. +// It is called from signal handlers and other limited environments +// and cannot allocate memory or acquire locks that might be +// held at the time of the signal, nor can it use substantial amounts +// of stack. 
+// +//go:nowritebarrierrec +func (p *cpuProfile) add(tagPtr *unsafe.Pointer, stk []uintptr) { + // Simple cas-lock to coordinate with setcpuprofilerate. + for !prof.signalLock.CompareAndSwap(0, 1) { + // TODO: Is it safe to osyield here? https://go.dev/issue/52672 + osyield() + } + + if prof.hz.Load() != 0 { // implies cpuprof.log != nil + if p.numExtra > 0 || p.lostExtra > 0 || p.lostAtomic > 0 { + p.addExtra() + } + hdr := [1]uint64{1} + // Note: write "knows" that the argument is &gp.labels, + // because otherwise its write barrier behavior may not + // be correct. See the long comment there before + // changing the argument here. + cpuprof.log.write(tagPtr, nanotime(), hdr[:], stk) + } + + prof.signalLock.Store(0) +} + +// addNonGo adds the non-Go stack trace to the profile. +// It is called from a non-Go thread, so we cannot use much stack at all, +// nor do anything that needs a g or an m. +// In particular, we can't call cpuprof.log.write. +// Instead, we copy the stack into cpuprof.extra, +// which will be drained the next time a Go thread +// gets the signal handling event. +// +//go:nosplit +//go:nowritebarrierrec +func (p *cpuProfile) addNonGo(stk []uintptr) { + // Simple cas-lock to coordinate with SetCPUProfileRate. + // (Other calls to add or addNonGo should be blocked out + // by the fact that only one SIGPROF can be handled by the + // process at a time. If not, this lock will serialize those too. + // The use of timer_create(2) on Linux to request process-targeted + // signals may have changed this.) + for !prof.signalLock.CompareAndSwap(0, 1) { + // TODO: Is it safe to osyield here? 
https://go.dev/issue/52672 + osyield() + } + + if cpuprof.numExtra+1+len(stk) < len(cpuprof.extra) { + i := cpuprof.numExtra + cpuprof.extra[i] = uintptr(1 + len(stk)) + copy(cpuprof.extra[i+1:], stk) + cpuprof.numExtra += 1 + len(stk) + } else { + cpuprof.lostExtra++ + } + + prof.signalLock.Store(0) +} + +// addExtra adds the "extra" profiling events, +// queued by addNonGo, to the profile log. +// addExtra is called either from a signal handler on a Go thread +// or from an ordinary goroutine; either way it can use stack +// and has a g. The world may be stopped, though. +func (p *cpuProfile) addExtra() { + // Copy accumulated non-Go profile events. + hdr := [1]uint64{1} + for i := 0; i < p.numExtra; { + p.log.write(nil, 0, hdr[:], p.extra[i+1:i+int(p.extra[i])]) + i += int(p.extra[i]) + } + p.numExtra = 0 + + // Report any lost events. + if p.lostExtra > 0 { + hdr := [1]uint64{p.lostExtra} + lostStk := [2]uintptr{ + abi.FuncPCABIInternal(_LostExternalCode) + sys.PCQuantum, + abi.FuncPCABIInternal(_ExternalCode) + sys.PCQuantum, + } + p.log.write(nil, 0, hdr[:], lostStk[:]) + p.lostExtra = 0 + } + + if p.lostAtomic > 0 { + hdr := [1]uint64{p.lostAtomic} + lostStk := [2]uintptr{ + abi.FuncPCABIInternal(_LostSIGPROFDuringAtomic64) + sys.PCQuantum, + abi.FuncPCABIInternal(_System) + sys.PCQuantum, + } + p.log.write(nil, 0, hdr[:], lostStk[:]) + p.lostAtomic = 0 + } + +} + +// CPUProfile panics. +// It formerly provided raw access to chunks of +// a pprof-format profile generated by the runtime. +// The details of generating that format have changed, +// so this functionality has been removed. +// +// Deprecated: Use the runtime/pprof package, +// or the handlers in the net/http/pprof package, +// or the testing package's -test.cpuprofile flag instead. 
+func CPUProfile() []byte { + panic("CPUProfile no longer available") +} + +//go:linkname runtime_pprof_runtime_cyclesPerSecond runtime/pprof.runtime_cyclesPerSecond +func runtime_pprof_runtime_cyclesPerSecond() int64 { + return tickspersecond() +} + +// readProfile, provided to runtime/pprof, returns the next chunk of +// binary CPU profiling stack trace data, blocking until data is available. +// If profiling is turned off and all the profile data accumulated while it was +// on has been returned, readProfile returns eof=true. +// The caller must save the returned data and tags before calling readProfile again. +// The returned data contains a whole number of records, and tags contains +// exactly one entry per record. +// +//go:linkname runtime_pprof_readProfile runtime/pprof.readProfile +func runtime_pprof_readProfile() ([]uint64, []unsafe.Pointer, bool) { + lock(&cpuprof.lock) + log := cpuprof.log + unlock(&cpuprof.lock) + readMode := profBufBlocking + if GOOS == "darwin" || GOOS == "ios" { + readMode = profBufNonBlocking // For #61768; on Darwin notes are not async-signal-safe. See sigNoteSetup in os_darwin.go. + } + data, tags, eof := log.read(readMode) + if len(data) == 0 && eof { + lock(&cpuprof.lock) + cpuprof.log = nil + unlock(&cpuprof.lock) + } + return data, tags, eof +} diff --git a/src/runtime/cputicks.go b/src/runtime/cputicks.go new file mode 100644 index 0000000..9127061 --- /dev/null +++ b/src/runtime/cputicks.go @@ -0,0 +1,11 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !arm && !arm64 && !loong64 && !mips64 && !mips64le && !mips && !mipsle && !wasm + +package runtime + +// careful: cputicks is not guaranteed to be monotonic! In particular, we have +// noticed drift between cpus on certain os/arch combinations. See issue 8976. 
+func cputicks() int64 diff --git a/src/runtime/crash_cgo_test.go b/src/runtime/crash_cgo_test.go new file mode 100644 index 0000000..51d7bb5 --- /dev/null +++ b/src/runtime/crash_cgo_test.go @@ -0,0 +1,770 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build cgo + +package runtime_test + +import ( + "fmt" + "internal/goos" + "internal/testenv" + "os" + "os/exec" + "runtime" + "strconv" + "strings" + "testing" + "time" +) + +func TestCgoCrashHandler(t *testing.T) { + t.Parallel() + testCrashHandler(t, true) +} + +func TestCgoSignalDeadlock(t *testing.T) { + // Don't call t.Parallel, since too much work going on at the + // same time can cause the testprogcgo code to overrun its + // timeouts (issue #18598). + + if testing.Short() && runtime.GOOS == "windows" { + t.Skip("Skipping in short mode") // takes up to 64 seconds + } + got := runTestProg(t, "testprogcgo", "CgoSignalDeadlock") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} + +func TestCgoTraceback(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", "CgoTraceback") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} + +func TestCgoCallbackGC(t *testing.T) { + t.Parallel() + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + if testing.Short() { + switch { + case runtime.GOOS == "dragonfly": + t.Skip("see golang.org/issue/11990") + case runtime.GOOS == "linux" && runtime.GOARCH == "arm": + t.Skip("too slow for arm builders") + case runtime.GOOS == "linux" && (runtime.GOARCH == "mips64" || runtime.GOARCH == "mips64le"): + t.Skip("too slow for mips64x builders") + } + } + if testenv.Builder() == "darwin-amd64-10_14" { + // TODO(#23011): When the 10.14 builders are gone, remove this skip. 
+ t.Skip("skipping due to platform bug on macOS 10.14; see https://golang.org/issue/43926") + } + got := runTestProg(t, "testprogcgo", "CgoCallbackGC") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} + +func TestCgoExternalThreadPanic(t *testing.T) { + t.Parallel() + if runtime.GOOS == "plan9" { + t.Skipf("no pthreads on %s", runtime.GOOS) + } + got := runTestProg(t, "testprogcgo", "CgoExternalThreadPanic") + want := "panic: BOOM" + if !strings.Contains(got, want) { + t.Fatalf("want failure containing %q. output:\n%s\n", want, got) + } +} + +func TestCgoExternalThreadSIGPROF(t *testing.T) { + t.Parallel() + // issue 9456. + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + + got := runTestProg(t, "testprogcgo", "CgoExternalThreadSIGPROF", "GO_START_SIGPROF_THREAD=1") + if want := "OK\n"; got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} + +func TestCgoExternalThreadSignal(t *testing.T) { + t.Parallel() + // issue 10139 + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + + got := runTestProg(t, "testprogcgo", "CgoExternalThreadSignal") + if want := "OK\n"; got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} + +func TestCgoDLLImports(t *testing.T) { + // test issue 9356 + if runtime.GOOS != "windows" { + t.Skip("skipping windows specific test") + } + got := runTestProg(t, "testprogcgo", "CgoDLLImportsMain") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got %v", want, got) + } +} + +func TestCgoExecSignalMask(t *testing.T) { + t.Parallel() + // Test issue 13164. 
+ switch runtime.GOOS { + case "windows", "plan9": + t.Skipf("skipping signal mask test on %s", runtime.GOOS) + } + got := runTestProg(t, "testprogcgo", "CgoExecSignalMask", "GOTRACEBACK=system") + want := "OK\n" + if got != want { + t.Errorf("expected %q, got %v", want, got) + } +} + +func TestEnsureDropM(t *testing.T) { + t.Parallel() + // Test for issue 13881. + switch runtime.GOOS { + case "windows", "plan9": + t.Skipf("skipping dropm test on %s", runtime.GOOS) + } + got := runTestProg(t, "testprogcgo", "EnsureDropM") + want := "OK\n" + if got != want { + t.Errorf("expected %q, got %v", want, got) + } +} + +// Test for issue 14387. +// Test that the program that doesn't need any cgo pointer checking +// takes about the same amount of time with it as without it. +func TestCgoCheckBytes(t *testing.T) { + t.Parallel() + // Make sure we don't count the build time as part of the run time. + testenv.MustHaveGoBuild(t) + exe, err := buildTestProg(t, "testprogcgo") + if err != nil { + t.Fatal(err) + } + + // Try it 10 times to avoid flakiness. + const tries = 10 + var tot1, tot2 time.Duration + for i := 0; i < tries; i++ { + cmd := testenv.CleanCmdEnv(exec.Command(exe, "CgoCheckBytes")) + cmd.Env = append(cmd.Env, "GODEBUG=cgocheck=0", fmt.Sprintf("GO_CGOCHECKBYTES_TRY=%d", i)) + + start := time.Now() + cmd.Run() + d1 := time.Since(start) + + cmd = testenv.CleanCmdEnv(exec.Command(exe, "CgoCheckBytes")) + cmd.Env = append(cmd.Env, fmt.Sprintf("GO_CGOCHECKBYTES_TRY=%d", i)) + + start = time.Now() + cmd.Run() + d2 := time.Since(start) + + if d1*20 > d2 { + // The slow version (d2) was less than 20 times + // slower than the fast version (d1), so OK. 
+ return + } + + tot1 += d1 + tot2 += d2 + } + + t.Errorf("cgo check too slow: got %v, expected at most %v", tot2/tries, (tot1/tries)*20) +} + +func TestCgoPanicDeadlock(t *testing.T) { + t.Parallel() + // test issue 14432 + got := runTestProg(t, "testprogcgo", "CgoPanicDeadlock") + want := "panic: cgo error\n\n" + if !strings.HasPrefix(got, want) { + t.Fatalf("output does not start with %q:\n%s", want, got) + } +} + +func TestCgoCCodeSIGPROF(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", "CgoCCodeSIGPROF") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func TestCgoPprofCallback(t *testing.T) { + if testing.Short() { + t.Skip("skipping in short mode") // takes a full second + } + switch runtime.GOOS { + case "windows", "plan9": + t.Skipf("skipping cgo pprof callback test on %s", runtime.GOOS) + } + got := runTestProg(t, "testprogcgo", "CgoPprofCallback") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func TestCgoCrashTraceback(t *testing.T) { + t.Parallel() + switch platform := runtime.GOOS + "/" + runtime.GOARCH; platform { + case "darwin/amd64": + case "linux/amd64": + case "linux/arm64": + case "linux/ppc64le": + default: + t.Skipf("not yet supported on %s", platform) + } + got := runTestProg(t, "testprogcgo", "CrashTraceback") + for i := 1; i <= 3; i++ { + if !strings.Contains(got, fmt.Sprintf("cgo symbolizer:%d", i)) { + t.Errorf("missing cgo symbolizer:%d", i) + } + } +} + +func TestCgoCrashTracebackGo(t *testing.T) { + t.Parallel() + switch platform := runtime.GOOS + "/" + runtime.GOARCH; platform { + case "darwin/amd64": + case "linux/amd64": + case "linux/arm64": + case "linux/ppc64le": + default: + t.Skipf("not yet supported on %s", platform) + } + got := runTestProg(t, "testprogcgo", "CrashTracebackGo") + for i := 1; i <= 3; i++ { + want := fmt.Sprintf("main.h%d", i) + if !strings.Contains(got, want) { + t.Errorf("missing %s", want) + } + } +} + 
+func TestCgoTracebackContext(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", "TracebackContext") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func TestCgoTracebackContextPreemption(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", "TracebackContextPreemption") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func testCgoPprof(t *testing.T, buildArg, runArg, top, bottom string) { + t.Parallel() + if runtime.GOOS != "linux" || (runtime.GOARCH != "amd64" && runtime.GOARCH != "ppc64le" && runtime.GOARCH != "arm64") { + t.Skipf("not yet supported on %s/%s", runtime.GOOS, runtime.GOARCH) + } + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprogcgo", buildArg) + if err != nil { + t.Fatal(err) + } + + cmd := testenv.CleanCmdEnv(exec.Command(exe, runArg)) + got, err := cmd.CombinedOutput() + if err != nil { + if testenv.Builder() == "linux-amd64-alpine" { + // See Issue 18243 and Issue 19938. + t.Skipf("Skipping failing test on Alpine (golang.org/issue/18243). Ignoring error: %v", err) + } + t.Fatalf("%s\n\n%v", got, err) + } + fn := strings.TrimSpace(string(got)) + defer os.Remove(fn) + + for try := 0; try < 2; try++ { + cmd := testenv.CleanCmdEnv(exec.Command(testenv.GoToolPath(t), "tool", "pprof", "-tagignore=ignore", "-traces")) + // Check that pprof works both with and without explicit executable on command line. 
+ if try == 0 { + cmd.Args = append(cmd.Args, exe, fn) + } else { + cmd.Args = append(cmd.Args, fn) + } + + found := false + for i, e := range cmd.Env { + if strings.HasPrefix(e, "PPROF_TMPDIR=") { + cmd.Env[i] = "PPROF_TMPDIR=" + os.TempDir() + found = true + break + } + } + if !found { + cmd.Env = append(cmd.Env, "PPROF_TMPDIR="+os.TempDir()) + } + + out, err := cmd.CombinedOutput() + t.Logf("%s:\n%s", cmd.Args, out) + if err != nil { + t.Error(err) + continue + } + + trace := findTrace(string(out), top) + if len(trace) == 0 { + t.Errorf("%s traceback missing.", top) + continue + } + if trace[len(trace)-1] != bottom { + t.Errorf("invalid traceback origin: got=%v; want=[%s ... %s]", trace, top, bottom) + } + } +} + +func TestCgoPprof(t *testing.T) { + testCgoPprof(t, "", "CgoPprof", "cpuHog", "runtime.main") +} + +func TestCgoPprofPIE(t *testing.T) { + testCgoPprof(t, "-buildmode=pie", "CgoPprof", "cpuHog", "runtime.main") +} + +func TestCgoPprofThread(t *testing.T) { + testCgoPprof(t, "", "CgoPprofThread", "cpuHogThread", "cpuHogThread2") +} + +func TestCgoPprofThreadNoTraceback(t *testing.T) { + testCgoPprof(t, "", "CgoPprofThreadNoTraceback", "cpuHogThread", "runtime._ExternalCode") +} + +func TestRaceProf(t *testing.T) { + if (runtime.GOOS != "linux" && runtime.GOOS != "freebsd") || runtime.GOARCH != "amd64" { + t.Skipf("not yet supported on %s/%s", runtime.GOOS, runtime.GOARCH) + } + + testenv.MustHaveGoRun(t) + + // This test requires building various packages with -race, so + // it's somewhat slow. 
+ if testing.Short() { + t.Skip("skipping test in -short mode") + } + + exe, err := buildTestProg(t, "testprogcgo", "-race") + if err != nil { + t.Fatal(err) + } + + got, err := testenv.CleanCmdEnv(exec.Command(exe, "CgoRaceprof")).CombinedOutput() + if err != nil { + t.Fatal(err) + } + want := "OK\n" + if string(got) != want { + t.Errorf("expected %q got %s", want, got) + } +} + +func TestRaceSignal(t *testing.T) { + t.Parallel() + if (runtime.GOOS != "linux" && runtime.GOOS != "freebsd") || runtime.GOARCH != "amd64" { + t.Skipf("not yet supported on %s/%s", runtime.GOOS, runtime.GOARCH) + } + + testenv.MustHaveGoRun(t) + + // This test requires building various packages with -race, so + // it's somewhat slow. + if testing.Short() { + t.Skip("skipping test in -short mode") + } + + exe, err := buildTestProg(t, "testprogcgo", "-race") + if err != nil { + t.Fatal(err) + } + + got, err := testenv.CleanCmdEnv(exec.Command(exe, "CgoRaceSignal")).CombinedOutput() + if err != nil { + t.Logf("%s\n", got) + t.Fatal(err) + } + want := "OK\n" + if string(got) != want { + t.Errorf("expected %q got %s", want, got) + } +} + +func TestCgoNumGoroutine(t *testing.T) { + switch runtime.GOOS { + case "windows", "plan9": + t.Skipf("skipping numgoroutine test on %s", runtime.GOOS) + } + t.Parallel() + got := runTestProg(t, "testprogcgo", "NumGoroutine") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func TestCatchPanic(t *testing.T) { + t.Parallel() + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no signals on %s", runtime.GOOS) + case "darwin": + if runtime.GOARCH == "amd64" { + t.Skipf("crash() on darwin/amd64 doesn't raise SIGABRT") + } + } + + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprogcgo") + if err != nil { + t.Fatal(err) + } + + for _, early := range []bool{true, false} { + cmd := testenv.CleanCmdEnv(exec.Command(exe, "CgoCatchPanic")) + // Make sure a panic results in a crash. 
+ cmd.Env = append(cmd.Env, "GOTRACEBACK=crash") + if early { + // Tell testprogcgo to install an early signal handler for SIGABRT + cmd.Env = append(cmd.Env, "CGOCATCHPANIC_EARLY_HANDLER=1") + } + if out, err := cmd.CombinedOutput(); err != nil { + t.Errorf("testprogcgo CgoCatchPanic failed: %v\n%s", err, out) + } + } +} + +func TestCgoLockOSThreadExit(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + t.Parallel() + testLockOSThreadExit(t, "testprogcgo") +} + +func TestWindowsStackMemoryCgo(t *testing.T) { + if runtime.GOOS != "windows" { + t.Skip("skipping windows specific test") + } + testenv.SkipFlaky(t, 22575) + o := runTestProg(t, "testprogcgo", "StackMemory") + stackUsage, err := strconv.Atoi(o) + if err != nil { + t.Fatalf("Failed to read stack usage: %v", err) + } + if expected, got := 100<<10, stackUsage; got > expected { + t.Fatalf("expected < %d bytes of memory per thread, got %d", expected, got) + } +} + +func TestSigStackSwapping(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no sigaltstack on %s", runtime.GOOS) + } + t.Parallel() + got := runTestProg(t, "testprogcgo", "SigStack") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func TestCgoTracebackSigpanic(t *testing.T) { + // Test unwinding over a sigpanic in C code without a C + // symbolizer. See issue #23576. + if runtime.GOOS == "windows" { + // On Windows if we get an exception in C code, we let + // the Windows exception handler unwind it, rather + // than injecting a sigpanic. + t.Skip("no sigpanic in C on windows") + } + t.Parallel() + got := runTestProg(t, "testprogcgo", "TracebackSigpanic") + t.Log(got) + want := "runtime.sigpanic" + if !strings.Contains(got, want) { + t.Errorf("did not see %q in output", want) + } + // No runtime errors like "runtime: unexpected return pc". 
+ nowant := "runtime: " + if strings.Contains(got, nowant) { + t.Errorf("unexpectedly saw %q in output", nowant) + } +} + +func TestCgoPanicCallback(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", "PanicCallback") + t.Log(got) + want := "panic: runtime error: invalid memory address or nil pointer dereference" + if !strings.Contains(got, want) { + t.Errorf("did not see %q in output", want) + } + want = "panic_callback" + if !strings.Contains(got, want) { + t.Errorf("did not see %q in output", want) + } + want = "PanicCallback" + if !strings.Contains(got, want) { + t.Errorf("did not see %q in output", want) + } + // No runtime errors like "runtime: unexpected return pc". + nowant := "runtime: " + if strings.Contains(got, nowant) { + t.Errorf("unexpectedly saw %q in output", nowant) + } +} + +// Test that C code called via cgo can use large Windows thread stacks +// and call back in to Go without crashing. See issue #20975. +// +// See also TestBigStackCallbackSyscall. +func TestBigStackCallbackCgo(t *testing.T) { + if runtime.GOOS != "windows" { + t.Skip("skipping windows specific test") + } + t.Parallel() + got := runTestProg(t, "testprogcgo", "BigStack") + want := "OK\n" + if got != want { + t.Errorf("expected %q got %v", want, got) + } +} + +func nextTrace(lines []string) ([]string, []string) { + var trace []string + for n, line := range lines { + if strings.HasPrefix(line, "---") { + return trace, lines[n+1:] + } + fields := strings.Fields(strings.TrimSpace(line)) + if len(fields) == 0 { + continue + } + // Last field contains the function name. + trace = append(trace, fields[len(fields)-1]) + } + return nil, nil +} + +func findTrace(text, top string) []string { + lines := strings.Split(text, "\n") + _, lines = nextTrace(lines) // Skip the header.
+ for len(lines) > 0 { + var t []string + t, lines = nextTrace(lines) + if len(t) == 0 { + continue + } + if t[0] == top { + return t + } + } + return nil +} + +func TestSegv(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no signals on %s", runtime.GOOS) + } + + for _, test := range []string{"Segv", "SegvInCgo", "TgkillSegv", "TgkillSegvInCgo"} { + test := test + + // The tgkill variants only run on Linux. + if runtime.GOOS != "linux" && strings.HasPrefix(test, "Tgkill") { + continue + } + + t.Run(test, func(t *testing.T) { + t.Parallel() + got := runTestProg(t, "testprogcgo", test) + t.Log(got) + want := "SIGSEGV" + if !strings.Contains(got, want) { + if runtime.GOOS == "darwin" && runtime.GOARCH == "amd64" && strings.Contains(got, "fatal: morestack on g0") { + testenv.SkipFlaky(t, 39457) + } + t.Errorf("did not see %q in output", want) + } + + // No runtime errors like "runtime: unknown pc". + switch runtime.GOOS { + case "darwin", "illumos", "solaris": + // Runtime sometimes throws when generating the traceback. + testenv.SkipFlaky(t, 49182) + case "linux": + if runtime.GOARCH == "386" { + // Runtime throws when generating a traceback from + // a VDSO call via asmcgocall. + testenv.SkipFlaky(t, 50504) + } + } + if test == "SegvInCgo" && strings.Contains(got, "unknown pc") { + testenv.SkipFlaky(t, 50979) + } + + for _, nowant := range []string{"fatal error: ", "runtime: "} { + if strings.Contains(got, nowant) { + if runtime.GOOS == "darwin" && strings.Contains(got, "0xb01dfacedebac1e") { + // See the comment in signal_darwin_amd64.go. + t.Skip("skipping due to Darwin handling of malformed addresses") + } + t.Errorf("unexpectedly saw %q in output", nowant) + } + } + }) + } +} + +func TestAbortInCgo(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + // N.B. On Windows, C abort() causes the program to exit + // without going through the runtime at all. 
+ t.Skipf("no signals on %s", runtime.GOOS) + } + + t.Parallel() + got := runTestProg(t, "testprogcgo", "Abort") + t.Log(got) + want := "SIGABRT" + if !strings.Contains(got, want) { + t.Errorf("did not see %q in output", want) + } + // No runtime errors like "runtime: unknown pc". + nowant := "runtime: " + if strings.Contains(got, nowant) { + t.Errorf("unexpectedly saw %q in output", nowant) + } +} + +// TestEINTR tests that we handle EINTR correctly. +// See issue #20400 and friends. +func TestEINTR(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no EINTR on %s", runtime.GOOS) + case "linux": + if runtime.GOARCH == "386" { + // On linux-386 the Go signal handler sets + // a restorer function that is not preserved + // by the C sigaction call in the test, + // causing the signal handler to crash when + // returning to the normal code. The test is not + // architecture-specific, so just skip on 386 + // rather than doing a complicated workaround. + t.Skip("skipping on linux-386; C sigaction does not preserve Go restorer") + } + } + + t.Parallel() + output := runTestProg(t, "testprogcgo", "EINTR") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +// Issue #42207. +func TestNeedmDeadlock(t *testing.T) { + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no signals on %s", runtime.GOOS) + } + output := runTestProg(t, "testprogcgo", "NeedmDeadlock") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestCgoTracebackGoroutineProfile(t *testing.T) { + output := runTestProg(t, "testprogcgo", "GoroutineProfile") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestCgoTraceParser(t *testing.T) { + // Test issue 29707.
+ switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + output := runTestProg(t, "testprogcgo", "CgoTraceParser") + want := "OK\n" + ErrTimeOrder := "ErrTimeOrder\n" + if output == ErrTimeOrder { + t.Skipf("skipping due to golang.org/issue/16755: %v", output) + } else if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestCgoTraceParserWithOneProc(t *testing.T) { + // Test issue 29707. + switch runtime.GOOS { + case "plan9", "windows": + t.Skipf("no pthreads on %s", runtime.GOOS) + } + output := runTestProg(t, "testprogcgo", "CgoTraceParser", "GOMAXPROCS=1") + want := "OK\n" + ErrTimeOrder := "ErrTimeOrder\n" + if output == ErrTimeOrder { + t.Skipf("skipping due to golang.org/issue/16755: %v", output) + } else if output != want { + t.Fatalf("GOMAXPROCS=1, want %s, got %s\n", want, output) + } +} + +func TestCgoSigfwd(t *testing.T) { + t.Parallel() + if !goos.IsUnix { + t.Skipf("no signals on %s", runtime.GOOS) + } + + got := runTestProg(t, "testprogcgo", "CgoSigfwd", "GO_TEST_CGOSIGFWD=1") + if want := "OK\n"; got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} diff --git a/src/runtime/crash_test.go b/src/runtime/crash_test.go new file mode 100644 index 0000000..309777d --- /dev/null +++ b/src/runtime/crash_test.go @@ -0,0 +1,868 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "bytes" + "errors" + "flag" + "fmt" + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "regexp" + "runtime" + "strings" + "sync" + "testing" + "time" +) + +var toRemove []string + +func TestMain(m *testing.M) { + status := m.Run() + for _, file := range toRemove { + os.RemoveAll(file) + } + os.Exit(status) +} + +var testprog struct { + sync.Mutex + dir string + target map[string]*buildexe +} + +type buildexe struct { + once sync.Once + exe string + err error +} + +func runTestProg(t *testing.T, binary, name string, env ...string) string { + if *flagQuick { + t.Skip("-quick") + } + + testenv.MustHaveGoBuild(t) + + exe, err := buildTestProg(t, binary) + if err != nil { + t.Fatal(err) + } + + return runBuiltTestProg(t, exe, name, env...) +} + +func runBuiltTestProg(t *testing.T, exe, name string, env ...string) string { + t.Helper() + + if *flagQuick { + t.Skip("-quick") + } + + start := time.Now() + + cmd := testenv.CleanCmdEnv(testenv.Command(t, exe, name)) + cmd.Env = append(cmd.Env, env...) 
+ if testing.Short() { + cmd.Env = append(cmd.Env, "RUNTIME_TEST_SHORT=1") + } + out, err := cmd.CombinedOutput() + if err == nil { + t.Logf("%v (%v): ok", cmd, time.Since(start)) + } else { + if _, ok := err.(*exec.ExitError); ok { + t.Logf("%v: %v", cmd, err) + } else if errors.Is(err, exec.ErrWaitDelay) { + t.Fatalf("%v: %v", cmd, err) + } else { + t.Fatalf("%v failed to start: %v", cmd, err) + } + } + return string(out) +} + +var serializeBuild = make(chan bool, 2) + +func buildTestProg(t *testing.T, binary string, flags ...string) (string, error) { + if *flagQuick { + t.Skip("-quick") + } + testenv.MustHaveGoBuild(t) + + testprog.Lock() + if testprog.dir == "" { + dir, err := os.MkdirTemp("", "go-build") + if err != nil { + t.Fatalf("failed to create temp directory: %v", err) + } + testprog.dir = dir + toRemove = append(toRemove, dir) + } + + if testprog.target == nil { + testprog.target = make(map[string]*buildexe) + } + name := binary + if len(flags) > 0 { + name += "_" + strings.Join(flags, "_") + } + target, ok := testprog.target[name] + if !ok { + target = &buildexe{} + testprog.target[name] = target + } + + dir := testprog.dir + + // Unlock testprog while actually building, so that other + // tests can look up executables that were already built. + testprog.Unlock() + + target.once.Do(func() { + // Only do two "go build"'s at a time, + // to keep load from getting too high. + serializeBuild <- true + defer func() { <-serializeBuild }() + + // Don't get confused if testenv.GoToolPath calls t.Skip. + target.err = errors.New("building test called t.Skip") + + exe := filepath.Join(dir, name+".exe") + + t.Logf("running go build -o %s %s", exe, strings.Join(flags, " ")) + cmd := exec.Command(testenv.GoToolPath(t), append([]string{"build", "-o", exe}, flags...)...) 
+ cmd.Dir = "testdata/" + binary + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + target.err = fmt.Errorf("building %s %v: %v\n%s", binary, flags, err, out) + } else { + target.exe = exe + target.err = nil + } + }) + + return target.exe, target.err +} + +func TestVDSO(t *testing.T) { + t.Parallel() + output := runTestProg(t, "testprog", "SignalInVDSO") + want := "success\n" + if output != want { + t.Fatalf("output:\n%s\n\nwanted:\n%s", output, want) + } +} + +func testCrashHandler(t *testing.T, cgo bool) { + type crashTest struct { + Cgo bool + } + var output string + if cgo { + output = runTestProg(t, "testprogcgo", "Crash") + } else { + output = runTestProg(t, "testprog", "Crash") + } + want := "main: recovered done\nnew-thread: recovered done\nsecond-new-thread: recovered done\nmain-again: recovered done\n" + if output != want { + t.Fatalf("output:\n%s\n\nwanted:\n%s", output, want) + } +} + +func TestCrashHandler(t *testing.T) { + testCrashHandler(t, false) +} + +func testDeadlock(t *testing.T, name string) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + output := runTestProg(t, "testprog", name) + want := "fatal error: all goroutines are asleep - deadlock!\n" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestSimpleDeadlock(t *testing.T) { + testDeadlock(t, "SimpleDeadlock") +} + +func TestInitDeadlock(t *testing.T) { + testDeadlock(t, "InitDeadlock") +} + +func TestLockedDeadlock(t *testing.T) { + testDeadlock(t, "LockedDeadlock") +} + +func TestLockedDeadlock2(t *testing.T) { + testDeadlock(t, "LockedDeadlock2") +} + +func TestGoexitDeadlock(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + output := runTestProg(t, "testprog", "GoexitDeadlock") + want := "no goroutines (main called runtime.Goexit) - deadlock!" 
+ if !strings.Contains(output, want) { + t.Fatalf("output:\n%s\n\nwant output containing: %s", output, want) + } +} + +func TestStackOverflow(t *testing.T) { + output := runTestProg(t, "testprog", "StackOverflow") + want := []string{ + "runtime: goroutine stack exceeds 1474560-byte limit\n", + "fatal error: stack overflow", + // information about the current SP and stack bounds + "runtime: sp=", + "stack=[", + } + if !strings.HasPrefix(output, want[0]) { + t.Errorf("output does not start with %q", want[0]) + } + for _, s := range want[1:] { + if !strings.Contains(output, s) { + t.Errorf("output does not contain %q", s) + } + } + if t.Failed() { + t.Logf("output:\n%s", output) + } +} + +func TestThreadExhaustion(t *testing.T) { + output := runTestProg(t, "testprog", "ThreadExhaustion") + want := "runtime: program exceeds 10-thread limit\nfatal error: thread exhaustion" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestRecursivePanic(t *testing.T) { + output := runTestProg(t, "testprog", "RecursivePanic") + want := `wrap: bad +panic: again + +` + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } + +} + +func TestRecursivePanic2(t *testing.T) { + output := runTestProg(t, "testprog", "RecursivePanic2") + want := `first panic +second panic +panic: third panic + +` + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } + +} + +func TestRecursivePanic3(t *testing.T) { + output := runTestProg(t, "testprog", "RecursivePanic3") + want := `panic: first panic + +` + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } + +} + +func TestRecursivePanic4(t *testing.T) { + output := runTestProg(t, "testprog", "RecursivePanic4") + want := `panic: first panic [recovered] + panic: second panic +` + if !strings.HasPrefix(output, want) { + t.Fatalf("output 
does not start with %q:\n%s", want, output) + } + +} + +func TestRecursivePanic5(t *testing.T) { + output := runTestProg(t, "testprog", "RecursivePanic5") + want := `first panic +second panic +panic: third panic +` + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } + +} + +func TestGoexitCrash(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + output := runTestProg(t, "testprog", "GoexitExit") + want := "no goroutines (main called runtime.Goexit) - deadlock!" + if !strings.Contains(output, want) { + t.Fatalf("output:\n%s\n\nwant output containing: %s", output, want) + } +} + +func TestGoexitDefer(t *testing.T) { + c := make(chan struct{}) + go func() { + defer func() { + r := recover() + if r != nil { + t.Errorf("non-nil recover during Goexit") + } + c <- struct{}{} + }() + runtime.Goexit() + }() + // Note: if the defer fails to run, we will get a deadlock here + <-c +} + +func TestGoNil(t *testing.T) { + output := runTestProg(t, "testprog", "GoNil") + want := "go of nil func value" + if !strings.Contains(output, want) { + t.Fatalf("output:\n%s\n\nwant output containing: %s", output, want) + } +} + +func TestMainGoroutineID(t *testing.T) { + output := runTestProg(t, "testprog", "MainGoroutineID") + want := "panic: test\n\ngoroutine 1 [running]:\n" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestNoHelperGoroutines(t *testing.T) { + output := runTestProg(t, "testprog", "NoHelperGoroutines") + matches := regexp.MustCompile(`goroutine [0-9]+ \[`).FindAllStringSubmatch(output, -1) + if len(matches) != 1 || matches[0][0] != "goroutine 1 [" { + t.Fatalf("want to see only goroutine 1, see:\n%s", output) + } +} + +func TestBreakpoint(t *testing.T) { + output := runTestProg(t, "testprog", "Breakpoint") + // If runtime.Breakpoint() is inlined, then the stack trace prints + 
// "runtime.Breakpoint(...)" instead of "runtime.Breakpoint()". + want := "runtime.Breakpoint(" + if !strings.Contains(output, want) { + t.Fatalf("output:\n%s\n\nwant output containing: %s", output, want) + } +} + +func TestGoexitInPanic(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + // see issue 8774: this code used to trigger an infinite recursion + output := runTestProg(t, "testprog", "GoexitInPanic") + want := "fatal error: no goroutines (main called runtime.Goexit) - deadlock!" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +// Issue 14965: Runtime panics should be of type runtime.Error +func TestRuntimePanicWithRuntimeError(t *testing.T) { + testCases := [...]func(){ + 0: func() { + var m map[uint64]bool + m[1234] = true + }, + 1: func() { + ch := make(chan struct{}) + close(ch) + close(ch) + }, + 2: func() { + var ch = make(chan struct{}) + close(ch) + ch <- struct{}{} + }, + 3: func() { + var s = make([]int, 2) + _ = s[2] + }, + 4: func() { + n := -1 + _ = make(chan bool, n) + }, + 5: func() { + close((chan bool)(nil)) + }, + } + + for i, fn := range testCases { + got := panicValue(fn) + if _, ok := got.(runtime.Error); !ok { + t.Errorf("test #%d: recovered value %v(type %T) does not implement runtime.Error", i, got, got) + } + } +} + +func panicValue(fn func()) (recovered any) { + defer func() { + recovered = recover() + }() + fn() + return +} + +func TestPanicAfterGoexit(t *testing.T) { + // an uncaught panic should still work after goexit + output := runTestProg(t, "testprog", "PanicAfterGoexit") + want := "panic: hello" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestRecoveredPanicAfterGoexit(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. 
+ testenv.MustInternalLink(t) + + output := runTestProg(t, "testprog", "RecoveredPanicAfterGoexit") + want := "fatal error: no goroutines (main called runtime.Goexit) - deadlock!" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestRecoverBeforePanicAfterGoexit(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + t.Parallel() + output := runTestProg(t, "testprog", "RecoverBeforePanicAfterGoexit") + want := "fatal error: no goroutines (main called runtime.Goexit) - deadlock!" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestRecoverBeforePanicAfterGoexit2(t *testing.T) { + // External linking brings in cgo, causing deadlock detection not working. + testenv.MustInternalLink(t) + + t.Parallel() + output := runTestProg(t, "testprog", "RecoverBeforePanicAfterGoexit2") + want := "fatal error: no goroutines (main called runtime.Goexit) - deadlock!" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestNetpollDeadlock(t *testing.T) { + t.Parallel() + output := runTestProg(t, "testprognet", "NetpollDeadlock") + want := "done\n" + if !strings.HasSuffix(output, want) { + t.Fatalf("output does not end with %q:\n%s", want, output) + } +} + +func TestPanicTraceback(t *testing.T) { + t.Parallel() + output := runTestProg(t, "testprog", "PanicTraceback") + want := "panic: hello\n\tpanic: panic pt2\n\tpanic: panic pt1\n" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } + + // Check functions in the traceback. 
+ fns := []string{"main.pt1.func1", "panic", "main.pt2.func1", "panic", "main.pt2", "main.pt1"} + for _, fn := range fns { + re := regexp.MustCompile(`(?m)^` + regexp.QuoteMeta(fn) + `\(.*\n`) + idx := re.FindStringIndex(output) + if idx == nil { + t.Fatalf("expected %q function in traceback:\n%s", fn, output) + } + output = output[idx[1]:] + } +} + +func testPanicDeadlock(t *testing.T, name string, want string) { + // test issue 14432 + output := runTestProg(t, "testprog", name) + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestPanicDeadlockGosched(t *testing.T) { + testPanicDeadlock(t, "GoschedInPanic", "panic: errorThatGosched\n\n") +} + +func TestPanicDeadlockSyscall(t *testing.T) { + testPanicDeadlock(t, "SyscallInPanic", "1\n2\npanic: 3\n\n") +} + +func TestPanicLoop(t *testing.T) { + output := runTestProg(t, "testprog", "PanicLoop") + if want := "panic while printing panic value"; !strings.Contains(output, want) { + t.Errorf("output does not contain %q:\n%s", want, output) + } +} + +func TestMemPprof(t *testing.T) { + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + + got, err := testenv.CleanCmdEnv(exec.Command(exe, "MemProf")).CombinedOutput() + if err != nil { + t.Fatal(err) + } + fn := strings.TrimSpace(string(got)) + defer os.Remove(fn) + + for try := 0; try < 2; try++ { + cmd := testenv.CleanCmdEnv(exec.Command(testenv.GoToolPath(t), "tool", "pprof", "-alloc_space", "-top")) + // Check that pprof works both with and without explicit executable on command line. 
+ if try == 0 { + cmd.Args = append(cmd.Args, exe, fn) + } else { + cmd.Args = append(cmd.Args, fn) + } + found := false + for i, e := range cmd.Env { + if strings.HasPrefix(e, "PPROF_TMPDIR=") { + cmd.Env[i] = "PPROF_TMPDIR=" + os.TempDir() + found = true + break + } + } + if !found { + cmd.Env = append(cmd.Env, "PPROF_TMPDIR="+os.TempDir()) + } + + top, err := cmd.CombinedOutput() + t.Logf("%s:\n%s", cmd.Args, top) + if err != nil { + t.Error(err) + } else if !bytes.Contains(top, []byte("MemProf")) { + t.Error("missing MemProf in pprof output") + } + } +} + +var concurrentMapTest = flag.Bool("run_concurrent_map_tests", false, "also run flaky concurrent map tests") + +func TestConcurrentMapWrites(t *testing.T) { + if !*concurrentMapTest { + t.Skip("skipping without -run_concurrent_map_tests") + } + testenv.MustHaveGoRun(t) + output := runTestProg(t, "testprog", "concurrentMapWrites") + want := "fatal error: concurrent map writes" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} +func TestConcurrentMapReadWrite(t *testing.T) { + if !*concurrentMapTest { + t.Skip("skipping without -run_concurrent_map_tests") + } + testenv.MustHaveGoRun(t) + output := runTestProg(t, "testprog", "concurrentMapReadWrite") + want := "fatal error: concurrent map read and map write" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} +func TestConcurrentMapIterateWrite(t *testing.T) { + if !*concurrentMapTest { + t.Skip("skipping without -run_concurrent_map_tests") + } + testenv.MustHaveGoRun(t) + output := runTestProg(t, "testprog", "concurrentMapIterateWrite") + want := "fatal error: concurrent map iteration and map write" + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +type point struct { + x, y *int +} + +func (p *point) negate() { + *p.x = *p.x * -1 + *p.y = *p.y * -1 +} + +// Test for issue #10152. 
+func TestPanicInlined(t *testing.T) { + defer func() { + r := recover() + if r == nil { + t.Fatalf("recover failed") + } + buf := make([]byte, 2048) + n := runtime.Stack(buf, false) + buf = buf[:n] + if !bytes.Contains(buf, []byte("(*point).negate(")) { + t.Fatalf("expecting stack trace to contain call to (*point).negate()") + } + }() + + pt := new(point) + pt.negate() +} + +// Test for issues #3934 and #20018. +// We want to delay exiting until a panic print is complete. +func TestPanicRace(t *testing.T) { + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + + // The test is intentionally racy, and in my testing does not + // produce the expected output about 0.05% of the time. + // So run the program in a loop and only fail the test if we + // get the wrong output ten times in a row. + const tries = 10 +retry: + for i := 0; i < tries; i++ { + got, err := testenv.CleanCmdEnv(exec.Command(exe, "PanicRace")).CombinedOutput() + if err == nil { + t.Logf("try %d: program exited successfully, should have failed", i+1) + continue + } + + if i > 0 { + t.Logf("try %d:\n", i+1) + } + t.Logf("%s\n", got) + + wants := []string{ + "panic: crash", + "PanicRace", + "created by ", + } + for _, want := range wants { + if !bytes.Contains(got, []byte(want)) { + t.Logf("did not find expected string %q", want) + continue retry + } + } + + // Test generated expected output. 
+ return + } + t.Errorf("test ran %d times without producing expected output", tries) +} + +func TestBadTraceback(t *testing.T) { + output := runTestProg(t, "testprog", "BadTraceback") + for _, want := range []string{ + "unexpected return pc", + "called from 0xbad", + "00000bad", // Smashed LR in hex dump + "<main.badLR", // Symbolization in hex dump (badLR1 or badLR2) + } { + if !strings.Contains(output, want) { + t.Errorf("output does not contain %q:\n%s", want, output) + } + } +} + +func TestTimePprof(t *testing.T) { + // This test is unreliable on any system in which nanotime + // calls into libc. + switch runtime.GOOS { + case "aix", "darwin", "illumos", "openbsd", "solaris": + t.Skipf("skipping on %s because nanotime calls libc", runtime.GOOS) + } + + // Pass GOTRACEBACK for issue #41120 to try to get more + // information on timeout. + fn := runTestProg(t, "testprog", "TimeProf", "GOTRACEBACK=crash") + fn = strings.TrimSpace(fn) + defer os.Remove(fn) + + cmd := testenv.CleanCmdEnv(exec.Command(testenv.GoToolPath(t), "tool", "pprof", "-top", "-nodecount=1", fn)) + cmd.Env = append(cmd.Env, "PPROF_TMPDIR="+os.TempDir()) + top, err := cmd.CombinedOutput() + t.Logf("%s", top) + if err != nil { + t.Error(err) + } else if bytes.Contains(top, []byte("ExternalCode")) { + t.Error("profiler refers to ExternalCode") + } +} + +// Test that runtime.abort does so. +func TestAbort(t *testing.T) { + // Pass GOTRACEBACK to ensure we get runtime frames. + output := runTestProg(t, "testprog", "Abort", "GOTRACEBACK=system") + if want := "runtime.abort"; !strings.Contains(output, want) { + t.Errorf("output does not contain %q:\n%s", want, output) + } + if strings.Contains(output, "BAD") { + t.Errorf("output contains BAD:\n%s", output) + } + // Check that it's a signal traceback. + want := "PC=" + // For systems that use a breakpoint, check specifically for that. 
+ switch runtime.GOARCH { + case "386", "amd64": + switch runtime.GOOS { + case "plan9": + want = "sys: breakpoint" + case "windows": + want = "Exception 0x80000003" + default: + want = "SIGTRAP" + } + } + if !strings.Contains(output, want) { + t.Errorf("output does not contain %q:\n%s", want, output) + } +} + +// For TestRuntimePanic: test a panic in the runtime package without +// involving the testing harness. +func init() { + if os.Getenv("GO_TEST_RUNTIME_PANIC") == "1" { + defer func() { + if r := recover(); r != nil { + // We expect to crash, so exit 0 + // to indicate failure. + os.Exit(0) + } + }() + runtime.PanicForTesting(nil, 1) + // We expect to crash, so exit 0 to indicate failure. + os.Exit(0) + } +} + +func TestRuntimePanic(t *testing.T) { + testenv.MustHaveExec(t) + cmd := testenv.CleanCmdEnv(exec.Command(os.Args[0], "-test.run=TestRuntimePanic")) + cmd.Env = append(cmd.Env, "GO_TEST_RUNTIME_PANIC=1") + out, err := cmd.CombinedOutput() + t.Logf("%s", out) + if err == nil { + t.Error("child process did not fail") + } else if want := "runtime.unexportedPanicForTesting"; !bytes.Contains(out, []byte(want)) { + t.Errorf("output did not contain expected string %q", want) + } +} + +// Test that g0 stack overflows are handled gracefully. +func TestG0StackOverflow(t *testing.T) { + testenv.MustHaveExec(t) + + switch runtime.GOOS { + case "darwin", "dragonfly", "freebsd", "linux", "netbsd", "openbsd", "android": + t.Skipf("g0 stack is wrong on pthread platforms (see golang.org/issue/26061)") + } + + if os.Getenv("TEST_G0_STACK_OVERFLOW") != "1" { + cmd := testenv.CleanCmdEnv(exec.Command(os.Args[0], "-test.run=TestG0StackOverflow", "-test.v")) + cmd.Env = append(cmd.Env, "TEST_G0_STACK_OVERFLOW=1") + out, err := cmd.CombinedOutput() + // Don't check err since it's expected to crash. + if n := strings.Count(string(out), "morestack on g0\n"); n != 1 { + t.Fatalf("%s\n(exit status %v)", out, err) + } + // Check that it's a signal-style traceback. 
+ if runtime.GOOS != "windows" { + if want := "PC="; !strings.Contains(string(out), want) { + t.Errorf("output does not contain %q:\n%s", want, out) + } + } + return + } + + runtime.G0StackOverflow() +} + +// Test that panic message is not clobbered. +// See issue 30150. +func TestDoublePanic(t *testing.T) { + output := runTestProg(t, "testprog", "DoublePanic", "GODEBUG=clobberfree=1") + wants := []string{"panic: XXX", "panic: YYY"} + for _, want := range wants { + if !strings.Contains(output, want) { + t.Errorf("output:\n%s\n\nwant output containing: %s", output, want) + } + } +} + +// Test that panic while panicking discards error message +// See issue 52257 +func TestPanicWhilePanicking(t *testing.T) { + tests := []struct { + Want string + Func string + }{ + { + "panic while printing panic value: important error message", + "ErrorPanic", + }, + { + "panic while printing panic value: important stringer message", + "StringerPanic", + }, + { + "panic while printing panic value: type", + "DoubleErrorPanic", + }, + { + "panic while printing panic value: type", + "DoubleStringerPanic", + }, + { + "panic while printing panic value: type", + "CircularPanic", + }, + { + "important string message", + "StringPanic", + }, + { + "nil", + "NilPanic", + }, + } + for _, x := range tests { + output := runTestProg(t, "testprog", x.Func) + if !strings.Contains(output, x.Want) { + t.Errorf("output does not contain %q:\n%s", x.Want, output) + } + } +} + +func TestPanicOnUnsafeSlice(t *testing.T) { + output := runTestProg(t, "testprog", "panicOnNilAndEleSizeIsZero") + want := "panic: runtime error: unsafe.Slice: ptr is nil and len is not zero" + if !strings.Contains(output, want) { + t.Errorf("output does not contain %q:\n%s", want, output) + } +} diff --git a/src/runtime/crash_unix_test.go b/src/runtime/crash_unix_test.go new file mode 100644 index 0000000..29d9c47 --- /dev/null +++ b/src/runtime/crash_unix_test.go @@ -0,0 +1,313 @@ +// Copyright 2012 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime_test + +import ( + "bytes" + "internal/testenv" + "io" + "os" + "os/exec" + "runtime" + "runtime/debug" + "sync" + "syscall" + "testing" + "time" + "unsafe" +) + +func init() { + if runtime.Sigisblocked(int(syscall.SIGQUIT)) { + // We can't use SIGQUIT to kill subprocesses because + // it's blocked. Use SIGKILL instead. See issue + // #19196 for an example of when this happens. + testenv.Sigquit = syscall.SIGKILL + } +} + +func TestBadOpen(t *testing.T) { + // make sure we get the correct error code if open fails. Same for + // read/write/close on the resulting -1 fd. See issue 10052. + nonfile := []byte("/notreallyafile") + fd := runtime.Open(&nonfile[0], 0, 0) + if fd != -1 { + t.Errorf("open(%q)=%d, want -1", nonfile, fd) + } + var buf [32]byte + r := runtime.Read(-1, unsafe.Pointer(&buf[0]), int32(len(buf))) + if got, want := r, -int32(syscall.EBADF); got != want { + t.Errorf("read()=%d, want %d", got, want) + } + w := runtime.Write(^uintptr(0), unsafe.Pointer(&buf[0]), int32(len(buf))) + if got, want := w, -int32(syscall.EBADF); got != want { + t.Errorf("write()=%d, want %d", got, want) + } + c := runtime.Close(-1) + if c != -1 { + t.Errorf("close()=%d, want -1", c) + } +} + +func TestCrashDumpsAllThreads(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + + switch runtime.GOOS { + case "darwin", "dragonfly", "freebsd", "linux", "netbsd", "openbsd", "illumos", "solaris": + default: + t.Skipf("skipping; not supported on %v", runtime.GOOS) + } + + if runtime.GOOS == "openbsd" && (runtime.GOARCH == "arm" || runtime.GOARCH == "mips64") { + // This may be ncpu < 2 related... 
+ t.Skipf("skipping; test fails on %s/%s - see issue #42464", runtime.GOOS, runtime.GOARCH) + } + + if runtime.Sigisblocked(int(syscall.SIGQUIT)) { + t.Skip("skipping; SIGQUIT is blocked, see golang.org/issue/19196") + } + + testenv.MustHaveGoBuild(t) + + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + + cmd := exec.Command(exe, "CrashDumpsAllThreads") + cmd = testenv.CleanCmdEnv(cmd) + cmd.Env = append(cmd.Env, + "GOTRACEBACK=crash", + // Set GOGC=off. Because of golang.org/issue/10958, the tight + // loops in the test program are not preemptible. If GC kicks + // in, it may lock up and prevent main from saying it's ready. + "GOGC=off", + // Set GODEBUG=asyncpreemptoff=1. If a thread is preempted + // when it receives SIGQUIT, it won't show the expected + // stack trace. See issue 35356. + "GODEBUG=asyncpreemptoff=1", + ) + + var outbuf bytes.Buffer + cmd.Stdout = &outbuf + cmd.Stderr = &outbuf + + rp, wp, err := os.Pipe() + if err != nil { + t.Fatal(err) + } + defer rp.Close() + + cmd.ExtraFiles = []*os.File{wp} + + if err := cmd.Start(); err != nil { + wp.Close() + t.Fatalf("starting program: %v", err) + } + + if err := wp.Close(); err != nil { + t.Logf("closing write pipe: %v", err) + } + if _, err := rp.Read(make([]byte, 1)); err != nil { + t.Fatalf("reading from pipe: %v", err) + } + + if err := cmd.Process.Signal(syscall.SIGQUIT); err != nil { + t.Fatalf("signal: %v", err) + } + + // No point in checking the error return from Wait--we expect + // it to fail. + cmd.Wait() + + // We want to see a stack trace for each thread. + // Before https://golang.org/cl/2811 running threads would say + // "goroutine running on other thread; stack unavailable". 
+ out := outbuf.Bytes() + n := bytes.Count(out, []byte("main.crashDumpsAllThreadsLoop(")) + if n != 4 { + t.Errorf("found %d instances of main.crashDumpsAllThreadsLoop; expected 4", n) + t.Logf("%s", out) + } +} + +func TestPanicSystemstack(t *testing.T) { + // Test that GOTRACEBACK=crash prints both the system and user + // stack of other threads. + + // The GOTRACEBACK=crash handler takes 0.1 seconds even if + // it's not writing a core file and potentially much longer if + // it is. Skip in short mode. + if testing.Short() { + t.Skip("Skipping in short mode (GOTRACEBACK=crash is slow)") + } + + if runtime.Sigisblocked(int(syscall.SIGQUIT)) { + t.Skip("skipping; SIGQUIT is blocked, see golang.org/issue/19196") + } + + t.Parallel() + cmd := exec.Command(os.Args[0], "testPanicSystemstackInternal") + cmd = testenv.CleanCmdEnv(cmd) + cmd.Env = append(cmd.Env, "GOTRACEBACK=crash") + pr, pw, err := os.Pipe() + if err != nil { + t.Fatal("creating pipe: ", err) + } + cmd.Stderr = pw + if err := cmd.Start(); err != nil { + t.Fatal("starting command: ", err) + } + defer cmd.Process.Wait() + defer cmd.Process.Kill() + if err := pw.Close(); err != nil { + t.Log("closing write pipe: ", err) + } + defer pr.Close() + + // Wait for "x\nx\n" to indicate almost-readiness. + buf := make([]byte, 4) + _, err = io.ReadFull(pr, buf) + if err != nil || string(buf) != "x\nx\n" { + t.Fatal("subprocess failed; output:\n", string(buf)) + } + + // The child blockers print "x\n" and then block on a lock. Receiving + // those bytes only indicates that the child is _about to block_. Since + // we don't have a way to know when it is fully blocked, sleep a bit to + // make us less likely to lose the race and signal before the child + // blocks. + time.Sleep(100 * time.Millisecond) + + // Send SIGQUIT. + if err := cmd.Process.Signal(syscall.SIGQUIT); err != nil { + t.Fatal("signaling subprocess: ", err) + } + + // Get traceback. 
+ tb, err := io.ReadAll(pr) + if err != nil { + t.Fatal("reading traceback from pipe: ", err) + } + + // Traceback should have two testPanicSystemstackInternal's + // and two blockOnSystemStackInternal's. + if bytes.Count(tb, []byte("testPanicSystemstackInternal")) != 2 { + t.Fatal("traceback missing user stack:\n", string(tb)) + } else if bytes.Count(tb, []byte("blockOnSystemStackInternal")) != 2 { + t.Fatal("traceback missing system stack:\n", string(tb)) + } +} + +func init() { + if len(os.Args) >= 2 && os.Args[1] == "testPanicSystemstackInternal" { + // Complete any in-flight GCs and disable future ones. We're going to + // block goroutines on runtime locks, which aren't ever preemptible for the + // GC to scan them. + runtime.GC() + debug.SetGCPercent(-1) + // Get two threads running on the system stack with + // something recognizable in the stack trace. + runtime.GOMAXPROCS(2) + go testPanicSystemstackInternal() + testPanicSystemstackInternal() + } +} + +func testPanicSystemstackInternal() { + runtime.BlockOnSystemStack() + os.Exit(1) // Should be unreachable. 
+} + +func TestSignalExitStatus(t *testing.T) { + testenv.MustHaveGoBuild(t) + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + err = testenv.CleanCmdEnv(exec.Command(exe, "SignalExitStatus")).Run() + if err == nil { + t.Error("test program succeeded unexpectedly") + } else if ee, ok := err.(*exec.ExitError); !ok { + t.Errorf("error (%v) has type %T; expected exec.ExitError", err, err) + } else if ws, ok := ee.Sys().(syscall.WaitStatus); !ok { + t.Errorf("error.Sys (%v) has type %T; expected syscall.WaitStatus", ee.Sys(), ee.Sys()) + } else if !ws.Signaled() || ws.Signal() != syscall.SIGTERM { + t.Errorf("got %v; expected SIGTERM", ee) + } +} + +func TestSignalIgnoreSIGTRAP(t *testing.T) { + if runtime.GOOS == "openbsd" { + testenv.SkipFlaky(t, 49725) + } + + output := runTestProg(t, "testprognet", "SignalIgnoreSIGTRAP") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestSignalDuringExec(t *testing.T) { + switch runtime.GOOS { + case "darwin", "dragonfly", "freebsd", "linux", "netbsd", "openbsd": + default: + t.Skipf("skipping test on %s", runtime.GOOS) + } + output := runTestProg(t, "testprognet", "SignalDuringExec") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestSignalM(t *testing.T) { + r, w, errno := runtime.Pipe() + if errno != 0 { + t.Fatal(syscall.Errno(errno)) + } + defer func() { + runtime.Close(r) + runtime.Close(w) + }() + runtime.Closeonexec(r) + runtime.Closeonexec(w) + + var want, got int64 + var wg sync.WaitGroup + ready := make(chan *runtime.M) + wg.Add(1) + go func() { + runtime.LockOSThread() + want, got = runtime.WaitForSigusr1(r, w, func(mp *runtime.M) { + ready <- mp + }) + runtime.UnlockOSThread() + wg.Done() + }() + waitingM := <-ready + runtime.SendSigusr1(waitingM) + + timer := time.AfterFunc(time.Second, func() { + // Write 1 to tell WaitForSigusr1 that we timed out. 
+ bw := byte(1) + if n := runtime.Write(uintptr(w), unsafe.Pointer(&bw), 1); n != 1 { + t.Errorf("pipe write failed: %d", n) + } + }) + defer timer.Stop() + + wg.Wait() + if got == -1 { + t.Fatal("signalM signal not received") + } else if want != got { + t.Fatalf("signal sent to M %d, but received on M %d", want, got) + } +} diff --git a/src/runtime/create_file_nounix.go b/src/runtime/create_file_nounix.go new file mode 100644 index 0000000..60f7517 --- /dev/null +++ b/src/runtime/create_file_nounix.go @@ -0,0 +1,14 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !unix + +package runtime + +const canCreateFile = false + +func create(name *byte, perm int32) int32 { + throw("unimplemented") + return -1 +} diff --git a/src/runtime/create_file_unix.go b/src/runtime/create_file_unix.go new file mode 100644 index 0000000..7280810 --- /dev/null +++ b/src/runtime/create_file_unix.go @@ -0,0 +1,14 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime + +const canCreateFile = true + +// create returns an fd to a write-only file. +func create(name *byte, perm int32) int32 { + return open(name, _O_CREAT|_O_WRONLY|_O_TRUNC, perm) +} diff --git a/src/runtime/debug.go b/src/runtime/debug.go new file mode 100644 index 0000000..669c36f --- /dev/null +++ b/src/runtime/debug.go @@ -0,0 +1,115 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// GOMAXPROCS sets the maximum number of CPUs that can be executing +// simultaneously and returns the previous setting. It defaults to +// the value of runtime.NumCPU. 
If n < 1, it does not change the current setting. +// This call will go away when the scheduler improves. +func GOMAXPROCS(n int) int { + if GOARCH == "wasm" && n > 1 { + n = 1 // WebAssembly has no threads yet, so only one CPU is possible. + } + + lock(&sched.lock) + ret := int(gomaxprocs) + unlock(&sched.lock) + if n <= 0 || n == ret { + return ret + } + + stopTheWorldGC("GOMAXPROCS") + + // newprocs will be processed by startTheWorld + newprocs = int32(n) + + startTheWorldGC() + return ret +} + +// NumCPU returns the number of logical CPUs usable by the current process. +// +// The set of available CPUs is checked by querying the operating system +// at process startup. Changes to operating system CPU allocation after +// process startup are not reflected. +func NumCPU() int { + return int(ncpu) +} + +// NumCgoCall returns the number of cgo calls made by the current process. +func NumCgoCall() int64 { + var n = int64(atomic.Load64(&ncgocall)) + for mp := (*m)(atomic.Loadp(unsafe.Pointer(&allm))); mp != nil; mp = mp.alllink { + n += int64(mp.ncgocall) + } + return n +} + +// NumGoroutine returns the number of goroutines that currently exist. +func NumGoroutine() int { + return int(gcount()) +} + +//go:linkname debug_modinfo runtime/debug.modinfo +func debug_modinfo() string { + return modinfo +} + +// mayMoreStackPreempt is a maymorestack hook that forces a preemption +// at every possible cooperative preemption point. +// +// This is valuable to apply to the runtime, which can be sensitive to +// preemption points. To apply this to all preemption points in the +// runtime and runtime-like code, use the following in bash or zsh: +// +// X=(-{gc,asm}flags={runtime/...,reflect,sync}=-d=maymorestack=runtime.mayMoreStackPreempt) GOFLAGS=${X[@]} +// +// This must be deeply nosplit because it is called from a function +// prologue before the stack is set up and because the compiler will +// call it from any splittable prologue (leading to infinite +// recursion). 
+// +// Ideally it should also use very little stack because the linker +// doesn't currently account for this in nosplit stack depth checking. +// +// Ensure mayMoreStackPreempt can be called for all ABIs. +// +//go:nosplit +//go:linkname mayMoreStackPreempt +func mayMoreStackPreempt() { + // Don't do anything on the g0 or gsignal stack. + gp := getg() + if gp == gp.m.g0 || gp == gp.m.gsignal { + return + } + // Force a preemption, unless the stack is already poisoned. + if gp.stackguard0 < stackPoisonMin { + gp.stackguard0 = stackPreempt + } +} + +// mayMoreStackMove is a maymorestack hook that forces stack movement +// at every possible point. +// +// See mayMoreStackPreempt. +// +//go:nosplit +//go:linkname mayMoreStackMove +func mayMoreStackMove() { + // Don't do anything on the g0 or gsignal stack. + gp := getg() + if gp == gp.m.g0 || gp == gp.m.gsignal { + return + } + // Force stack movement, unless the stack is already poisoned. + if gp.stackguard0 < stackPoisonMin { + gp.stackguard0 = stackForceMove + } +} diff --git a/src/runtime/debug/debug.s b/src/runtime/debug/debug.s new file mode 100644 index 0000000..6aae33a --- /dev/null +++ b/src/runtime/debug/debug.s @@ -0,0 +1,9 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Nothing to see here. +// This file exists so that the go command knows that parts of the +// package are implemented in C, so that it does not instruct the +// Go compiler to complain about extern declarations. +// The actual implementations are in package runtime. diff --git a/src/runtime/debug/garbage.go b/src/runtime/debug/garbage.go new file mode 100644 index 0000000..0f53928 --- /dev/null +++ b/src/runtime/debug/garbage.go @@ -0,0 +1,238 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package debug + +import ( + "runtime" + "sort" + "time" +) + +// GCStats collects information about recent garbage collections. +type GCStats struct { + LastGC time.Time // time of last collection + NumGC int64 // number of garbage collections + PauseTotal time.Duration // total pause for all collections + Pause []time.Duration // pause history, most recent first + PauseEnd []time.Time // pause end times history, most recent first + PauseQuantiles []time.Duration +} + +// ReadGCStats reads statistics about garbage collection into stats. +// The number of entries in the pause history is system-dependent; +// the stats.Pause slice will be reused if large enough, reallocated otherwise. +// ReadGCStats may use the full capacity of the stats.Pause slice. +// If stats.PauseQuantiles is non-empty, ReadGCStats fills it with quantiles +// summarizing the distribution of pause time. For example, if +// len(stats.PauseQuantiles) is 5, it will be filled with the minimum, +// 25%, 50%, 75%, and maximum pause times. +func ReadGCStats(stats *GCStats) { + // Create a buffer with space for at least two copies of the + // pause history tracked by the runtime. One will be returned + // to the caller and the other will be used as a transfer buffer + // for end times history and as a temporary buffer for + // computing quantiles. + const maxPause = len(((*runtime.MemStats)(nil)).PauseNs) + if cap(stats.Pause) < 2*maxPause+3 { + stats.Pause = make([]time.Duration, 2*maxPause+3) + } + + // readGCStats fills in the pause and end times histories (up to + // maxPause entries) and then three more: Unix ns time of last GC, + // number of GC, and total pause time in nanoseconds. Here we + // depend on the fact that time.Duration's native unit is + // nanoseconds, so the pauses and the total pause time do not need + // any conversion. 
+ readGCStats(&stats.Pause) + n := len(stats.Pause) - 3 + stats.LastGC = time.Unix(0, int64(stats.Pause[n])) + stats.NumGC = int64(stats.Pause[n+1]) + stats.PauseTotal = stats.Pause[n+2] + n /= 2 // buffer holds pauses and end times + stats.Pause = stats.Pause[:n] + + if cap(stats.PauseEnd) < maxPause { + stats.PauseEnd = make([]time.Time, 0, maxPause) + } + stats.PauseEnd = stats.PauseEnd[:0] + for _, ns := range stats.Pause[n : n+n] { + stats.PauseEnd = append(stats.PauseEnd, time.Unix(0, int64(ns))) + } + + if len(stats.PauseQuantiles) > 0 { + if n == 0 { + for i := range stats.PauseQuantiles { + stats.PauseQuantiles[i] = 0 + } + } else { + // There's room for a second copy of the data in stats.Pause. + // See the allocation at the top of the function. + sorted := stats.Pause[n : n+n] + copy(sorted, stats.Pause) + sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] }) + nq := len(stats.PauseQuantiles) - 1 + for i := 0; i < nq; i++ { + stats.PauseQuantiles[i] = sorted[len(sorted)*i/nq] + } + stats.PauseQuantiles[nq] = sorted[len(sorted)-1] + } + } +} + +// SetGCPercent sets the garbage collection target percentage: +// a collection is triggered when the ratio of freshly allocated data +// to live data remaining after the previous collection reaches this percentage. +// SetGCPercent returns the previous setting. +// The initial setting is the value of the GOGC environment variable +// at startup, or 100 if the variable is not set. +// This setting may be effectively reduced in order to maintain a memory +// limit. +// A negative percentage effectively disables garbage collection, unless +// the memory limit is reached. +// See SetMemoryLimit for more details. +func SetGCPercent(percent int) int { + return int(setGCPercent(int32(percent))) +} + +// FreeOSMemory forces a garbage collection followed by an +// attempt to return as much memory to the operating system +// as possible. 
(Even if this is not called, the runtime gradually +// returns memory to the operating system in a background task.) +func FreeOSMemory() { + freeOSMemory() +} + +// SetMaxStack sets the maximum amount of memory that +// can be used by a single goroutine stack. +// If any goroutine exceeds this limit while growing its stack, +// the program crashes. +// SetMaxStack returns the previous setting. +// The initial setting is 1 GB on 64-bit systems, 250 MB on 32-bit systems. +// There may be a system-imposed maximum stack limit regardless +// of the value provided to SetMaxStack. +// +// SetMaxStack is useful mainly for limiting the damage done by +// goroutines that enter an infinite recursion. It only limits future +// stack growth. +func SetMaxStack(bytes int) int { + return setMaxStack(bytes) +} + +// SetMaxThreads sets the maximum number of operating system +// threads that the Go program can use. If it attempts to use more than +// this many, the program crashes. +// SetMaxThreads returns the previous setting. +// The initial setting is 10,000 threads. +// +// The limit controls the number of operating system threads, not the number +// of goroutines. A Go program creates a new thread only when a goroutine +// is ready to run but all the existing threads are blocked in system calls, cgo calls, +// or are locked to other goroutines due to use of runtime.LockOSThread. +// +// SetMaxThreads is useful mainly for limiting the damage done by +// programs that create an unbounded number of threads. The idea is +// to take down the program before it takes down the operating system. +func SetMaxThreads(threads int) int { + return setMaxThreads(threads) +} + +// SetPanicOnFault controls the runtime's behavior when a program faults +// at an unexpected (non-nil) address. Such faults are typically caused by +// bugs such as runtime memory corruption, so the default response is to crash +// the program. 
Programs working with memory-mapped files or unsafe +// manipulation of memory may cause faults at non-nil addresses in less +// dramatic situations; SetPanicOnFault allows such programs to request +// that the runtime trigger only a panic, not a crash. +// The runtime.Error that the runtime panics with may have an additional method: +// +// Addr() uintptr +// +// If that method exists, it returns the memory address which triggered the fault. +// The results of Addr are best-effort and the veracity of the result +// may depend on the platform. +// SetPanicOnFault applies only to the current goroutine. +// It returns the previous setting. +func SetPanicOnFault(enabled bool) bool { + return setPanicOnFault(enabled) +} + +// WriteHeapDump writes a description of the heap and the objects in +// it to the given file descriptor. +// +// WriteHeapDump suspends the execution of all goroutines until the heap +// dump is completely written. Thus, the file descriptor must not be +// connected to a pipe or socket whose other end is in the same Go +// process; instead, use a temporary file or network socket. +// +// The heap dump format is defined at https://golang.org/s/go15heapdump. +func WriteHeapDump(fd uintptr) + +// SetTraceback sets the amount of detail printed by the runtime in +// the traceback it prints before exiting due to an unrecovered panic +// or an internal runtime error. +// The level argument takes the same values as the GOTRACEBACK +// environment variable. For example, SetTraceback("all") ensures +// that the program prints all goroutines when it crashes. +// See the package runtime documentation for details. +// If SetTraceback is called with a level lower than that of the +// environment variable, the call is ignored. +func SetTraceback(level string) + +// SetMemoryLimit provides the runtime with a soft memory limit.
+// +// The runtime undertakes several processes to try to respect this +// memory limit, including adjustments to the frequency of garbage +// collections and returning memory to the underlying system more +// aggressively. This limit will be respected even if GOGC=off (or, +// if SetGCPercent(-1) is executed). +// +// The input limit is provided as bytes, and includes all memory +// mapped, managed, and not released by the Go runtime. Notably, it +// does not account for space used by the Go binary and memory +// external to Go, such as memory managed by the underlying system +// on behalf of the process, or memory managed by non-Go code inside +// the same process. Examples of excluded memory sources include: OS +// kernel memory held on behalf of the process, memory allocated by +// C code, and memory mapped by syscall.Mmap (because it is not +// managed by the Go runtime). +// +// More specifically, the following expression accurately reflects +// the value the runtime attempts to maintain as the limit: +// +// runtime.MemStats.Sys - runtime.MemStats.HeapReleased +// +// or in terms of the runtime/metrics package: +// +// /memory/classes/total:bytes - /memory/classes/heap/released:bytes +// +// A zero limit or a limit that's lower than the amount of memory +// used by the Go runtime may cause the garbage collector to run +// nearly continuously. However, the application may still make +// progress. +// +// The memory limit is always respected by the Go runtime, so to +// effectively disable this behavior, set the limit very high. +// math.MaxInt64 is the canonical value for disabling the limit, +// but values much greater than the available memory on the underlying +// system work just as well. +// +// See https://go.dev/doc/gc-guide for a detailed guide explaining +// the soft memory limit in more detail, as well as a variety of common +// use-cases and scenarios. 
+// +// The initial setting is math.MaxInt64 unless the GOMEMLIMIT +// environment variable is set, in which case it provides the initial +// setting. GOMEMLIMIT is a numeric value in bytes with an optional +// unit suffix. The supported suffixes include B, KiB, MiB, GiB, and +// TiB. These suffixes represent quantities of bytes as defined by +// the IEC 80000-13 standard. That is, they are based on powers of +// two: KiB means 2^10 bytes, MiB means 2^20 bytes, and so on. +// +// SetMemoryLimit returns the previously set memory limit. +// A negative input does not adjust the limit, and allows for +// retrieval of the currently set memory limit. +func SetMemoryLimit(limit int64) int64 { + return setMemoryLimit(limit) +} diff --git a/src/runtime/debug/garbage_test.go b/src/runtime/debug/garbage_test.go new file mode 100644 index 0000000..7213bbe --- /dev/null +++ b/src/runtime/debug/garbage_test.go @@ -0,0 +1,238 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug_test + +import ( + "internal/testenv" + "os" + "runtime" + . "runtime/debug" + "testing" + "time" +) + +func TestReadGCStats(t *testing.T) { + defer SetGCPercent(SetGCPercent(-1)) + + var stats GCStats + var mstats runtime.MemStats + var min, max time.Duration + + // First ReadGCStats will allocate, second should not, + // especially if we follow up with an explicit garbage collection. + stats.PauseQuantiles = make([]time.Duration, 10) + ReadGCStats(&stats) + runtime.GC() + + // Assume these will return same data: no GC during ReadGCStats. 
+ ReadGCStats(&stats) + runtime.ReadMemStats(&mstats) + + if stats.NumGC != int64(mstats.NumGC) { + t.Errorf("stats.NumGC = %d, but mstats.NumGC = %d", stats.NumGC, mstats.NumGC) + } + if stats.PauseTotal != time.Duration(mstats.PauseTotalNs) { + t.Errorf("stats.PauseTotal = %d, but mstats.PauseTotalNs = %d", stats.PauseTotal, mstats.PauseTotalNs) + } + if stats.LastGC.UnixNano() != int64(mstats.LastGC) { + t.Errorf("stats.LastGC.UnixNano = %d, but mstats.LastGC = %d", stats.LastGC.UnixNano(), mstats.LastGC) + } + n := int(mstats.NumGC) + if n > len(mstats.PauseNs) { + n = len(mstats.PauseNs) + } + if len(stats.Pause) != n { + t.Errorf("len(stats.Pause) = %d, want %d", len(stats.Pause), n) + } else { + off := (int(mstats.NumGC) + len(mstats.PauseNs) - 1) % len(mstats.PauseNs) + for i := 0; i < n; i++ { + dt := stats.Pause[i] + if dt != time.Duration(mstats.PauseNs[off]) { + t.Errorf("stats.Pause[%d] = %d, want %d", i, dt, mstats.PauseNs[off]) + } + if max < dt { + max = dt + } + if min > dt || i == 0 { + min = dt + } + off = (off + len(mstats.PauseNs) - 1) % len(mstats.PauseNs) + } + } + + q := stats.PauseQuantiles + nq := len(q) + if q[0] != min || q[nq-1] != max { + t.Errorf("stats.PauseQuantiles = [%d, ..., %d], want [%d, ..., %d]", q[0], q[nq-1], min, max) + } + + for i := 0; i < nq-1; i++ { + if q[i] > q[i+1] { + t.Errorf("stats.PauseQuantiles[%d]=%d > stats.PauseQuantiles[%d]=%d", i, q[i], i+1, q[i+1]) + } + } + + // compare memory stats with gc stats: + if len(stats.PauseEnd) != n { + t.Fatalf("len(stats.PauseEnd) = %d, want %d", len(stats.PauseEnd), n) + } + off := (int(mstats.NumGC) + len(mstats.PauseEnd) - 1) % len(mstats.PauseEnd) + for i := 0; i < n; i++ { + dt := stats.PauseEnd[i] + if dt.UnixNano() != int64(mstats.PauseEnd[off]) { + t.Errorf("stats.PauseEnd[%d] = %d, want %d", i, dt.UnixNano(), mstats.PauseEnd[off]) + } + off = (off + len(mstats.PauseEnd) - 1) % len(mstats.PauseEnd) + } +} + +var big []byte + +func TestFreeOSMemory(t *testing.T) { + 
// Tests FreeOSMemory by making big susceptible to collection + // and checking that at least that much memory is returned to + // the OS after. + + const bigBytes = 32 << 20 + big = make([]byte, bigBytes) + + // Make sure any in-progress GCs are complete. + runtime.GC() + + var before runtime.MemStats + runtime.ReadMemStats(&before) + + // Clear the last reference to the big allocation, making it + // susceptible to collection. + big = nil + + // FreeOSMemory runs a GC cycle before releasing memory, + // so it's fine to skip a GC here. + // + // It's possible the background scavenger runs concurrently + // with this function and does most of the work for it. + // If that happens, it's OK. What we want is a test that fails + // often if FreeOSMemory does not work correctly, and a test + // that passes every time if it does. + FreeOSMemory() + + var after runtime.MemStats + runtime.ReadMemStats(&after) + + // Check to make sure that the big allocation (now freed) + // had its memory shift into HeapReleased as a result of that + // FreeOSMemory. + if after.HeapReleased <= before.HeapReleased { + t.Fatalf("no memory released: %d -> %d", before.HeapReleased, after.HeapReleased) + } + + // Check to make sure bigBytes was released, plus some slack. Pages may get + // allocated in between the two measurements above for a variety of reasons, + // most commonly for GC work bufs. Since this can get fairly high, depending + // on scheduling and what GOMAXPROCS is, give a lot of slack up-front. + // + // Add a little more slack too if the page size is bigger than the runtime page size. + // "big" could end up unaligned on its ends, forcing the scavenger to skip at worst + // 2x pages. + slack := uint64(bigBytes / 2) + pageSize := uint64(os.Getpagesize()) + if pageSize > 8<<10 { + slack += pageSize * 2 + } + if slack > bigBytes { + // We basically already checked this.
+ return + } + if after.HeapReleased-before.HeapReleased < bigBytes-slack { + t.Fatalf("less than %d released: %d -> %d", bigBytes, before.HeapReleased, after.HeapReleased) + } +} + +var ( + setGCPercentBallast any + setGCPercentSink any +) + +func TestSetGCPercent(t *testing.T) { + testenv.SkipFlaky(t, 20076) + + // Test that the variable is being set and returned correctly. + old := SetGCPercent(123) + new := SetGCPercent(old) + if new != 123 { + t.Errorf("SetGCPercent(123); SetGCPercent(x) = %d, want 123", new) + } + + // Test that the percentage is implemented correctly. + defer func() { + SetGCPercent(old) + setGCPercentBallast, setGCPercentSink = nil, nil + }() + SetGCPercent(100) + runtime.GC() + // Create 100 MB of live heap as a baseline. + const baseline = 100 << 20 + var ms runtime.MemStats + runtime.ReadMemStats(&ms) + setGCPercentBallast = make([]byte, baseline-ms.Alloc) + runtime.GC() + runtime.ReadMemStats(&ms) + if abs64(baseline-int64(ms.Alloc)) > 10<<20 { + t.Fatalf("failed to set up baseline live heap; got %d MB, want %d MB", ms.Alloc>>20, baseline>>20) + } + // NextGC should be ~200 MB. + const thresh = 20 << 20 // TODO: Figure out why this is so noisy on some builders + if want := int64(2 * baseline); abs64(want-int64(ms.NextGC)) > thresh { + t.Errorf("NextGC = %d MB, want %d±%d MB", ms.NextGC>>20, want>>20, thresh>>20) + } + // Create some garbage, but not enough to trigger another GC. + for i := 0; i < int(1.2*baseline); i += 1 << 10 { + setGCPercentSink = make([]byte, 1<<10) + } + setGCPercentSink = nil + // Adjust GOGC to 50. NextGC should be ~150 MB. + SetGCPercent(50) + runtime.ReadMemStats(&ms) + if want := int64(1.5 * baseline); abs64(want-int64(ms.NextGC)) > thresh { + t.Errorf("NextGC = %d MB, want %d±%d MB", ms.NextGC>>20, want>>20, thresh>>20) + } + + // Trigger a GC and get back to 100 MB live with GOGC=100. + SetGCPercent(100) + runtime.GC() + // Raise live to 120 MB. 
+ setGCPercentSink = make([]byte, int(0.2*baseline)) + // Lower GOGC to 10. This must force a GC. + runtime.ReadMemStats(&ms) + ngc1 := ms.NumGC + SetGCPercent(10) + // It may require an allocation to actually force the GC. + setGCPercentSink = make([]byte, 1<<20) + runtime.ReadMemStats(&ms) + ngc2 := ms.NumGC + if ngc1 == ngc2 { + t.Errorf("expected GC to run but it did not") + } +} + +func abs64(a int64) int64 { + if a < 0 { + return -a + } + return a +} + +func TestSetMaxThreadsOvf(t *testing.T) { + // Verify that a big threads count will not overflow the int32 + // maxmcount variable, causing a panic (see Issue 16076). + // + // This can only happen when ints are 64 bits, since on platforms + // with 32 bit ints SetMaxThreads (which takes an int parameter) + // cannot be given anything that will overflow an int32. + // + // Call SetMaxThreads with 1<<31, but only on 64 bit systems. + nt := SetMaxThreads(1 << (30 + ^uint(0)>>63)) + SetMaxThreads(nt) // restore previous value +} diff --git a/src/runtime/debug/heapdump_test.go b/src/runtime/debug/heapdump_test.go new file mode 100644 index 0000000..ee6b054 --- /dev/null +++ b/src/runtime/debug/heapdump_test.go @@ -0,0 +1,95 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug_test + +import ( + "os" + "runtime" + . 
"runtime/debug" + "testing" +) + +func TestWriteHeapDumpNonempty(t *testing.T) { + if runtime.GOOS == "js" { + t.Skipf("WriteHeapDump is not available on %s.", runtime.GOOS) + } + f, err := os.CreateTemp("", "heapdumptest") + if err != nil { + t.Fatalf("TempFile failed: %v", err) + } + defer os.Remove(f.Name()) + defer f.Close() + WriteHeapDump(f.Fd()) + fi, err := f.Stat() + if err != nil { + t.Fatalf("Stat failed: %v", err) + } + const minSize = 1 + if size := fi.Size(); size < minSize { + t.Fatalf("Heap dump size %d bytes, expected at least %d bytes", size, minSize) + } +} + +type Obj struct { + x, y int +} + +func objfin(x *Obj) { + //println("finalized", x) +} + +func TestWriteHeapDumpFinalizers(t *testing.T) { + if runtime.GOOS == "js" { + t.Skipf("WriteHeapDump is not available on %s.", runtime.GOOS) + } + f, err := os.CreateTemp("", "heapdumptest") + if err != nil { + t.Fatalf("TempFile failed: %v", err) + } + defer os.Remove(f.Name()) + defer f.Close() + + // bug 9172: WriteHeapDump couldn't handle more than one finalizer + println("allocating objects") + x := &Obj{} + runtime.SetFinalizer(x, objfin) + y := &Obj{} + runtime.SetFinalizer(y, objfin) + + // Trigger collection of x and y, queueing of their finalizers. + println("starting gc") + runtime.GC() + + // Make sure WriteHeapDump doesn't fail with multiple queued finalizers. 
+ println("starting dump") + WriteHeapDump(f.Fd()) + println("done dump") +} + +type G[T any] struct{} +type I interface { + M() +} + +//go:noinline +func (g G[T]) M() {} + +var dummy I = G[int]{} +var dummy2 I = G[G[int]]{} + +func TestWriteHeapDumpTypeName(t *testing.T) { + if runtime.GOOS == "js" { + t.Skipf("WriteHeapDump is not available on %s.", runtime.GOOS) + } + f, err := os.CreateTemp("", "heapdumptest") + if err != nil { + t.Fatalf("TempFile failed: %v", err) + } + defer os.Remove(f.Name()) + defer f.Close() + WriteHeapDump(f.Fd()) + dummy.M() + dummy2.M() +} diff --git a/src/runtime/debug/mod.go b/src/runtime/debug/mod.go new file mode 100644 index 0000000..8b7a423 --- /dev/null +++ b/src/runtime/debug/mod.go @@ -0,0 +1,287 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug + +import ( + "fmt" + "runtime" + "strconv" + "strings" +) + +// exported from runtime. +func modinfo() string + +// ReadBuildInfo returns the build information embedded +// in the running binary. The information is available only +// in binaries built with module support. +func ReadBuildInfo() (info *BuildInfo, ok bool) { + data := modinfo() + if len(data) < 32 { + return nil, false + } + data = data[16 : len(data)-16] + bi, err := ParseBuildInfo(data) + if err != nil { + return nil, false + } + + // The go version is stored separately from other build info, mostly for + // historical reasons. It is not part of the modinfo() string, and + // ParseBuildInfo does not recognize it. We inject it here to hide this + // awkwardness from the user. + bi.GoVersion = runtime.Version() + + return bi, true +} + +// BuildInfo represents the build information read from a Go binary. +type BuildInfo struct { + // GoVersion is the version of the Go toolchain that built the binary + // (for example, "go1.19.2"). 
+ GoVersion string + + // Path is the package path of the main package for the binary + // (for example, "golang.org/x/tools/cmd/stringer"). + Path string + + // Main describes the module that contains the main package for the binary. + Main Module + + // Deps describes all the dependency modules, both direct and indirect, + // that contributed packages to the build of this binary. + Deps []*Module + + // Settings describes the build settings used to build the binary. + Settings []BuildSetting +} + +// A Module describes a single module included in a build. +type Module struct { + Path string // module path + Version string // module version + Sum string // checksum + Replace *Module // replaced by this module +} + +// A BuildSetting is a key-value pair describing one setting that influenced a build. +// +// Defined keys include: +// +// - -buildmode: the buildmode flag used (typically "exe") +// - -compiler: the compiler toolchain flag used (typically "gc") +// - CGO_ENABLED: the effective CGO_ENABLED environment variable +// - CGO_CFLAGS: the effective CGO_CFLAGS environment variable +// - CGO_CPPFLAGS: the effective CGO_CPPFLAGS environment variable +// - CGO_CXXFLAGS: the effective CGO_CXXFLAGS environment variable +// - CGO_LDFLAGS: the effective CGO_LDFLAGS environment variable +// - GOARCH: the architecture target +// - GOAMD64/GOARM64/GO386/etc: the architecture feature level for GOARCH +// - GOOS: the operating system target +// - vcs: the version control system for the source tree where the build ran +// - vcs.revision: the revision identifier for the current commit or checkout +// - vcs.time: the modification time associated with vcs.revision, in RFC3339 format +// - vcs.modified: true or false indicating whether the source tree had local modifications +type BuildSetting struct { + // Key and Value describe the build setting. + // Key must not contain an equals sign, space, tab, or newline. + // Value must not contain newlines ('\n').
+ Key, Value string +} + +// quoteKey reports whether key is required to be quoted. +func quoteKey(key string) bool { + return len(key) == 0 || strings.ContainsAny(key, "= \t\r\n\"`") +} + +// quoteValue reports whether value is required to be quoted. +func quoteValue(value string) bool { + return strings.ContainsAny(value, " \t\r\n\"`") +} + +func (bi *BuildInfo) String() string { + buf := new(strings.Builder) + if bi.GoVersion != "" { + fmt.Fprintf(buf, "go\t%s\n", bi.GoVersion) + } + if bi.Path != "" { + fmt.Fprintf(buf, "path\t%s\n", bi.Path) + } + var formatMod func(string, Module) + formatMod = func(word string, m Module) { + buf.WriteString(word) + buf.WriteByte('\t') + buf.WriteString(m.Path) + buf.WriteByte('\t') + buf.WriteString(m.Version) + if m.Replace == nil { + buf.WriteByte('\t') + buf.WriteString(m.Sum) + } else { + buf.WriteByte('\n') + formatMod("=>", *m.Replace) + } + buf.WriteByte('\n') + } + if bi.Main != (Module{}) { + formatMod("mod", bi.Main) + } + for _, dep := range bi.Deps { + formatMod("dep", *dep) + } + for _, s := range bi.Settings { + key := s.Key + if quoteKey(key) { + key = strconv.Quote(key) + } + value := s.Value + if quoteValue(value) { + value = strconv.Quote(value) + } + fmt.Fprintf(buf, "build\t%s=%s\n", key, value) + } + + return buf.String() +} + +func ParseBuildInfo(data string) (bi *BuildInfo, err error) { + lineNum := 1 + defer func() { + if err != nil { + err = fmt.Errorf("could not parse Go build info: line %d: %w", lineNum, err) + } + }() + + var ( + pathLine = "path\t" + modLine = "mod\t" + depLine = "dep\t" + repLine = "=>\t" + buildLine = "build\t" + newline = "\n" + tab = "\t" + ) + + readModuleLine := func(elem []string) (Module, error) { + if len(elem) != 2 && len(elem) != 3 { + return Module{}, fmt.Errorf("expected 2 or 3 columns; got %d", len(elem)) + } + version := elem[1] + sum := "" + if len(elem) == 3 { + sum = elem[2] + } + return Module{ + Path: elem[0], + Version: version, + Sum: sum, + }, nil + } + + 
bi = new(BuildInfo) + var ( + last *Module + line string + ok bool + ) + // Reverse of BuildInfo.String(), except for go version. + for len(data) > 0 { + line, data, ok = strings.Cut(data, newline) + if !ok { + break + } + switch { + case strings.HasPrefix(line, pathLine): + elem := line[len(pathLine):] + bi.Path = string(elem) + case strings.HasPrefix(line, modLine): + elem := strings.Split(line[len(modLine):], tab) + last = &bi.Main + *last, err = readModuleLine(elem) + if err != nil { + return nil, err + } + case strings.HasPrefix(line, depLine): + elem := strings.Split(line[len(depLine):], tab) + last = new(Module) + bi.Deps = append(bi.Deps, last) + *last, err = readModuleLine(elem) + if err != nil { + return nil, err + } + case strings.HasPrefix(line, repLine): + elem := strings.Split(line[len(repLine):], tab) + if len(elem) != 3 { + return nil, fmt.Errorf("expected 3 columns for replacement; got %d", len(elem)) + } + if last == nil { + return nil, fmt.Errorf("replacement with no module on previous line") + } + last.Replace = &Module{ + Path: string(elem[0]), + Version: string(elem[1]), + Sum: string(elem[2]), + } + last = nil + case strings.HasPrefix(line, buildLine): + kv := line[len(buildLine):] + if len(kv) < 1 { + return nil, fmt.Errorf("build line missing '='") + } + + var key, rawValue string + switch kv[0] { + case '=': + return nil, fmt.Errorf("build line with missing key") + + case '`', '"': + rawKey, err := strconv.QuotedPrefix(kv) + if err != nil { + return nil, fmt.Errorf("invalid quoted key in build line") + } + if len(kv) == len(rawKey) { + return nil, fmt.Errorf("build line missing '=' after quoted key") + } + if c := kv[len(rawKey)]; c != '=' { + return nil, fmt.Errorf("unexpected character after quoted key: %q", c) + } + key, _ = strconv.Unquote(rawKey) + rawValue = kv[len(rawKey)+1:] + + default: + var ok bool + key, rawValue, ok = strings.Cut(kv, "=") + if !ok { + return nil, fmt.Errorf("build line missing '=' after key") + } + if 
quoteKey(key) { + return nil, fmt.Errorf("unquoted key %q must be quoted", key) + } + } + + var value string + if len(rawValue) > 0 { + switch rawValue[0] { + case '`', '"': + var err error + value, err = strconv.Unquote(rawValue) + if err != nil { + return nil, fmt.Errorf("invalid quoted value in build line") + } + + default: + value = rawValue + if quoteValue(value) { + return nil, fmt.Errorf("unquoted value %q must be quoted", value) + } + } + } + + bi.Settings = append(bi.Settings, BuildSetting{Key: key, Value: value}) + } + lineNum++ + } + return bi, nil +} diff --git a/src/runtime/debug/mod_test.go b/src/runtime/debug/mod_test.go new file mode 100644 index 0000000..b291769 --- /dev/null +++ b/src/runtime/debug/mod_test.go @@ -0,0 +1,75 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug_test + +import ( + "reflect" + "runtime/debug" + "strings" + "testing" +) + +// strip removes two leading tabs after each newline of s. +func strip(s string) string { + replaced := strings.ReplaceAll(s, "\n\t\t", "\n") + if len(replaced) > 0 && replaced[0] == '\n' { + replaced = replaced[1:] + } + return replaced +} + +func FuzzParseBuildInfoRoundTrip(f *testing.F) { + // Package built from outside a module, missing some fields. + f.Add(strip(` + path rsc.io/fortune + mod rsc.io/fortune v1.0.0 + `)) + + // Package built from the standard library, missing some fields. + f.Add(`path cmd/test2json`) + + // Package built from inside a module. + f.Add(strip(` + go 1.18 + path example.com/m + mod example.com/m (devel) + build -compiler=gc + `)) + + // Package built in GOPATH mode. + f.Add(strip(` + go 1.18 + path example.com/m + build -compiler=gc + `)) + + // Escaped build info.
+ f.Add(strip(` + go 1.18 + path example.com/m + build CRAZY_ENV="requires\nescaping" + `)) + + f.Fuzz(func(t *testing.T, s string) { + bi, err := debug.ParseBuildInfo(s) + if err != nil { + // Not a round-trippable BuildInfo string. + t.Log(err) + return + } + + // s2 could have different escaping from s. + // However, it should parse to exactly the same contents. + s2 := bi.String() + bi2, err := debug.ParseBuildInfo(s2) + if err != nil { + t.Fatalf("%v:\n%s", err, s2) + } + + if !reflect.DeepEqual(bi2, bi) { + t.Fatalf("Parsed representation differs.\ninput:\n%s\noutput:\n%s", s, s2) + } + }) +} diff --git a/src/runtime/debug/panic_test.go b/src/runtime/debug/panic_test.go new file mode 100644 index 0000000..ec5294c --- /dev/null +++ b/src/runtime/debug/panic_test.go @@ -0,0 +1,56 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd + +// TODO: test on Windows? 
+ +package debug_test + +import ( + "runtime" + "runtime/debug" + "syscall" + "testing" + "unsafe" +) + +func TestPanicOnFault(t *testing.T) { + if runtime.GOARCH == "s390x" { + t.Skip("s390x fault addresses are missing the low order bits") + } + if runtime.GOOS == "ios" { + t.Skip("iOS doesn't provide fault addresses") + } + if runtime.GOOS == "netbsd" && runtime.GOARCH == "arm" { + t.Skip("netbsd-arm doesn't provide fault address (golang.org/issue/45026)") + } + m, err := syscall.Mmap(-1, 0, 0x1000, syscall.PROT_READ /* Note: no PROT_WRITE */, syscall.MAP_SHARED|syscall.MAP_ANON) + if err != nil { + t.Fatalf("can't map anonymous memory: %s", err) + } + defer syscall.Munmap(m) + old := debug.SetPanicOnFault(true) + defer debug.SetPanicOnFault(old) + const lowBits = 0x3e7 + defer func() { + r := recover() + if r == nil { + t.Fatalf("write did not fault") + } + type addressable interface { + Addr() uintptr + } + a, ok := r.(addressable) + if !ok { + t.Fatalf("fault does not contain address") + } + want := uintptr(unsafe.Pointer(&m[lowBits])) + got := a.Addr() + if got != want { + t.Fatalf("fault address %x, want %x", got, want) + } + }() + m[lowBits] = 1 // will fault +} diff --git a/src/runtime/debug/stack.go b/src/runtime/debug/stack.go new file mode 100644 index 0000000..5d810af --- /dev/null +++ b/src/runtime/debug/stack.go @@ -0,0 +1,30 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package debug contains facilities for programs to debug themselves while +// they are running. +package debug + +import ( + "os" + "runtime" +) + +// PrintStack prints to standard error the stack trace returned by runtime.Stack. +func PrintStack() { + os.Stderr.Write(Stack()) +} + +// Stack returns a formatted stack trace of the goroutine that calls it. +// It calls runtime.Stack with a large enough buffer to capture the entire trace. 
+func Stack() []byte { + buf := make([]byte, 1024) + for { + n := runtime.Stack(buf, false) + if n < len(buf) { + return buf[:n] + } + buf = make([]byte, 2*len(buf)) + } +} diff --git a/src/runtime/debug/stack_test.go b/src/runtime/debug/stack_test.go new file mode 100644 index 0000000..671057c --- /dev/null +++ b/src/runtime/debug/stack_test.go @@ -0,0 +1,121 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug_test + +import ( + "bytes" + "fmt" + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "runtime" + . "runtime/debug" + "strings" + "testing" +) + +func TestMain(m *testing.M) { + if os.Getenv("GO_RUNTIME_DEBUG_TEST_DUMP_GOROOT") != "" { + fmt.Println(runtime.GOROOT()) + os.Exit(0) + } + os.Exit(m.Run()) +} + +type T int + +func (t *T) ptrmethod() []byte { + return Stack() +} +func (t T) method() []byte { + return t.ptrmethod() +} + +/* +The traceback should look something like this, modulo line numbers and hex constants. +Don't worry much about the base levels, but check the ones in our own package. + + goroutine 10 [running]: + runtime/debug.Stack(0x0, 0x0, 0x0) + /Users/r/go/src/runtime/debug/stack.go:28 +0x80 + runtime/debug.(*T).ptrmethod(0xc82005ee70, 0x0, 0x0, 0x0) + /Users/r/go/src/runtime/debug/stack_test.go:15 +0x29 + runtime/debug.T.method(0x0, 0x0, 0x0, 0x0) + /Users/r/go/src/runtime/debug/stack_test.go:18 +0x32 + runtime/debug.TestStack(0xc8201ce000) + /Users/r/go/src/runtime/debug/stack_test.go:37 +0x38 + testing.tRunner(0xc8201ce000, 0x664b58) + /Users/r/go/src/testing/testing.go:456 +0x98 + created by testing.RunTests + /Users/r/go/src/testing/testing.go:561 +0x86d +*/ +func TestStack(t *testing.T) { + b := T(0).method() + lines := strings.Split(string(b), "\n") + if len(lines) < 6 { + t.Fatal("too few lines") + } + + // If built with -trimpath, file locations should start with package paths. 
+ // Otherwise, file locations should start with a GOROOT/src prefix + // (for whatever value of GOROOT is baked into the binary, not the one + // that may be set in the environment). + fileGoroot := "" + if envGoroot := os.Getenv("GOROOT"); envGoroot != "" { + // Since GOROOT is set explicitly in the environment, we can't be certain + // that it is the same GOROOT value baked into the binary, and we can't + // change the value in-process because runtime.GOROOT uses the value from + // initial (not current) environment. Spawn a subprocess to determine the + // real baked-in GOROOT. + t.Logf("found GOROOT %q from environment; checking embedded GOROOT value", envGoroot) + testenv.MustHaveExec(t) + exe, err := os.Executable() + if err != nil { + t.Fatal(err) + } + cmd := exec.Command(exe) + cmd.Env = append(os.Environ(), "GOROOT=", "GO_RUNTIME_DEBUG_TEST_DUMP_GOROOT=1") + out, err := cmd.Output() + if err != nil { + t.Fatal(err) + } + fileGoroot = string(bytes.TrimSpace(out)) + } else { + // Since GOROOT is not set in the environment, its value (if any) must come + // from the path embedded in the binary. 
+ fileGoroot = runtime.GOROOT() + } + filePrefix := "" + if fileGoroot != "" { + filePrefix = filepath.ToSlash(fileGoroot) + "/src/" + } + + n := 0 + frame := func(file, code string) { + t.Helper() + + line := lines[n] + if !strings.Contains(line, code) { + t.Errorf("expected %q in %q", code, line) + } + n++ + + line = lines[n] + + wantPrefix := "\t" + filePrefix + file + if !strings.HasPrefix(line, wantPrefix) { + t.Errorf("in line %q, expected prefix %q", line, wantPrefix) + } + n++ + } + n++ + + frame("runtime/debug/stack.go", "runtime/debug.Stack") + frame("runtime/debug/stack_test.go", "runtime/debug_test.(*T).ptrmethod") + frame("runtime/debug/stack_test.go", "runtime/debug_test.T.method") + frame("runtime/debug/stack_test.go", "runtime/debug_test.TestStack") + frame("testing/testing.go", "") +} diff --git a/src/runtime/debug/stubs.go b/src/runtime/debug/stubs.go new file mode 100644 index 0000000..913d4b9 --- /dev/null +++ b/src/runtime/debug/stubs.go @@ -0,0 +1,18 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package debug + +import ( + "time" +) + +// Implemented in package runtime. +func readGCStats(*[]time.Duration) +func freeOSMemory() +func setMaxStack(int) int +func setGCPercent(int32) int32 +func setPanicOnFault(bool) bool +func setMaxThreads(int) int +func setMemoryLimit(int64) int64 diff --git a/src/runtime/debug_test.go b/src/runtime/debug_test.go new file mode 100644 index 0000000..75fe07e --- /dev/null +++ b/src/runtime/debug_test.go @@ -0,0 +1,307 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// TODO: This test could be implemented on all (most?) UNIXes if we +// added syscall.Tgkill more widely. 
+ +// We skip all of these tests under race mode because our test thread +// spends all of its time in the race runtime, which isn't a safe +// point. + +//go:build (amd64 || arm64) && linux && !race + +package runtime_test + +import ( + "fmt" + "internal/abi" + "math" + "os" + "regexp" + "runtime" + "runtime/debug" + "sync/atomic" + "syscall" + "testing" +) + +func startDebugCallWorker(t *testing.T) (g *runtime.G, after func()) { + // This can deadlock if run under a debugger because it + // depends on catching SIGTRAP, which is usually swallowed by + // a debugger. + skipUnderDebugger(t) + + // This can deadlock if there aren't enough threads or if a GC + // tries to interrupt an atomic loop (see issue #10958). Execute + // an extra GC to ensure even the sweep phase is done (out of + // caution to prevent #49370 from happening). + // TODO(mknyszek): This extra GC cycle is likely unnecessary + // because preemption (which may happen during the sweep phase) + // isn't much of an issue anymore thanks to asynchronous preemption. + // The biggest risk is having a write barrier in the debug call + // injection test code fire, because it runs in a signal handler + // and may not have a P. + // + // We use 8 Ps so there's room for the debug call worker, + // something that's trying to preempt the call worker, and the + // goroutine that's trying to stop the call worker. + ogomaxprocs := runtime.GOMAXPROCS(8) + ogcpercent := debug.SetGCPercent(-1) + runtime.GC() + + // ready is a buffered channel so debugCallWorker won't block + // on sending to it. This makes it less likely we'll catch + // debugCallWorker while it's in the runtime. 
+ ready := make(chan *runtime.G, 1) + var stop uint32 + done := make(chan error) + go debugCallWorker(ready, &stop, done) + g = <-ready + return g, func() { + atomic.StoreUint32(&stop, 1) + err := <-done + if err != nil { + t.Fatal(err) + } + runtime.GOMAXPROCS(ogomaxprocs) + debug.SetGCPercent(ogcpercent) + } +} + +func debugCallWorker(ready chan<- *runtime.G, stop *uint32, done chan<- error) { + runtime.LockOSThread() + defer runtime.UnlockOSThread() + + ready <- runtime.Getg() + + x := 2 + debugCallWorker2(stop, &x) + if x != 1 { + done <- fmt.Errorf("want x = 1, got %d; register pointer not adjusted?", x) + } + close(done) +} + +// Don't inline this function, since we want to test adjusting +// pointers in the arguments. +// +//go:noinline +func debugCallWorker2(stop *uint32, x *int) { + for atomic.LoadUint32(stop) == 0 { + // Strongly encourage x to live in a register so we + // can test pointer register adjustment. + *x++ + } + *x = 1 +} + +func debugCallTKill(tid int) error { + return syscall.Tgkill(syscall.Getpid(), tid, syscall.SIGTRAP) +} + +// skipUnderDebugger skips the current test when running under a +// debugger (specifically if this process has a tracer). This is +// Linux-specific. +func skipUnderDebugger(t *testing.T) { + pid := syscall.Getpid() + status, err := os.ReadFile(fmt.Sprintf("/proc/%d/status", pid)) + if err != nil { + t.Logf("couldn't get proc tracer: %s", err) + return + } + re := regexp.MustCompile(`TracerPid:\s+([0-9]+)`) + sub := re.FindSubmatch(status) + if sub == nil { + t.Logf("couldn't find proc tracer PID") + return + } + if string(sub[1]) == "0" { + return + } + t.Skip("test will deadlock under a debugger") +} + +func TestDebugCall(t *testing.T) { + g, after := startDebugCallWorker(t) + defer after() + + type stackArgs struct { + x0 int + x1 float64 + y0Ret int + y1Ret float64 + } + + // Inject a call into the debugCallWorker goroutine and test + // basic argument and result passing. 
+ fn := func(x int, y float64) (y0Ret int, y1Ret float64) { + return x + 1, y + 1.0 + } + var args *stackArgs + var regs abi.RegArgs + intRegs := regs.Ints[:] + floatRegs := regs.Floats[:] + fval := float64(42.0) + if len(intRegs) > 0 { + intRegs[0] = 42 + floatRegs[0] = math.Float64bits(fval) + } else { + args = &stackArgs{ + x0: 42, + x1: 42.0, + } + } + + if _, err := runtime.InjectDebugCall(g, fn, &regs, args, debugCallTKill, false); err != nil { + t.Fatal(err) + } + var result0 int + var result1 float64 + if len(intRegs) > 0 { + result0 = int(intRegs[0]) + result1 = math.Float64frombits(floatRegs[0]) + } else { + result0 = args.y0Ret + result1 = args.y1Ret + } + if result0 != 43 { + t.Errorf("want 43, got %d", result0) + } + if result1 != fval+1 { + t.Errorf("want 43, got %f", result1) + } +} + +func TestDebugCallLarge(t *testing.T) { + g, after := startDebugCallWorker(t) + defer after() + + // Inject a call with a large call frame. + const N = 128 + var args struct { + in [N]int + out [N]int + } + fn := func(in [N]int) (out [N]int) { + for i := range in { + out[i] = in[i] + 1 + } + return + } + var want [N]int + for i := range args.in { + args.in[i] = i + want[i] = i + 1 + } + if _, err := runtime.InjectDebugCall(g, fn, nil, &args, debugCallTKill, false); err != nil { + t.Fatal(err) + } + if want != args.out { + t.Fatalf("want %v, got %v", want, args.out) + } +} + +func TestDebugCallGC(t *testing.T) { + g, after := startDebugCallWorker(t) + defer after() + + // Inject a call that performs a GC. + if _, err := runtime.InjectDebugCall(g, runtime.GC, nil, nil, debugCallTKill, false); err != nil { + t.Fatal(err) + } +} + +func TestDebugCallGrowStack(t *testing.T) { + g, after := startDebugCallWorker(t) + defer after() + + // Inject a call that grows the stack. debugCallWorker checks + // for stack pointer breakage. 
+ if _, err := runtime.InjectDebugCall(g, func() { growStack(nil) }, nil, nil, debugCallTKill, false); err != nil { + t.Fatal(err) + } +} + +//go:nosplit +func debugCallUnsafePointWorker(gpp **runtime.G, ready, stop *uint32) { + // The nosplit causes this function to not contain safe-points + // except at calls. + runtime.LockOSThread() + defer runtime.UnlockOSThread() + + *gpp = runtime.Getg() + + for atomic.LoadUint32(stop) == 0 { + atomic.StoreUint32(ready, 1) + } +} + +func TestDebugCallUnsafePoint(t *testing.T) { + skipUnderDebugger(t) + + // This can deadlock if there aren't enough threads or if a GC + // tries to interrupt an atomic loop (see issue #10958). + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(8)) + + // InjectDebugCall cannot be executed while a GC is actively in + // progress. Wait until the current GC is done, and turn it off. + // + // See #49370. + runtime.GC() + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + + // Test that the runtime refuses call injection at unsafe points. + var g *runtime.G + var ready, stop uint32 + defer atomic.StoreUint32(&stop, 1) + go debugCallUnsafePointWorker(&g, &ready, &stop) + for atomic.LoadUint32(&ready) == 0 { + runtime.Gosched() + } + + _, err := runtime.InjectDebugCall(g, func() {}, nil, nil, debugCallTKill, true) + if msg := "call not at safe point"; err == nil || err.Error() != msg { + t.Fatalf("want %q, got %s", msg, err) + } +} + +func TestDebugCallPanic(t *testing.T) { + skipUnderDebugger(t) + + // This can deadlock if there aren't enough threads. + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(8)) + + // InjectDebugCall cannot be executed while a GC is actively in + // progress. Wait until the current GC is done, and turn it off. + // + // See #10958 and #49370. 
+ defer debug.SetGCPercent(debug.SetGCPercent(-1)) + // TODO(mknyszek): This extra GC cycle is likely unnecessary + // because preemption (which may happen during the sweep phase) + // isn't much of an issue anymore thanks to asynchronous preemption. + // The biggest risk is having a write barrier in the debug call + // injection test code fire, because it runs in a signal handler + // and may not have a P. + runtime.GC() + + ready := make(chan *runtime.G) + var stop uint32 + defer atomic.StoreUint32(&stop, 1) + go func() { + runtime.LockOSThread() + defer runtime.UnlockOSThread() + ready <- runtime.Getg() + for atomic.LoadUint32(&stop) == 0 { + } + }() + g := <-ready + + p, err := runtime.InjectDebugCall(g, func() { panic("test") }, nil, nil, debugCallTKill, false) + if err != nil { + t.Fatal(err) + } + if ps, ok := p.(string); !ok || ps != "test" { + t.Fatalf("wanted panic %v, got %v", "test", p) + } +} diff --git a/src/runtime/debugcall.go b/src/runtime/debugcall.go new file mode 100644 index 0000000..a4393b1 --- /dev/null +++ b/src/runtime/debugcall.go @@ -0,0 +1,252 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 || arm64 + +package runtime + +import "unsafe" + +const ( + debugCallSystemStack = "executing on Go runtime stack" + debugCallUnknownFunc = "call from unknown function" + debugCallRuntime = "call from within the Go runtime" + debugCallUnsafePoint = "call not at safe point" +) + +func debugCallV2() +func debugCallPanicked(val any) + +// debugCallCheck checks whether it is safe to inject a debugger +// function call with return PC pc. If not, it returns a string +// explaining why. +// +//go:nosplit +func debugCallCheck(pc uintptr) string { + // No user calls from the system stack. 
+ if getg() != getg().m.curg { + return debugCallSystemStack + } + if sp := getcallersp(); !(getg().stack.lo < sp && sp <= getg().stack.hi) { + // Fast syscalls (nanotime) and racecall switch to the + // g0 stack without switching g. We can't safely make + // a call in this state. (We can't even safely + // systemstack.) + return debugCallSystemStack + } + + // Switch to the system stack to avoid overflowing the user + // stack. + var ret string + systemstack(func() { + f := findfunc(pc) + if !f.valid() { + ret = debugCallUnknownFunc + return + } + + name := funcname(f) + + switch name { + case "debugCall32", + "debugCall64", + "debugCall128", + "debugCall256", + "debugCall512", + "debugCall1024", + "debugCall2048", + "debugCall4096", + "debugCall8192", + "debugCall16384", + "debugCall32768", + "debugCall65536": + // These functions are allowed so that the debugger can initiate multiple function calls. + // See: https://golang.org/cl/161137/ + return + } + + // Disallow calls from the runtime. We could + // potentially make this condition tighter (e.g., not + // when locks are held), but there are enough tightly + // coded sequences (e.g., defer handling) that it's + // better to play it safe. + if pfx := "runtime."; len(name) > len(pfx) && name[:len(pfx)] == pfx { + ret = debugCallRuntime + return + } + + // Check that this isn't an unsafe-point. + if pc != f.entry() { + pc-- + } + up := pcdatavalue(f, _PCDATA_UnsafePoint, pc, nil) + if up != _PCDATA_UnsafePointSafe { + // Not at a safe point. + ret = debugCallUnsafePoint + } + }) + return ret +} + +// debugCallWrap starts a new goroutine to run a debug call and blocks +// the calling goroutine. On the goroutine, it prepares to recover +// panics from the debug call, and then calls the call dispatching +// function at PC dispatch. +// +// This must be deeply nosplit because there are untyped values on the +// stack from debugCallV2. 
+// +//go:nosplit +func debugCallWrap(dispatch uintptr) { + var lockedm bool + var lockedExt uint32 + callerpc := getcallerpc() + gp := getg() + + // Create a new goroutine to execute the call on. Run this on + // the system stack to avoid growing our stack. + systemstack(func() { + // TODO(mknyszek): It would be nice to wrap these arguments in an allocated + // closure and start the goroutine with that closure, but the compiler disallows + // implicit closure allocation in the runtime. + fn := debugCallWrap1 + newg := newproc1(*(**funcval)(unsafe.Pointer(&fn)), gp, callerpc) + args := &debugCallWrapArgs{ + dispatch: dispatch, + callingG: gp, + } + newg.param = unsafe.Pointer(args) + + // If the current G is locked, then transfer that + // locked-ness to the new goroutine. + if gp.lockedm != 0 { + // Save lock state to restore later. + mp := gp.m + if mp != gp.lockedm.ptr() { + throw("inconsistent lockedm") + } + + lockedm = true + lockedExt = mp.lockedExt + + // Transfer external lock count to internal so + // it can't be unlocked from the debug call. + mp.lockedInt++ + mp.lockedExt = 0 + + mp.lockedg.set(newg) + newg.lockedm.set(mp) + gp.lockedm = 0 + } + + // Mark the calling goroutine as being at an async + // safe-point, since it has a few conservative frames + // at the bottom of the stack. This also prevents + // stack shrinks. + gp.asyncSafePoint = true + + // Stash newg away so we can execute it below (mcall's + // closure can't capture anything). + gp.schedlink.set(newg) + }) + + // Switch to the new goroutine. + mcall(func(gp *g) { + // Get newg. + newg := gp.schedlink.ptr() + gp.schedlink = 0 + + // Park the calling goroutine. + if trace.enabled { + traceGoPark(traceEvGoBlock, 1) + } + casGToWaiting(gp, _Grunning, waitReasonDebugCall) + dropg() + + // Directly execute the new goroutine. The debug + // protocol will continue on the new goroutine, so + // it's important we not just let the scheduler do + // this or it may resume a different goroutine. 
+ execute(newg, true) + }) + + // We'll resume here when the call returns. + + // Restore locked state. + if lockedm { + mp := gp.m + mp.lockedExt = lockedExt + mp.lockedInt-- + mp.lockedg.set(gp) + gp.lockedm.set(mp) + } + + gp.asyncSafePoint = false +} + +type debugCallWrapArgs struct { + dispatch uintptr + callingG *g +} + +// debugCallWrap1 is the continuation of debugCallWrap on the callee +// goroutine. +func debugCallWrap1() { + gp := getg() + args := (*debugCallWrapArgs)(gp.param) + dispatch, callingG := args.dispatch, args.callingG + gp.param = nil + + // Dispatch call and trap panics. + debugCallWrap2(dispatch) + + // Resume the caller goroutine. + getg().schedlink.set(callingG) + mcall(func(gp *g) { + callingG := gp.schedlink.ptr() + gp.schedlink = 0 + + // Unlock this goroutine from the M if necessary. The + // calling G will relock. + if gp.lockedm != 0 { + gp.lockedm = 0 + gp.m.lockedg = 0 + } + + // Switch back to the calling goroutine. At some point + // the scheduler will schedule us again and we'll + // finish exiting. + if trace.enabled { + traceGoSched() + } + casgstatus(gp, _Grunning, _Grunnable) + dropg() + lock(&sched.lock) + globrunqput(gp) + unlock(&sched.lock) + + if trace.enabled { + traceGoUnpark(callingG, 0) + } + casgstatus(callingG, _Gwaiting, _Grunnable) + execute(callingG, true) + }) +} + +func debugCallWrap2(dispatch uintptr) { + // Call the dispatch function and trap panics. + var dispatchF func() + dispatchFV := funcval{dispatch} + *(*unsafe.Pointer)(unsafe.Pointer(&dispatchF)) = noescape(unsafe.Pointer(&dispatchFV)) + + var ok bool + defer func() { + if !ok { + err := recover() + debugCallPanicked(err) + } + }() + dispatchF() + ok = true +} diff --git a/src/runtime/debuglog.go b/src/runtime/debuglog.go new file mode 100644 index 0000000..b18774e --- /dev/null +++ b/src/runtime/debuglog.go @@ -0,0 +1,831 @@ +// Copyright 2019 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file provides an internal debug logging facility. The debug +// log is a lightweight, in-memory, per-M ring buffer. By default, the +// runtime prints the debug log on panic. +// +// To print something to the debug log, call dlog to obtain a dlogger +// and use the methods on that to add values. The values will be +// space-separated in the output (much like println). +// +// This facility can be enabled by passing -tags debuglog when +// building. Without this tag, dlog calls compile to nothing. + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// debugLogBytes is the size of each per-M ring buffer. This is +// allocated off-heap to avoid blowing up the M and hence the GC'd +// heap size. +const debugLogBytes = 16 << 10 + +// debugLogStringLimit is the maximum number of bytes in a string. +// Above this, the string will be truncated with "..(n more bytes).." +const debugLogStringLimit = debugLogBytes / 8 + +// dlog returns a debug logger. The caller can use methods on the +// returned logger to add values, which will be space-separated in the +// final output, much like println. The caller must call end() to +// finish the message. +// +// dlog can be used from highly-constrained corners of the runtime: it +// is safe to use in the signal handler, from within the write +// barrier, from within the stack implementation, and in places that +// must be recursively nosplit. +// +// This will be compiled away if built without the debuglog build tag. +// However, argument construction may not be. If any of the arguments +// are not literals or trivial expressions, consider protecting the +// call with "if dlogEnabled". +// +//go:nosplit +//go:nowritebarrierrec +func dlog() *dlogger { + if !dlogEnabled { + return nil + } + + // Get the time. 
+ tick, nano := uint64(cputicks()), uint64(nanotime()) + + // Try to get a cached logger. + l := getCachedDlogger() + + // If we couldn't get a cached logger, try to get one from the + // global pool. + if l == nil { + allp := (*uintptr)(unsafe.Pointer(&allDloggers)) + all := (*dlogger)(unsafe.Pointer(atomic.Loaduintptr(allp))) + for l1 := all; l1 != nil; l1 = l1.allLink { + if l1.owned.Load() == 0 && l1.owned.CompareAndSwap(0, 1) { + l = l1 + break + } + } + } + + // If that failed, allocate a new logger. + if l == nil { + // Use sysAllocOS instead of sysAlloc because we want to interfere + // with the runtime as little as possible, and sysAlloc updates accounting. + l = (*dlogger)(sysAllocOS(unsafe.Sizeof(dlogger{}))) + if l == nil { + throw("failed to allocate debug log") + } + l.w.r.data = &l.w.data + l.owned.Store(1) + + // Prepend to allDloggers list. + headp := (*uintptr)(unsafe.Pointer(&allDloggers)) + for { + head := atomic.Loaduintptr(headp) + l.allLink = (*dlogger)(unsafe.Pointer(head)) + if atomic.Casuintptr(headp, head, uintptr(unsafe.Pointer(l))) { + break + } + } + } + + // If the time delta is getting too high, write a new sync + // packet. We set the limit so we don't write more than 6 + // bytes of delta in the record header. + const deltaLimit = 1<<(3*7) - 1 // ~2ms between sync packets + if tick-l.w.tick > deltaLimit || nano-l.w.nano > deltaLimit { + l.w.writeSync(tick, nano) + } + + // Reserve space for framing header. + l.w.ensure(debugLogHeaderSize) + l.w.write += debugLogHeaderSize + + // Write record header. + l.w.uvarint(tick - l.w.tick) + l.w.uvarint(nano - l.w.nano) + gp := getg() + if gp != nil && gp.m != nil && gp.m.p != 0 { + l.w.varint(int64(gp.m.p.ptr().id)) + } else { + l.w.varint(-1) + } + + return l +} + +// A dlogger writes to the debug log. +// +// To obtain a dlogger, call dlog(). When done with the dlogger, call +// end(). 
+type dlogger struct { + _ sys.NotInHeap + w debugLogWriter + + // allLink is the next dlogger in the allDloggers list. + allLink *dlogger + + // owned indicates that this dlogger is owned by an M. This is + // accessed atomically. + owned atomic.Uint32 +} + +// allDloggers is a list of all dloggers, linked through +// dlogger.allLink. This is accessed atomically. This is prepend only, +// so it doesn't need to protect against ABA races. +var allDloggers *dlogger + +//go:nosplit +func (l *dlogger) end() { + if !dlogEnabled { + return + } + + // Fill in framing header. + size := l.w.write - l.w.r.end + if !l.w.writeFrameAt(l.w.r.end, size) { + throw("record too large") + } + + // Commit the record. + l.w.r.end = l.w.write + + // Attempt to return this logger to the cache. + if putCachedDlogger(l) { + return + } + + // Return the logger to the global pool. + l.owned.Store(0) +} + +const ( + debugLogUnknown = 1 + iota + debugLogBoolTrue + debugLogBoolFalse + debugLogInt + debugLogUint + debugLogHex + debugLogPtr + debugLogString + debugLogConstString + debugLogStringOverflow + + debugLogPC + debugLogTraceback +) + +//go:nosplit +func (l *dlogger) b(x bool) *dlogger { + if !dlogEnabled { + return l + } + if x { + l.w.byte(debugLogBoolTrue) + } else { + l.w.byte(debugLogBoolFalse) + } + return l +} + +//go:nosplit +func (l *dlogger) i(x int) *dlogger { + return l.i64(int64(x)) +} + +//go:nosplit +func (l *dlogger) i8(x int8) *dlogger { + return l.i64(int64(x)) +} + +//go:nosplit +func (l *dlogger) i16(x int16) *dlogger { + return l.i64(int64(x)) +} + +//go:nosplit +func (l *dlogger) i32(x int32) *dlogger { + return l.i64(int64(x)) +} + +//go:nosplit +func (l *dlogger) i64(x int64) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogInt) + l.w.varint(x) + return l +} + +//go:nosplit +func (l *dlogger) u(x uint) *dlogger { + return l.u64(uint64(x)) +} + +//go:nosplit +func (l *dlogger) uptr(x uintptr) *dlogger { + return l.u64(uint64(x)) +} + +//go:nosplit 
+func (l *dlogger) u8(x uint8) *dlogger { + return l.u64(uint64(x)) +} + +//go:nosplit +func (l *dlogger) u16(x uint16) *dlogger { + return l.u64(uint64(x)) +} + +//go:nosplit +func (l *dlogger) u32(x uint32) *dlogger { + return l.u64(uint64(x)) +} + +//go:nosplit +func (l *dlogger) u64(x uint64) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogUint) + l.w.uvarint(x) + return l +} + +//go:nosplit +func (l *dlogger) hex(x uint64) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogHex) + l.w.uvarint(x) + return l +} + +//go:nosplit +func (l *dlogger) p(x any) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogPtr) + if x == nil { + l.w.uvarint(0) + } else { + v := efaceOf(&x) + switch v._type.kind & kindMask { + case kindChan, kindFunc, kindMap, kindPtr, kindUnsafePointer: + l.w.uvarint(uint64(uintptr(v.data))) + default: + throw("not a pointer type") + } + } + return l +} + +//go:nosplit +func (l *dlogger) s(x string) *dlogger { + if !dlogEnabled { + return l + } + + strData := unsafe.StringData(x) + datap := &firstmoduledata + if len(x) > 4 && datap.etext <= uintptr(unsafe.Pointer(strData)) && uintptr(unsafe.Pointer(strData)) < datap.end { + // String constants are in the rodata section, which + // isn't recorded in moduledata. But it has to be + // somewhere between etext and end. + l.w.byte(debugLogConstString) + l.w.uvarint(uint64(len(x))) + l.w.uvarint(uint64(uintptr(unsafe.Pointer(strData)) - datap.etext)) + } else { + l.w.byte(debugLogString) + // We can't use unsafe.Slice as it may panic, which isn't safe + // in this (potentially) nowritebarrier context. 
+ var b []byte + bb := (*slice)(unsafe.Pointer(&b)) + bb.array = unsafe.Pointer(strData) + bb.len, bb.cap = len(x), len(x) + if len(b) > debugLogStringLimit { + b = b[:debugLogStringLimit] + } + l.w.uvarint(uint64(len(b))) + l.w.bytes(b) + if len(b) != len(x) { + l.w.byte(debugLogStringOverflow) + l.w.uvarint(uint64(len(x) - len(b))) + } + } + return l +} + +//go:nosplit +func (l *dlogger) pc(x uintptr) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogPC) + l.w.uvarint(uint64(x)) + return l +} + +//go:nosplit +func (l *dlogger) traceback(x []uintptr) *dlogger { + if !dlogEnabled { + return l + } + l.w.byte(debugLogTraceback) + l.w.uvarint(uint64(len(x))) + for _, pc := range x { + l.w.uvarint(uint64(pc)) + } + return l +} + +// A debugLogWriter is a ring buffer of binary debug log records. +// +// A log record consists of a 2-byte framing header and a sequence of +// fields. The framing header gives the size of the record as a little +// endian 16-bit value. Each field starts with a byte indicating its +// type, followed by type-specific data. If the size in the framing +// header is 0, it's a sync record consisting of two little endian +// 64-bit values giving a new time base. +// +// Because this is a ring buffer, new records will eventually +// overwrite old records. Hence, it maintains a reader that consumes +// the log as it gets overwritten. That reader state is where an +// actual log reader would start. +type debugLogWriter struct { + _ sys.NotInHeap + write uint64 + data debugLogBuf + + // tick and nano are the time bases from the most recently + // written sync record. + tick, nano uint64 + + // r is a reader that consumes records as they get overwritten + // by the writer. It also acts as the initial reader state + // when printing the log. + r debugLogReader + + // buf is a scratch buffer for encoding. This is here to + // reduce stack usage. 
+ buf [10]byte +} + +type debugLogBuf struct { + _ sys.NotInHeap + b [debugLogBytes]byte +} + +const ( + // debugLogHeaderSize is the number of bytes in the framing + // header of every dlog record. + debugLogHeaderSize = 2 + + // debugLogSyncSize is the number of bytes in a sync record. + debugLogSyncSize = debugLogHeaderSize + 2*8 +) + +//go:nosplit +func (l *debugLogWriter) ensure(n uint64) { + for l.write+n >= l.r.begin+uint64(len(l.data.b)) { + // Consume record at begin. + if l.r.skip() == ^uint64(0) { + // Wrapped around within a record. + // + // TODO(austin): It would be better to just + // eat the whole buffer at this point, but we + // have to communicate that to the reader + // somehow. + throw("record wrapped around") + } + } +} + +//go:nosplit +func (l *debugLogWriter) writeFrameAt(pos, size uint64) bool { + l.data.b[pos%uint64(len(l.data.b))] = uint8(size) + l.data.b[(pos+1)%uint64(len(l.data.b))] = uint8(size >> 8) + return size <= 0xFFFF +} + +//go:nosplit +func (l *debugLogWriter) writeSync(tick, nano uint64) { + l.tick, l.nano = tick, nano + l.ensure(debugLogHeaderSize) + l.writeFrameAt(l.write, 0) + l.write += debugLogHeaderSize + l.writeUint64LE(tick) + l.writeUint64LE(nano) + l.r.end = l.write +} + +//go:nosplit +func (l *debugLogWriter) writeUint64LE(x uint64) { + var b [8]byte + b[0] = byte(x) + b[1] = byte(x >> 8) + b[2] = byte(x >> 16) + b[3] = byte(x >> 24) + b[4] = byte(x >> 32) + b[5] = byte(x >> 40) + b[6] = byte(x >> 48) + b[7] = byte(x >> 56) + l.bytes(b[:]) +} + +//go:nosplit +func (l *debugLogWriter) byte(x byte) { + l.ensure(1) + pos := l.write + l.write++ + l.data.b[pos%uint64(len(l.data.b))] = x +} + +//go:nosplit +func (l *debugLogWriter) bytes(x []byte) { + l.ensure(uint64(len(x))) + pos := l.write + l.write += uint64(len(x)) + for len(x) > 0 { + n := copy(l.data.b[pos%uint64(len(l.data.b)):], x) + pos += uint64(n) + x = x[n:] + } +} + +//go:nosplit +func (l *debugLogWriter) varint(x int64) { + var u uint64 + if x < 0 { + u = 
(^uint64(x) << 1) | 1 // complement i, bit 0 is 1 + } else { + u = (uint64(x) << 1) // do not complement i, bit 0 is 0 + } + l.uvarint(u) +} + +//go:nosplit +func (l *debugLogWriter) uvarint(u uint64) { + i := 0 + for u >= 0x80 { + l.buf[i] = byte(u) | 0x80 + u >>= 7 + i++ + } + l.buf[i] = byte(u) + i++ + l.bytes(l.buf[:i]) +} + +type debugLogReader struct { + data *debugLogBuf + + // begin and end are the positions in the log of the beginning + // and end of the log data, modulo len(data). + begin, end uint64 + + // tick and nano are the current time base at begin. + tick, nano uint64 +} + +//go:nosplit +func (r *debugLogReader) skip() uint64 { + // Read size at pos. + if r.begin+debugLogHeaderSize > r.end { + return ^uint64(0) + } + size := uint64(r.readUint16LEAt(r.begin)) + if size == 0 { + // Sync packet. + r.tick = r.readUint64LEAt(r.begin + debugLogHeaderSize) + r.nano = r.readUint64LEAt(r.begin + debugLogHeaderSize + 8) + size = debugLogSyncSize + } + if r.begin+size > r.end { + return ^uint64(0) + } + r.begin += size + return size +} + +//go:nosplit +func (r *debugLogReader) readUint16LEAt(pos uint64) uint16 { + return uint16(r.data.b[pos%uint64(len(r.data.b))]) | + uint16(r.data.b[(pos+1)%uint64(len(r.data.b))])<<8 +} + +//go:nosplit +func (r *debugLogReader) readUint64LEAt(pos uint64) uint64 { + var b [8]byte + for i := range b { + b[i] = r.data.b[pos%uint64(len(r.data.b))] + pos++ + } + return uint64(b[0]) | uint64(b[1])<<8 | + uint64(b[2])<<16 | uint64(b[3])<<24 | + uint64(b[4])<<32 | uint64(b[5])<<40 | + uint64(b[6])<<48 | uint64(b[7])<<56 +} + +func (r *debugLogReader) peek() (tick uint64) { + // Consume any sync records. + size := uint64(0) + for size == 0 { + if r.begin+debugLogHeaderSize > r.end { + return ^uint64(0) + } + size = uint64(r.readUint16LEAt(r.begin)) + if size != 0 { + break + } + if r.begin+debugLogSyncSize > r.end { + return ^uint64(0) + } + // Sync packet. 
+ r.tick = r.readUint64LEAt(r.begin + debugLogHeaderSize) + r.nano = r.readUint64LEAt(r.begin + debugLogHeaderSize + 8) + r.begin += debugLogSyncSize + } + + // Peek tick delta. + if r.begin+size > r.end { + return ^uint64(0) + } + pos := r.begin + debugLogHeaderSize + var u uint64 + for i := uint(0); ; i += 7 { + b := r.data.b[pos%uint64(len(r.data.b))] + pos++ + u |= uint64(b&^0x80) << i + if b&0x80 == 0 { + break + } + } + if pos > r.begin+size { + return ^uint64(0) + } + return r.tick + u +} + +func (r *debugLogReader) header() (end, tick, nano uint64, p int) { + // Read size. We've already skipped sync packets and checked + // bounds in peek. + size := uint64(r.readUint16LEAt(r.begin)) + end = r.begin + size + r.begin += debugLogHeaderSize + + // Read tick, nano, and p. + tick = r.uvarint() + r.tick + nano = r.uvarint() + r.nano + p = int(r.varint()) + + return +} + +func (r *debugLogReader) uvarint() uint64 { + var u uint64 + for i := uint(0); ; i += 7 { + b := r.data.b[r.begin%uint64(len(r.data.b))] + r.begin++ + u |= uint64(b&^0x80) << i + if b&0x80 == 0 { + break + } + } + return u +} + +func (r *debugLogReader) varint() int64 { + u := r.uvarint() + var v int64 + if u&1 == 0 { + v = int64(u >> 1) + } else { + v = ^int64(u >> 1) + } + return v +} + +func (r *debugLogReader) printVal() bool { + typ := r.data.b[r.begin%uint64(len(r.data.b))] + r.begin++ + + switch typ { + default: + print("<unknown field type ", hex(typ), " pos ", r.begin-1, " end ", r.end, ">\n") + return false + + case debugLogUnknown: + print("<unknown kind>") + + case debugLogBoolTrue: + print(true) + + case debugLogBoolFalse: + print(false) + + case debugLogInt: + print(r.varint()) + + case debugLogUint: + print(r.uvarint()) + + case debugLogHex, debugLogPtr: + print(hex(r.uvarint())) + + case debugLogString: + sl := r.uvarint() + if r.begin+sl > r.end { + r.begin = r.end + print("<string length corrupted>") + break + } + for sl > 0 { + b := r.data.b[r.begin%uint64(len(r.data.b)):] + if 
uint64(len(b)) > sl { + b = b[:sl] + } + r.begin += uint64(len(b)) + sl -= uint64(len(b)) + gwrite(b) + } + + case debugLogConstString: + len, ptr := int(r.uvarint()), uintptr(r.uvarint()) + ptr += firstmoduledata.etext + // We can't use unsafe.String as it may panic, which isn't safe + // in this (potentially) nowritebarrier context. + str := stringStruct{ + str: unsafe.Pointer(ptr), + len: len, + } + s := *(*string)(unsafe.Pointer(&str)) + print(s) + + case debugLogStringOverflow: + print("..(", r.uvarint(), " more bytes)..") + + case debugLogPC: + printDebugLogPC(uintptr(r.uvarint()), false) + + case debugLogTraceback: + n := int(r.uvarint()) + for i := 0; i < n; i++ { + print("\n\t") + // gentraceback PCs are always return PCs. + // Convert them to call PCs. + // + // TODO(austin): Expand inlined frames. + printDebugLogPC(uintptr(r.uvarint()), true) + } + } + + return true +} + +// printDebugLog prints the debug log. +func printDebugLog() { + if !dlogEnabled { + return + } + + // This function should not panic or throw since it is used in + // the fatal panic path and this may deadlock. + + printlock() + + // Get the list of all debug logs. + allp := (*uintptr)(unsafe.Pointer(&allDloggers)) + all := (*dlogger)(unsafe.Pointer(atomic.Loaduintptr(allp))) + + // Count the logs. + n := 0 + for l := all; l != nil; l = l.allLink { + n++ + } + if n == 0 { + printunlock() + return + } + + // Prepare read state for all logs. + type readState struct { + debugLogReader + first bool + lost uint64 + nextTick uint64 + } + // Use sysAllocOS instead of sysAlloc because we want to interfere + // with the runtime as little as possible, and sysAlloc updates accounting. 
+ state1 := sysAllocOS(unsafe.Sizeof(readState{}) * uintptr(n)) + if state1 == nil { + println("failed to allocate read state for", n, "logs") + printunlock() + return + } + state := (*[1 << 20]readState)(state1)[:n] + { + l := all + for i := range state { + s := &state[i] + s.debugLogReader = l.w.r + s.first = true + s.lost = l.w.r.begin + s.nextTick = s.peek() + l = l.allLink + } + } + + // Print records. + for { + // Find the next record. + var best struct { + tick uint64 + i int + } + best.tick = ^uint64(0) + for i := range state { + if state[i].nextTick < best.tick { + best.tick = state[i].nextTick + best.i = i + } + } + if best.tick == ^uint64(0) { + break + } + + // Print record. + s := &state[best.i] + if s.first { + print(">> begin log ", best.i) + if s.lost != 0 { + print("; lost first ", s.lost>>10, "KB") + } + print(" <<\n") + s.first = false + } + + end, _, nano, p := s.header() + oldEnd := s.end + s.end = end + + print("[") + var tmpbuf [21]byte + pnano := int64(nano) - runtimeInitTime + if pnano < 0 { + // Logged before runtimeInitTime was set. + pnano = 0 + } + pnanoBytes := itoaDiv(tmpbuf[:], uint64(pnano), 9) + print(slicebytetostringtmp((*byte)(noescape(unsafe.Pointer(&pnanoBytes[0]))), len(pnanoBytes))) + print(" P ", p, "] ") + + for i := 0; s.begin < s.end; i++ { + if i > 0 { + print(" ") + } + if !s.printVal() { + // Abort this P log. + print("<aborting P log>") + end = oldEnd + break + } + } + println() + + // Move on to the next record. + s.begin = end + s.end = oldEnd + s.nextTick = s.peek() + } + + printunlock() +} + +// printDebugLogPC prints a single symbolized PC. If returnPC is true, +// pc is a return PC that must first be converted to a call PC. +func printDebugLogPC(pc uintptr, returnPC bool) { + fn := findfunc(pc) + if returnPC && (!fn.valid() || pc > fn.entry()) { + // TODO(austin): Don't back up if the previous frame + // was a sigpanic. 
+ pc-- + } + + print(hex(pc)) + if !fn.valid() { + print(" [unknown PC]") + } else { + name := funcname(fn) + file, line := funcline(fn, pc) + print(" [", name, "+", hex(pc-fn.entry()), + " ", file, ":", line, "]") + } +} diff --git a/src/runtime/debuglog_off.go b/src/runtime/debuglog_off.go new file mode 100644 index 0000000..fa3be39 --- /dev/null +++ b/src/runtime/debuglog_off.go @@ -0,0 +1,19 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !debuglog + +package runtime + +const dlogEnabled = false + +type dlogPerM struct{} + +func getCachedDlogger() *dlogger { + return nil +} + +func putCachedDlogger(l *dlogger) bool { + return false +} diff --git a/src/runtime/debuglog_on.go b/src/runtime/debuglog_on.go new file mode 100644 index 0000000..b815020 --- /dev/null +++ b/src/runtime/debuglog_on.go @@ -0,0 +1,45 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build debuglog + +package runtime + +const dlogEnabled = true + +// dlogPerM is the per-M debug log data. This is embedded in the m +// struct. +type dlogPerM struct { + dlogCache *dlogger +} + +// getCachedDlogger returns a cached dlogger if it can do so +// efficiently, or nil otherwise. The returned dlogger will be owned. +func getCachedDlogger() *dlogger { + mp := acquirem() + // We don't return a cached dlogger if we're running on the + // signal stack in case the signal arrived while in + // get/putCachedDlogger. (Too bad we don't have non-atomic + // exchange!) + var l *dlogger + if getg() != mp.gsignal { + l = mp.dlogCache + mp.dlogCache = nil + } + releasem(mp) + return l +} + +// putCachedDlogger attempts to return l to the local cache. It +// returns false if this fails. 
+func putCachedDlogger(l *dlogger) bool { + mp := acquirem() + if getg() != mp.gsignal && mp.dlogCache == nil { + mp.dlogCache = l + releasem(mp) + return true + } + releasem(mp) + return false +} diff --git a/src/runtime/debuglog_test.go b/src/runtime/debuglog_test.go new file mode 100644 index 0000000..18c54a8 --- /dev/null +++ b/src/runtime/debuglog_test.go @@ -0,0 +1,169 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// TODO(austin): All of these tests are skipped if the debuglog build +// tag isn't provided. That means we basically never test debuglog. +// There are two potential ways around this: +// +// 1. Make these tests re-build the runtime test with the debuglog +// build tag and re-invoke themselves. +// +// 2. Always build the whole debuglog infrastructure and depend on +// linker dead-code elimination to drop it. This is easy for dlog() +// since there won't be any calls to it. For printDebugLog, we can +// make panic call a wrapper that calls printDebugLog if the +// debuglog build tag is set, or otherwise does nothing. Then tests +// could call printDebugLog directly. This is the right answer in +// principle, but currently our linker reads in all symbols +// regardless, so this would slow down and bloat all links. If the +// linker gets more efficient about this, we should revisit this +// approach.
+ +package runtime_test + +import ( + "fmt" + "internal/testenv" + "regexp" + "runtime" + "strings" + "sync" + "sync/atomic" + "testing" +) + +func skipDebugLog(t *testing.T) { + if !runtime.DlogEnabled { + t.Skip("debug log disabled (rebuild with -tags debuglog)") + } +} + +func dlogCanonicalize(x string) string { + begin := regexp.MustCompile(`(?m)^>> begin log \d+ <<\n`) + x = begin.ReplaceAllString(x, "") + prefix := regexp.MustCompile(`(?m)^\[[^]]+\]`) + x = prefix.ReplaceAllString(x, "[]") + return x +} + +func TestDebugLog(t *testing.T) { + skipDebugLog(t) + runtime.ResetDebugLog() + runtime.Dlog().S("testing").End() + got := dlogCanonicalize(runtime.DumpDebugLog()) + if want := "[] testing\n"; got != want { + t.Fatalf("want %q, got %q", want, got) + } +} + +func TestDebugLogTypes(t *testing.T) { + skipDebugLog(t) + runtime.ResetDebugLog() + var varString = strings.Repeat("a", 4) + runtime.Dlog().B(true).B(false).I(-42).I16(0x7fff).U64(^uint64(0)).Hex(0xfff).P(nil).S(varString).S("const string").End() + got := dlogCanonicalize(runtime.DumpDebugLog()) + if want := "[] true false -42 32767 18446744073709551615 0xfff 0x0 aaaa const string\n"; got != want { + t.Fatalf("want %q, got %q", want, got) + } +} + +func TestDebugLogSym(t *testing.T) { + skipDebugLog(t) + runtime.ResetDebugLog() + pc, _, _, _ := runtime.Caller(0) + runtime.Dlog().PC(pc).End() + got := dlogCanonicalize(runtime.DumpDebugLog()) + want := regexp.MustCompile(`\[\] 0x[0-9a-f]+ \[runtime_test\.TestDebugLogSym\+0x[0-9a-f]+ .*/debuglog_test\.go:[0-9]+\]\n`) + if !want.MatchString(got) { + t.Fatalf("want matching %s, got %q", want, got) + } +} + +func TestDebugLogInterleaving(t *testing.T) { + skipDebugLog(t) + runtime.ResetDebugLog() + var wg sync.WaitGroup + done := int32(0) + wg.Add(1) + go func() { + // Encourage main goroutine to move around to + // different Ms and Ps. 
+ for atomic.LoadInt32(&done) == 0 { + runtime.Gosched() + } + wg.Done() + }() + var want strings.Builder + for i := 0; i < 1000; i++ { + runtime.Dlog().I(i).End() + fmt.Fprintf(&want, "[] %d\n", i) + runtime.Gosched() + } + atomic.StoreInt32(&done, 1) + wg.Wait() + + gotFull := runtime.DumpDebugLog() + got := dlogCanonicalize(gotFull) + if got != want.String() { + // Since the timestamps are useful in understanding + // failures of this test, we print the uncanonicalized + // output. + t.Fatalf("want %q, got (uncanonicalized) %q", want.String(), gotFull) + } +} + +func TestDebugLogWraparound(t *testing.T) { + skipDebugLog(t) + + // Make sure we don't switch logs so it's easier to fill one up. + runtime.LockOSThread() + defer runtime.UnlockOSThread() + + runtime.ResetDebugLog() + var longString = strings.Repeat("a", 128) + var want strings.Builder + for i, j := 0, 0; j < 2*runtime.DebugLogBytes; i, j = i+1, j+len(longString) { + runtime.Dlog().I(i).S(longString).End() + fmt.Fprintf(&want, "[] %d %s\n", i, longString) + } + log := runtime.DumpDebugLog() + + // Check for "lost" message. + lost := regexp.MustCompile(`^>> begin log \d+; lost first \d+KB <<\n`) + if !lost.MatchString(log) { + t.Fatalf("want matching %s, got %q", lost, log) + } + idx := lost.FindStringIndex(log) + // Strip lost message. + log = dlogCanonicalize(log[idx[1]:]) + + // Check log. + if !strings.HasSuffix(want.String(), log) { + t.Fatalf("wrong suffix:\n%s", log) + } +} + +func TestDebugLogLongString(t *testing.T) { + skipDebugLog(t) + + runtime.ResetDebugLog() + var longString = strings.Repeat("a", runtime.DebugLogStringLimit+1) + runtime.Dlog().S(longString).End() + got := dlogCanonicalize(runtime.DumpDebugLog()) + want := "[] " + strings.Repeat("a", runtime.DebugLogStringLimit) + " ..(1 more bytes)..\n" + if got != want { + t.Fatalf("want %q, got %q", want, got) + } +} + +// TestDebugLogBuild verifies that the runtime builds with -tags=debuglog.
+func TestDebugLogBuild(t *testing.T) { + testenv.MustHaveGoBuild(t) + + // It doesn't matter which program we build, anything will rebuild the + // runtime. + if _, err := buildTestProg(t, "testprog", "-tags=debuglog"); err != nil { + t.Fatal(err) + } +} diff --git a/src/runtime/defer_test.go b/src/runtime/defer_test.go new file mode 100644 index 0000000..3a54951 --- /dev/null +++ b/src/runtime/defer_test.go @@ -0,0 +1,518 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "reflect" + "runtime" + "testing" +) + +// Make sure open-coded defer exit code is not lost, even when there is an +// unconditional panic (hence no return from the function) +func TestUnconditionalPanic(t *testing.T) { + defer func() { + if recover() != "testUnconditional" { + t.Fatal("expected unconditional panic") + } + }() + panic("testUnconditional") +} + +var glob int = 3 + +// Test an open-coded defer and non-open-coded defer - make sure both defers run +// and call recover() +func TestOpenAndNonOpenDefers(t *testing.T) { + for { + // Non-open defer because in a loop + defer func(n int) { + if recover() != "testNonOpenDefer" { + t.Fatal("expected testNonOpen panic") + } + }(3) + if glob > 2 { + break + } + } + testOpen(t, 47) + panic("testNonOpenDefer") +} + +//go:noinline +func testOpen(t *testing.T, arg int) { + defer func(n int) { + if recover() != "testOpenDefer" { + t.Fatal("expected testOpen panic") + } + }(4) + if arg > 2 { + panic("testOpenDefer") + } +} + +// Test a non-open-coded defer and an open-coded defer - make sure both defers run +// and call recover() +func TestNonOpenAndOpenDefers(t *testing.T) { + testOpen(t, 47) + for { + // Non-open defer because in a loop + defer func(n int) { + if recover() != "testNonOpenDefer" { + t.Fatal("expected testNonOpen panic") + } + }(3) + if glob > 2 { + break + } + } + 
panic("testNonOpenDefer") +} + +var list []int + +// Make sure that conditional open-coded defers are activated correctly and run in +// the correct order. +func TestConditionalDefers(t *testing.T) { + list = make([]int, 0, 10) + + defer func() { + if recover() != "testConditional" { + t.Fatal("expected panic") + } + want := []int{4, 2, 1} + if !reflect.DeepEqual(want, list) { + t.Fatal(fmt.Sprintf("wanted %v, got %v", want, list)) + } + + }() + testConditionalDefers(8) +} + +func testConditionalDefers(n int) { + doappend := func(i int) { + list = append(list, i) + } + + defer doappend(1) + if n > 5 { + defer doappend(2) + if n > 8 { + defer doappend(3) + } else { + defer doappend(4) + } + } + panic("testConditional") +} + +// Test that there is no compile-time or run-time error if an open-coded defer +// call is removed by constant propagation and dead-code elimination. +func TestDisappearingDefer(t *testing.T) { + switch runtime.GOOS { + case "invalidOS": + defer func() { + t.Fatal("Defer shouldn't run") + }() + } +} + +// This tests an extra recursive panic behavior that is only specified in the +// code. Suppose a first panic P1 happens and starts processing defer calls. If a +// second panic P2 happens while processing defer call D in frame F, then defer +// call processing is restarted (with some potentially new defer calls created by +// D or its callees). If the defer processing reaches the started defer call D +// again in the defer stack, then the original panic P1 is aborted and cannot +// continue panic processing or be recovered. If the panic P2 does a recover at +// some point, it will naturally remove the original panic P1 from the stack +// (since the original panic had to be in frame F or a descendant of F). 
+func TestAbortedPanic(t *testing.T) { + defer func() { + r := recover() + if r != nil { + t.Fatal(fmt.Sprintf("wanted nil recover, got %v", r)) + } + }() + defer func() { + r := recover() + if r != "panic2" { + t.Fatal(fmt.Sprintf("wanted %v, got %v", "panic2", r)) + } + }() + defer func() { + panic("panic2") + }() + panic("panic1") +} + +// This tests that recover() does not succeed unless it is called directly from a +// defer function that is directly called by the panic. Here, we first call it +// from a defer function that is created by the defer function called directly by +// the panic. In +func TestRecoverMatching(t *testing.T) { + defer func() { + r := recover() + if r != "panic1" { + t.Fatal(fmt.Sprintf("wanted %v, got %v", "panic1", r)) + } + }() + defer func() { + defer func() { + // Shouldn't succeed, even though it is called directly + // from a defer function, since this defer function was + // not directly called by the panic. + r := recover() + if r != nil { + t.Fatal(fmt.Sprintf("wanted nil recover, got %v", r)) + } + }() + }() + panic("panic1") +} + +type nonSSAable [128]byte + +type bigStruct struct { + x, y, z, w, p, q int64 +} + +type containsBigStruct struct { + element bigStruct +} + +func mknonSSAable() nonSSAable { + globint1++ + return nonSSAable{0, 0, 0, 0, 5} +} + +var globint1, globint2, globint3 int + +//go:noinline +func sideeffect(n int64) int64 { + globint2++ + return n +} + +func sideeffect2(in containsBigStruct) containsBigStruct { + globint3++ + return in +} + +// Test that nonSSAable arguments to defer are handled correctly and only evaluated once. 
+func TestNonSSAableArgs(t *testing.T) { + globint1 = 0 + globint2 = 0 + globint3 = 0 + var save1 byte + var save2 int64 + var save3 int64 + var save4 int64 + + defer func() { + if globint1 != 1 { + t.Fatal(fmt.Sprintf("globint1: wanted: 1, got %v", globint1)) + } + if save1 != 5 { + t.Fatal(fmt.Sprintf("save1: wanted: 5, got %v", save1)) + } + if globint2 != 1 { + t.Fatal(fmt.Sprintf("globint2: wanted: 1, got %v", globint2)) + } + if save2 != 2 { + t.Fatal(fmt.Sprintf("save2: wanted: 2, got %v", save2)) + } + if save3 != 4 { + t.Fatal(fmt.Sprintf("save3: wanted: 4, got %v", save3)) + } + if globint3 != 1 { + t.Fatal(fmt.Sprintf("globint3: wanted: 1, got %v", globint3)) + } + if save4 != 4 { + t.Fatal(fmt.Sprintf("save4: wanted: 4, got %v", save4)) + } + }() + + // Test function returning a non-SSAable arg + defer func(n nonSSAable) { + save1 = n[4] + }(mknonSSAable()) + // Test composite literal that is not SSAable + defer func(b bigStruct) { + save2 = b.y + }(bigStruct{1, 2, 3, 4, 5, sideeffect(6)}) + + // Test struct field reference that is non-SSAable + foo := containsBigStruct{} + foo.element.z = 4 + defer func(element bigStruct) { + save3 = element.z + }(foo.element) + defer func(element bigStruct) { + save4 = element.z + }(sideeffect2(foo).element) +} + +//go:noinline +func doPanic() { + panic("Test panic") +} + +func TestDeferForFuncWithNoExit(t *testing.T) { + cond := 1 + defer func() { + if cond != 2 { + t.Fatal(fmt.Sprintf("cond: wanted 2, got %v", cond)) + } + if recover() != "Test panic" { + t.Fatal("Didn't find expected panic") + } + }() + x := 0 + // Force a stack copy, to make sure that the &cond pointer passed to defer + // function is properly updated.
+ growStackIter(&x, 1000) + cond = 2 + doPanic() + + // This function has no exit/return, since it ends with an infinite loop + for { + } +} + +// Test case approximating issue #37664, where a recursive function (interpreter) +// may do repeated recovers/re-panics until it reaches the frame where the panic +// can actually be handled. The recurseFnPanicRec() function is testing that there +// are no stale defer structs on the defer chain after the interpreter() sequence, +// by writing a bunch of 0xffffffffs into several recursive stack frames, and then +// doing a single panic-recover which would invoke any such stale defer structs. +func TestDeferWithRepeatedRepanics(t *testing.T) { + interpreter(0, 6, 2) + recurseFnPanicRec(0, 10) + interpreter(0, 5, 1) + recurseFnPanicRec(0, 10) + interpreter(0, 6, 3) + recurseFnPanicRec(0, 10) +} + +func interpreter(level int, maxlevel int, rec int) { + defer func() { + e := recover() + if e == nil { + return + } + if level != e.(int) { + //fmt.Fprintln(os.Stderr, "re-panicing, level", level) + panic(e) + } + //fmt.Fprintln(os.Stderr, "Recovered, level", level) + }() + if level+1 < maxlevel { + interpreter(level+1, maxlevel, rec) + } else { + //fmt.Fprintln(os.Stderr, "Initiating panic") + panic(rec) + } +} + +func recurseFnPanicRec(level int, maxlevel int) { + defer func() { + recover() + }() + recurseFn(level, maxlevel) +} + +var saveInt uint32 + +func recurseFn(level int, maxlevel int) { + a := [40]uint32{0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff, 0xffffffff} + if level+1 < maxlevel { + // Make sure a array 
is referenced, so it is not optimized away + saveInt = a[4] + recurseFn(level+1, maxlevel) + } else { + panic("recurseFn panic") + } +} + +// Try to reproduce issue #37688, where a pointer to an open-coded defer struct is +// mistakenly held, and that struct keeps a pointer to a stack-allocated defer +// struct, and that stack-allocated struct gets overwritten or the stack gets +// moved, so a memory error happens on GC. +func TestIssue37688(t *testing.T) { + for j := 0; j < 10; j++ { + g2() + g3() + } +} + +type foo struct { +} + +//go:noinline +func (f *foo) method1() { +} + +//go:noinline +func (f *foo) method2() { +} + +func g2() { + var a foo + ap := &a + // The loop forces this defer to be heap-allocated and the remaining two + // to be stack-allocated. + for i := 0; i < 1; i++ { + defer ap.method1() + } + defer ap.method2() + defer ap.method1() + ff1(ap, 1, 2, 3, 4, 5, 6, 7, 8, 9) + // Try to get the stack to be moved by growing it too large, so + // existing stack-allocated defer becomes invalid. 
+ rec1(2000) +} + +func g3() { + // Mix up the stack layout by adding in an extra function frame + g2() +} + +var globstruct struct { + a, b, c, d, e, f, g, h, i int +} + +func ff1(ap *foo, a, b, c, d, e, f, g, h, i int) { + defer ap.method1() + + // Make a defer that has a very large set of args, hence big size for the + // defer record for the open-coded frame (which means it won't use the + // defer pool) + defer func(ap *foo, a, b, c, d, e, f, g, h, i int) { + if v := recover(); v != nil { + } + globstruct.a = a + globstruct.b = b + globstruct.c = c + globstruct.d = d + globstruct.e = e + globstruct.f = f + globstruct.g = g + globstruct.h = h + }(ap, a, b, c, d, e, f, g, h, i) + panic("ff1 panic") +} + +func rec1(max int) { + if max > 0 { + rec1(max - 1) + } +} + +func TestIssue43921(t *testing.T) { + defer func() { + expect(t, 1, recover()) + }() + func() { + // Prevent open-coded defers + for { + defer func() {}() + break + } + + defer func() { + defer func() { + expect(t, 4, recover()) + }() + panic(4) + }() + panic(1) + + }() +} + +func expect(t *testing.T, n int, err any) { + if n != err { + t.Fatalf("have %v, want %v", err, n) + } +} + +func TestIssue43920(t *testing.T) { + var steps int + + defer func() { + expect(t, 1, recover()) + }() + defer func() { + defer func() { + defer func() { + expect(t, 5, recover()) + }() + defer panic(5) + func() { + panic(4) + }() + }() + defer func() { + expect(t, 3, recover()) + }() + defer panic(3) + }() + func() { + defer step(t, &steps, 1) + panic(1) + }() +} + +func step(t *testing.T, steps *int, want int) { + *steps++ + if *steps != want { + t.Fatalf("have %v, want %v", *steps, want) + } +} + +func TestIssue43941(t *testing.T) { + var steps int = 7 + defer func() { + step(t, &steps, 14) + expect(t, 4, recover()) + }() + func() { + func() { + defer func() { + defer func() { + expect(t, 3, recover()) + }() + defer panic(3) + panic(2) + }() + defer func() { + expect(t, 1, recover()) + }() + defer panic(1) + }() + defer 
func() {}() + defer func() {}() + defer step(t, &steps, 10) + defer step(t, &steps, 9) + step(t, &steps, 8) + }() + func() { + defer step(t, &steps, 13) + defer step(t, &steps, 12) + func() { + defer step(t, &steps, 11) + panic(4) + }() + + // Code below isn't executed, + // but removing it breaks the test case. + defer func() {}() + defer panic(-1) + defer step(t, &steps, -1) + defer step(t, &steps, -1) + defer func() {}() + }() +} diff --git a/src/runtime/defs1_linux.go b/src/runtime/defs1_linux.go new file mode 100644 index 0000000..709f19e --- /dev/null +++ b/src/runtime/defs1_linux.go @@ -0,0 +1,40 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo -cdefs + +GOARCH=amd64 cgo -cdefs defs.go defs1.go >amd64/defs.h +*/ + +package runtime + +/* +#include <ucontext.h> +#include <fcntl.h> +#include <asm/signal.h> +*/ +import "C" + +const ( + O_RDONLY = C.O_RDONLY + O_NONBLOCK = C.O_NONBLOCK + O_CLOEXEC = C.O_CLOEXEC + SA_RESTORER = C.SA_RESTORER +) + +type Usigset C.__sigset_t +type Fpxreg C.struct__libc_fpxreg +type Xmmreg C.struct__libc_xmmreg +type Fpstate C.struct__libc_fpstate +type Fpxreg1 C.struct__fpxreg +type Xmmreg1 C.struct__xmmreg +type Fpstate1 C.struct__fpstate +type Fpreg1 C.struct__fpreg +type StackT C.stack_t +type Mcontext C.mcontext_t +type Ucontext C.ucontext_t +type Sigcontext C.struct_sigcontext diff --git a/src/runtime/defs1_netbsd_386.go b/src/runtime/defs1_netbsd_386.go new file mode 100644 index 0000000..f7fe45b --- /dev/null +++ b/src/runtime/defs1_netbsd_386.go @@ -0,0 +1,183 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_386.go + +package runtime + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x400000 + + _PROT_NONE = 0x0 + _PROT_READ 
= 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = 0x0 + _EVFILT_WRITE = 0x1 +) + +type sigset struct { + __bits [4]uint32 +} + +type siginfo struct { + _signo int32 + _code int32 + _errno int32 + _reason [20]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = int64(timediv(ns, 1e9, &ts.tv_nsec)) +} + +type timeval struct { + tv_sec int64 + tv_usec int32 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type mcontextt struct { + __gregs [19]uint32 + __fpregs [644]byte + _mc_tlsbase int32 +} + +type ucontextt struct { + uc_flags uint32 + uc_link *ucontextt + uc_sigmask 
sigset + uc_stack stackt + uc_mcontext mcontextt + __uc_pad [4]int32 +} + +type keventt struct { + ident uint32 + filter uint32 + flags uint32 + fflags uint32 + data int64 + udata *byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_386.go + +const ( + _REG_GS = 0x0 + _REG_FS = 0x1 + _REG_ES = 0x2 + _REG_DS = 0x3 + _REG_EDI = 0x4 + _REG_ESI = 0x5 + _REG_EBP = 0x6 + _REG_ESP = 0x7 + _REG_EBX = 0x8 + _REG_EDX = 0x9 + _REG_ECX = 0xa + _REG_EAX = 0xb + _REG_TRAPNO = 0xc + _REG_ERR = 0xd + _REG_EIP = 0xe + _REG_CS = 0xf + _REG_EFL = 0x10 + _REG_UESP = 0x11 + _REG_SS = 0x12 +) diff --git a/src/runtime/defs1_netbsd_amd64.go b/src/runtime/defs1_netbsd_amd64.go new file mode 100644 index 0000000..80908cd --- /dev/null +++ b/src/runtime/defs1_netbsd_amd64.go @@ -0,0 +1,195 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_amd64.go + +package runtime + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x400000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 
+ _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = 0x0 + _EVFILT_WRITE = 0x1 +) + +type sigset struct { + __bits [4]uint32 +} + +type siginfo struct { + _signo int32 + _code int32 + _errno int32 + _pad int32 + _reason [24]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type mcontextt struct { + __gregs [26]uint64 + _mc_tlsbase uint64 + __fpregs [512]int8 +} + +type ucontextt struct { + uc_flags uint32 + pad_cgo_0 [4]byte + uc_link *ucontextt + uc_sigmask sigset + uc_stack stackt + uc_mcontext mcontextt +} + +type keventt struct { + ident uint64 + filter uint32 + flags uint32 + fflags uint32 + pad_cgo_0 [4]byte + data int64 + udata *byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_amd64.go + +const ( + _REG_RDI = 0x0 + _REG_RSI = 0x1 + _REG_RDX = 0x2 + _REG_RCX = 0x3 + _REG_R8 = 0x4 + _REG_R9 = 0x5 + _REG_R10 = 0x6 + _REG_R11 = 0x7 + _REG_R12 = 0x8 + _REG_R13 = 0x9 + _REG_R14 = 0xa + _REG_R15 = 0xb + _REG_RBP = 0xc + _REG_RBX = 0xd + _REG_RAX = 0xe + _REG_GS = 0xf + _REG_FS = 0x10 + _REG_ES = 0x11 + _REG_DS = 0x12 + _REG_TRAPNO = 0x13 + _REG_ERR = 0x14 + _REG_RIP = 0x15 + _REG_CS = 0x16 + _REG_RFLAGS = 0x17 + _REG_RSP = 0x18 + _REG_SS = 0x19 +) diff --git 
a/src/runtime/defs1_netbsd_arm.go b/src/runtime/defs1_netbsd_arm.go new file mode 100644 index 0000000..c63e592 --- /dev/null +++ b/src/runtime/defs1_netbsd_arm.go @@ -0,0 +1,188 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_arm.go + +package runtime + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x400000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = 0x0 + _EVFILT_WRITE = 0x1 +) + +type sigset struct { + __bits [4]uint32 +} + +type siginfo struct { + _signo int32 + _code int32 + _errno int32 + _reason uintptr + _reasonx [16]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type timespec 
struct { + tv_sec int64 + tv_nsec int32 + _ [4]byte // EABI +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = int64(timediv(ns, 1e9, &ts.tv_nsec)) +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + _ [4]byte // EABI +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type mcontextt struct { + __gregs [17]uint32 + _ [4]byte // EABI + __fpu [272]byte // EABI + _mc_tlsbase uint32 + _ [4]byte // EABI +} + +type ucontextt struct { + uc_flags uint32 + uc_link *ucontextt + uc_sigmask sigset + uc_stack stackt + _ [4]byte // EABI + uc_mcontext mcontextt + __uc_pad [2]int32 +} + +type keventt struct { + ident uint32 + filter uint32 + flags uint32 + fflags uint32 + data int64 + udata *byte + _ [4]byte // EABI +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_arm.go + +const ( + _REG_R0 = 0x0 + _REG_R1 = 0x1 + _REG_R2 = 0x2 + _REG_R3 = 0x3 + _REG_R4 = 0x4 + _REG_R5 = 0x5 + _REG_R6 = 0x6 + _REG_R7 = 0x7 + _REG_R8 = 0x8 + _REG_R9 = 0x9 + _REG_R10 = 0xa + _REG_R11 = 0xb + _REG_R12 = 0xc + _REG_R13 = 0xd + _REG_R14 = 0xe + _REG_R15 = 0xf + _REG_CPSR = 0x10 +) diff --git a/src/runtime/defs1_netbsd_arm64.go b/src/runtime/defs1_netbsd_arm64.go new file mode 100644 index 0000000..804b5b0 --- /dev/null +++ b/src/runtime/defs1_netbsd_arm64.go @@ -0,0 +1,203 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_arm.go + +package runtime + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x400000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + 
_SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = 0x0 + _EVFILT_WRITE = 0x1 +) + +type sigset struct { + __bits [4]uint32 +} + +type siginfo struct { + _signo int32 + _code int32 + _errno int32 + _reason uintptr + _reasonx [16]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + _ [4]byte // EABI +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type mcontextt struct { + __gregs [35]uint64 + __fregs [4160]byte // _NFREG * 128 + 32 + 32 + _ [8]uint64 // future use +} + +type ucontextt struct { + uc_flags uint32 + uc_link *ucontextt + uc_sigmask sigset + uc_stack stackt + _ [4]byte // EABI + uc_mcontext mcontextt + __uc_pad [2]int32 +} + +type keventt struct { + ident uint64 + filter uint32 + flags uint32 + fflags uint32 + pad_cgo_0 
[4]byte + data int64 + udata *byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_netbsd.go defs_netbsd_arm.go + +const ( + _REG_X0 = 0 + _REG_X1 = 1 + _REG_X2 = 2 + _REG_X3 = 3 + _REG_X4 = 4 + _REG_X5 = 5 + _REG_X6 = 6 + _REG_X7 = 7 + _REG_X8 = 8 + _REG_X9 = 9 + _REG_X10 = 10 + _REG_X11 = 11 + _REG_X12 = 12 + _REG_X13 = 13 + _REG_X14 = 14 + _REG_X15 = 15 + _REG_X16 = 16 + _REG_X17 = 17 + _REG_X18 = 18 + _REG_X19 = 19 + _REG_X20 = 20 + _REG_X21 = 21 + _REG_X22 = 22 + _REG_X23 = 23 + _REG_X24 = 24 + _REG_X25 = 25 + _REG_X26 = 26 + _REG_X27 = 27 + _REG_X28 = 28 + _REG_X29 = 29 + _REG_X30 = 30 + _REG_X31 = 31 + _REG_ELR = 32 + _REG_SPSR = 33 + _REG_TPIDR = 34 +) diff --git a/src/runtime/defs1_solaris_amd64.go b/src/runtime/defs1_solaris_amd64.go new file mode 100644 index 0000000..bb53c22 --- /dev/null +++ b/src/runtime/defs1_solaris_amd64.go @@ -0,0 +1,254 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_solaris.go defs_solaris_amd64.go + +package runtime + +const ( + _EINTR = 0x4 + _EBADF = 0x9 + _EFAULT = 0xe + _EAGAIN = 0xb + _EBUSY = 0x10 + _ETIME = 0x3e + _ETIMEDOUT = 0x91 + _EWOULDBLOCK = 0xb + _EINPROGRESS = 0x96 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x100 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x8 + _SA_RESTART = 0x4 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x15 + _SIGSTOP = 0x17 + _SIGTSTP = 0x18 + _SIGCONT = 0x19 + _SIGCHLD = 0x12 + _SIGTTIN = 0x1a + _SIGTTOU = 0x1b + _SIGIO = 0x16 + _SIGXCPU = 0x1e + _SIGXFSZ = 0x1f + _SIGVTALRM = 0x1c + _SIGPROF = 0x1d + _SIGWINCH = 0x14 + _SIGUSR1 = 0x10 + _SIGUSR2 = 0x11 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 
+ _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + __SC_PAGESIZE = 0xb + __SC_NPROCESSORS_ONLN = 0xf + + _PTHREAD_CREATE_DETACHED = 0x40 + + _FORK_NOSIGCHLD = 0x1 + _FORK_WAITPID = 0x2 + + _MAXHOSTNAMELEN = 0x100 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x80 + _O_TRUNC = 0x200 + _O_CREAT = 0x100 + _O_CLOEXEC = 0x800000 + _FD_CLOEXEC = 0x1 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _F_SETFD = 0x2 + + _POLLIN = 0x1 + _POLLOUT = 0x4 + _POLLHUP = 0x10 + _POLLERR = 0x8 + + _PORT_SOURCE_FD = 0x4 + _PORT_SOURCE_ALERT = 0x5 + _PORT_ALERT_UPDATE = 0x2 +) + +type semt struct { + sem_count uint32 + sem_type uint16 + sem_magic uint16 + sem_pad1 [3]uint64 + sem_pad2 [2]uint64 +} + +type sigset struct { + __sigbits [4]uint32 +} + +type stackt struct { + ss_sp *byte + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + si_pad int32 + __data [240]byte +} + +type sigactiont struct { + sa_flags int32 + pad_cgo_0 [4]byte + _funcptr [8]byte + sa_mask sigset +} + +type fpregset struct { + fp_reg_set [528]byte +} + +type mcontext struct { + gregs [28]int64 + fpregs fpregset +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_sigmask sigset + uc_stack stackt + pad_cgo_0 [8]byte + uc_mcontext mcontext + uc_filler [5]int64 + pad_cgo_1 [8]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type portevent struct { + portev_events int32 + 
portev_source uint16 + portev_pad uint16 + portev_object uint64 + portev_user *byte +} + +type pthread uint32 +type pthreadattr struct { + __pthread_attrp *byte +} + +type stat struct { + st_dev uint64 + st_ino uint64 + st_mode uint32 + st_nlink uint32 + st_uid uint32 + st_gid uint32 + st_rdev uint64 + st_size int64 + st_atim timespec + st_mtim timespec + st_ctim timespec + st_blksize int32 + pad_cgo_0 [4]byte + st_blocks int64 + st_fstype [16]int8 +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_solaris.go defs_solaris_amd64.go + +const ( + _REG_RDI = 0x8 + _REG_RSI = 0x9 + _REG_RDX = 0xc + _REG_RCX = 0xd + _REG_R8 = 0x7 + _REG_R9 = 0x6 + _REG_R10 = 0x5 + _REG_R11 = 0x4 + _REG_R12 = 0x3 + _REG_R13 = 0x2 + _REG_R14 = 0x1 + _REG_R15 = 0x0 + _REG_RBP = 0xa + _REG_RBX = 0xb + _REG_RAX = 0xe + _REG_GS = 0x17 + _REG_FS = 0x16 + _REG_ES = 0x18 + _REG_DS = 0x19 + _REG_TRAPNO = 0xf + _REG_ERR = 0x10 + _REG_RIP = 0x11 + _REG_CS = 0x12 + _REG_RFLAGS = 0x13 + _REG_RSP = 0x14 + _REG_SS = 0x15 +) diff --git a/src/runtime/defs2_linux.go b/src/runtime/defs2_linux.go new file mode 100644 index 0000000..5d6730a --- /dev/null +++ b/src/runtime/defs2_linux.go @@ -0,0 +1,138 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* + * Input to cgo -cdefs + +GOARCH=386 go tool cgo -cdefs defs2_linux.go >defs_linux_386.h + +The asm header tricks we have to use for Linux on amd64 +(see defs.c and defs1.c) don't work here, so this is yet another +file. Sigh. 
+*/ + +package runtime + +/* +#cgo CFLAGS: -I/tmp/linux/arch/x86/include -I/tmp/linux/include -D_LOOSE_KERNEL_NAMES -D__ARCH_SI_UID_T=__kernel_uid32_t + +#define size_t __kernel_size_t +#define pid_t int +#include <asm/signal.h> +#include <asm/mman.h> +#include <asm/sigcontext.h> +#include <asm/ucontext.h> +#include <asm/siginfo.h> +#include <asm-generic/errno.h> +#include <asm-generic/fcntl.h> +#include <asm-generic/poll.h> +#include <linux/eventpoll.h> + +// This is the sigaction structure from the Linux 2.1.68 kernel which +// is used with the rt_sigaction system call. For 386 this is not +// defined in any public header file. + +struct kernel_sigaction { + __sighandler_t k_sa_handler; + unsigned long sa_flags; + void (*sa_restorer) (void); + unsigned long long sa_mask; +}; +*/ +import "C" + +const ( + EINTR = C.EINTR + EAGAIN = C.EAGAIN + ENOMEM = C.ENOMEM + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANONYMOUS + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + MADV_HUGEPAGE = C.MADV_HUGEPAGE + MADV_NOHUGEPAGE = C.MADV_NOHUGEPAGE + + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + SA_RESTORER = C.SA_RESTORER + SA_SIGINFO = C.SA_SIGINFO + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGBUS = C.SIGBUS + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGUSR1 = C.SIGUSR1 + SIGSEGV = C.SIGSEGV + SIGUSR2 = C.SIGUSR2 + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGSTKFLT = C.SIGSTKFLT + SIGCHLD = C.SIGCHLD + SIGCONT = C.SIGCONT + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGURG = C.SIGURG + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGIO = C.SIGIO + SIGPWR = C.SIGPWR + SIGSYS = C.SIGSYS + + FPE_INTDIV = C.FPE_INTDIV 
+ FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + O_RDONLY = C.O_RDONLY + O_CLOEXEC = C.O_CLOEXEC +) + +type Fpreg C.struct__fpreg +type Fpxreg C.struct__fpxreg +type Xmmreg C.struct__xmmreg +type Fpstate C.struct__fpstate +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Sigaction C.struct_kernel_sigaction +type Siginfo C.siginfo_t +type StackT C.stack_t +type Sigcontext C.struct_sigcontext +type Ucontext C.struct_ucontext +type Itimerval C.struct_itimerval +type EpollEvent C.struct_epoll_event diff --git a/src/runtime/defs3_linux.go b/src/runtime/defs3_linux.go new file mode 100644 index 0000000..99479aa --- /dev/null +++ b/src/runtime/defs3_linux.go @@ -0,0 +1,43 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build ignore + +/* +Input to cgo -cdefs + +GOARCH=ppc64 cgo -cdefs defs_linux.go defs3_linux.go > defs_linux_ppc64.h +*/ + +package runtime + +/* +#define size_t __kernel_size_t +#define sigset_t __sigset_t // rename the sigset_t here otherwise cgo will complain about "inconsistent definitions for C.sigset_t" +#define _SYS_TYPES_H // avoid inclusion of sys/types.h +#include <asm/ucontext.h> +#include <asm-generic/fcntl.h> +*/ +import "C" + +const ( + O_RDONLY = C.O_RDONLY + O_CLOEXEC = C.O_CLOEXEC + SA_RESTORER = 0 // unused +) + +type Usigset C.__sigset_t + +// types used in sigcontext +type Ptregs C.struct_pt_regs +type Gregset C.elf_gregset_t +type FPregset C.elf_fpregset_t +type Vreg C.elf_vrreg_t + +type StackT C.stack_t + +// PPC64 uses sigcontext in place of mcontext in ucontext. +// see https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/include/uapi/asm/ucontext.h +type Sigcontext C.struct_sigcontext +type Ucontext C.struct_ucontext diff --git a/src/runtime/defs_aix.go b/src/runtime/defs_aix.go new file mode 100644 index 0000000..3895989 --- /dev/null +++ b/src/runtime/defs_aix.go @@ -0,0 +1,174 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo -godefs +GOARCH=ppc64 go tool cgo -godefs defs_aix.go > defs_aix_ppc64_tmp.go + +This is only a helper to create defs_aix_ppc64.go +Go runtime functions require the "linux" name of fields (ss_sp, si_addr, etc) +However, AIX structures don't provide such names and must be modified. + +TODO(aix): create a script to automatise defs_aix creation. 
+ +Modifications made: + - sigset replaced by a [4]uint64 array + - add sigset_all variable + - siginfo.si_addr uintptr instead of *byte + - add (*timeval) set_usec + - stackt.ss_sp uintptr instead of *byte + - stackt.ss_size uintptr instead of uint64 + - sigcontext.sc_jmpbuf context64 instead of jumbuf + - ucontext.__extctx is a uintptr because we don't need extctx struct + - ucontext.uc_mcontext: replace jumbuf structure by context64 structure + - sigaction.sa_handler represents union field as both are uintptr + - tstate.* replace *byte by uintptr + + +*/ + +package runtime + +/* + +#include <sys/types.h> +#include <sys/errno.h> +#include <sys/time.h> +#include <sys/signal.h> +#include <sys/mman.h> +#include <sys/thread.h> +#include <sys/resource.h> + +#include <unistd.h> +#include <fcntl.h> +#include <pthread.h> +#include <semaphore.h> +*/ +import "C" + +const ( + _EPERM = C.EPERM + _ENOENT = C.ENOENT + _EINTR = C.EINTR + _EAGAIN = C.EAGAIN + _ENOMEM = C.ENOMEM + _EACCES = C.EACCES + _EFAULT = C.EFAULT + _EINVAL = C.EINVAL + _ETIMEDOUT = C.ETIMEDOUT + + _PROT_NONE = C.PROT_NONE + _PROT_READ = C.PROT_READ + _PROT_WRITE = C.PROT_WRITE + _PROT_EXEC = C.PROT_EXEC + + _MAP_ANON = C.MAP_ANONYMOUS + _MAP_PRIVATE = C.MAP_PRIVATE + _MAP_FIXED = C.MAP_FIXED + _MADV_DONTNEED = C.MADV_DONTNEED + + _SIGHUP = C.SIGHUP + _SIGINT = C.SIGINT + _SIGQUIT = C.SIGQUIT + _SIGILL = C.SIGILL + _SIGTRAP = C.SIGTRAP + _SIGABRT = C.SIGABRT + _SIGBUS = C.SIGBUS + _SIGFPE = C.SIGFPE + _SIGKILL = C.SIGKILL + _SIGUSR1 = C.SIGUSR1 + _SIGSEGV = C.SIGSEGV + _SIGUSR2 = C.SIGUSR2 + _SIGPIPE = C.SIGPIPE + _SIGALRM = C.SIGALRM + _SIGCHLD = C.SIGCHLD + _SIGCONT = C.SIGCONT + _SIGSTOP = C.SIGSTOP + _SIGTSTP = C.SIGTSTP + _SIGTTIN = C.SIGTTIN + _SIGTTOU = C.SIGTTOU + _SIGURG = C.SIGURG + _SIGXCPU = C.SIGXCPU + _SIGXFSZ = C.SIGXFSZ + _SIGVTALRM = C.SIGVTALRM + _SIGPROF = C.SIGPROF + _SIGWINCH = C.SIGWINCH + _SIGIO = C.SIGIO + _SIGPWR = C.SIGPWR + _SIGSYS = C.SIGSYS + _SIGTERM = C.SIGTERM + _SIGEMT = 
C.SIGEMT + _SIGWAITING = C.SIGWAITING + + _FPE_INTDIV = C.FPE_INTDIV + _FPE_INTOVF = C.FPE_INTOVF + _FPE_FLTDIV = C.FPE_FLTDIV + _FPE_FLTOVF = C.FPE_FLTOVF + _FPE_FLTUND = C.FPE_FLTUND + _FPE_FLTRES = C.FPE_FLTRES + _FPE_FLTINV = C.FPE_FLTINV + _FPE_FLTSUB = C.FPE_FLTSUB + + _BUS_ADRALN = C.BUS_ADRALN + _BUS_ADRERR = C.BUS_ADRERR + _BUS_OBJERR = C.BUS_OBJERR + + _SEGV_MAPERR = C.SEGV_MAPERR + _SEGV_ACCERR = C.SEGV_ACCERR + + _ITIMER_REAL = C.ITIMER_REAL + _ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + _ITIMER_PROF = C.ITIMER_PROF + + _O_RDONLY = C.O_RDONLY + _O_WRONLY = C.O_WRONLY + _O_NONBLOCK = C.O_NONBLOCK + _O_CREAT = C.O_CREAT + _O_TRUNC = C.O_TRUNC + + _SS_DISABLE = C.SS_DISABLE + _SI_USER = C.SI_USER + _SIG_BLOCK = C.SIG_BLOCK + _SIG_UNBLOCK = C.SIG_UNBLOCK + _SIG_SETMASK = C.SIG_SETMASK + + _SA_SIGINFO = C.SA_SIGINFO + _SA_RESTART = C.SA_RESTART + _SA_ONSTACK = C.SA_ONSTACK + + _PTHREAD_CREATE_DETACHED = C.PTHREAD_CREATE_DETACHED + + __SC_PAGE_SIZE = C._SC_PAGE_SIZE + __SC_NPROCESSORS_ONLN = C._SC_NPROCESSORS_ONLN + + _F_SETFD = C.F_SETFD + _F_SETFL = C.F_SETFL + _F_GETFD = C.F_GETFD + _F_GETFL = C.F_GETFL + _FD_CLOEXEC = C.FD_CLOEXEC +) + +type sigset C.sigset_t +type siginfo C.siginfo_t +type timespec C.struct_timespec +type timestruc C.struct_timestruc_t +type timeval C.struct_timeval +type itimerval C.struct_itimerval + +type stackt C.stack_t +type sigcontext C.struct_sigcontext +type ucontext C.ucontext_t +type _Ctype_struct___extctx uint64 // ucontext use a pointer to this structure but it shouldn't be used +type jmpbuf C.struct___jmpbuf +type context64 C.struct___context64 +type sigactiont C.struct_sigaction +type tstate C.struct_tstate +type rusage C.struct_rusage + +type pthread C.pthread_t +type pthread_attr C.pthread_attr_t + +type semt C.sem_t diff --git a/src/runtime/defs_aix_ppc64.go b/src/runtime/defs_aix_ppc64.go new file mode 100644 index 0000000..2d25b7c --- /dev/null +++ b/src/runtime/defs_aix_ppc64.go @@ -0,0 +1,214 @@ +// Copyright 2018 The Go 
Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix + +package runtime + +const ( + _EPERM = 0x1 + _ENOENT = 0x2 + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + _EACCES = 0xd + _EFAULT = 0xe + _EINVAL = 0x16 + _ETIMEDOUT = 0x4e + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x10 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x100 + _MADV_DONTNEED = 0x4 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0xa + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0x1e + _SIGSEGV = 0xb + _SIGUSR2 = 0x1f + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGCHLD = 0x14 + _SIGCONT = 0x13 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x10 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x22 + _SIGPROF = 0x20 + _SIGWINCH = 0x1c + _SIGIO = 0x17 + _SIGPWR = 0x1d + _SIGSYS = 0xc + _SIGTERM = 0xf + _SIGEMT = 0x7 + _SIGWAITING = 0x27 + + _FPE_INTDIV = 0x14 + _FPE_INTOVF = 0x15 + _FPE_FLTDIV = 0x16 + _FPE_FLTOVF = 0x17 + _FPE_FLTUND = 0x18 + _FPE_FLTRES = 0x19 + _FPE_FLTINV = 0x1a + _FPE_FLTSUB = 0x1b + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + _ + _SEGV_MAPERR = 0x32 + _SEGV_ACCERR = 0x33 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x100 + _O_TRUNC = 0x200 + + _SS_DISABLE = 0x2 + _SI_USER = 0x0 + _SIG_BLOCK = 0x0 + _SIG_UNBLOCK = 0x1 + _SIG_SETMASK = 0x2 + + _SA_SIGINFO = 0x100 + _SA_RESTART = 0x8 + _SA_ONSTACK = 0x1 + + _PTHREAD_CREATE_DETACHED = 0x1 + + __SC_PAGE_SIZE = 0x30 + __SC_NPROCESSORS_ONLN = 0x48 + + _F_SETFD = 0x2 + _F_SETFL = 0x4 + _F_GETFD = 0x1 + _F_GETFL = 0x3 + _FD_CLOEXEC = 0x1 +) + +type sigset [4]uint64 + +var sigset_all = sigset{^uint64(0), ^uint64(0), ^uint64(0), ^uint64(0)} + +type siginfo struct { + si_signo int32 + si_errno 
int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uintptr + si_band int64 + si_value [2]int32 // [8]byte + __si_flags int32 + __pad [3]int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + __pad [4]int32 + pas_cgo_0 [4]byte +} + +type sigcontext struct { + sc_onstack int32 + pad_cgo_0 [4]byte + sc_mask sigset + sc_uerror int32 + sc_jmpbuf context64 +} + +type ucontext struct { + __sc_onstack int32 + pad_cgo_0 [4]byte + uc_sigmask sigset + __sc_error int32 + pad_cgo_1 [4]byte + uc_mcontext context64 + uc_link *ucontext + uc_stack stackt + __extctx uintptr // pointer to struct __extctx but we don't use it + __extctx_magic int32 + __pad int32 +} + +type context64 struct { + gpr [32]uint64 + msr uint64 + iar uint64 + lr uint64 + ctr uint64 + cr uint32 + xer uint32 + fpscr uint32 + fpscrx uint32 + except [1]uint64 + fpr [32]float64 + fpeu uint8 + fpinfo uint8 + fpscr24_31 uint8 + pad [1]uint8 + excp_type int32 +} + +type sigactiont struct { + sa_handler uintptr // a union of two pointer + sa_mask sigset + sa_flags int32 + pad_cgo_0 [4]byte +} + +type pthread uint32 +type pthread_attr *byte + +type semt int32 diff --git a/src/runtime/defs_arm_linux.go b/src/runtime/defs_arm_linux.go new file mode 100644 index 0000000..805735b --- /dev/null +++ b/src/runtime/defs_arm_linux.go @@ -0,0 +1,124 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. 
+On a Debian Lenny arm linux distribution: + +cgo -cdefs defs_arm.c >arm/defs.h +*/ + +package runtime + +/* +#cgo CFLAGS: -I/usr/src/linux-headers-2.6.26-2-versatile/include + +#define __ARCH_SI_UID_T int +#include <asm/signal.h> +#include <asm/mman.h> +#include <asm/sigcontext.h> +#include <asm/ucontext.h> +#include <asm/siginfo.h> +#include <linux/time.h> + +struct xsiginfo { + int si_signo; + int si_errno; + int si_code; + char _sifields[4]; +}; + +#undef sa_handler +#undef sa_flags +#undef sa_restorer +#undef sa_mask + +struct xsigaction { + void (*sa_handler)(void); + unsigned long sa_flags; + void (*sa_restorer)(void); + unsigned int sa_mask; // mask last for extensibility +}; +*/ +import "C" + +const ( + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANONYMOUS + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + SA_RESTORER = C.SA_RESTORER + SA_SIGINFO = C.SA_SIGINFO + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGBUS = C.SIGBUS + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGUSR1 = C.SIGUSR1 + SIGSEGV = C.SIGSEGV + SIGUSR2 = C.SIGUSR2 + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGSTKFLT = C.SIGSTKFLT + SIGCHLD = C.SIGCHLD + SIGCONT = C.SIGCONT + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGURG = C.SIGURG + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGIO = C.SIGIO + SIGPWR = C.SIGPWR + SIGSYS = C.SIGSYS + + FPE_INTDIV = C.FPE_INTDIV & 0xFFFF + FPE_INTOVF = C.FPE_INTOVF & 0xFFFF + FPE_FLTDIV = C.FPE_FLTDIV & 0xFFFF + FPE_FLTOVF = C.FPE_FLTOVF & 0xFFFF + FPE_FLTUND = C.FPE_FLTUND & 0xFFFF + FPE_FLTRES = C.FPE_FLTRES & 0xFFFF + FPE_FLTINV = C.FPE_FLTINV & 0xFFFF + FPE_FLTSUB = 
C.FPE_FLTSUB & 0xFFFF + + BUS_ADRALN = C.BUS_ADRALN & 0xFFFF + BUS_ADRERR = C.BUS_ADRERR & 0xFFFF + BUS_OBJERR = C.BUS_OBJERR & 0xFFFF + + SEGV_MAPERR = C.SEGV_MAPERR & 0xFFFF + SEGV_ACCERR = C.SEGV_ACCERR & 0xFFFF + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_PROF = C.ITIMER_PROF + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL +) + +type Timespec C.struct_timespec +type StackT C.stack_t +type Sigcontext C.struct_sigcontext +type Ucontext C.struct_ucontext +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval +type Siginfo C.struct_xsiginfo +type Sigaction C.struct_xsigaction diff --git a/src/runtime/defs_darwin.go b/src/runtime/defs_darwin.go new file mode 100644 index 0000000..89e4253 --- /dev/null +++ b/src/runtime/defs_darwin.go @@ -0,0 +1,167 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=amd64 go tool cgo -cdefs defs_darwin.go >defs_darwin_amd64.h +*/ + +package runtime + +/* +#define __DARWIN_UNIX03 0 +#include <mach/mach_time.h> +#include <sys/types.h> +#include <sys/time.h> +#include <errno.h> +#include <signal.h> +#include <sys/event.h> +#include <sys/mman.h> +#include <pthread.h> +#include <fcntl.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EFAULT = C.EFAULT + EAGAIN = C.EAGAIN + ETIMEDOUT = C.ETIMEDOUT + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + MADV_FREE_REUSABLE = C.MADV_FREE_REUSABLE + MADV_FREE_REUSE = C.MADV_FREE_REUSE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + SA_USERTRAMP = C.SA_USERTRAMP + SA_64REGSET = C.SA_64REGSET + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + 
SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGINFO = C.SIGINFO + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + EV_ADD = C.EV_ADD + EV_DELETE = C.EV_DELETE + EV_CLEAR = C.EV_CLEAR + EV_RECEIPT = C.EV_RECEIPT + EV_ERROR = C.EV_ERROR + EV_EOF = C.EV_EOF + EVFILT_READ = C.EVFILT_READ + EVFILT_WRITE = C.EVFILT_WRITE + + PTHREAD_CREATE_DETACHED = C.PTHREAD_CREATE_DETACHED + + F_SETFD = C.F_SETFD + F_GETFL = C.F_GETFL + F_SETFL = C.F_SETFL + FD_CLOEXEC = C.FD_CLOEXEC + + O_WRONLY = C.O_WRONLY + O_NONBLOCK = C.O_NONBLOCK + O_CREAT = C.O_CREAT + O_TRUNC = C.O_TRUNC +) + +type StackT C.struct_sigaltstack +type Sighandler C.union___sigaction_u + +type Sigaction C.struct___sigaction // used in syscalls +type Usigaction C.struct_sigaction // used by sigaction second argument +type Sigset C.sigset_t +type Sigval C.union_sigval +type Siginfo C.siginfo_t +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval +type Timespec C.struct_timespec + +type FPControl C.struct_fp_control +type FPStatus C.struct_fp_status +type RegMMST C.struct_mmst_reg +type RegXMM 
C.struct_xmm_reg + +type Regs64 C.struct_x86_thread_state64 +type FloatState64 C.struct_x86_float_state64 +type ExceptionState64 C.struct_x86_exception_state64 +type Mcontext64 C.struct_mcontext64 + +type Regs32 C.struct_i386_thread_state +type FloatState32 C.struct_i386_float_state +type ExceptionState32 C.struct_i386_exception_state +type Mcontext32 C.struct_mcontext32 + +type Ucontext C.struct_ucontext + +type Kevent C.struct_kevent + +type Pthread C.pthread_t +type PthreadAttr C.pthread_attr_t +type PthreadMutex C.pthread_mutex_t +type PthreadMutexAttr C.pthread_mutexattr_t +type PthreadCond C.pthread_cond_t +type PthreadCondAttr C.pthread_condattr_t + +type MachTimebaseInfo C.mach_timebase_info_data_t diff --git a/src/runtime/defs_darwin_amd64.go b/src/runtime/defs_darwin_amd64.go new file mode 100644 index 0000000..84e6f37 --- /dev/null +++ b/src/runtime/defs_darwin_amd64.go @@ -0,0 +1,375 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_darwin.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + _MADV_FREE_REUSABLE = 0x7 + _MADV_FREE_REUSE = 0x8 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + _SA_USERTRAMP = 0x100 + _SA_64REGSET = 0x200 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + 
+ _FPE_INTDIV = 0x7 + _FPE_INTOVF = 0x8 + _FPE_FLTDIV = 0x1 + _FPE_FLTOVF = 0x2 + _FPE_FLTUND = 0x3 + _FPE_FLTRES = 0x4 + _FPE_FLTINV = 0x5 + _FPE_FLTSUB = 0x6 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 + + _PTHREAD_CREATE_DETACHED = 0x2 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _FD_CLOEXEC = 0x1 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 +) + +type stackt struct { + ss_sp *byte + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type sigactiont struct { + __sigaction_u [8]byte + sa_tramp unsafe.Pointer + sa_mask uint32 + sa_flags int32 +} + +type usigactiont struct { + __sigaction_u [8]byte + sa_mask uint32 + sa_flags int32 +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uint64 + si_value [8]byte + si_band int64 + __pad [7]uint64 +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type fpcontrol struct { + pad_cgo_0 [2]byte +} + +type fpstatus struct { + pad_cgo_0 [2]byte +} + +type regmmst struct { + mmst_reg [10]int8 + mmst_rsrv [6]int8 +} + +type regxmm struct { + xmm_reg [16]int8 +} + +type regs64 struct { + rax uint64 + rbx uint64 + rcx uint64 + rdx uint64 + rdi uint64 + rsi uint64 + rbp uint64 + rsp uint64 + r8 uint64 + r9 uint64 + r10 uint64 + r11 uint64 + r12 uint64 + r13 uint64 + r14 uint64 + r15 
uint64 + rip uint64 + rflags uint64 + cs uint64 + fs uint64 + gs uint64 +} + +type floatstate64 struct { + fpu_reserved [2]int32 + fpu_fcw fpcontrol + fpu_fsw fpstatus + fpu_ftw uint8 + fpu_rsrv1 uint8 + fpu_fop uint16 + fpu_ip uint32 + fpu_cs uint16 + fpu_rsrv2 uint16 + fpu_dp uint32 + fpu_ds uint16 + fpu_rsrv3 uint16 + fpu_mxcsr uint32 + fpu_mxcsrmask uint32 + fpu_stmm0 regmmst + fpu_stmm1 regmmst + fpu_stmm2 regmmst + fpu_stmm3 regmmst + fpu_stmm4 regmmst + fpu_stmm5 regmmst + fpu_stmm6 regmmst + fpu_stmm7 regmmst + fpu_xmm0 regxmm + fpu_xmm1 regxmm + fpu_xmm2 regxmm + fpu_xmm3 regxmm + fpu_xmm4 regxmm + fpu_xmm5 regxmm + fpu_xmm6 regxmm + fpu_xmm7 regxmm + fpu_xmm8 regxmm + fpu_xmm9 regxmm + fpu_xmm10 regxmm + fpu_xmm11 regxmm + fpu_xmm12 regxmm + fpu_xmm13 regxmm + fpu_xmm14 regxmm + fpu_xmm15 regxmm + fpu_rsrv4 [96]int8 + fpu_reserved1 int32 +} + +type exceptionstate64 struct { + trapno uint16 + cpu uint16 + err uint32 + faultvaddr uint64 +} + +type mcontext64 struct { + es exceptionstate64 + ss regs64 + fs floatstate64 + pad_cgo_0 [4]byte +} + +type regs32 struct { + eax uint32 + ebx uint32 + ecx uint32 + edx uint32 + edi uint32 + esi uint32 + ebp uint32 + esp uint32 + ss uint32 + eflags uint32 + eip uint32 + cs uint32 + ds uint32 + es uint32 + fs uint32 + gs uint32 +} + +type floatstate32 struct { + fpu_reserved [2]int32 + fpu_fcw fpcontrol + fpu_fsw fpstatus + fpu_ftw uint8 + fpu_rsrv1 uint8 + fpu_fop uint16 + fpu_ip uint32 + fpu_cs uint16 + fpu_rsrv2 uint16 + fpu_dp uint32 + fpu_ds uint16 + fpu_rsrv3 uint16 + fpu_mxcsr uint32 + fpu_mxcsrmask uint32 + fpu_stmm0 regmmst + fpu_stmm1 regmmst + fpu_stmm2 regmmst + fpu_stmm3 regmmst + fpu_stmm4 regmmst + fpu_stmm5 regmmst + fpu_stmm6 regmmst + fpu_stmm7 regmmst + fpu_xmm0 regxmm + fpu_xmm1 regxmm + fpu_xmm2 regxmm + fpu_xmm3 regxmm + fpu_xmm4 regxmm + fpu_xmm5 regxmm + fpu_xmm6 regxmm + fpu_xmm7 regxmm + fpu_rsrv4 [224]int8 + fpu_reserved1 int32 +} + +type exceptionstate32 struct { + trapno uint16 + cpu uint16 
+ err uint32 + faultvaddr uint32 +} + +type mcontext32 struct { + es exceptionstate32 + ss regs32 + fs floatstate32 +} + +type ucontext struct { + uc_onstack int32 + uc_sigmask uint32 + uc_stack stackt + uc_link *ucontext + uc_mcsize uint64 + uc_mcontext *mcontext64 +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} + +type pthread uintptr +type pthreadattr struct { + X__sig int64 + X__opaque [56]int8 +} +type pthreadmutex struct { + X__sig int64 + X__opaque [56]int8 +} +type pthreadmutexattr struct { + X__sig int64 + X__opaque [8]int8 +} +type pthreadcond struct { + X__sig int64 + X__opaque [40]int8 +} +type pthreadcondattr struct { + X__sig int64 + X__opaque [8]int8 +} + +type machTimebaseInfo struct { + numer uint32 + denom uint32 +} diff --git a/src/runtime/defs_darwin_arm64.go b/src/runtime/defs_darwin_arm64.go new file mode 100644 index 0000000..30d7443 --- /dev/null +++ b/src/runtime/defs_darwin_arm64.go @@ -0,0 +1,242 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_darwin.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + _MADV_FREE_REUSABLE = 0x7 + _MADV_FREE_REUSE = 0x8 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + _SA_USERTRAMP = 0x100 + _SA_64REGSET = 0x200 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM 
= 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x7 + _FPE_INTOVF = 0x8 + _FPE_FLTDIV = 0x1 + _FPE_FLTOVF = 0x2 + _FPE_FLTUND = 0x3 + _FPE_FLTRES = 0x4 + _FPE_FLTINV = 0x5 + _FPE_FLTSUB = 0x6 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 + + _PTHREAD_CREATE_DETACHED = 0x2 + + _PTHREAD_KEYS_MAX = 512 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _FD_CLOEXEC = 0x1 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 +) + +type stackt struct { + ss_sp *byte + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type sigactiont struct { + __sigaction_u [8]byte + sa_tramp unsafe.Pointer + sa_mask uint32 + sa_flags int32 +} + +type usigactiont struct { + __sigaction_u [8]byte + sa_mask uint32 + sa_flags int32 +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr *byte + si_value [8]byte + si_band int64 + __pad [7]uint64 +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type exceptionstate64 struct { + far uint64 // virtual fault addr + esr uint32 // exception syndrome + exc uint32 // number of arm exception taken +} + +type regs64 struct { + x [29]uint64 // registers x0 to x28 + fp uint64 // frame register, x29 + lr uint64 // link register, x30 + sp uint64 // stack 
pointer, x31 + pc uint64 // program counter + cpsr uint32 // current program status register + __pad uint32 +} + +type neonstate64 struct { + v [64]uint64 // actually [32]uint128 + fpsr uint32 + fpcr uint32 +} + +type mcontext64 struct { + es exceptionstate64 + ss regs64 + ns neonstate64 +} + +type ucontext struct { + uc_onstack int32 + uc_sigmask uint32 + uc_stack stackt + uc_link *ucontext + uc_mcsize uint64 + uc_mcontext *mcontext64 +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} + +type pthread uintptr +type pthreadattr struct { + X__sig int64 + X__opaque [56]int8 +} +type pthreadmutex struct { + X__sig int64 + X__opaque [56]int8 +} +type pthreadmutexattr struct { + X__sig int64 + X__opaque [8]int8 +} +type pthreadcond struct { + X__sig int64 + X__opaque [40]int8 +} +type pthreadcondattr struct { + X__sig int64 + X__opaque [8]int8 +} + +type machTimebaseInfo struct { + numer uint32 + denom uint32 +} + +type pthreadkey uint64 diff --git a/src/runtime/defs_dragonfly.go b/src/runtime/defs_dragonfly.go new file mode 100644 index 0000000..9dcfdf0 --- /dev/null +++ b/src/runtime/defs_dragonfly.go @@ -0,0 +1,132 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. 
+ +GOARCH=amd64 go tool cgo -cdefs defs_dragonfly.go >defs_dragonfly_amd64.h +*/ + +package runtime + +/* +#include <sys/user.h> +#include <sys/time.h> +#include <sys/event.h> +#include <sys/mman.h> +#include <sys/ucontext.h> +#include <sys/rtprio.h> +#include <sys/signal.h> +#include <sys/unistd.h> +#include <errno.h> +#include <signal.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EFAULT = C.EFAULT + EBUSY = C.EBUSY + EAGAIN = C.EAGAIN + + O_WRONLY = C.O_WRONLY + O_NONBLOCK = C.O_NONBLOCK + O_CREAT = C.O_CREAT + O_TRUNC = C.O_TRUNC + O_CLOEXEC = C.O_CLOEXEC + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGINFO = C.SIGINFO + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL 
= C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + EV_ADD = C.EV_ADD + EV_DELETE = C.EV_DELETE + EV_CLEAR = C.EV_CLEAR + EV_ERROR = C.EV_ERROR + EV_EOF = C.EV_EOF + EVFILT_READ = C.EVFILT_READ + EVFILT_WRITE = C.EVFILT_WRITE +) + +type Rtprio C.struct_rtprio +type Lwpparams C.struct_lwp_params +type Sigset C.struct___sigset +type StackT C.stack_t + +type Siginfo C.siginfo_t + +type Mcontext C.mcontext_t +type Ucontext C.ucontext_t + +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval + +type Kevent C.struct_kevent diff --git a/src/runtime/defs_dragonfly_amd64.go b/src/runtime/defs_dragonfly_amd64.go new file mode 100644 index 0000000..f1a2302 --- /dev/null +++ b/src/runtime/defs_dragonfly_amd64.go @@ -0,0 +1,211 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_dragonfly.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EBUSY = 0x10 + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x20000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 
0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 +} + +type lwpparams struct { + start_func uintptr + arg unsafe.Pointer + stack uintptr + tid1 unsafe.Pointer // *int32 + tid2 unsafe.Pointer // *int32 +} + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uint64 + si_value [8]byte + si_band int64 + __spare__ [7]int32 + pad_cgo_0 [4]byte +} + +type mcontext struct { + mc_onstack uint64 + mc_rdi uint64 + mc_rsi uint64 + mc_rdx uint64 + mc_rcx uint64 + mc_r8 uint64 + mc_r9 uint64 + mc_rax uint64 + mc_rbx uint64 + mc_rbp uint64 + mc_r10 uint64 + mc_r11 uint64 + mc_r12 uint64 + mc_r13 uint64 + mc_r14 uint64 + mc_r15 uint64 + mc_xflags uint64 + mc_trapno uint64 + mc_addr uint64 + mc_flags uint64 + mc_err uint64 + mc_rip uint64 + mc_cs uint64 + mc_rflags uint64 + mc_rsp uint64 + mc_ss uint64 + mc_len uint32 + mc_fpformat uint32 + mc_ownedfp uint32 + mc_reserved uint32 + mc_unused [8]uint32 + mc_fpregs [256]int32 +} + +type ucontext struct { + uc_sigmask sigset + pad_cgo_0 [48]byte + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + __spare__ [8]int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + 
it_value timeval +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} diff --git a/src/runtime/defs_freebsd.go b/src/runtime/defs_freebsd.go new file mode 100644 index 0000000..d86ae91 --- /dev/null +++ b/src/runtime/defs_freebsd.go @@ -0,0 +1,174 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=amd64 go tool cgo -cdefs defs_freebsd.go >defs_freebsd_amd64.h +GOARCH=386 go tool cgo -cdefs defs_freebsd.go >defs_freebsd_386.h +GOARCH=arm go tool cgo -cdefs defs_freebsd.go >defs_freebsd_arm.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <unistd.h> +#include <fcntl.h> +#include <sys/time.h> +#include <signal.h> +#include <errno.h> +#include <sys/event.h> +#include <sys/mman.h> +#include <sys/ucontext.h> +#include <sys/umtx.h> +#include <sys/_umtx.h> +#include <sys/rtprio.h> +#include <sys/thr.h> +#include <sys/_sigset.h> +#include <sys/unistd.h> +#include <sys/sysctl.h> +#include <sys/cpuset.h> +#include <sys/param.h> +#include <sys/vdso.h> +*/ +import "C" + +// Local consts. +const ( + _NBBY = C.NBBY // Number of bits in a byte. + _CTL_MAXNAME = C.CTL_MAXNAME // Largest number of components supported. + _CPU_LEVEL_WHICH = C.CPU_LEVEL_WHICH // Actual mask/id for which. + _CPU_WHICH_PID = C.CPU_WHICH_PID // Specifies a process id. 
+) + +const ( + EINTR = C.EINTR + EFAULT = C.EFAULT + EAGAIN = C.EAGAIN + ETIMEDOUT = C.ETIMEDOUT + + O_WRONLY = C.O_WRONLY + O_NONBLOCK = C.O_NONBLOCK + O_CREAT = C.O_CREAT + O_TRUNC = C.O_TRUNC + O_CLOEXEC = C.O_CLOEXEC + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_SHARED = C.MAP_SHARED + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + + CLOCK_MONOTONIC = C.CLOCK_MONOTONIC + CLOCK_REALTIME = C.CLOCK_REALTIME + + UMTX_OP_WAIT_UINT = C.UMTX_OP_WAIT_UINT + UMTX_OP_WAIT_UINT_PRIVATE = C.UMTX_OP_WAIT_UINT_PRIVATE + UMTX_OP_WAKE = C.UMTX_OP_WAKE + UMTX_OP_WAKE_PRIVATE = C.UMTX_OP_WAKE_PRIVATE + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGINFO = C.SIGINFO + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + EV_ADD 
= C.EV_ADD + EV_DELETE = C.EV_DELETE + EV_CLEAR = C.EV_CLEAR + EV_RECEIPT = C.EV_RECEIPT + EV_ERROR = C.EV_ERROR + EV_EOF = C.EV_EOF + EVFILT_READ = C.EVFILT_READ + EVFILT_WRITE = C.EVFILT_WRITE +) + +type Rtprio C.struct_rtprio +type ThrParam C.struct_thr_param +type Sigset C.struct___sigset +type StackT C.stack_t + +type Siginfo C.siginfo_t + +type Mcontext C.mcontext_t +type Ucontext C.ucontext_t + +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval + +type Umtx_time C.struct__umtx_time + +type KeventT C.struct_kevent + +type bintime C.struct_bintime +type vdsoTimehands C.struct_vdso_timehands +type vdsoTimekeep C.struct_vdso_timekeep + +const ( + _VDSO_TK_VER_CURR = C.VDSO_TK_VER_CURR + + vdsoTimehandsSize = C.sizeof_struct_vdso_timehands + vdsoTimekeepSize = C.sizeof_struct_vdso_timekeep +) diff --git a/src/runtime/defs_freebsd_386.go b/src/runtime/defs_freebsd_386.go new file mode 100644 index 0000000..ee82741 --- /dev/null +++ b/src/runtime/defs_freebsd_386.go @@ -0,0 +1,270 @@ +// Code generated by cgo, then manually converted into appropriate naming and code +// for the Go runtime. 
+// go tool cgo -godefs defs_freebsd.go + +package runtime + +import "unsafe" + +const ( + _NBBY = 0x8 + _CTL_MAXNAME = 0x18 + _CPU_LEVEL_WHICH = 0x3 + _CPU_WHICH_PID = 0x2 +) + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x100000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_SHARED = 0x1 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _CLOCK_MONOTONIC = 0x4 + _CLOCK_REALTIME = 0x0 + + _UMTX_OP_WAIT_UINT = 0xb + _UMTX_OP_WAIT_UINT_PRIVATE = 0xf + _UMTX_OP_WAKE = 0x3 + _UMTX_OP_WAKE_PRIVATE = 0x10 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 +} + +type thrparam struct { + start_func uintptr + arg unsafe.Pointer + stack_base uintptr + stack_size uintptr 
+ tls_base unsafe.Pointer + tls_size uintptr + child_tid unsafe.Pointer // *int32 + parent_tid *int32 + flags int32 + rtp *rtprio + spare [3]uintptr +} + +type thread int32 // long + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uintptr + si_value [4]byte + _reason [32]byte +} + +type mcontext struct { + mc_onstack uint32 + mc_gs uint32 + mc_fs uint32 + mc_es uint32 + mc_ds uint32 + mc_edi uint32 + mc_esi uint32 + mc_ebp uint32 + mc_isp uint32 + mc_ebx uint32 + mc_edx uint32 + mc_ecx uint32 + mc_eax uint32 + mc_trapno uint32 + mc_err uint32 + mc_eip uint32 + mc_cs uint32 + mc_eflags uint32 + mc_esp uint32 + mc_ss uint32 + mc_len uint32 + mc_fpformat uint32 + mc_ownedfp uint32 + mc_flags uint32 + mc_fpstate [128]uint32 + mc_fsbase uint32 + mc_gsbase uint32 + mc_xfpustate uint32 + mc_xfpustate_len uint32 + mc_spare2 [4]uint32 +} + +type ucontext struct { + uc_sigmask sigset + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + uc_flags int32 + __spare__ [4]int32 + pad_cgo_0 [12]byte +} + +type timespec struct { + tv_sec int32 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = timediv(ns, 1e9, &ts.tv_nsec) +} + +type timeval struct { + tv_sec int32 + tv_usec int32 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type umtx_time struct { + _timeout timespec + _flags uint32 + _clockid uint32 +} + +type keventt struct { + ident uint32 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte + ext [4]uint64 +} + +type bintime struct { + sec int32 + frac uint64 +} + +type vdsoTimehands struct { + algo uint32 + gen uint32 + scale uint64 + offset_count uint32 + counter_mask uint32 + offset bintime + boottime bintime + 
x86_shift uint32 + x86_hpet_idx uint32 + res [6]uint32 +} + +type vdsoTimekeep struct { + ver uint32 + enabled uint32 + current uint32 +} + +const ( + _VDSO_TK_VER_CURR = 0x1 + + vdsoTimehandsSize = 0x50 + vdsoTimekeepSize = 0xc +) diff --git a/src/runtime/defs_freebsd_amd64.go b/src/runtime/defs_freebsd_amd64.go new file mode 100644 index 0000000..9003f92 --- /dev/null +++ b/src/runtime/defs_freebsd_amd64.go @@ -0,0 +1,282 @@ +// Code generated by cgo, then manually converted into appropriate naming and code +// for the Go runtime. +// go tool cgo -godefs defs_freebsd.go + +package runtime + +import "unsafe" + +const ( + _NBBY = 0x8 + _CTL_MAXNAME = 0x18 + _CPU_LEVEL_WHICH = 0x3 + _CPU_WHICH_PID = 0x2 +) + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x100000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_SHARED = 0x1 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _CLOCK_MONOTONIC = 0x4 + _CLOCK_REALTIME = 0x0 + + _UMTX_OP_WAIT_UINT = 0xb + _UMTX_OP_WAIT_UINT_PRIVATE = 0xf + _UMTX_OP_WAKE = 0x3 + _UMTX_OP_WAKE_PRIVATE = 0x10 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + 
_FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 +} + +type thrparam struct { + start_func uintptr + arg unsafe.Pointer + stack_base uintptr + stack_size uintptr + tls_base unsafe.Pointer + tls_size uintptr + child_tid unsafe.Pointer // *int64 + parent_tid *int64 + flags int32 + pad_cgo_0 [4]byte + rtp *rtprio + spare [3]uintptr +} + +type thread int64 // long + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uint64 + si_value [8]byte + _reason [40]byte +} + +type mcontext struct { + mc_onstack uint64 + mc_rdi uint64 + mc_rsi uint64 + mc_rdx uint64 + mc_rcx uint64 + mc_r8 uint64 + mc_r9 uint64 + mc_rax uint64 + mc_rbx uint64 + mc_rbp uint64 + mc_r10 uint64 + mc_r11 uint64 + mc_r12 uint64 + mc_r13 uint64 + mc_r14 uint64 + mc_r15 uint64 + mc_trapno uint32 + mc_fs uint16 + mc_gs uint16 + mc_addr uint64 + mc_flags uint32 + mc_es uint16 + mc_ds uint16 + mc_err uint64 + mc_rip uint64 + mc_cs uint64 + mc_rflags uint64 + mc_rsp uint64 + mc_ss uint64 + mc_len uint64 + mc_fpformat uint64 + mc_ownedfp uint64 + mc_fpstate [64]uint64 + mc_fsbase uint64 + mc_gsbase uint64 + mc_xfpustate uint64 + mc_xfpustate_len uint64 + mc_spare [4]uint64 +} + +type ucontext struct { + uc_sigmask sigset + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + uc_flags int32 + __spare__ [4]int32 + pad_cgo_0 [12]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + 
+//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type umtx_time struct { + _timeout timespec + _flags uint32 + _clockid uint32 +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte + ext [4]uint64 +} + +type bintime struct { + sec int64 + frac uint64 +} + +type vdsoTimehands struct { + algo uint32 + gen uint32 + scale uint64 + offset_count uint32 + counter_mask uint32 + offset bintime + boottime bintime + x86_shift uint32 + x86_hpet_idx uint32 + res [6]uint32 +} + +type vdsoTimekeep struct { + ver uint32 + enabled uint32 + current uint32 + pad_cgo_0 [4]byte +} + +const ( + _VDSO_TK_VER_CURR = 0x1 + + vdsoTimehandsSize = 0x58 + vdsoTimekeepSize = 0x10 +) diff --git a/src/runtime/defs_freebsd_arm.go b/src/runtime/defs_freebsd_arm.go new file mode 100644 index 0000000..68cc1b9 --- /dev/null +++ b/src/runtime/defs_freebsd_arm.go @@ -0,0 +1,245 @@ +// Code generated by cgo, then manually converted into appropriate naming and code +// for the Go runtime. 
+// go tool cgo -godefs defs_freebsd.go + +package runtime + +import "unsafe" + +const ( + _NBBY = 0x8 + _CTL_MAXNAME = 0x18 + _CPU_LEVEL_WHICH = 0x3 + _CPU_WHICH_PID = 0x2 +) + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x100000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_SHARED = 0x1 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _CLOCK_MONOTONIC = 0x4 + _CLOCK_REALTIME = 0x0 + + _UMTX_OP_WAIT_UINT = 0xb + _UMTX_OP_WAIT_UINT_PRIVATE = 0xf + _UMTX_OP_WAKE = 0x3 + _UMTX_OP_WAKE_PRIVATE = 0x10 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 +} + +type thrparam struct { + start_func uintptr + arg unsafe.Pointer + stack_base uintptr + stack_size uintptr 
+ tls_base unsafe.Pointer + tls_size uintptr + child_tid unsafe.Pointer // *int32 + parent_tid *int32 + flags int32 + rtp *rtprio + spare [3]uintptr +} + +type thread int32 // long + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uintptr + si_value [4]byte + _reason [32]byte +} + +type mcontext struct { + __gregs [17]uint32 + __fpu [140]byte +} + +type ucontext struct { + uc_sigmask sigset + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + uc_flags int32 + __spare__ [4]int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int32 + pad_cgo_0 [4]byte +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = int64(timediv(ns, 1e9, &ts.tv_nsec)) +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type umtx_time struct { + _timeout timespec + _flags uint32 + _clockid uint32 +} + +type keventt struct { + ident uint32 + filter int16 + flags uint16 + fflags uint32 + pad_cgo_0 [4]byte + data int64 + udata *byte + pad_cgo_1 [4]byte + ext [4]uint64 +} + +type bintime struct { + sec int64 + frac uint64 +} + +type vdsoTimehands struct { + algo uint32 + gen uint32 + scale uint64 + offset_count uint32 + counter_mask uint32 + offset bintime + boottime bintime + physical uint32 + res [7]uint32 +} + +type vdsoTimekeep struct { + ver uint32 + enabled uint32 + current uint32 + pad_cgo_0 [4]byte +} + +const ( + _VDSO_TK_VER_CURR = 0x1 + + vdsoTimehandsSize = 0x58 + vdsoTimekeepSize = 0x10 +) diff --git a/src/runtime/defs_freebsd_arm64.go b/src/runtime/defs_freebsd_arm64.go new file mode 100644 index 0000000..1d67236 --- /dev/null +++ b/src/runtime/defs_freebsd_arm64.go @@ -0,0 +1,265 
@@ +// Code generated by cgo, then manually converted into appropriate naming and code +// for the Go runtime. +// go tool cgo -godefs defs_freebsd.go + +package runtime + +import "unsafe" + +const ( + _NBBY = 0x8 + _CTL_MAXNAME = 0x18 + _CPU_LEVEL_WHICH = 0x3 + _CPU_WHICH_PID = 0x2 +) + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x100000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_SHARED = 0x1 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _CLOCK_MONOTONIC = 0x4 + _CLOCK_REALTIME = 0x0 + + _UMTX_OP_WAIT_UINT = 0xb + _UMTX_OP_WAIT_UINT_PRIVATE = 0xf + _UMTX_OP_WAKE = 0x3 + _UMTX_OP_WAKE_PRIVATE = 0x10 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 
+} + +type thrparam struct { + start_func uintptr + arg unsafe.Pointer + stack_base uintptr + stack_size uintptr + tls_base unsafe.Pointer + tls_size uintptr + child_tid unsafe.Pointer // *int64 + parent_tid *int64 + flags int32 + pad_cgo_0 [4]byte + rtp *rtprio + spare [3]uintptr +} + +type thread int64 // long + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uint64 + si_value [8]byte + _reason [40]byte +} + +type gpregs struct { + gp_x [30]uint64 + gp_lr uint64 + gp_sp uint64 + gp_elr uint64 + gp_spsr uint32 + gp_pad int32 +} + +type fpregs struct { + fp_q [64]uint64 // actually [32]uint128 + fp_sr uint32 + fp_cr uint32 + fp_flags int32 + fp_pad int32 +} + +type mcontext struct { + mc_gpregs gpregs + mc_fpregs fpregs + mc_flags int32 + mc_pad int32 + mc_spare [8]uint64 +} + +type ucontext struct { + uc_sigmask sigset + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + uc_flags int32 + __spare__ [4]int32 + pad_cgo_0 [12]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type umtx_time struct { + _timeout timespec + _flags uint32 + _clockid uint32 +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte + ext [4]uint64 +} + +type bintime struct { + sec int64 + frac uint64 +} + +type vdsoTimehands struct { + algo uint32 + gen uint32 + scale uint64 + offset_count uint32 + counter_mask uint32 + offset bintime + boottime bintime + physical uint32 + res 
[7]uint32 +} + +type vdsoTimekeep struct { + ver uint32 + enabled uint32 + current uint32 + pad_cgo_0 [4]byte +} + +const ( + _VDSO_TK_VER_CURR = 0x1 + + vdsoTimehandsSize = 0x58 + vdsoTimekeepSize = 0x10 +) diff --git a/src/runtime/defs_freebsd_riscv64.go b/src/runtime/defs_freebsd_riscv64.go new file mode 100644 index 0000000..b977bde --- /dev/null +++ b/src/runtime/defs_freebsd_riscv64.go @@ -0,0 +1,266 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_freebsd.go + +package runtime + +import "unsafe" + +const ( + _NBBY = 0x8 + _CTL_MAXNAME = 0x18 + _CPU_LEVEL_WHICH = 0x3 + _CPU_WHICH_PID = 0x2 +) + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + _ETIMEDOUT = 0x3c + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x100000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_SHARED = 0x1 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x5 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _CLOCK_MONOTONIC = 0x4 + _CLOCK_REALTIME = 0x0 + + _UMTX_OP_WAIT_UINT = 0xb + _UMTX_OP_WAIT_UINT_PRIVATE = 0xf + _UMTX_OP_WAKE = 0x3 + _UMTX_OP_WAKE_PRIVATE = 0x10 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x2 + _FPE_INTOVF = 0x1 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + 
_BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_RECEIPT = 0x40 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type rtprio struct { + _type uint16 + prio uint16 +} + +type thrparam struct { + start_func uintptr + arg unsafe.Pointer + stack_base uintptr + stack_size uintptr + tls_base unsafe.Pointer + tls_size uintptr + child_tid unsafe.Pointer // *int64 + parent_tid *int64 + flags int32 + pad_cgo_0 [4]byte + rtp *rtprio + spare [3]uintptr +} + +type thread int64 // long + +type sigset struct { + __bits [4]uint32 +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type siginfo struct { + si_signo int32 + si_errno int32 + si_code int32 + si_pid int32 + si_uid uint32 + si_status int32 + si_addr uint64 + si_value [8]byte + _reason [40]byte +} + +type gpregs struct { + gp_ra uint64 + gp_sp uint64 + gp_gp uint64 + gp_tp uint64 + gp_t [7]uint64 + gp_s [12]uint64 + gp_a [8]uint64 + gp_sepc uint64 + gp_sstatus uint64 +} + +type fpregs struct { + fp_x [64]uint64 // actually __uint64_t fp_x[32][2] + fp_fcsr uint64 + fp_flags int32 + pad int32 +} + +type mcontext struct { + mc_gpregs gpregs + mc_fpregs fpregs + mc_flags int32 + mc_pad int32 + mc_spare [8]uint64 +} + +type ucontext struct { + uc_sigmask sigset + uc_mcontext mcontext + uc_link *ucontext + uc_stack stackt + uc_flags int32 + __spare__ [4]int32 + pad_cgo_0 [12]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type umtx_time struct { + _timeout timespec + 
_flags uint32 + _clockid uint32 +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte + ext [4]uint64 +} + +type bintime struct { + sec int64 + frac uint64 +} + +type vdsoTimehands struct { + algo uint32 + gen uint32 + scale uint64 + offset_count uint32 + counter_mask uint32 + offset bintime + boottime bintime + physical uint32 + res [7]uint32 +} + +type vdsoTimekeep struct { + ver uint32 + enabled uint32 + current uint32 + pad_cgo_0 [4]byte +} + +const ( + _VDSO_TK_VER_CURR = 0x1 + + vdsoTimehandsSize = 0x58 + vdsoTimekeepSize = 0x10 +) diff --git a/src/runtime/defs_illumos_amd64.go b/src/runtime/defs_illumos_amd64.go new file mode 100644 index 0000000..9c5413b --- /dev/null +++ b/src/runtime/defs_illumos_amd64.go @@ -0,0 +1,14 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _RCTL_LOCAL_DENY = 0x2 + + _RCTL_LOCAL_MAXIMAL = 0x80000000 + + _RCTL_FIRST = 0x0 + _RCTL_NEXT = 0x1 +) diff --git a/src/runtime/defs_linux.go b/src/runtime/defs_linux.go new file mode 100644 index 0000000..296fcb4 --- /dev/null +++ b/src/runtime/defs_linux.go @@ -0,0 +1,127 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo -cdefs + +GOARCH=amd64 go tool cgo -cdefs defs_linux.go defs1_linux.go >defs_linux_amd64.h +*/ + +package runtime + +/* +// Linux glibc and Linux kernel define different and conflicting +// definitions for struct sigaction, struct timespec, etc. +// We want the kernel ones, which are in the asm/* headers. +// But then we'd get conflicts when we include the system +// headers for things like ucontext_t, so that happens in +// a separate file, defs1.go. 
+ +#define _SYS_TYPES_H // avoid inclusion of sys/types.h +#include <asm/posix_types.h> +#define size_t __kernel_size_t +#include <asm/signal.h> +#include <asm/siginfo.h> +#include <asm/mman.h> +#include <asm-generic/errno.h> +#include <asm-generic/poll.h> +#include <linux/eventpoll.h> +#include <linux/time.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EAGAIN = C.EAGAIN + ENOMEM = C.ENOMEM + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANONYMOUS + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + MADV_HUGEPAGE = C.MADV_HUGEPAGE + MADV_NOHUGEPAGE = C.MADV_NOHUGEPAGE + + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + SA_SIGINFO = C.SA_SIGINFO + + SI_KERNEL = C.SI_KERNEL + SI_TIMER = C.SI_TIMER + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGBUS = C.SIGBUS + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGUSR1 = C.SIGUSR1 + SIGSEGV = C.SIGSEGV + SIGUSR2 = C.SIGUSR2 + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGSTKFLT = C.SIGSTKFLT + SIGCHLD = C.SIGCHLD + SIGCONT = C.SIGCONT + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGURG = C.SIGURG + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGIO = C.SIGIO + SIGPWR = C.SIGPWR + SIGSYS = C.SIGSYS + + SIGRTMIN = C.SIGRTMIN + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = 
C.ITIMER_PROF + + CLOCK_THREAD_CPUTIME_ID = C.CLOCK_THREAD_CPUTIME_ID + + SIGEV_THREAD_ID = C.SIGEV_THREAD_ID +) + +type Sigset C.sigset_t +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Sigaction C.struct_sigaction +type Siginfo C.siginfo_t +type Itimerspec C.struct_itimerspec +type Itimerval C.struct_itimerval +type Sigevent C.struct_sigevent diff --git a/src/runtime/defs_linux_386.go b/src/runtime/defs_linux_386.go new file mode 100644 index 0000000..72339f4 --- /dev/null +++ b/src/runtime/defs_linux_386.go @@ -0,0 +1,252 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs2_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_RESTORER = 0x4000000 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + 
_CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 + + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 + + _AF_UNIX = 0x1 + _SOCK_DGRAM = 0x2 +) + +type fpreg struct { + significand [4]uint16 + exponent uint16 +} + +type fpxreg struct { + significand [4]uint16 + exponent uint16 + padding [3]uint16 +} + +type xmmreg struct { + element [4]uint32 +} + +type fpstate struct { + cw uint32 + sw uint32 + tag uint32 + ipoff uint32 + cssel uint32 + dataoff uint32 + datasel uint32 + _st [8]fpreg + status uint16 + magic uint16 + _fxsr_env [6]uint32 + mxcsr uint32 + reserved uint32 + _fxsr_st [8]fpxreg + _xmm [8]xmmreg + padding1 [44]uint32 + anon0 [48]byte +} + +type timespec struct { + tv_sec int32 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = timediv(ns, 1e9, &ts.tv_nsec) +} + +type timeval struct { + tv_sec int32 + tv_usec int32 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint32 + sa_restorer uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint32 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. 
+ _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + ss_size uintptr +} + +type sigcontext struct { + gs uint16 + __gsh uint16 + fs uint16 + __fsh uint16 + es uint16 + __esh uint16 + ds uint16 + __dsh uint16 + edi uint32 + esi uint32 + ebp uint32 + esp uint32 + ebx uint32 + edx uint32 + ecx uint32 + eax uint32 + trapno uint32 + err uint32 + eip uint32 + cs uint16 + __csh uint16 + eflags uint32 + esp_at_signal uint32 + ss uint16 + __ssh uint16 + fpstate *fpstate + oldmask uint32 + cr2 uint32 +} + +type ucontext struct { + uc_flags uint32 + uc_link *ucontext + uc_stack stackt + uc_mcontext sigcontext + uc_sigmask uint32 +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. 
+ _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +type sockaddr_un struct { + family uint16 + path [108]byte +} diff --git a/src/runtime/defs_linux_amd64.go b/src/runtime/defs_linux_amd64.go new file mode 100644 index 0000000..298f3eb --- /dev/null +++ b/src/runtime/defs_linux_amd64.go @@ -0,0 +1,288 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs1_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_RESTORER = 0x4000000 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 + + _AF_UNIX = 0x1 + _SOCK_DGRAM = 0x2 +) + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + 
ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_restorer uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. 
+ _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs1_linux.go + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 +) + +type usigset struct { + __val [16]uint64 +} + +type fpxreg struct { + significand [4]uint16 + exponent uint16 + padding [3]uint16 +} + +type xmmreg struct { + element [4]uint32 +} + +type fpstate struct { + cwd uint16 + swd uint16 + ftw uint16 + fop uint16 + rip uint64 + rdp uint64 + mxcsr uint32 + mxcr_mask uint32 + _st [8]fpxreg + _xmm [16]xmmreg + padding [24]uint32 +} + +type fpxreg1 struct { + significand [4]uint16 + exponent uint16 + padding [3]uint16 +} + +type xmmreg1 struct { + element [4]uint32 +} + +type fpstate1 struct { + cwd uint16 + swd uint16 + ftw uint16 + fop uint16 + rip uint64 + rdp uint64 + mxcsr uint32 + mxcr_mask uint32 + _st [8]fpxreg1 + _xmm [16]xmmreg1 + padding [24]uint32 +} + +type fpreg1 struct { + significand [4]uint16 + exponent uint16 +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + pad_cgo_0 [4]byte + ss_size uintptr +} + +type mcontext struct { + gregs [23]uint64 + fpregs *fpstate + __reserved1 [8]uint64 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_mcontext mcontext + uc_sigmask usigset + __fpregs_mem fpstate +} + +type sigcontext struct { + r8 uint64 + r9 uint64 + r10 uint64 + r11 uint64 + r12 uint64 + r13 uint64 + r14 uint64 + r15 uint64 + rdi uint64 + rsi uint64 + rbp uint64 + rbx uint64 + rdx uint64 + rax uint64 + rcx uint64 + rsp uint64 + rip uint64 + eflags uint64 + cs uint16 + gs uint16 + fs uint16 + __pad0 uint16 + err uint64 + trapno uint64 + oldmask uint64 + cr2 uint64 + fpstate *fpstate1 + __reserved1 [8]uint64 +} + +type sockaddr_un struct { + family uint16 + path [108]byte +} diff --git a/src/runtime/defs_linux_arm.go b/src/runtime/defs_linux_arm.go new file mode 100644 
index 0000000..6fee57d --- /dev/null +++ b/src/runtime/defs_linux_arm.go @@ -0,0 +1,206 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// Constants +const ( + _EINTR = 0x4 + _ENOMEM = 0xc + _EAGAIN = 0xb + + _PROT_NONE = 0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_RESTORER = 0 // unused on ARM + _SA_SIGINFO = 0x4 + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + _SIGRTMIN = 0x20 + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + _ITIMER_REAL = 0 + _ITIMER_PROF = 0x2 + _ITIMER_VIRTUAL = 0x1 + _O_RDONLY = 0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 + + _AF_UNIX = 0x1 + _SOCK_DGRAM = 0x2 +) + +type timespec struct { + tv_sec int32 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = timediv(ns, 1e9, &ts.tv_nsec) +} + 
+type stackt struct { + ss_sp *byte + ss_flags int32 + ss_size uintptr +} + +type sigcontext struct { + trap_no uint32 + error_code uint32 + oldmask uint32 + r0 uint32 + r1 uint32 + r2 uint32 + r3 uint32 + r4 uint32 + r5 uint32 + r6 uint32 + r7 uint32 + r8 uint32 + r9 uint32 + r10 uint32 + fp uint32 + ip uint32 + sp uint32 + lr uint32 + pc uint32 + cpsr uint32 + fault_address uint32 +} + +type ucontext struct { + uc_flags uint32 + uc_link *ucontext + uc_stack stackt + uc_mcontext sigcontext + uc_sigmask uint32 + __unused [31]int32 + uc_regspace [128]uint32 +} + +type timeval struct { + tv_sec int32 + tv_usec int32 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint32 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. 
+ _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint32 + sa_restorer uintptr + sa_mask uint64 +} + +type sockaddr_un struct { + family uint16 + path [108]byte +} diff --git a/src/runtime/defs_linux_arm64.go b/src/runtime/defs_linux_arm64.go new file mode 100644 index 0000000..0216096 --- /dev/null +++ b/src/runtime/defs_linux_arm64.go @@ -0,0 +1,210 @@ +// Created by cgo -cdefs and converted (by hand) to Go +// ../cmd/cgo/cgo -cdefs defs_linux.go defs1_linux.go defs2_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_RESTORER = 0x0 // Only used on intel + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 + + _AF_UNIX = 
0x1 + _SOCK_DGRAM = 0x2 +) + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_restorer uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. 
+ _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +// Created by cgo -cdefs and then converted to Go by hand +// ../cmd/cgo/cgo -cdefs defs_linux.go defs1_linux.go defs2_linux.go + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 +) + +type usigset struct { + __val [16]uint64 +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + pad_cgo_0 [4]byte + ss_size uintptr +} + +type sigcontext struct { + fault_address uint64 + /* AArch64 registers */ + regs [31]uint64 + sp uint64 + pc uint64 + pstate uint64 + _pad [8]byte // __attribute__((__aligned__(16))) + __reserved [4096]byte +} + +type sockaddr_un struct { + family uint16 + path [108]byte +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_sigmask uint64 + _pad [(1024 - 64) / 8]byte + _pad2 [8]byte // sigcontext must be aligned to 16-byte + uc_mcontext sigcontext +} diff --git a/src/runtime/defs_linux_loong64.go b/src/runtime/defs_linux_loong64.go new file mode 100644 index 0000000..6eca18b --- /dev/null +++ b/src/runtime/defs_linux_loong64.go @@ -0,0 +1,197 @@ +// Generated using cgo, then manually converted into appropriate naming and code +// for the Go runtime. 
+// go tool cgo -godefs defs_linux.go defs1_linux.go defs2_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x4 + _SA_RESTORER = 0x0 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + 
// below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 +) + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_mask uint64 + // Linux on loong64 does not have the sa_restorer field, but the setsig + // function references it (for x86). Not much harm to include it at the end. + sa_restorer uintptr +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + __pad0 [1]int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type usigset struct { + val [16]uint64 +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + pad_cgo_0 [4]byte + ss_size uintptr +} + +type sigcontext struct { + sc_pc uint64 + sc_regs [32]uint64 + sc_flags uint32 + sc_extcontext [0]uint64 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_sigmask usigset + uc_x_unused [0]uint8 + uc_pad_cgo_0 [8]byte + uc_mcontext sigcontext +} diff --git a/src/runtime/defs_linux_mips64x.go b/src/runtime/defs_linux_mips64x.go new file mode 100644 index 0000000..2e8c405 --- /dev/null +++ b/src/runtime/defs_linux_mips64x.go @@ -0,0 +1,210 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build (mips64 || mips64le) && linux + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x800 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x8 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGUSR1 = 0x10 + _SIGUSR2 = 0x11 + _SIGCHLD = 0x12 + _SIGPWR = 0x13 + _SIGWINCH = 0x14 + _SIGURG = 0x15 + _SIGIO = 0x16 + _SIGSTOP = 0x17 + _SIGTSTP = 0x18 + _SIGCONT = 0x19 + _SIGTTIN = 0x1a + _SIGTTOU = 0x1b + _SIGVTALRM = 0x1c + _SIGPROF = 0x1d + _SIGXCPU = 0x1e + _SIGXFSZ = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +//struct Sigset { +// uint64 sig[1]; +//}; +//typedef uint64 Sigset; + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_flags uint32 + sa_handler uintptr + sa_mask [2]uint64 + // linux header does not have sa_restorer field, + // but it is used in setsig(). 
it is no harm to put it here + sa_restorer uintptr +} + +type siginfoFields struct { + si_signo int32 + si_code int32 + si_errno int32 + __pad0 [1]int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x100 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x80 + _O_CLOEXEC = 0x80000 + _SA_RESTORER = 0 +) + +type stackt struct { + ss_sp *byte + ss_size uintptr + ss_flags int32 +} + +type sigcontext struct { + sc_regs [32]uint64 + sc_fpregs [32]uint64 + sc_mdhi uint64 + sc_hi1 uint64 + sc_hi2 uint64 + sc_hi3 uint64 + sc_mdlo uint64 + sc_lo1 uint64 + sc_lo2 uint64 + sc_lo3 uint64 + sc_pc uint64 + sc_fpc_csr uint32 + sc_used_math uint32 + sc_dsp uint32 + sc_reserved uint32 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_mcontext sigcontext + uc_sigmask uint64 +} diff --git a/src/runtime/defs_linux_mipsx.go b/src/runtime/defs_linux_mipsx.go new file mode 100644 index 0000000..7593600 --- /dev/null +++ b/src/runtime/defs_linux_mipsx.go @@ -0,0 +1,208 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build (mips || mipsle) && linux + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x800 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x8 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGUSR1 = 0x10 + _SIGUSR2 = 0x11 + _SIGCHLD = 0x12 + _SIGPWR = 0x13 + _SIGWINCH = 0x14 + _SIGURG = 0x15 + _SIGIO = 0x16 + _SIGSTOP = 0x17 + _SIGTSTP = 0x18 + _SIGCONT = 0x19 + _SIGTTIN = 0x1a + _SIGTTOU = 0x1b + _SIGVTALRM = 0x1c + _SIGPROF = 0x1d + _SIGXCPU = 0x1e + _SIGXFSZ = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +type timespec struct { + tv_sec int32 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = timediv(ns, 1e9, &ts.tv_nsec) +} + +type timeval struct { + tv_sec int32 + tv_usec int32 +} + +//go:nosplit +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type sigactiont struct { + sa_flags uint32 + sa_handler uintptr + sa_mask [4]uint32 + // linux header does not have sa_restorer field, + // but it is used in setsig(). 
it is no harm to put it here + sa_restorer uintptr +} + +type siginfoFields struct { + si_signo int32 + si_code int32 + si_errno int32 + // below here is a union; si_addr is the only field we use + si_addr uint32 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x80 + _O_CREAT = 0x100 + _O_TRUNC = 0x200 + _O_CLOEXEC = 0x80000 + _SA_RESTORER = 0 +) + +type stackt struct { + ss_sp *byte + ss_size uintptr + ss_flags int32 +} + +type sigcontext struct { + sc_regmask uint32 + sc_status uint32 + sc_pc uint64 + sc_regs [32]uint64 + sc_fpregs [32]uint64 + sc_acx uint32 + sc_fpc_csr uint32 + sc_fpc_eir uint32 + sc_used_math uint32 + sc_dsp uint32 + sc_mdhi uint64 + sc_mdlo uint64 + sc_hi1 uint32 + sc_lo1 uint32 + sc_hi2 uint32 + sc_lo2 uint32 + sc_hi3 uint32 + sc_lo3 uint32 +} + +type ucontext struct { + uc_flags uint32 + uc_link *ucontext + uc_stack stackt + Pad_cgo_0 [4]byte + uc_mcontext sigcontext + uc_sigmask [4]uint32 +} diff --git a/src/runtime/defs_linux_ppc64.go b/src/runtime/defs_linux_ppc64.go new file mode 100644 index 0000000..bb3ac01 --- /dev/null +++ b/src/runtime/defs_linux_ppc64.go @@ -0,0 +1,224 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs3_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + 
_PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +//struct Sigset { +// uint64 sig[1]; +//}; +//typedef uint64 Sigset; + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_restorer uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. 
+ _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs3_linux.go + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 + _SA_RESTORER = 0 +) + +type ptregs struct { + gpr [32]uint64 + nip uint64 + msr uint64 + orig_gpr3 uint64 + ctr uint64 + link uint64 + xer uint64 + ccr uint64 + softe uint64 + trap uint64 + dar uint64 + dsisr uint64 + result uint64 +} + +type vreg struct { + u [4]uint32 +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + pad_cgo_0 [4]byte + ss_size uintptr +} + +type sigcontext struct { + _unused [4]uint64 + signal int32 + _pad0 int32 + handler uint64 + oldmask uint64 + regs *ptregs + gp_regs [48]uint64 + fp_regs [33]float64 + v_regs *vreg + vmx_reserve [101]int64 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_sigmask uint64 + __unused [15]uint64 + uc_mcontext sigcontext +} diff --git a/src/runtime/defs_linux_ppc64le.go b/src/runtime/defs_linux_ppc64le.go new file mode 100644 index 0000000..bb3ac01 --- /dev/null +++ b/src/runtime/defs_linux_ppc64le.go @@ -0,0 +1,224 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs3_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + 
_MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +//struct Sigset { +// uint64 sig[1]; +//}; +//typedef uint64 Sigset; + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_restorer uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. 
+ _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_linux.go defs3_linux.go + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 + _SA_RESTORER = 0 +) + +type ptregs struct { + gpr [32]uint64 + nip uint64 + msr uint64 + orig_gpr3 uint64 + ctr uint64 + link uint64 + xer uint64 + ccr uint64 + softe uint64 + trap uint64 + dar uint64 + dsisr uint64 + result uint64 +} + +type vreg struct { + u [4]uint32 +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + pad_cgo_0 [4]byte + ss_size uintptr +} + +type sigcontext struct { + _unused [4]uint64 + signal int32 + _pad0 int32 + handler uint64 + oldmask uint64 + regs *ptregs + gp_regs [48]uint64 + fp_regs [33]float64 + v_regs *vreg + vmx_reserve [101]int64 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_sigmask uint64 + __unused [15]uint64 + uc_mcontext sigcontext +} diff --git a/src/runtime/defs_linux_riscv64.go b/src/runtime/defs_linux_riscv64.go new file mode 100644 index 0000000..ce4a7f3 --- /dev/null +++ b/src/runtime/defs_linux_riscv64.go @@ -0,0 +1,234 @@ +// Generated using cgo, then manually converted into appropriate naming and code +// for the Go runtime. 
+// go tool cgo -godefs defs_linux.go defs1_linux.go defs2_linux.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_RESTORER = 0x0 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_mask uint64 + // Linux on riscv64 does not have the sa_restorer field, but the setsig + // function references it (for x86). 
Not much harm to include it at the end. + sa_restorer uintptr +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 +) + +type user_regs_struct struct { + pc uint64 + ra uint64 + sp uint64 + gp uint64 + tp uint64 + t0 uint64 + t1 uint64 + t2 uint64 + s0 uint64 + s1 uint64 + a0 uint64 + a1 uint64 + a2 uint64 + a3 uint64 + a4 uint64 + a5 uint64 + a6 uint64 + a7 uint64 + s2 uint64 + s3 uint64 + s4 uint64 + s5 uint64 + s6 uint64 + s7 uint64 + s8 uint64 + s9 uint64 + s10 uint64 + s11 uint64 + t3 uint64 + t4 uint64 + t5 uint64 + t6 uint64 +} + +type user_fpregs_struct struct { + f [528]byte +} + +type usigset struct { + us_x__val [16]uint64 +} + +type sigcontext struct { + sc_regs user_regs_struct + sc_fpregs user_fpregs_struct +} + +type stackt struct { + ss_sp *byte + ss_flags int32 + ss_size uintptr +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_sigmask usigset + uc_x__unused [0]uint8 + uc_pad_cgo_0 [8]byte + uc_mcontext sigcontext +} diff --git a/src/runtime/defs_linux_s390x.go b/src/runtime/defs_linux_s390x.go new file mode 100644 index 0000000..36497dd --- 
/dev/null +++ b/src/runtime/defs_linux_s390x.go @@ -0,0 +1,191 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EAGAIN = 0xb + _ENOMEM = 0xc + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x20 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x8 + _MADV_HUGEPAGE = 0xe + _MADV_NOHUGEPAGE = 0xf + + _SA_RESTART = 0x10000000 + _SA_ONSTACK = 0x8000000 + _SA_SIGINFO = 0x4 + + _SI_KERNEL = 0x80 + _SI_TIMER = -0x2 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGBUS = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGUSR1 = 0xa + _SIGSEGV = 0xb + _SIGUSR2 = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGSTKFLT = 0x10 + _SIGCHLD = 0x11 + _SIGCONT = 0x12 + _SIGSTOP = 0x13 + _SIGTSTP = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGURG = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGIO = 0x1d + _SIGPWR = 0x1e + _SIGSYS = 0x1f + + _SIGRTMIN = 0x20 + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _CLOCK_THREAD_CPUTIME_ID = 0x3 + + _SIGEV_THREAD_ID = 0x4 +) + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type sigactiont struct { + sa_handler uintptr + sa_flags uint64 + sa_restorer 
uintptr + sa_mask uint64 +} + +type siginfoFields struct { + si_signo int32 + si_errno int32 + si_code int32 + // below here is a union; si_addr is the only field we use + si_addr uint64 +} + +type siginfo struct { + siginfoFields + + // Pad struct to the max size in the kernel. + _ [_si_max_size - unsafe.Sizeof(siginfoFields{})]byte +} + +type itimerspec struct { + it_interval timespec + it_value timespec +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type sigeventFields struct { + value uintptr + signo int32 + notify int32 + // below here is a union; sigev_notify_thread_id is the only field we use + sigev_notify_thread_id int32 +} + +type sigevent struct { + sigeventFields + + // Pad struct to the max size in the kernel. + _ [_sigev_max_size - unsafe.Sizeof(sigeventFields{})]byte +} + +const ( + _O_RDONLY = 0x0 + _O_WRONLY = 0x1 + _O_CREAT = 0x40 + _O_TRUNC = 0x200 + _O_NONBLOCK = 0x800 + _O_CLOEXEC = 0x80000 + _SA_RESTORER = 0 +) + +type stackt struct { + ss_sp *byte + ss_flags int32 + ss_size uintptr +} + +type sigcontext struct { + psw_mask uint64 + psw_addr uint64 + gregs [16]uint64 + aregs [16]uint32 + fpc uint32 + fpregs [16]uint64 +} + +type ucontext struct { + uc_flags uint64 + uc_link *ucontext + uc_stack stackt + uc_mcontext sigcontext + uc_sigmask uint64 +} diff --git a/src/runtime/defs_netbsd.go b/src/runtime/defs_netbsd.go new file mode 100644 index 0000000..43923e3 --- /dev/null +++ b/src/runtime/defs_netbsd.go @@ -0,0 +1,133 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. 
+ +GOARCH=amd64 go tool cgo -cdefs defs_netbsd.go defs_netbsd_amd64.go >defs_netbsd_amd64.h +GOARCH=386 go tool cgo -cdefs defs_netbsd.go defs_netbsd_386.go >defs_netbsd_386.h +GOARCH=arm go tool cgo -cdefs defs_netbsd.go defs_netbsd_arm.go >defs_netbsd_arm.h +*/ + +// +godefs map __fpregset_t [644]byte + +package runtime + +/* +#include <sys/types.h> +#include <sys/mman.h> +#include <sys/signal.h> +#include <sys/event.h> +#include <sys/time.h> +#include <sys/ucontext.h> +#include <sys/unistd.h> +#include <errno.h> +#include <signal.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EFAULT = C.EFAULT + EAGAIN = C.EAGAIN + + O_WRONLY = C.O_WRONLY + O_NONBLOCK = C.O_NONBLOCK + O_CREAT = C.O_CREAT + O_TRUNC = C.O_TRUNC + O_CLOEXEC = C.O_CLOEXEC + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGINFO = C.SIGINFO + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN 
= C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + EV_ADD = C.EV_ADD + EV_DELETE = C.EV_DELETE + EV_CLEAR = C.EV_CLEAR + EV_RECEIPT = 0 + EV_ERROR = C.EV_ERROR + EV_EOF = C.EV_EOF + EVFILT_READ = C.EVFILT_READ + EVFILT_WRITE = C.EVFILT_WRITE +) + +type Sigset C.sigset_t +type Siginfo C.struct__ksiginfo + +type StackT C.stack_t + +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval + +type McontextT C.mcontext_t +type UcontextT C.ucontext_t + +type Kevent C.struct_kevent diff --git a/src/runtime/defs_netbsd_386.go b/src/runtime/defs_netbsd_386.go new file mode 100644 index 0000000..2943ea3 --- /dev/null +++ b/src/runtime/defs_netbsd_386.go @@ -0,0 +1,41 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=386 go tool cgo -cdefs defs_netbsd.go defs_netbsd_386.go >defs_netbsd_386.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <machine/mcontext.h> +*/ +import "C" + +const ( + REG_GS = C._REG_GS + REG_FS = C._REG_FS + REG_ES = C._REG_ES + REG_DS = C._REG_DS + REG_EDI = C._REG_EDI + REG_ESI = C._REG_ESI + REG_EBP = C._REG_EBP + REG_ESP = C._REG_ESP + REG_EBX = C._REG_EBX + REG_EDX = C._REG_EDX + REG_ECX = C._REG_ECX + REG_EAX = C._REG_EAX + REG_TRAPNO = C._REG_TRAPNO + REG_ERR = C._REG_ERR + REG_EIP = C._REG_EIP + REG_CS = C._REG_CS + REG_EFL = C._REG_EFL + REG_UESP = C._REG_UESP + REG_SS = C._REG_SS +) diff --git a/src/runtime/defs_netbsd_amd64.go b/src/runtime/defs_netbsd_amd64.go new file mode 100644 index 0000000..33d80ff --- /dev/null +++ b/src/runtime/defs_netbsd_amd64.go @@ -0,0 +1,48 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=amd64 go tool cgo -cdefs defs_netbsd.go defs_netbsd_amd64.go >defs_netbsd_amd64.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <machine/mcontext.h> +*/ +import "C" + +const ( + REG_RDI = C._REG_RDI + REG_RSI = C._REG_RSI + REG_RDX = C._REG_RDX + REG_RCX = C._REG_RCX + REG_R8 = C._REG_R8 + REG_R9 = C._REG_R9 + REG_R10 = C._REG_R10 + REG_R11 = C._REG_R11 + REG_R12 = C._REG_R12 + REG_R13 = C._REG_R13 + REG_R14 = C._REG_R14 + REG_R15 = C._REG_R15 + REG_RBP = C._REG_RBP + REG_RBX = C._REG_RBX + REG_RAX = C._REG_RAX + REG_GS = C._REG_GS + REG_FS = C._REG_FS + REG_ES = C._REG_ES + REG_DS = C._REG_DS + REG_TRAPNO = C._REG_TRAPNO + REG_ERR = C._REG_ERR + REG_RIP = C._REG_RIP + REG_CS = C._REG_CS + REG_RFLAGS = C._REG_RFLAGS + REG_RSP = C._REG_RSP + REG_SS = C._REG_SS +) diff --git a/src/runtime/defs_netbsd_arm.go b/src/runtime/defs_netbsd_arm.go new file mode 100644 index 0000000..74b3752 --- /dev/null +++ b/src/runtime/defs_netbsd_arm.go @@ -0,0 +1,39 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. 
+ +GOARCH=arm go tool cgo -cdefs defs_netbsd.go defs_netbsd_arm.go >defs_netbsd_arm.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <machine/mcontext.h> +*/ +import "C" + +const ( + REG_R0 = C._REG_R0 + REG_R1 = C._REG_R1 + REG_R2 = C._REG_R2 + REG_R3 = C._REG_R3 + REG_R4 = C._REG_R4 + REG_R5 = C._REG_R5 + REG_R6 = C._REG_R6 + REG_R7 = C._REG_R7 + REG_R8 = C._REG_R8 + REG_R9 = C._REG_R9 + REG_R10 = C._REG_R10 + REG_R11 = C._REG_R11 + REG_R12 = C._REG_R12 + REG_R13 = C._REG_R13 + REG_R14 = C._REG_R14 + REG_R15 = C._REG_R15 + REG_CPSR = C._REG_CPSR +) diff --git a/src/runtime/defs_openbsd.go b/src/runtime/defs_openbsd.go new file mode 100644 index 0000000..4161e21 --- /dev/null +++ b/src/runtime/defs_openbsd.go @@ -0,0 +1,146 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=amd64 go tool cgo -godefs defs_openbsd.go +GOARCH=386 go tool cgo -godefs defs_openbsd.go +GOARCH=arm go tool cgo -godefs defs_openbsd.go +GOARCH=arm64 go tool cgo -godefs defs_openbsd.go +GOARCH=mips64 go tool cgo -godefs defs_openbsd.go +*/ + +package runtime + +/* +#include <sys/types.h> +#include <sys/event.h> +#include <sys/mman.h> +#include <sys/time.h> +#include <sys/unistd.h> +#include <sys/signal.h> +#include <errno.h> +#include <fcntl.h> +#include <pthread.h> +#include <signal.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EFAULT = C.EFAULT + EAGAIN = C.EAGAIN + + O_NONBLOCK = C.O_NONBLOCK + O_CLOEXEC = C.O_CLOEXEC + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + MAP_STACK = C.MAP_STACK + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + + PTHREAD_CREATE_DETACHED = 
C.PTHREAD_CREATE_DETACHED + + F_SETFD = C.F_SETFD + F_GETFL = C.F_GETFL + F_SETFL = C.F_SETFL + FD_CLOEXEC = C.FD_CLOEXEC + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGINFO = C.SIGINFO + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + EV_ADD = C.EV_ADD + EV_DELETE = C.EV_DELETE + EV_CLEAR = C.EV_CLEAR + EV_ERROR = C.EV_ERROR + EV_EOF = C.EV_EOF + EVFILT_READ = C.EVFILT_READ + EVFILT_WRITE = C.EVFILT_WRITE +) + +type TforkT C.struct___tfork + +type Sigcontext C.struct_sigcontext +type Siginfo C.siginfo_t +type Sigset C.sigset_t +type Sigval C.union_sigval + +type StackT C.stack_t + +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval + +type KeventT C.struct_kevent + +type Pthread C.pthread_t +type PthreadAttr C.pthread_attr_t +type PthreadCond C.pthread_cond_t +type PthreadCondAttr C.pthread_condattr_t +type PthreadMutex C.pthread_mutex_t +type PthreadMutexAttr C.pthread_mutexattr_t diff --git 
a/src/runtime/defs_openbsd_386.go b/src/runtime/defs_openbsd_386.go new file mode 100644 index 0000000..25524c5 --- /dev/null +++ b/src/runtime/defs_openbsd_386.go @@ -0,0 +1,185 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_openbsd.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x10000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + _MAP_STACK = 0x4000 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _PTHREAD_CREATE_DETACHED = 0x1 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _FD_CLOEXEC = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type tforkt struct { + tf_tcb unsafe.Pointer + tf_tid *int32 + tf_stack uintptr +} + +type sigcontext struct { + sc_gs uint32 + sc_fs 
uint32 + sc_es uint32 + sc_ds uint32 + sc_edi uint32 + sc_esi uint32 + sc_ebp uint32 + sc_ebx uint32 + sc_edx uint32 + sc_ecx uint32 + sc_eax uint32 + sc_eip uint32 + sc_cs uint32 + sc_eflags uint32 + sc_esp uint32 + sc_ss uint32 + __sc_unused uint32 + sc_mask uint32 + sc_trapno uint32 + sc_err uint32 + sc_fpstate unsafe.Pointer +} + +type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + _data [116]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int32 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = int64(timediv(ns, 1e9, &ts.tv_nsec)) +} + +type timeval struct { + tv_sec int64 + tv_usec int32 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type keventt struct { + ident uint32 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} + +type pthread uintptr +type pthreadattr uintptr +type pthreadcond uintptr +type pthreadcondattr uintptr +type pthreadmutex uintptr +type pthreadmutexattr uintptr diff --git a/src/runtime/defs_openbsd_amd64.go b/src/runtime/defs_openbsd_amd64.go new file mode 100644 index 0000000..a31d03b --- /dev/null +++ b/src/runtime/defs_openbsd_amd64.go @@ -0,0 +1,196 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_openbsd.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x10000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + _MAP_STACK = 0x4000 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _PTHREAD_CREATE_DETACHED = 0x1 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + 
_FD_CLOEXEC = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type tforkt struct { + tf_tcb unsafe.Pointer + tf_tid *int32 + tf_stack uintptr +} + +type sigcontext struct { + sc_rdi uint64 + sc_rsi uint64 + sc_rdx uint64 + sc_rcx uint64 + sc_r8 uint64 + sc_r9 uint64 + sc_r10 uint64 + sc_r11 uint64 + sc_r12 uint64 + sc_r13 uint64 + sc_r14 uint64 + sc_r15 uint64 + sc_rbp uint64 + sc_rbx uint64 + sc_rax uint64 + sc_gs uint64 + sc_fs uint64 + sc_es uint64 + sc_ds uint64 + sc_trapno uint64 + sc_err uint64 + sc_rip uint64 + sc_cs uint64 + sc_rflags uint64 + sc_rsp uint64 + sc_ss uint64 + sc_fpstate unsafe.Pointer + __sc_unused int32 + sc_mask int32 +} + +type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + pad_cgo_0 [4]byte + _data [120]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = 
ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} + +type pthread uintptr +type pthreadattr uintptr +type pthreadcond uintptr +type pthreadcondattr uintptr +type pthreadmutex uintptr +type pthreadmutexattr uintptr diff --git a/src/runtime/defs_openbsd_arm.go b/src/runtime/defs_openbsd_arm.go new file mode 100644 index 0000000..1d1767b --- /dev/null +++ b/src/runtime/defs_openbsd_arm.go @@ -0,0 +1,193 @@ +// created by cgo -cdefs and then converted to Go +// cgo -cdefs defs_openbsd.go + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x10000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + _MAP_STACK = 0x4000 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _PTHREAD_CREATE_DETACHED = 0x1 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _FD_CLOEXEC = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 
0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type tforkt struct { + tf_tcb unsafe.Pointer + tf_tid *int32 + tf_stack uintptr +} + +type sigcontext struct { + __sc_unused int32 + sc_mask int32 + + sc_spsr uint32 + sc_r0 uint32 + sc_r1 uint32 + sc_r2 uint32 + sc_r3 uint32 + sc_r4 uint32 + sc_r5 uint32 + sc_r6 uint32 + sc_r7 uint32 + sc_r8 uint32 + sc_r9 uint32 + sc_r10 uint32 + sc_r11 uint32 + sc_r12 uint32 + sc_usr_sp uint32 + sc_usr_lr uint32 + sc_svc_lr uint32 + sc_pc uint32 + sc_fpused uint32 + sc_fpscr uint32 + sc_fpreg [32]uint64 +} + +type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + pad_cgo_0 [4]byte + _data [120]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 +} + +type timespec struct { + tv_sec int64 + tv_nsec int32 + pad_cgo_0 [4]byte +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = int64(timediv(ns, 1e9, &ts.tv_nsec)) +} + +type timeval struct { + tv_sec int64 + tv_usec int32 + pad_cgo_0 [4]byte +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = x +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type keventt struct { + ident uint32 + filter int16 + flags uint16 + fflags uint32 + pad_cgo_0 [4]byte + data int64 + udata *byte + pad_cgo_1 [4]byte +} + +type pthread uintptr +type pthreadattr uintptr +type pthreadcond uintptr +type pthreadcondattr uintptr +type pthreadmutex uintptr +type pthreadmutexattr uintptr diff --git a/src/runtime/defs_openbsd_arm64.go b/src/runtime/defs_openbsd_arm64.go new file mode 100644 index 0000000..745d0d3 --- /dev/null +++ b/src/runtime/defs_openbsd_arm64.go @@ -0,0 +1,176 @@ +// Copyright 2019 
The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x10000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + _MAP_STACK = 0x4000 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _PTHREAD_CREATE_DETACHED = 0x1 + + _F_SETFD = 0x2 + _F_GETFL = 0x3 + _F_SETFL = 0x4 + _FD_CLOEXEC = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type tforkt struct { + tf_tcb unsafe.Pointer + tf_tid *int32 + tf_stack uintptr +} + +type sigcontext struct { + __sc_unused int32 + sc_mask int32 + sc_sp uintptr + sc_lr uintptr + sc_elr uintptr + sc_spsr uintptr + sc_x [30]uintptr + sc_cookie int64 +} + 
+type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + pad_cgo_0 [4]byte + _data [120]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} + +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} + +type pthread uintptr +type pthreadattr uintptr +type pthreadcond uintptr +type pthreadcondattr uintptr +type pthreadmutex uintptr +type pthreadmutexattr uintptr diff --git a/src/runtime/defs_openbsd_mips64.go b/src/runtime/defs_openbsd_mips64.go new file mode 100644 index 0000000..1e469e4 --- /dev/null +++ b/src/runtime/defs_openbsd_mips64.go @@ -0,0 +1,170 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Generated from: +// +// GOARCH=mips64 go tool cgo -godefs defs_openbsd.go +// +// Then converted to the form used by the runtime. 
+ +package runtime + +import "unsafe" + +const ( + _EINTR = 0x4 + _EFAULT = 0xe + _EAGAIN = 0x23 + + _O_WRONLY = 0x1 + _O_NONBLOCK = 0x4 + _O_CREAT = 0x200 + _O_TRUNC = 0x400 + _O_CLOEXEC = 0x10000 + + _PROT_NONE = 0x0 + _PROT_READ = 0x1 + _PROT_WRITE = 0x2 + _PROT_EXEC = 0x4 + + _MAP_ANON = 0x1000 + _MAP_PRIVATE = 0x2 + _MAP_FIXED = 0x10 + _MAP_STACK = 0x4000 + + _MADV_DONTNEED = 0x4 + _MADV_FREE = 0x6 + + _SA_SIGINFO = 0x40 + _SA_RESTART = 0x2 + _SA_ONSTACK = 0x1 + + _SIGHUP = 0x1 + _SIGINT = 0x2 + _SIGQUIT = 0x3 + _SIGILL = 0x4 + _SIGTRAP = 0x5 + _SIGABRT = 0x6 + _SIGEMT = 0x7 + _SIGFPE = 0x8 + _SIGKILL = 0x9 + _SIGBUS = 0xa + _SIGSEGV = 0xb + _SIGSYS = 0xc + _SIGPIPE = 0xd + _SIGALRM = 0xe + _SIGTERM = 0xf + _SIGURG = 0x10 + _SIGSTOP = 0x11 + _SIGTSTP = 0x12 + _SIGCONT = 0x13 + _SIGCHLD = 0x14 + _SIGTTIN = 0x15 + _SIGTTOU = 0x16 + _SIGIO = 0x17 + _SIGXCPU = 0x18 + _SIGXFSZ = 0x19 + _SIGVTALRM = 0x1a + _SIGPROF = 0x1b + _SIGWINCH = 0x1c + _SIGINFO = 0x1d + _SIGUSR1 = 0x1e + _SIGUSR2 = 0x1f + + _FPE_INTDIV = 0x1 + _FPE_INTOVF = 0x2 + _FPE_FLTDIV = 0x3 + _FPE_FLTOVF = 0x4 + _FPE_FLTUND = 0x5 + _FPE_FLTRES = 0x6 + _FPE_FLTINV = 0x7 + _FPE_FLTSUB = 0x8 + + _BUS_ADRALN = 0x1 + _BUS_ADRERR = 0x2 + _BUS_OBJERR = 0x3 + + _SEGV_MAPERR = 0x1 + _SEGV_ACCERR = 0x2 + + _ITIMER_REAL = 0x0 + _ITIMER_VIRTUAL = 0x1 + _ITIMER_PROF = 0x2 + + _EV_ADD = 0x1 + _EV_DELETE = 0x2 + _EV_CLEAR = 0x20 + _EV_ERROR = 0x4000 + _EV_EOF = 0x8000 + _EVFILT_READ = -0x1 + _EVFILT_WRITE = -0x2 +) + +type tforkt struct { + tf_tcb unsafe.Pointer + tf_tid *int32 + tf_stack uintptr +} + +type sigcontext struct { + sc_cookie uint64 + sc_mask uint64 + sc_pc uint64 + sc_regs [32]uint64 + mullo uint64 + mulhi uint64 + sc_fpregs [33]uint64 + sc_fpused uint64 + sc_fpc_eir uint64 + _xxx [8]int64 +} + +type siginfo struct { + si_signo int32 + si_code int32 + si_errno int32 + pad_cgo_0 [4]byte + _data [120]byte +} + +type stackt struct { + ss_sp uintptr + ss_size uintptr + ss_flags int32 + pad_cgo_0 [4]byte +} 
+ +type timespec struct { + tv_sec int64 + tv_nsec int64 +} + +//go:nosplit +func (ts *timespec) setNsec(ns int64) { + ts.tv_sec = ns / 1e9 + ts.tv_nsec = ns % 1e9 +} + +type timeval struct { + tv_sec int64 + tv_usec int64 +} + +func (tv *timeval) set_usec(x int32) { + tv.tv_usec = int64(x) +} + +type itimerval struct { + it_interval timeval + it_value timeval +} + +type keventt struct { + ident uint64 + filter int16 + flags uint16 + fflags uint32 + data int64 + udata *byte +} diff --git a/src/runtime/defs_plan9_386.go b/src/runtime/defs_plan9_386.go new file mode 100644 index 0000000..428044d --- /dev/null +++ b/src/runtime/defs_plan9_386.go @@ -0,0 +1,64 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const _PAGESIZE = 0x1000 + +type ureg struct { + di uint32 /* general registers */ + si uint32 /* ... */ + bp uint32 /* ... */ + nsp uint32 + bx uint32 /* ... */ + dx uint32 /* ... */ + cx uint32 /* ... */ + ax uint32 /* ... */ + gs uint32 /* data segments */ + fs uint32 /* ... */ + es uint32 /* ... */ + ds uint32 /* ... 
*/ + trap uint32 /* trap _type */ + ecode uint32 /* error code (or zero) */ + pc uint32 /* pc */ + cs uint32 /* old context */ + flags uint32 /* old flags */ + sp uint32 + ss uint32 /* old stack segment */ +} + +type sigctxt struct { + u *ureg +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uintptr { return uintptr(c.u.pc) } + +func (c *sigctxt) sp() uintptr { return uintptr(c.u.sp) } +func (c *sigctxt) lr() uintptr { return uintptr(0) } + +func (c *sigctxt) setpc(x uintptr) { c.u.pc = uint32(x) } +func (c *sigctxt) setsp(x uintptr) { c.u.sp = uint32(x) } +func (c *sigctxt) setlr(x uintptr) {} + +func (c *sigctxt) savelr(x uintptr) {} + +func dumpregs(u *ureg) { + print("ax ", hex(u.ax), "\n") + print("bx ", hex(u.bx), "\n") + print("cx ", hex(u.cx), "\n") + print("dx ", hex(u.dx), "\n") + print("di ", hex(u.di), "\n") + print("si ", hex(u.si), "\n") + print("bp ", hex(u.bp), "\n") + print("sp ", hex(u.sp), "\n") + print("pc ", hex(u.pc), "\n") + print("flags ", hex(u.flags), "\n") + print("cs ", hex(u.cs), "\n") + print("fs ", hex(u.fs), "\n") + print("gs ", hex(u.gs), "\n") +} + +func sigpanictramp() diff --git a/src/runtime/defs_plan9_amd64.go b/src/runtime/defs_plan9_amd64.go new file mode 100644 index 0000000..15a27fc --- /dev/null +++ b/src/runtime/defs_plan9_amd64.go @@ -0,0 +1,81 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +const _PAGESIZE = 0x1000 + +type ureg struct { + ax uint64 + bx uint64 + cx uint64 + dx uint64 + si uint64 + di uint64 + bp uint64 + r8 uint64 + r9 uint64 + r10 uint64 + r11 uint64 + r12 uint64 + r13 uint64 + r14 uint64 + r15 uint64 + + ds uint16 + es uint16 + fs uint16 + gs uint16 + + _type uint64 + error uint64 /* error code (or zero) */ + ip uint64 /* pc */ + cs uint64 /* old context */ + flags uint64 /* old flags */ + sp uint64 /* sp */ + ss uint64 /* old stack segment */ +} + +type sigctxt struct { + u *ureg +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uintptr { return uintptr(c.u.ip) } + +func (c *sigctxt) sp() uintptr { return uintptr(c.u.sp) } +func (c *sigctxt) lr() uintptr { return uintptr(0) } + +func (c *sigctxt) setpc(x uintptr) { c.u.ip = uint64(x) } +func (c *sigctxt) setsp(x uintptr) { c.u.sp = uint64(x) } +func (c *sigctxt) setlr(x uintptr) {} + +func (c *sigctxt) savelr(x uintptr) {} + +func dumpregs(u *ureg) { + print("ax ", hex(u.ax), "\n") + print("bx ", hex(u.bx), "\n") + print("cx ", hex(u.cx), "\n") + print("dx ", hex(u.dx), "\n") + print("di ", hex(u.di), "\n") + print("si ", hex(u.si), "\n") + print("bp ", hex(u.bp), "\n") + print("sp ", hex(u.sp), "\n") + print("r8 ", hex(u.r8), "\n") + print("r9 ", hex(u.r9), "\n") + print("r10 ", hex(u.r10), "\n") + print("r11 ", hex(u.r11), "\n") + print("r12 ", hex(u.r12), "\n") + print("r13 ", hex(u.r13), "\n") + print("r14 ", hex(u.r14), "\n") + print("r15 ", hex(u.r15), "\n") + print("ip ", hex(u.ip), "\n") + print("flags ", hex(u.flags), "\n") + print("cs ", hex(u.cs), "\n") + print("fs ", hex(u.fs), "\n") + print("gs ", hex(u.gs), "\n") +} + +func sigpanictramp() diff --git a/src/runtime/defs_plan9_arm.go b/src/runtime/defs_plan9_arm.go new file mode 100644 index 0000000..1adc16e --- /dev/null +++ b/src/runtime/defs_plan9_arm.go @@ -0,0 +1,66 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const _PAGESIZE = 0x1000 + +type ureg struct { + r0 uint32 /* general registers */ + r1 uint32 /* ... */ + r2 uint32 /* ... */ + r3 uint32 /* ... */ + r4 uint32 /* ... */ + r5 uint32 /* ... */ + r6 uint32 /* ... */ + r7 uint32 /* ... */ + r8 uint32 /* ... */ + r9 uint32 /* ... */ + r10 uint32 /* ... */ + r11 uint32 /* ... */ + r12 uint32 /* ... */ + sp uint32 + link uint32 /* ... */ + trap uint32 /* trap type */ + psr uint32 + pc uint32 /* interrupted addr */ +} + +type sigctxt struct { + u *ureg +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uintptr { return uintptr(c.u.pc) } + +func (c *sigctxt) sp() uintptr { return uintptr(c.u.sp) } +func (c *sigctxt) lr() uintptr { return uintptr(c.u.link) } + +func (c *sigctxt) setpc(x uintptr) { c.u.pc = uint32(x) } +func (c *sigctxt) setsp(x uintptr) { c.u.sp = uint32(x) } +func (c *sigctxt) setlr(x uintptr) { c.u.link = uint32(x) } +func (c *sigctxt) savelr(x uintptr) { c.u.r0 = uint32(x) } + +func dumpregs(u *ureg) { + print("r0 ", hex(u.r0), "\n") + print("r1 ", hex(u.r1), "\n") + print("r2 ", hex(u.r2), "\n") + print("r3 ", hex(u.r3), "\n") + print("r4 ", hex(u.r4), "\n") + print("r5 ", hex(u.r5), "\n") + print("r6 ", hex(u.r6), "\n") + print("r7 ", hex(u.r7), "\n") + print("r8 ", hex(u.r8), "\n") + print("r9 ", hex(u.r9), "\n") + print("r10 ", hex(u.r10), "\n") + print("r11 ", hex(u.r11), "\n") + print("r12 ", hex(u.r12), "\n") + print("sp ", hex(u.sp), "\n") + print("link ", hex(u.link), "\n") + print("pc ", hex(u.pc), "\n") + print("psr ", hex(u.psr), "\n") +} + +func sigpanictramp() diff --git a/src/runtime/defs_solaris.go b/src/runtime/defs_solaris.go new file mode 100644 index 0000000..406304d --- /dev/null +++ b/src/runtime/defs_solaris.go @@ -0,0 +1,164 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. + +GOARCH=amd64 go tool cgo -cdefs defs_solaris.go >defs_solaris_amd64.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <sys/mman.h> +#include <sys/select.h> +#include <sys/siginfo.h> +#include <sys/signal.h> +#include <sys/stat.h> +#include <sys/time.h> +#include <sys/ucontext.h> +#include <sys/regset.h> +#include <sys/unistd.h> +#include <sys/fork.h> +#include <sys/port.h> +#include <semaphore.h> +#include <errno.h> +#include <signal.h> +#include <pthread.h> +#include <netdb.h> +*/ +import "C" + +const ( + EINTR = C.EINTR + EBADF = C.EBADF + EFAULT = C.EFAULT + EAGAIN = C.EAGAIN + EBUSY = C.EBUSY + ETIME = C.ETIME + ETIMEDOUT = C.ETIMEDOUT + EWOULDBLOCK = C.EWOULDBLOCK + EINPROGRESS = C.EINPROGRESS + + PROT_NONE = C.PROT_NONE + PROT_READ = C.PROT_READ + PROT_WRITE = C.PROT_WRITE + PROT_EXEC = C.PROT_EXEC + + MAP_ANON = C.MAP_ANON + MAP_PRIVATE = C.MAP_PRIVATE + MAP_FIXED = C.MAP_FIXED + + MADV_DONTNEED = C.MADV_DONTNEED + MADV_FREE = C.MADV_FREE + + SA_SIGINFO = C.SA_SIGINFO + SA_RESTART = C.SA_RESTART + SA_ONSTACK = C.SA_ONSTACK + + SIGHUP = C.SIGHUP + SIGINT = C.SIGINT + SIGQUIT = C.SIGQUIT + SIGILL = C.SIGILL + SIGTRAP = C.SIGTRAP + SIGABRT = C.SIGABRT + SIGEMT = C.SIGEMT + SIGFPE = C.SIGFPE + SIGKILL = C.SIGKILL + SIGBUS = C.SIGBUS + SIGSEGV = C.SIGSEGV + SIGSYS = C.SIGSYS + SIGPIPE = C.SIGPIPE + SIGALRM = C.SIGALRM + SIGTERM = C.SIGTERM + SIGURG = C.SIGURG + SIGSTOP = C.SIGSTOP + SIGTSTP = C.SIGTSTP + SIGCONT = C.SIGCONT + SIGCHLD = C.SIGCHLD + SIGTTIN = C.SIGTTIN + SIGTTOU = C.SIGTTOU + SIGIO = C.SIGIO + SIGXCPU = C.SIGXCPU + SIGXFSZ = C.SIGXFSZ + SIGVTALRM = C.SIGVTALRM + SIGPROF = C.SIGPROF + SIGWINCH = C.SIGWINCH + SIGUSR1 = C.SIGUSR1 + SIGUSR2 = C.SIGUSR2 + + FPE_INTDIV = C.FPE_INTDIV + FPE_INTOVF = C.FPE_INTOVF + FPE_FLTDIV = C.FPE_FLTDIV + FPE_FLTOVF = C.FPE_FLTOVF + FPE_FLTUND = 
C.FPE_FLTUND + FPE_FLTRES = C.FPE_FLTRES + FPE_FLTINV = C.FPE_FLTINV + FPE_FLTSUB = C.FPE_FLTSUB + + BUS_ADRALN = C.BUS_ADRALN + BUS_ADRERR = C.BUS_ADRERR + BUS_OBJERR = C.BUS_OBJERR + + SEGV_MAPERR = C.SEGV_MAPERR + SEGV_ACCERR = C.SEGV_ACCERR + + ITIMER_REAL = C.ITIMER_REAL + ITIMER_VIRTUAL = C.ITIMER_VIRTUAL + ITIMER_PROF = C.ITIMER_PROF + + _SC_NPROCESSORS_ONLN = C._SC_NPROCESSORS_ONLN + + PTHREAD_CREATE_DETACHED = C.PTHREAD_CREATE_DETACHED + + FORK_NOSIGCHLD = C.FORK_NOSIGCHLD + FORK_WAITPID = C.FORK_WAITPID + + MAXHOSTNAMELEN = C.MAXHOSTNAMELEN + + O_WRONLY = C.O_WRONLY + O_NONBLOCK = C.O_NONBLOCK + O_CREAT = C.O_CREAT + O_TRUNC = C.O_TRUNC + O_CLOEXEC = C.O_CLOEXEC + FD_CLOEXEC = C.FD_CLOEXEC + F_GETFL = C.F_GETFL + F_SETFL = C.F_SETFL + F_SETFD = C.F_SETFD + + POLLIN = C.POLLIN + POLLOUT = C.POLLOUT + POLLHUP = C.POLLHUP + POLLERR = C.POLLERR + + PORT_SOURCE_FD = C.PORT_SOURCE_FD + PORT_SOURCE_ALERT = C.PORT_SOURCE_ALERT + PORT_ALERT_UPDATE = C.PORT_ALERT_UPDATE +) + +type SemT C.sem_t + +type Sigset C.sigset_t +type StackT C.stack_t + +type Siginfo C.siginfo_t +type Sigaction C.struct_sigaction + +type Fpregset C.fpregset_t +type Mcontext C.mcontext_t +type Ucontext C.ucontext_t + +type Timespec C.struct_timespec +type Timeval C.struct_timeval +type Itimerval C.struct_itimerval + +type PortEvent C.port_event_t +type Pthread C.pthread_t +type PthreadAttr C.pthread_attr_t + +// depends on Timespec, must appear below +type Stat C.struct_stat diff --git a/src/runtime/defs_solaris_amd64.go b/src/runtime/defs_solaris_amd64.go new file mode 100644 index 0000000..56e4b38 --- /dev/null +++ b/src/runtime/defs_solaris_amd64.go @@ -0,0 +1,48 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +/* +Input to cgo. 
+ +GOARCH=amd64 go tool cgo -cdefs defs_solaris.go defs_solaris_amd64.go >defs_solaris_amd64.h +*/ + +package runtime + +/* +#include <sys/types.h> +#include <sys/regset.h> +*/ +import "C" + +const ( + REG_RDI = C.REG_RDI + REG_RSI = C.REG_RSI + REG_RDX = C.REG_RDX + REG_RCX = C.REG_RCX + REG_R8 = C.REG_R8 + REG_R9 = C.REG_R9 + REG_R10 = C.REG_R10 + REG_R11 = C.REG_R11 + REG_R12 = C.REG_R12 + REG_R13 = C.REG_R13 + REG_R14 = C.REG_R14 + REG_R15 = C.REG_R15 + REG_RBP = C.REG_RBP + REG_RBX = C.REG_RBX + REG_RAX = C.REG_RAX + REG_GS = C.REG_GS + REG_FS = C.REG_FS + REG_ES = C.REG_ES + REG_DS = C.REG_DS + REG_TRAPNO = C.REG_TRAPNO + REG_ERR = C.REG_ERR + REG_RIP = C.REG_RIP + REG_CS = C.REG_CS + REG_RFLAGS = C.REG_RFL + REG_RSP = C.REG_RSP + REG_SS = C.REG_SS +) diff --git a/src/runtime/defs_windows.go b/src/runtime/defs_windows.go new file mode 100644 index 0000000..8d4e381 --- /dev/null +++ b/src/runtime/defs_windows.go @@ -0,0 +1,84 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Windows architecture-independent definitions. 
+ +package runtime + +const ( + _PROT_NONE = 0 + _PROT_READ = 1 + _PROT_WRITE = 2 + _PROT_EXEC = 4 + + _MAP_ANON = 1 + _MAP_PRIVATE = 2 + + _DUPLICATE_SAME_ACCESS = 0x2 + _THREAD_PRIORITY_HIGHEST = 0x2 + + _SIGINT = 0x2 + _SIGTERM = 0xF + _CTRL_C_EVENT = 0x0 + _CTRL_BREAK_EVENT = 0x1 + _CTRL_CLOSE_EVENT = 0x2 + _CTRL_LOGOFF_EVENT = 0x5 + _CTRL_SHUTDOWN_EVENT = 0x6 + + _EXCEPTION_ACCESS_VIOLATION = 0xc0000005 + _EXCEPTION_BREAKPOINT = 0x80000003 + _EXCEPTION_ILLEGAL_INSTRUCTION = 0xc000001d + _EXCEPTION_FLT_DENORMAL_OPERAND = 0xc000008d + _EXCEPTION_FLT_DIVIDE_BY_ZERO = 0xc000008e + _EXCEPTION_FLT_INEXACT_RESULT = 0xc000008f + _EXCEPTION_FLT_OVERFLOW = 0xc0000091 + _EXCEPTION_FLT_UNDERFLOW = 0xc0000093 + _EXCEPTION_INT_DIVIDE_BY_ZERO = 0xc0000094 + _EXCEPTION_INT_OVERFLOW = 0xc0000095 + + _INFINITE = 0xffffffff + _WAIT_TIMEOUT = 0x102 + + _EXCEPTION_CONTINUE_EXECUTION = -0x1 + _EXCEPTION_CONTINUE_SEARCH = 0x0 +) + +type systeminfo struct { + anon0 [4]byte + dwpagesize uint32 + lpminimumapplicationaddress *byte + lpmaximumapplicationaddress *byte + dwactiveprocessormask uintptr + dwnumberofprocessors uint32 + dwprocessortype uint32 + dwallocationgranularity uint32 + wprocessorlevel uint16 + wprocessorrevision uint16 +} + +type exceptionrecord struct { + exceptioncode uint32 + exceptionflags uint32 + exceptionrecord *exceptionrecord + exceptionaddress *byte + numberparameters uint32 + exceptioninformation [15]uintptr +} + +type overlapped struct { + internal uintptr + internalhigh uintptr + anon0 [8]byte + hevent *byte +} + +type memoryBasicInformation struct { + baseAddress uintptr + allocationBase uintptr + allocationProtect uint32 + regionSize uintptr + state uint32 + protect uint32 + type_ uint32 +} diff --git a/src/runtime/defs_windows_386.go b/src/runtime/defs_windows_386.go new file mode 100644 index 0000000..37fe74c --- /dev/null +++ b/src/runtime/defs_windows_386.go @@ -0,0 +1,73 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const _CONTEXT_CONTROL = 0x10001 + +type floatingsavearea struct { + controlword uint32 + statusword uint32 + tagword uint32 + erroroffset uint32 + errorselector uint32 + dataoffset uint32 + dataselector uint32 + registerarea [80]uint8 + cr0npxstate uint32 +} + +type context struct { + contextflags uint32 + dr0 uint32 + dr1 uint32 + dr2 uint32 + dr3 uint32 + dr6 uint32 + dr7 uint32 + floatsave floatingsavearea + seggs uint32 + segfs uint32 + seges uint32 + segds uint32 + edi uint32 + esi uint32 + ebx uint32 + edx uint32 + ecx uint32 + eax uint32 + ebp uint32 + eip uint32 + segcs uint32 + eflags uint32 + esp uint32 + segss uint32 + extendedregisters [512]uint8 +} + +func (c *context) ip() uintptr { return uintptr(c.eip) } +func (c *context) sp() uintptr { return uintptr(c.esp) } + +// 386 does not have link register, so this returns 0. +func (c *context) lr() uintptr { return 0 } +func (c *context) set_lr(x uintptr) {} + +func (c *context) set_ip(x uintptr) { c.eip = uint32(x) } +func (c *context) set_sp(x uintptr) { c.esp = uint32(x) } + +func dumpregs(r *context) { + print("eax ", hex(r.eax), "\n") + print("ebx ", hex(r.ebx), "\n") + print("ecx ", hex(r.ecx), "\n") + print("edx ", hex(r.edx), "\n") + print("edi ", hex(r.edi), "\n") + print("esi ", hex(r.esi), "\n") + print("ebp ", hex(r.ebp), "\n") + print("esp ", hex(r.esp), "\n") + print("eip ", hex(r.eip), "\n") + print("eflags ", hex(r.eflags), "\n") + print("cs ", hex(r.segcs), "\n") + print("fs ", hex(r.segfs), "\n") + print("gs ", hex(r.seggs), "\n") +} diff --git a/src/runtime/defs_windows_amd64.go b/src/runtime/defs_windows_amd64.go new file mode 100644 index 0000000..ac636a6 --- /dev/null +++ b/src/runtime/defs_windows_amd64.go @@ -0,0 +1,94 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const _CONTEXT_CONTROL = 0x100001 + +type m128a struct { + low uint64 + high int64 +} + +type context struct { + p1home uint64 + p2home uint64 + p3home uint64 + p4home uint64 + p5home uint64 + p6home uint64 + contextflags uint32 + mxcsr uint32 + segcs uint16 + segds uint16 + seges uint16 + segfs uint16 + seggs uint16 + segss uint16 + eflags uint32 + dr0 uint64 + dr1 uint64 + dr2 uint64 + dr3 uint64 + dr6 uint64 + dr7 uint64 + rax uint64 + rcx uint64 + rdx uint64 + rbx uint64 + rsp uint64 + rbp uint64 + rsi uint64 + rdi uint64 + r8 uint64 + r9 uint64 + r10 uint64 + r11 uint64 + r12 uint64 + r13 uint64 + r14 uint64 + r15 uint64 + rip uint64 + anon0 [512]byte + vectorregister [26]m128a + vectorcontrol uint64 + debugcontrol uint64 + lastbranchtorip uint64 + lastbranchfromrip uint64 + lastexceptiontorip uint64 + lastexceptionfromrip uint64 +} + +func (c *context) ip() uintptr { return uintptr(c.rip) } +func (c *context) sp() uintptr { return uintptr(c.rsp) } + +// AMD64 does not have link register, so this returns 0. 
+func (c *context) lr() uintptr { return 0 } +func (c *context) set_lr(x uintptr) {} + +func (c *context) set_ip(x uintptr) { c.rip = uint64(x) } +func (c *context) set_sp(x uintptr) { c.rsp = uint64(x) } + +func dumpregs(r *context) { + print("rax ", hex(r.rax), "\n") + print("rbx ", hex(r.rbx), "\n") + print("rcx ", hex(r.rcx), "\n") + print("rdi ", hex(r.rdi), "\n") + print("rsi ", hex(r.rsi), "\n") + print("rbp ", hex(r.rbp), "\n") + print("rsp ", hex(r.rsp), "\n") + print("r8 ", hex(r.r8), "\n") + print("r9 ", hex(r.r9), "\n") + print("r10 ", hex(r.r10), "\n") + print("r11 ", hex(r.r11), "\n") + print("r12 ", hex(r.r12), "\n") + print("r13 ", hex(r.r13), "\n") + print("r14 ", hex(r.r14), "\n") + print("r15 ", hex(r.r15), "\n") + print("rip ", hex(r.rip), "\n") + print("rflags ", hex(r.eflags), "\n") + print("cs ", hex(r.segcs), "\n") + print("fs ", hex(r.segfs), "\n") + print("gs ", hex(r.seggs), "\n") +} diff --git a/src/runtime/defs_windows_arm.go b/src/runtime/defs_windows_arm.go new file mode 100644 index 0000000..370470e --- /dev/null +++ b/src/runtime/defs_windows_arm.go @@ -0,0 +1,83 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// NOTE(rsc): _CONTEXT_CONTROL is actually 0x200001 and should include PC, SP, and LR. +// However, empirically, LR doesn't come along on Windows 10 +// unless you also set _CONTEXT_INTEGER (0x200002). +// Without LR, we skip over the next-to-bottom function in profiles +// when the bottom function is frameless. +// So we set both here, to make a working _CONTEXT_CONTROL. 
+const _CONTEXT_CONTROL = 0x200003 + +type neon128 struct { + low uint64 + high int64 +} + +type context struct { + contextflags uint32 + r0 uint32 + r1 uint32 + r2 uint32 + r3 uint32 + r4 uint32 + r5 uint32 + r6 uint32 + r7 uint32 + r8 uint32 + r9 uint32 + r10 uint32 + r11 uint32 + r12 uint32 + + spr uint32 + lrr uint32 + pc uint32 + cpsr uint32 + + fpscr uint32 + padding uint32 + + floatNeon [16]neon128 + + bvr [8]uint32 + bcr [8]uint32 + wvr [1]uint32 + wcr [1]uint32 + padding2 [2]uint32 +} + +func (c *context) ip() uintptr { return uintptr(c.pc) } +func (c *context) sp() uintptr { return uintptr(c.spr) } +func (c *context) lr() uintptr { return uintptr(c.lrr) } + +func (c *context) set_ip(x uintptr) { c.pc = uint32(x) } +func (c *context) set_sp(x uintptr) { c.spr = uint32(x) } +func (c *context) set_lr(x uintptr) { c.lrr = uint32(x) } + +func dumpregs(r *context) { + print("r0 ", hex(r.r0), "\n") + print("r1 ", hex(r.r1), "\n") + print("r2 ", hex(r.r2), "\n") + print("r3 ", hex(r.r3), "\n") + print("r4 ", hex(r.r4), "\n") + print("r5 ", hex(r.r5), "\n") + print("r6 ", hex(r.r6), "\n") + print("r7 ", hex(r.r7), "\n") + print("r8 ", hex(r.r8), "\n") + print("r9 ", hex(r.r9), "\n") + print("r10 ", hex(r.r10), "\n") + print("r11 ", hex(r.r11), "\n") + print("r12 ", hex(r.r12), "\n") + print("sp ", hex(r.spr), "\n") + print("lr ", hex(r.lrr), "\n") + print("pc ", hex(r.pc), "\n") + print("cpsr ", hex(r.cpsr), "\n") +} + +func stackcheck() { + // TODO: not implemented on ARM +} diff --git a/src/runtime/defs_windows_arm64.go b/src/runtime/defs_windows_arm64.go new file mode 100644 index 0000000..9ccce46 --- /dev/null +++ b/src/runtime/defs_windows_arm64.go @@ -0,0 +1,83 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// NOTE(rsc): _CONTEXT_CONTROL is actually 0x400001 and should include PC, SP, and LR. 
+// However, empirically, LR doesn't come along on Windows 10 +// unless you also set _CONTEXT_INTEGER (0x400002). +// Without LR, we skip over the next-to-bottom function in profiles +// when the bottom function is frameless. +// So we set both here, to make a working _CONTEXT_CONTROL. +const _CONTEXT_CONTROL = 0x400003 + +type neon128 struct { + low uint64 + high int64 +} + +// See https://docs.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-arm64_nt_context +type context struct { + contextflags uint32 + cpsr uint32 + x [31]uint64 // fp is x[29], lr is x[30] + xsp uint64 + pc uint64 + v [32]neon128 + fpcr uint32 + fpsr uint32 + bcr [8]uint32 + bvr [8]uint64 + wcr [2]uint32 + wvr [2]uint64 +} + +func (c *context) ip() uintptr { return uintptr(c.pc) } +func (c *context) sp() uintptr { return uintptr(c.xsp) } +func (c *context) lr() uintptr { return uintptr(c.x[30]) } + +func (c *context) set_ip(x uintptr) { c.pc = uint64(x) } +func (c *context) set_sp(x uintptr) { c.xsp = uint64(x) } +func (c *context) set_lr(x uintptr) { c.x[30] = uint64(x) } + +func dumpregs(r *context) { + print("r0 ", hex(r.x[0]), "\n") + print("r1 ", hex(r.x[1]), "\n") + print("r2 ", hex(r.x[2]), "\n") + print("r3 ", hex(r.x[3]), "\n") + print("r4 ", hex(r.x[4]), "\n") + print("r5 ", hex(r.x[5]), "\n") + print("r6 ", hex(r.x[6]), "\n") + print("r7 ", hex(r.x[7]), "\n") + print("r8 ", hex(r.x[8]), "\n") + print("r9 ", hex(r.x[9]), "\n") + print("r10 ", hex(r.x[10]), "\n") + print("r11 ", hex(r.x[11]), "\n") + print("r12 ", hex(r.x[12]), "\n") + print("r13 ", hex(r.x[13]), "\n") + print("r14 ", hex(r.x[14]), "\n") + print("r15 ", hex(r.x[15]), "\n") + print("r16 ", hex(r.x[16]), "\n") + print("r17 ", hex(r.x[17]), "\n") + print("r18 ", hex(r.x[18]), "\n") + print("r19 ", hex(r.x[19]), "\n") + print("r20 ", hex(r.x[20]), "\n") + print("r21 ", hex(r.x[21]), "\n") + print("r22 ", hex(r.x[22]), "\n") + print("r23 ", hex(r.x[23]), "\n") + print("r24 ", hex(r.x[24]), "\n") + print("r25 ", 
hex(r.x[25]), "\n") + print("r26 ", hex(r.x[26]), "\n") + print("r27 ", hex(r.x[27]), "\n") + print("r28 ", hex(r.x[28]), "\n") + print("r29 ", hex(r.x[29]), "\n") + print("lr ", hex(r.x[30]), "\n") + print("sp ", hex(r.xsp), "\n") + print("pc ", hex(r.pc), "\n") + print("cpsr ", hex(r.cpsr), "\n") +} + +func stackcheck() { + // TODO: not implemented on ARM +} diff --git a/src/runtime/duff_386.s b/src/runtime/duff_386.s new file mode 100644 index 0000000..ab01430 --- /dev/null +++ b/src/runtime/duff_386.s @@ -0,0 +1,779 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. + +#include "textflag.h" + +TEXT runtime·duffzeroruntime·duffcopy(SB), NOSPLIT, $0-0 + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + 
ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + 
MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + 
ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + 
MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + MOVL (SI), CX + ADDL $4, SI + MOVL CX, (DI) + ADDL $4, DI + + RET diff --git a/src/runtime/duff_amd64.s b/src/runtime/duff_amd64.s new file mode 100644 index 0000000..df010f5 --- /dev/null +++ b/src/runtime/duff_amd64.s @@ -0,0 +1,427 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. + +#include "textflag.h" + +TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT, $0-0 + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + 
MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + MOVUPS X15,(DI) + MOVUPS X15,16(DI) + MOVUPS X15,32(DI) + MOVUPS X15,48(DI) + LEAQ 64(DI),DI + + RET + +TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT, $0-0 + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ 
$16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + 
ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + MOVUPS (SI), X0 + ADDQ $16, SI + MOVUPS X0, (DI) + ADDQ $16, DI + + RET diff --git a/src/runtime/duff_arm.s b/src/runtime/duff_arm.s new file mode 100644 index 0000000..ba8235b --- /dev/null +++ b/src/runtime/duff_arm.s @@ -0,0 +1,523 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. 
+ +#include "textflag.h" + +TEXT runtime·duffzero(SB), NOSPLIT, $0-0 + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P 
R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + MOVW.P R0, 4(R1) + RET + +TEXT runtime·duffcopy(SB), NOSPLIT, $0-0 + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + 
MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + 
MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + MOVW.P 4(R1), R0 + MOVW.P R0, 4(R2) + + RET diff --git a/src/runtime/duff_arm64.s b/src/runtime/duff_arm64.s new file mode 100644 index 0000000..33c4905 --- /dev/null +++ b/src/runtime/duff_arm64.s @@ -0,0 +1,267 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. 
+ +#include "textflag.h" + +TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP.P (ZR, ZR), 16(R20) + STP (ZR, ZR), (R20) + RET + +TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + 
LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + 
STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + LDP.P 16(R20), (R26, R27) + STP.P (R26, R27), 16(R21) + + RET diff --git a/src/runtime/duff_loong64.s b/src/runtime/duff_loong64.s new file mode 100644 index 0000000..7f78e4f --- /dev/null +++ b/src/runtime/duff_loong64.s @@ -0,0 +1,907 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. 
+ +#include "textflag.h" + +TEXT runtime·duffzero(SB), NOSPLIT|NOFRAME, $0-0 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV 
R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV 
$8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + MOVV R0, 8(R19) + ADDV $8, R19 + RET + +TEXT runtime·duffcopy(SB), NOSPLIT|NOFRAME, $0-0 + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + 
MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, 
R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + 
ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV 
(R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + MOVV (R19), R30 + ADDV $8, R19 + MOVV R30, (R20) + ADDV $8, R20 + + RET diff --git a/src/runtime/duff_mips64x.s b/src/runtime/duff_mips64x.s new file mode 100644 index 0000000..3a8524c --- /dev/null +++ b/src/runtime/duff_mips64x.s @@ -0,0 +1,909 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. 
+ +//go:build mips64 || mips64le + +#include "textflag.h" + +TEXT runtime·duffzero(SB), NOSPLIT|NOFRAME, $0-0 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, 
R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 
8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + MOVV R0, 8(R1) + ADDV $8, R1 + RET + +TEXT runtime·duffcopy(SB), NOSPLIT|NOFRAME, $0-0 + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + 
MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 
+ ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + 
MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) 
+ ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + MOVV (R1), R23 + ADDV $8, R1 + MOVV R23, (R2) + ADDV $8, R2 + + RET diff --git a/src/runtime/duff_ppc64x.s b/src/runtime/duff_ppc64x.s new file mode 100644 index 0000000..a3caaa8 --- /dev/null +++ b/src/runtime/duff_ppc64x.s @@ -0,0 +1,397 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. + +//go:build ppc64 || ppc64le + +#include "textflag.h" + +TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) 
+ MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + MOVDU R0, 8(R20) + RET + +TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 
8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 
8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), 
R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + MOVDU 8(R20), R5 + MOVDU R5, 8(R21) + RET diff --git a/src/runtime/duff_riscv64.s b/src/runtime/duff_riscv64.s new file mode 100644 index 0000000..ec44767 --- /dev/null +++ b/src/runtime/duff_riscv64.s @@ -0,0 +1,907 @@ +// Code generated by mkduff.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkduff.go for comments. + +#include "textflag.h" + +TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD 
$8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV 
ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + MOV ZERO, (X25) + ADD $8, X25 + RET + +TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0 + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), 
X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, 
X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, 
(X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, 
X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + MOV (X24), X31 + ADD $8, X24 + MOV X31, (X25) + ADD $8, X25 + + RET diff --git a/src/runtime/duff_s390x.s b/src/runtime/duff_s390x.s new file mode 100644 index 0000000..95d492a --- /dev/null +++ b/src/runtime/duff_s390x.s @@ -0,0 +1,19 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// s390x can copy/zero 1-256 bytes with a single instruction, +// so there's no need for these, except to satisfy the prototypes +// in stubs.go. + +TEXT runtime·duffzero(SB),NOSPLIT|NOFRAME,$0-0 + MOVD $0, 2(R0) + RET + +TEXT runtime·duffcopy(SB),NOSPLIT|NOFRAME,$0-0 + MOVD $0, 2(R0) + RET diff --git a/src/runtime/ehooks_test.go b/src/runtime/ehooks_test.go new file mode 100644 index 0000000..ee286ec --- /dev/null +++ b/src/runtime/ehooks_test.go @@ -0,0 +1,91 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "internal/platform" + "internal/testenv" + "os/exec" + "runtime" + "strings" + "testing" +) + +func TestExitHooks(t *testing.T) { + bmodes := []string{""} + if testing.Short() { + t.Skip("skipping due to -short") + } + // Note the HasCGO() test below; this is to prevent the test + // running if CGO_ENABLED=0 is in effect. + haverace := platform.RaceDetectorSupported(runtime.GOOS, runtime.GOARCH) + if haverace && testenv.HasCGO() { + bmodes = append(bmodes, "-race") + } + for _, bmode := range bmodes { + scenarios := []struct { + mode string + expected string + musthave string + }{ + { + mode: "simple", + expected: "bar foo", + musthave: "", + }, + { + mode: "goodexit", + expected: "orange apple", + musthave: "", + }, + { + mode: "badexit", + expected: "blub blix", + musthave: "", + }, + { + mode: "panics", + expected: "", + musthave: "fatal error: internal error: exit hook invoked panic", + }, + { + mode: "callsexit", + expected: "", + musthave: "fatal error: internal error: exit hook invoked exit", + }, + } + + exe, err := buildTestProg(t, "testexithooks", bmode) + if err != nil { + t.Fatal(err) + } + + bt := "" + if bmode != "" { + bt = " bmode: " + bmode + } + for _, s := range scenarios { + cmd := exec.Command(exe, []string{"-mode", s.mode}...) 
+ out, _ := cmd.CombinedOutput() + outs := strings.ReplaceAll(string(out), "\n", " ") + outs = strings.TrimSpace(outs) + if s.expected != "" { + if s.expected != outs { + t.Logf("raw output: %q", outs) + t.Errorf("failed%s mode %s: wanted %q got %q", bt, + s.mode, s.expected, outs) + } + } else if s.musthave != "" { + if !strings.Contains(outs, s.musthave) { + t.Logf("raw output: %q", outs) + t.Errorf("failed mode %s: output does not contain %q", + s.mode, s.musthave) + } + } else { + panic("badly written scenario") + } + } + } +} diff --git a/src/runtime/env_plan9.go b/src/runtime/env_plan9.go new file mode 100644 index 0000000..d206c5d --- /dev/null +++ b/src/runtime/env_plan9.go @@ -0,0 +1,126 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +const ( + // Plan 9 environment device + envDir = "/env/" + // size of buffer to read from a directory + dirBufSize = 4096 + // size of buffer to read an environment variable (may grow) + envBufSize = 128 + // offset of the name field in a 9P directory entry - see syscall.UnmarshalDir() + nameOffset = 39 +) + +// goenvs caches the Plan 9 environment variables at start of execution into +// string array envs, to supply the initial contents for os.Environ. +// Subsequent calls to os.Setenv will change this cache, without writing back +// to the (possibly shared) Plan 9 environment, so that Setenv and Getenv +// conform to the same Posix semantics as on other operating systems. +// For Plan 9 shared environment semantics, instead of Getenv(key) and +// Setenv(key, value), one can use os.ReadFile("/env/" + key) and +// os.WriteFile("/env/" + key, value, 0666) respectively. 
+// +//go:nosplit +func goenvs() { + buf := make([]byte, envBufSize) + copy(buf, envDir) + dirfd := open(&buf[0], _OREAD, 0) + if dirfd < 0 { + return + } + defer closefd(dirfd) + dofiles(dirfd, func(name []byte) { + name = append(name, 0) + buf = buf[:len(envDir)] + copy(buf, envDir) + buf = append(buf, name...) + fd := open(&buf[0], _OREAD, 0) + if fd < 0 { + return + } + defer closefd(fd) + n := len(buf) + r := 0 + for { + r = int(pread(fd, unsafe.Pointer(&buf[0]), int32(n), 0)) + if r < n { + break + } + n = int(seek(fd, 0, 2)) + 1 + if len(buf) < n { + buf = make([]byte, n) + } + } + if r <= 0 { + r = 0 + } else if buf[r-1] == 0 { + r-- + } + name[len(name)-1] = '=' + env := make([]byte, len(name)+r) + copy(env, name) + copy(env[len(name):], buf[:r]) + envs = append(envs, string(env)) + }) +} + +// dofiles reads the directory opened with file descriptor fd, applying function f +// to each filename in it. +// +//go:nosplit +func dofiles(dirfd int32, f func([]byte)) { + dirbuf := new([dirBufSize]byte) + + var off int64 = 0 + for { + n := pread(dirfd, unsafe.Pointer(&dirbuf[0]), int32(dirBufSize), off) + if n <= 0 { + return + } + for b := dirbuf[:n]; len(b) > 0; { + var name []byte + name, b = gdirname(b) + if name == nil { + return + } + f(name) + } + off += int64(n) + } +} + +// gdirname returns the first filename from a buffer of directory entries, +// and a slice containing the remaining directory entries. +// If the buffer doesn't start with a valid directory entry, the returned name is nil. +// +//go:nosplit +func gdirname(buf []byte) (name []byte, rest []byte) { + if 2+nameOffset+2 > len(buf) { + return + } + entryLen, buf := gbit16(buf) + if entryLen > len(buf) { + return + } + n, b := gbit16(buf[nameOffset:]) + if n > len(b) { + return + } + name = b[:n] + rest = buf[entryLen:] + return +} + +// gbit16 reads a 16-bit little-endian binary number from b and returns it +// with the remaining slice of b. 
+// +//go:nosplit +func gbit16(b []byte) (int, []byte) { + return int(b[0]) | int(b[1])<<8, b[2:] +} diff --git a/src/runtime/env_posix.go b/src/runtime/env_posix.go new file mode 100644 index 0000000..0eb4f0d --- /dev/null +++ b/src/runtime/env_posix.go @@ -0,0 +1,70 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +func gogetenv(key string) string { + env := environ() + if env == nil { + throw("getenv before env init") + } + for _, s := range env { + if len(s) > len(key) && s[len(key)] == '=' && envKeyEqual(s[:len(key)], key) { + return s[len(key)+1:] + } + } + return "" +} + +// envKeyEqual reports whether a == b, with ASCII-only case insensitivity +// on Windows. The two strings must have the same length. +func envKeyEqual(a, b string) bool { + if GOOS == "windows" { // case insensitive + for i := 0; i < len(a); i++ { + ca, cb := a[i], b[i] + if ca == cb || lowerASCII(ca) == lowerASCII(cb) { + continue + } + return false + } + return true + } + return a == b +} + +func lowerASCII(c byte) byte { + if 'A' <= c && c <= 'Z' { + return c + ('a' - 'A') + } + return c +} + +var _cgo_setenv unsafe.Pointer // pointer to C function +var _cgo_unsetenv unsafe.Pointer // pointer to C function + +// Update the C environment if cgo is loaded. +func setenv_c(k string, v string) { + if _cgo_setenv == nil { + return + } + arg := [2]unsafe.Pointer{cstring(k), cstring(v)} + asmcgocall(_cgo_setenv, unsafe.Pointer(&arg)) +} + +// Update the C environment if cgo is loaded. 
+func unsetenv_c(k string) { + if _cgo_unsetenv == nil { + return + } + arg := [1]unsafe.Pointer{cstring(k)} + asmcgocall(_cgo_unsetenv, unsafe.Pointer(&arg)) +} + +func cstring(s string) unsafe.Pointer { + p := make([]byte, len(s)+1) + copy(p, s) + return unsafe.Pointer(&p[0]) +} diff --git a/src/runtime/env_test.go b/src/runtime/env_test.go new file mode 100644 index 0000000..c009d0f --- /dev/null +++ b/src/runtime/env_test.go @@ -0,0 +1,43 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime" + "syscall" + "testing" +) + +func TestFixedGOROOT(t *testing.T) { + // Restore both the real GOROOT environment variable, and runtime's copies: + if orig, ok := syscall.Getenv("GOROOT"); ok { + defer syscall.Setenv("GOROOT", orig) + } else { + defer syscall.Unsetenv("GOROOT") + } + envs := runtime.Envs() + oldenvs := append([]string{}, envs...) + defer runtime.SetEnvs(oldenvs) + + // attempt to reuse existing envs backing array. + want := runtime.GOROOT() + runtime.SetEnvs(append(envs[:0], "GOROOT="+want)) + + if got := runtime.GOROOT(); got != want { + t.Errorf(`initial runtime.GOROOT()=%q, want %q`, got, want) + } + if err := syscall.Setenv("GOROOT", "/os"); err != nil { + t.Fatal(err) + } + if got := runtime.GOROOT(); got != want { + t.Errorf(`after setenv runtime.GOROOT()=%q, want %q`, got, want) + } + if err := syscall.Unsetenv("GOROOT"); err != nil { + t.Fatal(err) + } + if got := runtime.GOROOT(); got != want { + t.Errorf(`after unsetenv runtime.GOROOT()=%q, want %q`, got, want) + } +} diff --git a/src/runtime/error.go b/src/runtime/error.go new file mode 100644 index 0000000..a211fbf --- /dev/null +++ b/src/runtime/error.go @@ -0,0 +1,330 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "internal/bytealg" + +// The Error interface identifies a run time error. +type Error interface { + error + + // RuntimeError is a no-op function but + // serves to distinguish types that are run time + // errors from ordinary errors: a type is a + // run time error if it has a RuntimeError method. + RuntimeError() +} + +// A TypeAssertionError explains a failed type assertion. +type TypeAssertionError struct { + _interface *_type + concrete *_type + asserted *_type + missingMethod string // one method needed by Interface, missing from Concrete +} + +func (*TypeAssertionError) RuntimeError() {} + +func (e *TypeAssertionError) Error() string { + inter := "interface" + if e._interface != nil { + inter = e._interface.string() + } + as := e.asserted.string() + if e.concrete == nil { + return "interface conversion: " + inter + " is nil, not " + as + } + cs := e.concrete.string() + if e.missingMethod == "" { + msg := "interface conversion: " + inter + " is " + cs + ", not " + as + if cs == as { + // provide slightly clearer error message + if e.concrete.pkgpath() != e.asserted.pkgpath() { + msg += " (types from different packages)" + } else { + msg += " (types from different scopes)" + } + } + return msg + } + return "interface conversion: " + cs + " is not " + as + + ": missing method " + e.missingMethod +} + +// itoa converts val to a decimal representation. The result is +// written somewhere within buf and the location of the result is returned. +// buf must be at least 20 bytes. +// +//go:nosplit +func itoa(buf []byte, val uint64) []byte { + i := len(buf) - 1 + for val >= 10 { + buf[i] = byte(val%10 + '0') + i-- + val /= 10 + } + buf[i] = byte(val + '0') + return buf[i:] +} + +// An errorString represents a runtime error described by a single string. 
+type errorString string + +func (e errorString) RuntimeError() {} + +func (e errorString) Error() string { + return "runtime error: " + string(e) +} + +type errorAddressString struct { + msg string // error message + addr uintptr // memory address where the error occurred +} + +func (e errorAddressString) RuntimeError() {} + +func (e errorAddressString) Error() string { + return "runtime error: " + e.msg +} + +// Addr returns the memory address where a fault occurred. +// The address provided is best-effort. +// The veracity of the result may depend on the platform. +// Errors providing this method will only be returned as +// a result of using runtime/debug.SetPanicOnFault. +func (e errorAddressString) Addr() uintptr { + return e.addr +} + +// plainError represents a runtime error described by a string without +// the prefix "runtime error: " after invoking errorString.Error(). +// See Issue #14965. +type plainError string + +func (e plainError) RuntimeError() {} + +func (e plainError) Error() string { + return string(e) +} + +// A boundsError represents an indexing or slicing operation gone wrong. +type boundsError struct { + x int64 + y int + // Values in an index or slice expression can be signed or unsigned. + // That means we'd need 65 bits to encode all possible indexes, from -2^63 to 2^64-1. + // Instead, we keep track of whether x should be interpreted as signed or unsigned. + // y is known to be nonnegative and to fit in an int.
+ signed bool + code boundsErrorCode +} + +type boundsErrorCode uint8 + +const ( + boundsIndex boundsErrorCode = iota // s[x], 0 <= x < len(s) failed + + boundsSliceAlen // s[?:x], 0 <= x <= len(s) failed + boundsSliceAcap // s[?:x], 0 <= x <= cap(s) failed + boundsSliceB // s[x:y], 0 <= x <= y failed (but boundsSliceA didn't happen) + + boundsSlice3Alen // s[?:?:x], 0 <= x <= len(s) failed + boundsSlice3Acap // s[?:?:x], 0 <= x <= cap(s) failed + boundsSlice3B // s[?:x:y], 0 <= x <= y failed (but boundsSlice3A didn't happen) + boundsSlice3C // s[x:y:?], 0 <= x <= y failed (but boundsSlice3A/B didn't happen) + + boundsConvert // (*[x]T)(s), 0 <= x <= len(s) failed + // Note: in the above, len(s) and cap(s) are stored in y +) + +// boundsErrorFmts provide error text for various out-of-bounds panics. +// Note: if you change these strings, you should adjust the size of the buffer +// in boundsError.Error below as well. +var boundsErrorFmts = [...]string{ + boundsIndex: "index out of range [%x] with length %y", + boundsSliceAlen: "slice bounds out of range [:%x] with length %y", + boundsSliceAcap: "slice bounds out of range [:%x] with capacity %y", + boundsSliceB: "slice bounds out of range [%x:%y]", + boundsSlice3Alen: "slice bounds out of range [::%x] with length %y", + boundsSlice3Acap: "slice bounds out of range [::%x] with capacity %y", + boundsSlice3B: "slice bounds out of range [:%x:%y]", + boundsSlice3C: "slice bounds out of range [%x:%y:]", + boundsConvert: "cannot convert slice with length %y to array or pointer to array with length %x", +} + +// boundsNegErrorFmts are overriding formats if x is negative. In this case there's no need to report y. 
+var boundsNegErrorFmts = [...]string{ + boundsIndex: "index out of range [%x]", + boundsSliceAlen: "slice bounds out of range [:%x]", + boundsSliceAcap: "slice bounds out of range [:%x]", + boundsSliceB: "slice bounds out of range [%x:]", + boundsSlice3Alen: "slice bounds out of range [::%x]", + boundsSlice3Acap: "slice bounds out of range [::%x]", + boundsSlice3B: "slice bounds out of range [:%x:]", + boundsSlice3C: "slice bounds out of range [%x::]", +} + +func (e boundsError) RuntimeError() {} + +func appendIntStr(b []byte, v int64, signed bool) []byte { + if signed && v < 0 { + b = append(b, '-') + v = -v + } + var buf [20]byte + b = append(b, itoa(buf[:], uint64(v))...) + return b +} + +func (e boundsError) Error() string { + fmt := boundsErrorFmts[e.code] + if e.signed && e.x < 0 { + fmt = boundsNegErrorFmts[e.code] + } + // max message length is 99: "runtime error: slice bounds out of range [::%x] with capacity %y" + // x can be at most 20 characters. y can be at most 19. + b := make([]byte, 0, 100) + b = append(b, "runtime error: "...) + for i := 0; i < len(fmt); i++ { + c := fmt[i] + if c != '%' { + b = append(b, c) + continue + } + i++ + switch fmt[i] { + case 'x': + b = appendIntStr(b, e.x, e.signed) + case 'y': + b = appendIntStr(b, int64(e.y), true) + } + } + return string(b) +} + +type stringer interface { + String() string +} + +// printany prints an argument passed to panic. +// If panic is called with a value that has a String or Error method, +// it has already been converted into a string by preprintpanics. 
+func printany(i any) { + switch v := i.(type) { + case nil: + print("nil") + case bool: + print(v) + case int: + print(v) + case int8: + print(v) + case int16: + print(v) + case int32: + print(v) + case int64: + print(v) + case uint: + print(v) + case uint8: + print(v) + case uint16: + print(v) + case uint32: + print(v) + case uint64: + print(v) + case uintptr: + print(v) + case float32: + print(v) + case float64: + print(v) + case complex64: + print(v) + case complex128: + print(v) + case string: + print(v) + default: + printanycustomtype(i) + } +} + +func printanycustomtype(i any) { + eface := efaceOf(&i) + typestring := eface._type.string() + + switch eface._type.kind { + case kindString: + print(typestring, `("`, *(*string)(eface.data), `")`) + case kindBool: + print(typestring, "(", *(*bool)(eface.data), ")") + case kindInt: + print(typestring, "(", *(*int)(eface.data), ")") + case kindInt8: + print(typestring, "(", *(*int8)(eface.data), ")") + case kindInt16: + print(typestring, "(", *(*int16)(eface.data), ")") + case kindInt32: + print(typestring, "(", *(*int32)(eface.data), ")") + case kindInt64: + print(typestring, "(", *(*int64)(eface.data), ")") + case kindUint: + print(typestring, "(", *(*uint)(eface.data), ")") + case kindUint8: + print(typestring, "(", *(*uint8)(eface.data), ")") + case kindUint16: + print(typestring, "(", *(*uint16)(eface.data), ")") + case kindUint32: + print(typestring, "(", *(*uint32)(eface.data), ")") + case kindUint64: + print(typestring, "(", *(*uint64)(eface.data), ")") + case kindUintptr: + print(typestring, "(", *(*uintptr)(eface.data), ")") + case kindFloat32: + print(typestring, "(", *(*float32)(eface.data), ")") + case kindFloat64: + print(typestring, "(", *(*float64)(eface.data), ")") + case kindComplex64: + print(typestring, *(*complex64)(eface.data)) + case kindComplex128: + print(typestring, *(*complex128)(eface.data)) + default: + print("(", typestring, ") ", eface.data) + } +} + +// panicwrap generates a panic for 
a call to a wrapped value method +// with a nil pointer receiver. +// +// It is called from the generated wrapper code. +func panicwrap() { + pc := getcallerpc() + name := funcname(findfunc(pc)) + // name is something like "main.(*T).F". + // We want to extract pkg ("main"), typ ("T"), and meth ("F"). + // Do it by finding the parens. + i := bytealg.IndexByteString(name, '(') + if i < 0 { + throw("panicwrap: no ( in " + name) + } + pkg := name[:i-1] + if i+2 >= len(name) || name[i-1:i+2] != ".(*" { + throw("panicwrap: unexpected string after package name: " + name) + } + name = name[i+2:] + i = bytealg.IndexByteString(name, ')') + if i < 0 { + throw("panicwrap: no ) in " + name) + } + if i+2 >= len(name) || name[i:i+2] != ")." { + throw("panicwrap: unexpected string after type name: " + name) + } + typ := name[:i] + meth := name[i+2:] + panic(plainError("value method " + pkg + "." + typ + "." + meth + " called using nil *" + typ + " pointer")) +} diff --git a/src/runtime/example_test.go b/src/runtime/example_test.go new file mode 100644 index 0000000..dcb8f77 --- /dev/null +++ b/src/runtime/example_test.go @@ -0,0 +1,62 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "runtime" + "strings" +) + +func ExampleFrames() { + c := func() { + // Ask runtime.Callers for up to 10 PCs, including runtime.Callers itself. + pc := make([]uintptr, 10) + n := runtime.Callers(0, pc) + if n == 0 { + // No PCs available. This can happen if the first argument to + // runtime.Callers is large. + // + // Return now to avoid processing the zero Frame that would + // otherwise be returned by frames.Next below. + return + } + + pc = pc[:n] // pass only valid pcs to runtime.CallersFrames + frames := runtime.CallersFrames(pc) + + // Loop to get frames. + // A fixed number of PCs can expand to an indefinite number of Frames. 
+ for { + frame, more := frames.Next() + + // Process this frame. + // + // To keep this example's output stable + // even if there are changes in the testing package, + // stop unwinding when we leave package runtime. + if !strings.Contains(frame.File, "runtime/") { + break + } + fmt.Printf("- more:%v | %s\n", more, frame.Function) + + // Check whether there are more frames to process after this one. + if !more { + break + } + } + } + + b := func() { c() } + a := func() { b() } + + a() + // Output: + // - more:true | runtime.Callers + // - more:true | runtime_test.ExampleFrames.func1 + // - more:true | runtime_test.ExampleFrames.func2 + // - more:true | runtime_test.ExampleFrames.func3 + // - more:true | runtime_test.ExampleFrames +} diff --git a/src/runtime/exithook.go b/src/runtime/exithook.go new file mode 100644 index 0000000..bb29a94 --- /dev/null +++ b/src/runtime/exithook.go @@ -0,0 +1,69 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// addExitHook registers the specified function 'f' to be run at +// program termination (e.g. when someone invokes os.Exit(), or when +// main.main returns). Hooks are run in reverse order of registration: +// first hook added is the last one run. +// +// CAREFUL: the expectation is that addExitHook should only be called +// from a safe context (e.g. not an error/panic path or signal +// handler, preemption enabled, allocation allowed, write barriers +// allowed, etc), and that the exit function 'f' will be invoked under +// similar circumstances. That is to say, we are expecting that 'f' +// uses normal / high-level Go code as opposed to one of the more +// restricted dialects used for the trickier parts of the runtime.
+func addExitHook(f func(), runOnNonZeroExit bool) { + exitHooks.hooks = append(exitHooks.hooks, exitHook{f: f, runOnNonZeroExit: runOnNonZeroExit}) +} + +// exitHook stores a function to be run on program exit, registered +// by the utility runtime.addExitHook. +type exitHook struct { + f func() // func to run + runOnNonZeroExit bool // whether to run on non-zero exit code +} + +// exitHooks stores state related to hook functions registered to +// run when program execution terminates. +var exitHooks struct { + hooks []exitHook + runningExitHooks bool +} + +// runExitHooks runs any registered exit hook functions (funcs +// previously registered using runtime.addExitHook). Here 'exitCode' +// is the status code being passed to os.Exit, or zero if the program +// is terminating normally without calling os.Exit. +func runExitHooks(exitCode int) { + if exitHooks.runningExitHooks { + throw("internal error: exit hook invoked exit") + } + exitHooks.runningExitHooks = true + + runExitHook := func(f func()) (caughtPanic bool) { + defer func() { + if x := recover(); x != nil { + caughtPanic = true + } + }() + f() + return + } + + finishPageTrace() + for i := range exitHooks.hooks { + h := exitHooks.hooks[len(exitHooks.hooks)-i-1] + if exitCode != 0 && !h.runOnNonZeroExit { + continue + } + if caughtPanic := runExitHook(h.f); caughtPanic { + throw("internal error: exit hook invoked panic") + } + } + exitHooks.hooks = nil + exitHooks.runningExitHooks = false +} diff --git a/src/runtime/export_aix_test.go b/src/runtime/export_aix_test.go new file mode 100644 index 0000000..4845533 --- /dev/null +++ b/src/runtime/export_aix_test.go @@ -0,0 +1,7 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +var SetNonblock = setNonblock diff --git a/src/runtime/export_arm_test.go b/src/runtime/export_arm_test.go new file mode 100644 index 0000000..b8a89fc --- /dev/null +++ b/src/runtime/export_arm_test.go @@ -0,0 +1,9 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Export guts for testing. + +package runtime + +var Usplit = usplit diff --git a/src/runtime/export_darwin_test.go b/src/runtime/export_darwin_test.go new file mode 100644 index 0000000..4845533 --- /dev/null +++ b/src/runtime/export_darwin_test.go @@ -0,0 +1,7 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +var SetNonblock = setNonblock diff --git a/src/runtime/export_debug_amd64_test.go b/src/runtime/export_debug_amd64_test.go new file mode 100644 index 0000000..f9908cd --- /dev/null +++ b/src/runtime/export_debug_amd64_test.go @@ -0,0 +1,132 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 && linux + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +type sigContext struct { + savedRegs sigcontext + // sigcontext.fpstate is a pointer, so we need to save + // its value with a fpstate1 structure. + savedFP fpstate1 +} + +func sigctxtSetContextRegister(ctxt *sigctxt, x uint64) { + ctxt.regs().rdx = x +} + +func sigctxtAtTrapInstruction(ctxt *sigctxt) bool { + return *(*byte)(unsafe.Pointer(uintptr(ctxt.rip() - 1))) == 0xcc // INT 3 +} + +func sigctxtStatus(ctxt *sigctxt) uint64 { + return ctxt.r12() +} + +func (h *debugCallHandler) saveSigContext(ctxt *sigctxt) { + // Push current PC on the stack.
+ rsp := ctxt.rsp() - goarch.PtrSize + *(*uint64)(unsafe.Pointer(uintptr(rsp))) = ctxt.rip() + ctxt.set_rsp(rsp) + // Write the argument frame size. + *(*uintptr)(unsafe.Pointer(uintptr(rsp - 16))) = h.argSize + // Save current registers. + h.sigCtxt.savedRegs = *ctxt.regs() + h.sigCtxt.savedFP = *h.sigCtxt.savedRegs.fpstate + h.sigCtxt.savedRegs.fpstate = nil +} + +// case 0 +func (h *debugCallHandler) debugCallRun(ctxt *sigctxt) { + rsp := ctxt.rsp() + memmove(unsafe.Pointer(uintptr(rsp)), h.argp, h.argSize) + if h.regArgs != nil { + storeRegArgs(ctxt.regs(), h.regArgs) + } + // Push return PC. + rsp -= goarch.PtrSize + ctxt.set_rsp(rsp) + // The signal PC is the next PC of the trap instruction. + *(*uint64)(unsafe.Pointer(uintptr(rsp))) = ctxt.rip() + // Set PC to call and context register. + ctxt.set_rip(uint64(h.fv.fn)) + sigctxtSetContextRegister(ctxt, uint64(uintptr(unsafe.Pointer(h.fv)))) +} + +// case 1 +func (h *debugCallHandler) debugCallReturn(ctxt *sigctxt) { + rsp := ctxt.rsp() + memmove(h.argp, unsafe.Pointer(uintptr(rsp)), h.argSize) + if h.regArgs != nil { + loadRegArgs(h.regArgs, ctxt.regs()) + } +} + +// case 2 +func (h *debugCallHandler) debugCallPanicOut(ctxt *sigctxt) { + rsp := ctxt.rsp() + memmove(unsafe.Pointer(&h.panic), unsafe.Pointer(uintptr(rsp)), 2*goarch.PtrSize) +} + +// case 8 +func (h *debugCallHandler) debugCallUnsafe(ctxt *sigctxt) { + rsp := ctxt.rsp() + reason := *(*string)(unsafe.Pointer(uintptr(rsp))) + h.err = plainError(reason) +} + +// case 16 +func (h *debugCallHandler) restoreSigContext(ctxt *sigctxt) { + // Restore all registers except RIP and RSP. + rip, rsp := ctxt.rip(), ctxt.rsp() + fp := ctxt.regs().fpstate + *ctxt.regs() = h.sigCtxt.savedRegs + ctxt.regs().fpstate = fp + *fp = h.sigCtxt.savedFP + ctxt.set_rip(rip) + ctxt.set_rsp(rsp) +} + +// storeRegArgs sets up argument registers in the signal +// context state from an abi.RegArgs. +// +// Both src and dst must be non-nil. 
+func storeRegArgs(dst *sigcontext, src *abi.RegArgs) { + dst.rax = uint64(src.Ints[0]) + dst.rbx = uint64(src.Ints[1]) + dst.rcx = uint64(src.Ints[2]) + dst.rdi = uint64(src.Ints[3]) + dst.rsi = uint64(src.Ints[4]) + dst.r8 = uint64(src.Ints[5]) + dst.r9 = uint64(src.Ints[6]) + dst.r10 = uint64(src.Ints[7]) + dst.r11 = uint64(src.Ints[8]) + for i := range src.Floats { + dst.fpstate._xmm[i].element[0] = uint32(src.Floats[i] >> 0) + dst.fpstate._xmm[i].element[1] = uint32(src.Floats[i] >> 32) + } +} + +func loadRegArgs(dst *abi.RegArgs, src *sigcontext) { + dst.Ints[0] = uintptr(src.rax) + dst.Ints[1] = uintptr(src.rbx) + dst.Ints[2] = uintptr(src.rcx) + dst.Ints[3] = uintptr(src.rdi) + dst.Ints[4] = uintptr(src.rsi) + dst.Ints[5] = uintptr(src.r8) + dst.Ints[6] = uintptr(src.r9) + dst.Ints[7] = uintptr(src.r10) + dst.Ints[8] = uintptr(src.r11) + for i := range dst.Floats { + dst.Floats[i] = uint64(src.fpstate._xmm[i].element[0]) << 0 + dst.Floats[i] |= uint64(src.fpstate._xmm[i].element[1]) << 32 + } +} diff --git a/src/runtime/export_debug_arm64_test.go b/src/runtime/export_debug_arm64_test.go new file mode 100644 index 0000000..ee90241 --- /dev/null +++ b/src/runtime/export_debug_arm64_test.go @@ -0,0 +1,135 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build arm64 && linux + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +type sigContext struct { + savedRegs sigcontext +} + +func sigctxtSetContextRegister(ctxt *sigctxt, x uint64) { + ctxt.regs().regs[26] = x +} + +func sigctxtAtTrapInstruction(ctxt *sigctxt) bool { + return *(*uint32)(unsafe.Pointer(ctxt.sigpc())) == 0xd4200000 // BRK 0 +} + +func sigctxtStatus(ctxt *sigctxt) uint64 { + return ctxt.r20() +} + +func (h *debugCallHandler) saveSigContext(ctxt *sigctxt) { + sp := ctxt.sp() + sp -= 2 * goarch.PtrSize + ctxt.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = ctxt.lr() // save the current lr + ctxt.set_lr(ctxt.pc()) // set new lr to the current pc + // Write the argument frame size. + *(*uintptr)(unsafe.Pointer(uintptr(sp - 16))) = h.argSize + // Save current registers. + h.sigCtxt.savedRegs = *ctxt.regs() +} + +// case 0 +func (h *debugCallHandler) debugCallRun(ctxt *sigctxt) { + sp := ctxt.sp() + memmove(unsafe.Pointer(uintptr(sp)+8), h.argp, h.argSize) + if h.regArgs != nil { + storeRegArgs(ctxt.regs(), h.regArgs) + } + // Push return PC, which should be the signal PC+4, because + // the signal PC is the PC of the trap instruction itself. + ctxt.set_lr(ctxt.pc() + 4) + // Set PC to call and context register. 
+ ctxt.set_pc(uint64(h.fv.fn)) + sigctxtSetContextRegister(ctxt, uint64(uintptr(unsafe.Pointer(h.fv)))) +} + +// case 1 +func (h *debugCallHandler) debugCallReturn(ctxt *sigctxt) { + sp := ctxt.sp() + memmove(h.argp, unsafe.Pointer(uintptr(sp)+8), h.argSize) + if h.regArgs != nil { + loadRegArgs(h.regArgs, ctxt.regs()) + } + // Restore the old lr from *sp + olr := *(*uint64)(unsafe.Pointer(uintptr(sp))) + ctxt.set_lr(olr) + pc := ctxt.pc() + ctxt.set_pc(pc + 4) // step to next instruction +} + +// case 2 +func (h *debugCallHandler) debugCallPanicOut(ctxt *sigctxt) { + sp := ctxt.sp() + memmove(unsafe.Pointer(&h.panic), unsafe.Pointer(uintptr(sp)+8), 2*goarch.PtrSize) + ctxt.set_pc(ctxt.pc() + 4) +} + +// case 8 +func (h *debugCallHandler) debugCallUnsafe(ctxt *sigctxt) { + sp := ctxt.sp() + reason := *(*string)(unsafe.Pointer(uintptr(sp) + 8)) + h.err = plainError(reason) + ctxt.set_pc(ctxt.pc() + 4) +} + +// case 16 +func (h *debugCallHandler) restoreSigContext(ctxt *sigctxt) { + // Restore all registers except for pc and sp + pc, sp := ctxt.pc(), ctxt.sp() + *ctxt.regs() = h.sigCtxt.savedRegs + ctxt.set_pc(pc + 4) + ctxt.set_sp(sp) +} + +// storeRegArgs sets up argument registers in the signal +// context state from an abi.RegArgs. +// +// Both src and dst must be non-nil. +func storeRegArgs(dst *sigcontext, src *abi.RegArgs) { + for i, r := range src.Ints { + dst.regs[i] = uint64(r) + } + for i, r := range src.Floats { + *(fpRegAddr(dst, i)) = r + } +} + +func loadRegArgs(dst *abi.RegArgs, src *sigcontext) { + for i := range dst.Ints { + dst.Ints[i] = uintptr(src.regs[i]) + } + for i := range dst.Floats { + dst.Floats[i] = *(fpRegAddr(src, i)) + } +} + +// fpRegAddr returns the address of the ith fp-simd register in sigcontext. 
+func fpRegAddr(dst *sigcontext, i int) *uint64 { + /* FP-SIMD registers are saved in sigcontext.__reserved, which is organized in + the following C structs: + struct fpsimd_context { + struct _aarch64_ctx head; + __u32 fpsr; + __u32 fpcr; + __uint128_t vregs[32]; + }; + struct _aarch64_ctx { + __u32 magic; + __u32 size; + }; + So the offset of the ith FP_SIMD register is 16+i*128. + */ + return (*uint64)(unsafe.Pointer(&dst.__reserved[16+i*128])) +} diff --git a/src/runtime/export_debug_test.go b/src/runtime/export_debug_test.go new file mode 100644 index 0000000..2d8a133 --- /dev/null +++ b/src/runtime/export_debug_test.go @@ -0,0 +1,182 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (amd64 || arm64) && linux + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// InjectDebugCall injects a debugger call to fn into gp. regArgs must +// contain any arguments to fn that are passed in registers, according +// to the internal Go ABI. It may be nil if no arguments are passed in +// registers to fn. stackArgs must be a pointer to a valid call frame (including +// arguments and return space) for fn, or nil. tkill must be a function that +// will send SIGTRAP to thread ID tid. gp must be locked to its OS thread and +// running. +// +// On success, InjectDebugCall returns the panic value of fn or nil. +// If fn did not panic, its results will be available in stackArgs.
+func InjectDebugCall(gp *g, fn any, regArgs *abi.RegArgs, stackArgs any, tkill func(tid int) error, returnOnUnsafePoint bool) (any, error) { + if gp.lockedm == 0 { + return nil, plainError("goroutine not locked to thread") + } + + tid := int(gp.lockedm.ptr().procid) + if tid == 0 { + return nil, plainError("missing tid") + } + + f := efaceOf(&fn) + if f._type == nil || f._type.kind&kindMask != kindFunc { + return nil, plainError("fn must be a function") + } + fv := (*funcval)(f.data) + + a := efaceOf(&stackArgs) + if a._type != nil && a._type.kind&kindMask != kindPtr { + return nil, plainError("args must be a pointer or nil") + } + argp := a.data + var argSize uintptr + if argp != nil { + argSize = (*ptrtype)(unsafe.Pointer(a._type)).elem.size + } + + h := new(debugCallHandler) + h.gp = gp + // gp may not be running right now, but we can still get the M + // it will run on since it's locked. + h.mp = gp.lockedm.ptr() + h.fv, h.regArgs, h.argp, h.argSize = fv, regArgs, argp, argSize + h.handleF = h.handle // Avoid allocating closure during signal + + defer func() { testSigtrap = nil }() + for i := 0; ; i++ { + testSigtrap = h.inject + noteclear(&h.done) + h.err = "" + + if err := tkill(tid); err != nil { + return nil, err + } + // Wait for completion. + notetsleepg(&h.done, -1) + if h.err != "" { + switch h.err { + case "call not at safe point": + if returnOnUnsafePoint { + // This is for TestDebugCallUnsafePoint. + return nil, h.err + } + fallthrough + case "retry _Grunnable", "executing on Go runtime stack", "call from within the Go runtime": + // These are transient states. Try to get out of them. 
+ if i < 100 { + usleep(100) + Gosched() + continue + } + } + return nil, h.err + } + return h.panic, nil + } +} + +type debugCallHandler struct { + gp *g + mp *m + fv *funcval + regArgs *abi.RegArgs + argp unsafe.Pointer + argSize uintptr + panic any + + handleF func(info *siginfo, ctxt *sigctxt, gp2 *g) bool + + err plainError + done note + sigCtxt sigContext +} + +func (h *debugCallHandler) inject(info *siginfo, ctxt *sigctxt, gp2 *g) bool { + // TODO(49370): This code is riddled with write barriers, but called from + // a signal handler. Add the go:nowritebarrierrec annotation and restructure + // this to avoid write barriers. + + switch h.gp.atomicstatus.Load() { + case _Grunning: + if getg().m != h.mp { + println("trap on wrong M", getg().m, h.mp) + return false + } + // Save the signal context + h.saveSigContext(ctxt) + // Set PC to debugCallV2. + ctxt.setsigpc(uint64(abi.FuncPCABIInternal(debugCallV2))) + // Call injected. Switch to the debugCall protocol. + testSigtrap = h.handleF + case _Grunnable: + // Ask InjectDebugCall to pause for a bit and then try + // again to interrupt this goroutine. + h.err = plainError("retry _Grunnable") + notewakeup(&h.done) + default: + h.err = plainError("goroutine in unexpected state at call inject") + notewakeup(&h.done) + } + // Resume execution. + return true +} + +func (h *debugCallHandler) handle(info *siginfo, ctxt *sigctxt, gp2 *g) bool { + // TODO(49370): This code is riddled with write barriers, but called from + // a signal handler. Add the go:nowritebarrierrec annotation and restructure + // this to avoid write barriers. + + // Double-check m. 
+ if getg().m != h.mp { + println("trap on wrong M", getg().m, h.mp) + return false + } + f := findfunc(ctxt.sigpc()) + if !(hasPrefix(funcname(f), "runtime.debugCall") || hasPrefix(funcname(f), "debugCall")) { + println("trap in unknown function", funcname(f)) + return false + } + if !sigctxtAtTrapInstruction(ctxt) { + println("trap at non-INT3 instruction pc =", hex(ctxt.sigpc())) + return false + } + + switch status := sigctxtStatus(ctxt); status { + case 0: + // Frame is ready. Copy the arguments to the frame and to registers. + // Call the debug function. + h.debugCallRun(ctxt) + case 1: + // Function returned. Copy frame and result registers back out. + h.debugCallReturn(ctxt) + case 2: + // Function panicked. Copy panic out. + h.debugCallPanicOut(ctxt) + case 8: + // Call isn't safe. Get the reason. + h.debugCallUnsafe(ctxt) + // Don't wake h.done. We need to transition to status 16 first. + case 16: + h.restoreSigContext(ctxt) + // Done + notewakeup(&h.done) + default: + h.err = plainError("unexpected debugCallV2 status") + notewakeup(&h.done) + } + // Resume execution. + return true +} diff --git a/src/runtime/export_debuglog_test.go b/src/runtime/export_debuglog_test.go new file mode 100644 index 0000000..c9dfdcb --- /dev/null +++ b/src/runtime/export_debuglog_test.go @@ -0,0 +1,46 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Export debuglog guts for testing. 
+ +package runtime + +const DlogEnabled = dlogEnabled + +const DebugLogBytes = debugLogBytes + +const DebugLogStringLimit = debugLogStringLimit + +var Dlog = dlog + +func (l *dlogger) End() { l.end() } +func (l *dlogger) B(x bool) *dlogger { return l.b(x) } +func (l *dlogger) I(x int) *dlogger { return l.i(x) } +func (l *dlogger) I16(x int16) *dlogger { return l.i16(x) } +func (l *dlogger) U64(x uint64) *dlogger { return l.u64(x) } +func (l *dlogger) Hex(x uint64) *dlogger { return l.hex(x) } +func (l *dlogger) P(x any) *dlogger { return l.p(x) } +func (l *dlogger) S(x string) *dlogger { return l.s(x) } +func (l *dlogger) PC(x uintptr) *dlogger { return l.pc(x) } + +func DumpDebugLog() string { + gp := getg() + gp.writebuf = make([]byte, 0, 1<<20) + printDebugLog() + buf := gp.writebuf + gp.writebuf = nil + + return string(buf) +} + +func ResetDebugLog() { + stopTheWorld("ResetDebugLog") + for l := allDloggers; l != nil; l = l.allLink { + l.w.write = 0 + l.w.tick, l.w.nano = 0, 0 + l.w.r.begin, l.w.r.end = 0, 0 + l.w.r.tick, l.w.r.nano = 0, 0 + } + startTheWorld() +} diff --git a/src/runtime/export_linux_test.go b/src/runtime/export_linux_test.go new file mode 100644 index 0000000..a441c0e --- /dev/null +++ b/src/runtime/export_linux_test.go @@ -0,0 +1,22 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Export guts for testing. + +package runtime + +import ( + "runtime/internal/syscall" +) + +const SiginfoMaxSize = _si_max_size +const SigeventMaxSize = _sigev_max_size + +var Closeonexec = syscall.CloseOnExec +var NewOSProc0 = newosproc0 +var Mincore = mincore +var Add = add + +type Siginfo siginfo +type Sigevent sigevent diff --git a/src/runtime/export_mmap_test.go b/src/runtime/export_mmap_test.go new file mode 100644 index 0000000..f73fcbd --- /dev/null +++ b/src/runtime/export_mmap_test.go @@ -0,0 +1,21 @@ +// Copyright 2016 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +// Export guts for testing. + +package runtime + +var Mmap = mmap +var Munmap = munmap + +const ENOMEM = _ENOMEM +const MAP_ANON = _MAP_ANON +const MAP_PRIVATE = _MAP_PRIVATE +const MAP_FIXED = _MAP_FIXED + +func GetPhysPageSize() uintptr { + return physPageSize +} diff --git a/src/runtime/export_pipe2_test.go b/src/runtime/export_pipe2_test.go new file mode 100644 index 0000000..8d49009 --- /dev/null +++ b/src/runtime/export_pipe2_test.go @@ -0,0 +1,11 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build dragonfly || freebsd || linux || netbsd || openbsd || solaris + +package runtime + +func Pipe() (r, w int32, errno int32) { + return pipe2(0) +} diff --git a/src/runtime/export_pipe_test.go b/src/runtime/export_pipe_test.go new file mode 100644 index 0000000..0583039 --- /dev/null +++ b/src/runtime/export_pipe_test.go @@ -0,0 +1,9 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix || darwin + +package runtime + +var Pipe = pipe diff --git a/src/runtime/export_test.go b/src/runtime/export_test.go new file mode 100644 index 0000000..3d8d6d3 --- /dev/null +++ b/src/runtime/export_test.go @@ -0,0 +1,1726 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Export guts for testing. 
+ +package runtime + +import ( + "internal/goarch" + "internal/goos" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +var Fadd64 = fadd64 +var Fsub64 = fsub64 +var Fmul64 = fmul64 +var Fdiv64 = fdiv64 +var F64to32 = f64to32 +var F32to64 = f32to64 +var Fcmp64 = fcmp64 +var Fintto64 = fintto64 +var F64toint = f64toint + +var Entersyscall = entersyscall +var Exitsyscall = exitsyscall +var LockedOSThread = lockedOSThread +var Xadduintptr = atomic.Xadduintptr + +var Fastlog2 = fastlog2 + +var Atoi = atoi +var Atoi32 = atoi32 +var ParseByteCount = parseByteCount + +var Nanotime = nanotime +var NetpollBreak = netpollBreak +var Usleep = usleep + +var PhysPageSize = physPageSize +var PhysHugePageSize = physHugePageSize + +var NetpollGenericInit = netpollGenericInit + +var Memmove = memmove +var MemclrNoHeapPointers = memclrNoHeapPointers + +var LockPartialOrder = lockPartialOrder + +type LockRank lockRank + +func (l LockRank) String() string { + return lockRank(l).String() +} + +const PreemptMSupported = preemptMSupported + +type LFNode struct { + Next uint64 + Pushcnt uintptr +} + +func LFStackPush(head *uint64, node *LFNode) { + (*lfstack)(head).push((*lfnode)(unsafe.Pointer(node))) +} + +func LFStackPop(head *uint64) *LFNode { + return (*LFNode)(unsafe.Pointer((*lfstack)(head).pop())) +} +func LFNodeValidate(node *LFNode) { + lfnodeValidate((*lfnode)(unsafe.Pointer(node))) +} + +func Netpoll(delta int64) { + systemstack(func() { + netpoll(delta) + }) +} + +func GCMask(x any) (ret []byte) { + systemstack(func() { + ret = getgcmask(x) + }) + return +} + +func RunSchedLocalQueueTest() { + pp := new(p) + gs := make([]g, len(pp.runq)) + Escape(gs) // Ensure gs doesn't move, since we use guintptrs + for i := 0; i < len(pp.runq); i++ { + if g, _ := runqget(pp); g != nil { + throw("runq is not empty initially") + } + for j := 0; j < i; j++ { + runqput(pp, &gs[i], false) + } + for j := 0; j < i; j++ { + if g, _ := runqget(pp); g != &gs[i] { + print("bad 
element at iter ", i, "/", j, "\n") + throw("bad element") + } + } + if g, _ := runqget(pp); g != nil { + throw("runq is not empty afterwards") + } + } +} + +func RunSchedLocalQueueStealTest() { + p1 := new(p) + p2 := new(p) + gs := make([]g, len(p1.runq)) + Escape(gs) // Ensure gs doesn't move, since we use guintptrs + for i := 0; i < len(p1.runq); i++ { + for j := 0; j < i; j++ { + gs[j].sig = 0 + runqput(p1, &gs[j], false) + } + gp := runqsteal(p2, p1, true) + s := 0 + if gp != nil { + s++ + gp.sig++ + } + for { + gp, _ = runqget(p2) + if gp == nil { + break + } + s++ + gp.sig++ + } + for { + gp, _ = runqget(p1) + if gp == nil { + break + } + gp.sig++ + } + for j := 0; j < i; j++ { + if gs[j].sig != 1 { + print("bad element ", j, "(", gs[j].sig, ") at iter ", i, "\n") + throw("bad element") + } + } + if s != i/2 && s != i/2+1 { + print("bad steal ", s, ", want ", i/2, " or ", i/2+1, ", iter ", i, "\n") + throw("bad steal") + } + } +} + +func RunSchedLocalQueueEmptyTest(iters int) { + // Test that runq is not spuriously reported as empty. + // Runq emptiness affects scheduling decisions and spurious emptiness + // can lead to underutilization (both runnable Gs and idle Ps coexist + // for an arbitrarily long time).
+ done := make(chan bool, 1) + p := new(p) + gs := make([]g, 2) + Escape(gs) // Ensure gs doesn't move, since we use guintptrs + ready := new(uint32) + for i := 0; i < iters; i++ { + *ready = 0 + next0 := (i & 1) == 0 + next1 := (i & 2) == 0 + runqput(p, &gs[0], next0) + go func() { + for atomic.Xadd(ready, 1); atomic.Load(ready) != 2; { + } + if runqempty(p) { + println("next:", next0, next1) + throw("queue is empty") + } + done <- true + }() + for atomic.Xadd(ready, 1); atomic.Load(ready) != 2; { + } + runqput(p, &gs[1], next1) + runqget(p) + <-done + runqget(p) + } +} + +var ( + StringHash = stringHash + BytesHash = bytesHash + Int32Hash = int32Hash + Int64Hash = int64Hash + MemHash = memhash + MemHash32 = memhash32 + MemHash64 = memhash64 + EfaceHash = efaceHash + IfaceHash = ifaceHash +) + +var UseAeshash = &useAeshash + +func MemclrBytes(b []byte) { + s := (*slice)(unsafe.Pointer(&b)) + memclrNoHeapPointers(s.array, uintptr(s.len)) +} + +const HashLoad = hashLoad + +// entry point for testing +func GostringW(w []uint16) (s string) { + systemstack(func() { + s = gostringw(&w[0]) + }) + return +} + +var Open = open +var Close = closefd +var Read = read +var Write = write + +func Envs() []string { return envs } +func SetEnvs(e []string) { envs = e } + +// For benchmarking. 
+ +func BenchSetType(n int, x any) { + e := *efaceOf(&x) + t := e._type + var size uintptr + var p unsafe.Pointer + switch t.kind & kindMask { + case kindPtr: + t = (*ptrtype)(unsafe.Pointer(t)).elem + size = t.size + p = e.data + case kindSlice: + slice := *(*struct { + ptr unsafe.Pointer + len, cap uintptr + })(e.data) + t = (*slicetype)(unsafe.Pointer(t)).elem + size = t.size * slice.len + p = slice.ptr + } + allocSize := roundupsize(size) + systemstack(func() { + for i := 0; i < n; i++ { + heapBitsSetType(uintptr(p), allocSize, size, t) + } + }) +} + +const PtrSize = goarch.PtrSize + +var ForceGCPeriod = &forcegcperiod + +// SetTracebackEnv is like runtime/debug.SetTraceback, but it raises +// the "environment" traceback level, so later calls to +// debug.SetTraceback (e.g., from testing timeouts) can't lower it. +func SetTracebackEnv(level string) { + setTraceback(level) + traceback_env = traceback_cache +} + +var ReadUnaligned32 = readUnaligned32 +var ReadUnaligned64 = readUnaligned64 + +func CountPagesInUse() (pagesInUse, counted uintptr) { + stopTheWorld("CountPagesInUse") + + pagesInUse = uintptr(mheap_.pagesInUse.Load()) + + for _, s := range mheap_.allspans { + if s.state.get() == mSpanInUse { + counted += s.npages + } + } + + startTheWorld() + + return +} + +func Fastrand() uint32 { return fastrand() } +func Fastrand64() uint64 { return fastrand64() } +func Fastrandn(n uint32) uint32 { return fastrandn(n) } + +type ProfBuf profBuf + +func NewProfBuf(hdrsize, bufwords, tags int) *ProfBuf { + return (*ProfBuf)(newProfBuf(hdrsize, bufwords, tags)) +} + +func (p *ProfBuf) Write(tag *unsafe.Pointer, now int64, hdr []uint64, stk []uintptr) { + (*profBuf)(p).write(tag, now, hdr, stk) +} + +const ( + ProfBufBlocking = profBufBlocking + ProfBufNonBlocking = profBufNonBlocking +) + +func (p *ProfBuf) Read(mode profBufReadMode) ([]uint64, []unsafe.Pointer, bool) { + return (*profBuf)(p).read(profBufReadMode(mode)) +} + +func (p *ProfBuf) Close() { + 
(*profBuf)(p).close() +} + +func ReadMetricsSlow(memStats *MemStats, samplesp unsafe.Pointer, len, cap int) { + stopTheWorld("ReadMetricsSlow") + + // Initialize the metrics beforehand because this could + // allocate and skew the stats. + metricsLock() + initMetrics() + metricsUnlock() + + systemstack(func() { + // Read memstats first. It's going to flush + // the mcaches which readMetrics does not do, so + // going the other way around may result in + // inconsistent statistics. + readmemstats_m(memStats) + }) + + // Read metrics off the system stack. + // + // The only part of readMetrics that could allocate + // and skew the stats is initMetrics. + readMetrics(samplesp, len, cap) + + startTheWorld() +} + +var DoubleCheckReadMemStats = &doubleCheckReadMemStats + +// ReadMemStatsSlow returns both the runtime-computed MemStats and +// MemStats accumulated by scanning the heap. +func ReadMemStatsSlow() (base, slow MemStats) { + stopTheWorld("ReadMemStatsSlow") + + // Run on the system stack to avoid stack growth allocation. + systemstack(func() { + // Make sure stats don't change. + getg().m.mallocing++ + + readmemstats_m(&base) + + // Initialize slow from base and zero the fields we're + // recomputing. + slow = base + slow.Alloc = 0 + slow.TotalAlloc = 0 + slow.Mallocs = 0 + slow.Frees = 0 + slow.HeapReleased = 0 + var bySize [_NumSizeClasses]struct { + Mallocs, Frees uint64 + } + + // Add up current allocations in spans. + for _, s := range mheap_.allspans { + if s.state.get() != mSpanInUse { + continue + } + if s.isUnusedUserArenaChunk() { + continue + } + if sizeclass := s.spanclass.sizeclass(); sizeclass == 0 { + slow.Mallocs++ + slow.Alloc += uint64(s.elemsize) + } else { + slow.Mallocs += uint64(s.allocCount) + slow.Alloc += uint64(s.allocCount) * uint64(s.elemsize) + bySize[sizeclass].Mallocs += uint64(s.allocCount) + } + } + + // Add in frees by just reading the stats for those directly. 
+ var m heapStatsDelta + memstats.heapStats.unsafeRead(&m) + + // Collect per-sizeclass free stats. + var smallFree uint64 + for i := 0; i < _NumSizeClasses; i++ { + slow.Frees += uint64(m.smallFreeCount[i]) + bySize[i].Frees += uint64(m.smallFreeCount[i]) + bySize[i].Mallocs += uint64(m.smallFreeCount[i]) + smallFree += uint64(m.smallFreeCount[i]) * uint64(class_to_size[i]) + } + slow.Frees += uint64(m.tinyAllocCount) + uint64(m.largeFreeCount) + slow.Mallocs += slow.Frees + + slow.TotalAlloc = slow.Alloc + uint64(m.largeFree) + smallFree + + for i := range slow.BySize { + slow.BySize[i].Mallocs = bySize[i].Mallocs + slow.BySize[i].Frees = bySize[i].Frees + } + + for i := mheap_.pages.start; i < mheap_.pages.end; i++ { + chunk := mheap_.pages.tryChunkOf(i) + if chunk == nil { + continue + } + pg := chunk.scavenged.popcntRange(0, pallocChunkPages) + slow.HeapReleased += uint64(pg) * pageSize + } + for _, p := range allp { + pg := sys.OnesCount64(p.pcache.scav) + slow.HeapReleased += uint64(pg) * pageSize + } + + getg().m.mallocing-- + }) + + startTheWorld() + return +} + +// BlockOnSystemStack switches to the system stack, prints "x\n" to +// stderr, and blocks in a stack containing +// "runtime.blockOnSystemStackInternal". 
+func BlockOnSystemStack() { + systemstack(blockOnSystemStackInternal) +} + +func blockOnSystemStackInternal() { + print("x\n") + lock(&deadlock) + lock(&deadlock) +} + +type RWMutex struct { + rw rwmutex +} + +func (rw *RWMutex) Init() { + rw.rw.init(lockRankTestR, lockRankTestRInternal, lockRankTestW) +} + +func (rw *RWMutex) RLock() { + rw.rw.rlock() +} + +func (rw *RWMutex) RUnlock() { + rw.rw.runlock() +} + +func (rw *RWMutex) Lock() { + rw.rw.lock() +} + +func (rw *RWMutex) Unlock() { + rw.rw.unlock() +} + +const RuntimeHmapSize = unsafe.Sizeof(hmap{}) + +func MapBucketsCount(m map[int]int) int { + h := *(**hmap)(unsafe.Pointer(&m)) + return 1 << h.B +} + +func MapBucketsPointerIsNil(m map[int]int) bool { + h := *(**hmap)(unsafe.Pointer(&m)) + return h.buckets == nil +} + +func LockOSCounts() (external, internal uint32) { + gp := getg() + if gp.m.lockedExt+gp.m.lockedInt == 0 { + if gp.lockedm != 0 { + panic("lockedm on non-locked goroutine") + } + } else { + if gp.lockedm == 0 { + panic("nil lockedm on locked goroutine") + } + } + return gp.m.lockedExt, gp.m.lockedInt +} + +//go:noinline +func TracebackSystemstack(stk []uintptr, i int) int { + if i == 0 { + pc, sp := getcallerpc(), getcallersp() + return gentraceback(pc, sp, 0, getg(), 0, &stk[0], len(stk), nil, nil, _TraceJumpStack) + } + n := 0 + systemstack(func() { + n = TracebackSystemstack(stk, i-1) + }) + return n +} + +func KeepNArenaHints(n int) { + hint := mheap_.arenaHints + for i := 1; i < n; i++ { + hint = hint.next + if hint == nil { + return + } + } + hint.next = nil +} + +// MapNextArenaHint reserves a page at the next arena growth hint, +// preventing the arena from growing there, and returns the range of +// addresses that are no longer viable. +// +// This may fail to reserve memory. If it fails, it still returns the +// address range it attempted to reserve. 
+func MapNextArenaHint() (start, end uintptr, ok bool) { + hint := mheap_.arenaHints + addr := hint.addr + if hint.down { + start, end = addr-heapArenaBytes, addr + addr -= physPageSize + } else { + start, end = addr, addr+heapArenaBytes + } + got := sysReserve(unsafe.Pointer(addr), physPageSize) + ok = (addr == uintptr(got)) + if !ok { + // We were unable to get the requested reservation. + // Release what we did get and fail. + sysFreeOS(got, physPageSize) + } + return +} + +func GetNextArenaHint() uintptr { + return mheap_.arenaHints.addr +} + +type G = g + +type Sudog = sudog + +func Getg() *G { + return getg() +} + +func GIsWaitingOnMutex(gp *G) bool { + return readgstatus(gp) == _Gwaiting && gp.waitreason.isMutexWait() +} + +var CasGStatusAlwaysTrack = &casgstatusAlwaysTrack + +//go:noinline +func PanicForTesting(b []byte, i int) byte { + return unexportedPanicForTesting(b, i) +} + +//go:noinline +func unexportedPanicForTesting(b []byte, i int) byte { + return b[i] +} + +func G0StackOverflow() { + systemstack(func() { + stackOverflow(nil) + }) +} + +func stackOverflow(x *byte) { + var buf [256]byte + stackOverflow(&buf[0]) +} + +func MapTombstoneCheck(m map[int]int) { + // Make sure emptyOne and emptyRest are distributed correctly. + // We should have a series of filled and emptyOne cells, followed by + // a series of emptyRest cells. 
+ h := *(**hmap)(unsafe.Pointer(&m)) + i := any(m) + t := *(**maptype)(unsafe.Pointer(&i)) + + for x := 0; x < 1<<h.B; x++ { + b0 := (*bmap)(add(h.buckets, uintptr(x)*uintptr(t.bucketsize))) + n := 0 + for b := b0; b != nil; b = b.overflow(t) { + for i := 0; i < bucketCnt; i++ { + if b.tophash[i] != emptyRest { + n++ + } + } + } + k := 0 + for b := b0; b != nil; b = b.overflow(t) { + for i := 0; i < bucketCnt; i++ { + if k < n && b.tophash[i] == emptyRest { + panic("early emptyRest") + } + if k >= n && b.tophash[i] != emptyRest { + panic("late non-emptyRest") + } + if k == n-1 && b.tophash[i] == emptyOne { + panic("last non-emptyRest entry is emptyOne") + } + k++ + } + } + } +} + +func RunGetgThreadSwitchTest() { + // Test that getg works correctly with thread switch. + // With gccgo, if we generate getg inlined, the backend + // may cache the address of the TLS variable, which + // will become invalid after a thread switch. This test + // checks that the bad caching doesn't happen. + + ch := make(chan int) + go func(ch chan int) { + ch <- 5 + LockOSThread() + }(ch) + + g1 := getg() + + // Block on a receive. This is likely to get us a thread + // switch. If we yield to the sender goroutine, it will + // lock the thread, forcing us to resume on a different + // thread. + <-ch + + g2 := getg() + if g1 != g2 { + panic("g1 != g2") + } + + // Also test getg after some control flow, as the + // backend is sensitive to control flow. + g3 := getg() + if g1 != g3 { + panic("g1 != g3") + } +} + +const ( + PageSize = pageSize + PallocChunkPages = pallocChunkPages + PageAlloc64Bit = pageAlloc64Bit + PallocSumBytes = pallocSumBytes +) + +// Expose pallocSum for testing. 
+type PallocSum pallocSum + +func PackPallocSum(start, max, end uint) PallocSum { return PallocSum(packPallocSum(start, max, end)) } +func (m PallocSum) Start() uint { return pallocSum(m).start() } +func (m PallocSum) Max() uint { return pallocSum(m).max() } +func (m PallocSum) End() uint { return pallocSum(m).end() } + +// Expose pallocBits for testing. +type PallocBits pallocBits + +func (b *PallocBits) Find(npages uintptr, searchIdx uint) (uint, uint) { + return (*pallocBits)(b).find(npages, searchIdx) +} +func (b *PallocBits) AllocRange(i, n uint) { (*pallocBits)(b).allocRange(i, n) } +func (b *PallocBits) Free(i, n uint) { (*pallocBits)(b).free(i, n) } +func (b *PallocBits) Summarize() PallocSum { return PallocSum((*pallocBits)(b).summarize()) } +func (b *PallocBits) PopcntRange(i, n uint) uint { return (*pageBits)(b).popcntRange(i, n) } + +// SummarizeSlow is a slow but more obviously correct implementation +// of (*pallocBits).summarize. Used for testing. +func SummarizeSlow(b *PallocBits) PallocSum { + var start, max, end uint + + const N = uint(len(b)) * 64 + for start < N && (*pageBits)(b).get(start) == 0 { + start++ + } + for end < N && (*pageBits)(b).get(N-end-1) == 0 { + end++ + } + run := uint(0) + for i := uint(0); i < N; i++ { + if (*pageBits)(b).get(i) == 0 { + run++ + } else { + run = 0 + } + if run > max { + max = run + } + } + return PackPallocSum(start, max, end) +} + +// Expose non-trivial helpers for testing. +func FindBitRange64(c uint64, n uint) uint { return findBitRange64(c, n) } + +// Given two PallocBits, returns a set of bit ranges where +// they differ. 
+func DiffPallocBits(a, b *PallocBits) []BitRange { + ba := (*pageBits)(a) + bb := (*pageBits)(b) + + var d []BitRange + base, size := uint(0), uint(0) + for i := uint(0); i < uint(len(ba))*64; i++ { + if ba.get(i) != bb.get(i) { + if size == 0 { + base = i + } + size++ + } else { + if size != 0 { + d = append(d, BitRange{base, size}) + } + size = 0 + } + } + if size != 0 { + d = append(d, BitRange{base, size}) + } + return d +} + +// StringifyPallocBits gets the bits in the bit range r from b, +// and returns a string containing the bits as ASCII 0 and 1 +// characters. +func StringifyPallocBits(b *PallocBits, r BitRange) string { + str := "" + for j := r.I; j < r.I+r.N; j++ { + if (*pageBits)(b).get(j) != 0 { + str += "1" + } else { + str += "0" + } + } + return str +} + +// Expose pallocData for testing. +type PallocData pallocData + +func (d *PallocData) FindScavengeCandidate(searchIdx uint, min, max uintptr) (uint, uint) { + return (*pallocData)(d).findScavengeCandidate(searchIdx, min, max) +} +func (d *PallocData) AllocRange(i, n uint) { (*pallocData)(d).allocRange(i, n) } +func (d *PallocData) ScavengedSetRange(i, n uint) { + (*pallocData)(d).scavenged.setRange(i, n) +} +func (d *PallocData) PallocBits() *PallocBits { + return (*PallocBits)(&(*pallocData)(d).pallocBits) +} +func (d *PallocData) Scavenged() *PallocBits { + return (*PallocBits)(&(*pallocData)(d).scavenged) +} + +// Expose fillAligned for testing. +func FillAligned(x uint64, m uint) uint64 { return fillAligned(x, m) } + +// Expose pageCache for testing. 
+type PageCache pageCache + +const PageCachePages = pageCachePages + +func NewPageCache(base uintptr, cache, scav uint64) PageCache { + return PageCache(pageCache{base: base, cache: cache, scav: scav}) +} +func (c *PageCache) Empty() bool { return (*pageCache)(c).empty() } +func (c *PageCache) Base() uintptr { return (*pageCache)(c).base } +func (c *PageCache) Cache() uint64 { return (*pageCache)(c).cache } +func (c *PageCache) Scav() uint64 { return (*pageCache)(c).scav } +func (c *PageCache) Alloc(npages uintptr) (uintptr, uintptr) { + return (*pageCache)(c).alloc(npages) +} +func (c *PageCache) Flush(s *PageAlloc) { + cp := (*pageCache)(c) + sp := (*pageAlloc)(s) + + systemstack(func() { + // None of the tests need any higher-level locking, so we just + // take the lock internally. + lock(sp.mheapLock) + cp.flush(sp) + unlock(sp.mheapLock) + }) +} + +// Expose chunk index type. +type ChunkIdx chunkIdx + +// Expose pageAlloc for testing. Note that because pageAlloc is +// not in the heap, so is PageAlloc. +type PageAlloc pageAlloc + +func (p *PageAlloc) Alloc(npages uintptr) (uintptr, uintptr) { + pp := (*pageAlloc)(p) + + var addr, scav uintptr + systemstack(func() { + // None of the tests need any higher-level locking, so we just + // take the lock internally. + lock(pp.mheapLock) + addr, scav = pp.alloc(npages) + unlock(pp.mheapLock) + }) + return addr, scav +} +func (p *PageAlloc) AllocToCache() PageCache { + pp := (*pageAlloc)(p) + + var c PageCache + systemstack(func() { + // None of the tests need any higher-level locking, so we just + // take the lock internally. + lock(pp.mheapLock) + c = PageCache(pp.allocToCache()) + unlock(pp.mheapLock) + }) + return c +} +func (p *PageAlloc) Free(base, npages uintptr) { + pp := (*pageAlloc)(p) + + systemstack(func() { + // None of the tests need any higher-level locking, so we just + // take the lock internally. 
+ lock(pp.mheapLock) + pp.free(base, npages, true) + unlock(pp.mheapLock) + }) +} +func (p *PageAlloc) Bounds() (ChunkIdx, ChunkIdx) { + return ChunkIdx((*pageAlloc)(p).start), ChunkIdx((*pageAlloc)(p).end) +} +func (p *PageAlloc) Scavenge(nbytes uintptr) (r uintptr) { + pp := (*pageAlloc)(p) + systemstack(func() { + r = pp.scavenge(nbytes, nil) + }) + return +} +func (p *PageAlloc) InUse() []AddrRange { + ranges := make([]AddrRange, 0, len(p.inUse.ranges)) + for _, r := range p.inUse.ranges { + ranges = append(ranges, AddrRange{r}) + } + return ranges +} + +// Returns nil if the PallocData's L2 is missing. +func (p *PageAlloc) PallocData(i ChunkIdx) *PallocData { + ci := chunkIdx(i) + return (*PallocData)((*pageAlloc)(p).tryChunkOf(ci)) +} + +// AddrRange is a wrapper around addrRange for testing. +type AddrRange struct { + addrRange +} + +// MakeAddrRange creates a new address range. +func MakeAddrRange(base, limit uintptr) AddrRange { + return AddrRange{makeAddrRange(base, limit)} +} + +// Base returns the virtual base address of the address range. +func (a AddrRange) Base() uintptr { + return a.addrRange.base.addr() +} + +// Limit returns the virtual address of the limit of the address range. +func (a AddrRange) Limit() uintptr { + return a.addrRange.limit.addr() +} + +// Equals returns true if the two address ranges are exactly equal. +func (a AddrRange) Equals(b AddrRange) bool { + return a == b +} + +// Size returns the size in bytes of the address range. +func (a AddrRange) Size() uintptr { + return a.addrRange.size() +} + +// testSysStat is the sysStat passed to test versions of various +// runtime structures. We do actually have to keep track of this +// because otherwise memstats.mappedReady won't actually line up +// with other stats in the runtime during tests. +var testSysStat = &memstats.other_sys + +// AddrRanges is a wrapper around addrRanges for testing. 
+type AddrRanges struct { + addrRanges + mutable bool +} + +// NewAddrRanges creates a new empty addrRanges. +// +// Note that this initializes addrRanges just like in the +// runtime, so its memory is persistentalloc'd. Call this +// function sparingly since the memory it allocates is +// leaked. +// +// This AddrRanges is mutable, so we can test methods like +// Add. +func NewAddrRanges() AddrRanges { + r := addrRanges{} + r.init(testSysStat) + return AddrRanges{r, true} +} + +// MakeAddrRanges creates a new addrRanges populated with +// the ranges in a. +// +// The returned AddrRanges is immutable, so methods like +// Add will fail. +func MakeAddrRanges(a ...AddrRange) AddrRanges { + // Methods that manipulate the backing store of addrRanges.ranges should + // not be used on the result from this function (e.g. add) since they may + // trigger reallocation. That would normally be fine, except the new + // backing store won't come from the heap, but from persistentalloc, so + // we'll leak some memory implicitly. + ranges := make([]addrRange, 0, len(a)) + total := uintptr(0) + for _, r := range a { + ranges = append(ranges, r.addrRange) + total += r.Size() + } + return AddrRanges{addrRanges{ + ranges: ranges, + totalBytes: total, + sysStat: testSysStat, + }, false} +} + +// Ranges returns a copy of the ranges described by the +// addrRanges. +func (a *AddrRanges) Ranges() []AddrRange { + result := make([]AddrRange, 0, len(a.addrRanges.ranges)) + for _, r := range a.addrRanges.ranges { + result = append(result, AddrRange{r}) + } + return result +} + +// FindSucc returns the successor to base. See addrRanges.findSucc +// for more details. +func (a *AddrRanges) FindSucc(base uintptr) int { + return a.findSucc(base) +} + +// Add adds a new AddrRange to the AddrRanges. +// +// The AddrRange must be mutable (i.e. created by NewAddrRanges), +// otherwise this method will throw. 
+func (a *AddrRanges) Add(r AddrRange) { + if !a.mutable { + throw("attempt to mutate immutable AddrRanges") + } + a.add(r.addrRange) +} + +// TotalBytes returns the totalBytes field of the addrRanges. +func (a *AddrRanges) TotalBytes() uintptr { + return a.addrRanges.totalBytes +} + +// BitRange represents a range over a bitmap. +type BitRange struct { + I, N uint // bit index and length in bits +} + +// NewPageAlloc creates a new page allocator for testing and +// initializes it with the scav and chunks maps. Each key in these maps +// represents a chunk index and each value is a series of bit ranges to +// set within each bitmap's chunk. +// +// The initialization of the pageAlloc preserves the invariant that if a +// scavenged bit is set the alloc bit is necessarily unset, so some +// of the bits described by scav may be cleared in the final bitmap if +// ranges in chunks overlap with them. +// +// scav is optional, and if nil, the scavenged bitmap will be cleared +// (as opposed to all 1s, which it usually is). Furthermore, every +// chunk index in scav must appear in chunks; ones that do not are +// ignored. +func NewPageAlloc(chunks, scav map[ChunkIdx][]BitRange) *PageAlloc { + p := new(pageAlloc) + + // We've got an entry, so initialize the pageAlloc. + p.init(new(mutex), testSysStat) + lockInit(p.mheapLock, lockRankMheap) + p.test = true + for i, init := range chunks { + addr := chunkBase(chunkIdx(i)) + + // Mark the chunk's existence in the pageAlloc. + systemstack(func() { + lock(p.mheapLock) + p.grow(addr, pallocChunkBytes) + unlock(p.mheapLock) + }) + + // Initialize the bitmap and update pageAlloc metadata. + chunk := p.chunkOf(chunkIndex(addr)) + + // Clear all the scavenged bits which grow set. + chunk.scavenged.clearRange(0, pallocChunkPages) + + // Apply scavenge state if applicable. + if scav != nil { + if scvg, ok := scav[i]; ok { + for _, s := range scvg { + // Ignore the case of s.N == 0. 
setRange doesn't handle + // it and it's a no-op anyway. + if s.N != 0 { + chunk.scavenged.setRange(s.I, s.N) + } + } + } + } + + // Apply alloc state. + for _, s := range init { + // Ignore the case of s.N == 0. allocRange doesn't handle + // it and it's a no-op anyway. + if s.N != 0 { + chunk.allocRange(s.I, s.N) + } + } + + // Make sure the scavenge index is updated. + // + // This is an inefficient way to do it, but it's also the simplest way. + minPages := physPageSize / pageSize + if minPages < 1 { + minPages = 1 + } + _, npages := chunk.findScavengeCandidate(pallocChunkPages-1, minPages, minPages) + if npages != 0 { + p.scav.index.mark(addr, addr+pallocChunkBytes) + } + + // Update heap metadata for the allocRange calls above. + systemstack(func() { + lock(p.mheapLock) + p.update(addr, pallocChunkPages, false, false) + unlock(p.mheapLock) + }) + } + + return (*PageAlloc)(p) +} + +// FreePageAlloc releases hard OS resources owned by the pageAlloc. Once this +// is called the pageAlloc may no longer be used. The object itself will be +// collected by the garbage collector once it is no longer live. +func FreePageAlloc(pp *PageAlloc) { + p := (*pageAlloc)(pp) + + // Free all the mapped space for the summary levels. + if pageAlloc64Bit != 0 { + for l := 0; l < summaryLevels; l++ { + sysFreeOS(unsafe.Pointer(&p.summary[l][0]), uintptr(cap(p.summary[l]))*pallocSumBytes) + } + // Only necessary on 64-bit. This is a global on 32-bit. + sysFreeOS(unsafe.Pointer(&p.scav.index.chunks[0]), uintptr(cap(p.scav.index.chunks))) + } else { + resSize := uintptr(0) + for _, s := range p.summary { + resSize += uintptr(cap(s)) * pallocSumBytes + } + sysFreeOS(unsafe.Pointer(&p.summary[0][0]), alignUp(resSize, physPageSize)) + } + + // Subtract back out whatever we mapped for the summaries. 
+ // sysUsed adds to p.sysStat and memstats.mappedReady no matter what + // (and in anger should actually be accounted for), and there's no other + // way to figure out how much we actually mapped. + gcController.mappedReady.Add(-int64(p.summaryMappedReady)) + testSysStat.add(-int64(p.summaryMappedReady)) + + // Free the mapped space for chunks. + for i := range p.chunks { + if x := p.chunks[i]; x != nil { + p.chunks[i] = nil + // This memory comes from sysAlloc and will always be page-aligned. + sysFree(unsafe.Pointer(x), unsafe.Sizeof(*p.chunks[0]), testSysStat) + } + } +} + +// BaseChunkIdx is a convenient chunkIdx value which works on both +// 64 bit and 32 bit platforms, allowing the tests to share code +// between the two. +// +// This should not be higher than 0x100*pallocChunkBytes to support +// mips and mipsle, which only have 31-bit address spaces. +var BaseChunkIdx = func() ChunkIdx { + var prefix uintptr + if pageAlloc64Bit != 0 { + prefix = 0xc000 + } else { + prefix = 0x100 + } + baseAddr := prefix * pallocChunkBytes + if goos.IsAix != 0 { + baseAddr += arenaBaseOffset + } + return ChunkIdx(chunkIndex(baseAddr)) +}() + +// PageBase returns an address given a chunk index and a page index +// relative to that chunk. +func PageBase(c ChunkIdx, pageIdx uint) uintptr { + return chunkBase(chunkIdx(c)) + uintptr(pageIdx)*pageSize +} + +type BitsMismatch struct { + Base uintptr + Got, Want uint64 +} + +func CheckScavengedBitsCleared(mismatches []BitsMismatch) (n int, ok bool) { + ok = true + + // Run on the system stack to avoid stack growth allocation. + systemstack(func() { + getg().m.mallocing++ + + // Lock so that we can safely access the bitmap. 
+ lock(&mheap_.lock) + chunkLoop: + for i := mheap_.pages.start; i < mheap_.pages.end; i++ { + chunk := mheap_.pages.tryChunkOf(i) + if chunk == nil { + continue + } + for j := 0; j < pallocChunkPages/64; j++ { + // Run over each 64-bit bitmap section and ensure + // scavenged is being cleared properly on allocation. + // If a used bit and scavenged bit are both set, that's + // an error, and could indicate a larger problem, or + // an accounting problem. + want := chunk.scavenged[j] &^ chunk.pallocBits[j] + got := chunk.scavenged[j] + if want != got { + ok = false + if n >= len(mismatches) { + break chunkLoop + } + mismatches[n] = BitsMismatch{ + Base: chunkBase(i) + uintptr(j)*64*pageSize, + Got: got, + Want: want, + } + n++ + } + } + } + unlock(&mheap_.lock) + + getg().m.mallocing-- + }) + return +} + +func PageCachePagesLeaked() (leaked uintptr) { + stopTheWorld("PageCachePagesLeaked") + + // Walk over destroyed Ps and look for unflushed caches. + deadp := allp[len(allp):cap(allp)] + for _, p := range deadp { + // Since we're going past len(allp) we may see nil Ps. + // Just ignore them. + if p != nil { + leaked += uintptr(sys.OnesCount64(p.pcache.cache)) + } + } + + startTheWorld() + return +} + +var Semacquire = semacquire +var Semrelease1 = semrelease1 + +func SemNwait(addr *uint32) uint32 { + root := semtable.rootFor(addr) + return root.nwait.Load() +} + +const SemTableSize = semTabSize + +// SemTable is a wrapper around semTable exported for testing. +type SemTable struct { + semTable +} + +// Enqueue simulates enqueuing a waiter for a semaphore (or lock) at addr. +func (t *SemTable) Enqueue(addr *uint32) { + s := acquireSudog() + s.releasetime = 0 + s.acquiretime = 0 + s.ticket = 0 + t.semTable.rootFor(addr).queue(addr, s, false) +} + +// Dequeue simulates dequeuing a waiter for a semaphore (or lock) at addr. +// +// Returns true if there actually was a waiter to be dequeued. 
+func (t *SemTable) Dequeue(addr *uint32) bool { + s, _ := t.semTable.rootFor(addr).dequeue(addr) + if s != nil { + releaseSudog(s) + return true + } + return false +} + +// mspan wrapper for testing. +type MSpan mspan + +// Allocate an mspan for testing. +func AllocMSpan() *MSpan { + var s *mspan + systemstack(func() { + lock(&mheap_.lock) + s = (*mspan)(mheap_.spanalloc.alloc()) + unlock(&mheap_.lock) + }) + return (*MSpan)(s) +} + +// Free an allocated mspan. +func FreeMSpan(s *MSpan) { + systemstack(func() { + lock(&mheap_.lock) + mheap_.spanalloc.free(unsafe.Pointer(s)) + unlock(&mheap_.lock) + }) +} + +func MSpanCountAlloc(ms *MSpan, bits []byte) int { + s := (*mspan)(ms) + s.nelems = uintptr(len(bits) * 8) + s.gcmarkBits = (*gcBits)(unsafe.Pointer(&bits[0])) + result := s.countAlloc() + s.gcmarkBits = nil + return result +} + +const ( + TimeHistSubBucketBits = timeHistSubBucketBits + TimeHistNumSubBuckets = timeHistNumSubBuckets + TimeHistNumBuckets = timeHistNumBuckets + TimeHistMinBucketBits = timeHistMinBucketBits + TimeHistMaxBucketBits = timeHistMaxBucketBits +) + +type TimeHistogram timeHistogram + +// Count returns the count for the given bucket, subBucket indices. +// Returns true if the bucket was valid, otherwise returns the count +// for the overflow bucket if bucket > 0 or the underflow bucket if +// bucket < 0, and false. 
+func (th *TimeHistogram) Count(bucket, subBucket int) (uint64, bool) { + t := (*timeHistogram)(th) + if bucket < 0 { + return t.underflow.Load(), false + } + i := bucket*TimeHistNumSubBuckets + subBucket + if i >= len(t.counts) { + return t.overflow.Load(), false + } + return t.counts[i].Load(), true +} + +func (th *TimeHistogram) Record(duration int64) { + (*timeHistogram)(th).record(duration) +} + +var TimeHistogramMetricsBuckets = timeHistogramMetricsBuckets + +func SetIntArgRegs(a int) int { + lock(&finlock) + old := intArgRegs + if a >= 0 { + intArgRegs = a + } + unlock(&finlock) + return old +} + +func FinalizerGAsleep() bool { + return fingStatus.Load()&fingWait != 0 +} + +// For GCTestMoveStackOnNextCall, it's important not to introduce an +// extra layer of call, since then there's a return before the "real" +// next call. +var GCTestMoveStackOnNextCall = gcTestMoveStackOnNextCall + +// For GCTestIsReachable, it's important that we do this as a call so +// escape analysis can see through it. +func GCTestIsReachable(ptrs ...unsafe.Pointer) (mask uint64) { + return gcTestIsReachable(ptrs...) +} + +// For GCTestPointerClass, it's important that we do this as a call so +// escape analysis can see through it. +// +// This is nosplit because gcTestPointerClass is. +// +//go:nosplit +func GCTestPointerClass(p unsafe.Pointer) string { + return gcTestPointerClass(p) +} + +const Raceenabled = raceenabled + +const ( + GCBackgroundUtilization = gcBackgroundUtilization + GCGoalUtilization = gcGoalUtilization + DefaultHeapMinimum = defaultHeapMinimum + MemoryLimitHeapGoalHeadroom = memoryLimitHeapGoalHeadroom +) + +type GCController struct { + gcControllerState +} + +func NewGCController(gcPercent int, memoryLimit int64) *GCController { + // Force the controller to escape. We're going to + // do 64-bit atomics on it, and if it gets stack-allocated + // on a 32-bit architecture, it may get allocated unaligned + // space. 
+ g := Escape(new(GCController)) + g.gcControllerState.test = true // Mark it as a test copy. + g.init(int32(gcPercent), memoryLimit) + return g +} + +func (c *GCController) StartCycle(stackSize, globalsSize uint64, scannableFrac float64, gomaxprocs int) { + trigger, _ := c.trigger() + if c.heapMarked > trigger { + trigger = c.heapMarked + } + c.maxStackScan.Store(stackSize) + c.globalsScan.Store(globalsSize) + c.heapLive.Store(trigger) + c.heapScan.Add(int64(float64(trigger-c.heapMarked) * scannableFrac)) + c.startCycle(0, gomaxprocs, gcTrigger{kind: gcTriggerHeap}) +} + +func (c *GCController) AssistWorkPerByte() float64 { + return c.assistWorkPerByte.Load() +} + +func (c *GCController) HeapGoal() uint64 { + return c.heapGoal() +} + +func (c *GCController) HeapLive() uint64 { + return c.heapLive.Load() +} + +func (c *GCController) HeapMarked() uint64 { + return c.heapMarked +} + +func (c *GCController) Triggered() uint64 { + return c.triggered +} + +type GCControllerReviseDelta struct { + HeapLive int64 + HeapScan int64 + HeapScanWork int64 + StackScanWork int64 + GlobalsScanWork int64 +} + +func (c *GCController) Revise(d GCControllerReviseDelta) { + c.heapLive.Add(d.HeapLive) + c.heapScan.Add(d.HeapScan) + c.heapScanWork.Add(d.HeapScanWork) + c.stackScanWork.Add(d.StackScanWork) + c.globalsScanWork.Add(d.GlobalsScanWork) + c.revise() +} + +func (c *GCController) EndCycle(bytesMarked uint64, assistTime, elapsed int64, gomaxprocs int) { + c.assistTime.Store(assistTime) + c.endCycle(elapsed, gomaxprocs, false) + c.resetLive(bytesMarked) + c.commit(false) +} + +func (c *GCController) AddIdleMarkWorker() bool { + return c.addIdleMarkWorker() +} + +func (c *GCController) NeedIdleMarkWorker() bool { + return c.needIdleMarkWorker() +} + +func (c *GCController) RemoveIdleMarkWorker() { + c.removeIdleMarkWorker() +} + +func (c *GCController) SetMaxIdleMarkWorkers(max int32) { + c.setMaxIdleMarkWorkers(max) +} + +var alwaysFalse bool +var escapeSink any + +func Escape[T 
any](x T) T { + if alwaysFalse { + escapeSink = x + } + return x +} + +// Acquirem blocks preemption. +func Acquirem() { + acquirem() +} + +func Releasem() { + releasem(getg().m) +} + +var Timediv = timediv + +type PIController struct { + piController +} + +func NewPIController(kp, ti, tt, min, max float64) *PIController { + return &PIController{piController{ + kp: kp, + ti: ti, + tt: tt, + min: min, + max: max, + }} +} + +func (c *PIController) Next(input, setpoint, period float64) (float64, bool) { + return c.piController.next(input, setpoint, period) +} + +const ( + CapacityPerProc = capacityPerProc + GCCPULimiterUpdatePeriod = gcCPULimiterUpdatePeriod +) + +type GCCPULimiter struct { + limiter gcCPULimiterState +} + +func NewGCCPULimiter(now int64, gomaxprocs int32) *GCCPULimiter { + // Force the controller to escape. We're going to + // do 64-bit atomics on it, and if it gets stack-allocated + // on a 32-bit architecture, it may get allocated unaligned + // space. + l := Escape(new(GCCPULimiter)) + l.limiter.test = true + l.limiter.resetCapacity(now, gomaxprocs) + return l +} + +func (l *GCCPULimiter) Fill() uint64 { + return l.limiter.bucket.fill +} + +func (l *GCCPULimiter) Capacity() uint64 { + return l.limiter.bucket.capacity +} + +func (l *GCCPULimiter) Overflow() uint64 { + return l.limiter.overflow +} + +func (l *GCCPULimiter) Limiting() bool { + return l.limiter.limiting() +} + +func (l *GCCPULimiter) NeedUpdate(now int64) bool { + return l.limiter.needUpdate(now) +} + +func (l *GCCPULimiter) StartGCTransition(enableGC bool, now int64) { + l.limiter.startGCTransition(enableGC, now) +} + +func (l *GCCPULimiter) FinishGCTransition(now int64) { + l.limiter.finishGCTransition(now) +} + +func (l *GCCPULimiter) Update(now int64) { + l.limiter.update(now) +} + +func (l *GCCPULimiter) AddAssistTime(t int64) { + l.limiter.addAssistTime(t) +} + +func (l *GCCPULimiter) ResetCapacity(now int64, nprocs int32) { + l.limiter.resetCapacity(now, nprocs) +} + +const 
ScavengePercent = scavengePercent + +type Scavenger struct { + Sleep func(int64) int64 + Scavenge func(uintptr) (uintptr, int64) + ShouldStop func() bool + GoMaxProcs func() int32 + + released atomic.Uintptr + scavenger scavengerState + stop chan<- struct{} + done <-chan struct{} +} + +func (s *Scavenger) Start() { + if s.Sleep == nil || s.Scavenge == nil || s.ShouldStop == nil || s.GoMaxProcs == nil { + panic("must populate all stubs") + } + + // Install hooks. + s.scavenger.sleepStub = s.Sleep + s.scavenger.scavenge = s.Scavenge + s.scavenger.shouldStop = s.ShouldStop + s.scavenger.gomaxprocs = s.GoMaxProcs + + // Start up scavenger goroutine, and wait for it to be ready. + stop := make(chan struct{}) + s.stop = stop + done := make(chan struct{}) + s.done = done + go func() { + // This should match bgscavenge, loosely. + s.scavenger.init() + s.scavenger.park() + for { + select { + case <-stop: + close(done) + return + default: + } + released, workTime := s.scavenger.run() + if released == 0 { + s.scavenger.park() + continue + } + s.released.Add(released) + s.scavenger.sleep(workTime) + } + }() + if !s.BlockUntilParked(1e9 /* 1 second */) { + panic("timed out waiting for scavenger to get ready") + } +} + +// BlockUntilParked blocks until the scavenger parks, or until +// timeout is exceeded. Returns true if the scavenger parked. +// +// Note that in testing, parked means something slightly different. +// In anger, the scavenger parks to sleep, too, but in testing, +// it only parks when it actually has no work to do. +func (s *Scavenger) BlockUntilParked(timeout int64) bool { + // Just spin, waiting for it to park. + // + // The actual parking process is racy with respect to + // wakeups, which is fine, but for testing we need something + // a bit more robust. 
+ start := nanotime() + for nanotime()-start < timeout { + lock(&s.scavenger.lock) + parked := s.scavenger.parked + unlock(&s.scavenger.lock) + if parked { + return true + } + Gosched() + } + return false +} + +// Released returns how many bytes the scavenger released. +func (s *Scavenger) Released() uintptr { + return s.released.Load() +} + +// Wake wakes up a parked scavenger to keep running. +func (s *Scavenger) Wake() { + s.scavenger.wake() +} + +// Stop cleans up the scavenger's resources. The scavenger +// must be parked for this to work. +func (s *Scavenger) Stop() { + lock(&s.scavenger.lock) + parked := s.scavenger.parked + unlock(&s.scavenger.lock) + if !parked { + panic("tried to clean up scavenger that is not parked") + } + close(s.stop) + s.Wake() + <-s.done +} + +type ScavengeIndex struct { + i scavengeIndex +} + +func NewScavengeIndex(min, max ChunkIdx) *ScavengeIndex { + s := new(ScavengeIndex) + s.i.chunks = make([]atomic.Uint8, uintptr(1<<heapAddrBits/pallocChunkBytes/8)) + s.i.min.Store(int32(min / 8)) + s.i.max.Store(int32(max / 8)) + return s +} + +func (s *ScavengeIndex) Find() (ChunkIdx, uint) { + ci, off := s.i.find() + return ChunkIdx(ci), off +} + +func (s *ScavengeIndex) Mark(base, limit uintptr) { + s.i.mark(base, limit) +} + +func (s *ScavengeIndex) Clear(ci ChunkIdx) { + s.i.clear(chunkIdx(ci)) +} + +const GTrackingPeriod = gTrackingPeriod + +var ZeroBase = unsafe.Pointer(&zerobase) + +const UserArenaChunkBytes = userArenaChunkBytes + +type UserArena struct { + arena *userArena +} + +func NewUserArena() *UserArena { + return &UserArena{newUserArena()} +} + +func (a *UserArena) New(out *any) { + i := efaceOf(out) + typ := i._type + if typ.kind&kindMask != kindPtr { + panic("new result of non-ptr type") + } + typ = (*ptrtype)(unsafe.Pointer(typ)).elem + i.data = a.arena.new(typ) +} + +func (a *UserArena) Slice(sl any, cap int) { + a.arena.slice(sl, cap) +} + +func (a *UserArena) Free() { + a.arena.free() +} + +func 
GlobalWaitingArenaChunks() int { + n := 0 + systemstack(func() { + lock(&mheap_.lock) + for s := mheap_.userArena.quarantineList.first; s != nil; s = s.next { + n++ + } + unlock(&mheap_.lock) + }) + return n +} + +func UserArenaClone[T any](s T) T { + return arena_heapify(s).(T) +} + +var AlignUp = alignUp + +// BlockUntilEmptyFinalizerQueue blocks until either the finalizer +// queue is emptied (and the finalizers have executed) or the timeout +// is reached. Returns true if the finalizer queue was emptied. +func BlockUntilEmptyFinalizerQueue(timeout int64) bool { + start := nanotime() + for nanotime()-start < timeout { + lock(&finlock) + // We know the queue has been drained when both finq is nil + // and the finalizer g has stopped executing. + empty := finq == nil + empty = empty && readgstatus(fing) == _Gwaiting && fing.waitreason == waitReasonFinalizerWait + unlock(&finlock) + if empty { + return true + } + Gosched() + } + return false +} + +func FrameStartLine(f *Frame) int { + return f.startLine +} + +// PersistentAlloc allocates some memory that lives outside the Go heap. +// This memory will never be freed; use sparingly. +func PersistentAlloc(n uintptr) unsafe.Pointer { + return persistentalloc(n, 0, &memstats.other_sys) +} diff --git a/src/runtime/export_unix2_test.go b/src/runtime/export_unix2_test.go new file mode 100644 index 0000000..360565f --- /dev/null +++ b/src/runtime/export_unix2_test.go @@ -0,0 +1,10 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix && !linux + +package runtime + +// for linux close-on-exec implemented in runtime/internal/syscall +var Closeonexec = closeonexec diff --git a/src/runtime/export_unix_test.go b/src/runtime/export_unix_test.go new file mode 100644 index 0000000..6967e76 --- /dev/null +++ b/src/runtime/export_unix_test.go @@ -0,0 +1,98 @@ +// Copyright 2017 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime + +import "unsafe" + +var NonblockingPipe = nonblockingPipe +var Fcntl = fcntl + +func sigismember(mask *sigset, i int) bool { + clear := *mask + sigdelset(&clear, i) + return clear != *mask +} + +func Sigisblocked(i int) bool { + var sigmask sigset + sigprocmask(_SIG_SETMASK, nil, &sigmask) + return sigismember(&sigmask, i) +} + +type M = m + +var waitForSigusr1 struct { + rdpipe int32 + wrpipe int32 + mID int64 +} + +// WaitForSigusr1 blocks until a SIGUSR1 is received. It calls ready +// when it is set up to receive SIGUSR1. The ready function should +// cause a SIGUSR1 to be sent. The r and w arguments are a pipe that +// the signal handler can use to report when the signal is received. +// +// Once SIGUSR1 is received, it returns the ID of the current M and +// the ID of the M the SIGUSR1 was received on. If the caller writes +// a non-zero byte to w, WaitForSigusr1 returns immediately with -1, -1. +func WaitForSigusr1(r, w int32, ready func(mp *M)) (int64, int64) { + lockOSThread() + // Make sure we can receive SIGUSR1. + unblocksig(_SIGUSR1) + + waitForSigusr1.rdpipe = r + waitForSigusr1.wrpipe = w + + mp := getg().m + testSigusr1 = waitForSigusr1Callback + ready(mp) + + // Wait for the signal. We use a pipe rather than a note + // because write is always async-signal-safe. + entersyscallblock() + var b byte + read(waitForSigusr1.rdpipe, noescape(unsafe.Pointer(&b)), 1) + exitsyscall() + + gotM := waitForSigusr1.mID + testSigusr1 = nil + + unlockOSThread() + + if b != 0 { + // timeout signal from caller + return -1, -1 + } + return mp.id, gotM +} + +// waitForSigusr1Callback is called from the signal handler during +// WaitForSigusr1. It must not have write barriers because there may +// not be a P. 
+// +//go:nowritebarrierrec +func waitForSigusr1Callback(gp *g) bool { + if gp == nil || gp.m == nil { + waitForSigusr1.mID = -1 + } else { + waitForSigusr1.mID = gp.m.id + } + b := byte(0) + write(uintptr(waitForSigusr1.wrpipe), noescape(unsafe.Pointer(&b)), 1) + return true +} + +// SendSigusr1 sends SIGUSR1 to mp. +func SendSigusr1(mp *M) { + signalM(mp, _SIGUSR1) +} + +const ( + O_WRONLY = _O_WRONLY + O_CREAT = _O_CREAT + O_TRUNC = _O_TRUNC +) diff --git a/src/runtime/export_windows_test.go b/src/runtime/export_windows_test.go new file mode 100644 index 0000000..d9cf753 --- /dev/null +++ b/src/runtime/export_windows_test.go @@ -0,0 +1,27 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Export guts for testing. + +package runtime + +import "unsafe" + +const MaxArgs = maxArgs + +var ( + TestingWER = &testingWER + OsYield = osyield + TimeBeginPeriodRetValue = &timeBeginPeriodRetValue +) + +func NumberOfProcessors() int32 { + var info systeminfo + stdcall1(_GetSystemInfo, uintptr(unsafe.Pointer(&info))) + return int32(info.dwnumberofprocessors) +} + +func LoadLibraryExStatus() (useEx, haveEx, haveFlags bool) { + return useLoadLibraryEx, _LoadLibraryExW != nil, _AddDllDirectory != nil +} diff --git a/src/runtime/extern.go b/src/runtime/extern.go new file mode 100644 index 0000000..afadc3d --- /dev/null +++ b/src/runtime/extern.go @@ -0,0 +1,322 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +/* +Package runtime contains operations that interact with Go's runtime system, +such as functions to control goroutines. It also includes the low-level type information +used by the reflect package; see reflect's documentation for the programmable +interface to the run-time type system. 
+ +# Environment Variables + +The following environment variables ($name or %name%, depending on the host +operating system) control the run-time behavior of Go programs. The meanings +and use may change from release to release. + +The GOGC variable sets the initial garbage collection target percentage. +A collection is triggered when the ratio of freshly allocated data to live data +remaining after the previous collection reaches this percentage. The default +is GOGC=100. Setting GOGC=off disables the garbage collector entirely. +[runtime/debug.SetGCPercent] allows changing this percentage at run time. + +The GOMEMLIMIT variable sets a soft memory limit for the runtime. This memory limit +includes the Go heap and all other memory managed by the runtime, and excludes +external memory sources such as mappings of the binary itself, memory managed in +other languages, and memory held by the operating system on behalf of the Go +program. GOMEMLIMIT is a numeric value in bytes with an optional unit suffix. +The supported suffixes include B, KiB, MiB, GiB, and TiB. These suffixes +represent quantities of bytes as defined by the IEC 80000-13 standard. That is, +they are based on powers of two: KiB means 2^10 bytes, MiB means 2^20 bytes, +and so on. The default setting is math.MaxInt64, which effectively disables the +memory limit. [runtime/debug.SetMemoryLimit] allows changing this limit at run +time. + +The GODEBUG variable controls debugging variables within the runtime. +It is a comma-separated list of name=val pairs setting these named variables: + + allocfreetrace: setting allocfreetrace=1 causes every allocation to be + profiled and a stack trace printed on each object's allocation and free. + + clobberfree: setting clobberfree=1 causes the garbage collector to + clobber the memory content of an object with bad content when it frees + the object. + + cpu.*: cpu.all=off disables the use of all optional instruction set extensions. 
+ cpu.extension=off disables use of instructions from the specified instruction set extension. + extension is the lower case name for the instruction set extension such as sse41 or avx + as listed in internal/cpu package. As an example cpu.avx=off disables runtime detection + and thereby use of AVX instructions. + + cgocheck: setting cgocheck=0 disables all checks for packages + using cgo to incorrectly pass Go pointers to non-Go code. + Setting cgocheck=1 (the default) enables relatively cheap + checks that may miss some errors. Setting cgocheck=2 enables + expensive checks that should not miss any errors, but will + cause your program to run slower. + + efence: setting efence=1 causes the allocator to run in a mode + where each object is allocated on a unique page and addresses are + never recycled. + + gccheckmark: setting gccheckmark=1 enables verification of the + garbage collector's concurrent mark phase by performing a + second mark pass while the world is stopped. If the second + pass finds a reachable object that was not found by concurrent + mark, the garbage collector will panic. + + gcpacertrace: setting gcpacertrace=1 causes the garbage collector to + print information about the internal state of the concurrent pacer. + + gcshrinkstackoff: setting gcshrinkstackoff=1 disables moving goroutines + onto smaller stacks. In this mode, a goroutine's stack can only grow. + + gcstoptheworld: setting gcstoptheworld=1 disables concurrent garbage collection, + making every garbage collection a stop-the-world event. Setting gcstoptheworld=2 + also disables concurrent sweeping after the garbage collection finishes. + + gctrace: setting gctrace=1 causes the garbage collector to emit a single line to standard + error at each collection, summarizing the amount of memory collected and the + length of the pause. The format of this line is subject to change. 
+ Currently, it is: + gc # @#s #%: #+#+# ms clock, #+#/#/#+# ms cpu, #->#-># MB, # MB goal, # MB stacks, #MB globals, # P + where the fields are as follows: + gc # the GC number, incremented at each GC + @#s time in seconds since program start + #% percentage of time spent in GC since program start + #+...+# wall-clock/CPU times for the phases of the GC + #->#-># MB heap size at GC start, at GC end, and live heap + # MB goal goal heap size + # MB stacks estimated scannable stack size + # MB globals scannable global size + # P number of processors used + The phases are stop-the-world (STW) sweep termination, concurrent + mark and scan, and STW mark termination. The CPU times + for mark/scan are broken down in to assist time (GC performed in + line with allocation), background GC time, and idle GC time. + If the line ends with "(forced)", this GC was forced by a + runtime.GC() call. + + harddecommit: setting harddecommit=1 causes memory that is returned to the OS to + also have protections removed on it. This is the only mode of operation on Windows, + but is helpful in debugging scavenger-related issues on other platforms. Currently, + only supported on Linux. + + inittrace: setting inittrace=1 causes the runtime to emit a single line to standard + error for each package with init work, summarizing the execution time and memory + allocation. No information is printed for inits executed as part of plugin loading + and for packages without both user defined and compiler generated init work. + The format of this line is subject to change. 
Currently, it is: + init # @#ms, # ms clock, # bytes, # allocs + where the fields are as follows: + init # the package name + @# ms time in milliseconds when the init started since program start + # clock wall-clock time for package initialization work + # bytes memory allocated on the heap + # allocs number of heap allocations + + madvdontneed: setting madvdontneed=0 will use MADV_FREE + instead of MADV_DONTNEED on Linux when returning memory to the + kernel. This is more efficient, but means RSS numbers will + drop only when the OS is under memory pressure. On the BSDs and + Illumos/Solaris, setting madvdontneed=1 will use MADV_DONTNEED instead + of MADV_FREE. This is less efficient, but causes RSS numbers to drop + more quickly. + + memprofilerate: setting memprofilerate=X will update the value of runtime.MemProfileRate. + When set to 0 memory profiling is disabled. Refer to the description of + MemProfileRate for the default value. + + pagetrace: setting pagetrace=/path/to/file will write out a trace of page events + that can be viewed, analyzed, and visualized using the x/debug/cmd/pagetrace tool. + Build your program with GOEXPERIMENT=pagetrace to enable this functionality. Do not + enable this functionality if your program is a setuid binary as it introduces a security + risk in that scenario. Currently not supported on Windows, plan9 or js/wasm. Setting this + option for some applications can produce large traces, so use with care. + + invalidptr: invalidptr=1 (the default) causes the garbage collector and stack + copier to crash the program if an invalid pointer value (for example, 1) + is found in a pointer-typed location. Setting invalidptr=0 disables this check. + This should only be used as a temporary workaround to diagnose buggy code. + The real fix is to not store integers in pointer-typed locations. 
+ + sbrk: setting sbrk=1 replaces the memory allocator and garbage collector + with a trivial allocator that obtains memory from the operating system and + never reclaims any memory. + + scavtrace: setting scavtrace=1 causes the runtime to emit a single line to standard + error, roughly once per GC cycle, summarizing the amount of work done by the + scavenger as well as the total amount of memory returned to the operating system + and an estimate of physical memory utilization. The format of this line is subject + to change, but currently it is: + scav # KiB work, # KiB total, #% util + where the fields are as follows: + # KiB work the amount of memory returned to the OS since the last line + # KiB total the total amount of memory returned to the OS + #% util the fraction of all unscavenged memory which is in-use + If the line ends with "(forced)", then scavenging was forced by a + debug.FreeOSMemory() call. + + scheddetail: setting schedtrace=X and scheddetail=1 causes the scheduler to emit + detailed multiline info every X milliseconds, describing state of the scheduler, + processors, threads and goroutines. + + schedtrace: setting schedtrace=X causes the scheduler to emit a single line to standard + error every X milliseconds, summarizing the scheduler state. + + tracebackancestors: setting tracebackancestors=N extends tracebacks with the stacks at + which goroutines were created, where N limits the number of ancestor goroutines to + report. This also extends the information returned by runtime.Stack. Ancestor's goroutine + IDs will refer to the ID of the goroutine at the time of creation; it's possible for this + ID to be reused for another goroutine. Setting N to 0 will report no ancestry information. + + asyncpreemptoff: asyncpreemptoff=1 disables signal-based + asynchronous goroutine preemption. This makes some loops + non-preemptible for long periods, which may delay GC and + goroutine scheduling. 
This is useful for debugging GC issues + because it also disables the conservative stack scanning used + for asynchronously preempted goroutines. + +The net and net/http packages also refer to debugging variables in GODEBUG. +See the documentation for those packages for details. + +The GOMAXPROCS variable limits the number of operating system threads that +can execute user-level Go code simultaneously. There is no limit to the number of threads +that can be blocked in system calls on behalf of Go code; those do not count against +the GOMAXPROCS limit. This package's GOMAXPROCS function queries and changes +the limit. + +The GORACE variable configures the race detector, for programs built using -race. +See https://golang.org/doc/articles/race_detector.html for details. + +The GOTRACEBACK variable controls the amount of output generated when a Go +program fails due to an unrecovered panic or an unexpected runtime condition. +By default, a failure prints a stack trace for the current goroutine, +eliding functions internal to the run-time system, and then exits with exit code 2. +The failure prints stack traces for all goroutines if there is no current goroutine +or the failure is internal to the run-time. +GOTRACEBACK=none omits the goroutine stack traces entirely. +GOTRACEBACK=single (the default) behaves as described above. +GOTRACEBACK=all adds stack traces for all user-created goroutines. +GOTRACEBACK=system is like “all” but adds stack frames for run-time functions +and shows goroutines created internally by the run-time. +GOTRACEBACK=crash is like “system” but crashes in an operating system-specific +manner instead of exiting. For example, on Unix systems, the crash raises +SIGABRT to trigger a core dump. +For historical reasons, the GOTRACEBACK settings 0, 1, and 2 are synonyms for +none, all, and system, respectively. 
+The runtime/debug package's SetTraceback function allows increasing the +amount of output at run time, but it cannot reduce the amount below that +specified by the environment variable. +See https://golang.org/pkg/runtime/debug/#SetTraceback. + +The GOARCH, GOOS, GOPATH, and GOROOT environment variables complete +the set of Go environment variables. They influence the building of Go programs +(see https://golang.org/cmd/go and https://golang.org/pkg/go/build). +GOARCH, GOOS, and GOROOT are recorded at compile time and made available by +constants or functions in this package, but they do not influence the execution +of the run-time system. + +# Security + +On Unix platforms, Go's runtime system behaves slightly differently when a +binary is setuid/setgid or executed with setuid/setgid-like properties, in order +to prevent dangerous behaviors. On Linux this is determined by checking for the +AT_SECURE flag in the auxiliary vector, on the BSDs and Solaris/Illumos it is +determined by checking the issetugid syscall, and on AIX it is determined by +checking if the uid/gid match the effective uid/gid. + +When the runtime determines the binary is setuid/setgid-like, it does three main +things: + - The standard input/output file descriptors (0, 1, 2) are checked to be open. + If any of them are closed, they are opened pointing at /dev/null. + - The value of the GOTRACEBACK environment variable is set to 'none'. + - When a signal is received that terminates the program, or the program + encounters an unrecoverable panic that would otherwise override the value + of GOTRACEBACK, the goroutine stack, registers, and other memory related + information are omitted. +*/ +package runtime + +import ( + "internal/goarch" + "internal/goos" +) + +// Caller reports file and line number information about function invocations on +// the calling goroutine's stack. The argument skip is the number of stack frames +// to ascend, with 0 identifying the caller of Caller. 
(For historical reasons the +// meaning of skip differs between Caller and Callers.) The return values report the +// program counter, file name, and line number within the file of the corresponding +// call. The boolean ok is false if it was not possible to recover the information. +func Caller(skip int) (pc uintptr, file string, line int, ok bool) { + rpc := make([]uintptr, 1) + n := callers(skip+1, rpc[:]) + if n < 1 { + return + } + frame, _ := CallersFrames(rpc).Next() + return frame.PC, frame.File, frame.Line, frame.PC != 0 +} + +// Callers fills the slice pc with the return program counters of function invocations +// on the calling goroutine's stack. The argument skip is the number of stack frames +// to skip before recording in pc, with 0 identifying the frame for Callers itself and +// 1 identifying the caller of Callers. +// It returns the number of entries written to pc. +// +// To translate these PCs into symbolic information such as function +// names and line numbers, use CallersFrames. CallersFrames accounts +// for inlined functions and adjusts the return program counters into +// call program counters. Iterating over the returned slice of PCs +// directly is discouraged, as is using FuncForPC on any of the +// returned PCs, since these cannot account for inlining or return +// program counter adjustment. +func Callers(skip int, pc []uintptr) int { + // runtime.callers uses pc.array==nil as a signal + // to print a stack trace. Pick off 0-length pc here + // so that we don't let a nil pc slice get to it. + if len(pc) == 0 { + return 0 + } + return callers(skip, pc) +} + +var defaultGOROOT string // set by cmd/link + +// GOROOT returns the root of the Go tree. It uses the +// GOROOT environment variable, if set at process start, +// or else the root used during the Go build. +func GOROOT() string { + s := gogetenv("GOROOT") + if s != "" { + return s + } + return defaultGOROOT +} + +// buildVersion is the Go tree's version string at build time. 
+// +// If any GOEXPERIMENTs are set to non-default values, it will include +// "X:<GOEXPERIMENT>". +// +// This is set by the linker. +// +// This is accessed by "go version <binary>". +var buildVersion string + +// Version returns the Go tree's version string. +// It is either the commit hash and date at the time of the build or, +// when possible, a release tag like "go1.3". +func Version() string { + return buildVersion +} + +// GOOS is the running program's operating system target: +// one of darwin, freebsd, linux, and so on. +// To view possible combinations of GOOS and GOARCH, run "go tool dist list". +const GOOS string = goos.GOOS + +// GOARCH is the running program's architecture target: +// one of 386, amd64, arm, s390x, and so on. +const GOARCH string = goarch.GOARCH diff --git a/src/runtime/fastlog2.go b/src/runtime/fastlog2.go new file mode 100644 index 0000000..1f251bf --- /dev/null +++ b/src/runtime/fastlog2.go @@ -0,0 +1,27 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// fastlog2 implements a fast approximation to the base 2 log of a +// float64. This is used to compute a geometric distribution for heap +// sampling, without introducing dependencies into package math. This +// uses a very rough approximation using the float64 exponent and the +// first 25 bits of the mantissa. The top 5 bits of the mantissa are +// used to load limits from a table of constants and the rest are used +// to scale linearly between them. +func fastlog2(x float64) float64 { + const fastlogScaleBits = 20 + const fastlogScaleRatio = 1.0 / (1 << fastlogScaleBits) + + xBits := float64bits(x) + // Extract the exponent from the IEEE float64, and index a constant + // table with the first 10 bits from the mantissa. 
+ xExp := int64((xBits>>52)&0x7FF) - 1023 + xManIndex := (xBits >> (52 - fastlogNumBits)) % (1 << fastlogNumBits) + xManScale := (xBits >> (52 - fastlogNumBits - fastlogScaleBits)) % (1 << fastlogScaleBits) + + low, high := fastlog2Table[xManIndex], fastlog2Table[xManIndex+1] + return float64(xExp) + low + (high-low)*float64(xManScale)*fastlogScaleRatio +} diff --git a/src/runtime/fastlog2_test.go b/src/runtime/fastlog2_test.go new file mode 100644 index 0000000..ae0f40b --- /dev/null +++ b/src/runtime/fastlog2_test.go @@ -0,0 +1,34 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "math" + "runtime" + "testing" +) + +func TestFastLog2(t *testing.T) { + // Compute the euclidean distance between math.Log2 and the FastLog2 + // implementation over the range of interest for heap sampling. + const randomBitCount = 26 + var e float64 + + inc := 1 + if testing.Short() { + // Check 1K total values, down from 64M. + inc = 1 << 16 + } + for i := 1; i < 1<<randomBitCount; i += inc { + l, fl := math.Log2(float64(i)), runtime.Fastlog2(float64(i)) + d := l - fl + e += d * d + } + e = math.Sqrt(e) + + if e > 1.0 { + t.Fatalf("imprecision on fastlog2 implementation, want <=1.0, got %f", e) + } +} diff --git a/src/runtime/fastlog2table.go b/src/runtime/fastlog2table.go new file mode 100644 index 0000000..6ba4a7d --- /dev/null +++ b/src/runtime/fastlog2table.go @@ -0,0 +1,43 @@ +// Code generated by mkfastlog2table.go; DO NOT EDIT. +// Run go generate from src/runtime to update. +// See mkfastlog2table.go for comments. 
+ +package runtime + +const fastlogNumBits = 5 + +var fastlog2Table = [1<<fastlogNumBits + 1]float64{ + 0, + 0.0443941193584535, + 0.08746284125033943, + 0.12928301694496647, + 0.16992500144231248, + 0.2094533656289499, + 0.24792751344358555, + 0.28540221886224837, + 0.3219280948873623, + 0.3575520046180837, + 0.39231742277876036, + 0.4262647547020979, + 0.4594316186372973, + 0.4918530963296748, + 0.5235619560570128, + 0.5545888516776374, + 0.5849625007211563, + 0.6147098441152082, + 0.6438561897747247, + 0.6724253419714956, + 0.7004397181410922, + 0.7279204545631992, + 0.7548875021634686, + 0.7813597135246596, + 0.8073549220576042, + 0.8328900141647417, + 0.8579809951275721, + 0.8826430493618412, + 0.9068905956085185, + 0.9307373375628862, + 0.9541963103868752, + 0.9772799234999164, + 1, +} diff --git a/src/runtime/float.go b/src/runtime/float.go new file mode 100644 index 0000000..9f281c4 --- /dev/null +++ b/src/runtime/float.go @@ -0,0 +1,54 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +var inf = float64frombits(0x7FF0000000000000) + +// isNaN reports whether f is an IEEE 754 “not-a-number” value. +func isNaN(f float64) (is bool) { + // IEEE 754 says that only NaNs satisfy f != f. + return f != f +} + +// isFinite reports whether f is neither NaN nor an infinity. +func isFinite(f float64) bool { + return !isNaN(f - f) +} + +// isInf reports whether f is an infinity. +func isInf(f float64) bool { + return !isNaN(f) && !isFinite(f) +} + +// abs returns the absolute value of x. +// +// Special cases are: +// +// abs(±Inf) = +Inf +// abs(NaN) = NaN +func abs(x float64) float64 { + const sign = 1 << 63 + return float64frombits(float64bits(x) &^ sign) +} + +// copysign returns a value with the magnitude +// of x and the sign of y. 
+func copysign(x, y float64) float64 { + const sign = 1 << 63 + return float64frombits(float64bits(x)&^sign | float64bits(y)&sign) +} + +// float64bits returns the IEEE 754 binary representation of f. +func float64bits(f float64) uint64 { + return *(*uint64)(unsafe.Pointer(&f)) +} + +// float64frombits returns the floating point number corresponding +// the IEEE 754 binary representation b. +func float64frombits(b uint64) float64 { + return *(*float64)(unsafe.Pointer(&b)) +} diff --git a/src/runtime/float_test.go b/src/runtime/float_test.go new file mode 100644 index 0000000..b2aa43d --- /dev/null +++ b/src/runtime/float_test.go @@ -0,0 +1,25 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "testing" +) + +func TestIssue48807(t *testing.T) { + for _, i := range []uint64{ + 0x8234508000000001, // from issue48807 + 1<<56 + 1<<32 + 1, + } { + got := float32(i) + dontwant := float32(float64(i)) + if got == dontwant { + // The test cases above should be uint64s such that + // this equality doesn't hold. These examples trigger + // the case where using an intermediate float64 doesn't work. + t.Errorf("direct float32 conversion doesn't work: arg=%x got=%x dontwant=%x", i, got, dontwant) + } + } +} diff --git a/src/runtime/funcdata.h b/src/runtime/funcdata.h new file mode 100644 index 0000000..2e2bb30 --- /dev/null +++ b/src/runtime/funcdata.h @@ -0,0 +1,56 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file defines the IDs for PCDATA and FUNCDATA instructions +// in Go binaries. It is included by assembly sources, so it must +// be written using #defines. +// +// These must agree with symtab.go and ../cmd/internal/objabi/funcdata.go. 
+ +#define PCDATA_UnsafePoint 0 +#define PCDATA_StackMapIndex 1 +#define PCDATA_InlTreeIndex 2 +#define PCDATA_ArgLiveIndex 3 + +#define FUNCDATA_ArgsPointerMaps 0 /* garbage collector blocks */ +#define FUNCDATA_LocalsPointerMaps 1 +#define FUNCDATA_StackObjects 2 +#define FUNCDATA_InlTree 3 +#define FUNCDATA_OpenCodedDeferInfo 4 /* info for func with open-coded defers */ +#define FUNCDATA_ArgInfo 5 +#define FUNCDATA_ArgLiveInfo 6 +#define FUNCDATA_WrapInfo 7 + +// Pseudo-assembly statements. + +// GO_ARGS, GO_RESULTS_INITIALIZED, and NO_LOCAL_POINTERS are macros +// that communicate to the runtime information about the location and liveness +// of pointers in an assembly function's arguments, results, and stack frame. +// This communication is only required in assembly functions that make calls +// to other functions that might be preempted or grow the stack. +// NOSPLIT functions that make no calls do not need to use these macros. + +// GO_ARGS indicates that the Go prototype for this assembly function +// defines the pointer map for the function's arguments. +// GO_ARGS should be the first instruction in a function that uses it. +// It can be omitted if there are no arguments at all. +// GO_ARGS is inserted implicitly by the linker for any function whose +// name starts with a middle-dot and that also has a Go prototype; it +// is therefore usually not necessary to write explicitly. +#define GO_ARGS FUNCDATA $FUNCDATA_ArgsPointerMaps, go_args_stackmap(SB) + +// GO_RESULTS_INITIALIZED indicates that the assembly function +// has initialized the stack space for its results and that those results +// should be considered live for the remainder of the function. +#define GO_RESULTS_INITIALIZED PCDATA $PCDATA_StackMapIndex, $1 + +// NO_LOCAL_POINTERS indicates that the assembly function stores +// no pointers to heap objects in its local stack variables. 
+#define NO_LOCAL_POINTERS FUNCDATA $FUNCDATA_LocalsPointerMaps, no_pointers_stackmap(SB) + +// ArgsSizeUnknown is set in Func.argsize to mark all functions +// whose argument size is unknown (C vararg functions, and +// assembly code without an explicit specification). +// This value is generated by the compiler, assembler, or linker. +#define ArgsSizeUnknown 0x80000000 diff --git a/src/runtime/gc_test.go b/src/runtime/gc_test.go new file mode 100644 index 0000000..1a1655e --- /dev/null +++ b/src/runtime/gc_test.go @@ -0,0 +1,939 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "math/rand" + "os" + "reflect" + "runtime" + "runtime/debug" + "sort" + "strings" + "sync" + "sync/atomic" + "testing" + "time" + "unsafe" +) + +func TestGcSys(t *testing.T) { + t.Skip("skipping known-flaky test; golang.org/issue/37331") + if os.Getenv("GOGC") == "off" { + t.Skip("skipping test; GOGC=off in environment") + } + got := runTestProg(t, "testprog", "GCSys") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got %q", want, got) + } +} + +func TestGcDeepNesting(t *testing.T) { + type T [2][2][2][2][2][2][2][2][2][2]*int + a := new(T) + + // Prevent the compiler from applying escape analysis. + // This makes sure new(T) is allocated on heap, not on the stack. 
+ t.Logf("%p", a) + + a[0][0][0][0][0][0][0][0][0][0] = new(int) + *a[0][0][0][0][0][0][0][0][0][0] = 13 + runtime.GC() + if *a[0][0][0][0][0][0][0][0][0][0] != 13 { + t.Fail() + } +} + +func TestGcMapIndirection(t *testing.T) { + defer debug.SetGCPercent(debug.SetGCPercent(1)) + runtime.GC() + type T struct { + a [256]int + } + m := make(map[T]T) + for i := 0; i < 2000; i++ { + var a T + a.a[0] = i + m[a] = T{} + } +} + +func TestGcArraySlice(t *testing.T) { + type X struct { + buf [1]byte + nextbuf []byte + next *X + } + var head *X + for i := 0; i < 10; i++ { + p := &X{} + p.buf[0] = 42 + p.next = head + if head != nil { + p.nextbuf = head.buf[:] + } + head = p + runtime.GC() + } + for p := head; p != nil; p = p.next { + if p.buf[0] != 42 { + t.Fatal("corrupted heap") + } + } +} + +func TestGcRescan(t *testing.T) { + type X struct { + c chan error + nextx *X + } + type Y struct { + X + nexty *Y + p *int + } + var head *Y + for i := 0; i < 10; i++ { + p := &Y{} + p.c = make(chan error) + if head != nil { + p.nextx = &head.X + } + p.nexty = head + p.p = new(int) + *p.p = 42 + head = p + runtime.GC() + } + for p := head; p != nil; p = p.nexty { + if *p.p != 42 { + t.Fatal("corrupted heap") + } + } +} + +func TestGcLastTime(t *testing.T) { + ms := new(runtime.MemStats) + t0 := time.Now().UnixNano() + runtime.GC() + t1 := time.Now().UnixNano() + runtime.ReadMemStats(ms) + last := int64(ms.LastGC) + if t0 > last || last > t1 { + t.Fatalf("bad last GC time: got %v, want [%v, %v]", last, t0, t1) + } + pause := ms.PauseNs[(ms.NumGC+255)%256] + // Due to timer granularity, pause can actually be 0 on windows + // or on virtualized environments. + if pause == 0 { + t.Logf("last GC pause was 0") + } else if pause > 10e9 { + t.Logf("bad last GC pause: got %v, want [0, 10e9]", pause) + } +} + +var hugeSink any + +func TestHugeGCInfo(t *testing.T) { + // The test ensures that compiler can chew these huge types even on weakest machines. 
+ // The types are not allocated at runtime. + if hugeSink != nil { + // 400MB on 32 bots, 4TB on 64-bits. + const n = (400 << 20) + (unsafe.Sizeof(uintptr(0))-4)<<40 + hugeSink = new([n]*byte) + hugeSink = new([n]uintptr) + hugeSink = new(struct { + x float64 + y [n]*byte + z []string + }) + hugeSink = new(struct { + x float64 + y [n]uintptr + z []string + }) + } +} + +func TestPeriodicGC(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no sysmon on wasm yet") + } + + // Make sure we're not in the middle of a GC. + runtime.GC() + + var ms1, ms2 runtime.MemStats + runtime.ReadMemStats(&ms1) + + // Make periodic GC run continuously. + orig := *runtime.ForceGCPeriod + *runtime.ForceGCPeriod = 0 + + // Let some periodic GCs happen. In a heavily loaded system, + // it's possible these will be delayed, so this is designed to + // succeed quickly if things are working, but to give it some + // slack if things are slow. + var numGCs uint32 + const want = 2 + for i := 0; i < 200 && numGCs < want; i++ { + time.Sleep(5 * time.Millisecond) + + // Test that periodic GC actually happened. + runtime.ReadMemStats(&ms2) + numGCs = ms2.NumGC - ms1.NumGC + } + *runtime.ForceGCPeriod = orig + + if numGCs < want { + t.Fatalf("no periodic GC: got %v GCs, want >= 2", numGCs) + } +} + +func TestGcZombieReporting(t *testing.T) { + // This test is somewhat sensitive to how the allocator works. + // Pointers in zombies slice may cross-span, thus we + // add invalidptr=0 for avoiding the badPointer check. + // See issue https://golang.org/issues/49613/ + got := runTestProg(t, "testprog", "GCZombie", "GODEBUG=invalidptr=0") + want := "found pointer to free object" + if !strings.Contains(got, want) { + t.Fatalf("expected %q in output, but got %q", want, got) + } +} + +func TestGCTestMoveStackOnNextCall(t *testing.T) { + t.Parallel() + var onStack int + // GCTestMoveStackOnNextCall can fail in rare cases if there's + // a preemption. 
This won't happen many times in quick + // succession, so just retry a few times. + for retry := 0; retry < 5; retry++ { + runtime.GCTestMoveStackOnNextCall() + if moveStackCheck(t, &onStack, uintptr(unsafe.Pointer(&onStack))) { + // Passed. + return + } + } + t.Fatal("stack did not move") +} + +// This must not be inlined because the point is to force a stack +// growth check and move the stack. +// +//go:noinline +func moveStackCheck(t *testing.T, new *int, old uintptr) bool { + // new should have been updated by the stack move; + // old should not have. + + // Capture new's value before doing anything that could + // further move the stack. + new2 := uintptr(unsafe.Pointer(new)) + + t.Logf("old stack pointer %x, new stack pointer %x", old, new2) + if new2 == old { + // Check that we didn't screw up the test's escape analysis. + if cls := runtime.GCTestPointerClass(unsafe.Pointer(new)); cls != "stack" { + t.Fatalf("test bug: new (%#x) should be a stack pointer, not %s", new2, cls) + } + // This was a real failure. + return false + } + return true +} + +func TestGCTestMoveStackRepeatedly(t *testing.T) { + // Move the stack repeatedly to make sure we're not doubling + // it each time. + for i := 0; i < 100; i++ { + runtime.GCTestMoveStackOnNextCall() + moveStack1(false) + } +} + +//go:noinline +func moveStack1(x bool) { + // Make sure this function doesn't get auto-nosplit. + if x { + println("x") + } +} + +func TestGCTestIsReachable(t *testing.T) { + var all, half []unsafe.Pointer + var want uint64 + for i := 0; i < 16; i++ { + // The tiny allocator muddies things, so we use a + // scannable type. + p := unsafe.Pointer(new(*int)) + all = append(all, p) + if i%2 == 0 { + half = append(half, p) + want |= 1 << i + } + } + + got := runtime.GCTestIsReachable(all...) 
+ if want != got { + t.Fatalf("did not get expected reachable set; want %b, got %b", want, got) + } + runtime.KeepAlive(half) +} + +var pointerClassBSS *int +var pointerClassData = 42 + +func TestGCTestPointerClass(t *testing.T) { + t.Parallel() + check := func(p unsafe.Pointer, want string) { + t.Helper() + got := runtime.GCTestPointerClass(p) + if got != want { + // Convert the pointer to a uintptr to avoid + // escaping it. + t.Errorf("for %#x, want class %s, got %s", uintptr(p), want, got) + } + } + var onStack int + var notOnStack int + check(unsafe.Pointer(&onStack), "stack") + check(unsafe.Pointer(runtime.Escape(&notOnStack)), "heap") + check(unsafe.Pointer(&pointerClassBSS), "bss") + check(unsafe.Pointer(&pointerClassData), "data") + check(nil, "other") +} + +func BenchmarkSetTypePtr(b *testing.B) { + benchSetType(b, new(*byte)) +} + +func BenchmarkSetTypePtr8(b *testing.B) { + benchSetType(b, new([8]*byte)) +} + +func BenchmarkSetTypePtr16(b *testing.B) { + benchSetType(b, new([16]*byte)) +} + +func BenchmarkSetTypePtr32(b *testing.B) { + benchSetType(b, new([32]*byte)) +} + +func BenchmarkSetTypePtr64(b *testing.B) { + benchSetType(b, new([64]*byte)) +} + +func BenchmarkSetTypePtr126(b *testing.B) { + benchSetType(b, new([126]*byte)) +} + +func BenchmarkSetTypePtr128(b *testing.B) { + benchSetType(b, new([128]*byte)) +} + +func BenchmarkSetTypePtrSlice(b *testing.B) { + benchSetType(b, make([]*byte, 1<<10)) +} + +type Node1 struct { + Value [1]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode1(b *testing.B) { + benchSetType(b, new(Node1)) +} + +func BenchmarkSetTypeNode1Slice(b *testing.B) { + benchSetType(b, make([]Node1, 32)) +} + +type Node8 struct { + Value [8]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode8(b *testing.B) { + benchSetType(b, new(Node8)) +} + +func BenchmarkSetTypeNode8Slice(b *testing.B) { + benchSetType(b, make([]Node8, 32)) +} + +type Node64 struct { + Value [64]uintptr + Left, Right *byte +} + +func 
BenchmarkSetTypeNode64(b *testing.B) { + benchSetType(b, new(Node64)) +} + +func BenchmarkSetTypeNode64Slice(b *testing.B) { + benchSetType(b, make([]Node64, 32)) +} + +type Node64Dead struct { + Left, Right *byte + Value [64]uintptr +} + +func BenchmarkSetTypeNode64Dead(b *testing.B) { + benchSetType(b, new(Node64Dead)) +} + +func BenchmarkSetTypeNode64DeadSlice(b *testing.B) { + benchSetType(b, make([]Node64Dead, 32)) +} + +type Node124 struct { + Value [124]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode124(b *testing.B) { + benchSetType(b, new(Node124)) +} + +func BenchmarkSetTypeNode124Slice(b *testing.B) { + benchSetType(b, make([]Node124, 32)) +} + +type Node126 struct { + Value [126]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode126(b *testing.B) { + benchSetType(b, new(Node126)) +} + +func BenchmarkSetTypeNode126Slice(b *testing.B) { + benchSetType(b, make([]Node126, 32)) +} + +type Node128 struct { + Value [128]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode128(b *testing.B) { + benchSetType(b, new(Node128)) +} + +func BenchmarkSetTypeNode128Slice(b *testing.B) { + benchSetType(b, make([]Node128, 32)) +} + +type Node130 struct { + Value [130]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode130(b *testing.B) { + benchSetType(b, new(Node130)) +} + +func BenchmarkSetTypeNode130Slice(b *testing.B) { + benchSetType(b, make([]Node130, 32)) +} + +type Node1024 struct { + Value [1024]uintptr + Left, Right *byte +} + +func BenchmarkSetTypeNode1024(b *testing.B) { + benchSetType(b, new(Node1024)) +} + +func BenchmarkSetTypeNode1024Slice(b *testing.B) { + benchSetType(b, make([]Node1024, 32)) +} + +func benchSetType(b *testing.B, x any) { + v := reflect.ValueOf(x) + t := v.Type() + switch t.Kind() { + case reflect.Pointer: + b.SetBytes(int64(t.Elem().Size())) + case reflect.Slice: + b.SetBytes(int64(t.Elem().Size()) * int64(v.Len())) + } + b.ResetTimer() + runtime.BenchSetType(b.N, x) +} + +func BenchmarkAllocation(b 
*testing.B) { + type T struct { + x, y *byte + } + ngo := runtime.GOMAXPROCS(0) + work := make(chan bool, b.N+ngo) + result := make(chan *T) + for i := 0; i < b.N; i++ { + work <- true + } + for i := 0; i < ngo; i++ { + work <- false + } + for i := 0; i < ngo; i++ { + go func() { + var x *T + for <-work { + for i := 0; i < 1000; i++ { + x = &T{} + } + } + result <- x + }() + } + for i := 0; i < ngo; i++ { + <-result + } +} + +func TestPrintGC(t *testing.T) { + if testing.Short() { + t.Skip("Skipping in short mode") + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(2)) + done := make(chan bool) + go func() { + for { + select { + case <-done: + return + default: + runtime.GC() + } + } + }() + for i := 0; i < 1e4; i++ { + func() { + defer print("") + }() + } + close(done) +} + +func testTypeSwitch(x any) error { + switch y := x.(type) { + case nil: + // ok + case error: + return y + } + return nil +} + +func testAssert(x any) error { + if y, ok := x.(error); ok { + return y + } + return nil +} + +func testAssertVar(x any) error { + var y, ok = x.(error) + if ok { + return y + } + return nil +} + +var a bool + +//go:noinline +func testIfaceEqual(x any) { + if x == "abc" { + a = true + } +} + +func TestPageAccounting(t *testing.T) { + // Grow the heap in small increments. This used to drop the + // pages-in-use count below zero because of a rounding + // mismatch (golang.org/issue/15022). + const blockSize = 64 << 10 + blocks := make([]*[blockSize]byte, (64<<20)/blockSize) + for i := range blocks { + blocks[i] = new([blockSize]byte) + } + + // Check that the running page count matches reality. + pagesInUse, counted := runtime.CountPagesInUse() + if pagesInUse != counted { + t.Fatalf("mheap_.pagesInUse is %d, but direct count is %d", pagesInUse, counted) + } +} + +func init() { + // Enable ReadMemStats' double-check mode. 
+ *runtime.DoubleCheckReadMemStats = true +} + +func TestReadMemStats(t *testing.T) { + base, slow := runtime.ReadMemStatsSlow() + if base != slow { + logDiff(t, "MemStats", reflect.ValueOf(base), reflect.ValueOf(slow)) + t.Fatal("memstats mismatch") + } +} + +func logDiff(t *testing.T, prefix string, got, want reflect.Value) { + typ := got.Type() + switch typ.Kind() { + case reflect.Array, reflect.Slice: + if got.Len() != want.Len() { + t.Logf("len(%s): got %v, want %v", prefix, got, want) + return + } + for i := 0; i < got.Len(); i++ { + logDiff(t, fmt.Sprintf("%s[%d]", prefix, i), got.Index(i), want.Index(i)) + } + case reflect.Struct: + for i := 0; i < typ.NumField(); i++ { + gf, wf := got.Field(i), want.Field(i) + logDiff(t, prefix+"."+typ.Field(i).Name, gf, wf) + } + case reflect.Map: + t.Fatal("not implemented: logDiff for map") + default: + if got.Interface() != want.Interface() { + t.Logf("%s: got %v, want %v", prefix, got, want) + } + } +} + +func BenchmarkReadMemStats(b *testing.B) { + var ms runtime.MemStats + const heapSize = 100 << 20 + x := make([]*[1024]byte, heapSize/1024) + for i := range x { + x[i] = new([1024]byte) + } + + b.ResetTimer() + for i := 0; i < b.N; i++ { + runtime.ReadMemStats(&ms) + } + + runtime.KeepAlive(x) +} + +func applyGCLoad(b *testing.B) func() { + // We’ll apply load to the runtime with maxProcs-1 goroutines + // and use one more to actually benchmark. It doesn't make sense + // to try to run this test with only 1 P (that's what + // BenchmarkReadMemStats is for). + maxProcs := runtime.GOMAXPROCS(-1) + if maxProcs == 1 { + b.Skip("This benchmark can only be run with GOMAXPROCS > 1") + } + + // Code to build a big tree with lots of pointers. 
+ type node struct { + children [16]*node + } + var buildTree func(depth int) *node + buildTree = func(depth int) *node { + tree := new(node) + if depth != 0 { + for i := range tree.children { + tree.children[i] = buildTree(depth - 1) + } + } + return tree + } + + // Keep the GC busy by continuously generating large trees. + done := make(chan struct{}) + var wg sync.WaitGroup + for i := 0; i < maxProcs-1; i++ { + wg.Add(1) + go func() { + defer wg.Done() + var hold *node + loop: + for { + hold = buildTree(5) + select { + case <-done: + break loop + default: + } + } + runtime.KeepAlive(hold) + }() + } + return func() { + close(done) + wg.Wait() + } +} + +func BenchmarkReadMemStatsLatency(b *testing.B) { + stop := applyGCLoad(b) + + // Spend this much time measuring latencies. + latencies := make([]time.Duration, 0, 1024) + + // Run for timeToBench hitting ReadMemStats continuously + // and measuring the latency. + b.ResetTimer() + var ms runtime.MemStats + for i := 0; i < b.N; i++ { + // Sleep for a bit, otherwise we're just going to keep + // stopping the world and no one will get to do anything. + time.Sleep(100 * time.Millisecond) + start := time.Now() + runtime.ReadMemStats(&ms) + latencies = append(latencies, time.Since(start)) + } + // Make sure to stop the timer before we wait! The load created above + // is very heavy-weight and not easy to stop, so we could end up + // confusing the benchmarking framework for small b.N. + b.StopTimer() + stop() + + // Disable the default */op metrics. + // ns/op doesn't mean anything because it's an average, but we + // have a sleep in our b.N loop above which skews this significantly. + b.ReportMetric(0, "ns/op") + b.ReportMetric(0, "B/op") + b.ReportMetric(0, "allocs/op") + + // Sort latencies then report percentiles. 
+ sort.Slice(latencies, func(i, j int) bool { + return latencies[i] < latencies[j] + }) + b.ReportMetric(float64(latencies[len(latencies)*50/100]), "p50-ns") + b.ReportMetric(float64(latencies[len(latencies)*90/100]), "p90-ns") + b.ReportMetric(float64(latencies[len(latencies)*99/100]), "p99-ns") +} + +func TestUserForcedGC(t *testing.T) { + // Test that runtime.GC() triggers a GC even if GOGC=off. + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + + var ms1, ms2 runtime.MemStats + runtime.ReadMemStats(&ms1) + runtime.GC() + runtime.ReadMemStats(&ms2) + if ms1.NumGC == ms2.NumGC { + t.Fatalf("runtime.GC() did not trigger GC") + } + if ms1.NumForcedGC == ms2.NumForcedGC { + t.Fatalf("runtime.GC() was not accounted in NumForcedGC") + } +} + +func writeBarrierBenchmark(b *testing.B, f func()) { + runtime.GC() + var ms runtime.MemStats + runtime.ReadMemStats(&ms) + //b.Logf("heap size: %d MB", ms.HeapAlloc>>20) + + // Keep GC running continuously during the benchmark, which in + // turn keeps the write barrier on continuously. + var stop uint32 + done := make(chan bool) + go func() { + for atomic.LoadUint32(&stop) == 0 { + runtime.GC() + } + close(done) + }() + defer func() { + atomic.StoreUint32(&stop, 1) + <-done + }() + + b.ResetTimer() + f() + b.StopTimer() +} + +func BenchmarkWriteBarrier(b *testing.B) { + if runtime.GOMAXPROCS(-1) < 2 { + // We don't want GC to take our time. + b.Skip("need GOMAXPROCS >= 2") + } + + // Construct a large tree both so the GC runs for a while and + // so we have a data structure to manipulate the pointers of. + type node struct { + l, r *node + } + var wbRoots []*node + var mkTree func(level int) *node + mkTree = func(level int) *node { + if level == 0 { + return nil + } + n := &node{mkTree(level - 1), mkTree(level - 1)} + if level == 10 { + // Seed GC with enough early pointers so it + // doesn't start termination barriers when it + // only has the top of the tree. 
+ wbRoots = append(wbRoots, n) + } + return n + } + const depth = 22 // 64 MB + root := mkTree(22) + + writeBarrierBenchmark(b, func() { + var stack [depth]*node + tos := -1 + + // There are two write barriers per iteration, so i+=2. + for i := 0; i < b.N; i += 2 { + if tos == -1 { + stack[0] = root + tos = 0 + } + + // Perform one step of reversing the tree. + n := stack[tos] + if n.l == nil { + tos-- + } else { + n.l, n.r = n.r, n.l + stack[tos] = n.l + stack[tos+1] = n.r + tos++ + } + + if i%(1<<12) == 0 { + // Avoid non-preemptible loops (see issue #10958). + runtime.Gosched() + } + } + }) + + runtime.KeepAlive(wbRoots) +} + +func BenchmarkBulkWriteBarrier(b *testing.B) { + if runtime.GOMAXPROCS(-1) < 2 { + // We don't want GC to take our time. + b.Skip("need GOMAXPROCS >= 2") + } + + // Construct a large set of objects we can copy around. + const heapSize = 64 << 20 + type obj [16]*byte + ptrs := make([]*obj, heapSize/unsafe.Sizeof(obj{})) + for i := range ptrs { + ptrs[i] = new(obj) + } + + writeBarrierBenchmark(b, func() { + const blockSize = 1024 + var pos int + for i := 0; i < b.N; i += blockSize { + // Rotate block. + block := ptrs[pos : pos+blockSize] + first := block[0] + copy(block, block[1:]) + block[blockSize-1] = first + + pos += blockSize + if pos+blockSize > len(ptrs) { + pos = 0 + } + + runtime.Gosched() + } + }) + + runtime.KeepAlive(ptrs) +} + +func BenchmarkScanStackNoLocals(b *testing.B) { + var ready sync.WaitGroup + teardown := make(chan bool) + for j := 0; j < 10; j++ { + ready.Add(1) + go func() { + x := 100000 + countpwg(&x, &ready, teardown) + }() + } + ready.Wait() + b.ResetTimer() + for i := 0; i < b.N; i++ { + b.StartTimer() + runtime.GC() + runtime.GC() + b.StopTimer() + } + close(teardown) +} + +func BenchmarkMSpanCountAlloc(b *testing.B) { + // Allocate one dummy mspan for the whole benchmark. + s := runtime.AllocMSpan() + defer runtime.FreeMSpan(s) + + // n is the number of bytes to benchmark against. 
+ // n must always be a multiple of 8, since gcBits is + // always rounded up 8 bytes. + for _, n := range []int{8, 16, 32, 64, 128} { + b.Run(fmt.Sprintf("bits=%d", n*8), func(b *testing.B) { + // Initialize a new byte slice with pseudo-random data. + bits := make([]byte, n) + rand.Read(bits) + + b.ResetTimer() + for i := 0; i < b.N; i++ { + runtime.MSpanCountAlloc(s, bits) + } + }) + } +} + +func countpwg(n *int, ready *sync.WaitGroup, teardown chan bool) { + if *n == 0 { + ready.Done() + <-teardown + return + } + *n-- + countpwg(n, ready, teardown) +} + +func TestMemoryLimit(t *testing.T) { + if testing.Short() { + t.Skip("stress test that takes time to run") + } + if runtime.NumCPU() < 4 { + t.Skip("want at least 4 CPUs for this test") + } + got := runTestProg(t, "testprog", "GCMemoryLimit") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got %q", want, got) + } +} + +func TestMemoryLimitNoGCPercent(t *testing.T) { + if testing.Short() { + t.Skip("stress test that takes time to run") + } + if runtime.NumCPU() < 4 { + t.Skip("want at least 4 CPUs for this test") + } + got := runTestProg(t, "testprog", "GCMemoryLimitNoGCPercent") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got %q", want, got) + } +} diff --git a/src/runtime/gcinfo_test.go b/src/runtime/gcinfo_test.go new file mode 100644 index 0000000..787160d --- /dev/null +++ b/src/runtime/gcinfo_test.go @@ -0,0 +1,207 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "bytes" + "runtime" + "testing" +) + +const ( + typeScalar = 0 + typePointer = 1 +) + +// TestGCInfo tests that various objects in heap, data and bss receive correct GC pointer type info. 
+func TestGCInfo(t *testing.T) { + verifyGCInfo(t, "bss Ptr", &bssPtr, infoPtr) + verifyGCInfo(t, "bss ScalarPtr", &bssScalarPtr, infoScalarPtr) + verifyGCInfo(t, "bss PtrScalar", &bssPtrScalar, infoPtrScalar) + verifyGCInfo(t, "bss BigStruct", &bssBigStruct, infoBigStruct()) + verifyGCInfo(t, "bss string", &bssString, infoString) + verifyGCInfo(t, "bss slice", &bssSlice, infoSlice) + verifyGCInfo(t, "bss eface", &bssEface, infoEface) + verifyGCInfo(t, "bss iface", &bssIface, infoIface) + + verifyGCInfo(t, "data Ptr", &dataPtr, infoPtr) + verifyGCInfo(t, "data ScalarPtr", &dataScalarPtr, infoScalarPtr) + verifyGCInfo(t, "data PtrScalar", &dataPtrScalar, infoPtrScalar) + verifyGCInfo(t, "data BigStruct", &dataBigStruct, infoBigStruct()) + verifyGCInfo(t, "data string", &dataString, infoString) + verifyGCInfo(t, "data slice", &dataSlice, infoSlice) + verifyGCInfo(t, "data eface", &dataEface, infoEface) + verifyGCInfo(t, "data iface", &dataIface, infoIface) + + { + var x Ptr + verifyGCInfo(t, "stack Ptr", &x, infoPtr) + runtime.KeepAlive(x) + } + { + var x ScalarPtr + verifyGCInfo(t, "stack ScalarPtr", &x, infoScalarPtr) + runtime.KeepAlive(x) + } + { + var x PtrScalar + verifyGCInfo(t, "stack PtrScalar", &x, infoPtrScalar) + runtime.KeepAlive(x) + } + { + var x BigStruct + verifyGCInfo(t, "stack BigStruct", &x, infoBigStruct()) + runtime.KeepAlive(x) + } + { + var x string + verifyGCInfo(t, "stack string", &x, infoString) + runtime.KeepAlive(x) + } + { + var x []string + verifyGCInfo(t, "stack slice", &x, infoSlice) + runtime.KeepAlive(x) + } + { + var x any + verifyGCInfo(t, "stack eface", &x, infoEface) + runtime.KeepAlive(x) + } + { + var x Iface + verifyGCInfo(t, "stack iface", &x, infoIface) + runtime.KeepAlive(x) + } + + for i := 0; i < 10; i++ { + verifyGCInfo(t, "heap Ptr", runtime.Escape(new(Ptr)), trimDead(infoPtr)) + verifyGCInfo(t, "heap PtrSlice", runtime.Escape(&make([]*byte, 10)[0]), trimDead(infoPtr10)) + verifyGCInfo(t, "heap ScalarPtr", 
runtime.Escape(new(ScalarPtr)), trimDead(infoScalarPtr)) + verifyGCInfo(t, "heap ScalarPtrSlice", runtime.Escape(&make([]ScalarPtr, 4)[0]), trimDead(infoScalarPtr4)) + verifyGCInfo(t, "heap PtrScalar", runtime.Escape(new(PtrScalar)), trimDead(infoPtrScalar)) + verifyGCInfo(t, "heap BigStruct", runtime.Escape(new(BigStruct)), trimDead(infoBigStruct())) + verifyGCInfo(t, "heap string", runtime.Escape(new(string)), trimDead(infoString)) + verifyGCInfo(t, "heap eface", runtime.Escape(new(any)), trimDead(infoEface)) + verifyGCInfo(t, "heap iface", runtime.Escape(new(Iface)), trimDead(infoIface)) + } +} + +func verifyGCInfo(t *testing.T, name string, p any, mask0 []byte) { + mask := runtime.GCMask(p) + if !bytes.Equal(mask, mask0) { + t.Errorf("bad GC program for %v:\nwant %+v\ngot %+v", name, mask0, mask) + return + } +} + +func trimDead(mask []byte) []byte { + for len(mask) > 0 && mask[len(mask)-1] == typeScalar { + mask = mask[:len(mask)-1] + } + return mask +} + +var infoPtr = []byte{typePointer} + +type Ptr struct { + *byte +} + +var infoPtr10 = []byte{typePointer, typePointer, typePointer, typePointer, typePointer, typePointer, typePointer, typePointer, typePointer, typePointer} + +type ScalarPtr struct { + q int + w *int + e int + r *int + t int + y *int +} + +var infoScalarPtr = []byte{typeScalar, typePointer, typeScalar, typePointer, typeScalar, typePointer} + +var infoScalarPtr4 = append(append(append(append([]byte(nil), infoScalarPtr...), infoScalarPtr...), infoScalarPtr...), infoScalarPtr...) 
+ +type PtrScalar struct { + q *int + w int + e *int + r int + t *int + y int +} + +var infoPtrScalar = []byte{typePointer, typeScalar, typePointer, typeScalar, typePointer, typeScalar} + +type BigStruct struct { + q *int + w byte + e [17]byte + r []byte + t int + y uint16 + u uint64 + i string +} + +func infoBigStruct() []byte { + switch runtime.GOARCH { + case "386", "arm", "mips", "mipsle": + return []byte{ + typePointer, // q *int + typeScalar, typeScalar, typeScalar, typeScalar, typeScalar, // w byte; e [17]byte + typePointer, typeScalar, typeScalar, // r []byte + typeScalar, typeScalar, typeScalar, typeScalar, // t int; y uint16; u uint64 + typePointer, typeScalar, // i string + } + case "arm64", "amd64", "loong64", "mips64", "mips64le", "ppc64", "ppc64le", "riscv64", "s390x", "wasm": + return []byte{ + typePointer, // q *int + typeScalar, typeScalar, typeScalar, // w byte; e [17]byte + typePointer, typeScalar, typeScalar, // r []byte + typeScalar, typeScalar, typeScalar, // t int; y uint16; u uint64 + typePointer, typeScalar, // i string + } + default: + panic("unknown arch") + } +} + +type Iface interface { + f() +} + +type IfaceImpl int + +func (IfaceImpl) f() { +} + +var ( + // BSS + bssPtr Ptr + bssScalarPtr ScalarPtr + bssPtrScalar PtrScalar + bssBigStruct BigStruct + bssString string + bssSlice []string + bssEface any + bssIface Iface + + // DATA + dataPtr = Ptr{new(byte)} + dataScalarPtr = ScalarPtr{q: 1} + dataPtrScalar = PtrScalar{w: 1} + dataBigStruct = BigStruct{w: 1} + dataString = "foo" + dataSlice = []string{"foo"} + dataEface any = 42 + dataIface Iface = IfaceImpl(42) + + infoString = []byte{typePointer, typeScalar} + infoSlice = []byte{typePointer, typeScalar, typeScalar} + infoEface = []byte{typeScalar, typePointer} + infoIface = []byte{typeScalar, typePointer} +) diff --git a/src/runtime/go_tls.h b/src/runtime/go_tls.h new file mode 100644 index 0000000..a47e798 --- /dev/null +++ b/src/runtime/go_tls.h @@ -0,0 +1,17 @@ +// Copyright 2014 
The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#ifdef GOARCH_arm +#define LR R14 +#endif + +#ifdef GOARCH_amd64 +#define get_tls(r) MOVQ TLS, r +#define g(r) 0(r)(TLS*1) +#endif + +#ifdef GOARCH_386 +#define get_tls(r) MOVL TLS, r +#define g(r) 0(r)(TLS*1) +#endif diff --git a/src/runtime/hash32.go b/src/runtime/hash32.go new file mode 100644 index 0000000..0616c7d --- /dev/null +++ b/src/runtime/hash32.go @@ -0,0 +1,62 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Hashing algorithm inspired by +// wyhash: https://github.com/wangyi-fudan/wyhash/blob/ceb019b530e2c1c14d70b79bfa2bc49de7d95bc1/Modern%20Non-Cryptographic%20Hash%20Function%20and%20Pseudorandom%20Number%20Generator.pdf + +//go:build 386 || arm || mips || mipsle + +package runtime + +import "unsafe" + +func memhash32Fallback(p unsafe.Pointer, seed uintptr) uintptr { + a, b := mix32(uint32(seed), uint32(4^hashkey[0])) + t := readUnaligned32(p) + a ^= t + b ^= t + a, b = mix32(a, b) + a, b = mix32(a, b) + return uintptr(a ^ b) +} + +func memhash64Fallback(p unsafe.Pointer, seed uintptr) uintptr { + a, b := mix32(uint32(seed), uint32(8^hashkey[0])) + a ^= readUnaligned32(p) + b ^= readUnaligned32(add(p, 4)) + a, b = mix32(a, b) + a, b = mix32(a, b) + return uintptr(a ^ b) +} + +func memhashFallback(p unsafe.Pointer, seed, s uintptr) uintptr { + + a, b := mix32(uint32(seed), uint32(s^hashkey[0])) + if s == 0 { + return uintptr(a ^ b) + } + for ; s > 8; s -= 8 { + a ^= readUnaligned32(p) + b ^= readUnaligned32(add(p, 4)) + a, b = mix32(a, b) + p = add(p, 8) + } + if s >= 4 { + a ^= readUnaligned32(p) + b ^= readUnaligned32(add(p, s-4)) + } else { + t := uint32(*(*byte)(p)) + t |= uint32(*(*byte)(add(p, s>>1))) << 8 + t |= uint32(*(*byte)(add(p, s-1))) << 16 + b ^= t + } + a, b = mix32(a, b) 
+ a, b = mix32(a, b) + return uintptr(a ^ b) +} + +func mix32(a, b uint32) (uint32, uint32) { + c := uint64(a^uint32(hashkey[1])) * uint64(b^uint32(hashkey[2])) + return uint32(c), uint32(c >> 32) +} diff --git a/src/runtime/hash64.go b/src/runtime/hash64.go new file mode 100644 index 0000000..2864a4b --- /dev/null +++ b/src/runtime/hash64.go @@ -0,0 +1,92 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Hashing algorithm inspired by +// wyhash: https://github.com/wangyi-fudan/wyhash + +//go:build amd64 || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x || wasm + +package runtime + +import ( + "runtime/internal/math" + "unsafe" +) + +const ( + m1 = 0xa0761d6478bd642f + m2 = 0xe7037ed1a0b428db + m3 = 0x8ebc6af09c88c6e3 + m4 = 0x589965cc75374cc3 + m5 = 0x1d8e4e27c47d124f +) + +func memhashFallback(p unsafe.Pointer, seed, s uintptr) uintptr { + var a, b uintptr + seed ^= hashkey[0] ^ m1 + switch { + case s == 0: + return seed + case s < 4: + a = uintptr(*(*byte)(p)) + a |= uintptr(*(*byte)(add(p, s>>1))) << 8 + a |= uintptr(*(*byte)(add(p, s-1))) << 16 + case s == 4: + a = r4(p) + b = a + case s < 8: + a = r4(p) + b = r4(add(p, s-4)) + case s == 8: + a = r8(p) + b = a + case s <= 16: + a = r8(p) + b = r8(add(p, s-8)) + default: + l := s + if l > 48 { + seed1 := seed + seed2 := seed + for ; l > 48; l -= 48 { + seed = mix(r8(p)^m2, r8(add(p, 8))^seed) + seed1 = mix(r8(add(p, 16))^m3, r8(add(p, 24))^seed1) + seed2 = mix(r8(add(p, 32))^m4, r8(add(p, 40))^seed2) + p = add(p, 48) + } + seed ^= seed1 ^ seed2 + } + for ; l > 16; l -= 16 { + seed = mix(r8(p)^m2, r8(add(p, 8))^seed) + p = add(p, 16) + } + a = r8(add(p, l-16)) + b = r8(add(p, l-8)) + } + + return mix(m5^s, mix(a^m2, b^seed)) +} + +func memhash32Fallback(p unsafe.Pointer, seed uintptr) uintptr { + a := r4(p) + return mix(m5^4, mix(a^m2, a^seed^hashkey[0]^m1)) +} 
+ +func memhash64Fallback(p unsafe.Pointer, seed uintptr) uintptr { + a := r8(p) + return mix(m5^8, mix(a^m2, a^seed^hashkey[0]^m1)) +} + +func mix(a, b uintptr) uintptr { + hi, lo := math.Mul64(uint64(a), uint64(b)) + return uintptr(hi ^ lo) +} + +func r4(p unsafe.Pointer) uintptr { + return uintptr(readUnaligned32(p)) +} + +func r8(p unsafe.Pointer) uintptr { + return uintptr(readUnaligned64(p)) +} diff --git a/src/runtime/hash_test.go b/src/runtime/hash_test.go new file mode 100644 index 0000000..d4a2b3f --- /dev/null +++ b/src/runtime/hash_test.go @@ -0,0 +1,783 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "internal/race" + "math" + "math/rand" + . "runtime" + "strings" + "testing" + "unsafe" +) + +func TestMemHash32Equality(t *testing.T) { + if *UseAeshash { + t.Skip("skipping since AES hash implementation is used") + } + var b [4]byte + r := rand.New(rand.NewSource(1234)) + seed := uintptr(r.Uint64()) + for i := 0; i < 100; i++ { + randBytes(r, b[:]) + got := MemHash32(unsafe.Pointer(&b), seed) + want := MemHash(unsafe.Pointer(&b), seed, 4) + if got != want { + t.Errorf("MemHash32(%x, %v) = %v; want %v", b, seed, got, want) + } + } +} + +func TestMemHash64Equality(t *testing.T) { + if *UseAeshash { + t.Skip("skipping since AES hash implementation is used") + } + var b [8]byte + r := rand.New(rand.NewSource(1234)) + seed := uintptr(r.Uint64()) + for i := 0; i < 100; i++ { + randBytes(r, b[:]) + got := MemHash64(unsafe.Pointer(&b), seed) + want := MemHash(unsafe.Pointer(&b), seed, 8) + if got != want { + t.Errorf("MemHash64(%x, %v) = %v; want %v", b, seed, got, want) + } + } +} + +// Smhasher is a torture test for hash functions. +// https://code.google.com/p/smhasher/ +// This code is a port of some of the Smhasher tests to Go. +// +// The current AES hash function passes Smhasher. 
Our fallback +// hash functions don't, so we only enable the difficult tests when +// we know the AES implementation is available. + +// Sanity checks. +// hash should not depend on values outside key. +// hash should not depend on alignment. +func TestSmhasherSanity(t *testing.T) { + r := rand.New(rand.NewSource(1234)) + const REP = 10 + const KEYMAX = 128 + const PAD = 16 + const OFFMAX = 16 + for k := 0; k < REP; k++ { + for n := 0; n < KEYMAX; n++ { + for i := 0; i < OFFMAX; i++ { + var b [KEYMAX + OFFMAX + 2*PAD]byte + var c [KEYMAX + OFFMAX + 2*PAD]byte + randBytes(r, b[:]) + randBytes(r, c[:]) + copy(c[PAD+i:PAD+i+n], b[PAD:PAD+n]) + if BytesHash(b[PAD:PAD+n], 0) != BytesHash(c[PAD+i:PAD+i+n], 0) { + t.Errorf("hash depends on bytes outside key") + } + } + } + } +} + +type HashSet struct { + m map[uintptr]struct{} // set of hashes added + n int // number of hashes added +} + +func newHashSet() *HashSet { + return &HashSet{make(map[uintptr]struct{}), 0} +} +func (s *HashSet) add(h uintptr) { + s.m[h] = struct{}{} + s.n++ +} +func (s *HashSet) addS(x string) { + s.add(StringHash(x, 0)) +} +func (s *HashSet) addB(x []byte) { + s.add(BytesHash(x, 0)) +} +func (s *HashSet) addS_seed(x string, seed uintptr) { + s.add(StringHash(x, seed)) +} +func (s *HashSet) check(t *testing.T) { + const SLOP = 50.0 + collisions := s.n - len(s.m) + pairs := int64(s.n) * int64(s.n-1) / 2 + expected := float64(pairs) / math.Pow(2.0, float64(hashSize)) + stddev := math.Sqrt(expected) + if float64(collisions) > expected+SLOP*(3*stddev+1) { + t.Errorf("unexpected number of collisions: got=%d mean=%f stddev=%f threshold=%f", collisions, expected, stddev, expected+SLOP*(3*stddev+1)) + } +} + +// a string plus adding zeros must make distinct hashes +func TestSmhasherAppendedZeros(t *testing.T) { + s := "hello" + strings.Repeat("\x00", 256) + h := newHashSet() + for i := 0; i <= len(s); i++ { + h.addS(s[:i]) + } + h.check(t) +} + +// All 0-3 byte strings have distinct hashes. 
+func TestSmhasherSmallKeys(t *testing.T) { + if race.Enabled { + t.Skip("Too long for race mode") + } + h := newHashSet() + var b [3]byte + for i := 0; i < 256; i++ { + b[0] = byte(i) + h.addB(b[:1]) + for j := 0; j < 256; j++ { + b[1] = byte(j) + h.addB(b[:2]) + if !testing.Short() { + for k := 0; k < 256; k++ { + b[2] = byte(k) + h.addB(b[:3]) + } + } + } + } + h.check(t) +} + +// Different length strings of all zeros have distinct hashes. +func TestSmhasherZeros(t *testing.T) { + N := 256 * 1024 + if testing.Short() { + N = 1024 + } + h := newHashSet() + b := make([]byte, N) + for i := 0; i <= N; i++ { + h.addB(b[:i]) + } + h.check(t) +} + +// Strings with up to two nonzero bytes all have distinct hashes. +func TestSmhasherTwoNonzero(t *testing.T) { + if GOARCH == "wasm" { + t.Skip("Too slow on wasm") + } + if testing.Short() { + t.Skip("Skipping in short mode") + } + if race.Enabled { + t.Skip("Too long for race mode") + } + h := newHashSet() + for n := 2; n <= 16; n++ { + twoNonZero(h, n) + } + h.check(t) +} +func twoNonZero(h *HashSet, n int) { + b := make([]byte, n) + + // all zero + h.addB(b) + + // one non-zero byte + for i := 0; i < n; i++ { + for x := 1; x < 256; x++ { + b[i] = byte(x) + h.addB(b) + b[i] = 0 + } + } + + // two non-zero bytes + for i := 0; i < n; i++ { + for x := 1; x < 256; x++ { + b[i] = byte(x) + for j := i + 1; j < n; j++ { + for y := 1; y < 256; y++ { + b[j] = byte(y) + h.addB(b) + b[j] = 0 + } + } + b[i] = 0 + } + } +} + +// Test strings with repeats, like "abcdabcdabcdabcd..." 
+func TestSmhasherCyclic(t *testing.T) { + if testing.Short() { + t.Skip("Skipping in short mode") + } + if race.Enabled { + t.Skip("Too long for race mode") + } + r := rand.New(rand.NewSource(1234)) + const REPEAT = 8 + const N = 1000000 + for n := 4; n <= 12; n++ { + h := newHashSet() + b := make([]byte, REPEAT*n) + for i := 0; i < N; i++ { + b[0] = byte(i * 79 % 97) + b[1] = byte(i * 43 % 137) + b[2] = byte(i * 151 % 197) + b[3] = byte(i * 199 % 251) + randBytes(r, b[4:n]) + for j := n; j < n*REPEAT; j++ { + b[j] = b[j-n] + } + h.addB(b) + } + h.check(t) + } +} + +// Test strings with only a few bits set +func TestSmhasherSparse(t *testing.T) { + if GOARCH == "wasm" { + t.Skip("Too slow on wasm") + } + if testing.Short() { + t.Skip("Skipping in short mode") + } + sparse(t, 32, 6) + sparse(t, 40, 6) + sparse(t, 48, 5) + sparse(t, 56, 5) + sparse(t, 64, 5) + sparse(t, 96, 4) + sparse(t, 256, 3) + sparse(t, 2048, 2) +} +func sparse(t *testing.T, n int, k int) { + b := make([]byte, n/8) + h := newHashSet() + setbits(h, b, 0, k) + h.check(t) +} + +// set up to k bits at index i and greater +func setbits(h *HashSet, b []byte, i int, k int) { + h.addB(b) + if k == 0 { + return + } + for j := i; j < len(b)*8; j++ { + b[j/8] |= byte(1 << uint(j&7)) + setbits(h, b, j+1, k-1) + b[j/8] &= byte(^(1 << uint(j&7))) + } +} + +// Test all possible combinations of n blocks from the set s. +// "permutation" is a bad name here, but it is what Smhasher uses. 
+func TestSmhasherPermutation(t *testing.T) { + if GOARCH == "wasm" { + t.Skip("Too slow on wasm") + } + if testing.Short() { + t.Skip("Skipping in short mode") + } + if race.Enabled { + t.Skip("Too long for race mode") + } + permutation(t, []uint32{0, 1, 2, 3, 4, 5, 6, 7}, 8) + permutation(t, []uint32{0, 1 << 29, 2 << 29, 3 << 29, 4 << 29, 5 << 29, 6 << 29, 7 << 29}, 8) + permutation(t, []uint32{0, 1}, 20) + permutation(t, []uint32{0, 1 << 31}, 20) + permutation(t, []uint32{0, 1, 2, 3, 4, 5, 6, 7, 1 << 29, 2 << 29, 3 << 29, 4 << 29, 5 << 29, 6 << 29, 7 << 29}, 6) +} +func permutation(t *testing.T, s []uint32, n int) { + b := make([]byte, n*4) + h := newHashSet() + genPerm(h, b, s, 0) + h.check(t) +} +func genPerm(h *HashSet, b []byte, s []uint32, n int) { + h.addB(b[:n]) + if n == len(b) { + return + } + for _, v := range s { + b[n] = byte(v) + b[n+1] = byte(v >> 8) + b[n+2] = byte(v >> 16) + b[n+3] = byte(v >> 24) + genPerm(h, b, s, n+4) + } +} + +type Key interface { + clear() // set bits all to 0 + random(r *rand.Rand) // set key to something random + bits() int // how many bits key has + flipBit(i int) // flip bit i of the key + hash() uintptr // hash the key + name() string // for error reporting +} + +type BytesKey struct { + b []byte +} + +func (k *BytesKey) clear() { + for i := range k.b { + k.b[i] = 0 + } +} +func (k *BytesKey) random(r *rand.Rand) { + randBytes(r, k.b) +} +func (k *BytesKey) bits() int { + return len(k.b) * 8 +} +func (k *BytesKey) flipBit(i int) { + k.b[i>>3] ^= byte(1 << uint(i&7)) +} +func (k *BytesKey) hash() uintptr { + return BytesHash(k.b, 0) +} +func (k *BytesKey) name() string { + return fmt.Sprintf("bytes%d", len(k.b)) +} + +type Int32Key struct { + i uint32 +} + +func (k *Int32Key) clear() { + k.i = 0 +} +func (k *Int32Key) random(r *rand.Rand) { + k.i = r.Uint32() +} +func (k *Int32Key) bits() int { + return 32 +} +func (k *Int32Key) flipBit(i int) { + k.i ^= 1 << uint(i) +} +func (k *Int32Key) hash() uintptr { + return 
Int32Hash(k.i, 0) +} +func (k *Int32Key) name() string { + return "int32" +} + +type Int64Key struct { + i uint64 +} + +func (k *Int64Key) clear() { + k.i = 0 +} +func (k *Int64Key) random(r *rand.Rand) { + k.i = uint64(r.Uint32()) + uint64(r.Uint32())<<32 +} +func (k *Int64Key) bits() int { + return 64 +} +func (k *Int64Key) flipBit(i int) { + k.i ^= 1 << uint(i) +} +func (k *Int64Key) hash() uintptr { + return Int64Hash(k.i, 0) +} +func (k *Int64Key) name() string { + return "int64" +} + +type EfaceKey struct { + i any +} + +func (k *EfaceKey) clear() { + k.i = nil +} +func (k *EfaceKey) random(r *rand.Rand) { + k.i = uint64(r.Int63()) +} +func (k *EfaceKey) bits() int { + // use 64 bits. This tests inlined interfaces + // on 64-bit targets and indirect interfaces on + // 32-bit targets. + return 64 +} +func (k *EfaceKey) flipBit(i int) { + k.i = k.i.(uint64) ^ uint64(1)<<uint(i) +} +func (k *EfaceKey) hash() uintptr { + return EfaceHash(k.i, 0) +} +func (k *EfaceKey) name() string { + return "Eface" +} + +type IfaceKey struct { + i interface { + F() + } +} +type fInter uint64 + +func (x fInter) F() { +} + +func (k *IfaceKey) clear() { + k.i = nil +} +func (k *IfaceKey) random(r *rand.Rand) { + k.i = fInter(r.Int63()) +} +func (k *IfaceKey) bits() int { + // use 64 bits. This tests inlined interfaces + // on 64-bit targets and indirect interfaces on + // 32-bit targets. + return 64 +} +func (k *IfaceKey) flipBit(i int) { + k.i = k.i.(fInter) ^ fInter(1)<<uint(i) +} +func (k *IfaceKey) hash() uintptr { + return IfaceHash(k.i, 0) +} +func (k *IfaceKey) name() string { + return "Iface" +} + +// Flipping a single bit of a key should flip each output bit with 50% probability. 
+func TestSmhasherAvalanche(t *testing.T) { + if GOARCH == "wasm" { + t.Skip("Too slow on wasm") + } + if testing.Short() { + t.Skip("Skipping in short mode") + } + if race.Enabled { + t.Skip("Too long for race mode") + } + avalancheTest1(t, &BytesKey{make([]byte, 2)}) + avalancheTest1(t, &BytesKey{make([]byte, 4)}) + avalancheTest1(t, &BytesKey{make([]byte, 8)}) + avalancheTest1(t, &BytesKey{make([]byte, 16)}) + avalancheTest1(t, &BytesKey{make([]byte, 32)}) + avalancheTest1(t, &BytesKey{make([]byte, 200)}) + avalancheTest1(t, &Int32Key{}) + avalancheTest1(t, &Int64Key{}) + avalancheTest1(t, &EfaceKey{}) + avalancheTest1(t, &IfaceKey{}) +} +func avalancheTest1(t *testing.T, k Key) { + const REP = 100000 + r := rand.New(rand.NewSource(1234)) + n := k.bits() + + // grid[i][j] is a count of whether flipping + // input bit i affects output bit j. + grid := make([][hashSize]int, n) + + for z := 0; z < REP; z++ { + // pick a random key, hash it + k.random(r) + h := k.hash() + + // flip each bit, hash & compare the results + for i := 0; i < n; i++ { + k.flipBit(i) + d := h ^ k.hash() + k.flipBit(i) + + // record the effects of that bit flip + g := &grid[i] + for j := 0; j < hashSize; j++ { + g[j] += int(d & 1) + d >>= 1 + } + } + } + + // Each entry in the grid should be about REP/2. + // More precisely, we did N = k.bits() * hashSize experiments where + // each is the sum of REP coin flips. We want to find bounds on the + // sum of coin flips such that a truly random experiment would have + // all sums inside those bounds with 99% probability. 
+ N := n * hashSize + var c float64 + // find c such that Prob(mean-c*stddev < x < mean+c*stddev)^N > .9999 + for c = 0.0; math.Pow(math.Erf(c/math.Sqrt(2)), float64(N)) < .9999; c += .1 { + } + c *= 4.0 // allowed slack - we don't need to be perfectly random + mean := .5 * REP + stddev := .5 * math.Sqrt(REP) + low := int(mean - c*stddev) + high := int(mean + c*stddev) + for i := 0; i < n; i++ { + for j := 0; j < hashSize; j++ { + x := grid[i][j] + if x < low || x > high { + t.Errorf("bad bias for %s bit %d -> bit %d: %d/%d\n", k.name(), i, j, x, REP) + } + } + } +} + +// All bit rotations of a set of distinct keys +func TestSmhasherWindowed(t *testing.T) { + if race.Enabled { + t.Skip("Too long for race mode") + } + t.Logf("32 bit keys") + windowed(t, &Int32Key{}) + t.Logf("64 bit keys") + windowed(t, &Int64Key{}) + t.Logf("string keys") + windowed(t, &BytesKey{make([]byte, 128)}) +} +func windowed(t *testing.T, k Key) { + if GOARCH == "wasm" { + t.Skip("Too slow on wasm") + } + if PtrSize == 4 { + // This test tends to be flaky on 32-bit systems. + // There aren't enough bits in the hash output, so we + // expect a nontrivial number of collisions, and it is + // often quite a bit higher than expected. See issue 43130. + t.Skip("Flaky on 32-bit systems") + } + if testing.Short() { + t.Skip("Skipping in short mode") + } + const BITS = 16 + + for r := 0; r < k.bits(); r++ { + h := newHashSet() + for i := 0; i < 1<<BITS; i++ { + k.clear() + for j := 0; j < BITS; j++ { + if i>>uint(j)&1 != 0 { + k.flipBit((j + r) % k.bits()) + } + } + h.add(k.hash()) + } + h.check(t) + } +} + +// All keys of the form prefix + [A-Ta-t0-9]*N + suffix. 
+func TestSmhasherText(t *testing.T) { + if testing.Short() { + t.Skip("Skipping in short mode") + } + text(t, "Foo", "Bar") + text(t, "FooBar", "") + text(t, "", "FooBar") +} +func text(t *testing.T, prefix, suffix string) { + const N = 4 + const S = "ABCDEFGHIJKLMNOPQRSTabcdefghijklmnopqrst0123456789" + const L = len(S) + b := make([]byte, len(prefix)+N+len(suffix)) + copy(b, prefix) + copy(b[len(prefix)+N:], suffix) + h := newHashSet() + c := b[len(prefix):] + for i := 0; i < L; i++ { + c[0] = S[i] + for j := 0; j < L; j++ { + c[1] = S[j] + for k := 0; k < L; k++ { + c[2] = S[k] + for x := 0; x < L; x++ { + c[3] = S[x] + h.addB(b) + } + } + } + } + h.check(t) +} + +// Make sure different seed values generate different hashes. +func TestSmhasherSeed(t *testing.T) { + h := newHashSet() + const N = 100000 + s := "hello" + for i := 0; i < N; i++ { + h.addS_seed(s, uintptr(i)) + } + h.check(t) +} + +// size of the hash output (32 or 64 bits) +const hashSize = 32 + int(^uintptr(0)>>63<<5) + +func randBytes(r *rand.Rand, b []byte) { + for i := range b { + b[i] = byte(r.Uint32()) + } +} + +func benchmarkHash(b *testing.B, n int) { + s := strings.Repeat("A", n) + + for i := 0; i < b.N; i++ { + StringHash(s, 0) + } + b.SetBytes(int64(n)) +} + +func BenchmarkHash5(b *testing.B) { benchmarkHash(b, 5) } +func BenchmarkHash16(b *testing.B) { benchmarkHash(b, 16) } +func BenchmarkHash64(b *testing.B) { benchmarkHash(b, 64) } +func BenchmarkHash1024(b *testing.B) { benchmarkHash(b, 1024) } +func BenchmarkHash65536(b *testing.B) { benchmarkHash(b, 65536) } + +func TestArrayHash(t *testing.T) { + // Make sure that "" in arrays hash correctly. The hash + // should at least scramble the input seed so that, e.g., + // {"","foo"} and {"foo",""} have different hashes. + + // If the hash is bad, then all (8 choose 4) = 70 keys + // have the same hash. If so, we allocate 70/8 = 8 + // overflow buckets. 
If the hash is good we don't + // normally allocate any overflow buckets, and the + // probability of even one or two overflows goes down rapidly. + // (There is always 1 allocation of the bucket array. The map + // header is allocated on the stack.) + f := func() { + // Make the key type at most 128 bytes. Otherwise, + // we get an allocation per key. + type key [8]string + m := make(map[key]bool, 70) + + // fill m with keys that have 4 "foo"s and 4 ""s. + for i := 0; i < 256; i++ { + var k key + cnt := 0 + for j := uint(0); j < 8; j++ { + if i>>j&1 != 0 { + k[j] = "foo" + cnt++ + } + } + if cnt == 4 { + m[k] = true + } + } + if len(m) != 70 { + t.Errorf("bad test: (8 choose 4) should be 70, not %d", len(m)) + } + } + if n := testing.AllocsPerRun(10, f); n > 6 { + t.Errorf("too many allocs %f - hash not balanced", n) + } +} +func TestStructHash(t *testing.T) { + // See the comment in TestArrayHash. + f := func() { + type key struct { + a, b, c, d, e, f, g, h string + } + m := make(map[key]bool, 70) + + // fill m with keys that have 4 "foo"s and 4 ""s. 
+ for i := 0; i < 256; i++ { + var k key + cnt := 0 + if i&1 != 0 { + k.a = "foo" + cnt++ + } + if i&2 != 0 { + k.b = "foo" + cnt++ + } + if i&4 != 0 { + k.c = "foo" + cnt++ + } + if i&8 != 0 { + k.d = "foo" + cnt++ + } + if i&16 != 0 { + k.e = "foo" + cnt++ + } + if i&32 != 0 { + k.f = "foo" + cnt++ + } + if i&64 != 0 { + k.g = "foo" + cnt++ + } + if i&128 != 0 { + k.h = "foo" + cnt++ + } + if cnt == 4 { + m[k] = true + } + } + if len(m) != 70 { + t.Errorf("bad test: (8 choose 4) should be 70, not %d", len(m)) + } + } + if n := testing.AllocsPerRun(10, f); n > 6 { + t.Errorf("too many allocs %f - hash not balanced", n) + } +} + +var sink uint64 + +func BenchmarkAlignedLoad(b *testing.B) { + var buf [16]byte + p := unsafe.Pointer(&buf[0]) + var s uint64 + for i := 0; i < b.N; i++ { + s += ReadUnaligned64(p) + } + sink = s +} + +func BenchmarkUnalignedLoad(b *testing.B) { + var buf [16]byte + p := unsafe.Pointer(&buf[1]) + var s uint64 + for i := 0; i < b.N; i++ { + s += ReadUnaligned64(p) + } + sink = s +} + +func TestCollisions(t *testing.T) { + if testing.Short() { + t.Skip("Skipping in short mode") + } + for i := 0; i < 16; i++ { + for j := 0; j < 16; j++ { + if j == i { + continue + } + var a [16]byte + m := make(map[uint16]struct{}, 1<<16) + for n := 0; n < 1<<16; n++ { + a[i] = byte(n) + a[j] = byte(n >> 8) + m[uint16(BytesHash(a[:], 0))] = struct{}{} + } + if len(m) <= 1<<15 { + t.Errorf("too many collisions i=%d j=%d outputs=%d out of 65536\n", i, j, len(m)) + } + } + } +} diff --git a/src/runtime/heapdump.go b/src/runtime/heapdump.go new file mode 100644 index 0000000..f57a1a1 --- /dev/null +++ b/src/runtime/heapdump.go @@ -0,0 +1,748 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Implementation of runtime/debug.WriteHeapDump. Writes all +// objects in the heap plus additional info (roots, threads, +// finalizers, etc.) to a file. 
+ +// The format of the dumped file is described at +// https://golang.org/s/go15heapdump. + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +//go:linkname runtime_debug_WriteHeapDump runtime/debug.WriteHeapDump +func runtime_debug_WriteHeapDump(fd uintptr) { + stopTheWorld("write heap dump") + + // Keep m on this G's stack instead of the system stack. + // Both readmemstats_m and writeheapdump_m have pretty large + // peak stack depths and we risk blowing the system stack. + // This is safe because the world is stopped, so we don't + // need to worry about anyone shrinking and therefore moving + // our stack. + var m MemStats + systemstack(func() { + // Call readmemstats_m here instead of deeper in + // writeheapdump_m because we might blow the system stack + // otherwise. + readmemstats_m(&m) + writeheapdump_m(fd, &m) + }) + + startTheWorld() +} + +const ( + fieldKindEol = 0 + fieldKindPtr = 1 + fieldKindIface = 2 + fieldKindEface = 3 + tagEOF = 0 + tagObject = 1 + tagOtherRoot = 2 + tagType = 3 + tagGoroutine = 4 + tagStackFrame = 5 + tagParams = 6 + tagFinalizer = 7 + tagItab = 8 + tagOSThread = 9 + tagMemStats = 10 + tagQueuedFinalizer = 11 + tagData = 12 + tagBSS = 13 + tagDefer = 14 + tagPanic = 15 + tagMemProf = 16 + tagAllocSample = 17 +) + +var dumpfd uintptr // fd to write the dump to. 
+var tmpbuf []byte + +// buffer of pending write data +const ( + bufSize = 4096 +) + +var buf [bufSize]byte +var nbuf uintptr + +func dwrite(data unsafe.Pointer, len uintptr) { + if len == 0 { + return + } + if nbuf+len <= bufSize { + copy(buf[nbuf:], (*[bufSize]byte)(data)[:len]) + nbuf += len + return + } + + write(dumpfd, unsafe.Pointer(&buf), int32(nbuf)) + if len >= bufSize { + write(dumpfd, data, int32(len)) + nbuf = 0 + } else { + copy(buf[:], (*[bufSize]byte)(data)[:len]) + nbuf = len + } +} + +func dwritebyte(b byte) { + dwrite(unsafe.Pointer(&b), 1) +} + +func flush() { + write(dumpfd, unsafe.Pointer(&buf), int32(nbuf)) + nbuf = 0 +} + +// Cache of types that have been serialized already. +// We use a type's hash field to pick a bucket. +// Inside a bucket, we keep a list of types that +// have been serialized so far, most recently used first. +// Note: when a bucket overflows we may end up +// serializing a type more than once. That's ok. +const ( + typeCacheBuckets = 256 + typeCacheAssoc = 4 +) + +type typeCacheBucket struct { + t [typeCacheAssoc]*_type +} + +var typecache [typeCacheBuckets]typeCacheBucket + +// dump a uint64 in a varint format parseable by encoding/binary. +func dumpint(v uint64) { + var buf [10]byte + var n int + for v >= 0x80 { + buf[n] = byte(v | 0x80) + n++ + v >>= 7 + } + buf[n] = byte(v) + n++ + dwrite(unsafe.Pointer(&buf), uintptr(n)) +} + +func dumpbool(b bool) { + if b { + dumpint(1) + } else { + dumpint(0) + } +} + +// dump varint uint64 length followed by memory contents. +func dumpmemrange(data unsafe.Pointer, len uintptr) { + dumpint(uint64(len)) + dwrite(data, len) +} + +func dumpslice(b []byte) { + dumpint(uint64(len(b))) + if len(b) > 0 { + dwrite(unsafe.Pointer(&b[0]), uintptr(len(b))) + } +} + +func dumpstr(s string) { + dumpmemrange(unsafe.Pointer(unsafe.StringData(s)), uintptr(len(s))) +} + +// dump information for a type. 
+func dumptype(t *_type) { + if t == nil { + return + } + + // If we've definitely serialized the type before, + // no need to do it again. + b := &typecache[t.hash&(typeCacheBuckets-1)] + if t == b.t[0] { + return + } + for i := 1; i < typeCacheAssoc; i++ { + if t == b.t[i] { + // Move-to-front + for j := i; j > 0; j-- { + b.t[j] = b.t[j-1] + } + b.t[0] = t + return + } + } + + // Might not have been dumped yet. Dump it and + // remember we did so. + for j := typeCacheAssoc - 1; j > 0; j-- { + b.t[j] = b.t[j-1] + } + b.t[0] = t + + // dump the type + dumpint(tagType) + dumpint(uint64(uintptr(unsafe.Pointer(t)))) + dumpint(uint64(t.size)) + if x := t.uncommon(); x == nil || t.nameOff(x.pkgpath).name() == "" { + dumpstr(t.string()) + } else { + pkgpath := t.nameOff(x.pkgpath).name() + name := t.name() + dumpint(uint64(uintptr(len(pkgpath)) + 1 + uintptr(len(name)))) + dwrite(unsafe.Pointer(unsafe.StringData(pkgpath)), uintptr(len(pkgpath))) + dwritebyte('.') + dwrite(unsafe.Pointer(unsafe.StringData(name)), uintptr(len(name))) + } + dumpbool(t.kind&kindDirectIface == 0 || t.ptrdata != 0) +} + +// dump an object. +func dumpobj(obj unsafe.Pointer, size uintptr, bv bitvector) { + dumpint(tagObject) + dumpint(uint64(uintptr(obj))) + dumpmemrange(obj, size) + dumpfields(bv) +} + +func dumpotherroot(description string, to unsafe.Pointer) { + dumpint(tagOtherRoot) + dumpstr(description) + dumpint(uint64(uintptr(to))) +} + +func dumpfinalizer(obj unsafe.Pointer, fn *funcval, fint *_type, ot *ptrtype) { + dumpint(tagFinalizer) + dumpint(uint64(uintptr(obj))) + dumpint(uint64(uintptr(unsafe.Pointer(fn)))) + dumpint(uint64(uintptr(unsafe.Pointer(fn.fn)))) + dumpint(uint64(uintptr(unsafe.Pointer(fint)))) + dumpint(uint64(uintptr(unsafe.Pointer(ot)))) +} + +type childInfo struct { + // Information passed up from the callee frame about + // the layout of the outargs region. 
+ argoff uintptr // where the arguments start in the frame + arglen uintptr // size of args region + args bitvector // if args.n >= 0, pointer map of args region + sp *uint8 // callee sp + depth uintptr // depth in call stack (0 == most recent) +} + +// dump kinds & offsets of interesting fields in bv. +func dumpbv(cbv *bitvector, offset uintptr) { + for i := uintptr(0); i < uintptr(cbv.n); i++ { + if cbv.ptrbit(i) == 1 { + dumpint(fieldKindPtr) + dumpint(uint64(offset + i*goarch.PtrSize)) + } + } +} + +func dumpframe(s *stkframe, arg unsafe.Pointer) bool { + child := (*childInfo)(arg) + f := s.fn + + // Figure out what we can about our stack map + pc := s.pc + pcdata := int32(-1) // Use the entry map at function entry + if pc != f.entry() { + pc-- + pcdata = pcdatavalue(f, _PCDATA_StackMapIndex, pc, nil) + } + if pcdata == -1 { + // We do not have a valid pcdata value but there might be a + // stackmap for this function. It is likely that we are looking + // at the function prologue, assume so and hope for the best. + pcdata = 0 + } + stkmap := (*stackmap)(funcdata(f, _FUNCDATA_LocalsPointerMaps)) + + var bv bitvector + if stkmap != nil && stkmap.n > 0 { + bv = stackmapdata(stkmap, pcdata) + } else { + bv.n = -1 + } + + // Dump main body of stack frame. 
+ dumpint(tagStackFrame) + dumpint(uint64(s.sp)) // lowest address in frame + dumpint(uint64(child.depth)) // # of frames deep on the stack + dumpint(uint64(uintptr(unsafe.Pointer(child.sp)))) // sp of child, or 0 if bottom of stack + dumpmemrange(unsafe.Pointer(s.sp), s.fp-s.sp) // frame contents + dumpint(uint64(f.entry())) + dumpint(uint64(s.pc)) + dumpint(uint64(s.continpc)) + name := funcname(f) + if name == "" { + name = "unknown function" + } + dumpstr(name) + + // Dump fields in the outargs section + if child.args.n >= 0 { + dumpbv(&child.args, child.argoff) + } else { + // conservative - everything might be a pointer + for off := child.argoff; off < child.argoff+child.arglen; off += goarch.PtrSize { + dumpint(fieldKindPtr) + dumpint(uint64(off)) + } + } + + // Dump fields in the local vars section + if stkmap == nil { + // No locals information, dump everything. + for off := child.arglen; off < s.varp-s.sp; off += goarch.PtrSize { + dumpint(fieldKindPtr) + dumpint(uint64(off)) + } + } else if stkmap.n < 0 { + // Locals size information, dump just the locals. + size := uintptr(-stkmap.n) + for off := s.varp - size - s.sp; off < s.varp-s.sp; off += goarch.PtrSize { + dumpint(fieldKindPtr) + dumpint(uint64(off)) + } + } else if stkmap.n > 0 { + // Locals bitmap information, scan just the pointers in + // locals. + dumpbv(&bv, s.varp-uintptr(bv.n)*goarch.PtrSize-s.sp) + } + dumpint(fieldKindEol) + + // Record arg info for parent. 
+ child.argoff = s.argp - s.fp + child.arglen = s.argBytes() + child.sp = (*uint8)(unsafe.Pointer(s.sp)) + child.depth++ + stkmap = (*stackmap)(funcdata(f, _FUNCDATA_ArgsPointerMaps)) + if stkmap != nil { + child.args = stackmapdata(stkmap, pcdata) + } else { + child.args.n = -1 + } + return true +} + +func dumpgoroutine(gp *g) { + var sp, pc, lr uintptr + if gp.syscallsp != 0 { + sp = gp.syscallsp + pc = gp.syscallpc + lr = 0 + } else { + sp = gp.sched.sp + pc = gp.sched.pc + lr = gp.sched.lr + } + + dumpint(tagGoroutine) + dumpint(uint64(uintptr(unsafe.Pointer(gp)))) + dumpint(uint64(sp)) + dumpint(gp.goid) + dumpint(uint64(gp.gopc)) + dumpint(uint64(readgstatus(gp))) + dumpbool(isSystemGoroutine(gp, false)) + dumpbool(false) // isbackground + dumpint(uint64(gp.waitsince)) + dumpstr(gp.waitreason.String()) + dumpint(uint64(uintptr(gp.sched.ctxt))) + dumpint(uint64(uintptr(unsafe.Pointer(gp.m)))) + dumpint(uint64(uintptr(unsafe.Pointer(gp._defer)))) + dumpint(uint64(uintptr(unsafe.Pointer(gp._panic)))) + + // dump stack + var child childInfo + child.args.n = -1 + child.arglen = 0 + child.sp = nil + child.depth = 0 + gentraceback(pc, sp, lr, gp, 0, nil, 0x7fffffff, dumpframe, noescape(unsafe.Pointer(&child)), 0) + + // dump defer & panic records + for d := gp._defer; d != nil; d = d.link { + dumpint(tagDefer) + dumpint(uint64(uintptr(unsafe.Pointer(d)))) + dumpint(uint64(uintptr(unsafe.Pointer(gp)))) + dumpint(uint64(d.sp)) + dumpint(uint64(d.pc)) + fn := *(**funcval)(unsafe.Pointer(&d.fn)) + dumpint(uint64(uintptr(unsafe.Pointer(fn)))) + if d.fn == nil { + // d.fn can be nil for open-coded defers + dumpint(uint64(0)) + } else { + dumpint(uint64(uintptr(unsafe.Pointer(fn.fn)))) + } + dumpint(uint64(uintptr(unsafe.Pointer(d.link)))) + } + for p := gp._panic; p != nil; p = p.link { + dumpint(tagPanic) + dumpint(uint64(uintptr(unsafe.Pointer(p)))) + dumpint(uint64(uintptr(unsafe.Pointer(gp)))) + eface := efaceOf(&p.arg) + 
dumpint(uint64(uintptr(unsafe.Pointer(eface._type)))) + dumpint(uint64(uintptr(unsafe.Pointer(eface.data)))) + dumpint(0) // was p->defer, no longer recorded + dumpint(uint64(uintptr(unsafe.Pointer(p.link)))) + } +} + +func dumpgs() { + assertWorldStopped() + + // goroutines & stacks + forEachG(func(gp *g) { + status := readgstatus(gp) // The world is stopped so gp will not be in a scan state. + switch status { + default: + print("runtime: unexpected G.status ", hex(status), "\n") + throw("dumpgs in STW - bad status") + case _Gdead: + // ok + case _Grunnable, + _Gsyscall, + _Gwaiting: + dumpgoroutine(gp) + } + }) +} + +func finq_callback(fn *funcval, obj unsafe.Pointer, nret uintptr, fint *_type, ot *ptrtype) { + dumpint(tagQueuedFinalizer) + dumpint(uint64(uintptr(obj))) + dumpint(uint64(uintptr(unsafe.Pointer(fn)))) + dumpint(uint64(uintptr(unsafe.Pointer(fn.fn)))) + dumpint(uint64(uintptr(unsafe.Pointer(fint)))) + dumpint(uint64(uintptr(unsafe.Pointer(ot)))) +} + +func dumproots() { + // To protect mheap_.allspans. 
+ assertWorldStopped() + + // TODO(mwhudson): dump datamask etc from all objects + // data segment + dumpint(tagData) + dumpint(uint64(firstmoduledata.data)) + dumpmemrange(unsafe.Pointer(firstmoduledata.data), firstmoduledata.edata-firstmoduledata.data) + dumpfields(firstmoduledata.gcdatamask) + + // bss segment + dumpint(tagBSS) + dumpint(uint64(firstmoduledata.bss)) + dumpmemrange(unsafe.Pointer(firstmoduledata.bss), firstmoduledata.ebss-firstmoduledata.bss) + dumpfields(firstmoduledata.gcbssmask) + + // mspan.types + for _, s := range mheap_.allspans { + if s.state.get() == mSpanInUse { + // Finalizers + for sp := s.specials; sp != nil; sp = sp.next { + if sp.kind != _KindSpecialFinalizer { + continue + } + spf := (*specialfinalizer)(unsafe.Pointer(sp)) + p := unsafe.Pointer(s.base() + uintptr(spf.special.offset)) + dumpfinalizer(p, spf.fn, spf.fint, spf.ot) + } + } + } + + // Finalizer queue + iterate_finq(finq_callback) +} + +// Bit vector of free marks. +// Needs to be as big as the largest number of objects per span. +var freemark [_PageSize / 8]bool + +func dumpobjs() { + // To protect mheap_.allspans. 
+ assertWorldStopped() + + for _, s := range mheap_.allspans { + if s.state.get() != mSpanInUse { + continue + } + p := s.base() + size := s.elemsize + n := (s.npages << _PageShift) / size + if n > uintptr(len(freemark)) { + throw("freemark array doesn't have enough entries") + } + + for freeIndex := uintptr(0); freeIndex < s.nelems; freeIndex++ { + if s.isFree(freeIndex) { + freemark[freeIndex] = true + } + } + + for j := uintptr(0); j < n; j, p = j+1, p+size { + if freemark[j] { + freemark[j] = false + continue + } + dumpobj(unsafe.Pointer(p), size, makeheapobjbv(p, size)) + } + } +} + +func dumpparams() { + dumpint(tagParams) + x := uintptr(1) + if *(*byte)(unsafe.Pointer(&x)) == 1 { + dumpbool(false) // little-endian ptrs + } else { + dumpbool(true) // big-endian ptrs + } + dumpint(goarch.PtrSize) + var arenaStart, arenaEnd uintptr + for i1 := range mheap_.arenas { + if mheap_.arenas[i1] == nil { + continue + } + for i, ha := range mheap_.arenas[i1] { + if ha == nil { + continue + } + base := arenaBase(arenaIdx(i1)<<arenaL1Shift | arenaIdx(i)) + if arenaStart == 0 || base < arenaStart { + arenaStart = base + } + if base+heapArenaBytes > arenaEnd { + arenaEnd = base + heapArenaBytes + } + } + } + dumpint(uint64(arenaStart)) + dumpint(uint64(arenaEnd)) + dumpstr(goarch.GOARCH) + dumpstr(buildVersion) + dumpint(uint64(ncpu)) +} + +func itab_callback(tab *itab) { + t := tab._type + dumptype(t) + dumpint(tagItab) + dumpint(uint64(uintptr(unsafe.Pointer(tab)))) + dumpint(uint64(uintptr(unsafe.Pointer(t)))) +} + +func dumpitabs() { + iterate_itabs(itab_callback) +} + +func dumpms() { + for mp := allm; mp != nil; mp = mp.alllink { + dumpint(tagOSThread) + dumpint(uint64(uintptr(unsafe.Pointer(mp)))) + dumpint(uint64(mp.id)) + dumpint(mp.procid) + } +} + +//go:systemstack +func dumpmemstats(m *MemStats) { + assertWorldStopped() + + // These ints should be identical to the exported + // MemStats structure and should be ordered the same + // way too. 
+ dumpint(tagMemStats) + dumpint(m.Alloc) + dumpint(m.TotalAlloc) + dumpint(m.Sys) + dumpint(m.Lookups) + dumpint(m.Mallocs) + dumpint(m.Frees) + dumpint(m.HeapAlloc) + dumpint(m.HeapSys) + dumpint(m.HeapIdle) + dumpint(m.HeapInuse) + dumpint(m.HeapReleased) + dumpint(m.HeapObjects) + dumpint(m.StackInuse) + dumpint(m.StackSys) + dumpint(m.MSpanInuse) + dumpint(m.MSpanSys) + dumpint(m.MCacheInuse) + dumpint(m.MCacheSys) + dumpint(m.BuckHashSys) + dumpint(m.GCSys) + dumpint(m.OtherSys) + dumpint(m.NextGC) + dumpint(m.LastGC) + dumpint(m.PauseTotalNs) + for i := 0; i < 256; i++ { + dumpint(m.PauseNs[i]) + } + dumpint(uint64(m.NumGC)) +} + +func dumpmemprof_callback(b *bucket, nstk uintptr, pstk *uintptr, size, allocs, frees uintptr) { + stk := (*[100000]uintptr)(unsafe.Pointer(pstk)) + dumpint(tagMemProf) + dumpint(uint64(uintptr(unsafe.Pointer(b)))) + dumpint(uint64(size)) + dumpint(uint64(nstk)) + for i := uintptr(0); i < nstk; i++ { + pc := stk[i] + f := findfunc(pc) + if !f.valid() { + var buf [64]byte + n := len(buf) + n-- + buf[n] = ')' + if pc == 0 { + n-- + buf[n] = '0' + } else { + for pc > 0 { + n-- + buf[n] = "0123456789abcdef"[pc&15] + pc >>= 4 + } + } + n-- + buf[n] = 'x' + n-- + buf[n] = '0' + n-- + buf[n] = '(' + dumpslice(buf[n:]) + dumpstr("?") + dumpint(0) + } else { + dumpstr(funcname(f)) + if i > 0 && pc > f.entry() { + pc-- + } + file, line := funcline(f, pc) + dumpstr(file) + dumpint(uint64(line)) + } + } + dumpint(uint64(allocs)) + dumpint(uint64(frees)) +} + +func dumpmemprof() { + // To protect mheap_.allspans. 
+ assertWorldStopped() + + iterate_memprof(dumpmemprof_callback) + for _, s := range mheap_.allspans { + if s.state.get() != mSpanInUse { + continue + } + for sp := s.specials; sp != nil; sp = sp.next { + if sp.kind != _KindSpecialProfile { + continue + } + spp := (*specialprofile)(unsafe.Pointer(sp)) + p := s.base() + uintptr(spp.special.offset) + dumpint(tagAllocSample) + dumpint(uint64(p)) + dumpint(uint64(uintptr(unsafe.Pointer(spp.b)))) + } + } +} + +var dumphdr = []byte("go1.7 heap dump\n") + +func mdump(m *MemStats) { + assertWorldStopped() + + // make sure we're done sweeping + for _, s := range mheap_.allspans { + if s.state.get() == mSpanInUse { + s.ensureSwept() + } + } + memclrNoHeapPointers(unsafe.Pointer(&typecache), unsafe.Sizeof(typecache)) + dwrite(unsafe.Pointer(&dumphdr[0]), uintptr(len(dumphdr))) + dumpparams() + dumpitabs() + dumpobjs() + dumpgs() + dumpms() + dumproots() + dumpmemstats(m) + dumpmemprof() + dumpint(tagEOF) + flush() +} + +func writeheapdump_m(fd uintptr, m *MemStats) { + assertWorldStopped() + + gp := getg() + casGToWaiting(gp.m.curg, _Grunning, waitReasonDumpingHeap) + + // Set dump file. + dumpfd = fd + + // Call dump routine. + mdump(m) + + // Reset dump file. + dumpfd = 0 + if tmpbuf != nil { + sysFree(unsafe.Pointer(&tmpbuf[0]), uintptr(len(tmpbuf)), &memstats.other_sys) + tmpbuf = nil + } + + casgstatus(gp.m.curg, _Gwaiting, _Grunning) +} + +// dumpint() the kind & offset of each field in an object. +func dumpfields(bv bitvector) { + dumpbv(&bv, 0) + dumpint(fieldKindEol) +} + +func makeheapobjbv(p uintptr, size uintptr) bitvector { + // Extend the temp buffer if necessary. 
+ nptr := size / goarch.PtrSize + if uintptr(len(tmpbuf)) < nptr/8+1 { + if tmpbuf != nil { + sysFree(unsafe.Pointer(&tmpbuf[0]), uintptr(len(tmpbuf)), &memstats.other_sys) + } + n := nptr/8 + 1 + p := sysAlloc(n, &memstats.other_sys) + if p == nil { + throw("heapdump: out of memory") + } + tmpbuf = (*[1 << 30]byte)(p)[:n] + } + // Convert heap bitmap to pointer bitmap. + for i := uintptr(0); i < nptr/8+1; i++ { + tmpbuf[i] = 0 + } + + hbits := heapBitsForAddr(p, size) + for { + var addr uintptr + hbits, addr = hbits.next() + if addr == 0 { + break + } + i := (addr - p) / goarch.PtrSize + tmpbuf[i/8] |= 1 << (i % 8) + } + return bitvector{int32(nptr), &tmpbuf[0]} +} diff --git a/src/runtime/histogram.go b/src/runtime/histogram.go new file mode 100644 index 0000000..43dfe61 --- /dev/null +++ b/src/runtime/histogram.go @@ -0,0 +1,190 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +const ( + // For the time histogram type, we use an HDR histogram. + // Values are placed in buckets based solely on the most + // significant set bit. Thus, buckets are power-of-2 sized. + // Values are then placed into sub-buckets based on the value of + // the next timeHistSubBucketBits most significant bits. Thus, + // sub-buckets are linear within a bucket. + // + // Therefore, the number of sub-buckets (timeHistNumSubBuckets) + // defines the error. This error may be computed as + // 1/timeHistNumSubBuckets*100%. For example, for 16 sub-buckets + // per bucket the error is approximately 6%. + // + // The number of buckets (timeHistNumBuckets), on the + // other hand, defines the range. To avoid producing a large number + // of buckets that are close together, especially for small numbers + // (e.g. 
1, 2, 3, 4, 5 ns) that aren't very useful, timeHistNumBuckets + // is defined in terms of the least significant bit (timeHistMinBucketBits) + // that needs to be set before we start bucketing and the most + // significant bit (timeHistMaxBucketBits) that we bucket before we just + // dump it into a catch-all bucket. + // + // As an example, consider the configuration: + // + // timeHistMinBucketBits = 9 + // timeHistMaxBucketBits = 48 + // timeHistSubBucketBits = 2 + // + // Then: + // + // 011000001 + // ^-- + // │ ^ + // │ └---- Next 2 bits -> sub-bucket 3 + // └------- Bit 9 unset -> bucket 0 + // + // 110000001 + // ^-- + // │ ^ + // │ └---- Next 2 bits -> sub-bucket 2 + // └------- Bit 9 set -> bucket 1 + // + // 1000000010 + // ^-- ^ + // │ ^ └-- Lower bits ignored + // │ └---- Next 2 bits -> sub-bucket 0 + // └------- Bit 10 set -> bucket 2 + // + // Following this pattern, bucket 38 will have the bit 46 set. We don't + // have any buckets for higher values, so we spill the rest into an overflow + // bucket containing values of 2^47-1 nanoseconds or approx. 1 day or more. + // This range is more than enough to handle durations produced by the runtime. + timeHistMinBucketBits = 9 + timeHistMaxBucketBits = 48 // Note that this is exclusive; 1 higher than the actual range. + timeHistSubBucketBits = 2 + timeHistNumSubBuckets = 1 << timeHistSubBucketBits + timeHistNumBuckets = timeHistMaxBucketBits - timeHistMinBucketBits + 1 + // Two extra buckets, one for underflow, one for overflow. + timeHistTotalBuckets = timeHistNumBuckets*timeHistNumSubBuckets + 2 +) + +// timeHistogram represents a distribution of durations in +// nanoseconds. +// +// The accuracy and range of the histogram is defined by the +// timeHistSubBucketBits and timeHistNumBuckets constants. +// +// It is an HDR histogram with exponentially-distributed +// buckets and linearly distributed sub-buckets. +// +// The histogram is safe for concurrent reads and writes. 
+type timeHistogram struct { + counts [timeHistNumBuckets * timeHistNumSubBuckets]atomic.Uint64 + + // underflow counts all the times we got a negative duration + // sample. Because of how time works on some platforms, it's + // possible to measure negative durations. We could ignore them, + // but we record them anyway because it's better to have some + // signal that it's happening than just missing samples. + underflow atomic.Uint64 + + // overflow counts all the times we got a duration that exceeded + // the range counts represents. + overflow atomic.Uint64 +} + +// record adds the given duration to the distribution. +// +// Disallow preemptions and stack growths because this function +// may run in sensitive locations. +// +//go:nosplit +func (h *timeHistogram) record(duration int64) { + // If the duration is negative, capture that in underflow. + if duration < 0 { + h.underflow.Add(1) + return + } + // bucketBit is the target bit for the bucket which is usually the + // highest 1 bit, but if we're less than the minimum, is the highest + // 1 bit of the minimum (which will be zero in the duration). + // + // bucket is the bucket index, which is the bucketBit minus the + // highest bit of the minimum, plus one to leave room for the catch-all + // bucket for samples lower than the minimum. + var bucketBit, bucket uint + if l := sys.Len64(uint64(duration)); l < timeHistMinBucketBits { + bucketBit = timeHistMinBucketBits + bucket = 0 // bucketBit - timeHistMinBucketBits + } else { + bucketBit = uint(l) + bucket = bucketBit - timeHistMinBucketBits + 1 + } + // If the bucket we computed is greater than the number of buckets, + // count that in overflow. + if bucket >= timeHistNumBuckets { + h.overflow.Add(1) + return + } + // The sub-bucket index is just next timeHistSubBucketBits after the bucketBit. 
+ subBucket := uint(duration>>(bucketBit-1-timeHistSubBucketBits)) % timeHistNumSubBuckets + h.counts[bucket*timeHistNumSubBuckets+subBucket].Add(1) +} + +const ( + fInf = 0x7FF0000000000000 + fNegInf = 0xFFF0000000000000 +) + +func float64Inf() float64 { + inf := uint64(fInf) + return *(*float64)(unsafe.Pointer(&inf)) +} + +func float64NegInf() float64 { + inf := uint64(fNegInf) + return *(*float64)(unsafe.Pointer(&inf)) +} + +// timeHistogramMetricsBuckets generates a slice of boundaries for +// the timeHistogram. These boundaries are represented in seconds, +// not nanoseconds like the timeHistogram represents durations. +func timeHistogramMetricsBuckets() []float64 { + b := make([]float64, timeHistTotalBuckets+1) + // Underflow bucket. + b[0] = float64NegInf() + + for j := 0; j < timeHistNumSubBuckets; j++ { + // No bucket bit for the first few buckets. Just sub-bucket bits after the + // min bucket bit. + bucketNanos := uint64(j) << (timeHistMinBucketBits - 1 - timeHistSubBucketBits) + // Convert nanoseconds to seconds via a division. + // These values will all be exactly representable by a float64. + b[j+1] = float64(bucketNanos) / 1e9 + } + // Generate the rest of the buckets. It's easier to reason + // about if we cut out the 0'th bucket. + for i := timeHistMinBucketBits; i < timeHistMaxBucketBits; i++ { + for j := 0; j < timeHistNumSubBuckets; j++ { + // Set the bucket bit. + bucketNanos := uint64(1) << (i - 1) + // Set the sub-bucket bits. + bucketNanos |= uint64(j) << (i - 1 - timeHistSubBucketBits) + // The index for this bucket is going to be the (i+1)'th bucket + // (note that we're starting from zero, but handled the first bucket + // earlier, so we need to compensate), and the j'th sub bucket. + // Add 1 because we left space for -Inf. + bucketIndex := (i-timeHistMinBucketBits+1)*timeHistNumSubBuckets + j + 1 + // Convert nanoseconds to seconds via a division. + // These values will all be exactly representable by a float64. 
+ b[bucketIndex] = float64(bucketNanos) / 1e9 + } + } + // Overflow bucket. + b[len(b)-2] = float64(uint64(1)<<(timeHistMaxBucketBits-1)) / 1e9 + b[len(b)-1] = float64Inf() + return b +} diff --git a/src/runtime/histogram_test.go b/src/runtime/histogram_test.go new file mode 100644 index 0000000..5246e86 --- /dev/null +++ b/src/runtime/histogram_test.go @@ -0,0 +1,112 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "math" + . "runtime" + "testing" +) + +var dummyTimeHistogram TimeHistogram + +func TestTimeHistogram(t *testing.T) { + // We need to use a global dummy because this + // could get stack-allocated with a non-8-byte alignment. + // The result of this bad alignment is a segfault on + // 32-bit platforms when calling Record. + h := &dummyTimeHistogram + + // Record a distinct number of samples in each bucket: its flat index. + for j := 0; j < TimeHistNumSubBuckets; j++ { + v := int64(j) << (TimeHistMinBucketBits - 1 - TimeHistSubBucketBits) + for k := 0; k < j; k++ { + // Record a number of times equal to the bucket index. + h.Record(v) + } + } + for i := TimeHistMinBucketBits; i < TimeHistMaxBucketBits; i++ { + base := int64(1) << (i - 1) + for j := 0; j < TimeHistNumSubBuckets; j++ { + v := int64(j) << (i - 1 - TimeHistSubBucketBits) + for k := 0; k < (i+1-TimeHistMinBucketBits)*TimeHistNumSubBuckets+j; k++ { + // Record a number of times equal to the bucket index. + h.Record(base + v) + } + } + } + // Hit the underflow bucket once and the overflow bucket twice. + h.Record(int64(-1)) + h.Record(math.MaxInt64) + h.Record(math.MaxInt64) + + // Check that the count in each bucket equals + // that bucket's flat index. 
+ for i := 0; i < TimeHistNumBuckets; i++ { + for j := 0; j < TimeHistNumSubBuckets; j++ { + c, ok := h.Count(i, j) + if !ok { + t.Errorf("unexpected invalid bucket: (%d, %d)", i, j) + } else if idx := uint64(i*TimeHistNumSubBuckets + j); c != idx { + t.Errorf("bucket (%d, %d) has count that is not %d: %d", i, j, idx, c) + } + } + } + c, ok := h.Count(-1, 0) + if ok { + t.Errorf("expected to hit underflow bucket: (%d, %d)", -1, 0) + } + if c != 1 { + t.Errorf("underflow bucket has count that is not 1: %d", c) + } + + c, ok = h.Count(TimeHistNumBuckets+1, 0) + if ok { + t.Errorf("expected to hit overflow bucket: (%d, %d)", TimeHistNumBuckets+1, 0) + } + if c != 2 { + t.Errorf("overflow bucket has count that is not 2: %d", c) + } + + dummyTimeHistogram = TimeHistogram{} +} + +func TestTimeHistogramMetricsBuckets(t *testing.T) { + buckets := TimeHistogramMetricsBuckets() + + nonInfBucketsLen := TimeHistNumSubBuckets * TimeHistNumBuckets + expBucketsLen := nonInfBucketsLen + 3 // Count -Inf, the edge for the overflow bucket, and +Inf. + if len(buckets) != expBucketsLen { + t.Fatalf("unexpected length of buckets: got %d, want %d", len(buckets), expBucketsLen) + } + // Check some values.
+ idxToBucket := map[int]float64{ + 0: math.Inf(-1), + 1: 0.0, + 2: float64(0x040) / 1e9, + 3: float64(0x080) / 1e9, + 4: float64(0x0c0) / 1e9, + 5: float64(0x100) / 1e9, + 6: float64(0x140) / 1e9, + 7: float64(0x180) / 1e9, + 8: float64(0x1c0) / 1e9, + 9: float64(0x200) / 1e9, + 10: float64(0x280) / 1e9, + 11: float64(0x300) / 1e9, + 12: float64(0x380) / 1e9, + 13: float64(0x400) / 1e9, + 15: float64(0x600) / 1e9, + 81: float64(0x8000000) / 1e9, + 82: float64(0xa000000) / 1e9, + 108: float64(0x380000000) / 1e9, + expBucketsLen - 2: float64(0x1<<47) / 1e9, + expBucketsLen - 1: math.Inf(1), + } + for idx, bucket := range idxToBucket { + if got, want := buckets[idx], bucket; got != want { + t.Errorf("expected bucket %d to have value %e, got %e", idx, want, got) + } + } +} diff --git a/src/runtime/iface.go b/src/runtime/iface.go new file mode 100644 index 0000000..a4d56dd --- /dev/null +++ b/src/runtime/iface.go @@ -0,0 +1,533 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +const itabInitSize = 512 + +var ( + itabLock mutex // lock for accessing itab table + itabTable = &itabTableInit // pointer to current table + itabTableInit = itabTableType{size: itabInitSize} // starter table +) + +// Note: change the formula in the mallocgc call in itabAdd if you change these fields. +type itabTableType struct { + size uintptr // length of entries array. Always a power of 2. + count uintptr // current number of filled entries. + entries [itabInitSize]*itab // really [size] large +} + +func itabHashFunc(inter *interfacetype, typ *_type) uintptr { + // compiler has provided some good hash codes for us. 
+ return uintptr(inter.typ.hash ^ typ.hash) +} + +func getitab(inter *interfacetype, typ *_type, canfail bool) *itab { + if len(inter.mhdr) == 0 { + throw("internal error - misuse of itab") + } + + // easy case + if typ.tflag&tflagUncommon == 0 { + if canfail { + return nil + } + name := inter.typ.nameOff(inter.mhdr[0].name) + panic(&TypeAssertionError{nil, typ, &inter.typ, name.name()}) + } + + var m *itab + + // First, look in the existing table to see if we can find the itab we need. + // This is by far the most common case, so do it without locks. + // Use atomic to ensure we see any previous writes done by the thread + // that updates the itabTable field (with atomic.Storep in itabAdd). + t := (*itabTableType)(atomic.Loadp(unsafe.Pointer(&itabTable))) + if m = t.find(inter, typ); m != nil { + goto finish + } + + // Not found. Grab the lock and try again. + lock(&itabLock) + if m = itabTable.find(inter, typ); m != nil { + unlock(&itabLock) + goto finish + } + + // Entry doesn't exist yet. Make a new entry & add it. + m = (*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*goarch.PtrSize, 0, &memstats.other_sys)) + m.inter = inter + m._type = typ + // The hash is used in type switches. However, compiler statically generates itab's + // for all interface/type pairs used in switches (which are added to itabTable + // in itabsinit). The dynamically-generated itab's never participate in type switches, + // and thus the hash is irrelevant. + // Note: m.hash is _not_ the hash used for the runtime itabTable hash table. + m.hash = 0 + m.init() + itabAdd(m) + unlock(&itabLock) +finish: + if m.fun[0] != 0 { + return m + } + if canfail { + return nil + } + // this can only happen if the conversion + // was already done once using the , ok form + // and we have a cached negative result. + // The cached result doesn't record which + // interface function was missing, so initialize + // the itab again to get the missing function name. 
+ panic(&TypeAssertionError{concrete: typ, asserted: &inter.typ, missingMethod: m.init()}) +} + +// find finds the given interface/type pair in t. +// Returns nil if the given interface/type pair isn't present. +func (t *itabTableType) find(inter *interfacetype, typ *_type) *itab { + // Implemented using quadratic probing. + // Probe sequence is h(i) = h0 + i*(i+1)/2 mod 2^k. + // We're guaranteed to hit all table entries using this probe sequence. + mask := t.size - 1 + h := itabHashFunc(inter, typ) & mask + for i := uintptr(1); ; i++ { + p := (**itab)(add(unsafe.Pointer(&t.entries), h*goarch.PtrSize)) + // Use atomic read here so if we see m != nil, we also see + // the initializations of the fields of m. + // m := *p + m := (*itab)(atomic.Loadp(unsafe.Pointer(p))) + if m == nil { + return nil + } + if m.inter == inter && m._type == typ { + return m + } + h += i + h &= mask + } +} + +// itabAdd adds the given itab to the itab hash table. +// itabLock must be held. +func itabAdd(m *itab) { + // Bugs can lead to calling this while mallocing is set, + // typically because this is called while panicking. + // Crash reliably, rather than only when we need to grow + // the hash table. + if getg().m.mallocing != 0 { + throw("malloc deadlock") + } + + t := itabTable + if t.count >= 3*(t.size/4) { // 75% load factor + // Grow hash table. + // t2 = new(itabTableType) + some additional entries + // We lie and tell malloc we want pointer-free memory because + // all the pointed-to values are not in the heap. + t2 := (*itabTableType)(mallocgc((2+2*t.size)*goarch.PtrSize, nil, true)) + t2.size = t.size * 2 + + // Copy over entries. + // Note: while copying, other threads may look for an itab and + // fail to find it. That's ok, they will then try to get the itab lock + // and as a consequence wait until this copying is complete. + iterate_itabs(t2.add) + if t2.count != t.count { + throw("mismatched count during itab table copy") + } + // Publish new hash table.
Use an atomic write: see comment in getitab. + atomicstorep(unsafe.Pointer(&itabTable), unsafe.Pointer(t2)) + // Adopt the new table as our own. + t = itabTable + // Note: the old table can be GC'ed here. + } + t.add(m) +} + +// add adds the given itab to itab table t. +// itabLock must be held. +func (t *itabTableType) add(m *itab) { + // See comment in find about the probe sequence. + // Insert new itab in the first empty spot in the probe sequence. + mask := t.size - 1 + h := itabHashFunc(m.inter, m._type) & mask + for i := uintptr(1); ; i++ { + p := (**itab)(add(unsafe.Pointer(&t.entries), h*goarch.PtrSize)) + m2 := *p + if m2 == m { + // A given itab may be used in more than one module + // and thanks to the way global symbol resolution works, the + // pointed-to itab may already have been inserted into the + // global 'hash'. + return + } + if m2 == nil { + // Use atomic write here so if a reader sees m, it also + // sees the correctly initialized fields of m. + // NoWB is ok because m is not in heap memory. + // *p = m + atomic.StorepNoWB(unsafe.Pointer(p), unsafe.Pointer(m)) + t.count++ + return + } + h += i + h &= mask + } +} + +// init fills in the m.fun array with all the code pointers for +// the m.inter/m._type pair. If the type does not implement the interface, +// it sets m.fun[0] to 0 and returns the name of an interface function that is missing. +// It is ok to call this multiple times on the same m, even concurrently. +func (m *itab) init() string { + inter := m.inter + typ := m._type + x := typ.uncommon() + + // both inter and typ have method sorted by name, + // and interface names are unique, + // so can iterate over both in lock step; + // the loop is O(ni+nt) not O(ni*nt). 
+ ni := len(inter.mhdr) + nt := int(x.mcount) + xmhdr := (*[1 << 16]method)(add(unsafe.Pointer(x), uintptr(x.moff)))[:nt:nt] + j := 0 + methods := (*[1 << 16]unsafe.Pointer)(unsafe.Pointer(&m.fun[0]))[:ni:ni] + var fun0 unsafe.Pointer +imethods: + for k := 0; k < ni; k++ { + i := &inter.mhdr[k] + itype := inter.typ.typeOff(i.ityp) + name := inter.typ.nameOff(i.name) + iname := name.name() + ipkg := name.pkgPath() + if ipkg == "" { + ipkg = inter.pkgpath.name() + } + for ; j < nt; j++ { + t := &xmhdr[j] + tname := typ.nameOff(t.name) + if typ.typeOff(t.mtyp) == itype && tname.name() == iname { + pkgPath := tname.pkgPath() + if pkgPath == "" { + pkgPath = typ.nameOff(x.pkgpath).name() + } + if tname.isExported() || pkgPath == ipkg { + if m != nil { + ifn := typ.textOff(t.ifn) + if k == 0 { + fun0 = ifn // we'll set m.fun[0] at the end + } else { + methods[k] = ifn + } + } + continue imethods + } + } + } + // didn't find method + m.fun[0] = 0 + return iname + } + m.fun[0] = uintptr(fun0) + return "" +} + +func itabsinit() { + lockInit(&itabLock, lockRankItab) + lock(&itabLock) + for _, md := range activeModules() { + for _, i := range md.itablinks { + itabAdd(i) + } + } + unlock(&itabLock) +} + +// panicdottypeE is called when doing an e.(T) conversion and the conversion fails. +// have = the dynamic type we have. +// want = the static type we're trying to convert to. +// iface = the static type we're converting from. +func panicdottypeE(have, want, iface *_type) { + panic(&TypeAssertionError{iface, have, want, ""}) +} + +// panicdottypeI is called when doing an i.(T) conversion and the conversion fails. +// Same args as panicdottypeE, but "have" is the dynamic itab we have. +func panicdottypeI(have *itab, want, iface *_type) { + var t *_type + if have != nil { + t = have._type + } + panicdottypeE(t, want, iface) +} + +// panicnildottype is called when doing a i.(T) conversion and the interface i is nil. +// want = the static type we're trying to convert to. 
+func panicnildottype(want *_type) { + panic(&TypeAssertionError{nil, nil, want, ""}) + // TODO: Add the static type we're converting from as well. + // It might generate a better error message. + // Just to match other nil conversion errors, we don't for now. +} + +// The specialized convTx routines need a type descriptor to use when calling mallocgc. +// We don't need the type to be exact, just to have the correct size, alignment, and pointer-ness. +// However, when debugging, it'd be nice to have some indication in mallocgc where the types came from, +// so we use named types here. +// We then construct interface values of these types, +// and then extract the type word to use as needed. +type ( + uint16InterfacePtr uint16 + uint32InterfacePtr uint32 + uint64InterfacePtr uint64 + stringInterfacePtr string + sliceInterfacePtr []byte +) + +var ( + uint16Eface any = uint16InterfacePtr(0) + uint32Eface any = uint32InterfacePtr(0) + uint64Eface any = uint64InterfacePtr(0) + stringEface any = stringInterfacePtr("") + sliceEface any = sliceInterfacePtr(nil) + + uint16Type *_type = efaceOf(&uint16Eface)._type + uint32Type *_type = efaceOf(&uint32Eface)._type + uint64Type *_type = efaceOf(&uint64Eface)._type + stringType *_type = efaceOf(&stringEface)._type + sliceType *_type = efaceOf(&sliceEface)._type +) + +// The conv and assert functions below do very similar things. +// The convXXX functions are guaranteed by the compiler to succeed. +// The assertXXX functions may fail (either panicking or returning false, +// depending on whether they are 1-result or 2-result). +// The convXXX functions succeed on a nil input, whereas the assertXXX +// functions fail on a nil input. + +// convT converts a value of type t, which is pointed to by v, to a pointer that can +// be used as the second word of an interface value. 
+func convT(t *_type, v unsafe.Pointer) unsafe.Pointer { + if raceenabled { + raceReadObjectPC(t, v, getcallerpc(), abi.FuncPCABIInternal(convT)) + } + if msanenabled { + msanread(v, t.size) + } + if asanenabled { + asanread(v, t.size) + } + x := mallocgc(t.size, t, true) + typedmemmove(t, x, v) + return x +} +func convTnoptr(t *_type, v unsafe.Pointer) unsafe.Pointer { + // TODO: maybe take size instead of type? + if raceenabled { + raceReadObjectPC(t, v, getcallerpc(), abi.FuncPCABIInternal(convTnoptr)) + } + if msanenabled { + msanread(v, t.size) + } + if asanenabled { + asanread(v, t.size) + } + + x := mallocgc(t.size, t, false) + memmove(x, v, t.size) + return x +} + +func convT16(val uint16) (x unsafe.Pointer) { + if val < uint16(len(staticuint64s)) { + x = unsafe.Pointer(&staticuint64s[val]) + if goarch.BigEndian { + x = add(x, 6) + } + } else { + x = mallocgc(2, uint16Type, false) + *(*uint16)(x) = val + } + return +} + +func convT32(val uint32) (x unsafe.Pointer) { + if val < uint32(len(staticuint64s)) { + x = unsafe.Pointer(&staticuint64s[val]) + if goarch.BigEndian { + x = add(x, 4) + } + } else { + x = mallocgc(4, uint32Type, false) + *(*uint32)(x) = val + } + return +} + +func convT64(val uint64) (x unsafe.Pointer) { + if val < uint64(len(staticuint64s)) { + x = unsafe.Pointer(&staticuint64s[val]) + } else { + x = mallocgc(8, uint64Type, false) + *(*uint64)(x) = val + } + return +} + +func convTstring(val string) (x unsafe.Pointer) { + if val == "" { + x = unsafe.Pointer(&zeroVal[0]) + } else { + x = mallocgc(unsafe.Sizeof(val), stringType, true) + *(*string)(x) = val + } + return +} + +func convTslice(val []byte) (x unsafe.Pointer) { + // Note: this must work for any element type, not just byte. 
+ if (*slice)(unsafe.Pointer(&val)).array == nil { + x = unsafe.Pointer(&zeroVal[0]) + } else { + x = mallocgc(unsafe.Sizeof(val), sliceType, true) + *(*[]byte)(x) = val + } + return +} + +// convI2I returns the new itab to be used for the destination value +// when converting a value with itab src to the dst interface. +func convI2I(dst *interfacetype, src *itab) *itab { + if src == nil { + return nil + } + if src.inter == dst { + return src + } + return getitab(dst, src._type, false) +} + +func assertI2I(inter *interfacetype, tab *itab) *itab { + if tab == nil { + // explicit conversions require non-nil interface value. + panic(&TypeAssertionError{nil, nil, &inter.typ, ""}) + } + if tab.inter == inter { + return tab + } + return getitab(inter, tab._type, false) +} + +func assertI2I2(inter *interfacetype, i iface) (r iface) { + tab := i.tab + if tab == nil { + return + } + if tab.inter != inter { + tab = getitab(inter, tab._type, true) + if tab == nil { + return + } + } + r.tab = tab + r.data = i.data + return +} + +func assertE2I(inter *interfacetype, t *_type) *itab { + if t == nil { + // explicit conversions require non-nil interface value. + panic(&TypeAssertionError{nil, nil, &inter.typ, ""}) + } + return getitab(inter, t, false) +} + +func assertE2I2(inter *interfacetype, e eface) (r iface) { + t := e._type + if t == nil { + return + } + tab := getitab(inter, t, true) + if tab == nil { + return + } + r.tab = tab + r.data = e.data + return +} + +//go:linkname reflect_ifaceE2I reflect.ifaceE2I +func reflect_ifaceE2I(inter *interfacetype, e eface, dst *iface) { + *dst = iface{assertE2I(inter, e._type), e.data} +} + +//go:linkname reflectlite_ifaceE2I internal/reflectlite.ifaceE2I +func reflectlite_ifaceE2I(inter *interfacetype, e eface, dst *iface) { + *dst = iface{assertE2I(inter, e._type), e.data} +} + +func iterate_itabs(fn func(*itab)) { + // Note: only runs during stop the world or with itabLock held, + // so no other locks/atomics needed. 
+ t := itabTable + for i := uintptr(0); i < t.size; i++ { + m := *(**itab)(add(unsafe.Pointer(&t.entries), i*goarch.PtrSize)) + if m != nil { + fn(m) + } + } +} + +// staticuint64s is used to avoid allocating in convTx for small integer values. +var staticuint64s = [...]uint64{ + 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, + 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, + 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, + 0x18, 0x19, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, + 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, + 0x28, 0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, + 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, + 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x3e, 0x3f, + 0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, + 0x48, 0x49, 0x4a, 0x4b, 0x4c, 0x4d, 0x4e, 0x4f, + 0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, + 0x58, 0x59, 0x5a, 0x5b, 0x5c, 0x5d, 0x5e, 0x5f, + 0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, + 0x68, 0x69, 0x6a, 0x6b, 0x6c, 0x6d, 0x6e, 0x6f, + 0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, + 0x78, 0x79, 0x7a, 0x7b, 0x7c, 0x7d, 0x7e, 0x7f, + 0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x86, 0x87, + 0x88, 0x89, 0x8a, 0x8b, 0x8c, 0x8d, 0x8e, 0x8f, + 0x90, 0x91, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, + 0x98, 0x99, 0x9a, 0x9b, 0x9c, 0x9d, 0x9e, 0x9f, + 0xa0, 0xa1, 0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, + 0xa8, 0xa9, 0xaa, 0xab, 0xac, 0xad, 0xae, 0xaf, + 0xb0, 0xb1, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6, 0xb7, + 0xb8, 0xb9, 0xba, 0xbb, 0xbc, 0xbd, 0xbe, 0xbf, + 0xc0, 0xc1, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, + 0xc8, 0xc9, 0xca, 0xcb, 0xcc, 0xcd, 0xce, 0xcf, + 0xd0, 0xd1, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, + 0xd8, 0xd9, 0xda, 0xdb, 0xdc, 0xdd, 0xde, 0xdf, + 0xe0, 0xe1, 0xe2, 0xe3, 0xe4, 0xe5, 0xe6, 0xe7, + 0xe8, 0xe9, 0xea, 0xeb, 0xec, 0xed, 0xee, 0xef, + 0xf0, 0xf1, 0xf2, 0xf3, 0xf4, 0xf5, 0xf6, 0xf7, + 0xf8, 0xf9, 0xfa, 0xfb, 0xfc, 0xfd, 0xfe, 0xff, +} + +// The linker redirects a reference of a method that it determined +// unreachable to a reference to this 
function, so it will throw if +// ever called. +func unreachableMethod() { + throw("unreachable method called. linker bug?") +} diff --git a/src/runtime/iface_test.go b/src/runtime/iface_test.go new file mode 100644 index 0000000..06f6eeb --- /dev/null +++ b/src/runtime/iface_test.go @@ -0,0 +1,439 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime" + "testing" +) + +type I1 interface { + Method1() +} + +type I2 interface { + Method1() + Method2() +} + +type TS uint16 +type TM uintptr +type TL [2]uintptr + +func (TS) Method1() {} +func (TS) Method2() {} +func (TM) Method1() {} +func (TM) Method2() {} +func (TL) Method1() {} +func (TL) Method2() {} + +type T8 uint8 +type T16 uint16 +type T32 uint32 +type T64 uint64 +type Tstr string +type Tslice []byte + +func (T8) Method1() {} +func (T16) Method1() {} +func (T32) Method1() {} +func (T64) Method1() {} +func (Tstr) Method1() {} +func (Tslice) Method1() {} + +var ( + e any + e_ any + i1 I1 + i2 I2 + ts TS + tm TM + tl TL + ok bool +) + +// Issue 9370 +func TestCmpIfaceConcreteAlloc(t *testing.T) { + if runtime.Compiler != "gc" { + t.Skip("skipping on non-gc compiler") + } + + n := testing.AllocsPerRun(1, func() { + _ = e == ts + _ = i1 == ts + _ = e == 1 + }) + + if n > 0 { + t.Fatalf("iface cmp allocs=%v; want 0", n) + } +} + +func BenchmarkEqEfaceConcrete(b *testing.B) { + for i := 0; i < b.N; i++ { + _ = e == ts + } +} + +func BenchmarkEqIfaceConcrete(b *testing.B) { + for i := 0; i < b.N; i++ { + _ = i1 == ts + } +} + +func BenchmarkNeEfaceConcrete(b *testing.B) { + for i := 0; i < b.N; i++ { + _ = e != ts + } +} + +func BenchmarkNeIfaceConcrete(b *testing.B) { + for i := 0; i < b.N; i++ { + _ = i1 != ts + } +} + +func BenchmarkConvT2EByteSized(b *testing.B) { + b.Run("bool", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = yes + } + }) + 
b.Run("uint8", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = eight8 + } + }) +} + +func BenchmarkConvT2ESmall(b *testing.B) { + for i := 0; i < b.N; i++ { + e = ts + } +} + +func BenchmarkConvT2EUintptr(b *testing.B) { + for i := 0; i < b.N; i++ { + e = tm + } +} + +func BenchmarkConvT2ELarge(b *testing.B) { + for i := 0; i < b.N; i++ { + e = tl + } +} + +func BenchmarkConvT2ISmall(b *testing.B) { + for i := 0; i < b.N; i++ { + i1 = ts + } +} + +func BenchmarkConvT2IUintptr(b *testing.B) { + for i := 0; i < b.N; i++ { + i1 = tm + } +} + +func BenchmarkConvT2ILarge(b *testing.B) { + for i := 0; i < b.N; i++ { + i1 = tl + } +} + +func BenchmarkConvI2E(b *testing.B) { + i2 = tm + for i := 0; i < b.N; i++ { + e = i2 + } +} + +func BenchmarkConvI2I(b *testing.B) { + i2 = tm + for i := 0; i < b.N; i++ { + i1 = i2 + } +} + +func BenchmarkAssertE2T(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + tm = e.(TM) + } +} + +func BenchmarkAssertE2TLarge(b *testing.B) { + e = tl + for i := 0; i < b.N; i++ { + tl = e.(TL) + } +} + +func BenchmarkAssertE2I(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + i1 = e.(I1) + } +} + +func BenchmarkAssertI2T(b *testing.B) { + i1 = tm + for i := 0; i < b.N; i++ { + tm = i1.(TM) + } +} + +func BenchmarkAssertI2I(b *testing.B) { + i1 = tm + for i := 0; i < b.N; i++ { + i2 = i1.(I2) + } +} + +func BenchmarkAssertI2E(b *testing.B) { + i1 = tm + for i := 0; i < b.N; i++ { + e = i1.(any) + } +} + +func BenchmarkAssertE2E(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + e_ = e + } +} + +func BenchmarkAssertE2T2(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + tm, ok = e.(TM) + } +} + +func BenchmarkAssertE2T2Blank(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + _, ok = e.(TM) + } +} + +func BenchmarkAssertI2E2(b *testing.B) { + i1 = tm + for i := 0; i < b.N; i++ { + e, ok = i1.(any) + } +} + +func BenchmarkAssertI2E2Blank(b *testing.B) { + i1 = tm + for i := 0; i < b.N; i++ { + _, ok = i1.(any) + } +} + 
+func BenchmarkAssertE2E2(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + e_, ok = e.(any) + } +} + +func BenchmarkAssertE2E2Blank(b *testing.B) { + e = tm + for i := 0; i < b.N; i++ { + _, ok = e.(any) + } +} + +func TestNonEscapingConvT2E(t *testing.T) { + m := make(map[any]bool) + m[42] = true + if !m[42] { + t.Fatalf("42 is not present in the map") + } + if m[0] { + t.Fatalf("0 is present in the map") + } + + n := testing.AllocsPerRun(1000, func() { + if m[0] { + t.Fatalf("0 is present in the map") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestNonEscapingConvT2I(t *testing.T) { + m := make(map[I1]bool) + m[TM(42)] = true + if !m[TM(42)] { + t.Fatalf("42 is not present in the map") + } + if m[TM(0)] { + t.Fatalf("0 is present in the map") + } + + n := testing.AllocsPerRun(1000, func() { + if m[TM(0)] { + t.Fatalf("0 is present in the map") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestZeroConvT2x(t *testing.T) { + tests := []struct { + name string + fn func() + }{ + {name: "E8", fn: func() { e = eight8 }}, // any byte-sized value does not allocate + {name: "E16", fn: func() { e = zero16 }}, // zero values do not allocate + {name: "E32", fn: func() { e = zero32 }}, + {name: "E64", fn: func() { e = zero64 }}, + {name: "Estr", fn: func() { e = zerostr }}, + {name: "Eslice", fn: func() { e = zeroslice }}, + {name: "Econstflt", fn: func() { e = 99.0 }}, // constants do not allocate + {name: "Econststr", fn: func() { e = "change" }}, + {name: "I8", fn: func() { i1 = eight8I }}, + {name: "I16", fn: func() { i1 = zero16I }}, + {name: "I32", fn: func() { i1 = zero32I }}, + {name: "I64", fn: func() { i1 = zero64I }}, + {name: "Istr", fn: func() { i1 = zerostrI }}, + {name: "Islice", fn: func() { i1 = zerosliceI }}, + } + + for _, test := range tests { + t.Run(test.name, func(t *testing.T) { + n := testing.AllocsPerRun(1000, test.fn) + if n != 0 { + t.Errorf("want zero allocs, got %v", n) + } + 
}) + } +} + +var ( + eight8 uint8 = 8 + eight8I T8 = 8 + yes bool = true + + zero16 uint16 = 0 + zero16I T16 = 0 + one16 uint16 = 1 + thousand16 uint16 = 1000 + + zero32 uint32 = 0 + zero32I T32 = 0 + one32 uint32 = 1 + thousand32 uint32 = 1000 + + zero64 uint64 = 0 + zero64I T64 = 0 + one64 uint64 = 1 + thousand64 uint64 = 1000 + + zerostr string = "" + zerostrI Tstr = "" + nzstr string = "abc" + + zeroslice []byte = nil + zerosliceI Tslice = nil + nzslice []byte = []byte("abc") + + zerobig [512]byte + nzbig [512]byte = [512]byte{511: 1} +) + +func BenchmarkConvT2Ezero(b *testing.B) { + b.Run("zero", func(b *testing.B) { + b.Run("16", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zero16 + } + }) + b.Run("32", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zero32 + } + }) + b.Run("64", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zero64 + } + }) + b.Run("str", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zerostr + } + }) + b.Run("slice", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zeroslice + } + }) + b.Run("big", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = zerobig + } + }) + }) + b.Run("nonzero", func(b *testing.B) { + b.Run("str", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = nzstr + } + }) + b.Run("slice", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = nzslice + } + }) + b.Run("big", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = nzbig + } + }) + }) + b.Run("smallint", func(b *testing.B) { + b.Run("16", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = one16 + } + }) + b.Run("32", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = one32 + } + }) + b.Run("64", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = one64 + } + }) + }) + b.Run("largeint", func(b *testing.B) { + b.Run("16", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = thousand16 + } + }) + b.Run("32", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = thousand32 + } + }) + 
b.Run("64", func(b *testing.B) { + for i := 0; i < b.N; i++ { + e = thousand64 + } + }) + }) +} diff --git a/src/runtime/internal/atomic/atomic_386.go b/src/runtime/internal/atomic/atomic_386.go new file mode 100644 index 0000000..bf2f4b9 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_386.go @@ -0,0 +1,103 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build 386 + +package atomic + +import "unsafe" + +// Export some functions via linkname to assembly in sync/atomic. +// +//go:linkname Load +//go:linkname Loadp + +//go:nosplit +//go:noinline +func Load(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadp(ptr unsafe.Pointer) unsafe.Pointer { + return *(*unsafe.Pointer)(ptr) +} + +//go:nosplit +//go:noinline +func LoadAcq(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcquintptr(ptr *uintptr) uintptr { + return *ptr +} + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load64(ptr *uint64) uint64 + +//go:nosplit +//go:noinline +func Load8(ptr *uint8) uint8 { + return *ptr +} + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). 
+ +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) diff --git a/src/runtime/internal/atomic/atomic_386.s b/src/runtime/internal/atomic/atomic_386.s new file mode 100644 index 0000000..724d515 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_386.s @@ -0,0 +1,285 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "funcdata.h" + +// bool Cas(int32 *val, int32 old, int32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// }else +// return 0; +TEXT ·Cas(SB), NOSPLIT, $0-13 + MOVL ptr+0(FP), BX + MOVL old+4(FP), AX + MOVL new+8(FP), CX + LOCK + CMPXCHGL CX, 0(BX) + SETEQ ret+12(FP) + RET + +TEXT ·Casint32(SB), NOSPLIT, $0-13 + JMP ·Cas(SB) + +TEXT ·Casint64(SB), NOSPLIT, $0-21 + JMP ·Cas64(SB) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-13 + JMP ·Cas(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-13 + JMP ·Cas(SB) + +TEXT ·Loaduintptr(SB), NOSPLIT, $0-8 + JMP ·Load(SB) + +TEXT ·Loaduint(SB), NOSPLIT, $0-8 + JMP ·Load(SB) + +TEXT ·Storeint32(SB), NOSPLIT, $0-8 + JMP ·Store(SB) + +TEXT ·Storeint64(SB), NOSPLIT, $0-12 + JMP ·Store64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-8 + JMP ·Store(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-12 + JMP ·Xadd(SB) + +TEXT ·Loadint32(SB), NOSPLIT, $0-8 + JMP ·Load(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-12 + JMP ·Load64(SB) + +TEXT ·Xaddint32(SB), NOSPLIT, $0-12 + JMP ·Xadd(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, 
$0-20 + JMP ·Xadd64(SB) + +// bool ·Cas64(uint64 *val, uint64 old, uint64 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-21 + NO_LOCAL_POINTERS + MOVL ptr+0(FP), BP + TESTL $7, BP + JZ 2(PC) + CALL ·panicUnaligned(SB) + MOVL old_lo+4(FP), AX + MOVL old_hi+8(FP), DX + MOVL new_lo+12(FP), BX + MOVL new_hi+16(FP), CX + LOCK + CMPXCHG8B 0(BP) + SETEQ ret+20(FP) + RET + +// bool Casp1(void **p, void *old, void *new) +// Atomically: +// if(*p == old){ +// *p = new; +// return 1; +// }else +// return 0; +TEXT ·Casp1(SB), NOSPLIT, $0-13 + MOVL ptr+0(FP), BX + MOVL old+4(FP), AX + MOVL new+8(FP), CX + LOCK + CMPXCHGL CX, 0(BX) + SETEQ ret+12(FP) + RET + +// uint32 Xadd(uint32 volatile *val, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-12 + MOVL ptr+0(FP), BX + MOVL delta+4(FP), AX + MOVL AX, CX + LOCK + XADDL AX, 0(BX) + ADDL CX, AX + MOVL AX, ret+8(FP) + RET + +TEXT ·Xadd64(SB), NOSPLIT, $0-20 + NO_LOCAL_POINTERS + // no XADDQ so use CMPXCHG8B loop + MOVL ptr+0(FP), BP + TESTL $7, BP + JZ 2(PC) + CALL ·panicUnaligned(SB) + // DI:SI = delta + MOVL delta_lo+4(FP), SI + MOVL delta_hi+8(FP), DI + // DX:AX = *addr + MOVL 0(BP), AX + MOVL 4(BP), DX +addloop: + // CX:BX = DX:AX (*addr) + DI:SI (delta) + MOVL AX, BX + MOVL DX, CX + ADDL SI, BX + ADCL DI, CX + + // if *addr == DX:AX { + // *addr = CX:BX + // } else { + // DX:AX = *addr + // } + // all in one instruction + LOCK + CMPXCHG8B 0(BP) + + JNZ addloop + + // success + // return CX:BX + MOVL BX, ret_lo+12(FP) + MOVL CX, ret_hi+16(FP) + RET + +TEXT ·Xchg(SB), NOSPLIT, $0-12 + MOVL ptr+0(FP), BX + MOVL new+4(FP), AX + XCHGL AX, 0(BX) + MOVL AX, ret+8(FP) + RET + +TEXT ·Xchgint32(SB), NOSPLIT, $0-12 + JMP ·Xchg(SB) + +TEXT ·Xchgint64(SB), NOSPLIT, $0-20 + JMP ·Xchg64(SB) + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-12 + JMP ·Xchg(SB) + +TEXT ·Xchg64(SB),NOSPLIT,$0-20 + NO_LOCAL_POINTERS + // no 
XCHGQ so use CMPXCHG8B loop + MOVL ptr+0(FP), BP + TESTL $7, BP + JZ 2(PC) + CALL ·panicUnaligned(SB) + // CX:BX = new + MOVL new_lo+4(FP), BX + MOVL new_hi+8(FP), CX + // DX:AX = *addr + MOVL 0(BP), AX + MOVL 4(BP), DX +swaploop: + // if *addr == DX:AX + // *addr = CX:BX + // else + // DX:AX = *addr + // all in one instruction + LOCK + CMPXCHG8B 0(BP) + JNZ swaploop + + // success + // return DX:AX + MOVL AX, ret_lo+12(FP) + MOVL DX, ret_hi+16(FP) + RET + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), BX + MOVL val+4(FP), AX + XCHGL AX, 0(BX) + RET + +TEXT ·Store(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), BX + MOVL val+4(FP), AX + XCHGL AX, 0(BX) + RET + +TEXT ·StoreRel(SB), NOSPLIT, $0-8 + JMP ·Store(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-8 + JMP ·Store(SB) + +// uint64 atomicload64(uint64 volatile* addr); +TEXT ·Load64(SB), NOSPLIT, $0-12 + NO_LOCAL_POINTERS + MOVL ptr+0(FP), AX + TESTL $7, AX + JZ 2(PC) + CALL ·panicUnaligned(SB) + MOVQ (AX), M0 + MOVQ M0, ret+4(FP) + EMMS + RET + +// void ·Store64(uint64 volatile* addr, uint64 v); +TEXT ·Store64(SB), NOSPLIT, $0-12 + NO_LOCAL_POINTERS + MOVL ptr+0(FP), AX + TESTL $7, AX + JZ 2(PC) + CALL ·panicUnaligned(SB) + // MOVQ and EMMS were introduced on the Pentium MMX. + MOVQ val+4(FP), M0 + MOVQ M0, (AX) + EMMS + // This is essentially a no-op, but it provides required memory fencing. + // It can be replaced with MFENCE, but MFENCE was introduced only on the Pentium4 (SSE2). 
+ XORL AX, AX + LOCK + XADDL AX, (SP) + RET + +// void ·Or8(byte volatile*, byte); +TEXT ·Or8(SB), NOSPLIT, $0-5 + MOVL ptr+0(FP), AX + MOVB val+4(FP), BX + LOCK + ORB BX, (AX) + RET + +// void ·And8(byte volatile*, byte); +TEXT ·And8(SB), NOSPLIT, $0-5 + MOVL ptr+0(FP), AX + MOVB val+4(FP), BX + LOCK + ANDB BX, (AX) + RET + +TEXT ·Store8(SB), NOSPLIT, $0-5 + MOVL ptr+0(FP), BX + MOVB val+4(FP), AX + XCHGB AX, 0(BX) + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), AX + MOVL val+4(FP), BX + LOCK + ORL BX, (AX) + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), AX + MOVL val+4(FP), BX + LOCK + ANDL BX, (AX) + RET diff --git a/src/runtime/internal/atomic/atomic_amd64.go b/src/runtime/internal/atomic/atomic_amd64.go new file mode 100644 index 0000000..52a8362 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_amd64.go @@ -0,0 +1,117 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package atomic + +import "unsafe" + +// Export some functions via linkname to assembly in sync/atomic. 
+// +//go:linkname Load +//go:linkname Loadp +//go:linkname Load64 + +//go:nosplit +//go:noinline +func Load(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadp(ptr unsafe.Pointer) unsafe.Pointer { + return *(*unsafe.Pointer)(ptr) +} + +//go:nosplit +//go:noinline +func Load64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcq(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcq64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcquintptr(ptr *uintptr) uintptr { + return *ptr +} + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:nosplit +//go:noinline +func Load8(ptr *uint8) uint8 { + return *ptr +} + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) + +// StorepNoWB performs *ptr = val atomically and without a write +// barrier. +// +// NO go:noescape annotation; see atomic_pointer.go. 
+func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) diff --git a/src/runtime/internal/atomic/atomic_amd64.s b/src/runtime/internal/atomic/atomic_amd64.s new file mode 100644 index 0000000..d21514b --- /dev/null +++ b/src/runtime/internal/atomic/atomic_amd64.s @@ -0,0 +1,225 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Note: some of these functions are semantically inlined +// by the compiler (in src/cmd/compile/internal/gc/ssa.go). + +#include "textflag.h" + +TEXT ·Loaduintptr(SB), NOSPLIT, $0-16 + JMP ·Load64(SB) + +TEXT ·Loaduint(SB), NOSPLIT, $0-16 + JMP ·Load64(SB) + +TEXT ·Loadint32(SB), NOSPLIT, $0-12 + JMP ·Load(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + JMP ·Load64(SB) + +// bool Cas(int32 *val, int32 old, int32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB),NOSPLIT,$0-17 + MOVQ ptr+0(FP), BX + MOVL old+8(FP), AX + MOVL new+12(FP), CX + LOCK + CMPXCHGL CX, 0(BX) + SETEQ ret+16(FP) + RET + +// bool ·Cas64(uint64 *val, uint64 old, uint64 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVQ ptr+0(FP), BX + MOVQ old+8(FP), AX + MOVQ new+16(FP), CX + LOCK + CMPXCHGQ CX, 0(BX) + SETEQ ret+24(FP) + RET + +// bool Casp1(void **val, void *old, void *new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Casp1(SB), NOSPLIT, $0-25 + MOVQ ptr+0(FP), BX + MOVQ old+8(FP), AX + MOVQ new+16(FP), CX + LOCK + CMPXCHGQ CX, 0(BX) + SETEQ ret+24(FP) + RET + +TEXT ·Casint32(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +TEXT ·Casint64(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +// uint32 Xadd(uint32 volatile *val, int32 delta) +// Atomically: +// *val 
+= delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVQ ptr+0(FP), BX + MOVL delta+8(FP), AX + MOVL AX, CX + LOCK + XADDL AX, 0(BX) + ADDL CX, AX + MOVL AX, ret+16(FP) + RET + +// uint64 Xadd64(uint64 volatile *val, int64 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVQ ptr+0(FP), BX + MOVQ delta+8(FP), AX + MOVQ AX, CX + LOCK + XADDQ AX, 0(BX) + ADDQ CX, AX + MOVQ AX, ret+16(FP) + RET + +TEXT ·Xaddint32(SB), NOSPLIT, $0-20 + JMP ·Xadd(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +// uint32 Xchg(ptr *uint32, new uint32) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVQ ptr+0(FP), BX + MOVL new+8(FP), AX + XCHGL AX, 0(BX) + MOVL AX, ret+16(FP) + RET + +// uint64 Xchg64(ptr *uint64, new uint64) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVQ ptr+0(FP), BX + MOVQ new+8(FP), AX + XCHGQ AX, 0(BX) + MOVQ AX, ret+16(FP) + RET + +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + JMP ·Xchg(SB) + +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + MOVQ ptr+0(FP), BX + MOVQ val+8(FP), AX + XCHGQ AX, 0(BX) + RET + +TEXT ·Store(SB), NOSPLIT, $0-12 + MOVQ ptr+0(FP), BX + MOVL val+8(FP), AX + XCHGL AX, 0(BX) + RET + +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOVQ ptr+0(FP), BX + MOVB val+8(FP), AX + XCHGB AX, 0(BX) + RET + +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOVQ ptr+0(FP), BX + MOVQ val+8(FP), AX + XCHGQ AX, 0(BX) + RET + +TEXT ·Storeint32(SB), NOSPLIT, $0-12 + JMP ·Store(SB) + +TEXT ·Storeint64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + JMP ·Store(SB) + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 
+ JMP ·Store64(SB) + +// void ·Or8(byte volatile*, byte); +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVQ ptr+0(FP), AX + MOVB val+8(FP), BX + LOCK + ORB BX, (AX) + RET + +// void ·And8(byte volatile*, byte); +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVQ ptr+0(FP), AX + MOVB val+8(FP), BX + LOCK + ANDB BX, (AX) + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVQ ptr+0(FP), AX + MOVL val+8(FP), BX + LOCK + ORL BX, (AX) + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVQ ptr+0(FP), AX + MOVL val+8(FP), BX + LOCK + ANDL BX, (AX) + RET diff --git a/src/runtime/internal/atomic/atomic_arm.go b/src/runtime/internal/atomic/atomic_arm.go new file mode 100644 index 0000000..bdb1847 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_arm.go @@ -0,0 +1,244 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build arm + +package atomic + +import ( + "internal/cpu" + "unsafe" +) + +// Export some functions via linkname to assembly in sync/atomic. +// +//go:linkname Xchg +//go:linkname Xchguintptr + +type spinlock struct { + v uint32 +} + +//go:nosplit +func (l *spinlock) lock() { + for { + if Cas(&l.v, 0, 1) { + return + } + } +} + +//go:nosplit +func (l *spinlock) unlock() { + Store(&l.v, 0) +} + +var locktab [57]struct { + l spinlock + pad [cpu.CacheLinePadSize - unsafe.Sizeof(spinlock{})]byte +} + +func addrLock(addr *uint64) *spinlock { + return &locktab[(uintptr(unsafe.Pointer(addr))>>3)%uintptr(len(locktab))].l +} + +// Atomic add and return new value. 
+// +//go:nosplit +func Xadd(val *uint32, delta int32) uint32 { + for { + oval := *val + nval := oval + uint32(delta) + if Cas(val, oval, nval) { + return nval + } + } +} + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:nosplit +func Xchg(addr *uint32, v uint32) uint32 { + for { + old := *addr + if Cas(addr, old, v) { + return old + } + } +} + +//go:nosplit +func Xchguintptr(addr *uintptr, v uintptr) uintptr { + return uintptr(Xchg((*uint32)(unsafe.Pointer(addr)), uint32(v))) +} + +// Not noescape -- it installs a pointer to addr. +func StorepNoWB(addr unsafe.Pointer, v unsafe.Pointer) + +//go:noescape +func Store(addr *uint32, v uint32) + +//go:noescape +func StoreRel(addr *uint32, v uint32) + +//go:noescape +func StoreReluintptr(addr *uintptr, v uintptr) + +//go:nosplit +func goCas64(addr *uint64, old, new uint64) bool { + if uintptr(unsafe.Pointer(addr))&7 != 0 { + *(*int)(nil) = 0 // crash on unaligned uint64 + } + _ = *addr // if nil, fault before taking the lock + var ok bool + addrLock(addr).lock() + if *addr == old { + *addr = new + ok = true + } + addrLock(addr).unlock() + return ok +} + +//go:nosplit +func goXadd64(addr *uint64, delta int64) uint64 { + if uintptr(unsafe.Pointer(addr))&7 != 0 { + *(*int)(nil) = 0 // crash on unaligned uint64 + } + _ = *addr // if nil, fault before taking the lock + var r uint64 + addrLock(addr).lock() + r = *addr + uint64(delta) + *addr = r + addrLock(addr).unlock() + return r +} + +//go:nosplit +func goXchg64(addr *uint64, v uint64) uint64 { + if uintptr(unsafe.Pointer(addr))&7 != 0 { + *(*int)(nil) = 0 // crash on unaligned uint64 + } + _ = *addr // if nil, fault before taking the lock + var r uint64 + addrLock(addr).lock() + r = *addr + *addr = v + addrLock(addr).unlock() + return r +} + +//go:nosplit +func goLoad64(addr *uint64) uint64 { + if uintptr(unsafe.Pointer(addr))&7 != 0 { + *(*int)(nil) = 0 // crash on unaligned uint64 + } + _ = *addr // if nil, fault before taking the lock + var 
r uint64 + addrLock(addr).lock() + r = *addr + addrLock(addr).unlock() + return r +} + +//go:nosplit +func goStore64(addr *uint64, v uint64) { + if uintptr(unsafe.Pointer(addr))&7 != 0 { + *(*int)(nil) = 0 // crash on unaligned uint64 + } + _ = *addr // if nil, fault before taking the lock + addrLock(addr).lock() + *addr = v + addrLock(addr).unlock() +} + +//go:nosplit +func Or8(addr *uint8, v uint8) { + // Align down to 4 bytes and use 32-bit CAS. + uaddr := uintptr(unsafe.Pointer(addr)) + addr32 := (*uint32)(unsafe.Pointer(uaddr &^ 3)) + word := uint32(v) << ((uaddr & 3) * 8) // little endian + for { + old := *addr32 + if Cas(addr32, old, old|word) { + return + } + } +} + +//go:nosplit +func And8(addr *uint8, v uint8) { + // Align down to 4 bytes and use 32-bit CAS. + uaddr := uintptr(unsafe.Pointer(addr)) + addr32 := (*uint32)(unsafe.Pointer(uaddr &^ 3)) + word := uint32(v) << ((uaddr & 3) * 8) // little endian + mask := uint32(0xFF) << ((uaddr & 3) * 8) // little endian + word |= ^mask + for { + old := *addr32 + if Cas(addr32, old, old&word) { + return + } + } +} + +//go:nosplit +func Or(addr *uint32, v uint32) { + for { + old := *addr + if Cas(addr, old, old|v) { + return + } + } +} + +//go:nosplit +func And(addr *uint32, v uint32) { + for { + old := *addr + if Cas(addr, old, old&v) { + return + } + } +} + +//go:nosplit +func armcas(ptr *uint32, old, new uint32) bool + +//go:noescape +func Load(addr *uint32) uint32 + +// NO go:noescape annotation; *addr escapes if result escapes (#31525) +func Loadp(addr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func Load8(addr *uint8) uint8 + +//go:noescape +func LoadAcq(addr *uint32) uint32 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func Cas64(addr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(addr *uint32, old, new uint32) bool + +//go:noescape +func Xadd64(addr *uint64, delta int64) uint64 + +//go:noescape +func Xchg64(addr *uint64, v uint64) uint64 + +//go:noescape 
+func Load64(addr *uint64) uint64 + +//go:noescape +func Store8(addr *uint8, v uint8) + +//go:noescape +func Store64(addr *uint64, v uint64) diff --git a/src/runtime/internal/atomic/atomic_arm.s b/src/runtime/internal/atomic/atomic_arm.s new file mode 100644 index 0000000..92cbe8a --- /dev/null +++ b/src/runtime/internal/atomic/atomic_arm.s @@ -0,0 +1,297 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "funcdata.h" + +// bool armcas(int32 *val, int32 old, int32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// }else +// return 0; +// +// To implement ·cas in sys_$GOOS_arm.s +// using the native instructions, use: +// +// TEXT ·cas(SB),NOSPLIT,$0 +// B ·armcas(SB) +// +TEXT ·armcas(SB),NOSPLIT,$0-13 + MOVW ptr+0(FP), R1 + MOVW old+4(FP), R2 + MOVW new+8(FP), R3 +casl: + LDREX (R1), R0 + CMP R0, R2 + BNE casfail + + MOVB runtime·goarm(SB), R8 + CMP $7, R8 + BLT 2(PC) + DMB MB_ISHST + + STREX R3, (R1), R0 + CMP $0, R0 + BNE casl + MOVW $1, R0 + + CMP $7, R8 + BLT 2(PC) + DMB MB_ISH + + MOVB R0, ret+12(FP) + RET +casfail: + MOVW $0, R0 + MOVB R0, ret+12(FP) + RET + +// stubs + +TEXT ·Loadp(SB),NOSPLIT|NOFRAME,$0-8 + B ·Load(SB) + +TEXT ·LoadAcq(SB),NOSPLIT|NOFRAME,$0-8 + B ·Load(SB) + +TEXT ·LoadAcquintptr(SB),NOSPLIT|NOFRAME,$0-8 + B ·Load(SB) + +TEXT ·Casint32(SB),NOSPLIT,$0-13 + B ·Cas(SB) + +TEXT ·Casint64(SB),NOSPLIT,$-4-21 + B ·Cas64(SB) + +TEXT ·Casuintptr(SB),NOSPLIT,$0-13 + B ·Cas(SB) + +TEXT ·Casp1(SB),NOSPLIT,$0-13 + B ·Cas(SB) + +TEXT ·CasRel(SB),NOSPLIT,$0-13 + B ·Cas(SB) + +TEXT ·Loadint32(SB),NOSPLIT,$0-8 + B ·Load(SB) + +TEXT ·Loadint64(SB),NOSPLIT,$-4-12 + B ·Load64(SB) + +TEXT ·Loaduintptr(SB),NOSPLIT,$0-8 + B ·Load(SB) + +TEXT ·Loaduint(SB),NOSPLIT,$0-8 + B ·Load(SB) + +TEXT ·Storeint32(SB),NOSPLIT,$0-8 + B ·Store(SB) + +TEXT ·Storeint64(SB),NOSPLIT,$0-12 + B ·Store64(SB) + 
+TEXT ·Storeuintptr(SB),NOSPLIT,$0-8 + B ·Store(SB) + +TEXT ·StorepNoWB(SB),NOSPLIT,$0-8 + B ·Store(SB) + +TEXT ·StoreRel(SB),NOSPLIT,$0-8 + B ·Store(SB) + +TEXT ·StoreReluintptr(SB),NOSPLIT,$0-8 + B ·Store(SB) + +TEXT ·Xaddint32(SB),NOSPLIT,$0-12 + B ·Xadd(SB) + +TEXT ·Xaddint64(SB),NOSPLIT,$-4-20 + B ·Xadd64(SB) + +TEXT ·Xadduintptr(SB),NOSPLIT,$0-12 + B ·Xadd(SB) + +TEXT ·Xchgint32(SB),NOSPLIT,$0-12 + B ·Xchg(SB) + +TEXT ·Xchgint64(SB),NOSPLIT,$-4-20 + B ·Xchg64(SB) + +// 64-bit atomics +// The native ARM implementations use LDREXD/STREXD, which are +// available on ARMv6k or later. We use them only on ARMv7. +// On older ARM, we use Go implementations which simulate 64-bit +// atomics with locks. +TEXT armCas64<>(SB),NOSPLIT,$0-21 + // addr is already in R1 + MOVW old_lo+4(FP), R2 + MOVW old_hi+8(FP), R3 + MOVW new_lo+12(FP), R4 + MOVW new_hi+16(FP), R5 +cas64loop: + LDREXD (R1), R6 // loads R6 and R7 + CMP R2, R6 + BNE cas64fail + CMP R3, R7 + BNE cas64fail + + DMB MB_ISHST + + STREXD R4, (R1), R0 // stores R4 and R5 + CMP $0, R0 + BNE cas64loop + MOVW $1, R0 + + DMB MB_ISH + + MOVBU R0, swapped+20(FP) + RET +cas64fail: + MOVW $0, R0 + MOVBU R0, swapped+20(FP) + RET + +TEXT armXadd64<>(SB),NOSPLIT,$0-20 + // addr is already in R1 + MOVW delta_lo+4(FP), R2 + MOVW delta_hi+8(FP), R3 + +add64loop: + LDREXD (R1), R4 // loads R4 and R5 + ADD.S R2, R4 + ADC R3, R5 + + DMB MB_ISHST + + STREXD R4, (R1), R0 // stores R4 and R5 + CMP $0, R0 + BNE add64loop + + DMB MB_ISH + + MOVW R4, new_lo+12(FP) + MOVW R5, new_hi+16(FP) + RET + +TEXT armXchg64<>(SB),NOSPLIT,$0-20 + // addr is already in R1 + MOVW new_lo+4(FP), R2 + MOVW new_hi+8(FP), R3 + +swap64loop: + LDREXD (R1), R4 // loads R4 and R5 + + DMB MB_ISHST + + STREXD R2, (R1), R0 // stores R2 and R3 + CMP $0, R0 + BNE swap64loop + + DMB MB_ISH + + MOVW R4, old_lo+12(FP) + MOVW R5, old_hi+16(FP) + RET + +TEXT armLoad64<>(SB),NOSPLIT,$0-12 + // addr is already in R1 + + LDREXD (R1), R2 // loads R2 and R3 + DMB MB_ISH + + 
MOVW R2, val_lo+4(FP) + MOVW R3, val_hi+8(FP) + RET + +TEXT armStore64<>(SB),NOSPLIT,$0-12 + // addr is already in R1 + MOVW val_lo+4(FP), R2 + MOVW val_hi+8(FP), R3 + +store64loop: + LDREXD (R1), R4 // loads R4 and R5 + + DMB MB_ISHST + + STREXD R2, (R1), R0 // stores R2 and R3 + CMP $0, R0 + BNE store64loop + + DMB MB_ISH + RET + +// The following functions all panic if their address argument isn't +// 8-byte aligned. Since we're calling back into Go code to do this, +// we have to cooperate with stack unwinding. In the normal case, the +// functions tail-call into the appropriate implementation, which +// means they must not open a frame. Hence, when they go down the +// panic path, at that point they push the LR to create a real frame +// (they don't need to pop it because panic won't return; however, we +// do need to set the SP delta back). + +// Check if R1 is 8-byte aligned, panic if not. +// Clobbers R2. +#define CHECK_ALIGN \ + AND.S $7, R1, R2 \ + BEQ 4(PC) \ + MOVW.W R14, -4(R13) /* prepare a real frame */ \ + BL ·panicUnaligned(SB) \ + ADD $4, R13 /* compensate SP delta */ + +TEXT ·Cas64(SB),NOSPLIT,$-4-21 + NO_LOCAL_POINTERS + MOVW addr+0(FP), R1 + CHECK_ALIGN + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP armCas64<>(SB) + JMP ·goCas64(SB) + +TEXT ·Xadd64(SB),NOSPLIT,$-4-20 + NO_LOCAL_POINTERS + MOVW addr+0(FP), R1 + CHECK_ALIGN + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP armXadd64<>(SB) + JMP ·goXadd64(SB) + +TEXT ·Xchg64(SB),NOSPLIT,$-4-20 + NO_LOCAL_POINTERS + MOVW addr+0(FP), R1 + CHECK_ALIGN + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP armXchg64<>(SB) + JMP ·goXchg64(SB) + +TEXT ·Load64(SB),NOSPLIT,$-4-12 + NO_LOCAL_POINTERS + MOVW addr+0(FP), R1 + CHECK_ALIGN + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP armLoad64<>(SB) + JMP ·goLoad64(SB) + +TEXT ·Store64(SB),NOSPLIT,$-4-12 + NO_LOCAL_POINTERS + MOVW addr+0(FP), R1 + CHECK_ALIGN + + MOVB runtime·goarm(SB), R11 + CMP $7, 
R11 + BLT 2(PC) + JMP armStore64<>(SB) + JMP ·goStore64(SB) diff --git a/src/runtime/internal/atomic/atomic_arm64.go b/src/runtime/internal/atomic/atomic_arm64.go new file mode 100644 index 0000000..459fb99 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_arm64.go @@ -0,0 +1,94 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build arm64 + +package atomic + +import ( + "internal/cpu" + "unsafe" +) + +const ( + offsetARM64HasATOMICS = unsafe.Offsetof(cpu.ARM64.HasATOMICS) +) + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +//go:noescape +func Load64(ptr *uint64) uint64 + +// NO go:noescape annotation; *ptr escapes if result escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(addr *uint32) uint32 + +//go:noescape +func LoadAcq64(ptr *uint64) uint64 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +// NO go:noescape annotation; see atomic_pointer.go. 
+func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) diff --git a/src/runtime/internal/atomic/atomic_arm64.s b/src/runtime/internal/atomic/atomic_arm64.s new file mode 100644 index 0000000..5f77d92 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_arm64.s @@ -0,0 +1,333 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·Casint32(SB), NOSPLIT, $0-17 + B ·Cas(SB) + +TEXT ·Casint64(SB), NOSPLIT, $0-25 + B ·Cas64(SB) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + B ·Cas64(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + B ·Cas(SB) + +TEXT ·Loadint32(SB), NOSPLIT, $0-12 + B ·Load(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + B ·Load64(SB) + +TEXT ·Loaduintptr(SB), NOSPLIT, $0-16 + B ·Load64(SB) + +TEXT ·Loaduint(SB), NOSPLIT, $0-16 + B ·Load64(SB) + +TEXT ·Storeint32(SB), NOSPLIT, $0-12 + B ·Store(SB) + +TEXT ·Storeint64(SB), NOSPLIT, $0-16 + B ·Store64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + B ·Store64(SB) + +TEXT ·Xaddint32(SB), NOSPLIT, $0-20 + B ·Xadd(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + B ·Xadd64(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + B ·Xadd64(SB) + +TEXT ·Casp1(SB), NOSPLIT, $0-25 + B ·Cas64(SB) + +// uint32 ·Load(uint32 volatile* addr) +TEXT ·Load(SB),NOSPLIT,$0-12 + MOVD ptr+0(FP), R0 + LDARW (R0), R0 + MOVW R0, ret+8(FP) + RET + +// uint8 ·Load8(uint8 volatile* addr) +TEXT ·Load8(SB),NOSPLIT,$0-9 + MOVD ptr+0(FP), R0 + LDARB (R0), R0 + MOVB R0, ret+8(FP) + RET + +// uint64 ·Load64(uint64 volatile* addr) +TEXT ·Load64(SB),NOSPLIT,$0-16 + MOVD ptr+0(FP), R0 + LDAR (R0), R0 + MOVD R0, ret+8(FP) + RET + +// void *·Loadp(void *volatile *addr) +TEXT ·Loadp(SB),NOSPLIT,$0-16 + MOVD ptr+0(FP), R0 + LDAR 
(R0), R0 + MOVD R0, ret+8(FP) + RET + +// uint32 ·LoadAcq(uint32 volatile* addr) +TEXT ·LoadAcq(SB),NOSPLIT,$0-12 + B ·Load(SB) + +// uint64 ·LoadAcq64(uint64 volatile* addr) +TEXT ·LoadAcq64(SB),NOSPLIT,$0-16 + B ·Load64(SB) + +// uintptr ·LoadAcquintptr(uintptr volatile* addr) +TEXT ·LoadAcquintptr(SB),NOSPLIT,$0-16 + B ·Load64(SB) + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + B ·Store64(SB) + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + B ·Store(SB) + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + B ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 + B ·Store64(SB) + +TEXT ·Store(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R0 + MOVW val+8(FP), R1 + STLRW R1, (R0) + RET + +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R0 + MOVB val+8(FP), R1 + STLRB R1, (R0) + RET + +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOVD ptr+0(FP), R0 + MOVD val+8(FP), R1 + STLR R1, (R0) + RET + +// uint32 Xchg(ptr *uint32, new uint32) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R0 + MOVW new+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + SWPALW R1, (R0), R2 + MOVW R2, ret+16(FP) + RET +load_store_loop: + LDAXRW (R0), R2 + STLXRW R1, (R0), R3 + CBNZ R3, load_store_loop + MOVW R2, ret+16(FP) + RET + +// uint64 Xchg64(ptr *uint64, new uint64) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R0 + MOVD new+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + SWPALD R1, (R0), R2 + MOVD R2, ret+16(FP) + RET +load_store_loop: + LDAXR (R0), R2 + STLXR R1, (R0), R3 + CBNZ R3, load_store_loop + MOVD R2, ret+16(FP) + RET + +// bool Cas(uint32 *ptr, uint32 old, uint32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOVD ptr+0(FP), R0 + MOVW old+8(FP), R1 + MOVW new+12(FP), R2 + MOVBU 
internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + MOVD R1, R3 + CASALW R3, (R0), R2 + CMP R1, R3 + CSET EQ, R0 + MOVB R0, ret+16(FP) + RET +load_store_loop: + LDAXRW (R0), R3 + CMPW R1, R3 + BNE ok + STLXRW R2, (R0), R3 + CBNZ R3, load_store_loop +ok: + CSET EQ, R0 + MOVB R0, ret+16(FP) + RET + +// bool ·Cas64(uint64 *ptr, uint64 old, uint64 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVD ptr+0(FP), R0 + MOVD old+8(FP), R1 + MOVD new+16(FP), R2 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + MOVD R1, R3 + CASALD R3, (R0), R2 + CMP R1, R3 + CSET EQ, R0 + MOVB R0, ret+24(FP) + RET +load_store_loop: + LDAXR (R0), R3 + CMP R1, R3 + BNE ok + STLXR R2, (R0), R3 + CBNZ R3, load_store_loop +ok: + CSET EQ, R0 + MOVB R0, ret+24(FP) + RET + +// uint32 xadd(uint32 volatile *ptr, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R0 + MOVW delta+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + LDADDALW R1, (R0), R2 + ADD R1, R2 + MOVW R2, ret+16(FP) + RET +load_store_loop: + LDAXRW (R0), R2 + ADDW R2, R1, R2 + STLXRW R2, (R0), R3 + CBNZ R3, load_store_loop + MOVW R2, ret+16(FP) + RET + +// uint64 Xadd64(uint64 volatile *ptr, int64 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R0 + MOVD delta+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + LDADDALD R1, (R0), R2 + ADD R1, R2 + MOVD R2, ret+16(FP) + RET +load_store_loop: + LDAXR (R0), R2 + ADD R2, R1, R2 + STLXR R2, (R0), R3 + CBNZ R3, load_store_loop + MOVD R2, ret+16(FP) + RET + +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + B ·Xchg(SB) + +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + B ·Xchg64(SB) + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + B 
·Xchg64(SB) + +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R0 + MOVB val+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + MVN R1, R2 + LDCLRALB R2, (R0), R3 + RET +load_store_loop: + LDAXRB (R0), R2 + AND R1, R2 + STLXRB R2, (R0), R3 + CBNZ R3, load_store_loop + RET + +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R0 + MOVB val+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + LDORALB R1, (R0), R2 + RET +load_store_loop: + LDAXRB (R0), R2 + ORR R1, R2 + STLXRB R2, (R0), R3 + CBNZ R3, load_store_loop + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R0 + MOVW val+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + MVN R1, R2 + LDCLRALW R2, (R0), R3 + RET +load_store_loop: + LDAXRW (R0), R2 + AND R1, R2 + STLXRW R2, (R0), R3 + CBNZ R3, load_store_loop + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R0 + MOVW val+8(FP), R1 + MOVBU internal∕cpu·ARM64+const_offsetARM64HasATOMICS(SB), R4 + CBZ R4, load_store_loop + LDORALW R1, (R0), R2 + RET +load_store_loop: + LDAXRW (R0), R2 + ORR R1, R2 + STLXRW R2, (R0), R3 + CBNZ R3, load_store_loop + RET diff --git a/src/runtime/internal/atomic/atomic_loong64.go b/src/runtime/internal/atomic/atomic_loong64.go new file mode 100644 index 0000000..d82a5b8 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_loong64.go @@ -0,0 +1,89 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build loong64 + +package atomic + +import "unsafe" + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +//go:noescape +func Load64(ptr *uint64) uint64 + +// NO go:noescape annotation; *ptr escapes if result escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(ptr *uint32) uint32 + +//go:noescape +func LoadAcq64(ptr *uint64) uint64 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func Or(ptr *uint32, val uint32) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) diff --git a/src/runtime/internal/atomic/atomic_loong64.s b/src/runtime/internal/atomic/atomic_loong64.s new file mode 100644 index 0000000..3d802be --- /dev/null +++ b/src/runtime/internal/atomic/atomic_loong64.s @@ -0,0 +1,306 @@ +// Copyright 2022 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// bool cas(uint32 *ptr, uint32 old, uint32 new) +// Atomically: +// if(*ptr == old){ +// *ptr = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOVV ptr+0(FP), R4 + MOVW old+8(FP), R5 + MOVW new+12(FP), R6 + DBAR +cas_again: + MOVV R6, R7 + LL (R4), R8 + BNE R5, R8, cas_fail + SC R7, (R4) + BEQ R7, cas_again + MOVV $1, R4 + MOVB R4, ret+16(FP) + DBAR + RET +cas_fail: + MOVV $0, R4 + JMP -4(PC) + +// bool cas64(uint64 *ptr, uint64 old, uint64 new) +// Atomically: +// if(*ptr == old){ +// *ptr = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVV ptr+0(FP), R4 + MOVV old+8(FP), R5 + MOVV new+16(FP), R6 + DBAR +cas64_again: + MOVV R6, R7 + LLV (R4), R8 + BNE R5, R8, cas64_fail + SCV R7, (R4) + BEQ R7, cas64_again + MOVV $1, R4 + MOVB R4, ret+24(FP) + DBAR + RET +cas64_fail: + MOVV $0, R4 + JMP -4(PC) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +TEXT ·Loaduintptr(SB), NOSPLIT|NOFRAME, $0-16 + JMP ·Load64(SB) + +TEXT ·Loaduint(SB), NOSPLIT|NOFRAME, $0-16 + JMP ·Load64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + JMP ·Load64(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +// bool casp(void **val, void *old, void *new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Casp1(SB), NOSPLIT, $0-25 + JMP runtime∕internal∕atomic·Cas64(SB) + +// uint32 xadd(uint32 volatile *ptr, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVV ptr+0(FP), R4 + MOVW delta+8(FP), R5 + DBAR + LL (R4), R6 + ADDU R6, R5, R7 + MOVV R7, R6 + SC R7, (R4) + BEQ R7, -4(PC) + MOVW R6, 
ret+16(FP) + DBAR + RET + +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVV ptr+0(FP), R4 + MOVV delta+8(FP), R5 + DBAR + LLV (R4), R6 + ADDVU R6, R5, R7 + MOVV R7, R6 + SCV R7, (R4) + BEQ R7, -4(PC) + MOVV R6, ret+16(FP) + DBAR + RET + +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVV ptr+0(FP), R4 + MOVW new+8(FP), R5 + + DBAR + MOVV R5, R6 + LL (R4), R7 + SC R6, (R4) + BEQ R6, -3(PC) + MOVW R7, ret+16(FP) + DBAR + RET + +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVV ptr+0(FP), R4 + MOVV new+8(FP), R5 + + DBAR + MOVV R5, R6 + LLV (R4), R7 + SCV R6, (R4) + BEQ R6, -3(PC) + MOVV R7, ret+16(FP) + DBAR + RET + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + JMP ·Store(SB) + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Store(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R4 + MOVW val+8(FP), R5 + DBAR + MOVW R5, 0(R4) + DBAR + RET + +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R4 + MOVB val+8(FP), R5 + DBAR + MOVB R5, 0(R4) + DBAR + RET + +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOVV ptr+0(FP), R4 + MOVV val+8(FP), R5 + DBAR + MOVV R5, 0(R4) + DBAR + RET + +// void Or8(byte volatile*, byte); +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R4 + MOVBU val+8(FP), R5 + // Align ptr down to 4 bytes so we can use 32-bit load/store. + MOVV $~3, R6 + AND R4, R6 + // R7 = ((ptr & 3) * 8) + AND $3, R4, R7 + SLLV $3, R7 + // Shift val for aligned ptr. R5 = val << R4 + SLLV R7, R5 + + DBAR + LL (R6), R7 + OR R5, R7 + SC R7, (R6) + BEQ R7, -4(PC) + DBAR + RET + +// void And8(byte volatile*, byte); +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R4 + MOVBU val+8(FP), R5 + // Align ptr down to 4 bytes so we can use 32-bit load/store. + MOVV $~3, R6 + AND R4, R6 + // R7 = ((ptr & 3) * 8) + AND $3, R4, R7 + SLLV $3, R7 + // Shift val for aligned ptr. 
R5 = val << R7 | ^(0xFF << R7) + MOVV $0xFF, R8 + SLLV R7, R5 + SLLV R7, R8 + NOR R0, R8 + OR R8, R5 + + DBAR + LL (R6), R7 + AND R5, R7 + SC R7, (R6) + BEQ R7, -4(PC) + DBAR + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R4 + MOVW val+8(FP), R5 + DBAR + LL (R4), R6 + OR R5, R6 + SC R6, (R4) + BEQ R6, -4(PC) + DBAR + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R4 + MOVW val+8(FP), R5 + DBAR + LL (R4), R6 + AND R5, R6 + SC R6, (R4) + BEQ R6, -4(PC) + DBAR + RET + +// uint32 runtime∕internal∕atomic·Load(uint32 volatile* ptr) +TEXT ·Load(SB),NOSPLIT|NOFRAME,$0-12 + MOVV ptr+0(FP), R19 + DBAR + MOVWU 0(R19), R19 + DBAR + MOVW R19, ret+8(FP) + RET + +// uint8 runtime∕internal∕atomic·Load8(uint8 volatile* ptr) +TEXT ·Load8(SB),NOSPLIT|NOFRAME,$0-9 + MOVV ptr+0(FP), R19 + DBAR + MOVBU 0(R19), R19 + DBAR + MOVB R19, ret+8(FP) + RET + +// uint64 runtime∕internal∕atomic·Load64(uint64 volatile* ptr) +TEXT ·Load64(SB),NOSPLIT|NOFRAME,$0-16 + MOVV ptr+0(FP), R19 + DBAR + MOVV 0(R19), R19 + DBAR + MOVV R19, ret+8(FP) + RET + +// void *runtime∕internal∕atomic·Loadp(void *volatile *ptr) +TEXT ·Loadp(SB),NOSPLIT|NOFRAME,$0-16 + MOVV ptr+0(FP), R19 + DBAR + MOVV 0(R19), R19 + DBAR + MOVV R19, ret+8(FP) + RET + +// uint32 runtime∕internal∕atomic·LoadAcq(uint32 volatile* ptr) +TEXT ·LoadAcq(SB),NOSPLIT|NOFRAME,$0-12 + JMP atomic·Load(SB) + +// uint64 ·LoadAcq64(uint64 volatile* ptr) +TEXT ·LoadAcq64(SB),NOSPLIT|NOFRAME,$0-16 + JMP atomic·Load64(SB) + +// uintptr ·LoadAcquintptr(uintptr volatile* ptr) +TEXT ·LoadAcquintptr(SB),NOSPLIT|NOFRAME,$0-16 + JMP atomic·Load64(SB) + diff --git a/src/runtime/internal/atomic/atomic_mips64x.go b/src/runtime/internal/atomic/atomic_mips64x.go new file mode 100644 index 0000000..1e12b83 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_mips64x.go @@ -0,0 +1,89 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +package atomic + +import "unsafe" + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +//go:noescape +func Load64(ptr *uint64) uint64 + +// NO go:noescape annotation; *ptr escapes if result escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(ptr *uint32) uint32 + +//go:noescape +func LoadAcq64(ptr *uint64) uint64 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +// NO go:noescape annotation; see atomic_pointer.go. 
+func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) diff --git a/src/runtime/internal/atomic/atomic_mips64x.s b/src/runtime/internal/atomic/atomic_mips64x.s new file mode 100644 index 0000000..b4411d8 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_mips64x.s @@ -0,0 +1,359 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +#include "textflag.h" + +#define SYNC WORD $0xf + +// bool cas(uint32 *ptr, uint32 old, uint32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOVV ptr+0(FP), R1 + MOVW old+8(FP), R2 + MOVW new+12(FP), R5 + SYNC +cas_again: + MOVV R5, R3 + LL (R1), R4 + BNE R2, R4, cas_fail + SC R3, (R1) + BEQ R3, cas_again + MOVV $1, R1 + MOVB R1, ret+16(FP) + SYNC + RET +cas_fail: + MOVV $0, R1 + JMP -4(PC) + +// bool cas64(uint64 *ptr, uint64 old, uint64 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVV ptr+0(FP), R1 + MOVV old+8(FP), R2 + MOVV new+16(FP), R5 + SYNC +cas64_again: + MOVV R5, R3 + LLV (R1), R4 + BNE R2, R4, cas64_fail + SCV R3, (R1) + BEQ R3, cas64_again + MOVV $1, R1 + MOVB R1, ret+24(FP) + SYNC + RET +cas64_fail: + MOVV $0, R1 + JMP -4(PC) + +TEXT ·Casint32(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +TEXT ·Casint64(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +TEXT ·Loaduintptr(SB), NOSPLIT|NOFRAME, $0-16 + JMP ·Load64(SB) + +TEXT ·Loaduint(SB), NOSPLIT|NOFRAME, $0-16 + JMP ·Load64(SB) + +TEXT ·Storeint32(SB), NOSPLIT, $0-12 + JMP 
·Store(SB) + +TEXT ·Storeint64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +TEXT ·Loadint32(SB), NOSPLIT, $0-12 + JMP ·Load(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + JMP ·Load64(SB) + +TEXT ·Xaddint32(SB), NOSPLIT, $0-20 + JMP ·Xadd(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + JMP ·Xadd64(SB) + +// bool casp(void **val, void *old, void *new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Casp1(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +// uint32 xadd(uint32 volatile *ptr, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVV ptr+0(FP), R2 + MOVW delta+8(FP), R3 + SYNC + LL (R2), R1 + ADDU R1, R3, R4 + MOVV R4, R1 + SC R4, (R2) + BEQ R4, -4(PC) + MOVW R1, ret+16(FP) + SYNC + RET + +// uint64 Xadd64(uint64 volatile *ptr, int64 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVV ptr+0(FP), R2 + MOVV delta+8(FP), R3 + SYNC + LLV (R2), R1 + ADDVU R1, R3, R4 + MOVV R4, R1 + SCV R4, (R2) + BEQ R4, -4(PC) + MOVV R1, ret+16(FP) + SYNC + RET + +// uint32 Xchg(ptr *uint32, new uint32) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVV ptr+0(FP), R2 + MOVW new+8(FP), R5 + + SYNC + MOVV R5, R3 + LL (R2), R1 + SC R3, (R2) + BEQ R3, -3(PC) + MOVW R1, ret+16(FP) + SYNC + RET + +// uint64 Xchg64(ptr *uint64, new uint64) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVV ptr+0(FP), R2 + MOVV new+8(FP), R5 + + SYNC + MOVV R5, R3 + LLV (R2), R1 + SCV R3, (R2) + BEQ R3, -3(PC) + MOVV R1, ret+16(FP) + SYNC + RET + +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + JMP ·Xchg(SB) + +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 
+ JMP ·Store64(SB) + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + JMP ·Store(SB) + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·Store(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R1 + MOVW val+8(FP), R2 + SYNC + MOVW R2, 0(R1) + SYNC + RET + +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R1 + MOVB val+8(FP), R2 + SYNC + MOVB R2, 0(R1) + SYNC + RET + +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOVV ptr+0(FP), R1 + MOVV val+8(FP), R2 + SYNC + MOVV R2, 0(R1) + SYNC + RET + +// void Or8(byte volatile*, byte); +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R1 + MOVBU val+8(FP), R2 + // Align ptr down to 4 bytes so we can use 32-bit load/store. + MOVV $~3, R3 + AND R1, R3 + // Compute val shift. +#ifdef GOARCH_mips64 + // Big endian. ptr = ptr ^ 3 + XOR $3, R1 +#endif + // R4 = ((ptr & 3) * 8) + AND $3, R1, R4 + SLLV $3, R4 + // Shift val for aligned ptr. R2 = val << R4 + SLLV R4, R2 + + SYNC + LL (R3), R4 + OR R2, R4 + SC R4, (R3) + BEQ R4, -4(PC) + SYNC + RET + +// void And8(byte volatile*, byte); +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVV ptr+0(FP), R1 + MOVBU val+8(FP), R2 + // Align ptr down to 4 bytes so we can use 32-bit load/store. + MOVV $~3, R3 + AND R1, R3 + // Compute val shift. +#ifdef GOARCH_mips64 + // Big endian. ptr = ptr ^ 3 + XOR $3, R1 +#endif + // R4 = ((ptr & 3) * 8) + AND $3, R1, R4 + SLLV $3, R4 + // Shift val for aligned ptr. 
R2 = val << R4 | ^(0xFF << R4) + MOVV $0xFF, R5 + SLLV R4, R2 + SLLV R4, R5 + NOR R0, R5 + OR R5, R2 + + SYNC + LL (R3), R4 + AND R2, R4 + SC R4, (R3) + BEQ R4, -4(PC) + SYNC + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R1 + MOVW val+8(FP), R2 + + SYNC + LL (R1), R3 + OR R2, R3 + SC R3, (R1) + BEQ R3, -4(PC) + SYNC + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVV ptr+0(FP), R1 + MOVW val+8(FP), R2 + + SYNC + LL (R1), R3 + AND R2, R3 + SC R3, (R1) + BEQ R3, -4(PC) + SYNC + RET + +// uint32 ·Load(uint32 volatile* ptr) +TEXT ·Load(SB),NOSPLIT|NOFRAME,$0-12 + MOVV ptr+0(FP), R1 + SYNC + MOVWU 0(R1), R1 + SYNC + MOVW R1, ret+8(FP) + RET + +// uint8 ·Load8(uint8 volatile* ptr) +TEXT ·Load8(SB),NOSPLIT|NOFRAME,$0-9 + MOVV ptr+0(FP), R1 + SYNC + MOVBU 0(R1), R1 + SYNC + MOVB R1, ret+8(FP) + RET + +// uint64 ·Load64(uint64 volatile* ptr) +TEXT ·Load64(SB),NOSPLIT|NOFRAME,$0-16 + MOVV ptr+0(FP), R1 + SYNC + MOVV 0(R1), R1 + SYNC + MOVV R1, ret+8(FP) + RET + +// void *·Loadp(void *volatile *ptr) +TEXT ·Loadp(SB),NOSPLIT|NOFRAME,$0-16 + MOVV ptr+0(FP), R1 + SYNC + MOVV 0(R1), R1 + SYNC + MOVV R1, ret+8(FP) + RET + +// uint32 ·LoadAcq(uint32 volatile* ptr) +TEXT ·LoadAcq(SB),NOSPLIT|NOFRAME,$0-12 + JMP atomic·Load(SB) + +// uint64 ·LoadAcq64(uint64 volatile* ptr) +TEXT ·LoadAcq64(SB),NOSPLIT|NOFRAME,$0-16 + JMP atomic·Load64(SB) + +// uintptr ·LoadAcquintptr(uintptr volatile* ptr) +TEXT ·LoadAcquintptr(SB),NOSPLIT|NOFRAME,$0-16 + JMP atomic·Load64(SB) diff --git a/src/runtime/internal/atomic/atomic_mipsx.go b/src/runtime/internal/atomic/atomic_mipsx.go new file mode 100644 index 0000000..5dd15a0 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_mipsx.go @@ -0,0 +1,167 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build mips || mipsle + +// Export some functions via linkname to assembly in sync/atomic. +// +//go:linkname Xadd64 +//go:linkname Xchg64 +//go:linkname Cas64 +//go:linkname Load64 +//go:linkname Store64 + +package atomic + +import ( + "internal/cpu" + "unsafe" +) + +// TODO implement lock striping +var lock struct { + state uint32 + pad [cpu.CacheLinePadSize - 4]byte +} + +//go:noescape +func spinLock(state *uint32) + +//go:noescape +func spinUnlock(state *uint32) + +//go:nosplit +func lockAndCheck(addr *uint64) { + // ensure 8-byte alignment + if uintptr(unsafe.Pointer(addr))&7 != 0 { + panicUnaligned() + } + // force dereference before taking lock + _ = *addr + + spinLock(&lock.state) +} + +//go:nosplit +func unlock() { + spinUnlock(&lock.state) +} + +//go:nosplit +func unlockNoFence() { + lock.state = 0 +} + +//go:nosplit +func Xadd64(addr *uint64, delta int64) (new uint64) { + lockAndCheck(addr) + + new = *addr + uint64(delta) + *addr = new + + unlock() + return +} + +//go:nosplit +func Xchg64(addr *uint64, new uint64) (old uint64) { + lockAndCheck(addr) + + old = *addr + *addr = new + + unlock() + return +} + +//go:nosplit +func Cas64(addr *uint64, old, new uint64) (swapped bool) { + lockAndCheck(addr) + + if (*addr) == old { + *addr = new + unlock() + return true + } + + unlockNoFence() + return false +} + +//go:nosplit +func Load64(addr *uint64) (val uint64) { + lockAndCheck(addr) + + val = *addr + + unlock() + return +} + +//go:nosplit +func Store64(addr *uint64, val uint64) { + lockAndCheck(addr) + + *addr = val + + unlock() + return +} + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +// NO go:noescape annotation; *ptr escapes if result 
escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(ptr *uint32) uint32 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) + +//go:noescape +func CasRel(addr *uint32, old, new uint32) bool diff --git a/src/runtime/internal/atomic/atomic_mipsx.s b/src/runtime/internal/atomic/atomic_mipsx.s new file mode 100644 index 0000000..390e9ce --- /dev/null +++ b/src/runtime/internal/atomic/atomic_mipsx.s @@ -0,0 +1,261 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
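Editorial sketch, not part of the patch: the 64-bit operations in atomic_mipsx.go above are serialized through one global spin lock because 32-bit MIPS has no 64-bit LL/SC. The idea can be approximated in portable Go as follows. The `spinLock` type and the lower-case helper names are hypothetical. The lock is built on `sync/atomic`'s 32-bit CAS rather than the runtime's assembly `spinLock`/`spinUnlock`, and the 8-byte alignment check performed by `lockAndCheck` is omitted.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinLock is a hypothetical stand-in for the runtime's assembly
// spinLock/spinUnlock pair; state is 0 when free and 1 when held.
type spinLock struct{ state uint32 }

func (l *spinLock) lock() {
	// Spin until we flip state from 0 to 1 with a 32-bit CAS.
	for !atomic.CompareAndSwapUint32(&l.state, 0, 1) {
		runtime.Gosched()
	}
}

func (l *spinLock) unlock() { atomic.StoreUint32(&l.state, 0) }

// One lock guards every 64-bit location, as in the patch (no striping).
var global spinLock

// xadd64 adds delta to *addr under the lock and returns the new value.
func xadd64(addr *uint64, delta int64) uint64 {
	global.lock()
	n := *addr + uint64(delta)
	*addr = n
	global.unlock()
	return n
}

// cas64 stores newVal into *addr only if *addr currently equals old.
func cas64(addr *uint64, old, newVal uint64) bool {
	global.lock()
	defer global.unlock()
	if *addr != old {
		return false
	}
	*addr = newVal
	return true
}

func main() {
	var counter uint64
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				xadd64(&counter, 1)
			}
		}()
	}
	wg.Wait()
	fmt.Println(counter) // prints 8000
}
```

A single lock keeps the code small at the cost of contention between unrelated addresses, which is presumably what the `TODO implement lock striping` note in the patch refers to.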
+ +//go:build mips || mipsle + +#include "textflag.h" + +// bool Cas(int32 *val, int32 old, int32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB),NOSPLIT,$0-13 + MOVW ptr+0(FP), R1 + MOVW old+4(FP), R2 + MOVW new+8(FP), R5 + SYNC +try_cas: + MOVW R5, R3 + LL (R1), R4 // R4 = *R1 + BNE R2, R4, cas_fail + SC R3, (R1) // *R1 = R3 + BEQ R3, try_cas + SYNC + MOVB R3, ret+12(FP) + RET +cas_fail: + MOVB R0, ret+12(FP) + RET + +TEXT ·Store(SB),NOSPLIT,$0-8 + MOVW ptr+0(FP), R1 + MOVW val+4(FP), R2 + SYNC + MOVW R2, 0(R1) + SYNC + RET + +TEXT ·Store8(SB),NOSPLIT,$0-5 + MOVW ptr+0(FP), R1 + MOVB val+4(FP), R2 + SYNC + MOVB R2, 0(R1) + SYNC + RET + +TEXT ·Load(SB),NOSPLIT,$0-8 + MOVW ptr+0(FP), R1 + SYNC + MOVW 0(R1), R1 + SYNC + MOVW R1, ret+4(FP) + RET + +TEXT ·Load8(SB),NOSPLIT,$0-5 + MOVW ptr+0(FP), R1 + SYNC + MOVB 0(R1), R1 + SYNC + MOVB R1, ret+4(FP) + RET + +// uint32 Xadd(uint32 volatile *val, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB),NOSPLIT,$0-12 + MOVW ptr+0(FP), R2 + MOVW delta+4(FP), R3 + SYNC +try_xadd: + LL (R2), R1 // R1 = *R2 + ADDU R1, R3, R4 + MOVW R4, R1 + SC R4, (R2) // *R2 = R4 + BEQ R4, try_xadd + SYNC + MOVW R1, ret+8(FP) + RET + +// uint32 Xchg(ptr *uint32, new uint32) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg(SB),NOSPLIT,$0-12 + MOVW ptr+0(FP), R2 + MOVW new+4(FP), R5 + SYNC +try_xchg: + MOVW R5, R3 + LL (R2), R1 // R1 = *R2 + SC R3, (R2) // *R2 = R3 + BEQ R3, try_xchg + SYNC + MOVW R1, ret+8(FP) + RET + +TEXT ·Casint32(SB),NOSPLIT,$0-13 + JMP ·Cas(SB) + +TEXT ·Casint64(SB),NOSPLIT,$0-21 + JMP ·Cas64(SB) + +TEXT ·Casuintptr(SB),NOSPLIT,$0-13 + JMP ·Cas(SB) + +TEXT ·CasRel(SB),NOSPLIT,$0-13 + JMP ·Cas(SB) + +TEXT ·Loaduintptr(SB),NOSPLIT,$0-8 + JMP ·Load(SB) + +TEXT ·Loaduint(SB),NOSPLIT,$0-8 + JMP ·Load(SB) + +TEXT ·Loadp(SB),NOSPLIT,$-0-8 + JMP ·Load(SB) + +TEXT ·Storeint32(SB),NOSPLIT,$0-8 + JMP ·Store(SB) + 
+TEXT ·Storeint64(SB),NOSPLIT,$0-12 + JMP ·Store64(SB) + +TEXT ·Storeuintptr(SB),NOSPLIT,$0-8 + JMP ·Store(SB) + +TEXT ·Xadduintptr(SB),NOSPLIT,$0-12 + JMP ·Xadd(SB) + +TEXT ·Loadint32(SB),NOSPLIT,$0-8 + JMP ·Load(SB) + +TEXT ·Loadint64(SB),NOSPLIT,$0-12 + JMP ·Load64(SB) + +TEXT ·Xaddint32(SB),NOSPLIT,$0-12 + JMP ·Xadd(SB) + +TEXT ·Xaddint64(SB),NOSPLIT,$0-20 + JMP ·Xadd64(SB) + +TEXT ·Casp1(SB),NOSPLIT,$0-13 + JMP ·Cas(SB) + +TEXT ·Xchgint32(SB),NOSPLIT,$0-12 + JMP ·Xchg(SB) + +TEXT ·Xchgint64(SB),NOSPLIT,$0-20 + JMP ·Xchg64(SB) + +TEXT ·Xchguintptr(SB),NOSPLIT,$0-12 + JMP ·Xchg(SB) + +TEXT ·StorepNoWB(SB),NOSPLIT,$0-8 + JMP ·Store(SB) + +TEXT ·StoreRel(SB),NOSPLIT,$0-8 + JMP ·Store(SB) + +TEXT ·StoreReluintptr(SB),NOSPLIT,$0-8 + JMP ·Store(SB) + +// void Or8(byte volatile*, byte); +TEXT ·Or8(SB),NOSPLIT,$0-5 + MOVW ptr+0(FP), R1 + MOVBU val+4(FP), R2 + MOVW $~3, R3 // Align ptr down to 4 bytes so we can use 32-bit load/store. + AND R1, R3 +#ifdef GOARCH_mips + // Big endian. ptr = ptr ^ 3 + XOR $3, R1 +#endif + AND $3, R1, R4 // R4 = ((ptr & 3) * 8) + SLL $3, R4 + SLL R4, R2, R2 // Shift val for aligned ptr. R2 = val << R4 + SYNC +try_or8: + LL (R3), R4 // R4 = *R3 + OR R2, R4 + SC R4, (R3) // *R3 = R4 + BEQ R4, try_or8 + SYNC + RET + +// void And8(byte volatile*, byte); +TEXT ·And8(SB),NOSPLIT,$0-5 + MOVW ptr+0(FP), R1 + MOVBU val+4(FP), R2 + MOVW $~3, R3 + AND R1, R3 +#ifdef GOARCH_mips + // Big endian. ptr = ptr ^ 3 + XOR $3, R1 +#endif + AND $3, R1, R4 // R4 = ((ptr & 3) * 8) + SLL $3, R4 + MOVW $0xFF, R5 + SLL R4, R2 + SLL R4, R5 + NOR R0, R5 + OR R5, R2 // Shift val for aligned ptr. 
R2 = val << R4 | ^(0xFF << R4) + SYNC +try_and8: + LL (R3), R4 // R4 = *R3 + AND R2, R4 + SC R4, (R3) // *R3 = R4 + BEQ R4, try_and8 + SYNC + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-8 + MOVW ptr+0(FP), R1 + MOVW val+4(FP), R2 + + SYNC + LL (R1), R3 + OR R2, R3 + SC R3, (R1) + BEQ R3, -4(PC) + SYNC + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-8 + MOVW ptr+0(FP), R1 + MOVW val+4(FP), R2 + + SYNC + LL (R1), R3 + AND R2, R3 + SC R3, (R1) + BEQ R3, -4(PC) + SYNC + RET + +TEXT ·spinLock(SB),NOSPLIT,$0-4 + MOVW state+0(FP), R1 + MOVW $1, R2 + SYNC +try_lock: + MOVW R2, R3 +check_again: + LL (R1), R4 + BNE R4, check_again + SC R3, (R1) + BEQ R3, try_lock + SYNC + RET + +TEXT ·spinUnlock(SB),NOSPLIT,$0-4 + MOVW state+0(FP), R1 + SYNC + MOVW R0, (R1) + SYNC + RET diff --git a/src/runtime/internal/atomic/atomic_ppc64x.go b/src/runtime/internal/atomic/atomic_ppc64x.go new file mode 100644 index 0000000..998d16e --- /dev/null +++ b/src/runtime/internal/atomic/atomic_ppc64x.go @@ -0,0 +1,89 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
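Editorial sketch, not part of the patch: the `Or8` and `And8` routines in atomic_mipsx.s above update a single byte by operating on the aligned 32-bit word that contains it. `Or8` shifts the value into its byte lane. `And8` additionally fills every other lane with ones (`val << shift | ^(0xFF << shift)`) so the neighbouring bytes survive the AND. The Go version below illustrates the mask arithmetic only. The names `or8` and `and8` and the explicit `lane` parameter are hypothetical, a CAS retry loop stands in for the LL/SC loop, and lanes are numbered little-endian (the assembly XORs the pointer with 3 on big-endian targets to get the same effect).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// or8 atomically ORs val into byte lane (0 = least significant) of *word.
func or8(word *uint32, lane uint, val uint8) {
	shift := (lane & 3) * 8
	mask := uint32(val) << shift // zeros in the other lanes leave them unchanged
	for {
		old := atomic.LoadUint32(word)
		if atomic.CompareAndSwapUint32(word, old, old|mask) {
			return
		}
	}
}

// and8 atomically ANDs val into byte lane (0 = least significant) of *word.
func and8(word *uint32, lane uint, val uint8) {
	shift := (lane & 3) * 8
	// Ones in the other lanes leave them unchanged under AND.
	mask := uint32(val)<<shift | ^(uint32(0xFF) << shift)
	for {
		old := atomic.LoadUint32(word)
		if atomic.CompareAndSwapUint32(word, old, old&mask) {
			return
		}
	}
}

func main() {
	w := uint32(0x11223344)
	or8(&w, 1, 0x0F)      // lane 1: 0x33 | 0x0F = 0x3F
	and8(&w, 3, 0xF0)     // lane 3: 0x11 & 0xF0 = 0x10
	fmt.Printf("%#x\n", w) // prints 0x10223f44
}
```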
+ +//go:build ppc64 || ppc64le + +package atomic + +import "unsafe" + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +//go:noescape +func Load64(ptr *uint64) uint64 + +// NO go:noescape annotation; *ptr escapes if result escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(ptr *uint32) uint32 + +//go:noescape +func LoadAcq64(ptr *uint64) uint64 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) diff --git a/src/runtime/internal/atomic/atomic_ppc64x.s b/src/runtime/internal/atomic/atomic_ppc64x.s new file mode 100644 index 0000000..04f0ead --- /dev/null +++ b/src/runtime/internal/atomic/atomic_ppc64x.s @@ -0,0 +1,362 @@ +// Copyright 2014 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "textflag.h" + +// For more details about how various memory models are +// enforced on POWER, the following paper provides more +// details about how they enforce C/C++ like models. This +// gives context about why the strange looking code +// sequences below work. +// +// http://www.rdrop.com/users/paulmck/scalability/paper/N2745r.2011.03.04a.html + +// uint32 ·Load(uint32 volatile* ptr) +TEXT ·Load(SB),NOSPLIT|NOFRAME,$-8-12 + MOVD ptr+0(FP), R3 + SYNC + MOVWZ 0(R3), R3 + CMPW R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7,0x4 + ISYNC + MOVW R3, ret+8(FP) + RET + +// uint8 ·Load8(uint8 volatile* ptr) +TEXT ·Load8(SB),NOSPLIT|NOFRAME,$-8-9 + MOVD ptr+0(FP), R3 + SYNC + MOVBZ 0(R3), R3 + CMP R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7,0x4 + ISYNC + MOVB R3, ret+8(FP) + RET + +// uint64 ·Load64(uint64 volatile* ptr) +TEXT ·Load64(SB),NOSPLIT|NOFRAME,$-8-16 + MOVD ptr+0(FP), R3 + SYNC + MOVD 0(R3), R3 + CMP R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7,0x4 + ISYNC + MOVD R3, ret+8(FP) + RET + +// void *·Loadp(void *volatile *ptr) +TEXT ·Loadp(SB),NOSPLIT|NOFRAME,$-8-16 + MOVD ptr+0(FP), R3 + SYNC + MOVD 0(R3), R3 + CMP R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7,0x4 + ISYNC + MOVD R3, ret+8(FP) + RET + +// uint32 ·LoadAcq(uint32 volatile* ptr) +TEXT ·LoadAcq(SB),NOSPLIT|NOFRAME,$-8-12 + MOVD ptr+0(FP), R3 + MOVWZ 0(R3), R3 + CMPW R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7, 0x4 + ISYNC + MOVW R3, ret+8(FP) + RET + +// uint64 ·LoadAcq64(uint64 volatile* ptr) +TEXT ·LoadAcq64(SB),NOSPLIT|NOFRAME,$-8-16 + MOVD ptr+0(FP), R3 + MOVD 0(R3), R3 + CMP R3, R3, CR7 + BC 4, 30, 1(PC) // bne- cr7, 0x4 + ISYNC + MOVD R3, ret+8(FP) + RET + +// bool cas(uint32 *ptr, uint32 old, uint32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else +// return 0; +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOVD ptr+0(FP), 
R3 + MOVWZ old+8(FP), R4 + MOVWZ new+12(FP), R5 + LWSYNC +cas_again: + LWAR (R3), R6 + CMPW R6, R4 + BNE cas_fail + STWCCC R5, (R3) + BNE cas_again + MOVD $1, R3 + LWSYNC + MOVB R3, ret+16(FP) + RET +cas_fail: + MOVB R0, ret+16(FP) + RET + +// bool ·Cas64(uint64 *ptr, uint64 old, uint64 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVD ptr+0(FP), R3 + MOVD old+8(FP), R4 + MOVD new+16(FP), R5 + LWSYNC +cas64_again: + LDAR (R3), R6 + CMP R6, R4 + BNE cas64_fail + STDCCC R5, (R3) + BNE cas64_again + MOVD $1, R3 + LWSYNC + MOVB R3, ret+24(FP) + RET +cas64_fail: + MOVB R0, ret+24(FP) + RET + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + MOVD ptr+0(FP), R3 + MOVWZ old+8(FP), R4 + MOVWZ new+12(FP), R5 + LWSYNC +cas_again: + LWAR (R3), $0, R6 // 0 = Mutex release hint + CMPW R6, R4 + BNE cas_fail + STWCCC R5, (R3) + BNE cas_again + MOVD $1, R3 + MOVB R3, ret+16(FP) + RET +cas_fail: + MOVB R0, ret+16(FP) + RET + +TEXT ·Casint32(SB), NOSPLIT, $0-17 + BR ·Cas(SB) + +TEXT ·Casint64(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +TEXT ·Loaduintptr(SB), NOSPLIT|NOFRAME, $0-16 + BR ·Load64(SB) + +TEXT ·LoadAcquintptr(SB), NOSPLIT|NOFRAME, $0-16 + BR ·LoadAcq64(SB) + +TEXT ·Loaduint(SB), NOSPLIT|NOFRAME, $0-16 + BR ·Load64(SB) + +TEXT ·Storeint32(SB), NOSPLIT, $0-12 + BR ·Store(SB) + +TEXT ·Storeint64(SB), NOSPLIT, $0-16 + BR ·Store64(SB) + +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + BR ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 + BR ·StoreRel64(SB) + +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + BR ·Xadd64(SB) + +TEXT ·Loadint32(SB), NOSPLIT, $0-12 + BR ·Load(SB) + +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + BR ·Load64(SB) + +TEXT ·Xaddint32(SB), NOSPLIT, $0-20 + BR ·Xadd(SB) + +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + BR ·Xadd64(SB) + +// bool casp(void **val, void *old, void *new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; 
+// } else +// return 0; +TEXT ·Casp1(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +// uint32 xadd(uint32 volatile *ptr, int32 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R4 + MOVW delta+8(FP), R5 + LWSYNC + LWAR (R4), R3 + ADD R5, R3 + STWCCC R3, (R4) + BNE -3(PC) + MOVW R3, ret+16(FP) + RET + +// uint64 Xadd64(uint64 volatile *val, int64 delta) +// Atomically: +// *val += delta; +// return *val; +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R4 + MOVD delta+8(FP), R5 + LWSYNC + LDAR (R4), R3 + ADD R5, R3 + STDCCC R3, (R4) + BNE -3(PC) + MOVD R3, ret+16(FP) + RET + +// uint32 Xchg(ptr *uint32, new uint32) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R4 + MOVW new+8(FP), R5 + LWSYNC + LWAR (R4), R3 + STWCCC R5, (R4) + BNE -2(PC) + ISYNC + MOVW R3, ret+16(FP) + RET + +// uint64 Xchg64(ptr *uint64, new uint64) +// Atomically: +// old := *ptr; +// *ptr = new; +// return old; +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R4 + MOVD new+8(FP), R5 + LWSYNC + LDAR (R4), R3 + STDCCC R5, (R4) + BNE -2(PC) + ISYNC + MOVD R3, ret+16(FP) + RET + +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + BR ·Xchg(SB) + +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + BR ·Xchg64(SB) + +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + BR ·Xchg64(SB) + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + BR ·Store64(SB) + +TEXT ·Store(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + SYNC + MOVW R4, 0(R3) + RET + +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R3 + MOVB val+8(FP), R4 + SYNC + MOVB R4, 0(R3) + RET + +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOVD ptr+0(FP), R3 + MOVD val+8(FP), R4 + SYNC + MOVD R4, 0(R3) + RET + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + LWSYNC + MOVW R4, 0(R3) + RET + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + MOVD ptr+0(FP), R3 + MOVD val+8(FP), R4 + LWSYNC + MOVD R4, 0(R3) + RET + +// void ·Or8(byte volatile*, byte); 
+TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R3 + MOVBZ val+8(FP), R4 + LWSYNC +again: + LBAR (R3), R6 + OR R4, R6 + STBCCC R6, (R3) + BNE again + RET + +// void ·And8(byte volatile*, byte); +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R3 + MOVBZ val+8(FP), R4 + LWSYNC +again: + LBAR (R3), R6 + AND R4, R6 + STBCCC R6, (R3) + BNE again + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + LWSYNC +again: + LWAR (R3), R6 + OR R4, R6 + STWCCC R6, (R3) + BNE again + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + LWSYNC +again: + LWAR (R3),R6 + AND R4, R6 + STWCCC R6, (R3) + BNE again + RET diff --git a/src/runtime/internal/atomic/atomic_riscv64.go b/src/runtime/internal/atomic/atomic_riscv64.go new file mode 100644 index 0000000..8f24d61 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_riscv64.go @@ -0,0 +1,85 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package atomic + +import "unsafe" + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Load(ptr *uint32) uint32 + +//go:noescape +func Load8(ptr *uint8) uint8 + +//go:noescape +func Load64(ptr *uint64) uint64 + +// NO go:noescape annotation; *ptr escapes if result escapes (#31525) +func Loadp(ptr unsafe.Pointer) unsafe.Pointer + +//go:noescape +func LoadAcq(ptr *uint32) uint32 + +//go:noescape +func LoadAcq64(ptr *uint64) uint64 + +//go:noescape +func LoadAcquintptr(ptr *uintptr) uintptr + +//go:noescape +func Or8(ptr *uint8, val uint8) + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:noescape +func StoreRel(ptr *uint32, val uint32) + +//go:noescape +func StoreRel64(ptr *uint64, val uint64) + +//go:noescape +func StoreReluintptr(ptr *uintptr, val uintptr) diff --git a/src/runtime/internal/atomic/atomic_riscv64.s b/src/runtime/internal/atomic/atomic_riscv64.s new file mode 100644 index 0000000..21d5adc --- /dev/null +++ b/src/runtime/internal/atomic/atomic_riscv64.s @@ -0,0 +1,284 @@ +// Copyright 2014 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// RISC-V's atomic operations have two bits, aq ("acquire") and rl ("release"), +// which may be toggled on and off. Their precise semantics are defined in +// section 6.3 of the specification, but the basic idea is as follows: +// +// - If neither aq nor rl is set, the CPU may reorder the atomic arbitrarily. +// It guarantees only that it will execute atomically. +// +// - If aq is set, the CPU may move the instruction backward, but not forward. +// +// - If rl is set, the CPU may move the instruction forward, but not backward. +// +// - If both are set, the CPU may not reorder the instruction at all. +// +// These four modes correspond to other well-known memory models on other CPUs. +// On ARM, aq corresponds to a dmb ishst, aq+rl corresponds to a dmb ish. On +// Intel, aq corresponds to an lfence, rl to an sfence, and aq+rl to an mfence +// (or a lock prefix). +// +// Go's memory model requires that +// - if a read happens after a write, the read must observe the write, and +// that +// - if a read happens concurrently with a write, the read may observe the +// write. +// aq is sufficient to guarantee this, so that's what we use here. (This jibes +// with ARM, which uses dmb ishst.) 
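Editorial sketch, not part of the patch: the ordering requirement described in the comment above (a read that happens after a write must observe it) is what makes the usual publish/observe pattern work. The Go program below shows that pattern from the user's side. The names `payload`, `ready`, `producer`, and `consumer` are hypothetical, and Go's `sync/atomic` operations are sequentially consistent, which is stronger than the acquire-only ordering the comment discusses.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

var (
	payload int    // ordinary data, written before the flag is set
	ready   uint32 // flag published with an atomic store
)

// producer writes the data, then publishes it by setting the flag.
func producer() {
	payload = 42
	atomic.StoreUint32(&ready, 1)
}

// consumer waits for the flag. Once the atomic load observes 1, the
// earlier write to payload is guaranteed to be visible as well.
func consumer() int {
	for atomic.LoadUint32(&ready) == 0 {
		runtime.Gosched()
	}
	return payload
}

func main() {
	go producer()
	fmt.Println(consumer()) // prints 42
}
```

If the flag were read and written with plain loads and stores instead, the program would contain a data race and the consumer would have no guarantee of seeing `payload = 42`.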
+ +#include "textflag.h" + +// func Cas(ptr *uint32, old, new uint32) bool +// Atomically: +// if(*ptr == old){ +// *ptr = new; +// return 1; +// } else { +// return 0; +// } +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOV ptr+0(FP), A0 + MOVW old+8(FP), A1 + MOVW new+12(FP), A2 +cas_again: + LRW (A0), A3 + BNE A3, A1, cas_fail + SCW A2, (A0), A4 + BNE A4, ZERO, cas_again + MOV $1, A0 + MOVB A0, ret+16(FP) + RET +cas_fail: + MOV $0, A0 + MOVB A0, ret+16(FP) + RET + +// func Cas64(ptr *uint64, old, new uint64) bool +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOV ptr+0(FP), A0 + MOV old+8(FP), A1 + MOV new+16(FP), A2 +cas_again: + LRD (A0), A3 + BNE A3, A1, cas_fail + SCD A2, (A0), A4 + BNE A4, ZERO, cas_again + MOV $1, A0 + MOVB A0, ret+24(FP) + RET +cas_fail: + MOVB ZERO, ret+24(FP) + RET + +// func Load(ptr *uint32) uint32 +TEXT ·Load(SB),NOSPLIT|NOFRAME,$0-12 + MOV ptr+0(FP), A0 + LRW (A0), A0 + MOVW A0, ret+8(FP) + RET + +// func Load8(ptr *uint8) uint8 +TEXT ·Load8(SB),NOSPLIT|NOFRAME,$0-9 + MOV ptr+0(FP), A0 + FENCE + MOVBU (A0), A1 + FENCE + MOVB A1, ret+8(FP) + RET + +// func Load64(ptr *uint64) uint64 +TEXT ·Load64(SB),NOSPLIT|NOFRAME,$0-16 + MOV ptr+0(FP), A0 + LRD (A0), A0 + MOV A0, ret+8(FP) + RET + +// func Store(ptr *uint32, val uint32) +TEXT ·Store(SB), NOSPLIT, $0-12 + MOV ptr+0(FP), A0 + MOVW val+8(FP), A1 + AMOSWAPW A1, (A0), ZERO + RET + +// func Store8(ptr *uint8, val uint8) +TEXT ·Store8(SB), NOSPLIT, $0-9 + MOV ptr+0(FP), A0 + MOVBU val+8(FP), A1 + FENCE + MOVB A1, (A0) + FENCE + RET + +// func Store64(ptr *uint64, val uint64) +TEXT ·Store64(SB), NOSPLIT, $0-16 + MOV ptr+0(FP), A0 + MOV val+8(FP), A1 + AMOSWAPD A1, (A0), ZERO + RET + +TEXT ·Casp1(SB), NOSPLIT, $0-25 + JMP ·Cas64(SB) + +TEXT ·Casint32(SB),NOSPLIT,$0-17 + JMP ·Cas(SB) + +TEXT ·Casint64(SB),NOSPLIT,$0-25 + JMP ·Cas64(SB) + +TEXT ·Casuintptr(SB),NOSPLIT,$0-25 + JMP ·Cas64(SB) + +TEXT ·CasRel(SB), NOSPLIT, $0-17 + JMP ·Cas(SB) + +TEXT ·Loaduintptr(SB),NOSPLIT,$0-16 + JMP ·Load64(SB) + +TEXT
·Storeint32(SB),NOSPLIT,$0-12 + JMP ·Store(SB) + +TEXT ·Storeint64(SB),NOSPLIT,$0-16 + JMP ·Store64(SB) + +TEXT ·Storeuintptr(SB),NOSPLIT,$0-16 + JMP ·Store64(SB) + +TEXT ·Loaduint(SB),NOSPLIT,$0-16 + JMP ·Loaduintptr(SB) + +TEXT ·Loadint32(SB),NOSPLIT,$0-12 + JMP ·Load(SB) + +TEXT ·Loadint64(SB),NOSPLIT,$0-16 + JMP ·Load64(SB) + +TEXT ·Xaddint32(SB),NOSPLIT,$0-20 + JMP ·Xadd(SB) + +TEXT ·Xaddint64(SB),NOSPLIT,$0-24 + MOV ptr+0(FP), A0 + MOV delta+8(FP), A1 + AMOADDD A1, (A0), A0 + ADD A0, A1, A0 + MOV A0, ret+16(FP) + RET + +TEXT ·LoadAcq(SB),NOSPLIT|NOFRAME,$0-12 + JMP ·Load(SB) + +TEXT ·LoadAcq64(SB),NOSPLIT|NOFRAME,$0-16 + JMP ·Load64(SB) + +TEXT ·LoadAcquintptr(SB),NOSPLIT|NOFRAME,$0-16 + JMP ·Load64(SB) + +// func Loadp(ptr unsafe.Pointer) unsafe.Pointer +TEXT ·Loadp(SB),NOSPLIT,$0-16 + JMP ·Load64(SB) + +// func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreRel(SB), NOSPLIT, $0-12 + JMP ·Store(SB) + +TEXT ·StoreRel64(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +TEXT ·StoreReluintptr(SB), NOSPLIT, $0-16 + JMP ·Store64(SB) + +// func Xchg(ptr *uint32, new uint32) uint32 +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOV ptr+0(FP), A0 + MOVW new+8(FP), A1 + AMOSWAPW A1, (A0), A1 + MOVW A1, ret+16(FP) + RET + +// func Xchg64(ptr *uint64, new uint64) uint64 +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOV ptr+0(FP), A0 + MOV new+8(FP), A1 + AMOSWAPD A1, (A0), A1 + MOV A1, ret+16(FP) + RET + +// Atomically: +// *ptr += delta; +// return *ptr; + +// func Xadd(ptr *uint32, delta int32) uint32 +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOV ptr+0(FP), A0 + MOVW delta+8(FP), A1 + AMOADDW A1, (A0), A2 + ADD A2, A1, A0 + MOVW A0, ret+16(FP) + RET + +// func Xadd64(ptr *uint64, delta int64) uint64 +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOV ptr+0(FP), A0 + MOV delta+8(FP), A1 + AMOADDD A1, (A0), A2 + ADD A2, A1, A0 + MOV A0, ret+16(FP) + RET + +// func Xadduintptr(ptr *uintptr, delta uintptr) uintptr +TEXT ·Xadduintptr(SB), NOSPLIT,
$0-24 + JMP ·Xadd64(SB) + +// func Xchgint32(ptr *int32, new int32) int32 +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + JMP ·Xchg(SB) + +// func Xchgint64(ptr *int64, new int64) int64 +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +// func Xchguintptr(ptr *uintptr, new uintptr) uintptr +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + JMP ·Xchg64(SB) + +// func And8(ptr *uint8, val uint8) +TEXT ·And8(SB), NOSPLIT, $0-9 + MOV ptr+0(FP), A0 + MOVBU val+8(FP), A1 + AND $3, A0, A2 + AND $-4, A0 + SLL $3, A2 + XOR $255, A1 + SLL A2, A1 + XOR $-1, A1 + AMOANDW A1, (A0), ZERO + RET + +// func Or8(ptr *uint8, val uint8) +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOV ptr+0(FP), A0 + MOVBU val+8(FP), A1 + AND $3, A0, A2 + AND $-4, A0 + SLL $3, A2 + SLL A2, A1 + AMOORW A1, (A0), ZERO + RET + +// func And(ptr *uint32, val uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOV ptr+0(FP), A0 + MOVW val+8(FP), A1 + AMOANDW A1, (A0), ZERO + RET + +// func Or(ptr *uint32, val uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOV ptr+0(FP), A0 + MOVW val+8(FP), A1 + AMOORW A1, (A0), ZERO + RET diff --git a/src/runtime/internal/atomic/atomic_s390x.go b/src/runtime/internal/atomic/atomic_s390x.go new file mode 100644 index 0000000..9855bf0 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_s390x.go @@ -0,0 +1,123 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package atomic + +import "unsafe" + +// Export some functions via linkname to assembly in sync/atomic. 
+// +//go:linkname Load +//go:linkname Loadp +//go:linkname Load64 + +//go:nosplit +//go:noinline +func Load(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadp(ptr unsafe.Pointer) unsafe.Pointer { + return *(*unsafe.Pointer)(ptr) +} + +//go:nosplit +//go:noinline +func Load8(ptr *uint8) uint8 { + return *ptr +} + +//go:nosplit +//go:noinline +func Load64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcq(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcq64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcquintptr(ptr *uintptr) uintptr { + return *ptr +} + +//go:noescape +func Store(ptr *uint32, val uint32) + +//go:noescape +func Store8(ptr *uint8, val uint8) + +//go:noescape +func Store64(ptr *uint64, val uint64) + +// NO go:noescape annotation; see atomic_pointer.go. +func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:nosplit +//go:noinline +func StoreRel(ptr *uint32, val uint32) { + *ptr = val +} + +//go:nosplit +//go:noinline +func StoreRel64(ptr *uint64, val uint64) { + *ptr = val +} + +//go:nosplit +//go:noinline +func StoreReluintptr(ptr *uintptr, val uintptr) { + *ptr = val +} + +//go:noescape +func And8(ptr *uint8, val uint8) + +//go:noescape +func Or8(ptr *uint8, val uint8) + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). 
+ +//go:noescape +func And(ptr *uint32, val uint32) + +//go:noescape +func Or(ptr *uint32, val uint32) + +//go:noescape +func Xadd(ptr *uint32, delta int32) uint32 + +//go:noescape +func Xadd64(ptr *uint64, delta int64) uint64 + +//go:noescape +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr + +//go:noescape +func Xchg(ptr *uint32, new uint32) uint32 + +//go:noescape +func Xchg64(ptr *uint64, new uint64) uint64 + +//go:noescape +func Xchguintptr(ptr *uintptr, new uintptr) uintptr + +//go:noescape +func Cas64(ptr *uint64, old, new uint64) bool + +//go:noescape +func CasRel(ptr *uint32, old, new uint32) bool diff --git a/src/runtime/internal/atomic/atomic_s390x.s b/src/runtime/internal/atomic/atomic_s390x.s new file mode 100644 index 0000000..a0c204b --- /dev/null +++ b/src/runtime/internal/atomic/atomic_s390x.s @@ -0,0 +1,248 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +// func Store(ptr *uint32, val uint32) +TEXT ·Store(SB), NOSPLIT, $0 + MOVD ptr+0(FP), R2 + MOVWZ val+8(FP), R3 + MOVW R3, 0(R2) + SYNC + RET + +// func Store8(ptr *uint8, val uint8) +TEXT ·Store8(SB), NOSPLIT, $0 + MOVD ptr+0(FP), R2 + MOVB val+8(FP), R3 + MOVB R3, 0(R2) + SYNC + RET + +// func Store64(ptr *uint64, val uint64) +TEXT ·Store64(SB), NOSPLIT, $0 + MOVD ptr+0(FP), R2 + MOVD val+8(FP), R3 + MOVD R3, 0(R2) + SYNC + RET + +// func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) +TEXT ·StorepNoWB(SB), NOSPLIT, $0 + MOVD ptr+0(FP), R2 + MOVD val+8(FP), R3 + MOVD R3, 0(R2) + SYNC + RET + +// func Cas(ptr *uint32, old, new uint32) bool +// Atomically: +// if *ptr == old { +// *ptr = new +// return 1 +// } else { +// return 0 +// } +TEXT ·Cas(SB), NOSPLIT, $0-17 + MOVD ptr+0(FP), R3 + MOVWZ old+8(FP), R4 + MOVWZ new+12(FP), R5 + CS R4, R5, 0(R3) // if (R4 == 0(R3)) then 0(R3)= R5 + BNE cas_fail + MOVB $1, ret+16(FP) + RET +cas_fail: + MOVB $0, ret+16(FP) + RET + +// func Cas64(ptr *uint64, old, new uint64) bool +// Atomically: +// if *ptr == old { +// *ptr = new +// return 1 +// } else { +// return 0 +// } +TEXT ·Cas64(SB), NOSPLIT, $0-25 + MOVD ptr+0(FP), R3 + MOVD old+8(FP), R4 + MOVD new+16(FP), R5 + CSG R4, R5, 0(R3) // if (R4 == 0(R3)) then 0(R3)= R5 + BNE cas64_fail + MOVB $1, ret+24(FP) + RET +cas64_fail: + MOVB $0, ret+24(FP) + RET + +// func Casint32(ptr *int32, old, new int32) bool +TEXT ·Casint32(SB), NOSPLIT, $0-17 + BR ·Cas(SB) + +// func Casint64(ptr *int64, old, new int64) bool +TEXT ·Casint64(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +// func Casuintptr(ptr *uintptr, old, new uintptr) bool +TEXT ·Casuintptr(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +// func CasRel(ptr *uint32, old, new uint32) bool +TEXT ·CasRel(SB), NOSPLIT, $0-17 + BR ·Cas(SB) + +// func Loaduintptr(ptr *uintptr) uintptr +TEXT ·Loaduintptr(SB), NOSPLIT, $0-16 + BR ·Load64(SB) + +// func Loaduint(ptr *uint) uint +TEXT ·Loaduint(SB), NOSPLIT, $0-16 + BR
·Load64(SB) + +// func Storeint32(ptr *int32, new int32) +TEXT ·Storeint32(SB), NOSPLIT, $0-12 + BR ·Store(SB) + +// func Storeint64(ptr *int64, new int64) +TEXT ·Storeint64(SB), NOSPLIT, $0-16 + BR ·Store64(SB) + +// func Storeuintptr(ptr *uintptr, new uintptr) +TEXT ·Storeuintptr(SB), NOSPLIT, $0-16 + BR ·Store64(SB) + +// func Loadint32(ptr *int32) int32 +TEXT ·Loadint32(SB), NOSPLIT, $0-12 + BR ·Load(SB) + +// func Loadint64(ptr *int64) int64 +TEXT ·Loadint64(SB), NOSPLIT, $0-16 + BR ·Load64(SB) + +// func Xadduintptr(ptr *uintptr, delta uintptr) uintptr +TEXT ·Xadduintptr(SB), NOSPLIT, $0-24 + BR ·Xadd64(SB) + +// func Xaddint32(ptr *int32, delta int32) int32 +TEXT ·Xaddint32(SB), NOSPLIT, $0-20 + BR ·Xadd(SB) + +// func Xaddint64(ptr *int64, delta int64) int64 +TEXT ·Xaddint64(SB), NOSPLIT, $0-24 + BR ·Xadd64(SB) + +// func Casp1(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool +// Atomically: +// if *ptr == old { +// *ptr = new +// return 1 +// } else { +// return 0 +// } +TEXT ·Casp1(SB), NOSPLIT, $0-25 + BR ·Cas64(SB) + +// func Xadd(ptr *uint32, delta int32) uint32 +// Atomically: +// *ptr += delta +// return *ptr +TEXT ·Xadd(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R4 + MOVW delta+8(FP), R5 + MOVW (R4), R3 +repeat: + ADD R5, R3, R6 + CS R3, R6, (R4) // if R3==(R4) then (R4)=R6 else R3=(R4) + BNE repeat + MOVW R6, ret+16(FP) + RET + +// func Xadd64(ptr *uint64, delta int64) uint64 +TEXT ·Xadd64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R4 + MOVD delta+8(FP), R5 + MOVD (R4), R3 +repeat: + ADD R5, R3, R6 + CSG R3, R6, (R4) // if R3==(R4) then (R4)=R6 else R3=(R4) + BNE repeat + MOVD R6, ret+16(FP) + RET + +// func Xchg(ptr *uint32, new uint32) uint32 +TEXT ·Xchg(SB), NOSPLIT, $0-20 + MOVD ptr+0(FP), R4 + MOVW new+8(FP), R3 + MOVW (R4), R6 +repeat: + CS R6, R3, (R4) // if R6==(R4) then (R4)=R3 else R6=(R4) + BNE repeat + MOVW R6, ret+16(FP) + RET + +// func Xchg64(ptr *uint64, new uint64) uint64 +TEXT ·Xchg64(SB), NOSPLIT, $0-24 + MOVD ptr+0(FP), R4 + MOVD 
new+8(FP), R3 + MOVD (R4), R6 +repeat: + CSG R6, R3, (R4) // if R6==(R4) then (R4)=R3 else R6=(R4) + BNE repeat + MOVD R6, ret+16(FP) + RET + +// func Xchgint32(ptr *int32, new int32) int32 +TEXT ·Xchgint32(SB), NOSPLIT, $0-20 + BR ·Xchg(SB) + +// func Xchgint64(ptr *int64, new int64) int64 +TEXT ·Xchgint64(SB), NOSPLIT, $0-24 + BR ·Xchg64(SB) + +// func Xchguintptr(ptr *uintptr, new uintptr) uintptr +TEXT ·Xchguintptr(SB), NOSPLIT, $0-24 + BR ·Xchg64(SB) + +// func Or8(addr *uint8, v uint8) +TEXT ·Or8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R3 + MOVBZ val+8(FP), R4 + // We don't have atomic operations that work on individual bytes so we + // need to align addr down to a word boundary and create a mask + // containing v to OR with the entire word atomically. + MOVD $(3<<3), R5 + RXSBG $59, $60, $3, R3, R5 // R5 = 24 - ((addr % 4) * 8) = ((addr & 3) << 3) ^ (3 << 3) + ANDW $~3, R3 // R3 = floor(addr, 4) = addr &^ 3 + SLW R5, R4 // R4 = uint32(v) << R5 + LAO R4, R6, 0(R3) // R6 = *R3; *R3 |= R4; (atomic) + RET + +// func And8(addr *uint8, v uint8) +TEXT ·And8(SB), NOSPLIT, $0-9 + MOVD ptr+0(FP), R3 + MOVBZ val+8(FP), R4 + // We don't have atomic operations that work on individual bytes so we + // need to align addr down to a word boundary and create a mask + // containing v to AND with the entire word atomically. 
+ ORW $~0xff, R4 // R4 = uint32(v) | 0xffffff00 + MOVD $(3<<3), R5 + RXSBG $59, $60, $3, R3, R5 // R5 = 24 - ((addr % 4) * 8) = ((addr & 3) << 3) ^ (3 << 3) + ANDW $~3, R3 // R3 = floor(addr, 4) = addr &^ 3 + RLL R5, R4, R4 // R4 = rotl(R4, R5) + LAN R4, R6, 0(R3) // R6 = *R3; *R3 &= R4; (atomic) + RET + +// func Or(addr *uint32, v uint32) +TEXT ·Or(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + LAO R4, R6, 0(R3) // R6 = *R3; *R3 |= R4; (atomic) + RET + +// func And(addr *uint32, v uint32) +TEXT ·And(SB), NOSPLIT, $0-12 + MOVD ptr+0(FP), R3 + MOVW val+8(FP), R4 + LAN R4, R6, 0(R3) // R6 = *R3; *R3 &= R4; (atomic) + RET diff --git a/src/runtime/internal/atomic/atomic_test.go b/src/runtime/internal/atomic/atomic_test.go new file mode 100644 index 0000000..2427bfd --- /dev/null +++ b/src/runtime/internal/atomic/atomic_test.go @@ -0,0 +1,386 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package atomic_test + +import ( + "internal/goarch" + "runtime" + "runtime/internal/atomic" + "testing" + "unsafe" +) + +func runParallel(N, iter int, f func()) { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(int(N))) + done := make(chan bool) + for i := 0; i < N; i++ { + go func() { + for j := 0; j < iter; j++ { + f() + } + done <- true + }() + } + for i := 0; i < N; i++ { + <-done + } +} + +func TestXadduintptr(t *testing.T) { + N := 20 + iter := 100000 + if testing.Short() { + N = 10 + iter = 10000 + } + inc := uintptr(100) + total := uintptr(0) + runParallel(N, iter, func() { + atomic.Xadduintptr(&total, inc) + }) + if want := uintptr(N*iter) * inc; want != total { + t.Fatalf("xadduintptr error, want %d, got %d", want, total) + } + total = 0 + runParallel(N, iter, func() { + atomic.Xadduintptr(&total, inc) + atomic.Xadduintptr(&total, uintptr(-int64(inc))) + }) + if total != 0 { + t.Fatalf("xadduintptr total error, want %d, got %d", 0, total) + } +} + +// Tests that xadduintptr correctly updates 64-bit values. The place where +// we actually do so is mstats.go, functions mSysStat{Inc,Dec}. +func TestXadduintptrOnUint64(t *testing.T) { + if goarch.BigEndian { + // On big endian architectures, we never use xadduintptr to update + // 64-bit values and hence we skip the test. (Note that functions + // mSysStat{Inc,Dec} in mstats.go have explicit checks for + // big-endianness.) + t.Skip("skip xadduintptr on big endian architecture") + } + const inc = 100 + val := uint64(0) + atomic.Xadduintptr((*uintptr)(unsafe.Pointer(&val)), inc) + if inc != val { + t.Fatalf("xadduintptr should increase lower-order bits, want %d, got %d", inc, val) + } +} + +func shouldPanic(t *testing.T, name string, f func()) { + defer func() { + // Check that all GC maps are sane.
+ runtime.GC() + + err := recover() + want := "unaligned 64-bit atomic operation" + if err == nil { + t.Errorf("%s did not panic", name) + } else if s, _ := err.(string); s != want { + t.Errorf("%s: wanted panic %q, got %q", name, want, err) + } + }() + f() +} + +// Variant of sync/atomic's TestUnaligned64: +func TestUnaligned64(t *testing.T) { + // Unaligned 64-bit atomics on 32-bit systems are + // a continual source of pain. Test that on 32-bit systems they crash + // instead of failing silently. + + if unsafe.Sizeof(int(0)) != 4 { + t.Skip("test only runs on 32-bit systems") + } + + x := make([]uint32, 4) + u := unsafe.Pointer(uintptr(unsafe.Pointer(&x[0])) | 4) // force alignment to 4 + + up64 := (*uint64)(u) // misaligned + p64 := (*int64)(u) // misaligned + + shouldPanic(t, "Load64", func() { atomic.Load64(up64) }) + shouldPanic(t, "Loadint64", func() { atomic.Loadint64(p64) }) + shouldPanic(t, "Store64", func() { atomic.Store64(up64, 0) }) + shouldPanic(t, "Xadd64", func() { atomic.Xadd64(up64, 1) }) + shouldPanic(t, "Xchg64", func() { atomic.Xchg64(up64, 1) }) + shouldPanic(t, "Cas64", func() { atomic.Cas64(up64, 1, 2) }) +} + +func TestAnd8(t *testing.T) { + // Basic sanity check. + x := uint8(0xff) + for i := uint8(0); i < 8; i++ { + atomic.And8(&x, ^(1 << i)) + if r := uint8(0xff) << (i + 1); x != r { + t.Fatalf("clearing bit %#x: want %#x, got %#x", uint8(1<<i), r, x) + } + } + + // Set every bit in array to 1. + a := make([]uint8, 1<<12) + for i := range a { + a[i] = 0xff + } + + // Clear array bit-by-bit in different goroutines. + done := make(chan bool) + for i := 0; i < 8; i++ { + m := ^uint8(1 << i) + go func() { + for i := range a { + atomic.And8(&a[i], m) + } + done <- true + }() + } + for i := 0; i < 8; i++ { + <-done + } + + // Check that the array has been totally cleared. + for i, v := range a { + if v != 0 { + t.Fatalf("a[%v] not cleared: want %#x, got %#x", i, uint8(0), v) + } + } +} + +func TestAnd(t *testing.T) { + // Basic sanity check. 
+ x := uint32(0xffffffff) + for i := uint32(0); i < 32; i++ { + atomic.And(&x, ^(1 << i)) + if r := uint32(0xffffffff) << (i + 1); x != r { + t.Fatalf("clearing bit %#x: want %#x, got %#x", uint32(1<<i), r, x) + } + } + + // Set every bit in array to 1. + a := make([]uint32, 1<<12) + for i := range a { + a[i] = 0xffffffff + } + + // Clear array bit-by-bit in different goroutines. + done := make(chan bool) + for i := 0; i < 32; i++ { + m := ^uint32(1 << i) + go func() { + for i := range a { + atomic.And(&a[i], m) + } + done <- true + }() + } + for i := 0; i < 32; i++ { + <-done + } + + // Check that the array has been totally cleared. + for i, v := range a { + if v != 0 { + t.Fatalf("a[%v] not cleared: want %#x, got %#x", i, uint32(0), v) + } + } +} + +func TestOr8(t *testing.T) { + // Basic sanity check. + x := uint8(0) + for i := uint8(0); i < 8; i++ { + atomic.Or8(&x, 1<<i) + if r := (uint8(1) << (i + 1)) - 1; x != r { + t.Fatalf("setting bit %#x: want %#x, got %#x", uint8(1)<<i, r, x) + } + } + + // Start with every bit in array set to 0. + a := make([]uint8, 1<<12) + + // Set every bit in array bit-by-bit in different goroutines. + done := make(chan bool) + for i := 0; i < 8; i++ { + m := uint8(1 << i) + go func() { + for i := range a { + atomic.Or8(&a[i], m) + } + done <- true + }() + } + for i := 0; i < 8; i++ { + <-done + } + + // Check that the array has been totally set. + for i, v := range a { + if v != 0xff { + t.Fatalf("a[%v] not fully set: want %#x, got %#x", i, uint8(0xff), v) + } + } +} + +func TestOr(t *testing.T) { + // Basic sanity check. + x := uint32(0) + for i := uint32(0); i < 32; i++ { + atomic.Or(&x, 1<<i) + if r := (uint32(1) << (i + 1)) - 1; x != r { + t.Fatalf("setting bit %#x: want %#x, got %#x", uint32(1)<<i, r, x) + } + } + + // Start with every bit in array set to 0. + a := make([]uint32, 1<<12) + + // Set every bit in array bit-by-bit in different goroutines. 
+ done := make(chan bool) + for i := 0; i < 32; i++ { + m := uint32(1 << i) + go func() { + for i := range a { + atomic.Or(&a[i], m) + } + done <- true + }() + } + for i := 0; i < 32; i++ { + <-done + } + + // Check that the array has been totally set. + for i, v := range a { + if v != 0xffffffff { + t.Fatalf("a[%v] not fully set: want %#x, got %#x", i, uint32(0xffffffff), v) + } + } +} + +func TestBitwiseContended8(t *testing.T) { + // Start with every bit in array set to 0. + a := make([]uint8, 16) + + // Iterations to try. + N := 1 << 16 + if testing.Short() { + N = 1 << 10 + } + + // Set and then clear every bit in the array bit-by-bit in different goroutines. + done := make(chan bool) + for i := 0; i < 8; i++ { + m := uint8(1 << i) + go func() { + for n := 0; n < N; n++ { + for i := range a { + atomic.Or8(&a[i], m) + if atomic.Load8(&a[i])&m != m { + t.Errorf("a[%v] bit %#x not set", i, m) + } + atomic.And8(&a[i], ^m) + if atomic.Load8(&a[i])&m != 0 { + t.Errorf("a[%v] bit %#x not clear", i, m) + } + } + } + done <- true + }() + } + for i := 0; i < 8; i++ { + <-done + } + + // Check that the array has been totally cleared. + for i, v := range a { + if v != 0 { + t.Fatalf("a[%v] not cleared: want %#x, got %#x", i, uint8(0), v) + } + } +} + +func TestBitwiseContended(t *testing.T) { + // Start with every bit in array set to 0. + a := make([]uint32, 16) + + // Iterations to try. + N := 1 << 16 + if testing.Short() { + N = 1 << 10 + } + + // Set and then clear every bit in the array bit-by-bit in different goroutines. 
+ done := make(chan bool) + for i := 0; i < 32; i++ { + m := uint32(1 << i) + go func() { + for n := 0; n < N; n++ { + for i := range a { + atomic.Or(&a[i], m) + if atomic.Load(&a[i])&m != m { + t.Errorf("a[%v] bit %#x not set", i, m) + } + atomic.And(&a[i], ^m) + if atomic.Load(&a[i])&m != 0 { + t.Errorf("a[%v] bit %#x not clear", i, m) + } + } + } + done <- true + }() + } + for i := 0; i < 32; i++ { + <-done + } + + // Check that the array has been totally cleared. + for i, v := range a { + if v != 0 { + t.Fatalf("a[%v] not cleared: want %#x, got %#x", i, uint32(0), v) + } + } +} + +func TestCasRel(t *testing.T) { + const _magic = 0x5a5aa5a5 + var x struct { + before uint32 + i uint32 + after uint32 + o uint32 + n uint32 + } + + x.before = _magic + x.after = _magic + for j := 0; j < 32; j += 1 { + x.i = (1 << j) + 0 + x.o = (1 << j) + 0 + x.n = (1 << j) + 1 + if !atomic.CasRel(&x.i, x.o, x.n) { + t.Fatalf("should have swapped %#x %#x", x.o, x.n) + } + + if x.i != x.n { + t.Fatalf("wrong x.i after swap: x.i=%#x x.n=%#x", x.i, x.n) + } + + if x.before != _magic || x.after != _magic { + t.Fatalf("wrong magic: %#x _ %#x != %#x _ %#x", x.before, x.after, _magic, _magic) + } + } +} + +func TestStorepNoWB(t *testing.T) { + var p [2]*int + for i := range p { + atomic.StorepNoWB(unsafe.Pointer(&p[i]), unsafe.Pointer(new(int))) + } + if p[0] == p[1] { + t.Error("Bad escape analysis of StorepNoWB") + } +} diff --git a/src/runtime/internal/atomic/atomic_wasm.go b/src/runtime/internal/atomic/atomic_wasm.go new file mode 100644 index 0000000..835fc43 --- /dev/null +++ b/src/runtime/internal/atomic/atomic_wasm.go @@ -0,0 +1,341 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// TODO(neelance): implement with actual atomic operations as soon as threads are available +// See https://github.com/WebAssembly/design/issues/1073 + +// Export some functions via linkname to assembly in sync/atomic. +// +//go:linkname Load +//go:linkname Loadp +//go:linkname Load64 +//go:linkname Loadint32 +//go:linkname Loadint64 +//go:linkname Loaduintptr +//go:linkname Xadd +//go:linkname Xaddint32 +//go:linkname Xaddint64 +//go:linkname Xadd64 +//go:linkname Xadduintptr +//go:linkname Xchg +//go:linkname Xchg64 +//go:linkname Xchgint32 +//go:linkname Xchgint64 +//go:linkname Xchguintptr +//go:linkname Cas +//go:linkname Cas64 +//go:linkname Casint32 +//go:linkname Casint64 +//go:linkname Casuintptr +//go:linkname Store +//go:linkname Store64 +//go:linkname Storeint32 +//go:linkname Storeint64 +//go:linkname Storeuintptr + +package atomic + +import "unsafe" + +//go:nosplit +//go:noinline +func Load(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadp(ptr unsafe.Pointer) unsafe.Pointer { + return *(*unsafe.Pointer)(ptr) +} + +//go:nosplit +//go:noinline +func LoadAcq(ptr *uint32) uint32 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcq64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func LoadAcquintptr(ptr *uintptr) uintptr { + return *ptr +} + +//go:nosplit +//go:noinline +func Load8(ptr *uint8) uint8 { + return *ptr +} + +//go:nosplit +//go:noinline +func Load64(ptr *uint64) uint64 { + return *ptr +} + +//go:nosplit +//go:noinline +func Xadd(ptr *uint32, delta int32) uint32 { + new := *ptr + uint32(delta) + *ptr = new + return new +} + +//go:nosplit +//go:noinline +func Xadd64(ptr *uint64, delta int64) uint64 { + new := *ptr + uint64(delta) + *ptr = new + return new +} + +//go:nosplit +//go:noinline +func Xadduintptr(ptr *uintptr, delta uintptr) uintptr { + new := *ptr + delta + *ptr = new + return new +} + +//go:nosplit +//go:noinline +func Xchg(ptr *uint32, new uint32) uint32 { + old := 
*ptr + *ptr = new + return old +} + +//go:nosplit +//go:noinline +func Xchg64(ptr *uint64, new uint64) uint64 { + old := *ptr + *ptr = new + return old +} + +//go:nosplit +//go:noinline +func Xchgint32(ptr *int32, new int32) int32 { + old := *ptr + *ptr = new + return old +} + +//go:nosplit +//go:noinline +func Xchgint64(ptr *int64, new int64) int64 { + old := *ptr + *ptr = new + return old +} + +//go:nosplit +//go:noinline +func Xchguintptr(ptr *uintptr, new uintptr) uintptr { + old := *ptr + *ptr = new + return old +} + +//go:nosplit +//go:noinline +func And8(ptr *uint8, val uint8) { + *ptr = *ptr & val +} + +//go:nosplit +//go:noinline +func Or8(ptr *uint8, val uint8) { + *ptr = *ptr | val +} + +// NOTE: Do not add atomicxor8 (XOR is not idempotent). + +//go:nosplit +//go:noinline +func And(ptr *uint32, val uint32) { + *ptr = *ptr & val +} + +//go:nosplit +//go:noinline +func Or(ptr *uint32, val uint32) { + *ptr = *ptr | val +} + +//go:nosplit +//go:noinline +func Cas64(ptr *uint64, old, new uint64) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Store(ptr *uint32, val uint32) { + *ptr = val +} + +//go:nosplit +//go:noinline +func StoreRel(ptr *uint32, val uint32) { + *ptr = val +} + +//go:nosplit +//go:noinline +func StoreRel64(ptr *uint64, val uint64) { + *ptr = val +} + +//go:nosplit +//go:noinline +func StoreReluintptr(ptr *uintptr, val uintptr) { + *ptr = val +} + +//go:nosplit +//go:noinline +func Store8(ptr *uint8, val uint8) { + *ptr = val +} + +//go:nosplit +//go:noinline +func Store64(ptr *uint64, val uint64) { + *ptr = val +} + +// StorepNoWB performs *ptr = val atomically and without a write +// barrier. +// +// NO go:noescape annotation; see atomic_pointer.go. 
+func StorepNoWB(ptr unsafe.Pointer, val unsafe.Pointer) + +//go:nosplit +//go:noinline +func Casint32(ptr *int32, old, new int32) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Casint64(ptr *int64, old, new int64) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Cas(ptr *uint32, old, new uint32) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Casp1(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Casuintptr(ptr *uintptr, old, new uintptr) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func CasRel(ptr *uint32, old, new uint32) bool { + if *ptr == old { + *ptr = new + return true + } + return false +} + +//go:nosplit +//go:noinline +func Storeint32(ptr *int32, new int32) { + *ptr = new +} + +//go:nosplit +//go:noinline +func Storeint64(ptr *int64, new int64) { + *ptr = new +} + +//go:nosplit +//go:noinline +func Storeuintptr(ptr *uintptr, new uintptr) { + *ptr = new +} + +//go:nosplit +//go:noinline +func Loaduintptr(ptr *uintptr) uintptr { + return *ptr +} + +//go:nosplit +//go:noinline +func Loaduint(ptr *uint) uint { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadint32(ptr *int32) int32 { + return *ptr +} + +//go:nosplit +//go:noinline +func Loadint64(ptr *int64) int64 { + return *ptr +} + +//go:nosplit +//go:noinline +func Xaddint32(ptr *int32, delta int32) int32 { + new := *ptr + delta + *ptr = new + return new +} + +//go:nosplit +//go:noinline +func Xaddint64(ptr *int64, delta int64) int64 { + new := *ptr + delta + *ptr = new + return new +} diff --git a/src/runtime/internal/atomic/atomic_wasm.s b/src/runtime/internal/atomic/atomic_wasm.s new file mode 100644 index 0000000..1c2d1ce --- /dev/null 
+++ b/src/runtime/internal/atomic/atomic_wasm.s @@ -0,0 +1,10 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT ·StorepNoWB(SB), NOSPLIT, $0-16 + MOVD ptr+0(FP), R0 + MOVD val+8(FP), 0(R0) + RET diff --git a/src/runtime/internal/atomic/bench_test.go b/src/runtime/internal/atomic/bench_test.go new file mode 100644 index 0000000..efc0531 --- /dev/null +++ b/src/runtime/internal/atomic/bench_test.go @@ -0,0 +1,195 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package atomic_test + +import ( + "runtime/internal/atomic" + "testing" +) + +var sink any + +func BenchmarkAtomicLoad64(b *testing.B) { + var x uint64 + sink = &x + for i := 0; i < b.N; i++ { + _ = atomic.Load64(&x) + } +} + +func BenchmarkAtomicStore64(b *testing.B) { + var x uint64 + sink = &x + for i := 0; i < b.N; i++ { + atomic.Store64(&x, 0) + } +} + +func BenchmarkAtomicLoad(b *testing.B) { + var x uint32 + sink = &x + for i := 0; i < b.N; i++ { + _ = atomic.Load(&x) + } +} + +func BenchmarkAtomicStore(b *testing.B) { + var x uint32 + sink = &x + for i := 0; i < b.N; i++ { + atomic.Store(&x, 0) + } +} + +func BenchmarkAnd8(b *testing.B) { + var x [512]uint8 // give byte its own cache line + sink = &x + for i := 0; i < b.N; i++ { + atomic.And8(&x[255], uint8(i)) + } +} + +func BenchmarkAnd(b *testing.B) { + var x [128]uint32 // give x its own cache line + sink = &x + for i := 0; i < b.N; i++ { + atomic.And(&x[63], uint32(i)) + } +} + +func BenchmarkAnd8Parallel(b *testing.B) { + var x [512]uint8 // give byte its own cache line + sink = &x + b.RunParallel(func(pb *testing.PB) { + i := uint8(0) + for pb.Next() { + atomic.And8(&x[255], i) + i++ + } + }) +} + +func BenchmarkAndParallel(b *testing.B) { + var x [128]uint32 // give x its own cache 
line + sink = &x + b.RunParallel(func(pb *testing.PB) { + i := uint32(0) + for pb.Next() { + atomic.And(&x[63], i) + i++ + } + }) +} + +func BenchmarkOr8(b *testing.B) { + var x [512]uint8 // give byte its own cache line + sink = &x + for i := 0; i < b.N; i++ { + atomic.Or8(&x[255], uint8(i)) + } +} + +func BenchmarkOr(b *testing.B) { + var x [128]uint32 // give x its own cache line + sink = &x + for i := 0; i < b.N; i++ { + atomic.Or(&x[63], uint32(i)) + } +} + +func BenchmarkOr8Parallel(b *testing.B) { + var x [512]uint8 // give byte its own cache line + sink = &x + b.RunParallel(func(pb *testing.PB) { + i := uint8(0) + for pb.Next() { + atomic.Or8(&x[255], i) + i++ + } + }) +} + +func BenchmarkOrParallel(b *testing.B) { + var x [128]uint32 // give x its own cache line + sink = &x + b.RunParallel(func(pb *testing.PB) { + i := uint32(0) + for pb.Next() { + atomic.Or(&x[63], i) + i++ + } + }) +} + +func BenchmarkXadd(b *testing.B) { + var x uint32 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + atomic.Xadd(ptr, 1) + } + }) +} + +func BenchmarkXadd64(b *testing.B) { + var x uint64 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + atomic.Xadd64(ptr, 1) + } + }) +} + +func BenchmarkCas(b *testing.B) { + var x uint32 + x = 1 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + atomic.Cas(ptr, 1, 0) + atomic.Cas(ptr, 0, 1) + } + }) +} + +func BenchmarkCas64(b *testing.B) { + var x uint64 + x = 1 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + atomic.Cas64(ptr, 1, 0) + atomic.Cas64(ptr, 0, 1) + } + }) +} +func BenchmarkXchg(b *testing.B) { + var x uint32 + x = 1 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + var y uint32 + y = 1 + for pb.Next() { + y = atomic.Xchg(ptr, y) + y += 1 + } + }) +} + +func BenchmarkXchg64(b *testing.B) { + var x uint64 + x = 1 + ptr := &x + b.RunParallel(func(pb *testing.PB) { + var y uint64 + y = 1 + for pb.Next() { + y = atomic.Xchg64(ptr, y) + y += 1 
+ } + }) +} diff --git a/src/runtime/internal/atomic/doc.go b/src/runtime/internal/atomic/doc.go new file mode 100644 index 0000000..08e6b6c --- /dev/null +++ b/src/runtime/internal/atomic/doc.go @@ -0,0 +1,18 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +/* +Package atomic provides atomic operations, independent of sync/atomic, +to the runtime. + +On most platforms, the compiler is aware of the functions defined +in this package, and they're replaced with platform-specific intrinsics. +On other platforms, generic implementations are made available. + +Unless otherwise noted, operations defined in this package are sequentially +consistent across threads with respect to the values they manipulate. More +specifically, operations that happen in a specific order on one thread, +will always be observed to happen in exactly that order by another thread. +*/ +package atomic diff --git a/src/runtime/internal/atomic/stubs.go b/src/runtime/internal/atomic/stubs.go new file mode 100644 index 0000000..7df8d9c --- /dev/null +++ b/src/runtime/internal/atomic/stubs.go @@ -0,0 +1,59 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !wasm + +package atomic + +import "unsafe" + +//go:noescape +func Cas(ptr *uint32, old, new uint32) bool + +// NO go:noescape annotation; see atomic_pointer.go. 
+func Casp1(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool + +//go:noescape +func Casint32(ptr *int32, old, new int32) bool + +//go:noescape +func Casint64(ptr *int64, old, new int64) bool + +//go:noescape +func Casuintptr(ptr *uintptr, old, new uintptr) bool + +//go:noescape +func Storeint32(ptr *int32, new int32) + +//go:noescape +func Storeint64(ptr *int64, new int64) + +//go:noescape +func Storeuintptr(ptr *uintptr, new uintptr) + +//go:noescape +func Loaduintptr(ptr *uintptr) uintptr + +//go:noescape +func Loaduint(ptr *uint) uint + +// TODO(matloob): Should these functions have the go:noescape annotation? + +//go:noescape +func Loadint32(ptr *int32) int32 + +//go:noescape +func Loadint64(ptr *int64) int64 + +//go:noescape +func Xaddint32(ptr *int32, delta int32) int32 + +//go:noescape +func Xaddint64(ptr *int64, delta int64) int64 + +//go:noescape +func Xchgint32(ptr *int32, new int32) int32 + +//go:noescape +func Xchgint64(ptr *int64, new int64) int64 diff --git a/src/runtime/internal/atomic/sys_linux_arm.s b/src/runtime/internal/atomic/sys_linux_arm.s new file mode 100644 index 0000000..9225df8 --- /dev/null +++ b/src/runtime/internal/atomic/sys_linux_arm.s @@ -0,0 +1,134 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// Linux/ARM atomic operations. + +// Because there is so much variation in ARM devices, +// the Linux kernel provides an appropriate compare-and-swap +// implementation at address 0xffff0fc0. Caller sets: +// R0 = old value +// R1 = new value +// R2 = addr +// LR = return address +// The function returns with CS true if the swap happened. 
+// http://lxr.linux.no/linux+v2.6.37.2/arch/arm/kernel/entry-armv.S#L850 +// +// https://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=b49c0f24cf6744a3f4fd09289fe7cade349dead5 +// +TEXT cas<>(SB),NOSPLIT,$0 + MOVW $0xffff0fc0, R15 // R15 is hardware PC. + +TEXT ·Cas(SB),NOSPLIT|NOFRAME,$0 + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP ·armcas(SB) + JMP kernelcas<>(SB) + +TEXT kernelcas<>(SB),NOSPLIT,$0 + MOVW ptr+0(FP), R2 + // trigger potential paging fault here, + // because we don't know how to traceback through __kuser_cmpxchg + MOVW (R2), R0 + MOVW old+4(FP), R0 + MOVW new+8(FP), R1 + BL cas<>(SB) + BCC ret0 + MOVW $1, R0 + MOVB R0, ret+12(FP) + RET +ret0: + MOVW $0, R0 + MOVB R0, ret+12(FP) + RET + +// As for cas, memory barriers are complicated on ARM, but the kernel +// provides a user helper. ARMv5 does not support SMP and has no +// memory barrier instruction at all. ARMv6 added SMP support and has +// a memory barrier, but it requires writing to a coprocessor +// register. ARMv7 introduced the DMB instruction, but it's expensive +// even on single-core devices. The kernel helper takes care of all of +// this for us. + +// Use kernel helper version of memory_barrier, when compiled with GOARM < 7. +TEXT memory_barrier<>(SB),NOSPLIT|NOFRAME,$0 + MOVW $0xffff0fa0, R15 // R15 is hardware PC. 
+ +TEXT ·Load(SB),NOSPLIT,$0-8 + MOVW addr+0(FP), R0 + MOVW (R0), R1 + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BGE native_barrier + BL memory_barrier<>(SB) + B end +native_barrier: + DMB MB_ISH +end: + MOVW R1, ret+4(FP) + RET + +TEXT ·Store(SB),NOSPLIT,$0-8 + MOVW addr+0(FP), R1 + MOVW v+4(FP), R2 + + MOVB runtime·goarm(SB), R8 + CMP $7, R8 + BGE native_barrier + BL memory_barrier<>(SB) + B store +native_barrier: + DMB MB_ISH + +store: + MOVW R2, (R1) + + CMP $7, R8 + BGE native_barrier2 + BL memory_barrier<>(SB) + RET +native_barrier2: + DMB MB_ISH + RET + +TEXT ·Load8(SB),NOSPLIT,$0-5 + MOVW addr+0(FP), R0 + MOVB (R0), R1 + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BGE native_barrier + BL memory_barrier<>(SB) + B end +native_barrier: + DMB MB_ISH +end: + MOVB R1, ret+4(FP) + RET + +TEXT ·Store8(SB),NOSPLIT,$0-5 + MOVW addr+0(FP), R1 + MOVB v+4(FP), R2 + + MOVB runtime·goarm(SB), R8 + CMP $7, R8 + BGE native_barrier + BL memory_barrier<>(SB) + B store +native_barrier: + DMB MB_ISH + +store: + MOVB R2, (R1) + + CMP $7, R8 + BGE native_barrier2 + BL memory_barrier<>(SB) + RET +native_barrier2: + DMB MB_ISH + RET diff --git a/src/runtime/internal/atomic/sys_nonlinux_arm.s b/src/runtime/internal/atomic/sys_nonlinux_arm.s new file mode 100644 index 0000000..b55bf90 --- /dev/null +++ b/src/runtime/internal/atomic/sys_nonlinux_arm.s @@ -0,0 +1,79 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !linux + +#include "textflag.h" + +// TODO(minux): this is only valid for ARMv6+ +// bool armcas(int32 *val, int32 old, int32 new) +// Atomically: +// if(*val == old){ +// *val = new; +// return 1; +// }else +// return 0; +TEXT ·Cas(SB),NOSPLIT,$0 + JMP ·armcas(SB) + +// Non-linux OSes support only single processor machines before ARMv7. +// So we don't need memory barriers if goarm < 7. 
And we fail loud at +// startup (runtime.checkgoarm) if it is a multi-processor but goarm < 7. + +TEXT ·Load(SB),NOSPLIT|NOFRAME,$0-8 + MOVW addr+0(FP), R0 + MOVW (R0), R1 + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + DMB MB_ISH + + MOVW R1, ret+4(FP) + RET + +TEXT ·Store(SB),NOSPLIT,$0-8 + MOVW addr+0(FP), R1 + MOVW v+4(FP), R2 + + MOVB runtime·goarm(SB), R8 + CMP $7, R8 + BLT 2(PC) + DMB MB_ISH + + MOVW R2, (R1) + + CMP $7, R8 + BLT 2(PC) + DMB MB_ISH + RET + +TEXT ·Load8(SB),NOSPLIT|NOFRAME,$0-5 + MOVW addr+0(FP), R0 + MOVB (R0), R1 + + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + DMB MB_ISH + + MOVB R1, ret+4(FP) + RET + +TEXT ·Store8(SB),NOSPLIT,$0-5 + MOVW addr+0(FP), R1 + MOVB v+4(FP), R2 + + MOVB runtime·goarm(SB), R8 + CMP $7, R8 + BLT 2(PC) + DMB MB_ISH + + MOVB R2, (R1) + + CMP $7, R8 + BLT 2(PC) + DMB MB_ISH + RET + diff --git a/src/runtime/internal/atomic/types.go b/src/runtime/internal/atomic/types.go new file mode 100644 index 0000000..0d75226 --- /dev/null +++ b/src/runtime/internal/atomic/types.go @@ -0,0 +1,585 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package atomic + +import "unsafe" + +// Int32 is an atomically accessed int32 value. +// +// An Int32 must not be copied. +type Int32 struct { + noCopy noCopy + value int32 +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (i *Int32) Load() int32 { + return Loadint32(&i.value) +} + +// Store updates the value atomically. +// +//go:nosplit +func (i *Int32) Store(value int32) { + Storeint32(&i.value, value) +} + +// CompareAndSwap atomically compares i's value with old, +// and if they're equal, swaps i's value with new. +// It reports whether the swap ran. 
+// +//go:nosplit +func (i *Int32) CompareAndSwap(old, new int32) bool { + return Casint32(&i.value, old, new) +} + +// Swap replaces i's value with new, returning +// i's value before the replacement. +// +//go:nosplit +func (i *Int32) Swap(new int32) int32 { + return Xchgint32(&i.value, new) +} + +// Add adds delta to i atomically, returning +// the new updated value. +// +// This operation wraps around in the usual +// two's-complement way. +// +//go:nosplit +func (i *Int32) Add(delta int32) int32 { + return Xaddint32(&i.value, delta) +} + +// Int64 is an atomically accessed int64 value. +// +// 8-byte aligned on all platforms, unlike a regular int64. +// +// An Int64 must not be copied. +type Int64 struct { + noCopy noCopy + _ align64 + value int64 +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (i *Int64) Load() int64 { + return Loadint64(&i.value) +} + +// Store updates the value atomically. +// +//go:nosplit +func (i *Int64) Store(value int64) { + Storeint64(&i.value, value) +} + +// CompareAndSwap atomically compares i's value with old, +// and if they're equal, swaps i's value with new. +// It reports whether the swap ran. +// +//go:nosplit +func (i *Int64) CompareAndSwap(old, new int64) bool { + return Casint64(&i.value, old, new) +} + +// Swap replaces i's value with new, returning +// i's value before the replacement. +// +//go:nosplit +func (i *Int64) Swap(new int64) int64 { + return Xchgint64(&i.value, new) +} + +// Add adds delta to i atomically, returning +// the new updated value. +// +// This operation wraps around in the usual +// two's-complement way. +// +//go:nosplit +func (i *Int64) Add(delta int64) int64 { + return Xaddint64(&i.value, delta) +} + +// Uint8 is an atomically accessed uint8 value. +// +// A Uint8 must not be copied. +type Uint8 struct { + noCopy noCopy + value uint8 +} + +// Load accesses and returns the value atomically. 
+// +//go:nosplit +func (u *Uint8) Load() uint8 { + return Load8(&u.value) +} + +// Store updates the value atomically. +// +//go:nosplit +func (u *Uint8) Store(value uint8) { + Store8(&u.value, value) +} + +// And takes value and performs a bit-wise +// "and" operation with the value of u, storing +// the result into u. +// +// The full process is performed atomically. +// +//go:nosplit +func (u *Uint8) And(value uint8) { + And8(&u.value, value) +} + +// Or takes value and performs a bit-wise +// "or" operation with the value of u, storing +// the result into u. +// +// The full process is performed atomically. +// +//go:nosplit +func (u *Uint8) Or(value uint8) { + Or8(&u.value, value) +} + +// Bool is an atomically accessed bool value. +// +// A Bool must not be copied. +type Bool struct { + // Inherits noCopy from Uint8. + u Uint8 +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (b *Bool) Load() bool { + return b.u.Load() != 0 +} + +// Store updates the value atomically. +// +//go:nosplit +func (b *Bool) Store(value bool) { + s := uint8(0) + if value { + s = 1 + } + b.u.Store(s) +} + +// Uint32 is an atomically accessed uint32 value. +// +// A Uint32 must not be copied. +type Uint32 struct { + noCopy noCopy + value uint32 +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (u *Uint32) Load() uint32 { + return Load(&u.value) +} + +// LoadAcquire is a partially unsynchronized version +// of Load that relaxes ordering constraints. Other threads +// may observe operations that precede this operation to +// occur after it, but no operation that occurs after it +// on this thread can be observed to occur before it. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uint32) LoadAcquire() uint32 { + return LoadAcq(&u.value) +} + +// Store updates the value atomically. 
+// +//go:nosplit +func (u *Uint32) Store(value uint32) { + Store(&u.value, value) +} + +// StoreRelease is a partially unsynchronized version +// of Store that relaxes ordering constraints. Other threads +// may observe operations that occur after this operation to +// precede it, but no operation that precedes it +// on this thread can be observed to occur after it. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uint32) StoreRelease(value uint32) { + StoreRel(&u.value, value) +} + +// CompareAndSwap atomically compares u's value with old, +// and if they're equal, swaps u's value with new. +// It reports whether the swap ran. +// +//go:nosplit +func (u *Uint32) CompareAndSwap(old, new uint32) bool { + return Cas(&u.value, old, new) +} + +// CompareAndSwapRelease is a partially unsynchronized version +// of Cas that relaxes ordering constraints. Other threads +// may observe operations that occur after this operation to +// precede it, but no operation that precedes it +// on this thread can be observed to occur after it. +// It reports whether the swap ran. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uint32) CompareAndSwapRelease(old, new uint32) bool { + return CasRel(&u.value, old, new) +} + +// Swap replaces u's value with new, returning +// u's value before the replacement. +// +//go:nosplit +func (u *Uint32) Swap(value uint32) uint32 { + return Xchg(&u.value, value) +} + +// And takes value and performs a bit-wise +// "and" operation with the value of u, storing +// the result into u. +// +// The full process is performed atomically. +// +//go:nosplit +func (u *Uint32) And(value uint32) { + And(&u.value, value) +} + +// Or takes value and performs a bit-wise +// "or" operation with the value of u, storing +// the result into u. +// +// The full process is performed atomically. 
+// +//go:nosplit +func (u *Uint32) Or(value uint32) { + Or(&u.value, value) +} + +// Add adds delta to u atomically, returning +// the new updated value. +// +// This operation wraps around in the usual +// two's-complement way. +// +//go:nosplit +func (u *Uint32) Add(delta int32) uint32 { + return Xadd(&u.value, delta) +} + +// Uint64 is an atomically accessed uint64 value. +// +// 8-byte aligned on all platforms, unlike a regular uint64. +// +// A Uint64 must not be copied. +type Uint64 struct { + noCopy noCopy + _ align64 + value uint64 +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (u *Uint64) Load() uint64 { + return Load64(&u.value) +} + +// Store updates the value atomically. +// +//go:nosplit +func (u *Uint64) Store(value uint64) { + Store64(&u.value, value) +} + +// CompareAndSwap atomically compares u's value with old, +// and if they're equal, swaps u's value with new. +// It reports whether the swap ran. +// +//go:nosplit +func (u *Uint64) CompareAndSwap(old, new uint64) bool { + return Cas64(&u.value, old, new) +} + +// Swap replaces u's value with new, returning +// u's value before the replacement. +// +//go:nosplit +func (u *Uint64) Swap(value uint64) uint64 { + return Xchg64(&u.value, value) +} + +// Add adds delta to u atomically, returning +// the new updated value. +// +// This operation wraps around in the usual +// two's-complement way. +// +//go:nosplit +func (u *Uint64) Add(delta int64) uint64 { + return Xadd64(&u.value, delta) +} + +// Uintptr is an atomically accessed uintptr value. +// +// A Uintptr must not be copied. +type Uintptr struct { + noCopy noCopy + value uintptr +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (u *Uintptr) Load() uintptr { + return Loaduintptr(&u.value) +} + +// LoadAcquire is a partially unsynchronized version +// of Load that relaxes ordering constraints. 
Other threads +// may observe operations that precede this operation to +// occur after it, but no operation that occurs after it +// on this thread can be observed to occur before it. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uintptr) LoadAcquire() uintptr { + return LoadAcquintptr(&u.value) +} + +// Store updates the value atomically. +// +//go:nosplit +func (u *Uintptr) Store(value uintptr) { + Storeuintptr(&u.value, value) +} + +// StoreRelease is a partially unsynchronized version +// of Store that relaxes ordering constraints. Other threads +// may observe operations that occur after this operation to +// precede it, but no operation that precedes it +// on this thread can be observed to occur after it. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uintptr) StoreRelease(value uintptr) { + StoreReluintptr(&u.value, value) +} + +// CompareAndSwap atomically compares u's value with old, +// and if they're equal, swaps u's value with new. +// It reports whether the swap ran. +// +//go:nosplit +func (u *Uintptr) CompareAndSwap(old, new uintptr) bool { + return Casuintptr(&u.value, old, new) +} + +// Swap replaces u's value with new, returning +// u's value before the replacement. +// +//go:nosplit +func (u *Uintptr) Swap(value uintptr) uintptr { + return Xchguintptr(&u.value, value) +} + +// Add adds delta to u atomically, returning +// the new updated value. +// +// This operation wraps around in the usual +// two's-complement way. +// +//go:nosplit +func (u *Uintptr) Add(delta uintptr) uintptr { + return Xadduintptr(&u.value, delta) +} + +// Float64 is an atomically accessed float64 value. +// +// 8-byte aligned on all platforms, unlike a regular float64. +// +// A Float64 must not be copied. +type Float64 struct { + // Inherits noCopy and align64 from Uint64. + u Uint64 +} + +// Load accesses and returns the value atomically. 
+// +//go:nosplit +func (f *Float64) Load() float64 { + r := f.u.Load() + return *(*float64)(unsafe.Pointer(&r)) +} + +// Store updates the value atomically. +// +//go:nosplit +func (f *Float64) Store(value float64) { + f.u.Store(*(*uint64)(unsafe.Pointer(&value))) +} + +// UnsafePointer is an atomically accessed unsafe.Pointer value. +// +// Note that because of the atomicity guarantees, stores to values +// of this type never trigger a write barrier, and the relevant +// methods are suffixed with "NoWB" to indicate that explicitly. +// As a result, this type should be used carefully, and sparingly, +// mostly with values that do not live in the Go heap anyway. +// +// An UnsafePointer must not be copied. +type UnsafePointer struct { + noCopy noCopy + value unsafe.Pointer +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (u *UnsafePointer) Load() unsafe.Pointer { + return Loadp(unsafe.Pointer(&u.value)) +} + +// StoreNoWB updates the value atomically. +// +// WARNING: As the name implies this operation does *not* +// perform a write barrier on value, and so this operation may +// hide pointers from the GC. Use with care and sparingly. +// It is safe to use with values not found in the Go heap. +// Prefer Store instead. +// +//go:nosplit +func (u *UnsafePointer) StoreNoWB(value unsafe.Pointer) { + StorepNoWB(unsafe.Pointer(&u.value), value) +} + +// Store updates the value atomically. +func (u *UnsafePointer) Store(value unsafe.Pointer) { + storePointer(&u.value, value) +} + +// provided by runtime +//go:linkname storePointer +func storePointer(ptr *unsafe.Pointer, new unsafe.Pointer) + +// CompareAndSwapNoWB atomically (with respect to other methods) +// compares u's value with old, and if they're equal, +// swaps u's value with new. +// It reports whether the swap ran. +// +// WARNING: As the name implies this operation does *not* +// perform a write barrier on value, and so this operation may +// hide pointers from the GC. 
Use with care and sparingly. +// It is safe to use with values not found in the Go heap. +// Prefer CompareAndSwap instead. +// +//go:nosplit +func (u *UnsafePointer) CompareAndSwapNoWB(old, new unsafe.Pointer) bool { + return Casp1(&u.value, old, new) +} + +// CompareAndSwap atomically compares u's value with old, +// and if they're equal, swaps u's value with new. +// It reports whether the swap ran. +func (u *UnsafePointer) CompareAndSwap(old, new unsafe.Pointer) bool { + return casPointer(&u.value, old, new) +} + +func casPointer(ptr *unsafe.Pointer, old, new unsafe.Pointer) bool + +// Pointer is an atomic pointer of type *T. +type Pointer[T any] struct { + u UnsafePointer +} + +// Load accesses and returns the value atomically. +// +//go:nosplit +func (p *Pointer[T]) Load() *T { + return (*T)(p.u.Load()) +} + +// StoreNoWB updates the value atomically. +// +// WARNING: As the name implies this operation does *not* +// perform a write barrier on value, and so this operation may +// hide pointers from the GC. Use with care and sparingly. +// It is safe to use with values not found in the Go heap. +// Prefer Store instead. +// +//go:nosplit +func (p *Pointer[T]) StoreNoWB(value *T) { + p.u.StoreNoWB(unsafe.Pointer(value)) +} + +// Store updates the value atomically. +//go:nosplit +func (p *Pointer[T]) Store(value *T) { + p.u.Store(unsafe.Pointer(value)) +} + +// CompareAndSwapNoWB atomically (with respect to other methods) +// compares u's value with old, and if they're equal, +// swaps u's value with new. +// It reports whether the swap ran. +// +// WARNING: As the name implies this operation does *not* +// perform a write barrier on value, and so this operation may +// hide pointers from the GC. Use with care and sparingly. +// It is safe to use with values not found in the Go heap. +// Prefer CompareAndSwap instead. 
+// +//go:nosplit +func (p *Pointer[T]) CompareAndSwapNoWB(old, new *T) bool { + return p.u.CompareAndSwapNoWB(unsafe.Pointer(old), unsafe.Pointer(new)) +} + +// CompareAndSwap atomically (with respect to other methods) +// compares u's value with old, and if they're equal, +// swaps u's value with new. +// It reports whether the swap ran. +func (p *Pointer[T]) CompareAndSwap(old, new *T) bool { + return p.u.CompareAndSwap(unsafe.Pointer(old), unsafe.Pointer(new)) +} + +// noCopy may be embedded into structs which must not be copied +// after the first use. +// +// See https://golang.org/issues/8005#issuecomment-190753527 +// for details. +type noCopy struct{} + +// Lock is a no-op used by -copylocks checker from `go vet`. +func (*noCopy) Lock() {} +func (*noCopy) Unlock() {} + +// align64 may be added to structs that must be 64-bit aligned. +// This struct is recognized by a special case in the compiler +// and will not work if copied to any other package. +type align64 struct{} diff --git a/src/runtime/internal/atomic/types_64bit.go b/src/runtime/internal/atomic/types_64bit.go new file mode 100644 index 0000000..006e83b --- /dev/null +++ b/src/runtime/internal/atomic/types_64bit.go @@ -0,0 +1,33 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x || wasm + +package atomic + +// LoadAcquire is a partially unsynchronized version +// of Load that relaxes ordering constraints. Other threads +// may observe operations that precede this operation to +// occur after it, but no operation that occurs after it +// on this thread can be observed to occur before it. +// +// WARNING: Use sparingly and with great care. 
+// +//go:nosplit +func (u *Uint64) LoadAcquire() uint64 { + return LoadAcq64(&u.value) +} + +// StoreRelease is a partially unsynchronized version +// of Store that relaxes ordering constraints. Other threads +// may observe operations that occur after this operation to +// precede it, but no operation that precedes it +// on this thread can be observed to occur after it. +// +// WARNING: Use sparingly and with great care. +// +//go:nosplit +func (u *Uint64) StoreRelease(value uint64) { + StoreRel64(&u.value, value) +} diff --git a/src/runtime/internal/atomic/unaligned.go b/src/runtime/internal/atomic/unaligned.go new file mode 100644 index 0000000..a859de4 --- /dev/null +++ b/src/runtime/internal/atomic/unaligned.go @@ -0,0 +1,9 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package atomic + +func panicUnaligned() { + panic("unaligned 64-bit atomic operation") +} diff --git a/src/runtime/internal/math/math.go b/src/runtime/internal/math/math.go new file mode 100644 index 0000000..c3fac36 --- /dev/null +++ b/src/runtime/internal/math/math.go @@ -0,0 +1,40 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package math + +import "internal/goarch" + +const MaxUintptr = ^uintptr(0) + +// MulUintptr returns a * b and whether the multiplication overflowed. +// On supported platforms this is an intrinsic lowered by the compiler. +func MulUintptr(a, b uintptr) (uintptr, bool) { + if a|b < 1<<(4*goarch.PtrSize) || a == 0 { + return a * b, false + } + overflow := b > MaxUintptr/a + return a * b, overflow +} + +// Mul64 returns the 128-bit product of x and y: (hi, lo) = x * y +// with the product bits' upper half returned in hi and the lower +// half returned in lo. 
+// This is a copy from math/bits.Mul64 +// On supported platforms this is an intrinsic lowered by the compiler. +func Mul64(x, y uint64) (hi, lo uint64) { + const mask32 = 1<<32 - 1 + x0 := x & mask32 + x1 := x >> 32 + y0 := y & mask32 + y1 := y >> 32 + w0 := x0 * y0 + t := x1*y0 + w0>>32 + w1 := t & mask32 + w2 := t >> 32 + w1 += x0 * y1 + hi = x1*y1 + w2 + w1>>32 + lo = x * y + return +} diff --git a/src/runtime/internal/math/math_test.go b/src/runtime/internal/math/math_test.go new file mode 100644 index 0000000..303eb63 --- /dev/null +++ b/src/runtime/internal/math/math_test.go @@ -0,0 +1,79 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package math_test + +import ( + . "runtime/internal/math" + "testing" +) + +const ( + UintptrSize = 32 << (^uintptr(0) >> 63) +) + +type mulUintptrTest struct { + a uintptr + b uintptr + overflow bool +} + +var mulUintptrTests = []mulUintptrTest{ + {0, 0, false}, + {1000, 1000, false}, + {MaxUintptr, 0, false}, + {MaxUintptr, 1, false}, + {MaxUintptr / 2, 2, false}, + {MaxUintptr / 2, 3, true}, + {MaxUintptr, 10, true}, + {MaxUintptr, 100, true}, + {MaxUintptr / 100, 100, false}, + {MaxUintptr / 1000, 1001, true}, + {1<<(UintptrSize/2) - 1, 1<<(UintptrSize/2) - 1, false}, + {1 << (UintptrSize / 2), 1 << (UintptrSize / 2), true}, + {MaxUintptr >> 32, MaxUintptr >> 32, false}, + {MaxUintptr, MaxUintptr, true}, +} + +func TestMulUintptr(t *testing.T) { + for _, test := range mulUintptrTests { + a, b := test.a, test.b + for i := 0; i < 2; i++ { + mul, overflow := MulUintptr(a, b) + if mul != a*b || overflow != test.overflow { + t.Errorf("MulUintptr(%v, %v) = %v, %v want %v, %v", + a, b, mul, overflow, a*b, test.overflow) + } + a, b = b, a + } + } +} + +var SinkUintptr uintptr +var SinkBool bool + +var x, y uintptr + +func BenchmarkMulUintptr(b *testing.B) { + x, y = 1, 2 + b.Run("small", func(b *testing.B) { + 
for i := 0; i < b.N; i++ { + var overflow bool + SinkUintptr, overflow = MulUintptr(x, y) + if overflow { + SinkUintptr = 0 + } + } + }) + x, y = MaxUintptr, MaxUintptr-1 + b.Run("large", func(b *testing.B) { + for i := 0; i < b.N; i++ { + var overflow bool + SinkUintptr, overflow = MulUintptr(x, y) + if overflow { + SinkUintptr = 0 + } + } + }) +} diff --git a/src/runtime/internal/startlinetest/func_amd64.go b/src/runtime/internal/startlinetest/func_amd64.go new file mode 100644 index 0000000..ab7063d --- /dev/null +++ b/src/runtime/internal/startlinetest/func_amd64.go @@ -0,0 +1,13 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package startlinetest contains helpers for runtime_test.TestStartLineAsm. +package startlinetest + +// Defined in func_amd64.s, this is a trivial assembly function that calls +// runtime_test.callerStartLine. +func AsmFunc() int + +// Provided by runtime_test. +var CallerStartLine func(bool) int diff --git a/src/runtime/internal/startlinetest/func_amd64.s b/src/runtime/internal/startlinetest/func_amd64.s new file mode 100644 index 0000000..96982be --- /dev/null +++ b/src/runtime/internal/startlinetest/func_amd64.s @@ -0,0 +1,28 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "funcdata.h" +#include "textflag.h" + +// Assembly function for runtime_test.TestStartLineAsm. +// +// Note that this file can't be built directly as part of runtime_test, as assembly +// files can't declare an alternative package. Building it into runtime is +// possible, but linkshared complicates things: +// +// 1. linkshared mode leaves the function around in the final output of +// non-test builds. +// 2. 
Due of (1), the linker can't resolve the callerStartLine relocation +// (as runtime_test isn't built for non-test builds). +// +// Thus it is simpler to just put this in its own package, imported only by +// runtime_test. We use ABIInternal as no ABI wrapper is generated for +// callerStartLine since it is in a different package. + +TEXT ·AsmFunc<ABIInternal>(SB),NOSPLIT,$8-0 + NO_LOCAL_POINTERS + MOVQ $0, AX // wantInlined + MOVQ ·CallerStartLine(SB), DX + CALL (DX) + RET diff --git a/src/runtime/internal/sys/consts.go b/src/runtime/internal/sys/consts.go new file mode 100644 index 0000000..98c0f09 --- /dev/null +++ b/src/runtime/internal/sys/consts.go @@ -0,0 +1,36 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package sys + +import ( + "internal/goarch" + "internal/goos" +) + +// AIX requires a larger stack for syscalls. +// The race build also needs more stack. See issue 54291. +// This arithmetic must match that in cmd/internal/objabi/stack.go:stackGuardMultiplier. +const StackGuardMultiplier = 1 + goos.IsAix + isRace + +// DefaultPhysPageSize is the default physical page size. +const DefaultPhysPageSize = goarch.DefaultPhysPageSize + +// PCQuantum is the minimal unit for a program counter (1 on x86, 4 on most other systems). +// The various PC tables record PC deltas pre-divided by PCQuantum. +const PCQuantum = goarch.PCQuantum + +// Int64Align is the required alignment for a 64-bit integer (4 on 32-bit systems, 8 on 64-bit). +const Int64Align = goarch.PtrSize + +// MinFrameSize is the size of the system-reserved words at the bottom +// of a frame (just above the architectural stack pointer). +// It is zero on x86 and PtrSize on most non-x86 (LR-based) systems. +// On PowerPC it is larger, to cover three more reserved words: +// the compiler word, the link editor word, and the TOC save word. 
+const MinFrameSize = goarch.MinFrameSize + +// StackAlign is the required alignment of the SP register. +// The stack must be at least word aligned, but some architectures require more. +const StackAlign = goarch.StackAlign diff --git a/src/runtime/internal/sys/consts_norace.go b/src/runtime/internal/sys/consts_norace.go new file mode 100644 index 0000000..a9613b8 --- /dev/null +++ b/src/runtime/internal/sys/consts_norace.go @@ -0,0 +1,9 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !race + +package sys + +const isRace = 0 diff --git a/src/runtime/internal/sys/consts_race.go b/src/runtime/internal/sys/consts_race.go new file mode 100644 index 0000000..f824fb3 --- /dev/null +++ b/src/runtime/internal/sys/consts_race.go @@ -0,0 +1,9 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package sys + +const isRace = 1 diff --git a/src/runtime/internal/sys/intrinsics.go b/src/runtime/internal/sys/intrinsics.go new file mode 100644 index 0000000..902d893 --- /dev/null +++ b/src/runtime/internal/sys/intrinsics.go @@ -0,0 +1,110 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !386 + +// TODO finish intrinsifying 386, deadcode the assembly, remove build tags, merge w/ intrinsics_common + +package sys + +// Copied from math/bits to avoid dependence. 
+ +var deBruijn32tab = [32]byte{ + 0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, + 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9, +} + +const deBruijn32 = 0x077CB531 + +var deBruijn64tab = [64]byte{ + 0, 1, 56, 2, 57, 49, 28, 3, 61, 58, 42, 50, 38, 29, 17, 4, + 62, 47, 59, 36, 45, 43, 51, 22, 53, 39, 33, 30, 24, 18, 12, 5, + 63, 55, 48, 27, 60, 41, 37, 16, 46, 35, 44, 21, 52, 32, 23, 11, + 54, 26, 40, 15, 34, 20, 31, 10, 25, 14, 19, 9, 13, 8, 7, 6, +} + +const deBruijn64 = 0x03f79d71b4ca8b09 + +const ntz8tab = "" + + "\x08\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x05\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x06\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x05\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x07\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x05\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x06\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x05\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + + "\x04\x00\x01\x00\x02\x00\x01\x00\x03\x00\x01\x00\x02\x00\x01\x00" + +// TrailingZeros32 returns the number of trailing zero bits in x; the result is 32 for x == 0. 
+func TrailingZeros32(x uint32) int { + if x == 0 { + return 32 + } + // see comment in TrailingZeros64 + return int(deBruijn32tab[(x&-x)*deBruijn32>>(32-5)]) +} + +// TrailingZeros64 returns the number of trailing zero bits in x; the result is 64 for x == 0. +func TrailingZeros64(x uint64) int { + if x == 0 { + return 64 + } + // If popcount is fast, replace code below with return popcount(^x & (x - 1)). + // + // x & -x leaves only the right-most bit set in the word. Let k be the + // index of that bit. Since only a single bit is set, the value is two + // to the power of k. Multiplying by a power of two is equivalent to + // left shifting, in this case by k bits. The de Bruijn (64 bit) constant + // is such that all six bit, consecutive substrings are distinct. + // Therefore, if we have a left shifted version of this constant we can + // find by how many bits it was shifted by looking at which six bit + // substring ended up at the top of the word. + // (Knuth, volume 4, section 7.3.1) + return int(deBruijn64tab[(x&-x)*deBruijn64>>(64-6)]) +} + +// TrailingZeros8 returns the number of trailing zero bits in x; the result is 8 for x == 0. 
+func TrailingZeros8(x uint8) int { + return int(ntz8tab[x]) +} + +// Bswap64 returns its input with byte order reversed +// 0x0102030405060708 -> 0x0807060504030201 +func Bswap64(x uint64) uint64 { + c8 := uint64(0x00ff00ff00ff00ff) + a := x >> 8 & c8 + b := (x & c8) << 8 + x = a | b + c16 := uint64(0x0000ffff0000ffff) + a = x >> 16 & c16 + b = (x & c16) << 16 + x = a | b + c32 := uint64(0x00000000ffffffff) + a = x >> 32 & c32 + b = (x & c32) << 32 + x = a | b + return x +} + +// Bswap32 returns its input with byte order reversed +// 0x01020304 -> 0x04030201 +func Bswap32(x uint32) uint32 { + c8 := uint32(0x00ff00ff) + a := x >> 8 & c8 + b := (x & c8) << 8 + x = a | b + c16 := uint32(0x0000ffff) + a = x >> 16 & c16 + b = (x & c16) << 16 + x = a | b + return x +} diff --git a/src/runtime/internal/sys/intrinsics_386.s b/src/runtime/internal/sys/intrinsics_386.s new file mode 100644 index 0000000..f33ade0 --- /dev/null +++ b/src/runtime/internal/sys/intrinsics_386.s @@ -0,0 +1,58 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT runtime∕internal∕sys·TrailingZeros64(SB), NOSPLIT, $0-12 + // Try low 32 bits. + MOVL x_lo+0(FP), AX + BSFL AX, AX + JZ tryhigh + MOVL AX, ret+8(FP) + RET + +tryhigh: + // Try high 32 bits. + MOVL x_hi+4(FP), AX + BSFL AX, AX + JZ none + ADDL $32, AX + MOVL AX, ret+8(FP) + RET + +none: + // No bits are set. 
+ MOVL $64, ret+8(FP) + RET + +TEXT runtime∕internal∕sys·TrailingZeros32(SB), NOSPLIT, $0-8 + MOVL x+0(FP), AX + BSFL AX, AX + JNZ 2(PC) + MOVL $32, AX + MOVL AX, ret+4(FP) + RET + +TEXT runtime∕internal∕sys·TrailingZeros8(SB), NOSPLIT, $0-8 + MOVBLZX x+0(FP), AX + BSFL AX, AX + JNZ 2(PC) + MOVL $8, AX + MOVL AX, ret+4(FP) + RET + +TEXT runtime∕internal∕sys·Bswap64(SB), NOSPLIT, $0-16 + MOVL x_lo+0(FP), AX + MOVL x_hi+4(FP), BX + BSWAPL AX + BSWAPL BX + MOVL BX, ret_lo+8(FP) + MOVL AX, ret_hi+12(FP) + RET + +TEXT runtime∕internal∕sys·Bswap32(SB), NOSPLIT, $0-8 + MOVL x+0(FP), AX + BSWAPL AX + MOVL AX, ret+4(FP) + RET diff --git a/src/runtime/internal/sys/intrinsics_common.go b/src/runtime/internal/sys/intrinsics_common.go new file mode 100644 index 0000000..1461551 --- /dev/null +++ b/src/runtime/internal/sys/intrinsics_common.go @@ -0,0 +1,109 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package sys + +// Copied from math/bits to avoid dependence. 
+ +const len8tab = "" + + "\x00\x01\x02\x02\x03\x03\x03\x03\x04\x04\x04\x04\x04\x04\x04\x04" + + "\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05\x05" + + "\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06" + + "\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06\x06" + + "\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07" + + "\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07" + + "\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07" + + "\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07\x07" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + + "\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08\x08" + +// Len64 returns the minimum number of bits required to represent x; the result is 0 for x == 0. +// +// nosplit because this is used in src/runtime/histogram.go, which may run in sensitive contexts. +// +//go:nosplit +func Len64(x uint64) (n int) { + if x >= 1<<32 { + x >>= 32 + n = 32 + } + if x >= 1<<16 { + x >>= 16 + n += 16 + } + if x >= 1<<8 { + x >>= 8 + n += 8 + } + return n + int(len8tab[x]) +} + +// --- OnesCount --- + +const m0 = 0x5555555555555555 // 01010101 ... +const m1 = 0x3333333333333333 // 00110011 ... +const m2 = 0x0f0f0f0f0f0f0f0f // 00001111 ... + +// OnesCount64 returns the number of one bits ("population count") in x. +func OnesCount64(x uint64) int { + // Implementation: Parallel summing of adjacent bits. + // See "Hacker's Delight", Chap. 5: Counting Bits. 
+ // The following pattern shows the general approach: + // + // x = x>>1&(m0&m) + x&(m0&m) + // x = x>>2&(m1&m) + x&(m1&m) + // x = x>>4&(m2&m) + x&(m2&m) + // x = x>>8&(m3&m) + x&(m3&m) + // x = x>>16&(m4&m) + x&(m4&m) + // x = x>>32&(m5&m) + x&(m5&m) + // return int(x) + // + // Masking (& operations) can be left away when there's no + // danger that a field's sum will carry over into the next + // field: Since the result cannot be > 64, 8 bits is enough + // and we can ignore the masks for the shifts by 8 and up. + // Per "Hacker's Delight", the first line can be simplified + // more, but it saves at best one instruction, so we leave + // it alone for clarity. + const m = 1<<64 - 1 + x = x>>1&(m0&m) + x&(m0&m) + x = x>>2&(m1&m) + x&(m1&m) + x = (x>>4 + x) & (m2 & m) + x += x >> 8 + x += x >> 16 + x += x >> 32 + return int(x) & (1<<7 - 1) +} + +// LeadingZeros64 returns the number of leading zero bits in x; the result is 64 for x == 0. +func LeadingZeros64(x uint64) int { return 64 - Len64(x) } + +// LeadingZeros8 returns the number of leading zero bits in x; the result is 8 for x == 0. +func LeadingZeros8(x uint8) int { return 8 - Len8(x) } + +// Len8 returns the minimum number of bits required to represent x; the result is 0 for x == 0. +func Len8(x uint8) int { + return int(len8tab[x]) +} + +// Prefetch prefetches data from memory addr to cache +// +// AMD64: Produce PREFETCHT0 instruction +// +// ARM64: Produce PRFM instruction with PLDL1KEEP option +func Prefetch(addr uintptr) {} + +// PrefetchStreamed prefetches data from memory addr, with a hint that this data is being streamed. +// That is, it is likely to be accessed very soon, but only once. If possible, this will avoid polluting the cache. 
+// +// AMD64: Produce PREFETCHNTA instruction +// +// ARM64: Produce PRFM instruction with PLDL1STRM option +func PrefetchStreamed(addr uintptr) {} diff --git a/src/runtime/internal/sys/intrinsics_stubs.go b/src/runtime/internal/sys/intrinsics_stubs.go new file mode 100644 index 0000000..66cfcde --- /dev/null +++ b/src/runtime/internal/sys/intrinsics_stubs.go @@ -0,0 +1,13 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build 386 + +package sys + +func TrailingZeros64(x uint64) int +func TrailingZeros32(x uint32) int +func TrailingZeros8(x uint8) int +func Bswap64(x uint64) uint64 +func Bswap32(x uint32) uint32 diff --git a/src/runtime/internal/sys/intrinsics_test.go b/src/runtime/internal/sys/intrinsics_test.go new file mode 100644 index 0000000..bf75f19 --- /dev/null +++ b/src/runtime/internal/sys/intrinsics_test.go @@ -0,0 +1,38 @@ +package sys_test + +import ( + "runtime/internal/sys" + "testing" +) + +func TestTrailingZeros64(t *testing.T) { + for i := 0; i <= 64; i++ { + x := uint64(5) << uint(i) + if got := sys.TrailingZeros64(x); got != i { + t.Errorf("TrailingZeros64(%d)=%d, want %d", x, got, i) + } + } +} +func TestTrailingZeros32(t *testing.T) { + for i := 0; i <= 32; i++ { + x := uint32(5) << uint(i) + if got := sys.TrailingZeros32(x); got != i { + t.Errorf("TrailingZeros32(%d)=%d, want %d", x, got, i) + } + } +} + +func TestBswap64(t *testing.T) { + x := uint64(0x1122334455667788) + y := sys.Bswap64(x) + if y != 0x8877665544332211 { + t.Errorf("Bswap(%x)=%x, want 0x8877665544332211", x, y) + } +} +func TestBswap32(t *testing.T) { + x := uint32(0x11223344) + y := sys.Bswap32(x) + if y != 0x44332211 { + t.Errorf("Bswap(%x)=%x, want 0x44332211", x, y) + } +} diff --git a/src/runtime/internal/sys/nih.go b/src/runtime/internal/sys/nih.go new file mode 100644 index 0000000..17eab67 --- /dev/null +++ b/src/runtime/internal/sys/nih.go 
@@ -0,0 +1,41 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package sys + +// NOTE: keep in sync with cmd/compile/internal/types.CalcSize +// to make the compiler recognize this as an intrinsic type. +type nih struct{} + +// NotInHeap is a type that must never be allocated from the GC'd heap or on the stack, +// and is called not-in-heap. +// +// Other types can embed NotInHeap to make themselves not-in-heap. Specifically, pointers +// to these types must always fail the `runtime.inheap` check. The type may be used +// for global variables, or for objects in unmanaged memory (e.g., allocated with +// `sysAlloc`, `persistentalloc`, `fixalloc`, or from a manually-managed span). +// +// Specifically: +// +// 1. `new(T)`, `make([]T)`, `append([]T, ...)` and implicit heap +// allocation of T are disallowed. (Though implicit allocations are +// disallowed in the runtime anyway.) +// +// 2. A pointer to a regular type (other than `unsafe.Pointer`) cannot be +// converted to a pointer to a not-in-heap type, even if they have the +// same underlying type. +// +// 3. Any type containing a not-in-heap type is itself considered not-in-heap. +// +// - Structs and arrays are not-in-heap if their elements are not-in-heap. +// - Maps and channels containing not-in-heap types are disallowed. +// +// 4. Write barriers on pointers to not-in-heap types can be omitted. +// +// The last point is the real benefit of NotInHeap. The runtime uses +// it for low-level internal structures to avoid memory barriers in the +// scheduler and the memory allocator where they are illegal or simply +// inefficient. This mechanism is reasonably safe and does not compromise +// the readability of the runtime. 
+type NotInHeap struct{ _ nih } diff --git a/src/runtime/internal/sys/sys.go b/src/runtime/internal/sys/sys.go new file mode 100644 index 0000000..694101d --- /dev/null +++ b/src/runtime/internal/sys/sys.go @@ -0,0 +1,7 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// package sys contains system- and configuration- and architecture-specific +// constants used by the runtime. +package sys diff --git a/src/runtime/internal/syscall/asm_linux_386.s b/src/runtime/internal/syscall/asm_linux_386.s new file mode 100644 index 0000000..15aae4d --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_386.s @@ -0,0 +1,34 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See ../sys_linux_386.s for the reason why we always use int 0x80 +// instead of the glibc-specific "CALL 0x10(GS)". +#define INVOKE_SYSCALL INT $0x80 + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +// +// Syscall # in AX, args in BX CX DX SI DI BP, return in AX +TEXT ·Syscall6(SB),NOSPLIT,$0-40 + MOVL num+0(FP), AX // syscall entry + MOVL a1+4(FP), BX + MOVL a2+8(FP), CX + MOVL a3+12(FP), DX + MOVL a4+16(FP), SI + MOVL a5+20(FP), DI + MOVL a6+24(FP), BP + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS ok + MOVL $-1, r1+28(FP) + MOVL $0, r2+32(FP) + NEGL AX + MOVL AX, errno+36(FP) + RET +ok: + MOVL AX, r1+28(FP) + MOVL DX, r2+32(FP) + MOVL $0, errno+36(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_amd64.s b/src/runtime/internal/syscall/asm_linux_amd64.s new file mode 100644 index 0000000..3740ef1 --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_amd64.s @@ -0,0 +1,47 @@ +// Copyright 2022 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +// +// We need to convert to the syscall ABI. +// +// arg | ABIInternal | Syscall +// --------------------------- +// num | AX | AX +// a1 | BX | DI +// a2 | CX | SI +// a3 | DI | DX +// a4 | SI | R10 +// a5 | R8 | R8 +// a6 | R9 | R9 +// +// r1 | AX | AX +// r2 | BX | DX +// err | CX | part of AX +// +// Note that this differs from "standard" ABI convention, which would pass 4th +// arg in CX, not R10. +TEXT ·Syscall6<ABIInternal>(SB),NOSPLIT,$0 + // a6 already in R9. + // a5 already in R8. + MOVQ SI, R10 // a4 + MOVQ DI, DX // a3 + MOVQ CX, SI // a2 + MOVQ BX, DI // a1 + // num already in AX. + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS ok + NEGQ AX + MOVQ AX, CX // errno + MOVQ $-1, AX // r1 + MOVQ $0, BX // r2 + RET +ok: + // r1 already in AX. + MOVQ DX, BX // r2 + MOVQ $0, CX // errno + RET diff --git a/src/runtime/internal/syscall/asm_linux_arm.s b/src/runtime/internal/syscall/asm_linux_arm.s new file mode 100644 index 0000000..dbf1826 --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_arm.s @@ -0,0 +1,32 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6(SB),NOSPLIT,$0-40 + MOVW num+0(FP), R7 // syscall entry + MOVW a1+4(FP), R0 + MOVW a2+8(FP), R1 + MOVW a3+12(FP), R2 + MOVW a4+16(FP), R3 + MOVW a5+20(FP), R4 + MOVW a6+24(FP), R5 + SWI $0 + MOVW $0xfffff001, R6 + CMP R6, R0 + BLS ok + MOVW $-1, R1 + MOVW R1, r1+28(FP) + MOVW $0, R2 + MOVW R2, r2+32(FP) + RSB $0, R0, R0 + MOVW R0, errno+36(FP) + RET +ok: + MOVW R0, r1+28(FP) + MOVW R1, r2+32(FP) + MOVW $0, R0 + MOVW R0, errno+36(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_arm64.s b/src/runtime/internal/syscall/asm_linux_arm64.s new file mode 100644 index 0000000..83e862f --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_arm64.s @@ -0,0 +1,29 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6(SB),NOSPLIT,$0-80 + MOVD num+0(FP), R8 // syscall entry + MOVD a1+8(FP), R0 + MOVD a2+16(FP), R1 + MOVD a3+24(FP), R2 + MOVD a4+32(FP), R3 + MOVD a5+40(FP), R4 + MOVD a6+48(FP), R5 + SVC + CMN $4095, R0 + BCC ok + MOVD $-1, R4 + MOVD R4, r1+56(FP) + MOVD ZR, r2+64(FP) + NEG R0, R0 + MOVD R0, errno+72(FP) + RET +ok: + MOVD R0, r1+56(FP) + MOVD R1, r2+64(FP) + MOVD ZR, errno+72(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_loong64.s b/src/runtime/internal/syscall/asm_linux_loong64.s new file mode 100644 index 0000000..d6a33f9 --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_loong64.s @@ -0,0 +1,29 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6(SB),NOSPLIT,$0-80 + MOVV num+0(FP), R11 // syscall entry + MOVV a1+8(FP), R4 + MOVV a2+16(FP), R5 + MOVV a3+24(FP), R6 + MOVV a4+32(FP), R7 + MOVV a5+40(FP), R8 + MOVV a6+48(FP), R9 + SYSCALL + MOVW $-4096, R12 + BGEU R12, R4, ok + MOVV $-1, R12 + MOVV R12, r1+56(FP) + MOVV R0, r2+64(FP) + SUBVU R4, R0, R4 + MOVV R4, errno+72(FP) + RET +ok: + MOVV R4, r1+56(FP) + MOVV R0, r2+64(FP) // r2 is not used. Always set to 0. + MOVV R0, errno+72(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_mips64x.s b/src/runtime/internal/syscall/asm_linux_mips64x.s new file mode 100644 index 0000000..6b7c524 --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_mips64x.s @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6(SB),NOSPLIT,$0-80 + MOVV num+0(FP), R2 // syscall entry + MOVV a1+8(FP), R4 + MOVV a2+16(FP), R5 + MOVV a3+24(FP), R6 + MOVV a4+32(FP), R7 + MOVV a5+40(FP), R8 + MOVV a6+48(FP), R9 + MOVV R0, R3 // reset R3 to 0 as 1-ret SYSCALL keeps it + SYSCALL + BEQ R7, ok + MOVV $-1, R1 + MOVV R1, r1+56(FP) + MOVV R0, r2+64(FP) + MOVV R2, errno+72(FP) + RET +ok: + MOVV R2, r1+56(FP) + MOVV R3, r2+64(FP) + MOVV R0, errno+72(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_mipsx.s b/src/runtime/internal/syscall/asm_linux_mipsx.s new file mode 100644 index 0000000..561310f --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_mipsx.s @@ -0,0 +1,35 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (mips || mipsle) + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +// +// The 5th and 6th arg go at sp+16, sp+20. +// Note that frame size of 20 means that 24 bytes gets reserved on stack. +TEXT ·Syscall6(SB),NOSPLIT,$20-40 + MOVW num+0(FP), R2 // syscall entry + MOVW a1+4(FP), R4 + MOVW a2+8(FP), R5 + MOVW a3+12(FP), R6 + MOVW a4+16(FP), R7 + MOVW a5+20(FP), R8 + MOVW a6+24(FP), R9 + MOVW R8, 16(R29) + MOVW R9, 20(R29) + MOVW R0, R3 // reset R3 to 0 as 1-ret SYSCALL keeps it + SYSCALL + BEQ R7, ok + MOVW $-1, R1 + MOVW R1, r1+28(FP) + MOVW R0, r2+32(FP) + MOVW R2, errno+36(FP) + RET +ok: + MOVW R2, r1+28(FP) + MOVW R3, r2+32(FP) + MOVW R0, errno+36(FP) + RET diff --git a/src/runtime/internal/syscall/asm_linux_ppc64x.s b/src/runtime/internal/syscall/asm_linux_ppc64x.s new file mode 100644 index 0000000..3e985ed --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_ppc64x.s @@ -0,0 +1,23 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (ppc64 || ppc64le) + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6<ABIInternal>(SB),NOSPLIT,$0-80 + MOVD R3, R10 // Move syscall number to R10. SYSCALL will move it R0, and restore R0. + MOVD R4, R3 + MOVD R5, R4 + MOVD R6, R5 + MOVD R7, R6 + MOVD R8, R7 + MOVD R9, R8 + SYSCALL R10 + MOVD $-1, R6 + ISEL CR0SO, R3, R0, R5 // errno = (error) ? R3 : 0 + ISEL CR0SO, R6, R3, R3 // r1 = (error) ? -1 : 0 + MOVD $0, R4 // r2 is not used on linux/ppc64 + RET diff --git a/src/runtime/internal/syscall/asm_linux_riscv64.s b/src/runtime/internal/syscall/asm_linux_riscv64.s new file mode 100644 index 0000000..15e50ec --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_riscv64.s @@ -0,0 +1,43 @@ +// Copyright 2022 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +// +// We need to convert to the syscall ABI. +// +// arg | ABIInternal | Syscall +// --------------------------- +// num | A0 | A7 +// a1 | A1 | A0 +// a2 | A2 | A1 +// a3 | A3 | A2 +// a4 | A4 | A3 +// a5 | A5 | A4 +// a6 | A6 | A5 +// +// r1 | A0 | A0 +// r2 | A1 | A1 +// err | A2 | part of A0 +TEXT ·Syscall6<ABIInternal>(SB),NOSPLIT,$0-80 + MOV A0, A7 + MOV A1, A0 + MOV A2, A1 + MOV A3, A2 + MOV A4, A3 + MOV A5, A4 + MOV A6, A5 + ECALL + MOV $-4096, T0 + BLTU T0, A0, err + // r1 already in A0 + // r2 already in A1 + MOV ZERO, A2 // errno + RET +err: + SUB A0, ZERO, A2 // errno + MOV $-1, A0 // r1 + MOV ZERO, A1 // r2 + RET diff --git a/src/runtime/internal/syscall/asm_linux_s390x.s b/src/runtime/internal/syscall/asm_linux_s390x.s new file mode 100644 index 0000000..1b27f29 --- /dev/null +++ b/src/runtime/internal/syscall/asm_linux_s390x.s @@ -0,0 +1,28 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) +TEXT ·Syscall6(SB),NOSPLIT,$0-80 + MOVD num+0(FP), R1 // syscall entry + MOVD a1+8(FP), R2 + MOVD a2+16(FP), R3 + MOVD a3+24(FP), R4 + MOVD a4+32(FP), R5 + MOVD a5+40(FP), R6 + MOVD a6+48(FP), R7 + SYSCALL + MOVD $0xfffffffffffff001, R8 + CMPUBLT R2, R8, ok + MOVD $-1, r1+56(FP) + MOVD $0, r2+64(FP) + NEG R2, R2 + MOVD R2, errno+72(FP) + RET +ok: + MOVD R2, r1+56(FP) + MOVD R3, r2+64(FP) + MOVD $0, errno+72(FP) + RET diff --git a/src/runtime/internal/syscall/defs_linux.go b/src/runtime/internal/syscall/defs_linux.go new file mode 100644 index 0000000..71f1fa1 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux.go @@ -0,0 +1,10 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall + +const ( + F_SETFD = 2 + FD_CLOEXEC = 1 +) diff --git a/src/runtime/internal/syscall/defs_linux_386.go b/src/runtime/internal/syscall/defs_linux_386.go new file mode 100644 index 0000000..dc723a6 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_386.go @@ -0,0 +1,29 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package syscall + +const ( + SYS_FCNTL = 55 + SYS_EPOLL_CTL = 255 + SYS_EPOLL_PWAIT = 319 + SYS_EPOLL_CREATE1 = 329 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + Data [8]byte // to match amd64 +} diff --git a/src/runtime/internal/syscall/defs_linux_amd64.go b/src/runtime/internal/syscall/defs_linux_amd64.go new file mode 100644 index 0000000..886eb5b --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_amd64.go @@ -0,0 +1,29 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall + +const ( + SYS_FCNTL = 72 + SYS_EPOLL_CTL = 233 + SYS_EPOLL_PWAIT = 281 + SYS_EPOLL_CREATE1 = 291 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/defs_linux_arm.go b/src/runtime/internal/syscall/defs_linux_arm.go new file mode 100644 index 0000000..8f812a2 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_arm.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package syscall + +const ( + SYS_FCNTL = 55 + SYS_EPOLL_CTL = 251 + SYS_EPOLL_PWAIT = 346 + SYS_EPOLL_CREATE1 = 357 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + _pad uint32 + Data [8]byte // to match amd64 +} diff --git a/src/runtime/internal/syscall/defs_linux_arm64.go b/src/runtime/internal/syscall/defs_linux_arm64.go new file mode 100644 index 0000000..48e11b0 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_arm64.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall + +const ( + SYS_EPOLL_CREATE1 = 20 + SYS_EPOLL_CTL = 21 + SYS_EPOLL_PWAIT = 22 + SYS_FCNTL = 25 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + _pad uint32 + Data [8]byte // to match amd64 +} diff --git a/src/runtime/internal/syscall/defs_linux_loong64.go b/src/runtime/internal/syscall/defs_linux_loong64.go new file mode 100644 index 0000000..b78ef81 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_loong64.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package syscall + +const ( + SYS_EPOLL_CREATE1 = 20 + SYS_EPOLL_CTL = 21 + SYS_EPOLL_PWAIT = 22 + SYS_FCNTL = 25 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/defs_linux_mips64x.go b/src/runtime/internal/syscall/defs_linux_mips64x.go new file mode 100644 index 0000000..92b49ca --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_mips64x.go @@ -0,0 +1,32 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +package syscall + +const ( + SYS_FCNTL = 5070 + SYS_EPOLL_CTL = 5208 + SYS_EPOLL_PWAIT = 5272 + SYS_EPOLL_CREATE1 = 5285 + SYS_EPOLL_PWAIT2 = 5441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/defs_linux_mipsx.go b/src/runtime/internal/syscall/defs_linux_mipsx.go new file mode 100644 index 0000000..e28d09c --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_mipsx.go @@ -0,0 +1,32 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (mips || mipsle) + +package syscall + +const ( + SYS_FCNTL = 4055 + SYS_EPOLL_CTL = 4249 + SYS_EPOLL_PWAIT = 4313 + SYS_EPOLL_CREATE1 = 4326 + SYS_EPOLL_PWAIT2 = 4441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data uint64 +} diff --git a/src/runtime/internal/syscall/defs_linux_ppc64x.go b/src/runtime/internal/syscall/defs_linux_ppc64x.go new file mode 100644 index 0000000..a74483e --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_ppc64x.go @@ -0,0 +1,32 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (ppc64 || ppc64le) + +package syscall + +const ( + SYS_FCNTL = 55 + SYS_EPOLL_CTL = 237 + SYS_EPOLL_PWAIT = 303 + SYS_EPOLL_CREATE1 = 315 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/defs_linux_riscv64.go b/src/runtime/internal/syscall/defs_linux_riscv64.go new file mode 100644 index 0000000..b78ef81 --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_riscv64.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package syscall + +const ( + SYS_EPOLL_CREATE1 = 20 + SYS_EPOLL_CTL = 21 + SYS_EPOLL_PWAIT = 22 + SYS_FCNTL = 25 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/defs_linux_s390x.go b/src/runtime/internal/syscall/defs_linux_s390x.go new file mode 100644 index 0000000..a7bb1ba --- /dev/null +++ b/src/runtime/internal/syscall/defs_linux_s390x.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall + +const ( + SYS_FCNTL = 55 + SYS_EPOLL_CTL = 250 + SYS_EPOLL_PWAIT = 312 + SYS_EPOLL_CREATE1 = 327 + SYS_EPOLL_PWAIT2 = 441 + + EPOLLIN = 0x1 + EPOLLOUT = 0x4 + EPOLLERR = 0x8 + EPOLLHUP = 0x10 + EPOLLRDHUP = 0x2000 + EPOLLET = 0x80000000 + EPOLL_CLOEXEC = 0x80000 + EPOLL_CTL_ADD = 0x1 + EPOLL_CTL_DEL = 0x2 + EPOLL_CTL_MOD = 0x3 +) + +type EpollEvent struct { + Events uint32 + pad_cgo_0 [4]byte + Data [8]byte // unaligned uintptr +} diff --git a/src/runtime/internal/syscall/syscall_linux.go b/src/runtime/internal/syscall/syscall_linux.go new file mode 100644 index 0000000..a103d31 --- /dev/null +++ b/src/runtime/internal/syscall/syscall_linux.go @@ -0,0 +1,66 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package syscall provides the syscall primitives required for the runtime. +package syscall + +import ( + "unsafe" +) + +// TODO(https://go.dev/issue/51087): This package is incomplete and currently +// only contains very minimal support for Linux. 
+ +// Syscall6 calls system call number 'num' with arguments a1-6. +func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) + +// syscall_RawSyscall6 is a push linkname to export Syscall6 as +// syscall.RawSyscall6. +// +// //go:uintptrkeepalive because the uintptr argument may be converted pointers +// that need to be kept alive in the caller (this is implied for Syscall6 since +// it has no body). +// +// //go:nosplit because stack copying does not account for uintptrkeepalive, so +// the stack must not grow. Stack copying cannot blindly assume that all +// uintptr arguments are pointers, because some values may look like pointers, +// but not really be pointers, and adjusting their value would break the call. +// +// This is a separate wrapper because we can't export one function as two +// names. The assembly implementations name themselves Syscall6 would not be +// affected by a linkname. +// +//go:uintptrkeepalive +//go:nosplit +//go:linkname syscall_RawSyscall6 syscall.RawSyscall6 +func syscall_RawSyscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr) { + return Syscall6(num, a1, a2, a3, a4, a5, a6) +} + +func EpollCreate1(flags int32) (fd int32, errno uintptr) { + r1, _, e := Syscall6(SYS_EPOLL_CREATE1, uintptr(flags), 0, 0, 0, 0, 0) + return int32(r1), e +} + +var _zero uintptr + +func EpollWait(epfd int32, events []EpollEvent, maxev, waitms int32) (n int32, errno uintptr) { + var ev unsafe.Pointer + if len(events) > 0 { + ev = unsafe.Pointer(&events[0]) + } else { + ev = unsafe.Pointer(&_zero) + } + r1, _, e := Syscall6(SYS_EPOLL_PWAIT, uintptr(epfd), uintptr(ev), uintptr(maxev), uintptr(waitms), 0, 0) + return int32(r1), e +} + +func EpollCtl(epfd, op, fd int32, event *EpollEvent) (errno uintptr) { + _, _, e := Syscall6(SYS_EPOLL_CTL, uintptr(epfd), uintptr(op), uintptr(fd), uintptr(unsafe.Pointer(event)), 0, 0) + return e +} + +func CloseOnExec(fd int32) { + Syscall6(SYS_FCNTL, uintptr(fd), F_SETFD, FD_CLOEXEC, 0, 0, 0) 
+} diff --git a/src/runtime/internal/syscall/syscall_linux_test.go b/src/runtime/internal/syscall/syscall_linux_test.go new file mode 100644 index 0000000..1976da5 --- /dev/null +++ b/src/runtime/internal/syscall/syscall_linux_test.go @@ -0,0 +1,19 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package syscall_test + +import ( + "runtime/internal/syscall" + "testing" +) + +func TestEpollctlErrorSign(t *testing.T) { + v := syscall.EpollCtl(-1, 1, -1, &syscall.EpollEvent{}) + + const EBADF = 0x09 + if v != EBADF { + t.Errorf("epollctl = %v, want %v", v, EBADF) + } +} diff --git a/src/runtime/lfstack.go b/src/runtime/lfstack.go new file mode 100644 index 0000000..306a8e8 --- /dev/null +++ b/src/runtime/lfstack.go @@ -0,0 +1,69 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Lock-free stack. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// lfstack is the head of a lock-free stack. +// +// The zero value of lfstack is an empty list. +// +// This stack is intrusive. Nodes must embed lfnode as the first field. +// +// The stack does not keep GC-visible pointers to nodes, so the caller +// must ensure the nodes are allocated outside the Go heap. 
+type lfstack uint64 + +func (head *lfstack) push(node *lfnode) { + node.pushcnt++ + new := lfstackPack(node, node.pushcnt) + if node1 := lfstackUnpack(new); node1 != node { + print("runtime: lfstack.push invalid packing: node=", node, " cnt=", hex(node.pushcnt), " packed=", hex(new), " -> node=", node1, "\n") + throw("lfstack.push") + } + for { + old := atomic.Load64((*uint64)(head)) + node.next = old + if atomic.Cas64((*uint64)(head), old, new) { + break + } + } +} + +func (head *lfstack) pop() unsafe.Pointer { + for { + old := atomic.Load64((*uint64)(head)) + if old == 0 { + return nil + } + node := lfstackUnpack(old) + next := atomic.Load64(&node.next) + if atomic.Cas64((*uint64)(head), old, next) { + return unsafe.Pointer(node) + } + } +} + +func (head *lfstack) empty() bool { + return atomic.Load64((*uint64)(head)) == 0 +} + +// lfnodeValidate panics if node is not a valid address for use with +// lfstack.push. This only needs to be called when node is allocated. +func lfnodeValidate(node *lfnode) { + if base, _, _ := findObject(uintptr(unsafe.Pointer(node)), 0, 0); base != 0 { + throw("lfstack node allocated from the heap") + } + if lfstackUnpack(lfstackPack(node, ^uintptr(0))) != node { + printlock() + println("runtime: bad lfnode address", hex(uintptr(unsafe.Pointer(node)))) + throw("bad lfnode address") + } +} diff --git a/src/runtime/lfstack_32bit.go b/src/runtime/lfstack_32bit.go new file mode 100644 index 0000000..405923c --- /dev/null +++ b/src/runtime/lfstack_32bit.go @@ -0,0 +1,19 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build 386 || arm || mips || mipsle + +package runtime + +import "unsafe" + +// On 32-bit systems, the stored uint64 has a 32-bit pointer and 32-bit count. 
+ +func lfstackPack(node *lfnode, cnt uintptr) uint64 { + return uint64(uintptr(unsafe.Pointer(node)))<<32 | uint64(cnt) +} + +func lfstackUnpack(val uint64) *lfnode { + return (*lfnode)(unsafe.Pointer(uintptr(val >> 32))) +} diff --git a/src/runtime/lfstack_64bit.go b/src/runtime/lfstack_64bit.go new file mode 100644 index 0000000..88cbd3b --- /dev/null +++ b/src/runtime/lfstack_64bit.go @@ -0,0 +1,70 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x || wasm + +package runtime + +import "unsafe" + +const ( + // addrBits is the number of bits needed to represent a virtual address. + // + // See heapAddrBits for a table of address space sizes on + // various architectures. 48 bits is enough for all + // architectures except s390x. + // + // On AMD64, virtual addresses are 48-bit (or 57-bit) numbers sign extended to 64. + // We shift the address left 16 to eliminate the sign extended part and make + // room in the bottom for the count. + // + // On s390x, virtual addresses are 64-bit. There's not much we + // can do about this, so we just hope that the kernel doesn't + // get to really high addresses and panic if it does. + addrBits = 48 + + // In addition to the 16 bits taken from the top, we can take 3 from the + // bottom, because node must be pointer-aligned, giving a total of 19 bits + // of count. + cntBits = 64 - addrBits + 3 + + // On AIX, 64-bit addresses are split into 36-bit segment number and 28-bit + // offset in segment. Segment numbers in the range 0x0A0000000-0x0AFFFFFFF(LSA) + // are available for mmap. + // We assume all lfnode addresses are from memory allocated with mmap. + // We use one bit to distinguish between the two ranges. 
+ aixAddrBits = 57 + aixCntBits = 64 - aixAddrBits + 3 + + // riscv64 SV57 mode gives 56 bits of userspace VA. + // lfstack code supports it, but broader support for SV57 mode is incomplete, + // and there may be other issues (see #54104). + riscv64AddrBits = 56 + riscv64CntBits = 64 - riscv64AddrBits + 3 +) + +func lfstackPack(node *lfnode, cnt uintptr) uint64 { + if GOARCH == "ppc64" && GOOS == "aix" { + return uint64(uintptr(unsafe.Pointer(node)))<<(64-aixAddrBits) | uint64(cnt&(1<<aixCntBits-1)) + } + if GOARCH == "riscv64" { + return uint64(uintptr(unsafe.Pointer(node)))<<(64-riscv64AddrBits) | uint64(cnt&(1<<riscv64CntBits-1)) + } + return uint64(uintptr(unsafe.Pointer(node)))<<(64-addrBits) | uint64(cnt&(1<<cntBits-1)) +} + +func lfstackUnpack(val uint64) *lfnode { + if GOARCH == "amd64" { + // amd64 systems can place the stack above the VA hole, so we need to sign extend + // val before unpacking. + return (*lfnode)(unsafe.Pointer(uintptr(int64(val) >> cntBits << 3))) + } + if GOARCH == "ppc64" && GOOS == "aix" { + return (*lfnode)(unsafe.Pointer(uintptr((val >> aixCntBits << 3) | 0xa<<56))) + } + if GOARCH == "riscv64" { + return (*lfnode)(unsafe.Pointer(uintptr(val >> riscv64CntBits << 3))) + } + return (*lfnode)(unsafe.Pointer(uintptr(val >> cntBits << 3))) +} diff --git a/src/runtime/lfstack_test.go b/src/runtime/lfstack_test.go new file mode 100644 index 0000000..e36297e --- /dev/null +++ b/src/runtime/lfstack_test.go @@ -0,0 +1,137 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "math/rand" + . "runtime" + "testing" + "unsafe" +) + +type MyNode struct { + LFNode + data int +} + +// allocMyNode allocates nodes that are stored in an lfstack +// outside the Go heap. +// We require lfstack objects to live outside the heap so that +// checkptr passes on the unsafe shenanigans used. 
+func allocMyNode(data int) *MyNode { + n := (*MyNode)(PersistentAlloc(unsafe.Sizeof(MyNode{}))) + LFNodeValidate(&n.LFNode) + n.data = data + return n +} + +func fromMyNode(node *MyNode) *LFNode { + return (*LFNode)(unsafe.Pointer(node)) +} + +func toMyNode(node *LFNode) *MyNode { + return (*MyNode)(unsafe.Pointer(node)) +} + +var global any + +func TestLFStack(t *testing.T) { + stack := new(uint64) + global = stack // force heap allocation + + // Check the stack is initially empty. + if LFStackPop(stack) != nil { + t.Fatalf("stack is not empty") + } + + // Push one element. + node := allocMyNode(42) + LFStackPush(stack, fromMyNode(node)) + + // Push another. + node = allocMyNode(43) + LFStackPush(stack, fromMyNode(node)) + + // Pop one element. + node = toMyNode(LFStackPop(stack)) + if node == nil { + t.Fatalf("stack is empty") + } + if node.data != 43 { + t.Fatalf("no lifo") + } + + // Pop another. + node = toMyNode(LFStackPop(stack)) + if node == nil { + t.Fatalf("stack is empty") + } + if node.data != 42 { + t.Fatalf("no lifo") + } + + // Check the stack is empty again. + if LFStackPop(stack) != nil { + t.Fatalf("stack is not empty") + } + if *stack != 0 { + t.Fatalf("stack is not empty") + } +} + +func TestLFStackStress(t *testing.T) { + const K = 100 + P := 4 * GOMAXPROCS(-1) + N := 100000 + if testing.Short() { + N /= 10 + } + // Create 2 stacks. + stacks := [2]*uint64{new(uint64), new(uint64)} + // Push K elements randomly onto the stacks. + sum := 0 + for i := 0; i < K; i++ { + sum += i + node := allocMyNode(i) + LFStackPush(stacks[i%2], fromMyNode(node)) + } + c := make(chan bool, P) + for p := 0; p < P; p++ { + go func() { + r := rand.New(rand.NewSource(rand.Int63())) + // Pop a node from a random stack, then push it onto a random stack. 
+ for i := 0; i < N; i++ { + node := toMyNode(LFStackPop(stacks[r.Intn(2)])) + if node != nil { + LFStackPush(stacks[r.Intn(2)], fromMyNode(node)) + } + } + c <- true + }() + } + for i := 0; i < P; i++ { + <-c + } + // Pop all elements from both stacks, and verify that nothing was lost. + sum2 := 0 + cnt := 0 + for i := 0; i < 2; i++ { + for { + node := toMyNode(LFStackPop(stacks[i])) + if node == nil { + break + } + cnt++ + sum2 += node.data + node.Next = 0 + } + } + if cnt != K { + t.Fatalf("Wrong number of nodes %d/%d", cnt, K) + } + if sum2 != sum { + t.Fatalf("Wrong sum %d/%d", sum2, sum) + } +} diff --git a/src/runtime/libfuzzer.go b/src/runtime/libfuzzer.go new file mode 100644 index 0000000..0ece035 --- /dev/null +++ b/src/runtime/libfuzzer.go @@ -0,0 +1,160 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build libfuzzer + +package runtime + +import "unsafe" + +func libfuzzerCallWithTwoByteBuffers(fn, start, end *byte) +func libfuzzerCallTraceIntCmp(fn *byte, arg0, arg1, fakePC uintptr) +func libfuzzerCall4(fn *byte, fakePC uintptr, s1, s2 unsafe.Pointer, result uintptr) + +// Keep in sync with the definition of ret_sled in src/runtime/libfuzzer_amd64.s +const retSledSize = 512 + +// In libFuzzer mode, the compiler inserts calls to libfuzzerTraceCmpN and libfuzzerTraceConstCmpN +// (where N can be 1, 2, 4, or 8) for encountered integer comparisons in the code to be instrumented. +// This may result in these functions having callers that are nosplit. That is why they must be nosplit. 
+// +//go:nosplit +func libfuzzerTraceCmp1(arg0, arg1 uint8, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_cmp1, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceCmp2(arg0, arg1 uint16, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_cmp2, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceCmp4(arg0, arg1 uint32, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_cmp4, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceCmp8(arg0, arg1 uint64, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_cmp8, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceConstCmp1(arg0, arg1 uint8, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_const_cmp1, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceConstCmp2(arg0, arg1 uint16, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_const_cmp2, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceConstCmp4(arg0, arg1 uint32, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_const_cmp4, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +//go:nosplit +func libfuzzerTraceConstCmp8(arg0, arg1 uint64, fakePC uint) { + fakePC = fakePC % retSledSize + libfuzzerCallTraceIntCmp(&__sanitizer_cov_trace_const_cmp8, uintptr(arg0), uintptr(arg1), uintptr(fakePC)) +} + +var pcTables []byte + +func init() { + libfuzzerCallWithTwoByteBuffers(&__sanitizer_cov_8bit_counters_init, &__start___sancov_cntrs, &__stop___sancov_cntrs) + start := unsafe.Pointer(&__start___sancov_cntrs) + end := 
unsafe.Pointer(&__stop___sancov_cntrs) + + // PC tables are arrays of ptr-sized integers representing pairs [PC,PCFlags] for every instrumented block. + // The number of PCs and PCFlags is the same as the number of 8-bit counters. Each PC table entry has + // the size of two ptr-sized integers. We allocate one more byte than what we actually need so that we can + // get a pointer representing the end of the PC table array. + size := (uintptr(end)-uintptr(start))*unsafe.Sizeof(uintptr(0))*2 + 1 + pcTables = make([]byte, size) + libfuzzerCallWithTwoByteBuffers(&__sanitizer_cov_pcs_init, &pcTables[0], &pcTables[size-1]) +} + +// We call libFuzzer's __sanitizer_weak_hook_strcmp function which takes the +// following four arguments: +// +// 1. caller_pc: location of string comparison call site +// 2. s1: first string used in the comparison +// 3. s2: second string used in the comparison +// 4. result: an integer representing the comparison result. 0 indicates +// equality (comparison will be ignored by libfuzzer), non-zero indicates a +// difference (comparison will be taken into consideration). +// +//go:nosplit +func libfuzzerHookStrCmp(s1, s2 string, fakePC int) { + if s1 != s2 { + libfuzzerCall4(&__sanitizer_weak_hook_strcmp, uintptr(fakePC), cstring(s1), cstring(s2), uintptr(1)) + } + // if s1 == s2 we could call the hook with a last argument of 0 but this is unnecessary since this case will then be + // ignored by libfuzzer +} + +// This function now has the same implementation as libfuzzerHookStrCmp because we lack better checks +// for case-insensitive string equality in the runtime package. 
+// +//go:nosplit +func libfuzzerHookEqualFold(s1, s2 string, fakePC int) { + if s1 != s2 { + libfuzzerCall4(&__sanitizer_weak_hook_strcmp, uintptr(fakePC), cstring(s1), cstring(s2), uintptr(1)) + } +} + +//go:linkname __sanitizer_cov_trace_cmp1 __sanitizer_cov_trace_cmp1 +//go:cgo_import_static __sanitizer_cov_trace_cmp1 +var __sanitizer_cov_trace_cmp1 byte + +//go:linkname __sanitizer_cov_trace_cmp2 __sanitizer_cov_trace_cmp2 +//go:cgo_import_static __sanitizer_cov_trace_cmp2 +var __sanitizer_cov_trace_cmp2 byte + +//go:linkname __sanitizer_cov_trace_cmp4 __sanitizer_cov_trace_cmp4 +//go:cgo_import_static __sanitizer_cov_trace_cmp4 +var __sanitizer_cov_trace_cmp4 byte + +//go:linkname __sanitizer_cov_trace_cmp8 __sanitizer_cov_trace_cmp8 +//go:cgo_import_static __sanitizer_cov_trace_cmp8 +var __sanitizer_cov_trace_cmp8 byte + +//go:linkname __sanitizer_cov_trace_const_cmp1 __sanitizer_cov_trace_const_cmp1 +//go:cgo_import_static __sanitizer_cov_trace_const_cmp1 +var __sanitizer_cov_trace_const_cmp1 byte + +//go:linkname __sanitizer_cov_trace_const_cmp2 __sanitizer_cov_trace_const_cmp2 +//go:cgo_import_static __sanitizer_cov_trace_const_cmp2 +var __sanitizer_cov_trace_const_cmp2 byte + +//go:linkname __sanitizer_cov_trace_const_cmp4 __sanitizer_cov_trace_const_cmp4 +//go:cgo_import_static __sanitizer_cov_trace_const_cmp4 +var __sanitizer_cov_trace_const_cmp4 byte + +//go:linkname __sanitizer_cov_trace_const_cmp8 __sanitizer_cov_trace_const_cmp8 +//go:cgo_import_static __sanitizer_cov_trace_const_cmp8 +var __sanitizer_cov_trace_const_cmp8 byte + +//go:linkname __sanitizer_cov_8bit_counters_init __sanitizer_cov_8bit_counters_init +//go:cgo_import_static __sanitizer_cov_8bit_counters_init +var __sanitizer_cov_8bit_counters_init byte + +// start, stop markers of counters, set by the linker +var __start___sancov_cntrs, __stop___sancov_cntrs byte + +//go:linkname __sanitizer_cov_pcs_init __sanitizer_cov_pcs_init +//go:cgo_import_static __sanitizer_cov_pcs_init +var 
__sanitizer_cov_pcs_init byte + +//go:linkname __sanitizer_weak_hook_strcmp __sanitizer_weak_hook_strcmp +//go:cgo_import_static __sanitizer_weak_hook_strcmp +var __sanitizer_weak_hook_strcmp byte diff --git a/src/runtime/libfuzzer_amd64.s b/src/runtime/libfuzzer_amd64.s new file mode 100644 index 0000000..4355369 --- /dev/null +++ b/src/runtime/libfuzzer_amd64.s @@ -0,0 +1,157 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build libfuzzer + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// Based on race_amd64.s; see commentary there. + +#ifdef GOOS_windows +#define RARG0 CX +#define RARG1 DX +#define RARG2 R8 +#define RARG3 R9 +#else +#define RARG0 DI +#define RARG1 SI +#define RARG2 DX +#define RARG3 CX +#endif + +// void runtime·libfuzzerCall4(fn, hookId int, s1, s2 unsafe.Pointer, result uintptr) +// Calls C function fn from libFuzzer and passes 4 arguments to it. +TEXT runtime·libfuzzerCall4(SB), NOSPLIT, $0-40 + MOVQ fn+0(FP), AX + MOVQ hookId+8(FP), RARG0 + MOVQ s1+16(FP), RARG1 + MOVQ s2+24(FP), RARG2 + MOVQ result+32(FP), RARG3 + + get_tls(R12) + MOVQ g(R12), R14 + MOVQ g_m(R14), R13 + + // Switch to g0 stack. + MOVQ SP, R12 // callee-saved, preserved across the CALL + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + CALL AX + MOVQ R12, SP + RET + +// void runtime·libfuzzerCallTraceIntCmp(fn, arg0, arg1, fakePC uintptr) +// Calls C function fn from libFuzzer and passes 2 arguments to it after +// manipulating the return address so that libfuzzer's integer compare hooks +// work +// libFuzzer's compare hooks obtain the caller's address from the compiler +// builtin __builtin_return_address. Since we invoke the hooks always +// from the same native function, this builtin would always return the same +// value. 
Internally, the libFuzzer hooks call through to the always inlined +// HandleCmp and thus can't be mimicked without patching libFuzzer. +// +// We solve this problem via an inline assembly trampoline construction that +// translates a runtime argument `fake_pc` in the range [0, 512) into a call to +// a hook with a fake return address whose lower 9 bits are `fake_pc` up to a +// constant shift. This is achieved by pushing a return address pointing into +// 512 ret instructions at offset `fake_pc` onto the stack and then jumping +// directly to the address of the hook. +// +// Note: We only set the lowest 9 bits of the return address since only these +// bits are used by the libFuzzer value profiling mode for integer compares, see +// https://github.com/llvm/llvm-project/blob/704d92607d26e696daba596b72cb70effe79a872/compiler-rt/lib/fuzzer/FuzzerTracePC.cpp#L390 +// as well as +// https://github.com/llvm/llvm-project/blob/704d92607d26e696daba596b72cb70effe79a872/compiler-rt/lib/fuzzer/FuzzerValueBitMap.h#L34 +// ValueProfileMap.AddValue() truncates its argument to 16 bits and shifts the +// PC to the left by log_2(128)=7, which means that only the lowest 16 - 7 bits +// of the return address matter. String compare hooks use the lowest 12 bits, +// but take the return address as an argument and thus don't require the +// indirection through a trampoline. +// TODO: Remove the inline assembly trampoline once a PC argument has been added to libfuzzer's int compare hooks. +TEXT runtime·libfuzzerCallTraceIntCmp(SB), NOSPLIT, $0-32 + MOVQ fn+0(FP), AX + MOVQ arg0+8(FP), RARG0 + MOVQ arg1+16(FP), RARG1 + MOVQ fakePC+24(FP), R8 + + get_tls(R12) + MOVQ g(R12), R14 + MOVQ g_m(R14), R13 + + // Switch to g0 stack. 
+ MOVQ SP, R12 // callee-saved, preserved across the CALL + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + // Load the address of the end of the function and push it into the stack. + // This address will be jumped to after executing the return instruction + // from the return sled. There we reset the stack pointer and return. + MOVQ $end_of_function<>(SB), BX + PUSHQ BX + // Load the starting address of the return sled into BX. + MOVQ $ret_sled<>(SB), BX + // Load the address of the i'th return instruction from the return sled. + // The index is given in the fakePC argument. + ADDQ R8, BX + PUSHQ BX + // Call the original function with the fakePC return address on the stack. + // Function arguments arg0 and arg1 are passed in the registers specified + // by the x64 calling convention. + JMP AX +// This code will not be executed and is only there to satisfy the assembler's +// check of a balanced stack. +not_reachable: + POPQ BX + POPQ BX + RET + +TEXT end_of_function<>(SB), NOSPLIT, $0-0 + MOVQ R12, SP + RET + +#define REPEAT_8(a) a \ + a \ + a \ + a \ + a \ + a \ + a \ + a + +#define REPEAT_512(a) REPEAT_8(REPEAT_8(REPEAT_8(a))) + +TEXT ret_sled<>(SB), NOSPLIT, $0-0 + REPEAT_512(RET) + +// void runtime·libfuzzerCallWithTwoByteBuffers(fn, start, end *byte) +// Calls C function fn from libFuzzer and passes 2 arguments of type *byte to it. +TEXT runtime·libfuzzerCallWithTwoByteBuffers(SB), NOSPLIT, $0-24 + MOVQ fn+0(FP), AX + MOVQ start+8(FP), RARG0 + MOVQ end+16(FP), RARG1 + + get_tls(R12) + MOVQ g(R12), R14 + MOVQ g_m(R14), R13 + + // Switch to g0 stack. 
+ MOVQ SP, R12 // callee-saved, preserved across the CALL + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + CALL AX + MOVQ R12, SP + RET diff --git a/src/runtime/libfuzzer_arm64.s b/src/runtime/libfuzzer_arm64.s new file mode 100644 index 0000000..37b3517 --- /dev/null +++ b/src/runtime/libfuzzer_arm64.s @@ -0,0 +1,115 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build libfuzzer + +#include "go_asm.h" +#include "textflag.h" + +// Based on race_arm64.s; see commentary there. + +#define RARG0 R0 +#define RARG1 R1 +#define RARG2 R2 +#define RARG3 R3 + +#define REPEAT_2(a) a a +#define REPEAT_8(a) REPEAT_2(REPEAT_2(REPEAT_2(a))) +#define REPEAT_128(a) REPEAT_2(REPEAT_8(REPEAT_8(a))) + +// void runtime·libfuzzerCallTraceIntCmp(fn, arg0, arg1, fakePC uintptr) +// Calls C function fn from libFuzzer and passes 2 arguments to it after +// manipulating the return address so that libfuzzer's integer compare hooks +// work. +// The problem statement and solution are documented in detail in libfuzzer_amd64.s. +// See commentary there. +TEXT runtime·libfuzzerCallTraceIntCmp(SB), NOSPLIT, $8-32 + MOVD fn+0(FP), R9 + MOVD arg0+8(FP), RARG0 + MOVD arg1+16(FP), RARG1 + MOVD fakePC+24(FP), R8 + // Save the original return address in a local variable + MOVD R30, savedRetAddr-8(SP) + + MOVD g_m(g), R10 + + // Switch to g0 stack. + MOVD RSP, R19 // callee-saved, preserved across the CALL + MOVD m_g0(R10), R11 + CMP R11, g + BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R11), R12 + MOVD R12, RSP +call: + // Load address of the ret sled into the default register for the return + // address. + ADR ret_sled, R30 + // Clear the lowest 2 bits of fakePC. 
All ARM64 instructions are four + // bytes long, so we cannot get better return address granularity than + // multiples of 4. + AND $-4, R8, R8 + // Add the offset of the fake_pc-th ret. + ADD R8, R30, R30 + // Call the function by jumping to it and reusing all registers except + // for the modified return address register R30. + JMP (R9) + +// The ret sled for ARM64 consists of 128 br instructions jumping to the +// end of the function. Each instruction is 4 bytes long. The sled thus +// has the same byte length of 4 * 128 = 512 as the x86_64 sled, but +// coarser granularity. +#define RET_SLED \ + JMP end_of_function; + +ret_sled: + REPEAT_128(RET_SLED); + +end_of_function: + MOVD R19, RSP + MOVD savedRetAddr-8(SP), R30 + RET + +// void runtime·libfuzzerCall4(fn, hookId int, s1, s2 unsafe.Pointer, result uintptr) +// Calls C function fn from libFuzzer and passes 4 arguments to it. +TEXT runtime·libfuzzerCall4(SB), NOSPLIT, $0-40 + MOVD fn+0(FP), R9 + MOVD hookId+8(FP), RARG0 + MOVD s1+16(FP), RARG1 + MOVD s2+24(FP), RARG2 + MOVD result+32(FP), RARG3 + + MOVD g_m(g), R10 + + // Switch to g0 stack. + MOVD RSP, R19 // callee-saved, preserved across the CALL + MOVD m_g0(R10), R11 + CMP R11, g + BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R11), R12 + MOVD R12, RSP +call: + BL R9 + MOVD R19, RSP + RET + +// void runtime·libfuzzerCallWithTwoByteBuffers(fn, start, end *byte) +// Calls C function fn from libFuzzer and passes 2 arguments of type *byte to it. +TEXT runtime·libfuzzerCallWithTwoByteBuffers(SB), NOSPLIT, $0-24 + MOVD fn+0(FP), R9 + MOVD start+8(FP), R0 + MOVD end+16(FP), R1 + + MOVD g_m(g), R10 + + // Switch to g0 stack. 
+ MOVD RSP, R19 // callee-saved, preserved across the CALL + MOVD m_g0(R10), R11 + CMP R11, g + BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R11), R12 + MOVD R12, RSP +call: + BL R9 + MOVD R19, RSP + RET diff --git a/src/runtime/lock_futex.go b/src/runtime/lock_futex.go new file mode 100644 index 0000000..cc7d465 --- /dev/null +++ b/src/runtime/lock_futex.go @@ -0,0 +1,246 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build dragonfly || freebsd || linux + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// This implementation depends on OS-specific implementations of +// +// futexsleep(addr *uint32, val uint32, ns int64) +// Atomically, +// if *addr == val { sleep } +// Might be woken up spuriously; that's allowed. +// Don't sleep longer than ns; ns < 0 means forever. +// +// futexwakeup(addr *uint32, cnt uint32) +// If any procs are sleeping on addr, wake up at most cnt. + +const ( + mutex_unlocked = 0 + mutex_locked = 1 + mutex_sleeping = 2 + + active_spin = 4 + active_spin_cnt = 30 + passive_spin = 1 +) + +// Possible lock states are mutex_unlocked, mutex_locked and mutex_sleeping. +// mutex_sleeping means that there is presumably at least one sleeping thread. +// Note that there can be spinning threads during all states - they do not +// affect mutex's state. + +// We use the uintptr mutex.key and note.key as a uint32. +// +//go:nosplit +func key32(p *uintptr) *uint32 { + return (*uint32)(unsafe.Pointer(p)) +} + +func lock(l *mutex) { + lockWithRank(l, getLockRank(l)) +} + +func lock2(l *mutex) { + gp := getg() + + if gp.m.locks < 0 { + throw("runtime·lock: lock count") + } + gp.m.locks++ + + // Speculative grab for lock. 
+ v := atomic.Xchg(key32(&l.key), mutex_locked) + if v == mutex_unlocked { + return + } + + // wait is either MUTEX_LOCKED or MUTEX_SLEEPING + // depending on whether there is a thread sleeping + // on this mutex. If we ever change l->key from + // MUTEX_SLEEPING to some other value, we must be + // careful to change it back to MUTEX_SLEEPING before + // returning, to ensure that the sleeping thread gets + // its wakeup call. + wait := v + + // On uniprocessors, no point spinning. + // On multiprocessors, spin for ACTIVE_SPIN attempts. + spin := 0 + if ncpu > 1 { + spin = active_spin + } + for { + // Try for lock, spinning. + for i := 0; i < spin; i++ { + for l.key == mutex_unlocked { + if atomic.Cas(key32(&l.key), mutex_unlocked, wait) { + return + } + } + procyield(active_spin_cnt) + } + + // Try for lock, rescheduling. + for i := 0; i < passive_spin; i++ { + for l.key == mutex_unlocked { + if atomic.Cas(key32(&l.key), mutex_unlocked, wait) { + return + } + } + osyield() + } + + // Sleep. + v = atomic.Xchg(key32(&l.key), mutex_sleeping) + if v == mutex_unlocked { + return + } + wait = mutex_sleeping + futexsleep(key32(&l.key), mutex_sleeping, -1) + } +} + +func unlock(l *mutex) { + unlockWithRank(l) +} + +func unlock2(l *mutex) { + v := atomic.Xchg(key32(&l.key), mutex_unlocked) + if v == mutex_unlocked { + throw("unlock of unlocked lock") + } + if v == mutex_sleeping { + futexwakeup(key32(&l.key), 1) + } + + gp := getg() + gp.m.locks-- + if gp.m.locks < 0 { + throw("runtime·unlock: lock count") + } + if gp.m.locks == 0 && gp.preempt { // restore the preemption request in case we've cleared it in newstack + gp.stackguard0 = stackPreempt + } +} + +// One-time notifications. 
+func noteclear(n *note) { + n.key = 0 +} + +func notewakeup(n *note) { + old := atomic.Xchg(key32(&n.key), 1) + if old != 0 { + print("notewakeup - double wakeup (", old, ")\n") + throw("notewakeup - double wakeup") + } + futexwakeup(key32(&n.key), 1) +} + +func notesleep(n *note) { + gp := getg() + if gp != gp.m.g0 { + throw("notesleep not on g0") + } + ns := int64(-1) + if *cgo_yield != nil { + // Sleep for an arbitrary-but-moderate interval to poll libc interceptors. + ns = 10e6 + } + for atomic.Load(key32(&n.key)) == 0 { + gp.m.blocked = true + futexsleep(key32(&n.key), 0, ns) + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + gp.m.blocked = false + } +} + +// May run with m.p==nil if called from notetsleep, so write barriers +// are not allowed. +// +//go:nosplit +//go:nowritebarrier +func notetsleep_internal(n *note, ns int64) bool { + gp := getg() + + if ns < 0 { + if *cgo_yield != nil { + // Sleep for an arbitrary-but-moderate interval to poll libc interceptors. + ns = 10e6 + } + for atomic.Load(key32(&n.key)) == 0 { + gp.m.blocked = true + futexsleep(key32(&n.key), 0, ns) + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + gp.m.blocked = false + } + return true + } + + if atomic.Load(key32(&n.key)) != 0 { + return true + } + + deadline := nanotime() + ns + for { + if *cgo_yield != nil && ns > 10e6 { + ns = 10e6 + } + gp.m.blocked = true + futexsleep(key32(&n.key), 0, ns) + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + gp.m.blocked = false + if atomic.Load(key32(&n.key)) != 0 { + break + } + now := nanotime() + if now >= deadline { + break + } + ns = deadline - now + } + return atomic.Load(key32(&n.key)) != 0 +} + +func notetsleep(n *note, ns int64) bool { + gp := getg() + if gp != gp.m.g0 && gp.m.preemptoff != "" { + throw("notetsleep not on g0") + } + + return notetsleep_internal(n, ns) +} + +// same as runtime·notetsleep, but called on user g (not g0) +// calls only nosplit functions between entersyscallblock/exitsyscall. 
+func notetsleepg(n *note, ns int64) bool { + gp := getg() + if gp == gp.m.g0 { + throw("notetsleepg on g0") + } + + entersyscallblock() + ok := notetsleep_internal(n, ns) + exitsyscall() + return ok +} + +func beforeIdle(int64, int64) (*g, bool) { + return nil, false +} + +func checkTimeouts() {} diff --git a/src/runtime/lock_js.go b/src/runtime/lock_js.go new file mode 100644 index 0000000..f71e7a2 --- /dev/null +++ b/src/runtime/lock_js.go @@ -0,0 +1,271 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build js && wasm + +package runtime + +import ( + _ "unsafe" +) + +// js/wasm has no support for threads yet. There is no preemption. + +const ( + mutex_unlocked = 0 + mutex_locked = 1 + + note_cleared = 0 + note_woken = 1 + note_timeout = 2 + + active_spin = 4 + active_spin_cnt = 30 + passive_spin = 1 +) + +func lock(l *mutex) { + lockWithRank(l, getLockRank(l)) +} + +func lock2(l *mutex) { + if l.key == mutex_locked { + // js/wasm is single-threaded so we should never + // observe this. + throw("self deadlock") + } + gp := getg() + if gp.m.locks < 0 { + throw("lock count") + } + gp.m.locks++ + l.key = mutex_locked +} + +func unlock(l *mutex) { + unlockWithRank(l) +} + +func unlock2(l *mutex) { + if l.key == mutex_unlocked { + throw("unlock of unlocked lock") + } + gp := getg() + gp.m.locks-- + if gp.m.locks < 0 { + throw("lock count") + } + l.key = mutex_unlocked +} + +// One-time notifications. 
+ +type noteWithTimeout struct { + gp *g + deadline int64 +} + +var ( + notes = make(map[*note]*g) + notesWithTimeout = make(map[*note]noteWithTimeout) +) + +func noteclear(n *note) { + n.key = note_cleared +} + +func notewakeup(n *note) { + // gp := getg() + if n.key == note_woken { + throw("notewakeup - double wakeup") + } + cleared := n.key == note_cleared + n.key = note_woken + if cleared { + goready(notes[n], 1) + } +} + +func notesleep(n *note) { + throw("notesleep not supported by js") +} + +func notetsleep(n *note, ns int64) bool { + throw("notetsleep not supported by js") + return false +} + +// same as runtime·notetsleep, but called on user g (not g0) +func notetsleepg(n *note, ns int64) bool { + gp := getg() + if gp == gp.m.g0 { + throw("notetsleepg on g0") + } + + if ns >= 0 { + deadline := nanotime() + ns + delay := ns/1000000 + 1 // round up + if delay > 1<<31-1 { + delay = 1<<31 - 1 // cap to max int32 + } + + id := scheduleTimeoutEvent(delay) + mp := acquirem() + notes[n] = gp + notesWithTimeout[n] = noteWithTimeout{gp: gp, deadline: deadline} + releasem(mp) + + gopark(nil, nil, waitReasonSleep, traceEvNone, 1) + + clearTimeoutEvent(id) // note might have woken early, clear timeout + clearIdleID() + + mp = acquirem() + delete(notes, n) + delete(notesWithTimeout, n) + releasem(mp) + + return n.key == note_woken + } + + for n.key != note_woken { + mp := acquirem() + notes[n] = gp + releasem(mp) + + gopark(nil, nil, waitReasonZero, traceEvNone, 1) + + mp = acquirem() + delete(notes, n) + releasem(mp) + } + return true +} + +// checkTimeouts resumes goroutines that are waiting on a note which has reached its deadline. +// TODO(drchase): need to understand if write barriers are really okay in this context. +// +//go:yeswritebarrierrec +func checkTimeouts() { + now := nanotime() + // TODO: map iteration has the write barriers in it; is that okay? 
+ for n, nt := range notesWithTimeout { + if n.key == note_cleared && now >= nt.deadline { + n.key = note_timeout + goready(nt.gp, 1) + } + } +} + +// events is a stack of calls from JavaScript into Go. +var events []*event + +type event struct { + // g was the active goroutine when the call from JavaScript occurred. + // It needs to be active when returning to JavaScript. + gp *g + // returned reports whether the event handler has returned. + // When all goroutines are idle and the event handler has returned, + // then g gets resumed and returns the execution to JavaScript. + returned bool +} + +// The timeout event started by beforeIdle. +var idleID int32 + +// beforeIdle gets called by the scheduler if no goroutine is awake. +// If we are not already handling an event, then we pause for an async event. +// If an event handler returned, we resume it and it will pause the execution. +// beforeIdle either returns the specific goroutine to schedule next or +// indicates with otherReady that some goroutine became ready. +// TODO(drchase): need to understand if write barriers are really okay in this context. +// +//go:yeswritebarrierrec +func beforeIdle(now, pollUntil int64) (gp *g, otherReady bool) { + delay := int64(-1) + if pollUntil != 0 { + delay = pollUntil - now + } + + if delay > 0 { + clearIdleID() + if delay < 1e6 { + delay = 1 + } else if delay < 1e15 { + delay = delay / 1e6 + } else { + // An arbitrary cap on how long to wait for a timer. + // 1e9 ms == ~11.5 days. + delay = 1e9 + } + idleID = scheduleTimeoutEvent(delay) + } + + if len(events) == 0 { + // TODO: this is the line that requires the yeswritebarrierrec + go handleAsyncEvent() + return nil, true + } + + e := events[len(events)-1] + if e.returned { + return e.gp, false + } + return nil, false +} + +func handleAsyncEvent() { + pause(getcallersp() - 16) +} + +// clearIdleID clears our record of the timeout started by beforeIdle. 
+func clearIdleID() { + if idleID != 0 { + clearTimeoutEvent(idleID) + idleID = 0 + } +} + +// pause sets SP to newsp and pauses the execution of Go's WebAssembly code until an event is triggered. +func pause(newsp uintptr) + +// scheduleTimeoutEvent tells the WebAssembly environment to trigger an event after ms milliseconds. +// It returns a timer id that can be used with clearTimeoutEvent. +func scheduleTimeoutEvent(ms int64) int32 + +// clearTimeoutEvent clears a timeout event scheduled by scheduleTimeoutEvent. +func clearTimeoutEvent(id int32) + +// handleEvent gets invoked on a call from JavaScript into Go. It calls the event handler of the syscall/js package +// and then parks the handler goroutine to allow other goroutines to run before giving execution back to JavaScript. +// When no other goroutine is awake any more, beforeIdle resumes the handler goroutine. Now that the same goroutine +// is running as was running when the call came in from JavaScript, execution can be safely passed back to JavaScript. +func handleEvent() { + e := &event{ + gp: getg(), + returned: false, + } + events = append(events, e) + + eventHandler() + + clearIdleID() + + // wait until all goroutines are idle + e.returned = true + gopark(nil, nil, waitReasonZero, traceEvNone, 1) + + events[len(events)-1] = nil + events = events[:len(events)-1] + + // return execution to JavaScript + pause(getcallersp() - 16) +} + +var eventHandler func() + +//go:linkname setEventHandler syscall/js.setEventHandler +func setEventHandler(fn func()) { + eventHandler = fn +} diff --git a/src/runtime/lock_sema.go b/src/runtime/lock_sema.go new file mode 100644 index 0000000..e15bbf7 --- /dev/null +++ b/src/runtime/lock_sema.go @@ -0,0 +1,304 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build aix || darwin || netbsd || openbsd || plan9 || solaris || windows + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// This implementation depends on OS-specific implementations of +// +// func semacreate(mp *m) +// Create a semaphore for mp, if it does not already have one. +// +// func semasleep(ns int64) int32 +// If ns < 0, acquire m's semaphore and return 0. +// If ns >= 0, try to acquire m's semaphore for at most ns nanoseconds. +// Return 0 if the semaphore was acquired, -1 if interrupted or timed out. +// +// func semawakeup(mp *m) +// Wake up mp, which is or will soon be sleeping on its semaphore. +const ( + locked uintptr = 1 + + active_spin = 4 + active_spin_cnt = 30 + passive_spin = 1 +) + +func lock(l *mutex) { + lockWithRank(l, getLockRank(l)) +} + +func lock2(l *mutex) { + gp := getg() + if gp.m.locks < 0 { + throw("runtime·lock: lock count") + } + gp.m.locks++ + + // Speculative grab for lock. + if atomic.Casuintptr(&l.key, 0, locked) { + return + } + semacreate(gp.m) + + // On uniprocessors, there is no point in spinning. + // On multiprocessors, spin for active_spin attempts. + spin := 0 + if ncpu > 1 { + spin = active_spin + } +Loop: + for i := 0; ; i++ { + v := atomic.Loaduintptr(&l.key) + if v&locked == 0 { + // Unlocked. Try to lock. + if atomic.Casuintptr(&l.key, v, v|locked) { + return + } + i = 0 + } + if i < spin { + procyield(active_spin_cnt) + } else if i < spin+passive_spin { + osyield() + } else { + // Someone else has it. + // l->waitm points to a linked list of M's waiting + // for this lock, chained through m->nextwaitm. + // Queue this M. + for { + gp.m.nextwaitm = muintptr(v &^ locked) + if atomic.Casuintptr(&l.key, v, uintptr(unsafe.Pointer(gp.m))|locked) { + break + } + v = atomic.Loaduintptr(&l.key) + if v&locked == 0 { + continue Loop + } + } + if v&locked != 0 { + // Queued. Wait.
+ semasleep(-1) + i = 0 + } + } + } +} + +func unlock(l *mutex) { + unlockWithRank(l) +} + +// We might not be holding a p in this code. +// +//go:nowritebarrier +func unlock2(l *mutex) { + gp := getg() + var mp *m + for { + v := atomic.Loaduintptr(&l.key) + if v == locked { + if atomic.Casuintptr(&l.key, locked, 0) { + break + } + } else { + // Other M's are waiting for the lock. + // Dequeue an M. + mp = muintptr(v &^ locked).ptr() + if atomic.Casuintptr(&l.key, v, uintptr(mp.nextwaitm)) { + // Dequeued an M. Wake it. + semawakeup(mp) + break + } + } + } + gp.m.locks-- + if gp.m.locks < 0 { + throw("runtime·unlock: lock count") + } + if gp.m.locks == 0 && gp.preempt { // restore the preemption request in case we've cleared it in newstack + gp.stackguard0 = stackPreempt + } +} + +// One-time notifications. +func noteclear(n *note) { + if GOOS == "aix" { + // On AIX, semaphores might not synchronize the memory in some + // rare cases. See issue #30189. + atomic.Storeuintptr(&n.key, 0) + } else { + n.key = 0 + } +} + +func notewakeup(n *note) { + var v uintptr + for { + v = atomic.Loaduintptr(&n.key) + if atomic.Casuintptr(&n.key, v, locked) { + break + } + } + + // Successfully set waitm to locked. + // What was it before? + switch { + case v == 0: + // Nothing was waiting. Done. + case v == locked: + // Two notewakeups! Not allowed. + throw("notewakeup - double wakeup") + default: + // Must be the waiting m. Wake it up. + semawakeup((*m)(unsafe.Pointer(v))) + } +} + +func notesleep(n *note) { + gp := getg() + if gp != gp.m.g0 { + throw("notesleep not on g0") + } + semacreate(gp.m) + if !atomic.Casuintptr(&n.key, 0, uintptr(unsafe.Pointer(gp.m))) { + // Must be locked (got wakeup). + if n.key != locked { + throw("notesleep - waitm out of sync") + } + return + } + // Queued. Sleep. + gp.m.blocked = true + if *cgo_yield == nil { + semasleep(-1) + } else { + // Sleep for an arbitrary-but-moderate interval to poll libc interceptors. 
+ const ns = 10e6 + for atomic.Loaduintptr(&n.key) == 0 { + semasleep(ns) + asmcgocall(*cgo_yield, nil) + } + } + gp.m.blocked = false +} + +//go:nosplit +func notetsleep_internal(n *note, ns int64, gp *g, deadline int64) bool { + // gp and deadline are logically local variables, but they are written + // as parameters so that the stack space they require is charged + // to the caller. + // This reduces the nosplit footprint of notetsleep_internal. + gp = getg() + + // Register for wakeup on n->waitm. + if !atomic.Casuintptr(&n.key, 0, uintptr(unsafe.Pointer(gp.m))) { + // Must be locked (got wakeup). + if n.key != locked { + throw("notetsleep - waitm out of sync") + } + return true + } + if ns < 0 { + // Queued. Sleep. + gp.m.blocked = true + if *cgo_yield == nil { + semasleep(-1) + } else { + // Sleep in arbitrary-but-moderate intervals to poll libc interceptors. + const ns = 10e6 + for semasleep(ns) < 0 { + asmcgocall(*cgo_yield, nil) + } + } + gp.m.blocked = false + return true + } + + deadline = nanotime() + ns + for { + // Registered. Sleep. + gp.m.blocked = true + if *cgo_yield != nil && ns > 10e6 { + ns = 10e6 + } + if semasleep(ns) >= 0 { + gp.m.blocked = false + // Acquired semaphore, semawakeup unregistered us. + // Done. + return true + } + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + gp.m.blocked = false + // Interrupted or timed out. Still registered. Semaphore not acquired. + ns = deadline - nanotime() + if ns <= 0 { + break + } + // Deadline hasn't arrived. Keep sleeping. + } + + // Deadline arrived. Still registered. Semaphore not acquired. + // Want to give up and return, but have to unregister first, + // so that any notewakeup racing with the return does not + // try to grant us the semaphore when we don't expect it. + for { + v := atomic.Loaduintptr(&n.key) + switch v { + case uintptr(unsafe.Pointer(gp.m)): + // No wakeup yet; unregister if possible. 
+ if atomic.Casuintptr(&n.key, v, 0) { + return false + } + case locked: + // Wakeup happened so semaphore is available. + // Grab it to avoid getting out of sync. + gp.m.blocked = true + if semasleep(-1) < 0 { + throw("runtime: unable to acquire - semaphore out of sync") + } + gp.m.blocked = false + return true + default: + throw("runtime: unexpected waitm - semaphore out of sync") + } + } +} + +func notetsleep(n *note, ns int64) bool { + gp := getg() + if gp != gp.m.g0 { + throw("notetsleep not on g0") + } + semacreate(gp.m) + return notetsleep_internal(n, ns, nil, 0) +} + +// same as runtime·notetsleep, but called on user g (not g0) +// calls only nosplit functions between entersyscallblock/exitsyscall. +func notetsleepg(n *note, ns int64) bool { + gp := getg() + if gp == gp.m.g0 { + throw("notetsleepg on g0") + } + semacreate(gp.m) + entersyscallblock() + ok := notetsleep_internal(n, ns, nil, 0) + exitsyscall() + return ok +} + +func beforeIdle(int64, int64) (*g, bool) { + return nil, false +} + +func checkTimeouts() {} diff --git a/src/runtime/lockrank.go b/src/runtime/lockrank.go new file mode 100644 index 0000000..e51d7a0 --- /dev/null +++ b/src/runtime/lockrank.go @@ -0,0 +1,206 @@ +// Code generated by mklockrank.go; DO NOT EDIT. + +package runtime + +type lockRank int + +// Constants representing the ranks of all non-leaf runtime locks, in rank order. +// Locks with lower rank must be taken before locks with higher rank, +// in addition to satisfying the partial order in lockPartialOrder. +// A few ranks allow self-cycles, which are specified in lockPartialOrder. 
+const ( + lockRankUnknown lockRank = iota + + lockRankSysmon + lockRankScavenge + lockRankForcegc + lockRankDefer + lockRankSweepWaiters + lockRankAssistQueue + lockRankSweep + lockRankTestR + lockRankTestW + lockRankAllocmW + lockRankExecW + lockRankCpuprof + lockRankPollDesc + // SCHED + lockRankAllocmR + lockRankExecR + lockRankSched + lockRankAllg + lockRankAllp + lockRankTimers + lockRankNetpollInit + lockRankHchan + lockRankNotifyList + lockRankSudog + lockRankRoot + lockRankItab + lockRankReflectOffs + lockRankUserArenaState + // TRACEGLOBAL + lockRankTraceBuf + lockRankTraceStrings + // MALLOC + lockRankFin + lockRankGcBitsArenas + lockRankMheapSpecial + lockRankMspanSpecial + lockRankSpanSetSpine + // MPROF + lockRankProfInsert + lockRankProfBlock + lockRankProfMemActive + lockRankProfMemFuture + // STACKGROW + lockRankGscan + lockRankStackpool + lockRankStackLarge + lockRankHchanLeaf + // WB + lockRankWbufSpans + lockRankMheap + lockRankGlobalAlloc + // TRACE + lockRankTrace + lockRankTraceStackTab + lockRankPanic + lockRankDeadlock + lockRankAllocmRInternal + lockRankExecRInternal + lockRankTestRInternal +) + +// lockRankLeafRank is the rank of lock that does not have a declared rank, +// and hence is a leaf lock. +const lockRankLeafRank lockRank = 1000 + +// lockNames gives the names associated with each of the above ranks. 
+var lockNames = []string{ + lockRankSysmon: "sysmon", + lockRankScavenge: "scavenge", + lockRankForcegc: "forcegc", + lockRankDefer: "defer", + lockRankSweepWaiters: "sweepWaiters", + lockRankAssistQueue: "assistQueue", + lockRankSweep: "sweep", + lockRankTestR: "testR", + lockRankTestW: "testW", + lockRankAllocmW: "allocmW", + lockRankExecW: "execW", + lockRankCpuprof: "cpuprof", + lockRankPollDesc: "pollDesc", + lockRankAllocmR: "allocmR", + lockRankExecR: "execR", + lockRankSched: "sched", + lockRankAllg: "allg", + lockRankAllp: "allp", + lockRankTimers: "timers", + lockRankNetpollInit: "netpollInit", + lockRankHchan: "hchan", + lockRankNotifyList: "notifyList", + lockRankSudog: "sudog", + lockRankRoot: "root", + lockRankItab: "itab", + lockRankReflectOffs: "reflectOffs", + lockRankUserArenaState: "userArenaState", + lockRankTraceBuf: "traceBuf", + lockRankTraceStrings: "traceStrings", + lockRankFin: "fin", + lockRankGcBitsArenas: "gcBitsArenas", + lockRankMheapSpecial: "mheapSpecial", + lockRankMspanSpecial: "mspanSpecial", + lockRankSpanSetSpine: "spanSetSpine", + lockRankProfInsert: "profInsert", + lockRankProfBlock: "profBlock", + lockRankProfMemActive: "profMemActive", + lockRankProfMemFuture: "profMemFuture", + lockRankGscan: "gscan", + lockRankStackpool: "stackpool", + lockRankStackLarge: "stackLarge", + lockRankHchanLeaf: "hchanLeaf", + lockRankWbufSpans: "wbufSpans", + lockRankMheap: "mheap", + lockRankGlobalAlloc: "globalAlloc", + lockRankTrace: "trace", + lockRankTraceStackTab: "traceStackTab", + lockRankPanic: "panic", + lockRankDeadlock: "deadlock", + lockRankAllocmRInternal: "allocmRInternal", + lockRankExecRInternal: "execRInternal", + lockRankTestRInternal: "testRInternal", +} + +func (rank lockRank) String() string { + if rank == 0 { + return "UNKNOWN" + } + if rank == lockRankLeafRank { + return "LEAF" + } + if rank < 0 || int(rank) >= len(lockNames) { + return "BAD RANK" + } + return lockNames[rank] +} + +// lockPartialOrder is the transitive 
closure of the lock rank graph. +// An entry for rank X lists all of the ranks that can already be held +// when rank X is acquired. +// +// Lock ranks that allow self-cycles list themselves. +var lockPartialOrder [][]lockRank = [][]lockRank{ + lockRankSysmon: {}, + lockRankScavenge: {lockRankSysmon}, + lockRankForcegc: {lockRankSysmon}, + lockRankDefer: {}, + lockRankSweepWaiters: {}, + lockRankAssistQueue: {}, + lockRankSweep: {}, + lockRankTestR: {}, + lockRankTestW: {}, + lockRankAllocmW: {}, + lockRankExecW: {}, + lockRankCpuprof: {}, + lockRankPollDesc: {}, + lockRankAllocmR: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc}, + lockRankExecR: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc}, + lockRankSched: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR}, + lockRankAllg: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched}, + lockRankAllp: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched}, + lockRankTimers: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllp, lockRankTimers}, + lockRankNetpollInit: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, 
lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllp, lockRankTimers}, + lockRankHchan: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankTestR, lockRankHchan}, + lockRankNotifyList: {}, + lockRankSudog: {lockRankSysmon, lockRankScavenge, lockRankSweep, lockRankTestR, lockRankHchan, lockRankNotifyList}, + lockRankRoot: {}, + lockRankItab: {}, + lockRankReflectOffs: {lockRankItab}, + lockRankUserArenaState: {}, + lockRankTraceBuf: {lockRankSysmon, lockRankScavenge}, + lockRankTraceStrings: {lockRankSysmon, lockRankScavenge, lockRankTraceBuf}, + lockRankFin: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankGcBitsArenas: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankMheapSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankMspanSpecial: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, 
lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankSpanSetSpine: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankProfInsert: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankProfBlock: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + lockRankProfMemActive: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings}, + 
lockRankProfMemFuture: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankHchan, lockRankNotifyList, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankProfMemActive}, + lockRankGscan: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture}, + lockRankStackpool: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan}, + lockRankStackLarge: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, 
lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan}, + lockRankHchanLeaf: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankHchanLeaf}, + lockRankWbufSpans: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan}, + lockRankMheap: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, 
lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans}, + lockRankGlobalAlloc: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMheapSpecial, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap}, + lockRankTrace: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap}, + lockRankTraceStackTab: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankDefer, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, 
lockRankPollDesc, lockRankAllocmR, lockRankExecR, lockRankSched, lockRankAllg, lockRankAllp, lockRankTimers, lockRankNetpollInit, lockRankHchan, lockRankNotifyList, lockRankSudog, lockRankRoot, lockRankItab, lockRankReflectOffs, lockRankUserArenaState, lockRankTraceBuf, lockRankTraceStrings, lockRankFin, lockRankGcBitsArenas, lockRankMspanSpecial, lockRankSpanSetSpine, lockRankProfInsert, lockRankProfBlock, lockRankProfMemActive, lockRankProfMemFuture, lockRankGscan, lockRankStackpool, lockRankStackLarge, lockRankWbufSpans, lockRankMheap, lockRankTrace}, + lockRankPanic: {}, + lockRankDeadlock: {lockRankPanic, lockRankDeadlock}, + lockRankAllocmRInternal: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankAllocmW, lockRankCpuprof, lockRankPollDesc, lockRankAllocmR}, + lockRankExecRInternal: {lockRankSysmon, lockRankScavenge, lockRankForcegc, lockRankSweepWaiters, lockRankAssistQueue, lockRankSweep, lockRankTestR, lockRankExecW, lockRankCpuprof, lockRankPollDesc, lockRankExecR}, + lockRankTestRInternal: {lockRankTestR, lockRankTestW}, +} diff --git a/src/runtime/lockrank_off.go b/src/runtime/lockrank_off.go new file mode 100644 index 0000000..bf046a1 --- /dev/null +++ b/src/runtime/lockrank_off.go @@ -0,0 +1,66 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !goexperiment.staticlockranking + +package runtime + +// lockRankStruct is embedded in mutex, but is empty when staticlockranking is +// disabled (the default). +type lockRankStruct struct { +} + +func lockInit(l *mutex, rank lockRank) { +} + +func getLockRank(l *mutex) lockRank { + return 0 +} + +func lockWithRank(l *mutex, rank lockRank) { + lock2(l) +} + +// This function may be called in nosplit context and thus must be nosplit.
+// +//go:nosplit +func acquireLockRank(rank lockRank) { +} + +func unlockWithRank(l *mutex) { + unlock2(l) +} + +// This function may be called in nosplit context and thus must be nosplit. +// +//go:nosplit +func releaseLockRank(rank lockRank) { +} + +func lockWithRankMayAcquire(l *mutex, rank lockRank) { +} + +//go:nosplit +func assertLockHeld(l *mutex) { +} + +//go:nosplit +func assertRankHeld(r lockRank) { +} + +//go:nosplit +func worldStopped() { +} + +//go:nosplit +func worldStarted() { +} + +//go:nosplit +func assertWorldStopped() { +} + +//go:nosplit +func assertWorldStoppedOrLockHeld(l *mutex) { +} diff --git a/src/runtime/lockrank_on.go b/src/runtime/lockrank_on.go new file mode 100644 index 0000000..5dcc79b --- /dev/null +++ b/src/runtime/lockrank_on.go @@ -0,0 +1,383 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build goexperiment.staticlockranking + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// worldIsStopped is accessed atomically to track world-stops. 1 == world +// stopped. +var worldIsStopped atomic.Uint32 + +// lockRankStruct is embedded in mutex +type lockRankStruct struct { + // static lock ranking of the lock + rank lockRank + // pad field to make sure lockRankStruct is a multiple of 8 bytes, even on + // 32-bit systems. + pad int +} + +// lockInit(l *mutex, rank int) sets the rank of lock before it is used. +// If there is no clear place to initialize a lock, then the rank of a lock can be +// specified during the lock call itself via lockWithRank(l *mutex, rank int). +func lockInit(l *mutex, rank lockRank) { + l.rank = rank +} + +func getLockRank(l *mutex) lockRank { + return l.rank +} + +// lockWithRank is like lock(l), but allows the caller to specify a lock rank +// when acquiring a non-static lock. 
+// +// Note that we need to be careful about stack splits: +// +// This function is not nosplit, thus it may split at function entry. This may +// introduce a new edge in the lock order, but it is no different from any +// other (nosplit) call before this call (including the call to lock() itself). +// +// However, we switch to the systemstack to record the lock held to ensure that +// we record an accurate lock ordering. e.g., without systemstack, a stack +// split on entry to lock2() would record stack split locks as taken after l, +// even though l is not actually locked yet. +func lockWithRank(l *mutex, rank lockRank) { + if l == &debuglock || l == &paniclk { + // debuglock is only used for println/printlock(). Don't do lock + // rank recording for it, since print/println are used when + // printing out a lock ordering problem below. + // + // paniclk is only used for fatal throw/panic. Don't do lock + // ranking recording for it, since we throw after reporting a + // lock ordering problem. Additionally, paniclk may be taken + // after effectively any lock (anywhere we might panic), which + // the partial order doesn't cover. + lock2(l) + return + } + if rank == 0 { + rank = lockRankLeafRank + } + gp := getg() + // Log the new class. + systemstack(func() { + i := gp.m.locksHeldLen + if i >= len(gp.m.locksHeld) { + throw("too many locks held concurrently for rank checking") + } + gp.m.locksHeld[i].rank = rank + gp.m.locksHeld[i].lockAddr = uintptr(unsafe.Pointer(l)) + gp.m.locksHeldLen++ + + // i is the index of the lock being acquired + if i > 0 { + checkRanks(gp, gp.m.locksHeld[i-1].rank, rank) + } + lock2(l) + }) +} + +// nosplit to ensure it can be called in as many contexts as possible. 
+// +//go:nosplit +func printHeldLocks(gp *g) { + if gp.m.locksHeldLen == 0 { + println("<none>") + return + } + + for j, held := range gp.m.locksHeld[:gp.m.locksHeldLen] { + println(j, ":", held.rank.String(), held.rank, unsafe.Pointer(gp.m.locksHeld[j].lockAddr)) + } +} + +// acquireLockRank acquires a rank which is not associated with a mutex lock +// +// This function may be called in nosplit context and thus must be nosplit. +// +//go:nosplit +func acquireLockRank(rank lockRank) { + gp := getg() + // Log the new class. See comment on lockWithRank. + systemstack(func() { + i := gp.m.locksHeldLen + if i >= len(gp.m.locksHeld) { + throw("too many locks held concurrently for rank checking") + } + gp.m.locksHeld[i].rank = rank + gp.m.locksHeld[i].lockAddr = 0 + gp.m.locksHeldLen++ + + // i is the index of the lock being acquired + if i > 0 { + checkRanks(gp, gp.m.locksHeld[i-1].rank, rank) + } + }) +} + +// checkRanks checks if goroutine g, which has most recently acquired a lock +// with rank 'prevRank', can now acquire a lock with rank 'rank'. +// +//go:systemstack +func checkRanks(gp *g, prevRank, rank lockRank) { + rankOK := false + if rank < prevRank { + // If rank < prevRank, then we definitely have a rank error + rankOK = false + } else if rank == lockRankLeafRank { + // If new lock is a leaf lock, then the preceding lock can + // be anything except another leaf lock. + rankOK = prevRank < lockRankLeafRank + } else { + // We've now verified the total lock ranking, but we + // also enforce the partial ordering specified by + // lockPartialOrder as well. Two locks with the same rank + // can only be acquired at the same time if explicitly + // listed in the lockPartialOrder table.
+ list := lockPartialOrder[rank] + for _, entry := range list { + if entry == prevRank { + rankOK = true + break + } + } + } + if !rankOK { + printlock() + println(gp.m.procid, " ======") + printHeldLocks(gp) + throw("lock ordering problem") + } +} + +// See comment on lockWithRank regarding stack splitting. +func unlockWithRank(l *mutex) { + if l == &debuglock || l == &paniclk { + // See comment at beginning of lockWithRank. + unlock2(l) + return + } + gp := getg() + systemstack(func() { + found := false + for i := gp.m.locksHeldLen - 1; i >= 0; i-- { + if gp.m.locksHeld[i].lockAddr == uintptr(unsafe.Pointer(l)) { + found = true + copy(gp.m.locksHeld[i:gp.m.locksHeldLen-1], gp.m.locksHeld[i+1:gp.m.locksHeldLen]) + gp.m.locksHeldLen-- + break + } + } + if !found { + println(gp.m.procid, ":", l.rank.String(), l.rank, l) + throw("unlock without matching lock acquire") + } + unlock2(l) + }) +} + +// releaseLockRank releases a rank which is not associated with a mutex lock +// +// This function may be called in nosplit context and thus must be nosplit. +// +//go:nosplit +func releaseLockRank(rank lockRank) { + gp := getg() + systemstack(func() { + found := false + for i := gp.m.locksHeldLen - 1; i >= 0; i-- { + if gp.m.locksHeld[i].rank == rank && gp.m.locksHeld[i].lockAddr == 0 { + found = true + copy(gp.m.locksHeld[i:gp.m.locksHeldLen-1], gp.m.locksHeld[i+1:gp.m.locksHeldLen]) + gp.m.locksHeldLen-- + break + } + } + if !found { + println(gp.m.procid, ":", rank.String(), rank) + throw("lockRank release without matching lockRank acquire") + } + }) +} + +// See comment on lockWithRank regarding stack splitting. 
+func lockWithRankMayAcquire(l *mutex, rank lockRank) { + gp := getg() + if gp.m.locksHeldLen == 0 { + // No possibility of lock ordering problem if no other locks held + return + } + + systemstack(func() { + i := gp.m.locksHeldLen + if i >= len(gp.m.locksHeld) { + throw("too many locks held concurrently for rank checking") + } + // Temporarily add this lock to the locksHeld list, so + // checkRanks() will print out list, including this lock, if there + // is a lock ordering problem. + gp.m.locksHeld[i].rank = rank + gp.m.locksHeld[i].lockAddr = uintptr(unsafe.Pointer(l)) + gp.m.locksHeldLen++ + checkRanks(gp, gp.m.locksHeld[i-1].rank, rank) + gp.m.locksHeldLen-- + }) +} + +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func checkLockHeld(gp *g, l *mutex) bool { + for i := gp.m.locksHeldLen - 1; i >= 0; i-- { + if gp.m.locksHeld[i].lockAddr == uintptr(unsafe.Pointer(l)) { + return true + } + } + return false +} + +// assertLockHeld throws if l is not held by the caller. +// +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func assertLockHeld(l *mutex) { + gp := getg() + + held := checkLockHeld(gp, l) + if held { + return + } + + // Crash from system stack to avoid splits that may cause + // additional issues. + systemstack(func() { + printlock() + print("caller requires lock ", l, " (rank ", l.rank.String(), "), holding:\n") + printHeldLocks(gp) + throw("not holding required lock!") + }) +} + +// assertRankHeld throws if a mutex with rank r is not held by the caller. +// +// This is less precise than assertLockHeld, but can be used in places where a +// pointer to the exact mutex is not available. +// +// nosplit to ensure it can be called in as many contexts as possible. 
+// +//go:nosplit +func assertRankHeld(r lockRank) { + gp := getg() + + for i := gp.m.locksHeldLen - 1; i >= 0; i-- { + if gp.m.locksHeld[i].rank == r { + return + } + } + + // Crash from system stack to avoid splits that may cause + // additional issues. + systemstack(func() { + printlock() + print("caller requires lock with rank ", r.String(), ", holding:\n") + printHeldLocks(gp) + throw("not holding required lock!") + }) +} + +// worldStopped notes that the world is stopped. +// +// Caller must hold worldsema. +// +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func worldStopped() { + if stopped := worldIsStopped.Add(1); stopped != 1 { + systemstack(func() { + print("world stop count=", stopped, "\n") + throw("recursive world stop") + }) + } +} + +// worldStarted notes that the world is starting. +// +// Caller must hold worldsema. +// +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func worldStarted() { + if stopped := worldIsStopped.Add(-1); stopped != 0 { + systemstack(func() { + print("world stop count=", stopped, "\n") + throw("released non-stopped world stop") + }) + } +} + +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func checkWorldStopped() bool { + stopped := worldIsStopped.Load() + if stopped > 1 { + systemstack(func() { + print("inconsistent world stop count=", stopped, "\n") + throw("inconsistent world stop count") + }) + } + + return stopped == 1 +} + +// assertWorldStopped throws if the world is not stopped. It does not check +// which M stopped the world. +// +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func assertWorldStopped() { + if checkWorldStopped() { + return + } + + throw("world not stopped") +} + +// assertWorldStoppedOrLockHeld throws if the world is not stopped and the +// passed lock is not held.
+// +// nosplit to ensure it can be called in as many contexts as possible. +// +//go:nosplit +func assertWorldStoppedOrLockHeld(l *mutex) { + if checkWorldStopped() { + return + } + + gp := getg() + held := checkLockHeld(gp, l) + if held { + return + } + + // Crash from system stack to avoid splits that may cause + // additional issues. + systemstack(func() { + printlock() + print("caller requires world stop or lock ", l, " (rank ", l.rank.String(), "), holding:\n") + println("<no world stop>") + printHeldLocks(gp) + throw("no world stop or required lock!") + }) +} diff --git a/src/runtime/lockrank_test.go b/src/runtime/lockrank_test.go new file mode 100644 index 0000000..a7b1b8d --- /dev/null +++ b/src/runtime/lockrank_test.go @@ -0,0 +1,29 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "bytes" + "internal/testenv" + "os" + "os/exec" + "testing" +) + +// Test that the generated code for the lock rank graph is up-to-date. +func TestLockRankGenerated(t *testing.T) { + testenv.MustHaveGoRun(t) + want, err := testenv.CleanCmdEnv(exec.Command(testenv.GoToolPath(t), "run", "mklockrank.go")).CombinedOutput() + if err != nil { + t.Fatal(err) + } + got, err := os.ReadFile("lockrank.go") + if err != nil { + t.Fatal(err) + } + if !bytes.Equal(want, got) { + t.Fatalf("lockrank.go is out of date. Please run go generate.") + } +} diff --git a/src/runtime/malloc.go b/src/runtime/malloc.go new file mode 100644 index 0000000..7ff2190 --- /dev/null +++ b/src/runtime/malloc.go @@ -0,0 +1,1562 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Memory allocator. +// +// This was originally based on tcmalloc, but has diverged quite a bit. 
+// http://goog-perftools.sourceforge.net/doc/tcmalloc.html + +// The main allocator works in runs of pages. +// Small allocation sizes (up to and including 32 kB) are +// rounded to one of about 70 size classes, each of which +// has its own free set of objects of exactly that size. +// Any free page of memory can be split into a set of objects +// of one size class, which are then managed using a free bitmap. +// +// The allocator's data structures are: +// +// fixalloc: a free-list allocator for fixed-size off-heap objects, +// used to manage storage used by the allocator. +// mheap: the malloc heap, managed at page (8192-byte) granularity. +// mspan: a run of in-use pages managed by the mheap. +// mcentral: collects all spans of a given size class. +// mcache: a per-P cache of mspans with free space. +// mstats: allocation statistics. +// +// Allocating a small object proceeds up a hierarchy of caches: +// +// 1. Round the size up to one of the small size classes +// and look in the corresponding mspan in this P's mcache. +// Scan the mspan's free bitmap to find a free slot. +// If there is a free slot, allocate it. +// This can all be done without acquiring a lock. +// +// 2. If the mspan has no free slots, obtain a new mspan +// from the mcentral's list of mspans of the required size +// class that have free space. +// Obtaining a whole span amortizes the cost of locking +// the mcentral. +// +// 3. If the mcentral's mspan list is empty, obtain a run +// of pages from the mheap to use for the mspan. +// +// 4. If the mheap is empty or has no page runs large enough, +// allocate a new group of pages (at least 1MB) from the +// operating system. Allocating a large run of pages +// amortizes the cost of talking to the operating system. +// +// Sweeping an mspan and freeing objects on it proceeds up a similar +// hierarchy: +// +// 1. If the mspan is being swept in response to allocation, it +// is returned to the mcache to satisfy the allocation. +// +// 2. 
Otherwise, if the mspan still has allocated objects in it, +// it is placed on the mcentral free list for the mspan's size +// class. +// +// 3. Otherwise, if all objects in the mspan are free, the mspan's +// pages are returned to the mheap and the mspan is now dead. +// +// Allocating and freeing a large object uses the mheap +// directly, bypassing the mcache and mcentral. +// +// If mspan.needzero is false, then free object slots in the mspan are +// already zeroed. Otherwise if needzero is true, objects are zeroed as +// they are allocated. There are various benefits to delaying zeroing +// this way: +// +// 1. Stack frame allocation can avoid zeroing altogether. +// +// 2. It exhibits better temporal locality, since the program is +// probably about to write to the memory. +// +// 3. We don't zero pages that never get reused. + +// Virtual memory layout +// +// The heap consists of a set of arenas, which are 64MB on 64-bit and +// 4MB on 32-bit (heapArenaBytes). Each arena's start address is also +// aligned to the arena size. +// +// Each arena has an associated heapArena object that stores the +// metadata for that arena: the heap bitmap for all words in the arena +// and the span map for all pages in the arena. heapArena objects are +// themselves allocated off-heap. +// +// Since arenas are aligned, the address space can be viewed as a +// series of arena frames. The arena map (mheap_.arenas) maps from +// arena frame number to *heapArena, or nil for parts of the address +// space not backed by the Go heap. The arena map is structured as a +// two-level array consisting of a "L1" arena map and many "L2" arena +// maps; however, since arenas are large, on many architectures, the +// arena map consists of a single, large L2 map. +// +// The arena map covers the entire possible address space, allowing +// the Go heap to use any part of the address space. 
The allocator +// attempts to keep arenas contiguous so that large spans (and hence +// large objects) can cross arenas. + +package runtime + +import ( + "internal/goarch" + "internal/goos" + "runtime/internal/atomic" + "runtime/internal/math" + "runtime/internal/sys" + "unsafe" +) + +const ( + maxTinySize = _TinySize + tinySizeClass = _TinySizeClass + maxSmallSize = _MaxSmallSize + + pageShift = _PageShift + pageSize = _PageSize + + concurrentSweep = _ConcurrentSweep + + _PageSize = 1 << _PageShift + _PageMask = _PageSize - 1 + + // _64bit = 1 on 64-bit systems, 0 on 32-bit systems + _64bit = 1 << (^uintptr(0) >> 63) / 2 + + // Tiny allocator parameters, see "Tiny allocator" comment in malloc.go. + _TinySize = 16 + _TinySizeClass = int8(2) + + _FixAllocChunk = 16 << 10 // Chunk size for FixAlloc + + // Per-P, per order stack segment cache size. + _StackCacheSize = 32 * 1024 + + // Number of orders that get caching. Order 0 is FixedStack + // and each successive order is twice as large. + // We want to cache 2KB, 4KB, 8KB, and 16KB stacks. Larger stacks + // will be allocated directly. + // Since FixedStack is different on different systems, we + // must vary NumStackOrders to keep the same maximum cached size. + // OS | FixedStack | NumStackOrders + // -----------------+------------+--------------- + // linux/darwin/bsd | 2KB | 4 + // windows/32 | 4KB | 3 + // windows/64 | 8KB | 2 + // plan9 | 4KB | 3 + _NumStackOrders = 4 - goarch.PtrSize/4*goos.IsWindows - 1*goos.IsPlan9 + + // heapAddrBits is the number of bits in a heap address. On + // amd64, addresses are sign-extended beyond heapAddrBits. On + // other arches, they are zero-extended. + // + // On most 64-bit platforms, we limit this to 48 bits based on a + // combination of hardware and OS limitations. + // + // amd64 hardware limits addresses to 48 bits, sign-extended + // to 64 bits. Addresses where the top 16 bits are not either + // all 0 or all 1 are "non-canonical" and invalid. 
Because of + // these "negative" addresses, we offset addresses by 1<<47 + // (arenaBaseOffset) on amd64 before computing indexes into + // the heap arenas index. In 2017, amd64 hardware added + // support for 57 bit addresses; however, currently only Linux + // supports this extension and the kernel will never choose an + // address above 1<<47 unless mmap is called with a hint + // address above 1<<47 (which we never do). + // + // arm64 hardware (as of ARMv8) limits user addresses to 48 + // bits, in the range [0, 1<<48). + // + // ppc64, mips64, and s390x support arbitrary 64 bit addresses + // in hardware. On Linux, Go leans on stricter OS limits. Based + // on Linux's processor.h, the user address space is limited as + // follows on 64-bit architectures: + // + // Architecture Name Maximum Value (exclusive) + // --------------------------------------------------------------------- + // amd64 TASK_SIZE_MAX 0x007ffffffff000 (47 bit addresses) + // arm64 TASK_SIZE_64 0x01000000000000 (48 bit addresses) + // ppc64{,le} TASK_SIZE_USER64 0x00400000000000 (46 bit addresses) + // mips64{,le} TASK_SIZE64 0x00010000000000 (40 bit addresses) + // s390x TASK_SIZE 1<<64 (64 bit addresses) + // + // These limits may increase over time, but are currently at + // most 48 bits except on s390x. On all architectures, Linux + // starts placing mmap'd regions at addresses that are + // significantly below 48 bits, so even if it's possible to + // exceed Go's 48 bit limit, it's extremely unlikely in + // practice. + // + // On 32-bit platforms, we accept the full 32-bit address + // space because doing so is cheap. + // mips32 only has access to the low 2GB of virtual memory, so + // we further limit it to 31 bits. + // + // On ios/arm64, although 64-bit pointers are presumably + // available, pointers are truncated to 33 bits in iOS <14. + // Furthermore, only the top 4 GiB of the address space are + // actually available to the application. 
In iOS >=14, more + // of the address space is available, and the OS can now + // provide addresses outside of those 33 bits. Pick 40 bits + // as a reasonable balance between address space usage by the + // page allocator, and flexibility for what mmap'd regions + // we'll accept for the heap. We can't just move to the full + // 48 bits because this uses too much address space for older + // iOS versions. + // TODO(mknyszek): Once iOS <14 is deprecated, promote ios/arm64 + // to a 48-bit address space like every other arm64 platform. + // + // WebAssembly currently has a limit of 4GB linear memory. + heapAddrBits = (_64bit*(1-goarch.IsWasm)*(1-goos.IsIos*goarch.IsArm64))*48 + (1-_64bit+goarch.IsWasm)*(32-(goarch.IsMips+goarch.IsMipsle)) + 40*goos.IsIos*goarch.IsArm64 + + // maxAlloc is the maximum size of an allocation. On 64-bit, + // it's theoretically possible to allocate 1<<heapAddrBits bytes. On + // 32-bit, however, this is one less than 1<<32 because the + // number of bytes in the address space doesn't actually fit + // in a uintptr. + maxAlloc = (1 << heapAddrBits) - (1-_64bit)*1 + + // The number of bits in a heap address, the size of heap + // arenas, and the L1 and L2 arena map sizes are related by + // + // (1 << addr bits) = arena size * L1 entries * L2 entries + // + // Currently, we balance these as follows: + // + // Platform Addr bits Arena size L1 entries L2 entries + // -------------- --------- ---------- ---------- ----------- + // */64-bit 48 64MB 1 4M (32MB) + // windows/64-bit 48 4MB 64 1M (8MB) + // ios/arm64 33 4MB 1 2048 (8KB) + // */32-bit 32 4MB 1 1024 (4KB) + // */mips(le) 31 4MB 1 512 (2KB) + + // heapArenaBytes is the size of a heap arena. The heap + // consists of mappings of size heapArenaBytes, aligned to + // heapArenaBytes. The initial heap mapping is one arena. + // + // This is currently 64MB on 64-bit non-Windows and 4MB on + // 32-bit and on Windows. 
We use smaller arenas on Windows + // because all committed memory is charged to the process, + // even if it's not touched. Hence, for processes with small + // heaps, the mapped arena space needs to be commensurate. + // This is particularly important with the race detector, + // since it significantly amplifies the cost of committed + // memory. + heapArenaBytes = 1 << logHeapArenaBytes + + heapArenaWords = heapArenaBytes / goarch.PtrSize + + // logHeapArenaBytes is log_2 of heapArenaBytes. For clarity, + // prefer using heapArenaBytes where possible (we need the + // constant to compute some other constants). + logHeapArenaBytes = (6+20)*(_64bit*(1-goos.IsWindows)*(1-goarch.IsWasm)*(1-goos.IsIos*goarch.IsArm64)) + (2+20)*(_64bit*goos.IsWindows) + (2+20)*(1-_64bit) + (2+20)*goarch.IsWasm + (2+20)*goos.IsIos*goarch.IsArm64 + + // heapArenaBitmapWords is the size of each heap arena's bitmap in uintptrs. + heapArenaBitmapWords = heapArenaWords / (8 * goarch.PtrSize) + + pagesPerArena = heapArenaBytes / pageSize + + // arenaL1Bits is the number of bits of the arena number + // covered by the first level arena map. + // + // This number should be small, since the first level arena + // map requires PtrSize*(1<<arenaL1Bits) of space in the + // binary's BSS. It can be zero, in which case the first level + // index is effectively unused. There is a performance benefit + // to this, since the generated code can be more efficient, + // but comes at the cost of having a large L2 mapping. + // + // We use the L1 map on 64-bit Windows because the arena size + // is small, but the address space is still 48 bits, and + // there's a high cost to having a large L2. + arenaL1Bits = 6 * (_64bit * goos.IsWindows) + + // arenaL2Bits is the number of bits of the arena number + // covered by the second level arena index. + // + // The size of each arena map allocation is proportional to + // 1<<arenaL2Bits, so it's important that this not be too + // large. 
48 bits leads to 32MB arena index allocations, which + // is about the practical threshold. + arenaL2Bits = heapAddrBits - logHeapArenaBytes - arenaL1Bits + + // arenaL1Shift is the number of bits to shift an arena frame + // number by to compute an index into the first level arena map. + arenaL1Shift = arenaL2Bits + + // arenaBits is the total bits in a combined arena map index. + // This is split between the index into the L1 arena map and + // the L2 arena map. + arenaBits = arenaL1Bits + arenaL2Bits + + // arenaBaseOffset is the pointer value that corresponds to + // index 0 in the heap arena map. + // + // On amd64, the address space is 48 bits, sign extended to 64 + // bits. This offset lets us handle "negative" addresses (or + // high addresses if viewed as unsigned). + // + // On aix/ppc64, this offset lets heapAddrBits stay at 48. + // Otherwise, it would be 60 in order to handle mmap addresses + // (in range 0x0a00000000000000 - 0x0affffffffffffff). But in that + // case, the memory reserved in (s *pageAlloc).init for chunks + // causes significant slowdowns. + // + // On other platforms, the user address space is contiguous + // and starts at 0, so no offset is necessary. + arenaBaseOffset = 0xffff800000000000*goarch.IsAmd64 + 0x0a00000000000000*goos.IsAix + // A typed version of this constant that will make it into DWARF (for viewcore). + arenaBaseOffsetUintptr = uintptr(arenaBaseOffset) + + // Max number of threads to run garbage collection. + // 2, 3, and 4 are all plausible maximums depending + // on the hardware details of the machine. The garbage + // collector scales well to 32 cpus. + _MaxGcproc = 32 + + // minLegalPointer is the smallest possible legal pointer. + // This is the smallest possible architectural page size, + // since we assume that the first page is never mapped. + // + // This should agree with minZeroPage in the compiler.
+ minLegalPointer uintptr = 4096 +) + +// physPageSize is the size in bytes of the OS's physical pages. +// Mapping and unmapping operations must be done at multiples of +// physPageSize. +// +// This must be set by the OS init code (typically in osinit) before +// mallocinit. +var physPageSize uintptr + +// physHugePageSize is the size in bytes of the OS's default physical huge +// page size whose allocation is opaque to the application. It is assumed +// and verified to be a power of two. +// +// If set, this must be set by the OS init code (typically in osinit) before +// mallocinit. However, setting it at all is optional, and leaving the default +// value is always safe (though potentially less efficient). +// +// Since physHugePageSize is always assumed to be a power of two, +// physHugePageShift is defined as physHugePageSize == 1 << physHugePageShift. +// The purpose of physHugePageShift is to avoid doing divisions in +// performance critical functions. +var ( + physHugePageSize uintptr + physHugePageShift uint +) + +func mallocinit() { + if class_to_size[_TinySizeClass] != _TinySize { + throw("bad TinySizeClass") + } + + if heapArenaBitmapWords&(heapArenaBitmapWords-1) != 0 { + // heapBits expects modular arithmetic on bitmap + // addresses to work. + throw("heapArenaBitmapWords not a power of 2") + } + + // Check physPageSize. + if physPageSize == 0 { + // The OS init code failed to fetch the physical page size. 
+ throw("failed to get system page size") + } + if physPageSize > maxPhysPageSize { + print("system page size (", physPageSize, ") is larger than maximum page size (", maxPhysPageSize, ")\n") + throw("bad system page size") + } + if physPageSize < minPhysPageSize { + print("system page size (", physPageSize, ") is smaller than minimum page size (", minPhysPageSize, ")\n") + throw("bad system page size") + } + if physPageSize&(physPageSize-1) != 0 { + print("system page size (", physPageSize, ") must be a power of 2\n") + throw("bad system page size") + } + if physHugePageSize&(physHugePageSize-1) != 0 { + print("system huge page size (", physHugePageSize, ") must be a power of 2\n") + throw("bad system huge page size") + } + if physHugePageSize > maxPhysHugePageSize { + // physHugePageSize is greater than the maximum supported huge page size. + // Don't throw here, like in the other cases, since a system configured + // in this way isn't wrong, we just don't have the code to support them. + // Instead, silently set the huge page size to zero. + physHugePageSize = 0 + } + if physHugePageSize != 0 { + // Since physHugePageSize is a power of 2, it suffices to increase + // physHugePageShift until 1<<physHugePageShift == physHugePageSize. + for 1<<physHugePageShift != physHugePageSize { + physHugePageShift++ + } + } + if pagesPerArena%pagesPerSpanRoot != 0 { + print("pagesPerArena (", pagesPerArena, ") is not divisible by pagesPerSpanRoot (", pagesPerSpanRoot, ")\n") + throw("bad pagesPerSpanRoot") + } + if pagesPerArena%pagesPerReclaimerChunk != 0 { + print("pagesPerArena (", pagesPerArena, ") is not divisible by pagesPerReclaimerChunk (", pagesPerReclaimerChunk, ")\n") + throw("bad pagesPerReclaimerChunk") + } + + // Initialize the heap. 
+ mheap_.init() + mcache0 = allocmcache() + lockInit(&gcBitsArenas.lock, lockRankGcBitsArenas) + lockInit(&profInsertLock, lockRankProfInsert) + lockInit(&profBlockLock, lockRankProfBlock) + lockInit(&profMemActiveLock, lockRankProfMemActive) + for i := range profMemFutureLock { + lockInit(&profMemFutureLock[i], lockRankProfMemFuture) + } + lockInit(&globalAlloc.mutex, lockRankGlobalAlloc) + + // Create initial arena growth hints. + if goarch.PtrSize == 8 { + // On a 64-bit machine, we pick the following hints + // because: + // + // 1. Starting from the middle of the address space + // makes it easier to grow out a contiguous range + // without running into some other mapping. + // + // 2. This makes Go heap addresses more easily + // recognizable when debugging. + // + // 3. Stack scanning in gccgo is still conservative, + // so it's important that addresses be distinguishable + // from other data. + // + // Starting at 0x00c0 means that the valid memory addresses + // will begin 0x00c0, 0x00c1, ... + // In little-endian, that's c0 00, c1 00, ... None of those are valid + // UTF-8 sequences, and they are otherwise as far away from + // ff (likely a common byte) as possible. If that fails, we try other 0xXXc0 + // addresses. An earlier attempt to use 0x11f8 caused out of memory errors + // on OS X during thread allocations. 0x00c0 causes conflicts with + // AddressSanitizer which reserves all memory up to 0x0100. + // These choices reduce the odds of a conservative garbage collector + // not collecting memory because some non-pointer block of memory + // had a bit pattern that matched a memory address. + // + // However, on arm64, we ignore all this advice above and slam the + // allocation at 0x40 << 32 because when using 4k pages with 3-level + // translation buffers, the user address space is limited to 39 bits. + // On ios/arm64, the address space is even smaller. + // + // On AIX, mmaps start at 0x0A00000000000000 for 64-bit + // processes.
+ // + // Space mapped for user arenas comes immediately after the range + // originally reserved for the regular heap when race mode is not + // enabled because user arena chunks can never be used for regular heap + // allocations and we want to avoid fragmenting the address space. + // + // In race mode we have no choice but to just use the same hints because + // the race detector requires that the heap be mapped contiguously. + for i := 0x7f; i >= 0; i-- { + var p uintptr + switch { + case raceenabled: + // The TSAN runtime requires the heap + // to be in the range [0x00c000000000, + // 0x00e000000000). + p = uintptr(i)<<32 | uintptrMask&(0x00c0<<32) + if p >= uintptrMask&0x00e000000000 { + continue + } + case GOARCH == "arm64" && GOOS == "ios": + p = uintptr(i)<<40 | uintptrMask&(0x0013<<28) + case GOARCH == "arm64": + p = uintptr(i)<<40 | uintptrMask&(0x0040<<32) + case GOOS == "aix": + if i == 0 { + // We don't use addresses directly after 0x0A00000000000000 + // to avoid collisions with other mmaps done by non-Go programs. + continue + } + p = uintptr(i)<<40 | uintptrMask&(0xa0<<52) + default: + p = uintptr(i)<<40 | uintptrMask&(0x00c0<<32) + } + // Switch to generating hints for user arenas if we've gone + // through about half the hints. In race mode, take only about + // a quarter; we don't have very much space to work with. + hintList := &mheap_.arenaHints + if (!raceenabled && i > 0x3f) || (raceenabled && i > 0x5f) { + hintList = &mheap_.userArena.arenaHints + } + hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc()) + hint.addr = p + hint.next, *hintList = *hintList, hint + } + } else { + // On a 32-bit machine, we're much more concerned + // about keeping the usable heap contiguous. + // Hence: + // + // 1. We reserve space for all heapArenas up front so + // they don't get interleaved with the heap. They're + // ~258MB, so this isn't too bad. (We could reserve a + // smaller amount of space up front if this is a + // problem.) + // + // 2.
We hint the heap to start right above the end of + // the binary so we have the best chance of keeping it + // contiguous. + // + // 3. We try to stake out a reasonably large initial + // heap reservation. + + const arenaMetaSize = (1 << arenaBits) * unsafe.Sizeof(heapArena{}) + meta := uintptr(sysReserve(nil, arenaMetaSize)) + if meta != 0 { + mheap_.heapArenaAlloc.init(meta, arenaMetaSize, true) + } + + // We want to start the arena low, but if we're linked + // against C code, it's possible global constructors + // have called malloc and adjusted the process' brk. + // Query the brk so we can avoid trying to map the + // region over it (which will cause the kernel to put + // the region somewhere else, likely at a high + // address). + procBrk := sbrk0() + + // If we ask for the end of the data segment but the + // operating system requires a little more space + // before we can start allocating, it will give out a + // slightly higher pointer. Except QEMU, which is + // buggy, as usual: it won't adjust the pointer + // upward. So adjust it upward a little bit ourselves: + // 1/4 MB to get away from the running binary image. + p := firstmoduledata.end + if p < procBrk { + p = procBrk + } + if mheap_.heapArenaAlloc.next <= p && p < mheap_.heapArenaAlloc.end { + p = mheap_.heapArenaAlloc.end + } + p = alignUp(p+(256<<10), heapArenaBytes) + // Because we're worried about fragmentation on + // 32-bit, we try to make a large initial reservation. + arenaSizes := []uintptr{ + 512 << 20, + 256 << 20, + 128 << 20, + } + for _, arenaSize := range arenaSizes { + a, size := sysReserveAligned(unsafe.Pointer(p), arenaSize, heapArenaBytes) + if a != nil { + mheap_.arena.init(uintptr(a), size, false) + p = mheap_.arena.end // For hint below + break + } + } + hint := (*arenaHint)(mheap_.arenaHintAlloc.alloc()) + hint.addr = p + hint.next, mheap_.arenaHints = mheap_.arenaHints, hint + + // Place the hint for user arenas just after the large reservation. 
+ // + // While this potentially competes with the hint above, in practice we probably + // aren't going to be getting this far anyway on 32-bit platforms. + userArenaHint := (*arenaHint)(mheap_.arenaHintAlloc.alloc()) + userArenaHint.addr = p + userArenaHint.next, mheap_.userArena.arenaHints = mheap_.userArena.arenaHints, userArenaHint + } +} + +// sysAlloc allocates heap arena space for at least n bytes. The +// returned pointer is always heapArenaBytes-aligned and backed by +// h.arenas metadata. The returned size is always a multiple of +// heapArenaBytes. sysAlloc returns nil on failure. +// There is no corresponding free function. +// +// hintList is a list of hint addresses for where to allocate new +// heap arenas. It must be non-nil. +// +// register indicates whether the heap arena should be registered +// in allArenas. +// +// sysAlloc returns a memory region in the Reserved state. This region must +// be transitioned to Prepared and then Ready before use. +// +// h must be locked. +func (h *mheap) sysAlloc(n uintptr, hintList **arenaHint, register bool) (v unsafe.Pointer, size uintptr) { + assertLockHeld(&h.lock) + + n = alignUp(n, heapArenaBytes) + + if hintList == &h.arenaHints { + // First, try the arena pre-reservation. + // Newly-used mappings are considered released. + // + // Only do this if we're using the regular heap arena hints. + // This behavior is only for the heap. + v = h.arena.alloc(n, heapArenaBytes, &gcController.heapReleased) + if v != nil { + size = n + goto mapped + } + } + + // Try to grow the heap at a hint address. + for *hintList != nil { + hint := *hintList + p := hint.addr + if hint.down { + p -= n + } + if p+n < p { + // We can't use this, so don't ask. + v = nil + } else if arenaIndex(p+n-1) >= 1<<arenaBits { + // Outside addressable heap. Can't use. + v = nil + } else { + v = sysReserve(unsafe.Pointer(p), n) + } + if p == uintptr(v) { + // Success. Update the hint. 
+ if !hint.down { + p += n + } + hint.addr = p + size = n + break + } + // Failed. Discard this hint and try the next. + // + // TODO: This would be cleaner if sysReserve could be + // told to only return the requested address. In + // particular, this is already how Windows behaves, so + // it would simplify things there. + if v != nil { + sysFreeOS(v, n) + } + *hintList = hint.next + h.arenaHintAlloc.free(unsafe.Pointer(hint)) + } + + if size == 0 { + if raceenabled { + // The race detector assumes the heap lives in + // [0x00c000000000, 0x00e000000000), but we + // just ran out of hints in this region. Give + // a nice failure. + throw("too many address space collisions for -race mode") + } + + // All of the hints failed, so we'll take any + // (sufficiently aligned) address the kernel will give + // us. + v, size = sysReserveAligned(nil, n, heapArenaBytes) + if v == nil { + return nil, 0 + } + + // Create new hints for extending this region. + hint := (*arenaHint)(h.arenaHintAlloc.alloc()) + hint.addr, hint.down = uintptr(v), true + hint.next, mheap_.arenaHints = mheap_.arenaHints, hint + hint = (*arenaHint)(h.arenaHintAlloc.alloc()) + hint.addr = uintptr(v) + size + hint.next, mheap_.arenaHints = mheap_.arenaHints, hint + } + + // Check for bad pointers or pointers we can't use. + { + var bad string + p := uintptr(v) + if p+size < p { + bad = "region exceeds uintptr range" + } else if arenaIndex(p) >= 1<<arenaBits { + bad = "base outside usable address space" + } else if arenaIndex(p+size-1) >= 1<<arenaBits { + bad = "end outside usable address space" + } + if bad != "" { + // This should be impossible on most architectures, + // but it would be really confusing to debug. 
+ print("runtime: memory allocated by OS [", hex(p), ", ", hex(p+size), ") not in usable address space: ", bad, "\n") + throw("memory reservation exceeds address space limit") + } + } + + if uintptr(v)&(heapArenaBytes-1) != 0 { + throw("misrounded allocation in sysAlloc") + } + +mapped: + // Create arena metadata. + for ri := arenaIndex(uintptr(v)); ri <= arenaIndex(uintptr(v)+size-1); ri++ { + l2 := h.arenas[ri.l1()] + if l2 == nil { + // Allocate an L2 arena map. + // + // Use sysAllocOS instead of sysAlloc or persistentalloc because there's no + // statistic we can comfortably account for this space in. With this structure, + // we rely on demand paging to avoid large overheads, but tracking which memory + // is paged in is too expensive. Trying to account for the whole region means + // that it will appear like an enormous memory overhead in statistics, even though + // it is not. + l2 = (*[1 << arenaL2Bits]*heapArena)(sysAllocOS(unsafe.Sizeof(*l2))) + if l2 == nil { + throw("out of memory allocating heap arena map") + } + atomic.StorepNoWB(unsafe.Pointer(&h.arenas[ri.l1()]), unsafe.Pointer(l2)) + } + + if l2[ri.l2()] != nil { + throw("arena already initialized") + } + var r *heapArena + r = (*heapArena)(h.heapArenaAlloc.alloc(unsafe.Sizeof(*r), goarch.PtrSize, &memstats.gcMiscSys)) + if r == nil { + r = (*heapArena)(persistentalloc(unsafe.Sizeof(*r), goarch.PtrSize, &memstats.gcMiscSys)) + if r == nil { + throw("out of memory allocating heap arena metadata") + } + } + + // Register the arena in allArenas if requested. 
+ if register { + if len(h.allArenas) == cap(h.allArenas) { + size := 2 * uintptr(cap(h.allArenas)) * goarch.PtrSize + if size == 0 { + size = physPageSize + } + newArray := (*notInHeap)(persistentalloc(size, goarch.PtrSize, &memstats.gcMiscSys)) + if newArray == nil { + throw("out of memory allocating allArenas") + } + oldSlice := h.allArenas + *(*notInHeapSlice)(unsafe.Pointer(&h.allArenas)) = notInHeapSlice{newArray, len(h.allArenas), int(size / goarch.PtrSize)} + copy(h.allArenas, oldSlice) + // Do not free the old backing array because + // there may be concurrent readers. Since we + // double the array each time, this can lead + // to at most 2x waste. + } + h.allArenas = h.allArenas[:len(h.allArenas)+1] + h.allArenas[len(h.allArenas)-1] = ri + } + + // Store atomically just in case an object from the + // new heap arena becomes visible before the heap lock + // is released (which shouldn't happen, but there's + // little downside to this). + atomic.StorepNoWB(unsafe.Pointer(&l2[ri.l2()]), unsafe.Pointer(r)) + } + + // Tell the race detector about the new heap memory. + if raceenabled { + racemapshadow(v, size) + } + + return +} + +// sysReserveAligned is like sysReserve, but the returned pointer is +// aligned to align bytes. It may reserve either n or n+align bytes, +// so it returns the size that was reserved. +func sysReserveAligned(v unsafe.Pointer, size, align uintptr) (unsafe.Pointer, uintptr) { + // Since the alignment is rather large in uses of this + // function, we're not likely to get it by chance, so we ask + // for a larger region and remove the parts we don't need. + retries := 0 +retry: + p := uintptr(sysReserve(v, size+align)) + switch { + case p == 0: + return nil, 0 + case p&(align-1) == 0: + return unsafe.Pointer(p), size + align + case GOOS == "windows": + // On Windows we can't release pieces of a + // reservation, so we release the whole thing and + // re-reserve the aligned sub-region. This may race, + // so we may have to try again. 
+ sysFreeOS(unsafe.Pointer(p), size+align) + p = alignUp(p, align) + p2 := sysReserve(unsafe.Pointer(p), size) + if p != uintptr(p2) { + // Must have raced. Try again. + sysFreeOS(p2, size) + if retries++; retries == 100 { + throw("failed to allocate aligned heap memory; too many retries") + } + goto retry + } + // Success. + return p2, size + default: + // Trim off the unaligned parts. + pAligned := alignUp(p, align) + sysFreeOS(unsafe.Pointer(p), pAligned-p) + end := pAligned + size + endLen := (p + size + align) - end + if endLen > 0 { + sysFreeOS(unsafe.Pointer(end), endLen) + } + return unsafe.Pointer(pAligned), size + } +} + +// base address for all 0-byte allocations +var zerobase uintptr + +// nextFreeFast returns the next free object if one is quickly available. +// Otherwise it returns 0. +func nextFreeFast(s *mspan) gclinkptr { + theBit := sys.TrailingZeros64(s.allocCache) // Is there a free object in the allocCache? + if theBit < 64 { + result := s.freeindex + uintptr(theBit) + if result < s.nelems { + freeidx := result + 1 + if freeidx%64 == 0 && freeidx != s.nelems { + return 0 + } + s.allocCache >>= uint(theBit + 1) + s.freeindex = freeidx + s.allocCount++ + return gclinkptr(result*s.elemsize + s.base()) + } + } + return 0 +} + +// nextFree returns the next free object from the cached span if one is available. +// Otherwise it refills the cache with a span with an available object and +// returns that object along with a flag indicating that this was a heavy +// weight allocation. If it is a heavy weight allocation the caller must +// determine whether a new GC cycle needs to be started or if the GC is active +// whether this goroutine needs to assist the GC. +// +// Must run in a non-preemptible context since otherwise the owner of +// c could change. 
+func (c *mcache) nextFree(spc spanClass) (v gclinkptr, s *mspan, shouldhelpgc bool) { + s = c.alloc[spc] + shouldhelpgc = false + freeIndex := s.nextFreeIndex() + if freeIndex == s.nelems { + // The span is full. + if uintptr(s.allocCount) != s.nelems { + println("runtime: s.allocCount=", s.allocCount, "s.nelems=", s.nelems) + throw("s.allocCount != s.nelems && freeIndex == s.nelems") + } + c.refill(spc) + shouldhelpgc = true + s = c.alloc[spc] + + freeIndex = s.nextFreeIndex() + } + + if freeIndex >= s.nelems { + throw("freeIndex is not valid") + } + + v = gclinkptr(freeIndex*s.elemsize + s.base()) + s.allocCount++ + if uintptr(s.allocCount) > s.nelems { + println("s.allocCount=", s.allocCount, "s.nelems=", s.nelems) + throw("s.allocCount > s.nelems") + } + return +} + +// Allocate an object of size bytes. +// Small objects are allocated from the per-P cache's free lists. +// Large objects (> 32 kB) are allocated straight from the heap. +func mallocgc(size uintptr, typ *_type, needzero bool) unsafe.Pointer { + if gcphase == _GCmarktermination { + throw("mallocgc called with gcphase == _GCmarktermination") + } + + if size == 0 { + return unsafe.Pointer(&zerobase) + } + + // It's possible for any malloc to trigger sweeping, which may in + // turn queue finalizers. Record this dynamic lock edge. + lockRankMayQueueFinalizer() + + userSize := size + if asanenabled { + // Refer to ASAN runtime library, the malloc() function allocates extra memory, + // the redzone, around the user requested memory region. And the redzones are marked + // as unaddressable. We perform the same operations in Go to detect the overflows or + // underflows. + size += computeRZlog(size) + } + + if debug.malloc { + if debug.sbrk != 0 { + align := uintptr(16) + if typ != nil { + // TODO(austin): This should be just + // align = uintptr(typ.align) + // but that's only 4 on 32-bit platforms, + // even if there's a uint64 field in typ (see #599). + // This causes 64-bit atomic accesses to panic. 
+ // Hence, we use stricter alignment that matches + // the normal allocator better. + if size&7 == 0 { + align = 8 + } else if size&3 == 0 { + align = 4 + } else if size&1 == 0 { + align = 2 + } else { + align = 1 + } + } + return persistentalloc(size, align, &memstats.other_sys) + } + + if inittrace.active && inittrace.id == getg().goid { + // Init functions are executed sequentially in a single goroutine. + inittrace.allocs += 1 + } + } + + // assistG is the G to charge for this allocation, or nil if + // GC is not currently active. + assistG := deductAssistCredit(size) + + // Set mp.mallocing to keep from being preempted by GC. + mp := acquirem() + if mp.mallocing != 0 { + throw("malloc deadlock") + } + if mp.gsignal == getg() { + throw("malloc during signal") + } + mp.mallocing = 1 + + shouldhelpgc := false + dataSize := userSize + c := getMCache(mp) + if c == nil { + throw("mallocgc called without a P or outside bootstrapping") + } + var span *mspan + var x unsafe.Pointer + noscan := typ == nil || typ.ptrdata == 0 + // In some cases block zeroing can profitably (for latency reduction purposes) + // be delayed till preemption is possible; delayedZeroing tracks that state. + delayedZeroing := false + if size <= maxSmallSize { + if noscan && size < maxTinySize { + // Tiny allocator. + // + // Tiny allocator combines several tiny allocation requests + // into a single memory block. The resulting memory block + // is freed when all subobjects are unreachable. The subobjects + // must be noscan (don't have pointers), this ensures that + // the amount of potentially wasted memory is bounded. + // + // Size of the memory block used for combining (maxTinySize) is tunable. + // Current setting is 16 bytes, which relates to 2x worst case memory + // wastage (when all but one subobjects are unreachable). + // 8 bytes would result in no wastage at all, but provides less + // opportunities for combining. 
+ // 32 bytes provides more opportunities for combining, + // but can lead to 4x worst case wastage. + // The best case winning is 8x regardless of block size. + // + // Objects obtained from tiny allocator must not be freed explicitly. + // So when an object will be freed explicitly, we ensure that + // its size >= maxTinySize. + // + // SetFinalizer has a special case for objects potentially coming + // from tiny allocator, in such case it allows to set finalizers + // for an inner byte of a memory block. + // + // The main targets of tiny allocator are small strings and + // standalone escaping variables. On a json benchmark + // the allocator reduces number of allocations by ~12% and + // reduces heap size by ~20%. + off := c.tinyoffset + // Align tiny pointer for required (conservative) alignment. + if size&7 == 0 { + off = alignUp(off, 8) + } else if goarch.PtrSize == 4 && size == 12 { + // Conservatively align 12-byte objects to 8 bytes on 32-bit + // systems so that objects whose first field is a 64-bit + // value are aligned to 8 bytes and do not cause a fault on + // atomic access. See issue 37262. + // TODO(mknyszek): Remove this workaround if/when issue 36606 + // is resolved. + off = alignUp(off, 8) + } else if size&3 == 0 { + off = alignUp(off, 4) + } else if size&1 == 0 { + off = alignUp(off, 2) + } + if off+size <= maxTinySize && c.tiny != 0 { + // The object fits into existing tiny block. + x = unsafe.Pointer(c.tiny + off) + c.tinyoffset = off + size + c.tinyAllocs++ + mp.mallocing = 0 + releasem(mp) + return x + } + // Allocate a new maxTinySize block. + span = c.alloc[tinySpanClass] + v := nextFreeFast(span) + if v == 0 { + v, span, shouldhelpgc = c.nextFree(tinySpanClass) + } + x = unsafe.Pointer(v) + (*[2]uint64)(x)[0] = 0 + (*[2]uint64)(x)[1] = 0 + // See if we need to replace the existing tiny block with the new one + // based on amount of remaining free space.
+ if !raceenabled && (size < c.tinyoffset || c.tiny == 0) { + // Note: disabled when race detector is on, see comment near end of this function. + c.tiny = uintptr(x) + c.tinyoffset = size + } + size = maxTinySize + } else { + var sizeclass uint8 + if size <= smallSizeMax-8 { + sizeclass = size_to_class8[divRoundUp(size, smallSizeDiv)] + } else { + sizeclass = size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)] + } + size = uintptr(class_to_size[sizeclass]) + spc := makeSpanClass(sizeclass, noscan) + span = c.alloc[spc] + v := nextFreeFast(span) + if v == 0 { + v, span, shouldhelpgc = c.nextFree(spc) + } + x = unsafe.Pointer(v) + if needzero && span.needzero != 0 { + memclrNoHeapPointers(x, size) + } + } + } else { + shouldhelpgc = true + // For large allocations, keep track of zeroed state so that + // bulk zeroing can happen later in a preemptible context. + span = c.allocLarge(size, noscan) + span.freeindex = 1 + span.allocCount = 1 + size = span.elemsize + x = unsafe.Pointer(span.base()) + if needzero && span.needzero != 0 { + if noscan { + delayedZeroing = true + } else { + memclrNoHeapPointers(x, size) + // We've in theory cleared almost the whole span here, + // and could take the extra step of actually clearing + // the whole thing. However, don't. Any GC bits for the + // uncleared parts will be zero, and it's just going to + // be needzero = 1 once freed anyway. + } + } + } + + if !noscan { + var scanSize uintptr + heapBitsSetType(uintptr(x), size, dataSize, typ) + if dataSize > typ.size { + // Array allocation. If there are any + // pointers, GC has to scan to the last + // element. + if typ.ptrdata != 0 { + scanSize = dataSize - typ.size + typ.ptrdata + } + } else { + scanSize = typ.ptrdata + } + c.scanAlloc += scanSize + } + + // Ensure that the stores above that initialize x to + // type-safe memory and set the heap bits occur before + // the caller can make x observable to the garbage + // collector.
Otherwise, on weakly ordered machines, + // the garbage collector could follow a pointer to x, + // but see uninitialized memory or stale heap bits. + publicationBarrier() + // As x and the heap bits are initialized, update + // freeIndexForScan now so x is seen by the GC + // (including conservative scan) as an allocated object. + // While this pointer can't escape into user code as a + // _live_ pointer until we return, conservative scanning + // may find a dead pointer that happens to point into this + // object. Delaying this update until now ensures that + // conservative scanning considers this pointer dead until + // this point. + span.freeIndexForScan = span.freeindex + + // Allocate black during GC. + // All slots hold nil so no scanning is needed. + // This may be racing with GC so do it atomically if there can be + // a race marking the bit. + if gcphase != _GCoff { + gcmarknewobject(span, uintptr(x), size) + } + + if raceenabled { + racemalloc(x, size) + } + + if msanenabled { + msanmalloc(x, size) + } + + if asanenabled { + // We should only read/write the memory with the size asked by the user. + // The rest of the allocated memory should be poisoned, so that we can report + // errors when accessing poisoned memory. + // The allocated memory is larger than required userSize, it will also include + // redzone and some other padding bytes. + rzBeg := unsafe.Add(x, userSize) + asanpoison(rzBeg, size-userSize) + asanunpoison(x, userSize) + } + + if rate := MemProfileRate; rate > 0 { + // Note cache c only valid while m acquired; see #47302 + if rate != 1 && size < c.nextSample { + c.nextSample -= size + } else { + profilealloc(mp, x, size) + } + } + mp.mallocing = 0 + releasem(mp) + + // Pointerfree data can be zeroed late in a context where preemption can occur. + // x will keep the memory alive.
+ if delayedZeroing { + if !noscan { + throw("delayed zeroing on data that may contain pointers") + } + memclrNoHeapPointersChunked(size, x) // This is a possible preemption point: see #47302 + } + + if debug.malloc { + if debug.allocfreetrace != 0 { + tracealloc(x, size, typ) + } + + if inittrace.active && inittrace.id == getg().goid { + // Init functions are executed sequentially in a single goroutine. + inittrace.bytes += uint64(size) + } + } + + if assistG != nil { + // Account for internal fragmentation in the assist + // debt now that we know it. + assistG.gcAssistBytes -= int64(size - dataSize) + } + + if shouldhelpgc { + if t := (gcTrigger{kind: gcTriggerHeap}); t.test() { + gcStart(t) + } + } + + if raceenabled && noscan && dataSize < maxTinySize { + // Pad tinysize allocations so they are aligned with the end + // of the tinyalloc region. This ensures that any arithmetic + // that goes off the top end of the object will be detectable + // by checkptr (issue 38872). + // Note that we disable tinyalloc when raceenabled for this to work. + // TODO: This padding is only performed when the race detector + // is enabled. It would be nice to enable it if any package + // was compiled with checkptr, but there's no easy way to + // detect that (especially at compile time). + // TODO: enable this padding for all allocations, not just + // tinyalloc ones. It's tricky because of pointer maps. + // Maybe just all noscan objects? + x = add(x, size-dataSize) + } + + return x +} + +// deductAssistCredit reduces the current G's assist credit +// by size bytes, and assists the GC if necessary. +// +// Caller must be preemptible. +// +// Returns the G for which the assist credit was accounted. +func deductAssistCredit(size uintptr) *g { + var assistG *g + if gcBlackenEnabled != 0 { + // Charge the current user G for this allocation. + assistG = getg() + if assistG.m.curg != nil { + assistG = assistG.m.curg + } + // Charge the allocation against the G. 
We'll account + // for internal fragmentation at the end of mallocgc. + assistG.gcAssistBytes -= int64(size) + + if assistG.gcAssistBytes < 0 { + // This G is in debt. Assist the GC to correct + // this before allocating. This must happen + // before disabling preemption. + gcAssistAlloc(assistG) + } + } + return assistG +} + +// memclrNoHeapPointersChunked repeatedly calls memclrNoHeapPointers +// on chunks of the buffer to be zeroed, with opportunities for preemption +// along the way. memclrNoHeapPointers contains no safepoints and also +// cannot be preemptively scheduled, so this provides a still-efficient +// block copy that can also be preempted on a reasonable granularity. +// +// Use this with care; if the data being cleared is tagged to contain +// pointers, this allows the GC to run before it is all cleared. +func memclrNoHeapPointersChunked(size uintptr, x unsafe.Pointer) { + v := uintptr(x) + // got this from benchmarking. 128k is too small, 512k is too large. + const chunkBytes = 256 * 1024 + vsize := v + size + for voff := v; voff < vsize; voff = voff + chunkBytes { + if getg().preempt { + // may hold locks, e.g., profiling + goschedguarded() + } + // clear min(avail, lump) bytes + n := vsize - voff + if n > chunkBytes { + n = chunkBytes + } + memclrNoHeapPointers(unsafe.Pointer(voff), n) + } +} + +// implementation of new builtin +// compiler (both frontend and SSA backend) knows the signature +// of this function. +func newobject(typ *_type) unsafe.Pointer { + return mallocgc(typ.size, typ, true) +} + +//go:linkname reflect_unsafe_New reflect.unsafe_New +func reflect_unsafe_New(typ *_type) unsafe.Pointer { + return mallocgc(typ.size, typ, true) +} + +//go:linkname reflectlite_unsafe_New internal/reflectlite.unsafe_New +func reflectlite_unsafe_New(typ *_type) unsafe.Pointer { + return mallocgc(typ.size, typ, true) +} + +// newarray allocates an array of n elements of type typ. 
+func newarray(typ *_type, n int) unsafe.Pointer { + if n == 1 { + return mallocgc(typ.size, typ, true) + } + mem, overflow := math.MulUintptr(typ.size, uintptr(n)) + if overflow || mem > maxAlloc || n < 0 { + panic(plainError("runtime: allocation size out of range")) + } + return mallocgc(mem, typ, true) +} + +//go:linkname reflect_unsafe_NewArray reflect.unsafe_NewArray +func reflect_unsafe_NewArray(typ *_type, n int) unsafe.Pointer { + return newarray(typ, n) +} + +func profilealloc(mp *m, x unsafe.Pointer, size uintptr) { + c := getMCache(mp) + if c == nil { + throw("profilealloc called without a P or outside bootstrapping") + } + c.nextSample = nextSample() + mProf_Malloc(x, size) +} + +// nextSample returns the next sampling point for heap profiling. The goal is +// to sample allocations on average every MemProfileRate bytes, but with a +// completely random distribution over the allocation timeline; this +// corresponds to a Poisson process with parameter MemProfileRate. In Poisson +// processes, the distance between two samples follows the exponential +// distribution (exp(MemProfileRate)), so the best return value is a random +// number taken from an exponential distribution whose mean is MemProfileRate. +func nextSample() uintptr { + if MemProfileRate == 1 { + // Callers assign our return value to + // mcache.nextSample, but nextSample is not used + // when the rate is 1. So avoid the math below and + // just return something. + return 0 + } + if GOOS == "plan9" { + // Plan 9 doesn't support floating point in note handler. + if gp := getg(); gp == gp.m.gsignal { + return nextSampleNoFP() + } + } + + return uintptr(fastexprand(MemProfileRate)) +} + +// fastexprand returns a random number from an exponential distribution with +// the specified mean. +func fastexprand(mean int) int32 { + // Avoid overflow. Maximum possible step is + // -ln(1/(1<<randomBitCount)) * mean, approximately 20 * mean.
+ switch { + case mean > 0x7000000: + mean = 0x7000000 + case mean == 0: + return 0 + } + + // Take a random sample of the exponential distribution exp(-mean*x). + // The probability distribution function is mean*exp(-mean*x), so the CDF is + // p = 1 - exp(-mean*x), so + // q = 1 - p == exp(-mean*x) + // log_e(q) = -mean*x + // -log_e(q)/mean = x + // x = -log_e(q) * mean + // x = log_2(q) * (-log_e(2)) * mean ; Using log_2 for efficiency + const randomBitCount = 26 + q := fastrandn(1<<randomBitCount) + 1 + qlog := fastlog2(float64(q)) - randomBitCount + if qlog > 0 { + qlog = 0 + } + const minusLog2 = -0.6931471805599453 // -ln(2) + return int32(qlog*(minusLog2*float64(mean))) + 1 +} + +// nextSampleNoFP is similar to nextSample, but uses older, +// simpler code to avoid floating point. +func nextSampleNoFP() uintptr { + // Set first allocation sample size. + rate := MemProfileRate + if rate > 0x3fffffff { // make 2*rate not overflow + rate = 0x3fffffff + } + if rate != 0 { + return uintptr(fastrandn(uint32(2 * rate))) + } + return 0 +} + +type persistentAlloc struct { + base *notInHeap + off uintptr +} + +var globalAlloc struct { + mutex + persistentAlloc +} + +// persistentChunkSize is the number of bytes we allocate when we grow +// a persistentAlloc. +const persistentChunkSize = 256 << 10 + +// persistentChunks is a list of all the persistent chunks we have +// allocated. The list is maintained through the first word in the +// persistent chunk. This is updated atomically. +var persistentChunks *notInHeap + +// Wrapper around sysAlloc that can allocate small chunks. +// There is no associated free operation. +// Intended for things like function/type/debug-related persistent data. +// If align is 0, uses default align (currently 8). +// The returned memory will be zeroed. +// sysStat must be non-nil. +// +// Consider marking persistentalloc'd types not in heap by embedding +// runtime/internal/sys.NotInHeap. 
+func persistentalloc(size, align uintptr, sysStat *sysMemStat) unsafe.Pointer { + var p *notInHeap + systemstack(func() { + p = persistentalloc1(size, align, sysStat) + }) + return unsafe.Pointer(p) +} + +// Must run on system stack because stack growth can (re)invoke it. +// See issue 9174. +// +//go:systemstack +func persistentalloc1(size, align uintptr, sysStat *sysMemStat) *notInHeap { + const ( + maxBlock = 64 << 10 // VM reservation granularity is 64K on windows + ) + + if size == 0 { + throw("persistentalloc: size == 0") + } + if align != 0 { + if align&(align-1) != 0 { + throw("persistentalloc: align is not a power of 2") + } + if align > _PageSize { + throw("persistentalloc: align is too large") + } + } else { + align = 8 + } + + if size >= maxBlock { + return (*notInHeap)(sysAlloc(size, sysStat)) + } + + mp := acquirem() + var persistent *persistentAlloc + if mp != nil && mp.p != 0 { + persistent = &mp.p.ptr().palloc + } else { + lock(&globalAlloc.mutex) + persistent = &globalAlloc.persistentAlloc + } + persistent.off = alignUp(persistent.off, align) + if persistent.off+size > persistentChunkSize || persistent.base == nil { + persistent.base = (*notInHeap)(sysAlloc(persistentChunkSize, &memstats.other_sys)) + if persistent.base == nil { + if persistent == &globalAlloc.persistentAlloc { + unlock(&globalAlloc.mutex) + } + throw("runtime: cannot allocate memory") + } + + // Add the new chunk to the persistentChunks list. 
+ for { + chunks := uintptr(unsafe.Pointer(persistentChunks)) + *(*uintptr)(unsafe.Pointer(persistent.base)) = chunks + if atomic.Casuintptr((*uintptr)(unsafe.Pointer(&persistentChunks)), chunks, uintptr(unsafe.Pointer(persistent.base))) { + break + } + } + persistent.off = alignUp(goarch.PtrSize, align) + } + p := persistent.base.add(persistent.off) + persistent.off += size + releasem(mp) + if persistent == &globalAlloc.persistentAlloc { + unlock(&globalAlloc.mutex) + } + + if sysStat != &memstats.other_sys { + sysStat.add(int64(size)) + memstats.other_sys.add(-int64(size)) + } + return p +} + +// inPersistentAlloc reports whether p points to memory allocated by +// persistentalloc. This must be nosplit because it is called by the +// cgo checker code, which is called by the write barrier code. +// +//go:nosplit +func inPersistentAlloc(p uintptr) bool { + chunk := atomic.Loaduintptr((*uintptr)(unsafe.Pointer(&persistentChunks))) + for chunk != 0 { + if p >= chunk && p < chunk+persistentChunkSize { + return true + } + chunk = *(*uintptr)(unsafe.Pointer(chunk)) + } + return false +} + +// linearAlloc is a simple linear allocator that pre-reserves a region +// of memory and then optionally maps that region into the Ready state +// as needed. +// +// The caller is responsible for locking. +type linearAlloc struct { + next uintptr // next free byte + mapped uintptr // one byte past end of mapped space + end uintptr // end of reserved space + + mapMemory bool // transition memory from Reserved to Ready if true +} + +func (l *linearAlloc) init(base, size uintptr, mapMemory bool) { + if base+size < base { + // Chop off the last byte. The runtime isn't prepared + // to deal with situations where the bounds could overflow. + // Leave that memory reserved, though, so we don't map it + // later. 
+ size -= 1 + } + l.next, l.mapped = base, base + l.end = base + size + l.mapMemory = mapMemory +} + +func (l *linearAlloc) alloc(size, align uintptr, sysStat *sysMemStat) unsafe.Pointer { + p := alignUp(l.next, align) + if p+size > l.end { + return nil + } + l.next = p + size + if pEnd := alignUp(l.next-1, physPageSize); pEnd > l.mapped { + if l.mapMemory { + // Transition from Reserved to Prepared to Ready. + n := pEnd - l.mapped + sysMap(unsafe.Pointer(l.mapped), n, sysStat) + sysUsed(unsafe.Pointer(l.mapped), n, n) + } + l.mapped = pEnd + } + return unsafe.Pointer(p) +} + +// notInHeap is off-heap memory allocated by a lower-level allocator +// like sysAlloc or persistentAlloc. +// +// In general, it's better to use real types which embed +// runtime/internal/sys.NotInHeap, but this serves as a generic type +// for situations where that isn't possible (like in the allocators). +// +// TODO: Use this as the return type of sysAlloc, persistentAlloc, etc? +type notInHeap struct{ _ sys.NotInHeap } + +func (p *notInHeap) add(bytes uintptr) *notInHeap { + return (*notInHeap)(unsafe.Pointer(uintptr(unsafe.Pointer(p)) + bytes)) +} + +// computeRZlog computes the size of the redzone. +// Refer to the implementation of the compiler-rt. +func computeRZlog(userSize uintptr) uintptr { + switch { + case userSize <= (64 - 16): + return 16 << 0 + case userSize <= (128 - 32): + return 16 << 1 + case userSize <= (512 - 64): + return 16 << 2 + case userSize <= (4096 - 128): + return 16 << 3 + case userSize <= (1<<14)-256: + return 16 << 4 + case userSize <= (1<<15)-512: + return 16 << 5 + case userSize <= (1<<16)-1024: + return 16 << 6 + default: + return 16 << 7 + } +} diff --git a/src/runtime/malloc_test.go b/src/runtime/malloc_test.go new file mode 100644 index 0000000..5b9ce98 --- /dev/null +++ b/src/runtime/malloc_test.go @@ -0,0 +1,449 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "flag" + "fmt" + "internal/race" + "internal/testenv" + "os" + "os/exec" + "reflect" + "runtime" + . "runtime" + "strings" + "sync/atomic" + "testing" + "time" + "unsafe" +) + +var testMemStatsCount int + +func TestMemStats(t *testing.T) { + testMemStatsCount++ + + // Make sure there's at least one forced GC. + GC() + + // Test that MemStats has sane values. + st := new(MemStats) + ReadMemStats(st) + + nz := func(x any) error { + if x != reflect.Zero(reflect.TypeOf(x)).Interface() { + return nil + } + return fmt.Errorf("zero value") + } + le := func(thresh float64) func(any) error { + return func(x any) error { + // These sanity tests aren't necessarily valid + // with high -test.count values, so only run + // them once. + if testMemStatsCount > 1 { + return nil + } + + if reflect.ValueOf(x).Convert(reflect.TypeOf(thresh)).Float() < thresh { + return nil + } + return fmt.Errorf("insanely high value (overflow?); want <= %v", thresh) + } + } + eq := func(x any) func(any) error { + return func(y any) error { + if x == y { + return nil + } + return fmt.Errorf("want %v", x) + } + } + // Of the uint fields, HeapReleased, HeapIdle can be 0. + // PauseTotalNs can be 0 if timer resolution is poor. 
+ fields := map[string][]func(any) error{ + "Alloc": {nz, le(1e10)}, "TotalAlloc": {nz, le(1e11)}, "Sys": {nz, le(1e10)}, + "Lookups": {eq(uint64(0))}, "Mallocs": {nz, le(1e10)}, "Frees": {nz, le(1e10)}, + "HeapAlloc": {nz, le(1e10)}, "HeapSys": {nz, le(1e10)}, "HeapIdle": {le(1e10)}, + "HeapInuse": {nz, le(1e10)}, "HeapReleased": {le(1e10)}, "HeapObjects": {nz, le(1e10)}, + "StackInuse": {nz, le(1e10)}, "StackSys": {nz, le(1e10)}, + "MSpanInuse": {nz, le(1e10)}, "MSpanSys": {nz, le(1e10)}, + "MCacheInuse": {nz, le(1e10)}, "MCacheSys": {nz, le(1e10)}, + "BuckHashSys": {nz, le(1e10)}, "GCSys": {nz, le(1e10)}, "OtherSys": {nz, le(1e10)}, + "NextGC": {nz, le(1e10)}, "LastGC": {nz}, + "PauseTotalNs": {le(1e11)}, "PauseNs": nil, "PauseEnd": nil, + "NumGC": {nz, le(1e9)}, "NumForcedGC": {nz, le(1e9)}, + "GCCPUFraction": {le(0.99)}, "EnableGC": {eq(true)}, "DebugGC": {eq(false)}, + "BySize": nil, + } + + rst := reflect.ValueOf(st).Elem() + for i := 0; i < rst.Type().NumField(); i++ { + name, val := rst.Type().Field(i).Name, rst.Field(i).Interface() + checks, ok := fields[name] + if !ok { + t.Errorf("unknown MemStats field %s", name) + continue + } + for _, check := range checks { + if err := check(val); err != nil { + t.Errorf("%s = %v: %s", name, val, err) + } + } + } + + if st.Sys != st.HeapSys+st.StackSys+st.MSpanSys+st.MCacheSys+ + st.BuckHashSys+st.GCSys+st.OtherSys { + t.Fatalf("Bad sys value: %+v", *st) + } + + if st.HeapIdle+st.HeapInuse != st.HeapSys { + t.Fatalf("HeapIdle(%d) + HeapInuse(%d) should be equal to HeapSys(%d), but isn't.", st.HeapIdle, st.HeapInuse, st.HeapSys) + } + + if lpe := st.PauseEnd[int(st.NumGC+255)%len(st.PauseEnd)]; st.LastGC != lpe { + t.Fatalf("LastGC(%d) != last PauseEnd(%d)", st.LastGC, lpe) + } + + var pauseTotal uint64 + for _, pause := range st.PauseNs { + pauseTotal += pause + } + if int(st.NumGC) < len(st.PauseNs) { + // We have all pauses, so this should be exact. 
+ if st.PauseTotalNs != pauseTotal { + t.Fatalf("PauseTotalNs(%d) != sum PauseNs(%d)", st.PauseTotalNs, pauseTotal) + } + for i := int(st.NumGC); i < len(st.PauseNs); i++ { + if st.PauseNs[i] != 0 { + t.Fatalf("Non-zero PauseNs[%d]: %+v", i, st) + } + if st.PauseEnd[i] != 0 { + t.Fatalf("Non-zero PauseEnd[%d]: %+v", i, st) + } + } + } else { + if st.PauseTotalNs < pauseTotal { + t.Fatalf("PauseTotalNs(%d) < sum PauseNs(%d)", st.PauseTotalNs, pauseTotal) + } + } + + if st.NumForcedGC > st.NumGC { + t.Fatalf("NumForcedGC(%d) > NumGC(%d)", st.NumForcedGC, st.NumGC) + } +} + +func TestStringConcatenationAllocs(t *testing.T) { + n := testing.AllocsPerRun(1e3, func() { + b := make([]byte, 10) + for i := 0; i < 10; i++ { + b[i] = byte(i) + '0' + } + s := "foo" + string(b) + if want := "foo0123456789"; s != want { + t.Fatalf("want %v, got %v", want, s) + } + }) + // Only string concatenation allocates. + if n != 1 { + t.Fatalf("want 1 allocation, got %v", n) + } +} + +func TestTinyAlloc(t *testing.T) { + if runtime.Raceenabled { + t.Skip("tinyalloc suppressed when running in race mode") + } + const N = 16 + var v [N]unsafe.Pointer + for i := range v { + v[i] = unsafe.Pointer(new(byte)) + } + + chunks := make(map[uintptr]bool, N) + for _, p := range v { + chunks[uintptr(p)&^7] = true + } + + if len(chunks) == N { + t.Fatal("no bytes allocated within the same 8-byte chunk") + } +} + +type obj12 struct { + a uint64 + b uint32 +} + +func TestTinyAllocIssue37262(t *testing.T) { + if runtime.Raceenabled { + t.Skip("tinyalloc suppressed when running in race mode") + } + // Try to cause an alignment access fault + // by atomically accessing the first 64-bit + // value of a tiny-allocated object. + // See issue 37262 for details. + + // GC twice, once to reach a stable heap state + // and again to make sure we finish the sweep phase. + runtime.GC() + runtime.GC() + + // Disable preemption so we stay on one P's tiny allocator and + // nothing else allocates from it. 
+ runtime.Acquirem() + + // Make 1-byte allocations until we get a fresh tiny slot. + aligned := false + for i := 0; i < 16; i++ { + x := runtime.Escape(new(byte)) + if uintptr(unsafe.Pointer(x))&0xf == 0xf { + aligned = true + break + } + } + if !aligned { + runtime.Releasem() + t.Fatal("unable to get a fresh tiny slot") + } + + // Create a 4-byte object so that the current + // tiny slot is partially filled. + runtime.Escape(new(uint32)) + + // Create a 12-byte object, which fits into the + // tiny slot. If it actually gets placed there, + // then the field "a" will be improperly aligned + // for atomic access on 32-bit architectures. + // This won't be true if issue 36606 gets resolved. + tinyObj12 := runtime.Escape(new(obj12)) + + // Try to atomically access "tinyObj12.a". + atomic.StoreUint64(&tinyObj12.a, 10) + + runtime.Releasem() +} + +func TestPageCacheLeak(t *testing.T) { + defer GOMAXPROCS(GOMAXPROCS(1)) + leaked := PageCachePagesLeaked() + if leaked != 0 { + t.Fatalf("found %d leaked pages in page caches", leaked) + } +} + +func TestPhysicalMemoryUtilization(t *testing.T) { + got := runTestProg(t, "testprog", "GCPhys") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got %q", want, got) + } +} + +func TestScavengedBitsCleared(t *testing.T) { + var mismatches [128]BitsMismatch + if n, ok := CheckScavengedBitsCleared(mismatches[:]); !ok { + t.Errorf("uncleared scavenged bits") + for _, m := range mismatches[:n] { + t.Logf("\t@ address 0x%x", m.Base) + t.Logf("\t| got: %064b", m.Got) + t.Logf("\t| want: %064b", m.Want) + } + t.FailNow() + } +} + +type acLink struct { + x [1 << 20]byte +} + +var arenaCollisionSink []*acLink + +func TestArenaCollision(t *testing.T) { + testenv.MustHaveExec(t) + + // Test that mheap.sysAlloc handles collisions with other + // memory mappings.
+ if os.Getenv("TEST_ARENA_COLLISION") != "1" { + cmd := testenv.CleanCmdEnv(exec.Command(os.Args[0], "-test.run=TestArenaCollision", "-test.v")) + cmd.Env = append(cmd.Env, "TEST_ARENA_COLLISION=1") + out, err := cmd.CombinedOutput() + if race.Enabled { + // This test runs the runtime out of hint + // addresses, so it will start mapping the + // heap wherever it can. The race detector + // doesn't support this, so look for the + // expected failure. + if want := "too many address space collisions"; !strings.Contains(string(out), want) { + t.Fatalf("want %q, got:\n%s", want, string(out)) + } + } else if !strings.Contains(string(out), "PASS\n") || err != nil { + t.Fatalf("%s\n(exit status %v)", string(out), err) + } + return + } + disallowed := [][2]uintptr{} + // Drop all but the next 3 hints. 64-bit has a lot of hints, + // so it would take a lot of memory to go through all of them. + KeepNArenaHints(3) + // Consume these 3 hints and force the runtime to find some + // fallback hints. + for i := 0; i < 5; i++ { + // Reserve memory at the next hint so it can't be used + // for the heap. + start, end, ok := MapNextArenaHint() + if !ok { + t.Skipf("failed to reserve memory at next arena hint [%#x, %#x)", start, end) + } + t.Logf("reserved [%#x, %#x)", start, end) + disallowed = append(disallowed, [2]uintptr{start, end}) + // Allocate until the runtime tries to use the hint we + // just mapped over. + hint := GetNextArenaHint() + for GetNextArenaHint() == hint { + ac := new(acLink) + arenaCollisionSink = append(arenaCollisionSink, ac) + // The allocation must not have fallen into + // one of the reserved regions. 
+ p := uintptr(unsafe.Pointer(ac)) + for _, d := range disallowed { + if d[0] <= p && p < d[1] { + t.Fatalf("allocation %#x in reserved region [%#x, %#x)", p, d[0], d[1]) + } + } + } + } +} + +func BenchmarkMalloc8(b *testing.B) { + for i := 0; i < b.N; i++ { + p := new(int64) + Escape(p) + } +} + +func BenchmarkMalloc16(b *testing.B) { + for i := 0; i < b.N; i++ { + p := new([2]int64) + Escape(p) + } +} + +func BenchmarkMallocTypeInfo8(b *testing.B) { + for i := 0; i < b.N; i++ { + p := new(struct { + p [8 / unsafe.Sizeof(uintptr(0))]*int + }) + Escape(p) + } +} + +func BenchmarkMallocTypeInfo16(b *testing.B) { + for i := 0; i < b.N; i++ { + p := new(struct { + p [16 / unsafe.Sizeof(uintptr(0))]*int + }) + Escape(p) + } +} + +type LargeStruct struct { + x [16][]byte +} + +func BenchmarkMallocLargeStruct(b *testing.B) { + for i := 0; i < b.N; i++ { + p := make([]LargeStruct, 2) + Escape(p) + } +} + +var n = flag.Int("n", 1000, "number of goroutines") + +func BenchmarkGoroutineSelect(b *testing.B) { + quit := make(chan struct{}) + read := func(ch chan struct{}) { + for { + select { + case _, ok := <-ch: + if !ok { + return + } + case <-quit: + return + } + } + } + benchHelper(b, *n, read) +} + +func BenchmarkGoroutineBlocking(b *testing.B) { + read := func(ch chan struct{}) { + for { + if _, ok := <-ch; !ok { + return + } + } + } + benchHelper(b, *n, read) +} + +func BenchmarkGoroutineForRange(b *testing.B) { + read := func(ch chan struct{}) { + for range ch { + } + } + benchHelper(b, *n, read) +} + +func benchHelper(b *testing.B, n int, read func(chan struct{})) { + m := make([]chan struct{}, n) + for i := range m { + m[i] = make(chan struct{}, 1) + go read(m[i]) + } + b.StopTimer() + b.ResetTimer() + GC() + + for i := 0; i < b.N; i++ { + for _, ch := range m { + if ch != nil { + ch <- struct{}{} + } + } + time.Sleep(10 * time.Millisecond) + b.StartTimer() + GC() + b.StopTimer() + } + + for _, ch := range m { + close(ch) + } + time.Sleep(10 * time.Millisecond) +} + 
+func BenchmarkGoroutineIdle(b *testing.B) { + quit := make(chan struct{}) + fn := func() { + <-quit + } + for i := 0; i < *n; i++ { + go fn() + } + + GC() + b.ResetTimer() + + for i := 0; i < b.N; i++ { + GC() + } + + b.StopTimer() + close(quit) + time.Sleep(10 * time.Millisecond) +} diff --git a/src/runtime/map.go b/src/runtime/map.go new file mode 100644 index 0000000..f546ce8 --- /dev/null +++ b/src/runtime/map.go @@ -0,0 +1,1418 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// This file contains the implementation of Go's map type. +// +// A map is just a hash table. The data is arranged +// into an array of buckets. Each bucket contains up to +// 8 key/elem pairs. The low-order bits of the hash are +// used to select a bucket. Each bucket contains a few +// high-order bits of each hash to distinguish the entries +// within a single bucket. +// +// If more than 8 keys hash to a bucket, we chain on +// extra buckets. +// +// When the hashtable grows, we allocate a new array +// of buckets twice as big. Buckets are incrementally +// copied from the old bucket array to the new bucket array. +// +// Map iterators walk through the array of buckets and +// return the keys in walk order (bucket #, then overflow +// chain order, then bucket index). To maintain iteration +// semantics, we never move keys within their bucket (if +// we did, keys might be returned 0 or 2 times). When +// growing the table, iterators remain iterating through the +// old table and must check the new table if the bucket +// they are iterating through has been moved ("evacuated") +// to the new table. + +// Picking loadFactor: too large and we have lots of overflow +// buckets, too small and we waste a lot of space. 
I wrote +// a simple program to check some stats for different loads: +// (64-bit, 8 byte keys and elems) +// loadFactor %overflow bytes/entry hitprobe missprobe +// 4.00 2.13 20.77 3.00 4.00 +// 4.50 4.05 17.30 3.25 4.50 +// 5.00 6.85 14.77 3.50 5.00 +// 5.50 10.55 12.94 3.75 5.50 +// 6.00 15.27 11.67 4.00 6.00 +// 6.50 20.90 10.79 4.25 6.50 +// 7.00 27.14 10.15 4.50 7.00 +// 7.50 34.03 9.73 4.75 7.50 +// 8.00 41.10 9.40 5.00 8.00 +// +// %overflow = percentage of buckets which have an overflow bucket +// bytes/entry = overhead bytes used per key/elem pair +// hitprobe = # of entries to check when looking up a present key +// missprobe = # of entries to check when looking up an absent key +// +// Keep in mind this data is for maximally loaded tables, i.e. just +// before the table grows. Typical tables will be somewhat less loaded. + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/math" + "unsafe" +) + +const ( + // Maximum number of key/elem pairs a bucket can hold. + bucketCntBits = 3 + bucketCnt = 1 << bucketCntBits + + // Maximum average load of a bucket that triggers growth is 6.5. + // Represent as loadFactorNum/loadFactorDen, to allow integer math. + loadFactorNum = 13 + loadFactorDen = 2 + + // Maximum key or elem size to keep inline (instead of mallocing per element). + // Must fit in a uint8. + // Fast versions cannot handle big elems - the cutoff size for + // fast versions in cmd/compile/internal/gc/walk.go must be at most this elem. + maxKeySize = 128 + maxElemSize = 128 + + // data offset should be the size of the bmap struct, but needs to be + // aligned correctly. For amd64p32 this means 64-bit alignment + // even though pointers are 32 bit. + dataOffset = unsafe.Offsetof(struct { + b bmap + v int64 + }{}.v) + + // Possible tophash values. We reserve a few possibilities for special marks. 
+ // Each bucket (including its overflow buckets, if any) will have either all or none of its + // entries in the evacuated* states (except during the evacuate() method, which only happens + // during map writes and thus no one else can observe the map during that time). + emptyRest = 0 // this cell is empty, and there are no more non-empty cells at higher indexes or overflows. + emptyOne = 1 // this cell is empty + evacuatedX = 2 // key/elem is valid. Entry has been evacuated to first half of larger table. + evacuatedY = 3 // same as above, but evacuated to second half of larger table. + evacuatedEmpty = 4 // cell is empty, bucket is evacuated. + minTopHash = 5 // minimum tophash for a normal filled cell. + + // flags + iterator = 1 // there may be an iterator using buckets + oldIterator = 2 // there may be an iterator using oldbuckets + hashWriting = 4 // a goroutine is writing to the map + sameSizeGrow = 8 // the current map growth is to a new map of the same size + + // sentinel bucket ID for iterator checks + noCheck = 1<<(8*goarch.PtrSize) - 1 +) + +// isEmpty reports whether the given tophash array entry represents an empty bucket entry. +func isEmpty(x uint8) bool { + return x <= emptyOne +} + +// A header for a Go map. +type hmap struct { + // Note: the format of the hmap is also encoded in cmd/compile/internal/reflectdata/reflect.go. + // Make sure this stays in sync with the compiler's definition. + count int // # live cells == size of map. Must be first (used by len() builtin) + flags uint8 + B uint8 // log_2 of # of buckets (can hold up to loadFactor * 2^B items) + noverflow uint16 // approximate number of overflow buckets; see incrnoverflow for details + hash0 uint32 // hash seed + + buckets unsafe.Pointer // array of 2^B Buckets. may be nil if count==0. 
+ oldbuckets unsafe.Pointer // previous bucket array of half the size, non-nil only when growing + nevacuate uintptr // progress counter for evacuation (buckets less than this have been evacuated) + + extra *mapextra // optional fields +} + +// mapextra holds fields that are not present on all maps. +type mapextra struct { + // If both key and elem do not contain pointers and are inline, then we mark bucket + // type as containing no pointers. This avoids scanning such maps. + // However, bmap.overflow is a pointer. In order to keep overflow buckets + // alive, we store pointers to all overflow buckets in hmap.extra.overflow and hmap.extra.oldoverflow. + // overflow and oldoverflow are only used if key and elem do not contain pointers. + // overflow contains overflow buckets for hmap.buckets. + // oldoverflow contains overflow buckets for hmap.oldbuckets. + // The indirection allows a pointer to the slice to be stored in hiter. + overflow *[]*bmap + oldoverflow *[]*bmap + + // nextOverflow holds a pointer to a free overflow bucket. + nextOverflow *bmap +} + +// A bucket for a Go map. +type bmap struct { + // tophash generally contains the top byte of the hash value + // for each key in this bucket. If tophash[0] < minTopHash, + // tophash[0] is a bucket evacuation state instead. + tophash [bucketCnt]uint8 + // Followed by bucketCnt keys and then bucketCnt elems. + // NOTE: packing all the keys together and then all the elems together makes the + // code a bit more complicated than alternating key/elem/key/elem/... but it allows + // us to eliminate padding which would be needed for, e.g., map[int64]int8. + // Followed by an overflow pointer. +} + +// A hash iteration structure. +// If you modify hiter, also change cmd/compile/internal/reflectdata/reflect.go +// and reflect/value.go to match the layout of this structure. +type hiter struct { + key unsafe.Pointer // Must be in first position. Write nil to indicate iteration end (see cmd/compile/internal/walk/range.go).
+ elem unsafe.Pointer // Must be in second position (see cmd/compile/internal/walk/range.go). + t *maptype + h *hmap + buckets unsafe.Pointer // bucket ptr at hash_iter initialization time + bptr *bmap // current bucket + overflow *[]*bmap // keeps overflow buckets of hmap.buckets alive + oldoverflow *[]*bmap // keeps overflow buckets of hmap.oldbuckets alive + startBucket uintptr // bucket iteration started at + offset uint8 // intra-bucket offset to start from during iteration (should be big enough to hold bucketCnt-1) + wrapped bool // already wrapped around from end of bucket array to beginning + B uint8 + i uint8 + bucket uintptr + checkBucket uintptr +} + +// bucketShift returns 1<<b, optimized for code generation. +func bucketShift(b uint8) uintptr { + // Masking the shift amount allows overflow checks to be elided. + return uintptr(1) << (b & (goarch.PtrSize*8 - 1)) +} + +// bucketMask returns 1<<b - 1, optimized for code generation. +func bucketMask(b uint8) uintptr { + return bucketShift(b) - 1 +} + +// tophash calculates the tophash value for hash. +func tophash(hash uintptr) uint8 { + top := uint8(hash >> (goarch.PtrSize*8 - 8)) + if top < minTopHash { + top += minTopHash + } + return top +} + +func evacuated(b *bmap) bool { + h := b.tophash[0] + return h > emptyOne && h < minTopHash +} + +func (b *bmap) overflow(t *maptype) *bmap { + return *(**bmap)(add(unsafe.Pointer(b), uintptr(t.bucketsize)-goarch.PtrSize)) +} + +func (b *bmap) setoverflow(t *maptype, ovf *bmap) { + *(**bmap)(add(unsafe.Pointer(b), uintptr(t.bucketsize)-goarch.PtrSize)) = ovf +} + +func (b *bmap) keys() unsafe.Pointer { + return add(unsafe.Pointer(b), dataOffset) +} + +// incrnoverflow increments h.noverflow. +// noverflow counts the number of overflow buckets. +// This is used to trigger same-size map growth. +// See also tooManyOverflowBuckets. +// To keep hmap small, noverflow is a uint16. +// When there are few buckets, noverflow is an exact count. 
+// When there are many buckets, noverflow is an approximate count. +func (h *hmap) incrnoverflow() { + // We trigger same-size map growth if there are + // as many overflow buckets as buckets. + // We need to be able to count to 1<<h.B. + if h.B < 16 { + h.noverflow++ + return + } + // Increment with probability 1/(1<<(h.B-15)). + // When we reach 1<<15 - 1, we will have approximately + // as many overflow buckets as buckets. + mask := uint32(1)<<(h.B-15) - 1 + // Example: if h.B == 18, then mask == 7, + // and fastrand & 7 == 0 with probability 1/8. + if fastrand()&mask == 0 { + h.noverflow++ + } +} + +func (h *hmap) newoverflow(t *maptype, b *bmap) *bmap { + var ovf *bmap + if h.extra != nil && h.extra.nextOverflow != nil { + // We have preallocated overflow buckets available. + // See makeBucketArray for more details. + ovf = h.extra.nextOverflow + if ovf.overflow(t) == nil { + // We're not at the end of the preallocated overflow buckets. Bump the pointer. + h.extra.nextOverflow = (*bmap)(add(unsafe.Pointer(ovf), uintptr(t.bucketsize))) + } else { + // This is the last preallocated overflow bucket. + // Reset the overflow pointer on this bucket, + // which was set to a non-nil sentinel value. 
+ ovf.setoverflow(t, nil) + h.extra.nextOverflow = nil + } + } else { + ovf = (*bmap)(newobject(t.bucket)) + } + h.incrnoverflow() + if t.bucket.ptrdata == 0 { + h.createOverflow() + *h.extra.overflow = append(*h.extra.overflow, ovf) + } + b.setoverflow(t, ovf) + return ovf +} + +func (h *hmap) createOverflow() { + if h.extra == nil { + h.extra = new(mapextra) + } + if h.extra.overflow == nil { + h.extra.overflow = new([]*bmap) + } +} + +func makemap64(t *maptype, hint int64, h *hmap) *hmap { + if int64(int(hint)) != hint { + hint = 0 + } + return makemap(t, int(hint), h) +} + +// makemap_small implements Go map creation for make(map[k]v) and +// make(map[k]v, hint) when hint is known to be at most bucketCnt +// at compile time and the map needs to be allocated on the heap. +func makemap_small() *hmap { + h := new(hmap) + h.hash0 = fastrand() + return h +} + +// makemap implements Go map creation for make(map[k]v, hint). +// If the compiler has determined that the map or the first bucket +// can be created on the stack, h and/or bucket may be non-nil. +// If h != nil, the map can be created directly in h. +// If h.buckets != nil, bucket pointed to can be used as the first bucket. +func makemap(t *maptype, hint int, h *hmap) *hmap { + mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size) + if overflow || mem > maxAlloc { + hint = 0 + } + + // initialize Hmap + if h == nil { + h = new(hmap) + } + h.hash0 = fastrand() + + // Find the size parameter B which will hold the requested # of elements. + // For hint < 0 overLoadFactor returns false since hint < bucketCnt. + B := uint8(0) + for overLoadFactor(hint, B) { + B++ + } + h.B = B + + // allocate initial hash table + // if B == 0, the buckets field is allocated lazily later (in mapassign) + // If hint is large zeroing this memory could take a while. 
+ if h.B != 0 { + var nextOverflow *bmap + h.buckets, nextOverflow = makeBucketArray(t, h.B, nil) + if nextOverflow != nil { + h.extra = new(mapextra) + h.extra.nextOverflow = nextOverflow + } + } + + return h +} + +// makeBucketArray initializes a backing array for map buckets. +// 1<<b is the minimum number of buckets to allocate. +// dirtyalloc should either be nil or a bucket array previously +// allocated by makeBucketArray with the same t and b parameters. +// If dirtyalloc is nil a new backing array will be alloced and +// otherwise dirtyalloc will be cleared and reused as backing array. +func makeBucketArray(t *maptype, b uint8, dirtyalloc unsafe.Pointer) (buckets unsafe.Pointer, nextOverflow *bmap) { + base := bucketShift(b) + nbuckets := base + // For small b, overflow buckets are unlikely. + // Avoid the overhead of the calculation. + if b >= 4 { + // Add on the estimated number of overflow buckets + // required to insert the median number of elements + // used with this value of b. + nbuckets += bucketShift(b - 4) + sz := t.bucket.size * nbuckets + up := roundupsize(sz) + if up != sz { + nbuckets = up / t.bucket.size + } + } + + if dirtyalloc == nil { + buckets = newarray(t.bucket, int(nbuckets)) + } else { + // dirtyalloc was previously generated by + // the above newarray(t.bucket, int(nbuckets)) + // but may not be empty. + buckets = dirtyalloc + size := t.bucket.size * nbuckets + if t.bucket.ptrdata != 0 { + memclrHasPointers(buckets, size) + } else { + memclrNoHeapPointers(buckets, size) + } + } + + if base != nbuckets { + // We preallocated some overflow buckets. + // To keep the overhead of tracking these overflow buckets to a minimum, + // we use the convention that if a preallocated overflow bucket's overflow + // pointer is nil, then there are more available by bumping the pointer. + // We need a safe non-nil pointer for the last overflow bucket; just use buckets. 
+ nextOverflow = (*bmap)(add(buckets, base*uintptr(t.bucketsize))) + last := (*bmap)(add(buckets, (nbuckets-1)*uintptr(t.bucketsize))) + last.setoverflow(t, (*bmap)(buckets)) + } + return buckets, nextOverflow +} + +// mapaccess1 returns a pointer to h[key]. Never returns nil, instead +// it will return a reference to the zero object for the elem type if +// the key is not in the map. +// NOTE: The returned pointer may keep the whole map live, so don't +// hold onto it for very long. +func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer { + if raceenabled && h != nil { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(mapaccess1) + racereadpc(unsafe.Pointer(h), callerpc, pc) + raceReadObjectPC(t.key, key, callerpc, pc) + } + if msanenabled && h != nil { + msanread(key, t.key.size) + } + if asanenabled && h != nil { + asanread(key, t.key.size) + } + if h == nil || h.count == 0 { + if t.hashMightPanic() { + t.hasher(key, 0) // see issue 23734 + } + return unsafe.Pointer(&zeroVal[0]) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + hash := t.hasher(key, uintptr(h.hash0)) + m := bucketMask(h.B) + b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. 
+ m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + top := tophash(hash) +bucketloop: + for ; b != nil; b = b.overflow(t) { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + if t.indirectkey() { + k = *((*unsafe.Pointer)(k)) + } + if t.key.equal(key, k) { + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + if t.indirectelem() { + e = *((*unsafe.Pointer)(e)) + } + return e + } + } + } + return unsafe.Pointer(&zeroVal[0]) +} + +func mapaccess2(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, bool) { + if raceenabled && h != nil { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(mapaccess2) + racereadpc(unsafe.Pointer(h), callerpc, pc) + raceReadObjectPC(t.key, key, callerpc, pc) + } + if msanenabled && h != nil { + msanread(key, t.key.size) + } + if asanenabled && h != nil { + asanread(key, t.key.size) + } + if h == nil || h.count == 0 { + if t.hashMightPanic() { + t.hasher(key, 0) // see issue 23734 + } + return unsafe.Pointer(&zeroVal[0]), false + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + hash := t.hasher(key, uintptr(h.hash0)) + m := bucketMask(h.B) + b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. 
+ m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + top := tophash(hash) +bucketloop: + for ; b != nil; b = b.overflow(t) { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + if t.indirectkey() { + k = *((*unsafe.Pointer)(k)) + } + if t.key.equal(key, k) { + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + if t.indirectelem() { + e = *((*unsafe.Pointer)(e)) + } + return e, true + } + } + } + return unsafe.Pointer(&zeroVal[0]), false +} + +// returns both key and elem. Used by map iterator. +func mapaccessK(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, unsafe.Pointer) { + if h == nil || h.count == 0 { + return nil, nil + } + hash := t.hasher(key, uintptr(h.hash0)) + m := bucketMask(h.B) + b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. 
+ m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + top := tophash(hash) +bucketloop: + for ; b != nil; b = b.overflow(t) { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + if t.indirectkey() { + k = *((*unsafe.Pointer)(k)) + } + if t.key.equal(key, k) { + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + if t.indirectelem() { + e = *((*unsafe.Pointer)(e)) + } + return k, e + } + } + } + return nil, nil +} + +func mapaccess1_fat(t *maptype, h *hmap, key, zero unsafe.Pointer) unsafe.Pointer { + e := mapaccess1(t, h, key) + if e == unsafe.Pointer(&zeroVal[0]) { + return zero + } + return e +} + +func mapaccess2_fat(t *maptype, h *hmap, key, zero unsafe.Pointer) (unsafe.Pointer, bool) { + e := mapaccess1(t, h, key) + if e == unsafe.Pointer(&zeroVal[0]) { + return zero, false + } + return e, true +} + +// Like mapaccess, but allocates a slot for the key if it is not present in the map. +func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(mapassign) + racewritepc(unsafe.Pointer(h), callerpc, pc) + raceReadObjectPC(t.key, key, callerpc, pc) + } + if msanenabled { + msanread(key, t.key.size) + } + if asanenabled { + asanread(key, t.key.size) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + hash := t.hasher(key, uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher, since t.hasher may panic, + // in which case we have not actually done a write. 
+ h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + top := tophash(hash) + + var inserti *uint8 + var insertk unsafe.Pointer + var elem unsafe.Pointer +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if isEmpty(b.tophash[i]) && inserti == nil { + inserti = &b.tophash[i] + insertk = add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + if t.indirectkey() { + k = *((*unsafe.Pointer)(k)) + } + if !t.key.equal(key, k) { + continue + } + // already have a mapping for key. Update it. + if t.needkeyupdate() { + typedmemmove(t.key, k, key) + } + elem = add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. + + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if inserti == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. 
+ newb := h.newoverflow(t, b) + inserti = &newb.tophash[0] + insertk = add(unsafe.Pointer(newb), dataOffset) + elem = add(insertk, bucketCnt*uintptr(t.keysize)) + } + + // store new key/elem at insert position + if t.indirectkey() { + kmem := newobject(t.key) + *(*unsafe.Pointer)(insertk) = kmem + insertk = kmem + } + if t.indirectelem() { + vmem := newobject(t.elem) + *(*unsafe.Pointer)(elem) = vmem + } + typedmemmove(t.key, insertk, key) + *inserti = top + h.count++ + +done: + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + if t.indirectelem() { + elem = *((*unsafe.Pointer)(elem)) + } + return elem +} + +func mapdelete(t *maptype, h *hmap, key unsafe.Pointer) { + if raceenabled && h != nil { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(mapdelete) + racewritepc(unsafe.Pointer(h), callerpc, pc) + raceReadObjectPC(t.key, key, callerpc, pc) + } + if msanenabled && h != nil { + msanread(key, t.key.size) + } + if asanenabled && h != nil { + asanread(key, t.key.size) + } + if h == nil || h.count == 0 { + if t.hashMightPanic() { + t.hasher(key, 0) // see issue 23734 + } + return + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + + hash := t.hasher(key, uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher, since t.hasher may panic, + // in which case we have not actually done a write (delete). 
+ h.flags ^= hashWriting + + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + bOrig := b + top := tophash(hash) +search: + for ; b != nil; b = b.overflow(t) { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if b.tophash[i] == emptyRest { + break search + } + continue + } + k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize)) + k2 := k + if t.indirectkey() { + k2 = *((*unsafe.Pointer)(k2)) + } + if !t.key.equal(key, k2) { + continue + } + // Only clear key if there are pointers in it. + if t.indirectkey() { + *(*unsafe.Pointer)(k) = nil + } else if t.key.ptrdata != 0 { + memclrHasPointers(k, t.key.size) + } + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.elemsize)) + if t.indirectelem() { + *(*unsafe.Pointer)(e) = nil + } else if t.elem.ptrdata != 0 { + memclrHasPointers(e, t.elem.size) + } else { + memclrNoHeapPointers(e, t.elem.size) + } + b.tophash[i] = emptyOne + // If the bucket now ends in a bunch of emptyOne states, + // change those to emptyRest states. + // It would be nice to make this a separate function, but + // for loops are not currently inlineable. + if i == bucketCnt-1 { + if b.overflow(t) != nil && b.overflow(t).tophash[0] != emptyRest { + goto notLast + } + } else { + if b.tophash[i+1] != emptyRest { + goto notLast + } + } + for { + b.tophash[i] = emptyRest + if i == 0 { + if b == bOrig { + break // beginning of initial bucket, we're done. + } + // Find previous bucket, continue at its last entry. + c := b + for b = bOrig; b.overflow(t) != c; b = b.overflow(t) { + } + i = bucketCnt - 1 + } else { + i-- + } + if b.tophash[i] != emptyOne { + break + } + } + notLast: + h.count-- + // Reset the hash seed to make it more difficult for attackers to + // repeatedly trigger hash collisions. See issue 25237. 
+ if h.count == 0 { + h.hash0 = fastrand() + } + break search + } + } + + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting +} + +// mapiterinit initializes the hiter struct used for ranging over maps. +// The hiter struct pointed to by 'it' is allocated on the stack +// by the compilers order pass or on the heap by reflect_mapiterinit. +// Both need to have zeroed hiter since the struct contains pointers. +func mapiterinit(t *maptype, h *hmap, it *hiter) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapiterinit)) + } + + it.t = t + if h == nil || h.count == 0 { + return + } + + if unsafe.Sizeof(hiter{})/goarch.PtrSize != 12 { + throw("hash_iter size incorrect") // see cmd/compile/internal/reflectdata/reflect.go + } + it.h = h + + // grab snapshot of bucket state + it.B = h.B + it.buckets = h.buckets + if t.bucket.ptrdata == 0 { + // Allocate the current slice and remember pointers to both current and old. + // This preserves all relevant overflow buckets alive even if + // the table grows and/or overflow buckets are added to the table + // while we are iterating. + h.createOverflow() + it.overflow = h.extra.overflow + it.oldoverflow = h.extra.oldoverflow + } + + // decide where to start + var r uintptr + if h.B > 31-bucketCntBits { + r = uintptr(fastrand64()) + } else { + r = uintptr(fastrand()) + } + it.startBucket = r & bucketMask(h.B) + it.offset = uint8(r >> h.B & (bucketCnt - 1)) + + // iterator state + it.bucket = it.startBucket + + // Remember we have an iterator. + // Can run concurrently with another mapiterinit(). 
+ if old := h.flags; old&(iterator|oldIterator) != iterator|oldIterator { + atomic.Or8(&h.flags, iterator|oldIterator) + } + + mapiternext(it) +} + +func mapiternext(it *hiter) { + h := it.h + if raceenabled { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapiternext)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map iteration and map write") + } + t := it.t + bucket := it.bucket + b := it.bptr + i := it.i + checkBucket := it.checkBucket + +next: + if b == nil { + if bucket == it.startBucket && it.wrapped { + // end of iteration + it.key = nil + it.elem = nil + return + } + if h.growing() && it.B == h.B { + // Iterator was started in the middle of a grow, and the grow isn't done yet. + // If the bucket we're looking at hasn't been filled in yet (i.e. the old + // bucket hasn't been evacuated) then we need to iterate through the old + // bucket and only return the ones that will be migrated to this bucket. + oldbucket := bucket & it.h.oldbucketmask() + b = (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))) + if !evacuated(b) { + checkBucket = bucket + } else { + b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize))) + checkBucket = noCheck + } + } else { + b = (*bmap)(add(it.buckets, bucket*uintptr(t.bucketsize))) + checkBucket = noCheck + } + bucket++ + if bucket == bucketShift(it.B) { + bucket = 0 + it.wrapped = true + } + i = 0 + } + for ; i < bucketCnt; i++ { + offi := (i + it.offset) & (bucketCnt - 1) + if isEmpty(b.tophash[offi]) || b.tophash[offi] == evacuatedEmpty { + // TODO: emptyRest is hard to use here, as we start iterating + // in the middle of a bucket. It's feasible, just tricky. 
+ continue + } + k := add(unsafe.Pointer(b), dataOffset+uintptr(offi)*uintptr(t.keysize)) + if t.indirectkey() { + k = *((*unsafe.Pointer)(k)) + } + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+uintptr(offi)*uintptr(t.elemsize)) + if checkBucket != noCheck && !h.sameSizeGrow() { + // Special case: iterator was started during a grow to a larger size + // and the grow is not done yet. We're working on a bucket whose + // oldbucket has not been evacuated yet. Or at least, it wasn't + // evacuated when we started the bucket. So we're iterating + // through the oldbucket, skipping any keys that will go + // to the other new bucket (each oldbucket expands to two + // buckets during a grow). + if t.reflexivekey() || t.key.equal(k, k) { + // If the item in the oldbucket is not destined for + // the current new bucket in the iteration, skip it. + hash := t.hasher(k, uintptr(h.hash0)) + if hash&bucketMask(it.B) != checkBucket { + continue + } + } else { + // Hash isn't repeatable if k != k (NaNs). We need a + // repeatable and randomish choice of which direction + // to send NaNs during evacuation. We'll use the low + // bit of tophash to decide which way NaNs go. + // NOTE: this case is why we need two evacuate tophash + // values, evacuatedX and evacuatedY, that differ in + // their low bit. + if checkBucket>>(it.B-1) != uintptr(b.tophash[offi]&1) { + continue + } + } + } + if (b.tophash[offi] != evacuatedX && b.tophash[offi] != evacuatedY) || + !(t.reflexivekey() || t.key.equal(k, k)) { + // This is the golden data, we can return it. + // OR + // key!=key, so the entry can't be deleted or updated, so we can just return it. + // That's lucky for us because when key!=key we can't look it up successfully. + it.key = k + if t.indirectelem() { + e = *((*unsafe.Pointer)(e)) + } + it.elem = e + } else { + // The hash table has grown since the iterator was started. + // The golden data for this key is now somewhere else. 
+ // Check the current hash table for the data. + // This code handles the case where the key + // has been deleted, updated, or deleted and reinserted. + // NOTE: we need to regrab the key as it has potentially been + // updated to an equal() but not identical key (e.g. +0.0 vs -0.0). + rk, re := mapaccessK(t, h, k) + if rk == nil { + continue // key has been deleted + } + it.key = rk + it.elem = re + } + it.bucket = bucket + if it.bptr != b { // avoid unnecessary write barrier; see issue 14921 + it.bptr = b + } + it.i = i + 1 + it.checkBucket = checkBucket + return + } + b = b.overflow(t) + i = 0 + goto next +} + +// mapclear deletes all keys from a map. +func mapclear(t *maptype, h *hmap) { + if raceenabled && h != nil { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(mapclear) + racewritepc(unsafe.Pointer(h), callerpc, pc) + } + + if h == nil || h.count == 0 { + return + } + + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + + h.flags ^= hashWriting + + h.flags &^= sameSizeGrow + h.oldbuckets = nil + h.nevacuate = 0 + h.noverflow = 0 + h.count = 0 + + // Reset the hash seed to make it more difficult for attackers to + // repeatedly trigger hash collisions. See issue 25237. + h.hash0 = fastrand() + + // Keep the mapextra allocation but clear any extra information. + if h.extra != nil { + *h.extra = mapextra{} + } + + // makeBucketArray clears the memory pointed to by h.buckets + // and recovers any overflow buckets by generating them + // as if h.buckets was newly alloced. + _, nextOverflow := makeBucketArray(t, h.B, h.buckets) + if nextOverflow != nil { + // If overflow buckets are created then h.extra + // will have been allocated during initial bucket creation. + h.extra.nextOverflow = nextOverflow + } + + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting +} + +func hashGrow(t *maptype, h *hmap) { + // If we've hit the load factor, get bigger. 
+ // Otherwise, there are too many overflow buckets, + // so keep the same number of buckets and "grow" laterally. + bigger := uint8(1) + if !overLoadFactor(h.count+1, h.B) { + bigger = 0 + h.flags |= sameSizeGrow + } + oldbuckets := h.buckets + newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil) + + flags := h.flags &^ (iterator | oldIterator) + if h.flags&iterator != 0 { + flags |= oldIterator + } + // commit the grow (atomic wrt gc) + h.B += bigger + h.flags = flags + h.oldbuckets = oldbuckets + h.buckets = newbuckets + h.nevacuate = 0 + h.noverflow = 0 + + if h.extra != nil && h.extra.overflow != nil { + // Promote current overflow buckets to the old generation. + if h.extra.oldoverflow != nil { + throw("oldoverflow is not nil") + } + h.extra.oldoverflow = h.extra.overflow + h.extra.overflow = nil + } + if nextOverflow != nil { + if h.extra == nil { + h.extra = new(mapextra) + } + h.extra.nextOverflow = nextOverflow + } + + // the actual copying of the hash table data is done incrementally + // by growWork() and evacuate(). +} + +// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor. +func overLoadFactor(count int, B uint8) bool { + return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen) +} + +// tooManyOverflowBuckets reports whether noverflow buckets is too many for a map with 1<<B buckets. +// Note that most of these overflow buckets must be in sparse use; +// if use was dense, then we'd have already triggered regular map growth. +func tooManyOverflowBuckets(noverflow uint16, B uint8) bool { + // If the threshold is too low, we do extraneous work. + // If the threshold is too high, maps that grow and shrink can hold on to lots of unused memory. + // "too many" means (approximately) as many overflow buckets as regular buckets. + // See incrnoverflow for more details. + if B > 15 { + B = 15 + } + // The compiler doesn't see here that B < 16; mask B to generate shorter shift code. 
+ return noverflow >= uint16(1)<<(B&15) +} + +// growing reports whether h is growing. The growth may be to the same size or bigger. +func (h *hmap) growing() bool { + return h.oldbuckets != nil +} + +// sameSizeGrow reports whether the current growth is to a map of the same size. +func (h *hmap) sameSizeGrow() bool { + return h.flags&sameSizeGrow != 0 +} + +// noldbuckets calculates the number of buckets prior to the current map growth. +func (h *hmap) noldbuckets() uintptr { + oldB := h.B + if !h.sameSizeGrow() { + oldB-- + } + return bucketShift(oldB) +} + +// oldbucketmask provides a mask that can be applied to calculate n % noldbuckets(). +func (h *hmap) oldbucketmask() uintptr { + return h.noldbuckets() - 1 +} + +func growWork(t *maptype, h *hmap, bucket uintptr) { + // make sure we evacuate the oldbucket corresponding + // to the bucket we're about to use + evacuate(t, h, bucket&h.oldbucketmask()) + + // evacuate one more oldbucket to make progress on growing + if h.growing() { + evacuate(t, h, h.nevacuate) + } +} + +func bucketEvacuated(t *maptype, h *hmap, bucket uintptr) bool { + b := (*bmap)(add(h.oldbuckets, bucket*uintptr(t.bucketsize))) + return evacuated(b) +} + +// evacDst is an evacuation destination. +type evacDst struct { + b *bmap // current destination bucket + i int // key/elem index into b + k unsafe.Pointer // pointer to current key storage + e unsafe.Pointer // pointer to current elem storage +} + +func evacuate(t *maptype, h *hmap, oldbucket uintptr) { + b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))) + newbit := h.noldbuckets() + if !evacuated(b) { + // TODO: reuse overflow buckets instead of using new ones, if there + // is no iterator using the old buckets. (If !oldIterator.) + + // xy contains the x and y (low and high) evacuation destinations. 
+ var xy [2]evacDst + x := &xy[0] + x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize))) + x.k = add(unsafe.Pointer(x.b), dataOffset) + x.e = add(x.k, bucketCnt*uintptr(t.keysize)) + + if !h.sameSizeGrow() { + // Only calculate y pointers if we're growing bigger. + // Otherwise GC can see bad pointers. + y := &xy[1] + y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize))) + y.k = add(unsafe.Pointer(y.b), dataOffset) + y.e = add(y.k, bucketCnt*uintptr(t.keysize)) + } + + for ; b != nil; b = b.overflow(t) { + k := add(unsafe.Pointer(b), dataOffset) + e := add(k, bucketCnt*uintptr(t.keysize)) + for i := 0; i < bucketCnt; i, k, e = i+1, add(k, uintptr(t.keysize)), add(e, uintptr(t.elemsize)) { + top := b.tophash[i] + if isEmpty(top) { + b.tophash[i] = evacuatedEmpty + continue + } + if top < minTopHash { + throw("bad map state") + } + k2 := k + if t.indirectkey() { + k2 = *((*unsafe.Pointer)(k2)) + } + var useY uint8 + if !h.sameSizeGrow() { + // Compute hash to make our evacuation decision (whether we need + // to send this key/elem to bucket x or bucket y). + hash := t.hasher(k2, uintptr(h.hash0)) + if h.flags&iterator != 0 && !t.reflexivekey() && !t.key.equal(k2, k2) { + // If key != key (NaNs), then the hash could be (and probably + // will be) entirely different from the old hash. Moreover, + // it isn't reproducible. Reproducibility is required in the + // presence of iterators, as our evacuation decision must + // match whatever decision the iterator made. + // Fortunately, we have the freedom to send these keys either + // way. Also, tophash is meaningless for these kinds of keys. + // We let the low bit of tophash drive the evacuation decision. + // We recompute a new random tophash for the next level so + // these keys will get evenly distributed across all buckets + // after multiple grows. 
+ useY = top & 1 + top = tophash(hash) + } else { + if hash&newbit != 0 { + useY = 1 + } + } + } + + if evacuatedX+1 != evacuatedY || evacuatedX^1 != evacuatedY { + throw("bad evacuatedN") + } + + b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY + dst := &xy[useY] // evacuation destination + + if dst.i == bucketCnt { + dst.b = h.newoverflow(t, dst.b) + dst.i = 0 + dst.k = add(unsafe.Pointer(dst.b), dataOffset) + dst.e = add(dst.k, bucketCnt*uintptr(t.keysize)) + } + dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check + if t.indirectkey() { + *(*unsafe.Pointer)(dst.k) = k2 // copy pointer + } else { + typedmemmove(t.key, dst.k, k) // copy elem + } + if t.indirectelem() { + *(*unsafe.Pointer)(dst.e) = *(*unsafe.Pointer)(e) + } else { + typedmemmove(t.elem, dst.e, e) + } + dst.i++ + // These updates might push these pointers past the end of the + // key or elem arrays. That's ok, as we have the overflow pointer + // at the end of the bucket to protect against pointing past the + // end of the bucket. + dst.k = add(dst.k, uintptr(t.keysize)) + dst.e = add(dst.e, uintptr(t.elemsize)) + } + } + // Unlink the overflow buckets & clear key/elem to help GC. + if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 { + b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)) + // Preserve b.tophash because the evacuation + // state is maintained there. + ptr := add(b, dataOffset) + n := uintptr(t.bucketsize) - dataOffset + memclrHasPointers(ptr, n) + } + } + + if oldbucket == h.nevacuate { + advanceEvacuationMark(h, t, newbit) + } +} + +func advanceEvacuationMark(h *hmap, t *maptype, newbit uintptr) { + h.nevacuate++ + // Experiments suggest that 1024 is overkill by at least an order of magnitude. + // Put it in there as a safeguard anyway, to ensure O(1) behavior. 
+ stop := h.nevacuate + 1024 + if stop > newbit { + stop = newbit + } + for h.nevacuate != stop && bucketEvacuated(t, h, h.nevacuate) { + h.nevacuate++ + } + if h.nevacuate == newbit { // newbit == # of oldbuckets + // Growing is all done. Free old main bucket array. + h.oldbuckets = nil + // Can discard old overflow buckets as well. + // If they are still referenced by an iterator, + // then the iterator holds a pointers to the slice. + if h.extra != nil { + h.extra.oldoverflow = nil + } + h.flags &^= sameSizeGrow + } +} + +// Reflect stubs. Called from ../reflect/asm_*.s + +//go:linkname reflect_makemap reflect.makemap +func reflect_makemap(t *maptype, cap int) *hmap { + // Check invariants and reflects math. + if t.key.equal == nil { + throw("runtime.reflect_makemap: unsupported map key type") + } + if t.key.size > maxKeySize && (!t.indirectkey() || t.keysize != uint8(goarch.PtrSize)) || + t.key.size <= maxKeySize && (t.indirectkey() || t.keysize != uint8(t.key.size)) { + throw("key size wrong") + } + if t.elem.size > maxElemSize && (!t.indirectelem() || t.elemsize != uint8(goarch.PtrSize)) || + t.elem.size <= maxElemSize && (t.indirectelem() || t.elemsize != uint8(t.elem.size)) { + throw("elem size wrong") + } + if t.key.align > bucketCnt { + throw("key align too big") + } + if t.elem.align > bucketCnt { + throw("elem align too big") + } + if t.key.size%uintptr(t.key.align) != 0 { + throw("key size not a multiple of key align") + } + if t.elem.size%uintptr(t.elem.align) != 0 { + throw("elem size not a multiple of elem align") + } + if bucketCnt < 8 { + throw("bucketsize too small for proper alignment") + } + if dataOffset%uintptr(t.key.align) != 0 { + throw("need padding in bucket (key)") + } + if dataOffset%uintptr(t.elem.align) != 0 { + throw("need padding in bucket (elem)") + } + + return makemap(t, cap, nil) +} + +//go:linkname reflect_mapaccess reflect.mapaccess +func reflect_mapaccess(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer { + elem, ok := 
mapaccess2(t, h, key) + if !ok { + // reflect wants nil for a missing element + elem = nil + } + return elem +} + +//go:linkname reflect_mapaccess_faststr reflect.mapaccess_faststr +func reflect_mapaccess_faststr(t *maptype, h *hmap, key string) unsafe.Pointer { + elem, ok := mapaccess2_faststr(t, h, key) + if !ok { + // reflect wants nil for a missing element + elem = nil + } + return elem +} + +//go:linkname reflect_mapassign reflect.mapassign +func reflect_mapassign(t *maptype, h *hmap, key unsafe.Pointer, elem unsafe.Pointer) { + p := mapassign(t, h, key) + typedmemmove(t.elem, p, elem) +} + +//go:linkname reflect_mapassign_faststr reflect.mapassign_faststr +func reflect_mapassign_faststr(t *maptype, h *hmap, key string, elem unsafe.Pointer) { + p := mapassign_faststr(t, h, key) + typedmemmove(t.elem, p, elem) +} + +//go:linkname reflect_mapdelete reflect.mapdelete +func reflect_mapdelete(t *maptype, h *hmap, key unsafe.Pointer) { + mapdelete(t, h, key) +} + +//go:linkname reflect_mapdelete_faststr reflect.mapdelete_faststr +func reflect_mapdelete_faststr(t *maptype, h *hmap, key string) { + mapdelete_faststr(t, h, key) +} + +//go:linkname reflect_mapiterinit reflect.mapiterinit +func reflect_mapiterinit(t *maptype, h *hmap, it *hiter) { + mapiterinit(t, h, it) +} + +//go:linkname reflect_mapiternext reflect.mapiternext +func reflect_mapiternext(it *hiter) { + mapiternext(it) +} + +//go:linkname reflect_mapiterkey reflect.mapiterkey +func reflect_mapiterkey(it *hiter) unsafe.Pointer { + return it.key +} + +//go:linkname reflect_mapiterelem reflect.mapiterelem +func reflect_mapiterelem(it *hiter) unsafe.Pointer { + return it.elem +} + +//go:linkname reflect_maplen reflect.maplen +func reflect_maplen(h *hmap) int { + if h == nil { + return 0 + } + if raceenabled { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(reflect_maplen)) + } + return h.count +} + +//go:linkname reflectlite_maplen internal/reflectlite.maplen +func 
reflectlite_maplen(h *hmap) int { + if h == nil { + return 0 + } + if raceenabled { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(reflect_maplen)) + } + return h.count +} + +const maxZero = 1024 // must match value in reflect/value.go:maxZero cmd/compile/internal/gc/walk.go:zeroValSize +var zeroVal [maxZero]byte diff --git a/src/runtime/map_benchmark_test.go b/src/runtime/map_benchmark_test.go new file mode 100644 index 0000000..b46d2a4 --- /dev/null +++ b/src/runtime/map_benchmark_test.go @@ -0,0 +1,535 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "math/rand" + "strconv" + "strings" + "testing" +) + +const size = 10 + +func BenchmarkHashStringSpeed(b *testing.B) { + strings := make([]string, size) + for i := 0; i < size; i++ { + strings[i] = fmt.Sprintf("string#%d", i) + } + sum := 0 + m := make(map[string]int, size) + for i := 0; i < size; i++ { + m[strings[i]] = 0 + } + idx := 0 + b.ResetTimer() + for i := 0; i < b.N; i++ { + sum += m[strings[idx]] + idx++ + if idx == size { + idx = 0 + } + } +} + +type chunk [17]byte + +func BenchmarkHashBytesSpeed(b *testing.B) { + // a bunch of chunks, each with a different alignment mod 16 + var chunks [size]chunk + // initialize each to a different value + for i := 0; i < size; i++ { + chunks[i][0] = byte(i) + } + // put into a map + m := make(map[chunk]int, size) + for i, c := range chunks { + m[c] = i + } + idx := 0 + b.ResetTimer() + for i := 0; i < b.N; i++ { + if m[chunks[idx]] != idx { + b.Error("bad map entry for chunk") + } + idx++ + if idx == size { + idx = 0 + } + } +} + +func BenchmarkHashInt32Speed(b *testing.B) { + ints := make([]int32, size) + for i := 0; i < size; i++ { + ints[i] = int32(i) + } + sum := 0 + m := make(map[int32]int, size) + for i := 0; i < size; i++ { + m[ints[i]] = 0 + } + idx := 0 + 
b.ResetTimer() + for i := 0; i < b.N; i++ { + sum += m[ints[idx]] + idx++ + if idx == size { + idx = 0 + } + } +} + +func BenchmarkHashInt64Speed(b *testing.B) { + ints := make([]int64, size) + for i := 0; i < size; i++ { + ints[i] = int64(i) + } + sum := 0 + m := make(map[int64]int, size) + for i := 0; i < size; i++ { + m[ints[i]] = 0 + } + idx := 0 + b.ResetTimer() + for i := 0; i < b.N; i++ { + sum += m[ints[idx]] + idx++ + if idx == size { + idx = 0 + } + } +} +func BenchmarkHashStringArraySpeed(b *testing.B) { + stringpairs := make([][2]string, size) + for i := 0; i < size; i++ { + for j := 0; j < 2; j++ { + stringpairs[i][j] = fmt.Sprintf("string#%d/%d", i, j) + } + } + sum := 0 + m := make(map[[2]string]int, size) + for i := 0; i < size; i++ { + m[stringpairs[i]] = 0 + } + idx := 0 + b.ResetTimer() + for i := 0; i < b.N; i++ { + sum += m[stringpairs[idx]] + idx++ + if idx == size { + idx = 0 + } + } +} + +func BenchmarkMegMap(b *testing.B) { + m := make(map[string]bool) + for suffix := 'A'; suffix <= 'G'; suffix++ { + m[strings.Repeat("X", 1<<20-1)+fmt.Sprint(suffix)] = true + } + key := strings.Repeat("X", 1<<20-1) + "k" + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[key] + } +} + +func BenchmarkMegOneMap(b *testing.B) { + m := make(map[string]bool) + m[strings.Repeat("X", 1<<20)] = true + key := strings.Repeat("Y", 1<<20) + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[key] + } +} + +func BenchmarkMegEqMap(b *testing.B) { + m := make(map[string]bool) + key1 := strings.Repeat("X", 1<<20) + key2 := strings.Repeat("X", 1<<20) // equal but different instance + m[key1] = true + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[key2] + } +} + +func BenchmarkMegEmptyMap(b *testing.B) { + m := make(map[string]bool) + key := strings.Repeat("X", 1<<20) + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[key] + } +} + +func BenchmarkSmallStrMap(b *testing.B) { + m := make(map[string]bool) + for suffix := 'A'; suffix <= 'G'; suffix++ { + 
m[fmt.Sprint(suffix)] = true + } + key := "k" + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[key] + } +} + +func BenchmarkMapStringKeysEight_16(b *testing.B) { benchmarkMapStringKeysEight(b, 16) } +func BenchmarkMapStringKeysEight_32(b *testing.B) { benchmarkMapStringKeysEight(b, 32) } +func BenchmarkMapStringKeysEight_64(b *testing.B) { benchmarkMapStringKeysEight(b, 64) } +func BenchmarkMapStringKeysEight_1M(b *testing.B) { benchmarkMapStringKeysEight(b, 1<<20) } + +func benchmarkMapStringKeysEight(b *testing.B, keySize int) { + m := make(map[string]bool) + for i := 0; i < 8; i++ { + m[strings.Repeat("K", i+1)] = true + } + key := strings.Repeat("K", keySize) + b.ResetTimer() + for i := 0; i < b.N; i++ { + _ = m[key] + } +} + +func BenchmarkIntMap(b *testing.B) { + m := make(map[int]bool) + for i := 0; i < 8; i++ { + m[i] = true + } + b.ResetTimer() + for i := 0; i < b.N; i++ { + _, _ = m[7] + } +} + +func BenchmarkMapFirst(b *testing.B) { + for n := 1; n <= 16; n++ { + b.Run(fmt.Sprintf("%d", n), func(b *testing.B) { + m := make(map[int]bool) + for i := 0; i < n; i++ { + m[i] = true + } + b.ResetTimer() + for i := 0; i < b.N; i++ { + _ = m[0] + } + }) + } +} +func BenchmarkMapMid(b *testing.B) { + for n := 1; n <= 16; n++ { + b.Run(fmt.Sprintf("%d", n), func(b *testing.B) { + m := make(map[int]bool) + for i := 0; i < n; i++ { + m[i] = true + } + b.ResetTimer() + for i := 0; i < b.N; i++ { + _ = m[n>>1] + } + }) + } +} +func BenchmarkMapLast(b *testing.B) { + for n := 1; n <= 16; n++ { + b.Run(fmt.Sprintf("%d", n), func(b *testing.B) { + m := make(map[int]bool) + for i := 0; i < n; i++ { + m[i] = true + } + b.ResetTimer() + for i := 0; i < b.N; i++ { + _ = m[n-1] + } + }) + } +} + +func BenchmarkMapCycle(b *testing.B) { + // Arrange map entries to be a permutation, so that + // we hit all entries, and one lookup is data dependent + // on the previous lookup. 
+ const N = 3127 + p := rand.New(rand.NewSource(1)).Perm(N) + m := map[int]int{} + for i := 0; i < N; i++ { + m[i] = p[i] + } + b.ResetTimer() + j := 0 + for i := 0; i < b.N; i++ { + j = m[j] + } + sink = uint64(j) +} + +// Accessing the same keys in a row. +func benchmarkRepeatedLookup(b *testing.B, lookupKeySize int) { + m := make(map[string]bool) + // At least bigger than a single bucket: + for i := 0; i < 64; i++ { + m[fmt.Sprintf("some key %d", i)] = true + } + base := strings.Repeat("x", lookupKeySize-1) + key1 := base + "1" + key2 := base + "2" + b.ResetTimer() + for i := 0; i < b.N/4; i++ { + _ = m[key1] + _ = m[key1] + _ = m[key2] + _ = m[key2] + } +} + +func BenchmarkRepeatedLookupStrMapKey32(b *testing.B) { benchmarkRepeatedLookup(b, 32) } +func BenchmarkRepeatedLookupStrMapKey1M(b *testing.B) { benchmarkRepeatedLookup(b, 1<<20) } + +func BenchmarkMakeMap(b *testing.B) { + b.Run("[Byte]Byte", func(b *testing.B) { + var m map[byte]byte + for i := 0; i < b.N; i++ { + m = make(map[byte]byte, 10) + } + hugeSink = m + }) + b.Run("[Int]Int", func(b *testing.B) { + var m map[int]int + for i := 0; i < b.N; i++ { + m = make(map[int]int, 10) + } + hugeSink = m + }) +} + +func BenchmarkNewEmptyMap(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + _ = make(map[int]int) + } +} + +func BenchmarkNewSmallMap(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + m := make(map[int]int) + m[0] = 0 + m[1] = 1 + } +} + +func BenchmarkMapIter(b *testing.B) { + m := make(map[int]bool) + for i := 0; i < 8; i++ { + m[i] = true + } + b.ResetTimer() + for i := 0; i < b.N; i++ { + for range m { + } + } +} + +func BenchmarkMapIterEmpty(b *testing.B) { + m := make(map[int]bool) + b.ResetTimer() + for i := 0; i < b.N; i++ { + for range m { + } + } +} + +func BenchmarkSameLengthMap(b *testing.B) { + // long strings, same length, differ in first few + // and last few bytes. 
+ m := make(map[string]bool) + s1 := "foo" + strings.Repeat("-", 100) + "bar" + s2 := "goo" + strings.Repeat("-", 100) + "ber" + m[s1] = true + m[s2] = true + b.ResetTimer() + for i := 0; i < b.N; i++ { + _ = m[s1] + } +} + +type BigKey [3]int64 + +func BenchmarkBigKeyMap(b *testing.B) { + m := make(map[BigKey]bool) + k := BigKey{3, 4, 5} + m[k] = true + for i := 0; i < b.N; i++ { + _ = m[k] + } +} + +type BigVal [3]int64 + +func BenchmarkBigValMap(b *testing.B) { + m := make(map[BigKey]BigVal) + k := BigKey{3, 4, 5} + m[k] = BigVal{6, 7, 8} + for i := 0; i < b.N; i++ { + _ = m[k] + } +} + +func BenchmarkSmallKeyMap(b *testing.B) { + m := make(map[int16]bool) + m[5] = true + for i := 0; i < b.N; i++ { + _ = m[5] + } +} + +func BenchmarkMapPopulate(b *testing.B) { + for size := 1; size < 1000000; size *= 10 { + b.Run(strconv.Itoa(size), func(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + m := make(map[int]bool) + for j := 0; j < size; j++ { + m[j] = true + } + } + }) + } +} + +type ComplexAlgKey struct { + a, b, c int64 + _ int + d int32 + _ int + e string + _ int + f, g, h int64 +} + +func BenchmarkComplexAlgMap(b *testing.B) { + m := make(map[ComplexAlgKey]bool) + var k ComplexAlgKey + m[k] = true + for i := 0; i < b.N; i++ { + _ = m[k] + } +} + +func BenchmarkGoMapClear(b *testing.B) { + b.Run("Reflexive", func(b *testing.B) { + for size := 1; size < 100000; size *= 10 { + b.Run(strconv.Itoa(size), func(b *testing.B) { + m := make(map[int]int, size) + for i := 0; i < b.N; i++ { + m[0] = size // Add one element so len(m) != 0 avoiding fast paths. + for k := range m { + delete(m, k) + } + } + }) + } + }) + b.Run("NonReflexive", func(b *testing.B) { + for size := 1; size < 100000; size *= 10 { + b.Run(strconv.Itoa(size), func(b *testing.B) { + m := make(map[float64]int, size) + for i := 0; i < b.N; i++ { + m[1.0] = size // Add one element so len(m) != 0 avoiding fast paths. 
+ for k := range m { + delete(m, k) + } + } + }) + } + }) +} + +func BenchmarkMapStringConversion(b *testing.B) { + for _, length := range []int{32, 64} { + b.Run(strconv.Itoa(length), func(b *testing.B) { + bytes := make([]byte, length) + b.Run("simple", func(b *testing.B) { + b.ReportAllocs() + m := make(map[string]int) + m[string(bytes)] = 0 + for i := 0; i < b.N; i++ { + _ = m[string(bytes)] + } + }) + b.Run("struct", func(b *testing.B) { + b.ReportAllocs() + type stringstruct struct{ s string } + m := make(map[stringstruct]int) + m[stringstruct{string(bytes)}] = 0 + for i := 0; i < b.N; i++ { + _ = m[stringstruct{string(bytes)}] + } + }) + b.Run("array", func(b *testing.B) { + b.ReportAllocs() + type stringarray [1]string + m := make(map[stringarray]int) + m[stringarray{string(bytes)}] = 0 + for i := 0; i < b.N; i++ { + _ = m[stringarray{string(bytes)}] + } + }) + }) + } +} + +var BoolSink bool + +func BenchmarkMapInterfaceString(b *testing.B) { + m := map[any]bool{} + + for i := 0; i < 100; i++ { + m[fmt.Sprintf("%d", i)] = true + } + + key := (any)("A") + b.ResetTimer() + for i := 0; i < b.N; i++ { + BoolSink = m[key] + } +} +func BenchmarkMapInterfacePtr(b *testing.B) { + m := map[any]bool{} + + for i := 0; i < 100; i++ { + i := i + m[&i] = true + } + + key := new(int) + b.ResetTimer() + for i := 0; i < b.N; i++ { + BoolSink = m[key] + } +} + +var ( + hintLessThan8 = 7 + hintGreaterThan8 = 32 +) + +func BenchmarkNewEmptyMapHintLessThan8(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + _ = make(map[int]int, hintLessThan8) + } +} + +func BenchmarkNewEmptyMapHintGreaterThan8(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + _ = make(map[int]int, hintGreaterThan8) + } +} diff --git a/src/runtime/map_fast32.go b/src/runtime/map_fast32.go new file mode 100644 index 0000000..01ea330 --- /dev/null +++ b/src/runtime/map_fast32.go @@ -0,0 +1,462 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func mapaccess1_fast32(t *maptype, h *hmap, key uint32) unsafe.Pointer { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess1_fast32)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + var b *bmap + if h.B == 0 { + // One-bucket table. No need to hash. + b = (*bmap)(h.buckets) + } else { + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + m := bucketMask(h.B) + b = (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + } + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 4) { + if *(*uint32)(k) == key && !isEmpty(b.tophash[i]) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*4+i*uintptr(t.elemsize)) + } + } + } + return unsafe.Pointer(&zeroVal[0]) +} + +func mapaccess2_fast32(t *maptype, h *hmap, key uint32) (unsafe.Pointer, bool) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess2_fast32)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]), false + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + var b *bmap + if h.B == 0 { + // One-bucket table. No need to hash. 
+ b = (*bmap)(h.buckets) + } else { + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + m := bucketMask(h.B) + b = (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + } + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 4) { + if *(*uint32)(k) == key && !isEmpty(b.tophash[i]) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*4+i*uintptr(t.elemsize)), true + } + } + } + return unsafe.Pointer(&zeroVal[0]), false +} + +func mapassign_fast32(t *maptype, h *hmap, key uint32) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapassign_fast32)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapassign. 
+ h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast32(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + + var insertb *bmap + var inserti uintptr + var insertk unsafe.Pointer + +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if isEmpty(b.tophash[i]) { + if insertb == nil { + inserti = i + insertb = b + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := *((*uint32)(add(unsafe.Pointer(b), dataOffset+i*4))) + if k != key { + continue + } + inserti = i + insertb = b + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. + + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if insertb == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. 
+ insertb = h.newoverflow(t, b) + inserti = 0 // not necessary, but avoids needlessly spilling inserti + } + insertb.tophash[inserti&(bucketCnt-1)] = tophash(hash) // mask inserti to avoid bounds checks + + insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*4) + // store new key at insert position + *(*uint32)(insertk) = key + + h.count++ + +done: + elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*4+inserti*uintptr(t.elemsize)) + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + return elem +} + +func mapassign_fast32ptr(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapassign_fast32)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapassign. + h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast32(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + + var insertb *bmap + var inserti uintptr + var insertk unsafe.Pointer + +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if isEmpty(b.tophash[i]) { + if insertb == nil { + inserti = i + insertb = b + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := *((*unsafe.Pointer)(add(unsafe.Pointer(b), dataOffset+i*4))) + if k != key { + continue + } + inserti = i + insertb = b + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. 
+ + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if insertb == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. + insertb = h.newoverflow(t, b) + inserti = 0 // not necessary, but avoids needlessly spilling inserti + } + insertb.tophash[inserti&(bucketCnt-1)] = tophash(hash) // mask inserti to avoid bounds checks + + insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*4) + // store new key at insert position + *(*unsafe.Pointer)(insertk) = key + + h.count++ + +done: + elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*4+inserti*uintptr(t.elemsize)) + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + return elem +} + +func mapdelete_fast32(t *maptype, h *hmap, key uint32) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapdelete_fast32)) + } + if h == nil || h.count == 0 { + return + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapdelete + h.flags ^= hashWriting + + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast32(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + bOrig := b +search: + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 4) { + if key != *(*uint32)(k) || isEmpty(b.tophash[i]) { + continue + } + // Only clear key if there are pointers in it. 
+ // This can only happen if pointers are 32 bit + // wide as 64 bit pointers do not fit into a 32 bit key. + if goarch.PtrSize == 4 && t.key.ptrdata != 0 { + // The key must be a pointer as we checked pointers are + // 32 bits wide and the key is 32 bits wide also. + *(*unsafe.Pointer)(k) = nil + } + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*4+i*uintptr(t.elemsize)) + if t.elem.ptrdata != 0 { + memclrHasPointers(e, t.elem.size) + } else { + memclrNoHeapPointers(e, t.elem.size) + } + b.tophash[i] = emptyOne + // If the bucket now ends in a bunch of emptyOne states, + // change those to emptyRest states. + if i == bucketCnt-1 { + if b.overflow(t) != nil && b.overflow(t).tophash[0] != emptyRest { + goto notLast + } + } else { + if b.tophash[i+1] != emptyRest { + goto notLast + } + } + for { + b.tophash[i] = emptyRest + if i == 0 { + if b == bOrig { + break // beginning of initial bucket, we're done. + } + // Find previous bucket, continue at its last entry. + c := b + for b = bOrig; b.overflow(t) != c; b = b.overflow(t) { + } + i = bucketCnt - 1 + } else { + i-- + } + if b.tophash[i] != emptyOne { + break + } + } + notLast: + h.count-- + // Reset the hash seed to make it more difficult for attackers to + // repeatedly trigger hash collisions. See issue 25237. 
+ if h.count == 0 { + h.hash0 = fastrand() + } + break search + } + } + + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting +} + +func growWork_fast32(t *maptype, h *hmap, bucket uintptr) { + // make sure we evacuate the oldbucket corresponding + // to the bucket we're about to use + evacuate_fast32(t, h, bucket&h.oldbucketmask()) + + // evacuate one more oldbucket to make progress on growing + if h.growing() { + evacuate_fast32(t, h, h.nevacuate) + } +} + +func evacuate_fast32(t *maptype, h *hmap, oldbucket uintptr) { + b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))) + newbit := h.noldbuckets() + if !evacuated(b) { + // TODO: reuse overflow buckets instead of using new ones, if there + // is no iterator using the old buckets. (If !oldIterator.) + + // xy contains the x and y (low and high) evacuation destinations. + var xy [2]evacDst + x := &xy[0] + x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize))) + x.k = add(unsafe.Pointer(x.b), dataOffset) + x.e = add(x.k, bucketCnt*4) + + if !h.sameSizeGrow() { + // Only calculate y pointers if we're growing bigger. + // Otherwise GC can see bad pointers. + y := &xy[1] + y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize))) + y.k = add(unsafe.Pointer(y.b), dataOffset) + y.e = add(y.k, bucketCnt*4) + } + + for ; b != nil; b = b.overflow(t) { + k := add(unsafe.Pointer(b), dataOffset) + e := add(k, bucketCnt*4) + for i := 0; i < bucketCnt; i, k, e = i+1, add(k, 4), add(e, uintptr(t.elemsize)) { + top := b.tophash[i] + if isEmpty(top) { + b.tophash[i] = evacuatedEmpty + continue + } + if top < minTopHash { + throw("bad map state") + } + var useY uint8 + if !h.sameSizeGrow() { + // Compute hash to make our evacuation decision (whether we need + // to send this key/elem to bucket x or bucket y). 
+ hash := t.hasher(k, uintptr(h.hash0)) + if hash&newbit != 0 { + useY = 1 + } + } + + b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY, enforced in makemap + dst := &xy[useY] // evacuation destination + + if dst.i == bucketCnt { + dst.b = h.newoverflow(t, dst.b) + dst.i = 0 + dst.k = add(unsafe.Pointer(dst.b), dataOffset) + dst.e = add(dst.k, bucketCnt*4) + } + dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check + + // Copy key. + if goarch.PtrSize == 4 && t.key.ptrdata != 0 && writeBarrier.enabled { + // Write with a write barrier. + *(*unsafe.Pointer)(dst.k) = *(*unsafe.Pointer)(k) + } else { + *(*uint32)(dst.k) = *(*uint32)(k) + } + + typedmemmove(t.elem, dst.e, e) + dst.i++ + // These updates might push these pointers past the end of the + // key or elem arrays. That's ok, as we have the overflow pointer + // at the end of the bucket to protect against pointing past the + // end of the bucket. + dst.k = add(dst.k, 4) + dst.e = add(dst.e, uintptr(t.elemsize)) + } + } + // Unlink the overflow buckets & clear key/elem to help GC. + if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 { + b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)) + // Preserve b.tophash because the evacuation + // state is maintained there. + ptr := add(b, dataOffset) + n := uintptr(t.bucketsize) - dataOffset + memclrHasPointers(ptr, n) + } + } + + if oldbucket == h.nevacuate { + advanceEvacuationMark(h, t, newbit) + } +} diff --git a/src/runtime/map_fast64.go b/src/runtime/map_fast64.go new file mode 100644 index 0000000..2967360 --- /dev/null +++ b/src/runtime/map_fast64.go @@ -0,0 +1,470 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func mapaccess1_fast64(t *maptype, h *hmap, key uint64) unsafe.Pointer { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess1_fast64)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + var b *bmap + if h.B == 0 { + // One-bucket table. No need to hash. + b = (*bmap)(h.buckets) + } else { + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + m := bucketMask(h.B) + b = (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + } + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 8) { + if *(*uint64)(k) == key && !isEmpty(b.tophash[i]) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*8+i*uintptr(t.elemsize)) + } + } + } + return unsafe.Pointer(&zeroVal[0]) +} + +func mapaccess2_fast64(t *maptype, h *hmap, key uint64) (unsafe.Pointer, bool) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess2_fast64)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]), false + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + var b *bmap + if h.B == 0 { + // One-bucket table. No need to hash. 
+ b = (*bmap)(h.buckets) + } else { + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + m := bucketMask(h.B) + b = (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + } + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 8) { + if *(*uint64)(k) == key && !isEmpty(b.tophash[i]) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*8+i*uintptr(t.elemsize)), true + } + } + } + return unsafe.Pointer(&zeroVal[0]), false +} + +func mapassign_fast64(t *maptype, h *hmap, key uint64) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapassign_fast64)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapassign. 
+ h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast64(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + + var insertb *bmap + var inserti uintptr + var insertk unsafe.Pointer + +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if isEmpty(b.tophash[i]) { + if insertb == nil { + insertb = b + inserti = i + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := *((*uint64)(add(unsafe.Pointer(b), dataOffset+i*8))) + if k != key { + continue + } + insertb = b + inserti = i + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. + + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if insertb == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. 
+ insertb = h.newoverflow(t, b) + inserti = 0 // not necessary, but avoids needlessly spilling inserti + } + insertb.tophash[inserti&(bucketCnt-1)] = tophash(hash) // mask inserti to avoid bounds checks + + insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*8) + // store new key at insert position + *(*uint64)(insertk) = key + + h.count++ + +done: + elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*8+inserti*uintptr(t.elemsize)) + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + return elem +} + +func mapassign_fast64ptr(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapassign_fast64)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapassign. + h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast64(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + + var insertb *bmap + var inserti uintptr + var insertk unsafe.Pointer + +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if isEmpty(b.tophash[i]) { + if insertb == nil { + insertb = b + inserti = i + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := *((*unsafe.Pointer)(add(unsafe.Pointer(b), dataOffset+i*8))) + if k != key { + continue + } + insertb = b + inserti = i + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. 
+ + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if insertb == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. + insertb = h.newoverflow(t, b) + inserti = 0 // not necessary, but avoids needlessly spilling inserti + } + insertb.tophash[inserti&(bucketCnt-1)] = tophash(hash) // mask inserti to avoid bounds checks + + insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*8) + // store new key at insert position + *(*unsafe.Pointer)(insertk) = key + + h.count++ + +done: + elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*8+inserti*uintptr(t.elemsize)) + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + return elem +} + +func mapdelete_fast64(t *maptype, h *hmap, key uint64) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapdelete_fast64)) + } + if h == nil || h.count == 0 { + return + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + + hash := t.hasher(noescape(unsafe.Pointer(&key)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapdelete + h.flags ^= hashWriting + + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_fast64(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + bOrig := b +search: + for ; b != nil; b = b.overflow(t) { + for i, k := uintptr(0), b.keys(); i < bucketCnt; i, k = i+1, add(k, 8) { + if key != *(*uint64)(k) || isEmpty(b.tophash[i]) { + continue + } + // Only clear key if there are pointers in it. 
+ if t.key.ptrdata != 0 { + if goarch.PtrSize == 8 { + *(*unsafe.Pointer)(k) = nil + } else { + // There are three ways to squeeze one or more 32 bit pointers into 64 bits. + // Just call memclrHasPointers instead of trying to handle all cases here. + memclrHasPointers(k, 8) + } + } + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*8+i*uintptr(t.elemsize)) + if t.elem.ptrdata != 0 { + memclrHasPointers(e, t.elem.size) + } else { + memclrNoHeapPointers(e, t.elem.size) + } + b.tophash[i] = emptyOne + // If the bucket now ends in a bunch of emptyOne states, + // change those to emptyRest states. + if i == bucketCnt-1 { + if b.overflow(t) != nil && b.overflow(t).tophash[0] != emptyRest { + goto notLast + } + } else { + if b.tophash[i+1] != emptyRest { + goto notLast + } + } + for { + b.tophash[i] = emptyRest + if i == 0 { + if b == bOrig { + break // beginning of initial bucket, we're done. + } + // Find previous bucket, continue at its last entry. + c := b + for b = bOrig; b.overflow(t) != c; b = b.overflow(t) { + } + i = bucketCnt - 1 + } else { + i-- + } + if b.tophash[i] != emptyOne { + break + } + } + notLast: + h.count-- + // Reset the hash seed to make it more difficult for attackers to + // repeatedly trigger hash collisions. See issue 25237. 
+ if h.count == 0 { + h.hash0 = fastrand() + } + break search + } + } + + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting +} + +func growWork_fast64(t *maptype, h *hmap, bucket uintptr) { + // make sure we evacuate the oldbucket corresponding + // to the bucket we're about to use + evacuate_fast64(t, h, bucket&h.oldbucketmask()) + + // evacuate one more oldbucket to make progress on growing + if h.growing() { + evacuate_fast64(t, h, h.nevacuate) + } +} + +func evacuate_fast64(t *maptype, h *hmap, oldbucket uintptr) { + b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))) + newbit := h.noldbuckets() + if !evacuated(b) { + // TODO: reuse overflow buckets instead of using new ones, if there + // is no iterator using the old buckets. (If !oldIterator.) + + // xy contains the x and y (low and high) evacuation destinations. + var xy [2]evacDst + x := &xy[0] + x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize))) + x.k = add(unsafe.Pointer(x.b), dataOffset) + x.e = add(x.k, bucketCnt*8) + + if !h.sameSizeGrow() { + // Only calculate y pointers if we're growing bigger. + // Otherwise GC can see bad pointers. + y := &xy[1] + y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize))) + y.k = add(unsafe.Pointer(y.b), dataOffset) + y.e = add(y.k, bucketCnt*8) + } + + for ; b != nil; b = b.overflow(t) { + k := add(unsafe.Pointer(b), dataOffset) + e := add(k, bucketCnt*8) + for i := 0; i < bucketCnt; i, k, e = i+1, add(k, 8), add(e, uintptr(t.elemsize)) { + top := b.tophash[i] + if isEmpty(top) { + b.tophash[i] = evacuatedEmpty + continue + } + if top < minTopHash { + throw("bad map state") + } + var useY uint8 + if !h.sameSizeGrow() { + // Compute hash to make our evacuation decision (whether we need + // to send this key/elem to bucket x or bucket y). 
+ hash := t.hasher(k, uintptr(h.hash0)) + if hash&newbit != 0 { + useY = 1 + } + } + + b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY, enforced in makemap + dst := &xy[useY] // evacuation destination + + if dst.i == bucketCnt { + dst.b = h.newoverflow(t, dst.b) + dst.i = 0 + dst.k = add(unsafe.Pointer(dst.b), dataOffset) + dst.e = add(dst.k, bucketCnt*8) + } + dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check + + // Copy key. + if t.key.ptrdata != 0 && writeBarrier.enabled { + if goarch.PtrSize == 8 { + // Write with a write barrier. + *(*unsafe.Pointer)(dst.k) = *(*unsafe.Pointer)(k) + } else { + // There are three ways to squeeze at least one 32 bit pointer into 64 bits. + // Give up and call typedmemmove. + typedmemmove(t.key, dst.k, k) + } + } else { + *(*uint64)(dst.k) = *(*uint64)(k) + } + + typedmemmove(t.elem, dst.e, e) + dst.i++ + // These updates might push these pointers past the end of the + // key or elem arrays. That's ok, as we have the overflow pointer + // at the end of the bucket to protect against pointing past the + // end of the bucket. + dst.k = add(dst.k, 8) + dst.e = add(dst.e, uintptr(t.elemsize)) + } + } + // Unlink the overflow buckets & clear key/elem to help GC. + if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 { + b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)) + // Preserve b.tophash because the evacuation + // state is maintained there. + ptr := add(b, dataOffset) + n := uintptr(t.bucketsize) - dataOffset + memclrHasPointers(ptr, n) + } + } + + if oldbucket == h.nevacuate { + advanceEvacuationMark(h, t, newbit) + } +} diff --git a/src/runtime/map_faststr.go b/src/runtime/map_faststr.go new file mode 100644 index 0000000..006c24c --- /dev/null +++ b/src/runtime/map_faststr.go @@ -0,0 +1,485 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func mapaccess1_faststr(t *maptype, h *hmap, ky string) unsafe.Pointer { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess1_faststr)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + key := stringStructOf(&ky) + if h.B == 0 { + // One-bucket table. + b := (*bmap)(h.buckets) + if key.len < 32 { + // short key, doing lots of comparisons is ok + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || isEmpty(b.tophash[i]) { + if b.tophash[i] == emptyRest { + break + } + continue + } + if k.str == key.str || memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)) + } + } + return unsafe.Pointer(&zeroVal[0]) + } + // long key, try not to do more comparisons than necessary + keymaybe := uintptr(bucketCnt) + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || isEmpty(b.tophash[i]) { + if b.tophash[i] == emptyRest { + break + } + continue + } + if k.str == key.str { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)) + } + // check first 4 bytes + if *((*[4]byte)(key.str)) != *((*[4]byte)(k.str)) { + continue + } + // check last 4 bytes + if *((*[4]byte)(add(key.str, uintptr(key.len)-4))) != *((*[4]byte)(add(k.str, uintptr(key.len)-4))) { + continue + } + if keymaybe != bucketCnt { + // Two keys are potential matches. Use hash to distinguish them. 
+ goto dohash + } + keymaybe = i + } + if keymaybe != bucketCnt { + k := (*stringStruct)(add(unsafe.Pointer(b), dataOffset+keymaybe*2*goarch.PtrSize)) + if memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+keymaybe*uintptr(t.elemsize)) + } + } + return unsafe.Pointer(&zeroVal[0]) + } +dohash: + hash := t.hasher(noescape(unsafe.Pointer(&ky)), uintptr(h.hash0)) + m := bucketMask(h.B) + b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + top := tophash(hash) + for ; b != nil; b = b.overflow(t) { + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || b.tophash[i] != top { + continue + } + if k.str == key.str || memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)) + } + } + } + return unsafe.Pointer(&zeroVal[0]) +} + +func mapaccess2_faststr(t *maptype, h *hmap, ky string) (unsafe.Pointer, bool) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racereadpc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapaccess2_faststr)) + } + if h == nil || h.count == 0 { + return unsafe.Pointer(&zeroVal[0]), false + } + if h.flags&hashWriting != 0 { + fatal("concurrent map read and map write") + } + key := stringStructOf(&ky) + if h.B == 0 { + // One-bucket table. 
+ b := (*bmap)(h.buckets) + if key.len < 32 { + // short key, doing lots of comparisons is ok + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || isEmpty(b.tophash[i]) { + if b.tophash[i] == emptyRest { + break + } + continue + } + if k.str == key.str || memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)), true + } + } + return unsafe.Pointer(&zeroVal[0]), false + } + // long key, try not to do more comparisons than necessary + keymaybe := uintptr(bucketCnt) + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || isEmpty(b.tophash[i]) { + if b.tophash[i] == emptyRest { + break + } + continue + } + if k.str == key.str { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)), true + } + // check first 4 bytes + if *((*[4]byte)(key.str)) != *((*[4]byte)(k.str)) { + continue + } + // check last 4 bytes + if *((*[4]byte)(add(key.str, uintptr(key.len)-4))) != *((*[4]byte)(add(k.str, uintptr(key.len)-4))) { + continue + } + if keymaybe != bucketCnt { + // Two keys are potential matches. Use hash to distinguish them. 
+ goto dohash + } + keymaybe = i + } + if keymaybe != bucketCnt { + k := (*stringStruct)(add(unsafe.Pointer(b), dataOffset+keymaybe*2*goarch.PtrSize)) + if memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+keymaybe*uintptr(t.elemsize)), true + } + } + return unsafe.Pointer(&zeroVal[0]), false + } +dohash: + hash := t.hasher(noescape(unsafe.Pointer(&ky)), uintptr(h.hash0)) + m := bucketMask(h.B) + b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) + if c := h.oldbuckets; c != nil { + if !h.sameSizeGrow() { + // There used to be half as many buckets; mask down one more power of two. + m >>= 1 + } + oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize))) + if !evacuated(oldb) { + b = oldb + } + } + top := tophash(hash) + for ; b != nil; b = b.overflow(t) { + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || b.tophash[i] != top { + continue + } + if k.str == key.str || memequal(k.str, key.str, uintptr(key.len)) { + return add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)), true + } + } + } + return unsafe.Pointer(&zeroVal[0]), false +} + +func mapassign_faststr(t *maptype, h *hmap, s string) unsafe.Pointer { + if h == nil { + panic(plainError("assignment to entry in nil map")) + } + if raceenabled { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapassign_faststr)) + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + key := stringStructOf(&s) + hash := t.hasher(noescape(unsafe.Pointer(&s)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapassign. 
+ h.flags ^= hashWriting + + if h.buckets == nil { + h.buckets = newobject(t.bucket) // newarray(t.bucket, 1) + } + +again: + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_faststr(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + top := tophash(hash) + + var insertb *bmap + var inserti uintptr + var insertk unsafe.Pointer + +bucketloop: + for { + for i := uintptr(0); i < bucketCnt; i++ { + if b.tophash[i] != top { + if isEmpty(b.tophash[i]) && insertb == nil { + insertb = b + inserti = i + } + if b.tophash[i] == emptyRest { + break bucketloop + } + continue + } + k := (*stringStruct)(add(unsafe.Pointer(b), dataOffset+i*2*goarch.PtrSize)) + if k.len != key.len { + continue + } + if k.str != key.str && !memequal(k.str, key.str, uintptr(key.len)) { + continue + } + // already have a mapping for key. Update it. + inserti = i + insertb = b + // Overwrite existing key, so it can be garbage collected. + // The size is already guaranteed to be set correctly. + k.str = key.str + goto done + } + ovf := b.overflow(t) + if ovf == nil { + break + } + b = ovf + } + + // Did not find mapping for key. Allocate new cell & add entry. + + // If we hit the max load factor or we have too many overflow buckets, + // and we're not already in the middle of growing, start growing. + if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) { + hashGrow(t, h) + goto again // Growing the table invalidates everything, so try again + } + + if insertb == nil { + // The current bucket and all the overflow buckets connected to it are full, allocate a new one. 
+ insertb = h.newoverflow(t, b) + inserti = 0 // not necessary, but avoids needlessly spilling inserti + } + insertb.tophash[inserti&(bucketCnt-1)] = top // mask inserti to avoid bounds checks + + insertk = add(unsafe.Pointer(insertb), dataOffset+inserti*2*goarch.PtrSize) + // store new key at insert position + *((*stringStruct)(insertk)) = *key + h.count++ + +done: + elem := add(unsafe.Pointer(insertb), dataOffset+bucketCnt*2*goarch.PtrSize+inserti*uintptr(t.elemsize)) + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting + return elem +} + +func mapdelete_faststr(t *maptype, h *hmap, ky string) { + if raceenabled && h != nil { + callerpc := getcallerpc() + racewritepc(unsafe.Pointer(h), callerpc, abi.FuncPCABIInternal(mapdelete_faststr)) + } + if h == nil || h.count == 0 { + return + } + if h.flags&hashWriting != 0 { + fatal("concurrent map writes") + } + + key := stringStructOf(&ky) + hash := t.hasher(noescape(unsafe.Pointer(&ky)), uintptr(h.hash0)) + + // Set hashWriting after calling t.hasher for consistency with mapdelete + h.flags ^= hashWriting + + bucket := hash & bucketMask(h.B) + if h.growing() { + growWork_faststr(t, h, bucket) + } + b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize))) + bOrig := b + top := tophash(hash) +search: + for ; b != nil; b = b.overflow(t) { + for i, kptr := uintptr(0), b.keys(); i < bucketCnt; i, kptr = i+1, add(kptr, 2*goarch.PtrSize) { + k := (*stringStruct)(kptr) + if k.len != key.len || b.tophash[i] != top { + continue + } + if k.str != key.str && !memequal(k.str, key.str, uintptr(key.len)) { + continue + } + // Clear key's pointer. + k.str = nil + e := add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize)) + if t.elem.ptrdata != 0 { + memclrHasPointers(e, t.elem.size) + } else { + memclrNoHeapPointers(e, t.elem.size) + } + b.tophash[i] = emptyOne + // If the bucket now ends in a bunch of emptyOne states, + // change those to emptyRest states. 
+ if i == bucketCnt-1 { + if b.overflow(t) != nil && b.overflow(t).tophash[0] != emptyRest { + goto notLast + } + } else { + if b.tophash[i+1] != emptyRest { + goto notLast + } + } + for { + b.tophash[i] = emptyRest + if i == 0 { + if b == bOrig { + break // beginning of initial bucket, we're done. + } + // Find previous bucket, continue at its last entry. + c := b + for b = bOrig; b.overflow(t) != c; b = b.overflow(t) { + } + i = bucketCnt - 1 + } else { + i-- + } + if b.tophash[i] != emptyOne { + break + } + } + notLast: + h.count-- + // Reset the hash seed to make it more difficult for attackers to + // repeatedly trigger hash collisions. See issue 25237. + if h.count == 0 { + h.hash0 = fastrand() + } + break search + } + } + + if h.flags&hashWriting == 0 { + fatal("concurrent map writes") + } + h.flags &^= hashWriting +} + +func growWork_faststr(t *maptype, h *hmap, bucket uintptr) { + // make sure we evacuate the oldbucket corresponding + // to the bucket we're about to use + evacuate_faststr(t, h, bucket&h.oldbucketmask()) + + // evacuate one more oldbucket to make progress on growing + if h.growing() { + evacuate_faststr(t, h, h.nevacuate) + } +} + +func evacuate_faststr(t *maptype, h *hmap, oldbucket uintptr) { + b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize))) + newbit := h.noldbuckets() + if !evacuated(b) { + // TODO: reuse overflow buckets instead of using new ones, if there + // is no iterator using the old buckets. (If !oldIterator.) + + // xy contains the x and y (low and high) evacuation destinations. + var xy [2]evacDst + x := &xy[0] + x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize))) + x.k = add(unsafe.Pointer(x.b), dataOffset) + x.e = add(x.k, bucketCnt*2*goarch.PtrSize) + + if !h.sameSizeGrow() { + // Only calculate y pointers if we're growing bigger. + // Otherwise GC can see bad pointers. 
+ y := &xy[1] + y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize))) + y.k = add(unsafe.Pointer(y.b), dataOffset) + y.e = add(y.k, bucketCnt*2*goarch.PtrSize) + } + + for ; b != nil; b = b.overflow(t) { + k := add(unsafe.Pointer(b), dataOffset) + e := add(k, bucketCnt*2*goarch.PtrSize) + for i := 0; i < bucketCnt; i, k, e = i+1, add(k, 2*goarch.PtrSize), add(e, uintptr(t.elemsize)) { + top := b.tophash[i] + if isEmpty(top) { + b.tophash[i] = evacuatedEmpty + continue + } + if top < minTopHash { + throw("bad map state") + } + var useY uint8 + if !h.sameSizeGrow() { + // Compute hash to make our evacuation decision (whether we need + // to send this key/elem to bucket x or bucket y). + hash := t.hasher(k, uintptr(h.hash0)) + if hash&newbit != 0 { + useY = 1 + } + } + + b.tophash[i] = evacuatedX + useY // evacuatedX + 1 == evacuatedY, enforced in makemap + dst := &xy[useY] // evacuation destination + + if dst.i == bucketCnt { + dst.b = h.newoverflow(t, dst.b) + dst.i = 0 + dst.k = add(unsafe.Pointer(dst.b), dataOffset) + dst.e = add(dst.k, bucketCnt*2*goarch.PtrSize) + } + dst.b.tophash[dst.i&(bucketCnt-1)] = top // mask dst.i as an optimization, to avoid a bounds check + + // Copy key. + *(*string)(dst.k) = *(*string)(k) + + typedmemmove(t.elem, dst.e, e) + dst.i++ + // These updates might push these pointers past the end of the + // key or elem arrays. That's ok, as we have the overflow pointer + // at the end of the bucket to protect against pointing past the + // end of the bucket. + dst.k = add(dst.k, 2*goarch.PtrSize) + dst.e = add(dst.e, uintptr(t.elemsize)) + } + } + // Unlink the overflow buckets & clear key/elem to help GC. + if h.flags&oldIterator == 0 && t.bucket.ptrdata != 0 { + b := add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)) + // Preserve b.tophash because the evacuation + // state is maintained there. 
+ ptr := add(b, dataOffset) + n := uintptr(t.bucketsize) - dataOffset + memclrHasPointers(ptr, n) + } + } + + if oldbucket == h.nevacuate { + advanceEvacuationMark(h, t, newbit) + } +} diff --git a/src/runtime/map_test.go b/src/runtime/map_test.go new file mode 100644 index 0000000..4afbae6 --- /dev/null +++ b/src/runtime/map_test.go @@ -0,0 +1,1236 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "internal/goarch" + "math" + "reflect" + "runtime" + "sort" + "strconv" + "strings" + "sync" + "testing" +) + +func TestHmapSize(t *testing.T) { + // The structure of hmap is defined in runtime/map.go + // and in cmd/compile/internal/gc/reflect.go and must be in sync. + // The size of hmap should be 48 bytes on 64 bit and 28 bytes on 32 bit platforms. + var hmapSize = uintptr(8 + 5*goarch.PtrSize) + if runtime.RuntimeHmapSize != hmapSize { + t.Errorf("sizeof(runtime.hmap{})==%d, want %d", runtime.RuntimeHmapSize, hmapSize) + } + +} + +// negative zero is a good test because: +// 1. 0 and -0 are equal, yet have distinct representations. +// 2. 0 is represented as all zeros, -0 isn't. +// +// I'm not sure the language spec actually requires this behavior, +// but it's what the current map implementation does. 
+func TestNegativeZero(t *testing.T) { + m := make(map[float64]bool, 0) + + m[+0.0] = true + m[math.Copysign(0.0, -1.0)] = true // should overwrite +0 entry + + if len(m) != 1 { + t.Error("length wrong") + } + + for k := range m { + if math.Copysign(1.0, k) > 0 { + t.Error("wrong sign") + } + } + + m = make(map[float64]bool, 0) + m[math.Copysign(0.0, -1.0)] = true + m[+0.0] = true // should overwrite -0.0 entry + + if len(m) != 1 { + t.Error("length wrong") + } + + for k := range m { + if math.Copysign(1.0, k) < 0 { + t.Error("wrong sign") + } + } +} + +func testMapNan(t *testing.T, m map[float64]int) { + if len(m) != 3 { + t.Error("length wrong") + } + s := 0 + for k, v := range m { + if k == k { + t.Error("nan disappeared") + } + if (v & (v - 1)) != 0 { + t.Error("value wrong") + } + s |= v + } + if s != 7 { + t.Error("values wrong") + } +} + +// nan is a good test because nan != nan, and nan has +// a randomized hash value. +func TestMapAssignmentNan(t *testing.T) { + m := make(map[float64]int, 0) + nan := math.NaN() + + // Test assignment. + m[nan] = 1 + m[nan] = 2 + m[nan] = 4 + testMapNan(t, m) +} + +// nan is a good test because nan != nan, and nan has +// a randomized hash value. +func TestMapOperatorAssignmentNan(t *testing.T) { + m := make(map[float64]int, 0) + nan := math.NaN() + + // Test assignment operations. + m[nan] += 1 + m[nan] += 2 + m[nan] += 4 + testMapNan(t, m) +} + +func TestMapOperatorAssignment(t *testing.T) { + m := make(map[int]int, 0) + + // "m[k] op= x" is rewritten into "m[k] = m[k] op x" + // differently when op is / or % than when it isn't. + // Simple test to make sure they all work as expected. 
+ m[0] = 12345 + m[0] += 67890 + m[0] /= 123 + m[0] %= 456 + + const want = (12345 + 67890) / 123 % 456 + if got := m[0]; got != want { + t.Errorf("got %d, want %d", got, want) + } +} + +var sinkAppend bool + +func TestMapAppendAssignment(t *testing.T) { + m := make(map[int][]int, 0) + + m[0] = nil + m[0] = append(m[0], 12345) + m[0] = append(m[0], 67890) + sinkAppend, m[0] = !sinkAppend, append(m[0], 123, 456) + a := []int{7, 8, 9, 0} + m[0] = append(m[0], a...) + + want := []int{12345, 67890, 123, 456, 7, 8, 9, 0} + if got := m[0]; !reflect.DeepEqual(got, want) { + t.Errorf("got %v, want %v", got, want) + } +} + +// Maps aren't actually copied on assignment. +func TestAlias(t *testing.T) { + m := make(map[int]int, 0) + m[0] = 5 + n := m + n[0] = 6 + if m[0] != 6 { + t.Error("alias didn't work") + } +} + +func TestGrowWithNaN(t *testing.T) { + m := make(map[float64]int, 4) + nan := math.NaN() + + // Use both assignment and assignment operations as they may + // behave differently. + m[nan] = 1 + m[nan] = 2 + m[nan] += 4 + + cnt := 0 + s := 0 + growflag := true + for k, v := range m { + if growflag { + // force a hashtable resize + for i := 0; i < 50; i++ { + m[float64(i)] = i + } + for i := 50; i < 100; i++ { + m[float64(i)] += i + } + growflag = false + } + if k != k { + cnt++ + s |= v + } + } + if cnt != 3 { + t.Error("NaN keys lost during grow") + } + if s != 7 { + t.Error("NaN values lost during grow") + } +} + +type FloatInt struct { + x float64 + y int +} + +func TestGrowWithNegativeZero(t *testing.T) { + negzero := math.Copysign(0.0, -1.0) + m := make(map[FloatInt]int, 4) + m[FloatInt{0.0, 0}] = 1 + m[FloatInt{0.0, 1}] += 2 + m[FloatInt{0.0, 2}] += 4 + m[FloatInt{0.0, 3}] = 8 + growflag := true + s := 0 + cnt := 0 + negcnt := 0 + // The first iteration should return the +0 key. + // The subsequent iterations should return the -0 key. + // I'm not really sure this is required by the spec, + // but it makes sense. 
+ // TODO: are we allowed to get the first entry returned again??? + for k, v := range m { + if v == 0 { + continue + } // ignore entries added to grow table + cnt++ + if math.Copysign(1.0, k.x) < 0 { + if v&16 == 0 { + t.Error("key/value not updated together 1") + } + negcnt++ + s |= v & 15 + } else { + if v&16 == 16 { + t.Error("key/value not updated together 2", k, v) + } + s |= v + } + if growflag { + // force a hashtable resize + for i := 0; i < 100; i++ { + m[FloatInt{3.0, i}] = 0 + } + // then change all the entries + // to negative zero + m[FloatInt{negzero, 0}] = 1 | 16 + m[FloatInt{negzero, 1}] = 2 | 16 + m[FloatInt{negzero, 2}] = 4 | 16 + m[FloatInt{negzero, 3}] = 8 | 16 + growflag = false + } + } + if s != 15 { + t.Error("entry missing", s) + } + if cnt != 4 { + t.Error("wrong number of entries returned by iterator", cnt) + } + if negcnt != 3 { + t.Error("update to negzero missed by iteration", negcnt) + } +} + +func TestIterGrowAndDelete(t *testing.T) { + m := make(map[int]int, 4) + for i := 0; i < 100; i++ { + m[i] = i + } + growflag := true + for k := range m { + if growflag { + // grow the table + for i := 100; i < 1000; i++ { + m[i] = i + } + // delete all odd keys + for i := 1; i < 1000; i += 2 { + delete(m, i) + } + growflag = false + } else { + if k&1 == 1 { + t.Error("odd value returned") + } + } + } +} + +// make sure old bucket arrays don't get GCd while +// an iterator is still using them. 
+func TestIterGrowWithGC(t *testing.T) { + m := make(map[int]int, 4) + for i := 0; i < 8; i++ { + m[i] = i + } + for i := 8; i < 16; i++ { + m[i] += i + } + growflag := true + bitmask := 0 + for k := range m { + if k < 16 { + bitmask |= 1 << uint(k) + } + if growflag { + // grow the table + for i := 100; i < 1000; i++ { + m[i] = i + } + // trigger a gc + runtime.GC() + growflag = false + } + } + if bitmask != 1<<16-1 { + t.Error("missing key", bitmask) + } +} + +func testConcurrentReadsAfterGrowth(t *testing.T, useReflect bool) { + t.Parallel() + if runtime.GOMAXPROCS(-1) == 1 { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(16)) + } + numLoop := 10 + numGrowStep := 250 + numReader := 16 + if testing.Short() { + numLoop, numGrowStep = 2, 100 + } + for i := 0; i < numLoop; i++ { + m := make(map[int]int, 0) + for gs := 0; gs < numGrowStep; gs++ { + m[gs] = gs + var wg sync.WaitGroup + wg.Add(numReader * 2) + for nr := 0; nr < numReader; nr++ { + go func() { + defer wg.Done() + for range m { + } + }() + go func() { + defer wg.Done() + for key := 0; key < gs; key++ { + _ = m[key] + } + }() + if useReflect { + wg.Add(1) + go func() { + defer wg.Done() + mv := reflect.ValueOf(m) + keys := mv.MapKeys() + for _, k := range keys { + mv.MapIndex(k) + } + }() + } + } + wg.Wait() + } + } +} + +func TestConcurrentReadsAfterGrowth(t *testing.T) { + testConcurrentReadsAfterGrowth(t, false) +} + +func TestConcurrentReadsAfterGrowthReflect(t *testing.T) { + testConcurrentReadsAfterGrowth(t, true) +} + +func TestBigItems(t *testing.T) { + var key [256]string + for i := 0; i < 256; i++ { + key[i] = "foo" + } + m := make(map[[256]string][256]string, 4) + for i := 0; i < 100; i++ { + key[37] = fmt.Sprintf("string%02d", i) + m[key] = key + } + var keys [100]string + var values [100]string + i := 0 + for k, v := range m { + keys[i] = k[37] + values[i] = v[37] + i++ + } + sort.Strings(keys[:]) + sort.Strings(values[:]) + for i := 0; i < 100; i++ { + if keys[i] != fmt.Sprintf("string%02d", 
i) { + t.Errorf("#%d: missing key: %v", i, keys[i]) + } + if values[i] != fmt.Sprintf("string%02d", i) { + t.Errorf("#%d: missing value: %v", i, values[i]) + } + } +} + +func TestMapHugeZero(t *testing.T) { + type T [4000]byte + m := map[int]T{} + x := m[0] + if x != (T{}) { + t.Errorf("map value not zero") + } + y, ok := m[0] + if ok { + t.Errorf("map value should be missing") + } + if y != (T{}) { + t.Errorf("map value not zero") + } +} + +type empty struct { +} + +func TestEmptyKeyAndValue(t *testing.T) { + a := make(map[int]empty, 4) + b := make(map[empty]int, 4) + c := make(map[empty]empty, 4) + a[0] = empty{} + b[empty{}] = 0 + b[empty{}] = 1 + c[empty{}] = empty{} + + if len(a) != 1 { + t.Errorf("empty value insert problem") + } + if b[empty{}] != 1 { + t.Errorf("empty key returned wrong value") + } +} + +// Tests a map with a single bucket, with same-lengthed short keys +// ("quick keys") as well as long keys. +func TestSingleBucketMapStringKeys_DupLen(t *testing.T) { + testMapLookups(t, map[string]string{ + "x": "x1val", + "xx": "x2val", + "foo": "fooval", + "bar": "barval", // same key length as "foo" + "xxxx": "x4val", + strings.Repeat("x", 128): "longval1", + strings.Repeat("y", 128): "longval2", + }) +} + +// Tests a map with a single bucket, with all keys having different lengths. +func TestSingleBucketMapStringKeys_NoDupLen(t *testing.T) { + testMapLookups(t, map[string]string{ + "x": "x1val", + "xx": "x2val", + "foo": "fooval", + "xxxx": "x4val", + "xxxxx": "x5val", + "xxxxxx": "x6val", + strings.Repeat("x", 128): "longval", + }) +} + +func testMapLookups(t *testing.T, m map[string]string) { + for k, v := range m { + if m[k] != v { + t.Fatalf("m[%q] = %q; want %q", k, m[k], v) + } + } +} + +// Tests whether the iterator returns the right elements when +// started in the middle of a grow, when the keys are NaNs. 
+func TestMapNanGrowIterator(t *testing.T) { + m := make(map[float64]int) + nan := math.NaN() + const nBuckets = 16 + // To fill nBuckets buckets takes LOAD * nBuckets keys. + nKeys := int(nBuckets * runtime.HashLoad) + + // Get map to full point with nan keys. + for i := 0; i < nKeys; i++ { + m[nan] = i + } + // Trigger grow + m[1.0] = 1 + delete(m, 1.0) + + // Run iterator + found := make(map[int]struct{}) + for _, v := range m { + if v != -1 { + if _, repeat := found[v]; repeat { + t.Fatalf("repeat of value %d", v) + } + found[v] = struct{}{} + } + if len(found) == nKeys/2 { + // Halfway through iteration, finish grow. + for i := 0; i < nBuckets; i++ { + delete(m, 1.0) + } + } + } + if len(found) != nKeys { + t.Fatalf("missing value") + } +} + +func TestMapIterOrder(t *testing.T) { + for _, n := range [...]int{3, 7, 9, 15} { + for i := 0; i < 1000; i++ { + // Make m be {0: true, 1: true, ..., n-1: true}. + m := make(map[int]bool) + for i := 0; i < n; i++ { + m[i] = true + } + // Check that iterating over the map produces at least two different orderings. + ord := func() []int { + var s []int + for key := range m { + s = append(s, key) + } + return s + } + first := ord() + ok := false + for try := 0; try < 100; try++ { + if !reflect.DeepEqual(first, ord()) { + ok = true + break + } + } + if !ok { + t.Errorf("Map with n=%d elements had consistent iteration order: %v", n, first) + break + } + } + } +} + +// Issue 8410 +func TestMapSparseIterOrder(t *testing.T) { + // Run several rounds to increase the probability + // of failure. One is not enough. +NextRound: + for round := 0; round < 10; round++ { + m := make(map[int]bool) + // Add 1000 items, remove 980. + for i := 0; i < 1000; i++ { + m[i] = true + } + for i := 20; i < 1000; i++ { + delete(m, i) + } + + var first []int + for i := range m { + first = append(first, i) + } + + // 800 chances to get a different iteration order. + // See bug 8736 for why we need so many tries. 
+ for n := 0; n < 800; n++ { + idx := 0 + for i := range m { + if i != first[idx] { + // iteration order changed. + continue NextRound + } + idx++ + } + } + t.Fatalf("constant iteration order on round %d: %v", round, first) + } +} + +func TestMapStringBytesLookup(t *testing.T) { + // Use large string keys to avoid small-allocation coalescing, + // which can cause AllocsPerRun to report lower counts than it should. + m := map[string]int{ + "1000000000000000000000000000000000000000000000000": 1, + "2000000000000000000000000000000000000000000000000": 2, + } + buf := []byte("1000000000000000000000000000000000000000000000000") + if x := m[string(buf)]; x != 1 { + t.Errorf(`m[string([]byte("1"))] = %d, want 1`, x) + } + buf[0] = '2' + if x := m[string(buf)]; x != 2 { + t.Errorf(`m[string([]byte("2"))] = %d, want 2`, x) + } + + var x int + n := testing.AllocsPerRun(100, func() { + x += m[string(buf)] + }) + if n != 0 { + t.Errorf("AllocsPerRun for m[string(buf)] = %v, want 0", n) + } + + x = 0 + n = testing.AllocsPerRun(100, func() { + y, ok := m[string(buf)] + if !ok { + panic("!ok") + } + x += y + }) + if n != 0 { + t.Errorf("AllocsPerRun for x,ok = m[string(buf)] = %v, want 0", n) + } +} + +func TestMapLargeKeyNoPointer(t *testing.T) { + const ( + I = 1000 + N = 64 + ) + type T [N]int + m := make(map[T]int) + for i := 0; i < I; i++ { + var v T + for j := 0; j < N; j++ { + v[j] = i + j + } + m[v] = i + } + runtime.GC() + for i := 0; i < I; i++ { + var v T + for j := 0; j < N; j++ { + v[j] = i + j + } + if m[v] != i { + t.Fatalf("corrupted map: want %+v, got %+v", i, m[v]) + } + } +} + +func TestMapLargeValNoPointer(t *testing.T) { + const ( + I = 1000 + N = 64 + ) + type T [N]int + m := make(map[int]T) + for i := 0; i < I; i++ { + var v T + for j := 0; j < N; j++ { + v[j] = i + j + } + m[i] = v + } + runtime.GC() + for i := 0; i < I; i++ { + var v T + for j := 0; j < N; j++ { + v[j] = i + j + } + v1 := m[i] + for j := 0; j < N; j++ { + if v1[j] != v[j] { + 
t.Fatalf("corrupted map: want %+v, got %+v", v, v1) + } + } + } +} + +// Test that making a map with a large or invalid hint +// doesn't panic. (Issue 19926). +func TestIgnoreBogusMapHint(t *testing.T) { + for _, hint := range []int64{-1, 1 << 62} { + _ = make(map[int]int, hint) + } +} + +var mapBucketTests = [...]struct { + n int // n is the number of map elements + noescape int // number of expected buckets for non-escaping map + escape int // number of expected buckets for escaping map +}{ + {-(1 << 30), 1, 1}, + {-1, 1, 1}, + {0, 1, 1}, + {1, 1, 1}, + {8, 1, 1}, + {9, 2, 2}, + {13, 2, 2}, + {14, 4, 4}, + {26, 4, 4}, +} + +func TestMapBuckets(t *testing.T) { + // Test that maps of different sizes have the right number of buckets. + // Non-escaping maps with small buckets (like map[int]int) never + // have a nil bucket pointer due to starting with preallocated buckets + // on the stack. Escaping maps start with a non-nil bucket pointer if + // hint size is above bucketCnt and thereby have more than one bucket. + // These tests depend on bucketCnt and loadFactor* in map.go. 
+ t.Run("mapliteral", func(t *testing.T) { + for _, tt := range mapBucketTests { + localMap := map[int]int{} + if runtime.MapBucketsPointerIsNil(localMap) { + t.Errorf("no escape: buckets pointer is nil for non-escaping map") + } + for i := 0; i < tt.n; i++ { + localMap[i] = i + } + if got := runtime.MapBucketsCount(localMap); got != tt.noescape { + t.Errorf("no escape: n=%d want %d buckets, got %d", tt.n, tt.noescape, got) + } + escapingMap := runtime.Escape(map[int]int{}) + if count := runtime.MapBucketsCount(escapingMap); count > 1 && runtime.MapBucketsPointerIsNil(escapingMap) { + t.Errorf("escape: buckets pointer is nil for n=%d buckets", count) + } + for i := 0; i < tt.n; i++ { + escapingMap[i] = i + } + if got := runtime.MapBucketsCount(escapingMap); got != tt.escape { + t.Errorf("escape n=%d want %d buckets, got %d", tt.n, tt.escape, got) + } + } + }) + t.Run("nohint", func(t *testing.T) { + for _, tt := range mapBucketTests { + localMap := make(map[int]int) + if runtime.MapBucketsPointerIsNil(localMap) { + t.Errorf("no escape: buckets pointer is nil for non-escaping map") + } + for i := 0; i < tt.n; i++ { + localMap[i] = i + } + if got := runtime.MapBucketsCount(localMap); got != tt.noescape { + t.Errorf("no escape: n=%d want %d buckets, got %d", tt.n, tt.noescape, got) + } + escapingMap := runtime.Escape(make(map[int]int)) + if count := runtime.MapBucketsCount(escapingMap); count > 1 && runtime.MapBucketsPointerIsNil(escapingMap) { + t.Errorf("escape: buckets pointer is nil for n=%d buckets", count) + } + for i := 0; i < tt.n; i++ { + escapingMap[i] = i + } + if got := runtime.MapBucketsCount(escapingMap); got != tt.escape { + t.Errorf("escape: n=%d want %d buckets, got %d", tt.n, tt.escape, got) + } + } + }) + t.Run("makemap", func(t *testing.T) { + for _, tt := range mapBucketTests { + localMap := make(map[int]int, tt.n) + if runtime.MapBucketsPointerIsNil(localMap) { + t.Errorf("no escape: buckets pointer is nil for non-escaping map") + } + for i := 0; 
i < tt.n; i++ { + localMap[i] = i + } + if got := runtime.MapBucketsCount(localMap); got != tt.noescape { + t.Errorf("no escape: n=%d want %d buckets, got %d", tt.n, tt.noescape, got) + } + escapingMap := runtime.Escape(make(map[int]int, tt.n)) + if count := runtime.MapBucketsCount(escapingMap); count > 1 && runtime.MapBucketsPointerIsNil(escapingMap) { + t.Errorf("escape: buckets pointer is nil for n=%d buckets", count) + } + for i := 0; i < tt.n; i++ { + escapingMap[i] = i + } + if got := runtime.MapBucketsCount(escapingMap); got != tt.escape { + t.Errorf("escape: n=%d want %d buckets, got %d", tt.n, tt.escape, got) + } + } + }) + t.Run("makemap64", func(t *testing.T) { + for _, tt := range mapBucketTests { + localMap := make(map[int]int, int64(tt.n)) + if runtime.MapBucketsPointerIsNil(localMap) { + t.Errorf("no escape: buckets pointer is nil for non-escaping map") + } + for i := 0; i < tt.n; i++ { + localMap[i] = i + } + if got := runtime.MapBucketsCount(localMap); got != tt.noescape { + t.Errorf("no escape: n=%d want %d buckets, got %d", tt.n, tt.noescape, got) + } + escapingMap := runtime.Escape(make(map[int]int, tt.n)) + if count := runtime.MapBucketsCount(escapingMap); count > 1 && runtime.MapBucketsPointerIsNil(escapingMap) { + t.Errorf("escape: buckets pointer is nil for n=%d buckets", count) + } + for i := 0; i < tt.n; i++ { + escapingMap[i] = i + } + if got := runtime.MapBucketsCount(escapingMap); got != tt.escape { + t.Errorf("escape: n=%d want %d buckets, got %d", tt.n, tt.escape, got) + } + } + }) + +} + +func benchmarkMapPop(b *testing.B, n int) { + m := map[int]int{} + for i := 0; i < b.N; i++ { + for j := 0; j < n; j++ { + m[j] = j + } + for j := 0; j < n; j++ { + // Use iterator to pop an element. + // We want this to be fast, see issue 8412. 
+ for k := range m { + delete(m, k) + break + } + } + } +} + +func BenchmarkMapPop100(b *testing.B) { benchmarkMapPop(b, 100) } +func BenchmarkMapPop1000(b *testing.B) { benchmarkMapPop(b, 1000) } +func BenchmarkMapPop10000(b *testing.B) { benchmarkMapPop(b, 10000) } + +var testNonEscapingMapVariable int = 8 + +func TestNonEscapingMap(t *testing.T) { + n := testing.AllocsPerRun(1000, func() { + m := map[int]int{} + m[0] = 0 + }) + if n != 0 { + t.Fatalf("mapliteral: want 0 allocs, got %v", n) + } + n = testing.AllocsPerRun(1000, func() { + m := make(map[int]int) + m[0] = 0 + }) + if n != 0 { + t.Fatalf("no hint: want 0 allocs, got %v", n) + } + n = testing.AllocsPerRun(1000, func() { + m := make(map[int]int, 8) + m[0] = 0 + }) + if n != 0 { + t.Fatalf("with small hint: want 0 allocs, got %v", n) + } + n = testing.AllocsPerRun(1000, func() { + m := make(map[int]int, testNonEscapingMapVariable) + m[0] = 0 + }) + if n != 0 { + t.Fatalf("with variable hint: want 0 allocs, got %v", n) + } + +} + +func benchmarkMapAssignInt32(b *testing.B, n int) { + a := make(map[int32]int) + for i := 0; i < b.N; i++ { + a[int32(i&(n-1))] = i + } +} + +func benchmarkMapOperatorAssignInt32(b *testing.B, n int) { + a := make(map[int32]int) + for i := 0; i < b.N; i++ { + a[int32(i&(n-1))] += i + } +} + +func benchmarkMapAppendAssignInt32(b *testing.B, n int) { + a := make(map[int32][]int) + b.ReportAllocs() + b.ResetTimer() + for i := 0; i < b.N; i++ { + key := int32(i & (n - 1)) + a[key] = append(a[key], i) + } +} + +func benchmarkMapDeleteInt32(b *testing.B, n int) { + a := make(map[int32]int, n) + b.ResetTimer() + for i := 0; i < b.N; i++ { + if len(a) == 0 { + b.StopTimer() + for j := i; j < i+n; j++ { + a[int32(j)] = j + } + b.StartTimer() + } + delete(a, int32(i)) + } +} + +func benchmarkMapAssignInt64(b *testing.B, n int) { + a := make(map[int64]int) + for i := 0; i < b.N; i++ { + a[int64(i&(n-1))] = i + } +} + +func benchmarkMapOperatorAssignInt64(b *testing.B, n int) { + a := 
make(map[int64]int) + for i := 0; i < b.N; i++ { + a[int64(i&(n-1))] += i + } +} + +func benchmarkMapAppendAssignInt64(b *testing.B, n int) { + a := make(map[int64][]int) + b.ReportAllocs() + b.ResetTimer() + for i := 0; i < b.N; i++ { + key := int64(i & (n - 1)) + a[key] = append(a[key], i) + } +} + +func benchmarkMapDeleteInt64(b *testing.B, n int) { + a := make(map[int64]int, n) + b.ResetTimer() + for i := 0; i < b.N; i++ { + if len(a) == 0 { + b.StopTimer() + for j := i; j < i+n; j++ { + a[int64(j)] = j + } + b.StartTimer() + } + delete(a, int64(i)) + } +} + +func benchmarkMapAssignStr(b *testing.B, n int) { + k := make([]string, n) + for i := 0; i < len(k); i++ { + k[i] = strconv.Itoa(i) + } + b.ResetTimer() + a := make(map[string]int) + for i := 0; i < b.N; i++ { + a[k[i&(n-1)]] = i + } +} + +func benchmarkMapOperatorAssignStr(b *testing.B, n int) { + k := make([]string, n) + for i := 0; i < len(k); i++ { + k[i] = strconv.Itoa(i) + } + b.ResetTimer() + a := make(map[string]string) + for i := 0; i < b.N; i++ { + key := k[i&(n-1)] + a[key] += key + } +} + +func benchmarkMapAppendAssignStr(b *testing.B, n int) { + k := make([]string, n) + for i := 0; i < len(k); i++ { + k[i] = strconv.Itoa(i) + } + a := make(map[string][]string) + b.ReportAllocs() + b.ResetTimer() + for i := 0; i < b.N; i++ { + key := k[i&(n-1)] + a[key] = append(a[key], key) + } +} + +func benchmarkMapDeleteStr(b *testing.B, n int) { + i2s := make([]string, n) + for i := 0; i < n; i++ { + i2s[i] = strconv.Itoa(i) + } + a := make(map[string]int, n) + b.ResetTimer() + k := 0 + for i := 0; i < b.N; i++ { + if len(a) == 0 { + b.StopTimer() + for j := 0; j < n; j++ { + a[i2s[j]] = j + } + k = i + b.StartTimer() + } + delete(a, i2s[i-k]) + } +} + +func benchmarkMapDeletePointer(b *testing.B, n int) { + i2p := make([]*int, n) + for i := 0; i < n; i++ { + i2p[i] = new(int) + } + a := make(map[*int]int, n) + b.ResetTimer() + k := 0 + for i := 0; i < b.N; i++ { + if len(a) == 0 { + b.StopTimer() + for j 
:= 0; j < n; j++ { + a[i2p[j]] = j + } + k = i + b.StartTimer() + } + delete(a, i2p[i-k]) + } +} + +func runWith(f func(*testing.B, int), v ...int) func(*testing.B) { + return func(b *testing.B) { + for _, n := range v { + b.Run(strconv.Itoa(n), func(b *testing.B) { f(b, n) }) + } + } +} + +func BenchmarkMapAssign(b *testing.B) { + b.Run("Int32", runWith(benchmarkMapAssignInt32, 1<<8, 1<<16)) + b.Run("Int64", runWith(benchmarkMapAssignInt64, 1<<8, 1<<16)) + b.Run("Str", runWith(benchmarkMapAssignStr, 1<<8, 1<<16)) +} + +func BenchmarkMapOperatorAssign(b *testing.B) { + b.Run("Int32", runWith(benchmarkMapOperatorAssignInt32, 1<<8, 1<<16)) + b.Run("Int64", runWith(benchmarkMapOperatorAssignInt64, 1<<8, 1<<16)) + b.Run("Str", runWith(benchmarkMapOperatorAssignStr, 1<<8, 1<<16)) +} + +func BenchmarkMapAppendAssign(b *testing.B) { + b.Run("Int32", runWith(benchmarkMapAppendAssignInt32, 1<<8, 1<<16)) + b.Run("Int64", runWith(benchmarkMapAppendAssignInt64, 1<<8, 1<<16)) + b.Run("Str", runWith(benchmarkMapAppendAssignStr, 1<<8, 1<<16)) +} + +func BenchmarkMapDelete(b *testing.B) { + b.Run("Int32", runWith(benchmarkMapDeleteInt32, 100, 1000, 10000)) + b.Run("Int64", runWith(benchmarkMapDeleteInt64, 100, 1000, 10000)) + b.Run("Str", runWith(benchmarkMapDeleteStr, 100, 1000, 10000)) + b.Run("Pointer", runWith(benchmarkMapDeletePointer, 100, 1000, 10000)) +} + +func TestDeferDeleteSlow(t *testing.T) { + ks := []complex128{0, 1, 2, 3} + + m := make(map[any]int) + for i, k := range ks { + m[k] = i + } + if len(m) != len(ks) { + t.Errorf("want %d elements, got %d", len(ks), len(m)) + } + + func() { + for _, k := range ks { + defer delete(m, k) + } + }() + if len(m) != 0 { + t.Errorf("want 0 elements, got %d", len(m)) + } +} + +// TestIncrementAfterDeleteValueInt and other test Issue 25936. +// Value types int, int32, int64 are affected. Value type string +// works as expected. 
+func TestIncrementAfterDeleteValueInt(t *testing.T) { + const key1 = 12 + const key2 = 13 + + m := make(map[int]int) + m[key1] = 99 + delete(m, key1) + m[key2]++ + if n2 := m[key2]; n2 != 1 { + t.Errorf("incremented 0 to %d", n2) + } +} + +func TestIncrementAfterDeleteValueInt32(t *testing.T) { + const key1 = 12 + const key2 = 13 + + m := make(map[int]int32) + m[key1] = 99 + delete(m, key1) + m[key2]++ + if n2 := m[key2]; n2 != 1 { + t.Errorf("incremented 0 to %d", n2) + } +} + +func TestIncrementAfterDeleteValueInt64(t *testing.T) { + const key1 = 12 + const key2 = 13 + + m := make(map[int]int64) + m[key1] = 99 + delete(m, key1) + m[key2]++ + if n2 := m[key2]; n2 != 1 { + t.Errorf("incremented 0 to %d", n2) + } +} + +func TestIncrementAfterDeleteKeyStringValueInt(t *testing.T) { + const key1 = "" + const key2 = "x" + + m := make(map[string]int) + m[key1] = 99 + delete(m, key1) + m[key2] += 1 + if n2 := m[key2]; n2 != 1 { + t.Errorf("incremented 0 to %d", n2) + } +} + +func TestIncrementAfterDeleteKeyValueString(t *testing.T) { + const key1 = "" + const key2 = "x" + + m := make(map[string]string) + m[key1] = "99" + delete(m, key1) + m[key2] += "1" + if n2 := m[key2]; n2 != "1" { + t.Errorf("appended '1' to empty (nil) string, got %s", n2) + } +} + +// TestIncrementAfterBulkClearKeyStringValueInt tests that map bulk +// deletion (mapclear) still works as expected. Note that it was not +// affected by Issue 25936. +func TestIncrementAfterBulkClearKeyStringValueInt(t *testing.T) { + const key1 = "" + const key2 = "x" + + m := make(map[string]int) + m[key1] = 99 + for k := range m { + delete(m, k) + } + m[key2]++ + if n2 := m[key2]; n2 != 1 { + t.Errorf("incremented 0 to %d", n2) + } +} + +func TestMapTombstones(t *testing.T) { + m := map[int]int{} + const N = 10000 + // Fill a map. + for i := 0; i < N; i++ { + m[i] = i + } + runtime.MapTombstoneCheck(m) + // Delete half of the entries. 
+ for i := 0; i < N; i += 2 { + delete(m, i) + } + runtime.MapTombstoneCheck(m) + // Add new entries to fill in holes. + for i := N; i < 3*N/2; i++ { + m[i] = i + } + runtime.MapTombstoneCheck(m) + // Delete everything. + for i := 0; i < 3*N/2; i++ { + delete(m, i) + } + runtime.MapTombstoneCheck(m) +} + +type canString int + +func (c canString) String() string { + return fmt.Sprintf("%d", int(c)) +} + +func TestMapInterfaceKey(t *testing.T) { + // Test all the special cases in runtime.typehash. + type GrabBag struct { + f32 float32 + f64 float64 + c64 complex64 + c128 complex128 + s string + i0 any + i1 interface { + String() string + } + a [4]string + } + + m := map[any]bool{} + // Put a bunch of data in m, so that a bad hash is likely to + // lead to a bad bucket, which will lead to a missed lookup. + for i := 0; i < 1000; i++ { + m[i] = true + } + m[GrabBag{f32: 1.0}] = true + if !m[GrabBag{f32: 1.0}] { + panic("f32 not found") + } + m[GrabBag{f64: 1.0}] = true + if !m[GrabBag{f64: 1.0}] { + panic("f64 not found") + } + m[GrabBag{c64: 1.0i}] = true + if !m[GrabBag{c64: 1.0i}] { + panic("c64 not found") + } + m[GrabBag{c128: 1.0i}] = true + if !m[GrabBag{c128: 1.0i}] { + panic("c128 not found") + } + m[GrabBag{s: "foo"}] = true + if !m[GrabBag{s: "foo"}] { + panic("string not found") + } + m[GrabBag{i0: "foo"}] = true + if !m[GrabBag{i0: "foo"}] { + panic("interface{} not found") + } + m[GrabBag{i1: canString(5)}] = true + if !m[GrabBag{i1: canString(5)}] { + panic("interface{String() string} not found") + } + m[GrabBag{a: [4]string{"foo", "bar", "baz", "bop"}}] = true + if !m[GrabBag{a: [4]string{"foo", "bar", "baz", "bop"}}] { + panic("array not found") + } +} diff --git a/src/runtime/mbarrier.go b/src/runtime/mbarrier.go new file mode 100644 index 0000000..46ef42f --- /dev/null +++ b/src/runtime/mbarrier.go @@ -0,0 +1,346 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector: write barriers. +// +// For the concurrent garbage collector, the Go compiler implements +// updates to pointer-valued fields that may be in heap objects by +// emitting calls to write barriers. The main write barrier for +// individual pointer writes is gcWriteBarrier and is implemented in +// assembly. This file contains write barrier entry points for bulk +// operations. See also mwbbuf.go. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +// Go uses a hybrid barrier that combines a Yuasa-style deletion +// barrier—which shades the object whose reference is being +// overwritten—with Dijkstra insertion barrier—which shades the object +// whose reference is being written. The insertion part of the barrier +// is necessary while the calling goroutine's stack is grey. In +// pseudocode, the barrier is: +// +// writePointer(slot, ptr): +// shade(*slot) +// if current stack is grey: +// shade(ptr) +// *slot = ptr +// +// slot is the destination in Go code. +// ptr is the value that goes into the slot in Go code. +// +// Shade indicates that it has seen a white pointer by adding the referent +// to wbuf as well as marking it. +// +// The two shades and the condition work together to prevent a mutator +// from hiding an object from the garbage collector: +// +// 1. shade(*slot) prevents a mutator from hiding an object by moving +// the sole pointer to it from the heap to its stack. If it attempts +// to unlink an object from the heap, this will shade it. +// +// 2. shade(ptr) prevents a mutator from hiding an object by moving +// the sole pointer to it from its stack into a black object in the +// heap. If it attempts to install the pointer into a black object, +// this will shade it. +// +// 3. Once a goroutine's stack is black, the shade(ptr) becomes +// unnecessary. 
shade(ptr) prevents hiding an object by moving it from +// the stack to the heap, but this requires first having a pointer +// hidden on the stack. Immediately after a stack is scanned, it only +// points to shaded objects, so it's not hiding anything, and the +// shade(*slot) prevents it from hiding any other pointers on its +// stack. +// +// For a detailed description of this barrier and proof of +// correctness, see https://github.com/golang/proposal/blob/master/design/17503-eliminate-rescan.md +// +// +// +// Dealing with memory ordering: +// +// Both the Yuasa and Dijkstra barriers can be made conditional on the +// color of the object containing the slot. We chose not to make these +// conditional because the cost of ensuring that the object holding +// the slot doesn't concurrently change color without the mutator +// noticing seems prohibitive. +// +// Consider the following example where the mutator writes into +// a slot and then loads the slot's mark bit while the GC thread +// writes to the slot's mark bit and then as part of scanning reads +// the slot. +// +// Initially both [slot] and [slotmark] are 0 (nil) +// Mutator thread GC thread +// st [slot], ptr st [slotmark], 1 +// +// ld r1, [slotmark] ld r2, [slot] +// +// Without an expensive memory barrier between the st and the ld, the final +// result on most HW (including 386/amd64) can be r1==r2==0. This is a classic +// example of what can happen when loads are allowed to be reordered with older +// stores (avoiding such reorderings lies at the heart of the classic +// Peterson/Dekker algorithms for mutual exclusion). Rather than require memory +// barriers, which will slow down both the mutator and the GC, we always grey +// the ptr object regardless of the slot's color. +// +// Another place where we intentionally omit memory barriers is when +// accessing mheap_.arena_used to check if a pointer points into the +// heap. 
On relaxed memory machines, it's possible for a mutator to +// extend the size of the heap by updating arena_used, allocate an +// object from this new region, and publish a pointer to that object, +// but for tracing running on another processor to observe the pointer +// but use the old value of arena_used. In this case, tracing will not +// mark the object, even though it's reachable. However, the mutator +// is guaranteed to execute a write barrier when it publishes the +// pointer, so it will take care of marking the object. A general +// consequence of this is that the garbage collector may cache the +// value of mheap_.arena_used. (See issue #9984.) +// +// +// Stack writes: +// +// The compiler omits write barriers for writes to the current frame, +// but if a stack pointer has been passed down the call stack, the +// compiler will generate a write barrier for writes through that +// pointer (because it doesn't know it's not a heap pointer). +// +// One might be tempted to ignore the write barrier if slot points +// into the stack. Don't do it! Mark termination only re-scans +// frames that have potentially been active since the concurrent scan, +// so it depends on write barriers to track changes to pointers in +// stack frames that have not been active. +// +// +// Global writes: +// +// The Go garbage collector requires write barriers when heap pointers +// are stored in globals. Many garbage collectors ignore writes to +// globals and instead pick up global -> heap pointers during +// termination. This increases pause time, so we instead rely on write +// barriers for writes to globals so that we don't have to rescan +// globals during mark termination. +// +// +// Publication ordering: +// +// The write barrier is *pre-publication*, meaning that the write +// barrier happens prior to the *slot = ptr write that may make ptr +// reachable by some goroutine that currently cannot reach it.
+// +// +// Signal handler pointer writes: +// +// In general, the signal handler cannot safely invoke the write +// barrier because it may run without a P or even during the write +// barrier. +// +// There is exactly one exception: profbuf.go omits a barrier during +// signal handler profile logging. That's safe only because of the +// deletion barrier. See profbuf.go for a detailed argument. If we +// remove the deletion barrier, we'll have to work out a new way to +// handle the profile logging. + +// typedmemmove copies a value of type typ to dst from src. +// Must be nosplit, see #16026. +// +// TODO: Perfect for go:nosplitrec since we can't have a safe point +// anywhere in the bulk barrier or memmove. +// +//go:nosplit +func typedmemmove(typ *_type, dst, src unsafe.Pointer) { + if dst == src { + return + } + if writeBarrier.needed && typ.ptrdata != 0 { + bulkBarrierPreWrite(uintptr(dst), uintptr(src), typ.ptrdata) + } + // There's a race here: if some other goroutine can write to + // src, it may change some pointer in src after we've + // performed the write barrier but before we perform the + // memory copy. This is safe because the write performed by that + // other goroutine must also be accompanied by a write + // barrier, so at worst we've unnecessarily greyed the old + // pointer that was in src.
+ memmove(dst, src, typ.size) + if writeBarrier.cgo { + cgoCheckMemmove(typ, dst, src, 0, typ.size) + } +} + +//go:linkname reflect_typedmemmove reflect.typedmemmove +func reflect_typedmemmove(typ *_type, dst, src unsafe.Pointer) { + if raceenabled { + raceWriteObjectPC(typ, dst, getcallerpc(), abi.FuncPCABIInternal(reflect_typedmemmove)) + raceReadObjectPC(typ, src, getcallerpc(), abi.FuncPCABIInternal(reflect_typedmemmove)) + } + if msanenabled { + msanwrite(dst, typ.size) + msanread(src, typ.size) + } + if asanenabled { + asanwrite(dst, typ.size) + asanread(src, typ.size) + } + typedmemmove(typ, dst, src) +} + +//go:linkname reflectlite_typedmemmove internal/reflectlite.typedmemmove +func reflectlite_typedmemmove(typ *_type, dst, src unsafe.Pointer) { + reflect_typedmemmove(typ, dst, src) +} + +// reflect_typedmemmovepartial is like typedmemmove but assumes that +// dst and src point off bytes into the value and only copies size bytes. +// off must be a multiple of goarch.PtrSize. +// +//go:linkname reflect_typedmemmovepartial reflect.typedmemmovepartial +func reflect_typedmemmovepartial(typ *_type, dst, src unsafe.Pointer, off, size uintptr) { + if writeBarrier.needed && typ.ptrdata > off && size >= goarch.PtrSize { + if off&(goarch.PtrSize-1) != 0 { + panic("reflect: internal error: misaligned offset") + } + pwsize := alignDown(size, goarch.PtrSize) + if poff := typ.ptrdata - off; pwsize > poff { + pwsize = poff + } + bulkBarrierPreWrite(uintptr(dst), uintptr(src), pwsize) + } + + memmove(dst, src, size) + if writeBarrier.cgo { + cgoCheckMemmove(typ, dst, src, off, size) + } +} + +// reflectcallmove is invoked by reflectcall to copy the return values +// out of the stack and into the heap, invoking the necessary write +// barriers. dst, src, and size describe the return value area to +// copy. typ describes the entire frame (not just the return values). +// typ may be nil, which indicates write barriers are not needed. 
+// +// It must be nosplit and must only call nosplit functions because the +// stack map of reflectcall is wrong. +// +//go:nosplit +func reflectcallmove(typ *_type, dst, src unsafe.Pointer, size uintptr, regs *abi.RegArgs) { + if writeBarrier.needed && typ != nil && typ.ptrdata != 0 && size >= goarch.PtrSize { + bulkBarrierPreWrite(uintptr(dst), uintptr(src), size) + } + memmove(dst, src, size) + + // Move pointers returned in registers to a place where the GC can see them. + for i := range regs.Ints { + if regs.ReturnIsPtr.Get(i) { + regs.Ptrs[i] = unsafe.Pointer(regs.Ints[i]) + } + } +} + +//go:nosplit +func typedslicecopy(typ *_type, dstPtr unsafe.Pointer, dstLen int, srcPtr unsafe.Pointer, srcLen int) int { + n := dstLen + if n > srcLen { + n = srcLen + } + if n == 0 { + return 0 + } + + // The compiler emits calls to typedslicecopy before + // instrumentation runs, so unlike the other copying and + // assignment operations, it's not instrumented in the calling + // code and needs its own instrumentation. + if raceenabled { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(slicecopy) + racewriterangepc(dstPtr, uintptr(n)*typ.size, callerpc, pc) + racereadrangepc(srcPtr, uintptr(n)*typ.size, callerpc, pc) + } + if msanenabled { + msanwrite(dstPtr, uintptr(n)*typ.size) + msanread(srcPtr, uintptr(n)*typ.size) + } + if asanenabled { + asanwrite(dstPtr, uintptr(n)*typ.size) + asanread(srcPtr, uintptr(n)*typ.size) + } + + if writeBarrier.cgo { + cgoCheckSliceCopy(typ, dstPtr, srcPtr, n) + } + + if dstPtr == srcPtr { + return n + } + + // Note: No point in checking typ.ptrdata here: + // compiler only emits calls to typedslicecopy for types with pointers, + // and growslice and reflect_typedslicecopy check for pointers + // before calling typedslicecopy. 
+ size := uintptr(n) * typ.size + if writeBarrier.needed { + pwsize := size - typ.size + typ.ptrdata + bulkBarrierPreWrite(uintptr(dstPtr), uintptr(srcPtr), pwsize) + } + // See typedmemmove for a discussion of the race between the + // barrier and memmove. + memmove(dstPtr, srcPtr, size) + return n +} + +//go:linkname reflect_typedslicecopy reflect.typedslicecopy +func reflect_typedslicecopy(elemType *_type, dst, src slice) int { + if elemType.ptrdata == 0 { + return slicecopy(dst.array, dst.len, src.array, src.len, elemType.size) + } + return typedslicecopy(elemType, dst.array, dst.len, src.array, src.len) +} + +// typedmemclr clears the typed memory at ptr with type typ. The +// memory at ptr must already be initialized (and hence in type-safe +// state). If the memory is being initialized for the first time, see +// memclrNoHeapPointers. +// +// If the caller knows that typ has pointers, it can alternatively +// call memclrHasPointers. +// +// TODO: A "go:nosplitrec" annotation would be perfect for this. +// +//go:nosplit +func typedmemclr(typ *_type, ptr unsafe.Pointer) { + if writeBarrier.needed && typ.ptrdata != 0 { + bulkBarrierPreWrite(uintptr(ptr), 0, typ.ptrdata) + } + memclrNoHeapPointers(ptr, typ.size) +} + +//go:linkname reflect_typedmemclr reflect.typedmemclr +func reflect_typedmemclr(typ *_type, ptr unsafe.Pointer) { + typedmemclr(typ, ptr) +} + +//go:linkname reflect_typedmemclrpartial reflect.typedmemclrpartial +func reflect_typedmemclrpartial(typ *_type, ptr unsafe.Pointer, off, size uintptr) { + if writeBarrier.needed && typ.ptrdata != 0 { + bulkBarrierPreWrite(uintptr(ptr), 0, size) + } + memclrNoHeapPointers(ptr, size) +} + +// memclrHasPointers clears n bytes of typed memory starting at ptr. +// The caller must ensure that the type of the object at ptr has +// pointers, usually by checking typ.ptrdata. However, ptr +// does not have to point to the start of the allocation. 
+// +//go:nosplit +func memclrHasPointers(ptr unsafe.Pointer, n uintptr) { + bulkBarrierPreWrite(uintptr(ptr), 0, n) + memclrNoHeapPointers(ptr, n) +} diff --git a/src/runtime/mbitmap.go b/src/runtime/mbitmap.go new file mode 100644 index 0000000..088b566 --- /dev/null +++ b/src/runtime/mbitmap.go @@ -0,0 +1,1501 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector: type and heap bitmaps. +// +// Stack, data, and bss bitmaps +// +// Stack frames and global variables in the data and bss sections are +// described by bitmaps with 1 bit per pointer-sized word. A "1" bit +// means the word is a live pointer to be visited by the GC (referred to +// as "pointer"). A "0" bit means the word should be ignored by GC +// (referred to as "scalar", though it could be a dead pointer value). +// +// Heap bitmap +// +// The heap bitmap comprises 1 bit for each pointer-sized word in the heap, +// recording whether a pointer is stored in that word or not. This bitmap +// is stored in the heapArena metadata backing each heap arena. +// That is, if ha is the heapArena for the arena starting at "start", +// then ha.bitmap[0] holds the 64 bits for the 64 words "start" +// through start+63*ptrSize, ha.bitmap[1] holds the entries for +// start+64*ptrSize through start+127*ptrSize, and so on. +// Bits correspond to words in little-endian order. ha.bitmap[0]&1 represents +// the word at "start", ha.bitmap[0]>>1&1 represents the word at start+8, etc. +// (For 32-bit platforms, s/64/32/.) +// +// We also keep a noMorePtrs bitmap which allows us to stop scanning +// the heap bitmap early in certain situations. If ha.noMorePtrs[i]>>j&1 +// is 1, then the object containing the last word described by ha.bitmap[8*i+j] +// has no more pointers beyond those described by ha.bitmap[8*i+j]. 
+// If ha.noMorePtrs[i]>>j&1 is set, the entries in ha.bitmap[8*i+j+1] and +// beyond must all be zero until the start of the next object. +// +// The bitmap for noscan spans is set to all zero at span allocation time. +// +// The bitmap for unallocated objects in scannable spans is not maintained +// (can be junk). + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// addb returns the byte pointer p+n. +// +//go:nowritebarrier +//go:nosplit +func addb(p *byte, n uintptr) *byte { + // Note: wrote out full expression instead of calling add(p, n) + // to reduce the number of temporaries generated by the + // compiler for this trivial expression during inlining. + return (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(p)) + n)) +} + +// subtractb returns the byte pointer p-n. +// +//go:nowritebarrier +//go:nosplit +func subtractb(p *byte, n uintptr) *byte { + // Note: wrote out full expression instead of calling add(p, -n) + // to reduce the number of temporaries generated by the + // compiler for this trivial expression during inlining. + return (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(p)) - n)) +} + +// add1 returns the byte pointer p+1. +// +//go:nowritebarrier +//go:nosplit +func add1(p *byte) *byte { + // Note: wrote out full expression instead of calling addb(p, 1) + // to reduce the number of temporaries generated by the + // compiler for this trivial expression during inlining. + return (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(p)) + 1)) +} + +// subtract1 returns the byte pointer p-1. +// +// nosplit because it is used during write barriers and must not be preempted. +// +//go:nowritebarrier +//go:nosplit +func subtract1(p *byte) *byte { + // Note: wrote out full expression instead of calling subtractb(p, 1) + // to reduce the number of temporaries generated by the + // compiler for this trivial expression during inlining. 
+ return (*byte)(unsafe.Pointer(uintptr(unsafe.Pointer(p)) - 1)) +} + +// markBits provides access to the mark bit for an object in the heap. +// bytep points to the byte holding the mark bit. +// mask is a byte with a single bit set that can be &ed with *bytep +// to see if the bit has been set. +// *m.bytep&m.mask != 0 indicates the mark bit is set. +// index can be used along with span information to generate +// the address of the object in the heap. +// We maintain one set of mark bits for allocation and one for +// marking purposes. +type markBits struct { + bytep *uint8 + mask uint8 + index uintptr +} + +//go:nosplit +func (s *mspan) allocBitsForIndex(allocBitIndex uintptr) markBits { + bytep, mask := s.allocBits.bitp(allocBitIndex) + return markBits{bytep, mask, allocBitIndex} +} + +// refillAllocCache takes 8 bytes of s.allocBits starting at whichByte +// and negates them so that ctz (count trailing zeros) instructions +// can be used. It then places these 8 bytes into the cached 64 bit +// s.allocCache. +func (s *mspan) refillAllocCache(whichByte uintptr) { + bytes := (*[8]uint8)(unsafe.Pointer(s.allocBits.bytep(whichByte))) + aCache := uint64(0) + aCache |= uint64(bytes[0]) + aCache |= uint64(bytes[1]) << (1 * 8) + aCache |= uint64(bytes[2]) << (2 * 8) + aCache |= uint64(bytes[3]) << (3 * 8) + aCache |= uint64(bytes[4]) << (4 * 8) + aCache |= uint64(bytes[5]) << (5 * 8) + aCache |= uint64(bytes[6]) << (6 * 8) + aCache |= uint64(bytes[7]) << (7 * 8) + s.allocCache = ^aCache +} + +// nextFreeIndex returns the index of the next free object in s at +// or after s.freeindex. +// There are hardware instructions that can be used to make this +// faster if profiling warrants it.
+func (s *mspan) nextFreeIndex() uintptr { + sfreeindex := s.freeindex + snelems := s.nelems + if sfreeindex == snelems { + return sfreeindex + } + if sfreeindex > snelems { + throw("s.freeindex > s.nelems") + } + + aCache := s.allocCache + + bitIndex := sys.TrailingZeros64(aCache) + for bitIndex == 64 { + // Move index to start of next cached bits. + sfreeindex = (sfreeindex + 64) &^ (64 - 1) + if sfreeindex >= snelems { + s.freeindex = snelems + return snelems + } + whichByte := sfreeindex / 8 + // Refill s.allocCache with the next 64 alloc bits. + s.refillAllocCache(whichByte) + aCache = s.allocCache + bitIndex = sys.TrailingZeros64(aCache) + // nothing available in cached bits + // grab the next 8 bytes and try again. + } + result := sfreeindex + uintptr(bitIndex) + if result >= snelems { + s.freeindex = snelems + return snelems + } + + s.allocCache >>= uint(bitIndex + 1) + sfreeindex = result + 1 + + if sfreeindex%64 == 0 && sfreeindex != snelems { + // We just incremented s.freeindex so it isn't 0. + // As each 1 in s.allocCache was encountered and used for allocation + // it was shifted away. At this point s.allocCache contains all 0s. + // Refill s.allocCache so that it corresponds + // to the bits at s.allocBits starting at s.freeindex. + whichByte := sfreeindex / 8 + s.refillAllocCache(whichByte) + } + s.freeindex = sfreeindex + return result +} + +// isFree reports whether the index'th object in s is unallocated. +// +// The caller must ensure s.state is mSpanInUse, and there must have +// been no preemption points since ensuring this (which could allow a +// GC transition, which would allow the state to change). +func (s *mspan) isFree(index uintptr) bool { + if index < s.freeIndexForScan { + return false + } + bytep, mask := s.allocBits.bitp(index) + return *bytep&mask == 0 +} + +// divideByElemSize returns n/s.elemsize. +// n must be within [0, s.npages*_PageSize), +// or may be exactly s.npages*_PageSize +// if s.elemsize is from sizeclasses.go. 
+func (s *mspan) divideByElemSize(n uintptr) uintptr { + const doubleCheck = false + + // See explanation in mksizeclasses.go's computeDivMagic. + q := uintptr((uint64(n) * uint64(s.divMul)) >> 32) + + if doubleCheck && q != n/s.elemsize { + println(n, "/", s.elemsize, "should be", n/s.elemsize, "but got", q) + throw("bad magic division") + } + return q +} + +func (s *mspan) objIndex(p uintptr) uintptr { + return s.divideByElemSize(p - s.base()) +} + +func markBitsForAddr(p uintptr) markBits { + s := spanOf(p) + objIndex := s.objIndex(p) + return s.markBitsForIndex(objIndex) +} + +func (s *mspan) markBitsForIndex(objIndex uintptr) markBits { + bytep, mask := s.gcmarkBits.bitp(objIndex) + return markBits{bytep, mask, objIndex} +} + +func (s *mspan) markBitsForBase() markBits { + return markBits{&s.gcmarkBits.x, uint8(1), 0} +} + +// isMarked reports whether mark bit m is set. +func (m markBits) isMarked() bool { + return *m.bytep&m.mask != 0 +} + +// setMarked sets the marked bit in the markbits, atomically. +func (m markBits) setMarked() { + // Might be racing with other updates, so use atomic update always. + // We used to be clever here and use a non-atomic update in certain + // cases, but it's not worth the risk. + atomic.Or8(m.bytep, m.mask) +} + +// setMarkedNonAtomic sets the marked bit in the markbits, non-atomically. +func (m markBits) setMarkedNonAtomic() { + *m.bytep |= m.mask +} + +// clearMarked clears the marked bit in the markbits, atomically. +func (m markBits) clearMarked() { + // Might be racing with other updates, so use atomic update always. + // We used to be clever here and use a non-atomic update in certain + // cases, but it's not worth the risk. + atomic.And8(m.bytep, ^m.mask) +} + +// markBitsForSpan returns the markBits for the span base address base. 
+func markBitsForSpan(base uintptr) (mbits markBits) { + mbits = markBitsForAddr(base) + if mbits.mask != 1 { + throw("markBitsForSpan: unaligned start") + } + return mbits +} + +// advance advances the markBits to the next object in the span. +func (m *markBits) advance() { + if m.mask == 1<<7 { + m.bytep = (*uint8)(unsafe.Pointer(uintptr(unsafe.Pointer(m.bytep)) + 1)) + m.mask = 1 + } else { + m.mask = m.mask << 1 + } + m.index++ +} + +// clobberdeadPtr is a special value that is used by the compiler to +// clobber dead stack slots, when the -clobberdead flag is set. +const clobberdeadPtr = uintptr(0xdeaddead | 0xdeaddead<<((^uintptr(0)>>63)*32)) + +// badPointer throws bad pointer in heap panic. +func badPointer(s *mspan, p, refBase, refOff uintptr) { + // Typically this indicates an incorrect use + // of unsafe or cgo to store a bad pointer in + // the Go heap. It may also indicate a runtime + // bug. + // + // TODO(austin): We could be more aggressive + // and detect pointers to unallocated objects + // in allocated spans. + printlock() + print("runtime: pointer ", hex(p)) + if s != nil { + state := s.state.get() + if state != mSpanInUse { + print(" to unallocated span") + } else { + print(" to unused region of span") + } + print(" span.base()=", hex(s.base()), " span.limit=", hex(s.limit), " span.state=", state) + } + print("\n") + if refBase != 0 { + print("runtime: found in object at *(", hex(refBase), "+", hex(refOff), ")\n") + gcDumpObject("object", refBase, refOff) + } + getg().m.traceback = 2 + throw("found bad pointer in Go heap (incorrect use of unsafe or cgo?)") +} + +// findObject returns the base address for the heap object containing +// the address p, the object's span, and the index of the object in s. +// If p does not point into a heap object, it returns base == 0. +// +// If p is an invalid heap pointer and debug.invalidptr != 0, +// findObject panics.
+// +// refBase and refOff optionally give the base address of the object +// in which the pointer p was found and the byte offset at which it +// was found. These are used for error reporting. +// +// It is nosplit so it is safe for p to be a pointer to the current goroutine's stack. +// Since p is a uintptr, it would not be adjusted if the stack were to move. +// +//go:nosplit +func findObject(p, refBase, refOff uintptr) (base uintptr, s *mspan, objIndex uintptr) { + s = spanOf(p) + // If s is nil, the virtual address has never been part of the heap. + // This pointer may be to some mmap'd region, so we allow it. + if s == nil { + if (GOARCH == "amd64" || GOARCH == "arm64") && p == clobberdeadPtr && debug.invalidptr != 0 { + // Crash if clobberdeadPtr is seen. Only on AMD64 and ARM64 for now, + // as they are the only platforms where the compiler's clobberdead mode is + // implemented. On these platforms clobberdeadPtr cannot be a valid address. + badPointer(s, p, refBase, refOff) + } + return + } + // If p is a bad pointer, it may not be in s's bounds. + // + // Check s.state to synchronize with span initialization + // before checking other fields. See also spanOfHeap. + if state := s.state.get(); state != mSpanInUse || p < s.base() || p >= s.limit { + // Pointers into stacks are also ok; the runtime manages these explicitly. + if state == mSpanManual { + return + } + // The following ensures that we are rigorous about what data + // structures hold valid pointers. + if debug.invalidptr != 0 { + badPointer(s, p, refBase, refOff) + } + return + } + + objIndex = s.objIndex(p) + base = s.base() + objIndex*s.elemsize + return +} + +// reflect_verifyNotInHeapPtr reports whether converting the not-in-heap pointer into an unsafe.Pointer is ok. +// +//go:linkname reflect_verifyNotInHeapPtr reflect.verifyNotInHeapPtr +func reflect_verifyNotInHeapPtr(p uintptr) bool { + // Conversion to a pointer is ok as long as findObject above does not call badPointer.
+ // Since we're already promised that p doesn't point into the heap, just disallow heap + // pointers and the special clobbered pointer. + return spanOf(p) == nil && p != clobberdeadPtr +} + +const ptrBits = 8 * goarch.PtrSize + +// heapBits provides access to the bitmap bits for a single heap word. +// The methods on heapBits take value receivers so that the compiler +// can more easily inline calls to those methods and registerize the +// struct fields independently. +type heapBits struct { + // heapBits will report on pointers in the range [addr,addr+size). + // The low bit of mask contains the pointerness of the word at addr + // (assuming valid>0). + addr, size uintptr + + // The next few pointer bits representing words starting at addr. + // Those bits already returned by next() are zeroed. + mask uintptr + // Number of bits in mask that are valid. mask is always less than 1<<valid. + valid uintptr +} + +// heapBitsForAddr returns the heapBits for the address addr. +// The caller must ensure [addr,addr+size) is in an allocated span. +// In particular, be careful not to point past the end of an object. +// +// nosplit because it is used during write barriers and must not be preempted. +// +//go:nosplit +func heapBitsForAddr(addr, size uintptr) heapBits { + // Find arena + ai := arenaIndex(addr) + ha := mheap_.arenas[ai.l1()][ai.l2()] + + // Word index in arena. + word := addr / goarch.PtrSize % heapArenaWords + + // Word index and bit offset in bitmap array. + idx := word / ptrBits + off := word % ptrBits + + // Grab relevant bits of bitmap. + mask := ha.bitmap[idx] >> off + valid := ptrBits - off + + // Process depending on where the object ends. + nptr := size / goarch.PtrSize + if nptr < valid { + // Bits for this object end before the end of this bitmap word. + // Squash bits for the following objects. + mask &= 1<<(nptr&(ptrBits-1)) - 1 + valid = nptr + } else if nptr == valid { + // Bits for this object end at exactly the end of this bitmap word. 
+ // All good. + } else { + // Bits for this object extend into the next bitmap word. See if there + // may be any pointers recorded there. + if uintptr(ha.noMorePtrs[idx/8])>>(idx%8)&1 != 0 { + // No more pointers in this object after this bitmap word. + // Update size so we know not to look there. + size = valid * goarch.PtrSize + } + } + + return heapBits{addr: addr, size: size, mask: mask, valid: valid} +} + +// next returns the (absolute) address of the next known pointer and +// a heapBits iterator representing any remaining pointers. +// If there are no more pointers, returns address 0. +// Note that next does not modify h. The caller must record the result. +// +// nosplit because it is used during write barriers and must not be preempted. +// +//go:nosplit +func (h heapBits) next() (heapBits, uintptr) { + for { + if h.mask != 0 { + var i int + if goarch.PtrSize == 8 { + i = sys.TrailingZeros64(uint64(h.mask)) + } else { + i = sys.TrailingZeros32(uint32(h.mask)) + } + h.mask ^= uintptr(1) << (i & (ptrBits - 1)) + return h, h.addr + uintptr(i)*goarch.PtrSize + } + + // Skip words that we've already processed. + h.addr += h.valid * goarch.PtrSize + h.size -= h.valid * goarch.PtrSize + if h.size == 0 { + return h, 0 // no more pointers + } + + // Grab more bits and try again. + h = heapBitsForAddr(h.addr, h.size) + } +} + +// nextFast is like next, but can return 0 even when there are more pointers +// to be found. Callers should call next if nextFast returns 0 as its second +// return value. +// +// if h, addr = h.nextFast(); addr == 0 { +// if h, addr = h.next(); addr == 0 { +// ... no more pointers ... +// } +// } +// ... process pointer at addr ... +// +// nextFast is designed to be inlineable.
+// +//go:nosplit +func (h heapBits) nextFast() (heapBits, uintptr) { + // TESTQ/JEQ + if h.mask == 0 { + return h, 0 + } + // BSFQ + var i int + if goarch.PtrSize == 8 { + i = sys.TrailingZeros64(uint64(h.mask)) + } else { + i = sys.TrailingZeros32(uint32(h.mask)) + } + // BTCQ + h.mask ^= uintptr(1) << (i & (ptrBits - 1)) + // LEAQ (XX)(XX*8) + return h, h.addr + uintptr(i)*goarch.PtrSize +} + +// bulkBarrierPreWrite executes a write barrier +// for every pointer slot in the memory range [src, src+size), +// using pointer/scalar information from [dst, dst+size). +// This executes the write barriers necessary before a memmove. +// src, dst, and size must be pointer-aligned. +// The range [dst, dst+size) must lie within a single object. +// It does not perform the actual writes. +// +// As a special case, src == 0 indicates that this is being used for a +// memclr. bulkBarrierPreWrite will pass 0 for the src of each write +// barrier. +// +// Callers should call bulkBarrierPreWrite immediately before +// calling memmove(dst, src, size). This function is marked nosplit +// to avoid being preempted; the GC must not stop the goroutine +// between the memmove and the execution of the barriers. +// The caller is also responsible for cgo pointer checks if this +// may be writing Go pointers into non-Go memory. +// +// The pointer bitmap is not maintained for allocations containing +// no pointers at all; any caller of bulkBarrierPreWrite must first +// make sure the underlying allocation contains pointers, usually +// by checking typ.ptrdata. +// +// Callers must perform cgo checks if writeBarrier.cgo. +// +//go:nosplit +func bulkBarrierPreWrite(dst, src, size uintptr) { + if (dst|src|size)&(goarch.PtrSize-1) != 0 { + throw("bulkBarrierPreWrite: unaligned arguments") + } + if !writeBarrier.needed { + return + } + if s := spanOf(dst); s == nil { + // If dst is a global, use the data or BSS bitmaps to + // execute write barriers. 
+ for _, datap := range activeModules() { + if datap.data <= dst && dst < datap.edata { + bulkBarrierBitmap(dst, src, size, dst-datap.data, datap.gcdatamask.bytedata) + return + } + } + for _, datap := range activeModules() { + if datap.bss <= dst && dst < datap.ebss { + bulkBarrierBitmap(dst, src, size, dst-datap.bss, datap.gcbssmask.bytedata) + return + } + } + return + } else if s.state.get() != mSpanInUse || dst < s.base() || s.limit <= dst { + // dst was heap memory at some point, but isn't now. + // It can't be a global. It must be either our stack, + // or in the case of direct channel sends, it could be + // another stack. Either way, no need for barriers. + // This will also catch if dst is in a freed span, + // though that should never happen. + return + } + + buf := &getg().m.p.ptr().wbBuf + h := heapBitsForAddr(dst, size) + if src == 0 { + for { + var addr uintptr + if h, addr = h.next(); addr == 0 { + break + } + dstx := (*uintptr)(unsafe.Pointer(addr)) + if !buf.putFast(*dstx, 0) { + wbBufFlush(nil, 0) + } + } + } else { + for { + var addr uintptr + if h, addr = h.next(); addr == 0 { + break + } + dstx := (*uintptr)(unsafe.Pointer(addr)) + srcx := (*uintptr)(unsafe.Pointer(src + (addr - dst))) + if !buf.putFast(*dstx, *srcx) { + wbBufFlush(nil, 0) + } + } + } +} + +// bulkBarrierPreWriteSrcOnly is like bulkBarrierPreWrite but +// does not execute write barriers for [dst, dst+size). +// +// In addition to the requirements of bulkBarrierPreWrite +// callers need to ensure [dst, dst+size) is zeroed. +// +// This is used for special cases where e.g. dst was just +// created and zeroed with malloc.
+// +//go:nosplit +func bulkBarrierPreWriteSrcOnly(dst, src, size uintptr) { + if (dst|src|size)&(goarch.PtrSize-1) != 0 { + throw("bulkBarrierPreWrite: unaligned arguments") + } + if !writeBarrier.needed { + return + } + buf := &getg().m.p.ptr().wbBuf + h := heapBitsForAddr(dst, size) + for { + var addr uintptr + if h, addr = h.next(); addr == 0 { + break + } + srcx := (*uintptr)(unsafe.Pointer(addr - dst + src)) + if !buf.putFast(0, *srcx) { + wbBufFlush(nil, 0) + } + } +} + +// bulkBarrierBitmap executes write barriers for copying from [src, +// src+size) to [dst, dst+size) using a 1-bit pointer bitmap. src is +// assumed to start maskOffset bytes into the data covered by the +// bitmap in bits (which may not be a multiple of 8). +// +// This is used by bulkBarrierPreWrite for writes to data and BSS. +// +//go:nosplit +func bulkBarrierBitmap(dst, src, size, maskOffset uintptr, bits *uint8) { + word := maskOffset / goarch.PtrSize + bits = addb(bits, word/8) + mask := uint8(1) << (word % 8) + + buf := &getg().m.p.ptr().wbBuf + for i := uintptr(0); i < size; i += goarch.PtrSize { + if mask == 0 { + bits = addb(bits, 1) + if *bits == 0 { + // Skip 8 words. + i += 7 * goarch.PtrSize + continue + } + mask = 1 + } + if *bits&mask != 0 { + dstx := (*uintptr)(unsafe.Pointer(dst + i)) + if src == 0 { + if !buf.putFast(*dstx, 0) { + wbBufFlush(nil, 0) + } + } else { + srcx := (*uintptr)(unsafe.Pointer(src + i)) + if !buf.putFast(*dstx, *srcx) { + wbBufFlush(nil, 0) + } + } + } + mask <<= 1 + } +} + +// typeBitsBulkBarrier executes a write barrier for every +// pointer that would be copied from [src, src+size) to [dst, +// dst+size) by a memmove using the type bitmap to locate those +// pointer slots. +// +// The type typ must correspond exactly to [src, src+size) and [dst, dst+size). +// dst, src, and size must be pointer-aligned. +// The type typ must have a plain bitmap, not a GC program. 
+// The only use of this function is in channel sends, and the +// 64 kB channel element limit takes care of this for us. +// +// Must not be preempted because it typically runs right before memmove, +// and the GC must observe them as an atomic action. +// +// Callers must perform cgo checks if writeBarrier.cgo. +// +//go:nosplit +func typeBitsBulkBarrier(typ *_type, dst, src, size uintptr) { + if typ == nil { + throw("runtime: typeBitsBulkBarrier without type") + } + if typ.size != size { + println("runtime: typeBitsBulkBarrier with type ", typ.string(), " of size ", typ.size, " but memory size", size) + throw("runtime: invalid typeBitsBulkBarrier") + } + if typ.kind&kindGCProg != 0 { + println("runtime: typeBitsBulkBarrier with type ", typ.string(), " with GC prog") + throw("runtime: invalid typeBitsBulkBarrier") + } + if !writeBarrier.needed { + return + } + ptrmask := typ.gcdata + buf := &getg().m.p.ptr().wbBuf + var bits uint32 + for i := uintptr(0); i < typ.ptrdata; i += goarch.PtrSize { + if i&(goarch.PtrSize*8-1) == 0 { + bits = uint32(*ptrmask) + ptrmask = addb(ptrmask, 1) + } else { + bits = bits >> 1 + } + if bits&1 != 0 { + dstx := (*uintptr)(unsafe.Pointer(dst + i)) + srcx := (*uintptr)(unsafe.Pointer(src + i)) + if !buf.putFast(*dstx, *srcx) { + wbBufFlush(nil, 0) + } + } + } +} + +// initHeapBits initializes the heap bitmap for a span. +// If this is a span of single pointer allocations, it initializes all +// words to pointer. If forceClear is true, clears all bits. +func (s *mspan) initHeapBits(forceClear bool) { + if forceClear || s.spanclass.noscan() { + // Set all the pointer bits to zero. We do this once + // when the span is allocated so we don't have to do it + // for each object allocation.
+ base := s.base() + size := s.npages * pageSize + h := writeHeapBitsForAddr(base) + h.flush(base, size) + return + } + isPtrs := goarch.PtrSize == 8 && s.elemsize == goarch.PtrSize + if !isPtrs { + return // nothing to do + } + h := writeHeapBitsForAddr(s.base()) + size := s.npages * pageSize + nptrs := size / goarch.PtrSize + for i := uintptr(0); i < nptrs; i += ptrBits { + h = h.write(^uintptr(0), ptrBits) + } + h.flush(s.base(), size) +} + +// countAlloc returns the number of objects allocated in span s by +// scanning the allocation bitmap. +func (s *mspan) countAlloc() int { + count := 0 + bytes := divRoundUp(s.nelems, 8) + // Iterate over each 8-byte chunk and count allocations + // with an intrinsic. Note that newMarkBits guarantees that + // gcmarkBits will be 8-byte aligned, so we don't have to + // worry about edge cases; irrelevant bits will simply be zero. + for i := uintptr(0); i < bytes; i += 8 { + // Extract 64 bits from the byte pointer and get a OnesCount. + // Note that the unsafe cast here doesn't preserve endianness, + // but that's OK. We only care about how many bits are 1, not + // about the order we discover them in. + mrkBits := *(*uint64)(unsafe.Pointer(s.gcmarkBits.bytep(i))) + count += sys.OnesCount64(mrkBits) + } + return count +} + +type writeHeapBits struct { + addr uintptr // address that the low bit of mask represents the pointer state of. + mask uintptr // some pointer bits starting at the address addr. + valid uintptr // number of bits in mask that are valid (including low) + low uintptr // number of low-order bits to not overwrite +} + +func writeHeapBitsForAddr(addr uintptr) (h writeHeapBits) { + // We start writing bits maybe in the middle of a heap bitmap word. + // Remember how many bits into the word we started, so we can be sure + // not to overwrite the previous bits. + h.low = addr / goarch.PtrSize % ptrBits + + // round down to heap word that starts the bitmap word.
+ h.addr = addr - h.low*goarch.PtrSize + + // We don't have any bits yet. + h.mask = 0 + h.valid = h.low + + return +} + +// write appends the pointerness of the next valid pointer slots +// using the low valid bits of bits. 1=pointer, 0=scalar. +func (h writeHeapBits) write(bits, valid uintptr) writeHeapBits { + if h.valid+valid <= ptrBits { + // Fast path - just accumulate the bits. + h.mask |= bits << h.valid + h.valid += valid + return h + } + // Too many bits to fit in this word. Write the current word + // out and move on to the next word. + + data := h.mask | bits<<h.valid // mask for this word + h.mask = bits >> (ptrBits - h.valid) // leftover for next word + h.valid += valid - ptrBits // have h.valid+valid bits, writing ptrBits of them + + // Flush mask to the memory bitmap. + // TODO: figure out how to cache arena lookup. + ai := arenaIndex(h.addr) + ha := mheap_.arenas[ai.l1()][ai.l2()] + idx := h.addr / (ptrBits * goarch.PtrSize) % heapArenaBitmapWords + m := uintptr(1)<<h.low - 1 + ha.bitmap[idx] = ha.bitmap[idx]&m | data + // Note: no synchronization required for this write because + // the allocator has exclusive access to the page, and the bitmap + // entries are all for a single page. Also, visibility of these + // writes is guaranteed by the publication barrier in mallocgc. + + // Clear noMorePtrs bit, since we're going to be writing bits + // into the following word. + ha.noMorePtrs[idx/8] &^= uint8(1) << (idx % 8) + // Note: same as above + + // Move to next word of bitmap. + h.addr += ptrBits * goarch.PtrSize + h.low = 0 + return h +} + +// Add padding of size bytes. +func (h writeHeapBits) pad(size uintptr) writeHeapBits { + if size == 0 { + return h + } + words := size / goarch.PtrSize + for words > ptrBits { + h = h.write(0, ptrBits) + words -= ptrBits + } + return h.write(0, words) +} + +// Flush the bits that have been written, and add zeros as needed +// to cover the full object [addr, addr+size). 
+func (h writeHeapBits) flush(addr, size uintptr) { + // zeros counts the number of bits needed to represent the object minus the + // number of bits we've already written. This is the number of 0 bits + // that need to be added. + zeros := (addr+size-h.addr)/goarch.PtrSize - h.valid + + // Add zero bits up to the bitmap word boundary + if zeros > 0 { + z := ptrBits - h.valid + if z > zeros { + z = zeros + } + h.valid += z + zeros -= z + } + + // Find word in bitmap that we're going to write. + ai := arenaIndex(h.addr) + ha := mheap_.arenas[ai.l1()][ai.l2()] + idx := h.addr / (ptrBits * goarch.PtrSize) % heapArenaBitmapWords + + // Write remaining bits. + if h.valid != h.low { + m := uintptr(1)<<h.low - 1 // don't clear existing bits below "low" + m |= ^(uintptr(1)<<h.valid - 1) // don't clear existing bits above "valid" + ha.bitmap[idx] = ha.bitmap[idx]&m | h.mask + } + if zeros == 0 { + return + } + + // Record in the noMorePtrs map that there won't be any more 1 bits, + // so readers can stop early. + ha.noMorePtrs[idx/8] |= uint8(1) << (idx % 8) + + // Advance to next bitmap word. + h.addr += ptrBits * goarch.PtrSize + + // Continue on writing zeros for the rest of the object. + // For standard use of the ptr bits this is not required, as + // the bits are read from the beginning of the object. Some uses, + // like noscan spans, oblets, bulk write barriers, and cgocheck, might + // start mid-object, so these writes are still required. + for { + // Write zero bits. + ai := arenaIndex(h.addr) + ha := mheap_.arenas[ai.l1()][ai.l2()] + idx := h.addr / (ptrBits * goarch.PtrSize) % heapArenaBitmapWords + if zeros < ptrBits { + ha.bitmap[idx] &^= uintptr(1)<<zeros - 1 + break + } else if zeros == ptrBits { + ha.bitmap[idx] = 0 + break + } else { + ha.bitmap[idx] = 0 + zeros -= ptrBits + } + ha.noMorePtrs[idx/8] |= uint8(1) << (idx % 8) + h.addr += ptrBits * goarch.PtrSize + } +} + +// Read the bytes starting at the aligned pointer p into a uintptr. 
+// Read is little-endian. +func readUintptr(p *byte) uintptr { + x := *(*uintptr)(unsafe.Pointer(p)) + if goarch.BigEndian { + if goarch.PtrSize == 8 { + return uintptr(sys.Bswap64(uint64(x))) + } + return uintptr(sys.Bswap32(uint32(x))) + } + return x +} + +// heapBitsSetType records that the new allocation [x, x+size) +// holds in [x, x+dataSize) one or more values of type typ. +// (The number of values is given by dataSize / typ.size.) +// If dataSize < size, the fragment [x+dataSize, x+size) is +// recorded as non-pointer data. +// It is known that the type has pointers somewhere; +// malloc does not call heapBitsSetType when there are no pointers, +// because all free objects are marked as noscan during +// heapBitsSweepSpan. +// +// There can only be one allocation from a given span active at a time, +// and the bitmap for a span always falls on word boundaries, +// so there are no write-write races for access to the heap bitmap. +// Hence, heapBitsSetType can access the bitmap without atomics. +// +// There can be read-write races between heapBitsSetType and things +// that read the heap bitmap like scanobject. However, since +// heapBitsSetType is only used for objects that have not yet been +// made reachable, readers will ignore bits being modified by this +// function. This does mean this function cannot transiently modify +// bits that belong to neighboring objects. Also, on weakly-ordered +// machines, callers must execute a store/store (publication) barrier +// between calling this function and making the object reachable. +func heapBitsSetType(x, size, dataSize uintptr, typ *_type) { + const doubleCheck = false // slow but helpful; enable to test modifications to this code + + if doubleCheck && dataSize%typ.size != 0 { + throw("heapBitsSetType: dataSize not a multiple of typ.size") + } + + if goarch.PtrSize == 8 && size == goarch.PtrSize { + // It's one word and it has pointers, it must be a pointer. 
+ // Since all allocated one-word objects are pointers + // (non-pointers are aggregated into tinySize allocations), + // (*mspan).initHeapBits sets the pointer bits for us. + // Nothing to do here. + if doubleCheck { + h, addr := heapBitsForAddr(x, size).next() + if addr != x { + throw("heapBitsSetType: pointer bit missing") + } + _, addr = h.next() + if addr != 0 { + throw("heapBitsSetType: second pointer bit found") + } + } + return + } + + h := writeHeapBitsForAddr(x) + + // Handle GC program. + if typ.kind&kindGCProg != 0 { + // Expand the gc program into the storage we're going to use for the actual object. + obj := (*uint8)(unsafe.Pointer(x)) + n := runGCProg(addb(typ.gcdata, 4), obj) + // Use the expanded program to set the heap bits. + for i := uintptr(0); true; i += typ.size { + // Copy expanded program to heap bitmap. + p := obj + j := n + for j > 8 { + h = h.write(uintptr(*p), 8) + p = add1(p) + j -= 8 + } + h = h.write(uintptr(*p), j) + + if i+typ.size == dataSize { + break // no padding after last element + } + + // Pad with zeros to the start of the next element. + h = h.pad(typ.size - n*goarch.PtrSize) + } + + h.flush(x, size) + + // Erase the expanded GC program. + memclrNoHeapPointers(unsafe.Pointer(obj), (n+7)/8) + return + } + + // Note about sizes: + // + // typ.size is the size of the object in bytes, + // and typ.ptrdata is the size in bytes of the prefix + // of the object that contains pointers. That is, the final + // typ.size - typ.ptrdata bytes contain no pointers. + // This allows optimization of a common pattern where + // an object has a small header followed by a large scalar + // buffer. If we know the pointers are over, we don't have + // to scan the buffer's heap bitmap at all. + // The 1-bit ptrmasks are sized to contain only bits for + // the typ.ptrdata prefix, zero padded out to a full byte + // of bitmap. If there is more room in the allocated object, + // that space is pointerless.
The noMorePtrs bitmap will prevent + // scanning large pointerless tails of an object. + // + // Replicated copies are not as nice: if there is an array of + // objects with scalar tails, all but the last tail does have to + // be initialized, because there is no way to say "skip forward". + + ptrs := typ.ptrdata / goarch.PtrSize + if typ.size == dataSize { // Single element + if ptrs <= ptrBits { // Single small element + m := readUintptr(typ.gcdata) + h = h.write(m, ptrs) + } else { // Single large element + p := typ.gcdata + for { + h = h.write(readUintptr(p), ptrBits) + p = addb(p, ptrBits/8) + ptrs -= ptrBits + if ptrs <= ptrBits { + break + } + } + m := readUintptr(p) + h = h.write(m, ptrs) + } + } else { // Repeated element + words := typ.size / goarch.PtrSize // total words, including scalar tail + if words <= ptrBits { // Repeated small element + n := dataSize / typ.size + m := readUintptr(typ.gcdata) + // Make larger unit to repeat + for words <= ptrBits/2 { + if n&1 != 0 { + h = h.write(m, words) + } + n /= 2 + m |= m << words + ptrs += words + words *= 2 + if n == 1 { + break + } + } + for n > 1 { + h = h.write(m, words) + n-- + } + h = h.write(m, ptrs) + } else { // Repeated large element + for i := uintptr(0); true; i += typ.size { + p := typ.gcdata + j := ptrs + for j > ptrBits { + h = h.write(readUintptr(p), ptrBits) + p = addb(p, ptrBits/8) + j -= ptrBits + } + m := readUintptr(p) + h = h.write(m, j) + if i+typ.size == dataSize { + break // don't need the trailing nonptr bits on the last element. + } + // Pad with zeros to the start of the next element. + h = h.pad(typ.size - typ.ptrdata) + } + } + } + h.flush(x, size) + + if doubleCheck { + h := heapBitsForAddr(x, size) + for i := uintptr(0); i < size; i += goarch.PtrSize { + // Compute the pointer bit we want at offset i. 
+ want := false + if i < dataSize { + off := i % typ.size + if off < typ.ptrdata { + j := off / goarch.PtrSize + want = *addb(typ.gcdata, j/8)>>(j%8)&1 != 0 + } + } + if want { + var addr uintptr + h, addr = h.next() + if addr != x+i { + throw("heapBitsSetType: pointer entry not correct") + } + } + } + if _, addr := h.next(); addr != 0 { + throw("heapBitsSetType: extra pointer") + } + } +} + +var debugPtrmask struct { + lock mutex + data *byte +} + +// progToPointerMask returns the 1-bit pointer mask output by the GC program prog. +// size is the size of the region described by prog, in bytes. +// The resulting bitvector will have no more than size/goarch.PtrSize bits. +func progToPointerMask(prog *byte, size uintptr) bitvector { + n := (size/goarch.PtrSize + 7) / 8 + x := (*[1 << 30]byte)(persistentalloc(n+1, 1, &memstats.buckhash_sys))[:n+1] + x[len(x)-1] = 0xa1 // overflow check sentinel + n = runGCProg(prog, &x[0]) + if x[len(x)-1] != 0xa1 { + throw("progToPointerMask: overflow") + } + return bitvector{int32(n), &x[0]} +} + +// Packed GC pointer bitmaps, aka GC programs. +// +// For large types containing arrays, the type information has a +// natural repetition that can be encoded to save space in the +// binary and in the memory representation of the type information. +// +// The encoding is a simple Lempel-Ziv style bytecode machine +// with the following instructions: +// +// 00000000: stop +// 0nnnnnnn: emit n bits copied from the next (n+7)/8 bytes +// 10000000 n c: repeat the previous n bits c times; n, c are varints +// 1nnnnnnn c: repeat the previous n bits c times; c is a varint + +// runGCProg returns the number of 1-bit entries written to memory. +func runGCProg(prog, dst *byte) uintptr { + dstStart := dst + + // Bits waiting to be written to memory. + var bits uintptr + var nbits uintptr + + p := prog +Run: + for { + // Flush accumulated full bytes. + // The rest of the loop assumes that nbits <= 7.
+ for ; nbits >= 8; nbits -= 8 { + *dst = uint8(bits) + dst = add1(dst) + bits >>= 8 + } + + // Process one instruction. + inst := uintptr(*p) + p = add1(p) + n := inst & 0x7F + if inst&0x80 == 0 { + // Literal bits; n == 0 means end of program. + if n == 0 { + // Program is over. + break Run + } + nbyte := n / 8 + for i := uintptr(0); i < nbyte; i++ { + bits |= uintptr(*p) << nbits + p = add1(p) + *dst = uint8(bits) + dst = add1(dst) + bits >>= 8 + } + if n %= 8; n > 0 { + bits |= uintptr(*p) << nbits + p = add1(p) + nbits += n + } + continue Run + } + + // Repeat. If n == 0, it is encoded in a varint in the next bytes. + if n == 0 { + for off := uint(0); ; off += 7 { + x := uintptr(*p) + p = add1(p) + n |= (x & 0x7F) << off + if x&0x80 == 0 { + break + } + } + } + + // Count is encoded in a varint in the next bytes. + c := uintptr(0) + for off := uint(0); ; off += 7 { + x := uintptr(*p) + p = add1(p) + c |= (x & 0x7F) << off + if x&0x80 == 0 { + break + } + } + c *= n // now total number of bits to copy + + // If the number of bits being repeated is small, load them + // into a register and use that register for the entire loop + // instead of repeatedly reading from memory. + // Handling fewer than 8 bits here makes the general loop simpler. + // The cutoff is goarch.PtrSize*8 - 7 to guarantee that when we add + // the pattern to a bit buffer holding at most 7 bits (a partial byte) + // it will not overflow. + src := dst + const maxBits = goarch.PtrSize*8 - 7 + if n <= maxBits { + // Start with bits in output buffer. + pattern := bits + npattern := nbits + + // If we need more bits, fetch them from memory. + src = subtract1(src) + for npattern < n { + pattern <<= 8 + pattern |= uintptr(*src) + src = subtract1(src) + npattern += 8 + } + + // We started with the whole bit output buffer, + // and then we loaded bits from whole bytes. + // Either way, we might now have too many instead of too few. + // Discard the extra. 
+ if npattern > n { + pattern >>= npattern - n + npattern = n + } + + // Replicate pattern to at most maxBits. + if npattern == 1 { + // One bit being repeated. + // If the bit is 1, make the pattern all 1s. + // If the bit is 0, the pattern is already all 0s, + // but we can claim that the number of bits + // in the word is equal to the number we need (c), + // because right shift of bits will zero fill. + if pattern == 1 { + pattern = 1<<maxBits - 1 + npattern = maxBits + } else { + npattern = c + } + } else { + b := pattern + nb := npattern + if nb+nb <= maxBits { + // Double pattern until the whole uintptr is filled. + for nb <= goarch.PtrSize*8 { + b |= b << nb + nb += nb + } + // Trim away incomplete copy of original pattern in high bits. + // TODO(rsc): Replace with table lookup or loop on systems without divide? + nb = maxBits / npattern * npattern + b &= 1<<nb - 1 + pattern = b + npattern = nb + } + } + + // Add pattern to bit buffer and flush bit buffer, c/npattern times. + // Since pattern contains >8 bits, there will be full bytes to flush + // on each iteration. + for ; c >= npattern; c -= npattern { + bits |= pattern << nbits + nbits += npattern + for nbits >= 8 { + *dst = uint8(bits) + dst = add1(dst) + bits >>= 8 + nbits -= 8 + } + } + + // Add final fragment to bit buffer. + if c > 0 { + pattern &= 1<<c - 1 + bits |= pattern << nbits + nbits += c + } + continue Run + } + + // Repeat; n too large to fit in a register. + // Since nbits <= 7, we know the first few bytes of repeated data + // are already written to memory. + off := n - nbits // n > nbits because n > maxBits and nbits <= 7 + // Leading src fragment. + src = subtractb(src, (off+7)/8) + if frag := off & 7; frag != 0 { + bits |= uintptr(*src) >> (8 - frag) << nbits + src = add1(src) + nbits += frag + c -= frag + } + // Main loop: load one byte, write another. + // The bits are rotating through the bit buffer. 
+ for i := c / 8; i > 0; i-- { + bits |= uintptr(*src) << nbits + src = add1(src) + *dst = uint8(bits) + dst = add1(dst) + bits >>= 8 + } + // Final src fragment. + if c %= 8; c > 0 { + bits |= (uintptr(*src) & (1<<c - 1)) << nbits + nbits += c + } + } + + // Write any final bits out, using full-byte writes, even for the final byte. + totalBits := (uintptr(unsafe.Pointer(dst))-uintptr(unsafe.Pointer(dstStart)))*8 + nbits + nbits += -nbits & 7 + for ; nbits > 0; nbits -= 8 { + *dst = uint8(bits) + dst = add1(dst) + bits >>= 8 + } + return totalBits +} + +// materializeGCProg allocates space for the (1-bit) pointer bitmask +// for an object of size ptrdata. Then it fills that space with the +// pointer bitmask specified by the program prog. +// The bitmask starts at s.startAddr. +// The result must be deallocated with dematerializeGCProg. +func materializeGCProg(ptrdata uintptr, prog *byte) *mspan { + // Each word of ptrdata needs one bit in the bitmap. + bitmapBytes := divRoundUp(ptrdata, 8*goarch.PtrSize) + // Compute the number of pages needed for bitmapBytes. 
+ pages := divRoundUp(bitmapBytes, pageSize) + s := mheap_.allocManual(pages, spanAllocPtrScalarBits) + runGCProg(addb(prog, 4), (*byte)(unsafe.Pointer(s.startAddr))) + return s +} +func dematerializeGCProg(s *mspan) { + mheap_.freeManual(s, spanAllocPtrScalarBits) +} + +func dumpGCProg(p *byte) { + nptr := 0 + for { + x := *p + p = add1(p) + if x == 0 { + print("\t", nptr, " end\n") + break + } + if x&0x80 == 0 { + print("\t", nptr, " lit ", x, ":") + n := int(x+7) / 8 + for i := 0; i < n; i++ { + print(" ", hex(*p)) + p = add1(p) + } + print("\n") + nptr += int(x) + } else { + nbit := int(x &^ 0x80) + if nbit == 0 { + for nb := uint(0); ; nb += 7 { + x := *p + p = add1(p) + nbit |= int(x&0x7f) << nb + if x&0x80 == 0 { + break + } + } + } + count := 0 + for nb := uint(0); ; nb += 7 { + x := *p + p = add1(p) + count |= int(x&0x7f) << nb + if x&0x80 == 0 { + break + } + } + print("\t", nptr, " repeat ", nbit, " × ", count, "\n") + nptr += nbit * count + } + } +} + +// Testing. + +func getgcmaskcb(frame *stkframe, ctxt unsafe.Pointer) bool { + target := (*stkframe)(ctxt) + if frame.sp <= target.sp && target.sp < frame.varp { + *target = *frame + return false + } + return true +} + +// reflect_gcbits returns the GC type info for x, for testing. +// The result is the bitmap entries (0 or 1), one entry per byte. +// +//go:linkname reflect_gcbits reflect.gcbits +func reflect_gcbits(x any) []byte { + return getgcmask(x) +} + +// Returns GC type info for the pointer stored in ep for testing. +// If ep points to the stack, only static live information will be returned +// (i.e. not for objects which are only dynamically live stack objects). 
+func getgcmask(ep any) (mask []byte) { + e := *efaceOf(&ep) + p := e.data + t := e._type + // data or bss + for _, datap := range activeModules() { + // data + if datap.data <= uintptr(p) && uintptr(p) < datap.edata { + bitmap := datap.gcdatamask.bytedata + n := (*ptrtype)(unsafe.Pointer(t)).elem.size + mask = make([]byte, n/goarch.PtrSize) + for i := uintptr(0); i < n; i += goarch.PtrSize { + off := (uintptr(p) + i - datap.data) / goarch.PtrSize + mask[i/goarch.PtrSize] = (*addb(bitmap, off/8) >> (off % 8)) & 1 + } + return + } + + // bss + if datap.bss <= uintptr(p) && uintptr(p) < datap.ebss { + bitmap := datap.gcbssmask.bytedata + n := (*ptrtype)(unsafe.Pointer(t)).elem.size + mask = make([]byte, n/goarch.PtrSize) + for i := uintptr(0); i < n; i += goarch.PtrSize { + off := (uintptr(p) + i - datap.bss) / goarch.PtrSize + mask[i/goarch.PtrSize] = (*addb(bitmap, off/8) >> (off % 8)) & 1 + } + return + } + } + + // heap + if base, s, _ := findObject(uintptr(p), 0, 0); base != 0 { + if s.spanclass.noscan() { + return nil + } + n := s.elemsize + hbits := heapBitsForAddr(base, n) + mask = make([]byte, n/goarch.PtrSize) + for { + var addr uintptr + if hbits, addr = hbits.next(); addr == 0 { + break + } + mask[(addr-base)/goarch.PtrSize] = 1 + } + // Callers expect this mask to end at the last pointer. 
+ for len(mask) > 0 && mask[len(mask)-1] == 0 { + mask = mask[:len(mask)-1] + } + return + } + + // stack + if gp := getg(); gp.m.curg.stack.lo <= uintptr(p) && uintptr(p) < gp.m.curg.stack.hi { + var frame stkframe + frame.sp = uintptr(p) + gentraceback(gp.m.curg.sched.pc, gp.m.curg.sched.sp, 0, gp.m.curg, 0, nil, 1000, getgcmaskcb, noescape(unsafe.Pointer(&frame)), 0) + if frame.fn.valid() { + locals, _, _ := frame.getStackMap(nil, false) + if locals.n == 0 { + return + } + size := uintptr(locals.n) * goarch.PtrSize + n := (*ptrtype)(unsafe.Pointer(t)).elem.size + mask = make([]byte, n/goarch.PtrSize) + for i := uintptr(0); i < n; i += goarch.PtrSize { + off := (uintptr(p) + i - frame.varp + size) / goarch.PtrSize + mask[i/goarch.PtrSize] = locals.ptrbit(off) + } + } + return + } + + // otherwise, not something the GC knows about. + // possibly read-only data, like malloc(0). + // must not have pointers + return +} diff --git a/src/runtime/mcache.go b/src/runtime/mcache.go new file mode 100644 index 0000000..acfd99b --- /dev/null +++ b/src/runtime/mcache.go @@ -0,0 +1,331 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// Per-thread (in Go, per-P) cache for small objects. +// This includes a small object cache and local allocation stats. +// No locking needed because it is per-thread (per-P). +// +// mcaches are allocated from non-GC'd memory, so any heap pointers +// must be specially handled. +type mcache struct { + _ sys.NotInHeap + + // The following members are accessed on every malloc, + // so they are grouped here for better caching. + nextSample uintptr // trigger heap sample after allocating this many bytes + scanAlloc uintptr // bytes of scannable heap allocated + + // Allocator cache for tiny objects w/o pointers. 
+ // See "Tiny allocator" comment in malloc.go. + + // tiny points to the beginning of the current tiny block, or + // nil if there is no current tiny block. + // + // tiny is a heap pointer. Since mcache is in non-GC'd memory, + // we handle it by clearing it in releaseAll during mark + // termination. + // + // tinyAllocs is the number of tiny allocations performed + // by the P that owns this mcache. + tiny uintptr + tinyoffset uintptr + tinyAllocs uintptr + + // The rest is not accessed on every malloc. + + alloc [numSpanClasses]*mspan // spans to allocate from, indexed by spanClass + + stackcache [_NumStackOrders]stackfreelist + + // flushGen indicates the sweepgen during which this mcache + // was last flushed. If flushGen != mheap_.sweepgen, the spans + // in this mcache are stale and need to be flushed so they + // can be swept. This is done in acquirep. + flushGen atomic.Uint32 +} + +// A gclink is a node in a linked list of blocks, like mlink, +// but it is opaque to the garbage collector. +// The GC does not trace the pointers during collection, +// and the compiler does not emit write barriers for assignments +// of gclinkptr values. Code should store references to gclinks +// as gclinkptr, not as *gclink. +type gclink struct { + next gclinkptr +} + +// A gclinkptr is a pointer to a gclink, but it is opaque +// to the garbage collector. +type gclinkptr uintptr + +// ptr returns the *gclink form of p. +// The result should be used for accessing fields, not stored +// in other data structures. +func (p gclinkptr) ptr() *gclink { + return (*gclink)(unsafe.Pointer(p)) +} + +type stackfreelist struct { + list gclinkptr // linked list of free stacks + size uintptr // total size of stacks in list +} + +// dummy mspan that contains no free objects.
+var emptymspan mspan + +func allocmcache() *mcache { + var c *mcache + systemstack(func() { + lock(&mheap_.lock) + c = (*mcache)(mheap_.cachealloc.alloc()) + c.flushGen.Store(mheap_.sweepgen) + unlock(&mheap_.lock) + }) + for i := range c.alloc { + c.alloc[i] = &emptymspan + } + c.nextSample = nextSample() + return c +} + +// freemcache releases resources associated with this +// mcache and puts the object onto a free list. +// +// In some cases there is no way to simply release +// resources, such as statistics, so donate them to +// a different mcache (the recipient). +func freemcache(c *mcache) { + systemstack(func() { + c.releaseAll() + stackcache_clear(c) + + // NOTE(rsc,rlh): If gcworkbuffree comes back, we need to coordinate + // with the stealing of gcworkbufs during garbage collection to avoid + // a race where the workbuf is double-freed. + // gcworkbuffree(c.gcworkbuf) + + lock(&mheap_.lock) + mheap_.cachealloc.free(unsafe.Pointer(c)) + unlock(&mheap_.lock) + }) +} + +// getMCache is a convenience function which tries to obtain an mcache. +// +// Returns nil if we're not bootstrapping or we don't have a P. The caller's +// P must not change, so we must be in a non-preemptible state. +func getMCache(mp *m) *mcache { + // Grab the mcache, since that's where stats live. + pp := mp.p.ptr() + var c *mcache + if pp == nil { + // We will be called without a P while bootstrapping, + // in which case we use mcache0, which is set in mallocinit. + // mcache0 is cleared when bootstrapping is complete, + // by procresize. + c = mcache0 + } else { + c = pp.mcache + } + return c +} + +// refill acquires a new span of span class spc for c. This span will +// have at least one free object. The current span in c must be full. +// +// Must run in a non-preemptible context since otherwise the owner of +// c could change. +func (c *mcache) refill(spc spanClass) { + // Return the current cached span to the central lists. 
+ s := c.alloc[spc] + + if uintptr(s.allocCount) != s.nelems { + throw("refill of span with free space remaining") + } + if s != &emptymspan { + // Mark this span as no longer cached. + if s.sweepgen != mheap_.sweepgen+3 { + throw("bad sweepgen in refill") + } + mheap_.central[spc].mcentral.uncacheSpan(s) + + // Count up how many slots were used and record it. + stats := memstats.heapStats.acquire() + slotsUsed := int64(s.allocCount) - int64(s.allocCountBeforeCache) + atomic.Xadd64(&stats.smallAllocCount[spc.sizeclass()], slotsUsed) + + // Flush tinyAllocs. + if spc == tinySpanClass { + atomic.Xadd64(&stats.tinyAllocCount, int64(c.tinyAllocs)) + c.tinyAllocs = 0 + } + memstats.heapStats.release() + + // Count the allocs in inconsistent, internal stats. + bytesAllocated := slotsUsed * int64(s.elemsize) + gcController.totalAlloc.Add(bytesAllocated) + + // Clear the second allocCount just to be safe. + s.allocCountBeforeCache = 0 + } + + // Get a new cached span from the central lists. + s = mheap_.central[spc].mcentral.cacheSpan() + if s == nil { + throw("out of memory") + } + + if uintptr(s.allocCount) == s.nelems { + throw("span has no free space") + } + + // Indicate that this span is cached and prevent asynchronous + // sweeping in the next sweep phase. + s.sweepgen = mheap_.sweepgen + 3 + + // Store the current alloc count for accounting later. + s.allocCountBeforeCache = s.allocCount + + // Update heapLive and flush scanAlloc. + // + // We have not yet allocated anything new into the span, but we + // assume that all of its slots will get used, so this makes + // heapLive an overestimate. + // + // When the span gets uncached, we'll fix up this overestimate + // if necessary (see releaseAll). + // + // We pick an overestimate here because an underestimate leads + // the pacer to believe that it's in better shape than it is, + // which appears to lead to more memory used. See #53738 for + // more details. 
+ usedBytes := uintptr(s.allocCount) * s.elemsize + gcController.update(int64(s.npages*pageSize)-int64(usedBytes), int64(c.scanAlloc)) + c.scanAlloc = 0 + + c.alloc[spc] = s +} + +// allocLarge allocates a span for a large object. +func (c *mcache) allocLarge(size uintptr, noscan bool) *mspan { + if size+_PageSize < size { + throw("out of memory") + } + npages := size >> _PageShift + if size&_PageMask != 0 { + npages++ + } + + // Deduct credit for this span allocation and sweep if + // necessary. mHeap_Alloc will also sweep npages, so this only + // pays the debt down to npage pages. + deductSweepCredit(npages*_PageSize, npages) + + spc := makeSpanClass(0, noscan) + s := mheap_.alloc(npages, spc) + if s == nil { + throw("out of memory") + } + + // Count the alloc in consistent, external stats. + stats := memstats.heapStats.acquire() + atomic.Xadd64(&stats.largeAlloc, int64(npages*pageSize)) + atomic.Xadd64(&stats.largeAllocCount, 1) + memstats.heapStats.release() + + // Count the alloc in inconsistent, internal stats. + gcController.totalAlloc.Add(int64(npages * pageSize)) + + // Update heapLive. + gcController.update(int64(s.npages*pageSize), 0) + + // Put the large span in the mcentral swept list so that it's + // visible to the background sweeper. + mheap_.central[spc].mcentral.fullSwept(mheap_.sweepgen).push(s) + s.limit = s.base() + size + s.initHeapBits(false) + return s +} + +func (c *mcache) releaseAll() { + // Take this opportunity to flush scanAlloc. + scanAlloc := int64(c.scanAlloc) + c.scanAlloc = 0 + + sg := mheap_.sweepgen + dHeapLive := int64(0) + for i := range c.alloc { + s := c.alloc[i] + if s != &emptymspan { + slotsUsed := int64(s.allocCount) - int64(s.allocCountBeforeCache) + s.allocCountBeforeCache = 0 + + // Adjust smallAllocCount for whatever was allocated. 
+ stats := memstats.heapStats.acquire() + atomic.Xadd64(&stats.smallAllocCount[spanClass(i).sizeclass()], slotsUsed) + memstats.heapStats.release() + + // Adjust the actual allocs in inconsistent, internal stats. + // We assumed earlier that the full span gets allocated. + gcController.totalAlloc.Add(slotsUsed * int64(s.elemsize)) + + if s.sweepgen != sg+1 { + // refill conservatively counted unallocated slots in gcController.heapLive. + // Undo this. + // + // If this span was cached before sweep, then gcController.heapLive was totally + // recomputed since caching this span, so we don't do this for stale spans. + dHeapLive -= int64(uintptr(s.nelems)-uintptr(s.allocCount)) * int64(s.elemsize) + } + + // Release the span to the mcentral. + mheap_.central[i].mcentral.uncacheSpan(s) + c.alloc[i] = &emptymspan + } + } + // Clear tinyalloc pool. + c.tiny = 0 + c.tinyoffset = 0 + + // Flush tinyAllocs. + stats := memstats.heapStats.acquire() + atomic.Xadd64(&stats.tinyAllocCount, int64(c.tinyAllocs)) + c.tinyAllocs = 0 + memstats.heapStats.release() + + // Update heapLive and heapScan. + gcController.update(dHeapLive, scanAlloc) +} + +// prepareForSweep flushes c if the system has entered a new sweep phase +// since c was populated. This must happen between the sweep phase +// starting and the first allocation from c. +func (c *mcache) prepareForSweep() { + // Alternatively, instead of making sure we do this on every P + // between starting the world and allocating on that P, we + // could leave allocate-black on, allow allocation to continue + // as usual, use a ragged barrier at the beginning of sweep to + // ensure all cached spans are swept, and then disable + // allocate-black. However, with this approach it's difficult + // to avoid spilling mark bits into the *next* GC cycle. 
+ sg := mheap_.sweepgen + flushGen := c.flushGen.Load() + if flushGen == sg { + return + } else if flushGen != sg-2 { + println("bad flushGen", flushGen, "in prepareForSweep; sweepgen", sg) + throw("bad flushGen") + } + c.releaseAll() + stackcache_clear(c) + c.flushGen.Store(mheap_.sweepgen) // Synchronizes with gcStart +} diff --git a/src/runtime/mcentral.go b/src/runtime/mcentral.go new file mode 100644 index 0000000..3382c54 --- /dev/null +++ b/src/runtime/mcentral.go @@ -0,0 +1,257 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Central free lists. +// +// See malloc.go for an overview. +// +// The mcentral doesn't actually contain the list of free objects; the mspan does. +// Each mcentral is two lists of mspans: those with free objects (c->nonempty) +// and those that are completely allocated (c->empty). + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/sys" +) + +// Central list of free objects of a given size. +type mcentral struct { + _ sys.NotInHeap + spanclass spanClass + + // partial and full contain two mspan sets: one of swept in-use + // spans, and one of unswept in-use spans. These two trade + // roles on each GC cycle. The unswept set is drained either by + // allocation or by the background sweeper in every GC cycle, + // so only two roles are necessary. + // + // sweepgen is increased by 2 on each GC cycle, so the swept + // spans are in partial[sweepgen/2%2] and the unswept spans are in + // partial[1-sweepgen/2%2]. Sweeping pops spans from the + // unswept set and pushes spans that are still in-use on the + // swept set. Likewise, allocating an in-use span pushes it + // on the swept set. + // + // Some parts of the sweeper can sweep arbitrary spans, and hence + // can't remove them from the unswept set, but will add the span + // to the appropriate swept list. 
As a result, the parts of the + // sweeper and mcentral that do consume from the unswept list may + // encounter swept spans, and these should be ignored. + partial [2]spanSet // list of spans with a free object + full [2]spanSet // list of spans with no free objects +} + +// Initialize a single central free list. +func (c *mcentral) init(spc spanClass) { + c.spanclass = spc + lockInit(&c.partial[0].spineLock, lockRankSpanSetSpine) + lockInit(&c.partial[1].spineLock, lockRankSpanSetSpine) + lockInit(&c.full[0].spineLock, lockRankSpanSetSpine) + lockInit(&c.full[1].spineLock, lockRankSpanSetSpine) +} + +// partialUnswept returns the spanSet which holds partially-filled +// unswept spans for this sweepgen. +func (c *mcentral) partialUnswept(sweepgen uint32) *spanSet { + return &c.partial[1-sweepgen/2%2] +} + +// partialSwept returns the spanSet which holds partially-filled +// swept spans for this sweepgen. +func (c *mcentral) partialSwept(sweepgen uint32) *spanSet { + return &c.partial[sweepgen/2%2] +} + +// fullUnswept returns the spanSet which holds unswept spans without any +// free slots for this sweepgen. +func (c *mcentral) fullUnswept(sweepgen uint32) *spanSet { + return &c.full[1-sweepgen/2%2] +} + +// fullSwept returns the spanSet which holds swept spans without any +// free slots for this sweepgen. +func (c *mcentral) fullSwept(sweepgen uint32) *spanSet { + return &c.full[sweepgen/2%2] +} + +// Allocate a span to use in an mcache. +func (c *mcentral) cacheSpan() *mspan { + // Deduct credit for this span allocation and sweep if necessary. + spanBytes := uintptr(class_to_allocnpages[c.spanclass.sizeclass()]) * _PageSize + deductSweepCredit(spanBytes, 0) + + traceDone := false + if trace.enabled { + traceGCSweepStart() + } + + // If we sweep spanBudget spans without finding any free + // space, just allocate a fresh span. 
This limits the amount + // of time we can spend trying to find free space and + // amortizes the cost of small object sweeping over the + // benefit of having a full free span to allocate from. By + // setting this to 100, we limit the space overhead to 1%. + // + // TODO(austin,mknyszek): This still has bad worst-case + // throughput. For example, this could find just one free slot + // on the 100th swept span. That limits allocation latency, but + // still has very poor throughput. We could instead keep a + // running free-to-used budget and switch to fresh span + // allocation if the budget runs low. + spanBudget := 100 + + var s *mspan + var sl sweepLocker + + // Try partial swept spans first. + sg := mheap_.sweepgen + if s = c.partialSwept(sg).pop(); s != nil { + goto havespan + } + + sl = sweep.active.begin() + if sl.valid { + // Now try partial unswept spans. + for ; spanBudget >= 0; spanBudget-- { + s = c.partialUnswept(sg).pop() + if s == nil { + break + } + if s, ok := sl.tryAcquire(s); ok { + // We got ownership of the span, so let's sweep it and use it. + s.sweep(true) + sweep.active.end(sl) + goto havespan + } + // We failed to get ownership of the span, which means it's being or + // has been swept by an asynchronous sweeper that just couldn't remove it + // from the unswept list. That sweeper took ownership of the span and + // responsibility for either freeing it to the heap or putting it on the + // right swept list. Either way, we should just ignore it (and it's unsafe + // for us to do anything else). + } + // Now try full unswept spans, sweeping them and putting them into the + // right list if we fail to get a span. + for ; spanBudget >= 0; spanBudget-- { + s = c.fullUnswept(sg).pop() + if s == nil { + break + } + if s, ok := sl.tryAcquire(s); ok { + // We got ownership of the span, so let's sweep it. + s.sweep(true) + // Check if there's any free space. 
+ freeIndex := s.nextFreeIndex() + if freeIndex != s.nelems { + s.freeindex = freeIndex + sweep.active.end(sl) + goto havespan + } + // Add it to the swept list, because sweeping didn't give us any free space. + c.fullSwept(sg).push(s.mspan) + } + // See comment for partial unswept spans. + } + sweep.active.end(sl) + } + if trace.enabled { + traceGCSweepDone() + traceDone = true + } + + // We failed to get a span from the mcentral so get one from mheap. + s = c.grow() + if s == nil { + return nil + } + + // At this point s is a span that should have free slots. +havespan: + if trace.enabled && !traceDone { + traceGCSweepDone() + } + n := int(s.nelems) - int(s.allocCount) + if n == 0 || s.freeindex == s.nelems || uintptr(s.allocCount) == s.nelems { + throw("span has no free objects") + } + freeByteBase := s.freeindex &^ (64 - 1) + whichByte := freeByteBase / 8 + // Init alloc bits cache. + s.refillAllocCache(whichByte) + + // Adjust the allocCache so that s.freeindex corresponds to the low bit in + // s.allocCache. + s.allocCache >>= s.freeindex % 64 + + return s +} + +// Return span from an mcache. +// +// s must have a span class corresponding to this +// mcentral and it must not be empty. +func (c *mcentral) uncacheSpan(s *mspan) { + if s.allocCount == 0 { + throw("uncaching span but s.allocCount == 0") + } + + sg := mheap_.sweepgen + stale := s.sweepgen == sg+1 + + // Fix up sweepgen. + if stale { + // Span was cached before sweep began. It's our + // responsibility to sweep it. + // + // Set sweepgen to indicate it's not cached but needs + // sweeping and can't be allocated from. sweep will + // set s.sweepgen to indicate s is swept. + atomic.Store(&s.sweepgen, sg-1) + } else { + // Indicate that s is no longer cached. + atomic.Store(&s.sweepgen, sg) + } + + // Put the span in the appropriate place. + if stale { + // It's stale, so just sweep it. Sweeping will put it on + // the right list. + // + // We don't use a sweepLocker here. 
Stale cached spans + // aren't in the global sweep lists, so mark termination + // itself holds up sweep completion until all mcaches + // have been swept. + ss := sweepLocked{s} + ss.sweep(false) + } else { + if int(s.nelems)-int(s.allocCount) > 0 { + // Put it back on the partial swept list. + c.partialSwept(sg).push(s) + } else { + // There's no free space and it's not stale, so put it on the + // full swept list. + c.fullSwept(sg).push(s) + } + } +} + +// grow allocates a new empty span from the heap and initializes it for c's size class. +func (c *mcentral) grow() *mspan { + npages := uintptr(class_to_allocnpages[c.spanclass.sizeclass()]) + size := uintptr(class_to_size[c.spanclass.sizeclass()]) + + s := mheap_.alloc(npages, c.spanclass) + if s == nil { + return nil + } + + // Use division by multiplication and shifts to quickly compute: + // n := (npages << _PageShift) / size + n := s.divideByElemSize(npages << _PageShift) + s.limit = s.base() + size*n + s.initHeapBits(false) + return s +} diff --git a/src/runtime/mcheckmark.go b/src/runtime/mcheckmark.go new file mode 100644 index 0000000..73c1a10 --- /dev/null +++ b/src/runtime/mcheckmark.go @@ -0,0 +1,104 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// GC checkmarks +// +// In a concurrent garbage collector, one worries about failing to mark +// a live object due to mutations without write barriers or bugs in the +// collector implementation. As a sanity check, the GC has a 'checkmark' +// mode that retraverses the object graph with the world stopped, to make +// sure that everything that should be marked is marked. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// A checkmarksMap stores the GC marks in "checkmarks" mode. It is a +// per-arena bitmap with a bit for every word in the arena. 
The mark +// is stored on the bit corresponding to the first word of the marked +// allocation. +type checkmarksMap struct { + _ sys.NotInHeap + b [heapArenaBytes / goarch.PtrSize / 8]uint8 +} + +// If useCheckmark is true, marking of an object uses the checkmark +// bits instead of the standard mark bits. +var useCheckmark = false + +// startCheckmarks prepares for the checkmarks phase. +// +// The world must be stopped. +func startCheckmarks() { + assertWorldStopped() + + // Clear all checkmarks. + for _, ai := range mheap_.allArenas { + arena := mheap_.arenas[ai.l1()][ai.l2()] + bitmap := arena.checkmarks + + if bitmap == nil { + // Allocate bitmap on first use. + bitmap = (*checkmarksMap)(persistentalloc(unsafe.Sizeof(*bitmap), 0, &memstats.gcMiscSys)) + if bitmap == nil { + throw("out of memory allocating checkmarks bitmap") + } + arena.checkmarks = bitmap + } else { + // Otherwise clear the existing bitmap. + for i := range bitmap.b { + bitmap.b[i] = 0 + } + } + } + // Enable checkmarking. + useCheckmark = true +} + +// endCheckmarks ends the checkmarks phase. +func endCheckmarks() { + if gcMarkWorkAvailable(nil) { + throw("GC work not flushed") + } + useCheckmark = false +} + +// setCheckmark throws if marking object is a checkmarks violation, +// and otherwise sets obj's checkmark. It returns true if obj was +// already checkmarked. 
+func setCheckmark(obj, base, off uintptr, mbits markBits) bool { + if !mbits.isMarked() { + printlock() + print("runtime: checkmarks found unexpected unmarked object obj=", hex(obj), "\n") + print("runtime: found obj at *(", hex(base), "+", hex(off), ")\n") + + // Dump the source (base) object + gcDumpObject("base", base, off) + + // Dump the object + gcDumpObject("obj", obj, ^uintptr(0)) + + getg().m.traceback = 2 + throw("checkmark found unmarked object") + } + + ai := arenaIndex(obj) + arena := mheap_.arenas[ai.l1()][ai.l2()] + arenaWord := (obj / heapArenaBytes / 8) % uintptr(len(arena.checkmarks.b)) + mask := byte(1 << ((obj / heapArenaBytes) % 8)) + bytep := &arena.checkmarks.b[arenaWord] + + if atomic.Load8(bytep)&mask != 0 { + // Already checkmarked. + return true + } + + atomic.Or8(bytep, mask) + return false +} diff --git a/src/runtime/mem.go b/src/runtime/mem.go new file mode 100644 index 0000000..0ca933b --- /dev/null +++ b/src/runtime/mem.go @@ -0,0 +1,143 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// OS memory management abstraction layer +// +// Regions of the address space managed by the runtime may be in one of four +// states at any given time: +// 1) None - Unreserved and unmapped, the default state of any region. +// 2) Reserved - Owned by the runtime, but accessing it would cause a fault. +// Does not count against the process' memory footprint. +// 3) Prepared - Reserved, intended not to be backed by physical memory (though +// an OS may implement this lazily). Can transition efficiently to +// Ready. Accessing memory in such a region is undefined (may +// fault, may give back unexpected zeroes, etc.). +// 4) Ready - may be accessed safely. +// +// This set of states is more than is strictly necessary to support all the +// currently supported platforms. 
One could get by with just None, Reserved, and +// Ready. However, the Prepared state gives us flexibility for performance +// purposes. For example, on POSIX-y operating systems, Reserved is usually a +// private anonymous mmap'd region with PROT_NONE set, and to transition +// to Ready would require setting PROT_READ|PROT_WRITE. However, the +// underspecification of Prepared lets us use just MADV_FREE to transition from +// Ready to Prepared. Thus with the Prepared state we can set the permission +// bits just once early on, and we can efficiently tell the OS that it's free to +// take pages away from us when we don't strictly need them. +// +// This file defines a cross-OS interface for a common set of helpers +// that transition memory regions between these states. The helpers call into +// OS-specific implementations that handle errors, while the interface boundary +// implements cross-OS functionality, like updating runtime accounting. + +// sysAlloc transitions an OS-chosen region of memory from None to Ready. +// More specifically, it obtains a large chunk of zeroed memory from the +// operating system, typically on the order of a hundred kilobytes +// or a megabyte. This memory is always immediately available for use. +// +// sysStat must be non-nil. +// +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysAlloc(n uintptr, sysStat *sysMemStat) unsafe.Pointer { + sysStat.add(int64(n)) + gcController.mappedReady.Add(int64(n)) + return sysAllocOS(n) +} + +// sysUnused transitions a memory region from Ready to Prepared. It notifies the +// operating system that the physical pages backing this memory region are no +// longer needed and can be reused for other purposes. The contents of a +// sysUnused memory region are considered forfeit and the region must not be +// accessed again until sysUsed is called.
+func sysUnused(v unsafe.Pointer, n uintptr) { + gcController.mappedReady.Add(-int64(n)) + sysUnusedOS(v, n) +} + +// sysUsed transitions a memory region from Prepared to Ready. It notifies the +// operating system that the memory region is needed and ensures that the region +// may be safely accessed. This is typically a no-op on systems that don't have +// an explicit commit step and hard over-commit limits, but is critical on +// Windows, for example. +// +// This operation is idempotent for memory already in the Prepared state, so +// it is safe to refer, with v and n, to a range of memory that includes both +// Prepared and Ready memory. However, the caller must provide the exact amount +// of Prepared memory for accounting purposes. +func sysUsed(v unsafe.Pointer, n, prepared uintptr) { + gcController.mappedReady.Add(int64(prepared)) + sysUsedOS(v, n) +} + +// sysHugePage does not transition memory regions, but instead provides a +// hint to the OS that it would be more efficient to back this memory region +// with pages of a larger size transparently. +func sysHugePage(v unsafe.Pointer, n uintptr) { + sysHugePageOS(v, n) +} + +// sysFree transitions a memory region from any state to None. Therefore, it +// returns memory unconditionally. It is used if an out-of-memory error has been +// detected midway through an allocation or to carve out an aligned section of +// the address space. It is okay if sysFree is a no-op only if sysReserve always +// returns a memory region aligned to the heap allocator's alignment +// restrictions. +// +// sysStat must be non-nil. +// +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysFree(v unsafe.Pointer, n uintptr, sysStat *sysMemStat) { + sysStat.add(-int64(n)) + gcController.mappedReady.Add(-int64(n)) + sysFreeOS(v, n) +} + +// sysFault transitions a memory region from Ready to Reserved. 
It +// marks a region such that it will always fault if accessed. Used only for +// debugging the runtime. +// +// TODO(mknyszek): Currently it's true that all uses of sysFault transition +// memory from Ready to Reserved, but this may not be true in the future +// since on every platform the operation is much more general than that. +// If a transition from Prepared is ever introduced, create a new function +// that elides the Ready state accounting. +func sysFault(v unsafe.Pointer, n uintptr) { + gcController.mappedReady.Add(-int64(n)) + sysFaultOS(v, n) +} + +// sysReserve transitions a memory region from None to Reserved. It reserves +// address space in such a way that it would cause a fatal fault upon access +// (either via permissions or not committing the memory). Such a reservation is +// thus never backed by physical memory. +// +// If the pointer passed to it is non-nil, the caller wants the +// reservation there, but sysReserve can still choose another +// location if that one is unavailable. +// +// NOTE: sysReserve returns OS-aligned memory, but the heap allocator +// may use larger alignment, so the caller must be careful to realign the +// memory obtained by sysReserve. +func sysReserve(v unsafe.Pointer, n uintptr) unsafe.Pointer { + return sysReserveOS(v, n) +} + +// sysMap transitions a memory region from Reserved to Prepared. It ensures the +// memory region can be efficiently transitioned to Ready. +// +// sysStat must be non-nil. +func sysMap(v unsafe.Pointer, n uintptr, sysStat *sysMemStat) { + sysStat.add(int64(n)) + sysMapOS(v, n) +} diff --git a/src/runtime/mem_aix.go b/src/runtime/mem_aix.go new file mode 100644 index 0000000..21726b5 --- /dev/null +++ b/src/runtime/mem_aix.go @@ -0,0 +1,75 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "unsafe" +) + +// Don't split the stack as this method may be invoked without a valid G, which +// prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + p, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + if err == _EACCES { + print("runtime: mmap: access denied\n") + exit(2) + } + if err == _EAGAIN { + print("runtime: mmap: too much locked memory (check 'ulimit -l').\n") + exit(2) + } + return nil + } + return p +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { + madvise(v, n, _MADV_DONTNEED) +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { + munmap(v, n) +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { + mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE|_MAP_FIXED, -1, 0) +} + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return nil + } + return p +} + +func sysMapOS(v unsafe.Pointer, n uintptr) { + // AIX does not allow mapping a range that is already mapped. + // So, call mprotect to change permissions. + // Note that sysMap is always called with a non-nil pointer + // since it transitions a Reserved memory region to Prepared, + // so mprotect is always possible. 
+ _, err := mprotect(v, n, _PROT_READ|_PROT_WRITE) + if err == _ENOMEM { + throw("runtime: out of memory") + } + if err != 0 { + print("runtime: mprotect(", v, ", ", n, ") returned ", err, "\n") + throw("runtime: cannot map pages in arena address space") + } +} diff --git a/src/runtime/mem_bsd.go b/src/runtime/mem_bsd.go new file mode 100644 index 0000000..6c5edb1 --- /dev/null +++ b/src/runtime/mem_bsd.go @@ -0,0 +1,81 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build dragonfly || freebsd || netbsd || openbsd || solaris + +package runtime + +import ( + "unsafe" +) + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + v, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return nil + } + return v +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { + if debug.madvdontneed != 0 { + madvise(v, n, _MADV_DONTNEED) + } else { + madvise(v, n, _MADV_FREE) + } +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { + munmap(v, n) +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { + mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE|_MAP_FIXED, -1, 0) +} + +// Indicates not to reserve swap space for the mapping. +const _sunosMAP_NORESERVE = 0x40 + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + flags := int32(_MAP_ANON | _MAP_PRIVATE) + if GOOS == "solaris" || GOOS == "illumos" { + // Be explicit that we don't want to reserve swap space + // for PROT_NONE anonymous mappings. 
This avoids an issue + // wherein large mappings can cause fork to fail. + flags |= _sunosMAP_NORESERVE + } + p, err := mmap(v, n, _PROT_NONE, flags, -1, 0) + if err != 0 { + return nil + } + return p +} + +const _sunosEAGAIN = 11 +const _ENOMEM = 12 + +func sysMapOS(v unsafe.Pointer, n uintptr) { + p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0) + if err == _ENOMEM || ((GOOS == "solaris" || GOOS == "illumos") && err == _sunosEAGAIN) { + throw("runtime: out of memory") + } + if p != v || err != 0 { + print("runtime: mmap(", v, ", ", n, ") returned ", p, ", ", err, "\n") + throw("runtime: cannot map pages in arena address space") + } +} diff --git a/src/runtime/mem_darwin.go b/src/runtime/mem_darwin.go new file mode 100644 index 0000000..25862cf --- /dev/null +++ b/src/runtime/mem_darwin.go @@ -0,0 +1,70 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "unsafe" +) + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + v, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return nil + } + return v +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { + // MADV_FREE_REUSABLE is like MADV_FREE except it also propagates + // accounting information about the process to task_info. + madvise(v, n, _MADV_FREE_REUSABLE) +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { + // MADV_FREE_REUSE is necessary to keep the kernel's accounting + // accurate. If called on any memory region that hasn't been + // MADV_FREE_REUSABLE'd, it's a no-op. 
+ madvise(v, n, _MADV_FREE_REUSE) +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { + munmap(v, n) +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { + mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE|_MAP_FIXED, -1, 0) +} + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return nil + } + return p +} + +const _ENOMEM = 12 + +func sysMapOS(v unsafe.Pointer, n uintptr) { + p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0) + if err == _ENOMEM { + throw("runtime: out of memory") + } + if p != v || err != 0 { + print("runtime: mmap(", v, ", ", n, ") returned ", p, ", ", err, "\n") + throw("runtime: cannot map pages in arena address space") + } +} diff --git a/src/runtime/mem_js.go b/src/runtime/mem_js.go new file mode 100644 index 0000000..e87c5f2 --- /dev/null +++ b/src/runtime/mem_js.go @@ -0,0 +1,85 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build js && wasm + +package runtime + +import ( + "unsafe" +) + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + p := sysReserveOS(nil, n) + sysMapOS(p, n) + return p +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. 
+// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { +} + +var reserveEnd uintptr + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + // TODO(neelance): maybe unify with mem_plan9.go, depending on how https://github.com/WebAssembly/design/blob/master/FutureFeatures.md#finer-grained-control-over-memory turns out + + if v != nil { + // The address space of WebAssembly's linear memory is contiguous, + // so requesting specific addresses is not supported. We could use + // a different address, but then mheap.sysAlloc discards the result + // right away and we don't reuse chunks passed to sysFree. + return nil + } + + // Round up the initial reserveEnd to 64 KiB so that + // reservations are always aligned to the page size. + initReserveEnd := alignUp(lastmoduledatap.end, physPageSize) + if reserveEnd < initReserveEnd { + reserveEnd = initReserveEnd + } + v = unsafe.Pointer(reserveEnd) + reserveEnd += alignUp(n, physPageSize) + + current := currentMemory() + // reserveEnd is always at a page boundary. + needed := int32(reserveEnd / physPageSize) + if current < needed { + if growMemory(needed-current) == -1 { + return nil + } + resetMemoryDataView() + } + + return v +} + +func currentMemory() int32 +func growMemory(pages int32) int32 + +// resetMemoryDataView signals the JS front-end that WebAssembly's memory.grow instruction has been used. +// This allows the front-end to replace the old DataView object with a new one. +func resetMemoryDataView() + +func sysMapOS(v unsafe.Pointer, n uintptr) { +} diff --git a/src/runtime/mem_linux.go b/src/runtime/mem_linux.go new file mode 100644 index 0000000..1630664 --- /dev/null +++ b/src/runtime/mem_linux.go @@ -0,0 +1,193 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const ( + _EACCES = 13 + _EINVAL = 22 +) + +// Don't split the stack as this method may be invoked without a valid G, which +// prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + p, err := mmap(nil, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + if err == _EACCES { + print("runtime: mmap: access denied\n") + exit(2) + } + if err == _EAGAIN { + print("runtime: mmap: too much locked memory (check 'ulimit -l').\n") + exit(2) + } + return nil + } + return p +} + +var adviseUnused = uint32(_MADV_FREE) + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { + // By default, Linux's "transparent huge page" support will + // merge pages into a huge page if there's even a single + // present regular page, undoing the effects of madvise(adviseUnused) + // below. On amd64, that means khugepaged can turn a single + // 4KB page to 2MB, bloating the process's RSS by as much as + // 512X. (See issue #8832 and Linux kernel bug + // https://bugzilla.kernel.org/show_bug.cgi?id=93111) + // + // To work around this, we explicitly disable transparent huge + // pages when we release pages of the heap. However, we have + // to do this carefully because changing this flag tends to + // split the VMA (memory mapping) containing v in to three + // VMAs in order to track the different values of the + // MADV_NOHUGEPAGE flag in the different regions. There's a + // default limit of 65530 VMAs per address space (sysctl + // vm.max_map_count), so we must be careful not to create too + // many VMAs (see issue #12233). + // + // Since huge pages are huge, there's little use in adjusting + // the MADV_NOHUGEPAGE flag on a fine granularity, so we avoid + // exploding the number of VMAs by only adjusting the + // MADV_NOHUGEPAGE flag on a large granularity. 
This still + // gets most of the benefit of huge pages while keeping the + // number of VMAs under control. With hugePageSize = 2MB, even + // a pessimal heap can reach 128GB before running out of VMAs. + if physHugePageSize != 0 { + // If it's a large allocation, we want to leave huge + // pages enabled. Hence, we only adjust the huge page + // flag on the huge pages containing v and v+n-1, and + // only if those aren't aligned. + var head, tail uintptr + if uintptr(v)&(physHugePageSize-1) != 0 { + // Compute huge page containing v. + head = alignDown(uintptr(v), physHugePageSize) + } + if (uintptr(v)+n)&(physHugePageSize-1) != 0 { + // Compute huge page containing v+n-1. + tail = alignDown(uintptr(v)+n-1, physHugePageSize) + } + + // Note that madvise will return EINVAL if the flag is + // already set, which is quite likely. We ignore + // errors. + if head != 0 && head+physHugePageSize == tail { + // head and tail are different but adjacent, + // so do this in one call. + madvise(unsafe.Pointer(head), 2*physHugePageSize, _MADV_NOHUGEPAGE) + } else { + // Advise the huge pages containing v and v+n-1. + if head != 0 { + madvise(unsafe.Pointer(head), physHugePageSize, _MADV_NOHUGEPAGE) + } + if tail != 0 && tail != head { + madvise(unsafe.Pointer(tail), physHugePageSize, _MADV_NOHUGEPAGE) + } + } + } + + if uintptr(v)&(physPageSize-1) != 0 || n&(physPageSize-1) != 0 { + // madvise will round this to any physical page + // *covered* by this range, so an unaligned madvise + // will release more memory than intended. + throw("unaligned sysUnused") + } + + var advise uint32 + if debug.madvdontneed != 0 { + advise = _MADV_DONTNEED + } else { + advise = atomic.Load(&adviseUnused) + } + if errno := madvise(v, n, int32(advise)); advise == _MADV_FREE && errno != 0 { + // MADV_FREE was added in Linux 4.5. Fall back to MADV_DONTNEED if it is + // not supported. 
+ atomic.Store(&adviseUnused, _MADV_DONTNEED) + madvise(v, n, _MADV_DONTNEED) + } + + if debug.harddecommit > 0 { + p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0) + if p != v || err != 0 { + throw("runtime: cannot disable permissions in address space") + } + } +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { + if debug.harddecommit > 0 { + p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0) + if err == _ENOMEM { + throw("runtime: out of memory") + } + if p != v || err != 0 { + throw("runtime: cannot remap pages in address space") + } + return + + // Don't do the sysHugePage optimization in hard decommit mode. + // We're breaking up pages everywhere, there's no point. + } + // Partially undo the NOHUGEPAGE marks from sysUnused + // for whole huge pages between v and v+n. This may + // leave huge pages off at the end points v and v+n + // even though allocations may cover these entire huge + // pages. We could detect this and undo NOHUGEPAGE on + // the end points as well, but it's probably not worth + // the cost because when neighboring allocations are + // freed sysUnused will just set NOHUGEPAGE again. + sysHugePageOS(v, n) +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { + if physHugePageSize != 0 { + // Round v up to a huge page boundary. + beg := alignUp(uintptr(v), physHugePageSize) + // Round v+n down to a huge page boundary. + end := alignDown(uintptr(v)+n, physHugePageSize) + + if beg < end { + madvise(unsafe.Pointer(beg), end-beg, _MADV_HUGEPAGE) + } + } +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. 
+// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { + munmap(v, n) +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { + mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE|_MAP_FIXED, -1, 0) +} + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + p, err := mmap(v, n, _PROT_NONE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return nil + } + return p +} + +func sysMapOS(v unsafe.Pointer, n uintptr) { + p, err := mmap(v, n, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_FIXED|_MAP_PRIVATE, -1, 0) + if err == _ENOMEM { + throw("runtime: out of memory") + } + if p != v || err != 0 { + print("runtime: mmap(", v, ", ", n, ") returned ", p, ", ", err, "\n") + throw("runtime: cannot map pages in arena address space") + } +} diff --git a/src/runtime/mem_plan9.go b/src/runtime/mem_plan9.go new file mode 100644 index 0000000..88e7d92 --- /dev/null +++ b/src/runtime/mem_plan9.go @@ -0,0 +1,195 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +const memDebug = false + +var bloc uintptr +var blocMax uintptr +var memlock mutex + +type memHdr struct { + next memHdrPtr + size uintptr +} + +var memFreelist memHdrPtr // sorted in ascending order + +type memHdrPtr uintptr + +func (p memHdrPtr) ptr() *memHdr { return (*memHdr)(unsafe.Pointer(p)) } +func (p *memHdrPtr) set(x *memHdr) { *p = memHdrPtr(unsafe.Pointer(x)) } + +func memAlloc(n uintptr) unsafe.Pointer { + n = memRound(n) + var prevp *memHdr + for p := memFreelist.ptr(); p != nil; p = p.next.ptr() { + if p.size >= n { + if p.size == n { + if prevp != nil { + prevp.next = p.next + } else { + memFreelist = p.next + } + } else { + p.size -= n + p = (*memHdr)(add(unsafe.Pointer(p), p.size)) + } + *p = memHdr{} + return unsafe.Pointer(p) + } + prevp = p + } + return sbrk(n) +} + +func memFree(ap unsafe.Pointer, n uintptr) { + n = memRound(n) + memclrNoHeapPointers(ap, n) + bp := (*memHdr)(ap) + bp.size = n + bpn := uintptr(ap) + if memFreelist == 0 { + bp.next = 0 + memFreelist.set(bp) + return + } + p := memFreelist.ptr() + if bpn < uintptr(unsafe.Pointer(p)) { + memFreelist.set(bp) + if bpn+bp.size == uintptr(unsafe.Pointer(p)) { + bp.size += p.size + bp.next = p.next + *p = memHdr{} + } else { + bp.next.set(p) + } + return + } + for ; p.next != 0; p = p.next.ptr() { + if bpn > uintptr(unsafe.Pointer(p)) && bpn < uintptr(unsafe.Pointer(p.next)) { + break + } + } + if bpn+bp.size == uintptr(unsafe.Pointer(p.next)) { + bp.size += p.next.ptr().size + bp.next = p.next.ptr().next + *p.next.ptr() = memHdr{} + } else { + bp.next = p.next + } + if uintptr(unsafe.Pointer(p))+p.size == bpn { + p.size += bp.size + p.next = bp.next + *bp = memHdr{} + } else { + p.next.set(bp) + } +} + +func memCheck() { + if !memDebug { + return + } + for p := memFreelist.ptr(); p != nil && p.next != 0; p = p.next.ptr() { + if uintptr(unsafe.Pointer(p)) == uintptr(unsafe.Pointer(p.next)) { + print("runtime: ", unsafe.Pointer(p), " == ", 
unsafe.Pointer(p.next), "\n") + throw("mem: infinite loop") + } + if uintptr(unsafe.Pointer(p)) > uintptr(unsafe.Pointer(p.next)) { + print("runtime: ", unsafe.Pointer(p), " > ", unsafe.Pointer(p.next), "\n") + throw("mem: unordered list") + } + if uintptr(unsafe.Pointer(p))+p.size > uintptr(unsafe.Pointer(p.next)) { + print("runtime: ", unsafe.Pointer(p), "+", p.size, " > ", unsafe.Pointer(p.next), "\n") + throw("mem: overlapping blocks") + } + for b := add(unsafe.Pointer(p), unsafe.Sizeof(memHdr{})); uintptr(b) < uintptr(unsafe.Pointer(p))+p.size; b = add(b, 1) { + if *(*byte)(b) != 0 { + print("runtime: value at addr ", b, " with offset ", uintptr(b)-uintptr(unsafe.Pointer(p)), " in block ", p, " of size ", p.size, " is not zero\n") + throw("mem: uninitialised memory") + } + } + } +} + +func memRound(p uintptr) uintptr { + return (p + _PAGESIZE - 1) &^ (_PAGESIZE - 1) +} + +func initBloc() { + bloc = memRound(firstmoduledata.end) + blocMax = bloc +} + +func sbrk(n uintptr) unsafe.Pointer { + // Plan 9 sbrk from /sys/src/libc/9sys/sbrk.c + bl := bloc + n = memRound(n) + if bl+n > blocMax { + if brk_(unsafe.Pointer(bl+n)) < 0 { + return nil + } + blocMax = bl + n + } + bloc += n + return unsafe.Pointer(bl) +} + +func sysAllocOS(n uintptr) unsafe.Pointer { + lock(&memlock) + p := memAlloc(n) + memCheck() + unlock(&memlock) + return p +} + +func sysFreeOS(v unsafe.Pointer, n uintptr) { + lock(&memlock) + if uintptr(v)+n == bloc { + // Address range being freed is at the end of memory, + // so record a new lower value for end of memory. + // Can't actually shrink address space because segment is shared. 
+ memclrNoHeapPointers(v, n) + bloc -= n + } else { + memFree(v, n) + memCheck() + } + unlock(&memlock) +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +func sysMapOS(v unsafe.Pointer, n uintptr) { +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { +} + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + lock(&memlock) + var p unsafe.Pointer + if uintptr(v) == bloc { + // Address hint is the current end of memory, + // so try to extend the address space. + p = sbrk(n) + } + if p == nil && v == nil { + p = memAlloc(n) + memCheck() + } + unlock(&memlock) + return p +} diff --git a/src/runtime/mem_windows.go b/src/runtime/mem_windows.go new file mode 100644 index 0000000..b1292fc --- /dev/null +++ b/src/runtime/mem_windows.go @@ -0,0 +1,128 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "unsafe" +) + +const ( + _MEM_COMMIT = 0x1000 + _MEM_RESERVE = 0x2000 + _MEM_DECOMMIT = 0x4000 + _MEM_RELEASE = 0x8000 + + _PAGE_READWRITE = 0x0004 + _PAGE_NOACCESS = 0x0001 + + _ERROR_NOT_ENOUGH_MEMORY = 8 + _ERROR_COMMITMENT_LIMIT = 1455 +) + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. +// +//go:nosplit +func sysAllocOS(n uintptr) unsafe.Pointer { + return unsafe.Pointer(stdcall4(_VirtualAlloc, 0, n, _MEM_COMMIT|_MEM_RESERVE, _PAGE_READWRITE)) +} + +func sysUnusedOS(v unsafe.Pointer, n uintptr) { + r := stdcall3(_VirtualFree, uintptr(v), n, _MEM_DECOMMIT) + if r != 0 { + return + } + + // Decommit failed. Usual reason is that we've merged memory from two different + // VirtualAlloc calls, and Windows will only let each VirtualFree handle pages from + // a single VirtualAlloc. 
It is okay to specify a subset of the pages from a single alloc, + // just not pages from multiple allocs. This is a rare case, arising only when we're + // trying to give memory back to the operating system, which happens on a time + // scale of minutes. It doesn't have to be terribly fast. Instead of extra bookkeeping + // on all our VirtualAlloc calls, try freeing successively smaller pieces until + // we manage to free something, and then repeat. This ends up being O(n log n) + // in the worst case, but that's fast enough. + for n > 0 { + small := n + for small >= 4096 && stdcall3(_VirtualFree, uintptr(v), small, _MEM_DECOMMIT) == 0 { + small /= 2 + small &^= 4096 - 1 + } + if small < 4096 { + print("runtime: VirtualFree of ", small, " bytes failed with errno=", getlasterror(), "\n") + throw("runtime: failed to decommit pages") + } + v = add(v, small) + n -= small + } +} + +func sysUsedOS(v unsafe.Pointer, n uintptr) { + p := stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_COMMIT, _PAGE_READWRITE) + if p == uintptr(v) { + return + } + + // Commit failed. See SysUnused. + // Hold on to n here so we can give back a better error message + // for certain cases. + k := n + for k > 0 { + small := k + for small >= 4096 && stdcall4(_VirtualAlloc, uintptr(v), small, _MEM_COMMIT, _PAGE_READWRITE) == 0 { + small /= 2 + small &^= 4096 - 1 + } + if small < 4096 { + errno := getlasterror() + switch errno { + case _ERROR_NOT_ENOUGH_MEMORY, _ERROR_COMMITMENT_LIMIT: + print("runtime: VirtualAlloc of ", n, " bytes failed with errno=", errno, "\n") + throw("out of memory") + default: + print("runtime: VirtualAlloc of ", small, " bytes failed with errno=", errno, "\n") + throw("runtime: failed to commit pages") + } + } + v = add(v, small) + k -= small + } +} + +func sysHugePageOS(v unsafe.Pointer, n uintptr) { +} + +// Don't split the stack as this function may be invoked without a valid G, +// which prevents us from allocating more stack. 
+// +//go:nosplit +func sysFreeOS(v unsafe.Pointer, n uintptr) { + r := stdcall3(_VirtualFree, uintptr(v), 0, _MEM_RELEASE) + if r == 0 { + print("runtime: VirtualFree of ", n, " bytes failed with errno=", getlasterror(), "\n") + throw("runtime: failed to release pages") + } +} + +func sysFaultOS(v unsafe.Pointer, n uintptr) { + // SysUnused makes the memory inaccessible and prevents its reuse + sysUnusedOS(v, n) +} + +func sysReserveOS(v unsafe.Pointer, n uintptr) unsafe.Pointer { + // v is just a hint. + // First try at v. + // This will fail if any of [v, v+n) is already reserved. + v = unsafe.Pointer(stdcall4(_VirtualAlloc, uintptr(v), n, _MEM_RESERVE, _PAGE_READWRITE)) + if v != nil { + return v + } + + // Next let the kernel choose the address. + return unsafe.Pointer(stdcall4(_VirtualAlloc, 0, n, _MEM_RESERVE, _PAGE_READWRITE)) +} + +func sysMapOS(v unsafe.Pointer, n uintptr) { +} diff --git a/src/runtime/memclr_386.s b/src/runtime/memclr_386.s new file mode 100644 index 0000000..a72e5f2 --- /dev/null +++ b/src/runtime/memclr_386.s @@ -0,0 +1,137 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 + +#include "go_asm.h" +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), DI + MOVL n+4(FP), BX + XORL AX, AX + + // MOVOU seems always faster than REP STOSL. +tail: + // BSR+branch table make almost all memmove/memclr benchmarks worse. Not worth doing. 
+ TESTL BX, BX + JEQ _0 + CMPL BX, $2 + JBE _1or2 + CMPL BX, $4 + JB _3 + JE _4 + CMPL BX, $8 + JBE _5through8 + CMPL BX, $16 + JBE _9through16 +#ifdef GO386_softfloat + JMP nosse2 +#endif + PXOR X0, X0 + CMPL BX, $32 + JBE _17through32 + CMPL BX, $64 + JBE _33through64 + CMPL BX, $128 + JBE _65through128 + CMPL BX, $256 + JBE _129through256 + +loop: + MOVOU X0, 0(DI) + MOVOU X0, 16(DI) + MOVOU X0, 32(DI) + MOVOU X0, 48(DI) + MOVOU X0, 64(DI) + MOVOU X0, 80(DI) + MOVOU X0, 96(DI) + MOVOU X0, 112(DI) + MOVOU X0, 128(DI) + MOVOU X0, 144(DI) + MOVOU X0, 160(DI) + MOVOU X0, 176(DI) + MOVOU X0, 192(DI) + MOVOU X0, 208(DI) + MOVOU X0, 224(DI) + MOVOU X0, 240(DI) + SUBL $256, BX + ADDL $256, DI + CMPL BX, $256 + JAE loop + JMP tail + +_1or2: + MOVB AX, (DI) + MOVB AX, -1(DI)(BX*1) + RET +_0: + RET +_3: + MOVW AX, (DI) + MOVB AX, 2(DI) + RET +_4: + // We need a separate case for 4 to make sure we clear pointers atomically. + MOVL AX, (DI) + RET +_5through8: + MOVL AX, (DI) + MOVL AX, -4(DI)(BX*1) + RET +_9through16: + MOVL AX, (DI) + MOVL AX, 4(DI) + MOVL AX, -8(DI)(BX*1) + MOVL AX, -4(DI)(BX*1) + RET +_17through32: + MOVOU X0, (DI) + MOVOU X0, -16(DI)(BX*1) + RET +_33through64: + MOVOU X0, (DI) + MOVOU X0, 16(DI) + MOVOU X0, -32(DI)(BX*1) + MOVOU X0, -16(DI)(BX*1) + RET +_65through128: + MOVOU X0, (DI) + MOVOU X0, 16(DI) + MOVOU X0, 32(DI) + MOVOU X0, 48(DI) + MOVOU X0, -64(DI)(BX*1) + MOVOU X0, -48(DI)(BX*1) + MOVOU X0, -32(DI)(BX*1) + MOVOU X0, -16(DI)(BX*1) + RET +_129through256: + MOVOU X0, (DI) + MOVOU X0, 16(DI) + MOVOU X0, 32(DI) + MOVOU X0, 48(DI) + MOVOU X0, 64(DI) + MOVOU X0, 80(DI) + MOVOU X0, 96(DI) + MOVOU X0, 112(DI) + MOVOU X0, -128(DI)(BX*1) + MOVOU X0, -112(DI)(BX*1) + MOVOU X0, -96(DI)(BX*1) + MOVOU X0, -80(DI)(BX*1) + MOVOU X0, -64(DI)(BX*1) + MOVOU X0, -48(DI)(BX*1) + MOVOU X0, -32(DI)(BX*1) + MOVOU X0, -16(DI)(BX*1) + RET +nosse2: + MOVL BX, CX + SHRL $2, CX + REP + STOSL + ANDL $3, BX + JNE tail + RET diff --git a/src/runtime/memclr_amd64.s 
b/src/runtime/memclr_amd64.s new file mode 100644 index 0000000..19bfa6f --- /dev/null +++ b/src/runtime/memclr_amd64.s @@ -0,0 +1,218 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 + +#include "go_asm.h" +#include "textflag.h" +#include "asm_amd64.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +// ABIInternal for performance. +TEXT runtime·memclrNoHeapPointers<ABIInternal>(SB), NOSPLIT, $0-16 + // AX = ptr + // BX = n + MOVQ AX, DI // DI = ptr + XORQ AX, AX + + // MOVOU seems always faster than REP STOSQ when Enhanced REP STOSQ is not available. +tail: + // BSR+branch table make almost all memmove/memclr benchmarks worse. Not worth doing. + TESTQ BX, BX + JEQ _0 + CMPQ BX, $2 + JBE _1or2 + CMPQ BX, $4 + JBE _3or4 + CMPQ BX, $8 + JB _5through7 + JE _8 + CMPQ BX, $16 + JBE _9through16 + CMPQ BX, $32 + JBE _17through32 + CMPQ BX, $64 + JBE _33through64 + CMPQ BX, $128 + JBE _65through128 + CMPQ BX, $256 + JBE _129through256 + + CMPB internal∕cpu·X86+const_offsetX86HasERMS(SB), $1 // enhanced REP MOVSB/STOSB + JNE skip_erms + + // If the size is less than 2kb, do not use ERMS as it has a big start-up cost. + // Table 3-4. Relative Performance of Memcpy() Using ERMSB Vs. 128-bit AVX + // in the Intel Optimization Guide shows better performance for ERMSB starting + // from 2KB. Benchmarks show the similar threshold for REP STOS vs AVX. + CMPQ BX, $2048 + JAE loop_preheader_erms + +skip_erms: +#ifndef hasAVX2 + CMPB internal∕cpu·X86+const_offsetX86HasAVX2(SB), $1 + JE loop_preheader_avx2 + // TODO: for really big clears, use MOVNTDQ, even without AVX2. 
+ +loop: + MOVOU X15, 0(DI) + MOVOU X15, 16(DI) + MOVOU X15, 32(DI) + MOVOU X15, 48(DI) + MOVOU X15, 64(DI) + MOVOU X15, 80(DI) + MOVOU X15, 96(DI) + MOVOU X15, 112(DI) + MOVOU X15, 128(DI) + MOVOU X15, 144(DI) + MOVOU X15, 160(DI) + MOVOU X15, 176(DI) + MOVOU X15, 192(DI) + MOVOU X15, 208(DI) + MOVOU X15, 224(DI) + MOVOU X15, 240(DI) + SUBQ $256, BX + ADDQ $256, DI + CMPQ BX, $256 + JAE loop + JMP tail +#endif + +loop_preheader_avx2: + VPXOR X0, X0, X0 + // For smaller sizes MOVNTDQ may be faster or slower depending on hardware. + // For larger sizes it is always faster, even on dual Xeons with 30M cache. + // TODO take into account actual LLC size. E. g. glibc uses LLC size/2. + CMPQ BX, $0x2000000 + JAE loop_preheader_avx2_huge + +loop_avx2: + VMOVDQU Y0, 0(DI) + VMOVDQU Y0, 32(DI) + VMOVDQU Y0, 64(DI) + VMOVDQU Y0, 96(DI) + SUBQ $128, BX + ADDQ $128, DI + CMPQ BX, $128 + JAE loop_avx2 + VMOVDQU Y0, -32(DI)(BX*1) + VMOVDQU Y0, -64(DI)(BX*1) + VMOVDQU Y0, -96(DI)(BX*1) + VMOVDQU Y0, -128(DI)(BX*1) + VZEROUPPER + RET + +loop_preheader_erms: +#ifndef hasAVX2 + CMPB internal∕cpu·X86+const_offsetX86HasAVX2(SB), $1 + JNE loop_erms +#endif + + VPXOR X0, X0, X0 + // At this point both ERMS and AVX2 is supported. While REP STOS can use a no-RFO + // write protocol, ERMS could show the same or slower performance comparing to + // Non-Temporal Stores when the size is bigger than LLC depending on hardware. + CMPQ BX, $0x2000000 + JAE loop_preheader_avx2_huge + +loop_erms: + // STOSQ is used to guarantee that the whole zeroed pointer-sized word is visible + // for a memory subsystem as the GC requires this. 
+ MOVQ BX, CX + SHRQ $3, CX + ANDQ $7, BX + REP; STOSQ + JMP tail + +loop_preheader_avx2_huge: + // Align to 32 byte boundary + VMOVDQU Y0, 0(DI) + MOVQ DI, SI + ADDQ $32, DI + ANDQ $~31, DI + SUBQ DI, SI + ADDQ SI, BX +loop_avx2_huge: + VMOVNTDQ Y0, 0(DI) + VMOVNTDQ Y0, 32(DI) + VMOVNTDQ Y0, 64(DI) + VMOVNTDQ Y0, 96(DI) + SUBQ $128, BX + ADDQ $128, DI + CMPQ BX, $128 + JAE loop_avx2_huge + // In the description of MOVNTDQ in [1] + // "... fencing operation implemented with the SFENCE or MFENCE instruction + // should be used in conjunction with MOVNTDQ instructions..." + // [1] 64-ia-32-architectures-software-developer-manual-325462.pdf + SFENCE + VMOVDQU Y0, -32(DI)(BX*1) + VMOVDQU Y0, -64(DI)(BX*1) + VMOVDQU Y0, -96(DI)(BX*1) + VMOVDQU Y0, -128(DI)(BX*1) + VZEROUPPER + RET + +_1or2: + MOVB AX, (DI) + MOVB AX, -1(DI)(BX*1) + RET +_0: + RET +_3or4: + MOVW AX, (DI) + MOVW AX, -2(DI)(BX*1) + RET +_5through7: + MOVL AX, (DI) + MOVL AX, -4(DI)(BX*1) + RET +_8: + // We need a separate case for 8 to make sure we clear pointers atomically. 
+ MOVQ AX, (DI) + RET +_9through16: + MOVQ AX, (DI) + MOVQ AX, -8(DI)(BX*1) + RET +_17through32: + MOVOU X15, (DI) + MOVOU X15, -16(DI)(BX*1) + RET +_33through64: + MOVOU X15, (DI) + MOVOU X15, 16(DI) + MOVOU X15, -32(DI)(BX*1) + MOVOU X15, -16(DI)(BX*1) + RET +_65through128: + MOVOU X15, (DI) + MOVOU X15, 16(DI) + MOVOU X15, 32(DI) + MOVOU X15, 48(DI) + MOVOU X15, -64(DI)(BX*1) + MOVOU X15, -48(DI)(BX*1) + MOVOU X15, -32(DI)(BX*1) + MOVOU X15, -16(DI)(BX*1) + RET +_129through256: + MOVOU X15, (DI) + MOVOU X15, 16(DI) + MOVOU X15, 32(DI) + MOVOU X15, 48(DI) + MOVOU X15, 64(DI) + MOVOU X15, 80(DI) + MOVOU X15, 96(DI) + MOVOU X15, 112(DI) + MOVOU X15, -128(DI)(BX*1) + MOVOU X15, -112(DI)(BX*1) + MOVOU X15, -96(DI)(BX*1) + MOVOU X15, -80(DI)(BX*1) + MOVOU X15, -64(DI)(BX*1) + MOVOU X15, -48(DI)(BX*1) + MOVOU X15, -32(DI)(BX*1) + MOVOU X15, -16(DI)(BX*1) + RET diff --git a/src/runtime/memclr_arm.s b/src/runtime/memclr_arm.s new file mode 100644 index 0000000..f02d058 --- /dev/null +++ b/src/runtime/memclr_arm.s @@ -0,0 +1,91 @@ +// Inferno's libkern/memset-arm.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memset-arm.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "textflag.h" + +#define TO R8 +#define TOE R11 +#define N R12 +#define TMP R12 /* N and TMP don't overlap */ + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +// Also called from assembly in sys_windows_arm.s without g (but using Go stack convention). 
+TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT,$0-8 + MOVW ptr+0(FP), TO + MOVW n+4(FP), N + MOVW $0, R0 + + ADD N, TO, TOE /* to end pointer */ + + CMP $4, N /* need at least 4 bytes to copy */ + BLT _1tail + +_4align: /* align on 4 */ + AND.S $3, TO, TMP + BEQ _4aligned + + MOVBU.P R0, 1(TO) /* implicit write back */ + B _4align + +_4aligned: + SUB $31, TOE, TMP /* do 32-byte chunks if possible */ + CMP TMP, TO + BHS _4tail + + MOVW R0, R1 /* replicate */ + MOVW R0, R2 + MOVW R0, R3 + MOVW R0, R4 + MOVW R0, R5 + MOVW R0, R6 + MOVW R0, R7 + +_f32loop: + CMP TMP, TO + BHS _4tail + + MOVM.IA.W [R0-R7], (TO) + B _f32loop + +_4tail: + SUB $3, TOE, TMP /* do remaining words if possible */ +_4loop: + CMP TMP, TO + BHS _1tail + + MOVW.P R0, 4(TO) /* implicit write back */ + B _4loop + +_1tail: + CMP TO, TOE + BEQ _return + + MOVBU.P R0, 1(TO) /* implicit write back */ + B _1tail + +_return: + RET diff --git a/src/runtime/memclr_arm64.s b/src/runtime/memclr_arm64.s new file mode 100644 index 0000000..1c35dfe --- /dev/null +++ b/src/runtime/memclr_arm64.s @@ -0,0 +1,182 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +// Also called from assembly in sys_windows_arm64.s without g (but using Go stack convention). 
+TEXT runtime·memclrNoHeapPointers<ABIInternal>(SB),NOSPLIT,$0-16 + CMP $16, R1 + // If n is equal to 16 bytes, use zero_exact_16 to zero + BEQ zero_exact_16 + + // If n is greater than 16 bytes, use zero_by_16 to zero + BHI zero_by_16 + + // n is less than 16 bytes + ADD R1, R0, R7 + TBZ $3, R1, less_than_8 + MOVD ZR, (R0) + MOVD ZR, -8(R7) + RET + +less_than_8: + TBZ $2, R1, less_than_4 + MOVW ZR, (R0) + MOVW ZR, -4(R7) + RET + +less_than_4: + CBZ R1, ending + MOVB ZR, (R0) + TBZ $1, R1, ending + MOVH ZR, -2(R7) + +ending: + RET + +zero_exact_16: + // n is exactly 16 bytes + STP (ZR, ZR), (R0) + RET + +zero_by_16: + // n greater than 16 bytes, check if the start address is aligned + NEG R0, R4 + ANDS $15, R4, R4 + // Try zeroing using zva if the start address is aligned with 16 + BEQ try_zva + + // Non-aligned store + STP (ZR, ZR), (R0) + // Make the destination aligned + SUB R4, R1, R1 + ADD R4, R0, R0 + B try_zva + +tail_maybe_long: + CMP $64, R1 + BHS no_zva + +tail63: + ANDS $48, R1, R3 + BEQ last16 + CMPW $32, R3 + BEQ last48 + BLT last32 + STP.P (ZR, ZR), 16(R0) +last48: + STP.P (ZR, ZR), 16(R0) +last32: + STP.P (ZR, ZR), 16(R0) + // The last store length is at most 16, so it is safe to use + // stp to write last 16 bytes +last16: + ANDS $15, R1, R1 + CBZ R1, last_end + ADD R1, R0, R0 + STP (ZR, ZR), -16(R0) +last_end: + RET + +no_zva: + SUB $16, R0, R0 + SUB $64, R1, R1 + +loop_64: + STP (ZR, ZR), 16(R0) + STP (ZR, ZR), 32(R0) + STP (ZR, ZR), 48(R0) + STP.W (ZR, ZR), 64(R0) + SUBS $64, R1, R1 + BGE loop_64 + ANDS $63, R1, ZR + ADD $16, R0, R0 + BNE tail63 + RET + +try_zva: + // Try using the ZVA feature to zero entire cache lines + // It is not meaningful to use ZVA if the block size is less than 64, + // so make sure that n is greater than or equal to 64 + CMP $63, R1 + BLE tail63 + + CMP $128, R1 + // Ensure n is at least 128 bytes, so that there is enough to copy after + // alignment. 
+ BLT no_zva + // Check if ZVA is allowed from user code, and if so get the block size + MOVW block_size<>(SB), R5 + TBNZ $31, R5, no_zva + CBNZ R5, zero_by_line + // DCZID_EL0 bit assignments + // [63:5] Reserved + // [4] DZP, if bit set DC ZVA instruction is prohibited, else permitted + // [3:0] log2 of the block size in words, eg. if it returns 0x4 then block size is 16 words + MRS DCZID_EL0, R3 + TBZ $4, R3, init + // ZVA not available + MOVW $~0, R5 + MOVW R5, block_size<>(SB) + B no_zva + +init: + MOVW $4, R9 + ANDW $15, R3, R5 + LSLW R5, R9, R5 + MOVW R5, block_size<>(SB) + + ANDS $63, R5, R9 + // Block size is less than 64. + BNE no_zva + +zero_by_line: + CMP R5, R1 + // Not enough memory to reach alignment + BLO no_zva + SUB $1, R5, R6 + NEG R0, R4 + ANDS R6, R4, R4 + // Already aligned + BEQ aligned + + // check there is enough to copy after alignment + SUB R4, R1, R3 + + // Check that the remaining length to ZVA after alignment + // is greater than 64. + CMP $64, R3 + CCMP GE, R3, R5, $10 // condition code GE, NZCV=0b1010 + BLT no_zva + + // We now have at least 64 bytes to zero, update n + MOVD R3, R1 + +loop_zva_prolog: + STP (ZR, ZR), (R0) + STP (ZR, ZR), 16(R0) + STP (ZR, ZR), 32(R0) + SUBS $64, R4, R4 + STP (ZR, ZR), 48(R0) + ADD $64, R0, R0 + BGE loop_zva_prolog + + ADD R4, R0, R0 + +aligned: + SUB R5, R1, R1 + +loop_zva: + WORD $0xd50b7420 // DC ZVA, R0 + ADD R5, R0, R0 + SUBS R5, R1, R1 + BHS loop_zva + ANDS R6, R1, R1 + BNE tail_maybe_long + RET + +GLOBL block_size<>(SB), NOPTR, $8 diff --git a/src/runtime/memclr_loong64.s b/src/runtime/memclr_loong64.s new file mode 100644 index 0000000..e4f2058 --- /dev/null +++ b/src/runtime/memclr_loong64.s @@ -0,0 +1,41 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "go_asm.h" +#include "textflag.h" + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT,$0-16 + MOVV ptr+0(FP), R6 + MOVV n+8(FP), R7 + ADDV R6, R7, R4 + + // if less than 8 bytes, do one byte at a time + SGTU $8, R7, R8 + BNE R8, out + + // do one byte at a time until 8-aligned + AND $7, R6, R8 + BEQ R8, words + MOVB R0, (R6) + ADDV $1, R6 + JMP -4(PC) + +words: + // do 8 bytes at a time if there is room + ADDV $-7, R4, R7 + + SGTU R7, R6, R8 + BEQ R8, out + MOVV R0, (R6) + ADDV $8, R6 + JMP -4(PC) + +out: + BEQ R6, R4, done + MOVB R0, (R6) + ADDV $1, R6 + JMP -3(PC) +done: + RET diff --git a/src/runtime/memclr_mips64x.s b/src/runtime/memclr_mips64x.s new file mode 100644 index 0000000..cf3a9c4 --- /dev/null +++ b/src/runtime/memclr_mips64x.s @@ -0,0 +1,99 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +#include "go_asm.h" +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. 
+ +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT,$0-16 + MOVV ptr+0(FP), R1 + MOVV n+8(FP), R2 + ADDV R1, R2, R4 + + // if less than 16 bytes or no MSA, do words check + SGTU $16, R2, R3 + BNE R3, no_msa + MOVBU internal∕cpu·MIPS64X+const_offsetMIPS64XHasMSA(SB), R3 + BEQ R3, R0, no_msa + + VMOVB $0, W0 + + SGTU $128, R2, R3 + BEQ R3, msa_large + + AND $15, R2, R5 + XOR R2, R5, R6 + ADDVU R1, R6 + +msa_small: + VMOVB W0, (R1) + ADDVU $16, R1 + SGTU R6, R1, R3 + BNE R3, R0, msa_small + BEQ R5, R0, done + VMOVB W0, -16(R4) + JMP done + +msa_large: + AND $127, R2, R5 + XOR R2, R5, R6 + ADDVU R1, R6 + +msa_large_loop: + VMOVB W0, (R1) + VMOVB W0, 16(R1) + VMOVB W0, 32(R1) + VMOVB W0, 48(R1) + VMOVB W0, 64(R1) + VMOVB W0, 80(R1) + VMOVB W0, 96(R1) + VMOVB W0, 112(R1) + + ADDVU $128, R1 + SGTU R6, R1, R3 + BNE R3, R0, msa_large_loop + BEQ R5, R0, done + VMOVB W0, -128(R4) + VMOVB W0, -112(R4) + VMOVB W0, -96(R4) + VMOVB W0, -80(R4) + VMOVB W0, -64(R4) + VMOVB W0, -48(R4) + VMOVB W0, -32(R4) + VMOVB W0, -16(R4) + JMP done + +no_msa: + // if less than 8 bytes, do one byte at a time + SGTU $8, R2, R3 + BNE R3, out + + // do one byte at a time until 8-aligned + AND $7, R1, R3 + BEQ R3, words + MOVB R0, (R1) + ADDV $1, R1 + JMP -4(PC) + +words: + // do 8 bytes at a time if there is room + ADDV $-7, R4, R2 + + SGTU R2, R1, R3 + BEQ R3, out + MOVV R0, (R1) + ADDV $8, R1 + JMP -4(PC) + +out: + BEQ R1, R4, done + MOVB R0, (R1) + ADDV $1, R1 + JMP -3(PC) +done: + RET diff --git a/src/runtime/memclr_mipsx.s b/src/runtime/memclr_mipsx.s new file mode 100644 index 0000000..ee3009d --- /dev/null +++ b/src/runtime/memclr_mipsx.s @@ -0,0 +1,73 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build mips || mipsle + +#include "textflag.h" + +#ifdef GOARCH_mips +#define MOVWHI MOVWL +#define MOVWLO MOVWR +#else +#define MOVWHI MOVWR +#define MOVWLO MOVWL +#endif + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT,$0-8 + MOVW n+4(FP), R2 + MOVW ptr+0(FP), R1 + + SGTU $4, R2, R3 + ADDU R2, R1, R4 + BNE R3, small_zero + +ptr_align: + AND $3, R1, R3 + BEQ R3, setup + SUBU R1, R0, R3 + AND $3, R3 // R3 contains number of bytes needed to align ptr + MOVWHI R0, 0(R1) // MOVWHI will write zeros up to next word boundary + SUBU R3, R2 + ADDU R3, R1 + +setup: + AND $31, R2, R6 + AND $3, R2, R5 + SUBU R6, R4, R6 // end pointer for 32-byte chunks + SUBU R5, R4, R5 // end pointer for 4-byte chunks + +large: + BEQ R1, R6, words + MOVW R0, 0(R1) + MOVW R0, 4(R1) + MOVW R0, 8(R1) + MOVW R0, 12(R1) + MOVW R0, 16(R1) + MOVW R0, 20(R1) + MOVW R0, 24(R1) + MOVW R0, 28(R1) + ADDU $32, R1 + JMP large + +words: + BEQ R1, R5, tail + MOVW R0, 0(R1) + ADDU $4, R1 + JMP words + +tail: + BEQ R1, R4, ret + MOVWLO R0, -1(R4) + +ret: + RET + +small_zero: + BEQ R1, R4, ret + MOVB R0, 0(R1) + ADDU $1, R1 + JMP small_zero diff --git a/src/runtime/memclr_plan9_386.s b/src/runtime/memclr_plan9_386.s new file mode 100644 index 0000000..54701a9 --- /dev/null +++ b/src/runtime/memclr_plan9_386.s @@ -0,0 +1,58 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. 
+ +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB), NOSPLIT, $0-8 + MOVL ptr+0(FP), DI + MOVL n+4(FP), BX + XORL AX, AX + +tail: + TESTL BX, BX + JEQ _0 + CMPL BX, $2 + JBE _1or2 + CMPL BX, $4 + JB _3 + JE _4 + CMPL BX, $8 + JBE _5through8 + CMPL BX, $16 + JBE _9through16 + MOVL BX, CX + SHRL $2, CX + REP + STOSL + ANDL $3, BX + JNE tail + RET + +_1or2: + MOVB AX, (DI) + MOVB AX, -1(DI)(BX*1) + RET +_0: + RET +_3: + MOVW AX, (DI) + MOVB AX, 2(DI) + RET +_4: + // We need a separate case for 4 to make sure we clear pointers atomically. + MOVL AX, (DI) + RET +_5through8: + MOVL AX, (DI) + MOVL AX, -4(DI)(BX*1) + RET +_9through16: + MOVL AX, (DI) + MOVL AX, 4(DI) + MOVL AX, -8(DI)(BX*1) + MOVL AX, -4(DI)(BX*1) + RET diff --git a/src/runtime/memclr_plan9_amd64.s b/src/runtime/memclr_plan9_amd64.s new file mode 100644 index 0000000..8c6a1cc --- /dev/null +++ b/src/runtime/memclr_plan9_amd64.s @@ -0,0 +1,23 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT,$0-16 + MOVQ ptr+0(FP), DI + MOVQ n+8(FP), CX + MOVQ CX, BX + ANDQ $7, BX + SHRQ $3, CX + MOVQ $0, AX + CLD + REP + STOSQ + MOVQ BX, CX + REP + STOSB + RET diff --git a/src/runtime/memclr_ppc64x.s b/src/runtime/memclr_ppc64x.s new file mode 100644 index 0000000..3543255 --- /dev/null +++ b/src/runtime/memclr_ppc64x.s @@ -0,0 +1,174 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. 
+ +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-16 + // R3 = ptr + // R4 = n + + // Determine if there are doublewords to clear +check: + ANDCC $7, R4, R5 // R5: leftover bytes to clear + SRD $3, R4, R6 // R6: double words to clear + CMP R6, $0, CR1 // CR1[EQ] set if no double words + + BC 12, 6, nozerolarge // only single bytes + CMP R4, $512 + BLT under512 // special case for < 512 + ANDCC $127, R3, R8 // check for 128 alignment of address + BEQ zero512setup + + ANDCC $7, R3, R15 + BEQ zero512xsetup // at least 8 byte aligned + + // zero bytes up to 8 byte alignment + + ANDCC $1, R3, R15 // check for byte alignment + BEQ byte2 + MOVB R0, 0(R3) // zero 1 byte + ADD $1, R3 // bump ptr by 1 + ADD $-1, R4 + +byte2: + ANDCC $2, R3, R15 // check for 2 byte alignment + BEQ byte4 + MOVH R0, 0(R3) // zero 2 bytes + ADD $2, R3 // bump ptr by 2 + ADD $-2, R4 + +byte4: + ANDCC $4, R3, R15 // check for 4 byte alignment + BEQ zero512xsetup + MOVW R0, 0(R3) // zero 4 bytes + ADD $4, R3 // bump ptr by 4 + ADD $-4, R4 + BR zero512xsetup // ptr should now be 8 byte aligned + +under512: + SRDCC $3, R6, R7 // 64 byte chunks? + XXLXOR VS32, VS32, VS32 // clear VS32 (V0) + BEQ lt64gt8 + + // Prepare to clear 64 bytes at a time. 
+ +zero64setup: + DCBTST (R3) // prepare data cache + MOVD R7, CTR // number of 64 byte chunks + MOVD $16, R8 + MOVD $32, R16 + MOVD $48, R17 + +zero64: + STXVD2X VS32, (R3+R0) // store 16 bytes + STXVD2X VS32, (R3+R8) + STXVD2X VS32, (R3+R16) + STXVD2X VS32, (R3+R17) + ADD $64, R3 + ADD $-64, R4 + BDNZ zero64 // dec ctr, br zero64 if ctr not 0 + SRDCC $3, R4, R6 // remaining doublewords + BEQ nozerolarge + +lt64gt8: + CMP R4, $32 + BLT lt32gt8 + MOVD $16, R8 + STXVD2X VS32, (R3+R0) + STXVD2X VS32, (R3+R8) + ADD $-32, R4 + ADD $32, R3 +lt32gt8: + CMP R4, $16 + BLT lt16gt8 + STXVD2X VS32, (R3+R0) + ADD $16, R3 + ADD $-16, R4 +lt16gt8: + CMP R4, $8 + BLT nozerolarge + MOVD R0, 0(R3) + ADD $8, R3 + ADD $-8, R4 + +nozerolarge: + ANDCC $7, R4, R5 // any remaining bytes + BC 4, 1, LR // ble lr + +zerotail: + MOVD R5, CTR // set up to clear tail bytes + +zerotailloop: + MOVB R0, 0(R3) // clear single bytes + ADD $1, R3 + BDNZ zerotailloop // dec ctr, br zerotailloop if ctr not 0 + RET + +zero512xsetup: // 512 chunk with extra needed + ANDCC $8, R3, R11 // 8 byte alignment? 
+ BEQ zero512setup16 + MOVD R0, 0(R3) // clear 8 bytes + ADD $8, R3 // update ptr to next 8 + ADD $-8, R4 // dec count by 8 + +zero512setup16: + ANDCC $127, R3, R14 // < 128 byte alignment + BEQ zero512setup // handle 128 byte alignment + MOVD $128, R15 + SUB R14, R15, R14 // find increment to 128 alignment + SRD $4, R14, R15 // number of 16 byte chunks + +zero512presetup: + MOVD R15, CTR // loop counter of 16 bytes + XXLXOR VS32, VS32, VS32 // clear VS32 (V0) + +zero512preloop: // clear up to 128 alignment + STXVD2X VS32, (R3+R0) // clear 16 bytes + ADD $16, R3 // update ptr + ADD $-16, R4 // dec count + BDNZ zero512preloop + +zero512setup: // setup for dcbz loop + CMP R4, $512 // check if at least 512 + BLT remain + SRD $9, R4, R8 // loop count for 512 chunks + MOVD R8, CTR // set up counter + MOVD $128, R9 // index regs for 128 bytes + MOVD $256, R10 + MOVD $384, R11 + PCALIGN $32 + +zero512: + DCBZ (R3+R0) // clear first chunk + DCBZ (R3+R9) // clear second chunk + DCBZ (R3+R10) // clear third chunk + DCBZ (R3+R11) // clear fourth chunk + ADD $512, R3 + BDNZ zero512 + ANDCC $511, R4 + +remain: + CMP R4, $128 // check if 128 byte chunks left + BLT smaller + DCBZ (R3+R0) // clear 128 + ADD $128, R3 + ADD $-128, R4 + BR remain + +smaller: + ANDCC $127, R4, R7 // find leftovers + BEQ done + CMP R7, $64 // more than 64, do 64 at a time + XXLXOR VS32, VS32, VS32 + BLT lt64gt8 // less than 64 + SRD $6, R7, R7 // set up counter for 64 + BR zero64setup + +done: + RET diff --git a/src/runtime/memclr_riscv64.s b/src/runtime/memclr_riscv64.s new file mode 100644 index 0000000..d12b545 --- /dev/null +++ b/src/runtime/memclr_riscv64.s @@ -0,0 +1,103 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. 
+ +// void runtime·memclrNoHeapPointers(void*, uintptr) +TEXT runtime·memclrNoHeapPointers<ABIInternal>(SB),NOSPLIT,$0-16 + // X10 = ptr + // X11 = n + + // If less than 8 bytes, do single byte zeroing. + MOV $8, X9 + BLT X11, X9, check4 + + // Check alignment + AND $3, X10, X5 + BEQZ X5, aligned + + // Zero one byte at a time until we reach 8 byte alignment. + SUB X5, X11, X11 +align: + ADD $-1, X5 + MOVB ZERO, 0(X10) + ADD $1, X10 + BNEZ X5, align + +aligned: + MOV $8, X9 + BLT X11, X9, check4 + MOV $16, X9 + BLT X11, X9, zero8 + MOV $32, X9 + BLT X11, X9, zero16 + MOV $64, X9 + BLT X11, X9, zero32 +loop64: + MOV ZERO, 0(X10) + MOV ZERO, 8(X10) + MOV ZERO, 16(X10) + MOV ZERO, 24(X10) + MOV ZERO, 32(X10) + MOV ZERO, 40(X10) + MOV ZERO, 48(X10) + MOV ZERO, 56(X10) + ADD $64, X10 + ADD $-64, X11 + BGE X11, X9, loop64 + BEQZ X11, done + +check32: + MOV $32, X9 + BLT X11, X9, check16 +zero32: + MOV ZERO, 0(X10) + MOV ZERO, 8(X10) + MOV ZERO, 16(X10) + MOV ZERO, 24(X10) + ADD $32, X10 + ADD $-32, X11 + BEQZ X11, done + +check16: + MOV $16, X9 + BLT X11, X9, check8 +zero16: + MOV ZERO, 0(X10) + MOV ZERO, 8(X10) + ADD $16, X10 + ADD $-16, X11 + BEQZ X11, done + +check8: + MOV $8, X9 + BLT X11, X9, check4 +zero8: + MOV ZERO, 0(X10) + ADD $8, X10 + ADD $-8, X11 + BEQZ X11, done + +check4: + MOV $4, X9 + BLT X11, X9, loop1 +zero4: + MOVB ZERO, 0(X10) + MOVB ZERO, 1(X10) + MOVB ZERO, 2(X10) + MOVB ZERO, 3(X10) + ADD $4, X10 + ADD $-4, X11 + +loop1: + BEQZ X11, done + MOVB ZERO, 0(X10) + ADD $1, X10 + ADD $-1, X11 + JMP loop1 + +done: + RET diff --git a/src/runtime/memclr_s390x.s b/src/runtime/memclr_s390x.s new file mode 100644 index 0000000..fa657ef --- /dev/null +++ b/src/runtime/memclr_s390x.s @@ -0,0 +1,124 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB),NOSPLIT|NOFRAME,$0-16 + MOVD ptr+0(FP), R4 + MOVD n+8(FP), R5 + +start: + CMPBLE R5, $3, clear0to3 + CMPBLE R5, $7, clear4to7 + CMPBLE R5, $11, clear8to11 + CMPBLE R5, $15, clear12to15 + CMP R5, $32 + BGE clearmt32 + MOVD $0, 0(R4) + MOVD $0, 8(R4) + ADD $16, R4 + SUB $16, R5 + BR start + +clear0to3: + CMPBEQ R5, $0, done + CMPBNE R5, $1, clear2 + MOVB $0, 0(R4) + RET +clear2: + CMPBNE R5, $2, clear3 + MOVH $0, 0(R4) + RET +clear3: + MOVH $0, 0(R4) + MOVB $0, 2(R4) + RET + +clear4to7: + CMPBNE R5, $4, clear5 + MOVW $0, 0(R4) + RET +clear5: + CMPBNE R5, $5, clear6 + MOVW $0, 0(R4) + MOVB $0, 4(R4) + RET +clear6: + CMPBNE R5, $6, clear7 + MOVW $0, 0(R4) + MOVH $0, 4(R4) + RET +clear7: + MOVW $0, 0(R4) + MOVH $0, 4(R4) + MOVB $0, 6(R4) + RET + +clear8to11: + CMPBNE R5, $8, clear9 + MOVD $0, 0(R4) + RET +clear9: + CMPBNE R5, $9, clear10 + MOVD $0, 0(R4) + MOVB $0, 8(R4) + RET +clear10: + CMPBNE R5, $10, clear11 + MOVD $0, 0(R4) + MOVH $0, 8(R4) + RET +clear11: + MOVD $0, 0(R4) + MOVH $0, 8(R4) + MOVB $0, 10(R4) + RET + +clear12to15: + CMPBNE R5, $12, clear13 + MOVD $0, 0(R4) + MOVW $0, 8(R4) + RET +clear13: + CMPBNE R5, $13, clear14 + MOVD $0, 0(R4) + MOVW $0, 8(R4) + MOVB $0, 12(R4) + RET +clear14: + CMPBNE R5, $14, clear15 + MOVD $0, 0(R4) + MOVW $0, 8(R4) + MOVH $0, 12(R4) + RET +clear15: + MOVD $0, 0(R4) + MOVW $0, 8(R4) + MOVH $0, 12(R4) + MOVB $0, 14(R4) + RET + +clearmt32: + CMP R5, $256 + BLT clearlt256 + XC $256, 0(R4), 0(R4) + ADD $256, R4 + ADD $-256, R5 + BR clearmt32 +clearlt256: + CMPBEQ R5, $0, done + ADD $-1, R5 + EXRL $memclr_exrl_xc<>(SB), R5 +done: + RET + +// DO NOT CALL - target for exrl (execute relative long) instruction. 
+TEXT memclr_exrl_xc<>(SB),NOSPLIT|NOFRAME,$0-0 + XC $1, 0(R4), 0(R4) + MOVD $0, 0(R0) + RET + diff --git a/src/runtime/memclr_wasm.s b/src/runtime/memclr_wasm.s new file mode 100644 index 0000000..19d08ff --- /dev/null +++ b/src/runtime/memclr_wasm.s @@ -0,0 +1,20 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memclrNoHeapPointers Go doc for important implementation constraints. + +// func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) +TEXT runtime·memclrNoHeapPointers(SB), NOSPLIT, $0-16 + MOVD ptr+0(FP), R0 + MOVD n+8(FP), R1 + + Get R0 + I32WrapI64 + I32Const $0 + Get R1 + I32WrapI64 + MemoryFill + RET diff --git a/src/runtime/memmove_386.s b/src/runtime/memmove_386.s new file mode 100644 index 0000000..6d7e17f --- /dev/null +++ b/src/runtime/memmove_386.s @@ -0,0 +1,204 @@ +// Inferno's libkern/memmove-386.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memmove-386.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. 
+// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +//go:build !plan9 + +#include "go_asm.h" +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT, $0-12 + MOVL to+0(FP), DI + MOVL from+4(FP), SI + MOVL n+8(FP), BX + + // REP instructions have a high startup cost, so we handle small sizes + // with some straightline code. The REP MOVSL instruction is really fast + // for large sizes. The cutover is approximately 1K. We implement up to + // 128 because that is the maximum SSE register load (loading all data + // into registers lets us ignore copy direction). +tail: + // BSR+branch table make almost all memmove/memclr benchmarks worse. Not worth doing. 
+ TESTL BX, BX + JEQ move_0 + CMPL BX, $2 + JBE move_1or2 + CMPL BX, $4 + JB move_3 + JE move_4 + CMPL BX, $8 + JBE move_5through8 + CMPL BX, $16 + JBE move_9through16 +#ifdef GO386_softfloat + JMP nosse2 +#endif + CMPL BX, $32 + JBE move_17through32 + CMPL BX, $64 + JBE move_33through64 + CMPL BX, $128 + JBE move_65through128 + +nosse2: +/* + * check and set for backwards + */ + CMPL SI, DI + JLS back + +/* + * forward copy loop + */ +forward: + // If REP MOVSB isn't fast, don't use it + CMPB internal∕cpu·X86+const_offsetX86HasERMS(SB), $1 // enhanced REP MOVSB/STOSB + JNE fwdBy4 + + // Check alignment + MOVL SI, AX + ORL DI, AX + TESTL $3, AX + JEQ fwdBy4 + + // Do 1 byte at a time + MOVL BX, CX + REP; MOVSB + RET + +fwdBy4: + // Do 4 bytes at a time + MOVL BX, CX + SHRL $2, CX + ANDL $3, BX + REP; MOVSL + JMP tail + +/* + * check overlap + */ +back: + MOVL SI, CX + ADDL BX, CX + CMPL CX, DI + JLS forward +/* + * whole thing backwards has + * adjusted addresses + */ + + ADDL BX, DI + ADDL BX, SI + STD + +/* + * copy + */ + MOVL BX, CX + SHRL $2, CX + ANDL $3, BX + + SUBL $4, DI + SUBL $4, SI + REP; MOVSL + + CLD + ADDL $4, DI + ADDL $4, SI + SUBL BX, DI + SUBL BX, SI + JMP tail + +move_1or2: + MOVB (SI), AX + MOVB -1(SI)(BX*1), CX + MOVB AX, (DI) + MOVB CX, -1(DI)(BX*1) + RET +move_0: + RET +move_3: + MOVW (SI), AX + MOVB 2(SI), CX + MOVW AX, (DI) + MOVB CX, 2(DI) + RET +move_4: + // We need a separate case for 4 to make sure we write pointers atomically. 
+ MOVL (SI), AX + MOVL AX, (DI) + RET +move_5through8: + MOVL (SI), AX + MOVL -4(SI)(BX*1), CX + MOVL AX, (DI) + MOVL CX, -4(DI)(BX*1) + RET +move_9through16: + MOVL (SI), AX + MOVL 4(SI), CX + MOVL -8(SI)(BX*1), DX + MOVL -4(SI)(BX*1), BP + MOVL AX, (DI) + MOVL CX, 4(DI) + MOVL DX, -8(DI)(BX*1) + MOVL BP, -4(DI)(BX*1) + RET +move_17through32: + MOVOU (SI), X0 + MOVOU -16(SI)(BX*1), X1 + MOVOU X0, (DI) + MOVOU X1, -16(DI)(BX*1) + RET +move_33through64: + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU -32(SI)(BX*1), X2 + MOVOU -16(SI)(BX*1), X3 + MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, -32(DI)(BX*1) + MOVOU X3, -16(DI)(BX*1) + RET +move_65through128: + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU 32(SI), X2 + MOVOU 48(SI), X3 + MOVOU -64(SI)(BX*1), X4 + MOVOU -48(SI)(BX*1), X5 + MOVOU -32(SI)(BX*1), X6 + MOVOU -16(SI)(BX*1), X7 + MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, 32(DI) + MOVOU X3, 48(DI) + MOVOU X4, -64(DI)(BX*1) + MOVOU X5, -48(DI)(BX*1) + MOVOU X6, -32(DI)(BX*1) + MOVOU X7, -16(DI)(BX*1) + RET diff --git a/src/runtime/memmove_amd64.s b/src/runtime/memmove_amd64.s new file mode 100644 index 0000000..018bb0b --- /dev/null +++ b/src/runtime/memmove_amd64.s @@ -0,0 +1,532 @@ +// Derived from Inferno's libkern/memmove-386.s (adapted for amd64) +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memmove-386.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +//go:build !plan9 + +#include "go_asm.h" +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +// ABIInternal for performance. +TEXT runtime·memmove<ABIInternal>(SB), NOSPLIT, $0-24 + // AX = to + // BX = from + // CX = n + MOVQ AX, DI + MOVQ BX, SI + MOVQ CX, BX + + // REP instructions have a high startup cost, so we handle small sizes + // with some straightline code. The REP MOVSQ instruction is really fast + // for large sizes. The cutover is approximately 2K. +tail: + // move_129through256 or smaller work whether or not the source and the + // destination memory regions overlap because they load all data into + // registers before writing it back. move_256through2048 on the other + // hand can be used only when the memory regions don't overlap or the copy + // direction is forward. 
+ // + // BSR+branch table make almost all memmove/memclr benchmarks worse. Not worth doing. + TESTQ BX, BX + JEQ move_0 + CMPQ BX, $2 + JBE move_1or2 + CMPQ BX, $4 + JB move_3 + JBE move_4 + CMPQ BX, $8 + JB move_5through7 + JE move_8 + CMPQ BX, $16 + JBE move_9through16 + CMPQ BX, $32 + JBE move_17through32 + CMPQ BX, $64 + JBE move_33through64 + CMPQ BX, $128 + JBE move_65through128 + CMPQ BX, $256 + JBE move_129through256 + + TESTB $1, runtime·useAVXmemmove(SB) + JNZ avxUnaligned + +/* + * check and set for backwards + */ + CMPQ SI, DI + JLS back + +/* + * forward copy loop + */ +forward: + CMPQ BX, $2048 + JLS move_256through2048 + + // If REP MOVSB isn't fast, don't use it + CMPB internal∕cpu·X86+const_offsetX86HasERMS(SB), $1 // enhanced REP MOVSB/STOSB + JNE fwdBy8 + + // Check alignment + MOVL SI, AX + ORL DI, AX + TESTL $7, AX + JEQ fwdBy8 + + // Do 1 byte at a time + MOVQ BX, CX + REP; MOVSB + RET + +fwdBy8: + // Do 8 bytes at a time + MOVQ BX, CX + SHRQ $3, CX + ANDQ $7, BX + REP; MOVSQ + JMP tail + +back: +/* + * check overlap + */ + MOVQ SI, CX + ADDQ BX, CX + CMPQ CX, DI + JLS forward +/* + * whole thing backwards has + * adjusted addresses + */ + ADDQ BX, DI + ADDQ BX, SI + STD + +/* + * copy + */ + MOVQ BX, CX + SHRQ $3, CX + ANDQ $7, BX + + SUBQ $8, DI + SUBQ $8, SI + REP; MOVSQ + + CLD + ADDQ $8, DI + ADDQ $8, SI + SUBQ BX, DI + SUBQ BX, SI + JMP tail + +move_1or2: + MOVB (SI), AX + MOVB -1(SI)(BX*1), CX + MOVB AX, (DI) + MOVB CX, -1(DI)(BX*1) + RET +move_0: + RET +move_4: + MOVL (SI), AX + MOVL AX, (DI) + RET +move_3: + MOVW (SI), AX + MOVB 2(SI), CX + MOVW AX, (DI) + MOVB CX, 2(DI) + RET +move_5through7: + MOVL (SI), AX + MOVL -4(SI)(BX*1), CX + MOVL AX, (DI) + MOVL CX, -4(DI)(BX*1) + RET +move_8: + // We need a separate case for 8 to make sure we write pointers atomically. 
+ MOVQ (SI), AX + MOVQ AX, (DI) + RET +move_9through16: + MOVQ (SI), AX + MOVQ -8(SI)(BX*1), CX + MOVQ AX, (DI) + MOVQ CX, -8(DI)(BX*1) + RET +move_17through32: + MOVOU (SI), X0 + MOVOU -16(SI)(BX*1), X1 + MOVOU X0, (DI) + MOVOU X1, -16(DI)(BX*1) + RET +move_33through64: + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU -32(SI)(BX*1), X2 + MOVOU -16(SI)(BX*1), X3 + MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, -32(DI)(BX*1) + MOVOU X3, -16(DI)(BX*1) + RET +move_65through128: + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU 32(SI), X2 + MOVOU 48(SI), X3 + MOVOU -64(SI)(BX*1), X4 + MOVOU -48(SI)(BX*1), X5 + MOVOU -32(SI)(BX*1), X6 + MOVOU -16(SI)(BX*1), X7 + MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, 32(DI) + MOVOU X3, 48(DI) + MOVOU X4, -64(DI)(BX*1) + MOVOU X5, -48(DI)(BX*1) + MOVOU X6, -32(DI)(BX*1) + MOVOU X7, -16(DI)(BX*1) + RET +move_129through256: + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU 32(SI), X2 + MOVOU 48(SI), X3 + MOVOU 64(SI), X4 + MOVOU 80(SI), X5 + MOVOU 96(SI), X6 + MOVOU 112(SI), X7 + MOVOU -128(SI)(BX*1), X8 + MOVOU -112(SI)(BX*1), X9 + MOVOU -96(SI)(BX*1), X10 + MOVOU -80(SI)(BX*1), X11 + MOVOU -64(SI)(BX*1), X12 + MOVOU -48(SI)(BX*1), X13 + MOVOU -32(SI)(BX*1), X14 + MOVOU -16(SI)(BX*1), X15 + MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, 32(DI) + MOVOU X3, 48(DI) + MOVOU X4, 64(DI) + MOVOU X5, 80(DI) + MOVOU X6, 96(DI) + MOVOU X7, 112(DI) + MOVOU X8, -128(DI)(BX*1) + MOVOU X9, -112(DI)(BX*1) + MOVOU X10, -96(DI)(BX*1) + MOVOU X11, -80(DI)(BX*1) + MOVOU X12, -64(DI)(BX*1) + MOVOU X13, -48(DI)(BX*1) + MOVOU X14, -32(DI)(BX*1) + MOVOU X15, -16(DI)(BX*1) + // X15 must be zero on return + PXOR X15, X15 + RET +move_256through2048: + SUBQ $256, BX + MOVOU (SI), X0 + MOVOU 16(SI), X1 + MOVOU 32(SI), X2 + MOVOU 48(SI), X3 + MOVOU 64(SI), X4 + MOVOU 80(SI), X5 + MOVOU 96(SI), X6 + MOVOU 112(SI), X7 + MOVOU 128(SI), X8 + MOVOU 144(SI), X9 + MOVOU 160(SI), X10 + MOVOU 176(SI), X11 + MOVOU 192(SI), X12 + MOVOU 208(SI), X13 + MOVOU 224(SI), X14 + MOVOU 240(SI), X15 
+ MOVOU X0, (DI) + MOVOU X1, 16(DI) + MOVOU X2, 32(DI) + MOVOU X3, 48(DI) + MOVOU X4, 64(DI) + MOVOU X5, 80(DI) + MOVOU X6, 96(DI) + MOVOU X7, 112(DI) + MOVOU X8, 128(DI) + MOVOU X9, 144(DI) + MOVOU X10, 160(DI) + MOVOU X11, 176(DI) + MOVOU X12, 192(DI) + MOVOU X13, 208(DI) + MOVOU X14, 224(DI) + MOVOU X15, 240(DI) + CMPQ BX, $256 + LEAQ 256(SI), SI + LEAQ 256(DI), DI + JGE move_256through2048 + // X15 must be zero on return + PXOR X15, X15 + JMP tail + +avxUnaligned: + // There are two implementations of move algorithm. + // The first one for non-overlapped memory regions. It uses forward copying. + // The second one for overlapped regions. It uses backward copying + MOVQ DI, CX + SUBQ SI, CX + // Now CX contains distance between SRC and DEST + CMPQ CX, BX + // If the distance lesser than region length it means that regions are overlapped + JC copy_backward + + // Non-temporal copy would be better for big sizes. + CMPQ BX, $0x100000 + JAE gobble_big_data_fwd + + // Memory layout on the source side + // SI CX + // |<---------BX before correction--------->| + // | |<--BX corrected-->| | + // | | |<--- AX --->| + // |<-R11->| |<-128 bytes->| + // +----------------------------------------+ + // | Head | Body | Tail | + // +-------+------------------+-------------+ + // ^ ^ ^ + // | | | + // Save head into Y4 Save tail into X5..X12 + // | + // SI+R11, where R11 = ((DI & -32) + 32) - DI + // Algorithm: + // 1. Unaligned save of the tail's 128 bytes + // 2. Unaligned save of the head's 32 bytes + // 3. Destination-aligned copying of body (128 bytes per iteration) + // 4. Put head on the new place + // 5. Put the tail on the new place + // It can be important to satisfy processor's pipeline requirements for + // small sizes as the cost of unaligned memory region copying is + // comparable with the cost of main loop. So code is slightly messed there. 
+ // There is more clean implementation of that algorithm for bigger sizes + // where the cost of unaligned part copying is negligible. + // You can see it after gobble_big_data_fwd label. + LEAQ (SI)(BX*1), CX + MOVQ DI, R10 + // CX points to the end of buffer so we need go back slightly. We will use negative offsets there. + MOVOU -0x80(CX), X5 + MOVOU -0x70(CX), X6 + MOVQ $0x80, AX + // Align destination address + ANDQ $-32, DI + ADDQ $32, DI + // Continue tail saving. + MOVOU -0x60(CX), X7 + MOVOU -0x50(CX), X8 + // Make R11 delta between aligned and unaligned destination addresses. + MOVQ DI, R11 + SUBQ R10, R11 + // Continue tail saving. + MOVOU -0x40(CX), X9 + MOVOU -0x30(CX), X10 + // Let's make bytes-to-copy value adjusted as we've prepared unaligned part for copying. + SUBQ R11, BX + // Continue tail saving. + MOVOU -0x20(CX), X11 + MOVOU -0x10(CX), X12 + // The tail will be put on its place after main body copying. + // It's time for the unaligned heading part. + VMOVDQU (SI), Y4 + // Adjust source address to point past head. + ADDQ R11, SI + SUBQ AX, BX + // Aligned memory copying there +gobble_128_loop: + VMOVDQU (SI), Y0 + VMOVDQU 0x20(SI), Y1 + VMOVDQU 0x40(SI), Y2 + VMOVDQU 0x60(SI), Y3 + ADDQ AX, SI + VMOVDQA Y0, (DI) + VMOVDQA Y1, 0x20(DI) + VMOVDQA Y2, 0x40(DI) + VMOVDQA Y3, 0x60(DI) + ADDQ AX, DI + SUBQ AX, BX + JA gobble_128_loop + // Now we can store unaligned parts. + ADDQ AX, BX + ADDQ DI, BX + VMOVDQU Y4, (R10) + VZEROUPPER + MOVOU X5, -0x80(BX) + MOVOU X6, -0x70(BX) + MOVOU X7, -0x60(BX) + MOVOU X8, -0x50(BX) + MOVOU X9, -0x40(BX) + MOVOU X10, -0x30(BX) + MOVOU X11, -0x20(BX) + MOVOU X12, -0x10(BX) + RET + +gobble_big_data_fwd: + // There is forward copying for big regions. + // It uses non-temporal mov instructions. + // Details of this algorithm are commented previously for small sizes. 
+ LEAQ (SI)(BX*1), CX + MOVOU -0x80(SI)(BX*1), X5 + MOVOU -0x70(CX), X6 + MOVOU -0x60(CX), X7 + MOVOU -0x50(CX), X8 + MOVOU -0x40(CX), X9 + MOVOU -0x30(CX), X10 + MOVOU -0x20(CX), X11 + MOVOU -0x10(CX), X12 + VMOVDQU (SI), Y4 + MOVQ DI, R8 + ANDQ $-32, DI + ADDQ $32, DI + MOVQ DI, R10 + SUBQ R8, R10 + SUBQ R10, BX + ADDQ R10, SI + LEAQ (DI)(BX*1), CX + SUBQ $0x80, BX +gobble_mem_fwd_loop: + PREFETCHNTA 0x1C0(SI) + PREFETCHNTA 0x280(SI) + // Prefetch values were chosen empirically. + // Approach for prefetch usage as in 9.5.6 of [1] + // [1] 64-ia-32-architectures-optimization-manual.pdf + // https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf + VMOVDQU (SI), Y0 + VMOVDQU 0x20(SI), Y1 + VMOVDQU 0x40(SI), Y2 + VMOVDQU 0x60(SI), Y3 + ADDQ $0x80, SI + VMOVNTDQ Y0, (DI) + VMOVNTDQ Y1, 0x20(DI) + VMOVNTDQ Y2, 0x40(DI) + VMOVNTDQ Y3, 0x60(DI) + ADDQ $0x80, DI + SUBQ $0x80, BX + JA gobble_mem_fwd_loop + // NT instructions don't follow the normal cache-coherency rules. + // We need SFENCE there to make copied data available timely. + SFENCE + VMOVDQU Y4, (R8) + VZEROUPPER + MOVOU X5, -0x80(CX) + MOVOU X6, -0x70(CX) + MOVOU X7, -0x60(CX) + MOVOU X8, -0x50(CX) + MOVOU X9, -0x40(CX) + MOVOU X10, -0x30(CX) + MOVOU X11, -0x20(CX) + MOVOU X12, -0x10(CX) + RET + +copy_backward: + MOVQ DI, AX + // Backward copying is about the same as the forward one. + // Firstly we load unaligned tail in the beginning of region. + MOVOU (SI), X5 + MOVOU 0x10(SI), X6 + ADDQ BX, DI + MOVOU 0x20(SI), X7 + MOVOU 0x30(SI), X8 + LEAQ -0x20(DI), R10 + MOVQ DI, R11 + MOVOU 0x40(SI), X9 + MOVOU 0x50(SI), X10 + ANDQ $0x1F, R11 + MOVOU 0x60(SI), X11 + MOVOU 0x70(SI), X12 + XORQ R11, DI + // Let's point SI to the end of region + ADDQ BX, SI + // and load unaligned head into X4. 
+ VMOVDQU -0x20(SI), Y4 + SUBQ R11, SI + SUBQ R11, BX + // If there is enough data for non-temporal moves go to special loop + CMPQ BX, $0x100000 + JA gobble_big_data_bwd + SUBQ $0x80, BX +gobble_mem_bwd_loop: + VMOVDQU -0x20(SI), Y0 + VMOVDQU -0x40(SI), Y1 + VMOVDQU -0x60(SI), Y2 + VMOVDQU -0x80(SI), Y3 + SUBQ $0x80, SI + VMOVDQA Y0, -0x20(DI) + VMOVDQA Y1, -0x40(DI) + VMOVDQA Y2, -0x60(DI) + VMOVDQA Y3, -0x80(DI) + SUBQ $0x80, DI + SUBQ $0x80, BX + JA gobble_mem_bwd_loop + // Let's store unaligned data + VMOVDQU Y4, (R10) + VZEROUPPER + MOVOU X5, (AX) + MOVOU X6, 0x10(AX) + MOVOU X7, 0x20(AX) + MOVOU X8, 0x30(AX) + MOVOU X9, 0x40(AX) + MOVOU X10, 0x50(AX) + MOVOU X11, 0x60(AX) + MOVOU X12, 0x70(AX) + RET + +gobble_big_data_bwd: + SUBQ $0x80, BX +gobble_big_mem_bwd_loop: + PREFETCHNTA -0x1C0(SI) + PREFETCHNTA -0x280(SI) + VMOVDQU -0x20(SI), Y0 + VMOVDQU -0x40(SI), Y1 + VMOVDQU -0x60(SI), Y2 + VMOVDQU -0x80(SI), Y3 + SUBQ $0x80, SI + VMOVNTDQ Y0, -0x20(DI) + VMOVNTDQ Y1, -0x40(DI) + VMOVNTDQ Y2, -0x60(DI) + VMOVNTDQ Y3, -0x80(DI) + SUBQ $0x80, DI + SUBQ $0x80, BX + JA gobble_big_mem_bwd_loop + SFENCE + VMOVDQU Y4, (R10) + VZEROUPPER + MOVOU X5, (AX) + MOVOU X6, 0x10(AX) + MOVOU X7, 0x20(AX) + MOVOU X8, 0x30(AX) + MOVOU X9, 0x40(AX) + MOVOU X10, 0x50(AX) + MOVOU X11, 0x60(AX) + MOVOU X12, 0x70(AX) + RET diff --git a/src/runtime/memmove_arm.s b/src/runtime/memmove_arm.s new file mode 100644 index 0000000..43d53fa --- /dev/null +++ b/src/runtime/memmove_arm.s @@ -0,0 +1,264 @@ +// Inferno's libkern/memmove-arm.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memmove-arm.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "textflag.h" + +// TE or TS are spilled to the stack during bulk register moves. +#define TS R0 +#define TE R8 + +// Warning: the linker will use R11 to synthesize certain instructions. Please +// take care and double check with objdump. +#define FROM R11 +#define N R12 +#define TMP R12 /* N and TMP don't overlap */ +#define TMP1 R5 + +#define RSHIFT R5 +#define LSHIFT R6 +#define OFFSET R7 + +#define BR0 R0 /* shared with TS */ +#define BW0 R1 +#define BR1 R1 +#define BW1 R2 +#define BR2 R2 +#define BW2 R3 +#define BR3 R3 +#define BW3 R4 + +#define FW0 R1 +#define FR0 R2 +#define FW1 R2 +#define FR1 R3 +#define FW2 R3 +#define FR2 R4 +#define FW3 R4 +#define FR3 R8 /* shared with TE */ + +// See memmove Go doc for important implementation constraints. 
+ +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT, $4-12 +_memmove: + MOVW to+0(FP), TS + MOVW from+4(FP), FROM + MOVW n+8(FP), N + + ADD N, TS, TE /* to end pointer */ + + CMP FROM, TS + BLS _forward + +_back: + ADD N, FROM /* from end pointer */ + CMP $4, N /* need at least 4 bytes to copy */ + BLT _b1tail + +_b4align: /* align destination on 4 */ + AND.S $3, TE, TMP + BEQ _b4aligned + + MOVBU.W -1(FROM), TMP /* pre-indexed */ + MOVBU.W TMP, -1(TE) /* pre-indexed */ + B _b4align + +_b4aligned: /* is source now aligned? */ + AND.S $3, FROM, TMP + BNE _bunaligned + + ADD $31, TS, TMP /* do 32-byte chunks if possible */ + MOVW TS, savedts-4(SP) +_b32loop: + CMP TMP, TE + BLS _b4tail + + MOVM.DB.W (FROM), [R0-R7] + MOVM.DB.W [R0-R7], (TE) + B _b32loop + +_b4tail: /* do remaining words if possible */ + MOVW savedts-4(SP), TS + ADD $3, TS, TMP +_b4loop: + CMP TMP, TE + BLS _b1tail + + MOVW.W -4(FROM), TMP1 /* pre-indexed */ + MOVW.W TMP1, -4(TE) /* pre-indexed */ + B _b4loop + +_b1tail: /* remaining bytes */ + CMP TE, TS + BEQ _return + + MOVBU.W -1(FROM), TMP /* pre-indexed */ + MOVBU.W TMP, -1(TE) /* pre-indexed */ + B _b1tail + +_forward: + CMP $4, N /* need at least 4 bytes to copy */ + BLT _f1tail + +_f4align: /* align destination on 4 */ + AND.S $3, TS, TMP + BEQ _f4aligned + + MOVBU.P 1(FROM), TMP /* implicit write back */ + MOVBU.P TMP, 1(TS) /* implicit write back */ + B _f4align + +_f4aligned: /* is source now aligned? 
*/ + AND.S $3, FROM, TMP + BNE _funaligned + + SUB $31, TE, TMP /* do 32-byte chunks if possible */ + MOVW TE, savedte-4(SP) +_f32loop: + CMP TMP, TS + BHS _f4tail + + MOVM.IA.W (FROM), [R1-R8] + MOVM.IA.W [R1-R8], (TS) + B _f32loop + +_f4tail: + MOVW savedte-4(SP), TE + SUB $3, TE, TMP /* do remaining words if possible */ +_f4loop: + CMP TMP, TS + BHS _f1tail + + MOVW.P 4(FROM), TMP1 /* implicit write back */ + MOVW.P TMP1, 4(TS) /* implicit write back */ + B _f4loop + +_f1tail: + CMP TS, TE + BEQ _return + + MOVBU.P 1(FROM), TMP /* implicit write back */ + MOVBU.P TMP, 1(TS) /* implicit write back */ + B _f1tail + +_return: + MOVW to+0(FP), R0 + RET + +_bunaligned: + CMP $2, TMP /* is TMP < 2 ? */ + + MOVW.LT $8, RSHIFT /* (R(n)<<24)|(R(n-1)>>8) */ + MOVW.LT $24, LSHIFT + MOVW.LT $1, OFFSET + + MOVW.EQ $16, RSHIFT /* (R(n)<<16)|(R(n-1)>>16) */ + MOVW.EQ $16, LSHIFT + MOVW.EQ $2, OFFSET + + MOVW.GT $24, RSHIFT /* (R(n)<<8)|(R(n-1)>>24) */ + MOVW.GT $8, LSHIFT + MOVW.GT $3, OFFSET + + ADD $16, TS, TMP /* do 16-byte chunks if possible */ + CMP TMP, TE + BLS _b1tail + + BIC $3, FROM /* align source */ + MOVW TS, savedts-4(SP) + MOVW (FROM), BR0 /* prime first block register */ + +_bu16loop: + CMP TMP, TE + BLS _bu1tail + + MOVW BR0<<LSHIFT, BW3 + MOVM.DB.W (FROM), [BR0-BR3] + ORR BR3>>RSHIFT, BW3 + + MOVW BR3<<LSHIFT, BW2 + ORR BR2>>RSHIFT, BW2 + + MOVW BR2<<LSHIFT, BW1 + ORR BR1>>RSHIFT, BW1 + + MOVW BR1<<LSHIFT, BW0 + ORR BR0>>RSHIFT, BW0 + + MOVM.DB.W [BW0-BW3], (TE) + B _bu16loop + +_bu1tail: + MOVW savedts-4(SP), TS + ADD OFFSET, FROM + B _b1tail + +_funaligned: + CMP $2, TMP + + MOVW.LT $8, RSHIFT /* (R(n+1)<<24)|(R(n)>>8) */ + MOVW.LT $24, LSHIFT + MOVW.LT $3, OFFSET + + MOVW.EQ $16, RSHIFT /* (R(n+1)<<16)|(R(n)>>16) */ + MOVW.EQ $16, LSHIFT + MOVW.EQ $2, OFFSET + + MOVW.GT $24, RSHIFT /* (R(n+1)<<8)|(R(n)>>24) */ + MOVW.GT $8, LSHIFT + MOVW.GT $1, OFFSET + + SUB $16, TE, TMP /* do 16-byte chunks if possible */ + CMP TMP, TS + BHS _f1tail + + BIC $3, FROM /* 
align source */ + MOVW TE, savedte-4(SP) + MOVW.P 4(FROM), FR3 /* prime last block register, implicit write back */ + +_fu16loop: + CMP TMP, TS + BHS _fu1tail + + MOVW FR3>>RSHIFT, FW0 + MOVM.IA.W (FROM), [FR0,FR1,FR2,FR3] + ORR FR0<<LSHIFT, FW0 + + MOVW FR0>>RSHIFT, FW1 + ORR FR1<<LSHIFT, FW1 + + MOVW FR1>>RSHIFT, FW2 + ORR FR2<<LSHIFT, FW2 + + MOVW FR2>>RSHIFT, FW3 + ORR FR3<<LSHIFT, FW3 + + MOVM.IA.W [FW0,FW1,FW2,FW3], (TS) + B _fu16loop + +_fu1tail: + MOVW savedte-4(SP), TE + SUB OFFSET, FROM + B _f1tail diff --git a/src/runtime/memmove_arm64.s b/src/runtime/memmove_arm64.s new file mode 100644 index 0000000..8ec3ed8 --- /dev/null +++ b/src/runtime/memmove_arm64.s @@ -0,0 +1,238 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// Register map +// +// dstin R0 +// src R1 +// count R2 +// dst R3 (same as R0, but gets modified in unaligned cases) +// srcend R4 +// dstend R5 +// data R6-R17 +// tmp1 R14 + +// Copies are split into 3 main cases: small copies of up to 32 bytes, medium +// copies of up to 128 bytes, and large copies. The overhead of the overlap +// check is negligible since it is only required for large copies. +// +// Large copies use a software pipelined loop processing 64 bytes per iteration. +// The destination pointer is 16-byte aligned to minimize unaligned accesses. +// The loop tail is handled by always copying 64 bytes from the end. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-24 + CBZ R2, copy0 + + // Small copies: 1..16 bytes + CMP $16, R2 + BLE copy16 + + // Large copies + CMP $128, R2 + BHI copy_long + CMP $32, R2 + BHI copy32_128 + + // Small copies: 17..32 bytes. 
+ LDP (R1), (R6, R7) + ADD R1, R2, R4 // R4 points just past the last source byte + LDP -16(R4), (R12, R13) + STP (R6, R7), (R0) + ADD R0, R2, R5 // R5 points just past the last destination byte + STP (R12, R13), -16(R5) + RET + +// Small copies: 1..16 bytes. +copy16: + ADD R1, R2, R4 // R4 points just past the last source byte + ADD R0, R2, R5 // R5 points just past the last destination byte + CMP $8, R2 + BLT copy7 + MOVD (R1), R6 + MOVD -8(R4), R7 + MOVD R6, (R0) + MOVD R7, -8(R5) + RET + +copy7: + TBZ $2, R2, copy3 + MOVWU (R1), R6 + MOVWU -4(R4), R7 + MOVW R6, (R0) + MOVW R7, -4(R5) + RET + +copy3: + TBZ $1, R2, copy1 + MOVHU (R1), R6 + MOVHU -2(R4), R7 + MOVH R6, (R0) + MOVH R7, -2(R5) + RET + +copy1: + MOVBU (R1), R6 + MOVB R6, (R0) + +copy0: + RET + + // Medium copies: 33..128 bytes. +copy32_128: + ADD R1, R2, R4 // R4 points just past the last source byte + ADD R0, R2, R5 // R5 points just past the last destination byte + LDP (R1), (R6, R7) + LDP 16(R1), (R8, R9) + LDP -32(R4), (R10, R11) + LDP -16(R4), (R12, R13) + CMP $64, R2 + BHI copy128 + STP (R6, R7), (R0) + STP (R8, R9), 16(R0) + STP (R10, R11), -32(R5) + STP (R12, R13), -16(R5) + RET + + // Copy 65..128 bytes. +copy128: + LDP 32(R1), (R14, R15) + LDP 48(R1), (R16, R17) + CMP $96, R2 + BLS copy96 + LDP -64(R4), (R2, R3) + LDP -48(R4), (R1, R4) + STP (R2, R3), -64(R5) + STP (R1, R4), -48(R5) + +copy96: + STP (R6, R7), (R0) + STP (R8, R9), 16(R0) + STP (R14, R15), 32(R0) + STP (R16, R17), 48(R0) + STP (R10, R11), -32(R5) + STP (R12, R13), -16(R5) + RET + + // Copy more than 128 bytes. 
+copy_long: + ADD R1, R2, R4 // R4 points just past the last source byte + ADD R0, R2, R5 // R5 points just past the last destination byte + MOVD ZR, R7 + MOVD ZR, R8 + + CMP $1024, R2 + BLT backward_check + // feature detect to decide how to align + MOVBU runtime·arm64UseAlignedLoads(SB), R6 + CBNZ R6, use_aligned_loads + MOVD R0, R7 + MOVD R5, R8 + B backward_check +use_aligned_loads: + MOVD R1, R7 + MOVD R4, R8 + // R7 and R8 are used here for the realignment calculation. In + // the use_aligned_loads case, R7 is the src pointer and R8 is + // srcend pointer, which is used in the backward copy case. + // When doing aligned stores, R7 is the dst pointer and R8 is + // the dstend pointer. + +backward_check: + // Use backward copy if there is an overlap. + SUB R1, R0, R14 + CBZ R14, copy0 + CMP R2, R14 + BCC copy_long_backward + + // Copy 16 bytes and then align src (R1) or dst (R0) to 16-byte alignment. + LDP (R1), (R12, R13) // Load A + AND $15, R7, R14 // Calculate the realignment offset + SUB R14, R1, R1 + SUB R14, R0, R3 // move dst back same amount as src + ADD R14, R2, R2 + LDP 16(R1), (R6, R7) // Load B + STP (R12, R13), (R0) // Store A + LDP 32(R1), (R8, R9) // Load C + LDP 48(R1), (R10, R11) // Load D + LDP.W 64(R1), (R12, R13) // Load E + // 80 bytes have been loaded; if less than 80+64 bytes remain, copy from the end + SUBS $144, R2, R2 + BLS copy64_from_end + +loop64: + STP (R6, R7), 16(R3) // Store B + LDP 16(R1), (R6, R7) // Load B (next iteration) + STP (R8, R9), 32(R3) // Store C + LDP 32(R1), (R8, R9) // Load C + STP (R10, R11), 48(R3) // Store D + LDP 48(R1), (R10, R11) // Load D + STP.W (R12, R13), 64(R3) // Store E + LDP.W 64(R1), (R12, R13) // Load E + SUBS $64, R2, R2 + BHI loop64 + + // Write the last iteration and copy 64 bytes from the end. 
+copy64_from_end: + LDP -64(R4), (R14, R15) // Load F + STP (R6, R7), 16(R3) // Store B + LDP -48(R4), (R6, R7) // Load G + STP (R8, R9), 32(R3) // Store C + LDP -32(R4), (R8, R9) // Load H + STP (R10, R11), 48(R3) // Store D + LDP -16(R4), (R10, R11) // Load I + STP (R12, R13), 64(R3) // Store E + STP (R14, R15), -64(R5) // Store F + STP (R6, R7), -48(R5) // Store G + STP (R8, R9), -32(R5) // Store H + STP (R10, R11), -16(R5) // Store I + RET + + // Large backward copy for overlapping copies. + // Copy 16 bytes and then align srcend (R4) or dstend (R5) to 16-byte alignment. +copy_long_backward: + LDP -16(R4), (R12, R13) + AND $15, R8, R14 + SUB R14, R4, R4 + SUB R14, R2, R2 + LDP -16(R4), (R6, R7) + STP (R12, R13), -16(R5) + LDP -32(R4), (R8, R9) + LDP -48(R4), (R10, R11) + LDP.W -64(R4), (R12, R13) + SUB R14, R5, R5 + SUBS $128, R2, R2 + BLS copy64_from_start + +loop64_backward: + STP (R6, R7), -16(R5) + LDP -16(R4), (R6, R7) + STP (R8, R9), -32(R5) + LDP -32(R4), (R8, R9) + STP (R10, R11), -48(R5) + LDP -48(R4), (R10, R11) + STP.W (R12, R13), -64(R5) + LDP.W -64(R4), (R12, R13) + SUBS $64, R2, R2 + BHI loop64_backward + + // Write the last iteration and copy 64 bytes from the start. +copy64_from_start: + LDP 48(R1), (R2, R3) + STP (R6, R7), -16(R5) + LDP 32(R1), (R6, R7) + STP (R8, R9), -32(R5) + LDP 16(R1), (R8, R9) + STP (R10, R11), -48(R5) + LDP (R1), (R10, R11) + STP (R12, R13), -64(R5) + STP (R2, R3), 48(R0) + STP (R6, R7), 32(R0) + STP (R8, R9), 16(R0) + STP (R10, R11), (R0) + RET diff --git a/src/runtime/memmove_linux_amd64_test.go b/src/runtime/memmove_linux_amd64_test.go new file mode 100644 index 0000000..5f90062 --- /dev/null +++ b/src/runtime/memmove_linux_amd64_test.go @@ -0,0 +1,56 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "os" + "syscall" + "testing" + "unsafe" +) + +// TestMemmoveOverflow maps 3GB of memory and calls memmove on +// the corresponding slice. +func TestMemmoveOverflow(t *testing.T) { + t.Parallel() + // Create a temporary file. + tmp, err := os.CreateTemp("", "go-memmovetest") + if err != nil { + t.Fatal(err) + } + _, err = tmp.Write(make([]byte, 65536)) + if err != nil { + t.Fatal(err) + } + defer os.Remove(tmp.Name()) + defer tmp.Close() + + // Set up mappings. + base, _, errno := syscall.Syscall6(syscall.SYS_MMAP, + 0xa0<<32, 3<<30, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_PRIVATE|syscall.MAP_ANONYMOUS, ^uintptr(0), 0) + if errno != 0 { + t.Skipf("could not create memory mapping: %s", errno) + } + syscall.Syscall(syscall.SYS_MUNMAP, base, 3<<30, 0) + + for off := uintptr(0); off < 3<<30; off += 65536 { + _, _, errno := syscall.Syscall6(syscall.SYS_MMAP, + base+off, 65536, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED|syscall.MAP_FIXED, tmp.Fd(), 0) + if errno != 0 { + t.Skipf("could not map a page at requested 0x%x: %s", base+off, errno) + } + defer syscall.Syscall(syscall.SYS_MUNMAP, base+off, 65536, 0) + } + + s := unsafe.Slice((*byte)(unsafe.Pointer(base)), 3<<30) + n := copy(s[1:], s) + if n != 3<<30-1 { + t.Fatalf("copied %d bytes, expected %d", n, 3<<30-1) + } + n = copy(s, s[1:]) + if n != 3<<30-1 { + t.Fatalf("copied %d bytes, expected %d", n, 3<<30-1) + } +} diff --git a/src/runtime/memmove_loong64.s b/src/runtime/memmove_loong64.s new file mode 100644 index 0000000..b7b9c56 --- /dev/null +++ b/src/runtime/memmove_loong64.s @@ -0,0 +1,105 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. 
+ +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT|NOFRAME, $0-24 + MOVV to+0(FP), R4 + MOVV from+8(FP), R5 + MOVV n+16(FP), R6 + BNE R6, check + RET + +check: + SGTU R4, R5, R7 + BNE R7, backward + + ADDV R4, R6, R9 // end pointer + + // if the two pointers are not of same alignments, do byte copying + SUBVU R5, R4, R7 + AND $7, R7 + BNE R7, out + + // if less than 8 bytes, do byte copying + SGTU $8, R6, R7 + BNE R7, out + + // do one byte at a time until 8-aligned + AND $7, R4, R8 + BEQ R8, words + MOVB (R5), R7 + ADDV $1, R5 + MOVB R7, (R4) + ADDV $1, R4 + JMP -6(PC) + +words: + // do 8 bytes at a time if there is room + ADDV $-7, R9, R6 // R6 is end pointer-7 + + SGTU R6, R4, R8 + BEQ R8, out + MOVV (R5), R7 + ADDV $8, R5 + MOVV R7, (R4) + ADDV $8, R4 + JMP -6(PC) + +out: + BEQ R4, R9, done + MOVB (R5), R7 + ADDV $1, R5 + MOVB R7, (R4) + ADDV $1, R4 + JMP -5(PC) +done: + RET + +backward: + ADDV R6, R5 // from-end pointer + ADDV R4, R6, R9 // to-end pointer + + // if the two pointers are not of same alignments, do byte copying + SUBVU R9, R5, R7 + AND $7, R7 + BNE R7, out1 + + // if less than 8 bytes, do byte copying + SGTU $8, R6, R7 + BNE R7, out1 + + // do one byte at a time until 8-aligned + AND $7, R9, R8 + BEQ R8, words1 + ADDV $-1, R5 + MOVB (R5), R7 + ADDV $-1, R9 + MOVB R7, (R9) + JMP -6(PC) + +words1: + // do 8 bytes at a time if there is room + ADDV $7, R4, R6 // R6 is start pointer+7 + + SGTU R9, R6, R8 + BEQ R8, out1 + ADDV $-8, R5 + MOVV (R5), R7 + ADDV $-8, R9 + MOVV R7, (R9) + JMP -6(PC) + +out1: + BEQ R4, R9, done1 + ADDV $-1, R5 + MOVB (R5), R7 + ADDV $-1, R9 + MOVB R7, (R9) + JMP -5(PC) +done1: + RET diff --git a/src/runtime/memmove_mips64x.s b/src/runtime/memmove_mips64x.s new file mode 100644 index 0000000..b69178c --- /dev/null +++ b/src/runtime/memmove_mips64x.s @@ -0,0 +1,107 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT|NOFRAME, $0-24 + MOVV to+0(FP), R1 + MOVV from+8(FP), R2 + MOVV n+16(FP), R3 + BNE R3, check + RET + +check: + SGTU R1, R2, R4 + BNE R4, backward + + ADDV R1, R3, R6 // end pointer + + // if the two pointers are not of same alignments, do byte copying + SUBVU R2, R1, R4 + AND $7, R4 + BNE R4, out + + // if less than 8 bytes, do byte copying + SGTU $8, R3, R4 + BNE R4, out + + // do one byte at a time until 8-aligned + AND $7, R1, R5 + BEQ R5, words + MOVB (R2), R4 + ADDV $1, R2 + MOVB R4, (R1) + ADDV $1, R1 + JMP -6(PC) + +words: + // do 8 bytes at a time if there is room + ADDV $-7, R6, R3 // R3 is end pointer-7 + + SGTU R3, R1, R5 + BEQ R5, out + MOVV (R2), R4 + ADDV $8, R2 + MOVV R4, (R1) + ADDV $8, R1 + JMP -6(PC) + +out: + BEQ R1, R6, done + MOVB (R2), R4 + ADDV $1, R2 + MOVB R4, (R1) + ADDV $1, R1 + JMP -5(PC) +done: + RET + +backward: + ADDV R3, R2 // from-end pointer + ADDV R1, R3, R6 // to-end pointer + + // if the two pointers are not of same alignments, do byte copying + SUBVU R6, R2, R4 + AND $7, R4 + BNE R4, out1 + + // if less than 8 bytes, do byte copying + SGTU $8, R3, R4 + BNE R4, out1 + + // do one byte at a time until 8-aligned + AND $7, R6, R5 + BEQ R5, words1 + ADDV $-1, R2 + MOVB (R2), R4 + ADDV $-1, R6 + MOVB R4, (R6) + JMP -6(PC) + +words1: + // do 8 bytes at a time if there is room + ADDV $7, R1, R3 // R3 is start pointer+7 + + SGTU R6, R3, R5 + BEQ R5, out1 + ADDV $-8, R2 + MOVV (R2), R4 + ADDV $-8, R6 + MOVV R4, (R6) + JMP -6(PC) + +out1: + BEQ R1, R6, done1 + ADDV $-1, R2 + MOVB (R2), R4 + ADDV $-1, R6 + MOVB R4, (R6) + JMP -5(PC) +done1: + RET diff --git a/src/runtime/memmove_mipsx.s b/src/runtime/memmove_mipsx.s new file 
mode 100644 index 0000000..494288c --- /dev/null +++ b/src/runtime/memmove_mipsx.s @@ -0,0 +1,260 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +#include "textflag.h" + +#ifdef GOARCH_mips +#define MOVWHI MOVWL +#define MOVWLO MOVWR +#else +#define MOVWHI MOVWR +#define MOVWLO MOVWL +#endif + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB),NOSPLIT,$-0-12 + MOVW n+8(FP), R3 + MOVW from+4(FP), R2 + MOVW to+0(FP), R1 + + ADDU R3, R2, R4 // end pointer for source + ADDU R3, R1, R5 // end pointer for destination + + // if destination is ahead of source, start at the end of the buffer and go backward. + SGTU R1, R2, R6 + BNE R6, backward + + // if less than 4 bytes, use byte by byte copying + SGTU $4, R3, R6 + BNE R6, f_small_copy + + // align destination to 4 bytes + AND $3, R1, R6 + BEQ R6, f_dest_aligned + SUBU R1, R0, R6 + AND $3, R6 + MOVWHI 0(R2), R7 + SUBU R6, R3 + MOVWLO 3(R2), R7 + ADDU R6, R2 + MOVWHI R7, 0(R1) + ADDU R6, R1 + +f_dest_aligned: + AND $31, R3, R7 + AND $3, R3, R6 + SUBU R7, R5, R7 // end pointer for 32-byte chunks + SUBU R6, R5, R6 // end pointer for 4-byte chunks + + // if source is not aligned, use unaligned reads + AND $3, R2, R8 + BNE R8, f_large_ua + +f_large: + BEQ R1, R7, f_words + ADDU $32, R1 + MOVW 0(R2), R8 + MOVW 4(R2), R9 + MOVW 8(R2), R10 + MOVW 12(R2), R11 + MOVW 16(R2), R12 + MOVW 20(R2), R13 + MOVW 24(R2), R14 + MOVW 28(R2), R15 + ADDU $32, R2 + MOVW R8, -32(R1) + MOVW R9, -28(R1) + MOVW R10, -24(R1) + MOVW R11, -20(R1) + MOVW R12, -16(R1) + MOVW R13, -12(R1) + MOVW R14, -8(R1) + MOVW R15, -4(R1) + JMP f_large + +f_words: + BEQ R1, R6, f_tail + ADDU $4, R1 + MOVW 0(R2), R8 + ADDU $4, R2 + MOVW R8, -4(R1) + JMP f_words + +f_tail: + BEQ R1, R5, ret + MOVWLO -1(R4), R8 + MOVWLO R8, -1(R5) 
+ +ret: + RET + +f_large_ua: + BEQ R1, R7, f_words_ua + ADDU $32, R1 + MOVWHI 0(R2), R8 + MOVWHI 4(R2), R9 + MOVWHI 8(R2), R10 + MOVWHI 12(R2), R11 + MOVWHI 16(R2), R12 + MOVWHI 20(R2), R13 + MOVWHI 24(R2), R14 + MOVWHI 28(R2), R15 + MOVWLO 3(R2), R8 + MOVWLO 7(R2), R9 + MOVWLO 11(R2), R10 + MOVWLO 15(R2), R11 + MOVWLO 19(R2), R12 + MOVWLO 23(R2), R13 + MOVWLO 27(R2), R14 + MOVWLO 31(R2), R15 + ADDU $32, R2 + MOVW R8, -32(R1) + MOVW R9, -28(R1) + MOVW R10, -24(R1) + MOVW R11, -20(R1) + MOVW R12, -16(R1) + MOVW R13, -12(R1) + MOVW R14, -8(R1) + MOVW R15, -4(R1) + JMP f_large_ua + +f_words_ua: + BEQ R1, R6, f_tail_ua + MOVWHI 0(R2), R8 + ADDU $4, R1 + MOVWLO 3(R2), R8 + ADDU $4, R2 + MOVW R8, -4(R1) + JMP f_words_ua + +f_tail_ua: + BEQ R1, R5, ret + MOVWHI -4(R4), R8 + MOVWLO -1(R4), R8 + MOVWLO R8, -1(R5) + JMP ret + +f_small_copy: + BEQ R1, R5, ret + ADDU $1, R1 + MOVB 0(R2), R6 + ADDU $1, R2 + MOVB R6, -1(R1) + JMP f_small_copy + +backward: + SGTU $4, R3, R6 + BNE R6, b_small_copy + + AND $3, R5, R6 + BEQ R6, b_dest_aligned + MOVWHI -4(R4), R7 + SUBU R6, R3 + MOVWLO -1(R4), R7 + SUBU R6, R4 + MOVWLO R7, -1(R5) + SUBU R6, R5 + +b_dest_aligned: + AND $31, R3, R7 + AND $3, R3, R6 + ADDU R7, R1, R7 + ADDU R6, R1, R6 + + AND $3, R4, R8 + BNE R8, b_large_ua + +b_large: + BEQ R5, R7, b_words + ADDU $-32, R5 + MOVW -4(R4), R8 + MOVW -8(R4), R9 + MOVW -12(R4), R10 + MOVW -16(R4), R11 + MOVW -20(R4), R12 + MOVW -24(R4), R13 + MOVW -28(R4), R14 + MOVW -32(R4), R15 + ADDU $-32, R4 + MOVW R8, 28(R5) + MOVW R9, 24(R5) + MOVW R10, 20(R5) + MOVW R11, 16(R5) + MOVW R12, 12(R5) + MOVW R13, 8(R5) + MOVW R14, 4(R5) + MOVW R15, 0(R5) + JMP b_large + +b_words: + BEQ R5, R6, b_tail + ADDU $-4, R5 + MOVW -4(R4), R8 + ADDU $-4, R4 + MOVW R8, 0(R5) + JMP b_words + +b_tail: + BEQ R5, R1, ret + MOVWHI 0(R2), R8 // R2 and R1 have the same alignment so we don't need to load a whole word + MOVWHI R8, 0(R1) + JMP ret + +b_large_ua: + BEQ R5, R7, b_words_ua + ADDU $-32, R5 + MOVWHI -4(R4), R8 + 
MOVWHI -8(R4), R9 + MOVWHI -12(R4), R10 + MOVWHI -16(R4), R11 + MOVWHI -20(R4), R12 + MOVWHI -24(R4), R13 + MOVWHI -28(R4), R14 + MOVWHI -32(R4), R15 + MOVWLO -1(R4), R8 + MOVWLO -5(R4), R9 + MOVWLO -9(R4), R10 + MOVWLO -13(R4), R11 + MOVWLO -17(R4), R12 + MOVWLO -21(R4), R13 + MOVWLO -25(R4), R14 + MOVWLO -29(R4), R15 + ADDU $-32, R4 + MOVW R8, 28(R5) + MOVW R9, 24(R5) + MOVW R10, 20(R5) + MOVW R11, 16(R5) + MOVW R12, 12(R5) + MOVW R13, 8(R5) + MOVW R14, 4(R5) + MOVW R15, 0(R5) + JMP b_large_ua + +b_words_ua: + BEQ R5, R6, b_tail_ua + MOVWHI -4(R4), R8 + ADDU $-4, R5 + MOVWLO -1(R4), R8 + ADDU $-4, R4 + MOVW R8, 0(R5) + JMP b_words_ua + +b_tail_ua: + BEQ R5, R1, ret + MOVWHI (R2), R8 + MOVWLO 3(R2), R8 + MOVWHI R8, 0(R1) + JMP ret + +b_small_copy: + BEQ R5, R1, ret + ADDU $-1, R5 + MOVB -1(R4), R6 + ADDU $-1, R4 + MOVB R6, 0(R5) + JMP b_small_copy diff --git a/src/runtime/memmove_plan9_386.s b/src/runtime/memmove_plan9_386.s new file mode 100644 index 0000000..cfce0e9 --- /dev/null +++ b/src/runtime/memmove_plan9_386.s @@ -0,0 +1,137 @@ +// Inferno's libkern/memmove-386.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memmove-386.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT, $0-12 + MOVL to+0(FP), DI + MOVL from+4(FP), SI + MOVL n+8(FP), BX + + // REP instructions have a high startup cost, so we handle small sizes + // with some straightline code. The REP MOVSL instruction is really fast + // for large sizes. The cutover is approximately 1K. 
+tail: + TESTL BX, BX + JEQ move_0 + CMPL BX, $2 + JBE move_1or2 + CMPL BX, $4 + JB move_3 + JE move_4 + CMPL BX, $8 + JBE move_5through8 + CMPL BX, $16 + JBE move_9through16 + +/* + * check and set for backwards + */ + CMPL SI, DI + JLS back + +/* + * forward copy loop + */ +forward: + MOVL BX, CX + SHRL $2, CX + ANDL $3, BX + + REP; MOVSL + JMP tail +/* + * check overlap + */ +back: + MOVL SI, CX + ADDL BX, CX + CMPL CX, DI + JLS forward +/* + * whole thing backwards has + * adjusted addresses + */ + + ADDL BX, DI + ADDL BX, SI + STD + +/* + * copy + */ + MOVL BX, CX + SHRL $2, CX + ANDL $3, BX + + SUBL $4, DI + SUBL $4, SI + REP; MOVSL + + CLD + ADDL $4, DI + ADDL $4, SI + SUBL BX, DI + SUBL BX, SI + JMP tail + +move_1or2: + MOVB (SI), AX + MOVB -1(SI)(BX*1), CX + MOVB AX, (DI) + MOVB CX, -1(DI)(BX*1) + RET +move_0: + RET +move_3: + MOVW (SI), AX + MOVB 2(SI), CX + MOVW AX, (DI) + MOVB CX, 2(DI) + RET +move_4: + // We need a separate case for 4 to make sure we write pointers atomically. + MOVL (SI), AX + MOVL AX, (DI) + RET +move_5through8: + MOVL (SI), AX + MOVL -4(SI)(BX*1), CX + MOVL AX, (DI) + MOVL CX, -4(DI)(BX*1) + RET +move_9through16: + MOVL (SI), AX + MOVL 4(SI), CX + MOVL -8(SI)(BX*1), DX + MOVL -4(SI)(BX*1), BP + MOVL AX, (DI) + MOVL CX, 4(DI) + MOVL DX, -8(DI)(BX*1) + MOVL BP, -4(DI)(BX*1) + RET diff --git a/src/runtime/memmove_plan9_amd64.s b/src/runtime/memmove_plan9_amd64.s new file mode 100644 index 0000000..217aa60 --- /dev/null +++ b/src/runtime/memmove_plan9_amd64.s @@ -0,0 +1,135 @@ +// Derived from Inferno's libkern/memmove-386.s (adapted for amd64) +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/memmove-386.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT, $0-24 + + MOVQ to+0(FP), DI + MOVQ from+8(FP), SI + MOVQ n+16(FP), BX + + // REP instructions have a high startup cost, so we handle small sizes + // with some straightline code. The REP MOVSQ instruction is really fast + // for large sizes. The cutover is approximately 1K. 
+tail: + TESTQ BX, BX + JEQ move_0 + CMPQ BX, $2 + JBE move_1or2 + CMPQ BX, $4 + JBE move_3or4 + CMPQ BX, $8 + JB move_5through7 + JE move_8 + CMPQ BX, $16 + JBE move_9through16 + +/* + * check and set for backwards + */ + CMPQ SI, DI + JLS back + +/* + * forward copy loop + */ +forward: + MOVQ BX, CX + SHRQ $3, CX + ANDQ $7, BX + + REP; MOVSQ + JMP tail + +back: +/* + * check overlap + */ + MOVQ SI, CX + ADDQ BX, CX + CMPQ CX, DI + JLS forward + +/* + * whole thing backwards has + * adjusted addresses + */ + ADDQ BX, DI + ADDQ BX, SI + STD + +/* + * copy + */ + MOVQ BX, CX + SHRQ $3, CX + ANDQ $7, BX + + SUBQ $8, DI + SUBQ $8, SI + REP; MOVSQ + + CLD + ADDQ $8, DI + ADDQ $8, SI + SUBQ BX, DI + SUBQ BX, SI + JMP tail + +move_1or2: + MOVB (SI), AX + MOVB -1(SI)(BX*1), CX + MOVB AX, (DI) + MOVB CX, -1(DI)(BX*1) + RET +move_0: + RET +move_3or4: + MOVW (SI), AX + MOVW -2(SI)(BX*1), CX + MOVW AX, (DI) + MOVW CX, -2(DI)(BX*1) + RET +move_5through7: + MOVL (SI), AX + MOVL -4(SI)(BX*1), CX + MOVL AX, (DI) + MOVL CX, -4(DI)(BX*1) + RET +move_8: + // We need a separate case for 8 to make sure we write pointers atomically. + MOVQ (SI), AX + MOVQ AX, (DI) + RET +move_9through16: + MOVQ (SI), AX + MOVQ -8(SI)(BX*1), CX + MOVQ AX, (DI) + MOVQ CX, -8(DI)(BX*1) + RET diff --git a/src/runtime/memmove_ppc64x.s b/src/runtime/memmove_ppc64x.s new file mode 100644 index 0000000..5fa51c0 --- /dev/null +++ b/src/runtime/memmove_ppc64x.s @@ -0,0 +1,196 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. 
+ +// func memmove(to, from unsafe.Pointer, n uintptr) + +// target address +#define TGT R3 +// source address +#define SRC R4 +// length to move +#define LEN R5 +// number of doublewords +#define DWORDS R6 +// number of bytes < 8 +#define BYTES R7 +// const 16 used as index +#define IDX16 R8 +// temp used for copies, etc. +#define TMP R9 +// number of 64 byte chunks +#define QWORDS R10 +// index values +#define IDX32 R14 +#define IDX48 R15 +#define OCTWORDS R16 + +TEXT runtime·memmove<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-24 + // R3 = TGT = to + // R4 = SRC = from + // R5 = LEN = n + + // Determine if there are doublewords to + // copy so a more efficient move can be done +check: + ANDCC $7, LEN, BYTES // R7: bytes to copy + SRD $3, LEN, DWORDS // R6: double words to copy + MOVFL CR0, CR3 // save CR from ANDCC + CMP DWORDS, $0, CR1 // CR1[EQ] set if no double words to copy + + // Determine overlap by subtracting dest - src and comparing against the + // length. This catches the cases where src and dest are in different types + // of storage such as stack and static to avoid doing backward move when not + // necessary. + + SUB SRC, TGT, TMP // dest - src + CMPU TMP, LEN, CR2 // < len? + BC 12, 8, backward // BLT CR2 backward + + // Copying forward if no overlap. + + BC 12, 6, checkbytes // BEQ CR1, checkbytes + SRDCC $3, DWORDS, OCTWORDS // 64 byte chunks? + MOVD $16, IDX16 + BEQ lt64gt8 // < 64 bytes + + // Prepare for moves of 64 bytes at a time. 
+ +forward64setup: + DCBTST (TGT) // prepare data cache + DCBT (SRC) + MOVD OCTWORDS, CTR // Number of 64 byte chunks + MOVD $32, IDX32 + MOVD $48, IDX48 + PCALIGN $32 + +forward64: + LXVD2X (R0)(SRC), VS32 // load 64 bytes + LXVD2X (IDX16)(SRC), VS33 + LXVD2X (IDX32)(SRC), VS34 + LXVD2X (IDX48)(SRC), VS35 + ADD $64, SRC + STXVD2X VS32, (R0)(TGT) // store 64 bytes + STXVD2X VS33, (IDX16)(TGT) + STXVD2X VS34, (IDX32)(TGT) + STXVD2X VS35, (IDX48)(TGT) + ADD $64,TGT // bump up for next set + BC 16, 0, forward64 // continue + ANDCC $7, DWORDS // remaining doublewords + BEQ checkbytes // only bytes remain + +lt64gt8: + CMP DWORDS, $4 + BLT lt32gt8 + LXVD2X (R0)(SRC), VS32 + LXVD2X (IDX16)(SRC), VS33 + ADD $-4, DWORDS + STXVD2X VS32, (R0)(TGT) + STXVD2X VS33, (IDX16)(TGT) + ADD $32, SRC + ADD $32, TGT + +lt32gt8: + // At this point >= 8 and < 32 + // Move 16 bytes if possible + CMP DWORDS, $2 + BLT lt16 + LXVD2X (R0)(SRC), VS32 + ADD $-2, DWORDS + STXVD2X VS32, (R0)(TGT) + ADD $16, SRC + ADD $16, TGT + +lt16: // Move 8 bytes if possible + CMP DWORDS, $1 + BLT checkbytes + MOVD 0(SRC), TMP + ADD $8, SRC + MOVD TMP, 0(TGT) + ADD $8, TGT +checkbytes: + BC 12, 14, LR // BEQ lr +lt8: // Move word if possible + CMP BYTES, $4 + BLT lt4 + MOVWZ 0(SRC), TMP + ADD $-4, BYTES + MOVW TMP, 0(TGT) + ADD $4, SRC + ADD $4, TGT +lt4: // Move halfword if possible + CMP BYTES, $2 + BLT lt2 + MOVHZ 0(SRC), TMP + ADD $-2, BYTES + MOVH TMP, 0(TGT) + ADD $2, SRC + ADD $2, TGT +lt2: // Move last byte if 1 left + CMP BYTES, $1 + BC 12, 0, LR // blt lr + MOVBZ 0(SRC), TMP + MOVBZ TMP, 0(TGT) + RET + +backward: + // Copying backwards proceeds by copying R7 bytes then copying R6 double words. + // R3 and R4 are advanced to the end of the destination/source buffers + // respectively and moved back as we copy.
+ + ADD LEN, SRC, SRC // end of source + ADD TGT, LEN, TGT // end of dest + + BEQ nobackwardtail // earlier condition + + MOVD BYTES, CTR // bytes to move + +backwardtailloop: + MOVBZ -1(SRC), TMP // point to last byte + SUB $1,SRC + MOVBZ TMP, -1(TGT) + SUB $1,TGT + BDNZ backwardtailloop + +nobackwardtail: + BC 4, 5, LR // blelr cr1, return if DWORDS == 0 + SRDCC $2,DWORDS,QWORDS // Compute number of 32B blocks and compare to 0 + BNE backward32setup // If QWORDS != 0, start the 32B copy loop. + +backward24: + // DWORDS is a value between 1-3. + CMP DWORDS, $2 + + MOVD -8(SRC), TMP + MOVD TMP, -8(TGT) + BC 12, 0, LR // bltlr, return if DWORDS == 1 + + MOVD -16(SRC), TMP + MOVD TMP, -16(TGT) + BC 12, 2, LR // beqlr, return if DWORDS == 2 + + MOVD -24(SRC), TMP + MOVD TMP, -24(TGT) + RET + +backward32setup: + ANDCC $3,DWORDS // Compute remaining DWORDS and compare to 0 + MOVD QWORDS, CTR // set up loop ctr + MOVD $16, IDX16 // 32 bytes at a time + +backward32loop: + SUB $32, TGT + SUB $32, SRC + LXVD2X (R0)(SRC), VS32 // load 16x2 bytes + LXVD2X (IDX16)(SRC), VS33 + STXVD2X VS32, (R0)(TGT) // store 16x2 bytes + STXVD2X VS33, (IDX16)(TGT) + BDNZ backward32loop + BC 12, 2, LR // beqlr, return if DWORDS == 0 + BR backward24 diff --git a/src/runtime/memmove_riscv64.s b/src/runtime/memmove_riscv64.s new file mode 100644 index 0000000..ea622ed --- /dev/null +++ b/src/runtime/memmove_riscv64.s @@ -0,0 +1,318 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// void runtime·memmove(void*, void*, uintptr) +TEXT runtime·memmove<ABIInternal>(SB),NOSPLIT,$-0-24 + // X10 = to + // X11 = from + // X12 = n + BEQ X10, X11, done + BEQZ X12, done + + // If the destination is ahead of the source, start at the end of the + // buffer and go backward. 
+ BGTU X10, X11, backward + + // If less than 8 bytes, do single byte copies. + MOV $8, X9 + BLT X12, X9, f_loop4_check + + // Check alignment - if alignment differs we have to do one byte at a time. + AND $3, X10, X5 + AND $3, X11, X6 + BNE X5, X6, f_loop8_unaligned_check + BEQZ X5, f_loop_check + + // Move one byte at a time until we reach 8 byte alignment. + SUB X5, X12, X12 +f_align: + ADD $-1, X5 + MOVB 0(X11), X14 + MOVB X14, 0(X10) + ADD $1, X10 + ADD $1, X11 + BNEZ X5, f_align + +f_loop_check: + MOV $16, X9 + BLT X12, X9, f_loop8_check + MOV $32, X9 + BLT X12, X9, f_loop16_check + MOV $64, X9 + BLT X12, X9, f_loop32_check +f_loop64: + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV 16(X11), X16 + MOV 24(X11), X17 + MOV 32(X11), X18 + MOV 40(X11), X19 + MOV 48(X11), X20 + MOV 56(X11), X21 + MOV X14, 0(X10) + MOV X15, 8(X10) + MOV X16, 16(X10) + MOV X17, 24(X10) + MOV X18, 32(X10) + MOV X19, 40(X10) + MOV X20, 48(X10) + MOV X21, 56(X10) + ADD $64, X10 + ADD $64, X11 + ADD $-64, X12 + BGE X12, X9, f_loop64 + BEQZ X12, done + +f_loop32_check: + MOV $32, X9 + BLT X12, X9, f_loop16_check +f_loop32: + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV 16(X11), X16 + MOV 24(X11), X17 + MOV X14, 0(X10) + MOV X15, 8(X10) + MOV X16, 16(X10) + MOV X17, 24(X10) + ADD $32, X10 + ADD $32, X11 + ADD $-32, X12 + BGE X12, X9, f_loop32 + BEQZ X12, done + +f_loop16_check: + MOV $16, X9 + BLT X12, X9, f_loop8_check +f_loop16: + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV X14, 0(X10) + MOV X15, 8(X10) + ADD $16, X10 + ADD $16, X11 + ADD $-16, X12 + BGE X12, X9, f_loop16 + BEQZ X12, done + +f_loop8_check: + MOV $8, X9 + BLT X12, X9, f_loop4_check +f_loop8: + MOV 0(X11), X14 + MOV X14, 0(X10) + ADD $8, X10 + ADD $8, X11 + ADD $-8, X12 + BGE X12, X9, f_loop8 + BEQZ X12, done + JMP f_loop4_check + +f_loop8_unaligned_check: + MOV $8, X9 + BLT X12, X9, f_loop4_check +f_loop8_unaligned: + MOVB 0(X11), X14 + MOVB 1(X11), X15 + MOVB 2(X11), X16 + MOVB 3(X11), X17 + MOVB 4(X11), X18 + MOVB 5(X11), X19 + MOVB 
6(X11), X20 + MOVB 7(X11), X21 + MOVB X14, 0(X10) + MOVB X15, 1(X10) + MOVB X16, 2(X10) + MOVB X17, 3(X10) + MOVB X18, 4(X10) + MOVB X19, 5(X10) + MOVB X20, 6(X10) + MOVB X21, 7(X10) + ADD $8, X10 + ADD $8, X11 + ADD $-8, X12 + BGE X12, X9, f_loop8_unaligned + +f_loop4_check: + MOV $4, X9 + BLT X12, X9, f_loop1 +f_loop4: + MOVB 0(X11), X14 + MOVB 1(X11), X15 + MOVB 2(X11), X16 + MOVB 3(X11), X17 + MOVB X14, 0(X10) + MOVB X15, 1(X10) + MOVB X16, 2(X10) + MOVB X17, 3(X10) + ADD $4, X10 + ADD $4, X11 + ADD $-4, X12 + BGE X12, X9, f_loop4 + +f_loop1: + BEQZ X12, done + MOVB 0(X11), X14 + MOVB X14, 0(X10) + ADD $1, X10 + ADD $1, X11 + ADD $-1, X12 + JMP f_loop1 + +backward: + ADD X10, X12, X10 + ADD X11, X12, X11 + + // If less than 8 bytes, do single byte copies. + MOV $8, X9 + BLT X12, X9, b_loop4_check + + // Check alignment - if alignment differs we have to do one byte at a time. + AND $3, X10, X5 + AND $3, X11, X6 + BNE X5, X6, b_loop8_unaligned_check + BEQZ X5, b_loop_check + + // Move one byte at a time until we reach 8 byte alignment. 
+ SUB X5, X12, X12 +b_align: + ADD $-1, X5 + ADD $-1, X10 + ADD $-1, X11 + MOVB 0(X11), X14 + MOVB X14, 0(X10) + BNEZ X5, b_align + +b_loop_check: + MOV $16, X9 + BLT X12, X9, b_loop8_check + MOV $32, X9 + BLT X12, X9, b_loop16_check + MOV $64, X9 + BLT X12, X9, b_loop32_check +b_loop64: + ADD $-64, X10 + ADD $-64, X11 + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV 16(X11), X16 + MOV 24(X11), X17 + MOV 32(X11), X18 + MOV 40(X11), X19 + MOV 48(X11), X20 + MOV 56(X11), X21 + MOV X14, 0(X10) + MOV X15, 8(X10) + MOV X16, 16(X10) + MOV X17, 24(X10) + MOV X18, 32(X10) + MOV X19, 40(X10) + MOV X20, 48(X10) + MOV X21, 56(X10) + ADD $-64, X12 + BGE X12, X9, b_loop64 + BEQZ X12, done + +b_loop32_check: + MOV $32, X9 + BLT X12, X9, b_loop16_check +b_loop32: + ADD $-32, X10 + ADD $-32, X11 + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV 16(X11), X16 + MOV 24(X11), X17 + MOV X14, 0(X10) + MOV X15, 8(X10) + MOV X16, 16(X10) + MOV X17, 24(X10) + ADD $-32, X12 + BGE X12, X9, b_loop32 + BEQZ X12, done + +b_loop16_check: + MOV $16, X9 + BLT X12, X9, b_loop8_check +b_loop16: + ADD $-16, X10 + ADD $-16, X11 + MOV 0(X11), X14 + MOV 8(X11), X15 + MOV X14, 0(X10) + MOV X15, 8(X10) + ADD $-16, X12 + BGE X12, X9, b_loop16 + BEQZ X12, done + +b_loop8_check: + MOV $8, X9 + BLT X12, X9, b_loop4_check +b_loop8: + ADD $-8, X10 + ADD $-8, X11 + MOV 0(X11), X14 + MOV X14, 0(X10) + ADD $-8, X12 + BGE X12, X9, b_loop8 + BEQZ X12, done + JMP b_loop4_check + +b_loop8_unaligned_check: + MOV $8, X9 + BLT X12, X9, b_loop4_check +b_loop8_unaligned: + ADD $-8, X10 + ADD $-8, X11 + MOVB 0(X11), X14 + MOVB 1(X11), X15 + MOVB 2(X11), X16 + MOVB 3(X11), X17 + MOVB 4(X11), X18 + MOVB 5(X11), X19 + MOVB 6(X11), X20 + MOVB 7(X11), X21 + MOVB X14, 0(X10) + MOVB X15, 1(X10) + MOVB X16, 2(X10) + MOVB X17, 3(X10) + MOVB X18, 4(X10) + MOVB X19, 5(X10) + MOVB X20, 6(X10) + MOVB X21, 7(X10) + ADD $-8, X12 + BGE X12, X9, b_loop8_unaligned + +b_loop4_check: + MOV $4, X9 + BLT X12, X9, b_loop1 +b_loop4: + ADD $-4, X10 + ADD $-4, X11 
+ MOVB 0(X11), X14 + MOVB 1(X11), X15 + MOVB 2(X11), X16 + MOVB 3(X11), X17 + MOVB X14, 0(X10) + MOVB X15, 1(X10) + MOVB X16, 2(X10) + MOVB X17, 3(X10) + ADD $-4, X12 + BGE X12, X9, b_loop4 + +b_loop1: + BEQZ X12, done + ADD $-1, X10 + ADD $-1, X11 + MOVB 0(X11), X14 + MOVB X14, 0(X10) + ADD $-1, X12 + JMP b_loop1 + +done: + RET diff --git a/src/runtime/memmove_s390x.s b/src/runtime/memmove_s390x.s new file mode 100644 index 0000000..f4c2b87 --- /dev/null +++ b/src/runtime/memmove_s390x.s @@ -0,0 +1,191 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB),NOSPLIT|NOFRAME,$0-24 + MOVD to+0(FP), R6 + MOVD from+8(FP), R4 + MOVD n+16(FP), R5 + + CMPBEQ R6, R4, done + +start: + CMPBLE R5, $3, move0to3 + CMPBLE R5, $7, move4to7 + CMPBLE R5, $11, move8to11 + CMPBLE R5, $15, move12to15 + CMPBNE R5, $16, movemt16 + MOVD 0(R4), R7 + MOVD 8(R4), R8 + MOVD R7, 0(R6) + MOVD R8, 8(R6) + RET + +movemt16: + CMPBGT R4, R6, forwards + ADD R5, R4, R7 + CMPBLE R7, R6, forwards + ADD R5, R6, R8 +backwards: + MOVD -8(R7), R3 + MOVD R3, -8(R8) + MOVD -16(R7), R3 + MOVD R3, -16(R8) + ADD $-16, R5 + ADD $-16, R7 + ADD $-16, R8 + CMP R5, $16 + BGE backwards + BR start + +forwards: + CMPBGT R5, $64, forwards_fast + MOVD 0(R4), R3 + MOVD R3, 0(R6) + MOVD 8(R4), R3 + MOVD R3, 8(R6) + ADD $16, R4 + ADD $16, R6 + ADD $-16, R5 + CMP R5, $16 + BGE forwards + BR start + +forwards_fast: + CMP R5, $256 + BLE forwards_small + MVC $256, 0(R4), 0(R6) + ADD $256, R4 + ADD $256, R6 + ADD $-256, R5 + BR forwards_fast + +forwards_small: + CMPBEQ R5, $0, done + ADD $-1, R5 + EXRL $memmove_exrl_mvc<>(SB), R5 + RET + +move0to3: + CMPBEQ R5, $0, done +move1: + CMPBNE R5, $1, move2 + MOVB 0(R4), R3 + MOVB R3, 0(R6) + RET +move2: + 
CMPBNE R5, $2, move3 + MOVH 0(R4), R3 + MOVH R3, 0(R6) + RET +move3: + MOVH 0(R4), R3 + MOVB 2(R4), R7 + MOVH R3, 0(R6) + MOVB R7, 2(R6) + RET + +move4to7: + CMPBNE R5, $4, move5 + MOVW 0(R4), R3 + MOVW R3, 0(R6) + RET +move5: + CMPBNE R5, $5, move6 + MOVW 0(R4), R3 + MOVB 4(R4), R7 + MOVW R3, 0(R6) + MOVB R7, 4(R6) + RET +move6: + CMPBNE R5, $6, move7 + MOVW 0(R4), R3 + MOVH 4(R4), R7 + MOVW R3, 0(R6) + MOVH R7, 4(R6) + RET +move7: + MOVW 0(R4), R3 + MOVH 4(R4), R7 + MOVB 6(R4), R8 + MOVW R3, 0(R6) + MOVH R7, 4(R6) + MOVB R8, 6(R6) + RET + +move8to11: + CMPBNE R5, $8, move9 + MOVD 0(R4), R3 + MOVD R3, 0(R6) + RET +move9: + CMPBNE R5, $9, move10 + MOVD 0(R4), R3 + MOVB 8(R4), R7 + MOVD R3, 0(R6) + MOVB R7, 8(R6) + RET +move10: + CMPBNE R5, $10, move11 + MOVD 0(R4), R3 + MOVH 8(R4), R7 + MOVD R3, 0(R6) + MOVH R7, 8(R6) + RET +move11: + MOVD 0(R4), R3 + MOVH 8(R4), R7 + MOVB 10(R4), R8 + MOVD R3, 0(R6) + MOVH R7, 8(R6) + MOVB R8, 10(R6) + RET + +move12to15: + CMPBNE R5, $12, move13 + MOVD 0(R4), R3 + MOVW 8(R4), R7 + MOVD R3, 0(R6) + MOVW R7, 8(R6) + RET +move13: + CMPBNE R5, $13, move14 + MOVD 0(R4), R3 + MOVW 8(R4), R7 + MOVB 12(R4), R8 + MOVD R3, 0(R6) + MOVW R7, 8(R6) + MOVB R8, 12(R6) + RET +move14: + CMPBNE R5, $14, move15 + MOVD 0(R4), R3 + MOVW 8(R4), R7 + MOVH 12(R4), R8 + MOVD R3, 0(R6) + MOVW R7, 8(R6) + MOVH R8, 12(R6) + RET +move15: + MOVD 0(R4), R3 + MOVW 8(R4), R7 + MOVH 12(R4), R8 + MOVB 14(R4), R10 + MOVD R3, 0(R6) + MOVW R7, 8(R6) + MOVH R8, 12(R6) + MOVB R10, 14(R6) +done: + RET + +// DO NOT CALL - target for exrl (execute relative long) instruction. +TEXT memmove_exrl_mvc<>(SB),NOSPLIT|NOFRAME,$0-0 + MVC $1, 0(R4), 0(R6) + MOVD R0, 0(R0) + RET + diff --git a/src/runtime/memmove_test.go b/src/runtime/memmove_test.go new file mode 100644 index 0000000..f1247f6 --- /dev/null +++ b/src/runtime/memmove_test.go @@ -0,0 +1,876 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "crypto/rand" + "encoding/binary" + "fmt" + "internal/race" + "internal/testenv" + . "runtime" + "sync/atomic" + "testing" + "unsafe" +) + +func TestMemmove(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + t.Parallel() + size := 256 + if testing.Short() { + size = 128 + 16 + } + src := make([]byte, size) + dst := make([]byte, size) + for i := 0; i < size; i++ { + src[i] = byte(128 + (i & 127)) + } + for i := 0; i < size; i++ { + dst[i] = byte(i & 127) + } + for n := 0; n <= size; n++ { + for x := 0; x <= size-n; x++ { // offset in src + for y := 0; y <= size-n; y++ { // offset in dst + copy(dst[y:y+n], src[x:x+n]) + for i := 0; i < y; i++ { + if dst[i] != byte(i&127) { + t.Fatalf("prefix dst[%d] = %d", i, dst[i]) + } + } + for i := y; i < y+n; i++ { + if dst[i] != byte(128+((i-y+x)&127)) { + t.Fatalf("copied dst[%d] = %d", i, dst[i]) + } + dst[i] = byte(i & 127) // reset dst + } + for i := y + n; i < size; i++ { + if dst[i] != byte(i&127) { + t.Fatalf("suffix dst[%d] = %d", i, dst[i]) + } + } + } + } + } +} + +func TestMemmoveAlias(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + t.Parallel() + size := 256 + if testing.Short() { + size = 128 + 16 + } + buf := make([]byte, size) + for i := 0; i < size; i++ { + buf[i] = byte(i) + } + for n := 0; n <= size; n++ { + for x := 0; x <= size-n; x++ { // src offset + for y := 0; y <= size-n; y++ { // dst offset + copy(buf[y:y+n], buf[x:x+n]) + for i := 0; i < y; i++ { + if buf[i] != byte(i) { + t.Fatalf("prefix buf[%d] = %d", i, buf[i]) + } + } + for i := y; i < y+n; i++ { + if buf[i] != byte(i-y+x) { + t.Fatalf("copied buf[%d] = %d", i, buf[i]) + } + buf[i] = byte(i) // reset buf + } + for i := y + n; i < size; i++ { + if buf[i] != byte(i) { + t.Fatalf("suffix buf[%d] = %d", i, buf[i]) + } + } + } + } + } +} + +func TestMemmoveLarge0x180000(t *testing.T) { + if 
testing.Short() && testenv.Builder() == "" { + t.Skip("-short") + } + + t.Parallel() + if race.Enabled { + t.Skip("skipping large memmove test under race detector") + } + testSize(t, 0x180000) +} + +func TestMemmoveOverlapLarge0x120000(t *testing.T) { + if testing.Short() && testenv.Builder() == "" { + t.Skip("-short") + } + + t.Parallel() + if race.Enabled { + t.Skip("skipping large memmove test under race detector") + } + testOverlap(t, 0x120000) +} + +func testSize(t *testing.T, size int) { + src := make([]byte, size) + dst := make([]byte, size) + _, _ = rand.Read(src) + _, _ = rand.Read(dst) + + ref := make([]byte, size) + copyref(ref, dst) + + for n := size - 50; n > 1; n >>= 1 { + for x := 0; x <= size-n; x = x*7 + 1 { // offset in src + for y := 0; y <= size-n; y = y*9 + 1 { // offset in dst + copy(dst[y:y+n], src[x:x+n]) + copyref(ref[y:y+n], src[x:x+n]) + p := cmpb(dst, ref) + if p >= 0 { + t.Fatalf("Copy failed, copying from src[%d:%d] to dst[%d:%d].\nOffset %d is different, %v != %v", x, x+n, y, y+n, p, dst[p], ref[p]) + } + } + } + } +} + +func testOverlap(t *testing.T, size int) { + src := make([]byte, size) + test := make([]byte, size) + ref := make([]byte, size) + _, _ = rand.Read(src) + + for n := size - 50; n > 1; n >>= 1 { + for x := 0; x <= size-n; x = x*7 + 1 { // offset in src + for y := 0; y <= size-n; y = y*9 + 1 { // offset in dst + // Reset input + copyref(test, src) + copyref(ref, src) + copy(test[y:y+n], test[x:x+n]) + if y <= x { + copyref(ref[y:y+n], ref[x:x+n]) + } else { + copybw(ref[y:y+n], ref[x:x+n]) + } + p := cmpb(test, ref) + if p >= 0 { + t.Fatalf("Copy failed, copying from src[%d:%d] to dst[%d:%d].\nOffset %d is different, %v != %v", x, x+n, y, y+n, p, test[p], ref[p]) + } + } + } + } + +} + +// Forward copy. 
+func copyref(dst, src []byte) { + for i, v := range src { + dst[i] = v + } +} + +// Backwards copy +func copybw(dst, src []byte) { + if len(src) == 0 { + return + } + for i := len(src) - 1; i >= 0; i-- { + dst[i] = src[i] + } +} + +// Returns offset of difference +func matchLen(a, b []byte, max int) int { + a = a[:max] + b = b[:max] + for i, av := range a { + if b[i] != av { + return i + } + } + return max +} + +func cmpb(a, b []byte) int { + l := matchLen(a, b, len(a)) + if l == len(a) { + return -1 + } + return l +} + +// Ensure that memmove writes pointers atomically, so the GC won't +// observe a partially updated pointer. +func TestMemmoveAtomicity(t *testing.T) { + if race.Enabled { + t.Skip("skip under the race detector -- this test is intentionally racy") + } + + var x int + + for _, backward := range []bool{true, false} { + for _, n := range []int{3, 4, 5, 6, 7, 8, 9, 10, 15, 25, 49} { + n := n + + // test copying [N]*int. + sz := uintptr(n * PtrSize) + name := fmt.Sprint(sz) + if backward { + name += "-backward" + } else { + name += "-forward" + } + t.Run(name, func(t *testing.T) { + // Use overlapping src and dst to force forward/backward copy. 
+ var s [100]*int + src := s[n-1 : 2*n-1] + dst := s[:n] + if backward { + src, dst = dst, src + } + for i := range src { + src[i] = &x + } + for i := range dst { + dst[i] = nil + } + + var ready atomic.Uint32 + go func() { + sp := unsafe.Pointer(&src[0]) + dp := unsafe.Pointer(&dst[0]) + ready.Store(1) + for i := 0; i < 10000; i++ { + Memmove(dp, sp, sz) + MemclrNoHeapPointers(dp, sz) + } + ready.Store(2) + }() + + for ready.Load() == 0 { + Gosched() + } + + for ready.Load() != 2 { + for i := range dst { + p := dst[i] + if p != nil && p != &x { + t.Fatalf("got partially updated pointer %p at dst[%d], want either nil or %p", p, i, &x) + } + } + } + }) + } + } +} + +func benchmarkSizes(b *testing.B, sizes []int, fn func(b *testing.B, n int)) { + for _, n := range sizes { + b.Run(fmt.Sprint(n), func(b *testing.B) { + b.SetBytes(int64(n)) + fn(b, n) + }) + } +} + +var bufSizes = []int{ + 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, + 32, 64, 128, 256, 512, 1024, 2048, 4096, +} +var bufSizesOverlap = []int{ + 32, 64, 128, 256, 512, 1024, 2048, 4096, +} + +func BenchmarkMemmove(b *testing.B) { + benchmarkSizes(b, bufSizes, func(b *testing.B, n int) { + x := make([]byte, n) + y := make([]byte, n) + for i := 0; i < b.N; i++ { + copy(x, y) + } + }) +} + +func BenchmarkMemmoveOverlap(b *testing.B) { + benchmarkSizes(b, bufSizesOverlap, func(b *testing.B, n int) { + x := make([]byte, n+16) + for i := 0; i < b.N; i++ { + copy(x[16:n+16], x[:n]) + } + }) +} + +func BenchmarkMemmoveUnalignedDst(b *testing.B) { + benchmarkSizes(b, bufSizes, func(b *testing.B, n int) { + x := make([]byte, n+1) + y := make([]byte, n) + for i := 0; i < b.N; i++ { + copy(x[1:], y) + } + }) +} + +func BenchmarkMemmoveUnalignedDstOverlap(b *testing.B) { + benchmarkSizes(b, bufSizesOverlap, func(b *testing.B, n int) { + x := make([]byte, n+16) + for i := 0; i < b.N; i++ { + copy(x[16:n+16], x[1:n+1]) + } + }) +} + +func BenchmarkMemmoveUnalignedSrc(b *testing.B) { + benchmarkSizes(b, 
bufSizes, func(b *testing.B, n int) { + x := make([]byte, n) + y := make([]byte, n+1) + for i := 0; i < b.N; i++ { + copy(x, y[1:]) + } + }) +} + +func BenchmarkMemmoveUnalignedSrcOverlap(b *testing.B) { + benchmarkSizes(b, bufSizesOverlap, func(b *testing.B, n int) { + x := make([]byte, n+1) + for i := 0; i < b.N; i++ { + copy(x[1:n+1], x[:n]) + } + }) +} + +func TestMemclr(t *testing.T) { + size := 512 + if testing.Short() { + size = 128 + 16 + } + mem := make([]byte, size) + for i := 0; i < size; i++ { + mem[i] = 0xee + } + for n := 0; n < size; n++ { + for x := 0; x <= size-n; x++ { // offset in mem + MemclrBytes(mem[x : x+n]) + for i := 0; i < x; i++ { + if mem[i] != 0xee { + t.Fatalf("overwrite prefix mem[%d] = %d", i, mem[i]) + } + } + for i := x; i < x+n; i++ { + if mem[i] != 0 { + t.Fatalf("failed clear mem[%d] = %d", i, mem[i]) + } + mem[i] = 0xee + } + for i := x + n; i < size; i++ { + if mem[i] != 0xee { + t.Fatalf("overwrite suffix mem[%d] = %d", i, mem[i]) + } + } + } + } +} + +func BenchmarkMemclr(b *testing.B) { + for _, n := range []int{5, 16, 64, 256, 4096, 65536} { + x := make([]byte, n) + b.Run(fmt.Sprint(n), func(b *testing.B) { + b.SetBytes(int64(n)) + for i := 0; i < b.N; i++ { + MemclrBytes(x) + } + }) + } + for _, m := range []int{1, 4, 8, 16, 64} { + x := make([]byte, m<<20) + b.Run(fmt.Sprint(m, "M"), func(b *testing.B) { + b.SetBytes(int64(m << 20)) + for i := 0; i < b.N; i++ { + MemclrBytes(x) + } + }) + } +} + +func BenchmarkGoMemclr(b *testing.B) { + benchmarkSizes(b, []int{5, 16, 64, 256}, func(b *testing.B, n int) { + x := make([]byte, n) + for i := 0; i < b.N; i++ { + for j := range x { + x[j] = 0 + } + } + }) +} + +func BenchmarkMemclrRange(b *testing.B) { + type RunData struct { + data []int + } + + benchSizes := []RunData{ + {[]int{1043, 1078, 1894, 1582, 1044, 1165, 1467, 1100, 1919, 1562, 1932, 1645, + 1412, 1038, 1576, 1200, 1029, 1336, 1095, 1494, 1350, 1025, 1502, 1548, 1316, 1296, + 1868, 1639, 1546, 1626, 1642, 1308, 
1726, 1665, 1678, 1187, 1515, 1598, 1353, 1237, + 1977, 1452, 2012, 1914, 1514, 1136, 1975, 1618, 1536, 1695, 1600, 1733, 1392, 1099, + 1358, 1996, 1224, 1783, 1197, 1838, 1460, 1556, 1554, 2020}}, // 1kb-2kb + {[]int{3964, 5139, 6573, 7775, 6553, 2413, 3466, 5394, 2469, 7336, 7091, 6745, + 4028, 5643, 6164, 3475, 4138, 6908, 7559, 3335, 5660, 4122, 3945, 2082, 7564, 6584, + 5111, 2288, 6789, 2797, 4928, 7986, 5163, 5447, 2999, 4968, 3174, 3202, 7908, 8137, + 4735, 6161, 4646, 7592, 3083, 5329, 3687, 2754, 3599, 7231, 6455, 2549, 8063, 2189, + 7121, 5048, 4277, 6626, 6306, 2815, 7473, 3963, 7549, 7255}}, // 2kb-8kb + {[]int{16304, 15936, 15760, 4736, 9136, 11184, 10160, 5952, 14560, 15744, + 6624, 5872, 13088, 14656, 14192, 10304, 4112, 10384, 9344, 4496, 11392, 7024, + 5200, 10064, 14784, 5808, 13504, 10480, 8512, 4896, 13264, 5600}}, // 4kb-16kb + {[]int{164576, 233136, 220224, 183280, 214112, 217248, 228560, 201728}}, // 128kb-256kb + } + + for _, t := range benchSizes { + total := 0 + minLen := 0 + maxLen := 0 + + for _, clrLen := range t.data { + if clrLen > maxLen { + maxLen = clrLen + } + if clrLen < minLen || minLen == 0 { + minLen = clrLen + } + total += clrLen + } + buffer := make([]byte, maxLen) + + text := "" + if minLen >= (1 << 20) { + text = fmt.Sprint(minLen>>20, "M ", (maxLen+(1<<20-1))>>20, "M") + } else if minLen >= (1 << 10) { + text = fmt.Sprint(minLen>>10, "K ", (maxLen+(1<<10-1))>>10, "K") + } else { + text = fmt.Sprint(minLen, " ", maxLen) + } + b.Run(text, func(b *testing.B) { + b.SetBytes(int64(total)) + for i := 0; i < b.N; i++ { + for _, clrLen := range t.data { + MemclrBytes(buffer[:clrLen]) + } + } + }) + } +} + +func BenchmarkClearFat7(b *testing.B) { + p := new([7]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [7]byte{} + } +} + +func BenchmarkClearFat8(b *testing.B) { + p := new([8 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [8 / 4]uint32{} + } +} + +func BenchmarkClearFat11(b 
*testing.B) { + p := new([11]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [11]byte{} + } +} + +func BenchmarkClearFat12(b *testing.B) { + p := new([12 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [12 / 4]uint32{} + } +} + +func BenchmarkClearFat13(b *testing.B) { + p := new([13]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [13]byte{} + } +} + +func BenchmarkClearFat14(b *testing.B) { + p := new([14]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [14]byte{} + } +} + +func BenchmarkClearFat15(b *testing.B) { + p := new([15]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [15]byte{} + } +} + +func BenchmarkClearFat16(b *testing.B) { + p := new([16 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [16 / 4]uint32{} + } +} + +func BenchmarkClearFat24(b *testing.B) { + p := new([24 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [24 / 4]uint32{} + } +} + +func BenchmarkClearFat32(b *testing.B) { + p := new([32 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [32 / 4]uint32{} + } +} + +func BenchmarkClearFat40(b *testing.B) { + p := new([40 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [40 / 4]uint32{} + } +} + +func BenchmarkClearFat48(b *testing.B) { + p := new([48 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [48 / 4]uint32{} + } +} + +func BenchmarkClearFat56(b *testing.B) { + p := new([56 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [56 / 4]uint32{} + } +} + +func BenchmarkClearFat64(b *testing.B) { + p := new([64 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [64 / 4]uint32{} + } +} + +func BenchmarkClearFat72(b *testing.B) { + p := new([72 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [72 / 4]uint32{} + } 
+} + +func BenchmarkClearFat128(b *testing.B) { + p := new([128 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [128 / 4]uint32{} + } +} + +func BenchmarkClearFat256(b *testing.B) { + p := new([256 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [256 / 4]uint32{} + } +} + +func BenchmarkClearFat512(b *testing.B) { + p := new([512 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [512 / 4]uint32{} + } +} + +func BenchmarkClearFat1024(b *testing.B) { + p := new([1024 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [1024 / 4]uint32{} + } +} + +func BenchmarkClearFat1032(b *testing.B) { + p := new([1032 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [1032 / 4]uint32{} + } +} + +func BenchmarkClearFat1040(b *testing.B) { + p := new([1040 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = [1040 / 4]uint32{} + } +} + +func BenchmarkCopyFat7(b *testing.B) { + var x [7]byte + p := new([7]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat8(b *testing.B) { + var x [8 / 4]uint32 + p := new([8 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat11(b *testing.B) { + var x [11]byte + p := new([11]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat12(b *testing.B) { + var x [12 / 4]uint32 + p := new([12 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat13(b *testing.B) { + var x [13]byte + p := new([13]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat14(b *testing.B) { + var x [14]byte + p := new([14]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat15(b *testing.B) { + var x [15]byte + 
p := new([15]byte) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat16(b *testing.B) { + var x [16 / 4]uint32 + p := new([16 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat24(b *testing.B) { + var x [24 / 4]uint32 + p := new([24 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat32(b *testing.B) { + var x [32 / 4]uint32 + p := new([32 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat64(b *testing.B) { + var x [64 / 4]uint32 + p := new([64 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat72(b *testing.B) { + var x [72 / 4]uint32 + p := new([72 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat128(b *testing.B) { + var x [128 / 4]uint32 + p := new([128 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat256(b *testing.B) { + var x [256 / 4]uint32 + p := new([256 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat512(b *testing.B) { + var x [512 / 4]uint32 + p := new([512 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat520(b *testing.B) { + var x [520 / 4]uint32 + p := new([520 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat1024(b *testing.B) { + var x [1024 / 4]uint32 + p := new([1024 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat1032(b *testing.B) { + var x [1032 / 4]uint32 + p := new([1032 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +func BenchmarkCopyFat1040(b *testing.B) { 
+ var x [1040 / 4]uint32 + p := new([1040 / 4]uint32) + Escape(p) + b.ResetTimer() + for i := 0; i < b.N; i++ { + *p = x + } +} + +// BenchmarkIssue18740 ensures that memmove uses 4 and 8 byte load/store to move 4 and 8 bytes. +// It used to do 2 2-byte load/stores, which leads to a pipeline stall +// when we try to read the result with one 4-byte load. +func BenchmarkIssue18740(b *testing.B) { + benchmarks := []struct { + name string + nbyte int + f func([]byte) uint64 + }{ + {"2byte", 2, func(buf []byte) uint64 { return uint64(binary.LittleEndian.Uint16(buf)) }}, + {"4byte", 4, func(buf []byte) uint64 { return uint64(binary.LittleEndian.Uint32(buf)) }}, + {"8byte", 8, func(buf []byte) uint64 { return binary.LittleEndian.Uint64(buf) }}, + } + + var g [4096]byte + for _, bm := range benchmarks { + buf := make([]byte, bm.nbyte) + b.Run(bm.name, func(b *testing.B) { + for j := 0; j < b.N; j++ { + for i := 0; i < 4096; i += bm.nbyte { + copy(buf[:], g[i:]) + sink += bm.f(buf[:]) + } + } + }) + } +} diff --git a/src/runtime/memmove_wasm.s b/src/runtime/memmove_wasm.s new file mode 100644 index 0000000..1be8487 --- /dev/null +++ b/src/runtime/memmove_wasm.s @@ -0,0 +1,22 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// See memmove Go doc for important implementation constraints. + +// func memmove(to, from unsafe.Pointer, n uintptr) +TEXT runtime·memmove(SB), NOSPLIT, $0-24 + MOVD to+0(FP), R0 + MOVD from+8(FP), R1 + MOVD n+16(FP), R2 + + Get R0 + I32WrapI64 + Get R1 + I32WrapI64 + Get R2 + I32WrapI64 + MemoryCopy + RET diff --git a/src/runtime/metrics.go b/src/runtime/metrics.go new file mode 100644 index 0000000..2061dc0 --- /dev/null +++ b/src/runtime/metrics.go @@ -0,0 +1,723 @@ +// Copyright 2020 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// Metrics implementation exported to runtime/metrics. + +import ( + "unsafe" +) + +var ( + // metrics is a map of runtime/metrics keys to data used by the runtime + // to sample each metric's value. metricsInit indicates it has been + // initialized. + // + // These fields are protected by metricsSema which should be + // locked/unlocked with metricsLock() / metricsUnlock(). + metricsSema uint32 = 1 + metricsInit bool + metrics map[string]metricData + + sizeClassBuckets []float64 + timeHistBuckets []float64 +) + +type metricData struct { + // deps is the set of runtime statistics that this metric + // depends on. Before compute is called, the statAggregate + // which will be passed must ensure() these dependencies. + deps statDepSet + + // compute is a function that populates a metricValue + // given a populated statAggregate structure. + compute func(in *statAggregate, out *metricValue) +} + +func metricsLock() { + // Acquire the metricsSema but with handoff. Operations are typically + // expensive enough that queueing up goroutines and handing off between + // them will be noticeably better-behaved. + semacquire1(&metricsSema, true, 0, 0, waitReasonSemacquire) + if raceenabled { + raceacquire(unsafe.Pointer(&metricsSema)) + } +} + +func metricsUnlock() { + if raceenabled { + racerelease(unsafe.Pointer(&metricsSema)) + } + semrelease(&metricsSema) +} + +// initMetrics initializes the metrics map if it hasn't been yet. +// +// metricsSema must be held. +func initMetrics() { + if metricsInit { + return + } + + sizeClassBuckets = make([]float64, _NumSizeClasses, _NumSizeClasses+1) + // Skip size class 0 which is a stand-in for large objects, but large + // objects are tracked separately (and they actually get placed in + // the last bucket, not the first). + sizeClassBuckets[0] = 1 // The smallest allocation is 1 byte in size. 
+ for i := 1; i < _NumSizeClasses; i++ { + // Size classes have an inclusive upper-bound + // and exclusive lower bound (e.g. 48-byte size class is + // (32, 48]) whereas we want an inclusive lower-bound + // and exclusive upper-bound (e.g. 48-byte size class is + // [33, 49)). We can achieve this by shifting all bucket + // boundaries up by 1. + // + // Also, a float64 can precisely represent integers with + // value up to 2^53 and size classes are relatively small + // (nowhere near 2^48 even) so this will give us exact + // boundaries. + sizeClassBuckets[i] = float64(class_to_size[i] + 1) + } + sizeClassBuckets = append(sizeClassBuckets, float64Inf()) + + timeHistBuckets = timeHistogramMetricsBuckets() + metrics = map[string]metricData{ + "/cgo/go-to-c-calls:calls": { + compute: func(_ *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(NumCgoCall()) + }, + }, + "/cpu/classes/gc/mark/assist:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.gcAssistTime)) + }, + }, + "/cpu/classes/gc/mark/dedicated:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.gcDedicatedTime)) + }, + }, + "/cpu/classes/gc/mark/idle:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.gcIdleTime)) + }, + }, + "/cpu/classes/gc/pause:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.gcPauseTime)) + }, + }, + "/cpu/classes/gc/total:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out
*metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.gcTotalTime)) + }, + }, + "/cpu/classes/idle:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.idleTime)) + }, + }, + "/cpu/classes/scavenge/assist:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.scavengeAssistTime)) + }, + }, + "/cpu/classes/scavenge/background:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.scavengeBgTime)) + }, + }, + "/cpu/classes/scavenge/total:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.scavengeTotalTime)) + }, + }, + "/cpu/classes/total:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.totalTime)) + }, + }, + "/cpu/classes/user:cpu-seconds": { + deps: makeStatDepSet(cpuStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(in.cpuStats.userTime)) + }, + }, + "/gc/cycles/automatic:gc-cycles": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.gcCyclesDone - in.sysStats.gcCyclesForced + }, + }, + "/gc/cycles/forced:gc-cycles": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.gcCyclesForced + }, + }, + 
"/gc/cycles/total:gc-cycles": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.gcCyclesDone + }, + }, + "/gc/heap/allocs-by-size:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + hist := out.float64HistOrInit(sizeClassBuckets) + hist.counts[len(hist.counts)-1] = uint64(in.heapStats.largeAllocCount) + // Cut off the first index which is ostensibly for size class 0, + // but large objects are tracked separately so it's actually unused. + for i, count := range in.heapStats.smallAllocCount[1:] { + hist.counts[i] = uint64(count) + } + }, + }, + "/gc/heap/allocs:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.totalAllocated + }, + }, + "/gc/heap/allocs:objects": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.totalAllocs + }, + }, + "/gc/heap/frees-by-size:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + hist := out.float64HistOrInit(sizeClassBuckets) + hist.counts[len(hist.counts)-1] = uint64(in.heapStats.largeFreeCount) + // Cut off the first index which is ostensibly for size class 0, + // but large objects are tracked separately so it's actually unused. 
+ for i, count := range in.heapStats.smallFreeCount[1:] { + hist.counts[i] = uint64(count) + } + }, + }, + "/gc/heap/frees:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.totalFreed + }, + }, + "/gc/heap/frees:objects": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.totalFrees + }, + }, + "/gc/heap/goal:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.heapGoal + }, + }, + "/gc/heap/objects:objects": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.numObjects + }, + }, + "/gc/heap/tiny/allocs:objects": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.tinyAllocCount) + }, + }, + "/gc/limiter/last-enabled:gc-cycle": { + compute: func(_ *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(gcCPULimiter.lastEnabledCycle.Load()) + }, + }, + "/gc/pauses:seconds": { + compute: func(_ *statAggregate, out *metricValue) { + hist := out.float64HistOrInit(timeHistBuckets) + // The bottom-most bucket, containing negative values, is tracked + // separately as underflow, so fill that in manually and then + // iterate over the rest.
+ hist.counts[0] = memstats.gcPauseDist.underflow.Load() + for i := range memstats.gcPauseDist.counts { + hist.counts[i+1] = memstats.gcPauseDist.counts[i].Load() + } + hist.counts[len(hist.counts)-1] = memstats.gcPauseDist.overflow.Load() + }, + }, + "/gc/stack/starting-size:bytes": { + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(startingStackSize) + }, + }, + "/memory/classes/heap/free:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.committed - in.heapStats.inHeap - + in.heapStats.inStacks - in.heapStats.inWorkBufs - + in.heapStats.inPtrScalarBits) + }, + }, + "/memory/classes/heap/objects:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.heapStats.inObjects + }, + }, + "/memory/classes/heap/released:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.released) + }, + }, + "/memory/classes/heap/stacks:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.inStacks) + }, + }, + "/memory/classes/heap/unused:bytes": { + deps: makeStatDepSet(heapStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.inHeap) - in.heapStats.inObjects + }, + }, + "/memory/classes/metadata/mcache/free:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.mCacheSys - in.sysStats.mCacheInUse + }, + }, + "/memory/classes/metadata/mcache/inuse:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in 
*statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.mCacheInUse + }, + }, + "/memory/classes/metadata/mspan/free:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.mSpanSys - in.sysStats.mSpanInUse + }, + }, + "/memory/classes/metadata/mspan/inuse:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.mSpanInUse + }, + }, + "/memory/classes/metadata/other:bytes": { + deps: makeStatDepSet(heapStatsDep, sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.inWorkBufs+in.heapStats.inPtrScalarBits) + in.sysStats.gcMiscSys + }, + }, + "/memory/classes/os-stacks:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.stacksSys + }, + }, + "/memory/classes/other:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.otherSys + }, + }, + "/memory/classes/profiling/buckets:bytes": { + deps: makeStatDepSet(sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = in.sysStats.buckHashSys + }, + }, + "/memory/classes/total:bytes": { + deps: makeStatDepSet(heapStatsDep, sysStatsDep), + compute: func(in *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(in.heapStats.committed+in.heapStats.released) + + in.sysStats.stacksSys + in.sysStats.mSpanSys + + in.sysStats.mCacheSys + in.sysStats.buckHashSys + + in.sysStats.gcMiscSys + in.sysStats.otherSys + }, + }, + "/sched/gomaxprocs:threads": { + compute: func(_ *statAggregate, out *metricValue) { + out.kind = 
metricKindUint64 + out.scalar = uint64(gomaxprocs) + }, + }, + "/sched/goroutines:goroutines": { + compute: func(_ *statAggregate, out *metricValue) { + out.kind = metricKindUint64 + out.scalar = uint64(gcount()) + }, + }, + "/sched/latencies:seconds": { + compute: func(_ *statAggregate, out *metricValue) { + hist := out.float64HistOrInit(timeHistBuckets) + hist.counts[0] = sched.timeToRun.underflow.Load() + for i := range sched.timeToRun.counts { + hist.counts[i+1] = sched.timeToRun.counts[i].Load() + } + hist.counts[len(hist.counts)-1] = sched.timeToRun.overflow.Load() + }, + }, + "/sync/mutex/wait/total:seconds": { + compute: func(_ *statAggregate, out *metricValue) { + out.kind = metricKindFloat64 + out.scalar = float64bits(nsToSec(sched.totalMutexWaitTime.Load())) + }, + }, + } + metricsInit = true +} + +// statDep is a dependency on a group of statistics +// that a metric might have. +type statDep uint + +const ( + heapStatsDep statDep = iota // corresponds to heapStatsAggregate + sysStatsDep // corresponds to sysStatsAggregate + cpuStatsDep // corresponds to cpuStatsAggregate + numStatsDeps +) + +// statDepSet represents a set of statDeps. +// +// Under the hood, it's a bitmap. +type statDepSet [1]uint64 + +// makeStatDepSet creates a new statDepSet from a list of statDeps. +func makeStatDepSet(deps ...statDep) statDepSet { + var s statDepSet + for _, d := range deps { + s[d/64] |= 1 << (d % 64) + } + return s +} + +// difference returns the set difference of s from b as a new set. +func (s statDepSet) difference(b statDepSet) statDepSet { + var c statDepSet + for i := range s { + c[i] = s[i] &^ b[i] + } + return c +} + +// union returns the union of the two sets as a new set. +func (s statDepSet) union(b statDepSet) statDepSet { + var c statDepSet + for i := range s { + c[i] = s[i] | b[i] + } + return c +} + +// empty returns true if there are no dependencies in the set.
+func (s *statDepSet) empty() bool { + for _, c := range s { + if c != 0 { + return false + } + } + return true +} + +// has returns true if the set contains a given statDep. +func (s *statDepSet) has(d statDep) bool { + return s[d/64]&(1<<(d%64)) != 0 +} + +// heapStatsAggregate represents memory stats obtained from the +// runtime. This set of stats is grouped together because they +// depend on each other in some way to make sense of the runtime's +// current heap memory use. They're also sharded across Ps, so it +// makes sense to grab them all at once. +type heapStatsAggregate struct { + heapStatsDelta + + // Derived from values in heapStatsDelta. + + // inObjects is the bytes of memory occupied by objects. + inObjects uint64 + + // numObjects is the number of live objects in the heap. + numObjects uint64 + + // totalAllocated is the total bytes of heap objects allocated + // over the lifetime of the program. + totalAllocated uint64 + + // totalFreed is the total bytes of heap objects freed + // over the lifetime of the program. + totalFreed uint64 + + // totalAllocs is the number of heap objects allocated over + // the lifetime of the program. + totalAllocs uint64 + + // totalFrees is the number of heap objects freed over + // the lifetime of the program. + totalFrees uint64 +} + +// compute populates the heapStatsAggregate with values from the runtime. +func (a *heapStatsAggregate) compute() { + memstats.heapStats.read(&a.heapStatsDelta) + + // Calculate derived stats.
+ a.totalAllocs = a.largeAllocCount + a.totalFrees = a.largeFreeCount + a.totalAllocated = a.largeAlloc + a.totalFreed = a.largeFree + for i := range a.smallAllocCount { + na := a.smallAllocCount[i] + nf := a.smallFreeCount[i] + a.totalAllocs += na + a.totalFrees += nf + a.totalAllocated += na * uint64(class_to_size[i]) + a.totalFreed += nf * uint64(class_to_size[i]) + } + a.inObjects = a.totalAllocated - a.totalFreed + a.numObjects = a.totalAllocs - a.totalFrees +} + +// sysStatsAggregate represents system memory stats obtained +// from the runtime. This set of stats is grouped together because +// they're all relatively cheap to acquire and generally independent +// of one another and other runtime memory stats. The fact that they +// may be acquired at different times, especially with respect to +// heapStatsAggregate, means there could be some skew, but because +// these stats are independent, there's no real consistency issue here. +type sysStatsAggregate struct { + stacksSys uint64 + mSpanSys uint64 + mSpanInUse uint64 + mCacheSys uint64 + mCacheInUse uint64 + buckHashSys uint64 + gcMiscSys uint64 + otherSys uint64 + heapGoal uint64 + gcCyclesDone uint64 + gcCyclesForced uint64 +} + +// compute populates the sysStatsAggregate with values from the runtime.
+func (a *sysStatsAggregate) compute() { + a.stacksSys = memstats.stacks_sys.load() + a.buckHashSys = memstats.buckhash_sys.load() + a.gcMiscSys = memstats.gcMiscSys.load() + a.otherSys = memstats.other_sys.load() + a.heapGoal = gcController.heapGoal() + a.gcCyclesDone = uint64(memstats.numgc) + a.gcCyclesForced = uint64(memstats.numforcedgc) + + systemstack(func() { + lock(&mheap_.lock) + a.mSpanSys = memstats.mspan_sys.load() + a.mSpanInUse = uint64(mheap_.spanalloc.inuse) + a.mCacheSys = memstats.mcache_sys.load() + a.mCacheInUse = uint64(mheap_.cachealloc.inuse) + unlock(&mheap_.lock) + }) +} + +// cpuStatsAggregate represents CPU stats obtained from the runtime +// acquired together to avoid skew and inconsistencies. +type cpuStatsAggregate struct { + cpuStats +} + +// compute populates the cpuStatsAggregate with values from the runtime. +func (a *cpuStatsAggregate) compute() { + a.cpuStats = work.cpuStats +} + +// nsToSec takes a duration in nanoseconds and converts it to seconds as +// a float64. +func nsToSec(ns int64) float64 { + return float64(ns) / 1e9 +} + +// statAggregate is the main driver of the metrics implementation. +// +// It contains multiple aggregates of runtime statistics, as well +// as a set of these aggregates that it has populated. The aggregates +// are populated lazily by its ensure method. +type statAggregate struct { + ensured statDepSet + heapStats heapStatsAggregate + sysStats sysStatsAggregate + cpuStats cpuStatsAggregate +} + +// ensure populates statistics aggregates determined by deps if they +// haven't yet been populated.
+func (a *statAggregate) ensure(deps *statDepSet) { + missing := deps.difference(a.ensured) + if missing.empty() { + return + } + for i := statDep(0); i < numStatsDeps; i++ { + if !missing.has(i) { + continue + } + switch i { + case heapStatsDep: + a.heapStats.compute() + case sysStatsDep: + a.sysStats.compute() + case cpuStatsDep: + a.cpuStats.compute() + } + } + a.ensured = a.ensured.union(missing) +} + +// metricKind is a runtime copy of runtime/metrics.ValueKind and +// must be kept structurally identical to that type. +type metricKind int + +const ( + // These values must be kept identical to their corresponding Kind* values + // in the runtime/metrics package. + metricKindBad metricKind = iota + metricKindUint64 + metricKindFloat64 + metricKindFloat64Histogram +) + +// metricSample is a runtime copy of runtime/metrics.Sample and +// must be kept structurally identical to that type. +type metricSample struct { + name string + value metricValue +} + +// metricValue is a runtime copy of runtime/metrics.Value and +// must be kept structurally identical to that type. +type metricValue struct { + kind metricKind + scalar uint64 // contains scalar values for scalar Kinds. + pointer unsafe.Pointer // contains non-scalar values. +} + +// float64HistOrInit tries to pull out an existing float64Histogram +// from the value, but if none exists, then it allocates one with +// the given buckets.
+func (v *metricValue) float64HistOrInit(buckets []float64) *metricFloat64Histogram { + var hist *metricFloat64Histogram + if v.kind == metricKindFloat64Histogram && v.pointer != nil { + hist = (*metricFloat64Histogram)(v.pointer) + } else { + v.kind = metricKindFloat64Histogram + hist = new(metricFloat64Histogram) + v.pointer = unsafe.Pointer(hist) + } + hist.buckets = buckets + if len(hist.counts) != len(hist.buckets)-1 { + hist.counts = make([]uint64, len(buckets)-1) + } + return hist +} + +// metricFloat64Histogram is a runtime copy of runtime/metrics.Float64Histogram +// and must be kept structurally identical to that type. +type metricFloat64Histogram struct { + counts []uint64 + buckets []float64 +} + +// agg is used by readMetrics, and is protected by metricsSema. +// +// Managed as a global variable because its pointer will be +// an argument to a dynamically-defined function, and we'd +// like to avoid it escaping to the heap. +var agg statAggregate + +// readMetrics is the implementation of runtime/metrics.Read. +// +//go:linkname readMetrics runtime/metrics.runtime_readMetrics +func readMetrics(samplesp unsafe.Pointer, len int, cap int) { + // Construct a slice from the args. + sl := slice{samplesp, len, cap} + samples := *(*[]metricSample)(unsafe.Pointer(&sl)) + + metricsLock() + + // Ensure the map is initialized. + initMetrics() + + // Clear agg defensively. + agg = statAggregate{} + + // Sample. + for i := range samples { + sample := &samples[i] + data, ok := metrics[sample.name] + if !ok { + sample.value.kind = metricKindBad + continue + } + // Ensure we have all the stats we need. + // agg is populated lazily. + agg.ensure(&data.deps) + + // Compute the value based on the stats we have. 
+ data.compute(&agg, &sample.value) + } + + metricsUnlock() +} diff --git a/src/runtime/metrics/description.go b/src/runtime/metrics/description.go new file mode 100644 index 0000000..dcfe01e --- /dev/null +++ b/src/runtime/metrics/description.go @@ -0,0 +1,380 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package metrics + +// Description describes a runtime metric. +type Description struct { + // Name is the full name of the metric which includes the unit. + // + // The format of the metric may be described by the following regular expression. + // + // ^(?P<name>/[^:]+):(?P<unit>[^:*/]+(?:[*/][^:*/]+)*)$ + // + // The format splits the name into two components, separated by a colon: a path which always + // starts with a /, and a machine-parseable unit. The name may contain any valid Unicode + // codepoint in between / characters, but by convention will try to stick to lowercase + // characters and hyphens. An example of such a path might be "/memory/heap/free". + // + // The unit is by convention a series of lowercase English unit names (singular or plural) + // without prefixes delimited by '*' or '/'. The unit names may contain any valid Unicode + // codepoint that is not a delimiter. + // Examples of units might be "seconds", "bytes", "bytes/second", "cpu-seconds", + // "byte*cpu-seconds", and "bytes/second/second". + // + // For histograms, multiple units may apply. For instance, the units of the buckets and + // the count. By convention, for histograms, the units of the count are always "samples" + // with the type of sample evident by the metric's name, while the unit in the name + // specifies the buckets' unit. + // + // A complete name might look like "/memory/heap/free:bytes". + Name string + + // Description is an English language sentence describing the metric. 
+ Description string + + // Kind is the kind of value for this metric. + // + // The purpose of this field is to allow users to filter out metrics whose values are + // types which their application may not understand. + Kind ValueKind + + // Cumulative is whether or not the metric is cumulative. If a cumulative metric is just + // a single number, then it increases monotonically. If the metric is a distribution, + // then each bucket count increases monotonically. + // + // This flag thus indicates whether or not it's useful to compute a rate from this value. + Cumulative bool +} + +// The English language descriptions below must be kept in sync with the +// descriptions of each metric in doc.go. +var allDesc = []Description{ + { + Name: "/cgo/go-to-c-calls:calls", + Description: "Count of calls made from Go to C by the current process.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/cpu/classes/gc/mark/assist:cpu-seconds", + Description: "Estimated total CPU time goroutines spent performing GC tasks " + + "to assist the GC and prevent it from falling behind the application. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/gc/mark/dedicated:cpu-seconds", + Description: "Estimated total CPU time spent performing GC tasks on " + + "processors (as defined by GOMAXPROCS) dedicated to those tasks. " + + "This includes time spent with the world stopped due to the GC. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. 
Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/gc/mark/idle:cpu-seconds", + Description: "Estimated total CPU time spent performing GC tasks on " + + "spare CPU resources that the Go scheduler could not otherwise find " + + "a use for. This should be subtracted from the total GC CPU time to " + + "obtain a measure of compulsory GC CPU time. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/gc/pause:cpu-seconds", + Description: "Estimated total CPU time spent with the application paused by " + + "the GC. Even if only one thread is running during the pause, this is " + + "computed as GOMAXPROCS times the pause latency because nothing else " + + "can be executing. This is the exact sum of samples in /gc/pauses:seconds " + + "if each sample is multiplied by GOMAXPROCS at the time it is taken. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/gc/total:cpu-seconds", + Description: "Estimated total CPU time spent performing GC tasks. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics. Sum of all metrics in /cpu/classes/gc.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/idle:cpu-seconds", + Description: "Estimated total available CPU time not spent executing any Go or Go runtime code. " + + "In other words, the part of /cpu/classes/total:cpu-seconds that was unused. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements.
Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/scavenge/assist:cpu-seconds", + Description: "Estimated total CPU time spent returning unused memory to the " + + "underlying platform eagerly in response to memory pressure. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/scavenge/background:cpu-seconds", + Description: "Estimated total CPU time spent performing background tasks " + + "to return unused memory to the underlying platform. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/scavenge/total:cpu-seconds", + Description: "Estimated total CPU time spent performing tasks that return " + + "unused memory to the underlying platform. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics. Sum of all metrics in /cpu/classes/scavenge.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/total:cpu-seconds", + Description: "Estimated total available CPU time for user Go code " + + "or the Go runtime, as defined by GOMAXPROCS. In other words, GOMAXPROCS " + + "integrated over the wall-clock duration this process has been executing for. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics. Sum of all metrics in /cpu/classes.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/cpu/classes/user:cpu-seconds", + Description: "Estimated total CPU time spent running user Go code.
This may " + + "also include some small amount of time spent in the Go runtime. " + + "This metric is an overestimate, and not directly comparable to " + + "system CPU time measurements. Compare only with other /cpu/classes " + + "metrics.", + Kind: KindFloat64, + Cumulative: true, + }, + { + Name: "/gc/cycles/automatic:gc-cycles", + Description: "Count of completed GC cycles generated by the Go runtime.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/cycles/forced:gc-cycles", + Description: "Count of completed GC cycles forced by the application.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/cycles/total:gc-cycles", + Description: "Count of all completed GC cycles.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/heap/allocs-by-size:bytes", + Description: "Distribution of heap allocations by approximate size. " + + "Note that this does not include tiny objects as defined by " + + "/gc/heap/tiny/allocs:objects, only tiny blocks.", + Kind: KindFloat64Histogram, + Cumulative: true, + }, + { + Name: "/gc/heap/allocs:bytes", + Description: "Cumulative sum of memory allocated to the heap by the application.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/heap/allocs:objects", + Description: "Cumulative count of heap allocations triggered by the application. " + + "Note that this does not include tiny objects as defined by " + + "/gc/heap/tiny/allocs:objects, only tiny blocks.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/heap/frees-by-size:bytes", + Description: "Distribution of freed heap allocations by approximate size. 
" + + "Note that this does not include tiny objects as defined by " + + "/gc/heap/tiny/allocs:objects, only tiny blocks.", + Kind: KindFloat64Histogram, + Cumulative: true, + }, + { + Name: "/gc/heap/frees:bytes", + Description: "Cumulative sum of heap memory freed by the garbage collector.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/heap/frees:objects", + Description: "Cumulative count of heap allocations whose storage was freed " + + "by the garbage collector. " + + "Note that this does not include tiny objects as defined by " + + "/gc/heap/tiny/allocs:objects, only tiny blocks.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/heap/goal:bytes", + Description: "Heap size target for the end of the GC cycle.", + Kind: KindUint64, + }, + { + Name: "/gc/heap/objects:objects", + Description: "Number of objects, live or unswept, occupying heap memory.", + Kind: KindUint64, + }, + { + Name: "/gc/heap/tiny/allocs:objects", + Description: "Count of small allocations that are packed together into blocks. " + + "These allocations are counted separately from other allocations " + + "because each individual allocation is not tracked by the runtime, " + + "only their block. Each block is already accounted for in " + + "allocs-by-size and frees-by-size.", + Kind: KindUint64, + Cumulative: true, + }, + { + Name: "/gc/limiter/last-enabled:gc-cycle", + Description: "GC cycle the last time the GC CPU limiter was enabled. " + + "This metric is useful for diagnosing the root cause of an out-of-memory " + + "error, because the limiter trades memory for CPU time when the GC's CPU " + + "time gets too high. This is most likely to occur with use of SetMemoryLimit. 
" + + "The first GC cycle is cycle 1, so a value of 0 indicates that it was never enabled.", + Kind: KindUint64, + }, + { + Name: "/gc/pauses:seconds", + Description: "Distribution individual GC-related stop-the-world pause latencies.", + Kind: KindFloat64Histogram, + Cumulative: true, + }, + { + Name: "/gc/stack/starting-size:bytes", + Description: "The stack size of new goroutines.", + Kind: KindUint64, + Cumulative: false, + }, + { + Name: "/memory/classes/heap/free:bytes", + Description: "Memory that is completely free and eligible to be returned to the underlying system, " + + "but has not been. This metric is the runtime's estimate of free address space that is backed by " + + "physical memory.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/heap/objects:bytes", + Description: "Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/heap/released:bytes", + Description: "Memory that is completely free and has been returned to the underlying system. 
This " + + "metric is the runtime's estimate of free address space that is still mapped into the process, " + + "but is not backed by physical memory.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/heap/stacks:bytes", + Description: "Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/heap/unused:bytes", + Description: "Memory that is reserved for heap objects but is not currently used to hold heap objects.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/metadata/mcache/free:bytes", + Description: "Memory that is reserved for runtime mcache structures, but not in-use.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/metadata/mcache/inuse:bytes", + Description: "Memory that is occupied by runtime mcache structures that are currently being used.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/metadata/mspan/free:bytes", + Description: "Memory that is reserved for runtime mspan structures, but not in-use.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/metadata/mspan/inuse:bytes", + Description: "Memory that is occupied by runtime mspan structures that are currently being used.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/metadata/other:bytes", + Description: "Memory that is reserved for or used to hold runtime metadata.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/os-stacks:bytes", + Description: "Stack memory allocated by the underlying operating system.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/other:bytes", + Description: "Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/profiling/buckets:bytes", + Description: "Memory that is used by the stack trace hash map used for profiling.", + Kind: KindUint64, + }, + { + Name: "/memory/classes/total:bytes", + 
Description: "All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes.", + Kind: KindUint64, + }, + { + Name: "/sched/gomaxprocs:threads", + Description: "The current runtime.GOMAXPROCS setting, or the number of operating system threads that can execute user-level Go code simultaneously.", + Kind: KindUint64, + }, + { + Name: "/sched/goroutines:goroutines", + Description: "Count of live goroutines.", + Kind: KindUint64, + }, + { + Name: "/sched/latencies:seconds", + Description: "Distribution of the time goroutines have spent in the scheduler in a runnable state before actually running.", + Kind: KindFloat64Histogram, + }, + { + Name: "/sync/mutex/wait/total:seconds", + Description: "Approximate cumulative time goroutines have spent blocked on a sync.Mutex or sync.RWMutex. This metric is useful for identifying global changes in lock contention. Collect a mutex or block profile using the runtime/pprof package for more detailed contention data.", + Kind: KindFloat64, + Cumulative: true, + }, +} + +// All returns a slice containing metric descriptions for all supported metrics. +func All() []Description { + return allDesc +} diff --git a/src/runtime/metrics/description_test.go b/src/runtime/metrics/description_test.go new file mode 100644 index 0000000..192c1f2 --- /dev/null +++ b/src/runtime/metrics/description_test.go @@ -0,0 +1,115 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package metrics_test + +import ( + "bufio" + "os" + "regexp" + "runtime/metrics" + "strings" + "testing" +) + +func TestDescriptionNameFormat(t *testing.T) { + r := regexp.MustCompile("^(?P<name>/[^:]+):(?P<unit>[^:*/]+(?:[*/][^:*/]+)*)$") + descriptions := metrics.All() + for _, desc := range descriptions { + if !r.MatchString(desc.Name) { + t.Errorf("metrics %q does not match regexp %s", desc.Name, r) + } + } +} + +func extractMetricDocs(t *testing.T) map[string]string { + f, err := os.Open("doc.go") + if err != nil { + t.Fatalf("failed to open doc.go in runtime/metrics package: %v", err) + } + const ( + stateSearch = iota // look for list of metrics + stateNextMetric // look for next metric + stateNextDescription // build description + ) + state := stateSearch + s := bufio.NewScanner(f) + result := make(map[string]string) + var metric string + var prevMetric string + var desc strings.Builder + for s.Scan() { + line := strings.TrimSpace(s.Text()) + switch state { + case stateSearch: + if line == "Below is the full list of supported metrics, ordered lexicographically." { + state = stateNextMetric + } + case stateNextMetric: + // Ignore empty lines until we find a non-empty + // one. This will be our metric name. + if len(line) != 0 { + prevMetric = metric + metric = line + if prevMetric > metric { + t.Errorf("metrics %s and %s are out of lexicographical order", prevMetric, metric) + } + state = stateNextDescription + } + case stateNextDescription: + if len(line) == 0 || line == `*/` { + // An empty line means we're done. + // Write down the description and look + // for a new metric. + result[metric] = desc.String() + desc.Reset() + state = stateNextMetric + } else { + // As long as we're seeing data, assume that's + // part of the description and append it. + if desc.Len() != 0 { + // Turn previous newlines into spaces. 
+ desc.WriteString(" ") + } + desc.WriteString(line) + } + } + if line == `*/` { + break + } + } + if state == stateSearch { + t.Fatalf("failed to find supported metrics docs in %s", f.Name()) + } + return result +} + +func TestDescriptionDocs(t *testing.T) { + docs := extractMetricDocs(t) + descriptions := metrics.All() + for _, d := range descriptions { + want := d.Description + got, ok := docs[d.Name] + if !ok { + t.Errorf("no docs found for metric %s", d.Name) + continue + } + if got != want { + t.Errorf("mismatched description and docs for metric %s", d.Name) + t.Errorf("want: %q, got %q", want, got) + continue + } + } + if len(docs) > len(descriptions) { + docsLoop: + for name := range docs { + for _, d := range descriptions { + if name == d.Name { + continue docsLoop + } + } + t.Errorf("stale documentation for non-existent metric: %s", name) + } + } +} diff --git a/src/runtime/metrics/doc.go b/src/runtime/metrics/doc.go new file mode 100644 index 0000000..b593d8d --- /dev/null +++ b/src/runtime/metrics/doc.go @@ -0,0 +1,283 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +/* +Package metrics provides a stable interface to access implementation-defined +metrics exported by the Go runtime. This package is similar to existing functions +like runtime.ReadMemStats and debug.ReadGCStats, but significantly more general. + +The set of metrics defined by this package may evolve as the runtime itself +evolves, and also enables variation across Go implementations, whose relevant +metric sets may not intersect. + +# Interface + +Metrics are designated by a string key, rather than, for example, a field name in +a struct. The full list of supported metrics is always available in the slice of +Descriptions returned by All. Each Description also includes useful information +about the metric. 
+ +Thus, users of this API are encouraged to sample supported metrics defined by the +slice returned by All to remain compatible across Go versions. Of course, situations +arise where reading specific metrics is critical. For these cases, users are +encouraged to use build tags, and although metrics may be deprecated and removed, +users should consider this to be an exceptional and rare event, coinciding with a +very large change in a particular Go implementation. + +Each metric key also has a "kind" that describes the format of the metric's value. +In the interest of not breaking users of this package, the "kind" for a given metric +is guaranteed not to change. If it must change, then a new metric will be introduced +with a new key and a new "kind." + +# Metric key format + +As mentioned earlier, metric keys are strings. Their format is simple and well-defined, +designed to be both human and machine readable. It is split into two components, +separated by a colon: a rooted path and a unit. The choice to include the unit in +the key is motivated by compatibility: if a metric's unit changes, its semantics likely +did also, and a new key should be introduced. + +For more details on the precise definition of the metric key's path and unit formats, see +the documentation of the Name field of the Description struct. + +# A note about floats + +This package supports metrics whose values have a floating-point representation. In +order to improve ease-of-use, this package promises to never produce the following +classes of floating-point values: NaN, infinity. + +# Supported metrics + +Below is the full list of supported metrics, ordered lexicographically. + + /cgo/go-to-c-calls:calls + Count of calls made from Go to C by the current process. + + /cpu/classes/gc/mark/assist:cpu-seconds + Estimated total CPU time goroutines spent performing GC tasks + to assist the GC and prevent it from falling behind the application. 
+ This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/gc/mark/dedicated:cpu-seconds + Estimated total CPU time spent performing GC tasks on + processors (as defined by GOMAXPROCS) dedicated to those tasks. + This includes time spent with the world stopped due to the GC. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/gc/mark/idle:cpu-seconds + Estimated total CPU time spent performing GC tasks on + spare CPU resources that the Go scheduler could not otherwise find + a use for. This should be subtracted from the total GC CPU time to + obtain a measure of compulsory GC CPU time. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/gc/pause:cpu-seconds + Estimated total CPU time spent with the application paused by + the GC. Even if only one thread is running during the pause, this is + computed as GOMAXPROCS times the pause latency because nothing else + can be executing. This is the exact sum of samples in /gc/pause:seconds + if each sample is multiplied by GOMAXPROCS at the time it is taken. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/gc/total:cpu-seconds + Estimated total CPU time spent performing GC tasks. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. Sum of all metrics in /cpu/classes/gc. + + /cpu/classes/idle:cpu-seconds + Estimated total available CPU time not spent executing any Go or Go + runtime code. In other words, the part of /cpu/classes/total:cpu-seconds + that was unused. 
+ This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/scavenge/assist:cpu-seconds + Estimated total CPU time spent returning unused memory to the + underlying platform in response eagerly in response to memory pressure. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/scavenge/background:cpu-seconds + Estimated total CPU time spent performing background tasks + to return unused memory to the underlying platform. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /cpu/classes/scavenge/total:cpu-seconds + Estimated total CPU time spent performing tasks that return + unused memory to the underlying platform. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. Sum of all metrics in /cpu/classes/scavenge. + + /cpu/classes/total:cpu-seconds + Estimated total available CPU time for user Go code or the Go runtime, as + defined by GOMAXPROCS. In other words, GOMAXPROCS integrated over the + wall-clock duration this process has been executing for. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. Sum of all metrics in /cpu/classes. + + /cpu/classes/user:cpu-seconds + Estimated total CPU time spent running user Go code. This may + also include some small amount of time spent in the Go runtime. + This metric is an overestimate, and not directly comparable to + system CPU time measurements. Compare only with other /cpu/classes + metrics. + + /gc/cycles/automatic:gc-cycles + Count of completed GC cycles generated by the Go runtime. 
+ + /gc/cycles/forced:gc-cycles + Count of completed GC cycles forced by the application. + + /gc/cycles/total:gc-cycles + Count of all completed GC cycles. + + /gc/heap/allocs-by-size:bytes + Distribution of heap allocations by approximate size. + Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, + only tiny blocks. + + /gc/heap/allocs:bytes + Cumulative sum of memory allocated to the heap by the application. + + /gc/heap/allocs:objects + Cumulative count of heap allocations triggered by the application. + Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, + only tiny blocks. + + /gc/heap/frees-by-size:bytes + Distribution of freed heap allocations by approximate size. + Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, + only tiny blocks. + + /gc/heap/frees:bytes + Cumulative sum of heap memory freed by the garbage collector. + + /gc/heap/frees:objects + Cumulative count of heap allocations whose storage was freed by the garbage collector. + Note that this does not include tiny objects as defined by /gc/heap/tiny/allocs:objects, + only tiny blocks. + + /gc/heap/goal:bytes + Heap size target for the end of the GC cycle. + + /gc/heap/objects:objects + Number of objects, live or unswept, occupying heap memory. + + /gc/heap/tiny/allocs:objects + Count of small allocations that are packed together into blocks. + These allocations are counted separately from other allocations + because each individual allocation is not tracked by the runtime, + only their block. Each block is already accounted for in + allocs-by-size and frees-by-size. + + /gc/limiter/last-enabled:gc-cycle + GC cycle the last time the GC CPU limiter was enabled. + This metric is useful for diagnosing the root cause of an out-of-memory + error, because the limiter trades memory for CPU time when the GC's CPU + time gets too high. This is most likely to occur with use of SetMemoryLimit. 
+ The first GC cycle is cycle 1, so a value of 0 indicates that it was never enabled. + + /gc/pauses:seconds + Distribution individual GC-related stop-the-world pause latencies. + + /gc/stack/starting-size:bytes + The stack size of new goroutines. + + /memory/classes/heap/free:bytes + Memory that is completely free and eligible to be returned to + the underlying system, but has not been. This metric is the + runtime's estimate of free address space that is backed by + physical memory. + + /memory/classes/heap/objects:bytes + Memory occupied by live objects and dead objects that have + not yet been marked free by the garbage collector. + + /memory/classes/heap/released:bytes + Memory that is completely free and has been returned to + the underlying system. This metric is the runtime's estimate of + free address space that is still mapped into the process, but + is not backed by physical memory. + + /memory/classes/heap/stacks:bytes + Memory allocated from the heap that is reserved for stack + space, whether or not it is currently in-use. + + /memory/classes/heap/unused:bytes + Memory that is reserved for heap objects but is not currently + used to hold heap objects. + + /memory/classes/metadata/mcache/free:bytes + Memory that is reserved for runtime mcache structures, but + not in-use. + + /memory/classes/metadata/mcache/inuse:bytes + Memory that is occupied by runtime mcache structures that + are currently being used. + + /memory/classes/metadata/mspan/free:bytes + Memory that is reserved for runtime mspan structures, but + not in-use. + + /memory/classes/metadata/mspan/inuse:bytes + Memory that is occupied by runtime mspan structures that are + currently being used. + + /memory/classes/metadata/other:bytes + Memory that is reserved for or used to hold runtime + metadata. + + /memory/classes/os-stacks:bytes + Stack memory allocated by the underlying operating system. 
+ + /memory/classes/other:bytes + Memory used by execution trace buffers, structures for + debugging the runtime, finalizer and profiler specials, and + more. + + /memory/classes/profiling/buckets:bytes + Memory that is used by the stack trace hash map used for + profiling. + + /memory/classes/total:bytes + All memory mapped by the Go runtime into the current process + as read-write. Note that this does not include memory mapped + by code called via cgo or via the syscall package. + Sum of all metrics in /memory/classes. + + /sched/gomaxprocs:threads + The current runtime.GOMAXPROCS setting, or the number of + operating system threads that can execute user-level Go code + simultaneously. + + /sched/goroutines:goroutines + Count of live goroutines. + + /sched/latencies:seconds + Distribution of the time goroutines have spent in the scheduler + in a runnable state before actually running. + + /sync/mutex/wait/total:seconds + Approximate cumulative time goroutines have spent blocked on a + sync.Mutex or sync.RWMutex. This metric is useful for identifying + global changes in lock contention. Collect a mutex or block + profile using the runtime/pprof package for more detailed + contention data. +*/ +package metrics diff --git a/src/runtime/metrics/example_test.go b/src/runtime/metrics/example_test.go new file mode 100644 index 0000000..624d9d8 --- /dev/null +++ b/src/runtime/metrics/example_test.go @@ -0,0 +1,96 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package metrics_test + +import ( + "fmt" + "runtime/metrics" +) + +func ExampleRead_readingOneMetric() { + // Name of the metric we want to read. + const myMetric = "/memory/classes/heap/free:bytes" + + // Create a sample for the metric. + sample := make([]metrics.Sample, 1) + sample[0].Name = myMetric + + // Sample the metric. 
+ metrics.Read(sample) + + // Check if the metric is actually supported. + // If it's not, the resulting value will always have + // kind KindBad. + if sample[0].Value.Kind() == metrics.KindBad { + panic(fmt.Sprintf("metric %q no longer supported", myMetric)) + } + + // Handle the result. + // + // It's OK to assume a particular Kind for a metric; + // they're guaranteed not to change. + freeBytes := sample[0].Value.Uint64() + + fmt.Printf("free but not released memory: %d\n", freeBytes) +} + +func ExampleRead_readingAllMetrics() { + // Get descriptions for all supported metrics. + descs := metrics.All() + + // Create a sample for each metric. + samples := make([]metrics.Sample, len(descs)) + for i := range samples { + samples[i].Name = descs[i].Name + } + + // Sample the metrics. Re-use the samples slice if you can! + metrics.Read(samples) + + // Iterate over all results. + for _, sample := range samples { + // Pull out the name and value. + name, value := sample.Name, sample.Value + + // Handle each sample. + switch value.Kind() { + case metrics.KindUint64: + fmt.Printf("%s: %d\n", name, value.Uint64()) + case metrics.KindFloat64: + fmt.Printf("%s: %f\n", name, value.Float64()) + case metrics.KindFloat64Histogram: + // The histogram may be quite large, so let's just pull out + // a crude estimate for the median for the sake of this example. + fmt.Printf("%s: %f\n", name, medianBucket(value.Float64Histogram())) + case metrics.KindBad: + // This should never happen because all metrics are supported + // by construction. + panic("bug in runtime/metrics package!") + default: + // This may happen as new metrics get added. + // + // The safest thing to do here is to simply log it somewhere + // as something to look into, but ignore it for now. + // In the worst case, you might temporarily miss out on a new metric. 
+ fmt.Printf("%s: unexpected metric Kind: %v\n", name, value.Kind()) + } + } +} + +func medianBucket(h *metrics.Float64Histogram) float64 { + total := uint64(0) + for _, count := range h.Counts { + total += count + } + thresh := total / 2 + total = 0 + for i, count := range h.Counts { + total += count + if total >= thresh { + return h.Buckets[i] + } + } + panic("should not happen") +} diff --git a/src/runtime/metrics/histogram.go b/src/runtime/metrics/histogram.go new file mode 100644 index 0000000..956422b --- /dev/null +++ b/src/runtime/metrics/histogram.go @@ -0,0 +1,33 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package metrics + +// Float64Histogram represents a distribution of float64 values. +type Float64Histogram struct { + // Counts contains the weights for each histogram bucket. + // + // Given N buckets, Count[n] is the weight of the range + // [bucket[n], bucket[n+1]), for 0 <= n < N. + Counts []uint64 + + // Buckets contains the boundaries of the histogram buckets, in increasing order. + // + // Buckets[0] is the inclusive lower bound of the minimum bucket while + // Buckets[len(Buckets)-1] is the exclusive upper bound of the maximum bucket. + // Hence, there are len(Buckets)-1 counts. Furthermore, len(Buckets) != 1, always, + // since at least two boundaries are required to describe one bucket (and 0 + // boundaries are used to describe 0 buckets). + // + // Buckets[0] is permitted to have value -Inf and Buckets[len(Buckets)-1] is + // permitted to have value Inf. + // + // For a given metric name, the value of Buckets is guaranteed not to change + // between calls until program exit. + // + // This slice value is permitted to alias with other Float64Histograms' Buckets + // fields, so the values within should only ever be read. If they need to be + // modified, the user must make a copy. 
+ Buckets []float64 +} diff --git a/src/runtime/metrics/sample.go b/src/runtime/metrics/sample.go new file mode 100644 index 0000000..4cf8cdf --- /dev/null +++ b/src/runtime/metrics/sample.go @@ -0,0 +1,47 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package metrics + +import ( + _ "runtime" // depends on the runtime via a linkname'd function + "unsafe" +) + +// Sample captures a single metric sample. +type Sample struct { + // Name is the name of the metric sampled. + // + // It must correspond to a name in one of the metric descriptions + // returned by All. + Name string + + // Value is the value of the metric sample. + Value Value +} + +// Implemented in the runtime. +func runtime_readMetrics(unsafe.Pointer, int, int) + +// Read populates each Value field in the given slice of metric samples. +// +// Desired metrics should be present in the slice with the appropriate name. +// The user of this API is encouraged to re-use the same slice between calls for +// efficiency, but is not required to do so. +// +// Note that re-use has some caveats. Notably, Values should not be read or +// manipulated while a Read with that value is outstanding; that is a data race. +// This property includes pointer-typed Values (for example, Float64Histogram) +// whose underlying storage will be reused by Read when possible. To safely use +// such values in a concurrent setting, all data must be deep-copied. +// +// It is safe to execute multiple Read calls concurrently, but their arguments +// must share no underlying memory. When in doubt, create a new []Sample from +// scratch, which is always safe, though may be inefficient. +// +// Sample values with names not appearing in All will have their Value populated +// as KindBad to indicate that the name is unknown. 
+func Read(m []Sample) { + runtime_readMetrics(unsafe.Pointer(&m[0]), len(m), cap(m)) +} diff --git a/src/runtime/metrics/value.go b/src/runtime/metrics/value.go new file mode 100644 index 0000000..ed9a33d --- /dev/null +++ b/src/runtime/metrics/value.go @@ -0,0 +1,69 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package metrics + +import ( + "math" + "unsafe" +) + +// ValueKind is a tag for a metric Value which indicates its type. +type ValueKind int + +const ( + // KindBad indicates that the Value has no type and should not be used. + KindBad ValueKind = iota + + // KindUint64 indicates that the type of the Value is a uint64. + KindUint64 + + // KindFloat64 indicates that the type of the Value is a float64. + KindFloat64 + + // KindFloat64Histogram indicates that the type of the Value is a *Float64Histogram. + KindFloat64Histogram +) + +// Value represents a metric value returned by the runtime. +type Value struct { + kind ValueKind + scalar uint64 // contains scalar values for scalar Kinds. + pointer unsafe.Pointer // contains non-scalar values. +} + +// Kind returns the tag representing the kind of value this is. +func (v Value) Kind() ValueKind { + return v.kind +} + +// Uint64 returns the internal uint64 value for the metric. +// +// If v.Kind() != KindUint64, this method panics. +func (v Value) Uint64() uint64 { + if v.kind != KindUint64 { + panic("called Uint64 on non-uint64 metric value") + } + return v.scalar +} + +// Float64 returns the internal float64 value for the metric. +// +// If v.Kind() != KindFloat64, this method panics. +func (v Value) Float64() float64 { + if v.kind != KindFloat64 { + panic("called Float64 on non-float64 metric value") + } + return math.Float64frombits(v.scalar) +} + +// Float64Histogram returns the internal *Float64Histogram value for the metric. 
+// +// If v.Kind() != KindFloat64Histogram, this method panics. +func (v Value) Float64Histogram() *Float64Histogram { + if v.kind != KindFloat64Histogram { + panic("called Float64Histogram on non-Float64Histogram metric value") + } + return (*Float64Histogram)(v.pointer) +} diff --git a/src/runtime/metrics_test.go b/src/runtime/metrics_test.go new file mode 100644 index 0000000..d981c8e --- /dev/null +++ b/src/runtime/metrics_test.go @@ -0,0 +1,613 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "reflect" + "runtime" + "runtime/metrics" + "sort" + "strings" + "sync" + "testing" + "time" + "unsafe" +) + +func prepareAllMetricsSamples() (map[string]metrics.Description, []metrics.Sample) { + all := metrics.All() + samples := make([]metrics.Sample, len(all)) + descs := make(map[string]metrics.Description) + for i := range all { + samples[i].Name = all[i].Name + descs[all[i].Name] = all[i] + } + return descs, samples +} + +func TestReadMetrics(t *testing.T) { + // Tests whether readMetrics produces values aligning + // with ReadMemStats while the world is stopped. + var mstats runtime.MemStats + _, samples := prepareAllMetricsSamples() + runtime.ReadMetricsSlow(&mstats, unsafe.Pointer(&samples[0]), len(samples), cap(samples)) + + checkUint64 := func(t *testing.T, m string, got, want uint64) { + t.Helper() + if got != want { + t.Errorf("metric %q: got %d, want %d", m, got, want) + } + } + + // Check to make sure the values we read line up with other values we read. 
+ var allocsBySize *metrics.Float64Histogram + var tinyAllocs uint64 + var mallocs, frees uint64 + for i := range samples { + switch name := samples[i].Name; name { + case "/cgo/go-to-c-calls:calls": + checkUint64(t, name, samples[i].Value.Uint64(), uint64(runtime.NumCgoCall())) + case "/memory/classes/heap/free:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.HeapIdle-mstats.HeapReleased) + case "/memory/classes/heap/released:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.HeapReleased) + case "/memory/classes/heap/objects:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.HeapAlloc) + case "/memory/classes/heap/unused:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.HeapInuse-mstats.HeapAlloc) + case "/memory/classes/heap/stacks:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.StackInuse) + case "/memory/classes/metadata/mcache/free:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.MCacheSys-mstats.MCacheInuse) + case "/memory/classes/metadata/mcache/inuse:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.MCacheInuse) + case "/memory/classes/metadata/mspan/free:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.MSpanSys-mstats.MSpanInuse) + case "/memory/classes/metadata/mspan/inuse:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.MSpanInuse) + case "/memory/classes/metadata/other:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.GCSys) + case "/memory/classes/os-stacks:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.StackSys-mstats.StackInuse) + case "/memory/classes/other:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.OtherSys) + case "/memory/classes/profiling/buckets:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.BuckHashSys) + case "/memory/classes/total:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.Sys) + case 
"/gc/heap/allocs-by-size:bytes": + hist := samples[i].Value.Float64Histogram() + // Skip size class 0 in BySize, because it's always empty and not represented + // in the histogram. + for i, sc := range mstats.BySize[1:] { + if b, s := hist.Buckets[i+1], float64(sc.Size+1); b != s { + t.Errorf("bucket does not match size class: got %f, want %f", b, s) + // The rest of the checks aren't expected to work anyway. + continue + } + if c, m := hist.Counts[i], sc.Mallocs; c != m { + t.Errorf("histogram counts do not much BySize for class %d: got %d, want %d", i, c, m) + } + } + allocsBySize = hist + case "/gc/heap/allocs:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.TotalAlloc) + case "/gc/heap/frees-by-size:bytes": + hist := samples[i].Value.Float64Histogram() + // Skip size class 0 in BySize, because it's always empty and not represented + // in the histogram. + for i, sc := range mstats.BySize[1:] { + if b, s := hist.Buckets[i+1], float64(sc.Size+1); b != s { + t.Errorf("bucket does not match size class: got %f, want %f", b, s) + // The rest of the checks aren't expected to work anyway. + continue + } + if c, f := hist.Counts[i], sc.Frees; c != f { + t.Errorf("histogram counts do not match BySize for class %d: got %d, want %d", i, c, f) + } + } + case "/gc/heap/frees:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.TotalAlloc-mstats.HeapAlloc) + case "/gc/heap/tiny/allocs:objects": + // Currently, MemStats adds tiny alloc count to both Mallocs AND Frees. + // The reason for this is because MemStats couldn't be extended at the time + // but there was a desire to have Mallocs at least be a little more representative, + // while having Mallocs - Frees still represent a live object count. + // Unfortunately, MemStats doesn't actually export a large allocation count, + // so it's impossible to pull this number out directly. 
+ // + // Check tiny allocation count outside of this loop, by using the allocs-by-size + // histogram in order to figure out how many large objects there are. + tinyAllocs = samples[i].Value.Uint64() + // Because the next two metrics tests are checking against Mallocs and Frees, + // we can't check them directly for the same reason: we need to account for tiny + // allocations included in Mallocs and Frees. + case "/gc/heap/allocs:objects": + mallocs = samples[i].Value.Uint64() + case "/gc/heap/frees:objects": + frees = samples[i].Value.Uint64() + case "/gc/heap/objects:objects": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.HeapObjects) + case "/gc/heap/goal:bytes": + checkUint64(t, name, samples[i].Value.Uint64(), mstats.NextGC) + case "/gc/cycles/automatic:gc-cycles": + checkUint64(t, name, samples[i].Value.Uint64(), uint64(mstats.NumGC-mstats.NumForcedGC)) + case "/gc/cycles/forced:gc-cycles": + checkUint64(t, name, samples[i].Value.Uint64(), uint64(mstats.NumForcedGC)) + case "/gc/cycles/total:gc-cycles": + checkUint64(t, name, samples[i].Value.Uint64(), uint64(mstats.NumGC)) + } + } + + // Check tinyAllocs. + nonTinyAllocs := uint64(0) + for _, c := range allocsBySize.Counts { + nonTinyAllocs += c + } + checkUint64(t, "/gc/heap/tiny/allocs:objects", tinyAllocs, mstats.Mallocs-nonTinyAllocs) + + // Check allocation and free counts. + checkUint64(t, "/gc/heap/allocs:objects", mallocs, mstats.Mallocs-tinyAllocs) + checkUint64(t, "/gc/heap/frees:objects", frees, mstats.Frees-tinyAllocs) +} + +func TestReadMetricsConsistency(t *testing.T) { + // Tests whether readMetrics produces consistent, sensible values. + // The values are read concurrently with the runtime doing other + // things (e.g. allocating) so what we read can't reasonably compared + // to other runtime values (e.g. MemStats). + + // Run a few GC cycles to get some of the stats to be non-zero. 
+ runtime.GC() + runtime.GC() + runtime.GC() + + // Set GOMAXPROCS high then sleep briefly to ensure we generate + // some idle time. + oldmaxprocs := runtime.GOMAXPROCS(10) + time.Sleep(time.Millisecond) + runtime.GOMAXPROCS(oldmaxprocs) + + // Read all the supported metrics through the metrics package. + descs, samples := prepareAllMetricsSamples() + metrics.Read(samples) + + // Check to make sure the values we read make sense. + var totalVirtual struct { + got, want uint64 + } + var objects struct { + alloc, free *metrics.Float64Histogram + allocs, frees uint64 + allocdBytes, freedBytes uint64 + total, totalBytes uint64 + } + var gc struct { + numGC uint64 + pauses uint64 + } + var cpu struct { + gcAssist float64 + gcDedicated float64 + gcIdle float64 + gcPause float64 + gcTotal float64 + + idle float64 + user float64 + + scavengeAssist float64 + scavengeBg float64 + scavengeTotal float64 + + total float64 + } + for i := range samples { + kind := samples[i].Value.Kind() + if want := descs[samples[i].Name].Kind; kind != want { + t.Errorf("supported metric %q has unexpected kind: got %d, want %d", samples[i].Name, kind, want) + continue + } + if samples[i].Name != "/memory/classes/total:bytes" && strings.HasPrefix(samples[i].Name, "/memory/classes") { + v := samples[i].Value.Uint64() + totalVirtual.want += v + + // None of these stats should ever get this big. + // If they do, there's probably overflow involved, + // usually due to bad accounting. 
+ if int64(v) < 0 { + t.Errorf("%q has high/negative value: %d", samples[i].Name, v) + } + } + switch samples[i].Name { + case "/cpu/classes/gc/mark/assist:cpu-seconds": + cpu.gcAssist = samples[i].Value.Float64() + case "/cpu/classes/gc/mark/dedicated:cpu-seconds": + cpu.gcDedicated = samples[i].Value.Float64() + case "/cpu/classes/gc/mark/idle:cpu-seconds": + cpu.gcIdle = samples[i].Value.Float64() + case "/cpu/classes/gc/pause:cpu-seconds": + cpu.gcPause = samples[i].Value.Float64() + case "/cpu/classes/gc/total:cpu-seconds": + cpu.gcTotal = samples[i].Value.Float64() + case "/cpu/classes/idle:cpu-seconds": + cpu.idle = samples[i].Value.Float64() + case "/cpu/classes/scavenge/assist:cpu-seconds": + cpu.scavengeAssist = samples[i].Value.Float64() + case "/cpu/classes/scavenge/background:cpu-seconds": + cpu.scavengeBg = samples[i].Value.Float64() + case "/cpu/classes/scavenge/total:cpu-seconds": + cpu.scavengeTotal = samples[i].Value.Float64() + case "/cpu/classes/total:cpu-seconds": + cpu.total = samples[i].Value.Float64() + case "/cpu/classes/user:cpu-seconds": + cpu.user = samples[i].Value.Float64() + case "/memory/classes/total:bytes": + totalVirtual.got = samples[i].Value.Uint64() + case "/memory/classes/heap/objects:bytes": + objects.totalBytes = samples[i].Value.Uint64() + case "/gc/heap/objects:objects": + objects.total = samples[i].Value.Uint64() + case "/gc/heap/allocs:bytes": + objects.allocdBytes = samples[i].Value.Uint64() + case "/gc/heap/allocs:objects": + objects.allocs = samples[i].Value.Uint64() + case "/gc/heap/allocs-by-size:bytes": + objects.alloc = samples[i].Value.Float64Histogram() + case "/gc/heap/frees:bytes": + objects.freedBytes = samples[i].Value.Uint64() + case "/gc/heap/frees:objects": + objects.frees = samples[i].Value.Uint64() + case "/gc/heap/frees-by-size:bytes": + objects.free = samples[i].Value.Float64Histogram() + case "/gc/cycles:gc-cycles": + gc.numGC = samples[i].Value.Uint64() + case "/gc/pauses:seconds": + h := 
samples[i].Value.Float64Histogram() + gc.pauses = 0 + for i := range h.Counts { + gc.pauses += h.Counts[i] + } + case "/sched/gomaxprocs:threads": + if got, want := samples[i].Value.Uint64(), uint64(runtime.GOMAXPROCS(-1)); got != want { + t.Errorf("gomaxprocs doesn't match runtime.GOMAXPROCS: got %d, want %d", got, want) + } + case "/sched/goroutines:goroutines": + if samples[i].Value.Uint64() < 1 { + t.Error("number of goroutines is less than one") + } + } + } + // Only check this on Linux where we can be reasonably sure we have a high-resolution timer. + if runtime.GOOS == "linux" { + if cpu.gcDedicated <= 0 && cpu.gcAssist <= 0 && cpu.gcIdle <= 0 { + t.Errorf("found no time spent on GC work: %#v", cpu) + } + if cpu.gcPause <= 0 { + t.Errorf("found no GC pauses: %f", cpu.gcPause) + } + if cpu.idle <= 0 { + t.Errorf("found no idle time: %f", cpu.idle) + } + if total := cpu.gcDedicated + cpu.gcAssist + cpu.gcIdle + cpu.gcPause; !withinEpsilon(cpu.gcTotal, total, 0.01) { + t.Errorf("calculated total GC CPU not within 1%% of sampled total: %f vs. %f", total, cpu.gcTotal) + } + if total := cpu.scavengeAssist + cpu.scavengeBg; !withinEpsilon(cpu.scavengeTotal, total, 0.01) { + t.Errorf("calculated total scavenge CPU not within 1%% of sampled total: %f vs. %f", total, cpu.scavengeTotal) + } + if cpu.total <= 0 { + t.Errorf("found no total CPU time passed") + } + if cpu.user <= 0 { + t.Errorf("found no user time passed") + } + if total := cpu.gcTotal + cpu.scavengeTotal + cpu.user + cpu.idle; !withinEpsilon(cpu.total, total, 0.02) { + t.Errorf("calculated total CPU not within 2%% of sampled total: %f vs. 
%f", total, cpu.total) + } + } + if totalVirtual.got != totalVirtual.want { + t.Errorf(`"/memory/classes/total:bytes" does not match sum of /memory/classes/**: got %d, want %d`, totalVirtual.got, totalVirtual.want) + } + if got, want := objects.allocs-objects.frees, objects.total; got != want { + t.Errorf("mismatch between object alloc/free tallies and total: got %d, want %d", got, want) + } + if got, want := objects.allocdBytes-objects.freedBytes, objects.totalBytes; got != want { + t.Errorf("mismatch between object alloc/free tallies and total: got %d, want %d", got, want) + } + if b, c := len(objects.alloc.Buckets), len(objects.alloc.Counts); b != c+1 { + t.Errorf("allocs-by-size has wrong bucket or counts length: %d buckets, %d counts", b, c) + } + if b, c := len(objects.free.Buckets), len(objects.free.Counts); b != c+1 { + t.Errorf("frees-by-size has wrong bucket or counts length: %d buckets, %d counts", b, c) + } + if len(objects.alloc.Buckets) != len(objects.free.Buckets) { + t.Error("allocs-by-size and frees-by-size buckets don't match in length") + } else if len(objects.alloc.Counts) != len(objects.free.Counts) { + t.Error("allocs-by-size and frees-by-size counts don't match in length") + } else { + for i := range objects.alloc.Buckets { + ba := objects.alloc.Buckets[i] + bf := objects.free.Buckets[i] + if ba != bf { + t.Errorf("bucket %d is different for alloc and free hists: %f != %f", i, ba, bf) + } + } + if !t.Failed() { + var gotAlloc, gotFree uint64 + want := objects.total + for i := range objects.alloc.Counts { + if objects.alloc.Counts[i] < objects.free.Counts[i] { + t.Errorf("found more allocs than frees in object dist bucket %d", i) + continue + } + gotAlloc += objects.alloc.Counts[i] + gotFree += objects.free.Counts[i] + } + if got := gotAlloc - gotFree; got != want { + t.Errorf("object distribution counts don't match count of live objects: got %d, want %d", got, want) + } + if gotAlloc != objects.allocs { + t.Errorf("object distribution counts 
don't match total allocs: got %d, want %d", gotAlloc, objects.allocs) + } + if gotFree != objects.frees { + t.Errorf("object distribution counts don't match total allocs: got %d, want %d", gotFree, objects.frees) + } + } + } + // The current GC has at least 2 pauses per GC. + // Check to see if that value makes sense. + if gc.pauses < gc.numGC*2 { + t.Errorf("fewer pauses than expected: got %d, want at least %d", gc.pauses, gc.numGC*2) + } +} + +func BenchmarkReadMetricsLatency(b *testing.B) { + stop := applyGCLoad(b) + + // Spend this much time measuring latencies. + latencies := make([]time.Duration, 0, 1024) + _, samples := prepareAllMetricsSamples() + + // Hit metrics.Read continuously and measure. + b.ResetTimer() + for i := 0; i < b.N; i++ { + start := time.Now() + metrics.Read(samples) + latencies = append(latencies, time.Since(start)) + } + // Make sure to stop the timer before we wait! The load created above + // is very heavy-weight and not easy to stop, so we could end up + // confusing the benchmarking framework for small b.N. + b.StopTimer() + stop() + + // Disable the default */op metrics. + // ns/op doesn't mean anything because it's an average, but we + // have a sleep in our b.N loop above which skews this significantly. + b.ReportMetric(0, "ns/op") + b.ReportMetric(0, "B/op") + b.ReportMetric(0, "allocs/op") + + // Sort latencies then report percentiles. + sort.Slice(latencies, func(i, j int) bool { + return latencies[i] < latencies[j] + }) + b.ReportMetric(float64(latencies[len(latencies)*50/100]), "p50-ns") + b.ReportMetric(float64(latencies[len(latencies)*90/100]), "p90-ns") + b.ReportMetric(float64(latencies[len(latencies)*99/100]), "p99-ns") +} + +var readMetricsSink [1024]interface{} + +func TestReadMetricsCumulative(t *testing.T) { + // Set up the set of metrics marked cumulative. 
+ descs := metrics.All() + var samples [2][]metrics.Sample + samples[0] = make([]metrics.Sample, len(descs)) + samples[1] = make([]metrics.Sample, len(descs)) + total := 0 + for i := range samples[0] { + if !descs[i].Cumulative { + continue + } + samples[0][total].Name = descs[i].Name + total++ + } + samples[0] = samples[0][:total] + samples[1] = samples[1][:total] + copy(samples[1], samples[0]) + + // Start some noise in the background. + var wg sync.WaitGroup + wg.Add(1) + done := make(chan struct{}) + go func() { + defer wg.Done() + for { + // Add more things here that could influence metrics. + for i := 0; i < len(readMetricsSink); i++ { + readMetricsSink[i] = make([]byte, 1024) + select { + case <-done: + return + default: + } + } + runtime.GC() + } + }() + + sum := func(us []uint64) uint64 { + total := uint64(0) + for _, u := range us { + total += u + } + return total + } + + // Populate the first generation. + metrics.Read(samples[0]) + + // Check to make sure that these metrics only grow monotonically. 
+ for gen := 1; gen < 10; gen++ { + metrics.Read(samples[gen%2]) + for i := range samples[gen%2] { + name := samples[gen%2][i].Name + vNew, vOld := samples[gen%2][i].Value, samples[1-(gen%2)][i].Value + + switch vNew.Kind() { + case metrics.KindUint64: + new := vNew.Uint64() + old := vOld.Uint64() + if new < old { + t.Errorf("%s decreased: %d < %d", name, new, old) + } + case metrics.KindFloat64: + new := vNew.Float64() + old := vOld.Float64() + if new < old { + t.Errorf("%s decreased: %f < %f", name, new, old) + } + case metrics.KindFloat64Histogram: + new := sum(vNew.Float64Histogram().Counts) + old := sum(vOld.Float64Histogram().Counts) + if new < old { + t.Errorf("%s counts decreased: %d < %d", name, new, old) + } + } + } + } + close(done) + + wg.Wait() +} + +func withinEpsilon(v1, v2, e float64) bool { + return v2-v2*e <= v1 && v1 <= v2+v2*e +} + +func TestMutexWaitTimeMetric(t *testing.T) { + var sample [1]metrics.Sample + sample[0].Name = "/sync/mutex/wait/total:seconds" + + locks := []locker2{ + new(mutex), + new(rwmutexWrite), + new(rwmutexReadWrite), + new(rwmutexWriteRead), + } + for _, lock := range locks { + t.Run(reflect.TypeOf(lock).Elem().Name(), func(t *testing.T) { + metrics.Read(sample[:]) + before := time.Duration(sample[0].Value.Float64() * 1e9) + + minMutexWaitTime := generateMutexWaitTime(lock) + + metrics.Read(sample[:]) + after := time.Duration(sample[0].Value.Float64() * 1e9) + + if wt := after - before; wt < minMutexWaitTime { + t.Errorf("too little mutex wait time: got %s, want %s", wt, minMutexWaitTime) + } + }) + } +} + +// locker2 represents an API surface of two concurrent goroutines +// locking the same resource, but through different APIs. It's intended +// to abstract over the relationship of two Lock calls or an RLock +// and a Lock call. 
+type locker2 interface { + Lock1() + Unlock1() + Lock2() + Unlock2() +} + +type mutex struct { + mu sync.Mutex +} + +func (m *mutex) Lock1() { m.mu.Lock() } +func (m *mutex) Unlock1() { m.mu.Unlock() } +func (m *mutex) Lock2() { m.mu.Lock() } +func (m *mutex) Unlock2() { m.mu.Unlock() } + +type rwmutexWrite struct { + mu sync.RWMutex +} + +func (m *rwmutexWrite) Lock1() { m.mu.Lock() } +func (m *rwmutexWrite) Unlock1() { m.mu.Unlock() } +func (m *rwmutexWrite) Lock2() { m.mu.Lock() } +func (m *rwmutexWrite) Unlock2() { m.mu.Unlock() } + +type rwmutexReadWrite struct { + mu sync.RWMutex +} + +func (m *rwmutexReadWrite) Lock1() { m.mu.RLock() } +func (m *rwmutexReadWrite) Unlock1() { m.mu.RUnlock() } +func (m *rwmutexReadWrite) Lock2() { m.mu.Lock() } +func (m *rwmutexReadWrite) Unlock2() { m.mu.Unlock() } + +type rwmutexWriteRead struct { + mu sync.RWMutex +} + +func (m *rwmutexWriteRead) Lock1() { m.mu.Lock() } +func (m *rwmutexWriteRead) Unlock1() { m.mu.Unlock() } +func (m *rwmutexWriteRead) Lock2() { m.mu.RLock() } +func (m *rwmutexWriteRead) Unlock2() { m.mu.RUnlock() } + +// generateMutexWaitTime causes a couple of goroutines +// to block a whole bunch of times on a sync.Mutex, returning +// the minimum amount of time that should be visible in the +// /sync/mutex-wait:seconds metric. +func generateMutexWaitTime(mu locker2) time.Duration { + // Set up the runtime to always track casgstatus transitions for metrics. + *runtime.CasGStatusAlwaysTrack = true + + mu.Lock1() + + // Start up a goroutine to wait on the lock. + gc := make(chan *runtime.G) + done := make(chan bool) + go func() { + gc <- runtime.Getg() + + for { + mu.Lock2() + mu.Unlock2() + if <-done { + return + } + } + }() + gp := <-gc + + // Set the block time high enough so that it will always show up, even + // on systems with coarse timer granularity. + const blockTime = 100 * time.Millisecond + + // Make sure the goroutine spawned above actually blocks on the lock. 
+ for { + if runtime.GIsWaitingOnMutex(gp) { + break + } + runtime.Gosched() + } + + // Let some amount of time pass. + time.Sleep(blockTime) + + // Let the other goroutine acquire the lock. + mu.Unlock1() + done <- true + + // Reset flag. + *runtime.CasGStatusAlwaysTrack = false + return blockTime +} diff --git a/src/runtime/mfinal.go b/src/runtime/mfinal.go new file mode 100644 index 0000000..d4d4f1f --- /dev/null +++ b/src/runtime/mfinal.go @@ -0,0 +1,518 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector: finalizers and block profiling. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// finblock is an array of finalizers to be executed. finblocks are +// arranged in a linked list for the finalizer queue. +// +// finblock is allocated from non-GC'd memory, so any heap pointers +// must be specially handled. GC currently assumes that the finalizer +// queue does not grow during marking (but it can shrink). +type finblock struct { + _ sys.NotInHeap + alllink *finblock + next *finblock + cnt uint32 + _ int32 + fin [(_FinBlockSize - 2*goarch.PtrSize - 2*4) / unsafe.Sizeof(finalizer{})]finalizer +} + +var fingStatus atomic.Uint32 + +// finalizer goroutine status. +const ( + fingUninitialized uint32 = iota + fingCreated uint32 = 1 << (iota - 1) + fingRunningFinalizer + fingWait + fingWake +) + +var finlock mutex // protects the following variables +var fing *g // goroutine that runs finalizers +var finq *finblock // list of finalizers that are to be executed +var finc *finblock // cache of free blocks +var finptrmask [_FinBlockSize / goarch.PtrSize / 8]byte + +var allfin *finblock // list of all blocks + +// NOTE: Layout known to queuefinalizer. 
+type finalizer struct { + fn *funcval // function to call (may be a heap pointer) + arg unsafe.Pointer // ptr to object (may be a heap pointer) + nret uintptr // bytes of return values from fn + fint *_type // type of first argument of fn + ot *ptrtype // type of ptr to object (may be a heap pointer) +} + +var finalizer1 = [...]byte{ + // Each Finalizer is 5 words, ptr ptr INT ptr ptr (INT = uintptr here) + // Each byte describes 8 words. + // Need 8 Finalizers described by 5 bytes before pattern repeats: + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // ptr ptr INT ptr ptr + // aka + // + // ptr ptr INT ptr ptr ptr ptr INT + // ptr ptr ptr ptr INT ptr ptr ptr + // ptr INT ptr ptr ptr ptr INT ptr + // ptr ptr ptr INT ptr ptr ptr ptr + // INT ptr ptr ptr ptr INT ptr ptr + // + // Assumptions about Finalizer layout checked below. + 1<<0 | 1<<1 | 0<<2 | 1<<3 | 1<<4 | 1<<5 | 1<<6 | 0<<7, + 1<<0 | 1<<1 | 1<<2 | 1<<3 | 0<<4 | 1<<5 | 1<<6 | 1<<7, + 1<<0 | 0<<1 | 1<<2 | 1<<3 | 1<<4 | 1<<5 | 0<<6 | 1<<7, + 1<<0 | 1<<1 | 1<<2 | 0<<3 | 1<<4 | 1<<5 | 1<<6 | 1<<7, + 0<<0 | 1<<1 | 1<<2 | 1<<3 | 1<<4 | 0<<5 | 1<<6 | 1<<7, +} + +// lockRankMayQueueFinalizer records the lock ranking effects of a +// function that may call queuefinalizer. +func lockRankMayQueueFinalizer() { + lockWithRankMayAcquire(&finlock, getLockRank(&finlock)) +} + +func queuefinalizer(p unsafe.Pointer, fn *funcval, nret uintptr, fint *_type, ot *ptrtype) { + if gcphase != _GCoff { + // Currently we assume that the finalizer queue won't + // grow during marking so we don't have to rescan it + // during mark termination. If we ever need to lift + // this assumption, we can do it by adding the + // necessary barriers to queuefinalizer (which it may + // have automatically). 
+ throw("queuefinalizer during GC") + } + + lock(&finlock) + if finq == nil || finq.cnt == uint32(len(finq.fin)) { + if finc == nil { + finc = (*finblock)(persistentalloc(_FinBlockSize, 0, &memstats.gcMiscSys)) + finc.alllink = allfin + allfin = finc + if finptrmask[0] == 0 { + // Build pointer mask for Finalizer array in block. + // Check assumptions made in finalizer1 array above. + if (unsafe.Sizeof(finalizer{}) != 5*goarch.PtrSize || + unsafe.Offsetof(finalizer{}.fn) != 0 || + unsafe.Offsetof(finalizer{}.arg) != goarch.PtrSize || + unsafe.Offsetof(finalizer{}.nret) != 2*goarch.PtrSize || + unsafe.Offsetof(finalizer{}.fint) != 3*goarch.PtrSize || + unsafe.Offsetof(finalizer{}.ot) != 4*goarch.PtrSize) { + throw("finalizer out of sync") + } + for i := range finptrmask { + finptrmask[i] = finalizer1[i%len(finalizer1)] + } + } + } + block := finc + finc = block.next + block.next = finq + finq = block + } + f := &finq.fin[finq.cnt] + atomic.Xadd(&finq.cnt, +1) // Sync with markroots + f.fn = fn + f.nret = nret + f.fint = fint + f.ot = ot + f.arg = p + unlock(&finlock) + fingStatus.Or(fingWake) +} + +//go:nowritebarrier +func iterate_finq(callback func(*funcval, unsafe.Pointer, uintptr, *_type, *ptrtype)) { + for fb := allfin; fb != nil; fb = fb.alllink { + for i := uint32(0); i < fb.cnt; i++ { + f := &fb.fin[i] + callback(f.fn, f.arg, f.nret, f.fint, f.ot) + } + } +} + +func wakefing() *g { + if ok := fingStatus.CompareAndSwap(fingCreated|fingWait|fingWake, fingCreated); ok { + return fing + } + return nil +} + +func createfing() { + // start the finalizer goroutine exactly once + if fingStatus.Load() == fingUninitialized && fingStatus.CompareAndSwap(fingUninitialized, fingCreated) { + go runfinq() + } +} + +func finalizercommit(gp *g, lock unsafe.Pointer) bool { + unlock((*mutex)(lock)) + // fingStatus should be modified after fing is put into a waiting state + // to avoid waking fing in running state, even if it is about to be parked. 
+ fingStatus.Or(fingWait) + return true +} + +// This is the goroutine that runs all of the finalizers. +func runfinq() { + var ( + frame unsafe.Pointer + framecap uintptr + argRegs int + ) + + gp := getg() + lock(&finlock) + fing = gp + unlock(&finlock) + + for { + lock(&finlock) + fb := finq + finq = nil + if fb == nil { + gopark(finalizercommit, unsafe.Pointer(&finlock), waitReasonFinalizerWait, traceEvGoBlock, 1) + continue + } + argRegs = intArgRegs + unlock(&finlock) + if raceenabled { + racefingo() + } + for fb != nil { + for i := fb.cnt; i > 0; i-- { + f := &fb.fin[i-1] + + var regs abi.RegArgs + // The args may be passed in registers or on stack. Even for + // the register case, we still need the spill slots. + // TODO: revisit if we remove spill slots. + // + // Unfortunately because we can have an arbitrary + // amount of returns and it would be complex to try and + // figure out how many of those can get passed in registers, + // just conservatively assume none of them do. + framesz := unsafe.Sizeof((any)(nil)) + f.nret + if framecap < framesz { + // The frame does not contain pointers interesting for GC, + // all not yet finalized objects are stored in finq. + // If we do not mark it as FlagNoScan, + // the last finalized object is not collected. + frame = mallocgc(framesz, nil, true) + framecap = framesz + } + + if f.fint == nil { + throw("missing type in runfinq") + } + r := frame + if argRegs > 0 { + r = unsafe.Pointer(&regs.Ints) + } else { + // frame is effectively uninitialized + // memory. That means we have to clear + // it before writing to it to avoid + // confusing the write barrier.
+ *(*[2]uintptr)(frame) = [2]uintptr{} + } + switch f.fint.kind & kindMask { + case kindPtr: + // direct use of pointer + *(*unsafe.Pointer)(r) = f.arg + case kindInterface: + ityp := (*interfacetype)(unsafe.Pointer(f.fint)) + // set up with empty interface + (*eface)(r)._type = &f.ot.typ + (*eface)(r).data = f.arg + if len(ityp.mhdr) != 0 { + // convert to interface with methods + // this conversion is guaranteed to succeed - we checked in SetFinalizer + (*iface)(r).tab = assertE2I(ityp, (*eface)(r)._type) + } + default: + throw("bad kind in runfinq") + } + fingStatus.Or(fingRunningFinalizer) + reflectcall(nil, unsafe.Pointer(f.fn), frame, uint32(framesz), uint32(framesz), uint32(framesz), &regs) + fingStatus.And(^fingRunningFinalizer) + + // Drop finalizer queue heap references + // before hiding them from markroot. + // This also ensures these will be + // clear if we reuse the finalizer. + f.fn = nil + f.arg = nil + f.ot = nil + atomic.Store(&fb.cnt, i-1) + } + next := fb.next + lock(&finlock) + fb.next = finc + finc = fb + unlock(&finlock) + fb = next + } + } +} + +// SetFinalizer sets the finalizer associated with obj to the provided +// finalizer function. When the garbage collector finds an unreachable block +// with an associated finalizer, it clears the association and runs +// finalizer(obj) in a separate goroutine. This makes obj reachable again, +// but now without an associated finalizer. Assuming that SetFinalizer +// is not called again, the next time the garbage collector sees +// that obj is unreachable, it will free obj. +// +// SetFinalizer(obj, nil) clears any finalizer associated with obj. +// +// The argument obj must be a pointer to an object allocated by calling +// new, by taking the address of a composite literal, or by taking the +// address of a local variable. +// The argument finalizer must be a function that takes a single argument +// to which obj's type can be assigned, and can have arbitrary ignored return +// values.
If either of these is not true, SetFinalizer may abort the +// program. +// +// Finalizers are run in dependency order: if A points at B, both have +// finalizers, and they are otherwise unreachable, only the finalizer +// for A runs; once A is freed, the finalizer for B can run. +// If a cyclic structure includes a block with a finalizer, that +// cycle is not guaranteed to be garbage collected and the finalizer +// is not guaranteed to run, because there is no ordering that +// respects the dependencies. +// +// The finalizer is scheduled to run at some arbitrary time after the +// program can no longer reach the object to which obj points. +// There is no guarantee that finalizers will run before a program exits, +// so typically they are useful only for releasing non-memory resources +// associated with an object during a long-running program. +// For example, an os.File object could use a finalizer to close the +// associated operating system file descriptor when a program discards +// an os.File without calling Close, but it would be a mistake +// to depend on a finalizer to flush an in-memory I/O buffer such as a +// bufio.Writer, because the buffer would not be flushed at program exit. +// +// It is not guaranteed that a finalizer will run if the size of *obj is +// zero bytes, because it may share same address with other zero-size +// objects in memory. See https://go.dev/ref/spec#Size_and_alignment_guarantees. +// +// It is not guaranteed that a finalizer will run for objects allocated +// in initializers for package-level variables. Such objects may be +// linker-allocated, not heap-allocated. +// +// Note that because finalizers may execute arbitrarily far into the future +// after an object is no longer referenced, the runtime is allowed to perform +// a space-saving optimization that batches objects together in a single +// allocation slot. 
The finalizer for an unreferenced object in such an +// allocation may never run if it always exists in the same batch as a +// referenced object. Typically, this batching only happens for tiny +// (on the order of 16 bytes or less) and pointer-free objects. +// +// A finalizer may run as soon as an object becomes unreachable. +// In order to use finalizers correctly, the program must ensure that +// the object is reachable until it is no longer required. +// Objects stored in global variables, or that can be found by tracing +// pointers from a global variable, are reachable. For other objects, +// pass the object to a call of the KeepAlive function to mark the +// last point in the function where the object must be reachable. +// +// For example, if p points to a struct, such as os.File, that contains +// a file descriptor d, and p has a finalizer that closes that file +// descriptor, and if the last use of p in a function is a call to +// syscall.Write(p.d, buf, size), then p may be unreachable as soon as +// the program enters syscall.Write. The finalizer may run at that moment, +// closing p.d, causing syscall.Write to fail because it is writing to +// a closed file descriptor (or, worse, to an entirely different +// file descriptor opened by a different goroutine). To avoid this problem, +// call KeepAlive(p) after the call to syscall.Write. +// +// A single goroutine runs all finalizers for a program, sequentially. +// If a finalizer must run for a long time, it should do so by starting +// a new goroutine. +// +// In the terminology of the Go memory model, a call +// SetFinalizer(x, f) “synchronizes before” the finalization call f(x). +// However, there is no guarantee that KeepAlive(x) or any other use of x +// “synchronizes before” f(x), so in general a finalizer should use a mutex +// or other synchronization mechanism if it needs to access mutable state in x. 
+// For example, consider a finalizer that inspects a mutable field in x +// that is modified from time to time in the main program before x +// becomes unreachable and the finalizer is invoked. +// The modifications in the main program and the inspection in the finalizer +// need to use appropriate synchronization, such as mutexes or atomic updates, +// to avoid read-write races. +func SetFinalizer(obj any, finalizer any) { + if debug.sbrk != 0 { + // debug.sbrk never frees memory, so no finalizers run + // (and we don't have the data structures to record them). + return + } + e := efaceOf(&obj) + etyp := e._type + if etyp == nil { + throw("runtime.SetFinalizer: first argument is nil") + } + if etyp.kind&kindMask != kindPtr { + throw("runtime.SetFinalizer: first argument is " + etyp.string() + ", not pointer") + } + ot := (*ptrtype)(unsafe.Pointer(etyp)) + if ot.elem == nil { + throw("nil elem type!") + } + + if inUserArenaChunk(uintptr(e.data)) { + // Arena-allocated objects are not eligible for finalizers. + throw("runtime.SetFinalizer: first argument was allocated into an arena") + } + + // find the containing object + base, _, _ := findObject(uintptr(e.data), 0, 0) + + if base == 0 { + // 0-length objects are okay. + if e.data == unsafe.Pointer(&zerobase) { + return + } + + // Global initializers might be linker-allocated. + // var Foo = &Object{} + // func main() { + // runtime.SetFinalizer(Foo, nil) + // } + // The relevant segments are: noptrdata, data, bss, noptrbss. + // We cannot assume they are in any order or even contiguous, + // due to external linking. 
+ for datap := &firstmoduledata; datap != nil; datap = datap.next { + if datap.noptrdata <= uintptr(e.data) && uintptr(e.data) < datap.enoptrdata || + datap.data <= uintptr(e.data) && uintptr(e.data) < datap.edata || + datap.bss <= uintptr(e.data) && uintptr(e.data) < datap.ebss || + datap.noptrbss <= uintptr(e.data) && uintptr(e.data) < datap.enoptrbss { + return + } + } + throw("runtime.SetFinalizer: pointer not in allocated block") + } + + if uintptr(e.data) != base { + // As an implementation detail we allow to set finalizers for an inner byte + // of an object if it could come from tiny alloc (see mallocgc for details). + if ot.elem == nil || ot.elem.ptrdata != 0 || ot.elem.size >= maxTinySize { + throw("runtime.SetFinalizer: pointer not at beginning of allocated block") + } + } + + f := efaceOf(&finalizer) + ftyp := f._type + if ftyp == nil { + // switch to system stack and remove finalizer + systemstack(func() { + removefinalizer(e.data) + }) + return + } + + if ftyp.kind&kindMask != kindFunc { + throw("runtime.SetFinalizer: second argument is " + ftyp.string() + ", not a function") + } + ft := (*functype)(unsafe.Pointer(ftyp)) + if ft.dotdotdot() { + throw("runtime.SetFinalizer: cannot pass " + etyp.string() + " to finalizer " + ftyp.string() + " because dotdotdot") + } + if ft.inCount != 1 { + throw("runtime.SetFinalizer: cannot pass " + etyp.string() + " to finalizer " + ftyp.string()) + } + fint := ft.in()[0] + switch { + case fint == etyp: + // ok - same type + goto okarg + case fint.kind&kindMask == kindPtr: + if (fint.uncommon() == nil || etyp.uncommon() == nil) && (*ptrtype)(unsafe.Pointer(fint)).elem == ot.elem { + // ok - not same type, but both pointers, + // one or the other is unnamed, and same element type, so assignable. 
+ goto okarg + } + case fint.kind&kindMask == kindInterface: + ityp := (*interfacetype)(unsafe.Pointer(fint)) + if len(ityp.mhdr) == 0 { + // ok - satisfies empty interface + goto okarg + } + if iface := assertE2I2(ityp, *efaceOf(&obj)); iface.tab != nil { + goto okarg + } + } + throw("runtime.SetFinalizer: cannot pass " + etyp.string() + " to finalizer " + ftyp.string()) +okarg: + // compute size needed for return parameters + nret := uintptr(0) + for _, t := range ft.out() { + nret = alignUp(nret, uintptr(t.align)) + uintptr(t.size) + } + nret = alignUp(nret, goarch.PtrSize) + + // make sure we have a finalizer goroutine + createfing() + + systemstack(func() { + if !addfinalizer(e.data, (*funcval)(f.data), nret, fint, ot) { + throw("runtime.SetFinalizer: finalizer already set") + } + }) +} + +// Mark KeepAlive as noinline so that it is easily detectable as an intrinsic. +// +//go:noinline + +// KeepAlive marks its argument as currently reachable. +// This ensures that the object is not freed, and its finalizer is not run, +// before the point in the program where KeepAlive is called. +// +// A very simplified example showing where KeepAlive is required: +// +// type File struct { d int } +// d, err := syscall.Open("/file/path", syscall.O_RDONLY, 0) +// // ... do something if err != nil ... +// p := &File{d} +// runtime.SetFinalizer(p, func(p *File) { syscall.Close(p.d) }) +// var buf [10]byte +// n, err := syscall.Read(p.d, buf[:]) +// // Ensure p is not finalized until Read returns. +// runtime.KeepAlive(p) +// // No more uses of p after this point. +// +// Without the KeepAlive call, the finalizer could run at the start of +// syscall.Read, closing the file descriptor before syscall.Read makes +// the actual system call. +// +// Note: KeepAlive should only be used to prevent finalizers from +// running prematurely. In particular, when used with unsafe.Pointer, +// the rules for valid uses of unsafe.Pointer still apply. 
+func KeepAlive(x any) { + // Introduce a use of x that the compiler can't eliminate. + // This makes sure x is alive on entry. We need x to be alive + // on entry for "defer runtime.KeepAlive(x)"; see issue 21402. + if cgoAlwaysFalse { + println(x) + } +} diff --git a/src/runtime/mfinal_test.go b/src/runtime/mfinal_test.go new file mode 100644 index 0000000..61d625a --- /dev/null +++ b/src/runtime/mfinal_test.go @@ -0,0 +1,257 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime" + "testing" + "time" + "unsafe" +) + +type Tintptr *int // assignable to *int +type Tint int // *Tint implements Tinter, interface{} + +func (t *Tint) m() {} + +type Tinter interface { + m() +} + +func TestFinalizerType(t *testing.T) { + if runtime.GOARCH != "amd64" { + t.Skipf("Skipping on non-amd64 machine") + } + + ch := make(chan bool, 10) + finalize := func(x *int) { + if *x != 97531 { + t.Errorf("finalizer %d, want %d", *x, 97531) + } + ch <- true + } + + var finalizerTests = []struct { + convert func(*int) any + finalizer any + }{ + {func(x *int) any { return x }, func(v *int) { finalize(v) }}, + {func(x *int) any { return Tintptr(x) }, func(v Tintptr) { finalize(v) }}, + {func(x *int) any { return Tintptr(x) }, func(v *int) { finalize(v) }}, + {func(x *int) any { return (*Tint)(x) }, func(v *Tint) { finalize((*int)(v)) }}, + {func(x *int) any { return (*Tint)(x) }, func(v Tinter) { finalize((*int)(v.(*Tint))) }}, + // Test case for argument spill slot. + // If the spill slot was not counted for the frame size, it will (incorrectly) choose + // call32 as the result has (exactly) 32 bytes. When the argument actually spills, + // it clobbers the caller's frame (likely the return PC). 
+ {func(x *int) any { return x }, func(v any) [4]int64 { + print() // force spill + finalize(v.(*int)) + return [4]int64{} + }}, + } + + for _, tt := range finalizerTests { + done := make(chan bool, 1) + go func() { + // allocate struct with pointer to avoid hitting tinyalloc. + // Otherwise we can't be sure when the allocation will + // be freed. + type T struct { + v int + p unsafe.Pointer + } + v := &new(T).v + *v = 97531 + runtime.SetFinalizer(tt.convert(v), tt.finalizer) + v = nil + done <- true + }() + <-done + runtime.GC() + <-ch + } +} + +type bigValue struct { + fill uint64 + it bool + up string +} + +func TestFinalizerInterfaceBig(t *testing.T) { + if runtime.GOARCH != "amd64" { + t.Skipf("Skipping on non-amd64 machine") + } + ch := make(chan bool) + done := make(chan bool, 1) + go func() { + v := &bigValue{0xDEADBEEFDEADBEEF, true, "It matters not how strait the gate"} + old := *v + runtime.SetFinalizer(v, func(v any) { + i, ok := v.(*bigValue) + if !ok { + t.Errorf("finalizer called with type %T, want *bigValue", v) + } + if *i != old { + t.Errorf("finalizer called with %+v, want %+v", *i, old) + } + close(ch) + }) + v = nil + done <- true + }() + <-done + runtime.GC() + <-ch +} + +func fin(v *int) { +} + +// Verify we don't crash at least. golang.org/issue/6857 +func TestFinalizerZeroSizedStruct(t *testing.T) { + type Z struct{} + z := new(Z) + runtime.SetFinalizer(z, func(*Z) {}) +} + +func BenchmarkFinalizer(b *testing.B) { + const Batch = 1000 + b.RunParallel(func(pb *testing.PB) { + var data [Batch]*int + for i := 0; i < Batch; i++ { + data[i] = new(int) + } + for pb.Next() { + for i := 0; i < Batch; i++ { + runtime.SetFinalizer(data[i], fin) + } + for i := 0; i < Batch; i++ { + runtime.SetFinalizer(data[i], nil) + } + } + }) +} + +func BenchmarkFinalizerRun(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + v := new(int) + runtime.SetFinalizer(v, fin) + } + }) +} + +// One chunk must be exactly one sizeclass in size. 
+// It should be a sizeclass not used much by others, so we +// have a greater chance of finding adjacent ones. +// size class 19: 320 byte objects, 25 per page, 1 page alloc at a time +const objsize = 320 + +type objtype [objsize]byte + +func adjChunks() (*objtype, *objtype) { + var s []*objtype + + for { + c := new(objtype) + for _, d := range s { + if uintptr(unsafe.Pointer(c))+unsafe.Sizeof(*c) == uintptr(unsafe.Pointer(d)) { + return c, d + } + if uintptr(unsafe.Pointer(d))+unsafe.Sizeof(*c) == uintptr(unsafe.Pointer(c)) { + return d, c + } + } + s = append(s, c) + } +} + +// Make sure an empty slice on the stack doesn't pin the next object in memory. +func TestEmptySlice(t *testing.T) { + x, y := adjChunks() + + // the pointer inside xs points to y. + xs := x[objsize:] // change objsize to objsize-1 and the test passes + + fin := make(chan bool, 1) + runtime.SetFinalizer(y, func(z *objtype) { fin <- true }) + runtime.GC() + <-fin + xsglobal = xs // keep empty slice alive until here +} + +var xsglobal []byte + +func adjStringChunk() (string, *objtype) { + b := make([]byte, objsize) + for { + s := string(b) + t := new(objtype) + p := *(*uintptr)(unsafe.Pointer(&s)) + q := uintptr(unsafe.Pointer(t)) + if p+objsize == q { + return s, t + } + } +} + +// Make sure an empty string on the stack doesn't pin the next object in memory. +func TestEmptyString(t *testing.T) { + x, y := adjStringChunk() + + ss := x[objsize:] // change objsize to objsize-1 and the test passes + fin := make(chan bool, 1) + // set finalizer on string contents of y + runtime.SetFinalizer(y, func(z *objtype) { fin <- true }) + runtime.GC() + <-fin + ssglobal = ss // keep 0-length string live until here +} + +var ssglobal string + +// Test for issue 7656. 
+func TestFinalizerOnGlobal(t *testing.T) { + runtime.SetFinalizer(Foo1, func(p *Object1) {}) + runtime.SetFinalizer(Foo2, func(p *Object2) {}) + runtime.SetFinalizer(Foo1, nil) + runtime.SetFinalizer(Foo2, nil) +} + +type Object1 struct { + Something []byte +} + +type Object2 struct { + Something byte +} + +var ( + Foo2 = &Object2{} + Foo1 = &Object1{} +) + +func TestDeferKeepAlive(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + + // See issue 21402. + t.Parallel() + type T *int // needs to be a pointer base type to avoid tinyalloc and its never-finalized behavior. + x := new(T) + finRun := false + runtime.SetFinalizer(x, func(x *T) { + finRun = true + }) + defer runtime.KeepAlive(x) + runtime.GC() + time.Sleep(time.Second) + if finRun { + t.Errorf("finalizer ran prematurely") + } +} diff --git a/src/runtime/mfixalloc.go b/src/runtime/mfixalloc.go new file mode 100644 index 0000000..8788d95 --- /dev/null +++ b/src/runtime/mfixalloc.go @@ -0,0 +1,111 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Fixed-size object allocator. Returned memory is not zeroed. +// +// See malloc.go for overview. + +package runtime + +import ( + "runtime/internal/sys" + "unsafe" +) + +// FixAlloc is a simple free-list allocator for fixed size objects. +// Malloc uses a FixAlloc wrapped around sysAlloc to manage its +// mcache and mspan objects. +// +// Memory returned by fixalloc.alloc is zeroed by default, but the +// caller may take responsibility for zeroing allocations by setting +// the zero flag to false. This is only safe if the memory never +// contains heap pointers. +// +// The caller is responsible for locking around FixAlloc calls. +// Callers can keep state in the object but the first word is +// smashed by freeing and reallocating. +// +// Consider marking fixalloc'd types not in heap by embedding +// runtime/internal/sys.NotInHeap. 
+type fixalloc struct { + size uintptr + first func(arg, p unsafe.Pointer) // called first time p is returned + arg unsafe.Pointer + list *mlink + chunk uintptr // use uintptr instead of unsafe.Pointer to avoid write barriers + nchunk uint32 // bytes remaining in current chunk + nalloc uint32 // size of new chunks in bytes + inuse uintptr // in-use bytes now + stat *sysMemStat + zero bool // zero allocations +} + +// A generic linked list of blocks. (Typically the block is bigger than sizeof(MLink).) +// Since assignments to mlink.next will result in a write barrier being performed +// this cannot be used by some of the internal GC structures. For example when +// the sweeper is placing an unmarked object on the free list it does not want the +// write barrier to be called since that could result in the object being reachable. +type mlink struct { + _ sys.NotInHeap + next *mlink +} + +// Initialize f to allocate objects of the given size, +// using the allocator to obtain chunks of memory. 
+func (f *fixalloc) init(size uintptr, first func(arg, p unsafe.Pointer), arg unsafe.Pointer, stat *sysMemStat) { + if size > _FixAllocChunk { + throw("runtime: fixalloc size too large") + } + if min := unsafe.Sizeof(mlink{}); size < min { + size = min + } + + f.size = size + f.first = first + f.arg = arg + f.list = nil + f.chunk = 0 + f.nchunk = 0 + f.nalloc = uint32(_FixAllocChunk / size * size) // Round _FixAllocChunk down to an exact multiple of size to eliminate tail waste + f.inuse = 0 + f.stat = stat + f.zero = true +} + +func (f *fixalloc) alloc() unsafe.Pointer { + if f.size == 0 { + print("runtime: use of FixAlloc_Alloc before FixAlloc_Init\n") + throw("runtime: internal error") + } + + if f.list != nil { + v := unsafe.Pointer(f.list) + f.list = f.list.next + f.inuse += f.size + if f.zero { + memclrNoHeapPointers(v, f.size) + } + return v + } + if uintptr(f.nchunk) < f.size { + f.chunk = uintptr(persistentalloc(uintptr(f.nalloc), 0, f.stat)) + f.nchunk = f.nalloc + } + + v := unsafe.Pointer(f.chunk) + if f.first != nil { + f.first(f.arg, v) + } + f.chunk = f.chunk + f.size + f.nchunk -= uint32(f.size) + f.inuse += f.size + return v +} + +func (f *fixalloc) free(p unsafe.Pointer) { + f.inuse -= f.size + v := (*mlink)(p) + v.next = f.list + f.list = v +} diff --git a/src/runtime/mgc.go b/src/runtime/mgc.go new file mode 100644 index 0000000..1b05707 --- /dev/null +++ b/src/runtime/mgc.go @@ -0,0 +1,1801 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector (GC). +// +// The GC runs concurrently with mutator threads, is type accurate (aka precise), allows multiple +// GC threads to run in parallel. It is a concurrent mark and sweep that uses a write barrier. It is +// non-generational and non-compacting.
Allocation is done using size segregated per P allocation +// areas to minimize fragmentation while eliminating locks in the common case. +// +// The algorithm decomposes into several steps. +// This is a high level description of the algorithm being used. For an overview of GC a good +// place to start is Richard Jones' gchandbook.org. +// +// The algorithm's intellectual heritage includes Dijkstra's on-the-fly algorithm, see +// Edsger W. Dijkstra, Leslie Lamport, A. J. Martin, C. S. Scholten, and E. F. M. Steffens. 1978. +// On-the-fly garbage collection: an exercise in cooperation. Commun. ACM 21, 11 (November 1978), +// 966-975. +// For journal quality proofs that these steps are complete, correct, and terminate see +// Hudson, R., and Moss, J.E.B. Copying Garbage Collection without stopping the world. +// Concurrency and Computation: Practice and Experience 15(3-5), 2003. +// +// 1. GC performs sweep termination. +// +// a. Stop the world. This causes all Ps to reach a GC safe-point. +// +// b. Sweep any unswept spans. There will only be unswept spans if +// this GC cycle was forced before the expected time. +// +// 2. GC performs the mark phase. +// +// a. Prepare for the mark phase by setting gcphase to _GCmark +// (from _GCoff), enabling the write barrier, enabling mutator +// assists, and enqueueing root mark jobs. No objects may be +// scanned until all Ps have enabled the write barrier, which is +// accomplished using STW. +// +// b. Start the world. From this point, GC work is done by mark +// workers started by the scheduler and by assists performed as +// part of allocation. The write barrier shades both the +// overwritten pointer and the new pointer value for any pointer +// writes (see mbarrier.go for details). Newly allocated objects +// are immediately marked black. +// +// c. GC performs root marking jobs. This includes scanning all +// stacks, shading all globals, and shading any heap pointers in +// off-heap runtime data structures. 
Scanning a stack stops a +// goroutine, shades any pointers found on its stack, and then +// resumes the goroutine. +// +// d. GC drains the work queue of grey objects, scanning each grey +// object to black and shading all pointers found in the object +// (which in turn may add those pointers to the work queue). +// +// e. Because GC work is spread across local caches, GC uses a +// distributed termination algorithm to detect when there are no +// more root marking jobs or grey objects (see gcMarkDone). At this +// point, GC transitions to mark termination. +// +// 3. GC performs mark termination. +// +// a. Stop the world. +// +// b. Set gcphase to _GCmarktermination, and disable workers and +// assists. +// +// c. Perform housekeeping like flushing mcaches. +// +// 4. GC performs the sweep phase. +// +// a. Prepare for the sweep phase by setting gcphase to _GCoff, +// setting up sweep state and disabling the write barrier. +// +// b. Start the world. From this point on, newly allocated objects +// are white, and allocating sweeps spans before use if necessary. +// +// c. GC does concurrent sweeping in the background and in response +// to allocation. See description below. +// +// 5. When sufficient allocation has taken place, replay the sequence +// starting with 1 above. See discussion of GC rate below. + +// Concurrent sweep. +// +// The sweep phase proceeds concurrently with normal program execution. +// The heap is swept span-by-span both lazily (when a goroutine needs another span) +// and concurrently in a background goroutine (this helps programs that are not CPU bound). +// At the end of STW mark termination all spans are marked as "needs sweeping". +// +// The background sweeper goroutine simply sweeps spans one-by-one. +// +// To avoid requesting more OS memory while there are unswept spans, when a +// goroutine needs another span, it first attempts to reclaim that much memory +// by sweeping. 
When a goroutine needs to allocate a new small-object span, it +// sweeps small-object spans for the same object size until it frees at least +// one object. When a goroutine needs to allocate large-object span from heap, +// it sweeps spans until it frees at least that many pages into heap. There is +// one case where this may not suffice: if a goroutine sweeps and frees two +// nonadjacent one-page spans to the heap, it will allocate a new two-page +// span, but there can still be other one-page unswept spans which could be +// combined into a two-page span. +// +// It's critical to ensure that no operations proceed on unswept spans (that would corrupt +// mark bits in GC bitmap). During GC all mcaches are flushed into the central cache, +// so they are empty. When a goroutine grabs a new span into mcache, it sweeps it. +// When a goroutine explicitly frees an object or sets a finalizer, it ensures that +// the span is swept (either by sweeping it, or by waiting for the concurrent sweep to finish). +// The finalizer goroutine is kicked off only when all spans are swept. +// When the next GC starts, it sweeps all not-yet-swept spans (if any). + +// GC rate. +// Next GC is after we've allocated an extra amount of memory proportional to +// the amount already in use. The proportion is controlled by GOGC environment variable +// (100 by default). If GOGC=100 and we're using 4M, we'll GC again when we get to 8M +// (this mark is computed by the gcController.heapGoal method). This keeps the GC cost in +// linear proportion to the allocation cost. Adjusting GOGC just changes the linear constant +// (and also the amount of extra memory used). + +// Oblets +// +// In order to prevent long pauses while scanning large objects and to +// improve parallelism, the garbage collector breaks up scan jobs for +// objects larger than maxObletBytes into "oblets" of at most +// maxObletBytes. 
When scanning encounters the beginning of a large +// object, it scans only the first oblet and enqueues the remaining +// oblets as new scan jobs. + +package runtime + +import ( + "internal/cpu" + "runtime/internal/atomic" + "unsafe" +) + +const ( + _DebugGC = 0 + _ConcurrentSweep = true + _FinBlockSize = 4 * 1024 + + // debugScanConservative enables debug logging for stack + // frames that are scanned conservatively. + debugScanConservative = false + + // sweepMinHeapDistance is a lower bound on the heap distance + // (in bytes) reserved for concurrent sweeping between GC + // cycles. + sweepMinHeapDistance = 1024 * 1024 +) + +func gcinit() { + if unsafe.Sizeof(workbuf{}) != _WorkbufSize { + throw("size of Workbuf is suboptimal") + } + // No sweep on the first cycle. + sweep.active.state.Store(sweepDrainedMask) + + // Initialize GC pacer state. + // Use the environment variable GOGC for the initial gcPercent value. + // Use the environment variable GOMEMLIMIT for the initial memoryLimit value. + gcController.init(readGOGC(), readGOMEMLIMIT()) + + work.startSema = 1 + work.markDoneSema = 1 + lockInit(&work.sweepWaiters.lock, lockRankSweepWaiters) + lockInit(&work.assistQueue.lock, lockRankAssistQueue) + lockInit(&work.wbufSpans.lock, lockRankWbufSpans) +} + +// gcenable is called after the bulk of the runtime initialization, +// just before we're about to start letting user code run. +// It kicks off the background sweeper goroutine, the background +// scavenger goroutine, and enables GC. +func gcenable() { + // Kick off sweeping and scavenging. + c := make(chan int, 2) + go bgsweep(c) + go bgscavenge(c) + <-c + <-c + memstats.enablegc = true // now that runtime is initialized, GC is okay +} + +// Garbage collector phase. +// Indicates to write barrier and synchronization task to perform. +var gcphase uint32 + +// The compiler knows about this variable. +// If you change it, you must change builtin/runtime.go, too. 
+// If you change the first four bytes, you must also change the write +// barrier insertion code. +var writeBarrier struct { + enabled bool // compiler emits a check of this before calling write barrier + pad [3]byte // compiler uses 32-bit load for "enabled" field + needed bool // whether we need a write barrier for current GC phase + cgo bool // whether we need a write barrier for a cgo check + alignme uint64 // guarantee alignment so that compiler can use a 32 or 64-bit load +} + +// gcBlackenEnabled is 1 if mutator assists and background mark +// workers are allowed to blacken objects. This must only be set when +// gcphase == _GCmark. +var gcBlackenEnabled uint32 + +const ( + _GCoff = iota // GC not running; sweeping in background, write barrier disabled + _GCmark // GC marking roots and workbufs: allocate black, write barrier ENABLED + _GCmarktermination // GC mark termination: allocate black, P's help GC, write barrier ENABLED +) + +//go:nosplit +func setGCPhase(x uint32) { + atomic.Store(&gcphase, x) + writeBarrier.needed = gcphase == _GCmark || gcphase == _GCmarktermination + writeBarrier.enabled = writeBarrier.needed || writeBarrier.cgo +} + +// gcMarkWorkerMode represents the mode that a concurrent mark worker +// should operate in. +// +// Concurrent marking happens through four different mechanisms. One +// is mutator assists, which happen in response to allocations and are +// not scheduled. The other three are variations in the per-P mark +// workers and are distinguished by gcMarkWorkerMode. +type gcMarkWorkerMode int + +const ( + // gcMarkWorkerNotWorker indicates that the next scheduled G is not + // starting work and the mode should be ignored. + gcMarkWorkerNotWorker gcMarkWorkerMode = iota + + // gcMarkWorkerDedicatedMode indicates that the P of a mark + // worker is dedicated to running that mark worker. The mark + // worker should run without preemption. 
+ gcMarkWorkerDedicatedMode + + // gcMarkWorkerFractionalMode indicates that a P is currently + // running the "fractional" mark worker. The fractional worker + // is necessary when GOMAXPROCS*gcBackgroundUtilization is not + // an integer and using only dedicated workers would result in + // utilization too far from the target of gcBackgroundUtilization. + // The fractional worker should run until it is preempted and + // will be scheduled to pick up the fractional part of + // GOMAXPROCS*gcBackgroundUtilization. + gcMarkWorkerFractionalMode + + // gcMarkWorkerIdleMode indicates that a P is running the mark + // worker because it has nothing else to do. The idle worker + // should run until it is preempted and account its time + // against gcController.idleMarkTime. + gcMarkWorkerIdleMode +) + +// gcMarkWorkerModeStrings are the string labels of gcMarkWorkerModes +// to use in execution traces. +var gcMarkWorkerModeStrings = [...]string{ + "Not worker", + "GC (dedicated)", + "GC (fractional)", + "GC (idle)", +} + +// pollFractionalWorkerExit reports whether a fractional mark worker +// should self-preempt. It assumes it is called from the fractional +// worker. +func pollFractionalWorkerExit() bool { + // This should be kept in sync with the fractional worker + // scheduler logic in findRunnableGCWorker. + now := nanotime() + delta := now - gcController.markStartTime + if delta <= 0 { + return true + } + p := getg().m.p.ptr() + selfTime := p.gcFractionalMarkTime + (now - p.gcMarkWorkerStartTime) + // Add some slack to the utilization goal so that the + // fractional worker isn't behind again the instant it exits.
+ return float64(selfTime)/float64(delta) > 1.2*gcController.fractionalUtilizationGoal +} + +var work workType + +type workType struct { + full lfstack // lock-free list of full blocks workbuf + empty lfstack // lock-free list of empty blocks workbuf + pad0 cpu.CacheLinePad // prevents false-sharing between full/empty and nproc/nwait + + wbufSpans struct { + lock mutex + // free is a list of spans dedicated to workbufs, but + // that don't currently contain any workbufs. + free mSpanList + // busy is a list of all spans containing workbufs on + // one of the workbuf lists. + busy mSpanList + } + + // Restore 64-bit alignment on 32-bit. + _ uint32 + + // bytesMarked is the number of bytes marked this cycle. This + // includes bytes blackened in scanned objects, noscan objects + // that go straight to black, and permagrey objects scanned by + // markroot during the concurrent scan phase. This is updated + // atomically during the cycle. Updates may be batched + // arbitrarily, since the value is only read at the end of the + // cycle. + // + // Because of benign races during marking, this number may not + // be the exact number of marked bytes, but it should be very + // close. + // + // Put this field here because it needs 64-bit atomic access + // (and thus 8-byte alignment even on 32-bit architectures). + bytesMarked uint64 + + markrootNext uint32 // next markroot job + markrootJobs uint32 // number of markroot jobs + + nproc uint32 + tstart int64 + nwait uint32 + + // Number of roots of various root types. Set by gcMarkRootPrepare. + // + // nStackRoots == len(stackRoots), but we have nStackRoots for + // consistency. + nDataRoots, nBSSRoots, nSpanRoots, nStackRoots int + + // Base indexes of each root type. Set by gcMarkRootPrepare. + baseData, baseBSS, baseSpans, baseStacks, baseEnd uint32 + + // stackRoots is a snapshot of all of the Gs that existed + // before the beginning of concurrent marking. 
The backing + // store of this must not be modified because it might be + // shared with allgs. + stackRoots []*g + + // Each type of GC state transition is protected by a lock. + // Since multiple threads can simultaneously detect the state + // transition condition, any thread that detects a transition + // condition must acquire the appropriate transition lock, + // re-check the transition condition and return if it no + // longer holds or perform the transition if it does. + // Likewise, any transition must invalidate the transition + // condition before releasing the lock. This ensures that each + // transition is performed by exactly one thread and threads + // that need the transition to happen block until it has + // happened. + // + // startSema protects the transition from "off" to mark or + // mark termination. + startSema uint32 + // markDoneSema protects transitions from mark to mark termination. + markDoneSema uint32 + + bgMarkReady note // signal background mark worker has started + bgMarkDone uint32 // cas to 1 when at a background mark completion point + // Background mark completion signaling + + // mode is the concurrency mode of the current GC cycle. + mode gcMode + + // userForced indicates the current GC cycle was forced by an + // explicit user call. + userForced bool + + // initialHeapLive is the value of gcController.heapLive at the + // beginning of this GC cycle. + initialHeapLive uint64 + + // assistQueue is a queue of assists that are blocked because + // there was neither enough credit to steal nor enough work to + // do. + assistQueue struct { + lock mutex + q gQueue + } + + // sweepWaiters is a list of blocked goroutines to wake when + // we transition from mark termination to sweep. + sweepWaiters struct { + lock mutex + list gList + } + + // cycles is the number of completed GC cycles, where a GC + // cycle is sweep termination, mark, mark termination, and + // sweep.
This differs from memstats.numgc, which is + // incremented at mark termination. + cycles atomic.Uint32 + + // Timing/utilization stats for this cycle. + stwprocs, maxprocs int32 + tSweepTerm, tMark, tMarkTerm, tEnd int64 // nanotime() of phase start + + pauseNS int64 // total STW time this cycle + pauseStart int64 // nanotime() of last STW + + // debug.gctrace heap sizes for this cycle. + heap0, heap1, heap2 uint64 + + // Cumulative estimated CPU usage. + cpuStats +} + +// GC runs a garbage collection and blocks the caller until the +// garbage collection is complete. It may also block the entire +// program. +func GC() { + // We consider a cycle to be: sweep termination, mark, mark + // termination, and sweep. This function shouldn't return + // until a full cycle has been completed, from beginning to + // end. Hence, we always want to finish up the current cycle + // and start a new one. That means: + // + // 1. In sweep termination, mark, or mark termination of cycle + // N, wait until mark termination N completes and transitions + // to sweep N. + // + // 2. In sweep N, help with sweep N. + // + // At this point we can begin a full cycle N+1. + // + // 3. Trigger cycle N+1 by starting sweep termination N+1. + // + // 4. Wait for mark termination N+1 to complete. + // + // 5. Help with sweep N+1 until it's done. + // + // This all has to be written to deal with the fact that the + // GC may move ahead on its own. For example, when we block + // until mark termination N, we may wake up in cycle N+2. + + // Wait until the current sweep termination, mark, and mark + // termination complete. + n := work.cycles.Load() + gcWaitOnMark(n) + + // We're now in sweep N or later. Trigger GC cycle N+1, which + // will first finish sweep N if necessary and then enter sweep + // termination N+1. + gcStart(gcTrigger{kind: gcTriggerCycle, n: n + 1}) + + // Wait for mark termination N+1 to complete. + gcWaitOnMark(n + 1) + + // Finish sweep N+1 before returning. 
We do this both to + // complete the cycle and because runtime.GC() is often used + // as part of tests and benchmarks to get the system into a + // relatively stable and isolated state. + for work.cycles.Load() == n+1 && sweepone() != ^uintptr(0) { + sweep.nbgsweep++ + Gosched() + } + + // Callers may assume that the heap profile reflects the + // just-completed cycle when this returns (historically this + // happened because this was a STW GC), but right now the + // profile still reflects mark termination N, not N+1. + // + // As soon as all of the sweep frees from cycle N+1 are done, + // we can go ahead and publish the heap profile. + // + // First, wait for sweeping to finish. (We know there are no + // more spans on the sweep queue, but we may be concurrently + // sweeping spans, so we have to wait.) + for work.cycles.Load() == n+1 && !isSweepDone() { + Gosched() + } + + // Now we're really done with sweeping, so we can publish the + // stable heap profile. Only do this if we haven't already hit + // another mark termination. + mp := acquirem() + cycle := work.cycles.Load() + if cycle == n+1 || (gcphase == _GCmark && cycle == n+2) { + mProf_PostSweep() + } + releasem(mp) +} + +// gcWaitOnMark blocks until GC finishes the Nth mark phase. If GC has +// already completed this mark phase, it returns immediately. +func gcWaitOnMark(n uint32) { + for { + // Disable phase transitions. + lock(&work.sweepWaiters.lock) + nMarks := work.cycles.Load() + if gcphase != _GCmark { + // We've already completed this cycle's mark. + nMarks++ + } + if nMarks > n { + // We're done. + unlock(&work.sweepWaiters.lock) + return + } + + // Wait until sweep termination, mark, and mark + // termination of cycle N complete. + work.sweepWaiters.list.push(getg()) + goparkunlock(&work.sweepWaiters.lock, waitReasonWaitForGCCycle, traceEvGoBlock, 1) + } +} + +// gcMode indicates how concurrent a GC cycle should be. 
+type gcMode int + +const ( + gcBackgroundMode gcMode = iota // concurrent GC and sweep + gcForceMode // stop-the-world GC now, concurrent sweep + gcForceBlockMode // stop-the-world GC now and STW sweep (forced by user) +) + +// A gcTrigger is a predicate for starting a GC cycle. Specifically, +// it is an exit condition for the _GCoff phase. +type gcTrigger struct { + kind gcTriggerKind + now int64 // gcTriggerTime: current time + n uint32 // gcTriggerCycle: cycle number to start +} + +type gcTriggerKind int + +const ( + // gcTriggerHeap indicates that a cycle should be started when + // the heap size reaches the trigger heap size computed by the + // controller. + gcTriggerHeap gcTriggerKind = iota + + // gcTriggerTime indicates that a cycle should be started when + // it's been more than forcegcperiod nanoseconds since the + // previous GC cycle. + gcTriggerTime + + // gcTriggerCycle indicates that a cycle should be started if + // we have not yet started cycle number gcTrigger.n (relative + // to work.cycles). + gcTriggerCycle +) + +// test reports whether the trigger condition is satisfied, meaning +// that the exit condition for the _GCoff phase has been met. The exit +// condition should be tested when allocating. +func (t gcTrigger) test() bool { + if !memstats.enablegc || panicking.Load() != 0 || gcphase != _GCoff { + return false + } + switch t.kind { + case gcTriggerHeap: + // Non-atomic access to gcController.heapLive for performance. If + // we are going to trigger on this, this thread just + // atomically wrote gcController.heapLive anyway and we'll see our + // own write. + trigger, _ := gcController.trigger() + return gcController.heapLive.Load() >= trigger + case gcTriggerTime: + if gcController.gcPercent.Load() < 0 { + return false + } + lastgc := int64(atomic.Load64(&memstats.last_gc_nanotime)) + return lastgc != 0 && t.now-lastgc > forcegcperiod + case gcTriggerCycle: + // t.n > work.cycles, but accounting for wraparound. 
+ return int32(t.n-work.cycles.Load()) > 0 + } + return true +} + +// gcStart starts the GC. It transitions from _GCoff to _GCmark (if +// debug.gcstoptheworld == 0) or performs all of GC (if +// debug.gcstoptheworld != 0). +// +// This may return without performing this transition in some cases, +// such as when called on a system stack or with locks held. +func gcStart(trigger gcTrigger) { + // Since this is called from malloc and malloc is called in + // the guts of a number of libraries that might be holding + // locks, don't attempt to start GC in non-preemptible or + // potentially unstable situations. + mp := acquirem() + if gp := getg(); gp == mp.g0 || mp.locks > 1 || mp.preemptoff != "" { + releasem(mp) + return + } + releasem(mp) + mp = nil + + // Pick up the remaining unswept/not being swept spans concurrently + // + // This shouldn't happen if we're being invoked in background + // mode since proportional sweep should have just finished + // sweeping everything, but rounding errors, etc, may leave a + // few spans unswept. In forced mode, this is necessary since + // GC can be forced at any point in the sweeping cycle. + // + // We check the transition condition continuously here in case + // this G gets delayed in to the next GC cycle. + for trigger.test() && sweepone() != ^uintptr(0) { + sweep.nbgsweep++ + } + + // Perform GC initialization and the sweep termination + // transition. + semacquire(&work.startSema) + // Re-check transition condition under transition lock. + if !trigger.test() { + semrelease(&work.startSema) + return + } + + // In gcstoptheworld debug mode, upgrade the mode accordingly. + // We do this after re-checking the transition condition so + // that multiple goroutines that detect the heap trigger don't + // start multiple STW GCs. + mode := gcBackgroundMode + if debug.gcstoptheworld == 1 { + mode = gcForceMode + } else if debug.gcstoptheworld == 2 { + mode = gcForceBlockMode + } + + // Ok, we're doing it! 
Stop everybody else + semacquire(&gcsema) + semacquire(&worldsema) + + // For stats, check if this GC was forced by the user. + // Update it under gcsema to avoid gctrace getting wrong values. + work.userForced = trigger.kind == gcTriggerCycle + + if trace.enabled { + traceGCStart() + } + + // Check that all Ps have finished deferred mcache flushes. + for _, p := range allp { + if fg := p.mcache.flushGen.Load(); fg != mheap_.sweepgen { + println("runtime: p", p.id, "flushGen", fg, "!= sweepgen", mheap_.sweepgen) + throw("p mcache not flushed") + } + } + + gcBgMarkStartWorkers() + + systemstack(gcResetMarkState) + + work.stwprocs, work.maxprocs = gomaxprocs, gomaxprocs + if work.stwprocs > ncpu { + // This is used to compute CPU time of the STW phases, + // so it can't be more than ncpu, even if GOMAXPROCS is. + work.stwprocs = ncpu + } + work.heap0 = gcController.heapLive.Load() + work.pauseNS = 0 + work.mode = mode + + now := nanotime() + work.tSweepTerm = now + work.pauseStart = now + if trace.enabled { + traceGCSTWStart(1) + } + systemstack(stopTheWorldWithSema) + // Finish sweep before we start concurrent scan. + systemstack(func() { + finishsweep_m() + }) + + // clearpools before we start the GC. If we wait, the memory will not be + // reclaimed until the next GC cycle. + clearpools() + + work.cycles.Add(1) + + // Assists and workers can start the moment we start + // the world. + gcController.startCycle(now, int(gomaxprocs), trigger) + + // Notify the CPU limiter that assists may begin. + gcCPULimiter.startGCTransition(true, now) + + // In STW mode, disable scheduling of user Gs. This may also + // disable scheduling of this goroutine, so it may block as + // soon as we start the world again. + if mode != gcBackgroundMode { + schedEnableUser(false) + } + + // Enter concurrent mark phase and enable + // write barriers.
+ // + // Because the world is stopped, all Ps will + // observe that write barriers are enabled by + // the time we start the world and begin + // scanning. + // + // Write barriers must be enabled before assists are + // enabled because they must be enabled before + // any non-leaf heap objects are marked. Since + // allocations are blocked until assists can + // happen, we want to enable assists as early as + // possible. + setGCPhase(_GCmark) + + gcBgMarkPrepare() // Must happen before assist enable. + gcMarkRootPrepare() + + // Mark all active tinyalloc blocks. Since we're + // allocating from these, they need to be black like + // other allocations. The alternative is to blacken + // the tiny block on every allocation from it, which + // would slow down the tiny allocator. + gcMarkTinyAllocs() + + // At this point all Ps have enabled the write + // barrier, thus maintaining the no white to + // black invariant. Enable mutator assists to + // put back-pressure on fast allocating + // mutators. + atomic.Store(&gcBlackenEnabled, 1) + + // In STW mode, we could block the instant systemstack + // returns, so make sure we're not preemptible. + mp = acquirem() + + // Concurrent mark. + systemstack(func() { + now = startTheWorldWithSema(trace.enabled) + work.pauseNS += now - work.pauseStart + work.tMark = now + memstats.gcPauseDist.record(now - work.pauseStart) + + // Release the CPU limiter. + gcCPULimiter.finishGCTransition(now) + }) + + // Release the world sema before Gosched() in STW mode + // because we will need to reacquire it later but before + // this goroutine becomes runnable again, and we could + // self-deadlock otherwise. + semrelease(&worldsema) + releasem(mp) + + // Make sure we block instead of returning to user code + // in STW mode. + if mode != gcBackgroundMode { + Gosched() + } + + semrelease(&work.startSema) +} + +// gcMarkDoneFlushed counts the number of P's with flushed work.
+// +// Ideally this would be a captured local in gcMarkDone, but forEachP +// escapes its callback closure, so it can't capture anything. +// +// This is protected by markDoneSema. +var gcMarkDoneFlushed uint32 + +// gcMarkDone transitions the GC from mark to mark termination if all +// reachable objects have been marked (that is, there are no grey +// objects and can be no more in the future). Otherwise, it flushes +// all local work to the global queues where it can be discovered by +// other workers. +// +// This should be called when all local mark work has been drained and +// there are no remaining workers. Specifically, when +// +// work.nwait == work.nproc && !gcMarkWorkAvailable(p) +// +// The calling context must be preemptible. +// +// Flushing local work is important because idle Ps may have local +// work queued. This is the only way to make that work visible and +// drive GC to completion. +// +// It is explicitly okay to have write barriers in this function. If +// it does transition to mark termination, then all reachable objects +// have been marked, so the write barrier cannot shade any more +// objects. +func gcMarkDone() { + // Ensure only one thread is running the ragged barrier at a + // time. + semacquire(&work.markDoneSema) + +top: + // Re-check transition condition under transition lock. + // + // It's critical that this checks the global work queues are + // empty before performing the ragged barrier. Otherwise, + // there could be global work that a P could take after the P + // has passed the ragged barrier. + if !(gcphase == _GCmark && work.nwait == work.nproc && !gcMarkWorkAvailable(nil)) { + semrelease(&work.markDoneSema) + return + } + + // forEachP needs worldsema to execute, and we'll need it to + // stop the world later, so acquire worldsema now. + semacquire(&worldsema) + + // Flush all local buffers and collect flushedWork flags. 
+ gcMarkDoneFlushed = 0 + systemstack(func() { + gp := getg().m.curg + // Mark the user stack as preemptible so that it may be scanned. + // Otherwise, our attempt to force all P's to a safepoint could + // result in a deadlock as we attempt to preempt a worker that's + // trying to preempt us (e.g. for a stack scan). + casGToWaiting(gp, _Grunning, waitReasonGCMarkTermination) + forEachP(func(pp *p) { + // Flush the write barrier buffer, since this may add + // work to the gcWork. + wbBufFlush1(pp) + + // Flush the gcWork, since this may create global work + // and set the flushedWork flag. + // + // TODO(austin): Break up these workbufs to + // better distribute work. + pp.gcw.dispose() + // Collect the flushedWork flag. + if pp.gcw.flushedWork { + atomic.Xadd(&gcMarkDoneFlushed, 1) + pp.gcw.flushedWork = false + } + }) + casgstatus(gp, _Gwaiting, _Grunning) + }) + + if gcMarkDoneFlushed != 0 { + // More grey objects were discovered since the + // previous termination check, so there may be more + // work to do. Keep going. It's possible the + // transition condition became true again during the + // ragged barrier, so re-check it. + semrelease(&worldsema) + goto top + } + + // There was no global work, no local work, and no Ps + // communicated work since we took markDoneSema. Therefore + // there are no grey objects and no more objects can be + // shaded. Transition to mark termination. + now := nanotime() + work.tMarkTerm = now + work.pauseStart = now + getg().m.preemptoff = "gcing" + if trace.enabled { + traceGCSTWStart(0) + } + systemstack(stopTheWorldWithSema) + // The gcphase is _GCmark, it will transition to _GCmarktermination + // below. The important thing is that the wb remains active until + // all marking is complete. This includes writes made by the GC. + + // There is sometimes work left over when we enter mark termination due + // to write barriers performed after the completion barrier above. + // Detect this and resume concurrent mark. 
This is obviously + // unfortunate. + // + // See issue #27993 for details. + // + // Switch to the system stack to call wbBufFlush1, though in this case + // it doesn't matter because we're non-preemptible anyway. + restart := false + systemstack(func() { + for _, p := range allp { + wbBufFlush1(p) + if !p.gcw.empty() { + restart = true + break + } + } + }) + if restart { + getg().m.preemptoff = "" + systemstack(func() { + now := startTheWorldWithSema(trace.enabled) + work.pauseNS += now - work.pauseStart + memstats.gcPauseDist.record(now - work.pauseStart) + }) + semrelease(&worldsema) + goto top + } + + gcComputeStartingStackSize() + + // Disable assists and background workers. We must do + // this before waking blocked assists. + atomic.Store(&gcBlackenEnabled, 0) + + // Notify the CPU limiter that GC assists will now cease. + gcCPULimiter.startGCTransition(false, now) + + // Wake all blocked assists. These will run when we + // start the world again. + gcWakeAllAssists() + + // Likewise, release the transition lock. Blocked + // workers and assists will run when we start the + // world again. + semrelease(&work.markDoneSema) + + // In STW mode, re-enable user goroutines. These will be + // queued to run after we start the world. + schedEnableUser(true) + + // endCycle depends on all gcWork cache stats being flushed. + // The termination algorithm above ensured this, up to + // allocations since the ragged barrier. + gcController.endCycle(now, int(gomaxprocs), work.userForced) + + // Perform mark termination. This will restart the world. + gcMarkTermination() +} + +// World must be stopped and mark assists and background workers must be +// disabled. +func gcMarkTermination() { + // Start marktermination (write barrier remains enabled for now).
+ setGCPhase(_GCmarktermination) + + work.heap1 = gcController.heapLive.Load() + startTime := nanotime() + + mp := acquirem() + mp.preemptoff = "gcing" + mp.traceback = 2 + curgp := mp.curg + casGToWaiting(curgp, _Grunning, waitReasonGarbageCollection) + + // Run gc on the g0 stack. We do this so that the g stack + // we're currently running on will no longer change. Cuts + // the root set down a bit (g0 stacks are not scanned, and + // we don't need to scan gc's internal state). We also + // need to switch to g0 so we can shrink the stack. + systemstack(func() { + gcMark(startTime) + // Must return immediately. + // The outer function's stack may have moved + // during gcMark (it shrinks stacks, including the + // outer function's stack), so we must not refer + // to any of its variables. Return back to the + // non-system stack to pick up the new addresses + // before continuing. + }) + + systemstack(func() { + work.heap2 = work.bytesMarked + if debug.gccheckmark > 0 { + // Run a full non-parallel, stop-the-world + // mark using checkmark bits, to check that we + // didn't forget to mark anything during the + // concurrent mark process. + startCheckmarks() + gcResetMarkState() + gcw := &getg().m.p.ptr().gcw + gcDrain(gcw, 0) + wbBufFlush1(getg().m.p.ptr()) + gcw.dispose() + endCheckmarks() + } + + // marking is complete so we can turn the write barrier off + setGCPhase(_GCoff) + gcSweep(work.mode) + }) + + mp.traceback = 0 + casgstatus(curgp, _Gwaiting, _Grunning) + + if trace.enabled { + traceGCDone() + } + + // all done + mp.preemptoff = "" + + if gcphase != _GCoff { + throw("gc done but gcphase != _GCoff") + } + + // Record heapInUse for scavenger. + memstats.lastHeapInUse = gcController.heapInUse.load() + + // Update GC trigger and pacing, as well as downstream consumers + // of this pacing information, for the next cycle. 
+ systemstack(gcControllerCommit) + + // Update timing memstats + now := nanotime() + sec, nsec, _ := time_now() + unixNow := sec*1e9 + int64(nsec) + work.pauseNS += now - work.pauseStart + work.tEnd = now + memstats.gcPauseDist.record(now - work.pauseStart) + atomic.Store64(&memstats.last_gc_unix, uint64(unixNow)) // must be Unix time to make sense to user + atomic.Store64(&memstats.last_gc_nanotime, uint64(now)) // monotonic time for us + memstats.pause_ns[memstats.numgc%uint32(len(memstats.pause_ns))] = uint64(work.pauseNS) + memstats.pause_end[memstats.numgc%uint32(len(memstats.pause_end))] = uint64(unixNow) + memstats.pause_total_ns += uint64(work.pauseNS) + + sweepTermCpu := int64(work.stwprocs) * (work.tMark - work.tSweepTerm) + // We report idle marking time below, but omit it from the + // overall utilization here since it's "free". + markAssistCpu := gcController.assistTime.Load() + markDedicatedCpu := gcController.dedicatedMarkTime.Load() + markFractionalCpu := gcController.fractionalMarkTime.Load() + markIdleCpu := gcController.idleMarkTime.Load() + markTermCpu := int64(work.stwprocs) * (work.tEnd - work.tMarkTerm) + scavAssistCpu := scavenge.assistTime.Load() + scavBgCpu := scavenge.backgroundTime.Load() + + // Update cumulative GC CPU stats. + work.cpuStats.gcAssistTime += markAssistCpu + work.cpuStats.gcDedicatedTime += markDedicatedCpu + markFractionalCpu + work.cpuStats.gcIdleTime += markIdleCpu + work.cpuStats.gcPauseTime += sweepTermCpu + markTermCpu + work.cpuStats.gcTotalTime += sweepTermCpu + markAssistCpu + markDedicatedCpu + markFractionalCpu + markIdleCpu + markTermCpu + + // Update cumulative scavenge CPU stats. + work.cpuStats.scavengeAssistTime += scavAssistCpu + work.cpuStats.scavengeBgTime += scavBgCpu + work.cpuStats.scavengeTotalTime += scavAssistCpu + scavBgCpu + + // Update total CPU. 
+ work.cpuStats.totalTime = sched.totaltime + (now-sched.procresizetime)*int64(gomaxprocs) + work.cpuStats.idleTime += sched.idleTime.Load() + + // Compute userTime. We compute this indirectly as everything that's not the above. + // + // Since time spent in _Pgcstop is covered by gcPauseTime, and time spent in _Pidle + // is covered by idleTime, what we're left with is time spent in _Prunning and _Psyscall, + // the latter of which is fine because the P will either go idle or get used for something + // else via sysmon. Meanwhile if we subtract GC time from whatever's left, we get non-GC + // _Prunning time. Note that this still leaves time spent in sweeping and in the scheduler, + // but that's fine. The overwhelming majority of this time will be actual user time. + work.cpuStats.userTime = work.cpuStats.totalTime - (work.cpuStats.gcTotalTime + + work.cpuStats.scavengeTotalTime + work.cpuStats.idleTime) + + // Compute overall GC CPU utilization. + // Omit idle marking time from the overall utilization here since it's "free". + memstats.gc_cpu_fraction = float64(work.cpuStats.gcTotalTime-work.cpuStats.gcIdleTime) / float64(work.cpuStats.totalTime) + + // Reset assist time and background time stats. + // + // Do this now, instead of at the start of the next GC cycle, because + // these two may keep accumulating even if the GC is not active. + scavenge.assistTime.Store(0) + scavenge.backgroundTime.Store(0) + + // Reset idle time stat. + sched.idleTime.Store(0) + + // Reset sweep state. + sweep.nbgsweep = 0 + sweep.npausesweep = 0 + + if work.userForced { + memstats.numforcedgc++ + } + + // Bump GC cycle count and wake goroutines waiting on sweep. + lock(&work.sweepWaiters.lock) + memstats.numgc++ + injectglist(&work.sweepWaiters.list) + unlock(&work.sweepWaiters.lock) + + // Release the CPU limiter. + gcCPULimiter.finishGCTransition(now) + + // Finish the current heap profiling cycle and start a new + // heap profiling cycle. 
We do this before starting the world + // so events don't leak into the wrong cycle. + mProf_NextCycle() + + // There may be stale spans in mcaches that need to be swept. + // Those aren't tracked in any sweep lists, so we need to + // count them against sweep completion until we ensure all + // those spans have been forced out. + sl := sweep.active.begin() + if !sl.valid { + throw("failed to set sweep barrier") + } + + systemstack(func() { startTheWorldWithSema(trace.enabled) }) + + // Flush the heap profile so we can start a new cycle next GC. + // This is relatively expensive, so we don't do it with the + // world stopped. + mProf_Flush() + + // Prepare workbufs for freeing by the sweeper. We do this + // asynchronously because it can take non-trivial time. + prepareFreeWorkbufs() + + // Free stack spans. This must be done between GC cycles. + systemstack(freeStackSpans) + + // Ensure all mcaches are flushed. Each P will flush its own + // mcache before allocating, but idle Ps may not. Since this + // is necessary to sweep all spans, we need to ensure all + // mcaches are flushed before we start the next GC cycle. + systemstack(func() { + forEachP(func(pp *p) { + pp.mcache.prepareForSweep() + }) + }) + // Now that we've swept stale spans in mcaches, they don't + // count against unswept spans. + sweep.active.end(sl) + + // Print gctrace before dropping worldsema. As soon as we drop + // worldsema another cycle could start and smash the stats + // we're trying to print. 
+ if debug.gctrace > 0 { + util := int(memstats.gc_cpu_fraction * 100) + + var sbuf [24]byte + printlock() + print("gc ", memstats.numgc, + " @", string(itoaDiv(sbuf[:], uint64(work.tSweepTerm-runtimeInitTime)/1e6, 3)), "s ", + util, "%: ") + prev := work.tSweepTerm + for i, ns := range []int64{work.tMark, work.tMarkTerm, work.tEnd} { + if i != 0 { + print("+") + } + print(string(fmtNSAsMS(sbuf[:], uint64(ns-prev)))) + prev = ns + } + print(" ms clock, ") + for i, ns := range []int64{ + sweepTermCpu, + gcController.assistTime.Load(), + gcController.dedicatedMarkTime.Load() + gcController.fractionalMarkTime.Load(), + gcController.idleMarkTime.Load(), + markTermCpu, + } { + if i == 2 || i == 3 { + // Separate mark time components with /. + print("/") + } else if i != 0 { + print("+") + } + print(string(fmtNSAsMS(sbuf[:], uint64(ns)))) + } + print(" ms cpu, ", + work.heap0>>20, "->", work.heap1>>20, "->", work.heap2>>20, " MB, ", + gcController.lastHeapGoal>>20, " MB goal, ", + gcController.lastStackScan.Load()>>20, " MB stacks, ", + gcController.globalsScan.Load()>>20, " MB globals, ", + work.maxprocs, " P") + if work.userForced { + print(" (forced)") + } + print("\n") + printunlock() + } + + // Set any arena chunks that were deferred to fault. + lock(&userArenaState.lock) + faultList := userArenaState.fault + userArenaState.fault = nil + unlock(&userArenaState.lock) + for _, lc := range faultList { + lc.mspan.setUserArenaChunkToFault() + } + + semrelease(&worldsema) + semrelease(&gcsema) + // Careful: another GC cycle may start now. + + releasem(mp) + mp = nil + + // now that gc is done, kick off finalizer thread if needed + if !concurrentSweep { + // give the queued finalizers, if any, a chance to run + Gosched() + } +} + +// gcBgMarkStartWorkers prepares background mark worker goroutines. These +// goroutines will not run until the mark phase, but they must be started while +// the world is not stopped and from a regular G stack. The caller must hold +// worldsema.
+func gcBgMarkStartWorkers() { + // Background marking is performed by per-P G's. Ensure that each P has + // a background GC G. + // + // Worker Gs don't exit if gomaxprocs is reduced. If it is raised + // again, we can reuse the old workers; no need to create new workers. + for gcBgMarkWorkerCount < gomaxprocs { + go gcBgMarkWorker() + + notetsleepg(&work.bgMarkReady, -1) + noteclear(&work.bgMarkReady) + // The worker is now guaranteed to be added to the pool before + // its P's next findRunnableGCWorker. + + gcBgMarkWorkerCount++ + } +} + +// gcBgMarkPrepare sets up state for background marking. +// Mutator assists must not yet be enabled. +func gcBgMarkPrepare() { + // Background marking will stop when the work queues are empty + // and there are no more workers (note that, since this is + // concurrent, this may be a transient state, but mark + // termination will clean it up). Between background workers + // and assists, we don't really know how many workers there + // will be, so we pretend to have an arbitrarily large number + // of workers, almost all of which are "waiting". While a + // worker is working it decrements nwait. If nproc == nwait, + // there are no workers. + work.nproc = ^uint32(0) + work.nwait = ^uint32(0) +} + +// gcBgMarkWorkerNode is an entry in the gcBgMarkWorkerPool. It points to a single +// gcBgMarkWorker goroutine. +type gcBgMarkWorkerNode struct { + // Unused workers are managed in a lock-free stack. This field must be first. + node lfnode + + // The g of this worker. + gp guintptr + + // Release this m on park. This is used to communicate with the unlock + // function, which cannot access the G's stack. It is unused outside of + // gcBgMarkWorker(). + m muintptr +} + +func gcBgMarkWorker() { + gp := getg() + + // We pass node to a gopark unlock function, so it can't be on + // the stack (see gopark). Prevent deadlock from recursively + // starting GC by disabling preemption. 
+ gp.m.preemptoff = "GC worker init" + node := new(gcBgMarkWorkerNode) + gp.m.preemptoff = "" + + node.gp.set(gp) + + node.m.set(acquirem()) + notewakeup(&work.bgMarkReady) + // After this point, the background mark worker is generally scheduled + // cooperatively by gcController.findRunnableGCWorker. While performing + // work on the P, preemption is disabled because we are working on + // P-local work buffers. When the preempt flag is set, this puts itself + // into _Gwaiting to be woken up by gcController.findRunnableGCWorker + // at the appropriate time. + // + // When preemption is enabled (e.g., while in gcMarkDone), this worker + // may be preempted and scheduled as a _Grunnable G from a runq. That is + // fine; it will eventually gopark again for further scheduling via + // findRunnableGCWorker. + // + // Since we disable preemption before notifying bgMarkReady, we + // guarantee that this G will be in the worker pool for the next + // findRunnableGCWorker. This isn't strictly necessary, but it reduces + // latency between _GCmark starting and the workers starting. + + for { + // Go to sleep until woken by + // gcController.findRunnableGCWorker. + gopark(func(g *g, nodep unsafe.Pointer) bool { + node := (*gcBgMarkWorkerNode)(nodep) + + if mp := node.m.ptr(); mp != nil { + // The worker G is no longer running; release + // the M. + // + // N.B. it is _safe_ to release the M as soon + // as we are no longer performing P-local mark + // work. + // + // However, since we cooperatively stop work + // when gp.preempt is set, if we releasem in + // the loop then the following call to gopark + // would immediately preempt the G. This is + // also safe, but inefficient: the G must + // schedule again only to enter gopark and park + // again. Thus, we defer the release until + // after parking the G. + releasem(mp) + } + + // Release this G to the pool.
+ gcBgMarkWorkerPool.push(&node.node) + // Note that at this point, the G may immediately be + // rescheduled and may be running. + return true + }, unsafe.Pointer(node), waitReasonGCWorkerIdle, traceEvGoBlock, 0) + + // Preemption must not occur here, or another G might see + // p.gcMarkWorkerMode. + + // Disable preemption so we can use the gcw. If the + // scheduler wants to preempt us, we'll stop draining, + // dispose the gcw, and then preempt. + node.m.set(acquirem()) + pp := gp.m.p.ptr() // P can't change with preemption disabled. + + if gcBlackenEnabled == 0 { + println("worker mode", pp.gcMarkWorkerMode) + throw("gcBgMarkWorker: blackening not enabled") + } + + if pp.gcMarkWorkerMode == gcMarkWorkerNotWorker { + throw("gcBgMarkWorker: mode not set") + } + + startTime := nanotime() + pp.gcMarkWorkerStartTime = startTime + var trackLimiterEvent bool + if pp.gcMarkWorkerMode == gcMarkWorkerIdleMode { + trackLimiterEvent = pp.limiterEvent.start(limiterEventIdleMarkWork, startTime) + } + + decnwait := atomic.Xadd(&work.nwait, -1) + if decnwait == work.nproc { + println("runtime: work.nwait=", decnwait, "work.nproc=", work.nproc) + throw("work.nwait was > work.nproc") + } + + systemstack(func() { + // Mark our goroutine preemptible so its stack + // can be scanned. This lets two mark workers + // scan each other (otherwise, they would + // deadlock). We must not modify anything on + // the G stack. However, stack shrinking is + // disabled for mark workers, so it is safe to + // read from the G stack. + casGToWaiting(gp, _Grunning, waitReasonGCWorkerActive) + switch pp.gcMarkWorkerMode { + default: + throw("gcBgMarkWorker: unexpected gcMarkWorkerMode") + case gcMarkWorkerDedicatedMode: + gcDrain(&pp.gcw, gcDrainUntilPreempt|gcDrainFlushBgCredit) + if gp.preempt { + // We were preempted. This is + // a useful signal to kick + // everything out of the run + // queue so it can run + // somewhere else. 
+ if drainQ, n := runqdrain(pp); n > 0 { + lock(&sched.lock) + globrunqputbatch(&drainQ, int32(n)) + unlock(&sched.lock) + } + } + // Go back to draining, this time + // without preemption. + gcDrain(&pp.gcw, gcDrainFlushBgCredit) + case gcMarkWorkerFractionalMode: + gcDrain(&pp.gcw, gcDrainFractional|gcDrainUntilPreempt|gcDrainFlushBgCredit) + case gcMarkWorkerIdleMode: + gcDrain(&pp.gcw, gcDrainIdle|gcDrainUntilPreempt|gcDrainFlushBgCredit) + } + casgstatus(gp, _Gwaiting, _Grunning) + }) + + // Account for time and mark us as stopped. + now := nanotime() + duration := now - startTime + gcController.markWorkerStop(pp.gcMarkWorkerMode, duration) + if trackLimiterEvent { + pp.limiterEvent.stop(limiterEventIdleMarkWork, now) + } + if pp.gcMarkWorkerMode == gcMarkWorkerFractionalMode { + atomic.Xaddint64(&pp.gcFractionalMarkTime, duration) + } + + // Was this the last worker and did we run out + // of work? + incnwait := atomic.Xadd(&work.nwait, +1) + if incnwait > work.nproc { + println("runtime: p.gcMarkWorkerMode=", pp.gcMarkWorkerMode, + "work.nwait=", incnwait, "work.nproc=", work.nproc) + throw("work.nwait > work.nproc") + } + + // We'll releasem after this point and thus this P may run + // something else. We must clear the worker mode to avoid + // attributing the mode to a different (non-worker) G in + // traceGoStart. + pp.gcMarkWorkerMode = gcMarkWorkerNotWorker + + // If this worker reached a background mark completion + // point, signal the main GC goroutine. + if incnwait == work.nproc && !gcMarkWorkAvailable(nil) { + // We don't need the P-local buffers here, allow + // preemption because we may schedule like a regular + // goroutine in gcMarkDone (block on locks, etc). + releasem(node.m.ptr()) + node.m.set(nil) + + gcMarkDone() + } + } +} + +// gcMarkWorkAvailable reports whether executing a mark worker +// on p is potentially useful. p may be nil, in which case it only +// checks the global sources of work. 
+func gcMarkWorkAvailable(p *p) bool { + if p != nil && !p.gcw.empty() { + return true + } + if !work.full.empty() { + return true // global work available + } + if work.markrootNext < work.markrootJobs { + return true // root scan work available + } + return false +} + +// gcMark runs the mark (or, for concurrent GC, mark termination) +// All gcWork caches must be empty. +// STW is in effect at this point. +func gcMark(startTime int64) { + if debug.allocfreetrace > 0 { + tracegc() + } + + if gcphase != _GCmarktermination { + throw("in gcMark expecting to see gcphase as _GCmarktermination") + } + work.tstart = startTime + + // Check that there's no marking work remaining. + if work.full != 0 || work.markrootNext < work.markrootJobs { + print("runtime: full=", hex(work.full), " next=", work.markrootNext, " jobs=", work.markrootJobs, " nDataRoots=", work.nDataRoots, " nBSSRoots=", work.nBSSRoots, " nSpanRoots=", work.nSpanRoots, " nStackRoots=", work.nStackRoots, "\n") + panic("non-empty mark queue after concurrent mark") + } + + if debug.gccheckmark > 0 { + // This is expensive when there's a large number of + // Gs, so only do it if checkmark is also enabled. + gcMarkRootCheck() + } + if work.full != 0 { + throw("work.full != 0") + } + + // Drop allg snapshot. allgs may have grown, in which case + // this is the only reference to the old backing store and + // there's no need to keep it around. + work.stackRoots = nil + + // Clear out buffers and double-check that all gcWork caches + // are empty. This should be ensured by gcMarkDone before we + // enter mark termination. + // + // TODO: We could clear out buffers just before mark if this + // has a non-negligible impact on STW time. + for _, p := range allp { + // The write barrier may have buffered pointers since + // the gcMarkDone barrier. However, since the barrier + // ensured all reachable objects were marked, all of + // these must be pointers to black objects. 
Hence we + // can just discard the write barrier buffer. + if debug.gccheckmark > 0 { + // For debugging, flush the buffer and make + // sure it really was all marked. + wbBufFlush1(p) + } else { + p.wbBuf.reset() + } + + gcw := &p.gcw + if !gcw.empty() { + printlock() + print("runtime: P ", p.id, " flushedWork ", gcw.flushedWork) + if gcw.wbuf1 == nil { + print(" wbuf1=<nil>") + } else { + print(" wbuf1.n=", gcw.wbuf1.nobj) + } + if gcw.wbuf2 == nil { + print(" wbuf2=<nil>") + } else { + print(" wbuf2.n=", gcw.wbuf2.nobj) + } + print("\n") + throw("P has cached GC work at end of mark termination") + } + // There may still be cached empty buffers, which we + // need to flush since we're going to free them. Also, + // there may be non-zero stats because we allocated + // black after the gcMarkDone barrier. + gcw.dispose() + } + + // Flush scanAlloc from each mcache since we're about to modify + // heapScan directly. If we were to flush this later, then scanAlloc + // might have incorrect information. + // + // Note that it's not important to retain this information; we know + // exactly what heapScan is at this point via scanWork. + for _, p := range allp { + c := p.mcache + if c == nil { + continue + } + c.scanAlloc = 0 + } + + // Reset controller state. + gcController.resetLive(work.bytesMarked) +} + +// gcSweep must be called on the system stack because it acquires the heap +// lock. See mheap for details. +// +// The world must be stopped. +// +//go:systemstack +func gcSweep(mode gcMode) { + assertWorldStopped() + + if gcphase != _GCoff { + throw("gcSweep being done but phase is not GCoff") + } + + lock(&mheap_.lock) + mheap_.sweepgen += 2 + sweep.active.reset() + mheap_.pagesSwept.Store(0) + mheap_.sweepArenas = mheap_.allArenas + mheap_.reclaimIndex.Store(0) + mheap_.reclaimCredit.Store(0) + unlock(&mheap_.lock) + + sweep.centralIndex.clear() + + if !_ConcurrentSweep || mode == gcForceBlockMode { + // Special case synchronous sweep. 
+ // Record that no proportional sweeping has to happen. + lock(&mheap_.lock) + mheap_.sweepPagesPerByte = 0 + unlock(&mheap_.lock) + // Sweep all spans eagerly. + for sweepone() != ^uintptr(0) { + sweep.npausesweep++ + } + // Free workbufs eagerly. + prepareFreeWorkbufs() + for freeSomeWbufs(false) { + } + // All "free" events for this mark/sweep cycle have + // now happened, so we can make this profile cycle + // available immediately. + mProf_NextCycle() + mProf_Flush() + return + } + + // Background sweep. + lock(&sweep.lock) + if sweep.parked { + sweep.parked = false + ready(sweep.g, 0, true) + } + unlock(&sweep.lock) +} + +// gcResetMarkState resets global state prior to marking (concurrent +// or STW) and resets the stack scan state of all Gs. +// +// This is safe to do without the world stopped because any Gs created +// during or after this will start out in the reset state. +// +// gcResetMarkState must be called on the system stack because it acquires +// the heap lock. See mheap for details. +// +//go:systemstack +func gcResetMarkState() { + // This may be called during a concurrent phase, so lock to make sure + // allgs doesn't change. + forEachG(func(gp *g) { + gp.gcscandone = false // set to true in gcphasework + gp.gcAssistBytes = 0 + }) + + // Clear page marks. This is just 1MB per 64GB of heap, so the + // time here is pretty trivial. 
+ lock(&mheap_.lock) + arenas := mheap_.allArenas + unlock(&mheap_.lock) + for _, ai := range arenas { + ha := mheap_.arenas[ai.l1()][ai.l2()] + for i := range ha.pageMarks { + ha.pageMarks[i] = 0 + } + } + + work.bytesMarked = 0 + work.initialHeapLive = gcController.heapLive.Load() +} + +// Hooks for other packages + +var poolcleanup func() +var boringCaches []unsafe.Pointer // for crypto/internal/boring + +//go:linkname sync_runtime_registerPoolCleanup sync.runtime_registerPoolCleanup +func sync_runtime_registerPoolCleanup(f func()) { + poolcleanup = f +} + +//go:linkname boring_registerCache crypto/internal/boring/bcache.registerCache +func boring_registerCache(p unsafe.Pointer) { + boringCaches = append(boringCaches, p) +} + +func clearpools() { + // clear sync.Pools + if poolcleanup != nil { + poolcleanup() + } + + // clear boringcrypto caches + for _, p := range boringCaches { + atomicstorep(p, nil) + } + + // Clear central sudog cache. + // Leave per-P caches alone, they have strictly bounded size. + // Disconnect cached list before dropping it on the floor, + // so that a dangling ref to one entry does not pin all of them. + lock(&sched.sudoglock) + var sg, sgnext *sudog + for sg = sched.sudogcache; sg != nil; sg = sgnext { + sgnext = sg.next + sg.next = nil + } + sched.sudogcache = nil + unlock(&sched.sudoglock) + + // Clear central defer pool. + // Leave per-P pools alone, they have strictly bounded size. + lock(&sched.deferlock) + // disconnect cached list before dropping it on the floor, + // so that a dangling ref to one entry does not pin all of them. + var d, dlink *_defer + for d = sched.deferpool; d != nil; d = dlink { + dlink = d.link + d.link = nil + } + sched.deferpool = nil + unlock(&sched.deferlock) +} + +// Timing + +// itoaDiv formats val/(10**dec) into buf. 
+func itoaDiv(buf []byte, val uint64, dec int) []byte { + i := len(buf) - 1 + idec := i - dec + for val >= 10 || i >= idec { + buf[i] = byte(val%10 + '0') + i-- + if i == idec { + buf[i] = '.' + i-- + } + val /= 10 + } + buf[i] = byte(val + '0') + return buf[i:] +} + +// fmtNSAsMS nicely formats ns nanoseconds as milliseconds. +func fmtNSAsMS(buf []byte, ns uint64) []byte { + if ns >= 10e6 { + // Format as whole milliseconds. + return itoaDiv(buf, ns/1e6, 0) + } + // Format two digits of precision, with at most three decimal places. + x := ns / 1e3 + if x == 0 { + buf[0] = '0' + return buf[:1] + } + dec := 3 + for x >= 100 { + x /= 10 + dec-- + } + return itoaDiv(buf, x, dec) +} + +// Helpers for testing GC. + +// gcTestMoveStackOnNextCall causes the stack to be moved on a call +// immediately following the call to this. It may not work correctly +// if any other work appears after this call (such as returning). +// Typically the following call should be marked go:noinline so it +// performs a stack check. +// +// In rare cases this may not cause the stack to move, specifically if +// there's a preemption between this call and the next. +func gcTestMoveStackOnNextCall() { + gp := getg() + gp.stackguard0 = stackForceMove +} + +// gcTestIsReachable performs a GC and returns a bit set where bit i +// is set if ptrs[i] is reachable. +func gcTestIsReachable(ptrs ...unsafe.Pointer) (mask uint64) { + // This takes the pointers as unsafe.Pointers in order to keep + // them live long enough for us to attach specials. After + // that, we drop our references to them. + + if len(ptrs) > 64 { + panic("too many pointers for uint64 mask") + } + + // Block GC while we attach specials and drop our references + // to ptrs. Otherwise, if a GC is in progress, it could mark + // them reachable via this function before we have a chance to + // drop them. + semacquire(&gcsema) + + // Create reachability specials for ptrs. 
+ specials := make([]*specialReachable, len(ptrs)) + for i, p := range ptrs { + lock(&mheap_.speciallock) + s := (*specialReachable)(mheap_.specialReachableAlloc.alloc()) + unlock(&mheap_.speciallock) + s.special.kind = _KindSpecialReachable + if !addspecial(p, &s.special) { + throw("already have a reachable special (duplicate pointer?)") + } + specials[i] = s + // Make sure we don't retain ptrs. + ptrs[i] = nil + } + + semrelease(&gcsema) + + // Force a full GC and sweep. + GC() + + // Process specials. + for i, s := range specials { + if !s.done { + printlock() + println("runtime: object", i, "was not swept") + throw("IsReachable failed") + } + if s.reachable { + mask |= 1 << i + } + lock(&mheap_.speciallock) + mheap_.specialReachableAlloc.free(unsafe.Pointer(s)) + unlock(&mheap_.speciallock) + } + + return mask +} + +// gcTestPointerClass returns the category of what p points to, one of: +// "heap", "stack", "data", "bss", "other". This is useful for checking +// that a test is doing what it's intended to do. +// +// This is nosplit simply to avoid extra pointer shuffling that may +// complicate a test. +// +//go:nosplit +func gcTestPointerClass(p unsafe.Pointer) string { + p2 := uintptr(noescape(p)) + gp := getg() + if gp.stack.lo <= p2 && p2 < gp.stack.hi { + return "stack" + } + if base, _, _ := findObject(p2, 0, 0); base != 0 { + return "heap" + } + for _, datap := range activeModules() { + if datap.data <= p2 && p2 < datap.edata || datap.noptrdata <= p2 && p2 < datap.enoptrdata { + return "data" + } + if datap.bss <= p2 && p2 < datap.ebss || datap.noptrbss <= p2 && p2 <= datap.enoptrbss { + return "bss" + } + } + KeepAlive(p) + return "other" +} diff --git a/src/runtime/mgclimit.go b/src/runtime/mgclimit.go new file mode 100644 index 0000000..bcbe7f8 --- /dev/null +++ b/src/runtime/mgclimit.go @@ -0,0 +1,483 @@ +// Copyright 2022 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "runtime/internal/atomic" + +// gcCPULimiter is a mechanism to limit GC CPU utilization in situations +// where it might become excessive and inhibit application progress (e.g. +// a death spiral). +// +// The core of the limiter is a leaky bucket mechanism that fills with GC +// CPU time and drains with mutator time. Because the bucket fills and +// drains with time directly (i.e. without any weighting), this effectively +// sets a very conservative limit of 50%. This limit could be enforced directly; +// however, the purpose of the bucket is to accommodate spikes in GC CPU +// utilization without hurting throughput. +// +// Note that the bucket in the leaky bucket mechanism can never go negative, +// so the GC never gets credit for a lot of CPU time spent without the GC +// running. This is intentional, as an application that stays idle for, say, +// an entire day, could build up enough credit to fail to prevent a death +// spiral the following day. The bucket's capacity is the GC's only leeway. +// +// The capacity thus also sets the window the limiter considers. For example, +// if the capacity of the bucket is 1 cpu-second, then the limiter will not +// kick in until at least 1 full cpu-second in the last 2 cpu-second window +// is spent on GC CPU time. +var gcCPULimiter gcCPULimiterState + +type gcCPULimiterState struct { + lock atomic.Uint32 + + enabled atomic.Bool + bucket struct { + // Invariants: + // - fill >= 0 + // - capacity >= 0 + // - fill <= capacity + fill, capacity uint64 + } + // overflow is the cumulative amount of GC CPU time that we tried to fill the + // bucket with but exceeded its capacity. + overflow uint64 + + // gcEnabled is an internal copy of gcBlackenEnabled that determines + // whether the limiter tracks total assist time.
+ // + // gcBlackenEnabled isn't used directly so as to keep this structure + // unit-testable. + gcEnabled bool + + // transitioning is true when the GC is in a STW and transitioning between + // the mark and sweep phases. + transitioning bool + + // assistTimePool is the accumulated assist time since the last update. + assistTimePool atomic.Int64 + + // idleMarkTimePool is the accumulated idle mark time since the last update. + idleMarkTimePool atomic.Int64 + + // idleTimePool is the accumulated time Ps spent on the idle list since the last update. + idleTimePool atomic.Int64 + + // lastUpdate is the nanotime timestamp of the last time update was called. + // + // Updated under lock, but may be read concurrently. + lastUpdate atomic.Int64 + + // lastEnabledCycle is the GC cycle that last had the limiter enabled. + lastEnabledCycle atomic.Uint32 + + // nprocs is an internal copy of gomaxprocs, used to determine total available + // CPU time. + // + // gomaxprocs isn't used directly so as to keep this structure unit-testable. + nprocs int32 + + // test indicates whether this instance of the struct was made for testing purposes. + test bool +} + +// limiting returns true if the CPU limiter is currently enabled, meaning the Go GC +// should take action to limit CPU utilization. +// +// It is safe to call concurrently with other operations. +func (l *gcCPULimiterState) limiting() bool { + return l.enabled.Load() +} + +// startGCTransition notifies the limiter of a GC transition. +// +// This call takes ownership of the limiter and disables all other means of +// updating the limiter. Release ownership by calling finishGCTransition. +// +// It is safe to call concurrently with other operations. +func (l *gcCPULimiterState) startGCTransition(enableGC bool, now int64) { + if !l.tryLock() { + // This must happen during a STW, so we can't fail to acquire the lock. + // If we did, something went wrong. Throw. 
+ throw("failed to acquire lock to start a GC transition") + } + if l.gcEnabled == enableGC { + throw("transitioning GC to the same state as before?") + } + // Flush whatever was left between the last update and now. + l.updateLocked(now) + l.gcEnabled = enableGC + l.transitioning = true + // N.B. finishGCTransition releases the lock. + // + // We don't release here so that, if the transition fails to finish, + // a later attempt to acquire the lock is more likely to throw. +} + +// finishGCTransition notifies the limiter that the GC transition is complete +// and releases ownership of it. It also accumulates STW time in the bucket. +// now must be the timestamp from the end of the STW pause. +func (l *gcCPULimiterState) finishGCTransition(now int64) { + if !l.transitioning { + throw("finishGCTransition called without starting one?") + } + // Count the full nprocs set of CPU time because the world is stopped + // between startGCTransition and finishGCTransition. Even though the GC + // isn't running on all CPUs, it is preventing user code from doing so, + // so it might as well be. + if lastUpdate := l.lastUpdate.Load(); now >= lastUpdate { + l.accumulate(0, (now-lastUpdate)*int64(l.nprocs)) + } + l.lastUpdate.Store(now) + l.transitioning = false + l.unlock() +} + +// gcCPULimiterUpdatePeriod dictates the maximum amount of wall-clock time +// we can go before updating the limiter. +const gcCPULimiterUpdatePeriod = 10e6 // 10ms + +// needUpdate returns true if the limiter's maximum update period has been +// exceeded, and so would benefit from an update. +func (l *gcCPULimiterState) needUpdate(now int64) bool { + return now-l.lastUpdate.Load() > gcCPULimiterUpdatePeriod +} + +// addAssistTime notifies the limiter of additional assist time. It will be +// included in the next update. +func (l *gcCPULimiterState) addAssistTime(t int64) { + l.assistTimePool.Add(t) +} + +// addIdleTime notifies the limiter of additional time a P spent on the idle list.
It will be +// subtracted from the total CPU time in the next update. +func (l *gcCPULimiterState) addIdleTime(t int64) { + l.idleTimePool.Add(t) +} + +// update updates the bucket given runtime-specific information. now is the +// current monotonic time in nanoseconds. +// +// This is safe to call concurrently with other operations, except *GCTransition. +func (l *gcCPULimiterState) update(now int64) { + if !l.tryLock() { + // We failed to acquire the lock, which means something else is currently + // updating. Just drop our update; the next one to update will include + // our total assist time. + return + } + if l.transitioning { + throw("update during transition") + } + l.updateLocked(now) + l.unlock() +} + +// updateLocked is the implementation of update. l.lock must be held. +func (l *gcCPULimiterState) updateLocked(now int64) { + lastUpdate := l.lastUpdate.Load() + if now < lastUpdate { + // Defensively avoid overflow. This isn't even the latest update anyway. + return + } + windowTotalTime := (now - lastUpdate) * int64(l.nprocs) + l.lastUpdate.Store(now) + + // Drain the pool of assist time. + assistTime := l.assistTimePool.Load() + if assistTime != 0 { + l.assistTimePool.Add(-assistTime) + } + + // Drain the pool of idle time. + idleTime := l.idleTimePool.Load() + if idleTime != 0 { + l.idleTimePool.Add(-idleTime) + } + + if !l.test { + // Consume time from in-flight events. Make sure we're not preemptible so allp can't change. + // + // The reason we do this instead of just waiting for those events to finish and push updates + // is to ensure that all the time we're accounting for happened sometime between lastUpdate + // and now. This dramatically simplifies reasoning about the limiter because we're not at + // risk of more time being accounted for in this window than actually happened in this window, + // leading to all sorts of weird transient behavior.
+ mp := acquirem() + for _, pp := range allp { + typ, duration := pp.limiterEvent.consume(now) + switch typ { + case limiterEventIdleMarkWork: + fallthrough + case limiterEventIdle: + idleTime += duration + case limiterEventMarkAssist: + fallthrough + case limiterEventScavengeAssist: + assistTime += duration + case limiterEventNone: + break + default: + throw("invalid limiter event type found") + } + } + releasem(mp) + } + + // Compute total GC time. + windowGCTime := assistTime + if l.gcEnabled { + windowGCTime += int64(float64(windowTotalTime) * gcBackgroundUtilization) + } + + // Subtract out all idle time from the total time. Do this after computing + // GC time, because the background utilization is dependent on the *real* + // total time, not the total time after idle time is subtracted. + // + // Idle time is counted as any time that a P is on the P idle list plus idle mark + // time. Idle mark workers soak up time that the application spends idle. + // + // On a heavily undersubscribed system, any additional idle time can skew GC CPU + // utilization, because the GC might be executing continuously and thrashing, + // yet the CPU utilization with respect to GOMAXPROCS will be quite low, so + // the limiter fails to turn on. By subtracting idle time, we're removing time that + // we know the application was idle giving a more accurate picture of whether + // the GC is thrashing. + // + // Note that this can cause the limiter to turn on even if it's not needed. For + // instance, on a system with 32 Ps but only 1 running goroutine, each GC will have + // 8 dedicated GC workers. Assuming the GC cycle is half mark phase and half sweep + // phase, then the GC CPU utilization over that cycle, with idle time removed, will + // be 8/(8+2) = 80%. Even though the limiter turns on, though, assist should be + // unnecessary, as the GC has way more CPU time to outpace the 1 goroutine that's + // running. 
+ windowTotalTime -= idleTime + + l.accumulate(windowTotalTime-windowGCTime, windowGCTime) +} + +// accumulate adds time to the bucket and signals whether the limiter is enabled. +// +// This is an internal function that deals just with the bucket. Prefer update. +// l.lock must be held. +func (l *gcCPULimiterState) accumulate(mutatorTime, gcTime int64) { + headroom := l.bucket.capacity - l.bucket.fill + enabled := headroom == 0 + + // Let's be careful about three things here: + // 1. The addition and subtraction, for the invariants. + // 2. Overflow. + // 3. Excessive mutation of l.enabled, which is accessed + // by all assists, potentially more than once. + change := gcTime - mutatorTime + + // Handle limiting case. + if change > 0 && headroom <= uint64(change) { + l.overflow += uint64(change) - headroom + l.bucket.fill = l.bucket.capacity + if !enabled { + l.enabled.Store(true) + l.lastEnabledCycle.Store(memstats.numgc + 1) + } + return + } + + // Handle non-limiting cases. + if change < 0 && l.bucket.fill <= uint64(-change) { + // Bucket emptied. + l.bucket.fill = 0 + } else { + // All other cases. + l.bucket.fill -= uint64(-change) + } + if change != 0 && enabled { + l.enabled.Store(false) + } +} + +// tryLock attempts to lock l. Returns true on success. +func (l *gcCPULimiterState) tryLock() bool { + return l.lock.CompareAndSwap(0, 1) +} + +// unlock releases the lock on l. Must be called if tryLock returns true. +func (l *gcCPULimiterState) unlock() { + old := l.lock.Swap(0) + if old != 1 { + throw("double unlock") + } +} + +// capacityPerProc is the limiter's bucket capacity for each P in GOMAXPROCS. +const capacityPerProc = 1e9 // 1 second in nanoseconds + +// resetCapacity updates the capacity based on GOMAXPROCS. Must not be called +// while the GC is enabled. +// +// It is safe to call concurrently with other operations. 
+func (l *gcCPULimiterState) resetCapacity(now int64, nprocs int32) { + if !l.tryLock() { + // This must happen during a STW, so we can't fail to acquire the lock. + // If we did, something went wrong. Throw. + throw("failed to acquire lock to reset capacity") + } + // Flush the rest of the time for this period. + l.updateLocked(now) + l.nprocs = nprocs + + l.bucket.capacity = uint64(nprocs) * capacityPerProc + if l.bucket.fill > l.bucket.capacity { + l.bucket.fill = l.bucket.capacity + l.enabled.Store(true) + l.lastEnabledCycle.Store(memstats.numgc + 1) + } else if l.bucket.fill < l.bucket.capacity { + l.enabled.Store(false) + } + l.unlock() +} + +// limiterEventType indicates the type of an event occurring on some P. +// +// These events represent the full set of events that the GC CPU limiter tracks +// to execute its function. +// +// This type may use no more than limiterEventBits bits of information. +type limiterEventType uint8 + +const ( + limiterEventNone limiterEventType = iota // None of the following events. + limiterEventIdleMarkWork // Refers to an idle mark worker (see gcMarkWorkerMode). + limiterEventMarkAssist // Refers to mark assist (see gcAssistAlloc). + limiterEventScavengeAssist // Refers to a scavenge assist (see allocSpan). + limiterEventIdle // Refers to time a P spent on the idle list. + + limiterEventBits = 3 +) + +// limiterEventTypeMask is a mask for the bits in p.limiterEventStart that represent +// the event type. The rest of the bits of that field represent a timestamp. +const ( + limiterEventTypeMask = uint64((1<<limiterEventBits)-1) << (64 - limiterEventBits) + limiterEventStampNone = limiterEventStamp(0) +) + +// limiterEventStamp is a nanotime timestamp packed with a limiterEventType. +type limiterEventStamp uint64 + +// makeLimiterEventStamp creates a new stamp from the event type and the current timestamp. 
+func makeLimiterEventStamp(typ limiterEventType, now int64) limiterEventStamp { + return limiterEventStamp(uint64(typ)<<(64-limiterEventBits) | (uint64(now) &^ limiterEventTypeMask)) +} + +// duration computes the difference between now and the start time stored in the stamp. +// +// Returns 0 if the difference is negative, which may happen if now is stale or if the +// before and after timestamps cross a 2^(64-limiterEventBits) boundary. +func (s limiterEventStamp) duration(now int64) int64 { + // The top limiterEventBits bits of the timestamp are derived from the current time + // when computing a duration. + start := int64((uint64(now) & limiterEventTypeMask) | (uint64(s) &^ limiterEventTypeMask)) + if now < start { + return 0 + } + return now - start +} + +// typ extracts the event type from the stamp. +func (s limiterEventStamp) typ() limiterEventType { + return limiterEventType(s >> (64 - limiterEventBits)) +} + +// limiterEvent represents tracking state for an event tracked by the GC CPU limiter. +type limiterEvent struct { + stamp atomic.Uint64 // Stores a limiterEventStamp. +} + +// start begins tracking a new limiter event of the given type. If an event +// is already in flight, then a new event cannot begin because the current time is +// already being attributed to that event. In this case, this function returns false. +// Otherwise, it returns true. +// +// The caller must be non-preemptible until at least stop is called or this function +// returns false. Because this is trying to measure "on-CPU" time of some event, getting +// scheduled away during it can mean that whatever we're measuring isn't a reflection +// of "on-CPU" time. The OS could deschedule us at any time, but we want to maintain as +// close an approximation as we can.
+func (e *limiterEvent) start(typ limiterEventType, now int64) bool { + if limiterEventStamp(e.stamp.Load()).typ() != limiterEventNone { + return false + } + e.stamp.Store(uint64(makeLimiterEventStamp(typ, now))) + return true +} + +// consume acquires the partial event CPU time from any in-flight event. +// It achieves this by storing the current time as the new event time. +// +// Returns the type of the in-flight event, as well as how long it's currently been +// executing for. Returns limiterEventNone if no event is active. +func (e *limiterEvent) consume(now int64) (typ limiterEventType, duration int64) { + // Read the limiter event timestamp and update it to now. + for { + old := limiterEventStamp(e.stamp.Load()) + typ = old.typ() + if typ == limiterEventNone { + // There's no in-flight event, so just push that up. + return + } + duration = old.duration(now) + if duration == 0 { + // We might have a stale now value, or this crossed the + // 2^(64-limiterEventBits) boundary in the clock readings. + // Just ignore it. + return limiterEventNone, 0 + } + new := makeLimiterEventStamp(typ, now) + if e.stamp.CompareAndSwap(uint64(old), uint64(new)) { + break + } + } + return +} + +// stop stops the active limiter event. Throws if the +// active event's type does not match typ. +// +// The caller must be non-preemptible across the event. See start as to why. +func (e *limiterEvent) stop(typ limiterEventType, now int64) { + var stamp limiterEventStamp + for { + stamp = limiterEventStamp(e.stamp.Load()) + if stamp.typ() != typ { + print("runtime: want=", typ, " got=", stamp.typ(), "\n") + throw("limiterEvent.stop: found wrong event in p's limiter event slot") + } + if e.stamp.CompareAndSwap(uint64(stamp), uint64(limiterEventStampNone)) { + break + } + } + duration := stamp.duration(now) + if duration == 0 { + // It's possible that we're missing time because we crossed a + // 2^(64-limiterEventBits) boundary between the start and end. + // In this case, we're dropping that information.
This is OK because + // at worst it'll cause a transient hiccup that will quickly resolve + // itself as all new timestamps begin on the other side of the boundary. + // Such a hiccup should be incredibly rare. + return + } + // Account for the event. + switch typ { + case limiterEventIdleMarkWork: + gcCPULimiter.addIdleTime(duration) + case limiterEventIdle: + gcCPULimiter.addIdleTime(duration) + sched.idleTime.Add(duration) + case limiterEventMarkAssist: + fallthrough + case limiterEventScavengeAssist: + gcCPULimiter.addAssistTime(duration) + default: + throw("limiterEvent.stop: invalid limiter event type found") + } +} diff --git a/src/runtime/mgclimit_test.go b/src/runtime/mgclimit_test.go new file mode 100644 index 0000000..124da03 --- /dev/null +++ b/src/runtime/mgclimit_test.go @@ -0,0 +1,255 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + . "runtime" + "testing" + "time" +) + +func TestGCCPULimiter(t *testing.T) { + const procs = 14 + + // Create mock time. + ticks := int64(0) + advance := func(d time.Duration) int64 { + t.Helper() + ticks += int64(d) + return ticks + } + + // assistTime computes the CPU time for assists using frac of GOMAXPROCS + // over the wall-clock duration d. + assistTime := func(d time.Duration, frac float64) int64 { + t.Helper() + return int64(frac * float64(d) * procs) + } + + l := NewGCCPULimiter(ticks, procs) + + // Do the whole test twice to make sure state doesn't leak across. + var baseOverflow uint64 // Track total overflow across iterations. + for i := 0; i < 2; i++ { + t.Logf("Iteration %d", i+1) + + if l.Capacity() != procs*CapacityPerProc { + t.Fatalf("unexpected capacity: %d", l.Capacity()) + } + if l.Fill() != 0 { + t.Fatalf("expected empty bucket to start") + } + + // Test filling the bucket with just mutator time. 
+ + l.Update(advance(10 * time.Millisecond)) + l.Update(advance(1 * time.Second)) + l.Update(advance(1 * time.Hour)) + if l.Fill() != 0 { + t.Fatalf("expected empty bucket from only accumulating mutator time, got fill of %d cpu-ns", l.Fill()) + } + + // Test needUpdate. + + if l.NeedUpdate(advance(GCCPULimiterUpdatePeriod / 2)) { + t.Fatal("need update even though updated half a period ago") + } + if !l.NeedUpdate(advance(GCCPULimiterUpdatePeriod)) { + t.Fatal("doesn't need update even though updated 1.5 periods ago") + } + l.Update(advance(0)) + if l.NeedUpdate(advance(0)) { + t.Fatal("need update even though just updated") + } + + // Test transitioning the bucket to enable the GC. + + l.StartGCTransition(true, advance(109*time.Millisecond)) + l.FinishGCTransition(advance(2*time.Millisecond + 1*time.Microsecond)) + + if expect := uint64((2*time.Millisecond + 1*time.Microsecond) * procs); l.Fill() != expect { + t.Fatalf("expected fill of %d, got %d cpu-ns", expect, l.Fill()) + } + + // Test passing time without assists during a GC. Specifically, just enough to drain the bucket to + // exactly procs nanoseconds (easier to get to because of rounding). 
+ // + // The window we need to drain the bucket is 1/(1-2*gcBackgroundUtilization) times the current fill: + // + // fill + (window * procs * gcBackgroundUtilization - window * procs * (1-gcBackgroundUtilization)) = n + // fill = n - (window * procs * gcBackgroundUtilization - window * procs * (1-gcBackgroundUtilization)) + // fill = n + window * procs * ((1-gcBackgroundUtilization) - gcBackgroundUtilization) + // fill = n + window * procs * (1-2*gcBackgroundUtilization) + // window = (fill - n) / (procs * (1-2*gcBackgroundUtilization))) + // + // And here we want n=procs: + factor := (1 / (1 - 2*GCBackgroundUtilization)) + fill := (2*time.Millisecond + 1*time.Microsecond) * procs + l.Update(advance(time.Duration(factor * float64(fill-procs) / procs))) + if l.Fill() != procs { + t.Fatalf("expected fill %d cpu-ns from draining after a GC started, got fill of %d cpu-ns", procs, l.Fill()) + } + + // Drain to zero for the rest of the test. + l.Update(advance(2 * procs * CapacityPerProc)) + if l.Fill() != 0 { + t.Fatalf("expected empty bucket from draining, got fill of %d cpu-ns", l.Fill()) + } + + // Test filling up the bucket with 50% total GC work (so, not moving the bucket at all). + l.AddAssistTime(assistTime(10*time.Millisecond, 0.5-GCBackgroundUtilization)) + l.Update(advance(10 * time.Millisecond)) + if l.Fill() != 0 { + t.Fatalf("expected empty bucket from 50%% GC work, got fill of %d cpu-ns", l.Fill()) + } + + // Test adding to the bucket overall with 100% GC work. + l.AddAssistTime(assistTime(time.Millisecond, 1.0-GCBackgroundUtilization)) + l.Update(advance(time.Millisecond)) + if expect := uint64(procs * time.Millisecond); l.Fill() != expect { + t.Errorf("expected %d fill from 100%% GC CPU, got fill of %d cpu-ns", expect, l.Fill()) + } + if l.Limiting() { + t.Errorf("limiter is enabled after filling bucket but shouldn't be") + } + if t.Failed() { + t.FailNow() + } + + // Test filling the bucket exactly full. 
+ l.AddAssistTime(assistTime(CapacityPerProc-time.Millisecond, 1.0-GCBackgroundUtilization)) + l.Update(advance(CapacityPerProc - time.Millisecond)) + if l.Fill() != l.Capacity() { + t.Errorf("expected bucket filled to capacity %d, got %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is not enabled after filling bucket but should be") + } + if l.Overflow() != 0+baseOverflow { + t.Errorf("bucket filled exactly should not have overflow, found %d", l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Test adding with a delta of exactly zero. That is, GC work is exactly 50% of all resources. + // Specifically, the limiter should still be on, and no overflow should accumulate. + l.AddAssistTime(assistTime(1*time.Second, 0.5-GCBackgroundUtilization)) + l.Update(advance(1 * time.Second)) + if l.Fill() != l.Capacity() { + t.Errorf("expected bucket filled to capacity %d, got %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is not enabled after filling bucket but should be") + } + if l.Overflow() != 0+baseOverflow { + t.Errorf("bucket filled exactly should not have overflow, found %d", l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Drain the bucket by half. + l.AddAssistTime(assistTime(CapacityPerProc, 0)) + l.Update(advance(CapacityPerProc)) + if expect := l.Capacity() / 2; l.Fill() != expect { + t.Errorf("failed to drain to %d, got fill %d", expect, l.Fill()) + } + if l.Limiting() { + t.Errorf("limiter is enabled after draining bucket but shouldn't be") + } + if t.Failed() { + t.FailNow() + } + + // Test overfilling the bucket. 
+ l.AddAssistTime(assistTime(CapacityPerProc, 1.0-GCBackgroundUtilization)) + l.Update(advance(CapacityPerProc)) + if l.Fill() != l.Capacity() { + t.Errorf("failed to fill to capacity %d, got fill %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is not enabled after overfill but should be") + } + if expect := uint64(CapacityPerProc * procs / 2); l.Overflow() != expect+baseOverflow { + t.Errorf("bucket overfilled should have overflow %d, found %d", expect, l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Test ending the cycle with some assists left over. + l.AddAssistTime(assistTime(1*time.Millisecond, 1.0-GCBackgroundUtilization)) + l.StartGCTransition(false, advance(1*time.Millisecond)) + if l.Fill() != l.Capacity() { + t.Errorf("failed to maintain fill to capacity %d, got fill %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is not enabled after overfill but should be") + } + if expect := uint64((CapacityPerProc/2 + time.Millisecond) * procs); l.Overflow() != expect+baseOverflow { + t.Errorf("bucket overfilled should have overflow %d, found %d", expect, l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Make sure the STW adds to the bucket. + l.FinishGCTransition(advance(5 * time.Millisecond)) + if l.Fill() != l.Capacity() { + t.Errorf("failed to maintain fill to capacity %d, got fill %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is not enabled after overfill but should be") + } + if expect := uint64((CapacityPerProc/2 + 6*time.Millisecond) * procs); l.Overflow() != expect+baseOverflow { + t.Errorf("bucket overfilled should have overflow %d, found %d", expect, l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Resize procs up and make sure limiting stops. 
+ expectFill := l.Capacity() + l.ResetCapacity(advance(0), procs+10) + if l.Fill() != expectFill { + t.Errorf("failed to maintain fill at old capacity %d, got fill %d", expectFill, l.Fill()) + } + if l.Limiting() { + t.Errorf("limiter is enabled after resetting capacity higher") + } + if expect := uint64((CapacityPerProc/2 + 6*time.Millisecond) * procs); l.Overflow() != expect+baseOverflow { + t.Errorf("bucket overflow %d should have remained constant, found %d", expect, l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Resize procs down and make sure limiting begins again. + // Also make sure resizing doesn't affect overflow. This isn't + // a case where we want to report overflow, because we're not + // actively doing work to achieve it. It's that we have fewer + // CPU resources now. + l.ResetCapacity(advance(0), procs-10) + if l.Fill() != l.Capacity() { + t.Errorf("failed lower fill to new capacity %d, got fill %d", l.Capacity(), l.Fill()) + } + if !l.Limiting() { + t.Errorf("limiter is disabled after resetting capacity lower") + } + if expect := uint64((CapacityPerProc/2 + 6*time.Millisecond) * procs); l.Overflow() != expect+baseOverflow { + t.Errorf("bucket overflow %d should have remained constant, found %d", expect, l.Overflow()) + } + if t.Failed() { + t.FailNow() + } + + // Get back to a zero state. The top of the loop will double check. + l.ResetCapacity(advance(CapacityPerProc*procs), procs) + + // Track total overflow for future iterations. + baseOverflow += uint64((CapacityPerProc/2 + 6*time.Millisecond) * procs) + } +} diff --git a/src/runtime/mgcmark.go b/src/runtime/mgcmark.go new file mode 100644 index 0000000..cfda706 --- /dev/null +++ b/src/runtime/mgcmark.go @@ -0,0 +1,1598 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// Garbage collector: marking and scanning + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +const ( + fixedRootFinalizers = iota + fixedRootFreeGStacks + fixedRootCount + + // rootBlockBytes is the number of bytes to scan per data or + // BSS root. + rootBlockBytes = 256 << 10 + + // maxObletBytes is the maximum bytes of an object to scan at + // once. Larger objects will be split up into "oblets" of at + // most this size. Since we can scan 1–2 MB/ms, 128 KB bounds + // scan preemption at ~100 µs. + // + // This must be > _MaxSmallSize so that the object base is the + // span base. + maxObletBytes = 128 << 10 + + // drainCheckThreshold specifies how many units of work to do + // between self-preemption checks in gcDrain. Assuming a scan + // rate of 1 MB/ms, this is ~100 µs. Lower values have higher + // overhead in the scan loop (the scheduler check may perform + // a syscall, so its overhead is nontrivial). Higher values + // make the system less responsive to incoming work. + drainCheckThreshold = 100000 + + // pagesPerSpanRoot indicates how many pages to scan from a span root + // at a time. Used by special root marking. + // + // Higher values improve throughput by increasing locality, but + // increase the minimum latency of a marking operation. + // + // Must be a multiple of the pageInUse bitmap element size and + // must also evenly divide pagesPerArena. + pagesPerSpanRoot = 512 +) + +// gcMarkRootPrepare queues root scanning jobs (stacks, globals, and +// some miscellany) and initializes scanning-related state. +// +// The world must be stopped. +func gcMarkRootPrepare() { + assertWorldStopped() + + // Compute how many data and BSS root blocks there are. + nBlocks := func(bytes uintptr) int { + return int(divRoundUp(bytes, rootBlockBytes)) + } + + work.nDataRoots = 0 + work.nBSSRoots = 0 + + // Scan globals. 
+ for _, datap := range activeModules() { + nDataRoots := nBlocks(datap.edata - datap.data) + if nDataRoots > work.nDataRoots { + work.nDataRoots = nDataRoots + } + } + + for _, datap := range activeModules() { + nBSSRoots := nBlocks(datap.ebss - datap.bss) + if nBSSRoots > work.nBSSRoots { + work.nBSSRoots = nBSSRoots + } + } + + // Scan span roots for finalizer specials. + // + // We depend on addfinalizer to mark objects that get + // finalizers after root marking. + // + // We're going to scan the whole heap (that was available at the time the + // mark phase started, i.e. markArenas) for in-use spans which have specials. + // + // Break up the work into arenas, and further into chunks. + // + // Snapshot allArenas as markArenas. This snapshot is safe because allArenas + // is append-only. + mheap_.markArenas = mheap_.allArenas[:len(mheap_.allArenas):len(mheap_.allArenas)] + work.nSpanRoots = len(mheap_.markArenas) * (pagesPerArena / pagesPerSpanRoot) + + // Scan stacks. + // + // Gs may be created after this point, but it's okay that we + // ignore them because they begin life without any roots, so + // there's nothing to scan, and any roots they create during + // the concurrent phase will be caught by the write barrier. + work.stackRoots = allGsSnapshot() + work.nStackRoots = len(work.stackRoots) + + work.markrootNext = 0 + work.markrootJobs = uint32(fixedRootCount + work.nDataRoots + work.nBSSRoots + work.nSpanRoots + work.nStackRoots) + + // Calculate base indexes of each root type + work.baseData = uint32(fixedRootCount) + work.baseBSS = work.baseData + uint32(work.nDataRoots) + work.baseSpans = work.baseBSS + uint32(work.nBSSRoots) + work.baseStacks = work.baseSpans + uint32(work.nSpanRoots) + work.baseEnd = work.baseStacks + uint32(work.nStackRoots) +} + +// gcMarkRootCheck checks that all roots have been scanned. It is +// purely for debugging. 
+func gcMarkRootCheck() { + if work.markrootNext < work.markrootJobs { + print(work.markrootNext, " of ", work.markrootJobs, " markroot jobs done\n") + throw("left over markroot jobs") + } + + // Check that stacks have been scanned. + // + // We only check the first nStackRoots Gs that we should have scanned. + // Since we don't care about newer Gs (see comment in + // gcMarkRootPrepare), no locking is required. + i := 0 + forEachGRace(func(gp *g) { + if i >= work.nStackRoots { + return + } + + if !gp.gcscandone { + println("gp", gp, "goid", gp.goid, + "status", readgstatus(gp), + "gcscandone", gp.gcscandone) + throw("scan missed a g") + } + + i++ + }) +} + +// ptrmask for an allocation containing a single pointer. +var oneptrmask = [...]uint8{1} + +// markroot scans the i'th root. +// +// Preemption must be disabled (because this uses a gcWork). +// +// Returns the amount of GC work credit produced by the operation. +// If flushBgCredit is true, then that credit is also flushed +// to the background credit pool. +// +// nowritebarrier is only advisory here. +// +//go:nowritebarrier +func markroot(gcw *gcWork, i uint32, flushBgCredit bool) int64 { + // Note: if you add a case here, please also update heapdump.go:dumproots. 
+ var workDone int64 + var workCounter *atomic.Int64 + switch { + case work.baseData <= i && i < work.baseBSS: + workCounter = &gcController.globalsScanWork + for _, datap := range activeModules() { + workDone += markrootBlock(datap.data, datap.edata-datap.data, datap.gcdatamask.bytedata, gcw, int(i-work.baseData)) + } + + case work.baseBSS <= i && i < work.baseSpans: + workCounter = &gcController.globalsScanWork + for _, datap := range activeModules() { + workDone += markrootBlock(datap.bss, datap.ebss-datap.bss, datap.gcbssmask.bytedata, gcw, int(i-work.baseBSS)) + } + + case i == fixedRootFinalizers: + for fb := allfin; fb != nil; fb = fb.alllink { + cnt := uintptr(atomic.Load(&fb.cnt)) + scanblock(uintptr(unsafe.Pointer(&fb.fin[0])), cnt*unsafe.Sizeof(fb.fin[0]), &finptrmask[0], gcw, nil) + } + + case i == fixedRootFreeGStacks: + // Switch to the system stack so we can call + // stackfree. + systemstack(markrootFreeGStacks) + + case work.baseSpans <= i && i < work.baseStacks: + // mark mspan.specials + markrootSpans(gcw, int(i-work.baseSpans)) + + default: + // the rest is scanning goroutine stacks + workCounter = &gcController.stackScanWork + if i < work.baseStacks || work.baseEnd <= i { + printlock() + print("runtime: markroot index ", i, " not in stack roots range [", work.baseStacks, ", ", work.baseEnd, ")\n") + throw("markroot: bad index") + } + gp := work.stackRoots[i-work.baseStacks] + + // remember when we've first observed the G blocked + // needed only to output in traceback + status := readgstatus(gp) // We are not in a scan state + if (status == _Gwaiting || status == _Gsyscall) && gp.waitsince == 0 { + gp.waitsince = work.tstart + } + + // scanstack must be done on the system stack in case + // we're trying to scan our own stack. + systemstack(func() { + // If this is a self-scan, put the user G in + // _Gwaiting to prevent self-deadlock. It may + // already be in _Gwaiting if this is a mark + // worker or we're in mark termination. 
+ userG := getg().m.curg + selfScan := gp == userG && readgstatus(userG) == _Grunning + if selfScan { + casGToWaiting(userG, _Grunning, waitReasonGarbageCollectionScan) + } + + // TODO: suspendG blocks (and spins) until gp + // stops, which may take a while for + // running goroutines. Consider doing this in + // two phases where the first is non-blocking: + // we scan the stacks we can and ask running + // goroutines to scan themselves; and the + // second blocks. + stopped := suspendG(gp) + if stopped.dead { + gp.gcscandone = true + return + } + if gp.gcscandone { + throw("g already scanned") + } + workDone += scanstack(gp, gcw) + gp.gcscandone = true + resumeG(stopped) + + if selfScan { + casgstatus(userG, _Gwaiting, _Grunning) + } + }) + } + if workCounter != nil && workDone != 0 { + workCounter.Add(workDone) + if flushBgCredit { + gcFlushBgCredit(workDone) + } + } + return workDone +} + +// markrootBlock scans the shard'th shard of the block of memory [b0, +// b0+n0), with the given pointer mask. +// +// Returns the amount of work done. +// +//go:nowritebarrier +func markrootBlock(b0, n0 uintptr, ptrmask0 *uint8, gcw *gcWork, shard int) int64 { + if rootBlockBytes%(8*goarch.PtrSize) != 0 { + // This is necessary to pick byte offsets in ptrmask0. + throw("rootBlockBytes must be a multiple of 8*ptrSize") + } + + // Note that if b0 is toward the end of the address space, + // then b0 + rootBlockBytes might wrap around. + // These tests are written to avoid any possible overflow. + off := uintptr(shard) * rootBlockBytes + if off >= n0 { + return 0 + } + b := b0 + off + ptrmask := (*uint8)(add(unsafe.Pointer(ptrmask0), uintptr(shard)*(rootBlockBytes/(8*goarch.PtrSize)))) + n := uintptr(rootBlockBytes) + if off+n > n0 { + n = n0 - off + } + + // Scan this shard. + scanblock(b, n, ptrmask, gcw, nil) + return int64(n) +} + +// markrootFreeGStacks frees stacks of dead Gs. 
+// +// This does not free stacks of dead Gs cached on Ps, but having a few +// cached stacks around isn't a problem. +func markrootFreeGStacks() { + // Take list of dead Gs with stacks. + lock(&sched.gFree.lock) + list := sched.gFree.stack + sched.gFree.stack = gList{} + unlock(&sched.gFree.lock) + if list.empty() { + return + } + + // Free stacks. + q := gQueue{list.head, list.head} + for gp := list.head.ptr(); gp != nil; gp = gp.schedlink.ptr() { + stackfree(gp.stack) + gp.stack.lo = 0 + gp.stack.hi = 0 + // Manipulate the queue directly since the Gs are + // already all linked the right way. + q.tail.set(gp) + } + + // Put Gs back on the free list. + lock(&sched.gFree.lock) + sched.gFree.noStack.pushAll(q) + unlock(&sched.gFree.lock) +} + +// markrootSpans marks roots for one shard of markArenas. +// +//go:nowritebarrier +func markrootSpans(gcw *gcWork, shard int) { + // Objects with finalizers have two GC-related invariants: + // + // 1) Everything reachable from the object must be marked. + // This ensures that when we pass the object to its finalizer, + // everything the finalizer can reach will be retained. + // + // 2) Finalizer specials (which are not in the garbage + // collected heap) are roots. In practice, this means the fn + // field must be scanned. + sg := mheap_.sweepgen + + // Find the arena and page index into that arena for this shard. + ai := mheap_.markArenas[shard/(pagesPerArena/pagesPerSpanRoot)] + ha := mheap_.arenas[ai.l1()][ai.l2()] + arenaPage := uint(uintptr(shard) * pagesPerSpanRoot % pagesPerArena) + + // Construct slice of bitmap which we'll iterate over. + specialsbits := ha.pageSpecials[arenaPage/8:] + specialsbits = specialsbits[:pagesPerSpanRoot/8] + for i := range specialsbits { + // Find set bits, which correspond to spans with specials. + specials := atomic.Load8(&specialsbits[i]) + if specials == 0 { + continue + } + for j := uint(0); j < 8; j++ { + if specials&(1<<j) == 0 { + continue + } + // Find the span for this bit. 
+ // + // This value is guaranteed to be non-nil because having + // specials implies that the span is in-use, and since we're + // currently marking we can be sure that we don't have to worry + // about the span being freed and re-used. + s := ha.spans[arenaPage+uint(i)*8+j] + + // The state must be mSpanInUse if the specials bit is set, so + // sanity check that. + if state := s.state.get(); state != mSpanInUse { + print("s.state = ", state, "\n") + throw("non in-use span found with specials bit set") + } + // Check that this span was swept (it may be cached or uncached). + if !useCheckmark && !(s.sweepgen == sg || s.sweepgen == sg+3) { + // sweepgen was updated (+2) during non-checkmark GC pass + print("sweep ", s.sweepgen, " ", sg, "\n") + throw("gc: unswept span") + } + + // Lock the specials to prevent a special from being + // removed from the list while we're traversing it. + lock(&s.speciallock) + for sp := s.specials; sp != nil; sp = sp.next { + if sp.kind != _KindSpecialFinalizer { + continue + } + // don't mark finalized object, but scan it so we + // retain everything it points to. + spf := (*specialfinalizer)(unsafe.Pointer(sp)) + // A finalizer can be set for an inner byte of an object, find object beginning. + p := s.base() + uintptr(spf.special.offset)/s.elemsize*s.elemsize + + // Mark everything that can be reached from + // the object (but *not* the object itself or + // we'll never collect it). + if !s.spanclass.noscan() { + scanobject(p, gcw) + } + + // The special itself is a root. + scanblock(uintptr(unsafe.Pointer(&spf.fn)), goarch.PtrSize, &oneptrmask[0], gcw, nil) + } + unlock(&s.speciallock) + } + } +} + +// gcAssistAlloc performs GC work to make gp's assist debt positive. +// gp must be the calling user goroutine. +// +// This must be called with preemption enabled. +func gcAssistAlloc(gp *g) { + // Don't assist in non-preemptible contexts. These are + // generally fragile and won't allow the assist to block. 
+ if getg() == gp.m.g0 { + return + } + if mp := getg().m; mp.locks > 0 || mp.preemptoff != "" { + return + } + + traced := false +retry: + if go119MemoryLimitSupport && gcCPULimiter.limiting() { + // If the CPU limiter is enabled, intentionally don't + // assist to reduce the amount of CPU time spent in the GC. + if traced { + traceGCMarkAssistDone() + } + return + } + // Compute the amount of scan work we need to do to make the + // balance positive. When the required amount of work is low, + // we over-assist to build up credit for future allocations + // and amortize the cost of assisting. + assistWorkPerByte := gcController.assistWorkPerByte.Load() + assistBytesPerWork := gcController.assistBytesPerWork.Load() + debtBytes := -gp.gcAssistBytes + scanWork := int64(assistWorkPerByte * float64(debtBytes)) + if scanWork < gcOverAssistWork { + scanWork = gcOverAssistWork + debtBytes = int64(assistBytesPerWork * float64(scanWork)) + } + + // Steal as much credit as we can from the background GC's + // scan credit. This is racy and may drop the background + // credit below 0 if two mutators steal at the same time. This + // will just cause steals to fail until credit is accumulated + // again, so in the long run it doesn't really matter, but we + // do have to handle the negative credit case. + bgScanCredit := gcController.bgScanCredit.Load() + stolen := int64(0) + if bgScanCredit > 0 { + if bgScanCredit < scanWork { + stolen = bgScanCredit + gp.gcAssistBytes += 1 + int64(assistBytesPerWork*float64(stolen)) + } else { + stolen = scanWork + gp.gcAssistBytes += debtBytes + } + gcController.bgScanCredit.Add(-stolen) + + scanWork -= stolen + + if scanWork == 0 { + // We were able to steal all of the credit we + // needed. 
+ if traced { + traceGCMarkAssistDone() + } + return + } + } + + if trace.enabled && !traced { + traced = true + traceGCMarkAssistStart() + } + + // Perform assist work + systemstack(func() { + gcAssistAlloc1(gp, scanWork) + // The user stack may have moved, so this can't touch + // anything on it until it returns from systemstack. + }) + + completed := gp.param != nil + gp.param = nil + if completed { + gcMarkDone() + } + + if gp.gcAssistBytes < 0 { + // We were unable steal enough credit or perform + // enough work to pay off the assist debt. We need to + // do one of these before letting the mutator allocate + // more to prevent over-allocation. + // + // If this is because we were preempted, reschedule + // and try some more. + if gp.preempt { + Gosched() + goto retry + } + + // Add this G to an assist queue and park. When the GC + // has more background credit, it will satisfy queued + // assists before flushing to the global credit pool. + // + // Note that this does *not* get woken up when more + // work is added to the work list. The theory is that + // there wasn't enough work to do anyway, so we might + // as well let background marking take care of the + // work that is available. + if !gcParkAssist() { + goto retry + } + + // At this point either background GC has satisfied + // this G's assist debt, or the GC cycle is over. + } + if traced { + traceGCMarkAssistDone() + } +} + +// gcAssistAlloc1 is the part of gcAssistAlloc that runs on the system +// stack. This is a separate function to make it easier to see that +// we're not capturing anything from the user stack, since the user +// stack may move while we're in this function. +// +// gcAssistAlloc1 indicates whether this assist completed the mark +// phase by setting gp.param to non-nil. This can't be communicated on +// the stack since it may move. +// +//go:systemstack +func gcAssistAlloc1(gp *g, scanWork int64) { + // Clear the flag indicating that this assist completed the + // mark phase. 
+ gp.param = nil + + if atomic.Load(&gcBlackenEnabled) == 0 { + // The gcBlackenEnabled check in malloc races with the + // store that clears it but an atomic check in every malloc + // would be a performance hit. + // Instead we recheck it here on the non-preemptable system + // stack to determine if we should perform an assist. + + // GC is done, so ignore any remaining debt. + gp.gcAssistBytes = 0 + return + } + // Track time spent in this assist. Since we're on the + // system stack, this is non-preemptible, so we can + // just measure start and end time. + // + // Limiter event tracking might be disabled if we end up here + // while on a mark worker. + startTime := nanotime() + trackLimiterEvent := gp.m.p.ptr().limiterEvent.start(limiterEventMarkAssist, startTime) + + decnwait := atomic.Xadd(&work.nwait, -1) + if decnwait == work.nproc { + println("runtime: work.nwait =", decnwait, "work.nproc=", work.nproc) + throw("nwait > work.nprocs") + } + + // gcDrainN requires the caller to be preemptible. + casGToWaiting(gp, _Grunning, waitReasonGCAssistMarking) + + // drain own cached work first in the hopes that it + // will be more cache friendly. + gcw := &getg().m.p.ptr().gcw + workDone := gcDrainN(gcw, scanWork) + + casgstatus(gp, _Gwaiting, _Grunning) + + // Record that we did this much scan work. + // + // Back out the number of bytes of assist credit that + // this scan work counts for. The "1+" is a poor man's + // round-up, to ensure this adds credit even if + // assistBytesPerWork is very low. + assistBytesPerWork := gcController.assistBytesPerWork.Load() + gp.gcAssistBytes += 1 + int64(assistBytesPerWork*float64(workDone)) + + // If this is the last worker and we ran out of work, + // signal a completion point. 
+ incnwait := atomic.Xadd(&work.nwait, +1) + if incnwait > work.nproc { + println("runtime: work.nwait=", incnwait, + "work.nproc=", work.nproc) + throw("work.nwait > work.nproc") + } + + if incnwait == work.nproc && !gcMarkWorkAvailable(nil) { + // This has reached a background completion point. Set + // gp.param to a non-nil value to indicate this. It + // doesn't matter what we set it to (it just has to be + // a valid pointer). + gp.param = unsafe.Pointer(gp) + } + now := nanotime() + duration := now - startTime + pp := gp.m.p.ptr() + pp.gcAssistTime += duration + if trackLimiterEvent { + pp.limiterEvent.stop(limiterEventMarkAssist, now) + } + if pp.gcAssistTime > gcAssistTimeSlack { + gcController.assistTime.Add(pp.gcAssistTime) + gcCPULimiter.update(now) + pp.gcAssistTime = 0 + } +} + +// gcWakeAllAssists wakes all currently blocked assists. This is used +// at the end of a GC cycle. gcBlackenEnabled must be false to prevent +// new assists from going to sleep after this point. +func gcWakeAllAssists() { + lock(&work.assistQueue.lock) + list := work.assistQueue.q.popList() + injectglist(&list) + unlock(&work.assistQueue.lock) +} + +// gcParkAssist puts the current goroutine on the assist queue and parks. +// +// gcParkAssist reports whether the assist is now satisfied. If it +// returns false, the caller must retry the assist. +func gcParkAssist() bool { + lock(&work.assistQueue.lock) + // If the GC cycle finished while we were getting the lock, + // exit the assist. The cycle can't finish while we hold the + // lock. + if atomic.Load(&gcBlackenEnabled) == 0 { + unlock(&work.assistQueue.lock) + return true + } + + gp := getg() + oldList := work.assistQueue.q + work.assistQueue.q.pushBack(gp) + + // Recheck for background credit now that this G is in + // the queue, but can still back out. This avoids a + // race in case background marking has flushed more + // credit since we checked above. 
+ if gcController.bgScanCredit.Load() > 0 { + work.assistQueue.q = oldList + if oldList.tail != 0 { + oldList.tail.ptr().schedlink.set(nil) + } + unlock(&work.assistQueue.lock) + return false + } + // Park. + goparkunlock(&work.assistQueue.lock, waitReasonGCAssistWait, traceEvGoBlockGC, 2) + return true +} + +// gcFlushBgCredit flushes scanWork units of background scan work +// credit. This first satisfies blocked assists on the +// work.assistQueue and then flushes any remaining credit to +// gcController.bgScanCredit. +// +// Write barriers are disallowed because this is used by gcDrain after +// it has ensured that all work is drained and this must preserve that +// condition. +// +//go:nowritebarrierrec +func gcFlushBgCredit(scanWork int64) { + if work.assistQueue.q.empty() { + // Fast path; there are no blocked assists. There's a + // small window here where an assist may add itself to + // the blocked queue and park. If that happens, we'll + // just get it on the next flush. + gcController.bgScanCredit.Add(scanWork) + return + } + + assistBytesPerWork := gcController.assistBytesPerWork.Load() + scanBytes := int64(float64(scanWork) * assistBytesPerWork) + + lock(&work.assistQueue.lock) + for !work.assistQueue.q.empty() && scanBytes > 0 { + gp := work.assistQueue.q.pop() + // Note that gp.gcAssistBytes is negative because gp + // is in debt. Think carefully about the signs below. + if scanBytes+gp.gcAssistBytes >= 0 { + // Satisfy this entire assist debt. + scanBytes += gp.gcAssistBytes + gp.gcAssistBytes = 0 + // It's important that we *not* put gp in + // runnext. Otherwise, it's possible for user + // code to exploit the GC worker's high + // scheduler priority to get itself always run + // before other goroutines and always in the + // fresh quantum started by GC. + ready(gp, 0, false) + } else { + // Partially satisfy this assist. 
+ gp.gcAssistBytes += scanBytes + scanBytes = 0 + // As a heuristic, we move this assist to the + // back of the queue so that large assists + // can't clog up the assist queue and + // substantially delay small assists. + work.assistQueue.q.pushBack(gp) + break + } + } + + if scanBytes > 0 { + // Convert from scan bytes back to work. + assistWorkPerByte := gcController.assistWorkPerByte.Load() + scanWork = int64(float64(scanBytes) * assistWorkPerByte) + gcController.bgScanCredit.Add(scanWork) + } + unlock(&work.assistQueue.lock) +} + +// scanstack scans gp's stack, greying all pointers found on the stack. +// +// Returns the amount of scan work performed, but doesn't update +// gcController.stackScanWork or flush any credit. Any background credit produced +// by this function should be flushed by its caller. scanstack itself can't +// safely flush because it may result in trying to wake up a goroutine that +// was just scanned, resulting in a self-deadlock. +// +// scanstack will also shrink the stack if it is safe to do so. If it +// is not, it schedules a stack shrink for the next synchronous safe +// point. +// +// scanstack is marked go:systemstack because it must not be preempted +// while using a workbuf. 
+// +//go:nowritebarrier +//go:systemstack +func scanstack(gp *g, gcw *gcWork) int64 { + if readgstatus(gp)&_Gscan == 0 { + print("runtime:scanstack: gp=", gp, ", goid=", gp.goid, ", gp->atomicstatus=", hex(readgstatus(gp)), "\n") + throw("scanstack - bad status") + } + + switch readgstatus(gp) &^ _Gscan { + default: + print("runtime: gp=", gp, ", goid=", gp.goid, ", gp->atomicstatus=", readgstatus(gp), "\n") + throw("mark - bad status") + case _Gdead: + return 0 + case _Grunning: + print("runtime: gp=", gp, ", goid=", gp.goid, ", gp->atomicstatus=", readgstatus(gp), "\n") + throw("scanstack: goroutine not stopped") + case _Grunnable, _Gsyscall, _Gwaiting: + // ok + } + + if gp == getg() { + throw("can't scan our own stack") + } + + // scannedSize is the amount of work we'll be reporting. + // + // It is less than the allocated size (which is hi-lo). + var sp uintptr + if gp.syscallsp != 0 { + sp = gp.syscallsp // If in a system call this is the stack pointer (gp.sched.sp can be 0 in this case on Windows). + } else { + sp = gp.sched.sp + } + scannedSize := gp.stack.hi - sp + + // Keep statistics for initial stack size calculation. + // Note that this accumulates the scanned size, not the allocated size. + p := getg().m.p.ptr() + p.scannedStackSize += uint64(scannedSize) + p.scannedStacks++ + + if isShrinkStackSafe(gp) { + // Shrink the stack if not much of it is being used. + shrinkstack(gp) + } else { + // Otherwise, shrink the stack at the next sync safe point. + gp.preemptShrink = true + } + + var state stackScanState + state.stack = gp.stack + + if stackTraceDebug { + println("stack trace goroutine", gp.goid) + } + + if debugScanConservative && gp.asyncSafePoint { + print("scanning async preempted goroutine ", gp.goid, " stack [", hex(gp.stack.lo), ",", hex(gp.stack.hi), ")\n") + } + + // Scan the saved context register. This is effectively a live + // register that gets moved back and forth between the + // register and sched.ctxt without a write barrier. 
+ if gp.sched.ctxt != nil { + scanblock(uintptr(unsafe.Pointer(&gp.sched.ctxt)), goarch.PtrSize, &oneptrmask[0], gcw, &state) + } + + // Scan the stack. Accumulate a list of stack objects. + scanframe := func(frame *stkframe, unused unsafe.Pointer) bool { + scanframeworker(frame, &state, gcw) + return true + } + gentraceback(^uintptr(0), ^uintptr(0), 0, gp, 0, nil, 0x7fffffff, scanframe, nil, 0) + + // Find additional pointers that point into the stack from the heap. + // Currently this includes defers and panics. See also function copystack. + + // Find and trace other pointers in defer records. + for d := gp._defer; d != nil; d = d.link { + if d.fn != nil { + // Scan the func value, which could be a stack allocated closure. + // See issue 30453. + scanblock(uintptr(unsafe.Pointer(&d.fn)), goarch.PtrSize, &oneptrmask[0], gcw, &state) + } + if d.link != nil { + // The link field of a stack-allocated defer record might point + // to a heap-allocated defer record. Keep that heap record live. + scanblock(uintptr(unsafe.Pointer(&d.link)), goarch.PtrSize, &oneptrmask[0], gcw, &state) + } + // Retain defers records themselves. + // Defer records might not be reachable from the G through regular heap + // tracing because the defer linked list might weave between the stack and the heap. + if d.heap { + scanblock(uintptr(unsafe.Pointer(&d)), goarch.PtrSize, &oneptrmask[0], gcw, &state) + } + } + if gp._panic != nil { + // Panics are always stack allocated. + state.putPtr(uintptr(unsafe.Pointer(gp._panic)), false) + } + + // Find and scan all reachable stack objects. + // + // The state's pointer queue prioritizes precise pointers over + // conservative pointers so that we'll prefer scanning stack + // objects precisely. + state.buildIndex() + for { + p, conservative := state.getPtr() + if p == 0 { + break + } + obj := state.findObject(p) + if obj == nil { + continue + } + r := obj.r + if r == nil { + // We've already scanned this object. 
+ continue + } + obj.setRecord(nil) // Don't scan it again. + if stackTraceDebug { + printlock() + print(" live stkobj at", hex(state.stack.lo+uintptr(obj.off)), "of size", obj.size) + if conservative { + print(" (conservative)") + } + println() + printunlock() + } + gcdata := r.gcdata() + var s *mspan + if r.useGCProg() { + // This path is pretty unlikely, an object large enough + // to have a GC program allocated on the stack. + // We need some space to unpack the program into a straight + // bitmask, which we allocate/free here. + // TODO: it would be nice if there were a way to run a GC + // program without having to store all its bits. We'd have + // to change from a Lempel-Ziv style program to something else. + // Or we can forbid putting objects on stacks if they require + // a gc program (see issue 27447). + s = materializeGCProg(r.ptrdata(), gcdata) + gcdata = (*byte)(unsafe.Pointer(s.startAddr)) + } + + b := state.stack.lo + uintptr(obj.off) + if conservative { + scanConservative(b, r.ptrdata(), gcdata, gcw, &state) + } else { + scanblock(b, r.ptrdata(), gcdata, gcw, &state) + } + + if s != nil { + dematerializeGCProg(s) + } + } + + // Deallocate object buffers. + // (Pointer buffers were all deallocated in the loop above.) + for state.head != nil { + x := state.head + state.head = x.next + if stackTraceDebug { + for i := 0; i < x.nobj; i++ { + obj := &x.obj[i] + if obj.r == nil { // reachable + continue + } + println(" dead stkobj at", hex(gp.stack.lo+uintptr(obj.off)), "of size", obj.r.size) + // Note: not necessarily really dead - only reachable-from-ptr dead. + } + } + x.nobj = 0 + putempty((*workbuf)(unsafe.Pointer(x))) + } + if state.buf != nil || state.cbuf != nil || state.freeBuf != nil { + throw("remaining pointer buffers") + } + return int64(scannedSize) +} + +// Scan a stack frame: local variables and function arguments/results. 
+// +//go:nowritebarrier +func scanframeworker(frame *stkframe, state *stackScanState, gcw *gcWork) { + if _DebugGC > 1 && frame.continpc != 0 { + print("scanframe ", funcname(frame.fn), "\n") + } + + isAsyncPreempt := frame.fn.valid() && frame.fn.funcID == funcID_asyncPreempt + isDebugCall := frame.fn.valid() && frame.fn.funcID == funcID_debugCallV2 + if state.conservative || isAsyncPreempt || isDebugCall { + if debugScanConservative { + println("conservatively scanning function", funcname(frame.fn), "at PC", hex(frame.continpc)) + } + + // Conservatively scan the frame. Unlike the precise + // case, this includes the outgoing argument space + // since we may have stopped while this function was + // setting up a call. + // + // TODO: We could narrow this down if the compiler + // produced a single map per function of stack slots + // and registers that ever contain a pointer. + if frame.varp != 0 { + size := frame.varp - frame.sp + if size > 0 { + scanConservative(frame.sp, size, nil, gcw, state) + } + } + + // Scan arguments to this frame. + if n := frame.argBytes(); n != 0 { + // TODO: We could pass the entry argument map + // to narrow this down further. + scanConservative(frame.argp, n, nil, gcw, state) + } + + if isAsyncPreempt || isDebugCall { + // This function's frame contained the + // registers for the asynchronously stopped + // parent frame. Scan the parent + // conservatively. + state.conservative = true + } else { + // We only wanted to scan those two frames + // conservatively. Clear the flag for future + // frames. + state.conservative = false + } + return + } + + locals, args, objs := frame.getStackMap(&state.cache, false) + + // Scan local variables if stack frame has been allocated. + if locals.n > 0 { + size := uintptr(locals.n) * goarch.PtrSize + scanblock(frame.varp-size, size, locals.bytedata, gcw, state) + } + + // Scan arguments. 
+ if args.n > 0 { + scanblock(frame.argp, uintptr(args.n)*goarch.PtrSize, args.bytedata, gcw, state) + } + + // Add all stack objects to the stack object list. + if frame.varp != 0 { + // varp is 0 for defers, where there are no locals. + // In that case, there can't be a pointer to its args, either. + // (And all args would be scanned above anyway.) + for i := range objs { + obj := &objs[i] + off := obj.off + base := frame.varp // locals base pointer + if off >= 0 { + base = frame.argp // arguments and return values base pointer + } + ptr := base + uintptr(off) + if ptr < frame.sp { + // object hasn't been allocated in the frame yet. + continue + } + if stackTraceDebug { + println("stkobj at", hex(ptr), "of size", obj.size) + } + state.addObject(ptr, obj) + } + } +} + +type gcDrainFlags int + +const ( + gcDrainUntilPreempt gcDrainFlags = 1 << iota + gcDrainFlushBgCredit + gcDrainIdle + gcDrainFractional +) + +// gcDrain scans roots and objects in work buffers, blackening grey +// objects until it is unable to get more work. It may return before +// GC is done; it's the caller's responsibility to balance work from +// other Ps. +// +// If flags&gcDrainUntilPreempt != 0, gcDrain returns when g.preempt +// is set. +// +// If flags&gcDrainIdle != 0, gcDrain returns when there is other work +// to do. +// +// If flags&gcDrainFractional != 0, gcDrain self-preempts when +// pollFractionalWorkerExit() returns true. This implies +// gcDrainNoBlock. +// +// If flags&gcDrainFlushBgCredit != 0, gcDrain flushes scan work +// credit to gcController.bgScanCredit every gcCreditSlack units of +// scan work. +// +// gcDrain will always return if there is a pending STW. 
+// +//go:nowritebarrier +func gcDrain(gcw *gcWork, flags gcDrainFlags) { + if !writeBarrier.needed { + throw("gcDrain phase incorrect") + } + + gp := getg().m.curg + preemptible := flags&gcDrainUntilPreempt != 0 + flushBgCredit := flags&gcDrainFlushBgCredit != 0 + idle := flags&gcDrainIdle != 0 + + initScanWork := gcw.heapScanWork + + // checkWork is the scan work before performing the next + // self-preempt check. + checkWork := int64(1<<63 - 1) + var check func() bool + if flags&(gcDrainIdle|gcDrainFractional) != 0 { + checkWork = initScanWork + drainCheckThreshold + if idle { + check = pollWork + } else if flags&gcDrainFractional != 0 { + check = pollFractionalWorkerExit + } + } + + // Drain root marking jobs. + if work.markrootNext < work.markrootJobs { + // Stop if we're preemptible or if someone wants to STW. + for !(gp.preempt && (preemptible || sched.gcwaiting.Load())) { + job := atomic.Xadd(&work.markrootNext, +1) - 1 + if job >= work.markrootJobs { + break + } + markroot(gcw, job, flushBgCredit) + if check != nil && check() { + goto done + } + } + } + + // Drain heap marking jobs. + // Stop if we're preemptible or if someone wants to STW. + for !(gp.preempt && (preemptible || sched.gcwaiting.Load())) { + // Try to keep work available on the global queue. We used to + // check if there were waiting workers, but it's better to + // just keep work available than to make workers wait. In the + // worst case, we'll do O(log(_WorkbufSize)) unnecessary + // balances. + if work.full == 0 { + gcw.balance() + } + + b := gcw.tryGetFast() + if b == 0 { + b = gcw.tryGet() + if b == 0 { + // Flush the write barrier + // buffer; this may create + // more work. + wbBufFlush(nil, 0) + b = gcw.tryGet() + } + } + if b == 0 { + // Unable to get work. + break + } + scanobject(b, gcw) + + // Flush background scan work credit to the global + // account if we've accumulated enough locally so + // mutator assists can draw on it. 
+ if gcw.heapScanWork >= gcCreditSlack { + gcController.heapScanWork.Add(gcw.heapScanWork) + if flushBgCredit { + gcFlushBgCredit(gcw.heapScanWork - initScanWork) + initScanWork = 0 + } + checkWork -= gcw.heapScanWork + gcw.heapScanWork = 0 + + if checkWork <= 0 { + checkWork += drainCheckThreshold + if check != nil && check() { + break + } + } + } + } + +done: + // Flush remaining scan work credit. + if gcw.heapScanWork > 0 { + gcController.heapScanWork.Add(gcw.heapScanWork) + if flushBgCredit { + gcFlushBgCredit(gcw.heapScanWork - initScanWork) + } + gcw.heapScanWork = 0 + } +} + +// gcDrainN blackens grey objects until it has performed roughly +// scanWork units of scan work or the G is preempted. This is +// best-effort, so it may perform less work if it fails to get a work +// buffer. Otherwise, it will perform at least n units of work, but +// may perform more because scanning is always done in whole object +// increments. It returns the amount of scan work performed. +// +// The caller goroutine must be in a preemptible state (e.g., +// _Gwaiting) to prevent deadlocks during stack scanning. As a +// consequence, this must be called on the system stack. +// +//go:nowritebarrier +//go:systemstack +func gcDrainN(gcw *gcWork, scanWork int64) int64 { + if !writeBarrier.needed { + throw("gcDrainN phase incorrect") + } + + // There may already be scan work on the gcw, which we don't + // want to claim was done by this call. + workFlushed := -gcw.heapScanWork + + // In addition to backing out because of a preemption, back out + // if the GC CPU limiter is enabled. + gp := getg().m.curg + for !gp.preempt && !gcCPULimiter.limiting() && workFlushed+gcw.heapScanWork < scanWork { + // See gcDrain comment. + if work.full == 0 { + gcw.balance() + } + + b := gcw.tryGetFast() + if b == 0 { + b = gcw.tryGet() + if b == 0 { + // Flush the write barrier buffer; + // this may create more work. 
+ wbBufFlush(nil, 0) + b = gcw.tryGet() + } + } + + if b == 0 { + // Try to do a root job. + if work.markrootNext < work.markrootJobs { + job := atomic.Xadd(&work.markrootNext, +1) - 1 + if job < work.markrootJobs { + workFlushed += markroot(gcw, job, false) + continue + } + } + // No heap or root jobs. + break + } + + scanobject(b, gcw) + + // Flush background scan work credit. + if gcw.heapScanWork >= gcCreditSlack { + gcController.heapScanWork.Add(gcw.heapScanWork) + workFlushed += gcw.heapScanWork + gcw.heapScanWork = 0 + } + } + + // Unlike gcDrain, there's no need to flush remaining work + // here because this never flushes to bgScanCredit and + // gcw.dispose will flush any remaining work to scanWork. + + return workFlushed + gcw.heapScanWork +} + +// scanblock scans b as scanobject would, but using an explicit +// pointer bitmap instead of the heap bitmap. +// +// This is used to scan non-heap roots, so it does not update +// gcw.bytesMarked or gcw.heapScanWork. +// +// If stk != nil, possible stack pointers are also reported to stk.putPtr. +// +//go:nowritebarrier +func scanblock(b0, n0 uintptr, ptrmask *uint8, gcw *gcWork, stk *stackScanState) { + // Use local copies of original parameters, so that a stack trace + // due to one of the throws below shows the original block + // base and extent. + b := b0 + n := n0 + + for i := uintptr(0); i < n; { + // Find bits for the next word. + bits := uint32(*addb(ptrmask, i/(goarch.PtrSize*8))) + if bits == 0 { + i += goarch.PtrSize * 8 + continue + } + for j := 0; j < 8 && i < n; j++ { + if bits&1 != 0 { + // Same work as in scanobject; see comments there. 
+ p := *(*uintptr)(unsafe.Pointer(b + i)) + if p != 0 { + if obj, span, objIndex := findObject(p, b, i); obj != 0 { + greyobject(obj, b, i, span, gcw, objIndex) + } else if stk != nil && p >= stk.stack.lo && p < stk.stack.hi { + stk.putPtr(p, false) + } + } + } + bits >>= 1 + i += goarch.PtrSize + } + } +} + +// scanobject scans the object starting at b, adding pointers to gcw. +// b must point to the beginning of a heap object or an oblet. +// scanobject consults the GC bitmap for the pointer mask and the +// spans for the size of the object. +// +//go:nowritebarrier +func scanobject(b uintptr, gcw *gcWork) { + // Prefetch object before we scan it. + // + // This will overlap fetching the beginning of the object with initial + // setup before we start scanning the object. + sys.Prefetch(b) + + // Find the bits for b and the size of the object at b. + // + // b is either the beginning of an object, in which case this + // is the size of the object to scan, or it points to an + // oblet, in which case we compute the size to scan below. + s := spanOfUnchecked(b) + n := s.elemsize + if n == 0 { + throw("scanobject n == 0") + } + if s.spanclass.noscan() { + // Correctness-wise this is ok, but it's inefficient + // if noscan objects reach here. + throw("scanobject of a noscan object") + } + + if n > maxObletBytes { + // Large object. Break into oblets for better + // parallelism and lower latency. + if b == s.base() { + // Enqueue the other oblets to scan later. + // Some oblets may be in b's scalar tail, but + // these will be marked as "no more pointers", + // so we'll drop out immediately when we go to + // scan those. + for oblet := b + maxObletBytes; oblet < s.base()+s.elemsize; oblet += maxObletBytes { + if !gcw.putFast(oblet) { + gcw.put(oblet) + } + } + } + + // Compute the size of the oblet. Since this object + // must be a large object, s.base() is the beginning + // of the object. 
+ n = s.base() + s.elemsize - b + if n > maxObletBytes { + n = maxObletBytes + } + } + + hbits := heapBitsForAddr(b, n) + var scanSize uintptr + for { + var addr uintptr + if hbits, addr = hbits.nextFast(); addr == 0 { + if hbits, addr = hbits.next(); addr == 0 { + break + } + } + + // Keep track of farthest pointer we found, so we can + // update heapScanWork. TODO: is there a better metric, + // now that we can skip scalar portions pretty efficiently? + scanSize = addr - b + goarch.PtrSize + + // Work here is duplicated in scanblock and above. + // If you make changes here, make changes there too. + obj := *(*uintptr)(unsafe.Pointer(addr)) + + // At this point we have extracted the next potential pointer. + // Quickly filter out nil and pointers back to the current object. + if obj != 0 && obj-b >= n { + // Test if obj points into the Go heap and, if so, + // mark the object. + // + // Note that it's possible for findObject to + // fail if obj points to a just-allocated heap + // object because of a race with growing the + // heap. In this case, we know the object was + // just allocated and hence will be marked by + // allocation itself. + if obj, span, objIndex := findObject(obj, b, addr-b); obj != 0 { + greyobject(obj, b, addr-b, span, gcw, objIndex) + } + } + } + gcw.bytesMarked += uint64(n) + gcw.heapScanWork += int64(scanSize) +} + +// scanConservative scans block [b, b+n) conservatively, treating any +// pointer-like value in the block as a pointer. +// +// If ptrmask != nil, only words that are marked in ptrmask are +// considered as potential pointers. +// +// If state != nil, it's assumed that [b, b+n) is a block in the stack +// and may contain pointers to stack objects. 
+func scanConservative(b, n uintptr, ptrmask *uint8, gcw *gcWork, state *stackScanState) { + if debugScanConservative { + printlock() + print("conservatively scanning [", hex(b), ",", hex(b+n), ")\n") + hexdumpWords(b, b+n, func(p uintptr) byte { + if ptrmask != nil { + word := (p - b) / goarch.PtrSize + bits := *addb(ptrmask, word/8) + if (bits>>(word%8))&1 == 0 { + return '$' + } + } + + val := *(*uintptr)(unsafe.Pointer(p)) + if state != nil && state.stack.lo <= val && val < state.stack.hi { + return '@' + } + + span := spanOfHeap(val) + if span == nil { + return ' ' + } + idx := span.objIndex(val) + if span.isFree(idx) { + return ' ' + } + return '*' + }) + printunlock() + } + + for i := uintptr(0); i < n; i += goarch.PtrSize { + if ptrmask != nil { + word := i / goarch.PtrSize + bits := *addb(ptrmask, word/8) + if bits == 0 { + // Skip 8 words (the loop increment will do the 8th) + // + // This must be the first time we've + // seen this word of ptrmask, so i + // must be 8-word-aligned, but check + // our reasoning just in case. + if i%(goarch.PtrSize*8) != 0 { + throw("misaligned mask") + } + i += goarch.PtrSize*8 - goarch.PtrSize + continue + } + if (bits>>(word%8))&1 == 0 { + continue + } + } + + val := *(*uintptr)(unsafe.Pointer(b + i)) + + // Check if val points into the stack. + if state != nil && state.stack.lo <= val && val < state.stack.hi { + // val may point to a stack object. This + // object may be dead from last cycle and + // hence may contain pointers to unallocated + // objects, but unlike heap objects we can't + // tell if it's already dead. Hence, if all + // pointers to this object are from + // conservative scanning, we have to scan it + // defensively, too. + state.putPtr(val, true) + continue + } + + // Check if val points to a heap span. + span := spanOfHeap(val) + if span == nil { + continue + } + + // Check if val points to an allocated object. 
+ idx := span.objIndex(val) + if span.isFree(idx) { + continue + } + + // val points to an allocated object. Mark it. + obj := span.base() + idx*span.elemsize + greyobject(obj, b, i, span, gcw, idx) + } +} + +// Shade the object if it isn't already. +// The object is not nil and known to be in the heap. +// Preemption must be disabled. +// +//go:nowritebarrier +func shade(b uintptr) { + if obj, span, objIndex := findObject(b, 0, 0); obj != 0 { + gcw := &getg().m.p.ptr().gcw + greyobject(obj, 0, 0, span, gcw, objIndex) + } +} + +// obj is the start of an object with mark mbits. +// If it isn't already marked, mark it and enqueue into gcw. +// base and off are for debugging only and could be removed. +// +// See also wbBufFlush1, which partially duplicates this logic. +// +//go:nowritebarrierrec +func greyobject(obj, base, off uintptr, span *mspan, gcw *gcWork, objIndex uintptr) { + // obj should be start of allocation, and so must be at least pointer-aligned. + if obj&(goarch.PtrSize-1) != 0 { + throw("greyobject: obj not pointer-aligned") + } + mbits := span.markBitsForIndex(objIndex) + + if useCheckmark { + if setCheckmark(obj, base, off, mbits) { + // Already marked. + return + } + } else { + if debug.gccheckmark > 0 && span.isFree(objIndex) { + print("runtime: marking free object ", hex(obj), " found at *(", hex(base), "+", hex(off), ")\n") + gcDumpObject("base", base, off) + gcDumpObject("obj", obj, ^uintptr(0)) + getg().m.traceback = 2 + throw("marking free object") + } + + // If marked we have nothing to do. + if mbits.isMarked() { + return + } + mbits.setMarked() + + // Mark span. + arena, pageIdx, pageMask := pageIndexOf(span.base()) + if arena.pageMarks[pageIdx]&pageMask == 0 { + atomic.Or8(&arena.pageMarks[pageIdx], pageMask) + } + + // If this is a noscan object, fast-track it to black + // instead of greying it. 
+ if span.spanclass.noscan() { + gcw.bytesMarked += uint64(span.elemsize) + return + } + } + + // We're adding obj to P's local workbuf, so it's likely + // this object will be processed soon by the same P. + // Even if the workbuf gets flushed, there will likely still be + // some benefit on platforms with inclusive shared caches. + sys.Prefetch(obj) + // Queue the obj for scanning. + if !gcw.putFast(obj) { + gcw.put(obj) + } +} + +// gcDumpObject dumps the contents of obj for debugging and marks the +// field at byte offset off in obj. +func gcDumpObject(label string, obj, off uintptr) { + s := spanOf(obj) + print(label, "=", hex(obj)) + if s == nil { + print(" s=nil\n") + return + } + print(" s.base()=", hex(s.base()), " s.limit=", hex(s.limit), " s.spanclass=", s.spanclass, " s.elemsize=", s.elemsize, " s.state=") + if state := s.state.get(); 0 <= state && int(state) < len(mSpanStateNames) { + print(mSpanStateNames[state], "\n") + } else { + print("unknown(", state, ")\n") + } + + skipped := false + size := s.elemsize + if s.state.get() == mSpanManual && size == 0 { + // We're printing something from a stack frame. We + // don't know how big it is, so just show up to and + // including off. + size = off + goarch.PtrSize + } + for i := uintptr(0); i < size; i += goarch.PtrSize { + // For big objects, just print the beginning (because + // that usually hints at the object's type) and the + // fields around off. + if !(i < 128*goarch.PtrSize || off-16*goarch.PtrSize < i && i < off+16*goarch.PtrSize) { + skipped = true + continue + } + if skipped { + print(" ...\n") + skipped = false + } + print(" *(", label, "+", i, ") = ", hex(*(*uintptr)(unsafe.Pointer(obj + i)))) + if i == off { + print(" <==") + } + print("\n") + } + if skipped { + print(" ...\n") + } +} + +// gcmarknewobject marks a newly allocated object black. obj must +// not contain any non-nil pointers. +// +// This is nosplit so it can manipulate a gcWork without preemption. 
+// +//go:nowritebarrier +//go:nosplit +func gcmarknewobject(span *mspan, obj, size uintptr) { + if useCheckmark { // The world should be stopped so this should not happen. + throw("gcmarknewobject called while doing checkmark") + } + + // Mark object. + objIndex := span.objIndex(obj) + span.markBitsForIndex(objIndex).setMarked() + + // Mark span. + arena, pageIdx, pageMask := pageIndexOf(span.base()) + if arena.pageMarks[pageIdx]&pageMask == 0 { + atomic.Or8(&arena.pageMarks[pageIdx], pageMask) + } + + gcw := &getg().m.p.ptr().gcw + gcw.bytesMarked += uint64(size) +} + +// gcMarkTinyAllocs greys all active tiny alloc blocks. +// +// The world must be stopped. +func gcMarkTinyAllocs() { + assertWorldStopped() + + for _, p := range allp { + c := p.mcache + if c == nil || c.tiny == 0 { + continue + } + _, span, objIndex := findObject(c.tiny, 0, 0) + gcw := &p.gcw + greyobject(c.tiny, 0, 0, span, gcw, objIndex) + } +} diff --git a/src/runtime/mgcpacer.go b/src/runtime/mgcpacer.go new file mode 100644 index 0000000..9d9840e --- /dev/null +++ b/src/runtime/mgcpacer.go @@ -0,0 +1,1426 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/cpu" + "internal/goexperiment" + "runtime/internal/atomic" + _ "unsafe" // for go:linkname +) + +// go119MemoryLimitSupport is a feature flag for a number of changes +// related to the memory limit feature (#48409). Disabling this flag +// disables those features, as well as the memory limit mechanism, +// which becomes a no-op. +const go119MemoryLimitSupport = true + +const ( + // gcGoalUtilization is the goal CPU utilization for + // marking as a fraction of GOMAXPROCS. + // + // Increasing the goal utilization will shorten GC cycles as the GC + // has more resources behind it, lessening costs from the write barrier, + // but comes at the cost of increasing mutator latency. 
+ gcGoalUtilization = gcBackgroundUtilization + + // gcBackgroundUtilization is the fixed CPU utilization for background + // marking. It must be <= gcGoalUtilization. The difference between + // gcGoalUtilization and gcBackgroundUtilization will be made up by + // mark assists. The scheduler will aim to use within 50% of this + // goal. + // + // As a general rule, there's little reason to set gcBackgroundUtilization + // < gcGoalUtilization. One reason might be in mostly idle applications, + // where goroutines are unlikely to assist at all, so the actual + // utilization will be lower than the goal. But this is a moot point + // because the idle mark workers already soak up idle CPU resources. + // These two values are still kept separate, however, because they are + // distinct conceptually, and in previous iterations of the pacer the + // distinction was more important. + gcBackgroundUtilization = 0.25 + + // gcCreditSlack is the amount of scan work credit that can + // accumulate locally before updating gcController.heapScanWork and, + // optionally, gcController.bgScanCredit. Lower values give a more + // accurate assist ratio and make it more likely that assists will + // successfully steal background credit. Higher values reduce memory + // contention. + gcCreditSlack = 2000 + + // gcAssistTimeSlack is the nanoseconds of mutator assist time that + // can accumulate on a P before updating gcController.assistTime. + gcAssistTimeSlack = 5000 + + // gcOverAssistWork determines how many extra units of scan work a GC + // assist does when an assist happens. This amortizes the cost of an + // assist by pre-paying for this many bytes of future allocations. + gcOverAssistWork = 64 << 10 + + // defaultHeapMinimum is the value of heapMinimum for GOGC==100. 
+ defaultHeapMinimum = (goexperiment.HeapMinimum512KiBInt)*(512<<10) + + (1-goexperiment.HeapMinimum512KiBInt)*(4<<20) + + // maxStackScanSlack is the bytes of stack space allocated or freed + // that can accumulate on a P before updating gcController.stackSize. + maxStackScanSlack = 8 << 10 + + // memoryLimitHeapGoalHeadroom is the amount of headroom the pacer gives to + // the heap goal when operating in the memory-limited regime. That is, + // it'll reduce the heap goal by this many extra bytes off of the base + // calculation. + memoryLimitHeapGoalHeadroom = 1 << 20 +) + +// gcController implements the GC pacing controller that determines +// when to trigger concurrent garbage collection and how much marking +// work to do in mutator assists and background marking. +// +// It calculates the ratio between the allocation rate (in terms of CPU +// time) and the GC scan throughput to determine the heap size at which to +// trigger a GC cycle such that no GC assists are required to finish on time. +// This algorithm thus optimizes GC CPU utilization to the dedicated background +// mark utilization of 25% of GOMAXPROCS by minimizing GC assists. +// The high-level design of this algorithm is documented +// at https://github.com/golang/proposal/blob/master/design/44167-gc-pacer-redesign.md. +// See https://golang.org/s/go15gcpacing for additional historical context. +var gcController gcControllerState + +type gcControllerState struct { + // Initialized from GOGC. GOGC=off means no GC. + gcPercent atomic.Int32 + + // memoryLimit is the soft memory limit in bytes. + // + // Initialized from GOMEMLIMIT. GOMEMLIMIT=off is equivalent to MaxInt64 + // which means no soft memory limit in practice. + // + // This is an int64 instead of a uint64 to more easily maintain parity with + // the SetMemoryLimit API, which sets a maximum at MaxInt64. This value + // should never be negative. 
+ memoryLimit atomic.Int64 + + // heapMinimum is the minimum heap size at which to trigger GC. + // For small heaps, this overrides the usual GOGC*live set rule. + // + // When there is a very small live set but a lot of allocation, simply + // collecting when the heap reaches GOGC*live results in many GC + // cycles and high total per-GC overhead. This minimum amortizes this + // per-GC overhead while keeping the heap reasonably small. + // + // During initialization this is set to 4MB*GOGC/100. In the case of + // GOGC==0, this will set heapMinimum to 0, resulting in constant + // collection even when the heap size is small, which is useful for + // debugging. + heapMinimum uint64 + + // runway is the amount of runway in heap bytes allocated by the + // application that we want to give the GC once it starts. + // + // This is computed from consMark during mark termination. + runway atomic.Uint64 + + // consMark is the estimated per-CPU consMark ratio for the application. + // + // It represents the ratio between the application's allocation + // rate, as bytes allocated per CPU-time, and the GC's scan rate, + // as bytes scanned per CPU-time. + // The units of this ratio are (B / cpu-ns) / (B / cpu-ns). + // + // At a high level, this value is computed as the bytes of memory + // allocated (cons) per unit of scan work completed (mark) in a GC + // cycle, divided by the CPU time spent on each activity. + // + // Updated at the end of each GC cycle, in endCycle. + consMark float64 + + // lastConsMark is the computed cons/mark value for the previous GC + // cycle. Note that this is *not* the last value of cons/mark, but the + // actual computed value. See endCycle for details. + lastConsMark float64 + + // gcPercentHeapGoal is the goal heapLive for when next GC ends derived + // from gcPercent. + // + // Set to ^uint64(0) if gcPercent is disabled. 
+ gcPercentHeapGoal atomic.Uint64 + + // sweepDistMinTrigger is the minimum trigger to ensure a minimum + // sweep distance. + // + // This bound is also special because it applies to both the trigger + // *and* the goal (all other trigger bounds must be based *on* the goal). + // + // It is computed ahead of time, at commit time. The theory is that, + // absent a sudden change to a parameter like gcPercent, the trigger + // will be chosen to always give the sweeper enough headroom. However, + // such a change might dramatically and suddenly move up the trigger, + // in which case we need to ensure the sweeper still has enough headroom. + sweepDistMinTrigger atomic.Uint64 + + // triggered is the point at which the current GC cycle actually triggered. + // Only valid during the mark phase of a GC cycle, otherwise set to ^uint64(0). + // + // Updated while the world is stopped. + triggered uint64 + + // lastHeapGoal is the value of heapGoal at the moment the last GC + // ended. Note that this is distinct from the last value heapGoal had, + // because it could change if e.g. gcPercent changes. + // + // Read and written with the world stopped or with mheap_.lock held. + lastHeapGoal uint64 + + // heapLive is the number of bytes considered live by the GC. + // That is: retained by the most recent GC plus allocated + // since then. heapLive ≤ memstats.totalAlloc-memstats.totalFree, since + // heapAlloc includes unmarked objects that have not yet been swept (and + // hence goes up as we allocate and down as we sweep) while heapLive + // excludes these objects (and hence only goes up between GCs). + // + // To reduce contention, this is updated only when obtaining a span + // from an mcentral and at this point it counts all of the unallocated + // slots in that span (which will be allocated before that mcache + // obtains another span from that mcentral). Hence, it slightly + // overestimates the "true" live heap size. 
It's better to overestimate + // than to underestimate because 1) this triggers the GC earlier than + // necessary rather than potentially too late and 2) this leads to a + // conservative GC rate rather than a GC rate that is potentially too + // low. + // + // Whenever this is updated, call traceHeapAlloc() and + // this gcControllerState's revise() method. + heapLive atomic.Uint64 + + // heapScan is the number of bytes of "scannable" heap. This is the + // live heap (as counted by heapLive), but omitting no-scan objects and + // no-scan tails of objects. + // + // This value is fixed at the start of a GC cycle. It represents the + // maximum scannable heap. + heapScan atomic.Uint64 + + // lastHeapScan is the number of bytes of heap that were scanned + // last GC cycle. It is the same as heapMarked, but only + // includes the "scannable" parts of objects. + // + // Updated when the world is stopped. + lastHeapScan uint64 + + // lastStackScan is the number of bytes of stack that were scanned + // last GC cycle. + lastStackScan atomic.Uint64 + + // maxStackScan is the amount of allocated goroutine stack space in + // use by goroutines. + // + // This number tracks allocated goroutine stack space rather than used + // goroutine stack space (i.e. what is actually scanned) because used + // goroutine stack space is much harder to measure cheaply. By using + // allocated space, we make an overestimate; this is OK, it's better + // to conservatively overcount than undercount. + maxStackScan atomic.Uint64 + + // globalsScan is the total amount of global variable space + // that is scannable. + globalsScan atomic.Uint64 + + // heapMarked is the number of bytes marked by the previous + // GC. After mark termination, heapLive == heapMarked, but + // unlike heapLive, heapMarked does not change until the + // next mark termination. + heapMarked uint64 + + // heapScanWork is the total heap scan work performed this cycle. 
+ // stackScanWork is the total stack scan work performed this cycle. + // globalsScanWork is the total globals scan work performed this cycle. + // + // These are updated atomically during the cycle. Updates occur in + // bounded batches, since they are both written and read + // throughout the cycle. At the end of the cycle, heapScanWork is how + // much of the retained heap is scannable. + // + // Currently these are measured in bytes. For most uses, this is an + // opaque unit of work, but for estimation the definition is important. + // + // Note that stackScanWork includes only stack space scanned, not all + // of the allocated stack. + heapScanWork atomic.Int64 + stackScanWork atomic.Int64 + globalsScanWork atomic.Int64 + + // bgScanCredit is the scan work credit accumulated by the concurrent + // background scan. This credit is accumulated by the background scan + // and stolen by mutator assists. Updates occur in bounded batches, + // since it is both written and read throughout the cycle. + bgScanCredit atomic.Int64 + + // assistTime is the nanoseconds spent in mutator assists + // during this cycle. This is updated atomically, and must also + // be updated atomically even during a STW, because it is read + // by sysmon. Updates occur in bounded batches, since it is both + // written and read throughout the cycle. + assistTime atomic.Int64 + + // dedicatedMarkTime is the nanoseconds spent in dedicated mark workers + // during this cycle. This is updated at the end of the concurrent mark + // phase. + dedicatedMarkTime atomic.Int64 + + // fractionalMarkTime is the nanoseconds spent in the fractional mark + // worker during this cycle. This is updated throughout the cycle and + // will be up-to-date if the fractional mark worker is not currently + // running. + fractionalMarkTime atomic.Int64 + + // idleMarkTime is the nanoseconds spent in idle marking during this + // cycle. This is updated throughout the cycle. 
+ idleMarkTime atomic.Int64 + + // markStartTime is the absolute start time in nanoseconds + // that assists and background mark workers started. + markStartTime int64 + + // dedicatedMarkWorkersNeeded is the number of dedicated mark workers + // that need to be started. This is computed at the beginning of each + // cycle and decremented as dedicated mark workers get started. + dedicatedMarkWorkersNeeded atomic.Int64 + + // idleMarkWorkers is two packed int32 values in a single uint64. + // These two values are always updated simultaneously. + // + // The bottom int32 is the current number of idle mark workers executing. + // + // The top int32 is the maximum number of idle mark workers allowed to + // execute concurrently. Normally, this number is just gomaxprocs. However, + // during periodic GC cycles it is set to 0 because the system is idle + // anyway; there's no need to go full blast on all of GOMAXPROCS. + // + // The maximum number of idle mark workers is used to prevent new workers + // from starting, but it is not a hard maximum. It is possible (but + // exceedingly rare) for the current number of idle mark workers to + // transiently exceed the maximum. This could happen if the maximum changes + // just after a GC ends, and an M with no P. + // + // Note that if we have no dedicated mark workers, we set this value to + // 1. In this case we only have fractional GC workers, which aren't scheduled + // strictly enough to ensure GC progress. As a result, idle-priority mark + // workers are vital to GC progress in these situations. + // + // For example, consider a situation in which goroutines block on the GC + // (such as via runtime.GOMAXPROCS) and only fractional mark workers are + // scheduled (e.g. GOMAXPROCS=1). 
Without idle-priority mark workers, the + // last running M might skip scheduling a fractional mark worker if its + // utilization goal is met, such that once it goes to sleep (because there's + // nothing to do), there will be nothing else to spin up a new M for the + // fractional worker in the future, stalling GC progress and causing a + // deadlock. However, idle-priority workers will *always* run when there is + // nothing left to do, ensuring the GC makes progress. + // + // See github.com/golang/go/issues/44163 for more details. + idleMarkWorkers atomic.Uint64 + + // assistWorkPerByte is the ratio of scan work to allocated + // bytes that should be performed by mutator assists. This is + // computed at the beginning of each cycle and updated every + // time heapScan is updated. + assistWorkPerByte atomic.Float64 + + // assistBytesPerWork is 1/assistWorkPerByte. + // + // Note that because this is read and written independently + // from assistWorkPerByte users may notice a skew between + // the two values, and such a state should be safe. + assistBytesPerWork atomic.Float64 + + // fractionalUtilizationGoal is the fraction of wall clock + // time that should be spent in the fractional mark worker on + // each P that isn't running a dedicated worker. + // + // For example, if the utilization goal is 25% and there are + // no dedicated workers, this will be 0.25. If the goal is + // 25%, there is one dedicated worker, and GOMAXPROCS is 5, + // this will be 0.05 to make up the missing 5%. + // + // If this is zero, no fractional workers are needed. + fractionalUtilizationGoal float64 + + // These memory stats are effectively duplicates of fields from + // memstats.heapStats but are updated atomically or with the world + // stopped and don't provide the same consistency guarantees. 
+ // + // Because the runtime is responsible for managing a memory limit, it's + // useful to couple these stats more tightly to the gcController, which + // is intimately connected to how that memory limit is maintained. + heapInUse sysMemStat // bytes in mSpanInUse spans + heapReleased sysMemStat // bytes released to the OS + heapFree sysMemStat // bytes not in any span, but not released to the OS + totalAlloc atomic.Uint64 // total bytes allocated + totalFree atomic.Uint64 // total bytes freed + mappedReady atomic.Uint64 // total virtual memory in the Ready state (see mem.go). + + // test indicates that this is a test-only copy of gcControllerState. + test bool + + _ cpu.CacheLinePad +} + +func (c *gcControllerState) init(gcPercent int32, memoryLimit int64) { + c.heapMinimum = defaultHeapMinimum + c.triggered = ^uint64(0) + c.setGCPercent(gcPercent) + c.setMemoryLimit(memoryLimit) + c.commit(true) // No sweep phase in the first GC cycle. + // N.B. Don't bother calling traceHeapGoal. Tracing is never enabled at + // initialization time. + // N.B. No need to call revise; there's no GC enabled during + // initialization. +} + +// startCycle resets the GC controller's state and computes estimates +// for a new GC cycle. The caller must hold worldsema and the world +// must be stopped. +func (c *gcControllerState) startCycle(markStartTime int64, procs int, trigger gcTrigger) { + c.heapScanWork.Store(0) + c.stackScanWork.Store(0) + c.globalsScanWork.Store(0) + c.bgScanCredit.Store(0) + c.assistTime.Store(0) + c.dedicatedMarkTime.Store(0) + c.fractionalMarkTime.Store(0) + c.idleMarkTime.Store(0) + c.markStartTime = markStartTime + c.triggered = c.heapLive.Load() + + // Compute the background mark utilization goal. In general, + // this may not come out exactly. We round the number of + // dedicated workers so that the utilization is closest to + // 25%. For small GOMAXPROCS, this would introduce too much + // error, so we add fractional workers in that case. 
+ totalUtilizationGoal := float64(procs) * gcBackgroundUtilization + dedicatedMarkWorkersNeeded := int64(totalUtilizationGoal + 0.5) + utilError := float64(dedicatedMarkWorkersNeeded)/totalUtilizationGoal - 1 + const maxUtilError = 0.3 + if utilError < -maxUtilError || utilError > maxUtilError { + // Rounding put us more than 30% off our goal. With + // gcBackgroundUtilization of 25%, this happens for + // GOMAXPROCS<=3 or GOMAXPROCS=6. Enable fractional + // workers to compensate. + if float64(dedicatedMarkWorkersNeeded) > totalUtilizationGoal { + // Too many dedicated workers. + dedicatedMarkWorkersNeeded-- + } + c.fractionalUtilizationGoal = (totalUtilizationGoal - float64(dedicatedMarkWorkersNeeded)) / float64(procs) + } else { + c.fractionalUtilizationGoal = 0 + } + + // In STW mode, we just want dedicated workers. + if debug.gcstoptheworld > 0 { + dedicatedMarkWorkersNeeded = int64(procs) + c.fractionalUtilizationGoal = 0 + } + + // Clear per-P state + for _, p := range allp { + p.gcAssistTime = 0 + p.gcFractionalMarkTime = 0 + } + + if trigger.kind == gcTriggerTime { + // During a periodic GC cycle, reduce the number of idle mark workers + // required. However, we need at least one dedicated mark worker or + // idle GC worker to ensure GC progress in some scenarios (see comment + // on maxIdleMarkWorkers). + if dedicatedMarkWorkersNeeded > 0 { + c.setMaxIdleMarkWorkers(0) + } else { + // TODO(mknyszek): The fundamental reason why we need this is because + // we can't count on the fractional mark worker to get scheduled. + // Fix that by ensuring it gets scheduled according to its quota even + // if the rest of the application is idle. + c.setMaxIdleMarkWorkers(1) + } + } else { + // N.B. gomaxprocs and dedicatedMarkWorkersNeeded are guaranteed not to + // change during a GC cycle. + c.setMaxIdleMarkWorkers(int32(procs) - int32(dedicatedMarkWorkersNeeded)) + } + + // Compute initial values for controls that are updated + // throughout the cycle. 
+ c.dedicatedMarkWorkersNeeded.Store(dedicatedMarkWorkersNeeded) + c.revise() + + if debug.gcpacertrace > 0 { + heapGoal := c.heapGoal() + assistRatio := c.assistWorkPerByte.Load() + print("pacer: assist ratio=", assistRatio, + " (scan ", gcController.heapScan.Load()>>20, " MB in ", + work.initialHeapLive>>20, "->", + heapGoal>>20, " MB)", + " workers=", dedicatedMarkWorkersNeeded, + "+", c.fractionalUtilizationGoal, "\n") + } +} + +// revise updates the assist ratio during the GC cycle to account for +// improved estimates. This should be called whenever gcController.heapScan, +// gcController.heapLive, or if any inputs to gcController.heapGoal are +// updated. It is safe to call concurrently, but it may race with other +// calls to revise. +// +// The result of this race is that the two assist ratio values may not line +// up or may be stale. In practice this is OK because the assist ratio +// moves slowly throughout a GC cycle, and the assist ratio is a best-effort +// heuristic anyway. Furthermore, no part of the heuristic depends on +// the two assist ratio values being exact reciprocals of one another, since +// the two values are used to convert values from different sources. +// +// The worst case result of this raciness is that we may miss a larger shift +// in the ratio (say, if we decide to pace more aggressively against the +// hard heap goal) but even this "hard goal" is best-effort (see #40460). +// The dedicated GC should ensure we don't exceed the hard goal by too much +// in the rare case we do exceed it. +// +// It should only be called when gcBlackenEnabled != 0 (because this +// is when assists are enabled and the necessary statistics are +// available). +func (c *gcControllerState) revise() { + gcPercent := c.gcPercent.Load() + if gcPercent < 0 { + // If GC is disabled but we're running a forced GC, + // act like GOGC is huge for the below calculations. 
+ gcPercent = 100000 + } + live := c.heapLive.Load() + scan := c.heapScan.Load() + work := c.heapScanWork.Load() + c.stackScanWork.Load() + c.globalsScanWork.Load() + + // Assume we're under the soft goal. Pace GC to complete at + // heapGoal assuming the heap is in steady-state. + heapGoal := int64(c.heapGoal()) + + // The expected scan work is computed as the amount of bytes scanned last + // GC cycle (both heap and stack), plus our estimate of globals work for this cycle. + scanWorkExpected := int64(c.lastHeapScan + c.lastStackScan.Load() + c.globalsScan.Load()) + + // maxScanWork is a worst-case estimate of the amount of scan work that + // needs to be performed in this GC cycle. Specifically, it represents + // the case where *all* scannable memory turns out to be live, and + // *all* allocated stack space is scannable. + maxStackScan := c.maxStackScan.Load() + maxScanWork := int64(scan + maxStackScan + c.globalsScan.Load()) + if work > scanWorkExpected { + // We've already done more scan work than expected. Because our expectation + // is based on a steady-state scannable heap size, we assume this means our + // heap is growing. Compute a new heap goal that takes our existing runway + // computed for scanWorkExpected and extrapolates it to maxScanWork, the worst-case + // scan work. This keeps our assist ratio stable if the heap continues to grow. + // + // The effect of this mechanism is that assists stay flat in the face of heap + // growths. It's OK to use more memory this cycle to scan all the live heap, + // because the next GC cycle is inevitably going to use *at least* that much + // memory anyway. + extHeapGoal := int64(float64(heapGoal-int64(c.triggered))/float64(scanWorkExpected)*float64(maxScanWork)) + int64(c.triggered) + scanWorkExpected = maxScanWork + + // hardGoal is a hard limit on the amount that we're willing to push back the + // heap goal, and that's twice the heap goal (i.e. 
if GOGC=100 and the heap and/or + // stacks and/or globals grow to twice their size, this limits the current GC cycle's + // growth to 4x the original live heap's size). + // + // This maintains the invariant that we use no more memory than the next GC cycle + // will anyway. + hardGoal := int64((1.0 + float64(gcPercent)/100.0) * float64(heapGoal)) + if extHeapGoal > hardGoal { + extHeapGoal = hardGoal + } + heapGoal = extHeapGoal + } + if int64(live) > heapGoal { + // We're already past our heap goal, even the extrapolated one. + // Leave ourselves some extra runway, so in the worst case we + // finish by that point. + const maxOvershoot = 1.1 + heapGoal = int64(float64(heapGoal) * maxOvershoot) + + // Compute the upper bound on the scan work remaining. + scanWorkExpected = maxScanWork + } + + // Compute the remaining scan work estimate. + // + // Note that we currently count allocations during GC as both + // scannable heap (heapScan) and scan work completed + // (scanWork), so allocation will change this difference + // slowly in the soft regime and not at all in the hard + // regime. + scanWorkRemaining := scanWorkExpected - work + if scanWorkRemaining < 1000 { + // We set a somewhat arbitrary lower bound on + // remaining scan work since if we aim a little high, + // we can miss by a little. + // + // We *do* need to enforce that this is at least 1, + // since marking is racy and double-scanning objects + // may legitimately make the remaining scan work + // negative, even in the hard goal regime. + scanWorkRemaining = 1000 + } + + // Compute the heap distance remaining. + heapRemaining := heapGoal - int64(live) + if heapRemaining <= 0 { + // This shouldn't happen, but if it does, avoid + // dividing by zero or setting the assist negative. + heapRemaining = 1 + } + + // Compute the mutator assist ratio so by the time the mutator + // allocates the remaining heap bytes up to heapGoal, it will + // have done (or stolen) the remaining amount of scan work. 
+ // Note that the assist ratio values are updated atomically + // but not together. This means there may be some degree of + // skew between the two values. This is generally OK as the + // values shift relatively slowly over the course of a GC + // cycle. + assistWorkPerByte := float64(scanWorkRemaining) / float64(heapRemaining) + assistBytesPerWork := float64(heapRemaining) / float64(scanWorkRemaining) + c.assistWorkPerByte.Store(assistWorkPerByte) + c.assistBytesPerWork.Store(assistBytesPerWork) +} + +// endCycle computes the consMark estimate for the next cycle. +// userForced indicates whether the current GC cycle was forced +// by the application. +func (c *gcControllerState) endCycle(now int64, procs int, userForced bool) { + // Record last heap goal for the scavenger. + // We'll be updating the heap goal soon. + gcController.lastHeapGoal = c.heapGoal() + + // Compute the duration of time for which assists were turned on. + assistDuration := now - c.markStartTime + + // Assume background mark hit its utilization goal. + utilization := gcBackgroundUtilization + // Add assist utilization; avoid divide by zero. + if assistDuration > 0 { + utilization += float64(c.assistTime.Load()) / float64(assistDuration*int64(procs)) + } + + if c.heapLive.Load() <= c.triggered { + // Shouldn't happen, but let's be very safe about this in case the + // GC is somehow extremely short. + // + // In this case though, the only reasonable value for c.heapLive-c.triggered + // would be 0, which isn't really all that useful, i.e. the GC was so short + // that it didn't matter. + // + // Ignore this case and don't update anything. + return + } + idleUtilization := 0.0 + if assistDuration > 0 { + idleUtilization = float64(c.idleMarkTime.Load()) / float64(assistDuration*int64(procs)) + } + // Determine the cons/mark ratio. + // + // The units we want for the numerator and denominator are both B / cpu-ns. 
+ // We get this by taking the bytes allocated or scanned and dividing by the amount of + // CPU time it took for those operations. For allocations, that CPU time is + // + // assistDuration * procs * (1 - utilization) + // + // Where utilization includes just background GC workers and assists. It does *not* + // include idle GC work time, because in theory the mutator is free to take that at + // any point. + // + // For scanning, that CPU time is + // + // assistDuration * procs * (utilization + idleUtilization) + // + // In this case, we *include* idle utilization, because that is additional CPU time that + // the GC had available to it. + // + // In effect, idle GC time is sort of double-counted here, but it's very weird compared + // to other kinds of GC work, because of how fluid it is. Namely, because the mutator is + // *always* free to take it. + // + // So this calculation is really: + // ((heapLive-trigger) / (assistDuration * procs * (1-utilization))) / + // ((scanWork) / (assistDuration * procs * (utilization+idleUtilization))) + // + // Note that because we only care about the ratio, assistDuration and procs cancel out. + scanWork := c.heapScanWork.Load() + c.stackScanWork.Load() + c.globalsScanWork.Load() + currentConsMark := (float64(c.heapLive.Load()-c.triggered) * (utilization + idleUtilization)) / + (float64(scanWork) * (1 - utilization)) + + // Update our cons/mark estimate. This is the raw value above, but averaged over 2 GC cycles + // because it tends to be jittery, even in the steady-state. The smoothing helps the GC to + // maintain much more stable cycle-by-cycle behavior. + oldConsMark := c.consMark + c.consMark = (currentConsMark + c.lastConsMark) / 2 + c.lastConsMark = currentConsMark + + if debug.gcpacertrace > 0 { + printlock() + goal := gcGoalUtilization * 100 + print("pacer: ", int(utilization*100), "% CPU (", int(goal), " exp.)
for ") + print(c.heapScanWork.Load(), "+", c.stackScanWork.Load(), "+", c.globalsScanWork.Load(), " B work (", c.lastHeapScan+c.lastStackScan.Load()+c.globalsScan.Load(), " B exp.) ") + live := c.heapLive.Load() + print("in ", c.triggered, " B -> ", live, " B (∆goal ", int64(live)-int64(c.lastHeapGoal), ", cons/mark ", oldConsMark, ")") + println() + printunlock() + } +} + +// enlistWorker encourages another dedicated mark worker to start on +// another P if there are spare worker slots. It is used by putfull +// when more work is made available. +// +//go:nowritebarrier +func (c *gcControllerState) enlistWorker() { + // If there are idle Ps, wake one so it will run an idle worker. + // NOTE: This is suspected of causing deadlocks. See golang.org/issue/19112. + // + // if sched.npidle.Load() != 0 && sched.nmspinning.Load() == 0 { + // wakep() + // return + // } + + // There are no idle Ps. If we need more dedicated workers, + // try to preempt a running P so it will switch to a worker. + if c.dedicatedMarkWorkersNeeded.Load() <= 0 { + return + } + // Pick a random other P to preempt. + if gomaxprocs <= 1 { + return + } + gp := getg() + if gp == nil || gp.m == nil || gp.m.p == 0 { + return + } + myID := gp.m.p.ptr().id + for tries := 0; tries < 5; tries++ { + id := int32(fastrandn(uint32(gomaxprocs - 1))) + if id >= myID { + id++ + } + p := allp[id] + if p.status != _Prunning { + continue + } + if preemptone(p) { + return + } + } +} + +// findRunnableGCWorker returns a background mark worker for pp if it +// should be run. This must only be called when gcBlackenEnabled != 0. +func (c *gcControllerState) findRunnableGCWorker(pp *p, now int64) (*g, int64) { + if gcBlackenEnabled == 0 { + throw("gcControllerState.findRunnable: blackening not enabled") + } + + // Since we have the current time, check if the GC CPU limiter + // hasn't had an update in a while. 
This check is necessary in + // case the limiter is on but hasn't been checked in a while and + // so may have left sufficient headroom to turn off again. + if now == 0 { + now = nanotime() + } + if gcCPULimiter.needUpdate(now) { + gcCPULimiter.update(now) + } + + if !gcMarkWorkAvailable(pp) { + // No work to be done right now. This can happen at + // the end of the mark phase when there are still + // assists tapering off. Don't bother running a worker + // now because it'll just return immediately. + return nil, now + } + + // Grab a worker before we commit to running below. + node := (*gcBgMarkWorkerNode)(gcBgMarkWorkerPool.pop()) + if node == nil { + // There is at least one worker per P, so normally there are + // enough workers to run on all Ps, if necessary. However, once + // a worker enters gcMarkDone it may park without rejoining the + // pool, thus freeing a P with no corresponding worker. + // gcMarkDone never depends on another worker doing work, so it + // is safe to simply do nothing here. + // + // If gcMarkDone bails out without completing the mark phase, + // it will always do so with queued global work. Thus, that P + // will be immediately eligible to re-run the worker G it was + // just using, ensuring work can complete. + return nil, now + } + + decIfPositive := func(val *atomic.Int64) bool { + for { + v := val.Load() + if v <= 0 { + return false + } + + if val.CompareAndSwap(v, v-1) { + return true + } + } + } + + if decIfPositive(&c.dedicatedMarkWorkersNeeded) { + // This P is now dedicated to marking until the end of + // the concurrent mark phase. + pp.gcMarkWorkerMode = gcMarkWorkerDedicatedMode + } else if c.fractionalUtilizationGoal == 0 { + // No need for fractional workers. + gcBgMarkWorkerPool.push(&node.node) + return nil, now + } else { + // Is this P behind on the fractional utilization + // goal? + // + // This should be kept in sync with pollFractionalWorkerExit. 
+ delta := now - c.markStartTime + if delta > 0 && float64(pp.gcFractionalMarkTime)/float64(delta) > c.fractionalUtilizationGoal { + // Nope. No need to run a fractional worker. + gcBgMarkWorkerPool.push(&node.node) + return nil, now + } + // Run a fractional worker. + pp.gcMarkWorkerMode = gcMarkWorkerFractionalMode + } + + // Run the background mark worker. + gp := node.gp.ptr() + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, now +} + +// resetLive sets up the controller state for the next mark phase after the end +// of the previous one. Must be called after endCycle and before commit, before +// the world is started. +// +// The world must be stopped. +func (c *gcControllerState) resetLive(bytesMarked uint64) { + c.heapMarked = bytesMarked + c.heapLive.Store(bytesMarked) + c.heapScan.Store(uint64(c.heapScanWork.Load())) + c.lastHeapScan = uint64(c.heapScanWork.Load()) + c.lastStackScan.Store(uint64(c.stackScanWork.Load())) + c.triggered = ^uint64(0) // Reset triggered. + + // heapLive was updated, so emit a trace event. + if trace.enabled { + traceHeapAlloc(bytesMarked) + } +} + +// markWorkerStop must be called whenever a mark worker stops executing. +// +// It updates mark work accounting in the controller by a duration of +// work in nanoseconds and other bookkeeping. +// +// Safe to execute at any time. 
+func (c *gcControllerState) markWorkerStop(mode gcMarkWorkerMode, duration int64) { + switch mode { + case gcMarkWorkerDedicatedMode: + c.dedicatedMarkTime.Add(duration) + c.dedicatedMarkWorkersNeeded.Add(1) + case gcMarkWorkerFractionalMode: + c.fractionalMarkTime.Add(duration) + case gcMarkWorkerIdleMode: + c.idleMarkTime.Add(duration) + c.removeIdleMarkWorker() + default: + throw("markWorkerStop: unknown mark worker mode") + } +} + +func (c *gcControllerState) update(dHeapLive, dHeapScan int64) { + if dHeapLive != 0 { + live := gcController.heapLive.Add(dHeapLive) + if trace.enabled { + // gcController.heapLive changed. + traceHeapAlloc(live) + } + } + if gcBlackenEnabled == 0 { + // Update heapScan when we're not in a current GC. It is fixed + // at the beginning of a cycle. + if dHeapScan != 0 { + gcController.heapScan.Add(dHeapScan) + } + } else { + // gcController.heapLive changed. + c.revise() + } +} + +func (c *gcControllerState) addScannableStack(pp *p, amount int64) { + if pp == nil { + c.maxStackScan.Add(amount) + return + } + pp.maxStackScanDelta += amount + if pp.maxStackScanDelta >= maxStackScanSlack || pp.maxStackScanDelta <= -maxStackScanSlack { + c.maxStackScan.Add(pp.maxStackScanDelta) + pp.maxStackScanDelta = 0 + } +} + +func (c *gcControllerState) addGlobals(amount int64) { + c.globalsScan.Add(amount) +} + +// heapGoal returns the current heap goal. +func (c *gcControllerState) heapGoal() uint64 { + goal, _ := c.heapGoalInternal() + return goal +} + +// heapGoalInternal is the implementation of heapGoal which returns additional +// information that is necessary for computing the trigger. +// +// The returned minTrigger is always <= goal. +func (c *gcControllerState) heapGoalInternal() (goal, minTrigger uint64) { + // Start with the goal calculated for gcPercent. + goal = c.gcPercentHeapGoal.Load() + + // Check if the memory-limit-based goal is smaller, and if so, pick that. 
+ if newGoal := c.memoryLimitHeapGoal(); go119MemoryLimitSupport && newGoal < goal { + goal = newGoal + } else { + // We're not limited by the memory limit goal, so perform a series of + // adjustments that might move the goal forward in a variety of circumstances. + + sweepDistTrigger := c.sweepDistMinTrigger.Load() + if sweepDistTrigger > goal { + // Set the goal to maintain a minimum sweep distance since + // the last call to commit. Note that we never want to do this + // if we're in the memory limit regime, because it could push + // the goal up. + goal = sweepDistTrigger + } + // Since we ignore the sweep distance trigger in the memory + // limit regime, we need to ensure we don't propagate it to + // the trigger, because it could cause a violation of the + // invariant that the trigger < goal. + minTrigger = sweepDistTrigger + + // Ensure that the heap goal is at least a little larger than + // the point at which we triggered. This may not be the case if GC + // start is delayed or if the allocation that pushed gcController.heapLive + // over trigger is large or if the trigger is really close to + // GOGC. Assist is proportional to this distance, so enforce a + // minimum distance, even if it means going over the GOGC goal + // by a tiny bit. + // + // Ignore this if we're in the memory limit regime: we'd prefer to + // have the GC respond hard about how close we are to the goal than to + // push the goal back in such a manner that it could cause us to exceed + // the memory limit. + const minRunway = 64 << 10 + if c.triggered != ^uint64(0) && goal < c.triggered+minRunway { + goal = c.triggered + minRunway + } + } + return +} + +// memoryLimitHeapGoal returns a heap goal derived from memoryLimit. +func (c *gcControllerState) memoryLimitHeapGoal() uint64 { + // Start by pulling out some values we'll need. Be careful about overflow. + var heapFree, heapAlloc, mappedReady uint64 + for { + heapFree = c.heapFree.load() // Free and unscavenged memory. 
+ heapAlloc = c.totalAlloc.Load() - c.totalFree.Load() // Heap object bytes in use. + mappedReady = c.mappedReady.Load() // Total unreleased mapped memory. + if heapFree+heapAlloc <= mappedReady { + break + } + // It is impossible for heap memory to exceed total unreleased mapped memory, but + // because these stats are updated independently, we may observe a partial update + // including only some values. Thus, we appear to break the invariant. However, + // this condition is necessarily transient, so just try again. In the case of a + // persistent accounting error, we'll deadlock here. + } + + // Below we compute a goal from memoryLimit. There are a few things to be aware of. + // Firstly, the memoryLimit does not easily compare to the heap goal: the former + // is total mapped memory by the runtime that hasn't been released, while the latter is + // only heap object memory. Intuitively, the way we convert from one to the other is to + // subtract everything from memoryLimit that both contributes to the memory limit (so, + // ignore scavenged memory) and doesn't contain heap objects. This doesn't quite + // line up with reality, but it's a good starting point. + // + // In practice this computation looks like the following: + // + // memoryLimit - ((mappedReady - heapFree - heapAlloc) + max(mappedReady - memoryLimit, 0)) - memoryLimitHeapGoalHeadroom + // ^1 ^2 ^3 + // + // Let's break this down. + // + // The first term (marker 1) is everything that contributes to the memory limit and isn't + // or couldn't become heap objects. It represents, broadly speaking, non-heap overheads. + // One oddity you may have noticed is that we also subtract out heapFree, i.e. unscavenged + // memory that may contain heap objects in the future. + // + // Let's take a step back. In an ideal world, this term would look something like just + // the heap goal. That is, we "reserve" enough space for the heap to grow to the heap + // goal, and subtract out everything else.
This is of course impossible; the definition + // is circular! However, this impossible definition contains a key insight: the amount + // we're *going* to use matters just as much as whatever we're currently using. + // + // Consider if the heap shrinks to 1/10th its size, leaving behind lots of free and + // unscavenged memory. mappedReady - heapAlloc will be quite large, because of that free + // and unscavenged memory, pushing the goal down significantly. + // + // heapFree is also safe to exclude from the memory limit because in the steady-state, it's + // just a pool of memory for future heap allocations, and making new allocations from heapFree + // memory doesn't increase overall memory use. In transient states, the scavenger and the + // allocator actively manage the pool of heapFree memory to maintain the memory limit. + // + // The second term (marker 2) is the amount of memory we've exceeded the limit by, and is + // intended to help recover from such a situation. By pushing the heap goal down, we also + // push the trigger down, triggering and finishing a GC sooner in order to make room for + // other memory sources. Note that since we're effectively reducing the heap goal by X bytes, + // we're actually giving more than X bytes of headroom back, because the heap goal is in + // terms of heap objects, but it takes more than X bytes (e.g. due to fragmentation) to store + // X bytes worth of objects. + // + // The third term (marker 3) subtracts an additional memoryLimitHeapGoalHeadroom bytes from the + // heap goal. As the name implies, this is to provide additional headroom in the face of pacing + // inaccuracies. This is a fixed number of bytes because these inaccuracies disproportionately + // affect small heaps: as heaps get smaller, the pacer's inputs get fuzzier. Shorter GC cycles + // and less GC work means noisy external factors like the OS scheduler have a greater impact. + + memoryLimit := uint64(c.memoryLimit.Load()) + + // Compute term 1. 
+ nonHeapMemory := mappedReady - heapFree - heapAlloc + + // Compute term 2. + var overage uint64 + if mappedReady > memoryLimit { + overage = mappedReady - memoryLimit + } + + if nonHeapMemory+overage >= memoryLimit { + // We're at a point where non-heap memory exceeds the memory limit on its own. + // There's honestly not much we can do here but just trigger GCs continuously + // and let the CPU limiter rein that in. Something has to give at this point. + // Set the goal to heapMarked, the lowest possible goal. + return c.heapMarked + } + + // Compute the goal. + goal := memoryLimit - (nonHeapMemory + overage) + + // Apply some headroom to the goal to account for pacing inaccuracies. + // Be careful about small limits. + if goal < memoryLimitHeapGoalHeadroom || goal-memoryLimitHeapGoalHeadroom < memoryLimitHeapGoalHeadroom { + goal = memoryLimitHeapGoalHeadroom + } else { + goal = goal - memoryLimitHeapGoalHeadroom + } + // Don't let us go below the live heap. A heap goal below the live heap doesn't make sense. + if goal < c.heapMarked { + goal = c.heapMarked + } + return goal +} + +const ( + // These constants determine the bounds on the GC trigger as a fraction + // of heap bytes allocated between the start of a GC (heapLive == heapMarked) + // and the end of a GC (heapLive == heapGoal). + // + // The constants are obscured in this way for efficiency. The denominator + // of the fraction is always a power-of-two for a quick division, so that + // the numerator is a single constant integer multiplication. + triggerRatioDen = 64 + + // The minimum trigger constant was chosen empirically: given a sufficiently + // fast/scalable allocator with 48 Ps that could drive the trigger ratio + // to <0.05, this constant causes applications to retain the same peak + // RSS compared to not having this allocator. + minTriggerRatioNum = 45 // ~0.7 + + // The maximum trigger constant is chosen somewhat arbitrarily, but the + // current constant has served us well over the years.
+ maxTriggerRatioNum = 61 // ~0.95 +) + +// trigger returns the current point at which a GC should trigger along with +// the heap goal. +// +// The returned value may be compared against heapLive to determine whether +// the GC should trigger. Thus, the GC trigger condition should be (but may +// not be, in the case of small movements for efficiency) checked whenever +// the heap goal may change. +func (c *gcControllerState) trigger() (uint64, uint64) { + goal, minTrigger := c.heapGoalInternal() + + // Invariant: the trigger must always be less than the heap goal. + // + // Note that the memory limit sets a hard maximum on our heap goal, + // but the live heap may grow beyond it. + + if c.heapMarked >= goal { + // The goal should never be smaller than heapMarked, but let's be + // defensive about it. The only reasonable trigger here is one that + // causes a continuous GC cycle at heapMarked, but respect the goal + // if it came out as smaller than that. + return goal, goal + } + + // Below this point, c.heapMarked < goal. + + // heapMarked is our absolute minimum, and it's possible the trigger + // bound we get from heapGoalInternal is less than that. + if minTrigger < c.heapMarked { + minTrigger = c.heapMarked + } + + // If we let the trigger go too low, then if the application + // is allocating very rapidly we might end up in a situation + // where we're allocating black during a nearly always-on GC. + // The result of this is a growing heap and ultimately an + // increase in RSS. By capping us at a point >0, we're essentially + // saying that we're OK using more CPU during the GC to prevent + // this growth in RSS. + triggerLowerBound := uint64(((goal-c.heapMarked)/triggerRatioDen)*minTriggerRatioNum) + c.heapMarked + if minTrigger < triggerLowerBound { + minTrigger = triggerLowerBound + } + + // For small heaps, set the max trigger point at maxTriggerRatio of the way + // from the live heap to the heap goal.
This ensures we always have *some* + // headroom when the GC actually starts. For larger heaps, set the max trigger + // point at the goal, minus the minimum heap size. + // + // This choice follows from the fact that the minimum heap size is chosen + // to reflect the costs of a GC with no work to do. With a large heap but + // very little scan work to perform, this gives us exactly as much runway + // as we would need, in the worst case. + maxTrigger := uint64(((goal-c.heapMarked)/triggerRatioDen)*maxTriggerRatioNum) + c.heapMarked + if goal > defaultHeapMinimum && goal-defaultHeapMinimum > maxTrigger { + maxTrigger = goal - defaultHeapMinimum + } + if maxTrigger < minTrigger { + maxTrigger = minTrigger + } + + // Compute the trigger from our bounds and the runway stored by commit. + var trigger uint64 + runway := c.runway.Load() + if runway > goal { + trigger = minTrigger + } else { + trigger = goal - runway + } + if trigger < minTrigger { + trigger = minTrigger + } + if trigger > maxTrigger { + trigger = maxTrigger + } + if trigger > goal { + print("trigger=", trigger, " heapGoal=", goal, "\n") + print("minTrigger=", minTrigger, " maxTrigger=", maxTrigger, "\n") + throw("produced a trigger greater than the heap goal") + } + return trigger, goal +} + +// commit recomputes all pacing parameters needed to derive the +// trigger and the heap goal. Namely, the gcPercent-based heap goal, +// and the amount of runway we want to give the GC this cycle. +// +// This can be called any time. If GC is in the middle of a +// concurrent phase, it will adjust the pacing of that phase. +// +// isSweepDone should be the result of calling isSweepDone(), +// unless we're testing or we know we're executing during a GC cycle. +// +// This depends on gcPercent, gcController.heapMarked, and +// gcController.heapLive. These must be up to date. +// +// Callers must call gcControllerState.revise after calling this +// function if the GC is enabled.
+// +// mheap_.lock must be held or the world must be stopped. +func (c *gcControllerState) commit(isSweepDone bool) { + if !c.test { + assertWorldStoppedOrLockHeld(&mheap_.lock) + } + + if isSweepDone { + // The sweep is done, so there aren't any restrictions on the trigger + // we need to think about. + c.sweepDistMinTrigger.Store(0) + } else { + // Concurrent sweep happens in the heap growth + // from gcController.heapLive to trigger. Make sure we + // give the sweeper some runway if it doesn't have enough. + c.sweepDistMinTrigger.Store(c.heapLive.Load() + sweepMinHeapDistance) + } + + // Compute the next GC goal, which is when the allocated heap + // has grown by GOGC/100 over where it started the last cycle, + // plus additional runway for non-heap sources of GC work. + gcPercentHeapGoal := ^uint64(0) + if gcPercent := c.gcPercent.Load(); gcPercent >= 0 { + gcPercentHeapGoal = c.heapMarked + (c.heapMarked+c.lastStackScan.Load()+c.globalsScan.Load())*uint64(gcPercent)/100 + } + // Apply the minimum heap size here. It's defined in terms of gcPercent + // and is only updated by functions that call commit. + if gcPercentHeapGoal < c.heapMinimum { + gcPercentHeapGoal = c.heapMinimum + } + c.gcPercentHeapGoal.Store(gcPercentHeapGoal) + + // Compute the amount of runway we want the GC to have by using our + // estimate of the cons/mark ratio. + // + // The idea is to take our expected scan work, and multiply it by + // the cons/mark ratio to determine how long it'll take to complete + // that scan work in terms of bytes allocated. This gives us our GC's + // runway. + // + // However, the cons/mark ratio is a ratio of rates per CPU-second, but + // here we care about the relative rates for some division of CPU + // resources among the mutator and the GC. + // + // To summarize, we have B / cpu-ns, and we want B / ns. We get that + // by multiplying by our desired division of CPU resources. We choose + // to express CPU resources as GOMAXPROCS*fraction.
Note that because + // we're working with a ratio here, we can omit the number of CPU cores, + // because they'll appear in the numerator and denominator and cancel out. + // As a result, this is basically just "weighing" the cons/mark ratio by + // our desired division of resources. + // + // Furthermore, by setting the runway so that CPU resources are divided + // this way, assuming that the cons/mark ratio is correct, we make that + // division a reality. + c.runway.Store(uint64((c.consMark * (1 - gcGoalUtilization) / (gcGoalUtilization)) * float64(c.lastHeapScan+c.lastStackScan.Load()+c.globalsScan.Load()))) +} + +// setGCPercent updates gcPercent. commit must be called after. +// Returns the old value of gcPercent. +// +// The world must be stopped, or mheap_.lock must be held. +func (c *gcControllerState) setGCPercent(in int32) int32 { + if !c.test { + assertWorldStoppedOrLockHeld(&mheap_.lock) + } + + out := c.gcPercent.Load() + if in < 0 { + in = -1 + } + c.heapMinimum = defaultHeapMinimum * uint64(in) / 100 + c.gcPercent.Store(in) + + return out +} + +//go:linkname setGCPercent runtime/debug.setGCPercent +func setGCPercent(in int32) (out int32) { + // Run on the system stack since we grab the heap lock. + systemstack(func() { + lock(&mheap_.lock) + out = gcController.setGCPercent(in) + gcControllerCommit() + unlock(&mheap_.lock) + }) + + // If we just disabled GC, wait for any concurrent GC mark to + // finish so we always return with no GC running. + if in < 0 { + gcWaitOnMark(work.cycles.Load()) + } + + return out +} + +func readGOGC() int32 { + p := gogetenv("GOGC") + if p == "off" { + return -1 + } + if n, ok := atoi32(p); ok { + return n + } + return 100 +} + +// setMemoryLimit updates memoryLimit. commit must be called after. +// Returns the old value of memoryLimit. +// +// The world must be stopped, or mheap_.lock must be held.
+func (c *gcControllerState) setMemoryLimit(in int64) int64 { + if !c.test { + assertWorldStoppedOrLockHeld(&mheap_.lock) + } + + out := c.memoryLimit.Load() + if in >= 0 { + c.memoryLimit.Store(in) + } + + return out +} + +//go:linkname setMemoryLimit runtime/debug.setMemoryLimit +func setMemoryLimit(in int64) (out int64) { + // Run on the system stack since we grab the heap lock. + systemstack(func() { + lock(&mheap_.lock) + out = gcController.setMemoryLimit(in) + if in < 0 || out == in { + // If we're just checking the value or not changing + // it, there's no point in doing the rest. + unlock(&mheap_.lock) + return + } + gcControllerCommit() + unlock(&mheap_.lock) + }) + return out +} + +func readGOMEMLIMIT() int64 { + p := gogetenv("GOMEMLIMIT") + if p == "" || p == "off" { + return maxInt64 + } + n, ok := parseByteCount(p) + if !ok { + print("GOMEMLIMIT=", p, "\n") + throw("malformed GOMEMLIMIT; see `go doc runtime/debug.SetMemoryLimit`") + } + return n +} + +// addIdleMarkWorker attempts to add a new idle mark worker. +// +// If this returns true, the caller must become an idle mark worker unless +// there are no background mark worker goroutines in the pool. This case is +// harmless because there are already background mark workers running. +// If this returns false, the caller must NOT become an idle mark worker. +// +// nosplit because it may be called without a P. +// +//go:nosplit +func (c *gcControllerState) addIdleMarkWorker() bool { + for { + old := c.idleMarkWorkers.Load() + n, max := int32(old&uint64(^uint32(0))), int32(old>>32) + if n >= max { + // See the comment on idleMarkWorkers for why + // n > max is tolerated. + return false + } + if n < 0 { + print("n=", n, " max=", max, "\n") + throw("negative idle mark workers") + } + new := uint64(uint32(n+1)) | (uint64(max) << 32) + if c.idleMarkWorkers.CompareAndSwap(old, new) { + return true + } + } +} + +// needIdleMarkWorker is a hint as to whether another idle mark worker is needed.
+// +// The caller must still call addIdleMarkWorker to become one. This is mainly +// useful for a quick check before an expensive operation. +// +// nosplit because it may be called without a P. +// +//go:nosplit +func (c *gcControllerState) needIdleMarkWorker() bool { + p := c.idleMarkWorkers.Load() + n, max := int32(p&uint64(^uint32(0))), int32(p>>32) + return n < max +} + +// removeIdleMarkWorker must be called when an idle mark worker stops executing. +func (c *gcControllerState) removeIdleMarkWorker() { + for { + old := c.idleMarkWorkers.Load() + n, max := int32(old&uint64(^uint32(0))), int32(old>>32) + if n-1 < 0 { + print("n=", n, " max=", max, "\n") + throw("negative idle mark workers") + } + new := uint64(uint32(n-1)) | (uint64(max) << 32) + if c.idleMarkWorkers.CompareAndSwap(old, new) { + return + } + } +} + +// setMaxIdleMarkWorkers sets the maximum number of idle mark workers allowed. +// +// This method is optimistic in that it does not wait for the number of +// idle mark workers to reduce to max before returning; it assumes the workers +// will deschedule themselves. +func (c *gcControllerState) setMaxIdleMarkWorkers(max int32) { + for { + old := c.idleMarkWorkers.Load() + n := int32(old & uint64(^uint32(0))) + if n < 0 { + print("n=", n, " max=", max, "\n") + throw("negative idle mark workers") + } + new := uint64(uint32(n)) | (uint64(max) << 32) + if c.idleMarkWorkers.CompareAndSwap(old, new) { + return + } + } +} + +// gcControllerCommit is gcController.commit, but passes arguments from live +// (non-test) data. It also updates any consumers of the GC pacing, such as +// sweep pacing and the background scavenger. +// +// Calls gcController.commit. +// +// The heap lock must be held, so this must be executed on the system stack. +// +//go:systemstack +func gcControllerCommit() { + assertWorldStoppedOrLockHeld(&mheap_.lock) + + gcController.commit(isSweepDone()) + + // Update mark pacing.
+ if gcphase != _GCoff { + gcController.revise() + } + + // TODO(mknyszek): This isn't really accurate any longer because the heap + // goal is computed dynamically. Still useful to snapshot, but not as useful. + if trace.enabled { + traceHeapGoal() + } + + trigger, heapGoal := gcController.trigger() + gcPaceSweeper(trigger) + gcPaceScavenger(gcController.memoryLimit.Load(), heapGoal, gcController.lastHeapGoal) +} diff --git a/src/runtime/mgcpacer_test.go b/src/runtime/mgcpacer_test.go new file mode 100644 index 0000000..e373e32 --- /dev/null +++ b/src/runtime/mgcpacer_test.go @@ -0,0 +1,1084 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "math" + "math/rand" + . "runtime" + "testing" + "time" +) + +func TestGcPacer(t *testing.T) { + t.Parallel() + + const initialHeapBytes = 256 << 10 + for _, e := range []*gcExecTest{ + { + // The most basic test case: a steady-state heap. + // Growth to an O(MiB) heap, then constant heap size, alloc/scan rates. + name: "Steady", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n >= 25 { + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. 
+ assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // Same as the steady-state case, but lots of stacks to scan relative to the heap size. + name: "SteadyBigStacks", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(132.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(2048).sum(ramp(128<<20, 8)), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + // Check the same conditions as the steady-state case, except the old pacer can't + // really handle this well, so don't check the goal ratio for it. + n := len(c) + if n >= 25 { + // For the pacer redesign, assert something even stronger: at this alloc/scan rate, + // it should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + } + }, + }, + { + // Same as the steady-state case, but lots of globals to scan relative to the heap size. + name: "SteadyBigGlobals", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 128 << 20, + nCores: 8, + allocRate: constant(132.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + // Check the same conditions as the steady-state case, except the old pacer can't + // really handle this well, so don't check the goal ratio for it. 
+ n := len(c) + if n >= 25 { + // For the pacer redesign, assert something even stronger: at this alloc/scan rate, + // it should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + } + }, + }, + { + // This tests the GC pacer's response to a small change in allocation rate. + name: "StepAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0).sum(ramp(66.0, 1).delay(50)), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 100, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if (n >= 25 && n < 50) || n >= 75 { + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles + // and then is able to settle again after a significant jump in allocation rate. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // This tests the GC pacer's response to a large change in allocation rate. 
+ name: "HeavyStepAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33).sum(ramp(330, 1).delay(50)), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 100, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if (n >= 25 && n < 50) || n >= 75 { + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles + // and then is able to settle again after a significant jump in allocation rate. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // This tests the GC pacer's response to a change in the fraction of the scannable heap. + name: "StepScannableFrac", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(128.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(0.2).sum(unit(0.5).delay(50)), + stackBytes: constant(8192), + length: 100, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if (n >= 25 && n < 50) || n >= 75 { + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles + // and then is able to settle again after a significant jump in allocation rate. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // Tests the pacer for a high GOGC value with a large heap growth happening + // in the middle. 
The purpose of the large heap growth is to check if GC + // utilization ends up sensitive to it. + name: "HighGOGC", + gcPercent: 1500, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: random(7, 0x53).offset(165), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12), random(0.01, 0x1), unit(14).delay(25)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 12 { + if n == 26 { + // In the 26th cycle there's a heap growth. Overshoot is expected to maintain + // a stable utilization, but we should *never* overshoot more than GOGC of + // the next cycle. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.90, 15) + } else { + // Give a wider goal range here. With such a high GOGC value we're going to be + // forced to undershoot. + // + // TODO(mknyszek): Instead of placing a 0.95 limit on the trigger, make the limit + // based on absolute bytes, that's based somewhat in how the minimum heap size + // is determined. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.90, 1.05) + } + + // Ensure utilization remains stable despite a growth in live heap size + // at GC #25. This test fails prior to the GC pacer redesign. + // + // Because GOGC is so large, we should also be really close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, GCGoalUtilization+0.03) + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.03) + } + }, + }, + { + // This test makes sure that in the face of a varying (in this case, oscillating) allocation + // rate, the pacer does a reasonably good job of staying abreast of the changes.
+ name: "OscAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: oscillate(13, 0, 8).offset(67), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 12 { + // After the 12th GC, the heap will stop growing. Now, just make sure that: + // 1. Utilization isn't varying _too_ much, and + // 2. The pacer is mostly keeping up with the goal. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + assertInRange(t, "GC utilization", c[n-1].gcUtilization, 0.25, 0.3) + } + }, + }, + { + // This test is the same as OscAlloc, but instead of oscillating, the allocation rate is jittery. + name: "JitterAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: random(13, 0xf).offset(132), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12), random(0.01, 0xe)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 12 { + // After the 12th GC, the heap will stop growing. Now, just make sure that: + // 1. Utilization isn't varying _too_ much, and + // 2. The pacer is mostly keeping up with the goal. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + assertInRange(t, "GC utilization", c[n-1].gcUtilization, 0.25, 0.3) + } + }, + }, + { + // This test is the same as JitterAlloc, but with a much higher allocation rate. + // The jitter is proportionally the same. 
+ name: "HeavyJitterAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: random(33.0, 0x0).offset(330), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12), random(0.01, 0x152)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 13 { + // After the 12th GC, the heap will stop growing. Now, just make sure that: + // 1. Utilization isn't varying _too_ much, and + // 2. The pacer is mostly keeping up with the goal. + // We start at the 13th here because we want to use the 12th as a reference. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + // Unlike the other tests, GC utilization here will vary more and tend higher. + // Just make sure it's not going too crazy. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.05) + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[11].gcUtilization, 0.05) + } + }, + }, + { + // This test sets a slow allocation rate and a small heap (close to the minimum heap size) + // to try to minimize the difference between the trigger and the goal. + name: "SmallHeapSlowAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(1.0), + scanRate: constant(2048.0), + growthRate: constant(2.0).sum(ramp(-1.0, 3)), + scannableFrac: constant(0.01), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 4 { + // After the 4th GC, the heap will stop growing. + // First, let's make sure we're finishing near the goal, with some extra + // room because we're probably going to be triggering early. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.925, 1.025) + // Next, let's make sure there's some minimum distance between the goal + // and the trigger. 
It should be proportional to the runway (hence the + // trigger ratio check, instead of a check against the runway). + assertInRange(t, "trigger ratio", c[n-1].triggerRatio(), 0.925, 0.975) + } + if n > 25 { + // Double-check that GC utilization looks OK. + + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + // Make sure GC utilization has mostly levelled off. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.05) + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[11].gcUtilization, 0.05) + } + }, + }, + { + // This test sets a slow allocation rate and a medium heap (around 10x the min heap size) + // to try to minimize the difference between the trigger and the goal. + name: "MediumHeapSlowAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(1.0), + scanRate: constant(2048.0), + growthRate: constant(2.0).sum(ramp(-1.0, 8)), + scannableFrac: constant(0.01), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 9 { + // By this point, the heap has stopped growing. + // First, let's make sure we're finishing near the goal, with some extra + // room because we're probably going to be triggering early. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.925, 1.025) + // Next, let's make sure there's some minimum distance between the goal + // and the trigger. It should be proportional to the runway (hence the + // trigger ratio check, instead of a check against the runway). + assertInRange(t, "trigger ratio", c[n-1].triggerRatio(), 0.925, 0.975) + } + if n > 25 { + // Double-check that GC utilization looks OK. + + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization.
+ assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + // Make sure GC utilization has mostly levelled off. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.05) + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[11].gcUtilization, 0.05) + } + }, + }, + { + // This test sets a slow allocation rate and a large heap to try to minimize the + // difference between the trigger and the goal. + name: "LargeHeapSlowAlloc", + gcPercent: 100, + memoryLimit: math.MaxInt64, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(1.0), + scanRate: constant(2048.0), + growthRate: constant(4.0).sum(ramp(-3.0, 12)), + scannableFrac: constant(0.01), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 13 { + // By this point, the heap has stopped growing. + // First, let's make sure we're finishing near the goal. + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + // Next, let's make sure there's some minimum distance between the goal + // and the trigger. It should be around the default minimum heap size. + assertInRange(t, "runway", c[n-1].runway(), DefaultHeapMinimum-64<<10, DefaultHeapMinimum+64<<10) + } + if n > 25 { + // Double-check that GC utilization looks OK. + + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + // Make sure GC utilization has mostly levelled off. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.05) + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[11].gcUtilization, 0.05) + } + }, + }, + { + // The most basic test case with a memory limit: a steady-state heap. + // Growth to an O(MiB) heap, then constant heap size, alloc/scan rates. + // Provide a lot of room for the limit.
Essentially, this should behave just like + // the "Steady" test. Note that we don't simulate non-heap overheads, so the + // memory limit and the heap limit are identical. + name: "SteadyMemoryLimit", + gcPercent: 100, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if peak := c[n-1].heapPeak; peak >= (512<<20)-MemoryLimitHeapGoalHeadroom { + t.Errorf("peak heap size reaches heap limit: %d", peak) + } + if n >= 25 { + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // This is the same as the previous test, but gcPercent = -1, so the heap *should* grow + // all the way to the peak. + name: "SteadyMemoryLimitNoGCPercent", + gcPercent: -1, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(2.0).sum(ramp(-1.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if goal := c[n-1].heapGoal; goal != (512<<20)-MemoryLimitHeapGoalHeadroom { + t.Errorf("heap goal is not the heap limit: %d", goal) + } + if n >= 25 { + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization. 
+ assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // This test ensures that the pacer doesn't fall over even when the live heap exceeds + // the memory limit. It also makes sure GC utilization actually rises to push back. + name: "ExceedMemoryLimit", + gcPercent: 100, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(3.5).sum(ramp(-2.5, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 12 { + // We're way over the memory limit, so we want to make sure our goal is set + // as low as it possibly can be. + if goal, live := c[n-1].heapGoal, c[n-1].heapLive; goal != live { + t.Errorf("heap goal is not equal to live heap: %d != %d", goal, live) + } + } + if n >= 25 { + // Due to memory pressure, we should scale to 100% GC CPU utilization. + // Note that in practice this won't actually happen because of the CPU limiter, + // but it's not the pacer's job to limit CPU usage. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, 1.0, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + // In this case, that just means it's not wavering around a whole bunch. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + } + }, + }, + { + // Same as the previous test, but with gcPercent = -1. 
+ name: "ExceedMemoryLimitNoGCPercent", + gcPercent: -1, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(3.5).sum(ramp(-2.5, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n < 10 { + if goal := c[n-1].heapGoal; goal != (512<<20)-MemoryLimitHeapGoalHeadroom { + t.Errorf("heap goal is not the heap limit: %d", goal) + } + } + if n > 12 { + // We're way over the memory limit, so we want to make sure our goal is set + // as low as it possibly can be. + if goal, live := c[n-1].heapGoal, c[n-1].heapLive; goal != live { + t.Errorf("heap goal is not equal to live heap: %d != %d", goal, live) + } + } + if n >= 25 { + // Due to memory pressure, we should scale to 100% GC CPU utilization. + // Note that in practice this won't actually happen because of the CPU limiter, + // but it's not the pacer's job to limit CPU usage. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, 1.0, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles. + // In this case, that just means it's not wavering around a whole bunch. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + } + }, + }, + { + // This test ensures that the pacer maintains the memory limit as the heap grows. + name: "MaintainMemoryLimit", + gcPercent: 100, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(3.0).sum(ramp(-2.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if n > 12 { + // We're trying to saturate the memory limit. 
+ if goal := c[n-1].heapGoal; goal != (512<<20)-MemoryLimitHeapGoalHeadroom { + t.Errorf("heap goal is not the heap limit: %d", goal) + } + } + if n >= 25 { + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization, + // even with the additional memory pressure. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles and + // that it's meeting its goal. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + { + // Same as the previous test, but with gcPercent = -1. + name: "MaintainMemoryLimitNoGCPercent", + gcPercent: -1, + memoryLimit: 512 << 20, + globalsBytes: 32 << 10, + nCores: 8, + allocRate: constant(33.0), + scanRate: constant(1024.0), + growthRate: constant(3.0).sum(ramp(-2.0, 12)), + scannableFrac: constant(1.0), + stackBytes: constant(8192), + length: 50, + checker: func(t *testing.T, c []gcCycleResult) { + n := len(c) + if goal := c[n-1].heapGoal; goal != (512<<20)-MemoryLimitHeapGoalHeadroom { + t.Errorf("heap goal is not the heap limit: %d", goal) + } + if n >= 25 { + // At this alloc/scan rate, the pacer should be extremely close to the goal utilization, + // even with the additional memory pressure. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, GCGoalUtilization, 0.005) + + // Make sure the pacer settles into a non-degenerate state in at least 25 GC cycles and + // that it's meeting its goal. + assertInEpsilon(t, "GC utilization", c[n-1].gcUtilization, c[n-2].gcUtilization, 0.005) + assertInRange(t, "goal ratio", c[n-1].goalRatio(), 0.95, 1.05) + } + }, + }, + // TODO(mknyszek): Write a test that exercises the pacer's hard goal. 
+ // This is difficult in the idealized model this testing framework places + // the pacer in, because the calculated overshoot is directly proportional + // to the runway for the case of the expected work. + // However, it is still possible to trigger this case if something exceptional + // happens between calls to revise; the framework just doesn't support this yet. + } { + e := e + t.Run(e.name, func(t *testing.T) { + t.Parallel() + + c := NewGCController(e.gcPercent, e.memoryLimit) + var bytesAllocatedBlackLast int64 + results := make([]gcCycleResult, 0, e.length) + for i := 0; i < e.length; i++ { + cycle := e.next() + c.StartCycle(cycle.stackBytes, e.globalsBytes, cycle.scannableFrac, e.nCores) + + // Update pacer incrementally as we complete scan work. + const ( + revisePeriod = 500 * time.Microsecond + rateConv = 1024 * float64(revisePeriod) / float64(time.Millisecond) + ) + var nextHeapMarked int64 + if i == 0 { + nextHeapMarked = initialHeapBytes + } else { + nextHeapMarked = int64(float64(int64(c.HeapMarked())-bytesAllocatedBlackLast) * cycle.growthRate) + } + globalsScanWorkLeft := int64(e.globalsBytes) + stackScanWorkLeft := int64(cycle.stackBytes) + heapScanWorkLeft := int64(float64(nextHeapMarked) * cycle.scannableFrac) + doWork := func(work int64) (int64, int64, int64) { + var deltas [3]int64 + + // Do globals work first, then stacks, then heap. + for i, workLeft := range []*int64{&globalsScanWorkLeft, &stackScanWorkLeft, &heapScanWorkLeft} { + if *workLeft == 0 { + continue + } + if *workLeft > work { + deltas[i] += work + *workLeft -= work + work = 0 + break + } else { + deltas[i] += *workLeft + work -= *workLeft + *workLeft = 0 + } + } + return deltas[0], deltas[1], deltas[2] + } + var ( + gcDuration int64 + assistTime int64 + bytesAllocatedBlack int64 + ) + for heapScanWorkLeft+stackScanWorkLeft+globalsScanWorkLeft > 0 { + // Simulate GC assist pacing. + // + // Note that this is an idealized view of the GC assist pacing + // mechanism. 
+ + // From the assist ratio and the alloc and scan rates, we can idealize what + // the GC CPU utilization looks like. + // + // We start with assistRatio = (bytes of scan work) / (bytes of runway) (by definition). + // + // Over revisePeriod, we can also calculate how many bytes are scanned and + // allocated, given some GC CPU utilization u: + // + // bytesScanned = scanRate * rateConv * nCores * u + // bytesAllocated = allocRate * rateConv * nCores * (1 - u) + // + // During revisePeriod, assistRatio is kept constant, and GC assists kick in to + // maintain it. Specifically, they act to prevent too many bytes being allocated + // compared to how many bytes are scanned. It directly defines the ratio of + // bytesScanned to bytesAllocated over this period, hence: + // + // assistRatio = bytesScanned / bytesAllocated + // + // From this, we can solve for utilization, because everything else has already + // been determined: + // + // assistRatio = (scanRate * rateConv * nCores * u) / (allocRate * rateConv * nCores * (1 - u)) + // assistRatio = (scanRate * u) / (allocRate * (1 - u)) + // assistRatio * allocRate * (1-u) = scanRate * u + // assistRatio * allocRate - assistRatio * allocRate * u = scanRate * u + // assistRatio * allocRate = assistRatio * allocRate * u + scanRate * u + // assistRatio * allocRate = (assistRatio * allocRate + scanRate) * u + // u = (assistRatio * allocRate) / (assistRatio * allocRate + scanRate) + // + // Note that this may give a utilization that is _less_ than GCBackgroundUtilization, + // which isn't possible in practice because of dedicated workers. Thus, this case + // must be interpreted as GC assists not kicking in at all, and just round up. All + // downstream values will then have this accounted for. 
+ assistRatio := c.AssistWorkPerByte() + utilization := assistRatio * cycle.allocRate / (assistRatio*cycle.allocRate + cycle.scanRate) + if utilization < GCBackgroundUtilization { + utilization = GCBackgroundUtilization + } + + // Knowing the utilization, calculate bytesScanned and bytesAllocated. + bytesScanned := int64(cycle.scanRate * rateConv * float64(e.nCores) * utilization) + bytesAllocated := int64(cycle.allocRate * rateConv * float64(e.nCores) * (1 - utilization)) + + // Subtract work from our model. + globalsScanned, stackScanned, heapScanned := doWork(bytesScanned) + + // doWork may not use all of bytesScanned. + // In this case, the GC actually ends sometime in this period. + // Let's figure out when, exactly, and adjust bytesAllocated too. + actualElapsed := revisePeriod + actualAllocated := bytesAllocated + if actualScanned := globalsScanned + stackScanned + heapScanned; actualScanned < bytesScanned { + // actualScanned = scanRate * rateConv * (t / revisePeriod) * nCores * u + // => t = actualScanned * revisePeriod / (scanRate * rateConv * nCores * u) + actualElapsed = time.Duration(float64(actualScanned) * float64(revisePeriod) / (cycle.scanRate * rateConv * float64(e.nCores) * utilization)) + actualAllocated = int64(cycle.allocRate * rateConv * float64(actualElapsed) / float64(revisePeriod) * float64(e.nCores) * (1 - utilization)) + } + + // Ask the pacer to revise. + c.Revise(GCControllerReviseDelta{ + HeapLive: actualAllocated, + HeapScan: int64(float64(actualAllocated) * cycle.scannableFrac), + HeapScanWork: heapScanned, + StackScanWork: stackScanned, + GlobalsScanWork: globalsScanned, + }) + + // Accumulate variables. + assistTime += int64(float64(actualElapsed) * float64(e.nCores) * (utilization - GCBackgroundUtilization)) + gcDuration += int64(actualElapsed) + bytesAllocatedBlack += actualAllocated + } + + // Put together the results, log them, and concatenate them. 
+ result := gcCycleResult{ + cycle: i + 1, + heapLive: c.HeapMarked(), + heapScannable: int64(float64(int64(c.HeapMarked())-bytesAllocatedBlackLast) * cycle.scannableFrac), + heapTrigger: c.Triggered(), + heapPeak: c.HeapLive(), + heapGoal: c.HeapGoal(), + gcUtilization: float64(assistTime)/(float64(gcDuration)*float64(e.nCores)) + GCBackgroundUtilization, + } + t.Log("GC", result.String()) + results = append(results, result) + + // Run the checker for this test. + e.check(t, results) + + c.EndCycle(uint64(nextHeapMarked+bytesAllocatedBlack), assistTime, gcDuration, e.nCores) + + bytesAllocatedBlackLast = bytesAllocatedBlack + } + }) + } +} + +type gcExecTest struct { + name string + + gcPercent int + memoryLimit int64 + globalsBytes uint64 + nCores int + + allocRate float64Stream // > 0, KiB / cpu-ms + scanRate float64Stream // > 0, KiB / cpu-ms + growthRate float64Stream // > 0 + scannableFrac float64Stream // Clamped to [0, 1] + stackBytes float64Stream // Multiple of 2048. + length int + + checker func(*testing.T, []gcCycleResult) +} + +// minRate is an arbitrary minimum for allocRate, scanRate, and growthRate. +// These values just cannot be zero. +const minRate = 0.0001 + +func (e *gcExecTest) next() gcCycle { + return gcCycle{ + allocRate: e.allocRate.min(minRate)(), + scanRate: e.scanRate.min(minRate)(), + growthRate: e.growthRate.min(minRate)(), + scannableFrac: e.scannableFrac.limit(0, 1)(), + stackBytes: uint64(e.stackBytes.quantize(2048).min(0)()), + } +} + +func (e *gcExecTest) check(t *testing.T, results []gcCycleResult) { + t.Helper() + + // Do some basic general checks first. 
+ n := len(results) + switch n { + case 0: + t.Fatal("no results passed to check") + return + case 1: + if results[0].cycle != 1 { + t.Error("first cycle has incorrect number") + } + default: + if results[n-1].cycle != results[n-2].cycle+1 { + t.Error("cycle numbers out of order") + } + } + if u := results[n-1].gcUtilization; u < 0 || u > 1 { + t.Fatal("GC utilization not within acceptable bounds") + } + if s := results[n-1].heapScannable; s < 0 { + t.Fatal("heapScannable is negative") + } + if e.checker == nil { + t.Fatal("test-specific checker is missing") + } + + // Run the test-specific checker. + e.checker(t, results) +} + +type gcCycle struct { + allocRate float64 + scanRate float64 + growthRate float64 + scannableFrac float64 + stackBytes uint64 +} + +type gcCycleResult struct { + cycle int + + // These come directly from the pacer, so uint64. + heapLive uint64 + heapTrigger uint64 + heapGoal uint64 + heapPeak uint64 + + // These are produced by the simulation, so int64 and + // float64 are more appropriate, so that we can check for + // bad states in the simulation. 
+ heapScannable int64 + gcUtilization float64 +} + +func (r *gcCycleResult) goalRatio() float64 { + return float64(r.heapPeak) / float64(r.heapGoal) +} + +func (r *gcCycleResult) runway() float64 { + return float64(r.heapGoal - r.heapTrigger) +} + +func (r *gcCycleResult) triggerRatio() float64 { + return float64(r.heapTrigger-r.heapLive) / float64(r.heapGoal-r.heapLive) +} + +func (r *gcCycleResult) String() string { + return fmt.Sprintf("%d %2.1f%% %d->%d->%d (goal: %d)", r.cycle, r.gcUtilization*100, r.heapLive, r.heapTrigger, r.heapPeak, r.heapGoal) +} + +func assertInEpsilon(t *testing.T, name string, a, b, epsilon float64) { + t.Helper() + assertInRange(t, name, a, b-epsilon, b+epsilon) +} + +func assertInRange(t *testing.T, name string, a, min, max float64) { + t.Helper() + if a < min || a > max { + t.Errorf("%s not in range (%f, %f): %f", name, min, max, a) + } +} + +// float64Stream is a function that generates an infinite stream of +// float64 values when called repeatedly. +type float64Stream func() float64 + +// constant returns a stream that generates the value c. +func constant(c float64) float64Stream { + return func() float64 { + return c + } +} + +// unit returns a stream that generates a single peak with +// amplitude amp, followed by zeroes. +// +// In another manner of speaking, this is the Kronecker delta. +func unit(amp float64) float64Stream { + dropped := false + return func() float64 { + if dropped { + return 0 + } + dropped = true + return amp + } +} + +// oscillate returns a stream that oscillates sinusoidally +// with the given amplitude, phase, and period. +func oscillate(amp, phase float64, period int) float64Stream { + var cycle int + return func() float64 { + p := float64(cycle)/float64(period)*2*math.Pi + phase + cycle++ + if cycle == period { + cycle = 0 + } + return math.Sin(p) * amp + } +} + +// ramp returns a stream that moves from zero to height +// over the course of length steps. 
+func ramp(height float64, length int) float64Stream { + var cycle int + return func() float64 { + h := height * float64(cycle) / float64(length) + if cycle < length { + cycle++ + } + return h + } +} + +// random returns a stream that generates random numbers +// between -amp and amp. +func random(amp float64, seed int64) float64Stream { + r := rand.New(rand.NewSource(seed)) + return func() float64 { + return ((r.Float64() - 0.5) * 2) * amp + } +} + +// delay returns a new stream which is a buffered version +// of f: it returns zero for cycles steps, followed by f. +func (f float64Stream) delay(cycles int) float64Stream { + zeroes := 0 + return func() float64 { + if zeroes < cycles { + zeroes++ + return 0 + } + return f() + } +} + +// scale returns a new stream that is f, but attenuated by a +// constant factor. +func (f float64Stream) scale(amt float64) float64Stream { + return func() float64 { + return f() * amt + } +} + +// offset returns a new stream that is f but offset by amt +// at each step. +func (f float64Stream) offset(amt float64) float64Stream { + return func() float64 { + old := f() + return old + amt + } +} + +// sum returns a new stream that is the sum of all input streams +// at each step. +func (f float64Stream) sum(fs ...float64Stream) float64Stream { + return func() float64 { + sum := f() + for _, s := range fs { + sum += s() + } + return sum + } +} + +// quantize returns a new stream that rounds f to a multiple +// of mult at each step. +func (f float64Stream) quantize(mult float64) float64Stream { + return func() float64 { + r := f() / mult + if r < 0 { + return math.Ceil(r) * mult + } + return math.Floor(r) * mult + } +} + +// min returns a new stream that replaces all values produced +// by f lower than min with min. +func (f float64Stream) min(min float64) float64Stream { + return func() float64 { + return math.Max(min, f()) + } +} + +// max returns a new stream that replaces all values produced +// by f higher than max with max. 
+func (f float64Stream) max(max float64) float64Stream { + return func() float64 { + return math.Min(max, f()) + } +} + +// limit returns a new stream that replaces all values produced +// by f lower than min with min and higher than max with max. +func (f float64Stream) limit(min, max float64) float64Stream { + return func() float64 { + v := f() + if v < min { + v = min + } else if v > max { + v = max + } + return v + } +} + +func TestIdleMarkWorkerCount(t *testing.T) { + const workers = 10 + c := NewGCController(100, math.MaxInt64) + c.SetMaxIdleMarkWorkers(workers) + for i := 0; i < workers; i++ { + if !c.NeedIdleMarkWorker() { + t.Fatalf("expected to need idle mark workers: i=%d", i) + } + if !c.AddIdleMarkWorker() { + t.Fatalf("expected to be able to add an idle mark worker: i=%d", i) + } + } + if c.NeedIdleMarkWorker() { + t.Fatalf("expected to not need idle mark workers") + } + if c.AddIdleMarkWorker() { + t.Fatalf("expected to not be able to add an idle mark worker") + } + for i := 0; i < workers; i++ { + c.RemoveIdleMarkWorker() + if !c.NeedIdleMarkWorker() { + t.Fatalf("expected to need idle mark workers after removal: i=%d", i) + } + } + for i := 0; i < workers-1; i++ { + if !c.AddIdleMarkWorker() { + t.Fatalf("expected to be able to add idle mark workers after adding again: i=%d", i) + } + } + for i := 0; i < 10; i++ { + if !c.AddIdleMarkWorker() { + t.Fatalf("expected to be able to add idle mark workers interleaved: i=%d", i) + } + if c.AddIdleMarkWorker() { + t.Fatalf("expected to not be able to add idle mark workers interleaved: i=%d", i) + } + c.RemoveIdleMarkWorker() + } + // Support the max being below the count. 
+ c.SetMaxIdleMarkWorkers(0) + if c.NeedIdleMarkWorker() { + t.Fatalf("expected to not need idle mark workers after capacity set to 0") + } + if c.AddIdleMarkWorker() { + t.Fatalf("expected to not be able to add idle mark workers after capacity set to 0") + } + for i := 0; i < workers-1; i++ { + c.RemoveIdleMarkWorker() + } + if c.NeedIdleMarkWorker() { + t.Fatalf("expected to not need idle mark workers after capacity set to 0") + } + if c.AddIdleMarkWorker() { + t.Fatalf("expected to not be able to add idle mark workers after capacity set to 0") + } + c.SetMaxIdleMarkWorkers(1) + if !c.NeedIdleMarkWorker() { + t.Fatalf("expected to need idle mark workers after capacity set to 1") + } + if !c.AddIdleMarkWorker() { + t.Fatalf("expected to be able to add idle mark workers after capacity set to 1") + } +} diff --git a/src/runtime/mgcscavenge.go b/src/runtime/mgcscavenge.go new file mode 100644 index 0000000..e59340e --- /dev/null +++ b/src/runtime/mgcscavenge.go @@ -0,0 +1,1186 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Scavenging free pages. +// +// This file implements scavenging (the release of physical pages backing mapped +// memory) of free and unused pages in the heap as a way to deal with page-level +// fragmentation and reduce the RSS of Go applications. +// +// Scavenging in Go happens on two fronts: there's the background +// (asynchronous) scavenger and the heap-growth (synchronous) scavenger. +// +// The former happens on a goroutine much like the background sweeper which is +// soft-capped at using scavengePercent of the mutator's time, based on +// order-of-magnitude estimates of the costs of scavenging. The background +// scavenger's primary goal is to bring the estimated heap RSS of the +// application down to a goal. +// +// Before we consider what this looks like, we need to split the world into two +// halves. 
One in which a memory limit is not set, and one in which it is. +// +// For the former, the goal is defined as: +// (retainExtraPercent+100) / 100 * (heapGoal / lastHeapGoal) * lastHeapInUse +// +// Essentially, we wish to have the application's RSS track the heap goal, but +// the heap goal is defined in terms of bytes of objects, rather than pages like +// RSS. As a result, we need to account for fragmentation internal to +// spans. heapGoal / lastHeapGoal defines the ratio between the current heap goal +// and the last heap goal, which tells us by how much the heap is growing or +// shrinking. We estimate what the heap will grow to in terms of pages by taking +// this ratio and multiplying it by heapInUse at the end of the last GC, which +// allows us to account for this additional fragmentation. Note that this +// procedure makes the assumption that the degree of fragmentation won't change +// dramatically over the next GC cycle. Overestimating the amount of +// fragmentation simply results in higher memory use, which will be accounted +// for by the next pacing update. Underestimating the fragmentation however +// could lead to performance degradation. Handling this case is not within the +// scope of the scavenger. Situations where the amount of fragmentation balloons +// over the course of a single GC cycle should be considered pathologies, +// flagged as bugs, and fixed appropriately. +// +// An additional factor of retainExtraPercent is added as a buffer to help ensure +// that there's more unscavenged memory to allocate out of, since each allocation +// out of scavenged memory incurs a potentially expensive page fault. +// +// If a memory limit is set, then we wish to pick a scavenge goal that maintains +// that memory limit. For that, we look at total memory that has been committed +// (gcController.mappedReady) and try to bring that down below the limit. In this case, +// we want to give buffer space in the *opposite* direction. 
When the application +// is close to the limit, we want to make sure we push harder to keep it under, so +// if we target below the memory limit, we ensure that the background scavenger is +// giving the situation the urgency it deserves. +// +// In this case, the goal is defined as: +// (100-reduceExtraPercent) / 100 * memoryLimit +// +// We compute both of these goals, and check whether either of them have been met. +// The background scavenger continues operating as long as either one of the goals +// has not been met. +// +// The goals are updated after each GC. +// +// The synchronous heap-growth scavenging happens whenever the heap grows in +// size, for some definition of heap-growth. The intuition behind this is that +// the application had to grow the heap because existing fragments were +// not sufficiently large to satisfy a page-level memory allocation, so we +// scavenge those fragments eagerly to offset the growth in RSS that results. + +package runtime + +import ( + "internal/goos" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +const ( + // The background scavenger is paced according to these parameters. + // + // scavengePercent represents the portion of mutator time we're willing + // to spend on scavenging in percent. + scavengePercent = 1 // 1% + + // retainExtraPercent represents the amount of memory over the heap goal + // that the scavenger should keep as a buffer space for the allocator. + // This constant is used when we do not have a memory limit set. + // + // The purpose of maintaining this overhead is to have a greater pool of + // unscavenged memory available for allocation (since using scavenged memory + // incurs an additional cost), to account for heap fragmentation and + // the ever-changing layout of the heap. + retainExtraPercent = 10 + + // reduceExtraPercent represents the amount of memory under the limit + // that the scavenger should target. For example, 5 means we target 95% + // of the limit. 
+ // + // The purpose of shooting lower than the limit is to ensure that, once + // close to the limit, the scavenger is working hard to maintain it. If + // we have a memory limit set but are far away from it, there's no harm + // in leaving up to 100-retainExtraPercent live, and it's more efficient + // anyway, for the same reasons that retainExtraPercent exists. + reduceExtraPercent = 5 + + // maxPagesPerPhysPage is the maximum number of supported runtime pages per + // physical page, based on maxPhysPageSize. + maxPagesPerPhysPage = maxPhysPageSize / pageSize + + // scavengeCostRatio is the approximate ratio between the costs of using previously + // scavenged memory and scavenging memory. + // + // For most systems the cost of scavenging greatly outweighs the costs + // associated with using scavenged memory, making this constant 0. On other systems + // (especially ones where "sysUsed" is not just a no-op) this cost is non-trivial. + // + // This ratio is used as part of multiplicative factor to help the scavenger account + // for the additional costs of using scavenged memory in its pacing. + scavengeCostRatio = 0.7 * (goos.IsDarwin + goos.IsIos) +) + +// heapRetained returns an estimate of the current heap RSS. +func heapRetained() uint64 { + return gcController.heapInUse.load() + gcController.heapFree.load() +} + +// gcPaceScavenger updates the scavenger's pacing, particularly +// its rate and RSS goal. For this, it requires the current heapGoal, +// and the heapGoal for the previous GC cycle. +// +// The RSS goal is based on the current heap goal with a small overhead +// to accommodate non-determinism in the allocator. +// +// The pacing is based on scavengePageRate, which applies to both regular and +// huge pages. See that constant for more information. +// +// Must be called whenever GC pacing is updated. +// +// mheap_.lock must be held or the world must be stopped. 
+func gcPaceScavenger(memoryLimit int64, heapGoal, lastHeapGoal uint64) { + assertWorldStoppedOrLockHeld(&mheap_.lock) + + // As described at the top of this file, there are two scavenge goals here: one + // for gcPercent and one for memoryLimit. Let's handle the latter first because + // it's simpler. + + // We want to target retaining (100-reduceExtraPercent)% of the memory limit. + memoryLimitGoal := uint64(float64(memoryLimit) * (1 - reduceExtraPercent/100.0)) + + // mappedReady is comparable to memoryLimit, and represents how much total memory + // the Go runtime has committed now (estimated). + mappedReady := gcController.mappedReady.Load() + + // If we're below the goal already, indicate that we don't need the background + // scavenger for the memory limit. This may seem worrisome at first, but note + // that the allocator will assist the background scavenger in the face of a memory + // limit, so we'll be safe even if we stop the scavenger when we shouldn't have. + if mappedReady <= memoryLimitGoal { + scavenge.memoryLimitGoal.Store(^uint64(0)) + } else { + scavenge.memoryLimitGoal.Store(memoryLimitGoal) + } + + // Now handle the gcPercent goal. + + // If we're called before the first GC completed, disable scavenging. + // We never scavenge before the 2nd GC cycle anyway (we don't have enough + // information about the heap yet) so this is fine, and avoids a fault + // or garbage data later. + if lastHeapGoal == 0 { + scavenge.gcPercentGoal.Store(^uint64(0)) + return + } + // Compute our scavenging goal. + goalRatio := float64(heapGoal) / float64(lastHeapGoal) + gcPercentGoal := uint64(float64(memstats.lastHeapInUse) * goalRatio) + // Add retainExtraPercent overhead to gcPercentGoal. This calculation + // looks strange but the purpose is to arrive at an integer division + // (e.g. if retainExtraPercent = 12.5, then we get a divisor of 8) + // that also avoids the overflow from a multiplication. 
+ gcPercentGoal += gcPercentGoal / (1.0 / (retainExtraPercent / 100.0)) + // Align it to a physical page boundary to make the following calculations + // a bit more exact. + gcPercentGoal = (gcPercentGoal + uint64(physPageSize) - 1) &^ (uint64(physPageSize) - 1) + + // Represents where we are now in the heap's contribution to RSS in bytes. + // + // Guaranteed to always be a multiple of physPageSize on systems where + // physPageSize <= pageSize since we map new heap memory at a size larger than + // any physPageSize and released memory in multiples of the physPageSize. + // + // However, certain functions recategorize heap memory as other stats (e.g. + // stacks) and this happens in multiples of pageSize, so on systems + // where physPageSize > pageSize the calculations below will not be exact. + // Generally this is OK since we'll be off by at most one regular + // physical page. + heapRetainedNow := heapRetained() + + // If we're already below our goal, or within one page of our goal, then indicate + // that we don't need the background scavenger for maintaining a memory overhead + // proportional to the heap goal. + if heapRetainedNow <= gcPercentGoal || heapRetainedNow-gcPercentGoal < uint64(physPageSize) { + scavenge.gcPercentGoal.Store(^uint64(0)) + } else { + scavenge.gcPercentGoal.Store(gcPercentGoal) + } +} + +var scavenge struct { + // gcPercentGoal is the amount of retained heap memory (measured by + // heapRetained) that the runtime will try to maintain by returning + // memory to the OS. This goal is derived from gcController.gcPercent + // by choosing to retain enough memory to allocate heap memory up to + // the heap goal. + gcPercentGoal atomic.Uint64 + + // memoryLimitGoal is the amount of memory retained by the runtime ( + // measured by gcController.mappedReady) that the runtime will try to + // maintain by returning memory to the OS. 
This goal is derived from + // gcController.memoryLimit by choosing to target the memory limit or + // some lower target to keep the scavenger working. + memoryLimitGoal atomic.Uint64 + + // assistTime is the time spent by the allocator scavenging in the last GC cycle. + // + // This is reset once a GC cycle ends. + assistTime atomic.Int64 + + // backgroundTime is the time spent by the background scavenger in the last GC cycle. + // + // This is reset once a GC cycle ends. + backgroundTime atomic.Int64 +} + +const ( + // It doesn't really matter what value we start at, but we can't be zero, because + // that'll cause divide-by-zero issues. Pick something conservative which we'll + // also use as a fallback. + startingScavSleepRatio = 0.001 + + // Spend at least 1 ms scavenging, otherwise the corresponding + // sleep time to maintain our desired utilization is too low to + // be reliable. + minScavWorkTime = 1e6 +) + +// Sleep/wait state of the background scavenger. +var scavenger scavengerState + +type scavengerState struct { + // lock protects all fields below. + lock mutex + + // g is the goroutine the scavenger is bound to. + g *g + + // parked is whether or not the scavenger is parked. + parked bool + + // timer is the timer used for the scavenger to sleep. + timer *timer + + // sysmonWake signals to sysmon that it should wake the scavenger. + sysmonWake atomic.Uint32 + + // targetCPUFraction is the target CPU overhead for the scavenger. + targetCPUFraction float64 + + // sleepRatio is the ratio of time spent doing scavenging work to + // time spent sleeping. This is used to decide how long the scavenger + // should sleep for in between batches of work. It is set by + // critSleepController in order to maintain a CPU overhead of + // targetCPUFraction. + // + // Lower means more sleep, higher means more aggressive scavenging. + sleepRatio float64 + + // sleepController controls sleepRatio. + // + // See sleepRatio for more details. 
+ sleepController piController + + // controllerCooldown is the time left in nanoseconds during which we avoid + // using the controller and we hold sleepRatio at a conservative + // value. Used if the controller's assumptions fail to hold. + controllerCooldown int64 + + // printControllerReset instructs printScavTrace to signal that + // the controller was reset. + printControllerReset bool + + // sleepStub is a stub used for testing to avoid actually having + // the scavenger sleep. + // + // Unlike the other stubs, this is not populated if left nil. + // Instead, it is called when non-nil because any valid implementation + // of this function basically requires closing over this scavenger + // state, and allocating a closure is not allowed in the runtime as + // a matter of policy. + sleepStub func(n int64) int64 + + // scavenge is a function that scavenges n bytes of memory. + // Returns how many bytes of memory it actually scavenged, as + // well as the time it took in nanoseconds. Usually mheap.pages.scavenge + // with nanotime called around it, but stubbed out for testing. + // Like mheap.pages.scavenge, if it scavenges less than n bytes of + // memory, the caller may assume the heap is exhausted of scavengable + // memory for now. + // + // If this is nil, it is populated with the real thing in init. + scavenge func(n uintptr) (uintptr, int64) + + // shouldStop is a callback called in the work loop and provides a + // point that can force the scavenger to stop early, for example because + // the scavenge policy dictates too much has been scavenged already. + // + // If this is nil, it is populated with the real thing in init. + shouldStop func() bool + + // gomaxprocs returns the current value of gomaxprocs. Stub for testing. + // + // If this is nil, it is populated with the real thing in init. + gomaxprocs func() int32 +} + +// init initializes a scavenger state and wires to the current G. +// +// Must be called from a regular goroutine that can allocate. 
+func (s *scavengerState) init() { + if s.g != nil { + throw("scavenger state is already wired") + } + lockInit(&s.lock, lockRankScavenge) + s.g = getg() + + s.timer = new(timer) + s.timer.arg = s + s.timer.f = func(s any, _ uintptr) { + s.(*scavengerState).wake() + } + + // input: fraction of CPU time actually used. + // setpoint: ideal CPU fraction. + // output: ratio of time worked to time slept (determines sleep time). + // + // The output of this controller is somewhat indirect to what we actually + // want to achieve: how much time to sleep for. The reason for this definition + // is to ensure that the controller's outputs have a direct relationship with + // its inputs (as opposed to an inverse relationship), making it somewhat + // easier to reason about for tuning purposes. + s.sleepController = piController{ + // Tuned loosely via Ziegler-Nichols process. + kp: 0.3375, + ti: 3.2e6, + tt: 1e9, // 1 second reset time. + + // These ranges seem wide, but we want to give the controller plenty of + // room to hunt for the optimal value. + min: 0.001, // 1:1000 + max: 1000.0, // 1000:1 + } + s.sleepRatio = startingScavSleepRatio + + // Install real functions if stubs aren't present. + if s.scavenge == nil { + s.scavenge = func(n uintptr) (uintptr, int64) { + start := nanotime() + r := mheap_.pages.scavenge(n, nil) + end := nanotime() + if start >= end { + return r, 0 + } + scavenge.backgroundTime.Add(end - start) + return r, end - start + } + } + if s.shouldStop == nil { + s.shouldStop = func() bool { + // If background scavenging is disabled or if there's no work to do just stop. + return heapRetained() <= scavenge.gcPercentGoal.Load() && + (!go119MemoryLimitSupport || + gcController.mappedReady.Load() <= scavenge.memoryLimitGoal.Load()) + } + } + if s.gomaxprocs == nil { + s.gomaxprocs = func() int32 { + return gomaxprocs + } + } +} + +// park parks the scavenger goroutine. 
+func (s *scavengerState) park() { + lock(&s.lock) + if getg() != s.g { + throw("tried to park scavenger from another goroutine") + } + s.parked = true + goparkunlock(&s.lock, waitReasonGCScavengeWait, traceEvGoBlock, 2) +} + +// ready signals to sysmon that the scavenger should be awoken. +func (s *scavengerState) ready() { + s.sysmonWake.Store(1) +} + +// wake immediately unparks the scavenger if necessary. +// +// Safe to run without a P. +func (s *scavengerState) wake() { + lock(&s.lock) + if s.parked { + // Unset sysmonWake, since the scavenger is now being awoken. + s.sysmonWake.Store(0) + + // s.parked is unset to prevent a double wake-up. + s.parked = false + + // Ready the goroutine by injecting it. We use injectglist instead + // of ready or goready in order to allow us to run this function + // without a P. injectglist also avoids placing the goroutine in + // the current P's runnext slot, which is desirable to prevent + // the scavenger from interfering with user goroutine scheduling + // too much. + var list gList + list.push(s.g) + injectglist(&list) + } + unlock(&s.lock) +} + +// sleep puts the scavenger to sleep based on the amount of time that it worked +// in nanoseconds. +// +// Note that this function should only be called by the scavenger. +// +// The scavenger may be woken up earlier by a pacing change, and it may not go +// to sleep at all if there's a pending pacing change. +func (s *scavengerState) sleep(worked float64) { + lock(&s.lock) + if getg() != s.g { + throw("tried to sleep scavenger from another goroutine") + } + + if worked < minScavWorkTime { + // This means there wasn't enough work to actually fill up minScavWorkTime. + // That's fine; we shouldn't try to do anything with this information + // because it's going to result in a short enough sleep request that things + // will get messy. Just assume we did at least this much work. + // All this means is that we'll sleep longer than we otherwise would have. 
+ worked = minScavWorkTime + } + + // Multiply the critical time by 1 + the ratio of the costs of using + // scavenged memory vs. scavenging memory. This forces us to pay down + // the cost of reusing this memory eagerly by sleeping for a longer period + // of time and scavenging less frequently. More concretely, we avoid situations + // where we end up scavenging so often that we hurt allocation performance + // because of the additional overheads of using scavenged memory. + worked *= 1 + scavengeCostRatio + + // sleepTime is the amount of time we're going to sleep, based on the amount + // of time we worked, and the sleepRatio. + sleepTime := int64(worked / s.sleepRatio) + + var slept int64 + if s.sleepStub == nil { + // Set the timer. + // + // This must happen here instead of inside gopark + // because we can't close over any variables without + // failing escape analysis. + start := nanotime() + resetTimer(s.timer, start+sleepTime) + + // Mark ourselves as asleep and go to sleep. + s.parked = true + goparkunlock(&s.lock, waitReasonSleep, traceEvGoSleep, 2) + + // How long we actually slept for. + slept = nanotime() - start + + lock(&s.lock) + // Stop the timer here because s.wake is unable to do it for us. + // We don't really care if we succeed in stopping the timer. One + // reason we might fail is that we've already woken up, but the timer + // might be in the process of firing on some other P; essentially we're + // racing with it. That's totally OK. Double wake-ups are perfectly safe. + stopTimer(s.timer) + unlock(&s.lock) + } else { + unlock(&s.lock) + slept = s.sleepStub(sleepTime) + } + + // Stop here if we're cooling down from the controller. + if s.controllerCooldown > 0 { + // worked and slept aren't exact measures of time, but it's OK to be a bit + // sloppy here. We're just hoping we're avoiding some transient bad behavior. 
+ t := slept + int64(worked) + if t > s.controllerCooldown { + s.controllerCooldown = 0 + } else { + s.controllerCooldown -= t + } + return + } + + // idealFraction is the ideal % of overall application CPU time that we + // spend scavenging. + idealFraction := float64(scavengePercent) / 100.0 + + // Calculate the CPU time spent. + // + // This may be slightly inaccurate with respect to GOMAXPROCS, but we're + // recomputing this often enough relative to GOMAXPROCS changes in general + // (it only changes when the world is stopped, and not during a GC) that + // that small inaccuracy is in the noise. + cpuFraction := worked / ((float64(slept) + worked) * float64(s.gomaxprocs())) + + // Update the critSleepRatio, adjusting until we reach our ideal fraction. + var ok bool + s.sleepRatio, ok = s.sleepController.next(cpuFraction, idealFraction, float64(slept)+worked) + if !ok { + // The core assumption of the controller, that we can get a proportional + // response, broke down. This may be transient, so temporarily switch to + // sleeping a fixed, conservative amount. + s.sleepRatio = startingScavSleepRatio + s.controllerCooldown = 5e9 // 5 seconds. + + // Signal the scav trace printer to output this. + s.controllerFailed() + } +} + +// controllerFailed indicates that the scavenger's scheduling +// controller failed. +func (s *scavengerState) controllerFailed() { + lock(&s.lock) + s.printControllerReset = true + unlock(&s.lock) +} + +// run is the body of the main scavenging loop. +// +// Returns the number of bytes released and the estimated time spent +// releasing those bytes. +// +// Must be run on the scavenger goroutine. +func (s *scavengerState) run() (released uintptr, worked float64) { + lock(&s.lock) + if getg() != s.g { + throw("tried to run scavenger from another goroutine") + } + unlock(&s.lock) + + for worked < minScavWorkTime { + // If something from outside tells us to stop early, stop. 
+ if s.shouldStop() { + break + } + + // scavengeQuantum is the amount of memory we try to scavenge + // in one go. A smaller value means the scavenger is more responsive + // to the scheduler in case of e.g. preemption. A larger value means + // that the overheads of scavenging are better amortized, so better + // scavenging throughput. + // + // The current value is chosen assuming a cost of ~10µs/physical page + // (this is somewhat pessimistic), which implies a worst-case latency of + // about 160µs for 4 KiB physical pages. The current value is biased + // toward latency over throughput. + const scavengeQuantum = 64 << 10 + + // Accumulate the amount of time spent scavenging. + r, duration := s.scavenge(scavengeQuantum) + + // On some platforms we may see end >= start if the time it takes to scavenge + // memory is less than the minimum granularity of its clock (e.g. Windows) or + // due to clock bugs. + // + // In this case, just assume scavenging takes 10 µs per regular physical page + // (determined empirically), and conservatively ignore the impact of huge pages + // on timing. + const approxWorkedNSPerPhysicalPage = 10e3 + if duration == 0 { + worked += approxWorkedNSPerPhysicalPage * float64(r/physPageSize) + } else { + // TODO(mknyszek): If duration is small compared to worked, it could be + // rounded down to zero. Probably not a problem in practice because the + // values are all within a few orders of magnitude of each other but maybe + // worth worrying about. + worked += float64(duration) + } + released += r + + // scavenge does not return until it either finds the requisite amount of + // memory to scavenge, or exhausts the heap. If we haven't found enough + // to scavenge, then the heap must be exhausted. + if r < scavengeQuantum { + break + } + // When using fake time just do one loop. 
+ if faketime != 0 { + break + } + } + if released > 0 && released < physPageSize { + // If this happens, it means that we may have attempted to release part + // of a physical page, but the likely effect of that is that it released + // the whole physical page, some of which may have still been in-use. + // This could lead to memory corruption. Throw. + throw("released less than one physical page of memory") + } + return +} + +// Background scavenger. +// +// The background scavenger maintains the RSS of the application below +// the line described by the proportional scavenging statistics in +// the mheap struct. +func bgscavenge(c chan int) { + scavenger.init() + + c <- 1 + scavenger.park() + + for { + released, workTime := scavenger.run() + if released == 0 { + scavenger.park() + continue + } + atomic.Xadduintptr(&mheap_.pages.scav.released, released) + scavenger.sleep(workTime) + } +} + +// scavenge scavenges nbytes worth of free pages, starting with the +// highest address first. Successive calls continue from where it left +// off until the heap is exhausted. Call scavengeStartGen to bring it +// back to the top of the heap. +// +// Returns the amount of memory scavenged in bytes. +// +// scavenge always tries to scavenge nbytes worth of memory, and will +// only fail to do so if the heap is exhausted for now. +func (p *pageAlloc) scavenge(nbytes uintptr, shouldStop func() bool) uintptr { + released := uintptr(0) + for released < nbytes { + ci, pageIdx := p.scav.index.find() + if ci == 0 { + break + } + systemstack(func() { + released += p.scavengeOne(ci, pageIdx, nbytes-released) + }) + if shouldStop != nil && shouldStop() { + break + } + } + return released +} + +// printScavTrace prints a scavenge trace line to standard error. +// +// released should be the amount of memory released since the last time this +// was called, and forced indicates whether the scavenge was forced by the +// application. +// +// scavenger.lock must be held. 
+func printScavTrace(released uintptr, forced bool) { + assertLockHeld(&scavenger.lock) + + printlock() + print("scav ", + released>>10, " KiB work, ", + gcController.heapReleased.load()>>10, " KiB total, ", + (gcController.heapInUse.load()*100)/heapRetained(), "% util", + ) + if forced { + print(" (forced)") + } else if scavenger.printControllerReset { + print(" [controller reset]") + scavenger.printControllerReset = false + } + println() + printunlock() +} + +// scavengeOne walks over the chunk at chunk index ci and searches for +// a contiguous run of pages to scavenge. It will try to scavenge +// at most max bytes at once, but may scavenge more to avoid +// breaking huge pages. Once it scavenges some memory it returns +// how much it scavenged in bytes. +// +// searchIdx is the page index to start searching from in ci. +// +// Returns the number of bytes scavenged. +// +// Must run on the systemstack because it acquires p.mheapLock. +// +//go:systemstack +func (p *pageAlloc) scavengeOne(ci chunkIdx, searchIdx uint, max uintptr) uintptr { + // Calculate the maximum number of pages to scavenge. + // + // This should be alignUp(max, pageSize) / pageSize but max can and will + // be ^uintptr(0), so we need to be very careful not to overflow here. + // Rather than use alignUp, calculate the number of pages rounded down + // first, then add back one if necessary. + maxPages := max / pageSize + if max%pageSize != 0 { + maxPages++ + } + + // Calculate the minimum number of pages we can scavenge. + // + // Because we can only scavenge whole physical pages, we must + // ensure that we scavenge at least minPages each time, aligned + // to minPages*pageSize. + minPages := physPageSize / pageSize + if minPages < 1 { + minPages = 1 + } + + lock(p.mheapLock) + if p.summary[len(p.summary)-1][ci].max() >= uint(minPages) { + // We only bother looking for a candidate if there are at least + // minPages free pages at all.
+ base, npages := p.chunkOf(ci).findScavengeCandidate(searchIdx, minPages, maxPages) + + // If we found something, scavenge it and return! + if npages != 0 { + // Compute the full address for the start of the range. + addr := chunkBase(ci) + uintptr(base)*pageSize + + // Mark the range we're about to scavenge as allocated, because + // we don't want any allocating goroutines to grab it while + // the scavenging is in progress. + if scav := p.allocRange(addr, uintptr(npages)); scav != 0 { + throw("double scavenge") + } + + // With that done, it's safe to unlock. + unlock(p.mheapLock) + + if !p.test { + pageTraceScav(getg().m.p.ptr(), 0, addr, uintptr(npages)) + + // Only perform the actual scavenging if we're not in a test. + // It's dangerous to do so otherwise. + sysUnused(unsafe.Pointer(addr), uintptr(npages)*pageSize) + + // Update global accounting only when not in test, otherwise + // the runtime's accounting will be wrong. + nbytes := int64(npages) * pageSize + gcController.heapReleased.add(nbytes) + gcController.heapFree.add(-nbytes) + + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.committed, -nbytes) + atomic.Xaddint64(&stats.released, nbytes) + memstats.heapStats.release() + } + + // Relock the heap, because now we need to make these pages + // available for allocation. Free them back to the page allocator. + lock(p.mheapLock) + p.free(addr, uintptr(npages), true) + + // Mark the range as scavenged. + p.chunkOf(ci).scavenged.setRange(base, npages) + unlock(p.mheapLock) + + return uintptr(npages) * pageSize + } + } + // Mark this chunk as having no free pages. + p.scav.index.clear(ci) + unlock(p.mheapLock) + + return 0 +} + +// fillAligned returns x but with all zeroes in m-aligned +// groups of m bits set to 1 if any bit in the group is non-zero. +// +// For example, fillAligned(0x0100a3, 8) == 0xff00ff. +// +// Note that if m == 1, this is a no-op. +// +// m must be a power of 2 <= maxPagesPerPhysPage.
+func fillAligned(x uint64, m uint) uint64 { + apply := func(x uint64, c uint64) uint64 { + // The technique used here is derived from + // https://graphics.stanford.edu/~seander/bithacks.html#ZeroInWord + // and extended for more than just bytes (like nibbles + // and uint16s) by using an appropriate constant. + // + // To summarize the technique, quoting from that page: + // "[It] works by first zeroing the high bits of the [8] + // bytes in the word. Subsequently, it adds a number that + // will result in an overflow to the high bit of a byte if + // any of the low bits were initially set. Next the high + // bits of the original word are ORed with these values; + // thus, the high bit of a byte is set iff any bit in the + // byte was set. Finally, we determine if any of these high + // bits are zero by ORing with ones everywhere except the + // high bits and inverting the result." + return ^((((x & c) + c) | x) | c) + } + // Transform x to contain a 1 bit at the top of each m-aligned + // group of m zero bits. + switch m { + case 1: + return x + case 2: + x = apply(x, 0x5555555555555555) + case 4: + x = apply(x, 0x7777777777777777) + case 8: + x = apply(x, 0x7f7f7f7f7f7f7f7f) + case 16: + x = apply(x, 0x7fff7fff7fff7fff) + case 32: + x = apply(x, 0x7fffffff7fffffff) + case 64: // == maxPagesPerPhysPage + x = apply(x, 0x7fffffffffffffff) + default: + throw("bad m value") + } + // Now, the top bit of each m-aligned group in x is set + // if that group was all zero in the original x. + + // From each group of m bits subtract 1. + // Because we know only the top bits of each + // m-aligned group are set, we know this will + // set each group to have all the bits set except + // the top bit, so just OR with the original + // result to set all the bits. + return ^((x - (x >> (m - 1))) | x) +} + +// findScavengeCandidate returns a start index and a size for this pallocData +// segment which represents a contiguous region of free and unscavenged memory.
+// + // searchIdx indicates the page index within this chunk to start the search, but +// note that findScavengeCandidate searches backwards through the pallocData. As a +// result, it will return the highest scavenge candidate in address order. +// +// min indicates a hard minimum size and alignment for runs of pages. That is, +// findScavengeCandidate will not return a region smaller than min pages in size, +// or that is min pages or greater in size but not aligned to min. min must be +// a non-zero power of 2 <= maxPagesPerPhysPage. +// +// max is a hint for how big of a region is desired. If max >= pallocChunkPages, then +// findScavengeCandidate effectively returns entire free and unscavenged regions. +// If max < pallocChunkPages, it may truncate the returned region such that size is +// max. However, findScavengeCandidate may still return a larger region if, for +// example, it chooses to preserve huge pages, or if max is not aligned to min (it +// will round up). That is, even if max is small, the returned size is not guaranteed +// to be equal to max. max is allowed to be less than min, in which case it is as if +// max == min. +func (m *pallocData) findScavengeCandidate(searchIdx uint, min, max uintptr) (uint, uint) { + if min&(min-1) != 0 || min == 0 { + print("runtime: min = ", min, "\n") + throw("min must be a non-zero power of 2") + } else if min > maxPagesPerPhysPage { + print("runtime: min = ", min, "\n") + throw("min too large") + } + // max may not be min-aligned, so we might accidentally truncate to + // a max value which causes us to return a non-min-aligned value. + // To prevent this, align max up to a multiple of min (which is always + // a power of 2). This also prevents max from ever being less than + // min, unless it's zero, so handle that explicitly. + if max == 0 { + max = min + } else { + max = alignUp(max, min) + } + + i := int(searchIdx / 64) + // Start by quickly skipping over blocks of non-free or scavenged pages.
+ for ; i >= 0; i-- { + // 1s are scavenged OR non-free => 0s are unscavenged AND free + x := fillAligned(m.scavenged[i]|m.pallocBits[i], uint(min)) + if x != ^uint64(0) { + break + } + } + if i < 0 { + // Failed to find any free/unscavenged pages. + return 0, 0 + } + // We have something in the 64-bit chunk at i, but it could + // extend further. Loop until we find the extent of it. + + // 1s are scavenged OR non-free => 0s are unscavenged AND free + x := fillAligned(m.scavenged[i]|m.pallocBits[i], uint(min)) + z1 := uint(sys.LeadingZeros64(^x)) + run, end := uint(0), uint(i)*64+(64-z1) + if x<<z1 != 0 { + // After shifting out z1 bits, we still have 1s, + // so the run ends inside this word. + run = uint(sys.LeadingZeros64(x << z1)) + } else { + // After shifting out z1 bits, we have no more 1s. + // This means the run extends to the bottom of the + // word so it may extend into further words. + run = 64 - z1 + for j := i - 1; j >= 0; j-- { + x := fillAligned(m.scavenged[j]|m.pallocBits[j], uint(min)) + run += uint(sys.LeadingZeros64(x)) + if x != 0 { + // The run stopped in this word. + break + } + } + } + + // Split the run we found if it's larger than max but hold on to + // our original length, since we may need it later. + size := run + if size > uint(max) { + size = uint(max) + } + start := end - size + + // Each huge page is guaranteed to fit in a single palloc chunk. + // + // TODO(mknyszek): Support larger huge page sizes. + // TODO(mknyszek): Consider taking pages-per-huge-page as a parameter + // so we can write tests for this. + if physHugePageSize > pageSize && physHugePageSize > physPageSize { + // We have huge pages, so let's ensure we don't break one by scavenging + // over a huge page boundary. If the range [start, start+size) overlaps with + // a free-and-unscavenged huge page, we want to grow the region we scavenge + // to include that huge page. + + // Compute the huge page boundary above our candidate. 
+ pagesPerHugePage := uintptr(physHugePageSize / pageSize) + hugePageAbove := uint(alignUp(uintptr(start), pagesPerHugePage)) + + // If that boundary is within our current candidate, then we may be breaking + // a huge page. + if hugePageAbove <= end { + // Compute the huge page boundary below our candidate. + hugePageBelow := uint(alignDown(uintptr(start), pagesPerHugePage)) + + if hugePageBelow >= end-run { + // We're in danger of breaking apart a huge page since start+size crosses + // a huge page boundary and rounding down start to the nearest huge + // page boundary is included in the full run we found. Include the entire + // huge page in the bound by rounding down to the huge page size. + size = size + (start - hugePageBelow) + start = hugePageBelow + } + } + } + return start, size +} + +// scavengeIndex is a structure for efficiently managing which pageAlloc chunks have +// memory available to scavenge. +type scavengeIndex struct { + // chunks is a bitmap representing the entire address space. Each bit represents + // a single chunk, and a 1 value indicates the presence of pages available for + // scavenging. Updates to the bitmap are serialized by the pageAlloc lock. + // + // The underlying storage of chunks is platform dependent and may not even be + // totally mapped read/write. min and max reflect the extent that is safe to access. + // min is inclusive, max is exclusive. + // + // searchAddr is the maximum address (in the offset address space, so we have a linear + // view of the address space; see mranges.go:offAddr) containing memory available to + // scavenge. It is a hint to the find operation to avoid O(n^2) behavior in repeated lookups. + // + // searchAddr is always inclusive and should be the base address of the highest runtime + // page available for scavenging. + // + // searchAddr is managed by both find and mark. + // + // Normally, find monotonically decreases searchAddr as it finds no more free pages to + // scavenge. 
However, mark, when marking a new chunk at an index greater than the current + // searchAddr, sets searchAddr to the *negative* index into chunks of that page. The trick here + // is that concurrent calls to find will fail to monotonically decrease searchAddr, and so they + // won't barge over new memory becoming available to scavenge. Furthermore, this ensures + // that some future caller of find *must* observe the new high index. That caller + // (or any other racing with it), then makes searchAddr positive before continuing, bringing + // us back to our monotonically decreasing steady-state. + // + // A pageAlloc lock serializes updates between min, max, and searchAddr, so abs(searchAddr) + // is always guaranteed to be >= min and < max (converted to heap addresses). + // + // TODO(mknyszek): Ideally we would use something bigger than a uint8 for faster + // iteration like uint32, but we lack the bit twiddling intrinsics. We'd need to either + // copy them from math/bits or fix the fact that we can't import math/bits' code from + // the runtime due to compiler instrumentation. + searchAddr atomicOffAddr + chunks []atomic.Uint8 + minHeapIdx atomic.Int32 + min, max atomic.Int32 +} + +// find returns the highest chunk index that may contain pages available to scavenge. +// It also returns an offset to start searching in the highest chunk. +func (s *scavengeIndex) find() (chunkIdx, uint) { + searchAddr, marked := s.searchAddr.Load() + if searchAddr == minOffAddr.addr() { + // We got a cleared search addr. + return 0, 0 + } + + // Starting from searchAddr's chunk, and moving down to minHeapIdx, + // iterate until we find a chunk with pages to scavenge. + min := s.minHeapIdx.Load() + searchChunk := chunkIndex(uintptr(searchAddr)) + start := int32(searchChunk / 8) + for i := start; i >= min; i-- { + // Skip over irrelevant address space. 
+ chunks := s.chunks[i].Load() + if chunks == 0 { + continue + } + // Note that we can't have 8 leading zeroes here because + // we necessarily skipped that case. So, what's left is + // an index. If there are no zeroes, we want the 7th + // index, if 1 zero, the 6th, and so on. + n := 7 - sys.LeadingZeros8(chunks) + ci := chunkIdx(uint(i)*8 + uint(n)) + if searchChunk == ci { + return ci, chunkPageIndex(uintptr(searchAddr)) + } + // Try to reduce searchAddr to newSearchAddr. + newSearchAddr := chunkBase(ci) + pallocChunkBytes - pageSize + if marked { + // Attempt to be the first one to decrease the searchAddr + // after an increase. If we fail, that means there was another + // increase, or somebody else got to it before us. Either way, + // it doesn't matter. We may lose some performance having an + // incorrect search address, but it's far more important that + // we don't miss updates. + s.searchAddr.StoreUnmark(searchAddr, newSearchAddr) + } else { + // Decrease searchAddr. + s.searchAddr.StoreMin(newSearchAddr) + } + return ci, pallocChunkPages - 1 + } + // Clear searchAddr, because we've exhausted the heap. + s.searchAddr.Clear() + return 0, 0 +} + +// mark sets the inclusive range of chunks between indices start and end as +// containing pages available to scavenge. +// +// Must be serialized with other mark, markRange, and clear calls. +func (s *scavengeIndex) mark(base, limit uintptr) { + start, end := chunkIndex(base), chunkIndex(limit-pageSize) + if start == end { + // Within a chunk. + mask := uint8(1 << (start % 8)) + s.chunks[start/8].Or(mask) + } else if start/8 == end/8 { + // Within the same byte in the index. + mask := uint8(uint16(1<<(end-start+1))-1) << (start % 8) + s.chunks[start/8].Or(mask) + } else { + // Crosses multiple bytes in the index. + startAligned := chunkIdx(alignUp(uintptr(start), 8)) + endAligned := chunkIdx(alignDown(uintptr(end), 8)) + + // Do the end of the first byte first. 
+ if width := startAligned - start; width > 0 { + mask := uint8(uint16(1<<width)-1) << (start % 8) + s.chunks[start/8].Or(mask) + } + // Do the middle aligned sections that take up a whole + // byte. + for ci := startAligned; ci < endAligned; ci += 8 { + s.chunks[ci/8].Store(^uint8(0)) + } + // Do the end of the last byte. + // + // This width check doesn't match the one above + // for start because aligning down into the endAligned + // block means we always have at least one chunk in this + // block (note that end is *inclusive*). This also means + // that if end == endAligned+n, then what we really want + // is to fill n+1 chunks, i.e. width n+1. By induction, + // this is true for all n. + if width := end - endAligned + 1; width > 0 { + mask := uint8(uint16(1<<width) - 1) + s.chunks[end/8].Or(mask) + } + } + newSearchAddr := limit - pageSize + searchAddr, _ := s.searchAddr.Load() + // N.B. Because mark is serialized, it's not necessary to do a + // full CAS here. mark only ever increases searchAddr, while + // find only ever decreases it. Since we only ever race with + // decreases, even if the value we loaded is stale, the actual + // value will never be larger. + if (offAddr{searchAddr}).lessThan(offAddr{newSearchAddr}) { + s.searchAddr.StoreMarked(newSearchAddr) + } +} + +// clear sets the chunk at index ci as not containing pages available to scavenge. +// +// Must be serialized with other mark, markRange, and clear calls. +func (s *scavengeIndex) clear(ci chunkIdx) { + s.chunks[ci/8].And(^uint8(1 << (ci % 8))) +} + +type piController struct { + kp float64 // Proportional constant. + ti float64 // Integral time constant. + tt float64 // Reset time. + + min, max float64 // Output boundaries. + + // PI controller state. + + errIntegral float64 // Integral of the error from t=0 to now. + + // Error flags. + errOverflow bool // Set if errIntegral ever overflowed. + inputOverflow bool // Set if an operation with the input overflowed. 
+} + +// next provides a new sample to the controller. +// +// input is the sample, setpoint is the desired point, and period is how much +// time (in whatever unit makes the most sense) has passed since the last sample. +// +// Returns a new value for the variable it's controlling, and whether the operation +// completed successfully. One reason this might fail is if error has been growing +// in an unbounded manner, to the point of overflow. +// +// In the specific case where an error overflow occurs, the errOverflow field will be +// set and the rest of the controller's internal state will be fully reset. +func (c *piController) next(input, setpoint, period float64) (float64, bool) { + // Compute the raw output value. + prop := c.kp * (setpoint - input) + rawOutput := prop + c.errIntegral + + // Clamp rawOutput into output. + output := rawOutput + if isInf(output) || isNaN(output) { + // The input had a large enough magnitude that either it was already + // overflowed, or some operation with it overflowed. + // Set a flag and reset. That's the safest thing to do. + c.reset() + c.inputOverflow = true + return c.min, false + } + if output < c.min { + output = c.min + } else if output > c.max { + output = c.max + } + + // Update the controller's state. + if c.ti != 0 && c.tt != 0 { + c.errIntegral += (c.kp*period/c.ti)*(setpoint-input) + (period/c.tt)*(output-rawOutput) + if isInf(c.errIntegral) || isNaN(c.errIntegral) { + // So much error has accumulated that we managed to overflow. + // The assumptions around the controller have likely broken down. + // Set a flag and reset. That's the safest thing to do. + c.reset() + c.errOverflow = true + return c.min, false + } + } + return output, true +} + +// reset resets the controller state, except for controller error flags.
+func (c *piController) reset() { + c.errIntegral = 0 +} diff --git a/src/runtime/mgcscavenge_test.go b/src/runtime/mgcscavenge_test.go new file mode 100644 index 0000000..c436ff0 --- /dev/null +++ b/src/runtime/mgcscavenge_test.go @@ -0,0 +1,755 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "internal/goos" + "math" + "math/rand" + . "runtime" + "runtime/internal/atomic" + "testing" + "time" +) + +// makePallocData produces an initialized PallocData by setting +// the ranges described in alloc and scavenged. +func makePallocData(alloc, scavenged []BitRange) *PallocData { + b := new(PallocData) + for _, v := range alloc { + if v.N == 0 { + // Skip N==0. It's harmless and allocRange doesn't + // handle this case. + continue + } + b.AllocRange(v.I, v.N) + } + for _, v := range scavenged { + if v.N == 0 { + // See the previous loop. + continue + } + b.ScavengedSetRange(v.I, v.N) + } + return b +} + +func TestFillAligned(t *testing.T) { + fillAlignedSlow := func(x uint64, m uint) uint64 { + if m == 1 { + return x + } + out := uint64(0) + for i := uint(0); i < 64; i += m { + for j := uint(0); j < m; j++ { + if x&(uint64(1)<<(i+j)) != 0 { + out |= ((uint64(1) << m) - 1) << i + break + } + } + } + return out + } + check := func(x uint64, m uint) { + want := fillAlignedSlow(x, m) + if got := FillAligned(x, m); got != want { + t.Logf("got: %064b", got) + t.Logf("want: %064b", want) + t.Errorf("bad fillAligned(%016x, %d)", x, m) + } + } + for m := uint(1); m <= 64; m *= 2 { + tests := []uint64{ + 0x0000000000000000, + 0x00000000ffffffff, + 0xffffffff00000000, + 0x8000000000000001, + 0xf00000000000000f, + 0xf00000010050000f, + 0xffffffffffffffff, + 0x0000000000000001, + 0x0000000000000002, + 0x0000000000000008, + uint64(1) << (m - 1), + uint64(1) << m, + // Try a few fixed arbitrary examples.
+ 0xb02b9effcf137016, + 0x3975a076a9fbff18, + 0x0f8c88ec3b81506e, + 0x60f14d80ef2fa0e6, + } + for _, test := range tests { + check(test, m) + } + for i := 0; i < 1000; i++ { + // Try a pseudo-random number. + check(rand.Uint64(), m) + + if m > 1 { + // For m != 1, let's construct a slightly more interesting + // random test. Generate a bitmap which is either 0 or + // randomly set bits for each m-aligned group of m bits. + val := uint64(0) + for n := uint(0); n < 64; n += m { + // For each group of m bits, flip a coin: + // * Leave them as zero. + // * Set them randomly. + if rand.Uint64()%2 == 0 { + val |= (rand.Uint64() & ((1 << m) - 1)) << n + } + } + check(val, m) + } + } + } +} + +func TestPallocDataFindScavengeCandidate(t *testing.T) { + type test struct { + alloc, scavenged []BitRange + min, max uintptr + want BitRange + } + tests := map[string]test{ + "MixedMin1": { + alloc: []BitRange{{0, 40}, {42, PallocChunkPages - 42}}, + scavenged: []BitRange{{0, 41}, {42, PallocChunkPages - 42}}, + min: 1, + max: PallocChunkPages, + want: BitRange{41, 1}, + }, + "MultiMin1": { + alloc: []BitRange{{0, 63}, {65, 20}, {87, PallocChunkPages - 87}}, + scavenged: []BitRange{{86, 1}}, + min: 1, + max: PallocChunkPages, + want: BitRange{85, 1}, + }, + } + // Try out different page minimums.
+ for m := uintptr(1); m <= 64; m *= 2 { + suffix := fmt.Sprintf("Min%d", m) + tests["AllFree"+suffix] = test{ + min: m, + max: PallocChunkPages, + want: BitRange{0, PallocChunkPages}, + } + tests["AllScavenged"+suffix] = test{ + scavenged: []BitRange{{0, PallocChunkPages}}, + min: m, + max: PallocChunkPages, + want: BitRange{0, 0}, + } + tests["NoneFree"+suffix] = test{ + alloc: []BitRange{{0, PallocChunkPages}}, + scavenged: []BitRange{{PallocChunkPages / 2, PallocChunkPages / 2}}, + min: m, + max: PallocChunkPages, + want: BitRange{0, 0}, + } + tests["StartFree"+suffix] = test{ + alloc: []BitRange{{uint(m), PallocChunkPages - uint(m)}}, + min: m, + max: PallocChunkPages, + want: BitRange{0, uint(m)}, + } + tests["EndFree"+suffix] = test{ + alloc: []BitRange{{0, PallocChunkPages - uint(m)}}, + min: m, + max: PallocChunkPages, + want: BitRange{PallocChunkPages - uint(m), uint(m)}, + } + tests["Straddle64"+suffix] = test{ + alloc: []BitRange{{0, 64 - uint(m)}, {64 + uint(m), PallocChunkPages - (64 + uint(m))}}, + min: m, + max: 2 * m, + want: BitRange{64 - uint(m), 2 * uint(m)}, + } + tests["BottomEdge64WithFull"+suffix] = test{ + alloc: []BitRange{{64, 64}, {128 + 3*uint(m), PallocChunkPages - (128 + 3*uint(m))}}, + scavenged: []BitRange{{1, 10}}, + min: m, + max: 3 * m, + want: BitRange{128, 3 * uint(m)}, + } + tests["BottomEdge64WithPocket"+suffix] = test{ + alloc: []BitRange{{64, 62}, {127, 1}, {128 + 3*uint(m), PallocChunkPages - (128 + 3*uint(m))}}, + scavenged: []BitRange{{1, 10}}, + min: m, + max: 3 * m, + want: BitRange{128, 3 * uint(m)}, + } + tests["Max0"+suffix] = test{ + scavenged: []BitRange{{0, PallocChunkPages - uint(m)}}, + min: m, + max: 0, + want: BitRange{PallocChunkPages - uint(m), uint(m)}, + } + if m <= 8 { + tests["OneFree"] = test{ + alloc: []BitRange{{0, 40}, {40 + uint(m), PallocChunkPages - (40 + uint(m))}}, + min: m, + max: PallocChunkPages, + want: BitRange{40, uint(m)}, + } + tests["OneScavenged"] = test{ + alloc: []BitRange{{0, 40}, 
{40 + uint(m), PallocChunkPages - (40 + uint(m))}}, + scavenged: []BitRange{{40, 1}}, + min: m, + max: PallocChunkPages, + want: BitRange{0, 0}, + } + } + if m > 1 { + tests["MaxUnaligned"+suffix] = test{ + scavenged: []BitRange{{0, PallocChunkPages - uint(m*2-1)}}, + min: m, + max: m - 2, + want: BitRange{PallocChunkPages - uint(m), uint(m)}, + } + tests["SkipSmall"+suffix] = test{ + alloc: []BitRange{{0, 64 - uint(m)}, {64, 5}, {70, 11}, {82, PallocChunkPages - 82}}, + min: m, + max: m, + want: BitRange{64 - uint(m), uint(m)}, + } + tests["SkipMisaligned"+suffix] = test{ + alloc: []BitRange{{0, 64 - uint(m)}, {64, 63}, {127 + uint(m), PallocChunkPages - (127 + uint(m))}}, + min: m, + max: m, + want: BitRange{64 - uint(m), uint(m)}, + } + tests["MaxLessThan"+suffix] = test{ + scavenged: []BitRange{{0, PallocChunkPages - uint(m)}}, + min: m, + max: 1, + want: BitRange{PallocChunkPages - uint(m), uint(m)}, + } + } + } + if PhysHugePageSize > uintptr(PageSize) { + // Check hugepage preserving behavior. + bits := uint(PhysHugePageSize / uintptr(PageSize)) + if bits < PallocChunkPages { + tests["PreserveHugePageBottom"] = test{ + alloc: []BitRange{{bits + 2, PallocChunkPages - (bits + 2)}}, + min: 1, + max: 3, // Make it so that max would have us try to break the huge page. + want: BitRange{0, bits + 2}, + } + if 3*bits < PallocChunkPages { + // We need at least 3 huge pages in a chunk for this test to make sense. + tests["PreserveHugePageMiddle"] = test{ + alloc: []BitRange{{0, bits - 10}, {2*bits + 10, PallocChunkPages - (2*bits + 10)}}, + min: 1, + max: 12, // Make it so that max would have us try to break the huge page. + want: BitRange{bits, bits + 10}, + } + } + tests["PreserveHugePageTop"] = test{ + alloc: []BitRange{{0, PallocChunkPages - bits}}, + min: 1, + max: 1, // Even one page would break a huge page in this case. 
+ want: BitRange{PallocChunkPages - bits, bits}, + } + } else if bits == PallocChunkPages { + tests["PreserveHugePageAll"] = test{ + min: 1, + max: 1, // Even one page would break a huge page in this case. + want: BitRange{0, PallocChunkPages}, + } + } else { + // The huge page size is greater than pallocChunkPages, so it should + // be effectively disabled. There's no way we can possibly scavenge + // a huge page out of this bitmap chunk. + tests["PreserveHugePageNone"] = test{ + min: 1, + max: 1, + want: BitRange{PallocChunkPages - 1, 1}, + } + } + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := makePallocData(v.alloc, v.scavenged) + start, size := b.FindScavengeCandidate(PallocChunkPages-1, v.min, v.max) + got := BitRange{start, size} + if !(got.N == 0 && v.want.N == 0) && got != v.want { + t.Fatalf("candidate mismatch: got %v, want %v", got, v.want) + } + }) + } +} + +// Tests end-to-end scavenging on a pageAlloc. +func TestPageAllocScavenge(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + type test struct { + request, expect uintptr + } + minPages := PhysPageSize / PageSize + if minPages < 1 { + minPages = 1 + } + type setup struct { + beforeAlloc map[ChunkIdx][]BitRange + beforeScav map[ChunkIdx][]BitRange + expect []test + afterScav map[ChunkIdx][]BitRange + } + tests := map[string]setup{ + "AllFreeUnscavExhaust": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {}, + }, + expect: []test{ + {^uintptr(0), 3 * PallocChunkPages * PageSize}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + }, + "NoneFreeUnscavExhaust": { + beforeAlloc: 
map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {}, + }, + expect: []test{ + {^uintptr(0), 0}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {}, + }, + }, + "ScavHighestPageFirst": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{uint(minPages), PallocChunkPages - uint(2*minPages)}}, + }, + expect: []test{ + {1, minPages * PageSize}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{uint(minPages), PallocChunkPages - uint(minPages)}}, + }, + }, + "ScavMultiple": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{uint(minPages), PallocChunkPages - uint(2*minPages)}}, + }, + expect: []test{ + {minPages * PageSize, minPages * PageSize}, + {minPages * PageSize, minPages * PageSize}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + }, + "ScavMultiple2": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{uint(minPages), PallocChunkPages - uint(2*minPages)}}, + BaseChunkIdx + 1: {{0, PallocChunkPages - uint(2*minPages)}}, + }, + expect: []test{ + {2 * minPages * PageSize, 2 * minPages * PageSize}, + {minPages * PageSize, minPages * PageSize}, + {minPages * PageSize, minPages * PageSize}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + }, + "ScavDiscontiguous": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 0xe: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + 
BaseChunkIdx: {{uint(minPages), PallocChunkPages - uint(2*minPages)}}, + BaseChunkIdx + 0xe: {{uint(2 * minPages), PallocChunkPages - uint(2*minPages)}}, + }, + expect: []test{ + {2 * minPages * PageSize, 2 * minPages * PageSize}, + {^uintptr(0), 2 * minPages * PageSize}, + {^uintptr(0), 0}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xe: {{0, PallocChunkPages}}, + }, + }, + } + // Disable these tests on iOS since we have a small address space. + // See #46860. + if PageAlloc64Bit != 0 && goos.IsIos == 0 { + tests["ScavAllVeryDiscontiguous"] = setup{ + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 0x1000: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 0x1000: {}, + }, + expect: []test{ + {^uintptr(0), 2 * PallocChunkPages * PageSize}, + {^uintptr(0), 0}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0x1000: {{0, PallocChunkPages}}, + }, + } + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := NewPageAlloc(v.beforeAlloc, v.beforeScav) + defer FreePageAlloc(b) + + for iter, h := range v.expect { + if got := b.Scavenge(h.request); got != h.expect { + t.Fatalf("bad scavenge #%d: want %d, got %d", iter+1, h.expect, got) + } + } + want := NewPageAlloc(v.beforeAlloc, v.afterScav) + defer FreePageAlloc(want) + + checkPageAlloc(t, want, b) + }) + } +} + +func TestScavenger(t *testing.T) { + // workedTime is a standard conversion of bytes of scavenge + // work to time elapsed. + workedTime := func(bytes uintptr) int64 { + return int64((bytes+4095)/4096) * int64(10*time.Microsecond) + } + + // Set up a bunch of state that we're going to track and verify + // throughout the test. + totalWork := uint64(64<<20 - 3*PhysPageSize) + var totalSlept, totalWorked atomic.Int64 + var availableWork atomic.Uint64 + var stopAt atomic.Uint64 // How much available work to stop at. 
+ + // Set up the scavenger. + var s Scavenger + s.Sleep = func(ns int64) int64 { + totalSlept.Add(ns) + return ns + } + s.Scavenge = func(bytes uintptr) (uintptr, int64) { + avail := availableWork.Load() + if uint64(bytes) > avail { + bytes = uintptr(avail) + } + t := workedTime(bytes) + if bytes != 0 { + availableWork.Add(-int64(bytes)) + totalWorked.Add(t) + } + return bytes, t + } + s.ShouldStop = func() bool { + if availableWork.Load() <= stopAt.Load() { + return true + } + return false + } + s.GoMaxProcs = func() int32 { + return 1 + } + + // Define a helper for verifying that various properties hold. + verifyScavengerState := func(t *testing.T, expWork uint64) { + t.Helper() + + // Check to make sure it did the amount of work we expected. + if workDone := uint64(s.Released()); workDone != expWork { + t.Errorf("want %d bytes of work done, got %d", expWork, workDone) + } + // Check to make sure the scavenger is meeting its CPU target. + idealFraction := float64(ScavengePercent) / 100.0 + cpuFraction := float64(totalWorked.Load()) / float64(totalWorked.Load()+totalSlept.Load()) + if cpuFraction < idealFraction-0.005 || cpuFraction > idealFraction+0.005 { + t.Errorf("want %f CPU fraction, got %f", idealFraction, cpuFraction) + } + } + + // Start the scavenger. + s.Start() + + // Set up some work and let the scavenger run to completion. + availableWork.Store(totalWork) + s.Wake() + if !s.BlockUntilParked(2e9 /* 2 seconds */) { + t.Fatal("timed out waiting for scavenger to run to completion") + } + // Run a check. + verifyScavengerState(t, totalWork) + + // Now let's do it again and see what happens when we have no work to do. + // It should've gone right back to sleep. + s.Wake() + if !s.BlockUntilParked(2e9 /* 2 seconds */) { + t.Fatal("timed out waiting for scavenger to run to completion") + } + // Run another check. + verifyScavengerState(t, totalWork) + + // One more time, this time doing the same amount of work as the first time. 
+ // Let's see if we can get the scavenger to continue. + availableWork.Store(totalWork) + s.Wake() + if !s.BlockUntilParked(2e9 /* 2 seconds */) { + t.Fatal("timed out waiting for scavenger to run to completion") + } + // Run another check. + verifyScavengerState(t, 2*totalWork) + + // This time, let's stop after a certain amount of work. + // + // Pick a stopping point such that when subtracted from totalWork + // we get a multiple of a relatively large power of 2. verifyScavengerState + // always makes an exact check, but the scavenger might go a little over, + // which is OK. If this breaks often or gets annoying to maintain, modify + // verifyScavengerState. + availableWork.Store(totalWork) + stoppingPoint := uint64(1<<20 - 3*PhysPageSize) + stopAt.Store(stoppingPoint) + s.Wake() + if !s.BlockUntilParked(2e9 /* 2 seconds */) { + t.Fatal("timed out waiting for scavenger to run to completion") + } + // Run another check. + verifyScavengerState(t, 2*totalWork+(totalWork-stoppingPoint)) + + // Clean up. + s.Stop() +} + +func TestScavengeIndex(t *testing.T) { + setup := func(t *testing.T) (func(ChunkIdx, uint), func(uintptr, uintptr)) { + t.Helper() + + // Pick some reasonable bounds. We don't need a huge range just to test. 
+ si := NewScavengeIndex(BaseChunkIdx, BaseChunkIdx+64) + find := func(want ChunkIdx, wantOffset uint) { + t.Helper() + + got, gotOffset := si.Find() + if want != got { + t.Errorf("find: wanted chunk index %d, got %d", want, got) + } + if wantOffset != gotOffset { + t.Errorf("find: wanted page offset %d, got %d", wantOffset, gotOffset) + } + if t.Failed() { + t.FailNow() + } + si.Clear(got) + } + mark := func(base, limit uintptr) { + t.Helper() + + si.Mark(base, limit) + } + return find, mark + } + t.Run("Uninitialized", func(t *testing.T) { + find, _ := setup(t) + find(0, 0) + }) + t.Run("OnePage", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 3), PageBase(BaseChunkIdx, 4)) + find(BaseChunkIdx, 3) + find(0, 0) + }) + t.Run("FirstPage", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx, 1)) + find(BaseChunkIdx, 0) + find(0, 0) + }) + t.Run("SeveralPages", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 9), PageBase(BaseChunkIdx, 14)) + find(BaseChunkIdx, 13) + find(0, 0) + }) + t.Run("WholeChunk", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)) + find(BaseChunkIdx, PallocChunkPages-1) + find(0, 0) + }) + t.Run("LastPage", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, PallocChunkPages-1), PageBase(BaseChunkIdx+1, 0)) + find(BaseChunkIdx, PallocChunkPages-1) + find(0, 0) + }) + t.Run("TwoChunks", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 128), PageBase(BaseChunkIdx+1, 128)) + find(BaseChunkIdx+1, 127) + find(BaseChunkIdx, PallocChunkPages-1) + find(0, 0) + }) + t.Run("TwoChunksOffset", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx+7, 128), PageBase(BaseChunkIdx+8, 129)) + find(BaseChunkIdx+8, 128) + find(BaseChunkIdx+7, PallocChunkPages-1) + find(0, 0) + }) + t.Run("SevenChunksOffset", func(t *testing.T) { + find, mark 
:= setup(t) + mark(PageBase(BaseChunkIdx+6, 11), PageBase(BaseChunkIdx+13, 15)) + find(BaseChunkIdx+13, 14) + for i := BaseChunkIdx + 12; i >= BaseChunkIdx+6; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("ThirtyTwoChunks", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+32, 0)) + for i := BaseChunkIdx + 31; i >= BaseChunkIdx; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("ThirtyTwoChunksOffset", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx+3, 0), PageBase(BaseChunkIdx+35, 0)) + for i := BaseChunkIdx + 34; i >= BaseChunkIdx+3; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("Mark", func(t *testing.T) { + find, mark := setup(t) + for i := BaseChunkIdx; i < BaseChunkIdx+32; i++ { + mark(PageBase(i, 0), PageBase(i+1, 0)) + } + for i := BaseChunkIdx + 31; i >= BaseChunkIdx; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("MarkInterleaved", func(t *testing.T) { + find, mark := setup(t) + for i := BaseChunkIdx; i < BaseChunkIdx+32; i++ { + mark(PageBase(i, 0), PageBase(i+1, 0)) + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("MarkIdempotentOneChunk", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)) + find(BaseChunkIdx, PallocChunkPages-1) + find(0, 0) + }) + t.Run("MarkIdempotentThirtyTwoChunks", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+32, 0)) + mark(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+32, 0)) + for i := BaseChunkIdx + 31; i >= BaseChunkIdx; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) + t.Run("MarkIdempotentThirtyTwoChunksOffset", func(t *testing.T) { + find, mark := setup(t) + mark(PageBase(BaseChunkIdx+4, 0), PageBase(BaseChunkIdx+31, 0)) + mark(PageBase(BaseChunkIdx+5, 0), 
PageBase(BaseChunkIdx+36, 0)) + for i := BaseChunkIdx + 35; i >= BaseChunkIdx+4; i-- { + find(i, PallocChunkPages-1) + } + find(0, 0) + }) +} + +func FuzzPIController(f *testing.F) { + isNormal := func(x float64) bool { + return !math.IsInf(x, 0) && !math.IsNaN(x) + } + isPositive := func(x float64) bool { + return isNormal(x) && x > 0 + } + // Seed with constants from controllers in the runtime. + // It's not critical that we keep these in sync, they're just + // reasonable seed inputs. + f.Add(0.3375, 3.2e6, 1e9, 0.001, 1000.0, 0.01) + f.Add(0.9, 4.0, 1000.0, -1000.0, 1000.0, 0.84) + f.Fuzz(func(t *testing.T, kp, ti, tt, min, max, setPoint float64) { + // Ignore uninteresting invalid parameters. These parameters + // are constant, so in practice surprising values will be documented + // or will be otherwise immediately visible. + // + // We just want to make sure that given a non-Inf, non-NaN input, + // we always get a non-Inf, non-NaN output. + if !isPositive(kp) || !isPositive(ti) || !isPositive(tt) { + return + } + if !isNormal(min) || !isNormal(max) || min > max { + return + } + // Use a random source, but make it deterministic. + rs := rand.New(rand.NewSource(800)) + randFloat64 := func() float64 { + return math.Float64frombits(rs.Uint64()) + } + p := NewPIController(kp, ti, tt, min, max) + state := float64(0) + for i := 0; i < 100; i++ { + input := randFloat64() + // Ignore the "ok" parameter. We're just trying to break it. + // state is intentionally completely uncorrelated with the input. + var ok bool + state, ok = p.Next(input, setPoint, 1.0) + if !isNormal(state) { + t.Fatalf("got NaN or Inf result from controller: %f %v", state, ok) + } + } + }) +} diff --git a/src/runtime/mgcstack.go b/src/runtime/mgcstack.go new file mode 100644 index 0000000..6b55220 --- /dev/null +++ b/src/runtime/mgcstack.go @@ -0,0 +1,350 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector: stack objects and stack tracing +// See the design doc at https://docs.google.com/document/d/1un-Jn47yByHL7I0aVIP_uVCMxjdM5mpelJhiKlIqxkE/edit?usp=sharing +// Also see issue 22350. + +// Stack tracing solves the problem of determining which parts of the +// stack are live and should be scanned. It runs as part of scanning +// a single goroutine stack. +// +// Normally determining which parts of the stack are live is easy to +// do statically, as user code has explicit references (reads and +// writes) to stack variables. The compiler can do a simple dataflow +// analysis to determine liveness of stack variables at every point in +// the code. See cmd/compile/internal/gc/plive.go for that analysis. +// +// However, when we take the address of a stack variable, determining +// whether that variable is still live is less clear. We can still +// look for static accesses, but accesses through a pointer to the +// variable are difficult in general to track statically. That pointer +// can be passed among functions on the stack, conditionally retained, +// etc. +// +// Instead, we will track pointers to stack variables dynamically. +// All pointers to stack-allocated variables will themselves be on the +// stack somewhere (or in associated locations, like defer records), so +// we can find them all efficiently. +// +// Stack tracing is organized as a mini garbage collection tracing +// pass. The objects in this garbage collection are all the variables +// on the stack whose address is taken, and which themselves contain a +// pointer. We call these variables "stack objects". +// +// We begin by determining all the stack objects on the stack and all +// the statically live pointers that may point into the stack. We then +// process each pointer to see if it points to a stack object. If it +// does, we scan that stack object. 
It may contain pointers into the +// heap, in which case those pointers are passed to the main garbage +// collection. It may also contain pointers into the stack, in which +// case we add them to our set of stack pointers. +// +// Once we're done processing all the pointers (including the ones we +// added during processing), we've found all the stack objects that +// are live. Any dead stack objects are not scanned and their contents +// will not keep heap objects live. Unlike the main garbage +// collection, we can't sweep the dead stack objects; they live on in +// a moribund state until the stack frame that contains them is +// popped. +// +// A stack can look like this: +// +// +----------+ +// | foo() | +// | +------+ | +// | | A | | <---\ +// | +------+ | | +// | | | +// | +------+ | | +// | | B | | | +// | +------+ | | +// | | | +// +----------+ | +// | bar() | | +// | +------+ | | +// | | C | | <-\ | +// | +----|-+ | | | +// | | | | | +// | +----v-+ | | | +// | | D ---------/ +// | +------+ | | +// | | | +// +----------+ | +// | baz() | | +// | +------+ | | +// | | E -------/ +// | +------+ | +// | ^ | +// | F: --/ | +// | | +// +----------+ +// +// foo() calls bar() calls baz(). Each has a frame on the stack. +// foo() has stack objects A and B. +// bar() has stack objects C and D, with C pointing to D and D pointing to A. +// baz() has a stack object E pointing to C, and a local variable F pointing to E. +// +// Starting from the pointer in local variable F, we will eventually +// scan all of E, C, D, and A (in that order). B is never scanned +// because there is no live pointer to it. If B is also statically +// dead (meaning that foo() never accesses B again after it calls +// bar()), then B's pointers into the heap are not considered live. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +const stackTraceDebug = false + +// Buffer for pointers found during stack tracing. 
+// Must be smaller than or equal to workbuf. +type stackWorkBuf struct { + _ sys.NotInHeap + stackWorkBufHdr + obj [(_WorkbufSize - unsafe.Sizeof(stackWorkBufHdr{})) / goarch.PtrSize]uintptr +} + +// Header declaration must come after the buf declaration above, because of issue #14620. +type stackWorkBufHdr struct { + _ sys.NotInHeap + workbufhdr + next *stackWorkBuf // linked list of workbufs + // Note: we could theoretically repurpose lfnode.next as this next pointer. + // It would save 1 word, but that probably isn't worth busting open + // the lfnode API. +} + +// Buffer for stack objects found on a goroutine stack. +// Must be smaller than or equal to workbuf. +type stackObjectBuf struct { + _ sys.NotInHeap + stackObjectBufHdr + obj [(_WorkbufSize - unsafe.Sizeof(stackObjectBufHdr{})) / unsafe.Sizeof(stackObject{})]stackObject +} + +type stackObjectBufHdr struct { + _ sys.NotInHeap + workbufhdr + next *stackObjectBuf +} + +func init() { + if unsafe.Sizeof(stackWorkBuf{}) > unsafe.Sizeof(workbuf{}) { + panic("stackWorkBuf too big") + } + if unsafe.Sizeof(stackObjectBuf{}) > unsafe.Sizeof(workbuf{}) { + panic("stackObjectBuf too big") + } +} + +// A stackObject represents a variable on the stack that has had +// its address taken. +type stackObject struct { + _ sys.NotInHeap + off uint32 // offset above stack.lo + size uint32 // size of object + r *stackObjectRecord // info of the object (for ptr/nonptr bits). nil if object has been scanned. + left *stackObject // objects with lower addresses + right *stackObject // objects with higher addresses +} + +// obj.r = r, but with no write barrier. +// +//go:nowritebarrier +func (obj *stackObject) setRecord(r *stackObjectRecord) { + // Types of stack objects are always in read-only memory, not the heap. + // So not using a write barrier is ok. + *(*uintptr)(unsafe.Pointer(&obj.r)) = uintptr(unsafe.Pointer(r)) +} + +// A stackScanState keeps track of the state used during the GC walk +// of a goroutine. 
+type stackScanState struct { + cache pcvalueCache + + // stack limits + stack stack + + // conservative indicates that the next frame must be scanned conservatively. + // This applies only to the innermost frame at an async safe-point. + conservative bool + + // buf contains the set of possible pointers to stack objects. + // Organized as a LIFO linked list of buffers. + // All buffers except possibly the head buffer are full. + buf *stackWorkBuf + freeBuf *stackWorkBuf // keep around one free buffer for allocation hysteresis + + // cbuf contains conservative pointers to stack objects. If + // all pointers to a stack object are obtained via + // conservative scanning, then the stack object may be dead + // and may contain dead pointers, so it must be scanned + // defensively. + cbuf *stackWorkBuf + + // list of stack objects + // Objects are in increasing address order. + head *stackObjectBuf + tail *stackObjectBuf + nobjs int + + // root of binary tree for fast object lookup by address + // Initialized by buildIndex. + root *stackObject +} + +// Add p as a potential pointer to a stack object. +// p must be a stack address. +func (s *stackScanState) putPtr(p uintptr, conservative bool) { + if p < s.stack.lo || p >= s.stack.hi { + throw("address not a stack address") + } + head := &s.buf + if conservative { + head = &s.cbuf + } + buf := *head + if buf == nil { + // Initial setup. + buf = (*stackWorkBuf)(unsafe.Pointer(getempty())) + buf.nobj = 0 + buf.next = nil + *head = buf + } else if buf.nobj == len(buf.obj) { + if s.freeBuf != nil { + buf = s.freeBuf + s.freeBuf = nil + } else { + buf = (*stackWorkBuf)(unsafe.Pointer(getempty())) + } + buf.nobj = 0 + buf.next = *head + *head = buf + } + buf.obj[buf.nobj] = p + buf.nobj++ +} + +// Remove and return a potential pointer to a stack object. +// Returns 0 if there are no more pointers available. 
+// +// This prefers non-conservative pointers so we scan stack objects +// precisely if there are any non-conservative pointers to them. +func (s *stackScanState) getPtr() (p uintptr, conservative bool) { + for _, head := range []**stackWorkBuf{&s.buf, &s.cbuf} { + buf := *head + if buf == nil { + // Never had any data. + continue + } + if buf.nobj == 0 { + if s.freeBuf != nil { + // Free old freeBuf. + putempty((*workbuf)(unsafe.Pointer(s.freeBuf))) + } + // Move buf to the freeBuf. + s.freeBuf = buf + buf = buf.next + *head = buf + if buf == nil { + // No more data in this list. + continue + } + } + buf.nobj-- + return buf.obj[buf.nobj], head == &s.cbuf + } + // No more data in either list. + if s.freeBuf != nil { + putempty((*workbuf)(unsafe.Pointer(s.freeBuf))) + s.freeBuf = nil + } + return 0, false +} + +// addObject adds a stack object at addr, described by record r, to the set of stack objects. +func (s *stackScanState) addObject(addr uintptr, r *stackObjectRecord) { + x := s.tail + if x == nil { + // initial setup + x = (*stackObjectBuf)(unsafe.Pointer(getempty())) + x.next = nil + s.head = x + s.tail = x + } + if x.nobj > 0 && uint32(addr-s.stack.lo) < x.obj[x.nobj-1].off+x.obj[x.nobj-1].size { + throw("objects added out of order or overlapping") + } + if x.nobj == len(x.obj) { + // full buffer - allocate a new buffer, add to end of linked list + y := (*stackObjectBuf)(unsafe.Pointer(getempty())) + y.next = nil + x.next = y + s.tail = y + x = y + } + obj := &x.obj[x.nobj] + x.nobj++ + obj.off = uint32(addr - s.stack.lo) + obj.size = uint32(r.size) + obj.setRecord(r) + // obj.left and obj.right will be initialized by buildIndex before use. + s.nobjs++ +} + +// buildIndex initializes s.root to a binary search tree. +// It should be called after all addObject calls but before +// any call of findObject. 
+func (s *stackScanState) buildIndex() { + s.root, _, _ = binarySearchTree(s.head, 0, s.nobjs) +} + +// Build a binary search tree with the n objects in the list +// x.obj[idx], x.obj[idx+1], ..., x.next.obj[0], ... +// Returns the root of that tree, and the buf+idx of the nth object after x.obj[idx]. +// (The first object that was not included in the binary search tree.) +// If n == 0, returns nil, x. +func binarySearchTree(x *stackObjectBuf, idx int, n int) (root *stackObject, restBuf *stackObjectBuf, restIdx int) { + if n == 0 { + return nil, x, idx + } + var left, right *stackObject + left, x, idx = binarySearchTree(x, idx, n/2) + root = &x.obj[idx] + idx++ + if idx == len(x.obj) { + x = x.next + idx = 0 + } + right, x, idx = binarySearchTree(x, idx, n-n/2-1) + root.left = left + root.right = right + return root, x, idx +} + +// findObject returns the stack object containing address a, if any. +// Must have called buildIndex previously. +func (s *stackScanState) findObject(a uintptr) *stackObject { + off := uint32(a - s.stack.lo) + obj := s.root + for { + if obj == nil { + return nil + } + if off < obj.off { + obj = obj.left + continue + } + if off >= obj.off+obj.size { + obj = obj.right + continue + } + return obj + } +} diff --git a/src/runtime/mgcsweep.go b/src/runtime/mgcsweep.go new file mode 100644 index 0000000..6ccf090 --- /dev/null +++ b/src/runtime/mgcsweep.go @@ -0,0 +1,967 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Garbage collector: sweeping + +// The sweeper consists of two different algorithms: +// +// * The object reclaimer finds and frees unmarked slots in spans. It +// can free a whole span if none of the objects are marked, but that +// isn't its goal. This can be driven either synchronously by +// mcentral.cacheSpan for mcentral spans, or asynchronously by +// sweepone, which looks at all the mcentral lists. 
+// +// * The span reclaimer looks for spans that contain no marked objects +// and frees whole spans. This is a separate algorithm because +// freeing whole spans is the hardest task for the object reclaimer, +// but is critical when allocating new spans. The entry point for +// this is mheap_.reclaim and it's driven by a sequential scan of +// the page marks bitmap in the heap arenas. +// +// Both algorithms ultimately call mspan.sweep, which sweeps a single +// heap span. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +var sweep sweepdata + +// State of background sweep. +type sweepdata struct { + lock mutex + g *g + parked bool + + nbgsweep uint32 + npausesweep uint32 + + // active tracks outstanding sweepers and the sweep + // termination condition. + active activeSweep + + // centralIndex is the current unswept span class. + // It represents an index into the mcentral span + // sets. Accessed and updated via its load and + // update methods. Not protected by a lock. + // + // Reset at mark termination. + // Used by mheap.nextSpanForSweep. + centralIndex sweepClass +} + +// sweepClass is a spanClass and one bit to represent whether we're currently +// sweeping partial or full spans. +type sweepClass uint32 + +const ( + numSweepClasses = numSpanClasses * 2 + sweepClassDone sweepClass = sweepClass(^uint32(0)) +) + +func (s *sweepClass) load() sweepClass { + return sweepClass(atomic.Load((*uint32)(s))) +} + +func (s *sweepClass) update(sNew sweepClass) { + // Only update *s if its current value is less than sNew, + // since *s increases monotonically. + sOld := s.load() + for sOld < sNew && !atomic.Cas((*uint32)(s), uint32(sOld), uint32(sNew)) { + sOld = s.load() + } + // TODO(mknyszek): This isn't the only place we have + // an atomic monotonically increasing counter. It would + // be nice to have an "atomic max" which is just implemented + // as the above on most architectures. 
Some architectures + // like RISC-V however have native support for an atomic max. +} + +func (s *sweepClass) clear() { + atomic.Store((*uint32)(s), 0) +} + +// split returns the underlying span class as well as +// whether we're interested in the full or partial +// unswept lists for that class, indicated as a boolean +// (true means "full"). +func (s sweepClass) split() (spc spanClass, full bool) { + return spanClass(s >> 1), s&1 == 0 +} + +// nextSpanForSweep finds and pops the next span for sweeping from the +// central sweep buffers. It returns ownership of the span to the caller. +// Returns nil if no such span exists. +func (h *mheap) nextSpanForSweep() *mspan { + sg := h.sweepgen + for sc := sweep.centralIndex.load(); sc < numSweepClasses; sc++ { + spc, full := sc.split() + c := &h.central[spc].mcentral + var s *mspan + if full { + s = c.fullUnswept(sg).pop() + } else { + s = c.partialUnswept(sg).pop() + } + if s != nil { + // Write down that we found something so future sweepers + // can start from here. + sweep.centralIndex.update(sc) + return s + } + } + // Write down that we found nothing. + sweep.centralIndex.update(sweepClassDone) + return nil +} + +const sweepDrainedMask = 1 << 31 + +// activeSweep is a type that captures whether sweeping +// is done, and whether there are any outstanding sweepers. +// +// Every potential sweeper must call begin() before they look +// for work, and end() after they've finished sweeping. +type activeSweep struct { + // state is divided into two parts. + // + // The top bit (masked by sweepDrainedMask) is a boolean + // value indicating whether all the sweep work has been + // drained from the queue. + // + // The rest of the bits are a counter, indicating the + // number of outstanding concurrent sweepers. + state atomic.Uint32 +} + +// begin registers a new sweeper. Returns a sweepLocker +// for acquiring spans for sweeping. Any outstanding sweeper blocks +// sweep termination. 
+// +// If the sweepLocker is invalid, the caller can be sure that all +// outstanding sweep work has been drained, so there is nothing left +// to sweep. Note that there may be sweepers currently running, so +// this does not indicate that all sweeping has completed. +// +// Even if the sweepLocker is invalid, its sweepGen is always valid. +func (a *activeSweep) begin() sweepLocker { + for { + state := a.state.Load() + if state&sweepDrainedMask != 0 { + return sweepLocker{mheap_.sweepgen, false} + } + if a.state.CompareAndSwap(state, state+1) { + return sweepLocker{mheap_.sweepgen, true} + } + } +} + +// end deregisters a sweeper. Must be called once for each time +// begin is called if the sweepLocker is valid. +func (a *activeSweep) end(sl sweepLocker) { + if sl.sweepGen != mheap_.sweepgen { + throw("sweeper left outstanding across sweep generations") + } + for { + state := a.state.Load() + if (state&^sweepDrainedMask)-1 >= sweepDrainedMask { + throw("mismatched begin/end of activeSweep") + } + if a.state.CompareAndSwap(state, state-1) { + if state != sweepDrainedMask { + return + } + if debug.gcpacertrace > 0 { + live := gcController.heapLive.Load() + print("pacer: sweep done at heap size ", live>>20, "MB; allocated ", (live-mheap_.sweepHeapLiveBasis)>>20, "MB during sweep; swept ", mheap_.pagesSwept.Load(), " pages at ", mheap_.sweepPagesPerByte, " pages/byte\n") + } + return + } + } +} + +// markDrained marks the active sweep cycle as having drained +// all remaining work. This is safe to be called concurrently +// with all other methods of activeSweep, though may race. +// +// Returns true if this call was the one that actually performed +// the mark. +func (a *activeSweep) markDrained() bool { + for { + state := a.state.Load() + if state&sweepDrainedMask != 0 { + return false + } + if a.state.CompareAndSwap(state, state|sweepDrainedMask) { + return true + } + } +} + +// sweepers returns the current number of active sweepers. 
+func (a *activeSweep) sweepers() uint32 { + return a.state.Load() &^ sweepDrainedMask +} + +// isDone returns true if all sweep work has been drained and no more +// outstanding sweepers exist. That is, when the sweep phase is +// completely done. +func (a *activeSweep) isDone() bool { + return a.state.Load() == sweepDrainedMask +} + +// reset sets up the activeSweep for the next sweep cycle. +// +// The world must be stopped. +func (a *activeSweep) reset() { + assertWorldStopped() + a.state.Store(0) +} + +// finishsweep_m ensures that all spans are swept. +// +// The world must be stopped. This ensures there are no sweeps in +// progress. +// +//go:nowritebarrier +func finishsweep_m() { + assertWorldStopped() + + // Sweeping must be complete before marking commences, so + // sweep any unswept spans. If this is a concurrent GC, there + // shouldn't be any spans left to sweep, so this should finish + // instantly. If GC was forced before the concurrent sweep + // finished, there may be spans to sweep. + for sweepone() != ^uintptr(0) { + sweep.npausesweep++ + } + + // Make sure there aren't any outstanding sweepers left. + // At this point, with the world stopped, it means one of two + // things. Either we were able to preempt a sweeper, or that + // a sweeper didn't call sweep.active.end when it should have. + // Both cases indicate a bug, so throw. + if sweep.active.sweepers() != 0 { + throw("active sweepers found at start of mark phase") + } + + // Reset all the unswept buffers, which should be empty. + // Do this in sweep termination as opposed to mark termination + // so that we can catch unswept spans and reclaim blocks as + // soon as possible. + sg := mheap_.sweepgen + for i := range mheap_.central { + c := &mheap_.central[i].mcentral + c.partialUnswept(sg).reset() + c.fullUnswept(sg).reset() + } + + // Sweeping is done, so if the scavenger isn't already awake, + // wake it up. There's definitely work for it to do at this + // point. 
+ scavenger.wake() + + nextMarkBitArenaEpoch() +} + +func bgsweep(c chan int) { + sweep.g = getg() + + lockInit(&sweep.lock, lockRankSweep) + lock(&sweep.lock) + sweep.parked = true + c <- 1 + goparkunlock(&sweep.lock, waitReasonGCSweepWait, traceEvGoBlock, 1) + + for { + // bgsweep attempts to be a "low priority" goroutine by intentionally + // yielding time. It's OK if it doesn't run, because goroutines allocating + // memory will sweep and ensure that all spans are swept before the next + // GC cycle. We really only want to run when we're idle. + // + // However, calling Gosched after each span swept produces a tremendous + // amount of tracing events, sometimes up to 50% of events in a trace. It's + // also inefficient to call into the scheduler so much because sweeping a + // single span is in general a very fast operation, taking as little as 30 ns + // on modern hardware. (See #54767.) + // + // As a result, bgsweep sweeps in batches, and only calls into the scheduler + // at the end of every batch. Furthermore, it only yields its time if there + // isn't spare idle time available on other cores. If there's available idle + // time, helping to sweep can reduce allocation latencies by getting ahead of + // the proportional sweeper and having spans ready to go for allocation. + const sweepBatchSize = 10 + nSwept := 0 + for sweepone() != ^uintptr(0) { + sweep.nbgsweep++ + nSwept++ + if nSwept%sweepBatchSize == 0 { + goschedIfBusy() + } + } + for freeSomeWbufs(true) { + // N.B. freeSomeWbufs is already batched internally. + goschedIfBusy() + } + lock(&sweep.lock) + if !isSweepDone() { + // This can happen if a GC runs between + // gosweepone returning ^0 above + // and the lock being acquired. + unlock(&sweep.lock) + continue + } + sweep.parked = true + goparkunlock(&sweep.lock, waitReasonGCSweepWait, traceEvGoBlock, 1) + } +} + +// sweepLocker acquires sweep ownership of spans. +type sweepLocker struct { + // sweepGen is the sweep generation of the heap. 
+ sweepGen uint32 + valid bool +} + +// sweepLocked represents sweep ownership of a span. +type sweepLocked struct { + *mspan +} + +// tryAcquire attempts to acquire sweep ownership of span s. If it +// successfully acquires ownership, it blocks sweep completion. +func (l *sweepLocker) tryAcquire(s *mspan) (sweepLocked, bool) { + if !l.valid { + throw("use of invalid sweepLocker") + } + // Check before attempting to CAS. + if atomic.Load(&s.sweepgen) != l.sweepGen-2 { + return sweepLocked{}, false + } + // Attempt to acquire sweep ownership of s. + if !atomic.Cas(&s.sweepgen, l.sweepGen-2, l.sweepGen-1) { + return sweepLocked{}, false + } + return sweepLocked{s}, true +} + +// sweepone sweeps some unswept heap span and returns the number of pages returned +// to the heap, or ^uintptr(0) if there was nothing to sweep. +func sweepone() uintptr { + gp := getg() + + // Increment locks to ensure that the goroutine is not preempted + // in the middle of sweep thus leaving the span in an inconsistent state for next GC + gp.m.locks++ + + // TODO(austin): sweepone is almost always called in a loop; + // lift the sweepLocker into its callers. + sl := sweep.active.begin() + if !sl.valid { + gp.m.locks-- + return ^uintptr(0) + } + + // Find a span to sweep. + npages := ^uintptr(0) + var noMoreWork bool + for { + s := mheap_.nextSpanForSweep() + if s == nil { + noMoreWork = sweep.active.markDrained() + break + } + if state := s.state.get(); state != mSpanInUse { + // This can happen if direct sweeping already + // swept this span, but in that case the sweep + // generation should always be up-to-date. + if !(s.sweepgen == sl.sweepGen || s.sweepgen == sl.sweepGen+3) { + print("runtime: bad span s.state=", state, " s.sweepgen=", s.sweepgen, " sweepgen=", sl.sweepGen, "\n") + throw("non in-use span in unswept list") + } + continue + } + if s, ok := sl.tryAcquire(s); ok { + // Sweep the span we found. + npages = s.npages + if s.sweep(false) { + // Whole span was freed. 
Count it toward the + // page reclaimer credit since these pages can + // now be used for span allocation. + mheap_.reclaimCredit.Add(npages) + } else { + // Span is still in-use, so this returned no + // pages to the heap and the span needs to + // move to the swept in-use list. + npages = 0 + } + break + } + } + sweep.active.end(sl) + + if noMoreWork { + // The sweep list is empty. There may still be + // concurrent sweeps running, but we're at least very + // close to done sweeping. + + // Move the scavenge gen forward (signaling + // that there's new work to do) and wake the scavenger. + // + // The scavenger is signaled by the last sweeper because once + // sweeping is done, we will definitely have useful work for + // the scavenger to do, since the scavenger only runs over the + // heap once per GC cycle. This update is not done during sweep + // termination because in some cases there may be a long delay + // between sweep done and sweep termination (e.g. not enough + // allocations to trigger a GC) which would be nice to fill in + // with scavenging work. + if debug.scavtrace > 0 { + systemstack(func() { + lock(&mheap_.lock) + released := atomic.Loaduintptr(&mheap_.pages.scav.released) + printScavTrace(released, false) + atomic.Storeuintptr(&mheap_.pages.scav.released, 0) + unlock(&mheap_.lock) + }) + } + scavenger.ready() + } + + gp.m.locks-- + return npages +} + +// isSweepDone reports whether all spans are swept. +// +// Note that this condition may transition from false to true at any +// time as the sweeper runs. It may transition from true to false if a +// GC runs; to prevent that the caller must be non-preemptible or must +// somehow block GC progress. +func isSweepDone() bool { + return sweep.active.isDone() +} + +// Returns only when span s has been swept. +// +//go:nowritebarrier +func (s *mspan) ensureSwept() { + // Caller must disable preemption. 
+ // Otherwise when this function returns the span can become unswept again + // (if GC is triggered on another goroutine). + gp := getg() + if gp.m.locks == 0 && gp.m.mallocing == 0 && gp != gp.m.g0 { + throw("mspan.ensureSwept: m is not locked") + } + + // If this operation fails, then that means that there are + // no more spans to be swept. In this case, either s has already + // been swept, or is about to be acquired for sweeping and swept. + sl := sweep.active.begin() + if sl.valid { + // The caller must be sure that the span is a mSpanInUse span. + if s, ok := sl.tryAcquire(s); ok { + s.sweep(false) + sweep.active.end(sl) + return + } + sweep.active.end(sl) + } + + // Unfortunately we can't sweep the span ourselves. Somebody else + // got to it first. We don't have efficient means to wait, but that's + // OK, it will be swept fairly soon. + for { + spangen := atomic.Load(&s.sweepgen) + if spangen == sl.sweepGen || spangen == sl.sweepGen+3 { + break + } + osyield() + } +} + +// Sweep frees or collects finalizers for blocks not marked in the mark phase. +// It clears the mark bits in preparation for the next GC round. +// Returns true if the span was returned to heap. +// If preserve=true, don't return it to heap nor relink in mcentral lists; +// caller takes care of it. +func (sl *sweepLocked) sweep(preserve bool) bool { + // It's critical that we enter this function with preemption disabled, + // GC must not start while we are in the middle of this function. + gp := getg() + if gp.m.locks == 0 && gp.m.mallocing == 0 && gp != gp.m.g0 { + throw("mspan.sweep: m is not locked") + } + + s := sl.mspan + if !preserve { + // We'll release ownership of this span. Nil it out to + // prevent the caller from accidentally using it. 
+ sl.mspan = nil + } + + sweepgen := mheap_.sweepgen + if state := s.state.get(); state != mSpanInUse || s.sweepgen != sweepgen-1 { + print("mspan.sweep: state=", state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n") + throw("mspan.sweep: bad span state") + } + + if trace.enabled { + traceGCSweepSpan(s.npages * _PageSize) + } + + mheap_.pagesSwept.Add(int64(s.npages)) + + spc := s.spanclass + size := s.elemsize + + // The allocBits indicate which unmarked objects don't need to be + // processed since they were free at the end of the last GC cycle + // and were not allocated since then. + // If the allocBits index is >= s.freeindex and the bit + // is not marked then the object remains unallocated + // since the last GC. + // This situation is analogous to being on a freelist. + + // Unlink & free special records for any objects we're about to free. + // Two complications here: + // 1. An object can have both finalizer and profile special records. + // In such case we need to queue finalizer for execution, + // mark the object as live and preserve the profile special. + // 2. A tiny object can have several finalizers setup for different offsets. + // If such object is not marked, we need to queue all finalizers at once. + // Both 1 and 2 are possible at the same time. + hadSpecials := s.specials != nil + siter := newSpecialsIter(s) + for siter.valid() { + // A finalizer can be set for an inner byte of an object, find object beginning. + objIndex := uintptr(siter.s.offset) / size + p := s.base() + objIndex*size + mbits := s.markBitsForIndex(objIndex) + if !mbits.isMarked() { + // This object is not marked and has at least one special record. + // Pass 1: see if it has at least one finalizer. + hasFin := false + endOffset := p - s.base() + size + for tmp := siter.s; tmp != nil && uintptr(tmp.offset) < endOffset; tmp = tmp.next { + if tmp.kind == _KindSpecialFinalizer { + // Stop freeing of object if it has a finalizer. 
+ mbits.setMarkedNonAtomic() + hasFin = true + break + } + } + // Pass 2: queue all finalizers _or_ handle profile record. + for siter.valid() && uintptr(siter.s.offset) < endOffset { + // Find the exact byte for which the special was set up + // (as opposed to object beginning). + special := siter.s + p := s.base() + uintptr(special.offset) + if special.kind == _KindSpecialFinalizer || !hasFin { + siter.unlinkAndNext() + freeSpecial(special, unsafe.Pointer(p), size) + } else { + // The object has finalizers, so we're keeping it alive. + // All other specials only apply when an object is freed, + // so just keep the special record. + siter.next() + } + } + } else { + // object is still live + if siter.s.kind == _KindSpecialReachable { + special := siter.unlinkAndNext() + (*specialReachable)(unsafe.Pointer(special)).reachable = true + freeSpecial(special, unsafe.Pointer(p), size) + } else { + // keep special record + siter.next() + } + } + } + if hadSpecials && s.specials == nil { + spanHasNoSpecials(s) + } + + if debug.allocfreetrace != 0 || debug.clobberfree != 0 || raceenabled || msanenabled || asanenabled { + // Find all newly freed objects. This doesn't have to + // be efficient; allocfreetrace has massive overhead. + mbits := s.markBitsForBase() + abits := s.allocBitsForIndex(0) + for i := uintptr(0); i < s.nelems; i++ { + if !mbits.isMarked() && (abits.index < s.freeindex || abits.isMarked()) { + x := s.base() + i*s.elemsize + if debug.allocfreetrace != 0 { + tracefree(unsafe.Pointer(x), size) + } + if debug.clobberfree != 0 { + clobberfree(unsafe.Pointer(x), size) + } + // User arenas are handled on explicit free. + if raceenabled && !s.isUserArenaChunk { + racefree(unsafe.Pointer(x), size) + } + if msanenabled && !s.isUserArenaChunk { + msanfree(unsafe.Pointer(x), size) + } + if asanenabled && !s.isUserArenaChunk { + asanpoison(unsafe.Pointer(x), size) + } + } + mbits.advance() + abits.advance() + } + } + + // Check for zombie objects.
+ if s.freeindex < s.nelems { + // Everything < freeindex is allocated and hence + // cannot be zombies. + // + // Check the first bitmap byte, where we have to be + // careful with freeindex. + obj := s.freeindex + if (*s.gcmarkBits.bytep(obj / 8)&^*s.allocBits.bytep(obj / 8))>>(obj%8) != 0 { + s.reportZombies() + } + // Check remaining bytes. + for i := obj/8 + 1; i < divRoundUp(s.nelems, 8); i++ { + if *s.gcmarkBits.bytep(i)&^*s.allocBits.bytep(i) != 0 { + s.reportZombies() + } + } + } + + // Count the number of free objects in this span. + nalloc := uint16(s.countAlloc()) + nfreed := s.allocCount - nalloc + if nalloc > s.allocCount { + // The zombie check above should have caught this in + // more detail. + print("runtime: nelems=", s.nelems, " nalloc=", nalloc, " previous allocCount=", s.allocCount, " nfreed=", nfreed, "\n") + throw("sweep increased allocation count") + } + + s.allocCount = nalloc + s.freeindex = 0 // reset allocation index to start of span. + s.freeIndexForScan = 0 + if trace.enabled { + getg().m.p.ptr().traceReclaimed += uintptr(nfreed) * s.elemsize + } + + // gcmarkBits becomes the allocBits. + // get a fresh cleared gcmarkBits in preparation for next GC + s.allocBits = s.gcmarkBits + s.gcmarkBits = newMarkBits(s.nelems) + + // Initialize alloc bits cache. + s.refillAllocCache(0) + + // The span must be in our exclusive ownership until we update sweepgen, + // check for potential races. + if state := s.state.get(); state != mSpanInUse || s.sweepgen != sweepgen-1 { + print("mspan.sweep: state=", state, " sweepgen=", s.sweepgen, " mheap.sweepgen=", sweepgen, "\n") + throw("mspan.sweep: bad span state after sweep") + } + if s.sweepgen == sweepgen+1 || s.sweepgen == sweepgen+3 { + throw("swept cached span") + } + + // We need to set s.sweepgen = h.sweepgen only when all blocks are swept, + // because of the potential for a concurrent free/SetFinalizer. 
+ // + // But we need to set it before we make the span available for allocation + // (return it to heap or mcentral), because allocation code assumes that a + // span is already swept if available for allocation. + // + // Serialization point. + // At this point the mark bits are cleared and allocation ready + // to go so release the span. + atomic.Store(&s.sweepgen, sweepgen) + + if s.isUserArenaChunk { + if preserve { + // This is a case that should never be handled by a sweeper that + // preserves the span for reuse. + throw("sweep: tried to preserve a user arena span") + } + if nalloc > 0 { + // There still exist pointers into the span or the span hasn't been + // freed yet. It's not ready to be reused. Put it back on the + // full swept list for the next cycle. + mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s) + return false + } + + // It's only at this point that the sweeper doesn't actually need to look + // at this arena anymore, so subtract from pagesInUse now. + mheap_.pagesInUse.Add(-s.npages) + s.state.set(mSpanDead) + + // The arena is ready to be recycled. Remove it from the quarantine list + // and place it on the ready list. Don't add it back to any sweep lists. + systemstack(func() { + // It's the arena code's responsibility to get the chunk on the quarantine + // list by the time all references to the chunk are gone. + if s.list != &mheap_.userArena.quarantineList { + throw("user arena span is on the wrong list") + } + lock(&mheap_.lock) + mheap_.userArena.quarantineList.remove(s) + mheap_.userArena.readyList.insert(s) + unlock(&mheap_.lock) + }) + return false + } + + if spc.sizeclass() != 0 { + // Handle spans for small objects. + if nfreed > 0 { + // Only mark the span as needing zeroing if we've freed any + // objects, because a fresh span that had been allocated into, + // wasn't totally filled, but then swept, still has all of its + // free slots zeroed. 
+ s.needzero = 1 + stats := memstats.heapStats.acquire() + atomic.Xadd64(&stats.smallFreeCount[spc.sizeclass()], int64(nfreed)) + memstats.heapStats.release() + + // Count the frees in the inconsistent, internal stats. + gcController.totalFree.Add(int64(nfreed) * int64(s.elemsize)) + } + if !preserve { + // The caller may not have removed this span from whatever + // unswept set it's on but taken ownership of the span for + // sweeping by updating sweepgen. If this span still is in + // an unswept set, then the mcentral will pop it off the + // set, check its sweepgen, and ignore it. + if nalloc == 0 { + // Free totally free span directly back to the heap. + mheap_.freeSpan(s) + return true + } + // Return span back to the right mcentral list. + if uintptr(nalloc) == s.nelems { + mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s) + } else { + mheap_.central[spc].mcentral.partialSwept(sweepgen).push(s) + } + } + } else if !preserve { + // Handle spans for large objects. + if nfreed != 0 { + // Free large object span to heap. + + // NOTE(rsc,dvyukov): The original implementation of efence + // in CL 22060046 used sysFree instead of sysFault, so that + // the operating system would eventually give the memory + // back to us again, so that an efence program could run + // longer without running out of memory. Unfortunately, + // calling sysFree here without any kind of adjustment of the + // heap data structures means that when the memory does + // come back to us, we have the wrong metadata for it, either in + // the mspan structures or in the garbage collection bitmap. + // Using sysFault here means that the program will run out of + // memory fairly quickly in efence mode, but at least it won't + // have mysterious crashes due to confused memory reuse. + // It should be possible to switch back to sysFree if we also + // implement and then call some kind of mheap.deleteSpan.
+ if debug.efence > 0 { + s.limit = 0 // prevent mlookup from finding this span + sysFault(unsafe.Pointer(s.base()), size) + } else { + mheap_.freeSpan(s) + } + + // Count the free in the consistent, external stats. + stats := memstats.heapStats.acquire() + atomic.Xadd64(&stats.largeFreeCount, 1) + atomic.Xadd64(&stats.largeFree, int64(size)) + memstats.heapStats.release() + + // Count the free in the inconsistent, internal stats. + gcController.totalFree.Add(int64(size)) + + return true + } + + // Add a large span directly onto the full+swept list. + mheap_.central[spc].mcentral.fullSwept(sweepgen).push(s) + } + return false +} + +// reportZombies reports any marked but free objects in s and throws. +// +// This generally means one of the following: +// +// 1. User code converted a pointer to a uintptr and then back +// unsafely, and a GC ran while the uintptr was the only reference to +// an object. +// +// 2. User code (or a compiler bug) constructed a bad pointer that +// points to a free slot, often a past-the-end pointer. +// +// 3. The GC two cycles ago missed a pointer and freed a live object, +// but it was still live in the last cycle, so this GC cycle found a +// pointer to that object and marked it. +func (s *mspan) reportZombies() { + printlock() + print("runtime: marked free object in span ", s, ", elemsize=", s.elemsize, " freeindex=", s.freeindex, " (bad use of unsafe.Pointer? 
try -d=checkptr)\n") + mbits := s.markBitsForBase() + abits := s.allocBitsForIndex(0) + for i := uintptr(0); i < s.nelems; i++ { + addr := s.base() + i*s.elemsize + print(hex(addr)) + alloc := i < s.freeindex || abits.isMarked() + if alloc { + print(" alloc") + } else { + print(" free ") + } + if mbits.isMarked() { + print(" marked ") + } else { + print(" unmarked") + } + zombie := mbits.isMarked() && !alloc + if zombie { + print(" zombie") + } + print("\n") + if zombie { + length := s.elemsize + if length > 1024 { + length = 1024 + } + hexdumpWords(addr, addr+length, nil) + } + mbits.advance() + abits.advance() + } + throw("found pointer to free object") +} + +// deductSweepCredit deducts sweep credit for allocating a span of +// size spanBytes. This must be performed *before* the span is +// allocated to ensure the system has enough credit. If necessary, it +// performs sweeping to prevent going into debt. If the caller will +// also sweep pages (e.g., for a large allocation), it can pass a +// non-zero callerSweepPages to leave that many pages unswept. +// +// deductSweepCredit makes a worst-case assumption that all spanBytes +// bytes of the ultimately allocated span will be available for object +// allocation. +// +// deductSweepCredit is the core of the "proportional sweep" system. +// It uses statistics gathered by the garbage collector to perform +// enough sweeping so that all pages are swept during the concurrent +// sweep phase between GC cycles. +// +// mheap_ must NOT be locked. +func deductSweepCredit(spanBytes uintptr, callerSweepPages uintptr) { + if mheap_.sweepPagesPerByte == 0 { + // Proportional sweep is done or disabled. + return + } + + if trace.enabled { + traceGCSweepStart() + } + + // Fix debt if necessary.
+retry: + sweptBasis := mheap_.pagesSweptBasis.Load() + live := gcController.heapLive.Load() + liveBasis := mheap_.sweepHeapLiveBasis + newHeapLive := spanBytes + if liveBasis < live { + // Only do this subtraction when we don't overflow. Otherwise, pagesTarget + // might be computed as something really huge, causing us to get stuck + // sweeping here until the next mark phase. + // + // Overflow can happen here if gcPaceSweeper is called concurrently with + // sweeping (i.e. not during a STW, like it usually is) because this code + // is intentionally racy. A concurrent call to gcPaceSweeper can happen + // if a GC tuning parameter is modified and we read an older value of + // heapLive than what was used to set the basis. + // + // This state should be transient, so it's fine to just let newHeapLive + // be a relatively small number. We'll probably just skip this attempt to + // sweep. + // + // See issue #57523. + newHeapLive += uintptr(live - liveBasis) + } + pagesTarget := int64(mheap_.sweepPagesPerByte*float64(newHeapLive)) - int64(callerSweepPages) + for pagesTarget > int64(mheap_.pagesSwept.Load()-sweptBasis) { + if sweepone() == ^uintptr(0) { + mheap_.sweepPagesPerByte = 0 + break + } + if mheap_.pagesSweptBasis.Load() != sweptBasis { + // Sweep pacing changed. Recompute debt. + goto retry + } + } + + if trace.enabled { + traceGCSweepDone() + } +} + +// clobberfree sets the memory content at x to bad content, for debugging +// purposes. +func clobberfree(x unsafe.Pointer, size uintptr) { + // size (span.elemsize) is always a multiple of 4. + for i := uintptr(0); i < size; i += 4 { + *(*uint32)(add(x, i)) = 0xdeadbeef + } +} + +// gcPaceSweeper updates the sweeper's pacing parameters. +// +// Must be called whenever the GC's pacing is updated. +// +// The world must be stopped, or mheap_.lock must be held. +func gcPaceSweeper(trigger uint64) { + assertWorldStoppedOrLockHeld(&mheap_.lock) + + // Update sweep pacing. 
+ if isSweepDone() { + mheap_.sweepPagesPerByte = 0 + } else { + // Concurrent sweep needs to sweep all of the in-use + // pages by the time the allocated heap reaches the GC + // trigger. Compute the ratio of in-use pages to sweep + // per byte allocated, accounting for the fact that + // some might already be swept. + heapLiveBasis := gcController.heapLive.Load() + heapDistance := int64(trigger) - int64(heapLiveBasis) + // Add a little margin so rounding errors and + // concurrent sweep are less likely to leave pages + // unswept when GC starts. + heapDistance -= 1024 * 1024 + if heapDistance < _PageSize { + // Avoid setting the sweep ratio extremely high + heapDistance = _PageSize + } + pagesSwept := mheap_.pagesSwept.Load() + pagesInUse := mheap_.pagesInUse.Load() + sweepDistancePages := int64(pagesInUse) - int64(pagesSwept) + if sweepDistancePages <= 0 { + mheap_.sweepPagesPerByte = 0 + } else { + mheap_.sweepPagesPerByte = float64(sweepDistancePages) / float64(heapDistance) + mheap_.sweepHeapLiveBasis = heapLiveBasis + // Write pagesSweptBasis last, since this + // signals concurrent sweeps to recompute + // their debt. + mheap_.pagesSweptBasis.Store(pagesSwept) + } + } +} diff --git a/src/runtime/mgcwork.go b/src/runtime/mgcwork.go new file mode 100644 index 0000000..7ab8975 --- /dev/null +++ b/src/runtime/mgcwork.go @@ -0,0 +1,489 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +const ( + _WorkbufSize = 2048 // in bytes; larger values result in less contention + + // workbufAlloc is the number of bytes to allocate at a time + // for new workbufs. This must be a multiple of pageSize and + // should be a multiple of _WorkbufSize. + // + // Larger values reduce workbuf allocation overhead. 
Smaller + // values reduce heap fragmentation. + workbufAlloc = 32 << 10 +) + +func init() { + if workbufAlloc%pageSize != 0 || workbufAlloc%_WorkbufSize != 0 { + throw("bad workbufAlloc") + } +} + +// Garbage collector work pool abstraction. +// +// This implements a producer/consumer model for pointers to grey +// objects. A grey object is one that is marked and on a work +// queue. A black object is marked and not on a work queue. +// +// Write barriers, root discovery, stack scanning, and object scanning +// produce pointers to grey objects. Scanning consumes pointers to +// grey objects, thus blackening them, and then scans them, +// potentially producing new pointers to grey objects. + +// A gcWork provides the interface to produce and consume work for the +// garbage collector. +// +// A gcWork can be used on the stack as follows: +// +// (preemption must be disabled) +// gcw := &getg().m.p.ptr().gcw +// .. call gcw.put() to produce and gcw.tryGet() to consume .. +// +// It's important that any use of gcWork during the mark phase prevent +// the garbage collector from transitioning to mark termination since +// gcWork may locally hold GC work buffers. This can be done by +// disabling preemption (systemstack or acquirem). +type gcWork struct { + // wbuf1 and wbuf2 are the primary and secondary work buffers. + // + // This can be thought of as a stack of both work buffers' + // pointers concatenated. When we pop the last pointer, we + // shift the stack up by one work buffer by bringing in a new + // full buffer and discarding an empty one. When we fill both + // buffers, we shift the stack down by one work buffer by + // bringing in a new empty buffer and discarding a full one. + // This way we have one buffer's worth of hysteresis, which + // amortizes the cost of getting or putting a work buffer over + // at least one buffer of work and reduces contention on the + // global work lists. 
+ // + // wbuf1 is always the buffer we're currently pushing to and + // popping from and wbuf2 is the buffer that will be discarded + // next. + // + // Invariant: Both wbuf1 and wbuf2 are nil or neither are. + wbuf1, wbuf2 *workbuf + + // Bytes marked (blackened) on this gcWork. This is aggregated + // into work.bytesMarked by dispose. + bytesMarked uint64 + + // Heap scan work performed on this gcWork. This is aggregated into + // gcController by dispose and may also be flushed by callers. + // Other types of scan work are flushed immediately. + heapScanWork int64 + + // flushedWork indicates that a non-empty work buffer was + // flushed to the global work list since the last gcMarkDone + // termination check. Specifically, this indicates that this + // gcWork may have communicated work to another gcWork. + flushedWork bool +} + +// Most of the methods of gcWork are go:nowritebarrierrec because the +// write barrier itself can invoke gcWork methods but the methods are +// not generally re-entrant. Hence, if a gcWork method invoked the +// write barrier while the gcWork was in an inconsistent state, and +// the write barrier in turn invoked a gcWork method, it could +// permanently corrupt the gcWork. + +func (w *gcWork) init() { + w.wbuf1 = getempty() + wbuf2 := trygetfull() + if wbuf2 == nil { + wbuf2 = getempty() + } + w.wbuf2 = wbuf2 +} + +// put enqueues a pointer for the garbage collector to trace. +// obj must point to the beginning of a heap object or an oblet. +// +//go:nowritebarrierrec +func (w *gcWork) put(obj uintptr) { + flushed := false + wbuf := w.wbuf1 + // Record that this may acquire the wbufSpans or heap lock to + // allocate a workbuf. + lockWithRankMayAcquire(&work.wbufSpans.lock, lockRankWbufSpans) + lockWithRankMayAcquire(&mheap_.lock, lockRankMheap) + if wbuf == nil { + w.init() + wbuf = w.wbuf1 + // wbuf is empty at this point. 
+ } else if wbuf.nobj == len(wbuf.obj) { + w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1 + wbuf = w.wbuf1 + if wbuf.nobj == len(wbuf.obj) { + putfull(wbuf) + w.flushedWork = true + wbuf = getempty() + w.wbuf1 = wbuf + flushed = true + } + } + + wbuf.obj[wbuf.nobj] = obj + wbuf.nobj++ + + // If we put a buffer on full, let the GC controller know so + // it can encourage more workers to run. We delay this until + // the end of put so that w is in a consistent state, since + // enlistWorker may itself manipulate w. + if flushed && gcphase == _GCmark { + gcController.enlistWorker() + } +} + +// putFast does a put if it can be done quickly and reports whether +// it succeeded. If it returns false, the caller needs to call put. +// +//go:nowritebarrierrec +func (w *gcWork) putFast(obj uintptr) bool { + wbuf := w.wbuf1 + if wbuf == nil || wbuf.nobj == len(wbuf.obj) { + return false + } + + wbuf.obj[wbuf.nobj] = obj + wbuf.nobj++ + return true +} + +// putBatch performs a put on every pointer in obj. See put for +// constraints on these pointers. +// +//go:nowritebarrierrec +func (w *gcWork) putBatch(obj []uintptr) { + if len(obj) == 0 { + return + } + + flushed := false + wbuf := w.wbuf1 + if wbuf == nil { + w.init() + wbuf = w.wbuf1 + } + + for len(obj) > 0 { + for wbuf.nobj == len(wbuf.obj) { + putfull(wbuf) + w.flushedWork = true + w.wbuf1, w.wbuf2 = w.wbuf2, getempty() + wbuf = w.wbuf1 + flushed = true + } + n := copy(wbuf.obj[wbuf.nobj:], obj) + wbuf.nobj += n + obj = obj[n:] + } + + if flushed && gcphase == _GCmark { + gcController.enlistWorker() + } +} + +// tryGet dequeues a pointer for the garbage collector to trace. +// +// If there are no pointers remaining in this gcWork or in the global +// queue, tryGet returns 0. Note that there may still be pointers in +// other gcWork instances or other caches. +// +//go:nowritebarrierrec +func (w *gcWork) tryGet() uintptr { + wbuf := w.wbuf1 + if wbuf == nil { + w.init() + wbuf = w.wbuf1 + // wbuf is empty at this point.
+ } + if wbuf.nobj == 0 { + w.wbuf1, w.wbuf2 = w.wbuf2, w.wbuf1 + wbuf = w.wbuf1 + if wbuf.nobj == 0 { + owbuf := wbuf + wbuf = trygetfull() + if wbuf == nil { + return 0 + } + putempty(owbuf) + w.wbuf1 = wbuf + } + } + + wbuf.nobj-- + return wbuf.obj[wbuf.nobj] +} + +// tryGetFast dequeues a pointer for the garbage collector to trace +// if one is readily available. Otherwise it returns 0 and +// the caller is expected to call tryGet(). +// +//go:nowritebarrierrec +func (w *gcWork) tryGetFast() uintptr { + wbuf := w.wbuf1 + if wbuf == nil || wbuf.nobj == 0 { + return 0 + } + + wbuf.nobj-- + return wbuf.obj[wbuf.nobj] +} + +// dispose returns any cached pointers to the global queue. +// The buffers are being put on the full queue so that the +// write barriers will not simply reacquire them before the +// GC can inspect them. This helps reduce the mutator's +// ability to hide pointers during the concurrent mark phase. +// +//go:nowritebarrierrec +func (w *gcWork) dispose() { + if wbuf := w.wbuf1; wbuf != nil { + if wbuf.nobj == 0 { + putempty(wbuf) + } else { + putfull(wbuf) + w.flushedWork = true + } + w.wbuf1 = nil + + wbuf = w.wbuf2 + if wbuf.nobj == 0 { + putempty(wbuf) + } else { + putfull(wbuf) + w.flushedWork = true + } + w.wbuf2 = nil + } + if w.bytesMarked != 0 { + // dispose happens relatively infrequently. If this + // atomic becomes a problem, we should first try to + // dispose less and if necessary aggregate in a per-P + // counter. + atomic.Xadd64(&work.bytesMarked, int64(w.bytesMarked)) + w.bytesMarked = 0 + } + if w.heapScanWork != 0 { + gcController.heapScanWork.Add(w.heapScanWork) + w.heapScanWork = 0 + } +} + +// balance moves some work that's cached in this gcWork back on the +// global queue. 
+// +//go:nowritebarrierrec +func (w *gcWork) balance() { + if w.wbuf1 == nil { + return + } + if wbuf := w.wbuf2; wbuf.nobj != 0 { + putfull(wbuf) + w.flushedWork = true + w.wbuf2 = getempty() + } else if wbuf := w.wbuf1; wbuf.nobj > 4 { + w.wbuf1 = handoff(wbuf) + w.flushedWork = true // handoff did putfull + } else { + return + } + // We flushed a buffer to the full list, so wake a worker. + if gcphase == _GCmark { + gcController.enlistWorker() + } +} + +// empty reports whether w has no mark work available. +// +//go:nowritebarrierrec +func (w *gcWork) empty() bool { + return w.wbuf1 == nil || (w.wbuf1.nobj == 0 && w.wbuf2.nobj == 0) +} + +// Internally, the GC work pool is kept in arrays in work buffers. +// The gcWork interface caches a work buffer until full (or empty) to +// avoid contending on the global work buffer lists. + +type workbufhdr struct { + node lfnode // must be first + nobj int +} + +type workbuf struct { + _ sys.NotInHeap + workbufhdr + // account for the above fields + obj [(_WorkbufSize - unsafe.Sizeof(workbufhdr{})) / goarch.PtrSize]uintptr +} + +// workbuf factory routines. These funcs are used to manage the +// workbufs. +// If the GC asks for some work these are the only routines that +// make wbufs available to the GC. + +func (b *workbuf) checknonempty() { + if b.nobj == 0 { + throw("workbuf is empty") + } +} + +func (b *workbuf) checkempty() { + if b.nobj != 0 { + throw("workbuf is not empty") + } +} + +// getempty pops an empty work buffer off the work.empty list, +// allocating new buffers if none are available. +// +//go:nowritebarrier +func getempty() *workbuf { + var b *workbuf + if work.empty != 0 { + b = (*workbuf)(work.empty.pop()) + if b != nil { + b.checkempty() + } + } + // Record that this may acquire the wbufSpans or heap lock to + // allocate a workbuf. + lockWithRankMayAcquire(&work.wbufSpans.lock, lockRankWbufSpans) + lockWithRankMayAcquire(&mheap_.lock, lockRankMheap) + if b == nil { + // Allocate more workbufs. 
+ var s *mspan + if work.wbufSpans.free.first != nil { + lock(&work.wbufSpans.lock) + s = work.wbufSpans.free.first + if s != nil { + work.wbufSpans.free.remove(s) + work.wbufSpans.busy.insert(s) + } + unlock(&work.wbufSpans.lock) + } + if s == nil { + systemstack(func() { + s = mheap_.allocManual(workbufAlloc/pageSize, spanAllocWorkBuf) + }) + if s == nil { + throw("out of memory") + } + // Record the new span in the busy list. + lock(&work.wbufSpans.lock) + work.wbufSpans.busy.insert(s) + unlock(&work.wbufSpans.lock) + } + // Slice up the span into new workbufs. Return one and + // put the rest on the empty list. + for i := uintptr(0); i+_WorkbufSize <= workbufAlloc; i += _WorkbufSize { + newb := (*workbuf)(unsafe.Pointer(s.base() + i)) + newb.nobj = 0 + lfnodeValidate(&newb.node) + if i == 0 { + b = newb + } else { + putempty(newb) + } + } + } + return b +} + +// putempty puts a workbuf onto the work.empty list. +// Upon entry this goroutine owns b. The lfstack.push relinquishes ownership. +// +//go:nowritebarrier +func putempty(b *workbuf) { + b.checkempty() + work.empty.push(&b.node) +} + +// putfull puts the workbuf on the work.full list for the GC. +// putfull accepts partially full buffers so the GC can avoid competing +// with the mutators for ownership of partially full buffers. +// +//go:nowritebarrier +func putfull(b *workbuf) { + b.checknonempty() + work.full.push(&b.node) +} + +// trygetfull tries to get a full or partially empty workbuffer. +// If one is not immediately available return nil. +// +//go:nowritebarrier +func trygetfull() *workbuf { + b := (*workbuf)(work.full.pop()) + if b != nil { + b.checknonempty() + return b + } + return b +} + +//go:nowritebarrier +func handoff(b *workbuf) *workbuf { + // Make new buffer with half of b's pointers. 
+ b1 := getempty() + n := b.nobj / 2 + b.nobj -= n + b1.nobj = n + memmove(unsafe.Pointer(&b1.obj[0]), unsafe.Pointer(&b.obj[b.nobj]), uintptr(n)*unsafe.Sizeof(b1.obj[0])) + + // Put b on full list - let first half of b get stolen. + putfull(b) + return b1 +} + +// prepareFreeWorkbufs moves busy workbuf spans to free list so they +// can be freed to the heap. This must only be called when all +// workbufs are on the empty list. +func prepareFreeWorkbufs() { + lock(&work.wbufSpans.lock) + if work.full != 0 { + throw("cannot free workbufs when work.full != 0") + } + // Since all workbufs are on the empty list, we don't care + // which ones are in which spans. We can wipe the entire empty + // list and move all workbuf spans to the free list. + work.empty = 0 + work.wbufSpans.free.takeAll(&work.wbufSpans.busy) + unlock(&work.wbufSpans.lock) +} + +// freeSomeWbufs frees some workbufs back to the heap and returns +// true if it should be called again to free more. +func freeSomeWbufs(preemptible bool) bool { + const batchSize = 64 // ~1–2 µs per span. + lock(&work.wbufSpans.lock) + if gcphase != _GCoff || work.wbufSpans.free.isEmpty() { + unlock(&work.wbufSpans.lock) + return false + } + systemstack(func() { + gp := getg().m.curg + for i := 0; i < batchSize && !(preemptible && gp.preempt); i++ { + span := work.wbufSpans.free.first + if span == nil { + break + } + work.wbufSpans.free.remove(span) + mheap_.freeManual(span, spanAllocWorkBuf) + } + }) + more := !work.wbufSpans.free.isEmpty() + unlock(&work.wbufSpans.lock) + return more +} diff --git a/src/runtime/mheap.go b/src/runtime/mheap.go new file mode 100644 index 0000000..1401e92 --- /dev/null +++ b/src/runtime/mheap.go @@ -0,0 +1,2228 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Page heap. +// +// See malloc.go for overview. 
+ +package runtime + +import ( + "internal/cpu" + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +const ( + // minPhysPageSize is a lower-bound on the physical page size. The + // true physical page size may be larger than this. In contrast, + // sys.PhysPageSize is an upper-bound on the physical page size. + minPhysPageSize = 4096 + + // maxPhysPageSize is the maximum page size the runtime supports. + maxPhysPageSize = 512 << 10 + + // maxPhysHugePageSize sets an upper-bound on the maximum huge page size + // that the runtime supports. + maxPhysHugePageSize = pallocChunkBytes + + // pagesPerReclaimerChunk indicates how many pages to scan from the + // pageInUse bitmap at a time. Used by the page reclaimer. + // + // Higher values reduce contention on scanning indexes (such as + // h.reclaimIndex), but increase the minimum latency of the + // operation. + // + // The time required to scan this many pages can vary a lot depending + // on how many spans are actually freed. Experimentally, it can + // scan for pages at ~300 GB/ms on a 2.6GHz Core i7, but can only + // free spans at ~32 MB/ms. Using 512 pages bounds this at + // roughly 100µs. + // + // Must be a multiple of the pageInUse bitmap element size and + // must also evenly divide pagesPerArena. + pagesPerReclaimerChunk = 512 + + // physPageAlignedStacks indicates whether stack allocations must be + // physical page aligned. This is a requirement for MAP_STACK on + // OpenBSD. + physPageAlignedStacks = GOOS == "openbsd" +) + +// Main malloc heap. +// The heap itself is the "free" and "scav" treaps, +// but all the other global data is here too. +// +// mheap must not be heap-allocated because it contains mSpanLists, +// which must not be heap-allocated. +type mheap struct { + _ sys.NotInHeap + + // lock must only be acquired on the system stack, otherwise a g + // could self-deadlock if its stack grows with the lock held. 
+ lock mutex + + pages pageAlloc // page allocation data structure + + sweepgen uint32 // sweep generation, see comment in mspan; written during STW + + // allspans is a slice of all mspans ever created. Each mspan + // appears exactly once. + // + // The memory for allspans is manually managed and can be + // reallocated and move as the heap grows. + // + // In general, allspans is protected by mheap_.lock, which + // prevents concurrent access as well as freeing the backing + // store. Accesses during STW might not hold the lock, but + // must ensure that allocation cannot happen around the + // access (since that may free the backing store). + allspans []*mspan // all spans out there + + // Proportional sweep + // + // These parameters represent a linear function from gcController.heapLive + // to page sweep count. The proportional sweep system works to + // stay in the black by keeping the current page sweep count + // above this line at the current gcController.heapLive. + // + // The line has slope sweepPagesPerByte and passes through a + // basis point at (sweepHeapLiveBasis, pagesSweptBasis). At + // any given time, the system is at (gcController.heapLive, + // pagesSwept) in this space. + // + // It is important that the line pass through a point we + // control rather than simply starting at a 0,0 origin + // because that lets us adjust sweep pacing at any time while + // accounting for current progress. If we could only adjust + // the slope, it would create a discontinuity in debt if any + // progress has already been made. 
+ pagesInUse atomic.Uintptr // pages of spans in stats mSpanInUse + pagesSwept atomic.Uint64 // pages swept this cycle + pagesSweptBasis atomic.Uint64 // pagesSwept to use as the origin of the sweep ratio + sweepHeapLiveBasis uint64 // value of gcController.heapLive to use as the origin of sweep ratio; written with lock, read without + sweepPagesPerByte float64 // proportional sweep ratio; written with lock, read without + + // Page reclaimer state + + // reclaimIndex is the page index in allArenas of next page to + // reclaim. Specifically, it refers to page (i % + // pagesPerArena) of arena allArenas[i / pagesPerArena]. + // + // If this is >= 1<<63, the page reclaimer is done scanning + // the page marks. + reclaimIndex atomic.Uint64 + + // reclaimCredit is spare credit for extra pages swept. Since + // the page reclaimer works in large chunks, it may reclaim + // more than requested. Any spare pages released go to this + // credit pool. + reclaimCredit atomic.Uintptr + + // arenas is the heap arena map. It points to the metadata for + // the heap for every arena frame of the entire usable virtual + // address space. + // + // Use arenaIndex to compute indexes into this array. + // + // For regions of the address space that are not backed by the + // Go heap, the arena map contains nil. + // + // Modifications are protected by mheap_.lock. Reads can be + // performed without locking; however, a given entry can + // transition from nil to non-nil at any time when the lock + // isn't held. (Entries never transition back to nil.) + // + // In general, this is a two-level mapping consisting of an L1 + // map and possibly many L2 maps. This saves space when there + // are a huge number of arena frames. However, on many + // platforms (even 64-bit), arenaL1Bits is 0, making this + // effectively a single-level map. In this case, arenas[0] + // will never be nil.
+ arenas [1 << arenaL1Bits]*[1 << arenaL2Bits]*heapArena + + // heapArenaAlloc is pre-reserved space for allocating heapArena + // objects. This is only used on 32-bit, where we pre-reserve + // this space to avoid interleaving it with the heap itself. + heapArenaAlloc linearAlloc + + // arenaHints is a list of addresses at which to attempt to + // add more heap arenas. This is initially populated with a + // set of general hint addresses, and grown with the bounds of + // actual heap arena ranges. + arenaHints *arenaHint + + // arena is a pre-reserved space for allocating heap arenas + // (the actual arenas). This is only used on 32-bit. + arena linearAlloc + + // allArenas is the arenaIndex of every mapped arena. This can + // be used to iterate through the address space. + // + // Access is protected by mheap_.lock. However, since this is + // append-only and old backing arrays are never freed, it is + // safe to acquire mheap_.lock, copy the slice header, and + // then release mheap_.lock. + allArenas []arenaIdx + + // sweepArenas is a snapshot of allArenas taken at the + // beginning of the sweep cycle. This can be read safely by + // simply blocking GC (by disabling preemption). + sweepArenas []arenaIdx + + // markArenas is a snapshot of allArenas taken at the beginning + // of the mark cycle. Because allArenas is append-only, neither + // this slice nor its contents will change during the mark, so + // it can be read safely. + markArenas []arenaIdx + + // curArena is the arena that the heap is currently growing + // into. This should always be physPageSize-aligned. + curArena struct { + base, end uintptr + } + + // central free lists for small size classes. + // the padding makes sure that the mcentrals are + // spaced CacheLinePadSize bytes apart, so that each mcentral.lock + // gets its own cache line. + // central is indexed by spanClass. 
+ central [numSpanClasses]struct { + mcentral mcentral + pad [(cpu.CacheLinePadSize - unsafe.Sizeof(mcentral{})%cpu.CacheLinePadSize) % cpu.CacheLinePadSize]byte + } + + spanalloc fixalloc // allocator for span* + cachealloc fixalloc // allocator for mcache* + specialfinalizeralloc fixalloc // allocator for specialfinalizer* + specialprofilealloc fixalloc // allocator for specialprofile* + specialReachableAlloc fixalloc // allocator for specialReachable + speciallock mutex // lock for special record allocators. + arenaHintAlloc fixalloc // allocator for arenaHints + + // User arena state. + // + // Protected by mheap_.lock. + userArena struct { + // arenaHints is a list of addresses at which to attempt to + // add more heap arenas for user arena chunks. This is initially + // populated with a set of general hint addresses, and grown with + // the bounds of actual heap arena ranges. + arenaHints *arenaHint + + // quarantineList is a list of user arena spans that have been set to fault, but + // are waiting for all pointers into them to go away. Sweeping handles + // identifying when this is true, and moves the span to the ready list. + quarantineList mSpanList + + // readyList is a list of empty user arena spans that are ready for reuse. + readyList mSpanList + } + + unused *specialfinalizer // never set, just here to force the specialfinalizer type into DWARF +} + +var mheap_ mheap + +// A heapArena stores metadata for a heap arena. heapArenas are stored +// outside of the Go heap and accessed via the mheap_.arenas index. +type heapArena struct { + _ sys.NotInHeap + + // bitmap stores the pointer/scalar bitmap for the words in + // this arena. See mbitmap.go for a description. + // This array uses 1 bit per word of heap, or 1.6% of the heap size (for 64-bit). + bitmap [heapArenaBitmapWords]uintptr + + // If the ith bit of noMorePtrs is true, then there are no more + // pointers for the object containing the word described by the + // high bit of bitmap[i]. 
+ // In that case, bitmap[i+1], ... must be zero until the start + // of the next object. + // We never operate on these entries using bit-parallel techniques, + // so it is ok if they are small. Also, they can't be bigger than + // uint16 because at that size a single noMorePtrs entry + // represents 8K of memory, the minimum size of a span. Any larger + // and we'd have to worry about concurrent updates. + // This array uses 1 bit per word of bitmap, or .024% of the heap size (for 64-bit). + noMorePtrs [heapArenaBitmapWords / 8]uint8 + + // spans maps from virtual address page ID within this arena to *mspan. + // For allocated spans, their pages map to the span itself. + // For free spans, only the lowest and highest pages map to the span itself. + // Internal pages map to an arbitrary span. + // For pages that have never been allocated, spans entries are nil. + // + // Modifications are protected by mheap.lock. Reads can be + // performed without locking, but ONLY from indexes that are + // known to contain in-use or stack spans. This means there + // must not be a safe-point between establishing that an + // address is live and looking it up in the spans array. + spans [pagesPerArena]*mspan + + // pageInUse is a bitmap that indicates which spans are in + // state mSpanInUse. This bitmap is indexed by page number, + // but only the bit corresponding to the first page in each + // span is used. + // + // Reads and writes are atomic. + pageInUse [pagesPerArena / 8]uint8 + + // pageMarks is a bitmap that indicates which spans have any + // marked objects on them. Like pageInUse, only the bit + // corresponding to the first page in each span is used. + // + // Writes are done atomically during marking. Reads are + // non-atomic and lock-free since they only occur during + // sweeping (and hence never race with writes). + // + // This is used to quickly find whole spans that can be freed. 
+ // + // TODO(austin): It would be nice if this was uint64 for + // faster scanning, but we don't have 64-bit atomic bit + // operations. + pageMarks [pagesPerArena / 8]uint8 + + // pageSpecials is a bitmap that indicates which spans have + // specials (finalizers or other). Like pageInUse, only the bit + // corresponding to the first page in each span is used. + // + // Writes are done atomically whenever a special is added to + // a span and whenever the last special is removed from a span. + // Reads are done atomically to find spans containing specials + // during marking. + pageSpecials [pagesPerArena / 8]uint8 + + // checkmarks stores the debug.gccheckmark state. It is only + // used if debug.gccheckmark > 0. + checkmarks *checkmarksMap + + // zeroedBase marks the first byte of the first page in this + // arena which hasn't been used yet and is therefore already + // zero. zeroedBase is relative to the arena base. + // Increases monotonically until it hits heapArenaBytes. + // + // This field is sufficient to determine if an allocation + // needs to be zeroed because the page allocator follows an + // address-ordered first-fit policy. + // + // Read atomically and written with an atomic CAS. + zeroedBase uintptr +} + +// arenaHint is a hint for where to grow the heap arenas. See +// mheap_.arenaHints. +type arenaHint struct { + _ sys.NotInHeap + addr uintptr + down bool + next *arenaHint +} + +// An mspan is a run of pages. +// +// When a mspan is in the heap free treap, state == mSpanFree +// and heapmap(s->start) == span, heapmap(s->start+s->npages-1) == span. +// If the mspan is in the heap scav treap, then in addition to the +// above scavenged == true. scavenged == false in all other cases. +// +// When a mspan is allocated, state == mSpanInUse or mSpanManual +// and heapmap(i) == span for all s->start <= i < s->start+s->npages. + +// Every mspan is in one doubly-linked list, either in the mheap's +// busy list or one of the mcentral's span lists. 
+ +// An mspan representing actual memory has state mSpanInUse, +// mSpanManual, or mSpanFree. Transitions between these states are +// constrained as follows: +// +// - A span may transition from free to in-use or manual during any GC +// phase. +// +// - During sweeping (gcphase == _GCoff), a span may transition from +// in-use to free (as a result of sweeping) or manual to free (as a +// result of stacks being freed). +// +// - During GC (gcphase != _GCoff), a span *must not* transition from +// manual or in-use to free. Because concurrent GC may read a pointer +// and then look up its span, the span state must be monotonic. +// +// Setting mspan.state to mSpanInUse or mSpanManual must be done +// atomically and only after all other span fields are valid. +// Likewise, if inspecting a span is contingent on it being +// mSpanInUse, the state should be loaded atomically and checked +// before depending on other fields. This allows the garbage collector +// to safely deal with potentially invalid pointers, since resolving +// such pointers may race with a span being allocated. +type mSpanState uint8 + +const ( + mSpanDead mSpanState = iota + mSpanInUse // allocated for garbage collected heap + mSpanManual // allocated for manual management (e.g., stack allocator) +) + +// mSpanStateNames are the names of the span states, indexed by +// mSpanState. +var mSpanStateNames = []string{ + "mSpanDead", + "mSpanInUse", + "mSpanManual", +} + +// mSpanStateBox holds an atomic.Uint8 to provide atomic operations on +// an mSpanState. This is a separate type to disallow accidental comparison +// or assignment with mSpanState. +type mSpanStateBox struct { + s atomic.Uint8 +} + +// It is nosplit to match get, below. + +//go:nosplit +func (b *mSpanStateBox) set(s mSpanState) { + b.s.Store(uint8(s)) +} + +// It is nosplit because it's called indirectly by typedmemclr, +// which must not be preempted. 
+ +//go:nosplit +func (b *mSpanStateBox) get() mSpanState { + return mSpanState(b.s.Load()) +} + +// mSpanList heads a linked list of spans. +type mSpanList struct { + _ sys.NotInHeap + first *mspan // first span in list, or nil if none + last *mspan // last span in list, or nil if none +} + +type mspan struct { + _ sys.NotInHeap + next *mspan // next span in list, or nil if none + prev *mspan // previous span in list, or nil if none + list *mSpanList // For debugging. TODO: Remove. + + startAddr uintptr // address of first byte of span aka s.base() + npages uintptr // number of pages in span + + manualFreeList gclinkptr // list of free objects in mSpanManual spans + + // freeindex is the slot index between 0 and nelems at which to begin scanning + // for the next free object in this span. + // Each allocation scans allocBits starting at freeindex until it encounters a 0 + // indicating a free object. freeindex is then adjusted so that subsequent scans begin + // just past the newly discovered free object. + // + // If freeindex == nelems, this span has no free objects. + // + // allocBits is a bitmap of objects in this span. + // If n >= freeindex and allocBits[n/8] & (1<<(n%8)) is 0 + // then object n is free; + // otherwise, object n is allocated. Bits starting at nelems are + // undefined and should never be referenced. + // + // Object n starts at address n*elemsize + (start << pageShift). + freeindex uintptr + // TODO: Look up nelems from sizeclass and remove this field if it + // helps performance. + nelems uintptr // number of objects in the span. + + // Cache of the allocBits at freeindex. allocCache is shifted + // such that the lowest bit corresponds to the bit freeindex. + // allocCache holds the complement of allocBits, thus allowing + // ctz (count trailing zeros) to use it directly. + // allocCache may contain bits beyond s.nelems; the caller must ignore + // these.
+ allocCache uint64 + + // allocBits and gcmarkBits hold pointers to a span's mark and + // allocation bits. The pointers are 8-byte aligned. + // There are four arenas where this data is held. + // free: Dirty arenas that are no longer accessed + // and can be reused. + // next: Holds information to be used in the next GC cycle. + // current: Information being used during this GC cycle. + // previous: Information being used during the last GC cycle. + // A new GC cycle starts with the call to finishsweep_m. + // finishsweep_m moves the previous arena to the free arena, + // the current arena to the previous arena, and + // the next arena to the current arena. + // The next arena is populated as the spans request + // memory to hold gcmarkBits for the next GC cycle as well + // as allocBits for newly allocated spans. + // + // The pointer arithmetic is done "by hand" instead of using + // arrays to avoid bounds checks along critical performance + // paths. + // The sweep will free the old allocBits and set allocBits to the + // gcmarkBits. The gcmarkBits are replaced with freshly zeroed + // memory.
+ allocBits *gcBits + gcmarkBits *gcBits + + // sweep generation: + // if sweepgen == h->sweepgen - 2, the span needs sweeping + // if sweepgen == h->sweepgen - 1, the span is currently being swept + // if sweepgen == h->sweepgen, the span is swept and ready to use + // if sweepgen == h->sweepgen + 1, the span was cached before sweep began and is still cached, and needs sweeping + // if sweepgen == h->sweepgen + 3, the span was swept and then cached and is still cached + // h->sweepgen is incremented by 2 after every GC + + sweepgen uint32 + divMul uint32 // for divide by elemsize + allocCount uint16 // number of allocated objects + spanclass spanClass // size class and noscan (uint8) + state mSpanStateBox // mSpanInUse etc; accessed atomically (get/set methods) + needzero uint8 // needs to be zeroed before allocation + isUserArenaChunk bool // whether or not this span represents a user arena + allocCountBeforeCache uint16 // a copy of allocCount that is stored just before this span is cached + elemsize uintptr // computed from sizeclass or from npages + limit uintptr // end of data in span + speciallock mutex // guards specials list + specials *special // linked list of special records sorted by offset. + userArenaChunkFree addrRange // interval for managing chunk allocation + + // freeIndexForScan is like freeindex, except that freeindex is + // used by the allocator whereas freeIndexForScan is used by the + // GC scanner. They are two fields so that the GC sees the object + // is allocated only when the object and the heap bits are + // initialized (see also the assignment of freeIndexForScan in + // mallocgc, and issue 54596). + freeIndexForScan uintptr +} + +func (s *mspan) base() uintptr { + return s.startAddr +} + +func (s *mspan) layout() (size, n, total uintptr) { + total = s.npages << _PageShift + size = s.elemsize + if size > 0 { + n = total / size + } + return +} + +// recordspan adds a newly allocated span to h.allspans. 
+// +// This only happens the first time a span is allocated from +// mheap.spanalloc (it is not called when a span is reused). +// +// Write barriers are disallowed here because it can be called from +// gcWork when allocating new workbufs. However, because it's an +// indirect call from the fixalloc initializer, the compiler can't see +// this. +// +// The heap lock must be held. +// +//go:nowritebarrierrec +func recordspan(vh unsafe.Pointer, p unsafe.Pointer) { + h := (*mheap)(vh) + s := (*mspan)(p) + + assertLockHeld(&h.lock) + + if len(h.allspans) >= cap(h.allspans) { + n := 64 * 1024 / goarch.PtrSize + if n < cap(h.allspans)*3/2 { + n = cap(h.allspans) * 3 / 2 + } + var new []*mspan + sp := (*slice)(unsafe.Pointer(&new)) + sp.array = sysAlloc(uintptr(n)*goarch.PtrSize, &memstats.other_sys) + if sp.array == nil { + throw("runtime: cannot allocate memory") + } + sp.len = len(h.allspans) + sp.cap = n + if len(h.allspans) > 0 { + copy(new, h.allspans) + } + oldAllspans := h.allspans + *(*notInHeapSlice)(unsafe.Pointer(&h.allspans)) = *(*notInHeapSlice)(unsafe.Pointer(&new)) + if len(oldAllspans) != 0 { + sysFree(unsafe.Pointer(&oldAllspans[0]), uintptr(cap(oldAllspans))*unsafe.Sizeof(oldAllspans[0]), &memstats.other_sys) + } + } + h.allspans = h.allspans[:len(h.allspans)+1] + h.allspans[len(h.allspans)-1] = s +} + +// A spanClass represents the size class and noscan-ness of a span. +// +// Each size class has a noscan spanClass and a scan spanClass. The +// noscan spanClass contains only noscan objects, which do not contain +// pointers and thus do not need to be scanned by the garbage +// collector. 
+type spanClass uint8 + +const ( + numSpanClasses = _NumSizeClasses << 1 + tinySpanClass = spanClass(tinySizeClass<<1 | 1) +) + +func makeSpanClass(sizeclass uint8, noscan bool) spanClass { + return spanClass(sizeclass<<1) | spanClass(bool2int(noscan)) +} + +func (sc spanClass) sizeclass() int8 { + return int8(sc >> 1) +} + +func (sc spanClass) noscan() bool { + return sc&1 != 0 +} + +// arenaIndex returns the index into mheap_.arenas of the arena +// containing metadata for p. This index combines an index into the +// L1 map and an index into the L2 map and should be used as +// mheap_.arenas[ai.l1()][ai.l2()]. +// +// If p is outside the range of valid heap addresses, either l1() or +// l2() will be out of bounds. +// +// It is nosplit because it's called by spanOf and several other +// nosplit functions. +// +//go:nosplit +func arenaIndex(p uintptr) arenaIdx { + return arenaIdx((p - arenaBaseOffset) / heapArenaBytes) +} + +// arenaBase returns the low address of the region covered by heap +// arena i. +func arenaBase(i arenaIdx) uintptr { + return uintptr(i)*heapArenaBytes + arenaBaseOffset +} + +type arenaIdx uint + +// l1 returns the "l1" portion of an arenaIdx. +// +// Marked nosplit because it's called by spanOf and other nosplit +// functions. +// +//go:nosplit +func (i arenaIdx) l1() uint { + if arenaL1Bits == 0 { + // Let the compiler optimize this away if there's no + // L1 map. + return 0 + } else { + return uint(i) >> arenaL1Shift + } +} + +// l2 returns the "l2" portion of an arenaIdx. +// +// Marked nosplit because it's called by spanOf and other nosplit +// functions. +// +//go:nosplit +func (i arenaIdx) l2() uint { + if arenaL1Bits == 0 { + return uint(i) + } else { + return uint(i) & (1<<arenaL2Bits - 1) + } +} + +// inheap reports whether b is a pointer into a (potentially dead) heap object. +// It returns false for pointers into mSpanManual spans. +// Non-preemptible because it is used by write barriers.
+// +//go:nowritebarrier +//go:nosplit +func inheap(b uintptr) bool { + return spanOfHeap(b) != nil +} + +// inHeapOrStack is a variant of inheap that returns true for pointers +// into any allocated heap span. +// +//go:nowritebarrier +//go:nosplit +func inHeapOrStack(b uintptr) bool { + s := spanOf(b) + if s == nil || b < s.base() { + return false + } + switch s.state.get() { + case mSpanInUse, mSpanManual: + return b < s.limit + default: + return false + } +} + +// spanOf returns the span of p. If p does not point into the heap +// arena or no span has ever contained p, spanOf returns nil. +// +// If p does not point to allocated memory, this may return a non-nil +// span that does *not* contain p. If this is a possibility, the +// caller should either call spanOfHeap or check the span bounds +// explicitly. +// +// Must be nosplit because it has callers that are nosplit. +// +//go:nosplit +func spanOf(p uintptr) *mspan { + // This function looks big, but we use a lot of constant + // folding around arenaL1Bits to get it under the inlining + // budget. Also, many of the checks here are safety checks + // that Go needs to do anyway, so the generated code is quite + // short. + ri := arenaIndex(p) + if arenaL1Bits == 0 { + // If there's no L1, then ri.l1() can't be out of bounds but ri.l2() can. + if ri.l2() >= uint(len(mheap_.arenas[0])) { + return nil + } + } else { + // If there's an L1, then ri.l1() can be out of bounds but ri.l2() can't. + if ri.l1() >= uint(len(mheap_.arenas)) { + return nil + } + } + l2 := mheap_.arenas[ri.l1()] + if arenaL1Bits != 0 && l2 == nil { // Should never happen if there's no L1. + return nil + } + ha := l2[ri.l2()] + if ha == nil { + return nil + } + return ha.spans[(p/pageSize)%pagesPerArena] +} + +// spanOfUnchecked is equivalent to spanOf, but the caller must ensure +// that p points into an allocated heap arena. +// +// Must be nosplit because it has callers that are nosplit. 
+// +//go:nosplit +func spanOfUnchecked(p uintptr) *mspan { + ai := arenaIndex(p) + return mheap_.arenas[ai.l1()][ai.l2()].spans[(p/pageSize)%pagesPerArena] +} + +// spanOfHeap is like spanOf, but returns nil if p does not point to a +// heap object. +// +// Must be nosplit because it has callers that are nosplit. +// +//go:nosplit +func spanOfHeap(p uintptr) *mspan { + s := spanOf(p) + // s is nil if it's never been allocated. Otherwise, we check + // its state first because we don't trust this pointer, so we + // have to synchronize with span initialization. Then, it's + // still possible we picked up a stale span pointer, so we + // have to check the span's bounds. + if s == nil || s.state.get() != mSpanInUse || p < s.base() || p >= s.limit { + return nil + } + return s +} + +// pageIndexOf returns the arena, page index, and page mask for pointer p. +// The caller must ensure p is in the heap. +func pageIndexOf(p uintptr) (arena *heapArena, pageIdx uintptr, pageMask uint8) { + ai := arenaIndex(p) + arena = mheap_.arenas[ai.l1()][ai.l2()] + pageIdx = ((p / pageSize) / 8) % uintptr(len(arena.pageInUse)) + pageMask = byte(1 << ((p / pageSize) % 8)) + return +} + +// Initialize the heap. +func (h *mheap) init() { + lockInit(&h.lock, lockRankMheap) + lockInit(&h.speciallock, lockRankMheapSpecial) + + h.spanalloc.init(unsafe.Sizeof(mspan{}), recordspan, unsafe.Pointer(h), &memstats.mspan_sys) + h.cachealloc.init(unsafe.Sizeof(mcache{}), nil, nil, &memstats.mcache_sys) + h.specialfinalizeralloc.init(unsafe.Sizeof(specialfinalizer{}), nil, nil, &memstats.other_sys) + h.specialprofilealloc.init(unsafe.Sizeof(specialprofile{}), nil, nil, &memstats.other_sys) + h.specialReachableAlloc.init(unsafe.Sizeof(specialReachable{}), nil, nil, &memstats.other_sys) + h.arenaHintAlloc.init(unsafe.Sizeof(arenaHint{}), nil, nil, &memstats.other_sys) + + // Don't zero mspan allocations. 
Background sweeping can + // inspect a span concurrently with allocating it, so it's + // important that the span's sweepgen survive across freeing + // and re-allocating a span to prevent background sweeping + // from improperly cas'ing it from 0. + // + // This is safe because mspan contains no heap pointers. + h.spanalloc.zero = false + + // h->mapcache needs no init + + for i := range h.central { + h.central[i].mcentral.init(spanClass(i)) + } + + h.pages.init(&h.lock, &memstats.gcMiscSys) +} + +// reclaim sweeps and reclaims at least npage pages into the heap. +// It is called before allocating npage pages to keep growth in check. +// +// reclaim implements the page-reclaimer half of the sweeper. +// +// h.lock must NOT be held. +func (h *mheap) reclaim(npage uintptr) { + // TODO(austin): Half of the time spent freeing spans is in + // locking/unlocking the heap (even with low contention). We + // could make the slow path here several times faster by + // batching heap frees. + + // Bail early if there's no more reclaim work. + if h.reclaimIndex.Load() >= 1<<63 { + return + } + + // Disable preemption so the GC can't start while we're + // sweeping, so we can read h.sweepArenas, and so + // traceGCSweepStart/Done pair on the P. + mp := acquirem() + + if trace.enabled { + traceGCSweepStart() + } + + arenas := h.sweepArenas + locked := false + for npage > 0 { + // Pull from accumulated credit first. + if credit := h.reclaimCredit.Load(); credit > 0 { + take := credit + if take > npage { + // Take only what we need. + take = npage + } + if h.reclaimCredit.CompareAndSwap(credit, credit-take) { + npage -= take + } + continue + } + + // Claim a chunk of work. + idx := uintptr(h.reclaimIndex.Add(pagesPerReclaimerChunk) - pagesPerReclaimerChunk) + if idx/pagesPerArena >= uintptr(len(arenas)) { + // Page reclaiming is done. + h.reclaimIndex.Store(1 << 63) + break + } + + if !locked { + // Lock the heap for reclaimChunk. 
+ lock(&h.lock) + locked = true + } + + // Scan this chunk. + nfound := h.reclaimChunk(arenas, idx, pagesPerReclaimerChunk) + if nfound <= npage { + npage -= nfound + } else { + // Put spare pages toward global credit. + h.reclaimCredit.Add(nfound - npage) + npage = 0 + } + } + if locked { + unlock(&h.lock) + } + + if trace.enabled { + traceGCSweepDone() + } + releasem(mp) +} + +// reclaimChunk sweeps unmarked spans that start at page indexes [pageIdx, pageIdx+n). +// It returns the number of pages returned to the heap. +// +// h.lock must be held and the caller must be non-preemptible. Note: h.lock may be +// temporarily unlocked and re-locked in order to do sweeping or if tracing is +// enabled. +func (h *mheap) reclaimChunk(arenas []arenaIdx, pageIdx, n uintptr) uintptr { + // The heap lock must be held because this accesses the + // heapArena.spans arrays using potentially non-live pointers. + // In particular, if a span were freed and merged concurrently + // with this probing heapArena.spans, it would be possible to + // observe arbitrary, stale span pointers. + assertLockHeld(&h.lock) + + n0 := n + var nFreed uintptr + sl := sweep.active.begin() + if !sl.valid { + return 0 + } + for n > 0 { + ai := arenas[pageIdx/pagesPerArena] + ha := h.arenas[ai.l1()][ai.l2()] + + // Get a chunk of the bitmap to work on. + arenaPage := uint(pageIdx % pagesPerArena) + inUse := ha.pageInUse[arenaPage/8:] + marked := ha.pageMarks[arenaPage/8:] + if uintptr(len(inUse)) > n/8 { + inUse = inUse[:n/8] + marked = marked[:n/8] + } + + // Scan this bitmap chunk for spans that are in-use + // but have no marked objects on them. 
+ for i := range inUse { + inUseUnmarked := atomic.Load8(&inUse[i]) &^ marked[i] + if inUseUnmarked == 0 { + continue + } + + for j := uint(0); j < 8; j++ { + if inUseUnmarked&(1<<j) != 0 { + s := ha.spans[arenaPage+uint(i)*8+j] + if s, ok := sl.tryAcquire(s); ok { + npages := s.npages + unlock(&h.lock) + if s.sweep(false) { + nFreed += npages + } + lock(&h.lock) + // Reload inUse. It's possible nearby + // spans were freed when we dropped the + // lock and we don't want to get stale + // pointers from the spans array. + inUseUnmarked = atomic.Load8(&inUse[i]) &^ marked[i] + } + } + } + } + + // Advance. + pageIdx += uintptr(len(inUse) * 8) + n -= uintptr(len(inUse) * 8) + } + sweep.active.end(sl) + if trace.enabled { + unlock(&h.lock) + // Account for pages scanned but not reclaimed. + traceGCSweepSpan((n0 - nFreed) * pageSize) + lock(&h.lock) + } + + assertLockHeld(&h.lock) // Must be locked on return. + return nFreed +} + +// spanAllocType represents the type of allocation to make, or +// the type of allocation to be freed. +type spanAllocType uint8 + +const ( + spanAllocHeap spanAllocType = iota // heap span + spanAllocStack // stack span + spanAllocPtrScalarBits // unrolled GC prog bitmap span + spanAllocWorkBuf // work buf span +) + +// manual returns true if the span allocation is manually managed. +func (s spanAllocType) manual() bool { + return s != spanAllocHeap +} + +// alloc allocates a new span of npage pages from the GC'd heap. +// +// spanclass indicates the span's size class and scannability. +// +// Returns a span that has been fully initialized. span.needzero indicates +// whether the span has been zeroed. Note that it may not be. +func (h *mheap) alloc(npages uintptr, spanclass spanClass) *mspan { + // Don't do any operations that lock the heap on the G stack. + // It might trigger stack growth, and the stack growth code needs + // to be able to allocate heap. 
+ var s *mspan + systemstack(func() { + // To prevent excessive heap growth, before allocating n pages + // we need to sweep and reclaim at least n pages. + if !isSweepDone() { + h.reclaim(npages) + } + s = h.allocSpan(npages, spanAllocHeap, spanclass) + }) + return s +} + +// allocManual allocates a manually-managed span of npage pages. +// allocManual returns nil if allocation fails. +// +// allocManual adds the bytes used to *stat, which should be a +// memstats in-use field. Unlike allocations in the GC'd heap, the +// allocation does *not* count toward heapInUse. +// +// The memory backing the returned span may not be zeroed if +// span.needzero is set. +// +// allocManual must be called on the system stack because it may +// acquire the heap lock via allocSpan. See mheap for details. +// +// If new code is written to call allocManual, do NOT use an +// existing spanAllocType value and instead declare a new one. +// +//go:systemstack +func (h *mheap) allocManual(npages uintptr, typ spanAllocType) *mspan { + if !typ.manual() { + throw("manual span allocation called with non-manually-managed type") + } + return h.allocSpan(npages, typ, 0) +} + +// setSpans modifies the span map so [spanOf(base), spanOf(base+npage*pageSize)) +// is s. +func (h *mheap) setSpans(base, npage uintptr, s *mspan) { + p := base / pageSize + ai := arenaIndex(base) + ha := h.arenas[ai.l1()][ai.l2()] + for n := uintptr(0); n < npage; n++ { + i := (p + n) % pagesPerArena + if i == 0 { + ai = arenaIndex(base + n*pageSize) + ha = h.arenas[ai.l1()][ai.l2()] + } + ha.spans[i] = s + } +} + +// allocNeedsZero checks if the region of address space [base, base+npage*pageSize), +// assumed to be allocated, needs to be zeroed, updating heap arena metadata for +// future allocations. +// +// This must be called each time pages are allocated from the heap, even if the page +// allocator can otherwise prove the memory it's allocating is already zero because +// they're fresh from the operating system. 
It updates heapArena metadata that is +// critical for future page allocations. +// +// There are no locking constraints on this method. +func (h *mheap) allocNeedsZero(base, npage uintptr) (needZero bool) { + for npage > 0 { + ai := arenaIndex(base) + ha := h.arenas[ai.l1()][ai.l2()] + + zeroedBase := atomic.Loaduintptr(&ha.zeroedBase) + arenaBase := base % heapArenaBytes + if arenaBase < zeroedBase { + // We extended into the non-zeroed part of the + // arena, so this region needs to be zeroed before use. + // + // zeroedBase is monotonically increasing, so if we see this now then + // we can be sure we need to zero this memory region. + // + // We still need to update zeroedBase for this arena, and + // potentially more arenas. + needZero = true + } + // We may observe arenaBase > zeroedBase if we're racing with one or more + // allocations which are acquiring memory directly before us in the address + // space. But, because we know no one else is acquiring *this* memory, it's + // still safe to not zero. + + // Compute how far into the arena we extend into, capped + // at heapArenaBytes. + arenaLimit := arenaBase + npage*pageSize + if arenaLimit > heapArenaBytes { + arenaLimit = heapArenaBytes + } + // Increase ha.zeroedBase so it's >= arenaLimit. + // We may be racing with other updates. + for arenaLimit > zeroedBase { + if atomic.Casuintptr(&ha.zeroedBase, zeroedBase, arenaLimit) { + break + } + zeroedBase = atomic.Loaduintptr(&ha.zeroedBase) + // Double check basic conditions of zeroedBase. + if zeroedBase <= arenaLimit && zeroedBase > arenaBase { + // The zeroedBase moved into the space we were trying to + // claim. That's very bad, and indicates someone allocated + // the same region we did. + throw("potentially overlapping in-use allocations detected") + } + } + + // Move base forward and subtract from npage to move into + // the next arena, or finish. 
+ base += arenaLimit - arenaBase + npage -= (arenaLimit - arenaBase) / pageSize + } + return +} + +// tryAllocMSpan attempts to allocate an mspan object from +// the P-local cache, but may fail. +// +// h.lock need not be held. +// +// The caller must ensure that its P won't change underneath +// it during this function. Currently, to ensure that, we enforce +// that the function is run on the system stack, because that's +// the only place it is used now. In the future, this requirement +// may be relaxed if its use is necessary elsewhere. +// +//go:systemstack +func (h *mheap) tryAllocMSpan() *mspan { + pp := getg().m.p.ptr() + // If we don't have a p or the cache is empty, we can't do + // anything here. + if pp == nil || pp.mspancache.len == 0 { + return nil + } + // Pull off the last entry in the cache. + s := pp.mspancache.buf[pp.mspancache.len-1] + pp.mspancache.len-- + return s +} + +// allocMSpanLocked allocates an mspan object. +// +// h.lock must be held. +// +// allocMSpanLocked must be called on the system stack because +// its caller holds the heap lock. See mheap for details. +// Running on the system stack also ensures that we won't +// switch Ps during this function. See tryAllocMSpan for details. +// +//go:systemstack +func (h *mheap) allocMSpanLocked() *mspan { + assertLockHeld(&h.lock) + + pp := getg().m.p.ptr() + if pp == nil { + // We don't have a p so just do the normal thing. + return (*mspan)(h.spanalloc.alloc()) + } + // Refill the cache if necessary. + if pp.mspancache.len == 0 { + const refillCount = len(pp.mspancache.buf) / 2 + for i := 0; i < refillCount; i++ { + pp.mspancache.buf[i] = (*mspan)(h.spanalloc.alloc()) + } + pp.mspancache.len = refillCount + } + // Pull off the last entry in the cache. + s := pp.mspancache.buf[pp.mspancache.len-1] + pp.mspancache.len-- + return s +} + +// freeMSpanLocked frees an mspan object. +// +// h.lock must be held.
+// +// freeMSpanLocked must be called on the system stack because +// its caller holds the heap lock. See mheap for details. +// Running on the system stack also ensures that we won't +// switch Ps during this function. See tryAllocMSpan for details. +// +//go:systemstack +func (h *mheap) freeMSpanLocked(s *mspan) { + assertLockHeld(&h.lock) + + pp := getg().m.p.ptr() + // First try to free the mspan directly to the cache. + if pp != nil && pp.mspancache.len < len(pp.mspancache.buf) { + pp.mspancache.buf[pp.mspancache.len] = s + pp.mspancache.len++ + return + } + // Failing that (or if we don't have a p), just free it to + // the heap. + h.spanalloc.free(unsafe.Pointer(s)) +} + +// allocSpan allocates an mspan which owns npages worth of memory. +// +// If typ.manual() == false, allocSpan allocates a heap span of class spanclass +// and updates heap accounting. If typ.manual() == true, allocSpan allocates a +// manually-managed span (spanclass is ignored), and the caller is +// responsible for any accounting related to its use of the span. Either +// way, allocSpan will atomically add the bytes in the newly allocated +// span to *sysStat. +// +// The returned span is fully initialized. +// +// h.lock must not be held. +// +// allocSpan must be called on the system stack both because it acquires +// the heap lock and because it must block GC transitions. +// +//go:systemstack +func (h *mheap) allocSpan(npages uintptr, typ spanAllocType, spanclass spanClass) (s *mspan) { + // Function-global state. + gp := getg() + base, scav := uintptr(0), uintptr(0) + growth := uintptr(0) + + // On some platforms we need to provide physical page aligned stack + // allocations. Where the page size is less than the physical page + // size, we already manage to do this by default. + needPhysPageAlign := physPageAlignedStacks && typ == spanAllocStack && pageSize < physPageSize + + // If the allocation is small enough, try the page cache!
+ // The page cache does not support aligned allocations, so we cannot use + // it if we need to provide a physical page aligned stack allocation. + pp := gp.m.p.ptr() + if !needPhysPageAlign && pp != nil && npages < pageCachePages/4 { + c := &pp.pcache + + // If the cache is empty, refill it. + if c.empty() { + lock(&h.lock) + *c = h.pages.allocToCache() + unlock(&h.lock) + } + + // Try to allocate from the cache. + base, scav = c.alloc(npages) + if base != 0 { + s = h.tryAllocMSpan() + if s != nil { + goto HaveSpan + } + // We have a base but no mspan, so we need + // to lock the heap. + } + } + + // For one reason or another, we couldn't get the + // whole job done without the heap lock. + lock(&h.lock) + + if needPhysPageAlign { + // Overallocate by a physical page to allow for later alignment. + extraPages := physPageSize / pageSize + + // Find a big enough region first, but then only allocate the + // aligned portion. We can't just allocate and then free the + // edges because we need to account for scavenged memory, and + // that's difficult with alloc. + // + // Note that we skip updates to searchAddr here. It's OK if + // it's stale and higher than normal; it'll operate correctly, + // just come with a performance cost. + base, _ = h.pages.find(npages + extraPages) + if base == 0 { + var ok bool + growth, ok = h.grow(npages + extraPages) + if !ok { + unlock(&h.lock) + return nil + } + base, _ = h.pages.find(npages + extraPages) + if base == 0 { + throw("grew heap, but no adequate free space found") + } + } + base = alignUp(base, physPageSize) + scav = h.pages.allocRange(base, npages) + } + + if base == 0 { + // Try to acquire a base address. 
+ base, scav = h.pages.alloc(npages) + if base == 0 { + var ok bool + growth, ok = h.grow(npages) + if !ok { + unlock(&h.lock) + return nil + } + base, scav = h.pages.alloc(npages) + if base == 0 { + throw("grew heap, but no adequate free space found") + } + } + } + if s == nil { + // We failed to get an mspan earlier, so grab + // one now that we have the heap lock. + s = h.allocMSpanLocked() + } + unlock(&h.lock) + +HaveSpan: + // Decide if we need to scavenge in response to what we just allocated. + // Specifically, we track the maximum amount of memory to scavenge of all + // the alternatives below, assuming that the maximum satisfies *all* + // conditions we check (e.g. if we need to scavenge X to satisfy the + // memory limit and Y to satisfy heap-growth scavenging, and Y > X, then + // it's fine to pick Y, because the memory limit is still satisfied). + // + // It's fine to do this after allocating because we expect any scavenged + // pages not to get touched until we return. Simultaneously, it's important + // to do this before calling sysUsed because that may commit address space. + bytesToScavenge := uintptr(0) + if limit := gcController.memoryLimit.Load(); go119MemoryLimitSupport && !gcCPULimiter.limiting() { + // Assist with scavenging to maintain the memory limit by the amount + // that we expect to page in. + inuse := gcController.mappedReady.Load() + // Be careful about overflow, especially with uintptrs. Even on 32-bit platforms + // someone can set a really big memory limit that isn't maxInt64. + if uint64(scav)+inuse > uint64(limit) { + bytesToScavenge = uintptr(uint64(scav) + inuse - uint64(limit)) + } + } + if goal := scavenge.gcPercentGoal.Load(); goal != ^uint64(0) && growth > 0 { + // We just caused a heap growth, so scavenge down what will soon be used. + // By scavenging inline we deal with the failure to allocate out of + // memory fragments by scavenging the memory fragments that are least + // likely to be re-used. 
+ // + // Only bother with this because we're not using a memory limit. We don't + // care about heap growths as long as we're under the memory limit, and the + // previous check for scavenging already handles that. + if retained := heapRetained(); retained+uint64(growth) > goal { + // The scavenging algorithm requires the heap lock to be dropped so it + // can acquire it only sparingly. This is a potentially expensive operation + // so it frees up other goroutines to allocate in the meanwhile. In fact, + // they can make use of the growth we just created. + todo := growth + if overage := uintptr(retained + uint64(growth) - goal); todo > overage { + todo = overage + } + if todo > bytesToScavenge { + bytesToScavenge = todo + } + } + } + // There are a few very limited circumstances where we won't have a P here. + // It's OK to simply skip scavenging in these cases. Something else will notice + // and pick up the tab. + var now int64 + if pp != nil && bytesToScavenge > 0 { + // Measure how long we spent scavenging and add that measurement to the assist + // time so we can track it for the GC CPU limiter. + // + // Limiter event tracking might be disabled if we end up here + // while on a mark worker. + start := nanotime() + track := pp.limiterEvent.start(limiterEventScavengeAssist, start) + + // Scavenge, but back out if the limiter turns on. + h.pages.scavenge(bytesToScavenge, func() bool { + return gcCPULimiter.limiting() + }) + + // Finish up accounting. + now = nanotime() + if track { + pp.limiterEvent.stop(limiterEventScavengeAssist, now) + } + scavenge.assistTime.Add(now - start) + } + + // Initialize the span. + h.initSpan(s, typ, spanclass, base, npages) + + // Commit and account for any scavenged memory that the span now owns. + nbytes := npages * pageSize + if scav != 0 { + // sysUsed all the pages that are actually available + // in the span since some of them might be scavenged.
+ sysUsed(unsafe.Pointer(base), nbytes, scav) + gcController.heapReleased.add(-int64(scav)) + } + // Update stats. + gcController.heapFree.add(-int64(nbytes - scav)) + if typ == spanAllocHeap { + gcController.heapInUse.add(int64(nbytes)) + } + // Update consistent stats. + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.committed, int64(scav)) + atomic.Xaddint64(&stats.released, -int64(scav)) + switch typ { + case spanAllocHeap: + atomic.Xaddint64(&stats.inHeap, int64(nbytes)) + case spanAllocStack: + atomic.Xaddint64(&stats.inStacks, int64(nbytes)) + case spanAllocPtrScalarBits: + atomic.Xaddint64(&stats.inPtrScalarBits, int64(nbytes)) + case spanAllocWorkBuf: + atomic.Xaddint64(&stats.inWorkBufs, int64(nbytes)) + } + memstats.heapStats.release() + + pageTraceAlloc(pp, now, base, npages) + return s +} + +// initSpan initializes a blank span s which will represent the range +// [base, base+npages*pageSize). typ is the type of span being allocated. +func (h *mheap) initSpan(s *mspan, typ spanAllocType, spanclass spanClass, base, npages uintptr) { + // At this point, both s != nil and base != 0, and the heap + // lock is no longer held. Initialize the span. + s.init(base, npages) + if h.allocNeedsZero(base, npages) { + s.needzero = 1 + } + nbytes := npages * pageSize + if typ.manual() { + s.manualFreeList = 0 + s.nelems = 0 + s.limit = s.base() + s.npages*pageSize + s.state.set(mSpanManual) + } else { + // We must set span properties before the span is published anywhere + // since we're not holding the heap lock. + s.spanclass = spanclass + if sizeclass := spanclass.sizeclass(); sizeclass == 0 { + s.elemsize = nbytes + s.nelems = 1 + s.divMul = 0 + } else { + s.elemsize = uintptr(class_to_size[sizeclass]) + s.nelems = nbytes / s.elemsize + s.divMul = class_to_divmagic[sizeclass] + } + + // Initialize mark and allocation structures. + s.freeindex = 0 + s.freeIndexForScan = 0 + s.allocCache = ^uint64(0) // all 1s indicating all free. 
+ s.gcmarkBits = newMarkBits(s.nelems) + s.allocBits = newAllocBits(s.nelems) + + // It's safe to access h.sweepgen without the heap lock because it's + // only ever updated with the world stopped and we run on the + // systemstack which blocks a STW transition. + atomic.Store(&s.sweepgen, h.sweepgen) + + // Now that the span is filled in, set its state. This + // is a publication barrier for the other fields in + // the span. While valid pointers into this span + // should never be visible until the span is returned, + // if the garbage collector finds an invalid pointer, + // access to the span may race with initialization of + // the span. We resolve this race by atomically + // setting the state after the span is fully + // initialized, and atomically checking the state in + // any situation where a pointer is suspect. + s.state.set(mSpanInUse) + } + + // Publish the span in various locations. + + // This is safe to call without the lock held because the slots + // related to this span will only ever be read or modified by + // this thread until pointers into the span are published (and + // we execute a publication barrier at the end of this function + // before that happens) or pageInUse is updated. + h.setSpans(s.base(), npages, s) + + if !typ.manual() { + // Mark in-use span in arena page bitmap. + // + // This publishes the span to the page sweeper, so + // it's imperative that the span be completely initialized + // prior to this line. + arena, pageIdx, pageMask := pageIndexOf(s.base()) + atomic.Or8(&arena.pageInUse[pageIdx], pageMask) + + // Update related page sweeper stats. + h.pagesInUse.Add(npages) + } + + // Make sure the newly allocated span will be observed + // by the GC before pointers into the span are published. + publicationBarrier() +} + +// Try to add at least npage pages of memory to the heap, +// returning how much the heap grew by and whether it worked. +// +// h.lock must be held. 
+func (h *mheap) grow(npage uintptr) (uintptr, bool) { + assertLockHeld(&h.lock) + + // We must grow the heap in whole palloc chunks. + // We call sysMap below but note that because we + // round up to pallocChunkPages which is on the order + // of MiB (generally >= the huge page size) we + // won't be calling it too much. + ask := alignUp(npage, pallocChunkPages) * pageSize + + totalGrowth := uintptr(0) + // This may overflow because ask could be very large + // and is otherwise unrelated to h.curArena.base. + end := h.curArena.base + ask + nBase := alignUp(end, physPageSize) + if nBase > h.curArena.end || /* overflow */ end < h.curArena.base { + // Not enough room in the current arena. Allocate more + // arena space. This may not be contiguous with the + // current arena, so we have to request the full ask. + av, asize := h.sysAlloc(ask, &h.arenaHints, true) + if av == nil { + inUse := gcController.heapFree.load() + gcController.heapReleased.load() + gcController.heapInUse.load() + print("runtime: out of memory: cannot allocate ", ask, "-byte block (", inUse, " in use)\n") + return 0, false + } + + if uintptr(av) == h.curArena.end { + // The new space is contiguous with the old + // space, so just extend the current space. + h.curArena.end = uintptr(av) + asize + } else { + // The new space is discontiguous. Track what + // remains of the current space and switch to + // the new space. This should be rare. + if size := h.curArena.end - h.curArena.base; size != 0 { + // Transition this space from Reserved to Prepared and mark it + // as released since we'll be able to start using it after updating + // the page allocator and releasing the lock at any time. + sysMap(unsafe.Pointer(h.curArena.base), size, &gcController.heapReleased) + // Update stats. + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.released, int64(size)) + memstats.heapStats.release() + // Update the page allocator's structures to make this + // space ready for allocation.
+ h.pages.grow(h.curArena.base, size) + totalGrowth += size + } + // Switch to the new space. + h.curArena.base = uintptr(av) + h.curArena.end = uintptr(av) + asize + } + + // Recalculate nBase. + // We know this won't overflow, because sysAlloc returned + // a valid region starting at h.curArena.base which is at + // least ask bytes in size. + nBase = alignUp(h.curArena.base+ask, physPageSize) + } + + // Grow into the current arena. + v := h.curArena.base + h.curArena.base = nBase + + // Transition the space we're going to use from Reserved to Prepared. + // + // The allocation is always aligned to the heap arena + // size which is always > physPageSize, so it's safe to + // just add directly to heapReleased. + sysMap(unsafe.Pointer(v), nBase-v, &gcController.heapReleased) + + // The memory just allocated counts as both released + // and idle, even though it's not yet backed by spans. + stats := memstats.heapStats.acquire() + atomic.Xaddint64(&stats.released, int64(nBase-v)) + memstats.heapStats.release() + + // Update the page allocator's structures to make this + // space ready for allocation. + h.pages.grow(v, nBase-v) + totalGrowth += nBase - v + return totalGrowth, true +} + +// Free the span back into the heap. +func (h *mheap) freeSpan(s *mspan) { + systemstack(func() { + pageTraceFree(getg().m.p.ptr(), 0, s.base(), s.npages) + + lock(&h.lock) + if msanenabled { + // Tell msan that this entire span is no longer in use. + base := unsafe.Pointer(s.base()) + bytes := s.npages << _PageShift + msanfree(base, bytes) + } + if asanenabled { + // Tell asan that this entire span is no longer in use. + base := unsafe.Pointer(s.base()) + bytes := s.npages << _PageShift + asanpoison(base, bytes) + } + h.freeSpanLocked(s, spanAllocHeap) + unlock(&h.lock) + }) +} + +// freeManual frees a manually-managed span returned by allocManual. +// typ must be the same as the spanAllocType passed to the allocManual that +// allocated s.
+// +// This must only be called when gcphase == _GCoff. See mSpanState for +// an explanation. +// +// freeManual must be called on the system stack because it acquires +// the heap lock. See mheap for details. +// +//go:systemstack +func (h *mheap) freeManual(s *mspan, typ spanAllocType) { + pageTraceFree(getg().m.p.ptr(), 0, s.base(), s.npages) + + s.needzero = 1 + lock(&h.lock) + h.freeSpanLocked(s, typ) + unlock(&h.lock) +} + +func (h *mheap) freeSpanLocked(s *mspan, typ spanAllocType) { + assertLockHeld(&h.lock) + + switch s.state.get() { + case mSpanManual: + if s.allocCount != 0 { + throw("mheap.freeSpanLocked - invalid stack free") + } + case mSpanInUse: + if s.isUserArenaChunk { + throw("mheap.freeSpanLocked - invalid free of user arena chunk") + } + if s.allocCount != 0 || s.sweepgen != h.sweepgen { + print("mheap.freeSpanLocked - span ", s, " ptr ", hex(s.base()), " allocCount ", s.allocCount, " sweepgen ", s.sweepgen, "/", h.sweepgen, "\n") + throw("mheap.freeSpanLocked - invalid free") + } + h.pagesInUse.Add(-s.npages) + + // Clear in-use bit in arena page bitmap. + arena, pageIdx, pageMask := pageIndexOf(s.base()) + atomic.And8(&arena.pageInUse[pageIdx], ^pageMask) + default: + throw("mheap.freeSpanLocked - invalid span state") + } + + // Update stats. + // + // Mirrors the code in allocSpan. + nbytes := s.npages * pageSize + gcController.heapFree.add(int64(nbytes)) + if typ == spanAllocHeap { + gcController.heapInUse.add(-int64(nbytes)) + } + // Update consistent stats. + stats := memstats.heapStats.acquire() + switch typ { + case spanAllocHeap: + atomic.Xaddint64(&stats.inHeap, -int64(nbytes)) + case spanAllocStack: + atomic.Xaddint64(&stats.inStacks, -int64(nbytes)) + case spanAllocPtrScalarBits: + atomic.Xaddint64(&stats.inPtrScalarBits, -int64(nbytes)) + case spanAllocWorkBuf: + atomic.Xaddint64(&stats.inWorkBufs, -int64(nbytes)) + } + memstats.heapStats.release() + + // Mark the space as free. 
+ h.pages.free(s.base(), s.npages, false) + + // Free the span structure. We no longer have a use for it. + s.state.set(mSpanDead) + h.freeMSpanLocked(s) +} + +// scavengeAll acquires the heap lock (blocking any additional +// manipulation of the page allocator) and iterates over the whole +// heap, scavenging every free page available. +func (h *mheap) scavengeAll() { + // Disallow malloc or panic while holding the heap lock. We do + // this here because this is a non-mallocgc entry-point to + // the mheap API. + gp := getg() + gp.m.mallocing++ + + released := h.pages.scavenge(^uintptr(0), nil) + + gp.m.mallocing-- + + if debug.scavtrace > 0 { + printScavTrace(released, true) + } +} + +//go:linkname runtime_debug_freeOSMemory runtime/debug.freeOSMemory +func runtime_debug_freeOSMemory() { + GC() + systemstack(func() { mheap_.scavengeAll() }) +} + +// Initialize a new span with the given start and npages. +func (span *mspan) init(base uintptr, npages uintptr) { + // span is *not* zeroed. + span.next = nil + span.prev = nil + span.list = nil + span.startAddr = base + span.npages = npages + span.allocCount = 0 + span.spanclass = 0 + span.elemsize = 0 + span.speciallock.key = 0 + span.specials = nil + span.needzero = 0 + span.freeindex = 0 + span.freeIndexForScan = 0 + span.allocBits = nil + span.gcmarkBits = nil + span.state.set(mSpanDead) + lockInit(&span.speciallock, lockRankMspanSpecial) +} + +func (span *mspan) inList() bool { + return span.list != nil +} + +// Initialize an empty doubly-linked list. 
+func (list *mSpanList) init() { + list.first = nil + list.last = nil +} + +func (list *mSpanList) remove(span *mspan) { + if span.list != list { + print("runtime: failed mSpanList.remove span.npages=", span.npages, + " span=", span, " prev=", span.prev, " span.list=", span.list, " list=", list, "\n") + throw("mSpanList.remove") + } + if list.first == span { + list.first = span.next + } else { + span.prev.next = span.next + } + if list.last == span { + list.last = span.prev + } else { + span.next.prev = span.prev + } + span.next = nil + span.prev = nil + span.list = nil +} + +func (list *mSpanList) isEmpty() bool { + return list.first == nil +} + +func (list *mSpanList) insert(span *mspan) { + if span.next != nil || span.prev != nil || span.list != nil { + println("runtime: failed mSpanList.insert", span, span.next, span.prev, span.list) + throw("mSpanList.insert") + } + span.next = list.first + if list.first != nil { + // The list contains at least one span; link it in. + // The last span in the list doesn't change. + list.first.prev = span + } else { + // The list contains no spans, so this is also the last span. + list.last = span + } + list.first = span + span.list = list +} + +func (list *mSpanList) insertBack(span *mspan) { + if span.next != nil || span.prev != nil || span.list != nil { + println("runtime: failed mSpanList.insertBack", span, span.next, span.prev, span.list) + throw("mSpanList.insertBack") + } + span.prev = list.last + if list.last != nil { + // The list contains at least one span. + list.last.next = span + } else { + // The list contains no spans, so this is also the first span. + list.first = span + } + list.last = span + span.list = list +} + +// takeAll removes all spans from other and inserts them at the front +// of list. +func (list *mSpanList) takeAll(other *mSpanList) { + if other.isEmpty() { + return + } + + // Reparent everything in other to list. 
+ for s := other.first; s != nil; s = s.next { + s.list = list + } + + // Concatenate the lists. + if list.isEmpty() { + *list = *other + } else { + // Neither list is empty. Put other before list. + other.last.next = list.first + list.first.prev = other.last + list.first = other.first + } + + other.first, other.last = nil, nil +} + +const ( + _KindSpecialFinalizer = 1 + _KindSpecialProfile = 2 + // _KindSpecialReachable is a special used for tracking + // reachability during testing. + _KindSpecialReachable = 3 + // Note: The finalizer special must be first because if we're freeing + // an object, a finalizer special will cause the freeing operation + // to abort, and we want to keep the other special records around + // if that happens. +) + +type special struct { + _ sys.NotInHeap + next *special // linked list in span + offset uint16 // span offset of object + kind byte // kind of special +} + +// spanHasSpecials marks a span as having specials in the arena bitmap. +func spanHasSpecials(s *mspan) { + arenaPage := (s.base() / pageSize) % pagesPerArena + ai := arenaIndex(s.base()) + ha := mheap_.arenas[ai.l1()][ai.l2()] + atomic.Or8(&ha.pageSpecials[arenaPage/8], uint8(1)<<(arenaPage%8)) +} + +// spanHasNoSpecials marks a span as having no specials in the arena bitmap. +func spanHasNoSpecials(s *mspan) { + arenaPage := (s.base() / pageSize) % pagesPerArena + ai := arenaIndex(s.base()) + ha := mheap_.arenas[ai.l1()][ai.l2()] + atomic.And8(&ha.pageSpecials[arenaPage/8], ^(uint8(1) << (arenaPage % 8))) +} + +// Adds the special record s to the list of special records for +// the object p. All fields of s should be filled in except for +// offset & next, which this routine will fill in. +// Returns true if the special was successfully added, false otherwise. +// (The add will fail only if a record with the same p and s->kind +// already exists.) 
+func addspecial(p unsafe.Pointer, s *special) bool { + span := spanOfHeap(uintptr(p)) + if span == nil { + throw("addspecial on invalid pointer") + } + + // Ensure that the span is swept. + // Sweeping accesses the specials list w/o locks, so we have + // to synchronize with it. And it's just much safer. + mp := acquirem() + span.ensureSwept() + + offset := uintptr(p) - span.base() + kind := s.kind + + lock(&span.speciallock) + + // Find splice point, check for existing record. + t := &span.specials + for { + x := *t + if x == nil { + break + } + if offset == uintptr(x.offset) && kind == x.kind { + unlock(&span.speciallock) + releasem(mp) + return false // already exists + } + if offset < uintptr(x.offset) || (offset == uintptr(x.offset) && kind < x.kind) { + break + } + t = &x.next + } + + // Splice in record, fill in offset. + s.offset = uint16(offset) + s.next = *t + *t = s + spanHasSpecials(span) + unlock(&span.speciallock) + releasem(mp) + + return true +} + +// Removes the Special record of the given kind for the object p. +// Returns the record if the record existed, nil otherwise. +// The caller must FixAlloc_Free the result. +func removespecial(p unsafe.Pointer, kind uint8) *special { + span := spanOfHeap(uintptr(p)) + if span == nil { + throw("removespecial on invalid pointer") + } + + // Ensure that the span is swept. + // Sweeping accesses the specials list w/o locks, so we have + // to synchronize with it. And it's just much safer. + mp := acquirem() + span.ensureSwept() + + offset := uintptr(p) - span.base() + + var result *special + lock(&span.speciallock) + t := &span.specials + for { + s := *t + if s == nil { + break + } + // This function is used for finalizers only, so we don't check for + // "interior" specials (p must be exactly equal to s->offset). 
+ if offset == uintptr(s.offset) && kind == s.kind { + *t = s.next + result = s + break + } + t = &s.next + } + if span.specials == nil { + spanHasNoSpecials(span) + } + unlock(&span.speciallock) + releasem(mp) + return result +} + +// The described object has a finalizer set for it. +// +// specialfinalizer is allocated from non-GC'd memory, so any heap +// pointers must be specially handled. +type specialfinalizer struct { + _ sys.NotInHeap + special special + fn *funcval // May be a heap pointer. + nret uintptr + fint *_type // May be a heap pointer, but always live. + ot *ptrtype // May be a heap pointer, but always live. +} + +// Adds a finalizer to the object p. Returns true if it succeeded. +func addfinalizer(p unsafe.Pointer, f *funcval, nret uintptr, fint *_type, ot *ptrtype) bool { + lock(&mheap_.speciallock) + s := (*specialfinalizer)(mheap_.specialfinalizeralloc.alloc()) + unlock(&mheap_.speciallock) + s.special.kind = _KindSpecialFinalizer + s.fn = f + s.nret = nret + s.fint = fint + s.ot = ot + if addspecial(p, &s.special) { + // This is responsible for maintaining the same + // GC-related invariants as markrootSpans in any + // situation where it's possible that markrootSpans + // has already run but mark termination hasn't yet. + if gcphase != _GCoff { + base, span, _ := findObject(uintptr(p), 0, 0) + mp := acquirem() + gcw := &mp.p.ptr().gcw + // Mark everything reachable from the object + // so it's retained for the finalizer. + if !span.spanclass.noscan() { + scanobject(base, gcw) + } + // Mark the finalizer itself, since the + // special isn't part of the GC'd heap. + scanblock(uintptr(unsafe.Pointer(&s.fn)), goarch.PtrSize, &oneptrmask[0], gcw, nil) + releasem(mp) + } + return true + } + + // There was an old finalizer + lock(&mheap_.speciallock) + mheap_.specialfinalizeralloc.free(unsafe.Pointer(s)) + unlock(&mheap_.speciallock) + return false +} + +// Removes the finalizer (if any) from the object p. 
+func removefinalizer(p unsafe.Pointer) { + s := (*specialfinalizer)(unsafe.Pointer(removespecial(p, _KindSpecialFinalizer))) + if s == nil { + return // there wasn't a finalizer to remove + } + lock(&mheap_.speciallock) + mheap_.specialfinalizeralloc.free(unsafe.Pointer(s)) + unlock(&mheap_.speciallock) +} + +// The described object is being heap profiled. +type specialprofile struct { + _ sys.NotInHeap + special special + b *bucket +} + +// Set the heap profile bucket associated with addr to b. +func setprofilebucket(p unsafe.Pointer, b *bucket) { + lock(&mheap_.speciallock) + s := (*specialprofile)(mheap_.specialprofilealloc.alloc()) + unlock(&mheap_.speciallock) + s.special.kind = _KindSpecialProfile + s.b = b + if !addspecial(p, &s.special) { + throw("setprofilebucket: profile already set") + } +} + +// specialReachable tracks whether an object is reachable on the next +// GC cycle. This is used by testing. +type specialReachable struct { + special special + done bool + reachable bool +} + +// specialsIter helps iterate over specials lists. +type specialsIter struct { + pprev **special + s *special +} + +func newSpecialsIter(span *mspan) specialsIter { + return specialsIter{&span.specials, span.specials} +} + +func (i *specialsIter) valid() bool { + return i.s != nil +} + +func (i *specialsIter) next() { + i.pprev = &i.s.next + i.s = *i.pprev +} + +// unlinkAndNext removes the current special from the list and moves +// the iterator to the next special. It returns the unlinked special. +func (i *specialsIter) unlinkAndNext() *special { + cur := i.s + i.s = cur.next + *i.pprev = i.s + return cur +} + +// freeSpecial performs any cleanup on special s and deallocates it. +// s must already be unlinked from the specials list. 
+func freeSpecial(s *special, p unsafe.Pointer, size uintptr) { + switch s.kind { + case _KindSpecialFinalizer: + sf := (*specialfinalizer)(unsafe.Pointer(s)) + queuefinalizer(p, sf.fn, sf.nret, sf.fint, sf.ot) + lock(&mheap_.speciallock) + mheap_.specialfinalizeralloc.free(unsafe.Pointer(sf)) + unlock(&mheap_.speciallock) + case _KindSpecialProfile: + sp := (*specialprofile)(unsafe.Pointer(s)) + mProf_Free(sp.b, size) + lock(&mheap_.speciallock) + mheap_.specialprofilealloc.free(unsafe.Pointer(sp)) + unlock(&mheap_.speciallock) + case _KindSpecialReachable: + sp := (*specialReachable)(unsafe.Pointer(s)) + sp.done = true + // The creator frees these. + default: + throw("bad special kind") + panic("not reached") + } +} + +// gcBits is an alloc/mark bitmap. This is always used as gcBits.x. +type gcBits struct { + _ sys.NotInHeap + x uint8 +} + +// bytep returns a pointer to the n'th byte of b. +func (b *gcBits) bytep(n uintptr) *uint8 { + return addb(&b.x, n) +} + +// bitp returns a pointer to the byte containing bit n and a mask for +// selecting that bit from *bytep. +func (b *gcBits) bitp(n uintptr) (bytep *uint8, mask uint8) { + return b.bytep(n / 8), 1 << (n % 8) +} + +const gcBitsChunkBytes = uintptr(64 << 10) +const gcBitsHeaderBytes = unsafe.Sizeof(gcBitsHeader{}) + +type gcBitsHeader struct { + free uintptr // free is the index into bits of the next free byte. + next uintptr // *gcBits triggers recursive type bug. (issue 14620) +} + +type gcBitsArena struct { + _ sys.NotInHeap + // gcBitsHeader // side step recursive type bug (issue 14620) by including fields by hand. + free uintptr // free is the index into bits of the next free byte; read/write atomically + next *gcBitsArena + bits [gcBitsChunkBytes - gcBitsHeaderBytes]gcBits +} + +var gcBitsArenas struct { + lock mutex + free *gcBitsArena + next *gcBitsArena // Read atomically. Write atomically under lock. 
+ current *gcBitsArena + previous *gcBitsArena +} + +// tryAlloc allocates from b or returns nil if b does not have enough room. +// This is safe to call concurrently. +func (b *gcBitsArena) tryAlloc(bytes uintptr) *gcBits { + if b == nil || atomic.Loaduintptr(&b.free)+bytes > uintptr(len(b.bits)) { + return nil + } + // Try to allocate from this block. + end := atomic.Xadduintptr(&b.free, bytes) + if end > uintptr(len(b.bits)) { + return nil + } + // There was enough room. + start := end - bytes + return &b.bits[start] +} + +// newMarkBits returns a pointer to 8 byte aligned bytes +// to be used for a span's mark bits. +func newMarkBits(nelems uintptr) *gcBits { + blocksNeeded := uintptr((nelems + 63) / 64) + bytesNeeded := blocksNeeded * 8 + + // Try directly allocating from the current head arena. + head := (*gcBitsArena)(atomic.Loadp(unsafe.Pointer(&gcBitsArenas.next))) + if p := head.tryAlloc(bytesNeeded); p != nil { + return p + } + + // There's not enough room in the head arena. We may need to + // allocate a new arena. + lock(&gcBitsArenas.lock) + // Try the head arena again, since it may have changed. Now + // that we hold the lock, the list head can't change, but its + // free position still can. + if p := gcBitsArenas.next.tryAlloc(bytesNeeded); p != nil { + unlock(&gcBitsArenas.lock) + return p + } + + // Allocate a new arena. This may temporarily drop the lock. + fresh := newArenaMayUnlock() + // If newArenaMayUnlock dropped the lock, another thread may + // have put a fresh arena on the "next" list. Try allocating + // from next again. + if p := gcBitsArenas.next.tryAlloc(bytesNeeded); p != nil { + // Put fresh back on the free list. + // TODO: Mark it "already zeroed" + fresh.next = gcBitsArenas.free + gcBitsArenas.free = fresh + unlock(&gcBitsArenas.lock) + return p + } + + // Allocate from the fresh arena. We haven't linked it in yet, so + // this cannot race and is guaranteed to succeed. 
+ p := fresh.tryAlloc(bytesNeeded) + if p == nil { + throw("markBits overflow") + } + + // Add the fresh arena to the "next" list. + fresh.next = gcBitsArenas.next + atomic.StorepNoWB(unsafe.Pointer(&gcBitsArenas.next), unsafe.Pointer(fresh)) + + unlock(&gcBitsArenas.lock) + return p +} + +// newAllocBits returns a pointer to 8 byte aligned bytes +// to be used for this span's alloc bits. +// newAllocBits is used to provide newly initialized spans +// allocation bits. For spans not being initialized the +// mark bits are repurposed as allocation bits when +// the span is swept. +func newAllocBits(nelems uintptr) *gcBits { + return newMarkBits(nelems) +} + +// nextMarkBitArenaEpoch establishes a new epoch for the arenas +// holding the mark bits. The arenas are named relative to the +// current GC cycle which is demarcated by the call to finishsweep_m. +// +// All current spans have been swept. +// During that sweep each span allocated room for its gcmarkBits in +// gcBitsArenas.next block. gcBitsArenas.next becomes the gcBitsArenas.current +// where the GC will mark objects and after each span is swept these bits +// will be used to allocate objects. +// gcBitsArenas.current becomes gcBitsArenas.previous where the span's +// gcAllocBits live until all the spans have been swept during this GC cycle. +// The span's sweep extinguishes all the references to gcBitsArenas.previous +// by pointing gcAllocBits into the gcBitsArenas.current. +// The gcBitsArenas.previous is released to the gcBitsArenas.free list. +func nextMarkBitArenaEpoch() { + lock(&gcBitsArenas.lock) + if gcBitsArenas.previous != nil { + if gcBitsArenas.free == nil { + gcBitsArenas.free = gcBitsArenas.previous + } else { + // Find end of previous arenas. 
+ last := gcBitsArenas.previous + for last = gcBitsArenas.previous; last.next != nil; last = last.next { + } + last.next = gcBitsArenas.free + gcBitsArenas.free = gcBitsArenas.previous + } + } + gcBitsArenas.previous = gcBitsArenas.current + gcBitsArenas.current = gcBitsArenas.next + atomic.StorepNoWB(unsafe.Pointer(&gcBitsArenas.next), nil) // newMarkBits calls newArenaMayUnlock when needed + unlock(&gcBitsArenas.lock) +} + +// newArenaMayUnlock allocates and zeroes a gcBits arena. +// The caller must hold gcBitsArenas.lock. This may temporarily release it. +func newArenaMayUnlock() *gcBitsArena { + var result *gcBitsArena + if gcBitsArenas.free == nil { + unlock(&gcBitsArenas.lock) + result = (*gcBitsArena)(sysAlloc(gcBitsChunkBytes, &memstats.gcMiscSys)) + if result == nil { + throw("runtime: cannot allocate memory") + } + lock(&gcBitsArenas.lock) + } else { + result = gcBitsArenas.free + gcBitsArenas.free = gcBitsArenas.free.next + memclrNoHeapPointers(unsafe.Pointer(result), gcBitsChunkBytes) + } + result.next = nil + // If result.bits is not 8 byte aligned adjust index so + // that &result.bits[result.free] is 8 byte aligned. + if uintptr(unsafe.Offsetof(gcBitsArena{}.bits))&7 == 0 { + result.free = 0 + } else { + result.free = 8 - (uintptr(unsafe.Pointer(&result.bits[0])) & 7) + } + return result +} diff --git a/src/runtime/mkduff.go b/src/runtime/mkduff.go new file mode 100644 index 0000000..6b42b85 --- /dev/null +++ b/src/runtime/mkduff.go @@ -0,0 +1,286 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// runtime·duffzero is a Duff's device for zeroing memory. +// The compiler jumps to computed addresses within +// the routine to zero chunks of memory. +// Do not change duffzero without also +// changing the uses in cmd/compile/internal/*/*.go. + +// runtime·duffcopy is a Duff's device for copying memory. 
+// The compiler jumps to computed addresses within +// the routine to copy chunks of memory. +// Source and destination must not overlap. +// Do not change duffcopy without also +// changing the uses in cmd/compile/internal/*/*.go. + +// See the zero* and copy* generators below +// for architecture-specific comments. + +// mkduff generates duff_*.s. +package main + +import ( + "bytes" + "fmt" + "io" + "log" + "os" +) + +func main() { + gen("amd64", notags, zeroAMD64, copyAMD64) + gen("386", notags, zero386, copy386) + gen("arm", notags, zeroARM, copyARM) + gen("arm64", notags, zeroARM64, copyARM64) + gen("loong64", notags, zeroLOONG64, copyLOONG64) + gen("ppc64x", tagsPPC64x, zeroPPC64x, copyPPC64x) + gen("mips64x", tagsMIPS64x, zeroMIPS64x, copyMIPS64x) + gen("riscv64", notags, zeroRISCV64, copyRISCV64) +} + +func gen(arch string, tags, zero, copy func(io.Writer)) { + var buf bytes.Buffer + + fmt.Fprintln(&buf, "// Code generated by mkduff.go; DO NOT EDIT.") + fmt.Fprintln(&buf, "// Run go generate from src/runtime to update.") + fmt.Fprintln(&buf, "// See mkduff.go for comments.") + tags(&buf) + fmt.Fprintln(&buf, "#include \"textflag.h\"") + fmt.Fprintln(&buf) + zero(&buf) + fmt.Fprintln(&buf) + copy(&buf) + + if err := os.WriteFile("duff_"+arch+".s", buf.Bytes(), 0644); err != nil { + log.Fatalln(err) + } +} + +func notags(w io.Writer) { fmt.Fprintln(w) } + +func zeroAMD64(w io.Writer) { + // X15: zero + // DI: ptr to memory to be zeroed + // DI is updated as a side effect. 
+ fmt.Fprintln(w, "TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT, $0-0") + for i := 0; i < 16; i++ { + fmt.Fprintln(w, "\tMOVUPS\tX15,(DI)") + fmt.Fprintln(w, "\tMOVUPS\tX15,16(DI)") + fmt.Fprintln(w, "\tMOVUPS\tX15,32(DI)") + fmt.Fprintln(w, "\tMOVUPS\tX15,48(DI)") + fmt.Fprintln(w, "\tLEAQ\t64(DI),DI") // We use lea instead of add, to avoid clobbering flags + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func copyAMD64(w io.Writer) { + // SI: ptr to source memory + // DI: ptr to destination memory + // SI and DI are updated as a side effect. + // + // This is equivalent to a sequence of MOVSQ but + // for some reason that is 3.5x slower than this code. + fmt.Fprintln(w, "TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT, $0-0") + for i := 0; i < 64; i++ { + fmt.Fprintln(w, "\tMOVUPS\t(SI), X0") + fmt.Fprintln(w, "\tADDQ\t$16, SI") + fmt.Fprintln(w, "\tMOVUPS\tX0, (DI)") + fmt.Fprintln(w, "\tADDQ\t$16, DI") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func zero386(w io.Writer) { + // AX: zero + // DI: ptr to memory to be zeroed + // DI is updated as a side effect. + fmt.Fprintln(w, "TEXT runtime·duffzero(SB), NOSPLIT, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tSTOSL") + } + fmt.Fprintln(w, "\tRET") +} + +func copy386(w io.Writer) { + // SI: ptr to source memory + // DI: ptr to destination memory + // SI and DI are updated as a side effect. + // + // This is equivalent to a sequence of MOVSL but + // for some reason MOVSL is really slow. + fmt.Fprintln(w, "TEXT runtime·duffcopy(SB), NOSPLIT, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVL\t(SI), CX") + fmt.Fprintln(w, "\tADDL\t$4, SI") + fmt.Fprintln(w, "\tMOVL\tCX, (DI)") + fmt.Fprintln(w, "\tADDL\t$4, DI") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func zeroARM(w io.Writer) { + // R0: zero + // R1: ptr to memory to be zeroed + // R1 is updated as a side effect. 
+ fmt.Fprintln(w, "TEXT runtime·duffzero(SB), NOSPLIT, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVW.P\tR0, 4(R1)") + } + fmt.Fprintln(w, "\tRET") +} + +func copyARM(w io.Writer) { + // R0: scratch space + // R1: ptr to source memory + // R2: ptr to destination memory + // R1 and R2 are updated as a side effect + fmt.Fprintln(w, "TEXT runtime·duffcopy(SB), NOSPLIT, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVW.P\t4(R1), R0") + fmt.Fprintln(w, "\tMOVW.P\tR0, 4(R2)") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func zeroARM64(w io.Writer) { + // ZR: always zero + // R20: ptr to memory to be zeroed + // On return, R20 points to the last zeroed dword. + fmt.Fprintln(w, "TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 63; i++ { + fmt.Fprintln(w, "\tSTP.P\t(ZR, ZR), 16(R20)") + } + fmt.Fprintln(w, "\tSTP\t(ZR, ZR), (R20)") + fmt.Fprintln(w, "\tRET") +} + +func copyARM64(w io.Writer) { + // R20: ptr to source memory + // R21: ptr to destination memory + // R26, R27 (aka REGTMP): scratch space + // R20 and R21 are updated as a side effect + fmt.Fprintln(w, "TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + + for i := 0; i < 64; i++ { + fmt.Fprintln(w, "\tLDP.P\t16(R20), (R26, R27)") + fmt.Fprintln(w, "\tSTP.P\t(R26, R27), 16(R21)") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func zeroLOONG64(w io.Writer) { + // R0: always zero + // R19 (aka REGRT1): ptr to memory to be zeroed - 8 + // On return, R19 points to the last zeroed dword. 
+ fmt.Fprintln(w, "TEXT runtime·duffzero(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVV\tR0, 8(R19)") + fmt.Fprintln(w, "\tADDV\t$8, R19") + } + fmt.Fprintln(w, "\tRET") +} + +func copyLOONG64(w io.Writer) { + fmt.Fprintln(w, "TEXT runtime·duffcopy(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVV\t(R19), R30") + fmt.Fprintln(w, "\tADDV\t$8, R19") + fmt.Fprintln(w, "\tMOVV\tR30, (R20)") + fmt.Fprintln(w, "\tADDV\t$8, R20") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func tagsPPC64x(w io.Writer) { + fmt.Fprintln(w) + fmt.Fprintln(w, "//go:build ppc64 || ppc64le") + fmt.Fprintln(w) +} + +func zeroPPC64x(w io.Writer) { + // R0: always zero + // R20 (aka REGRT1): ptr to memory to be zeroed - 8 + // On return, R20 points to the last zeroed dword. + fmt.Fprintln(w, "TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVDU\tR0, 8(R20)") + } + fmt.Fprintln(w, "\tRET") +} + +func copyPPC64x(w io.Writer) { + // duffcopy is not used on PPC64. + fmt.Fprintln(w, "TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVDU\t8(R20), R5") + fmt.Fprintln(w, "\tMOVDU\tR5, 8(R21)") + } + fmt.Fprintln(w, "\tRET") +} + +func tagsMIPS64x(w io.Writer) { + fmt.Fprintln(w) + fmt.Fprintln(w, "//go:build mips64 || mips64le") + fmt.Fprintln(w) +} + +func zeroMIPS64x(w io.Writer) { + // R0: always zero + // R1 (aka REGRT1): ptr to memory to be zeroed - 8 + // On return, R1 points to the last zeroed dword. 
+ fmt.Fprintln(w, "TEXT runtime·duffzero(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVV\tR0, 8(R1)") + fmt.Fprintln(w, "\tADDV\t$8, R1") + } + fmt.Fprintln(w, "\tRET") +} + +func copyMIPS64x(w io.Writer) { + fmt.Fprintln(w, "TEXT runtime·duffcopy(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOVV\t(R1), R23") + fmt.Fprintln(w, "\tADDV\t$8, R1") + fmt.Fprintln(w, "\tMOVV\tR23, (R2)") + fmt.Fprintln(w, "\tADDV\t$8, R2") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} + +func zeroRISCV64(w io.Writer) { + // ZERO: always zero + // X25: ptr to memory to be zeroed + // X25 is updated as a side effect. + fmt.Fprintln(w, "TEXT runtime·duffzero<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOV\tZERO, (X25)") + fmt.Fprintln(w, "\tADD\t$8, X25") + } + fmt.Fprintln(w, "\tRET") +} + +func copyRISCV64(w io.Writer) { + // X24: ptr to source memory + // X25: ptr to destination memory + // X24 and X25 are updated as a side effect + fmt.Fprintln(w, "TEXT runtime·duffcopy<ABIInternal>(SB), NOSPLIT|NOFRAME, $0-0") + for i := 0; i < 128; i++ { + fmt.Fprintln(w, "\tMOV\t(X24), X31") + fmt.Fprintln(w, "\tADD\t$8, X24") + fmt.Fprintln(w, "\tMOV\tX31, (X25)") + fmt.Fprintln(w, "\tADD\t$8, X25") + fmt.Fprintln(w) + } + fmt.Fprintln(w, "\tRET") +} diff --git a/src/runtime/mkfastlog2table.go b/src/runtime/mkfastlog2table.go new file mode 100644 index 0000000..614d1f7 --- /dev/null +++ b/src/runtime/mkfastlog2table.go @@ -0,0 +1,109 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// fastlog2Table contains log2 approximations for 5 binary digits. +// This is used to implement fastlog2, which is used for heap sampling. 
+ +package main + +import ( + "bytes" + "fmt" + "log" + "math" + "os" +) + +func main() { + var buf bytes.Buffer + + fmt.Fprintln(&buf, "// Code generated by mkfastlog2table.go; DO NOT EDIT.") + fmt.Fprintln(&buf, "// Run go generate from src/runtime to update.") + fmt.Fprintln(&buf, "// See mkfastlog2table.go for comments.") + fmt.Fprintln(&buf) + fmt.Fprintln(&buf, "package runtime") + fmt.Fprintln(&buf) + fmt.Fprintln(&buf, "const fastlogNumBits =", fastlogNumBits) + fmt.Fprintln(&buf) + + fmt.Fprintln(&buf, "var fastlog2Table = [1<<fastlogNumBits + 1]float64{") + table := computeTable() + for _, t := range table { + fmt.Fprintf(&buf, "\t%v,\n", t) + } + fmt.Fprintln(&buf, "}") + + if err := os.WriteFile("fastlog2table.go", buf.Bytes(), 0644); err != nil { + log.Fatalln(err) + } +} + +const fastlogNumBits = 5 + +func computeTable() []float64 { + fastlog2Table := make([]float64, 1<<fastlogNumBits+1) + for i := 0; i <= (1 << fastlogNumBits); i++ { + fastlog2Table[i] = log2(1.0 + float64(i)/(1<<fastlogNumBits)) + } + return fastlog2Table +} + +// log2 is a local copy of math.Log2 with an explicit float64 conversion +// to disable FMA. This lets us generate the same output on all platforms. +func log2(x float64) float64 { + frac, exp := math.Frexp(x) + // Make sure exact powers of two give an exact answer. + // Don't depend on Log(0.5)*(1/Ln2)+exp being exactly exp-1. + if frac == 0.5 { + return float64(exp - 1) + } + return float64(nlog(frac)*(1/math.Ln2)) + float64(exp) +} + +// nlog is a local copy of math.Log with explicit float64 conversions +// to disable FMA. This lets us generate the same output on all platforms. 
+func nlog(x float64) float64 { + const ( + Ln2Hi = 6.93147180369123816490e-01 /* 3fe62e42 fee00000 */ + Ln2Lo = 1.90821492927058770002e-10 /* 3dea39ef 35793c76 */ + L1 = 6.666666666666735130e-01 /* 3FE55555 55555593 */ + L2 = 3.999999999940941908e-01 /* 3FD99999 9997FA04 */ + L3 = 2.857142874366239149e-01 /* 3FD24924 94229359 */ + L4 = 2.222219843214978396e-01 /* 3FCC71C5 1D8E78AF */ + L5 = 1.818357216161805012e-01 /* 3FC74664 96CB03DE */ + L6 = 1.531383769920937332e-01 /* 3FC39A09 D078C69F */ + L7 = 1.479819860511658591e-01 /* 3FC2F112 DF3E5244 */ + ) + + // special cases + switch { + case math.IsNaN(x) || math.IsInf(x, 1): + return x + case x < 0: + return math.NaN() + case x == 0: + return math.Inf(-1) + } + + // reduce + f1, ki := math.Frexp(x) + if f1 < math.Sqrt2/2 { + f1 *= 2 + ki-- + } + f := f1 - 1 + k := float64(ki) + + // compute + s := float64(f / (2 + f)) + s2 := float64(s * s) + s4 := float64(s2 * s2) + t1 := s2 * float64(L1+float64(s4*float64(L3+float64(s4*float64(L5+float64(s4*L7)))))) + t2 := s4 * float64(L2+float64(s4*float64(L4+float64(s4*L6)))) + R := float64(t1 + t2) + hfsq := float64(0.5 * f * f) + return float64(k*Ln2Hi) - ((hfsq - (float64(s*float64(hfsq+R)) + float64(k*Ln2Lo))) - f) +} diff --git a/src/runtime/mklockrank.go b/src/runtime/mklockrank.go new file mode 100644 index 0000000..ef2f07d --- /dev/null +++ b/src/runtime/mklockrank.go @@ -0,0 +1,392 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// mklockrank records the static rank graph of the locks in the +// runtime and generates the rank checking structures in lockrank.go. +package main + +import ( + "bytes" + "flag" + "fmt" + "go/format" + "internal/dag" + "io" + "log" + "os" + "strings" +) + +// ranks describes the lock rank graph. See "go doc internal/dag" for +// the syntax. 
+// +// "a < b" means a must be acquired before b if both are held +// (or, if b is held, a cannot be acquired). +// +// "NONE < a" means no locks may be held when a is acquired. +// +// If a lock is not given a rank, then it is assumed to be a leaf +// lock, which means no other lock can be acquired while it is held. +// Therefore, leaf locks do not need to be given an explicit rank. +// +// Ranks in all caps are pseudo-nodes that help define order, but do +// not actually define a rank. +// +// TODO: It's often hard to correlate rank names to locks. Change +// these to be more consistent with the locks they label. +const ranks = ` +# Sysmon +NONE +< sysmon +< scavenge, forcegc; + +# Defer +NONE < defer; + +# GC +NONE < + sweepWaiters, + assistQueue, + sweep; + +# Test only +NONE < testR, testW; + +# Scheduler, timers, netpoll +NONE < + allocmW, + execW, + cpuprof, + pollDesc; +assistQueue, + cpuprof, + forcegc, + pollDesc, # pollDesc can interact with timers, which can lock sched. + scavenge, + sweep, + sweepWaiters, + testR +# Above SCHED are things that can call into the scheduler. +< SCHED +# Below SCHED is the scheduler implementation. +< allocmR, + execR +< sched; +sched < allg, allp; +allp < timers; +timers < netpollInit; + +# Channels +scavenge, sweep, testR < hchan; +NONE < notifyList; +hchan, notifyList < sudog; + +# Semaphores +NONE < root; + +# Itabs +NONE +< itab +< reflectOffs; + +# User arena state +NONE < userArenaState; + +# Tracing without a P uses a global trace buffer. +scavenge +# Above TRACEGLOBAL can emit a trace event without a P. +< TRACEGLOBAL +# Below TRACEGLOBAL manages the global tracing buffer. +# Note that traceBuf eventually chains to MALLOC, but we never get that far +# in the situation where there's no P. +< traceBuf; +# Starting/stopping tracing traces strings. 
+traceBuf < traceStrings; + +# Malloc +allg, + allocmR, + execR, # May grow stack + execW, # May allocate after BeforeFork + hchan, + notifyList, + reflectOffs, + timers, + traceStrings, + userArenaState +# Above MALLOC are things that can allocate memory. +< MALLOC +# Below MALLOC is the malloc implementation. +< fin, + gcBitsArenas, + mheapSpecial, + mspanSpecial, + spanSetSpine, + MPROF; + +# Memory profiling +MPROF < profInsert, profBlock, profMemActive; +profMemActive < profMemFuture; + +# Stack allocation and copying +gcBitsArenas, + netpollInit, + profBlock, + profInsert, + profMemFuture, + spanSetSpine, + fin, + root +# Anything that can grow the stack can acquire STACKGROW. +# (Most higher layers imply STACKGROW, like MALLOC.) +< STACKGROW +# Below STACKGROW is the stack allocator/copying implementation. +< gscan; +gscan < stackpool; +gscan < stackLarge; +# Generally, hchan must be acquired before gscan. But in one case, +# where we suspend a G and then shrink its stack, syncadjustsudogs +# can acquire hchan locks while holding gscan. To allow this case, +# we use hchanLeaf instead of hchan. +gscan < hchanLeaf; + +# Write barrier +defer, + gscan, + mspanSpecial, + sudog +# Anything that can have write barriers can acquire WB. +# Above WB, we can have write barriers. +< WB +# Below WB is the write barrier implementation. +< wbufSpans; + +# Span allocator +stackLarge, + stackpool, + wbufSpans +# Above mheap is anything that can call the span allocator. +< mheap; +# Below mheap is the span allocator implementation. +mheap, mheapSpecial < globalAlloc; + +# Execution tracer events (with a P) +hchan, + mheap, + root, + sched, + traceStrings, + notifyList, + fin +# Above TRACE is anything that can create a trace event +< TRACE +< trace +< traceStackTab; + +# panic is handled specially. It is implicitly below all other locks. +NONE < panic; +# deadlock is not acquired while holding panic, but it also needs to be +# below all other locks. 
+panic < deadlock; + +# RWMutex internal read lock + +allocmR, + allocmW +< allocmRInternal; + +execR, + execW +< execRInternal; + +testR, + testW +< testRInternal; +` + +// cyclicRanks lists lock ranks that allow multiple locks of the same +// rank to be acquired simultaneously. The runtime enforces ordering +// within these ranks using a separate mechanism. +var cyclicRanks = map[string]bool{ + // Multiple timers are locked simultaneously in destroy(). + "timers": true, + // Multiple hchans are acquired in hchan.sortkey() order in + // select. + "hchan": true, + // Multiple hchanLeafs are acquired in hchan.sortkey() order in + // syncadjustsudogs(). + "hchanLeaf": true, + // The point of the deadlock lock is to deadlock. + "deadlock": true, +} + +func main() { + flagO := flag.String("o", "", "write to `file` instead of stdout") + flagDot := flag.Bool("dot", false, "emit graphviz output instead of Go") + flag.Parse() + if flag.NArg() != 0 { + fmt.Fprintf(os.Stderr, "too many arguments") + os.Exit(2) + } + + g, err := dag.Parse(ranks) + if err != nil { + log.Fatal(err) + } + + var out []byte + if *flagDot { + var b bytes.Buffer + g.TransitiveReduction() + // Add cyclic edges for visualization. + for k := range cyclicRanks { + g.AddEdge(k, k) + } + // Reverse the graph. It's much easier to read this as + // a "<" partial order than a ">" partial order. This + // way, locks are acquired from the top going down + // and time moves forward over the edges instead of + // backward. + g.Transpose() + generateDot(&b, g) + out = b.Bytes() + } else { + var b bytes.Buffer + generateGo(&b, g) + out, err = format.Source(b.Bytes()) + if err != nil { + log.Fatal(err) + } + } + + if *flagO != "" { + err = os.WriteFile(*flagO, out, 0666) + } else { + _, err = os.Stdout.Write(out) + } + if err != nil { + log.Fatal(err) + } +} + +func generateGo(w io.Writer, g *dag.Graph) { + fmt.Fprintf(w, `// Code generated by mklockrank.go; DO NOT EDIT.
+ +package runtime + +type lockRank int + +`) + + // Create numeric ranks. + topo := g.Topo() + for i, j := 0, len(topo)-1; i < j; i, j = i+1, j-1 { + topo[i], topo[j] = topo[j], topo[i] + } + fmt.Fprintf(w, ` +// Constants representing the ranks of all non-leaf runtime locks, in rank order. +// Locks with lower rank must be taken before locks with higher rank, +// in addition to satisfying the partial order in lockPartialOrder. +// A few ranks allow self-cycles, which are specified in lockPartialOrder. +const ( + lockRankUnknown lockRank = iota + +`) + for _, rank := range topo { + if isPseudo(rank) { + fmt.Fprintf(w, "\t// %s\n", rank) + } else { + fmt.Fprintf(w, "\t%s\n", cname(rank)) + } + } + fmt.Fprintf(w, `) + +// lockRankLeafRank is the rank of lock that does not have a declared rank, +// and hence is a leaf lock. +const lockRankLeafRank lockRank = 1000 +`) + + // Create string table. + fmt.Fprintf(w, ` +// lockNames gives the names associated with each of the above ranks. +var lockNames = []string{ +`) + for _, rank := range topo { + if !isPseudo(rank) { + fmt.Fprintf(w, "\t%s: %q,\n", cname(rank), rank) + } + } + fmt.Fprintf(w, `} + +func (rank lockRank) String() string { + if rank == 0 { + return "UNKNOWN" + } + if rank == lockRankLeafRank { + return "LEAF" + } + if rank < 0 || int(rank) >= len(lockNames) { + return "BAD RANK" + } + return lockNames[rank] +} +`) + + // Create partial order structure. + fmt.Fprintf(w, ` +// lockPartialOrder is the transitive closure of the lock rank graph. +// An entry for rank X lists all of the ranks that can already be held +// when rank X is acquired. +// +// Lock ranks that allow self-cycles list themselves. 
+var lockPartialOrder [][]lockRank = [][]lockRank{ +`) + for _, rank := range topo { + if isPseudo(rank) { + continue + } + list := []string{} + for _, before := range g.Edges(rank) { + if !isPseudo(before) { + list = append(list, cname(before)) + } + } + if cyclicRanks[rank] { + list = append(list, cname(rank)) + } + + fmt.Fprintf(w, "\t%s: {%s},\n", cname(rank), strings.Join(list, ", ")) + } + fmt.Fprintf(w, "}\n") +} + +// cname returns the Go const name for the given lock rank label. +func cname(label string) string { + return "lockRank" + strings.ToUpper(label[:1]) + label[1:] +} + +func isPseudo(label string) bool { + return strings.ToUpper(label) == label +} + +// generateDot emits a Graphviz dot representation of g to w. +func generateDot(w io.Writer, g *dag.Graph) { + fmt.Fprintf(w, "digraph g {\n") + + // Define all nodes. + for _, node := range g.Nodes { + fmt.Fprintf(w, "%q;\n", node) + } + + // Create edges. + for _, node := range g.Nodes { + for _, to := range g.Edges(node) { + fmt.Fprintf(w, "%q -> %q;\n", node, to) + } + } + + fmt.Fprintf(w, "}\n") +} diff --git a/src/runtime/mkpreempt.go b/src/runtime/mkpreempt.go new file mode 100644 index 0000000..61d2d02 --- /dev/null +++ b/src/runtime/mkpreempt.go @@ -0,0 +1,626 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// mkpreempt generates the asyncPreempt functions for each +// architecture. 
+package main + +import ( + "flag" + "fmt" + "io" + "log" + "os" + "strings" +) + +// Copied from cmd/compile/internal/ssa/gen/*Ops.go + +var regNames386 = []string{ + "AX", + "CX", + "DX", + "BX", + "SP", + "BP", + "SI", + "DI", + "X0", + "X1", + "X2", + "X3", + "X4", + "X5", + "X6", + "X7", +} + +var regNamesAMD64 = []string{ + "AX", + "CX", + "DX", + "BX", + "SP", + "BP", + "SI", + "DI", + "R8", + "R9", + "R10", + "R11", + "R12", + "R13", + "R14", + "R15", + "X0", + "X1", + "X2", + "X3", + "X4", + "X5", + "X6", + "X7", + "X8", + "X9", + "X10", + "X11", + "X12", + "X13", + "X14", + "X15", +} + +var out io.Writer + +var arches = map[string]func(){ + "386": gen386, + "amd64": genAMD64, + "arm": genARM, + "arm64": genARM64, + "loong64": genLoong64, + "mips64x": func() { genMIPS(true) }, + "mipsx": func() { genMIPS(false) }, + "ppc64x": genPPC64, + "riscv64": genRISCV64, + "s390x": genS390X, + "wasm": genWasm, +} +var beLe = map[string]bool{"mips64x": true, "mipsx": true, "ppc64x": true} + +func main() { + flag.Parse() + if flag.NArg() > 0 { + out = os.Stdout + for _, arch := range flag.Args() { + gen, ok := arches[arch] + if !ok { + log.Fatalf("unknown arch %s", arch) + } + header(arch) + gen() + } + return + } + + for arch, gen := range arches { + f, err := os.Create(fmt.Sprintf("preempt_%s.s", arch)) + if err != nil { + log.Fatal(err) + } + out = f + header(arch) + gen() + if err := f.Close(); err != nil { + log.Fatal(err) + } + } +} + +func header(arch string) { + fmt.Fprintf(out, "// Code generated by mkpreempt.go; DO NOT EDIT.\n\n") + if beLe[arch] { + base := arch[:len(arch)-1] + fmt.Fprintf(out, "//go:build %s || %sle\n\n", base, base) + } + fmt.Fprintf(out, "#include \"go_asm.h\"\n") + if arch == "amd64" { + fmt.Fprintf(out, "#include \"asm_amd64.h\"\n") + } + fmt.Fprintf(out, "#include \"textflag.h\"\n\n") + fmt.Fprintf(out, "TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0\n") +} + +func p(f string, args ...any) { + fmted := fmt.Sprintf(f, args...) 
+ fmt.Fprintf(out, "\t%s\n", strings.ReplaceAll(fmted, "\n", "\n\t")) +} + +func label(l string) { + fmt.Fprintf(out, "%s\n", l) +} + +type layout struct { + stack int + regs []regPos + sp string // stack pointer register +} + +type regPos struct { + pos int + + saveOp string + restoreOp string + reg string + + // If this register requires special save and restore, these + // give those operations with a %d placeholder for the stack + // offset. + save, restore string +} + +func (l *layout) add(op, reg string, size int) { + l.regs = append(l.regs, regPos{saveOp: op, restoreOp: op, reg: reg, pos: l.stack}) + l.stack += size +} + +func (l *layout) add2(sop, rop, reg string, size int) { + l.regs = append(l.regs, regPos{saveOp: sop, restoreOp: rop, reg: reg, pos: l.stack}) + l.stack += size +} + +func (l *layout) addSpecial(save, restore string, size int) { + l.regs = append(l.regs, regPos{save: save, restore: restore, pos: l.stack}) + l.stack += size +} + +func (l *layout) save() { + for _, reg := range l.regs { + if reg.save != "" { + p(reg.save, reg.pos) + } else { + p("%s %s, %d(%s)", reg.saveOp, reg.reg, reg.pos, l.sp) + } + } +} + +func (l *layout) restore() { + for i := len(l.regs) - 1; i >= 0; i-- { + reg := l.regs[i] + if reg.restore != "" { + p(reg.restore, reg.pos) + } else { + p("%s %d(%s), %s", reg.restoreOp, reg.pos, l.sp, reg.reg) + } + } +} + +func gen386() { + p("PUSHFL") + // Save general purpose registers. + var l = layout{sp: "SP"} + for _, reg := range regNames386 { + if reg == "SP" || strings.HasPrefix(reg, "X") { + continue + } + l.add("MOVL", reg, 4) + } + + softfloat := "GO386_softfloat" + + // Save SSE state only if supported. 
+ lSSE := layout{stack: l.stack, sp: "SP"} + for i := 0; i < 8; i++ { + lSSE.add("MOVUPS", fmt.Sprintf("X%d", i), 16) + } + + p("ADJSP $%d", lSSE.stack) + p("NOP SP") + l.save() + p("#ifndef %s", softfloat) + lSSE.save() + p("#endif") + p("CALL ·asyncPreempt2(SB)") + p("#ifndef %s", softfloat) + lSSE.restore() + p("#endif") + l.restore() + p("ADJSP $%d", -lSSE.stack) + + p("POPFL") + p("RET") +} + +func genAMD64() { + // Assign stack offsets. + var l = layout{sp: "SP"} + for _, reg := range regNamesAMD64 { + if reg == "SP" || reg == "BP" { + continue + } + if !strings.HasPrefix(reg, "X") { + l.add("MOVQ", reg, 8) + } + } + lSSE := layout{stack: l.stack, sp: "SP"} + for _, reg := range regNamesAMD64 { + if strings.HasPrefix(reg, "X") { + lSSE.add("MOVUPS", reg, 16) + } + } + + // TODO: MXCSR register? + + p("PUSHQ BP") + p("MOVQ SP, BP") + p("// Save flags before clobbering them") + p("PUSHFQ") + p("// obj doesn't understand ADD/SUB on SP, but does understand ADJSP") + p("ADJSP $%d", lSSE.stack) + p("// But vet doesn't know ADJSP, so suppress vet stack checking") + p("NOP SP") + + l.save() + + // Apparently, the signal handling code path in the darwin kernel leaves + // the upper bits of Y registers in a dirty state, which causes + // many SSE operations (128-bit and narrower) to become much slower. + // Clear the upper bits to get to a clean state. See issue #37174. + // It is safe here as Go code doesn't use the upper bits of Y registers. + p("#ifdef GOOS_darwin") + p("#ifndef hasAVX") + p("CMPB internal∕cpu·X86+const_offsetX86HasAVX(SB), $0") + p("JE 2(PC)") + p("#endif") + p("VZEROUPPER") + p("#endif") + + lSSE.save() + p("CALL ·asyncPreempt2(SB)") + lSSE.restore() + l.restore() + p("ADJSP $%d", -lSSE.stack) + p("POPFQ") + p("POPQ BP") + p("RET") +} + +func genARM() { + // Add integer registers R0-R12. + // R13 (SP), R14 (LR), R15 (PC) are special and not saved here.
+ var l = layout{sp: "R13", stack: 4} // add LR slot + for i := 0; i <= 12; i++ { + reg := fmt.Sprintf("R%d", i) + if i == 10 { + continue // R10 is g register, no need to save/restore + } + l.add("MOVW", reg, 4) + } + // Add flag register. + l.addSpecial( + "MOVW CPSR, R0\nMOVW R0, %d(R13)", + "MOVW %d(R13), R0\nMOVW R0, CPSR", + 4) + + // Add floating point registers F0-F15 and flag register. + var lfp = layout{stack: l.stack, sp: "R13"} + lfp.addSpecial( + "MOVW FPCR, R0\nMOVW R0, %d(R13)", + "MOVW %d(R13), R0\nMOVW R0, FPCR", + 4) + for i := 0; i <= 15; i++ { + reg := fmt.Sprintf("F%d", i) + lfp.add("MOVD", reg, 8) + } + + p("MOVW.W R14, -%d(R13)", lfp.stack) // allocate frame, save LR + l.save() + p("MOVB ·goarm(SB), R0\nCMP $6, R0\nBLT nofp") // test goarm, and skip FP registers if goarm=5. + lfp.save() + label("nofp:") + p("CALL ·asyncPreempt2(SB)") + p("MOVB ·goarm(SB), R0\nCMP $6, R0\nBLT nofp2") // test goarm, and skip FP registers if goarm=5. + lfp.restore() + label("nofp2:") + l.restore() + + p("MOVW %d(R13), R14", lfp.stack) // sigctxt.pushCall pushes LR on stack, restore it + p("MOVW.P %d(R13), R15", lfp.stack+4) // load PC, pop frame (including the space pushed by sigctxt.pushCall) + p("UNDEF") // shouldn't get here +} + +func genARM64() { + // Add integer registers R0-R26 + // R27 (REGTMP), R28 (g), R29 (FP), R30 (LR), R31 (SP) are special + // and not saved here. + var l = layout{sp: "RSP", stack: 8} // add slot to save PC of interrupted instruction + for i := 0; i < 26; i += 2 { + if i == 18 { + i-- + continue // R18 is not used, skip + } + reg := fmt.Sprintf("(R%d, R%d)", i, i+1) + l.add2("STP", "LDP", reg, 16) + } + // Add flag registers. + l.addSpecial( + "MOVD NZCV, R0\nMOVD R0, %d(RSP)", + "MOVD %d(RSP), R0\nMOVD R0, NZCV", + 8) + l.addSpecial( + "MOVD FPSR, R0\nMOVD R0, %d(RSP)", + "MOVD %d(RSP), R0\nMOVD R0, FPSR", + 8) + // TODO: FPCR? I don't think we'll change it, so no need to save. + // Add floating point registers F0-F31. 
+ for i := 0; i < 31; i += 2 { + reg := fmt.Sprintf("(F%d, F%d)", i, i+1) + l.add2("FSTPD", "FLDPD", reg, 16) + } + if l.stack%16 != 0 { + l.stack += 8 // SP needs 16-byte alignment + } + + // allocate frame, save PC of interrupted instruction (in LR) + p("MOVD R30, %d(RSP)", -l.stack) + p("SUB $%d, RSP", l.stack) + p("MOVD R29, -8(RSP)") // save frame pointer (only used on Linux) + p("SUB $8, RSP, R29") // set up new frame pointer + // On iOS, save the LR again after decrementing SP. We run the + // signal handler on the G stack (as it doesn't support sigaltstack), + // so any writes below SP may be clobbered. + p("#ifdef GOOS_ios") + p("MOVD R30, (RSP)") + p("#endif") + + l.save() + p("CALL ·asyncPreempt2(SB)") + l.restore() + + p("MOVD %d(RSP), R30", l.stack) // sigctxt.pushCall has pushed LR (at interrupt) on stack, restore it + p("MOVD -8(RSP), R29") // restore frame pointer + p("MOVD (RSP), R27") // load PC to REGTMP + p("ADD $%d, RSP", l.stack+16) // pop frame (including the space pushed by sigctxt.pushCall) + p("JMP (R27)") +} + +func genMIPS(_64bit bool) { + mov := "MOVW" + movf := "MOVF" + add := "ADD" + sub := "SUB" + r28 := "R28" + regsize := 4 + softfloat := "GOMIPS_softfloat" + if _64bit { + mov = "MOVV" + movf = "MOVD" + add = "ADDV" + sub = "SUBV" + r28 = "RSB" + regsize = 8 + softfloat = "GOMIPS64_softfloat" + } + + // Add integer registers R1-R22, R24-R25, R28 + // R0 (zero), R23 (REGTMP), R29 (SP), R30 (g), R31 (LR) are special, + // and not saved here. R26 and R27 are reserved by kernel and not used. 
+ var l = layout{sp: "R29", stack: regsize} // add slot to save PC of interrupted instruction (in LR) + for i := 1; i <= 25; i++ { + if i == 23 { + continue // R23 is REGTMP + } + reg := fmt.Sprintf("R%d", i) + l.add(mov, reg, regsize) + } + l.add(mov, r28, regsize) + l.addSpecial( + mov+" HI, R1\n"+mov+" R1, %d(R29)", + mov+" %d(R29), R1\n"+mov+" R1, HI", + regsize) + l.addSpecial( + mov+" LO, R1\n"+mov+" R1, %d(R29)", + mov+" %d(R29), R1\n"+mov+" R1, LO", + regsize) + + // Add floating point control/status register FCR31 (FCR0-FCR30 are irrelevant) + var lfp = layout{sp: "R29", stack: l.stack} + lfp.addSpecial( + mov+" FCR31, R1\n"+mov+" R1, %d(R29)", + mov+" %d(R29), R1\n"+mov+" R1, FCR31", + regsize) + // Add floating point registers F0-F31. + for i := 0; i <= 31; i++ { + reg := fmt.Sprintf("F%d", i) + lfp.add(movf, reg, regsize) + } + + // allocate frame, save PC of interrupted instruction (in LR) + p(mov+" R31, -%d(R29)", lfp.stack) + p(sub+" $%d, R29", lfp.stack) + + l.save() + p("#ifndef %s", softfloat) + lfp.save() + p("#endif") + p("CALL ·asyncPreempt2(SB)") + p("#ifndef %s", softfloat) + lfp.restore() + p("#endif") + l.restore() + + p(mov+" %d(R29), R31", lfp.stack) // sigctxt.pushCall has pushed LR (at interrupt) on stack, restore it + p(mov + " (R29), R23") // load PC to REGTMP + p(add+" $%d, R29", lfp.stack+regsize) // pop frame (including the space pushed by sigctxt.pushCall) + p("JMP (R23)") +} + +func genLoong64() { + mov := "MOVV" + movf := "MOVD" + add := "ADDV" + sub := "SUBV" + r31 := "RSB" + regsize := 8 + + // Add integer registers r4-r21 r23-r29 r31 + // R0 (zero), R30 (REGTMP), R2 (tp), R3 (SP), R22 (g), R1 (LR) are special, + var l = layout{sp: "R3", stack: regsize} // add slot to save PC of interrupted instruction (in LR) + for i := 4; i <= 29; i++ { + if i == 22 { + continue // R3 is REGSP R22 is g + } + reg := fmt.Sprintf("R%d", i) + l.add(mov, reg, regsize) + } + l.add(mov, r31, regsize) + + // Add floating point registers F0-F31. 
+ for i := 0; i <= 31; i++ { + reg := fmt.Sprintf("F%d", i) + l.add(movf, reg, regsize) + } + + // allocate frame, save PC of interrupted instruction (in LR) + p(mov+" R1, -%d(R3)", l.stack) + p(sub+" $%d, R3", l.stack) + + l.save() + p("CALL ·asyncPreempt2(SB)") + l.restore() + + p(mov+" %d(R3), R1", l.stack) // sigctxt.pushCall has pushed LR (at interrupt) on stack, restore it + p(mov + " (R3), R30") // load PC to REGTMP + p(add+" $%d, R3", l.stack+regsize) // pop frame (including the space pushed by sigctxt.pushCall) + p("JMP (R30)") +} + +func genPPC64() { + // Add integer registers R3-R29 + // R0 (zero), R1 (SP), R30 (g) are special and not saved here. + // R2 (TOC pointer in PIC mode), R12 (function entry address in PIC mode) have been saved in sigctxt.pushCall. + // R31 (REGTMP) will be saved manually. + var l = layout{sp: "R1", stack: 32 + 8} // MinFrameSize on PPC64, plus one word for saving R31 + for i := 3; i <= 29; i++ { + if i == 12 || i == 13 { + // R12 has been saved in sigctxt.pushCall. + // R13 is TLS pointer, not used by Go code. we must NOT + // restore it, otherwise if we parked and resumed on a + // different thread we'll mess up TLS addresses. + continue + } + reg := fmt.Sprintf("R%d", i) + l.add("MOVD", reg, 8) + } + l.addSpecial( + "MOVW CR, R31\nMOVW R31, %d(R1)", + "MOVW %d(R1), R31\nMOVFL R31, $0xff", // this is MOVW R31, CR + 8) // CR is 4-byte wide, but just keep the alignment + l.addSpecial( + "MOVD XER, R31\nMOVD R31, %d(R1)", + "MOVD %d(R1), R31\nMOVD R31, XER", + 8) + // Add floating point registers F0-F31. + for i := 0; i <= 31; i++ { + reg := fmt.Sprintf("F%d", i) + l.add("FMOVD", reg, 8) + } + // Add floating point control/status register FPSCR. 
+ l.addSpecial( + "MOVFL FPSCR, F0\nFMOVD F0, %d(R1)", + "FMOVD %d(R1), F0\nMOVFL F0, FPSCR", + 8) + + p("MOVD R31, -%d(R1)", l.stack-32) // save R31 first, we'll use R31 for saving LR + p("MOVD LR, R31") + p("MOVDU R31, -%d(R1)", l.stack) // allocate frame, save PC of interrupted instruction (in LR) + + l.save() + p("CALL ·asyncPreempt2(SB)") + l.restore() + + p("MOVD %d(R1), R31", l.stack) // sigctxt.pushCall has pushed LR, R2, R12 (at interrupt) on stack, restore them + p("MOVD R31, LR") + p("MOVD %d(R1), R2", l.stack+8) + p("MOVD %d(R1), R12", l.stack+16) + p("MOVD (R1), R31") // load PC to CTR + p("MOVD R31, CTR") + p("MOVD 32(R1), R31") // restore R31 + p("ADD $%d, R1", l.stack+32) // pop frame (including the space pushed by sigctxt.pushCall) + p("JMP (CTR)") +} + +func genRISCV64() { + // X0 (zero), X1 (LR), X2 (SP), X3 (GP), X4 (TP), X27 (g), X31 (TMP) are special. + var l = layout{sp: "X2", stack: 8} + + // Add integer registers (X5-X26, X28-30). + for i := 5; i < 31; i++ { + if i == 27 { + continue + } + reg := fmt.Sprintf("X%d", i) + l.add("MOV", reg, 8) + } + + // Add floating point registers (F0-F31). + for i := 0; i <= 31; i++ { + reg := fmt.Sprintf("F%d", i) + l.add("MOVD", reg, 8) + } + + p("MOV X1, -%d(X2)", l.stack) + p("ADD $-%d, X2", l.stack) + l.save() + p("CALL ·asyncPreempt2(SB)") + l.restore() + p("MOV %d(X2), X1", l.stack) + p("MOV (X2), X31") + p("ADD $%d, X2", l.stack+8) + p("JMP (X31)") +} + +func genS390X() { + // Add integer registers R0-R12 + // R13 (g), R14 (LR), R15 (SP) are special, and not saved here. + // Saving R10 (REGTMP) is not necessary, but it is saved anyway. + var l = layout{sp: "R15", stack: 16} // add slot to save PC of interrupted instruction and flags + l.addSpecial( + "STMG R0, R12, %d(R15)", + "LMG %d(R15), R0, R12", + 13*8) + // Add floating point registers F0-F15.
+ for i := 0; i <= 15; i++ { + reg := fmt.Sprintf("F%d", i) + l.add("FMOVD", reg, 8) + } + + // allocate frame, save PC of interrupted instruction (in LR) and flags (condition code) + p("IPM R10") // save flags upfront, as ADD will clobber flags + p("MOVD R14, -%d(R15)", l.stack) + p("ADD $-%d, R15", l.stack) + p("MOVW R10, 8(R15)") // save flags + + l.save() + p("CALL ·asyncPreempt2(SB)") + l.restore() + + p("MOVD %d(R15), R14", l.stack) // sigctxt.pushCall has pushed LR (at interrupt) on stack, restore it + p("ADD $%d, R15", l.stack+8) // pop frame (including the space pushed by sigctxt.pushCall) + p("MOVWZ -%d(R15), R10", l.stack) // load flags to REGTMP + p("TMLH R10, $(3<<12)") // restore flags + p("MOVD -%d(R15), R10", l.stack+8) // load PC to REGTMP + p("JMP (R10)") +} + +func genWasm() { + p("// No async preemption on wasm") + p("UNDEF") +} + +func notImplemented() { + p("// Not implemented yet") + p("JMP ·abort(SB)") +} diff --git a/src/runtime/mksizeclasses.go b/src/runtime/mksizeclasses.go new file mode 100644 index 0000000..64ed844 --- /dev/null +++ b/src/runtime/mksizeclasses.go @@ -0,0 +1,345 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// Generate tables for small malloc size classes. +// +// See malloc.go for overview. +// +// The size classes are chosen so that rounding an allocation +// request up to the next size class wastes at most 12.5% (1.125x). +// +// Each size class has its own page count that gets allocated +// and chopped up when new objects of the size class are needed. +// That page count is chosen so that chopping up the run of +// pages into objects of the given size wastes at most 12.5% (1.125x) +// of the memory. It is not necessary that the cutoff here be +// the same as above. 
+// +// The two sources of waste multiply, so the worst possible case +// for the above constraints would be that allocations of some +// size might have a 26.6% (1.266x) overhead. +// In practice, only one of the wastes comes into play for a +// given size (sizes < 512 waste mainly on the round-up, +// sizes > 512 waste mainly on the page chopping). +// For really small sizes, alignment constraints force the +// overhead higher. + +package main + +import ( + "bytes" + "flag" + "fmt" + "go/format" + "io" + "log" + "math" + "math/bits" + "os" +) + +// Generate msize.go + +var stdout = flag.Bool("stdout", false, "write to stdout instead of sizeclasses.go") + +func main() { + flag.Parse() + + var b bytes.Buffer + fmt.Fprintln(&b, "// Code generated by mksizeclasses.go; DO NOT EDIT.") + fmt.Fprintln(&b, "//go:generate go run mksizeclasses.go") + fmt.Fprintln(&b) + fmt.Fprintln(&b, "package runtime") + classes := makeClasses() + + printComment(&b, classes) + + printClasses(&b, classes) + + out, err := format.Source(b.Bytes()) + if err != nil { + log.Fatal(err) + } + if *stdout { + _, err = os.Stdout.Write(out) + } else { + err = os.WriteFile("sizeclasses.go", out, 0666) + } + if err != nil { + log.Fatal(err) + } +} + +const ( + // Constants that we use and will transfer to the runtime. + maxSmallSize = 32 << 10 + smallSizeDiv = 8 + smallSizeMax = 1024 + largeSizeDiv = 128 + pageShift = 13 + + // Derived constants. 
+ pageSize = 1 << pageShift +) + +type class struct { + size int // max size + npages int // number of pages +} + +func powerOfTwo(x int) bool { + return x != 0 && x&(x-1) == 0 +} + +func makeClasses() []class { + var classes []class + + classes = append(classes, class{}) // class #0 is a dummy entry + + align := 8 + for size := align; size <= maxSmallSize; size += align { + if powerOfTwo(size) { // bump alignment once in a while + if size >= 2048 { + align = 256 + } else if size >= 128 { + align = size / 8 + } else if size >= 32 { + align = 16 // heap bitmaps assume 16 byte alignment for allocations >= 32 bytes. + } + } + if !powerOfTwo(align) { + panic("incorrect alignment") + } + + // Make the allocnpages big enough that + // the leftover is less than 1/8 of the total, + // so wasted space is at most 12.5%. + allocsize := pageSize + for allocsize%size > allocsize/8 { + allocsize += pageSize + } + npages := allocsize / pageSize + + // If the previous sizeclass chose the same + // allocation size and fit the same number of + // objects into the page, we might as well + // use just this size instead of having two + // different sizes. + if len(classes) > 1 && npages == classes[len(classes)-1].npages && allocsize/size == allocsize/classes[len(classes)-1].size { + classes[len(classes)-1].size = size + continue + } + classes = append(classes, class{size: size, npages: npages}) + } + + // Increase object sizes if we can fit the same number of larger objects + // into the same number of pages. For example, we choose size 8448 above + // with 6 objects in 7 pages. But we can well use object size 9472, + // which is also 6 objects in 7 pages but +1024 bytes (+12.12%). + // We need to preserve at least largeSizeDiv alignment otherwise + // sizeToClass won't work. 
+ for i := range classes { + if i == 0 { + continue + } + c := &classes[i] + psize := c.npages * pageSize + new_size := (psize / (psize / c.size)) &^ (largeSizeDiv - 1) + if new_size > c.size { + c.size = new_size + } + } + + if len(classes) != 68 { + panic("number of size classes has changed") + } + + for i := range classes { + computeDivMagic(&classes[i]) + } + + return classes +} + +// computeDivMagic checks that the division required to compute object +// index from span offset can be computed using 32-bit multiplication. +// n / c.size is implemented as (n * (^uint32(0)/uint32(c.size) + 1)) >> 32 +// for all 0 <= n <= c.npages * pageSize +func computeDivMagic(c *class) { + // divisor + d := c.size + if d == 0 { + return + } + + // maximum input value for which the formula needs to work. + max := c.npages * pageSize + + // As reported in [1], if n and d are unsigned N-bit integers, we + // can compute n / d as ⌊n * c / 2^F⌋, where c is ⌈2^F / d⌉ and F is + // computed with: + // + // Algorithm 2: Algorithm to select the number of fractional bits + // and the scaled approximate reciprocal in the case of unsigned + // integers. + // + // if d is a power of two then + // Let F ← log₂(d) and c = 1. + // else + // Let F ← N + L where L is the smallest integer + // such that d ≤ (2^(N+L) mod d) + 2^L. + // end if + // + // [1] "Faster Remainder by Direct Computation: Applications to + // Compilers and Software Libraries" Daniel Lemire, Owen Kaser, + // Nathan Kurz arXiv:1902.01961 + // + // To minimize the risk of introducing errors, we implement the + // algorithm exactly as stated, rather than trying to adapt it to + // fit typical Go idioms. 
+ N := bits.Len(uint(max)) + var F int + if powerOfTwo(d) { + F = int(math.Log2(float64(d))) + if d != 1<<F { + panic("imprecise log2") + } + } else { + for L := 0; ; L++ { + if d <= ((1<<(N+L))%d)+(1<<L) { + F = N + L + break + } + } + } + + // Also, noted in the paper, F is the smallest number of fractional + // bits required. We use 32 bits, because it works for all size + // classes and is fast on all CPU architectures that we support. + if F > 32 { + fmt.Printf("d=%d max=%d N=%d F=%d\n", c.size, max, N, F) + panic("size class requires more than 32 bits of precision") + } + + // Brute force double-check with the exact computation that will be + // done by the runtime. + m := ^uint32(0)/uint32(c.size) + 1 + for n := 0; n <= max; n++ { + if uint32((uint64(n)*uint64(m))>>32) != uint32(n/c.size) { + fmt.Printf("d=%d max=%d m=%d n=%d\n", d, max, m, n) + panic("bad 32-bit multiply magic") + } + } +} + +func printComment(w io.Writer, classes []class) { + fmt.Fprintf(w, "// %-5s %-9s %-10s %-7s %-10s %-9s %-9s\n", "class", "bytes/obj", "bytes/span", "objects", "tail waste", "max waste", "min align") + prevSize := 0 + var minAligns [pageShift + 1]int + for i, c := range classes { + if i == 0 { + continue + } + spanSize := c.npages * pageSize + objects := spanSize / c.size + tailWaste := spanSize - c.size*(spanSize/c.size) + maxWaste := float64((c.size-prevSize-1)*objects+tailWaste) / float64(spanSize) + alignBits := bits.TrailingZeros(uint(c.size)) + if alignBits > pageShift { + // object alignment is capped at page alignment + alignBits = pageShift + } + for i := range minAligns { + if i > alignBits { + minAligns[i] = 0 + } else if minAligns[i] == 0 { + minAligns[i] = c.size + } + } + prevSize = c.size + fmt.Fprintf(w, "// %5d %9d %10d %7d %10d %8.2f%% %9d\n", i, c.size, spanSize, objects, tailWaste, 100*maxWaste, 1<<alignBits) + } + fmt.Fprintf(w, "\n") + + fmt.Fprintf(w, "// %-9s %-4s %-12s\n", "alignment", "bits", "min obj size") + for bits, size := range minAligns 
{ + if size == 0 { + break + } + if bits+1 < len(minAligns) && size == minAligns[bits+1] { + continue + } + fmt.Fprintf(w, "// %9d %4d %12d\n", 1<<bits, bits, size) + } + fmt.Fprintf(w, "\n") +} + +func printClasses(w io.Writer, classes []class) { + fmt.Fprintln(w, "const (") + fmt.Fprintf(w, "_MaxSmallSize = %d\n", maxSmallSize) + fmt.Fprintf(w, "smallSizeDiv = %d\n", smallSizeDiv) + fmt.Fprintf(w, "smallSizeMax = %d\n", smallSizeMax) + fmt.Fprintf(w, "largeSizeDiv = %d\n", largeSizeDiv) + fmt.Fprintf(w, "_NumSizeClasses = %d\n", len(classes)) + fmt.Fprintf(w, "_PageShift = %d\n", pageShift) + fmt.Fprintln(w, ")") + + fmt.Fprint(w, "var class_to_size = [_NumSizeClasses]uint16 {") + for _, c := range classes { + fmt.Fprintf(w, "%d,", c.size) + } + fmt.Fprintln(w, "}") + + fmt.Fprint(w, "var class_to_allocnpages = [_NumSizeClasses]uint8 {") + for _, c := range classes { + fmt.Fprintf(w, "%d,", c.npages) + } + fmt.Fprintln(w, "}") + + fmt.Fprint(w, "var class_to_divmagic = [_NumSizeClasses]uint32 {") + for _, c := range classes { + if c.size == 0 { + fmt.Fprintf(w, "0,") + continue + } + fmt.Fprintf(w, "^uint32(0)/%d+1,", c.size) + } + fmt.Fprintln(w, "}") + + // map from size to size class, for small sizes. + sc := make([]int, smallSizeMax/smallSizeDiv+1) + for i := range sc { + size := i * smallSizeDiv + for j, c := range classes { + if c.size >= size { + sc[i] = j + break + } + } + } + fmt.Fprint(w, "var size_to_class8 = [smallSizeMax/smallSizeDiv+1]uint8 {") + for _, v := range sc { + fmt.Fprintf(w, "%d,", v) + } + fmt.Fprintln(w, "}") + + // map from size to size class, for large sizes. 
+ sc = make([]int, (maxSmallSize-smallSizeMax)/largeSizeDiv+1) + for i := range sc { + size := smallSizeMax + i*largeSizeDiv + for j, c := range classes { + if c.size >= size { + sc[i] = j + break + } + } + } + fmt.Fprint(w, "var size_to_class128 = [(_MaxSmallSize-smallSizeMax)/largeSizeDiv+1]uint8 {") + for _, v := range sc { + fmt.Fprintf(w, "%d,", v) + } + fmt.Fprintln(w, "}") +} diff --git a/src/runtime/mmap.go b/src/runtime/mmap.go new file mode 100644 index 0000000..f0183f6 --- /dev/null +++ b/src/runtime/mmap.go @@ -0,0 +1,19 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !aix && !darwin && !js && (!linux || !amd64) && (!linux || !arm64) && (!freebsd || !amd64) && !openbsd && !plan9 && !solaris && !windows + +package runtime + +import "unsafe" + +// mmap calls the mmap system call. It is implemented in assembly. +// We only pass the lower 32 bits of file offset to the +// assembly routine; the higher bits (if required), should be provided +// by the assembly routine as 0. +// The err result is an OS error code such as ENOMEM. +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (p unsafe.Pointer, err int) + +// munmap calls the munmap system call. It is implemented in assembly. +func munmap(addr unsafe.Pointer, n uintptr) diff --git a/src/runtime/mpagealloc.go b/src/runtime/mpagealloc.go new file mode 100644 index 0000000..35b2a01 --- /dev/null +++ b/src/runtime/mpagealloc.go @@ -0,0 +1,1002 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Page allocator. +// +// The page allocator manages mapped pages (defined by pageSize, NOT +// physPageSize) for allocation and re-use. It is embedded into mheap. +// +// Pages are managed using a bitmap that is sharded into chunks. 
+// In the bitmap, 1 means in-use, and 0 means free. The bitmap spans the +// process's address space. Chunks are managed in a sparse-array-style structure +// similar to mheap.arenas, since the bitmap may be large on some systems. +// +// The bitmap is efficiently searched by using a radix tree in combination +// with fast bit-wise intrinsics. Allocation is performed using an address-ordered +// first-fit approach. +// +// Each entry in the radix tree is a summary that describes three properties of +// a particular region of the address space: the number of contiguous free pages +// at the start and end of the region it represents, and the maximum number of +// contiguous free pages found anywhere in that region. +// +// Each level of the radix tree is stored as one contiguous array, which represents +// a different granularity of subdivision of the process's address space. Thus, this +// radix tree is actually implicit in these large arrays, as opposed to having explicit +// dynamically-allocated pointer-based node structures. Naturally, these arrays may be +// quite large for systems with large address spaces, so in these cases they are mapped +// into memory as needed. The leaf summaries of the tree correspond to a bitmap chunk. +// +// The root level (referred to as L0 and index 0 in pageAlloc.summary) has each +// summary represent the largest section of address space (16 GiB on 64-bit systems), +// with each subsequent level representing successively smaller subsections until we +// reach the finest granularity at the leaves, a chunk. +// +// More specifically, each summary in each level (except for leaf summaries) +// represents some number of entries in the following level.
For example, each +// summary in the root level may represent a 16 GiB region of address space, +// and in the next level there could be 8 corresponding entries which represent 2 +// GiB subsections of that 16 GiB region, each of which could correspond to 8 +// entries in the next level which each represent 256 MiB regions, and so on. +// +// Thus, this design only scales to heaps so large, but can always be extended to +// larger heaps by simply adding levels to the radix tree, which mostly costs +// additional virtual address space. The choice of managing large arrays also means +// that a large amount of virtual address space may be reserved by the runtime. + +package runtime + +import ( + "unsafe" +) + +const ( + // The size of a bitmap chunk, i.e. the amount of bits (that is, pages) to consider + // in the bitmap at once. + pallocChunkPages = 1 << logPallocChunkPages + pallocChunkBytes = pallocChunkPages * pageSize + logPallocChunkPages = 9 + logPallocChunkBytes = logPallocChunkPages + pageShift + + // The number of radix bits for each level. + // + // The value of 3 is chosen such that the block of summaries we need to scan at + // each level fits in 64 bytes (2^3 summaries * 8 bytes per summary), which is + // close to the L1 cache line width on many systems. Also, a value of 3 fits 4 tree + // levels perfectly into the 21-bit pallocBits summary field at the root level. + // + // The following equation explains how each of the constants relate: + // summaryL0Bits + (summaryLevels-1)*summaryLevelBits + logPallocChunkBytes = heapAddrBits + // + // summaryLevels is an architecture-dependent value defined in mpagealloc_*.go. + summaryLevelBits = 3 + summaryL0Bits = heapAddrBits - logPallocChunkBytes - (summaryLevels-1)*summaryLevelBits + + // pallocChunksL2Bits is the number of bits of the chunk index number + // covered by the second level of the chunks map. + // + // See (*pageAlloc).chunks for more details. 
Update the documentation + // there should this change. + pallocChunksL2Bits = heapAddrBits - logPallocChunkBytes - pallocChunksL1Bits + pallocChunksL1Shift = pallocChunksL2Bits +) + +// maxSearchAddr returns the maximum searchAddr value, which indicates +// that the heap has no free space. +// +// This function exists just to make it clear that this is the maximum address +// for the page allocator's search space. See maxOffAddr for details. +// +// It's a function (rather than a variable) because it needs to be +// usable before package runtime's dynamic initialization is complete. +// See #51913 for details. +func maxSearchAddr() offAddr { return maxOffAddr } + +// Global chunk index. +// +// Represents an index into the leaf level of the radix tree. +// Similar to arenaIndex, except instead of arenas, it divides the address +// space into chunks. +type chunkIdx uint + +// chunkIndex returns the global index of the palloc chunk containing the +// pointer p. +func chunkIndex(p uintptr) chunkIdx { + return chunkIdx((p - arenaBaseOffset) / pallocChunkBytes) +} + +// chunkBase returns the base address of the palloc chunk at index ci. +func chunkBase(ci chunkIdx) uintptr { + return uintptr(ci)*pallocChunkBytes + arenaBaseOffset +} + +// chunkPageIndex computes the index of the page that contains p, +// relative to the chunk which contains p. +func chunkPageIndex(p uintptr) uint { + return uint(p % pallocChunkBytes / pageSize) +} + +// l1 returns the index into the first level of (*pageAlloc).chunks. +func (i chunkIdx) l1() uint { + if pallocChunksL1Bits == 0 { + // Let the compiler optimize this away if there's no + // L1 map. + return 0 + } else { + return uint(i) >> pallocChunksL1Shift + } +} + +// l2 returns the index into the second level of (*pageAlloc).chunks. 
+func (i chunkIdx) l2() uint { + if pallocChunksL1Bits == 0 { + return uint(i) + } else { + return uint(i) & (1<<pallocChunksL2Bits - 1) + } +} + +// offAddrToLevelIndex converts an address in the offset address space +// to the index into summary[level] containing addr. +func offAddrToLevelIndex(level int, addr offAddr) int { + return int((addr.a - arenaBaseOffset) >> levelShift[level]) +} + +// levelIndexToOffAddr converts an index into summary[level] into +// the corresponding address in the offset address space. +func levelIndexToOffAddr(level, idx int) offAddr { + return offAddr{(uintptr(idx) << levelShift[level]) + arenaBaseOffset} +} + +// addrsToSummaryRange converts base and limit pointers into a range +// of entries for the given summary level. +// +// The returned range is inclusive on the lower bound and exclusive on +// the upper bound. +func addrsToSummaryRange(level int, base, limit uintptr) (lo int, hi int) { + // This is slightly more nuanced than just a shift for the exclusive + // upper-bound. Note that the exclusive upper bound may be within a + // summary at this level, meaning if we just do the obvious computation + // hi will end up being an inclusive upper bound. Unfortunately, just + // adding 1 to that is too broad since we might be on the very edge + // of a summary's max page count boundary for this level + // (1 << levelLogPages[level]). So, make limit an inclusive upper bound + // then shift, then add 1, so we get an exclusive upper bound at the end. + lo = int((base - arenaBaseOffset) >> levelShift[level]) + hi = int(((limit-1)-arenaBaseOffset)>>levelShift[level]) + 1 + return +} + +// blockAlignSummaryRange aligns indices into the given level to that +// level's block width (1 << levelBits[level]). It assumes lo is inclusive +// and hi is exclusive, and so aligns them down and up respectively. 
+func blockAlignSummaryRange(level int, lo, hi int) (int, int) { + e := uintptr(1) << levelBits[level] + return int(alignDown(uintptr(lo), e)), int(alignUp(uintptr(hi), e)) +} + +type pageAlloc struct { + // Radix tree of summaries. + // + // Each slice's cap represents the whole memory reservation. + // Each slice's len reflects the allocator's maximum known + // mapped heap address for that level. + // + // The backing store of each summary level is reserved in init + // and may or may not be committed in grow (small address spaces + // may commit all the memory in init). + // + // The purpose of keeping len <= cap is to enforce bounds checks + // on the top end of the slice so that instead of an unknown + // runtime segmentation fault, we get a much friendlier out-of-bounds + // error. + // + // To iterate over a summary level, use inUse to determine which ranges + // are currently available. Otherwise one might try to access + // memory which is only Reserved which may result in a hard fault. + // + // We may still get segmentation faults < len since some of that + // memory may not be committed yet. + summary [summaryLevels][]pallocSum + + // chunks is a slice of bitmap chunks. + // + // The total size of chunks is quite large on most 64-bit platforms + // (O(GiB) or more) if flattened, so rather than making one large mapping + // (which has problems on some platforms, even when PROT_NONE) we use a + // two-level sparse array approach similar to the arena index in mheap. + // + // To find the chunk containing a memory address `a`, do: + // chunkOf(chunkIndex(a)) + // + // Below is a table describing the configuration for chunks for various + // heapAddrBits supported by the runtime. 
+ // + // heapAddrBits | L1 Bits | L2 Bits | L2 Entry Size + // ------------------------------------------------ + // 32 | 0 | 10 | 128 KiB + // 33 (iOS) | 0 | 11 | 256 KiB + // 48 | 13 | 13 | 1 MiB + // + // There's no reason to use the L1 part of chunks on 32-bit, the + // address space is small so the L2 is small. For platforms with a + // 48-bit address space, we pick the L1 such that the L2 is 1 MiB + // in size, which is a good balance between low granularity without + // making the impact on BSS too high (note the L1 is stored directly + // in pageAlloc). + // + // To iterate over the bitmap, use inUse to determine which ranges + // are currently available. Otherwise one might iterate over unused + // ranges. + // + // Protected by mheapLock. + // + // TODO(mknyszek): Consider changing the definition of the bitmap + // such that 1 means free and 0 means in-use so that summaries and + // the bitmaps align better on zero-values. + chunks [1 << pallocChunksL1Bits]*[1 << pallocChunksL2Bits]pallocData + + // The address to start an allocation search with. It must never + // point to any memory that is not contained in inUse, i.e. + // inUse.contains(searchAddr.addr()) must always be true. The one + // exception to this rule is that it may take on the value of + // maxOffAddr to indicate that the heap is exhausted. + // + // We guarantee that all valid heap addresses below this value + // are allocated and not worth searching. + searchAddr offAddr + + // start and end represent the chunk indices + // which pageAlloc knows about. It assumes + // chunks in the range [start, end) are + // currently ready to use. + start, end chunkIdx + + // inUse is a slice of ranges of address space which are + // known by the page allocator to be currently in-use (passed + // to grow). + // + // This field is currently unused on 32-bit architectures but + // is harmless to track. 
We care much more about having a + // contiguous heap in these cases and take additional measures + // to ensure that, so in nearly all cases this should have just + // 1 element. + // + // All access is protected by the mheapLock. + inUse addrRanges + + // scav stores the scavenger state. + scav struct { + // index is an efficient index of chunks that have pages available to + // scavenge. + index scavengeIndex + + // released is the amount of memory released this scavenge cycle. + // + // Updated atomically. + released uintptr + } + + // mheap_.lock. This level of indirection makes it possible + // to test pageAlloc independently of the runtime allocator. + mheapLock *mutex + + // sysStat is the runtime memstat to update when new system + // memory is committed by the pageAlloc for allocation metadata. + sysStat *sysMemStat + + // summaryMappedReady is the number of bytes mapped in the Ready state + // in the summary structure. Used only for testing currently. + // + // Protected by mheapLock. + summaryMappedReady uintptr + + // Whether or not this struct is being used in tests. + test bool +} + +func (p *pageAlloc) init(mheapLock *mutex, sysStat *sysMemStat) { + if levelLogPages[0] > logMaxPackedValue { + // We can't represent 1<<levelLogPages[0] pages, the maximum number + // of pages we need to represent at the root level, in a summary, which + // is a big problem. Throw. + print("runtime: root level max pages = ", 1<<levelLogPages[0], "\n") + print("runtime: summary max pages = ", maxPackedValue, "\n") + throw("root level max pages doesn't fit in summary") + } + p.sysStat = sysStat + + // Initialize p.inUse. + p.inUse.init(sysStat) + + // System-dependent initialization. + p.sysInit() + + // Start with the searchAddr in a state indicating there's no free memory. + p.searchAddr = maxSearchAddr() + + // Set the mheapLock. + p.mheapLock = mheapLock +} + +// tryChunkOf returns the bitmap data for the given chunk.
+// +// Returns nil if the chunk data has not been mapped. +func (p *pageAlloc) tryChunkOf(ci chunkIdx) *pallocData { + l2 := p.chunks[ci.l1()] + if l2 == nil { + return nil + } + return &l2[ci.l2()] +} + +// chunkOf returns the chunk at the given chunk index. +// +// The chunk index must be valid or this method may throw. +func (p *pageAlloc) chunkOf(ci chunkIdx) *pallocData { + return &p.chunks[ci.l1()][ci.l2()] +} + +// grow sets up the metadata for the address range [base, base+size). +// It may allocate metadata, in which case *p.sysStat will be updated. +// +// p.mheapLock must be held. +func (p *pageAlloc) grow(base, size uintptr) { + assertLockHeld(p.mheapLock) + + // Round up to chunks, since we can't deal with increments smaller + // than chunks. Also, sysGrow expects aligned values. + limit := alignUp(base+size, pallocChunkBytes) + base = alignDown(base, pallocChunkBytes) + + // Grow the summary levels in a system-dependent manner. + // We just update a bunch of additional metadata here. + p.sysGrow(base, limit) + + // Update p.start and p.end. + // If no growth happened yet, start == 0. This is generally + // safe since the zero page is unmapped. + firstGrowth := p.start == 0 + start, end := chunkIndex(base), chunkIndex(limit) + if firstGrowth || start < p.start { + p.start = start + } + if end > p.end { + p.end = end + } + // Note that [base, limit) will never overlap with any existing + // range inUse because grow only ever adds never-used memory + // regions to the page allocator. + p.inUse.add(makeAddrRange(base, limit)) + + // A grow operation is a lot like a free operation, so if our + // chunk ends up below p.searchAddr, update p.searchAddr to the + // new address, just like in free. + if b := (offAddr{base}); b.lessThan(p.searchAddr) { + p.searchAddr = b + } + + // Add entries into chunks, which is sparse, if needed. Then, + // initialize the bitmap. + // + // Newly-grown memory is always considered scavenged. 
+ // Set all the bits in the scavenged bitmaps high. + for c := chunkIndex(base); c < chunkIndex(limit); c++ { + if p.chunks[c.l1()] == nil { + // Create the necessary l2 entry. + r := sysAlloc(unsafe.Sizeof(*p.chunks[0]), p.sysStat) + if r == nil { + throw("pageAlloc: out of memory") + } + // Store the new chunk block but avoid a write barrier. + // grow is used in call chains that disallow write barriers. + *(*uintptr)(unsafe.Pointer(&p.chunks[c.l1()])) = uintptr(r) + } + p.chunkOf(c).scavenged.setRange(0, pallocChunkPages) + } + + // Update summaries accordingly. The grow acts like a free, so + // we need to ensure this newly-free memory is visible in the + // summaries. + p.update(base, size/pageSize, true, false) +} + +// update updates heap metadata. It must be called each time the bitmap +// is updated. +// +// If contig is true, update does some optimizations assuming that there was +// a contiguous allocation or free between addr and addr+npages. alloc indicates +// whether the operation performed was an allocation or a free. +// +// p.mheapLock must be held. +func (p *pageAlloc) update(base, npages uintptr, contig, alloc bool) { + assertLockHeld(p.mheapLock) + + // base, limit, start, and end are inclusive. + limit := base + npages*pageSize - 1 + sc, ec := chunkIndex(base), chunkIndex(limit) + + // Handle updating the lowest level first. + if sc == ec { + // Fast path: the allocation doesn't span more than one chunk, + // so update this one and if the summary didn't change, return. + x := p.summary[len(p.summary)-1][sc] + y := p.chunkOf(sc).summarize() + if x == y { + return + } + p.summary[len(p.summary)-1][sc] = y + } else if contig { + // Slow contiguous path: the allocation spans more than one chunk + // and at least one summary is guaranteed to change. + summary := p.summary[len(p.summary)-1] + + // Update the summary for chunk sc. 
+ summary[sc] = p.chunkOf(sc).summarize() + + // Update the summaries for chunks in between, which are + // either totally allocated or freed. + whole := p.summary[len(p.summary)-1][sc+1 : ec] + if alloc { + // Should optimize into a memclr. + for i := range whole { + whole[i] = 0 + } + } else { + for i := range whole { + whole[i] = freeChunkSum + } + } + + // Update the summary for chunk ec. + summary[ec] = p.chunkOf(ec).summarize() + } else { + // Slow general path: the allocation spans more than one chunk + // and at least one summary is guaranteed to change. + // + // We can't assume a contiguous allocation happened, so walk over + // every chunk in the range and manually recompute the summary. + summary := p.summary[len(p.summary)-1] + for c := sc; c <= ec; c++ { + summary[c] = p.chunkOf(c).summarize() + } + } + + // Walk up the radix tree and update the summaries appropriately. + changed := true + for l := len(p.summary) - 2; l >= 0 && changed; l-- { + // Update summaries at level l from summaries at level l+1. + changed = false + + // "Constants" for the previous level which we + // need to compute the summary from that level. + logEntriesPerBlock := levelBits[l+1] + logMaxPages := levelLogPages[l+1] + + // lo and hi describe all the parts of the level we need to look at. + lo, hi := addrsToSummaryRange(l, base, limit+1) + + // Iterate over each block, updating the corresponding summary in the less-granular level. + for i := lo; i < hi; i++ { + children := p.summary[l+1][i<<logEntriesPerBlock : (i+1)<<logEntriesPerBlock] + sum := mergeSummaries(children, logMaxPages) + old := p.summary[l][i] + if old != sum { + changed = true + p.summary[l][i] = sum + } + } + } +} + +// allocRange marks the range of memory [base, base+npages*pageSize) as +// allocated. It also updates the summaries to reflect the newly-updated +// bitmap. +// +// Returns the amount of scavenged memory in bytes present in the +// allocated range. +// +// p.mheapLock must be held. 
+func (p *pageAlloc) allocRange(base, npages uintptr) uintptr { + assertLockHeld(p.mheapLock) + + limit := base + npages*pageSize - 1 + sc, ec := chunkIndex(base), chunkIndex(limit) + si, ei := chunkPageIndex(base), chunkPageIndex(limit) + + scav := uint(0) + if sc == ec { + // The range doesn't cross any chunk boundaries. + chunk := p.chunkOf(sc) + scav += chunk.scavenged.popcntRange(si, ei+1-si) + chunk.allocRange(si, ei+1-si) + } else { + // The range crosses at least one chunk boundary. + chunk := p.chunkOf(sc) + scav += chunk.scavenged.popcntRange(si, pallocChunkPages-si) + chunk.allocRange(si, pallocChunkPages-si) + for c := sc + 1; c < ec; c++ { + chunk := p.chunkOf(c) + scav += chunk.scavenged.popcntRange(0, pallocChunkPages) + chunk.allocAll() + } + chunk = p.chunkOf(ec) + scav += chunk.scavenged.popcntRange(0, ei+1) + chunk.allocRange(0, ei+1) + } + p.update(base, npages, true, true) + return uintptr(scav) * pageSize +} + +// findMappedAddr returns the smallest mapped offAddr that is +// >= addr. That is, if addr refers to mapped memory, then it is +// returned. If addr is higher than any mapped region, then +// it returns maxOffAddr. +// +// p.mheapLock must be held. +func (p *pageAlloc) findMappedAddr(addr offAddr) offAddr { + assertLockHeld(p.mheapLock) + + // If we're not in a test, validate first by checking mheap_.arenas. + // This is a fast path which is only safe to use outside of testing. + ai := arenaIndex(addr.addr()) + if p.test || mheap_.arenas[ai.l1()] == nil || mheap_.arenas[ai.l1()][ai.l2()] == nil { + vAddr, ok := p.inUse.findAddrGreaterEqual(addr.addr()) + if ok { + return offAddr{vAddr} + } else { + // The candidate search address is greater than any + // known address, which means we definitely have no + // free memory left. + return maxOffAddr + } + } + return addr +} + +// find searches for the first (address-ordered) contiguous free region of +// npages in size and returns a base address for that region. 
+// +// It uses p.searchAddr to prune its search and assumes that no palloc chunks +// below chunkIndex(p.searchAddr) contain any free memory at all. +// +// find also computes and returns a candidate p.searchAddr, which may or +// may not prune more of the address space than p.searchAddr already does. +// This candidate is always a valid p.searchAddr. +// +// find represents the slow path and the full radix tree search. +// +// Returns a base address of 0 on failure, in which case the candidate +// searchAddr returned is invalid and must be ignored. +// +// p.mheapLock must be held. +func (p *pageAlloc) find(npages uintptr) (uintptr, offAddr) { + assertLockHeld(p.mheapLock) + + // Search algorithm. + // + // This algorithm walks each level l of the radix tree from the root level + // to the leaf level. It iterates over at most 1 << levelBits[l] of entries + // in a given level in the radix tree, and uses the summary information to + // find either: + // 1) That a given subtree contains a large enough contiguous region, at + // which point it continues iterating on the next level, or + // 2) That there are enough contiguous boundary-crossing bits to satisfy + // the allocation, at which point it knows exactly where to start + // allocating from. + // + // i tracks the index into the current level l's structure for the + // contiguous 1 << levelBits[l] entries we're actually interested in. + // + // NOTE: Technically this search could allocate a region which crosses + // the arenaBaseOffset boundary, which when arenaBaseOffset != 0, is + // a discontinuity. However, the only way this could happen is if the + // page at the zero address is mapped, and this is impossible on + // every system we support where arenaBaseOffset != 0. So, the + // discontinuity is already encoded in the fact that the OS will never + // map the zero page for us, and this function doesn't try to handle + // this case in any way. 
+ + // i is the beginning of the block of entries we're searching at the + // current level. + i := 0 + + // firstFree is the region of address space that we are certain to + // find the first free page in the heap. base and bound are the inclusive + // bounds of this window, and both are addresses in the linearized, contiguous + // view of the address space (with arenaBaseOffset pre-added). At each level, + // this window is narrowed as we find the memory region containing the + // first free page of memory. To begin with, the range reflects the + // full process address space. + // + // firstFree is updated by calling foundFree each time free space in the + // heap is discovered. + // + // At the end of the search, base.addr() is the best new + // searchAddr we could deduce in this search. + firstFree := struct { + base, bound offAddr + }{ + base: minOffAddr, + bound: maxOffAddr, + } + // foundFree takes the given address range [addr, addr+size) and + // updates firstFree if it is a narrower range. The input range must + // either be fully contained within firstFree or not overlap with it + // at all. + // + // This way, we'll record the first summary we find with any free + // pages on the root level and narrow that down if we descend into + // that summary. But as soon as we need to iterate beyond that summary + // in a level to find a large enough range, we'll stop narrowing. + foundFree := func(addr offAddr, size uintptr) { + if firstFree.base.lessEqual(addr) && addr.add(size-1).lessEqual(firstFree.bound) { + // This range fits within the current firstFree window, so narrow + // down the firstFree window to the base and bound of this range. + firstFree.base = addr + firstFree.bound = addr.add(size - 1) + } else if !(addr.add(size-1).lessThan(firstFree.base) || firstFree.bound.lessThan(addr)) { + // This range only partially overlaps with the firstFree range, + // so throw. 
+ print("runtime: addr = ", hex(addr.addr()), ", size = ", size, "\n") + print("runtime: base = ", hex(firstFree.base.addr()), ", bound = ", hex(firstFree.bound.addr()), "\n") + throw("range partially overlaps") + } + } + + // lastSum is the summary which we saw on the previous level that made us + // move on to the next level. Used to print additional information in the + // case of a catastrophic failure. + // lastSumIdx is that summary's index in the previous level. + lastSum := packPallocSum(0, 0, 0) + lastSumIdx := -1 + +nextLevel: + for l := 0; l < len(p.summary); l++ { + // For the root level, entriesPerBlock is the whole level. + entriesPerBlock := 1 << levelBits[l] + logMaxPages := levelLogPages[l] + + // We've moved into a new level, so let's update i to our new + // starting index. This is a no-op for level 0. + i <<= levelBits[l] + + // Slice out the block of entries we care about. + entries := p.summary[l][i : i+entriesPerBlock] + + // Determine j0, the first index we should start iterating from. + // The searchAddr may help us eliminate iterations if we followed the + // searchAddr on the previous level or we're on the root level, in which + // case the searchAddr should be the same as i after levelShift. + j0 := 0 + if searchIdx := offAddrToLevelIndex(l, p.searchAddr); searchIdx&^(entriesPerBlock-1) == i { + j0 = searchIdx & (entriesPerBlock - 1) + } + + // Run over the level entries looking for + // a contiguous run of at least npages either + // within an entry or across entries. + // + // base contains the page index (relative to + // the first entry's first page) of the currently + // considered run of consecutive pages. + // + // size contains the size of the currently considered + // run of consecutive pages. + var base, size uint + for j := j0; j < len(entries); j++ { + sum := entries[j] + if sum == 0 { + // A full entry means we broke any streak and + // that we should skip it altogether. 
+ size = 0 + continue + } + + // We've encountered a non-zero summary which means + // free memory, so update firstFree. + foundFree(levelIndexToOffAddr(l, i+j), (uintptr(1)<<logMaxPages)*pageSize) + + s := sum.start() + if size+s >= uint(npages) { + // If size == 0 we don't have a run yet, + // which means base isn't valid. So, set + // base to the first page in this block. + if size == 0 { + base = uint(j) << logMaxPages + } + // We hit npages; we're done! + size += s + break + } + if sum.max() >= uint(npages) { + // The entry itself contains npages contiguous + // free pages, so continue on the next level + // to find that run. + i += j + lastSumIdx = i + lastSum = sum + continue nextLevel + } + if size == 0 || s < 1<<logMaxPages { + // We either don't have a current run started, or this entry + // isn't totally free (meaning we can't continue the current + // one), so try to begin a new run by setting size and base + // based on sum.end. + size = sum.end() + base = uint(j+1)<<logMaxPages - size + continue + } + // The entry is completely free, so continue the run. + size += 1 << logMaxPages + } + if size >= uint(npages) { + // We found a sufficiently large run of free pages straddling + // some boundary, so compute the address and return it. + addr := levelIndexToOffAddr(l, i).add(uintptr(base) * pageSize).addr() + return addr, p.findMappedAddr(firstFree.base) + } + if l == 0 { + // We're at level zero, so that means we've exhausted our search. + return 0, maxSearchAddr() + } + + // We're not at level zero, and we exhausted the level we were looking in. + // This means that either our calculations were wrong or the level above + // lied to us. In either case, dump some useful state and throw. 
+ print("runtime: summary[", l-1, "][", lastSumIdx, "] = ", lastSum.start(), ", ", lastSum.max(), ", ", lastSum.end(), "\n") + print("runtime: level = ", l, ", npages = ", npages, ", j0 = ", j0, "\n") + print("runtime: p.searchAddr = ", hex(p.searchAddr.addr()), ", i = ", i, "\n") + print("runtime: levelShift[level] = ", levelShift[l], ", levelBits[level] = ", levelBits[l], "\n") + for j := 0; j < len(entries); j++ { + sum := entries[j] + print("runtime: summary[", l, "][", i+j, "] = (", sum.start(), ", ", sum.max(), ", ", sum.end(), ")\n") + } + throw("bad summary data") + } + + // Since we've gotten to this point, that means we haven't found a + // sufficiently-sized free region straddling some boundary (chunk or larger). + // This means the last summary we inspected must have had a large enough "max" + // value, so look inside the chunk to find a suitable run. + // + // After iterating over all levels, i must contain a chunk index which + // is what the final level represents. + ci := chunkIdx(i) + j, searchIdx := p.chunkOf(ci).find(npages, 0) + if j == ^uint(0) { + // We couldn't find any space in this chunk despite the summaries telling + // us it should be there. There's likely a bug, so dump some state and throw. + sum := p.summary[len(p.summary)-1][i] + print("runtime: summary[", len(p.summary)-1, "][", i, "] = (", sum.start(), ", ", sum.max(), ", ", sum.end(), ")\n") + print("runtime: npages = ", npages, "\n") + throw("bad summary data") + } + + // Compute the address at which the free space starts. + addr := chunkBase(ci) + uintptr(j)*pageSize + + // Since we actually searched the chunk, we may have + // found an even narrower free window. 
+ searchAddr := chunkBase(ci) + uintptr(searchIdx)*pageSize + foundFree(offAddr{searchAddr}, chunkBase(ci+1)-searchAddr) + return addr, p.findMappedAddr(firstFree.base) +} + +// alloc allocates npages worth of memory from the page heap, returning the base +// address for the allocation and the amount of scavenged memory in bytes +// contained in the region [base address, base address + npages*pageSize). +// +// Returns a 0 base address on failure, in which case other returned values +// should be ignored. +// +// p.mheapLock must be held. +// +// Must run on the system stack because p.mheapLock must be held. +// +//go:systemstack +func (p *pageAlloc) alloc(npages uintptr) (addr uintptr, scav uintptr) { + assertLockHeld(p.mheapLock) + + // If the searchAddr refers to a region which has a higher address than + // any known chunk, then we know we're out of memory. + if chunkIndex(p.searchAddr.addr()) >= p.end { + return 0, 0 + } + + // If npages has a chance of fitting in the chunk where the searchAddr is, + // search it directly. + searchAddr := minOffAddr + if pallocChunkPages-chunkPageIndex(p.searchAddr.addr()) >= uint(npages) { + // npages is guaranteed to be no greater than pallocChunkPages here. + i := chunkIndex(p.searchAddr.addr()) + if max := p.summary[len(p.summary)-1][i].max(); max >= uint(npages) { + j, searchIdx := p.chunkOf(i).find(npages, chunkPageIndex(p.searchAddr.addr())) + if j == ^uint(0) { + print("runtime: max = ", max, ", npages = ", npages, "\n") + print("runtime: searchIdx = ", chunkPageIndex(p.searchAddr.addr()), ", p.searchAddr = ", hex(p.searchAddr.addr()), "\n") + throw("bad summary data") + } + addr = chunkBase(i) + uintptr(j)*pageSize + searchAddr = offAddr{chunkBase(i) + uintptr(searchIdx)*pageSize} + goto Found + } + } + // We failed to use a searchAddr for one reason or another, so try + // the slow path. 
+ addr, searchAddr = p.find(npages) + if addr == 0 { + if npages == 1 { + // We failed to find a single free page, the smallest unit + // of allocation. This means we know the heap is completely + // exhausted. Otherwise, the heap still might have free + // space in it, just not enough contiguous space to + // accommodate npages. + p.searchAddr = maxSearchAddr() + } + return 0, 0 + } +Found: + // Go ahead and actually mark the bits now that we have an address. + scav = p.allocRange(addr, npages) + + // If we found a higher searchAddr, we know that all the + // heap memory before that searchAddr in an offset address space is + // allocated, so bump p.searchAddr up to the new one. + if p.searchAddr.lessThan(searchAddr) { + p.searchAddr = searchAddr + } + return addr, scav +} + +// free returns npages worth of memory starting at base back to the page heap. +// +// p.mheapLock must be held. +// +// Must run on the system stack because p.mheapLock must be held. +// +//go:systemstack +func (p *pageAlloc) free(base, npages uintptr, scavenged bool) { + assertLockHeld(p.mheapLock) + + // If we're freeing pages below the p.searchAddr, update searchAddr. + if b := (offAddr{base}); b.lessThan(p.searchAddr) { + p.searchAddr = b + } + limit := base + npages*pageSize - 1 + if !scavenged { + p.scav.index.mark(base, limit+1) + } + if npages == 1 { + // Fast path: we're clearing a single bit, and we know exactly + // where it is, so mark it directly. + i := chunkIndex(base) + p.chunkOf(i).free1(chunkPageIndex(base)) + } else { + // Slow path: we're clearing more bits so we may need to iterate. + sc, ec := chunkIndex(base), chunkIndex(limit) + si, ei := chunkPageIndex(base), chunkPageIndex(limit) + + if sc == ec { + // The range doesn't cross any chunk boundaries. + p.chunkOf(sc).free(si, ei+1-si) + } else { + // The range crosses at least one chunk boundary. 
+ p.chunkOf(sc).free(si, pallocChunkPages-si) + for c := sc + 1; c < ec; c++ { + p.chunkOf(c).freeAll() + } + p.chunkOf(ec).free(0, ei+1) + } + } + p.update(base, npages, true, false) +} + +const ( + pallocSumBytes = unsafe.Sizeof(pallocSum(0)) + + // maxPackedValue is the maximum value that any of the three fields in + // the pallocSum may take on. + maxPackedValue = 1 << logMaxPackedValue + logMaxPackedValue = logPallocChunkPages + (summaryLevels-1)*summaryLevelBits + + freeChunkSum = pallocSum(uint64(pallocChunkPages) | + uint64(pallocChunkPages<<logMaxPackedValue) | + uint64(pallocChunkPages<<(2*logMaxPackedValue))) +) + +// pallocSum is a packed summary type which packs three numbers: start, max, +// and end into a single 8-byte value. Each of these values are a summary of +// a bitmap and are thus counts, each of which may have a maximum value of +// 2^21 - 1, or all three may be equal to 2^21. The latter case is represented +// by just setting the 64th bit. +type pallocSum uint64 + +// packPallocSum takes a start, max, and end value and produces a pallocSum. +func packPallocSum(start, max, end uint) pallocSum { + if max == maxPackedValue { + return pallocSum(uint64(1 << 63)) + } + return pallocSum((uint64(start) & (maxPackedValue - 1)) | + ((uint64(max) & (maxPackedValue - 1)) << logMaxPackedValue) | + ((uint64(end) & (maxPackedValue - 1)) << (2 * logMaxPackedValue))) +} + +// start extracts the start value from a packed sum. +func (p pallocSum) start() uint { + if uint64(p)&uint64(1<<63) != 0 { + return maxPackedValue + } + return uint(uint64(p) & (maxPackedValue - 1)) +} + +// max extracts the max value from a packed sum. +func (p pallocSum) max() uint { + if uint64(p)&uint64(1<<63) != 0 { + return maxPackedValue + } + return uint((uint64(p) >> logMaxPackedValue) & (maxPackedValue - 1)) +} + +// end extracts the end value from a packed sum. 
+func (p pallocSum) end() uint { + if uint64(p)&uint64(1<<63) != 0 { + return maxPackedValue + } + return uint((uint64(p) >> (2 * logMaxPackedValue)) & (maxPackedValue - 1)) +} + +// unpack unpacks all three values from the summary. +func (p pallocSum) unpack() (uint, uint, uint) { + if uint64(p)&uint64(1<<63) != 0 { + return maxPackedValue, maxPackedValue, maxPackedValue + } + return uint(uint64(p) & (maxPackedValue - 1)), + uint((uint64(p) >> logMaxPackedValue) & (maxPackedValue - 1)), + uint((uint64(p) >> (2 * logMaxPackedValue)) & (maxPackedValue - 1)) +} + +// mergeSummaries merges consecutive summaries which may each represent at +// most 1 << logMaxPagesPerSum pages each together into one. +func mergeSummaries(sums []pallocSum, logMaxPagesPerSum uint) pallocSum { + // Merge the summaries in sums into one. + // + // We do this by keeping a running summary representing the merged + // summaries of sums[:i] in start, max, and end. + start, max, end := sums[0].unpack() + for i := 1; i < len(sums); i++ { + // Merge in sums[i]. + si, mi, ei := sums[i].unpack() + + // Merge in sums[i].start only if the running summary is + // completely free, otherwise this summary's start + // plays no role in the combined sum. + if start == uint(i)<<logMaxPagesPerSum { + start += si + } + + // Recompute the max value of the running sum by looking + // across the boundary between the running sum and sums[i] + // and at the max sums[i], taking the greatest of those two + // and the max of the running sum. + if end+si > max { + max = end + si + } + if mi > max { + max = mi + } + + // Merge in end by checking if this new summary is totally + // free. If it is, then we want to extend the running sum's + // end by the new summary. If not, then we have some alloc'd + // pages in there and we just want to take the end value in + // sums[i]. 
+ if ei == 1<<logMaxPagesPerSum { + end += 1 << logMaxPagesPerSum + } else { + end = ei + } + } + return packPallocSum(start, max, end) +} diff --git a/src/runtime/mpagealloc_32bit.go b/src/runtime/mpagealloc_32bit.go new file mode 100644 index 0000000..859c61d --- /dev/null +++ b/src/runtime/mpagealloc_32bit.go @@ -0,0 +1,121 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build 386 || arm || mips || mipsle || wasm + +// wasm is treated as a 32-bit architecture for the purposes of the page +// allocator, even though it has 64-bit pointers. This is because any wasm +// pointer always has its top 32 bits as zero, so the effective heap address +// space is only 2^32 bytes in size (see heapAddrBits). + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const ( + // The number of levels in the radix tree. + summaryLevels = 4 + + // Constants for testing. + pageAlloc32Bit = 1 + pageAlloc64Bit = 0 + + // Number of bits needed to represent all indices into the L1 of the + // chunks map. + // + // See (*pageAlloc).chunks for more details. Update the documentation + // there should this number change. + pallocChunksL1Bits = 0 +) + +// See comment in mpagealloc_64bit.go. +var levelBits = [summaryLevels]uint{ + summaryL0Bits, + summaryLevelBits, + summaryLevelBits, + summaryLevelBits, +} + +// See comment in mpagealloc_64bit.go. +var levelShift = [summaryLevels]uint{ + heapAddrBits - summaryL0Bits, + heapAddrBits - summaryL0Bits - 1*summaryLevelBits, + heapAddrBits - summaryL0Bits - 2*summaryLevelBits, + heapAddrBits - summaryL0Bits - 3*summaryLevelBits, +} + +// See comment in mpagealloc_64bit.go. 
+var levelLogPages = [summaryLevels]uint{ + logPallocChunkPages + 3*summaryLevelBits, + logPallocChunkPages + 2*summaryLevelBits, + logPallocChunkPages + 1*summaryLevelBits, + logPallocChunkPages, +} + +// scavengeIndexArray is the backing store for p.scav.index.chunks. +// On 32-bit platforms, it's small enough to just be a global. +var scavengeIndexArray [((1 << heapAddrBits) / pallocChunkBytes) / 8]atomic.Uint8 + +// See mpagealloc_64bit.go for details. +func (p *pageAlloc) sysInit() { + // Calculate how much memory all our entries will take up. + // + // This should be around 12 KiB or less. + totalSize := uintptr(0) + for l := 0; l < summaryLevels; l++ { + totalSize += (uintptr(1) << (heapAddrBits - levelShift[l])) * pallocSumBytes + } + totalSize = alignUp(totalSize, physPageSize) + + // Reserve memory for all levels in one go. There shouldn't be much for 32-bit. + reservation := sysReserve(nil, totalSize) + if reservation == nil { + throw("failed to reserve page summary memory") + } + // There isn't much. Just map it and mark it as used immediately. + sysMap(reservation, totalSize, p.sysStat) + sysUsed(reservation, totalSize, totalSize) + p.summaryMappedReady += totalSize + + // Iterate over the reservation and cut it up into slices. + // + // Maintain i as the byte offset from reservation where + // the new slice should start. + for l, shift := range levelShift { + entries := 1 << (heapAddrBits - shift) + + // Put this reservation into a slice. + sl := notInHeapSlice{(*notInHeap)(reservation), 0, entries} + p.summary[l] = *(*[]pallocSum)(unsafe.Pointer(&sl)) + + reservation = add(reservation, uintptr(entries)*pallocSumBytes) + } + + // Set up the scavenge index. + p.scav.index.chunks = scavengeIndexArray[:] +} + +// See mpagealloc_64bit.go for details. 
+func (p *pageAlloc) sysGrow(base, limit uintptr) { + if base%pallocChunkBytes != 0 || limit%pallocChunkBytes != 0 { + print("runtime: base = ", hex(base), ", limit = ", hex(limit), "\n") + throw("sysGrow bounds not aligned to pallocChunkBytes") + } + + // Walk up the tree and update the summary slices. + for l := len(p.summary) - 1; l >= 0; l-- { + // Figure out what part of the summary array this new address space needs. + // Note that we need to align the ranges to the block width (1<<levelBits[l]) + // at this level because the full block is needed to compute the summary for + // the next level. + lo, hi := addrsToSummaryRange(l, base, limit) + _, hi = blockAlignSummaryRange(l, lo, hi) + if hi > len(p.summary[l]) { + p.summary[l] = p.summary[l][:hi] + } + } +} diff --git a/src/runtime/mpagealloc_64bit.go b/src/runtime/mpagealloc_64bit.go new file mode 100644 index 0000000..371c1fb --- /dev/null +++ b/src/runtime/mpagealloc_64bit.go @@ -0,0 +1,257 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const ( + // The number of levels in the radix tree. + summaryLevels = 5 + + // Constants for testing. + pageAlloc32Bit = 0 + pageAlloc64Bit = 1 + + // Number of bits needed to represent all indices into the L1 of the + // chunks map. + // + // See (*pageAlloc).chunks for more details. Update the documentation + // there should this number change. + pallocChunksL1Bits = 13 +) + +// levelBits is the number of bits in the radix for a given level in the super summary +// structure. +// +// The sum of all the entries of levelBits should equal heapAddrBits. 
+var levelBits = [summaryLevels]uint{ + summaryL0Bits, + summaryLevelBits, + summaryLevelBits, + summaryLevelBits, + summaryLevelBits, +} + +// levelShift is the number of bits to shift to acquire the radix for a given level +// in the super summary structure. +// +// With levelShift, one can compute the index of the summary at level l related to a +// pointer p by doing: +// +// p >> levelShift[l] +var levelShift = [summaryLevels]uint{ + heapAddrBits - summaryL0Bits, + heapAddrBits - summaryL0Bits - 1*summaryLevelBits, + heapAddrBits - summaryL0Bits - 2*summaryLevelBits, + heapAddrBits - summaryL0Bits - 3*summaryLevelBits, + heapAddrBits - summaryL0Bits - 4*summaryLevelBits, +} + +// levelLogPages is log2 the maximum number of runtime pages in the address space +// a summary in the given level represents. +// +// The leaf level always represents exactly log2 of 1 chunk's worth of pages. +var levelLogPages = [summaryLevels]uint{ + logPallocChunkPages + 4*summaryLevelBits, + logPallocChunkPages + 3*summaryLevelBits, + logPallocChunkPages + 2*summaryLevelBits, + logPallocChunkPages + 1*summaryLevelBits, + logPallocChunkPages, +} + +// sysInit performs architecture-dependent initialization of fields +// in pageAlloc. pageAlloc should be uninitialized except for sysStat +// if any runtime statistic should be updated. +func (p *pageAlloc) sysInit() { + // Reserve memory for each level. This will get mapped in + // as R/W by setArenas. + for l, shift := range levelShift { + entries := 1 << (heapAddrBits - shift) + + // Reserve b bytes of memory anywhere in the address space. + b := alignUp(uintptr(entries)*pallocSumBytes, physPageSize) + r := sysReserve(nil, b) + if r == nil { + throw("failed to reserve page summary memory") + } + + // Put this reservation into a slice. + sl := notInHeapSlice{(*notInHeap)(r), 0, entries} + p.summary[l] = *(*[]pallocSum)(unsafe.Pointer(&sl)) + } + + // Set up the scavenge index. 
+ nbytes := uintptr(1<<heapAddrBits) / pallocChunkBytes / 8 + r := sysReserve(nil, nbytes) + sl := notInHeapSlice{(*notInHeap)(r), int(nbytes), int(nbytes)} + p.scav.index.chunks = *(*[]atomic.Uint8)(unsafe.Pointer(&sl)) +} + +// sysGrow performs architecture-dependent operations on heap +// growth for the page allocator, such as mapping in new memory +// for summaries. It also updates the length of the slices in +// p.summary. +// +// base is the base of the newly-added heap memory and limit is +// the first address past the end of the newly-added heap memory. +// Both must be aligned to pallocChunkBytes. +// +// The caller must update p.start and p.end after calling sysGrow. +func (p *pageAlloc) sysGrow(base, limit uintptr) { + if base%pallocChunkBytes != 0 || limit%pallocChunkBytes != 0 { + print("runtime: base = ", hex(base), ", limit = ", hex(limit), "\n") + throw("sysGrow bounds not aligned to pallocChunkBytes") + } + + // addrRangeToSummaryRange converts a range of addresses into a range + // of summary indices which must be mapped to support those addresses + // in the summary range. + addrRangeToSummaryRange := func(level int, r addrRange) (int, int) { + sumIdxBase, sumIdxLimit := addrsToSummaryRange(level, r.base.addr(), r.limit.addr()) + return blockAlignSummaryRange(level, sumIdxBase, sumIdxLimit) + } + + // summaryRangeToSumAddrRange converts a range of indices in any + // level of p.summary into page-aligned addresses which cover that + // range of indices. 
+ summaryRangeToSumAddrRange := func(level, sumIdxBase, sumIdxLimit int) addrRange { + baseOffset := alignDown(uintptr(sumIdxBase)*pallocSumBytes, physPageSize) + limitOffset := alignUp(uintptr(sumIdxLimit)*pallocSumBytes, physPageSize) + base := unsafe.Pointer(&p.summary[level][0]) + return addrRange{ + offAddr{uintptr(add(base, baseOffset))}, + offAddr{uintptr(add(base, limitOffset))}, + } + } + + // addrRangeToSumAddrRange is a convenience function that converts + // an address range r to the address range of the given summary level + // that stores the summaries for r. + addrRangeToSumAddrRange := func(level int, r addrRange) addrRange { + sumIdxBase, sumIdxLimit := addrRangeToSummaryRange(level, r) + return summaryRangeToSumAddrRange(level, sumIdxBase, sumIdxLimit) + } + + // Find the first inUse index which is strictly greater than base. + // + // Because this function will never be asked to remap the same memory + // twice, this index is effectively the index at which we would insert + // this new growth, and base will never overlap/be contained within + // any existing range. + // + // This will be used to look at what memory in the summary array is already + // mapped before and after this new range. + inUseIndex := p.inUse.findSucc(base) + + // Walk up the radix tree and map summaries in as needed. + for l := range p.summary { + // Figure out what part of the summary array this new address space needs. + needIdxBase, needIdxLimit := addrRangeToSummaryRange(l, makeAddrRange(base, limit)) + + // Update the summary slices with a new upper-bound. This ensures + // we get tight bounds checks on at least the top bound. + // + // We must do this regardless of whether we map new memory. + if needIdxLimit > len(p.summary[l]) { + p.summary[l] = p.summary[l][:needIdxLimit] + } + + // Compute the needed address range in the summary array for level l. 
+ need := summaryRangeToSumAddrRange(l, needIdxBase, needIdxLimit) + + // Prune need down to what needs to be newly mapped. Some parts of it may + // already be mapped by what inUse describes due to page alignment requirements + // for mapping. prune's invariants are guaranteed by the fact that this + // function will never be asked to remap the same memory twice. + if inUseIndex > 0 { + need = need.subtract(addrRangeToSumAddrRange(l, p.inUse.ranges[inUseIndex-1])) + } + if inUseIndex < len(p.inUse.ranges) { + need = need.subtract(addrRangeToSumAddrRange(l, p.inUse.ranges[inUseIndex])) + } + // It's possible that after our pruning above, there's nothing new to map. + if need.size() == 0 { + continue + } + + // Map and commit need. + sysMap(unsafe.Pointer(need.base.addr()), need.size(), p.sysStat) + sysUsed(unsafe.Pointer(need.base.addr()), need.size(), need.size()) + p.summaryMappedReady += need.size() + } + + // Update the scavenge index. + p.summaryMappedReady += p.scav.index.grow(base, limit, p.sysStat) +} + +// grow increases the index's backing store in response to a heap growth. +// +// Returns the amount of memory added to sysStat. +func (s *scavengeIndex) grow(base, limit uintptr, sysStat *sysMemStat) uintptr { + if base%pallocChunkBytes != 0 || limit%pallocChunkBytes != 0 { + print("runtime: base = ", hex(base), ", limit = ", hex(limit), "\n") + throw("sysGrow bounds not aligned to pallocChunkBytes") + } + // Map and commit the pieces of chunks that we need. + // + // We always map the full range of the minimum heap address to the + // maximum heap address. We don't do this for the summary structure + // because it's quite large and a discontiguous heap could cause a + // lot of memory to be used. In this situation, the worst case overhead + // is in the single-digit MiB if we map the whole thing. + // + // The base address of the backing store is always page-aligned, + // because it comes from the OS, so it's sufficient to align the + // index. 
+ haveMin := s.min.Load() + haveMax := s.max.Load() + needMin := int32(alignDown(uintptr(chunkIndex(base)/8), physPageSize)) + needMax := int32(alignUp(uintptr((chunkIndex(limit)+7)/8), physPageSize)) + // Extend the range down to what we have, if there's no overlap. + if needMax < haveMin { + needMax = haveMin + } + if needMin > haveMax { + needMin = haveMax + } + have := makeAddrRange( + // Avoid a panic from indexing one past the last element. + uintptr(unsafe.Pointer(&s.chunks[0]))+uintptr(haveMin), + uintptr(unsafe.Pointer(&s.chunks[0]))+uintptr(haveMax), + ) + need := makeAddrRange( + // Avoid a panic from indexing one past the last element. + uintptr(unsafe.Pointer(&s.chunks[0]))+uintptr(needMin), + uintptr(unsafe.Pointer(&s.chunks[0]))+uintptr(needMax), + ) + // Subtract any overlap from rounding. We can't re-map memory because + // it'll be zeroed. + need = need.subtract(have) + + // If we've got something to map, map it, and update the slice bounds. + if need.size() != 0 { + sysMap(unsafe.Pointer(need.base.addr()), need.size(), sysStat) + sysUsed(unsafe.Pointer(need.base.addr()), need.size(), need.size()) + // Update the indices only after the new memory is valid. + if haveMin == 0 || needMin < haveMin { + s.min.Store(needMin) + } + if haveMax == 0 || needMax > haveMax { + s.max.Store(needMax) + } + } + // Update minHeapIdx. Note that even if there's no mapping work to do, + // we may still have a new, lower minimum heap address. + minHeapIdx := s.minHeapIdx.Load() + if baseIdx := int32(chunkIndex(base) / 8); minHeapIdx == 0 || baseIdx < minHeapIdx { + s.minHeapIdx.Store(baseIdx) + } + return need.size() +} diff --git a/src/runtime/mpagealloc_test.go b/src/runtime/mpagealloc_test.go new file mode 100644 index 0000000..f2b82e3 --- /dev/null +++ b/src/runtime/mpagealloc_test.go @@ -0,0 +1,1040 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "fmt" + "internal/goos" + . "runtime" + "testing" +) + +func checkPageAlloc(t *testing.T, want, got *PageAlloc) { + // Ensure start and end are correct. + wantStart, wantEnd := want.Bounds() + gotStart, gotEnd := got.Bounds() + if gotStart != wantStart { + t.Fatalf("start values not equal: got %d, want %d", gotStart, wantStart) + } + if gotEnd != wantEnd { + t.Fatalf("end values not equal: got %d, want %d", gotEnd, wantEnd) + } + + for i := gotStart; i < gotEnd; i++ { + // Check the bitmaps. Note that we may have nil data. + gb, wb := got.PallocData(i), want.PallocData(i) + if gb == nil && wb == nil { + continue + } + if (gb == nil && wb != nil) || (gb != nil && wb == nil) { + t.Errorf("chunk %d nilness mismatch", i) + } + if !checkPallocBits(t, gb.PallocBits(), wb.PallocBits()) { + t.Logf("in chunk %d (mallocBits)", i) + } + if !checkPallocBits(t, gb.Scavenged(), wb.Scavenged()) { + t.Logf("in chunk %d (scavenged)", i) + } + } + // TODO(mknyszek): Verify summaries too? 
+} + +func TestPageAllocGrow(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + type test struct { + chunks []ChunkIdx + inUse []AddrRange + } + tests := map[string]test{ + "One": { + chunks: []ChunkIdx{ + BaseChunkIdx, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)), + }, + }, + "Contiguous2": { + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 1, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+2, 0)), + }, + }, + "Contiguous5": { + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 1, + BaseChunkIdx + 2, + BaseChunkIdx + 3, + BaseChunkIdx + 4, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+5, 0)), + }, + }, + "Discontiguous": { + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 2, + BaseChunkIdx + 4, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+2, 0), PageBase(BaseChunkIdx+3, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+4, 0), PageBase(BaseChunkIdx+5, 0)), + }, + }, + "Mixed": { + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 1, + BaseChunkIdx + 2, + BaseChunkIdx + 4, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+3, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+4, 0), PageBase(BaseChunkIdx+5, 0)), + }, + }, + "WildlyDiscontiguous": { + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 1, + BaseChunkIdx + 0x10, + BaseChunkIdx + 0x21, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+2, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+0x10, 0), PageBase(BaseChunkIdx+0x11, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+0x21, 0), PageBase(BaseChunkIdx+0x22, 0)), + }, + }, + "ManyDiscontiguous": { + // The initial cap is 16. 
Test 33 ranges, to exercise the growth path (twice). + chunks: []ChunkIdx{ + BaseChunkIdx, BaseChunkIdx + 2, BaseChunkIdx + 4, BaseChunkIdx + 6, + BaseChunkIdx + 8, BaseChunkIdx + 10, BaseChunkIdx + 12, BaseChunkIdx + 14, + BaseChunkIdx + 16, BaseChunkIdx + 18, BaseChunkIdx + 20, BaseChunkIdx + 22, + BaseChunkIdx + 24, BaseChunkIdx + 26, BaseChunkIdx + 28, BaseChunkIdx + 30, + BaseChunkIdx + 32, BaseChunkIdx + 34, BaseChunkIdx + 36, BaseChunkIdx + 38, + BaseChunkIdx + 40, BaseChunkIdx + 42, BaseChunkIdx + 44, BaseChunkIdx + 46, + BaseChunkIdx + 48, BaseChunkIdx + 50, BaseChunkIdx + 52, BaseChunkIdx + 54, + BaseChunkIdx + 56, BaseChunkIdx + 58, BaseChunkIdx + 60, BaseChunkIdx + 62, + BaseChunkIdx + 64, + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+2, 0), PageBase(BaseChunkIdx+3, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+4, 0), PageBase(BaseChunkIdx+5, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+6, 0), PageBase(BaseChunkIdx+7, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+8, 0), PageBase(BaseChunkIdx+9, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+10, 0), PageBase(BaseChunkIdx+11, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+12, 0), PageBase(BaseChunkIdx+13, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+14, 0), PageBase(BaseChunkIdx+15, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+16, 0), PageBase(BaseChunkIdx+17, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+18, 0), PageBase(BaseChunkIdx+19, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+20, 0), PageBase(BaseChunkIdx+21, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+22, 0), PageBase(BaseChunkIdx+23, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+24, 0), PageBase(BaseChunkIdx+25, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+26, 0), PageBase(BaseChunkIdx+27, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+28, 0), PageBase(BaseChunkIdx+29, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+30, 0), PageBase(BaseChunkIdx+31, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+32, 
0), PageBase(BaseChunkIdx+33, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+34, 0), PageBase(BaseChunkIdx+35, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+36, 0), PageBase(BaseChunkIdx+37, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+38, 0), PageBase(BaseChunkIdx+39, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+40, 0), PageBase(BaseChunkIdx+41, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+42, 0), PageBase(BaseChunkIdx+43, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+44, 0), PageBase(BaseChunkIdx+45, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+46, 0), PageBase(BaseChunkIdx+47, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+48, 0), PageBase(BaseChunkIdx+49, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+50, 0), PageBase(BaseChunkIdx+51, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+52, 0), PageBase(BaseChunkIdx+53, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+54, 0), PageBase(BaseChunkIdx+55, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+56, 0), PageBase(BaseChunkIdx+57, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+58, 0), PageBase(BaseChunkIdx+59, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+60, 0), PageBase(BaseChunkIdx+61, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+62, 0), PageBase(BaseChunkIdx+63, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+64, 0), PageBase(BaseChunkIdx+65, 0)), + }, + }, + } + // Disable these tests on iOS since we have a small address space. + // See #46860. + if PageAlloc64Bit != 0 && goos.IsIos == 0 { + tests["ExtremelyDiscontiguous"] = test{ + chunks: []ChunkIdx{ + BaseChunkIdx, + BaseChunkIdx + 0x100000, // constant translates to O(TiB) + }, + inUse: []AddrRange{ + MakeAddrRange(PageBase(BaseChunkIdx, 0), PageBase(BaseChunkIdx+1, 0)), + MakeAddrRange(PageBase(BaseChunkIdx+0x100000, 0), PageBase(BaseChunkIdx+0x100001, 0)), + }, + } + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + // By creating a new pageAlloc, we will + // grow it for each chunk defined in x. 
+ x := make(map[ChunkIdx][]BitRange) + for _, c := range v.chunks { + x[c] = []BitRange{} + } + b := NewPageAlloc(x, nil) + defer FreePageAlloc(b) + + got := b.InUse() + want := v.inUse + + // Check for mismatches. + if len(got) != len(want) { + t.Fail() + } else { + for i := range want { + if !want[i].Equals(got[i]) { + t.Fail() + break + } + } + } + if t.Failed() { + t.Logf("found inUse mismatch") + t.Logf("got:") + for i, r := range got { + t.Logf("\t#%d [0x%x, 0x%x)", i, r.Base(), r.Limit()) + } + t.Logf("want:") + for i, r := range want { + t.Logf("\t#%d [0x%x, 0x%x)", i, r.Base(), r.Limit()) + } + } + }) + } +} + +func TestPageAllocAlloc(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + type hit struct { + npages, base, scav uintptr + } + type test struct { + scav map[ChunkIdx][]BitRange + before map[ChunkIdx][]BitRange + after map[ChunkIdx][]BitRange + hits []hit + } + tests := map[string]test{ + "AllFree1": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 1}, {2, 2}}, + }, + hits: []hit{ + {1, PageBase(BaseChunkIdx, 0), PageSize}, + {1, PageBase(BaseChunkIdx, 1), 0}, + {1, PageBase(BaseChunkIdx, 2), PageSize}, + {1, PageBase(BaseChunkIdx, 3), PageSize}, + {1, PageBase(BaseChunkIdx, 4), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 5}}, + }, + }, + "ManyArena1": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages - 1}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + hits: []hit{ + {1, PageBase(BaseChunkIdx+2, PallocChunkPages-1), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, 
PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + }, + "NotContiguous1": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{0, 0}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{0, PallocChunkPages}}, + }, + hits: []hit{ + {1, PageBase(BaseChunkIdx+0xff, 0), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{0, 1}}, + }, + }, + "AllFree2": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 3}, {7, 1}}, + }, + hits: []hit{ + {2, PageBase(BaseChunkIdx, 0), 2 * PageSize}, + {2, PageBase(BaseChunkIdx, 2), PageSize}, + {2, PageBase(BaseChunkIdx, 4), 0}, + {2, PageBase(BaseChunkIdx, 6), PageSize}, + {2, PageBase(BaseChunkIdx, 8), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 10}}, + }, + }, + "Straddle2": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages - 1}}, + BaseChunkIdx + 1: {{1, PallocChunkPages - 1}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{PallocChunkPages - 1, 1}}, + BaseChunkIdx + 1: {}, + }, + hits: []hit{ + {2, PageBase(BaseChunkIdx, PallocChunkPages-1), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + }, + "AllFree5": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 8}, {9, 1}, {17, 5}}, + }, + hits: []hit{ + {5, PageBase(BaseChunkIdx, 0), 5 * PageSize}, + {5, PageBase(BaseChunkIdx, 5), 4 * PageSize}, + {5, PageBase(BaseChunkIdx, 10), 0}, + {5, PageBase(BaseChunkIdx, 15), 3 * PageSize}, + {5, PageBase(BaseChunkIdx, 20), 2 * PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 25}}, + }, + }, + "AllFree64": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, 
+ scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{21, 1}, {63, 65}}, + }, + hits: []hit{ + {64, PageBase(BaseChunkIdx, 0), 2 * PageSize}, + {64, PageBase(BaseChunkIdx, 64), 64 * PageSize}, + {64, PageBase(BaseChunkIdx, 128), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 192}}, + }, + }, + "AllFree65": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{129, 1}}, + }, + hits: []hit{ + {65, PageBase(BaseChunkIdx, 0), 0}, + {65, PageBase(BaseChunkIdx, 65), PageSize}, + {65, PageBase(BaseChunkIdx, 130), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 195}}, + }, + }, + "ExhaustPallocChunkPages-3": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{10, 1}}, + }, + hits: []hit{ + {PallocChunkPages - 3, PageBase(BaseChunkIdx, 0), PageSize}, + {PallocChunkPages - 3, 0, 0}, + {1, PageBase(BaseChunkIdx, PallocChunkPages-3), 0}, + {2, PageBase(BaseChunkIdx, PallocChunkPages-2), 0}, + {1, 0, 0}, + {PallocChunkPages - 3, 0, 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + }, + "AllFreePallocChunkPages": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 1}, {PallocChunkPages - 1, 1}}, + }, + hits: []hit{ + {PallocChunkPages, PageBase(BaseChunkIdx, 0), 2 * PageSize}, + {PallocChunkPages, 0, 0}, + {1, 0, 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + }, + "StraddlePallocChunkPages": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {{PallocChunkPages / 2, PallocChunkPages / 2}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {{3, 100}}, + }, + hits: []hit{ + {PallocChunkPages, PageBase(BaseChunkIdx, PallocChunkPages/2), 100 * PageSize}, + {PallocChunkPages, 0, 0}, + {1, 0, 0}, + }, + after: 
map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + }, + "StraddlePallocChunkPages+1": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + hits: []hit{ + {PallocChunkPages + 1, PageBase(BaseChunkIdx, PallocChunkPages/2), (PallocChunkPages + 1) * PageSize}, + {PallocChunkPages, 0, 0}, + {1, PageBase(BaseChunkIdx+1, PallocChunkPages/2+1), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages/2 + 2}}, + }, + }, + "AllFreePallocChunkPages*2": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + hits: []hit{ + {PallocChunkPages * 2, PageBase(BaseChunkIdx, 0), 0}, + {PallocChunkPages * 2, 0, 0}, + {1, 0, 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + }, + "NotContiguousPallocChunkPages*2": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 0x40: {}, + BaseChunkIdx + 0x41: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0x40: {}, + BaseChunkIdx + 0x41: {}, + }, + hits: []hit{ + {PallocChunkPages * 2, PageBase(BaseChunkIdx+0x40, 0), 0}, + {21, PageBase(BaseChunkIdx, 0), 21 * PageSize}, + {1, PageBase(BaseChunkIdx, 21), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 22}}, + BaseChunkIdx + 0x40: {{0, PallocChunkPages}}, + BaseChunkIdx + 0x41: {{0, PallocChunkPages}}, + }, + }, + "StraddlePallocChunkPages*2": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {{PallocChunkPages / 
2, PallocChunkPages / 2}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 7}}, + BaseChunkIdx + 1: {{3, 5}, {121, 10}}, + BaseChunkIdx + 2: {{PallocChunkPages/2 + 12, 2}}, + }, + hits: []hit{ + {PallocChunkPages * 2, PageBase(BaseChunkIdx, PallocChunkPages/2), 15 * PageSize}, + {PallocChunkPages * 2, 0, 0}, + {1, 0, 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + }, + "StraddlePallocChunkPages*5/4": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages * 3 / 4}}, + BaseChunkIdx + 2: {{0, PallocChunkPages * 3 / 4}}, + BaseChunkIdx + 3: {{0, 0}}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{PallocChunkPages / 2, PallocChunkPages/4 + 1}}, + BaseChunkIdx + 2: {{PallocChunkPages / 3, 1}}, + BaseChunkIdx + 3: {{PallocChunkPages * 2 / 3, 1}}, + }, + hits: []hit{ + {PallocChunkPages * 5 / 4, PageBase(BaseChunkIdx+2, PallocChunkPages*3/4), PageSize}, + {PallocChunkPages * 5 / 4, 0, 0}, + {1, PageBase(BaseChunkIdx+1, PallocChunkPages*3/4), PageSize}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages*3/4 + 1}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + BaseChunkIdx + 3: {{0, PallocChunkPages}}, + }, + }, + "AllFreePallocChunkPages*7+5": { + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {}, + BaseChunkIdx + 3: {}, + BaseChunkIdx + 4: {}, + BaseChunkIdx + 5: {}, + BaseChunkIdx + 6: {}, + BaseChunkIdx + 7: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{50, 1}}, + BaseChunkIdx + 1: {{31, 1}}, + BaseChunkIdx + 2: {{7, 1}}, + BaseChunkIdx + 3: {{200, 1}}, + BaseChunkIdx + 4: {{3, 1}}, + BaseChunkIdx + 5: {{51, 1}}, + BaseChunkIdx + 6: {{20, 1}}, + BaseChunkIdx + 7: {{1, 1}}, + }, + 
hits: []hit{ + {PallocChunkPages*7 + 5, PageBase(BaseChunkIdx, 0), 8 * PageSize}, + {PallocChunkPages*7 + 5, 0, 0}, + {1, PageBase(BaseChunkIdx+7, 5), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + BaseChunkIdx + 3: {{0, PallocChunkPages}}, + BaseChunkIdx + 4: {{0, PallocChunkPages}}, + BaseChunkIdx + 5: {{0, PallocChunkPages}}, + BaseChunkIdx + 6: {{0, PallocChunkPages}}, + BaseChunkIdx + 7: {{0, 6}}, + }, + }, + } + // Disable these tests on iOS since we have a small address space. + // See #46860. + if PageAlloc64Bit != 0 && goos.IsIos == 0 { + const chunkIdxBigJump = 0x100000 // chunk index offset which translates to O(TiB) + + // This test attempts to trigger a bug wherein we look at unmapped summary + // memory that isn't just in the case where we exhaust the heap. + // + // It achieves this by placing a chunk such that its summary will be + // at the very end of a physical page. It then also places another chunk + // much further up in the address space, such that any allocations into the + // first chunk do not exhaust the heap and the second chunk's summary is not in the + // page immediately adjacent to the first chunk's summary's page. + // Allocating into this first chunk to exhaustion and then into the second + // chunk may then trigger a check in the allocator which erroneously looks at + // unmapped summary memory and crashes. + + // Figure out how many chunks are in a physical page, then align BaseChunkIdx + // to a physical page in the chunk summary array. Here we only assume that + // each summary array is aligned to some physical page. 
+ sumsPerPhysPage := ChunkIdx(PhysPageSize / PallocSumBytes) + baseChunkIdx := BaseChunkIdx &^ (sumsPerPhysPage - 1) + tests["DiscontiguousMappedSumBoundary"] = test{ + before: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {}, + baseChunkIdx + chunkIdxBigJump: {}, + }, + scav: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {}, + baseChunkIdx + chunkIdxBigJump: {}, + }, + hits: []hit{ + {PallocChunkPages - 1, PageBase(baseChunkIdx+sumsPerPhysPage-1, 0), 0}, + {1, PageBase(baseChunkIdx+sumsPerPhysPage-1, PallocChunkPages-1), 0}, + {1, PageBase(baseChunkIdx+chunkIdxBigJump, 0), 0}, + {PallocChunkPages - 1, PageBase(baseChunkIdx+chunkIdxBigJump, 1), 0}, + {1, 0, 0}, + }, + after: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {{0, PallocChunkPages}}, + baseChunkIdx + chunkIdxBigJump: {{0, PallocChunkPages}}, + }, + } + + // Test to check for issue #40191. Essentially, the candidate searchAddr + // discovered by find may not point to mapped memory, so we need to handle + // that explicitly. + // + // chunkIdxSmallOffset is an offset intended to be used within chunkIdxBigJump. + // It is far enough within chunkIdxBigJump that the summaries at the beginning + // of an address range the size of chunkIdxBigJump will not be mapped in. + const chunkIdxSmallOffset = 0x503 + tests["DiscontiguousBadSearchAddr"] = test{ + before: map[ChunkIdx][]BitRange{ + // The mechanism for the bug involves three chunks, A, B, and C, which are + // far apart in the address space. In particular, B is chunkIdxBigJump + + // chunkIdxSmallOffset chunks away from A, and C is 2*chunkIdxBigJump chunks + // away from A. A has 1 page free, B has several (NOT at the end of B), and + // C is totally free. + // Note that B's free memory must not be at the end of B because the fast + // path in the page allocator will check if the searchAddr even gives us + // enough space to place the allocation in a chunk before accessing the + // summary.
+ BaseChunkIdx + chunkIdxBigJump*0: {{0, PallocChunkPages - 1}}, + BaseChunkIdx + chunkIdxBigJump*1 + chunkIdxSmallOffset: { + {0, PallocChunkPages - 10}, + {PallocChunkPages - 1, 1}, + }, + BaseChunkIdx + chunkIdxBigJump*2: {}, + }, + scav: map[ChunkIdx][]BitRange{ + BaseChunkIdx + chunkIdxBigJump*0: {}, + BaseChunkIdx + chunkIdxBigJump*1 + chunkIdxSmallOffset: {}, + BaseChunkIdx + chunkIdxBigJump*2: {}, + }, + hits: []hit{ + // We first allocate into A to set the page allocator's searchAddr to the + // end of that chunk. That is the only purpose A serves. + {1, PageBase(BaseChunkIdx, PallocChunkPages-1), 0}, + // Then, we make a big allocation that doesn't fit into B, and so must be + // fulfilled by C. + // + // On the way to fulfilling the allocation into C, we estimate searchAddr + // using the summary structure, but that will give us a searchAddr of + // B's base address minus chunkIdxSmallOffset chunks. These chunks will + // not be mapped. + {100, PageBase(baseChunkIdx+chunkIdxBigJump*2, 0), 0}, + // Now we try to make a smaller allocation that can be fulfilled by B. + // In an older implementation of the page allocator, this will segfault, + // because this last allocation will first try to access the summary + // for B's base address minus chunkIdxSmallOffset chunks in the fast path, + // and this will not be mapped. 
+ {9, PageBase(baseChunkIdx+chunkIdxBigJump*1+chunkIdxSmallOffset, PallocChunkPages-10), 0}, + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx + chunkIdxBigJump*0: {{0, PallocChunkPages}}, + BaseChunkIdx + chunkIdxBigJump*1 + chunkIdxSmallOffset: {{0, PallocChunkPages}}, + BaseChunkIdx + chunkIdxBigJump*2: {{0, 100}}, + }, + } + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := NewPageAlloc(v.before, v.scav) + defer FreePageAlloc(b) + + for iter, i := range v.hits { + a, s := b.Alloc(i.npages) + if a != i.base { + t.Fatalf("bad alloc #%d: want base 0x%x, got 0x%x", iter+1, i.base, a) + } + if s != i.scav { + t.Fatalf("bad alloc #%d: want scav %d, got %d", iter+1, i.scav, s) + } + } + want := NewPageAlloc(v.after, v.scav) + defer FreePageAlloc(want) + + checkPageAlloc(t, want, b) + }) + } +} + +func TestPageAllocExhaust(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + for _, npages := range []uintptr{1, 2, 3, 4, 5, 8, 16, 64, 1024, 1025, 2048, 2049} { + npages := npages + t.Run(fmt.Sprintf("%d", npages), func(t *testing.T) { + // Construct b. + bDesc := make(map[ChunkIdx][]BitRange) + for i := ChunkIdx(0); i < 4; i++ { + bDesc[BaseChunkIdx+i] = []BitRange{} + } + b := NewPageAlloc(bDesc, nil) + defer FreePageAlloc(b) + + // Allocate into b with npages until we've exhausted the heap. + nAlloc := (PallocChunkPages * 4) / int(npages) + for i := 0; i < nAlloc; i++ { + addr := PageBase(BaseChunkIdx, uint(i)*uint(npages)) + if a, _ := b.Alloc(npages); a != addr { + t.Fatalf("bad alloc #%d: want 0x%x, got 0x%x", i+1, addr, a) + } + } + + // Check to make sure the next allocation fails. + if a, _ := b.Alloc(npages); a != 0 { + t.Fatalf("bad alloc #%d: want 0, got 0x%x", nAlloc, a) + } + + // Construct what we want the heap to look like now. 
+ allocPages := nAlloc * int(npages) + wantDesc := make(map[ChunkIdx][]BitRange) + for i := ChunkIdx(0); i < 4; i++ { + if allocPages >= PallocChunkPages { + wantDesc[BaseChunkIdx+i] = []BitRange{{0, PallocChunkPages}} + allocPages -= PallocChunkPages + } else if allocPages > 0 { + wantDesc[BaseChunkIdx+i] = []BitRange{{0, uint(allocPages)}} + allocPages = 0 + } else { + wantDesc[BaseChunkIdx+i] = []BitRange{} + } + } + want := NewPageAlloc(wantDesc, nil) + defer FreePageAlloc(want) + + // Check to make sure the heap b matches what we want. + checkPageAlloc(t, want, b) + }) + } +} + +func TestPageAllocFree(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + tests := map[string]struct { + before map[ChunkIdx][]BitRange + after map[ChunkIdx][]BitRange + npages uintptr + frees []uintptr + }{ + "Free1": { + npages: 1, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + PageBase(BaseChunkIdx, 1), + PageBase(BaseChunkIdx, 2), + PageBase(BaseChunkIdx, 3), + PageBase(BaseChunkIdx, 4), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{5, PallocChunkPages - 5}}, + }, + }, + "ManyArena1": { + npages: 1, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, PallocChunkPages/2), + PageBase(BaseChunkIdx+1, 0), + PageBase(BaseChunkIdx+2, PallocChunkPages-1), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}, {PallocChunkPages/2 + 1, PallocChunkPages/2 - 1}}, + BaseChunkIdx + 1: {{1, PallocChunkPages - 1}}, + BaseChunkIdx + 2: {{0, PallocChunkPages - 1}}, + }, + }, + "Free2": { + npages: 2, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + 
PageBase(BaseChunkIdx, 2), + PageBase(BaseChunkIdx, 4), + PageBase(BaseChunkIdx, 6), + PageBase(BaseChunkIdx, 8), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{10, PallocChunkPages - 10}}, + }, + }, + "Straddle2": { + npages: 2, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{PallocChunkPages - 1, 1}}, + BaseChunkIdx + 1: {{0, 1}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, PallocChunkPages-1), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + }, + "Free5": { + npages: 5, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + PageBase(BaseChunkIdx, 5), + PageBase(BaseChunkIdx, 10), + PageBase(BaseChunkIdx, 15), + PageBase(BaseChunkIdx, 20), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{25, PallocChunkPages - 25}}, + }, + }, + "Free64": { + npages: 64, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + PageBase(BaseChunkIdx, 64), + PageBase(BaseChunkIdx, 128), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{192, PallocChunkPages - 192}}, + }, + }, + "Free65": { + npages: 65, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + PageBase(BaseChunkIdx, 65), + PageBase(BaseChunkIdx, 130), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{195, PallocChunkPages - 195}}, + }, + }, + "FreePallocChunkPages": { + npages: PallocChunkPages, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + }, + "StraddlePallocChunkPages": { + npages: PallocChunkPages, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{PallocChunkPages / 2, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {{0, PallocChunkPages / 2}}, + }, + frees: 
[]uintptr{ + PageBase(BaseChunkIdx, PallocChunkPages/2), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + }, + "StraddlePallocChunkPages+1": { + npages: PallocChunkPages + 1, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, PallocChunkPages/2), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {{PallocChunkPages/2 + 1, PallocChunkPages/2 - 1}}, + }, + }, + "FreePallocChunkPages*2": { + npages: PallocChunkPages * 2, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + }, + }, + "StraddlePallocChunkPages*2": { + npages: PallocChunkPages * 2, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, PallocChunkPages/2), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages / 2}}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {{PallocChunkPages / 2, PallocChunkPages / 2}}, + }, + }, + "AllFreePallocChunkPages*7+5": { + npages: PallocChunkPages*7 + 5, + before: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + BaseChunkIdx + 3: {{0, PallocChunkPages}}, + BaseChunkIdx + 4: {{0, PallocChunkPages}}, + BaseChunkIdx + 5: {{0, PallocChunkPages}}, + BaseChunkIdx + 6: {{0, PallocChunkPages}}, + BaseChunkIdx + 7: {{0, PallocChunkPages}}, + }, + frees: []uintptr{ + PageBase(BaseChunkIdx, 0), + }, + after: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + 
BaseChunkIdx + 2: {}, + BaseChunkIdx + 3: {}, + BaseChunkIdx + 4: {}, + BaseChunkIdx + 5: {}, + BaseChunkIdx + 6: {}, + BaseChunkIdx + 7: {{5, PallocChunkPages - 5}}, + }, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := NewPageAlloc(v.before, nil) + defer FreePageAlloc(b) + + for _, addr := range v.frees { + b.Free(addr, v.npages) + } + want := NewPageAlloc(v.after, nil) + defer FreePageAlloc(want) + + checkPageAlloc(t, want, b) + }) + } +} + +func TestPageAllocAllocAndFree(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + type hit struct { + alloc bool + npages uintptr + base uintptr + } + tests := map[string]struct { + init map[ChunkIdx][]BitRange + hits []hit + }{ + // TODO(mknyszek): Write more tests here. + "Chunks8": { + init: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + BaseChunkIdx + 1: {}, + BaseChunkIdx + 2: {}, + BaseChunkIdx + 3: {}, + BaseChunkIdx + 4: {}, + BaseChunkIdx + 5: {}, + BaseChunkIdx + 6: {}, + BaseChunkIdx + 7: {}, + }, + hits: []hit{ + {true, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {false, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {true, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {false, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {true, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {false, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + {true, 1, PageBase(BaseChunkIdx, 0)}, + {false, 1, PageBase(BaseChunkIdx, 0)}, + {true, PallocChunkPages * 8, PageBase(BaseChunkIdx, 0)}, + }, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := NewPageAlloc(v.init, nil) + defer FreePageAlloc(b) + + for iter, i := range v.hits { + if i.alloc { + if a, _ := b.Alloc(i.npages); a != i.base { + t.Fatalf("bad alloc #%d: want 0x%x, got 0x%x", iter+1, i.base, a) + } + } else { + b.Free(i.base, i.npages) + } + } + }) + } +} diff --git 
a/src/runtime/mpagecache.go b/src/runtime/mpagecache.go new file mode 100644 index 0000000..5bc9c84 --- /dev/null +++ b/src/runtime/mpagecache.go @@ -0,0 +1,176 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/sys" + "unsafe" +) + +const pageCachePages = 8 * unsafe.Sizeof(pageCache{}.cache) + +// pageCache represents a per-p cache of pages the allocator can +// allocate from without a lock. More specifically, it represents +// a pageCachePages*pageSize chunk of memory with 0 or more free +// pages in it. +type pageCache struct { + base uintptr // base address of the chunk + cache uint64 // 64-bit bitmap representing free pages (1 means free) + scav uint64 // 64-bit bitmap representing scavenged pages (1 means scavenged) +} + +// empty reports whether the page cache has no free pages. +func (c *pageCache) empty() bool { + return c.cache == 0 +} + +// alloc allocates npages from the page cache and is the main entry +// point for allocation. +// +// Returns a base address and the amount of scavenged memory in the +// allocated region in bytes. +// +// Returns a base address of zero on failure, in which case the +// amount of scavenged memory should be ignored. +func (c *pageCache) alloc(npages uintptr) (uintptr, uintptr) { + if c.cache == 0 { + return 0, 0 + } + if npages == 1 { + i := uintptr(sys.TrailingZeros64(c.cache)) + scav := (c.scav >> i) & 1 + c.cache &^= 1 << i // clear bit to mark in-use + c.scav &^= 1 << i // clear bit to mark unscavenged + return c.base + i*pageSize, uintptr(scav) * pageSize + } + return c.allocN(npages) +} + +// allocN is a helper which attempts to allocate npages worth of pages +// from the cache. It represents the general case for allocating from +// the page cache. +// +// Returns a base address and the amount of scavenged memory in the +// allocated region in bytes.
+func (c *pageCache) allocN(npages uintptr) (uintptr, uintptr) { + i := findBitRange64(c.cache, uint(npages)) + if i >= 64 { + return 0, 0 + } + mask := ((uint64(1) << npages) - 1) << i + scav := sys.OnesCount64(c.scav & mask) + c.cache &^= mask // mark in-use bits + c.scav &^= mask // clear scavenged bits + return c.base + uintptr(i*pageSize), uintptr(scav) * pageSize +} + +// flush empties out unallocated free pages in the given cache +// into p. Then, it clears the cache, such that empty returns +// true. +// +// p.mheapLock must be held. +// +// Must run on the system stack because p.mheapLock must be held. +// +//go:systemstack +func (c *pageCache) flush(p *pageAlloc) { + assertLockHeld(p.mheapLock) + + if c.empty() { + return + } + ci := chunkIndex(c.base) + pi := chunkPageIndex(c.base) + + // This method is called very infrequently, so just do the + // slower, safer thing by iterating over each bit individually. + for i := uint(0); i < 64; i++ { + if c.cache&(1<<i) != 0 { + p.chunkOf(ci).free1(pi + i) + } + if c.scav&(1<<i) != 0 { + p.chunkOf(ci).scavenged.setRange(pi+i, 1) + } + } + // Since this is a lot like a free, we need to make sure + // we update the searchAddr just like free does. + if b := (offAddr{c.base}); b.lessThan(p.searchAddr) { + p.searchAddr = b + } + p.update(c.base, pageCachePages, false, false) + *c = pageCache{} +} + +// allocToCache acquires a pageCachePages-aligned chunk of free pages which +// may not be contiguous, and returns a pageCache structure which owns the +// chunk. +// +// p.mheapLock must be held. +// +// Must run on the system stack because p.mheapLock must be held. +// +//go:systemstack +func (p *pageAlloc) allocToCache() pageCache { + assertLockHeld(p.mheapLock) + + // If the searchAddr refers to a region which has a higher address than + // any known chunk, then we know we're out of memory.
+ if chunkIndex(p.searchAddr.addr()) >= p.end { + return pageCache{} + } + c := pageCache{} + ci := chunkIndex(p.searchAddr.addr()) // chunk index + var chunk *pallocData + if p.summary[len(p.summary)-1][ci] != 0 { + // Fast path: there's free pages at or near the searchAddr address. + chunk = p.chunkOf(ci) + j, _ := chunk.find(1, chunkPageIndex(p.searchAddr.addr())) + if j == ^uint(0) { + throw("bad summary data") + } + c = pageCache{ + base: chunkBase(ci) + alignDown(uintptr(j), 64)*pageSize, + cache: ^chunk.pages64(j), + scav: chunk.scavenged.block64(j), + } + } else { + // Slow path: the searchAddr address had nothing there, so go find + // the first free page the slow way. + addr, _ := p.find(1) + if addr == 0 { + // We failed to find adequate free space, so mark the searchAddr as OoM + // and return an empty pageCache. + p.searchAddr = maxSearchAddr() + return pageCache{} + } + ci := chunkIndex(addr) + chunk = p.chunkOf(ci) + c = pageCache{ + base: alignDown(addr, 64*pageSize), + cache: ^chunk.pages64(chunkPageIndex(addr)), + scav: chunk.scavenged.block64(chunkPageIndex(addr)), + } + } + + // Set the page bits as allocated and clear the scavenged bits, but + // be careful to only set and clear the relevant bits. + cpi := chunkPageIndex(c.base) + chunk.allocPages64(cpi, c.cache) + chunk.scavenged.clearBlock64(cpi, c.cache&c.scav /* free and scavenged */) + + // Update as an allocation, but note that it's not contiguous. + p.update(c.base, pageCachePages, false, true) + + // Set the search address to the last page represented by the cache. + // Since all of the pages in this block are going to the cache, and we + // searched for the first free page, we can confidently start at the + // next page. + // + // However, p.searchAddr is not allowed to point into unmapped heap memory + // unless it is maxSearchAddr, so make it the last page as opposed to + // the page after. 
+ p.searchAddr = offAddr{c.base + pageSize*(pageCachePages-1)} + return c +} diff --git a/src/runtime/mpagecache_test.go b/src/runtime/mpagecache_test.go new file mode 100644 index 0000000..6cb0620 --- /dev/null +++ b/src/runtime/mpagecache_test.go @@ -0,0 +1,424 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "internal/goos" + "math/rand" + . "runtime" + "testing" +) + +func checkPageCache(t *testing.T, got, want PageCache) { + if got.Base() != want.Base() { + t.Errorf("bad pageCache base: got 0x%x, want 0x%x", got.Base(), want.Base()) + } + if got.Cache() != want.Cache() { + t.Errorf("bad pageCache bits: got %016x, want %016x", got.Cache(), want.Cache()) + } + if got.Scav() != want.Scav() { + t.Errorf("bad pageCache scav: got %016x, want %016x", got.Scav(), want.Scav()) + } +} + +func TestPageCacheAlloc(t *testing.T) { + base := PageBase(BaseChunkIdx, 0) + type hit struct { + npages uintptr + base uintptr + scav uintptr + } + tests := map[string]struct { + cache PageCache + hits []hit + }{ + "Empty": { + cache: NewPageCache(base, 0, 0), + hits: []hit{ + {1, 0, 0}, + {2, 0, 0}, + {3, 0, 0}, + {4, 0, 0}, + {5, 0, 0}, + {11, 0, 0}, + {12, 0, 0}, + {16, 0, 0}, + {27, 0, 0}, + {32, 0, 0}, + {43, 0, 0}, + {57, 0, 0}, + {64, 0, 0}, + {121, 0, 0}, + }, + }, + "Lo1": { + cache: NewPageCache(base, 0x1, 0x1), + hits: []hit{ + {1, base, PageSize}, + {1, 0, 0}, + {10, 0, 0}, + }, + }, + "Hi1": { + cache: NewPageCache(base, 0x1<<63, 0x1), + hits: []hit{ + {1, base + 63*PageSize, 0}, + {1, 0, 0}, + {10, 0, 0}, + }, + }, + "Swiss1": { + cache: NewPageCache(base, 0x20005555, 0x5505), + hits: []hit{ + {2, 0, 0}, + {1, base, PageSize}, + {1, base + 2*PageSize, PageSize}, + {1, base + 4*PageSize, 0}, + {1, base + 6*PageSize, 0}, + {1, base + 8*PageSize, PageSize}, + {1, base + 10*PageSize, PageSize}, + {1, base + 12*PageSize,
PageSize}, + {1, base + 14*PageSize, PageSize}, + {1, base + 29*PageSize, 0}, + {1, 0, 0}, + {10, 0, 0}, + }, + }, + "Lo2": { + cache: NewPageCache(base, 0x3, 0x2<<62), + hits: []hit{ + {2, base, 0}, + {2, 0, 0}, + {1, 0, 0}, + }, + }, + "Hi2": { + cache: NewPageCache(base, 0x3<<62, 0x3<<62), + hits: []hit{ + {2, base + 62*PageSize, 2 * PageSize}, + {2, 0, 0}, + {1, 0, 0}, + }, + }, + "Swiss2": { + cache: NewPageCache(base, 0x3333<<31, 0x3030<<31), + hits: []hit{ + {2, base + 31*PageSize, 0}, + {2, base + 35*PageSize, 2 * PageSize}, + {2, base + 39*PageSize, 0}, + {2, base + 43*PageSize, 2 * PageSize}, + {2, 0, 0}, + }, + }, + "Hi53": { + cache: NewPageCache(base, ((uint64(1)<<53)-1)<<10, ((uint64(1)<<16)-1)<<10), + hits: []hit{ + {53, base + 10*PageSize, 16 * PageSize}, + {53, 0, 0}, + {1, 0, 0}, + }, + }, + "Full53": { + cache: NewPageCache(base, ^uint64(0), ((uint64(1)<<16)-1)<<10), + hits: []hit{ + {53, base, 16 * PageSize}, + {53, 0, 0}, + {1, base + 53*PageSize, 0}, + }, + }, + "Full64": { + cache: NewPageCache(base, ^uint64(0), ^uint64(0)), + hits: []hit{ + {64, base, 64 * PageSize}, + {64, 0, 0}, + {1, 0, 0}, + }, + }, + "FullMixed": { + cache: NewPageCache(base, ^uint64(0), ^uint64(0)), + hits: []hit{ + {5, base, 5 * PageSize}, + {7, base + 5*PageSize, 7 * PageSize}, + {1, base + 12*PageSize, 1 * PageSize}, + {23, base + 13*PageSize, 23 * PageSize}, + {63, 0, 0}, + {3, base + 36*PageSize, 3 * PageSize}, + {3, base + 39*PageSize, 3 * PageSize}, + {3, base + 42*PageSize, 3 * PageSize}, + {12, base + 45*PageSize, 12 * PageSize}, + {11, 0, 0}, + {4, base + 57*PageSize, 4 * PageSize}, + {4, 0, 0}, + {6, 0, 0}, + {36, 0, 0}, + {2, base + 61*PageSize, 2 * PageSize}, + {3, 0, 0}, + {1, base + 63*PageSize, 1 * PageSize}, + {4, 0, 0}, + {2, 0, 0}, + {62, 0, 0}, + {1, 0, 0}, + }, + }, + } + for name, test := range tests { + test := test + t.Run(name, func(t *testing.T) { + c := test.cache + for i, h := range test.hits { + b, s := c.Alloc(h.npages) + if b != h.base { 
+ t.Fatalf("bad alloc base #%d: got 0x%x, want 0x%x", i, b, h.base) + } + if s != h.scav { + t.Fatalf("bad alloc scav #%d: got %d, want %d", i, s, h.scav) + } + } + }) + } +} + +func TestPageCacheFlush(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + bits64ToBitRanges := func(bits uint64, base uint) []BitRange { + var ranges []BitRange + start, size := uint(0), uint(0) + for i := 0; i < 64; i++ { + if bits&(1<<i) != 0 { + if size == 0 { + start = uint(i) + base + } + size++ + } else { + if size != 0 { + ranges = append(ranges, BitRange{start, size}) + size = 0 + } + } + } + if size != 0 { + ranges = append(ranges, BitRange{start, size}) + } + return ranges + } + runTest := func(t *testing.T, base uint, cache, scav uint64) { + // Set up the before state. + beforeAlloc := map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{base, 64}}, + } + beforeScav := map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + } + b := NewPageAlloc(beforeAlloc, beforeScav) + defer FreePageAlloc(b) + + // Create and flush the cache. + c := NewPageCache(PageBase(BaseChunkIdx, base), cache, scav) + c.Flush(b) + if !c.Empty() { + t.Errorf("pageCache flush did not clear cache") + } + + // Set up the expected after state. + afterAlloc := map[ChunkIdx][]BitRange{ + BaseChunkIdx: bits64ToBitRanges(^cache, base), + } + afterScav := map[ChunkIdx][]BitRange{ + BaseChunkIdx: bits64ToBitRanges(scav, base), + } + want := NewPageAlloc(afterAlloc, afterScav) + defer FreePageAlloc(want) + + // Check to see if it worked. + checkPageAlloc(t, want, b) + } + + // Empty. + runTest(t, 0, 0, 0) + + // Full. + runTest(t, 0, ^uint64(0), ^uint64(0)) + + // Random. + for i := 0; i < 100; i++ { + // Generate random valid base within a chunk. + base := uint(rand.Intn(PallocChunkPages/64)) * 64 + + // Generate random cache. + cache := rand.Uint64() + scav := rand.Uint64() & cache + + // Run the test. 
+ runTest(t, base, cache, scav) + } +} + +func TestPageAllocAllocToCache(t *testing.T) { + if GOOS == "openbsd" && testing.Short() { + t.Skip("skipping because virtual memory is limited; see #36210") + } + type test struct { + beforeAlloc map[ChunkIdx][]BitRange + beforeScav map[ChunkIdx][]BitRange + hits []PageCache // expected base addresses and patterns + afterAlloc map[ChunkIdx][]BitRange + afterScav map[ChunkIdx][]BitRange + } + tests := map[string]test{ + "AllFree": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{1, 1}, {64, 64}}, + }, + hits: []PageCache{ + NewPageCache(PageBase(BaseChunkIdx, 0), ^uint64(0), 0x2), + NewPageCache(PageBase(BaseChunkIdx, 64), ^uint64(0), ^uint64(0)), + NewPageCache(PageBase(BaseChunkIdx, 128), ^uint64(0), 0), + NewPageCache(PageBase(BaseChunkIdx, 192), ^uint64(0), 0), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 256}}, + }, + }, + "ManyArena": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages - 64}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {}, + }, + hits: []PageCache{ + NewPageCache(PageBase(BaseChunkIdx+2, PallocChunkPages-64), ^uint64(0), 0), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 1: {{0, PallocChunkPages}}, + BaseChunkIdx + 2: {{0, PallocChunkPages}}, + }, + }, + "NotContiguous": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{0, 0}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{31, 67}}, + }, + hits: []PageCache{ + NewPageCache(PageBase(BaseChunkIdx+0xff, 0), ^uint64(0), ((uint64(1)<<33)-1)<<31), + }, + afterAlloc: 
map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{0, 64}}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + BaseChunkIdx + 0xff: {{64, 34}}, + }, + }, + "First": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 32}, {33, 31}, {96, 32}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{1, 4}, {31, 5}, {66, 2}}, + }, + hits: []PageCache{ + NewPageCache(PageBase(BaseChunkIdx, 0), 1<<32, 1<<32), + NewPageCache(PageBase(BaseChunkIdx, 64), (uint64(1)<<32)-1, 0x3<<2), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 128}}, + }, + }, + "Fail": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + hits: []PageCache{ + NewPageCache(0, 0, 0), + NewPageCache(0, 0, 0), + NewPageCache(0, 0, 0), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, PallocChunkPages}}, + }, + }, + "RetainScavBits": { + beforeAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 1}, {10, 2}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 4}, {11, 1}}, + }, + hits: []PageCache{ + NewPageCache(PageBase(BaseChunkIdx, 0), ^uint64(0x1|(0x3<<10)), 0x7<<1), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 64}}, + }, + afterScav: map[ChunkIdx][]BitRange{ + BaseChunkIdx: {{0, 1}, {11, 1}}, + }, + }, + } + // Disable these tests on iOS since we have a small address space. + // See #46860. + if PageAlloc64Bit != 0 && goos.IsIos == 0 { + const chunkIdxBigJump = 0x100000 // chunk index offset which translates to O(TiB) + + // This test is similar to the one with the same name for + // pageAlloc.alloc and serves the same purpose. + // See mpagealloc_test.go for details. 
+ sumsPerPhysPage := ChunkIdx(PhysPageSize / PallocSumBytes) + baseChunkIdx := BaseChunkIdx &^ (sumsPerPhysPage - 1) + tests["DiscontiguousMappedSumBoundary"] = test{ + beforeAlloc: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {{0, PallocChunkPages - 1}}, + baseChunkIdx + chunkIdxBigJump: {{1, PallocChunkPages - 1}}, + }, + beforeScav: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {}, + baseChunkIdx + chunkIdxBigJump: {}, + }, + hits: []PageCache{ + NewPageCache(PageBase(baseChunkIdx+sumsPerPhysPage-1, PallocChunkPages-64), 1<<63, 0), + NewPageCache(PageBase(baseChunkIdx+chunkIdxBigJump, 0), 1, 0), + NewPageCache(0, 0, 0), + }, + afterAlloc: map[ChunkIdx][]BitRange{ + baseChunkIdx + sumsPerPhysPage - 1: {{0, PallocChunkPages}}, + baseChunkIdx + chunkIdxBigJump: {{0, PallocChunkPages}}, + }, + } + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := NewPageAlloc(v.beforeAlloc, v.beforeScav) + defer FreePageAlloc(b) + + for _, expect := range v.hits { + checkPageCache(t, b.AllocToCache(), expect) + if t.Failed() { + return + } + } + want := NewPageAlloc(v.afterAlloc, v.afterScav) + defer FreePageAlloc(want) + + checkPageAlloc(t, want, b) + }) + } +} diff --git a/src/runtime/mpallocbits.go b/src/runtime/mpallocbits.go new file mode 100644 index 0000000..f63164b --- /dev/null +++ b/src/runtime/mpallocbits.go @@ -0,0 +1,446 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/sys" +) + +// pageBits is a bitmap representing one bit per page in a palloc chunk. +type pageBits [pallocChunkPages / 64]uint64 + +// get returns the value of the i'th bit in the bitmap. +func (b *pageBits) get(i uint) uint { + return uint((b[i/64] >> (i % 64)) & 1) +} + +// block64 returns the 64-bit aligned block of bits containing the i'th bit. 
+func (b *pageBits) block64(i uint) uint64 { + return b[i/64] +} + +// set sets bit i of pageBits. +func (b *pageBits) set(i uint) { + b[i/64] |= 1 << (i % 64) +} + +// setRange sets bits in the range [i, i+n). +func (b *pageBits) setRange(i, n uint) { + _ = b[i/64] + if n == 1 { + // Fast path for the n == 1 case. + b.set(i) + return + } + // Set bits [i, j]. + j := i + n - 1 + if i/64 == j/64 { + b[i/64] |= ((uint64(1) << n) - 1) << (i % 64) + return + } + _ = b[j/64] + // Set leading bits. + b[i/64] |= ^uint64(0) << (i % 64) + for k := i/64 + 1; k < j/64; k++ { + b[k] = ^uint64(0) + } + // Set trailing bits. + b[j/64] |= (uint64(1) << (j%64 + 1)) - 1 +} + +// setAll sets all the bits of b. +func (b *pageBits) setAll() { + for i := range b { + b[i] = ^uint64(0) + } +} + +// setBlock64 sets the 64-bit aligned block of bits containing the i'th bit that +// are set in v. +func (b *pageBits) setBlock64(i uint, v uint64) { + b[i/64] |= v +} + +// clear clears bit i of pageBits. +func (b *pageBits) clear(i uint) { + b[i/64] &^= 1 << (i % 64) +} + +// clearRange clears bits in the range [i, i+n). +func (b *pageBits) clearRange(i, n uint) { + _ = b[i/64] + if n == 1 { + // Fast path for the n == 1 case. + b.clear(i) + return + } + // Clear bits [i, j]. + j := i + n - 1 + if i/64 == j/64 { + b[i/64] &^= ((uint64(1) << n) - 1) << (i % 64) + return + } + _ = b[j/64] + // Clear leading bits. + b[i/64] &^= ^uint64(0) << (i % 64) + for k := i/64 + 1; k < j/64; k++ { + b[k] = 0 + } + // Clear trailing bits. + b[j/64] &^= (uint64(1) << (j%64 + 1)) - 1 +} + +// clearAll frees all the bits of b. +func (b *pageBits) clearAll() { + for i := range b { + b[i] = 0 + } +} + +// clearBlock64 clears the 64-bit aligned block of bits containing the i'th bit that +// are set in v. +func (b *pageBits) clearBlock64(i uint, v uint64) { + b[i/64] &^= v +} + +// popcntRange counts the number of set bits in the +// range [i, i+n). 
+func (b *pageBits) popcntRange(i, n uint) (s uint) { + if n == 1 { + return uint((b[i/64] >> (i % 64)) & 1) + } + _ = b[i/64] + j := i + n - 1 + if i/64 == j/64 { + return uint(sys.OnesCount64((b[i/64] >> (i % 64)) & ((1 << n) - 1))) + } + _ = b[j/64] + s += uint(sys.OnesCount64(b[i/64] >> (i % 64))) + for k := i/64 + 1; k < j/64; k++ { + s += uint(sys.OnesCount64(b[k])) + } + s += uint(sys.OnesCount64(b[j/64] & ((1 << (j%64 + 1)) - 1))) + return +} + +// pallocBits is a bitmap that tracks page allocations for at most one +// palloc chunk. +// +// The precise representation is an implementation detail, but for the +// sake of documentation, 0s are free pages and 1s are allocated pages. +type pallocBits pageBits + +// summarize returns a packed summary of the bitmap in pallocBits. +func (b *pallocBits) summarize() pallocSum { + var start, max, cur uint + const notSetYet = ^uint(0) // sentinel for start value + start = notSetYet + for i := 0; i < len(b); i++ { + x := b[i] + if x == 0 { + cur += 64 + continue + } + t := uint(sys.TrailingZeros64(x)) + l := uint(sys.LeadingZeros64(x)) + + // Finish any region spanning the uint64s + cur += t + if start == notSetYet { + start = cur + } + if cur > max { + max = cur + } + // Final region that might span to next uint64 + cur = l + } + if start == notSetYet { + // Made it all the way through without finding a single 1 bit. + const n = uint(64 * len(b)) + return packPallocSum(n, n, n) + } + if cur > max { + max = cur + } + if max >= 64-2 { + // There is no way an internal run of zeros could beat max. + return packPallocSum(start, max, cur) + } + // Now look inside each uint64 for runs of zeros. + // All uint64s must be nonzero, or we would have aborted above. +outer: + for i := 0; i < len(b); i++ { + x := b[i] + + // Look inside this uint64. We have a pattern like + // 000000 1xxxxx1 000000 + // We need to look inside the 1xxxxx1 for any contiguous + // region of zeros. 
+ + // We already know the trailing zeros are no larger than max. Remove them. + x >>= sys.TrailingZeros64(x) & 63 + if x&(x+1) == 0 { // no more zeros (except at the top). + continue + } + + // Strategy: shrink all runs of zeros by max. If any runs of zero + // remain, then we've identified a larger maximum zero run. + p := max // number of zeros we still need to shrink by. + k := uint(1) // current minimum length of runs of ones in x. + for { + // Shrink all runs of zeros by p places (except the top zeros). + for p > 0 { + if p <= k { + // Shift p ones down into the top of each run of zeros. + x |= x >> (p & 63) + if x&(x+1) == 0 { // no more zeros (except at the top). + continue outer + } + break + } + // Shift k ones down into the top of each run of zeros. + x |= x >> (k & 63) + if x&(x+1) == 0 { // no more zeros (except at the top). + continue outer + } + p -= k + // We've just doubled the minimum length of 1-runs. + // This allows us to shift farther in the next iteration. + k *= 2 + } + + // The length of the lowest-order zero run is an increment to our maximum. + j := uint(sys.TrailingZeros64(^x)) // count contiguous trailing ones + x >>= j & 63 // remove trailing ones + j = uint(sys.TrailingZeros64(x)) // count contiguous trailing zeros + x >>= j & 63 // remove zeros + max += j // we have a new maximum! + if x&(x+1) == 0 { // no more zeros (except at the top). + continue outer + } + p = j // remove j more zeros from each zero run. + } + } + return packPallocSum(start, max, cur) +} + +// find searches for npages contiguous free pages in pallocBits and returns +// the index where that run starts, as well as the index of the first free page +// it found in the search. searchIdx represents the first known free page and +// where to begin the next search from. +// +// If find fails to find any free space, it returns an index of ^uint(0) and +// the new searchIdx should be ignored.
+// +// Note that if npages == 1, the two returned values will always be identical. +func (b *pallocBits) find(npages uintptr, searchIdx uint) (uint, uint) { + if npages == 1 { + addr := b.find1(searchIdx) + return addr, addr + } else if npages <= 64 { + return b.findSmallN(npages, searchIdx) + } + return b.findLargeN(npages, searchIdx) +} + +// find1 is a helper for find which searches for a single free page +// in the pallocBits and returns the index. +// +// See find for an explanation of the searchIdx parameter. +func (b *pallocBits) find1(searchIdx uint) uint { + _ = b[0] // lift nil check out of loop + for i := searchIdx / 64; i < uint(len(b)); i++ { + x := b[i] + if ^x == 0 { + continue + } + return i*64 + uint(sys.TrailingZeros64(^x)) + } + return ^uint(0) +} + +// findSmallN is a helper for find which searches for npages contiguous free pages +// in this pallocBits and returns the index where that run of contiguous pages +// starts as well as the index of the first free page it finds in its search. +// +// See find for an explanation of the searchIdx parameter. +// +// Returns a ^uint(0) index on failure and the new searchIdx should be ignored. +// +// findSmallN assumes npages <= 64, where any such contiguous run of pages +// crosses at most one aligned 64-bit boundary in the bits. +func (b *pallocBits) findSmallN(npages uintptr, searchIdx uint) (uint, uint) { + end, newSearchIdx := uint(0), ^uint(0) + for i := searchIdx / 64; i < uint(len(b)); i++ { + bi := b[i] + if ^bi == 0 { + end = 0 + continue + } + // First see if we can pack our allocation in the trailing + // zeros plus the end of the last 64 bits. + if newSearchIdx == ^uint(0) { + // The new searchIdx is going to be at these 64 bits after any + // 1s we find, so count trailing 1s.
+ newSearchIdx = i*64 + uint(sys.TrailingZeros64(^bi)) + } + start := uint(sys.TrailingZeros64(bi)) + if end+start >= uint(npages) { + return i*64 - end, newSearchIdx + } + // Next, check the interior of the 64-bit chunk. + j := findBitRange64(^bi, uint(npages)) + if j < 64 { + return i*64 + j, newSearchIdx + } + end = uint(sys.LeadingZeros64(bi)) + } + return ^uint(0), newSearchIdx +} + +// findLargeN is a helper for find which searches for npages contiguous free pages +// in this pallocBits and returns the index where that run starts, as well as the +// index of the first free page it found in its search. +// +// See find for an explanation of the searchIdx parameter. +// +// Returns a ^uint(0) index on failure and the new searchIdx should be ignored. +// +// findLargeN assumes npages > 64, where any such run of free pages +// crosses at least one aligned 64-bit boundary in the bits. +func (b *pallocBits) findLargeN(npages uintptr, searchIdx uint) (uint, uint) { + start, size, newSearchIdx := ^uint(0), uint(0), ^uint(0) + for i := searchIdx / 64; i < uint(len(b)); i++ { + x := b[i] + if x == ^uint64(0) { + size = 0 + continue + } + if newSearchIdx == ^uint(0) { + // The new searchIdx is going to be at these 64 bits after any + // 1s we find, so count trailing 1s. + newSearchIdx = i*64 + uint(sys.TrailingZeros64(^x)) + } + if size == 0 { + size = uint(sys.LeadingZeros64(x)) + start = i*64 + 64 - size + continue + } + s := uint(sys.TrailingZeros64(x)) + if s+size >= uint(npages) { + size += s + return start, newSearchIdx + } + if s < 64 { + size = uint(sys.LeadingZeros64(x)) + start = i*64 + 64 - size + continue + } + size += 64 + } + if size < uint(npages) { + return ^uint(0), newSearchIdx + } + return start, newSearchIdx +} + +// allocRange allocates the range [i, i+n). +func (b *pallocBits) allocRange(i, n uint) { + (*pageBits)(b).setRange(i, n) +} + +// allocAll allocates all the bits of b.
+func (b *pallocBits) allocAll() { + (*pageBits)(b).setAll() +} + +// free1 frees a single page in the pallocBits at i. +func (b *pallocBits) free1(i uint) { + (*pageBits)(b).clear(i) +} + +// free frees the range [i, i+n) of pages in the pallocBits. +func (b *pallocBits) free(i, n uint) { + (*pageBits)(b).clearRange(i, n) +} + +// freeAll frees all the bits of b. +func (b *pallocBits) freeAll() { + (*pageBits)(b).clearAll() +} + +// pages64 returns a 64-bit bitmap representing a block of 64 pages aligned +// to 64 pages. The returned block of pages is the one containing the i'th +// page in this pallocBits. Each bit represents whether the page is in-use. +func (b *pallocBits) pages64(i uint) uint64 { + return (*pageBits)(b).block64(i) +} + +// allocPages64 allocates a 64-bit block of 64 pages aligned to 64 pages according +// to the bits set in alloc. The block set is the one containing the i'th page. +func (b *pallocBits) allocPages64(i uint, alloc uint64) { + (*pageBits)(b).setBlock64(i, alloc) +} + +// findBitRange64 returns the bit index of the first set of +// n consecutive 1 bits. If no consecutive set of 1 bits of +// size n may be found in c, then it returns an integer >= 64. +// n must be > 0. +func findBitRange64(c uint64, n uint) uint { + // This implementation is based on shrinking the length of + // runs of contiguous 1 bits. We remove the top n-1 1 bits + // from each run of 1s, then look for the first remaining 1 bit. + p := n - 1 // number of 1s we want to remove. + k := uint(1) // current minimum width of runs of 0 in c. + for p > 0 { + if p <= k { + // Shift p 0s down into the top of each run of 1s. + c &= c >> (p & 63) + break + } + // Shift k 0s down into the top of each run of 1s. + c &= c >> (k & 63) + if c == 0 { + return 64 + } + p -= k + // We've just doubled the minimum length of 0-runs. + // This allows us to shift farther in the next iteration. + k *= 2 + } + // Find first remaining 1. 
+ // Since we shrunk from the top down, the first 1 is in + // its correct original position. + return uint(sys.TrailingZeros64(c)) +} + +// pallocData encapsulates pallocBits and a bitmap for +// whether or not a given page is scavenged in a single +// structure. It's effectively a pallocBits with +// additional functionality. +// +// Update the comment on (*pageAlloc).chunks should this +// structure change. +type pallocData struct { + pallocBits + scavenged pageBits +} + +// allocRange sets bits [i, i+n) in the bitmap to 1 and +// updates the scavenged bits appropriately. +func (m *pallocData) allocRange(i, n uint) { + // Clear the scavenged bits when we alloc the range. + m.pallocBits.allocRange(i, n) + m.scavenged.clearRange(i, n) +} + +// allocAll sets every bit in the bitmap to 1 and updates +// the scavenged bits appropriately. +func (m *pallocData) allocAll() { + // Clear the scavenged bits when we alloc the range. + m.pallocBits.allocAll() + m.scavenged.clearAll() +} diff --git a/src/runtime/mpallocbits_test.go b/src/runtime/mpallocbits_test.go new file mode 100644 index 0000000..5095e24 --- /dev/null +++ b/src/runtime/mpallocbits_test.go @@ -0,0 +1,551 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "math/rand" + . "runtime" + "testing" +) + +// Ensures that got and want are the same, and if not, reports +// detailed diff information. 
+func checkPallocBits(t *testing.T, got, want *PallocBits) bool { + d := DiffPallocBits(got, want) + if len(d) != 0 { + t.Errorf("%d range(s) different", len(d)) + for _, bits := range d { + t.Logf("\t@ bit index %d", bits.I) + t.Logf("\t| got: %s", StringifyPallocBits(got, bits)) + t.Logf("\t| want: %s", StringifyPallocBits(want, bits)) + } + return false + } + return true +} + +// makePallocBits produces an initialized PallocBits by setting +// the ranges in s to 1 and the rest to zero. +func makePallocBits(s []BitRange) *PallocBits { + b := new(PallocBits) + for _, v := range s { + b.AllocRange(v.I, v.N) + } + return b +} + +// Ensures that PallocBits.AllocRange works, which is a fundamental +// method used for testing and initialization since it's used by +// makePallocBits. +func TestPallocBitsAllocRange(t *testing.T) { + test := func(t *testing.T, i, n uint, want *PallocBits) { + checkPallocBits(t, makePallocBits([]BitRange{{i, n}}), want) + } + t.Run("OneLow", func(t *testing.T) { + want := new(PallocBits) + want[0] = 0x1 + test(t, 0, 1, want) + }) + t.Run("OneHigh", func(t *testing.T) { + want := new(PallocBits) + want[PallocChunkPages/64-1] = 1 << 63 + test(t, PallocChunkPages-1, 1, want) + }) + t.Run("Inner", func(t *testing.T) { + want := new(PallocBits) + want[2] = 0x3e + test(t, 129, 5, want) + }) + t.Run("Aligned", func(t *testing.T) { + want := new(PallocBits) + want[2] = ^uint64(0) + want[3] = ^uint64(0) + test(t, 128, 128, want) + }) + t.Run("Begin", func(t *testing.T) { + want := new(PallocBits) + want[0] = ^uint64(0) + want[1] = ^uint64(0) + want[2] = ^uint64(0) + want[3] = ^uint64(0) + want[4] = ^uint64(0) + want[5] = 0x1 + test(t, 0, 321, want) + }) + t.Run("End", func(t *testing.T) { + want := new(PallocBits) + want[PallocChunkPages/64-1] = ^uint64(0) + want[PallocChunkPages/64-2] = ^uint64(0) + want[PallocChunkPages/64-3] = ^uint64(0) + want[PallocChunkPages/64-4] = 1 << 63 + test(t, PallocChunkPages-(64*3+1), 64*3+1, want) + }) + 
t.Run("All", func(t *testing.T) { + want := new(PallocBits) + for i := range want { + want[i] = ^uint64(0) + } + test(t, 0, PallocChunkPages, want) + }) +} + +// Inverts every bit in the PallocBits. +func invertPallocBits(b *PallocBits) { + for i := range b { + b[i] = ^b[i] + } +} + +// Ensures two packed summaries are identical, and reports a detailed description +// of the difference if they're not. +func checkPallocSum(t testing.TB, got, want PallocSum) { + if got.Start() != want.Start() { + t.Errorf("inconsistent start: got %d, want %d", got.Start(), want.Start()) + } + if got.Max() != want.Max() { + t.Errorf("inconsistent max: got %d, want %d", got.Max(), want.Max()) + } + if got.End() != want.End() { + t.Errorf("inconsistent end: got %d, want %d", got.End(), want.End()) + } +} + +func TestMallocBitsPopcntRange(t *testing.T) { + type test struct { + i, n uint // bit range to popcnt over. + want uint // expected popcnt result on that range. + } + tests := map[string]struct { + init []BitRange // bit ranges to set to 1 in the bitmap. + tests []test // a set of popcnt tests to run over the bitmap. 
+ }{ + "None": { + tests: []test{ + {0, 1, 0}, + {5, 3, 0}, + {2, 11, 0}, + {PallocChunkPages/4 + 1, PallocChunkPages / 2, 0}, + {0, PallocChunkPages, 0}, + }, + }, + "All": { + init: []BitRange{{0, PallocChunkPages}}, + tests: []test{ + {0, 1, 1}, + {5, 3, 3}, + {2, 11, 11}, + {PallocChunkPages/4 + 1, PallocChunkPages / 2, PallocChunkPages / 2}, + {0, PallocChunkPages, PallocChunkPages}, + }, + }, + "Half": { + init: []BitRange{{PallocChunkPages / 2, PallocChunkPages / 2}}, + tests: []test{ + {0, 1, 0}, + {5, 3, 0}, + {2, 11, 0}, + {PallocChunkPages/2 - 1, 1, 0}, + {PallocChunkPages / 2, 1, 1}, + {PallocChunkPages/2 + 10, 1, 1}, + {PallocChunkPages/2 - 1, 2, 1}, + {PallocChunkPages / 4, PallocChunkPages / 4, 0}, + {PallocChunkPages / 4, PallocChunkPages/4 + 1, 1}, + {PallocChunkPages/4 + 1, PallocChunkPages / 2, PallocChunkPages/4 + 1}, + {0, PallocChunkPages, PallocChunkPages / 2}, + }, + }, + "OddBound": { + init: []BitRange{{0, 111}}, + tests: []test{ + {0, 1, 1}, + {5, 3, 3}, + {2, 11, 11}, + {110, 2, 1}, + {99, 50, 12}, + {110, 1, 1}, + {111, 1, 0}, + {99, 1, 1}, + {120, 1, 0}, + {PallocChunkPages / 2, PallocChunkPages / 2, 0}, + {0, PallocChunkPages, 111}, + }, + }, + "Scattered": { + init: []BitRange{ + {1, 3}, {5, 1}, {7, 1}, {10, 2}, {13, 1}, {15, 4}, + {21, 1}, {23, 1}, {26, 2}, {30, 5}, {36, 2}, {40, 3}, + {44, 6}, {51, 1}, {53, 2}, {58, 3}, {63, 1}, {67, 2}, + {71, 10}, {84, 1}, {89, 7}, {99, 2}, {103, 1}, {107, 2}, + {111, 1}, {113, 1}, {115, 1}, {118, 1}, {120, 2}, {125, 5}, + }, + tests: []test{ + {0, 11, 6}, + {0, 64, 39}, + {13, 64, 40}, + {64, 64, 34}, + {0, 128, 73}, + {1, 128, 74}, + {0, PallocChunkPages, 75}, + }, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := makePallocBits(v.init) + for _, h := range v.tests { + if got := b.PopcntRange(h.i, h.n); got != h.want { + t.Errorf("bad popcnt (i=%d, n=%d): got %d, want %d", h.i, h.n, got, h.want) + } + } + }) + } +} + +// Ensures computing bit summaries 
works as expected by generating random +// bitmaps and checking against a reference implementation. +func TestPallocBitsSummarizeRandom(t *testing.T) { + b := new(PallocBits) + for i := 0; i < 1000; i++ { + // Randomize bitmap. + for i := range b { + b[i] = rand.Uint64() + } + // Check summary against reference implementation. + checkPallocSum(t, b.Summarize(), SummarizeSlow(b)) + } +} + +// Ensures computing bit summaries works as expected. +func TestPallocBitsSummarize(t *testing.T) { + var emptySum = PackPallocSum(PallocChunkPages, PallocChunkPages, PallocChunkPages) + type test struct { + free []BitRange // Ranges of free (zero) bits. + hits []PallocSum + } + tests := make(map[string]test) + tests["NoneFree"] = test{ + free: []BitRange{}, + hits: []PallocSum{ + PackPallocSum(0, 0, 0), + }, + } + tests["OnlyStart"] = test{ + free: []BitRange{{0, 10}}, + hits: []PallocSum{ + PackPallocSum(10, 10, 0), + }, + } + tests["OnlyEnd"] = test{ + free: []BitRange{{PallocChunkPages - 40, 40}}, + hits: []PallocSum{ + PackPallocSum(0, 40, 40), + }, + } + tests["StartAndEnd"] = test{ + free: []BitRange{{0, 11}, {PallocChunkPages - 23, 23}}, + hits: []PallocSum{ + PackPallocSum(11, 23, 23), + }, + } + tests["StartMaxEnd"] = test{ + free: []BitRange{{0, 4}, {50, 100}, {PallocChunkPages - 4, 4}}, + hits: []PallocSum{ + PackPallocSum(4, 100, 4), + }, + } + tests["OnlyMax"] = test{ + free: []BitRange{{1, 20}, {35, 241}, {PallocChunkPages - 50, 30}}, + hits: []PallocSum{ + PackPallocSum(0, 241, 0), + }, + } + tests["MultiMax"] = test{ + free: []BitRange{{35, 2}, {40, 5}, {100, 5}}, + hits: []PallocSum{ + PackPallocSum(0, 5, 0), + }, + } + tests["One"] = test{ + free: []BitRange{{2, 1}}, + hits: []PallocSum{ + PackPallocSum(0, 1, 0), + }, + } + tests["AllFree"] = test{ + free: []BitRange{{0, PallocChunkPages}}, + hits: []PallocSum{ + emptySum, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := makePallocBits(v.free) + // In the PallocBits we 
create 1's represent free spots, but in our actual + // PallocBits 1 means not free, so invert. + invertPallocBits(b) + for _, h := range v.hits { + checkPallocSum(t, b.Summarize(), h) + } + }) + } +} + +// Benchmarks how quickly we can summarize a PallocBits. +func BenchmarkPallocBitsSummarize(b *testing.B) { + patterns := []uint64{ + 0, + ^uint64(0), + 0xaa, + 0xaaaaaaaaaaaaaaaa, + 0x80000000aaaaaaaa, + 0xaaaaaaaa00000001, + 0xbbbbbbbbbbbbbbbb, + 0x80000000bbbbbbbb, + 0xbbbbbbbb00000001, + 0xcccccccccccccccc, + 0x4444444444444444, + 0x4040404040404040, + 0x4000400040004000, + 0x1000404044ccaaff, + } + for _, p := range patterns { + buf := new(PallocBits) + for i := 0; i < len(buf); i++ { + buf[i] = p + } + b.Run(fmt.Sprintf("Unpacked%02X", p), func(b *testing.B) { + checkPallocSum(b, buf.Summarize(), SummarizeSlow(buf)) + for i := 0; i < b.N; i++ { + buf.Summarize() + } + }) + } +} + +// Ensures page allocation works. +func TestPallocBitsAlloc(t *testing.T) { + tests := map[string]struct { + before []BitRange + after []BitRange + npages uintptr + hits []uint + }{ + "AllFree1": { + npages: 1, + hits: []uint{0, 1, 2, 3, 4, 5}, + after: []BitRange{{0, 6}}, + }, + "AllFree2": { + npages: 2, + hits: []uint{0, 2, 4, 6, 8, 10}, + after: []BitRange{{0, 12}}, + }, + "AllFree5": { + npages: 5, + hits: []uint{0, 5, 10, 15, 20}, + after: []BitRange{{0, 25}}, + }, + "AllFree64": { + npages: 64, + hits: []uint{0, 64, 128}, + after: []BitRange{{0, 192}}, + }, + "AllFree65": { + npages: 65, + hits: []uint{0, 65, 130}, + after: []BitRange{{0, 195}}, + }, + "SomeFree64": { + before: []BitRange{{0, 32}, {64, 32}, {100, PallocChunkPages - 100}}, + npages: 64, + hits: []uint{^uint(0)}, + after: []BitRange{{0, 32}, {64, 32}, {100, PallocChunkPages - 100}}, + }, + "NoneFree1": { + before: []BitRange{{0, PallocChunkPages}}, + npages: 1, + hits: []uint{^uint(0), ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "NoneFree2": { + before: []BitRange{{0, PallocChunkPages}}, + 
npages: 2, + hits: []uint{^uint(0), ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "NoneFree5": { + before: []BitRange{{0, PallocChunkPages}}, + npages: 5, + hits: []uint{^uint(0), ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "NoneFree65": { + before: []BitRange{{0, PallocChunkPages}}, + npages: 65, + hits: []uint{^uint(0), ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "ExactFit1": { + before: []BitRange{{0, PallocChunkPages/2 - 3}, {PallocChunkPages/2 - 2, PallocChunkPages/2 + 2}}, + npages: 1, + hits: []uint{PallocChunkPages/2 - 3, ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "ExactFit2": { + before: []BitRange{{0, PallocChunkPages/2 - 3}, {PallocChunkPages/2 - 1, PallocChunkPages/2 + 1}}, + npages: 2, + hits: []uint{PallocChunkPages/2 - 3, ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "ExactFit5": { + before: []BitRange{{0, PallocChunkPages/2 - 3}, {PallocChunkPages/2 + 2, PallocChunkPages/2 - 2}}, + npages: 5, + hits: []uint{PallocChunkPages/2 - 3, ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "ExactFit65": { + before: []BitRange{{0, PallocChunkPages/2 - 31}, {PallocChunkPages/2 + 34, PallocChunkPages/2 - 34}}, + npages: 65, + hits: []uint{PallocChunkPages/2 - 31, ^uint(0)}, + after: []BitRange{{0, PallocChunkPages}}, + }, + "SomeFree161": { + before: []BitRange{{0, 185}, {331, 1}}, + npages: 161, + hits: []uint{332}, + after: []BitRange{{0, 185}, {331, 162}}, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := makePallocBits(v.before) + for iter, i := range v.hits { + a, _ := b.Find(v.npages, 0) + if i != a { + t.Fatalf("find #%d picked wrong index: want %d, got %d", iter+1, i, a) + } + if i != ^uint(0) { + b.AllocRange(a, uint(v.npages)) + } + } + want := makePallocBits(v.after) + checkPallocBits(t, b, want) + }) + } +} + +// Ensures page freeing works. 
+func TestPallocBitsFree(t *testing.T) { + tests := map[string]struct { + beforeInv []BitRange + afterInv []BitRange + frees []uint + npages uintptr + }{ + "SomeFree": { + npages: 1, + beforeInv: []BitRange{{0, 32}, {64, 32}, {100, 1}}, + frees: []uint{32}, + afterInv: []BitRange{{0, 33}, {64, 32}, {100, 1}}, + }, + "NoneFree1": { + npages: 1, + frees: []uint{0, 1, 2, 3, 4, 5}, + afterInv: []BitRange{{0, 6}}, + }, + "NoneFree2": { + npages: 2, + frees: []uint{0, 2, 4, 6, 8, 10}, + afterInv: []BitRange{{0, 12}}, + }, + "NoneFree5": { + npages: 5, + frees: []uint{0, 5, 10, 15, 20}, + afterInv: []BitRange{{0, 25}}, + }, + "NoneFree64": { + npages: 64, + frees: []uint{0, 64, 128}, + afterInv: []BitRange{{0, 192}}, + }, + "NoneFree65": { + npages: 65, + frees: []uint{0, 65, 130}, + afterInv: []BitRange{{0, 195}}, + }, + } + for name, v := range tests { + v := v + t.Run(name, func(t *testing.T) { + b := makePallocBits(v.beforeInv) + invertPallocBits(b) + for _, i := range v.frees { + b.Free(i, uint(v.npages)) + } + want := makePallocBits(v.afterInv) + invertPallocBits(want) + checkPallocBits(t, b, want) + }) + } +} + +func TestFindBitRange64(t *testing.T) { + check := func(x uint64, n uint, result uint) { + i := FindBitRange64(x, n) + if result == ^uint(0) && i < 64 { + t.Errorf("case (%016x, %d): got %d, want failure", x, n, i) + } else if result != ^uint(0) && i != result { + t.Errorf("case (%016x, %d): got %d, want %d", x, n, i, result) + } + } + for i := uint(1); i <= 64; i++ { + check(^uint64(0), i, 0) + } + for i := uint(1); i <= 64; i++ { + check(0, i, ^uint(0)) + } + check(0x8000000000000000, 1, 63) + check(0xc000010001010000, 2, 62) + check(0xc000010001030000, 2, 16) + check(0xe000030001030000, 3, 61) + check(0xe000030001070000, 3, 16) + check(0xffff03ff01070000, 16, 48) + check(0xffff03ff0107ffff, 16, 0) + check(0x0fff03ff01079fff, 16, ^uint(0)) +} + +func BenchmarkFindBitRange64(b *testing.B) { + patterns := []uint64{ + 0, + ^uint64(0), + 0xaa, + 
0xaaaaaaaaaaaaaaaa, + 0x80000000aaaaaaaa, + 0xaaaaaaaa00000001, + 0xbbbbbbbbbbbbbbbb, + 0x80000000bbbbbbbb, + 0xbbbbbbbb00000001, + 0xcccccccccccccccc, + 0x4444444444444444, + 0x4040404040404040, + 0x4000400040004000, + } + sizes := []uint{ + 2, 8, 32, + } + for _, pattern := range patterns { + for _, size := range sizes { + b.Run(fmt.Sprintf("Pattern%02XSize%d", pattern, size), func(b *testing.B) { + for i := 0; i < b.N; i++ { + FindBitRange64(pattern, size) + } + }) + } + } +} diff --git a/src/runtime/mprof.go b/src/runtime/mprof.go new file mode 100644 index 0000000..24f8889 --- /dev/null +++ b/src/runtime/mprof.go @@ -0,0 +1,1281 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Malloc profiling. +// Patterned after tcmalloc's algorithms; shorter code. + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// NOTE(rsc): Everything here could use cas if contention became an issue. +var ( + // profInsertLock protects changes to the start of all *bucket linked lists + profInsertLock mutex + // profBlockLock protects the contents of every blockRecord struct + profBlockLock mutex + // profMemActiveLock protects the active field of every memRecord struct + profMemActiveLock mutex + // profMemFutureLock is a set of locks that protect the respective elements + // of the future array of every memRecord struct + profMemFutureLock [len(memRecord{}.future)]mutex +) + +// All memory allocations are local and do not escape outside of the profiler. +// The profiler is forbidden from referring to garbage-collected memory. 
+ +const ( + // profile types + memProfile bucketType = 1 + iota + blockProfile + mutexProfile + + // size of bucket hash table + buckHashSize = 179999 + + // max depth of stack to record in bucket + maxStack = 32 +) + +type bucketType int + +// A bucket holds per-call-stack profiling information. +// The representation is a bit sleazy, inherited from C. +// This struct defines the bucket header. It is followed in +// memory by the stack words and then the actual record +// data, either a memRecord or a blockRecord. +// +// Per-call-stack profiling information. +// Lookup by hashing call stack into a linked-list hash table. +// +// None of the fields in this bucket header are modified after +// creation, including its next and allnext links. +// +// No heap pointers. +type bucket struct { + _ sys.NotInHeap + next *bucket + allnext *bucket + typ bucketType // memBucket or blockBucket (includes mutexProfile) + hash uintptr + size uintptr + nstk uintptr +} + +// A memRecord is the bucket data for a bucket of type memProfile, +// part of the memory profile. +type memRecord struct { + // The following complex 3-stage scheme of stats accumulation + // is required to obtain a consistent picture of mallocs and frees + // for some point in time. + // The problem is that mallocs come in real time, while frees + // come only after a GC during concurrent sweeping. So if we would + // naively count them, we would get a skew toward mallocs. + // + // Hence, we delay information to get consistent snapshots as + // of mark termination. 
Allocations count toward the next mark + // termination's snapshot, while sweep frees count toward the + // previous mark termination's snapshot: + // + // MT MT MT MT + // .·| .·| .·| .·| + // .·˙ | .·˙ | .·˙ | .·˙ | + // .·˙ | .·˙ | .·˙ | .·˙ | + // .·˙ |.·˙ |.·˙ |.·˙ | + // + // alloc → ▲ ← free + // ┠┅┅┅┅┅┅┅┅┅┅┅P + // C+2 → C+1 → C + // + // alloc → ▲ ← free + // ┠┅┅┅┅┅┅┅┅┅┅┅P + // C+2 → C+1 → C + // + // Since we can't publish a consistent snapshot until all of + // the sweep frees are accounted for, we wait until the next + // mark termination ("MT" above) to publish the previous mark + // termination's snapshot ("P" above). To do this, allocation + // and free events are accounted to *future* heap profile + // cycles ("C+n" above) and we only publish a cycle once all + // of the events from that cycle must be done. Specifically: + // + // Mallocs are accounted to cycle C+2. + // Explicit frees are accounted to cycle C+2. + // GC frees (done during sweeping) are accounted to cycle C+1. + // + // After mark termination, we increment the global heap + // profile cycle counter and accumulate the stats from cycle C + // into the active profile. + + // active is the currently published profile. A profiling + // cycle can be accumulated into active once it's complete. + active memRecordCycle + + // future records the profile events we're counting for cycles + // that have not yet been published. This is a ring buffer + // indexed by the global heap profile cycle C and stores + // cycles C, C+1, and C+2. Unlike active, these counts are + // only for a single cycle; they are not cumulative across + // cycles. + // + // We store cycle C here because there's a window between when + // C becomes the active cycle and when we've flushed it to + // active. + future [3]memRecordCycle +} + +// memRecordCycle +type memRecordCycle struct { + allocs, frees uintptr + alloc_bytes, free_bytes uintptr +} + +// add accumulates b into a. It does not zero b.
+func (a *memRecordCycle) add(b *memRecordCycle) { + a.allocs += b.allocs + a.frees += b.frees + a.alloc_bytes += b.alloc_bytes + a.free_bytes += b.free_bytes +} + +// A blockRecord is the bucket data for a bucket of type blockProfile, +// which is used in blocking and mutex profiles. +type blockRecord struct { + count float64 + cycles int64 +} + +var ( + mbuckets atomic.UnsafePointer // *bucket, memory profile buckets + bbuckets atomic.UnsafePointer // *bucket, blocking profile buckets + xbuckets atomic.UnsafePointer // *bucket, mutex profile buckets + buckhash atomic.UnsafePointer // *buckhashArray + + mProfCycle mProfCycleHolder +) + +type buckhashArray [buckHashSize]atomic.UnsafePointer // *bucket + +const mProfCycleWrap = uint32(len(memRecord{}.future)) * (2 << 24) + +// mProfCycleHolder holds the global heap profile cycle number (wrapped at +// mProfCycleWrap, stored starting at bit 1), and a flag (stored at bit 0) to +// indicate whether future[cycle] in all buckets has been queued to flush into +// the active profile. +type mProfCycleHolder struct { + value atomic.Uint32 +} + +// read returns the current cycle count. +func (c *mProfCycleHolder) read() (cycle uint32) { + v := c.value.Load() + cycle = v >> 1 + return cycle +} + +// setFlushed sets the flushed flag. It returns the current cycle count and the +// previous value of the flushed flag. +func (c *mProfCycleHolder) setFlushed() (cycle uint32, alreadyFlushed bool) { + for { + prev := c.value.Load() + cycle = prev >> 1 + alreadyFlushed = (prev & 0x1) != 0 + next := prev | 0x1 + if c.value.CompareAndSwap(prev, next) { + return cycle, alreadyFlushed + } + } +} + +// increment increases the cycle count by one, wrapping the value at +// mProfCycleWrap. It clears the flushed flag. +func (c *mProfCycleHolder) increment() { + // We explicitly wrap mProfCycle rather than depending on + // uint wraparound because the memRecord.future ring does not + // itself wrap at a power of two. 
+ for { + prev := c.value.Load() + cycle := prev >> 1 + cycle = (cycle + 1) % mProfCycleWrap + next := cycle << 1 + if c.value.CompareAndSwap(prev, next) { + break + } + } +} + +// newBucket allocates a bucket with the given type and number of stack entries. +func newBucket(typ bucketType, nstk int) *bucket { + size := unsafe.Sizeof(bucket{}) + uintptr(nstk)*unsafe.Sizeof(uintptr(0)) + switch typ { + default: + throw("invalid profile bucket type") + case memProfile: + size += unsafe.Sizeof(memRecord{}) + case blockProfile, mutexProfile: + size += unsafe.Sizeof(blockRecord{}) + } + + b := (*bucket)(persistentalloc(size, 0, &memstats.buckhash_sys)) + b.typ = typ + b.nstk = uintptr(nstk) + return b +} + +// stk returns the slice in b holding the stack. +func (b *bucket) stk() []uintptr { + stk := (*[maxStack]uintptr)(add(unsafe.Pointer(b), unsafe.Sizeof(*b))) + return stk[:b.nstk:b.nstk] +} + +// mp returns the memRecord associated with the memProfile bucket b. +func (b *bucket) mp() *memRecord { + if b.typ != memProfile { + throw("bad use of bucket.mp") + } + data := add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0))) + return (*memRecord)(data) +} + +// bp returns the blockRecord associated with the blockProfile bucket b. +func (b *bucket) bp() *blockRecord { + if b.typ != blockProfile && b.typ != mutexProfile { + throw("bad use of bucket.bp") + } + data := add(unsafe.Pointer(b), unsafe.Sizeof(*b)+b.nstk*unsafe.Sizeof(uintptr(0))) + return (*blockRecord)(data) +} + +// Return the bucket for stk[0:nstk], allocating new bucket if needed. 
+func stkbucket(typ bucketType, size uintptr, stk []uintptr, alloc bool) *bucket { + bh := (*buckhashArray)(buckhash.Load()) + if bh == nil { + lock(&profInsertLock) + // check again under the lock + bh = (*buckhashArray)(buckhash.Load()) + if bh == nil { + bh = (*buckhashArray)(sysAlloc(unsafe.Sizeof(buckhashArray{}), &memstats.buckhash_sys)) + if bh == nil { + throw("runtime: cannot allocate memory") + } + buckhash.StoreNoWB(unsafe.Pointer(bh)) + } + unlock(&profInsertLock) + } + + // Hash stack. + var h uintptr + for _, pc := range stk { + h += pc + h += h << 10 + h ^= h >> 6 + } + // hash in size + h += size + h += h << 10 + h ^= h >> 6 + // finalize + h += h << 3 + h ^= h >> 11 + + i := int(h % buckHashSize) + // first check optimistically, without the lock + for b := (*bucket)(bh[i].Load()); b != nil; b = b.next { + if b.typ == typ && b.hash == h && b.size == size && eqslice(b.stk(), stk) { + return b + } + } + + if !alloc { + return nil + } + + lock(&profInsertLock) + // check again under the insertion lock + for b := (*bucket)(bh[i].Load()); b != nil; b = b.next { + if b.typ == typ && b.hash == h && b.size == size && eqslice(b.stk(), stk) { + unlock(&profInsertLock) + return b + } + } + + // Create new bucket. + b := newBucket(typ, len(stk)) + copy(b.stk(), stk) + b.hash = h + b.size = size + + var allnext *atomic.UnsafePointer + if typ == memProfile { + allnext = &mbuckets + } else if typ == mutexProfile { + allnext = &xbuckets + } else { + allnext = &bbuckets + } + + b.next = (*bucket)(bh[i].Load()) + b.allnext = (*bucket)(allnext.Load()) + + bh[i].StoreNoWB(unsafe.Pointer(b)) + allnext.StoreNoWB(unsafe.Pointer(b)) + + unlock(&profInsertLock) + return b +} + +func eqslice(x, y []uintptr) bool { + if len(x) != len(y) { + return false + } + for i, xi := range x { + if xi != y[i] { + return false + } + } + return true +} + +// mProf_NextCycle publishes the next heap profile cycle and creates a +// fresh heap profile cycle. 
This operation is fast and can be done +// during STW. The caller must call mProf_Flush before calling +// mProf_NextCycle again. +// +// This is called by mark termination during STW so allocations and +// frees after the world is started again count towards a new heap +// profiling cycle. +func mProf_NextCycle() { + mProfCycle.increment() +} + +// mProf_Flush flushes the events from the current heap profiling +// cycle into the active profile. After this it is safe to start a new +// heap profiling cycle with mProf_NextCycle. +// +// This is called by GC after mark termination starts the world. In +// contrast with mProf_NextCycle, this is somewhat expensive, but safe +// to do concurrently. +func mProf_Flush() { + cycle, alreadyFlushed := mProfCycle.setFlushed() + if alreadyFlushed { + return + } + + index := cycle % uint32(len(memRecord{}.future)) + lock(&profMemActiveLock) + lock(&profMemFutureLock[index]) + mProf_FlushLocked(index) + unlock(&profMemFutureLock[index]) + unlock(&profMemActiveLock) +} + +// mProf_FlushLocked flushes the events from the heap profiling cycle at index +// into the active profile. The caller must hold the lock for the active profile +// (profMemActiveLock) and for the profiling cycle at index +// (profMemFutureLock[index]). +func mProf_FlushLocked(index uint32) { + assertLockHeld(&profMemActiveLock) + assertLockHeld(&profMemFutureLock[index]) + head := (*bucket)(mbuckets.Load()) + for b := head; b != nil; b = b.allnext { + mp := b.mp() + + // Flush cycle C into the published profile and clear + // it for reuse. + mpc := &mp.future[index] + mp.active.add(mpc) + *mpc = memRecordCycle{} + } +} + +// mProf_PostSweep records that all sweep frees for this GC cycle have +// completed. This has the effect of publishing the heap profile +// snapshot as of the last mark termination without advancing the heap +// profile cycle. 
+func mProf_PostSweep() { + // Flush cycle C+1 to the active profile so everything as of + // the last mark termination becomes visible. *Don't* advance + // the cycle, since we're still accumulating allocs in cycle + // C+2, which have to become C+1 in the next mark termination + // and so on. + cycle := mProfCycle.read() + 1 + + index := cycle % uint32(len(memRecord{}.future)) + lock(&profMemActiveLock) + lock(&profMemFutureLock[index]) + mProf_FlushLocked(index) + unlock(&profMemFutureLock[index]) + unlock(&profMemActiveLock) +} + +// Called by malloc to record a profiled block. +func mProf_Malloc(p unsafe.Pointer, size uintptr) { + var stk [maxStack]uintptr + nstk := callers(4, stk[:]) + + index := (mProfCycle.read() + 2) % uint32(len(memRecord{}.future)) + + b := stkbucket(memProfile, size, stk[:nstk], true) + mp := b.mp() + mpc := &mp.future[index] + + lock(&profMemFutureLock[index]) + mpc.allocs++ + mpc.alloc_bytes += size + unlock(&profMemFutureLock[index]) + + // Setprofilebucket locks a bunch of other mutexes, so we call it outside of + // the profiler locks. This reduces potential contention and chances of + // deadlocks. Since the object must be alive during the call to + // mProf_Malloc, it's fine to do this non-atomically. + systemstack(func() { + setprofilebucket(p, b) + }) +} + +// Called when freeing a profiled block. +func mProf_Free(b *bucket, size uintptr) { + index := (mProfCycle.read() + 1) % uint32(len(memRecord{}.future)) + + mp := b.mp() + mpc := &mp.future[index] + + lock(&profMemFutureLock[index]) + mpc.frees++ + mpc.free_bytes += size + unlock(&profMemFutureLock[index]) +} + +var blockprofilerate uint64 // in CPU ticks + +// SetBlockProfileRate controls the fraction of goroutine blocking events +// that are reported in the blocking profile. The profiler aims to sample +// an average of one blocking event per rate nanoseconds spent blocked. +// +// To include every blocking event in the profile, pass rate = 1. 
+// To turn off profiling entirely, pass rate <= 0. +func SetBlockProfileRate(rate int) { + var r int64 + if rate <= 0 { + r = 0 // disable profiling + } else if rate == 1 { + r = 1 // profile everything + } else { + // convert ns to cycles, use float64 to prevent overflow during multiplication + r = int64(float64(rate) * float64(tickspersecond()) / (1000 * 1000 * 1000)) + if r == 0 { + r = 1 + } + } + + atomic.Store64(&blockprofilerate, uint64(r)) +} + +func blockevent(cycles int64, skip int) { + if cycles <= 0 { + cycles = 1 + } + + rate := int64(atomic.Load64(&blockprofilerate)) + if blocksampled(cycles, rate) { + saveblockevent(cycles, rate, skip+1, blockProfile) + } +} + +// blocksampled returns true for all events where cycles >= rate. Shorter +// events have a cycles/rate random chance of returning true. +func blocksampled(cycles, rate int64) bool { + if rate <= 0 || (rate > cycles && int64(fastrand())%rate > cycles) { + return false + } + return true +} + +func saveblockevent(cycles, rate int64, skip int, which bucketType) { + gp := getg() + var nstk int + var stk [maxStack]uintptr + if gp.m.curg == nil || gp.m.curg == gp { + nstk = callers(skip, stk[:]) + } else { + nstk = gcallers(gp.m.curg, skip, stk[:]) + } + b := stkbucket(which, 0, stk[:nstk], true) + bp := b.bp() + + lock(&profBlockLock) + // We want to up-scale the count and cycles according to the + // probability that the event was sampled. For block profile events, + // the sample probability is 1 if cycles >= rate, and cycles / rate + // otherwise. For mutex profile events, the sample probability is 1 / rate. + // We scale the events by 1 / (probability the event was sampled). + if which == blockProfile && cycles < rate { + // Remove sampling bias, see discussion on http://golang.org/cl/299991. 
+ bp.count += float64(rate) / float64(cycles) + bp.cycles += rate + } else if which == mutexProfile { + bp.count += float64(rate) + bp.cycles += rate * cycles + } else { + bp.count++ + bp.cycles += cycles + } + unlock(&profBlockLock) +} + +var mutexprofilerate uint64 // fraction sampled + +// SetMutexProfileFraction controls the fraction of mutex contention events +// that are reported in the mutex profile. On average 1/rate events are +// reported. The previous rate is returned. +// +// To turn off profiling entirely, pass rate 0. +// To just read the current rate, pass rate < 0. +// (For n>1 the details of sampling may change.) +func SetMutexProfileFraction(rate int) int { + if rate < 0 { + return int(mutexprofilerate) + } + old := mutexprofilerate + atomic.Store64(&mutexprofilerate, uint64(rate)) + return int(old) +} + +//go:linkname mutexevent sync.event +func mutexevent(cycles int64, skip int) { + if cycles < 0 { + cycles = 0 + } + rate := int64(atomic.Load64(&mutexprofilerate)) + // TODO(pjw): measure impact of always calling fastrand vs using something + // like malloc.go:nextSample() + if rate > 0 && int64(fastrand())%rate == 0 { + saveblockevent(cycles, rate, skip+1, mutexProfile) + } +} + +// Go interface to profile data. + +// A StackRecord describes a single execution stack. +type StackRecord struct { + Stack0 [32]uintptr // stack trace for this record; ends at first 0 entry +} + +// Stack returns the stack trace associated with the record, +// a prefix of r.Stack0. +func (r *StackRecord) Stack() []uintptr { + for i, v := range r.Stack0 { + if v == 0 { + return r.Stack0[0:i] + } + } + return r.Stack0[0:] +} + +// MemProfileRate controls the fraction of memory allocations +// that are recorded and reported in the memory profile. +// The profiler aims to sample an average of +// one allocation per MemProfileRate bytes allocated. +// +// To include every allocated block in the profile, set MemProfileRate to 1. 
+// To turn off profiling entirely, set MemProfileRate to 0. +// +// The tools that process the memory profiles assume that the +// profile rate is constant across the lifetime of the program +// and equal to the current value. Programs that change the +// memory profiling rate should do so just once, as early as +// possible in the execution of the program (for example, +// at the beginning of main). +var MemProfileRate int = 512 * 1024 + +// disableMemoryProfiling is set by the linker if runtime.MemProfile +// is not used and the link type guarantees nobody else could use it +// elsewhere. +var disableMemoryProfiling bool + +// A MemProfileRecord describes the live objects allocated +// by a particular call sequence (stack trace). +type MemProfileRecord struct { + AllocBytes, FreeBytes int64 // number of bytes allocated, freed + AllocObjects, FreeObjects int64 // number of objects allocated, freed + Stack0 [32]uintptr // stack trace for this record; ends at first 0 entry +} + +// InUseBytes returns the number of bytes in use (AllocBytes - FreeBytes). +func (r *MemProfileRecord) InUseBytes() int64 { return r.AllocBytes - r.FreeBytes } + +// InUseObjects returns the number of objects in use (AllocObjects - FreeObjects). +func (r *MemProfileRecord) InUseObjects() int64 { + return r.AllocObjects - r.FreeObjects +} + +// Stack returns the stack trace associated with the record, +// a prefix of r.Stack0. +func (r *MemProfileRecord) Stack() []uintptr { + for i, v := range r.Stack0 { + if v == 0 { + return r.Stack0[0:i] + } + } + return r.Stack0[0:] +} + +// MemProfile returns a profile of memory allocated and freed per allocation +// site. +// +// MemProfile returns n, the number of records in the current memory profile. +// If len(p) >= n, MemProfile copies the profile into p and returns n, true. +// If len(p) < n, MemProfile does not change p and returns n, false. 
+// +// If inuseZero is true, the profile includes allocation records +// where r.AllocBytes > 0 but r.AllocBytes == r.FreeBytes. +// These are sites where memory was allocated, but it has all +// been released back to the runtime. +// +// The returned profile may be up to two garbage collection cycles old. +// This is to avoid skewing the profile toward allocations; because +// allocations happen in real time but frees are delayed until the garbage +// collector performs sweeping, the profile only accounts for allocations +// that have had a chance to be freed by the garbage collector. +// +// Most clients should use the runtime/pprof package or +// the testing package's -test.memprofile flag instead +// of calling MemProfile directly. +func MemProfile(p []MemProfileRecord, inuseZero bool) (n int, ok bool) { + cycle := mProfCycle.read() + // If we're between mProf_NextCycle and mProf_Flush, take care + // of flushing to the active profile so we only have to look + // at the active profile below. + index := cycle % uint32(len(memRecord{}.future)) + lock(&profMemActiveLock) + lock(&profMemFutureLock[index]) + mProf_FlushLocked(index) + unlock(&profMemFutureLock[index]) + clear := true + head := (*bucket)(mbuckets.Load()) + for b := head; b != nil; b = b.allnext { + mp := b.mp() + if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes { + n++ + } + if mp.active.allocs != 0 || mp.active.frees != 0 { + clear = false + } + } + if clear { + // Absolutely no data, suggesting that a garbage collection + // has not yet happened. In order to allow profiling when + // garbage collection is disabled from the beginning of execution, + // accumulate all of the cycles, and recount buckets. 
+ n = 0 + for b := head; b != nil; b = b.allnext { + mp := b.mp() + for c := range mp.future { + lock(&profMemFutureLock[c]) + mp.active.add(&mp.future[c]) + mp.future[c] = memRecordCycle{} + unlock(&profMemFutureLock[c]) + } + if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes { + n++ + } + } + } + if n <= len(p) { + ok = true + idx := 0 + for b := head; b != nil; b = b.allnext { + mp := b.mp() + if inuseZero || mp.active.alloc_bytes != mp.active.free_bytes { + record(&p[idx], b) + idx++ + } + } + } + unlock(&profMemActiveLock) + return +} + +// Write b's data to r. +func record(r *MemProfileRecord, b *bucket) { + mp := b.mp() + r.AllocBytes = int64(mp.active.alloc_bytes) + r.FreeBytes = int64(mp.active.free_bytes) + r.AllocObjects = int64(mp.active.allocs) + r.FreeObjects = int64(mp.active.frees) + if raceenabled { + racewriterangepc(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0), getcallerpc(), abi.FuncPCABIInternal(MemProfile)) + } + if msanenabled { + msanwrite(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0)) + } + if asanenabled { + asanwrite(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0)) + } + copy(r.Stack0[:], b.stk()) + for i := int(b.nstk); i < len(r.Stack0); i++ { + r.Stack0[i] = 0 + } +} + +func iterate_memprof(fn func(*bucket, uintptr, *uintptr, uintptr, uintptr, uintptr)) { + lock(&profMemActiveLock) + head := (*bucket)(mbuckets.Load()) + for b := head; b != nil; b = b.allnext { + mp := b.mp() + fn(b, b.nstk, &b.stk()[0], b.size, mp.active.allocs, mp.active.frees) + } + unlock(&profMemActiveLock) +} + +// BlockProfileRecord describes blocking events originated +// at a particular call sequence (stack trace). +type BlockProfileRecord struct { + Count int64 + Cycles int64 + StackRecord +} + +// BlockProfile returns n, the number of records in the current blocking profile. +// If len(p) >= n, BlockProfile copies the profile into p and returns n, true. +// If len(p) < n, BlockProfile does not change p and returns n, false. 
+// +// Most clients should use the runtime/pprof package or +// the testing package's -test.blockprofile flag instead +// of calling BlockProfile directly. +func BlockProfile(p []BlockProfileRecord) (n int, ok bool) { + lock(&profBlockLock) + head := (*bucket)(bbuckets.Load()) + for b := head; b != nil; b = b.allnext { + n++ + } + if n <= len(p) { + ok = true + for b := head; b != nil; b = b.allnext { + bp := b.bp() + r := &p[0] + r.Count = int64(bp.count) + // Prevent callers from having to worry about division by zero errors. + // See discussion on http://golang.org/cl/299991. + if r.Count == 0 { + r.Count = 1 + } + r.Cycles = bp.cycles + if raceenabled { + racewriterangepc(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0), getcallerpc(), abi.FuncPCABIInternal(BlockProfile)) + } + if msanenabled { + msanwrite(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0)) + } + if asanenabled { + asanwrite(unsafe.Pointer(&r.Stack0[0]), unsafe.Sizeof(r.Stack0)) + } + i := copy(r.Stack0[:], b.stk()) + for ; i < len(r.Stack0); i++ { + r.Stack0[i] = 0 + } + p = p[1:] + } + } + unlock(&profBlockLock) + return +} + +// MutexProfile returns n, the number of records in the current mutex profile. +// If len(p) >= n, MutexProfile copies the profile into p and returns n, true. +// Otherwise, MutexProfile does not change p, and returns n, false. +// +// Most clients should use the runtime/pprof package +// instead of calling MutexProfile directly. 
+func MutexProfile(p []BlockProfileRecord) (n int, ok bool) { + lock(&profBlockLock) + head := (*bucket)(xbuckets.Load()) + for b := head; b != nil; b = b.allnext { + n++ + } + if n <= len(p) { + ok = true + for b := head; b != nil; b = b.allnext { + bp := b.bp() + r := &p[0] + r.Count = int64(bp.count) + r.Cycles = bp.cycles + i := copy(r.Stack0[:], b.stk()) + for ; i < len(r.Stack0); i++ { + r.Stack0[i] = 0 + } + p = p[1:] + } + } + unlock(&profBlockLock) + return +} + +// ThreadCreateProfile returns n, the number of records in the thread creation profile. +// If len(p) >= n, ThreadCreateProfile copies the profile into p and returns n, true. +// If len(p) < n, ThreadCreateProfile does not change p and returns n, false. +// +// Most clients should use the runtime/pprof package instead +// of calling ThreadCreateProfile directly. +func ThreadCreateProfile(p []StackRecord) (n int, ok bool) { + first := (*m)(atomic.Loadp(unsafe.Pointer(&allm))) + for mp := first; mp != nil; mp = mp.alllink { + n++ + } + if n <= len(p) { + ok = true + i := 0 + for mp := first; mp != nil; mp = mp.alllink { + p[i].Stack0 = mp.createstack + i++ + } + } + return +} + +//go:linkname runtime_goroutineProfileWithLabels runtime/pprof.runtime_goroutineProfileWithLabels +func runtime_goroutineProfileWithLabels(p []StackRecord, labels []unsafe.Pointer) (n int, ok bool) { + return goroutineProfileWithLabels(p, labels) +} + +const go119ConcurrentGoroutineProfile = true + +// labels may be nil. If labels is non-nil, it must have the same length as p. 
+func goroutineProfileWithLabels(p []StackRecord, labels []unsafe.Pointer) (n int, ok bool) { + if labels != nil && len(labels) != len(p) { + labels = nil + } + + if go119ConcurrentGoroutineProfile { + return goroutineProfileWithLabelsConcurrent(p, labels) + } + return goroutineProfileWithLabelsSync(p, labels) +} + +var goroutineProfile = struct { + sema uint32 + active bool + offset atomic.Int64 + records []StackRecord + labels []unsafe.Pointer +}{ + sema: 1, +} + +// goroutineProfileState indicates the status of a goroutine's stack for the +// current in-progress goroutine profile. Goroutines' stacks are initially +// "Absent" from the profile, and end up "Satisfied" by the time the profile is +// complete. While a goroutine's stack is being captured, its +// goroutineProfileState will be "InProgress" and it will not be able to run +// until the capture completes and the state moves to "Satisfied". +// +// Some goroutines (the finalizer goroutine, which at various times can be +// either a "system" or a "user" goroutine; the goroutine that is +// coordinating the profile; and any goroutines created during the profile) move +// directly to the "Satisfied" state. 
+type goroutineProfileState uint32 + +const ( + goroutineProfileAbsent goroutineProfileState = iota + goroutineProfileInProgress + goroutineProfileSatisfied +) + +type goroutineProfileStateHolder atomic.Uint32 + +func (p *goroutineProfileStateHolder) Load() goroutineProfileState { + return goroutineProfileState((*atomic.Uint32)(p).Load()) +} + +func (p *goroutineProfileStateHolder) Store(value goroutineProfileState) { + (*atomic.Uint32)(p).Store(uint32(value)) +} + +func (p *goroutineProfileStateHolder) CompareAndSwap(old, new goroutineProfileState) bool { + return (*atomic.Uint32)(p).CompareAndSwap(uint32(old), uint32(new)) +} + +func goroutineProfileWithLabelsConcurrent(p []StackRecord, labels []unsafe.Pointer) (n int, ok bool) { + semacquire(&goroutineProfile.sema) + + ourg := getg() + + stopTheWorld("profile") + // Using gcount while the world is stopped should give us a consistent view + // of the number of live goroutines, minus the number of goroutines that are + // alive and permanently marked as "system". But to make this count agree + // with what we'd get from isSystemGoroutine, we need special handling for + // goroutines that can vary between user and system to ensure that the count + // doesn't change during the collection. So, check the finalizer goroutine + // in particular. + n = int(gcount()) + if fingStatus.Load()&fingRunningFinalizer != 0 { + n++ + } + + if n > len(p) { + // There's not enough space in p to store the whole profile, so (per the + // contract of runtime.GoroutineProfile) we're not allowed to write to p + // at all and must return n, false. + startTheWorld() + semrelease(&goroutineProfile.sema) + return n, false + } + + // Save current goroutine. + sp := getcallersp() + pc := getcallerpc() + systemstack(func() { + saveg(pc, sp, ourg, &p[0]) + }) + ourg.goroutineProfiled.Store(goroutineProfileSatisfied) + goroutineProfile.offset.Store(1) + + // Prepare for all other goroutines to enter the profile. 
Aside from ourg, + // every goroutine struct in the allgs list has its goroutineProfiled field + // cleared. Any goroutine created from this point on (while + // goroutineProfile.active is set) will start with its goroutineProfiled + // field set to goroutineProfileSatisfied. + goroutineProfile.active = true + goroutineProfile.records = p + goroutineProfile.labels = labels + // The finalizer goroutine needs special handling because it can vary over + // time between being a user goroutine (eligible for this profile) and a + // system goroutine (to be excluded). Pick one before restarting the world. + if fing != nil { + fing.goroutineProfiled.Store(goroutineProfileSatisfied) + if readgstatus(fing) != _Gdead && !isSystemGoroutine(fing, false) { + doRecordGoroutineProfile(fing) + } + } + startTheWorld() + + // Visit each goroutine that existed as of the startTheWorld call above. + // + // New goroutines may not be in this list, but we didn't want to know about + // them anyway. If they do appear in this list (via reusing a dead goroutine + // struct, or racing to launch between the world restarting and us getting + // the list), they will already have their goroutineProfiled field set to + // goroutineProfileSatisfied before their state transitions out of _Gdead. + // + // Any goroutine that the scheduler tries to execute concurrently with this + // call will start by adding itself to the profile (before the act of + // executing can cause any changes in its stack). + forEachGRace(func(gp1 *g) { + tryRecordGoroutineProfile(gp1, Gosched) + }) + + stopTheWorld("profile cleanup") + endOffset := goroutineProfile.offset.Swap(0) + goroutineProfile.active = false + goroutineProfile.records = nil + goroutineProfile.labels = nil + startTheWorld() + + // Restore the invariant that every goroutine struct in allgs has its + // goroutineProfiled field cleared. 
+ forEachGRace(func(gp1 *g) { + gp1.goroutineProfiled.Store(goroutineProfileAbsent) + }) + + if raceenabled { + raceacquire(unsafe.Pointer(&labelSync)) + } + + if n != int(endOffset) { + // It's a big surprise that the number of goroutines changed while we + // were collecting the profile. But probably better to return a + // truncated profile than to crash the whole process. + // + // For instance, needm moves a goroutine out of the _Gdead state and so + // might be able to change the goroutine count without interacting with + // the scheduler. For code like that, the race windows are small and the + // combination of features is uncommon, so it's hard to be (and remain) + // sure we've caught them all. + } + + semrelease(&goroutineProfile.sema) + return n, true +} + +// tryRecordGoroutineProfileWB asserts that write barriers are allowed and calls +// tryRecordGoroutineProfile. +// +//go:yeswritebarrierrec +func tryRecordGoroutineProfileWB(gp1 *g) { + if getg().m.p.ptr() == nil { + throw("no P available, write barriers are forbidden") + } + tryRecordGoroutineProfile(gp1, osyield) +} + +// tryRecordGoroutineProfile ensures that gp1 has the appropriate representation +// in the current goroutine profile: either that it should not be profiled, or +// that a snapshot of its call stack and labels are now in the profile. +func tryRecordGoroutineProfile(gp1 *g, yield func()) { + if readgstatus(gp1) == _Gdead { + // Dead goroutines should not appear in the profile. Goroutines that + // start while profile collection is active will get goroutineProfiled + // set to goroutineProfileSatisfied before transitioning out of _Gdead, + // so here we check _Gdead first. + return + } + if isSystemGoroutine(gp1, true) { + // System goroutines should not appear in the profile. (The finalizer + // goroutine is marked as "already profiled".) 
+ return + } + + for { + prev := gp1.goroutineProfiled.Load() + if prev == goroutineProfileSatisfied { + // This goroutine is already in the profile (or is new since the + // start of collection, so shouldn't appear in the profile). + break + } + if prev == goroutineProfileInProgress { + // Something else is adding gp1 to the goroutine profile right now. + // Give that a moment to finish. + yield() + continue + } + + // While we have gp1.goroutineProfiled set to + // goroutineProfileInProgress, gp1 may appear _Grunnable but will not + // actually be able to run. Disable preemption for ourselves, to make + // sure we finish profiling gp1 right away instead of leaving it stuck + // in this limbo. + mp := acquirem() + if gp1.goroutineProfiled.CompareAndSwap(goroutineProfileAbsent, goroutineProfileInProgress) { + doRecordGoroutineProfile(gp1) + gp1.goroutineProfiled.Store(goroutineProfileSatisfied) + } + releasem(mp) + } +} + +// doRecordGoroutineProfile writes gp1's call stack and labels to an in-progress +// goroutine profile. Preemption is disabled. +// +// This may be called via tryRecordGoroutineProfile in two ways: by the +// goroutine that is coordinating the goroutine profile (running on its own +// stack), or from the scheduler in preparation to execute gp1 (running on the +// system stack). +func doRecordGoroutineProfile(gp1 *g) { + if readgstatus(gp1) == _Grunning { + print("doRecordGoroutineProfile gp1=", gp1.goid, "\n") + throw("cannot read stack of running goroutine") + } + + offset := int(goroutineProfile.offset.Add(1)) - 1 + + if offset >= len(goroutineProfile.records) { + // Should be impossible, but better to return a truncated profile than + // to crash the entire process at this point. Instead, deal with it in + // goroutineProfileWithLabelsConcurrent where we have more context. + return + } + + // saveg calls gentraceback, which may call cgo traceback functions. 
 When + // called from the scheduler, this is on the system stack already so + // traceback.go:cgoContextPCs will avoid calling back into the scheduler. + // + // When called from the goroutine coordinating the profile, we still have + // set gp1.goroutineProfiled to goroutineProfileInProgress and so are still + // preventing it from being truly _Grunnable. So we'll use the system stack + // to avoid schedule delays. + systemstack(func() { saveg(^uintptr(0), ^uintptr(0), gp1, &goroutineProfile.records[offset]) }) + + if goroutineProfile.labels != nil { + goroutineProfile.labels[offset] = gp1.labels + } +} + +func goroutineProfileWithLabelsSync(p []StackRecord, labels []unsafe.Pointer) (n int, ok bool) { + gp := getg() + + isOK := func(gp1 *g) bool { + // Checking isSystemGoroutine here makes GoroutineProfile + // consistent with both NumGoroutine and Stack. + return gp1 != gp && readgstatus(gp1) != _Gdead && !isSystemGoroutine(gp1, false) + } + + stopTheWorld("profile") + + // World is stopped, no locking required. + n = 1 + forEachGRace(func(gp1 *g) { + if isOK(gp1) { + n++ + } + }) + + if n <= len(p) { + ok = true + r, lbl := p, labels + + // Save current goroutine. + sp := getcallersp() + pc := getcallerpc() + systemstack(func() { + saveg(pc, sp, gp, &r[0]) + }) + r = r[1:] + + // If we have a place to put our goroutine labelmap, insert it there. + if labels != nil { + lbl[0] = gp.labels + lbl = lbl[1:] + } + + // Save other goroutines. + forEachGRace(func(gp1 *g) { + if !isOK(gp1) { + return + } + + if len(r) == 0 { + // Should be impossible, but better to return a + // truncated profile than to crash the entire process. + return + } + // saveg calls gentraceback, which may call cgo traceback functions. + // The world is stopped, so it cannot use cgocall (which will be + // blocked at exitsyscall). Do it on the system stack so it won't + // call into the scheduler (see traceback.go:cgoContextPCs). 
+ systemstack(func() { saveg(^uintptr(0), ^uintptr(0), gp1, &r[0]) }) + if labels != nil { + lbl[0] = gp1.labels + lbl = lbl[1:] + } + r = r[1:] + }) + } + + if raceenabled { + raceacquire(unsafe.Pointer(&labelSync)) + } + + startTheWorld() + return n, ok +} + +// GoroutineProfile returns n, the number of records in the active goroutine stack profile. +// If len(p) >= n, GoroutineProfile copies the profile into p and returns n, true. +// If len(p) < n, GoroutineProfile does not change p and returns n, false. +// +// Most clients should use the runtime/pprof package instead +// of calling GoroutineProfile directly. +func GoroutineProfile(p []StackRecord) (n int, ok bool) { + + return goroutineProfileWithLabels(p, nil) +} + +func saveg(pc, sp uintptr, gp *g, r *StackRecord) { + n := gentraceback(pc, sp, 0, gp, 0, &r.Stack0[0], len(r.Stack0), nil, nil, 0) + if n < len(r.Stack0) { + r.Stack0[n] = 0 + } +} + +// Stack formats a stack trace of the calling goroutine into buf +// and returns the number of bytes written to buf. +// If all is true, Stack formats stack traces of all other goroutines +// into buf after the trace for the current goroutine. +func Stack(buf []byte, all bool) int { + if all { + stopTheWorld("stack trace") + } + + n := 0 + if len(buf) > 0 { + gp := getg() + sp := getcallersp() + pc := getcallerpc() + systemstack(func() { + g0 := getg() + // Force traceback=1 to override GOTRACEBACK setting, + // so that Stack's results are consistent. + // GOTRACEBACK is only about crash dumps. + g0.m.traceback = 1 + g0.writebuf = buf[0:0:len(buf)] + goroutineheader(gp) + traceback(pc, sp, 0, gp) + if all { + tracebackothers(gp) + } + g0.m.traceback = 0 + n = len(g0.writebuf) + g0.writebuf = nil + }) + } + + if all { + startTheWorld() + } + return n +} + +// Tracing of alloc/free/gc. 
+ +var tracelock mutex + +func tracealloc(p unsafe.Pointer, size uintptr, typ *_type) { + lock(&tracelock) + gp := getg() + gp.m.traceback = 2 + if typ == nil { + print("tracealloc(", p, ", ", hex(size), ")\n") + } else { + print("tracealloc(", p, ", ", hex(size), ", ", typ.string(), ")\n") + } + if gp.m.curg == nil || gp == gp.m.curg { + goroutineheader(gp) + pc := getcallerpc() + sp := getcallersp() + systemstack(func() { + traceback(pc, sp, 0, gp) + }) + } else { + goroutineheader(gp.m.curg) + traceback(^uintptr(0), ^uintptr(0), 0, gp.m.curg) + } + print("\n") + gp.m.traceback = 0 + unlock(&tracelock) +} + +func tracefree(p unsafe.Pointer, size uintptr) { + lock(&tracelock) + gp := getg() + gp.m.traceback = 2 + print("tracefree(", p, ", ", hex(size), ")\n") + goroutineheader(gp) + pc := getcallerpc() + sp := getcallersp() + systemstack(func() { + traceback(pc, sp, 0, gp) + }) + print("\n") + gp.m.traceback = 0 + unlock(&tracelock) +} + +func tracegc() { + lock(&tracelock) + gp := getg() + gp.m.traceback = 2 + print("tracegc()\n") + // running on m->g0 stack; show all non-g0 goroutines + tracebackothers(gp) + print("end tracegc\n") + print("\n") + gp.m.traceback = 0 + unlock(&tracelock) +} diff --git a/src/runtime/mranges.go b/src/runtime/mranges.go new file mode 100644 index 0000000..4388d26 --- /dev/null +++ b/src/runtime/mranges.go @@ -0,0 +1,460 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Address range data structure. +// +// This file contains an implementation of a data structure which +// manages ordered address ranges. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// addrRange represents a region of address space. +// +// An addrRange must never span a gap in the address space. 
+type addrRange struct { + // base and limit together represent the region of address space + // [base, limit). That is, base is inclusive, limit is exclusive. + // These are addresses over an offset view of the address space on + // platforms with a segmented address space, that is, on platforms + // where arenaBaseOffset != 0. + base, limit offAddr +} + +// makeAddrRange creates a new address range from two virtual addresses. +// +// Throws if the base and limit are not in the same memory segment. +func makeAddrRange(base, limit uintptr) addrRange { + r := addrRange{offAddr{base}, offAddr{limit}} + if (base-arenaBaseOffset >= base) != (limit-arenaBaseOffset >= limit) { + throw("addr range base and limit are not in the same memory segment") + } + return r +} + +// size returns the size of the range represented in bytes. +func (a addrRange) size() uintptr { + if !a.base.lessThan(a.limit) { + return 0 + } + // Subtraction is safe because limit and base must be in the same + // segment of the address space. + return a.limit.diff(a.base) +} + +// contains returns whether or not the range contains a given address. +func (a addrRange) contains(addr uintptr) bool { + return a.base.lessEqual(offAddr{addr}) && (offAddr{addr}).lessThan(a.limit) +} + +// subtract takes the addrRange a and cuts out any overlap with +// b, then returns the new range. subtract assumes that a and b +// either don't overlap at all, only overlap on one side, or are equal. +// If b is strictly contained in a, thus forcing a split, it will throw.
+func (a addrRange) subtract(b addrRange) addrRange { + if b.base.lessEqual(a.base) && a.limit.lessEqual(b.limit) { + return addrRange{} + } else if a.base.lessThan(b.base) && b.limit.lessThan(a.limit) { + throw("bad prune") + } else if b.limit.lessThan(a.limit) && a.base.lessThan(b.limit) { + a.base = b.limit + } else if a.base.lessThan(b.base) && b.base.lessThan(a.limit) { + a.limit = b.base + } + return a +} + +// takeFromFront takes len bytes from the front of the address range, aligning +// the base to align first. On success, returns the aligned start of the region +// taken and true. +func (a *addrRange) takeFromFront(len uintptr, align uint8) (uintptr, bool) { + base := alignUp(a.base.addr(), uintptr(align)) + len + if base > a.limit.addr() { + return 0, false + } + a.base = offAddr{base} + return base - len, true +} + +// takeFromBack takes len bytes from the end of the address range, aligning +// the limit to align after subtracting len. On success, returns the aligned +// start of the region taken and true. +func (a *addrRange) takeFromBack(len uintptr, align uint8) (uintptr, bool) { + limit := alignDown(a.limit.addr()-len, uintptr(align)) + if a.base.addr() > limit { + return 0, false + } + a.limit = offAddr{limit} + return limit, true +} + +// removeGreaterEqual removes all addresses in a greater than or equal +// to addr and returns the new range. +func (a addrRange) removeGreaterEqual(addr uintptr) addrRange { + if (offAddr{addr}).lessEqual(a.base) { + return addrRange{} + } + if a.limit.lessEqual(offAddr{addr}) { + return a + } + return makeAddrRange(a.base.addr(), addr) +} + +var ( + // minOffAddr is the minimum address in the offset space, and + // it corresponds to the virtual address arenaBaseOffset. + minOffAddr = offAddr{arenaBaseOffset} + + // maxOffAddr is the maximum address in the offset address + // space. It corresponds to the highest virtual address representable + // by the page alloc chunk and heap arena maps. 
+ maxOffAddr = offAddr{(((1 << heapAddrBits) - 1) + arenaBaseOffset) & uintptrMask} +) + +// offAddr represents an address in a contiguous view +// of the address space on systems where the address space is +// segmented. On other systems, it's just a normal address. +type offAddr struct { + // a is just the virtual address, but should never be used + // directly. Call addr() to get this value instead. + a uintptr +} + +// add adds a uintptr offset to the offAddr. +func (l offAddr) add(bytes uintptr) offAddr { + return offAddr{a: l.a + bytes} +} + +// sub subtracts a uintptr offset from the offAddr. +func (l offAddr) sub(bytes uintptr) offAddr { + return offAddr{a: l.a - bytes} +} + +// diff returns the amount of bytes in between the +// two offAddrs. +func (l1 offAddr) diff(l2 offAddr) uintptr { + return l1.a - l2.a +} + +// lessThan returns true if l1 is less than l2 in the offset +// address space. +func (l1 offAddr) lessThan(l2 offAddr) bool { + return (l1.a - arenaBaseOffset) < (l2.a - arenaBaseOffset) +} + +// lessEqual returns true if l1 is less than or equal to l2 in +// the offset address space. +func (l1 offAddr) lessEqual(l2 offAddr) bool { + return (l1.a - arenaBaseOffset) <= (l2.a - arenaBaseOffset) +} + +// equal returns true if the two offAddr values are equal. +func (l1 offAddr) equal(l2 offAddr) bool { + // No need to compare in the offset space, it + // means the same thing. + return l1 == l2 +} + +// addr returns the virtual address for this offset address. +func (l offAddr) addr() uintptr { + return l.a +} + +// atomicOffAddr is like offAddr, but operations on it are atomic. +// It also contains operations to be able to store marked addresses +// to ensure that they're not overridden until they've been seen. +type atomicOffAddr struct { + // a contains the offset address, unlike offAddr. + a atomic.Int64 +} + +// Clear attempts to store minOffAddr in atomicOffAddr. It may fail +// if a marked value is placed in the box in the meanwhile. 
+func (b *atomicOffAddr) Clear() { + for { + old := b.a.Load() + if old < 0 { + return + } + if b.a.CompareAndSwap(old, int64(minOffAddr.addr()-arenaBaseOffset)) { + return + } + } +} + +// StoreMin stores addr if it's less than the current value in the +// offset address space and the current value is not marked. +func (b *atomicOffAddr) StoreMin(addr uintptr) { + new := int64(addr - arenaBaseOffset) + for { + old := b.a.Load() + if old < new { + return + } + if b.a.CompareAndSwap(old, new) { + return + } + } +} + +// StoreUnmark attempts to unmark the value in atomicOffAddr and +// replace it with newAddr. markedAddr must be a marked address +// returned by Load. This function will not store newAddr if the +// box no longer contains markedAddr. +func (b *atomicOffAddr) StoreUnmark(markedAddr, newAddr uintptr) { + b.a.CompareAndSwap(-int64(markedAddr-arenaBaseOffset), int64(newAddr-arenaBaseOffset)) +} + +// StoreMarked stores addr after first converting it to the offset +// address space and then negating it. +func (b *atomicOffAddr) StoreMarked(addr uintptr) { + b.a.Store(-int64(addr - arenaBaseOffset)) +} + +// Load returns the address in the box as a virtual address. It also +// returns whether the value was marked. +func (b *atomicOffAddr) Load() (uintptr, bool) { + v := b.a.Load() + wasMarked := false + if v < 0 { + wasMarked = true + v = -v + } + return uintptr(v) + arenaBaseOffset, wasMarked +} + +// addrRanges is a data structure holding a collection of ranges of +// address space. +// +// The ranges are coalesced eagerly to reduce the +// number of ranges it holds. +// +// The slice backing store for this field is persistentalloc'd +// and thus there is no way to free it. +// +// addrRanges is not thread-safe. +type addrRanges struct { + // ranges is a slice of ranges sorted by base. + ranges []addrRange + + // totalBytes is the total amount of address space in bytes counted by + // this addrRanges.
+ totalBytes uintptr + + // sysStat is the stat to track allocations by this type + sysStat *sysMemStat +} + +func (a *addrRanges) init(sysStat *sysMemStat) { + ranges := (*notInHeapSlice)(unsafe.Pointer(&a.ranges)) + ranges.len = 0 + ranges.cap = 16 + ranges.array = (*notInHeap)(persistentalloc(unsafe.Sizeof(addrRange{})*uintptr(ranges.cap), goarch.PtrSize, sysStat)) + a.sysStat = sysStat + a.totalBytes = 0 +} + +// findSucc returns the first index in a such that addr is +// less than the base of the addrRange at that index. +func (a *addrRanges) findSucc(addr uintptr) int { + base := offAddr{addr} + + // Narrow down the search space via a binary search + // for large addrRanges until we have at most iterMax + // candidates left. + const iterMax = 8 + bot, top := 0, len(a.ranges) + for top-bot > iterMax { + i := ((top - bot) / 2) + bot + if a.ranges[i].contains(base.addr()) { + // a.ranges[i] contains base, so + // its successor is the next index. + return i + 1 + } + if base.lessThan(a.ranges[i].base) { + // In this case i might actually be + // the successor, but we can't be sure + // until we check the ones before it. + top = i + } else { + // In this case we know base is + // greater than or equal to a.ranges[i].limit-1, + // so i is definitely not the successor. + // We already checked i, so pick the next + // one. + bot = i + 1 + } + } + // There are top-bot candidates left, so + // iterate over them and find the first that + // base is strictly less than. + for i := bot; i < top; i++ { + if base.lessThan(a.ranges[i].base) { + return i + } + } + return top +} + +// findAddrGreaterEqual returns the smallest address represented by a +// that is >= addr. Thus, if the address is represented by a, +// then it returns addr. The second return value indicates whether +// such an address exists for addr in a. That is, if addr is larger than +// any address known to a, the second return value will be false. 
+func (a *addrRanges) findAddrGreaterEqual(addr uintptr) (uintptr, bool) { + i := a.findSucc(addr) + if i == 0 { + return a.ranges[0].base.addr(), true + } + if a.ranges[i-1].contains(addr) { + return addr, true + } + if i < len(a.ranges) { + return a.ranges[i].base.addr(), true + } + return 0, false +} + +// contains returns true if a covers the address addr. +func (a *addrRanges) contains(addr uintptr) bool { + i := a.findSucc(addr) + if i == 0 { + return false + } + return a.ranges[i-1].contains(addr) +} + +// add inserts a new address range to a. +// +// r must not overlap with any address range in a and r.size() must be > 0. +func (a *addrRanges) add(r addrRange) { + // The copies in this function are potentially expensive, but this data + // structure is meant to represent the Go heap. At worst, copying this + // would take ~160µs assuming a conservative copying rate of 25 GiB/s (the + // copy will almost never trigger a page fault) for a 1 TiB heap with 4 MiB + // arenas which is completely discontiguous. ~160µs is still a lot, but in + // practice most platforms have 64 MiB arenas (which cuts this by a factor + // of 16) and Go heaps are usually mostly contiguous, so the chance that + // an addrRanges even grows to that size is extremely low. + + // An empty range has no effect on the set of addresses represented + // by a, but passing a zero-sized range is almost always a bug. + if r.size() == 0 { + print("runtime: range = {", hex(r.base.addr()), ", ", hex(r.limit.addr()), "}\n") + throw("attempted to add zero-sized address range") + } + // Because we assume r is not currently represented in a, + // findSucc gives us our insertion index. + i := a.findSucc(r.base.addr()) + coalescesDown := i > 0 && a.ranges[i-1].limit.equal(r.base) + coalescesUp := i < len(a.ranges) && r.limit.equal(a.ranges[i].base) + if coalescesUp && coalescesDown { + // We have neighbors and they both border us. + // Merge a.ranges[i-1], r, and a.ranges[i] together into a.ranges[i-1]. 
+ a.ranges[i-1].limit = a.ranges[i].limit + + // Delete a.ranges[i]. + copy(a.ranges[i:], a.ranges[i+1:]) + a.ranges = a.ranges[:len(a.ranges)-1] + } else if coalescesDown { + // We have a neighbor at a lower address only and it borders us. + // Merge the new space into a.ranges[i-1]. + a.ranges[i-1].limit = r.limit + } else if coalescesUp { + // We have a neighbor at a higher address only and it borders us. + // Merge the new space into a.ranges[i]. + a.ranges[i].base = r.base + } else { + // We may or may not have neighbors which don't border us. + // Add the new range. + if len(a.ranges)+1 > cap(a.ranges) { + // Grow the array. Note that this leaks the old array, but since + // we're doubling we have at most 2x waste. For a 1 TiB heap and + // 4 MiB arenas which are all discontiguous (both very conservative + // assumptions), this would waste at most 4 MiB of memory. + oldRanges := a.ranges + ranges := (*notInHeapSlice)(unsafe.Pointer(&a.ranges)) + ranges.len = len(oldRanges) + 1 + ranges.cap = cap(oldRanges) * 2 + ranges.array = (*notInHeap)(persistentalloc(unsafe.Sizeof(addrRange{})*uintptr(ranges.cap), goarch.PtrSize, a.sysStat)) + + // Copy in the old array, but make space for the new range. + copy(a.ranges[:i], oldRanges[:i]) + copy(a.ranges[i+1:], oldRanges[i:]) + } else { + a.ranges = a.ranges[:len(a.ranges)+1] + copy(a.ranges[i+1:], a.ranges[i:]) + } + a.ranges[i] = r + } + a.totalBytes += r.size() +} + +// removeLast removes and returns the highest-addressed contiguous range +// of a, or the last nBytes of that range, whichever is smaller. If a is +// empty, it returns an empty range. 
+func (a *addrRanges) removeLast(nBytes uintptr) addrRange { + if len(a.ranges) == 0 { + return addrRange{} + } + r := a.ranges[len(a.ranges)-1] + size := r.size() + if size > nBytes { + newEnd := r.limit.sub(nBytes) + a.ranges[len(a.ranges)-1].limit = newEnd + a.totalBytes -= nBytes + return addrRange{newEnd, r.limit} + } + a.ranges = a.ranges[:len(a.ranges)-1] + a.totalBytes -= size + return r +} + +// removeGreaterEqual removes the ranges of a which are above addr, and additionally +// splits any range containing addr. +func (a *addrRanges) removeGreaterEqual(addr uintptr) { + pivot := a.findSucc(addr) + if pivot == 0 { + // addr is before all ranges in a. + a.totalBytes = 0 + a.ranges = a.ranges[:0] + return + } + removed := uintptr(0) + for _, r := range a.ranges[pivot:] { + removed += r.size() + } + if r := a.ranges[pivot-1]; r.contains(addr) { + removed += r.size() + r = r.removeGreaterEqual(addr) + if r.size() == 0 { + pivot-- + } else { + removed -= r.size() + a.ranges[pivot-1] = r + } + } + a.ranges = a.ranges[:pivot] + a.totalBytes -= removed +} + +// cloneInto makes a deep clone of a's state into b, re-using +// b's ranges if able. +func (a *addrRanges) cloneInto(b *addrRanges) { + if len(a.ranges) > cap(b.ranges) { + // Grow the array. + ranges := (*notInHeapSlice)(unsafe.Pointer(&b.ranges)) + ranges.len = 0 + ranges.cap = cap(a.ranges) + ranges.array = (*notInHeap)(persistentalloc(unsafe.Sizeof(addrRange{})*uintptr(ranges.cap), goarch.PtrSize, b.sysStat)) + } + b.ranges = b.ranges[:len(a.ranges)] + b.totalBytes = a.totalBytes + copy(b.ranges, a.ranges) +} diff --git a/src/runtime/mranges_test.go b/src/runtime/mranges_test.go new file mode 100644 index 0000000..ed439c5 --- /dev/null +++ b/src/runtime/mranges_test.go @@ -0,0 +1,275 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + . 
"runtime" + "testing" +) + +func validateAddrRanges(t *testing.T, a *AddrRanges, want ...AddrRange) { + ranges := a.Ranges() + if len(ranges) != len(want) { + t.Errorf("want %v, got %v", want, ranges) + t.Fatal("different lengths") + } + gotTotalBytes := uintptr(0) + wantTotalBytes := uintptr(0) + for i := range ranges { + gotTotalBytes += ranges[i].Size() + wantTotalBytes += want[i].Size() + if ranges[i].Base() >= ranges[i].Limit() { + t.Error("empty range found") + } + // Ensure this is equivalent to what we want. + if !ranges[i].Equals(want[i]) { + t.Errorf("range %d: got [0x%x, 0x%x), want [0x%x, 0x%x)", i, + ranges[i].Base(), ranges[i].Limit(), + want[i].Base(), want[i].Limit(), + ) + } + if i != 0 { + // Ensure the ranges are sorted. + if ranges[i-1].Base() >= ranges[i].Base() { + t.Errorf("ranges %d and %d are out of sorted order", i-1, i) + } + // Check for a failure to coalesce. + if ranges[i-1].Limit() == ranges[i].Base() { + t.Errorf("ranges %d and %d should have coalesced", i-1, i) + } + // Check if any ranges overlap. Because the ranges are sorted + // by base, it's sufficient to just check neighbors. + if ranges[i-1].Limit() > ranges[i].Base() { + t.Errorf("ranges %d and %d overlap", i-1, i) + } + } + } + if wantTotalBytes != gotTotalBytes { + t.Errorf("expected %d total bytes, got %d", wantTotalBytes, gotTotalBytes) + } + if b := a.TotalBytes(); b != gotTotalBytes { + t.Errorf("inconsistent total bytes: want %d, got %d", gotTotalBytes, b) + } + if t.Failed() { + t.Errorf("addrRanges: %v", ranges) + t.Fatal("detected bad addrRanges") + } +} + +func TestAddrRangesAdd(t *testing.T) { + a := NewAddrRanges() + + // First range. + a.Add(MakeAddrRange(512, 1024)) + validateAddrRanges(t, &a, + MakeAddrRange(512, 1024), + ) + + // Coalesce up. + a.Add(MakeAddrRange(1024, 2048)) + validateAddrRanges(t, &a, + MakeAddrRange(512, 2048), + ) + + // Add new independent range. 
+ a.Add(MakeAddrRange(4096, 8192)) + validateAddrRanges(t, &a, + MakeAddrRange(512, 2048), + MakeAddrRange(4096, 8192), + ) + + // Coalesce down. + a.Add(MakeAddrRange(3776, 4096)) + validateAddrRanges(t, &a, + MakeAddrRange(512, 2048), + MakeAddrRange(3776, 8192), + ) + + // Coalesce up and down. + a.Add(MakeAddrRange(2048, 3776)) + validateAddrRanges(t, &a, + MakeAddrRange(512, 8192), + ) + + // Push a bunch of independent ranges to the end to try and force growth. + expectedRanges := []AddrRange{MakeAddrRange(512, 8192)} + for i := uintptr(0); i < 64; i++ { + dRange := MakeAddrRange(8192+(i+1)*2048, 8192+(i+1)*2048+10) + a.Add(dRange) + expectedRanges = append(expectedRanges, dRange) + validateAddrRanges(t, &a, expectedRanges...) + } + + // Push a bunch of independent ranges to the beginning to try and force growth. + var bottomRanges []AddrRange + for i := uintptr(0); i < 63; i++ { + dRange := MakeAddrRange(8+i*8, 8+i*8+4) + a.Add(dRange) + bottomRanges = append(bottomRanges, dRange) + validateAddrRanges(t, &a, append(bottomRanges, expectedRanges...)...) 
+ } +} + +func TestAddrRangesFindSucc(t *testing.T) { + var large []AddrRange + for i := 0; i < 100; i++ { + large = append(large, MakeAddrRange(5+uintptr(i)*5, 5+uintptr(i)*5+3)) + } + + type testt struct { + name string + base uintptr + expect int + ranges []AddrRange + } + tests := []testt{ + { + name: "Empty", + base: 12, + expect: 0, + ranges: []AddrRange{}, + }, + { + name: "OneBefore", + base: 12, + expect: 0, + ranges: []AddrRange{ + MakeAddrRange(14, 16), + }, + }, + { + name: "OneWithin", + base: 14, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(14, 16), + }, + }, + { + name: "OneAfterLimit", + base: 16, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(14, 16), + }, + }, + { + name: "OneAfter", + base: 17, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(14, 16), + }, + }, + { + name: "ThreeBefore", + base: 3, + expect: 0, + ranges: []AddrRange{ + MakeAddrRange(6, 10), + MakeAddrRange(12, 16), + MakeAddrRange(19, 22), + }, + }, + { + name: "ThreeAfter", + base: 24, + expect: 3, + ranges: []AddrRange{ + MakeAddrRange(6, 10), + MakeAddrRange(12, 16), + MakeAddrRange(19, 22), + }, + }, + { + name: "ThreeBetween", + base: 11, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(6, 10), + MakeAddrRange(12, 16), + MakeAddrRange(19, 22), + }, + }, + { + name: "ThreeWithin", + base: 9, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(6, 10), + MakeAddrRange(12, 16), + MakeAddrRange(19, 22), + }, + }, + { + name: "Zero", + base: 0, + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(0, 10), + }, + }, + { + name: "Max", + base: ^uintptr(0), + expect: 1, + ranges: []AddrRange{ + MakeAddrRange(^uintptr(0)-5, ^uintptr(0)), + }, + }, + { + name: "LargeBefore", + base: 2, + expect: 0, + ranges: large, + }, + { + name: "LargeAfter", + base: 5 + uintptr(len(large))*5 + 30, + expect: len(large), + ranges: large, + }, + { + name: "LargeBetweenLow", + base: 14, + expect: 2, + ranges: large, + }, + { + name: "LargeBetweenHigh", + base: 249, + expect: 49, + 
ranges: large, + }, + { + name: "LargeWithinLow", + base: 25, + expect: 5, + ranges: large, + }, + { + name: "LargeWithinHigh", + base: 396, + expect: 79, + ranges: large, + }, + { + name: "LargeWithinMiddle", + base: 250, + expect: 50, + ranges: large, + }, + } + + for _, test := range tests { + t.Run(test.name, func(t *testing.T) { + a := MakeAddrRanges(test.ranges...) + i := a.FindSucc(test.base) + if i != test.expect { + t.Fatalf("expected %d, got %d", test.expect, i) + } + }) + } +} diff --git a/src/runtime/msan.go b/src/runtime/msan.go new file mode 100644 index 0000000..5e2aae1 --- /dev/null +++ b/src/runtime/msan.go @@ -0,0 +1,62 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build msan + +package runtime + +import ( + "unsafe" +) + +// Public memory sanitizer API. + +func MSanRead(addr unsafe.Pointer, len int) { + msanread(addr, uintptr(len)) +} + +func MSanWrite(addr unsafe.Pointer, len int) { + msanwrite(addr, uintptr(len)) +} + +// Private interface for the runtime. +const msanenabled = true + +// If we are running on the system stack, the C program may have +// marked part of that stack as uninitialized. We don't instrument +// the runtime, but operations like a slice copy can call msanread +// anyhow for values on the stack. Just ignore msanread when running +// on the system stack. The other msan functions are fine. 
+// +//go:nosplit +func msanread(addr unsafe.Pointer, sz uintptr) { + gp := getg() + if gp == nil || gp.m == nil || gp == gp.m.g0 || gp == gp.m.gsignal { + return + } + domsanread(addr, sz) +} + +//go:noescape +func domsanread(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func msanwrite(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func msanmalloc(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func msanfree(addr unsafe.Pointer, sz uintptr) + +//go:noescape +func msanmove(dst, src unsafe.Pointer, sz uintptr) + +// These are called from msan_GOARCH.s +// +//go:cgo_import_static __msan_read_go +//go:cgo_import_static __msan_write_go +//go:cgo_import_static __msan_malloc_go +//go:cgo_import_static __msan_free_go +//go:cgo_import_static __msan_memmove diff --git a/src/runtime/msan/msan.go b/src/runtime/msan/msan.go new file mode 100644 index 0000000..4e41f85 --- /dev/null +++ b/src/runtime/msan/msan.go @@ -0,0 +1,32 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build msan && ((linux && (amd64 || arm64)) || (freebsd && amd64)) + +package msan + +/* +#cgo CFLAGS: -fsanitize=memory +#cgo LDFLAGS: -fsanitize=memory + +#include <stdint.h> +#include <sanitizer/msan_interface.h> + +void __msan_read_go(void *addr, uintptr_t sz) { + __msan_check_mem_is_initialized(addr, sz); +} + +void __msan_write_go(void *addr, uintptr_t sz) { + __msan_unpoison(addr, sz); +} + +void __msan_malloc_go(void *addr, uintptr_t sz) { + __msan_unpoison(addr, sz); +} + +void __msan_free_go(void *addr, uintptr_t sz) { + __msan_poison(addr, sz); +} +*/ +import "C" diff --git a/src/runtime/msan0.go b/src/runtime/msan0.go new file mode 100644 index 0000000..2f5fd2d --- /dev/null +++ b/src/runtime/msan0.go @@ -0,0 +1,23 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !msan + +// Dummy MSan support API, used when not built with -msan. + +package runtime + +import ( + "unsafe" +) + +const msanenabled = false + +// Because msanenabled is false, none of these functions should be called. + +func msanread(addr unsafe.Pointer, sz uintptr) { throw("msan") } +func msanwrite(addr unsafe.Pointer, sz uintptr) { throw("msan") } +func msanmalloc(addr unsafe.Pointer, sz uintptr) { throw("msan") } +func msanfree(addr unsafe.Pointer, sz uintptr) { throw("msan") } +func msanmove(dst, src unsafe.Pointer, sz uintptr) { throw("msan") } diff --git a/src/runtime/msan_amd64.s b/src/runtime/msan_amd64.s new file mode 100644 index 0000000..89ed304 --- /dev/null +++ b/src/runtime/msan_amd64.s @@ -0,0 +1,89 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build msan + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// This is like race_amd64.s, but for the msan calls. +// See race_amd64.s for detailed comments. + +#ifdef GOOS_windows +#define RARG0 CX +#define RARG1 DX +#define RARG2 R8 +#define RARG3 R9 +#else +#define RARG0 DI +#define RARG1 SI +#define RARG2 DX +#define RARG3 CX +#endif + +// func runtime·domsanread(addr unsafe.Pointer, sz uintptr) +// Called from msanread. +TEXT runtime·domsanread(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __msan_read_go(void *addr, uintptr_t sz); + MOVQ $__msan_read_go(SB), AX + JMP msancall<>(SB) + +// func runtime·msanwrite(addr unsafe.Pointer, sz uintptr) +// Called from instrumented code. 
+TEXT runtime·msanwrite(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __msan_write_go(void *addr, uintptr_t sz); + MOVQ $__msan_write_go(SB), AX + JMP msancall<>(SB) + +// func runtime·msanmalloc(addr unsafe.Pointer, sz uintptr) +TEXT runtime·msanmalloc(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __msan_malloc_go(void *addr, uintptr_t sz); + MOVQ $__msan_malloc_go(SB), AX + JMP msancall<>(SB) + +// func runtime·msanfree(addr unsafe.Pointer, sz uintptr) +TEXT runtime·msanfree(SB), NOSPLIT, $0-16 + MOVQ addr+0(FP), RARG0 + MOVQ size+8(FP), RARG1 + // void __msan_free_go(void *addr, uintptr_t sz); + MOVQ $__msan_free_go(SB), AX + JMP msancall<>(SB) + +// func runtime·msanmove(dst, src unsafe.Pointer, sz uintptr) +TEXT runtime·msanmove(SB), NOSPLIT, $0-24 + MOVQ dst+0(FP), RARG0 + MOVQ src+8(FP), RARG1 + MOVQ size+16(FP), RARG2 + // void __msan_memmove(void *dst, void *src, uintptr_t sz); + MOVQ $__msan_memmove(SB), AX + JMP msancall<>(SB) + +// Switches SP to g0 stack and calls (AX). Arguments already set. +TEXT msancall<>(SB), NOSPLIT, $0-0 + get_tls(R12) + MOVQ g(R12), R14 + MOVQ SP, R12 // callee-saved, preserved across the CALL + CMPQ R14, $0 + JE call // no g; still on a system stack + + MOVQ g_m(R14), R13 + // Switch to g0 stack. + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + CALL AX + MOVQ R12, SP + RET diff --git a/src/runtime/msan_arm64.s b/src/runtime/msan_arm64.s new file mode 100644 index 0000000..b9eff34 --- /dev/null +++ b/src/runtime/msan_arm64.s @@ -0,0 +1,73 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build msan + +#include "go_asm.h" +#include "textflag.h" + +#define RARG0 R0 +#define RARG1 R1 +#define RARG2 R2 +#define FARG R3 + +// func runtime·domsanread(addr unsafe.Pointer, sz uintptr) +// Called from msanread. +TEXT runtime·domsanread(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __msan_read_go(void *addr, uintptr_t sz); + MOVD $__msan_read_go(SB), FARG + JMP msancall<>(SB) + +// func runtime·msanwrite(addr unsafe.Pointer, sz uintptr) +// Called from instrumented code. +TEXT runtime·msanwrite(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __msan_write_go(void *addr, uintptr_t sz); + MOVD $__msan_write_go(SB), FARG + JMP msancall<>(SB) + +// func runtime·msanmalloc(addr unsafe.Pointer, sz uintptr) +TEXT runtime·msanmalloc(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __msan_malloc_go(void *addr, uintptr_t sz); + MOVD $__msan_malloc_go(SB), FARG + JMP msancall<>(SB) + +// func runtime·msanfree(addr unsafe.Pointer, sz uintptr) +TEXT runtime·msanfree(SB), NOSPLIT, $0-16 + MOVD addr+0(FP), RARG0 + MOVD size+8(FP), RARG1 + // void __msan_free_go(void *addr, uintptr_t sz); + MOVD $__msan_free_go(SB), FARG + JMP msancall<>(SB) + +// func runtime·msanmove(dst, src unsafe.Pointer, sz uintptr) +TEXT runtime·msanmove(SB), NOSPLIT, $0-24 + MOVD dst+0(FP), RARG0 + MOVD src+8(FP), RARG1 + MOVD size+16(FP), RARG2 + // void __msan_memmove(void *dst, void *src, uintptr_t sz); + MOVD $__msan_memmove(SB), FARG + JMP msancall<>(SB) + +// Switches SP to g0 stack and calls (FARG). Arguments already set. 
+TEXT msancall<>(SB), NOSPLIT, $0-0 + MOVD RSP, R19 // callee-saved + CBZ g, g0stack // no g, still on a system stack + MOVD g_m(g), R10 + MOVD m_g0(R10), R11 + CMP R11, g + BEQ g0stack + + MOVD (g_sched+gobuf_sp)(R11), R4 + MOVD R4, RSP + +g0stack: + BL (FARG) + MOVD R19, RSP + RET diff --git a/src/runtime/msize.go b/src/runtime/msize.go new file mode 100644 index 0000000..c56aa5a --- /dev/null +++ b/src/runtime/msize.go @@ -0,0 +1,25 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Malloc small size classes. +// +// See malloc.go for overview. +// See also mksizeclasses.go for how we decide what size classes to use. + +package runtime + +// Returns size of the memory block that mallocgc will allocate if you ask for the size. +func roundupsize(size uintptr) uintptr { + if size < _MaxSmallSize { + if size <= smallSizeMax-8 { + return uintptr(class_to_size[size_to_class8[divRoundUp(size, smallSizeDiv)]]) + } else { + return uintptr(class_to_size[size_to_class128[divRoundUp(size-smallSizeMax, largeSizeDiv)]]) + } + } + if size+_PageSize < size { + return size + } + return alignUp(size, _PageSize) +} diff --git a/src/runtime/mspanset.go b/src/runtime/mspanset.go new file mode 100644 index 0000000..abbd450 --- /dev/null +++ b/src/runtime/mspanset.go @@ -0,0 +1,404 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/cpu" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// A spanSet is a set of *mspans. +// +// spanSet is safe for concurrent push and pop operations. +type spanSet struct { + // A spanSet is a two-level data structure consisting of a + // growable spine that points to fixed-sized blocks. 
The spine + // can be accessed without locks, but adding a block or + // growing it requires taking the spine lock. + // + // Because each mspan covers at least 8K of heap and takes at + // most 8 bytes in the spanSet, the growth of the spine is + // quite limited. + // + // The spine and all blocks are allocated off-heap, which + // allows this to be used in the memory manager and avoids the + // need for write barriers on all of these. spanSetBlocks are + // managed in a pool, though never freed back to the operating + // system. We never release spine memory because there could be + // concurrent lock-free access and we're likely to reuse it + // anyway. (In principle, we could do this during STW.) + + spineLock mutex + spine atomicSpanSetSpinePointer // *[N]atomic.Pointer[spanSetBlock] + spineLen atomic.Uintptr // Spine array length + spineCap uintptr // Spine array cap, accessed under spineLock + + // index is the head and tail of the spanSet in a single field. + // The head and the tail both represent an index into the logical + // concatenation of all blocks, with the head always behind or + // equal to the tail (indicating an empty set). This field is + // always accessed atomically. + // + // The head and the tail are only 32 bits wide, which means we + // can only support up to 2^32 pushes before a reset. If every + // span in the heap were stored in this set, and each span were + // the minimum size (1 runtime page, 8 KiB), then roughly the + // smallest heap which would be unrepresentable is 32 TiB in size. + index atomicHeadTailIndex +} + +const ( + spanSetBlockEntries = 512 // 4KB on 64-bit + spanSetInitSpineCap = 256 // Enough for 1GB heap on 64-bit +) + +type spanSetBlock struct { + // Free spanSetBlocks are managed via a lock-free stack. + lfnode + + // popped is the number of pop operations that have occurred on + // this block. This number is used to help determine when a block + // may be safely recycled. 
+ popped atomic.Uint32 + + // spans is the set of spans in this block. + spans [spanSetBlockEntries]atomicMSpanPointer +} + +// push adds span s to buffer b. push is safe to call concurrently +// with other push and pop operations. +func (b *spanSet) push(s *mspan) { + // Obtain our slot. + cursor := uintptr(b.index.incTail().tail() - 1) + top, bottom := cursor/spanSetBlockEntries, cursor%spanSetBlockEntries + + // Do we need to add a block? + spineLen := b.spineLen.Load() + var block *spanSetBlock +retry: + if top < spineLen { + block = b.spine.Load().lookup(top).Load() + } else { + // Add a new block to the spine, potentially growing + // the spine. + lock(&b.spineLock) + // spineLen cannot change until we release the lock, + // but may have changed while we were waiting. + spineLen = b.spineLen.Load() + if top < spineLen { + unlock(&b.spineLock) + goto retry + } + + spine := b.spine.Load() + if spineLen == b.spineCap { + // Grow the spine. + newCap := b.spineCap * 2 + if newCap == 0 { + newCap = spanSetInitSpineCap + } + newSpine := persistentalloc(newCap*goarch.PtrSize, cpu.CacheLineSize, &memstats.gcMiscSys) + if b.spineCap != 0 { + // Blocks are allocated off-heap, so + // no write barriers. + memmove(newSpine, spine.p, b.spineCap*goarch.PtrSize) + } + spine = spanSetSpinePointer{newSpine} + + // Spine is allocated off-heap, so no write barrier. + b.spine.StoreNoWB(spine) + b.spineCap = newCap + // We can't immediately free the old spine + // since a concurrent push with a lower index + // could still be reading from it. We let it + // leak because even a 1TB heap would waste + // less than 2MB of memory on old spines. If + // this is a problem, we could free old spines + // during STW. + } + + // Allocate a new block from the pool. + block = spanSetBlockPool.alloc() + + // Add it to the spine. + // Blocks are allocated off-heap, so no write barrier. 
+ spine.lookup(top).StoreNoWB(block) + b.spineLen.Store(spineLen + 1) + unlock(&b.spineLock) + } + + // We have a block. Insert the span atomically, since there may be + // concurrent readers via the block API. + block.spans[bottom].StoreNoWB(s) +} + +// pop removes and returns a span from buffer b, or nil if b is empty. +// pop is safe to call concurrently with other pop and push operations. +func (b *spanSet) pop() *mspan { + var head, tail uint32 +claimLoop: + for { + headtail := b.index.load() + head, tail = headtail.split() + if head >= tail { + // The buf is empty, as far as we can tell. + return nil + } + // Check if the head position we want to claim is actually + // backed by a block. + spineLen := b.spineLen.Load() + if spineLen <= uintptr(head)/spanSetBlockEntries { + // We're racing with a spine growth and the allocation of + // a new block (and maybe a new spine!), and trying to grab + // the span at the index which is currently being pushed. + // Instead of spinning, let's just notify the caller that + // there's nothing currently here. Spinning on this is + // almost definitely not worth it. + return nil + } + // Try to claim the current head by CASing in an updated head. + // This may fail transiently due to a push which modifies the + // tail, so keep trying while the head isn't changing. + want := head + for want == head { + if b.index.cas(headtail, makeHeadTailIndex(want+1, tail)) { + break claimLoop + } + headtail = b.index.load() + head, tail = headtail.split() + } + // We failed to claim the spot we were after and the head changed, + // meaning a popper got ahead of us. Try again from the top because + // the buf may not be empty. + } + top, bottom := head/spanSetBlockEntries, head%spanSetBlockEntries + + // We may be reading a stale spine pointer, but because the length + // grows monotonically and we've already verified it, we'll definitely + // be reading from a valid block. 
+ blockp := b.spine.Load().lookup(uintptr(top)) + + // Given that the spine length is correct, we know we will never + // see a nil block here, since the length is always updated after + // the block is set. + block := blockp.Load() + s := block.spans[bottom].Load() + for s == nil { + // We raced with the span actually being set, but given that we + // know a block for this span exists, the race window here is + // extremely small. Try again. + s = block.spans[bottom].Load() + } + // Clear the pointer. This isn't strictly necessary, but defensively + // avoids accidentally re-using blocks which could lead to memory + // corruption. This way, we'll get a nil pointer access instead. + block.spans[bottom].StoreNoWB(nil) + + // Increase the popped count. If we are the last possible popper + // in the block (note that bottom need not equal spanSetBlockEntries-1 + // due to races) then it's our responsibility to free the block. + // + // If we increment popped to spanSetBlockEntries, we can be sure that + // we're the last popper for this block, and it's thus safe to free it. + // Every other popper must have crossed this barrier (and thus finished + // popping its corresponding mspan) by the time we get here. Because + // we're the last popper, we also don't have to worry about concurrent + // pushers (there can't be any). Note that we may not be the popper + // which claimed the last slot in the block, we're just the last one + // to finish popping. + if block.popped.Add(1) == spanSetBlockEntries { + // Clear the block's pointer. + blockp.StoreNoWB(nil) + + // Return the block to the block pool. + spanSetBlockPool.free(block) + } + return s +} + +// reset resets a spanSet which is empty. It will also clean up +// any left over blocks. +// +// Throws if the buf is not empty. +// +// reset may not be called concurrently with any other operations +// on the span set. 
+func (b *spanSet) reset() { + head, tail := b.index.load().split() + if head < tail { + print("head = ", head, ", tail = ", tail, "\n") + throw("attempt to clear non-empty span set") + } + top := head / spanSetBlockEntries + if uintptr(top) < b.spineLen.Load() { + // If the head catches up to the tail and the set is empty, + // we may not clean up the block containing the head and tail + // since it may be pushed into again. In order to avoid leaking + // memory since we're going to reset the head and tail, clean + // up such a block now, if it exists. + blockp := b.spine.Load().lookup(uintptr(top)) + block := blockp.Load() + if block != nil { + // Check the popped value. + if block.popped.Load() == 0 { + // popped should never be zero because that means we have + // pushed at least one value but not yet popped if this + // block pointer is not nil. + throw("span set block with unpopped elements found in reset") + } + if block.popped.Load() == spanSetBlockEntries { + // popped should also never be equal to spanSetBlockEntries + // because the last popper should have made the block pointer + // in this slot nil. + throw("fully empty unfreed span set block found in reset") + } + + // Clear the pointer to the block. + blockp.StoreNoWB(nil) + + // Return the block to the block pool. + spanSetBlockPool.free(block) + } + } + b.index.reset() + b.spineLen.Store(0) +} + +// atomicSpanSetSpinePointer is an atomically-accessed spanSetSpinePointer. +// +// It has the same semantics as atomic.UnsafePointer. +type atomicSpanSetSpinePointer struct { + a atomic.UnsafePointer +} + +// Loads the spanSetSpinePointer and returns it. +// +// It has the same semantics as atomic.UnsafePointer. +func (s *atomicSpanSetSpinePointer) Load() spanSetSpinePointer { + return spanSetSpinePointer{s.a.Load()} +} + +// Stores the spanSetSpinePointer. +// +// It has the same semantics as atomic.UnsafePointer. 
+func (s *atomicSpanSetSpinePointer) StoreNoWB(p spanSetSpinePointer) { + s.a.StoreNoWB(p.p) +} + +// spanSetSpinePointer represents a pointer to a contiguous block of atomic.Pointer[spanSetBlock]. +type spanSetSpinePointer struct { + p unsafe.Pointer +} + +// lookup returns &s[idx]. +func (s spanSetSpinePointer) lookup(idx uintptr) *atomic.Pointer[spanSetBlock] { + return (*atomic.Pointer[spanSetBlock])(add(unsafe.Pointer(s.p), goarch.PtrSize*idx)) +} + +// spanSetBlockPool is a global pool of spanSetBlocks. +var spanSetBlockPool spanSetBlockAlloc + +// spanSetBlockAlloc represents a concurrent pool of spanSetBlocks. +type spanSetBlockAlloc struct { + stack lfstack +} + +// alloc tries to grab a spanSetBlock out of the pool, and if it fails +// persistentallocs a new one and returns it. +func (p *spanSetBlockAlloc) alloc() *spanSetBlock { + if s := (*spanSetBlock)(p.stack.pop()); s != nil { + return s + } + return (*spanSetBlock)(persistentalloc(unsafe.Sizeof(spanSetBlock{}), cpu.CacheLineSize, &memstats.gcMiscSys)) +} + +// free returns a spanSetBlock back to the pool. +func (p *spanSetBlockAlloc) free(block *spanSetBlock) { + block.popped.Store(0) + p.stack.push(&block.lfnode) +} + +// headTailIndex packs a 32-bit head and a 32-bit tail +// of a queue into a single 64-bit value. +type headTailIndex uint64 + +// makeHeadTailIndex creates a headTailIndex value from a separate +// head and tail. +func makeHeadTailIndex(head, tail uint32) headTailIndex { + return headTailIndex(uint64(head)<<32 | uint64(tail)) +} + +// head returns the head of a headTailIndex value. +func (h headTailIndex) head() uint32 { + return uint32(h >> 32) +} + +// tail returns the tail of a headTailIndex value. +func (h headTailIndex) tail() uint32 { + return uint32(h) +} + +// split splits the headTailIndex value into its parts.
+func (h headTailIndex) split() (head uint32, tail uint32) { + return h.head(), h.tail() +} + +// atomicHeadTailIndex is an atomically-accessed headTailIndex. +type atomicHeadTailIndex struct { + u atomic.Uint64 +} + +// load atomically reads a headTailIndex value. +func (h *atomicHeadTailIndex) load() headTailIndex { + return headTailIndex(h.u.Load()) +} + +// cas atomically compares-and-swaps a headTailIndex value. +func (h *atomicHeadTailIndex) cas(old, new headTailIndex) bool { + return h.u.CompareAndSwap(uint64(old), uint64(new)) +} + +// incHead atomically increments the head of a headTailIndex. +func (h *atomicHeadTailIndex) incHead() headTailIndex { + return headTailIndex(h.u.Add(1 << 32)) +} + +// decHead atomically decrements the head of a headTailIndex. +func (h *atomicHeadTailIndex) decHead() headTailIndex { + return headTailIndex(h.u.Add(-(1 << 32))) +} + +// incTail atomically increments the tail of a headTailIndex. +func (h *atomicHeadTailIndex) incTail() headTailIndex { + ht := headTailIndex(h.u.Add(1)) + // Check for overflow. + if ht.tail() == 0 { + print("runtime: head = ", ht.head(), ", tail = ", ht.tail(), "\n") + throw("headTailIndex overflow") + } + return ht +} + +// reset clears the headTailIndex to (0, 0). +func (h *atomicHeadTailIndex) reset() { + h.u.Store(0) +} + +// atomicMSpanPointer is an atomic.Pointer[mspan]. Can't use generics because it's NotInHeap. +type atomicMSpanPointer struct { + p atomic.UnsafePointer +} + +// Load returns the *mspan. +func (p *atomicMSpanPointer) Load() *mspan { + return (*mspan)(p.p.Load()) +} + +// StoreNoWB stores an *mspan without a write barrier. +func (p *atomicMSpanPointer) StoreNoWB(s *mspan) { + p.p.StoreNoWB(unsafe.Pointer(s)) +} diff --git a/src/runtime/mstats.go b/src/runtime/mstats.go new file mode 100644 index 0000000..8424946 --- /dev/null +++ b/src/runtime/mstats.go @@ -0,0 +1,917 @@ +// Copyright 2009 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Memory statistics + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +type mstats struct { + // Statistics about malloc heap. + heapStats consistentHeapStats + + // Statistics about stacks. + stacks_sys sysMemStat // only counts newosproc0 stack in mstats; differs from MemStats.StackSys + + // Statistics about allocation of low-level fixed-size structures. + mspan_sys sysMemStat + mcache_sys sysMemStat + buckhash_sys sysMemStat // profiling bucket hash table + + // Statistics about GC overhead. + gcMiscSys sysMemStat // updated atomically or during STW + + // Miscellaneous statistics. + other_sys sysMemStat // updated atomically or during STW + + // Statistics about the garbage collector. + + // Protected by mheap or stopping the world during GC. + last_gc_unix uint64 // last gc (in unix time) + pause_total_ns uint64 + pause_ns [256]uint64 // circular buffer of recent gc pause lengths + pause_end [256]uint64 // circular buffer of recent gc end times (nanoseconds since 1970) + numgc uint32 + numforcedgc uint32 // number of user-forced GCs + gc_cpu_fraction float64 // fraction of CPU time used by GC + + last_gc_nanotime uint64 // last gc (monotonic time) + lastHeapInUse uint64 // heapInUse at mark termination of the previous GC + + enablegc bool + + // gcPauseDist represents the distribution of all GC-related + // application pauses in the runtime. + // + // Each individual pause is counted separately, unlike pause_ns. + gcPauseDist timeHistogram +} + +var memstats mstats + +// A MemStats records statistics about the memory allocator. +type MemStats struct { + // General statistics. + + // Alloc is bytes of allocated heap objects. + // + // This is the same as HeapAlloc (see below). + Alloc uint64 + + // TotalAlloc is cumulative bytes allocated for heap objects. 
+ // + // TotalAlloc increases as heap objects are allocated, but + // unlike Alloc and HeapAlloc, it does not decrease when + // objects are freed. + TotalAlloc uint64 + + // Sys is the total bytes of memory obtained from the OS. + // + // Sys is the sum of the XSys fields below. Sys measures the + // virtual address space reserved by the Go runtime for the + // heap, stacks, and other internal data structures. It's + // likely that not all of the virtual address space is backed + // by physical memory at any given moment, though in general + // it all was at some point. + Sys uint64 + + // Lookups is the number of pointer lookups performed by the + // runtime. + // + // This is primarily useful for debugging runtime internals. + Lookups uint64 + + // Mallocs is the cumulative count of heap objects allocated. + // The number of live objects is Mallocs - Frees. + Mallocs uint64 + + // Frees is the cumulative count of heap objects freed. + Frees uint64 + + // Heap memory statistics. + // + // Interpreting the heap statistics requires some knowledge of + // how Go organizes memory. Go divides the virtual address + // space of the heap into "spans", which are contiguous + // regions of memory 8K or larger. A span may be in one of + // three states: + // + // An "idle" span contains no objects or other data. The + // physical memory backing an idle span can be released back + // to the OS (but the virtual address space never is), or it + // can be converted into an "in use" or "stack" span. + // + // An "in use" span contains at least one heap object and may + // have free space available to allocate more heap objects. + // + // A "stack" span is used for goroutine stacks. Stack spans + // are not considered part of the heap. A span can change + // between heap and stack memory; it is never used for both + // simultaneously. + + // HeapAlloc is bytes of allocated heap objects. 
+ // + // "Allocated" heap objects include all reachable objects, as + // well as unreachable objects that the garbage collector has + // not yet freed. Specifically, HeapAlloc increases as heap + // objects are allocated and decreases as the heap is swept + // and unreachable objects are freed. Sweeping occurs + // incrementally between GC cycles, so these two processes + // occur simultaneously, and as a result HeapAlloc tends to + // change smoothly (in contrast with the sawtooth that is + // typical of stop-the-world garbage collectors). + HeapAlloc uint64 + + // HeapSys is bytes of heap memory obtained from the OS. + // + // HeapSys measures the amount of virtual address space + // reserved for the heap. This includes virtual address space + // that has been reserved but not yet used, which consumes no + // physical memory, but tends to be small, as well as virtual + // address space for which the physical memory has been + // returned to the OS after it became unused (see HeapReleased + // for a measure of the latter). + // + // HeapSys estimates the largest size the heap has had. + HeapSys uint64 + + // HeapIdle is bytes in idle (unused) spans. + // + // Idle spans have no objects in them. These spans could be + // (and may already have been) returned to the OS, or they can + // be reused for heap allocations, or they can be reused as + // stack memory. + // + // HeapIdle minus HeapReleased estimates the amount of memory + // that could be returned to the OS, but is being retained by + // the runtime so it can grow the heap without requesting more + // memory from the OS. If this difference is significantly + // larger than the heap size, it indicates there was a recent + // transient spike in live heap size. + HeapIdle uint64 + + // HeapInuse is bytes in in-use spans. + // + // In-use spans have at least one object in them. These spans + // can only be used for other objects of roughly the same + // size. 
+ // + // HeapInuse minus HeapAlloc estimates the amount of memory + // that has been dedicated to particular size classes, but is + // not currently being used. This is an upper bound on + // fragmentation, but in general this memory can be reused + // efficiently. + HeapInuse uint64 + + // HeapReleased is bytes of physical memory returned to the OS. + // + // This counts heap memory from idle spans that was returned + // to the OS and has not yet been reacquired for the heap. + HeapReleased uint64 + + // HeapObjects is the number of allocated heap objects. + // + // Like HeapAlloc, this increases as objects are allocated and + // decreases as the heap is swept and unreachable objects are + // freed. + HeapObjects uint64 + + // Stack memory statistics. + // + // Stacks are not considered part of the heap, but the runtime + // can reuse a span of heap memory for stack memory, and + // vice-versa. + + // StackInuse is bytes in stack spans. + // + // In-use stack spans have at least one stack in them. These + // spans can only be used for other stacks of the same size. + // + // There is no StackIdle because unused stack spans are + // returned to the heap (and hence counted toward HeapIdle). + StackInuse uint64 + + // StackSys is bytes of stack memory obtained from the OS. + // + // StackSys is StackInuse, plus any memory obtained directly + // from the OS for OS thread stacks (which should be minimal). + StackSys uint64 + + // Off-heap memory statistics. + // + // The following statistics measure runtime-internal + // structures that are not allocated from heap memory (usually + // because they are part of implementing the heap). Unlike + // heap or stack memory, any memory allocated to these + // structures is dedicated to these structures. + // + // These are primarily useful for debugging runtime memory + // overheads. + + // MSpanInuse is bytes of allocated mspan structures. 
+ MSpanInuse uint64 + + // MSpanSys is bytes of memory obtained from the OS for mspan + // structures. + MSpanSys uint64 + + // MCacheInuse is bytes of allocated mcache structures. + MCacheInuse uint64 + + // MCacheSys is bytes of memory obtained from the OS for + // mcache structures. + MCacheSys uint64 + + // BuckHashSys is bytes of memory in profiling bucket hash tables. + BuckHashSys uint64 + + // GCSys is bytes of memory in garbage collection metadata. + GCSys uint64 + + // OtherSys is bytes of memory in miscellaneous off-heap + // runtime allocations. + OtherSys uint64 + + // Garbage collector statistics. + + // NextGC is the target heap size of the next GC cycle. + // + // The garbage collector's goal is to keep HeapAlloc ≤ NextGC. + // At the end of each GC cycle, the target for the next cycle + // is computed based on the amount of reachable data and the + // value of GOGC. + NextGC uint64 + + // LastGC is the time the last garbage collection finished, as + // nanoseconds since 1970 (the UNIX epoch). + LastGC uint64 + + // PauseTotalNs is the cumulative nanoseconds in GC + // stop-the-world pauses since the program started. + // + // During a stop-the-world pause, all goroutines are paused + // and only the garbage collector can run. + PauseTotalNs uint64 + + // PauseNs is a circular buffer of recent GC stop-the-world + // pause times in nanoseconds. + // + // The most recent pause is at PauseNs[(NumGC+255)%256]. In + // general, PauseNs[N%256] records the time paused in the most + // recent N%256th GC cycle. There may be multiple pauses per + // GC cycle; this is the sum of all pauses during a cycle. + PauseNs [256]uint64 + + // PauseEnd is a circular buffer of recent GC pause end times, + // as nanoseconds since 1970 (the UNIX epoch). + // + // This buffer is filled the same way as PauseNs. There may be + // multiple pauses per GC cycle; this records the end of the + // last pause in a cycle. 
+ PauseEnd [256]uint64 + + // NumGC is the number of completed GC cycles. + NumGC uint32 + + // NumForcedGC is the number of GC cycles that were forced by + // the application calling the GC function. + NumForcedGC uint32 + + // GCCPUFraction is the fraction of this program's available + // CPU time used by the GC since the program started. + // + // GCCPUFraction is expressed as a number between 0 and 1, + // where 0 means GC has consumed none of this program's CPU. A + // program's available CPU time is defined as the integral of + // GOMAXPROCS since the program started. That is, if + // GOMAXPROCS is 2 and a program has been running for 10 + // seconds, its "available CPU" is 20 seconds. GCCPUFraction + // does not include CPU time used for write barrier activity. + // + // This is the same as the fraction of CPU reported by + // GODEBUG=gctrace=1. + GCCPUFraction float64 + + // EnableGC indicates that GC is enabled. It is always true, + // even if GOGC=off. + EnableGC bool + + // DebugGC is currently unused. + DebugGC bool + + // BySize reports per-size class allocation statistics. + // + // BySize[N] gives statistics for allocations of size S where + // BySize[N-1].Size < S ≤ BySize[N].Size. + // + // This does not report allocations larger than BySize[60].Size. + BySize [61]struct { + // Size is the maximum byte size of an object in this + // size class. + Size uint32 + + // Mallocs is the cumulative count of heap objects + // allocated in this size class. The cumulative bytes + // of allocation is Size*Mallocs. The number of live + // objects in this size class is Mallocs - Frees. + Mallocs uint64 + + // Frees is the cumulative count of heap objects freed + // in this size class. + Frees uint64 + } +} + +func init() { + if offset := unsafe.Offsetof(memstats.heapStats); offset%8 != 0 { + println(offset) + throw("memstats.heapStats not aligned to 8 bytes") + } + // Ensure the size of heapStatsDelta causes adjacent fields/slots (e.g. 
+ // [3]heapStatsDelta) to be 8-byte aligned. + if size := unsafe.Sizeof(heapStatsDelta{}); size%8 != 0 { + println(size) + throw("heapStatsDelta not a multiple of 8 bytes in size") + } +} + +// ReadMemStats populates m with memory allocator statistics. +// +// The returned memory allocator statistics are up to date as of the +// call to ReadMemStats. This is in contrast with a heap profile, +// which is a snapshot as of the most recently completed garbage +// collection cycle. +func ReadMemStats(m *MemStats) { + stopTheWorld("read mem stats") + + systemstack(func() { + readmemstats_m(m) + }) + + startTheWorld() +} + +// doubleCheckReadMemStats controls a double-check mode for ReadMemStats that +// ensures consistency between the values that ReadMemStats is using and the +// runtime-internal stats. +var doubleCheckReadMemStats = false + +// readmemstats_m populates stats for internal runtime values. +// +// The world must be stopped. +func readmemstats_m(stats *MemStats) { + assertWorldStopped() + + // Flush mcaches to mcentral before doing anything else. + // + // Flushing to the mcentral may in general cause stats to + // change as mcentral data structures are manipulated. + systemstack(flushallmcaches) + + // Calculate memory allocator stats. + // During program execution we only count number of frees and amount of freed memory. + // Current number of alive objects in the heap and amount of alive heap memory + // are calculated by scanning all spans. + // Total number of mallocs is calculated as number of frees plus number of alive objects. + // Similarly, total amount of allocated memory is calculated as amount of freed memory + // plus amount of alive heap memory. + + // Collect consistent stats, which are the source-of-truth in some cases. + var consStats heapStatsDelta + memstats.heapStats.unsafeRead(&consStats) + + // Collect large allocation stats. 
+ totalAlloc := consStats.largeAlloc + nMalloc := consStats.largeAllocCount + totalFree := consStats.largeFree + nFree := consStats.largeFreeCount + + // Collect per-sizeclass stats. + var bySize [_NumSizeClasses]struct { + Size uint32 + Mallocs uint64 + Frees uint64 + } + for i := range bySize { + bySize[i].Size = uint32(class_to_size[i]) + + // Malloc stats. + a := consStats.smallAllocCount[i] + totalAlloc += a * uint64(class_to_size[i]) + nMalloc += a + bySize[i].Mallocs = a + + // Free stats. + f := consStats.smallFreeCount[i] + totalFree += f * uint64(class_to_size[i]) + nFree += f + bySize[i].Frees = f + } + + // Account for tiny allocations. + // For historical reasons, MemStats includes tiny allocations + // in both the total free and total alloc count. This double-counts + // memory in some sense because their tiny allocation block is also + // counted. Tracking the lifetime of individual tiny allocations is + // currently not done because it would be too expensive. + nFree += consStats.tinyAllocCount + nMalloc += consStats.tinyAllocCount + + // Calculate derived stats. + + stackInUse := uint64(consStats.inStacks) + gcWorkBufInUse := uint64(consStats.inWorkBufs) + gcProgPtrScalarBitsInUse := uint64(consStats.inPtrScalarBits) + + totalMapped := gcController.heapInUse.load() + gcController.heapFree.load() + gcController.heapReleased.load() + + memstats.stacks_sys.load() + memstats.mspan_sys.load() + memstats.mcache_sys.load() + + memstats.buckhash_sys.load() + memstats.gcMiscSys.load() + memstats.other_sys.load() + + stackInUse + gcWorkBufInUse + gcProgPtrScalarBitsInUse + + heapGoal := gcController.heapGoal() + + if doubleCheckReadMemStats { + // Only check this if we're debugging. It would be bad to crash an application + // just because the debugging stats are wrong. We mostly rely on tests to catch + // these issues, and we enable the double check mode for tests. 
+ // + // The world is stopped, so the consistent stats (after aggregation) + // should be identical to some combination of memstats. In particular: + // + // * memstats.heapInUse == inHeap + // * memstats.heapReleased == released + // * memstats.heapInUse + memstats.heapFree == committed - inStacks - inWorkBufs - inPtrScalarBits + // * memstats.totalAlloc == totalAlloc + // * memstats.totalFree == totalFree + // + // Check if that's actually true. + // + // Prevent sysmon and the tracer from skewing the stats since they can + // act without synchronizing with a STW. See #64401. + lock(&sched.sysmonlock) + lock(&trace.lock) + if gcController.heapInUse.load() != uint64(consStats.inHeap) { + print("runtime: heapInUse=", gcController.heapInUse.load(), "\n") + print("runtime: consistent value=", consStats.inHeap, "\n") + throw("heapInUse and consistent stats are not equal") + } + if gcController.heapReleased.load() != uint64(consStats.released) { + print("runtime: heapReleased=", gcController.heapReleased.load(), "\n") + print("runtime: consistent value=", consStats.released, "\n") + throw("heapReleased and consistent stats are not equal") + } + heapRetained := gcController.heapInUse.load() + gcController.heapFree.load() + consRetained := uint64(consStats.committed - consStats.inStacks - consStats.inWorkBufs - consStats.inPtrScalarBits) + if heapRetained != consRetained { + print("runtime: global value=", heapRetained, "\n") + print("runtime: consistent value=", consRetained, "\n") + throw("measures of the retained heap are not equal") + } + if gcController.totalAlloc.Load() != totalAlloc { + print("runtime: totalAlloc=", gcController.totalAlloc.Load(), "\n") + print("runtime: consistent value=", totalAlloc, "\n") + throw("totalAlloc and consistent stats are not equal") + } + if gcController.totalFree.Load() != totalFree { + print("runtime: totalFree=", gcController.totalFree.Load(), "\n") + print("runtime: consistent value=", totalFree, "\n") + throw("totalFree and 
consistent stats are not equal") + } + // Also check that mappedReady lines up with totalMapped - released. + // This isn't really the same type of "make sure consistent stats line up" situation, + // but this is an opportune time to check. + if gcController.mappedReady.Load() != totalMapped-uint64(consStats.released) { + print("runtime: mappedReady=", gcController.mappedReady.Load(), "\n") + print("runtime: totalMapped=", totalMapped, "\n") + print("runtime: released=", uint64(consStats.released), "\n") + print("runtime: totalMapped-released=", totalMapped-uint64(consStats.released), "\n") + throw("mappedReady and other memstats are not equal") + } + unlock(&trace.lock) + unlock(&sched.sysmonlock) + } + + // We've calculated all the values we need. Now, populate stats. + + stats.Alloc = totalAlloc - totalFree + stats.TotalAlloc = totalAlloc + stats.Sys = totalMapped + stats.Mallocs = nMalloc + stats.Frees = nFree + stats.HeapAlloc = totalAlloc - totalFree + stats.HeapSys = gcController.heapInUse.load() + gcController.heapFree.load() + gcController.heapReleased.load() + // By definition, HeapIdle is memory that was mapped + // for the heap but is not currently used to hold heap + // objects. It also specifically is memory that can be + // used for other purposes, like stacks, but this memory + // is subtracted out of HeapSys before it makes that + // transition. 
Put another way: + // + // HeapSys = bytes allocated from the OS for the heap - bytes ultimately used for non-heap purposes + // HeapIdle = bytes allocated from the OS for the heap - bytes ultimately used for any purpose + // + // or + // + // HeapSys = sys - stacks_inuse - gcWorkBufInUse - gcProgPtrScalarBitsInUse + // HeapIdle = sys - stacks_inuse - gcWorkBufInUse - gcProgPtrScalarBitsInUse - heapInUse + // + // => HeapIdle = HeapSys - heapInUse = heapFree + heapReleased + stats.HeapIdle = gcController.heapFree.load() + gcController.heapReleased.load() + stats.HeapInuse = gcController.heapInUse.load() + stats.HeapReleased = gcController.heapReleased.load() + stats.HeapObjects = nMalloc - nFree + stats.StackInuse = stackInUse + // memstats.stacks_sys is only memory mapped directly for OS stacks. + // Add in heap-allocated stack memory for user consumption. + stats.StackSys = stackInUse + memstats.stacks_sys.load() + stats.MSpanInuse = uint64(mheap_.spanalloc.inuse) + stats.MSpanSys = memstats.mspan_sys.load() + stats.MCacheInuse = uint64(mheap_.cachealloc.inuse) + stats.MCacheSys = memstats.mcache_sys.load() + stats.BuckHashSys = memstats.buckhash_sys.load() + // MemStats defines GCSys as an aggregate of all memory related + // to the memory management system, but we track this memory + // at a more granular level in the runtime. + stats.GCSys = memstats.gcMiscSys.load() + gcWorkBufInUse + gcProgPtrScalarBitsInUse + stats.OtherSys = memstats.other_sys.load() + stats.NextGC = heapGoal + stats.LastGC = memstats.last_gc_unix + stats.PauseTotalNs = memstats.pause_total_ns + stats.PauseNs = memstats.pause_ns + stats.PauseEnd = memstats.pause_end + stats.NumGC = memstats.numgc + stats.NumForcedGC = memstats.numforcedgc + stats.GCCPUFraction = memstats.gc_cpu_fraction + stats.EnableGC = true + + // stats.BySize and bySize might not match in length. + // That's OK, stats.BySize cannot change due to backwards + // compatibility issues. 
copy will copy the minimum amount + // of values between the two of them. + copy(stats.BySize[:], bySize[:]) +} + +//go:linkname readGCStats runtime/debug.readGCStats +func readGCStats(pauses *[]uint64) { + systemstack(func() { + readGCStats_m(pauses) + }) +} + +// readGCStats_m must be called on the system stack because it acquires the heap +// lock. See mheap for details. +// +//go:systemstack +func readGCStats_m(pauses *[]uint64) { + p := *pauses + // Calling code in runtime/debug should make the slice large enough. + if cap(p) < len(memstats.pause_ns)+3 { + throw("short slice passed to readGCStats") + } + + // Pass back: pauses, pause ends, last gc (absolute time), number of gc, total pause ns. + lock(&mheap_.lock) + + n := memstats.numgc + if n > uint32(len(memstats.pause_ns)) { + n = uint32(len(memstats.pause_ns)) + } + + // The pause buffer is circular. The most recent pause is at + // pause_ns[(numgc-1)%len(pause_ns)], and then backward + // from there to go back farther in time. We deliver the times + // most recent first (in p[0]). + p = p[:cap(p)] + for i := uint32(0); i < n; i++ { + j := (memstats.numgc - 1 - i) % uint32(len(memstats.pause_ns)) + p[i] = memstats.pause_ns[j] + p[n+i] = memstats.pause_end[j] + } + + p[n+n] = memstats.last_gc_unix + p[n+n+1] = uint64(memstats.numgc) + p[n+n+2] = memstats.pause_total_ns + unlock(&mheap_.lock) + *pauses = p[:n+n+3] +} + +// flushmcache flushes the mcache of allp[i]. +// +// The world must be stopped. +// +//go:nowritebarrier +func flushmcache(i int) { + assertWorldStopped() + + p := allp[i] + c := p.mcache + if c == nil { + return + } + c.releaseAll() + stackcache_clear(c) +} + +// flushallmcaches flushes the mcaches of all Ps. +// +// The world must be stopped. +// +//go:nowritebarrier +func flushallmcaches() { + assertWorldStopped() + + for i := 0; i < int(gomaxprocs); i++ { + flushmcache(i) + } +} + +// sysMemStat represents a global system statistic that is managed atomically. 
+// +// This type must structurally be a uint64 so that mstats aligns with MemStats. +type sysMemStat uint64 + +// load atomically reads the value of the stat. +// +// Must be nosplit as it is called in runtime initialization, e.g. newosproc0. +// +//go:nosplit +func (s *sysMemStat) load() uint64 { + return atomic.Load64((*uint64)(s)) +} + +// add atomically adds n to the sysMemStat. +// +// Must be nosplit as it is called in runtime initialization, e.g. newosproc0. +// +//go:nosplit +func (s *sysMemStat) add(n int64) { + val := atomic.Xadd64((*uint64)(s), n) + if (n > 0 && int64(val) < n) || (n < 0 && int64(val)+n < n) { + print("runtime: val=", val, " n=", n, "\n") + throw("sysMemStat overflow") + } +} + +// heapStatsDelta contains deltas of various runtime memory statistics +// that need to be updated together in order for them to be kept +// consistent with one another. +type heapStatsDelta struct { + // Memory stats. + committed int64 // byte delta of memory committed + released int64 // byte delta of memory released + inHeap int64 // byte delta of memory placed in the heap + inStacks int64 // byte delta of memory reserved for stacks + inWorkBufs int64 // byte delta of memory reserved for work bufs + inPtrScalarBits int64 // byte delta of memory reserved for unrolled GC prog bits + + // Allocator stats. + // + // These are all uint64 because they're cumulative, and could quickly wrap + // around otherwise. 
+ tinyAllocCount uint64 // number of tiny allocations + largeAlloc uint64 // bytes allocated for large objects + largeAllocCount uint64 // number of large object allocations + smallAllocCount [_NumSizeClasses]uint64 // number of allocs for small objects + largeFree uint64 // bytes freed for large objects (>maxSmallSize) + largeFreeCount uint64 // number of frees for large objects (>maxSmallSize) + smallFreeCount [_NumSizeClasses]uint64 // number of frees for small objects (<=maxSmallSize) + + // NOTE: This struct must be a multiple of 8 bytes in size because it + // is stored in an array. If it's not, atomic accesses to the above + // fields may be unaligned and fail on 32-bit platforms. +} + +// merge adds in the deltas from b into a. +func (a *heapStatsDelta) merge(b *heapStatsDelta) { + a.committed += b.committed + a.released += b.released + a.inHeap += b.inHeap + a.inStacks += b.inStacks + a.inWorkBufs += b.inWorkBufs + a.inPtrScalarBits += b.inPtrScalarBits + + a.tinyAllocCount += b.tinyAllocCount + a.largeAlloc += b.largeAlloc + a.largeAllocCount += b.largeAllocCount + for i := range b.smallAllocCount { + a.smallAllocCount[i] += b.smallAllocCount[i] + } + a.largeFree += b.largeFree + a.largeFreeCount += b.largeFreeCount + for i := range b.smallFreeCount { + a.smallFreeCount[i] += b.smallFreeCount[i] + } +} + +// consistentHeapStats represents a set of various memory statistics +// whose updates must be viewed completely to get a consistent +// state of the world. +// +// To write updates to memory stats use the acquire and release +// methods. To obtain a consistent global snapshot of these statistics, +// use read. +type consistentHeapStats struct { + // stats is a ring buffer of heapStatsDelta values. + // Writers always atomically update the delta at index gen. + // + // Readers operate by rotating gen (0 -> 1 -> 2 -> 0 -> ...) + // and synchronizing with writers by observing each P's + // statsSeq field. 
If the reader observes a P not writing, + // it can be sure that it will pick up the new gen value the + // next time it writes. + // + // The reader then takes responsibility by clearing space + // in the ring buffer for the next reader to rotate gen to + // that space (i.e. it merges in values from index (gen-2) mod 3 + // to index (gen-1) mod 3, then clears the former). + // + // Note that this means only one reader can be reading at a time. + // There is no way for readers to synchronize. + // + // This process is why we need a ring buffer of size 3 instead + // of 2: one is for the writers, one contains the most recent + // data, and the last one is clear so writers can begin writing + // to it the moment gen is updated. + stats [3]heapStatsDelta + + // gen represents the current index into which writers + // are writing, and can take on the value of 0, 1, or 2. + gen atomic.Uint32 + + // noPLock is intended to provide mutual exclusion for updating + // stats when no P is available. It does not block other writers + // with a P, only other writers without a P and the reader. Because + // stats are usually updated when a P is available, contention on + // this lock should be minimal. + noPLock mutex +} + +// acquire returns a heapStatsDelta to be updated. In effect, +// it acquires the shard for writing. release must be called +// as soon as the relevant deltas are updated. +// +// The returned heapStatsDelta must be updated atomically. +// +// The caller's P must not change between acquire and +// release. This also means that the caller should not +// acquire a P or release its P in between. A P also must +// not acquire a given consistentHeapStats if it hasn't +// yet released it. +// +// nosplit because a stack growth in this function could +// lead to a stack allocation that could reenter the +// function. 
+// +//go:nosplit +func (m *consistentHeapStats) acquire() *heapStatsDelta { + if pp := getg().m.p.ptr(); pp != nil { + seq := pp.statsSeq.Add(1) + if seq%2 == 0 { + // Should have been incremented to odd. + print("runtime: seq=", seq, "\n") + throw("bad sequence number") + } + } else { + lock(&m.noPLock) + } + gen := m.gen.Load() % 3 + return &m.stats[gen] +} + +// release indicates that the writer is done modifying +// the delta. The value returned by the corresponding +// acquire must no longer be accessed or modified after +// release is called. +// +// The caller's P must not change between acquire and +// release. This also means that the caller should not +// acquire a P or release its P in between. +// +// nosplit because a stack growth in this function could +// lead to a stack allocation that causes another acquire +// before this operation has completed. +// +//go:nosplit +func (m *consistentHeapStats) release() { + if pp := getg().m.p.ptr(); pp != nil { + seq := pp.statsSeq.Add(1) + if seq%2 != 0 { + // Should have been incremented to even. + print("runtime: seq=", seq, "\n") + throw("bad sequence number") + } + } else { + unlock(&m.noPLock) + } +} + +// unsafeRead aggregates the delta for this shard into out. +// +// Unsafe because it does so without any synchronization. The +// world must be stopped. +func (m *consistentHeapStats) unsafeRead(out *heapStatsDelta) { + assertWorldStopped() + + for i := range m.stats { + out.merge(&m.stats[i]) + } +} + +// unsafeClear clears the shard. +// +// Unsafe because the world must be stopped and values should +// be donated elsewhere before clearing. +func (m *consistentHeapStats) unsafeClear() { + assertWorldStopped() + + for i := range m.stats { + m.stats[i] = heapStatsDelta{} + } +} + +// read takes a globally consistent snapshot of m +// and puts the aggregated value in out. Even though out is a +// heapStatsDelta, the resulting values should be complete and +// valid statistic values. 
+// +// Not safe to call concurrently. The world must be stopped +// or metricsSema must be held. +func (m *consistentHeapStats) read(out *heapStatsDelta) { + // Getting preempted after this point is not safe because + // we read allp. We need to make sure a STW can't happen + // so it doesn't change out from under us. + mp := acquirem() + + // Get the current generation. We can be confident that this + // will not change since read is serialized and is the only + // one that modifies currGen. + currGen := m.gen.Load() + prevGen := currGen - 1 + if currGen == 0 { + prevGen = 2 + } + + // Prevent writers without a P from writing while we update gen. + lock(&m.noPLock) + + // Rotate gen, effectively taking a snapshot of the state of + // these statistics at the point of the exchange by moving + // writers to the next set of deltas. + // + // This exchange is safe to do because we won't race + // with anyone else trying to update this value. + m.gen.Swap((currGen + 1) % 3) + + // Allow P-less writers to continue. They'll be writing to the + // next generation now. + unlock(&m.noPLock) + + for _, p := range allp { + // Spin until there are no more writers. + for p.statsSeq.Load()%2 != 0 { + } + } + + // At this point we've observed that each sequence + // number is even, so any future writers will observe + // the new gen value. That means it's safe to read from + // the other deltas in the stats buffer. + + // Perform our responsibilities and free up + // stats[prevGen] for the next time we want to take + // a snapshot. + m.stats[currGen].merge(&m.stats[prevGen]) + m.stats[prevGen] = heapStatsDelta{} + + // Finally, copy out the complete delta. + *out = m.stats[currGen] + + releasem(mp) +} + +type cpuStats struct { + // All fields are CPU time in nanoseconds computed by comparing + // calls of nanotime. This means they're all overestimates, because + // they don't accurately compute on-CPU time (so some of the time + // could be spent scheduled away by the OS). 
+ + gcAssistTime int64 // GC assists + gcDedicatedTime int64 // GC dedicated mark workers + pauses + gcIdleTime int64 // GC idle mark workers + gcPauseTime int64 // GC pauses (all GOMAXPROCS, even if just 1 is running) + gcTotalTime int64 + + scavengeAssistTime int64 // scavenge assists + scavengeBgTime int64 // background scavenger + scavengeTotalTime int64 + + idleTime int64 // Time Ps spent in _Pidle. + userTime int64 // Time Ps spent in _Prunning or _Psyscall that's not any of the above. + + totalTime int64 // GOMAXPROCS * (monotonic wall clock time elapsed) +} diff --git a/src/runtime/mwbbuf.go b/src/runtime/mwbbuf.go new file mode 100644 index 0000000..3b7cbf8 --- /dev/null +++ b/src/runtime/mwbbuf.go @@ -0,0 +1,290 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This implements the write barrier buffer. The write barrier itself +// is gcWriteBarrier and is implemented in assembly. +// +// See mbarrier.go for algorithmic details on the write barrier. This +// file deals only with the buffer. +// +// The write barrier has a fast path and a slow path. The fast path +// simply enqueues to a per-P write barrier buffer. It's written in +// assembly and doesn't clobber any general purpose registers, so it +// doesn't have the usual overheads of a Go call. +// +// When the buffer fills up, the write barrier invokes the slow path +// (wbBufFlush) to flush the buffer to the GC work queues. In this +// path, since the compiler didn't spill registers, we spill *all* +// registers and disallow any GC safe points that could observe the +// stack frame (since we don't know the types of the spilled +// registers). + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// testSmallBuf forces a small write barrier buffer to stress write +// barrier flushing. 
+const testSmallBuf = false + +// wbBuf is a per-P buffer of pointers queued by the write barrier. +// This buffer is flushed to the GC workbufs when it fills up and on +// various GC transitions. +// +// This is closely related to a "sequential store buffer" (SSB), +// except that SSBs are usually used for maintaining remembered sets, +// while this is used for marking. +type wbBuf struct { + // next points to the next slot in buf. It must not be a + // pointer type because it can point past the end of buf and + // must be updated without write barriers. + // + // This is a pointer rather than an index to optimize the + // write barrier assembly. + next uintptr + + // end points to just past the end of buf. It must not be a + // pointer type because it points past the end of buf and must + // be updated without write barriers. + end uintptr + + // buf stores a series of pointers to execute write barriers + // on. This must be a multiple of wbBufEntryPointers because + // the write barrier only checks for overflow once per entry. + buf [wbBufEntryPointers * wbBufEntries]uintptr +} + +const ( + // wbBufEntries is the number of write barriers between + // flushes of the write barrier buffer. + // + // This trades latency for throughput amortization. Higher + // values amortize flushing overhead more, but increase the + // latency of flushing. Higher values also increase the cache + // footprint of the buffer. + // + // TODO: What is the latency cost of this? Tune this value. + wbBufEntries = 256 + + // wbBufEntryPointers is the number of pointers added to the + // buffer by each write barrier. + wbBufEntryPointers = 2 +) + +// reset empties b by resetting its next and end pointers. +func (b *wbBuf) reset() { + start := uintptr(unsafe.Pointer(&b.buf[0])) + b.next = start + if writeBarrier.cgo { + // Effectively disable the buffer by forcing a flush + // on every barrier. 
+ b.end = uintptr(unsafe.Pointer(&b.buf[wbBufEntryPointers])) + } else if testSmallBuf { + // For testing, allow two barriers in the buffer. If + // we only did one, then barriers of non-heap pointers + // would be no-ops. This lets us combine a buffered + // barrier with a flush at a later time. + b.end = uintptr(unsafe.Pointer(&b.buf[2*wbBufEntryPointers])) + } else { + b.end = start + uintptr(len(b.buf))*unsafe.Sizeof(b.buf[0]) + } + + if (b.end-b.next)%(wbBufEntryPointers*unsafe.Sizeof(b.buf[0])) != 0 { + throw("bad write barrier buffer bounds") + } +} + +// discard resets b's next pointer, but not its end pointer. +// +// This must be nosplit because it's called by wbBufFlush. +// +//go:nosplit +func (b *wbBuf) discard() { + b.next = uintptr(unsafe.Pointer(&b.buf[0])) +} + +// empty reports whether b contains no pointers. +func (b *wbBuf) empty() bool { + return b.next == uintptr(unsafe.Pointer(&b.buf[0])) +} + +// putFast adds old and new to the write barrier buffer and returns +// false if a flush is necessary. Callers should use this as: +// +// buf := &getg().m.p.ptr().wbBuf +// if !buf.putFast(old, new) { +// wbBufFlush(...) +// } +// ... actual memory write ... +// +// The arguments to wbBufFlush depend on whether the caller is doing +// its own cgo pointer checks. If it is, then this can be +// wbBufFlush(nil, 0). Otherwise, it must pass the slot address and +// new. +// +// The caller must ensure there are no preemption points during the +// above sequence. There must be no preemption points while buf is in +// use because it is a per-P resource. There must be no preemption +// points between the buffer put and the write to memory because this +// could allow a GC phase change, which could result in missed write +// barriers. +// +// putFast must be nowritebarrierrec because write barriers here would +// corrupt the write barrier buffer. 
It (and everything it calls, if +// it called anything) has to be nosplit to avoid scheduling on to a +// different P and a different buffer. +// +//go:nowritebarrierrec +//go:nosplit +func (b *wbBuf) putFast(old, new uintptr) bool { + p := (*[2]uintptr)(unsafe.Pointer(b.next)) + p[0] = old + p[1] = new + b.next += 2 * goarch.PtrSize + return b.next != b.end +} + +// wbBufFlush flushes the current P's write barrier buffer to the GC +// workbufs. It is passed the slot and value of the write barrier that +// caused the flush so that it can implement cgocheck. +// +// This must not have write barriers because it is part of the write +// barrier implementation. +// +// This and everything it calls must be nosplit because 1) the stack +// contains untyped slots from gcWriteBarrier and 2) there must not be +// a GC safe point between the write barrier test in the caller and +// flushing the buffer. +// +// TODO: A "go:nosplitrec" annotation would be perfect for this. +// +//go:nowritebarrierrec +//go:nosplit +func wbBufFlush(dst *uintptr, src uintptr) { + // Note: Every possible return from this function must reset + // the buffer's next pointer to prevent buffer overflow. + + // This *must not* modify its arguments because this + // function's argument slots do double duty in gcWriteBarrier + // as register spill slots. Currently, not modifying the + // arguments is sufficient to keep the spill slots unmodified + // (which seems unlikely to change since it costs little and + // helps with debugging). + + if getg().m.dying > 0 { + // We're going down. Not much point in write barriers + // and this way we can allow write barriers in the + // panic path. + getg().m.p.ptr().wbBuf.discard() + return + } + + if writeBarrier.cgo && dst != nil { + // This must be called from the stack that did the + // write. It's nosplit all the way down. + cgoCheckWriteBarrier(dst, src) + if !writeBarrier.needed { + // We were only called for cgocheck. 
+ getg().m.p.ptr().wbBuf.discard() + return + } + } + + // Switch to the system stack so we don't have to worry about + // the untyped stack slots or safe points. + systemstack(func() { + wbBufFlush1(getg().m.p.ptr()) + }) +} + +// wbBufFlush1 flushes p's write barrier buffer to the GC work queue. +// +// This must not have write barriers because it is part of the write +// barrier implementation, so this may lead to infinite loops or +// buffer corruption. +// +// This must be non-preemptible because it uses the P's workbuf. +// +//go:nowritebarrierrec +//go:systemstack +func wbBufFlush1(pp *p) { + // Get the buffered pointers. + start := uintptr(unsafe.Pointer(&pp.wbBuf.buf[0])) + n := (pp.wbBuf.next - start) / unsafe.Sizeof(pp.wbBuf.buf[0]) + ptrs := pp.wbBuf.buf[:n] + + // Poison the buffer to make extra sure nothing is enqueued + // while we're processing the buffer. + pp.wbBuf.next = 0 + + if useCheckmark { + // Slow path for checkmark mode. + for _, ptr := range ptrs { + shade(ptr) + } + pp.wbBuf.reset() + return + } + + // Mark all of the pointers in the buffer and record only the + // pointers we greyed. We use the buffer itself to temporarily + // record greyed pointers. + // + // TODO: Should scanobject/scanblock just stuff pointers into + // the wbBuf? Then this would become the sole greying path. + // + // TODO: We could avoid shading any of the "new" pointers in + // the buffer if the stack has been shaded, or even avoid + // putting them in the buffer at all (which would double its + // capacity). This is slightly complicated with the buffer; we + // could track whether any un-shaded goroutine has used the + // buffer, or just track globally whether there are any + // un-shaded stacks and flush after each stack scan. + gcw := &pp.gcw + pos := 0 + for _, ptr := range ptrs { + if ptr < minLegalPointer { + // nil pointers are very common, especially + // for the "old" values. Filter out these and + // other "obvious" non-heap pointers ASAP. 
+ // + // TODO: Should we filter out nils in the fast + // path to reduce the rate of flushes? + continue + } + obj, span, objIndex := findObject(ptr, 0, 0) + if obj == 0 { + continue + } + // TODO: Consider making two passes where the first + // just prefetches the mark bits. + mbits := span.markBitsForIndex(objIndex) + if mbits.isMarked() { + continue + } + mbits.setMarked() + + // Mark span. + arena, pageIdx, pageMask := pageIndexOf(span.base()) + if arena.pageMarks[pageIdx]&pageMask == 0 { + atomic.Or8(&arena.pageMarks[pageIdx], pageMask) + } + + if span.spanclass.noscan() { + gcw.bytesMarked += uint64(span.elemsize) + continue + } + ptrs[pos] = obj + pos++ + } + + // Enqueue the greyed objects. + gcw.putBatch(ptrs[:pos]) + + pp.wbBuf.reset() +} diff --git a/src/runtime/nbpipe_pipe.go b/src/runtime/nbpipe_pipe.go new file mode 100644 index 0000000..408e1ec --- /dev/null +++ b/src/runtime/nbpipe_pipe.go @@ -0,0 +1,19 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix || darwin + +package runtime + +func nonblockingPipe() (r, w int32, errno int32) { + r, w, errno = pipe() + if errno != 0 { + return -1, -1, errno + } + closeonexec(r) + setNonblock(r) + closeonexec(w) + setNonblock(w) + return r, w, errno +} diff --git a/src/runtime/nbpipe_pipe2.go b/src/runtime/nbpipe_pipe2.go new file mode 100644 index 0000000..22d60b4 --- /dev/null +++ b/src/runtime/nbpipe_pipe2.go @@ -0,0 +1,11 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build dragonfly || freebsd || linux || netbsd || openbsd || solaris + +package runtime + +func nonblockingPipe() (r, w int32, errno int32) { + return pipe2(_O_NONBLOCK | _O_CLOEXEC) +} diff --git a/src/runtime/nbpipe_pipe_test.go b/src/runtime/nbpipe_pipe_test.go new file mode 100644 index 0000000..c8cb3cf --- /dev/null +++ b/src/runtime/nbpipe_pipe_test.go @@ -0,0 +1,38 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix || darwin + +package runtime_test + +import ( + "runtime" + "syscall" + "testing" +) + +func TestSetNonblock(t *testing.T) { + t.Parallel() + + r, w, errno := runtime.Pipe() + if errno != 0 { + t.Fatal(syscall.Errno(errno)) + } + defer func() { + runtime.Close(r) + runtime.Close(w) + }() + + checkIsPipe(t, r, w) + + runtime.SetNonblock(r) + runtime.SetNonblock(w) + checkNonblocking(t, r, "reader") + checkNonblocking(t, w, "writer") + + runtime.Closeonexec(r) + runtime.Closeonexec(w) + checkCloseonexec(t, r, "reader") + checkCloseonexec(t, w, "writer") +} diff --git a/src/runtime/nbpipe_test.go b/src/runtime/nbpipe_test.go new file mode 100644 index 0000000..337b8e5 --- /dev/null +++ b/src/runtime/nbpipe_test.go @@ -0,0 +1,74 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime_test + +import ( + "runtime" + "syscall" + "testing" + "unsafe" +) + +func TestNonblockingPipe(t *testing.T) { + // NonblockingPipe is the test name for nonblockingPipe. 
+ r, w, errno := runtime.NonblockingPipe() + if errno != 0 { + t.Fatal(syscall.Errno(errno)) + } + defer runtime.Close(w) + + checkIsPipe(t, r, w) + checkNonblocking(t, r, "reader") + checkCloseonexec(t, r, "reader") + checkNonblocking(t, w, "writer") + checkCloseonexec(t, w, "writer") + + // Test that fcntl returns an error as expected. + if runtime.Close(r) != 0 { + t.Fatalf("Close(%d) failed", r) + } + val, errno := runtime.Fcntl(r, syscall.F_GETFD, 0) + if val != -1 { + t.Errorf("Fcntl succeeded unexpectedly") + } else if syscall.Errno(errno) != syscall.EBADF { + t.Errorf("Fcntl failed with error %v, expected %v", syscall.Errno(errno), syscall.EBADF) + } +} + +func checkIsPipe(t *testing.T, r, w int32) { + bw := byte(42) + if n := runtime.Write(uintptr(w), unsafe.Pointer(&bw), 1); n != 1 { + t.Fatalf("Write(w, &b, 1) == %d, expected 1", n) + } + var br byte + if n := runtime.Read(r, unsafe.Pointer(&br), 1); n != 1 { + t.Fatalf("Read(r, &b, 1) == %d, expected 1", n) + } + if br != bw { + t.Errorf("pipe read %d, expected %d", br, bw) + } +} + +func checkNonblocking(t *testing.T, fd int32, name string) { + t.Helper() + flags, errno := runtime.Fcntl(fd, syscall.F_GETFL, 0) + if flags == -1 { + t.Errorf("fcntl(%s, F_GETFL) failed: %v", name, syscall.Errno(errno)) + } else if flags&syscall.O_NONBLOCK == 0 { + t.Errorf("O_NONBLOCK not set in %s flags %#x", name, flags) + } +} + +func checkCloseonexec(t *testing.T, fd int32, name string) { + t.Helper() + flags, errno := runtime.Fcntl(fd, syscall.F_GETFD, 0) + if flags == -1 { + t.Errorf("fcntl(%s, F_GETFD) failed: %v", name, syscall.Errno(errno)) + } else if flags&syscall.FD_CLOEXEC == 0 { + t.Errorf("FD_CLOEXEC not set in %s flags %#x", name, flags) + } +} diff --git a/src/runtime/net_plan9.go b/src/runtime/net_plan9.go new file mode 100644 index 0000000..b1ac7c7 --- /dev/null +++ b/src/runtime/net_plan9.go @@ -0,0 +1,29 @@ +// Copyright 2016 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + _ "unsafe" +) + +//go:linkname runtime_ignoreHangup internal/poll.runtime_ignoreHangup +func runtime_ignoreHangup() { + getg().m.ignoreHangup = true +} + +//go:linkname runtime_unignoreHangup internal/poll.runtime_unignoreHangup +func runtime_unignoreHangup(sig string) { + getg().m.ignoreHangup = false +} + +func ignoredNote(note *byte) bool { + if note == nil { + return false + } + if gostringnocopy(note) != "hangup" { + return false + } + return getg().m.ignoreHangup +} diff --git a/src/runtime/netpoll.go b/src/runtime/netpoll.go new file mode 100644 index 0000000..5ac1f37 --- /dev/null +++ b/src/runtime/netpoll.go @@ -0,0 +1,657 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix || (js && wasm) || windows + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// Integrated network poller (platform-independent part). +// A particular implementation (epoll/kqueue/port/AIX/Windows) +// must define the following functions: +// +// func netpollinit() +// Initialize the poller. Only called once. +// +// func netpollopen(fd uintptr, pd *pollDesc) int32 +// Arm edge-triggered notifications for fd. The pd argument is to pass +// back to netpollready when fd is ready. Return an errno value. +// +// func netpollclose(fd uintptr) int32 +// Disable notifications for fd. Return an errno value. +// +// func netpoll(delta int64) gList +// Poll the network. If delta < 0, block indefinitely. If delta == 0, +// poll without blocking. If delta > 0, block for up to delta nanoseconds. +// Return a list of goroutines built by calling netpollready. +// +// func netpollBreak() +// Wake up the network poller, assumed to be blocked in netpoll. 
+// +// func netpollIsPollDescriptor(fd uintptr) bool +// Reports whether fd is a file descriptor used by the poller. + +// Error codes returned by runtime_pollReset and runtime_pollWait. +// These must match the values in internal/poll/fd_poll_runtime.go. +const ( + pollNoError = 0 // no error + pollErrClosing = 1 // descriptor is closed + pollErrTimeout = 2 // I/O timeout + pollErrNotPollable = 3 // general error polling descriptor +) + +// pollDesc contains 2 binary semaphores, rg and wg, to park reader and writer +// goroutines respectively. The semaphore can be in the following states: +// +// pdReady - io readiness notification is pending; +// a goroutine consumes the notification by changing the state to pdNil. +// pdWait - a goroutine prepares to park on the semaphore, but not yet parked; +// the goroutine commits to park by changing the state to G pointer, +// or, alternatively, concurrent io notification changes the state to pdReady, +// or, alternatively, concurrent timeout/close changes the state to pdNil. +// G pointer - the goroutine is blocked on the semaphore; +// io notification or timeout/close changes the state to pdReady or pdNil respectively +// and unparks the goroutine. +// pdNil - none of the above. +const ( + pdNil uintptr = 0 + pdReady uintptr = 1 + pdWait uintptr = 2 +) + +const pollBlockSize = 4 * 1024 + +// Network poller descriptor. +// +// No heap pointers. +type pollDesc struct { + _ sys.NotInHeap + link *pollDesc // in pollcache, protected by pollcache.lock + fd uintptr // constant for pollDesc usage lifetime + + // atomicInfo holds bits from closing, rd, and wd, + // which are only ever written while holding the lock, + // summarized for use by netpollcheckerr, + // which cannot acquire the lock. + // After writing these fields under lock in a way that + // might change the summary, code must call publishInfo + // before releasing the lock. 
+ // Code that changes fields and then calls netpollunblock + // (while still holding the lock) must call publishInfo + // before calling netpollunblock, because publishInfo is what + // stops netpollblock from blocking anew + // (by changing the result of netpollcheckerr). + // atomicInfo also holds the eventErr bit, + // recording whether a poll event on the fd got an error; + // atomicInfo is the only source of truth for that bit. + atomicInfo atomic.Uint32 // atomic pollInfo + + // rg, wg are accessed atomically and hold g pointers. + // (Using atomic.Uintptr here is similar to using guintptr elsewhere.) + rg atomic.Uintptr // pdReady, pdWait, G waiting for read or pdNil + wg atomic.Uintptr // pdReady, pdWait, G waiting for write or pdNil + + lock mutex // protects the following fields + closing bool + user uint32 // user settable cookie + rseq uintptr // protects from stale read timers + rt timer // read deadline timer (set if rt.f != nil) + rd int64 // read deadline (a nanotime in the future, -1 when expired) + wseq uintptr // protects from stale write timers + wt timer // write deadline timer + wd int64 // write deadline (a nanotime in the future, -1 when expired) + self *pollDesc // storage for indirect interface. See (*pollDesc).makeArg. +} + +// pollInfo is the bits needed by netpollcheckerr, stored atomically, +// mostly duplicating state that is manipulated under lock in pollDesc. +// The one exception is the pollEventErr bit, which is maintained only +// in the pollInfo. 
+type pollInfo uint32 + +const ( + pollClosing = 1 << iota + pollEventErr + pollExpiredReadDeadline + pollExpiredWriteDeadline +) + +func (i pollInfo) closing() bool { return i&pollClosing != 0 } +func (i pollInfo) eventErr() bool { return i&pollEventErr != 0 } +func (i pollInfo) expiredReadDeadline() bool { return i&pollExpiredReadDeadline != 0 } +func (i pollInfo) expiredWriteDeadline() bool { return i&pollExpiredWriteDeadline != 0 } + +// info returns the pollInfo corresponding to pd. +func (pd *pollDesc) info() pollInfo { + return pollInfo(pd.atomicInfo.Load()) +} + +// publishInfo updates pd.atomicInfo (returned by pd.info) +// using the other values in pd. +// It must be called while holding pd.lock, +// and it must be called after changing anything +// that might affect the info bits. +// In practice this means after changing closing +// or changing rd or wd from < 0 to >= 0. +func (pd *pollDesc) publishInfo() { + var info uint32 + if pd.closing { + info |= pollClosing + } + if pd.rd < 0 { + info |= pollExpiredReadDeadline + } + if pd.wd < 0 { + info |= pollExpiredWriteDeadline + } + + // Set all of x except the pollEventErr bit. + x := pd.atomicInfo.Load() + for !pd.atomicInfo.CompareAndSwap(x, (x&pollEventErr)|info) { + x = pd.atomicInfo.Load() + } +} + +// setEventErr sets the result of pd.info().eventErr() to b. +func (pd *pollDesc) setEventErr(b bool) { + x := pd.atomicInfo.Load() + for (x&pollEventErr != 0) != b && !pd.atomicInfo.CompareAndSwap(x, x^pollEventErr) { + x = pd.atomicInfo.Load() + } +} + +type pollCache struct { + lock mutex + first *pollDesc + // PollDesc objects must be type-stable, + // because we can get ready notification from epoll/kqueue + // after the descriptor is closed/reused. + // Stale notifications are detected using seq variable, + // seq is incremented when deadlines are changed or descriptor is reused. 
+} + +var ( + netpollInitLock mutex + netpollInited atomic.Uint32 + + pollcache pollCache + netpollWaiters atomic.Uint32 +) + +//go:linkname poll_runtime_pollServerInit internal/poll.runtime_pollServerInit +func poll_runtime_pollServerInit() { + netpollGenericInit() +} + +func netpollGenericInit() { + if netpollInited.Load() == 0 { + lockInit(&netpollInitLock, lockRankNetpollInit) + lock(&netpollInitLock) + if netpollInited.Load() == 0 { + netpollinit() + netpollInited.Store(1) + } + unlock(&netpollInitLock) + } +} + +func netpollinited() bool { + return netpollInited.Load() != 0 +} + +//go:linkname poll_runtime_isPollServerDescriptor internal/poll.runtime_isPollServerDescriptor + +// poll_runtime_isPollServerDescriptor reports whether fd is a +// descriptor being used by netpoll. +func poll_runtime_isPollServerDescriptor(fd uintptr) bool { + return netpollIsPollDescriptor(fd) +} + +//go:linkname poll_runtime_pollOpen internal/poll.runtime_pollOpen +func poll_runtime_pollOpen(fd uintptr) (*pollDesc, int) { + pd := pollcache.alloc() + lock(&pd.lock) + wg := pd.wg.Load() + if wg != pdNil && wg != pdReady { + throw("runtime: blocked write on free polldesc") + } + rg := pd.rg.Load() + if rg != pdNil && rg != pdReady { + throw("runtime: blocked read on free polldesc") + } + pd.fd = fd + pd.closing = false + pd.setEventErr(false) + pd.rseq++ + pd.rg.Store(pdNil) + pd.rd = 0 + pd.wseq++ + pd.wg.Store(pdNil) + pd.wd = 0 + pd.self = pd + pd.publishInfo() + unlock(&pd.lock) + + errno := netpollopen(fd, pd) + if errno != 0 { + pollcache.free(pd) + return nil, int(errno) + } + return pd, 0 +} + +//go:linkname poll_runtime_pollClose internal/poll.runtime_pollClose +func poll_runtime_pollClose(pd *pollDesc) { + if !pd.closing { + throw("runtime: close polldesc w/o unblock") + } + wg := pd.wg.Load() + if wg != pdNil && wg != pdReady { + throw("runtime: blocked write on closing polldesc") + } + rg := pd.rg.Load() + if rg != pdNil && rg != pdReady { + throw("runtime: blocked read 
on closing polldesc") + } + netpollclose(pd.fd) + pollcache.free(pd) +} + +func (c *pollCache) free(pd *pollDesc) { + lock(&c.lock) + pd.link = c.first + c.first = pd + unlock(&c.lock) +} + +// poll_runtime_pollReset, which is internal/poll.runtime_pollReset, +// prepares a descriptor for polling in mode, which is 'r' or 'w'. +// This returns an error code; the codes are defined above. +// +//go:linkname poll_runtime_pollReset internal/poll.runtime_pollReset +func poll_runtime_pollReset(pd *pollDesc, mode int) int { + errcode := netpollcheckerr(pd, int32(mode)) + if errcode != pollNoError { + return errcode + } + if mode == 'r' { + pd.rg.Store(pdNil) + } else if mode == 'w' { + pd.wg.Store(pdNil) + } + return pollNoError +} + +// poll_runtime_pollWait, which is internal/poll.runtime_pollWait, +// waits for a descriptor to be ready for reading or writing, +// according to mode, which is 'r' or 'w'. +// This returns an error code; the codes are defined above. +// +//go:linkname poll_runtime_pollWait internal/poll.runtime_pollWait +func poll_runtime_pollWait(pd *pollDesc, mode int) int { + errcode := netpollcheckerr(pd, int32(mode)) + if errcode != pollNoError { + return errcode + } + // As for now only Solaris, illumos, and AIX use level-triggered IO. + if GOOS == "solaris" || GOOS == "illumos" || GOOS == "aix" { + netpollarm(pd, mode) + } + for !netpollblock(pd, int32(mode), false) { + errcode = netpollcheckerr(pd, int32(mode)) + if errcode != pollNoError { + return errcode + } + // Can happen if timeout has fired and unblocked us, + // but before we had a chance to run, timeout has been reset. + // Pretend it has not happened and retry. + } + return pollNoError +} + +//go:linkname poll_runtime_pollWaitCanceled internal/poll.runtime_pollWaitCanceled +func poll_runtime_pollWaitCanceled(pd *pollDesc, mode int) { + // This function is used only on windows after a failed attempt to cancel + // a pending async IO operation. Wait for ioready, ignore closing or timeouts. 
+ for !netpollblock(pd, int32(mode), true) { + } +} + +//go:linkname poll_runtime_pollSetDeadline internal/poll.runtime_pollSetDeadline +func poll_runtime_pollSetDeadline(pd *pollDesc, d int64, mode int) { + lock(&pd.lock) + if pd.closing { + unlock(&pd.lock) + return + } + rd0, wd0 := pd.rd, pd.wd + combo0 := rd0 > 0 && rd0 == wd0 + if d > 0 { + d += nanotime() + if d <= 0 { + // If the user has a deadline in the future, but the delay calculation + // overflows, then set the deadline to the maximum possible value. + d = 1<<63 - 1 + } + } + if mode == 'r' || mode == 'r'+'w' { + pd.rd = d + } + if mode == 'w' || mode == 'r'+'w' { + pd.wd = d + } + pd.publishInfo() + combo := pd.rd > 0 && pd.rd == pd.wd + rtf := netpollReadDeadline + if combo { + rtf = netpollDeadline + } + if pd.rt.f == nil { + if pd.rd > 0 { + pd.rt.f = rtf + // Copy current seq into the timer arg. + // Timer func will check the seq against current descriptor seq, + // if they differ the descriptor was reused or timers were reset. + pd.rt.arg = pd.makeArg() + pd.rt.seq = pd.rseq + resettimer(&pd.rt, pd.rd) + } + } else if pd.rd != rd0 || combo != combo0 { + pd.rseq++ // invalidate current timers + if pd.rd > 0 { + modtimer(&pd.rt, pd.rd, 0, rtf, pd.makeArg(), pd.rseq) + } else { + deltimer(&pd.rt) + pd.rt.f = nil + } + } + if pd.wt.f == nil { + if pd.wd > 0 && !combo { + pd.wt.f = netpollWriteDeadline + pd.wt.arg = pd.makeArg() + pd.wt.seq = pd.wseq + resettimer(&pd.wt, pd.wd) + } + } else if pd.wd != wd0 || combo != combo0 { + pd.wseq++ // invalidate current timers + if pd.wd > 0 && !combo { + modtimer(&pd.wt, pd.wd, 0, netpollWriteDeadline, pd.makeArg(), pd.wseq) + } else { + deltimer(&pd.wt) + pd.wt.f = nil + } + } + // If we set the new deadline in the past, unblock currently pending IO if any. + // Note that pd.publishInfo has already been called, above, immediately after modifying rd and wd. 
+ var rg, wg *g + if pd.rd < 0 { + rg = netpollunblock(pd, 'r', false) + } + if pd.wd < 0 { + wg = netpollunblock(pd, 'w', false) + } + unlock(&pd.lock) + if rg != nil { + netpollgoready(rg, 3) + } + if wg != nil { + netpollgoready(wg, 3) + } +} + +//go:linkname poll_runtime_pollUnblock internal/poll.runtime_pollUnblock +func poll_runtime_pollUnblock(pd *pollDesc) { + lock(&pd.lock) + if pd.closing { + throw("runtime: unblock on closing polldesc") + } + pd.closing = true + pd.rseq++ + pd.wseq++ + var rg, wg *g + pd.publishInfo() + rg = netpollunblock(pd, 'r', false) + wg = netpollunblock(pd, 'w', false) + if pd.rt.f != nil { + deltimer(&pd.rt) + pd.rt.f = nil + } + if pd.wt.f != nil { + deltimer(&pd.wt) + pd.wt.f = nil + } + unlock(&pd.lock) + if rg != nil { + netpollgoready(rg, 3) + } + if wg != nil { + netpollgoready(wg, 3) + } +} + +// netpollready is called by the platform-specific netpoll function. +// It declares that the fd associated with pd is ready for I/O. +// The toRun argument is used to build a list of goroutines to return +// from netpoll. The mode argument is 'r', 'w', or 'r'+'w' to indicate +// whether the fd is ready for reading or writing or both. +// +// This may run while the world is stopped, so write barriers are not allowed. +// +//go:nowritebarrier +func netpollready(toRun *gList, pd *pollDesc, mode int32) { + var rg, wg *g + if mode == 'r' || mode == 'r'+'w' { + rg = netpollunblock(pd, 'r', true) + } + if mode == 'w' || mode == 'r'+'w' { + wg = netpollunblock(pd, 'w', true) + } + if rg != nil { + toRun.push(rg) + } + if wg != nil { + toRun.push(wg) + } +} + +func netpollcheckerr(pd *pollDesc, mode int32) int { + info := pd.info() + if info.closing() { + return pollErrClosing + } + if (mode == 'r' && info.expiredReadDeadline()) || (mode == 'w' && info.expiredWriteDeadline()) { + return pollErrTimeout + } + // Report an event scanning error only on a read event. 
+ // An error on a write event will be captured in a subsequent + // write call that is able to report a more specific error. + if mode == 'r' && info.eventErr() { + return pollErrNotPollable + } + return pollNoError +} + +func netpollblockcommit(gp *g, gpp unsafe.Pointer) bool { + r := atomic.Casuintptr((*uintptr)(gpp), pdWait, uintptr(unsafe.Pointer(gp))) + if r { + // Bump the count of goroutines waiting for the poller. + // The scheduler uses this to decide whether to block + // waiting for the poller if there is nothing else to do. + netpollWaiters.Add(1) + } + return r +} + +func netpollgoready(gp *g, traceskip int) { + netpollWaiters.Add(-1) + goready(gp, traceskip+1) +} + +// returns true if IO is ready, or false if timed out or closed +// waitio - wait only for completed IO, ignore errors +// Concurrent calls to netpollblock in the same mode are forbidden, as pollDesc +// can hold only a single waiting goroutine for each mode. +func netpollblock(pd *pollDesc, mode int32, waitio bool) bool { + gpp := &pd.rg + if mode == 'w' { + gpp = &pd.wg + } + + // set the gpp semaphore to pdWait + for { + // Consume notification if already ready. + if gpp.CompareAndSwap(pdReady, pdNil) { + return true + } + if gpp.CompareAndSwap(pdNil, pdWait) { + break + } + + // Double check that this isn't corrupt; otherwise we'd loop + // forever. 
+ if v := gpp.Load(); v != pdReady && v != pdNil { + throw("runtime: double wait") + } + } + + // need to recheck error states after setting gpp to pdWait + // this is necessary because runtime_pollUnblock/runtime_pollSetDeadline/deadlineimpl + // do the opposite: store to closing/rd/wd, publishInfo, load of rg/wg + if waitio || netpollcheckerr(pd, mode) == pollNoError { + gopark(netpollblockcommit, unsafe.Pointer(gpp), waitReasonIOWait, traceEvGoBlockNet, 5) + } + // be careful to not lose concurrent pdReady notification + old := gpp.Swap(pdNil) + if old > pdWait { + throw("runtime: corrupted polldesc") + } + return old == pdReady +} + +func netpollunblock(pd *pollDesc, mode int32, ioready bool) *g { + gpp := &pd.rg + if mode == 'w' { + gpp = &pd.wg + } + + for { + old := gpp.Load() + if old == pdReady { + return nil + } + if old == pdNil && !ioready { + // Only set pdReady for ioready. runtime_pollWait + // will check for timeout/cancel before waiting. + return nil + } + var new uintptr + if ioready { + new = pdReady + } + if gpp.CompareAndSwap(old, new) { + if old == pdWait { + old = pdNil + } + return (*g)(unsafe.Pointer(old)) + } + } +} + +func netpolldeadlineimpl(pd *pollDesc, seq uintptr, read, write bool) { + lock(&pd.lock) + // Seq arg is seq when the timer was set. + // If it's stale, ignore the timer event. + currentSeq := pd.rseq + if !read { + currentSeq = pd.wseq + } + if seq != currentSeq { + // The descriptor was reused or timers were reset. 
+ unlock(&pd.lock) + return + } + var rg *g + if read { + if pd.rd <= 0 || pd.rt.f == nil { + throw("runtime: inconsistent read deadline") + } + pd.rd = -1 + pd.publishInfo() + rg = netpollunblock(pd, 'r', false) + } + var wg *g + if write { + if pd.wd <= 0 || pd.wt.f == nil && !read { + throw("runtime: inconsistent write deadline") + } + pd.wd = -1 + pd.publishInfo() + wg = netpollunblock(pd, 'w', false) + } + unlock(&pd.lock) + if rg != nil { + netpollgoready(rg, 0) + } + if wg != nil { + netpollgoready(wg, 0) + } +} + +func netpollDeadline(arg any, seq uintptr) { + netpolldeadlineimpl(arg.(*pollDesc), seq, true, true) +} + +func netpollReadDeadline(arg any, seq uintptr) { + netpolldeadlineimpl(arg.(*pollDesc), seq, true, false) +} + +func netpollWriteDeadline(arg any, seq uintptr) { + netpolldeadlineimpl(arg.(*pollDesc), seq, false, true) +} + +func (c *pollCache) alloc() *pollDesc { + lock(&c.lock) + if c.first == nil { + const pdSize = unsafe.Sizeof(pollDesc{}) + n := pollBlockSize / pdSize + if n == 0 { + n = 1 + } + // Must be in non-GC memory because can be referenced + // only from epoll/kqueue internals. + mem := persistentalloc(n*pdSize, 0, &memstats.other_sys) + for i := uintptr(0); i < n; i++ { + pd := (*pollDesc)(add(mem, i*pdSize)) + pd.link = c.first + c.first = pd + } + } + pd := c.first + c.first = pd.link + lockInit(&pd.lock, lockRankPollDesc) + unlock(&c.lock) + return pd +} + +// makeArg converts pd to an interface{}. +// makeArg does not do any allocation. Normally, such +// a conversion requires an allocation because pointers to +// types which embed runtime/internal/sys.NotInHeap (which pollDesc is) +// must be stored in interfaces indirectly. See issue 42076. 
+func (pd *pollDesc) makeArg() (i any) { + x := (*eface)(unsafe.Pointer(&i)) + x._type = pdType + x.data = unsafe.Pointer(&pd.self) + return +} + +var ( + pdEface any = (*pollDesc)(nil) + pdType *_type = efaceOf(&pdEface)._type +) diff --git a/src/runtime/netpoll_aix.go b/src/runtime/netpoll_aix.go new file mode 100644 index 0000000..5184aad --- /dev/null +++ b/src/runtime/netpoll_aix.go @@ -0,0 +1,226 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// This is based on the former libgo/runtime/netpoll_select.c implementation +// except that it uses poll instead of select and is written in Go. +// It's also based on Solaris implementation for the arming mechanisms + +//go:cgo_import_dynamic libc_poll poll "libc.a/shr_64.o" +//go:linkname libc_poll libc_poll + +var libc_poll libFunc + +//go:nosplit +func poll(pfds *pollfd, npfds uintptr, timeout uintptr) (int32, int32) { + r, err := syscall3(&libc_poll, uintptr(unsafe.Pointer(pfds)), npfds, timeout) + return int32(r), int32(err) +} + +// pollfd represents the poll structure for AIX operating system. +type pollfd struct { + fd int32 + events int16 + revents int16 +} + +const _POLLIN = 0x0001 +const _POLLOUT = 0x0002 +const _POLLHUP = 0x2000 +const _POLLERR = 0x4000 + +var ( + pfds []pollfd + pds []*pollDesc + mtxpoll mutex + mtxset mutex + rdwake int32 + wrwake int32 + pendingUpdates int32 + + netpollWakeSig atomic.Uint32 // used to avoid duplicate calls of netpollBreak +) + +func netpollinit() { + // Create the pipe we use to wakeup poll. + r, w, errno := nonblockingPipe() + if errno != 0 { + throw("netpollinit: failed to create pipe") + } + rdwake = r + wrwake = w + + // Pre-allocate array of pollfd structures for poll. + pfds = make([]pollfd, 1, 128) + + // Poll the read side of the pipe. 
+ pfds[0].fd = rdwake + pfds[0].events = _POLLIN + + pds = make([]*pollDesc, 1, 128) + pds[0] = nil +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return fd == uintptr(rdwake) || fd == uintptr(wrwake) +} + +// netpollwakeup writes on wrwake to wakeup poll before any changes. +func netpollwakeup() { + if pendingUpdates == 0 { + pendingUpdates = 1 + b := [1]byte{0} + write(uintptr(wrwake), unsafe.Pointer(&b[0]), 1) + } +} + +func netpollopen(fd uintptr, pd *pollDesc) int32 { + lock(&mtxpoll) + netpollwakeup() + + lock(&mtxset) + unlock(&mtxpoll) + + pd.user = uint32(len(pfds)) + pfds = append(pfds, pollfd{fd: int32(fd)}) + pds = append(pds, pd) + unlock(&mtxset) + return 0 +} + +func netpollclose(fd uintptr) int32 { + lock(&mtxpoll) + netpollwakeup() + + lock(&mtxset) + unlock(&mtxpoll) + + for i := 0; i < len(pfds); i++ { + if pfds[i].fd == int32(fd) { + pfds[i] = pfds[len(pfds)-1] + pfds = pfds[:len(pfds)-1] + + pds[i] = pds[len(pds)-1] + pds[i].user = uint32(i) + pds = pds[:len(pds)-1] + break + } + } + unlock(&mtxset) + return 0 +} + +func netpollarm(pd *pollDesc, mode int) { + lock(&mtxpoll) + netpollwakeup() + + lock(&mtxset) + unlock(&mtxpoll) + + switch mode { + case 'r': + pfds[pd.user].events |= _POLLIN + case 'w': + pfds[pd.user].events |= _POLLOUT + } + unlock(&mtxset) +} + +// netpollBreak interrupts a poll. +func netpollBreak() { + // Failing to cas indicates there is an in-flight wakeup, so we're done here. + if !netpollWakeSig.CompareAndSwap(0, 1) { + return + } + + b := [1]byte{0} + write(uintptr(wrwake), unsafe.Pointer(&b[0]), 1) +} + +// netpoll checks for ready network connections. +// Returns list of goroutines that become runnable. 
+// delay < 0: blocks indefinitely +// delay == 0: does not block, just polls +// delay > 0: block for up to that many nanoseconds +// +//go:nowritebarrierrec +func netpoll(delay int64) gList { + var timeout uintptr + if delay < 0 { + timeout = ^uintptr(0) + } else if delay == 0 { + // TODO: call poll with timeout == 0 + return gList{} + } else if delay < 1e6 { + timeout = 1 + } else if delay < 1e15 { + timeout = uintptr(delay / 1e6) + } else { + // An arbitrary cap on how long to wait for a timer. + // 1e9 ms == ~11.5 days. + timeout = 1e9 + } +retry: + lock(&mtxpoll) + lock(&mtxset) + pendingUpdates = 0 + unlock(&mtxpoll) + + n, e := poll(&pfds[0], uintptr(len(pfds)), timeout) + if n < 0 { + if e != _EINTR { + println("errno=", e, " len(pfds)=", len(pfds)) + throw("poll failed") + } + unlock(&mtxset) + // If a timed sleep was interrupted, just return to + // recalculate how long we should sleep now. + if timeout > 0 { + return gList{} + } + goto retry + } + // Check if some descriptors need to be changed + if n != 0 && pfds[0].revents&(_POLLIN|_POLLHUP|_POLLERR) != 0 { + if delay != 0 { + // A netpollwakeup could be picked up by a + // non-blocking poll. Only clear the wakeup + // if blocking. + var b [1]byte + for read(rdwake, unsafe.Pointer(&b[0]), 1) == 1 { + } + netpollWakeSig.Store(0) + } + // Still look at the other fds even if the mode may have + // changed, as netpollBreak might have been called. 
+ n-- + } + var toRun gList + for i := 1; i < len(pfds) && n > 0; i++ { + pfd := &pfds[i] + + var mode int32 + if pfd.revents&(_POLLIN|_POLLHUP|_POLLERR) != 0 { + mode += 'r' + pfd.events &= ^_POLLIN + } + if pfd.revents&(_POLLOUT|_POLLHUP|_POLLERR) != 0 { + mode += 'w' + pfd.events &= ^_POLLOUT + } + if mode != 0 { + pds[i].setEventErr(pfd.revents == _POLLERR) + netpollready(&toRun, pds[i], mode) + n-- + } + } + unlock(&mtxset) + return toRun +} diff --git a/src/runtime/netpoll_epoll.go b/src/runtime/netpoll_epoll.go new file mode 100644 index 0000000..7164a59 --- /dev/null +++ b/src/runtime/netpoll_epoll.go @@ -0,0 +1,167 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux + +package runtime + +import ( + "runtime/internal/atomic" + "runtime/internal/syscall" + "unsafe" +) + +var ( + epfd int32 = -1 // epoll descriptor + + netpollBreakRd, netpollBreakWr uintptr // for netpollBreak + + netpollWakeSig atomic.Uint32 // used to avoid duplicate calls of netpollBreak +) + +func netpollinit() { + var errno uintptr + epfd, errno = syscall.EpollCreate1(syscall.EPOLL_CLOEXEC) + if errno != 0 { + println("runtime: epollcreate failed with", errno) + throw("runtime: netpollinit failed") + } + r, w, errpipe := nonblockingPipe() + if errpipe != 0 { + println("runtime: pipe failed with", -errpipe) + throw("runtime: pipe failed") + } + ev := syscall.EpollEvent{ + Events: syscall.EPOLLIN, + } + *(**uintptr)(unsafe.Pointer(&ev.Data)) = &netpollBreakRd + errno = syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, r, &ev) + if errno != 0 { + println("runtime: epollctl failed with", errno) + throw("runtime: epollctl failed") + } + netpollBreakRd = uintptr(r) + netpollBreakWr = uintptr(w) +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return fd == uintptr(epfd) || fd == netpollBreakRd || fd == netpollBreakWr +} + +func netpollopen(fd uintptr, pd 
*pollDesc) uintptr { + var ev syscall.EpollEvent + ev.Events = syscall.EPOLLIN | syscall.EPOLLOUT | syscall.EPOLLRDHUP | syscall.EPOLLET + *(**pollDesc)(unsafe.Pointer(&ev.Data)) = pd + return syscall.EpollCtl(epfd, syscall.EPOLL_CTL_ADD, int32(fd), &ev) +} + +func netpollclose(fd uintptr) uintptr { + var ev syscall.EpollEvent + return syscall.EpollCtl(epfd, syscall.EPOLL_CTL_DEL, int32(fd), &ev) +} + +func netpollarm(pd *pollDesc, mode int) { + throw("runtime: unused") +} + +// netpollBreak interrupts an epollwait. +func netpollBreak() { + // Failing to cas indicates there is an in-flight wakeup, so we're done here. + if !netpollWakeSig.CompareAndSwap(0, 1) { + return + } + + for { + var b byte + n := write(netpollBreakWr, unsafe.Pointer(&b), 1) + if n == 1 { + break + } + if n == -_EINTR { + continue + } + if n == -_EAGAIN { + return + } + println("runtime: netpollBreak write failed with", -n) + throw("runtime: netpollBreak write failed") + } +} + +// netpoll checks for ready network connections. +// Returns list of goroutines that become runnable. +// delay < 0: blocks indefinitely +// delay == 0: does not block, just polls +// delay > 0: block for up to that many nanoseconds +func netpoll(delay int64) gList { + if epfd == -1 { + return gList{} + } + var waitms int32 + if delay < 0 { + waitms = -1 + } else if delay == 0 { + waitms = 0 + } else if delay < 1e6 { + waitms = 1 + } else if delay < 1e15 { + waitms = int32(delay / 1e6) + } else { + // An arbitrary cap on how long to wait for a timer. + // 1e9 ms == ~11.5 days. + waitms = 1e9 + } + var events [128]syscall.EpollEvent +retry: + n, errno := syscall.EpollWait(epfd, events[:], int32(len(events)), waitms) + if errno != 0 { + if errno != _EINTR { + println("runtime: epollwait on fd", epfd, "failed with", errno) + throw("runtime: netpoll failed") + } + // If a timed sleep was interrupted, just return to + // recalculate how long we should sleep now. 
+ if waitms > 0 { + return gList{} + } + goto retry + } + var toRun gList + for i := int32(0); i < n; i++ { + ev := events[i] + if ev.Events == 0 { + continue + } + + if *(**uintptr)(unsafe.Pointer(&ev.Data)) == &netpollBreakRd { + if ev.Events != syscall.EPOLLIN { + println("runtime: netpoll: break fd ready for", ev.Events) + throw("runtime: netpoll: break fd ready for something unexpected") + } + if delay != 0 { + // netpollBreak could be picked up by a + // nonblocking poll. Only read the byte + // if blocking. + var tmp [16]byte + read(int32(netpollBreakRd), noescape(unsafe.Pointer(&tmp[0])), int32(len(tmp))) + netpollWakeSig.Store(0) + } + continue + } + + var mode int32 + if ev.Events&(syscall.EPOLLIN|syscall.EPOLLRDHUP|syscall.EPOLLHUP|syscall.EPOLLERR) != 0 { + mode += 'r' + } + if ev.Events&(syscall.EPOLLOUT|syscall.EPOLLHUP|syscall.EPOLLERR) != 0 { + mode += 'w' + } + if mode != 0 { + pd := *(**pollDesc)(unsafe.Pointer(&ev.Data)) + pd.setEventErr(ev.Events == syscall.EPOLLERR) + netpollready(&toRun, pd, mode) + } + } + return toRun +} diff --git a/src/runtime/netpoll_fake.go b/src/runtime/netpoll_fake.go new file mode 100644 index 0000000..de1dcae --- /dev/null +++ b/src/runtime/netpoll_fake.go @@ -0,0 +1,35 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Fake network poller for wasm/js. +// Should never be used, because wasm/js network connections do not honor "SetNonblock". 
+ +//go:build js && wasm + +package runtime + +func netpollinit() { +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return false +} + +func netpollopen(fd uintptr, pd *pollDesc) int32 { + return 0 +} + +func netpollclose(fd uintptr) int32 { + return 0 +} + +func netpollarm(pd *pollDesc, mode int) { +} + +func netpollBreak() { +} + +func netpoll(delay int64) gList { + return gList{} +} diff --git a/src/runtime/netpoll_kqueue.go b/src/runtime/netpoll_kqueue.go new file mode 100644 index 0000000..5ae77b5 --- /dev/null +++ b/src/runtime/netpoll_kqueue.go @@ -0,0 +1,190 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build darwin || dragonfly || freebsd || netbsd || openbsd + +package runtime + +// Integrated network poller (kqueue-based implementation). + +import ( + "runtime/internal/atomic" + "unsafe" +) + +var ( + kq int32 = -1 + + netpollBreakRd, netpollBreakWr uintptr // for netpollBreak + + netpollWakeSig atomic.Uint32 // used to avoid duplicate calls of netpollBreak +) + +func netpollinit() { + kq = kqueue() + if kq < 0 { + println("runtime: kqueue failed with", -kq) + throw("runtime: netpollinit failed") + } + closeonexec(kq) + r, w, errno := nonblockingPipe() + if errno != 0 { + println("runtime: pipe failed with", -errno) + throw("runtime: pipe failed") + } + ev := keventt{ + filter: _EVFILT_READ, + flags: _EV_ADD, + } + *(*uintptr)(unsafe.Pointer(&ev.ident)) = uintptr(r) + n := kevent(kq, &ev, 1, nil, 0, nil) + if n < 0 { + println("runtime: kevent failed with", -n) + throw("runtime: kevent failed") + } + netpollBreakRd = uintptr(r) + netpollBreakWr = uintptr(w) +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return fd == uintptr(kq) || fd == netpollBreakRd || fd == netpollBreakWr +} + +func netpollopen(fd uintptr, pd *pollDesc) int32 { + // Arm both EVFILT_READ and EVFILT_WRITE in edge-triggered mode (EV_CLEAR) + // 
for the whole fd lifetime. The notifications are automatically unregistered + // when fd is closed. + var ev [2]keventt + *(*uintptr)(unsafe.Pointer(&ev[0].ident)) = fd + ev[0].filter = _EVFILT_READ + ev[0].flags = _EV_ADD | _EV_CLEAR + ev[0].fflags = 0 + ev[0].data = 0 + ev[0].udata = (*byte)(unsafe.Pointer(pd)) + ev[1] = ev[0] + ev[1].filter = _EVFILT_WRITE + n := kevent(kq, &ev[0], 2, nil, 0, nil) + if n < 0 { + return -n + } + return 0 +} + +func netpollclose(fd uintptr) int32 { + // Don't need to unregister because calling close() + // on fd will remove any kevents that reference the descriptor. + return 0 +} + +func netpollarm(pd *pollDesc, mode int) { + throw("runtime: unused") +} + +// netpollBreak interrupts a kevent. +func netpollBreak() { + // Failing to cas indicates there is an in-flight wakeup, so we're done here. + if !netpollWakeSig.CompareAndSwap(0, 1) { + return + } + + for { + var b byte + n := write(netpollBreakWr, unsafe.Pointer(&b), 1) + if n == 1 || n == -_EAGAIN { + break + } + if n == -_EINTR { + continue + } + println("runtime: netpollBreak write failed with", -n) + throw("runtime: netpollBreak write failed") + } +} + +// netpoll checks for ready network connections. +// Returns list of goroutines that become runnable. +// delay < 0: blocks indefinitely +// delay == 0: does not block, just polls +// delay > 0: block for up to that many nanoseconds +func netpoll(delay int64) gList { + if kq == -1 { + return gList{} + } + var tp *timespec + var ts timespec + if delay < 0 { + tp = nil + } else if delay == 0 { + tp = &ts + } else { + ts.setNsec(delay) + if ts.tv_sec > 1e6 { + // Darwin returns EINVAL if the sleep time is too long. 
+ ts.tv_sec = 1e6 + } + tp = &ts + } + var events [64]keventt +retry: + n := kevent(kq, nil, 0, &events[0], int32(len(events)), tp) + if n < 0 { + if n != -_EINTR { + println("runtime: kevent on fd", kq, "failed with", -n) + throw("runtime: netpoll failed") + } + // If a timed sleep was interrupted, just return to + // recalculate how long we should sleep now. + if delay > 0 { + return gList{} + } + goto retry + } + var toRun gList + for i := 0; i < int(n); i++ { + ev := &events[i] + + if uintptr(ev.ident) == netpollBreakRd { + if ev.filter != _EVFILT_READ { + println("runtime: netpoll: break fd ready for", ev.filter) + throw("runtime: netpoll: break fd ready for something unexpected") + } + if delay != 0 { + // netpollBreak could be picked up by a + // nonblocking poll. Only read the byte + // if blocking. + var tmp [16]byte + read(int32(netpollBreakRd), noescape(unsafe.Pointer(&tmp[0])), int32(len(tmp))) + netpollWakeSig.Store(0) + } + continue + } + + var mode int32 + switch ev.filter { + case _EVFILT_READ: + mode += 'r' + + // On some systems when the read end of a pipe + // is closed the write end will not get a + // _EVFILT_WRITE event, but will get a + // _EVFILT_READ event with EV_EOF set. + // Note that setting 'w' here just means that we + // will wake up a goroutine waiting to write; + // that goroutine will try the write again, + // and the appropriate thing will happen based + // on what that write returns (success, EPIPE, EAGAIN). 
+ if ev.flags&_EV_EOF != 0 { + mode += 'w' + } + case _EVFILT_WRITE: + mode += 'w' + } + if mode != 0 { + pd := (*pollDesc)(unsafe.Pointer(ev.udata)) + pd.setEventErr(ev.flags == _EV_ERROR) + netpollready(&toRun, pd, mode) + } + } + return toRun +} diff --git a/src/runtime/netpoll_os_test.go b/src/runtime/netpoll_os_test.go new file mode 100644 index 0000000..b96b9f3 --- /dev/null +++ b/src/runtime/netpoll_os_test.go @@ -0,0 +1,28 @@ +package runtime_test + +import ( + "runtime" + "sync" + "testing" +) + +var wg sync.WaitGroup + +func init() { + runtime.NetpollGenericInit() +} + +func BenchmarkNetpollBreak(b *testing.B) { + b.StartTimer() + for i := 0; i < b.N; i++ { + for j := 0; j < 10; j++ { + wg.Add(1) + go func() { + runtime.NetpollBreak() + wg.Done() + }() + } + } + wg.Wait() + b.StopTimer() +} diff --git a/src/runtime/netpoll_solaris.go b/src/runtime/netpoll_solaris.go new file mode 100644 index 0000000..fad7f78 --- /dev/null +++ b/src/runtime/netpoll_solaris.go @@ -0,0 +1,318 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// Solaris runtime-integrated network poller. +// +// Solaris uses event ports for scalable network I/O. Event +// ports are level-triggered, unlike epoll and kqueue which +// can be configured in both level-triggered and edge-triggered +// mode. Level triggering means we have to keep track of a few things +// ourselves. After we receive an event for a file descriptor, +// it's our responsibility to ask again to be notified for future +// events for that descriptor. When doing this we must keep track of +// what kind of events the goroutines are currently interested in, +// for example a fd may be open both for reading and writing. +// +// A description of the high level operation of this code +// follows. 
Networking code will get a file descriptor by some means +// and will register it with the netpolling mechanism by a code path +// that eventually calls runtime·netpollopen. runtime·netpollopen +// calls port_associate with an empty event set. That means that we +// will not receive any events at this point. The association needs +// to be done at this early point because we need to process the I/O +// readiness notification at some point in the future. If I/O becomes +// ready when nobody is listening, when we finally care about it, +// nobody will tell us anymore. +// +// Beside calling runtime·netpollopen, the networking code paths +// will call runtime·netpollarm each time goroutines are interested +// in doing network I/O. Because now we know what kind of I/O we +// are interested in (reading/writing), we can call port_associate +// passing the correct type of event set (POLLIN/POLLOUT). As we made +// sure to have already associated the file descriptor with the port, +// when we now call port_associate, we will unblock the main poller +// loop (in runtime·netpoll) right away if the socket is actually +// ready for I/O. +// +// The main poller loop runs in its own thread waiting for events +// using port_getn. When an event happens, it will tell the scheduler +// about it using runtime·netpollready. Besides doing this, it must +// also re-associate the events that were not part of this current +// notification with the file descriptor. Failing to do this would +// mean each notification will prevent concurrent code using the +// same file descriptor in parallel. +// +// The logic dealing with re-associations is encapsulated in +// runtime·netpollupdate. This function takes care to associate the +// descriptor only with the subset of events that were previously +// part of the association, except the one that just happened. We +// can't re-associate with that right away, because event ports +// are level triggered so it would cause a busy loop. 
Instead, that +// association is effected only by the runtime·netpollarm code path, +// when Go code actually asks for I/O. +// +// The open and arming mechanisms are serialized using the lock +// inside PollDesc. This is required because the netpoll loop runs +// asynchronously in respect to other Go code and by the time we get +// to call port_associate to update the association in the loop, the +// file descriptor might have been closed and reopened already. The +// lock allows runtime·netpollupdate to be called synchronously from +// the loop thread while preventing other threads operating to the +// same PollDesc, so once we unblock in the main loop, until we loop +// again we know for sure we are always talking about the same file +// descriptor and can safely access the data we want (the event set). + +//go:cgo_import_dynamic libc_port_create port_create "libc.so" +//go:cgo_import_dynamic libc_port_associate port_associate "libc.so" +//go:cgo_import_dynamic libc_port_dissociate port_dissociate "libc.so" +//go:cgo_import_dynamic libc_port_getn port_getn "libc.so" +//go:cgo_import_dynamic libc_port_alert port_alert "libc.so" + +//go:linkname libc_port_create libc_port_create +//go:linkname libc_port_associate libc_port_associate +//go:linkname libc_port_dissociate libc_port_dissociate +//go:linkname libc_port_getn libc_port_getn +//go:linkname libc_port_alert libc_port_alert + +var ( + libc_port_create, + libc_port_associate, + libc_port_dissociate, + libc_port_getn, + libc_port_alert libcFunc + netpollWakeSig atomic.Uint32 // used to avoid duplicate calls of netpollBreak +) + +func errno() int32 { + return *getg().m.perrno +} + +func port_create() int32 { + return int32(sysvicall0(&libc_port_create)) +} + +func port_associate(port, source int32, object uintptr, events uint32, user uintptr) int32 { + return int32(sysvicall5(&libc_port_associate, uintptr(port), uintptr(source), object, uintptr(events), user)) +} + +func port_dissociate(port, source int32, 
object uintptr) int32 { + return int32(sysvicall3(&libc_port_dissociate, uintptr(port), uintptr(source), object)) +} + +func port_getn(port int32, evs *portevent, max uint32, nget *uint32, timeout *timespec) int32 { + return int32(sysvicall5(&libc_port_getn, uintptr(port), uintptr(unsafe.Pointer(evs)), uintptr(max), uintptr(unsafe.Pointer(nget)), uintptr(unsafe.Pointer(timeout)))) +} + +func port_alert(port int32, flags, events uint32, user uintptr) int32 { + return int32(sysvicall4(&libc_port_alert, uintptr(port), uintptr(flags), uintptr(events), user)) +} + +var portfd int32 = -1 + +func netpollinit() { + portfd = port_create() + if portfd >= 0 { + fcntl(portfd, _F_SETFD, _FD_CLOEXEC) + return + } + + print("runtime: port_create failed (errno=", errno(), ")\n") + throw("runtime: netpollinit failed") +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return fd == uintptr(portfd) +} + +func netpollopen(fd uintptr, pd *pollDesc) int32 { + lock(&pd.lock) + // We don't register for any specific type of events yet, that's + // netpollarm's job. We merely ensure we call port_associate before + // asynchronous connect/accept completes, so when we actually want + // to do any I/O, the call to port_associate (from netpollarm, + // with the interested event set) will unblock port_getn right away + // because of the I/O readiness notification. + pd.user = 0 + r := port_associate(portfd, _PORT_SOURCE_FD, fd, 0, uintptr(unsafe.Pointer(pd))) + unlock(&pd.lock) + return r +} + +func netpollclose(fd uintptr) int32 { + return port_dissociate(portfd, _PORT_SOURCE_FD, fd) +} + +// Updates the association with a new set of interested events. After +// this call, port_getn will return one and only one event for that +// particular descriptor, so this function needs to be called again. 
+func netpollupdate(pd *pollDesc, set, clear uint32) { + if pd.info().closing() { + return + } + + old := pd.user + events := (old & ^clear) | set + if old == events { + return + } + + if events != 0 && port_associate(portfd, _PORT_SOURCE_FD, pd.fd, events, uintptr(unsafe.Pointer(pd))) != 0 { + print("runtime: port_associate failed (errno=", errno(), ")\n") + throw("runtime: netpollupdate failed") + } + pd.user = events +} + +// subscribe the fd to the port such that port_getn will return one event. +func netpollarm(pd *pollDesc, mode int) { + lock(&pd.lock) + switch mode { + case 'r': + netpollupdate(pd, _POLLIN, 0) + case 'w': + netpollupdate(pd, _POLLOUT, 0) + default: + throw("runtime: bad mode") + } + unlock(&pd.lock) +} + +// netpollBreak interrupts a port_getn wait. +func netpollBreak() { + // Failing to cas indicates there is an in-flight wakeup, so we're done here. + if !netpollWakeSig.CompareAndSwap(0, 1) { + return + } + + // Use port_alert to put portfd into alert mode. + // This will wake up all threads sleeping in port_getn on portfd, + // and cause their calls to port_getn to return immediately. + // Further, until portfd is taken out of alert mode, + // all calls to port_getn will return immediately. + if port_alert(portfd, _PORT_ALERT_UPDATE, _POLLHUP, uintptr(unsafe.Pointer(&portfd))) < 0 { + if e := errno(); e != _EBUSY { + println("runtime: port_alert failed with", e) + throw("runtime: netpoll: port_alert failed") + } + } +} + +// netpoll checks for ready network connections. +// Returns list of goroutines that become runnable. +// delay < 0: blocks indefinitely +// delay == 0: does not block, just polls +// delay > 0: block for up to that many nanoseconds +func netpoll(delay int64) gList { + if portfd == -1 { + return gList{} + } + + var wait *timespec + var ts timespec + if delay < 0 { + wait = nil + } else if delay == 0 { + wait = &ts + } else { + ts.setNsec(delay) + if ts.tv_sec > 1e6 { + // An arbitrary cap on how long to wait for a timer. 
+ // 1e6 s == ~11.5 days. + ts.tv_sec = 1e6 + } + wait = &ts + } + + var events [128]portevent +retry: + var n uint32 = 1 + r := port_getn(portfd, &events[0], uint32(len(events)), &n, wait) + e := errno() + if r < 0 && e == _ETIME && n > 0 { + // As per port_getn(3C), an ETIME failure does not preclude the + // delivery of some number of events. Treat a timeout failure + // with delivered events as a success. + r = 0 + } + if r < 0 { + if e != _EINTR && e != _ETIME { + print("runtime: port_getn on fd ", portfd, " failed (errno=", e, ")\n") + throw("runtime: netpoll failed") + } + // If a timed sleep was interrupted and there are no events, + // just return to recalculate how long we should sleep now. + if delay > 0 { + return gList{} + } + goto retry + } + + var toRun gList + for i := 0; i < int(n); i++ { + ev := &events[i] + + if ev.portev_source == _PORT_SOURCE_ALERT { + if ev.portev_events != _POLLHUP || unsafe.Pointer(ev.portev_user) != unsafe.Pointer(&portfd) { + throw("runtime: netpoll: bad port_alert wakeup") + } + if delay != 0 { + // Now that a blocking call to netpoll + // has seen the alert, take portfd + // back out of alert mode. + // See the comment in netpollBreak. + if port_alert(portfd, 0, 0, 0) < 0 { + e := errno() + println("runtime: port_alert failed with", e) + throw("runtime: netpoll: port_alert failed") + } + netpollWakeSig.Store(0) + } + continue + } + + if ev.portev_events == 0 { + continue + } + pd := (*pollDesc)(unsafe.Pointer(ev.portev_user)) + + var mode, clear int32 + if (ev.portev_events & (_POLLIN | _POLLHUP | _POLLERR)) != 0 { + mode += 'r' + clear |= _POLLIN + } + if (ev.portev_events & (_POLLOUT | _POLLHUP | _POLLERR)) != 0 { + mode += 'w' + clear |= _POLLOUT + } + // To effect edge-triggered events, we need to be sure to + // update our association with whatever events were not + // set with the event. 
For example if we are registered + // for POLLIN|POLLOUT, and we get POLLIN, besides waking + // the goroutine interested in POLLIN we have to not forget + // about the one interested in POLLOUT. + if clear != 0 { + lock(&pd.lock) + netpollupdate(pd, 0, uint32(clear)) + unlock(&pd.lock) + } + + if mode != 0 { + // TODO(mikio): Consider implementing event + // scanning error reporting once we are sure + // about the event port on SmartOS. + // + // See golang.org/x/issue/30840. + netpollready(&toRun, pd, mode) + } + } + + return toRun +} diff --git a/src/runtime/netpoll_stub.go b/src/runtime/netpoll_stub.go new file mode 100644 index 0000000..14cf0c3 --- /dev/null +++ b/src/runtime/netpoll_stub.go @@ -0,0 +1,61 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build plan9 + +package runtime + +import "runtime/internal/atomic" + +var netpollInited atomic.Uint32 +var netpollWaiters atomic.Uint32 + +var netpollStubLock mutex +var netpollNote note + +// netpollBroken, protected by netpollBrokenLock, avoids a double notewakeup. +var netpollBrokenLock mutex +var netpollBroken bool + +func netpollGenericInit() { + netpollInited.Store(1) +} + +func netpollBreak() { + lock(&netpollBrokenLock) + broken := netpollBroken + netpollBroken = true + if !broken { + notewakeup(&netpollNote) + } + unlock(&netpollBrokenLock) +} + +// Polls for ready network connections. +// Returns list of goroutines that become runnable. +func netpoll(delay int64) gList { + // Implementation for platforms that do not support + // integrated network poller. + if delay != 0 { + // This lock ensures that only one goroutine tries to use + // the note. It should normally be completely uncontended. 
+ lock(&netpollStubLock) + + lock(&netpollBrokenLock) + noteclear(&netpollNote) + netpollBroken = false + unlock(&netpollBrokenLock) + + notetsleep(&netpollNote, delay) + unlock(&netpollStubLock) + // Guard against starvation in case the lock is contended + // (eg when running TestNetpollBreak). + osyield() + } + return gList{} +} + +func netpollinited() bool { + return netpollInited.Load() != 0 +} diff --git a/src/runtime/netpoll_windows.go b/src/runtime/netpoll_windows.go new file mode 100644 index 0000000..796bf1d --- /dev/null +++ b/src/runtime/netpoll_windows.go @@ -0,0 +1,159 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const _DWORD_MAX = 0xffffffff + +const _INVALID_HANDLE_VALUE = ^uintptr(0) + +// net_op must be the same as beginning of internal/poll.operation. +// Keep these in sync. +type net_op struct { + // used by windows + o overlapped + // used by netpoll + pd *pollDesc + mode int32 + errno int32 + qty uint32 +} + +type overlappedEntry struct { + key uintptr + op *net_op // In reality it's *overlapped, but we cast it to *net_op anyway. 
+ internal uintptr + qty uint32 +} + +var ( + iocphandle uintptr = _INVALID_HANDLE_VALUE // completion port io handle + + netpollWakeSig atomic.Uint32 // used to avoid duplicate calls of netpollBreak +) + +func netpollinit() { + iocphandle = stdcall4(_CreateIoCompletionPort, _INVALID_HANDLE_VALUE, 0, 0, _DWORD_MAX) + if iocphandle == 0 { + println("runtime: CreateIoCompletionPort failed (errno=", getlasterror(), ")") + throw("runtime: netpollinit failed") + } +} + +func netpollIsPollDescriptor(fd uintptr) bool { + return fd == iocphandle +} + +func netpollopen(fd uintptr, pd *pollDesc) int32 { + if stdcall4(_CreateIoCompletionPort, fd, iocphandle, 0, 0) == 0 { + return int32(getlasterror()) + } + return 0 +} + +func netpollclose(fd uintptr) int32 { + // nothing to do + return 0 +} + +func netpollarm(pd *pollDesc, mode int) { + throw("runtime: unused") +} + +func netpollBreak() { + // Failing to cas indicates there is an in-flight wakeup, so we're done here. + if !netpollWakeSig.CompareAndSwap(0, 1) { + return + } + + if stdcall4(_PostQueuedCompletionStatus, iocphandle, 0, 0, 0) == 0 { + println("runtime: netpoll: PostQueuedCompletionStatus failed (errno=", getlasterror(), ")") + throw("runtime: netpoll: PostQueuedCompletionStatus failed") + } +} + +// netpoll checks for ready network connections. +// Returns list of goroutines that become runnable. +// delay < 0: blocks indefinitely +// delay == 0: does not block, just polls +// delay > 0: block for up to that many nanoseconds +func netpoll(delay int64) gList { + var entries [64]overlappedEntry + var wait, qty, flags, n, i uint32 + var errno int32 + var op *net_op + var toRun gList + + mp := getg().m + + if iocphandle == _INVALID_HANDLE_VALUE { + return gList{} + } + if delay < 0 { + wait = _INFINITE + } else if delay == 0 { + wait = 0 + } else if delay < 1e6 { + wait = 1 + } else if delay < 1e15 { + wait = uint32(delay / 1e6) + } else { + // An arbitrary cap on how long to wait for a timer. 
+ // 1e9 ms == ~11.5 days. + wait = 1e9 + } + + n = uint32(len(entries) / int(gomaxprocs)) + if n < 8 { + n = 8 + } + if delay != 0 { + mp.blocked = true + } + if stdcall6(_GetQueuedCompletionStatusEx, iocphandle, uintptr(unsafe.Pointer(&entries[0])), uintptr(n), uintptr(unsafe.Pointer(&n)), uintptr(wait), 0) == 0 { + mp.blocked = false + errno = int32(getlasterror()) + if errno == _WAIT_TIMEOUT { + return gList{} + } + println("runtime: GetQueuedCompletionStatusEx failed (errno=", errno, ")") + throw("runtime: netpoll failed") + } + mp.blocked = false + for i = 0; i < n; i++ { + op = entries[i].op + if op != nil { + errno = 0 + qty = 0 + if stdcall5(_WSAGetOverlappedResult, op.pd.fd, uintptr(unsafe.Pointer(op)), uintptr(unsafe.Pointer(&qty)), 0, uintptr(unsafe.Pointer(&flags))) == 0 { + errno = int32(getlasterror()) + } + handlecompletion(&toRun, op, errno, qty) + } else { + netpollWakeSig.Store(0) + if delay == 0 { + // Forward the notification to the + // blocked poller. + netpollBreak() + } + } + } + return toRun +} + +func handlecompletion(toRun *gList, op *net_op, errno int32, qty uint32) { + mode := op.mode + if mode != 'r' && mode != 'w' { + println("runtime: GetQueuedCompletionStatusEx returned invalid mode=", mode) + throw("runtime: netpoll failed") + } + op.errno = errno + op.qty = qty + netpollready(toRun, op.pd, mode) +} diff --git a/src/runtime/norace_linux_test.go b/src/runtime/norace_linux_test.go new file mode 100644 index 0000000..3521b24 --- /dev/null +++ b/src/runtime/norace_linux_test.go @@ -0,0 +1,43 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The file contains tests that cannot run under race detector for some reason. 
+// +//go:build !race + +package runtime_test + +import ( + "internal/abi" + "runtime" + "testing" + "time" + "unsafe" +) + +var newOSProcDone bool + +//go:nosplit +func newOSProcCreated() { + newOSProcDone = true +} + +// Can't be run with -race because it inserts calls into newOSProcCreated() +// that require a valid G/M. +func TestNewOSProc0(t *testing.T) { + runtime.NewOSProc0(0x800000, unsafe.Pointer(abi.FuncPCABIInternal(newOSProcCreated))) + check := time.NewTicker(100 * time.Millisecond) + defer check.Stop() + end := time.After(5 * time.Second) + for { + select { + case <-check.C: + if newOSProcDone { + return + } + case <-end: + t.Fatalf("couldn't create new OS process") + } + } +} diff --git a/src/runtime/norace_test.go b/src/runtime/norace_test.go new file mode 100644 index 0000000..3b5eca5 --- /dev/null +++ b/src/runtime/norace_test.go @@ -0,0 +1,47 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The file contains tests that cannot run under race detector for some reason. +// +//go:build !race + +package runtime_test + +import ( + "runtime" + "testing" +) + +// Syscall tests split stack between Entersyscall and Exitsyscall under race detector. 
+func BenchmarkSyscall(b *testing.B) { + benchmarkSyscall(b, 0, 1) +} + +func BenchmarkSyscallWork(b *testing.B) { + benchmarkSyscall(b, 100, 1) +} + +func BenchmarkSyscallExcess(b *testing.B) { + benchmarkSyscall(b, 0, 4) +} + +func BenchmarkSyscallExcessWork(b *testing.B) { + benchmarkSyscall(b, 100, 4) +} + +func benchmarkSyscall(b *testing.B, work, excess int) { + b.SetParallelism(excess) + b.RunParallel(func(pb *testing.PB) { + foo := 42 + for pb.Next() { + runtime.Entersyscall() + for i := 0; i < work; i++ { + foo *= 2 + foo /= 2 + } + runtime.Exitsyscall() + } + _ = foo + }) +} diff --git a/src/runtime/numcpu_freebsd_test.go b/src/runtime/numcpu_freebsd_test.go new file mode 100644 index 0000000..e78890a --- /dev/null +++ b/src/runtime/numcpu_freebsd_test.go @@ -0,0 +1,15 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import "testing" + +func TestFreeBSDNumCPU(t *testing.T) { + got := runTestProg(t, "testprog", "FreeBSDNumCPU") + want := "OK\n" + if got != want { + t.Fatalf("expected %q, but got:\n%s", want, got) + } +} diff --git a/src/runtime/os2_aix.go b/src/runtime/os2_aix.go new file mode 100644 index 0000000..0e39b85 --- /dev/null +++ b/src/runtime/os2_aix.go @@ -0,0 +1,763 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file contains main runtime AIX syscalls. +// Pollset syscalls are in netpoll_aix.go. +// The implementation is based on Solaris and Windows. +// Each syscall is made by calling its libc symbol using asmcgocall and asmsyscall6 +// assembly functions. + +package runtime + +import ( + "unsafe" +) + +// Symbols imported for __start function. 
+ +//go:cgo_import_dynamic libc___n_pthreads __n_pthreads "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libc___mod_init __mod_init "libc.a/shr_64.o" +//go:linkname libc___n_pthreads libc___n_pthreads +//go:linkname libc___mod_init libc___mod_init + +var ( + libc___n_pthreads, + libc___mod_init libFunc +) + +// Syscalls + +//go:cgo_import_dynamic libc__Errno _Errno "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_clock_gettime clock_gettime "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_close close "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_exit exit "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_getpid getpid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_getsystemcfg getsystemcfg "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_kill kill "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_madvise madvise "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_malloc malloc "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_mmap mmap "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_mprotect mprotect "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_munmap munmap "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_open open "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_pipe pipe "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_raise raise "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_read read "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sched_yield sched_yield "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sem_init sem_init "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sem_post sem_post "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sem_timedwait sem_timedwait "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sem_wait sem_wait "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setitimer setitimer "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sigaction sigaction "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sigaltstack sigaltstack "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_sysconf sysconf "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_usleep usleep 
"libc.a/shr_64.o" +//go:cgo_import_dynamic libc_write write "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_getuid getuid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_geteuid geteuid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_getgid getgid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_getegid getegid "libc.a/shr_64.o" + +//go:cgo_import_dynamic libpthread___pth_init __pth_init "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_destroy pthread_attr_destroy "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_init pthread_attr_init "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_getstacksize pthread_attr_getstacksize "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_setstacksize pthread_attr_setstacksize "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_setdetachstate pthread_attr_setdetachstate "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_attr_setstackaddr pthread_attr_setstackaddr "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_create pthread_create "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_sigthreadmask sigthreadmask "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_self pthread_self "libpthread.a/shr_xpg5_64.o" +//go:cgo_import_dynamic libpthread_kill pthread_kill "libpthread.a/shr_xpg5_64.o" + +//go:linkname libc__Errno libc__Errno +//go:linkname libc_clock_gettime libc_clock_gettime +//go:linkname libc_close libc_close +//go:linkname libc_exit libc_exit +//go:linkname libc_getpid libc_getpid +//go:linkname libc_getsystemcfg libc_getsystemcfg +//go:linkname libc_kill libc_kill +//go:linkname libc_madvise libc_madvise +//go:linkname libc_malloc libc_malloc +//go:linkname libc_mmap libc_mmap +//go:linkname libc_mprotect libc_mprotect +//go:linkname libc_munmap libc_munmap +//go:linkname libc_open libc_open +//go:linkname libc_pipe libc_pipe +//go:linkname libc_raise libc_raise 
+//go:linkname libc_read libc_read +//go:linkname libc_sched_yield libc_sched_yield +//go:linkname libc_sem_init libc_sem_init +//go:linkname libc_sem_post libc_sem_post +//go:linkname libc_sem_timedwait libc_sem_timedwait +//go:linkname libc_sem_wait libc_sem_wait +//go:linkname libc_setitimer libc_setitimer +//go:linkname libc_sigaction libc_sigaction +//go:linkname libc_sigaltstack libc_sigaltstack +//go:linkname libc_sysconf libc_sysconf +//go:linkname libc_usleep libc_usleep +//go:linkname libc_write libc_write +//go:linkname libc_getuid libc_getuid +//go:linkname libc_geteuid libc_geteuid +//go:linkname libc_getgid libc_getgid +//go:linkname libc_getegid libc_getegid + +//go:linkname libpthread___pth_init libpthread___pth_init +//go:linkname libpthread_attr_destroy libpthread_attr_destroy +//go:linkname libpthread_attr_init libpthread_attr_init +//go:linkname libpthread_attr_getstacksize libpthread_attr_getstacksize +//go:linkname libpthread_attr_setstacksize libpthread_attr_setstacksize +//go:linkname libpthread_attr_setdetachstate libpthread_attr_setdetachstate +//go:linkname libpthread_attr_setstackaddr libpthread_attr_setstackaddr +//go:linkname libpthread_create libpthread_create +//go:linkname libpthread_sigthreadmask libpthread_sigthreadmask +//go:linkname libpthread_self libpthread_self +//go:linkname libpthread_kill libpthread_kill + +var ( + //libc + libc__Errno, + libc_clock_gettime, + libc_close, + libc_exit, + libc_getpid, + libc_getsystemcfg, + libc_kill, + libc_madvise, + libc_malloc, + libc_mmap, + libc_mprotect, + libc_munmap, + libc_open, + libc_pipe, + libc_raise, + libc_read, + libc_sched_yield, + libc_sem_init, + libc_sem_post, + libc_sem_timedwait, + libc_sem_wait, + libc_setitimer, + libc_sigaction, + libc_sigaltstack, + libc_sysconf, + libc_usleep, + libc_write, + libc_getuid, + libc_geteuid, + libc_getgid, + libc_getegid, + //libpthread + libpthread___pth_init, + libpthread_attr_destroy, + libpthread_attr_init, + 
 libpthread_attr_getstacksize, + libpthread_attr_setstacksize, + libpthread_attr_setdetachstate, + libpthread_attr_setstackaddr, + libpthread_create, + libpthread_sigthreadmask, + libpthread_self, + libpthread_kill libFunc +) + +type libFunc uintptr + +// asmsyscall6 calls the libc symbol using a C convention. +// It's defined in sys_aix_ppc64.s. +var asmsyscall6 libFunc + +// syscallX functions must always be called with g != nil and m != nil, +// as they rely on g.m.libcall to pass arguments to asmcgocall. +// The few cases where syscalls have no g or m must call their equivalent +// function in sys_aix_ppc64.s to handle them. + +//go:nowritebarrier +//go:nosplit +func syscall0(fn *libFunc) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 0, + args: uintptr(unsafe.Pointer(&fn)), // it's unused but must be non-nil, otherwise crashes + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +func syscall1(fn *libFunc, a0 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 1, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) +
+ if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +//go:cgo_unsafe_args +func syscall2(fn *libFunc, a0, a1 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 2, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +//go:cgo_unsafe_args +func syscall3(fn *libFunc, a0, a1, a2 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 3, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +//go:cgo_unsafe_args +func syscall4(fn *libFunc, a0, a1, a2, a3 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } 
+ + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 4, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +//go:cgo_unsafe_args +func syscall5(fn *libFunc, a0, a1, a2, a3, a4 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 5, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +//go:nowritebarrier +//go:nosplit +//go:cgo_unsafe_args +func syscall6(fn *libFunc, a0, a1, a2, a3, a4, a5 uintptr) (r, err uintptr) { + gp := getg() + mp := gp.m + resetLibcall := true + if mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + resetLibcall = false // See comment in sys_darwin.go:libcCall + } + + c := libcall{ + fn: uintptr(unsafe.Pointer(fn)), + n: 6, + args: uintptr(unsafe.Pointer(&a0)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + if resetLibcall { + mp.libcallsp = 0 + } + + return c.r1, c.err +} + +func exit1(code int32) + +//go:nosplit +func exit(code int32) { + gp := getg() + + // Check the validity of g because without a g during + // newosproc0. 
+ if gp != nil { + syscall1(&libc_exit, uintptr(code)) + return + } + exit1(code) +} + +func write2(fd, p uintptr, n int32) int32 + +//go:nosplit +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 { + gp := getg() + + // Check the validity of g because there is no g during + // newosproc0. + if gp != nil { + r, errno := syscall3(&libc_write, uintptr(fd), uintptr(p), uintptr(n)) + if int32(r) < 0 { + return -int32(errno) + } + return int32(r) + } + // Note that in this case we can't return a valid errno value. + return write2(fd, uintptr(p), n) + +} + +//go:nosplit +func read(fd int32, p unsafe.Pointer, n int32) int32 { + r, errno := syscall3(&libc_read, uintptr(fd), uintptr(p), uintptr(n)) + if int32(r) < 0 { + return -int32(errno) + } + return int32(r) +} + +//go:nosplit +func open(name *byte, mode, perm int32) int32 { + r, _ := syscall3(&libc_open, uintptr(unsafe.Pointer(name)), uintptr(mode), uintptr(perm)) + return int32(r) +} + +//go:nosplit +func closefd(fd int32) int32 { + r, _ := syscall1(&libc_close, uintptr(fd)) + return int32(r) +} + +//go:nosplit +func pipe() (r, w int32, errno int32) { + var p [2]int32 + _, err := syscall1(&libc_pipe, uintptr(noescape(unsafe.Pointer(&p[0])))) + return p[0], p[1], int32(err) +} + +// mmap calls the mmap system call. +// We only pass the lower 32 bits of file offset to the +// assembly routine; the higher bits (if required) should be provided +// by the assembly routine as 0. +// The err result is an OS error code such as ENOMEM.
+// +//go:nosplit +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (unsafe.Pointer, int) { + r, err0 := syscall6(&libc_mmap, uintptr(addr), uintptr(n), uintptr(prot), uintptr(flags), uintptr(fd), uintptr(off)) + if r == ^uintptr(0) { + return nil, int(err0) + } + return unsafe.Pointer(r), int(err0) +} + +//go:nosplit +func mprotect(addr unsafe.Pointer, n uintptr, prot int32) (unsafe.Pointer, int) { + r, err0 := syscall3(&libc_mprotect, uintptr(addr), uintptr(n), uintptr(prot)) + if r == ^uintptr(0) { + return nil, int(err0) + } + return unsafe.Pointer(r), int(err0) +} + +//go:nosplit +func munmap(addr unsafe.Pointer, n uintptr) { + r, err := syscall2(&libc_munmap, uintptr(addr), uintptr(n)) + if int32(r) == -1 { + println("syscall munmap failed: ", hex(err)) + throw("syscall munmap") + } +} + +//go:nosplit +func madvise(addr unsafe.Pointer, n uintptr, flags int32) { + r, err := syscall3(&libc_madvise, uintptr(addr), uintptr(n), uintptr(flags)) + if int32(r) == -1 { + println("syscall madvise failed: ", hex(err)) + throw("syscall madvise") + } +} + +func sigaction1(sig, new, old uintptr) + +//go:nosplit +func sigaction(sig uintptr, new, old *sigactiont) { + gp := getg() + + // Check the validity of g because without a g during + // runtime.libpreinit. 
+ if gp != nil { + r, err := syscall3(&libc_sigaction, sig, uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) + if int32(r) == -1 { + println("Sigaction failed for sig: ", sig, " with error:", hex(err)) + throw("syscall sigaction") + } + return + } + + sigaction1(sig, uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) +} + +//go:nosplit +func sigaltstack(new, old *stackt) { + r, err := syscall2(&libc_sigaltstack, uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) + if int32(r) == -1 { + println("syscall sigaltstack failed: ", hex(err)) + throw("syscall sigaltstack") + } +} + +//go:nosplit +//go:linkname internal_cpu_getsystemcfg internal/cpu.getsystemcfg +func internal_cpu_getsystemcfg(label uint) uint { + r, _ := syscall1(&libc_getsystemcfg, uintptr(label)) + return uint(r) +} + +func usleep1(us uint32) + +//go:nosplit +func usleep_no_g(us uint32) { + usleep1(us) +} + +//go:nosplit +func usleep(us uint32) { + r, err := syscall1(&libc_usleep, uintptr(us)) + if int32(r) == -1 { + println("syscall usleep failed: ", hex(err)) + throw("syscall usleep") + } +} + +//go:nosplit +func clock_gettime(clockid int32, tp *timespec) int32 { + r, _ := syscall2(&libc_clock_gettime, uintptr(clockid), uintptr(unsafe.Pointer(tp))) + return int32(r) +} + +//go:nosplit +func setitimer(mode int32, new, old *itimerval) { + r, err := syscall3(&libc_setitimer, uintptr(mode), uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) + if int32(r) == -1 { + println("syscall setitimer failed: ", hex(err)) + throw("syscall setitimer") + } +} + +//go:nosplit +func malloc(size uintptr) unsafe.Pointer { + r, _ := syscall1(&libc_malloc, size) + return unsafe.Pointer(r) +} + +//go:nosplit +func sem_init(sem *semt, pshared int32, value uint32) int32 { + r, _ := syscall3(&libc_sem_init, uintptr(unsafe.Pointer(sem)), uintptr(pshared), uintptr(value)) + return int32(r) +} + +//go:nosplit +func sem_wait(sem *semt) (int32, int32) { + r, err := syscall1(&libc_sem_wait, 
uintptr(unsafe.Pointer(sem))) + return int32(r), int32(err) +} + +//go:nosplit +func sem_post(sem *semt) int32 { + r, _ := syscall1(&libc_sem_post, uintptr(unsafe.Pointer(sem))) + return int32(r) +} + +//go:nosplit +func sem_timedwait(sem *semt, timeout *timespec) (int32, int32) { + r, err := syscall2(&libc_sem_timedwait, uintptr(unsafe.Pointer(sem)), uintptr(unsafe.Pointer(timeout))) + return int32(r), int32(err) +} + +//go:nosplit +func raise(sig uint32) { + r, err := syscall1(&libc_raise, uintptr(sig)) + if int32(r) == -1 { + println("syscall raise failed: ", hex(err)) + throw("syscall raise") + } +} + +//go:nosplit +func raiseproc(sig uint32) { + pid, err := syscall0(&libc_getpid) + if int32(pid) == -1 { + println("syscall getpid failed: ", hex(err)) + throw("syscall raiseproc") + } + + syscall2(&libc_kill, pid, uintptr(sig)) +} + +func osyield1() + +//go:nosplit +func osyield_no_g() { + osyield1() +} + +//go:nosplit +func osyield() { + r, err := syscall0(&libc_sched_yield) + if int32(r) == -1 { + println("syscall osyield failed: ", hex(err)) + throw("syscall osyield") + } +} + +//go:nosplit +func sysconf(name int32) uintptr { + r, _ := syscall1(&libc_sysconf, uintptr(name)) + if int32(r) == -1 { + throw("syscall sysconf") + } + return r + +} + +// pthread functions returns its error code in the main return value +// Therefore, err returns by syscall means nothing and must not be used + +//go:nosplit +func pthread_attr_destroy(attr *pthread_attr) int32 { + r, _ := syscall1(&libpthread_attr_destroy, uintptr(unsafe.Pointer(attr))) + return int32(r) +} + +func pthread_attr_init1(attr uintptr) int32 + +//go:nosplit +func pthread_attr_init(attr *pthread_attr) int32 { + gp := getg() + + // Check the validity of g because without a g during + // newosproc0. 
+ if gp != nil { + r, _ := syscall1(&libpthread_attr_init, uintptr(unsafe.Pointer(attr))) + return int32(r) + } + + return pthread_attr_init1(uintptr(unsafe.Pointer(attr))) +} + +func pthread_attr_setdetachstate1(attr uintptr, state int32) int32 + +//go:nosplit +func pthread_attr_setdetachstate(attr *pthread_attr, state int32) int32 { + gp := getg() + + // Check the validity of g because without a g during + // newosproc0. + if gp != nil { + r, _ := syscall2(&libpthread_attr_setdetachstate, uintptr(unsafe.Pointer(attr)), uintptr(state)) + return int32(r) + } + + return pthread_attr_setdetachstate1(uintptr(unsafe.Pointer(attr)), state) +} + +//go:nosplit +func pthread_attr_setstackaddr(attr *pthread_attr, stk unsafe.Pointer) int32 { + r, _ := syscall2(&libpthread_attr_setstackaddr, uintptr(unsafe.Pointer(attr)), uintptr(stk)) + return int32(r) +} + +//go:nosplit +func pthread_attr_getstacksize(attr *pthread_attr, size *uint64) int32 { + r, _ := syscall2(&libpthread_attr_getstacksize, uintptr(unsafe.Pointer(attr)), uintptr(unsafe.Pointer(size))) + return int32(r) +} + +func pthread_attr_setstacksize1(attr uintptr, size uint64) int32 + +//go:nosplit +func pthread_attr_setstacksize(attr *pthread_attr, size uint64) int32 { + gp := getg() + + // Check the validity of g because without a g during + // newosproc0. + if gp != nil { + r, _ := syscall2(&libpthread_attr_setstacksize, uintptr(unsafe.Pointer(attr)), uintptr(size)) + return int32(r) + } + + return pthread_attr_setstacksize1(uintptr(unsafe.Pointer(attr)), size) +} + +func pthread_create1(tid, attr, fn, arg uintptr) int32 + +//go:nosplit +func pthread_create(tid *pthread, attr *pthread_attr, fn *funcDescriptor, arg unsafe.Pointer) int32 { + gp := getg() + + // Check the validity of g because without a g during + // newosproc0. 
+ if gp != nil { + r, _ := syscall4(&libpthread_create, uintptr(unsafe.Pointer(tid)), uintptr(unsafe.Pointer(attr)), uintptr(unsafe.Pointer(fn)), uintptr(arg)) + return int32(r) + } + + return pthread_create1(uintptr(unsafe.Pointer(tid)), uintptr(unsafe.Pointer(attr)), uintptr(unsafe.Pointer(fn)), uintptr(arg)) +} + +// On multi-thread program, sigprocmask must not be called. +// It's replaced by sigthreadmask. +func sigprocmask1(how, new, old uintptr) + +//go:nosplit +func sigprocmask(how int32, new, old *sigset) { + gp := getg() + + // Check the validity of m because it might be called during a cgo + // callback early enough where m isn't available yet. + if gp != nil && gp.m != nil { + r, err := syscall3(&libpthread_sigthreadmask, uintptr(how), uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) + if int32(r) != 0 { + println("syscall sigthreadmask failed: ", hex(err)) + throw("syscall sigthreadmask") + } + return + } + sigprocmask1(uintptr(how), uintptr(unsafe.Pointer(new)), uintptr(unsafe.Pointer(old))) + +} + +//go:nosplit +func pthread_self() pthread { + r, _ := syscall0(&libpthread_self) + return pthread(r) +} + +//go:nosplit +func signalM(mp *m, sig int) { + syscall2(&libpthread_kill, uintptr(pthread(mp.procid)), uintptr(sig)) +} diff --git a/src/runtime/os2_freebsd.go b/src/runtime/os2_freebsd.go new file mode 100644 index 0000000..29f0b76 --- /dev/null +++ b/src/runtime/os2_freebsd.go @@ -0,0 +1,14 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _SS_DISABLE = 4 + _NSIG = 33 + _SI_USER = 0x10001 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 +) diff --git a/src/runtime/os2_openbsd.go b/src/runtime/os2_openbsd.go new file mode 100644 index 0000000..8656a91 --- /dev/null +++ b/src/runtime/os2_openbsd.go @@ -0,0 +1,14 @@ +// Copyright 2010 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _SS_DISABLE = 4 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 + _NSIG = 33 + _SI_USER = 0 +) diff --git a/src/runtime/os2_plan9.go b/src/runtime/os2_plan9.go new file mode 100644 index 0000000..58fb2be --- /dev/null +++ b/src/runtime/os2_plan9.go @@ -0,0 +1,74 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Plan 9-specific system calls + +package runtime + +// open +const ( + _OREAD = 0 + _OWRITE = 1 + _ORDWR = 2 + _OEXEC = 3 + _OTRUNC = 16 + _OCEXEC = 32 + _ORCLOSE = 64 + _OEXCL = 0x1000 +) + +// rfork +const ( + _RFNAMEG = 1 << 0 + _RFENVG = 1 << 1 + _RFFDG = 1 << 2 + _RFNOTEG = 1 << 3 + _RFPROC = 1 << 4 + _RFMEM = 1 << 5 + _RFNOWAIT = 1 << 6 + _RFCNAMEG = 1 << 10 + _RFCENVG = 1 << 11 + _RFCFDG = 1 << 12 + _RFREND = 1 << 13 + _RFNOMNT = 1 << 14 +) + +// notify +const ( + _NCONT = 0 + _NDFLT = 1 +) + +type uinptr _Plink + +type tos struct { + prof struct { // Per process profiling + pp *_Plink // known to be 0(ptr) + next *_Plink // known to be 4(ptr) + last *_Plink + first *_Plink + pid uint32 + what uint32 + } + cyclefreq uint64 // cycle clock frequency if there is one, 0 otherwise + kcycles int64 // cycles spent in kernel + pcycles int64 // cycles spent in process (kernel + user) + pid uint32 // might as well put the pid here + clock uint32 + // top of stack is here +} + +const ( + _NSIG = 14 // number of signals in sigtable array + _ERRMAX = 128 // max length of note string + + // Notes in runtime·sigtab that are handled by runtime·sigpanic. 
+ _SIGRFAULT = 2 + _SIGWFAULT = 3 + _SIGINTDIV = 4 + _SIGFLOAT = 5 + _SIGTRAP = 6 + _SIGPROF = 0 // dummy value defined for badsignal + _SIGQUIT = 0 // dummy value defined for sighandler +) diff --git a/src/runtime/os2_solaris.go b/src/runtime/os2_solaris.go new file mode 100644 index 0000000..108bea6 --- /dev/null +++ b/src/runtime/os2_solaris.go @@ -0,0 +1,13 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _SS_DISABLE = 2 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 + _NSIG = 73 /* number of signals in sigtable array */ + _SI_USER = 0 +) diff --git a/src/runtime/os3_plan9.go b/src/runtime/os3_plan9.go new file mode 100644 index 0000000..8c9cbe2 --- /dev/null +++ b/src/runtime/os3_plan9.go @@ -0,0 +1,166 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func sighandler(_ureg *ureg, note *byte, gp *g) int { + gsignal := getg() + mp := gsignal.m + + var t sigTabT + var docrash bool + var sig int + var flags int + var level int32 + + c := &sigctxt{_ureg} + notestr := gostringnocopy(note) + + // The kernel will never pass us a nil note or ureg so we probably + // made a mistake somewhere in sigtramp. + if _ureg == nil || note == nil { + print("sighandler: ureg ", _ureg, " note ", note, "\n") + goto Throw + } + // Check that the note is no more than ERRMAX bytes (including + // the trailing NUL). We should never receive a longer note. + if len(notestr) > _ERRMAX-1 { + print("sighandler: note is longer than ERRMAX\n") + goto Throw + } + if isAbortPC(c.pc()) { + // Never turn abort into a panic. 
+ goto Throw + } + // See if the note matches one of the patterns in sigtab. + // Notes that do not match any pattern can be handled at a higher + // level by the program but will otherwise be ignored. + flags = _SigNotify + for sig, t = range sigtable { + if hasPrefix(notestr, t.name) { + flags = t.flags + break + } + } + if flags&_SigPanic != 0 && gp.throwsplit { + // We can't safely sigpanic because it may grow the + // stack. Abort in the signal handler instead. + flags = (flags &^ _SigPanic) | _SigThrow + } + if flags&_SigGoExit != 0 { + exits((*byte)(add(unsafe.Pointer(note), 9))) // Strip "go: exit " prefix. + } + if flags&_SigPanic != 0 { + // Copy the error string from sigtramp's stack into m->notesig so + // we can reliably access it from the panic routines. + memmove(unsafe.Pointer(mp.notesig), unsafe.Pointer(note), uintptr(len(notestr)+1)) + gp.sig = uint32(sig) + gp.sigpc = c.pc() + + pc := c.pc() + sp := c.sp() + + // If we don't recognize the PC as code + // but we do recognize the top pointer on the stack as code, + // then assume this was a call to non-code and treat like + // pc == 0, to make unwinding show the context. + if pc != 0 && !findfunc(pc).valid() && findfunc(*(*uintptr)(unsafe.Pointer(sp))).valid() { + pc = 0 + } + + // IF LR exists, sigpanictramp must save it to the stack + // before entry to sigpanic so that panics in leaf + // functions are correctly handled. This will smash + // the stack frame but we're not going back there + // anyway. + if usesLR { + c.savelr(c.lr()) + } + + // If PC == 0, probably panicked because of a call to a nil func. + // Not faking that as the return address will make the trace look like a call + // to sigpanic instead. (Otherwise the trace will end at + // sigpanic and we won't get to see who faulted). 
+ if pc != 0 { + if usesLR { + c.setlr(pc) + } else { + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = pc + c.setsp(sp) + } + } + if usesLR { + c.setpc(abi.FuncPCABI0(sigpanictramp)) + } else { + c.setpc(abi.FuncPCABI0(sigpanic0)) + } + return _NCONT + } + if flags&_SigNotify != 0 { + if ignoredNote(note) { + return _NCONT + } + if sendNote(note) { + return _NCONT + } + } + if flags&_SigKill != 0 { + goto Exit + } + if flags&_SigThrow == 0 { + return _NCONT + } +Throw: + mp.throwing = throwTypeRuntime + mp.caughtsig.set(gp) + startpanic_m() + print(notestr, "\n") + print("PC=", hex(c.pc()), "\n") + print("\n") + level, _, docrash = gotraceback() + if level > 0 { + goroutineheader(gp) + tracebacktrap(c.pc(), c.sp(), c.lr(), gp) + tracebackothers(gp) + print("\n") + dumpregs(_ureg) + } + if docrash { + crash() + } +Exit: + goexitsall(note) + exits(note) + return _NDFLT // not reached +} + +func sigenable(sig uint32) { +} + +func sigdisable(sig uint32) { +} + +func sigignore(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { +} + +func setThreadCPUProfiler(hz int32) { + // TODO: Enable profiling interrupts. + getg().m.profilehz = hz +} + +// gsignalStack is unused on Plan 9. +type gsignalStack struct{} diff --git a/src/runtime/os3_solaris.go b/src/runtime/os3_solaris.go new file mode 100644 index 0000000..44ea7a2 --- /dev/null +++ b/src/runtime/os3_solaris.go @@ -0,0 +1,639 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +//go:cgo_export_dynamic runtime.end _end +//go:cgo_export_dynamic runtime.etext _etext +//go:cgo_export_dynamic runtime.edata _edata + +//go:cgo_import_dynamic libc____errno ___errno "libc.so" +//go:cgo_import_dynamic libc_clock_gettime clock_gettime "libc.so" +//go:cgo_import_dynamic libc_exit exit "libc.so" +//go:cgo_import_dynamic libc_getcontext getcontext "libc.so" +//go:cgo_import_dynamic libc_kill kill "libc.so" +//go:cgo_import_dynamic libc_madvise madvise "libc.so" +//go:cgo_import_dynamic libc_malloc malloc "libc.so" +//go:cgo_import_dynamic libc_mmap mmap "libc.so" +//go:cgo_import_dynamic libc_munmap munmap "libc.so" +//go:cgo_import_dynamic libc_open open "libc.so" +//go:cgo_import_dynamic libc_pthread_attr_destroy pthread_attr_destroy "libc.so" +//go:cgo_import_dynamic libc_pthread_attr_getstack pthread_attr_getstack "libc.so" +//go:cgo_import_dynamic libc_pthread_attr_init pthread_attr_init "libc.so" +//go:cgo_import_dynamic libc_pthread_attr_setdetachstate pthread_attr_setdetachstate "libc.so" +//go:cgo_import_dynamic libc_pthread_attr_setstack pthread_attr_setstack "libc.so" +//go:cgo_import_dynamic libc_pthread_create pthread_create "libc.so" +//go:cgo_import_dynamic libc_pthread_self pthread_self "libc.so" +//go:cgo_import_dynamic libc_pthread_kill pthread_kill "libc.so" +//go:cgo_import_dynamic libc_raise raise "libc.so" +//go:cgo_import_dynamic libc_read read "libc.so" +//go:cgo_import_dynamic libc_select select "libc.so" +//go:cgo_import_dynamic libc_sched_yield sched_yield "libc.so" +//go:cgo_import_dynamic libc_sem_init sem_init "libc.so" +//go:cgo_import_dynamic libc_sem_post sem_post "libc.so" +//go:cgo_import_dynamic libc_sem_reltimedwait_np sem_reltimedwait_np "libc.so" +//go:cgo_import_dynamic libc_sem_wait sem_wait "libc.so" +//go:cgo_import_dynamic libc_setitimer setitimer "libc.so" +//go:cgo_import_dynamic libc_sigaction 
sigaction "libc.so" +//go:cgo_import_dynamic libc_sigaltstack sigaltstack "libc.so" +//go:cgo_import_dynamic libc_sigprocmask sigprocmask "libc.so" +//go:cgo_import_dynamic libc_sysconf sysconf "libc.so" +//go:cgo_import_dynamic libc_usleep usleep "libc.so" +//go:cgo_import_dynamic libc_write write "libc.so" +//go:cgo_import_dynamic libc_pipe2 pipe2 "libc.so" + +//go:linkname libc____errno libc____errno +//go:linkname libc_clock_gettime libc_clock_gettime +//go:linkname libc_exit libc_exit +//go:linkname libc_getcontext libc_getcontext +//go:linkname libc_kill libc_kill +//go:linkname libc_madvise libc_madvise +//go:linkname libc_malloc libc_malloc +//go:linkname libc_mmap libc_mmap +//go:linkname libc_munmap libc_munmap +//go:linkname libc_open libc_open +//go:linkname libc_pthread_attr_destroy libc_pthread_attr_destroy +//go:linkname libc_pthread_attr_getstack libc_pthread_attr_getstack +//go:linkname libc_pthread_attr_init libc_pthread_attr_init +//go:linkname libc_pthread_attr_setdetachstate libc_pthread_attr_setdetachstate +//go:linkname libc_pthread_attr_setstack libc_pthread_attr_setstack +//go:linkname libc_pthread_create libc_pthread_create +//go:linkname libc_pthread_self libc_pthread_self +//go:linkname libc_pthread_kill libc_pthread_kill +//go:linkname libc_raise libc_raise +//go:linkname libc_read libc_read +//go:linkname libc_select libc_select +//go:linkname libc_sched_yield libc_sched_yield +//go:linkname libc_sem_init libc_sem_init +//go:linkname libc_sem_post libc_sem_post +//go:linkname libc_sem_reltimedwait_np libc_sem_reltimedwait_np +//go:linkname libc_sem_wait libc_sem_wait +//go:linkname libc_setitimer libc_setitimer +//go:linkname libc_sigaction libc_sigaction +//go:linkname libc_sigaltstack libc_sigaltstack +//go:linkname libc_sigprocmask libc_sigprocmask +//go:linkname libc_sysconf libc_sysconf +//go:linkname libc_usleep libc_usleep +//go:linkname libc_write libc_write +//go:linkname libc_pipe2 libc_pipe2 + +var ( + libc____errno, + 
libc_clock_gettime, + libc_exit, + libc_getcontext, + libc_kill, + libc_madvise, + libc_malloc, + libc_mmap, + libc_munmap, + libc_open, + libc_pthread_attr_destroy, + libc_pthread_attr_getstack, + libc_pthread_attr_init, + libc_pthread_attr_setdetachstate, + libc_pthread_attr_setstack, + libc_pthread_create, + libc_pthread_self, + libc_pthread_kill, + libc_raise, + libc_read, + libc_sched_yield, + libc_select, + libc_sem_init, + libc_sem_post, + libc_sem_reltimedwait_np, + libc_sem_wait, + libc_setitimer, + libc_sigaction, + libc_sigaltstack, + libc_sigprocmask, + libc_sysconf, + libc_usleep, + libc_write, + libc_pipe2 libcFunc +) + +var sigset_all = sigset{[4]uint32{^uint32(0), ^uint32(0), ^uint32(0), ^uint32(0)}} + +func getPageSize() uintptr { + n := int32(sysconf(__SC_PAGESIZE)) + if n <= 0 { + return 0 + } + return uintptr(n) +} + +func osinit() { + ncpu = getncpu() + if physPageSize == 0 { + physPageSize = getPageSize() + } +} + +func tstart_sysvicall(newm *m) uint32 + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + var ( + attr pthreadattr + oset sigset + tid pthread + ret int32 + size uint64 + ) + + if pthread_attr_init(&attr) != 0 { + throw("pthread_attr_init") + } + // Allocate a new 2MB stack. + if pthread_attr_setstack(&attr, 0, 0x200000) != 0 { + throw("pthread_attr_setstack") + } + // Read back the allocated stack. + if pthread_attr_getstack(&attr, unsafe.Pointer(&mp.g0.stack.hi), &size) != 0 { + throw("pthread_attr_getstack") + } + mp.g0.stack.lo = mp.g0.stack.hi - uintptr(size) + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + throw("pthread_attr_setdetachstate") + } + + // Disable signals during create, so that the new thread starts + // with signals disabled. It will enable them in minit. 
+ sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret = retryOnEAGAIN(func() int32 { + return pthread_create(&tid, &attr, abi.FuncPCABI0(tstart_sysvicall), unsafe.Pointer(mp)) + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", ret, ")\n") + if ret == _EAGAIN { + println("runtime: may need to increase max user processes (ulimit -u)") + } + throw("newosproc") + } +} + +func exitThread(wait *atomic.Uint32) { + // We should never reach exitThread on Solaris because we let + // libc clean up threads. + throw("exitThread") +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp +} + +func miniterrno() + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + asmcgocall(unsafe.Pointer(abi.FuncPCABI0(miniterrno)), unsafe.Pointer(&libc____errno)) + + minitSignals() + + getg().m.procid = uint64(pthread_self()) +} + +// Called from dropm to undo the effect of an minit. +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. 
+func mdestroy(mp *m) { +} + +func sigtramp() + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = abi.FuncPCABI0(sigtramp) + } + *((*uintptr)(unsafe.Pointer(&sa._funcptr))) = fn + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + var sa sigactiont + sigaction(i, nil, &sa) + if sa.sa_flags&_SA_ONSTACK != 0 { + return + } + sa.sa_flags |= _SA_ONSTACK + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return *((*uintptr)(unsafe.Pointer(&sa._funcptr))) +} + +// setSignalstackSP sets the ss_sp field of a stackt. +// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + *(*uintptr)(unsafe.Pointer(&s.ss_sp)) = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + mask.__sigbits[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + mask.__sigbits[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +//go:nosplit +func semacreate(mp *m) { + if mp.waitsema != 0 { + return + } + + var sem *semt + + // Call libc's malloc rather than malloc. This will + // allocate space on the C heap. We can't call malloc + // here because it could cause a deadlock. 
+ mp.libcall.fn = uintptr(unsafe.Pointer(&libc_malloc)) + mp.libcall.n = 1 + mp.scratch = mscratch{} + mp.scratch.v[0] = unsafe.Sizeof(*sem) + mp.libcall.args = uintptr(unsafe.Pointer(&mp.scratch)) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&mp.libcall)) + sem = (*semt)(unsafe.Pointer(mp.libcall.r1)) + if sem_init(sem, 0, 0) != 0 { + throw("sem_init") + } + mp.waitsema = uintptr(unsafe.Pointer(sem)) +} + +//go:nosplit +func semasleep(ns int64) int32 { + mp := getg().m + if ns >= 0 { + mp.ts.tv_sec = ns / 1000000000 + mp.ts.tv_nsec = ns % 1000000000 + + mp.libcall.fn = uintptr(unsafe.Pointer(&libc_sem_reltimedwait_np)) + mp.libcall.n = 2 + mp.scratch = mscratch{} + mp.scratch.v[0] = mp.waitsema + mp.scratch.v[1] = uintptr(unsafe.Pointer(&mp.ts)) + mp.libcall.args = uintptr(unsafe.Pointer(&mp.scratch)) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&mp.libcall)) + if *mp.perrno != 0 { + if *mp.perrno == _ETIMEDOUT || *mp.perrno == _EAGAIN || *mp.perrno == _EINTR { + return -1 + } + throw("sem_reltimedwait_np") + } + return 0 + } + for { + mp.libcall.fn = uintptr(unsafe.Pointer(&libc_sem_wait)) + mp.libcall.n = 1 + mp.scratch = mscratch{} + mp.scratch.v[0] = mp.waitsema + mp.libcall.args = uintptr(unsafe.Pointer(&mp.scratch)) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&mp.libcall)) + if mp.libcall.r1 == 0 { + break + } + if *mp.perrno == _EINTR { + continue + } + throw("sem_wait") + } + return 0 +} + +//go:nosplit +func semawakeup(mp *m) { + if sem_post((*semt)(unsafe.Pointer(mp.waitsema))) != 0 { + throw("sem_post") + } +} + +//go:nosplit +func closefd(fd int32) int32 { + return int32(sysvicall1(&libc_close, uintptr(fd))) +} + +//go:nosplit +func exit(r int32) { + sysvicall1(&libc_exit, uintptr(r)) +} + +//go:nosplit +func getcontext(context *ucontext) /* int32 */ { + sysvicall1(&libc_getcontext, uintptr(unsafe.Pointer(context))) +} + +//go:nosplit +func madvise(addr unsafe.Pointer, n uintptr, flags int32) { + 
sysvicall3(&libc_madvise, uintptr(addr), uintptr(n), uintptr(flags)) +} + +//go:nosplit +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (unsafe.Pointer, int) { + p, err := doMmap(uintptr(addr), n, uintptr(prot), uintptr(flags), uintptr(fd), uintptr(off)) + if p == ^uintptr(0) { + return nil, int(err) + } + return unsafe.Pointer(p), 0 +} + +//go:nosplit +//go:cgo_unsafe_args +func doMmap(addr, n, prot, flags, fd, off uintptr) (uintptr, uintptr) { + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(&libc_mmap)) + libcall.n = 6 + libcall.args = uintptr(noescape(unsafe.Pointer(&addr))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + return libcall.r1, libcall.err +} + +//go:nosplit +func munmap(addr unsafe.Pointer, n uintptr) { + sysvicall2(&libc_munmap, uintptr(addr), uintptr(n)) +} + +const ( + _CLOCK_REALTIME = 3 + _CLOCK_MONOTONIC = 4 +) + +//go:nosplit +func nanotime1() int64 { + var ts mts + sysvicall2(&libc_clock_gettime, _CLOCK_MONOTONIC, uintptr(unsafe.Pointer(&ts))) + return ts.tv_sec*1e9 + ts.tv_nsec +} + +//go:nosplit +func open(path *byte, mode, perm int32) int32 { + return int32(sysvicall3(&libc_open, uintptr(unsafe.Pointer(path)), uintptr(mode), uintptr(perm))) +} + +func pthread_attr_destroy(attr *pthreadattr) int32 { + return int32(sysvicall1(&libc_pthread_attr_destroy, uintptr(unsafe.Pointer(attr)))) +} + +func pthread_attr_getstack(attr *pthreadattr, addr unsafe.Pointer, size *uint64) int32 { + return int32(sysvicall3(&libc_pthread_attr_getstack, uintptr(unsafe.Pointer(attr)), uintptr(addr), uintptr(unsafe.Pointer(size)))) +} + +func pthread_attr_init(attr *pthreadattr) int32 { + return int32(sysvicall1(&libc_pthread_attr_init, uintptr(unsafe.Pointer(attr)))) +} + +func pthread_attr_setdetachstate(attr *pthreadattr, state int32) int32 { + return int32(sysvicall2(&libc_pthread_attr_setdetachstate, uintptr(unsafe.Pointer(attr)), uintptr(state))) +} + +func pthread_attr_setstack(attr 
*pthreadattr, addr uintptr, size uint64) int32 { + return int32(sysvicall3(&libc_pthread_attr_setstack, uintptr(unsafe.Pointer(attr)), uintptr(addr), uintptr(size))) +} + +func pthread_create(thread *pthread, attr *pthreadattr, fn uintptr, arg unsafe.Pointer) int32 { + return int32(sysvicall4(&libc_pthread_create, uintptr(unsafe.Pointer(thread)), uintptr(unsafe.Pointer(attr)), uintptr(fn), uintptr(arg))) +} + +func pthread_self() pthread { + return pthread(sysvicall0(&libc_pthread_self)) +} + +func signalM(mp *m, sig int) { + sysvicall2(&libc_pthread_kill, uintptr(pthread(mp.procid)), uintptr(sig)) +} + +//go:nosplit +//go:nowritebarrierrec +func raise(sig uint32) /* int32 */ { + sysvicall1(&libc_raise, uintptr(sig)) +} + +func raiseproc(sig uint32) /* int32 */ { + pid := sysvicall0(&libc_getpid) + sysvicall2(&libc_kill, pid, uintptr(sig)) +} + +//go:nosplit +func read(fd int32, buf unsafe.Pointer, nbyte int32) int32 { + r1, err := sysvicall3Err(&libc_read, uintptr(fd), uintptr(buf), uintptr(nbyte)) + if c := int32(r1); c >= 0 { + return c + } + return -int32(err) +} + +//go:nosplit +func sem_init(sem *semt, pshared int32, value uint32) int32 { + return int32(sysvicall3(&libc_sem_init, uintptr(unsafe.Pointer(sem)), uintptr(pshared), uintptr(value))) +} + +//go:nosplit +func sem_post(sem *semt) int32 { + return int32(sysvicall1(&libc_sem_post, uintptr(unsafe.Pointer(sem)))) +} + +//go:nosplit +func sem_reltimedwait_np(sem *semt, timeout *timespec) int32 { + return int32(sysvicall2(&libc_sem_reltimedwait_np, uintptr(unsafe.Pointer(sem)), uintptr(unsafe.Pointer(timeout)))) +} + +//go:nosplit +func sem_wait(sem *semt) int32 { + return int32(sysvicall1(&libc_sem_wait, uintptr(unsafe.Pointer(sem)))) +} + +func setitimer(which int32, value *itimerval, ovalue *itimerval) /* int32 */ { + sysvicall3(&libc_setitimer, uintptr(which), uintptr(unsafe.Pointer(value)), uintptr(unsafe.Pointer(ovalue))) +} + +//go:nosplit +//go:nowritebarrierrec +func sigaction(sig uint32, act 
*sigactiont, oact *sigactiont) /* int32 */ { + sysvicall3(&libc_sigaction, uintptr(sig), uintptr(unsafe.Pointer(act)), uintptr(unsafe.Pointer(oact))) +} + +//go:nosplit +//go:nowritebarrierrec +func sigaltstack(ss *stackt, oss *stackt) /* int32 */ { + sysvicall2(&libc_sigaltstack, uintptr(unsafe.Pointer(ss)), uintptr(unsafe.Pointer(oss))) +} + +//go:nosplit +//go:nowritebarrierrec +func sigprocmask(how int32, set *sigset, oset *sigset) /* int32 */ { + sysvicall3(&libc_sigprocmask, uintptr(how), uintptr(unsafe.Pointer(set)), uintptr(unsafe.Pointer(oset))) +} + +func sysconf(name int32) int64 { + return int64(sysvicall1(&libc_sysconf, uintptr(name))) +} + +func usleep1(usec uint32) + +//go:nosplit +func usleep_no_g(µs uint32) { + usleep1(µs) +} + +//go:nosplit +func usleep(µs uint32) { + usleep1(µs) +} + +func walltime() (sec int64, nsec int32) { + var ts mts + sysvicall2(&libc_clock_gettime, _CLOCK_REALTIME, uintptr(unsafe.Pointer(&ts))) + return ts.tv_sec, int32(ts.tv_nsec) +} + +//go:nosplit +func write1(fd uintptr, buf unsafe.Pointer, nbyte int32) int32 { + r1, err := sysvicall3Err(&libc_write, fd, uintptr(buf), uintptr(nbyte)) + if c := int32(r1); c >= 0 { + return c + } + return -int32(err) +} + +//go:nosplit +func pipe2(flags int32) (r, w int32, errno int32) { + var p [2]int32 + _, e := sysvicall2Err(&libc_pipe2, uintptr(noescape(unsafe.Pointer(&p))), uintptr(flags)) + return p[0], p[1], int32(e) +} + +//go:nosplit +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) { + r1, err := sysvicall3Err(&libc_fcntl, uintptr(fd), uintptr(cmd), uintptr(arg)) + return int32(r1), int32(err) +} + +//go:nosplit +func closeonexec(fd int32) { + fcntl(fd, _F_SETFD, _FD_CLOEXEC) +} + +func osyield1() + +//go:nosplit +func osyield_no_g() { + osyield1() +} + +//go:nosplit +func osyield() { + sysvicall0(&libc_sched_yield) +} + +//go:linkname executablePath os.executablePath +var executablePath string + +func sysargs(argc int32, argv **byte) { + n := argc + 1 + + // skip over 
argv, envp to get to auxv + for argv_index(argv, n) != nil { + n++ + } + + // skip NULL separator + n++ + + // now argv+n is auxv + auxv := (*[1 << 28]uintptr)(add(unsafe.Pointer(argv), uintptr(n)*goarch.PtrSize)) + sysauxv(auxv[:]) +} + +const ( + _AT_NULL = 0 // Terminates the vector + _AT_PAGESZ = 6 // Page size in bytes + _AT_SUN_EXECNAME = 2014 // exec() path name +) + +func sysauxv(auxv []uintptr) { + for i := 0; auxv[i] != _AT_NULL; i += 2 { + tag, val := auxv[i], auxv[i+1] + switch tag { + case _AT_PAGESZ: + physPageSize = val + case _AT_SUN_EXECNAME: + executablePath = gostringnocopy((*byte)(unsafe.Pointer(val))) + } + } +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_aix.go b/src/runtime/os_aix.go new file mode 100644 index 0000000..ad96ac3 --- /dev/null +++ b/src/runtime/os_aix.go @@ -0,0 +1,420 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "unsafe" +) + +const ( + threadStackSize = 0x100000 // size of a thread stack allocated by OS +) + +// funcDescriptor is a structure representing a function descriptor +// A variable with this type is always created in assembler +type funcDescriptor struct { + fn uintptr + toc uintptr + envPointer uintptr // unused in Golang +} + +type mOS struct { + waitsema uintptr // semaphore for parking on locks + perrno uintptr // pointer to tls errno +} + +//go:nosplit +func semacreate(mp *m) { + if mp.waitsema != 0 { + return + } + + var sem *semt + + // Call libc's malloc rather than malloc. This will + // allocate space on the C heap. We can't call mallocgc + // here because it could cause a deadlock. 
+ sem = (*semt)(malloc(unsafe.Sizeof(*sem))) + if sem_init(sem, 0, 0) != 0 { + throw("sem_init") + } + mp.waitsema = uintptr(unsafe.Pointer(sem)) +} + +//go:nosplit +func semasleep(ns int64) int32 { + mp := getg().m + if ns >= 0 { + var ts timespec + + if clock_gettime(_CLOCK_REALTIME, &ts) != 0 { + throw("clock_gettime") + } + ts.tv_sec += ns / 1e9 + ts.tv_nsec += ns % 1e9 + if ts.tv_nsec >= 1e9 { + ts.tv_sec++ + ts.tv_nsec -= 1e9 + } + + if r, err := sem_timedwait((*semt)(unsafe.Pointer(mp.waitsema)), &ts); r != 0 { + if err == _ETIMEDOUT || err == _EAGAIN || err == _EINTR { + return -1 + } + println("sem_timedwait err ", err, " ts.tv_sec ", ts.tv_sec, " ts.tv_nsec ", ts.tv_nsec, " ns ", ns, " id ", mp.id) + throw("sem_timedwait") + } + return 0 + } + for { + r1, err := sem_wait((*semt)(unsafe.Pointer(mp.waitsema))) + if r1 == 0 { + break + } + if err == _EINTR { + continue + } + throw("sem_wait") + } + return 0 +} + +//go:nosplit +func semawakeup(mp *m) { + if sem_post((*semt)(unsafe.Pointer(mp.waitsema))) != 0 { + throw("sem_post") + } +} + +func osinit() { + ncpu = int32(sysconf(__SC_NPROCESSORS_ONLN)) + physPageSize = sysconf(__SC_PAGE_SIZE) +} + +// newosproc0 is a version of newosproc that can be called before the runtime +// is initialized. +// +// This function is not safe to use after initialization as it does not pass an M as fnarg. +// +//go:nosplit +func newosproc0(stacksize uintptr, fn *funcDescriptor) { + var ( + attr pthread_attr + oset sigset + tid pthread + ) + + if pthread_attr_init(&attr) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + if pthread_attr_setstacksize(&attr, threadStackSize) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Disable signals during create, so that the new thread starts + // with signals disabled. It will enable them in minit. 
+ sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + var ret int32 + for tries := 0; tries < 20; tries++ { + // pthread_create can fail with EAGAIN for no reasons + // but it will be ok if it retries. + ret = pthread_create(&tid, &attr, fn, nil) + if ret != _EAGAIN { + break + } + usleep(uint32(tries+1) * 1000) // Milliseconds. + } + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + +} + +// Called to do synchronous initialization of Go code built with +// -buildmode=c-archive or -buildmode=c-shared. +// None of the Go runtime is initialized. +// +//go:nosplit +//go:nowritebarrierrec +func libpreinit() { + initsig(true) +} + +// Ms related functions +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) // AIX wants >= 8K + mp.gsignal.m = mp +} + +// errno address must be retrieved by calling _Errno libc function. +// This will return a pointer to errno. +func miniterrno() { + mp := getg().m + r, _ := syscall0(&libc__Errno) + mp.perrno = r + +} + +func minit() { + miniterrno() + minitSignals() + getg().m.procid = uint64(pthread_self()) +} + +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. +func mdestroy(mp *m) { +} + +// tstart is a function descriptor to _tstart defined in assembly. +var tstart funcDescriptor + +func newosproc(mp *m) { + var ( + attr pthread_attr + oset sigset + tid pthread + ) + + if pthread_attr_init(&attr) != 0 { + throw("pthread_attr_init") + } + + if pthread_attr_setstacksize(&attr, threadStackSize) != 0 { + throw("pthread_attr_getstacksize") + } + + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + throw("pthread_attr_setdetachstate") + } + + // Disable signals during create, so that the new thread starts + // with signals disabled. It will enable them in minit. 
+ sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret := retryOnEAGAIN(func() int32 { + return pthread_create(&tid, &attr, &tstart, unsafe.Pointer(mp)) + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", ret, ")\n") + if ret == _EAGAIN { + println("runtime: may need to increase max user processes (ulimit -u)") + } + throw("newosproc") + } + +} + +func exitThread(wait *atomic.Uint32) { + // We should never reach exitThread on AIX because we let + // libc clean up threads. + throw("exitThread") +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +/* SIGNAL */ + +const ( + _NSIG = 256 +) + +// sigtramp is a function descriptor to _sigtramp defined in assembly +var sigtramp funcDescriptor + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = uintptr(unsafe.Pointer(&sigtramp)) + } + sa.sa_handler = fn + sigaction(uintptr(i), &sa, nil) + +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + var sa sigactiont + sigaction(uintptr(i), nil, &sa) + if sa.sa_flags&_SA_ONSTACK != 0 { + return + } + sa.sa_flags |= _SA_ONSTACK + sigaction(uintptr(i), &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(uintptr(i), nil, &sa) + return sa.sa_handler +} + +// setSignalstackSP sets the ss_sp field of a stackt. 
+// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + *(*uintptr)(unsafe.Pointer(&s.ss_sp)) = sp +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { + switch sig { + case _SIGPIPE: + // For SIGPIPE, c.sigcode() isn't set to _SI_USER as on Linux. + // Therefore, raisebadsignal won't raise SIGPIPE again if + // it was deliver in a non-Go thread. + c.set_sigcode(_SI_USER) + } +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + (*mask)[(i-1)/64] |= 1 << ((uint32(i) - 1) & 63) +} + +func sigdelset(mask *sigset, i int) { + (*mask)[(i-1)/64] &^= 1 << ((uint32(i) - 1) & 63) +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +const ( + _CLOCK_REALTIME = 9 + _CLOCK_MONOTONIC = 10 +) + +//go:nosplit +func nanotime1() int64 { + tp := &timespec{} + if clock_gettime(_CLOCK_REALTIME, tp) != 0 { + throw("syscall clock_gettime failed") + } + return tp.tv_sec*1000000000 + tp.tv_nsec +} + +func walltime() (sec int64, nsec int32) { + ts := &timespec{} + if clock_gettime(_CLOCK_REALTIME, ts) != 0 { + throw("syscall clock_gettime failed") + } + return ts.tv_sec, int32(ts.tv_nsec) +} + +//go:nosplit +func fcntl(fd, cmd, arg int32) (int32, int32) { + r, errno := syscall3(&libc_fcntl, uintptr(fd), uintptr(cmd), uintptr(arg)) + return int32(r), int32(errno) +} + +//go:nosplit +func closeonexec(fd int32) { + fcntl(fd, _F_SETFD, _FD_CLOEXEC) +} + +//go:nosplit +func setNonblock(fd int32) { + flags, _ := fcntl(fd, _F_GETFL, 0) + if flags != -1 { + fcntl(fd, _F_SETFL, flags|_O_NONBLOCK) + } +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number.
+const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} + +//go:nosplit +func getuid() int32 { + r, errno := syscall0(&libc_getuid) + if errno != 0 { + print("getuid failed ", errno) + throw("getuid") + } + return int32(r) +} + +//go:nosplit +func geteuid() int32 { + r, errno := syscall0(&libc_geteuid) + if errno != 0 { + print("geteuid failed ", errno) + throw("geteuid") + } + return int32(r) +} + +//go:nosplit +func getgid() int32 { + r, errno := syscall0(&libc_getgid) + if errno != 0 { + print("getgid failed ", errno) + throw("getgid") + } + return int32(r) +} + +//go:nosplit +func getegid() int32 { + r, errno := syscall0(&libc_getegid) + if errno != 0 { + print("getegid failed ", errno) + throw("getegid") + } + return int32(r) +} diff --git a/src/runtime/os_android.go b/src/runtime/os_android.go new file mode 100644 index 0000000..52c8c86 --- /dev/null +++ b/src/runtime/os_android.go @@ -0,0 +1,15 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import _ "unsafe" // for go:cgo_export_static and go:cgo_export_dynamic + +// Export the main function. +// +// Used by the app package to start all-Go Android apps that are +// loaded via JNI. See golang.org/x/mobile/app. + +//go:cgo_export_static main.main +//go:cgo_export_dynamic main.main diff --git a/src/runtime/os_darwin.go b/src/runtime/os_darwin.go new file mode 100644 index 0000000..105de47 --- /dev/null +++ b/src/runtime/os_darwin.go @@ -0,0 +1,476 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "unsafe" +) + +type mOS struct { + initialized bool + mutex pthreadmutex + cond pthreadcond + count int +} + +func unimplemented(name string) { + println(name, "not implemented") + *(*int)(unsafe.Pointer(uintptr(1231))) = 1231 +} + +//go:nosplit +func semacreate(mp *m) { + if mp.initialized { + return + } + mp.initialized = true + if err := pthread_mutex_init(&mp.mutex, nil); err != 0 { + throw("pthread_mutex_init") + } + if err := pthread_cond_init(&mp.cond, nil); err != 0 { + throw("pthread_cond_init") + } +} + +//go:nosplit +func semasleep(ns int64) int32 { + var start int64 + if ns >= 0 { + start = nanotime() + } + mp := getg().m + pthread_mutex_lock(&mp.mutex) + for { + if mp.count > 0 { + mp.count-- + pthread_mutex_unlock(&mp.mutex) + return 0 + } + if ns >= 0 { + spent := nanotime() - start + if spent >= ns { + pthread_mutex_unlock(&mp.mutex) + return -1 + } + var t timespec + t.setNsec(ns - spent) + err := pthread_cond_timedwait_relative_np(&mp.cond, &mp.mutex, &t) + if err == _ETIMEDOUT { + pthread_mutex_unlock(&mp.mutex) + return -1 + } + } else { + pthread_cond_wait(&mp.cond, &mp.mutex) + } + } +} + +//go:nosplit +func semawakeup(mp *m) { + pthread_mutex_lock(&mp.mutex) + mp.count++ + if mp.count > 0 { + pthread_cond_signal(&mp.cond) + } + pthread_mutex_unlock(&mp.mutex) +} + +// The read and write file descriptors used by the sigNote functions. +var sigNoteRead, sigNoteWrite int32 + +// sigNoteSetup initializes a single, there-can-only-be-one, async-signal-safe note. +// +// The current implementation of notes on Darwin is not async-signal-safe, +// because the functions pthread_mutex_lock, pthread_cond_signal, and +// pthread_mutex_unlock, called by semawakeup, are not async-signal-safe. +// There is only one case where we need to wake up a note from a signal +// handler: the sigsend function. The signal handler code does not require +// all the features of notes: it does not need to do a timed wait. 
+// This is a separate implementation of notes, based on a pipe, that does +// not support timed waits but is async-signal-safe. +func sigNoteSetup(*note) { + if sigNoteRead != 0 || sigNoteWrite != 0 { + // Generalizing this would require avoiding the pipe-fork-closeonexec race, which entangles syscall. + throw("duplicate sigNoteSetup") + } + var errno int32 + sigNoteRead, sigNoteWrite, errno = pipe() + if errno != 0 { + throw("pipe failed") + } + closeonexec(sigNoteRead) + closeonexec(sigNoteWrite) + + // Make the write end of the pipe non-blocking, so that if the pipe + // buffer is somehow full we will not block in the signal handler. + // Leave the read end of the pipe blocking so that we will block + // in sigNoteSleep. + setNonblock(sigNoteWrite) +} + +// sigNoteWakeup wakes up a thread sleeping on a note created by sigNoteSetup. +func sigNoteWakeup(*note) { + var b byte + write(uintptr(sigNoteWrite), unsafe.Pointer(&b), 1) +} + +// sigNoteSleep waits for a note created by sigNoteSetup to be woken. +func sigNoteSleep(*note) { + for { + var b byte + entersyscallblock() + n := read(sigNoteRead, unsafe.Pointer(&b), 1) + exitsyscall() + if n != -_EINTR { + return + } + } +} + +// BSD interface for threading. +func osinit() { + // pthread_create delayed until end of goenvs so that we + // can look at the environment first. + + ncpu = getncpu() + physPageSize = getPageSize() + + osinit_hack() +} + +func sysctlbynameInt32(name []byte) (int32, int32) { + out := int32(0) + nout := unsafe.Sizeof(out) + ret := sysctlbyname(&name[0], (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + return ret, out +} + +//go:linkname internal_cpu_getsysctlbyname internal/cpu.getsysctlbyname +func internal_cpu_getsysctlbyname(name []byte) (int32, int32) { + return sysctlbynameInt32(name) +} + +const ( + _CTL_HW = 6 + _HW_NCPU = 3 + _HW_PAGESIZE = 7 +) + +func getncpu() int32 { + // Use sysctl to fetch hw.ncpu. 
+ mib := [2]uint32{_CTL_HW, _HW_NCPU} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret >= 0 && int32(out) > 0 { + return int32(out) + } + return 1 +} + +func getPageSize() uintptr { + // Use sysctl to fetch hw.pagesize. + mib := [2]uint32{_CTL_HW, _HW_PAGESIZE} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret >= 0 && int32(out) > 0 { + return uintptr(out) + } + return 0 +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrierrec +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " id=", mp.id, " ostk=", &mp, "\n") + } + + // Initialize an attribute object. + var attr pthreadattr + var err int32 + err = pthread_attr_init(&attr) + if err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Find out OS stack size for our own stack guard. + var stacksize uintptr + if pthread_attr_getstacksize(&attr, &stacksize) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + mp.g0.stack.hi = stacksize // for mstart + + // Tell the pthread library we won't join with this thread. + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Finally, create the thread. It starts at mstart_stub, which does some low-level + // setup and then calls mstart. 
+ var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + err = retryOnEAGAIN(func() int32 { + return pthread_create(&attr, abi.FuncPCABI0(mstart_stub), unsafe.Pointer(mp)) + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } +} + +// glue code to call mstart from pthread_create. +func mstart_stub() + +// newosproc0 is a version of newosproc that can be called before the runtime +// is initialized. +// +// This function is not safe to use after initialization as it does not pass an M as fnarg. +// +//go:nosplit +func newosproc0(stacksize uintptr, fn uintptr) { + // Initialize an attribute object. + var attr pthreadattr + var err int32 + err = pthread_attr_init(&attr) + if err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // The caller passes in a suggested stack size, + // from when we allocated the stack and thread ourselves, + // without libpthread. Now that we're using libpthread, + // we use the OS default stack size instead of the suggestion. + // Find out that stack size for our own stack guard. + if pthread_attr_getstacksize(&attr, &stacksize) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + g0.stack.hi = stacksize // for mstart + memstats.stacks_sys.add(int64(stacksize)) + + // Tell the pthread library we won't join with this thread. + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Finally, create the thread. It starts at mstart_stub, which does some low-level + // setup and then calls mstart. + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + err = pthread_create(&attr, fn, nil) + sigprocmask(_SIG_SETMASK, &oset, nil) + if err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } +} + +// Called to do synchronous initialization of Go code built with +// -buildmode=c-archive or -buildmode=c-shared. +// None of the Go runtime is initialized. 
+// +//go:nosplit +//go:nowritebarrierrec +func libpreinit() { + initsig(true) +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) // OS X wants >= 8K + mp.gsignal.m = mp + if GOOS == "darwin" && GOARCH == "arm64" { + // mlock the signal stack to work around a kernel bug where it may + // SIGILL when the signal stack is not faulted in while a signal + // arrives. See issue 42774. + mlock(unsafe.Pointer(mp.gsignal.stack.hi-physPageSize), physPageSize) + } +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + // iOS does not support alternate signal stack. + // The signal handler handles it directly. + if !(GOOS == "ios" && GOARCH == "arm64") { + minitSignalStack() + } + minitSignalMask() + getg().m.procid = uint64(pthread_self()) +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + // iOS does not support alternate signal stack. + // See minit. + if !(GOOS == "ios" && GOARCH == "arm64") { + unminitSignals() + } +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. 
+func mdestroy(mp *m) { +} + +//go:nosplit +func osyield_no_g() { + usleep_no_g(1) +} + +//go:nosplit +func osyield() { + usleep(1) +} + +const ( + _NSIG = 32 + _SI_USER = 0 /* empirically true, but not what headers say */ + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 + _SS_DISABLE = 4 +) + +//extern SigTabTT runtime·sigtab[]; + +type sigset uint32 + +var sigset_all = ^sigset(0) + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa usigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = ^uint32(0) + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + if iscgo { + fn = abi.FuncPCABI0(cgoSigtramp) + } else { + fn = abi.FuncPCABI0(sigtramp) + } + } + *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = fn + sigaction(i, &sa, nil) +} + +// sigtramp is the callback from libc when a signal is received. +// It is called with the C calling convention. +func sigtramp() +func cgoSigtramp() + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + var osa usigactiont + sigaction(i, nil, &osa) + handler := *(*uintptr)(unsafe.Pointer(&osa.__sigaction_u)) + if osa.sa_flags&_SA_ONSTACK != 0 { + return + } + var sa usigactiont + *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) = handler + sa.sa_mask = osa.sa_mask + sa.sa_flags = osa.sa_flags | _SA_ONSTACK + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa usigactiont + sigaction(i, nil, &sa) + return *(*uintptr)(unsafe.Pointer(&sa.__sigaction_u)) +} + +// setSignalstackSP sets the ss_sp field of a stackt. 
+// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + *(*uintptr)(unsafe.Pointer(&s.ss_sp)) = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + *mask |= 1 << (uint32(i) - 1) +} + +func sigdelset(mask *sigset, i int) { + *mask &^= 1 << (uint32(i) - 1) +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +//go:linkname executablePath os.executablePath +var executablePath string + +func sysargs(argc int32, argv **byte) { + // skip over argv, envv and the first string will be the path + n := argc + 1 + for argv_index(argv, n) != nil { + n++ + } + executablePath = gostringnocopy(argv_index(argv, n+1)) + + // strip "executable_path=" prefix if available, it's added after OS X 10.11. + const prefix = "executable_path=" + if len(executablePath) > len(prefix) && executablePath[:len(prefix)] == prefix { + executablePath = executablePath[len(prefix):] + } +} + +func signalM(mp *m, sig int) { + pthread_kill(pthread(mp.procid), uint32(sig)) +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_darwin_arm64.go b/src/runtime/os_darwin_arm64.go new file mode 100644 index 0000000..b808150 --- /dev/null +++ b/src/runtime/os_darwin_arm64.go @@ -0,0 +1,12 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. 
+ return nanotime() +} diff --git a/src/runtime/os_dragonfly.go b/src/runtime/os_dragonfly.go new file mode 100644 index 0000000..c14d904 --- /dev/null +++ b/src/runtime/os_dragonfly.go @@ -0,0 +1,342 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +const ( + _NSIG = 33 + _SI_USER = 0 + _SS_DISABLE = 4 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 +) + +type mOS struct{} + +//go:noescape +func lwp_create(param *lwpparams) int32 + +//go:noescape +func sigaltstack(new, old *stackt) + +//go:noescape +func sigaction(sig uint32, new, old *sigactiont) + +//go:noescape +func sigprocmask(how int32, new, old *sigset) + +//go:noescape +func setitimer(mode int32, new, old *itimerval) + +//go:noescape +func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 + +func raiseproc(sig uint32) + +func lwp_gettid() int32 +func lwp_kill(pid, tid int32, sig int) + +//go:noescape +func sys_umtx_sleep(addr *uint32, val, timeout int32) int32 + +//go:noescape +func sys_umtx_wakeup(addr *uint32, val int32) int32 + +func osyield() + +//go:nosplit +func osyield_no_g() { + osyield() +} + +func kqueue() int32 + +//go:noescape +func kevent(kq int32, ch *keventt, nch int32, ev *keventt, nev int32, ts *timespec) int32 + +func pipe2(flags int32) (r, w int32, errno int32) +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) +func closeonexec(fd int32) + +func issetugid() int32 + +// From DragonFly's <sys/sysctl.h> +const ( + _CTL_HW = 6 + _HW_NCPU = 3 + _HW_PAGESIZE = 7 +) + +var sigset_all = sigset{[4]uint32{^uint32(0), ^uint32(0), ^uint32(0), ^uint32(0)}} + +func getncpu() int32 { + mib := [2]uint32{_CTL_HW, _HW_NCPU} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret 
>= 0 { + return int32(out) + } + return 1 +} + +func getPageSize() uintptr { + mib := [2]uint32{_CTL_HW, _HW_PAGESIZE} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret >= 0 { + return uintptr(out) + } + return 0 +} + +//go:nosplit +func futexsleep(addr *uint32, val uint32, ns int64) { + systemstack(func() { + futexsleep1(addr, val, ns) + }) +} + +func futexsleep1(addr *uint32, val uint32, ns int64) { + var timeout int32 + if ns >= 0 { + // The timeout is specified in microseconds - ensure that we + // do not end up dividing to zero, which would put us to sleep + // indefinitely... + timeout = timediv(ns, 1000, nil) + if timeout == 0 { + timeout = 1 + } + } + + // sys_umtx_sleep will return EWOULDBLOCK (EAGAIN) when the timeout + // expires or EBUSY if the mutex value does not match. + ret := sys_umtx_sleep(addr, int32(val), timeout) + if ret >= 0 || ret == -_EINTR || ret == -_EAGAIN || ret == -_EBUSY { + return + } + + print("umtx_sleep addr=", addr, " val=", val, " ret=", ret, "\n") + *(*int32)(unsafe.Pointer(uintptr(0x1005))) = 0x1005 +} + +//go:nosplit +func futexwakeup(addr *uint32, cnt uint32) { + ret := sys_umtx_wakeup(addr, int32(cnt)) + if ret >= 0 { + return + } + + systemstack(func() { + print("umtx_wake_addr=", addr, " ret=", ret, "\n") + *(*int32)(unsafe.Pointer(uintptr(0x1006))) = 0x1006 + }) +} + +func lwp_start(uintptr) + +// May run with m.p==nil, so write barriers are not allowed. 
+// +//go:nowritebarrier +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " lwp_start=", abi.FuncPCABI0(lwp_start), " id=", mp.id, " ostk=", &mp, "\n") + } + + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + + params := lwpparams{ + start_func: abi.FuncPCABI0(lwp_start), + arg: unsafe.Pointer(mp), + stack: uintptr(stk), + tid1: nil, // minit will record tid + tid2: nil, + } + + // TODO: Check for error. + retryOnEAGAIN(func() int32 { + lwp_create(&params) + return 0 + }) + sigprocmask(_SIG_SETMASK, &oset, nil) +} + +func osinit() { + ncpu = getncpu() + if physPageSize == 0 { + physPageSize = getPageSize() + } +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + getg().m.procid = uint64(lwp_gettid()) + minitSignals() +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this.
+func mdestroy(mp *m) { +} + +func sigtramp() + +type sigactiont struct { + sa_sigaction uintptr + sa_flags int32 + sa_mask sigset +} + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = abi.FuncPCABI0(sigtramp) + } + sa.sa_sigaction = fn + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + throw("setsigstack") +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return sa.sa_sigaction +} + +// setSignalstackSP sets the ss_sp field of a stackt. +// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + s.ss_sp = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + mask.__bits[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + mask.__bits[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +func sysargs(argc int32, argv **byte) { + n := argc + 1 + + // skip over argv, envp to get to auxv + for argv_index(argv, n) != nil { + n++ + } + + // skip NULL separator + n++ + + auxv := (*[1 << 28]uintptr)(add(unsafe.Pointer(argv), uintptr(n)*goarch.PtrSize)) + sysauxv(auxv[:]) +} + +const ( + _AT_NULL = 0 + _AT_PAGESZ = 6 +) + +func sysauxv(auxv []uintptr) { + for i := 0; auxv[i] != _AT_NULL; i += 2 { + tag, val := auxv[i], auxv[i+1] + switch tag { + case _AT_PAGESZ: + physPageSize = val + } + } +} + +// raise sends a signal to the calling thread. 
+// +// It must be nosplit because it is used by the signal handler before +// it definitely has a Go stack. +// +//go:nosplit +func raise(sig uint32) { + lwp_kill(-1, lwp_gettid(), int(sig)) +} + +func signalM(mp *m, sig int) { + lwp_kill(-1, int32(mp.procid), sig) +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_freebsd.go b/src/runtime/os_freebsd.go new file mode 100644 index 0000000..a7288c5 --- /dev/null +++ b/src/runtime/os_freebsd.go @@ -0,0 +1,480 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +type mOS struct{} + +//go:noescape +func thr_new(param *thrparam, size int32) int32 + +//go:noescape +func sigaltstack(new, old *stackt) + +//go:noescape +func sigprocmask(how int32, new, old *sigset) + +//go:noescape +func setitimer(mode int32, new, old *itimerval) + +//go:noescape +func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 + +func raiseproc(sig uint32) + +func thr_self() thread +func thr_kill(tid thread, sig int) + +//go:noescape +func sys_umtx_op(addr *uint32, mode int32, val uint32, uaddr1 uintptr, ut *umtx_time) int32 + +func osyield() + +//go:nosplit +func osyield_no_g() { + osyield() +} + +func kqueue() int32 + +//go:noescape +func kevent(kq int32, ch *keventt, nch int32, ev *keventt, nev int32, ts *timespec) int32 + +func pipe2(flags int32) (r, w int32, errno int32) +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) +func closeonexec(fd int32) + +func issetugid() int32 + +// From FreeBSD's <sys/sysctl.h> +const ( + _CTL_HW = 6 + _HW_PAGESIZE = 7 +) + +var sigset_all = sigset{[4]uint32{^uint32(0), 
^uint32(0), ^uint32(0), ^uint32(0)}} + +// Undocumented numbers from FreeBSD's lib/libc/gen/sysctlnametomib.c. +const ( + _CTL_QUERY = 0 + _CTL_QUERY_MIB = 3 +) + +// sysctlnametomib fill mib with dynamically assigned sysctl entries of name, +// return count of effected mib slots, return 0 on error. +func sysctlnametomib(name []byte, mib *[_CTL_MAXNAME]uint32) uint32 { + oid := [2]uint32{_CTL_QUERY, _CTL_QUERY_MIB} + miblen := uintptr(_CTL_MAXNAME) + if sysctl(&oid[0], 2, (*byte)(unsafe.Pointer(mib)), &miblen, (*byte)(unsafe.Pointer(&name[0])), (uintptr)(len(name))) < 0 { + return 0 + } + miblen /= unsafe.Sizeof(uint32(0)) + if miblen <= 0 { + return 0 + } + return uint32(miblen) +} + +const ( + _CPU_CURRENT_PID = -1 // Current process ID. +) + +//go:noescape +func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 + +//go:systemstack +func getncpu() int32 { + // Use a large buffer for the CPU mask. We're on the system + // stack, so this is fine, and we can't allocate memory for a + // dynamically-sized buffer at this point. + const maxCPUs = 64 * 1024 + var mask [maxCPUs / 8]byte + var mib [_CTL_MAXNAME]uint32 + + // According to FreeBSD's /usr/src/sys/kern/kern_cpuset.c, + // cpuset_getaffinity return ERANGE when provided buffer size exceed the limits in kernel. + // Querying kern.smp.maxcpus to calculate maximum buffer size. + // See https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=200802 + + // Variable kern.smp.maxcpus introduced at Dec 23 2003, revision 123766, + // with dynamically assigned sysctl entries. + miblen := sysctlnametomib([]byte("kern.smp.maxcpus"), &mib) + if miblen == 0 { + return 1 + } + + // Query kern.smp.maxcpus. 
+ dstsize := uintptr(4) + maxcpus := uint32(0) + if sysctl(&mib[0], miblen, (*byte)(unsafe.Pointer(&maxcpus)), &dstsize, nil, 0) != 0 { + return 1 + } + + maskSize := int(maxcpus+7) / 8 + if maskSize < goarch.PtrSize { + maskSize = goarch.PtrSize + } + if maskSize > len(mask) { + maskSize = len(mask) + } + + if cpuset_getaffinity(_CPU_LEVEL_WHICH, _CPU_WHICH_PID, _CPU_CURRENT_PID, + maskSize, (*byte)(unsafe.Pointer(&mask[0]))) != 0 { + return 1 + } + n := int32(0) + for _, v := range mask[:maskSize] { + for v != 0 { + n += int32(v & 1) + v >>= 1 + } + } + if n == 0 { + return 1 + } + return n +} + +func getPageSize() uintptr { + mib := [2]uint32{_CTL_HW, _HW_PAGESIZE} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret >= 0 { + return uintptr(out) + } + return 0 +} + +// FreeBSD's umtx_op syscall is effectively the same as Linux's futex, and +// thus the code is largely similar. See Linux implementation +// and lock_futex.go for comments. + +//go:nosplit +func futexsleep(addr *uint32, val uint32, ns int64) { + systemstack(func() { + futexsleep1(addr, val, ns) + }) +} + +func futexsleep1(addr *uint32, val uint32, ns int64) { + var utp *umtx_time + if ns >= 0 { + var ut umtx_time + ut._clockid = _CLOCK_MONOTONIC + ut._timeout.setNsec(ns) + utp = &ut + } + ret := sys_umtx_op(addr, _UMTX_OP_WAIT_UINT_PRIVATE, val, unsafe.Sizeof(*utp), utp) + if ret >= 0 || ret == -_EINTR || ret == -_ETIMEDOUT { + return + } + print("umtx_wait addr=", addr, " val=", val, " ret=", ret, "\n") + *(*int32)(unsafe.Pointer(uintptr(0x1005))) = 0x1005 +} + +//go:nosplit +func futexwakeup(addr *uint32, cnt uint32) { + ret := sys_umtx_op(addr, _UMTX_OP_WAKE_PRIVATE, cnt, 0, nil) + if ret >= 0 { + return + } + + systemstack(func() { + print("umtx_wake_addr=", addr, " ret=", ret, "\n") + }) +} + +func thr_start() + +// May run with m.p==nil, so write barriers are not allowed. 
+// +//go:nowritebarrier +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " thr_start=", abi.FuncPCABI0(thr_start), " id=", mp.id, " ostk=", &mp, "\n") + } + + param := thrparam{ + start_func: abi.FuncPCABI0(thr_start), + arg: unsafe.Pointer(mp), + stack_base: mp.g0.stack.lo, + stack_size: uintptr(stk) - mp.g0.stack.lo, + child_tid: nil, // minit will record tid + parent_tid: nil, + tls_base: unsafe.Pointer(&mp.tls[0]), + tls_size: unsafe.Sizeof(mp.tls), + } + + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret := retryOnEAGAIN(func() int32 { + errno := thr_new(&param, int32(unsafe.Sizeof(param))) + // thr_new returns negative errno + return -errno + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", ret, ")\n") + throw("newosproc") + } +} + +// Version of newosproc that doesn't require a valid G. +// +//go:nosplit +func newosproc0(stacksize uintptr, fn unsafe.Pointer) { + stack := sysAlloc(stacksize, &memstats.stacks_sys) + if stack == nil { + writeErrStr(failallocatestack) + exit(1) + } + // This code "knows" it's being called once from the library + // initialization code, and so it's using the static m0 for the + // tls and procid (thread) pointers. thr_new() requires the tls + // pointers, though the tid pointers can be nil. + // However, newosproc0 is currently unreachable because builds + // utilizing c-shared/c-archive force external linking. + param := thrparam{ + start_func: uintptr(fn), + arg: nil, + stack_base: uintptr(stack), //+stacksize? 
+ stack_size: stacksize, + child_tid: nil, // minit will record tid + parent_tid: nil, + tls_base: unsafe.Pointer(&m0.tls[0]), + tls_size: unsafe.Sizeof(m0.tls), + } + + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret := thr_new(&param, int32(unsafe.Sizeof(param))) + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret < 0 { + writeErrStr(failthreadcreate) + exit(1) + } +} + +// Called to do synchronous initialization of Go code built with +// -buildmode=c-archive or -buildmode=c-shared. +// None of the Go runtime is initialized. +// +//go:nosplit +//go:nowritebarrierrec +func libpreinit() { + initsig(true) +} + +func osinit() { + ncpu = getncpu() + if physPageSize == 0 { + physPageSize = getPageSize() + } +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + getg().m.procid = uint64(thr_self()) + + // On FreeBSD before about April 2017 there was a bug such + // that calling execve from a thread other than the main + // thread did not reset the signal stack. That would confuse + // minitSignals, which calls minitSignalStack, which checks + // whether there is currently a signal stack and uses it if + // present. To avoid this confusion, explicitly disable the + // signal stack on the main thread when not running in a + // library. This can be removed when we are confident that all + // FreeBSD users are running a patched kernel. See issue #15658. 
+ if gp := getg(); !isarchive && !islibrary && gp.m == &m0 && gp == gp.m.g0 { + st := stackt{ss_flags: _SS_DISABLE} + sigaltstack(&st, nil) + } + + minitSignals() +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. +func mdestroy(mp *m) { +} + +func sigtramp() + +type sigactiont struct { + sa_handler uintptr + sa_flags int32 + sa_mask sigset +} + +// See os_freebsd2.go, os_freebsd_amd64.go for setsig function + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + var sa sigactiont + sigaction(i, nil, &sa) + if sa.sa_flags&_SA_ONSTACK != 0 { + return + } + sa.sa_flags |= _SA_ONSTACK + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return sa.sa_handler +} + +// setSignalstackSP sets the ss_sp field of a stackt. 
+// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + s.ss_sp = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + mask.__bits[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + mask.__bits[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +func sysargs(argc int32, argv **byte) { + n := argc + 1 + + // skip over argv, envp to get to auxv + for argv_index(argv, n) != nil { + n++ + } + + // skip NULL separator + n++ + + // now argv+n is auxv + auxv := (*[1 << 28]uintptr)(add(unsafe.Pointer(argv), uintptr(n)*goarch.PtrSize)) + sysauxv(auxv[:]) +} + +const ( + _AT_NULL = 0 // Terminates the vector + _AT_PAGESZ = 6 // Page size in bytes + _AT_TIMEKEEP = 22 // Pointer to timehands. + _AT_HWCAP = 25 // CPU feature flags + _AT_HWCAP2 = 26 // CPU feature flags 2 +) + +func sysauxv(auxv []uintptr) { + for i := 0; auxv[i] != _AT_NULL; i += 2 { + tag, val := auxv[i], auxv[i+1] + switch tag { + // _AT_NCPUS from auxv shouldn't be used due to golang.org/issue/15206 + case _AT_PAGESZ: + physPageSize = val + case _AT_TIMEKEEP: + timekeepSharedPage = (*vdsoTimekeep)(unsafe.Pointer(val)) + } + + archauxv(tag, val) + } +} + +// sysSigaction calls the sigaction system call. +// +//go:nosplit +func sysSigaction(sig uint32, new, old *sigactiont) { + // Use system stack to avoid split stack overflow on amd64 + if asmSigaction(uintptr(sig), new, old) != 0 { + systemstack(func() { + throw("sigaction failed") + }) + } +} + +// asmSigaction is implemented in assembly. +// +//go:noescape +func asmSigaction(sig uintptr, new, old *sigactiont) int32 + +// raise sends a signal to the calling thread. 
+// +// It must be nosplit because it is used by the signal handler before +// it definitely has a Go stack. +// +//go:nosplit +func raise(sig uint32) { + thr_kill(thr_self(), int(sig)) +} + +func signalM(mp *m, sig int) { + thr_kill(thread(mp.procid), sig) +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_freebsd2.go b/src/runtime/os_freebsd2.go new file mode 100644 index 0000000..3eaedf0 --- /dev/null +++ b/src/runtime/os_freebsd2.go @@ -0,0 +1,22 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build freebsd && !amd64 + +package runtime + +import "internal/abi" + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = abi.FuncPCABI0(sigtramp) + } + sa.sa_handler = fn + sigaction(i, &sa, nil) +} diff --git a/src/runtime/os_freebsd_amd64.go b/src/runtime/os_freebsd_amd64.go new file mode 100644 index 0000000..b179383 --- /dev/null +++ b/src/runtime/os_freebsd_amd64.go @@ -0,0 +1,26 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "internal/abi" + +func cgoSigtramp() + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + if iscgo { + fn = abi.FuncPCABI0(cgoSigtramp) + } else { + fn = abi.FuncPCABI0(sigtramp) + } + } + sa.sa_handler = fn + sigaction(i, &sa, nil) +} diff --git a/src/runtime/os_freebsd_arm.go b/src/runtime/os_freebsd_arm.go new file mode 100644 index 0000000..3feaa5e --- /dev/null +++ b/src/runtime/os_freebsd_arm.go @@ -0,0 +1,48 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "internal/cpu" + +const ( + _HWCAP_VFP = 1 << 6 + _HWCAP_VFPv3 = 1 << 13 +) + +func checkgoarm() { + if goarm > 5 && cpu.HWCap&_HWCAP_VFP == 0 { + print("runtime: this CPU has no floating point hardware, so it cannot run\n") + print("this GOARM=", goarm, " binary. Recompile using GOARM=5.\n") + exit(1) + } + if goarm > 6 && cpu.HWCap&_HWCAP_VFPv3 == 0 { + print("runtime: this CPU has no VFPv3 floating point hardware, so it cannot run\n") + print("this GOARM=", goarm, " binary. Recompile using GOARM=5 or GOARM=6.\n") + exit(1) + } + + // osinit not called yet, so ncpu not set: must use getncpu directly. + if getncpu() > 1 && goarm < 7 { + print("runtime: this system has multiple CPUs and must use\n") + print("atomic synchronization instructions. Recompile using GOARM=7.\n") + exit(1) + } +} + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + cpu.HWCap = uint(val) + case _AT_HWCAP2: + cpu.HWCap2 = uint(val) + } +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). 
+ // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_freebsd_arm64.go b/src/runtime/os_freebsd_arm64.go new file mode 100644 index 0000000..b5b25f0 --- /dev/null +++ b/src/runtime/os_freebsd_arm64.go @@ -0,0 +1,12 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_freebsd_noauxv.go b/src/runtime/os_freebsd_noauxv.go new file mode 100644 index 0000000..1d9452b --- /dev/null +++ b/src/runtime/os_freebsd_noauxv.go @@ -0,0 +1,10 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build freebsd && !arm + +package runtime + +func archauxv(tag, val uintptr) { +} diff --git a/src/runtime/os_freebsd_riscv64.go b/src/runtime/os_freebsd_riscv64.go new file mode 100644 index 0000000..0f2ed50 --- /dev/null +++ b/src/runtime/os_freebsd_riscv64.go @@ -0,0 +1,7 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +func osArchInit() {} diff --git a/src/runtime/os_illumos.go b/src/runtime/os_illumos.go new file mode 100644 index 0000000..c3c3e4e --- /dev/null +++ b/src/runtime/os_illumos.go @@ -0,0 +1,132 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "unsafe" +) + +//go:cgo_import_dynamic libc_getrctl getrctl "libc.so" +//go:cgo_import_dynamic libc_rctlblk_get_local_action rctlblk_get_local_action "libc.so" +//go:cgo_import_dynamic libc_rctlblk_get_local_flags rctlblk_get_local_flags "libc.so" +//go:cgo_import_dynamic libc_rctlblk_get_value rctlblk_get_value "libc.so" +//go:cgo_import_dynamic libc_rctlblk_size rctlblk_size "libc.so" + +//go:linkname libc_getrctl libc_getrctl +//go:linkname libc_rctlblk_get_local_action libc_rctlblk_get_local_action +//go:linkname libc_rctlblk_get_local_flags libc_rctlblk_get_local_flags +//go:linkname libc_rctlblk_get_value libc_rctlblk_get_value +//go:linkname libc_rctlblk_size libc_rctlblk_size + +var ( + libc_getrctl, + libc_rctlblk_get_local_action, + libc_rctlblk_get_local_flags, + libc_rctlblk_get_value, + libc_rctlblk_size libcFunc +) + +// Return the minimum value seen for the zone CPU cap, or 0 if no cap is +// detected. +func getcpucap() uint64 { + // The resource control block is an opaque object whose size is only + // known to libc. In practice, given the contents, it is unlikely to + // grow beyond 8KB so we'll use a static buffer of that size here. + const rblkmaxsize = 8 * 1024 + if rctlblk_size() > rblkmaxsize { + return 0 + } + + // The "zone.cpu-cap" resource control, as described in + // resource_controls(5), "sets a limit on the amount of CPU time that + // can be used by a zone. The unit used is the percentage of a single + // CPU that can be used by all user threads in a zone, expressed as an + // integer." A C string of the name must be passed to getrctl(2). + name := []byte("zone.cpu-cap\x00") + + // To iterate over the list of values for a particular resource + // control, we need two blocks: one for the previously read value and + // one for the next value. 
+ var rblk0 [rblkmaxsize]byte + var rblk1 [rblkmaxsize]byte + rblk := &rblk0[0] + rblkprev := &rblk1[0] + + var flag uint32 = _RCTL_FIRST + var capval uint64 = 0 + + for { + if getrctl(unsafe.Pointer(&name[0]), unsafe.Pointer(rblkprev), unsafe.Pointer(rblk), flag) != 0 { + // The end of the sequence is reported as an ENOENT + // failure, but determining the CPU cap is not critical + // here. We'll treat any failure as if it were the end + // of sequence. + break + } + + lflags := rctlblk_get_local_flags(unsafe.Pointer(rblk)) + action := rctlblk_get_local_action(unsafe.Pointer(rblk)) + if (lflags&_RCTL_LOCAL_MAXIMAL) == 0 && action == _RCTL_LOCAL_DENY { + // This is a finite (not maximal) value representing a + // cap (deny) action. + v := rctlblk_get_value(unsafe.Pointer(rblk)) + if capval == 0 || capval > v { + capval = v + } + } + + // Swap the blocks around so that we can fetch the next value + t := rblk + rblk = rblkprev + rblkprev = t + flag = _RCTL_NEXT + } + + return capval +} + +func getncpu() int32 { + n := int32(sysconf(__SC_NPROCESSORS_ONLN)) + if n < 1 { + return 1 + } + + if cents := int32(getcpucap()); cents > 0 { + // Convert from a percentage of CPUs to a number of CPUs, + // rounding up to make use of a fractional CPU + // e.g., 336% becomes 4 CPUs + ncap := (cents + 99) / 100 + if ncap < n { + return ncap + } + } + + return n +} + +//go:nosplit +func getrctl(controlname, oldbuf, newbuf unsafe.Pointer, flags uint32) uintptr { + return sysvicall4(&libc_getrctl, uintptr(controlname), uintptr(oldbuf), uintptr(newbuf), uintptr(flags)) +} + +//go:nosplit +func rctlblk_get_local_action(buf unsafe.Pointer) uintptr { + return sysvicall2(&libc_rctlblk_get_local_action, uintptr(buf), uintptr(0)) +} + +//go:nosplit +func rctlblk_get_local_flags(buf unsafe.Pointer) uintptr { + return sysvicall1(&libc_rctlblk_get_local_flags, uintptr(buf)) +} + +//go:nosplit +func rctlblk_get_value(buf unsafe.Pointer) uint64 { + return uint64(sysvicall1(&libc_rctlblk_get_value, 
uintptr(buf))) +} + +//go:nosplit +func rctlblk_size() uintptr { + return sysvicall0(&libc_rctlblk_size) +} diff --git a/src/runtime/os_js.go b/src/runtime/os_js.go new file mode 100644 index 0000000..7481fb9 --- /dev/null +++ b/src/runtime/os_js.go @@ -0,0 +1,167 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build js && wasm + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +func exit(code int32) + +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 { + if fd > 2 { + throw("runtime.write to fd > 2 is unsupported") + } + wasmWrite(fd, p, n) + return n +} + +// Stubs so tests can link correctly. These should never be called. +func open(name *byte, mode, perm int32) int32 { panic("not implemented") } +func closefd(fd int32) int32 { panic("not implemented") } +func read(fd int32, p unsafe.Pointer, n int32) int32 { panic("not implemented") } + +//go:noescape +func wasmWrite(fd uintptr, p unsafe.Pointer, n int32) + +func usleep(usec uint32) + +//go:nosplit +func usleep_no_g(usec uint32) { + usleep(usec) +} + +func exitThread(wait *atomic.Uint32) + +type mOS struct{} + +func osyield() + +//go:nosplit +func osyield_no_g() { + osyield() +} + +const _SIGSEGV = 0xb + +func sigpanic() { + gp := getg() + if !canpanic() { + throw("unexpected signal during runtime execution") + } + + // js only invokes the exception handler for memory faults. + gp.sig = _SIGSEGV + panicmem() +} + +type sigset struct{} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. 
+func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp +} + +//go:nosplit +func sigsave(p *sigset) { +} + +//go:nosplit +func msigrestore(sigmask sigset) { +} + +//go:nosplit +//go:nowritebarrierrec +func clearSignalHandlers() { +} + +//go:nosplit +func sigblock(exiting bool) { +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { +} + +// Called from dropm to undo the effect of an minit. +func unminit() { +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. +func mdestroy(mp *m) { +} + +func osinit() { + ncpu = 1 + getg().m.procid = 2 + physPageSize = 64 * 1024 +} + +// wasm has no signals +const _NSIG = 0 + +func signame(sig uint32) string { + return "" +} + +func crash() { + *(*int32)(nil) = 0 +} + +func getRandomData(r []byte) + +func goenvs() { + goenvs_unix() +} + +func initsig(preinit bool) { +} + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + throw("newosproc: not implemented") +} + +func setProcessCPUProfiler(hz int32) {} +func setThreadCPUProfiler(hz int32) {} +func sigdisable(uint32) {} +func sigenable(uint32) {} +func sigignore(uint32) {} + +//go:linkname os_sigpipe os.sigpipe +func os_sigpipe() { + throw("too many writes on closed pipe") +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} + +//go:linkname syscall_now syscall.now +func syscall_now() (sec int64, nsec int32) { + sec, nsec, _ = time_now() + return +} + +// gsignalStack is unused on js. +type gsignalStack struct{} + +const preemptMSupported = false + +func preemptM(mp *m) { + // No threads, so nothing to do. 
+} diff --git a/src/runtime/os_linux.go b/src/runtime/os_linux.go new file mode 100644 index 0000000..26db4a0 --- /dev/null +++ b/src/runtime/os_linux.go @@ -0,0 +1,920 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/syscall" + "unsafe" +) + +// sigPerThreadSyscall is the same signal (SIGSETXID) used by glibc for +// per-thread syscalls on Linux. We use it for the same purpose in non-cgo +// binaries. +const sigPerThreadSyscall = _SIGRTMIN + 1 + +type mOS struct { + // profileTimer holds the ID of the POSIX interval timer for profiling CPU + // usage on this thread. + // + // It is valid when the profileTimerValid field is true. A thread + // creates and manages its own timer, and these fields are read and written + // only by this thread. But because some of the reads on profileTimerValid + // are in signal handling code, this field should be atomic type. + profileTimer int32 + profileTimerValid atomic.Bool + + // needPerThreadSyscall indicates that a per-thread syscall is required + // for doAllThreadsSyscall. + needPerThreadSyscall atomic.Uint8 +} + +//go:noescape +func futex(addr unsafe.Pointer, op int32, val uint32, ts, addr2 unsafe.Pointer, val3 uint32) int32 + +// Linux futex. +// +// futexsleep(uint32 *addr, uint32 val) +// futexwakeup(uint32 *addr) +// +// Futexsleep atomically checks if *addr == val and if so, sleeps on addr. +// Futexwakeup wakes up threads sleeping on addr. +// Futexsleep is allowed to wake up spuriously. + +const ( + _FUTEX_PRIVATE_FLAG = 128 + _FUTEX_WAIT_PRIVATE = 0 | _FUTEX_PRIVATE_FLAG + _FUTEX_WAKE_PRIVATE = 1 | _FUTEX_PRIVATE_FLAG +) + +// Atomically, +// +// if(*addr == val) sleep +// +// Might be woken up spuriously; that's allowed. +// Don't sleep longer than ns; ns < 0 means forever. 
+// +//go:nosplit +func futexsleep(addr *uint32, val uint32, ns int64) { + // Some Linux kernels have a bug where futex of + // FUTEX_WAIT returns an internal error code + // as an errno. Libpthread ignores the return value + // here, and so can we: as it says a few lines up, + // spurious wakeups are allowed. + if ns < 0 { + futex(unsafe.Pointer(addr), _FUTEX_WAIT_PRIVATE, val, nil, nil, 0) + return + } + + var ts timespec + ts.setNsec(ns) + futex(unsafe.Pointer(addr), _FUTEX_WAIT_PRIVATE, val, unsafe.Pointer(&ts), nil, 0) +} + +// If any procs are sleeping on addr, wake up at most cnt. +// +//go:nosplit +func futexwakeup(addr *uint32, cnt uint32) { + ret := futex(unsafe.Pointer(addr), _FUTEX_WAKE_PRIVATE, cnt, nil, nil, 0) + if ret >= 0 { + return + } + + // I don't know that futex wakeup can return + // EAGAIN or EINTR, but if it does, it would be + // safe to loop and call futex again. + systemstack(func() { + print("futexwakeup addr=", addr, " returned ", ret, "\n") + }) + + *(*int32)(unsafe.Pointer(uintptr(0x1006))) = 0x1006 +} + +func getproccount() int32 { + // This buffer is huge (8 kB) but we are on the system stack + // and there should be plenty of space (64 kB). + // Also this is a leaf, so we're not holding up the memory for long. + // See golang.org/issue/11823. + // The suggested behavior here is to keep trying with ever-larger + // buffers, but we don't have a dynamic memory allocator at the + // moment, so that's a bit tricky and seems like overkill. + const maxCPUs = 64 * 1024 + var buf [maxCPUs / 8]byte + r := sched_getaffinity(0, unsafe.Sizeof(buf), &buf[0]) + if r < 0 { + return 1 + } + n := int32(0) + for _, v := range buf[:r] { + for v != 0 { + n += int32(v & 1) + v >>= 1 + } + } + if n == 0 { + n = 1 + } + return n +} + +// Clone, the Linux rfork. 
+const ( + _CLONE_VM = 0x100 + _CLONE_FS = 0x200 + _CLONE_FILES = 0x400 + _CLONE_SIGHAND = 0x800 + _CLONE_PTRACE = 0x2000 + _CLONE_VFORK = 0x4000 + _CLONE_PARENT = 0x8000 + _CLONE_THREAD = 0x10000 + _CLONE_NEWNS = 0x20000 + _CLONE_SYSVSEM = 0x40000 + _CLONE_SETTLS = 0x80000 + _CLONE_PARENT_SETTID = 0x100000 + _CLONE_CHILD_CLEARTID = 0x200000 + _CLONE_UNTRACED = 0x800000 + _CLONE_CHILD_SETTID = 0x1000000 + _CLONE_STOPPED = 0x2000000 + _CLONE_NEWUTS = 0x4000000 + _CLONE_NEWIPC = 0x8000000 + + // As of QEMU 2.8.0 (5ea2fc84d), user emulation requires all six of these + // flags to be set when creating a thread; attempts to share the other + // five but leave SYSVSEM unshared will fail with -EINVAL. + // + // In non-QEMU environments CLONE_SYSVSEM is inconsequential as we do not + // use System V semaphores. + + cloneFlags = _CLONE_VM | /* share memory */ + _CLONE_FS | /* share cwd, etc */ + _CLONE_FILES | /* share fd table */ + _CLONE_SIGHAND | /* share sig handler table */ + _CLONE_SYSVSEM | /* share SysV semaphore undo lists (see issue #20763) */ + _CLONE_THREAD /* revisit - okay for now */ +) + +//go:noescape +func clone(flags int32, stk, mp, gp, fn unsafe.Pointer) int32 + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + /* + * note: strace gets confused if we use CLONE_PTRACE here. + */ + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " clone=", abi.FuncPCABI0(clone), " id=", mp.id, " ostk=", &mp, "\n") + } + + // Disable signals during clone, so that the new thread starts + // with signals disabled. It will enable them in minit. + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret := retryOnEAGAIN(func() int32 { + r := clone(cloneFlags, stk, unsafe.Pointer(mp), unsafe.Pointer(mp.g0), unsafe.Pointer(abi.FuncPCABI0(mstart))) + // clone returns positive TID, negative errno. + // We don't care about the TID. 
+ if r >= 0 { + return 0 + } + return -r + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", ret, ")\n") + if ret == _EAGAIN { + println("runtime: may need to increase max user processes (ulimit -u)") + } + throw("newosproc") + } +} + +// Version of newosproc that doesn't require a valid G. +// +//go:nosplit +func newosproc0(stacksize uintptr, fn unsafe.Pointer) { + stack := sysAlloc(stacksize, &memstats.stacks_sys) + if stack == nil { + writeErrStr(failallocatestack) + exit(1) + } + ret := clone(cloneFlags, unsafe.Pointer(uintptr(stack)+stacksize), nil, nil, fn) + if ret < 0 { + writeErrStr(failthreadcreate) + exit(1) + } +} + +const ( + _AT_NULL = 0 // End of vector + _AT_PAGESZ = 6 // System physical page size + _AT_HWCAP = 16 // hardware capability bit vector + _AT_SECURE = 23 // secure mode boolean + _AT_RANDOM = 25 // introduced in 2.6.29 + _AT_HWCAP2 = 26 // hardware capability bit vector 2 +) + +var procAuxv = []byte("/proc/self/auxv\x00") + +var addrspace_vec [1]byte + +func mincore(addr unsafe.Pointer, n uintptr, dst *byte) int32 + +func sysargs(argc int32, argv **byte) { + n := argc + 1 + + // skip over argv, envp to get to auxv + for argv_index(argv, n) != nil { + n++ + } + + // skip NULL separator + n++ + + // now argv+n is auxv + auxv := (*[1 << 28]uintptr)(add(unsafe.Pointer(argv), uintptr(n)*goarch.PtrSize)) + if sysauxv(auxv[:]) != 0 { + return + } + // In some situations we don't get a loader-provided + // auxv, such as when loaded as a library on Android. + // Fall back to /proc/self/auxv. + fd := open(&procAuxv[0], 0 /* O_RDONLY */, 0) + if fd < 0 { + // On Android, /proc/self/auxv might be unreadable (issue 9229), so we fallback to + // try using mincore to detect the physical page size. + // mincore should return EINVAL when address is not a multiple of system page size. 
+ const size = 256 << 10 // size of memory region to allocate + p, err := mmap(nil, size, _PROT_READ|_PROT_WRITE, _MAP_ANON|_MAP_PRIVATE, -1, 0) + if err != 0 { + return + } + var n uintptr + for n = 4 << 10; n < size; n <<= 1 { + err := mincore(unsafe.Pointer(uintptr(p)+n), 1, &addrspace_vec[0]) + if err == 0 { + physPageSize = n + break + } + } + if physPageSize == 0 { + physPageSize = size + } + munmap(p, size) + return + } + var buf [128]uintptr + n = read(fd, noescape(unsafe.Pointer(&buf[0])), int32(unsafe.Sizeof(buf))) + closefd(fd) + if n < 0 { + return + } + // Make sure buf is terminated, even if we didn't read + // the whole file. + buf[len(buf)-2] = _AT_NULL + sysauxv(buf[:]) +} + +// startupRandomData holds random bytes initialized at startup. These come from +// the ELF AT_RANDOM auxiliary vector. +var startupRandomData []byte + +// secureMode holds the value of AT_SECURE passed in the auxiliary vector. +var secureMode bool + +func sysauxv(auxv []uintptr) int { + var i int + for ; auxv[i] != _AT_NULL; i += 2 { + tag, val := auxv[i], auxv[i+1] + switch tag { + case _AT_RANDOM: + // The kernel provides a pointer to 16-bytes + // worth of random data. 
+ startupRandomData = (*[16]byte)(unsafe.Pointer(val))[:] + + case _AT_PAGESZ: + physPageSize = val + + case _AT_SECURE: + secureMode = val == 1 + } + + archauxv(tag, val) + vdsoauxv(tag, val) + } + return i / 2 +} + +var sysTHPSizePath = []byte("/sys/kernel/mm/transparent_hugepage/hpage_pmd_size\x00") + +func getHugePageSize() uintptr { + var numbuf [20]byte + fd := open(&sysTHPSizePath[0], 0 /* O_RDONLY */, 0) + if fd < 0 { + return 0 + } + ptr := noescape(unsafe.Pointer(&numbuf[0])) + n := read(fd, ptr, int32(len(numbuf))) + closefd(fd) + if n <= 0 { + return 0 + } + n-- // remove trailing newline + v, ok := atoi(slicebytetostringtmp((*byte)(ptr), int(n))) + if !ok || v < 0 { + v = 0 + } + if v&(v-1) != 0 { + // v is not a power of 2 + return 0 + } + return uintptr(v) +} + +func osinit() { + ncpu = getproccount() + physHugePageSize = getHugePageSize() + if iscgo { + // #42494 glibc and musl reserve some signals for + // internal use and require they not be blocked by + // the rest of a normal C runtime. When the go runtime + // blocks...unblocks signals, temporarily, the blocked + // interval of time is generally very short. As such, + // these expectations of *libc code are mostly met by + // the combined go+cgo system of threads. However, + // when go causes a thread to exit, via a return from + // mstart(), the combined runtime can deadlock if + // these signals are blocked. Thus, don't block these + // signals when exiting threads. 
+ // - glibc: SIGCANCEL (32), SIGSETXID (33) + // - musl: SIGTIMER (32), SIGCANCEL (33), SIGSYNCCALL (34) + sigdelset(&sigsetAllExiting, 32) + sigdelset(&sigsetAllExiting, 33) + sigdelset(&sigsetAllExiting, 34) + } + osArchInit() +} + +var urandom_dev = []byte("/dev/urandom\x00") + +func getRandomData(r []byte) { + if startupRandomData != nil { + n := copy(r, startupRandomData) + extendRandom(r, n) + return + } + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to do synchronous initialization of Go code built with +// -buildmode=c-archive or -buildmode=c-shared. +// None of the Go runtime is initialized. +// +//go:nosplit +//go:nowritebarrierrec +func libpreinit() { + initsig(true) +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) // Linux wants >= 2K + mp.gsignal.m = mp +} + +func gettid() uint32 + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + minitSignals() + + // Cgo-created threads and the bootstrap m are missing a + // procid. We need this for asynchronous preemption and it's + // useful in debuggers. + getg().m.procid = uint64(gettid()) +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. 
+func mdestroy(mp *m) { +} + +//#ifdef GOARCH_386 +//#define sa_handler k_sa_handler +//#endif + +func sigreturn() +func sigtramp() // Called via C ABI +func cgoSigtramp() + +//go:noescape +func sigaltstack(new, old *stackt) + +//go:noescape +func setitimer(mode int32, new, old *itimerval) + +//go:noescape +func timer_create(clockid int32, sevp *sigevent, timerid *int32) int32 + +//go:noescape +func timer_settime(timerid int32, flags int32, new, old *itimerspec) int32 + +//go:noescape +func timer_delete(timerid int32) int32 + +//go:noescape +func rtsigprocmask(how int32, new, old *sigset, size int32) + +//go:nosplit +//go:nowritebarrierrec +func sigprocmask(how int32, new, old *sigset) { + rtsigprocmask(how, new, old, int32(unsafe.Sizeof(*new))) +} + +func raise(sig uint32) +func raiseproc(sig uint32) + +//go:noescape +func sched_getaffinity(pid, len uintptr, buf *byte) int32 +func osyield() + +//go:nosplit +func osyield_no_g() { + osyield() +} + +func pipe2(flags int32) (r, w int32, errno int32) + +//go:nosplit +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) { + r, _, err := syscall.Syscall6(syscall.SYS_FCNTL, uintptr(fd), uintptr(cmd), uintptr(arg), 0, 0, 0) + return int32(r), int32(err) +} + +const ( + _si_max_size = 128 + _sigev_max_size = 64 +) + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTORER | _SA_RESTART + sigfillset(&sa.sa_mask) + // Although Linux manpage says "sa_restorer element is obsolete and + // should not be used". x86_64 kernel requires it. Only use it on + // x86. 
+ if GOARCH == "386" || GOARCH == "amd64" { + sa.sa_restorer = abi.FuncPCABI0(sigreturn) + } + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + if iscgo { + fn = abi.FuncPCABI0(cgoSigtramp) + } else { + fn = abi.FuncPCABI0(sigtramp) + } + } + sa.sa_handler = fn + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + var sa sigactiont + sigaction(i, nil, &sa) + if sa.sa_flags&_SA_ONSTACK != 0 { + return + } + sa.sa_flags |= _SA_ONSTACK + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return sa.sa_handler +} + +// setSignalstackSP sets the ss_sp field of a stackt. +// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + *(*uintptr)(unsafe.Pointer(&s.ss_sp)) = sp +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +// sysSigaction calls the rt_sigaction system call. +// +//go:nosplit +func sysSigaction(sig uint32, new, old *sigactiont) { + if rt_sigaction(uintptr(sig), new, old, unsafe.Sizeof(sigactiont{}.sa_mask)) != 0 { + // Workaround for bugs in QEMU user mode emulation. + // + // QEMU turns calls to the sigaction system call into + // calls to the C library sigaction call; the C + // library call rejects attempts to call sigaction for + // SIGCANCEL (32) or SIGSETXID (33). + // + // QEMU rejects calling sigaction on SIGRTMAX (64). + // + // Just ignore the error in these cases. There isn't + // anything we can do about it anyhow. + if sig != 32 && sig != 33 && sig != 64 { + // Use system stack to avoid split stack overflow on ppc64/ppc64le. + systemstack(func() { + throw("sigaction failed") + }) + } + } +} + +// rt_sigaction is implemented in assembly.
+// +//go:noescape +func rt_sigaction(sig uintptr, new, old *sigactiont, size uintptr) int32 + +func getpid() int +func tgkill(tgid, tid, sig int) + +// signalM sends a signal to mp. +func signalM(mp *m, sig int) { + tgkill(getpid(), int(mp.procid), sig) +} + +// go118UseTimerCreateProfiler enables the per-thread CPU profiler. +const go118UseTimerCreateProfiler = true + +// validSIGPROF compares this signal delivery's code against the signal sources +// that the profiler uses, returning whether the delivery should be processed. +// To be processed, a signal delivery from a known profiling mechanism should +// correspond to the best profiling mechanism available to this thread. Signals +// from other sources are always considered valid. +// +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + code := int32(c.sigcode()) + setitimer := code == _SI_KERNEL + timer_create := code == _SI_TIMER + + if !(setitimer || timer_create) { + // The signal doesn't correspond to a profiling mechanism that the + // runtime enables itself. There's no reason to process it, but there's + // no reason to ignore it either. + return true + } + + if mp == nil { + // Since we don't have an M, we can't check if there's an active + // per-thread timer for this thread. We don't know how long this thread + // has been around, and if it happened to interact with the Go scheduler + // at a time when profiling was active (causing it to have a per-thread + // timer). But it may have never interacted with the Go scheduler, or + // never while profiling was active. To avoid double-counting, process + // only signals from setitimer. + // + // When a custom cgo traceback function has been registered (on + // platforms that support runtime.SetCgoTraceback), SIGPROF signals + // delivered to a thread that cannot find a matching M do this check in + // the assembly implementations of runtime.cgoSigtramp. 
+ return setitimer + } + + // Having an M means the thread interacts with the Go scheduler, and we can + // check whether there's an active per-thread timer for this thread. + if mp.profileTimerValid.Load() { + // If this M has its own per-thread CPU profiling interval timer, we + // should track the SIGPROF signals that come from that timer (for + // accurate reporting of its CPU usage; see issue 35057) and ignore any + // that it gets from the process-wide setitimer (to not over-count its + // CPU consumption). + return timer_create + } + + // No active per-thread timer means the only valid profiler is setitimer. + return setitimer +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + mp := getg().m + mp.profilehz = hz + + if !go118UseTimerCreateProfiler { + return + } + + // destroy any active timer + if mp.profileTimerValid.Load() { + timerid := mp.profileTimer + mp.profileTimerValid.Store(false) + mp.profileTimer = 0 + + ret := timer_delete(timerid) + if ret != 0 { + print("runtime: failed to disable profiling timer; timer_delete(", timerid, ") errno=", -ret, "\n") + throw("timer_delete") + } + } + + if hz == 0 { + // If the goal was to disable profiling for this thread, then the job's done. + return + } + + // The period of the timer should be 1/Hz. For every "1/Hz" of additional + // work, the user should expect one additional sample in the profile. + // + // But to scale down to very small amounts of application work, to observe + // even CPU usage of "one tenth" of the requested period, set the initial + // timing delay in a different way: So that "one tenth" of a period of CPU + // spend shows up as a 10% chance of one sample (for an expected value of + // 0.1 samples), and so that "two and six tenths" periods of CPU spend show + // up as a 60% chance of 3 samples and a 40% chance of 2 samples (for an + // expected value of 2.6). 
Set the initial delay to a value in the uniform + // random distribution between 0 and the desired period. And because "0" + // means "disable timer", add 1 so the half-open interval [0,period) turns + // into (0,period]. + // + // Otherwise, this would show up as a bias away from short-lived threads and + // from threads that are only occasionally active: for example, when the + // garbage collector runs on a mostly-idle system, the additional threads it + // activates may do a couple milliseconds of GC-related work and nothing + // else in the few seconds that the profiler observes. + spec := new(itimerspec) + spec.it_value.setNsec(1 + int64(fastrandn(uint32(1e9/hz)))) + spec.it_interval.setNsec(1e9 / int64(hz)) + + var timerid int32 + var sevp sigevent + sevp.notify = _SIGEV_THREAD_ID + sevp.signo = _SIGPROF + sevp.sigev_notify_thread_id = int32(mp.procid) + ret := timer_create(_CLOCK_THREAD_CPUTIME_ID, &sevp, &timerid) + if ret != 0 { + // If we cannot create a timer for this M, leave profileTimerValid false + // to fall back to the process-wide setitimer profiler. + return + } + + ret = timer_settime(timerid, 0, spec, nil) + if ret != 0 { + print("runtime: failed to configure profiling timer; timer_settime(", timerid, + ", 0, {interval: {", + spec.it_interval.tv_sec, "s + ", spec.it_interval.tv_nsec, "ns} value: {", + spec.it_value.tv_sec, "s + ", spec.it_value.tv_nsec, "ns}}, nil) errno=", -ret, "\n") + throw("timer_settime") + } + + mp.profileTimer = timerid + mp.profileTimerValid.Store(true) +} + +// perThreadSyscallArgs contains the system call number, arguments, and +// expected return values for a system call to be executed on all threads. +type perThreadSyscallArgs struct { + trap uintptr + a1 uintptr + a2 uintptr + a3 uintptr + a4 uintptr + a5 uintptr + a6 uintptr + r1 uintptr + r2 uintptr +} + +// perThreadSyscall is the system call to execute for the ongoing +// doAllThreadsSyscall.
+// +// perThreadSyscall may only be written while mp.needPerThreadSyscall == 0 on +// all Ms. +var perThreadSyscall perThreadSyscallArgs + +// syscall_runtime_doAllThreadsSyscall executes a specified system call on +// all Ms. +// +// The system call is expected to succeed and return the same value on every +// thread. If any threads do not match, the runtime throws. +// +//go:linkname syscall_runtime_doAllThreadsSyscall syscall.runtime_doAllThreadsSyscall +//go:uintptrescapes +func syscall_runtime_doAllThreadsSyscall(trap, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + if iscgo { + // In cgo, we are not aware of threads created in C, so this approach will not work. + panic("doAllThreadsSyscall not supported with cgo enabled") + } + + // STW to guarantee that user goroutines see an atomic change to thread + // state. Without STW, goroutines could migrate Ms while change is in + // progress and e.g., see state old -> new -> old -> new. + // + // N.B. Internally, this function does not depend on STW to + // successfully change every thread. It is only needed for user + // expectations, per above. + stopTheWorld("doAllThreadsSyscall") + + // This function depends on several properties: + // + // 1. All OS threads that already exist are associated with an M in + // allm. i.e., we won't miss any pre-existing threads. + // 2. All Ms listed in allm will eventually have an OS thread exist. + // i.e., they will set procid and be able to receive signals. + // 3. OS threads created after we read allm will clone from a thread + // that has executed the system call. i.e., they inherit the + // modified state. + // + // We achieve these through different mechanisms: + // + // 1. Addition of new Ms to allm in allocm happens before clone of its + // OS thread later in newm. + // 2. newm does acquirem to avoid being preempted, ensuring that new Ms + // created in allocm will eventually reach OS thread clone later in + // newm. + // 3.
We take allocmLock for write here to prevent allocation of new Ms + // while this function runs. Per (1), this prevents clone of OS + // threads that are not yet in allm. + allocmLock.lock() + + // Disable preemption, preventing us from changing Ms, as we handle + // this M specially. + // + // N.B. STW and lock() above do this as well, this is added for extra + // clarity. + acquirem() + + // N.B. allocmLock also prevents concurrent execution of this function, + // serializing use of perThreadSyscall, mp.needPerThreadSyscall, and + // ensuring all threads execute system calls from multiple calls in the + // same order. + + r1, r2, errno := syscall.Syscall6(trap, a1, a2, a3, a4, a5, a6) + if GOARCH == "ppc64" || GOARCH == "ppc64le" { + // TODO(https://go.dev/issue/51192 ): ppc64 doesn't use r2. + r2 = 0 + } + if errno != 0 { + releasem(getg().m) + allocmLock.unlock() + startTheWorld() + return r1, r2, errno + } + + perThreadSyscall = perThreadSyscallArgs{ + trap: trap, + a1: a1, + a2: a2, + a3: a3, + a4: a4, + a5: a5, + a6: a6, + r1: r1, + r2: r2, + } + + // Wait for all threads to start. + // + // As described above, some Ms have been added to allm prior to + // allocmLock, but not yet completed OS clone and set procid. + // + // At minimum we must wait for a thread to set procid before we can + // send it a signal. + // + // We take this one step further and wait for all threads to start + // before sending any signals. 
This prevents system calls from getting + // applied twice: once in the parent and once in the child, like so: + // + // A B C + // add C to allm + // doAllThreadsSyscall + // allocmLock.lock() + // signal B + // <receive signal> + // execute syscall + // <signal return> + // clone C + // <thread start> + // set procid + // signal C + // <receive signal> + // execute syscall + // <signal return> + // + // In this case, thread C inherited the syscall-modified state from + // thread B and did not need to execute the syscall, but did anyway + // because doAllThreadsSyscall could not be sure whether it was + // required. + // + // Some system calls may not be idempotent, so we ensure each thread + // executes the system call exactly once. + for mp := allm; mp != nil; mp = mp.alllink { + for atomic.Load64(&mp.procid) == 0 { + // Thread is starting. + osyield() + } + } + + // Signal every other thread, where they will execute perThreadSyscall + // from the signal handler. + gp := getg() + tid := gp.m.procid + for mp := allm; mp != nil; mp = mp.alllink { + if atomic.Load64(&mp.procid) == tid { + // Our thread already performed the syscall. + continue + } + mp.needPerThreadSyscall.Store(1) + signalM(mp, sigPerThreadSyscall) + } + + // Wait for all threads to complete. + for mp := allm; mp != nil; mp = mp.alllink { + if mp.procid == tid { + continue + } + for mp.needPerThreadSyscall.Load() != 0 { + osyield() + } + } + + perThreadSyscall = perThreadSyscallArgs{} + + releasem(getg().m) + allocmLock.unlock() + startTheWorld() + + return r1, r2, errno +} + +// runPerThreadSyscall runs perThreadSyscall for this M if required. +// +// This function throws if the system call returns with anything other than the +// expected values. 
+// +//go:nosplit +func runPerThreadSyscall() { + gp := getg() + if gp.m.needPerThreadSyscall.Load() == 0 { + return + } + + args := perThreadSyscall + r1, r2, errno := syscall.Syscall6(args.trap, args.a1, args.a2, args.a3, args.a4, args.a5, args.a6) + if GOARCH == "ppc64" || GOARCH == "ppc64le" { + // TODO(https://go.dev/issue/51192 ): ppc64 doesn't use r2. + r2 = 0 + } + if errno != 0 || r1 != args.r1 || r2 != args.r2 { + print("trap:", args.trap, ", a123456=[", args.a1, ",", args.a2, ",", args.a3, ",", args.a4, ",", args.a5, ",", args.a6, "]\n") + print("results: got {r1=", r1, ",r2=", r2, ",errno=", errno, "}, want {r1=", args.r1, ",r2=", args.r2, ",errno=0}\n") + fatal("AllThreadsSyscall6 results differ between threads; runtime corrupted") + } + + gp.m.needPerThreadSyscall.Store(0) +} + +const ( + _SI_USER = 0 + _SI_TKILL = -6 +) + +// sigFromUser reports whether the signal was sent because of a call +// to kill or tgkill. +// +//go:nosplit +func (c *sigctxt) sigFromUser() bool { + code := int32(c.sigcode()) + return code == _SI_USER || code == _SI_TKILL +} diff --git a/src/runtime/os_linux_arm.go b/src/runtime/os_linux_arm.go new file mode 100644 index 0000000..bd3ab44 --- /dev/null +++ b/src/runtime/os_linux_arm.go @@ -0,0 +1,51 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "internal/cpu" + +const ( + _HWCAP_VFP = 1 << 6 // introduced in at least 2.6.11 + _HWCAP_VFPv3 = 1 << 13 // introduced in 2.6.30 +) + +func vdsoCall() + +func checkgoarm() { + // On Android, /proc/self/auxv might be unreadable and hwcap won't + // reflect the CPU capabilities. Assume that every Android arm device + // has the necessary floating point hardware available. 
+ if GOOS == "android" { + return + } + if goarm > 5 && cpu.HWCap&_HWCAP_VFP == 0 { + print("runtime: this CPU has no floating point hardware, so it cannot run\n") + print("this GOARM=", goarm, " binary. Recompile using GOARM=5.\n") + exit(1) + } + if goarm > 6 && cpu.HWCap&_HWCAP_VFPv3 == 0 { + print("runtime: this CPU has no VFPv3 floating point hardware, so it cannot run\n") + print("this GOARM=", goarm, " binary. Recompile using GOARM=5 or GOARM=6.\n") + exit(1) + } +} + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + cpu.HWCap = uint(val) + case _AT_HWCAP2: + cpu.HWCap2 = uint(val) + } +} + +func osArchInit() {} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_linux_arm64.go b/src/runtime/os_linux_arm64.go new file mode 100644 index 0000000..2daa56f --- /dev/null +++ b/src/runtime/os_linux_arm64.go @@ -0,0 +1,25 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build arm64 + +package runtime + +import "internal/cpu" + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + cpu.HWCap = uint(val) + } +} + +func osArchInit() {} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_linux_be64.go b/src/runtime/os_linux_be64.go new file mode 100644 index 0000000..d8d4ac2 --- /dev/null +++ b/src/runtime/os_linux_be64.go @@ -0,0 +1,42 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// The standard Linux sigset type on big-endian 64-bit machines. + +//go:build linux && (ppc64 || s390x) + +package runtime + +const ( + _SS_DISABLE = 2 + _NSIG = 65 + _SIG_BLOCK = 0 + _SIG_UNBLOCK = 1 + _SIG_SETMASK = 2 +) + +type sigset uint64 + +var sigset_all = sigset(^uint64(0)) + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + if i > 64 { + throw("unexpected signal greater than 64") + } + *mask |= 1 << (uint(i) - 1) +} + +func sigdelset(mask *sigset, i int) { + if i > 64 { + throw("unexpected signal greater than 64") + } + *mask &^= 1 << (uint(i) - 1) +} + +//go:nosplit +func sigfillset(mask *uint64) { + *mask = ^uint64(0) +} diff --git a/src/runtime/os_linux_generic.go b/src/runtime/os_linux_generic.go new file mode 100644 index 0000000..15fafc1 --- /dev/null +++ b/src/runtime/os_linux_generic.go @@ -0,0 +1,37 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !mips && !mipsle && !mips64 && !mips64le && !s390x && !ppc64 && linux + +package runtime + +const ( + _SS_DISABLE = 2 + _NSIG = 65 + _SIG_BLOCK = 0 + _SIG_UNBLOCK = 1 + _SIG_SETMASK = 2 +) + +// It's hard to tease out exactly how big a Sigset is, but +// rt_sigprocmask crashes if we get it wrong, so if binaries +// are running, this is right. +type sigset [2]uint32 + +var sigset_all = sigset{^uint32(0), ^uint32(0)} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + (*mask)[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + (*mask)[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func sigfillset(mask *uint64) { + *mask = ^uint64(0) +} diff --git a/src/runtime/os_linux_loong64.go b/src/runtime/os_linux_loong64.go new file mode 100644 index 0000000..3d84e9a --- /dev/null +++ b/src/runtime/os_linux_loong64.go @@ -0,0 +1,18 @@ +// Copyright 2022 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && loong64 + +package runtime + +func archauxv(tag, val uintptr) {} + +func osArchInit() {} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_linux_mips64x.go b/src/runtime/os_linux_mips64x.go new file mode 100644 index 0000000..11d35bc --- /dev/null +++ b/src/runtime/os_linux_mips64x.go @@ -0,0 +1,52 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +package runtime + +import "internal/cpu" + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + cpu.HWCap = uint(val) + } +} + +func osArchInit() {} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} + +const ( + _SS_DISABLE = 2 + _NSIG = 129 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 +) + +type sigset [2]uint64 + +var sigset_all = sigset{^uint64(0), ^uint64(0)} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + (*mask)[(i-1)/64] |= 1 << ((uint32(i) - 1) & 63) +} + +func sigdelset(mask *sigset, i int) { + (*mask)[(i-1)/64] &^= 1 << ((uint32(i) - 1) & 63) +} + +//go:nosplit +func sigfillset(mask *[2]uint64) { + (*mask)[0], (*mask)[1] = ^uint64(0), ^uint64(0) +} diff --git a/src/runtime/os_linux_mipsx.go b/src/runtime/os_linux_mipsx.go new file mode 100644 index 0000000..cdf83ff --- /dev/null +++ b/src/runtime/os_linux_mipsx.go @@ -0,0 +1,46 @@ +// Copyright 2016 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips || mipsle) + +package runtime + +func archauxv(tag, val uintptr) { +} + +func osArchInit() {} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed fastrand(). + // nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} + +const ( + _SS_DISABLE = 2 + _NSIG = 128 + 1 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 +) + +type sigset [4]uint32 + +var sigset_all = sigset{^uint32(0), ^uint32(0), ^uint32(0), ^uint32(0)} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + (*mask)[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + (*mask)[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func sigfillset(mask *[4]uint32) { + (*mask)[0], (*mask)[1], (*mask)[2], (*mask)[3] = ^uint32(0), ^uint32(0), ^uint32(0), ^uint32(0) +} diff --git a/src/runtime/os_linux_noauxv.go b/src/runtime/os_linux_noauxv.go new file mode 100644 index 0000000..ff37727 --- /dev/null +++ b/src/runtime/os_linux_noauxv.go @@ -0,0 +1,10 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && !arm && !arm64 && !loong64 && !mips && !mipsle && !mips64 && !mips64le && !s390x && !ppc64 && !ppc64le + +package runtime + +func archauxv(tag, val uintptr) { +} diff --git a/src/runtime/os_linux_novdso.go b/src/runtime/os_linux_novdso.go new file mode 100644 index 0000000..d7e1ea0 --- /dev/null +++ b/src/runtime/os_linux_novdso.go @@ -0,0 +1,10 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && !386 && !amd64 && !arm && !arm64 && !loong64 && !mips64 && !mips64le && !ppc64 && !ppc64le && !riscv64 && !s390x + +package runtime + +func vdsoauxv(tag, val uintptr) { +} diff --git a/src/runtime/os_linux_ppc64x.go b/src/runtime/os_linux_ppc64x.go new file mode 100644 index 0000000..25d7ccc --- /dev/null +++ b/src/runtime/os_linux_ppc64x.go @@ -0,0 +1,23 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (ppc64 || ppc64le) + +package runtime + +import "internal/cpu" + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + // ppc64x doesn't have a 'cpuid' instruction + // equivalent and relies on HWCAP/HWCAP2 bits for + // hardware capabilities. + cpu.HWCap = uint(val) + case _AT_HWCAP2: + cpu.HWCap2 = uint(val) + } +} + +func osArchInit() {} diff --git a/src/runtime/os_linux_riscv64.go b/src/runtime/os_linux_riscv64.go new file mode 100644 index 0000000..9be88a5 --- /dev/null +++ b/src/runtime/os_linux_riscv64.go @@ -0,0 +1,7 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +func osArchInit() {} diff --git a/src/runtime/os_linux_s390x.go b/src/runtime/os_linux_s390x.go new file mode 100644 index 0000000..b9651f1 --- /dev/null +++ b/src/runtime/os_linux_s390x.go @@ -0,0 +1,16 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "internal/cpu" + +func archauxv(tag, val uintptr) { + switch tag { + case _AT_HWCAP: + cpu.HWCap = uint(val) + } +} + +func osArchInit() {} diff --git a/src/runtime/os_linux_x86.go b/src/runtime/os_linux_x86.go new file mode 100644 index 0000000..c88f61f --- /dev/null +++ b/src/runtime/os_linux_x86.go @@ -0,0 +1,9 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (386 || amd64) + +package runtime + +func osArchInit() {} diff --git a/src/runtime/os_netbsd.go b/src/runtime/os_netbsd.go new file mode 100644 index 0000000..d3ae1f8 --- /dev/null +++ b/src/runtime/os_netbsd.go @@ -0,0 +1,448 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +const ( + _SS_DISABLE = 4 + _SIG_BLOCK = 1 + _SIG_UNBLOCK = 2 + _SIG_SETMASK = 3 + _NSIG = 33 + _SI_USER = 0 + + // From NetBSD's <sys/ucontext.h> + _UC_SIGMASK = 0x01 + _UC_CPU = 0x04 + + // From <sys/lwp.h> + _LWP_DETACHED = 0x00000040 +) + +type mOS struct { + waitsemacount uint32 +} + +//go:noescape +func setitimer(mode int32, new, old *itimerval) + +//go:noescape +func sigaction(sig uint32, new, old *sigactiont) + +//go:noescape +func sigaltstack(new, old *stackt) + +//go:noescape +func sigprocmask(how int32, new, old *sigset) + +//go:noescape +func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 + +func lwp_tramp() + +func raiseproc(sig uint32) + +func lwp_kill(tid int32, sig int) + +//go:noescape +func getcontext(ctxt unsafe.Pointer) + +//go:noescape +func lwp_create(ctxt unsafe.Pointer, flags uintptr, lwpid unsafe.Pointer) int32 + +//go:noescape +func lwp_park(clockid, flags int32, ts *timespec, 
unpark int32, hint, unparkhint unsafe.Pointer) int32 + +//go:noescape +func lwp_unpark(lwp int32, hint unsafe.Pointer) int32 + +func lwp_self() int32 + +func osyield() + +//go:nosplit +func osyield_no_g() { + osyield() +} + +func kqueue() int32 + +//go:noescape +func kevent(kq int32, ch *keventt, nch int32, ev *keventt, nev int32, ts *timespec) int32 + +func pipe2(flags int32) (r, w int32, errno int32) +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) +func closeonexec(fd int32) + +func issetugid() int32 + +const ( + _ESRCH = 3 + _ETIMEDOUT = 60 + + // From NetBSD's <sys/time.h> + _CLOCK_REALTIME = 0 + _CLOCK_VIRTUAL = 1 + _CLOCK_PROF = 2 + _CLOCK_MONOTONIC = 3 + + _TIMER_RELTIME = 0 + _TIMER_ABSTIME = 1 +) + +var sigset_all = sigset{[4]uint32{^uint32(0), ^uint32(0), ^uint32(0), ^uint32(0)}} + +// From NetBSD's <sys/sysctl.h> +const ( + _CTL_KERN = 1 + _KERN_OSREV = 3 + + _CTL_HW = 6 + _HW_NCPU = 3 + _HW_PAGESIZE = 7 + _HW_NCPUONLINE = 16 +) + +func sysctlInt(mib []uint32) (int32, bool) { + var out int32 + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], uint32(len(mib)), (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret < 0 { + return 0, false + } + return out, true +} + +func getncpu() int32 { + if n, ok := sysctlInt([]uint32{_CTL_HW, _HW_NCPUONLINE}); ok { + return int32(n) + } + if n, ok := sysctlInt([]uint32{_CTL_HW, _HW_NCPU}); ok { + return int32(n) + } + return 1 +} + +func getPageSize() uintptr { + mib := [2]uint32{_CTL_HW, _HW_PAGESIZE} + out := uint32(0) + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], 2, (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret >= 0 { + return uintptr(out) + } + return 0 +} + +func getOSRev() int { + if osrev, ok := sysctlInt([]uint32{_CTL_KERN, _KERN_OSREV}); ok { + return int(osrev) + } + return 0 +} + +//go:nosplit +func semacreate(mp *m) { +} + +//go:nosplit +func semasleep(ns int64) int32 { + gp := getg() + var deadline int64 + if ns >= 0 { + deadline = nanotime() + ns + } + + for { + v := 
atomic.Load(&gp.m.waitsemacount) + if v > 0 { + if atomic.Cas(&gp.m.waitsemacount, v, v-1) { + return 0 // semaphore acquired + } + continue + } + + // Sleep until unparked by semawakeup or timeout. + var tsp *timespec + var ts timespec + if ns >= 0 { + wait := deadline - nanotime() + if wait <= 0 { + return -1 + } + ts.setNsec(wait) + tsp = &ts + } + ret := lwp_park(_CLOCK_MONOTONIC, _TIMER_RELTIME, tsp, 0, unsafe.Pointer(&gp.m.waitsemacount), nil) + if ret == _ETIMEDOUT { + return -1 + } + } +} + +//go:nosplit +func semawakeup(mp *m) { + atomic.Xadd(&mp.waitsemacount, 1) + // From NetBSD's _lwp_unpark(2) manual: + // "If the target LWP is not currently waiting, it will return + // immediately upon the next call to _lwp_park()." + ret := lwp_unpark(int32(mp.procid), unsafe.Pointer(&mp.waitsemacount)) + if ret != 0 && ret != _ESRCH { + // semawakeup can be called on signal stack. + systemstack(func() { + print("thrwakeup addr=", &mp.waitsemacount, " sem=", mp.waitsemacount, " ret=", ret, "\n") + }) + } +} + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " id=", mp.id, " ostk=", &mp, "\n") + } + + var uc ucontextt + getcontext(unsafe.Pointer(&uc)) + + // _UC_SIGMASK does not seem to work here. + // It would be nice if _UC_SIGMASK and _UC_STACK + // worked so that we could do all the work setting + // the sigmask and the stack here, instead of setting + // the mask here and the stack in netbsdMstart. + // For now do the blocking manually. 
+ uc.uc_flags = _UC_SIGMASK | _UC_CPU + uc.uc_link = nil + uc.uc_sigmask = sigset_all + + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + + lwp_mcontext_init(&uc.uc_mcontext, stk, mp, mp.g0, abi.FuncPCABI0(netbsdMstart)) + + ret := retryOnEAGAIN(func() int32 { + errno := lwp_create(unsafe.Pointer(&uc), _LWP_DETACHED, unsafe.Pointer(&mp.procid)) + // lwp_create returns negative errno + return -errno + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount()-1, " already; errno=", ret, ")\n") + if ret == _EAGAIN { + println("runtime: may need to increase max user processes (ulimit -p)") + } + throw("runtime.newosproc") + } +} + +// netbsdMstart is the entry-point for new Ms. +// It is written in assembly, uses ABI0, is marked TOPFRAME, and calls netbsdMstart0. +func netbsdMstart() + +// netbsdMstart0 is the function call that starts executing a newly +// created thread. On NetBSD, a new thread inherits the signal stack +// of the creating thread. That confuses minit, so we remove that +// signal stack here before calling the regular mstart. It's a bit +// baroque to remove a signal stack here only to add one in minit, but +// it's a simple change that keeps NetBSD working like other OS's. +// At this point all signals are blocked, so there is no race. +// +//go:nosplit +func netbsdMstart0() { + st := stackt{ss_flags: _SS_DISABLE} + sigaltstack(&st, nil) + mstart0() +} + +func osinit() { + ncpu = getncpu() + if physPageSize == 0 { + physPageSize = getPageSize() + } + needSysmonWorkaround = getOSRev() < 902000000 // NetBSD 9.2 +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to initialize a new m (including the bootstrap m). 
+// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + gp := getg() + gp.m.procid = uint64(lwp_self()) + + // On NetBSD a thread created by pthread_create inherits the + // signal stack of the creating thread. We always create a + // new signal stack here, to avoid having two Go threads using + // the same signal stack. This breaks the case of a thread + // created in C that calls sigaltstack and then calls a Go + // function, because we will lose track of the C code's + // sigaltstack, but it's the best we can do. + signalstack(&gp.m.gsignal.stack) + gp.m.newSigstack = true + + minitSignalMask() +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. +func mdestroy(mp *m) { +} + +func sigtramp() + +type sigactiont struct { + sa_sigaction uintptr + sa_mask sigset + sa_flags int32 +} + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = sigset_all + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = abi.FuncPCABI0(sigtramp) + } + sa.sa_sigaction = fn + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + throw("setsigstack") +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return sa.sa_sigaction +} + +// setSignalstackSP sets the ss_sp field of a stackt. 
+// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + s.ss_sp = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + mask.__bits[(i-1)/32] |= 1 << ((uint32(i) - 1) & 31) +} + +func sigdelset(mask *sigset, i int) { + mask.__bits[(i-1)/32] &^= 1 << ((uint32(i) - 1) & 31) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +func sysargs(argc int32, argv **byte) { + n := argc + 1 + + // skip over argv, envp to get to auxv + for argv_index(argv, n) != nil { + n++ + } + + // skip NULL separator + n++ + + // now argv+n is auxv + auxv := (*[1 << 28]uintptr)(add(unsafe.Pointer(argv), uintptr(n)*goarch.PtrSize)) + sysauxv(auxv[:]) +} + +const ( + _AT_NULL = 0 // Terminates the vector + _AT_PAGESZ = 6 // Page size in bytes +) + +func sysauxv(auxv []uintptr) { + for i := 0; auxv[i] != _AT_NULL; i += 2 { + tag, val := auxv[i], auxv[i+1] + switch tag { + case _AT_PAGESZ: + physPageSize = val + } + } +} + +// raise sends signal to the calling thread. +// +// It must be nosplit because it is used by the signal handler before +// it definitely has a Go stack. +// +//go:nosplit +func raise(sig uint32) { + lwp_kill(lwp_self(), int(sig)) +} + +func signalM(mp *m, sig int) { + lwp_kill(int32(mp.procid), sig) +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_netbsd_386.go b/src/runtime/os_netbsd_386.go new file mode 100644 index 0000000..ac89b98 --- /dev/null +++ b/src/runtime/os_netbsd_386.go @@ -0,0 +1,19 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +func lwp_mcontext_init(mc *mcontextt, stk unsafe.Pointer, mp *m, gp *g, fn uintptr) { + // Machine dependent mcontext initialisation for LWP. + mc.__gregs[_REG_EIP] = uint32(abi.FuncPCABI0(lwp_tramp)) + mc.__gregs[_REG_UESP] = uint32(uintptr(stk)) + mc.__gregs[_REG_EBX] = uint32(uintptr(unsafe.Pointer(mp))) + mc.__gregs[_REG_EDX] = uint32(uintptr(unsafe.Pointer(gp))) + mc.__gregs[_REG_ESI] = uint32(fn) +} diff --git a/src/runtime/os_netbsd_amd64.go b/src/runtime/os_netbsd_amd64.go new file mode 100644 index 0000000..74eea0c --- /dev/null +++ b/src/runtime/os_netbsd_amd64.go @@ -0,0 +1,19 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +func lwp_mcontext_init(mc *mcontextt, stk unsafe.Pointer, mp *m, gp *g, fn uintptr) { + // Machine dependent mcontext initialisation for LWP. + mc.__gregs[_REG_RIP] = uint64(abi.FuncPCABI0(lwp_tramp)) + mc.__gregs[_REG_RSP] = uint64(uintptr(stk)) + mc.__gregs[_REG_R8] = uint64(uintptr(unsafe.Pointer(mp))) + mc.__gregs[_REG_R9] = uint64(uintptr(unsafe.Pointer(gp))) + mc.__gregs[_REG_R12] = uint64(fn) +} diff --git a/src/runtime/os_netbsd_arm.go b/src/runtime/os_netbsd_arm.go new file mode 100644 index 0000000..5fb4e08 --- /dev/null +++ b/src/runtime/os_netbsd_arm.go @@ -0,0 +1,37 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +func lwp_mcontext_init(mc *mcontextt, stk unsafe.Pointer, mp *m, gp *g, fn uintptr) { + // Machine dependent mcontext initialisation for LWP. 
+ mc.__gregs[_REG_R15] = uint32(abi.FuncPCABI0(lwp_tramp)) + mc.__gregs[_REG_R13] = uint32(uintptr(stk)) + mc.__gregs[_REG_R0] = uint32(uintptr(unsafe.Pointer(mp))) + mc.__gregs[_REG_R1] = uint32(uintptr(unsafe.Pointer(gp))) + mc.__gregs[_REG_R2] = uint32(fn) +} + +func checkgoarm() { + // TODO(minux): FP checks like in os_linux_arm.go. + + // osinit not called yet, so ncpu not set: must use getncpu directly. + if getncpu() > 1 && goarm < 7 { + print("runtime: this system has multiple CPUs and must use\n") + print("atomic synchronization instructions. Recompile using GOARM=7.\n") + exit(1) + } +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_netbsd_arm64.go b/src/runtime/os_netbsd_arm64.go new file mode 100644 index 0000000..2dda9c9 --- /dev/null +++ b/src/runtime/os_netbsd_arm64.go @@ -0,0 +1,26 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +func lwp_mcontext_init(mc *mcontextt, stk unsafe.Pointer, mp *m, gp *g, fn uintptr) { + // Machine dependent mcontext initialisation for LWP. + mc.__gregs[_REG_ELR] = uint64(abi.FuncPCABI0(lwp_tramp)) + mc.__gregs[_REG_X31] = uint64(uintptr(stk)) + mc.__gregs[_REG_X0] = uint64(uintptr(unsafe.Pointer(mp))) + mc.__gregs[_REG_X1] = uint64(uintptr(unsafe.Pointer(mp.g0))) + mc.__gregs[_REG_X2] = uint64(fn) +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. 
+ return nanotime() +} diff --git a/src/runtime/os_nonopenbsd.go b/src/runtime/os_nonopenbsd.go new file mode 100644 index 0000000..a577596 --- /dev/null +++ b/src/runtime/os_nonopenbsd.go @@ -0,0 +1,17 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !openbsd + +package runtime + +// osStackAlloc performs OS-specific initialization before s is used +// as stack memory. +func osStackAlloc(s *mspan) { +} + +// osStackFree undoes the effect of osStackAlloc before s is returned +// to the heap. +func osStackFree(s *mspan) { +} diff --git a/src/runtime/os_only_solaris.go b/src/runtime/os_only_solaris.go new file mode 100644 index 0000000..0c72500 --- /dev/null +++ b/src/runtime/os_only_solaris.go @@ -0,0 +1,18 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Solaris code that doesn't also apply to illumos. + +//go:build !illumos + +package runtime + +func getncpu() int32 { + n := int32(sysconf(__SC_NPROCESSORS_ONLN)) + if n < 1 { + return 1 + } + + return n +} diff --git a/src/runtime/os_openbsd.go b/src/runtime/os_openbsd.go new file mode 100644 index 0000000..500286a --- /dev/null +++ b/src/runtime/os_openbsd.go @@ -0,0 +1,314 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "unsafe" +) + +type mOS struct { + waitsemacount uint32 +} + +const ( + _ESRCH = 3 + _EWOULDBLOCK = _EAGAIN + _ENOTSUP = 91 + + // From OpenBSD's sys/time.h + _CLOCK_REALTIME = 0 + _CLOCK_VIRTUAL = 1 + _CLOCK_PROF = 2 + _CLOCK_MONOTONIC = 3 +) + +type sigset uint32 + +var sigset_all = ^sigset(0) + +// From OpenBSD's <sys/sysctl.h> +const ( + _CTL_KERN = 1 + _KERN_OSREV = 3 + + _CTL_HW = 6 + _HW_NCPU = 3 + _HW_PAGESIZE = 7 + _HW_NCPUONLINE = 25 +) + +func sysctlInt(mib []uint32) (int32, bool) { + var out int32 + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], uint32(len(mib)), (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret < 0 { + return 0, false + } + return out, true +} + +func sysctlUint64(mib []uint32) (uint64, bool) { + var out uint64 + nout := unsafe.Sizeof(out) + ret := sysctl(&mib[0], uint32(len(mib)), (*byte)(unsafe.Pointer(&out)), &nout, nil, 0) + if ret < 0 { + return 0, false + } + return out, true +} + +//go:linkname internal_cpu_sysctlUint64 internal/cpu.sysctlUint64 +func internal_cpu_sysctlUint64(mib []uint32) (uint64, bool) { + return sysctlUint64(mib) +} + +func getncpu() int32 { + // Try hw.ncpuonline first because hw.ncpu would report a number twice as + // high as the actual CPUs running on OpenBSD 6.4 with hyperthreading + // disabled (hw.smt=0). 
See https://golang.org/issue/30127 + if n, ok := sysctlInt([]uint32{_CTL_HW, _HW_NCPUONLINE}); ok { + return int32(n) + } + if n, ok := sysctlInt([]uint32{_CTL_HW, _HW_NCPU}); ok { + return int32(n) + } + return 1 +} + +func getPageSize() uintptr { + if ps, ok := sysctlInt([]uint32{_CTL_HW, _HW_PAGESIZE}); ok { + return uintptr(ps) + } + return 0 +} + +func getOSRev() int { + if osrev, ok := sysctlInt([]uint32{_CTL_KERN, _KERN_OSREV}); ok { + return int(osrev) + } + return 0 +} + +//go:nosplit +func semacreate(mp *m) { +} + +//go:nosplit +func semasleep(ns int64) int32 { + gp := getg() + + // Compute sleep deadline. + var tsp *timespec + if ns >= 0 { + var ts timespec + ts.setNsec(ns + nanotime()) + tsp = &ts + } + + for { + v := atomic.Load(&gp.m.waitsemacount) + if v > 0 { + if atomic.Cas(&gp.m.waitsemacount, v, v-1) { + return 0 // semaphore acquired + } + continue + } + + // Sleep until woken by semawakeup or timeout; or abort if waitsemacount != 0. + // + // From OpenBSD's __thrsleep(2) manual: + // "The abort argument, if not NULL, points to an int that will + // be examined [...] immediately before blocking. If that int + // is non-zero then __thrsleep() will immediately return EINTR + // without blocking." + ret := thrsleep(uintptr(unsafe.Pointer(&gp.m.waitsemacount)), _CLOCK_MONOTONIC, tsp, 0, &gp.m.waitsemacount) + if ret == _EWOULDBLOCK { + return -1 + } + } +} + +//go:nosplit +func semawakeup(mp *m) { + atomic.Xadd(&mp.waitsemacount, 1) + ret := thrwakeup(uintptr(unsafe.Pointer(&mp.waitsemacount)), 1) + if ret != 0 && ret != _ESRCH { + // semawakeup can be called on signal stack. 
+ systemstack(func() { + print("thrwakeup addr=", &mp.waitsemacount, " sem=", mp.waitsemacount, " ret=", ret, "\n") + }) + } +} + +func osinit() { + ncpu = getncpu() + physPageSize = getPageSize() + haveMapStack = getOSRev() >= 201805 // OpenBSD 6.3 +} + +var urandom_dev = []byte("/dev/urandom\x00") + +//go:nosplit +func getRandomData(r []byte) { + fd := open(&urandom_dev[0], 0 /* O_RDONLY */, 0) + n := read(fd, unsafe.Pointer(&r[0]), int32(len(r))) + closefd(fd) + extendRandom(r, int(n)) +} + +func goenvs() { + goenvs_unix() +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + gsignalSize := int32(32 * 1024) + if GOARCH == "mips64" { + gsignalSize = int32(64 * 1024) + } + mp.gsignal = malg(gsignalSize) + mp.gsignal.m = mp +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + getg().m.procid = uint64(getthrid()) + minitSignals() +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + unminitSignals() +} + +// Called from exitm, but not from dropm, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. 
+func mdestroy(mp *m) { +} + +func sigtramp() + +type sigactiont struct { + sa_sigaction uintptr + sa_mask uint32 + sa_flags int32 +} + +//go:nosplit +//go:nowritebarrierrec +func setsig(i uint32, fn uintptr) { + var sa sigactiont + sa.sa_flags = _SA_SIGINFO | _SA_ONSTACK | _SA_RESTART + sa.sa_mask = uint32(sigset_all) + if fn == abi.FuncPCABIInternal(sighandler) { // abi.FuncPCABIInternal(sighandler) matches the callers in signal_unix.go + fn = abi.FuncPCABI0(sigtramp) + } + sa.sa_sigaction = fn + sigaction(i, &sa, nil) +} + +//go:nosplit +//go:nowritebarrierrec +func setsigstack(i uint32) { + throw("setsigstack") +} + +//go:nosplit +//go:nowritebarrierrec +func getsig(i uint32) uintptr { + var sa sigactiont + sigaction(i, nil, &sa) + return sa.sa_sigaction +} + +// setSignalstackSP sets the ss_sp field of a stackt. +// +//go:nosplit +func setSignalstackSP(s *stackt, sp uintptr) { + s.ss_sp = sp +} + +//go:nosplit +//go:nowritebarrierrec +func sigaddset(mask *sigset, i int) { + *mask |= 1 << (uint32(i) - 1) +} + +func sigdelset(mask *sigset, i int) { + *mask &^= 1 << (uint32(i) - 1) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { +} + +func setProcessCPUProfiler(hz int32) { + setProcessCPUProfilerTimer(hz) +} + +func setThreadCPUProfiler(hz int32) { + setThreadCPUProfilerHz(hz) +} + +//go:nosplit +func validSIGPROF(mp *m, c *sigctxt) bool { + return true +} + +var haveMapStack = false + +func osStackAlloc(s *mspan) { + // OpenBSD 6.4+ requires that stacks be mapped with MAP_STACK. + // It will check this on entry to system calls, traps, and + // when switching to the alternate system stack. + // + // This function is called before s is used for any data, so + // it's safe to simply re-map it. + osStackRemap(s, _MAP_STACK) +} + +func osStackFree(s *mspan) { + // Undo MAP_STACK. 
+ osStackRemap(s, 0) +} + +func osStackRemap(s *mspan, flags int32) { + if !haveMapStack { + // OpenBSD prior to 6.3 did not have MAP_STACK and so + // the following mmap will fail. But it also didn't + // require MAP_STACK (obviously), so there's no need + // to do the mmap. + return + } + a, err := mmap(unsafe.Pointer(s.base()), s.npages*pageSize, _PROT_READ|_PROT_WRITE, _MAP_PRIVATE|_MAP_ANON|_MAP_FIXED|flags, -1, 0) + if err != 0 || uintptr(a) != s.base() { + print("runtime: remapping stack memory ", hex(s.base()), " ", s.npages*pageSize, " a=", a, " err=", err, "\n") + throw("remapping stack memory failed") + } +} + +//go:nosplit +func raise(sig uint32) { + thrkill(getthrid(), int(sig)) +} + +func signalM(mp *m, sig int) { + thrkill(int32(mp.procid), sig) +} + +// sigPerThreadSyscall is only used on linux, so we assign a bogus signal +// number. +const sigPerThreadSyscall = 1 << 31 + +//go:nosplit +func runPerThreadSyscall() { + throw("runPerThreadSyscall only valid on linux") +} diff --git a/src/runtime/os_openbsd_arm.go b/src/runtime/os_openbsd_arm.go new file mode 100644 index 0000000..0a24096 --- /dev/null +++ b/src/runtime/os_openbsd_arm.go @@ -0,0 +1,23 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +func checkgoarm() { + // TODO(minux): FP checks like in os_linux_arm.go. + + // osinit not called yet, so ncpu not set: must use getncpu directly. + if getncpu() > 1 && goarm < 7 { + print("runtime: this system has multiple CPUs and must use\n") + print("atomic synchronization instructions. Recompile using GOARM=7.\n") + exit(1) + } +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. 
+ return nanotime() +} diff --git a/src/runtime/os_openbsd_arm64.go b/src/runtime/os_openbsd_arm64.go new file mode 100644 index 0000000..d71de7d --- /dev/null +++ b/src/runtime/os_openbsd_arm64.go @@ -0,0 +1,12 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_openbsd_libc.go b/src/runtime/os_openbsd_libc.go new file mode 100644 index 0000000..201f162 --- /dev/null +++ b/src/runtime/os_openbsd_libc.go @@ -0,0 +1,60 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && !mips64 + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// mstart_stub provides glue code to call mstart from pthread_create. +func mstart_stub() + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrierrec +func newosproc(mp *m) { + if false { + print("newosproc m=", mp, " g=", mp.g0, " id=", mp.id, " ostk=", &mp, "\n") + } + + // Initialize an attribute object. + var attr pthreadattr + if err := pthread_attr_init(&attr); err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Find out OS stack size for our own stack guard. + var stacksize uintptr + if pthread_attr_getstacksize(&attr, &stacksize) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + mp.g0.stack.hi = stacksize // for mstart + + // Tell the pthread library we won't join with this thread. + if pthread_attr_setdetachstate(&attr, _PTHREAD_CREATE_DETACHED) != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + // Finally, create the thread. 
It starts at mstart_stub, which does some low-level + // setup and then calls mstart. + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + err := retryOnEAGAIN(func() int32 { + return pthread_create(&attr, abi.FuncPCABI0(mstart_stub), unsafe.Pointer(mp)) + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + if err != 0 { + writeErrStr(failthreadcreate) + exit(1) + } + + pthread_attr_destroy(&attr) +} diff --git a/src/runtime/os_openbsd_mips64.go b/src/runtime/os_openbsd_mips64.go new file mode 100644 index 0000000..ae220cd --- /dev/null +++ b/src/runtime/os_openbsd_mips64.go @@ -0,0 +1,12 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_openbsd_syscall.go b/src/runtime/os_openbsd_syscall.go new file mode 100644 index 0000000..d784f76 --- /dev/null +++ b/src/runtime/os_openbsd_syscall.go @@ -0,0 +1,51 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && mips64 + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +//go:noescape +func tfork(param *tforkt, psize uintptr, mm *m, gg *g, fn uintptr) int32 + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + stk := unsafe.Pointer(mp.g0.stack.hi) + if false { + print("newosproc stk=", stk, " m=", mp, " g=", mp.g0, " id=", mp.id, " ostk=", &mp, "\n") + } + + // Stack pointer must point inside stack area (as marked with MAP_STACK), + // rather than at the top of it. 
+ param := tforkt{ + tf_tcb: unsafe.Pointer(&mp.tls[0]), + tf_tid: nil, // minit will record tid + tf_stack: uintptr(stk) - goarch.PtrSize, + } + + var oset sigset + sigprocmask(_SIG_SETMASK, &sigset_all, &oset) + ret := retryOnEAGAIN(func() int32 { + errno := tfork(&param, unsafe.Sizeof(param), mp, mp.g0, abi.FuncPCABI0(mstart)) + // tfork returns negative errno + return -errno + }) + sigprocmask(_SIG_SETMASK, &oset, nil) + + if ret != 0 { + print("runtime: failed to create new OS thread (have ", mcount()-1, " already; errno=", ret, ")\n") + if ret == _EAGAIN { + println("runtime: may need to increase max user processes (ulimit -p)") + } + throw("runtime.newosproc") + } +} diff --git a/src/runtime/os_openbsd_syscall1.go b/src/runtime/os_openbsd_syscall1.go new file mode 100644 index 0000000..d32894b --- /dev/null +++ b/src/runtime/os_openbsd_syscall1.go @@ -0,0 +1,20 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build openbsd && mips64 + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +//go:noescape +func sigaction(sig uint32, new, old *sigactiont) + +func kqueue() int32 + +//go:noescape +func kevent(kq int32, ch *keventt, nch int32, ev *keventt, nev int32, ts *timespec) int32 + +func raiseproc(sig uint32) + +func getthrid() int32 +func thrkill(tid int32, sig int) + +// read calls the read system call. +// It returns a non-negative number of bytes read or a negative errno value. +func read(fd int32, p unsafe.Pointer, n int32) int32 + +func closefd(fd int32) int32 + +func exit(code int32) +func usleep(usec uint32) + +//go:nosplit +func usleep_no_g(usec uint32) { + usleep(usec) +} + +// write1 calls the write system call. +// It returns a non-negative number of bytes written or a negative errno value. +// +//go:noescape +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 + +//go:noescape +func open(name *byte, mode, perm int32) int32 + +// return value is only set on linux to be used in osinit(). +func madvise(addr unsafe.Pointer, n uintptr, flags int32) int32 + +// exitThread terminates the current thread, writing *wait = freeMStack when +// the stack is safe to reclaim. +// +//go:noescape +func exitThread(wait *atomic.Uint32) + +//go:noescape +func obsdsigprocmask(how int32, new sigset) sigset + +//go:nosplit +//go:nowritebarrierrec +func sigprocmask(how int32, new, old *sigset) { + n := sigset(0) + if new != nil { + n = *new + } + r := obsdsigprocmask(how, n) + if old != nil { + *old = r + } +} + +func pipe2(flags int32) (r, w int32, errno int32) + +//go:noescape +func setitimer(mode int32, new, old *itimerval) + +//go:noescape +func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 + +// mmap calls the mmap system call. It is implemented in assembly. 
+// We only pass the lower 32 bits of file offset to the +// assembly routine; the higher bits (if required), should be provided +// by the assembly routine as 0. +// The err result is an OS error code such as ENOMEM. +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (p unsafe.Pointer, err int) + +// munmap calls the munmap system call. It is implemented in assembly. +func munmap(addr unsafe.Pointer, n uintptr) + +func nanotime1() int64 + +//go:noescape +func sigaltstack(new, old *stackt) + +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) +func closeonexec(fd int32) + +func walltime() (sec int64, nsec int32) + +func issetugid() int32 diff --git a/src/runtime/os_plan9.go b/src/runtime/os_plan9.go new file mode 100644 index 0000000..5e5a63d --- /dev/null +++ b/src/runtime/os_plan9.go @@ -0,0 +1,552 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "unsafe" +) + +type mOS struct { + waitsemacount uint32 + notesig *int8 + errstr *byte + ignoreHangup bool +} + +func closefd(fd int32) int32 + +//go:noescape +func open(name *byte, mode, perm int32) int32 + +//go:noescape +func pread(fd int32, buf unsafe.Pointer, nbytes int32, offset int64) int32 + +//go:noescape +func pwrite(fd int32, buf unsafe.Pointer, nbytes int32, offset int64) int32 + +func seek(fd int32, offset int64, whence int32) int64 + +//go:noescape +func exits(msg *byte) + +//go:noescape +func brk_(addr unsafe.Pointer) int32 + +func sleep(ms int32) int32 + +func rfork(flags int32) int32 + +//go:noescape +func plan9_semacquire(addr *uint32, block int32) int32 + +//go:noescape +func plan9_tsemacquire(addr *uint32, ms int32) int32 + +//go:noescape +func plan9_semrelease(addr *uint32, count int32) int32 + +//go:noescape +func notify(fn unsafe.Pointer) int32 + +func noted(mode int32) int32 + +//go:noescape +func nsec(*int64) int64 + +//go:noescape +func sigtramp(ureg, note unsafe.Pointer) + +func setfpmasks() + +//go:noescape +func tstart_plan9(newm *m) + +func errstr() string + +type _Plink uintptr + +//go:linkname os_sigpipe os.sigpipe +func os_sigpipe() { + throw("too many writes on closed pipe") +} + +func sigpanic() { + gp := getg() + if !canpanic() { + throw("unexpected signal during runtime execution") + } + + note := gostringnocopy((*byte)(unsafe.Pointer(gp.m.notesig))) + switch gp.sig { + case _SIGRFAULT, _SIGWFAULT: + i := indexNoFloat(note, "addr=") + if i >= 0 { + i += 5 + } else if i = indexNoFloat(note, "va="); i >= 0 { + i += 3 + } else { + panicmem() + } + addr := note[i:] + gp.sigcode1 = uintptr(atolwhex(addr)) + if gp.sigcode1 < 0x1000 { + panicmem() + } + if gp.paniconfault { + panicmemAddr(gp.sigcode1) + } + if inUserArenaChunk(gp.sigcode1) { + // We could check that the arena chunk is explicitly set to fault, + // but the fact that we faulted on accessing it 
is enough to prove + // that it is. + print("accessed data from freed user arena ", hex(gp.sigcode1), "\n") + } else { + print("unexpected fault address ", hex(gp.sigcode1), "\n") + } + throw("fault") + case _SIGTRAP: + if gp.paniconfault { + panicmem() + } + throw(note) + case _SIGINTDIV: + panicdivide() + case _SIGFLOAT: + panicfloat() + default: + panic(errorString(note)) + } +} + +// indexNoFloat is bytealg.IndexString but safe to use in a note +// handler. +func indexNoFloat(s, t string) int { + if len(t) == 0 { + return 0 + } + for i := 0; i < len(s); i++ { + if s[i] == t[0] && hasPrefix(s[i:], t) { + return i + } + } + return -1 +} + +func atolwhex(p string) int64 { + for hasPrefix(p, " ") || hasPrefix(p, "\t") { + p = p[1:] + } + neg := false + if hasPrefix(p, "-") || hasPrefix(p, "+") { + neg = p[0] == '-' + p = p[1:] + for hasPrefix(p, " ") || hasPrefix(p, "\t") { + p = p[1:] + } + } + var n int64 + switch { + case hasPrefix(p, "0x"), hasPrefix(p, "0X"): + p = p[2:] + for ; len(p) > 0; p = p[1:] { + if '0' <= p[0] && p[0] <= '9' { + n = n*16 + int64(p[0]-'0') + } else if 'a' <= p[0] && p[0] <= 'f' { + n = n*16 + int64(p[0]-'a'+10) + } else if 'A' <= p[0] && p[0] <= 'F' { + n = n*16 + int64(p[0]-'A'+10) + } else { + break + } + } + case hasPrefix(p, "0"): + for ; len(p) > 0 && '0' <= p[0] && p[0] <= '7'; p = p[1:] { + n = n*8 + int64(p[0]-'0') + } + default: + for ; len(p) > 0 && '0' <= p[0] && p[0] <= '9'; p = p[1:] { + n = n*10 + int64(p[0]-'0') + } + } + if neg { + n = -n + } + return n +} + +type sigset struct{} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { + // Initialize stack and goroutine for note handling. + mp.gsignal = malg(32 * 1024) + mp.gsignal.m = mp + mp.notesig = (*int8)(mallocgc(_ERRMAX, nil, true)) + // Initialize stack for handling strings from the + // errstr system call, as used in package syscall. 
+ mp.errstr = (*byte)(mallocgc(_ERRMAX, nil, true)) +} + +func sigsave(p *sigset) { +} + +func msigrestore(sigmask sigset) { +} + +//go:nosplit +//go:nowritebarrierrec +func clearSignalHandlers() { +} + +func sigblock(exiting bool) { +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + if atomic.Load(&exiting) != 0 { + exits(&emptystatus[0]) + } + // Mask all SSE floating-point exceptions + // when running on the 64-bit kernel. + setfpmasks() +} + +// Called from dropm to undo the effect of an minit. +func unminit() { +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. +func mdestroy(mp *m) { +} + +var sysstat = []byte("/dev/sysstat\x00") + +func getproccount() int32 { + var buf [2048]byte + fd := open(&sysstat[0], _OREAD, 0) + if fd < 0 { + return 1 + } + ncpu := int32(0) + for { + n := read(fd, unsafe.Pointer(&buf), int32(len(buf))) + if n <= 0 { + break + } + for i := int32(0); i < n; i++ { + if buf[i] == '\n' { + ncpu++ + } + } + } + closefd(fd) + if ncpu == 0 { + ncpu = 1 + } + return ncpu +} + +var devswap = []byte("/dev/swap\x00") +var pagesize = []byte(" pagesize\n") + +func getPageSize() uintptr { + var buf [2048]byte + var pos int + fd := open(&devswap[0], _OREAD, 0) + if fd < 0 { + // There's not much we can do if /dev/swap doesn't + // exist. However, nothing in the memory manager uses + // this on Plan 9, so it also doesn't really matter. + return minPhysPageSize + } + for pos < len(buf) { + n := read(fd, unsafe.Pointer(&buf[pos]), int32(len(buf)-pos)) + if n <= 0 { + break + } + pos += int(n) + } + closefd(fd) + text := buf[:pos] + // Find "<n> pagesize" line. + bol := 0 + for i, c := range text { + if c == '\n' { + bol = i + 1 + } + if bytesHasPrefix(text[i:], pagesize) { + // Parse number at the beginning of this line. 
+ return uintptr(_atoi(text[bol:])) + } + } + // Again, the page size doesn't really matter, so use a fallback. + return minPhysPageSize +} + +func bytesHasPrefix(s, prefix []byte) bool { + if len(s) < len(prefix) { + return false + } + for i, p := range prefix { + if s[i] != p { + return false + } + } + return true +} + +var pid = []byte("#c/pid\x00") + +func getpid() uint64 { + var b [20]byte + fd := open(&pid[0], 0, 0) + if fd >= 0 { + read(fd, unsafe.Pointer(&b), int32(len(b))) + closefd(fd) + } + c := b[:] + for c[0] == ' ' || c[0] == '\t' { + c = c[1:] + } + return uint64(_atoi(c)) +} + +func osinit() { + initBloc() + ncpu = getproccount() + physPageSize = getPageSize() + getg().m.procid = getpid() +} + +//go:nosplit +func crash() { + notify(nil) + *(*int)(nil) = 0 +} + +//go:nosplit +func getRandomData(r []byte) { + // inspired by wyrand see hash32.go for detail + t := nanotime() + v := getg().m.procid ^ uint64(t) + + for len(r) > 0 { + v ^= 0xa0761d6478bd642f + v *= 0xe7037ed1a0b428db + size := 8 + if len(r) < 8 { + size = len(r) + } + for i := 0; i < size; i++ { + r[i] = byte(v >> (8 * i)) + } + r = r[size:] + v = v>>32 | v<<32 + } +} + +func initsig(preinit bool) { + if !preinit { + notify(unsafe.Pointer(abi.FuncPCABI0(sigtramp))) + } +} + +//go:nosplit +func osyield() { + sleep(0) +} + +//go:nosplit +func osyield_no_g() { + osyield() +} + +//go:nosplit +func usleep(µs uint32) { + ms := int32(µs / 1000) + if ms == 0 { + ms = 1 + } + sleep(ms) +} + +//go:nosplit +func usleep_no_g(usec uint32) { + usleep(usec) +} + +//go:nosplit +func nanotime1() int64 { + var scratch int64 + ns := nsec(&scratch) + // TODO(aram): remove hack after I fix _nsec in the pc64 kernel. 
+ if ns == 0 { + return scratch + } + return ns +} + +var goexits = []byte("go: exit ") +var emptystatus = []byte("\x00") +var exiting uint32 + +func goexitsall(status *byte) { + var buf [_ERRMAX]byte + if !atomic.Cas(&exiting, 0, 1) { + return + } + getg().m.locks++ + n := copy(buf[:], goexits) + n = copy(buf[n:], gostringnocopy(status)) + pid := getpid() + for mp := (*m)(atomic.Loadp(unsafe.Pointer(&allm))); mp != nil; mp = mp.alllink { + if mp.procid != 0 && mp.procid != pid { + postnote(mp.procid, buf[:]) + } + } + getg().m.locks-- +} + +var procdir = []byte("/proc/") +var notefile = []byte("/note\x00") + +func postnote(pid uint64, msg []byte) int { + var buf [128]byte + var tmp [32]byte + n := copy(buf[:], procdir) + n += copy(buf[n:], itoa(tmp[:], pid)) + copy(buf[n:], notefile) + fd := open(&buf[0], _OWRITE, 0) + if fd < 0 { + return -1 + } + len := findnull(&msg[0]) + if write1(uintptr(fd), unsafe.Pointer(&msg[0]), int32(len)) != int32(len) { + closefd(fd) + return -1 + } + closefd(fd) + return 0 +} + +//go:nosplit +func exit(e int32) { + var status []byte + if e == 0 { + status = emptystatus + } else { + // build error string + var tmp [32]byte + sl := itoa(tmp[:len(tmp)-1], uint64(e)) + // Don't append, rely on the existing data being zero. + status = sl[:len(sl)+1] + } + goexitsall(&status[0]) + exits(&status[0]) +} + +// May run with m.p==nil, so write barriers are not allowed. +// +//go:nowritebarrier +func newosproc(mp *m) { + if false { + print("newosproc mp=", mp, " ostk=", &mp, "\n") + } + pid := rfork(_RFPROC | _RFMEM | _RFNOWAIT) + if pid < 0 { + throw("newosproc: rfork failed") + } + if pid == 0 { + tstart_plan9(mp) + } +} + +func exitThread(wait *atomic.Uint32) { + // We should never reach exitThread on Plan 9 because we let + // the OS clean up threads. 
+ throw("exitThread") +} + +//go:nosplit +func semacreate(mp *m) { +} + +//go:nosplit +func semasleep(ns int64) int { + gp := getg() + if ns >= 0 { + ms := timediv(ns, 1000000, nil) + if ms == 0 { + ms = 1 + } + ret := plan9_tsemacquire(&gp.m.waitsemacount, ms) + if ret == 1 { + return 0 // success + } + return -1 // timeout or interrupted + } + for plan9_semacquire(&gp.m.waitsemacount, 1) < 0 { + // interrupted; try again (c.f. lock_sema.go) + } + return 0 // success +} + +//go:nosplit +func semawakeup(mp *m) { + plan9_semrelease(&mp.waitsemacount, 1) +} + +//go:nosplit +func read(fd int32, buf unsafe.Pointer, n int32) int32 { + return pread(fd, buf, n, -1) +} + +//go:nosplit +func write1(fd uintptr, buf unsafe.Pointer, n int32) int32 { + return pwrite(int32(fd), buf, n, -1) +} + +var _badsignal = []byte("runtime: signal received on thread not created by Go.\n") + +// This runs on a foreign stack, without an m or a g. No stack split. +// +//go:nosplit +func badsignal2() { + pwrite(2, unsafe.Pointer(&_badsignal[0]), int32(len(_badsignal)), -1) + exits(&_badsignal[0]) +} + +func raisebadsignal(sig uint32) { + badsignal2() +} + +func _atoi(b []byte) int { + n := 0 + for len(b) > 0 && '0' <= b[0] && b[0] <= '9' { + n = n*10 + int(b[0]) - '0' + b = b[1:] + } + return n +} + +func signame(sig uint32) string { + if sig >= uint32(len(sigtable)) { + return "" + } + return sigtable[sig].name +} + +const preemptMSupported = false + +func preemptM(mp *m) { + // Not currently supported. + // + // TODO: Use a note like we use signals on POSIX OSes +} diff --git a/src/runtime/os_plan9_arm.go b/src/runtime/os_plan9_arm.go new file mode 100644 index 0000000..f165a34 --- /dev/null +++ b/src/runtime/os_plan9_arm.go @@ -0,0 +1,16 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +func checkgoarm() { + return // TODO(minux) +} + +//go:nosplit +func cputicks() int64 { + // Currently cputicks() is used in blocking profiler and to seed runtime·fastrand(). + // runtime·nanotime() is a poor approximation of CPU ticks that is enough for the profiler. + return nanotime() +} diff --git a/src/runtime/os_solaris.go b/src/runtime/os_solaris.go new file mode 100644 index 0000000..47edda1 --- /dev/null +++ b/src/runtime/os_solaris.go @@ -0,0 +1,273 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type mts struct { + tv_sec int64 + tv_nsec int64 +} + +type mscratch struct { + v [6]uintptr +} + +type mOS struct { + waitsema uintptr // semaphore for parking on locks + perrno *int32 // pointer to tls errno + // these are here because they are too large to be on the stack + // of low-level NOSPLIT functions. + //LibCall libcall; + ts mts + scratch mscratch +} + +type libcFunc uintptr + +//go:linkname asmsysvicall6x runtime.asmsysvicall6 +var asmsysvicall6x libcFunc // name to take addr of asmsysvicall6 + +func asmsysvicall6() // declared for vet; do NOT call + +//go:nosplit +func sysvicall0(fn *libcFunc) uintptr { + // Leave caller's PC/SP around for traceback. 
+ gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil // See comment in sys_darwin.go:libcCall + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 0 + libcall.args = uintptr(unsafe.Pointer(fn)) // it's unused but must be non-nil, otherwise crashes + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1 +} + +//go:nosplit +func sysvicall1(fn *libcFunc, a1 uintptr) uintptr { + r1, _ := sysvicall1Err(fn, a1) + return r1 +} + +//go:nosplit + +// sysvicall1Err returns both the system call result and the errno value. +// This is used by sysvicall1 and pipe. +func sysvicall1Err(fn *libcFunc, a1 uintptr) (r1, err uintptr) { + // Leave caller's PC/SP around for traceback. + gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 1 + // TODO(rsc): Why is noescape necessary here and below? + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1, libcall.err +} + +//go:nosplit +func sysvicall2(fn *libcFunc, a1, a2 uintptr) uintptr { + r1, _ := sysvicall2Err(fn, a1, a2) + return r1 +} + +//go:nosplit +//go:cgo_unsafe_args + +// sysvicall2Err returns both the system call result and the errno value. 
+// This is used by sysvicall2 and pipe2. +func sysvicall2Err(fn *libcFunc, a1, a2 uintptr) (uintptr, uintptr) { + // Leave caller's PC/SP around for traceback. + gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 2 + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1, libcall.err +} + +//go:nosplit +func sysvicall3(fn *libcFunc, a1, a2, a3 uintptr) uintptr { + r1, _ := sysvicall3Err(fn, a1, a2, a3) + return r1 +} + +//go:nosplit +//go:cgo_unsafe_args + +// sysvicall3Err returns both the system call result and the errno value. +// This is used by sysvicall3 and write1. +func sysvicall3Err(fn *libcFunc, a1, a2, a3 uintptr) (r1, err uintptr) { + // Leave caller's PC/SP around for traceback. + gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 3 + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1, libcall.err +} + +//go:nosplit +//go:cgo_unsafe_args +func sysvicall4(fn *libcFunc, a1, a2, a3, a4 uintptr) uintptr { + // Leave caller's PC/SP around for traceback. 
+ gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 4 + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1 +} + +//go:nosplit +//go:cgo_unsafe_args +func sysvicall5(fn *libcFunc, a1, a2, a3, a4, a5 uintptr) uintptr { + // Leave caller's PC/SP around for traceback. + gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 5 + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1 +} + +//go:nosplit +//go:cgo_unsafe_args +func sysvicall6(fn *libcFunc, a1, a2, a3, a4, a5, a6 uintptr) uintptr { + // Leave caller's PC/SP around for traceback. 
+ gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + mp = nil + } + + var libcall libcall + libcall.fn = uintptr(unsafe.Pointer(fn)) + libcall.n = 6 + libcall.args = uintptr(noescape(unsafe.Pointer(&a1))) + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&libcall)) + if mp != nil { + mp.libcallsp = 0 + } + return libcall.r1 +} + +func issetugid() int32 { + return int32(sysvicall0(&libc_issetugid)) +} diff --git a/src/runtime/os_unix_nonlinux.go b/src/runtime/os_unix_nonlinux.go new file mode 100644 index 0000000..b98753b --- /dev/null +++ b/src/runtime/os_unix_nonlinux.go @@ -0,0 +1,15 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix && !linux + +package runtime + +// sigFromUser reports whether the signal was sent because of a call +// to kill. +// +//go:nosplit +func (c *sigctxt) sigFromUser() bool { + return c.sigcode() == _SI_USER +} diff --git a/src/runtime/os_windows.go b/src/runtime/os_windows.go new file mode 100644 index 0000000..44718f1 --- /dev/null +++ b/src/runtime/os_windows.go @@ -0,0 +1,1470 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// TODO(brainman): should not need those +const ( + _NSIG = 65 +) + +//go:cgo_import_dynamic runtime._AddVectoredExceptionHandler AddVectoredExceptionHandler%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._CloseHandle CloseHandle%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateEventA CreateEventA%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateFileA CreateFileA%7 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateIoCompletionPort CreateIoCompletionPort%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateThread CreateThread%6 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateWaitableTimerA CreateWaitableTimerA%3 "kernel32.dll" +//go:cgo_import_dynamic runtime._CreateWaitableTimerExW CreateWaitableTimerExW%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._DuplicateHandle DuplicateHandle%7 "kernel32.dll" +//go:cgo_import_dynamic runtime._ExitProcess ExitProcess%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._FreeEnvironmentStringsW FreeEnvironmentStringsW%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetConsoleMode GetConsoleMode%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetEnvironmentStringsW GetEnvironmentStringsW%0 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetProcAddress GetProcAddress%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetProcessAffinityMask GetProcessAffinityMask%3 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetQueuedCompletionStatusEx GetQueuedCompletionStatusEx%6 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetStdHandle GetStdHandle%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetSystemDirectoryA GetSystemDirectoryA%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetSystemInfo GetSystemInfo%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._GetThreadContext GetThreadContext%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetThreadContext 
SetThreadContext%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._LoadLibraryW LoadLibraryW%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._LoadLibraryA LoadLibraryA%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._PostQueuedCompletionStatus PostQueuedCompletionStatus%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._ResumeThread ResumeThread%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetConsoleCtrlHandler SetConsoleCtrlHandler%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetErrorMode SetErrorMode%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetEvent SetEvent%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetProcessPriorityBoost SetProcessPriorityBoost%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetThreadPriority SetThreadPriority%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetUnhandledExceptionFilter SetUnhandledExceptionFilter%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SetWaitableTimer SetWaitableTimer%6 "kernel32.dll" +//go:cgo_import_dynamic runtime._Sleep Sleep%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SuspendThread SuspendThread%1 "kernel32.dll" +//go:cgo_import_dynamic runtime._SwitchToThread SwitchToThread%0 "kernel32.dll" +//go:cgo_import_dynamic runtime._TlsAlloc TlsAlloc%0 "kernel32.dll" +//go:cgo_import_dynamic runtime._VirtualAlloc VirtualAlloc%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._VirtualFree VirtualFree%3 "kernel32.dll" +//go:cgo_import_dynamic runtime._VirtualQuery VirtualQuery%3 "kernel32.dll" +//go:cgo_import_dynamic runtime._WaitForSingleObject WaitForSingleObject%2 "kernel32.dll" +//go:cgo_import_dynamic runtime._WaitForMultipleObjects WaitForMultipleObjects%4 "kernel32.dll" +//go:cgo_import_dynamic runtime._WriteConsoleW WriteConsoleW%5 "kernel32.dll" +//go:cgo_import_dynamic runtime._WriteFile WriteFile%5 "kernel32.dll" + +type stdFunction unsafe.Pointer + +var ( + // Following syscalls are available on every Windows PC. 
+ // All these variables are set by the Windows executable + // loader before the Go program starts. + _AddVectoredExceptionHandler, + _CloseHandle, + _CreateEventA, + _CreateFileA, + _CreateIoCompletionPort, + _CreateThread, + _CreateWaitableTimerA, + _CreateWaitableTimerExW, + _DuplicateHandle, + _ExitProcess, + _FreeEnvironmentStringsW, + _GetConsoleMode, + _GetEnvironmentStringsW, + _GetProcAddress, + _GetProcessAffinityMask, + _GetQueuedCompletionStatusEx, + _GetStdHandle, + _GetSystemDirectoryA, + _GetSystemInfo, + _GetSystemTimeAsFileTime, + _GetThreadContext, + _SetThreadContext, + _LoadLibraryW, + _LoadLibraryA, + _PostQueuedCompletionStatus, + _QueryPerformanceCounter, + _QueryPerformanceFrequency, + _ResumeThread, + _SetConsoleCtrlHandler, + _SetErrorMode, + _SetEvent, + _SetProcessPriorityBoost, + _SetThreadPriority, + _SetUnhandledExceptionFilter, + _SetWaitableTimer, + _Sleep, + _SuspendThread, + _SwitchToThread, + _TlsAlloc, + _VirtualAlloc, + _VirtualFree, + _VirtualQuery, + _WaitForSingleObject, + _WaitForMultipleObjects, + _WriteConsoleW, + _WriteFile, + _ stdFunction + + // Following syscalls are only available on some Windows PCs. + // We will load syscalls, if available, before using them. + _AddDllDirectory, + _AddVectoredContinueHandler, + _LoadLibraryExA, + _LoadLibraryExW, + _ stdFunction + + // Use RtlGenRandom to generate cryptographically random data. + // This approach has been recommended by Microsoft (see issue + // 15589 for details). + // The RtlGenRandom is not listed in advapi32.dll, instead + // RtlGenRandom function can be found by searching for SystemFunction036. + // Also some versions of Mingw cannot link to SystemFunction036 + // when building executable as Cgo. So load SystemFunction036 + // manually during runtime startup. + _RtlGenRandom stdFunction + + // Load ntdll.dll manually during startup, otherwise Mingw + // links wrong printf function to cgo executable (see issue + // 12030 for details). 
+ _NtWaitForSingleObject stdFunction + _RtlGetCurrentPeb stdFunction + _RtlGetNtVersionNumbers stdFunction + + // These are from non-kernel32.dll, so we prefer to LoadLibraryEx them. + _timeBeginPeriod, + _timeEndPeriod, + _WSAGetOverlappedResult, + _ stdFunction +) + +// Function to be called by windows CreateThread +// to start new os thread. +func tstart_stdcall(newm *m) + +// Init-time helper +func wintls() + +type mOS struct { + threadLock mutex // protects "thread" and prevents closing + thread uintptr // thread handle + + waitsema uintptr // semaphore for parking on locks + resumesema uintptr // semaphore to indicate suspend/resume + + highResTimer uintptr // high resolution timer handle used in usleep + + // preemptExtLock synchronizes preemptM with entry/exit from + // external C code. + // + // This protects against races between preemptM calling + // SuspendThread and external code on this thread calling + // ExitProcess. If these happen concurrently, it's possible to + // exit the suspending thread and suspend the exiting thread, + // leading to deadlock. + // + // 0 indicates this M is not being preempted or in external + // code. Entering external code CASes this from 0 to 1. If + // this fails, a preemption is in progress, so the thread must + // wait for the preemption. preemptM also CASes this from 0 to + // 1. If this fails, the preemption fails (as it would if the + // PC weren't in Go code). The value is reset to 0 when + // returning from external code or after a preemption is + // complete. + // + // TODO(austin): We may not need this if preemption were more + // tightly synchronized on the G/P status and preemption + // blocked transition into _Gsyscall/_Psyscall. + preemptExtLock uint32 +} + +//go:linkname os_sigpipe os.sigpipe +func os_sigpipe() { + throw("too many writes on closed pipe") +} + +// Stubs so tests can link correctly. These should never be called. 
+func open(name *byte, mode, perm int32) int32 { + throw("unimplemented") + return -1 +} +func closefd(fd int32) int32 { + throw("unimplemented") + return -1 +} +func read(fd int32, p unsafe.Pointer, n int32) int32 { + throw("unimplemented") + return -1 +} + +type sigset struct{} + +// Call a Windows function with stdcall conventions, +// and switch to os stack during the call. +func asmstdcall(fn unsafe.Pointer) + +var asmstdcallAddr unsafe.Pointer + +func windowsFindfunc(lib uintptr, name []byte) stdFunction { + if name[len(name)-1] != 0 { + throw("usage") + } + f := stdcall2(_GetProcAddress, lib, uintptr(unsafe.Pointer(&name[0]))) + return stdFunction(unsafe.Pointer(f)) +} + +const _MAX_PATH = 260 // https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation +var sysDirectory [_MAX_PATH + 1]byte +var sysDirectoryLen uintptr + +func windowsLoadSystemLib(name []byte) uintptr { + if sysDirectoryLen == 0 { + l := stdcall2(_GetSystemDirectoryA, uintptr(unsafe.Pointer(&sysDirectory[0])), uintptr(len(sysDirectory)-1)) + if l == 0 || l > uintptr(len(sysDirectory)-1) { + throw("Unable to determine system directory") + } + sysDirectory[l] = '\\' + sysDirectoryLen = l + 1 + } + if useLoadLibraryEx { + return stdcall3(_LoadLibraryExA, uintptr(unsafe.Pointer(&name[0])), 0, _LOAD_LIBRARY_SEARCH_SYSTEM32) + } else { + absName := append(sysDirectory[:sysDirectoryLen], name...) 
+ return stdcall1(_LoadLibraryA, uintptr(unsafe.Pointer(&absName[0]))) + } +} + +const haveCputicksAsm = GOARCH == "386" || GOARCH == "amd64" + +func loadOptionalSyscalls() { + var kernel32dll = []byte("kernel32.dll\000") + k32 := stdcall1(_LoadLibraryA, uintptr(unsafe.Pointer(&kernel32dll[0]))) + if k32 == 0 { + throw("kernel32.dll not found") + } + _AddDllDirectory = windowsFindfunc(k32, []byte("AddDllDirectory\000")) + _AddVectoredContinueHandler = windowsFindfunc(k32, []byte("AddVectoredContinueHandler\000")) + _LoadLibraryExA = windowsFindfunc(k32, []byte("LoadLibraryExA\000")) + _LoadLibraryExW = windowsFindfunc(k32, []byte("LoadLibraryExW\000")) + useLoadLibraryEx = (_LoadLibraryExW != nil && _LoadLibraryExA != nil && _AddDllDirectory != nil) + + var advapi32dll = []byte("advapi32.dll\000") + a32 := windowsLoadSystemLib(advapi32dll) + if a32 == 0 { + throw("advapi32.dll not found") + } + _RtlGenRandom = windowsFindfunc(a32, []byte("SystemFunction036\000")) + + var ntdll = []byte("ntdll.dll\000") + n32 := windowsLoadSystemLib(ntdll) + if n32 == 0 { + throw("ntdll.dll not found") + } + _NtWaitForSingleObject = windowsFindfunc(n32, []byte("NtWaitForSingleObject\000")) + _RtlGetCurrentPeb = windowsFindfunc(n32, []byte("RtlGetCurrentPeb\000")) + _RtlGetNtVersionNumbers = windowsFindfunc(n32, []byte("RtlGetNtVersionNumbers\000")) + + if !haveCputicksAsm { + _QueryPerformanceCounter = windowsFindfunc(k32, []byte("QueryPerformanceCounter\000")) + if _QueryPerformanceCounter == nil { + throw("could not find QPC syscalls") + } + } + + var winmmdll = []byte("winmm.dll\000") + m32 := windowsLoadSystemLib(winmmdll) + if m32 == 0 { + throw("winmm.dll not found") + } + _timeBeginPeriod = windowsFindfunc(m32, []byte("timeBeginPeriod\000")) + _timeEndPeriod = windowsFindfunc(m32, []byte("timeEndPeriod\000")) + if _timeBeginPeriod == nil || _timeEndPeriod == nil { + throw("timeBegin/EndPeriod not found") + } + + var ws232dll = []byte("ws2_32.dll\000") + ws232 := 
windowsLoadSystemLib(ws232dll) + if ws232 == 0 { + throw("ws2_32.dll not found") + } + _WSAGetOverlappedResult = windowsFindfunc(ws232, []byte("WSAGetOverlappedResult\000")) + if _WSAGetOverlappedResult == nil { + throw("WSAGetOverlappedResult not found") + } + + if windowsFindfunc(n32, []byte("wine_get_version\000")) != nil { + // running on Wine + initWine(k32) + } +} + +func monitorSuspendResume() { + const ( + _DEVICE_NOTIFY_CALLBACK = 2 + ) + type _DEVICE_NOTIFY_SUBSCRIBE_PARAMETERS struct { + callback uintptr + context uintptr + } + + powrprof := windowsLoadSystemLib([]byte("powrprof.dll\000")) + if powrprof == 0 { + return // Running on Windows 7, where we don't need it anyway. + } + powerRegisterSuspendResumeNotification := windowsFindfunc(powrprof, []byte("PowerRegisterSuspendResumeNotification\000")) + if powerRegisterSuspendResumeNotification == nil { + return // Running on Windows 7, where we don't need it anyway. + } + var fn any = func(context uintptr, changeType uint32, setting uintptr) uintptr { + for mp := (*m)(atomic.Loadp(unsafe.Pointer(&allm))); mp != nil; mp = mp.alllink { + if mp.resumesema != 0 { + stdcall1(_SetEvent, mp.resumesema) + } + } + return 0 + } + params := _DEVICE_NOTIFY_SUBSCRIBE_PARAMETERS{ + callback: compileCallback(*efaceOf(&fn), true), + } + handle := uintptr(0) + stdcall3(powerRegisterSuspendResumeNotification, _DEVICE_NOTIFY_CALLBACK, + uintptr(unsafe.Pointer(&params)), uintptr(unsafe.Pointer(&handle))) +} + +//go:nosplit +func getLoadLibrary() uintptr { + return uintptr(unsafe.Pointer(_LoadLibraryW)) +} + +//go:nosplit +func getLoadLibraryEx() uintptr { + return uintptr(unsafe.Pointer(_LoadLibraryExW)) +} + +//go:nosplit +func getGetProcAddress() uintptr { + return uintptr(unsafe.Pointer(_GetProcAddress)) +} + +func getproccount() int32 { + var mask, sysmask uintptr + ret := stdcall3(_GetProcessAffinityMask, currentProcess, uintptr(unsafe.Pointer(&mask)), uintptr(unsafe.Pointer(&sysmask))) + if ret != 0 { + n := 0 + maskbits
:= int(unsafe.Sizeof(mask) * 8) + for i := 0; i < maskbits; i++ { + if mask&(1<<uint(i)) != 0 { + n++ + } + } + if n != 0 { + return int32(n) + } + } + // use GetSystemInfo if GetProcessAffinityMask fails + var info systeminfo + stdcall1(_GetSystemInfo, uintptr(unsafe.Pointer(&info))) + return int32(info.dwnumberofprocessors) +} + +func getPageSize() uintptr { + var info systeminfo + stdcall1(_GetSystemInfo, uintptr(unsafe.Pointer(&info))) + return uintptr(info.dwpagesize) +} + +const ( + currentProcess = ^uintptr(0) // -1 = current process + currentThread = ^uintptr(1) // -2 = current thread +) + +// in sys_windows_386.s and sys_windows_amd64.s: +func getlasterror() uint32 + +// When loading DLLs, we prefer to use LoadLibraryEx with +// LOAD_LIBRARY_SEARCH_* flags, if available. LoadLibraryEx is not +// available on old Windows, though, and the LOAD_LIBRARY_SEARCH_* +// flags are not available on some versions of Windows without a +// security patch. +// +// https://msdn.microsoft.com/en-us/library/ms684179(v=vs.85).aspx says: +// "Windows 7, Windows Server 2008 R2, Windows Vista, and Windows +// Server 2008: The LOAD_LIBRARY_SEARCH_* flags are available on +// systems that have KB2533623 installed. To determine whether the +// flags are available, use GetProcAddress to get the address of the +// AddDllDirectory, RemoveDllDirectory, or SetDefaultDllDirectories +// function. If GetProcAddress succeeds, the LOAD_LIBRARY_SEARCH_* +// flags can be used with LoadLibraryEx." +var useLoadLibraryEx bool + +var timeBeginPeriodRetValue uint32 + +// osRelaxMinNS indicates that sysmon shouldn't osRelax if the next +// timer is less than 60 ms from now. Since osRelaxing may reduce +// timer resolution to 15.6 ms, this keeps timer error under roughly 1 +// part in 4. +const osRelaxMinNS = 60 * 1e6 + +// osRelax is called by the scheduler when transitioning to and from +// all Ps being idle. +// +// Some versions of Windows have high resolution timer. 
For those +// versions osRelax is noop. +// For Windows versions without high resolution timer, osRelax +// adjusts the system-wide timer resolution. Go needs a +// high resolution timer while running and there's little extra cost +// if we're already using the CPU, but if all Ps are idle there's no +// need to consume extra power to drive the high-res timer. +func osRelax(relax bool) uint32 { + if haveHighResTimer { + // If the high resolution timer is available, the runtime uses the timer + // to sleep for short durations. This means there's no need to adjust + // the global clock frequency. + return 0 + } + + if relax { + return uint32(stdcall1(_timeEndPeriod, 1)) + } else { + return uint32(stdcall1(_timeBeginPeriod, 1)) + } +} + +// haveHighResTimer indicates that the CreateWaitableTimerEx +// CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag is available. +var haveHighResTimer = false + +// createHighResTimer calls CreateWaitableTimerEx with +// CREATE_WAITABLE_TIMER_HIGH_RESOLUTION flag to create high +// resolution timer. createHighResTimer returns new timer +// handle or 0, if CreateWaitableTimerEx failed. +func createHighResTimer() uintptr { + const ( + // As per @jstarks, see + // https://github.com/golang/go/issues/8687#issuecomment-656259353 + _CREATE_WAITABLE_TIMER_HIGH_RESOLUTION = 0x00000002 + + _SYNCHRONIZE = 0x00100000 + _TIMER_QUERY_STATE = 0x0001 + _TIMER_MODIFY_STATE = 0x0002 + ) + return stdcall4(_CreateWaitableTimerExW, 0, 0, + _CREATE_WAITABLE_TIMER_HIGH_RESOLUTION, + _SYNCHRONIZE|_TIMER_QUERY_STATE|_TIMER_MODIFY_STATE) +} + +const highResTimerSupported = GOARCH == "386" || GOARCH == "amd64" + +func initHighResTimer() { + if !highResTimerSupported { + // TODO: Not yet implemented. 
+ return + } + h := createHighResTimer() + if h != 0 { + haveHighResTimer = true + stdcall1(_CloseHandle, h) + } +} + +//go:linkname canUseLongPaths os.canUseLongPaths +var canUseLongPaths bool + +// We want this to be large enough to hold the contents of sysDirectory, *plus* +// a slash and another component that itself is greater than MAX_PATH. +var longFileName [(_MAX_PATH+1)*2 + 1]byte + +// initLongPathSupport initializes the canUseLongPaths variable, which is +// linked into os.canUseLongPaths for determining whether or not long paths +// need to be fixed up. In the best case, this function is running on newer +// Windows 10 builds, which have a bit field member of the PEB called +// "IsLongPathAwareProcess." When this is set, we don't need to go through the +// error-prone fixup function in order to access long paths. So this init +// function first checks the Windows build number, sets the flag, and then +// tests to see if it's actually working. If everything checks out, then +// canUseLongPaths is set to true, and later when called, os.fixLongPath +// returns early without doing work. +func initLongPathSupport() { + const ( + IsLongPathAwareProcess = 0x80 + PebBitFieldOffset = 3 + OPEN_EXISTING = 3 + ERROR_PATH_NOT_FOUND = 3 + ) + + // Check that we're ≥ 10.0.15063. + var maj, min, build uint32 + stdcall3(_RtlGetNtVersionNumbers, uintptr(unsafe.Pointer(&maj)), uintptr(unsafe.Pointer(&min)), uintptr(unsafe.Pointer(&build))) + if maj < 10 || (maj == 10 && min == 0 && build&0xffff < 15063) { + return + } + + // Set the IsLongPathAwareProcess flag of the PEB's bit field. 
+ bitField := (*byte)(unsafe.Pointer(stdcall0(_RtlGetCurrentPeb) + PebBitFieldOffset)) + originalBitField := *bitField + *bitField |= IsLongPathAwareProcess + + // Check that this actually has an effect, by constructing a large file + // path and seeing whether we get ERROR_PATH_NOT_FOUND, rather than + // some other error, which would indicate the path is too long, and + // hence long path support is not successful. This whole section is NOT + // strictly necessary, but is a nice validity check for the near to + // medium term, when this functionality is still relatively new in + // Windows. + getRandomData(longFileName[len(longFileName)-33 : len(longFileName)-1]) + start := copy(longFileName[:], sysDirectory[:sysDirectoryLen]) + const dig = "0123456789abcdef" + for i := 0; i < 32; i++ { + longFileName[start+i*2] = dig[longFileName[len(longFileName)-33+i]>>4] + longFileName[start+i*2+1] = dig[longFileName[len(longFileName)-33+i]&0xf] + } + start += 64 + for i := start; i < len(longFileName)-1; i++ { + longFileName[i] = 'A' + } + stdcall7(_CreateFileA, uintptr(unsafe.Pointer(&longFileName[0])), 0, 0, 0, OPEN_EXISTING, 0, 0) + // The ERROR_PATH_NOT_FOUND error value is distinct from + // ERROR_FILE_NOT_FOUND or ERROR_INVALID_NAME, the latter of which we + // expect here due to the final component being too long. 
+ if getlasterror() == ERROR_PATH_NOT_FOUND { + *bitField = originalBitField + println("runtime: warning: IsLongPathAwareProcess failed to enable long paths; proceeding in fixup mode") + return + } + + canUseLongPaths = true +} + +func osinit() { + asmstdcallAddr = unsafe.Pointer(abi.FuncPCABI0(asmstdcall)) + + setBadSignalMsg() + + loadOptionalSyscalls() + + disableWER() + + initExceptionHandler() + + initHighResTimer() + timeBeginPeriodRetValue = osRelax(false) + + initLongPathSupport() + + ncpu = getproccount() + + physPageSize = getPageSize() + + // Windows dynamic priority boosting assumes that a process has different types + // of dedicated threads -- GUI, IO, computational, etc. Go processes use + // equivalent threads that all do a mix of GUI, IO, computations, etc. + // In such context dynamic priority boosting does nothing but harm, so we turn it off. + stdcall2(_SetProcessPriorityBoost, currentProcess, 1) +} + +// useQPCTime controls whether time.now and nanotime use QueryPerformanceCounter. +// This is only set to 1 when running under Wine. 
+var useQPCTime uint8 + +var qpcStartCounter int64 +var qpcMultiplier int64 + +//go:nosplit +func nanotimeQPC() int64 { + var counter int64 = 0 + stdcall1(_QueryPerformanceCounter, uintptr(unsafe.Pointer(&counter))) + + // returns number of nanoseconds + return (counter - qpcStartCounter) * qpcMultiplier +} + +//go:nosplit +func nowQPC() (sec int64, nsec int32, mono int64) { + var ft int64 + stdcall1(_GetSystemTimeAsFileTime, uintptr(unsafe.Pointer(&ft))) + + t := (ft - 116444736000000000) * 100 + + sec = t / 1000000000 + nsec = int32(t - sec*1000000000) + + mono = nanotimeQPC() + return +} + +func initWine(k32 uintptr) { + _GetSystemTimeAsFileTime = windowsFindfunc(k32, []byte("GetSystemTimeAsFileTime\000")) + if _GetSystemTimeAsFileTime == nil { + throw("could not find GetSystemTimeAsFileTime() syscall") + } + + _QueryPerformanceCounter = windowsFindfunc(k32, []byte("QueryPerformanceCounter\000")) + _QueryPerformanceFrequency = windowsFindfunc(k32, []byte("QueryPerformanceFrequency\000")) + if _QueryPerformanceCounter == nil || _QueryPerformanceFrequency == nil { + throw("could not find QPC syscalls") + } + + // We can not simply fallback to GetSystemTimeAsFileTime() syscall, since its time is not monotonic, + // instead we use QueryPerformanceCounter family of syscalls to implement monotonic timer + // https://msdn.microsoft.com/en-us/library/windows/desktop/dn553408(v=vs.85).aspx + + var tmp int64 + stdcall1(_QueryPerformanceFrequency, uintptr(unsafe.Pointer(&tmp))) + if tmp == 0 { + throw("QueryPerformanceFrequency syscall returned zero, running on unsupported hardware") + } + + // This should not overflow, it is a number of ticks of the performance counter per second, + // its resolution is at most 10 per usecond (on Wine, even smaller on real hardware), so it will be at most 10 millions here, + // panic if overflows. 
+ if tmp > (1<<31 - 1) { + throw("QueryPerformanceFrequency overflow 32 bit divider, check nosplit discussion to proceed") + } + qpcFrequency := int32(tmp) + stdcall1(_QueryPerformanceCounter, uintptr(unsafe.Pointer(&qpcStartCounter))) + + // Since we are supposed to run this time calls only on Wine, it does not lose precision, + // since Wine's timer is kind of emulated at 10 Mhz, so it will be a nice round multiplier of 100 + // but for general purpose system (like 3.3 Mhz timer on i7) it will not be very precise. + // We have to do it this way (or similar), since multiplying QPC counter by 100 millions overflows + // int64 and resulted time will always be invalid. + qpcMultiplier = int64(timediv(1000000000, qpcFrequency, nil)) + + useQPCTime = 1 +} + +//go:nosplit +func getRandomData(r []byte) { + n := 0 + if stdcall2(_RtlGenRandom, uintptr(unsafe.Pointer(&r[0])), uintptr(len(r)))&0xff != 0 { + n = len(r) + } + extendRandom(r, n) +} + +func goenvs() { + // strings is a pointer to environment variable pairs in the form: + // "envA=valA\x00envB=valB\x00\x00" (in UTF-16) + // Two consecutive zero bytes end the list. + strings := unsafe.Pointer(stdcall0(_GetEnvironmentStringsW)) + p := (*[1 << 24]uint16)(strings)[:] + + n := 0 + for from, i := 0, 0; true; i++ { + if p[i] == 0 { + // empty string marks the end + if i == from { + break + } + from = i + 1 + n++ + } + } + envs = make([]string, n) + + for i := range envs { + envs[i] = gostringw(&p[0]) + for p[0] != 0 { + p = p[1:] + } + p = p[1:] // skip nil byte + } + + stdcall1(_FreeEnvironmentStringsW, uintptr(strings)) + + // We call these all the way here, late in init, so that malloc works + // for the callback functions these generate. + var fn any = ctrlHandler + ctrlHandlerPC := compileCallback(*efaceOf(&fn), true) + stdcall2(_SetConsoleCtrlHandler, ctrlHandlerPC, 1) + + monitorSuspendResume() +} + +// exiting is set to non-zero when the process is exiting. 
+var exiting uint32 + +//go:nosplit +func exit(code int32) { + // Disallow thread suspension for preemption. Otherwise, + // ExitProcess and SuspendThread can race: SuspendThread + // queues a suspension request for this thread, ExitProcess + // kills the suspending thread, and then this thread suspends. + lock(&suspendLock) + atomic.Store(&exiting, 1) + stdcall1(_ExitProcess, uintptr(code)) +} + +// write1 must be nosplit because it's used as a last resort in +// functions like badmorestackg0. In such cases, we'll always take the +// ASCII path. +// +//go:nosplit +func write1(fd uintptr, buf unsafe.Pointer, n int32) int32 { + const ( + _STD_OUTPUT_HANDLE = ^uintptr(10) // -11 + _STD_ERROR_HANDLE = ^uintptr(11) // -12 + ) + var handle uintptr + switch fd { + case 1: + handle = stdcall1(_GetStdHandle, _STD_OUTPUT_HANDLE) + case 2: + handle = stdcall1(_GetStdHandle, _STD_ERROR_HANDLE) + default: + // assume fd is real windows handle. + handle = fd + } + isASCII := true + b := (*[1 << 30]byte)(buf)[:n] + for _, x := range b { + if x >= 0x80 { + isASCII = false + break + } + } + + if !isASCII { + var m uint32 + isConsole := stdcall2(_GetConsoleMode, handle, uintptr(unsafe.Pointer(&m))) != 0 + // If this is a console output, various non-unicode code pages can be in use. + // Use the dedicated WriteConsole call to ensure unicode is printed correctly. + if isConsole { + return int32(writeConsole(handle, buf, n)) + } + } + var written uint32 + stdcall5(_WriteFile, handle, uintptr(buf), uintptr(n), uintptr(unsafe.Pointer(&written)), 0) + return int32(written) +} + +var ( + utf16ConsoleBack [1000]uint16 + utf16ConsoleBackLock mutex +) + +// writeConsole writes bufLen bytes from buf to the console File. +// It returns the number of bytes written. +func writeConsole(handle uintptr, buf unsafe.Pointer, bufLen int32) int { + const surr2 = (surrogateMin + surrogateMax + 1) / 2 + + // Do not use defer for unlock. May cause issues when printing a panic. 
+ lock(&utf16ConsoleBackLock) + + b := (*[1 << 30]byte)(buf)[:bufLen] + s := *(*string)(unsafe.Pointer(&b)) + + utf16tmp := utf16ConsoleBack[:] + + total := len(s) + w := 0 + for _, r := range s { + if w >= len(utf16tmp)-2 { + writeConsoleUTF16(handle, utf16tmp[:w]) + w = 0 + } + if r < 0x10000 { + utf16tmp[w] = uint16(r) + w++ + } else { + r -= 0x10000 + utf16tmp[w] = surrogateMin + uint16(r>>10)&0x3ff + utf16tmp[w+1] = surr2 + uint16(r)&0x3ff + w += 2 + } + } + writeConsoleUTF16(handle, utf16tmp[:w]) + unlock(&utf16ConsoleBackLock) + return total +} + +// writeConsoleUTF16 is the dedicated windows calls that correctly prints +// to the console regardless of the current code page. Input is utf-16 code points. +// The handle must be a console handle. +func writeConsoleUTF16(handle uintptr, b []uint16) { + l := uint32(len(b)) + if l == 0 { + return + } + var written uint32 + stdcall5(_WriteConsoleW, + handle, + uintptr(unsafe.Pointer(&b[0])), + uintptr(l), + uintptr(unsafe.Pointer(&written)), + 0, + ) + return +} + +//go:nosplit +func semasleep(ns int64) int32 { + const ( + _WAIT_ABANDONED = 0x00000080 + _WAIT_OBJECT_0 = 0x00000000 + _WAIT_TIMEOUT = 0x00000102 + _WAIT_FAILED = 0xFFFFFFFF + ) + + var result uintptr + if ns < 0 { + result = stdcall2(_WaitForSingleObject, getg().m.waitsema, uintptr(_INFINITE)) + } else { + start := nanotime() + elapsed := int64(0) + for { + ms := int64(timediv(ns-elapsed, 1000000, nil)) + if ms == 0 { + ms = 1 + } + result = stdcall4(_WaitForMultipleObjects, 2, + uintptr(unsafe.Pointer(&[2]uintptr{getg().m.waitsema, getg().m.resumesema})), + 0, uintptr(ms)) + if result != _WAIT_OBJECT_0+1 { + // Not a suspend/resume event + break + } + elapsed = nanotime() - start + if elapsed >= ns { + return -1 + } + } + } + switch result { + case _WAIT_OBJECT_0: // Signaled + return 0 + + case _WAIT_TIMEOUT: + return -1 + + case _WAIT_ABANDONED: + systemstack(func() { + throw("runtime.semasleep wait_abandoned") + }) + + case _WAIT_FAILED: + 
systemstack(func() { + print("runtime: waitforsingleobject wait_failed; errno=", getlasterror(), "\n") + throw("runtime.semasleep wait_failed") + }) + + default: + systemstack(func() { + print("runtime: waitforsingleobject unexpected; result=", result, "\n") + throw("runtime.semasleep unexpected") + }) + } + + return -1 // unreachable +} + +//go:nosplit +func semawakeup(mp *m) { + if stdcall1(_SetEvent, mp.waitsema) == 0 { + systemstack(func() { + print("runtime: setevent failed; errno=", getlasterror(), "\n") + throw("runtime.semawakeup") + }) + } +} + +//go:nosplit +func semacreate(mp *m) { + if mp.waitsema != 0 { + return + } + mp.waitsema = stdcall4(_CreateEventA, 0, 0, 0, 0) + if mp.waitsema == 0 { + systemstack(func() { + print("runtime: createevent failed; errno=", getlasterror(), "\n") + throw("runtime.semacreate") + }) + } + mp.resumesema = stdcall4(_CreateEventA, 0, 0, 0, 0) + if mp.resumesema == 0 { + systemstack(func() { + print("runtime: createevent failed; errno=", getlasterror(), "\n") + throw("runtime.semacreate") + }) + stdcall1(_CloseHandle, mp.waitsema) + mp.waitsema = 0 + } +} + +// May run with m.p==nil, so write barriers are not allowed. This +// function is called by newosproc0, so it is also required to +// operate without stack guards. +// +//go:nowritebarrierrec +//go:nosplit +func newosproc(mp *m) { + // We pass 0 for the stack size to use the default for this binary. + thandle := stdcall6(_CreateThread, 0, 0, + abi.FuncPCABI0(tstart_stdcall), uintptr(unsafe.Pointer(mp)), + 0, 0) + + if thandle == 0 { + if atomic.Load(&exiting) != 0 { + // CreateThread may fail if called + // concurrently with ExitProcess. If this + // happens, just freeze this thread and let + // the process exit. See issue #18253. 
+ lock(&deadlock) + lock(&deadlock) + } + print("runtime: failed to create new OS thread (have ", mcount(), " already; errno=", getlasterror(), ")\n") + throw("runtime.newosproc") + } + + // Close thandle to avoid leaking the thread object if it exits. + stdcall1(_CloseHandle, thandle) +} + +// Used by the C library build mode. On Linux this function would allocate a +// stack, but that's not necessary for Windows. No stack guards are present +// and the GC has not been initialized, so write barriers will fail. +// +//go:nowritebarrierrec +//go:nosplit +func newosproc0(mp *m, stk unsafe.Pointer) { + // TODO: this is completely broken. The args passed to newosproc0 (in asm_amd64.s) + // are stacksize and function, not *m and stack. + // Check os_linux.go for an implementation that might actually work. + throw("bad newosproc0") +} + +func exitThread(wait *atomic.Uint32) { + // We should never reach exitThread on Windows because we let + // the OS clean up threads. + throw("exitThread") +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the parent thread (main thread in case of bootstrap), can allocate memory. +func mpreinit(mp *m) { +} + +//go:nosplit +func sigsave(p *sigset) { +} + +//go:nosplit +func msigrestore(sigmask sigset) { +} + +//go:nosplit +//go:nowritebarrierrec +func clearSignalHandlers() { +} + +//go:nosplit +func sigblock(exiting bool) { +} + +// Called to initialize a new m (including the bootstrap m). +// Called on the new thread, cannot allocate memory. +func minit() { + var thandle uintptr + if stdcall7(_DuplicateHandle, currentProcess, currentThread, currentProcess, uintptr(unsafe.Pointer(&thandle)), 0, 0, _DUPLICATE_SAME_ACCESS) == 0 { + print("runtime.minit: duplicatehandle failed; errno=", getlasterror(), "\n") + throw("runtime.minit: duplicatehandle failed") + } + + mp := getg().m + lock(&mp.threadLock) + mp.thread = thandle + + // Configure usleep timer, if possible. 
+ if mp.highResTimer == 0 && haveHighResTimer { + mp.highResTimer = createHighResTimer() + if mp.highResTimer == 0 { + print("runtime: CreateWaitableTimerEx failed; errno=", getlasterror(), "\n") + throw("CreateWaitableTimerEx when creating timer failed") + } + } + unlock(&mp.threadLock) + + // Query the true stack base from the OS. Currently we're + // running on a small assumed stack. + var mbi memoryBasicInformation + res := stdcall3(_VirtualQuery, uintptr(unsafe.Pointer(&mbi)), uintptr(unsafe.Pointer(&mbi)), unsafe.Sizeof(mbi)) + if res == 0 { + print("runtime: VirtualQuery failed; errno=", getlasterror(), "\n") + throw("VirtualQuery for stack base failed") + } + // The system leaves an 8K PAGE_GUARD region at the bottom of + // the stack (in theory VirtualQuery isn't supposed to include + // that, but it does). Add an additional 8K of slop for + // calling C functions that don't have stack checks and for + // lastcontinuehandler. We shouldn't be anywhere near this + // bound anyway. + base := mbi.allocationBase + 16<<10 + // Sanity check the stack bounds. + g0 := getg() + if base > g0.stack.hi || g0.stack.hi-base > 64<<20 { + print("runtime: g0 stack [", hex(base), ",", hex(g0.stack.hi), ")\n") + throw("bad g0 stack") + } + g0.stack.lo = base + g0.stackguard0 = g0.stack.lo + _StackGuard + g0.stackguard1 = g0.stackguard0 + // Sanity check the SP. + stackcheck() +} + +// Called from dropm to undo the effect of an minit. +// +//go:nosplit +func unminit() { + mp := getg().m + lock(&mp.threadLock) + if mp.thread != 0 { + stdcall1(_CloseHandle, mp.thread) + mp.thread = 0 + } + unlock(&mp.threadLock) +} + +// Called from exitm, but not from drop, to undo the effect of thread-owned +// resources in minit, semacreate, or elsewhere. Do not take locks after calling this. 
+// +//go:nosplit +func mdestroy(mp *m) { + if mp.highResTimer != 0 { + stdcall1(_CloseHandle, mp.highResTimer) + mp.highResTimer = 0 + } + if mp.waitsema != 0 { + stdcall1(_CloseHandle, mp.waitsema) + mp.waitsema = 0 + } + if mp.resumesema != 0 { + stdcall1(_CloseHandle, mp.resumesema) + mp.resumesema = 0 + } +} + +// Calling stdcall on os stack. +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrier +//go:nosplit +func stdcall(fn stdFunction) uintptr { + gp := getg() + mp := gp.m + mp.libcall.fn = uintptr(unsafe.Pointer(fn)) + resetLibcall := false + if mp.profilehz != 0 && mp.libcallsp == 0 { + // leave pc/sp for cpu profiler + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + resetLibcall = true // See comment in sys_darwin.go:libcCall + } + asmcgocall(asmstdcallAddr, unsafe.Pointer(&mp.libcall)) + if resetLibcall { + mp.libcallsp = 0 + } + return mp.libcall.r1 +} + +//go:nosplit +func stdcall0(fn stdFunction) uintptr { + mp := getg().m + mp.libcall.n = 0 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&fn))) // it's unused but must be non-nil, otherwise crashes + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall1(fn stdFunction, a0 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 1 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall2(fn stdFunction, a0, a1 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 2 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall3(fn stdFunction, a0, a1, a2 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 3 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall4(fn 
stdFunction, a0, a1, a2, a3 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 4 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall5(fn stdFunction, a0, a1, a2, a3, a4 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 5 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall6(fn stdFunction, a0, a1, a2, a3, a4, a5 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 6 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +//go:nosplit +//go:cgo_unsafe_args +func stdcall7(fn stdFunction, a0, a1, a2, a3, a4, a5, a6 uintptr) uintptr { + mp := getg().m + mp.libcall.n = 7 + mp.libcall.args = uintptr(noescape(unsafe.Pointer(&a0))) + return stdcall(fn) +} + +// These must run on the system stack only. +func usleep2(dt int32) +func usleep2HighRes(dt int32) +func switchtothread() + +//go:nosplit +func osyield_no_g() { + switchtothread() +} + +//go:nosplit +func osyield() { + systemstack(switchtothread) +} + +//go:nosplit +func usleep_no_g(us uint32) { + dt := -10 * int32(us) // relative sleep (negative), 100ns units + usleep2(dt) +} + +//go:nosplit +func usleep(us uint32) { + systemstack(func() { + dt := -10 * int32(us) // relative sleep (negative), 100ns units + // If the high-res timer is available and its handle has been allocated for this m, use it. + // Otherwise fall back to the low-res one, which doesn't need a handle. + if haveHighResTimer && getg().m.highResTimer != 0 { + usleep2HighRes(dt) + } else { + usleep2(dt) + } + }) +} + +func ctrlHandler(_type uint32) uintptr { + var s uint32 + + switch _type { + case _CTRL_C_EVENT, _CTRL_BREAK_EVENT: + s = _SIGINT + case _CTRL_CLOSE_EVENT, _CTRL_LOGOFF_EVENT, _CTRL_SHUTDOWN_EVENT: + s = _SIGTERM + default: + return 0 + } + + if sigsend(s) { + if s == _SIGTERM { + // Windows terminates the process after this handler returns. 
+ // Block indefinitely to give signal handlers a chance to clean up, + // but make sure to be properly parked first, so the rest of the + // program can continue executing. + block() + } + return 1 + } + return 0 +} + +// called from zcallback_windows_*.s to sys_windows_*.s +func callbackasm1() + +var profiletimer uintptr + +func profilem(mp *m, thread uintptr) { + // Align Context to 16 bytes. + var c *context + var cbuf [unsafe.Sizeof(*c) + 15]byte + c = (*context)(unsafe.Pointer((uintptr(unsafe.Pointer(&cbuf[15]))) &^ 15)) + + c.contextflags = _CONTEXT_CONTROL + stdcall2(_GetThreadContext, thread, uintptr(unsafe.Pointer(c))) + + gp := gFromSP(mp, c.sp()) + + sigprof(c.ip(), c.sp(), c.lr(), gp, mp) +} + +func gFromSP(mp *m, sp uintptr) *g { + if gp := mp.g0; gp != nil && gp.stack.lo < sp && sp < gp.stack.hi { + return gp + } + if gp := mp.gsignal; gp != nil && gp.stack.lo < sp && sp < gp.stack.hi { + return gp + } + if gp := mp.curg; gp != nil && gp.stack.lo < sp && sp < gp.stack.hi { + return gp + } + return nil +} + +func profileLoop() { + stdcall2(_SetThreadPriority, currentThread, _THREAD_PRIORITY_HIGHEST) + + for { + stdcall2(_WaitForSingleObject, profiletimer, _INFINITE) + first := (*m)(atomic.Loadp(unsafe.Pointer(&allm))) + for mp := first; mp != nil; mp = mp.alllink { + if mp == getg().m { + // Don't profile ourselves. + continue + } + + lock(&mp.threadLock) + // Do not profile threads blocked on Notes, + // this includes idle worker threads, + // idle timer thread, idle heap scavenger, etc. + if mp.thread == 0 || mp.profilehz == 0 || mp.blocked { + unlock(&mp.threadLock) + continue + } + // Acquire our own handle to the thread. 
+ var thread uintptr + if stdcall7(_DuplicateHandle, currentProcess, mp.thread, currentProcess, uintptr(unsafe.Pointer(&thread)), 0, 0, _DUPLICATE_SAME_ACCESS) == 0 { + print("runtime: duplicatehandle failed; errno=", getlasterror(), "\n") + throw("duplicatehandle failed") + } + unlock(&mp.threadLock) + + // mp may exit between the DuplicateHandle + // above and the SuspendThread. The handle + // will remain valid, but SuspendThread may + // fail. + if int32(stdcall1(_SuspendThread, thread)) == -1 { + // The thread no longer exists. + stdcall1(_CloseHandle, thread) + continue + } + if mp.profilehz != 0 && !mp.blocked { + // Pass the thread handle in case mp + // was in the process of shutting down. + profilem(mp, thread) + } + stdcall1(_ResumeThread, thread) + stdcall1(_CloseHandle, thread) + } + } +} + +func setProcessCPUProfiler(hz int32) { + if profiletimer == 0 { + timer := stdcall3(_CreateWaitableTimerA, 0, 0, 0) + atomic.Storeuintptr(&profiletimer, timer) + newm(profileLoop, nil, -1) + } +} + +func setThreadCPUProfiler(hz int32) { + ms := int32(0) + due := ^int64(^uint64(1 << 63)) + if hz > 0 { + ms = 1000 / hz + if ms == 0 { + ms = 1 + } + due = int64(ms) * -10000 + } + stdcall6(_SetWaitableTimer, profiletimer, uintptr(unsafe.Pointer(&due)), uintptr(ms), 0, 0, 0) + atomic.Store((*uint32)(unsafe.Pointer(&getg().m.profilehz)), uint32(hz)) +} + +const preemptMSupported = true + +// suspendLock protects simultaneous SuspendThread operations from +// suspending each other. +var suspendLock mutex + +func preemptM(mp *m) { + if mp == getg().m { + throw("self-preempt") + } + + // Synchronize with external code that may try to ExitProcess. + if !atomic.Cas(&mp.preemptExtLock, 0, 1) { + // External code is running. Fail the preemption + // attempt. + mp.preemptGen.Add(1) + return + } + + // Acquire our own handle to mp's thread. + lock(&mp.threadLock) + if mp.thread == 0 { + // The M hasn't been minit'd yet (or was just unminit'd). 
+ unlock(&mp.threadLock) + atomic.Store(&mp.preemptExtLock, 0) + mp.preemptGen.Add(1) + return + } + var thread uintptr + if stdcall7(_DuplicateHandle, currentProcess, mp.thread, currentProcess, uintptr(unsafe.Pointer(&thread)), 0, 0, _DUPLICATE_SAME_ACCESS) == 0 { + print("runtime.preemptM: duplicatehandle failed; errno=", getlasterror(), "\n") + throw("runtime.preemptM: duplicatehandle failed") + } + unlock(&mp.threadLock) + + // Prepare thread context buffer. This must be aligned to 16 bytes. + var c *context + var cbuf [unsafe.Sizeof(*c) + 15]byte + c = (*context)(unsafe.Pointer((uintptr(unsafe.Pointer(&cbuf[15]))) &^ 15)) + c.contextflags = _CONTEXT_CONTROL + + // Serialize thread suspension. SuspendThread is asynchronous, + // so it's otherwise possible for two threads to suspend each + // other and deadlock. We must hold this lock until after + // GetThreadContext, since that blocks until the thread is + // actually suspended. + lock(&suspendLock) + + // Suspend the thread. + if int32(stdcall1(_SuspendThread, thread)) == -1 { + unlock(&suspendLock) + stdcall1(_CloseHandle, thread) + atomic.Store(&mp.preemptExtLock, 0) + // The thread no longer exists. This shouldn't be + // possible, but just acknowledge the request. + mp.preemptGen.Add(1) + return + } + + // We have to be very careful between this point and once + // we've shown mp is at an async safe-point. This is like a + // signal handler in the sense that mp could have been doing + // anything when we stopped it, including holding arbitrary + // locks. + + // We have to get the thread context before inspecting the M + // because SuspendThread only requests a suspend. + // GetThreadContext actually blocks until it's suspended. + stdcall2(_GetThreadContext, thread, uintptr(unsafe.Pointer(c))) + + unlock(&suspendLock) + + // Does it want a preemption and is it safe to preempt? 
+ gp := gFromSP(mp, c.sp()) + if gp != nil && wantAsyncPreempt(gp) { + if ok, newpc := isAsyncSafePoint(gp, c.ip(), c.sp(), c.lr()); ok { + // Inject call to asyncPreempt + targetPC := abi.FuncPCABI0(asyncPreempt) + switch GOARCH { + default: + throw("unsupported architecture") + case "386", "amd64": + // Make it look like the thread called targetPC. + sp := c.sp() + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = newpc + c.set_sp(sp) + c.set_ip(targetPC) + + case "arm": + // Push LR. The injected call is responsible + // for restoring LR. gentraceback is aware of + // this extra slot. See sigctxt.pushCall in + // signal_arm.go, which is similar except we + // subtract 1 from IP here. + sp := c.sp() + sp -= goarch.PtrSize + c.set_sp(sp) + *(*uint32)(unsafe.Pointer(sp)) = uint32(c.lr()) + c.set_lr(newpc - 1) + c.set_ip(targetPC) + + case "arm64": + // Push LR. The injected call is responsible + // for restoring LR. gentraceback is aware of + // this extra slot. See sigctxt.pushCall in + // signal_arm64.go. + sp := c.sp() - 16 // SP needs 16-byte alignment + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(sp)) = uint64(c.lr()) + c.set_lr(newpc) + c.set_ip(targetPC) + } + stdcall2(_SetThreadContext, thread, uintptr(unsafe.Pointer(c))) + } + } + + atomic.Store(&mp.preemptExtLock, 0) + + // Acknowledge the preemption. + mp.preemptGen.Add(1) + + stdcall1(_ResumeThread, thread) + stdcall1(_CloseHandle, thread) +} + +// osPreemptExtEnter is called before entering external code that may +// call ExitProcess. +// +// This must be nosplit because it may be called from a syscall with +// untyped stack slots, so the stack must not be grown or scanned. +// +//go:nosplit +func osPreemptExtEnter(mp *m) { + for !atomic.Cas(&mp.preemptExtLock, 0, 1) { + // An asynchronous preemption is in progress. It's not + // safe to enter external code because it may call + // ExitProcess and deadlock with SuspendThread. 
+ // Ideally we would do the preemption ourselves, but + // can't since there may be untyped syscall arguments + // on the stack. Instead, just wait and encourage the + // SuspendThread APC to run. The preemption should be + // done shortly. + osyield() + } + // Asynchronous preemption is now blocked. +} + +// osPreemptExtExit is called after returning from external code that +// may call ExitProcess. +// +// See osPreemptExtEnter for why this is nosplit. +// +//go:nosplit +func osPreemptExtExit(mp *m) { + atomic.Store(&mp.preemptExtLock, 0) +} diff --git a/src/runtime/os_windows_arm.go b/src/runtime/os_windows_arm.go new file mode 100644 index 0000000..10aff75 --- /dev/null +++ b/src/runtime/os_windows_arm.go @@ -0,0 +1,22 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +//go:nosplit +func cputicks() int64 { + var counter int64 + stdcall1(_QueryPerformanceCounter, uintptr(unsafe.Pointer(&counter))) + return counter +} + +func checkgoarm() { + if goarm < 7 { + print("Need atomic synchronization instructions, coprocessor ", + "access instructions. Recompile using GOARM=7.\n") + exit(1) + } +} diff --git a/src/runtime/os_windows_arm64.go b/src/runtime/os_windows_arm64.go new file mode 100644 index 0000000..7e41344 --- /dev/null +++ b/src/runtime/os_windows_arm64.go @@ -0,0 +1,14 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +//go:nosplit +func cputicks() int64 { + var counter int64 + stdcall1(_QueryPerformanceCounter, uintptr(unsafe.Pointer(&counter))) + return counter +} diff --git a/src/runtime/pagetrace_off.go b/src/runtime/pagetrace_off.go new file mode 100644 index 0000000..10b44d4 --- /dev/null +++ b/src/runtime/pagetrace_off.go @@ -0,0 +1,28 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !goexperiment.pagetrace + +package runtime + +//go:systemstack +func pageTraceAlloc(pp *p, now int64, base, npages uintptr) { +} + +//go:systemstack +func pageTraceFree(pp *p, now int64, base, npages uintptr) { +} + +//go:systemstack +func pageTraceScav(pp *p, now int64, base, npages uintptr) { +} + +type pageTraceBuf struct { +} + +func initPageTrace(env string) { +} + +func finishPageTrace() { +} diff --git a/src/runtime/pagetrace_on.go b/src/runtime/pagetrace_on.go new file mode 100644 index 0000000..0e621cb --- /dev/null +++ b/src/runtime/pagetrace_on.go @@ -0,0 +1,358 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build goexperiment.pagetrace + +// Page tracer. +// +// This file contains an implementation of page trace instrumentation for tracking +// the way the Go runtime manages pages of memory. The trace may be enabled at program +// startup with the GODEBUG option pagetrace. +// +// Each page trace event is either 8 or 16 bytes wide. The first +// 8 bytes follow this format for non-sync events: +// +// [16 timestamp delta][35 base address][10 npages][1 isLarge][2 pageTraceEventType] +// +// If the "large" bit is set then the event is 16 bytes wide with the second 8 byte word +// containing the full npages value (the npages bitfield is 0). 
+// +// The base address's bottom pageShift bits are always zero, which is why we can pack other +// data in there. We ignore the top 16 bits, assuming a 48 bit address space for the +// heap. +// +// The timestamp delta is computed from the difference between the current nanotime +// timestamp and the last sync event's timestamp. The bottom pageTraceTimeLostBits of +// this delta are removed and only the next pageTraceTimeDeltaBits are kept. +// +// A sync event is emitted at the beginning of each trace buffer and whenever the +// timestamp delta would not fit in an event. +// +// Sync events have the following structure: +// +// [61 timestamp or P ID][1 isPID][2 pageTraceSyncEvent] +// +// In essence, the "large" bit is repurposed to indicate whether it's a timestamp or a P ID +// (these are typically uint32). Note that we only have 61 bits for the 64-bit timestamp, +// but like for the delta we drop the bottom pageTraceTimeLostBits here as well. + +package runtime + +import ( + "runtime/internal/sys" + "unsafe" +) + +// pageTraceAlloc records a page trace allocation event. +// pp may be nil. Call only if debug.pagetracefd != 0. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func pageTraceAlloc(pp *p, now int64, base, npages uintptr) { + if pageTrace.enabled { + if now == 0 { + now = nanotime() + } + pageTraceEmit(pp, now, base, npages, pageTraceAllocEvent) + } +} + +// pageTraceFree records a page trace free event. +// pp may be nil. Call only if debug.pagetracefd != 0. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func pageTraceFree(pp *p, now int64, base, npages uintptr) { + if pageTrace.enabled { + if now == 0 { + now = nanotime() + } + pageTraceEmit(pp, now, base, npages, pageTraceFreeEvent) + } +} + +// pageTraceScav records a page trace scavenge event. +// pp may be nil. Call only if debug.pagetracefd != 0.
+// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func pageTraceScav(pp *p, now int64, base, npages uintptr) { + if pageTrace.enabled { + if now == 0 { + now = nanotime() + } + pageTraceEmit(pp, now, base, npages, pageTraceScavEvent) + } +} + +// pageTraceEventType is a page trace event type. +type pageTraceEventType uint8 + +const ( + pageTraceSyncEvent pageTraceEventType = iota // Timestamp emission. + pageTraceAllocEvent // Allocation of pages. + pageTraceFreeEvent // Freeing pages. + pageTraceScavEvent // Scavenging pages. +) + +// pageTraceEmit emits a page trace event. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func pageTraceEmit(pp *p, now int64, base, npages uintptr, typ pageTraceEventType) { + // Get a buffer. + var tbp *pageTraceBuf + pid := int32(-1) + if pp == nil { + // We have no P, so take the global buffer. + lock(&pageTrace.lock) + tbp = &pageTrace.buf + } else { + tbp = &pp.pageTraceBuf + pid = pp.id + } + + // Initialize the buffer if necessary. + tb := *tbp + if tb.buf == nil { + tb.buf = (*pageTraceEvents)(sysAlloc(pageTraceBufSize, &memstats.other_sys)) + tb = tb.writePid(pid) + } + + // Handle timestamp and emit a sync event if necessary. + if now < tb.timeBase { + now = tb.timeBase + } + if now-tb.timeBase >= pageTraceTimeMaxDelta { + tb.timeBase = now + tb = tb.writeSync(pid) + } + + // Emit the event. + tb = tb.writeEvent(pid, now, base, npages, typ) + + // Write back the buffer. + *tbp = tb + if pp == nil { + unlock(&pageTrace.lock) + } +} + +const ( + pageTraceBufSize = 32 << 10 + + // These constants describe the per-event timestamp delta encoding. + pageTraceTimeLostBits = 7 // How many bits of precision we lose in the delta. + pageTraceTimeDeltaBits = 16 // Size of the delta in bits. 
+ pageTraceTimeMaxDelta = 1 << (pageTraceTimeLostBits + pageTraceTimeDeltaBits) +) + +// pageTraceEvents is the low-level buffer containing the trace data. +type pageTraceEvents struct { + _ sys.NotInHeap + events [pageTraceBufSize / 8]uint64 +} + +// pageTraceBuf is a wrapper around pageTraceEvents that knows how to write events +// to the buffer. It tracks state necessary to do so. +type pageTraceBuf struct { + buf *pageTraceEvents + len int // How many events have been written so far. + timeBase int64 // The current timestamp base from which deltas are produced. + finished bool // Whether this trace buf should no longer flush anything out. +} + +// writePid writes a P ID event indicating which P we're running on. +// +// Assumes there's always space in the buffer since this is only called at the +// beginning of a new buffer. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func (tb pageTraceBuf) writePid(pid int32) pageTraceBuf { + e := uint64(int64(pid))<<3 | 0b100 | uint64(pageTraceSyncEvent) + tb.buf.events[tb.len] = e + tb.len++ + return tb +} + +// writeSync writes a sync event, which is just a timestamp. Handles flushing. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func (tb pageTraceBuf) writeSync(pid int32) pageTraceBuf { + if tb.len+1 > len(tb.buf.events) { + // N.B. flush will writeSync again. + return tb.flush(pid, tb.timeBase) + } + e := ((uint64(tb.timeBase) >> pageTraceTimeLostBits) << 3) | uint64(pageTraceSyncEvent) + tb.buf.events[tb.len] = e + tb.len++ + return tb +} + +// writeEvent handles writing all non-sync and non-pid events. Handles flushing if necessary. +// +// pid indicates the P we're currently running on. Necessary in case we need to flush. +// now is the current nanotime timestamp. +// base is the base address of whatever group of pages this event is happening to. 
+// npages is the length of the group of pages this event is happening to. +// typ is the event that's happening to these pages. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func (tb pageTraceBuf) writeEvent(pid int32, now int64, base, npages uintptr, typ pageTraceEventType) pageTraceBuf { + large := 0 + np := npages + if npages >= 1024 { + large = 1 + np = 0 + } + if tb.len+1+large > len(tb.buf.events) { + tb = tb.flush(pid, now) + } + if base%pageSize != 0 { + throw("base address not page aligned") + } + e := uint64(base) + // The pageShift low-order bits are zero. + e |= uint64(typ) // 2 bits + e |= uint64(large) << 2 // 1 bit + e |= uint64(np) << 3 // 10 bits + // Write the timestamp delta in the upper pageTraceTimeDeltaBits. + e |= uint64((now-tb.timeBase)>>pageTraceTimeLostBits) << (64 - pageTraceTimeDeltaBits) + tb.buf.events[tb.len] = e + if large != 0 { + // npages doesn't fit in 10 bits, so write an additional word with that data. + tb.buf.events[tb.len+1] = uint64(npages) + } + tb.len += 1 + large + return tb +} + +// flush writes out the contents of the buffer to pageTrace.fd and resets the buffer. +// It then writes out a P ID event and the first sync event for the new buffer. +// +// Must run on the system stack as a crude way to prevent preemption. +// +//go:systemstack +func (tb pageTraceBuf) flush(pid int32, now int64) pageTraceBuf { + if !tb.finished { + lock(&pageTrace.fdLock) + writeFull(uintptr(pageTrace.fd), (*byte)(unsafe.Pointer(&tb.buf.events[0])), tb.len*8) + unlock(&pageTrace.fdLock) + } + tb.len = 0 + tb.timeBase = now + return tb.writePid(pid).writeSync(pid) +} + +var pageTrace struct { + // enabled indicates whether tracing is enabled. If true, fd >= 0. + // + // Safe to read without synchronization because it's only set once + // at program initialization. + enabled bool + + // buf is the page trace buffer used if there is no P. + // + // lock protects buf. 
+ lock mutex + buf pageTraceBuf + + // fdLock protects writing to fd. + // + // fd is the file to write the page trace to. + fdLock mutex + fd int32 +} + +// initPageTrace initializes the page tracing infrastructure from GODEBUG. +// +// env must be the value of the GODEBUG environment variable. +func initPageTrace(env string) { + var value string + for env != "" { + elt, rest := env, "" + for i := 0; i < len(env); i++ { + if env[i] == ',' { + elt, rest = env[:i], env[i+1:] + break + } + } + env = rest + if hasPrefix(elt, "pagetrace=") { + value = elt[len("pagetrace="):] + break + } + } + pageTrace.fd = -1 + if canCreateFile && value != "" { + var tmp [4096]byte + if len(value) != 0 && len(value) < 4096 { + copy(tmp[:], value) + pageTrace.fd = create(&tmp[0], 0o664) + } + } + pageTrace.enabled = pageTrace.fd >= 0 +} + +// finishPageTrace flushes all P's trace buffers and disables page tracing. +func finishPageTrace() { + if !pageTrace.enabled { + return + } + // Grab worldsema as we're about to execute a ragged barrier. + semacquire(&worldsema) + systemstack(func() { + // Disable tracing. This isn't strictly necessary and it's best-effort. + pageTrace.enabled = false + + // Execute a ragged barrier, flushing each trace buffer. + forEachP(func(pp *p) { + if pp.pageTraceBuf.buf != nil { + pp.pageTraceBuf = pp.pageTraceBuf.flush(pp.id, nanotime()) + } + pp.pageTraceBuf.finished = true + }) + + // Write the global have-no-P buffer. + lock(&pageTrace.lock) + if pageTrace.buf.buf != nil { + pageTrace.buf = pageTrace.buf.flush(-1, nanotime()) + } + pageTrace.buf.finished = true + unlock(&pageTrace.lock) + + // Safely close the file as nothing else should be allowed to write to the fd. + lock(&pageTrace.fdLock) + closefd(pageTrace.fd) + pageTrace.fd = -1 + unlock(&pageTrace.fdLock) + }) + semrelease(&worldsema) +} + +// writeFull ensures that a complete write of bn bytes from b is made to fd. 
+func writeFull(fd uintptr, b *byte, bn int) { + for bn > 0 { + n := write(fd, unsafe.Pointer(b), int32(bn)) + if n == -_EINTR || n == -_EAGAIN { + continue + } + if n < 0 { + print("errno=", -n, "\n") + throw("writeBytes: bad write") + } + bn -= int(n) + b = addb(b, uintptr(n)) + } +} diff --git a/src/runtime/panic.go b/src/runtime/panic.go new file mode 100644 index 0000000..6a6437d --- /dev/null +++ b/src/runtime/panic.go @@ -0,0 +1,1376 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// throwType indicates the current type of ongoing throw, which affects the +// amount of detail printed to stderr. Higher values include more detail. +type throwType uint32 + +const ( + // throwTypeNone means that we are not throwing. + throwTypeNone throwType = iota + + // throwTypeUser is a throw due to a problem with the application. + // + // These throws do not include runtime frames, system goroutines, or + // frame metadata. + throwTypeUser + + // throwTypeRuntime is a throw due to a problem with Go itself. + // + // These throws include as much information as possible to aid in + // debugging the runtime, including runtime frames, system goroutines, + // and frame metadata. + throwTypeRuntime +) + +// We have two different ways of doing defers. The older way involves creating a +// defer record at the time that a defer statement is executing and adding it to a +// defer chain. This chain is inspected by the deferreturn call at all function +// exits in order to run the appropriate defer calls. A cheaper way (which we call +// open-coded defers) is used for functions in which no defer statements occur in +// loops. 
In that case, we simply store the defer function/arg information into +// specific stack slots at the point of each defer statement, as well as setting a +// bit in a bitmask. At each function exit, we add inline code to directly make +// the appropriate defer calls based on the bitmask and fn/arg information stored +// on the stack. During panic/Goexit processing, the appropriate defer calls are +// made using extra funcdata info that indicates the exact stack slots that +// contain the bitmask and defer fn/args. + +// Check to make sure we can really generate a panic. If the panic +// was generated from the runtime, or from inside malloc, then convert +// to a throw of msg. +// pc should be the program counter of the compiler-generated code that +// triggered this panic. +func panicCheck1(pc uintptr, msg string) { + if goarch.IsWasm == 0 && hasPrefix(funcname(findfunc(pc)), "runtime.") { + // Note: wasm can't tail call, so we can't get the original caller's pc. + throw(msg) + } + // TODO: is this redundant? How could we be in malloc + // but not in the runtime? runtime/internal/*, maybe? + gp := getg() + if gp != nil && gp.m != nil && gp.m.mallocing != 0 { + throw(msg) + } +} + +// Same as above, but calling from the runtime is allowed. +// +// Using this function is necessary for any panic that may be +// generated by runtime.sigpanic, since those are always called by the +// runtime. +func panicCheck2(err string) { + // panic allocates, so to avoid recursive malloc, turn panics + // during malloc into throws. + gp := getg() + if gp != nil && gp.m != nil && gp.m.mallocing != 0 { + throw(err) + } +} + +// Many of the following panic entry-points turn into throws when they +// happen in various runtime contexts. These should never happen in +// the runtime, and if they do, they indicate a serious issue and +// should not be caught by user code. 
+// +// The panic{Index,Slice,divide,shift} functions are called by +// code generated by the compiler for out of bounds index expressions, +// out of bounds slice expressions, division by zero, and shift by negative. +// The panicdivide (again), panicoverflow, panicfloat, and panicmem +// functions are called by the signal handler when a signal occurs +// indicating the respective problem. +// +// Since panic{Index,Slice,shift} are never called directly, and +// since the runtime package should never have an out of bounds slice +// or array reference or negative shift, if we see those functions called from the +// runtime package we turn the panic into a throw. That will dump the +// entire runtime stack for easier debugging. +// +// The entry points called by the signal handler will be called from +// runtime.sigpanic, so we can't disallow calls from the runtime to +// these (they always look like they're called from the runtime). +// Hence, for these, we just check for clearly bad runtime conditions. +// +// The panic{Index,Slice} functions are implemented in assembly and tail call +// to the goPanic{Index,Slice} functions below. This is done so we can use +// a space-minimal register calling convention. 
+ +// failures in the comparisons for s[x], 0 <= x < y (y == len(s)) +// +//go:yeswritebarrierrec +func goPanicIndex(x int, y int) { + panicCheck1(getcallerpc(), "index out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsIndex}) +} + +//go:yeswritebarrierrec +func goPanicIndexU(x uint, y int) { + panicCheck1(getcallerpc(), "index out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsIndex}) +} + +// failures in the comparisons for s[:x], 0 <= x <= y (y == len(s) or cap(s)) +// +//go:yeswritebarrierrec +func goPanicSliceAlen(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSliceAlen}) +} + +//go:yeswritebarrierrec +func goPanicSliceAlenU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSliceAlen}) +} + +//go:yeswritebarrierrec +func goPanicSliceAcap(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSliceAcap}) +} + +//go:yeswritebarrierrec +func goPanicSliceAcapU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSliceAcap}) +} + +// failures in the comparisons for s[x:y], 0 <= x <= y +// +//go:yeswritebarrierrec +func goPanicSliceB(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSliceB}) +} + +//go:yeswritebarrierrec +func goPanicSliceBU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSliceB}) +} + +// failures in the comparisons for s[::x], 0 <= x <= y (y == len(s) or cap(s)) +func goPanicSlice3Alen(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + 
panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSlice3Alen}) +} +func goPanicSlice3AlenU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSlice3Alen}) +} +func goPanicSlice3Acap(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSlice3Acap}) +} +func goPanicSlice3AcapU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSlice3Acap}) +} + +// failures in the comparisons for s[:x:y], 0 <= x <= y +func goPanicSlice3B(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSlice3B}) +} +func goPanicSlice3BU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSlice3B}) +} + +// failures in the comparisons for s[x:y:], 0 <= x <= y +func goPanicSlice3C(x int, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsSlice3C}) +} +func goPanicSlice3CU(x uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(x), signed: false, y: y, code: boundsSlice3C}) +} + +// failures in the conversion ([x]T)(s) or (*[x]T)(s), 0 <= x <= y, y == len(s) +func goPanicSliceConvert(x int, y int) { + panicCheck1(getcallerpc(), "slice length too short to convert to array or pointer to array") + panic(boundsError{x: int64(x), signed: true, y: y, code: boundsConvert}) +} + +// Implemented in assembly, as they take arguments in registers. +// Declared here to mark them as ABIInternal. 
+func panicIndex(x int, y int) +func panicIndexU(x uint, y int) +func panicSliceAlen(x int, y int) +func panicSliceAlenU(x uint, y int) +func panicSliceAcap(x int, y int) +func panicSliceAcapU(x uint, y int) +func panicSliceB(x int, y int) +func panicSliceBU(x uint, y int) +func panicSlice3Alen(x int, y int) +func panicSlice3AlenU(x uint, y int) +func panicSlice3Acap(x int, y int) +func panicSlice3AcapU(x uint, y int) +func panicSlice3B(x int, y int) +func panicSlice3BU(x uint, y int) +func panicSlice3C(x int, y int) +func panicSlice3CU(x uint, y int) +func panicSliceConvert(x int, y int) + +var shiftError = error(errorString("negative shift amount")) + +//go:yeswritebarrierrec +func panicshift() { + panicCheck1(getcallerpc(), "negative shift amount") + panic(shiftError) +} + +var divideError = error(errorString("integer divide by zero")) + +//go:yeswritebarrierrec +func panicdivide() { + panicCheck2("integer divide by zero") + panic(divideError) +} + +var overflowError = error(errorString("integer overflow")) + +func panicoverflow() { + panicCheck2("integer overflow") + panic(overflowError) +} + +var floatError = error(errorString("floating point error")) + +func panicfloat() { + panicCheck2("floating point error") + panic(floatError) +} + +var memoryError = error(errorString("invalid memory address or nil pointer dereference")) + +func panicmem() { + panicCheck2("invalid memory address or nil pointer dereference") + panic(memoryError) +} + +func panicmemAddr(addr uintptr) { + panicCheck2("invalid memory address or nil pointer dereference") + panic(errorAddressString{msg: "invalid memory address or nil pointer dereference", addr: addr}) +} + +// Create a new deferred function fn, which has no arguments and results. +// The compiler turns a defer statement into a call to this. 
+func deferproc(fn func()) { + gp := getg() + if gp.m.curg != gp { + // go code on the system stack can't defer + throw("defer on system stack") + } + + d := newdefer() + if d._panic != nil { + throw("deferproc: d.panic != nil after newdefer") + } + d.link = gp._defer + gp._defer = d + d.fn = fn + d.pc = getcallerpc() + // We must not be preempted between calling getcallersp and + // storing it to d.sp because getcallersp's result is a + // uintptr stack pointer. + d.sp = getcallersp() + + // deferproc returns 0 normally. + // a deferred func that stops a panic + // makes the deferproc return 1. + // the code the compiler generates always + // checks the return value and jumps to the + // end of the function if deferproc returns != 0. + return0() + // No code can go here - the C return register has + // been set and must not be clobbered. +} + +// deferprocStack queues a new deferred function with a defer record on the stack. +// The defer record must have its fn field initialized. +// All other fields can contain junk. +// Nosplit because of the uninitialized pointer fields on the stack. +// +//go:nosplit +func deferprocStack(d *_defer) { + gp := getg() + if gp.m.curg != gp { + // go code on the system stack can't defer + throw("defer on system stack") + } + // fn is already set. + // The other fields are junk on entry to deferprocStack and + // are initialized here. + d.started = false + d.heap = false + d.openDefer = false + d.sp = getcallersp() + d.pc = getcallerpc() + d.framepc = 0 + d.varp = 0 + // The lines below implement: + // d.panic = nil + // d.fd = nil + // d.link = gp._defer + // gp._defer = d + // But without write barriers. The first three are writes to + // the stack so they don't need a write barrier, and furthermore + // are to uninitialized memory, so they must not use a write barrier. 
+ // The fourth write does not require a write barrier because we + // explicitly mark all the defer structures, so we don't need to + // keep track of pointers to them with a write barrier. + *(*uintptr)(unsafe.Pointer(&d._panic)) = 0 + *(*uintptr)(unsafe.Pointer(&d.fd)) = 0 + *(*uintptr)(unsafe.Pointer(&d.link)) = uintptr(unsafe.Pointer(gp._defer)) + *(*uintptr)(unsafe.Pointer(&gp._defer)) = uintptr(unsafe.Pointer(d)) + + return0() + // No code can go here - the C return register has + // been set and must not be clobbered. +} + +// Each P holds a pool for defers. + +// Allocate a Defer, usually using per-P pool. +// Each defer must be released with freedefer. The defer is not +// added to any defer chain yet. +func newdefer() *_defer { + var d *_defer + mp := acquirem() + pp := mp.p.ptr() + if len(pp.deferpool) == 0 && sched.deferpool != nil { + lock(&sched.deferlock) + for len(pp.deferpool) < cap(pp.deferpool)/2 && sched.deferpool != nil { + d := sched.deferpool + sched.deferpool = d.link + d.link = nil + pp.deferpool = append(pp.deferpool, d) + } + unlock(&sched.deferlock) + } + if n := len(pp.deferpool); n > 0 { + d = pp.deferpool[n-1] + pp.deferpool[n-1] = nil + pp.deferpool = pp.deferpool[:n-1] + } + releasem(mp) + mp, pp = nil, nil + + if d == nil { + // Allocate new defer. + d = new(_defer) + } + d.heap = true + return d +} + +// Free the given defer. +// The defer cannot be used after this call. +// +// This is nosplit because the incoming defer is in a perilous state. +// It's not on any defer list, so stack copying won't adjust stack +// pointers in it (namely, d.link). Hence, if we were to copy the +// stack, d could then contain a stale pointer. +// +//go:nosplit +func freedefer(d *_defer) { + d.link = nil + // After this point we can copy the stack. 
+ + if d._panic != nil { + freedeferpanic() + } + if d.fn != nil { + freedeferfn() + } + if !d.heap { + return + } + + mp := acquirem() + pp := mp.p.ptr() + if len(pp.deferpool) == cap(pp.deferpool) { + // Transfer half of local cache to the central cache. + var first, last *_defer + for len(pp.deferpool) > cap(pp.deferpool)/2 { + n := len(pp.deferpool) + d := pp.deferpool[n-1] + pp.deferpool[n-1] = nil + pp.deferpool = pp.deferpool[:n-1] + if first == nil { + first = d + } else { + last.link = d + } + last = d + } + lock(&sched.deferlock) + last.link = sched.deferpool + sched.deferpool = first + unlock(&sched.deferlock) + } + + *d = _defer{} + + pp.deferpool = append(pp.deferpool, d) + + releasem(mp) + mp, pp = nil, nil +} + +// Separate function so that it can split stack. +// Windows otherwise runs out of stack space. +func freedeferpanic() { + // _panic must be cleared before d is unlinked from gp. + throw("freedefer with d._panic != nil") +} + +func freedeferfn() { + // fn must be cleared before d is unlinked from gp. + throw("freedefer with d.fn != nil") +} + +// deferreturn runs deferred functions for the caller's frame. +// The compiler inserts a call to this at the end of any +// function which calls defer. +func deferreturn() { + gp := getg() + for { + d := gp._defer + if d == nil { + return + } + sp := getcallersp() + if d.sp != sp { + return + } + if d.openDefer { + done := runOpenDeferFrame(d) + if !done { + throw("unfinished open-coded defers in deferreturn") + } + gp._defer = d.link + freedefer(d) + // If this frame uses open defers, then this + // must be the only defer record for the + // frame, so we can just return. + return + } + + fn := d.fn + d.fn = nil + gp._defer = d.link + freedefer(d) + fn() + } +} + +// Goexit terminates the goroutine that calls it. No other goroutine is affected. +// Goexit runs all deferred calls before terminating the goroutine. 
Because Goexit +// is not a panic, any recover calls in those deferred functions will return nil. +// +// Calling Goexit from the main goroutine terminates that goroutine +// without func main returning. Since func main has not returned, +// the program continues execution of other goroutines. +// If all other goroutines exit, the program crashes. +func Goexit() { + // Run all deferred functions for the current goroutine. + // This code is similar to gopanic, see that implementation + // for detailed comments. + gp := getg() + + // Create a panic object for Goexit, so we can recognize when it might be + // bypassed by a recover(). + var p _panic + p.goexit = true + p.link = gp._panic + gp._panic = (*_panic)(noescape(unsafe.Pointer(&p))) + + addOneOpenDeferFrame(gp, getcallerpc(), unsafe.Pointer(getcallersp())) + for { + d := gp._defer + if d == nil { + break + } + if d.started { + if d._panic != nil { + d._panic.aborted = true + d._panic = nil + } + if !d.openDefer { + d.fn = nil + gp._defer = d.link + freedefer(d) + continue + } + } + d.started = true + d._panic = (*_panic)(noescape(unsafe.Pointer(&p))) + if d.openDefer { + done := runOpenDeferFrame(d) + if !done { + // We should always run all defers in the frame, + // since there is no panic associated with this + // defer that can be recovered. + throw("unfinished open-coded defers in Goexit") + } + if p.aborted { + // Since our current defer caused a panic and may + // have been already freed, just restart scanning + // for open-coded defers from this frame again. + addOneOpenDeferFrame(gp, getcallerpc(), unsafe.Pointer(getcallersp())) + } else { + addOneOpenDeferFrame(gp, 0, nil) + } + } else { + // Save the pc/sp in deferCallSave(), so we can "recover" back to this + // loop if necessary. + deferCallSave(&p, d.fn) + } + if p.aborted { + // We had a recursive panic in the defer d we started, and + // then did a recover in a defer that was further down the + // defer chain than d. 
In the case of an outstanding Goexit, + // we force the recover to return back to this loop. d will + // have already been freed if completed, so just continue + // immediately to the next defer on the chain. + p.aborted = false + continue + } + if gp._defer != d { + throw("bad defer entry in Goexit") + } + d._panic = nil + d.fn = nil + gp._defer = d.link + freedefer(d) + // Note: we ignore recovers here because Goexit isn't a panic + } + goexit1() +} + +// Call all Error and String methods before freezing the world. +// Used when crashing with panicking. +func preprintpanics(p *_panic) { + defer func() { + text := "panic while printing panic value" + switch r := recover().(type) { + case nil: + // nothing to do + case string: + throw(text + ": " + r) + default: + throw(text + ": type " + efaceOf(&r)._type.string()) + } + }() + for p != nil { + switch v := p.arg.(type) { + case error: + p.arg = v.Error() + case stringer: + p.arg = v.String() + } + p = p.link + } +} + +// Print all currently active panics. Used when crashing. +// Should only be called after preprintpanics. +func printpanics(p *_panic) { + if p.link != nil { + printpanics(p.link) + if !p.link.goexit { + print("\t") + } + } + if p.goexit { + return + } + print("panic: ") + printany(p.arg) + if p.recovered { + print(" [recovered]") + } + print("\n") +} + +// addOneOpenDeferFrame scans the stack (in gentraceback order, from inner frames to +// outer frames) for the first frame (if any) with open-coded defers. If it finds +// one, it adds a single entry to the defer chain for that frame. The entry added +// represents all the defers in the associated open defer frame, and is sorted in +// order with respect to any non-open-coded defers. +// +// addOneOpenDeferFrame stops (possibly without adding a new entry) if it encounters +// an in-progress open defer entry. An in-progress open defer entry means there has +// been a new panic because of a defer in the associated frame. 
addOneOpenDeferFrame +// does not add an open defer entry past a started entry, because that started entry +// still needs to be finished, and addOneOpenDeferFrame will be called when that started +// entry is completed. The defer removal loop in gopanic() similarly stops at an +// in-progress defer entry. Together, addOneOpenDeferFrame and the defer removal loop +// ensure the invariant that there is no open defer entry further up the stack than +// an in-progress defer, and also that the defer removal loop is guaranteed to remove +// all not-in-progress open defer entries from the defer chain. +// +// If sp is non-nil, addOneOpenDeferFrame starts the stack scan from the frame +// specified by sp. If sp is nil, it uses the sp from the current defer record (which +// has just been finished). Hence, it continues the stack scan from the frame of the +// defer that just finished. It skips any frame that already has a (not-in-progress) +// open-coded _defer record in the defer chain. +// +// Note: All entries of the defer chain (including this new open-coded entry) have +// their pointers (including sp) adjusted properly if the stack moves while +// running deferred functions. Also, it is safe to pass in the sp arg (which is +// the direct result of calling getcallersp()), because all pointer variables +// (including arguments) are adjusted as needed during stack copies.
+func addOneOpenDeferFrame(gp *g, pc uintptr, sp unsafe.Pointer) { + var prevDefer *_defer + if sp == nil { + prevDefer = gp._defer + pc = prevDefer.framepc + sp = unsafe.Pointer(prevDefer.sp) + } + systemstack(func() { + gentraceback(pc, uintptr(sp), 0, gp, 0, nil, 0x7fffffff, + func(frame *stkframe, unused unsafe.Pointer) bool { + if prevDefer != nil && prevDefer.sp == frame.sp { + // Skip the frame for the previous defer that + // we just finished (and was used to set + // where we restarted the stack scan) + return true + } + f := frame.fn + fd := funcdata(f, _FUNCDATA_OpenCodedDeferInfo) + if fd == nil { + return true + } + // Insert the open defer record in the + // chain, in order sorted by sp. + d := gp._defer + var prev *_defer + for d != nil { + dsp := d.sp + if frame.sp < dsp { + break + } + if frame.sp == dsp { + if !d.openDefer { + throw("duplicated defer entry") + } + // Don't add any record past an + // in-progress defer entry. We don't + // need it, and more importantly, we + // want to keep the invariant that + // there is no open defer entry + // past an in-progress entry (see + // header comment). + if d.started { + return false + } + return true + } + prev = d + d = d.link + } + if frame.fn.deferreturn == 0 { + throw("missing deferreturn") + } + + d1 := newdefer() + d1.openDefer = true + d1._panic = nil + // These are the pc/sp to set after we've + // run a defer in this frame that did a + // recover. We return to a special + // deferreturn that runs any remaining + // defers and then returns from the + // function. + d1.pc = frame.fn.entry() + uintptr(frame.fn.deferreturn) + d1.varp = frame.varp + d1.fd = fd + // Save the SP/PC associated with current frame, + // so we can continue stack trace later if needed.
+ d1.framepc = frame.pc + d1.sp = frame.sp + d1.link = d + if prev == nil { + gp._defer = d1 + } else { + prev.link = d1 + } + // Stop stack scanning after adding one open defer record + return false + }, + nil, 0) + }) +} + +// readvarintUnsafe reads the uint32 in varint format starting at fd, and returns the +// uint32 and a pointer to the byte following the varint. +// +// There is a similar function runtime.readvarint, which takes a slice of bytes, +// rather than an unsafe pointer. These functions are duplicated, because one of +// the two use cases for the functions would get slower if the functions were +// combined. +func readvarintUnsafe(fd unsafe.Pointer) (uint32, unsafe.Pointer) { + var r uint32 + var shift int + for { + b := *(*uint8)((unsafe.Pointer(fd))) + fd = add(fd, unsafe.Sizeof(b)) + if b < 128 { + return r + uint32(b)<<shift, fd + } + r += ((uint32(b) &^ 128) << shift) + shift += 7 + if shift > 28 { + panic("Bad varint") + } + } +} + +// runOpenDeferFrame runs the active open-coded defers in the frame specified by +// d. It normally processes all active defers in the frame, but stops immediately +// if a defer does a successful recover. It returns true if there are no +// remaining defers to run in the frame. +func runOpenDeferFrame(d *_defer) bool { + done := true + fd := d.fd + + deferBitsOffset, fd := readvarintUnsafe(fd) + nDefers, fd := readvarintUnsafe(fd) + deferBits := *(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset))) + + for i := int(nDefers) - 1; i >= 0; i-- { + // read the funcdata info for this defer + var closureOffset uint32 + closureOffset, fd = readvarintUnsafe(fd) + if deferBits&(1<<i) == 0 { + continue + } + closure := *(*func())(unsafe.Pointer(d.varp - uintptr(closureOffset))) + d.fn = closure + deferBits = deferBits &^ (1 << i) + *(*uint8)(unsafe.Pointer(d.varp - uintptr(deferBitsOffset))) = deferBits + p := d._panic + // Call the defer. Note that this can change d.varp if + // the stack moves. 
+ deferCallSave(p, d.fn) + if p != nil && p.aborted { + break + } + d.fn = nil + if d._panic != nil && d._panic.recovered { + done = deferBits == 0 + break + } + } + + return done +} + +// deferCallSave calls fn() after saving the caller's pc and sp in the +// panic record. This allows the runtime to return to the Goexit defer +// processing loop, in the unusual case where the Goexit may be +// bypassed by a successful recover. +// +// This is marked as a wrapper by the compiler so it doesn't appear in +// tracebacks. +func deferCallSave(p *_panic, fn func()) { + if p != nil { + p.argp = unsafe.Pointer(getargp()) + p.pc = getcallerpc() + p.sp = unsafe.Pointer(getcallersp()) + } + fn() + if p != nil { + p.pc = 0 + p.sp = unsafe.Pointer(nil) + } +} + +// The implementation of the predeclared function panic. +func gopanic(e any) { + gp := getg() + if gp.m.curg != gp { + print("panic: ") + printany(e) + print("\n") + throw("panic on system stack") + } + + if gp.m.mallocing != 0 { + print("panic: ") + printany(e) + print("\n") + throw("panic during malloc") + } + if gp.m.preemptoff != "" { + print("panic: ") + printany(e) + print("\n") + print("preempt off reason: ") + print(gp.m.preemptoff) + print("\n") + throw("panic during preemptoff") + } + if gp.m.locks != 0 { + print("panic: ") + printany(e) + print("\n") + throw("panic holding locks") + } + + var p _panic + p.arg = e + p.link = gp._panic + gp._panic = (*_panic)(noescape(unsafe.Pointer(&p))) + + runningPanicDefers.Add(1) + + // By calculating getcallerpc/getcallersp here, we avoid scanning the + // gopanic frame (stack scanning is slow...) + addOneOpenDeferFrame(gp, getcallerpc(), unsafe.Pointer(getcallersp())) + + for { + d := gp._defer + if d == nil { + break + } + + // If defer was started by earlier panic or Goexit (and, since we're back here, that triggered a new panic), + // take defer off list. 
An earlier panic will not continue running, but we will make sure below that an + // earlier Goexit does continue running. + if d.started { + if d._panic != nil { + d._panic.aborted = true + } + d._panic = nil + if !d.openDefer { + // For open-coded defers, we need to process the + // defer again, in case there are any other defers + // to call in the frame (not including the defer + // call that caused the panic). + d.fn = nil + gp._defer = d.link + freedefer(d) + continue + } + } + + // Mark defer as started, but keep on list, so that traceback + // can find and update the defer's argument frame if stack growth + // or a garbage collection happens before executing d.fn. + d.started = true + + // Record the panic that is running the defer. + // If there is a new panic during the deferred call, that panic + // will find d in the list and will mark d._panic (this panic) aborted. + d._panic = (*_panic)(noescape(unsafe.Pointer(&p))) + + done := true + if d.openDefer { + done = runOpenDeferFrame(d) + if done && !d._panic.recovered { + addOneOpenDeferFrame(gp, 0, nil) + } + } else { + p.argp = unsafe.Pointer(getargp()) + d.fn() + } + p.argp = nil + + // Deferred function did not panic. Remove d. + if gp._defer != d { + throw("bad defer entry in panic") + } + d._panic = nil + + // trigger shrinkage to test stack copy. See stack_test.go:TestStackPanic + //GC() + + pc := d.pc + sp := unsafe.Pointer(d.sp) // must be pointer so it gets adjusted during stack copy + if done { + d.fn = nil + gp._defer = d.link + freedefer(d) + } + if p.recovered { + gp._panic = p.link + if gp._panic != nil && gp._panic.goexit && gp._panic.aborted { + // A normal recover would bypass/abort the Goexit. Instead, + // we return to the processing loop of the Goexit. 
+ gp.sigcode0 = uintptr(gp._panic.sp) + gp.sigcode1 = uintptr(gp._panic.pc) + mcall(recovery) + throw("bypassed recovery failed") // mcall should not return + } + runningPanicDefers.Add(-1) + + // After a recover, remove any remaining non-started, + // open-coded defer entries, since the corresponding defers + // will be executed normally (inline). Any such entry will + // become stale once we run the corresponding defers inline + // and exit the associated stack frame. We only remove up to + // the first started (in-progress) open defer entry, not + // including the current frame, since any higher entries will + // be from a higher panic in progress, and will still be + // needed. + d := gp._defer + var prev *_defer + if !done { + // Skip our current frame, if not done. It is + // needed to complete any remaining defers in + // deferreturn() + prev = d + d = d.link + } + for d != nil { + if d.started { + // This defer is started but we + // are in the middle of a + // defer-panic-recover inside of + // it, so don't remove it or any + // further defer entries + break + } + if d.openDefer { + if prev == nil { + gp._defer = d.link + } else { + prev.link = d.link + } + newd := d.link + freedefer(d) + d = newd + } else { + prev = d + d = d.link + } + } + + gp._panic = p.link + // Aborted panics are marked but remain on the g.panic list. + // Remove them from the list. + for gp._panic != nil && gp._panic.aborted { + gp._panic = gp._panic.link + } + if gp._panic == nil { // must be done with signal + gp.sig = 0 + } + // Pass information about recovering frame to recovery. + gp.sigcode0 = uintptr(sp) + gp.sigcode1 = pc + mcall(recovery) + throw("recovery failed") // mcall should not return + } + } + + // ran out of deferred calls - old-school panic now + // Because it is unsafe to call arbitrary user code after freezing + // the world, we call preprintpanics to invoke all necessary Error + // and String methods to prepare the panic strings before startpanic. 
+ preprintpanics(gp._panic) + + fatalpanic(gp._panic) // should not return + *(*int)(nil) = 0 // not reached +} + +// getargp returns the location where the caller +// writes outgoing function call arguments. +// +//go:nosplit +//go:noinline +func getargp() uintptr { + return getcallersp() + sys.MinFrameSize +} + +// The implementation of the predeclared function recover. +// Cannot split the stack because it needs to reliably +// find the stack segment of its caller. +// +// TODO(rsc): Once we commit to CopyStackAlways, +// this doesn't need to be nosplit. +// +//go:nosplit +func gorecover(argp uintptr) any { + // Must be in a function running as part of a deferred call during the panic. + // Must be called from the topmost function of the call + // (the function used in the defer statement). + // p.argp is the argument pointer of that topmost deferred function call. + // Compare against argp reported by caller. + // If they match, the caller is the one who can recover. + gp := getg() + p := gp._panic + if p != nil && !p.goexit && !p.recovered && argp == uintptr(p.argp) { + p.recovered = true + return p.arg + } + return nil +} + +//go:linkname sync_throw sync.throw +func sync_throw(s string) { + throw(s) +} + +//go:linkname sync_fatal sync.fatal +func sync_fatal(s string) { + fatal(s) +} + +// throw triggers a fatal error that dumps a stack trace and exits. +// +// throw should be used for runtime-internal fatal errors where Go itself, +// rather than user code, may be at fault for the failure. +// +//go:nosplit +func throw(s string) { + // Everything throw does should be recursively nosplit so it + // can be called even when it's unsafe to grow the stack. + systemstack(func() { + print("fatal error: ", s, "\n") + }) + + fatalthrow(throwTypeRuntime) +} + +// fatal triggers a fatal error that dumps a stack trace and exits. +// +// fatal is equivalent to throw, but is used when user code is expected to be +// at fault for the failure, such as racing map writes. 
+// +// fatal does not include runtime frames, system goroutines, or frame metadata +// (fp, sp, pc) in the stack trace unless GOTRACEBACK=system or higher. +// +//go:nosplit +func fatal(s string) { + // Everything fatal does should be recursively nosplit so it + // can be called even when it's unsafe to grow the stack. + systemstack(func() { + print("fatal error: ", s, "\n") + }) + + fatalthrow(throwTypeUser) +} + +// runningPanicDefers is non-zero while running deferred functions for panic. +// This is used to try hard to get a panic stack trace out when exiting. +var runningPanicDefers atomic.Uint32 + +// panicking is non-zero when crashing the program for an unrecovered panic. +var panicking atomic.Uint32 + +// paniclk is held while printing the panic information and stack trace, +// so that two concurrent panics don't overlap their output. +var paniclk mutex + +// Unwind the stack after a deferred function calls recover +// after a panic. Then arrange to continue running as though +// the caller of the deferred function returned normally. +func recovery(gp *g) { + // Info about defer passed in G struct. + sp := gp.sigcode0 + pc := gp.sigcode1 + + // d's arguments need to be in the stack. + if sp != 0 && (sp < gp.stack.lo || gp.stack.hi < sp) { + print("recover: ", hex(sp), " not in [", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n") + throw("bad recovery") + } + + // Make the deferproc for this d return again, + // this time returning 1. The calling function will + // jump to the standard return epilogue. + gp.sched.sp = sp + gp.sched.pc = pc + gp.sched.lr = 0 + gp.sched.ret = 1 + gogo(&gp.sched) +} + +// fatalthrow implements an unrecoverable runtime throw. It freezes the +// system, prints stack traces starting from its caller, and terminates the +// process. 
+// +//go:nosplit +func fatalthrow(t throwType) { + pc := getcallerpc() + sp := getcallersp() + gp := getg() + + if gp.m.throwing == throwTypeNone { + gp.m.throwing = t + } + + // Switch to the system stack to avoid any stack growth, which may make + // things worse if the runtime is in a bad state. + systemstack(func() { + if isSecureMode() { + exit(2) + } + + startpanic_m() + + if dopanic_m(gp, pc, sp) { + // crash uses a decent amount of nosplit stack and we're already + // low on stack in throw, so crash on the system stack (unlike + // fatalpanic). + crash() + } + + exit(2) + }) + + *(*int)(nil) = 0 // not reached +} + +// fatalpanic implements an unrecoverable panic. It is like fatalthrow, except +// that if msgs != nil, fatalpanic also prints panic messages and decrements +// runningPanicDefers once main is blocked from exiting. +// +//go:nosplit +func fatalpanic(msgs *_panic) { + pc := getcallerpc() + sp := getcallersp() + gp := getg() + var docrash bool + // Switch to the system stack to avoid any stack growth, which + // may make things worse if the runtime is in a bad state. + systemstack(func() { + if startpanic_m() && msgs != nil { + // There were panic messages and startpanic_m + // says it's okay to try to print them. + + // startpanic_m set panicking, which will + // block main from exiting, so now OK to + // decrement runningPanicDefers. + runningPanicDefers.Add(-1) + + printpanics(msgs) + } + + docrash = dopanic_m(gp, pc, sp) + }) + + if docrash { + // By crashing outside the above systemstack call, debuggers + // will not be confused when generating a backtrace. + // Function crash is marked nosplit to avoid stack growth. + crash() + } + + systemstack(func() { + exit(2) + }) + + *(*int)(nil) = 0 // not reached +} + +// startpanic_m prepares for an unrecoverable panic. +// +// It returns true if panic messages should be printed, or false if +// the runtime is in bad shape and should just print stacks. 
+// +// It must not have write barriers even though the write barrier +// explicitly ignores writes once dying > 0. Write barriers still +// assume that g.m.p != nil, and this function may not have P +// in some contexts (e.g. a panic in a signal handler for a signal +// sent to an M with no P). +// +//go:nowritebarrierrec +func startpanic_m() bool { + gp := getg() + if mheap_.cachealloc.size == 0 { // very early + print("runtime: panic before malloc heap initialized\n") + } + // Disallow malloc during an unrecoverable panic. A panic + // could happen in a signal handler, or in a throw, or inside + // malloc itself. We want to catch if an allocation ever does + // happen (even if we're not in one of these situations). + gp.m.mallocing++ + + // If we're dying because of a bad lock count, set it to a + // good lock count so we don't recursively panic below. + if gp.m.locks < 0 { + gp.m.locks = 1 + } + + switch gp.m.dying { + case 0: + // Setting dying >0 has the side-effect of disabling this G's writebuf. + gp.m.dying = 1 + panicking.Add(1) + lock(&paniclk) + if debug.schedtrace > 0 || debug.scheddetail > 0 { + schedtrace(true) + } + freezetheworld() + return true + case 1: + // Something failed while panicking. + // Just print a stack trace and exit. + gp.m.dying = 2 + print("panic during panic\n") + return false + case 2: + // This is a genuine bug in the runtime, we couldn't even + // print the stack trace successfully. + gp.m.dying = 3 + print("stack trace unavailable\n") + exit(4) + fallthrough + default: + // Can't even print! Just exit. + exit(5) + return false // Need to return something. + } +} + +var didothers bool +var deadlock mutex + +// gp is the crashing g running on this M, but may be a user G, while getg() is +// always g0. 
+func dopanic_m(gp *g, pc, sp uintptr) bool { + if gp.sig != 0 { + signame := signame(gp.sig) + if signame != "" { + print("[signal ", signame) + } else { + print("[signal ", hex(gp.sig)) + } + print(" code=", hex(gp.sigcode0), " addr=", hex(gp.sigcode1), " pc=", hex(gp.sigpc), "]\n") + } + + level, all, docrash := gotraceback() + if level > 0 { + if gp != gp.m.curg { + all = true + } + if gp != gp.m.g0 { + print("\n") + goroutineheader(gp) + traceback(pc, sp, 0, gp) + } else if level >= 2 || gp.m.throwing >= throwTypeRuntime { + print("\nruntime stack:\n") + traceback(pc, sp, 0, gp) + } + if !didothers && all { + didothers = true + tracebackothers(gp) + } + } + unlock(&paniclk) + + if panicking.Add(-1) != 0 { + // Some other m is panicking too. + // Let it print what it needs to print. + // Wait forever without chewing up cpu. + // It will exit when it's done. + lock(&deadlock) + lock(&deadlock) + } + + printDebugLog() + + return docrash +} + +// canpanic returns false if a signal should throw instead of +// panicking. +// +//go:nosplit +func canpanic() bool { + gp := getg() + mp := acquirem() + + // Is it okay for gp to panic instead of crashing the program? + // Yes, as long as it is running Go code, not runtime code, + // and not stuck in a system call. + if gp != mp.curg { + releasem(mp) + return false + } + // N.B. mp.locks != 1 instead of 0 to account for acquirem. + if mp.locks != 1 || mp.mallocing != 0 || mp.throwing != throwTypeNone || mp.preemptoff != "" || mp.dying != 0 { + releasem(mp) + return false + } + status := readgstatus(gp) + if status&^_Gscan != _Grunning || gp.syscallsp != 0 { + releasem(mp) + return false + } + if GOOS == "windows" && mp.libcallsp != 0 { + releasem(mp) + return false + } + releasem(mp) + return true +} + +// shouldPushSigpanic reports whether pc should be used as sigpanic's +// return PC (pushing a frame for the call). 
Otherwise, it should be +// left alone so that LR is used as sigpanic's return PC, effectively +// replacing the top-most frame with sigpanic. This is used by +// preparePanic. +func shouldPushSigpanic(gp *g, pc, lr uintptr) bool { + if pc == 0 { + // Probably a call to a nil func. The old LR is more + // useful in the stack trace. Not pushing the frame + // will make the trace look like a call to sigpanic + // instead. (Otherwise the trace will end at sigpanic + // and we won't get to see who faulted.) + return false + } + // If we don't recognize the PC as code, but we do recognize + // the link register as code, then this assumes the panic was + // caused by a call to non-code. In this case, we want to + // ignore this call to make unwinding show the context. + // + // If we're running C code, we're not going to recognize pc as a + // Go function, so just assume it's good. Otherwise, traceback + // may try to read a stale LR that looks like a Go code + // pointer and wander into the woods. + if gp.m.incgo || findfunc(pc).valid() { + // This wasn't a bad call, so use PC as sigpanic's + // return PC. + return true + } + if findfunc(lr).valid() { + // This was a bad call, but the LR is good, so use the + // LR as sigpanic's return PC. + return false + } + // Neither the PC nor the LR is good. Hopefully pushing a frame + // will work. + return true +} + +// isAbortPC reports whether pc is the program counter at which +// runtime.abort raises a signal. +// +// It is nosplit because it's part of the isgoexception +// implementation. +// +//go:nosplit +func isAbortPC(pc uintptr) bool { + f := findfunc(pc) + if !f.valid() { + return false + } + return f.funcID == funcID_abort +} diff --git a/src/runtime/panic32.go b/src/runtime/panic32.go new file mode 100644 index 0000000..fa3f2bf --- /dev/null +++ b/src/runtime/panic32.go @@ -0,0 +1,105 @@ +// Copyright 2019 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build 386 || arm || mips || mipsle + +package runtime + +// Additional index/slice error paths for 32-bit platforms. +// Used when the high word of a 64-bit index is not zero. + +// failures in the comparisons for s[x], 0 <= x < y (y == len(s)) +func goPanicExtendIndex(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "index out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsIndex}) +} +func goPanicExtendIndexU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "index out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsIndex}) +} + +// failures in the comparisons for s[:x], 0 <= x <= y (y == len(s) or cap(s)) +func goPanicExtendSliceAlen(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSliceAlen}) +} +func goPanicExtendSliceAlenU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSliceAlen}) +} +func goPanicExtendSliceAcap(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSliceAcap}) +} +func goPanicExtendSliceAcapU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSliceAcap}) +} + +// failures in the comparisons for s[x:y], 0 <= x <= y +func goPanicExtendSliceB(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSliceB}) +} +func goPanicExtendSliceBU(hi uint, lo uint, y int) { + 
panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSliceB}) +} + +// failures in the comparisons for s[::x], 0 <= x <= y (y == len(s) or cap(s)) +func goPanicExtendSlice3Alen(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSlice3Alen}) +} +func goPanicExtendSlice3AlenU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSlice3Alen}) +} +func goPanicExtendSlice3Acap(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSlice3Acap}) +} +func goPanicExtendSlice3AcapU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSlice3Acap}) +} + +// failures in the comparisons for s[:x:y], 0 <= x <= y +func goPanicExtendSlice3B(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSlice3B}) +} +func goPanicExtendSlice3BU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: false, y: y, code: boundsSlice3B}) +} + +// failures in the comparisons for s[x:y:], 0 <= x <= y +func goPanicExtendSlice3C(hi int, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: true, y: y, code: boundsSlice3C}) +} +func goPanicExtendSlice3CU(hi uint, lo uint, y int) { + panicCheck1(getcallerpc(), "slice bounds out of range") + panic(boundsError{x: int64(hi)<<32 + int64(lo), signed: 
false, y: y, code: boundsSlice3C}) +} + +// Implemented in assembly, as they take arguments in registers. +// Declared here to mark them as ABIInternal. +func panicExtendIndex(hi int, lo uint, y int) +func panicExtendIndexU(hi uint, lo uint, y int) +func panicExtendSliceAlen(hi int, lo uint, y int) +func panicExtendSliceAlenU(hi uint, lo uint, y int) +func panicExtendSliceAcap(hi int, lo uint, y int) +func panicExtendSliceAcapU(hi uint, lo uint, y int) +func panicExtendSliceB(hi int, lo uint, y int) +func panicExtendSliceBU(hi uint, lo uint, y int) +func panicExtendSlice3Alen(hi int, lo uint, y int) +func panicExtendSlice3AlenU(hi uint, lo uint, y int) +func panicExtendSlice3Acap(hi int, lo uint, y int) +func panicExtendSlice3AcapU(hi uint, lo uint, y int) +func panicExtendSlice3B(hi int, lo uint, y int) +func panicExtendSlice3BU(hi uint, lo uint, y int) +func panicExtendSlice3C(hi int, lo uint, y int) +func panicExtendSlice3CU(hi uint, lo uint, y int) diff --git a/src/runtime/panic_test.go b/src/runtime/panic_test.go new file mode 100644 index 0000000..b8a300f --- /dev/null +++ b/src/runtime/panic_test.go @@ -0,0 +1,48 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "strings" + "testing" +) + +// Test that panics print out the underlying value +// when the underlying kind is directly printable. 
+// Issue: https://golang.org/issues/37531 +func TestPanicWithDirectlyPrintableCustomTypes(t *testing.T) { + tests := []struct { + name string + wantPanicPrefix string + }{ + {"panicCustomBool", `panic: main.MyBool(true)`}, + {"panicCustomComplex128", `panic: main.MyComplex128(+3.210000e+001+1.000000e+001i)`}, + {"panicCustomComplex64", `panic: main.MyComplex64(+1.100000e-001+3.000000e+000i)`}, + {"panicCustomFloat32", `panic: main.MyFloat32(-9.370000e+001)`}, + {"panicCustomFloat64", `panic: main.MyFloat64(-9.370000e+001)`}, + {"panicCustomInt", `panic: main.MyInt(93)`}, + {"panicCustomInt8", `panic: main.MyInt8(93)`}, + {"panicCustomInt16", `panic: main.MyInt16(93)`}, + {"panicCustomInt32", `panic: main.MyInt32(93)`}, + {"panicCustomInt64", `panic: main.MyInt64(93)`}, + {"panicCustomString", `panic: main.MyString("Panic")`}, + {"panicCustomUint", `panic: main.MyUint(93)`}, + {"panicCustomUint8", `panic: main.MyUint8(93)`}, + {"panicCustomUint16", `panic: main.MyUint16(93)`}, + {"panicCustomUint32", `panic: main.MyUint32(93)`}, + {"panicCustomUint64", `panic: main.MyUint64(93)`}, + {"panicCustomUintptr", `panic: main.MyUintptr(93)`}, + } + + for _, tt := range tests { + t := t + t.Run(tt.name, func(t *testing.T) { + output := runTestProg(t, "testprog", tt.name) + if !strings.HasPrefix(output, tt.wantPanicPrefix) { + t.Fatalf("%q\nis not present in\n%s", tt.wantPanicPrefix, output) + } + }) + } +} diff --git a/src/runtime/plugin.go b/src/runtime/plugin.go new file mode 100644 index 0000000..a61dcc3 --- /dev/null +++ b/src/runtime/plugin.go @@ -0,0 +1,137 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +//go:linkname plugin_lastmoduleinit plugin.lastmoduleinit +func plugin_lastmoduleinit() (path string, syms map[string]any, errstr string) { + var md *moduledata + for pmd := firstmoduledata.next; pmd != nil; pmd = pmd.next { + if pmd.bad { + md = nil // we only want the last module + continue + } + md = pmd + } + if md == nil { + throw("runtime: no plugin module data") + } + if md.pluginpath == "" { + throw("runtime: plugin has empty pluginpath") + } + if md.typemap != nil { + return "", nil, "plugin already loaded" + } + + for _, pmd := range activeModules() { + if pmd.pluginpath == md.pluginpath { + md.bad = true + return "", nil, "plugin already loaded" + } + + if inRange(pmd.text, pmd.etext, md.text, md.etext) || + inRange(pmd.bss, pmd.ebss, md.bss, md.ebss) || + inRange(pmd.data, pmd.edata, md.data, md.edata) || + inRange(pmd.types, pmd.etypes, md.types, md.etypes) { + println("plugin: new module data overlaps with previous moduledata") + println("\tpmd.text-etext=", hex(pmd.text), "-", hex(pmd.etext)) + println("\tpmd.bss-ebss=", hex(pmd.bss), "-", hex(pmd.ebss)) + println("\tpmd.data-edata=", hex(pmd.data), "-", hex(pmd.edata)) + println("\tpmd.types-etypes=", hex(pmd.types), "-", hex(pmd.etypes)) + println("\tmd.text-etext=", hex(md.text), "-", hex(md.etext)) + println("\tmd.bss-ebss=", hex(md.bss), "-", hex(md.ebss)) + println("\tmd.data-edata=", hex(md.data), "-", hex(md.edata)) + println("\tmd.types-etypes=", hex(md.types), "-", hex(md.etypes)) + throw("plugin: new module data overlaps with previous moduledata") + } + } + for _, pkghash := range md.pkghashes { + if pkghash.linktimehash != *pkghash.runtimehash { + md.bad = true + return "", nil, "plugin was built with a different version of package " + pkghash.modulename + } + } + + // Initialize the freshly loaded module. 
+ modulesinit() + typelinksinit() + + pluginftabverify(md) + moduledataverify1(md) + + lock(&itabLock) + for _, i := range md.itablinks { + itabAdd(i) + } + unlock(&itabLock) + + // Build a map of symbol names to symbols. Here in the runtime + // we fill out the first word of the interface, the type. We + // pass these zero value interfaces to the plugin package, + // where the symbol value is filled in (usually via cgo). + // + // Because functions are handled specially in the plugin package, + // function symbol names are prefixed here with '.' to avoid + // a dependency on the reflect package. + syms = make(map[string]any, len(md.ptab)) + for _, ptab := range md.ptab { + symName := resolveNameOff(unsafe.Pointer(md.types), ptab.name) + t := (*_type)(unsafe.Pointer(md.types)).typeOff(ptab.typ) + var val any + valp := (*[2]unsafe.Pointer)(unsafe.Pointer(&val)) + (*valp)[0] = unsafe.Pointer(t) + + name := symName.name() + if t.kind&kindMask == kindFunc { + name = "." + name + } + syms[name] = val + } + return md.pluginpath, syms, "" +} + +func pluginftabverify(md *moduledata) { + badtable := false + for i := 0; i < len(md.ftab); i++ { + entry := md.textAddr(md.ftab[i].entryoff) + if md.minpc <= entry && entry <= md.maxpc { + continue + } + + f := funcInfo{(*_func)(unsafe.Pointer(&md.pclntable[md.ftab[i].funcoff])), md} + name := funcname(f) + + // A common bug is f.entry has a relocation to a duplicate + // function symbol, meaning if we search for its PC we get + // a valid entry with a name that is useful for debugging. 
+ name2 := "none" + entry2 := uintptr(0) + f2 := findfunc(entry) + if f2.valid() { + name2 = funcname(f2) + entry2 = f2.entry() + } + badtable = true + println("ftab entry", hex(entry), "/", hex(entry2), ": ", + name, "/", name2, "outside pc range:[", hex(md.minpc), ",", hex(md.maxpc), "], modulename=", md.modulename, ", pluginpath=", md.pluginpath) + } + if badtable { + throw("runtime: plugin has bad symbol table") + } +} + +// inRange reports whether v0 or v1 are in the range [r0, r1]. +func inRange(r0, r1, v0, v1 uintptr) bool { + return (v0 >= r0 && v0 <= r1) || (v1 >= r0 && v1 <= r1) +} + +// A ptabEntry is generated by the compiler for each exported function +// and global variable in the main package of a plugin. It is used to +// initialize the plugin module's symbol map. +type ptabEntry struct { + name nameOff + typ typeOff +} diff --git a/src/runtime/pprof/elf.go b/src/runtime/pprof/elf.go new file mode 100644 index 0000000..a8b5ea6 --- /dev/null +++ b/src/runtime/pprof/elf.go @@ -0,0 +1,109 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "encoding/binary" + "errors" + "fmt" + "os" +) + +var ( + errBadELF = errors.New("malformed ELF binary") + errNoBuildID = errors.New("no NT_GNU_BUILD_ID found in ELF binary") +) + +// elfBuildID returns the GNU build ID of the named ELF binary, +// without introducing a dependency on debug/elf and its dependencies. +func elfBuildID(file string) (string, error) { + buf := make([]byte, 256) + f, err := os.Open(file) + if err != nil { + return "", err + } + defer f.Close() + + if _, err := f.ReadAt(buf[:64], 0); err != nil { + return "", err + } + + // ELF file begins with \x7F E L F. 
+ if buf[0] != 0x7F || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F' { + return "", errBadELF + } + + var byteOrder binary.ByteOrder + switch buf[5] { + default: + return "", errBadELF + case 1: // little-endian + byteOrder = binary.LittleEndian + case 2: // big-endian + byteOrder = binary.BigEndian + } + + var shnum int + var shoff, shentsize int64 + switch buf[4] { + default: + return "", errBadELF + case 1: // 32-bit file header + shoff = int64(byteOrder.Uint32(buf[32:])) + shentsize = int64(byteOrder.Uint16(buf[46:])) + if shentsize != 40 { + return "", errBadELF + } + shnum = int(byteOrder.Uint16(buf[48:])) + case 2: // 64-bit file header + shoff = int64(byteOrder.Uint64(buf[40:])) + shentsize = int64(byteOrder.Uint16(buf[58:])) + if shentsize != 64 { + return "", errBadELF + } + shnum = int(byteOrder.Uint16(buf[60:])) + } + + for i := 0; i < shnum; i++ { + if _, err := f.ReadAt(buf[:shentsize], shoff+int64(i)*shentsize); err != nil { + return "", err + } + if typ := byteOrder.Uint32(buf[4:]); typ != 7 { // SHT_NOTE + continue + } + var off, size int64 + if shentsize == 40 { + // 32-bit section header + off = int64(byteOrder.Uint32(buf[16:])) + size = int64(byteOrder.Uint32(buf[20:])) + } else { + // 64-bit section header + off = int64(byteOrder.Uint64(buf[24:])) + size = int64(byteOrder.Uint64(buf[32:])) + } + size += off + for off < size { + if _, err := f.ReadAt(buf[:16], off); err != nil { // room for header + name GNU\x00 + return "", err + } + nameSize := int(byteOrder.Uint32(buf[0:])) + descSize := int(byteOrder.Uint32(buf[4:])) + noteType := int(byteOrder.Uint32(buf[8:])) + descOff := off + int64(12+(nameSize+3)&^3) + off = descOff + int64((descSize+3)&^3) + if nameSize != 4 || noteType != 3 || buf[12] != 'G' || buf[13] != 'N' || buf[14] != 'U' || buf[15] != '\x00' { // want name GNU\x00 type 3 (NT_GNU_BUILD_ID) + continue + } + if descSize > len(buf) { + return "", errBadELF + } + if _, err := f.ReadAt(buf[:descSize], descOff); err != nil { + return 
"", err + } + return fmt.Sprintf("%x", buf[:descSize]), nil + } + } + return "", errNoBuildID +} diff --git a/src/runtime/pprof/label.go b/src/runtime/pprof/label.go new file mode 100644 index 0000000..d39e0ad --- /dev/null +++ b/src/runtime/pprof/label.go @@ -0,0 +1,108 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "context" + "fmt" + "sort" + "strings" +) + +type label struct { + key string + value string +} + +// LabelSet is a set of labels. +type LabelSet struct { + list []label +} + +// labelContextKey is the type of contextKeys used for profiler labels. +type labelContextKey struct{} + +func labelValue(ctx context.Context) labelMap { + labels, _ := ctx.Value(labelContextKey{}).(*labelMap) + if labels == nil { + return labelMap(nil) + } + return *labels +} + +// labelMap is the representation of the label set held in the context type. +// This is an initial implementation, but it will be replaced with something +// that admits incremental immutable modification more efficiently. +type labelMap map[string]string + +// String satisfies Stringer and returns key, value pairs in a consistent +// order. +func (l *labelMap) String() string { + if l == nil { + return "" + } + keyVals := make([]string, 0, len(*l)) + + for k, v := range *l { + keyVals = append(keyVals, fmt.Sprintf("%q:%q", k, v)) + } + + sort.Strings(keyVals) + + return "{" + strings.Join(keyVals, ", ") + "}" +} + +// WithLabels returns a new context.Context with the given labels added. +// A label overwrites a prior label with the same key. +func WithLabels(ctx context.Context, labels LabelSet) context.Context { + parentLabels := labelValue(ctx) + childLabels := make(labelMap, len(parentLabels)) + // TODO(matloob): replace the map implementation with something + // more efficient so creating a child context WithLabels doesn't need + // to clone the map. 
+ for k, v := range parentLabels { + childLabels[k] = v + } + for _, label := range labels.list { + childLabels[label.key] = label.value + } + return context.WithValue(ctx, labelContextKey{}, &childLabels) +} + +// Labels takes an even number of strings representing key-value pairs +// and makes a LabelSet containing them. +// A label overwrites a prior label with the same key. +// Currently only the CPU and goroutine profiles utilize any labels +// information. +// See https://golang.org/issue/23458 for details. +func Labels(args ...string) LabelSet { + if len(args)%2 != 0 { + panic("uneven number of arguments to pprof.Labels") + } + list := make([]label, 0, len(args)/2) + for i := 0; i+1 < len(args); i += 2 { + list = append(list, label{key: args[i], value: args[i+1]}) + } + return LabelSet{list: list} +} + +// Label returns the value of the label with the given key on ctx, and a boolean indicating +// whether that label exists. +func Label(ctx context.Context, key string) (string, bool) { + ctxLabels := labelValue(ctx) + v, ok := ctxLabels[key] + return v, ok +} + +// ForLabels invokes f with each label set on the context. +// The function f should return true to continue iteration or false to stop iteration early. 
+func ForLabels(ctx context.Context, f func(key, value string) bool) { + ctxLabels := labelValue(ctx) + for k, v := range ctxLabels { + if !f(k, v) { + break + } + } +} diff --git a/src/runtime/pprof/label_test.go b/src/runtime/pprof/label_test.go new file mode 100644 index 0000000..fcb00bd --- /dev/null +++ b/src/runtime/pprof/label_test.go @@ -0,0 +1,114 @@ +package pprof + +import ( + "context" + "reflect" + "sort" + "testing" +) + +func labelsSorted(ctx context.Context) []label { + ls := []label{} + ForLabels(ctx, func(key, value string) bool { + ls = append(ls, label{key, value}) + return true + }) + sort.Sort(labelSorter(ls)) + return ls +} + +type labelSorter []label + +func (s labelSorter) Len() int { return len(s) } +func (s labelSorter) Swap(i, j int) { s[i], s[j] = s[j], s[i] } +func (s labelSorter) Less(i, j int) bool { return s[i].key < s[j].key } + +func TestContextLabels(t *testing.T) { + // Background context starts with no labels. + ctx := context.Background() + labels := labelsSorted(ctx) + if len(labels) != 0 { + t.Errorf("labels on background context: want [], got %v ", labels) + } + + // Add a single label. + ctx = WithLabels(ctx, Labels("key", "value")) + // Retrieve it with Label. + v, ok := Label(ctx, "key") + if !ok || v != "value" { + t.Errorf(`Label(ctx, "key"): got %v, %v; want "value", ok`, v, ok) + } + gotLabels := labelsSorted(ctx) + wantLabels := []label{{"key", "value"}} + if !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("(sorted) labels on context: got %v, want %v", gotLabels, wantLabels) + } + + // Add a label with a different key. 
+ ctx = WithLabels(ctx, Labels("key2", "value2")) + v, ok = Label(ctx, "key2") + if !ok || v != "value2" { + t.Errorf(`Label(ctx, "key2"): got %v, %v; want "value2", ok`, v, ok) + } + gotLabels = labelsSorted(ctx) + wantLabels = []label{{"key", "value"}, {"key2", "value2"}} + if !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("(sorted) labels on context: got %v, want %v", gotLabels, wantLabels) + } + + // Add label with first key to test label replacement. + ctx = WithLabels(ctx, Labels("key", "value3")) + v, ok = Label(ctx, "key") + if !ok || v != "value3" { + t.Errorf(`Label(ctx, "key3"): got %v, %v; want "value3", ok`, v, ok) + } + gotLabels = labelsSorted(ctx) + wantLabels = []label{{"key", "value3"}, {"key2", "value2"}} + if !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("(sorted) labels on context: got %v, want %v", gotLabels, wantLabels) + } + + // Labels called with two labels with the same key should pick the second. + ctx = WithLabels(ctx, Labels("key4", "value4a", "key4", "value4b")) + v, ok = Label(ctx, "key4") + if !ok || v != "value4b" { + t.Errorf(`Label(ctx, "key4"): got %v, %v; want "value4b", ok`, v, ok) + } + gotLabels = labelsSorted(ctx) + wantLabels = []label{{"key", "value3"}, {"key2", "value2"}, {"key4", "value4b"}} + if !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("(sorted) labels on context: got %v, want %v", gotLabels, wantLabels) + } +} + +func TestLabelMapStringer(t *testing.T) { + for _, tbl := range []struct { + m labelMap + expected string + }{ + { + m: labelMap{ + // empty map + }, + expected: "{}", + }, { + m: labelMap{ + "foo": "bar", + }, + expected: `{"foo":"bar"}`, + }, { + m: labelMap{ + "foo": "bar", + "key1": "value1", + "key2": "value2", + "key3": "value3", + "key4WithNewline": "\nvalue4", + }, + expected: `{"foo":"bar", "key1":"value1", "key2":"value2", "key3":"value3", "key4WithNewline":"\nvalue4"}`, + }, + } { + if got := tbl.m.String(); tbl.expected != got { + t.Errorf("%#v.String() = %q; want 
%q", tbl.m, got, tbl.expected) + } + } +} diff --git a/src/runtime/pprof/map.go b/src/runtime/pprof/map.go new file mode 100644 index 0000000..7c75872 --- /dev/null +++ b/src/runtime/pprof/map.go @@ -0,0 +1,90 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import "unsafe" + +// A profMap is a map from (stack, tag) to mapEntry. +// It grows without bound, but that's assumed to be OK. +type profMap struct { + hash map[uintptr]*profMapEntry + all *profMapEntry + last *profMapEntry + free []profMapEntry + freeStk []uintptr +} + +// A profMapEntry is a single entry in the profMap. +type profMapEntry struct { + nextHash *profMapEntry // next in hash list + nextAll *profMapEntry // next in list of all entries + stk []uintptr + tag unsafe.Pointer + count int64 +} + +func (m *profMap) lookup(stk []uint64, tag unsafe.Pointer) *profMapEntry { + // Compute hash of (stk, tag). + h := uintptr(0) + for _, x := range stk { + h = h<<8 | (h >> (8 * (unsafe.Sizeof(h) - 1))) + h += uintptr(x) * 41 + } + h = h<<8 | (h >> (8 * (unsafe.Sizeof(h) - 1))) + h += uintptr(tag) * 41 + + // Find entry if present. + var last *profMapEntry +Search: + for e := m.hash[h]; e != nil; last, e = e, e.nextHash { + if len(e.stk) != len(stk) || e.tag != tag { + continue + } + for j := range stk { + if e.stk[j] != uintptr(stk[j]) { + continue Search + } + } + // Move to front. + if last != nil { + last.nextHash = e.nextHash + e.nextHash = m.hash[h] + m.hash[h] = e + } + return e + } + + // Add new entry. + if len(m.free) < 1 { + m.free = make([]profMapEntry, 128) + } + e := &m.free[0] + m.free = m.free[1:] + e.nextHash = m.hash[h] + e.tag = tag + + if len(m.freeStk) < len(stk) { + m.freeStk = make([]uintptr, 1024) + } + // Limit cap to prevent append from clobbering freeStk. 
+ e.stk = m.freeStk[:len(stk):len(stk)] + m.freeStk = m.freeStk[len(stk):] + + for j := range stk { + e.stk[j] = uintptr(stk[j]) + } + if m.hash == nil { + m.hash = make(map[uintptr]*profMapEntry) + } + m.hash[h] = e + if m.all == nil { + m.all = e + m.last = e + } else { + m.last.nextAll = e + m.last = e + } + return e +} diff --git a/src/runtime/pprof/mprof_test.go b/src/runtime/pprof/mprof_test.go new file mode 100644 index 0000000..391588d --- /dev/null +++ b/src/runtime/pprof/mprof_test.go @@ -0,0 +1,176 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !js + +package pprof + +import ( + "bytes" + "fmt" + "internal/profile" + "reflect" + "regexp" + "runtime" + "testing" + "unsafe" +) + +var memSink any + +func allocateTransient1M() { + for i := 0; i < 1024; i++ { + memSink = &struct{ x [1024]byte }{} + } +} + +//go:noinline +func allocateTransient2M() { + memSink = make([]byte, 2<<20) +} + +func allocateTransient2MInline() { + memSink = make([]byte, 2<<20) +} + +type Obj32 struct { + link *Obj32 + pad [32 - unsafe.Sizeof(uintptr(0))]byte +} + +var persistentMemSink *Obj32 + +func allocatePersistent1K() { + for i := 0; i < 32; i++ { + // Can't use slice because that will introduce implicit allocations. + obj := &Obj32{link: persistentMemSink} + persistentMemSink = obj + } +} + +// Allocate transient memory using reflect.Call. + +func allocateReflectTransient() { + memSink = make([]byte, 2<<20) +} + +func allocateReflect() { + rv := reflect.ValueOf(allocateReflectTransient) + rv.Call(nil) +} + +var memoryProfilerRun = 0 + +func TestMemoryProfiler(t *testing.T) { + // Disable sampling, otherwise it's difficult to assert anything. + oldRate := runtime.MemProfileRate + runtime.MemProfileRate = 1 + defer func() { + runtime.MemProfileRate = oldRate + }() + + // Allocate a meg to ensure that mcache.nextSample is updated to 1. 
+ for i := 0; i < 1024; i++ { + memSink = make([]byte, 1024) + } + + // Do the interesting allocations. + allocateTransient1M() + allocateTransient2M() + allocateTransient2MInline() + allocatePersistent1K() + allocateReflect() + memSink = nil + + runtime.GC() // materialize stats + + memoryProfilerRun++ + + tests := []struct { + stk []string + legacy string + }{{ + stk: []string{"runtime/pprof.allocatePersistent1K", "runtime/pprof.TestMemoryProfiler"}, + legacy: fmt.Sprintf(`%v: %v \[%v: %v\] @ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ +# 0x[0-9,a-f]+ runtime/pprof\.allocatePersistent1K\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test\.go:47 +# 0x[0-9,a-f]+ runtime/pprof\.TestMemoryProfiler\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test\.go:82 +`, 32*memoryProfilerRun, 1024*memoryProfilerRun, 32*memoryProfilerRun, 1024*memoryProfilerRun), + }, { + stk: []string{"runtime/pprof.allocateTransient1M", "runtime/pprof.TestMemoryProfiler"}, + legacy: fmt.Sprintf(`0: 0 \[%v: %v\] @ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ +# 0x[0-9,a-f]+ runtime/pprof\.allocateTransient1M\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:24 +# 0x[0-9,a-f]+ runtime/pprof\.TestMemoryProfiler\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:79 +`, (1<<10)*memoryProfilerRun, (1<<20)*memoryProfilerRun), + }, { + stk: []string{"runtime/pprof.allocateTransient2M", "runtime/pprof.TestMemoryProfiler"}, + legacy: fmt.Sprintf(`0: 0 \[%v: %v\] @ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ +# 0x[0-9,a-f]+ runtime/pprof\.allocateTransient2M\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:30 +# 0x[0-9,a-f]+ runtime/pprof\.TestMemoryProfiler\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:80 +`, memoryProfilerRun, (2<<20)*memoryProfilerRun), + }, { + stk: []string{"runtime/pprof.allocateTransient2MInline", "runtime/pprof.TestMemoryProfiler"}, + legacy: fmt.Sprintf(`0: 0 \[%v: %v\] @ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ 0x[0-9,a-f]+ +# 0x[0-9,a-f]+ runtime/pprof\.allocateTransient2MInline\+0x[0-9,a-f]+ 
.*runtime/pprof/mprof_test.go:34 +# 0x[0-9,a-f]+ runtime/pprof\.TestMemoryProfiler\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:81 +`, memoryProfilerRun, (2<<20)*memoryProfilerRun), + }, { + stk: []string{"runtime/pprof.allocateReflectTransient"}, + legacy: fmt.Sprintf(`0: 0 \[%v: %v\] @( 0x[0-9,a-f]+)+ +# 0x[0-9,a-f]+ runtime/pprof\.allocateReflectTransient\+0x[0-9,a-f]+ .*runtime/pprof/mprof_test.go:55 +`, memoryProfilerRun, (2<<20)*memoryProfilerRun), + }} + + t.Run("debug=1", func(t *testing.T) { + var buf bytes.Buffer + if err := Lookup("heap").WriteTo(&buf, 1); err != nil { + t.Fatalf("failed to write heap profile: %v", err) + } + + for _, test := range tests { + if !regexp.MustCompile(test.legacy).Match(buf.Bytes()) { + t.Fatalf("The entry did not match:\n%v\n\nProfile:\n%v\n", test.legacy, buf.String()) + } + } + }) + + t.Run("proto", func(t *testing.T) { + var buf bytes.Buffer + if err := Lookup("heap").WriteTo(&buf, 0); err != nil { + t.Fatalf("failed to write heap profile: %v", err) + } + p, err := profile.Parse(&buf) + if err != nil { + t.Fatalf("failed to parse heap profile: %v", err) + } + t.Logf("Profile = %v", p) + + stks := stacks(p) + for _, test := range tests { + if !containsStack(stks, test.stk) { + t.Fatalf("No matching stack entry for %q\n\nProfile:\n%v\n", test.stk, p) + } + } + + if !containsInlinedCall(TestMemoryProfiler, 4<<10) { + t.Logf("Can't determine whether allocateTransient2MInline was inlined into TestMemoryProfiler.") + return + } + + // Check the inlined function location is encoded correctly. 
+ for _, loc := range p.Location { + inlinedCaller, inlinedCallee := false, false + for _, line := range loc.Line { + if line.Function.Name == "runtime/pprof.allocateTransient2MInline" { + inlinedCallee = true + } + if inlinedCallee && line.Function.Name == "runtime/pprof.TestMemoryProfiler" { + inlinedCaller = true + } + } + if inlinedCallee != inlinedCaller { + t.Errorf("want allocateTransient2MInline after TestMemoryProfiler in one location, got separate location entries:\n%v", loc) + } + } + }) +} diff --git a/src/runtime/pprof/pe.go b/src/runtime/pprof/pe.go new file mode 100644 index 0000000..4105458 --- /dev/null +++ b/src/runtime/pprof/pe.go @@ -0,0 +1,19 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import "os" + +// peBuildID returns a best effort unique ID for the named executable. +// +// It would be wasteful to calculate the hash of the whole file, +// instead use the binary name and the last modified time for the buildid. +func peBuildID(file string) string { + s, err := os.Stat(file) + if err != nil { + return file + } + return file + s.ModTime().String() +} diff --git a/src/runtime/pprof/pprof.go b/src/runtime/pprof/pprof.go new file mode 100644 index 0000000..17a490e --- /dev/null +++ b/src/runtime/pprof/pprof.go @@ -0,0 +1,910 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package pprof writes runtime profiling data in the format expected +// by the pprof visualization tool. +// +// # Profiling a Go program +// +// The first step to profiling a Go program is to enable profiling. +// Support for profiling benchmarks built with the standard testing +// package is built into go test. 
For example, the following command +// runs benchmarks in the current directory and writes the CPU and +// memory profiles to cpu.prof and mem.prof: +// +// go test -cpuprofile cpu.prof -memprofile mem.prof -bench . +// +// To add equivalent profiling support to a standalone program, add +// code like the following to your main function: +// +// var cpuprofile = flag.String("cpuprofile", "", "write cpu profile to `file`") +// var memprofile = flag.String("memprofile", "", "write memory profile to `file`") +// +// func main() { +// flag.Parse() +// if *cpuprofile != "" { +// f, err := os.Create(*cpuprofile) +// if err != nil { +// log.Fatal("could not create CPU profile: ", err) +// } +// defer f.Close() // error handling omitted for example +// if err := pprof.StartCPUProfile(f); err != nil { +// log.Fatal("could not start CPU profile: ", err) +// } +// defer pprof.StopCPUProfile() +// } +// +// // ... rest of the program ... +// +// if *memprofile != "" { +// f, err := os.Create(*memprofile) +// if err != nil { +// log.Fatal("could not create memory profile: ", err) +// } +// defer f.Close() // error handling omitted for example +// runtime.GC() // get up-to-date statistics +// if err := pprof.WriteHeapProfile(f); err != nil { +// log.Fatal("could not write memory profile: ", err) +// } +// } +// } +// +// There is also a standard HTTP interface to profiling data. Adding +// the following line will install handlers under the /debug/pprof/ +// URL to download live profiles: +// +// import _ "net/http/pprof" +// +// See the net/http/pprof package for more details. +// +// Profiles can then be visualized with the pprof tool: +// +// go tool pprof cpu.prof +// +// There are many commands available from the pprof command line. +// Commonly used commands include "top", which prints a summary of the +// top program hot-spots, and "web", which opens an interactive graph +// of hot-spots and their call graphs. Use "help" for information on +// all pprof commands. 
+// +// For more information about pprof, see +// https://github.com/google/pprof/blob/master/doc/README.md. +package pprof + +import ( + "bufio" + "fmt" + "internal/abi" + "io" + "runtime" + "sort" + "strings" + "sync" + "text/tabwriter" + "time" + "unsafe" +) + +// BUG(rsc): Profiles are only as good as the kernel support used to generate them. +// See https://golang.org/issue/13841 for details about known problems. + +// A Profile is a collection of stack traces showing the call sequences +// that led to instances of a particular event, such as allocation. +// Packages can create and maintain their own profiles; the most common +// use is for tracking resources that must be explicitly closed, such as files +// or network connections. +// +// A Profile's methods can be called from multiple goroutines simultaneously. +// +// Each Profile has a unique name. A few profiles are predefined: +// +// goroutine - stack traces of all current goroutines +// heap - a sampling of memory allocations of live objects +// allocs - a sampling of all past memory allocations +// threadcreate - stack traces that led to the creation of new OS threads +// block - stack traces that led to blocking on synchronization primitives +// mutex - stack traces of holders of contended mutexes +// +// These predefined profiles maintain themselves and panic on an explicit +// Add or Remove method call. +// +// The heap profile reports statistics as of the most recently completed +// garbage collection; it elides more recent allocation to avoid skewing +// the profile away from live data and toward garbage. +// If there has been no garbage collection at all, the heap profile reports +// all known allocations. This exception helps mainly in programs running +// without garbage collection enabled, usually for debugging purposes. +// +// The heap profile tracks both the allocation sites for all live objects in +// the application memory and for all objects allocated since the program start. 
+// Pprof's -inuse_space, -inuse_objects, -alloc_space, and -alloc_objects +// flags select which to display, defaulting to -inuse_space (live objects, +// scaled by size). +// +// The allocs profile is the same as the heap profile but changes the default +// pprof display to -alloc_space, the total number of bytes allocated since +// the program began (including garbage-collected bytes). +// +// The CPU profile is not available as a Profile. It has a special API, +// the StartCPUProfile and StopCPUProfile functions, because it streams +// output to a writer during profiling. +type Profile struct { + name string + mu sync.Mutex + m map[any][]uintptr + count func() int + write func(io.Writer, int) error +} + +// profiles records all registered profiles. +var profiles struct { + mu sync.Mutex + m map[string]*Profile +} + +var goroutineProfile = &Profile{ + name: "goroutine", + count: countGoroutine, + write: writeGoroutine, +} + +var threadcreateProfile = &Profile{ + name: "threadcreate", + count: countThreadCreate, + write: writeThreadCreate, +} + +var heapProfile = &Profile{ + name: "heap", + count: countHeap, + write: writeHeap, +} + +var allocsProfile = &Profile{ + name: "allocs", + count: countHeap, // identical to heap profile + write: writeAlloc, +} + +var blockProfile = &Profile{ + name: "block", + count: countBlock, + write: writeBlock, +} + +var mutexProfile = &Profile{ + name: "mutex", + count: countMutex, + write: writeMutex, +} + +func lockProfiles() { + profiles.mu.Lock() + if profiles.m == nil { + // Initial built-in profiles. + profiles.m = map[string]*Profile{ + "goroutine": goroutineProfile, + "threadcreate": threadcreateProfile, + "heap": heapProfile, + "allocs": allocsProfile, + "block": blockProfile, + "mutex": mutexProfile, + } + } +} + +func unlockProfiles() { + profiles.mu.Unlock() +} + +// NewProfile creates a new profile with the given name. +// If a profile with that name already exists, NewProfile panics. 
+// The convention is to use a 'import/path.' prefix to create +// separate name spaces for each package. +// For compatibility with various tools that read pprof data, +// profile names should not contain spaces. +func NewProfile(name string) *Profile { + lockProfiles() + defer unlockProfiles() + if name == "" { + panic("pprof: NewProfile with empty name") + } + if profiles.m[name] != nil { + panic("pprof: NewProfile name already in use: " + name) + } + p := &Profile{ + name: name, + m: map[any][]uintptr{}, + } + profiles.m[name] = p + return p +} + +// Lookup returns the profile with the given name, or nil if no such profile exists. +func Lookup(name string) *Profile { + lockProfiles() + defer unlockProfiles() + return profiles.m[name] +} + +// Profiles returns a slice of all the known profiles, sorted by name. +func Profiles() []*Profile { + lockProfiles() + defer unlockProfiles() + + all := make([]*Profile, 0, len(profiles.m)) + for _, p := range profiles.m { + all = append(all, p) + } + + sort.Slice(all, func(i, j int) bool { return all[i].name < all[j].name }) + return all +} + +// Name returns this profile's name, which can be passed to Lookup to reobtain the profile. +func (p *Profile) Name() string { + return p.name +} + +// Count returns the number of execution stacks currently in the profile. +func (p *Profile) Count() int { + p.mu.Lock() + defer p.mu.Unlock() + if p.count != nil { + return p.count() + } + return len(p.m) +} + +// Add adds the current execution stack to the profile, associated with value. +// Add stores value in an internal map, so value must be suitable for use as +// a map key and will not be garbage collected until the corresponding +// call to Remove. Add panics if the profile already contains a stack for value. +// +// The skip parameter has the same meaning as runtime.Caller's skip +// and controls where the stack trace begins. Passing skip=0 begins the +// trace in the function calling Add. 
For example, given this +// execution stack: +// +// Add +// called from rpc.NewClient +// called from mypkg.Run +// called from main.main +// +// Passing skip=0 begins the stack trace at the call to Add inside rpc.NewClient. +// Passing skip=1 begins the stack trace at the call to NewClient inside mypkg.Run. +func (p *Profile) Add(value any, skip int) { + if p.name == "" { + panic("pprof: use of uninitialized Profile") + } + if p.write != nil { + panic("pprof: Add called on built-in Profile " + p.name) + } + + stk := make([]uintptr, 32) + n := runtime.Callers(skip+1, stk[:]) + stk = stk[:n] + if len(stk) == 0 { + // The value for skip is too large, and there's no stack trace to record. + stk = []uintptr{abi.FuncPCABIInternal(lostProfileEvent)} + } + + p.mu.Lock() + defer p.mu.Unlock() + if p.m[value] != nil { + panic("pprof: Profile.Add of duplicate value") + } + p.m[value] = stk +} + +// Remove removes the execution stack associated with value from the profile. +// It is a no-op if the value is not in the profile. +func (p *Profile) Remove(value any) { + p.mu.Lock() + defer p.mu.Unlock() + delete(p.m, value) +} + +// WriteTo writes a pprof-formatted snapshot of the profile to w. +// If a write to w returns an error, WriteTo returns that error. +// Otherwise, WriteTo returns nil. +// +// The debug parameter enables additional output. +// Passing debug=0 writes the gzip-compressed protocol buffer described +// in https://github.com/google/pprof/tree/master/proto#overview. +// Passing debug=1 writes the legacy text format with comments +// translating addresses to function names and line numbers, so that a +// programmer can read the profile without tools. +// +// The predefined profiles may assign meaning to other debug values; +// for example, when printing the "goroutine" profile, debug=2 means to +// print the goroutine stacks in the same form that a Go program uses +// when dying due to an unrecovered panic. 
+func (p *Profile) WriteTo(w io.Writer, debug int) error { + if p.name == "" { + panic("pprof: use of zero Profile") + } + if p.write != nil { + return p.write(w, debug) + } + + // Obtain consistent snapshot under lock; then process without lock. + p.mu.Lock() + all := make([][]uintptr, 0, len(p.m)) + for _, stk := range p.m { + all = append(all, stk) + } + p.mu.Unlock() + + // Map order is non-deterministic; make output deterministic. + sort.Slice(all, func(i, j int) bool { + t, u := all[i], all[j] + for k := 0; k < len(t) && k < len(u); k++ { + if t[k] != u[k] { + return t[k] < u[k] + } + } + return len(t) < len(u) + }) + + return printCountProfile(w, debug, p.name, stackProfile(all)) +} + +type stackProfile [][]uintptr + +func (x stackProfile) Len() int { return len(x) } +func (x stackProfile) Stack(i int) []uintptr { return x[i] } +func (x stackProfile) Label(i int) *labelMap { return nil } + +// A countProfile is a set of stack traces to be printed as counts +// grouped by stack trace. There are multiple implementations: +// all that matters is that we can find out how many traces there are +// and obtain each trace in turn. +type countProfile interface { + Len() int + Stack(i int) []uintptr + Label(i int) *labelMap +} + +// printCountCycleProfile outputs block profile records (for block or mutex profiles) +// as the pprof-proto format output. Translations from cycle count to time duration +// are done because The proto expects count and time (nanoseconds) instead of count +// and the number of cycles for block, contention profiles. +func printCountCycleProfile(w io.Writer, countName, cycleName string, records []runtime.BlockProfileRecord) error { + // Output profile in protobuf form. 
+ b := newProfileBuilder(w) + b.pbValueType(tagProfile_PeriodType, countName, "count") + b.pb.int64Opt(tagProfile_Period, 1) + b.pbValueType(tagProfile_SampleType, countName, "count") + b.pbValueType(tagProfile_SampleType, cycleName, "nanoseconds") + + cpuGHz := float64(runtime_cyclesPerSecond()) / 1e9 + + values := []int64{0, 0} + var locs []uint64 + for _, r := range records { + values[0] = r.Count + values[1] = int64(float64(r.Cycles) / cpuGHz) + // For count profiles, all stack addresses are + // return PCs, which is what appendLocsForStack expects. + locs = b.appendLocsForStack(locs[:0], r.Stack()) + b.pbSample(values, locs, nil) + } + b.build() + return nil +} + +// printCountProfile prints a countProfile at the specified debug level. +// The profile will be in compressed proto format unless debug is nonzero. +func printCountProfile(w io.Writer, debug int, name string, p countProfile) error { + // Build count of each stack. + var buf strings.Builder + key := func(stk []uintptr, lbls *labelMap) string { + buf.Reset() + fmt.Fprintf(&buf, "@") + for _, pc := range stk { + fmt.Fprintf(&buf, " %#x", pc) + } + if lbls != nil { + buf.WriteString("\n# labels: ") + buf.WriteString(lbls.String()) + } + return buf.String() + } + count := map[string]int{} + index := map[string]int{} + var keys []string + n := p.Len() + for i := 0; i < n; i++ { + k := key(p.Stack(i), p.Label(i)) + if count[k] == 0 { + index[k] = i + keys = append(keys, k) + } + count[k]++ + } + + sort.Sort(&keysByCount{keys, count}) + + if debug > 0 { + // Print debug profile in legacy format + tw := tabwriter.NewWriter(w, 1, 8, 1, '\t', 0) + fmt.Fprintf(tw, "%s profile: total %d\n", name, p.Len()) + for _, k := range keys { + fmt.Fprintf(tw, "%d %s\n", count[k], k) + printStackRecord(tw, p.Stack(index[k]), false) + } + return tw.Flush() + } + + // Output profile in protobuf form. 
+ b := newProfileBuilder(w) + b.pbValueType(tagProfile_PeriodType, name, "count") + b.pb.int64Opt(tagProfile_Period, 1) + b.pbValueType(tagProfile_SampleType, name, "count") + + values := []int64{0} + var locs []uint64 + for _, k := range keys { + values[0] = int64(count[k]) + // For count profiles, all stack addresses are + // return PCs, which is what appendLocsForStack expects. + locs = b.appendLocsForStack(locs[:0], p.Stack(index[k])) + idx := index[k] + var labels func() + if p.Label(idx) != nil { + labels = func() { + for k, v := range *p.Label(idx) { + b.pbLabel(tagSample_Label, k, v, 0) + } + } + } + b.pbSample(values, locs, labels) + } + b.build() + return nil +} + +// keysByCount sorts keys with higher counts first, breaking ties by key string order. +type keysByCount struct { + keys []string + count map[string]int +} + +func (x *keysByCount) Len() int { return len(x.keys) } +func (x *keysByCount) Swap(i, j int) { x.keys[i], x.keys[j] = x.keys[j], x.keys[i] } +func (x *keysByCount) Less(i, j int) bool { + ki, kj := x.keys[i], x.keys[j] + ci, cj := x.count[ki], x.count[kj] + if ci != cj { + return ci > cj + } + return ki < kj +} + +// printStackRecord prints the function + source line information +// for a single stack trace. +func printStackRecord(w io.Writer, stk []uintptr, allFrames bool) { + show := allFrames + frames := runtime.CallersFrames(stk) + for { + frame, more := frames.Next() + name := frame.Function + if name == "" { + show = true + fmt.Fprintf(w, "#\t%#x\n", frame.PC) + } else if name != "runtime.goexit" && (show || !strings.HasPrefix(name, "runtime.")) { + // Hide runtime.goexit and any runtime functions at the beginning. + // This is useful mainly for allocation traces. + show = true + fmt.Fprintf(w, "#\t%#x\t%s+%#x\t%s:%d\n", frame.PC, name, frame.PC-frame.Entry, frame.File, frame.Line) + } + if !more { + break + } + } + if !show { + // We didn't print anything; do it again, + // and this time include runtime functions. 
+ printStackRecord(w, stk, true) + return + } + fmt.Fprintf(w, "\n") +} + +// Interface to system profiles. + +// WriteHeapProfile is shorthand for Lookup("heap").WriteTo(w, 0). +// It is preserved for backwards compatibility. +func WriteHeapProfile(w io.Writer) error { + return writeHeap(w, 0) +} + +// countHeap returns the number of records in the heap profile. +func countHeap() int { + n, _ := runtime.MemProfile(nil, true) + return n +} + +// writeHeap writes the current runtime heap profile to w. +func writeHeap(w io.Writer, debug int) error { + return writeHeapInternal(w, debug, "") +} + +// writeAlloc writes the current runtime heap profile to w +// with the total allocation space as the default sample type. +func writeAlloc(w io.Writer, debug int) error { + return writeHeapInternal(w, debug, "alloc_space") +} + +func writeHeapInternal(w io.Writer, debug int, defaultSampleType string) error { + var memStats *runtime.MemStats + if debug != 0 { + // Read mem stats first, so that our other allocations + // do not appear in the statistics. + memStats = new(runtime.MemStats) + runtime.ReadMemStats(memStats) + } + + // Find out how many records there are (MemProfile(nil, true)), + // allocate that many records, and get the data. + // There's a race—more records might be added between + // the two calls—so allocate a few extra records for safety + // and also try again if we're very unlucky. + // The loop should only execute one iteration in the common case. + var p []runtime.MemProfileRecord + n, ok := runtime.MemProfile(nil, true) + for { + // Allocate room for a slightly bigger profile, + // in case a few more entries have been added + // since the call to MemProfile. + p = make([]runtime.MemProfileRecord, n+50) + n, ok = runtime.MemProfile(p, true) + if ok { + p = p[0:n] + break + } + // Profile grew; try again. 
+ } + + if debug == 0 { + return writeHeapProto(w, p, int64(runtime.MemProfileRate), defaultSampleType) + } + + sort.Slice(p, func(i, j int) bool { return p[i].InUseBytes() > p[j].InUseBytes() }) + + b := bufio.NewWriter(w) + tw := tabwriter.NewWriter(b, 1, 8, 1, '\t', 0) + w = tw + + var total runtime.MemProfileRecord + for i := range p { + r := &p[i] + total.AllocBytes += r.AllocBytes + total.AllocObjects += r.AllocObjects + total.FreeBytes += r.FreeBytes + total.FreeObjects += r.FreeObjects + } + + // Technically the rate is MemProfileRate not 2*MemProfileRate, + // but early versions of the C++ heap profiler reported 2*MemProfileRate, + // so that's what pprof has come to expect. + rate := 2 * runtime.MemProfileRate + + // pprof reads a profile with alloc == inuse as being a "2-column" profile + // (objects and bytes, not distinguishing alloc from inuse), + // but then such a profile can't be merged using pprof *.prof with + // other 4-column profiles where alloc != inuse. + // The easiest way to avoid this bug is to adjust allocBytes so it's never == inuseBytes. + // pprof doesn't use these header values anymore except for checking equality. + inUseBytes := total.InUseBytes() + allocBytes := total.AllocBytes + if inUseBytes == allocBytes { + allocBytes++ + } + + fmt.Fprintf(w, "heap profile: %d: %d [%d: %d] @ heap/%d\n", + total.InUseObjects(), inUseBytes, + total.AllocObjects, allocBytes, + rate) + + for i := range p { + r := &p[i] + fmt.Fprintf(w, "%d: %d [%d: %d] @", + r.InUseObjects(), r.InUseBytes(), + r.AllocObjects, r.AllocBytes) + for _, pc := range r.Stack() { + fmt.Fprintf(w, " %#x", pc) + } + fmt.Fprintf(w, "\n") + printStackRecord(w, r.Stack(), false) + } + + // Print memstats information too. 
+ // Pprof will ignore, but useful for people + s := memStats + fmt.Fprintf(w, "\n# runtime.MemStats\n") + fmt.Fprintf(w, "# Alloc = %d\n", s.Alloc) + fmt.Fprintf(w, "# TotalAlloc = %d\n", s.TotalAlloc) + fmt.Fprintf(w, "# Sys = %d\n", s.Sys) + fmt.Fprintf(w, "# Lookups = %d\n", s.Lookups) + fmt.Fprintf(w, "# Mallocs = %d\n", s.Mallocs) + fmt.Fprintf(w, "# Frees = %d\n", s.Frees) + + fmt.Fprintf(w, "# HeapAlloc = %d\n", s.HeapAlloc) + fmt.Fprintf(w, "# HeapSys = %d\n", s.HeapSys) + fmt.Fprintf(w, "# HeapIdle = %d\n", s.HeapIdle) + fmt.Fprintf(w, "# HeapInuse = %d\n", s.HeapInuse) + fmt.Fprintf(w, "# HeapReleased = %d\n", s.HeapReleased) + fmt.Fprintf(w, "# HeapObjects = %d\n", s.HeapObjects) + + fmt.Fprintf(w, "# Stack = %d / %d\n", s.StackInuse, s.StackSys) + fmt.Fprintf(w, "# MSpan = %d / %d\n", s.MSpanInuse, s.MSpanSys) + fmt.Fprintf(w, "# MCache = %d / %d\n", s.MCacheInuse, s.MCacheSys) + fmt.Fprintf(w, "# BuckHashSys = %d\n", s.BuckHashSys) + fmt.Fprintf(w, "# GCSys = %d\n", s.GCSys) + fmt.Fprintf(w, "# OtherSys = %d\n", s.OtherSys) + + fmt.Fprintf(w, "# NextGC = %d\n", s.NextGC) + fmt.Fprintf(w, "# LastGC = %d\n", s.LastGC) + fmt.Fprintf(w, "# PauseNs = %d\n", s.PauseNs) + fmt.Fprintf(w, "# PauseEnd = %d\n", s.PauseEnd) + fmt.Fprintf(w, "# NumGC = %d\n", s.NumGC) + fmt.Fprintf(w, "# NumForcedGC = %d\n", s.NumForcedGC) + fmt.Fprintf(w, "# GCCPUFraction = %v\n", s.GCCPUFraction) + fmt.Fprintf(w, "# DebugGC = %v\n", s.DebugGC) + + // Also flush out MaxRSS on supported platforms. + addMaxRSS(w) + + tw.Flush() + return b.Flush() +} + +// countThreadCreate returns the size of the current ThreadCreateProfile. +func countThreadCreate() int { + n, _ := runtime.ThreadCreateProfile(nil) + return n +} + +// writeThreadCreate writes the current runtime ThreadCreateProfile to w. 
+func writeThreadCreate(w io.Writer, debug int) error { + // Until https://golang.org/issues/6104 is addressed, wrap + // ThreadCreateProfile because there's no point in tracking labels when we + // don't get any stack-traces. + return writeRuntimeProfile(w, debug, "threadcreate", func(p []runtime.StackRecord, _ []unsafe.Pointer) (n int, ok bool) { + return runtime.ThreadCreateProfile(p) + }) +} + +// countGoroutine returns the number of goroutines. +func countGoroutine() int { + return runtime.NumGoroutine() +} + +// runtime_goroutineProfileWithLabels is defined in runtime/mprof.go +func runtime_goroutineProfileWithLabels(p []runtime.StackRecord, labels []unsafe.Pointer) (n int, ok bool) + +// writeGoroutine writes the current runtime GoroutineProfile to w. +func writeGoroutine(w io.Writer, debug int) error { + if debug >= 2 { + return writeGoroutineStacks(w) + } + return writeRuntimeProfile(w, debug, "goroutine", runtime_goroutineProfileWithLabels) +} + +func writeGoroutineStacks(w io.Writer) error { + // We don't know how big the buffer needs to be to collect + // all the goroutines. Start with 1 MB and try a few times, doubling each time. + // Give up and use a truncated trace if 64 MB is not enough. + buf := make([]byte, 1<<20) + for i := 0; ; i++ { + n := runtime.Stack(buf, true) + if n < len(buf) { + buf = buf[:n] + break + } + if len(buf) >= 64<<20 { + // Filled 64 MB - stop there. + break + } + buf = make([]byte, 2*len(buf)) + } + _, err := w.Write(buf) + return err +} + +func writeRuntimeProfile(w io.Writer, debug int, name string, fetch func([]runtime.StackRecord, []unsafe.Pointer) (int, bool)) error { + // Find out how many records there are (fetch(nil)), + // allocate that many records, and get the data. + // There's a race—more records might be added between + // the two calls—so allocate a few extra records for safety + // and also try again if we're very unlucky. + // The loop should only execute one iteration in the common case. 
+ var p []runtime.StackRecord + var labels []unsafe.Pointer + n, ok := fetch(nil, nil) + for { + // Allocate room for a slightly bigger profile, + // in case a few more entries have been added + // since the call to ThreadProfile. + p = make([]runtime.StackRecord, n+10) + labels = make([]unsafe.Pointer, n+10) + n, ok = fetch(p, labels) + if ok { + p = p[0:n] + break + } + // Profile grew; try again. + } + + return printCountProfile(w, debug, name, &runtimeProfile{p, labels}) +} + +type runtimeProfile struct { + stk []runtime.StackRecord + labels []unsafe.Pointer +} + +func (p *runtimeProfile) Len() int { return len(p.stk) } +func (p *runtimeProfile) Stack(i int) []uintptr { return p.stk[i].Stack() } +func (p *runtimeProfile) Label(i int) *labelMap { return (*labelMap)(p.labels[i]) } + +var cpu struct { + sync.Mutex + profiling bool + done chan bool +} + +// StartCPUProfile enables CPU profiling for the current process. +// While profiling, the profile will be buffered and written to w. +// StartCPUProfile returns an error if profiling is already enabled. +// +// On Unix-like systems, StartCPUProfile does not work by default for +// Go code built with -buildmode=c-archive or -buildmode=c-shared. +// StartCPUProfile relies on the SIGPROF signal, but that signal will +// be delivered to the main program's SIGPROF signal handler (if any) +// not to the one used by Go. To make it work, call os/signal.Notify +// for syscall.SIGPROF, but note that doing so may break any profiling +// being done by the main program. +func StartCPUProfile(w io.Writer) error { + // The runtime routines allow a variable profiling rate, + // but in practice operating systems cannot trigger signals + // at more than about 500 Hz, and our processing of the + // signal is not cheap (mostly getting the stack trace). 
+ // 100 Hz is a reasonable choice: it is frequent enough to + // produce useful data, rare enough not to bog down the + // system, and a nice round number to make it easy to + // convert sample counts to seconds. Instead of requiring + // each client to specify the frequency, we hard code it. + const hz = 100 + + cpu.Lock() + defer cpu.Unlock() + if cpu.done == nil { + cpu.done = make(chan bool) + } + // Double-check. + if cpu.profiling { + return fmt.Errorf("cpu profiling already in use") + } + cpu.profiling = true + runtime.SetCPUProfileRate(hz) + go profileWriter(w) + return nil +} + +// readProfile, provided by the runtime, returns the next chunk of +// binary CPU profiling stack trace data, blocking until data is available. +// If profiling is turned off and all the profile data accumulated while it was +// on has been returned, readProfile returns eof=true. +// The caller must save the returned data and tags before calling readProfile again. +func readProfile() (data []uint64, tags []unsafe.Pointer, eof bool) + +func profileWriter(w io.Writer) { + b := newProfileBuilder(w) + var err error + for { + time.Sleep(100 * time.Millisecond) + data, tags, eof := readProfile() + if e := b.addCPUData(data, tags); e != nil && err == nil { + err = e + } + if eof { + break + } + } + if err != nil { + // The runtime should never produce an invalid or truncated profile. + // It drops records that can't fit into its log buffers. + panic("runtime/pprof: converting profile: " + err.Error()) + } + b.build() + cpu.done <- true +} + +// StopCPUProfile stops the current CPU profile, if any. +// StopCPUProfile only returns after all the writes for the +// profile have completed. +func StopCPUProfile() { + cpu.Lock() + defer cpu.Unlock() + + if !cpu.profiling { + return + } + cpu.profiling = false + runtime.SetCPUProfileRate(0) + <-cpu.done +} + +// countBlock returns the number of records in the blocking profile. 
+func countBlock() int { + n, _ := runtime.BlockProfile(nil) + return n +} + +// countMutex returns the number of records in the mutex profile. +func countMutex() int { + n, _ := runtime.MutexProfile(nil) + return n +} + +// writeBlock writes the current blocking profile to w. +func writeBlock(w io.Writer, debug int) error { + return writeProfileInternal(w, debug, "contention", runtime.BlockProfile) +} + +// writeMutex writes the current mutex profile to w. +func writeMutex(w io.Writer, debug int) error { + return writeProfileInternal(w, debug, "mutex", runtime.MutexProfile) +} + +// writeProfileInternal writes the current blocking or mutex profile depending on the passed parameters. +func writeProfileInternal(w io.Writer, debug int, name string, runtimeProfile func([]runtime.BlockProfileRecord) (int, bool)) error { + var p []runtime.BlockProfileRecord + n, ok := runtimeProfile(nil) + for { + p = make([]runtime.BlockProfileRecord, n+50) + n, ok = runtimeProfile(p) + if ok { + p = p[:n] + break + } + } + + sort.Slice(p, func(i, j int) bool { return p[i].Cycles > p[j].Cycles }) + + if debug <= 0 { + return printCountCycleProfile(w, "contentions", "delay", p) + } + + b := bufio.NewWriter(w) + tw := tabwriter.NewWriter(w, 1, 8, 1, '\t', 0) + w = tw + + fmt.Fprintf(w, "--- %v:\n", name) + fmt.Fprintf(w, "cycles/second=%v\n", runtime_cyclesPerSecond()) + if name == "mutex" { + fmt.Fprintf(w, "sampling period=%d\n", runtime.SetMutexProfileFraction(-1)) + } + for i := range p { + r := &p[i] + fmt.Fprintf(w, "%v %v @", r.Cycles, r.Count) + for _, pc := range r.Stack() { + fmt.Fprintf(w, " %#x", pc) + } + fmt.Fprint(w, "\n") + if debug > 0 { + printStackRecord(w, r.Stack(), true) + } + } + + if tw != nil { + tw.Flush() + } + return b.Flush() +} + +func runtime_cyclesPerSecond() int64 diff --git a/src/runtime/pprof/pprof_norusage.go b/src/runtime/pprof/pprof_norusage.go new file mode 100644 index 0000000..8de3808 --- /dev/null +++ b/src/runtime/pprof/pprof_norusage.go @@ -0,0 
+1,15 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !aix && !darwin && !dragonfly && !freebsd && !linux && !netbsd && !openbsd && !solaris && !windows + +package pprof + +import ( + "io" +) + +// Stub call for platforms that don't support rusage. +func addMaxRSS(w io.Writer) { +} diff --git a/src/runtime/pprof/pprof_rusage.go b/src/runtime/pprof/pprof_rusage.go new file mode 100644 index 0000000..aa429fb --- /dev/null +++ b/src/runtime/pprof/pprof_rusage.go @@ -0,0 +1,35 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package pprof + +import ( + "fmt" + "io" + "runtime" + "syscall" +) + +// Adds MaxRSS to platforms that are supported. +func addMaxRSS(w io.Writer) { + var rssToBytes uintptr + switch runtime.GOOS { + case "aix", "android", "dragonfly", "freebsd", "linux", "netbsd", "openbsd": + rssToBytes = 1024 + case "darwin", "ios": + rssToBytes = 1 + case "illumos", "solaris": + rssToBytes = uintptr(syscall.Getpagesize()) + default: + panic("unsupported OS") + } + + var rusage syscall.Rusage + err := syscall.Getrusage(syscall.RUSAGE_SELF, &rusage) + if err == nil { + fmt.Fprintf(w, "# MaxRSS = %d\n", uintptr(rusage.Maxrss)*rssToBytes) + } +} diff --git a/src/runtime/pprof/pprof_test.go b/src/runtime/pprof/pprof_test.go new file mode 100644 index 0000000..53688ad --- /dev/null +++ b/src/runtime/pprof/pprof_test.go @@ -0,0 +1,2301 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build !js + +package pprof + +import ( + "bytes" + "context" + "fmt" + "internal/abi" + "internal/profile" + "internal/syscall/unix" + "internal/testenv" + "io" + "math" + "math/big" + "os" + "os/exec" + "regexp" + "runtime" + "runtime/debug" + "strings" + "sync" + "sync/atomic" + "testing" + "time" + _ "unsafe" +) + +func cpuHogger(f func(x int) int, y *int, dur time.Duration) { + // We only need to get one 100 Hz clock tick, so we've got + // a large safety buffer. + // But do at least 500 iterations (which should take about 100ms), + // otherwise TestCPUProfileMultithreaded can fail if only one + // thread is scheduled during the testing period. + t0 := time.Now() + accum := *y + for i := 0; i < 500 || time.Since(t0) < dur; i++ { + accum = f(accum) + } + *y = accum +} + +var ( + salt1 = 0 + salt2 = 0 +) + +// The actual CPU hogging function. +// Must not call other functions nor access heap/globals in the loop, +// otherwise under race detector the samples will be in the race runtime. +func cpuHog1(x int) int { + return cpuHog0(x, 1e5) +} + +func cpuHog0(x, n int) int { + foo := x + for i := 0; i < n; i++ { + if foo > 0 { + foo *= foo + } else { + foo *= foo + 1 + } + } + return foo +} + +func cpuHog2(x int) int { + foo := x + for i := 0; i < 1e5; i++ { + if foo > 0 { + foo *= foo + } else { + foo *= foo + 2 + } + } + return foo +} + +// Return a list of functions that we don't want to ever appear in CPU +// profiles. For gccgo, that list includes the sigprof handler itself. 
+func avoidFunctions() []string { + if runtime.Compiler == "gccgo" { + return []string{"runtime.sigprof"} + } + return nil +} + +func TestCPUProfile(t *testing.T) { + matches := matchAndAvoidStacks(stackContains, []string{"runtime/pprof.cpuHog1"}, avoidFunctions()) + testCPUProfile(t, matches, func(dur time.Duration) { + cpuHogger(cpuHog1, &salt1, dur) + }) +} + +func TestCPUProfileMultithreaded(t *testing.T) { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(2)) + matches := matchAndAvoidStacks(stackContains, []string{"runtime/pprof.cpuHog1", "runtime/pprof.cpuHog2"}, avoidFunctions()) + testCPUProfile(t, matches, func(dur time.Duration) { + c := make(chan int) + go func() { + cpuHogger(cpuHog1, &salt1, dur) + c <- 1 + }() + cpuHogger(cpuHog2, &salt2, dur) + <-c + }) +} + +func TestCPUProfileMultithreadMagnitude(t *testing.T) { + if runtime.GOOS != "linux" { + t.Skip("issue 35057 is only confirmed on Linux") + } + + // Linux [5.9,5.16) has a kernel bug that can break CPU timers on newly + // created threads, breaking our CPU accounting. + major, minor := unix.KernelVersion() + t.Logf("Running on Linux %d.%d", major, minor) + defer func() { + if t.Failed() { + t.Logf("Failure of this test may indicate that your system suffers from a known Linux kernel bug fixed on newer kernels. See https://golang.org/issue/49065.") + } + }() + + // Disable on affected builders to avoid flakiness, but otherwise keep + // it enabled to potentially warn users that they are on a broken + // kernel. + if testenv.Builder() != "" && (runtime.GOARCH == "386" || runtime.GOARCH == "amd64") { + have59 := major > 5 || (major == 5 && minor >= 9) + have516 := major > 5 || (major == 5 && minor >= 16) + if have59 && !have516 { + testenv.SkipFlaky(t, 49065) + } + } + + // Run a workload in a single goroutine, then run copies of the same + // workload in several goroutines. 
For both the serial and parallel cases, + // the CPU time the process measures with its own profiler should match the + // total CPU usage that the OS reports. + // + // We could also check that increases in parallelism (GOMAXPROCS) lead to a + // linear increase in the CPU usage reported by both the OS and the + // profiler, but without a guarantee of exclusive access to CPU resources + // that is likely to be a flaky test. + + // Require the smaller value to be within 10%, or 40% in short mode. + maxDiff := 0.10 + if testing.Short() { + maxDiff = 0.40 + } + + compare := func(a, b time.Duration, maxDiff float64) error { + if a <= 0 || b <= 0 { + return fmt.Errorf("Expected both time reports to be positive") + } + + if a < b { + a, b = b, a + } + + diff := float64(a-b) / float64(a) + if diff > maxDiff { + return fmt.Errorf("CPU usage reports are too different (limit -%.1f%%, got -%.1f%%)", maxDiff*100, diff*100) + } + + return nil + } + + for _, tc := range []struct { + name string + workers int + }{ + { + name: "serial", + workers: 1, + }, + { + name: "parallel", + workers: runtime.GOMAXPROCS(0), + }, + } { + // check that the OS's perspective matches what the Go runtime measures. + t.Run(tc.name, func(t *testing.T) { + t.Logf("Running with %d workers", tc.workers) + + var userTime, systemTime time.Duration + matches := matchAndAvoidStacks(stackContains, []string{"runtime/pprof.cpuHog1"}, avoidFunctions()) + acceptProfile := func(t *testing.T, p *profile.Profile) bool { + if !matches(t, p) { + return false + } + + ok := true + for i, unit := range []string{"count", "nanoseconds"} { + if have, want := p.SampleType[i].Unit, unit; have != want { + t.Logf("pN SampleType[%d]; %q != %q", i, have, want) + ok = false + } + } + + // cpuHog1 called below is the primary source of CPU + // load, but there may be some background work by the + // runtime. 
Since the OS rusage measurement will + // include all work done by the process, also compare + // against all samples in our profile. + var value time.Duration + for _, sample := range p.Sample { + value += time.Duration(sample.Value[1]) * time.Nanosecond + } + + totalTime := userTime + systemTime + t.Logf("compare %s user + %s system = %s vs %s", userTime, systemTime, totalTime, value) + if err := compare(totalTime, value, maxDiff); err != nil { + t.Logf("compare got %v want nil", err) + ok = false + } + + return ok + } + + testCPUProfile(t, acceptProfile, func(dur time.Duration) { + userTime, systemTime = diffCPUTime(t, func() { + var wg sync.WaitGroup + var once sync.Once + for i := 0; i < tc.workers; i++ { + wg.Add(1) + go func() { + defer wg.Done() + var salt = 0 + cpuHogger(cpuHog1, &salt, dur) + once.Do(func() { salt1 = salt }) + }() + } + wg.Wait() + }) + }) + }) + } +} + +// containsInlinedCall reports whether the function body for the function f is +// known to contain an inlined function call within the first maxBytes bytes. +func containsInlinedCall(f any, maxBytes int) bool { + _, found := findInlinedCall(f, maxBytes) + return found +} + +// findInlinedCall returns the PC of an inlined function call within +// the function body for the function f if any. +func findInlinedCall(f any, maxBytes int) (pc uint64, found bool) { + fFunc := runtime.FuncForPC(uintptr(abi.FuncPCABIInternal(f))) + if fFunc == nil || fFunc.Entry() == 0 { + panic("failed to locate function entry") + } + + for offset := 0; offset < maxBytes; offset++ { + innerPC := fFunc.Entry() + uintptr(offset) + inner := runtime.FuncForPC(innerPC) + if inner == nil { + // No function known for this PC value. + // It might simply be misaligned, so keep searching. + continue + } + if inner.Entry() != fFunc.Entry() { + // Scanned past f and didn't find any inlined functions. + break + } + if inner.Name() != fFunc.Name() { + // This PC has f as its entry-point, but is not f. 
Therefore, it must be a + // function inlined into f. + return uint64(innerPC), true + } + } + + return 0, false +} + +func TestCPUProfileInlining(t *testing.T) { + if !containsInlinedCall(inlinedCaller, 4<<10) { + t.Skip("Can't determine whether inlinedCallee was inlined into inlinedCaller.") + } + + matches := matchAndAvoidStacks(stackContains, []string{"runtime/pprof.inlinedCallee", "runtime/pprof.inlinedCaller"}, avoidFunctions()) + p := testCPUProfile(t, matches, func(dur time.Duration) { + cpuHogger(inlinedCaller, &salt1, dur) + }) + + // Check if inlined function locations are encoded correctly. The inlinedCallee and inlinedCaller should be in one location. + for _, loc := range p.Location { + hasInlinedCallerAfterInlinedCallee, hasInlinedCallee := false, false + for _, line := range loc.Line { + if line.Function.Name == "runtime/pprof.inlinedCallee" { + hasInlinedCallee = true + } + if hasInlinedCallee && line.Function.Name == "runtime/pprof.inlinedCaller" { + hasInlinedCallerAfterInlinedCallee = true + } + } + if hasInlinedCallee != hasInlinedCallerAfterInlinedCallee { + t.Fatalf("want inlinedCallee followed by inlinedCaller, got separate Location entries:\n%v", p) + } + } +} + +func inlinedCaller(x int) int { + x = inlinedCallee(x, 1e5) + return x +} + +func inlinedCallee(x, n int) int { + return cpuHog0(x, n) +} + +//go:noinline +func dumpCallers(pcs []uintptr) { + if pcs == nil { + return + } + + skip := 2 // Callers and dumpCallers + runtime.Callers(skip, pcs) +} + +//go:noinline +func inlinedCallerDump(pcs []uintptr) { + inlinedCalleeDump(pcs) +} + +func inlinedCalleeDump(pcs []uintptr) { + dumpCallers(pcs) +} + +func TestCPUProfileRecursion(t *testing.T) { + matches := matchAndAvoidStacks(stackContains, []string{"runtime/pprof.inlinedCallee", "runtime/pprof.recursionCallee", "runtime/pprof.recursionCaller"}, avoidFunctions()) + p := testCPUProfile(t, matches, func(dur time.Duration) { + cpuHogger(recursionCaller, &salt1, dur) + }) + + // check the
Location encoding was not confused by recursive calls. + for i, loc := range p.Location { + recursionFunc := 0 + for _, line := range loc.Line { + if name := line.Function.Name; name == "runtime/pprof.recursionCaller" || name == "runtime/pprof.recursionCallee" { + recursionFunc++ + } + } + if recursionFunc > 1 { + t.Fatalf("want at most one recursionCaller or recursionCallee in one Location, got a violating Location (index: %d):\n%v", i, p) + } + } +} + +func recursionCaller(x int) int { + y := recursionCallee(3, x) + return y +} + +func recursionCallee(n, x int) int { + if n == 0 { + return 1 + } + y := inlinedCallee(x, 1e4) + return y * recursionCallee(n-1, x) +} + +func recursionChainTop(x int, pcs []uintptr) { + if x < 0 { + return + } + recursionChainMiddle(x, pcs) +} + +func recursionChainMiddle(x int, pcs []uintptr) { + recursionChainBottom(x, pcs) +} + +func recursionChainBottom(x int, pcs []uintptr) { + // This will be called each time, we only care about the last. We + // can't make this conditional or this function won't be inlined. + dumpCallers(pcs) + + recursionChainTop(x-1, pcs) +} + +func parseProfile(t *testing.T, valBytes []byte, f func(uintptr, []*profile.Location, map[string][]string)) *profile.Profile { + p, err := profile.Parse(bytes.NewReader(valBytes)) + if err != nil { + t.Fatal(err) + } + for _, sample := range p.Sample { + count := uintptr(sample.Value[0]) + f(count, sample.Location, sample.Label) + } + return p +} + +func cpuProfilingBroken() bool { + switch runtime.GOOS { + case "plan9": + // Profiling unimplemented. + return true + case "aix": + // See https://golang.org/issue/45170. + return true + case "ios", "dragonfly", "netbsd", "illumos", "solaris": + // See https://golang.org/issue/13841. + return true + case "openbsd": + if runtime.GOARCH == "arm" || runtime.GOARCH == "arm64" { + // See https://golang.org/issue/13841. 
+ return true + } + } + + return false +} + +// testCPUProfile runs f under the CPU profiler, checking for some conditions specified by need, +// as interpreted by matches, and returns the parsed profile. +func testCPUProfile(t *testing.T, matches profileMatchFunc, f func(dur time.Duration)) *profile.Profile { + switch runtime.GOOS { + case "darwin": + out, err := exec.Command("uname", "-a").CombinedOutput() + if err != nil { + t.Fatal(err) + } + vers := string(out) + t.Logf("uname -a: %v", vers) + case "plan9": + t.Skip("skipping on plan9") + } + + broken := cpuProfilingBroken() + + deadline, ok := t.Deadline() + if broken || !ok { + if broken && testing.Short() { + // If it's expected to be broken, no point waiting around. + deadline = time.Now().Add(1 * time.Second) + } else { + deadline = time.Now().Add(10 * time.Second) + } + } + + // If we're running a long test, start with a long duration + // for tests that try to make sure something *doesn't* happen. + duration := 5 * time.Second + if testing.Short() { + duration = 100 * time.Millisecond + } + + // Profiling tests are inherently flaky, especially on a + // loaded system, such as when this test is running with + // several others under go test std. If a test fails in a way + // that could mean it just didn't run long enough, try with a + // longer duration. + for { + var prof bytes.Buffer + if err := StartCPUProfile(&prof); err != nil { + t.Fatal(err) + } + f(duration) + StopCPUProfile() + + if p, ok := profileOk(t, matches, prof, duration); ok { + return p + } + + duration *= 2 + if time.Until(deadline) < duration { + break + } + t.Logf("retrying with %s duration", duration) + } + + if broken { + t.Skipf("ignoring failure on %s/%s; see golang.org/issue/13841", runtime.GOOS, runtime.GOARCH) + } + + // Ignore the failure if the tests are running in a QEMU-based emulator, + // QEMU is not perfect at emulating everything. + // IN_QEMU environmental variable is set by some of the Go builders. 
+ // IN_QEMU=1 indicates that the tests are running in QEMU. See issue 9605. + if os.Getenv("IN_QEMU") == "1" { + t.Skip("ignore the failure in QEMU; see golang.org/issue/9605") + } + t.FailNow() + return nil +} + +var diffCPUTimeImpl func(f func()) (user, system time.Duration) + +func diffCPUTime(t *testing.T, f func()) (user, system time.Duration) { + if fn := diffCPUTimeImpl; fn != nil { + return fn(f) + } + t.Fatalf("cannot measure CPU time on GOOS=%s GOARCH=%s", runtime.GOOS, runtime.GOARCH) + return 0, 0 +} + +func contains(slice []string, s string) bool { + for i := range slice { + if slice[i] == s { + return true + } + } + return false +} + +// stackContains matches if a function named spec appears anywhere in the stack trace. +func stackContains(spec string, count uintptr, stk []*profile.Location, labels map[string][]string) bool { + for _, loc := range stk { + for _, line := range loc.Line { + if strings.Contains(line.Function.Name, spec) { + return true + } + } + } + return false +} + +type sampleMatchFunc func(spec string, count uintptr, stk []*profile.Location, labels map[string][]string) bool + +func profileOk(t *testing.T, matches profileMatchFunc, prof bytes.Buffer, duration time.Duration) (_ *profile.Profile, ok bool) { + ok = true + + var samples uintptr + var buf strings.Builder + p := parseProfile(t, prof.Bytes(), func(count uintptr, stk []*profile.Location, labels map[string][]string) { + fmt.Fprintf(&buf, "%d:", count) + fprintStack(&buf, stk) + fmt.Fprintf(&buf, " labels: %v\n", labels) + samples += count + fmt.Fprintf(&buf, "\n") + }) + t.Logf("total %d CPU profile samples collected:\n%s", samples, buf.String()) + + if samples < 10 && runtime.GOOS == "windows" { + // On some windows machines we end up with + // not enough samples due to coarse timer + // resolution. Let it go. + t.Log("too few samples on Windows (golang.org/issue/10842)") + return p, false + } + + // Check that we got a reasonable number of samples. 
+ // We used to always require at least ideal/4 samples, + // but that is too hard to guarantee on a loaded system. + // Now we accept 10 or more samples, which we take to be + // enough to show that at least some profiling is occurring. + if ideal := uintptr(duration * 100 / time.Second); samples == 0 || (samples < ideal/4 && samples < 10) { + t.Logf("too few samples; got %d, want at least %d, ideally %d", samples, ideal/4, ideal) + ok = false + } + + if matches != nil && !matches(t, p) { + ok = false + } + + return p, ok +} + +type profileMatchFunc func(*testing.T, *profile.Profile) bool + +func matchAndAvoidStacks(matches sampleMatchFunc, need []string, avoid []string) profileMatchFunc { + return func(t *testing.T, p *profile.Profile) (ok bool) { + ok = true + + // Check that profile is well formed, contains 'need', and does not contain + // anything from 'avoid'. + have := make([]uintptr, len(need)) + avoidSamples := make([]uintptr, len(avoid)) + + for _, sample := range p.Sample { + count := uintptr(sample.Value[0]) + for i, spec := range need { + if matches(spec, count, sample.Location, sample.Label) { + have[i] += count + } + } + for i, name := range avoid { + for _, loc := range sample.Location { + for _, line := range loc.Line { + if strings.Contains(line.Function.Name, name) { + avoidSamples[i] += count + } + } + } + } + } + + for i, name := range avoid { + bad := avoidSamples[i] + if bad != 0 { + t.Logf("found %d samples in avoid-function %s\n", bad, name) + ok = false + } + } + + if len(need) == 0 { + return + } + + var total uintptr + for i, name := range need { + total += have[i] + t.Logf("found %d samples in expected function %s\n", have[i], name) + } + if total == 0 { + t.Logf("no samples in expected functions") + ok = false + } + + // We'd like to check a reasonable minimum, like + // total / len(have) / smallconstant, but this test is + // pretty flaky (see bug 7095). So we'll just test to + // make sure we got at least one sample. 
+ min := uintptr(1) + for i, name := range need { + if have[i] < min { + t.Logf("%s has %d samples out of %d, want at least %d, ideally %d", name, have[i], total, min, total/uintptr(len(have))) + ok = false + } + } + return + } +} + +// Fork can hang if preempted with signals frequently enough (see issue 5517). +// Ensure that we do not do this. +func TestCPUProfileWithFork(t *testing.T) { + testenv.MustHaveExec(t) + + heap := 1 << 30 + if runtime.GOOS == "android" { + // Use smaller size for Android to avoid crash. + heap = 100 << 20 + } + if runtime.GOOS == "windows" && runtime.GOARCH == "arm" { + // Use smaller heap for Windows/ARM to avoid crash. + heap = 100 << 20 + } + if testing.Short() { + heap = 100 << 20 + } + // This makes fork slower. + garbage := make([]byte, heap) + // Need to touch the slice, otherwise it won't be paged in. + done := make(chan bool) + go func() { + for i := range garbage { + garbage[i] = 42 + } + done <- true + }() + <-done + + var prof bytes.Buffer + if err := StartCPUProfile(&prof); err != nil { + t.Fatal(err) + } + defer StopCPUProfile() + + for i := 0; i < 10; i++ { + exec.Command(os.Args[0], "-h").CombinedOutput() + } +} + +// Test that profiler does not observe runtime.gogo as "user" goroutine execution. +// If it did, it would see inconsistent state and would either record an incorrect stack +// or crash because the stack was malformed. +func TestGoroutineSwitch(t *testing.T) { + if runtime.Compiler == "gccgo" { + t.Skip("not applicable for gccgo") + } + // How much to try. These defaults take about 1 second + // on a 2012 MacBook Pro. The ones in short mode take + // about 0.1 seconds.
+ tries := 10 + count := 1000000 + if testing.Short() { + tries = 1 + } + for try := 0; try < tries; try++ { + var prof bytes.Buffer + if err := StartCPUProfile(&prof); err != nil { + t.Fatal(err) + } + for i := 0; i < count; i++ { + runtime.Gosched() + } + StopCPUProfile() + + // Read profile to look for entries for gogo with an attempt at a traceback. + // "runtime.gogo" is OK, because that's the part of the context switch + // before the actual switch begins. But we should not see "gogo", + // aka "gogo<>(SB)", which does the actual switch and is marked SPWRITE. + parseProfile(t, prof.Bytes(), func(count uintptr, stk []*profile.Location, _ map[string][]string) { + // An entry with two frames with 'System' in its top frame + // exists to record a PC without a traceback. Those are okay. + if len(stk) == 2 { + name := stk[1].Line[0].Function.Name + if name == "runtime._System" || name == "runtime._ExternalCode" || name == "runtime._GC" { + return + } + } + + // An entry with just one frame is OK too: + // it knew to stop at gogo. + if len(stk) == 1 { + return + } + + // Otherwise, should not see gogo. + // The place we'd see it would be the inner most frame. + name := stk[0].Line[0].Function.Name + if name == "gogo" { + var buf strings.Builder + fprintStack(&buf, stk) + t.Fatalf("found profile entry for gogo:\n%s", buf.String()) + } + }) + } +} + +func fprintStack(w io.Writer, stk []*profile.Location) { + if len(stk) == 0 { + fmt.Fprintf(w, " (stack empty)") + } + for _, loc := range stk { + fmt.Fprintf(w, " %#x", loc.Address) + fmt.Fprintf(w, " (") + for i, line := range loc.Line { + if i > 0 { + fmt.Fprintf(w, " ") + } + fmt.Fprintf(w, "%s:%d", line.Function.Name, line.Line) + } + fmt.Fprintf(w, ")") + } +} + +// Test that profiling of division operations is okay, especially on ARM. See issue 6681. 
+func TestMathBigDivide(t *testing.T) { + testCPUProfile(t, nil, func(duration time.Duration) { + t := time.After(duration) + pi := new(big.Int) + for { + for i := 0; i < 100; i++ { + n := big.NewInt(2646693125139304345) + d := big.NewInt(842468587426513207) + pi.Div(n, d) + } + select { + case <-t: + return + default: + } + } + }) +} + +// stackContainsAll matches if all functions in spec (comma-separated) appear somewhere in the stack trace. +func stackContainsAll(spec string, count uintptr, stk []*profile.Location, labels map[string][]string) bool { + for _, f := range strings.Split(spec, ",") { + if !stackContains(f, count, stk, labels) { + return false + } + } + return true +} + +func TestMorestack(t *testing.T) { + matches := matchAndAvoidStacks(stackContainsAll, []string{"runtime.newstack,runtime/pprof.growstack"}, avoidFunctions()) + testCPUProfile(t, matches, func(duration time.Duration) { + t := time.After(duration) + c := make(chan bool) + for { + go func() { + growstack1() + c <- true + }() + select { + case <-t: + return + case <-c: + } + } + }) +} + +//go:noinline +func growstack1() { + growstack(10) +} + +//go:noinline +func growstack(n int) { + var buf [8 << 18]byte + use(buf) + if n > 0 { + growstack(n - 1) + } +} + +//go:noinline +func use(x [8 << 18]byte) {} + +func TestBlockProfile(t *testing.T) { + type TestCase struct { + name string + f func(*testing.T) + stk []string + re string + } + tests := [...]TestCase{ + { + name: "chan recv", + f: blockChanRecv, + stk: []string{ + "runtime.chanrecv1", + "runtime/pprof.blockChanRecv", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ runtime\.chanrecv1\+0x[0-9a-f]+ .*runtime/chan.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockChanRecv\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "chan send", + f: blockChanSend, + stk: []string{ + 
"runtime.chansend1", + "runtime/pprof.blockChanSend", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ runtime\.chansend1\+0x[0-9a-f]+ .*runtime/chan.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockChanSend\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "chan close", + f: blockChanClose, + stk: []string{ + "runtime.chanrecv1", + "runtime/pprof.blockChanClose", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ runtime\.chanrecv1\+0x[0-9a-f]+ .*runtime/chan.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockChanClose\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "select recv async", + f: blockSelectRecvAsync, + stk: []string{ + "runtime.selectgo", + "runtime/pprof.blockSelectRecvAsync", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ runtime\.selectgo\+0x[0-9a-f]+ .*runtime/select.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockSelectRecvAsync\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "select send sync", + f: blockSelectSendSync, + stk: []string{ + "runtime.selectgo", + "runtime/pprof.blockSelectSendSync", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ runtime\.selectgo\+0x[0-9a-f]+ .*runtime/select.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockSelectSendSync\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "mutex", + f: blockMutex, + stk: []string{ + "sync.(*Mutex).Lock", + "runtime/pprof.blockMutex", + 
"runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ sync\.\(\*Mutex\)\.Lock\+0x[0-9a-f]+ .*sync/mutex\.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockMutex\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + { + name: "cond", + f: blockCond, + stk: []string{ + "sync.(*Cond).Wait", + "runtime/pprof.blockCond", + "runtime/pprof.TestBlockProfile", + }, + re: ` +[0-9]+ [0-9]+ @( 0x[[:xdigit:]]+)+ +# 0x[0-9a-f]+ sync\.\(\*Cond\)\.Wait\+0x[0-9a-f]+ .*sync/cond\.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.blockCond\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +# 0x[0-9a-f]+ runtime/pprof\.TestBlockProfile\+0x[0-9a-f]+ .*runtime/pprof/pprof_test.go:[0-9]+ +`}, + } + + // Generate block profile + runtime.SetBlockProfileRate(1) + defer runtime.SetBlockProfileRate(0) + for _, test := range tests { + test.f(t) + } + + t.Run("debug=1", func(t *testing.T) { + var w strings.Builder + Lookup("block").WriteTo(&w, 1) + prof := w.String() + + if !strings.HasPrefix(prof, "--- contention:\ncycles/second=") { + t.Fatalf("Bad profile header:\n%v", prof) + } + + if strings.HasSuffix(prof, "#\t0x0\n\n") { + t.Errorf("Useless 0 suffix:\n%v", prof) + } + + for _, test := range tests { + if !regexp.MustCompile(strings.ReplaceAll(test.re, "\t", "\t+")).MatchString(prof) { + t.Errorf("Bad %v entry, expect:\n%v\ngot:\n%v", test.name, test.re, prof) + } + } + }) + + t.Run("proto", func(t *testing.T) { + // proto format + var w bytes.Buffer + Lookup("block").WriteTo(&w, 0) + p, err := profile.Parse(&w) + if err != nil { + t.Fatalf("failed to parse profile: %v", err) + } + t.Logf("parsed proto: %s", p) + if err := p.CheckValid(); err != nil { + t.Fatalf("invalid profile: %v", err) + } + + stks := stacks(p) + for _, test := range tests { + if !containsStack(stks, test.stk) { + t.Errorf("No matching stack entry for %v, want %+v", test.name, test.stk) + } + 
} + }) + +} + +func stacks(p *profile.Profile) (res [][]string) { + for _, s := range p.Sample { + var stk []string + for _, l := range s.Location { + for _, line := range l.Line { + stk = append(stk, line.Function.Name) + } + } + res = append(res, stk) + } + return res +} + +func containsStack(got [][]string, want []string) bool { + for _, stk := range got { + if len(stk) < len(want) { + continue + } + for i, f := range want { + if f != stk[i] { + break + } + if i == len(want)-1 { + return true + } + } + } + return false +} + +// awaitBlockedGoroutine spins on runtime.Gosched until a runtime stack dump +// shows a goroutine in the given state with a stack frame in +// runtime/pprof.<fName>. +func awaitBlockedGoroutine(t *testing.T, state, fName string) { + re := fmt.Sprintf(`(?m)^goroutine \d+ \[%s\]:\n(?:.+\n\t.+\n)*runtime/pprof\.%s`, regexp.QuoteMeta(state), fName) + r := regexp.MustCompile(re) + + if deadline, ok := t.Deadline(); ok { + if d := time.Until(deadline); d > 1*time.Second { + timer := time.AfterFunc(d-1*time.Second, func() { + debug.SetTraceback("all") + panic(fmt.Sprintf("timed out waiting for %#q", re)) + }) + defer timer.Stop() + } + } + + buf := make([]byte, 64<<10) + for { + runtime.Gosched() + n := runtime.Stack(buf, true) + if n == len(buf) { + // Buffer wasn't large enough for a full goroutine dump. + // Resize it and try again. 
+ buf = make([]byte, 2*len(buf)) + continue + } + if r.Match(buf[:n]) { + return + } + } +} + +func blockChanRecv(t *testing.T) { + c := make(chan bool) + go func() { + awaitBlockedGoroutine(t, "chan receive", "blockChanRecv") + c <- true + }() + <-c +} + +func blockChanSend(t *testing.T) { + c := make(chan bool) + go func() { + awaitBlockedGoroutine(t, "chan send", "blockChanSend") + <-c + }() + c <- true +} + +func blockChanClose(t *testing.T) { + c := make(chan bool) + go func() { + awaitBlockedGoroutine(t, "chan receive", "blockChanClose") + close(c) + }() + <-c +} + +func blockSelectRecvAsync(t *testing.T) { + const numTries = 3 + c := make(chan bool, 1) + c2 := make(chan bool, 1) + go func() { + for i := 0; i < numTries; i++ { + awaitBlockedGoroutine(t, "select", "blockSelectRecvAsync") + c <- true + } + }() + for i := 0; i < numTries; i++ { + select { + case <-c: + case <-c2: + } + } +} + +func blockSelectSendSync(t *testing.T) { + c := make(chan bool) + c2 := make(chan bool) + go func() { + awaitBlockedGoroutine(t, "select", "blockSelectSendSync") + <-c + }() + select { + case c <- true: + case c2 <- true: + } +} + +func blockMutex(t *testing.T) { + var mu sync.Mutex + mu.Lock() + go func() { + awaitBlockedGoroutine(t, "sync.Mutex.Lock", "blockMutex") + mu.Unlock() + }() + // Note: Unlock releases mu before recording the mutex event, + // so it's theoretically possible for this to proceed and + // capture the profile before the event is recorded. As long + // as this is blocked before the unlock happens, it's okay. + mu.Lock() +} + +func blockCond(t *testing.T) { + var mu sync.Mutex + c := sync.NewCond(&mu) + mu.Lock() + go func() { + awaitBlockedGoroutine(t, "sync.Cond.Wait", "blockCond") + mu.Lock() + c.Signal() + mu.Unlock() + }() + c.Wait() + mu.Unlock() +} + +// See http://golang.org/cl/299991. 
+func TestBlockProfileBias(t *testing.T) { + rate := int(1000) // arbitrary value + runtime.SetBlockProfileRate(rate) + defer runtime.SetBlockProfileRate(0) + + // simulate blocking events + blockFrequentShort(rate) + blockInfrequentLong(rate) + + var w bytes.Buffer + Lookup("block").WriteTo(&w, 0) + p, err := profile.Parse(&w) + if err != nil { + t.Fatalf("failed to parse profile: %v", err) + } + t.Logf("parsed proto: %s", p) + + il := float64(-1) // blockInfrequentLong duration + fs := float64(-1) // blockFrequentShort duration + for _, s := range p.Sample { + for _, l := range s.Location { + for _, line := range l.Line { + if len(s.Value) < 2 { + t.Fatal("block profile has less than 2 sample types") + } + + if line.Function.Name == "runtime/pprof.blockInfrequentLong" { + il = float64(s.Value[1]) + } else if line.Function.Name == "runtime/pprof.blockFrequentShort" { + fs = float64(s.Value[1]) + } + } + } + } + if il == -1 || fs == -1 { + t.Fatal("block profile is missing expected functions") + } + + // stddev of bias from 100 runs on local machine multiplied by 10x + const threshold = 0.2 + if bias := (il - fs) / il; math.Abs(bias) > threshold { + t.Fatalf("bias: abs(%f) > %f", bias, threshold) + } else { + t.Logf("bias: abs(%f) < %f", bias, threshold) + } +} + +// blockFrequentShort produces 100000 block events with an average duration of +// rate / 10. +func blockFrequentShort(rate int) { + for i := 0; i < 100000; i++ { + blockevent(int64(rate/10), 1) + } +} + +// blockInfrequentLong produces 10000 block events with an average duration of +// rate. +func blockInfrequentLong(rate int) { + for i := 0; i < 10000; i++ { + blockevent(int64(rate), 1) + } +} + +// Used by TestBlockProfileBias. 
+// +//go:linkname blockevent runtime.blockevent +func blockevent(cycles int64, skip int) + +func TestMutexProfile(t *testing.T) { + // Generate mutex profile + + old := runtime.SetMutexProfileFraction(1) + defer runtime.SetMutexProfileFraction(old) + if old != 0 { + t.Fatalf("need MutexProfileRate 0, got %d", old) + } + + blockMutex(t) + + t.Run("debug=1", func(t *testing.T) { + var w strings.Builder + Lookup("mutex").WriteTo(&w, 1) + prof := w.String() + t.Logf("received profile: %v", prof) + + if !strings.HasPrefix(prof, "--- mutex:\ncycles/second=") { + t.Errorf("Bad profile header:\n%v", prof) + } + prof = strings.Trim(prof, "\n") + lines := strings.Split(prof, "\n") + if len(lines) != 6 { + t.Errorf("expected 6 lines, got %d %q\n%s", len(lines), prof, prof) + } + if len(lines) < 6 { + return + } + // checking that the line is like "35258904 1 @ 0x48288d 0x47cd28 0x458931" + r2 := `^\d+ \d+ @(?: 0x[[:xdigit:]]+)+` + //r2 := "^[0-9]+ 1 @ 0x[0-9a-f x]+$" + if ok, err := regexp.MatchString(r2, lines[3]); err != nil || !ok { + t.Errorf("%q didn't match %q", lines[3], r2) + } + r3 := "^#.*runtime/pprof.blockMutex.*$" + if ok, err := regexp.MatchString(r3, lines[5]); err != nil || !ok { + t.Errorf("%q didn't match %q", lines[5], r3) + } + t.Logf(prof) + }) + t.Run("proto", func(t *testing.T) { + // proto format + var w bytes.Buffer + Lookup("mutex").WriteTo(&w, 0) + p, err := profile.Parse(&w) + if err != nil { + t.Fatalf("failed to parse profile: %v", err) + } + t.Logf("parsed proto: %s", p) + if err := p.CheckValid(); err != nil { + t.Fatalf("invalid profile: %v", err) + } + + stks := stacks(p) + for _, want := range [][]string{ + {"sync.(*Mutex).Unlock", "runtime/pprof.blockMutex.func1"}, + } { + if !containsStack(stks, want) { + t.Errorf("No matching stack entry for %+v", want) + } + } + }) +} + +func TestMutexProfileRateAdjust(t *testing.T) { + old := runtime.SetMutexProfileFraction(1) + defer runtime.SetMutexProfileFraction(old) + if old != 0 { + 
t.Fatalf("need MutexProfileRate 0, got %d", old) + } + + readProfile := func() (contentions int64, delay int64) { + var w bytes.Buffer + Lookup("mutex").WriteTo(&w, 0) + p, err := profile.Parse(&w) + if err != nil { + t.Fatalf("failed to parse profile: %v", err) + } + t.Logf("parsed proto: %s", p) + if err := p.CheckValid(); err != nil { + t.Fatalf("invalid profile: %v", err) + } + + for _, s := range p.Sample { + for _, l := range s.Location { + for _, line := range l.Line { + if line.Function.Name == "runtime/pprof.blockMutex.func1" { + contentions += s.Value[0] + delay += s.Value[1] + } + } + } + } + return + } + + blockMutex(t) + contentions, delay := readProfile() + if contentions == 0 || delay == 0 { + t.Fatal("did not see expected function in profile") + } + runtime.SetMutexProfileFraction(0) + newContentions, newDelay := readProfile() + if newContentions != contentions || newDelay != delay { + t.Fatalf("sample value changed: got [%d, %d], want [%d, %d]", newContentions, newDelay, contentions, delay) + } +} + +func func1(c chan int) { <-c } +func func2(c chan int) { <-c } +func func3(c chan int) { <-c } +func func4(c chan int) { <-c } + +func TestGoroutineCounts(t *testing.T) { + // Setting GOMAXPROCS to 1 ensures we can force all goroutines to the + // desired blocking point. + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + + c := make(chan int) + for i := 0; i < 100; i++ { + switch { + case i%10 == 0: + go func1(c) + case i%2 == 0: + go func2(c) + default: + go func3(c) + } + // Let goroutines block on channel + for j := 0; j < 5; j++ { + runtime.Gosched() + } + } + ctx := context.Background() + + // ... and again, with labels this time (just with fewer iterations to keep + // sorting deterministic). 
+ Do(ctx, Labels("label", "value"), func(context.Context) { + for i := 0; i < 89; i++ { + switch { + case i%10 == 0: + go func1(c) + case i%2 == 0: + go func2(c) + default: + go func3(c) + } + // Let goroutines block on channel + for j := 0; j < 5; j++ { + runtime.Gosched() + } + } + }) + + var w bytes.Buffer + goroutineProf := Lookup("goroutine") + + // Check debug profile + goroutineProf.WriteTo(&w, 1) + prof := w.String() + + labels := labelMap{"label": "value"} + labelStr := "\n# labels: " + labels.String() + if !containsInOrder(prof, "\n50 @ ", "\n44 @", labelStr, + "\n40 @", "\n36 @", labelStr, "\n10 @", "\n9 @", labelStr, "\n1 @") { + t.Errorf("expected sorted goroutine counts with Labels:\n%s", prof) + } + + // Check proto profile + w.Reset() + goroutineProf.WriteTo(&w, 0) + p, err := profile.Parse(&w) + if err != nil { + t.Errorf("error parsing protobuf profile: %v", err) + } + if err := p.CheckValid(); err != nil { + t.Errorf("protobuf profile is invalid: %v", err) + } + expectedLabels := map[int64]map[string]string{ + 50: {}, + 44: {"label": "value"}, + 40: {}, + 36: {"label": "value"}, + 10: {}, + 9: {"label": "value"}, + 1: {}, + } + if !containsCountsLabels(p, expectedLabels) { + t.Errorf("expected count profile to contain goroutines with counts and labels %v, got %v", + expectedLabels, p) + } + + close(c) + + time.Sleep(10 * time.Millisecond) // let goroutines exit +} + +func containsInOrder(s string, all ...string) bool { + for _, t := range all { + var ok bool + if _, s, ok = strings.Cut(s, t); !ok { + return false + } + } + return true +} + +func containsCountsLabels(prof *profile.Profile, countLabels map[int64]map[string]string) bool { + m := make(map[int64]int) + type nkey struct { + count int64 + key, val string + } + n := make(map[nkey]int) + for c, kv := range countLabels { + m[c]++ + for k, v := range kv { + n[nkey{ + count: c, + key: k, + val: v, + }]++ + + } + } + for _, s := range prof.Sample { + // The count is the single value in the 
sample + if len(s.Value) != 1 { + return false + } + m[s.Value[0]]-- + for k, vs := range s.Label { + for _, v := range vs { + n[nkey{ + count: s.Value[0], + key: k, + val: v, + }]-- + } + } + } + for _, n := range m { + if n > 0 { + return false + } + } + for _, ncnt := range n { + if ncnt != 0 { + return false + } + } + return true +} + +func TestGoroutineProfileConcurrency(t *testing.T) { + goroutineProf := Lookup("goroutine") + + profilerCalls := func(s string) int { + return strings.Count(s, "\truntime/pprof.runtime_goroutineProfileWithLabels+") + } + + includesFinalizer := func(s string) bool { + return strings.Contains(s, "runtime.runfinq") + } + + // Concurrent calls to the goroutine profiler should not trigger data races + // or corruption. + t.Run("overlapping profile requests", func(t *testing.T) { + ctx := context.Background() + ctx, cancel := context.WithTimeout(ctx, 10*time.Second) + defer cancel() + + var wg sync.WaitGroup + for i := 0; i < 2; i++ { + wg.Add(1) + Do(ctx, Labels("i", fmt.Sprint(i)), func(context.Context) { + go func() { + defer wg.Done() + for ctx.Err() == nil { + var w strings.Builder + goroutineProf.WriteTo(&w, 1) + prof := w.String() + count := profilerCalls(prof) + if count >= 2 { + t.Logf("prof %d\n%s", count, prof) + cancel() + } + } + }() + }) + } + wg.Wait() + }) + + // The finalizer goroutine should not show up in most profiles, since it's + // marked as a system goroutine when idle. + t.Run("finalizer not present", func(t *testing.T) { + var w strings.Builder + goroutineProf.WriteTo(&w, 1) + prof := w.String() + if includesFinalizer(prof) { + t.Errorf("profile includes finalizer (but finalizer should be marked as system):\n%s", prof) + } + }) + + // The finalizer goroutine should show up when it's running user code. 
+ t.Run("finalizer present", func(t *testing.T) { + obj := new(byte) + ch1, ch2 := make(chan int), make(chan int) + defer close(ch2) + runtime.SetFinalizer(obj, func(_ interface{}) { + close(ch1) + <-ch2 + }) + obj = nil + for i := 10; i >= 0; i-- { + select { + case <-ch1: + default: + if i == 0 { + t.Fatalf("finalizer did not run") + } + runtime.GC() + } + } + var w strings.Builder + goroutineProf.WriteTo(&w, 1) + prof := w.String() + if !includesFinalizer(prof) { + t.Errorf("profile does not include finalizer (and it should be marked as user):\n%s", prof) + } + }) + + // Check that new goroutines only show up in order. + testLaunches := func(t *testing.T) { + var done sync.WaitGroup + defer done.Wait() + + ctx := context.Background() + ctx, cancel := context.WithCancel(ctx) + defer cancel() + + ch := make(chan int) + defer close(ch) + + var ready sync.WaitGroup + + // These goroutines all survive until the end of the subtest, so we can + // check that a (numbered) goroutine appearing in the profile implies + // that all older goroutines also appear in the profile. + ready.Add(1) + done.Add(1) + go func() { + defer done.Done() + for i := 0; ctx.Err() == nil; i++ { + // Use SetGoroutineLabels rather than Do so we can always expect an + // extra goroutine (this one) with the most recent label. + SetGoroutineLabels(WithLabels(ctx, Labels(t.Name()+"-loop-i", fmt.Sprint(i)))) + done.Add(1) + go func() { + <-ch + done.Done() + }() + for j := 0; j < i; j++ { + // Spin for longer and longer as the test goes on. This + // goroutine will do O(N^2) work with the number of + // goroutines it launches. This should be slow relative to + // the work involved in collecting a goroutine profile, + // which is O(N) with the high-water mark of the number of + // goroutines in this process (in the allgs slice). + runtime.Gosched() + } + if i == 0 { + ready.Done() + } + } + }() + + // Short-lived goroutines exercise different code paths (goroutines with + // status _Gdead, for instance). 
This churn doesn't have behavior that + // we can test directly, but does help to shake out data races. + ready.Add(1) + var churn func(i int) + churn = func(i int) { + SetGoroutineLabels(WithLabels(ctx, Labels(t.Name()+"-churn-i", fmt.Sprint(i)))) + if i == 0 { + ready.Done() + } else if i%16 == 0 { + // Yield on occasion so this sequence of goroutine launches + // doesn't monopolize a P. See issue #52934. + runtime.Gosched() + } + if ctx.Err() == nil { + go churn(i + 1) + } + } + go func() { + churn(0) + }() + + ready.Wait() + + var w [3]bytes.Buffer + for i := range w { + goroutineProf.WriteTo(&w[i], 0) + } + for i := range w { + p, err := profile.Parse(bytes.NewReader(w[i].Bytes())) + if err != nil { + t.Errorf("error parsing protobuf profile: %v", err) + } + + // High-numbered loop-i goroutines imply that every lower-numbered + // loop-i goroutine should be present in the profile too. + counts := make(map[string]int) + for _, s := range p.Sample { + label := s.Label[t.Name()+"-loop-i"] + if len(label) > 0 { + counts[label[0]]++ + } + } + for j, max := 0, len(counts)-1; j <= max; j++ { + n := counts[fmt.Sprint(j)] + if n == 1 || (n == 2 && j == max) { + continue + } + t.Errorf("profile #%d's goroutines with label loop-i:%d; %d != 1 (or 2 for the last entry, %d)", + i+1, j, n, max) + t.Logf("counts %v", counts) + break + } + } + } + + runs := 100 + if testing.Short() { + runs = 5 + } + for i := 0; i < runs; i++ { + // Run multiple times to shake out data races + t.Run("goroutine launches", testLaunches) + } +} + +func BenchmarkGoroutine(b *testing.B) { + withIdle := func(n int, fn func(b *testing.B)) func(b *testing.B) { + return func(b *testing.B) { + c := make(chan int) + var ready, done sync.WaitGroup + defer func() { + close(c) + done.Wait() + }() + + for i := 0; i < n; i++ { + ready.Add(1) + done.Add(1) + go func() { + ready.Done() + <-c + done.Done() + }() + } + // Let goroutines block on channel + ready.Wait() + for i := 0; i < 5; i++ { + 
runtime.Gosched() + } + + fn(b) + } + } + + withChurn := func(fn func(b *testing.B)) func(b *testing.B) { + return func(b *testing.B) { + ctx := context.Background() + ctx, cancel := context.WithCancel(ctx) + defer cancel() + + var ready sync.WaitGroup + ready.Add(1) + var count int64 + var churn func(i int) + churn = func(i int) { + SetGoroutineLabels(WithLabels(ctx, Labels("churn-i", fmt.Sprint(i)))) + atomic.AddInt64(&count, 1) + if i == 0 { + ready.Done() + } + if ctx.Err() == nil { + go churn(i + 1) + } + } + go func() { + churn(0) + }() + ready.Wait() + + fn(b) + b.ReportMetric(float64(atomic.LoadInt64(&count))/float64(b.N), "concurrent_launches/op") + } + } + + benchWriteTo := func(b *testing.B) { + goroutineProf := Lookup("goroutine") + b.ResetTimer() + for i := 0; i < b.N; i++ { + goroutineProf.WriteTo(io.Discard, 0) + } + b.StopTimer() + } + + benchGoroutineProfile := func(b *testing.B) { + p := make([]runtime.StackRecord, 10000) + b.ResetTimer() + for i := 0; i < b.N; i++ { + runtime.GoroutineProfile(p) + } + b.StopTimer() + } + + // Note that some costs of collecting a goroutine profile depend on the + // length of the runtime.allgs slice, which never shrinks. Stay within race + // detector's 8k-goroutine limit + for _, n := range []int{50, 500, 5000} { + b.Run(fmt.Sprintf("Profile.WriteTo idle %d", n), withIdle(n, benchWriteTo)) + b.Run(fmt.Sprintf("Profile.WriteTo churn %d", n), withIdle(n, withChurn(benchWriteTo))) + b.Run(fmt.Sprintf("runtime.GoroutineProfile churn %d", n), withIdle(n, withChurn(benchGoroutineProfile))) + } +} + +var emptyCallStackTestRun int64 + +// Issue 18836. 
+func TestEmptyCallStack(t *testing.T) { + name := fmt.Sprintf("test18836_%d", emptyCallStackTestRun) + emptyCallStackTestRun++ + + t.Parallel() + var buf strings.Builder + p := NewProfile(name) + + p.Add("foo", 47674) + p.WriteTo(&buf, 1) + p.Remove("foo") + got := buf.String() + prefix := name + " profile: total 1\n" + if !strings.HasPrefix(got, prefix) { + t.Fatalf("got:\n\t%q\nwant prefix:\n\t%q\n", got, prefix) + } + lostevent := "lostProfileEvent" + if !strings.Contains(got, lostevent) { + t.Fatalf("got:\n\t%q\ndoes not contain:\n\t%q\n", got, lostevent) + } +} + +// stackContainsLabeled takes a spec like funcname;key=value and matches if the stack has that key +// and value and has funcname somewhere in the stack. +func stackContainsLabeled(spec string, count uintptr, stk []*profile.Location, labels map[string][]string) bool { + base, kv, ok := strings.Cut(spec, ";") + if !ok { + panic("no semicolon in key/value spec") + } + k, v, ok := strings.Cut(kv, "=") + if !ok { + panic("missing = in key/value spec") + } + if !contains(labels[k], v) { + return false + } + return stackContains(base, count, stk, labels) +} + +func TestCPUProfileLabel(t *testing.T) { + matches := matchAndAvoidStacks(stackContainsLabeled, []string{"runtime/pprof.cpuHogger;key=value"}, avoidFunctions()) + testCPUProfile(t, matches, func(dur time.Duration) { + Do(context.Background(), Labels("key", "value"), func(context.Context) { + cpuHogger(cpuHog1, &salt1, dur) + }) + }) +} + +func TestLabelRace(t *testing.T) { + // Test the race detector annotations for synchronization + // between setting labels and consuming them from the + // profile. 
+ matches := matchAndAvoidStacks(stackContainsLabeled, []string{"runtime/pprof.cpuHogger;key=value"}, nil) + testCPUProfile(t, matches, func(dur time.Duration) { + start := time.Now() + var wg sync.WaitGroup + for time.Since(start) < dur { + var salts [10]int + for i := 0; i < 10; i++ { + wg.Add(1) + go func(j int) { + Do(context.Background(), Labels("key", "value"), func(context.Context) { + cpuHogger(cpuHog1, &salts[j], time.Millisecond) + }) + wg.Done() + }(i) + } + wg.Wait() + } + }) +} + +func TestGoroutineProfileLabelRace(t *testing.T) { + // Test the race detector annotations for synchronization + // between setting labels and consuming them from the + // goroutine profile. See issue #50292. + + t.Run("reset", func(t *testing.T) { + ctx := context.Background() + ctx, cancel := context.WithCancel(ctx) + defer cancel() + + go func() { + goroutineProf := Lookup("goroutine") + for ctx.Err() == nil { + var w strings.Builder + goroutineProf.WriteTo(&w, 1) + prof := w.String() + if strings.Contains(prof, "loop-i") { + cancel() + } + } + }() + + for i := 0; ctx.Err() == nil; i++ { + Do(ctx, Labels("loop-i", fmt.Sprint(i)), func(ctx context.Context) { + }) + } + }) + + t.Run("churn", func(t *testing.T) { + ctx := context.Background() + ctx, cancel := context.WithCancel(ctx) + defer cancel() + + var ready sync.WaitGroup + ready.Add(1) + var churn func(i int) + churn = func(i int) { + SetGoroutineLabels(WithLabels(ctx, Labels("churn-i", fmt.Sprint(i)))) + if i == 0 { + ready.Done() + } + if ctx.Err() == nil { + go churn(i + 1) + } + } + go func() { + churn(0) + }() + ready.Wait() + + goroutineProf := Lookup("goroutine") + for i := 0; i < 10; i++ { + goroutineProf.WriteTo(io.Discard, 1) + } + }) +} + +// TestLabelSystemstack makes sure CPU profiler samples of goroutines running +// on systemstack include the correct pprof labels. 
See issue #48577 +func TestLabelSystemstack(t *testing.T) { + // Grab and re-set the initial value before continuing to ensure + // GOGC doesn't actually change following the test. + gogc := debug.SetGCPercent(100) + debug.SetGCPercent(gogc) + + matches := matchAndAvoidStacks(stackContainsLabeled, []string{"runtime.systemstack;key=value"}, avoidFunctions()) + p := testCPUProfile(t, matches, func(dur time.Duration) { + Do(context.Background(), Labels("key", "value"), func(ctx context.Context) { + parallelLabelHog(ctx, dur, gogc) + }) + }) + + // Two conditions to check: + // * labelHog should always be labeled. + // * The label should _only_ appear on labelHog and the Do call above. + for _, s := range p.Sample { + isLabeled := s.Label != nil && contains(s.Label["key"], "value") + var ( + mayBeLabeled bool + mustBeLabeled string + mustNotBeLabeled string + ) + for _, loc := range s.Location { + for _, l := range loc.Line { + switch l.Function.Name { + case "runtime/pprof.labelHog", "runtime/pprof.parallelLabelHog", "runtime/pprof.parallelLabelHog.func1": + mustBeLabeled = l.Function.Name + case "runtime/pprof.Do": + // Do sets the labels, so samples may + // or may not be labeled depending on + // which part of the function they are + // at. + mayBeLabeled = true + case "runtime.bgsweep", "runtime.bgscavenge", "runtime.forcegchelper", "runtime.gcBgMarkWorker", "runtime.runfinq", "runtime.sysmon": + // Runtime system goroutines or threads + // (such as those identified by + // runtime.isSystemGoroutine). These + // should never be labeled. + mustNotBeLabeled = l.Function.Name + case "gogo", "gosave_systemstack_switch", "racecall": + // These are context switch/race + // critical that we can't do a full + // traceback from. Typically this would + // be covered by the runtime check + // below, but these symbols don't have + // the package name. 
+ mayBeLabeled = true + } + + if strings.HasPrefix(l.Function.Name, "runtime.") { + // There are many places in the runtime + // where we can't do a full traceback. + // Ideally we'd list them all, but + // barring that allow anything in the + // runtime, unless explicitly excluded + // above. + mayBeLabeled = true + } + } + } + errorStack := func(f string, args ...any) { + var buf strings.Builder + fprintStack(&buf, s.Location) + t.Errorf("%s: %s", fmt.Sprintf(f, args...), buf.String()) + } + if mustBeLabeled != "" && mustNotBeLabeled != "" { + errorStack("sample contains both %s, which must be labeled, and %s, which must not be labeled", mustBeLabeled, mustNotBeLabeled) + continue + } + if mustBeLabeled != "" || mustNotBeLabeled != "" { + // We found a definitive frame, so mayBeLabeled hints are not relevant. + mayBeLabeled = false + } + if mayBeLabeled { + // This sample may or may not be labeled, so there's nothing we can check. + continue + } + if mustBeLabeled != "" && !isLabeled { + errorStack("sample must be labeled because of %s, but is not", mustBeLabeled) + } + if mustNotBeLabeled != "" && isLabeled { + errorStack("sample must not be labeled because of %s, but is", mustNotBeLabeled) + } + } +} + +// labelHog is designed to burn CPU time in a way that a high number of CPU +// samples end up running on systemstack. +func labelHog(stop chan struct{}, gogc int) { + // Regression test for issue 50032. We must give GC an opportunity to + // be initially triggered by a labelled goroutine. + runtime.GC() + + for i := 0; ; i++ { + select { + case <-stop: + return + default: + debug.SetGCPercent(gogc) + } + } +} + +// parallelLabelHog runs GOMAXPROCS goroutines running labelHog. 
+func parallelLabelHog(ctx context.Context, dur time.Duration, gogc int) { + var wg sync.WaitGroup + stop := make(chan struct{}) + for i := 0; i < runtime.GOMAXPROCS(0); i++ { + wg.Add(1) + go func() { + defer wg.Done() + labelHog(stop, gogc) + }() + } + + time.Sleep(dur) + close(stop) + wg.Wait() +} + +// Check that there is no deadlock when the program receives SIGPROF while in +// 64bit atomics' critical section. Used to happen on mips{,le}. See #20146. +func TestAtomicLoadStore64(t *testing.T) { + f, err := os.CreateTemp("", "profatomic") + if err != nil { + t.Fatalf("TempFile: %v", err) + } + defer os.Remove(f.Name()) + defer f.Close() + + if err := StartCPUProfile(f); err != nil { + t.Fatal(err) + } + defer StopCPUProfile() + + var flag uint64 + done := make(chan bool, 1) + + go func() { + for atomic.LoadUint64(&flag) == 0 { + runtime.Gosched() + } + done <- true + }() + time.Sleep(50 * time.Millisecond) + atomic.StoreUint64(&flag, 1) + <-done +} + +func TestTracebackAll(t *testing.T) { + // With gccgo, if a profiling signal arrives at the wrong time + // during traceback, it may crash or hang. See issue #29448. + f, err := os.CreateTemp("", "proftraceback") + if err != nil { + t.Fatalf("TempFile: %v", err) + } + defer os.Remove(f.Name()) + defer f.Close() + + if err := StartCPUProfile(f); err != nil { + t.Fatal(err) + } + defer StopCPUProfile() + + ch := make(chan int) + defer close(ch) + + count := 10 + for i := 0; i < count; i++ { + go func() { + <-ch // block + }() + } + + N := 10000 + if testing.Short() { + N = 500 + } + buf := make([]byte, 10*1024) + for i := 0; i < N; i++ { + runtime.Stack(buf, true) + } +} + +// TestTryAdd tests the cases that are hard to test with real program execution. +// +// For example, the current go compilers may not always inline functions +// involved in recursion but that may not be true in the future compilers. 
This +// tests such cases by using fake call sequences and forcing the profile build +// utilizing translateCPUProfile defined in proto_test.go +func TestTryAdd(t *testing.T) { + if _, found := findInlinedCall(inlinedCallerDump, 4<<10); !found { + t.Skip("Can't determine whether anything was inlined into inlinedCallerDump.") + } + + // inlinedCallerDump + // inlinedCalleeDump + pcs := make([]uintptr, 2) + inlinedCallerDump(pcs) + inlinedCallerStack := make([]uint64, 2) + for i := range pcs { + inlinedCallerStack[i] = uint64(pcs[i]) + } + + if _, found := findInlinedCall(recursionChainBottom, 4<<10); !found { + t.Skip("Can't determine whether anything was inlined into recursionChainBottom.") + } + + // recursionChainTop + // recursionChainMiddle + // recursionChainBottom + // recursionChainTop + // recursionChainMiddle + // recursionChainBottom + pcs = make([]uintptr, 6) + recursionChainTop(1, pcs) + recursionStack := make([]uint64, len(pcs)) + for i := range pcs { + recursionStack[i] = uint64(pcs[i]) + } + + period := int64(2000 * 1000) // 1/500*1e9 nanosec. + + testCases := []struct { + name string + input []uint64 // following the input format assumed by profileBuilder.addCPUData. + count int // number of records in input. + wantLocs [][]string // ordered location entries with function names. + wantSamples []*profile.Sample // ordered samples, we care only about Value and the profile location IDs. + }{{ + // Sanity test for a normal, complete stack trace. + name: "full_stack_trace", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 5, 0, 50, inlinedCallerStack[0], inlinedCallerStack[1], + }, + count: 2, + wantLocs: [][]string{ + {"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}, + }, + wantSamples: []*profile.Sample{ + {Value: []int64{50, 50 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + name: "bug35538", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. 
+ // Fake frame: tryAdd will have inlinedCallerDump + // (stack[1]) on the deck when it encounters the next + // inline function. It should accept this. + 7, 0, 10, inlinedCallerStack[0], inlinedCallerStack[1], inlinedCallerStack[0], inlinedCallerStack[1], + 5, 0, 20, inlinedCallerStack[0], inlinedCallerStack[1], + }, + count: 3, + wantLocs: [][]string{{"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{10, 10 * period}, Location: []*profile.Location{{ID: 1}, {ID: 1}}}, + {Value: []int64{20, 20 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + name: "bug38096", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + // count (data[2]) == 0 && len(stk) == 1 is an overflow + // entry. The "stk" entry is actually the count. + 4, 0, 0, 4242, + }, + count: 2, + wantLocs: [][]string{{"runtime/pprof.lostProfileEvent"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{4242, 4242 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + // If a function is directly called recursively then it must + // not be inlined in the caller. + // + // N.B. We're generating an impossible profile here, with a + // recursive inlineCalleeDump call. This is simulating a non-Go + // function that looks like an inlined Go function other than + // its recursive property. See pcDeck.tryAdd. + name: "directly_recursive_func_is_not_inlined", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 5, 0, 30, inlinedCallerStack[0], inlinedCallerStack[0], + 4, 0, 40, inlinedCallerStack[0], + }, + count: 3, + // inlinedCallerDump shows up here because + // runtime_expandFinalInlineFrame adds it to the stack frame. 
+ wantLocs: [][]string{{"runtime/pprof.inlinedCalleeDump"}, {"runtime/pprof.inlinedCallerDump"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{30, 30 * period}, Location: []*profile.Location{{ID: 1}, {ID: 1}, {ID: 2}}}, + {Value: []int64{40, 40 * period}, Location: []*profile.Location{{ID: 1}, {ID: 2}}}, + }, + }, { + name: "recursion_chain_inline", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 9, 0, 10, recursionStack[0], recursionStack[1], recursionStack[2], recursionStack[3], recursionStack[4], recursionStack[5], + }, + count: 2, + wantLocs: [][]string{ + {"runtime/pprof.recursionChainBottom"}, + { + "runtime/pprof.recursionChainMiddle", + "runtime/pprof.recursionChainTop", + "runtime/pprof.recursionChainBottom", + }, + { + "runtime/pprof.recursionChainMiddle", + "runtime/pprof.recursionChainTop", + "runtime/pprof.TestTryAdd", // inlined into the test. + }, + }, + wantSamples: []*profile.Sample{ + {Value: []int64{10, 10 * period}, Location: []*profile.Location{{ID: 1}, {ID: 2}, {ID: 3}}}, + }, + }, { + name: "truncated_stack_trace_later", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 5, 0, 50, inlinedCallerStack[0], inlinedCallerStack[1], + 4, 0, 60, inlinedCallerStack[0], + }, + count: 3, + wantLocs: [][]string{{"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{50, 50 * period}, Location: []*profile.Location{{ID: 1}}}, + {Value: []int64{60, 60 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + name: "truncated_stack_trace_first", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. 
+ 4, 0, 70, inlinedCallerStack[0], + 5, 0, 80, inlinedCallerStack[0], inlinedCallerStack[1], + }, + count: 3, + wantLocs: [][]string{{"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{70, 70 * period}, Location: []*profile.Location{{ID: 1}}}, + {Value: []int64{80, 80 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + // We can recover the inlined caller from a truncated stack. + name: "truncated_stack_trace_only", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 4, 0, 70, inlinedCallerStack[0], + }, + count: 2, + wantLocs: [][]string{{"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}}, + wantSamples: []*profile.Sample{ + {Value: []int64{70, 70 * period}, Location: []*profile.Location{{ID: 1}}}, + }, + }, { + // The same location is used for duplicated stacks. + name: "truncated_stack_trace_twice", + input: []uint64{ + 3, 0, 500, // hz = 500. Must match the period. + 4, 0, 70, inlinedCallerStack[0], + // Fake frame: add a fake call to + // inlinedCallerDump to prevent this sample + // from getting merged into above. + 5, 0, 80, inlinedCallerStack[1], inlinedCallerStack[0], + }, + count: 3, + wantLocs: [][]string{ + {"runtime/pprof.inlinedCalleeDump", "runtime/pprof.inlinedCallerDump"}, + {"runtime/pprof.inlinedCallerDump"}, + }, + wantSamples: []*profile.Sample{ + {Value: []int64{70, 70 * period}, Location: []*profile.Location{{ID: 1}}}, + {Value: []int64{80, 80 * period}, Location: []*profile.Location{{ID: 2}, {ID: 1}}}, + }, + }} + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + p, err := translateCPUProfile(tc.input, tc.count) + if err != nil { + t.Fatalf("translating profile: %v", err) + } + t.Logf("Profile: %v\n", p) + + // One location entry with all inlined functions. 
+ var gotLoc [][]string + for _, loc := range p.Location { + var names []string + for _, line := range loc.Line { + names = append(names, line.Function.Name) + } + gotLoc = append(gotLoc, names) + } + if got, want := fmtJSON(gotLoc), fmtJSON(tc.wantLocs); got != want { + t.Errorf("Got Location = %+v\n\twant %+v", got, want) + } + // All samples should point to one location. + var gotSamples []*profile.Sample + for _, sample := range p.Sample { + var locs []*profile.Location + for _, loc := range sample.Location { + locs = append(locs, &profile.Location{ID: loc.ID}) + } + gotSamples = append(gotSamples, &profile.Sample{Value: sample.Value, Location: locs}) + } + if got, want := fmtJSON(gotSamples), fmtJSON(tc.wantSamples); got != want { + t.Errorf("Got Samples = %+v\n\twant %+v", got, want) + } + }) + } +} + +func TestTimeVDSO(t *testing.T) { + // Test that time functions have the right stack trace. In particular, + // it shouldn't be recursive. + + if runtime.GOOS == "android" { + // Flaky on Android, issue 48655. VDSO may not be enabled. + testenv.SkipFlaky(t, 48655) + } + + matches := matchAndAvoidStacks(stackContains, []string{"time.now"}, avoidFunctions()) + p := testCPUProfile(t, matches, func(dur time.Duration) { + t0 := time.Now() + for { + t := time.Now() + if t.Sub(t0) >= dur { + return + } + } + }) + + // Check for recursive time.now sample. + for _, sample := range p.Sample { + var seenNow bool + for _, loc := range sample.Location { + for _, line := range loc.Line { + if line.Function.Name == "time.now" { + if seenNow { + t.Fatalf("unexpected recursive time.now") + } + seenNow = true + } + } + } + } +} diff --git a/src/runtime/pprof/pprof_windows.go b/src/runtime/pprof/pprof_windows.go new file mode 100644 index 0000000..23ef2f8 --- /dev/null +++ b/src/runtime/pprof/pprof_windows.go @@ -0,0 +1,22 @@ +// Copyright 2022 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "fmt" + "internal/syscall/windows" + "io" + "syscall" + "unsafe" +) + +func addMaxRSS(w io.Writer) { + var m windows.PROCESS_MEMORY_COUNTERS + p, _ := syscall.GetCurrentProcess() + err := windows.GetProcessMemoryInfo(p, &m, uint32(unsafe.Sizeof(m))) + if err == nil { + fmt.Fprintf(w, "# MaxRSS = %d\n", m.PeakWorkingSetSize) + } +} diff --git a/src/runtime/pprof/proto.go b/src/runtime/pprof/proto.go new file mode 100644 index 0000000..b68f30d --- /dev/null +++ b/src/runtime/pprof/proto.go @@ -0,0 +1,761 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "bytes" + "compress/gzip" + "fmt" + "internal/abi" + "io" + "runtime" + "strconv" + "strings" + "time" + "unsafe" +) + +// lostProfileEvent is the function to which lost profiling +// events are attributed. +// (The name shows up in the pprof graphs.) +func lostProfileEvent() { lostProfileEvent() } + +// A profileBuilder writes a profile incrementally from a +// stream of profile samples delivered by the runtime. +type profileBuilder struct { + start time.Time + end time.Time + havePeriod bool + period int64 + m profMap + + // encoding state + w io.Writer + zw *gzip.Writer + pb protobuf + strings []string + stringMap map[string]int + locs map[uintptr]locInfo // list of locInfo starting with the given PC. + funcs map[string]int // Package path-qualified function name to Function.ID + mem []memMap + deck pcDeck +} + +type memMap struct { + // initialized as reading mapping + start uintptr // Address at which the binary (or DLL) is loaded into memory. + end uintptr // The limit of the address range occupied by this mapping. + offset uint64 // Offset in the binary that corresponds to the first mapped address. 
+ file string // The object this entry is loaded from. + buildID string // A string that uniquely identifies a particular program version with high probability. + + funcs symbolizeFlag + fake bool // map entry was faked; /proc/self/maps wasn't available +} + +// symbolizeFlag keeps track of symbolization result. +// +// 0 : no symbol lookup was performed +// 1<<0 (lookupTried) : symbol lookup was performed +// 1<<1 (lookupFailed): symbol lookup was performed but failed +type symbolizeFlag uint8 + +const ( + lookupTried symbolizeFlag = 1 << iota + lookupFailed symbolizeFlag = 1 << iota +) + +const ( + // message Profile + tagProfile_SampleType = 1 // repeated ValueType + tagProfile_Sample = 2 // repeated Sample + tagProfile_Mapping = 3 // repeated Mapping + tagProfile_Location = 4 // repeated Location + tagProfile_Function = 5 // repeated Function + tagProfile_StringTable = 6 // repeated string + tagProfile_DropFrames = 7 // int64 (string table index) + tagProfile_KeepFrames = 8 // int64 (string table index) + tagProfile_TimeNanos = 9 // int64 + tagProfile_DurationNanos = 10 // int64 + tagProfile_PeriodType = 11 // ValueType (really optional string???) 
+ tagProfile_Period = 12 // int64 + tagProfile_Comment = 13 // repeated int64 + tagProfile_DefaultSampleType = 14 // int64 + + // message ValueType + tagValueType_Type = 1 // int64 (string table index) + tagValueType_Unit = 2 // int64 (string table index) + + // message Sample + tagSample_Location = 1 // repeated uint64 + tagSample_Value = 2 // repeated int64 + tagSample_Label = 3 // repeated Label + + // message Label + tagLabel_Key = 1 // int64 (string table index) + tagLabel_Str = 2 // int64 (string table index) + tagLabel_Num = 3 // int64 + + // message Mapping + tagMapping_ID = 1 // uint64 + tagMapping_Start = 2 // uint64 + tagMapping_Limit = 3 // uint64 + tagMapping_Offset = 4 // uint64 + tagMapping_Filename = 5 // int64 (string table index) + tagMapping_BuildID = 6 // int64 (string table index) + tagMapping_HasFunctions = 7 // bool + tagMapping_HasFilenames = 8 // bool + tagMapping_HasLineNumbers = 9 // bool + tagMapping_HasInlineFrames = 10 // bool + + // message Location + tagLocation_ID = 1 // uint64 + tagLocation_MappingID = 2 // uint64 + tagLocation_Address = 3 // uint64 + tagLocation_Line = 4 // repeated Line + + // message Line + tagLine_FunctionID = 1 // uint64 + tagLine_Line = 2 // int64 + + // message Function + tagFunction_ID = 1 // uint64 + tagFunction_Name = 2 // int64 (string table index) + tagFunction_SystemName = 3 // int64 (string table index) + tagFunction_Filename = 4 // int64 (string table index) + tagFunction_StartLine = 5 // int64 +) + +// stringIndex adds s to the string table if not already present +// and returns the index of s in the string table. 
+func (b *profileBuilder) stringIndex(s string) int64 { + id, ok := b.stringMap[s] + if !ok { + id = len(b.strings) + b.strings = append(b.strings, s) + b.stringMap[s] = id + } + return int64(id) +} + +func (b *profileBuilder) flush() { + const dataFlush = 4096 + if b.pb.nest == 0 && len(b.pb.data) > dataFlush { + b.zw.Write(b.pb.data) + b.pb.data = b.pb.data[:0] + } +} + +// pbValueType encodes a ValueType message to b.pb. +func (b *profileBuilder) pbValueType(tag int, typ, unit string) { + start := b.pb.startMessage() + b.pb.int64(tagValueType_Type, b.stringIndex(typ)) + b.pb.int64(tagValueType_Unit, b.stringIndex(unit)) + b.pb.endMessage(tag, start) +} + +// pbSample encodes a Sample message to b.pb. +func (b *profileBuilder) pbSample(values []int64, locs []uint64, labels func()) { + start := b.pb.startMessage() + b.pb.int64s(tagSample_Value, values) + b.pb.uint64s(tagSample_Location, locs) + if labels != nil { + labels() + } + b.pb.endMessage(tagProfile_Sample, start) + b.flush() +} + +// pbLabel encodes a Label message to b.pb. +func (b *profileBuilder) pbLabel(tag int, key, str string, num int64) { + start := b.pb.startMessage() + b.pb.int64Opt(tagLabel_Key, b.stringIndex(key)) + b.pb.int64Opt(tagLabel_Str, b.stringIndex(str)) + b.pb.int64Opt(tagLabel_Num, num) + b.pb.endMessage(tag, start) +} + +// pbLine encodes a Line message to b.pb. +func (b *profileBuilder) pbLine(tag int, funcID uint64, line int64) { + start := b.pb.startMessage() + b.pb.uint64Opt(tagLine_FunctionID, funcID) + b.pb.int64Opt(tagLine_Line, line) + b.pb.endMessage(tag, start) +} + +// pbMapping encodes a Mapping message to b.pb. 
+func (b *profileBuilder) pbMapping(tag int, id, base, limit, offset uint64, file, buildID string, hasFuncs bool) { + start := b.pb.startMessage() + b.pb.uint64Opt(tagMapping_ID, id) + b.pb.uint64Opt(tagMapping_Start, base) + b.pb.uint64Opt(tagMapping_Limit, limit) + b.pb.uint64Opt(tagMapping_Offset, offset) + b.pb.int64Opt(tagMapping_Filename, b.stringIndex(file)) + b.pb.int64Opt(tagMapping_BuildID, b.stringIndex(buildID)) + // TODO: we set HasFunctions if all symbols from samples were symbolized (hasFuncs). + // Decide what to do about HasInlineFrames and HasLineNumbers. + // Also, another approach to handle the mapping entry with + // incomplete symbolization results is to duplicate the mapping + // entry (but with different Has* fields values) and use + // different entries for symbolized locations and unsymbolized locations. + if hasFuncs { + b.pb.bool(tagMapping_HasFunctions, true) + } + b.pb.endMessage(tag, start) +} + +func allFrames(addr uintptr) ([]runtime.Frame, symbolizeFlag) { + // Expand this one address using CallersFrames so we can cache + // each expansion. In general, CallersFrames takes a whole + // stack, but in this case we know there will be no skips in + // the stack and we have return PCs anyway. + frames := runtime.CallersFrames([]uintptr{addr}) + frame, more := frames.Next() + if frame.Function == "runtime.goexit" { + // Short-circuit if we see runtime.goexit so the loop + // below doesn't allocate a useless empty location. + return nil, 0 + } + + symbolizeResult := lookupTried + if frame.PC == 0 || frame.Function == "" || frame.File == "" || frame.Line == 0 { + symbolizeResult |= lookupFailed + } + + if frame.PC == 0 { + // If we failed to resolve the frame, at least make up + // a reasonable call PC. This mostly happens in tests.
+ frame.PC = addr - 1 + } + ret := []runtime.Frame{frame} + for frame.Function != "runtime.goexit" && more { + frame, more = frames.Next() + ret = append(ret, frame) + } + return ret, symbolizeResult +} + +type locInfo struct { + // location id assigned by the profileBuilder + id uint64 + + // sequence of PCs, including the fake PCs returned by the traceback + // to represent inlined functions + // https://github.com/golang/go/blob/d6f2f833c93a41ec1c68e49804b8387a06b131c5/src/runtime/traceback.go#L347-L368 + pcs []uintptr + + // firstPCFrames and firstPCSymbolizeResult hold the results of the + // allFrames call for the first (leaf-most) PC this locInfo represents + firstPCFrames []runtime.Frame + firstPCSymbolizeResult symbolizeFlag +} + +// newProfileBuilder returns a new profileBuilder. +// CPU profiling data obtained from the runtime can be added +// by calling b.addCPUData, and then the eventual profile +// can be obtained by calling b.finish. +func newProfileBuilder(w io.Writer) *profileBuilder { + zw, _ := gzip.NewWriterLevel(w, gzip.BestSpeed) + b := &profileBuilder{ + w: w, + zw: zw, + start: time.Now(), + strings: []string{""}, + stringMap: map[string]int{"": 0}, + locs: map[uintptr]locInfo{}, + funcs: map[string]int{}, + } + b.readMapping() + return b +} + +// addCPUData adds the CPU profiling data to the profile. +// +// The data must be a whole number of records, as delivered by the runtime. +// len(tags) must be equal to the number of records in data. +func (b *profileBuilder) addCPUData(data []uint64, tags []unsafe.Pointer) error { + if !b.havePeriod { + // first record is period + if len(data) < 3 { + return fmt.Errorf("truncated profile") + } + if data[0] != 3 || data[2] == 0 { + return fmt.Errorf("malformed profile") + } + // data[2] is sampling rate in Hz. Convert to sampling + // period in nanoseconds. + b.period = 1e9 / int64(data[2]) + b.havePeriod = true + data = data[3:] + // Consume tag slot. 
Note that there isn't a meaningful tag + // value for this record. + tags = tags[1:] + } + + // Parse CPU samples from the profile. + // Each sample is 3+n uint64s: + // data[0] = 3+n + // data[1] = time stamp (ignored) + // data[2] = count + // data[3:3+n] = stack + // If the count is 0 and the stack has length 1, + // that's an overflow record inserted by the runtime + // to indicate that stack[0] samples were lost. + // Otherwise the count is usually 1, + // but in a few special cases like lost non-Go samples + // there can be larger counts. + // Because many samples with the same stack arrive, + // we want to deduplicate immediately, which we do + // using the b.m profMap. + for len(data) > 0 { + if len(data) < 3 || data[0] > uint64(len(data)) { + return fmt.Errorf("truncated profile") + } + if data[0] < 3 || tags != nil && len(tags) < 1 { + return fmt.Errorf("malformed profile") + } + if len(tags) < 1 { + return fmt.Errorf("mismatched profile records and tags") + } + count := data[2] + stk := data[3:data[0]] + data = data[data[0]:] + tag := tags[0] + tags = tags[1:] + + if count == 0 && len(stk) == 1 { + // overflow record + count = uint64(stk[0]) + stk = []uint64{ + // gentraceback guarantees that PCs in the + // stack can be unconditionally decremented and + // still be valid, so we must do the same. + uint64(abi.FuncPCABIInternal(lostProfileEvent) + 1), + } + } + b.m.lookup(stk, tag).count += int64(count) + } + + if len(tags) != 0 { + return fmt.Errorf("mismatched profile records and tags") + } + return nil +} + +// build completes the constructed profile and writes it to the underlying writer.
+func (b *profileBuilder) build() { + b.end = time.Now() + + b.pb.int64Opt(tagProfile_TimeNanos, b.start.UnixNano()) + if b.havePeriod { // must be CPU profile + b.pbValueType(tagProfile_SampleType, "samples", "count") + b.pbValueType(tagProfile_SampleType, "cpu", "nanoseconds") + b.pb.int64Opt(tagProfile_DurationNanos, b.end.Sub(b.start).Nanoseconds()) + b.pbValueType(tagProfile_PeriodType, "cpu", "nanoseconds") + b.pb.int64Opt(tagProfile_Period, b.period) + } + + values := []int64{0, 0} + var locs []uint64 + + for e := b.m.all; e != nil; e = e.nextAll { + values[0] = e.count + values[1] = e.count * b.period + + var labels func() + if e.tag != nil { + labels = func() { + for k, v := range *(*labelMap)(e.tag) { + b.pbLabel(tagSample_Label, k, v, 0) + } + } + } + + locs = b.appendLocsForStack(locs[:0], e.stk) + + b.pbSample(values, locs, labels) + } + + for i, m := range b.mem { + hasFunctions := m.funcs == lookupTried // lookupTried but not lookupFailed + b.pbMapping(tagProfile_Mapping, uint64(i+1), uint64(m.start), uint64(m.end), m.offset, m.file, m.buildID, hasFunctions) + } + + // TODO: Anything for tagProfile_DropFrames? + // TODO: Anything for tagProfile_KeepFrames? + + b.pb.strings(tagProfile_StringTable, b.strings) + b.zw.Write(b.pb.data) + b.zw.Close() +} + +// appendLocsForStack appends the location IDs for the given stack trace to the given +// location ID slice, locs. The addresses in the stack are return PCs or 1 + the PC of +// an inline marker as the runtime traceback function returns. +// +// It may return an empty slice even if locs is non-empty, for example if locs consists +// solely of runtime.goexit. We still count these empty stacks in profiles in order to +// get the right cumulative sample count. +// +// It may emit to b.pb, so there must be no message encoding in progress. +func (b *profileBuilder) appendLocsForStack(locs []uint64, stk []uintptr) (newLocs []uint64) { + b.deck.reset() + + // The last frame might be truncated. 
Recover lost inline frames. + stk = runtime_expandFinalInlineFrame(stk) + + for len(stk) > 0 { + addr := stk[0] + if l, ok := b.locs[addr]; ok { + // When generating code for an inlined function, the compiler adds + // NOP instructions to the outermost function as a placeholder for + // each layer of inlining. When the runtime generates tracebacks for + // stacks that include inlined functions, it uses the addresses of + // those NOPs as "fake" PCs on the stack as if they were regular + // function call sites. But if a profiling signal arrives while the + // CPU is executing one of those NOPs, its PC will show up as a leaf + // in the profile with its own Location entry. So, always check + // whether addr is a "fake" PC in the context of the current call + // stack by trying to add it to the inlining deck before assuming + // that the deck is complete. + if len(b.deck.pcs) > 0 { + if added := b.deck.tryAdd(addr, l.firstPCFrames, l.firstPCSymbolizeResult); added { + stk = stk[1:] + continue + } + } + + // first record the location if there is any pending accumulated info. + if id := b.emitLocation(); id > 0 { + locs = append(locs, id) + } + + // then, record the cached location. + locs = append(locs, l.id) + + // Skip the matching pcs. + // + // Even if stk was truncated due to the stack depth + // limit, expandFinalInlineFrame above has already + // fixed the truncation, ensuring it is long enough. + stk = stk[len(l.pcs):] + continue + } + + frames, symbolizeResult := allFrames(addr) + if len(frames) == 0 { // runtime.goexit. + if id := b.emitLocation(); id > 0 { + locs = append(locs, id) + } + stk = stk[1:] + continue + } + + if added := b.deck.tryAdd(addr, frames, symbolizeResult); added { + stk = stk[1:] + continue + } + // add failed because this addr is not inlined with the + // existing PCs in the deck. Flush the deck and retry handling + // this pc. 
+ if id := b.emitLocation(); id > 0 { + locs = append(locs, id) + } + + // check cache again - previous emitLocation added a new entry + if l, ok := b.locs[addr]; ok { + locs = append(locs, l.id) + stk = stk[len(l.pcs):] // skip the matching pcs. + } else { + b.deck.tryAdd(addr, frames, symbolizeResult) // must succeed. + stk = stk[1:] + } + } + if id := b.emitLocation(); id > 0 { // emit remaining location. + locs = append(locs, id) + } + return locs +} + +// Here's an example of how Go 1.17 writes out inlined functions, compiled for +// linux/amd64. The disassembly of main.main shows two levels of inlining: main +// calls b, b calls a, a does some work. +// +// inline.go:9 0x4553ec 90 NOPL // func main() { b(v) } +// inline.go:6 0x4553ed 90 NOPL // func b(v *int) { a(v) } +// inline.go:5 0x4553ee 48c7002a000000 MOVQ $0x2a, 0(AX) // func a(v *int) { *v = 42 } +// +// If a profiling signal arrives while executing the MOVQ at 0x4553ee (for line +// 5), the runtime will report the stack as the MOVQ frame being called by the +// NOPL at 0x4553ed (for line 6) being called by the NOPL at 0x4553ec (for line +// 9). +// +// The role of pcDeck is to collapse those three frames back into a single +// location at 0x4553ee, with file/line/function symbolization info representing +// the three layers of calls. It does that via sequential calls to pcDeck.tryAdd +// starting with the leaf-most address. The fourth call to pcDeck.tryAdd will be +// for the caller of main.main. Because main.main was not inlined in its caller, +// the deck will reject the addition, and the fourth PC on the stack will get +// its own location. + +// pcDeck is a helper to detect a sequence of inlined functions from +// a stack trace returned by the runtime. +// +// The stack traces returned by runtime's traceback functions are fully +// expanded (at least for Go functions) and include the fake pcs representing +// inlined functions.
The profile proto expects the inlined functions to be +// encoded in one Location message. +// https://github.com/google/pprof/blob/5e965273ee43930341d897407202dd5e10e952cb/proto/profile.proto#L177-L184 +// +// Runtime does not directly expose whether a frame is for an inlined function +// and looking up debug info is not ideal, so we use a heuristic to filter +// the fake pcs and restore the inlined and entry functions. Inlined functions +// have the following properties: +// +// Frame's Func is nil (note: also true for non-Go functions), and +// Frame's Entry matches its entry function frame's Entry (note: could also be true for recursive calls and non-Go functions), and +// Frame's Name does not match its entry function frame's name (note: inlined functions cannot be directly recursive). +// +// While reading and processing the pcs in a stack trace one by one (from leaf to root), +// we use pcDeck to temporarily hold the observed pcs and their expanded frames +// until we observe the entry function frame. +type pcDeck struct { + pcs []uintptr + frames []runtime.Frame + symbolizeResult symbolizeFlag + + // firstPCFrames indicates the number of frames associated with the first + // (leaf-most) PC in the deck + firstPCFrames int + // firstPCSymbolizeResult holds the results of the allFrames call for the + // first (leaf-most) PC in the deck + firstPCSymbolizeResult symbolizeFlag +} + +func (d *pcDeck) reset() { + d.pcs = d.pcs[:0] + d.frames = d.frames[:0] + d.symbolizeResult = 0 + d.firstPCFrames = 0 + d.firstPCSymbolizeResult = 0 +} + +// tryAdd tries to add the pc and Frames expanded from it (most likely one, +// since the stack trace is already fully expanded) and the symbolizeResult +// to the deck. If it fails the caller needs to flush the deck and retry.
+func (d *pcDeck) tryAdd(pc uintptr, frames []runtime.Frame, symbolizeResult symbolizeFlag) (success bool) { + if existing := len(d.frames); existing > 0 { + // 'd.frames' are all expanded from one 'pc' and represent all + // inlined functions so we check only the last one. + newFrame := frames[0] + last := d.frames[existing-1] + if last.Func != nil { // the last frame can't be inlined. Flush. + return false + } + if last.Entry == 0 || newFrame.Entry == 0 { // Possibly not a Go function. Don't try to merge. + return false + } + + if last.Entry != newFrame.Entry { // newFrame is for a different function. + return false + } + if last.Function == newFrame.Function { // maybe recursion. + return false + } + } + d.pcs = append(d.pcs, pc) + d.frames = append(d.frames, frames...) + d.symbolizeResult |= symbolizeResult + if len(d.pcs) == 1 { + d.firstPCFrames = len(d.frames) + d.firstPCSymbolizeResult = symbolizeResult + } + return true +} + +// emitLocation emits the new location and function information recorded in the deck +// and returns the location ID encoded in the profile protobuf. +// It emits to b.pb, so there must be no message encoding in progress. +// It resets the deck. +func (b *profileBuilder) emitLocation() uint64 { + if len(b.deck.pcs) == 0 { + return 0 + } + defer b.deck.reset() + + addr := b.deck.pcs[0] + firstFrame := b.deck.frames[0] + + // We can't write out functions while in the middle of the + // Location message, so record new functions we encounter and + // write them out after the Location. 
+ type newFunc struct { + id uint64 + name, file string + startLine int64 + } + newFuncs := make([]newFunc, 0, 8) + + id := uint64(len(b.locs)) + 1 + b.locs[addr] = locInfo{ + id: id, + pcs: append([]uintptr{}, b.deck.pcs...), + firstPCSymbolizeResult: b.deck.firstPCSymbolizeResult, + firstPCFrames: append([]runtime.Frame{}, b.deck.frames[:b.deck.firstPCFrames]...), + } + + start := b.pb.startMessage() + b.pb.uint64Opt(tagLocation_ID, id) + b.pb.uint64Opt(tagLocation_Address, uint64(firstFrame.PC)) + for _, frame := range b.deck.frames { + // Write out each line in frame expansion. + funcID := uint64(b.funcs[frame.Function]) + if funcID == 0 { + funcID = uint64(len(b.funcs)) + 1 + b.funcs[frame.Function] = int(funcID) + newFuncs = append(newFuncs, newFunc{ + id: funcID, + name: frame.Function, + file: frame.File, + startLine: int64(runtime_FrameStartLine(&frame)), + }) + } + b.pbLine(tagLocation_Line, funcID, int64(frame.Line)) + } + for i := range b.mem { + if b.mem[i].start <= addr && addr < b.mem[i].end || b.mem[i].fake { + b.pb.uint64Opt(tagLocation_MappingID, uint64(i+1)) + + m := b.mem[i] + m.funcs |= b.deck.symbolizeResult + b.mem[i] = m + break + } + } + b.pb.endMessage(tagProfile_Location, start) + + // Write out functions we found during frame expansion. 
+ for _, fn := range newFuncs { + start := b.pb.startMessage() + b.pb.uint64Opt(tagFunction_ID, fn.id) + b.pb.int64Opt(tagFunction_Name, b.stringIndex(fn.name)) + b.pb.int64Opt(tagFunction_SystemName, b.stringIndex(fn.name)) + b.pb.int64Opt(tagFunction_Filename, b.stringIndex(fn.file)) + b.pb.int64Opt(tagFunction_StartLine, fn.startLine) + b.pb.endMessage(tagProfile_Function, start) + } + + b.flush() + return id +} + +var space = []byte(" ") +var newline = []byte("\n") + +func parseProcSelfMaps(data []byte, addMapping func(lo, hi, offset uint64, file, buildID string)) { + // $ cat /proc/self/maps + // 00400000-0040b000 r-xp 00000000 fc:01 787766 /bin/cat + // 0060a000-0060b000 r--p 0000a000 fc:01 787766 /bin/cat + // 0060b000-0060c000 rw-p 0000b000 fc:01 787766 /bin/cat + // 014ab000-014cc000 rw-p 00000000 00:00 0 [heap] + // 7f7d76af8000-7f7d7797c000 r--p 00000000 fc:01 1318064 /usr/lib/locale/locale-archive + // 7f7d7797c000-7f7d77b36000 r-xp 00000000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so + // 7f7d77b36000-7f7d77d36000 ---p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so + // 7f7d77d36000-7f7d77d3a000 r--p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so + // 7f7d77d3a000-7f7d77d3c000 rw-p 001be000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so + // 7f7d77d3c000-7f7d77d41000 rw-p 00000000 00:00 0 + // 7f7d77d41000-7f7d77d64000 r-xp 00000000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so + // 7f7d77f3f000-7f7d77f42000 rw-p 00000000 00:00 0 + // 7f7d77f61000-7f7d77f63000 rw-p 00000000 00:00 0 + // 7f7d77f63000-7f7d77f64000 r--p 00022000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so + // 7f7d77f64000-7f7d77f65000 rw-p 00023000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so + // 7f7d77f65000-7f7d77f66000 rw-p 00000000 00:00 0 + // 7ffc342a2000-7ffc342c3000 rw-p 00000000 00:00 0 [stack] + // 7ffc34343000-7ffc34345000 r-xp 00000000 00:00 0 [vdso] + // ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall] + + var line []byte 
+ // next removes and returns the next field in the line. + // It also removes from line any spaces following the field. + next := func() []byte { + var f []byte + f, line, _ = bytes.Cut(line, space) + line = bytes.TrimLeft(line, " ") + return f + } + + for len(data) > 0 { + line, data, _ = bytes.Cut(data, newline) + addr := next() + loStr, hiStr, ok := strings.Cut(string(addr), "-") + if !ok { + continue + } + lo, err := strconv.ParseUint(loStr, 16, 64) + if err != nil { + continue + } + hi, err := strconv.ParseUint(hiStr, 16, 64) + if err != nil { + continue + } + perm := next() + if len(perm) < 4 || perm[2] != 'x' { + // Only interested in executable mappings. + continue + } + offset, err := strconv.ParseUint(string(next()), 16, 64) + if err != nil { + continue + } + next() // dev + inode := next() // inode + if line == nil { + continue + } + file := string(line) + + // Trim deleted file marker. + deletedStr := " (deleted)" + deletedLen := len(deletedStr) + if len(file) >= deletedLen && file[len(file)-deletedLen:] == deletedStr { + file = file[:len(file)-deletedLen] + } + + if len(inode) == 1 && inode[0] == '0' && file == "" { + // Huge-page text mappings list the initial fragment of + // mapped but unpopulated memory as being inode 0. + // Don't report that part. + // But [vdso] and [vsyscall] are inode 0, so let non-empty file names through. + continue + } + + // TODO: pprof's remapMappingIDs makes one adjustment: + // 1. If there is an /anon_hugepage mapping first and it is + // consecutive to a next mapping, drop the /anon_hugepage. + // There's no indication why this is needed. + // Let's try not doing this and see what breaks. + // If we do need it, it would go here, before we + // enter the mappings into b.mem in the first place. 
+ + buildID, _ := elfBuildID(file) + addMapping(lo, hi, offset, file, buildID) + } +} + +func (b *profileBuilder) addMapping(lo, hi, offset uint64, file, buildID string) { + b.addMappingEntry(lo, hi, offset, file, buildID, false) +} + +func (b *profileBuilder) addMappingEntry(lo, hi, offset uint64, file, buildID string, fake bool) { + b.mem = append(b.mem, memMap{ + start: uintptr(lo), + end: uintptr(hi), + offset: offset, + file: file, + buildID: buildID, + fake: fake, + }) +} diff --git a/src/runtime/pprof/proto_other.go b/src/runtime/pprof/proto_other.go new file mode 100644 index 0000000..4a7fe79 --- /dev/null +++ b/src/runtime/pprof/proto_other.go @@ -0,0 +1,30 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows + +package pprof + +import ( + "errors" + "os" +) + +// readMapping reads /proc/self/maps and writes mappings to b.pb. +// It saves the address ranges of the mappings in b.mem for use +// when emitting locations. +func (b *profileBuilder) readMapping() { + data, _ := os.ReadFile("/proc/self/maps") + parseProcSelfMaps(data, b.addMapping) + if len(b.mem) == 0 { // pprof expects a map entry, so fake one. + b.addMappingEntry(0, 0, 0, "", "", true) + // TODO(hyangah): make addMapping return *memMap or + // take a memMap struct, and get rid of addMappingEntry + // that takes a bunch of positional arguments. + } +} + +func readMainModuleMapping() (start, end uint64, err error) { + return 0, 0, errors.New("not implemented") +} diff --git a/src/runtime/pprof/proto_test.go b/src/runtime/pprof/proto_test.go new file mode 100644 index 0000000..780b481 --- /dev/null +++ b/src/runtime/pprof/proto_test.go @@ -0,0 +1,470 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package pprof + +import ( + "bytes" + "encoding/json" + "fmt" + "internal/abi" + "internal/profile" + "internal/testenv" + "os" + "os/exec" + "reflect" + "runtime" + "strings" + "testing" + "unsafe" +) + +// translateCPUProfile parses binary CPU profiling stack trace data +// generated by runtime.CPUProfile() into a profile struct. +// This is only used for testing. Real conversions stream the +// data into the profileBuilder as it becomes available. +// +// count is the number of records in data. +func translateCPUProfile(data []uint64, count int) (*profile.Profile, error) { + var buf bytes.Buffer + b := newProfileBuilder(&buf) + tags := make([]unsafe.Pointer, count) + if err := b.addCPUData(data, tags); err != nil { + return nil, err + } + b.build() + return profile.Parse(&buf) +} + +// fmtJSON returns a pretty-printed JSON form for x. +// It works reasonably well for printing protocol-buffer +// data structures like profile.Profile. +func fmtJSON(x any) string { + js, _ := json.MarshalIndent(x, "", "\t") + return string(js) +} + +func TestConvertCPUProfileEmpty(t *testing.T) { + // A test server with mock cpu profile data. + var buf bytes.Buffer + + b := []uint64{3, 0, 500} // empty profile at 500 Hz (2ms sample period) + p, err := translateCPUProfile(b, 1) + if err != nil { + t.Fatalf("translateCPUProfile: %v", err) + } + if err := p.Write(&buf); err != nil { + t.Fatalf("writing profile: %v", err) + } + + p, err = profile.Parse(&buf) + if err != nil { + t.Fatalf("profile.Parse: %v", err) + } + + // Expected PeriodType and SampleType. + periodType := &profile.ValueType{Type: "cpu", Unit: "nanoseconds"} + sampleType := []*profile.ValueType{ + {Type: "samples", Unit: "count"}, + {Type: "cpu", Unit: "nanoseconds"}, + } + + checkProfile(t, p, 2000*1000, periodType, sampleType, nil, "") +} + +func f1() { f1() } +func f2() { f2() } + +// testPCs returns two PCs and two corresponding memory mappings +// to use in test profiles. 
+func testPCs(t *testing.T) (addr1, addr2 uint64, map1, map2 *profile.Mapping) { + switch runtime.GOOS { + case "linux", "android", "netbsd": + // Figure out two addresses from /proc/self/maps. + mmap, err := os.ReadFile("/proc/self/maps") + if err != nil { + t.Fatal(err) + } + mprof := &profile.Profile{} + if err = mprof.ParseMemoryMap(bytes.NewReader(mmap)); err != nil { + t.Fatalf("parsing /proc/self/maps: %v", err) + } + if len(mprof.Mapping) < 2 { + // It is possible for a binary to only have 1 executable + // region of memory. + t.Skipf("need 2 or more mappings, got %v", len(mprof.Mapping)) + } + addr1 = mprof.Mapping[0].Start + map1 = mprof.Mapping[0] + map1.BuildID, _ = elfBuildID(map1.File) + addr2 = mprof.Mapping[1].Start + map2 = mprof.Mapping[1] + map2.BuildID, _ = elfBuildID(map2.File) + case "windows": + addr1 = uint64(abi.FuncPCABIInternal(f1)) + addr2 = uint64(abi.FuncPCABIInternal(f2)) + + exe, err := os.Executable() + if err != nil { + t.Fatal(err) + } + + start, end, err := readMainModuleMapping() + if err != nil { + t.Fatal(err) + } + + map1 = &profile.Mapping{ + ID: 1, + Start: start, + Limit: end, + File: exe, + BuildID: peBuildID(exe), + HasFunctions: true, + } + map2 = &profile.Mapping{ + ID: 1, + Start: start, + Limit: end, + File: exe, + BuildID: peBuildID(exe), + HasFunctions: true, + } + case "js": + addr1 = uint64(abi.FuncPCABIInternal(f1)) + addr2 = uint64(abi.FuncPCABIInternal(f2)) + default: + addr1 = uint64(abi.FuncPCABIInternal(f1)) + addr2 = uint64(abi.FuncPCABIInternal(f2)) + // Fake mapping - HasFunctions will be true because two PCs from Go + // will be fully symbolized. 
+ fake := &profile.Mapping{ID: 1, HasFunctions: true} + map1, map2 = fake, fake + } + return +} + +func TestConvertCPUProfile(t *testing.T) { + addr1, addr2, map1, map2 := testPCs(t) + + b := []uint64{ + 3, 0, 500, // hz = 500 + 5, 0, 10, uint64(addr1 + 1), uint64(addr1 + 2), // 10 samples in addr1 + 5, 0, 40, uint64(addr2 + 1), uint64(addr2 + 2), // 40 samples in addr2 + 5, 0, 10, uint64(addr1 + 1), uint64(addr1 + 2), // 10 samples in addr1 + } + p, err := translateCPUProfile(b, 4) + if err != nil { + t.Fatalf("translating profile: %v", err) + } + period := int64(2000 * 1000) + periodType := &profile.ValueType{Type: "cpu", Unit: "nanoseconds"} + sampleType := []*profile.ValueType{ + {Type: "samples", Unit: "count"}, + {Type: "cpu", Unit: "nanoseconds"}, + } + samples := []*profile.Sample{ + {Value: []int64{20, 20 * 2000 * 1000}, Location: []*profile.Location{ + {ID: 1, Mapping: map1, Address: addr1}, + {ID: 2, Mapping: map1, Address: addr1 + 1}, + }}, + {Value: []int64{40, 40 * 2000 * 1000}, Location: []*profile.Location{ + {ID: 3, Mapping: map2, Address: addr2}, + {ID: 4, Mapping: map2, Address: addr2 + 1}, + }}, + } + checkProfile(t, p, period, periodType, sampleType, samples, "") +} + +func checkProfile(t *testing.T, p *profile.Profile, period int64, periodType *profile.ValueType, sampleType []*profile.ValueType, samples []*profile.Sample, defaultSampleType string) { + t.Helper() + + if p.Period != period { + t.Errorf("p.Period = %d, want %d", p.Period, period) + } + if !reflect.DeepEqual(p.PeriodType, periodType) { + t.Errorf("p.PeriodType = %v\nwant = %v", fmtJSON(p.PeriodType), fmtJSON(periodType)) + } + if !reflect.DeepEqual(p.SampleType, sampleType) { + t.Errorf("p.SampleType = %v\nwant = %v", fmtJSON(p.SampleType), fmtJSON(sampleType)) + } + if defaultSampleType != p.DefaultSampleType { + t.Errorf("p.DefaultSampleType = %v\nwant = %v", p.DefaultSampleType, defaultSampleType) + } + // Clear line info since it is not in the expected samples. 
+ // If we used f1 and f2 above, then the samples will have line info. + for _, s := range p.Sample { + for _, l := range s.Location { + l.Line = nil + } + } + if fmtJSON(p.Sample) != fmtJSON(samples) { // ignore unexported fields + if len(p.Sample) == len(samples) { + for i := range p.Sample { + if !reflect.DeepEqual(p.Sample[i], samples[i]) { + t.Errorf("sample %d = %v\nwant = %v\n", i, fmtJSON(p.Sample[i]), fmtJSON(samples[i])) + } + } + if t.Failed() { + t.FailNow() + } + } + t.Fatalf("p.Sample = %v\nwant = %v", fmtJSON(p.Sample), fmtJSON(samples)) + } +} + +var profSelfMapsTests = ` +00400000-0040b000 r-xp 00000000 fc:01 787766 /bin/cat +0060a000-0060b000 r--p 0000a000 fc:01 787766 /bin/cat +0060b000-0060c000 rw-p 0000b000 fc:01 787766 /bin/cat +014ab000-014cc000 rw-p 00000000 00:00 0 [heap] +7f7d76af8000-7f7d7797c000 r--p 00000000 fc:01 1318064 /usr/lib/locale/locale-archive +7f7d7797c000-7f7d77b36000 r-xp 00000000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77b36000-7f7d77d36000 ---p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d36000-7f7d77d3a000 r--p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3a000-7f7d77d3c000 rw-p 001be000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3c000-7f7d77d41000 rw-p 00000000 00:00 0 +7f7d77d41000-7f7d77d64000 r-xp 00000000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f3f000-7f7d77f42000 rw-p 00000000 00:00 0 +7f7d77f61000-7f7d77f63000 rw-p 00000000 00:00 0 +7f7d77f63000-7f7d77f64000 r--p 00022000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f64000-7f7d77f65000 rw-p 00023000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f65000-7f7d77f66000 rw-p 00000000 00:00 0 +7ffc342a2000-7ffc342c3000 rw-p 00000000 00:00 0 [stack] +7ffc34343000-7ffc34345000 r-xp 00000000 00:00 0 [vdso] +ffffffffff600000-ffffffffff601000 r-xp 00000090 00:00 0 [vsyscall] +-> +00400000 0040b000 00000000 /bin/cat +7f7d7797c000 7f7d77b36000 00000000 
/lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d41000 7f7d77d64000 00000000 /lib/x86_64-linux-gnu/ld-2.19.so +7ffc34343000 7ffc34345000 00000000 [vdso] +ffffffffff600000 ffffffffff601000 00000090 [vsyscall] + +00400000-07000000 r-xp 00000000 00:00 0 +07000000-07093000 r-xp 06c00000 00:2e 536754 /path/to/gobench_server_main +07093000-0722d000 rw-p 06c92000 00:2e 536754 /path/to/gobench_server_main +0722d000-07b21000 rw-p 00000000 00:00 0 +c000000000-c000036000 rw-p 00000000 00:00 0 +-> +07000000 07093000 06c00000 /path/to/gobench_server_main +` + +var profSelfMapsTestsWithDeleted = ` +00400000-0040b000 r-xp 00000000 fc:01 787766 /bin/cat (deleted) +0060a000-0060b000 r--p 0000a000 fc:01 787766 /bin/cat (deleted) +0060b000-0060c000 rw-p 0000b000 fc:01 787766 /bin/cat (deleted) +014ab000-014cc000 rw-p 00000000 00:00 0 [heap] +7f7d76af8000-7f7d7797c000 r--p 00000000 fc:01 1318064 /usr/lib/locale/locale-archive +7f7d7797c000-7f7d77b36000 r-xp 00000000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77b36000-7f7d77d36000 ---p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d36000-7f7d77d3a000 r--p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3a000-7f7d77d3c000 rw-p 001be000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3c000-7f7d77d41000 rw-p 00000000 00:00 0 +7f7d77d41000-7f7d77d64000 r-xp 00000000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f3f000-7f7d77f42000 rw-p 00000000 00:00 0 +7f7d77f61000-7f7d77f63000 rw-p 00000000 00:00 0 +7f7d77f63000-7f7d77f64000 r--p 00022000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f64000-7f7d77f65000 rw-p 00023000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f65000-7f7d77f66000 rw-p 00000000 00:00 0 +7ffc342a2000-7ffc342c3000 rw-p 00000000 00:00 0 [stack] +7ffc34343000-7ffc34345000 r-xp 00000000 00:00 0 [vdso] +ffffffffff600000-ffffffffff601000 r-xp 00000090 00:00 0 [vsyscall] +-> +00400000 0040b000 00000000 /bin/cat +7f7d7797c000 7f7d77b36000 00000000 
/lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d41000 7f7d77d64000 00000000 /lib/x86_64-linux-gnu/ld-2.19.so +7ffc34343000 7ffc34345000 00000000 [vdso] +ffffffffff600000 ffffffffff601000 00000090 [vsyscall] + +00400000-0040b000 r-xp 00000000 fc:01 787766 /bin/cat with space +0060a000-0060b000 r--p 0000a000 fc:01 787766 /bin/cat with space +0060b000-0060c000 rw-p 0000b000 fc:01 787766 /bin/cat with space +014ab000-014cc000 rw-p 00000000 00:00 0 [heap] +7f7d76af8000-7f7d7797c000 r--p 00000000 fc:01 1318064 /usr/lib/locale/locale-archive +7f7d7797c000-7f7d77b36000 r-xp 00000000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77b36000-7f7d77d36000 ---p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d36000-7f7d77d3a000 r--p 001ba000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3a000-7f7d77d3c000 rw-p 001be000 fc:01 1180226 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d3c000-7f7d77d41000 rw-p 00000000 00:00 0 +7f7d77d41000-7f7d77d64000 r-xp 00000000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f3f000-7f7d77f42000 rw-p 00000000 00:00 0 +7f7d77f61000-7f7d77f63000 rw-p 00000000 00:00 0 +7f7d77f63000-7f7d77f64000 r--p 00022000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f64000-7f7d77f65000 rw-p 00023000 fc:01 1180217 /lib/x86_64-linux-gnu/ld-2.19.so +7f7d77f65000-7f7d77f66000 rw-p 00000000 00:00 0 +7ffc342a2000-7ffc342c3000 rw-p 00000000 00:00 0 [stack] +7ffc34343000-7ffc34345000 r-xp 00000000 00:00 0 [vdso] +ffffffffff600000-ffffffffff601000 r-xp 00000090 00:00 0 [vsyscall] +-> +00400000 0040b000 00000000 /bin/cat with space +7f7d7797c000 7f7d77b36000 00000000 /lib/x86_64-linux-gnu/libc-2.19.so +7f7d77d41000 7f7d77d64000 00000000 /lib/x86_64-linux-gnu/ld-2.19.so +7ffc34343000 7ffc34345000 00000000 [vdso] +ffffffffff600000 ffffffffff601000 00000090 [vsyscall] +` + +func TestProcSelfMaps(t *testing.T) { + + f := func(t *testing.T, input string) { + for tx, tt := range strings.Split(input, "\n\n") { + in, out, ok := 
strings.Cut(tt, "->\n") + if !ok { + t.Fatal("malformed test case") + } + if len(out) > 0 && out[len(out)-1] != '\n' { + out += "\n" + } + var buf strings.Builder + parseProcSelfMaps([]byte(in), func(lo, hi, offset uint64, file, buildID string) { + fmt.Fprintf(&buf, "%08x %08x %08x %s\n", lo, hi, offset, file) + }) + if buf.String() != out { + t.Errorf("#%d: have:\n%s\nwant:\n%s\n%q\n%q", tx, buf.String(), out, buf.String(), out) + } + } + } + + t.Run("Normal", func(t *testing.T) { + f(t, profSelfMapsTests) + }) + + t.Run("WithDeletedFile", func(t *testing.T) { + f(t, profSelfMapsTestsWithDeleted) + }) +} + +// TestMapping checks the mapping section of CPU profiles +// has the HasFunctions field set correctly. If all PCs included +// in the samples are successfully symbolized, the corresponding +// mapping entry (in this test case, only one entry) should have +// its HasFunctions field set true. +// The test generates a CPU profile that includes PCs from C side +// that the runtime can't symbolize. See ./testdata/mappingtest. +func TestMapping(t *testing.T) { + testenv.MustHaveGoRun(t) + testenv.MustHaveCGO(t) + + prog := "./testdata/mappingtest/main.go" + + // GoOnly includes only Go symbols that runtime will symbolize. + // Go+C includes C symbols that runtime will not symbolize. 
+ for _, traceback := range []string{"GoOnly", "Go+C"} { + t.Run("traceback"+traceback, func(t *testing.T) { + cmd := exec.Command(testenv.GoToolPath(t), "run", prog) + if traceback != "GoOnly" { + cmd.Env = append(os.Environ(), "SETCGOTRACEBACK=1") + } + cmd.Stderr = new(bytes.Buffer) + + out, err := cmd.Output() + if err != nil { + t.Fatalf("failed to run the test program %q: %v\n%v", prog, err, cmd.Stderr) + } + + prof, err := profile.Parse(bytes.NewReader(out)) + if err != nil { + t.Fatalf("failed to parse the generated profile data: %v", err) + } + t.Logf("Profile: %s", prof) + + hit := make(map[*profile.Mapping]bool) + miss := make(map[*profile.Mapping]bool) + for _, loc := range prof.Location { + if symbolized(loc) { + hit[loc.Mapping] = true + } else { + miss[loc.Mapping] = true + } + } + if len(miss) == 0 { + t.Log("no location with missing symbol info was sampled") + } + + for _, m := range prof.Mapping { + if miss[m] && m.HasFunctions { + t.Errorf("mapping %+v has HasFunctions=true, but contains locations with failed symbolization", m) + continue + } + if !miss[m] && hit[m] && !m.HasFunctions { + t.Errorf("mapping %+v has HasFunctions=false, but all referenced locations from this mapping were symbolized successfully", m) + continue + } + } + + if traceback == "Go+C" { + // The test code was arranged to have PCs from C and + // they are not symbolized. + // Check no Location containing those unsymbolized PCs contains multiple lines. 
+ for i, loc := range prof.Location { + if !symbolized(loc) && len(loc.Line) > 1 { + t.Errorf("Location[%d] contains unsymbolized PCs and multiple lines: %v", i, loc) + } + } + } + }) + } +} + +func symbolized(loc *profile.Location) bool { + if len(loc.Line) == 0 { + return false + } + l := loc.Line[0] + f := l.Function + if l.Line == 0 || f == nil || f.Name == "" || f.Filename == "" { + return false + } + return true +} + +// TestFakeMapping tests if at least one mapping exists +// (including a fake mapping), and their HasFunctions bits +// are set correctly. +func TestFakeMapping(t *testing.T) { + var buf bytes.Buffer + if err := Lookup("heap").WriteTo(&buf, 0); err != nil { + t.Fatalf("failed to write heap profile: %v", err) + } + prof, err := profile.Parse(&buf) + if err != nil { + t.Fatalf("failed to parse the generated profile data: %v", err) + } + t.Logf("Profile: %s", prof) + if len(prof.Mapping) == 0 { + t.Fatal("want profile with at least one mapping entry, got 0 mapping") + } + + hit := make(map[*profile.Mapping]bool) + miss := make(map[*profile.Mapping]bool) + for _, loc := range prof.Location { + if symbolized(loc) { + hit[loc.Mapping] = true + } else { + miss[loc.Mapping] = true + } + } + for _, m := range prof.Mapping { + if miss[m] && m.HasFunctions { + t.Errorf("mapping %+v has HasFunctions=true, but contains locations with failed symbolization", m) + continue + } + if !miss[m] && hit[m] && !m.HasFunctions { + t.Errorf("mapping %+v has HasFunctions=false, but all referenced locations from this mapping were symbolized successfully", m) + continue + } + } +} + +// Make sure the profiler can handle an empty stack trace. +// See issue 37967. 
+func TestEmptyStack(t *testing.T) { + b := []uint64{ + 3, 0, 500, // hz = 500 + 3, 0, 10, // 10 samples with an empty stack trace + } + _, err := translateCPUProfile(b, 2) + if err != nil { + t.Fatalf("translating profile: %v", err) + } +} diff --git a/src/runtime/pprof/proto_windows.go b/src/runtime/pprof/proto_windows.go new file mode 100644 index 0000000..d5ae4a5 --- /dev/null +++ b/src/runtime/pprof/proto_windows.go @@ -0,0 +1,73 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "errors" + "internal/syscall/windows" + "syscall" +) + +// readMapping adds memory mapping information to the profile. +func (b *profileBuilder) readMapping() { + snap, err := createModuleSnapshot() + if err != nil { + // pprof expects a map entry, so fake one, when we haven't added anything yet. + b.addMappingEntry(0, 0, 0, "", "", true) + return + } + defer func() { _ = syscall.CloseHandle(snap) }() + + var module windows.ModuleEntry32 + module.Size = uint32(windows.SizeofModuleEntry32) + err = windows.Module32First(snap, &module) + if err != nil { + // pprof expects a map entry, so fake one, when we haven't added anything yet. 
+ b.addMappingEntry(0, 0, 0, "", "", true) + return + } + for err == nil { + exe := syscall.UTF16ToString(module.ExePath[:]) + b.addMappingEntry( + uint64(module.ModBaseAddr), + uint64(module.ModBaseAddr)+uint64(module.ModBaseSize), + 0, + exe, + peBuildID(exe), + false, + ) + err = windows.Module32Next(snap, &module) + } +} + +func readMainModuleMapping() (start, end uint64, err error) { + snap, err := createModuleSnapshot() + if err != nil { + return 0, 0, err + } + defer func() { _ = syscall.CloseHandle(snap) }() + + var module windows.ModuleEntry32 + module.Size = uint32(windows.SizeofModuleEntry32) + err = windows.Module32First(snap, &module) + if err != nil { + return 0, 0, err + } + + return uint64(module.ModBaseAddr), uint64(module.ModBaseAddr) + uint64(module.ModBaseSize), nil +} + +func createModuleSnapshot() (syscall.Handle, error) { + for { + snap, err := syscall.CreateToolhelp32Snapshot(windows.TH32CS_SNAPMODULE|windows.TH32CS_SNAPMODULE32, uint32(syscall.Getpid())) + var errno syscall.Errno + if err != nil && errors.As(err, &errno) && errno == windows.ERROR_BAD_LENGTH { + // When CreateToolhelp32Snapshot(SNAPMODULE|SNAPMODULE32, ...) fails + // with ERROR_BAD_LENGTH then it should be retried until it succeeds. + continue + } + return snap, err + } +} diff --git a/src/runtime/pprof/protobuf.go b/src/runtime/pprof/protobuf.go new file mode 100644 index 0000000..f7ec1ac --- /dev/null +++ b/src/runtime/pprof/protobuf.go @@ -0,0 +1,141 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +// A protobuf is a simple protocol buffer encoder. 
+type protobuf struct { + data []byte + tmp [16]byte + nest int +} + +func (b *protobuf) varint(x uint64) { + for x >= 128 { + b.data = append(b.data, byte(x)|0x80) + x >>= 7 + } + b.data = append(b.data, byte(x)) +} + +func (b *protobuf) length(tag int, len int) { + b.varint(uint64(tag)<<3 | 2) + b.varint(uint64(len)) +} + +func (b *protobuf) uint64(tag int, x uint64) { + // append varint to b.data + b.varint(uint64(tag)<<3 | 0) + b.varint(x) +} + +func (b *protobuf) uint64s(tag int, x []uint64) { + if len(x) > 2 { + // Use packed encoding + n1 := len(b.data) + for _, u := range x { + b.varint(u) + } + n2 := len(b.data) + b.length(tag, n2-n1) + n3 := len(b.data) + copy(b.tmp[:], b.data[n2:n3]) + copy(b.data[n1+(n3-n2):], b.data[n1:n2]) + copy(b.data[n1:], b.tmp[:n3-n2]) + return + } + for _, u := range x { + b.uint64(tag, u) + } +} + +func (b *protobuf) uint64Opt(tag int, x uint64) { + if x == 0 { + return + } + b.uint64(tag, x) +} + +func (b *protobuf) int64(tag int, x int64) { + u := uint64(x) + b.uint64(tag, u) +} + +func (b *protobuf) int64Opt(tag int, x int64) { + if x == 0 { + return + } + b.int64(tag, x) +} + +func (b *protobuf) int64s(tag int, x []int64) { + if len(x) > 2 { + // Use packed encoding + n1 := len(b.data) + for _, u := range x { + b.varint(uint64(u)) + } + n2 := len(b.data) + b.length(tag, n2-n1) + n3 := len(b.data) + copy(b.tmp[:], b.data[n2:n3]) + copy(b.data[n1+(n3-n2):], b.data[n1:n2]) + copy(b.data[n1:], b.tmp[:n3-n2]) + return + } + for _, u := range x { + b.int64(tag, u) + } +} + +func (b *protobuf) string(tag int, x string) { + b.length(tag, len(x)) + b.data = append(b.data, x...) 
+} + +func (b *protobuf) strings(tag int, x []string) { + for _, s := range x { + b.string(tag, s) + } +} + +func (b *protobuf) stringOpt(tag int, x string) { + if x == "" { + return + } + b.string(tag, x) +} + +func (b *protobuf) bool(tag int, x bool) { + if x { + b.uint64(tag, 1) + } else { + b.uint64(tag, 0) + } +} + +func (b *protobuf) boolOpt(tag int, x bool) { + if !x { + return + } + b.bool(tag, x) +} + +type msgOffset int + +func (b *protobuf) startMessage() msgOffset { + b.nest++ + return msgOffset(len(b.data)) +} + +func (b *protobuf) endMessage(tag int, start msgOffset) { + n1 := int(start) + n2 := len(b.data) + b.length(tag, n2-n1) + n3 := len(b.data) + copy(b.tmp[:], b.data[n2:n3]) + copy(b.data[n1+(n3-n2):], b.data[n1:n2]) + copy(b.data[n1:], b.tmp[:n3-n2]) + b.nest-- +} diff --git a/src/runtime/pprof/protomem.go b/src/runtime/pprof/protomem.go new file mode 100644 index 0000000..fa75a28 --- /dev/null +++ b/src/runtime/pprof/protomem.go @@ -0,0 +1,93 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "io" + "math" + "runtime" + "strings" +) + +// writeHeapProto writes the current heap profile in protobuf format to w. 
+func writeHeapProto(w io.Writer, p []runtime.MemProfileRecord, rate int64, defaultSampleType string) error { + b := newProfileBuilder(w) + b.pbValueType(tagProfile_PeriodType, "space", "bytes") + b.pb.int64Opt(tagProfile_Period, rate) + b.pbValueType(tagProfile_SampleType, "alloc_objects", "count") + b.pbValueType(tagProfile_SampleType, "alloc_space", "bytes") + b.pbValueType(tagProfile_SampleType, "inuse_objects", "count") + b.pbValueType(tagProfile_SampleType, "inuse_space", "bytes") + if defaultSampleType != "" { + b.pb.int64Opt(tagProfile_DefaultSampleType, b.stringIndex(defaultSampleType)) + } + + values := []int64{0, 0, 0, 0} + var locs []uint64 + for _, r := range p { + hideRuntime := true + for tries := 0; tries < 2; tries++ { + stk := r.Stack() + // For heap profiles, all stack + // addresses are return PCs, which is + // what appendLocsForStack expects. + if hideRuntime { + for i, addr := range stk { + if f := runtime.FuncForPC(addr); f != nil && strings.HasPrefix(f.Name(), "runtime.") { + continue + } + // Found non-runtime. Show any runtime uses above it. + stk = stk[i:] + break + } + } + locs = b.appendLocsForStack(locs[:0], stk) + if len(locs) > 0 { + break + } + hideRuntime = false // try again, and show all frames next time. + } + + values[0], values[1] = scaleHeapSample(r.AllocObjects, r.AllocBytes, rate) + values[2], values[3] = scaleHeapSample(r.InUseObjects(), r.InUseBytes(), rate) + var blockSize int64 + if r.AllocObjects > 0 { + blockSize = r.AllocBytes / r.AllocObjects + } + b.pbSample(values, locs, func() { + if blockSize != 0 { + b.pbLabel(tagSample_Label, "bytes", "", blockSize) + } + }) + } + b.build() + return nil +} + +// scaleHeapSample adjusts the data from a heap Sample to +// account for its probability of appearing in the collected +// data. Heap profiles are a sampling of the memory allocation +// requests in a program. 
We estimate the unsampled value by dividing +// each collected sample by its probability of appearing in the +// profile. Heap profiles rely on a Poisson process to determine +// which samples to collect, based on the desired average collection +// rate R. The probability of a sample of size S appearing in that +// profile is 1-exp(-S/R). +func scaleHeapSample(count, size, rate int64) (int64, int64) { + if count == 0 || size == 0 { + return 0, 0 + } + + if rate <= 1 { + // if rate==1 all samples were collected so no adjustment is needed. + // if rate<1 treat as unknown and skip scaling. + return count, size + } + + avgSize := float64(size) / float64(count) + scale := 1 / (1 - math.Exp(-avgSize/float64(rate))) + + return int64(float64(count) * scale), int64(float64(size) * scale) +} diff --git a/src/runtime/pprof/protomem_test.go b/src/runtime/pprof/protomem_test.go new file mode 100644 index 0000000..156f628 --- /dev/null +++ b/src/runtime/pprof/protomem_test.go @@ -0,0 +1,84 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "bytes" + "internal/profile" + "runtime" + "testing" +) + +func TestConvertMemProfile(t *testing.T) { + addr1, addr2, map1, map2 := testPCs(t) + + // MemProfileRecord stacks are return PCs, so add one to the + // addresses recorded in the "profile". The proto profile + // locations are call PCs, so conversion will subtract one + // from these and get back to addr1 and addr2. 
+ a1, a2 := uintptr(addr1)+1, uintptr(addr2)+1 + rate := int64(512 * 1024) + rec := []runtime.MemProfileRecord{ + {AllocBytes: 4096, FreeBytes: 1024, AllocObjects: 4, FreeObjects: 1, Stack0: [32]uintptr{a1, a2}}, + {AllocBytes: 512 * 1024, FreeBytes: 0, AllocObjects: 1, FreeObjects: 0, Stack0: [32]uintptr{a2 + 1, a2 + 2}}, + {AllocBytes: 512 * 1024, FreeBytes: 512 * 1024, AllocObjects: 1, FreeObjects: 1, Stack0: [32]uintptr{a1 + 1, a1 + 2, a2 + 3}}, + } + + periodType := &profile.ValueType{Type: "space", Unit: "bytes"} + sampleType := []*profile.ValueType{ + {Type: "alloc_objects", Unit: "count"}, + {Type: "alloc_space", Unit: "bytes"}, + {Type: "inuse_objects", Unit: "count"}, + {Type: "inuse_space", Unit: "bytes"}, + } + samples := []*profile.Sample{ + { + Value: []int64{2050, 2099200, 1537, 1574400}, + Location: []*profile.Location{ + {ID: 1, Mapping: map1, Address: addr1}, + {ID: 2, Mapping: map2, Address: addr2}, + }, + NumLabel: map[string][]int64{"bytes": {1024}}, + }, + { + Value: []int64{1, 829411, 1, 829411}, + Location: []*profile.Location{ + {ID: 3, Mapping: map2, Address: addr2 + 1}, + {ID: 4, Mapping: map2, Address: addr2 + 2}, + }, + NumLabel: map[string][]int64{"bytes": {512 * 1024}}, + }, + { + Value: []int64{1, 829411, 0, 0}, + Location: []*profile.Location{ + {ID: 5, Mapping: map1, Address: addr1 + 1}, + {ID: 6, Mapping: map1, Address: addr1 + 2}, + {ID: 7, Mapping: map2, Address: addr2 + 3}, + }, + NumLabel: map[string][]int64{"bytes": {512 * 1024}}, + }, + } + for _, tc := range []struct { + name string + defaultSampleType string + }{ + {"heap", ""}, + {"allocs", "alloc_space"}, + } { + t.Run(tc.name, func(t *testing.T) { + var buf bytes.Buffer + if err := writeHeapProto(&buf, rec, rate, tc.defaultSampleType); err != nil { + t.Fatalf("writing profile: %v", err) + } + + p, err := profile.Parse(&buf) + if err != nil { + t.Fatalf("profile.Parse: %v", err) + } + + checkProfile(t, p, rate, periodType, sampleType, samples, tc.defaultSampleType) + }) 
+ } +} diff --git a/src/runtime/pprof/runtime.go b/src/runtime/pprof/runtime.go new file mode 100644 index 0000000..57e9ca4 --- /dev/null +++ b/src/runtime/pprof/runtime.go @@ -0,0 +1,45 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "context" + "runtime" + "unsafe" +) + +// runtime_FrameStartLine is defined in runtime/symtab.go. +func runtime_FrameStartLine(f *runtime.Frame) int + +// runtime_expandFinalInlineFrame is defined in runtime/symtab.go. +func runtime_expandFinalInlineFrame(stk []uintptr) []uintptr + +// runtime_setProfLabel is defined in runtime/proflabel.go. +func runtime_setProfLabel(labels unsafe.Pointer) + +// runtime_getProfLabel is defined in runtime/proflabel.go. +func runtime_getProfLabel() unsafe.Pointer + +// SetGoroutineLabels sets the current goroutine's labels to match ctx. +// A new goroutine inherits the labels of the goroutine that created it. +// This is a lower-level API than Do, which should be used instead when possible. +func SetGoroutineLabels(ctx context.Context) { + ctxLabels, _ := ctx.Value(labelContextKey{}).(*labelMap) + runtime_setProfLabel(unsafe.Pointer(ctxLabels)) +} + +// Do calls f with a copy of the parent context with the +// given labels added to the parent's label map. +// Goroutines spawned while executing f will inherit the augmented label-set. +// Each key/value pair in labels is inserted into the label map in the +// order provided, overriding any previous value for the same key. +// The augmented label map will be set for the duration of the call to f +// and restored once f returns. 
+func Do(ctx context.Context, labels LabelSet, f func(context.Context)) { + defer SetGoroutineLabels(ctx) + ctx = WithLabels(ctx, labels) + SetGoroutineLabels(ctx) + f(ctx) +} diff --git a/src/runtime/pprof/runtime_test.go b/src/runtime/pprof/runtime_test.go new file mode 100644 index 0000000..0dd5324 --- /dev/null +++ b/src/runtime/pprof/runtime_test.go @@ -0,0 +1,96 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package pprof + +import ( + "context" + "fmt" + "reflect" + "testing" +) + +func TestSetGoroutineLabels(t *testing.T) { + sync := make(chan struct{}) + + wantLabels := map[string]string{} + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("Expected parent goroutine's profile labels to be empty before test, got %v", gotLabels) + } + go func() { + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("Expected child goroutine's profile labels to be empty before test, got %v", gotLabels) + } + sync <- struct{}{} + }() + <-sync + + wantLabels = map[string]string{"key": "value"} + ctx := WithLabels(context.Background(), Labels("key", "value")) + SetGoroutineLabels(ctx) + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("parent goroutine's profile labels: got %v, want %v", gotLabels, wantLabels) + } + go func() { + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("child goroutine's profile labels: got %v, want %v", gotLabels, wantLabels) + } + sync <- struct{}{} + }() + <-sync + + wantLabels = map[string]string{} + ctx = context.Background() + SetGoroutineLabels(ctx) + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("Expected parent goroutine's profile labels to be empty, got %v", gotLabels) + } + go func() { + if gotLabels := getProfLabel(); 
!reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("Expected child goroutine's profile labels to be empty, got %v", gotLabels) + } + sync <- struct{}{} + }() + <-sync +} + +func TestDo(t *testing.T) { + wantLabels := map[string]string{} + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("Expected parent goroutine's profile labels to be empty before Do, got %v", gotLabels) + } + + Do(context.Background(), Labels("key1", "value1", "key2", "value2"), func(ctx context.Context) { + wantLabels := map[string]string{"key1": "value1", "key2": "value2"} + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("parent goroutine's profile labels: got %v, want %v", gotLabels, wantLabels) + } + + sync := make(chan struct{}) + go func() { + wantLabels := map[string]string{"key1": "value1", "key2": "value2"} + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + t.Errorf("child goroutine's profile labels: got %v, want %v", gotLabels, wantLabels) + } + sync <- struct{}{} + }() + <-sync + + }) + + wantLabels = map[string]string{} + if gotLabels := getProfLabel(); !reflect.DeepEqual(gotLabels, wantLabels) { + fmt.Printf("%#v", gotLabels) + fmt.Printf("%#v", wantLabels) + t.Errorf("Expected parent goroutine's profile labels to be empty after Do, got %v", gotLabels) + } +} + +func getProfLabel() map[string]string { + l := (*labelMap)(runtime_getProfLabel()) + if l == nil { + return map[string]string{} + } + return *l +} diff --git a/src/runtime/pprof/rusage_test.go b/src/runtime/pprof/rusage_test.go new file mode 100644 index 0000000..8039510 --- /dev/null +++ b/src/runtime/pprof/rusage_test.go @@ -0,0 +1,41 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build unix + +package pprof + +import ( + "syscall" + "time" +) + +func init() { + diffCPUTimeImpl = diffCPUTimeRUsage +} + +func diffCPUTimeRUsage(f func()) (user, system time.Duration) { + ok := true + var before, after syscall.Rusage + + err := syscall.Getrusage(syscall.RUSAGE_SELF, &before) + if err != nil { + ok = false + } + + f() + + err = syscall.Getrusage(syscall.RUSAGE_SELF, &after) + if err != nil { + ok = false + } + + if !ok { + return 0, 0 + } + + user = time.Duration(after.Utime.Nano() - before.Utime.Nano()) + system = time.Duration(after.Stime.Nano() - before.Stime.Nano()) + return user, system +} diff --git a/src/runtime/pprof/testdata/README b/src/runtime/pprof/testdata/README new file mode 100644 index 0000000..876538e --- /dev/null +++ b/src/runtime/pprof/testdata/README @@ -0,0 +1,9 @@ +These binaries were generated by: + +$ cat empty.s +.global _start +_start: +$ as --32 -o empty.o empty.s && ld --build-id -m elf_i386 -o test32 empty.o +$ as --64 -o empty.o empty.s && ld --build-id -o test64 empty.o +$ powerpc-linux-gnu-as -o empty.o empty.s && powerpc-linux-gnu-ld --build-id -o test32be empty.o +$ powerpc64-linux-gnu-as -o empty.o empty.s && powerpc64-linux-gnu-ld --build-id -o test64be empty.o diff --git a/src/runtime/pprof/testdata/mappingtest/main.go b/src/runtime/pprof/testdata/mappingtest/main.go new file mode 100644 index 0000000..484b7f9 --- /dev/null +++ b/src/runtime/pprof/testdata/mappingtest/main.go @@ -0,0 +1,108 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This program outputs a CPU profile that includes +// both Go and Cgo stacks. This is used by the mapping info +// tests in runtime/pprof. +// +// If SETCGOTRACEBACK=1 is set, the CPU profile will include +// PCs from C side but they will not be symbolized.
+package main + +/* +#include <stdint.h> +#include <stdlib.h> + +int cpuHogCSalt1 = 0; +int cpuHogCSalt2 = 0; + +void CPUHogCFunction0(int foo) { + int i; + for (i = 0; i < 100000; i++) { + if (foo > 0) { + foo *= foo; + } else { + foo *= foo + 1; + } + cpuHogCSalt2 = foo; + } +} + +void CPUHogCFunction() { + CPUHogCFunction0(cpuHogCSalt1); +} + +struct CgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t *buf; + uintptr_t max; +}; + +void CollectCgoTraceback(void* parg) { + struct CgoTracebackArg* arg = (struct CgoTracebackArg*)(parg); + arg->buf[0] = (uintptr_t)(CPUHogCFunction0); + arg->buf[1] = (uintptr_t)(CPUHogCFunction); + arg->buf[2] = 0; +}; +*/ +import "C" + +import ( + "log" + "os" + "runtime" + "runtime/pprof" + "time" + "unsafe" +) + +func init() { + if v := os.Getenv("SETCGOTRACEBACK"); v == "1" { + // Collect some PCs from C-side, but don't symbolize. + runtime.SetCgoTraceback(0, unsafe.Pointer(C.CollectCgoTraceback), nil, nil) + } +} + +func main() { + go cpuHogGoFunction() + go cpuHogCFunction() + runtime.Gosched() + + if err := pprof.StartCPUProfile(os.Stdout); err != nil { + log.Fatal("can't start CPU profile: ", err) + } + time.Sleep(200 * time.Millisecond) + pprof.StopCPUProfile() + + if err := os.Stdout.Close(); err != nil { + log.Fatal("can't write CPU profile: ", err) + } +} + +var salt1 int +var salt2 int + +func cpuHogGoFunction() { + for { + foo := salt1 + for i := 0; i < 1e5; i++ { + if foo > 0 { + foo *= foo + } else { + foo *= foo + 1 + } + salt2 = foo + } + runtime.Gosched() + } +} + +func cpuHogCFunction() { + // Generates CPU profile samples including a Cgo call path. 
+ for { + C.CPUHogCFunction() + runtime.Gosched() + } +} diff --git a/src/runtime/pprof/testdata/test32 b/src/runtime/pprof/testdata/test32 Binary files differ new file mode 100755 index 0000000..ce59472 --- /dev/null +++ b/src/runtime/pprof/testdata/test32 diff --git a/src/runtime/pprof/testdata/test32be b/src/runtime/pprof/testdata/test32be Binary files differ new file mode 100755 index 0000000..f13a732 --- /dev/null +++ b/src/runtime/pprof/testdata/test32be diff --git a/src/runtime/pprof/testdata/test64 b/src/runtime/pprof/testdata/test64 Binary files differ new file mode 100755 index 0000000..3fb42fb --- /dev/null +++ b/src/runtime/pprof/testdata/test64 diff --git a/src/runtime/pprof/testdata/test64be b/src/runtime/pprof/testdata/test64be Binary files differ new file mode 100755 index 0000000..09b4b01 --- /dev/null +++ b/src/runtime/pprof/testdata/test64be diff --git a/src/runtime/preempt.go b/src/runtime/preempt.go new file mode 100644 index 0000000..4f62fc6 --- /dev/null +++ b/src/runtime/preempt.go @@ -0,0 +1,452 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Goroutine preemption +// +// A goroutine can be preempted at any safe-point. Currently, there +// are a few categories of safe-points: +// +// 1. A blocked safe-point occurs for the duration that a goroutine is +// descheduled, blocked on synchronization, or in a system call. +// +// 2. Synchronous safe-points occur when a running goroutine checks +// for a preemption request. +// +// 3. Asynchronous safe-points occur at any instruction in user code +// where the goroutine can be safely paused and a conservative +// stack and register scan can find stack roots. The runtime can +// stop a goroutine at an async safe-point using a signal.
+// +// At both blocked and synchronous safe-points, a goroutine's CPU +// state is minimal and the garbage collector has complete information +// about its entire stack. This makes it possible to deschedule a +// goroutine with minimal space, and to precisely scan a goroutine's +// stack. +// +// Synchronous safe-points are implemented by overloading the stack +// bound check in function prologues. To preempt a goroutine at the +// next synchronous safe-point, the runtime poisons the goroutine's +// stack bound to a value that will cause the next stack bound check +// to fail and enter the stack growth implementation, which will +// detect that it was actually a preemption and redirect to preemption +// handling. +// +// Preemption at asynchronous safe-points is implemented by suspending +// the thread using an OS mechanism (e.g., signals) and inspecting its +// state to determine if the goroutine was at an asynchronous +// safe-point. Since the thread suspension itself is generally +// asynchronous, it also checks if the running goroutine wants to be +// preempted, since this could have changed. If all conditions are +// satisfied, it adjusts the signal context to make it look like the +// signaled thread just called asyncPreempt and resumes the thread. +// asyncPreempt spills all registers and enters the scheduler. +// +// (An alternative would be to preempt in the signal handler itself. +// This would let the OS save and restore the register state and the +// runtime would only need to know how to extract potentially +// pointer-containing registers from the signal context. However, this +// would consume an M for every preempted G, and the scheduler itself +// is not designed to run from a signal handler, as it tends to +// allocate memory and start threads in the preemption path.) 
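The synchronous mechanism described in the comment above (poisoning the stack bound so that the next prologue check fails and is redirected to preemption handling) can be illustrated with a small user-level model. This is only a sketch under stated assumptions: `task`, `normalGuard`, `requestPreempt`, and `prologue` are hypothetical names invented for the illustration, the constant values are arbitrary, and none of this is the runtime's actual code.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Hypothetical values for this sketch only; the runtime's real constants differ.
const (
	normalGuard  uintptr = 0x1000      // ordinary lower stack bound
	stackPreempt uintptr = ^uintptr(0) // poisoned bound: every check fails
)

// task is a toy stand-in for a goroutine (it is not the runtime's g type).
type task struct {
	stackGuard uintptr // read in every prologue, written by the preemptor
	yields     int     // how many times the task was "preempted"
}

// requestPreempt poisons the stack bound so the next prologue check fails.
func (t *task) requestPreempt() {
	atomic.StoreUintptr(&t.stackGuard, stackPreempt)
}

// prologue models the stack bound check at function entry. It reports
// whether the check was diverted into preemption handling.
func (t *task) prologue(sp uintptr) bool {
	guard := atomic.LoadUintptr(&t.stackGuard)
	if sp > guard {
		return false // fast path: enough stack, keep running
	}
	// Slow path: this is where stack growth would normally be handled.
	if guard == stackPreempt {
		// It was really a preemption request: restore the bound and yield.
		atomic.StoreUintptr(&t.stackGuard, normalGuard)
		t.yields++
		return true
	}
	return false // a genuine stack-growth request; not modeled here
}

func main() {
	t := &task{stackGuard: normalGuard}
	const sp = 0x8000 // simulated stack pointer, well above normalGuard
	for i := 0; i < 5; i++ {
		if i == 2 {
			t.requestPreempt() // another thread would normally do this
		}
		if t.prologue(sp) {
			fmt.Println("preempted at call", i)
		}
	}
	fmt.Println("yields:", t.yields)
}
```

The point of the design is that the fast path stays a single comparison: a preemption request costs nothing until the poisoned bound forces the slow path, which then tells a real stack-growth need apart from a preemption.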
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" +) + +type suspendGState struct { + g *g + + // dead indicates the goroutine was not suspended because it + // is dead. This goroutine could be reused after the dead + // state was observed, so the caller must not assume that it + // remains dead. + dead bool + + // stopped indicates that this suspendG transitioned the G to + // _Gwaiting via g.preemptStop and thus is responsible for + // readying it when done. + stopped bool +} + +// suspendG suspends goroutine gp at a safe-point and returns the +// state of the suspended goroutine. The caller gets read access to +// the goroutine until it calls resumeG. +// +// It is safe for multiple callers to attempt to suspend the same +// goroutine at the same time. The goroutine may execute between +// subsequent successful suspend operations. The current +// implementation grants exclusive access to the goroutine, and hence +// multiple callers will serialize. However, the intent is to grant +// shared read access, so please don't depend on exclusive access. +// +// This must be called from the system stack and the user goroutine on +// the current M (if any) must be in a preemptible state. This +// prevents deadlocks where two goroutines attempt to suspend each +// other and both are in non-preemptible states. There are other ways +// to resolve this deadlock, but this seems simplest. +// +// TODO(austin): What if we instead required this to be called from a +// user goroutine? Then we could deschedule the goroutine while +// waiting instead of blocking the thread. If two goroutines tried to +// suspend each other, one of them would win and the other wouldn't +// complete the suspend until it was resumed. We would have to be +// careful that they couldn't actually queue up suspend for each other +// and then both be suspended. 
This would also avoid the need for a +// kernel context switch in the synchronous case because we could just +// directly schedule the waiter. The context switch is unavoidable in +// the signal case. +// +//go:systemstack +func suspendG(gp *g) suspendGState { + if mp := getg().m; mp.curg != nil && readgstatus(mp.curg) == _Grunning { + // Since we're on the system stack of this M, the user + // G is stuck at an unsafe point. If another goroutine + // were to try to preempt m.curg, it could deadlock. + throw("suspendG from non-preemptible goroutine") + } + + // See https://golang.org/cl/21503 for justification of the yield delay. + const yieldDelay = 10 * 1000 + var nextYield int64 + + // Drive the goroutine to a preemption point. + stopped := false + var asyncM *m + var asyncGen uint32 + var nextPreemptM int64 + for i := 0; ; i++ { + switch s := readgstatus(gp); s { + default: + if s&_Gscan != 0 { + // Someone else is suspending it. Wait + // for them to finish. + // + // TODO: It would be nicer if we could + // coalesce suspends. + break + } + + dumpgstatus(gp) + throw("invalid g status") + + case _Gdead: + // Nothing to suspend. + // + // preemptStop may need to be cleared, but + // doing that here could race with goroutine + // reuse. Instead, goexit0 clears it. + return suspendGState{dead: true} + + case _Gcopystack: + // The stack is being copied. We need to wait + // until this is done. + + case _Gpreempted: + // We (or someone else) suspended the G. Claim + // ownership of it by transitioning it to + // _Gwaiting. + if !casGFromPreempted(gp, _Gpreempted, _Gwaiting) { + break + } + + // We stopped the G, so we have to ready it later. + stopped = true + + s = _Gwaiting + fallthrough + + case _Grunnable, _Gsyscall, _Gwaiting: + // Claim goroutine by setting scan bit. + // This may race with execution or readying of gp. + // The scan bit keeps it from transitioning state. + if !castogscanstatus(gp, s, s|_Gscan) { + break + } + + // Clear the preemption request.
It's safe to + // reset the stack guard because we hold the + // _Gscan bit and thus own the stack. + gp.preemptStop = false + gp.preempt = false + gp.stackguard0 = gp.stack.lo + _StackGuard + + // The goroutine was already at a safe-point + // and we've now locked that in. + // + // TODO: It would be much better if we didn't + // leave it in _Gscan, but instead gently + // prevented its scheduling until resumption. + // Maybe we only use this to bump a suspended + // count and the scheduler skips suspended + // goroutines? That wouldn't be enough for + // {_Gsyscall,_Gwaiting} -> _Grunning. Maybe + // for all those transitions we need to check + // suspended and deschedule? + return suspendGState{g: gp, stopped: stopped} + + case _Grunning: + // Optimization: if there is already a pending preemption request + // (from the previous loop iteration), don't bother with the atomics. + if gp.preemptStop && gp.preempt && gp.stackguard0 == stackPreempt && asyncM == gp.m && asyncM.preemptGen.Load() == asyncGen { + break + } + + // Temporarily block state transitions. + if !castogscanstatus(gp, _Grunning, _Gscanrunning) { + break + } + + // Request synchronous preemption. + gp.preemptStop = true + gp.preempt = true + gp.stackguard0 = stackPreempt + + // Prepare for asynchronous preemption. + asyncM2 := gp.m + asyncGen2 := asyncM2.preemptGen.Load() + needAsync := asyncM != asyncM2 || asyncGen != asyncGen2 + asyncM = asyncM2 + asyncGen = asyncGen2 + + casfrom_Gscanstatus(gp, _Gscanrunning, _Grunning) + + // Send asynchronous preemption. We do this + // after CASing the G back to _Grunning + // because preemptM may be synchronous and we + // don't want to catch the G just spinning on + // its status. + if preemptMSupported && debug.asyncpreemptoff == 0 && needAsync { + // Rate limit preemptM calls. This is + // particularly important on Windows + // where preemptM is actually + // synchronous and the spin loop here + // can lead to live-lock. 
+ now := nanotime() + if now >= nextPreemptM { + nextPreemptM = now + yieldDelay/2 + preemptM(asyncM) + } + } + } + + // TODO: Don't busy wait. This loop should really only + // be a simple read/decide/CAS loop that only fails if + // there's an active race. Once the CAS succeeds, we + // should queue up the preemption (which will require + // it to be reliable in the _Grunning case, not + // best-effort) and then sleep until we're notified + // that the goroutine is suspended. + if i == 0 { + nextYield = nanotime() + yieldDelay + } + if nanotime() < nextYield { + procyield(10) + } else { + osyield() + nextYield = nanotime() + yieldDelay/2 + } + } +} + +// resumeG undoes the effects of suspendG, allowing the suspended +// goroutine to continue from its current safe-point. +func resumeG(state suspendGState) { + if state.dead { + // We didn't actually stop anything. + return + } + + gp := state.g + switch s := readgstatus(gp); s { + default: + dumpgstatus(gp) + throw("unexpected g status") + + case _Grunnable | _Gscan, + _Gwaiting | _Gscan, + _Gsyscall | _Gscan: + casfrom_Gscanstatus(gp, s, s&^_Gscan) + } + + if state.stopped { + // We stopped it, so we need to re-schedule it. + ready(gp, 0, true) + } +} + +// canPreemptM reports whether mp is in a state that is safe to preempt. +// +// It is nosplit because it has nosplit callers. +// +//go:nosplit +func canPreemptM(mp *m) bool { + return mp.locks == 0 && mp.mallocing == 0 && mp.preemptoff == "" && mp.p.ptr().status == _Prunning +} + +//go:generate go run mkpreempt.go + +// asyncPreempt saves all user registers and calls asyncPreempt2. +// +// When stack scanning encounters an asyncPreempt frame, it scans that +// frame and its parent frame conservatively. +// +// asyncPreempt is implemented in assembly. 
+func asyncPreempt() + +//go:nosplit +func asyncPreempt2() { + gp := getg() + gp.asyncSafePoint = true + if gp.preemptStop { + mcall(preemptPark) + } else { + mcall(gopreempt_m) + } + gp.asyncSafePoint = false +} + +// asyncPreemptStack is the bytes of stack space required to inject an +// asyncPreempt call. +var asyncPreemptStack = ^uintptr(0) + +func init() { + f := findfunc(abi.FuncPCABI0(asyncPreempt)) + total := funcMaxSPDelta(f) + f = findfunc(abi.FuncPCABIInternal(asyncPreempt2)) + total += funcMaxSPDelta(f) + // Add some overhead for return PCs, etc. + asyncPreemptStack = uintptr(total) + 8*goarch.PtrSize + if asyncPreemptStack > _StackLimit { + // We need more than the nosplit limit. This isn't + // unsafe, but it may limit asynchronous preemption. + // + // This may be a problem if we start using more + // registers. In that case, we should store registers + // in a context object. If we pre-allocate one per P, + // asyncPreempt can spill just a few registers to the + // stack, then grab its context object and spill into + // it. When it enters the runtime, it would allocate a + // new context for the P. + print("runtime: asyncPreemptStack=", asyncPreemptStack, "\n") + throw("async stack too large") + } +} + +// wantAsyncPreempt returns whether an asynchronous preemption is +// queued for gp. +func wantAsyncPreempt(gp *g) bool { + // Check both the G and the P. + return (gp.preempt || gp.m.p != 0 && gp.m.p.ptr().preempt) && readgstatus(gp)&^_Gscan == _Grunning +} + +// isAsyncSafePoint reports whether gp at instruction PC is an +// asynchronous safe point. This indicates that: +// +// 1. It's safe to suspend gp and conservatively scan its stack and +// registers. There are no potentially hidden pointer values and it's +// not in the middle of an atomic sequence like a write barrier. +// +// 2. gp has enough stack space to inject the asyncPreempt call. +// +// 3. 
It's generally safe to interact with the runtime, even if we're +// in a signal handler stopped here. For example, there are no runtime +// locks held, so acquiring a runtime lock won't self-deadlock. +// +// In some cases the PC is safe for asynchronous preemption but it +// also needs to adjust the resumption PC. The new PC is returned in +// the second result. +func isAsyncSafePoint(gp *g, pc, sp, lr uintptr) (bool, uintptr) { + mp := gp.m + + // Only user Gs can have safe-points. We check this first + // because it's extremely common that we'll catch mp in the + // scheduler processing this G preemption. + if mp.curg != gp { + return false, 0 + } + + // Check M state. + if mp.p == 0 || !canPreemptM(mp) { + return false, 0 + } + + // Check stack space. + if sp < gp.stack.lo || sp-gp.stack.lo < asyncPreemptStack { + return false, 0 + } + + // Check if PC is an unsafe-point. + f := findfunc(pc) + if !f.valid() { + // Not Go code. + return false, 0 + } + if (GOARCH == "mips" || GOARCH == "mipsle" || GOARCH == "mips64" || GOARCH == "mips64le") && lr == pc+8 && funcspdelta(f, pc, nil) == 0 { + // We probably stopped at a half-executed CALL instruction, + // where the LR is updated but the PC has not. If we preempt + // here we'll see a seemingly self-recursive call, which is in + // fact not. + // This is normally ok, as we use the return address saved on + // stack for unwinding, not the LR value. But if this is a + // call to morestack, we haven't created the frame, and we'll + // use the LR for unwinding, which will be bad. + return false, 0 + } + up, startpc := pcdatavalue2(f, _PCDATA_UnsafePoint, pc) + if up == _PCDATA_UnsafePointUnsafe { + // Unsafe-point marked by compiler. This includes + // atomic sequences (e.g., write barrier) and nosplit + // functions (except at calls). + return false, 0 + } + if fd := funcdata(f, _FUNCDATA_LocalsPointerMaps); fd == nil || f.flag&funcFlag_ASM != 0 { + // This is assembly code. Don't assume it's well-formed. 
+ // TODO: Empirically we still need the fd == nil check. Why? + // + // TODO: Are there cases that are safe but don't have a + // locals pointer map, like empty frame functions? + // It might be possible to preempt any assembly functions + // except the ones that have funcFlag_SPWRITE set in f.flag. + return false, 0 + } + name := funcname(f) + if inldata := funcdata(f, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + ix := pcdatavalue(f, _PCDATA_InlTreeIndex, pc, nil) + if ix >= 0 { + name = funcnameFromNameOff(f, inltree[ix].nameOff) + } + } + if hasPrefix(name, "runtime.") || + hasPrefix(name, "runtime/internal/") || + hasPrefix(name, "reflect.") { + // For now we never async preempt the runtime or + // anything closely tied to the runtime. Known issues + // include: various points in the scheduler ("don't + // preempt between here and here"), much of the defer + // implementation (untyped info on stack), bulk write + // barriers (write barrier check), + // reflect.{makeFuncStub,methodValueCall}. + // + // TODO(austin): We should improve this, or opt things + // in incrementally. + return false, 0 + } + switch up { + case _PCDATA_Restart1, _PCDATA_Restart2: + // Restartable instruction sequence. Back off PC to + // the start PC. + if startpc == 0 || startpc > pc || pc-startpc > 20 { + throw("bad restart PC") + } + return true, startpc + case _PCDATA_RestartAtEntry: + // Restart from the function entry at resumption. + return true, f.entry() + } + return true, pc +} diff --git a/src/runtime/preempt_386.s b/src/runtime/preempt_386.s new file mode 100644 index 0000000..d57bc3d --- /dev/null +++ b/src/runtime/preempt_386.s @@ -0,0 +1,47 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + PUSHFL + ADJSP $156 + NOP SP + MOVL AX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + MOVL BX, 12(SP) + MOVL BP, 16(SP) + MOVL SI, 20(SP) + MOVL DI, 24(SP) + #ifndef GO386_softfloat + MOVUPS X0, 28(SP) + MOVUPS X1, 44(SP) + MOVUPS X2, 60(SP) + MOVUPS X3, 76(SP) + MOVUPS X4, 92(SP) + MOVUPS X5, 108(SP) + MOVUPS X6, 124(SP) + MOVUPS X7, 140(SP) + #endif + CALL ·asyncPreempt2(SB) + #ifndef GO386_softfloat + MOVUPS 140(SP), X7 + MOVUPS 124(SP), X6 + MOVUPS 108(SP), X5 + MOVUPS 92(SP), X4 + MOVUPS 76(SP), X3 + MOVUPS 60(SP), X2 + MOVUPS 44(SP), X1 + MOVUPS 28(SP), X0 + #endif + MOVL 24(SP), DI + MOVL 20(SP), SI + MOVL 16(SP), BP + MOVL 12(SP), BX + MOVL 8(SP), DX + MOVL 4(SP), CX + MOVL 0(SP), AX + ADJSP $-156 + POPFL + RET diff --git a/src/runtime/preempt_amd64.s b/src/runtime/preempt_amd64.s new file mode 100644 index 0000000..94a84fb --- /dev/null +++ b/src/runtime/preempt_amd64.s @@ -0,0 +1,87 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "asm_amd64.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + PUSHQ BP + MOVQ SP, BP + // Save flags before clobbering them + PUSHFQ + // obj doesn't understand ADD/SUB on SP, but does understand ADJSP + ADJSP $368 + // But vet doesn't know ADJSP, so suppress vet stack checking + NOP SP + MOVQ AX, 0(SP) + MOVQ CX, 8(SP) + MOVQ DX, 16(SP) + MOVQ BX, 24(SP) + MOVQ SI, 32(SP) + MOVQ DI, 40(SP) + MOVQ R8, 48(SP) + MOVQ R9, 56(SP) + MOVQ R10, 64(SP) + MOVQ R11, 72(SP) + MOVQ R12, 80(SP) + MOVQ R13, 88(SP) + MOVQ R14, 96(SP) + MOVQ R15, 104(SP) + #ifdef GOOS_darwin + #ifndef hasAVX + CMPB internal∕cpu·X86+const_offsetX86HasAVX(SB), $0 + JE 2(PC) + #endif + VZEROUPPER + #endif + MOVUPS X0, 112(SP) + MOVUPS X1, 128(SP) + MOVUPS X2, 144(SP) + MOVUPS X3, 160(SP) + MOVUPS X4, 176(SP) + MOVUPS X5, 192(SP) + MOVUPS X6, 208(SP) + MOVUPS X7, 224(SP) + MOVUPS X8, 240(SP) + MOVUPS X9, 256(SP) + MOVUPS X10, 272(SP) + MOVUPS X11, 288(SP) + MOVUPS X12, 304(SP) + MOVUPS X13, 320(SP) + MOVUPS X14, 336(SP) + MOVUPS X15, 352(SP) + CALL ·asyncPreempt2(SB) + MOVUPS 352(SP), X15 + MOVUPS 336(SP), X14 + MOVUPS 320(SP), X13 + MOVUPS 304(SP), X12 + MOVUPS 288(SP), X11 + MOVUPS 272(SP), X10 + MOVUPS 256(SP), X9 + MOVUPS 240(SP), X8 + MOVUPS 224(SP), X7 + MOVUPS 208(SP), X6 + MOVUPS 192(SP), X5 + MOVUPS 176(SP), X4 + MOVUPS 160(SP), X3 + MOVUPS 144(SP), X2 + MOVUPS 128(SP), X1 + MOVUPS 112(SP), X0 + MOVQ 104(SP), R15 + MOVQ 96(SP), R14 + MOVQ 88(SP), R13 + MOVQ 80(SP), R12 + MOVQ 72(SP), R11 + MOVQ 64(SP), R10 + MOVQ 56(SP), R9 + MOVQ 48(SP), R8 + MOVQ 40(SP), DI + MOVQ 32(SP), SI + MOVQ 24(SP), BX + MOVQ 16(SP), DX + MOVQ 8(SP), CX + MOVQ 0(SP), AX + ADJSP $-368 + POPFQ + POPQ BP + RET diff --git a/src/runtime/preempt_arm.s b/src/runtime/preempt_arm.s new file mode 100644 index 0000000..8f243c0 --- /dev/null +++ b/src/runtime/preempt_arm.s @@ -0,0 +1,83 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVW.W R14, -188(R13) + MOVW R0, 4(R13) + MOVW R1, 8(R13) + MOVW R2, 12(R13) + MOVW R3, 16(R13) + MOVW R4, 20(R13) + MOVW R5, 24(R13) + MOVW R6, 28(R13) + MOVW R7, 32(R13) + MOVW R8, 36(R13) + MOVW R9, 40(R13) + MOVW R11, 44(R13) + MOVW R12, 48(R13) + MOVW CPSR, R0 + MOVW R0, 52(R13) + MOVB ·goarm(SB), R0 + CMP $6, R0 + BLT nofp + MOVW FPCR, R0 + MOVW R0, 56(R13) + MOVD F0, 60(R13) + MOVD F1, 68(R13) + MOVD F2, 76(R13) + MOVD F3, 84(R13) + MOVD F4, 92(R13) + MOVD F5, 100(R13) + MOVD F6, 108(R13) + MOVD F7, 116(R13) + MOVD F8, 124(R13) + MOVD F9, 132(R13) + MOVD F10, 140(R13) + MOVD F11, 148(R13) + MOVD F12, 156(R13) + MOVD F13, 164(R13) + MOVD F14, 172(R13) + MOVD F15, 180(R13) +nofp: + CALL ·asyncPreempt2(SB) + MOVB ·goarm(SB), R0 + CMP $6, R0 + BLT nofp2 + MOVD 180(R13), F15 + MOVD 172(R13), F14 + MOVD 164(R13), F13 + MOVD 156(R13), F12 + MOVD 148(R13), F11 + MOVD 140(R13), F10 + MOVD 132(R13), F9 + MOVD 124(R13), F8 + MOVD 116(R13), F7 + MOVD 108(R13), F6 + MOVD 100(R13), F5 + MOVD 92(R13), F4 + MOVD 84(R13), F3 + MOVD 76(R13), F2 + MOVD 68(R13), F1 + MOVD 60(R13), F0 + MOVW 56(R13), R0 + MOVW R0, FPCR +nofp2: + MOVW 52(R13), R0 + MOVW R0, CPSR + MOVW 48(R13), R12 + MOVW 44(R13), R11 + MOVW 40(R13), R9 + MOVW 36(R13), R8 + MOVW 32(R13), R7 + MOVW 28(R13), R6 + MOVW 24(R13), R5 + MOVW 20(R13), R4 + MOVW 16(R13), R3 + MOVW 12(R13), R2 + MOVW 8(R13), R1 + MOVW 4(R13), R0 + MOVW 188(R13), R14 + MOVW.P 192(R13), R15 + UNDEF diff --git a/src/runtime/preempt_arm64.s b/src/runtime/preempt_arm64.s new file mode 100644 index 0000000..c27d475 --- /dev/null +++ b/src/runtime/preempt_arm64.s @@ -0,0 +1,85 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVD R30, -496(RSP) + SUB $496, RSP + MOVD R29, -8(RSP) + SUB $8, RSP, R29 + #ifdef GOOS_ios + MOVD R30, (RSP) + #endif + STP (R0, R1), 8(RSP) + STP (R2, R3), 24(RSP) + STP (R4, R5), 40(RSP) + STP (R6, R7), 56(RSP) + STP (R8, R9), 72(RSP) + STP (R10, R11), 88(RSP) + STP (R12, R13), 104(RSP) + STP (R14, R15), 120(RSP) + STP (R16, R17), 136(RSP) + STP (R19, R20), 152(RSP) + STP (R21, R22), 168(RSP) + STP (R23, R24), 184(RSP) + STP (R25, R26), 200(RSP) + MOVD NZCV, R0 + MOVD R0, 216(RSP) + MOVD FPSR, R0 + MOVD R0, 224(RSP) + FSTPD (F0, F1), 232(RSP) + FSTPD (F2, F3), 248(RSP) + FSTPD (F4, F5), 264(RSP) + FSTPD (F6, F7), 280(RSP) + FSTPD (F8, F9), 296(RSP) + FSTPD (F10, F11), 312(RSP) + FSTPD (F12, F13), 328(RSP) + FSTPD (F14, F15), 344(RSP) + FSTPD (F16, F17), 360(RSP) + FSTPD (F18, F19), 376(RSP) + FSTPD (F20, F21), 392(RSP) + FSTPD (F22, F23), 408(RSP) + FSTPD (F24, F25), 424(RSP) + FSTPD (F26, F27), 440(RSP) + FSTPD (F28, F29), 456(RSP) + FSTPD (F30, F31), 472(RSP) + CALL ·asyncPreempt2(SB) + FLDPD 472(RSP), (F30, F31) + FLDPD 456(RSP), (F28, F29) + FLDPD 440(RSP), (F26, F27) + FLDPD 424(RSP), (F24, F25) + FLDPD 408(RSP), (F22, F23) + FLDPD 392(RSP), (F20, F21) + FLDPD 376(RSP), (F18, F19) + FLDPD 360(RSP), (F16, F17) + FLDPD 344(RSP), (F14, F15) + FLDPD 328(RSP), (F12, F13) + FLDPD 312(RSP), (F10, F11) + FLDPD 296(RSP), (F8, F9) + FLDPD 280(RSP), (F6, F7) + FLDPD 264(RSP), (F4, F5) + FLDPD 248(RSP), (F2, F3) + FLDPD 232(RSP), (F0, F1) + MOVD 224(RSP), R0 + MOVD R0, FPSR + MOVD 216(RSP), R0 + MOVD R0, NZCV + LDP 200(RSP), (R25, R26) + LDP 184(RSP), (R23, R24) + LDP 168(RSP), (R21, R22) + LDP 152(RSP), (R19, R20) + LDP 136(RSP), (R16, R17) + LDP 120(RSP), (R14, R15) + LDP 104(RSP), (R12, R13) + LDP 88(RSP), (R10, R11) + LDP 72(RSP), (R8, R9) + LDP 56(RSP), (R6, R7) + LDP 40(RSP), (R4, R5) + LDP 24(RSP), (R2, R3) + LDP 8(RSP), (R0, R1) + MOVD 496(RSP), R30 + MOVD -8(RSP), 
R29 + MOVD (RSP), R27 + ADD $512, RSP + JMP (R27) diff --git a/src/runtime/preempt_loong64.s b/src/runtime/preempt_loong64.s new file mode 100644 index 0000000..ba59a07 --- /dev/null +++ b/src/runtime/preempt_loong64.s @@ -0,0 +1,129 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVV R1, -472(R3) + SUBV $472, R3 + MOVV R4, 8(R3) + MOVV R5, 16(R3) + MOVV R6, 24(R3) + MOVV R7, 32(R3) + MOVV R8, 40(R3) + MOVV R9, 48(R3) + MOVV R10, 56(R3) + MOVV R11, 64(R3) + MOVV R12, 72(R3) + MOVV R13, 80(R3) + MOVV R14, 88(R3) + MOVV R15, 96(R3) + MOVV R16, 104(R3) + MOVV R17, 112(R3) + MOVV R18, 120(R3) + MOVV R19, 128(R3) + MOVV R20, 136(R3) + MOVV R21, 144(R3) + MOVV R23, 152(R3) + MOVV R24, 160(R3) + MOVV R25, 168(R3) + MOVV R26, 176(R3) + MOVV R27, 184(R3) + MOVV R28, 192(R3) + MOVV R29, 200(R3) + MOVV RSB, 208(R3) + MOVD F0, 216(R3) + MOVD F1, 224(R3) + MOVD F2, 232(R3) + MOVD F3, 240(R3) + MOVD F4, 248(R3) + MOVD F5, 256(R3) + MOVD F6, 264(R3) + MOVD F7, 272(R3) + MOVD F8, 280(R3) + MOVD F9, 288(R3) + MOVD F10, 296(R3) + MOVD F11, 304(R3) + MOVD F12, 312(R3) + MOVD F13, 320(R3) + MOVD F14, 328(R3) + MOVD F15, 336(R3) + MOVD F16, 344(R3) + MOVD F17, 352(R3) + MOVD F18, 360(R3) + MOVD F19, 368(R3) + MOVD F20, 376(R3) + MOVD F21, 384(R3) + MOVD F22, 392(R3) + MOVD F23, 400(R3) + MOVD F24, 408(R3) + MOVD F25, 416(R3) + MOVD F26, 424(R3) + MOVD F27, 432(R3) + MOVD F28, 440(R3) + MOVD F29, 448(R3) + MOVD F30, 456(R3) + MOVD F31, 464(R3) + CALL ·asyncPreempt2(SB) + MOVD 464(R3), F31 + MOVD 456(R3), F30 + MOVD 448(R3), F29 + MOVD 440(R3), F28 + MOVD 432(R3), F27 + MOVD 424(R3), F26 + MOVD 416(R3), F25 + MOVD 408(R3), F24 + MOVD 400(R3), F23 + MOVD 392(R3), F22 + MOVD 384(R3), F21 + MOVD 376(R3), F20 + MOVD 368(R3), F19 + MOVD 360(R3), F18 + MOVD 352(R3), F17 + MOVD 344(R3), F16 + MOVD 336(R3), F15 + MOVD 328(R3), F14 + MOVD 320(R3), F13 + MOVD 312(R3), F12 + MOVD 304(R3), F11 + MOVD 
296(R3), F10 + MOVD 288(R3), F9 + MOVD 280(R3), F8 + MOVD 272(R3), F7 + MOVD 264(R3), F6 + MOVD 256(R3), F5 + MOVD 248(R3), F4 + MOVD 240(R3), F3 + MOVD 232(R3), F2 + MOVD 224(R3), F1 + MOVD 216(R3), F0 + MOVV 208(R3), RSB + MOVV 200(R3), R29 + MOVV 192(R3), R28 + MOVV 184(R3), R27 + MOVV 176(R3), R26 + MOVV 168(R3), R25 + MOVV 160(R3), R24 + MOVV 152(R3), R23 + MOVV 144(R3), R21 + MOVV 136(R3), R20 + MOVV 128(R3), R19 + MOVV 120(R3), R18 + MOVV 112(R3), R17 + MOVV 104(R3), R16 + MOVV 96(R3), R15 + MOVV 88(R3), R14 + MOVV 80(R3), R13 + MOVV 72(R3), R12 + MOVV 64(R3), R11 + MOVV 56(R3), R10 + MOVV 48(R3), R9 + MOVV 40(R3), R8 + MOVV 32(R3), R7 + MOVV 24(R3), R6 + MOVV 16(R3), R5 + MOVV 8(R3), R4 + MOVV 472(R3), R1 + MOVV (R3), R30 + ADDV $480, R3 + JMP (R30) diff --git a/src/runtime/preempt_mips64x.s b/src/runtime/preempt_mips64x.s new file mode 100644 index 0000000..996b592 --- /dev/null +++ b/src/runtime/preempt_mips64x.s @@ -0,0 +1,145 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +//go:build mips64 || mips64le + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVV R31, -488(R29) + SUBV $488, R29 + MOVV R1, 8(R29) + MOVV R2, 16(R29) + MOVV R3, 24(R29) + MOVV R4, 32(R29) + MOVV R5, 40(R29) + MOVV R6, 48(R29) + MOVV R7, 56(R29) + MOVV R8, 64(R29) + MOVV R9, 72(R29) + MOVV R10, 80(R29) + MOVV R11, 88(R29) + MOVV R12, 96(R29) + MOVV R13, 104(R29) + MOVV R14, 112(R29) + MOVV R15, 120(R29) + MOVV R16, 128(R29) + MOVV R17, 136(R29) + MOVV R18, 144(R29) + MOVV R19, 152(R29) + MOVV R20, 160(R29) + MOVV R21, 168(R29) + MOVV R22, 176(R29) + MOVV R24, 184(R29) + MOVV R25, 192(R29) + MOVV RSB, 200(R29) + MOVV HI, R1 + MOVV R1, 208(R29) + MOVV LO, R1 + MOVV R1, 216(R29) + #ifndef GOMIPS64_softfloat + MOVV FCR31, R1 + MOVV R1, 224(R29) + MOVD F0, 232(R29) + MOVD F1, 240(R29) + MOVD F2, 248(R29) + MOVD F3, 256(R29) + MOVD F4, 264(R29) + MOVD F5, 272(R29) + MOVD F6, 280(R29) + MOVD F7, 288(R29) + MOVD F8, 296(R29) + MOVD F9, 304(R29) + MOVD F10, 312(R29) + MOVD F11, 320(R29) + MOVD F12, 328(R29) + MOVD F13, 336(R29) + MOVD F14, 344(R29) + MOVD F15, 352(R29) + MOVD F16, 360(R29) + MOVD F17, 368(R29) + MOVD F18, 376(R29) + MOVD F19, 384(R29) + MOVD F20, 392(R29) + MOVD F21, 400(R29) + MOVD F22, 408(R29) + MOVD F23, 416(R29) + MOVD F24, 424(R29) + MOVD F25, 432(R29) + MOVD F26, 440(R29) + MOVD F27, 448(R29) + MOVD F28, 456(R29) + MOVD F29, 464(R29) + MOVD F30, 472(R29) + MOVD F31, 480(R29) + #endif + CALL ·asyncPreempt2(SB) + #ifndef GOMIPS64_softfloat + MOVD 480(R29), F31 + MOVD 472(R29), F30 + MOVD 464(R29), F29 + MOVD 456(R29), F28 + MOVD 448(R29), F27 + MOVD 440(R29), F26 + MOVD 432(R29), F25 + MOVD 424(R29), F24 + MOVD 416(R29), F23 + MOVD 408(R29), F22 + MOVD 400(R29), F21 + MOVD 392(R29), F20 + MOVD 384(R29), F19 + MOVD 376(R29), F18 + MOVD 368(R29), F17 + MOVD 360(R29), F16 + MOVD 352(R29), F15 + MOVD 344(R29), F14 + MOVD 336(R29), F13 + MOVD 328(R29), F12 + MOVD 320(R29), F11 + MOVD 312(R29), F10 + MOVD 
304(R29), F9 + MOVD 296(R29), F8 + MOVD 288(R29), F7 + MOVD 280(R29), F6 + MOVD 272(R29), F5 + MOVD 264(R29), F4 + MOVD 256(R29), F3 + MOVD 248(R29), F2 + MOVD 240(R29), F1 + MOVD 232(R29), F0 + MOVV 224(R29), R1 + MOVV R1, FCR31 + #endif + MOVV 216(R29), R1 + MOVV R1, LO + MOVV 208(R29), R1 + MOVV R1, HI + MOVV 200(R29), RSB + MOVV 192(R29), R25 + MOVV 184(R29), R24 + MOVV 176(R29), R22 + MOVV 168(R29), R21 + MOVV 160(R29), R20 + MOVV 152(R29), R19 + MOVV 144(R29), R18 + MOVV 136(R29), R17 + MOVV 128(R29), R16 + MOVV 120(R29), R15 + MOVV 112(R29), R14 + MOVV 104(R29), R13 + MOVV 96(R29), R12 + MOVV 88(R29), R11 + MOVV 80(R29), R10 + MOVV 72(R29), R9 + MOVV 64(R29), R8 + MOVV 56(R29), R7 + MOVV 48(R29), R6 + MOVV 40(R29), R5 + MOVV 32(R29), R4 + MOVV 24(R29), R3 + MOVV 16(R29), R2 + MOVV 8(R29), R1 + MOVV 488(R29), R31 + MOVV (R29), R23 + ADDV $496, R29 + JMP (R23) diff --git a/src/runtime/preempt_mipsx.s b/src/runtime/preempt_mipsx.s new file mode 100644 index 0000000..7b169ac --- /dev/null +++ b/src/runtime/preempt_mipsx.s @@ -0,0 +1,145 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +//go:build mips || mipsle + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVW R31, -244(R29) + SUB $244, R29 + MOVW R1, 4(R29) + MOVW R2, 8(R29) + MOVW R3, 12(R29) + MOVW R4, 16(R29) + MOVW R5, 20(R29) + MOVW R6, 24(R29) + MOVW R7, 28(R29) + MOVW R8, 32(R29) + MOVW R9, 36(R29) + MOVW R10, 40(R29) + MOVW R11, 44(R29) + MOVW R12, 48(R29) + MOVW R13, 52(R29) + MOVW R14, 56(R29) + MOVW R15, 60(R29) + MOVW R16, 64(R29) + MOVW R17, 68(R29) + MOVW R18, 72(R29) + MOVW R19, 76(R29) + MOVW R20, 80(R29) + MOVW R21, 84(R29) + MOVW R22, 88(R29) + MOVW R24, 92(R29) + MOVW R25, 96(R29) + MOVW R28, 100(R29) + MOVW HI, R1 + MOVW R1, 104(R29) + MOVW LO, R1 + MOVW R1, 108(R29) + #ifndef GOMIPS_softfloat + MOVW FCR31, R1 + MOVW R1, 112(R29) + MOVF F0, 116(R29) + MOVF F1, 120(R29) + MOVF F2, 124(R29) + MOVF F3, 128(R29) + MOVF F4, 132(R29) + MOVF F5, 136(R29) + MOVF F6, 140(R29) + MOVF F7, 144(R29) + MOVF F8, 148(R29) + MOVF F9, 152(R29) + MOVF F10, 156(R29) + MOVF F11, 160(R29) + MOVF F12, 164(R29) + MOVF F13, 168(R29) + MOVF F14, 172(R29) + MOVF F15, 176(R29) + MOVF F16, 180(R29) + MOVF F17, 184(R29) + MOVF F18, 188(R29) + MOVF F19, 192(R29) + MOVF F20, 196(R29) + MOVF F21, 200(R29) + MOVF F22, 204(R29) + MOVF F23, 208(R29) + MOVF F24, 212(R29) + MOVF F25, 216(R29) + MOVF F26, 220(R29) + MOVF F27, 224(R29) + MOVF F28, 228(R29) + MOVF F29, 232(R29) + MOVF F30, 236(R29) + MOVF F31, 240(R29) + #endif + CALL ·asyncPreempt2(SB) + #ifndef GOMIPS_softfloat + MOVF 240(R29), F31 + MOVF 236(R29), F30 + MOVF 232(R29), F29 + MOVF 228(R29), F28 + MOVF 224(R29), F27 + MOVF 220(R29), F26 + MOVF 216(R29), F25 + MOVF 212(R29), F24 + MOVF 208(R29), F23 + MOVF 204(R29), F22 + MOVF 200(R29), F21 + MOVF 196(R29), F20 + MOVF 192(R29), F19 + MOVF 188(R29), F18 + MOVF 184(R29), F17 + MOVF 180(R29), F16 + MOVF 176(R29), F15 + MOVF 172(R29), F14 + MOVF 168(R29), F13 + MOVF 164(R29), F12 + MOVF 160(R29), F11 + MOVF 156(R29), F10 + MOVF 152(R29), F9 + MOVF 
148(R29), F8 + MOVF 144(R29), F7 + MOVF 140(R29), F6 + MOVF 136(R29), F5 + MOVF 132(R29), F4 + MOVF 128(R29), F3 + MOVF 124(R29), F2 + MOVF 120(R29), F1 + MOVF 116(R29), F0 + MOVW 112(R29), R1 + MOVW R1, FCR31 + #endif + MOVW 108(R29), R1 + MOVW R1, LO + MOVW 104(R29), R1 + MOVW R1, HI + MOVW 100(R29), R28 + MOVW 96(R29), R25 + MOVW 92(R29), R24 + MOVW 88(R29), R22 + MOVW 84(R29), R21 + MOVW 80(R29), R20 + MOVW 76(R29), R19 + MOVW 72(R29), R18 + MOVW 68(R29), R17 + MOVW 64(R29), R16 + MOVW 60(R29), R15 + MOVW 56(R29), R14 + MOVW 52(R29), R13 + MOVW 48(R29), R12 + MOVW 44(R29), R11 + MOVW 40(R29), R10 + MOVW 36(R29), R9 + MOVW 32(R29), R8 + MOVW 28(R29), R7 + MOVW 24(R29), R6 + MOVW 20(R29), R5 + MOVW 16(R29), R4 + MOVW 12(R29), R3 + MOVW 8(R29), R2 + MOVW 4(R29), R1 + MOVW 244(R29), R31 + MOVW (R29), R23 + ADD $248, R29 + JMP (R23) diff --git a/src/runtime/preempt_nonwindows.go b/src/runtime/preempt_nonwindows.go new file mode 100644 index 0000000..d6a2408 --- /dev/null +++ b/src/runtime/preempt_nonwindows.go @@ -0,0 +1,13 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows + +package runtime + +//go:nosplit +func osPreemptExtEnter(mp *m) {} + +//go:nosplit +func osPreemptExtExit(mp *m) {} diff --git a/src/runtime/preempt_ppc64x.s b/src/runtime/preempt_ppc64x.s new file mode 100644 index 0000000..2c4d02e --- /dev/null +++ b/src/runtime/preempt_ppc64x.s @@ -0,0 +1,147 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +//go:build ppc64 || ppc64le + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOVD R31, -488(R1) + MOVD LR, R31 + MOVDU R31, -520(R1) + MOVD R3, 40(R1) + MOVD R4, 48(R1) + MOVD R5, 56(R1) + MOVD R6, 64(R1) + MOVD R7, 72(R1) + MOVD R8, 80(R1) + MOVD R9, 88(R1) + MOVD R10, 96(R1) + MOVD R11, 104(R1) + MOVD R14, 112(R1) + MOVD R15, 120(R1) + MOVD R16, 128(R1) + MOVD R17, 136(R1) + MOVD R18, 144(R1) + MOVD R19, 152(R1) + MOVD R20, 160(R1) + MOVD R21, 168(R1) + MOVD R22, 176(R1) + MOVD R23, 184(R1) + MOVD R24, 192(R1) + MOVD R25, 200(R1) + MOVD R26, 208(R1) + MOVD R27, 216(R1) + MOVD R28, 224(R1) + MOVD R29, 232(R1) + MOVW CR, R31 + MOVW R31, 240(R1) + MOVD XER, R31 + MOVD R31, 248(R1) + FMOVD F0, 256(R1) + FMOVD F1, 264(R1) + FMOVD F2, 272(R1) + FMOVD F3, 280(R1) + FMOVD F4, 288(R1) + FMOVD F5, 296(R1) + FMOVD F6, 304(R1) + FMOVD F7, 312(R1) + FMOVD F8, 320(R1) + FMOVD F9, 328(R1) + FMOVD F10, 336(R1) + FMOVD F11, 344(R1) + FMOVD F12, 352(R1) + FMOVD F13, 360(R1) + FMOVD F14, 368(R1) + FMOVD F15, 376(R1) + FMOVD F16, 384(R1) + FMOVD F17, 392(R1) + FMOVD F18, 400(R1) + FMOVD F19, 408(R1) + FMOVD F20, 416(R1) + FMOVD F21, 424(R1) + FMOVD F22, 432(R1) + FMOVD F23, 440(R1) + FMOVD F24, 448(R1) + FMOVD F25, 456(R1) + FMOVD F26, 464(R1) + FMOVD F27, 472(R1) + FMOVD F28, 480(R1) + FMOVD F29, 488(R1) + FMOVD F30, 496(R1) + FMOVD F31, 504(R1) + MOVFL FPSCR, F0 + FMOVD F0, 512(R1) + CALL ·asyncPreempt2(SB) + FMOVD 512(R1), F0 + MOVFL F0, FPSCR + FMOVD 504(R1), F31 + FMOVD 496(R1), F30 + FMOVD 488(R1), F29 + FMOVD 480(R1), F28 + FMOVD 472(R1), F27 + FMOVD 464(R1), F26 + FMOVD 456(R1), F25 + FMOVD 448(R1), F24 + FMOVD 440(R1), F23 + FMOVD 432(R1), F22 + FMOVD 424(R1), F21 + FMOVD 416(R1), F20 + FMOVD 408(R1), F19 + FMOVD 400(R1), F18 + FMOVD 392(R1), F17 + FMOVD 384(R1), F16 + FMOVD 376(R1), F15 + FMOVD 368(R1), F14 + FMOVD 360(R1), F13 + FMOVD 352(R1), F12 + FMOVD 344(R1), F11 + FMOVD 336(R1), F10 + FMOVD 328(R1), F9 + FMOVD 
320(R1), F8 + FMOVD 312(R1), F7 + FMOVD 304(R1), F6 + FMOVD 296(R1), F5 + FMOVD 288(R1), F4 + FMOVD 280(R1), F3 + FMOVD 272(R1), F2 + FMOVD 264(R1), F1 + FMOVD 256(R1), F0 + MOVD 248(R1), R31 + MOVD R31, XER + MOVW 240(R1), R31 + MOVFL R31, $0xff + MOVD 232(R1), R29 + MOVD 224(R1), R28 + MOVD 216(R1), R27 + MOVD 208(R1), R26 + MOVD 200(R1), R25 + MOVD 192(R1), R24 + MOVD 184(R1), R23 + MOVD 176(R1), R22 + MOVD 168(R1), R21 + MOVD 160(R1), R20 + MOVD 152(R1), R19 + MOVD 144(R1), R18 + MOVD 136(R1), R17 + MOVD 128(R1), R16 + MOVD 120(R1), R15 + MOVD 112(R1), R14 + MOVD 104(R1), R11 + MOVD 96(R1), R10 + MOVD 88(R1), R9 + MOVD 80(R1), R8 + MOVD 72(R1), R7 + MOVD 64(R1), R6 + MOVD 56(R1), R5 + MOVD 48(R1), R4 + MOVD 40(R1), R3 + MOVD 520(R1), R31 + MOVD R31, LR + MOVD 528(R1), R2 + MOVD 536(R1), R12 + MOVD (R1), R31 + MOVD R31, CTR + MOVD 32(R1), R31 + ADD $552, R1 + JMP (CTR) diff --git a/src/runtime/preempt_riscv64.s b/src/runtime/preempt_riscv64.s new file mode 100644 index 0000000..56df6c3 --- /dev/null +++ b/src/runtime/preempt_riscv64.s @@ -0,0 +1,127 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + MOV X1, -464(X2) + ADD $-464, X2 + MOV X5, 8(X2) + MOV X6, 16(X2) + MOV X7, 24(X2) + MOV X8, 32(X2) + MOV X9, 40(X2) + MOV X10, 48(X2) + MOV X11, 56(X2) + MOV X12, 64(X2) + MOV X13, 72(X2) + MOV X14, 80(X2) + MOV X15, 88(X2) + MOV X16, 96(X2) + MOV X17, 104(X2) + MOV X18, 112(X2) + MOV X19, 120(X2) + MOV X20, 128(X2) + MOV X21, 136(X2) + MOV X22, 144(X2) + MOV X23, 152(X2) + MOV X24, 160(X2) + MOV X25, 168(X2) + MOV X26, 176(X2) + MOV X28, 184(X2) + MOV X29, 192(X2) + MOV X30, 200(X2) + MOVD F0, 208(X2) + MOVD F1, 216(X2) + MOVD F2, 224(X2) + MOVD F3, 232(X2) + MOVD F4, 240(X2) + MOVD F5, 248(X2) + MOVD F6, 256(X2) + MOVD F7, 264(X2) + MOVD F8, 272(X2) + MOVD F9, 280(X2) + MOVD F10, 288(X2) + MOVD F11, 296(X2) + MOVD F12, 304(X2) + MOVD F13, 312(X2) + MOVD F14, 320(X2) + MOVD F15, 328(X2) + MOVD F16, 336(X2) + MOVD F17, 344(X2) + MOVD F18, 352(X2) + MOVD F19, 360(X2) + MOVD F20, 368(X2) + MOVD F21, 376(X2) + MOVD F22, 384(X2) + MOVD F23, 392(X2) + MOVD F24, 400(X2) + MOVD F25, 408(X2) + MOVD F26, 416(X2) + MOVD F27, 424(X2) + MOVD F28, 432(X2) + MOVD F29, 440(X2) + MOVD F30, 448(X2) + MOVD F31, 456(X2) + CALL ·asyncPreempt2(SB) + MOVD 456(X2), F31 + MOVD 448(X2), F30 + MOVD 440(X2), F29 + MOVD 432(X2), F28 + MOVD 424(X2), F27 + MOVD 416(X2), F26 + MOVD 408(X2), F25 + MOVD 400(X2), F24 + MOVD 392(X2), F23 + MOVD 384(X2), F22 + MOVD 376(X2), F21 + MOVD 368(X2), F20 + MOVD 360(X2), F19 + MOVD 352(X2), F18 + MOVD 344(X2), F17 + MOVD 336(X2), F16 + MOVD 328(X2), F15 + MOVD 320(X2), F14 + MOVD 312(X2), F13 + MOVD 304(X2), F12 + MOVD 296(X2), F11 + MOVD 288(X2), F10 + MOVD 280(X2), F9 + MOVD 272(X2), F8 + MOVD 264(X2), F7 + MOVD 256(X2), F6 + MOVD 248(X2), F5 + MOVD 240(X2), F4 + MOVD 232(X2), F3 + MOVD 224(X2), F2 + MOVD 216(X2), F1 + MOVD 208(X2), F0 + MOV 200(X2), X30 + MOV 192(X2), X29 + MOV 184(X2), X28 + MOV 176(X2), X26 + MOV 168(X2), X25 + MOV 160(X2), X24 + MOV 152(X2), 
X23 + MOV 144(X2), X22 + MOV 136(X2), X21 + MOV 128(X2), X20 + MOV 120(X2), X19 + MOV 112(X2), X18 + MOV 104(X2), X17 + MOV 96(X2), X16 + MOV 88(X2), X15 + MOV 80(X2), X14 + MOV 72(X2), X13 + MOV 64(X2), X12 + MOV 56(X2), X11 + MOV 48(X2), X10 + MOV 40(X2), X9 + MOV 32(X2), X8 + MOV 24(X2), X7 + MOV 16(X2), X6 + MOV 8(X2), X5 + MOV 464(X2), X1 + MOV (X2), X31 + ADD $472, X2 + JMP (X31) diff --git a/src/runtime/preempt_s390x.s b/src/runtime/preempt_s390x.s new file mode 100644 index 0000000..ca9e47c --- /dev/null +++ b/src/runtime/preempt_s390x.s @@ -0,0 +1,51 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. + +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + IPM R10 + MOVD R14, -248(R15) + ADD $-248, R15 + MOVW R10, 8(R15) + STMG R0, R12, 16(R15) + FMOVD F0, 120(R15) + FMOVD F1, 128(R15) + FMOVD F2, 136(R15) + FMOVD F3, 144(R15) + FMOVD F4, 152(R15) + FMOVD F5, 160(R15) + FMOVD F6, 168(R15) + FMOVD F7, 176(R15) + FMOVD F8, 184(R15) + FMOVD F9, 192(R15) + FMOVD F10, 200(R15) + FMOVD F11, 208(R15) + FMOVD F12, 216(R15) + FMOVD F13, 224(R15) + FMOVD F14, 232(R15) + FMOVD F15, 240(R15) + CALL ·asyncPreempt2(SB) + FMOVD 240(R15), F15 + FMOVD 232(R15), F14 + FMOVD 224(R15), F13 + FMOVD 216(R15), F12 + FMOVD 208(R15), F11 + FMOVD 200(R15), F10 + FMOVD 192(R15), F9 + FMOVD 184(R15), F8 + FMOVD 176(R15), F7 + FMOVD 168(R15), F6 + FMOVD 160(R15), F5 + FMOVD 152(R15), F4 + FMOVD 144(R15), F3 + FMOVD 136(R15), F2 + FMOVD 128(R15), F1 + FMOVD 120(R15), F0 + LMG 16(R15), R0, R12 + MOVD 248(R15), R14 + ADD $256, R15 + MOVWZ -248(R15), R10 + TMLH R10, $(3<<12) + MOVD -256(R15), R10 + JMP (R10) diff --git a/src/runtime/preempt_wasm.s b/src/runtime/preempt_wasm.s new file mode 100644 index 0000000..0cf57d3 --- /dev/null +++ b/src/runtime/preempt_wasm.s @@ -0,0 +1,8 @@ +// Code generated by mkpreempt.go; DO NOT EDIT. 
+ +#include "go_asm.h" +#include "textflag.h" + +TEXT ·asyncPreempt(SB),NOSPLIT|NOFRAME,$0-0 + // No async preemption on wasm + UNDEF diff --git a/src/runtime/print.go b/src/runtime/print.go new file mode 100644 index 0000000..a1e0b8e --- /dev/null +++ b/src/runtime/print.go @@ -0,0 +1,301 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +// The compiler knows that a print of a value of this type +// should use printhex instead of printuint (decimal). +type hex uint64 + +func bytes(s string) (ret []byte) { + rp := (*slice)(unsafe.Pointer(&ret)) + sp := stringStructOf(&s) + rp.array = sp.str + rp.len = sp.len + rp.cap = sp.len + return +} + +var ( + // printBacklog is a circular buffer of messages written with the builtin + // print* functions, for use in postmortem analysis of core dumps. + printBacklog [512]byte + printBacklogIndex int +) + +// recordForPanic maintains a circular buffer of messages written by the +// runtime leading up to a process crash, allowing the messages to be +// extracted from a core dump. +// +// The text written during a process crash (following "panic" or "fatal +// error") is not saved, since the goroutine stacks will generally be readable +// from the runtime datastructures in the core file. +func recordForPanic(b []byte) { + printlock() + + if panicking.Load() == 0 { + // Not actively crashing: maintain circular buffer of print output. + for i := 0; i < len(b); { + n := copy(printBacklog[printBacklogIndex:], b[i:]) + i += n + printBacklogIndex += n + printBacklogIndex %= len(printBacklog) + } + } + + printunlock() +} + +var debuglock mutex + +// The compiler emits calls to printlock and printunlock around +// the multiple calls that implement a single Go print or println +// statement. 
Some of the print helpers (printslice, for example) +// call print recursively. There is also the problem of a crash +// happening during the print routines and needing to acquire +// the print lock to print information about the crash. +// For both these reasons, let a thread acquire the printlock 'recursively'. + +func printlock() { + mp := getg().m + mp.locks++ // do not reschedule between printlock++ and lock(&debuglock). + mp.printlock++ + if mp.printlock == 1 { + lock(&debuglock) + } + mp.locks-- // now we know debuglock is held and holding up mp.locks for us. +} + +func printunlock() { + mp := getg().m + mp.printlock-- + if mp.printlock == 0 { + unlock(&debuglock) + } +} + +// write to goroutine-local buffer if diverting output, +// or else standard error. +func gwrite(b []byte) { + if len(b) == 0 { + return + } + recordForPanic(b) + gp := getg() + // Don't use the writebuf if gp.m is dying. We want anything + // written through gwrite to appear in the terminal rather + // than be written to in some buffer, if we're in a panicking state. + // Note that we can't just clear writebuf in the gp.m.dying case + // because a panic isn't allowed to have any write barriers. 
+ if gp == nil || gp.writebuf == nil || gp.m.dying > 0 { + writeErr(b) + return + } + + n := copy(gp.writebuf[len(gp.writebuf):cap(gp.writebuf)], b) + gp.writebuf = gp.writebuf[:len(gp.writebuf)+n] +} + +func printsp() { + printstring(" ") +} + +func printnl() { + printstring("\n") +} + +func printbool(v bool) { + if v { + printstring("true") + } else { + printstring("false") + } +} + +func printfloat(v float64) { + switch { + case v != v: + printstring("NaN") + return + case v+v == v && v > 0: + printstring("+Inf") + return + case v+v == v && v < 0: + printstring("-Inf") + return + } + + const n = 7 // digits printed + var buf [n + 7]byte + buf[0] = '+' + e := 0 // exp + if v == 0 { + if 1/v < 0 { + buf[0] = '-' + } + } else { + if v < 0 { + v = -v + buf[0] = '-' + } + + // normalize + for v >= 10 { + e++ + v /= 10 + } + for v < 1 { + e-- + v *= 10 + } + + // round + h := 5.0 + for i := 0; i < n; i++ { + h /= 10 + } + v += h + if v >= 10 { + e++ + v /= 10 + } + } + + // format +d.dddd+edd + for i := 0; i < n; i++ { + s := int(v) + buf[i+2] = byte(s + '0') + v -= float64(s) + v *= 10 + } + buf[1] = buf[2] + buf[2] = '.' 
+ + buf[n+2] = 'e' + buf[n+3] = '+' + if e < 0 { + e = -e + buf[n+3] = '-' + } + + buf[n+4] = byte(e/100) + '0' + buf[n+5] = byte(e/10)%10 + '0' + buf[n+6] = byte(e%10) + '0' + gwrite(buf[:]) +} + +func printcomplex(c complex128) { + print("(", real(c), imag(c), "i)") +} + +func printuint(v uint64) { + var buf [100]byte + i := len(buf) + for i--; i > 0; i-- { + buf[i] = byte(v%10 + '0') + if v < 10 { + break + } + v /= 10 + } + gwrite(buf[i:]) +} + +func printint(v int64) { + if v < 0 { + printstring("-") + v = -v + } + printuint(uint64(v)) +} + +var minhexdigits = 0 // protected by printlock + +func printhex(v uint64) { + const dig = "0123456789abcdef" + var buf [100]byte + i := len(buf) + for i--; i > 0; i-- { + buf[i] = dig[v%16] + if v < 16 && len(buf)-i >= minhexdigits { + break + } + v /= 16 + } + i-- + buf[i] = 'x' + i-- + buf[i] = '0' + gwrite(buf[i:]) +} + +func printpointer(p unsafe.Pointer) { + printhex(uint64(uintptr(p))) +} +func printuintptr(p uintptr) { + printhex(uint64(p)) +} + +func printstring(s string) { + gwrite(bytes(s)) +} + +func printslice(s []byte) { + sp := (*slice)(unsafe.Pointer(&s)) + print("[", len(s), "/", cap(s), "]") + printpointer(sp.array) +} + +func printeface(e eface) { + print("(", e._type, ",", e.data, ")") +} + +func printiface(i iface) { + print("(", i.tab, ",", i.data, ")") +} + +// hexdumpWords prints a word-oriented hex dump of [p, end). +// +// If mark != nil, it will be called with each printed word's address +// and should return a character mark to appear just before that +// word's value. It can return 0 to indicate no mark. 
+func hexdumpWords(p, end uintptr, mark func(uintptr) byte) { + printlock() + var markbuf [1]byte + markbuf[0] = ' ' + minhexdigits = int(unsafe.Sizeof(uintptr(0)) * 2) + for i := uintptr(0); p+i < end; i += goarch.PtrSize { + if i%16 == 0 { + if i != 0 { + println() + } + print(hex(p+i), ": ") + } + + if mark != nil { + markbuf[0] = mark(p + i) + if markbuf[0] == 0 { + markbuf[0] = ' ' + } + } + gwrite(markbuf[:]) + val := *(*uintptr)(unsafe.Pointer(p + i)) + print(hex(val)) + print(" ") + + // Can we symbolize val? + fn := findfunc(val) + if fn.valid() { + print("<", funcname(fn), "+", hex(val-fn.entry()), "> ") + } + } + minhexdigits = 0 + println() + printunlock() +} diff --git a/src/runtime/proc.go b/src/runtime/proc.go new file mode 100644 index 0000000..c1e45a4 --- /dev/null +++ b/src/runtime/proc.go @@ -0,0 +1,6549 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/cpu" + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// set using cmd/go/internal/modload.ModInfoProg +var modinfo string + +// Goroutine scheduler +// The scheduler's job is to distribute ready-to-run goroutines over worker threads. +// +// The main concepts are: +// G - goroutine. +// M - worker thread, or machine. +// P - processor, a resource that is required to execute Go code. +// M must have an associated P to execute Go code, however it can be +// blocked or in a syscall w/o an associated P. +// +// Design doc at https://golang.org/s/go11sched. + +// Worker thread parking/unparking. +// We need to balance between keeping enough running worker threads to utilize +// available hardware parallelism and parking excessive running worker threads +// to conserve CPU resources and power. 
This is not simple for two reasons: +// (1) scheduler state is intentionally distributed (in particular, per-P work +// queues), so it is not possible to compute global predicates on fast paths; +// (2) for optimal thread management we would need to know the future (don't park +// a worker thread when a new goroutine will be readied in near future). +// +// Three rejected approaches that would work badly: +// 1. Centralize all scheduler state (would inhibit scalability). +// 2. Direct goroutine handoff. That is, when we ready a new goroutine and there +// is a spare P, unpark a thread and handoff it the thread and the goroutine. +// This would lead to thread state thrashing, as the thread that readied the +// goroutine can be out of work the very next moment, we will need to park it. +// Also, it would destroy locality of computation as we want to preserve +// dependent goroutines on the same thread; and introduce additional latency. +// 3. Unpark an additional thread whenever we ready a goroutine and there is an +// idle P, but don't do handoff. This would lead to excessive thread parking/ +// unparking as the additional threads will instantly park without discovering +// any work to do. +// +// The current approach: +// +// This approach applies to three primary sources of potential work: readying a +// goroutine, new/modified-earlier timers, and idle-priority GC. See below for +// additional details. +// +// We unpark an additional thread when we submit work if (this is wakep()): +// 1. There is an idle P, and +// 2. There are no "spinning" worker threads. +// +// A worker thread is considered spinning if it is out of local work and did +// not find work in the global run queue or netpoller; the spinning state is +// denoted in m.spinning and in sched.nmspinning. Threads unparked this way are +// also considered spinning; we don't do goroutine handoff so such threads are +// out of work initially. 
Spinning threads spin on looking for work in per-P +// run queues and timer heaps or from the GC before parking. If a spinning +// thread finds work it takes itself out of the spinning state and proceeds to +// execution. If it does not find work it takes itself out of the spinning +// state and then parks. +// +// If there is at least one spinning thread (sched.nmspinning>0), we don't +// unpark new threads when submitting work. To compensate for that, if the last +// spinning thread finds work and stops spinning, it must unpark a new spinning +// thread. This approach smooths out unjustified spikes of thread unparking, +// but at the same time guarantees eventual maximal CPU parallelism +// utilization. +// +// The main implementation complication is that we need to be very careful +// during spinning->non-spinning thread transition. This transition can race +// with submission of new work, and either one part or another needs to unpark +// another worker thread. If they both fail to do that, we can end up with +// semi-persistent CPU underutilization. +// +// The general pattern for submission is: +// 1. Submit work to the local run queue, timer heap, or GC state. +// 2. #StoreLoad-style memory barrier. +// 3. Check sched.nmspinning. +// +// The general pattern for spinning->non-spinning transition is: +// 1. Decrement nmspinning. +// 2. #StoreLoad-style memory barrier. +// 3. Check all per-P work queues and GC for new work. +// +// Note that all this complexity does not apply to global run queue as we are +// not sloppy about thread unparking when submitting to global queue. Also see +// comments for nmspinning manipulation. +// +// How these different sources of work behave varies, though it doesn't affect +// the synchronization approach: +// * Ready goroutine: this is an obvious source of work; the goroutine is +// immediately ready and must run on some thread eventually.
+// * New/modified-earlier timer: The current timer implementation (see time.go) +// uses netpoll in a thread with no work available to wait for the soonest +// timer. If there is no thread waiting, we want a new spinning thread to go +// wait. +// * Idle-priority GC: The GC wakes a stopped idle thread to contribute to +// background GC work (note: currently disabled per golang.org/issue/19112). +// Also see golang.org/issue/44313, as this should be extended to all GC +// workers. + +var ( + m0 m + g0 g + mcache0 *mcache + raceprocctx0 uintptr +) + +//go:linkname runtime_inittask runtime..inittask +var runtime_inittask initTask + +//go:linkname main_inittask main..inittask +var main_inittask initTask + +// main_init_done is a signal used by cgocallbackg that initialization +// has been completed. It is made before _cgo_notify_runtime_init_done, +// so all cgo calls can rely on it existing. When main_init is complete, +// it is closed, meaning cgocallbackg can reliably receive from it. +var main_init_done chan bool + +//go:linkname main_main main.main +func main_main() + +// mainStarted indicates that the main M has started. +var mainStarted bool + +// runtimeInitTime is the nanotime() at which the runtime started. +var runtimeInitTime int64 + +// Value to use for signal mask for newly created M's. +var initSigmask sigset + +// The main goroutine. +func main() { + mp := getg().m + + // Racectx of m0->g0 is used only as the parent of the main goroutine. + // It must not be used for anything else. + mp.g0.racectx = 0 + + // Max stack size is 1 GB on 64-bit, 250 MB on 32-bit. + // Using decimal instead of binary GB and MB because + // they look nicer in the stack overflow failure message. + if goarch.PtrSize == 8 { + maxstacksize = 1000000000 + } else { + maxstacksize = 250000000 + } + + // An upper limit for max stack size. 
Used to avoid random crashes + // after calling SetMaxStack and trying to allocate a stack that is too big, + // since stackalloc works with 32-bit sizes. + maxstackceiling = 2 * maxstacksize + + // Allow newproc to start new Ms. + mainStarted = true + + if GOARCH != "wasm" { // no threads on wasm yet, so no sysmon + systemstack(func() { + newm(sysmon, nil, -1) + }) + } + + // Lock the main goroutine onto this, the main OS thread, + // during initialization. Most programs won't care, but a few + // do require certain calls to be made by the main thread. + // Those can arrange for main.main to run in the main thread + // by calling runtime.LockOSThread during initialization + // to preserve the lock. + lockOSThread() + + if mp != &m0 { + throw("runtime.main not on m0") + } + + // Record when the world started. + // Must be before doInit for tracing init. + runtimeInitTime = nanotime() + if runtimeInitTime == 0 { + throw("nanotime returning zero") + } + + if debug.inittrace != 0 { + inittrace.id = getg().goid + inittrace.active = true + } + + doInit(&runtime_inittask) // Must be before defer. + + // Defer unlock so that runtime.Goexit during init does the unlock too. + needUnlock := true + defer func() { + if needUnlock { + unlockOSThread() + } + }() + + gcenable() + + main_init_done = make(chan bool) + if iscgo { + if _cgo_thread_start == nil { + throw("_cgo_thread_start missing") + } + if GOOS != "windows" { + if _cgo_setenv == nil { + throw("_cgo_setenv missing") + } + if _cgo_unsetenv == nil { + throw("_cgo_unsetenv missing") + } + } + if _cgo_notify_runtime_init_done == nil { + throw("_cgo_notify_runtime_init_done missing") + } + // Start the template thread in case we enter Go from + // a C-created thread and need to create a new thread. 
+ startTemplateThread() + cgocall(_cgo_notify_runtime_init_done, nil) + } + + doInit(&main_inittask) + + // Disable init tracing after main init done to avoid overhead + // of collecting statistics in malloc and newproc + inittrace.active = false + + close(main_init_done) + + needUnlock = false + unlockOSThread() + + if isarchive || islibrary { + // A program compiled with -buildmode=c-archive or c-shared + // has a main, but it is not executed. + return + } + fn := main_main // make an indirect call, as the linker doesn't know the address of the main package when laying down the runtime + fn() + if raceenabled { + runExitHooks(0) // run hooks now, since racefini does not return + racefini() + } + + // Make racy client program work: if panicking on + // another goroutine at the same time as main returns, + // let the other goroutine finish printing the panic trace. + // Once it does, it will exit. See issues 3934 and 20018. + if runningPanicDefers.Load() != 0 { + // Running deferred functions should not take long. + for c := 0; c < 1000; c++ { + if runningPanicDefers.Load() == 0 { + break + } + Gosched() + } + } + if panicking.Load() != 0 { + gopark(nil, nil, waitReasonPanicWait, traceEvGoStop, 1) + } + runExitHooks(0) + + exit(0) + for { + var x *int32 + *x = 0 + } +} + +// os_beforeExit is called from os.Exit(0). 
+// +//go:linkname os_beforeExit os.runtime_beforeExit +func os_beforeExit(exitCode int) { + runExitHooks(exitCode) + if exitCode == 0 && raceenabled { + racefini() + } +} + +// start forcegc helper goroutine +func init() { + go forcegchelper() +} + +func forcegchelper() { + forcegc.g = getg() + lockInit(&forcegc.lock, lockRankForcegc) + for { + lock(&forcegc.lock) + if forcegc.idle.Load() { + throw("forcegc: phase error") + } + forcegc.idle.Store(true) + goparkunlock(&forcegc.lock, waitReasonForceGCIdle, traceEvGoBlock, 1) + // this goroutine is explicitly resumed by sysmon + if debug.gctrace > 0 { + println("GC forced") + } + // Time-triggered, fully concurrent. + gcStart(gcTrigger{kind: gcTriggerTime, now: nanotime()}) + } +} + +//go:nosplit + +// Gosched yields the processor, allowing other goroutines to run. It does not +// suspend the current goroutine, so execution resumes automatically. +func Gosched() { + checkTimeouts() + mcall(gosched_m) +} + +// goschedguarded yields the processor like gosched, but also checks +// for forbidden states and opts out of the yield in those cases. +// +//go:nosplit +func goschedguarded() { + mcall(goschedguarded_m) +} + +// goschedIfBusy yields the processor like gosched, but only does so if +// there are no idle Ps or if we're on the only P and there's nothing in +// the run queue. In both cases, there is freely available idle time. +// +//go:nosplit +func goschedIfBusy() { + gp := getg() + // Call gosched if gp.preempt is set; we may be in a tight loop that + // doesn't otherwise yield. + if !gp.preempt && sched.npidle.Load() > 0 { + return + } + mcall(gosched_m) +} + +// Puts the current goroutine into a waiting state and calls unlockf on the +// system stack. +// +// If unlockf returns false, the goroutine is resumed. +// +// unlockf must not access this G's stack, as it may be moved between +// the call to gopark and the call to unlockf. 
+// +// Note that because unlockf is called after putting the G into a waiting +// state, the G may have already been readied by the time unlockf is called +// unless there is external synchronization preventing the G from being +// readied. If unlockf returns false, it must guarantee that the G cannot be +// externally readied. +// +// Reason explains why the goroutine has been parked. It is displayed in stack +// traces and heap dumps. Reasons should be unique and descriptive. Do not +// re-use reasons, add new ones. +func gopark(unlockf func(*g, unsafe.Pointer) bool, lock unsafe.Pointer, reason waitReason, traceEv byte, traceskip int) { + if reason != waitReasonSleep { + checkTimeouts() // timeouts may expire while two goroutines keep the scheduler busy + } + mp := acquirem() + gp := mp.curg + status := readgstatus(gp) + if status != _Grunning && status != _Gscanrunning { + throw("gopark: bad g status") + } + mp.waitlock = lock + mp.waitunlockf = unlockf + gp.waitreason = reason + mp.waittraceev = traceEv + mp.waittraceskip = traceskip + releasem(mp) + // can't do anything that might move the G between Ms here. + mcall(park_m) +} + +// Puts the current goroutine into a waiting state and unlocks the lock. +// The goroutine can be made runnable again by calling goready(gp). +func goparkunlock(lock *mutex, reason waitReason, traceEv byte, traceskip int) { + gopark(parkunlock_c, unsafe.Pointer(lock), reason, traceEv, traceskip) +} + +func goready(gp *g, traceskip int) { + systemstack(func() { + ready(gp, traceskip, true) + }) +} + +//go:nosplit +func acquireSudog() *sudog { + // Delicate dance: the semaphore implementation calls + // acquireSudog, acquireSudog calls new(sudog), + // new calls malloc, malloc can call the garbage collector, + // and the garbage collector calls the semaphore implementation + // in stopTheWorld. + // Break the cycle by doing acquirem/releasem around new(sudog). 
+ // The acquirem/releasem increments m.locks during new(sudog), + // which keeps the garbage collector from being invoked. + mp := acquirem() + pp := mp.p.ptr() + if len(pp.sudogcache) == 0 { + lock(&sched.sudoglock) + // First, try to grab a batch from central cache. + for len(pp.sudogcache) < cap(pp.sudogcache)/2 && sched.sudogcache != nil { + s := sched.sudogcache + sched.sudogcache = s.next + s.next = nil + pp.sudogcache = append(pp.sudogcache, s) + } + unlock(&sched.sudoglock) + // If the central cache is empty, allocate a new one. + if len(pp.sudogcache) == 0 { + pp.sudogcache = append(pp.sudogcache, new(sudog)) + } + } + n := len(pp.sudogcache) + s := pp.sudogcache[n-1] + pp.sudogcache[n-1] = nil + pp.sudogcache = pp.sudogcache[:n-1] + if s.elem != nil { + throw("acquireSudog: found s.elem != nil in cache") + } + releasem(mp) + return s +} + +//go:nosplit +func releaseSudog(s *sudog) { + if s.elem != nil { + throw("runtime: sudog with non-nil elem") + } + if s.isSelect { + throw("runtime: sudog with non-false isSelect") + } + if s.next != nil { + throw("runtime: sudog with non-nil next") + } + if s.prev != nil { + throw("runtime: sudog with non-nil prev") + } + if s.waitlink != nil { + throw("runtime: sudog with non-nil waitlink") + } + if s.c != nil { + throw("runtime: sudog with non-nil c") + } + gp := getg() + if gp.param != nil { + throw("runtime: releaseSudog with non-nil gp.param") + } + mp := acquirem() // avoid rescheduling to another P + pp := mp.p.ptr() + if len(pp.sudogcache) == cap(pp.sudogcache) { + // Transfer half of local cache to the central cache. 
+ var first, last *sudog + for len(pp.sudogcache) > cap(pp.sudogcache)/2 { + n := len(pp.sudogcache) + p := pp.sudogcache[n-1] + pp.sudogcache[n-1] = nil + pp.sudogcache = pp.sudogcache[:n-1] + if first == nil { + first = p + } else { + last.next = p + } + last = p + } + lock(&sched.sudoglock) + last.next = sched.sudogcache + sched.sudogcache = first + unlock(&sched.sudoglock) + } + pp.sudogcache = append(pp.sudogcache, s) + releasem(mp) +} + +// called from assembly. +func badmcall(fn func(*g)) { + throw("runtime: mcall called on m->g0 stack") +} + +func badmcall2(fn func(*g)) { + throw("runtime: mcall function returned") +} + +func badreflectcall() { + panic(plainError("arg size to reflect.call more than 1GB")) +} + +//go:nosplit +//go:nowritebarrierrec +func badmorestackg0() { + writeErrStr("fatal: morestack on g0\n") +} + +//go:nosplit +//go:nowritebarrierrec +func badmorestackgsignal() { + writeErrStr("fatal: morestack on gsignal\n") +} + +//go:nosplit +func badctxt() { + throw("ctxt != 0") +} + +func lockedOSThread() bool { + gp := getg() + return gp.lockedm != 0 && gp.m.lockedg != 0 +} + +var ( + // allgs contains all Gs ever created (including dead Gs), and thus + // never shrinks. + // + // Access via the slice is protected by allglock or stop-the-world. + // Readers that cannot take the lock may (carefully!) use the atomic + // variables below. + allglock mutex + allgs []*g + + // allglen and allgptr are atomic variables that contain len(allgs) and + // &allgs[0] respectively. Proper ordering depends on totally-ordered + // loads and stores. Writes are protected by allglock. + // + // allgptr is updated before allglen. Readers should read allglen + // before allgptr to ensure that allglen is always <= len(allgptr). New + // Gs appended during the race can be missed. For a consistent view of + // all Gs, allglock must be held. 
+ // + // allgptr copies should always be stored as a concrete type or + // unsafe.Pointer, not uintptr, to ensure that GC can still reach it + // even if it points to a stale array. + allglen uintptr + allgptr **g +) + +func allgadd(gp *g) { + if readgstatus(gp) == _Gidle { + throw("allgadd: bad status Gidle") + } + + lock(&allglock) + allgs = append(allgs, gp) + if &allgs[0] != allgptr { + atomicstorep(unsafe.Pointer(&allgptr), unsafe.Pointer(&allgs[0])) + } + atomic.Storeuintptr(&allglen, uintptr(len(allgs))) + unlock(&allglock) +} + +// allGsSnapshot returns a snapshot of the slice of all Gs. +// +// The world must be stopped or allglock must be held. +func allGsSnapshot() []*g { + assertWorldStoppedOrLockHeld(&allglock) + + // Because the world is stopped or allglock is held, allgadd + // cannot happen concurrently with this. allgs grows + // monotonically and existing entries never change, so we can + // simply return a copy of the slice header. For added safety, + // we trim everything past len because that can still change. + return allgs[:len(allgs):len(allgs)] +} + +// atomicAllG returns &allgs[0] and len(allgs) for use with atomicAllGIndex. +func atomicAllG() (**g, uintptr) { + length := atomic.Loaduintptr(&allglen) + ptr := (**g)(atomic.Loadp(unsafe.Pointer(&allgptr))) + return ptr, length +} + +// atomicAllGIndex returns ptr[i] with the allgptr returned from atomicAllG. +func atomicAllGIndex(ptr **g, i uintptr) *g { + return *(**g)(add(unsafe.Pointer(ptr), i*goarch.PtrSize)) +} + +// forEachG calls fn on every G from allgs. +// +// forEachG takes a lock to exclude concurrent addition of new Gs. +func forEachG(fn func(gp *g)) { + lock(&allglock) + for _, gp := range allgs { + fn(gp) + } + unlock(&allglock) +} + +// forEachGRace calls fn on every G from allgs. +// +// forEachGRace avoids locking, but does not exclude addition of new Gs during +// execution, which may be missed. 
+func forEachGRace(fn func(gp *g)) { + ptr, length := atomicAllG() + for i := uintptr(0); i < length; i++ { + gp := atomicAllGIndex(ptr, i) + fn(gp) + } + return +} + +const ( + // Number of goroutine ids to grab from sched.goidgen to local per-P cache at once. + // 16 seems to provide enough amortization, but other than that it's mostly arbitrary number. + _GoidCacheBatch = 16 +) + +// cpuinit sets up CPU feature flags and calls internal/cpu.Initialize. env should be the complete +// value of the GODEBUG environment variable. +func cpuinit(env string) { + switch GOOS { + case "aix", "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd", "illumos", "solaris", "linux": + cpu.DebugOptions = true + } + cpu.Initialize(env) + + // Support cpu feature variables are used in code generated by the compiler + // to guard execution of instructions that can not be assumed to be always supported. + switch GOARCH { + case "386", "amd64": + x86HasPOPCNT = cpu.X86.HasPOPCNT + x86HasSSE41 = cpu.X86.HasSSE41 + x86HasFMA = cpu.X86.HasFMA + + case "arm": + armHasVFPv4 = cpu.ARM.HasVFPv4 + + case "arm64": + arm64HasATOMICS = cpu.ARM64.HasATOMICS + } +} + +// getGodebugEarly extracts the environment variable GODEBUG from the environment on +// Unix-like operating systems and returns it. This function exists to extract GODEBUG +// early before much of the runtime is initialized. +func getGodebugEarly() string { + const prefix = "GODEBUG=" + var env string + switch GOOS { + case "aix", "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd", "illumos", "solaris", "linux": + // Similar to goenv_unix but extracts the environment value for + // GODEBUG directly. 
+ // TODO(moehrmann): remove when general goenvs() can be called before cpuinit() + n := int32(0) + for argv_index(argv, argc+1+n) != nil { + n++ + } + + for i := int32(0); i < n; i++ { + p := argv_index(argv, argc+1+i) + s := unsafe.String(p, findnull(p)) + + if hasPrefix(s, prefix) { + env = gostring(p)[len(prefix):] + break + } + } + } + return env +} + +// The bootstrap sequence is: +// +// call osinit +// call schedinit +// make & queue new G +// call runtime·mstart +// +// The new G calls runtime·main. +func schedinit() { + lockInit(&sched.lock, lockRankSched) + lockInit(&sched.sysmonlock, lockRankSysmon) + lockInit(&sched.deferlock, lockRankDefer) + lockInit(&sched.sudoglock, lockRankSudog) + lockInit(&deadlock, lockRankDeadlock) + lockInit(&paniclk, lockRankPanic) + lockInit(&allglock, lockRankAllg) + lockInit(&allpLock, lockRankAllp) + lockInit(&reflectOffs.lock, lockRankReflectOffs) + lockInit(&finlock, lockRankFin) + lockInit(&trace.bufLock, lockRankTraceBuf) + lockInit(&trace.stringsLock, lockRankTraceStrings) + lockInit(&trace.lock, lockRankTrace) + lockInit(&cpuprof.lock, lockRankCpuprof) + lockInit(&trace.stackTab.lock, lockRankTraceStackTab) + allocmLock.init(lockRankAllocmR, lockRankAllocmRInternal, lockRankAllocmW) + execLock.init(lockRankExecR, lockRankExecRInternal, lockRankExecW) + // Enforce that this lock is always a leaf lock. + // All of this lock's critical sections should be + // extremely short. + lockInit(&memstats.heapStats.noPLock, lockRankLeafRank) + + // raceinit must be the first call to race detector. + // In particular, it must be done before mallocinit below calls racemapshadow. + gp := getg() + if raceenabled { + gp.racectx, raceprocctx0 = raceinit() + } + + sched.maxmcount = 10000 + + // The world starts stopped. 
+ worldStopped() + + moduledataverify() + stackinit() + mallocinit() + godebug := getGodebugEarly() + initPageTrace(godebug) // must run after mallocinit but before anything allocates + cpuinit(godebug) // must run before alginit + alginit() // maps, hash, fastrand must not be used before this call + fastrandinit() // must run before mcommoninit + mcommoninit(gp.m, -1) + modulesinit() // provides activeModules + typelinksinit() // uses maps, activeModules + itabsinit() // uses activeModules + stkobjinit() // must run before GC starts + + sigsave(&gp.m.sigmask) + initSigmask = gp.m.sigmask + + goargs() + goenvs() + secure() + parsedebugvars() + gcinit() + + // if disableMemoryProfiling is set, update MemProfileRate to 0 to turn off memprofile. + // Note: parsedebugvars may update MemProfileRate, but when disableMemoryProfiling is + // set to true by the linker, it means that nothing is consuming the profile, it is + // safe to set MemProfileRate to 0. + if disableMemoryProfiling { + MemProfileRate = 0 + } + + lock(&sched.lock) + sched.lastpoll.Store(nanotime()) + procs := ncpu + if n, ok := atoi32(gogetenv("GOMAXPROCS")); ok && n > 0 { + procs = n + } + if procresize(procs) != nil { + throw("unknown runnable goroutine during bootstrap") + } + unlock(&sched.lock) + + // World is effectively started now, as P's can run. + worldStarted() + + // For cgocheck > 1, we turn on the write barrier at all times + // and check all pointer writes. We can't do this until after + // procresize because the write barrier needs a P. + if debug.cgocheck > 1 { + writeBarrier.cgo = true + writeBarrier.enabled = true + for _, pp := range allp { + pp.wbBuf.reset() + } + } + + if buildVersion == "" { + // Condition should never trigger. This code just serves + // to ensure runtime·buildVersion is kept in the resulting binary. + buildVersion = "unknown" + } + if len(modinfo) == 1 { + // Condition should never trigger. 
This code just serves + // to ensure runtime·modinfo is kept in the resulting binary. + modinfo = "" + } +} + +func dumpgstatus(gp *g) { + thisg := getg() + print("runtime: gp: gp=", gp, ", goid=", gp.goid, ", gp->atomicstatus=", readgstatus(gp), "\n") + print("runtime: getg: g=", thisg, ", goid=", thisg.goid, ", g->atomicstatus=", readgstatus(thisg), "\n") +} + +// sched.lock must be held. +func checkmcount() { + assertLockHeld(&sched.lock) + + if mcount() > sched.maxmcount { + print("runtime: program exceeds ", sched.maxmcount, "-thread limit\n") + throw("thread exhaustion") + } +} + +// mReserveID returns the next ID to use for a new m. This new m is immediately +// considered 'running' by checkdead. +// +// sched.lock must be held. +func mReserveID() int64 { + assertLockHeld(&sched.lock) + + if sched.mnext+1 < sched.mnext { + throw("runtime: thread ID overflow") + } + id := sched.mnext + sched.mnext++ + checkmcount() + return id +} + +// Pre-allocated ID may be passed as 'id', or omitted by passing -1. +func mcommoninit(mp *m, id int64) { + gp := getg() + + // g0 stack won't make sense for user (and is not necessarily unwindable). + if gp != gp.m.g0 { + callers(1, mp.createstack[:]) + } + + lock(&sched.lock) + + if id >= 0 { + mp.id = id + } else { + mp.id = mReserveID() + } + + lo := uint32(int64Hash(uint64(mp.id), fastrandseed)) + hi := uint32(int64Hash(uint64(cputicks()), ^fastrandseed)) + if lo|hi == 0 { + hi = 1 + } + // Same behavior as for 1.17. + // TODO: Simplify this. + if goarch.BigEndian { + mp.fastrand = uint64(lo)<<32 | uint64(hi) + } else { + mp.fastrand = uint64(hi)<<32 | uint64(lo) + } + + mpreinit(mp) + if mp.gsignal != nil { + mp.gsignal.stackguard1 = mp.gsignal.stack.lo + _StackGuard + } + + // Add to allm so garbage collector doesn't free g->m + // when it is just in a register or thread-local storage. + mp.alllink = allm + + // NumCgoCall() iterates over allm w/o schedlock, + // so we need to publish it safely.
+ atomicstorep(unsafe.Pointer(&allm), unsafe.Pointer(mp)) + unlock(&sched.lock) + + // Allocate memory to hold a cgo traceback if the cgo call crashes. + if iscgo || GOOS == "solaris" || GOOS == "illumos" || GOOS == "windows" { + mp.cgoCallers = new(cgoCallers) + } +} + +func (mp *m) becomeSpinning() { + mp.spinning = true + sched.nmspinning.Add(1) + sched.needspinning.Store(0) +} + +var fastrandseed uintptr + +func fastrandinit() { + s := (*[unsafe.Sizeof(fastrandseed)]byte)(unsafe.Pointer(&fastrandseed))[:] + getRandomData(s) +} + +// Mark gp ready to run. +func ready(gp *g, traceskip int, next bool) { + if trace.enabled { + traceGoUnpark(gp, traceskip) + } + + status := readgstatus(gp) + + // Mark runnable. + mp := acquirem() // disable preemption because it can be holding p in a local var + if status&^_Gscan != _Gwaiting { + dumpgstatus(gp) + throw("bad g->status in ready") + } + + // status is Gwaiting or Gscanwaiting, make Grunnable and put on runq + casgstatus(gp, _Gwaiting, _Grunnable) + runqput(mp.p.ptr(), gp, next) + wakep() + releasem(mp) +} + +// freezeStopWait is a large value that freezetheworld sets +// sched.stopwait to in order to request that all Gs permanently stop. +const freezeStopWait = 0x7fffffff + +// freezing is set to non-zero if the runtime is trying to freeze the +// world. +var freezing atomic.Bool + +// Similar to stopTheWorld but best-effort and can be called several times. +// There is no reverse operation, used during crashing. +// This function must not lock any mutexes. 
+func freezetheworld() { + freezing.Store(true) + // stopwait and preemption requests can be lost + // due to races with concurrently executing threads, + // so try several times + for i := 0; i < 5; i++ { + // this should tell the scheduler to not start any new goroutines + sched.stopwait = freezeStopWait + sched.gcwaiting.Store(true) + // this should stop running goroutines + if !preemptall() { + break // no running goroutines + } + usleep(1000) + } + // to be sure + usleep(1000) + preemptall() + usleep(1000) +} + +// All reads and writes of g's status go through readgstatus, casgstatus, +// castogscanstatus, casfrom_Gscanstatus. +// +//go:nosplit +func readgstatus(gp *g) uint32 { + return gp.atomicstatus.Load() +} + +// The Gscanstatuses are acting like locks and this releases them. +// If it proves to be a performance hit we should be able to make these +// simple atomic stores but for now we are going to throw if +// we see an inconsistent state. +func casfrom_Gscanstatus(gp *g, oldval, newval uint32) { + success := false + + // Check that transition is valid. + switch oldval { + default: + print("runtime: casfrom_Gscanstatus bad oldval gp=", gp, ", oldval=", hex(oldval), ", newval=", hex(newval), "\n") + dumpgstatus(gp) + throw("casfrom_Gscanstatus:top gp->status is not in scan state") + case _Gscanrunnable, + _Gscanwaiting, + _Gscanrunning, + _Gscansyscall, + _Gscanpreempted: + if newval == oldval&^_Gscan { + success = gp.atomicstatus.CompareAndSwap(oldval, newval) + } + } + if !success { + print("runtime: casfrom_Gscanstatus failed gp=", gp, ", oldval=", hex(oldval), ", newval=", hex(newval), "\n") + dumpgstatus(gp) + throw("casfrom_Gscanstatus: gp->status is not in scan state") + } + releaseLockRank(lockRankGscan) +} + +// This will return false if the gp is not in the expected status and the cas fails. +// This acts like a lock acquire while casfrom_Gscanstatus acts like a lock release.
+func castogscanstatus(gp *g, oldval, newval uint32) bool { + switch oldval { + case _Grunnable, + _Grunning, + _Gwaiting, + _Gsyscall: + if newval == oldval|_Gscan { + r := gp.atomicstatus.CompareAndSwap(oldval, newval) + if r { + acquireLockRank(lockRankGscan) + } + return r + + } + } + print("runtime: castogscanstatus oldval=", hex(oldval), " newval=", hex(newval), "\n") + throw("castogscanstatus") + panic("not reached") +} + +// casgstatusAlwaysTrack is a debug flag that causes casgstatus to always track +// various latencies on every transition instead of sampling them. +var casgstatusAlwaysTrack = false + +// If asked to move to or from a Gscanstatus this will throw. Use the castogscanstatus +// and casfrom_Gscanstatus instead. +// casgstatus will loop if the g->atomicstatus is in a Gscan status until the routine that +// put it in the Gscan state is finished. +// +//go:nosplit +func casgstatus(gp *g, oldval, newval uint32) { + if (oldval&_Gscan != 0) || (newval&_Gscan != 0) || oldval == newval { + systemstack(func() { + print("runtime: casgstatus: oldval=", hex(oldval), " newval=", hex(newval), "\n") + throw("casgstatus: bad incoming values") + }) + } + + acquireLockRank(lockRankGscan) + releaseLockRank(lockRankGscan) + + // See https://golang.org/cl/21503 for justification of the yield delay. + const yieldDelay = 5 * 1000 + var nextYield int64 + + // loop if gp->atomicstatus is in a scan state giving + // GC time to finish and change the state to oldval. 
+ for i := 0; !gp.atomicstatus.CompareAndSwap(oldval, newval); i++ { + if oldval == _Gwaiting && gp.atomicstatus.Load() == _Grunnable { + throw("casgstatus: waiting for Gwaiting but is Grunnable") + } + if i == 0 { + nextYield = nanotime() + yieldDelay + } + if nanotime() < nextYield { + for x := 0; x < 10 && gp.atomicstatus.Load() != oldval; x++ { + procyield(1) + } + } else { + osyield() + nextYield = nanotime() + yieldDelay/2 + } + } + + if oldval == _Grunning { + // Track every gTrackingPeriod time a goroutine transitions out of running. + if casgstatusAlwaysTrack || gp.trackingSeq%gTrackingPeriod == 0 { + gp.tracking = true + } + gp.trackingSeq++ + } + if !gp.tracking { + return + } + + // Handle various kinds of tracking. + // + // Currently: + // - Time spent in runnable. + // - Time spent blocked on a sync.Mutex or sync.RWMutex. + switch oldval { + case _Grunnable: + // We transitioned out of runnable, so measure how much + // time we spent in this state and add it to + // runnableTime. + now := nanotime() + gp.runnableTime += now - gp.trackingStamp + gp.trackingStamp = 0 + case _Gwaiting: + if !gp.waitreason.isMutexWait() { + // Not blocking on a lock. + break + } + // Blocking on a lock, measure it. Note that because we're + // sampling, we have to multiply by our sampling period to get + // a more representative estimate of the absolute value. + // gTrackingPeriod also represents an accurate sampling period + // because we can only enter this state from _Grunning. + now := nanotime() + sched.totalMutexWaitTime.Add((now - gp.trackingStamp) * gTrackingPeriod) + gp.trackingStamp = 0 + } + switch newval { + case _Gwaiting: + if !gp.waitreason.isMutexWait() { + // Not blocking on a lock. + break + } + // Blocking on a lock. Write down the timestamp. + now := nanotime() + gp.trackingStamp = now + case _Grunnable: + // We just transitioned into runnable, so record what + // time that happened. 
+ now := nanotime() + gp.trackingStamp = now + case _Grunning: + // We're transitioning into running, so turn off + // tracking and record how much time we spent in + // runnable. + gp.tracking = false + sched.timeToRun.record(gp.runnableTime) + gp.runnableTime = 0 + } +} + +// casGToWaiting transitions gp from old to _Gwaiting, and sets the wait reason. +// +// Use this over casgstatus when possible to ensure that a waitreason is set. +func casGToWaiting(gp *g, old uint32, reason waitReason) { + // Set the wait reason before calling casgstatus, because casgstatus will use it. + gp.waitreason = reason + casgstatus(gp, old, _Gwaiting) +} + +// casgstatus(gp, oldstatus, Gcopystack), assuming oldstatus is Gwaiting or Grunnable. +// Returns old status. Cannot call casgstatus directly, because we are racing with an +// async wakeup that might come in from netpoll. If we see Gwaiting from the readgstatus, +// it might have become Grunnable by the time we get to the cas. If we called casgstatus, +// it would loop waiting for the status to go back to Gwaiting, which it never will. +// +//go:nosplit +func casgcopystack(gp *g) uint32 { + for { + oldstatus := readgstatus(gp) &^ _Gscan + if oldstatus != _Gwaiting && oldstatus != _Grunnable { + throw("copystack: bad status, not Gwaiting or Grunnable") + } + if gp.atomicstatus.CompareAndSwap(oldstatus, _Gcopystack) { + return oldstatus + } + } +} + +// casGToPreemptScan transitions gp from _Grunning to _Gscan|_Gpreempted. +// +// TODO(austin): This is the only status operation that both changes +// the status and locks the _Gscan bit. Rethink this. +func casGToPreemptScan(gp *g, old, new uint32) { + if old != _Grunning || new != _Gscan|_Gpreempted { + throw("bad g transition") + } + acquireLockRank(lockRankGscan) + for !gp.atomicstatus.CompareAndSwap(_Grunning, _Gscan|_Gpreempted) { + } +} + +// casGFromPreempted attempts to transition gp from _Gpreempted to +// _Gwaiting. 
If successful, the caller is responsible for +// re-scheduling gp. +func casGFromPreempted(gp *g, old, new uint32) bool { + if old != _Gpreempted || new != _Gwaiting { + throw("bad g transition") + } + gp.waitreason = waitReasonPreempted + return gp.atomicstatus.CompareAndSwap(_Gpreempted, _Gwaiting) +} + +// stopTheWorld stops all P's from executing goroutines, interrupting +// all goroutines at GC safe points and records reason as the reason +// for the stop. On return, only the current goroutine's P is running. +// stopTheWorld must not be called from a system stack and the caller +// must not hold worldsema. The caller must call startTheWorld when +// other P's should resume execution. +// +// stopTheWorld is safe for multiple goroutines to call at the +// same time. Each will execute its own stop, and the stops will +// be serialized. +// +// This is also used by routines that do stack dumps. If the system is +// in panic or being exited, this may not reliably stop all +// goroutines. +func stopTheWorld(reason string) { + semacquire(&worldsema) + gp := getg() + gp.m.preemptoff = reason + systemstack(func() { + // Mark the goroutine which called stopTheWorld preemptible so its + // stack may be scanned. + // This lets a mark worker scan us while we try to stop the world + // since otherwise we could get in a mutual preemption deadlock. + // We must not modify anything on the G stack because a stack shrink + // may occur. A stack shrink is otherwise OK though because in order + // to return from this function (and to leave the system stack) we + // must have preempted all goroutines, including any attempting + // to scan our stack, in which case, any stack shrinking will + // have already completed by the time we exit. + // Don't provide a wait reason because we're still executing. 
+ casGToWaiting(gp, _Grunning, waitReasonStoppingTheWorld) + stopTheWorldWithSema() + casgstatus(gp, _Gwaiting, _Grunning) + }) +} + +// startTheWorld undoes the effects of stopTheWorld. +func startTheWorld() { + systemstack(func() { startTheWorldWithSema(false) }) + + // worldsema must be held over startTheWorldWithSema to ensure + // gomaxprocs cannot change while worldsema is held. + // + // Release worldsema with direct handoff to the next waiter, but + // acquirem so that semrelease1 doesn't try to yield our time. + // + // Otherwise if e.g. ReadMemStats is being called in a loop, + // it might stomp on other attempts to stop the world, such as + // for starting or ending GC. The operation this blocks is + // so heavy-weight that we should just try to be as fair as + // possible here. + // + // We don't want to just allow us to get preempted between now + // and releasing the semaphore because then we keep everyone + // (including, for example, GCs) waiting longer. + mp := acquirem() + mp.preemptoff = "" + semrelease1(&worldsema, true, 0) + releasem(mp) +} + +// stopTheWorldGC has the same effect as stopTheWorld, but blocks +// until the GC is not running. It also blocks a GC from starting +// until startTheWorldGC is called. +func stopTheWorldGC(reason string) { + semacquire(&gcsema) + stopTheWorld(reason) +} + +// startTheWorldGC undoes the effects of stopTheWorldGC. +func startTheWorldGC() { + startTheWorld() + semrelease(&gcsema) +} + +// Holding worldsema grants an M the right to try to stop the world. +var worldsema uint32 = 1 + +// Holding gcsema grants the M the right to block a GC, and blocks +// until the current GC is done. In particular, it prevents gomaxprocs +// from changing concurrently. +// +// TODO(mknyszek): Once gomaxprocs and the execution tracer can handle +// being changed/enabled during a GC, remove this. +var gcsema uint32 = 1 + +// stopTheWorldWithSema is the core implementation of stopTheWorld. 
+// The caller is responsible for acquiring worldsema and disabling +// preemption first and then should call stopTheWorldWithSema on the system +// stack: +// +// semacquire(&worldsema, 0) +// m.preemptoff = "reason" +// systemstack(stopTheWorldWithSema) +// +// When finished, the caller must either call startTheWorld or undo +// these three operations separately: +// +// m.preemptoff = "" +// systemstack(startTheWorldWithSema) +// semrelease(&worldsema) +// +// It is allowed to acquire worldsema once and then execute multiple +// startTheWorldWithSema/stopTheWorldWithSema pairs. +// Other P's are able to execute between successive calls to +// startTheWorldWithSema and stopTheWorldWithSema. +// Holding worldsema causes any other goroutines invoking +// stopTheWorld to block. +func stopTheWorldWithSema() { + gp := getg() + + // If we hold a lock, then we won't be able to stop another M + // that is blocked trying to acquire the lock. + if gp.m.locks > 0 { + throw("stopTheWorld: holding locks") + } + + lock(&sched.lock) + sched.stopwait = gomaxprocs + sched.gcwaiting.Store(true) + preemptall() + // stop current P + gp.m.p.ptr().status = _Pgcstop // Pgcstop is only diagnostic.
+ sched.stopwait-- + // try to retake all P's in Psyscall status + for _, pp := range allp { + s := pp.status + if s == _Psyscall && atomic.Cas(&pp.status, s, _Pgcstop) { + if trace.enabled { + traceGoSysBlock(pp) + traceProcStop(pp) + } + pp.syscalltick++ + sched.stopwait-- + } + } + // stop idle P's + now := nanotime() + for { + pp, _ := pidleget(now) + if pp == nil { + break + } + pp.status = _Pgcstop + sched.stopwait-- + } + wait := sched.stopwait > 0 + unlock(&sched.lock) + + // wait for remaining P's to stop voluntarily + if wait { + for { + // wait for 100us, then try to re-preempt in case of any races + if notetsleep(&sched.stopnote, 100*1000) { + noteclear(&sched.stopnote) + break + } + preemptall() + } + } + + // sanity checks + bad := "" + if sched.stopwait != 0 { + bad = "stopTheWorld: not stopped (stopwait != 0)" + } else { + for _, pp := range allp { + if pp.status != _Pgcstop { + bad = "stopTheWorld: not stopped (status != _Pgcstop)" + } + } + } + if freezing.Load() { + // Some other thread is panicking. This can cause the + // sanity checks above to fail if the panic happens in + // the signal handler on a stopped thread. Either way, + // we should halt this thread. 
+ lock(&deadlock) + lock(&deadlock) + } + if bad != "" { + throw(bad) + } + + worldStopped() +} + +func startTheWorldWithSema(emitTraceEvent bool) int64 { + assertWorldStopped() + + mp := acquirem() // disable preemption because it can be holding p in a local var + if netpollinited() { + list := netpoll(0) // non-blocking + injectglist(&list) + } + lock(&sched.lock) + + procs := gomaxprocs + if newprocs != 0 { + procs = newprocs + newprocs = 0 + } + p1 := procresize(procs) + sched.gcwaiting.Store(false) + if sched.sysmonwait.Load() { + sched.sysmonwait.Store(false) + notewakeup(&sched.sysmonnote) + } + unlock(&sched.lock) + + worldStarted() + + for p1 != nil { + p := p1 + p1 = p1.link.ptr() + if p.m != 0 { + mp := p.m.ptr() + p.m = 0 + if mp.nextp != 0 { + throw("startTheWorld: inconsistent mp->nextp") + } + mp.nextp.set(p) + notewakeup(&mp.park) + } else { + // Start M to run P. Do not start another M below. + newm(nil, p, -1) + } + } + + // Capture start-the-world time before doing clean-up tasks. + startTime := nanotime() + if emitTraceEvent { + traceGCSTWDone() + } + + // Wakeup an additional proc in case we have excessive runnable goroutines + // in local queues or in the global queue. If we don't, the proc will park itself. + // If we have lots of excessive work, resetspinning will unpark additional procs as necessary. + wakep() + + releasem(mp) + + return startTime +} + +// usesLibcall indicates whether this runtime performs system calls +// via libcall. +func usesLibcall() bool { + switch GOOS { + case "aix", "darwin", "illumos", "ios", "solaris", "windows": + return true + case "openbsd": + return GOARCH == "386" || GOARCH == "amd64" || GOARCH == "arm" || GOARCH == "arm64" + } + return false +} + +// mStackIsSystemAllocated indicates whether this runtime starts on a +// system-allocated stack. 
+func mStackIsSystemAllocated() bool { + switch GOOS { + case "aix", "darwin", "plan9", "illumos", "ios", "solaris", "windows": + return true + case "openbsd": + switch GOARCH { + case "386", "amd64", "arm", "arm64": + return true + } + } + return false +} + +// mstart is the entry-point for new Ms. +// It is written in assembly, uses ABI0, is marked TOPFRAME, and calls mstart0. +func mstart() + +// mstart0 is the Go entry-point for new Ms. +// This must not split the stack because we may not even have stack +// bounds set up yet. +// +// May run during STW (because it doesn't have a P yet), so write +// barriers are not allowed. +// +//go:nosplit +//go:nowritebarrierrec +func mstart0() { + gp := getg() + + osStack := gp.stack.lo == 0 + if osStack { + // Initialize stack bounds from system stack. + // Cgo may have left stack size in stack.hi. + // minit may update the stack bounds. + // + // Note: these bounds may not be very accurate. + // We set hi to &size, but there are things above + // it. The 1024 is supposed to compensate this, + // but is somewhat arbitrary. + size := gp.stack.hi + if size == 0 { + size = 8192 * sys.StackGuardMultiplier + } + gp.stack.hi = uintptr(noescape(unsafe.Pointer(&size))) + gp.stack.lo = gp.stack.hi - size + 1024 + } + // Initialize stack guard so that we can start calling regular + // Go code. + gp.stackguard0 = gp.stack.lo + _StackGuard + // This is the g0, so we can also call go:systemstack + // functions, which check stackguard1. + gp.stackguard1 = gp.stackguard0 + mstart1() + + // Exit this thread. + if mStackIsSystemAllocated() { + // Windows, Solaris, illumos, Darwin, AIX and Plan 9 always system-allocate + // the stack, but put it in gp.stack before mstart, + // so the logic above hasn't set osStack yet. + osStack = true + } + mexit(osStack) +} + +// The go:noinline is to guarantee the getcallerpc/getcallersp below are safe, +// so that we can set up g0.sched to return to the call of mstart1 above. 
+// +//go:noinline +func mstart1() { + gp := getg() + + if gp != gp.m.g0 { + throw("bad runtime·mstart") + } + + // Set up m.g0.sched as a label returning to just + // after the mstart1 call in mstart0 above, for use by goexit0 and mcall. + // We're never coming back to mstart1 after we call schedule, + // so other calls can reuse the current frame. + // And goexit0 does a gogo that needs to return from mstart1 + // and let mstart0 exit the thread. + gp.sched.g = guintptr(unsafe.Pointer(gp)) + gp.sched.pc = getcallerpc() + gp.sched.sp = getcallersp() + + asminit() + minit() + + // Install signal handlers; after minit so that minit can + // prepare the thread to be able to handle the signals. + if gp.m == &m0 { + mstartm0() + } + + if fn := gp.m.mstartfn; fn != nil { + fn() + } + + if gp.m != &m0 { + acquirep(gp.m.nextp.ptr()) + gp.m.nextp = 0 + } + schedule() +} + +// mstartm0 implements part of mstart1 that only runs on the m0. +// +// Write barriers are allowed here because we know the GC can't be +// running yet, so they'll be no-ops. +// +//go:yeswritebarrierrec +func mstartm0() { + // Create an extra M for callbacks on threads not created by Go. + // An extra M is also needed on Windows for callbacks created by + // syscall.NewCallback. See issue #6751 for details. + if (iscgo || GOOS == "windows") && !cgoHasExtraM { + cgoHasExtraM = true + newextram() + } + initsig(false) +} + +// mPark causes a thread to park itself, returning once woken. +// +//go:nosplit +func mPark() { + gp := getg() + notesleep(&gp.m.park) + noteclear(&gp.m.park) +} + +// mexit tears down and exits the current thread. +// +// Don't call this directly to exit the thread, since it must run at +// the top of the thread stack. Instead, use gogo(&gp.m.g0.sched) to +// unwind the stack to the point that exits the thread. +// +// It is entered with m.p != nil, so write barriers are allowed. It +// will release the P before exiting. 
+// +//go:yeswritebarrierrec +func mexit(osStack bool) { + mp := getg().m + + if mp == &m0 { + // This is the main thread. Just wedge it. + // + // On Linux, exiting the main thread puts the process + // into a non-waitable zombie state. On Plan 9, + // exiting the main thread unblocks wait even though + // other threads are still running. On Solaris we can + // neither exitThread nor return from mstart. Other + // bad things probably happen on other platforms. + // + // We could try to clean up this M more before wedging + // it, but that complicates signal handling. + handoffp(releasep()) + lock(&sched.lock) + sched.nmfreed++ + checkdead() + unlock(&sched.lock) + mPark() + throw("locked m0 woke up") + } + + sigblock(true) + unminit() + + // Free the gsignal stack. + if mp.gsignal != nil { + stackfree(mp.gsignal.stack) + // On some platforms, when calling into VDSO (e.g. nanotime) + // we store our g on the gsignal stack, if there is one. + // Now the stack is freed, unlink it from the m, so we + // won't write to it when calling VDSO code. + mp.gsignal = nil + } + + // Remove m from allm. + lock(&sched.lock) + for pprev := &allm; *pprev != nil; pprev = &(*pprev).alllink { + if *pprev == mp { + *pprev = mp.alllink + goto found + } + } + throw("m not found in allm") +found: + // Delay reaping m until it's done with the stack. + // + // Put mp on the free list, though it will not be reaped while freeWait + // is freeMWait. mp is no longer reachable via allm, so even if it is + // on an OS stack, we must keep a reference to mp alive so that the GC + // doesn't free mp while we are still using it. + // + // Note that the free list must not be linked through alllink because + // some functions walk allm without locking, so may be using alllink. + mp.freeWait.Store(freeMWait) + mp.freelink = sched.freem + sched.freem = mp + unlock(&sched.lock) + + atomic.Xadd64(&ncgocall, int64(mp.ncgocall)) + + // Release the P. 
+ handoffp(releasep()) + // After this point we must not have write barriers. + + // Invoke the deadlock detector. This must happen after + // handoffp because it may have started a new M to take our + // P's work. + lock(&sched.lock) + sched.nmfreed++ + checkdead() + unlock(&sched.lock) + + if GOOS == "darwin" || GOOS == "ios" { + // Make sure pendingPreemptSignals is correct when an M exits. + // For #41702. + if mp.signalPending.Load() != 0 { + pendingPreemptSignals.Add(-1) + } + } + + // Destroy all allocated resources. After this is called, we may no + // longer take any locks. + mdestroy(mp) + + if osStack { + // No more uses of mp, so it is safe to drop the reference. + mp.freeWait.Store(freeMRef) + + // Return from mstart and let the system thread + // library free the g0 stack and terminate the thread. + return + } + + // mstart is the thread's entry point, so there's nothing to + // return to. Exit the thread directly. exitThread will clear + // m.freeWait when it's done with the stack and the m can be + // reaped. + exitThread(&mp.freeWait) +} + +// forEachP calls fn(p) for every P p when p reaches a GC safe point. +// If a P is currently executing code, this will bring the P to a GC +// safe point and execute fn on that P. If the P is not executing code +// (it is idle or in a syscall), this will call fn(p) directly while +// preventing the P from exiting its state. This does not ensure that +// fn will run on every CPU executing Go code, but it acts as a global +// memory barrier. GC uses this as a "ragged barrier." +// +// The caller must hold worldsema. +// +//go:systemstack +func forEachP(fn func(*p)) { + mp := acquirem() + pp := getg().m.p.ptr() + + lock(&sched.lock) + if sched.safePointWait != 0 { + throw("forEachP: sched.safePointWait != 0") + } + sched.safePointWait = gomaxprocs - 1 + sched.safePointFn = fn + + // Ask all Ps to run the safe point function. 
+ for _, p2 := range allp { + if p2 != pp { + atomic.Store(&p2.runSafePointFn, 1) + } + } + preemptall() + + // Any P entering _Pidle or _Psyscall from now on will observe + // p.runSafePointFn == 1 and will call runSafePointFn when + // changing its status to _Pidle/_Psyscall. + + // Run safe point function for all idle Ps. sched.pidle will + // not change because we hold sched.lock. + for p := sched.pidle.ptr(); p != nil; p = p.link.ptr() { + if atomic.Cas(&p.runSafePointFn, 1, 0) { + fn(p) + sched.safePointWait-- + } + } + + wait := sched.safePointWait > 0 + unlock(&sched.lock) + + // Run fn for the current P. + fn(pp) + + // Force Ps currently in _Psyscall into _Pidle and hand them + // off to induce safe point function execution. + for _, p2 := range allp { + s := p2.status + if s == _Psyscall && p2.runSafePointFn == 1 && atomic.Cas(&p2.status, s, _Pidle) { + if trace.enabled { + traceGoSysBlock(p2) + traceProcStop(p2) + } + p2.syscalltick++ + handoffp(p2) + } + } + + // Wait for remaining Ps to run fn. + if wait { + for { + // Wait for 100us, then try to re-preempt in + // case of any races. + // + // Requires system stack. + if notetsleep(&sched.safePointNote, 100*1000) { + noteclear(&sched.safePointNote) + break + } + preemptall() + } + } + if sched.safePointWait != 0 { + throw("forEachP: not done") + } + for _, p2 := range allp { + if p2.runSafePointFn != 0 { + throw("forEachP: P did not run fn") + } + } + + lock(&sched.lock) + sched.safePointFn = nil + unlock(&sched.lock) + releasem(mp) +} + +// runSafePointFn runs the safe point function, if any, for this P. +// This should be called like +// +// if getg().m.p.runSafePointFn != 0 { +// runSafePointFn() +// } +// +// runSafePointFn must be checked on any transition in to _Pidle or +// _Psyscall to avoid a race where forEachP sees that the P is running +// just before the P goes into _Pidle/_Psyscall and neither forEachP +// nor the P run the safe-point function. 
+func runSafePointFn() { + p := getg().m.p.ptr() + // Resolve the race between forEachP running the safe-point + // function on this P's behalf and this P running the + // safe-point function directly. + if !atomic.Cas(&p.runSafePointFn, 1, 0) { + return + } + sched.safePointFn(p) + lock(&sched.lock) + sched.safePointWait-- + if sched.safePointWait == 0 { + notewakeup(&sched.safePointNote) + } + unlock(&sched.lock) +} + +// When running with cgo, we call _cgo_thread_start +// to start threads for us so that we can play nicely with +// foreign code. +var cgoThreadStart unsafe.Pointer + +type cgothreadstart struct { + g guintptr + tls *uint64 + fn unsafe.Pointer +} + +// Allocate a new m unassociated with any thread. +// Can use p for allocation context if needed. +// fn is recorded as the new m's m.mstartfn. +// id is optional pre-allocated m ID. Omit by passing -1. +// +// This function is allowed to have write barriers even if the caller +// isn't because it borrows pp. +// +//go:yeswritebarrierrec +func allocm(pp *p, fn func(), id int64) *m { + allocmLock.rlock() + + // The caller owns pp, but we may borrow (i.e., acquirep) it. We must + // disable preemption to ensure it is not stolen, which would make the + // caller lose ownership. + acquirem() + + gp := getg() + if gp.m.p == 0 { + acquirep(pp) // temporarily borrow p for mallocs in this function + } + + // Release the free M list. We need to do this somewhere and + // this may free up a stack we can use. + if sched.freem != nil { + lock(&sched.lock) + var newList *m + for freem := sched.freem; freem != nil; { + wait := freem.freeWait.Load() + if wait == freeMWait { + next := freem.freelink + freem.freelink = newList + newList = freem + freem = next + continue + } + // Free the stack if needed. For freeMRef, there is + // nothing to do except drop freem from the sched.freem + // list. 
+ if wait == freeMStack { + // stackfree must be on the system stack, but allocm is + // reachable off the system stack transitively from + // startm. + systemstack(func() { + stackfree(freem.g0.stack) + }) + } + freem = freem.freelink + } + sched.freem = newList + unlock(&sched.lock) + } + + mp := new(m) + mp.mstartfn = fn + mcommoninit(mp, id) + + // In case of cgo or Solaris or illumos or Darwin, pthread_create will make us a stack. + // Windows and Plan 9 will layout sched stack on OS stack. + if iscgo || mStackIsSystemAllocated() { + mp.g0 = malg(-1) + } else { + mp.g0 = malg(8192 * sys.StackGuardMultiplier) + } + mp.g0.m = mp + + if pp == gp.m.p.ptr() { + releasep() + } + + releasem(gp.m) + allocmLock.runlock() + return mp +} + +// needm is called when a cgo callback happens on a +// thread without an m (a thread not created by Go). +// In this case, needm is expected to find an m to use +// and return with m, g initialized correctly. +// Since m and g are not set now (likely nil, but see below) +// needm is limited in what routines it can call. In particular +// it can only call nosplit functions (textflag 7) and cannot +// do any scheduling that requires an m. +// +// In order to avoid needing heavy lifting here, we adopt +// the following strategy: there is a stack of available m's +// that can be stolen. Using compare-and-swap +// to pop from the stack has ABA races, so we simulate +// a lock by doing an exchange (via Casuintptr) to steal the stack +// head and replace the top pointer with MLOCKED (1). +// This serves as a simple spin lock that we can use even +// without an m. The thread that locks the stack in this way +// unlocks the stack by storing a valid stack head pointer. +// +// In order to make sure that there is always an m structure +// available to be stolen, we maintain the invariant that there +// is always one more than needed. At the beginning of the +// program (if cgo is in use) the list is seeded with a single m. 
+// If needm finds that it has taken the last m off the list, its job +// is - once it has installed its own m so that it can do things like +// allocate memory - to create a spare m and put it on the list. +// +// Each of these extra m's also has a g0 and a curg that are +// pressed into service as the scheduling stack and current +// goroutine for the duration of the cgo callback. +// +// When the callback is done with the m, it calls dropm to +// put the m back on the list. +// +//go:nosplit +func needm() { + if (iscgo || GOOS == "windows") && !cgoHasExtraM { + // Can happen if C/C++ code calls Go from a global ctor. + // Can also happen on Windows if a global ctor uses a + // callback created by syscall.NewCallback. See issue #6751 + // for details. + // + // Can not throw, because scheduler is not initialized yet. + writeErrStr("fatal error: cgo callback before cgo call\n") + exit(1) + } + + // Save and block signals before getting an M. + // The signal handler may call needm itself, + // and we must avoid a deadlock. Also, once g is installed, + // any incoming signals will try to execute, + // but we won't have the sigaltstack settings and other data + // set up appropriately until the end of minit, which will + // unblock the signals. This is the same dance as when + // starting a new m to run Go code via newosproc. + var sigmask sigset + sigsave(&sigmask) + sigblock(false) + + // Lock extra list, take head, unlock popped list. + // nilokay=false is safe here because of the invariant above, + // that the extra list always contains or will soon contain + // at least one m. + mp := lockextra(false) + + // Set needextram when we've just emptied the list, + // so that the eventual call into cgocallbackg will + // allocate a new m for the extra list. We delay the + // allocation until then so that it can be done + // after exitsyscall makes sure it is okay to be + // running at all (that is, there's no garbage collection + // running right now). 
+ mp.needextram = mp.schedlink == 0 + extraMCount-- + unlockextra(mp.schedlink.ptr()) + + // Store the original signal mask for use by minit. + mp.sigmask = sigmask + + // Install TLS on some platforms (previously setg + // would do this if necessary). + osSetupTLS(mp) + + // Install g (= m->g0) and set the stack bounds + // to match the current stack. We don't actually know + // how big the stack is, like we don't know how big any + // scheduling stack is, but we assume there's at least 32 kB, + // which is more than enough for us. + setg(mp.g0) + gp := getg() + gp.stack.hi = getcallersp() + 1024 + gp.stack.lo = getcallersp() - 32*1024 + gp.stackguard0 = gp.stack.lo + _StackGuard + + // Initialize this thread to use the m. + asminit() + minit() + + // mp.curg is now a real goroutine. + casgstatus(mp.curg, _Gdead, _Gsyscall) + sched.ngsys.Add(-1) +} + +// newextram allocates m's and puts them on the extra list. +// It is called with a working local m, so that it can do things +// like call schedlock and allocate. +func newextram() { + c := extraMWaiters.Swap(0) + if c > 0 { + for i := uint32(0); i < c; i++ { + oneNewExtraM() + } + } else { + // Make sure there is at least one extra M. + mp := lockextra(true) + unlockextra(mp) + if mp == nil { + oneNewExtraM() + } + } +} + +// oneNewExtraM allocates an m and puts it on the extra list. +func oneNewExtraM() { + // Create extra goroutine locked to extra m. + // The goroutine is the context in which the cgo callback will run. + // The sched.pc will never be returned to, but setting it to + // goexit makes clear to the traceback routines where + // the goroutine stack ends. 
+ mp := allocm(nil, nil, -1) + gp := malg(4096) + gp.sched.pc = abi.FuncPCABI0(goexit) + sys.PCQuantum + gp.sched.sp = gp.stack.hi + gp.sched.sp -= 4 * goarch.PtrSize // extra space in case of reads slightly beyond frame + gp.sched.lr = 0 + gp.sched.g = guintptr(unsafe.Pointer(gp)) + gp.syscallpc = gp.sched.pc + gp.syscallsp = gp.sched.sp + gp.stktopsp = gp.sched.sp + // malg returns status as _Gidle. Change to _Gdead before + // adding to allg where GC can see it. We use _Gdead to hide + // this from tracebacks and stack scans since it isn't a + // "real" goroutine until needm grabs it. + casgstatus(gp, _Gidle, _Gdead) + gp.m = mp + mp.curg = gp + mp.isextra = true + mp.lockedInt++ + mp.lockedg.set(gp) + gp.lockedm.set(mp) + gp.goid = sched.goidgen.Add(1) + gp.sysblocktraced = true + if raceenabled { + gp.racectx = racegostart(abi.FuncPCABIInternal(newextram) + sys.PCQuantum) + } + if trace.enabled { + // Trigger two trace events for the locked g in the extra m, + // since the next event of the g will be traceEvGoSysExit in exitsyscall, + // while calling from C thread to Go. + traceGoCreate(gp, 0) // no start pc + gp.traceseq++ + traceEvent(traceEvGoInSyscall, -1, gp.goid) + } + // put on allg for garbage collector + allgadd(gp) + + // gp is now on the allg list, but we don't want it to be + // counted by gcount. It would be more "proper" to increment + // sched.ngfree, but that requires locking. Incrementing ngsys + // has the same effect. + sched.ngsys.Add(1) + + // Add m to the extra list. + mnext := lockextra(true) + mp.schedlink.set(mnext) + extraMCount++ + unlockextra(mp) +} + +// dropm is called when a cgo callback has called needm but is now +// done with the callback and returning back into the non-Go thread. +// It puts the current m back onto the extra list. +// +// The main expense here is the call to signalstack to release the +// m's signal stack, and then the call to needm on the next callback +// from this thread. 
It is tempting to try to save the m for next time, +// which would eliminate both these costs, but there might not be +// a next time: the current thread (which Go does not control) might exit. +// If we saved the m for that thread, there would be an m leak each time +// such a thread exited. Instead, we acquire and release an m on each +// call. These should typically not be scheduling operations, just a few +// atomics, so the cost should be small. +// +// TODO(rsc): An alternative would be to allocate a dummy pthread per-thread +// variable using pthread_key_create. Unlike the pthread keys we already use +// on OS X, this dummy key would never be read by Go code. It would exist +// only so that we could register at thread-exit-time destructor. +// That destructor would put the m back onto the extra list. +// This is purely a performance optimization. The current version, +// in which dropm happens on each cgo call, is still correct too. +// We may have to keep the current version on systems with cgo +// but without pthreads, like Windows. +func dropm() { + // Clear m and g, and return m to the extra list. + // After the call to setg we can only call nosplit functions + // with no pointer manipulation. + mp := getg().m + + // Return mp.curg to dead state. + casgstatus(mp.curg, _Gsyscall, _Gdead) + mp.curg.preemptStop = false + sched.ngsys.Add(1) + + // Block signals before unminit. + // Unminit unregisters the signal handling stack (but needs g on some systems). + // Setg(nil) clears g, which is the signal handler's cue not to run Go handlers. + // It's important not to try to handle a signal between those two steps. + sigmask := mp.sigmask + sigblock(false) + unminit() + + mnext := lockextra(true) + extraMCount++ + mp.schedlink.set(mnext) + + setg(nil) + + // Commit the release of mp. + unlockextra(mp) + + msigrestore(sigmask) +} + +// A helper function for EnsureDropM. 
+func getm() uintptr { + return uintptr(unsafe.Pointer(getg().m)) +} + +var extram atomic.Uintptr +var extraMCount uint32 // Protected by lockextra +var extraMWaiters atomic.Uint32 + +// lockextra locks the extra list and returns the list head. +// The caller must unlock the list by storing a new list head +// to extram. If nilokay is true, then lockextra will +// return a nil list head if that's what it finds. If nilokay is false, +// lockextra will keep waiting until the list head is no longer nil. +// +//go:nosplit +func lockextra(nilokay bool) *m { + const locked = 1 + + incr := false + for { + old := extram.Load() + if old == locked { + osyield_no_g() + continue + } + if old == 0 && !nilokay { + if !incr { + // Add 1 to the number of threads + // waiting for an M. + // This is cleared by newextram. + extraMWaiters.Add(1) + incr = true + } + usleep_no_g(1) + continue + } + if extram.CompareAndSwap(old, locked) { + return (*m)(unsafe.Pointer(old)) + } + osyield_no_g() + continue + } +} + +//go:nosplit +func unlockextra(mp *m) { + extram.Store(uintptr(unsafe.Pointer(mp))) +} + +var ( + // allocmLock is locked for read when creating new Ms in allocm and their + // addition to allm. Thus acquiring this lock for write blocks the + // creation of new Ms. + allocmLock rwmutex + + // execLock serializes exec and clone to avoid bugs or unspecified + // behaviour around exec'ing while creating/destroying threads. See + // issue #19546. + execLock rwmutex +) + +// These errors are reported (via writeErrStr) by some OS-specific +// versions of newosproc and newosproc0. +const ( + failthreadcreate = "runtime: failed to create new OS thread\n" + failallocatestack = "runtime: failed to allocate stack for the new OS thread\n" +) + +// newmHandoff contains a list of m structures that need new OS threads. +// This is used by newm in situations where newm itself can't safely +// start an OS thread. 
+var newmHandoff struct { + lock mutex + + // newm points to a list of M structures that need new OS + // threads. The list is linked through m.schedlink. + newm muintptr + + // waiting indicates that wake needs to be notified when an m + // is put on the list. + waiting bool + wake note + + // haveTemplateThread indicates that the templateThread has + // been started. This is not protected by lock. Use cas to set + // to 1. + haveTemplateThread uint32 +} + +// Create a new m. It will start off with a call to fn, or else the scheduler. +// fn needs to be static and not a heap allocated closure. +// May run with m.p==nil, so write barriers are not allowed. +// +// id is optional pre-allocated m ID. Omit by passing -1. +// +//go:nowritebarrierrec +func newm(fn func(), pp *p, id int64) { + // allocm adds a new M to allm, but they do not start until created by + // the OS in newm1 or the template thread. + // + // doAllThreadsSyscall requires that every M in allm will eventually + // start and be signal-able, even with a STW. + // + // Disable preemption here until we start the thread to ensure that + // newm is not preempted between allocm and starting the new thread, + // ensuring that anything added to allm is guaranteed to eventually + // start. + acquirem() + + mp := allocm(pp, fn, id) + mp.nextp.set(pp) + mp.sigmask = initSigmask + if gp := getg(); gp != nil && gp.m != nil && (gp.m.lockedExt != 0 || gp.m.incgo) && GOOS != "plan9" { + // We're on a locked M or a thread that may have been + // started by C. The kernel state of this thread may + // be strange (the user may have locked it for that + // purpose). We don't want to clone that into another + // thread. Instead, ask a known-good thread to create + // the thread for us. + // + // This is disabled on Plan 9. See golang.org/issue/22227. + // + // TODO: This may be unnecessary on Windows, which + // doesn't model thread creation off fork. 
+ lock(&newmHandoff.lock) + if newmHandoff.haveTemplateThread == 0 { + throw("on a locked thread with no template thread") + } + mp.schedlink = newmHandoff.newm + newmHandoff.newm.set(mp) + if newmHandoff.waiting { + newmHandoff.waiting = false + notewakeup(&newmHandoff.wake) + } + unlock(&newmHandoff.lock) + // The M has not started yet, but the template thread does not + // participate in STW, so it will always process queued Ms and + // it is safe to releasem. + releasem(getg().m) + return + } + newm1(mp) + releasem(getg().m) +} + +func newm1(mp *m) { + if iscgo { + var ts cgothreadstart + if _cgo_thread_start == nil { + throw("_cgo_thread_start missing") + } + ts.g.set(mp.g0) + ts.tls = (*uint64)(unsafe.Pointer(&mp.tls[0])) + ts.fn = unsafe.Pointer(abi.FuncPCABI0(mstart)) + if msanenabled { + msanwrite(unsafe.Pointer(&ts), unsafe.Sizeof(ts)) + } + if asanenabled { + asanwrite(unsafe.Pointer(&ts), unsafe.Sizeof(ts)) + } + execLock.rlock() // Prevent process clone. + asmcgocall(_cgo_thread_start, unsafe.Pointer(&ts)) + execLock.runlock() + return + } + execLock.rlock() // Prevent process clone. + newosproc(mp) + execLock.runlock() +} + +// startTemplateThread starts the template thread if it is not already +// running. +// +// The calling thread must itself be in a known-good state. +func startTemplateThread() { + if GOARCH == "wasm" { // no threads on wasm yet + return + } + + // Disable preemption to guarantee that the template thread will be + // created before a park once haveTemplateThread is set. + mp := acquirem() + if !atomic.Cas(&newmHandoff.haveTemplateThread, 0, 1) { + releasem(mp) + return + } + newm(templateThread, nil, -1) + releasem(mp) +} + +// templateThread is a thread in a known-good state that exists solely +// to start new threads in known-good states when the calling thread +// may not be in a good state. 
+// +// Many programs never need this, so templateThread is started lazily +// when we first enter a state that might lead to running on a thread +// in an unknown state. +// +// templateThread runs on an M without a P, so it must not have write +// barriers. +// +//go:nowritebarrierrec +func templateThread() { + lock(&sched.lock) + sched.nmsys++ + checkdead() + unlock(&sched.lock) + + for { + lock(&newmHandoff.lock) + for newmHandoff.newm != 0 { + newm := newmHandoff.newm.ptr() + newmHandoff.newm = 0 + unlock(&newmHandoff.lock) + for newm != nil { + next := newm.schedlink.ptr() + newm.schedlink = 0 + newm1(newm) + newm = next + } + lock(&newmHandoff.lock) + } + newmHandoff.waiting = true + noteclear(&newmHandoff.wake) + unlock(&newmHandoff.lock) + notesleep(&newmHandoff.wake) + } +} + +// Stops execution of the current m until new work is available. +// Returns with acquired P. +func stopm() { + gp := getg() + + if gp.m.locks != 0 { + throw("stopm holding locks") + } + if gp.m.p != 0 { + throw("stopm holding p") + } + if gp.m.spinning { + throw("stopm spinning") + } + + lock(&sched.lock) + mput(gp.m) + unlock(&sched.lock) + mPark() + acquirep(gp.m.nextp.ptr()) + gp.m.nextp = 0 +} + +func mspinning() { + // startm's caller incremented nmspinning. Set the new M's spinning. + getg().m.spinning = true +} + +// Schedules some M to run the p (creates an M if necessary). +// If p==nil, tries to get an idle P, if no idle P's does nothing. +// May run with m.p==nil, so write barriers are not allowed. +// If spinning is set, the caller has incremented nmspinning and must provide a +// P. startm will set m.spinning in the newly started M. +// +// Callers passing a non-nil P must call from a non-preemptible context. See +// comment on acquirem below. +// +// Argument lockheld indicates whether the caller already acquired the +// scheduler lock. Callers holding the lock when making the call must pass +// true. 
The lock might be temporarily dropped, but will be reacquired before +// returning. +// +// Must not have write barriers because this may be called without a P. +// +//go:nowritebarrierrec +func startm(pp *p, spinning, lockheld bool) { + // Disable preemption. + // + // Every owned P must have an owner that will eventually stop it in the + // event of a GC stop request. startm takes transient ownership of a P + // (either from argument or pidleget below) and transfers ownership to + // a started M, which will be responsible for performing the stop. + // + // Preemption must be disabled during this transient ownership, + // otherwise the P this is running on may enter GC stop while still + // holding the transient P, leaving that P in limbo and deadlocking the + // STW. + // + // Callers passing a non-nil P must already be in non-preemptible + // context, otherwise such preemption could occur on function entry to + // startm. Callers passing a nil P may be preemptible, so we must + // disable preemption before acquiring a P from pidleget below. + mp := acquirem() + if !lockheld { + lock(&sched.lock) + } + if pp == nil { + if spinning { + // TODO(prattmic): All remaining calls to this function + // with _p_ == nil could be cleaned up to find a P + // before calling startm. + throw("startm: P required for spinning=true") + } + pp, _ = pidleget(0) + if pp == nil { + if !lockheld { + unlock(&sched.lock) + } + releasem(mp) + return + } + } + nmp := mget() + if nmp == nil { + // No M is available, we must drop sched.lock and call newm. + // However, we already own a P to assign to the M. + // + // Once sched.lock is released, another G (e.g., in a syscall), + // could find no idle P while checkdead finds a runnable G but + // no running M's because this new M hasn't started yet, thus + // throwing in an apparent deadlock. + // This apparent deadlock is possible when startm is called + // from sysmon, which doesn't count as a running M. 
+ // + // Avoid this situation by pre-allocating the ID for the new M, + // thus marking it as 'running' before we drop sched.lock. This + // new M will eventually run the scheduler to execute any + // queued G's. + id := mReserveID() + unlock(&sched.lock) + + var fn func() + if spinning { + // The caller incremented nmspinning, so set m.spinning in the new M. + fn = mspinning + } + newm(fn, pp, id) + + if lockheld { + lock(&sched.lock) + } + // Ownership transfer of pp committed by start in newm. + // Preemption is now safe. + releasem(mp) + return + } + if !lockheld { + unlock(&sched.lock) + } + if nmp.spinning { + throw("startm: m is spinning") + } + if nmp.nextp != 0 { + throw("startm: m has p") + } + if spinning && !runqempty(pp) { + throw("startm: p has runnable gs") + } + // The caller incremented nmspinning, so set m.spinning in the new M. + nmp.spinning = spinning + nmp.nextp.set(pp) + notewakeup(&nmp.park) + // Ownership transfer of pp committed by wakeup. Preemption is now + // safe. + releasem(mp) +} + +// Hands off P from syscall or locked M. +// Always runs without a P, so write barriers are not allowed. +// +//go:nowritebarrierrec +func handoffp(pp *p) { + // handoffp must start an M in any situation where + // findrunnable would return a G to run on pp. 
+ + // if it has local work, start it straight away + if !runqempty(pp) || sched.runqsize != 0 { + startm(pp, false, false) + return + } + // if there's trace work to do, start it straight away + if (trace.enabled || trace.shutdown) && traceReaderAvailable() != nil { + startm(pp, false, false) + return + } + // if it has GC work, start it straight away + if gcBlackenEnabled != 0 && gcMarkWorkAvailable(pp) { + startm(pp, false, false) + return + } + // no local work, check that there are no spinning/idle M's, + // otherwise our help is not required + if sched.nmspinning.Load()+sched.npidle.Load() == 0 && sched.nmspinning.CompareAndSwap(0, 1) { // TODO: fast atomic + sched.needspinning.Store(0) + startm(pp, true, false) + return + } + lock(&sched.lock) + if sched.gcwaiting.Load() { + pp.status = _Pgcstop + sched.stopwait-- + if sched.stopwait == 0 { + notewakeup(&sched.stopnote) + } + unlock(&sched.lock) + return + } + if pp.runSafePointFn != 0 && atomic.Cas(&pp.runSafePointFn, 1, 0) { + sched.safePointFn(pp) + sched.safePointWait-- + if sched.safePointWait == 0 { + notewakeup(&sched.safePointNote) + } + } + if sched.runqsize != 0 { + unlock(&sched.lock) + startm(pp, false, false) + return + } + // If this is the last running P and nobody is polling network, + // need to wakeup another M to poll network. + if sched.npidle.Load() == gomaxprocs-1 && sched.lastpoll.Load() != 0 { + unlock(&sched.lock) + startm(pp, false, false) + return + } + + // The scheduler lock cannot be held when calling wakeNetPoller below + // because wakeNetPoller may call wakep which may call startm. + when := nobarrierWakeTime(pp) + pidleput(pp, 0) + unlock(&sched.lock) + + if when != 0 { + wakeNetPoller(when) + } +} + +// Tries to add one more P to execute G's. +// Called when a G is made runnable (newproc, ready). +// Must be called with a P. +func wakep() { + // Be conservative about spinning threads, only start one if none exist + // already. 
+ if sched.nmspinning.Load() != 0 || !sched.nmspinning.CompareAndSwap(0, 1) { + return + } + + // Disable preemption until ownership of pp transfers to the next M in + // startm. Otherwise preemption here would leave pp stuck waiting to + // enter _Pgcstop. + // + // See preemption comment on acquirem in startm for more details. + mp := acquirem() + + var pp *p + lock(&sched.lock) + pp, _ = pidlegetSpinning(0) + if pp == nil { + if sched.nmspinning.Add(-1) < 0 { + throw("wakep: negative nmspinning") + } + unlock(&sched.lock) + releasem(mp) + return + } + // Since we always have a P, the race in the "No M is available" + // comment in startm doesn't apply during the small window between the + // unlock here and lock in startm. A checkdead in between will always + // see at least one running M (ours). + unlock(&sched.lock) + + startm(pp, true, false) + + releasem(mp) +} + +// Stops execution of the current m that is locked to a g until the g is runnable again. +// Returns with acquired P. +func stoplockedm() { + gp := getg() + + if gp.m.lockedg == 0 || gp.m.lockedg.ptr().lockedm.ptr() != gp.m { + throw("stoplockedm: inconsistent locking") + } + if gp.m.p != 0 { + // Schedule another M to run this p. + pp := releasep() + handoffp(pp) + } + incidlelocked(1) + // Wait until another thread schedules lockedg again. + mPark() + status := readgstatus(gp.m.lockedg.ptr()) + if status&^_Gscan != _Grunnable { + print("runtime:stoplockedm: lockedg (atomicstatus=", status, ") is not Grunnable or Gscanrunnable\n") + dumpgstatus(gp.m.lockedg.ptr()) + throw("stoplockedm: not runnable") + } + acquirep(gp.m.nextp.ptr()) + gp.m.nextp = 0 +} + +// Schedules the locked m to run the locked gp. +// May run during STW, so write barriers are not allowed. 
+// +//go:nowritebarrierrec +func startlockedm(gp *g) { + mp := gp.lockedm.ptr() + if mp == getg().m { + throw("startlockedm: locked to me") + } + if mp.nextp != 0 { + throw("startlockedm: m has p") + } + // directly handoff current P to the locked m + incidlelocked(-1) + pp := releasep() + mp.nextp.set(pp) + notewakeup(&mp.park) + stopm() +} + +// Stops the current m for stopTheWorld. +// Returns when the world is restarted. +func gcstopm() { + gp := getg() + + if !sched.gcwaiting.Load() { + throw("gcstopm: not waiting for gc") + } + if gp.m.spinning { + gp.m.spinning = false + // OK to just drop nmspinning here, + // startTheWorld will unpark threads as necessary. + if sched.nmspinning.Add(-1) < 0 { + throw("gcstopm: negative nmspinning") + } + } + pp := releasep() + lock(&sched.lock) + pp.status = _Pgcstop + sched.stopwait-- + if sched.stopwait == 0 { + notewakeup(&sched.stopnote) + } + unlock(&sched.lock) + stopm() +} + +// Schedules gp to run on the current M. +// If inheritTime is true, gp inherits the remaining time in the +// current time slice. Otherwise, it starts a new time slice. +// Never returns. +// +// Write barriers are allowed because this is called immediately after +// acquiring a P in several places. +// +//go:yeswritebarrierrec +func execute(gp *g, inheritTime bool) { + mp := getg().m + + if goroutineProfile.active { + // Make sure that gp has had its stack written out to the goroutine + // profile, exactly as it was when the goroutine profiler first stopped + // the world. + tryRecordGoroutineProfile(gp, osyield) + } + + // Assign gp.m before entering _Grunning so running Gs have an + // M. + mp.curg = gp + gp.m = mp + casgstatus(gp, _Grunnable, _Grunning) + gp.waitsince = 0 + gp.preempt = false + gp.stackguard0 = gp.stack.lo + _StackGuard + if !inheritTime { + mp.p.ptr().schedtick++ + } + + // Check whether the profiler needs to be turned on or off. 
+ hz := sched.profilehz + if mp.profilehz != hz { + setThreadCPUProfiler(hz) + } + + if trace.enabled { + // GoSysExit has to happen when we have a P, but before GoStart. + // So we emit it here. + if gp.syscallsp != 0 && gp.sysblocktraced { + traceGoSysExit(gp.sysexitticks) + } + traceGoStart() + } + + gogo(&gp.sched) +} + +// Finds a runnable goroutine to execute. +// Tries to steal from other P's, get g from local or global queue, poll network. +// tryWakeP indicates that the returned goroutine is not normal (GC worker, trace +// reader) so the caller should try to wake a P. +func findRunnable() (gp *g, inheritTime, tryWakeP bool) { + mp := getg().m + + // The conditions here and in handoffp must agree: if + // findrunnable would return a G to run, handoffp must start + // an M. + +top: + pp := mp.p.ptr() + if sched.gcwaiting.Load() { + gcstopm() + goto top + } + if pp.runSafePointFn != 0 { + runSafePointFn() + } + + // now and pollUntil are saved for work stealing later, + // which may steal timers. It's important that between now + // and then, nothing blocks, so these numbers remain mostly + // relevant. + now, pollUntil, _ := checkTimers(pp, 0) + + // Try to schedule the trace reader. + if trace.enabled || trace.shutdown { + gp := traceReader() + if gp != nil { + casgstatus(gp, _Gwaiting, _Grunnable) + traceGoUnpark(gp, 0) + return gp, false, true + } + } + + // Try to schedule a GC worker. + if gcBlackenEnabled != 0 { + gp, tnow := gcController.findRunnableGCWorker(pp, now) + if gp != nil { + return gp, false, true + } + now = tnow + } + + // Check the global runnable queue once in a while to ensure fairness. + // Otherwise two goroutines can completely occupy the local runqueue + // by constantly respawning each other. + if pp.schedtick%61 == 0 && sched.runqsize > 0 { + lock(&sched.lock) + gp := globrunqget(pp, 1) + unlock(&sched.lock) + if gp != nil { + return gp, false, false + } + } + + // Wake up the finalizer G. 
+ if fingStatus.Load()&(fingWait|fingWake) == fingWait|fingWake { + if gp := wakefing(); gp != nil { + ready(gp, 0, true) + } + } + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + + // local runq + if gp, inheritTime := runqget(pp); gp != nil { + return gp, inheritTime, false + } + + // global runq + if sched.runqsize != 0 { + lock(&sched.lock) + gp := globrunqget(pp, 0) + unlock(&sched.lock) + if gp != nil { + return gp, false, false + } + } + + // Poll network. + // This netpoll is only an optimization before we resort to stealing. + // We can safely skip it if there are no waiters or a thread is blocked + // in netpoll already. If there is any kind of logical race with that + // blocked thread (e.g. it has already returned from netpoll, but does + // not set lastpoll yet), this thread will do blocking netpoll below + // anyway. + if netpollinited() && netpollWaiters.Load() > 0 && sched.lastpoll.Load() != 0 { + if list := netpoll(0); !list.empty() { // non-blocking + gp := list.pop() + injectglist(&list) + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, false, false + } + } + + // Spinning Ms: steal work from other Ps. + // + // Limit the number of spinning Ms to half the number of busy Ps. + // This is necessary to prevent excessive CPU consumption when + // GOMAXPROCS>>1 but the program parallelism is low. + if mp.spinning || 2*sched.nmspinning.Load() < gomaxprocs-sched.npidle.Load() { + if !mp.spinning { + mp.becomeSpinning() + } + + gp, inheritTime, tnow, w, newWork := stealWork(now) + if gp != nil { + // Successfully stole. + return gp, inheritTime, false + } + if newWork { + // There may be new timer or GC work; restart to + // discover. + goto top + } + + now = tnow + if w != 0 && (pollUntil == 0 || w < pollUntil) { + // Earlier timer to wait for. + pollUntil = w + } + } + + // We have nothing to do. 
+ // + // If we're in the GC mark phase, can safely scan and blacken objects, + // and have work to do, run idle-time marking rather than give up the P. + if gcBlackenEnabled != 0 && gcMarkWorkAvailable(pp) && gcController.addIdleMarkWorker() { + node := (*gcBgMarkWorkerNode)(gcBgMarkWorkerPool.pop()) + if node != nil { + pp.gcMarkWorkerMode = gcMarkWorkerIdleMode + gp := node.gp.ptr() + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, false, false + } + gcController.removeIdleMarkWorker() + } + + // wasm only: + // If a callback returned and no other goroutine is awake, + // then wake event handler goroutine which pauses execution + // until a callback was triggered. + gp, otherReady := beforeIdle(now, pollUntil) + if gp != nil { + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, false, false + } + if otherReady { + goto top + } + + // Before we drop our P, make a snapshot of the allp slice, + // which can change underfoot once we no longer block + // safe-points. We don't need to snapshot the contents because + // everything up to cap(allp) is immutable. + allpSnapshot := allp + // Also snapshot masks. Value changes are OK, but we can't allow + // len to change out from under us. + idlepMaskSnapshot := idlepMask + timerpMaskSnapshot := timerpMask + + // return P and block + lock(&sched.lock) + if sched.gcwaiting.Load() || pp.runSafePointFn != 0 { + unlock(&sched.lock) + goto top + } + if sched.runqsize != 0 { + gp := globrunqget(pp, 0) + unlock(&sched.lock) + return gp, false, false + } + if !mp.spinning && sched.needspinning.Load() == 1 { + // See "Delicate dance" comment below. 
+ mp.becomeSpinning() + unlock(&sched.lock) + goto top + } + if releasep() != pp { + throw("findrunnable: wrong p") + } + now = pidleput(pp, now) + unlock(&sched.lock) + + // Delicate dance: thread transitions from spinning to non-spinning + // state, potentially concurrently with submission of new work. We must + // drop nmspinning first and then check all sources again (with + // #StoreLoad memory barrier in between). If we do it the other way + // around, another thread can submit work after we've checked all + // sources but before we drop nmspinning; as a result nobody will + // unpark a thread to run the work. + // + // This applies to the following sources of work: + // + // * Goroutines added to a per-P run queue. + // * New/modified-earlier timers on a per-P timer heap. + // * Idle-priority GC work (barring golang.org/issue/19112). + // + // If we discover new work below, we need to restore m.spinning as a + // signal for resetspinning to unpark a new worker thread (because + // there can be more than one starving goroutine). + // + // However, if after discovering new work we also observe no idle Ps + // (either here or in resetspinning), we have a problem. We may be + // racing with a non-spinning M in the block above, having found no + // work and preparing to release its P and park. Allowing that P to go + // idle will result in loss of work conservation (idle P while there is + // runnable work). This could result in complete deadlock in the + // unlikely event that we discover new work (from netpoll) right as we + // are racing with _all_ other Ps going idle. + // + // We use sched.needspinning to synchronize with non-spinning Ms going + // idle. If needspinning is set when they are about to drop their P, + // they abort the drop and instead become a new spinning M on our + // behalf. 
If we are not racing and the system is truly fully loaded + // then no spinning threads are required, and the next thread to + // naturally become spinning will clear the flag. + // + // Also see "Worker thread parking/unparking" comment at the top of the + // file. + wasSpinning := mp.spinning + if mp.spinning { + mp.spinning = false + if sched.nmspinning.Add(-1) < 0 { + throw("findrunnable: negative nmspinning") + } + + // Note that for correctness, only the last M transitioning from + // spinning to non-spinning must perform these rechecks to + // ensure no missed work. However, the runtime has some cases + // of transient increments of nmspinning that are decremented + // without going through this path, so we must be conservative + // and perform the check on all spinning Ms. + // + // See https://go.dev/issue/43997. + + // Check all runqueues once again. + pp := checkRunqsNoP(allpSnapshot, idlepMaskSnapshot) + if pp != nil { + acquirep(pp) + mp.becomeSpinning() + goto top + } + + // Check for idle-priority GC work again. + pp, gp := checkIdleGCNoP() + if pp != nil { + acquirep(pp) + mp.becomeSpinning() + + // Run the idle worker. + pp.gcMarkWorkerMode = gcMarkWorkerIdleMode + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, false, false + } + + // Finally, check for timer creation or expiry concurrently with + // transitioning from spinning to non-spinning. + // + // Note that we cannot use checkTimers here because it calls + // adjusttimers which may need to allocate memory, and that isn't + // allowed when we don't have an active P. + pollUntil = checkTimersNoP(allpSnapshot, timerpMaskSnapshot, pollUntil) + } + + // Poll network until next timer.
+ if netpollinited() && (netpollWaiters.Load() > 0 || pollUntil != 0) && sched.lastpoll.Swap(0) != 0 { + sched.pollUntil.Store(pollUntil) + if mp.p != 0 { + throw("findrunnable: netpoll with p") + } + if mp.spinning { + throw("findrunnable: netpoll with spinning") + } + // Refresh now. + now = nanotime() + delay := int64(-1) + if pollUntil != 0 { + delay = pollUntil - now + if delay < 0 { + delay = 0 + } + } + if faketime != 0 { + // When using fake time, just poll. + delay = 0 + } + list := netpoll(delay) // block until new work is available + sched.pollUntil.Store(0) + sched.lastpoll.Store(now) + if faketime != 0 && list.empty() { + // Using fake time and nothing is ready; stop M. + // When all M's stop, checkdead will call timejump. + stopm() + goto top + } + lock(&sched.lock) + pp, _ := pidleget(now) + unlock(&sched.lock) + if pp == nil { + injectglist(&list) + } else { + acquirep(pp) + if !list.empty() { + gp := list.pop() + injectglist(&list) + casgstatus(gp, _Gwaiting, _Grunnable) + if trace.enabled { + traceGoUnpark(gp, 0) + } + return gp, false, false + } + if wasSpinning { + mp.becomeSpinning() + } + goto top + } + } else if pollUntil != 0 && netpollinited() { + pollerPollUntil := sched.pollUntil.Load() + if pollerPollUntil == 0 || pollerPollUntil > pollUntil { + netpollBreak() + } + } + stopm() + goto top +} + +// pollWork reports whether there is non-background work this P could +// be doing. This is a fairly lightweight check to be used for +// background work loops, like idle GC. It checks a subset of the +// conditions checked by the actual scheduler. +func pollWork() bool { + if sched.runqsize != 0 { + return true + } + p := getg().m.p.ptr() + if !runqempty(p) { + return true + } + if netpollinited() && netpollWaiters.Load() > 0 && sched.lastpoll.Load() != 0 { + if list := netpoll(0); !list.empty() { + injectglist(&list) + return true + } + } + return false +} + +// stealWork attempts to steal a runnable goroutine or timer from any P. 
+// +// If newWork is true, new work may have been readied. +// +// If now is not 0 it is the current time. stealWork returns the passed time or +// the current time if now was passed as 0. +func stealWork(now int64) (gp *g, inheritTime bool, rnow, pollUntil int64, newWork bool) { + pp := getg().m.p.ptr() + + ranTimer := false + + const stealTries = 4 + for i := 0; i < stealTries; i++ { + stealTimersOrRunNextG := i == stealTries-1 + + for enum := stealOrder.start(fastrand()); !enum.done(); enum.next() { + if sched.gcwaiting.Load() { + // GC work may be available. + return nil, false, now, pollUntil, true + } + p2 := allp[enum.position()] + if pp == p2 { + continue + } + + // Steal timers from p2. This call to checkTimers is the only place + // where we might hold a lock on a different P's timers. We do this + // once on the last pass before checking runnext because stealing + // from the other P's runnext should be the last resort, so if there + // are timers to steal do that first. + // + // We only check timers on one of the stealing iterations because + // the time stored in now doesn't change in this loop and checking + // the timers for each P more than once with the same value of now + // is probably a waste of time. + // + // timerpMask tells us whether the P may have timers at all. If it + // can't, no need to check at all. + if stealTimersOrRunNextG && timerpMask.read(enum.position()) { + tnow, w, ran := checkTimers(p2, now) + now = tnow + if w != 0 && (pollUntil == 0 || w < pollUntil) { + pollUntil = w + } + if ran { + // Running the timers may have + // made an arbitrary number of G's + // ready and added them to this P's + // local run queue. That invalidates + // the assumption of runqsteal + // that it always has room to add + // stolen G's. So check now if there + // is a local G to run. 
+ if gp, inheritTime := runqget(pp); gp != nil { + return gp, inheritTime, now, pollUntil, ranTimer + } + ranTimer = true + } + } + + // Don't bother to attempt to steal if p2 is idle. + if !idlepMask.read(enum.position()) { + if gp := runqsteal(pp, p2, stealTimersOrRunNextG); gp != nil { + return gp, false, now, pollUntil, ranTimer + } + } + } + } + + // No goroutines found to steal. Regardless, running a timer may have + // made some goroutine ready that we missed. Indicate the next timer to + // wait for. + return nil, false, now, pollUntil, ranTimer +} + +// Check all Ps for a runnable G to steal. +// +// On entry we have no P. If a G is available to steal and a P is available, +// the P is returned which the caller should acquire and attempt to steal the +// work to. +func checkRunqsNoP(allpSnapshot []*p, idlepMaskSnapshot pMask) *p { + for id, p2 := range allpSnapshot { + if !idlepMaskSnapshot.read(uint32(id)) && !runqempty(p2) { + lock(&sched.lock) + pp, _ := pidlegetSpinning(0) + if pp == nil { + // Can't get a P, don't bother checking remaining Ps. + unlock(&sched.lock) + return nil + } + unlock(&sched.lock) + return pp + } + } + + // No work available. + return nil +} + +// Check all Ps for a timer expiring sooner than pollUntil. +// +// Returns updated pollUntil value. +func checkTimersNoP(allpSnapshot []*p, timerpMaskSnapshot pMask, pollUntil int64) int64 { + for id, p2 := range allpSnapshot { + if timerpMaskSnapshot.read(uint32(id)) { + w := nobarrierWakeTime(p2) + if w != 0 && (pollUntil == 0 || w < pollUntil) { + pollUntil = w + } + } + } + + return pollUntil +} + +// Check for idle-priority GC, without a P on entry. +// +// If some GC work, a P, and a worker G are all available, the P and G will be +// returned. The returned P has not been wired yet. +func checkIdleGCNoP() (*p, *g) { + // N.B. Since we have no P, gcBlackenEnabled may change at any time; we + // must check again after acquiring a P. 
As an optimization, we also check + // if an idle mark worker is needed at all. This is OK here, because if we + // observe that one isn't needed, at least one is currently running. Even if + // it stops running, its own journey into the scheduler should schedule it + // again, if need be (at which point, this check will pass, if relevant). + if atomic.Load(&gcBlackenEnabled) == 0 || !gcController.needIdleMarkWorker() { + return nil, nil + } + if !gcMarkWorkAvailable(nil) { + return nil, nil + } + + // Work is available; we can start an idle GC worker only if there is + // an available P and available worker G. + // + // We can attempt to acquire these in either order, though both have + // synchronization concerns (see below). Workers are almost always + // available (see comment in findRunnableGCWorker for the one case + // there may be none). Since we're slightly less likely to find a P, + // check for that first. + // + // Synchronization: note that we must hold sched.lock until we are + // committed to keeping it. Otherwise we cannot put the unnecessary P + // back in sched.pidle without performing the full set of idle + // transition checks. + // + // If we were to check gcBgMarkWorkerPool first, we must somehow handle + // the assumption in gcControllerState.findRunnableGCWorker that an + // empty gcBgMarkWorkerPool is only possible if gcMarkDone is running. + lock(&sched.lock) + pp, now := pidlegetSpinning(0) + if pp == nil { + unlock(&sched.lock) + return nil, nil + } + + // Now that we own a P, gcBlackenEnabled can't change (as it requires STW). 
+ if gcBlackenEnabled == 0 || !gcController.addIdleMarkWorker() { + pidleput(pp, now) + unlock(&sched.lock) + return nil, nil + } + + node := (*gcBgMarkWorkerNode)(gcBgMarkWorkerPool.pop()) + if node == nil { + pidleput(pp, now) + unlock(&sched.lock) + gcController.removeIdleMarkWorker() + return nil, nil + } + + unlock(&sched.lock) + + return pp, node.gp.ptr() +} + +// wakeNetPoller wakes up the thread sleeping in the network poller if it isn't +// going to wake up before the when argument; or it wakes an idle P to service +// timers and the network poller if there isn't one already. +func wakeNetPoller(when int64) { + if sched.lastpoll.Load() == 0 { + // In findrunnable we ensure that when polling the pollUntil + // field is either zero or the time to which the current + // poll is expected to run. This can have a spurious wakeup + // but should never miss a wakeup. + pollerPollUntil := sched.pollUntil.Load() + if pollerPollUntil == 0 || pollerPollUntil > when { + netpollBreak() + } + } else { + // There are no threads in the network poller, try to get + // one there so it can handle new timers. + if GOOS != "plan9" { // Temporary workaround - see issue #42303. + wakep() + } + } +} + +func resetspinning() { + gp := getg() + if !gp.m.spinning { + throw("resetspinning: not a spinning m") + } + gp.m.spinning = false + nmspinning := sched.nmspinning.Add(-1) + if nmspinning < 0 { + throw("findrunnable: negative nmspinning") + } + // M wakeup policy is deliberately somewhat conservative, so check if we + // need to wakeup another P here. See "Worker thread parking/unparking" + // comment at the top of the file for details. + wakep() +} + +// injectglist adds each runnable G on the list to some run queue, +// and clears glist. If there is no current P, they are added to the +// global queue, and up to npidle M's are started to run them. +// Otherwise, for each idle P, this adds a G to the global queue +// and starts an M. 
Any remaining G's are added to the current P's +// local run queue. +// This may temporarily acquire sched.lock. +// Can run concurrently with GC. +func injectglist(glist *gList) { + if glist.empty() { + return + } + if trace.enabled { + for gp := glist.head.ptr(); gp != nil; gp = gp.schedlink.ptr() { + traceGoUnpark(gp, 0) + } + } + + // Mark all the goroutines as runnable before we put them + // on the run queues. + head := glist.head.ptr() + var tail *g + qsize := 0 + for gp := head; gp != nil; gp = gp.schedlink.ptr() { + tail = gp + qsize++ + casgstatus(gp, _Gwaiting, _Grunnable) + } + + // Turn the gList into a gQueue. + var q gQueue + q.head.set(head) + q.tail.set(tail) + *glist = gList{} + + startIdle := func(n int) { + for i := 0; i < n; i++ { + mp := acquirem() // See comment in startm. + lock(&sched.lock) + + pp, _ := pidlegetSpinning(0) + if pp == nil { + unlock(&sched.lock) + releasem(mp) + break + } + + startm(pp, false, true) + unlock(&sched.lock) + releasem(mp) + } + } + + pp := getg().m.p.ptr() + if pp == nil { + lock(&sched.lock) + globrunqputbatch(&q, int32(qsize)) + unlock(&sched.lock) + startIdle(qsize) + return + } + + npidle := int(sched.npidle.Load()) + var globq gQueue + var n int + for n = 0; n < npidle && !q.empty(); n++ { + g := q.pop() + globq.pushBack(g) + } + if n > 0 { + lock(&sched.lock) + globrunqputbatch(&globq, int32(n)) + unlock(&sched.lock) + startIdle(n) + qsize -= n + } + + if !q.empty() { + runqputbatch(pp, &q, qsize) + } +} + +// One round of scheduler: find a runnable goroutine and execute it. +// Never returns. +func schedule() { + mp := getg().m + + if mp.locks != 0 { + throw("schedule: holding locks") + } + + if mp.lockedg != 0 { + stoplockedm() + execute(mp.lockedg.ptr(), false) // Never returns. + } + + // We should not schedule away from a g that is executing a cgo call, + // since the cgo call is using the m's g0 stack. 
+ if mp.incgo { + throw("schedule: in cgo") + } + +top: + pp := mp.p.ptr() + pp.preempt = false + + // Safety check: if we are spinning, the run queue should be empty. + // Check this before calling checkTimers, as that might call + // goready to put a ready goroutine on the local run queue. + if mp.spinning && (pp.runnext != 0 || pp.runqhead != pp.runqtail) { + throw("schedule: spinning with local work") + } + + gp, inheritTime, tryWakeP := findRunnable() // blocks until work is available + + // This thread is going to run a goroutine and is not spinning anymore, + // so if it was marked as spinning we need to reset it now and potentially + // start a new spinning M. + if mp.spinning { + resetspinning() + } + + if sched.disable.user && !schedEnabled(gp) { + // Scheduling of this goroutine is disabled. Put it on + // the list of pending runnable goroutines for when we + // re-enable user scheduling and look again. + lock(&sched.lock) + if schedEnabled(gp) { + // Something re-enabled scheduling while we + // were acquiring the lock. + unlock(&sched.lock) + } else { + sched.disable.runnable.pushBack(gp) + sched.disable.n++ + unlock(&sched.lock) + goto top + } + } + + // If about to schedule a not-normal goroutine (a GCworker or tracereader), + // wake a P if there is one. + if tryWakeP { + wakep() + } + if gp.lockedm != 0 { + // Hands off own p to the locked m, + // then blocks waiting for a new p. + startlockedm(gp) + goto top + } + + execute(gp, inheritTime) +} + +// dropg removes the association between m and the current goroutine m->curg (gp for short). +// Typically a caller sets gp's status away from Grunning and then +// immediately calls dropg to finish the job. The caller is also responsible +// for arranging that gp will be restarted using ready at an +// appropriate time. After calling dropg and arranging for gp to be +// readied later, the caller can do other work but eventually should +// call schedule to restart the scheduling of goroutines on this m. 
+func dropg() { + gp := getg() + + setMNoWB(&gp.m.curg.m, nil) + setGNoWB(&gp.m.curg, nil) +} + +// checkTimers runs any timers for the P that are ready. +// If now is not 0 it is the current time. +// It returns the passed time or the current time if now was passed as 0, +// and the time when the next timer should run or 0 if there is no next timer, +// and reports whether it ran any timers. +// If the time when the next timer should run is not 0, +// it is always larger than the returned time. +// We pass now in and out to avoid extra calls of nanotime. +// +//go:yeswritebarrierrec +func checkTimers(pp *p, now int64) (rnow, pollUntil int64, ran bool) { + // If it's not yet time for the first timer, or the first adjusted + // timer, then there is nothing to do. + next := pp.timer0When.Load() + nextAdj := pp.timerModifiedEarliest.Load() + if next == 0 || (nextAdj != 0 && nextAdj < next) { + next = nextAdj + } + + if next == 0 { + // No timers to run or adjust. + return now, 0, false + } + + if now == 0 { + now = nanotime() + } + if now < next { + // Next timer is not ready to run, but keep going + // if we would clear deleted timers. + // This corresponds to the condition below where + // we decide whether to call clearDeletedTimers. + if pp != getg().m.p.ptr() || int(pp.deletedTimers.Load()) <= int(pp.numTimers.Load()/4) { + return now, next, false + } + } + + lock(&pp.timersLock) + + if len(pp.timers) > 0 { + adjusttimers(pp, now) + for len(pp.timers) > 0 { + // Note that runtimer may temporarily unlock + // pp.timersLock. + if tw := runtimer(pp, now); tw != 0 { + if tw > 0 { + pollUntil = tw + } + break + } + ran = true + } + } + + // If this is the local P, and there are a lot of deleted timers, + // clear them out. We only do this for the local P to reduce + // lock contention on timersLock.
+ if pp == getg().m.p.ptr() && int(pp.deletedTimers.Load()) > len(pp.timers)/4 { + clearDeletedTimers(pp) + } + + unlock(&pp.timersLock) + + return now, pollUntil, ran +} + +func parkunlock_c(gp *g, lock unsafe.Pointer) bool { + unlock((*mutex)(lock)) + return true +} + +// park continuation on g0. +func park_m(gp *g) { + mp := getg().m + + if trace.enabled { + traceGoPark(mp.waittraceev, mp.waittraceskip) + } + + // N.B. Not using casGToWaiting here because the waitreason is + // set by park_m's caller. + casgstatus(gp, _Grunning, _Gwaiting) + dropg() + + if fn := mp.waitunlockf; fn != nil { + ok := fn(gp, mp.waitlock) + mp.waitunlockf = nil + mp.waitlock = nil + if !ok { + if trace.enabled { + traceGoUnpark(gp, 2) + } + casgstatus(gp, _Gwaiting, _Grunnable) + execute(gp, true) // Schedule it back, never returns. + } + } + schedule() +} + +func goschedImpl(gp *g) { + status := readgstatus(gp) + if status&^_Gscan != _Grunning { + dumpgstatus(gp) + throw("bad g status") + } + casgstatus(gp, _Grunning, _Grunnable) + dropg() + lock(&sched.lock) + globrunqput(gp) + unlock(&sched.lock) + + schedule() +} + +// Gosched continuation on g0. +func gosched_m(gp *g) { + if trace.enabled { + traceGoSched() + } + goschedImpl(gp) +} + +// goschedguarded is a forbidden-states-avoided version of gosched_m. +func goschedguarded_m(gp *g) { + + if !canPreemptM(gp.m) { + gogo(&gp.sched) // never return + } + + if trace.enabled { + traceGoSched() + } + goschedImpl(gp) +} + +func gopreempt_m(gp *g) { + if trace.enabled { + traceGoPreempt() + } + goschedImpl(gp) +} + +// preemptPark parks gp and puts it in _Gpreempted. +// +//go:systemstack +func preemptPark(gp *g) { + if trace.enabled { + traceGoPark(traceEvGoBlock, 0) + } + status := readgstatus(gp) + if status&^_Gscan != _Grunning { + dumpgstatus(gp) + throw("bad g status") + } + + if gp.asyncSafePoint { + // Double-check that async preemption does not + // happen in SPWRITE assembly functions. 
+ // isAsyncSafePoint must exclude this case. + f := findfunc(gp.sched.pc) + if !f.valid() { + throw("preempt at unknown pc") + } + if f.flag&funcFlag_SPWRITE != 0 { + println("runtime: unexpected SPWRITE function", funcname(f), "in async preempt") + throw("preempt SPWRITE") + } + } + + // Transition from _Grunning to _Gscan|_Gpreempted. We can't + // be in _Grunning when we dropg because then we'd be running + // without an M, but the moment we're in _Gpreempted, + // something could claim this G before we've fully cleaned it + // up. Hence, we set the scan bit to lock down further + // transitions until we can dropg. + casGToPreemptScan(gp, _Grunning, _Gscan|_Gpreempted) + dropg() + casfrom_Gscanstatus(gp, _Gscan|_Gpreempted, _Gpreempted) + schedule() +} + +// goyield is like Gosched, but it: +// - emits a GoPreempt trace event instead of a GoSched trace event +// - puts the current G on the runq of the current P instead of the globrunq +func goyield() { + checkTimeouts() + mcall(goyield_m) +} + +func goyield_m(gp *g) { + if trace.enabled { + traceGoPreempt() + } + pp := gp.m.p.ptr() + casgstatus(gp, _Grunning, _Grunnable) + dropg() + runqput(pp, gp, false) + schedule() +} + +// Finishes execution of the current goroutine. +func goexit1() { + if raceenabled { + racegoend() + } + if trace.enabled { + traceGoEnd() + } + mcall(goexit0) +} + +// goexit continuation on g0. +func goexit0(gp *g) { + mp := getg().m + pp := mp.p.ptr() + + casgstatus(gp, _Grunning, _Gdead) + gcController.addScannableStack(pp, -int64(gp.stack.hi-gp.stack.lo)) + if isSystemGoroutine(gp, false) { + sched.ngsys.Add(-1) + } + gp.m = nil + locked := gp.lockedm != 0 + gp.lockedm = 0 + mp.lockedg = 0 + gp.preemptStop = false + gp.paniconfault = false + gp._defer = nil // should be true already but just in case. + gp._panic = nil // non-nil for Goexit during panic. points at stack-allocated data. 
+ gp.writebuf = nil + gp.waitreason = waitReasonZero + gp.param = nil + gp.labels = nil + gp.timer = nil + + if gcBlackenEnabled != 0 && gp.gcAssistBytes > 0 { + // Flush assist credit to the global pool. This gives + // better information to pacing if the application is + // rapidly creating and exiting goroutines. + assistWorkPerByte := gcController.assistWorkPerByte.Load() + scanCredit := int64(assistWorkPerByte * float64(gp.gcAssistBytes)) + gcController.bgScanCredit.Add(scanCredit) + gp.gcAssistBytes = 0 + } + + dropg() + + if GOARCH == "wasm" { // no threads yet on wasm + gfput(pp, gp) + schedule() // never returns + } + + if mp.lockedInt != 0 { + print("invalid m->lockedInt = ", mp.lockedInt, "\n") + throw("internal lockOSThread error") + } + gfput(pp, gp) + if locked { + // The goroutine may have locked this thread because + // it put it in an unusual kernel state. Kill it + // rather than returning it to the thread pool. + + // Return to mstart, which will release the P and exit + // the thread. + if GOOS != "plan9" { // See golang.org/issue/22227. + gogo(&mp.g0.sched) + } else { + // Clear lockedExt on plan9 since we may end up re-using + // this thread. + mp.lockedExt = 0 + } + } + schedule() +} + +// save updates getg().sched to refer to pc and sp so that a following +// gogo will restore pc and sp. +// +// save must not have write barriers because invoking a write barrier +// can clobber getg().sched. +// +//go:nosplit +//go:nowritebarrierrec +func save(pc, sp uintptr) { + gp := getg() + + if gp == gp.m.g0 || gp == gp.m.gsignal { + // m.g0.sched is special and must describe the context + // for exiting the thread. mstart1 writes to it directly. + // m.gsignal.sched should not be used at all. + // This check makes sure save calls do not accidentally + // run in contexts where they'd write to system g's.
+ throw("save on system g not allowed") + } + + gp.sched.pc = pc + gp.sched.sp = sp + gp.sched.lr = 0 + gp.sched.ret = 0 + // We need to ensure ctxt is zero, but can't have a write + // barrier here. However, it should always already be zero. + // Assert that. + if gp.sched.ctxt != nil { + badctxt() + } +} + +// The goroutine g is about to enter a system call. +// Record that it's not using the cpu anymore. +// This is called only from the go syscall library and cgocall, +// not from the low-level system calls used by the runtime. +// +// Entersyscall cannot split the stack: the save must +// make g->sched refer to the caller's stack segment, because +// entersyscall is going to return immediately after. +// +// Nothing entersyscall calls can split the stack either. +// We cannot safely move the stack during an active call to syscall, +// because we do not know which of the uintptr arguments are +// really pointers (back into the stack). +// In practice, this means that we make the fast path run through +// entersyscall doing no-split things, and the slow path has to use systemstack +// to run bigger things on the system stack. +// +// reentersyscall is the entry point used by cgo callbacks, where explicitly +// saved SP and PC are restored. This is needed when exitsyscall will be called +// from a function further up in the call stack than the parent, as g->syscallsp +// must always point to a valid stack frame. entersyscall below is the normal +// entry point for syscalls, which obtains the SP and PC from the caller. +// +// Syscall tracing: +// At the start of a syscall we emit traceGoSysCall to capture the stack trace. +// If the syscall does not block, that is it, we do not emit any other events. +// If the syscall blocks (that is, P is retaken), retaker emits traceGoSysBlock; +// when syscall returns we emit traceGoSysExit and when the goroutine starts running +// (potentially instantly, if exitsyscallfast returns true) we emit traceGoStart. 
+// To ensure that traceGoSysExit is emitted strictly after traceGoSysBlock, +// we remember current value of syscalltick in m (gp.m.syscalltick = gp.m.p.ptr().syscalltick), +// whoever emits traceGoSysBlock increments p.syscalltick afterwards; +// and we wait for the increment before emitting traceGoSysExit. +// Note that the increment is done even if tracing is not enabled, +// because tracing can be enabled in the middle of syscall. We don't want the wait to hang. +// +//go:nosplit +func reentersyscall(pc, sp uintptr) { + gp := getg() + + // Disable preemption because during this function g is in Gsyscall status, + // but can have inconsistent g->sched, do not let GC observe it. + gp.m.locks++ + + // Entersyscall must not call any function that might split/grow the stack. + // (See details in comment above.) + // Catch calls that might, by replacing the stack guard with something that + // will trip any stack check and leaving a flag to tell newstack to die. + gp.stackguard0 = stackPreempt + gp.throwsplit = true + + // Leave SP around for GC and traceback. 
+ save(pc, sp) + gp.syscallsp = sp + gp.syscallpc = pc + casgstatus(gp, _Grunning, _Gsyscall) + if gp.syscallsp < gp.stack.lo || gp.stack.hi < gp.syscallsp { + systemstack(func() { + print("entersyscall inconsistent ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n") + throw("entersyscall") + }) + } + + if trace.enabled { + systemstack(traceGoSysCall) + // systemstack itself clobbers g.sched.{pc,sp} and we might + // need them later when the G is genuinely blocked in a + // syscall + save(pc, sp) + } + + if sched.sysmonwait.Load() { + systemstack(entersyscall_sysmon) + save(pc, sp) + } + + if gp.m.p.ptr().runSafePointFn != 0 { + // runSafePointFn may stack split if run on this stack + systemstack(runSafePointFn) + save(pc, sp) + } + + gp.m.syscalltick = gp.m.p.ptr().syscalltick + gp.sysblocktraced = true + pp := gp.m.p.ptr() + pp.m = 0 + gp.m.oldp.set(pp) + gp.m.p = 0 + atomic.Store(&pp.status, _Psyscall) + if sched.gcwaiting.Load() { + systemstack(entersyscall_gcwait) + save(pc, sp) + } + + gp.m.locks-- +} + +// Standard syscall entry used by the go syscall library and normal cgo calls. +// +// This is exported via linkname to assembly in the syscall package and x/sys. +// +//go:nosplit +//go:linkname entersyscall +func entersyscall() { + reentersyscall(getcallerpc(), getcallersp()) +} + +func entersyscall_sysmon() { + lock(&sched.lock) + if sched.sysmonwait.Load() { + sched.sysmonwait.Store(false) + notewakeup(&sched.sysmonnote) + } + unlock(&sched.lock) +} + +func entersyscall_gcwait() { + gp := getg() + pp := gp.m.oldp.ptr() + + lock(&sched.lock) + if sched.stopwait > 0 && atomic.Cas(&pp.status, _Psyscall, _Pgcstop) { + if trace.enabled { + traceGoSysBlock(pp) + traceProcStop(pp) + } + pp.syscalltick++ + if sched.stopwait--; sched.stopwait == 0 { + notewakeup(&sched.stopnote) + } + } + unlock(&sched.lock) +} + +// The same as entersyscall(), but with a hint that the syscall is blocking. 
+// +//go:nosplit +func entersyscallblock() { + gp := getg() + + gp.m.locks++ // see comment in entersyscall + gp.throwsplit = true + gp.stackguard0 = stackPreempt // see comment in entersyscall + gp.m.syscalltick = gp.m.p.ptr().syscalltick + gp.sysblocktraced = true + gp.m.p.ptr().syscalltick++ + + // Leave SP around for GC and traceback. + pc := getcallerpc() + sp := getcallersp() + save(pc, sp) + gp.syscallsp = gp.sched.sp + gp.syscallpc = gp.sched.pc + if gp.syscallsp < gp.stack.lo || gp.stack.hi < gp.syscallsp { + sp1 := sp + sp2 := gp.sched.sp + sp3 := gp.syscallsp + systemstack(func() { + print("entersyscallblock inconsistent ", hex(sp1), " ", hex(sp2), " ", hex(sp3), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n") + throw("entersyscallblock") + }) + } + casgstatus(gp, _Grunning, _Gsyscall) + if gp.syscallsp < gp.stack.lo || gp.stack.hi < gp.syscallsp { + systemstack(func() { + print("entersyscallblock inconsistent ", hex(sp), " ", hex(gp.sched.sp), " ", hex(gp.syscallsp), " [", hex(gp.stack.lo), ",", hex(gp.stack.hi), "]\n") + throw("entersyscallblock") + }) + } + + systemstack(entersyscallblock_handoff) + + // Resave for traceback during blocked call. + save(getcallerpc(), getcallersp()) + + gp.m.locks-- +} + +func entersyscallblock_handoff() { + if trace.enabled { + traceGoSysCall() + traceGoSysBlock(getg().m.p.ptr()) + } + handoffp(releasep()) +} + +// The goroutine g exited its system call. +// Arrange for it to run on a cpu again. +// This is called only from the go syscall library, not +// from the low-level system calls used by the runtime. +// +// Write barriers are not allowed because our P may have been stolen. +// +// This is exported via linkname to assembly in the syscall package. 
+// +//go:nosplit +//go:nowritebarrierrec +//go:linkname exitsyscall +func exitsyscall() { + gp := getg() + + gp.m.locks++ // see comment in entersyscall + if getcallersp() > gp.syscallsp { + throw("exitsyscall: syscall frame is no longer valid") + } + + gp.waitsince = 0 + oldp := gp.m.oldp.ptr() + gp.m.oldp = 0 + if exitsyscallfast(oldp) { + // When exitsyscallfast returns success, we have a P so can now use + // write barriers + if goroutineProfile.active { + // Make sure that gp has had its stack written out to the goroutine + // profile, exactly as it was when the goroutine profiler first + // stopped the world. + systemstack(func() { + tryRecordGoroutineProfileWB(gp) + }) + } + if trace.enabled { + if oldp != gp.m.p.ptr() || gp.m.syscalltick != gp.m.p.ptr().syscalltick { + systemstack(traceGoStart) + } + } + // There's a cpu for us, so we can run. + gp.m.p.ptr().syscalltick++ + // We need to cas the status and scan before resuming... + casgstatus(gp, _Gsyscall, _Grunning) + + // Garbage collector isn't running (since we are), + // so okay to clear syscallsp. + gp.syscallsp = 0 + gp.m.locks-- + if gp.preempt { + // restore the preemption request in case we've cleared it in newstack + gp.stackguard0 = stackPreempt + } else { + // otherwise restore the real _StackGuard, we've spoiled it in entersyscall/entersyscallblock + gp.stackguard0 = gp.stack.lo + _StackGuard + } + gp.throwsplit = false + + if sched.disable.user && !schedEnabled(gp) { + // Scheduling of this goroutine is disabled. + Gosched() + } + + return + } + + gp.sysexitticks = 0 + if trace.enabled { + // Wait till traceGoSysBlock event is emitted. + // This ensures consistency of the trace (the goroutine is started after it is blocked). + for oldp != nil && oldp.syscalltick == gp.m.syscalltick { + osyield() + } + // We can't trace syscall exit right now because we don't have a P. + // Tracing code can invoke write barriers that cannot run without a P. 
+ // So instead we remember the syscall exit time and emit the event + // in execute when we have a P. + gp.sysexitticks = cputicks() + } + + gp.m.locks-- + + // Call the scheduler. + mcall(exitsyscall0) + + // Scheduler returned, so we're allowed to run now. + // Delete the syscallsp information that we left for + // the garbage collector during the system call. + // Must wait until now because until gosched returns + // we don't know for sure that the garbage collector + // is not running. + gp.syscallsp = 0 + gp.m.p.ptr().syscalltick++ + gp.throwsplit = false +} + +//go:nosplit +func exitsyscallfast(oldp *p) bool { + gp := getg() + + // Freezetheworld sets stopwait but does not retake P's. + if sched.stopwait == freezeStopWait { + return false + } + + // Try to re-acquire the last P. + if oldp != nil && oldp.status == _Psyscall && atomic.Cas(&oldp.status, _Psyscall, _Pidle) { + // There's a cpu for us, so we can run. + wirep(oldp) + exitsyscallfast_reacquired() + return true + } + + // Try to get any other idle P. + if sched.pidle != 0 { + var ok bool + systemstack(func() { + ok = exitsyscallfast_pidle() + if ok && trace.enabled { + if oldp != nil { + // Wait till traceGoSysBlock event is emitted. + // This ensures consistency of the trace (the goroutine is started after it is blocked). + for oldp.syscalltick == gp.m.syscalltick { + osyield() + } + } + traceGoSysExit(0) + } + }) + if ok { + return true + } + } + return false +} + +// exitsyscallfast_reacquired is the exitsyscall path on which this G +// has successfully reacquired the P it was running on before the +// syscall. +// +//go:nosplit +func exitsyscallfast_reacquired() { + gp := getg() + if gp.m.syscalltick != gp.m.p.ptr().syscalltick { + if trace.enabled { + // The p was retaken and then enter into syscall again (since gp.m.syscalltick has changed). + // traceGoSysBlock for this syscall was already emitted, + // but here we effectively retake the p from the new syscall running on the same p. 
+ systemstack(func() { + // Denote blocking of the new syscall. + traceGoSysBlock(gp.m.p.ptr()) + // Denote completion of the current syscall. + traceGoSysExit(0) + }) + } + gp.m.p.ptr().syscalltick++ + } +} + +func exitsyscallfast_pidle() bool { + lock(&sched.lock) + pp, _ := pidleget(0) + if pp != nil && sched.sysmonwait.Load() { + sched.sysmonwait.Store(false) + notewakeup(&sched.sysmonnote) + } + unlock(&sched.lock) + if pp != nil { + acquirep(pp) + return true + } + return false +} + +// exitsyscall slow path on g0. +// Failed to acquire P, enqueue gp as runnable. +// +// Called via mcall, so gp is the calling g from this M. +// +//go:nowritebarrierrec +func exitsyscall0(gp *g) { + casgstatus(gp, _Gsyscall, _Grunnable) + dropg() + lock(&sched.lock) + var pp *p + if schedEnabled(gp) { + pp, _ = pidleget(0) + } + var locked bool + if pp == nil { + globrunqput(gp) + + // Below, we stoplockedm if gp is locked. globrunqput releases + // ownership of gp, so we must check if gp is locked prior to + // committing the release by unlocking sched.lock, otherwise we + // could race with another M transitioning gp from unlocked to + // locked. + locked = gp.lockedm != 0 + } else if sched.sysmonwait.Load() { + sched.sysmonwait.Store(false) + notewakeup(&sched.sysmonnote) + } + unlock(&sched.lock) + if pp != nil { + acquirep(pp) + execute(gp, false) // Never returns. + } + if locked { + // Wait until another thread schedules gp and so m again. + // + // N.B. lockedm must be this M, as this g was running on this M + // before entersyscall. + stoplockedm() + execute(gp, false) // Never returns. + } + stopm() + schedule() // Never returns. +} + +// Called from syscall package before fork. +// +//go:linkname syscall_runtime_BeforeFork syscall.runtime_BeforeFork +//go:nosplit +func syscall_runtime_BeforeFork() { + gp := getg().m.curg + + // Block signals during a fork, so that the child does not run + // a signal handler before exec if a signal is sent to the process + // group. 
See issue #18600. + gp.m.locks++ + sigsave(&gp.m.sigmask) + sigblock(false) + + // This function is called before fork in syscall package. + // Code between fork and exec must not allocate memory nor even try to grow stack. + // Here we spoil g->_StackGuard to reliably detect any attempts to grow stack. + // runtime_AfterFork will undo this in parent process, but not in child. + gp.stackguard0 = stackFork +} + +// Called from syscall package after fork in parent. +// +//go:linkname syscall_runtime_AfterFork syscall.runtime_AfterFork +//go:nosplit +func syscall_runtime_AfterFork() { + gp := getg().m.curg + + // See the comments in beforefork. + gp.stackguard0 = gp.stack.lo + _StackGuard + + msigrestore(gp.m.sigmask) + + gp.m.locks-- +} + +// inForkedChild is true while manipulating signals in the child process. +// This is used to avoid calling libc functions in case we are using vfork. +var inForkedChild bool + +// Called from syscall package after fork in child. +// It resets non-sigignored signals to the default handler, and +// restores the signal mask in preparation for the exec. +// +// Because this might be called during a vfork, and therefore may be +// temporarily sharing address space with the parent process, this must +// not change any global variables or calling into C code that may do so. +// +//go:linkname syscall_runtime_AfterForkInChild syscall.runtime_AfterForkInChild +//go:nosplit +//go:nowritebarrierrec +func syscall_runtime_AfterForkInChild() { + // It's OK to change the global variable inForkedChild here + // because we are going to change it back. There is no race here, + // because if we are sharing address space with the parent process, + // then the parent process can not be running concurrently. + inForkedChild = true + + clearSignalHandlers() + + // When we are the child we are the only thread running, + // so we know that nothing else has changed gp.m.sigmask. 
+ msigrestore(getg().m.sigmask) + + inForkedChild = false +} + +// pendingPreemptSignals is the number of preemption signals +// that have been sent but not received. This is only used on Darwin. +// For #41702. +var pendingPreemptSignals atomic.Int32 + +// Called from syscall package before Exec. +// +//go:linkname syscall_runtime_BeforeExec syscall.runtime_BeforeExec +func syscall_runtime_BeforeExec() { + // Prevent thread creation during exec. + execLock.lock() + + // On Darwin, wait for all pending preemption signals to + // be received. See issue #41702. + if GOOS == "darwin" || GOOS == "ios" { + for pendingPreemptSignals.Load() > 0 { + osyield() + } + } +} + +// Called from syscall package after Exec. +// +//go:linkname syscall_runtime_AfterExec syscall.runtime_AfterExec +func syscall_runtime_AfterExec() { + execLock.unlock() +} + +// Allocate a new g, with a stack big enough for stacksize bytes. +func malg(stacksize int32) *g { + newg := new(g) + if stacksize >= 0 { + stacksize = round2(_StackSystem + stacksize) + systemstack(func() { + newg.stack = stackalloc(uint32(stacksize)) + }) + newg.stackguard0 = newg.stack.lo + _StackGuard + newg.stackguard1 = ^uintptr(0) + // Clear the bottom word of the stack. We record g + // there on gsignal stack during VDSO on ARM and ARM64. + *(*uintptr)(unsafe.Pointer(newg.stack.lo)) = 0 + } + return newg +} + +// Create a new g running fn. +// Put it on the queue of g's waiting to run. +// The compiler turns a go statement into a call to this. +func newproc(fn *funcval) { + gp := getg() + pc := getcallerpc() + systemstack(func() { + newg := newproc1(fn, gp, pc) + + pp := getg().m.p.ptr() + runqput(pp, newg, true) + + if mainStarted { + wakep() + } + }) +} + +// Create a new g in state _Grunnable, starting at fn. callerpc is the +// address of the go statement that created this. The caller is responsible +// for adding the new g to the scheduler. 
+func newproc1(fn *funcval, callergp *g, callerpc uintptr) *g { + if fn == nil { + fatal("go of nil func value") + } + + mp := acquirem() // disable preemption because we hold M and P in local vars. + pp := mp.p.ptr() + newg := gfget(pp) + if newg == nil { + newg = malg(_StackMin) + casgstatus(newg, _Gidle, _Gdead) + allgadd(newg) // publishes with a g->status of Gdead so GC scanner doesn't look at uninitialized stack. + } + if newg.stack.hi == 0 { + throw("newproc1: newg missing stack") + } + + if readgstatus(newg) != _Gdead { + throw("newproc1: new g is not Gdead") + } + + totalSize := uintptr(4*goarch.PtrSize + sys.MinFrameSize) // extra space in case of reads slightly beyond frame + totalSize = alignUp(totalSize, sys.StackAlign) + sp := newg.stack.hi - totalSize + spArg := sp + if usesLR { + // caller's LR + *(*uintptr)(unsafe.Pointer(sp)) = 0 + prepGoExitFrame(sp) + spArg += sys.MinFrameSize + } + + memclrNoHeapPointers(unsafe.Pointer(&newg.sched), unsafe.Sizeof(newg.sched)) + newg.sched.sp = sp + newg.stktopsp = sp + newg.sched.pc = abi.FuncPCABI0(goexit) + sys.PCQuantum // +PCQuantum so that previous instruction is in same function + newg.sched.g = guintptr(unsafe.Pointer(newg)) + gostartcallfn(&newg.sched, fn) + newg.gopc = callerpc + newg.ancestors = saveAncestors(callergp) + newg.startpc = fn.fn + if isSystemGoroutine(newg, false) { + sched.ngsys.Add(1) + } else { + // Only user goroutines inherit pprof labels. + if mp.curg != nil { + newg.labels = mp.curg.labels + } + if goroutineProfile.active { + // A concurrent goroutine profile is running. It should include + // exactly the set of goroutines that were alive when the goroutine + // profiler first stopped the world. That does not include newg, so + // mark it as not needing a profile before transitioning it from + // _Gdead. + newg.goroutineProfiled.Store(goroutineProfileSatisfied) + } + } + // Track initial transition? 
+ newg.trackingSeq = uint8(fastrand()) + if newg.trackingSeq%gTrackingPeriod == 0 { + newg.tracking = true + } + casgstatus(newg, _Gdead, _Grunnable) + gcController.addScannableStack(pp, int64(newg.stack.hi-newg.stack.lo)) + + if pp.goidcache == pp.goidcacheend { + // Sched.goidgen is the last allocated id, + // this batch must be [sched.goidgen+1, sched.goidgen+GoidCacheBatch]. + // At startup sched.goidgen=0, so main goroutine receives goid=1. + pp.goidcache = sched.goidgen.Add(_GoidCacheBatch) + pp.goidcache -= _GoidCacheBatch - 1 + pp.goidcacheend = pp.goidcache + _GoidCacheBatch + } + newg.goid = pp.goidcache + pp.goidcache++ + if raceenabled { + newg.racectx = racegostart(callerpc) + newg.raceignore = 0 + if newg.labels != nil { + // See note in proflabel.go on labelSync's role in synchronizing + // with the reads in the signal handler. + racereleasemergeg(newg, unsafe.Pointer(&labelSync)) + } + } + if trace.enabled { + traceGoCreate(newg, newg.startpc) + } + releasem(mp) + + return newg +} + +// saveAncestors copies previous ancestors of the given caller g and +// includes info for the current caller into a new set of tracebacks for +// a g being created. +func saveAncestors(callergp *g) *[]ancestorInfo { + // Copy all prior info, except for the root goroutine (goid 0).
+ if debug.tracebackancestors <= 0 || callergp.goid == 0 { + return nil + } + var callerAncestors []ancestorInfo + if callergp.ancestors != nil { + callerAncestors = *callergp.ancestors + } + n := int32(len(callerAncestors)) + 1 + if n > debug.tracebackancestors { + n = debug.tracebackancestors + } + ancestors := make([]ancestorInfo, n) + copy(ancestors[1:], callerAncestors) + + var pcs [_TracebackMaxFrames]uintptr + npcs := gcallers(callergp, 0, pcs[:]) + ipcs := make([]uintptr, npcs) + copy(ipcs, pcs[:]) + ancestors[0] = ancestorInfo{ + pcs: ipcs, + goid: callergp.goid, + gopc: callergp.gopc, + } + + ancestorsp := new([]ancestorInfo) + *ancestorsp = ancestors + return ancestorsp +} + +// Put on gfree list. +// If local list is too long, transfer a batch to the global list. +func gfput(pp *p, gp *g) { + if readgstatus(gp) != _Gdead { + throw("gfput: bad status (not Gdead)") + } + + stksize := gp.stack.hi - gp.stack.lo + + if stksize != uintptr(startingStackSize) { + // non-standard stack size - free it. + stackfree(gp.stack) + gp.stack.lo = 0 + gp.stack.hi = 0 + gp.stackguard0 = 0 + } + + pp.gFree.push(gp) + pp.gFree.n++ + if pp.gFree.n >= 64 { + var ( + inc int32 + stackQ gQueue + noStackQ gQueue + ) + for pp.gFree.n >= 32 { + gp := pp.gFree.pop() + pp.gFree.n-- + if gp.stack.lo == 0 { + noStackQ.push(gp) + } else { + stackQ.push(gp) + } + inc++ + } + lock(&sched.gFree.lock) + sched.gFree.noStack.pushAll(noStackQ) + sched.gFree.stack.pushAll(stackQ) + sched.gFree.n += inc + unlock(&sched.gFree.lock) + } +} + +// Get from gfree list. +// If local list is empty, grab a batch from global list. +func gfget(pp *p) *g { +retry: + if pp.gFree.empty() && (!sched.gFree.stack.empty() || !sched.gFree.noStack.empty()) { + lock(&sched.gFree.lock) + // Move a batch of free Gs to the P. + for pp.gFree.n < 32 { + // Prefer Gs with stacks. 
+ gp := sched.gFree.stack.pop() + if gp == nil { + gp = sched.gFree.noStack.pop() + if gp == nil { + break + } + } + sched.gFree.n-- + pp.gFree.push(gp) + pp.gFree.n++ + } + unlock(&sched.gFree.lock) + goto retry + } + gp := pp.gFree.pop() + if gp == nil { + return nil + } + pp.gFree.n-- + if gp.stack.lo != 0 && gp.stack.hi-gp.stack.lo != uintptr(startingStackSize) { + // Deallocate old stack. We kept it in gfput because it was the + // right size when the goroutine was put on the free list, but + // the right size has changed since then. + systemstack(func() { + stackfree(gp.stack) + gp.stack.lo = 0 + gp.stack.hi = 0 + gp.stackguard0 = 0 + }) + } + if gp.stack.lo == 0 { + // Stack was deallocated in gfput or just above. Allocate a new one. + systemstack(func() { + gp.stack = stackalloc(startingStackSize) + }) + gp.stackguard0 = gp.stack.lo + _StackGuard + } else { + if raceenabled { + racemalloc(unsafe.Pointer(gp.stack.lo), gp.stack.hi-gp.stack.lo) + } + if msanenabled { + msanmalloc(unsafe.Pointer(gp.stack.lo), gp.stack.hi-gp.stack.lo) + } + if asanenabled { + asanunpoison(unsafe.Pointer(gp.stack.lo), gp.stack.hi-gp.stack.lo) + } + } + return gp +} + +// Purge all cached G's from gfree list to the global list. +func gfpurge(pp *p) { + var ( + inc int32 + stackQ gQueue + noStackQ gQueue + ) + for !pp.gFree.empty() { + gp := pp.gFree.pop() + pp.gFree.n-- + if gp.stack.lo == 0 { + noStackQ.push(gp) + } else { + stackQ.push(gp) + } + inc++ + } + lock(&sched.gFree.lock) + sched.gFree.noStack.pushAll(noStackQ) + sched.gFree.stack.pushAll(stackQ) + sched.gFree.n += inc + unlock(&sched.gFree.lock) +} + +// Breakpoint executes a breakpoint trap. +func Breakpoint() { + breakpoint() +} + +// dolockOSThread is called by LockOSThread and lockOSThread below +// after they modify m.locked. Do not allow preemption during this call, +// or else the m might be different in this function than in the caller. 
+// +//go:nosplit +func dolockOSThread() { + if GOARCH == "wasm" { + return // no threads on wasm yet + } + gp := getg() + gp.m.lockedg.set(gp) + gp.lockedm.set(gp.m) +} + +//go:nosplit + +// LockOSThread wires the calling goroutine to its current operating system thread. +// The calling goroutine will always execute in that thread, +// and no other goroutine will execute in it, +// until the calling goroutine has made as many calls to +// UnlockOSThread as to LockOSThread. +// If the calling goroutine exits without unlocking the thread, +// the thread will be terminated. +// +// All init functions are run on the startup thread. Calling LockOSThread +// from an init function will cause the main function to be invoked on +// that thread. +// +// A goroutine should call LockOSThread before calling OS services or +// non-Go library functions that depend on per-thread state. +func LockOSThread() { + if atomic.Load(&newmHandoff.haveTemplateThread) == 0 && GOOS != "plan9" { + // If we need to start a new thread from the locked + // thread, we need the template thread. Start it now + // while we're in a known-good state. + startTemplateThread() + } + gp := getg() + gp.m.lockedExt++ + if gp.m.lockedExt == 0 { + gp.m.lockedExt-- + panic("LockOSThread nesting overflow") + } + dolockOSThread() +} + +//go:nosplit +func lockOSThread() { + getg().m.lockedInt++ + dolockOSThread() +} + +// dounlockOSThread is called by UnlockOSThread and unlockOSThread below +// after they update m->locked. Do not allow preemption during this call, +// or else the m might be different in this function than in the caller. +// +//go:nosplit +func dounlockOSThread() { + if GOARCH == "wasm" { + return // no threads on wasm yet + } + gp := getg() + if gp.m.lockedInt != 0 || gp.m.lockedExt != 0 { + return + } + gp.m.lockedg = 0 + gp.lockedm = 0 +} + +//go:nosplit + +// UnlockOSThread undoes an earlier call to LockOSThread.
+// If this drops the number of active LockOSThread calls on the +// calling goroutine to zero, it unwires the calling goroutine from +// its fixed operating system thread. +// If there are no active LockOSThread calls, this is a no-op. +// +// Before calling UnlockOSThread, the caller must ensure that the OS +// thread is suitable for running other goroutines. If the caller made +// any permanent changes to the state of the thread that would affect +// other goroutines, it should not call this function and thus leave +// the goroutine locked to the OS thread until the goroutine (and +// hence the thread) exits. +func UnlockOSThread() { + gp := getg() + if gp.m.lockedExt == 0 { + return + } + gp.m.lockedExt-- + dounlockOSThread() +} + +//go:nosplit +func unlockOSThread() { + gp := getg() + if gp.m.lockedInt == 0 { + systemstack(badunlockosthread) + } + gp.m.lockedInt-- + dounlockOSThread() +} + +func badunlockosthread() { + throw("runtime: internal error: misuse of lockOSThread/unlockOSThread") +} + +func gcount() int32 { + n := int32(atomic.Loaduintptr(&allglen)) - sched.gFree.n - sched.ngsys.Load() + for _, pp := range allp { + n -= pp.gFree.n + } + + // All these variables can be changed concurrently, so the result can be inconsistent. + // But at least the current goroutine is running. + if n < 1 { + n = 1 + } + return n +} + +func mcount() int32 { + return int32(sched.mnext - sched.nmfreed) +} + +var prof struct { + signalLock atomic.Uint32 + + // Must hold signalLock to write. Reads may be lock-free, but + // signalLock should be taken to synchronize with changes. + hz atomic.Int32 +} + +func _System() { _System() } +func _ExternalCode() { _ExternalCode() } +func _LostExternalCode() { _LostExternalCode() } +func _GC() { _GC() } +func _LostSIGPROFDuringAtomic64() { _LostSIGPROFDuringAtomic64() } +func _VDSO() { _VDSO() } + +// Called if we receive a SIGPROF signal. +// Called by the signal handler, may run during STW. 
+// +//go:nowritebarrierrec +func sigprof(pc, sp, lr uintptr, gp *g, mp *m) { + if prof.hz.Load() == 0 { + return + } + + // If mp.profilehz is 0, then profiling is not enabled for this thread. + // We must check this to avoid a deadlock between setcpuprofilerate + // and the call to cpuprof.add, below. + if mp != nil && mp.profilehz == 0 { + return + } + + // On mips{,le}/arm, 64bit atomics are emulated with spinlocks, in + // runtime/internal/atomic. If SIGPROF arrives while the program is inside + // the critical section, it creates a deadlock (when writing the sample). + // As a workaround, create a counter of SIGPROFs while in critical section + // to store the count, and pass it to sigprof.add() later when SIGPROF is + // received from somewhere else (with _LostSIGPROFDuringAtomic64 as pc). + if GOARCH == "mips" || GOARCH == "mipsle" || GOARCH == "arm" { + if f := findfunc(pc); f.valid() { + if hasPrefix(funcname(f), "runtime/internal/atomic") { + cpuprof.lostAtomic++ + return + } + } + if GOARCH == "arm" && goarm < 7 && GOOS == "linux" && pc&0xffff0000 == 0xffff0000 { + // runtime/internal/atomic functions call into kernel + // helpers on arm < 7. See + // runtime/internal/atomic/sys_linux_arm.s. + cpuprof.lostAtomic++ + return + } + } + + // Profiling runs concurrently with GC, so it must not allocate. + // Set a trap in case the code does allocate. + // Note that on windows, one thread takes profiles of all the + // other threads, so mp is usually not getg().m. + // In fact mp may not even be stopped. + // See golang.org/issue/17165. + getg().m.mallocing++ + + var stk [maxCPUProfStack]uintptr + n := 0 + if mp.ncgo > 0 && mp.curg != nil && mp.curg.syscallpc != 0 && mp.curg.syscallsp != 0 { + cgoOff := 0 + // Check cgoCallersUse to make sure that we are not + // interrupting other code that is fiddling with + // cgoCallers. We are running in a signal handler + // with all signals blocked, so we don't have to worry + // about any other code interrupting us. 
+ if mp.cgoCallersUse.Load() == 0 && mp.cgoCallers != nil && mp.cgoCallers[0] != 0 { + for cgoOff < len(mp.cgoCallers) && mp.cgoCallers[cgoOff] != 0 { + cgoOff++ + } + copy(stk[:], mp.cgoCallers[:cgoOff]) + mp.cgoCallers[0] = 0 + } + + // Collect Go stack that leads to the cgo call. + n = gentraceback(mp.curg.syscallpc, mp.curg.syscallsp, 0, mp.curg, 0, &stk[cgoOff], len(stk)-cgoOff, nil, nil, 0) + if n > 0 { + n += cgoOff + } + } else if usesLibcall() && mp.libcallg != 0 && mp.libcallpc != 0 && mp.libcallsp != 0 { + // Libcall, i.e. runtime syscall on windows. + // Collect Go stack that leads to the call. + n = gentraceback(mp.libcallpc, mp.libcallsp, 0, mp.libcallg.ptr(), 0, &stk[n], len(stk[n:]), nil, nil, 0) + } else if mp != nil && mp.vdsoSP != 0 { + // VDSO call, e.g. nanotime1 on Linux. + // Collect Go stack that leads to the call. + n = gentraceback(mp.vdsoPC, mp.vdsoSP, 0, gp, 0, &stk[n], len(stk[n:]), nil, nil, _TraceJumpStack) + } else { + n = gentraceback(pc, sp, lr, gp, 0, &stk[0], len(stk), nil, nil, _TraceTrap|_TraceJumpStack) + } + + if n <= 0 { + // Normal traceback is impossible or has failed. + // Account it against abstract "System" or "GC". + n = 2 + if inVDSOPage(pc) { + pc = abi.FuncPCABIInternal(_VDSO) + sys.PCQuantum + } else if pc > firstmoduledata.etext { + // "ExternalCode" is better than "etext". + pc = abi.FuncPCABIInternal(_ExternalCode) + sys.PCQuantum + } + stk[0] = pc + if mp.preemptoff != "" { + stk[1] = abi.FuncPCABIInternal(_GC) + sys.PCQuantum + } else { + stk[1] = abi.FuncPCABIInternal(_System) + sys.PCQuantum + } + } + + if prof.hz.Load() != 0 { + // Note: it can happen on Windows that we interrupted a system thread + // with no g, so gp could be nil. The other nil checks are done out of + // caution, but not expected to be nil in practice.
+ var tagPtr *unsafe.Pointer + if gp != nil && gp.m != nil && gp.m.curg != nil { + tagPtr = &gp.m.curg.labels + } + cpuprof.add(tagPtr, stk[:n]) + + gprof := gp + var pp *p + if gp != nil && gp.m != nil { + if gp.m.curg != nil { + gprof = gp.m.curg + } + pp = gp.m.p.ptr() + } + traceCPUSample(gprof, pp, stk[:n]) + } + getg().m.mallocing-- +} + +// setcpuprofilerate sets the CPU profiling rate to hz times per second. +// If hz <= 0, setcpuprofilerate turns off CPU profiling. +func setcpuprofilerate(hz int32) { + // Force sane arguments. + if hz < 0 { + hz = 0 + } + + // Disable preemption, otherwise we can be rescheduled to another thread + // that has profiling enabled. + gp := getg() + gp.m.locks++ + + // Stop profiler on this thread so that it is safe to lock prof. + // if a profiling signal came in while we had prof locked, + // it would deadlock. + setThreadCPUProfiler(0) + + for !prof.signalLock.CompareAndSwap(0, 1) { + osyield() + } + if prof.hz.Load() != hz { + setProcessCPUProfiler(hz) + prof.hz.Store(hz) + } + prof.signalLock.Store(0) + + lock(&sched.lock) + sched.profilehz = hz + unlock(&sched.lock) + + if hz != 0 { + setThreadCPUProfiler(hz) + } + + gp.m.locks-- +} + +// init initializes pp, which may be a freshly allocated p or a +// previously destroyed p, and transitions it to status _Pgcstop. +func (pp *p) init(id int32) { + pp.id = id + pp.status = _Pgcstop + pp.sudogcache = pp.sudogbuf[:0] + pp.deferpool = pp.deferpoolbuf[:0] + pp.wbBuf.reset() + if pp.mcache == nil { + if id == 0 { + if mcache0 == nil { + throw("missing mcache?") + } + // Use the bootstrap mcache0. Only one P will get + // mcache0: the one with ID 0. 
+ pp.mcache = mcache0 + } else { + pp.mcache = allocmcache() + } + } + if raceenabled && pp.raceprocctx == 0 { + if id == 0 { + pp.raceprocctx = raceprocctx0 + raceprocctx0 = 0 // bootstrap + } else { + pp.raceprocctx = raceproccreate() + } + } + lockInit(&pp.timersLock, lockRankTimers) + + // This P may get timers when it starts running. Set the mask here + // since the P may not go through pidleget (notably P 0 on startup). + timerpMask.set(id) + // Similarly, we may not go through pidleget before this P starts + // running if it is P 0 on startup. + idlepMask.clear(id) +} + +// destroy releases all of the resources associated with pp and +// transitions it to status _Pdead. +// +// sched.lock must be held and the world must be stopped. +func (pp *p) destroy() { + assertLockHeld(&sched.lock) + assertWorldStopped() + + // Move all runnable goroutines to the global queue + for pp.runqhead != pp.runqtail { + // Pop from tail of local queue + pp.runqtail-- + gp := pp.runq[pp.runqtail%uint32(len(pp.runq))].ptr() + // Push onto head of global queue + globrunqputhead(gp) + } + if pp.runnext != 0 { + globrunqputhead(pp.runnext.ptr()) + pp.runnext = 0 + } + if len(pp.timers) > 0 { + plocal := getg().m.p.ptr() + // The world is stopped, but we acquire timersLock to + // protect against sysmon calling timeSleepUntil. + // This is the only case where we hold the timersLock of + // more than one P, so there are no deadlock concerns. + lock(&plocal.timersLock) + lock(&pp.timersLock) + moveTimers(plocal, pp.timers) + pp.timers = nil + pp.numTimers.Store(0) + pp.deletedTimers.Store(0) + pp.timer0When.Store(0) + unlock(&pp.timersLock) + unlock(&plocal.timersLock) + } + // Flush p's write barrier buffer. 
+ if gcphase != _GCoff { + wbBufFlush1(pp) + pp.gcw.dispose() + } + for i := range pp.sudogbuf { + pp.sudogbuf[i] = nil + } + pp.sudogcache = pp.sudogbuf[:0] + for j := range pp.deferpoolbuf { + pp.deferpoolbuf[j] = nil + } + pp.deferpool = pp.deferpoolbuf[:0] + systemstack(func() { + for i := 0; i < pp.mspancache.len; i++ { + // Safe to call since the world is stopped. + mheap_.spanalloc.free(unsafe.Pointer(pp.mspancache.buf[i])) + } + pp.mspancache.len = 0 + lock(&mheap_.lock) + pp.pcache.flush(&mheap_.pages) + unlock(&mheap_.lock) + }) + freemcache(pp.mcache) + pp.mcache = nil + gfpurge(pp) + traceProcFree(pp) + if raceenabled { + if pp.timerRaceCtx != 0 { + // The race detector code uses a callback to fetch + // the proc context, so arrange for that callback + // to see the right thing. + // This hack only works because we are the only + // thread running. + mp := getg().m + phold := mp.p.ptr() + mp.p.set(pp) + + racectxend(pp.timerRaceCtx) + pp.timerRaceCtx = 0 + + mp.p.set(phold) + } + raceprocdestroy(pp.raceprocctx) + pp.raceprocctx = 0 + } + pp.gcAssistTime = 0 + pp.status = _Pdead +} + +// Change number of processors. +// +// sched.lock must be held, and the world must be stopped. +// +// gcworkbufs must not be being modified by either the GC or the write barrier +// code, so the GC must not be running if the number of Ps actually changes. +// +// Returns list of Ps with local work, they need to be scheduled by the caller. +func procresize(nprocs int32) *p { + assertLockHeld(&sched.lock) + assertWorldStopped() + + old := gomaxprocs + if old < 0 || nprocs <= 0 { + throw("procresize: invalid arg") + } + if trace.enabled { + traceGomaxprocs(nprocs) + } + + // update statistics + now := nanotime() + if sched.procresizetime != 0 { + sched.totaltime += int64(old) * (now - sched.procresizetime) + } + sched.procresizetime = now + + maskWords := (nprocs + 31) / 32 + + // Grow allp if necessary. 
+ if nprocs > int32(len(allp)) { + // Synchronize with retake, which could be running + // concurrently since it doesn't run on a P. + lock(&allpLock) + if nprocs <= int32(cap(allp)) { + allp = allp[:nprocs] + } else { + nallp := make([]*p, nprocs) + // Copy everything up to allp's cap so we + // never lose old allocated Ps. + copy(nallp, allp[:cap(allp)]) + allp = nallp + } + + if maskWords <= int32(cap(idlepMask)) { + idlepMask = idlepMask[:maskWords] + timerpMask = timerpMask[:maskWords] + } else { + nidlepMask := make([]uint32, maskWords) + // No need to copy beyond len, old Ps are irrelevant. + copy(nidlepMask, idlepMask) + idlepMask = nidlepMask + + ntimerpMask := make([]uint32, maskWords) + copy(ntimerpMask, timerpMask) + timerpMask = ntimerpMask + } + unlock(&allpLock) + } + + // initialize new P's + for i := old; i < nprocs; i++ { + pp := allp[i] + if pp == nil { + pp = new(p) + } + pp.init(i) + atomicstorep(unsafe.Pointer(&allp[i]), unsafe.Pointer(pp)) + } + + gp := getg() + if gp.m.p != 0 && gp.m.p.ptr().id < nprocs { + // continue to use the current P + gp.m.p.ptr().status = _Prunning + gp.m.p.ptr().mcache.prepareForSweep() + } else { + // release the current P and acquire allp[0]. + // + // We must do this before destroying our current P + // because p.destroy itself has write barriers, so we + // need to do that from a valid P. + if gp.m.p != 0 { + if trace.enabled { + // Pretend that we were descheduled + // and then scheduled again to keep + // the trace sane. + traceGoSched() + traceProcStop(gp.m.p.ptr()) + } + gp.m.p.ptr().m = 0 + } + gp.m.p = 0 + pp := allp[0] + pp.m = 0 + pp.status = _Pidle + acquirep(pp) + if trace.enabled { + traceGoStart() + } + } + + // g.m.p is now set, so we no longer need mcache0 for bootstrapping. + mcache0 = nil + + // release resources from unused P's + for i := nprocs; i < old; i++ { + pp := allp[i] + pp.destroy() + // can't free P itself because it can be referenced by an M in syscall + } + + // Trim allp. 
+ if int32(len(allp)) != nprocs { + lock(&allpLock) + allp = allp[:nprocs] + idlepMask = idlepMask[:maskWords] + timerpMask = timerpMask[:maskWords] + unlock(&allpLock) + } + + var runnablePs *p + for i := nprocs - 1; i >= 0; i-- { + pp := allp[i] + if gp.m.p.ptr() == pp { + continue + } + pp.status = _Pidle + if runqempty(pp) { + pidleput(pp, now) + } else { + pp.m.set(mget()) + pp.link.set(runnablePs) + runnablePs = pp + } + } + stealOrder.reset(uint32(nprocs)) + var int32p *int32 = &gomaxprocs // make compiler check that gomaxprocs is an int32 + atomic.Store((*uint32)(unsafe.Pointer(int32p)), uint32(nprocs)) + if old != nprocs { + // Notify the limiter that the amount of procs has changed. + gcCPULimiter.resetCapacity(now, nprocs) + } + return runnablePs +} + +// Associate p and the current m. +// +// This function is allowed to have write barriers even if the caller +// isn't because it immediately acquires pp. +// +//go:yeswritebarrierrec +func acquirep(pp *p) { + // Do the part that isn't allowed to have write barriers. + wirep(pp) + + // Have p; write barriers now allowed. + + // Perform deferred mcache flush before this P can allocate + // from a potentially stale mcache. + pp.mcache.prepareForSweep() + + if trace.enabled { + traceProcStart() + } +} + +// wirep is the first step of acquirep, which actually associates the +// current M to pp. This is broken out so we can disallow write +// barriers for this part, since we don't yet have a P. +// +//go:nowritebarrierrec +//go:nosplit +func wirep(pp *p) { + gp := getg() + + if gp.m.p != 0 { + throw("wirep: already in go") + } + if pp.m != 0 || pp.status != _Pidle { + id := int64(0) + if pp.m != 0 { + id = pp.m.ptr().id + } + print("wirep: p->m=", pp.m, "(", id, ") p->status=", pp.status, "\n") + throw("wirep: invalid p state") + } + gp.m.p.set(pp) + pp.m.set(gp.m) + pp.status = _Prunning +} + +// Disassociate p and the current m. 
+func releasep() *p { + gp := getg() + + if gp.m.p == 0 { + throw("releasep: invalid arg") + } + pp := gp.m.p.ptr() + if pp.m.ptr() != gp.m || pp.status != _Prunning { + print("releasep: m=", gp.m, " m->p=", gp.m.p.ptr(), " p->m=", hex(pp.m), " p->status=", pp.status, "\n") + throw("releasep: invalid p state") + } + if trace.enabled { + traceProcStop(gp.m.p.ptr()) + } + gp.m.p = 0 + pp.m = 0 + pp.status = _Pidle + return pp +} + +func incidlelocked(v int32) { + lock(&sched.lock) + sched.nmidlelocked += v + if v > 0 { + checkdead() + } + unlock(&sched.lock) +} + +// Check for deadlock situation. +// The check is based on number of running M's, if 0 -> deadlock. +// sched.lock must be held. +func checkdead() { + assertLockHeld(&sched.lock) + + // For -buildmode=c-shared or -buildmode=c-archive it's OK if + // there are no running goroutines. The calling program is + // assumed to be running. + if islibrary || isarchive { + return + } + + // If we are dying because of a signal caught on an already idle thread, + // freezetheworld will cause all running threads to block. + // And runtime will essentially enter into deadlock state, + // except that there is a thread that will call exit soon. + if panicking.Load() > 0 { + return + } + + // If we are not running under cgo, but we have an extra M then account + // for it. (It is possible to have an extra M on Windows without cgo to + // accommodate callbacks created by syscall.NewCallback. See issue #6751 + // for details.) 
+ var run0 int32 + if !iscgo && cgoHasExtraM { + mp := lockextra(true) + haveExtraM := extraMCount > 0 + unlockextra(mp) + if haveExtraM { + run0 = 1 + } + } + + run := mcount() - sched.nmidle - sched.nmidlelocked - sched.nmsys + if run > run0 { + return + } + if run < 0 { + print("runtime: checkdead: nmidle=", sched.nmidle, " nmidlelocked=", sched.nmidlelocked, " mcount=", mcount(), " nmsys=", sched.nmsys, "\n") + throw("checkdead: inconsistent counts") + } + + grunning := 0 + forEachG(func(gp *g) { + if isSystemGoroutine(gp, false) { + return + } + s := readgstatus(gp) + switch s &^ _Gscan { + case _Gwaiting, + _Gpreempted: + grunning++ + case _Grunnable, + _Grunning, + _Gsyscall: + print("runtime: checkdead: find g ", gp.goid, " in status ", s, "\n") + throw("checkdead: runnable g") + } + }) + if grunning == 0 { // possible if main goroutine calls runtime·Goexit() + unlock(&sched.lock) // unlock so that GODEBUG=scheddetail=1 doesn't hang + fatal("no goroutines (main called runtime.Goexit) - deadlock!") + } + + // Maybe jump time forward for playground. + if faketime != 0 { + if when := timeSleepUntil(); when < maxWhen { + faketime = when + + // Start an M to steal the timer. + pp, _ := pidleget(faketime) + if pp == nil { + // There should always be a free P since + // nothing is running. + throw("checkdead: no p for timer") + } + mp := mget() + if mp == nil { + // There should always be a free M since + // nothing is running. + throw("checkdead: no m for timer") + } + // M must be spinning to steal. We set this to be + // explicit, but since this is the only M it would + // become spinning on its own anyways. + sched.nmspinning.Add(1) + mp.spinning = true + mp.nextp.set(pp) + notewakeup(&mp.park) + return + } + } + + // There are no goroutines running, so we can look at the P's. 
+ for _, pp := range allp { + if len(pp.timers) > 0 { + return + } + } + + unlock(&sched.lock) // unlock so that GODEBUG=scheddetail=1 doesn't hang + fatal("all goroutines are asleep - deadlock!") +} + +// forcegcperiod is the maximum time in nanoseconds between garbage +// collections. If we go this long without a garbage collection, one +// is forced to run. +// +// This is a variable for testing purposes. It normally doesn't change. +var forcegcperiod int64 = 2 * 60 * 1e9 + +// needSysmonWorkaround is true if the workaround for +// golang.org/issue/42515 is needed on NetBSD. +var needSysmonWorkaround bool = false + +// Always runs without a P, so write barriers are not allowed. +// +//go:nowritebarrierrec +func sysmon() { + lock(&sched.lock) + sched.nmsys++ + checkdead() + unlock(&sched.lock) + + lasttrace := int64(0) + idle := 0 // how many consecutive cycles we have not woken anybody up + delay := uint32(0) + + for { + if idle == 0 { // start with 20us sleep... + delay = 20 + } else if idle > 50 { // start doubling the sleep after 1ms... + delay *= 2 + } + if delay > 10*1000 { // up to 10ms + delay = 10 * 1000 + } + usleep(delay) + + // sysmon should not enter deep sleep if schedtrace is enabled so that + // it can print that information at the right time. + // + // It should also not enter deep sleep if there are any active P's so + // that it can retake P's from syscalls, preempt long running G's, and + // poll the network if all P's are busy for long stretches. + // + // It should wake up from deep sleep if any P's become active either due + // to exiting a syscall or waking up due to a timer expiring so that it + // can resume performing those duties. If it wakes from a syscall it + // resets idle and delay as a bet that since it had retaken a P from a + // syscall before, it may need to do it again shortly after the + // application starts work again.
It does not reset idle when waking + // from a timer to avoid adding system load to applications that spend + // most of their time sleeping. + now := nanotime() + if debug.schedtrace <= 0 && (sched.gcwaiting.Load() || sched.npidle.Load() == gomaxprocs) { + lock(&sched.lock) + if sched.gcwaiting.Load() || sched.npidle.Load() == gomaxprocs { + syscallWake := false + next := timeSleepUntil() + if next > now { + sched.sysmonwait.Store(true) + unlock(&sched.lock) + // Make wake-up period small enough + // for the sampling to be correct. + sleep := forcegcperiod / 2 + if next-now < sleep { + sleep = next - now + } + shouldRelax := sleep >= osRelaxMinNS + if shouldRelax { + osRelax(true) + } + syscallWake = notetsleep(&sched.sysmonnote, sleep) + if shouldRelax { + osRelax(false) + } + lock(&sched.lock) + sched.sysmonwait.Store(false) + noteclear(&sched.sysmonnote) + } + if syscallWake { + idle = 0 + delay = 20 + } + } + unlock(&sched.lock) + } + + lock(&sched.sysmonlock) + // Update now in case we blocked on sysmonnote or spent a long time + // blocked on schedlock or sysmonlock above. + now = nanotime() + + // trigger libc interceptors if needed + if *cgo_yield != nil { + asmcgocall(*cgo_yield, nil) + } + // poll network if not polled for more than 10ms + lastpoll := sched.lastpoll.Load() + if netpollinited() && lastpoll != 0 && lastpoll+10*1000*1000 < now { + sched.lastpoll.CompareAndSwap(lastpoll, now) + list := netpoll(0) // non-blocking - returns list of goroutines + if !list.empty() { + // Need to decrement number of idle locked M's + // (pretending that one more is running) before injectglist. + // Otherwise it can lead to the following situation: + // injectglist grabs all P's but before it starts M's to run the P's, + // another M returns from syscall, finishes running its G, + // observes that there is no work to do and no other running M's + // and reports deadlock. 
+ incidlelocked(-1) + injectglist(&list) + incidlelocked(1) + } + } + if GOOS == "netbsd" && needSysmonWorkaround { + // netpoll is responsible for waiting for timer + // expiration, so we typically don't have to worry + // about starting an M to service timers. (Note that + // sleep for timeSleepUntil above simply ensures sysmon + // starts running again when that timer expiration may + // cause Go code to run again). + // + // However, netbsd has a kernel bug that sometimes + // misses netpollBreak wake-ups, which can lead to + // unbounded delays servicing timers. If we detect this + // overrun, then startm to get something to handle the + // timer. + // + // See issue 42515 and + // https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=50094. + if next := timeSleepUntil(); next < now { + startm(nil, false, false) + } + } + if scavenger.sysmonWake.Load() != 0 { + // Kick the scavenger awake if someone requested it. + scavenger.wake() + } + // retake P's blocked in syscalls + // and preempt long running G's + if retake(now) != 0 { + idle = 0 + } else { + idle++ + } + // check if we need to force a GC + if t := (gcTrigger{kind: gcTriggerTime, now: now}); t.test() && forcegc.idle.Load() { + lock(&forcegc.lock) + forcegc.idle.Store(false) + var list gList + list.push(forcegc.g) + injectglist(&list) + unlock(&forcegc.lock) + } + if debug.schedtrace > 0 && lasttrace+int64(debug.schedtrace)*1000000 <= now { + lasttrace = now + schedtrace(debug.scheddetail > 0) + } + unlock(&sched.sysmonlock) + } +} + +type sysmontick struct { + schedtick uint32 + schedwhen int64 + syscalltick uint32 + syscallwhen int64 +} + +// forcePreemptNS is the time slice given to a G before it is +// preempted. +const forcePreemptNS = 10 * 1000 * 1000 // 10ms + +func retake(now int64) uint32 { + n := 0 + // Prevent allp slice changes. This lock will be completely + // uncontended unless we're already stopping the world. 
+ lock(&allpLock) + // We can't use a range loop over allp because we may + // temporarily drop the allpLock. Hence, we need to re-fetch + // allp each time around the loop. + for i := 0; i < len(allp); i++ { + pp := allp[i] + if pp == nil { + // This can happen if procresize has grown + // allp but not yet created new Ps. + continue + } + pd := &pp.sysmontick + s := pp.status + sysretake := false + if s == _Prunning || s == _Psyscall { + // Preempt G if it's running for too long. + t := int64(pp.schedtick) + if int64(pd.schedtick) != t { + pd.schedtick = uint32(t) + pd.schedwhen = now + } else if pd.schedwhen+forcePreemptNS <= now { + preemptone(pp) + // In case of syscall, preemptone() doesn't + // work, because there is no M wired to P. + sysretake = true + } + } + if s == _Psyscall { + // Retake P from syscall if it's there for more than 1 sysmon tick (at least 20us). + t := int64(pp.syscalltick) + if !sysretake && int64(pd.syscalltick) != t { + pd.syscalltick = uint32(t) + pd.syscallwhen = now + continue + } + // On the one hand we don't want to retake Ps if there is no other work to do, + // but on the other hand we want to retake them eventually + // because they can prevent the sysmon thread from deep sleep. + if runqempty(pp) && sched.nmspinning.Load()+sched.npidle.Load() > 0 && pd.syscallwhen+10*1000*1000 > now { + continue + } + // Drop allpLock so we can take sched.lock. + unlock(&allpLock) + // Need to decrement number of idle locked M's + // (pretending that one more is running) before the CAS. + // Otherwise the M from which we retake can exit the syscall, + // increment nmidle and report deadlock. + incidlelocked(-1) + if atomic.Cas(&pp.status, s, _Pidle) { + if trace.enabled { + traceGoSysBlock(pp) + traceProcStop(pp) + } + n++ + pp.syscalltick++ + handoffp(pp) + } + incidlelocked(1) + lock(&allpLock) + } + } + unlock(&allpLock) + return uint32(n) +} + +// Tell all goroutines that they have been preempted and they should stop. 
+// This function is purely best-effort. It can fail to inform a goroutine if a +// processor just started running it. +// No locks need to be held. +// Returns true if preemption request was issued to at least one goroutine. +func preemptall() bool { + res := false + for _, pp := range allp { + if pp.status != _Prunning { + continue + } + if preemptone(pp) { + res = true + } + } + return res +} + +// Tell the goroutine running on processor P to stop. +// This function is purely best-effort. It can incorrectly fail to inform the +// goroutine. It can inform the wrong goroutine. Even if it informs the +// correct goroutine, that goroutine might ignore the request if it is +// simultaneously executing newstack. +// No lock needs to be held. +// Returns true if preemption request was issued. +// The actual preemption will happen at some point in the future +// and will be indicated by the gp->status no longer being +// Grunning +func preemptone(pp *p) bool { + mp := pp.m.ptr() + if mp == nil || mp == getg().m { + return false + } + gp := mp.curg + if gp == nil || gp == mp.g0 { + return false + } + + gp.preempt = true + + // Every call in a goroutine checks for stack overflow by + // comparing the current stack pointer to gp->stackguard0. + // Setting gp->stackguard0 to StackPreempt folds + // preemption into the normal stack overflow check. + gp.stackguard0 = stackPreempt + + // Request an async preemption of this P. 
+ if preemptMSupported && debug.asyncpreemptoff == 0 { + pp.preempt = true + preemptM(mp) + } + + return true +} + +var starttime int64 + +func schedtrace(detailed bool) { + now := nanotime() + if starttime == 0 { + starttime = now + } + + lock(&sched.lock) + print("SCHED ", (now-starttime)/1e6, "ms: gomaxprocs=", gomaxprocs, " idleprocs=", sched.npidle.Load(), " threads=", mcount(), " spinningthreads=", sched.nmspinning.Load(), " needspinning=", sched.needspinning.Load(), " idlethreads=", sched.nmidle, " runqueue=", sched.runqsize) + if detailed { + print(" gcwaiting=", sched.gcwaiting.Load(), " nmidlelocked=", sched.nmidlelocked, " stopwait=", sched.stopwait, " sysmonwait=", sched.sysmonwait.Load(), "\n") + } + // We must be careful while reading data from P's, M's and G's. + // Even if we hold schedlock, most data can be changed concurrently. + // E.g. (p->m ? p->m->id : -1) can crash if p->m changes from non-nil to nil. + for i, pp := range allp { + mp := pp.m.ptr() + h := atomic.Load(&pp.runqhead) + t := atomic.Load(&pp.runqtail) + if detailed { + print(" P", i, ": status=", pp.status, " schedtick=", pp.schedtick, " syscalltick=", pp.syscalltick, " m=") + if mp != nil { + print(mp.id) + } else { + print("nil") + } + print(" runqsize=", t-h, " gfreecnt=", pp.gFree.n, " timerslen=", len(pp.timers), "\n") + } else { + // In non-detailed mode format lengths of per-P run queues as: + // [len1 len2 len3 len4] + print(" ") + if i == 0 { + print("[") + } + print(t - h) + if i == len(allp)-1 { + print("]\n") + } + } + } + + if !detailed { + unlock(&sched.lock) + return + } + + for mp := allm; mp != nil; mp = mp.alllink { + pp := mp.p.ptr() + print(" M", mp.id, ": p=") + if pp != nil { + print(pp.id) + } else { + print("nil") + } + print(" curg=") + if mp.curg != nil { + print(mp.curg.goid) + } else { + print("nil") + } + print(" mallocing=", mp.mallocing, " throwing=", mp.throwing, " preemptoff=", mp.preemptoff, " locks=", mp.locks, " dying=", mp.dying, " spinning=", 
mp.spinning, " blocked=", mp.blocked, " lockedg=") + if lockedg := mp.lockedg.ptr(); lockedg != nil { + print(lockedg.goid) + } else { + print("nil") + } + print("\n") + } + + forEachG(func(gp *g) { + print(" G", gp.goid, ": status=", readgstatus(gp), "(", gp.waitreason.String(), ") m=") + if gp.m != nil { + print(gp.m.id) + } else { + print("nil") + } + print(" lockedm=") + if lockedm := gp.lockedm.ptr(); lockedm != nil { + print(lockedm.id) + } else { + print("nil") + } + print("\n") + }) + unlock(&sched.lock) +} + +// schedEnableUser enables or disables the scheduling of user +// goroutines. +// +// This does not stop already running user goroutines, so the caller +// should first stop the world when disabling user goroutines. +func schedEnableUser(enable bool) { + lock(&sched.lock) + if sched.disable.user == !enable { + unlock(&sched.lock) + return + } + sched.disable.user = !enable + if enable { + n := sched.disable.n + sched.disable.n = 0 + globrunqputbatch(&sched.disable.runnable, n) + unlock(&sched.lock) + for ; n != 0 && sched.npidle.Load() != 0; n-- { + startm(nil, false, false) + } + } else { + unlock(&sched.lock) + } +} + +// schedEnabled reports whether gp should be scheduled. It returns +// false if scheduling of gp is disabled. +// +// sched.lock must be held. +func schedEnabled(gp *g) bool { + assertLockHeld(&sched.lock) + + if sched.disable.user { + return isSystemGoroutine(gp, true) + } + return true +} + +// Put mp on midle list. +// sched.lock must be held. +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func mput(mp *m) { + assertLockHeld(&sched.lock) + + mp.schedlink = sched.midle + sched.midle.set(mp) + sched.nmidle++ + checkdead() +} + +// Try to get an m from midle list. +// sched.lock must be held. +// May run during STW, so write barriers are not allowed.
+// +//go:nowritebarrierrec +func mget() *m { + assertLockHeld(&sched.lock) + + mp := sched.midle.ptr() + if mp != nil { + sched.midle = mp.schedlink + sched.nmidle-- + } + return mp +} + +// Put gp on the global runnable queue. +// sched.lock must be held. +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func globrunqput(gp *g) { + assertLockHeld(&sched.lock) + + sched.runq.pushBack(gp) + sched.runqsize++ +} + +// Put gp at the head of the global runnable queue. +// sched.lock must be held. +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func globrunqputhead(gp *g) { + assertLockHeld(&sched.lock) + + sched.runq.push(gp) + sched.runqsize++ +} + +// Put a batch of runnable goroutines on the global runnable queue. +// This clears *batch. +// sched.lock must be held. +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func globrunqputbatch(batch *gQueue, n int32) { + assertLockHeld(&sched.lock) + + sched.runq.pushBackAll(*batch) + sched.runqsize += n + *batch = gQueue{} +} + +// Try to get a batch of G's from the global runnable queue. +// sched.lock must be held. +func globrunqget(pp *p, max int32) *g { + assertLockHeld(&sched.lock) + + if sched.runqsize == 0 { + return nil + } + + n := sched.runqsize/gomaxprocs + 1 + if n > sched.runqsize { + n = sched.runqsize + } + if max > 0 && n > max { + n = max + } + if n > int32(len(pp.runq))/2 { + n = int32(len(pp.runq)) / 2 + } + + sched.runqsize -= n + + gp := sched.runq.pop() + n-- + for ; n > 0; n-- { + gp1 := sched.runq.pop() + runqput(pp, gp1, false) + } + return gp +} + +// pMask is an atomic bitstring with one bit per P. +type pMask []uint32 + +// read returns true if P id's bit is set. +func (p pMask) read(id uint32) bool { + word := id / 32 + mask := uint32(1) << (id % 32) + return (atomic.Load(&p[word]) & mask) != 0 +} + +// set sets P id's bit.
+func (p pMask) set(id int32) { + word := id / 32 + mask := uint32(1) << (id % 32) + atomic.Or(&p[word], mask) +} + +// clear clears P id's bit. +func (p pMask) clear(id int32) { + word := id / 32 + mask := uint32(1) << (id % 32) + atomic.And(&p[word], ^mask) +} + +// updateTimerPMask clears pp's timer mask if it has no timers on its heap. +// +// Ideally, the timer mask would be kept immediately consistent on any timer +// operations. Unfortunately, updating a shared global data structure in the +// timer hot path adds too much overhead in applications frequently switching +// between no timers and some timers. +// +// As a compromise, the timer mask is updated only on pidleget / pidleput. A +// running P (returned by pidleget) may add a timer at any time, so its mask +// must be set. An idle P (passed to pidleput) cannot add new timers while +// idle, so if it has no timers at that time, its mask may be cleared. +// +// Thus, we get the following effects on timer-stealing in findrunnable: +// +// - Idle Ps with no timers when they go idle are never checked in findrunnable +// (for work- or timer-stealing; this is the ideal case). +// - Running Ps must always be checked. +// - Idle Ps whose timers are stolen must continue to be checked until they run +// again, even after timer expiration. +// +// When the P starts running again, the mask should be set, as a timer may be +// added at any time. +// +// TODO(prattmic): Additional targeted updates may improve the above cases. +// e.g., updating the mask when stealing a timer. +func updateTimerPMask(pp *p) { + if pp.numTimers.Load() > 0 { + return + } + + // Looks like there are no timers, however another P may transiently + // decrement numTimers when handling a timerModified timer in + // checkTimers. We must take timersLock to serialize with these changes. + lock(&pp.timersLock) + if pp.numTimers.Load() == 0 { + timerpMask.clear(pp.id) + } + unlock(&pp.timersLock) +} + +// pidleput puts p on the _Pidle list. 
now must be a relatively recent call +// to nanotime or zero. Returns now or the current time if now was zero. +// +// This releases ownership of p. Once sched.lock is released it is no longer +// safe to use p. +// +// sched.lock must be held. +// +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func pidleput(pp *p, now int64) int64 { + assertLockHeld(&sched.lock) + + if !runqempty(pp) { + throw("pidleput: P has non-empty run queue") + } + if now == 0 { + now = nanotime() + } + updateTimerPMask(pp) // clear if there are no timers. + idlepMask.set(pp.id) + pp.link = sched.pidle + sched.pidle.set(pp) + sched.npidle.Add(1) + if !pp.limiterEvent.start(limiterEventIdle, now) { + throw("must be able to track idle limiter event") + } + return now +} + +// pidleget tries to get a p from the _Pidle list, acquiring ownership. +// +// sched.lock must be held. +// +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func pidleget(now int64) (*p, int64) { + assertLockHeld(&sched.lock) + + pp := sched.pidle.ptr() + if pp != nil { + // Timer may get added at any time now. + if now == 0 { + now = nanotime() + } + timerpMask.set(pp.id) + idlepMask.clear(pp.id) + sched.pidle = pp.link + sched.npidle.Add(-1) + pp.limiterEvent.stop(limiterEventIdle, now) + } + return pp, now +} + +// pidlegetSpinning tries to get a p from the _Pidle list, acquiring ownership. +// This is called by spinning Ms (or callers that need a spinning M) that have +// found work. If no P is available, this must synchronize with non-spinning +// Ms that may be preparing to drop their P without discovering this work. +// +// sched.lock must be held. +// +// May run during STW, so write barriers are not allowed. +// +//go:nowritebarrierrec +func pidlegetSpinning(now int64) (*p, int64) { + assertLockHeld(&sched.lock) + + pp, now := pidleget(now) + if pp == nil { + // See "Delicate dance" comment in findrunnable.
We found work + // that we cannot take, we must synchronize with non-spinning + // Ms that may be preparing to drop their P. + sched.needspinning.Store(1) + return nil, now + } + + return pp, now +} + +// runqempty reports whether pp has no Gs on its local run queue. +// It never returns true spuriously. +func runqempty(pp *p) bool { + // Defend against a race where 1) pp has G1 in runnext but runqhead == runqtail, + // 2) runqput on pp kicks G1 to the runq, 3) runqget on pp empties runnext. + // Simply observing that runqhead == runqtail and then observing that runnext == nil + // does not mean the queue is empty. + for { + head := atomic.Load(&pp.runqhead) + tail := atomic.Load(&pp.runqtail) + runnext := atomic.Loaduintptr((*uintptr)(unsafe.Pointer(&pp.runnext))) + if tail == atomic.Load(&pp.runqtail) { + return head == tail && runnext == 0 + } + } +} + +// To shake out latent assumptions about scheduling order, +// we introduce some randomness into scheduling decisions +// when running with the race detector. +// The need for this was made obvious by changing the +// (deterministic) scheduling order in Go 1.5 and breaking +// many poorly-written tests. +// With the randomness here, as long as the tests pass +// consistently with -race, they shouldn't have latent scheduling +// assumptions. +const randomizeScheduler = raceenabled + +// runqput tries to put g on the local runnable queue. +// If next is false, runqput adds g to the tail of the runnable queue. +// If next is true, runqput puts g in the pp.runnext slot. +// If the run queue is full, runqput puts g on the global queue. +// Executed only by the owner P. +func runqput(pp *p, gp *g, next bool) { + if randomizeScheduler && next && fastrandn(2) == 0 { + next = false + } + + if next { + retryNext: + oldnext := pp.runnext + if !pp.runnext.cas(oldnext, guintptr(unsafe.Pointer(gp))) { + goto retryNext + } + if oldnext == 0 { + return + } + // Kick the old runnext out to the regular run queue.
+ gp = oldnext.ptr() + } + +retry: + h := atomic.LoadAcq(&pp.runqhead) // load-acquire, synchronize with consumers + t := pp.runqtail + if t-h < uint32(len(pp.runq)) { + pp.runq[t%uint32(len(pp.runq))].set(gp) + atomic.StoreRel(&pp.runqtail, t+1) // store-release, makes the item available for consumption + return + } + if runqputslow(pp, gp, h, t) { + return + } + // the queue is not full, now the put above must succeed + goto retry +} + +// Put g and a batch of work from local runnable queue on global queue. +// Executed only by the owner P. +func runqputslow(pp *p, gp *g, h, t uint32) bool { + var batch [len(pp.runq)/2 + 1]*g + + // First, grab a batch from local queue. + n := t - h + n = n / 2 + if n != uint32(len(pp.runq)/2) { + throw("runqputslow: queue is not full") + } + for i := uint32(0); i < n; i++ { + batch[i] = pp.runq[(h+i)%uint32(len(pp.runq))].ptr() + } + if !atomic.CasRel(&pp.runqhead, h, h+n) { // cas-release, commits consume + return false + } + batch[n] = gp + + if randomizeScheduler { + for i := uint32(1); i <= n; i++ { + j := fastrandn(i + 1) + batch[i], batch[j] = batch[j], batch[i] + } + } + + // Link the goroutines. + for i := uint32(0); i < n; i++ { + batch[i].schedlink.set(batch[i+1]) + } + var q gQueue + q.head.set(batch[0]) + q.tail.set(batch[n]) + + // Now put the batch on global queue. + lock(&sched.lock) + globrunqputbatch(&q, int32(n+1)) + unlock(&sched.lock) + return true +} + +// runqputbatch tries to put all the G's on q on the local runnable queue. +// If the queue is full, they are put on the global queue; in that case +// this will temporarily acquire the scheduler lock. +// Executed only by the owner P. 
+func runqputbatch(pp *p, q *gQueue, qsize int) { + h := atomic.LoadAcq(&pp.runqhead) + t := pp.runqtail + n := uint32(0) + for !q.empty() && t-h < uint32(len(pp.runq)) { + gp := q.pop() + pp.runq[t%uint32(len(pp.runq))].set(gp) + t++ + n++ + } + qsize -= int(n) + + if randomizeScheduler { + off := func(o uint32) uint32 { + return (pp.runqtail + o) % uint32(len(pp.runq)) + } + for i := uint32(1); i < n; i++ { + j := fastrandn(i + 1) + pp.runq[off(i)], pp.runq[off(j)] = pp.runq[off(j)], pp.runq[off(i)] + } + } + + atomic.StoreRel(&pp.runqtail, t) + if !q.empty() { + lock(&sched.lock) + globrunqputbatch(q, int32(qsize)) + unlock(&sched.lock) + } +} + +// Get g from local runnable queue. +// If inheritTime is true, gp should inherit the remaining time in the +// current time slice. Otherwise, it should start a new time slice. +// Executed only by the owner P. +func runqget(pp *p) (gp *g, inheritTime bool) { + // If there's a runnext, it's the next G to run. + next := pp.runnext + // If the runnext is non-0 and the CAS fails, it could only have been stolen by another P, + // because other Ps can race to set runnext to 0, but only the current P can set it to non-0. + // Hence, there's no need to retry this CAS if it fails. + if next != 0 && pp.runnext.cas(next, 0) { + return next.ptr(), true + } + + for { + h := atomic.LoadAcq(&pp.runqhead) // load-acquire, synchronize with other consumers + t := pp.runqtail + if t == h { + return nil, false + } + gp := pp.runq[h%uint32(len(pp.runq))].ptr() + if atomic.CasRel(&pp.runqhead, h, h+1) { // cas-release, commits consume + return gp, false + } + } +} + +// runqdrain drains the local runnable queue of pp and returns all goroutines in it. +// Executed only by the owner P. 
+func runqdrain(pp *p) (drainQ gQueue, n uint32) { + oldNext := pp.runnext + if oldNext != 0 && pp.runnext.cas(oldNext, 0) { + drainQ.pushBack(oldNext.ptr()) + n++ + } + +retry: + h := atomic.LoadAcq(&pp.runqhead) // load-acquire, synchronize with other consumers + t := pp.runqtail + qn := t - h + if qn == 0 { + return + } + if qn > uint32(len(pp.runq)) { // read inconsistent h and t + goto retry + } + + if !atomic.CasRel(&pp.runqhead, h, h+qn) { // cas-release, commits consume + goto retry + } + + // We've inverted the order in which it gets G's from the local P's runnable queue + // and then advances the head pointer because we don't want to mess up the statuses of G's + // while runqdrain() and runqsteal() are running in parallel. + // Thus we should advance the head pointer before draining the local P into a gQueue, + // so that we can update any gp.schedlink only after we take the full ownership of G, + // meanwhile, other P's can't access to all G's in local P's runnable queue and steal them. + // See https://groups.google.com/g/golang-dev/c/0pTKxEKhHSc/m/6Q85QjdVBQAJ for more details. + for i := uint32(0); i < qn; i++ { + gp := pp.runq[(h+i)%uint32(len(pp.runq))].ptr() + drainQ.pushBack(gp) + n++ + } + return +} + +// Grabs a batch of goroutines from pp's runnable queue into batch. +// Batch is a ring buffer starting at batchHead. +// Returns number of grabbed goroutines. +// Can be executed by any P. +func runqgrab(pp *p, batch *[256]guintptr, batchHead uint32, stealRunNextG bool) uint32 { + for { + h := atomic.LoadAcq(&pp.runqhead) // load-acquire, synchronize with other consumers + t := atomic.LoadAcq(&pp.runqtail) // load-acquire, synchronize with the producer + n := t - h + n = n - n/2 + if n == 0 { + if stealRunNextG { + // Try to steal from pp.runnext. + if next := pp.runnext; next != 0 { + if pp.status == _Prunning { + // Sleep to ensure that pp isn't about to run the g + // we are about to steal. 
+ // The important use case here is when the g running + // on pp ready()s another g and then almost + // immediately blocks. Instead of stealing runnext + // in this window, back off to give pp a chance to + // schedule runnext. This will avoid thrashing gs + // between different Ps. + // A sync chan send/recv takes ~50ns as of time of + // writing, so 3us gives ~50x overshoot. + if GOOS != "windows" && GOOS != "openbsd" && GOOS != "netbsd" { + usleep(3) + } else { + // On some platforms system timer granularity is + // 1-15ms, which is way too much for this + // optimization. So just yield. + osyield() + } + } + if !pp.runnext.cas(next, 0) { + continue + } + batch[batchHead%uint32(len(batch))] = next + return 1 + } + } + return 0 + } + if n > uint32(len(pp.runq)/2) { // read inconsistent h and t + continue + } + for i := uint32(0); i < n; i++ { + g := pp.runq[(h+i)%uint32(len(pp.runq))] + batch[(batchHead+i)%uint32(len(batch))] = g + } + if atomic.CasRel(&pp.runqhead, h, h+n) { // cas-release, commits consume + return n + } + } +} + +// Steal half of elements from local runnable queue of p2 +// and put onto local runnable queue of p. +// Returns one of the stolen elements (or nil if failed). +func runqsteal(pp, p2 *p, stealRunNextG bool) *g { + t := pp.runqtail + n := runqgrab(p2, &pp.runq, t, stealRunNextG) + if n == 0 { + return nil + } + n-- + gp := pp.runq[(t+n)%uint32(len(pp.runq))].ptr() + if n == 0 { + return gp + } + h := atomic.LoadAcq(&pp.runqhead) // load-acquire, synchronize with consumers + if t-h+n >= uint32(len(pp.runq)) { + throw("runqsteal: runq overflow") + } + atomic.StoreRel(&pp.runqtail, t+n) // store-release, makes the item available for consumption + return gp +} + +// A gQueue is a dequeue of Gs linked through g.schedlink. A G can only +// be on one gQueue or gList at a time. +type gQueue struct { + head guintptr + tail guintptr +} + +// empty reports whether q is empty. 
+func (q *gQueue) empty() bool { + return q.head == 0 +} + +// push adds gp to the head of q. +func (q *gQueue) push(gp *g) { + gp.schedlink = q.head + q.head.set(gp) + if q.tail == 0 { + q.tail.set(gp) + } +} + +// pushBack adds gp to the tail of q. +func (q *gQueue) pushBack(gp *g) { + gp.schedlink = 0 + if q.tail != 0 { + q.tail.ptr().schedlink.set(gp) + } else { + q.head.set(gp) + } + q.tail.set(gp) +} + +// pushBackAll adds all Gs in q2 to the tail of q. After this q2 must +// not be used. +func (q *gQueue) pushBackAll(q2 gQueue) { + if q2.tail == 0 { + return + } + q2.tail.ptr().schedlink = 0 + if q.tail != 0 { + q.tail.ptr().schedlink = q2.head + } else { + q.head = q2.head + } + q.tail = q2.tail +} + +// pop removes and returns the head of queue q. It returns nil if +// q is empty. +func (q *gQueue) pop() *g { + gp := q.head.ptr() + if gp != nil { + q.head = gp.schedlink + if q.head == 0 { + q.tail = 0 + } + } + return gp +} + +// popList takes all Gs in q and returns them as a gList. +func (q *gQueue) popList() gList { + stack := gList{q.head} + *q = gQueue{} + return stack +} + +// A gList is a list of Gs linked through g.schedlink. A G can only be +// on one gQueue or gList at a time. +type gList struct { + head guintptr +} + +// empty reports whether l is empty. +func (l *gList) empty() bool { + return l.head == 0 +} + +// push adds gp to the head of l. +func (l *gList) push(gp *g) { + gp.schedlink = l.head + l.head.set(gp) +} + +// pushAll prepends all Gs in q to l. +func (l *gList) pushAll(q gQueue) { + if !q.empty() { + q.tail.ptr().schedlink = l.head + l.head = q.head + } +} + +// pop removes and returns the head of l. If l is empty, it returns nil. 
+func (l *gList) pop() *g { + gp := l.head.ptr() + if gp != nil { + l.head = gp.schedlink + } + return gp +} + +//go:linkname setMaxThreads runtime/debug.setMaxThreads +func setMaxThreads(in int) (out int) { + lock(&sched.lock) + out = int(sched.maxmcount) + if in > 0x7fffffff { // MaxInt32 + sched.maxmcount = 0x7fffffff + } else { + sched.maxmcount = int32(in) + } + checkmcount() + unlock(&sched.lock) + return +} + +//go:nosplit +func procPin() int { + gp := getg() + mp := gp.m + + mp.locks++ + return int(mp.p.ptr().id) +} + +//go:nosplit +func procUnpin() { + gp := getg() + gp.m.locks-- +} + +//go:linkname sync_runtime_procPin sync.runtime_procPin +//go:nosplit +func sync_runtime_procPin() int { + return procPin() +} + +//go:linkname sync_runtime_procUnpin sync.runtime_procUnpin +//go:nosplit +func sync_runtime_procUnpin() { + procUnpin() +} + +//go:linkname sync_atomic_runtime_procPin sync/atomic.runtime_procPin +//go:nosplit +func sync_atomic_runtime_procPin() int { + return procPin() +} + +//go:linkname sync_atomic_runtime_procUnpin sync/atomic.runtime_procUnpin +//go:nosplit +func sync_atomic_runtime_procUnpin() { + procUnpin() +} + +// Active spinning for sync.Mutex. +// +//go:linkname sync_runtime_canSpin sync.runtime_canSpin +//go:nosplit +func sync_runtime_canSpin(i int) bool { + // sync.Mutex is cooperative, so we are conservative with spinning. + // Spin only a few times, and only if running on a multicore machine, + // GOMAXPROCS>1, there is at least one other running P, and the local runq is empty. + // Unlike the runtime mutex, we don't do passive spinning here, + // because there can be work on the global runq or on other Ps. 
+ if i >= active_spin || ncpu <= 1 || gomaxprocs <= sched.npidle.Load()+sched.nmspinning.Load()+1 { + return false + } + if p := getg().m.p.ptr(); !runqempty(p) { + return false + } + return true +} + +//go:linkname sync_runtime_doSpin sync.runtime_doSpin +//go:nosplit +func sync_runtime_doSpin() { + procyield(active_spin_cnt) +} + +var stealOrder randomOrder + +// randomOrder/randomEnum are helper types for randomized work stealing. +// They allow enumerating all Ps in different pseudo-random orders without repetitions. +// The algorithm is based on the fact that if we have X such that X and GOMAXPROCS +// are coprime, then a sequence of (i + X) % GOMAXPROCS gives the required enumeration. +type randomOrder struct { + count uint32 + coprimes []uint32 +} + +type randomEnum struct { + i uint32 + count uint32 + pos uint32 + inc uint32 +} + +func (ord *randomOrder) reset(count uint32) { + ord.count = count + ord.coprimes = ord.coprimes[:0] + for i := uint32(1); i <= count; i++ { + if gcd(i, count) == 1 { + ord.coprimes = append(ord.coprimes, i) + } + } +} + +func (ord *randomOrder) start(i uint32) randomEnum { + return randomEnum{ + count: ord.count, + pos: i % ord.count, + inc: ord.coprimes[i/ord.count%uint32(len(ord.coprimes))], + } +} + +func (enum *randomEnum) done() bool { + return enum.i == enum.count +} + +func (enum *randomEnum) next() { + enum.i++ + enum.pos = (enum.pos + enum.inc) % enum.count +} + +func (enum *randomEnum) position() uint32 { + return enum.pos +} + +func gcd(a, b uint32) uint32 { + for b != 0 { + a, b = b, a%b + } + return a +} + +// An initTask represents the set of initializations that need to be done for a package. +// Keep in sync with ../../test/initempty.go:initTask +type initTask struct { + // TODO: pack the first 3 fields more tightly? 
+ state uintptr // 0 = uninitialized, 1 = in progress, 2 = done + ndeps uintptr + nfns uintptr + // followed by ndeps instances of an *initTask, one per package depended on + // followed by nfns pcs, one per init function to run +} + +// inittrace stores statistics for init functions which are +// updated by malloc and newproc when active is true. +var inittrace tracestat + +type tracestat struct { + active bool // init tracing activation status + id uint64 // init goroutine id + allocs uint64 // heap allocations + bytes uint64 // heap allocated bytes +} + +func doInit(t *initTask) { + switch t.state { + case 2: // fully initialized + return + case 1: // initialization in progress + throw("recursive call during initialization - linker skew") + default: // not initialized yet + t.state = 1 // initialization in progress + + for i := uintptr(0); i < t.ndeps; i++ { + p := add(unsafe.Pointer(t), (3+i)*goarch.PtrSize) + t2 := *(**initTask)(p) + doInit(t2) + } + + if t.nfns == 0 { + t.state = 2 // initialization done + return + } + + var ( + start int64 + before tracestat + ) + + if inittrace.active { + start = nanotime() + // Load stats non-atomically since inittrace is updated only by this init goroutine. + before = inittrace + } + + firstFunc := add(unsafe.Pointer(t), (3+t.ndeps)*goarch.PtrSize) + for i := uintptr(0); i < t.nfns; i++ { + p := add(firstFunc, i*goarch.PtrSize) + f := *(*func())(unsafe.Pointer(&p)) + f() + } + + if inittrace.active { + end := nanotime() + // Load stats non-atomically since inittrace is updated only by this init goroutine. 
+ after := inittrace + + f := *(*func())(unsafe.Pointer(&firstFunc)) + pkg := funcpkgpath(findfunc(abi.FuncPCABIInternal(f))) + + var sbuf [24]byte + print("init ", pkg, " @") + print(string(fmtNSAsMS(sbuf[:], uint64(start-runtimeInitTime))), " ms, ") + print(string(fmtNSAsMS(sbuf[:], uint64(end-start))), " ms clock, ") + print(string(itoa(sbuf[:], after.bytes-before.bytes)), " bytes, ") + print(string(itoa(sbuf[:], after.allocs-before.allocs)), " allocs") + print("\n") + } + + t.state = 2 // initialization done + } +} diff --git a/src/runtime/proc_runtime_test.go b/src/runtime/proc_runtime_test.go new file mode 100644 index 0000000..90aed83 --- /dev/null +++ b/src/runtime/proc_runtime_test.go @@ -0,0 +1,50 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Proc unit tests. In runtime package so can use runtime guts. + +package runtime + +func RunStealOrderTest() { + var ord randomOrder + for procs := 1; procs <= 64; procs++ { + ord.reset(uint32(procs)) + if procs >= 3 && len(ord.coprimes) < 2 { + panic("too few coprimes") + } + for co := 0; co < len(ord.coprimes); co++ { + enum := ord.start(uint32(co)) + checked := make([]bool, procs) + for p := 0; p < procs; p++ { + x := enum.position() + if checked[x] { + println("procs:", procs, "inc:", enum.inc) + panic("duplicate during enumeration") + } + checked[x] = true + enum.next() + } + if !enum.done() { + panic("not done") + } + } + } + // Make sure that different arguments to ord.start don't generate the + // same pos+inc twice. + for procs := 2; procs <= 64; procs++ { + ord.reset(uint32(procs)) + checked := make([]bool, procs*procs) + // We want at least procs*len(ord.coprimes) different pos+inc values + // before we start repeating. 
+ for i := 0; i < procs*len(ord.coprimes); i++ { + enum := ord.start(uint32(i)) + j := enum.pos*uint32(procs) + enum.inc + if checked[j] { + println("procs:", procs, "pos:", enum.pos, "inc:", enum.inc) + panic("duplicate pos+inc during enumeration") + } + checked[j] = true + } + } +} diff --git a/src/runtime/proc_test.go b/src/runtime/proc_test.go new file mode 100644 index 0000000..f354fac --- /dev/null +++ b/src/runtime/proc_test.go @@ -0,0 +1,1157 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "internal/race" + "internal/testenv" + "math" + "net" + "runtime" + "runtime/debug" + "strings" + "sync" + "sync/atomic" + "syscall" + "testing" + "time" +) + +var stop = make(chan bool, 1) + +func perpetuumMobile() { + select { + case <-stop: + default: + go perpetuumMobile() + } +} + +func TestStopTheWorldDeadlock(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + if testing.Short() { + t.Skip("skipping during short test") + } + maxprocs := runtime.GOMAXPROCS(3) + compl := make(chan bool, 2) + go func() { + for i := 0; i != 1000; i += 1 { + runtime.GC() + } + compl <- true + }() + go func() { + for i := 0; i != 1000; i += 1 { + runtime.GOMAXPROCS(3) + } + compl <- true + }() + go perpetuumMobile() + <-compl + <-compl + stop <- true + runtime.GOMAXPROCS(maxprocs) +} + +func TestYieldProgress(t *testing.T) { + testYieldProgress(false) +} + +func TestYieldLockedProgress(t *testing.T) { + testYieldProgress(true) +} + +func testYieldProgress(locked bool) { + c := make(chan bool) + cack := make(chan bool) + go func() { + if locked { + runtime.LockOSThread() + } + for { + select { + case <-c: + cack <- true + return + default: + runtime.Gosched() + } + } + }() + time.Sleep(10 * time.Millisecond) + c <- true + <-cack +} + +func TestYieldLocked(t *testing.T) { + const N = 10 + c 
:= make(chan bool) + go func() { + runtime.LockOSThread() + for i := 0; i < N; i++ { + runtime.Gosched() + time.Sleep(time.Millisecond) + } + c <- true + // runtime.UnlockOSThread() is deliberately omitted + }() + <-c +} + +func TestGoroutineParallelism(t *testing.T) { + if runtime.NumCPU() == 1 { + // Takes too long, too easy to deadlock, etc. + t.Skip("skipping on uniprocessor") + } + P := 4 + N := 10 + if testing.Short() { + P = 3 + N = 3 + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(P)) + // If runtime triggers a forced GC during this test then it will deadlock, + // since the goroutines can't be stopped/preempted. + // Disable GC for this test (see issue #10958). + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + // SetGCPercent waits until the mark phase is over, but the runtime + // also preempts at the start of the sweep phase, so make sure that's + // done too. See #45867. + runtime.GC() + for try := 0; try < N; try++ { + done := make(chan bool) + x := uint32(0) + for p := 0; p < P; p++ { + // Test that all P goroutines are scheduled at the same time + go func(p int) { + for i := 0; i < 3; i++ { + expected := uint32(P*i + p) + for atomic.LoadUint32(&x) != expected { + } + atomic.StoreUint32(&x, expected+1) + } + done <- true + }(p) + } + for p := 0; p < P; p++ { + <-done + } + } +} + +// Test that all runnable goroutines are scheduled at the same time. +func TestGoroutineParallelism2(t *testing.T) { + //testGoroutineParallelism2(t, false, false) + testGoroutineParallelism2(t, true, false) + testGoroutineParallelism2(t, false, true) + testGoroutineParallelism2(t, true, true) +} + +func testGoroutineParallelism2(t *testing.T, load, netpoll bool) { + if runtime.NumCPU() == 1 { + // Takes too long, too easy to deadlock, etc. 
+ t.Skip("skipping on uniprocessor") + } + P := 4 + N := 10 + if testing.Short() { + N = 3 + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(P)) + // If runtime triggers a forced GC during this test then it will deadlock, + // since the goroutines can't be stopped/preempted. + // Disable GC for this test (see issue #10958). + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + // SetGCPercent waits until the mark phase is over, but the runtime + // also preempts at the start of the sweep phase, so make sure that's + // done too. See #45867. + runtime.GC() + for try := 0; try < N; try++ { + if load { + // Create P goroutines and wait until they all run. + // When we run the actual test below, worker threads + // running the goroutines will start parking. + done := make(chan bool) + x := uint32(0) + for p := 0; p < P; p++ { + go func() { + if atomic.AddUint32(&x, 1) == uint32(P) { + done <- true + return + } + for atomic.LoadUint32(&x) != uint32(P) { + } + }() + } + <-done + } + if netpoll { + // Enable netpoller, affects scheduler behavior. + laddr := "localhost:0" + if runtime.GOOS == "android" { + // On some Android devices, there are no records for localhost, + // see https://golang.org/issues/14486. + // Don't use 127.0.0.1 for every case, it won't work on IPv6-only systems. + laddr = "127.0.0.1:0" + } + ln, err := net.Listen("tcp", laddr) + if err == nil { + defer ln.Close() // yup, defer in a loop + } + } + done := make(chan bool) + x := uint32(0) + // Spawn P goroutines in a nested fashion just to differ from TestGoroutineParallelism. 
+ for p := 0; p < P/2; p++ { + go func(p int) { + for p2 := 0; p2 < 2; p2++ { + go func(p2 int) { + for i := 0; i < 3; i++ { + expected := uint32(P*i + p*2 + p2) + for atomic.LoadUint32(&x) != expected { + } + atomic.StoreUint32(&x, expected+1) + } + done <- true + }(p2) + } + }(p) + } + for p := 0; p < P; p++ { + <-done + } + } +} + +func TestBlockLocked(t *testing.T) { + const N = 10 + c := make(chan bool) + go func() { + runtime.LockOSThread() + for i := 0; i < N; i++ { + c <- true + } + runtime.UnlockOSThread() + }() + for i := 0; i < N; i++ { + <-c + } +} + +func TestTimerFairness(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + + done := make(chan bool) + c := make(chan bool) + for i := 0; i < 2; i++ { + go func() { + for { + select { + case c <- true: + case <-done: + return + } + } + }() + } + + timer := time.After(20 * time.Millisecond) + for { + select { + case <-c: + case <-timer: + close(done) + return + } + } +} + +func TestTimerFairness2(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + + done := make(chan bool) + c := make(chan bool) + for i := 0; i < 2; i++ { + go func() { + timer := time.After(20 * time.Millisecond) + var buf [1]byte + for { + syscall.Read(0, buf[0:0]) + select { + case c <- true: + case <-c: + case <-timer: + done <- true + return + } + } + }() + } + <-done + <-done +} + +// The function is used to test preemption at split stack checks. +// Declaring a var avoids inlining at the call site. +var preempt = func() int { + var a [128]int + sum := 0 + for _, v := range a { + sum += v + } + return sum +} + +func TestPreemption(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + + // Test that goroutines are preempted at function calls. 
+ N := 5 + if testing.Short() { + N = 2 + } + c := make(chan bool) + var x uint32 + for g := 0; g < 2; g++ { + go func(g int) { + for i := 0; i < N; i++ { + for atomic.LoadUint32(&x) != uint32(g) { + preempt() + } + atomic.StoreUint32(&x, uint32(1-g)) + } + c <- true + }(g) + } + <-c + <-c +} + +func TestPreemptionGC(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + + // Test that pending GC preempts running goroutines. + P := 5 + N := 10 + if testing.Short() { + P = 3 + N = 2 + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(P + 1)) + var stop uint32 + for i := 0; i < P; i++ { + go func() { + for atomic.LoadUint32(&stop) == 0 { + preempt() + } + }() + } + for i := 0; i < N; i++ { + runtime.Gosched() + runtime.GC() + } + atomic.StoreUint32(&stop, 1) +} + +func TestAsyncPreempt(t *testing.T) { + if !runtime.PreemptMSupported { + t.Skip("asynchronous preemption not supported on this platform") + } + output := runTestProg(t, "testprog", "AsyncPreempt") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestGCFairness(t *testing.T) { + output := runTestProg(t, "testprog", "GCFairness") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestGCFairness2(t *testing.T) { + output := runTestProg(t, "testprog", "GCFairness2") + want := "OK\n" + if output != want { + t.Fatalf("want %s, got %s\n", want, output) + } +} + +func TestNumGoroutine(t *testing.T) { + output := runTestProg(t, "testprog", "NumGoroutine") + want := "1\n" + if output != want { + t.Fatalf("want %q, got %q", want, output) + } + + buf := make([]byte, 1<<20) + + // Try up to 10 times for a match before giving up. + // This is a fundamentally racy check but it's important + // to notice if NumGoroutine and Stack are _always_ out of sync. + for i := 0; ; i++ { + // Give goroutines about to exit a chance to exit. 
+ // The NumGoroutine and Stack below need to see + // the same state of the world, so anything we can do + // to keep it quiet is good. + runtime.Gosched() + + n := runtime.NumGoroutine() + buf = buf[:runtime.Stack(buf, true)] + + nstk := strings.Count(string(buf), "goroutine ") + if n == nstk { + break + } + if i >= 10 { + t.Fatalf("NumGoroutine=%d, but found %d goroutines in stack dump: %s", n, nstk, buf) + } + } +} + +func TestPingPongHog(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + if testing.Short() { + t.Skip("skipping in -short mode") + } + if race.Enabled { + // The race detector randomizes the scheduler, + // which causes this test to fail (#38266). + t.Skip("skipping in -race mode") + } + + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + done := make(chan bool) + hogChan, lightChan := make(chan bool), make(chan bool) + hogCount, lightCount := 0, 0 + + run := func(limit int, counter *int, wake chan bool) { + for { + select { + case <-done: + return + + case <-wake: + for i := 0; i < limit; i++ { + *counter++ + } + wake <- true + } + } + } + + // Start two co-scheduled hog goroutines. + for i := 0; i < 2; i++ { + go run(1e6, &hogCount, hogChan) + } + + // Start two co-scheduled light goroutines. + for i := 0; i < 2; i++ { + go run(1e3, &lightCount, lightChan) + } + + // Start goroutine pairs and wait for a few preemption rounds. + hogChan <- true + lightChan <- true + time.Sleep(100 * time.Millisecond) + close(done) + <-hogChan + <-lightChan + + // Check that hogCount and lightCount are within a factor of + // 20, which indicates that both pairs of goroutines handed off + // the P within a time-slice to their buddy. We can use a + // fairly large factor here to make this robust: if the + // scheduler isn't working right, the gap should be ~1000X + // (was 5, increased to 20, see issue 52207). 
+ const factor = 20 + if hogCount/factor > lightCount || lightCount/factor > hogCount { + t.Fatalf("want hogCount/lightCount in [%v, %v]; got %d/%d = %g", 1.0/factor, factor, hogCount, lightCount, float64(hogCount)/float64(lightCount)) + } +} + +func BenchmarkPingPongHog(b *testing.B) { + if b.N == 0 { + return + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + + // Create a CPU hog + stop, done := make(chan bool), make(chan bool) + go func() { + for { + select { + case <-stop: + done <- true + return + default: + } + } + }() + + // Ping-pong b.N times + ping, pong := make(chan bool), make(chan bool) + go func() { + for j := 0; j < b.N; j++ { + pong <- <-ping + } + close(stop) + done <- true + }() + go func() { + for i := 0; i < b.N; i++ { + ping <- <-pong + } + done <- true + }() + b.ResetTimer() + ping <- true // Start ping-pong + <-stop + b.StopTimer() + <-ping // Let last ponger exit + <-done // Make sure goroutines exit + <-done + <-done +} + +var padData [128]uint64 + +func stackGrowthRecursive(i int) { + var pad [128]uint64 + pad = padData + for j := range pad { + if pad[j] != 0 { + return + } + } + if i != 0 { + stackGrowthRecursive(i - 1) + } +} + +func TestPreemptSplitBig(t *testing.T) { + if testing.Short() { + t.Skip("skipping in -short mode") + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(2)) + stop := make(chan int) + go big(stop) + for i := 0; i < 3; i++ { + time.Sleep(10 * time.Microsecond) // let big start running + runtime.GC() + } + close(stop) +} + +func big(stop chan int) int { + n := 0 + for { + // delay so that gc is sure to have asked for a preemption + for i := 0; i < 1e9; i++ { + n++ + } + + // call bigframe, which used to miss the preemption in its prologue. + bigframe(stop) + + // check if we've been asked to stop. + select { + case <-stop: + return n + } + } +} + +func bigframe(stop chan int) int { + // not splitting the stack will overflow. + // small will notice that it needs a stack split and will + // catch the overflow. 
+ var x [8192]byte + return small(stop, &x) +} + +func small(stop chan int, x *[8192]byte) int { + for i := range x { + x[i] = byte(i) + } + sum := 0 + for i := range x { + sum += int(x[i]) + } + + // keep small from being a leaf function, which might + // make it not do any stack check at all. + nonleaf(stop) + + return sum +} + +func nonleaf(stop chan int) bool { + // do something that won't be inlined: + select { + case <-stop: + return true + default: + return false + } +} + +func TestSchedLocalQueue(t *testing.T) { + runtime.RunSchedLocalQueueTest() +} + +func TestSchedLocalQueueSteal(t *testing.T) { + runtime.RunSchedLocalQueueStealTest() +} + +func TestSchedLocalQueueEmpty(t *testing.T) { + if runtime.NumCPU() == 1 { + // Takes too long and does not trigger the race. + t.Skip("skipping on uniprocessor") + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(4)) + + // If runtime triggers a forced GC during this test then it will deadlock, + // since the goroutines can't be stopped/preempted during spin wait. + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + // SetGCPercent waits until the mark phase is over, but the runtime + // also preempts at the start of the sweep phase, so make sure that's + // done too. See #45867. 
+ runtime.GC() + + iters := int(1e5) + if testing.Short() { + iters = 1e2 + } + runtime.RunSchedLocalQueueEmptyTest(iters) +} + +func benchmarkStackGrowth(b *testing.B, rec int) { + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + stackGrowthRecursive(rec) + } + }) +} + +func BenchmarkStackGrowth(b *testing.B) { + benchmarkStackGrowth(b, 10) +} + +func BenchmarkStackGrowthDeep(b *testing.B) { + benchmarkStackGrowth(b, 1024) +} + +func BenchmarkCreateGoroutines(b *testing.B) { + benchmarkCreateGoroutines(b, 1) +} + +func BenchmarkCreateGoroutinesParallel(b *testing.B) { + benchmarkCreateGoroutines(b, runtime.GOMAXPROCS(-1)) +} + +func benchmarkCreateGoroutines(b *testing.B, procs int) { + c := make(chan bool) + var f func(n int) + f = func(n int) { + if n == 0 { + c <- true + return + } + go f(n - 1) + } + for i := 0; i < procs; i++ { + go f(b.N / procs) + } + for i := 0; i < procs; i++ { + <-c + } +} + +func BenchmarkCreateGoroutinesCapture(b *testing.B) { + b.ReportAllocs() + for i := 0; i < b.N; i++ { + const N = 4 + var wg sync.WaitGroup + wg.Add(N) + for i := 0; i < N; i++ { + i := i + go func() { + if i >= N { + b.Logf("bad") // just to capture b + } + wg.Done() + }() + } + wg.Wait() + } +} + +// warmupScheduler ensures the scheduler has at least targetThreadCount threads +// in its thread pool. +func warmupScheduler(targetThreadCount int) { + var wg sync.WaitGroup + var count int32 + for i := 0; i < targetThreadCount; i++ { + wg.Add(1) + go func() { + atomic.AddInt32(&count, 1) + for atomic.LoadInt32(&count) < int32(targetThreadCount) { + // spin until all threads started + } + + // spin a bit more to ensure they are all running on separate CPUs. + doWork(time.Millisecond) + wg.Done() + }() + } + wg.Wait() +} + +func doWork(dur time.Duration) { + start := time.Now() + for time.Since(start) < dur { + } +} + +// BenchmarkCreateGoroutinesSingle creates many goroutines, all from a single +// producer (the main benchmark goroutine). 
+// +// Compared to BenchmarkCreateGoroutines, this causes different behavior in the +// scheduler because Ms are much more likely to need to steal work from the +// main P rather than having work in the local run queue. +func BenchmarkCreateGoroutinesSingle(b *testing.B) { + // Since we are interested in stealing behavior, warm the scheduler to + // get all the Ps running first. + warmupScheduler(runtime.GOMAXPROCS(0)) + b.ResetTimer() + + var wg sync.WaitGroup + wg.Add(b.N) + for i := 0; i < b.N; i++ { + go func() { + wg.Done() + }() + } + wg.Wait() +} + +func BenchmarkClosureCall(b *testing.B) { + sum := 0 + off1 := 1 + for i := 0; i < b.N; i++ { + off2 := 2 + func() { + sum += i + off1 + off2 + }() + } + _ = sum +} + +func benchmarkWakeupParallel(b *testing.B, spin func(time.Duration)) { + if runtime.GOMAXPROCS(0) == 1 { + b.Skip("skipping: GOMAXPROCS=1") + } + + wakeDelay := 5 * time.Microsecond + for _, delay := range []time.Duration{ + 0, + 1 * time.Microsecond, + 2 * time.Microsecond, + 5 * time.Microsecond, + 10 * time.Microsecond, + 20 * time.Microsecond, + 50 * time.Microsecond, + 100 * time.Microsecond, + } { + b.Run(delay.String(), func(b *testing.B) { + if b.N == 0 { + return + } + // Start two goroutines, which alternate between being + // sender and receiver in the following protocol: + // + // - The receiver spins for `delay` and then does a + // blocking receive on a channel. + // + // - The sender spins for `delay+wakeDelay` and then + // sends to the same channel. (The addition of + // `wakeDelay` improves the probability that the + // receiver will be blocking when the send occurs when + // the goroutines execute in parallel.) + // + // In each iteration of the benchmark, each goroutine + // acts once as sender and once as receiver, so each + // goroutine spins for delay twice. 
+ // + // BenchmarkWakeupParallel is used to estimate how + // efficiently the scheduler parallelizes goroutines in + // the presence of blocking: + // + // - If both goroutines are executed on the same core, + // an increase in delay by N will increase the time per + // iteration by 4*N, because all 4 delays are + // serialized. + // + // - Otherwise, an increase in delay by N will increase + // the time per iteration by 2*N, and the time per + // iteration is 2 * (runtime overhead + chan + // send/receive pair + delay + wakeDelay). This allows + // the runtime overhead, including the time it takes + // for the unblocked goroutine to be scheduled, to be + // estimated. + ping, pong := make(chan struct{}), make(chan struct{}) + start := make(chan struct{}) + done := make(chan struct{}) + go func() { + <-start + for i := 0; i < b.N; i++ { + // sender + spin(delay + wakeDelay) + ping <- struct{}{} + // receiver + spin(delay) + <-pong + } + done <- struct{}{} + }() + go func() { + for i := 0; i < b.N; i++ { + // receiver + spin(delay) + <-ping + // sender + spin(delay + wakeDelay) + pong <- struct{}{} + } + done <- struct{}{} + }() + b.ResetTimer() + start <- struct{}{} + <-done + <-done + }) + } +} + +func BenchmarkWakeupParallelSpinning(b *testing.B) { + benchmarkWakeupParallel(b, func(d time.Duration) { + end := time.Now().Add(d) + for time.Now().Before(end) { + // do nothing + } + }) +} + +// sysNanosleep is defined by OS-specific files (such as runtime_linux_test.go) +// to sleep for the given duration. If nil, dependent tests are skipped. +// The implementation should invoke a blocking system call and not +// call time.Sleep, which would deschedule the goroutine. 
+var sysNanosleep func(d time.Duration) + +func BenchmarkWakeupParallelSyscall(b *testing.B) { + if sysNanosleep == nil { + b.Skipf("skipping on %v; sysNanosleep not defined", runtime.GOOS) + } + benchmarkWakeupParallel(b, func(d time.Duration) { + sysNanosleep(d) + }) +} + +type Matrix [][]float64 + +func BenchmarkMatmult(b *testing.B) { + b.StopTimer() + // matmult is O(N**3) but testing expects O(b.N), + // so we need to take cube root of b.N + n := int(math.Cbrt(float64(b.N))) + 1 + A := makeMatrix(n) + B := makeMatrix(n) + C := makeMatrix(n) + b.StartTimer() + matmult(nil, A, B, C, 0, n, 0, n, 0, n, 8) +} + +func makeMatrix(n int) Matrix { + m := make(Matrix, n) + for i := 0; i < n; i++ { + m[i] = make([]float64, n) + for j := 0; j < n; j++ { + m[i][j] = float64(i*n + j) + } + } + return m +} + +func matmult(done chan<- struct{}, A, B, C Matrix, i0, i1, j0, j1, k0, k1, threshold int) { + di := i1 - i0 + dj := j1 - j0 + dk := k1 - k0 + if di >= dj && di >= dk && di >= threshold { + // divide in two by y axis + mi := i0 + di/2 + done1 := make(chan struct{}, 1) + go matmult(done1, A, B, C, i0, mi, j0, j1, k0, k1, threshold) + matmult(nil, A, B, C, mi, i1, j0, j1, k0, k1, threshold) + <-done1 + } else if dj >= dk && dj >= threshold { + // divide in two by x axis + mj := j0 + dj/2 + done1 := make(chan struct{}, 1) + go matmult(done1, A, B, C, i0, i1, j0, mj, k0, k1, threshold) + matmult(nil, A, B, C, i0, i1, mj, j1, k0, k1, threshold) + <-done1 + } else if dk >= threshold { + // divide in two by "k" axis + // deliberately not parallel because of data races + mk := k0 + dk/2 + matmult(nil, A, B, C, i0, i1, j0, j1, k0, mk, threshold) + matmult(nil, A, B, C, i0, i1, j0, j1, mk, k1, threshold) + } else { + // the matrices are small enough, compute directly + for i := i0; i < i1; i++ { + for j := j0; j < j1; j++ { + for k := k0; k < k1; k++ { + C[i][j] += A[i][k] * B[k][j] + } + } + } + } + if done != nil { + done <- struct{}{} + } +} + +func TestStealOrder(t 
*testing.T) { + runtime.RunStealOrderTest() +} + +func TestLockOSThreadNesting(t *testing.T) { + if runtime.GOARCH == "wasm" { + t.Skip("no threads on wasm yet") + } + + go func() { + e, i := runtime.LockOSCounts() + if e != 0 || i != 0 { + t.Errorf("want locked counts 0, 0; got %d, %d", e, i) + return + } + runtime.LockOSThread() + runtime.LockOSThread() + runtime.UnlockOSThread() + e, i = runtime.LockOSCounts() + if e != 1 || i != 0 { + t.Errorf("want locked counts 1, 0; got %d, %d", e, i) + return + } + runtime.UnlockOSThread() + e, i = runtime.LockOSCounts() + if e != 0 || i != 0 { + t.Errorf("want locked counts 0, 0; got %d, %d", e, i) + return + } + }() +} + +func TestLockOSThreadExit(t *testing.T) { + testLockOSThreadExit(t, "testprog") +} + +func testLockOSThreadExit(t *testing.T, prog string) { + output := runTestProg(t, prog, "LockOSThreadMain", "GOMAXPROCS=1") + want := "OK\n" + if output != want { + t.Errorf("want %q, got %q", want, output) + } + + output = runTestProg(t, prog, "LockOSThreadAlt") + if output != want { + t.Errorf("want %q, got %q", want, output) + } +} + +func TestLockOSThreadAvoidsStatePropagation(t *testing.T) { + want := "OK\n" + skip := "unshare not permitted\n" + output := runTestProg(t, "testprog", "LockOSThreadAvoidsStatePropagation", "GOMAXPROCS=1") + if output == skip { + t.Skip("unshare syscall not permitted on this system") + } else if output != want { + t.Errorf("want %q, got %q", want, output) + } +} + +func TestLockOSThreadTemplateThreadRace(t *testing.T) { + testenv.MustHaveGoRun(t) + + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + + iterations := 100 + if testing.Short() { + // Reduce run time to ~100ms, with much lower probability of + // catching issues. 
+ iterations = 5 + } + for i := 0; i < iterations; i++ { + want := "OK\n" + output := runBuiltTestProg(t, exe, "LockOSThreadTemplateThreadRace") + if output != want { + t.Fatalf("run %d: want %q, got %q", i, want, output) + } + } +} + +// fakeSyscall emulates a system call. +// +//go:nosplit +func fakeSyscall(duration time.Duration) { + runtime.Entersyscall() + for start := runtime.Nanotime(); runtime.Nanotime()-start < int64(duration); { + } + runtime.Exitsyscall() +} + +// Check that a goroutine will be preempted if it is calling short system calls. +func testPreemptionAfterSyscall(t *testing.T, syscallDuration time.Duration) { + if runtime.GOARCH == "wasm" { + t.Skip("no preemption on wasm yet") + } + + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(2)) + + interations := 10 + if testing.Short() { + interations = 1 + } + const ( + maxDuration = 5 * time.Second + nroutines = 8 + ) + + for i := 0; i < interations; i++ { + c := make(chan bool, nroutines) + stop := uint32(0) + + start := time.Now() + for g := 0; g < nroutines; g++ { + go func(stop *uint32) { + c <- true + for atomic.LoadUint32(stop) == 0 { + fakeSyscall(syscallDuration) + } + c <- true + }(&stop) + } + // wait until all goroutines have started. + for g := 0; g < nroutines; g++ { + <-c + } + atomic.StoreUint32(&stop, 1) + // wait until all goroutines have finished. + for g := 0; g < nroutines; g++ { + <-c + } + duration := time.Since(start) + + if duration > maxDuration { + t.Errorf("timeout exceeded: %v (%v)", duration, maxDuration) + } + } +} + +func TestPreemptionAfterSyscall(t *testing.T) { + if runtime.GOOS == "plan9" { + testenv.SkipFlaky(t, 41015) + } + + for _, i := range []time.Duration{10, 100, 1000} { + d := i * time.Microsecond + t.Run(fmt.Sprint(d), func(t *testing.T) { + testPreemptionAfterSyscall(t, d) + }) + } +} + +func TestGetgThreadSwitch(t *testing.T) { + runtime.RunGetgThreadSwitchTest() +} + +// TestNetpollBreak tests that netpollBreak can break a netpoll. 
+// This test is not particularly safe since the call to netpoll +// will pick up any stray files that are ready, but it should work +// OK as long as it is not run in parallel. +func TestNetpollBreak(t *testing.T) { + if runtime.GOMAXPROCS(0) == 1 { + t.Skip("skipping: GOMAXPROCS=1") + } + + // Make sure that netpoll is initialized. + runtime.NetpollGenericInit() + + start := time.Now() + c := make(chan bool, 2) + go func() { + c <- true + runtime.Netpoll(10 * time.Second.Nanoseconds()) + c <- true + }() + <-c + // Loop because the break might get eaten by the scheduler. + // Break twice to break both the netpoll we started and the + // scheduler netpoll. +loop: + for { + runtime.Usleep(100) + runtime.NetpollBreak() + runtime.NetpollBreak() + select { + case <-c: + break loop + default: + } + } + if dur := time.Since(start); dur > 5*time.Second { + t.Errorf("netpollBreak did not interrupt netpoll: slept for: %v", dur) + } +} + +// TestBigGOMAXPROCS tests that setting GOMAXPROCS to a large value +// doesn't cause a crash at startup. See issue 38474. +func TestBigGOMAXPROCS(t *testing.T) { + t.Parallel() + output := runTestProg(t, "testprog", "NonexistentTest", "GOMAXPROCS=1024") + // Ignore error conditions on small machines. + for _, errstr := range []string{ + "failed to create new OS thread", + "cannot allocate memory", + } { + if strings.Contains(output, errstr) { + t.Skipf("failed to create 1024 threads") + } + } + if !strings.Contains(output, "unknown function: NonexistentTest") { + t.Errorf("output:\n%s\nwanted:\nunknown function: NonexistentTest", output) + } +} diff --git a/src/runtime/profbuf.go b/src/runtime/profbuf.go new file mode 100644 index 0000000..083b55a --- /dev/null +++ b/src/runtime/profbuf.go @@ -0,0 +1,561 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// A profBuf is a lock-free buffer for profiling events, +// safe for concurrent use by one reader and one writer. +// The writer may be a signal handler running without a user g. +// The reader is assumed to be a user g. +// +// Each logged event corresponds to a fixed size header, a list of +// uintptrs (typically a stack), and exactly one unsafe.Pointer tag. +// The header and uintptrs are stored in the circular buffer data and the +// tag is stored in a circular buffer tags, running in parallel. +// In the circular buffer data, each event takes 2+hdrsize+len(stk) +// words: the value 2+hdrsize+len(stk), then the time of the event, then +// hdrsize words giving the fixed-size header, and then len(stk) words +// for the stack. +// +// The current effective offsets into the tags and data circular buffers +// for reading and writing are stored in the high 30 and low 32 bits of r and w. +// The bottom bits of the high 32 are additional flag bits in w, unused in r. +// "Effective" offsets means the total number of reads or writes, mod 2^length. +// The offset in the buffer is the effective offset mod the length of the buffer. +// To make wraparound mod 2^length match wraparound mod length of the buffer, +// the length of the buffer must be a power of two. +// +// If the reader catches up to the writer, a flag passed to read controls +// whether the read blocks until more data is available. A read returns a +// pointer to the buffer data itself; the caller is assumed to be done with +// that data at the next read. The read offset rNext tracks the next offset to +// be returned by read. By definition, r ≤ rNext ≤ w (before wraparound), +// and rNext is only used by the reader, so it can be accessed without atomics. 
+// +// If the writer gets ahead of the reader, so that the buffer fills, +// future writes are discarded and replaced in the output stream by an +// overflow entry, which has size 2+hdrsize+1, time set to the time of +// the first discarded write, a header of all zeroed words, and a "stack" +// containing one word, the number of discarded writes. +// +// Between the time the buffer fills and the buffer becomes empty enough +// to hold more data, the overflow entry is stored as a pending overflow +// entry in the fields overflow and overflowTime. The pending overflow +// entry can be turned into a real record by either the writer or the +// reader. If the writer is called to write a new record and finds that +// the output buffer has room for both the pending overflow entry and the +// new record, the writer emits the pending overflow entry and the new +// record into the buffer. If the reader is called to read data and finds +// that the output buffer is empty but that there is a pending overflow +// entry, the reader will return a synthesized record for the pending +// overflow entry. +// +// Only the writer can create or add to a pending overflow entry, but +// either the reader or the writer can clear the pending overflow entry. +// A pending overflow entry is indicated by the low 32 bits of 'overflow' +// holding the number of discarded writes, and overflowTime holding the +// time of the first discarded write. The high 32 bits of 'overflow' +// increment each time the low 32 bits transition from zero to non-zero +// or vice versa. This sequence number avoids ABA problems in the use of +// compare-and-swap to coordinate between reader and writer. +// The overflowTime is only written when the low 32 bits of overflow are +// zero, that is, only when there is no pending overflow entry, in +// preparation for creating a new one. 
The reader can therefore fetch and +// clear the entry atomically using +// +// for { +// overflow = load(&b.overflow) +// if uint32(overflow) == 0 { +// // no pending entry +// break +// } +// time = load(&b.overflowTime) +// if cas(&b.overflow, overflow, ((overflow>>32)+1)<<32) { +// // pending entry cleared +// break +// } +// } +// if uint32(overflow) > 0 { +// emit entry for uint32(overflow), time +// } +type profBuf struct { + // accessed atomically + r, w profAtomic + overflow atomic.Uint64 + overflowTime atomic.Uint64 + eof atomic.Uint32 + + // immutable (excluding slice content) + hdrsize uintptr + data []uint64 + tags []unsafe.Pointer + + // owned by reader + rNext profIndex + overflowBuf []uint64 // for use by reader to return overflow record + wait note +} + +// A profAtomic is the atomically-accessed word holding a profIndex. +type profAtomic uint64 + +// A profIndex is the packed tag and data counts and flag bits, described above. +type profIndex uint64 + +const ( + profReaderSleeping profIndex = 1 << 32 // reader is sleeping and must be woken up + profWriteExtra profIndex = 1 << 33 // overflow or eof waiting +) + +func (x *profAtomic) load() profIndex { + return profIndex(atomic.Load64((*uint64)(x))) +} + +func (x *profAtomic) store(new profIndex) { + atomic.Store64((*uint64)(x), uint64(new)) +} + +func (x *profAtomic) cas(old, new profIndex) bool { + return atomic.Cas64((*uint64)(x), uint64(old), uint64(new)) +} + +func (x profIndex) dataCount() uint32 { + return uint32(x) +} + +func (x profIndex) tagCount() uint32 { + return uint32(x >> 34) +} + +// countSub subtracts two counts obtained from profIndex.dataCount or profIndex.tagCount, +// assuming that they are no more than 2^29 apart (guaranteed since they are never more than +// len(data) or len(tags) apart, respectively). +// tagCount wraps at 2^30, while dataCount wraps at 2^32. +// This function works for both.
+func countSub(x, y uint32) int { + // x-y is 32-bit signed or 30-bit signed; sign-extend to 32 bits and convert to int. + return int(int32(x-y) << 2 >> 2) +} + +// addCountsAndClearFlags returns the packed form of "x + (data, tag) - all flags". +func (x profIndex) addCountsAndClearFlags(data, tag int) profIndex { + return profIndex((uint64(x)>>34+uint64(uint32(tag)<<2>>2))<<34 | uint64(uint32(x)+uint32(data))) +} + +// hasOverflow reports whether b has any overflow records pending. +func (b *profBuf) hasOverflow() bool { + return uint32(b.overflow.Load()) > 0 +} + +// takeOverflow consumes the pending overflow records, returning the overflow count +// and the time of the first overflow. +// When called by the reader, it is racing against incrementOverflow. +func (b *profBuf) takeOverflow() (count uint32, time uint64) { + overflow := b.overflow.Load() + time = b.overflowTime.Load() + for { + count = uint32(overflow) + if count == 0 { + time = 0 + break + } + // Increment generation, clear overflow count in low bits. + if b.overflow.CompareAndSwap(overflow, ((overflow>>32)+1)<<32) { + break + } + overflow = b.overflow.Load() + time = b.overflowTime.Load() + } + return uint32(overflow), time +} + +// incrementOverflow records a single overflow at time now. +// It is racing against a possible takeOverflow in the reader. +func (b *profBuf) incrementOverflow(now int64) { + for { + overflow := b.overflow.Load() + + // Once we see b.overflow reach 0, it's stable: no one else is changing it underfoot. + // We need to set overflowTime if we're incrementing b.overflow from 0. + if uint32(overflow) == 0 { + // Store overflowTime first so it's always available when overflow != 0. + b.overflowTime.Store(uint64(now)) + b.overflow.Store((((overflow >> 32) + 1) << 32) + 1) + break + } + // Otherwise we're racing to increment against reader + // who wants to set b.overflow to 0. + // Out of paranoia, leave 2³²-1 a sticky overflow value, + // to avoid wrapping around. 
Extremely unlikely. + if int32(overflow) == -1 { + break + } + if b.overflow.CompareAndSwap(overflow, overflow+1) { + break + } + } +} + +// newProfBuf returns a new profiling buffer with room for +// a header of hdrsize words and a buffer of at least bufwords words. +func newProfBuf(hdrsize, bufwords, tags int) *profBuf { + if min := 2 + hdrsize + 1; bufwords < min { + bufwords = min + } + + // Buffer sizes must be power of two, so that we don't have to + // worry about uint32 wraparound changing the effective position + // within the buffers. We store 30 bits of count; limiting to 28 + // gives us some room for intermediate calculations. + if bufwords >= 1<<28 || tags >= 1<<28 { + throw("newProfBuf: buffer too large") + } + var i int + for i = 1; i < bufwords; i <<= 1 { + } + bufwords = i + for i = 1; i < tags; i <<= 1 { + } + tags = i + + b := new(profBuf) + b.hdrsize = uintptr(hdrsize) + b.data = make([]uint64, bufwords) + b.tags = make([]unsafe.Pointer, tags) + b.overflowBuf = make([]uint64, 2+b.hdrsize+1) + return b +} + +// canWriteRecord reports whether the buffer has room +// for a single contiguous record with a stack of length nstk. +func (b *profBuf) canWriteRecord(nstk int) bool { + br := b.r.load() + bw := b.w.load() + + // room for tag? + if countSub(br.tagCount(), bw.tagCount())+len(b.tags) < 1 { + return false + } + + // room for data? + nd := countSub(br.dataCount(), bw.dataCount()) + len(b.data) + want := 2 + int(b.hdrsize) + nstk + i := int(bw.dataCount() % uint32(len(b.data))) + if i+want > len(b.data) { + // Can't fit in trailing fragment of slice. + // Skip over that and start over at beginning of slice. + nd -= len(b.data) - i + } + return nd >= want +} + +// canWriteTwoRecords reports whether the buffer has room +// for two records with stack lengths nstk1, nstk2, in that order. 
+// Each record must be contiguous on its own, but the two +// records need not be contiguous (one can be at the end of the buffer +// and the other can wrap around and start at the beginning of the buffer). +func (b *profBuf) canWriteTwoRecords(nstk1, nstk2 int) bool { + br := b.r.load() + bw := b.w.load() + + // room for tag? + if countSub(br.tagCount(), bw.tagCount())+len(b.tags) < 2 { + return false + } + + // room for data? + nd := countSub(br.dataCount(), bw.dataCount()) + len(b.data) + + // first record + want := 2 + int(b.hdrsize) + nstk1 + i := int(bw.dataCount() % uint32(len(b.data))) + if i+want > len(b.data) { + // Can't fit in trailing fragment of slice. + // Skip over that and start over at beginning of slice. + nd -= len(b.data) - i + i = 0 + } + i += want + nd -= want + + // second record + want = 2 + int(b.hdrsize) + nstk2 + if i+want > len(b.data) { + // Can't fit in trailing fragment of slice. + // Skip over that and start over at beginning of slice. + nd -= len(b.data) - i + i = 0 + } + return nd >= want +} + +// write writes an entry to the profiling buffer b. +// The entry begins with a fixed hdr, which must have +// length b.hdrsize, followed by a variable-sized stack +// and a single tag pointer *tagPtr (or nil if tagPtr is nil). +// No write barriers allowed because this might be called from a signal handler. +func (b *profBuf) write(tagPtr *unsafe.Pointer, now int64, hdr []uint64, stk []uintptr) { + if b == nil { + return + } + if len(hdr) > int(b.hdrsize) { + throw("misuse of profBuf.write") + } + + if hasOverflow := b.hasOverflow(); hasOverflow && b.canWriteTwoRecords(1, len(stk)) { + // Room for both an overflow record and the one being written. + // Write the overflow record if the reader hasn't gotten to it yet. + // Only racing against reader, not other writers. 
+ count, time := b.takeOverflow() + if count > 0 { + var stk [1]uintptr + stk[0] = uintptr(count) + b.write(nil, int64(time), nil, stk[:]) + } + } else if hasOverflow || !b.canWriteRecord(len(stk)) { + // Pending overflow without room to write overflow and new records + // or no overflow but also no room for new record. + b.incrementOverflow(now) + b.wakeupExtra() + return + } + + // There's room: write the record. + br := b.r.load() + bw := b.w.load() + + // Profiling tag + // + // The tag is a pointer, but we can't run a write barrier here. + // We have interrupted the OS-level execution of gp, but the + // runtime still sees gp as executing. In effect, we are running + // in place of the real gp. Since gp is the only goroutine that + // can overwrite gp.labels, the value of gp.labels is stable during + // this signal handler: it will still be reachable from gp when + // we finish executing. If a GC is in progress right now, it must + // keep gp.labels alive, because gp.labels is reachable from gp. + // If gp were to overwrite gp.labels, the deletion barrier would + // still shade that pointer, which would preserve it for the + // in-progress GC, so all is well. Any future GC will see the + // value we copied when scanning b.tags (heap-allocated). + // We arrange that the store here is always overwriting a nil, + // so there is no need for a deletion barrier on b.tags[wt]. + wt := int(bw.tagCount() % uint32(len(b.tags))) + if tagPtr != nil { + *(*uintptr)(unsafe.Pointer(&b.tags[wt])) = uintptr(unsafe.Pointer(*tagPtr)) + } + + // Main record. + // It has to fit in a contiguous section of the slice, so if it doesn't fit at the end, + // leave a rewind marker (0) and start over at the beginning of the slice. 
+ wd := int(bw.dataCount() % uint32(len(b.data))) + nd := countSub(br.dataCount(), bw.dataCount()) + len(b.data) + skip := 0 + if wd+2+int(b.hdrsize)+len(stk) > len(b.data) { + b.data[wd] = 0 + skip = len(b.data) - wd + nd -= skip + wd = 0 + } + data := b.data[wd:] + data[0] = uint64(2 + b.hdrsize + uintptr(len(stk))) // length + data[1] = uint64(now) // time stamp + // header, zero-padded + i := uintptr(copy(data[2:2+b.hdrsize], hdr)) + for ; i < b.hdrsize; i++ { + data[2+i] = 0 + } + for i, pc := range stk { + data[2+b.hdrsize+uintptr(i)] = uint64(pc) + } + + for { + // Commit write. + // Racing with reader setting flag bits in b.w, to avoid lost wakeups. + old := b.w.load() + new := old.addCountsAndClearFlags(skip+2+len(stk)+int(b.hdrsize), 1) + if !b.w.cas(old, new) { + continue + } + // If there was a reader, wake it up. + if old&profReaderSleeping != 0 { + notewakeup(&b.wait) + } + break + } +} + +// close signals that there will be no more writes on the buffer. +// Once all the data has been read from the buffer, reads will return eof=true. +func (b *profBuf) close() { + if b.eof.Load() > 0 { + throw("runtime: profBuf already closed") + } + b.eof.Store(1) + b.wakeupExtra() +} + +// wakeupExtra must be called after setting one of the "extra" +// atomic fields b.overflow or b.eof. +// It records the change in b.w and wakes up the reader if needed. +func (b *profBuf) wakeupExtra() { + for { + old := b.w.load() + new := old | profWriteExtra + if !b.w.cas(old, new) { + continue + } + if old&profReaderSleeping != 0 { + notewakeup(&b.wait) + } + break + } +} + +// profBufReadMode specifies whether to block when no data is available to read. 
+type profBufReadMode int + +const ( + profBufBlocking profBufReadMode = iota + profBufNonBlocking +) + +var overflowTag [1]unsafe.Pointer // always nil + +func (b *profBuf) read(mode profBufReadMode) (data []uint64, tags []unsafe.Pointer, eof bool) { + if b == nil { + return nil, nil, true + } + + br := b.rNext + + // Commit previous read, returning that part of the ring to the writer. + // First clear tags that have now been read, both to avoid holding + // up the memory they point at for longer than necessary + // and so that b.write can assume it is always overwriting + // nil tag entries (see comment in b.write). + rPrev := b.r.load() + if rPrev != br { + ntag := countSub(br.tagCount(), rPrev.tagCount()) + ti := int(rPrev.tagCount() % uint32(len(b.tags))) + for i := 0; i < ntag; i++ { + b.tags[ti] = nil + if ti++; ti == len(b.tags) { + ti = 0 + } + } + b.r.store(br) + } + +Read: + bw := b.w.load() + numData := countSub(bw.dataCount(), br.dataCount()) + if numData == 0 { + if b.hasOverflow() { + // No data to read, but there is overflow to report. + // Racing with writer flushing b.overflow into a real record. + count, time := b.takeOverflow() + if count == 0 { + // Lost the race, go around again. + goto Read + } + // Won the race, report overflow. + dst := b.overflowBuf + dst[0] = uint64(2 + b.hdrsize + 1) + dst[1] = uint64(time) + for i := uintptr(0); i < b.hdrsize; i++ { + dst[2+i] = 0 + } + dst[2+b.hdrsize] = uint64(count) + return dst[:2+b.hdrsize+1], overflowTag[:1], false + } + if b.eof.Load() > 0 { + // No data, no overflow, EOF set: done. + return nil, nil, true + } + if bw&profWriteExtra != 0 { + // Writer claims to have published extra information (overflow or eof). + // Attempt to clear notification and then check again. + // If we fail to clear the notification it means b.w changed, + // so we still need to check again. + b.w.cas(bw, bw&^profWriteExtra) + goto Read + } + + // Nothing to read right now. + // Return or sleep according to mode. 
+ if mode == profBufNonBlocking { + // Necessary on Darwin, notetsleepg below does not work in signal handler, root cause of #61768. + return nil, nil, false + } + if !b.w.cas(bw, bw|profReaderSleeping) { + goto Read + } + // Committed to sleeping. + notetsleepg(&b.wait, -1) + noteclear(&b.wait) + goto Read + } + data = b.data[br.dataCount()%uint32(len(b.data)):] + if len(data) > numData { + data = data[:numData] + } else { + numData -= len(data) // available in case of wraparound + } + skip := 0 + if data[0] == 0 { + // Wraparound record. Go back to the beginning of the ring. + skip = len(data) + data = b.data + if len(data) > numData { + data = data[:numData] + } + } + + ntag := countSub(bw.tagCount(), br.tagCount()) + if ntag == 0 { + throw("runtime: malformed profBuf buffer - tag and data out of sync") + } + tags = b.tags[br.tagCount()%uint32(len(b.tags)):] + if len(tags) > ntag { + tags = tags[:ntag] + } + + // Count out whole data records until either data or tags is done. + // They are always in sync in the buffer, but due to an end-of-slice + // wraparound we might need to stop early and return the rest + // in the next call. + di := 0 + ti := 0 + for di < len(data) && data[di] != 0 && ti < len(tags) { + if uintptr(di)+uintptr(data[di]) > uintptr(len(data)) { + throw("runtime: malformed profBuf buffer - invalid size") + } + di += int(data[di]) + ti++ + } + + // Remember how much we returned, to commit read on next call. + b.rNext = br.addCountsAndClearFlags(skip+di, ti) + + if raceenabled { + // Match racereleasemerge in runtime_setProfLabel, + // so that the setting of the labels in runtime_setProfLabel + // is treated as happening before any use of the labels + // by our caller. The synchronization on labelSync itself is a fiction + // for the race detector. 
The actual synchronization is handled + // by the fact that the signal handler only reads from the current + // goroutine and uses atomics to write the updated queue indices, + // and then the read-out from the signal handler buffer uses + // atomics to read those queue indices. + raceacquire(unsafe.Pointer(&labelSync)) + } + + return data[:di], tags[:ti], false +} diff --git a/src/runtime/profbuf_test.go b/src/runtime/profbuf_test.go new file mode 100644 index 0000000..d9c5264 --- /dev/null +++ b/src/runtime/profbuf_test.go @@ -0,0 +1,182 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "reflect" + . "runtime" + "testing" + "time" + "unsafe" +) + +func TestProfBuf(t *testing.T) { + const hdrSize = 2 + + write := func(t *testing.T, b *ProfBuf, tag unsafe.Pointer, now int64, hdr []uint64, stk []uintptr) { + b.Write(&tag, now, hdr, stk) + } + read := func(t *testing.T, b *ProfBuf, data []uint64, tags []unsafe.Pointer) { + rdata, rtags, eof := b.Read(ProfBufNonBlocking) + if !reflect.DeepEqual(rdata, data) || !reflect.DeepEqual(rtags, tags) { + t.Fatalf("unexpected profile read:\nhave data %#x\nwant data %#x\nhave tags %#x\nwant tags %#x", rdata, data, rtags, tags) + } + if eof { + t.Fatalf("unexpected eof") + } + } + readBlock := func(t *testing.T, b *ProfBuf, data []uint64, tags []unsafe.Pointer) func() { + c := make(chan int) + go func() { + eof := data == nil + rdata, rtags, reof := b.Read(ProfBufBlocking) + if !reflect.DeepEqual(rdata, data) || !reflect.DeepEqual(rtags, tags) || reof != eof { + // Errorf, not Fatalf, because called in goroutine. 
+ t.Errorf("unexpected profile read:\nhave data %#x\nwant data %#x\nhave tags %#x\nwant tags %#x\nhave eof=%v, want %v", rdata, data, rtags, tags, reof, eof) + } + c <- 1 + }() + time.Sleep(10 * time.Millisecond) // let goroutine run and block + return func() { + select { + case <-c: + case <-time.After(1 * time.Second): + t.Fatalf("timeout waiting for blocked read") + } + } + } + readEOF := func(t *testing.T, b *ProfBuf) { + rdata, rtags, eof := b.Read(ProfBufBlocking) + if rdata != nil || rtags != nil || !eof { + t.Errorf("unexpected profile read: %#x, %#x, eof=%v; want nil, nil, eof=true", rdata, rtags, eof) + } + rdata, rtags, eof = b.Read(ProfBufNonBlocking) + if rdata != nil || rtags != nil || !eof { + t.Errorf("unexpected profile read (non-blocking): %#x, %#x, eof=%v; want nil, nil, eof=true", rdata, rtags, eof) + } + } + + myTags := make([]byte, 100) + t.Logf("myTags is %p", &myTags[0]) + + t.Run("BasicWriteRead", func(t *testing.T) { + b := NewProfBuf(2, 11, 1) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + read(t, b, nil, nil) // release data returned by previous read + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + read(t, b, []uint64{8, 99, 101, 102, 201, 202, 203, 204}, []unsafe.Pointer{unsafe.Pointer(&myTags[2])}) + }) + + t.Run("ReadMany", func(t *testing.T) { + b := NewProfBuf(2, 50, 50) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + write(t, b, unsafe.Pointer(&myTags[1]), 500, []uint64{502, 504}, []uintptr{506}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 99, 101, 102, 201, 202, 203, 204, 5, 500, 502, 504, 506}, []unsafe.Pointer{unsafe.Pointer(&myTags[0]), unsafe.Pointer(&myTags[2]), 
unsafe.Pointer(&myTags[1])}) + }) + + t.Run("ReadManyShortData", func(t *testing.T) { + b := NewProfBuf(2, 50, 50) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 99, 101, 102, 201, 202, 203, 204}, []unsafe.Pointer{unsafe.Pointer(&myTags[0]), unsafe.Pointer(&myTags[2])}) + }) + + t.Run("ReadManyShortTags", func(t *testing.T) { + b := NewProfBuf(2, 50, 50) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 8, 99, 101, 102, 201, 202, 203, 204}, []unsafe.Pointer{unsafe.Pointer(&myTags[0]), unsafe.Pointer(&myTags[2])}) + }) + + t.Run("ReadAfterOverflow1", func(t *testing.T) { + // overflow record synthesized by write + b := NewProfBuf(2, 16, 5) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) // uses 10 + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) // reads 10 but still in use until next read + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5}) // uses 6 + read(t, b, []uint64{6, 1, 2, 3, 4, 5}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) // reads 6 but still in use until next read + // now 10 available + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204, 205, 206, 207, 208, 209}) // no room + for i := 0; i < 299; i++ { + write(t, b, unsafe.Pointer(&myTags[3]), int64(100+i), []uint64{101, 102}, []uintptr{201, 202, 203, 204}) // no room for overflow+this record + } + write(t, b, unsafe.Pointer(&myTags[1]), 500, []uint64{502, 504}, []uintptr{506}) // room for overflow+this record + read(t, b, []uint64{5, 99, 0, 0, 300, 5, 500, 502, 
504, 506}, []unsafe.Pointer{nil, unsafe.Pointer(&myTags[1])}) + }) + + t.Run("ReadAfterOverflow2", func(t *testing.T) { + // overflow record synthesized by read + b := NewProfBuf(2, 16, 5) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213}) + for i := 0; i < 299; i++ { + write(t, b, unsafe.Pointer(&myTags[3]), 100, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + } + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) // reads 10 but still in use until next read + write(t, b, unsafe.Pointer(&myTags[1]), 500, []uint64{502, 504}, []uintptr{}) // still overflow + read(t, b, []uint64{5, 99, 0, 0, 301}, []unsafe.Pointer{nil}) // overflow synthesized by read + write(t, b, unsafe.Pointer(&myTags[1]), 500, []uint64{502, 505}, []uintptr{506}) // written + read(t, b, []uint64{5, 500, 502, 505, 506}, []unsafe.Pointer{unsafe.Pointer(&myTags[1])}) + }) + + t.Run("ReadAtEndAfterOverflow", func(t *testing.T) { + b := NewProfBuf(2, 12, 5) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + for i := 0; i < 299; i++ { + write(t, b, unsafe.Pointer(&myTags[3]), 100, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + } + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + read(t, b, []uint64{5, 99, 0, 0, 300}, []unsafe.Pointer{nil}) + write(t, b, unsafe.Pointer(&myTags[1]), 500, []uint64{502, 504}, []uintptr{506}) + read(t, b, []uint64{5, 500, 502, 504, 506}, []unsafe.Pointer{unsafe.Pointer(&myTags[1])}) + }) + + t.Run("BlockingWriteRead", func(t *testing.T) { + b := NewProfBuf(2, 11, 1) + wait := readBlock(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, 
[]unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + wait() + wait = readBlock(t, b, []uint64{8, 99, 101, 102, 201, 202, 203, 204}, []unsafe.Pointer{unsafe.Pointer(&myTags[2])}) + time.Sleep(10 * time.Millisecond) + write(t, b, unsafe.Pointer(&myTags[2]), 99, []uint64{101, 102}, []uintptr{201, 202, 203, 204}) + wait() + wait = readBlock(t, b, nil, nil) + b.Close() + wait() + wait = readBlock(t, b, nil, nil) + wait() + readEOF(t, b) + }) + + t.Run("DataWraparound", func(t *testing.T) { + b := NewProfBuf(2, 16, 1024) + for i := 0; i < 10; i++ { + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + read(t, b, nil, nil) // release data returned by previous read + } + }) + + t.Run("TagWraparound", func(t *testing.T) { + b := NewProfBuf(2, 1024, 2) + for i := 0; i < 10; i++ { + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + read(t, b, nil, nil) // release data returned by previous read + } + }) + + t.Run("BothWraparound", func(t *testing.T) { + b := NewProfBuf(2, 16, 2) + for i := 0; i < 10; i++ { + write(t, b, unsafe.Pointer(&myTags[0]), 1, []uint64{2, 3}, []uintptr{4, 5, 6, 7, 8, 9}) + read(t, b, []uint64{10, 1, 2, 3, 4, 5, 6, 7, 8, 9}, []unsafe.Pointer{unsafe.Pointer(&myTags[0])}) + read(t, b, nil, nil) // release data returned by previous read + } + }) +} diff --git a/src/runtime/proflabel.go b/src/runtime/proflabel.go new file mode 100644 index 0000000..b2a1617 --- /dev/null +++ b/src/runtime/proflabel.go @@ -0,0 +1,40 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +var labelSync uintptr + +//go:linkname runtime_setProfLabel runtime/pprof.runtime_setProfLabel +func runtime_setProfLabel(labels unsafe.Pointer) { + // Introduce race edge for read-back via profile. + // This would more properly use &getg().labels as the sync address, + // but we do the read in a signal handler and can't call the race runtime then. + // + // This uses racereleasemerge rather than just racerelease so + // the acquire in profBuf.read synchronizes with *all* prior + // setProfLabel operations, not just the most recent one. This + // is important because profBuf.read will observe different + // labels set by different setProfLabel operations on + // different goroutines, so it needs to synchronize with all + // of them (this wouldn't be an issue if we could synchronize + // on &getg().labels since we would synchronize with each + // most-recent labels write separately.) + // + // racereleasemerge is like a full read-modify-write on + // labelSync, rather than just a store-release, so it carries + // a dependency on the previous racereleasemerge, which + // ultimately carries forward to the acquire in profBuf.read. + if raceenabled { + racereleasemerge(unsafe.Pointer(&labelSync)) + } + getg().labels = labels +} + +//go:linkname runtime_getProfLabel runtime/pprof.runtime_getProfLabel +func runtime_getProfLabel() unsafe.Pointer { + return getg().labels +} diff --git a/src/runtime/race.go b/src/runtime/race.go new file mode 100644 index 0000000..f83a04d --- /dev/null +++ b/src/runtime/race.go @@ -0,0 +1,649 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// Public race detection API, present iff build with -race. 
+ +func RaceRead(addr unsafe.Pointer) +func RaceWrite(addr unsafe.Pointer) +func RaceReadRange(addr unsafe.Pointer, len int) +func RaceWriteRange(addr unsafe.Pointer, len int) + +func RaceErrors() int { + var n uint64 + racecall(&__tsan_report_count, uintptr(unsafe.Pointer(&n)), 0, 0, 0) + return int(n) +} + +//go:nosplit + +// RaceAcquire/RaceRelease/RaceReleaseMerge establish happens-before relations +// between goroutines. These inform the race detector about actual synchronization +// that it can't see for some reason (e.g. synchronization within RaceDisable/RaceEnable +// sections of code). +// RaceAcquire establishes a happens-before relation with the preceding +// RaceReleaseMerge on addr up to and including the last RaceRelease on addr. +// In terms of the C memory model (C11 §5.1.2.4, §7.17.3), +// RaceAcquire is equivalent to atomic_load(memory_order_acquire). +func RaceAcquire(addr unsafe.Pointer) { + raceacquire(addr) +} + +//go:nosplit + +// RaceRelease performs a release operation on addr that +// can synchronize with a later RaceAcquire on addr. +// +// In terms of the C memory model, RaceRelease is equivalent to +// atomic_store(memory_order_release). +func RaceRelease(addr unsafe.Pointer) { + racerelease(addr) +} + +//go:nosplit + +// RaceReleaseMerge is like RaceRelease, but also establishes a happens-before +// relation with the preceding RaceRelease or RaceReleaseMerge on addr. +// +// In terms of the C memory model, RaceReleaseMerge is equivalent to +// atomic_exchange(memory_order_release). +func RaceReleaseMerge(addr unsafe.Pointer) { + racereleasemerge(addr) +} + +//go:nosplit + +// RaceDisable disables handling of race synchronization events in the current goroutine. +// Handling is re-enabled with RaceEnable. RaceDisable/RaceEnable can be nested. +// Non-synchronization events (memory accesses, function entry/exit) still affect +// the race detector. 
+func RaceDisable() { + gp := getg() + if gp.raceignore == 0 { + racecall(&__tsan_go_ignore_sync_begin, gp.racectx, 0, 0, 0) + } + gp.raceignore++ +} + +//go:nosplit + +// RaceEnable re-enables handling of race events in the current goroutine. +func RaceEnable() { + gp := getg() + gp.raceignore-- + if gp.raceignore == 0 { + racecall(&__tsan_go_ignore_sync_end, gp.racectx, 0, 0, 0) + } +} + +// Private interface for the runtime. + +const raceenabled = true + +// For all functions accepting callerpc and pc, +// callerpc is a return PC of the function that calls this function, +// pc is start PC of the function that calls this function. +func raceReadObjectPC(t *_type, addr unsafe.Pointer, callerpc, pc uintptr) { + kind := t.kind & kindMask + if kind == kindArray || kind == kindStruct { + // for composite objects we have to read every address + // because a write might happen to any subobject. + racereadrangepc(addr, t.size, callerpc, pc) + } else { + // for non-composite objects we can read just the start + // address, as any write must write the first byte. + racereadpc(addr, callerpc, pc) + } +} + +func raceWriteObjectPC(t *_type, addr unsafe.Pointer, callerpc, pc uintptr) { + kind := t.kind & kindMask + if kind == kindArray || kind == kindStruct { + // for composite objects we have to write every address + // because a write might happen to any subobject. + racewriterangepc(addr, t.size, callerpc, pc) + } else { + // for non-composite objects we can write just the start + // address, as any write must write the first byte. 
+ racewritepc(addr, callerpc, pc) + } +} + +//go:noescape +func racereadpc(addr unsafe.Pointer, callpc, pc uintptr) + +//go:noescape +func racewritepc(addr unsafe.Pointer, callpc, pc uintptr) + +type symbolizeCodeContext struct { + pc uintptr + fn *byte + file *byte + line uintptr + off uintptr + res uintptr +} + +var qq = [...]byte{'?', '?', 0} +var dash = [...]byte{'-', 0} + +const ( + raceGetProcCmd = iota + raceSymbolizeCodeCmd + raceSymbolizeDataCmd +) + +// Callback from C into Go, runs on g0. +func racecallback(cmd uintptr, ctx unsafe.Pointer) { + switch cmd { + case raceGetProcCmd: + throw("should have been handled by racecallbackthunk") + case raceSymbolizeCodeCmd: + raceSymbolizeCode((*symbolizeCodeContext)(ctx)) + case raceSymbolizeDataCmd: + raceSymbolizeData((*symbolizeDataContext)(ctx)) + default: + throw("unknown command") + } +} + +// raceSymbolizeCode reads ctx.pc and populates the rest of *ctx with +// information about the code at that pc. +// +// The race detector has already subtracted 1 from pcs, so they point to the last +// byte of call instructions (including calls to runtime.racewrite and friends). +// +// If the incoming pc is part of an inlined function, *ctx is populated +// with information about the inlined function, and on return ctx.pc is set +// to a pc in the logically containing function. (The race detector should call this +// function again with that pc.) +// +// If the incoming pc is not part of an inlined function, the return pc is unchanged. +func raceSymbolizeCode(ctx *symbolizeCodeContext) { + pc := ctx.pc + fi := findfunc(pc) + f := fi._Func() + if f != nil { + file, line := f.FileLine(pc) + if line != 0 { + if inldata := funcdata(fi, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + for { + ix := pcdatavalue(fi, _PCDATA_InlTreeIndex, pc, nil) + if ix >= 0 { + if inltree[ix].funcID == funcID_wrapper { + // ignore wrappers + // Back up to an instruction in the "caller". 
+ pc = f.Entry() + uintptr(inltree[ix].parentPc) + continue + } + ctx.pc = f.Entry() + uintptr(inltree[ix].parentPc) // "caller" pc + ctx.fn = cfuncnameFromNameOff(fi, inltree[ix].nameOff) + ctx.line = uintptr(line) + ctx.file = &bytes(file)[0] // assume NUL-terminated + ctx.off = pc - f.Entry() + ctx.res = 1 + return + } + break + } + } + ctx.fn = cfuncname(fi) + ctx.line = uintptr(line) + ctx.file = &bytes(file)[0] // assume NUL-terminated + ctx.off = pc - f.Entry() + ctx.res = 1 + return + } + } + ctx.fn = &qq[0] + ctx.file = &dash[0] + ctx.line = 0 + ctx.off = ctx.pc + ctx.res = 1 +} + +type symbolizeDataContext struct { + addr uintptr + heap uintptr + start uintptr + size uintptr + name *byte + file *byte + line uintptr + res uintptr +} + +func raceSymbolizeData(ctx *symbolizeDataContext) { + if base, span, _ := findObject(ctx.addr, 0, 0); base != 0 { + ctx.heap = 1 + ctx.start = base + ctx.size = span.elemsize + ctx.res = 1 + } +} + +// Race runtime functions called via runtime·racecall. 
+// +//go:linkname __tsan_init __tsan_init +var __tsan_init byte + +//go:linkname __tsan_fini __tsan_fini +var __tsan_fini byte + +//go:linkname __tsan_proc_create __tsan_proc_create +var __tsan_proc_create byte + +//go:linkname __tsan_proc_destroy __tsan_proc_destroy +var __tsan_proc_destroy byte + +//go:linkname __tsan_map_shadow __tsan_map_shadow +var __tsan_map_shadow byte + +//go:linkname __tsan_finalizer_goroutine __tsan_finalizer_goroutine +var __tsan_finalizer_goroutine byte + +//go:linkname __tsan_go_start __tsan_go_start +var __tsan_go_start byte + +//go:linkname __tsan_go_end __tsan_go_end +var __tsan_go_end byte + +//go:linkname __tsan_malloc __tsan_malloc +var __tsan_malloc byte + +//go:linkname __tsan_free __tsan_free +var __tsan_free byte + +//go:linkname __tsan_acquire __tsan_acquire +var __tsan_acquire byte + +//go:linkname __tsan_release __tsan_release +var __tsan_release byte + +//go:linkname __tsan_release_acquire __tsan_release_acquire +var __tsan_release_acquire byte + +//go:linkname __tsan_release_merge __tsan_release_merge +var __tsan_release_merge byte + +//go:linkname __tsan_go_ignore_sync_begin __tsan_go_ignore_sync_begin +var __tsan_go_ignore_sync_begin byte + +//go:linkname __tsan_go_ignore_sync_end __tsan_go_ignore_sync_end +var __tsan_go_ignore_sync_end byte + +//go:linkname __tsan_report_count __tsan_report_count +var __tsan_report_count byte + +// Mimic what cmd/cgo would do. 
+// +//go:cgo_import_static __tsan_init +//go:cgo_import_static __tsan_fini +//go:cgo_import_static __tsan_proc_create +//go:cgo_import_static __tsan_proc_destroy +//go:cgo_import_static __tsan_map_shadow +//go:cgo_import_static __tsan_finalizer_goroutine +//go:cgo_import_static __tsan_go_start +//go:cgo_import_static __tsan_go_end +//go:cgo_import_static __tsan_malloc +//go:cgo_import_static __tsan_free +//go:cgo_import_static __tsan_acquire +//go:cgo_import_static __tsan_release +//go:cgo_import_static __tsan_release_acquire +//go:cgo_import_static __tsan_release_merge +//go:cgo_import_static __tsan_go_ignore_sync_begin +//go:cgo_import_static __tsan_go_ignore_sync_end +//go:cgo_import_static __tsan_report_count + +// These are called from race_amd64.s. +// +//go:cgo_import_static __tsan_read +//go:cgo_import_static __tsan_read_pc +//go:cgo_import_static __tsan_read_range +//go:cgo_import_static __tsan_write +//go:cgo_import_static __tsan_write_pc +//go:cgo_import_static __tsan_write_range +//go:cgo_import_static __tsan_func_enter +//go:cgo_import_static __tsan_func_exit + +//go:cgo_import_static __tsan_go_atomic32_load +//go:cgo_import_static __tsan_go_atomic64_load +//go:cgo_import_static __tsan_go_atomic32_store +//go:cgo_import_static __tsan_go_atomic64_store +//go:cgo_import_static __tsan_go_atomic32_exchange +//go:cgo_import_static __tsan_go_atomic64_exchange +//go:cgo_import_static __tsan_go_atomic32_fetch_add +//go:cgo_import_static __tsan_go_atomic64_fetch_add +//go:cgo_import_static __tsan_go_atomic32_compare_exchange +//go:cgo_import_static __tsan_go_atomic64_compare_exchange + +// start/end of global data (data+bss). 
+var racedatastart uintptr +var racedataend uintptr + +// start/end of heap for race_amd64.s +var racearenastart uintptr +var racearenaend uintptr + +func racefuncenter(callpc uintptr) +func racefuncenterfp(fp uintptr) +func racefuncexit() +func raceread(addr uintptr) +func racewrite(addr uintptr) +func racereadrange(addr, size uintptr) +func racewriterange(addr, size uintptr) +func racereadrangepc1(addr, size, pc uintptr) +func racewriterangepc1(addr, size, pc uintptr) +func racecallbackthunk(uintptr) + +// racecall allows calling an arbitrary function fn from C race runtime +// with up to 4 uintptr arguments. +func racecall(fn *byte, arg0, arg1, arg2, arg3 uintptr) + +// checks if the address has shadow (i.e. heap or data/bss). +// +//go:nosplit +func isvalidaddr(addr unsafe.Pointer) bool { + return racearenastart <= uintptr(addr) && uintptr(addr) < racearenaend || + racedatastart <= uintptr(addr) && uintptr(addr) < racedataend +} + +//go:nosplit +func raceinit() (gctx, pctx uintptr) { + // On most machines, cgo is required to initialize libc, which is used by race runtime. + if !iscgo && GOOS != "darwin" { + throw("raceinit: race build must use cgo") + } + + racecall(&__tsan_init, uintptr(unsafe.Pointer(&gctx)), uintptr(unsafe.Pointer(&pctx)), abi.FuncPCABI0(racecallbackthunk), 0) + + // Round data segment to page boundaries, because it's used in mmap(). 
+ start := ^uintptr(0) + end := uintptr(0) + if start > firstmoduledata.noptrdata { + start = firstmoduledata.noptrdata + } + if start > firstmoduledata.data { + start = firstmoduledata.data + } + if start > firstmoduledata.noptrbss { + start = firstmoduledata.noptrbss + } + if start > firstmoduledata.bss { + start = firstmoduledata.bss + } + if end < firstmoduledata.enoptrdata { + end = firstmoduledata.enoptrdata + } + if end < firstmoduledata.edata { + end = firstmoduledata.edata + } + if end < firstmoduledata.enoptrbss { + end = firstmoduledata.enoptrbss + } + if end < firstmoduledata.ebss { + end = firstmoduledata.ebss + } + size := alignUp(end-start, _PageSize) + racecall(&__tsan_map_shadow, start, size, 0, 0) + racedatastart = start + racedataend = start + size + + return +} + +var raceFiniLock mutex + +//go:nosplit +func racefini() { + // racefini() can only be called once to avoid races. + // This eventually (via __tsan_fini) calls C.exit which has + // undefined behavior if called more than once. If the lock is + // already held it's assumed that the first caller exits the program + // so other calls can hang forever without an issue. + lock(&raceFiniLock) + // We're entering external code that may call ExitProcess on + // Windows. 
+ osPreemptExtEnter(getg().m) + racecall(&__tsan_fini, 0, 0, 0, 0) +} + +//go:nosplit +func raceproccreate() uintptr { + var ctx uintptr + racecall(&__tsan_proc_create, uintptr(unsafe.Pointer(&ctx)), 0, 0, 0) + return ctx +} + +//go:nosplit +func raceprocdestroy(ctx uintptr) { + racecall(&__tsan_proc_destroy, ctx, 0, 0, 0) +} + +//go:nosplit +func racemapshadow(addr unsafe.Pointer, size uintptr) { + if racearenastart == 0 { + racearenastart = uintptr(addr) + } + if racearenaend < uintptr(addr)+size { + racearenaend = uintptr(addr) + size + } + racecall(&__tsan_map_shadow, uintptr(addr), size, 0, 0) +} + +//go:nosplit +func racemalloc(p unsafe.Pointer, sz uintptr) { + racecall(&__tsan_malloc, 0, 0, uintptr(p), sz) +} + +//go:nosplit +func racefree(p unsafe.Pointer, sz uintptr) { + racecall(&__tsan_free, uintptr(p), sz, 0, 0) +} + +//go:nosplit +func racegostart(pc uintptr) uintptr { + gp := getg() + var spawng *g + if gp.m.curg != nil { + spawng = gp.m.curg + } else { + spawng = gp + } + + var racectx uintptr + racecall(&__tsan_go_start, spawng.racectx, uintptr(unsafe.Pointer(&racectx)), pc, 0) + return racectx +} + +//go:nosplit +func racegoend() { + racecall(&__tsan_go_end, getg().racectx, 0, 0, 0) +} + +//go:nosplit +func racectxend(racectx uintptr) { + racecall(&__tsan_go_end, racectx, 0, 0, 0) +} + +//go:nosplit +func racewriterangepc(addr unsafe.Pointer, sz, callpc, pc uintptr) { + gp := getg() + if gp != gp.m.curg { + // The call is coming from manual instrumentation of Go code running on g0/gsignal. + // Not interesting. + return + } + if callpc != 0 { + racefuncenter(callpc) + } + racewriterangepc1(uintptr(addr), sz, pc) + if callpc != 0 { + racefuncexit() + } +} + +//go:nosplit +func racereadrangepc(addr unsafe.Pointer, sz, callpc, pc uintptr) { + gp := getg() + if gp != gp.m.curg { + // The call is coming from manual instrumentation of Go code running on g0/gsignal. + // Not interesting. 
+ return + } + if callpc != 0 { + racefuncenter(callpc) + } + racereadrangepc1(uintptr(addr), sz, pc) + if callpc != 0 { + racefuncexit() + } +} + +//go:nosplit +func raceacquire(addr unsafe.Pointer) { + raceacquireg(getg(), addr) +} + +//go:nosplit +func raceacquireg(gp *g, addr unsafe.Pointer) { + if getg().raceignore != 0 || !isvalidaddr(addr) { + return + } + racecall(&__tsan_acquire, gp.racectx, uintptr(addr), 0, 0) +} + +//go:nosplit +func raceacquirectx(racectx uintptr, addr unsafe.Pointer) { + if !isvalidaddr(addr) { + return + } + racecall(&__tsan_acquire, racectx, uintptr(addr), 0, 0) +} + +//go:nosplit +func racerelease(addr unsafe.Pointer) { + racereleaseg(getg(), addr) +} + +//go:nosplit +func racereleaseg(gp *g, addr unsafe.Pointer) { + if getg().raceignore != 0 || !isvalidaddr(addr) { + return + } + racecall(&__tsan_release, gp.racectx, uintptr(addr), 0, 0) +} + +//go:nosplit +func racereleaseacquire(addr unsafe.Pointer) { + racereleaseacquireg(getg(), addr) +} + +//go:nosplit +func racereleaseacquireg(gp *g, addr unsafe.Pointer) { + if getg().raceignore != 0 || !isvalidaddr(addr) { + return + } + racecall(&__tsan_release_acquire, gp.racectx, uintptr(addr), 0, 0) +} + +//go:nosplit +func racereleasemerge(addr unsafe.Pointer) { + racereleasemergeg(getg(), addr) +} + +//go:nosplit +func racereleasemergeg(gp *g, addr unsafe.Pointer) { + if getg().raceignore != 0 || !isvalidaddr(addr) { + return + } + racecall(&__tsan_release_merge, gp.racectx, uintptr(addr), 0, 0) +} + +//go:nosplit +func racefingo() { + racecall(&__tsan_finalizer_goroutine, getg().racectx, 0, 0, 0) +} + +// The declarations below generate ABI wrappers for functions +// implemented in assembly in this package but declared in another +// package. 
+ +//go:linkname abigen_sync_atomic_LoadInt32 sync/atomic.LoadInt32 +func abigen_sync_atomic_LoadInt32(addr *int32) (val int32) + +//go:linkname abigen_sync_atomic_LoadInt64 sync/atomic.LoadInt64 +func abigen_sync_atomic_LoadInt64(addr *int64) (val int64) + +//go:linkname abigen_sync_atomic_LoadUint32 sync/atomic.LoadUint32 +func abigen_sync_atomic_LoadUint32(addr *uint32) (val uint32) + +//go:linkname abigen_sync_atomic_LoadUint64 sync/atomic.LoadUint64 +func abigen_sync_atomic_LoadUint64(addr *uint64) (val uint64) + +//go:linkname abigen_sync_atomic_LoadUintptr sync/atomic.LoadUintptr +func abigen_sync_atomic_LoadUintptr(addr *uintptr) (val uintptr) + +//go:linkname abigen_sync_atomic_LoadPointer sync/atomic.LoadPointer +func abigen_sync_atomic_LoadPointer(addr *unsafe.Pointer) (val unsafe.Pointer) + +//go:linkname abigen_sync_atomic_StoreInt32 sync/atomic.StoreInt32 +func abigen_sync_atomic_StoreInt32(addr *int32, val int32) + +//go:linkname abigen_sync_atomic_StoreInt64 sync/atomic.StoreInt64 +func abigen_sync_atomic_StoreInt64(addr *int64, val int64) + +//go:linkname abigen_sync_atomic_StoreUint32 sync/atomic.StoreUint32 +func abigen_sync_atomic_StoreUint32(addr *uint32, val uint32) + +//go:linkname abigen_sync_atomic_StoreUint64 sync/atomic.StoreUint64 +func abigen_sync_atomic_StoreUint64(addr *uint64, val uint64) + +//go:linkname abigen_sync_atomic_SwapInt32 sync/atomic.SwapInt32 +func abigen_sync_atomic_SwapInt32(addr *int32, new int32) (old int32) + +//go:linkname abigen_sync_atomic_SwapInt64 sync/atomic.SwapInt64 +func abigen_sync_atomic_SwapInt64(addr *int64, new int64) (old int64) + +//go:linkname abigen_sync_atomic_SwapUint32 sync/atomic.SwapUint32 +func abigen_sync_atomic_SwapUint32(addr *uint32, new uint32) (old uint32) + +//go:linkname abigen_sync_atomic_SwapUint64 sync/atomic.SwapUint64 +func abigen_sync_atomic_SwapUint64(addr *uint64, new uint64) (old uint64) + +//go:linkname abigen_sync_atomic_AddInt32 sync/atomic.AddInt32 +func 
abigen_sync_atomic_AddInt32(addr *int32, delta int32) (new int32) + +//go:linkname abigen_sync_atomic_AddUint32 sync/atomic.AddUint32 +func abigen_sync_atomic_AddUint32(addr *uint32, delta uint32) (new uint32) + +//go:linkname abigen_sync_atomic_AddInt64 sync/atomic.AddInt64 +func abigen_sync_atomic_AddInt64(addr *int64, delta int64) (new int64) + +//go:linkname abigen_sync_atomic_AddUint64 sync/atomic.AddUint64 +func abigen_sync_atomic_AddUint64(addr *uint64, delta uint64) (new uint64) + +//go:linkname abigen_sync_atomic_AddUintptr sync/atomic.AddUintptr +func abigen_sync_atomic_AddUintptr(addr *uintptr, delta uintptr) (new uintptr) + +//go:linkname abigen_sync_atomic_CompareAndSwapInt32 sync/atomic.CompareAndSwapInt32 +func abigen_sync_atomic_CompareAndSwapInt32(addr *int32, old, new int32) (swapped bool) + +//go:linkname abigen_sync_atomic_CompareAndSwapInt64 sync/atomic.CompareAndSwapInt64 +func abigen_sync_atomic_CompareAndSwapInt64(addr *int64, old, new int64) (swapped bool) + +//go:linkname abigen_sync_atomic_CompareAndSwapUint32 sync/atomic.CompareAndSwapUint32 +func abigen_sync_atomic_CompareAndSwapUint32(addr *uint32, old, new uint32) (swapped bool) + +//go:linkname abigen_sync_atomic_CompareAndSwapUint64 sync/atomic.CompareAndSwapUint64 +func abigen_sync_atomic_CompareAndSwapUint64(addr *uint64, old, new uint64) (swapped bool) diff --git a/src/runtime/race/README b/src/runtime/race/README new file mode 100644 index 0000000..596700a --- /dev/null +++ b/src/runtime/race/README @@ -0,0 +1,17 @@ +runtime/race package contains the data race detector runtime library. +It is based on ThreadSanitizer race detector, that is currently a part of +the LLVM project (https://github.com/llvm/llvm-project/tree/main/compiler-rt). + +To update the .syso files use golang.org/x/build/cmd/racebuild. + +race_darwin_amd64.syso built with LLVM 127e59048cd3d8dbb80c14b3036918c114089529 and Go 59ab6f351a370a27458755dc69f4a837e55a05a6. 
+race_freebsd_amd64.syso built with LLVM 127e59048cd3d8dbb80c14b3036918c114089529 and Go 59ab6f351a370a27458755dc69f4a837e55a05a6. +race_linux_ppc64le.syso built with LLVM 41cb504b7c4b18ac15830107431a0c1eec73a6b2 and Go 851ecea4cc99ab276109493477b2c7e30c253ea8. +race_netbsd_amd64.syso built with LLVM 41cb504b7c4b18ac15830107431a0c1eec73a6b2 and Go 851ecea4cc99ab276109493477b2c7e30c253ea8. +race_windows_amd64.syso built with LLVM 89f7ccea6f6488c443655880229c54db1f180153 and Go f62d3202bf9dbb3a00ad2a2c63ff4fa4188c5d3b. +race_linux_arm64.syso built with LLVM 41cb504b7c4b18ac15830107431a0c1eec73a6b2 and Go 851ecea4cc99ab276109493477b2c7e30c253ea8. +race_darwin_arm64.syso built with LLVM 41cb504b7c4b18ac15830107431a0c1eec73a6b2 and Go 851ecea4cc99ab276109493477b2c7e30c253ea8. +race_openbsd_amd64.syso built with LLVM fcf6ae2f070eba73074b6ec8d8281e54d29dbeeb and Go 8f2db14cd35bbd674cb2988a508306de6655e425. +race_linux_s390x.syso built with LLVM 41cb504b7c4b18ac15830107431a0c1eec73a6b2 and Go 851ecea4cc99ab276109493477b2c7e30c253ea8. +internal/amd64v3/race_linux.syso built with LLVM 74c2d4f6024c8f160871a2baa928d0b42415f183 and Go c0f27eb3d580c8b9efd73802678eba4c6c9461be. +internal/amd64v1/race_linux.syso built with LLVM 74c2d4f6024c8f160871a2baa928d0b42415f183 and Go c0f27eb3d580c8b9efd73802678eba4c6c9461be. diff --git a/src/runtime/race/doc.go b/src/runtime/race/doc.go new file mode 100644 index 0000000..60a20df --- /dev/null +++ b/src/runtime/race/doc.go @@ -0,0 +1,11 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package race implements data race detection logic. +// No public interface is provided. 
+// For details about the race detector see +// https://golang.org/doc/articles/race_detector.html +package race + +//go:generate ./mkcgo.sh diff --git a/src/runtime/race/internal/amd64v1/doc.go b/src/runtime/race/internal/amd64v1/doc.go new file mode 100644 index 0000000..ccb088c --- /dev/null +++ b/src/runtime/race/internal/amd64v1/doc.go @@ -0,0 +1,10 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This package holds the race detector .syso for +// amd64 architectures with GOAMD64<v3. + +//go:build amd64 && ((linux && !amd64.v3) || darwin || freebsd || netbsd || openbsd || windows) + +package amd64v1 diff --git a/src/runtime/race/internal/amd64v1/race_darwin.syso b/src/runtime/race/internal/amd64v1/race_darwin.syso Binary files differ new file mode 100644 index 0000000..e5d848c --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_darwin.syso diff --git a/src/runtime/race/internal/amd64v1/race_freebsd.syso b/src/runtime/race/internal/amd64v1/race_freebsd.syso Binary files differ new file mode 100644 index 0000000..b3a4383 --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_freebsd.syso diff --git a/src/runtime/race/internal/amd64v1/race_linux.syso b/src/runtime/race/internal/amd64v1/race_linux.syso Binary files differ new file mode 100644 index 0000000..68f1508 --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_linux.syso diff --git a/src/runtime/race/internal/amd64v1/race_netbsd.syso b/src/runtime/race/internal/amd64v1/race_netbsd.syso Binary files differ new file mode 100644 index 0000000..e6cc4bf --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_netbsd.syso diff --git a/src/runtime/race/internal/amd64v1/race_openbsd.syso b/src/runtime/race/internal/amd64v1/race_openbsd.syso Binary files differ new file mode 100644 index 0000000..9fefd87 --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_openbsd.syso diff
--git a/src/runtime/race/internal/amd64v1/race_windows.syso b/src/runtime/race/internal/amd64v1/race_windows.syso Binary files differ new file mode 100644 index 0000000..9fbf9b4 --- /dev/null +++ b/src/runtime/race/internal/amd64v1/race_windows.syso diff --git a/src/runtime/race/internal/amd64v3/doc.go b/src/runtime/race/internal/amd64v3/doc.go new file mode 100644 index 0000000..215998a --- /dev/null +++ b/src/runtime/race/internal/amd64v3/doc.go @@ -0,0 +1,10 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This package holds the race detector .syso for +// amd64 architectures with GOAMD64>=v3. + +//go:build amd64 && linux && amd64.v3 + +package amd64v3 diff --git a/src/runtime/race/internal/amd64v3/race_linux.syso b/src/runtime/race/internal/amd64v3/race_linux.syso Binary files differ new file mode 100644 index 0000000..33c3e76 --- /dev/null +++ b/src/runtime/race/internal/amd64v3/race_linux.syso diff --git a/src/runtime/race/mkcgo.sh b/src/runtime/race/mkcgo.sh new file mode 100755 index 0000000..6ebe5a4 --- /dev/null +++ b/src/runtime/race/mkcgo.sh @@ -0,0 +1,20 @@ +#!/bin/bash + +hdr=' +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Code generated by mkcgo.sh. DO NOT EDIT. + +//go:build race + +' + +convert() { + (echo "$hdr"; go tool cgo -dynpackage race -dynimport $1) | gofmt +} + +convert race_darwin_arm64.syso >race_darwin_arm64.go +convert internal/amd64v1/race_darwin.syso >race_darwin_amd64.go + diff --git a/src/runtime/race/output_test.go b/src/runtime/race/output_test.go new file mode 100644 index 0000000..0dcdabe --- /dev/null +++ b/src/runtime/race/output_test.go @@ -0,0 +1,442 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package race_test + +import ( + "fmt" + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "regexp" + "runtime" + "strings" + "testing" +) + +func TestOutput(t *testing.T) { + pkgdir := t.TempDir() + out, err := exec.Command(testenv.GoToolPath(t), "install", "-race", "-pkgdir="+pkgdir, "testing").CombinedOutput() + if err != nil { + t.Fatalf("go install -race: %v\n%s", err, out) + } + + for _, test := range tests { + if test.goos != "" && test.goos != runtime.GOOS { + t.Logf("test %v runs only on %v, skipping: ", test.name, test.goos) + continue + } + dir := t.TempDir() + source := "main.go" + if test.run == "test" { + source = "main_test.go" + } + src := filepath.Join(dir, source) + f, err := os.Create(src) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + _, err = f.WriteString(test.source) + if err != nil { + f.Close() + t.Fatalf("failed to write: %v", err) + } + if err := f.Close(); err != nil { + t.Fatalf("failed to close file: %v", err) + } + + cmd := exec.Command(testenv.GoToolPath(t), test.run, "-race", "-pkgdir="+pkgdir, src) + // GODEBUG spoils program output, GOMAXPROCS makes it flaky. 
+ for _, env := range os.Environ() { + if strings.HasPrefix(env, "GODEBUG=") || + strings.HasPrefix(env, "GOMAXPROCS=") || + strings.HasPrefix(env, "GORACE=") { + continue + } + cmd.Env = append(cmd.Env, env) + } + cmd.Env = append(cmd.Env, + "GOMAXPROCS=1", // see comment in race_test.go + "GORACE="+test.gorace, + ) + got, _ := cmd.CombinedOutput() + matched := false + for _, re := range test.re { + if regexp.MustCompile(re).MatchString(string(got)) { + matched = true + break + } + } + if !matched { + exp := fmt.Sprintf("expect:\n%v\n", test.re[0]) + if len(test.re) > 1 { + exp = fmt.Sprintf("expected one of %d patterns:\n", + len(test.re)) + for k, re := range test.re { + exp += fmt.Sprintf("pattern %d:\n%v\n", k, re) + } + } + t.Fatalf("failed test case %v, %sgot:\n%s", + test.name, exp, got) + } + } +} + +var tests = []struct { + name string + run string + goos string + gorace string + source string + re []string +}{ + {"simple", "run", "", "atexit_sleep_ms=0", ` +package main +import "time" +var xptr *int +var donechan chan bool +func main() { + done := make(chan bool) + x := 0 + startRacer(&x, done) + store(&x, 43) + <-done +} +func store(x *int, v int) { + *x = v +} +func startRacer(x *int, done chan bool) { + xptr = x + donechan = done + go racer() +} +func racer() { + time.Sleep(10*time.Millisecond) + store(xptr, 42) + donechan <- true +} +`, []string{`================== +WARNING: DATA RACE +Write at 0x[0-9,a-f]+ by goroutine [0-9]: + main\.store\(\) + .+/main\.go:14 \+0x[0-9,a-f]+ + main\.racer\(\) + .+/main\.go:23 \+0x[0-9,a-f]+ + +Previous write at 0x[0-9,a-f]+ by main goroutine: + main\.store\(\) + .+/main\.go:14 \+0x[0-9,a-f]+ + main\.main\(\) + .+/main\.go:10 \+0x[0-9,a-f]+ + +Goroutine [0-9] \(running\) created at: + main\.startRacer\(\) + .+/main\.go:19 \+0x[0-9,a-f]+ + main\.main\(\) + .+/main\.go:9 \+0x[0-9,a-f]+ +================== +Found 1 data race\(s\) +exit status 66 +`}}, + + {"exitcode", "run", "", "atexit_sleep_ms=0 exitcode=13", ` 
+package main +func main() { + done := make(chan bool) + x := 0; _ = x + go func() { + x = 42 + done <- true + }() + x = 43 + <-done +} +`, []string{`exit status 13`}}, + + {"strip_path_prefix", "run", "", "atexit_sleep_ms=0 strip_path_prefix=/main.", ` +package main +func main() { + done := make(chan bool) + x := 0; _ = x + go func() { + x = 42 + done <- true + }() + x = 43 + <-done +} +`, []string{` + go:7 \+0x[0-9,a-f]+ +`}}, + + {"halt_on_error", "run", "", "atexit_sleep_ms=0 halt_on_error=1", ` +package main +func main() { + done := make(chan bool) + x := 0; _ = x + go func() { + x = 42 + done <- true + }() + x = 43 + <-done +} +`, []string{` +================== +exit status 66 +`}}, + + {"test_fails_on_race", "test", "", "atexit_sleep_ms=0", ` +package main_test +import "testing" +func TestFail(t *testing.T) { + done := make(chan bool) + x := 0 + _ = x + go func() { + x = 42 + done <- true + }() + x = 43 + <-done + t.Log(t.Failed()) +} +`, []string{` +================== +--- FAIL: TestFail \([0-9.]+s\) +.*main_test.go:14: true +.*testing.go:.*: race detected during execution of test +FAIL`}}, + + {"slicebytetostring_pc", "run", "", "atexit_sleep_ms=0", ` +package main +func main() { + done := make(chan string) + data := make([]byte, 10) + go func() { + done <- string(data) + }() + data[0] = 1 + <-done +} +`, []string{` + runtime\.slicebytetostring\(\) + .*/runtime/string\.go:.* + main\.main\.func1\(\) + .*/main.go:7`}}, + + // Test for https://golang.org/issue/33309 + {"midstack_inlining_traceback", "run", "linux", "atexit_sleep_ms=0", ` +package main + +var x int +var c chan int +func main() { + c = make(chan int) + go f() + x = 1 + <-c +} + +func f() { + g(c) +} + +func g(c chan int) { + h(c) +} + +func h(c chan int) { + c <- x +} +`, []string{`================== +WARNING: DATA RACE +Read at 0x[0-9,a-f]+ by goroutine [0-9]: + main\.h\(\) + .+/main\.go:22 \+0x[0-9,a-f]+ + main\.g\(\) + .+/main\.go:18 \+0x[0-9,a-f]+ + main\.f\(\) + .+/main\.go:14 
\+0x[0-9,a-f]+ + +Previous write at 0x[0-9,a-f]+ by main goroutine: + main\.main\(\) + .+/main\.go:9 \+0x[0-9,a-f]+ + +Goroutine [0-9] \(running\) created at: + main\.main\(\) + .+/main\.go:8 \+0x[0-9,a-f]+ +================== +Found 1 data race\(s\) +exit status 66 +`}}, + + // Test for https://golang.org/issue/17190 + {"external_cgo_thread", "run", "linux", "atexit_sleep_ms=0", ` +package main + +/* +#include <pthread.h> +typedef struct cb { + int foo; +} cb; +extern void goCallback(); +static inline void *threadFunc(void *p) { + goCallback(); + return 0; +} +static inline void startThread(cb* c) { + pthread_t th; + pthread_create(&th, 0, threadFunc, 0); +} +*/ +import "C" + +var done chan bool +var racy int + +//export goCallback +func goCallback() { + racy++ + done <- true +} + +func main() { + done = make(chan bool) + var c C.cb + C.startThread(&c) + racy++ + <- done +} +`, []string{`================== +WARNING: DATA RACE +Read at 0x[0-9,a-f]+ by main goroutine: + main\.main\(\) + .*/main\.go:34 \+0x[0-9,a-f]+ + +Previous write at 0x[0-9,a-f]+ by goroutine [0-9]: + main\.goCallback\(\) + .*/main\.go:27 \+0x[0-9,a-f]+ + _cgoexp_[0-9a-z]+_goCallback\(\) + .*_cgo_gotypes\.go:[0-9]+ \+0x[0-9,a-f]+ + _cgoexp_[0-9a-z]+_goCallback\(\) + <autogenerated>:1 \+0x[0-9,a-f]+ + +Goroutine [0-9] \(running\) created at: + runtime\.newextram\(\) + .*/runtime/proc.go:[0-9]+ \+0x[0-9,a-f]+ +==================`, + `================== +WARNING: DATA RACE +Read at 0x[0-9,a-f]+ by .*: + main\..* + .*/main\.go:[0-9]+ \+0x[0-9,a-f]+(?s).* + +Previous write at 0x[0-9,a-f]+ by .*: + main\..* + .*/main\.go:[0-9]+ \+0x[0-9,a-f]+(?s).* + +Goroutine [0-9] \(running\) created at: + runtime\.newextram\(\) + .*/runtime/proc.go:[0-9]+ \+0x[0-9,a-f]+ +==================`}}, + {"second_test_passes", "test", "", "atexit_sleep_ms=0", ` +package main_test +import "testing" +func TestFail(t *testing.T) { + done := make(chan bool) + x := 0 + _ = x + go func() { + x = 42 + done <- true + }() + x = 43 + 
<-done +} + +func TestPass(t *testing.T) { +} +`, []string{` +================== +--- FAIL: TestFail \([0-9.]+s\) +.*testing.go:.*: race detected during execution of test +FAIL`}}, + {"mutex", "run", "", "atexit_sleep_ms=0", ` +package main +import ( + "sync" + "fmt" +) +func main() { + c := make(chan bool, 1) + threads := 1 + iterations := 20000 + data := 0 + var wg sync.WaitGroup + for i := 0; i < threads; i++ { + wg.Add(1) + go func() { + defer wg.Done() + for i := 0; i < iterations; i++ { + c <- true + data += 1 + <- c + } + }() + } + for i := 0; i < iterations; i++ { + c <- true + data += 1 + <- c + } + wg.Wait() + if (data == iterations*(threads+1)) { fmt.Println("pass") } +}`, []string{`pass`}}, + // Test for https://github.com/golang/go/issues/37355 + {"chanmm", "run", "", "atexit_sleep_ms=0", ` +package main +import ( + "sync" + "time" +) +func main() { + c := make(chan bool, 1) + var data uint64 + var wg sync.WaitGroup + wg.Add(2) + c <- true + go func() { + defer wg.Done() + c <- true + }() + go func() { + defer wg.Done() + time.Sleep(time.Second) + <-c + data = 2 + }() + data = 1 + <-c + wg.Wait() + _ = data +} +`, []string{`================== +WARNING: DATA RACE +Write at 0x[0-9,a-f]+ by goroutine [0-9]: + main\.main\.func2\(\) + .*/main\.go:21 \+0x[0-9,a-f]+ + +Previous write at 0x[0-9,a-f]+ by main goroutine: + main\.main\(\) + .*/main\.go:23 \+0x[0-9,a-f]+ + +Goroutine [0-9] \(running\) created at: + main\.main\(\) + .*/main.go:[0-9]+ \+0x[0-9,a-f]+ +==================`}}, +} diff --git a/src/runtime/race/race.go b/src/runtime/race/race.go new file mode 100644 index 0000000..9c508eb --- /dev/null +++ b/src/runtime/race/race.go @@ -0,0 +1,20 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build race && ((linux && (amd64 || arm64 || ppc64le || s390x)) || ((freebsd || netbsd || openbsd || windows) && amd64)) + +package race + +// This file merely ensures that we link in runtime/cgo in race build, +// this in turn ensures that runtime uses pthread_create to create threads. +// The prebuilt race runtime lives in race_GOOS_GOARCH.syso. +// Calls to the runtime are done directly from src/runtime/race.go. + +// On darwin we always use system DLLs to create threads, +// so we use race_darwin_$GOARCH.go to provide the syso-derived +// symbol information without needing to invoke cgo. +// This allows -race to be used on Mac systems without a C toolchain. + +// void __race_unused_func(void); +import "C" diff --git a/src/runtime/race/race_darwin_amd64.go b/src/runtime/race/race_darwin_amd64.go new file mode 100644 index 0000000..fbb838a --- /dev/null +++ b/src/runtime/race/race_darwin_amd64.go @@ -0,0 +1,101 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Code generated by mkcgo.sh. DO NOT EDIT. 
+ +//go:build race + +package race + +//go:cgo_import_dynamic _Block_object_assign _Block_object_assign "" +//go:cgo_import_dynamic _Block_object_dispose _Block_object_dispose "" +//go:cgo_import_dynamic _NSConcreteStackBlock _NSConcreteStackBlock "" +//go:cgo_import_dynamic _NSGetArgv _NSGetArgv "" +//go:cgo_import_dynamic _NSGetEnviron _NSGetEnviron "" +//go:cgo_import_dynamic _NSGetExecutablePath _NSGetExecutablePath "" +//go:cgo_import_dynamic __bzero __bzero "" +//go:cgo_import_dynamic __error __error "" +//go:cgo_import_dynamic __fork __fork "" +//go:cgo_import_dynamic __mmap __mmap "" +//go:cgo_import_dynamic __munmap __munmap "" +//go:cgo_import_dynamic __stack_chk_fail __stack_chk_fail "" +//go:cgo_import_dynamic __stack_chk_guard __stack_chk_guard "" +//go:cgo_import_dynamic _dyld_get_image_header _dyld_get_image_header "" +//go:cgo_import_dynamic _dyld_get_image_name _dyld_get_image_name "" +//go:cgo_import_dynamic _dyld_get_image_vmaddr_slide _dyld_get_image_vmaddr_slide "" +//go:cgo_import_dynamic _dyld_get_shared_cache_range _dyld_get_shared_cache_range "" +//go:cgo_import_dynamic _dyld_get_shared_cache_uuid _dyld_get_shared_cache_uuid "" +//go:cgo_import_dynamic _dyld_image_count _dyld_image_count "" +//go:cgo_import_dynamic _exit _exit "" +//go:cgo_import_dynamic abort abort "" +//go:cgo_import_dynamic arc4random_buf arc4random_buf "" +//go:cgo_import_dynamic close close "" +//go:cgo_import_dynamic dlsym dlsym "" +//go:cgo_import_dynamic dup dup "" +//go:cgo_import_dynamic dup2 dup2 "" +//go:cgo_import_dynamic dyld_shared_cache_iterate_text dyld_shared_cache_iterate_text "" +//go:cgo_import_dynamic execve execve "" +//go:cgo_import_dynamic exit exit "" +//go:cgo_import_dynamic fstat$INODE64 fstat$INODE64 "" +//go:cgo_import_dynamic ftruncate ftruncate "" +//go:cgo_import_dynamic getpid getpid "" +//go:cgo_import_dynamic getrlimit getrlimit "" +//go:cgo_import_dynamic gettimeofday gettimeofday "" +//go:cgo_import_dynamic getuid getuid "" 
+//go:cgo_import_dynamic grantpt grantpt "" +//go:cgo_import_dynamic ioctl ioctl "" +//go:cgo_import_dynamic isatty isatty "" +//go:cgo_import_dynamic lstat$INODE64 lstat$INODE64 "" +//go:cgo_import_dynamic mach_absolute_time mach_absolute_time "" +//go:cgo_import_dynamic mach_task_self_ mach_task_self_ "" +//go:cgo_import_dynamic mach_timebase_info mach_timebase_info "" +//go:cgo_import_dynamic mach_vm_region_recurse mach_vm_region_recurse "" +//go:cgo_import_dynamic madvise madvise "" +//go:cgo_import_dynamic malloc_num_zones malloc_num_zones "" +//go:cgo_import_dynamic malloc_zones malloc_zones "" +//go:cgo_import_dynamic memcpy memcpy "" +//go:cgo_import_dynamic memset_pattern16 memset_pattern16 "" +//go:cgo_import_dynamic mkdir mkdir "" +//go:cgo_import_dynamic mprotect mprotect "" +//go:cgo_import_dynamic open open "" +//go:cgo_import_dynamic pipe pipe "" +//go:cgo_import_dynamic posix_openpt posix_openpt "" +//go:cgo_import_dynamic posix_spawn posix_spawn "" +//go:cgo_import_dynamic posix_spawn_file_actions_addclose posix_spawn_file_actions_addclose "" +//go:cgo_import_dynamic posix_spawn_file_actions_adddup2 posix_spawn_file_actions_adddup2 "" +//go:cgo_import_dynamic posix_spawn_file_actions_destroy posix_spawn_file_actions_destroy "" +//go:cgo_import_dynamic posix_spawn_file_actions_init posix_spawn_file_actions_init "" +//go:cgo_import_dynamic posix_spawnattr_destroy posix_spawnattr_destroy "" +//go:cgo_import_dynamic posix_spawnattr_init posix_spawnattr_init "" +//go:cgo_import_dynamic posix_spawnattr_setflags posix_spawnattr_setflags "" +//go:cgo_import_dynamic pthread_attr_getstack pthread_attr_getstack "" +//go:cgo_import_dynamic pthread_create pthread_create "" +//go:cgo_import_dynamic pthread_get_stackaddr_np pthread_get_stackaddr_np "" +//go:cgo_import_dynamic pthread_get_stacksize_np pthread_get_stacksize_np "" +//go:cgo_import_dynamic pthread_getspecific pthread_getspecific "" +//go:cgo_import_dynamic pthread_join pthread_join "" 
+//go:cgo_import_dynamic pthread_self pthread_self "" +//go:cgo_import_dynamic pthread_sigmask pthread_sigmask "" +//go:cgo_import_dynamic pthread_threadid_np pthread_threadid_np "" +//go:cgo_import_dynamic read read "" +//go:cgo_import_dynamic readlink readlink "" +//go:cgo_import_dynamic realpath$DARWIN_EXTSN realpath$DARWIN_EXTSN "" +//go:cgo_import_dynamic rename rename "" +//go:cgo_import_dynamic sched_yield sched_yield "" +//go:cgo_import_dynamic setrlimit setrlimit "" +//go:cgo_import_dynamic sigaction sigaction "" +//go:cgo_import_dynamic stat$INODE64 stat$INODE64 "" +//go:cgo_import_dynamic sysconf sysconf "" +//go:cgo_import_dynamic sysctl sysctl "" +//go:cgo_import_dynamic sysctlbyname sysctlbyname "" +//go:cgo_import_dynamic task_info task_info "" +//go:cgo_import_dynamic tcgetattr tcgetattr "" +//go:cgo_import_dynamic tcsetattr tcsetattr "" +//go:cgo_import_dynamic unlink unlink "" +//go:cgo_import_dynamic unlockpt unlockpt "" +//go:cgo_import_dynamic usleep usleep "" +//go:cgo_import_dynamic vm_region_64 vm_region_64 "" +//go:cgo_import_dynamic vm_region_recurse_64 vm_region_recurse_64 "" +//go:cgo_import_dynamic waitpid waitpid "" +//go:cgo_import_dynamic write write "" diff --git a/src/runtime/race/race_darwin_arm64.go b/src/runtime/race/race_darwin_arm64.go new file mode 100644 index 0000000..fe8584c --- /dev/null +++ b/src/runtime/race/race_darwin_arm64.go @@ -0,0 +1,95 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Code generated by mkcgo.sh. DO NOT EDIT. 
+ +//go:build race + +package race + +//go:cgo_import_dynamic _NSGetArgv _NSGetArgv "" +//go:cgo_import_dynamic _NSGetEnviron _NSGetEnviron "" +//go:cgo_import_dynamic _NSGetExecutablePath _NSGetExecutablePath "" +//go:cgo_import_dynamic __error __error "" +//go:cgo_import_dynamic __fork __fork "" +//go:cgo_import_dynamic __mmap __mmap "" +//go:cgo_import_dynamic __munmap __munmap "" +//go:cgo_import_dynamic __stack_chk_fail __stack_chk_fail "" +//go:cgo_import_dynamic __stack_chk_guard __stack_chk_guard "" +//go:cgo_import_dynamic _dyld_get_image_header _dyld_get_image_header "" +//go:cgo_import_dynamic _dyld_get_image_name _dyld_get_image_name "" +//go:cgo_import_dynamic _dyld_get_image_vmaddr_slide _dyld_get_image_vmaddr_slide "" +//go:cgo_import_dynamic _dyld_image_count _dyld_image_count "" +//go:cgo_import_dynamic _exit _exit "" +//go:cgo_import_dynamic abort abort "" +//go:cgo_import_dynamic arc4random_buf arc4random_buf "" +//go:cgo_import_dynamic bzero bzero "" +//go:cgo_import_dynamic close close "" +//go:cgo_import_dynamic dlsym dlsym "" +//go:cgo_import_dynamic dup dup "" +//go:cgo_import_dynamic dup2 dup2 "" +//go:cgo_import_dynamic execve execve "" +//go:cgo_import_dynamic exit exit "" +//go:cgo_import_dynamic fstat fstat "" +//go:cgo_import_dynamic ftruncate ftruncate "" +//go:cgo_import_dynamic getpid getpid "" +//go:cgo_import_dynamic getrlimit getrlimit "" +//go:cgo_import_dynamic gettimeofday gettimeofday "" +//go:cgo_import_dynamic getuid getuid "" +//go:cgo_import_dynamic grantpt grantpt "" +//go:cgo_import_dynamic ioctl ioctl "" +//go:cgo_import_dynamic isatty isatty "" +//go:cgo_import_dynamic lstat lstat "" +//go:cgo_import_dynamic mach_absolute_time mach_absolute_time "" +//go:cgo_import_dynamic mach_task_self_ mach_task_self_ "" +//go:cgo_import_dynamic mach_timebase_info mach_timebase_info "" +//go:cgo_import_dynamic mach_vm_region_recurse mach_vm_region_recurse "" +//go:cgo_import_dynamic madvise madvise "" +//go:cgo_import_dynamic 
malloc_num_zones malloc_num_zones "" +//go:cgo_import_dynamic malloc_zones malloc_zones "" +//go:cgo_import_dynamic memcpy memcpy "" +//go:cgo_import_dynamic memset_pattern16 memset_pattern16 "" +//go:cgo_import_dynamic mkdir mkdir "" +//go:cgo_import_dynamic mprotect mprotect "" +//go:cgo_import_dynamic open open "" +//go:cgo_import_dynamic pipe pipe "" +//go:cgo_import_dynamic posix_openpt posix_openpt "" +//go:cgo_import_dynamic posix_spawn posix_spawn "" +//go:cgo_import_dynamic posix_spawn_file_actions_addclose posix_spawn_file_actions_addclose "" +//go:cgo_import_dynamic posix_spawn_file_actions_adddup2 posix_spawn_file_actions_adddup2 "" +//go:cgo_import_dynamic posix_spawn_file_actions_destroy posix_spawn_file_actions_destroy "" +//go:cgo_import_dynamic posix_spawn_file_actions_init posix_spawn_file_actions_init "" +//go:cgo_import_dynamic posix_spawnattr_destroy posix_spawnattr_destroy "" +//go:cgo_import_dynamic posix_spawnattr_init posix_spawnattr_init "" +//go:cgo_import_dynamic posix_spawnattr_setflags posix_spawnattr_setflags "" +//go:cgo_import_dynamic pthread_attr_getstack pthread_attr_getstack "" +//go:cgo_import_dynamic pthread_create pthread_create "" +//go:cgo_import_dynamic pthread_get_stackaddr_np pthread_get_stackaddr_np "" +//go:cgo_import_dynamic pthread_get_stacksize_np pthread_get_stacksize_np "" +//go:cgo_import_dynamic pthread_getspecific pthread_getspecific "" +//go:cgo_import_dynamic pthread_join pthread_join "" +//go:cgo_import_dynamic pthread_self pthread_self "" +//go:cgo_import_dynamic pthread_sigmask pthread_sigmask "" +//go:cgo_import_dynamic pthread_threadid_np pthread_threadid_np "" +//go:cgo_import_dynamic read read "" +//go:cgo_import_dynamic readlink readlink "" +//go:cgo_import_dynamic realpath$DARWIN_EXTSN realpath$DARWIN_EXTSN "" +//go:cgo_import_dynamic rename rename "" +//go:cgo_import_dynamic sched_yield sched_yield "" +//go:cgo_import_dynamic setrlimit setrlimit "" +//go:cgo_import_dynamic sigaction sigaction "" 
+//go:cgo_import_dynamic stat stat "" +//go:cgo_import_dynamic sysconf sysconf "" +//go:cgo_import_dynamic sysctl sysctl "" +//go:cgo_import_dynamic sysctlbyname sysctlbyname "" +//go:cgo_import_dynamic task_info task_info "" +//go:cgo_import_dynamic tcgetattr tcgetattr "" +//go:cgo_import_dynamic tcsetattr tcsetattr "" +//go:cgo_import_dynamic unlink unlink "" +//go:cgo_import_dynamic unlockpt unlockpt "" +//go:cgo_import_dynamic usleep usleep "" +//go:cgo_import_dynamic vm_region_64 vm_region_64 "" +//go:cgo_import_dynamic vm_region_recurse_64 vm_region_recurse_64 "" +//go:cgo_import_dynamic waitpid waitpid "" +//go:cgo_import_dynamic write write "" diff --git a/src/runtime/race/race_darwin_arm64.syso b/src/runtime/race/race_darwin_arm64.syso Binary files differ new file mode 100644 index 0000000..4a23df2 --- /dev/null +++ b/src/runtime/race/race_darwin_arm64.syso diff --git a/src/runtime/race/race_linux_arm64.syso b/src/runtime/race/race_linux_arm64.syso Binary files differ new file mode 100644 index 0000000..c8b3f48 --- /dev/null +++ b/src/runtime/race/race_linux_arm64.syso diff --git a/src/runtime/race/race_linux_ppc64le.syso b/src/runtime/race/race_linux_ppc64le.syso Binary files differ new file mode 100644 index 0000000..1939f29 --- /dev/null +++ b/src/runtime/race/race_linux_ppc64le.syso diff --git a/src/runtime/race/race_linux_s390x.syso b/src/runtime/race/race_linux_s390x.syso Binary files differ new file mode 100644 index 0000000..ed4a300 --- /dev/null +++ b/src/runtime/race/race_linux_s390x.syso diff --git a/src/runtime/race/race_linux_test.go b/src/runtime/race/race_linux_test.go new file mode 100644 index 0000000..947ed7c --- /dev/null +++ b/src/runtime/race/race_linux_test.go @@ -0,0 +1,65 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && race + +package race_test + +import ( + "sync/atomic" + "syscall" + "testing" + "unsafe" +) + +func TestAtomicMmap(t *testing.T) { + // Test that atomic operations work on "external" memory. Previously they crashed (#16206). + // Also do a sanity correctness check: under race detector atomic operations + // are implemented inside of race runtime. + mem, err := syscall.Mmap(-1, 0, 1<<20, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_ANON|syscall.MAP_PRIVATE) + if err != nil { + t.Fatalf("mmap failed: %v", err) + } + defer syscall.Munmap(mem) + a := (*uint64)(unsafe.Pointer(&mem[0])) + if *a != 0 { + t.Fatalf("bad atomic value: %v, want 0", *a) + } + atomic.AddUint64(a, 1) + if *a != 1 { + t.Fatalf("bad atomic value: %v, want 1", *a) + } + atomic.AddUint64(a, 1) + if *a != 2 { + t.Fatalf("bad atomic value: %v, want 2", *a) + } +} + +func TestAtomicPageBoundary(t *testing.T) { + // Test that atomic access near (but not cross) a page boundary + // doesn't fault. See issue 60825. + + // Mmap two pages of memory, and make the second page inaccessible, + // so we have an address at the end of a page. + pagesize := syscall.Getpagesize() + b, err := syscall.Mmap(0, 0, 2*pagesize, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_ANON|syscall.MAP_PRIVATE) + if err != nil { + t.Fatalf("mmap failed %s", err) + } + defer syscall.Munmap(b) + err = syscall.Mprotect(b[pagesize:], syscall.PROT_NONE) + if err != nil { + t.Fatalf("mprotect high failed %s\n", err) + } + + // This should not fault. 
+ a := (*uint32)(unsafe.Pointer(&b[pagesize-4])) + atomic.StoreUint32(a, 1) + if x := atomic.LoadUint32(a); x != 1 { + t.Fatalf("bad atomic value: %v, want 1", x) + } + if x := atomic.AddUint32(a, 1); x != 2 { + t.Fatalf("bad atomic value: %v, want 2", x) + } +} diff --git a/src/runtime/race/race_test.go b/src/runtime/race/race_test.go new file mode 100644 index 0000000..4fe6168 --- /dev/null +++ b/src/runtime/race/race_test.go @@ -0,0 +1,250 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +// This program is used to verify the race detector +// by running the tests and parsing their output. +// It does not check stack correctness, completeness or anything else: +// it merely verifies that if a test is expected to be racy +// then the race is detected. +package race_test + +import ( + "bufio" + "bytes" + "fmt" + "internal/testenv" + "io" + "log" + "math/rand" + "os" + "os/exec" + "path/filepath" + "strings" + "sync" + "sync/atomic" + "testing" +) + +var ( + passedTests = 0 + totalTests = 0 + falsePos = 0 + falseNeg = 0 + failingPos = 0 + failingNeg = 0 + failed = false +) + +const ( + visibleLen = 40 + testPrefix = "=== RUN Test" +) + +func TestRace(t *testing.T) { + testOutput, err := runTests(t) + if err != nil { + t.Fatalf("Failed to run tests: %v\n%v", err, string(testOutput)) + } + reader := bufio.NewReader(bytes.NewReader(testOutput)) + + funcName := "" + var tsanLog []string + for { + s, err := nextLine(reader) + if err != nil { + fmt.Printf("%s\n", processLog(funcName, tsanLog)) + break + } + if strings.HasPrefix(s, testPrefix) { + fmt.Printf("%s\n", processLog(funcName, tsanLog)) + tsanLog = make([]string, 0, 100) + funcName = s[len(testPrefix):] + } else { + tsanLog = append(tsanLog, s) + } + } + + if totalTests == 0 { + t.Fatalf("failed to parse test output:\n%s", testOutput) + } + fmt.Printf("\nPassed %d of %d tests 
(%.02f%%, %d+, %d-)\n", + passedTests, totalTests, 100*float64(passedTests)/float64(totalTests), falsePos, falseNeg) + fmt.Printf("%d expected failures (%d has not fail)\n", failingPos+failingNeg, failingNeg) + if failed { + t.Fail() + } +} + +// nextLine is a wrapper around bufio.Reader.ReadString. +// It reads a line up to the next '\n' character. Error +// is non-nil if there are no lines left, and nil +// otherwise. +func nextLine(r *bufio.Reader) (string, error) { + s, err := r.ReadString('\n') + if err != nil { + if err != io.EOF { + log.Fatalf("nextLine: expected EOF, received %v", err) + } + return s, err + } + return s[:len(s)-1], nil +} + +// processLog verifies whether the given ThreadSanitizer's log +// contains a race report, checks this information against +// the name of the testcase and returns the result of this +// comparison. +func processLog(testName string, tsanLog []string) string { + if !strings.HasPrefix(testName, "Race") && !strings.HasPrefix(testName, "NoRace") { + return "" + } + gotRace := false + for _, s := range tsanLog { + if strings.Contains(s, "DATA RACE") { + gotRace = true + break + } + } + + failing := strings.Contains(testName, "Failing") + expRace := !strings.HasPrefix(testName, "No") + for len(testName) < visibleLen { + testName += " " + } + if expRace == gotRace { + passedTests++ + totalTests++ + if failing { + failed = true + failingNeg++ + } + return fmt.Sprintf("%s .", testName) + } + pos := "" + if expRace { + falseNeg++ + } else { + falsePos++ + pos = "+" + } + if failing { + failingPos++ + } else { + failed = true + } + totalTests++ + return fmt.Sprintf("%s %s%s", testName, "FAILED", pos) +} + +// runTests assures that the package and its dependencies is +// built with instrumentation enabled and returns the output of 'go test' +// which includes possible data race reports from ThreadSanitizer. 
+func runTests(t *testing.T) ([]byte, error) { + tests, err := filepath.Glob("./testdata/*_test.go") + if err != nil { + return nil, err + } + args := []string{"test", "-race", "-v"} + args = append(args, tests...) + cmd := exec.Command(testenv.GoToolPath(t), args...) + // The following flags turn off heuristics that suppress seemingly identical reports. + // It is required because the tests contain a lot of data races on the same addresses + // (the tests are simple and the memory is constantly reused). + for _, env := range os.Environ() { + if strings.HasPrefix(env, "GOMAXPROCS=") || + strings.HasPrefix(env, "GODEBUG=") || + strings.HasPrefix(env, "GORACE=") { + continue + } + cmd.Env = append(cmd.Env, env) + } + // We set GOMAXPROCS=1 to prevent test flakiness. + // There are two sources of flakiness: + // 1. Some tests rely on particular execution order. + // If the order is different, race does not happen at all. + // 2. Ironically, ThreadSanitizer runtime contains a logical race condition + // that can lead to false negatives if racy accesses happen literally at the same time. + // Tests used to work reliably in the good old days of GOMAXPROCS=1. + // So let's set it for now. A more reliable solution is to explicitly annotate tests + // with required execution order by means of a special "invisible" synchronization primitive + // (that's what is done for C++ ThreadSanitizer tests). This is issue #14119. + cmd.Env = append(cmd.Env, + "GOMAXPROCS=1", + "GORACE=suppress_equal_stacks=0 suppress_equal_addresses=0", + ) + // There are races: we expect tests to fail and the exit code to be non-zero. + out, _ := cmd.CombinedOutput() + if bytes.Contains(out, []byte("fatal error:")) { + // But don't expect runtime to crash. + return out, fmt.Errorf("runtime fatal error") + } + return out, nil +} + +func TestIssue8102(t *testing.T) { + // If this compiles with -race, the test passes. 
+ type S struct { + x any + i int + } + c := make(chan int) + a := [2]*int{} + for ; ; c <- *a[S{}.i] { + if t != nil { + break + } + } +} + +func TestIssue9137(t *testing.T) { + a := []string{"a"} + i := 0 + a[i], a[len(a)-1], a = a[len(a)-1], "", a[:len(a)-1] + if len(a) != 0 || a[:1][0] != "" { + t.Errorf("mangled a: %q %q", a, a[:1]) + } +} + +func BenchmarkSyncLeak(b *testing.B) { + const ( + G = 1000 + S = 1000 + H = 10 + ) + var wg sync.WaitGroup + wg.Add(G) + for g := 0; g < G; g++ { + go func() { + defer wg.Done() + hold := make([][]uint32, H) + for i := 0; i < b.N; i++ { + a := make([]uint32, S) + atomic.AddUint32(&a[rand.Intn(len(a))], 1) + hold[rand.Intn(len(hold))] = a + } + _ = hold + }() + } + wg.Wait() +} + +func BenchmarkStackLeak(b *testing.B) { + done := make(chan bool, 1) + for i := 0; i < b.N; i++ { + go func() { + growStack(rand.Intn(100)) + done <- true + }() + <-done + } +} + +func growStack(i int) { + if i == 0 { + return + } + growStack(i - 1) +} diff --git a/src/runtime/race/race_unix_test.go b/src/runtime/race/race_unix_test.go new file mode 100644 index 0000000..3cf53b0 --- /dev/null +++ b/src/runtime/race/race_unix_test.go @@ -0,0 +1,29 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race && (darwin || freebsd || linux) + +package race_test + +import ( + "sync/atomic" + "syscall" + "testing" + "unsafe" +) + +// Test that race detector does not crash when accessing non-Go allocated memory (issue 9136). 
+func TestNonGoMemory(t *testing.T) { + data, err := syscall.Mmap(-1, 0, 4096, syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_ANON|syscall.MAP_PRIVATE) + if err != nil { + t.Fatalf("failed to mmap memory: %v", err) + } + defer syscall.Munmap(data) + p := (*uint32)(unsafe.Pointer(&data[0])) + atomic.AddUint32(p, 1) + (*p)++ + if *p != 2 { + t.Fatalf("data[0] = %v, expect 2", *p) + } +} diff --git a/src/runtime/race/race_v1_amd64.go b/src/runtime/race/race_v1_amd64.go new file mode 100644 index 0000000..7c40db1 --- /dev/null +++ b/src/runtime/race/race_v1_amd64.go @@ -0,0 +1,9 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && !amd64.v3) || darwin || freebsd || netbsd || openbsd || windows + +package race + +import _ "runtime/race/internal/amd64v1" diff --git a/src/runtime/race/race_v3_amd64.go b/src/runtime/race/race_v3_amd64.go new file mode 100644 index 0000000..80728d8 --- /dev/null +++ b/src/runtime/race/race_v3_amd64.go @@ -0,0 +1,9 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && amd64.v3 + +package race + +import _ "runtime/race/internal/amd64v3" diff --git a/src/runtime/race/race_windows_test.go b/src/runtime/race/race_windows_test.go new file mode 100644 index 0000000..143b483 --- /dev/null +++ b/src/runtime/race/race_windows_test.go @@ -0,0 +1,46 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build windows && race + +package race_test + +import ( + "sync/atomic" + "syscall" + "testing" + "unsafe" +) + +func TestAtomicMmap(t *testing.T) { + // Test that atomic operations work on "external" memory. Previously they crashed (#16206). 
+ // Also do a sanity correctness check: under race detector atomic operations + // are implemented inside of race runtime. + kernel32 := syscall.NewLazyDLL("kernel32.dll") + VirtualAlloc := kernel32.NewProc("VirtualAlloc") + VirtualFree := kernel32.NewProc("VirtualFree") + const ( + MEM_COMMIT = 0x00001000 + MEM_RESERVE = 0x00002000 + MEM_RELEASE = 0x8000 + PAGE_READWRITE = 0x04 + ) + mem, _, err := syscall.Syscall6(VirtualAlloc.Addr(), 4, 0, 1<<20, MEM_COMMIT|MEM_RESERVE, PAGE_READWRITE, 0, 0) + if err != 0 { + t.Fatalf("VirtualAlloc failed: %v", err) + } + defer syscall.Syscall(VirtualFree.Addr(), 3, mem, 1<<20, MEM_RELEASE) + a := (*uint64)(unsafe.Pointer(mem)) + if *a != 0 { + t.Fatalf("bad atomic value: %v, want 0", *a) + } + atomic.AddUint64(a, 1) + if *a != 1 { + t.Fatalf("bad atomic value: %v, want 1", *a) + } + atomic.AddUint64(a, 1) + if *a != 2 { + t.Fatalf("bad atomic value: %v, want 2", *a) + } +} diff --git a/src/runtime/race/sched_test.go b/src/runtime/race/sched_test.go new file mode 100644 index 0000000..a66860c --- /dev/null +++ b/src/runtime/race/sched_test.go @@ -0,0 +1,48 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package race_test + +import ( + "fmt" + "reflect" + "runtime" + "strings" + "testing" +) + +func TestRandomScheduling(t *testing.T) { + // Scheduler is most consistent with GOMAXPROCS=1. + // Use that to make the test most likely to fail. 
+ defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + const N = 10 + out := make([][]int, N) + for i := 0; i < N; i++ { + c := make(chan int, N) + for j := 0; j < N; j++ { + go func(j int) { + c <- j + }(j) + } + row := make([]int, N) + for j := 0; j < N; j++ { + row[j] = <-c + } + out[i] = row + } + + for i := 0; i < N; i++ { + if !reflect.DeepEqual(out[0], out[i]) { + return // found a different order + } + } + + var buf strings.Builder + for i := 0; i < N; i++ { + fmt.Fprintf(&buf, "%v\n", out[i]) + } + t.Fatalf("consistent goroutine execution order:\n%v", buf.String()) +} diff --git a/src/runtime/race/syso_test.go b/src/runtime/race/syso_test.go new file mode 100644 index 0000000..2f1a91c --- /dev/null +++ b/src/runtime/race/syso_test.go @@ -0,0 +1,33 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package race + +import ( + "bytes" + "os/exec" + "path/filepath" + "runtime" + "testing" +) + +func TestIssue37485(t *testing.T) { + files, err := filepath.Glob("./*.syso") + if err != nil { + t.Fatalf("can't find syso files: %s", err) + } + for _, f := range files { + cmd := exec.Command(filepath.Join(runtime.GOROOT(), "bin", "go"), "tool", "nm", f) + res, err := cmd.CombinedOutput() + if err != nil { + t.Errorf("nm of %s failed: %s", f, err) + continue + } + if bytes.Contains(res, []byte("getauxval")) { + t.Errorf("%s contains getauxval", f) + } + } +} diff --git a/src/runtime/race/testdata/atomic_test.go b/src/runtime/race/testdata/atomic_test.go new file mode 100644 index 0000000..4ce7260 --- /dev/null +++ b/src/runtime/race/testdata/atomic_test.go @@ -0,0 +1,325 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "runtime" + "sync" + "sync/atomic" + "testing" + "unsafe" +) + +func TestNoRaceAtomicAddInt64(t *testing.T) { + var x1, x2 int8 + _ = x1 + x2 + var s int64 + ch := make(chan bool, 2) + go func() { + x1 = 1 + if atomic.AddInt64(&s, 1) == 2 { + x2 = 1 + } + ch <- true + }() + go func() { + x2 = 1 + if atomic.AddInt64(&s, 1) == 2 { + x1 = 1 + } + ch <- true + }() + <-ch + <-ch +} + +func TestRaceAtomicAddInt64(t *testing.T) { + var x1, x2 int8 + _ = x1 + x2 + var s int64 + ch := make(chan bool, 2) + go func() { + x1 = 1 + if atomic.AddInt64(&s, 1) == 1 { + x2 = 1 + } + ch <- true + }() + go func() { + x2 = 1 + if atomic.AddInt64(&s, 1) == 1 { + x1 = 1 + } + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceAtomicAddInt32(t *testing.T) { + var x1, x2 int8 + _ = x1 + x2 + var s int32 + ch := make(chan bool, 2) + go func() { + x1 = 1 + if atomic.AddInt32(&s, 1) == 2 { + x2 = 1 + } + ch <- true + }() + go func() { + x2 = 1 + if atomic.AddInt32(&s, 1) == 2 { + x1 = 1 + } + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceAtomicLoadAddInt32(t *testing.T) { + var x int64 + _ = x + var s int32 + go func() { + x = 2 + atomic.AddInt32(&s, 1) + }() + for atomic.LoadInt32(&s) != 1 { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicLoadStoreInt32(t *testing.T) { + var x int64 + _ = x + var s int32 + go func() { + x = 2 + atomic.StoreInt32(&s, 1) + }() + for atomic.LoadInt32(&s) != 1 { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicStoreCASInt32(t *testing.T) { + var x int64 + _ = x + var s int32 + go func() { + x = 2 + atomic.StoreInt32(&s, 1) + }() + for !atomic.CompareAndSwapInt32(&s, 1, 0) { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicCASLoadInt32(t *testing.T) { + var x int64 + _ = x + var s int32 + go func() { + x = 2 + if !atomic.CompareAndSwapInt32(&s, 0, 1) { + panic("") + } + }() + for atomic.LoadInt32(&s) != 1 { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicCASCASInt32(t *testing.T) 
{ + var x int64 + _ = x + var s int32 + go func() { + x = 2 + if !atomic.CompareAndSwapInt32(&s, 0, 1) { + panic("") + } + }() + for !atomic.CompareAndSwapInt32(&s, 1, 0) { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicCASCASInt32_2(t *testing.T) { + var x1, x2 int8 + _ = x1 + x2 + var s int32 + ch := make(chan bool, 2) + go func() { + x1 = 1 + if !atomic.CompareAndSwapInt32(&s, 0, 1) { + x2 = 1 + } + ch <- true + }() + go func() { + x2 = 1 + if !atomic.CompareAndSwapInt32(&s, 0, 1) { + x1 = 1 + } + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceAtomicLoadInt64(t *testing.T) { + var x int32 + _ = x + var s int64 + go func() { + x = 2 + atomic.AddInt64(&s, 1) + }() + for atomic.LoadInt64(&s) != 1 { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicCASCASUInt64(t *testing.T) { + var x int64 + _ = x + var s uint64 + go func() { + x = 2 + if !atomic.CompareAndSwapUint64(&s, 0, 1) { + panic("") + } + }() + for !atomic.CompareAndSwapUint64(&s, 1, 0) { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicLoadStorePointer(t *testing.T) { + var x int64 + _ = x + var s unsafe.Pointer + var y int = 2 + var p unsafe.Pointer = unsafe.Pointer(&y) + go func() { + x = 2 + atomic.StorePointer(&s, p) + }() + for atomic.LoadPointer(&s) != p { + runtime.Gosched() + } + x = 1 +} + +func TestNoRaceAtomicStoreCASUint64(t *testing.T) { + var x int64 + _ = x + var s uint64 + go func() { + x = 2 + atomic.StoreUint64(&s, 1) + }() + for !atomic.CompareAndSwapUint64(&s, 1, 0) { + runtime.Gosched() + } + x = 1 +} + +func TestRaceAtomicStoreLoad(t *testing.T) { + c := make(chan bool) + var a uint64 + go func() { + atomic.StoreUint64(&a, 1) + c <- true + }() + _ = a + <-c +} + +func TestRaceAtomicLoadStore(t *testing.T) { + c := make(chan bool) + var a uint64 + go func() { + _ = atomic.LoadUint64(&a) + c <- true + }() + a = 1 + <-c +} + +func TestRaceAtomicAddLoad(t *testing.T) { + c := make(chan bool) + var a uint64 + go func() { + atomic.AddUint64(&a, 1) + c <- 
true + }() + _ = a + <-c +} + +func TestRaceAtomicAddStore(t *testing.T) { + c := make(chan bool) + var a uint64 + go func() { + atomic.AddUint64(&a, 1) + c <- true + }() + a = 42 + <-c +} + +// A nil pointer in an atomic operation should not deadlock +// the rest of the program. Used to hang indefinitely. +func TestNoRaceAtomicCrash(t *testing.T) { + var mutex sync.Mutex + var nilptr *int32 + panics := 0 + defer func() { + if x := recover(); x != nil { + mutex.Lock() + panics++ + mutex.Unlock() + } else { + panic("no panic") + } + }() + atomic.AddInt32(nilptr, 1) +} + +func TestNoRaceDeferAtomicStore(t *testing.T) { + // Test that when an atomic function is deferred directly, the + // GC scans it correctly. See issue 42599. + type foo struct { + bar int64 + } + + var doFork func(f *foo, depth int) + doFork = func(f *foo, depth int) { + atomic.StoreInt64(&f.bar, 1) + defer atomic.StoreInt64(&f.bar, 0) + if depth > 0 { + for i := 0; i < 2; i++ { + f2 := &foo{} + go doFork(f2, depth-1) + } + } + runtime.GC() + } + + f := &foo{} + doFork(f, 11) +} diff --git a/src/runtime/race/testdata/cgo_test.go b/src/runtime/race/testdata/cgo_test.go new file mode 100644 index 0000000..211ef7d --- /dev/null +++ b/src/runtime/race/testdata/cgo_test.go @@ -0,0 +1,21 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "internal/testenv" + "os" + "os/exec" + "testing" +) + +func TestNoRaceCgoSync(t *testing.T) { + cmd := exec.Command(testenv.GoToolPath(t), "run", "-race", "cgo_test_main.go") + cmd.Stdout = os.Stdout + cmd.Stderr = os.Stderr + if err := cmd.Run(); err != nil { + t.Fatalf("program exited with error: %v\n", err) + } +} diff --git a/src/runtime/race/testdata/cgo_test_main.go b/src/runtime/race/testdata/cgo_test_main.go new file mode 100644 index 0000000..620cea1 --- /dev/null +++ b/src/runtime/race/testdata/cgo_test_main.go @@ -0,0 +1,30 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +/* +int sync; + +void Notify(void) +{ + __sync_fetch_and_add(&sync, 1); +} + +void Wait(void) +{ + while(__sync_fetch_and_add(&sync, 0) == 0) {} +} +*/ +import "C" + +func main() { + data := 0 + go func() { + data = 1 + C.Notify() + }() + C.Wait() + _ = data +} diff --git a/src/runtime/race/testdata/chan_test.go b/src/runtime/race/testdata/chan_test.go new file mode 100644 index 0000000..e39ad4f --- /dev/null +++ b/src/runtime/race/testdata/chan_test.go @@ -0,0 +1,787 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "runtime" + "testing" + "time" +) + +func TestNoRaceChanSync(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + v = 1 + c <- 0 + }() + <-c + v = 2 +} + +func TestNoRaceChanSyncRev(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + c <- 0 + v = 2 + }() + v = 1 + <-c +} + +func TestNoRaceChanAsync(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + c <- 0 + }() + <-c + v = 2 +} + +func TestRaceChanAsyncRev(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + c <- 0 + v = 1 + }() + v = 2 + <-c +} + +func TestNoRaceChanAsyncCloseRecv(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + close(c) + }() + func() { + defer func() { + recover() + v = 2 + }() + <-c + }() +} + +func TestNoRaceChanAsyncCloseRecv2(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + close(c) + }() + _, _ = <-c + v = 2 +} + +func TestNoRaceChanAsyncCloseRecv3(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + close(c) + }() + for range c { + } + v = 2 +} + +func TestNoRaceChanSyncCloseRecv(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + v = 1 + close(c) + }() + func() { + defer func() { + recover() + v = 2 + }() + <-c + }() +} + +func TestNoRaceChanSyncCloseRecv2(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + v = 1 + close(c) + }() + _, _ = <-c + v = 2 +} + +func TestNoRaceChanSyncCloseRecv3(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + v = 1 + close(c) + }() + for range c { + } + v = 2 +} + +func TestRaceChanSyncCloseSend(t *testing.T) { + v := 0 + _ = v + c := make(chan int) + go func() { + v = 1 + close(c) + }() + func() { + defer func() { + recover() + }() + c <- 0 + }() + v = 2 +} + +func TestRaceChanAsyncCloseSend(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + 
close(c) + }() + func() { + defer func() { + recover() + }() + for { + c <- 0 + } + }() + v = 2 +} + +func TestRaceChanCloseClose(t *testing.T) { + compl := make(chan bool, 2) + v1 := 0 + v2 := 0 + _ = v1 + v2 + c := make(chan int) + go func() { + defer func() { + if recover() != nil { + v2 = 2 + } + compl <- true + }() + v1 = 1 + close(c) + }() + go func() { + defer func() { + if recover() != nil { + v1 = 2 + } + compl <- true + }() + v2 = 1 + close(c) + }() + <-compl + <-compl +} + +func TestRaceChanSendLen(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + go func() { + v = 1 + c <- 1 + }() + for len(c) == 0 { + runtime.Gosched() + } + v = 2 +} + +func TestRaceChanRecvLen(t *testing.T) { + v := 0 + _ = v + c := make(chan int, 10) + c <- 1 + go func() { + v = 1 + <-c + }() + for len(c) != 0 { + runtime.Gosched() + } + v = 2 +} + +func TestRaceChanSendSend(t *testing.T) { + compl := make(chan bool, 2) + v1 := 0 + v2 := 0 + _ = v1 + v2 + c := make(chan int, 1) + go func() { + v1 = 1 + select { + case c <- 1: + default: + v2 = 2 + } + compl <- true + }() + go func() { + v2 = 1 + select { + case c <- 1: + default: + v1 = 2 + } + compl <- true + }() + <-compl + <-compl +} + +func TestNoRaceChanPtr(t *testing.T) { + type msg struct { + x int + } + c := make(chan *msg) + go func() { + c <- &msg{1} + }() + m := <-c + m.x = 2 +} + +func TestRaceChanWrongSend(t *testing.T) { + v1 := 0 + v2 := 0 + _ = v1 + v2 + c := make(chan int, 2) + go func() { + v1 = 1 + c <- 1 + }() + go func() { + v2 = 2 + c <- 2 + }() + time.Sleep(1e7) + if <-c == 1 { + v2 = 3 + } else { + v1 = 3 + } +} + +func TestRaceChanWrongClose(t *testing.T) { + v1 := 0 + v2 := 0 + _ = v1 + v2 + c := make(chan int, 1) + done := make(chan bool) + go func() { + defer func() { + recover() + }() + v1 = 1 + c <- 1 + done <- true + }() + go func() { + time.Sleep(1e7) + v2 = 2 + close(c) + done <- true + }() + time.Sleep(2e7) + if _, who := <-c; who { + v2 = 2 + } else { + v1 = 2 + } + <-done + <-done +} + 
+func TestRaceChanSendClose(t *testing.T) { + compl := make(chan bool, 2) + c := make(chan int, 1) + go func() { + defer func() { + recover() + compl <- true + }() + c <- 1 + }() + go func() { + time.Sleep(10 * time.Millisecond) + close(c) + compl <- true + }() + <-compl + <-compl +} + +func TestRaceChanSendSelectClose(t *testing.T) { + compl := make(chan bool, 2) + c := make(chan int, 1) + c1 := make(chan int) + go func() { + defer func() { + recover() + compl <- true + }() + time.Sleep(10 * time.Millisecond) + select { + case c <- 1: + case <-c1: + } + }() + go func() { + close(c) + compl <- true + }() + <-compl + <-compl +} + +func TestRaceSelectReadWriteAsync(t *testing.T) { + done := make(chan bool) + x := 0 + c1 := make(chan int, 10) + c2 := make(chan int, 10) + c3 := make(chan int) + c2 <- 1 + go func() { + select { + case c1 <- x: // read of x races with... + case c3 <- 1: + } + done <- true + }() + select { + case x = <-c2: // ... write to x here + case c3 <- 1: + } + <-done +} + +func TestRaceSelectReadWriteSync(t *testing.T) { + done := make(chan bool) + x := 0 + c1 := make(chan int) + c2 := make(chan int) + c3 := make(chan int) + // make c1 and c2 ready for communication + go func() { + <-c1 + }() + go func() { + c2 <- 1 + }() + go func() { + select { + case c1 <- x: // read of x races with... + case c3 <- 1: + } + done <- true + }() + select { + case x = <-c2: // ... write to x here + case c3 <- 1: + } + <-done +} + +func TestNoRaceSelectReadWriteAsync(t *testing.T) { + done := make(chan bool) + x := 0 + c1 := make(chan int) + c2 := make(chan int) + go func() { + select { + case c1 <- x: // read of x does not race with... + case c2 <- 1: + } + done <- true + }() + select { + case x = <-c1: // ... write to x here + case c2 <- 1: + } + <-done +} + +func TestRaceChanReadWriteAsync(t *testing.T) { + done := make(chan bool) + c1 := make(chan int, 10) + c2 := make(chan int, 10) + c2 <- 10 + x := 0 + go func() { + c1 <- x // read of x races with... 
+ done <- true + }() + x = <-c2 // ... write to x here + <-done +} + +func TestRaceChanReadWriteSync(t *testing.T) { + done := make(chan bool) + c1 := make(chan int) + c2 := make(chan int) + // make c1 and c2 ready for communication + go func() { + <-c1 + }() + go func() { + c2 <- 10 + }() + x := 0 + go func() { + c1 <- x // read of x races with... + done <- true + }() + x = <-c2 // ... write to x here + <-done +} + +func TestNoRaceChanReadWriteAsync(t *testing.T) { + done := make(chan bool) + c1 := make(chan int, 10) + x := 0 + go func() { + c1 <- x // read of x does not race with... + done <- true + }() + x = <-c1 // ... write to x here + <-done +} + +func TestNoRaceProducerConsumerUnbuffered(t *testing.T) { + type Task struct { + f func() + done chan bool + } + + queue := make(chan Task) + + go func() { + t := <-queue + t.f() + t.done <- true + }() + + doit := func(f func()) { + done := make(chan bool, 1) + queue <- Task{f, done} + <-done + } + + x := 0 + doit(func() { + x = 1 + }) + _ = x +} + +func TestRaceChanItselfSend(t *testing.T) { + compl := make(chan bool, 1) + c := make(chan int, 10) + go func() { + c <- 0 + compl <- true + }() + c = make(chan int, 20) + <-compl +} + +func TestRaceChanItselfRecv(t *testing.T) { + compl := make(chan bool, 1) + c := make(chan int, 10) + c <- 1 + go func() { + <-c + compl <- true + }() + time.Sleep(1e7) + c = make(chan int, 20) + <-compl +} + +func TestRaceChanItselfNil(t *testing.T) { + c := make(chan int, 10) + go func() { + c <- 0 + }() + time.Sleep(1e7) + c = nil + _ = c +} + +func TestRaceChanItselfClose(t *testing.T) { + compl := make(chan bool, 1) + c := make(chan int) + go func() { + close(c) + compl <- true + }() + c = make(chan int) + <-compl +} + +func TestRaceChanItselfLen(t *testing.T) { + compl := make(chan bool, 1) + c := make(chan int) + go func() { + _ = len(c) + compl <- true + }() + c = make(chan int) + <-compl +} + +func TestRaceChanItselfCap(t *testing.T) { + compl := make(chan bool, 1) + c := 
make(chan int) + go func() { + _ = cap(c) + compl <- true + }() + c = make(chan int) + <-compl +} + +func TestNoRaceChanCloseLen(t *testing.T) { + c := make(chan int, 10) + r := make(chan int, 10) + go func() { + r <- len(c) + }() + go func() { + close(c) + r <- 0 + }() + <-r + <-r +} + +func TestNoRaceChanCloseCap(t *testing.T) { + c := make(chan int, 10) + r := make(chan int, 10) + go func() { + r <- cap(c) + }() + go func() { + close(c) + r <- 0 + }() + <-r + <-r +} + +func TestRaceChanCloseSend(t *testing.T) { + compl := make(chan bool, 1) + c := make(chan int, 10) + go func() { + close(c) + compl <- true + }() + c <- 0 + <-compl +} + +func TestNoRaceChanMutex(t *testing.T) { + done := make(chan struct{}) + mtx := make(chan struct{}, 1) + data := 0 + _ = data + go func() { + mtx <- struct{}{} + data = 42 + <-mtx + done <- struct{}{} + }() + mtx <- struct{}{} + data = 43 + <-mtx + <-done +} + +func TestNoRaceSelectMutex(t *testing.T) { + done := make(chan struct{}) + mtx := make(chan struct{}, 1) + aux := make(chan bool) + data := 0 + _ = data + go func() { + select { + case mtx <- struct{}{}: + case <-aux: + } + data = 42 + select { + case <-mtx: + case <-aux: + } + done <- struct{}{} + }() + select { + case mtx <- struct{}{}: + case <-aux: + } + data = 43 + select { + case <-mtx: + case <-aux: + } + <-done +} + +func TestRaceChanSem(t *testing.T) { + done := make(chan struct{}) + mtx := make(chan bool, 2) + data := 0 + _ = data + go func() { + mtx <- true + data = 42 + <-mtx + done <- struct{}{} + }() + mtx <- true + data = 43 + <-mtx + <-done +} + +func TestNoRaceChanWaitGroup(t *testing.T) { + const N = 10 + chanWg := make(chan bool, N/2) + data := make([]int, N) + for i := 0; i < N; i++ { + chanWg <- true + go func(i int) { + data[i] = 42 + <-chanWg + }(i) + } + for i := 0; i < cap(chanWg); i++ { + chanWg <- true + } + for i := 0; i < N; i++ { + _ = data[i] + } +} + +// Test that sender synchronizes with receiver even if the sender was blocked. 
+func TestNoRaceBlockedSendSync(t *testing.T) { + c := make(chan *int, 1) + c <- nil + go func() { + i := 42 + c <- &i + }() + // Give the sender time to actually block. + // This sleep is completely optional: race report must not be printed + // regardless of whether the sender actually blocks or not. + // It cannot lead to flakiness. + time.Sleep(10 * time.Millisecond) + <-c + p := <-c + if *p != 42 { + t.Fatal() + } +} + +// The same as TestNoRaceBlockedSendSync above, but sender unblock happens in a select. +func TestNoRaceBlockedSelectSendSync(t *testing.T) { + c := make(chan *int, 1) + c <- nil + go func() { + i := 42 + c <- &i + }() + time.Sleep(10 * time.Millisecond) + <-c + select { + case p := <-c: + if *p != 42 { + t.Fatal() + } + case <-make(chan int): + } +} + +// Test that close synchronizes with a read from the empty closed channel. +// See https://golang.org/issue/36714. +func TestNoRaceCloseHappensBeforeRead(t *testing.T) { + for i := 0; i < 100; i++ { + var loc int + var write = make(chan struct{}) + var read = make(chan struct{}) + + go func() { + select { + case <-write: + _ = loc + default: + } + close(read) + }() + + go func() { + loc = 1 + close(write) + }() + + <-read + } +} + +// Test that we call the proper race detector function when c.elemsize==0. +// See https://github.com/golang/go/issues/42598 +func TestNoRaceElemetSize0(t *testing.T) { + var x, y int + var c = make(chan struct{}, 2) + c <- struct{}{} + c <- struct{}{} + go func() { + x += 1 + <-c + }() + go func() { + y += 1 + <-c + }() + time.Sleep(10 * time.Millisecond) + c <- struct{}{} + c <- struct{}{} + x += 1 + y += 1 +} diff --git a/src/runtime/race/testdata/comp_test.go b/src/runtime/race/testdata/comp_test.go new file mode 100644 index 0000000..27b2d00 --- /dev/null +++ b/src/runtime/race/testdata/comp_test.go @@ -0,0 +1,186 @@ +// Copyright 2012 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "testing" +) + +type P struct { + x, y int +} + +type S struct { + s1, s2 P +} + +func TestNoRaceComp(t *testing.T) { + c := make(chan bool, 1) + var s S + go func() { + s.s2.x = 1 + c <- true + }() + s.s2.y = 2 + <-c +} + +func TestNoRaceComp2(t *testing.T) { + c := make(chan bool, 1) + var s S + go func() { + s.s1.x = 1 + c <- true + }() + s.s1.y = 2 + <-c +} + +func TestRaceComp(t *testing.T) { + c := make(chan bool, 1) + var s S + go func() { + s.s2.y = 1 + c <- true + }() + s.s2.y = 2 + <-c +} + +func TestRaceComp2(t *testing.T) { + c := make(chan bool, 1) + var s S + go func() { + s.s1.x = 1 + c <- true + }() + s = S{} + <-c +} + +func TestRaceComp3(t *testing.T) { + c := make(chan bool, 1) + var s S + go func() { + s.s2.y = 1 + c <- true + }() + s = S{} + <-c +} + +func TestRaceCompArray(t *testing.T) { + c := make(chan bool, 1) + s := make([]S, 10) + x := 4 + go func() { + s[x].s2.y = 1 + c <- true + }() + x = 5 + <-c +} + +type P2 P +type S2 S + +func TestRaceConv1(t *testing.T) { + c := make(chan bool, 1) + var p P2 + go func() { + p.x = 1 + c <- true + }() + _ = P(p).x + <-c +} + +func TestRaceConv2(t *testing.T) { + c := make(chan bool, 1) + var p P2 + go func() { + p.x = 1 + c <- true + }() + ptr := &p + _ = P(*ptr).x + <-c +} + +func TestRaceConv3(t *testing.T) { + c := make(chan bool, 1) + var s S2 + go func() { + s.s1.x = 1 + c <- true + }() + _ = P2(S(s).s1).x + <-c +} + +type X struct { + V [4]P +} + +type X2 X + +func TestRaceConv4(t *testing.T) { + c := make(chan bool, 1) + var x X2 + go func() { + x.V[1].x = 1 + c <- true + }() + _ = P2(X(x).V[1]).x + <-c +} + +type Ptr struct { + s1, s2 *P +} + +func TestNoRaceCompPtr(t *testing.T) { + c := make(chan bool, 1) + p := Ptr{&P{}, &P{}} + go func() { + p.s1.x = 1 + c <- true + }() + p.s1.y = 2 + <-c +} + +func TestNoRaceCompPtr2(t *testing.T) { + c := 
make(chan bool, 1) + p := Ptr{&P{}, &P{}} + go func() { + p.s1.x = 1 + c <- true + }() + _ = p + <-c +} + +func TestRaceCompPtr(t *testing.T) { + c := make(chan bool, 1) + p := Ptr{&P{}, &P{}} + go func() { + p.s2.x = 1 + c <- true + }() + p.s2.x = 2 + <-c +} + +func TestRaceCompPtr2(t *testing.T) { + c := make(chan bool, 1) + p := Ptr{&P{}, &P{}} + go func() { + p.s2.x = 1 + c <- true + }() + p.s2 = &P{} + <-c +} diff --git a/src/runtime/race/testdata/finalizer_test.go b/src/runtime/race/testdata/finalizer_test.go new file mode 100644 index 0000000..3ac33d2 --- /dev/null +++ b/src/runtime/race/testdata/finalizer_test.go @@ -0,0 +1,68 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "runtime" + "sync" + "testing" + "time" +) + +func TestNoRaceFin(t *testing.T) { + c := make(chan bool) + go func() { + x := new(string) + runtime.SetFinalizer(x, func(x *string) { + *x = "foo" + }) + *x = "bar" + c <- true + }() + <-c + runtime.GC() + time.Sleep(100 * time.Millisecond) +} + +var finVar struct { + sync.Mutex + cnt int +} + +func TestNoRaceFinGlobal(t *testing.T) { + c := make(chan bool) + go func() { + x := new(string) + runtime.SetFinalizer(x, func(x *string) { + finVar.Lock() + finVar.cnt++ + finVar.Unlock() + }) + c <- true + }() + <-c + runtime.GC() + time.Sleep(100 * time.Millisecond) + finVar.Lock() + finVar.cnt++ + finVar.Unlock() +} + +func TestRaceFin(t *testing.T) { + c := make(chan bool) + y := 0 + _ = y + go func() { + x := new(string) + runtime.SetFinalizer(x, func(x *string) { + y = 42 + }) + c <- true + }() + <-c + runtime.GC() + time.Sleep(100 * time.Millisecond) + y = 66 +} diff --git a/src/runtime/race/testdata/io_test.go b/src/runtime/race/testdata/io_test.go new file mode 100644 index 0000000..3303cb0 --- /dev/null +++ b/src/runtime/race/testdata/io_test.go @@ -0,0 +1,75 @@ +// Copyright 2012 The Go 
Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "fmt" + "net" + "net/http" + "os" + "path/filepath" + "sync" + "testing" + "time" +) + +func TestNoRaceIOFile(t *testing.T) { + x := 0 + path := t.TempDir() + fname := filepath.Join(path, "data") + go func() { + x = 42 + f, _ := os.Create(fname) + f.Write([]byte("done")) + f.Close() + }() + for { + f, err := os.Open(fname) + if err != nil { + time.Sleep(1e6) + continue + } + buf := make([]byte, 100) + count, err := f.Read(buf) + if count == 0 { + time.Sleep(1e6) + continue + } + break + } + _ = x +} + +var ( + regHandler sync.Once + handlerData int +) + +func TestNoRaceIOHttp(t *testing.T) { + regHandler.Do(func() { + http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) { + handlerData++ + fmt.Fprintf(w, "test") + handlerData++ + }) + }) + ln, err := net.Listen("tcp", "127.0.0.1:0") + if err != nil { + t.Fatalf("net.Listen: %v", err) + } + defer ln.Close() + go http.Serve(ln, nil) + handlerData++ + _, err = http.Get("http://" + ln.Addr().String()) + if err != nil { + t.Fatalf("http.Get: %v", err) + } + handlerData++ + _, err = http.Get("http://" + ln.Addr().String()) + if err != nil { + t.Fatalf("http.Get: %v", err) + } + handlerData++ +} diff --git a/src/runtime/race/testdata/issue12225_test.go b/src/runtime/race/testdata/issue12225_test.go new file mode 100644 index 0000000..0494493 --- /dev/null +++ b/src/runtime/race/testdata/issue12225_test.go @@ -0,0 +1,20 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import "unsafe" + +// golang.org/issue/12225 +// The test is that this compiles at all. 
+ +//go:noinline +func convert(s string) []byte { + return []byte(s) +} + +func issue12225() { + println(*(*int)(unsafe.Pointer(&convert("")[0]))) + println(*(*int)(unsafe.Pointer(&[]byte("")[0]))) +} diff --git a/src/runtime/race/testdata/issue12664_test.go b/src/runtime/race/testdata/issue12664_test.go new file mode 100644 index 0000000..714e83d --- /dev/null +++ b/src/runtime/race/testdata/issue12664_test.go @@ -0,0 +1,76 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "fmt" + "testing" +) + +var issue12664 = "hi" + +func TestRaceIssue12664(t *testing.T) { + c := make(chan struct{}) + go func() { + issue12664 = "bye" + close(c) + }() + fmt.Println(issue12664) + <-c +} + +type MyI interface { + foo() +} + +type MyT int + +func (MyT) foo() { +} + +var issue12664_2 MyT = 0 + +func TestRaceIssue12664_2(t *testing.T) { + c := make(chan struct{}) + go func() { + issue12664_2 = 1 + close(c) + }() + func(x MyI) { + // Never true, but prevents inlining. + if x.(MyT) == -1 { + close(c) + } + }(issue12664_2) + <-c +} + +var issue12664_3 MyT = 0 + +func TestRaceIssue12664_3(t *testing.T) { + c := make(chan struct{}) + go func() { + issue12664_3 = 1 + close(c) + }() + var r MyT + var i any = r + issue12664_3 = i.(MyT) + <-c +} + +var issue12664_4 MyT = 0 + +func TestRaceIssue12664_4(t *testing.T) { + c := make(chan struct{}) + go func() { + issue12664_4 = 1 + close(c) + }() + var r MyT + var i MyI = r + issue12664_4 = i.(MyT) + <-c +} diff --git a/src/runtime/race/testdata/issue13264_test.go b/src/runtime/race/testdata/issue13264_test.go new file mode 100644 index 0000000..d42290d --- /dev/null +++ b/src/runtime/race/testdata/issue13264_test.go @@ -0,0 +1,13 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +// golang.org/issue/13264 +// The test is that this compiles at all. + +func issue13264() { + for ; ; []map[int]int{}[0][0] = 0 { + } +} diff --git a/src/runtime/race/testdata/map_test.go b/src/runtime/race/testdata/map_test.go new file mode 100644 index 0000000..88e735e --- /dev/null +++ b/src/runtime/race/testdata/map_test.go @@ -0,0 +1,335 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "testing" +) + +func TestRaceMapRW(t *testing.T) { + m := make(map[int]int) + ch := make(chan bool, 1) + go func() { + _ = m[1] + ch <- true + }() + m[1] = 1 + <-ch +} + +func TestRaceMapRW2(t *testing.T) { + m := make(map[int]int) + ch := make(chan bool, 1) + go func() { + _, _ = m[1] + ch <- true + }() + m[1] = 1 + <-ch +} + +func TestRaceMapRWArray(t *testing.T) { + // Check instrumentation of unaddressable arrays (issue 4578). 
+ m := make(map[int][2]int) + ch := make(chan bool, 1) + go func() { + _ = m[1][1] + ch <- true + }() + m[2] = [2]int{1, 2} + <-ch +} + +func TestNoRaceMapRR(t *testing.T) { + m := make(map[int]int) + ch := make(chan bool, 1) + go func() { + _, _ = m[1] + ch <- true + }() + _ = m[1] + <-ch +} + +func TestRaceMapRange(t *testing.T) { + m := make(map[int]int) + ch := make(chan bool, 1) + go func() { + for range m { + } + ch <- true + }() + m[1] = 1 + <-ch +} + +func TestRaceMapRange2(t *testing.T) { + m := make(map[int]int) + ch := make(chan bool, 1) + go func() { + for range m { + } + ch <- true + }() + m[1] = 1 + <-ch +} + +func TestNoRaceMapRangeRange(t *testing.T) { + m := make(map[int]int) + // now the map is not empty and range triggers an event + // should work without this (as in other tests) + // so it is suspicious if this test passes and others don't + m[0] = 0 + ch := make(chan bool, 1) + go func() { + for range m { + } + ch <- true + }() + for range m { + } + <-ch +} + +func TestRaceMapLen(t *testing.T) { + m := make(map[string]bool) + ch := make(chan bool, 1) + go func() { + _ = len(m) + ch <- true + }() + m[""] = true + <-ch +} + +func TestRaceMapDelete(t *testing.T) { + m := make(map[string]bool) + ch := make(chan bool, 1) + go func() { + delete(m, "") + ch <- true + }() + m[""] = true + <-ch +} + +func TestRaceMapLenDelete(t *testing.T) { + m := make(map[string]bool) + ch := make(chan bool, 1) + go func() { + delete(m, "a") + ch <- true + }() + _ = len(m) + <-ch +} + +func TestRaceMapVariable(t *testing.T) { + ch := make(chan bool, 1) + m := make(map[int]int) + _ = m + go func() { + m = make(map[int]int) + ch <- true + }() + m = make(map[int]int) + <-ch +} + +func TestRaceMapVariable2(t *testing.T) { + ch := make(chan bool, 1) + m := make(map[int]int) + go func() { + m[1] = 1 + ch <- true + }() + m = make(map[int]int) + <-ch +} + +func TestRaceMapVariable3(t *testing.T) { + ch := make(chan bool, 1) + m := make(map[int]int) + go func() { + _ = m[1] + 
ch <- true + }() + m = make(map[int]int) + <-ch +} + +type Big struct { + x [17]int32 +} + +func TestRaceMapLookupPartKey(t *testing.T) { + k := &Big{} + m := make(map[Big]bool) + ch := make(chan bool, 1) + go func() { + k.x[8] = 1 + ch <- true + }() + _ = m[*k] + <-ch +} + +func TestRaceMapLookupPartKey2(t *testing.T) { + k := &Big{} + m := make(map[Big]bool) + ch := make(chan bool, 1) + go func() { + k.x[8] = 1 + ch <- true + }() + _, _ = m[*k] + <-ch +} +func TestRaceMapDeletePartKey(t *testing.T) { + k := &Big{} + m := make(map[Big]bool) + ch := make(chan bool, 1) + go func() { + k.x[8] = 1 + ch <- true + }() + delete(m, *k) + <-ch +} + +func TestRaceMapInsertPartKey(t *testing.T) { + k := &Big{} + m := make(map[Big]bool) + ch := make(chan bool, 1) + go func() { + k.x[8] = 1 + ch <- true + }() + m[*k] = true + <-ch +} + +func TestRaceMapInsertPartVal(t *testing.T) { + v := &Big{} + m := make(map[int]Big) + ch := make(chan bool, 1) + go func() { + v.x[8] = 1 + ch <- true + }() + m[1] = *v + <-ch +} + +// Test for issue 7561. +func TestRaceMapAssignMultipleReturn(t *testing.T) { + connect := func() (int, error) { return 42, nil } + conns := make(map[int][]int) + conns[1] = []int{0} + ch := make(chan bool, 1) + var err error + _ = err + go func() { + conns[1][0], err = connect() + ch <- true + }() + x := conns[1][0] + _ = x + <-ch +} + +// BigKey and BigVal must be larger than 256 bytes, +// so that compiler sets KindGCProg for them. 
+type BigKey [1000]*int + +type BigVal struct { + x int + y [1000]*int +} + +func TestRaceMapBigKeyAccess1(t *testing.T) { + m := make(map[BigKey]int) + var k BigKey + ch := make(chan bool, 1) + go func() { + _ = m[k] + ch <- true + }() + k[30] = new(int) + <-ch +} + +func TestRaceMapBigKeyAccess2(t *testing.T) { + m := make(map[BigKey]int) + var k BigKey + ch := make(chan bool, 1) + go func() { + _, _ = m[k] + ch <- true + }() + k[30] = new(int) + <-ch +} + +func TestRaceMapBigKeyInsert(t *testing.T) { + m := make(map[BigKey]int) + var k BigKey + ch := make(chan bool, 1) + go func() { + m[k] = 1 + ch <- true + }() + k[30] = new(int) + <-ch +} + +func TestRaceMapBigKeyDelete(t *testing.T) { + m := make(map[BigKey]int) + var k BigKey + ch := make(chan bool, 1) + go func() { + delete(m, k) + ch <- true + }() + k[30] = new(int) + <-ch +} + +func TestRaceMapBigValInsert(t *testing.T) { + m := make(map[int]BigVal) + var v BigVal + ch := make(chan bool, 1) + go func() { + m[1] = v + ch <- true + }() + v.y[30] = new(int) + <-ch +} + +func TestRaceMapBigValAccess1(t *testing.T) { + m := make(map[int]BigVal) + var v BigVal + ch := make(chan bool, 1) + go func() { + v = m[1] + ch <- true + }() + v.y[30] = new(int) + <-ch +} + +func TestRaceMapBigValAccess2(t *testing.T) { + m := make(map[int]BigVal) + var v BigVal + ch := make(chan bool, 1) + go func() { + v, _ = m[1] + ch <- true + }() + v.y[30] = new(int) + <-ch +} diff --git a/src/runtime/race/testdata/mop_test.go b/src/runtime/race/testdata/mop_test.go new file mode 100644 index 0000000..4a9ce26 --- /dev/null +++ b/src/runtime/race/testdata/mop_test.go @@ -0,0 +1,2131 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "bytes" + "errors" + "fmt" + "hash/crc32" + "io" + "os" + "runtime" + "sync" + "testing" + "time" + "unsafe" +) + +type Point struct { + x, y int +} + +type NamedPoint struct { + name string + p Point +} + +type DummyWriter struct { + state int +} +type Writer interface { + Write(p []byte) (n int) +} + +func (d DummyWriter) Write(p []byte) (n int) { + return 0 +} + +var GlobalX, GlobalY int = 0, 0 +var GlobalCh chan int = make(chan int, 2) + +func GlobalFunc1() { + GlobalY = GlobalX + GlobalCh <- 1 +} + +func GlobalFunc2() { + GlobalX = 1 + GlobalCh <- 1 +} + +func TestRaceIntRWGlobalFuncs(t *testing.T) { + go GlobalFunc1() + go GlobalFunc2() + <-GlobalCh + <-GlobalCh +} + +func TestRaceIntRWClosures(t *testing.T) { + var x, y int + _ = y + ch := make(chan int, 2) + + go func() { + y = x + ch <- 1 + }() + go func() { + x = 1 + ch <- 1 + }() + <-ch + <-ch +} + +func TestNoRaceIntRWClosures(t *testing.T) { + var x, y int + _ = y + ch := make(chan int, 1) + + go func() { + y = x + ch <- 1 + }() + <-ch + go func() { + x = 1 + ch <- 1 + }() + <-ch + +} + +func TestRaceInt32RWClosures(t *testing.T) { + var x, y int32 + _ = y + ch := make(chan bool, 2) + + go func() { + y = x + ch <- true + }() + go func() { + x = 1 + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceCase(t *testing.T) { + var y int + for x := -1; x <= 1; x++ { + switch { + case x < 0: + y = -1 + case x == 0: + y = 0 + case x > 0: + y = 1 + } + } + y++ +} + +func TestRaceCaseCondition(t *testing.T) { + var x int = 0 + ch := make(chan int, 2) + + go func() { + x = 2 + ch <- 1 + }() + go func() { + switch x < 2 { + case true: + x = 1 + //case false: + // x = 5 + } + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceCaseCondition2(t *testing.T) { + // switch body is rearranged by the compiler so the tests + // passes even if we don't instrument '<' + var x int = 0 + ch := make(chan int, 2) + + go func() { + x = 2 + ch <- 1 + }() + go func() { + switch x < 2 { + case true: + 
x = 1 + case false: + x = 5 + } + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceCaseBody(t *testing.T) { + var x, y int + _ = y + ch := make(chan int, 2) + + go func() { + y = x + ch <- 1 + }() + go func() { + switch { + default: + x = 1 + case x == 100: + x = -x + } + ch <- 1 + }() + <-ch + <-ch +} + +func TestNoRaceCaseFallthrough(t *testing.T) { + var x, y, z int + _ = y + ch := make(chan int, 2) + z = 1 + + go func() { + y = x + ch <- 1 + }() + go func() { + switch { + case z == 1: + case z == 2: + x = 2 + } + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceCaseFallthrough(t *testing.T) { + var x, y, z int + _ = y + ch := make(chan int, 2) + z = 1 + + go func() { + y = x + ch <- 1 + }() + go func() { + switch { + case z == 1: + fallthrough + case z == 2: + x = 2 + } + ch <- 1 + }() + + <-ch + <-ch +} + +func TestRaceCaseIssue6418(t *testing.T) { + m := map[string]map[string]string{ + "a": { + "b": "c", + }, + } + ch := make(chan int) + go func() { + m["a"]["x"] = "y" + ch <- 1 + }() + switch m["a"]["b"] { + } + <-ch +} + +func TestRaceCaseType(t *testing.T) { + var x, y int + var i any = x + c := make(chan int, 1) + go func() { + switch i.(type) { + case nil: + case int: + } + c <- 1 + }() + i = y + <-c +} + +func TestRaceCaseTypeBody(t *testing.T) { + var x, y int + var i any = &x + c := make(chan int, 1) + go func() { + switch i := i.(type) { + case nil: + case *int: + *i = y + } + c <- 1 + }() + x = y + <-c +} + +func TestRaceCaseTypeIssue5890(t *testing.T) { + // spurious extra instrumentation of the initial interface + // value. 
+ var x, y int + m := make(map[int]map[int]any) + m[0] = make(map[int]any) + c := make(chan int, 1) + go func() { + switch i := m[0][1].(type) { + case nil: + case *int: + *i = x + } + c <- 1 + }() + m[0][1] = y + <-c +} + +func TestNoRaceRange(t *testing.T) { + ch := make(chan int, 3) + a := [...]int{1, 2, 3} + for _, v := range a { + ch <- v + } + close(ch) +} + +func TestNoRaceRangeIssue5446(t *testing.T) { + ch := make(chan int, 3) + a := []int{1, 2, 3} + b := []int{4} + // used to insert a spurious instrumentation of a[i] + // and crash. + i := 1 + for i, a[i] = range b { + ch <- i + } + close(ch) +} + +func TestRaceRange(t *testing.T) { + const N = 2 + var a [N]int + var x, y int + _ = x + y + done := make(chan bool, N) + for i, v := range a { + go func(i int) { + // we don't want a write-vs-write race + // so there is no array b here + if i == 0 { + x = v + } else { + y = v + } + done <- true + }(i) + // Ensure the goroutine runs before we continue the loop. + runtime.Gosched() + } + for i := 0; i < N; i++ { + <-done + } +} + +func TestRaceForInit(t *testing.T) { + c := make(chan int) + x := 0 + go func() { + c <- x + }() + for x = 42; false; { + } + <-c +} + +func TestNoRaceForInit(t *testing.T) { + done := make(chan bool) + c := make(chan bool) + x := 0 + go func() { + for { + _, ok := <-c + if !ok { + done <- true + return + } + x++ + } + }() + i := 0 + for x = 42; i < 10; i++ { + c <- true + } + close(c) + <-done +} + +func TestRaceForTest(t *testing.T) { + done := make(chan bool) + c := make(chan bool) + stop := false + go func() { + for { + _, ok := <-c + if !ok { + done <- true + return + } + stop = true + } + }() + for !stop { + c <- true + } + close(c) + <-done +} + +func TestRaceForIncr(t *testing.T) { + done := make(chan bool) + c := make(chan bool) + x := 0 + go func() { + for { + _, ok := <-c + if !ok { + done <- true + return + } + x++ + } + }() + for i := 0; i < 10; x++ { + i++ + c <- true + } + close(c) + <-done +} + +func TestNoRaceForIncr(t 
*testing.T) { + done := make(chan bool) + x := 0 + go func() { + x++ + done <- true + }() + for i := 0; i < 0; x++ { + } + <-done +} + +func TestRacePlus(t *testing.T) { + var x, y, z int + _ = y + ch := make(chan int, 2) + + go func() { + y = x + z + ch <- 1 + }() + go func() { + y = x + z + z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRacePlus2(t *testing.T) { + var x, y, z int + _ = y + ch := make(chan int, 2) + + go func() { + x = 1 + ch <- 1 + }() + go func() { + y = +x + z + ch <- 1 + }() + <-ch + <-ch +} + +func TestNoRacePlus(t *testing.T) { + var x, y, z, f int + _ = x + y + f + ch := make(chan int, 2) + + go func() { + y = x + z + ch <- 1 + }() + go func() { + f = z + x + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceComplement(t *testing.T) { + var x, y, z int + _ = x + ch := make(chan int, 2) + + go func() { + x = ^y + ch <- 1 + }() + go func() { + y = ^z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceDiv(t *testing.T) { + var x, y, z int + _ = x + ch := make(chan int, 2) + + go func() { + x = y / (z + 1) + ch <- 1 + }() + go func() { + y = z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceDivConst(t *testing.T) { + var x, y, z uint32 + _ = x + ch := make(chan int, 2) + + go func() { + x = y / 3 // involves only a HMUL node + ch <- 1 + }() + go func() { + y = z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceMod(t *testing.T) { + var x, y, z int + _ = x + ch := make(chan int, 2) + + go func() { + x = y % (z + 1) + ch <- 1 + }() + go func() { + y = z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceModConst(t *testing.T) { + var x, y, z int + _ = x + ch := make(chan int, 2) + + go func() { + x = y % 3 + ch <- 1 + }() + go func() { + y = z + ch <- 1 + }() + <-ch + <-ch +} + +func TestRaceRotate(t *testing.T) { + var x, y, z uint32 + _ = x + ch := make(chan int, 2) + + go func() { + x = y<<12 | y>>20 + ch <- 1 + }() + go func() { + y = z + ch <- 1 + }() + <-ch + <-ch +} + +// May crash if the instrumentation is reckless. 
+func TestNoRaceEnoughRegisters(t *testing.T) { + // from erf.go + const ( + sa1 = 1 + sa2 = 2 + sa3 = 3 + sa4 = 4 + sa5 = 5 + sa6 = 6 + sa7 = 7 + sa8 = 8 + ) + var s, S float64 + s = 3.1415 + S = 1 + s*(sa1+s*(sa2+s*(sa3+s*(sa4+s*(sa5+s*(sa6+s*(sa7+s*sa8))))))) + s = S +} + +// emptyFunc should not be inlined. +func emptyFunc(x int) { + if false { + fmt.Println(x) + } +} + +func TestRaceFuncArgument(t *testing.T) { + var x int + ch := make(chan bool, 1) + go func() { + emptyFunc(x) + ch <- true + }() + x = 1 + <-ch +} + +func TestRaceFuncArgument2(t *testing.T) { + var x int + ch := make(chan bool, 2) + go func() { + x = 42 + ch <- true + }() + go func(y int) { + ch <- true + }(x) + <-ch + <-ch +} + +func TestRaceSprint(t *testing.T) { + var x int + ch := make(chan bool, 1) + go func() { + fmt.Sprint(x) + ch <- true + }() + x = 1 + <-ch +} + +func TestRaceArrayCopy(t *testing.T) { + ch := make(chan bool, 1) + var a [5]int + go func() { + a[3] = 1 + ch <- true + }() + a = [5]int{1, 2, 3, 4, 5} + <-ch +} + +// Blows up a naive compiler. +func TestRaceNestedArrayCopy(t *testing.T) { + ch := make(chan bool, 1) + type ( + Point32 [2][2][2][2][2]Point + Point1024 [2][2][2][2][2]Point32 + Point32k [2][2][2][2][2]Point1024 + Point1M [2][2][2][2][2]Point32k + ) + var a, b Point1M + go func() { + a[0][1][0][1][0][1][0][1][0][1][0][1][0][1][0][1][0][1][0][1].y = 1 + ch <- true + }() + a = b + <-ch +} + +func TestRaceStructRW(t *testing.T) { + p := Point{0, 0} + ch := make(chan bool, 1) + go func() { + p = Point{1, 1} + ch <- true + }() + q := p + <-ch + p = q +} + +func TestRaceStructFieldRW1(t *testing.T) { + p := Point{0, 0} + ch := make(chan bool, 1) + go func() { + p.x = 1 + ch <- true + }() + _ = p.x + <-ch +} + +func TestNoRaceStructFieldRW1(t *testing.T) { + // Same struct, different variables, no + // pointers. The layout is known (at compile time?) 
-> + // no read on p + // writes on x and y + p := Point{0, 0} + ch := make(chan bool, 1) + go func() { + p.x = 1 + ch <- true + }() + p.y = 1 + <-ch + _ = p +} + +func TestNoRaceStructFieldRW2(t *testing.T) { + // Same as NoRaceStructFieldRW1 + // but p is a pointer, so there is a read on p + p := Point{0, 0} + ch := make(chan bool, 1) + go func() { + p.x = 1 + ch <- true + }() + p.y = 1 + <-ch + _ = p +} + +func TestRaceStructFieldRW2(t *testing.T) { + p := &Point{0, 0} + ch := make(chan bool, 1) + go func() { + p.x = 1 + ch <- true + }() + _ = p.x + <-ch +} + +func TestRaceStructFieldRW3(t *testing.T) { + p := NamedPoint{name: "a", p: Point{0, 0}} + ch := make(chan bool, 1) + go func() { + p.p.x = 1 + ch <- true + }() + _ = p.p.x + <-ch +} + +func TestRaceEfaceWW(t *testing.T) { + var a, b any + ch := make(chan bool, 1) + go func() { + a = 1 + ch <- true + }() + a = 2 + <-ch + _, _ = a, b +} + +func TestRaceIfaceWW(t *testing.T) { + var a, b Writer + ch := make(chan bool, 1) + go func() { + a = DummyWriter{1} + ch <- true + }() + a = DummyWriter{2} + <-ch + b = a + a = b +} + +func TestRaceIfaceCmp(t *testing.T) { + var a, b Writer + a = DummyWriter{1} + ch := make(chan bool, 1) + go func() { + a = DummyWriter{1} + ch <- true + }() + _ = a == b + <-ch +} + +func TestRaceIfaceCmpNil(t *testing.T) { + var a Writer + a = DummyWriter{1} + ch := make(chan bool, 1) + go func() { + a = DummyWriter{1} + ch <- true + }() + _ = a == nil + <-ch +} + +func TestRaceEfaceConv(t *testing.T) { + c := make(chan bool) + v := 0 + go func() { + go func(x any) { + }(v) + c <- true + }() + v = 42 + <-c +} + +type OsFile struct{} + +func (*OsFile) Read() { +} + +type IoReader interface { + Read() +} + +func TestRaceIfaceConv(t *testing.T) { + c := make(chan bool) + f := &OsFile{} + go func() { + go func(x IoReader) { + }(f) + c <- true + }() + f = &OsFile{} + <-c +} + +func TestRaceError(t *testing.T) { + ch := make(chan bool, 1) + var err error + go func() { + err = nil + ch <- true 
+ }() + _ = err + <-ch +} + +func TestRaceIntptrRW(t *testing.T) { + var x, y int + var p *int = &x + ch := make(chan bool, 1) + go func() { + *p = 5 + ch <- true + }() + y = *p + x = y + <-ch +} + +func TestRaceStringRW(t *testing.T) { + ch := make(chan bool, 1) + s := "" + go func() { + s = "abacaba" + ch <- true + }() + _ = s + <-ch +} + +func TestRaceStringPtrRW(t *testing.T) { + ch := make(chan bool, 1) + var x string + p := &x + go func() { + *p = "a" + ch <- true + }() + _ = *p + <-ch +} + +func TestRaceFloat64WW(t *testing.T) { + var x, y float64 + ch := make(chan bool, 1) + go func() { + x = 1.0 + ch <- true + }() + x = 2.0 + <-ch + + y = x + x = y +} + +func TestRaceComplex128WW(t *testing.T) { + var x, y complex128 + ch := make(chan bool, 1) + go func() { + x = 2 + 2i + ch <- true + }() + x = 4 + 4i + <-ch + + y = x + x = y +} + +func TestRaceUnsafePtrRW(t *testing.T) { + var x, y, z int + x, y, z = 1, 2, 3 + var p unsafe.Pointer = unsafe.Pointer(&x) + ch := make(chan bool, 1) + go func() { + p = (unsafe.Pointer)(&z) + ch <- true + }() + y = *(*int)(p) + x = y + <-ch +} + +func TestRaceFuncVariableRW(t *testing.T) { + var f func(x int) int + f = func(x int) int { + return x * x + } + ch := make(chan bool, 1) + go func() { + f = func(x int) int { + return x + } + ch <- true + }() + y := f(1) + <-ch + x := y + y = x +} + +func TestRaceFuncVariableWW(t *testing.T) { + var f func(x int) int + _ = f + ch := make(chan bool, 1) + go func() { + f = func(x int) int { + return x + } + ch <- true + }() + f = func(x int) int { + return x * x + } + <-ch +} + +// This one should not belong to mop_test +func TestRacePanic(t *testing.T) { + var x int + _ = x + var zero int = 0 + ch := make(chan bool, 2) + go func() { + defer func() { + err := recover() + if err == nil { + panic("should be panicking") + } + x = 1 + ch <- true + }() + var y int = 1 / zero + zero = y + }() + go func() { + defer func() { + err := recover() + if err == nil { + panic("should be panicking") + 
} + x = 2 + ch <- true + }() + var y int = 1 / zero + zero = y + }() + + <-ch + <-ch + if zero != 0 { + panic("zero has changed") + } +} + +func TestNoRaceBlank(t *testing.T) { + var a [5]int + ch := make(chan bool, 1) + go func() { + _, _ = a[0], a[1] + ch <- true + }() + _, _ = a[2], a[3] + <-ch + a[1] = a[0] +} + +func TestRaceAppendRW(t *testing.T) { + a := make([]int, 10) + ch := make(chan bool) + go func() { + _ = append(a, 1) + ch <- true + }() + a[0] = 1 + <-ch +} + +func TestRaceAppendLenRW(t *testing.T) { + a := make([]int, 0) + ch := make(chan bool) + go func() { + a = append(a, 1) + ch <- true + }() + _ = len(a) + <-ch +} + +func TestRaceAppendCapRW(t *testing.T) { + a := make([]int, 0) + ch := make(chan string) + go func() { + a = append(a, 1) + ch <- "" + }() + _ = cap(a) + <-ch +} + +func TestNoRaceFuncArgsRW(t *testing.T) { + ch := make(chan byte, 1) + var x byte + go func(y byte) { + _ = y + ch <- 0 + }(x) + x = 1 + <-ch +} + +func TestRaceFuncArgsRW(t *testing.T) { + ch := make(chan byte, 1) + var x byte + go func(y *byte) { + _ = *y + ch <- 0 + }(&x) + x = 1 + <-ch +} + +// from the mailing list, slightly modified +// unprotected concurrent access to seen[] +func TestRaceCrawl(t *testing.T) { + url := "dummyurl" + depth := 3 + seen := make(map[string]bool) + ch := make(chan int, 100) + var wg sync.WaitGroup + var crawl func(string, int) + crawl = func(u string, d int) { + nurl := 0 + defer func() { + ch <- nurl + }() + seen[u] = true + if d <= 0 { + wg.Done() + return + } + urls := [...]string{"a", "b", "c"} + for _, uu := range urls { + if _, ok := seen[uu]; !ok { + wg.Add(1) + go crawl(uu, d-1) + nurl++ + } + } + wg.Done() + } + wg.Add(1) + go crawl(url, depth) + wg.Wait() +} + +func TestRaceIndirection(t *testing.T) { + ch := make(chan struct{}, 1) + var y int + var x *int = &y + go func() { + *x = 1 + ch <- struct{}{} + }() + *x = 2 + <-ch + _ = *x +} + +func TestRaceRune(t *testing.T) { + c := make(chan bool) + var x rune + go func() { + x = 
1 + c <- true + }() + _ = x + <-c +} + +func TestRaceEmptyInterface1(t *testing.T) { + c := make(chan bool) + var x any + go func() { + x = nil + c <- true + }() + _ = x + <-c +} + +func TestRaceEmptyInterface2(t *testing.T) { + c := make(chan bool) + var x any + go func() { + x = &Point{} + c <- true + }() + _ = x + <-c +} + +func TestRaceTLS(t *testing.T) { + comm := make(chan *int) + done := make(chan bool, 2) + go func() { + var x int + comm <- &x + x = 1 + x = *(<-comm) + done <- true + }() + go func() { + p := <-comm + *p = 2 + comm <- p + done <- true + }() + <-done + <-done +} + +func TestNoRaceHeapReallocation(t *testing.T) { + // It is possible that a future implementation + // of memory allocation will ruin this test. + // Increasing n might help in this case, so + // this test is a bit more generic than most of the + // others. + const n = 2 + done := make(chan bool, n) + empty := func(p *int) {} + for i := 0; i < n; i++ { + ms := i + go func() { + <-time.After(time.Duration(ms) * time.Millisecond) + runtime.GC() + var x int + empty(&x) // x goes to the heap + done <- true + }() + } + for i := 0; i < n; i++ { + <-done + } +} + +func TestRaceAnd(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + x = 1 + c <- true + }() + if x == 1 && y == 1 { + } + <-c +} + +func TestRaceAnd2(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + x = 1 + c <- true + }() + if y == 0 && x == 1 { + } + <-c +} + +func TestNoRaceAnd(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + x = 1 + c <- true + }() + if y == 1 && x == 1 { + } + <-c +} + +func TestRaceOr(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + x = 1 + c <- true + }() + if x == 1 || y == 1 { + } + <-c +} + +func TestRaceOr2(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + x = 1 + c <- true + }() + if y == 1 || x == 1 { + } + <-c +} + +func TestNoRaceOr(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() 
{ + x = 1 + c <- true + }() + if y == 0 || x == 1 { + } + <-c +} + +func TestNoRaceShortCalc(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + y = 1 + c <- true + }() + if x == 0 || y == 0 { + } + <-c +} + +func TestNoRaceShortCalc2(t *testing.T) { + c := make(chan bool) + x, y := 0, 0 + go func() { + y = 1 + c <- true + }() + if x == 1 && y == 0 { + } + <-c +} + +func TestRaceFuncItself(t *testing.T) { + c := make(chan bool) + f := func() {} + go func() { + f() + c <- true + }() + f = func() {} + <-c +} + +func TestNoRaceFuncUnlock(t *testing.T) { + ch := make(chan bool, 1) + var mu sync.Mutex + x := 0 + _ = x + go func() { + mu.Lock() + x = 42 + mu.Unlock() + ch <- true + }() + x = func(mu *sync.Mutex) int { + mu.Lock() + return 43 + }(&mu) + mu.Unlock() + <-ch +} + +func TestRaceStructInit(t *testing.T) { + type X struct { + x, y int + } + c := make(chan bool, 1) + y := 0 + go func() { + y = 42 + c <- true + }() + x := X{x: y} + _ = x + <-c +} + +func TestRaceArrayInit(t *testing.T) { + c := make(chan bool, 1) + y := 0 + go func() { + y = 42 + c <- true + }() + x := []int{0, y, 42} + _ = x + <-c +} + +func TestRaceMapInit(t *testing.T) { + c := make(chan bool, 1) + y := 0 + go func() { + y = 42 + c <- true + }() + x := map[int]int{0: 42, y: 42} + _ = x + <-c +} + +func TestRaceMapInit2(t *testing.T) { + c := make(chan bool, 1) + y := 0 + go func() { + y = 42 + c <- true + }() + x := map[int]int{0: 42, 42: y} + _ = x + <-c +} + +type Inter interface { + Foo(x int) +} +type InterImpl struct { + x, y int +} + +//go:noinline +func (p InterImpl) Foo(x int) { +} + +type InterImpl2 InterImpl + +func (p *InterImpl2) Foo(x int) { + if p == nil { + InterImpl{}.Foo(x) + } + InterImpl(*p).Foo(x) +} + +func TestRaceInterCall(t *testing.T) { + c := make(chan bool, 1) + p := InterImpl{} + var x Inter = p + go func() { + p2 := InterImpl{} + x = p2 + c <- true + }() + x.Foo(0) + <-c +} + +func TestRaceInterCall2(t *testing.T) { + c := make(chan bool, 1) + p 
:= InterImpl{} + var x Inter = p + z := 0 + go func() { + z = 42 + c <- true + }() + x.Foo(z) + <-c +} + +func TestRaceFuncCall(t *testing.T) { + c := make(chan bool, 1) + f := func(x, y int) {} + x, y := 0, 0 + go func() { + y = 42 + c <- true + }() + f(x, y) + <-c +} + +func TestRaceMethodCall(t *testing.T) { + c := make(chan bool, 1) + i := InterImpl{} + x := 0 + go func() { + x = 42 + c <- true + }() + i.Foo(x) + <-c +} + +func TestRaceMethodCall2(t *testing.T) { + c := make(chan bool, 1) + i := &InterImpl{} + go func() { + i = &InterImpl{} + c <- true + }() + i.Foo(0) + <-c +} + +// Method value with concrete value receiver. +func TestRaceMethodValue(t *testing.T) { + c := make(chan bool, 1) + i := InterImpl{} + go func() { + i = InterImpl{} + c <- true + }() + _ = i.Foo + <-c +} + +// Method value with interface receiver. +func TestRaceMethodValue2(t *testing.T) { + c := make(chan bool, 1) + var i Inter = InterImpl{} + go func() { + i = InterImpl{} + c <- true + }() + _ = i.Foo + <-c +} + +// Method value with implicit dereference. +func TestRaceMethodValue3(t *testing.T) { + c := make(chan bool, 1) + i := &InterImpl{} + go func() { + *i = InterImpl{} + c <- true + }() + _ = i.Foo // dereferences i. + <-c +} + +// Method value implicitly taking receiver address. +func TestNoRaceMethodValue(t *testing.T) { + c := make(chan bool, 1) + i := InterImpl2{} + go func() { + i = InterImpl2{} + c <- true + }() + _ = i.Foo // takes the address of i only. 
+ <-c +} + +func TestRacePanicArg(t *testing.T) { + c := make(chan bool, 1) + err := errors.New("err") + go func() { + err = errors.New("err2") + c <- true + }() + defer func() { + recover() + <-c + }() + panic(err) +} + +func TestRaceDeferArg(t *testing.T) { + c := make(chan bool, 1) + x := 0 + go func() { + x = 42 + c <- true + }() + func() { + defer func(x int) { + }(x) + }() + <-c +} + +type DeferT int + +func (d DeferT) Foo() { +} + +func TestRaceDeferArg2(t *testing.T) { + c := make(chan bool, 1) + var x DeferT + go func() { + var y DeferT + x = y + c <- true + }() + func() { + defer x.Foo() + }() + <-c +} + +func TestNoRaceAddrExpr(t *testing.T) { + c := make(chan bool, 1) + x := 0 + go func() { + x = 42 + c <- true + }() + _ = &x + <-c +} + +type AddrT struct { + _ [256]byte + x int +} + +type AddrT2 struct { + _ [512]byte + p *AddrT +} + +func TestRaceAddrExpr(t *testing.T) { + c := make(chan bool, 1) + a := AddrT2{p: &AddrT{x: 42}} + go func() { + a.p = &AddrT{x: 43} + c <- true + }() + _ = &a.p.x + <-c +} + +func TestRaceTypeAssert(t *testing.T) { + c := make(chan bool, 1) + x := 0 + var i any = x + go func() { + y := 0 + i = y + c <- true + }() + _ = i.(int) + <-c +} + +func TestRaceBlockAs(t *testing.T) { + c := make(chan bool, 1) + var x, y int + go func() { + x = 42 + c <- true + }() + x, y = y, x + <-c +} + +func TestRaceBlockCall1(t *testing.T) { + done := make(chan bool) + x, y := 0, 0 + go func() { + f := func() (int, int) { + return 42, 43 + } + x, y = f() + done <- true + }() + _ = x + <-done + if x != 42 || y != 43 { + panic("corrupted data") + } +} +func TestRaceBlockCall2(t *testing.T) { + done := make(chan bool) + x, y := 0, 0 + go func() { + f := func() (int, int) { + return 42, 43 + } + x, y = f() + done <- true + }() + _ = y + <-done + if x != 42 || y != 43 { + panic("corrupted data") + } +} +func TestRaceBlockCall3(t *testing.T) { + done := make(chan bool) + var x *int + y := 0 + go func() { + f := func() (*int, int) { + i := 42 + 
return &i, 43 + } + x, y = f() + done <- true + }() + _ = x + <-done + if *x != 42 || y != 43 { + panic("corrupted data") + } +} +func TestRaceBlockCall4(t *testing.T) { + done := make(chan bool) + x := 0 + var y *int + go func() { + f := func() (int, *int) { + i := 43 + return 42, &i + } + x, y = f() + done <- true + }() + _ = y + <-done + if x != 42 || *y != 43 { + panic("corrupted data") + } +} +func TestRaceBlockCall5(t *testing.T) { + done := make(chan bool) + var x *int + y := 0 + go func() { + f := func() (*int, int) { + i := 42 + return &i, 43 + } + x, y = f() + done <- true + }() + _ = y + <-done + if *x != 42 || y != 43 { + panic("corrupted data") + } +} +func TestRaceBlockCall6(t *testing.T) { + done := make(chan bool) + x := 0 + var y *int + go func() { + f := func() (int, *int) { + i := 43 + return 42, &i + } + x, y = f() + done <- true + }() + _ = x + <-done + if x != 42 || *y != 43 { + panic("corrupted data") + } +} +func TestRaceSliceSlice(t *testing.T) { + c := make(chan bool, 1) + x := make([]int, 10) + go func() { + x = make([]int, 20) + c <- true + }() + _ = x[2:3] + <-c +} + +func TestRaceSliceSlice2(t *testing.T) { + c := make(chan bool, 1) + x := make([]int, 10) + i := 2 + go func() { + i = 3 + c <- true + }() + _ = x[i:4] + <-c +} + +func TestRaceSliceString(t *testing.T) { + c := make(chan bool, 1) + x := "hello" + go func() { + x = "world" + c <- true + }() + _ = x[2:3] + <-c +} + +func TestRaceSliceStruct(t *testing.T) { + type X struct { + x, y int + } + c := make(chan bool, 1) + x := make([]X, 10) + go func() { + y := make([]X, 10) + copy(y, x) + c <- true + }() + x[1].y = 42 + <-c +} + +func TestRaceAppendSliceStruct(t *testing.T) { + type X struct { + x, y int + } + c := make(chan bool, 1) + x := make([]X, 10) + go func() { + y := make([]X, 0, 10) + y = append(y, x...) 
+ c <- true + }() + x[1].y = 42 + <-c +} + +func TestRaceStructInd(t *testing.T) { + c := make(chan bool, 1) + type Item struct { + x, y int + } + i := Item{} + go func(p *Item) { + *p = Item{} + c <- true + }(&i) + i.y = 42 + <-c +} + +func TestRaceAsFunc1(t *testing.T) { + var s []byte + c := make(chan bool, 1) + go func() { + var err error + s, err = func() ([]byte, error) { + t := []byte("hello world") + return t, nil + }() + c <- true + _ = err + }() + _ = string(s) + <-c +} + +func TestRaceAsFunc2(t *testing.T) { + c := make(chan bool, 1) + x := 0 + go func() { + func(x int) { + }(x) + c <- true + }() + x = 42 + <-c +} + +func TestRaceAsFunc3(t *testing.T) { + c := make(chan bool, 1) + var mu sync.Mutex + x := 0 + go func() { + func(x int) { + mu.Lock() + }(x) // Read of x must be outside of the mutex. + mu.Unlock() + c <- true + }() + mu.Lock() + x = 42 + mu.Unlock() + <-c +} + +func TestNoRaceAsFunc4(t *testing.T) { + c := make(chan bool, 1) + var mu sync.Mutex + x := 0 + _ = x + go func() { + x = func() int { // Write of x must be under the mutex. 
+ mu.Lock() + return 42 + }() + mu.Unlock() + c <- true + }() + mu.Lock() + x = 42 + mu.Unlock() + <-c +} + +func TestRaceHeapParam(t *testing.T) { + done := make(chan bool) + x := func() (x int) { + go func() { + x = 42 + done <- true + }() + return + }() + _ = x + <-done +} + +func TestNoRaceEmptyStruct(t *testing.T) { + type Empty struct{} + type X struct { + y int64 + Empty + } + type Y struct { + x X + y int64 + } + c := make(chan X) + var y Y + go func() { + x := y.x + c <- x + }() + y.y = 42 + <-c +} + +func TestRaceNestedStruct(t *testing.T) { + type X struct { + x, y int + } + type Y struct { + x X + } + c := make(chan Y) + var y Y + go func() { + c <- y + }() + y.x.y = 42 + <-c +} + +func TestRaceIssue5567(t *testing.T) { + testRaceRead(t, false) +} + +func TestRaceIssue51618(t *testing.T) { + testRaceRead(t, true) +} + +func testRaceRead(t *testing.T, pread bool) { + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(4)) + in := make(chan []byte) + res := make(chan error) + go func() { + var err error + defer func() { + close(in) + res <- err + }() + path := "mop_test.go" + f, err := os.Open(path) + if err != nil { + return + } + defer f.Close() + var n, total int + b := make([]byte, 17) // the race is on b buffer + for err == nil { + if pread { + n, err = f.ReadAt(b, int64(total)) + } else { + n, err = f.Read(b) + } + total += n + if n > 0 { + in <- b[:n] + } + } + if err == io.EOF { + err = nil + } + }() + h := crc32.New(crc32.MakeTable(0x12345678)) + for b := range in { + h.Write(b) + } + _ = h.Sum(nil) + err := <-res + if err != nil { + t.Fatal(err) + } +} + +func TestRaceIssue5654(t *testing.T) { + text := `Friends, Romans, countrymen, lend me your ears; +I come to bury Caesar, not to praise him. +The evil that men do lives after them; +The good is oft interred with their bones; +So let it be with Caesar. The noble Brutus +Hath told you Caesar was ambitious: +If it were so, it was a grievous fault, +And grievously hath Caesar answer'd it. 
+Here, under leave of Brutus and the rest - +For Brutus is an honourable man; +So are they all, all honourable men - +Come I to speak in Caesar's funeral. +He was my friend, faithful and just to me: +But Brutus says he was ambitious; +And Brutus is an honourable man.` + + data := bytes.NewBufferString(text) + in := make(chan []byte) + + go func() { + buf := make([]byte, 16) + var n int + var err error + for ; err == nil; n, err = data.Read(buf) { + in <- buf[:n] + } + close(in) + }() + res := "" + for s := range in { + res += string(s) + } + _ = res +} + +type Base int + +func (b *Base) Foo() int { + return 42 +} + +func (b Base) Bar() int { + return int(b) +} + +func TestNoRaceMethodThunk(t *testing.T) { + type Derived struct { + pad int + Base + } + var d Derived + done := make(chan bool) + go func() { + _ = d.Foo() + done <- true + }() + d = Derived{} + <-done +} + +func TestRaceMethodThunk(t *testing.T) { + type Derived struct { + pad int + *Base + } + var d Derived + done := make(chan bool) + go func() { + _ = d.Foo() + done <- true + }() + d = Derived{} + <-done +} + +func TestRaceMethodThunk2(t *testing.T) { + type Derived struct { + pad int + Base + } + var d Derived + done := make(chan bool) + go func() { + _ = d.Bar() + done <- true + }() + d = Derived{} + <-done +} + +func TestRaceMethodThunk3(t *testing.T) { + type Derived struct { + pad int + *Base + } + var d Derived + d.Base = new(Base) + done := make(chan bool) + go func() { + _ = d.Bar() + done <- true + }() + d.Base = new(Base) + <-done +} + +func TestRaceMethodThunk4(t *testing.T) { + type Derived struct { + pad int + *Base + } + var d Derived + d.Base = new(Base) + done := make(chan bool) + go func() { + _ = d.Bar() + done <- true + }() + *(*int)(d.Base) = 42 + <-done +} + +func TestNoRaceTinyAlloc(t *testing.T) { + const P = 4 + const N = 1e6 + var tinySink *byte + _ = tinySink + done := make(chan bool) + for p := 0; p < P; p++ { + go func() { + for i := 0; i < N; i++ { + var b byte + if b != 0 
{ + tinySink = &b // make it heap allocated + } + b = 42 + } + done <- true + }() + } + for p := 0; p < P; p++ { + <-done + } +} + +func TestNoRaceIssue60934(t *testing.T) { + // Test that runtime.RaceDisable state doesn't accidentally get applied to + // new goroutines. + + // Create several goroutines that end after calling runtime.RaceDisable. + var wg sync.WaitGroup + ready := make(chan struct{}) + wg.Add(32) + for i := 0; i < 32; i++ { + go func() { + <-ready // ensure we have multiple goroutines running at the same time + runtime.RaceDisable() + wg.Done() + }() + } + close(ready) + wg.Wait() + + // Make sure race detector still works. If the runtime.RaceDisable state + // leaks, the happens-before edges here will be ignored and a race on x will + // be reported. + var x int + ch := make(chan struct{}, 0) + wg.Add(2) + go func() { + x = 1 + ch <- struct{}{} + wg.Done() + }() + go func() { + <-ch + _ = x + wg.Done() + }() + wg.Wait() +} diff --git a/src/runtime/race/testdata/mutex_test.go b/src/runtime/race/testdata/mutex_test.go new file mode 100644 index 0000000..9dbed9a --- /dev/null +++ b/src/runtime/race/testdata/mutex_test.go @@ -0,0 +1,150 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "sync" + "testing" + "time" +) + +func TestNoRaceMutex(t *testing.T) { + var mu sync.Mutex + var x int16 = 0 + _ = x + ch := make(chan bool, 2) + go func() { + mu.Lock() + defer mu.Unlock() + x = 1 + ch <- true + }() + go func() { + mu.Lock() + x = 2 + mu.Unlock() + ch <- true + }() + <-ch + <-ch +} + +func TestRaceMutex(t *testing.T) { + var mu sync.Mutex + var x int16 = 0 + _ = x + ch := make(chan bool, 2) + go func() { + x = 1 + mu.Lock() + defer mu.Unlock() + ch <- true + }() + go func() { + x = 2 + mu.Lock() + mu.Unlock() + ch <- true + }() + <-ch + <-ch +} + +func TestRaceMutex2(t *testing.T) { + var mu1 sync.Mutex + var mu2 sync.Mutex + var x int8 = 0 + _ = x + ch := make(chan bool, 2) + go func() { + mu1.Lock() + defer mu1.Unlock() + x = 1 + ch <- true + }() + go func() { + mu2.Lock() + x = 2 + mu2.Unlock() + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceMutexPureHappensBefore(t *testing.T) { + var mu sync.Mutex + var x int16 = 0 + _ = x + written := false + ch := make(chan bool, 2) + go func() { + x = 1 + mu.Lock() + written = true + mu.Unlock() + ch <- true + }() + go func() { + time.Sleep(100 * time.Microsecond) + mu.Lock() + for !written { + mu.Unlock() + time.Sleep(100 * time.Microsecond) + mu.Lock() + } + mu.Unlock() + x = 1 + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceMutexSemaphore(t *testing.T) { + var mu sync.Mutex + ch := make(chan bool, 2) + x := 0 + _ = x + mu.Lock() + go func() { + x = 1 + mu.Unlock() + ch <- true + }() + go func() { + mu.Lock() + x = 2 + mu.Unlock() + ch <- true + }() + <-ch + <-ch +} + +// from doc/go_mem.html +func TestNoRaceMutexExampleFromHtml(t *testing.T) { + var l sync.Mutex + a := "" + + l.Lock() + go func() { + a = "hello, world" + l.Unlock() + }() + l.Lock() + _ = a +} + +func TestRaceMutexOverwrite(t *testing.T) { + c := make(chan bool, 1) + var mu sync.Mutex + go func() { + mu = sync.Mutex{} + c <- true + }() + mu.Lock() + <-c +} diff --git 
a/src/runtime/race/testdata/pool_test.go b/src/runtime/race/testdata/pool_test.go new file mode 100644 index 0000000..a96913e --- /dev/null +++ b/src/runtime/race/testdata/pool_test.go @@ -0,0 +1,47 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "sync" + "testing" + "time" +) + +func TestRacePool(t *testing.T) { + // Pool randomly drops the argument on the floor during Put. + // Repeat so that at least one iteration gets reuse. + for i := 0; i < 10; i++ { + c := make(chan int) + p := &sync.Pool{New: func() any { return make([]byte, 10) }} + x := p.Get().([]byte) + x[0] = 1 + p.Put(x) + go func() { + y := p.Get().([]byte) + y[0] = 2 + c <- 1 + }() + x[0] = 3 + <-c + } +} + +func TestNoRacePool(t *testing.T) { + for i := 0; i < 10; i++ { + p := &sync.Pool{New: func() any { return make([]byte, 10) }} + x := p.Get().([]byte) + x[0] = 1 + p.Put(x) + go func() { + y := p.Get().([]byte) + y[0] = 2 + p.Put(y) + }() + time.Sleep(100 * time.Millisecond) + x = p.Get().([]byte) + x[0] = 3 + } +} diff --git a/src/runtime/race/testdata/reflect_test.go b/src/runtime/race/testdata/reflect_test.go new file mode 100644 index 0000000..b567400 --- /dev/null +++ b/src/runtime/race/testdata/reflect_test.go @@ -0,0 +1,46 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "reflect" + "testing" +) + +func TestRaceReflectRW(t *testing.T) { + ch := make(chan bool, 1) + i := 0 + v := reflect.ValueOf(&i) + go func() { + v.Elem().Set(reflect.ValueOf(1)) + ch <- true + }() + _ = v.Elem().Int() + <-ch +} + +func TestRaceReflectWW(t *testing.T) { + ch := make(chan bool, 1) + i := 0 + v := reflect.ValueOf(&i) + go func() { + v.Elem().Set(reflect.ValueOf(1)) + ch <- true + }() + v.Elem().Set(reflect.ValueOf(2)) + <-ch +} + +func TestRaceReflectCopyWW(t *testing.T) { + ch := make(chan bool, 1) + a := make([]byte, 2) + v := reflect.ValueOf(a) + go func() { + reflect.Copy(v, v) + ch <- true + }() + reflect.Copy(v, v) + <-ch +} diff --git a/src/runtime/race/testdata/regression_test.go b/src/runtime/race/testdata/regression_test.go new file mode 100644 index 0000000..6a7802f --- /dev/null +++ b/src/runtime/race/testdata/regression_test.go @@ -0,0 +1,189 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Code patterns that caused problems in the past. 
+ +package race_test + +import ( + "testing" +) + +type LogImpl struct { + x int +} + +func NewLog() (l LogImpl) { + c := make(chan bool) + go func() { + _ = l + c <- true + }() + l = LogImpl{} + <-c + return +} + +var _ LogImpl = NewLog() + +func MakeMap() map[int]int { + return make(map[int]int) +} + +func InstrumentMapLen() { + _ = len(MakeMap()) +} + +func InstrumentMapLen2() { + m := make(map[int]map[int]int) + _ = len(m[0]) +} + +func InstrumentMapLen3() { + m := make(map[int]*map[int]int) + _ = len(*m[0]) +} + +func TestRaceUnaddressableMapLen(t *testing.T) { + m := make(map[int]map[int]int) + ch := make(chan int, 1) + m[0] = make(map[int]int) + go func() { + _ = len(m[0]) + ch <- 0 + }() + m[0][0] = 1 + <-ch +} + +type Rect struct { + x, y int +} + +type Image struct { + min, max Rect +} + +//go:noinline +func NewImage() Image { + return Image{} +} + +func AddrOfTemp() { + _ = NewImage().min +} + +type TypeID int + +func (t *TypeID) encodeType(x int) (tt TypeID, err error) { + switch x { + case 0: + return t.encodeType(x * x) + } + return 0, nil +} + +type stack []int + +func (s *stack) push(x int) { + *s = append(*s, x) +} + +func (s *stack) pop() int { + i := len(*s) + n := (*s)[i-1] + *s = (*s)[:i-1] + return n +} + +func TestNoRaceStackPushPop(t *testing.T) { + var s stack + go func(s *stack) {}(&s) + s.push(1) + x := s.pop() + _ = x +} + +type RpcChan struct { + c chan bool +} + +var makeChanCalls int + +//go:noinline +func makeChan() *RpcChan { + makeChanCalls++ + c := &RpcChan{make(chan bool, 1)} + c.c <- true + return c +} + +func call() bool { + x := <-makeChan().c + return x +} + +func TestNoRaceRpcChan(t *testing.T) { + makeChanCalls = 0 + _ = call() + if makeChanCalls != 1 { + t.Fatalf("makeChanCalls %d, expected 1\n", makeChanCalls) + } +} + +func divInSlice() { + v := make([]int64, 10) + i := 1 + _ = v[(i*4)/3] +} + +func TestNoRaceReturn(t *testing.T) { + c := make(chan int) + noRaceReturn(c) + <-c +} + +// Return used to do an implicit a = 
a, causing a read/write race +// with the goroutine. Compiler has an optimization to avoid that now. +// See issue 4014. +func noRaceReturn(c chan int) (a, b int) { + a = 42 + go func() { + _ = a + c <- 1 + }() + return a, 10 +} + +func issue5431() { + var p **inltype + if inlinetest(p).x && inlinetest(p).y { + } else if inlinetest(p).x || inlinetest(p).y { + } +} + +type inltype struct { + x, y bool +} + +func inlinetest(p **inltype) *inltype { + return *p +} + +type iface interface { + Foo() *struct{ b bool } +} + +type Int int + +func (i Int) Foo() *struct{ b bool } { + return &struct{ b bool }{false} +} + +func TestNoRaceForInfiniteLoop(t *testing.T) { + var x Int + // interface conversion causes nodes to be put on init list + for iface(x).Foo().b { + } +} diff --git a/src/runtime/race/testdata/rwmutex_test.go b/src/runtime/race/testdata/rwmutex_test.go new file mode 100644 index 0000000..39219e5 --- /dev/null +++ b/src/runtime/race/testdata/rwmutex_test.go @@ -0,0 +1,154 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "sync" + "testing" + "time" +) + +func TestRaceMutexRWMutex(t *testing.T) { + var mu1 sync.Mutex + var mu2 sync.RWMutex + var x int16 = 0 + _ = x + ch := make(chan bool, 2) + go func() { + mu1.Lock() + defer mu1.Unlock() + x = 1 + ch <- true + }() + go func() { + mu2.Lock() + x = 2 + mu2.Unlock() + ch <- true + }() + <-ch + <-ch +} + +func TestNoRaceRWMutex(t *testing.T) { + var mu sync.RWMutex + var x, y int64 = 0, 1 + _ = y + ch := make(chan bool, 2) + go func() { + mu.Lock() + defer mu.Unlock() + x = 2 + ch <- true + }() + go func() { + mu.RLock() + y = x + mu.RUnlock() + ch <- true + }() + <-ch + <-ch +} + +func TestRaceRWMutexMultipleReaders(t *testing.T) { + var mu sync.RWMutex + var x, y int64 = 0, 1 + ch := make(chan bool, 4) + go func() { + mu.Lock() + defer mu.Unlock() + x = 2 + ch <- true + }() + // Use three readers so that no matter what order they're + // scheduled in, two will be on the same side of the write + // lock above. + go func() { + mu.RLock() + y = x + 1 + mu.RUnlock() + ch <- true + }() + go func() { + mu.RLock() + y = x + 2 + mu.RUnlock() + ch <- true + }() + go func() { + mu.RLock() + y = x + 3 + mu.RUnlock() + ch <- true + }() + <-ch + <-ch + <-ch + <-ch + _ = y +} + +func TestNoRaceRWMutexMultipleReaders(t *testing.T) { + var mu sync.RWMutex + x := int64(0) + ch := make(chan bool, 4) + go func() { + mu.Lock() + defer mu.Unlock() + x = 2 + ch <- true + }() + go func() { + mu.RLock() + y := x + 1 + _ = y + mu.RUnlock() + ch <- true + }() + go func() { + mu.RLock() + y := x + 2 + _ = y + mu.RUnlock() + ch <- true + }() + go func() { + mu.RLock() + y := x + 3 + _ = y + mu.RUnlock() + ch <- true + }() + <-ch + <-ch + <-ch + <-ch +} + +func TestNoRaceRWMutexTransitive(t *testing.T) { + var mu sync.RWMutex + x := int64(0) + ch := make(chan bool, 2) + go func() { + mu.RLock() + _ = x + mu.RUnlock() + ch <- true + }() + go func() { + time.Sleep(1e7) + mu.RLock() + _ = x + mu.RUnlock() + ch <- true + }() + 
time.Sleep(2e7) + mu.Lock() + x = 42 + mu.Unlock() + <-ch + <-ch +} diff --git a/src/runtime/race/testdata/select_test.go b/src/runtime/race/testdata/select_test.go new file mode 100644 index 0000000..9a43f9b --- /dev/null +++ b/src/runtime/race/testdata/select_test.go @@ -0,0 +1,293 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "runtime" + "testing" +) + +func TestNoRaceSelect1(t *testing.T) { + var x int + _ = x + compl := make(chan bool) + c := make(chan bool) + c1 := make(chan bool) + + go func() { + x = 1 + // At least two channels are needed because + // otherwise the compiler optimizes select out. + // See comment in runtime/select.go:^func selectgo. + select { + case c <- true: + case c1 <- true: + } + compl <- true + }() + select { + case <-c: + case c1 <- true: + } + x = 2 + <-compl +} + +func TestNoRaceSelect2(t *testing.T) { + var x int + _ = x + compl := make(chan bool) + c := make(chan bool) + c1 := make(chan bool) + go func() { + select { + case <-c: + case <-c1: + } + x = 1 + compl <- true + }() + x = 2 + close(c) + runtime.Gosched() + <-compl +} + +func TestNoRaceSelect3(t *testing.T) { + var x int + _ = x + compl := make(chan bool) + c := make(chan bool, 10) + c1 := make(chan bool) + go func() { + x = 1 + select { + case c <- true: + case <-c1: + } + compl <- true + }() + <-c + x = 2 + <-compl +} + +func TestNoRaceSelect4(t *testing.T) { + type Task struct { + f func() + done chan bool + } + + queue := make(chan Task) + dummy := make(chan bool) + + go func() { + for { + select { + case t := <-queue: + t.f() + t.done <- true + } + } + }() + + doit := func(f func()) { + done := make(chan bool, 1) + select { + case queue <- Task{f, done}: + case <-dummy: + } + select { + case <-done: + case <-dummy: + } + } + + var x int + doit(func() { + x = 1 + }) + _ = x +} + +func TestNoRaceSelect5(t *testing.T) 
{ + test := func(sel, needSched bool) { + var x int + _ = x + ch := make(chan bool) + c1 := make(chan bool) + + done := make(chan bool, 2) + go func() { + if needSched { + runtime.Gosched() + } + // println(1) + x = 1 + if sel { + select { + case ch <- true: + case <-c1: + } + } else { + ch <- true + } + done <- true + }() + + go func() { + // println(2) + if sel { + select { + case <-ch: + case <-c1: + } + } else { + <-ch + } + x = 1 + done <- true + }() + <-done + <-done + } + + test(true, true) + test(true, false) + test(false, true) + test(false, false) +} + +func TestRaceSelect1(t *testing.T) { + var x int + _ = x + compl := make(chan bool, 2) + c := make(chan bool) + c1 := make(chan bool) + + go func() { + <-c + <-c + }() + f := func() { + select { + case c <- true: + case c1 <- true: + } + x = 1 + compl <- true + } + go f() + go f() + <-compl + <-compl +} + +func TestRaceSelect2(t *testing.T) { + var x int + _ = x + compl := make(chan bool) + c := make(chan bool) + c1 := make(chan bool) + go func() { + x = 1 + select { + case <-c: + case <-c1: + } + compl <- true + }() + close(c) + x = 2 + <-compl +} + +func TestRaceSelect3(t *testing.T) { + var x int + _ = x + compl := make(chan bool) + c := make(chan bool) + c1 := make(chan bool) + go func() { + x = 1 + select { + case c <- true: + case c1 <- true: + } + compl <- true + }() + x = 2 + select { + case <-c: + } + <-compl +} + +func TestRaceSelect4(t *testing.T) { + done := make(chan bool, 1) + var x int + go func() { + select { + default: + x = 2 + } + done <- true + }() + _ = x + <-done +} + +// The idea behind this test: +// there are two variables, access to one +// of them is synchronized, access to the other +// is not. +// Select must (unconditionally) choose the non-synchronized variable +// thus causing exactly one race. +// Currently this test doesn't look like it accomplishes +// this goal. 
+func TestRaceSelect5(t *testing.T) { + done := make(chan bool, 1) + c1 := make(chan bool, 1) + c2 := make(chan bool) + var x, y int + go func() { + select { + case c1 <- true: + x = 1 + case c2 <- true: + y = 1 + } + done <- true + }() + _ = x + _ = y + <-done +} + +// select statements may introduce +// flakiness: whether this test contains +// a race depends on the scheduling +// (some may argue that the code contains +// this race by definition) +/* +func TestFlakyDefault(t *testing.T) { + var x int + c := make(chan bool, 1) + done := make(chan bool, 1) + go func() { + select { + case <-c: + x = 2 + default: + x = 3 + } + done <- true + }() + x = 1 + c <- true + _ = x + <-done +} +*/ diff --git a/src/runtime/race/testdata/slice_test.go b/src/runtime/race/testdata/slice_test.go new file mode 100644 index 0000000..9009a9a --- /dev/null +++ b/src/runtime/race/testdata/slice_test.go @@ -0,0 +1,608 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package race_test + +import ( + "sync" + "testing" +) + +func TestRaceSliceRW(t *testing.T) { + ch := make(chan bool, 1) + a := make([]int, 2) + go func() { + a[1] = 1 + ch <- true + }() + _ = a[1] + <-ch +} + +func TestNoRaceSliceRW(t *testing.T) { + ch := make(chan bool, 1) + a := make([]int, 2) + go func() { + a[0] = 1 + ch <- true + }() + _ = a[1] + <-ch +} + +func TestRaceSliceWW(t *testing.T) { + a := make([]int, 10) + ch := make(chan bool, 1) + go func() { + a[1] = 1 + ch <- true + }() + a[1] = 2 + <-ch +} + +func TestNoRaceArrayWW(t *testing.T) { + var a [5]int + ch := make(chan bool, 1) + go func() { + a[0] = 1 + ch <- true + }() + a[1] = 2 + <-ch +} + +func TestRaceArrayWW(t *testing.T) { + var a [5]int + ch := make(chan bool, 1) + go func() { + a[1] = 1 + ch <- true + }() + a[1] = 2 + <-ch +} + +func TestNoRaceSliceWriteLen(t *testing.T) { + ch := make(chan bool, 1) + a := make([]bool, 1) + go func() { + a[0] = true + ch <- true + }() + _ = len(a) + <-ch +} + +func TestNoRaceSliceWriteCap(t *testing.T) { + ch := make(chan bool, 1) + a := make([]uint64, 100) + go func() { + a[50] = 123 + ch <- true + }() + _ = cap(a) + <-ch +} + +func TestRaceSliceCopyRead(t *testing.T) { + ch := make(chan bool, 1) + a := make([]int, 10) + b := make([]int, 10) + go func() { + _ = a[5] + ch <- true + }() + copy(a, b) + <-ch +} + +func TestNoRaceSliceWriteCopy(t *testing.T) { + ch := make(chan bool, 1) + a := make([]int, 10) + b := make([]int, 10) + go func() { + a[5] = 1 + ch <- true + }() + copy(a[:5], b[:5]) + <-ch +} + +func TestRaceSliceCopyWrite2(t *testing.T) { + ch := make(chan bool, 1) + a := make([]int, 10) + b := make([]int, 10) + go func() { + b[5] = 1 + ch <- true + }() + copy(a, b) + <-ch +} + +func TestRaceSliceCopyWrite3(t *testing.T) { + ch := make(chan bool, 1) + a := make([]byte, 10) + go func() { + a[7] = 1 + ch <- true + }() + copy(a, "qwertyqwerty") + <-ch +} + +func TestNoRaceSliceCopyRead(t *testing.T) { + ch := make(chan bool, 1) + a := 
make([]int, 10) + b := make([]int, 10) + go func() { + _ = b[5] + ch <- true + }() + copy(a, b) + <-ch +} + +func TestRacePointerSliceCopyRead(t *testing.T) { + ch := make(chan bool, 1) + a := make([]*int, 10) + b := make([]*int, 10) + go func() { + _ = a[5] + ch <- true + }() + copy(a, b) + <-ch +} + +func TestNoRacePointerSliceWriteCopy(t *testing.T) { + ch := make(chan bool, 1) + a := make([]*int, 10) + b := make([]*int, 10) + go func() { + a[5] = new(int) + ch <- true + }() + copy(a[:5], b[:5]) + <-ch +} + +func TestRacePointerSliceCopyWrite2(t *testing.T) { + ch := make(chan bool, 1) + a := make([]*int, 10) + b := make([]*int, 10) + go func() { + b[5] = new(int) + ch <- true + }() + copy(a, b) + <-ch +} + +func TestNoRacePointerSliceCopyRead(t *testing.T) { + ch := make(chan bool, 1) + a := make([]*int, 10) + b := make([]*int, 10) + go func() { + _ = b[5] + ch <- true + }() + copy(a, b) + <-ch +} + +func TestNoRaceSliceWriteSlice2(t *testing.T) { + ch := make(chan bool, 1) + a := make([]float64, 10) + go func() { + a[2] = 1.0 + ch <- true + }() + _ = a[0:5] + <-ch +} + +func TestRaceSliceWriteSlice(t *testing.T) { + ch := make(chan bool, 1) + a := make([]float64, 10) + go func() { + a[2] = 1.0 + ch <- true + }() + a = a[5:10] + <-ch +} + +func TestNoRaceSliceWriteSlice(t *testing.T) { + ch := make(chan bool, 1) + a := make([]float64, 10) + go func() { + a[2] = 1.0 + ch <- true + }() + _ = a[5:10] + <-ch +} + +func TestNoRaceSliceLenCap(t *testing.T) { + ch := make(chan bool, 1) + a := make([]struct{}, 10) + go func() { + _ = len(a) + ch <- true + }() + _ = cap(a) + <-ch +} + +func TestNoRaceStructSlicesRangeWrite(t *testing.T) { + type Str struct { + a []int + b []int + } + ch := make(chan bool, 1) + var s Str + s.a = make([]int, 10) + s.b = make([]int, 10) + go func() { + for range s.a { + } + ch <- true + }() + s.b[5] = 5 + <-ch +} + +func TestRaceSliceDifferent(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + s2 := s + go func() { + s[3] = 
3 + c <- true + }() + // false negative because s2 is PAUTO w/o PHEAP + // so we do not instrument it + s2[3] = 3 + <-c +} + +func TestRaceSliceRangeWrite(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s[3] = 3 + c <- true + }() + for _, v := range s { + _ = v + } + <-c +} + +func TestNoRaceSliceRangeWrite(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s[3] = 3 + c <- true + }() + for range s { + } + <-c +} + +func TestRaceSliceRangeAppend(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s = append(s, 3) + c <- true + }() + for range s { + } + <-c +} + +func TestNoRaceSliceRangeAppend(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + _ = append(s, 3) + c <- true + }() + for range s { + } + <-c +} + +func TestRaceSliceVarWrite(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s[3] = 3 + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceVarRead(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + _ = s[3] + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceVarRange(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + for range s { + } + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceVarAppend(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + _ = append(s, 10) + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceVarCopy(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s2 := make([]int, 10) + copy(s, s2) + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceVarCopy2(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s2 := make([]int, 10) + copy(s2, s) + c <- true + }() + s = make([]int, 20) + <-c +} + +func TestRaceSliceAppend(t *testing.T) { + c := make(chan bool, 1) + s := 
make([]int, 10, 20) + go func() { + _ = append(s, 1) + c <- true + }() + _ = append(s, 2) + <-c +} + +func TestRaceSliceAppendWrite(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + _ = append(s, 1) + c <- true + }() + s[0] = 42 + <-c +} + +func TestRaceSliceAppendSlice(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + go func() { + s2 := make([]int, 10) + _ = append(s, s2...) + c <- true + }() + s[0] = 42 + <-c +} + +func TestRaceSliceAppendSlice2(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + s2foobar := make([]int, 10) + go func() { + _ = append(s, s2foobar...) + c <- true + }() + s2foobar[5] = 42 + <-c +} + +func TestRaceSliceAppendString(t *testing.T) { + c := make(chan bool, 1) + s := make([]byte, 10) + go func() { + _ = append(s, "qwerty"...) + c <- true + }() + s[0] = 42 + <-c +} + +func TestRacePointerSliceAppend(t *testing.T) { + c := make(chan bool, 1) + s := make([]*int, 10, 20) + go func() { + _ = append(s, new(int)) + c <- true + }() + _ = append(s, new(int)) + <-c +} + +func TestRacePointerSliceAppendWrite(t *testing.T) { + c := make(chan bool, 1) + s := make([]*int, 10) + go func() { + _ = append(s, new(int)) + c <- true + }() + s[0] = new(int) + <-c +} + +func TestRacePointerSliceAppendSlice(t *testing.T) { + c := make(chan bool, 1) + s := make([]*int, 10) + go func() { + s2 := make([]*int, 10) + _ = append(s, s2...) + c <- true + }() + s[0] = new(int) + <-c +} + +func TestRacePointerSliceAppendSlice2(t *testing.T) { + c := make(chan bool, 1) + s := make([]*int, 10) + s2foobar := make([]*int, 10) + go func() { + _ = append(s, s2foobar...) 
+ c <- true + }() + println("WRITE:", &s2foobar[5]) + s2foobar[5] = nil + <-c +} + +func TestNoRaceSliceIndexAccess(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + v := 0 + go func() { + _ = v + c <- true + }() + s[v] = 1 + <-c +} + +func TestNoRaceSliceIndexAccess2(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + v := 0 + go func() { + _ = v + c <- true + }() + _ = s[v] + <-c +} + +func TestRaceSliceIndexAccess(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + v := 0 + go func() { + v = 1 + c <- true + }() + s[v] = 1 + <-c +} + +func TestRaceSliceIndexAccess2(t *testing.T) { + c := make(chan bool, 1) + s := make([]int, 10) + v := 0 + go func() { + v = 1 + c <- true + }() + _ = s[v] + <-c +} + +func TestRaceSliceByteToString(t *testing.T) { + c := make(chan string) + s := make([]byte, 10) + go func() { + c <- string(s) + }() + s[0] = 42 + <-c +} + +func TestRaceSliceRuneToString(t *testing.T) { + c := make(chan string) + s := make([]rune, 10) + go func() { + c <- string(s) + }() + s[9] = 42 + <-c +} + +func TestRaceConcatString(t *testing.T) { + s := "hello" + c := make(chan string, 1) + go func() { + c <- s + " world" + }() + s = "world" + <-c +} + +func TestRaceCompareString(t *testing.T) { + s1 := "hello" + s2 := "world" + c := make(chan bool, 1) + go func() { + c <- s1 == s2 + }() + s1 = s2 + <-c +} + +func TestRaceSlice3(t *testing.T) { + done := make(chan bool) + x := make([]int, 10) + i := 2 + go func() { + i = 3 + done <- true + }() + _ = x[:1:i] + <-done +} + +var saved string + +func TestRaceSlice4(t *testing.T) { + // See issue 36794. 
+ data := []byte("hello there") + var wg sync.WaitGroup + wg.Add(1) + go func() { + _ = string(data) + wg.Done() + }() + copy(data, data[2:]) + wg.Wait() +} diff --git a/src/runtime/race/testdata/sync_test.go b/src/runtime/race/testdata/sync_test.go new file mode 100644 index 0000000..b5fcd6c --- /dev/null +++ b/src/runtime/race/testdata/sync_test.go @@ -0,0 +1,202 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "sync" + "testing" + "time" +) + +func TestNoRaceCond(t *testing.T) { + x := 0 + _ = x + condition := 0 + var mu sync.Mutex + cond := sync.NewCond(&mu) + go func() { + x = 1 + mu.Lock() + condition = 1 + cond.Signal() + mu.Unlock() + }() + mu.Lock() + for condition != 1 { + cond.Wait() + } + mu.Unlock() + x = 2 +} + +func TestRaceCond(t *testing.T) { + done := make(chan bool) + var mu sync.Mutex + cond := sync.NewCond(&mu) + x := 0 + _ = x + condition := 0 + go func() { + time.Sleep(10 * time.Millisecond) // Enter cond.Wait loop + x = 1 + mu.Lock() + condition = 1 + cond.Signal() + mu.Unlock() + time.Sleep(10 * time.Millisecond) // Exit cond.Wait loop + mu.Lock() + x = 3 + mu.Unlock() + done <- true + }() + mu.Lock() + for condition != 1 { + cond.Wait() + } + mu.Unlock() + x = 2 + <-done +} + +// We do not currently automatically +// parse this test. 
It is intended that the creation +// stack is observed manually not to contain +// off-by-one errors +func TestRaceAnnounceThreads(t *testing.T) { + const N = 7 + allDone := make(chan bool, N) + + var x int + _ = x + + var f, g, h func() + f = func() { + x = 1 + go g() + go func() { + x = 1 + allDone <- true + }() + x = 2 + allDone <- true + } + + g = func() { + for i := 0; i < 2; i++ { + go func() { + x = 1 + allDone <- true + }() + allDone <- true + } + } + + h = func() { + x = 1 + x = 2 + go f() + allDone <- true + } + + go h() + + for i := 0; i < N; i++ { + <-allDone + } +} + +func TestNoRaceAfterFunc1(t *testing.T) { + i := 2 + c := make(chan bool) + var f func() + f = func() { + i-- + if i >= 0 { + time.AfterFunc(0, f) + } else { + c <- true + } + } + + time.AfterFunc(0, f) + <-c +} + +func TestNoRaceAfterFunc2(t *testing.T) { + var x int + _ = x + timer := time.AfterFunc(10, func() { + x = 1 + }) + defer timer.Stop() +} + +func TestNoRaceAfterFunc3(t *testing.T) { + c := make(chan bool, 1) + x := 0 + _ = x + time.AfterFunc(1e7, func() { + x = 1 + c <- true + }) + <-c +} + +func TestRaceAfterFunc3(t *testing.T) { + c := make(chan bool, 2) + x := 0 + _ = x + time.AfterFunc(1e7, func() { + x = 1 + c <- true + }) + time.AfterFunc(2e7, func() { + x = 2 + c <- true + }) + <-c + <-c +} + +// This test's output is intended to be +// observed manually. One should check +// that goroutine creation stack is +// comprehensible. +func TestRaceGoroutineCreationStack(t *testing.T) { + var x int + _ = x + var ch = make(chan bool, 1) + + f1 := func() { + x = 1 + ch <- true + } + f2 := func() { go f1() } + f3 := func() { go f2() } + f4 := func() { go f3() } + + go f4() + x = 2 + <-ch +} + +// A nil pointer in a mutex method call should not +// corrupt the race detector state. +// Used to hang indefinitely. 
+func TestNoRaceNilMutexCrash(t *testing.T) { + var mutex sync.Mutex + panics := 0 + defer func() { + if x := recover(); x != nil { + mutex.Lock() + panics++ + mutex.Unlock() + } else { + panic("no panic") + } + }() + var othermutex *sync.RWMutex + othermutex.RLock() +} diff --git a/src/runtime/race/testdata/waitgroup_test.go b/src/runtime/race/testdata/waitgroup_test.go new file mode 100644 index 0000000..1693373 --- /dev/null +++ b/src/runtime/race/testdata/waitgroup_test.go @@ -0,0 +1,360 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package race_test + +import ( + "runtime" + "sync" + "testing" + "time" +) + +func TestNoRaceWaitGroup(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + n := 1 + for i := 0; i < n; i++ { + wg.Add(1) + j := i + go func() { + x = j + wg.Done() + }() + } + wg.Wait() +} + +func TestRaceWaitGroup(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + n := 2 + for i := 0; i < n; i++ { + wg.Add(1) + j := i + go func() { + x = j + wg.Done() + }() + } + wg.Wait() +} + +func TestNoRaceWaitGroup2(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + wg.Add(1) + go func() { + x = 1 + wg.Done() + }() + wg.Wait() + x = 2 +} + +// incrementing counter in Add and locking wg's mutex +func TestRaceWaitGroupAsMutex(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + c := make(chan bool, 2) + go func() { + wg.Wait() + time.Sleep(100 * time.Millisecond) + wg.Add(+1) + x = 1 + wg.Add(-1) + c <- true + }() + go func() { + wg.Wait() + time.Sleep(100 * time.Millisecond) + wg.Add(+1) + x = 2 + wg.Add(-1) + c <- true + }() + <-c + <-c +} + +// Incorrect usage: Add is too late. 
+func TestRaceWaitGroupWrongWait(t *testing.T) { + c := make(chan bool, 2) + var x int + _ = x + var wg sync.WaitGroup + go func() { + wg.Add(1) + runtime.Gosched() + x = 1 + wg.Done() + c <- true + }() + go func() { + wg.Add(1) + runtime.Gosched() + x = 2 + wg.Done() + c <- true + }() + wg.Wait() + <-c + <-c +} + +func TestRaceWaitGroupWrongAdd(t *testing.T) { + c := make(chan bool, 2) + var wg sync.WaitGroup + go func() { + wg.Add(1) + time.Sleep(100 * time.Millisecond) + wg.Done() + c <- true + }() + go func() { + wg.Add(1) + time.Sleep(100 * time.Millisecond) + wg.Done() + c <- true + }() + time.Sleep(50 * time.Millisecond) + wg.Wait() + <-c + <-c +} + +func TestNoRaceWaitGroupMultipleWait(t *testing.T) { + c := make(chan bool, 2) + var wg sync.WaitGroup + go func() { + wg.Wait() + c <- true + }() + go func() { + wg.Wait() + c <- true + }() + wg.Wait() + <-c + <-c +} + +func TestNoRaceWaitGroupMultipleWait2(t *testing.T) { + c := make(chan bool, 2) + var wg sync.WaitGroup + wg.Add(2) + go func() { + wg.Done() + wg.Wait() + c <- true + }() + go func() { + wg.Done() + wg.Wait() + c <- true + }() + wg.Wait() + <-c + <-c +} + +func TestNoRaceWaitGroupMultipleWait3(t *testing.T) { + const P = 3 + var data [P]int + done := make(chan bool, P) + var wg sync.WaitGroup + wg.Add(P) + for p := 0; p < P; p++ { + go func(p int) { + data[p] = 42 + wg.Done() + }(p) + } + for p := 0; p < P; p++ { + go func() { + wg.Wait() + for p1 := 0; p1 < P; p1++ { + _ = data[p1] + } + done <- true + }() + } + for p := 0; p < P; p++ { + <-done + } +} + +// Correct usage but still a race +func TestRaceWaitGroup2(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + wg.Add(2) + go func() { + x = 1 + wg.Done() + }() + go func() { + x = 2 + wg.Done() + }() + wg.Wait() +} + +func TestNoRaceWaitGroupPanicRecover(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + defer func() { + err := recover() + if err != "sync: negative WaitGroup counter" { + t.Fatalf("Unexpected panic: %#v", 
err) + } + x = 2 + }() + x = 1 + wg.Add(-1) +} + +// TODO: this is actually a panic-synchronization test, not a +// WaitGroup test. Move it to another *_test file +// Is it possible to get a race by synchronization via panic? +func TestNoRaceWaitGroupPanicRecover2(t *testing.T) { + var x int + _ = x + var wg sync.WaitGroup + ch := make(chan bool, 1) + var f func() = func() { + x = 2 + ch <- true + } + go func() { + defer func() { + err := recover() + if err != "sync: negative WaitGroup counter" { + } + go f() + }() + x = 1 + wg.Add(-1) + }() + + <-ch +} + +func TestNoRaceWaitGroupTransitive(t *testing.T) { + x, y := 0, 0 + var wg sync.WaitGroup + wg.Add(2) + go func() { + x = 42 + wg.Done() + }() + go func() { + time.Sleep(1e7) + y = 42 + wg.Done() + }() + wg.Wait() + _ = x + _ = y +} + +func TestNoRaceWaitGroupReuse(t *testing.T) { + const P = 3 + var data [P]int + var wg sync.WaitGroup + for try := 0; try < 3; try++ { + wg.Add(P) + for p := 0; p < P; p++ { + go func(p int) { + data[p]++ + wg.Done() + }(p) + } + wg.Wait() + for p := 0; p < P; p++ { + data[p]++ + } + } +} + +func TestNoRaceWaitGroupReuse2(t *testing.T) { + const P = 3 + var data [P]int + var wg sync.WaitGroup + for try := 0; try < 3; try++ { + wg.Add(P) + for p := 0; p < P; p++ { + go func(p int) { + data[p]++ + wg.Done() + }(p) + } + done := make(chan bool) + go func() { + wg.Wait() + for p := 0; p < P; p++ { + data[p]++ + } + done <- true + }() + wg.Wait() + <-done + for p := 0; p < P; p++ { + data[p]++ + } + } +} + +func TestRaceWaitGroupReuse(t *testing.T) { + const P = 3 + const T = 3 + done := make(chan bool, T) + var wg sync.WaitGroup + for try := 0; try < T; try++ { + var data [P]int + wg.Add(P) + for p := 0; p < P; p++ { + go func(p int) { + time.Sleep(50 * time.Millisecond) + data[p]++ + wg.Done() + }(p) + } + go func() { + wg.Wait() + for p := 0; p < P; p++ { + data[p]++ + } + done <- true + }() + time.Sleep(100 * time.Millisecond) + wg.Wait() + } + for try := 0; try < T; try++ { + 
<-done + } +} + +func TestNoRaceWaitGroupConcurrentAdd(t *testing.T) { + const P = 4 + waiting := make(chan bool, P) + var wg sync.WaitGroup + for p := 0; p < P; p++ { + go func() { + wg.Add(1) + waiting <- true + wg.Done() + }() + } + for p := 0; p < P; p++ { + <-waiting + } + wg.Wait() +} diff --git a/src/runtime/race/timer_test.go b/src/runtime/race/timer_test.go new file mode 100644 index 0000000..dd59005 --- /dev/null +++ b/src/runtime/race/timer_test.go @@ -0,0 +1,33 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +package race_test + +import ( + "sync" + "testing" + "time" +) + +func TestTimers(t *testing.T) { + const goroutines = 8 + var wg sync.WaitGroup + wg.Add(goroutines) + var mu sync.Mutex + for i := 0; i < goroutines; i++ { + go func() { + defer wg.Done() + ticker := time.NewTicker(1) + defer ticker.Stop() + for c := 0; c < 1000; c++ { + <-ticker.C + mu.Lock() + mu.Unlock() + } + }() + } + wg.Wait() +} diff --git a/src/runtime/race0.go b/src/runtime/race0.go new file mode 100644 index 0000000..f36d438 --- /dev/null +++ b/src/runtime/race0.go @@ -0,0 +1,44 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !race + +// Dummy race detection API, used when not built with -race. + +package runtime + +import ( + "unsafe" +) + +const raceenabled = false + +// Because raceenabled is false, none of these functions should be called. 
+ +func raceReadObjectPC(t *_type, addr unsafe.Pointer, callerpc, pc uintptr) { throw("race") } +func raceWriteObjectPC(t *_type, addr unsafe.Pointer, callerpc, pc uintptr) { throw("race") } +func raceinit() (uintptr, uintptr) { throw("race"); return 0, 0 } +func racefini() { throw("race") } +func raceproccreate() uintptr { throw("race"); return 0 } +func raceprocdestroy(ctx uintptr) { throw("race") } +func racemapshadow(addr unsafe.Pointer, size uintptr) { throw("race") } +func racewritepc(addr unsafe.Pointer, callerpc, pc uintptr) { throw("race") } +func racereadpc(addr unsafe.Pointer, callerpc, pc uintptr) { throw("race") } +func racereadrangepc(addr unsafe.Pointer, sz, callerpc, pc uintptr) { throw("race") } +func racewriterangepc(addr unsafe.Pointer, sz, callerpc, pc uintptr) { throw("race") } +func raceacquire(addr unsafe.Pointer) { throw("race") } +func raceacquireg(gp *g, addr unsafe.Pointer) { throw("race") } +func raceacquirectx(racectx uintptr, addr unsafe.Pointer) { throw("race") } +func racerelease(addr unsafe.Pointer) { throw("race") } +func racereleaseg(gp *g, addr unsafe.Pointer) { throw("race") } +func racereleaseacquire(addr unsafe.Pointer) { throw("race") } +func racereleaseacquireg(gp *g, addr unsafe.Pointer) { throw("race") } +func racereleasemerge(addr unsafe.Pointer) { throw("race") } +func racereleasemergeg(gp *g, addr unsafe.Pointer) { throw("race") } +func racefingo() { throw("race") } +func racemalloc(p unsafe.Pointer, sz uintptr) { throw("race") } +func racefree(p unsafe.Pointer, sz uintptr) { throw("race") } +func racegostart(pc uintptr) uintptr { throw("race"); return 0 } +func racegoend() { throw("race") } +func racectxend(racectx uintptr) { throw("race") } diff --git a/src/runtime/race_amd64.s b/src/runtime/race_amd64.s new file mode 100644 index 0000000..34ec200 --- /dev/null +++ b/src/runtime/race_amd64.s @@ -0,0 +1,457 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +// The following thunks allow calling the gcc-compiled race runtime directly +// from Go code without going all the way through cgo. +// First, it's much faster (up to 50% speedup for real Go programs). +// Second, it eliminates race-related special cases from cgocall and scheduler. +// Third, in long-term it will allow to remove cyclic runtime/race dependency on cmd/go. + +// A brief recap of the amd64 calling convention. +// Arguments are passed in DI, SI, DX, CX, R8, R9, the rest is on stack. +// Callee-saved registers are: BX, BP, R12-R15. +// SP must be 16-byte aligned. +// On Windows: +// Arguments are passed in CX, DX, R8, R9, the rest is on stack. +// Callee-saved registers are: BX, BP, DI, SI, R12-R15. +// SP must be 16-byte aligned. Windows also requires "stack-backing" for the 4 register arguments: +// https://msdn.microsoft.com/en-us/library/ms235286.aspx +// We do not do this, because it seems to be intended for vararg/unprototyped functions. +// Gcc-compiled race runtime does not try to use that space. + +#ifdef GOOS_windows +#define RARG0 CX +#define RARG1 DX +#define RARG2 R8 +#define RARG3 R9 +#else +#define RARG0 DI +#define RARG1 SI +#define RARG2 DX +#define RARG3 CX +#endif + +// func runtime·raceread(addr uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would render runtime.getcallerpc ineffective. 
+TEXT runtime·raceread<ABIInternal>(SB), NOSPLIT, $0-8 + MOVQ AX, RARG1 + MOVQ (SP), RARG2 + // void __tsan_read(ThreadState *thr, void *addr, void *pc); + MOVQ $__tsan_read(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·RaceRead(addr uintptr) +TEXT runtime·RaceRead(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because raceread reads caller pc. + JMP runtime·raceread(SB) + +// void runtime·racereadpc(void *addr, void *callpc, void *pc) +TEXT runtime·racereadpc(SB), NOSPLIT, $0-24 + MOVQ addr+0(FP), RARG1 + MOVQ callpc+8(FP), RARG2 + MOVQ pc+16(FP), RARG3 + ADDQ $1, RARG3 // pc is function start, tsan wants return address + // void __tsan_read_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVQ $__tsan_read_pc(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·racewrite(addr uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would render runtime.getcallerpc ineffective. +TEXT runtime·racewrite<ABIInternal>(SB), NOSPLIT, $0-8 + MOVQ AX, RARG1 + MOVQ (SP), RARG2 + // void __tsan_write(ThreadState *thr, void *addr, void *pc); + MOVQ $__tsan_write(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·RaceWrite(addr uintptr) +TEXT runtime·RaceWrite(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because racewrite reads caller pc. + JMP runtime·racewrite(SB) + +// void runtime·racewritepc(void *addr, void *callpc, void *pc) +TEXT runtime·racewritepc(SB), NOSPLIT, $0-24 + MOVQ addr+0(FP), RARG1 + MOVQ callpc+8(FP), RARG2 + MOVQ pc+16(FP), RARG3 + ADDQ $1, RARG3 // pc is function start, tsan wants return address + // void __tsan_write_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVQ $__tsan_write_pc(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·racereadrange(addr, size uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would render runtime.getcallerpc ineffective. 
+TEXT runtime·racereadrange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVQ AX, RARG1 + MOVQ BX, RARG2 + MOVQ (SP), RARG3 + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVQ $__tsan_read_range(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·RaceReadRange(addr, size uintptr) +TEXT runtime·RaceReadRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racereadrange reads caller pc. + JMP runtime·racereadrange(SB) + +// void runtime·racereadrangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racereadrangepc1(SB), NOSPLIT, $0-24 + MOVQ addr+0(FP), RARG1 + MOVQ size+8(FP), RARG2 + MOVQ pc+16(FP), RARG3 + ADDQ $1, RARG3 // pc is function start, tsan wants return address + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVQ $__tsan_read_range(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·racewriterange(addr, size uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would render runtime.getcallerpc ineffective. +TEXT runtime·racewriterange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVQ AX, RARG1 + MOVQ BX, RARG2 + MOVQ (SP), RARG3 + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVQ $__tsan_write_range(SB), AX + JMP racecalladdr<>(SB) + +// func runtime·RaceWriteRange(addr, size uintptr) +TEXT runtime·RaceWriteRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racewriterange reads caller pc. 
+ JMP runtime·racewriterange(SB) + +// void runtime·racewriterangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racewriterangepc1(SB), NOSPLIT, $0-24 + MOVQ addr+0(FP), RARG1 + MOVQ size+8(FP), RARG2 + MOVQ pc+16(FP), RARG3 + ADDQ $1, RARG3 // pc is function start, tsan wants return address + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVQ $__tsan_write_range(SB), AX + JMP racecalladdr<>(SB) + +// If addr (RARG1) is out of range, do nothing. +// Otherwise, setup goroutine context and invoke racecall. Other arguments already set. +TEXT racecalladdr<>(SB), NOSPLIT, $0-0 + MOVQ g_racectx(R14), RARG0 // goroutine context + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). + CMPQ RARG1, runtime·racearenastart(SB) + JB data + CMPQ RARG1, runtime·racearenaend(SB) + JB call +data: + CMPQ RARG1, runtime·racedatastart(SB) + JB ret + CMPQ RARG1, runtime·racedataend(SB) + JAE ret +call: + MOVQ AX, AX // w/o this 6a miscompiles this function + JMP racecall<>(SB) +ret: + RET + +// func runtime·racefuncenter(pc uintptr) +// Called from instrumented code. +TEXT runtime·racefuncenter(SB), NOSPLIT, $0-8 + MOVQ callpc+0(FP), R11 + JMP racefuncenter<>(SB) + +// Common code for racefuncenter +// R11 = caller's return address +TEXT racefuncenter<>(SB), NOSPLIT, $0-0 + MOVQ DX, BX // save function entry context (for closures) + MOVQ g_racectx(R14), RARG0 // goroutine context + MOVQ R11, RARG1 + // void __tsan_func_enter(ThreadState *thr, void *pc); + MOVQ $__tsan_func_enter(SB), AX + // racecall<> preserves BX + CALL racecall<>(SB) + MOVQ BX, DX // restore function entry context + RET + +// func runtime·racefuncexit() +// Called from instrumented code. +TEXT runtime·racefuncexit(SB), NOSPLIT, $0-0 + MOVQ g_racectx(R14), RARG0 // goroutine context + // void __tsan_func_exit(ThreadState *thr); + MOVQ $__tsan_func_exit(SB), AX + JMP racecall<>(SB) + +// Atomic operations for sync/atomic package. 
+ +// Load +TEXT sync∕atomic·LoadInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVQ $__tsan_go_atomic32_load(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVQ $__tsan_go_atomic64_load(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·LoadInt32(SB) + +TEXT sync∕atomic·LoadUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadPointer(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +// Store +TEXT sync∕atomic·StoreInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVQ $__tsan_go_atomic32_store(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVQ $__tsan_go_atomic64_store(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·StoreInt32(SB) + +TEXT sync∕atomic·StoreUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +TEXT sync∕atomic·StoreUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +// Swap +TEXT sync∕atomic·SwapInt32(SB), NOSPLIT, $0-20 + GO_ARGS + MOVQ $__tsan_go_atomic32_exchange(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVQ $__tsan_go_atomic64_exchange(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·SwapInt32(SB) + +TEXT sync∕atomic·SwapUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +TEXT sync∕atomic·SwapUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +// Add +TEXT sync∕atomic·AddInt32(SB), NOSPLIT, $0-20 + GO_ARGS + MOVQ $__tsan_go_atomic32_fetch_add(SB), AX + CALL racecallatomic<>(SB) + MOVL add+8(FP), AX // convert fetch_add to add_fetch + ADDL AX, 
ret+16(FP) + RET + +TEXT sync∕atomic·AddInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVQ $__tsan_go_atomic64_fetch_add(SB), AX + CALL racecallatomic<>(SB) + MOVQ add+8(FP), AX // convert fetch_add to add_fetch + ADDQ AX, ret+16(FP) + RET + +TEXT sync∕atomic·AddUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·AddInt32(SB) + +TEXT sync∕atomic·AddUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +TEXT sync∕atomic·AddUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +// CompareAndSwap +TEXT sync∕atomic·CompareAndSwapInt32(SB), NOSPLIT, $0-17 + GO_ARGS + MOVQ $__tsan_go_atomic32_compare_exchange(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapInt64(SB), NOSPLIT, $0-25 + GO_ARGS + MOVQ $__tsan_go_atomic64_compare_exchange(SB), AX + CALL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapUint32(SB), NOSPLIT, $0-17 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt32(SB) + +TEXT sync∕atomic·CompareAndSwapUint64(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +TEXT sync∕atomic·CompareAndSwapUintptr(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +// Generic atomic operation implementation. +// AX already contains target function. +TEXT racecallatomic<>(SB), NOSPLIT, $0-0 + // Trigger SIGSEGV early. + MOVQ 16(SP), R12 + MOVBLZX (R12), R13 + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). + CMPQ R12, runtime·racearenastart(SB) + JB racecallatomic_data + CMPQ R12, runtime·racearenaend(SB) + JB racecallatomic_ok +racecallatomic_data: + CMPQ R12, runtime·racedatastart(SB) + JB racecallatomic_ignore + CMPQ R12, runtime·racedataend(SB) + JAE racecallatomic_ignore +racecallatomic_ok: + // Addr is within the good range, call the atomic function. 
+ MOVQ g_racectx(R14), RARG0 // goroutine context + MOVQ 8(SP), RARG1 // caller pc + MOVQ (SP), RARG2 // pc + LEAQ 16(SP), RARG3 // arguments + JMP racecall<>(SB) // does not return +racecallatomic_ignore: + // Addr is outside the good range. + // Call __tsan_go_ignore_sync_begin to ignore synchronization during the atomic op. + // An attempt to synchronize on the address would cause crash. + MOVQ AX, BX // remember the original function + MOVQ $__tsan_go_ignore_sync_begin(SB), AX + MOVQ g_racectx(R14), RARG0 // goroutine context + CALL racecall<>(SB) + MOVQ BX, AX // restore the original function + // Call the atomic function. + MOVQ g_racectx(R14), RARG0 // goroutine context + MOVQ 8(SP), RARG1 // caller pc + MOVQ (SP), RARG2 // pc + LEAQ 16(SP), RARG3 // arguments + CALL racecall<>(SB) + // Call __tsan_go_ignore_sync_end. + MOVQ $__tsan_go_ignore_sync_end(SB), AX + MOVQ g_racectx(R14), RARG0 // goroutine context + JMP racecall<>(SB) + +// void runtime·racecall(void(*f)(...), ...) +// Calls C function f from race runtime and passes up to 4 arguments to it. +// The arguments are never heap-object-preserving pointers, so we pretend there are no arguments. +TEXT runtime·racecall(SB), NOSPLIT, $0-0 + MOVQ fn+0(FP), AX + MOVQ arg0+8(FP), RARG0 + MOVQ arg1+16(FP), RARG1 + MOVQ arg2+24(FP), RARG2 + MOVQ arg3+32(FP), RARG3 + JMP racecall<>(SB) + +// Switches SP to g0 stack and calls (AX). Arguments already set. +TEXT racecall<>(SB), NOSPLIT, $0-0 + MOVQ g_m(R14), R13 + // Switch to g0 stack. + MOVQ SP, R12 // callee-saved, preserved across the CALL + MOVQ m_g0(R13), R10 + CMPQ R10, R14 + JE call // already on g0 + MOVQ (g_sched+gobuf_sp)(R10), SP +call: + ANDQ $~15, SP // alignment for gcc ABI + CALL AX + MOVQ R12, SP + // Back to Go world, set special registers. + // The g register (R14) is preserved in C. + XORPS X15, X15 + RET + +// C->Go callback thunk that allows to call runtime·racesymbolize from C code. 
+// Direct Go->C race call has only switched SP, finish g->g0 switch by setting correct g. +// The overall effect of Go->C->Go call chain is similar to that of mcall. +// RARG0 contains command code. RARG1 contains command-specific context. +// See racecallback for command codes. +TEXT runtime·racecallbackthunk(SB), NOSPLIT, $0-0 + // Handle command raceGetProcCmd (0) here. + // First, code below assumes that we are on curg, while raceGetProcCmd + // can be executed on g0. Second, it is called frequently, so will + // benefit from this fast path. + CMPQ RARG0, $0 + JNE rest + get_tls(RARG0) + MOVQ g(RARG0), RARG0 + MOVQ g_m(RARG0), RARG0 + MOVQ m_p(RARG0), RARG0 + MOVQ p_raceprocctx(RARG0), RARG0 + MOVQ RARG0, (RARG1) + RET + +rest: + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + // Set g = g0. + get_tls(R12) + MOVQ g(R12), R14 + MOVQ g_m(R14), R13 + MOVQ m_g0(R13), R15 + CMPQ R13, R15 + JEQ noswitch // branch if already on g0 + MOVQ R15, g(R12) // g = m->g0 + MOVQ R15, R14 // set g register + PUSHQ RARG1 // func arg + PUSHQ RARG0 // func arg + CALL runtime·racecallback(SB) + POPQ R12 + POPQ R12 + // All registers are smashed after Go code, reload. + get_tls(R12) + MOVQ g(R12), R13 + MOVQ g_m(R13), R13 + MOVQ m_curg(R13), R14 + MOVQ R14, g(R12) // g = m->curg +ret: + POP_REGS_HOST_TO_ABI0() + RET + +noswitch: + // already on g0 + PUSHQ RARG1 // func arg + PUSHQ RARG0 // func arg + CALL runtime·racecallback(SB) + POPQ R12 + POPQ R12 + JMP ret diff --git a/src/runtime/race_arm64.s b/src/runtime/race_arm64.s new file mode 100644 index 0000000..c818345 --- /dev/null +++ b/src/runtime/race_arm64.s @@ -0,0 +1,498 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build race + +#include "go_asm.h" +#include "funcdata.h" +#include "textflag.h" +#include "tls_arm64.h" +#include "cgo/abi_arm64.h" + +// The following thunks allow calling the gcc-compiled race runtime directly +// from Go code without going all the way through cgo. +// First, it's much faster (up to 50% speedup for real Go programs). +// Second, it eliminates race-related special cases from cgocall and scheduler. +// Third, in long-term it will allow to remove cyclic runtime/race dependency on cmd/go. + +// A brief recap of the arm64 calling convention. +// Arguments are passed in R0...R7, the rest is on stack. +// Callee-saved registers are: R19...R28. +// Temporary registers are: R9...R15 +// SP must be 16-byte aligned. + +// When calling racecalladdr, R9 is the call target address. + +// The race ctx, ThreadState *thr below, is passed in R0 and loaded in racecalladdr. + +// Darwin may return unaligned thread pointer. Align it. (See tls_arm64.s) +// No-op on other OSes. +#ifdef TLS_darwin +#define TP_ALIGN AND $~7, R0 +#else +#define TP_ALIGN +#endif + +// Load g from TLS. (See tls_arm64.s) +#define load_g \ + MRS_TPIDR_R0 \ + TP_ALIGN \ + MOVD runtime·tls_g(SB), R11 \ + MOVD (R0)(R11), g + +// func runtime·raceread(addr uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would make caller's PC ineffective. +TEXT runtime·raceread<ABIInternal>(SB), NOSPLIT, $0-8 + MOVD R0, R1 // addr + MOVD LR, R2 + // void __tsan_read(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_read(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·RaceRead(addr uintptr) +TEXT runtime·RaceRead(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because raceread reads caller pc. 
+ JMP runtime·raceread(SB) + +// func runtime·racereadpc(void *addr, void *callpc, void *pc) +TEXT runtime·racereadpc(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R1 + MOVD callpc+8(FP), R2 + MOVD pc+16(FP), R3 + // void __tsan_read_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_read_pc(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·racewrite(addr uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would make caller's PC ineffective. +TEXT runtime·racewrite<ABIInternal>(SB), NOSPLIT, $0-8 + MOVD R0, R1 // addr + MOVD LR, R2 + // void __tsan_write(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_write(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·RaceWrite(addr uintptr) +TEXT runtime·RaceWrite(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because racewrite reads caller pc. + JMP runtime·racewrite(SB) + +// func runtime·racewritepc(void *addr, void *callpc, void *pc) +TEXT runtime·racewritepc(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R1 + MOVD callpc+8(FP), R2 + MOVD pc+16(FP), R3 + // void __tsan_write_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_write_pc(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·racereadrange(addr, size uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would make caller's PC ineffective. +TEXT runtime·racereadrange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVD R1, R2 // size + MOVD R0, R1 // addr + MOVD LR, R3 + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·RaceReadRange(addr, size uintptr) +TEXT runtime·RaceReadRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racereadrange reads caller pc. 
+ JMP runtime·racereadrange(SB) + +// func runtime·racereadrangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racereadrangepc1(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R1 + MOVD size+8(FP), R2 + MOVD pc+16(FP), R3 + ADD $4, R3 // pc is function start, tsan wants return address. + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·racewriterange(addr, size uintptr) +// Called from instrumented code. +// Defined as ABIInternal so as to avoid introducing a wrapper, +// which would make caller's PC ineffective. +TEXT runtime·racewriterange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVD R1, R2 // size + MOVD R0, R1 // addr + MOVD LR, R3 + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R9 + JMP racecalladdr<>(SB) + +// func runtime·RaceWriteRange(addr, size uintptr) +TEXT runtime·RaceWriteRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racewriterange reads caller pc. + JMP runtime·racewriterange(SB) + +// func runtime·racewriterangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racewriterangepc1(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R1 + MOVD size+8(FP), R2 + MOVD pc+16(FP), R3 + ADD $4, R3 // pc is function start, tsan wants return address. + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R9 + JMP racecalladdr<>(SB) + +// If addr (R1) is out of range, do nothing. +// Otherwise, setup goroutine context and invoke racecall. Other arguments already set. +TEXT racecalladdr<>(SB), NOSPLIT, $0-0 + load_g + MOVD g_racectx(g), R0 + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). 
+ MOVD runtime·racearenastart(SB), R10 + CMP R10, R1 + BLT data + MOVD runtime·racearenaend(SB), R10 + CMP R10, R1 + BLT call +data: + MOVD runtime·racedatastart(SB), R10 + CMP R10, R1 + BLT ret + MOVD runtime·racedataend(SB), R10 + CMP R10, R1 + BGT ret +call: + JMP racecall<>(SB) +ret: + RET + +// func runtime·racefuncenter(pc uintptr) +// Called from instrumented code. +TEXT runtime·racefuncenter<ABIInternal>(SB), NOSPLIT, $0-8 + MOVD R0, R9 // callpc + JMP racefuncenter<>(SB) + +// Common code for racefuncenter +// R9 = caller's return address +TEXT racefuncenter<>(SB), NOSPLIT, $0-0 + load_g + MOVD g_racectx(g), R0 // goroutine racectx + MOVD R9, R1 + // void __tsan_func_enter(ThreadState *thr, void *pc); + MOVD $__tsan_func_enter(SB), R9 + BL racecall<>(SB) + RET + +// func runtime·racefuncexit() +// Called from instrumented code. +TEXT runtime·racefuncexit<ABIInternal>(SB), NOSPLIT, $0-0 + load_g + MOVD g_racectx(g), R0 // race context + // void __tsan_func_exit(ThreadState *thr); + MOVD $__tsan_func_exit(SB), R9 + JMP racecall<>(SB) + +// Atomic operations for sync/atomic package. 
+// R3 = addr of arguments passed to this function, it can +// be fetched at 40(RSP) in racecallatomic after two times BL +// R0, R1, R2 set in racecallatomic + +// Load +TEXT sync∕atomic·LoadInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVD $__tsan_go_atomic32_load(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVD $__tsan_go_atomic64_load(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·LoadInt32(SB) + +TEXT sync∕atomic·LoadUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadPointer(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +// Store +TEXT sync∕atomic·StoreInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVD $__tsan_go_atomic32_store(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVD $__tsan_go_atomic64_store(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·StoreInt32(SB) + +TEXT sync∕atomic·StoreUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +TEXT sync∕atomic·StoreUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +// Swap +TEXT sync∕atomic·SwapInt32(SB), NOSPLIT, $0-20 + GO_ARGS + MOVD $__tsan_go_atomic32_exchange(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVD $__tsan_go_atomic64_exchange(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·SwapInt32(SB) + +TEXT sync∕atomic·SwapUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +TEXT sync∕atomic·SwapUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +// Add +TEXT sync∕atomic·AddInt32(SB), NOSPLIT, $0-20 + GO_ARGS + 
MOVD $__tsan_go_atomic32_fetch_add(SB), R9 + BL racecallatomic<>(SB) + MOVW add+8(FP), R0 // convert fetch_add to add_fetch + MOVW ret+16(FP), R1 + ADD R0, R1, R0 + MOVW R0, ret+16(FP) + RET + +TEXT sync∕atomic·AddInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVD $__tsan_go_atomic64_fetch_add(SB), R9 + BL racecallatomic<>(SB) + MOVD add+8(FP), R0 // convert fetch_add to add_fetch + MOVD ret+16(FP), R1 + ADD R0, R1, R0 + MOVD R0, ret+16(FP) + RET + +TEXT sync∕atomic·AddUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·AddInt32(SB) + +TEXT sync∕atomic·AddUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +TEXT sync∕atomic·AddUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +// CompareAndSwap +TEXT sync∕atomic·CompareAndSwapInt32(SB), NOSPLIT, $0-17 + GO_ARGS + MOVD $__tsan_go_atomic32_compare_exchange(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapInt64(SB), NOSPLIT, $0-25 + GO_ARGS + MOVD $__tsan_go_atomic64_compare_exchange(SB), R9 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapUint32(SB), NOSPLIT, $0-17 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt32(SB) + +TEXT sync∕atomic·CompareAndSwapUint64(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +TEXT sync∕atomic·CompareAndSwapUintptr(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +// Generic atomic operation implementation. +// R9 = addr of target function +TEXT racecallatomic<>(SB), NOSPLIT, $0 + // Set up these registers + // R0 = *ThreadState + // R1 = caller pc + // R2 = pc + // R3 = addr of incoming arg list + + // Trigger SIGSEGV early. + MOVD 40(RSP), R3 // 1st arg is addr. after two times BL, get it at 40(RSP) + MOVB (R3), R13 // segv here if addr is bad + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). 
+ MOVD runtime·racearenastart(SB), R10 + CMP R10, R3 + BLT racecallatomic_data + MOVD runtime·racearenaend(SB), R10 + CMP R10, R3 + BLT racecallatomic_ok +racecallatomic_data: + MOVD runtime·racedatastart(SB), R10 + CMP R10, R3 + BLT racecallatomic_ignore + MOVD runtime·racedataend(SB), R10 + CMP R10, R3 + BGE racecallatomic_ignore +racecallatomic_ok: + // Addr is within the good range, call the atomic function. + load_g + MOVD g_racectx(g), R0 // goroutine context + MOVD 16(RSP), R1 // caller pc + MOVD R9, R2 // pc + ADD $40, RSP, R3 + JMP racecall<>(SB) // does not return +racecallatomic_ignore: + // Addr is outside the good range. + // Call __tsan_go_ignore_sync_begin to ignore synchronization during the atomic op. + // An attempt to synchronize on the address would cause crash. + MOVD R9, R21 // remember the original function + MOVD $__tsan_go_ignore_sync_begin(SB), R9 + load_g + MOVD g_racectx(g), R0 // goroutine context + BL racecall<>(SB) + MOVD R21, R9 // restore the original function + // Call the atomic function. + // racecall will call LLVM race code which might clobber R28 (g) + load_g + MOVD g_racectx(g), R0 // goroutine context + MOVD 16(RSP), R1 // caller pc + MOVD R9, R2 // pc + ADD $40, RSP, R3 // arguments + BL racecall<>(SB) + // Call __tsan_go_ignore_sync_end. + MOVD $__tsan_go_ignore_sync_end(SB), R9 + MOVD g_racectx(g), R0 // goroutine context + BL racecall<>(SB) + RET + +// func runtime·racecall(void(*f)(...), ...) +// Calls C function f from race runtime and passes up to 4 arguments to it. +// The arguments are never heap-object-preserving pointers, so we pretend there are no arguments. +TEXT runtime·racecall(SB), NOSPLIT, $0-0 + MOVD fn+0(FP), R9 + MOVD arg0+8(FP), R0 + MOVD arg1+16(FP), R1 + MOVD arg2+24(FP), R2 + MOVD arg3+32(FP), R3 + JMP racecall<>(SB) + +// Switches SP to g0 stack and calls (R9). Arguments already set. +// Clobbers R19, R20. +TEXT racecall<>(SB), NOSPLIT|NOFRAME, $0-0 + MOVD g_m(g), R10 + // Switch to g0 stack. 
+ MOVD RSP, R19 // callee-saved, preserved across the CALL + MOVD R30, R20 // callee-saved, preserved across the CALL + MOVD m_g0(R10), R11 + CMP R11, g + BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R11), R12 + MOVD R12, RSP +call: + BL R9 + MOVD R19, RSP + JMP (R20) + +// C->Go callback thunk that allows to call runtime·racesymbolize from C code. +// Direct Go->C race call has only switched SP, finish g->g0 switch by setting correct g. +// The overall effect of Go->C->Go call chain is similar to that of mcall. +// R0 contains command code. R1 contains command-specific context. +// See racecallback for command codes. +TEXT runtime·racecallbackthunk(SB), NOSPLIT|NOFRAME, $0 + // Handle command raceGetProcCmd (0) here. + // First, code below assumes that we are on curg, while raceGetProcCmd + // can be executed on g0. Second, it is called frequently, so will + // benefit from this fast path. + CBNZ R0, rest + MOVD g, R13 +#ifdef TLS_darwin + MOVD R27, R12 // save R27 a.k.a. REGTMP (callee-save in C). load_g clobbers it +#endif + load_g +#ifdef TLS_darwin + MOVD R12, R27 +#endif + MOVD g_m(g), R0 + MOVD m_p(R0), R0 + MOVD p_raceprocctx(R0), R0 + MOVD R0, (R1) + MOVD R13, g + JMP (LR) +rest: + // Save callee-saved registers (Go code won't respect that). + // 8(RSP) and 16(RSP) are for args passed through racecallback + SUB $176, RSP + MOVD LR, 0(RSP) + + SAVE_R19_TO_R28(8*3) + SAVE_F8_TO_F15(8*13) + MOVD R29, (8*21)(RSP) + // Set g = g0. + // load_g will clobber R0, Save R0 + MOVD R0, R13 + load_g + // restore R0 + MOVD R13, R0 + MOVD g_m(g), R13 + MOVD m_g0(R13), R14 + CMP R14, g + BEQ noswitch // branch if already on g0 + MOVD R14, g + + MOVD R0, 8(RSP) // func arg + MOVD R1, 16(RSP) // func arg + BL runtime·racecallback(SB) + + // All registers are smashed after Go code, reload. + MOVD g_m(g), R13 + MOVD m_curg(R13), g // g = m->curg +ret: + // Restore callee-saved registers. 
+ MOVD 0(RSP), LR + MOVD (8*21)(RSP), R29 + RESTORE_F8_TO_F15(8*13) + RESTORE_R19_TO_R28(8*3) + ADD $176, RSP + JMP (LR) + +noswitch: + // already on g0 + MOVD R0, 8(RSP) // func arg + MOVD R1, 16(RSP) // func arg + BL runtime·racecallback(SB) + JMP ret + +#ifndef TLSG_IS_VARIABLE +// tls_g, g value for each thread in TLS +GLOBL runtime·tls_g+0(SB), TLSBSS+DUPOK, $8 +#endif diff --git a/src/runtime/race_ppc64le.s b/src/runtime/race_ppc64le.s new file mode 100644 index 0000000..2826501 --- /dev/null +++ b/src/runtime/race_ppc64le.s @@ -0,0 +1,601 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" +#include "asm_ppc64x.h" + +// The following functions allow calling the clang-compiled race runtime directly +// from Go code without going all the way through cgo. +// First, it's much faster (up to 50% speedup for real Go programs). +// Second, it eliminates race-related special cases from cgocall and scheduler. +// Third, in long-term it will allow to remove cyclic runtime/race dependency on cmd/go. + +// A brief recap of the ppc64le calling convention. +// Arguments are passed in R3, R4, R5 ... +// SP must be 16-byte aligned. + +// Note that for ppc64x, LLVM follows the standard ABI and +// expects arguments in registers, so these functions move +// the arguments from storage to the registers expected +// by the ABI. + +// When calling from Go to Clang tsan code: +// R3 is the 1st argument and is usually the ThreadState* +// R4-? are the 2nd, 3rd, 4th, etc. arguments + +// When calling racecalladdr: +// R8 is the call target address + +// The race ctx is passed in R3 and loaded in +// racecalladdr. 
+// +// The sequence used to get the race ctx: +// MOVD runtime·tls_g(SB), R10 // Address of TLS variable +// MOVD 0(R10), g // g = R30 +// MOVD g_racectx(g), R3 // racectx == ThreadState + +// func runtime·RaceRead(addr uintptr) +// Called from instrumented Go code +TEXT runtime·raceread<ABIInternal>(SB), NOSPLIT, $0-8 + MOVD R3, R4 // addr + MOVD LR, R5 // caller of this? + // void __tsan_read(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_read(SB), R8 + BR racecalladdr<>(SB) + +TEXT runtime·RaceRead(SB), NOSPLIT, $0-8 + BR runtime·raceread(SB) + +// void runtime·racereadpc(void *addr, void *callpc, void *pc) +TEXT runtime·racereadpc(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R4 + MOVD callpc+8(FP), R5 + MOVD pc+16(FP), R6 + // void __tsan_read_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_read_pc(SB), R8 + BR racecalladdr<>(SB) + +// func runtime·RaceWrite(addr uintptr) +// Called from instrumented Go code +TEXT runtime·racewrite<ABIInternal>(SB), NOSPLIT, $0-8 + MOVD R3, R4 // addr + MOVD LR, R5 // caller has set LR via BL inst + // void __tsan_write(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_write(SB), R8 + BR racecalladdr<>(SB) + +TEXT runtime·RaceWrite(SB), NOSPLIT, $0-8 + JMP runtime·racewrite(SB) + +// void runtime·racewritepc(void *addr, void *callpc, void *pc) +TEXT runtime·racewritepc(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R4 + MOVD callpc+8(FP), R5 + MOVD pc+16(FP), R6 + // void __tsan_write_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_write_pc(SB), R8 + BR racecalladdr<>(SB) + +// func runtime·RaceReadRange(addr, size uintptr) +// Called from instrumented Go code. 
+TEXT runtime·racereadrange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVD R4, R5 // size + MOVD R3, R4 // addr + MOVD LR, R6 + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R8 + BR racecalladdr<>(SB) + +// void runtime·racereadrangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racereadrangepc1(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R4 + MOVD size+8(FP), R5 + MOVD pc+16(FP), R6 + ADD $4, R6 // tsan wants return addr + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R8 + BR racecalladdr<>(SB) + +TEXT runtime·RaceReadRange(SB), NOSPLIT, $0-16 + BR runtime·racereadrange(SB) + +// func runtime·RaceWriteRange(addr, size uintptr) +// Called from instrumented Go code. +TEXT runtime·racewriterange<ABIInternal>(SB), NOSPLIT, $0-16 + MOVD R4, R5 // size + MOVD R3, R4 // addr + MOVD LR, R6 + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R8 + BR racecalladdr<>(SB) + +TEXT runtime·RaceWriteRange(SB), NOSPLIT, $0-16 + BR runtime·racewriterange(SB) + +// void runtime·racewriterangepc1(void *addr, uintptr sz, void *pc) +// Called from instrumented Go code +TEXT runtime·racewriterangepc1(SB), NOSPLIT, $0-24 + MOVD addr+0(FP), R4 + MOVD size+8(FP), R5 + MOVD pc+16(FP), R6 + ADD $4, R6 // add 4 to inst offset? + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R8 + BR racecalladdr<>(SB) + +// Call a __tsan function from Go code. +// R8 = tsan function address +// R3 = *ThreadState a.k.a. g_racectx from g +// R4 = addr passed to __tsan function +// +// Otherwise, setup goroutine context and invoke racecall. Other arguments already set. 
+TEXT racecalladdr<>(SB), NOSPLIT, $0-0 + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_racectx(g), R3 // goroutine context + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). + MOVD runtime·racearenastart(SB), R9 + CMP R4, R9 + BLT data + MOVD runtime·racearenaend(SB), R9 + CMP R4, R9 + BLT call +data: + MOVD runtime·racedatastart(SB), R9 + CMP R4, R9 + BLT ret + MOVD runtime·racedataend(SB), R9 + CMP R4, R9 + BGT ret +call: + // Careful!! racecall will save LR on its + // stack, which is OK as long as racecalladdr + // doesn't change in a way that generates a stack. + // racecall should return to the caller of + // racecalladdr. + BR racecall<>(SB) +ret: + RET + +// func runtime·racefuncenter(pc uintptr) +// Called from instrumented Go code. +TEXT runtime·racefuncenter(SB), NOSPLIT, $0-8 + MOVD callpc+0(FP), R8 + BR racefuncenter<>(SB) + +// Common code for racefuncenter +// R11 = caller's return address +TEXT racefuncenter<>(SB), NOSPLIT, $0-0 + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_racectx(g), R3 // goroutine racectx aka *ThreadState + MOVD R8, R4 // caller pc set by caller in R8 + // void __tsan_func_enter(ThreadState *thr, void *pc); + MOVD $__tsan_func_enter(SB), R8 + BR racecall<>(SB) + RET + +// func runtime·racefuncexit() +// Called from Go instrumented code. +TEXT runtime·racefuncexit(SB), NOSPLIT, $0-0 + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_racectx(g), R3 // goroutine racectx aka *ThreadState + // void __tsan_func_exit(ThreadState *thr); + MOVD $__tsan_func_exit(SB), R8 + BR racecall<>(SB) + +// Atomic operations for sync/atomic package.
+// Some use the __tsan versions instead +// R6 = addr of arguments passed to this function +// R3, R4, R5 set in racecallatomic + +// Load atomic in tsan +TEXT sync∕atomic·LoadInt32(SB), NOSPLIT, $0-12 + GO_ARGS + // void __tsan_go_atomic32_load(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic32_load(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadInt64(SB), NOSPLIT, $0-16 + GO_ARGS + // void __tsan_go_atomic64_load(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic64_load(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadUint32(SB), NOSPLIT, $0-12 + GO_ARGS + BR sync∕atomic·LoadInt32(SB) + +TEXT sync∕atomic·LoadUint64(SB), NOSPLIT, $0-16 + GO_ARGS + BR sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + BR sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadPointer(SB), NOSPLIT, $0-16 + GO_ARGS + BR sync∕atomic·LoadInt64(SB) + +// Store atomic in tsan +TEXT sync∕atomic·StoreInt32(SB), NOSPLIT, $0-12 + GO_ARGS + // void __tsan_go_atomic32_store(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic32_store(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·StoreInt64(SB), NOSPLIT, $0-16 + GO_ARGS + // void __tsan_go_atomic64_store(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic64_store(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·StoreUint32(SB), NOSPLIT, $0-12 + GO_ARGS + BR sync∕atomic·StoreInt32(SB) + +TEXT sync∕atomic·StoreUint64(SB), NOSPLIT, $0-16 + GO_ARGS + BR sync∕atomic·StoreInt64(SB) + +TEXT sync∕atomic·StoreUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + BR sync∕atomic·StoreInt64(SB) + +// Swap in tsan +TEXT sync∕atomic·SwapInt32(SB), NOSPLIT, $0-20 + GO_ARGS + // void __tsan_go_atomic32_exchange(ThreadState *thr, uptr cpc, 
uptr pc, u8 *a); + MOVD $__tsan_go_atomic32_exchange(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·SwapInt64(SB), NOSPLIT, $0-24 + GO_ARGS + // void __tsan_go_atomic64_exchange(ThreadState *thr, uptr cpc, uptr pc, u8 *a) + MOVD $__tsan_go_atomic64_exchange(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·SwapUint32(SB), NOSPLIT, $0-20 + GO_ARGS + BR sync∕atomic·SwapInt32(SB) + +TEXT sync∕atomic·SwapUint64(SB), NOSPLIT, $0-24 + GO_ARGS + BR sync∕atomic·SwapInt64(SB) + +TEXT sync∕atomic·SwapUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + BR sync∕atomic·SwapInt64(SB) + +// Add atomic in tsan +TEXT sync∕atomic·AddInt32(SB), NOSPLIT, $0-20 + GO_ARGS + // void __tsan_go_atomic32_fetch_add(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic32_fetch_add(SB), R8 + ADD $64, R1, R6 // addr of caller's 1st arg + BL racecallatomic<>(SB) + // The tsan fetch_add result is not as expected by Go, + // so the 'add' must be added to the result. + MOVW add+8(FP), R3 // The tsa fetch_add does not return the + MOVW ret+16(FP), R4 // result as expected by go, so fix it. + ADD R3, R4, R3 + MOVW R3, ret+16(FP) + RET + +TEXT sync∕atomic·AddInt64(SB), NOSPLIT, $0-24 + GO_ARGS + // void __tsan_go_atomic64_fetch_add(ThreadState *thr, uptr cpc, uptr pc, u8 *a); + MOVD $__tsan_go_atomic64_fetch_add(SB), R8 + ADD $64, R1, R6 // addr of caller's 1st arg + BL racecallatomic<>(SB) + // The tsan fetch_add result is not as expected by Go, + // so the 'add' must be added to the result. 
+ MOVD add+8(FP), R3 + MOVD ret+16(FP), R4 + ADD R3, R4, R3 + MOVD R3, ret+16(FP) + RET + +TEXT sync∕atomic·AddUint32(SB), NOSPLIT, $0-20 + GO_ARGS + BR sync∕atomic·AddInt32(SB) + +TEXT sync∕atomic·AddUint64(SB), NOSPLIT, $0-24 + GO_ARGS + BR sync∕atomic·AddInt64(SB) + +TEXT sync∕atomic·AddUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + BR sync∕atomic·AddInt64(SB) + +// CompareAndSwap in tsan +TEXT sync∕atomic·CompareAndSwapInt32(SB), NOSPLIT, $0-17 + GO_ARGS + // void __tsan_go_atomic32_compare_exchange( + // ThreadState *thr, uptr cpc, uptr pc, u8 *a) + MOVD $__tsan_go_atomic32_compare_exchange(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·CompareAndSwapInt64(SB), NOSPLIT, $0-25 + GO_ARGS + // void __tsan_go_atomic32_compare_exchange( + // ThreadState *thr, uptr cpc, uptr pc, u8 *a) + MOVD $__tsan_go_atomic64_compare_exchange(SB), R8 + ADD $32, R1, R6 // addr of caller's 1st arg + BR racecallatomic<>(SB) + +TEXT sync∕atomic·CompareAndSwapUint32(SB), NOSPLIT, $0-17 + GO_ARGS + BR sync∕atomic·CompareAndSwapInt32(SB) + +TEXT sync∕atomic·CompareAndSwapUint64(SB), NOSPLIT, $0-25 + GO_ARGS + BR sync∕atomic·CompareAndSwapInt64(SB) + +TEXT sync∕atomic·CompareAndSwapUintptr(SB), NOSPLIT, $0-25 + GO_ARGS + BR sync∕atomic·CompareAndSwapInt64(SB) + +// Common function used to call tsan's atomic functions +// R3 = *ThreadState +// R4 = TODO: What's this supposed to be? +// R5 = caller pc +// R6 = addr of incoming arg list +// R8 contains addr of target function. +TEXT racecallatomic<>(SB), NOSPLIT, $0-0 + // Trigger SIGSEGV early if address passed to atomic function is bad. + MOVD (R6), R7 // 1st arg is addr + MOVB (R7), R9 // segv here if addr is bad + // Check that addr is within [arenastart, arenaend) or within [racedatastart, racedataend). 
+ MOVD runtime·racearenastart(SB), R9 + CMP R7, R9 + BLT racecallatomic_data + MOVD runtime·racearenaend(SB), R9 + CMP R7, R9 + BLT racecallatomic_ok +racecallatomic_data: + MOVD runtime·racedatastart(SB), R9 + CMP R7, R9 + BLT racecallatomic_ignore + MOVD runtime·racedataend(SB), R9 + CMP R7, R9 + BGE racecallatomic_ignore +racecallatomic_ok: + // Addr is within the good range, call the atomic function. + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_racectx(g), R3 // goroutine racectx aka *ThreadState + MOVD R8, R5 // pc is the function called + MOVD (R1), R4 // caller pc from stack + BL racecall<>(SB) // BL needed to maintain stack consistency + RET // +racecallatomic_ignore: + // Addr is outside the good range. + // Call __tsan_go_ignore_sync_begin to ignore synchronization during the atomic op. + // An attempt to synchronize on the address would cause crash. + MOVD R8, R15 // save the original function + MOVD R6, R17 // save the original arg list addr + MOVD $__tsan_go_ignore_sync_begin(SB), R8 // func addr to call + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_racectx(g), R3 // goroutine context + BL racecall<>(SB) + MOVD R15, R8 // restore the original function + MOVD R17, R6 // restore arg list addr + // Call the atomic function. + // racecall will call LLVM race code which might clobber r30 (g) + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + + MOVD g_racectx(g), R3 + MOVD R8, R4 // pc being called same TODO as above + MOVD (R1), R5 // caller pc from latest LR + BL racecall<>(SB) + // Call __tsan_go_ignore_sync_end. + MOVD $__tsan_go_ignore_sync_end(SB), R8 + MOVD g_racectx(g), R3 // goroutine context g should still be good? + BL racecall<>(SB) + RET + +// void runtime·racecall(void(*f)(...), ...) +// Calls C function f from race runtime and passes up to 4 arguments to it. +// The arguments are never heap-object-preserving pointers, so we pretend there are no arguments. 
+TEXT runtime·racecall(SB), NOSPLIT, $0-0 + MOVD fn+0(FP), R8 + MOVD arg0+8(FP), R3 + MOVD arg1+16(FP), R4 + MOVD arg2+24(FP), R5 + MOVD arg3+32(FP), R6 + JMP racecall<>(SB) + +// Finds g0 and sets its stack +// Arguments were loaded for call from Go to C +TEXT racecall<>(SB), NOSPLIT, $0-0 + // Set the LR slot for the ppc64 ABI + MOVD LR, R10 + MOVD R10, 0(R1) // Go expectation + MOVD R10, 16(R1) // C ABI + // Get info from the current goroutine + MOVD runtime·tls_g(SB), R10 // g offset in TLS + MOVD 0(R10), g + MOVD g_m(g), R7 // m for g + MOVD R1, R16 // callee-saved, preserved across C call + MOVD m_g0(R7), R10 // g0 for m + CMP R10, g // same g0? + BEQ call // already on g0 + MOVD (g_sched+gobuf_sp)(R10), R1 // switch R1 +call: + // prepare frame for C ABI + SUB $32, R1 // create frame for callee saving LR, CR, R2 etc. + RLDCR $0, R1, $~15, R1 // align SP to 16 bytes + MOVD R8, CTR // R8 = caller addr + MOVD R8, R12 // expected by PPC64 ABI + BL (CTR) + XOR R0, R0 // clear R0 on return from Clang + MOVD R16, R1 // restore R1; R16 nonvol in Clang + MOVD runtime·tls_g(SB), R10 // find correct g + MOVD 0(R10), g + MOVD 16(R1), R10 // LR was saved away, restore for return + MOVD R10, LR + RET + +// C->Go callback thunk that allows to call runtime·racesymbolize from C code. +// Direct Go->C race call has only switched SP, finish g->g0 switch by setting correct g. +// The overall effect of Go->C->Go call chain is similar to that of mcall. +// RARG0 contains command code. RARG1 contains command-specific context. +// See racecallback for command codes. +TEXT runtime·racecallbackthunk(SB), NOSPLIT, $-8 + // Handle command raceGetProcCmd (0) here. + // First, code below assumes that we are on curg, while raceGetProcCmd + // can be executed on g0. Second, it is called frequently, so will + // benefit from this fast path. 
+ XOR R0, R0 // clear R0 since we came from C code + CMP R3, $0 + BNE rest + // g0 TODO: Don't modify g here since R30 is nonvolatile + MOVD g, R9 + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + MOVD g_m(g), R3 + MOVD m_p(R3), R3 + MOVD p_raceprocctx(R3), R3 + MOVD R3, (R4) + MOVD R9, g // restore R30 ?? + RET + + // This is all similar to what cgo does + // Save registers according to the ppc64 ABI +rest: + MOVD LR, R10 // save link register + MOVD R10, 16(R1) + MOVW CR, R10 + MOVW R10, 8(R1) + MOVDU R1, -336(R1) // Allocate frame needed for outargs and register save area + + MOVD R14, 328(R1) + MOVD R15, 48(R1) + MOVD R16, 56(R1) + MOVD R17, 64(R1) + MOVD R18, 72(R1) + MOVD R19, 80(R1) + MOVD R20, 88(R1) + MOVD R21, 96(R1) + MOVD R22, 104(R1) + MOVD R23, 112(R1) + MOVD R24, 120(R1) + MOVD R25, 128(R1) + MOVD R26, 136(R1) + MOVD R27, 144(R1) + MOVD R28, 152(R1) + MOVD R29, 160(R1) + MOVD g, 168(R1) // R30 + MOVD R31, 176(R1) + FMOVD F14, 184(R1) + FMOVD F15, 192(R1) + FMOVD F16, 200(R1) + FMOVD F17, 208(R1) + FMOVD F18, 216(R1) + FMOVD F19, 224(R1) + FMOVD F20, 232(R1) + FMOVD F21, 240(R1) + FMOVD F22, 248(R1) + FMOVD F23, 256(R1) + FMOVD F24, 264(R1) + FMOVD F25, 272(R1) + FMOVD F26, 280(R1) + FMOVD F27, 288(R1) + FMOVD F28, 296(R1) + FMOVD F29, 304(R1) + FMOVD F30, 312(R1) + FMOVD F31, 320(R1) + + MOVD R3, FIXED_FRAME+0(R1) + MOVD R4, FIXED_FRAME+8(R1) + + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + + MOVD g_m(g), R7 + MOVD m_g0(R7), R8 + CMP g, R8 + BEQ noswitch + + MOVD R8, g // set g = m-> g0 + + BL runtime·racecallback(SB) + + // All registers are clobbered after Go code, reload. 
+ MOVD runtime·tls_g(SB), R10 + MOVD 0(R10), g + + MOVD g_m(g), R7 + MOVD m_curg(R7), g // restore g = m->curg + +ret: + MOVD 328(R1), R14 + MOVD 48(R1), R15 + MOVD 56(R1), R16 + MOVD 64(R1), R17 + MOVD 72(R1), R18 + MOVD 80(R1), R19 + MOVD 88(R1), R20 + MOVD 96(R1), R21 + MOVD 104(R1), R22 + MOVD 112(R1), R23 + MOVD 120(R1), R24 + MOVD 128(R1), R25 + MOVD 136(R1), R26 + MOVD 144(R1), R27 + MOVD 152(R1), R28 + MOVD 160(R1), R29 + MOVD 168(R1), g // R30 + MOVD 176(R1), R31 + FMOVD 184(R1), F14 + FMOVD 192(R1), F15 + FMOVD 200(R1), F16 + FMOVD 208(R1), F17 + FMOVD 216(R1), F18 + FMOVD 224(R1), F19 + FMOVD 232(R1), F20 + FMOVD 240(R1), F21 + FMOVD 248(R1), F22 + FMOVD 256(R1), F23 + FMOVD 264(R1), F24 + FMOVD 272(R1), F25 + FMOVD 280(R1), F26 + FMOVD 288(R1), F27 + FMOVD 296(R1), F28 + FMOVD 304(R1), F29 + FMOVD 312(R1), F30 + FMOVD 320(R1), F31 + + ADD $336, R1 + MOVD 8(R1), R10 + MOVFL R10, $0xff // Restore of CR + MOVD 16(R1), R10 // needed? + MOVD R10, LR + RET + +noswitch: + BL runtime·racecallback(SB) + JMP ret + +// tls_g, g value for each thread in TLS +GLOBL runtime·tls_g+0(SB), TLSBSS+DUPOK, $8 diff --git a/src/runtime/race_s390x.s b/src/runtime/race_s390x.s new file mode 100644 index 0000000..beb7f83 --- /dev/null +++ b/src/runtime/race_s390x.s @@ -0,0 +1,391 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build race +// +build race + +#include "go_asm.h" +#include "funcdata.h" +#include "textflag.h" + +// The following thunks allow calling the gcc-compiled race runtime directly +// from Go code without going all the way through cgo. +// First, it's much faster (up to 50% speedup for real Go programs). +// Second, it eliminates race-related special cases from cgocall and scheduler. +// Third, in long-term it will allow to remove cyclic runtime/race dependency on cmd/go. + +// A brief recap of the s390x C calling convention. 
+// Arguments are passed in R2...R6, the rest is on stack. +// Callee-saved registers are: R6...R13, R15. +// Temporary registers are: R0...R5, R14. + +// When calling racecalladdr, R1 is the call target address. + +// The race ctx, ThreadState *thr below, is passed in R2 and loaded in racecalladdr. + +// func runtime·raceread(addr uintptr) +// Called from instrumented code. +TEXT runtime·raceread(SB), NOSPLIT, $0-8 + // void __tsan_read(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_read(SB), R1 + MOVD addr+0(FP), R3 + MOVD R14, R4 + JMP racecalladdr<>(SB) + +// func runtime·RaceRead(addr uintptr) +TEXT runtime·RaceRead(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because raceread reads caller pc. + JMP runtime·raceread(SB) + +// func runtime·racereadpc(void *addr, void *callpc, void *pc) +TEXT runtime·racereadpc(SB), NOSPLIT, $0-24 + // void __tsan_read_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_read_pc(SB), R1 + LMG addr+0(FP), R3, R5 + JMP racecalladdr<>(SB) + +// func runtime·racewrite(addr uintptr) +// Called from instrumented code. +TEXT runtime·racewrite(SB), NOSPLIT, $0-8 + // void __tsan_write(ThreadState *thr, void *addr, void *pc); + MOVD $__tsan_write(SB), R1 + MOVD addr+0(FP), R3 + MOVD R14, R4 + JMP racecalladdr<>(SB) + +// func runtime·RaceWrite(addr uintptr) +TEXT runtime·RaceWrite(SB), NOSPLIT, $0-8 + // This needs to be a tail call, because racewrite reads caller pc. + JMP runtime·racewrite(SB) + +// func runtime·racewritepc(void *addr, void *callpc, void *pc) +TEXT runtime·racewritepc(SB), NOSPLIT, $0-24 + // void __tsan_write_pc(ThreadState *thr, void *addr, void *callpc, void *pc); + MOVD $__tsan_write_pc(SB), R1 + LMG addr+0(FP), R3, R5 + JMP racecalladdr<>(SB) + +// func runtime·racereadrange(addr, size uintptr) +// Called from instrumented code. 
+TEXT runtime·racereadrange(SB), NOSPLIT, $0-16 + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R1 + LMG addr+0(FP), R3, R4 + MOVD R14, R5 + JMP racecalladdr<>(SB) + +// func runtime·RaceReadRange(addr, size uintptr) +TEXT runtime·RaceReadRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racereadrange reads caller pc. + JMP runtime·racereadrange(SB) + +// func runtime·racereadrangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racereadrangepc1(SB), NOSPLIT, $0-24 + // void __tsan_read_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_read_range(SB), R1 + LMG addr+0(FP), R3, R5 + // pc is an interceptor address, but TSan expects it to point to the + // middle of an interceptor (see LLVM's SCOPED_INTERCEPTOR_RAW). + ADD $2, R5 + JMP racecalladdr<>(SB) + +// func runtime·racewriterange(addr, size uintptr) +// Called from instrumented code. +TEXT runtime·racewriterange(SB), NOSPLIT, $0-16 + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R1 + LMG addr+0(FP), R3, R4 + MOVD R14, R5 + JMP racecalladdr<>(SB) + +// func runtime·RaceWriteRange(addr, size uintptr) +TEXT runtime·RaceWriteRange(SB), NOSPLIT, $0-16 + // This needs to be a tail call, because racewriterange reads caller pc. + JMP runtime·racewriterange(SB) + +// func runtime·racewriterangepc1(void *addr, uintptr sz, void *pc) +TEXT runtime·racewriterangepc1(SB), NOSPLIT, $0-24 + // void __tsan_write_range(ThreadState *thr, void *addr, uintptr size, void *pc); + MOVD $__tsan_write_range(SB), R1 + LMG addr+0(FP), R3, R5 + // pc is an interceptor address, but TSan expects it to point to the + // middle of an interceptor (see LLVM's SCOPED_INTERCEPTOR_RAW). + ADD $2, R5 + JMP racecalladdr<>(SB) + +// If R3 is out of range, do nothing. Otherwise, setup goroutine context and +// invoke racecall. Other arguments are already set. 
+TEXT racecalladdr<>(SB), NOSPLIT, $0-0 + MOVD runtime·racearenastart(SB), R0 + CMPUBLT R3, R0, data // Before racearena start? + MOVD runtime·racearenaend(SB), R0 + CMPUBLT R3, R0, call // Before racearena end? +data: + MOVD runtime·racedatastart(SB), R0 + CMPUBLT R3, R0, ret // Before racedata start? + MOVD runtime·racedataend(SB), R0 + CMPUBGE R3, R0, ret // At or after racedata end? +call: + MOVD g_racectx(g), R2 + JMP racecall<>(SB) +ret: + RET + +// func runtime·racefuncenter(pc uintptr) +// Called from instrumented code. +TEXT runtime·racefuncenter(SB), NOSPLIT, $0-8 + MOVD callpc+0(FP), R3 + JMP racefuncenter<>(SB) + +// Common code for racefuncenter +// R3 = caller's return address +TEXT racefuncenter<>(SB), NOSPLIT, $0-0 + // void __tsan_func_enter(ThreadState *thr, void *pc); + MOVD $__tsan_func_enter(SB), R1 + MOVD g_racectx(g), R2 + BL racecall<>(SB) + RET + +// func runtime·racefuncexit() +// Called from instrumented code. +TEXT runtime·racefuncexit(SB), NOSPLIT, $0-0 + // void __tsan_func_exit(ThreadState *thr); + MOVD $__tsan_func_exit(SB), R1 + MOVD g_racectx(g), R2 + JMP racecall<>(SB) + +// Atomic operations for sync/atomic package. 
+ +// Load + +TEXT sync∕atomic·LoadInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVD $__tsan_go_atomic32_load(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVD $__tsan_go_atomic64_load(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·LoadUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·LoadInt32(SB) + +TEXT sync∕atomic·LoadUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +TEXT sync∕atomic·LoadPointer(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·LoadInt64(SB) + +// Store + +TEXT sync∕atomic·StoreInt32(SB), NOSPLIT, $0-12 + GO_ARGS + MOVD $__tsan_go_atomic32_store(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreInt64(SB), NOSPLIT, $0-16 + GO_ARGS + MOVD $__tsan_go_atomic64_store(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·StoreUint32(SB), NOSPLIT, $0-12 + GO_ARGS + JMP sync∕atomic·StoreInt32(SB) + +TEXT sync∕atomic·StoreUint64(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +TEXT sync∕atomic·StoreUintptr(SB), NOSPLIT, $0-16 + GO_ARGS + JMP sync∕atomic·StoreInt64(SB) + +// Swap + +TEXT sync∕atomic·SwapInt32(SB), NOSPLIT, $0-20 + GO_ARGS + MOVD $__tsan_go_atomic32_exchange(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVD $__tsan_go_atomic64_exchange(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·SwapUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·SwapInt32(SB) + +TEXT sync∕atomic·SwapUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +TEXT sync∕atomic·SwapUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·SwapInt64(SB) + +// Add + +TEXT sync∕atomic·AddInt32(SB), NOSPLIT, $0-20 + GO_ARGS + MOVD $__tsan_go_atomic32_fetch_add(SB), R1 + BL racecallatomic<>(SB) + // TSan performed fetch_add, but Go needs add_fetch. 
+ MOVW add+8(FP), R0 + MOVW ret+16(FP), R1 + ADD R0, R1, R0 + MOVW R0, ret+16(FP) + RET + +TEXT sync∕atomic·AddInt64(SB), NOSPLIT, $0-24 + GO_ARGS + MOVD $__tsan_go_atomic64_fetch_add(SB), R1 + BL racecallatomic<>(SB) + // TSan performed fetch_add, but Go needs add_fetch. + MOVD add+8(FP), R0 + MOVD ret+16(FP), R1 + ADD R0, R1, R0 + MOVD R0, ret+16(FP) + RET + +TEXT sync∕atomic·AddUint32(SB), NOSPLIT, $0-20 + GO_ARGS + JMP sync∕atomic·AddInt32(SB) + +TEXT sync∕atomic·AddUint64(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +TEXT sync∕atomic·AddUintptr(SB), NOSPLIT, $0-24 + GO_ARGS + JMP sync∕atomic·AddInt64(SB) + +// CompareAndSwap + +TEXT sync∕atomic·CompareAndSwapInt32(SB), NOSPLIT, $0-17 + GO_ARGS + MOVD $__tsan_go_atomic32_compare_exchange(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapInt64(SB), NOSPLIT, $0-25 + GO_ARGS + MOVD $__tsan_go_atomic64_compare_exchange(SB), R1 + BL racecallatomic<>(SB) + RET + +TEXT sync∕atomic·CompareAndSwapUint32(SB), NOSPLIT, $0-17 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt32(SB) + +TEXT sync∕atomic·CompareAndSwapUint64(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +TEXT sync∕atomic·CompareAndSwapUintptr(SB), NOSPLIT, $0-25 + GO_ARGS + JMP sync∕atomic·CompareAndSwapInt64(SB) + +// Common code for atomic operations. Calls R1. +TEXT racecallatomic<>(SB), NOSPLIT, $0 + MOVD 24(R15), R5 // Address (arg1, after 2xBL). + // If we pass an invalid pointer to the TSan runtime, it will cause a + // "fatal error: unknown caller pc". So trigger a SEGV here instead. + MOVB (R5), R0 + MOVD runtime·racearenastart(SB), R0 + CMPUBLT R5, R0, racecallatomic_data // Before racearena start? + MOVD runtime·racearenaend(SB), R0 + CMPUBLT R5, R0, racecallatomic_ok // Before racearena end? +racecallatomic_data: + MOVD runtime·racedatastart(SB), R0 + CMPUBLT R5, R0, racecallatomic_ignore // Before racedata start? 
+ MOVD runtime·racedataend(SB), R0 + CMPUBGE R5, R0, racecallatomic_ignore // At or after racedata end? +racecallatomic_ok: + MOVD g_racectx(g), R2 // ThreadState *. + MOVD 8(R15), R3 // Caller PC. + MOVD R14, R4 // PC. + ADD $24, R15, R5 // Arguments. + // Tail call fails to restore R15, so use a normal one. + BL racecall<>(SB) + RET +racecallatomic_ignore: + // Call __tsan_go_ignore_sync_begin to ignore synchronization during + // the atomic op. An attempt to synchronize on the address would cause + // a crash. + MOVD R1, R6 // Save target function. + MOVD R14, R7 // Save PC. + MOVD $__tsan_go_ignore_sync_begin(SB), R1 + MOVD g_racectx(g), R2 // ThreadState *. + BL racecall<>(SB) + MOVD R6, R1 // Restore target function. + MOVD g_racectx(g), R2 // ThreadState *. + MOVD 8(R15), R3 // Caller PC. + MOVD R7, R4 // PC. + ADD $24, R15, R5 // Arguments. + BL racecall<>(SB) + MOVD $__tsan_go_ignore_sync_end(SB), R1 + MOVD g_racectx(g), R2 // ThreadState *. + BL racecall<>(SB) + RET + +// func runtime·racecall(void(*f)(...), ...) +// Calls C function f from race runtime and passes up to 4 arguments to it. +// The arguments are never heap-object-preserving pointers, so we pretend there +// are no arguments. +TEXT runtime·racecall(SB), NOSPLIT, $0-0 + MOVD fn+0(FP), R1 + MOVD arg0+8(FP), R2 + MOVD arg1+16(FP), R3 + MOVD arg2+24(FP), R4 + MOVD arg3+32(FP), R5 + JMP racecall<>(SB) + +// Switches SP to g0 stack and calls R1. Arguments are already set. +TEXT racecall<>(SB), NOSPLIT, $0-0 + BL runtime·save_g(SB) // Save g for callbacks. + MOVD R15, R7 // Save SP. + MOVD g_m(g), R8 // R8 = thread. + MOVD m_g0(R8), R8 // R8 = g0. + CMPBEQ R8, g, call // Already on g0? + MOVD (g_sched+gobuf_sp)(R8), R15 // Switch SP to g0. +call: SUB $160, R15 // Allocate C frame. + BL R1 // Call C code. + MOVD R7, R15 // Restore SP. + RET // Return to Go. + +// C->Go callback thunk that allows to call runtime·racesymbolize from C +// code.
racecall has only switched SP, finish g->g0 switch by setting correct +// g. R2 contains command code, R3 contains command-specific context. See +// racecallback for command codes. +TEXT runtime·racecallbackthunk(SB), NOSPLIT|NOFRAME, $0 + STMG R6, R15, 48(R15) // Save non-volatile regs. + BL runtime·load_g(SB) // Saved by racecall. + CMPBNE R2, $0, rest // raceGetProcCmd? + MOVD g_m(g), R2 // R2 = thread. + MOVD m_p(R2), R2 // R2 = processor. + MVC $8, p_raceprocctx(R2), (R3) // *R3 = ThreadState *. + LMG 48(R15), R6, R15 // Restore non-volatile regs. + BR R14 // Return to C. +rest: MOVD g_m(g), R4 // R4 = current thread. + MOVD m_g0(R4), g // Switch to g0. + SUB $24, R15 // Allocate Go argument slots. + STMG R2, R3, 8(R15) // Fill Go frame. + BL runtime·racecallback(SB) // Call Go code. + LMG 72(R15), R6, R15 // Restore non-volatile regs. + BR R14 // Return to C. diff --git a/src/runtime/rand_test.go b/src/runtime/rand_test.go new file mode 100644 index 0000000..92d07eb --- /dev/null +++ b/src/runtime/rand_test.go @@ -0,0 +1,53 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + . 
"runtime" + "strconv" + "testing" +) + +func BenchmarkFastrand(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + Fastrand() + } + }) +} + +func BenchmarkFastrand64(b *testing.B) { + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + Fastrand64() + } + }) +} + +func BenchmarkFastrandHashiter(b *testing.B) { + var m = make(map[int]int, 10) + for i := 0; i < 10; i++ { + m[i] = i + } + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + for range m { + break + } + } + }) +} + +var sink32 uint32 + +func BenchmarkFastrandn(b *testing.B) { + for n := uint32(2); n <= 5; n++ { + b.Run(strconv.Itoa(int(n)), func(b *testing.B) { + for i := 0; i < b.N; i++ { + sink32 = Fastrandn(n) + } + }) + } +} diff --git a/src/runtime/rdebug.go b/src/runtime/rdebug.go new file mode 100644 index 0000000..7ecb2a5 --- /dev/null +++ b/src/runtime/rdebug.go @@ -0,0 +1,22 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import _ "unsafe" // for go:linkname + +//go:linkname setMaxStack runtime/debug.setMaxStack +func setMaxStack(in int) (out int) { + out = int(maxstacksize) + maxstacksize = uintptr(in) + return out +} + +//go:linkname setPanicOnFault runtime/debug.setPanicOnFault +func setPanicOnFault(new bool) (old bool) { + gp := getg() + old = gp.paniconfault + gp.paniconfault = new + return old +} diff --git a/src/runtime/relax_stub.go b/src/runtime/relax_stub.go new file mode 100644 index 0000000..e507702 --- /dev/null +++ b/src/runtime/relax_stub.go @@ -0,0 +1,17 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows + +package runtime + +// osRelaxMinNS is the number of nanoseconds of idleness to tolerate +// without performing an osRelax. 
Since osRelax may reduce the +// precision of timers, this should be enough larger than the relaxed +// timer precision to keep the timer error acceptable. +const osRelaxMinNS = 0 + +// osRelax is called by the scheduler when transitioning to and from +// all Ps being idle. +func osRelax(relax bool) {} diff --git a/src/runtime/retry.go b/src/runtime/retry.go new file mode 100644 index 0000000..2e2f813 --- /dev/null +++ b/src/runtime/retry.go @@ -0,0 +1,23 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime + +// retryOnEAGAIN retries a function until it does not return EAGAIN. +// It will use an increasing delay between calls, and retry up to 20 times. +// The function argument is expected to return an errno value, +// and retryOnEAGAIN will return any errno value other than EAGAIN. +// If all retries return EAGAIN, then retryOnEAGAIN will return EAGAIN. +func retryOnEAGAIN(fn func() int32) int32 { + for tries := 0; tries < 20; tries++ { + errno := fn() + if errno != _EAGAIN { + return errno + } + usleep_no_g(uint32(tries+1) * 1000) // milliseconds + } + return _EAGAIN +} diff --git a/src/runtime/rt0_aix_ppc64.s b/src/runtime/rt0_aix_ppc64.s new file mode 100644 index 0000000..e06caa1 --- /dev/null +++ b/src/runtime/rt0_aix_ppc64.s @@ -0,0 +1,199 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// _rt0_ppc64_aix is a function descriptor of the entrypoint function +// __start. This name is needed by cmd/link. 
+DATA _rt0_ppc64_aix+0(SB)/8, $__start<>(SB) +DATA _rt0_ppc64_aix+8(SB)/8, $TOC(SB) +GLOBL _rt0_ppc64_aix(SB), NOPTR, $16 + + +// The starting function must return to the loader to +// initialise some libraries, especially libthread which +// creates the main thread and adds the TLS in R13 +// R19 contains a function descriptor to the loader function +// which needs to be called. +// This code is similar to the __start function in C +TEXT __start<>(SB),NOSPLIT,$-8 + XOR R0, R0 + MOVD $libc___n_pthreads(SB), R4 + MOVD 0(R4), R4 + MOVD $libc___mod_init(SB), R5 + MOVD 0(R5), R5 + MOVD 0(R19), R0 + MOVD R2, 40(R1) + MOVD 8(R19), R2 + MOVD R18, R3 + MOVD R0, CTR + BL (CTR) // Return to AIX loader + + // Launch rt0_go + MOVD 40(R1), R2 + MOVD R14, R3 // argc + MOVD R15, R4 // argv + BL _main(SB) + + +DATA main+0(SB)/8, $_main(SB) +DATA main+8(SB)/8, $TOC(SB) +DATA main+16(SB)/8, $0 +GLOBL main(SB), NOPTR, $24 + +TEXT _main(SB),NOSPLIT,$-8 + MOVD $runtime·rt0_go(SB), R12 + MOVD R12, CTR + BR (CTR) + + +TEXT _rt0_ppc64_aix_lib(SB),NOSPLIT,$-8 + // Start with standard C stack frame layout and linkage. + MOVD LR, R0 + MOVD R0, 16(R1) // Save LR in caller's frame. + MOVW CR, R0 // Save CR in caller's frame + MOVD R0, 8(R1) + + MOVDU R1, -344(R1) // Allocate frame. + + // Preserve callee-save registers. 
+ MOVD R14, 48(R1) + MOVD R15, 56(R1) + MOVD R16, 64(R1) + MOVD R17, 72(R1) + MOVD R18, 80(R1) + MOVD R19, 88(R1) + MOVD R20, 96(R1) + MOVD R21,104(R1) + MOVD R22, 112(R1) + MOVD R23, 120(R1) + MOVD R24, 128(R1) + MOVD R25, 136(R1) + MOVD R26, 144(R1) + MOVD R27, 152(R1) + MOVD R28, 160(R1) + MOVD R29, 168(R1) + MOVD g, 176(R1) // R30 + MOVD R31, 184(R1) + FMOVD F14, 192(R1) + FMOVD F15, 200(R1) + FMOVD F16, 208(R1) + FMOVD F17, 216(R1) + FMOVD F18, 224(R1) + FMOVD F19, 232(R1) + FMOVD F20, 240(R1) + FMOVD F21, 248(R1) + FMOVD F22, 256(R1) + FMOVD F23, 264(R1) + FMOVD F24, 272(R1) + FMOVD F25, 280(R1) + FMOVD F26, 288(R1) + FMOVD F27, 296(R1) + FMOVD F28, 304(R1) + FMOVD F29, 312(R1) + FMOVD F30, 320(R1) + FMOVD F31, 328(R1) + + // Synchronous initialization. + MOVD $runtime·reginit(SB), R12 + MOVD R12, CTR + BL (CTR) + + MOVBZ runtime·isarchive(SB), R3 // Check buildmode = c-archive + CMP $0, R3 + BEQ done + + MOVD R14, _rt0_ppc64_aix_lib_argc<>(SB) + MOVD R15, _rt0_ppc64_aix_lib_argv<>(SB) + + MOVD $runtime·libpreinit(SB), R12 + MOVD R12, CTR + BL (CTR) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R12 + CMP $0, R12 + BEQ nocgo + MOVD $_rt0_ppc64_aix_lib_go(SB), R3 + MOVD $0, R4 + MOVD R2, 40(R1) + MOVD 8(R12), R2 + MOVD (R12), R12 + MOVD R12, CTR + BL (CTR) + MOVD 40(R1), R2 + BR done + +nocgo: + MOVD $0x800000, R12 // stacksize = 8192KB + MOVD R12, 8(R1) + MOVD $_rt0_ppc64_aix_lib_go(SB), R12 + MOVD R12, 16(R1) + MOVD $runtime·newosproc0(SB),R12 + MOVD R12, CTR + BL (CTR) + +done: + // Restore saved registers. 
+ MOVD 48(R1), R14 + MOVD 56(R1), R15 + MOVD 64(R1), R16 + MOVD 72(R1), R17 + MOVD 80(R1), R18 + MOVD 88(R1), R19 + MOVD 96(R1), R20 + MOVD 104(R1), R21 + MOVD 112(R1), R22 + MOVD 120(R1), R23 + MOVD 128(R1), R24 + MOVD 136(R1), R25 + MOVD 144(R1), R26 + MOVD 152(R1), R27 + MOVD 160(R1), R28 + MOVD 168(R1), R29 + MOVD 176(R1), g // R30 + MOVD 184(R1), R31 + FMOVD 192(R1), F14 + FMOVD 200(R1), F15 + FMOVD 208(R1), F16 + FMOVD 216(R1), F17 + FMOVD 224(R1), F18 + FMOVD 232(R1), F19 + FMOVD 240(R1), F20 + FMOVD 248(R1), F21 + FMOVD 256(R1), F22 + FMOVD 264(R1), F23 + FMOVD 272(R1), F24 + FMOVD 280(R1), F25 + FMOVD 288(R1), F26 + FMOVD 296(R1), F27 + FMOVD 304(R1), F28 + FMOVD 312(R1), F29 + FMOVD 320(R1), F30 + FMOVD 328(R1), F31 + + ADD $344, R1 + + MOVD 8(R1), R0 + MOVFL R0, $0xff + MOVD 16(R1), R0 + MOVD R0, LR + RET + +DATA _rt0_ppc64_aix_lib_go+0(SB)/8, $__rt0_ppc64_aix_lib_go(SB) +DATA _rt0_ppc64_aix_lib_go+8(SB)/8, $TOC(SB) +DATA _rt0_ppc64_aix_lib_go+16(SB)/8, $0 +GLOBL _rt0_ppc64_aix_lib_go(SB), NOPTR, $24 + +TEXT __rt0_ppc64_aix_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_ppc64_aix_lib_argc<>(SB), R3 + MOVD _rt0_ppc64_aix_lib_argv<>(SB), R4 + MOVD $runtime·rt0_go(SB), R12 + MOVD R12, CTR + BR (CTR) + +DATA _rt0_ppc64_aix_lib_argc<>(SB)/8, $0 +GLOBL _rt0_ppc64_aix_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_ppc64_aix_lib_argv<>(SB)/8, $0 +GLOBL _rt0_ppc64_aix_lib_argv<>(SB),NOPTR, $8 diff --git a/src/runtime/rt0_android_386.s b/src/runtime/rt0_android_386.s new file mode 100644 index 0000000..3a1b06b --- /dev/null +++ b/src/runtime/rt0_android_386.s @@ -0,0 +1,27 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT _rt0_386_android(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +TEXT _rt0_386_android_lib(SB),NOSPLIT,$0 + PUSHL $_rt0_386_android_argv(SB) // argv + PUSHL $1 // argc + CALL _rt0_386_lib(SB) + POPL AX + POPL AX + RET + +DATA _rt0_386_android_argv+0x00(SB)/4,$_rt0_386_android_argv0(SB) +DATA _rt0_386_android_argv+0x04(SB)/4,$0 // argv terminate +DATA _rt0_386_android_argv+0x08(SB)/4,$0 // envp terminate +DATA _rt0_386_android_argv+0x0c(SB)/4,$0 // auxv terminate +GLOBL _rt0_386_android_argv(SB),NOPTR,$0x10 + +// TODO: wire up necessary VDSO (see os_linux_386.go) + +DATA _rt0_386_android_argv0(SB)/8, $"gojni" +GLOBL _rt0_386_android_argv0(SB),RODATA,$8 diff --git a/src/runtime/rt0_android_amd64.s b/src/runtime/rt0_android_amd64.s new file mode 100644 index 0000000..6bda3bf --- /dev/null +++ b/src/runtime/rt0_android_amd64.s @@ -0,0 +1,22 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_android(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_android_lib(SB),NOSPLIT,$0 + MOVQ $1, DI // argc + MOVQ $_rt0_amd64_android_argv(SB), SI // argv + JMP _rt0_amd64_lib(SB) + +DATA _rt0_amd64_android_argv+0x00(SB)/8,$_rt0_amd64_android_argv0(SB) +DATA _rt0_amd64_android_argv+0x08(SB)/8,$0 // end argv +DATA _rt0_amd64_android_argv+0x10(SB)/8,$0 // end envv +DATA _rt0_amd64_android_argv+0x18(SB)/8,$0 // end auxv +GLOBL _rt0_amd64_android_argv(SB),NOPTR,$0x20 + +DATA _rt0_amd64_android_argv0(SB)/8, $"gojni" +GLOBL _rt0_amd64_android_argv0(SB),RODATA,$8 diff --git a/src/runtime/rt0_android_arm.s b/src/runtime/rt0_android_arm.s new file mode 100644 index 0000000..cc5b78e --- /dev/null +++ b/src/runtime/rt0_android_arm.s @@ -0,0 +1,25 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT _rt0_arm_android(SB),NOSPLIT|NOFRAME,$0 + MOVW (R13), R0 // argc + MOVW $4(R13), R1 // argv + MOVW $_rt0_arm_linux1(SB), R4 + B (R4) + +TEXT _rt0_arm_android_lib(SB),NOSPLIT,$0 + MOVW $1, R0 // argc + MOVW $_rt0_arm_android_argv(SB), R1 // **argv + B _rt0_arm_lib(SB) + +DATA _rt0_arm_android_argv+0x00(SB)/4,$_rt0_arm_android_argv0(SB) +DATA _rt0_arm_android_argv+0x04(SB)/4,$0 // end argv +DATA _rt0_arm_android_argv+0x08(SB)/4,$0 // end envv +DATA _rt0_arm_android_argv+0x0c(SB)/4,$0 // end auxv +GLOBL _rt0_arm_android_argv(SB),NOPTR,$0x10 + +DATA _rt0_arm_android_argv0(SB)/8, $"gojni" +GLOBL _rt0_arm_android_argv0(SB),RODATA,$8 diff --git a/src/runtime/rt0_android_arm64.s b/src/runtime/rt0_android_arm64.s new file mode 100644 index 0000000..4135bf0 --- /dev/null +++ b/src/runtime/rt0_android_arm64.s @@ -0,0 +1,26 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_arm64_android(SB),NOSPLIT|NOFRAME,$0 + MOVD $_rt0_arm64_linux(SB), R4 + B (R4) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. 
+TEXT _rt0_arm64_android_lib(SB),NOSPLIT|NOFRAME,$0 + MOVW $1, R0 // argc + MOVD $_rt0_arm64_android_argv(SB), R1 // **argv + MOVD $_rt0_arm64_linux_lib(SB), R4 + B (R4) + +DATA _rt0_arm64_android_argv+0x00(SB)/8,$_rt0_arm64_android_argv0(SB) +DATA _rt0_arm64_android_argv+0x08(SB)/8,$0 // end argv +DATA _rt0_arm64_android_argv+0x10(SB)/8,$0 // end envv +DATA _rt0_arm64_android_argv+0x18(SB)/8,$0 // end auxv +GLOBL _rt0_arm64_android_argv(SB),NOPTR,$0x20 + +DATA _rt0_arm64_android_argv0(SB)/8, $"gojni" +GLOBL _rt0_arm64_android_argv0(SB),RODATA,$8 diff --git a/src/runtime/rt0_darwin_amd64.s b/src/runtime/rt0_darwin_amd64.s new file mode 100644 index 0000000..ed804d4 --- /dev/null +++ b/src/runtime/rt0_darwin_amd64.s @@ -0,0 +1,13 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_darwin(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +// When linking with -shared, this symbol is called when the shared library +// is loaded. +TEXT _rt0_amd64_darwin_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_darwin_arm64.s b/src/runtime/rt0_darwin_arm64.s new file mode 100644 index 0000000..697104a --- /dev/null +++ b/src/runtime/rt0_darwin_arm64.s @@ -0,0 +1,63 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "cgo/abi_arm64.h" + +TEXT _rt0_arm64_darwin(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·rt0_go(SB), R2 + BL (R2) +exit: + MOVD $0, R0 + MOVD $1, R16 // sys_exit + SVC $0x80 + B exit + +// When linking with -buildmode=c-archive or -buildmode=c-shared, +// this symbol is called from a global initialization function. +// +// Note that all currently shipping darwin/arm64 platforms require +// cgo and do not support c-shared. 
+TEXT _rt0_arm64_darwin_lib(SB),NOSPLIT,$152 + // Preserve callee-save registers. + SAVE_R19_TO_R28(8) + SAVE_F8_TO_F15(88) + + MOVD R0, _rt0_arm64_darwin_lib_argc<>(SB) + MOVD R1, _rt0_arm64_darwin_lib_argv<>(SB) + + MOVD $0, g // initialize g to nil + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R4 + BL (R4) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R4 + MOVD $_rt0_arm64_darwin_lib_go(SB), R0 + MOVD $0, R1 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R4) + ADD $16, RSP + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8) + RESTORE_F8_TO_F15(88) + + RET + +TEXT _rt0_arm64_darwin_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_arm64_darwin_lib_argc<>(SB), R0 + MOVD _rt0_arm64_darwin_lib_argv<>(SB), R1 + MOVD $runtime·rt0_go(SB), R4 + B (R4) + +DATA _rt0_arm64_darwin_lib_argc<>(SB)/8, $0 +GLOBL _rt0_arm64_darwin_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_arm64_darwin_lib_argv<>(SB)/8, $0 +GLOBL _rt0_arm64_darwin_lib_argv<>(SB),NOPTR, $8 + +// external linking entry point. +TEXT main(SB),NOSPLIT|NOFRAME,$0 + JMP _rt0_arm64_darwin(SB) diff --git a/src/runtime/rt0_dragonfly_amd64.s b/src/runtime/rt0_dragonfly_amd64.s new file mode 100644 index 0000000..e76f9b9 --- /dev/null +++ b/src/runtime/rt0_dragonfly_amd64.s @@ -0,0 +1,14 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// On Dragonfly argc/argv are passed in DI, not SP, so we can't use _rt0_amd64. 
+TEXT _rt0_amd64_dragonfly(SB),NOSPLIT,$-8 + LEAQ 8(DI), SI // argv + MOVQ 0(DI), DI // argc + JMP runtime·rt0_go(SB) + +TEXT _rt0_amd64_dragonfly_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_freebsd_386.s b/src/runtime/rt0_freebsd_386.s new file mode 100644 index 0000000..1808059 --- /dev/null +++ b/src/runtime/rt0_freebsd_386.s @@ -0,0 +1,17 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_freebsd(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +TEXT _rt0_386_freebsd_lib(SB),NOSPLIT,$0 + JMP _rt0_386_lib(SB) + +TEXT main(SB),NOSPLIT,$0 + // Remove the return address from the stack. + // rt0_go doesn't expect it to be there. + ADDL $4, SP + JMP runtime·rt0_go(SB) diff --git a/src/runtime/rt0_freebsd_amd64.s b/src/runtime/rt0_freebsd_amd64.s new file mode 100644 index 0000000..ccc48f6 --- /dev/null +++ b/src/runtime/rt0_freebsd_amd64.s @@ -0,0 +1,14 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// On FreeBSD argc/argv are passed in DI, not SP, so we can't use _rt0_amd64. +TEXT _rt0_amd64_freebsd(SB),NOSPLIT,$-8 + LEAQ 8(DI), SI // argv + MOVQ 0(DI), DI // argc + JMP runtime·rt0_go(SB) + +TEXT _rt0_amd64_freebsd_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_freebsd_arm.s b/src/runtime/rt0_freebsd_arm.s new file mode 100644 index 0000000..62ecd9a --- /dev/null +++ b/src/runtime/rt0_freebsd_arm.s @@ -0,0 +1,11 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT _rt0_arm_freebsd(SB),NOSPLIT,$0 + B _rt0_arm(SB) + +TEXT _rt0_arm_freebsd_lib(SB),NOSPLIT,$0 + B _rt0_arm_lib(SB) diff --git a/src/runtime/rt0_freebsd_arm64.s b/src/runtime/rt0_freebsd_arm64.s new file mode 100644 index 0000000..e517ae0 --- /dev/null +++ b/src/runtime/rt0_freebsd_arm64.s @@ -0,0 +1,74 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "cgo/abi_arm64.h" + +// On FreeBSD argc/argv are passed in R0, not RSP +TEXT _rt0_arm64_freebsd(SB),NOSPLIT|NOFRAME,$0 + ADD $8, R0, R1 // argv + MOVD 0(R0), R0 // argc + BL main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_arm64_freebsd_lib(SB),NOSPLIT,$184 + // Preserve callee-save registers. + SAVE_R19_TO_R28(24) + SAVE_F8_TO_F15(104) + + // Initialize g as null in case of using g later e.g. sigaction in cgo_sigaction.go + MOVD ZR, g + + MOVD R0, _rt0_arm64_freebsd_lib_argc<>(SB) + MOVD R1, _rt0_arm64_freebsd_lib_argv<>(SB) + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R4 + BL (R4) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R4 + CBZ R4, nocgo + MOVD $_rt0_arm64_freebsd_lib_go(SB), R0 + MOVD $0, R1 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R4) + ADD $16, RSP + B restore + +nocgo: + MOVD $0x800000, R0 // stacksize = 8192KB + MOVD $_rt0_arm64_freebsd_lib_go(SB), R1 + MOVD R0, 8(RSP) + MOVD R1, 16(RSP) + MOVD $runtime·newosproc0(SB),R4 + BL (R4) + +restore: + // Restore callee-save registers. 
+ RESTORE_R19_TO_R28(24) + RESTORE_F8_TO_F15(104) + RET + +TEXT _rt0_arm64_freebsd_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_arm64_freebsd_lib_argc<>(SB), R0 + MOVD _rt0_arm64_freebsd_lib_argv<>(SB), R1 + MOVD $runtime·rt0_go(SB),R4 + B (R4) + +DATA _rt0_arm64_freebsd_lib_argc<>(SB)/8, $0 +GLOBL _rt0_arm64_freebsd_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_arm64_freebsd_lib_argv<>(SB)/8, $0 +GLOBL _rt0_arm64_freebsd_lib_argv<>(SB),NOPTR, $8 + + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·rt0_go(SB), R2 + BL (R2) +exit: + MOVD $0, R0 + MOVD $1, R8 // SYS_exit + SVC + B exit diff --git a/src/runtime/rt0_freebsd_riscv64.s b/src/runtime/rt0_freebsd_riscv64.s new file mode 100644 index 0000000..dc46b70 --- /dev/null +++ b/src/runtime/rt0_freebsd_riscv64.s @@ -0,0 +1,112 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// On FreeBSD argc/argv are passed in R0, not X2 +TEXT _rt0_riscv64_freebsd(SB),NOSPLIT|NOFRAME,$0 + ADD $8, A0, A1 // argv + MOV 0(A0), A0 // argc + JMP main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_riscv64_freebsd_lib(SB),NOSPLIT,$224 + // Preserve callee-save registers, along with X1 (LR). + MOV X1, (8*3)(X2) + MOV X8, (8*4)(X2) + MOV X9, (8*5)(X2) + MOV X18, (8*6)(X2) + MOV X19, (8*7)(X2) + MOV X20, (8*8)(X2) + MOV X21, (8*9)(X2) + MOV X22, (8*10)(X2) + MOV X23, (8*11)(X2) + MOV X24, (8*12)(X2) + MOV X25, (8*13)(X2) + MOV X26, (8*14)(X2) + MOV g, (8*15)(X2) + MOVD F8, (8*16)(X2) + MOVD F9, (8*17)(X2) + MOVD F18, (8*18)(X2) + MOVD F19, (8*19)(X2) + MOVD F20, (8*20)(X2) + MOVD F21, (8*21)(X2) + MOVD F22, (8*22)(X2) + MOVD F23, (8*23)(X2) + MOVD F24, (8*24)(X2) + MOVD F25, (8*25)(X2) + MOVD F26, (8*26)(X2) + MOVD F27, (8*27)(X2) + + // Initialize g as nil in case of using g later e.g. 
sigaction in cgo_sigaction.go + MOV X0, g + + MOV A0, _rt0_riscv64_freebsd_lib_argc<>(SB) + MOV A1, _rt0_riscv64_freebsd_lib_argv<>(SB) + + // Synchronous initialization. + MOV $runtime·libpreinit(SB), T0 + JALR RA, T0 + + // Create a new thread to do the runtime initialization and return. + MOV _cgo_sys_thread_create(SB), T0 + BEQZ T0, nocgo + MOV $_rt0_riscv64_freebsd_lib_go(SB), A0 + MOV $0, A1 + JALR RA, T0 + JMP restore + +nocgo: + MOV $0x800000, A0 // stacksize = 8192KB + MOV $_rt0_riscv64_freebsd_lib_go(SB), A1 + MOV A0, 8(X2) + MOV A1, 16(X2) + MOV $runtime·newosproc0(SB), T0 + JALR RA, T0 + +restore: + // Restore callee-save registers, along with X1 (LR). + MOV (8*3)(X2), X1 + MOV (8*4)(X2), X8 + MOV (8*5)(X2), X9 + MOV (8*6)(X2), X18 + MOV (8*7)(X2), X19 + MOV (8*8)(X2), X20 + MOV (8*9)(X2), X21 + MOV (8*10)(X2), X22 + MOV (8*11)(X2), X23 + MOV (8*12)(X2), X24 + MOV (8*13)(X2), X25 + MOV (8*14)(X2), X26 + MOV (8*15)(X2), g + MOVD (8*16)(X2), F8 + MOVD (8*17)(X2), F9 + MOVD (8*18)(X2), F18 + MOVD (8*19)(X2), F19 + MOVD (8*20)(X2), F20 + MOVD (8*21)(X2), F21 + MOVD (8*22)(X2), F22 + MOVD (8*23)(X2), F23 + MOVD (8*24)(X2), F24 + MOVD (8*25)(X2), F25 + MOVD (8*26)(X2), F26 + MOVD (8*27)(X2), F27 + + RET + +TEXT _rt0_riscv64_freebsd_lib_go(SB),NOSPLIT,$0 + MOV _rt0_riscv64_freebsd_lib_argc<>(SB), A0 + MOV _rt0_riscv64_freebsd_lib_argv<>(SB), A1 + MOV $runtime·rt0_go(SB), T0 + JALR ZERO, T0 + +DATA _rt0_riscv64_freebsd_lib_argc<>(SB)/8, $0 +GLOBL _rt0_riscv64_freebsd_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_riscv64_freebsd_lib_argv<>(SB)/8, $0 +GLOBL _rt0_riscv64_freebsd_lib_argv<>(SB),NOPTR, $8 + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOV $runtime·rt0_go(SB), T0 + JALR ZERO, T0 diff --git a/src/runtime/rt0_illumos_amd64.s b/src/runtime/rt0_illumos_amd64.s new file mode 100644 index 0000000..54d35b7 --- /dev/null +++ b/src/runtime/rt0_illumos_amd64.s @@ -0,0 +1,11 @@ +// Copyright 2019 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_illumos(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_illumos_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_ios_amd64.s b/src/runtime/rt0_ios_amd64.s new file mode 100644 index 0000000..c699032 --- /dev/null +++ b/src/runtime/rt0_ios_amd64.s @@ -0,0 +1,14 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// internal linking executable entry point. +// ios/amd64 only supports external linking. +TEXT _rt0_amd64_ios(SB),NOSPLIT|NOFRAME,$0 + UNDEF + +// library entry point. +TEXT _rt0_amd64_ios_lib(SB),NOSPLIT|NOFRAME,$0 + JMP _rt0_amd64_darwin_lib(SB) diff --git a/src/runtime/rt0_ios_arm64.s b/src/runtime/rt0_ios_arm64.s new file mode 100644 index 0000000..dcc8365 --- /dev/null +++ b/src/runtime/rt0_ios_arm64.s @@ -0,0 +1,14 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// internal linking executable entry point. +// ios/arm64 only supports external linking. +TEXT _rt0_arm64_ios(SB),NOSPLIT|NOFRAME,$0 + UNDEF + +// library entry point. +TEXT _rt0_arm64_ios_lib(SB),NOSPLIT|NOFRAME,$0 + JMP _rt0_arm64_darwin_lib(SB) diff --git a/src/runtime/rt0_js_wasm.s b/src/runtime/rt0_js_wasm.s new file mode 100644 index 0000000..714582a --- /dev/null +++ b/src/runtime/rt0_js_wasm.s @@ -0,0 +1,107 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "textflag.h" + +// _rt0_wasm_js is not used itself. It only exists to mark the exported functions as alive. 
+TEXT _rt0_wasm_js(SB),NOSPLIT,$0 + I32Const $wasm_export_run(SB) + Drop + I32Const $wasm_export_resume(SB) + Drop + I32Const $wasm_export_getsp(SB) + Drop + +// wasm_export_run gets called from JavaScript. It initializes the Go runtime and executes Go code until it needs +// to wait for an event. It does NOT follow the Go ABI. It has two WebAssembly parameters: +// R0: argc (i32) +// R1: argv (i32) +TEXT wasm_export_run(SB),NOSPLIT,$0 + MOVD $runtime·wasmStack+(m0Stack__size-16)(SB), SP + + Get SP + Get R0 // argc + I64ExtendI32U + I64Store $0 + + Get SP + Get R1 // argv + I64ExtendI32U + I64Store $8 + + I32Const $0 // entry PC_B + Call runtime·rt0_go(SB) + Drop + Call wasm_pc_f_loop(SB) + + Return + +// wasm_export_resume gets called from JavaScript. It resumes the execution of Go code until it needs to wait for +// an event. +TEXT wasm_export_resume(SB),NOSPLIT,$0 + I32Const $0 + Call runtime·handleEvent(SB) + Drop + Call wasm_pc_f_loop(SB) + + Return + +TEXT wasm_pc_f_loop(SB),NOSPLIT,$0 +// Call the function for the current PC_F. Repeat until PAUSE != 0 indicates pause or exit. +// The WebAssembly stack may unwind, e.g. when switching goroutines. +// The Go stack on the linear memory is then used to jump to the correct functions +// with this loop, without having to restore the full WebAssembly stack. +// It is expected to have a pending call before entering the loop, so check PAUSE first. + Get PAUSE + I32Eqz + If + loop: + Loop + // Get PC_B & PC_F from -8(SP) + Get SP + I32Const $8 + I32Sub + I32Load16U $0 // PC_B + + Get SP + I32Const $8 + I32Sub + I32Load16U $2 // PC_F + + CallIndirect $0 + Drop + + Get PAUSE + I32Eqz + BrIf loop + End + End + + I32Const $0 + Set PAUSE + + Return + +// wasm_export_getsp gets called from JavaScript to retrieve the SP. 
+TEXT wasm_export_getsp(SB),NOSPLIT,$0 + Get SP + Return + +TEXT runtime·pause(SB), NOSPLIT, $0-8 + MOVD newsp+0(FP), SP + I32Const $1 + Set PAUSE + RETUNWIND + +TEXT runtime·exit(SB), NOSPLIT, $0-4 + I32Const $0 + Call runtime·wasmExit(SB) + Drop + I32Const $1 + Set PAUSE + RETUNWIND + +TEXT wasm_export_lib(SB),NOSPLIT,$0 + UNDEF diff --git a/src/runtime/rt0_linux_386.s b/src/runtime/rt0_linux_386.s new file mode 100644 index 0000000..325066f --- /dev/null +++ b/src/runtime/rt0_linux_386.s @@ -0,0 +1,17 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_linux(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +TEXT _rt0_386_linux_lib(SB),NOSPLIT,$0 + JMP _rt0_386_lib(SB) + +TEXT main(SB),NOSPLIT,$0 + // Remove the return address from the stack. + // rt0_go doesn't expect it to be there. + ADDL $4, SP + JMP runtime·rt0_go(SB) diff --git a/src/runtime/rt0_linux_amd64.s b/src/runtime/rt0_linux_amd64.s new file mode 100644 index 0000000..94ff709 --- /dev/null +++ b/src/runtime/rt0_linux_amd64.s @@ -0,0 +1,11 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_linux(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_linux_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_linux_arm.s b/src/runtime/rt0_linux_arm.s new file mode 100644 index 0000000..8a5722f --- /dev/null +++ b/src/runtime/rt0_linux_arm.s @@ -0,0 +1,33 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT _rt0_arm_linux(SB),NOSPLIT|NOFRAME,$0 + MOVW (R13), R0 // argc + MOVW $4(R13), R1 // argv + MOVW $_rt0_arm_linux1(SB), R4 + B (R4) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_arm_linux_lib(SB),NOSPLIT,$0 + B _rt0_arm_lib(SB) + +TEXT _rt0_arm_linux1(SB),NOSPLIT|NOFRAME,$0 + // We first need to detect the kernel ABI, and warn the user + // if the system only supports OABI. + // The strategy here is to call some EABI syscall to see if + // SIGILL is received. + // If you get a SIGILL here, you have the wrong kernel. + + // Save argc and argv (syscall will clobber at least R0). + MOVM.DB.W [R0-R1], (R13) + + // do an EABI syscall + MOVW $20, R7 // sys_getpid + SWI $0 // this will trigger SIGILL on OABI systems + + MOVM.IA.W (R13), [R0-R1] + B runtime·rt0_go(SB) diff --git a/src/runtime/rt0_linux_arm64.s b/src/runtime/rt0_linux_arm64.s new file mode 100644 index 0000000..0eb8fc2 --- /dev/null +++ b/src/runtime/rt0_linux_arm64.s @@ -0,0 +1,73 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "cgo/abi_arm64.h" + +TEXT _rt0_arm64_linux(SB),NOSPLIT|NOFRAME,$0 + MOVD 0(RSP), R0 // argc + ADD $8, RSP, R1 // argv + BL main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_arm64_linux_lib(SB),NOSPLIT,$184 + // Preserve callee-save registers. + SAVE_R19_TO_R28(24) + SAVE_F8_TO_F15(104) + + // Initialize g as null in case of using g later e.g. sigaction in cgo_sigaction.go + MOVD ZR, g + + MOVD R0, _rt0_arm64_linux_lib_argc<>(SB) + MOVD R1, _rt0_arm64_linux_lib_argv<>(SB) + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R4 + BL (R4) + + // Create a new thread to do the runtime initialization and return. 
+ MOVD _cgo_sys_thread_create(SB), R4 + CBZ R4, nocgo + MOVD $_rt0_arm64_linux_lib_go(SB), R0 + MOVD $0, R1 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R4) + ADD $16, RSP + B restore + +nocgo: + MOVD $0x800000, R0 // stacksize = 8192KB + MOVD $_rt0_arm64_linux_lib_go(SB), R1 + MOVD R0, 8(RSP) + MOVD R1, 16(RSP) + MOVD $runtime·newosproc0(SB),R4 + BL (R4) + +restore: + // Restore callee-save registers. + RESTORE_R19_TO_R28(24) + RESTORE_F8_TO_F15(104) + RET + +TEXT _rt0_arm64_linux_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_arm64_linux_lib_argc<>(SB), R0 + MOVD _rt0_arm64_linux_lib_argv<>(SB), R1 + MOVD $runtime·rt0_go(SB),R4 + B (R4) + +DATA _rt0_arm64_linux_lib_argc<>(SB)/8, $0 +GLOBL _rt0_arm64_linux_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_arm64_linux_lib_argv<>(SB)/8, $0 +GLOBL _rt0_arm64_linux_lib_argv<>(SB),NOPTR, $8 + + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·rt0_go(SB), R2 + BL (R2) +exit: + MOVD $0, R0 + MOVD $94, R8 // sys_exit + SVC + B exit diff --git a/src/runtime/rt0_linux_loong64.s b/src/runtime/rt0_linux_loong64.s new file mode 100644 index 0000000..b23ae78 --- /dev/null +++ b/src/runtime/rt0_linux_loong64.s @@ -0,0 +1,24 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_loong64_linux(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _main<>(SB),NOSPLIT|NOFRAME,$0 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. 
+ MOVW 0(R3), R4 // argc + ADDV $8, R3, R5 // argv + JMP main(SB) + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + // in external linking, glibc jumps to main with argc in R4 + // and argv in R5 + + MOVV $runtime·rt0_go(SB), R19 + JMP (R19) diff --git a/src/runtime/rt0_linux_mips64x.s b/src/runtime/rt0_linux_mips64x.s new file mode 100644 index 0000000..e9328b7 --- /dev/null +++ b/src/runtime/rt0_linux_mips64x.s @@ -0,0 +1,38 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +#include "textflag.h" + +TEXT _rt0_mips64_linux(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _rt0_mips64le_linux(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _main<>(SB),NOSPLIT|NOFRAME,$0 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. +#ifdef GOARCH_mips64 + MOVW 4(R29), R4 // argc, big-endian ABI places int32 at offset 4 +#else + MOVW 0(R29), R4 // argc +#endif + ADDV $8, R29, R5 // argv + JMP main(SB) + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + // in external linking, glibc jumps to main with argc in R4 + // and argv in R5 + + // initialize REGSB = PC&0xffffffff00000000 + BGEZAL R0, 1(PC) + SRLV $32, R31, RSB + SLLV $32, RSB + + MOVV $runtime·rt0_go(SB), R1 + JMP (R1) diff --git a/src/runtime/rt0_linux_mipsx.s b/src/runtime/rt0_linux_mipsx.s new file mode 100644 index 0000000..3cbb7fc --- /dev/null +++ b/src/runtime/rt0_linux_mipsx.s @@ -0,0 +1,27 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (mips || mipsle) + +#include "textflag.h" + +TEXT _rt0_mips_linux(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _rt0_mipsle_linux(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _main<>(SB),NOSPLIT|NOFRAME,$0 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. + MOVW 0(R29), R4 // argc + ADD $4, R29, R5 // argv + JMP main(SB) + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + // In external linking, libc jumps to main with argc in R4, argv in R5 + MOVW $runtime·rt0_go(SB), R1 + JMP (R1) diff --git a/src/runtime/rt0_linux_ppc64.s b/src/runtime/rt0_linux_ppc64.s new file mode 100644 index 0000000..c9300a9 --- /dev/null +++ b/src/runtime/rt0_linux_ppc64.s @@ -0,0 +1,35 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +// actually a function descriptor for _main<>(SB) +TEXT _rt0_ppc64_linux(SB),NOSPLIT,$0 + DWORD $_main<>(SB) + DWORD $0 + DWORD $0 + +TEXT main(SB),NOSPLIT,$0 + DWORD $_main<>(SB) + DWORD $0 + DWORD $0 + +TEXT _main<>(SB),NOSPLIT,$-8 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. + // + // TODO(austin): Support ABI v1 dynamic linking entry point + XOR R0, R0 // Note, newer kernels may not always set R0 to 0. 
+ MOVD $runtime·rt0_go(SB), R12 + MOVD R12, CTR + MOVBZ runtime·iscgo(SB), R5 + CMP R5, $0 + BEQ nocgo + BR (CTR) +nocgo: + MOVD 0(R1), R3 // argc + ADD $8, R1, R4 // argv + BR (CTR) diff --git a/src/runtime/rt0_linux_ppc64le.s b/src/runtime/rt0_linux_ppc64le.s new file mode 100644 index 0000000..66f7e7b --- /dev/null +++ b/src/runtime/rt0_linux_ppc64le.s @@ -0,0 +1,184 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "textflag.h" + +TEXT _rt0_ppc64le_linux(SB),NOSPLIT,$0 + XOR R0, R0 // Make sure R0 is zero before _main + BR _main<>(SB) + +TEXT _rt0_ppc64le_linux_lib(SB),NOSPLIT,$-8 + // Start with standard C stack frame layout and linkage. + MOVD LR, R0 + MOVD R0, 16(R1) // Save LR in caller's frame. + MOVW CR, R0 // Save CR in caller's frame + MOVD R0, 8(R1) + MOVDU R1, -320(R1) // Allocate frame. + + // Preserve callee-save registers. + MOVD R14, 24(R1) + MOVD R15, 32(R1) + MOVD R16, 40(R1) + MOVD R17, 48(R1) + MOVD R18, 56(R1) + MOVD R19, 64(R1) + MOVD R20, 72(R1) + MOVD R21, 80(R1) + MOVD R22, 88(R1) + MOVD R23, 96(R1) + MOVD R24, 104(R1) + MOVD R25, 112(R1) + MOVD R26, 120(R1) + MOVD R27, 128(R1) + MOVD R28, 136(R1) + MOVD R29, 144(R1) + MOVD g, 152(R1) // R30 + MOVD R31, 160(R1) + FMOVD F14, 168(R1) + FMOVD F15, 176(R1) + FMOVD F16, 184(R1) + FMOVD F17, 192(R1) + FMOVD F18, 200(R1) + FMOVD F19, 208(R1) + FMOVD F20, 216(R1) + FMOVD F21, 224(R1) + FMOVD F22, 232(R1) + FMOVD F23, 240(R1) + FMOVD F24, 248(R1) + FMOVD F25, 256(R1) + FMOVD F26, 264(R1) + FMOVD F27, 272(R1) + FMOVD F28, 280(R1) + FMOVD F29, 288(R1) + FMOVD F30, 296(R1) + FMOVD F31, 304(R1) + + MOVD R3, _rt0_ppc64le_linux_lib_argc<>(SB) + MOVD R4, _rt0_ppc64le_linux_lib_argv<>(SB) + + // Synchronous initialization. 
+ MOVD $runtime·reginit(SB), R12 + MOVD R12, CTR + BL (CTR) + MOVD $runtime·libpreinit(SB), R12 + MOVD R12, CTR + BL (CTR) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R12 + CMP $0, R12 + BEQ nocgo + MOVD $_rt0_ppc64le_linux_lib_go(SB), R3 + MOVD $0, R4 + MOVD R12, CTR + BL (CTR) + BR done + +nocgo: + MOVD $0x800000, R12 // stacksize = 8192KB + MOVD R12, 8(R1) + MOVD $_rt0_ppc64le_linux_lib_go(SB), R12 + MOVD R12, 16(R1) + MOVD $runtime·newosproc0(SB),R12 + MOVD R12, CTR + BL (CTR) + +done: + // Restore saved registers. + MOVD 24(R1), R14 + MOVD 32(R1), R15 + MOVD 40(R1), R16 + MOVD 48(R1), R17 + MOVD 56(R1), R18 + MOVD 64(R1), R19 + MOVD 72(R1), R20 + MOVD 80(R1), R21 + MOVD 88(R1), R22 + MOVD 96(R1), R23 + MOVD 104(R1), R24 + MOVD 112(R1), R25 + MOVD 120(R1), R26 + MOVD 128(R1), R27 + MOVD 136(R1), R28 + MOVD 144(R1), R29 + MOVD 152(R1), g // R30 + MOVD 160(R1), R31 + FMOVD 168(R1), F14 + FMOVD 176(R1), F15 + FMOVD 184(R1), F16 + FMOVD 192(R1), F17 + FMOVD 200(R1), F18 + FMOVD 208(R1), F19 + FMOVD 216(R1), F20 + FMOVD 224(R1), F21 + FMOVD 232(R1), F22 + FMOVD 240(R1), F23 + FMOVD 248(R1), F24 + FMOVD 256(R1), F25 + FMOVD 264(R1), F26 + FMOVD 272(R1), F27 + FMOVD 280(R1), F28 + FMOVD 288(R1), F29 + FMOVD 296(R1), F30 + FMOVD 304(R1), F31 + + ADD $320, R1 + MOVD 8(R1), R0 + MOVFL R0, $0xff + MOVD 16(R1), R0 + MOVD R0, LR + RET + +TEXT _rt0_ppc64le_linux_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_ppc64le_linux_lib_argc<>(SB), R3 + MOVD _rt0_ppc64le_linux_lib_argv<>(SB), R4 + MOVD $runtime·rt0_go(SB), R12 + MOVD R12, CTR + BR (CTR) + +DATA _rt0_ppc64le_linux_lib_argc<>(SB)/8, $0 +GLOBL _rt0_ppc64le_linux_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_ppc64le_linux_lib_argv<>(SB)/8, $0 +GLOBL _rt0_ppc64le_linux_lib_argv<>(SB),NOPTR, $8 + +TEXT _main<>(SB),NOSPLIT,$-8 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers 
followed by a NULL, and auxv. + // The TLS pointer should be initialized to 0. + // + // In an ELFv2 compliant dynamically linked binary, R3 contains argc, + // R4 contains argv, R5 contains envp, R6 contains auxv, and R13 + // contains the TLS pointer. + // + // When loading via glibc, the first doubleword on the stack points + // to a NULL value (that is, *(uintptr)(R1) == 0). This is used to + // differentiate static vs dynamically linked binaries. + // + // If loading with the musl loader, it doesn't follow the ELFv2 ABI. It + // passes argc/argv similar to the linux kernel, R13 (TLS) is + // initialized, and R3/R4 are undefined. + MOVD (R1), R12 + CMP R0, R12 + BEQ tls_and_argcv_in_reg + + // Arguments are passed via the stack (musl loader or a static binary) + MOVD 0(R1), R3 // argc + ADD $8, R1, R4 // argv + + // Did the TLS pointer get set? If so, don't change it (e.g. musl). + CMP R0, R13 + BNE tls_and_argcv_in_reg + + MOVD $runtime·m0+m_tls(SB), R13 // TLS + ADD $0x7000, R13 + +tls_and_argcv_in_reg: + BR main(SB) + +TEXT main(SB),NOSPLIT,$-8 + MOVD $runtime·rt0_go(SB), R12 + MOVD R12, CTR + BR (CTR) diff --git a/src/runtime/rt0_linux_riscv64.s b/src/runtime/rt0_linux_riscv64.s new file mode 100644 index 0000000..d6b8ac8 --- /dev/null +++ b/src/runtime/rt0_linux_riscv64.s @@ -0,0 +1,112 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_riscv64_linux(SB),NOSPLIT|NOFRAME,$0 + MOV 0(X2), A0 // argc + ADD $8, X2, A1 // argv + JMP main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_riscv64_linux_lib(SB),NOSPLIT,$224 + // Preserve callee-save registers, along with X1 (LR).
+ MOV X1, (8*3)(X2) + MOV X8, (8*4)(X2) + MOV X9, (8*5)(X2) + MOV X18, (8*6)(X2) + MOV X19, (8*7)(X2) + MOV X20, (8*8)(X2) + MOV X21, (8*9)(X2) + MOV X22, (8*10)(X2) + MOV X23, (8*11)(X2) + MOV X24, (8*12)(X2) + MOV X25, (8*13)(X2) + MOV X26, (8*14)(X2) + MOV g, (8*15)(X2) + MOVD F8, (8*16)(X2) + MOVD F9, (8*17)(X2) + MOVD F18, (8*18)(X2) + MOVD F19, (8*19)(X2) + MOVD F20, (8*20)(X2) + MOVD F21, (8*21)(X2) + MOVD F22, (8*22)(X2) + MOVD F23, (8*23)(X2) + MOVD F24, (8*24)(X2) + MOVD F25, (8*25)(X2) + MOVD F26, (8*26)(X2) + MOVD F27, (8*27)(X2) + + // Initialize g as nil in case of using g later e.g. sigaction in cgo_sigaction.go + MOV X0, g + + MOV A0, _rt0_riscv64_linux_lib_argc<>(SB) + MOV A1, _rt0_riscv64_linux_lib_argv<>(SB) + + // Synchronous initialization. + MOV $runtime·libpreinit(SB), T0 + JALR RA, T0 + + // Create a new thread to do the runtime initialization and return. + MOV _cgo_sys_thread_create(SB), T0 + BEQZ T0, nocgo + MOV $_rt0_riscv64_linux_lib_go(SB), A0 + MOV $0, A1 + JALR RA, T0 + JMP restore + +nocgo: + MOV $0x800000, A0 // stacksize = 8192KB + MOV $_rt0_riscv64_linux_lib_go(SB), A1 + MOV A0, 8(X2) + MOV A1, 16(X2) + MOV $runtime·newosproc0(SB), T0 + JALR RA, T0 + +restore: + // Restore callee-save registers, along with X1 (LR). 
+ MOV (8*3)(X2), X1 + MOV (8*4)(X2), X8 + MOV (8*5)(X2), X9 + MOV (8*6)(X2), X18 + MOV (8*7)(X2), X19 + MOV (8*8)(X2), X20 + MOV (8*9)(X2), X21 + MOV (8*10)(X2), X22 + MOV (8*11)(X2), X23 + MOV (8*12)(X2), X24 + MOV (8*13)(X2), X25 + MOV (8*14)(X2), X26 + MOV (8*15)(X2), g + MOVD (8*16)(X2), F8 + MOVD (8*17)(X2), F9 + MOVD (8*18)(X2), F18 + MOVD (8*19)(X2), F19 + MOVD (8*20)(X2), F20 + MOVD (8*21)(X2), F21 + MOVD (8*22)(X2), F22 + MOVD (8*23)(X2), F23 + MOVD (8*24)(X2), F24 + MOVD (8*25)(X2), F25 + MOVD (8*26)(X2), F26 + MOVD (8*27)(X2), F27 + + RET + +TEXT _rt0_riscv64_linux_lib_go(SB),NOSPLIT,$0 + MOV _rt0_riscv64_linux_lib_argc<>(SB), A0 + MOV _rt0_riscv64_linux_lib_argv<>(SB), A1 + MOV $runtime·rt0_go(SB), T0 + JALR ZERO, T0 + +DATA _rt0_riscv64_linux_lib_argc<>(SB)/8, $0 +GLOBL _rt0_riscv64_linux_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_riscv64_linux_lib_argv<>(SB)/8, $0 +GLOBL _rt0_riscv64_linux_lib_argv<>(SB),NOPTR, $8 + + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOV $runtime·rt0_go(SB), T0 + JALR ZERO, T0 diff --git a/src/runtime/rt0_linux_s390x.s b/src/runtime/rt0_linux_s390x.s new file mode 100644 index 0000000..4b62c5a --- /dev/null +++ b/src/runtime/rt0_linux_s390x.s @@ -0,0 +1,23 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_s390x_linux(SB), NOSPLIT|NOFRAME, $0 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. 
+ + MOVD 0(R15), R2 // argc + ADD $8, R15, R3 // argv + BR main(SB) + +TEXT _rt0_s390x_linux_lib(SB), NOSPLIT, $0 + MOVD $_rt0_s390x_lib(SB), R1 + BR R1 + +TEXT main(SB), NOSPLIT|NOFRAME, $0 + MOVD $runtime·rt0_go(SB), R1 + BR R1 diff --git a/src/runtime/rt0_netbsd_386.s b/src/runtime/rt0_netbsd_386.s new file mode 100644 index 0000000..cefc04a --- /dev/null +++ b/src/runtime/rt0_netbsd_386.s @@ -0,0 +1,17 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_netbsd(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +TEXT _rt0_386_netbsd_lib(SB),NOSPLIT,$0 + JMP _rt0_386_lib(SB) + +TEXT main(SB),NOSPLIT,$0 + // Remove the return address from the stack. + // rt0_go doesn't expect it to be there. + ADDL $4, SP + JMP runtime·rt0_go(SB) diff --git a/src/runtime/rt0_netbsd_amd64.s b/src/runtime/rt0_netbsd_amd64.s new file mode 100644 index 0000000..77c7187 --- /dev/null +++ b/src/runtime/rt0_netbsd_amd64.s @@ -0,0 +1,11 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_netbsd(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_netbsd_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_netbsd_arm.s b/src/runtime/rt0_netbsd_arm.s new file mode 100644 index 0000000..503c32a --- /dev/null +++ b/src/runtime/rt0_netbsd_arm.s @@ -0,0 +1,11 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT _rt0_arm_netbsd(SB),NOSPLIT,$0 + B _rt0_arm(SB) + +TEXT _rt0_arm_netbsd_lib(SB),NOSPLIT,$0 + B _rt0_arm_lib(SB) diff --git a/src/runtime/rt0_netbsd_arm64.s b/src/runtime/rt0_netbsd_arm64.s new file mode 100644 index 0000000..691a8e4 --- /dev/null +++ b/src/runtime/rt0_netbsd_arm64.s @@ -0,0 +1,71 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "cgo/abi_arm64.h" + +TEXT _rt0_arm64_netbsd(SB),NOSPLIT|NOFRAME,$0 + MOVD 0(RSP), R0 // argc + ADD $8, RSP, R1 // argv + BL main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_arm64_netbsd_lib(SB),NOSPLIT,$184 + // Preserve callee-save registers. + SAVE_R19_TO_R28(24) + SAVE_F8_TO_F15(104) + + // Initialize g as null in case of using g later e.g. sigaction in cgo_sigaction.go + MOVD ZR, g + + MOVD R0, _rt0_arm64_netbsd_lib_argc<>(SB) + MOVD R1, _rt0_arm64_netbsd_lib_argv<>(SB) + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R4 + BL (R4) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R4 + CBZ R4, nocgo + MOVD $_rt0_arm64_netbsd_lib_go(SB), R0 + MOVD $0, R1 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R4) + ADD $16, RSP + B restore + +nocgo: + MOVD $0x800000, R0 // stacksize = 8192KB + MOVD $_rt0_arm64_netbsd_lib_go(SB), R1 + MOVD R0, 8(RSP) + MOVD R1, 16(RSP) + MOVD $runtime·newosproc0(SB),R4 + BL (R4) + +restore: + // Restore callee-save registers. 
+ RESTORE_R19_TO_R28(24) + RESTORE_F8_TO_F15(104) + RET + +TEXT _rt0_arm64_netbsd_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_arm64_netbsd_lib_argc<>(SB), R0 + MOVD _rt0_arm64_netbsd_lib_argv<>(SB), R1 + MOVD $runtime·rt0_go(SB),R4 + B (R4) + +DATA _rt0_arm64_netbsd_lib_argc<>(SB)/8, $0 +GLOBL _rt0_arm64_netbsd_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_arm64_netbsd_lib_argv<>(SB)/8, $0 +GLOBL _rt0_arm64_netbsd_lib_argv<>(SB),NOPTR, $8 + + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·rt0_go(SB), R2 + BL (R2) +exit: + MOVD $0, R0 + SVC $1 // sys_exit diff --git a/src/runtime/rt0_openbsd_386.s b/src/runtime/rt0_openbsd_386.s new file mode 100644 index 0000000..959f4d6 --- /dev/null +++ b/src/runtime/rt0_openbsd_386.s @@ -0,0 +1,17 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_openbsd(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +TEXT _rt0_386_openbsd_lib(SB),NOSPLIT,$0 + JMP _rt0_386_lib(SB) + +TEXT main(SB),NOSPLIT,$0 + // Remove the return address from the stack. + // rt0_go doesn't expect it to be there. + ADDL $4, SP + JMP runtime·rt0_go(SB) diff --git a/src/runtime/rt0_openbsd_amd64.s b/src/runtime/rt0_openbsd_amd64.s new file mode 100644 index 0000000..c2f3f23 --- /dev/null +++ b/src/runtime/rt0_openbsd_amd64.s @@ -0,0 +1,11 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_openbsd(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_openbsd_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_openbsd_arm.s b/src/runtime/rt0_openbsd_arm.s new file mode 100644 index 0000000..3511c96 --- /dev/null +++ b/src/runtime/rt0_openbsd_arm.s @@ -0,0 +1,11 @@ +// Copyright 2013 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_arm_openbsd(SB),NOSPLIT,$0 + B _rt0_arm(SB) + +TEXT _rt0_arm_openbsd_lib(SB),NOSPLIT,$0 + B _rt0_arm_lib(SB) diff --git a/src/runtime/rt0_openbsd_arm64.s b/src/runtime/rt0_openbsd_arm64.s new file mode 100644 index 0000000..49d49b3 --- /dev/null +++ b/src/runtime/rt0_openbsd_arm64.s @@ -0,0 +1,79 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" +#include "cgo/abi_arm64.h" + +// See comment in runtime/sys_openbsd_arm64.s re this construction. +#define INVOKE_SYSCALL \ + SVC; \ + NOOP; \ + NOOP + +TEXT _rt0_arm64_openbsd(SB),NOSPLIT|NOFRAME,$0 + MOVD 0(RSP), R0 // argc + ADD $8, RSP, R1 // argv + BL main(SB) + +// When building with -buildmode=c-shared, this symbol is called when the shared +// library is loaded. +TEXT _rt0_arm64_openbsd_lib(SB),NOSPLIT,$184 + // Preserve callee-save registers. + SAVE_R19_TO_R28(24) + SAVE_F8_TO_F15(104) + + // Initialize g as null in case of using g later e.g. sigaction in cgo_sigaction.go + MOVD ZR, g + + MOVD R0, _rt0_arm64_openbsd_lib_argc<>(SB) + MOVD R1, _rt0_arm64_openbsd_lib_argv<>(SB) + + // Synchronous initialization. + MOVD $runtime·libpreinit(SB), R4 + BL (R4) + + // Create a new thread to do the runtime initialization and return. + MOVD _cgo_sys_thread_create(SB), R4 + CBZ R4, nocgo + MOVD $_rt0_arm64_openbsd_lib_go(SB), R0 + MOVD $0, R1 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL (R4) + ADD $16, RSP + B restore + +nocgo: + MOVD $0x800000, R0 // stacksize = 8192KB + MOVD $_rt0_arm64_openbsd_lib_go(SB), R1 + MOVD R0, 8(RSP) + MOVD R1, 16(RSP) + MOVD $runtime·newosproc0(SB),R4 + BL (R4) + +restore: + // Restore callee-save registers. 
+ RESTORE_R19_TO_R28(24) + RESTORE_F8_TO_F15(104) + RET + +TEXT _rt0_arm64_openbsd_lib_go(SB),NOSPLIT,$0 + MOVD _rt0_arm64_openbsd_lib_argc<>(SB), R0 + MOVD _rt0_arm64_openbsd_lib_argv<>(SB), R1 + MOVD $runtime·rt0_go(SB),R4 + B (R4) + +DATA _rt0_arm64_openbsd_lib_argc<>(SB)/8, $0 +GLOBL _rt0_arm64_openbsd_lib_argc<>(SB),NOPTR, $8 +DATA _rt0_arm64_openbsd_lib_argv<>(SB)/8, $0 +GLOBL _rt0_arm64_openbsd_lib_argv<>(SB),NOPTR, $8 + + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·rt0_go(SB), R2 + BL (R2) +exit: + MOVD $0, R0 + MOVD $1, R8 // sys_exit + INVOKE_SYSCALL + B exit diff --git a/src/runtime/rt0_openbsd_mips64.s b/src/runtime/rt0_openbsd_mips64.s new file mode 100644 index 0000000..82a8dfa --- /dev/null +++ b/src/runtime/rt0_openbsd_mips64.s @@ -0,0 +1,36 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_mips64_openbsd(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _rt0_mips64le_openbsd(SB),NOSPLIT,$0 + JMP _main<>(SB) + +TEXT _main<>(SB),NOSPLIT|NOFRAME,$0 + // In a statically linked binary, the stack contains argc, + // argv as argc string pointers followed by a NULL, envv as a + // sequence of string pointers followed by a NULL, and auxv. + // There is no TLS base pointer. 
+#ifdef GOARCH_mips64 + MOVW 4(R29), R4 // argc, big-endian ABI places int32 at offset 4 +#else + MOVW 0(R29), R4 // argc +#endif + ADDV $8, R29, R5 // argv + JMP main(SB) + +TEXT main(SB),NOSPLIT|NOFRAME,$0 + // In external linking, libc jumps to main with argc in R4 + // and argv in R5 + + // initialize REGSB = PC&0xffffffff00000000 + BGEZAL R0, 1(PC) + SRLV $32, R31, RSB + SLLV $32, RSB + + MOVV $runtime·rt0_go(SB), R1 + JMP (R1) diff --git a/src/runtime/rt0_plan9_386.s b/src/runtime/rt0_plan9_386.s new file mode 100644 index 0000000..6471615 --- /dev/null +++ b/src/runtime/rt0_plan9_386.s @@ -0,0 +1,21 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_plan9(SB),NOSPLIT,$12 + MOVL AX, _tos(SB) + LEAL 8(SP), AX + MOVL AX, _privates(SB) + MOVL $1, _nprivates(SB) + CALL runtime·asminit(SB) + MOVL inargc-4(FP), AX + MOVL AX, 0(SP) + LEAL inargv+0(FP), AX + MOVL AX, 4(SP) + JMP runtime·rt0_go(SB) + +GLOBL _tos(SB), NOPTR, $4 +GLOBL _privates(SB), NOPTR, $4 +GLOBL _nprivates(SB), NOPTR, $4 diff --git a/src/runtime/rt0_plan9_amd64.s b/src/runtime/rt0_plan9_amd64.s new file mode 100644 index 0000000..6fd493a --- /dev/null +++ b/src/runtime/rt0_plan9_amd64.s @@ -0,0 +1,19 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +#include "textflag.h" + +TEXT _rt0_amd64_plan9(SB),NOSPLIT,$24 + MOVQ AX, _tos(SB) + LEAQ 16(SP), AX + MOVQ AX, _privates(SB) + MOVL $1, _nprivates(SB) + MOVL inargc-8(FP), DI + LEAQ inargv+0(FP), SI + MOVQ $runtime·rt0_go(SB), AX + JMP AX + +GLOBL _tos(SB), NOPTR, $8 +GLOBL _privates(SB), NOPTR, $8 +GLOBL _nprivates(SB), NOPTR, $4 diff --git a/src/runtime/rt0_plan9_arm.s b/src/runtime/rt0_plan9_arm.s new file mode 100644 index 0000000..697a78d --- /dev/null +++ b/src/runtime/rt0_plan9_arm.s @@ -0,0 +1,15 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +//in plan 9 argc is at top of stack followed by ptrs to arguments + +TEXT _rt0_arm_plan9(SB),NOSPLIT|NOFRAME,$0 + MOVW R0, _tos(SB) + MOVW 0(R13), R0 + MOVW $4(R13), R1 + B runtime·rt0_go(SB) + +GLOBL _tos(SB), NOPTR, $4 diff --git a/src/runtime/rt0_solaris_amd64.s b/src/runtime/rt0_solaris_amd64.s new file mode 100644 index 0000000..5c46ded --- /dev/null +++ b/src/runtime/rt0_solaris_amd64.s @@ -0,0 +1,11 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_amd64_solaris(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +TEXT _rt0_amd64_solaris_lib(SB),NOSPLIT,$0 + JMP _rt0_amd64_lib(SB) diff --git a/src/runtime/rt0_windows_386.s b/src/runtime/rt0_windows_386.s new file mode 100644 index 0000000..fa39edd --- /dev/null +++ b/src/runtime/rt0_windows_386.s @@ -0,0 +1,47 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "textflag.h" + +TEXT _rt0_386_windows(SB),NOSPLIT,$0 + JMP _rt0_386(SB) + +// When building with -buildmode=(c-shared or c-archive), this +// symbol is called. 
For dynamic libraries it is called when the +// library is loaded. For static libraries it is called when the +// final executable starts, during the C runtime initialization +// phase. +TEXT _rt0_386_windows_lib(SB),NOSPLIT,$0x1C + MOVL BP, 0x08(SP) + MOVL BX, 0x0C(SP) + MOVL AX, 0x10(SP) + MOVL CX, 0x14(SP) + MOVL DX, 0x18(SP) + + // Create a new thread to do the runtime initialization and return. + MOVL _cgo_sys_thread_create(SB), AX + MOVL $_rt0_386_windows_lib_go(SB), 0x00(SP) + MOVL $0, 0x04(SP) + + // Top two items on the stack are passed to _cgo_sys_thread_create + // as parameters. This is the calling convention on 32-bit Windows. + CALL AX + + MOVL 0x08(SP), BP + MOVL 0x0C(SP), BX + MOVL 0x10(SP), AX + MOVL 0x14(SP), CX + MOVL 0x18(SP), DX + RET + +TEXT _rt0_386_windows_lib_go(SB),NOSPLIT,$0 + PUSHL $0 + PUSHL $0 + JMP runtime·rt0_go(SB) + +TEXT _main(SB),NOSPLIT,$0 + // Remove the return address from the stack. + // rt0_go doesn't expect it to be there. + ADDL $4, SP + JMP runtime·rt0_go(SB) diff --git a/src/runtime/rt0_windows_amd64.s b/src/runtime/rt0_windows_amd64.s new file mode 100644 index 0000000..e60bf4c --- /dev/null +++ b/src/runtime/rt0_windows_amd64.s @@ -0,0 +1,31 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +TEXT _rt0_amd64_windows(SB),NOSPLIT,$-8 + JMP _rt0_amd64(SB) + +// When building with -buildmode=(c-shared or c-archive), this +// symbol is called. For dynamic libraries it is called when the +// library is loaded. For static libraries it is called when the +// final executable starts, during the C runtime initialization +// phase. +// Leave space for four pointers on the stack as required +// by the Windows amd64 calling convention. +TEXT _rt0_amd64_windows_lib(SB),NOSPLIT,$0x20 + // Create a new thread to do the runtime initialization and return. 
+ MOVQ _cgo_sys_thread_create(SB), AX + MOVQ $_rt0_amd64_windows_lib_go(SB), CX + MOVQ $0, DX + CALL AX + RET + +TEXT _rt0_amd64_windows_lib_go(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $runtime·rt0_go(SB), AX + JMP AX diff --git a/src/runtime/rt0_windows_arm.s b/src/runtime/rt0_windows_arm.s new file mode 100644 index 0000000..c5787d0 --- /dev/null +++ b/src/runtime/rt0_windows_arm.s @@ -0,0 +1,12 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// This is the entry point for the program from the +// kernel for an ordinary -buildmode=exe program. +TEXT _rt0_arm_windows(SB),NOSPLIT|NOFRAME,$0 + B ·rt0_go(SB) diff --git a/src/runtime/rt0_windows_arm64.s b/src/runtime/rt0_windows_arm64.s new file mode 100644 index 0000000..bad85c2 --- /dev/null +++ b/src/runtime/rt0_windows_arm64.s @@ -0,0 +1,29 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// This is the entry point for the program from the +// kernel for an ordinary -buildmode=exe program. +TEXT _rt0_arm64_windows(SB),NOSPLIT|NOFRAME,$0 + B ·rt0_go(SB) + +TEXT _rt0_arm64_windows_lib(SB),NOSPLIT|NOFRAME,$0 + MOVD $_rt0_arm64_windows_lib_go(SB), R0 + MOVD $0, R1 + MOVD _cgo_sys_thread_create(SB), R2 + B (R2) + +TEXT _rt0_arm64_windows_lib_go(SB),NOSPLIT|NOFRAME,$0 + MOVD $0, R0 + MOVD $0, R1 + MOVD $runtime·rt0_go(SB), R2 + B (R2) + +TEXT main(SB),NOSPLIT,$0 + MOVD $runtime·rt0_go(SB), R2 + B (R2) + diff --git a/src/runtime/runtime-gdb.py b/src/runtime/runtime-gdb.py new file mode 100644 index 0000000..c4462de --- /dev/null +++ b/src/runtime/runtime-gdb.py @@ -0,0 +1,611 @@ +# Copyright 2010 The Go Authors. All rights reserved. 
+# Use of this source code is governed by a BSD-style +# license that can be found in the LICENSE file. + +"""GDB Pretty printers and convenience functions for Go's runtime structures. + +This script is loaded by GDB when it finds a .debug_gdb_scripts +section in the compiled binary. The [68]l linkers emit this with a +path to this file based on the path to the runtime package. +""" + +# Known issues: +# - pretty printing only works for the 'native' strings. E.g. 'type +# foo string' will make foo a plain struct in the eyes of gdb, +# circumventing the pretty print triggering. + + +from __future__ import print_function +import re +import sys +import gdb + +print("Loading Go Runtime support.", file=sys.stderr) +#http://python3porting.com/differences.html +if sys.version > '3': + xrange = range +# allow to manually reload while developing +goobjfile = gdb.current_objfile() or gdb.objfiles()[0] +goobjfile.pretty_printers = [] + +# G state (runtime2.go) + +def read_runtime_const(varname, default): + try: + return int(gdb.parse_and_eval(varname)) + except Exception: + return int(default) + + +G_IDLE = read_runtime_const("'runtime._Gidle'", 0) +G_RUNNABLE = read_runtime_const("'runtime._Grunnable'", 1) +G_RUNNING = read_runtime_const("'runtime._Grunning'", 2) +G_SYSCALL = read_runtime_const("'runtime._Gsyscall'", 3) +G_WAITING = read_runtime_const("'runtime._Gwaiting'", 4) +G_MORIBUND_UNUSED = read_runtime_const("'runtime._Gmoribund_unused'", 5) +G_DEAD = read_runtime_const("'runtime._Gdead'", 6) +G_ENQUEUE_UNUSED = read_runtime_const("'runtime._Genqueue_unused'", 7) +G_COPYSTACK = read_runtime_const("'runtime._Gcopystack'", 8) +G_SCAN = read_runtime_const("'runtime._Gscan'", 0x1000) +G_SCANRUNNABLE = G_SCAN+G_RUNNABLE +G_SCANRUNNING = G_SCAN+G_RUNNING +G_SCANSYSCALL = G_SCAN+G_SYSCALL +G_SCANWAITING = G_SCAN+G_WAITING + +sts = { + G_IDLE: 'idle', + G_RUNNABLE: 'runnable', + G_RUNNING: 'running', + G_SYSCALL: 'syscall', + G_WAITING: 'waiting', + G_MORIBUND_UNUSED: 
'moribund', + G_DEAD: 'dead', + G_ENQUEUE_UNUSED: 'enqueue', + G_COPYSTACK: 'copystack', + G_SCAN: 'scan', + G_SCANRUNNABLE: 'runnable+s', + G_SCANRUNNING: 'running+s', + G_SCANSYSCALL: 'syscall+s', + G_SCANWAITING: 'waiting+s', +} + + +# +# Value wrappers +# + +class SliceValue: + "Wrapper for slice values." + + def __init__(self, val): + self.val = val + + @property + def len(self): + return int(self.val['len']) + + @property + def cap(self): + return int(self.val['cap']) + + def __getitem__(self, i): + if i < 0 or i >= self.len: + raise IndexError(i) + ptr = self.val["array"] + return (ptr + i).dereference() + + +# +# Pretty Printers +# + +# The patterns for matching types are permissive because gdb 8.2 switched to matching on (we think) typedef names instead of C syntax names. +class StringTypePrinter: + "Pretty print Go strings." + + pattern = re.compile(r'^(struct string( \*)?|string)$') + + def __init__(self, val): + self.val = val + + def display_hint(self): + return 'string' + + def to_string(self): + l = int(self.val['len']) + return self.val['str'].string("utf-8", "ignore", l) + + +class SliceTypePrinter: + "Pretty print slices." + + pattern = re.compile(r'^(struct \[\]|\[\])') + + def __init__(self, val): + self.val = val + + def display_hint(self): + return 'array' + + def to_string(self): + t = str(self.val.type) + if (t.startswith("struct ")): + return t[len("struct "):] + return t + + def children(self): + sval = SliceValue(self.val) + if sval.len > sval.cap: + return + for idx, item in enumerate(sval): + yield ('[{0}]'.format(idx), item) + + +class MapTypePrinter: + """Pretty print map[K]V types. + + Map-typed go variables are really pointers. dereference them in gdb + to inspect their contents with this pretty printer. 
+ """ + + pattern = re.compile(r'^map\[.*\].*$') + + def __init__(self, val): + self.val = val + + def display_hint(self): + return 'map' + + def to_string(self): + return str(self.val.type) + + def children(self): + B = self.val['B'] + buckets = self.val['buckets'] + oldbuckets = self.val['oldbuckets'] + flags = self.val['flags'] + inttype = self.val['hash0'].type + cnt = 0 + for bucket in xrange(2 ** int(B)): + bp = buckets + bucket + if oldbuckets: + oldbucket = bucket & (2 ** (B - 1) - 1) + oldbp = oldbuckets + oldbucket + oldb = oldbp.dereference() + if (oldb['overflow'].cast(inttype) & 1) == 0: # old bucket not evacuated yet + if bucket >= 2 ** (B - 1): + continue # already did old bucket + bp = oldbp + while bp: + b = bp.dereference() + for i in xrange(8): + if b['tophash'][i] != 0: + k = b['keys'][i] + v = b['values'][i] + if flags & 1: + k = k.dereference() + if flags & 2: + v = v.dereference() + yield str(cnt), k + yield str(cnt + 1), v + cnt += 2 + bp = b['overflow'] + + +class ChanTypePrinter: + """Pretty print chan[T] types. + + Chan-typed go variables are really pointers. dereference them in gdb + to inspect their contents with this pretty printer. + """ + + pattern = re.compile(r'^chan ') + + def __init__(self, val): + self.val = val + + def display_hint(self): + return 'array' + + def to_string(self): + return str(self.val.type) + + def children(self): + # see chan.c chanbuf(). et is the type stolen from hchan<T>::recvq->first->elem + et = [x.type for x in self.val['recvq']['first'].type.target().fields() if x.name == 'elem'][0] + ptr = (self.val.address["buf"]).cast(et) + for i in range(self.val["qcount"]): + j = (self.val["recvx"] + i) % self.val["dataqsiz"] + yield ('[{0}]'.format(i), (ptr + j).dereference()) + + +def paramtypematch(t, pattern): + return t.code == gdb.TYPE_CODE_TYPEDEF and str(t).startswith(".param") and pattern.match(str(t.target())) + +# +# Register all the *Printer classes above. 
+# + +def makematcher(klass): + def matcher(val): + try: + if klass.pattern.match(str(val.type)): + return klass(val) + elif paramtypematch(val.type, klass.pattern): + return klass(val.cast(val.type.target())) + except Exception: + pass + return matcher + +goobjfile.pretty_printers.extend([makematcher(var) for var in vars().values() if hasattr(var, 'pattern')]) +# +# Utilities +# + +def pc_to_int(pc): + # python2 will not cast pc (type void*) to an int cleanly + # instead python2 and python3 work with the hex string representation + # of the void pointer which we can parse back into an int. + # int(pc) will not work. + try: + # python3 / newer versions of gdb + pc = int(pc) + except gdb.error: + # str(pc) can return things like + # "0x429d6c <runtime.gopark+284>", so + # chop at first space. + pc = int(str(pc).split(None, 1)[0], 16) + return pc + + +# +# For reference, this is what we're trying to do: +# eface: p *(*(struct 'runtime.rtype'*)'main.e'->type_->data)->string +# iface: p *(*(struct 'runtime.rtype'*)'main.s'->tab->Type->data)->string +# +# interface types can't be recognized by their name, instead we check +# if they have the expected fields. Unfortunately the mapping of +# fields to python attributes in gdb.py isn't complete: you can't test +# for presence other than by trapping. 
+ + +def is_iface(val): + try: + return str(val['tab'].type) == "struct runtime.itab *" and str(val['data'].type) == "void *" + except gdb.error: + pass + + +def is_eface(val): + try: + return str(val['_type'].type) == "struct runtime._type *" and str(val['data'].type) == "void *" + except gdb.error: + pass + + +def lookup_type(name): + try: + return gdb.lookup_type(name) + except gdb.error: + pass + try: + return gdb.lookup_type('struct ' + name) + except gdb.error: + pass + try: + return gdb.lookup_type('struct ' + name[1:]).pointer() + except gdb.error: + pass + + +def iface_commontype(obj): + if is_iface(obj): + go_type_ptr = obj['tab']['_type'] + elif is_eface(obj): + go_type_ptr = obj['_type'] + else: + return + + return go_type_ptr.cast(gdb.lookup_type("struct reflect.rtype").pointer()).dereference() + + +def iface_dtype(obj): + "Decode type of the data field of an eface or iface struct." + # known issue: dtype_name decoded from runtime.rtype is "nested.Foo" + # but the dwarf table lists it as "full/path/to/nested.Foo" + + dynamic_go_type = iface_commontype(obj) + if dynamic_go_type is None: + return + dtype_name = dynamic_go_type['string'].dereference()['str'].string() + + dynamic_gdb_type = lookup_type(dtype_name) + if dynamic_gdb_type is None: + return + + type_size = int(dynamic_go_type['size']) + uintptr_size = int(dynamic_go_type['size'].type.sizeof) # size is itself an uintptr + if type_size > uintptr_size: + dynamic_gdb_type = dynamic_gdb_type.pointer() + + return dynamic_gdb_type + + +def iface_dtype_name(obj): + "Decode type name of the data field of an eface or iface struct." 
+ + dynamic_go_type = iface_commontype(obj) + if dynamic_go_type is None: + return + return dynamic_go_type['string'].dereference()['str'].string() + + +class IfacePrinter: + """Pretty print interface values + + Casts the data field to the appropriate dynamic type.""" + + def __init__(self, val): + self.val = val + + def display_hint(self): + return 'string' + + def to_string(self): + if self.val['data'] == 0: + return 0x0 + try: + dtype = iface_dtype(self.val) + except Exception: + return "<bad dynamic type>" + + if dtype is None: # trouble looking up, print something reasonable + return "({typename}){data}".format( + typename=iface_dtype_name(self.val), data=self.val['data']) + + try: + return self.val['data'].cast(dtype).dereference() + except Exception: + pass + return self.val['data'].cast(dtype) + + +def ifacematcher(val): + if is_iface(val) or is_eface(val): + return IfacePrinter(val) + +goobjfile.pretty_printers.append(ifacematcher) + +# +# Convenience Functions +# + + +class GoLenFunc(gdb.Function): + "Length of strings, slices, maps or channels" + + how = ((StringTypePrinter, 'len'), (SliceTypePrinter, 'len'), (MapTypePrinter, 'count'), (ChanTypePrinter, 'qcount')) + + def __init__(self): + gdb.Function.__init__(self, "len") + + def invoke(self, obj): + typename = str(obj.type) + for klass, fld in self.how: + if klass.pattern.match(typename) or paramtypematch(obj.type, klass.pattern): + return obj[fld] + + +class GoCapFunc(gdb.Function): + "Capacity of slices or channels" + + how = ((SliceTypePrinter, 'cap'), (ChanTypePrinter, 'dataqsiz')) + + def __init__(self): + gdb.Function.__init__(self, "cap") + + def invoke(self, obj): + typename = str(obj.type) + for klass, fld in self.how: + if klass.pattern.match(typename) or paramtypematch(obj.type, klass.pattern): + return obj[fld] + + +class DTypeFunc(gdb.Function): + """Cast Interface values to their dynamic type. + + For non-interface types this behaves as the identity operation. 
+ """ + + def __init__(self): + gdb.Function.__init__(self, "dtype") + + def invoke(self, obj): + try: + return obj['data'].cast(iface_dtype(obj)) + except gdb.error: + pass + return obj + +# +# Commands +# + +def linked_list(ptr, linkfield): + while ptr: + yield ptr + ptr = ptr[linkfield] + + +class GoroutinesCmd(gdb.Command): + "List all goroutines." + + def __init__(self): + gdb.Command.__init__(self, "info goroutines", gdb.COMMAND_STACK, gdb.COMPLETE_NONE) + + def invoke(self, _arg, _from_tty): + # args = gdb.string_to_argv(arg) + vp = gdb.lookup_type('void').pointer() + for ptr in SliceValue(gdb.parse_and_eval("'runtime.allgs'")): + if ptr['atomicstatus']['value'] == G_DEAD: + continue + s = ' ' + if ptr['m']: + s = '*' + pc = ptr['sched']['pc'].cast(vp) + pc = pc_to_int(pc) + blk = gdb.block_for_pc(pc) + status = int(ptr['atomicstatus']['value']) + st = sts.get(status, "unknown(%d)" % status) + print(s, ptr['goid'], "{0:8s}".format(st), blk.function) + + +def find_goroutine(goid): + """ + find_goroutine attempts to find the goroutine identified by goid. + It returns a tuple of gdb.Value's representing the stack pointer + and program counter pointer for the goroutine. + + @param int goid + + @return tuple (gdb.Value, gdb.Value) + """ + vp = gdb.lookup_type('void').pointer() + for ptr in SliceValue(gdb.parse_and_eval("'runtime.allgs'")): + if ptr['atomicstatus']['value'] == G_DEAD: + continue + if ptr['goid'] == goid: + break + else: + return None, None + # Get the goroutine's saved state. + pc, sp = ptr['sched']['pc'], ptr['sched']['sp'] + status = ptr['atomicstatus']['value']&~G_SCAN + # Goroutine is not running nor in syscall, so use the info in goroutine + if status != G_RUNNING and status != G_SYSCALL: + return pc.cast(vp), sp.cast(vp) + + # If the goroutine is in a syscall, use syscallpc/sp. 
+ pc, sp = ptr['syscallpc'], ptr['syscallsp'] + if sp != 0: + return pc.cast(vp), sp.cast(vp) + # Otherwise, the goroutine is running, so it doesn't have + # saved scheduler state. Find G's OS thread. + m = ptr['m'] + if m == 0: + return None, None + for thr in gdb.selected_inferior().threads(): + if thr.ptid[1] == m['procid']: + break + else: + return None, None + # Get scheduler state from the G's OS thread state. + curthr = gdb.selected_thread() + try: + thr.switch() + pc = gdb.parse_and_eval('$pc') + sp = gdb.parse_and_eval('$sp') + finally: + curthr.switch() + return pc.cast(vp), sp.cast(vp) + + +class GoroutineCmd(gdb.Command): + """Execute gdb command in the context of goroutine <goid>. + + Switch PC and SP to the ones in the goroutine's G structure, + execute an arbitrary gdb command, and restore PC and SP. + + Usage: (gdb) goroutine <goid> <gdbcmd> + + You could pass "all" as <goid> to apply <gdbcmd> to all goroutines. + + For example: (gdb) goroutine all <gdbcmd> + + Note that it is ill-defined to modify state in the context of a goroutine. + Restrict yourself to inspecting values. + """ + + def __init__(self): + gdb.Command.__init__(self, "goroutine", gdb.COMMAND_STACK, gdb.COMPLETE_NONE) + + def invoke(self, arg, _from_tty): + goid_str, cmd = arg.split(None, 1) + goids = [] + + if goid_str == 'all': + for ptr in SliceValue(gdb.parse_and_eval("'runtime.allgs'")): + goids.append(int(ptr['goid'])) + else: + goids = [int(gdb.parse_and_eval(goid_str))] + + for goid in goids: + self.invoke_per_goid(goid, cmd) + + def invoke_per_goid(self, goid, cmd): + pc, sp = find_goroutine(goid) + if not pc: + print("No such goroutine: ", goid) + return + pc = pc_to_int(pc) + save_frame = gdb.selected_frame() + gdb.parse_and_eval('$save_sp = $sp') + gdb.parse_and_eval('$save_pc = $pc') + # In GDB, assignments to sp must be done from the + # top-most frame, so select frame 0 first. 
+ gdb.execute('select-frame 0') + gdb.parse_and_eval('$sp = {0}'.format(str(sp))) + gdb.parse_and_eval('$pc = {0}'.format(str(pc))) + try: + gdb.execute(cmd) + finally: + # In GDB, assignments to sp must be done from the + # top-most frame, so select frame 0 first. + gdb.execute('select-frame 0') + gdb.parse_and_eval('$pc = $save_pc') + gdb.parse_and_eval('$sp = $save_sp') + save_frame.select() + + +class GoIfaceCmd(gdb.Command): + "Print Static and dynamic interface types" + + def __init__(self): + gdb.Command.__init__(self, "iface", gdb.COMMAND_DATA, gdb.COMPLETE_SYMBOL) + + def invoke(self, arg, _from_tty): + for obj in gdb.string_to_argv(arg): + try: + #TODO fix quoting for qualified variable names + obj = gdb.parse_and_eval(str(obj)) + except Exception as e: + print("Can't parse ", obj, ": ", e) + continue + + if obj['data'] == 0: + dtype = "nil" + else: + dtype = iface_dtype(obj) + + if dtype is None: + print("Not an interface: ", obj.type) + continue + + print("{0}: {1}".format(obj.type, dtype)) + +# TODO: print interface's methods and dynamic type's func pointers thereof. +#rsc: "to find the number of entries in the itab's Fn field look at +# itab.inter->numMethods +# i am sure i have the names wrong but look at the interface type +# and its method count" +# so Itype will start with a commontype which has kind = interface + +# +# Register all convenience functions and CLI commands +# +GoLenFunc() +GoCapFunc() +DTypeFunc() +GoroutinesCmd() +GoroutineCmd() +GoIfaceCmd() diff --git a/src/runtime/runtime-gdb_test.go b/src/runtime/runtime-gdb_test.go new file mode 100644 index 0000000..4e7c227 --- /dev/null +++ b/src/runtime/runtime-gdb_test.go @@ -0,0 +1,783 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "bytes" + "flag" + "fmt" + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "regexp" + "runtime" + "strconv" + "strings" + "testing" + "time" +) + +// NOTE: In some configurations, GDB will segfault when sent a SIGWINCH signal. +// Some runtime tests send SIGWINCH to the entire process group, so those tests +// must never run in parallel with GDB tests. +// +// See issue 39021 and https://sourceware.org/bugzilla/show_bug.cgi?id=26056. + +func checkGdbEnvironment(t *testing.T) { + testenv.MustHaveGoBuild(t) + switch runtime.GOOS { + case "darwin": + t.Skip("gdb does not work on darwin") + case "netbsd": + t.Skip("gdb does not work with threads on NetBSD; see https://golang.org/issue/22893 and https://gnats.netbsd.org/52548") + case "windows": + t.Skip("gdb tests fail on Windows: https://golang.org/issue/22687") + case "linux": + if runtime.GOARCH == "ppc64" { + t.Skip("skipping gdb tests on linux/ppc64; see https://golang.org/issue/17366") + } + if runtime.GOARCH == "mips" { + t.Skip("skipping gdb tests on linux/mips; see https://golang.org/issue/25939") + } + // Disable GDB tests on alpine until issue #54352 resolved. + if strings.HasSuffix(testenv.Builder(), "-alpine") { + t.Skip("skipping gdb tests on alpine; see https://golang.org/issue/54352") + } + case "freebsd": + t.Skip("skipping gdb tests on FreeBSD; see https://golang.org/issue/29508") + case "aix": + if testing.Short() { + t.Skip("skipping gdb tests on AIX; see https://golang.org/issue/35710") + } + case "plan9": + t.Skip("there is no gdb on Plan 9") + } + if final := os.Getenv("GOROOT_FINAL"); final != "" && testenv.GOROOT(t) != final { + t.Skip("gdb test can fail with GOROOT_FINAL pending") + } +} + +func checkGdbVersion(t *testing.T) { + // Issue 11214 reports various failures with older versions of gdb. 
+ out, err := exec.Command("gdb", "--version").CombinedOutput() + if err != nil { + t.Skipf("skipping: error executing gdb: %v", err) + } + re := regexp.MustCompile(`([0-9]+)\.([0-9]+)`) + matches := re.FindSubmatch(out) + if len(matches) < 3 { + t.Skipf("skipping: can't determine gdb version from\n%s\n", out) + } + major, err1 := strconv.Atoi(string(matches[1])) + minor, err2 := strconv.Atoi(string(matches[2])) + if err1 != nil || err2 != nil { + t.Skipf("skipping: can't determine gdb version: %v, %v", err1, err2) + } + if major < 7 || (major == 7 && minor < 7) { + t.Skipf("skipping: gdb version %d.%d too old", major, minor) + } + t.Logf("gdb version %d.%d", major, minor) +} + +func checkGdbPython(t *testing.T) { + if runtime.GOOS == "solaris" || runtime.GOOS == "illumos" { + t.Skip("skipping gdb python tests on illumos and solaris; see golang.org/issue/20821") + } + + cmd := exec.Command("gdb", "-nx", "-q", "--batch", "-iex", "python import sys; print('go gdb python support')") + out, err := cmd.CombinedOutput() + + if err != nil { + t.Skipf("skipping due to issue running gdb: %v", err) + } + if strings.TrimSpace(string(out)) != "go gdb python support" { + t.Skipf("skipping due to lack of python gdb support: %s", out) + } +} + +// checkCleanBacktrace checks that the given backtrace is well formed and does +// not contain any error messages from GDB. +func checkCleanBacktrace(t *testing.T, backtrace string) { + backtrace = strings.TrimSpace(backtrace) + lines := strings.Split(backtrace, "\n") + if len(lines) == 0 { + t.Fatalf("empty backtrace") + } + for i, l := range lines { + if !strings.HasPrefix(l, fmt.Sprintf("#%v ", i)) { + t.Fatalf("malformed backtrace at line %v: %v", i, l) + } + } + // TODO(mundaym): check for unknown frames (e.g. "??"). 
+} + +const helloSource = ` +import "fmt" +import "runtime" +var gslice []string +func main() { + mapvar := make(map[string]string, 13) + slicemap := make(map[string][]string,11) + chanint := make(chan int, 10) + chanstr := make(chan string, 10) + chanint <- 99 + chanint <- 11 + chanstr <- "spongepants" + chanstr <- "squarebob" + mapvar["abc"] = "def" + mapvar["ghi"] = "jkl" + slicemap["a"] = []string{"b","c","d"} + slicemap["e"] = []string{"f","g","h"} + strvar := "abc" + ptrvar := &strvar + slicevar := make([]string, 0, 16) + slicevar = append(slicevar, mapvar["abc"]) + fmt.Println("hi") + runtime.KeepAlive(ptrvar) + _ = ptrvar // set breakpoint here + gslice = slicevar + fmt.Printf("%v, %v, %v\n", slicemap, <-chanint, <-chanstr) + runtime.KeepAlive(mapvar) +} // END_OF_PROGRAM +` + +func lastLine(src []byte) int { + eop := []byte("END_OF_PROGRAM") + for i, l := range bytes.Split(src, []byte("\n")) { + if bytes.Contains(l, eop) { + return i + } + } + return 0 +} + +func TestGdbPython(t *testing.T) { + testGdbPython(t, false) +} + +func TestGdbPythonCgo(t *testing.T) { + if strings.HasPrefix(runtime.GOARCH, "mips") { + testenv.SkipFlaky(t, 37794) + } + testGdbPython(t, true) +} + +func testGdbPython(t *testing.T, cgo bool) { + if cgo { + testenv.MustHaveCGO(t) + } + + checkGdbEnvironment(t) + t.Parallel() + checkGdbVersion(t) + checkGdbPython(t) + + dir := t.TempDir() + + var buf bytes.Buffer + buf.WriteString("package main\n") + if cgo { + buf.WriteString(`import "C"` + "\n") + } + buf.WriteString(helloSource) + + src := buf.Bytes() + + // Locate breakpoint line + var bp int + lines := bytes.Split(src, []byte("\n")) + for i, line := range lines { + if bytes.Contains(line, []byte("breakpoint")) { + bp = i + break + } + } + + err := os.WriteFile(filepath.Join(dir, "main.go"), src, 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + nLines := lastLine(src) + + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", "a.exe", "main.go") + 
cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + args := []string{"-nx", "-q", "--batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + "-ex", "set print thread-events off", + } + if cgo { + // When we build the cgo version of the program, the system's + // linker is used. Some external linkers, like GNU gold, + // compress the .debug_gdb_scripts into .zdebug_gdb_scripts. + // Until gold and gdb can work together, temporarily load the + // python script directly. + args = append(args, + "-ex", "source "+filepath.Join(testenv.GOROOT(t), "src", "runtime", "runtime-gdb.py"), + ) + } else { + args = append(args, + "-ex", "info auto-load python-scripts", + ) + } + args = append(args, + "-ex", "set python print-stack full", + "-ex", fmt.Sprintf("br main.go:%d", bp), + "-ex", "run", + "-ex", "echo BEGIN info goroutines\n", + "-ex", "info goroutines", + "-ex", "echo END\n", + "-ex", "echo BEGIN print mapvar\n", + "-ex", "print mapvar", + "-ex", "echo END\n", + "-ex", "echo BEGIN print slicemap\n", + "-ex", "print slicemap", + "-ex", "echo END\n", + "-ex", "echo BEGIN print strvar\n", + "-ex", "print strvar", + "-ex", "echo END\n", + "-ex", "echo BEGIN print chanint\n", + "-ex", "print chanint", + "-ex", "echo END\n", + "-ex", "echo BEGIN print chanstr\n", + "-ex", "print chanstr", + "-ex", "echo END\n", + "-ex", "echo BEGIN info locals\n", + "-ex", "info locals", + "-ex", "echo END\n", + "-ex", "echo BEGIN goroutine 1 bt\n", + "-ex", "goroutine 1 bt", + "-ex", "echo END\n", + "-ex", "echo BEGIN goroutine all bt\n", + "-ex", "goroutine all bt", + "-ex", "echo END\n", + "-ex", "clear main.go:15", // clear the previous break point + "-ex", fmt.Sprintf("br main.go:%d", nLines), // new break point at the end of main + "-ex", "c", + "-ex", "echo BEGIN goroutine 1 bt at the end\n", + "-ex", "goroutine 1 
bt", + "-ex", "echo END\n", + filepath.Join(dir, "a.exe"), + ) + got, err := exec.Command("gdb", args...).CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + t.Fatalf("gdb exited with error: %v", err) + } + + firstLine, _, _ := bytes.Cut(got, []byte("\n")) + if string(firstLine) != "Loading Go Runtime support." { + // This can happen when using all.bash with + // GOROOT_FINAL set, because the tests are run before + // the final installation of the files. + cmd := exec.Command(testenv.GoToolPath(t), "env", "GOROOT") + cmd.Env = []string{} + out, err := cmd.CombinedOutput() + if err != nil && bytes.Contains(out, []byte("cannot find GOROOT")) { + t.Skipf("skipping because GOROOT=%s does not exist", testenv.GOROOT(t)) + } + + _, file, _, _ := runtime.Caller(1) + + t.Logf("package testing source file: %s", file) + t.Fatalf("failed to load Go runtime support: %s\n%s", firstLine, got) + } + + // Extract named BEGIN...END blocks from output + partRe := regexp.MustCompile(`(?ms)^BEGIN ([^\n]*)\n(.*?)\nEND`) + blocks := map[string]string{} + for _, subs := range partRe.FindAllSubmatch(got, -1) { + blocks[string(subs[1])] = string(subs[2]) + } + + infoGoroutinesRe := regexp.MustCompile(`\*\s+\d+\s+running\s+`) + if bl := blocks["info goroutines"]; !infoGoroutinesRe.MatchString(bl) { + t.Fatalf("info goroutines failed: %s", bl) + } + + printMapvarRe1 := regexp.MustCompile(`^\$[0-9]+ = map\[string\]string = {\[(0x[0-9a-f]+\s+)?"abc"\] = (0x[0-9a-f]+\s+)?"def", \[(0x[0-9a-f]+\s+)?"ghi"\] = (0x[0-9a-f]+\s+)?"jkl"}$`) + printMapvarRe2 := regexp.MustCompile(`^\$[0-9]+ = map\[string\]string = {\[(0x[0-9a-f]+\s+)?"ghi"\] = (0x[0-9a-f]+\s+)?"jkl", \[(0x[0-9a-f]+\s+)?"abc"\] = (0x[0-9a-f]+\s+)?"def"}$`) + if bl := blocks["print mapvar"]; !printMapvarRe1.MatchString(bl) && + !printMapvarRe2.MatchString(bl) { + t.Fatalf("print mapvar failed: %s", bl) + } + + // 2 orders, and possible differences in spacing. 
+ sliceMapSfx1 := `map[string][]string = {["e"] = []string = {"f", "g", "h"}, ["a"] = []string = {"b", "c", "d"}}` + sliceMapSfx2 := `map[string][]string = {["a"] = []string = {"b", "c", "d"}, ["e"] = []string = {"f", "g", "h"}}` + if bl := strings.ReplaceAll(blocks["print slicemap"], "  ", " "); !strings.HasSuffix(bl, sliceMapSfx1) && !strings.HasSuffix(bl, sliceMapSfx2) { + t.Fatalf("print slicemap failed: %s", bl) + } + + chanIntSfx := `chan int = {99, 11}` + if bl := strings.ReplaceAll(blocks["print chanint"], "  ", " "); !strings.HasSuffix(bl, chanIntSfx) { + t.Fatalf("print chanint failed: %s", bl) + } + + chanStrSfx := `chan string = {"spongepants", "squarebob"}` + if bl := strings.ReplaceAll(blocks["print chanstr"], "  ", " "); !strings.HasSuffix(bl, chanStrSfx) { + t.Fatalf("print chanstr failed: %s", bl) + } + + strVarRe := regexp.MustCompile(`^\$[0-9]+ = (0x[0-9a-f]+\s+)?"abc"$`) + if bl := blocks["print strvar"]; !strVarRe.MatchString(bl) { + t.Fatalf("print strvar failed: %s", bl) + } + + // The exact format of composite values has changed over time. + // For issue 16338: ssa decompose phase split a slice into + // a collection of scalar vars holding its fields. In such cases + // the DWARF variable location expression should be of the + // form "var.field" and not just "field". + // However, the newer dwarf location list code reconstituted + // aggregates from their fields and reverted their printing + // back to its original form. + // Only test that all variables are listed in 'info locals' since + // different versions of gdb print variables in different + // order and with differing amount of information and formats. + + if bl := blocks["info locals"]; !strings.Contains(bl, "slicevar") || + !strings.Contains(bl, "mapvar") || + !strings.Contains(bl, "strvar") { + t.Fatalf("info locals failed: %s", bl) + } + + // Check that the backtraces are well formed. 
+ checkCleanBacktrace(t, blocks["goroutine 1 bt"]) + checkCleanBacktrace(t, blocks["goroutine 1 bt at the end"]) + + btGoroutine1Re := regexp.MustCompile(`(?m)^#0\s+(0x[0-9a-f]+\s+in\s+)?main\.main.+at`) + if bl := blocks["goroutine 1 bt"]; !btGoroutine1Re.MatchString(bl) { + t.Fatalf("goroutine 1 bt failed: %s", bl) + } + + if bl := blocks["goroutine all bt"]; !btGoroutine1Re.MatchString(bl) { + t.Fatalf("goroutine all bt failed: %s", bl) + } + + btGoroutine1AtTheEndRe := regexp.MustCompile(`(?m)^#0\s+(0x[0-9a-f]+\s+in\s+)?main\.main.+at`) + if bl := blocks["goroutine 1 bt at the end"]; !btGoroutine1AtTheEndRe.MatchString(bl) { + t.Fatalf("goroutine 1 bt at the end failed: %s", bl) + } +} + +const backtraceSource = ` +package main + +//go:noinline +func aaa() bool { return bbb() } + +//go:noinline +func bbb() bool { return ccc() } + +//go:noinline +func ccc() bool { return ddd() } + +//go:noinline +func ddd() bool { return f() } + +//go:noinline +func eee() bool { return true } + +var f = eee + +func main() { + _ = aaa() +} +` + +// TestGdbBacktrace tests that gdb can unwind the stack correctly +// using only the DWARF debug info. +func TestGdbBacktrace(t *testing.T) { + if runtime.GOOS == "netbsd" { + testenv.SkipFlaky(t, 15603) + } + if flag.Lookup("test.parallel").Value.(flag.Getter).Get().(int) < 2 { + // It is possible that this test will hang for a long time due to an + // apparent GDB bug reported in https://go.dev/issue/37405. + // If test parallelism is high enough, that might be ok: the other parallel + // tests will finish, and then this test will finish right before it would + // time out. However, if tests are running sequentially, a hang in this test + // would likely cause the remaining tests to run out of time. + testenv.SkipFlaky(t, 37405) + } + + checkGdbEnvironment(t) + t.Parallel() + checkGdbVersion(t) + + dir := t.TempDir() + + // Build the source code. 
+ src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(backtraceSource), 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", "a.exe", "main.go") + cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + // Execute gdb commands. + start := time.Now() + args := []string{"-nx", "-batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + "-ex", "break main.eee", + "-ex", "run", + "-ex", "backtrace", + "-ex", "continue", + filepath.Join(dir, "a.exe"), + } + cmd = testenv.Command(t, "gdb", args...) + + // Work around the GDB hang reported in https://go.dev/issue/37405. + // Sometimes (rarely), the GDB process hangs completely when the Go program + // exits, and we suspect that the bug is on the GDB side. + // + // The default Cancel function added by testenv.Command will mark the test as + // failed if it is in danger of timing out, but we want to instead mark it as + // skipped. Change the Cancel function to kill the process and merely log + // instead of failing the test. + // + // (This approach does not scale: if the test parallelism is less than or + // equal to the number of tests that run right up to the deadline, then the + // remaining parallel tests are likely to time out. But as long as it's just + // this one flaky test, it's probably fine..?) + // + // If there is no deadline set on the test at all, relying on the timeout set + // by testenv.Command will cause the test to hang indefinitely, but that's + // what “no deadline” means, after all — and it's probably the right behavior + // anyway if someone is trying to investigate and fix the GDB bug. 
+ cmd.Cancel = func() error { + t.Logf("GDB command timed out after %v: %v", time.Since(start), cmd) + return cmd.Process.Kill() + } + + got, err := cmd.CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + if bytes.Contains(got, []byte("internal-error: wait returned unexpected status 0x0")) { + // GDB bug: https://sourceware.org/bugzilla/show_bug.cgi?id=28551 + testenv.SkipFlaky(t, 43068) + } + if bytes.Contains(got, []byte("Couldn't get registers: No such process.")) { + // GDB bug: https://sourceware.org/bugzilla/show_bug.cgi?id=9086 + testenv.SkipFlaky(t, 50838) + } + if bytes.Contains(got, []byte(" exited normally]\n")) { + // GDB bug: Sometimes the inferior exits fine, + // but then GDB hangs. + testenv.SkipFlaky(t, 37405) + } + t.Fatalf("gdb exited with error: %v", err) + } + + // Check that the backtrace matches the source code. + bt := []string{ + "eee", + "ddd", + "ccc", + "bbb", + "aaa", + "main", + } + for i, name := range bt { + s := fmt.Sprintf("#%v.*main\\.%v", i, name) + re := regexp.MustCompile(s) + if found := re.Find(got) != nil; !found { + t.Fatalf("could not find '%v' in backtrace", s) + } + } +} + +const autotmpTypeSource = ` +package main + +type astruct struct { + a, b int +} + +func main() { + var iface interface{} = map[string]astruct{} + var iface2 interface{} = []astruct{} + println(iface, iface2) +} +` + +// TestGdbAutotmpTypes ensures that types of autotmp variables appear in .debug_info +// See bug #17830. +func TestGdbAutotmpTypes(t *testing.T) { + checkGdbEnvironment(t) + t.Parallel() + checkGdbVersion(t) + + if runtime.GOOS == "aix" && testing.Short() { + t.Skip("TestGdbAutotmpTypes is too slow on aix/ppc64") + } + + dir := t.TempDir() + + // Build the source code. 
+ src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(autotmpTypeSource), 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + cmd := exec.Command(testenv.GoToolPath(t), "build", "-gcflags=all=-N -l", "-o", "a.exe", "main.go") + cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + // Execute gdb commands. + args := []string{"-nx", "-batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + // Some gdb versions may set scheduler-locking to "step" by default. This prevents background tasks + // (e.g., GC) from completing, which may result in a hang when executing the step command. + // See #49852. + "-ex", "set scheduler-locking off", + "-ex", "break main.main", + "-ex", "run", + "-ex", "step", + "-ex", "info types astruct", + filepath.Join(dir, "a.exe"), + } + got, err := exec.Command("gdb", args...).CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + t.Fatalf("gdb exited with error: %v", err) + } + + sgot := string(got) + + // Check that the expected types appear in the output. + types := []string{ + "[]main.astruct;", + "bucket<string,main.astruct>;", + "hash<string,main.astruct>;", + "main.astruct;", + "hash<string,main.astruct> * map[string]main.astruct;", + } + for _, name := range types { + if !strings.Contains(sgot, name) { + t.Fatalf("could not find %s in 'info types astruct' output", name) + } + } +} + +const constsSource = ` +package main + +const aConstant int = 42 +const largeConstant uint64 = ^uint64(0) +const minusOne int64 = -1 + +func main() { + println("hello world") +} +` + +func TestGdbConst(t *testing.T) { + checkGdbEnvironment(t) + t.Parallel() + checkGdbVersion(t) + + dir := t.TempDir() + + // Build the source code. 
+ src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(constsSource), 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + cmd := exec.Command(testenv.GoToolPath(t), "build", "-gcflags=all=-N -l", "-o", "a.exe", "main.go") + cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + // Execute gdb commands. + args := []string{"-nx", "-batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + "-ex", "break main.main", + "-ex", "run", + "-ex", "print main.aConstant", + "-ex", "print main.largeConstant", + "-ex", "print main.minusOne", + "-ex", "print 'runtime.mSpanInUse'", + "-ex", "print 'runtime._PageSize'", + filepath.Join(dir, "a.exe"), + } + got, err := exec.Command("gdb", args...).CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + t.Fatalf("gdb exited with error: %v", err) + } + + sgot := strings.ReplaceAll(string(got), "\r\n", "\n") + + if !strings.Contains(sgot, "\n$1 = 42\n$2 = 18446744073709551615\n$3 = -1\n$4 = 1 '\\001'\n$5 = 8192") { + t.Fatalf("output mismatch") + } +} + +const panicSource = ` +package main + +import "runtime/debug" + +func main() { + debug.SetTraceback("crash") + crash() +} + +func crash() { + panic("panic!") +} +` + +// TestGdbPanic tests that gdb can unwind the stack correctly +// from SIGABRTs from Go panics. +func TestGdbPanic(t *testing.T) { + checkGdbEnvironment(t) + t.Parallel() + checkGdbVersion(t) + + dir := t.TempDir() + + // Build the source code. 
+ src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(panicSource), 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", "a.exe", "main.go") + cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + // Execute gdb commands. + args := []string{"-nx", "-batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + "-ex", "run", + "-ex", "backtrace", + filepath.Join(dir, "a.exe"), + } + got, err := exec.Command("gdb", args...).CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + t.Fatalf("gdb exited with error: %v", err) + } + + // Check that the backtrace matches the source code. + bt := []string{ + `crash`, + `main`, + } + for _, name := range bt { + s := fmt.Sprintf("(#.* .* in )?main\\.%v", name) + re := regexp.MustCompile(s) + if found := re.Find(got) != nil; !found { + t.Fatalf("could not find '%v' in backtrace", s) + } + } +} + +const InfCallstackSource = ` +package main +import "C" +import "time" + +func loop() { + for i := 0; i < 1000; i++ { + time.Sleep(time.Millisecond*5) + } +} + +func main() { + go loop() + time.Sleep(time.Second * 1) +} +` + +// TestGdbInfCallstack tests that gdb can unwind the callstack of cgo programs +// on arm64 platforms without endless frames of function 'crossfunc1'. +// https://golang.org/issue/37238 +func TestGdbInfCallstack(t *testing.T) { + checkGdbEnvironment(t) + + testenv.MustHaveCGO(t) + if runtime.GOARCH != "arm64" { + t.Skip("skipping infinite callstack test on non-arm64 arches") + } + + t.Parallel() + checkGdbVersion(t) + + dir := t.TempDir() + + // Build the source code. 
+ src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(InfCallstackSource), 0644) + if err != nil { + t.Fatalf("failed to create file: %v", err) + } + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", "a.exe", "main.go") + cmd.Dir = dir + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + // Execute gdb commands. + // 'setg_gcc' is the first point where we can reproduce the issue with just one 'run' command. + args := []string{"-nx", "-batch", + "-iex", "add-auto-load-safe-path " + filepath.Join(testenv.GOROOT(t), "src", "runtime"), + "-ex", "set startup-with-shell off", + "-ex", "break setg_gcc", + "-ex", "run", + "-ex", "backtrace 3", + "-ex", "disable 1", + "-ex", "continue", + filepath.Join(dir, "a.exe"), + } + got, err := exec.Command("gdb", args...).CombinedOutput() + t.Logf("gdb output:\n%s", got) + if err != nil { + t.Fatalf("gdb exited with error: %v", err) + } + + // Check that the backtrace matches the expected frames. + // We check only the 3 innermost frames, as they are certain to be present according to gcc_<OS>_arm64.c. + bt := []string{ + `setg_gcc`, + `crosscall1`, + `threadentry`, + } + for i, name := range bt { + s := fmt.Sprintf("#%v.*%v", i, name) + re := regexp.MustCompile(s) + if found := re.Find(got) != nil; !found { + t.Fatalf("could not find '%v' in backtrace", s) + } + } +} diff --git a/src/runtime/runtime-lldb_test.go b/src/runtime/runtime-lldb_test.go new file mode 100644 index 0000000..19a6cc6 --- /dev/null +++ b/src/runtime/runtime-lldb_test.go @@ -0,0 +1,185 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "internal/testenv" + "os" + "os/exec" + "path/filepath" + "runtime" + "strings" + "testing" +) + +var lldbPath string + +func checkLldbPython(t *testing.T) { + cmd := exec.Command("lldb", "-P") + out, err := cmd.CombinedOutput() + if err != nil { + t.Skipf("skipping due to issue running lldb: %v\n%s", err, out) + } + lldbPath = strings.TrimSpace(string(out)) + + cmd = exec.Command("/usr/bin/python2.7", "-c", "import sys;sys.path.append(sys.argv[1]);import lldb; print('go lldb python support')", lldbPath) + out, err = cmd.CombinedOutput() + + if err != nil { + t.Skipf("skipping due to issue running python: %v\n%s", err, out) + } + if string(out) != "go lldb python support\n" { + t.Skipf("skipping due to lack of python lldb support: %s", out) + } + + if runtime.GOOS == "darwin" { + // Try to see if we have debugging permissions. + cmd = exec.Command("/usr/sbin/DevToolsSecurity", "-status") + out, err = cmd.CombinedOutput() + if err != nil { + t.Skipf("DevToolsSecurity failed: %v", err) + } else if !strings.Contains(string(out), "enabled") { + t.Skip(string(out)) + } + cmd = exec.Command("/usr/bin/groups") + out, err = cmd.CombinedOutput() + if err != nil { + t.Skipf("groups failed: %v", err) + } else if !strings.Contains(string(out), "_developer") { + t.Skip("Not in _developer group") + } + } +} + +const lldbHelloSource = ` +package main +import "fmt" +func main() { + mapvar := make(map[string]string,5) + mapvar["abc"] = "def" + mapvar["ghi"] = "jkl" + intvar := 42 + ptrvar := &intvar + fmt.Println("hi") // line 10 + _ = ptrvar +} +` + +const lldbScriptSource = ` +import sys +sys.path.append(sys.argv[1]) +import lldb +import os + +TIMEOUT_SECS = 5 + +debugger = lldb.SBDebugger.Create() +debugger.SetAsync(True) +target = debugger.CreateTargetWithFileAndArch("a.exe", None) +if target: + print "Created target" + main_bp = target.BreakpointCreateByLocation("main.go", 10) + if main_bp: + print "Created breakpoint" + process = 
target.LaunchSimple(None, None, os.getcwd()) + if process: + print "Process launched" + listener = debugger.GetListener() + process.broadcaster.AddListener(listener, lldb.SBProcess.eBroadcastBitStateChanged) + while True: + event = lldb.SBEvent() + if listener.WaitForEvent(TIMEOUT_SECS, event): + if lldb.SBProcess.GetRestartedFromEvent(event): + continue + state = process.GetState() + if state in [lldb.eStateUnloaded, lldb.eStateLaunching, lldb.eStateRunning]: + continue + else: + print "Timeout launching" + break + if state == lldb.eStateStopped: + for t in process.threads: + if t.GetStopReason() == lldb.eStopReasonBreakpoint: + print "Hit breakpoint" + frame = t.GetFrameAtIndex(0) + if frame: + if frame.line_entry: + print "Stopped at %s:%d" % (frame.line_entry.file.basename, frame.line_entry.line) + if frame.function: + print "Stopped in %s" % (frame.function.name,) + var = frame.FindVariable('intvar') + if var: + print "intvar = %s" % (var.GetValue(),) + else: + print "no intvar" + else: + print "Process state", state + process.Destroy() +else: + print "Failed to create target a.exe" + +lldb.SBDebugger.Destroy(debugger) +sys.exit() +` + +const expectedLldbOutput = `Created target +Created breakpoint +Process launched +Hit breakpoint +Stopped at main.go:10 +Stopped in main.main +intvar = 42 +` + +func TestLldbPython(t *testing.T) { + testenv.MustHaveGoBuild(t) + if final := os.Getenv("GOROOT_FINAL"); final != "" && runtime.GOROOT() != final { + t.Skip("gdb test can fail with GOROOT_FINAL pending") + } + testenv.SkipFlaky(t, 31188) + + checkLldbPython(t) + + dir := t.TempDir() + + src := filepath.Join(dir, "main.go") + err := os.WriteFile(src, []byte(lldbHelloSource), 0644) + if err != nil { + t.Fatalf("failed to create src file: %v", err) + } + + mod := filepath.Join(dir, "go.mod") + err = os.WriteFile(mod, []byte("module lldbtest"), 0644) + if err != nil { + t.Fatalf("failed to create mod file: %v", err) + } + + // As of 2018-07-17, lldb doesn't support 
compressed DWARF, so + // disable it for this test. + cmd := exec.Command(testenv.GoToolPath(t), "build", "-gcflags=all=-N -l", "-ldflags=-compressdwarf=false", "-o", "a.exe") + cmd.Dir = dir + cmd.Env = append(os.Environ(), "GOPATH=") // issue 31100 + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("building source %v\n%s", err, out) + } + + src = filepath.Join(dir, "script.py") + err = os.WriteFile(src, []byte(lldbScriptSource), 0755) + if err != nil { + t.Fatalf("failed to create script: %v", err) + } + + cmd = exec.Command("/usr/bin/python2.7", "script.py", lldbPath) + cmd.Dir = dir + got, _ := cmd.CombinedOutput() + + if string(got) != expectedLldbOutput { + if strings.Contains(string(got), "Timeout launching") { + t.Skip("Timeout launching") + } + t.Fatalf("Unexpected lldb output:\n%s", got) + } +} diff --git a/src/runtime/runtime.go b/src/runtime/runtime.go new file mode 100644 index 0000000..9f68738 --- /dev/null +++ b/src/runtime/runtime.go @@ -0,0 +1,116 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +//go:generate go run wincallback.go +//go:generate go run mkduff.go +//go:generate go run mkfastlog2table.go +//go:generate go run mklockrank.go -o lockrank.go + +var ticks ticksType + +type ticksType struct { + lock mutex + val atomic.Int64 +} + +// Note: Called by runtime/pprof in addition to runtime code. 
+func tickspersecond() int64 { + r := ticks.val.Load() + if r != 0 { + return r + } + lock(&ticks.lock) + r = ticks.val.Load() + if r == 0 { + t0 := nanotime() + c0 := cputicks() + usleep(100 * 1000) + t1 := nanotime() + c1 := cputicks() + if t1 == t0 { + t1++ + } + r = (c1 - c0) * 1000 * 1000 * 1000 / (t1 - t0) + if r == 0 { + r++ + } + ticks.val.Store(r) + } + unlock(&ticks.lock) + return r +} + +var envs []string +var argslice []string + +//go:linkname syscall_runtime_envs syscall.runtime_envs +func syscall_runtime_envs() []string { return append([]string{}, envs...) } + +//go:linkname syscall_Getpagesize syscall.Getpagesize +func syscall_Getpagesize() int { return int(physPageSize) } + +//go:linkname os_runtime_args os.runtime_args +func os_runtime_args() []string { return append([]string{}, argslice...) } + +//go:linkname syscall_Exit syscall.Exit +//go:nosplit +func syscall_Exit(code int) { + exit(int32(code)) +} + +var godebugDefault string +var godebugUpdate atomic.Pointer[func(string, string)] +var godebugEnv atomic.Pointer[string] // set by parsedebugvars + +//go:linkname godebug_setUpdate internal/godebug.setUpdate +func godebug_setUpdate(update func(string, string)) { + p := new(func(string, string)) + *p = update + godebugUpdate.Store(p) + godebugNotify() +} + +func godebugNotify() { + if update := godebugUpdate.Load(); update != nil { + var env string + if p := godebugEnv.Load(); p != nil { + env = *p + } + (*update)(godebugDefault, env) + } +} + +//go:linkname syscall_runtimeSetenv syscall.runtimeSetenv +func syscall_runtimeSetenv(key, value string) { + setenv_c(key, value) + if key == "GODEBUG" { + p := new(string) + *p = value + godebugEnv.Store(p) + godebugNotify() + } +} + +//go:linkname syscall_runtimeUnsetenv syscall.runtimeUnsetenv +func syscall_runtimeUnsetenv(key string) { + unsetenv_c(key) + if key == "GODEBUG" { + godebugEnv.Store(nil) + godebugNotify() + } +} + +// writeErrStr writes a string to descriptor 2. 
+// +//go:nosplit +func writeErrStr(s string) { + write(2, unsafe.Pointer(unsafe.StringData(s)), int32(len(s))) +} diff --git a/src/runtime/runtime1.go b/src/runtime/runtime1.go new file mode 100644 index 0000000..277f18a --- /dev/null +++ b/src/runtime/runtime1.go @@ -0,0 +1,563 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/bytealg" + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// Keep a cached value to make gotraceback fast, +// since we call it on every call to gentraceback. +// The cached value is a uint32 in which the low bits +// are the "crash" and "all" settings and the remaining +// bits are the traceback value (0 off, 1 on, 2 include system). +const ( + tracebackCrash = 1 << iota + tracebackAll + tracebackShift = iota +) + +var traceback_cache uint32 = 2 << tracebackShift +var traceback_env uint32 + +// gotraceback returns the current traceback settings. +// +// If level is 0, suppress all tracebacks. +// If level is 1, show tracebacks, but exclude runtime frames. +// If level is 2, show tracebacks including runtime frames. +// If all is set, print all goroutine stacks. Otherwise, print just the current goroutine. +// If crash is set, crash (core dump, etc) after tracebacking. +// +//go:nosplit +func gotraceback() (level int32, all, crash bool) { + gp := getg() + t := atomic.Load(&traceback_cache) + crash = t&tracebackCrash != 0 + all = gp.m.throwing >= throwTypeUser || t&tracebackAll != 0 + if gp.m.traceback != 0 { + level = int32(gp.m.traceback) + } else if gp.m.throwing >= throwTypeRuntime { + // Always include runtime frames in runtime throws unless + // otherwise overridden by m.traceback. + level = 2 + } else { + level = int32(t >> tracebackShift) + } + return +} + +var ( + argc int32 + argv **byte +) + +// nosplit for use in linux startup sysargs. 
+// +//go:nosplit +func argv_index(argv **byte, i int32) *byte { + return *(**byte)(add(unsafe.Pointer(argv), uintptr(i)*goarch.PtrSize)) +} + +func args(c int32, v **byte) { + argc = c + argv = v + sysargs(c, v) +} + +func goargs() { + if GOOS == "windows" { + return + } + argslice = make([]string, argc) + for i := int32(0); i < argc; i++ { + argslice[i] = gostringnocopy(argv_index(argv, i)) + } +} + +func goenvs_unix() { + // TODO(austin): ppc64 in dynamic linking mode doesn't + // guarantee env[] will immediately follow argv. Might cause + // problems. + n := int32(0) + for argv_index(argv, argc+1+n) != nil { + n++ + } + + envs = make([]string, n) + for i := int32(0); i < n; i++ { + envs[i] = gostring(argv_index(argv, argc+1+i)) + } +} + +func environ() []string { + return envs +} + +// TODO: These should be locals in testAtomic64, but we don't 8-byte +// align stack variables on 386. +var test_z64, test_x64 uint64 + +func testAtomic64() { + test_z64 = 42 + test_x64 = 0 + if atomic.Cas64(&test_z64, test_x64, 1) { + throw("cas64 failed") + } + if test_x64 != 0 { + throw("cas64 failed") + } + test_x64 = 42 + if !atomic.Cas64(&test_z64, test_x64, 1) { + throw("cas64 failed") + } + if test_x64 != 42 || test_z64 != 1 { + throw("cas64 failed") + } + if atomic.Load64(&test_z64) != 1 { + throw("load64 failed") + } + atomic.Store64(&test_z64, (1<<40)+1) + if atomic.Load64(&test_z64) != (1<<40)+1 { + throw("store64 failed") + } + if atomic.Xadd64(&test_z64, (1<<40)+1) != (2<<40)+2 { + throw("xadd64 failed") + } + if atomic.Load64(&test_z64) != (2<<40)+2 { + throw("xadd64 failed") + } + if atomic.Xchg64(&test_z64, (3<<40)+3) != (2<<40)+2 { + throw("xchg64 failed") + } + if atomic.Load64(&test_z64) != (3<<40)+3 { + throw("xchg64 failed") + } +} + +func check() { + var ( + a int8 + b uint8 + c int16 + d uint16 + e int32 + f uint32 + g int64 + h uint64 + i, i1 float32 + j, j1 float64 + k unsafe.Pointer + l *uint16 + m [4]byte + ) + type x1t struct { + x uint8 + } + type y1t 
struct { + x1 x1t + y uint8 + } + var x1 x1t + var y1 y1t + + if unsafe.Sizeof(a) != 1 { + throw("bad a") + } + if unsafe.Sizeof(b) != 1 { + throw("bad b") + } + if unsafe.Sizeof(c) != 2 { + throw("bad c") + } + if unsafe.Sizeof(d) != 2 { + throw("bad d") + } + if unsafe.Sizeof(e) != 4 { + throw("bad e") + } + if unsafe.Sizeof(f) != 4 { + throw("bad f") + } + if unsafe.Sizeof(g) != 8 { + throw("bad g") + } + if unsafe.Sizeof(h) != 8 { + throw("bad h") + } + if unsafe.Sizeof(i) != 4 { + throw("bad i") + } + if unsafe.Sizeof(j) != 8 { + throw("bad j") + } + if unsafe.Sizeof(k) != goarch.PtrSize { + throw("bad k") + } + if unsafe.Sizeof(l) != goarch.PtrSize { + throw("bad l") + } + if unsafe.Sizeof(x1) != 1 { + throw("bad unsafe.Sizeof x1") + } + if unsafe.Offsetof(y1.y) != 1 { + throw("bad offsetof y1.y") + } + if unsafe.Sizeof(y1) != 2 { + throw("bad unsafe.Sizeof y1") + } + + if timediv(12345*1000000000+54321, 1000000000, &e) != 12345 || e != 54321 { + throw("bad timediv") + } + + var z uint32 + z = 1 + if !atomic.Cas(&z, 1, 2) { + throw("cas1") + } + if z != 2 { + throw("cas2") + } + + z = 4 + if atomic.Cas(&z, 5, 6) { + throw("cas3") + } + if z != 4 { + throw("cas4") + } + + z = 0xffffffff + if !atomic.Cas(&z, 0xffffffff, 0xfffffffe) { + throw("cas5") + } + if z != 0xfffffffe { + throw("cas6") + } + + m = [4]byte{1, 1, 1, 1} + atomic.Or8(&m[1], 0xf0) + if m[0] != 1 || m[1] != 0xf1 || m[2] != 1 || m[3] != 1 { + throw("atomicor8") + } + + m = [4]byte{0xff, 0xff, 0xff, 0xff} + atomic.And8(&m[1], 0x1) + if m[0] != 0xff || m[1] != 0x1 || m[2] != 0xff || m[3] != 0xff { + throw("atomicand8") + } + + *(*uint64)(unsafe.Pointer(&j)) = ^uint64(0) + if j == j { + throw("float64nan") + } + if !(j != j) { + throw("float64nan1") + } + + *(*uint64)(unsafe.Pointer(&j1)) = ^uint64(1) + if j == j1 { + throw("float64nan2") + } + if !(j != j1) { + throw("float64nan3") + } + + *(*uint32)(unsafe.Pointer(&i)) = ^uint32(0) + if i == i { + throw("float32nan") + } + if i == i { + 
throw("float32nan1") + } + + *(*uint32)(unsafe.Pointer(&i1)) = ^uint32(1) + if i == i1 { + throw("float32nan2") + } + if i == i1 { + throw("float32nan3") + } + + testAtomic64() + + if _FixedStack != round2(_FixedStack) { + throw("FixedStack is not power-of-2") + } + + if !checkASM() { + throw("assembly checks failed") + } +} + +type dbgVar struct { + name string + value *int32 +} + +// Holds variables parsed from GODEBUG env var, +// except for "memprofilerate" since there is an +// existing int var for that value, which may +// already have an initial value. +var debug struct { + cgocheck int32 + clobberfree int32 + efence int32 + gccheckmark int32 + gcpacertrace int32 + gcshrinkstackoff int32 + gcstoptheworld int32 + gctrace int32 + invalidptr int32 + madvdontneed int32 // for Linux; issue 28466 + scavtrace int32 + scheddetail int32 + schedtrace int32 + tracebackancestors int32 + asyncpreemptoff int32 + harddecommit int32 + adaptivestackstart int32 + + // debug.malloc is used as a combined debug check + // in the malloc function and should be set + // if any of the below debug options is != 0. 
+ malloc bool + allocfreetrace int32 + inittrace int32 + sbrk int32 +} + +var dbgvars = []dbgVar{ + {"allocfreetrace", &debug.allocfreetrace}, + {"clobberfree", &debug.clobberfree}, + {"cgocheck", &debug.cgocheck}, + {"efence", &debug.efence}, + {"gccheckmark", &debug.gccheckmark}, + {"gcpacertrace", &debug.gcpacertrace}, + {"gcshrinkstackoff", &debug.gcshrinkstackoff}, + {"gcstoptheworld", &debug.gcstoptheworld}, + {"gctrace", &debug.gctrace}, + {"invalidptr", &debug.invalidptr}, + {"madvdontneed", &debug.madvdontneed}, + {"sbrk", &debug.sbrk}, + {"scavtrace", &debug.scavtrace}, + {"scheddetail", &debug.scheddetail}, + {"schedtrace", &debug.schedtrace}, + {"tracebackancestors", &debug.tracebackancestors}, + {"asyncpreemptoff", &debug.asyncpreemptoff}, + {"inittrace", &debug.inittrace}, + {"harddecommit", &debug.harddecommit}, + {"adaptivestackstart", &debug.adaptivestackstart}, +} + +var globalGODEBUG string + +func parsedebugvars() { + // defaults + debug.cgocheck = 1 + debug.invalidptr = 1 + debug.adaptivestackstart = 1 // go119 - set this to 0 to turn larger initial goroutine stacks off + if GOOS == "linux" { + // On Linux, MADV_FREE is faster than MADV_DONTNEED, + // but doesn't affect many of the statistics that + // MADV_DONTNEED does until the memory is actually + // reclaimed. This generally leads to poor user + // experience, like confusing stats in top and other + // monitoring tools; and bad integration with + // management systems that respond to memory usage. + // Hence, default to MADV_DONTNEED. 
+ debug.madvdontneed = 1 + } + + globalGODEBUG = gogetenv("GODEBUG") + godebugEnv.StoreNoWB(&globalGODEBUG) + for p := globalGODEBUG; p != ""; { + field := "" + i := bytealg.IndexByteString(p, ',') + if i < 0 { + field, p = p, "" + } else { + field, p = p[:i], p[i+1:] + } + i = bytealg.IndexByteString(field, '=') + if i < 0 { + continue + } + key, value := field[:i], field[i+1:] + + // Update MemProfileRate directly here since it + // is int, not int32, and should only be updated + // if specified in GODEBUG. + if key == "memprofilerate" { + if n, ok := atoi(value); ok { + MemProfileRate = n + } + } else { + for _, v := range dbgvars { + if v.name == key { + if n, ok := atoi32(value); ok { + *v.value = n + } + } + } + } + } + + debug.malloc = (debug.allocfreetrace | debug.inittrace | debug.sbrk) != 0 + + setTraceback(gogetenv("GOTRACEBACK")) + traceback_env = traceback_cache +} + +//go:linkname setTraceback runtime/debug.SetTraceback +func setTraceback(level string) { + var t uint32 + switch level { + case "none": + t = 0 + case "single", "": + t = 1 << tracebackShift + case "all": + t = 1<<tracebackShift | tracebackAll + case "system": + t = 2<<tracebackShift | tracebackAll + case "crash": + t = 2<<tracebackShift | tracebackAll | tracebackCrash + default: + t = tracebackAll + if n, ok := atoi(level); ok && n == int(uint32(n)) { + t |= uint32(n) << tracebackShift + } + } + // when C owns the process, simply exit'ing the process on fatal errors + // and panics is surprising. Be louder and abort instead. + if islibrary || isarchive { + t |= tracebackCrash + } + + t |= traceback_env + + atomic.Store(&traceback_cache, t) +} + +// Poor mans 64-bit division. +// This is a very special function, do not use it if you are not sure what you are doing. +// int64 division is lowered into _divv() call on 386, which does not fit into nosplit functions. +// Handles overflow in a time-specific manner. +// This keeps us within no-split stack limits on 32-bit processors. 
+// +//go:nosplit +func timediv(v int64, div int32, rem *int32) int32 { + res := int32(0) + for bit := 30; bit >= 0; bit-- { + if v >= int64(div)<<uint(bit) { + v = v - (int64(div) << uint(bit)) + // Before this for loop, res was 0, thus all these + // power of 2 increments are now just bitsets. + res |= 1 << uint(bit) + } + } + if v >= int64(div) { + if rem != nil { + *rem = 0 + } + return 0x7fffffff + } + if rem != nil { + *rem = int32(v) + } + return res +} + +// Helpers for Go. Must be NOSPLIT, must only call NOSPLIT functions, and must not block. + +//go:nosplit +func acquirem() *m { + gp := getg() + gp.m.locks++ + return gp.m +} + +//go:nosplit +func releasem(mp *m) { + gp := getg() + mp.locks-- + if mp.locks == 0 && gp.preempt { + // restore the preemption request in case we've cleared it in newstack + gp.stackguard0 = stackPreempt + } +} + +//go:linkname reflect_typelinks reflect.typelinks +func reflect_typelinks() ([]unsafe.Pointer, [][]int32) { + modules := activeModules() + sections := []unsafe.Pointer{unsafe.Pointer(modules[0].types)} + ret := [][]int32{modules[0].typelinks} + for _, md := range modules[1:] { + sections = append(sections, unsafe.Pointer(md.types)) + ret = append(ret, md.typelinks) + } + return sections, ret +} + +// reflect_resolveNameOff resolves a name offset from a base pointer. +// +//go:linkname reflect_resolveNameOff reflect.resolveNameOff +func reflect_resolveNameOff(ptrInModule unsafe.Pointer, off int32) unsafe.Pointer { + return unsafe.Pointer(resolveNameOff(ptrInModule, nameOff(off)).bytes) +} + +// reflect_resolveTypeOff resolves an *rtype offset from a base type. +// +//go:linkname reflect_resolveTypeOff reflect.resolveTypeOff +func reflect_resolveTypeOff(rtype unsafe.Pointer, off int32) unsafe.Pointer { + return unsafe.Pointer((*_type)(rtype).typeOff(typeOff(off))) +} + +// reflect_resolveTextOff resolves a function pointer offset from a base type. 
+// +//go:linkname reflect_resolveTextOff reflect.resolveTextOff +func reflect_resolveTextOff(rtype unsafe.Pointer, off int32) unsafe.Pointer { + return (*_type)(rtype).textOff(textOff(off)) + +} + +// reflectlite_resolveNameOff resolves a name offset from a base pointer. +// +//go:linkname reflectlite_resolveNameOff internal/reflectlite.resolveNameOff +func reflectlite_resolveNameOff(ptrInModule unsafe.Pointer, off int32) unsafe.Pointer { + return unsafe.Pointer(resolveNameOff(ptrInModule, nameOff(off)).bytes) +} + +// reflectlite_resolveTypeOff resolves an *rtype offset from a base type. +// +//go:linkname reflectlite_resolveTypeOff internal/reflectlite.resolveTypeOff +func reflectlite_resolveTypeOff(rtype unsafe.Pointer, off int32) unsafe.Pointer { + return unsafe.Pointer((*_type)(rtype).typeOff(typeOff(off))) +} + +// reflect_addReflectOff adds a pointer to the reflection offset lookup map. +// +//go:linkname reflect_addReflectOff reflect.addReflectOff +func reflect_addReflectOff(ptr unsafe.Pointer) int32 { + reflectOffsLock() + if reflectOffs.m == nil { + reflectOffs.m = make(map[int32]unsafe.Pointer) + reflectOffs.minv = make(map[unsafe.Pointer]int32) + reflectOffs.next = -1 + } + id, found := reflectOffs.minv[ptr] + if !found { + id = reflectOffs.next + reflectOffs.next-- // use negative offsets as IDs to aid debugging + reflectOffs.m[id] = ptr + reflectOffs.minv[ptr] = id + } + reflectOffsUnlock() + return id +} diff --git a/src/runtime/runtime2.go b/src/runtime/runtime2.go new file mode 100644 index 0000000..9381d1e --- /dev/null +++ b/src/runtime/runtime2.go @@ -0,0 +1,1190 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "unsafe" +) + +// defined constants +const ( + // G status + // + // Beyond indicating the general state of a G, the G status + // acts like a lock on the goroutine's stack (and hence its + // ability to execute user code). + // + // If you add to this list, add to the list + // of "okay during garbage collection" status + // in mgcmark.go too. + // + // TODO(austin): The _Gscan bit could be much lighter-weight. + // For example, we could choose not to run _Gscanrunnable + // goroutines found in the run queue, rather than CAS-looping + // until they become _Grunnable. And transitions like + // _Gscanwaiting -> _Gscanrunnable are actually okay because + // they don't affect stack ownership. + + // _Gidle means this goroutine was just allocated and has not + // yet been initialized. + _Gidle = iota // 0 + + // _Grunnable means this goroutine is on a run queue. It is + // not currently executing user code. The stack is not owned. + _Grunnable // 1 + + // _Grunning means this goroutine may execute user code. The + // stack is owned by this goroutine. It is not on a run queue. + // It is assigned an M and a P (g.m and g.m.p are valid). + _Grunning // 2 + + // _Gsyscall means this goroutine is executing a system call. + // It is not executing user code. The stack is owned by this + // goroutine. It is not on a run queue. It is assigned an M. + _Gsyscall // 3 + + // _Gwaiting means this goroutine is blocked in the runtime. + // It is not executing user code. It is not on a run queue, + // but should be recorded somewhere (e.g., a channel wait + // queue) so it can be ready()d when necessary. The stack is + // not owned *except* that a channel operation may read or + // write parts of the stack under the appropriate channel + // lock. Otherwise, it is not safe to access the stack after a + // goroutine enters _Gwaiting (e.g., it may get moved). 
+ _Gwaiting // 4 + + // _Gmoribund_unused is currently unused, but hardcoded in gdb + // scripts. + _Gmoribund_unused // 5 + + // _Gdead means this goroutine is currently unused. It may be + // just exited, on a free list, or just being initialized. It + // is not executing user code. It may or may not have a stack + // allocated. The G and its stack (if any) are owned by the M + // that is exiting the G or that obtained the G from the free + // list. + _Gdead // 6 + + // _Genqueue_unused is currently unused. + _Genqueue_unused // 7 + + // _Gcopystack means this goroutine's stack is being moved. It + // is not executing user code and is not on a run queue. The + // stack is owned by the goroutine that put it in _Gcopystack. + _Gcopystack // 8 + + // _Gpreempted means this goroutine stopped itself for a + // suspendG preemption. It is like _Gwaiting, but nothing is + // yet responsible for ready()ing it. Some suspendG must CAS + // the status to _Gwaiting to take responsibility for + // ready()ing this G. + _Gpreempted // 9 + + // _Gscan combined with one of the above states other than + // _Grunning indicates that GC is scanning the stack. The + // goroutine is not executing user code and the stack is owned + // by the goroutine that set the _Gscan bit. + // + // _Gscanrunning is different: it is used to briefly block + // state transitions while GC signals the G to scan its own + // stack. This is otherwise like _Grunning. + // + // atomicstatus&~Gscan gives the state the goroutine will + // return to when the scan completes. + _Gscan = 0x1000 + _Gscanrunnable = _Gscan + _Grunnable // 0x1001 + _Gscanrunning = _Gscan + _Grunning // 0x1002 + _Gscansyscall = _Gscan + _Gsyscall // 0x1003 + _Gscanwaiting = _Gscan + _Gwaiting // 0x1004 + _Gscanpreempted = _Gscan + _Gpreempted // 0x1009 +) + +const ( + // P status + + // _Pidle means a P is not being used to run user code or the + // scheduler. 
Typically, it's on the idle P list and available + // to the scheduler, but it may just be transitioning between + // other states. + // + // The P is owned by the idle list or by whatever is + // transitioning its state. Its run queue is empty. + _Pidle = iota + + // _Prunning means a P is owned by an M and is being used to + // run user code or the scheduler. Only the M that owns this P + // is allowed to change the P's status from _Prunning. The M + // may transition the P to _Pidle (if it has no more work to + // do), _Psyscall (when entering a syscall), or _Pgcstop (to + // halt for the GC). The M may also hand ownership of the P + // off directly to another M (e.g., to schedule a locked G). + _Prunning + + // _Psyscall means a P is not running user code. It has + // affinity to an M in a syscall but is not owned by it and + // may be stolen by another M. This is similar to _Pidle but + // uses lightweight transitions and maintains M affinity. + // + // Leaving _Psyscall must be done with a CAS, either to steal + // or retake the P. Note that there's an ABA hazard: even if + // an M successfully CASes its original P back to _Prunning + // after a syscall, it must understand the P may have been + // used by another M in the interim. + _Psyscall + + // _Pgcstop means a P is halted for STW and owned by the M + // that stopped the world. The M that stopped the world + // continues to use its P, even in _Pgcstop. Transitioning + // from _Prunning to _Pgcstop causes an M to release its P and + // park. + // + // The P retains its run queue and startTheWorld will restart + // the scheduler on Ps with non-empty run queues. + _Pgcstop + + // _Pdead means a P is no longer used (GOMAXPROCS shrank). We + // reuse Ps if GOMAXPROCS increases. A dead P is mostly + // stripped of its resources, though a few things remain + // (e.g., trace buffers). + _Pdead +) + +// Mutual exclusion locks. 
In the uncontended case, +// as fast as spin locks (just a few user-level instructions), +// but on the contention path they sleep in the kernel. +// A zeroed Mutex is unlocked (no need to initialize each lock). +// Initialization is helpful for static lock ranking, but not required. +type mutex struct { + // Empty struct if lock ranking is disabled, otherwise includes the lock rank + lockRankStruct + // Futex-based impl treats it as uint32 key, + // while sema-based impl as M* waitm. + // Used to be a union, but unions break precise GC. + key uintptr +} + +// sleep and wakeup on one-time events. +// before any calls to notesleep or notewakeup, +// must call noteclear to initialize the Note. +// then, exactly one thread can call notesleep +// and exactly one thread can call notewakeup (once). +// once notewakeup has been called, the notesleep +// will return. future notesleep will return immediately. +// subsequent noteclear must be called only after +// previous notesleep has returned, e.g. it's disallowed +// to call noteclear straight after notewakeup. +// +// notetsleep is like notesleep but wakes up after +// a given number of nanoseconds even if the event +// has not yet happened. if a goroutine uses notetsleep to +// wake up early, it must wait to call noteclear until it +// can be sure that no other goroutine is calling +// notewakeup. +// +// notesleep/notetsleep are generally called on g0, +// notetsleepg is similar to notetsleep but is called on user g. +type note struct { + // Futex-based impl treats it as uint32 key, + // while sema-based impl as M* waitm. + // Used to be a union, but unions break precise GC. 
+ key uintptr +} + +type funcval struct { + fn uintptr + // variable-size, fn-specific data here +} + +type iface struct { + tab *itab + data unsafe.Pointer +} + +type eface struct { + _type *_type + data unsafe.Pointer +} + +func efaceOf(ep *any) *eface { + return (*eface)(unsafe.Pointer(ep)) +} + +// The guintptr, muintptr, and puintptr are all used to bypass write barriers. +// It is particularly important to avoid write barriers when the current P has +// been released, because the GC thinks the world is stopped, and an +// unexpected write barrier would not be synchronized with the GC, +// which can lead to a half-executed write barrier that has marked the object +// but not queued it. If the GC skips the object and completes before the +// queuing can occur, it will incorrectly free the object. +// +// We tried using special assignment functions invoked only when not +// holding a running P, but then some updates to a particular memory +// word went through write barriers and some did not. This breaks the +// write barrier shadow checking mode, and it is also scary: better to have +// a word that is completely ignored by the GC than to have one for which +// only a few updates are ignored. +// +// Gs and Ps are always reachable via true pointers in the +// allgs and allp lists or (during allocation before they reach those lists) +// from stack variables. +// +// Ms are always reachable via true pointers either from allm or +// freem. Unlike Gs and Ps we do free Ms, so it's important that +// nothing ever hold an muintptr across a safe point. + +// A guintptr holds a goroutine pointer, but typed as a uintptr +// to bypass write barriers. It is used in the Gobuf goroutine state +// and in scheduling lists that are manipulated without a P. +// +// The Gobuf.g goroutine pointer is almost always updated by assembly code. 
+// In one of the few places it is updated by Go code - func save - it must be +// treated as a uintptr to avoid a write barrier being emitted at a bad time. +// Instead of figuring out how to emit the write barriers missing in the +// assembly manipulation, we change the type of the field to uintptr, +// so that it does not require write barriers at all. +// +// Goroutine structs are published in the allg list and never freed. +// That will keep the goroutine structs from being collected. +// There is never a time that Gobuf.g's contain the only references +// to a goroutine: the publishing of the goroutine in allg comes first. +// Goroutine pointers are also kept in non-GC-visible places like TLS, +// so I can't see them ever moving. If we did want to start moving data +// in the GC, we'd need to allocate the goroutine structs from an +// alternate arena. Using guintptr doesn't make that problem any worse. +// Note that pollDesc.rg, pollDesc.wg also store g in uintptr form, +// so they would need to be updated too if g's start moving. +type guintptr uintptr + +//go:nosplit +func (gp guintptr) ptr() *g { return (*g)(unsafe.Pointer(gp)) } + +//go:nosplit +func (gp *guintptr) set(g *g) { *gp = guintptr(unsafe.Pointer(g)) } + +//go:nosplit +func (gp *guintptr) cas(old, new guintptr) bool { + return atomic.Casuintptr((*uintptr)(unsafe.Pointer(gp)), uintptr(old), uintptr(new)) +} + +// setGNoWB performs *gp = new without a write barrier. +// For times when it's impractical to use a guintptr. +// +//go:nosplit +//go:nowritebarrier +func setGNoWB(gp **g, new *g) { + (*guintptr)(unsafe.Pointer(gp)).set(new) +} + +type puintptr uintptr + +//go:nosplit +func (pp puintptr) ptr() *p { return (*p)(unsafe.Pointer(pp)) } + +//go:nosplit +func (pp *puintptr) set(p *p) { *pp = puintptr(unsafe.Pointer(p)) } + +// muintptr is a *m that is not tracked by the garbage collector. +// +// Because we do free Ms, there are some additional constrains on +// muintptrs: +// +// 1. 
Never hold an muintptr locally across a safe point. +// +// 2. Any muintptr in the heap must be owned by the M itself so it can +// ensure it is not in use when the last true *m is released. +type muintptr uintptr + +//go:nosplit +func (mp muintptr) ptr() *m { return (*m)(unsafe.Pointer(mp)) } + +//go:nosplit +func (mp *muintptr) set(m *m) { *mp = muintptr(unsafe.Pointer(m)) } + +// setMNoWB performs *mp = new without a write barrier. +// For times when it's impractical to use an muintptr. +// +//go:nosplit +//go:nowritebarrier +func setMNoWB(mp **m, new *m) { + (*muintptr)(unsafe.Pointer(mp)).set(new) +} + +type gobuf struct { + // The offsets of sp, pc, and g are known to (hard-coded in) libmach. + // + // ctxt is unusual with respect to GC: it may be a + // heap-allocated funcval, so GC needs to track it, but it + // needs to be set and cleared from assembly, where it's + // difficult to have write barriers. However, ctxt is really a + // saved, live register, and we only ever exchange it between + // the real register and the gobuf. Hence, we treat it as a + // root during stack scanning, which means assembly that saves + // and restores it doesn't need write barriers. It's still + // typed as a pointer so that any other writes from Go get + // write barriers. + sp uintptr + pc uintptr + g guintptr + ctxt unsafe.Pointer + ret uintptr + lr uintptr + bp uintptr // for framepointer-enabled architectures +} + +// sudog represents a g in a wait list, such as for sending/receiving +// on a channel. +// +// sudog is necessary because the g ↔ synchronization object relation +// is many-to-many. A g can be on many wait lists, so there may be +// many sudogs for one g; and many gs may be waiting on the same +// synchronization object, so there may be many sudogs for one object. +// +// sudogs are allocated from a special pool. Use acquireSudog and +// releaseSudog to allocate and free them. 
+type sudog struct { + // The following fields are protected by the hchan.lock of the + // channel this sudog is blocking on. shrinkstack depends on + // this for sudogs involved in channel ops. + + g *g + + next *sudog + prev *sudog + elem unsafe.Pointer // data element (may point to stack) + + // The following fields are never accessed concurrently. + // For channels, waitlink is only accessed by g. + // For semaphores, all fields (including the ones above) + // are only accessed when holding a semaRoot lock. + + acquiretime int64 + releasetime int64 + ticket uint32 + + // isSelect indicates g is participating in a select, so + // g.selectDone must be CAS'd to win the wake-up race. + isSelect bool + + // success indicates whether communication over channel c + // succeeded. It is true if the goroutine was awoken because a + // value was delivered over channel c, and false if awoken + // because c was closed. + success bool + + parent *sudog // semaRoot binary tree + waitlink *sudog // g.waiting list or semaRoot + waittail *sudog // semaRoot + c *hchan // channel +} + +type libcall struct { + fn uintptr + n uintptr // number of parameters + args uintptr // parameters + r1 uintptr // return values + r2 uintptr + err uintptr // error number +} + +// Stack describes a Go execution stack. +// The bounds of the stack are exactly [lo, hi), +// with no implicit data structures on either side. +type stack struct { + lo uintptr + hi uintptr +} + +// heldLockInfo gives info on a held lock and the rank of that lock +type heldLockInfo struct { + lockAddr uintptr + rank lockRank +} + +type g struct { + // Stack parameters. + // stack describes the actual stack memory: [stack.lo, stack.hi). + // stackguard0 is the stack pointer compared in the Go stack growth prologue. + // It is stack.lo+StackGuard normally, but can be StackPreempt to trigger a preemption. + // stackguard1 is the stack pointer compared in the C stack growth prologue. 
+ // It is stack.lo+StackGuard on g0 and gsignal stacks. + // It is ~0 on other goroutine stacks, to trigger a call to morestackc (and crash). + stack stack // offset known to runtime/cgo + stackguard0 uintptr // offset known to liblink + stackguard1 uintptr // offset known to liblink + + _panic *_panic // innermost panic - offset known to liblink + _defer *_defer // innermost defer + m *m // current m; offset known to arm liblink + sched gobuf + syscallsp uintptr // if status==Gsyscall, syscallsp = sched.sp to use during gc + syscallpc uintptr // if status==Gsyscall, syscallpc = sched.pc to use during gc + stktopsp uintptr // expected sp at top of stack, to check in traceback + // param is a generic pointer parameter field used to pass + // values in particular contexts where other storage for the + // parameter would be difficult to find. It is currently used + // in three ways: + // 1. When a channel operation wakes up a blocked goroutine, it sets param to + // point to the sudog of the completed blocking operation. + // 2. By gcAssistAlloc1 to signal back to its caller that the goroutine completed + // the GC cycle. It is unsafe to do so in any other way, because the goroutine's + // stack may have moved in the meantime. + // 3. By debugCallWrap to pass parameters to a new goroutine because allocating a + // closure in the runtime is forbidden. + param unsafe.Pointer + atomicstatus atomic.Uint32 + stackLock uint32 // sigprof/scang lock; TODO: fold in to atomicstatus + goid uint64 + schedlink guintptr + waitsince int64 // approx time when the g became blocked + waitreason waitReason // if status==Gwaiting + + preempt bool // preemption signal, duplicates stackguard0 = stackpreempt + preemptStop bool // transition to _Gpreempted on preemption; otherwise, just deschedule + preemptShrink bool // shrink stack at synchronous safe point + + // asyncSafePoint is set if g is stopped at an asynchronous + // safe point.
This means there are frames on the stack + // without precise pointer information. + asyncSafePoint bool + + paniconfault bool // panic (instead of crash) on unexpected fault address + gcscandone bool // g has scanned stack; protected by _Gscan bit in status + throwsplit bool // must not split stack + // activeStackChans indicates that there are unlocked channels + // pointing into this goroutine's stack. If true, stack + // copying needs to acquire channel locks to protect these + // areas of the stack. + activeStackChans bool + // parkingOnChan indicates that the goroutine is about to + // park on a chansend or chanrecv. Used to signal an unsafe point + // for stack shrinking. + parkingOnChan atomic.Bool + + raceignore int8 // ignore race detection events + sysblocktraced bool // StartTrace has emitted EvGoInSyscall about this goroutine + tracking bool // whether we're tracking this G for sched latency statistics + trackingSeq uint8 // used to decide whether to track this G + trackingStamp int64 // timestamp of when the G last started being tracked + runnableTime int64 // the amount of time spent runnable, cleared when running, only used when tracking + sysexitticks int64 // cputicks when syscall has returned (for tracing) + traceseq uint64 // trace event sequencer + tracelastp puintptr // last P emitted an event for this goroutine + lockedm muintptr + sig uint32 + writebuf []byte + sigcode0 uintptr + sigcode1 uintptr + sigpc uintptr + gopc uintptr // pc of go statement that created this goroutine + ancestors *[]ancestorInfo // ancestor information goroutine(s) that created this goroutine (only used if debug.tracebackancestors) + startpc uintptr // pc of goroutine function + racectx uintptr + waiting *sudog // sudog structures this g is waiting on (that have a valid elem ptr); in lock order + cgoCtxt []uintptr // cgo traceback context + labels unsafe.Pointer // profiler labels + timer *timer // cached timer for time.Sleep + selectDone atomic.Uint32 // are we 
participating in a select and did someone win the race? + + // goroutineProfiled indicates the status of this goroutine's stack for the + // current in-progress goroutine profile + goroutineProfiled goroutineProfileStateHolder + + // Per-G GC state + + // gcAssistBytes is this G's GC assist credit in terms of + // bytes allocated. If this is positive, then the G has credit + // to allocate gcAssistBytes bytes without assisting. If this + // is negative, then the G must correct this by performing + // scan work. We track this in bytes to make it fast to update + // and check for debt in the malloc hot path. The assist ratio + // determines how this corresponds to scan work debt. + gcAssistBytes int64 +} + +// gTrackingPeriod is the number of transitions out of _Grunning between +// latency tracking runs. +const gTrackingPeriod = 8 + +const ( + // tlsSlots is the number of pointer-sized slots reserved for TLS on some platforms, + // like Windows. + tlsSlots = 6 + tlsSize = tlsSlots * goarch.PtrSize +) + +// Values for m.freeWait. +const ( + freeMStack = 0 // M done, free stack and reference. + freeMRef = 1 // M done, free reference. + freeMWait = 2 // M still in use. +) + +type m struct { + g0 *g // goroutine with scheduling stack + morebuf gobuf // gobuf arg to morestack + divmod uint32 // div/mod denominator for arm - known to liblink + _ uint32 // align next field to 8 bytes + + // Fields not known to debuggers. 
+ procid uint64 // for debuggers, but offset not hard-coded + gsignal *g // signal-handling g + goSigStack gsignalStack // Go-allocated signal handling stack + sigmask sigset // storage for saved signal mask + tls [tlsSlots]uintptr // thread-local storage (for x86 extern register) + mstartfn func() + curg *g // current running goroutine + caughtsig guintptr // goroutine running during fatal signal + p puintptr // attached p for executing go code (nil if not executing go code) + nextp puintptr + oldp puintptr // the p that was attached before executing a syscall + id int64 + mallocing int32 + throwing throwType + preemptoff string // if != "", keep curg running on this m + locks int32 + dying int32 + profilehz int32 + spinning bool // m is out of work and is actively looking for work + blocked bool // m is blocked on a note + newSigstack bool // minit on C thread called sigaltstack + printlock int8 + incgo bool // m is executing a cgo call + isextra bool // m is an extra m + freeWait atomic.Uint32 // Whether it is safe to free g0 and delete m (one of freeMRef, freeMStack, freeMWait) + fastrand uint64 + needextram bool + traceback uint8 + ncgocall uint64 // number of cgo calls in total + ncgo int32 // number of cgo calls currently in progress + cgoCallersUse atomic.Uint32 // if non-zero, cgoCallers in use temporarily + cgoCallers *cgoCallers // cgo traceback if crashing in cgo call + park note + alllink *m // on allm + schedlink muintptr + lockedg guintptr + createstack [32]uintptr // stack that created this thread. + lockedExt uint32 // tracking for external LockOSThread + lockedInt uint32 // tracking for internal lockOSThread + nextwaitm muintptr // next m waiting for lock + waitunlockf func(*g, unsafe.Pointer) bool + waitlock unsafe.Pointer + waittraceev byte + waittraceskip int + startingtrace bool + syscalltick uint32 + freelink *m // on sched.freem + + // these are here because they are too large to be on the stack + // of low-level NOSPLIT functions. 
+ libcall libcall + libcallpc uintptr // for cpu profiler + libcallsp uintptr + libcallg guintptr + syscall libcall // stores syscall parameters on windows + + vdsoSP uintptr // SP for traceback while in VDSO call (0 if not in call) + vdsoPC uintptr // PC for traceback while in VDSO call + + // preemptGen counts the number of completed preemption + // signals. This is used to detect when a preemption is + // requested, but fails. + preemptGen atomic.Uint32 + + // Whether there is a pending preemption signal on this M. + signalPending atomic.Uint32 + + dlogPerM + + mOS + + // Up to 10 locks held by this m, maintained by the lock ranking code. + locksHeldLen int + locksHeld [10]heldLockInfo +} + +type p struct { + id int32 + status uint32 // one of pidle/prunning/... + link puintptr + schedtick uint32 // incremented on every scheduler call + syscalltick uint32 // incremented on every system call + sysmontick sysmontick // last tick observed by sysmon + m muintptr // back-link to associated m (nil if idle) + mcache *mcache + pcache pageCache + raceprocctx uintptr + + deferpool []*_defer // pool of available defer structs (see panic.go) + deferpoolbuf [32]*_defer + + // Cache of goroutine ids, amortizes accesses to runtime·sched.goidgen. + goidcache uint64 + goidcacheend uint64 + + // Queue of runnable goroutines. Accessed without lock. + runqhead uint32 + runqtail uint32 + runq [256]guintptr + // runnext, if non-nil, is a runnable G that was ready'd by + // the current G and should be run next instead of what's in + // runq if there's time remaining in the running G's time + // slice. It will inherit the time left in the current time + // slice. If a set of goroutines is locked in a + // communicate-and-wait pattern, this schedules that set as a + // unit and eliminates the (potentially large) scheduling + // latency that otherwise arises from adding the ready'd + // goroutines to the end of the run queue.
+ // + // Note that while other P's may atomically CAS this to zero, + // only the owner P can CAS it to a valid G. + runnext guintptr + + // Available G's (status == Gdead) + gFree struct { + gList + n int32 + } + + sudogcache []*sudog + sudogbuf [128]*sudog + + // Cache of mspan objects from the heap. + mspancache struct { + // We need an explicit length here because this field is used + // in allocation codepaths where write barriers are not allowed, + // and eliminating the write barrier/keeping it eliminated from + // slice updates is tricky, moreso than just managing the length + // ourselves. + len int + buf [128]*mspan + } + + tracebuf traceBufPtr + + // traceSweep indicates the sweep events should be traced. + // This is used to defer the sweep start event until a span + // has actually been swept. + traceSweep bool + // traceSwept and traceReclaimed track the number of bytes + // swept and reclaimed by sweeping in the current sweep loop. + traceSwept, traceReclaimed uintptr + + palloc persistentAlloc // per-P to avoid mutex + + // The when field of the first entry on the timer heap. + // This is 0 if the timer heap is empty. + timer0When atomic.Int64 + + // The earliest known nextwhen field of a timer with + // timerModifiedEarlier status. Because the timer may have been + // modified again, there need not be any timer with this value. + // This is 0 if there are no timerModifiedEarlier timers. + timerModifiedEarliest atomic.Int64 + + // Per-P GC state + gcAssistTime int64 // Nanoseconds in assistAlloc + gcFractionalMarkTime int64 // Nanoseconds in fractional mark worker (atomic) + + // limiterEvent tracks events for the GC CPU limiter. + limiterEvent limiterEvent + + // gcMarkWorkerMode is the mode for the next mark worker to run in. + // That is, this is used to communicate with the worker goroutine + // selected for immediate execution by + // gcController.findRunnableGCWorker. 
When scheduling other goroutines, + // this field must be set to gcMarkWorkerNotWorker. + gcMarkWorkerMode gcMarkWorkerMode + // gcMarkWorkerStartTime is the nanotime() at which the most recent + // mark worker started. + gcMarkWorkerStartTime int64 + + // gcw is this P's GC work buffer cache. The work buffer is + // filled by write barriers, drained by mutator assists, and + // disposed on certain GC state transitions. + gcw gcWork + + // wbBuf is this P's GC write barrier buffer. + // + // TODO: Consider caching this in the running G. + wbBuf wbBuf + + runSafePointFn uint32 // if 1, run sched.safePointFn at next safe point + + // statsSeq is a counter indicating whether this P is currently + // writing any stats. Its value is even when not, odd when it is. + statsSeq atomic.Uint32 + + // Lock for timers. We normally access the timers while running + // on this P, but the scheduler can also do it from a different P. + timersLock mutex + + // Actions to take at some time. This is used to implement the + // standard library's time package. + // Must hold timersLock to access. + timers []*timer + + // Number of timers in P's heap. + numTimers atomic.Uint32 + + // Number of timerDeleted timers in P's heap. + deletedTimers atomic.Uint32 + + // Race context used while executing timer functions. + timerRaceCtx uintptr + + // maxStackScanDelta accumulates the amount of stack space held by + // live goroutines (i.e. those eligible for stack scanning). + // Flushed to gcController.maxStackScan once maxStackScanSlack + // or -maxStackScanSlack is reached. + maxStackScanDelta int64 + + // gc-time statistics about current goroutines + // Note that this differs from maxStackScan in that this + // accumulates the actual stack observed to be used at GC time (hi - sp), + // not an instantaneous measure of the total stack size that might need + // to be scanned (hi - lo). 
+ scannedStackSize uint64 // stack size of goroutines scanned by this P + scannedStacks uint64 // number of goroutines scanned by this P + + // preempt is set to indicate that this P should enter the + // scheduler ASAP (regardless of what G is running on it). + preempt bool + + // pageTraceBuf is a buffer for writing out page allocation/free/scavenge traces. + // + // Used only if GOEXPERIMENT=pagetrace. + pageTraceBuf pageTraceBuf + + // Padding is no longer needed. False sharing is now not a worry because p is large enough + // that its size class is an integer multiple of the cache line size (for any of our architectures). +} + +type schedt struct { + goidgen atomic.Uint64 + lastpoll atomic.Int64 // time of last network poll, 0 if currently polling + pollUntil atomic.Int64 // time to which current poll is sleeping + + lock mutex + + // When increasing nmidle, nmidlelocked, nmsys, or nmfreed, be + // sure to call checkdead(). + + midle muintptr // idle m's waiting for work + nmidle int32 // number of idle m's waiting for work + nmidlelocked int32 // number of locked m's waiting for work + mnext int64 // number of m's that have been created and next M ID + maxmcount int32 // maximum number of m's allowed (or die) + nmsys int32 // number of system m's not counted for deadlock + nmfreed int64 // cumulative number of freed m's + + ngsys atomic.Int32 // number of system goroutines + + pidle puintptr // idle p's + npidle atomic.Int32 + nmspinning atomic.Int32 // See "Worker thread parking/unparking" comment in proc.go. + needspinning atomic.Uint32 // See "Delicate dance" comment in proc.go. Boolean. Must hold sched.lock to set to 1. + + // Global runnable queue. + runq gQueue + runqsize int32 + + // disable controls selective disabling of the scheduler. + // + // Use schedEnableUser to control this. + // + // disable is protected by sched.lock. + disable struct { + // user disables scheduling of user goroutines.
+ user bool + runnable gQueue // pending runnable Gs + n int32 // length of runnable + } + + // Global cache of dead G's. + gFree struct { + lock mutex + stack gList // Gs with stacks + noStack gList // Gs without stacks + n int32 + } + + // Central cache of sudog structs. + sudoglock mutex + sudogcache *sudog + + // Central pool of available defer structs. + deferlock mutex + deferpool *_defer + + // freem is the list of m's waiting to be freed when their + // m.exited is set. Linked through m.freelink. + freem *m + + gcwaiting atomic.Bool // gc is waiting to run + stopwait int32 + stopnote note + sysmonwait atomic.Bool + sysmonnote note + + // safepointFn should be called on each P at the next GC + // safepoint if p.runSafePointFn is set. + safePointFn func(*p) + safePointWait int32 + safePointNote note + + profilehz int32 // cpu profiling rate + + procresizetime int64 // nanotime() of last change to gomaxprocs + totaltime int64 // ∫gomaxprocs dt up to procresizetime + + // sysmonlock protects sysmon's actions on the runtime. + // + // Acquire and hold this mutex to block sysmon from interacting + // with the rest of the runtime. + sysmonlock mutex + + // timeToRun is a distribution of scheduling latencies, defined + // as the sum of time a G spends in the _Grunnable state before + // it transitions to _Grunning. + timeToRun timeHistogram + + // idleTime is the total CPU time Ps have "spent" idle. + // + // Reset on each GC cycle. + idleTime atomic.Int64 + + // totalMutexWaitTime is the sum of time goroutines have spent in _Gwaiting + // with a waitreason of the form waitReasonSync{RW,}Mutex{R,}Lock. + totalMutexWaitTime atomic.Int64 +} + +// Values for the flags field of a sigTabT. 
+const ( + _SigNotify = 1 << iota // let signal.Notify have signal, even if from kernel + _SigKill // if signal.Notify doesn't take it, exit quietly + _SigThrow // if signal.Notify doesn't take it, exit loudly + _SigPanic // if the signal is from the kernel, panic + _SigDefault // if the signal isn't explicitly requested, don't monitor it + _SigGoExit // cause all runtime procs to exit (only used on Plan 9). + _SigSetStack // Don't explicitly install handler, but add SA_ONSTACK to existing libc handler + _SigUnblock // always unblock; see blockableSig + _SigIgn // _SIG_DFL action is to ignore the signal +) + +// Layout of in-memory per-function information prepared by linker +// See https://golang.org/s/go12symtab. +// Keep in sync with linker (../cmd/link/internal/ld/pcln.go:/pclntab) +// and with package debug/gosym and with symtab.go in package runtime. +type _func struct { + entryOff uint32 // start pc, as offset from moduledata.text/pcHeader.textStart + nameOff int32 // function name, as index into moduledata.funcnametab. + + args int32 // in/out args size + deferreturn uint32 // offset of start of a deferreturn call instruction from entry, if any. + + pcsp uint32 + pcfile uint32 + pcln uint32 + npcdata uint32 + cuOffset uint32 // runtime.cutab offset of this function's CU + startLine int32 // line number of start of function (func keyword/TEXT directive) + funcID funcID // set for certain special runtime functions + flag funcFlag + _ [1]byte // pad + nfuncdata uint8 // must be last, must end on a uint32-aligned boundary + + // The end of the struct is followed immediately by two variable-length + // arrays that reference the pcdata and funcdata locations for this + // function. + + // pcdata contains the offset into moduledata.pctab for the start of + // that index's table. e.g., + // &moduledata.pctab[_func.pcdata[_PCDATA_UnsafePoint]] is the start of + // the unsafe point table. + // + // An offset of 0 indicates that there is no table. 
+ // + // pcdata [npcdata]uint32 + + // funcdata contains the offset past moduledata.gofunc which contains a + // pointer to that index's funcdata. e.g., + // *(moduledata.gofunc + _func.funcdata[_FUNCDATA_ArgsPointerMaps]) is + // the argument pointer map. + // + // An offset of ^uint32(0) indicates that there is no entry. + // + // funcdata [nfuncdata]uint32 +} + +// Pseudo-Func that is returned for PCs that occur in inlined code. +// A *Func can be either a *_func or a *funcinl, and they are distinguished +// by the first uintptr. +type funcinl struct { + ones uint32 // set to ^0 to distinguish from _func + entry uintptr // entry of the real (the "outermost") frame + name string + file string + line int32 + startLine int32 +} + +// layout of Itab known to compilers +// allocated in non-garbage-collected memory +// Needs to be in sync with +// ../cmd/compile/internal/reflectdata/reflect.go:/^func.WriteTabs. +type itab struct { + inter *interfacetype + _type *_type + hash uint32 // copy of _type.hash. Used for type switches. + _ [4]byte + fun [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter. +} + +// Lock-free stack node. +// Also known to export_test.go. +type lfnode struct { + next uint64 + pushcnt uintptr +} + +type forcegcstate struct { + lock mutex + g *g + idle atomic.Bool +} + +// extendRandom extends the random numbers in r[:n] to the whole slice r. +// Treats n<0 as n==0. +func extendRandom(r []byte, n int) { + if n < 0 { + n = 0 + } + for n < len(r) { + // Extend random bits using hash function & time seed + w := n + if w > 16 { + w = 16 + } + h := memhash(unsafe.Pointer(&r[n-w]), uintptr(nanotime()), uintptr(w)) + for i := 0; i < goarch.PtrSize && n < len(r); i++ { + r[n] = byte(h) + n++ + h >>= 8 + } + } +} + +// A _defer holds an entry on the list of deferred calls. +// If you add a field here, add code to clear it in deferProcStack. 
+// This struct must match the code in cmd/compile/internal/ssagen/ssa.go:deferstruct +// and cmd/compile/internal/ssagen/ssa.go:(*state).call. +// Some defers will be allocated on the stack and some on the heap. +// All defers are logically part of the stack, so write barriers to +// initialize them are not required. All defers must be manually scanned, +// and for heap defers, marked. +type _defer struct { + started bool + heap bool + // openDefer indicates that this _defer is for a frame with open-coded + // defers. We have only one defer record for the entire frame (which may + // currently have 0, 1, or more defers active). + openDefer bool + sp uintptr // sp at time of defer + pc uintptr // pc at time of defer + fn func() // can be nil for open-coded defers + _panic *_panic // panic that is running defer + link *_defer // next defer on G; can point to either heap or stack! + + // If openDefer is true, the fields below record values about the stack + // frame and associated function that has the open-coded defer(s). sp + // above will be the sp for the frame, and pc will be address of the + // deferreturn call in the function. + fd unsafe.Pointer // funcdata for the function associated with the frame + varp uintptr // value of varp for the stack frame + // framepc is the current pc associated with the stack frame. Together, + // with sp above (which is the sp associated with the stack frame), + // framepc/sp can be used as pc/sp pair to continue a stack trace via + // gentraceback(). + framepc uintptr +} + +// A _panic holds information about an active panic. +// +// A _panic value must only ever live on the stack. +// +// The argp and link fields are stack pointers, but don't need special +// handling during stack growth: because they are pointer-typed and +// _panic values only live on the stack, regular stack pointer +// adjustment takes care of them. 
+type _panic struct { + argp unsafe.Pointer // pointer to arguments of deferred call run during panic; cannot move - known to liblink + arg any // argument to panic + link *_panic // link to earlier panic + pc uintptr // where to return to in runtime if this panic is bypassed + sp unsafe.Pointer // where to return to in runtime if this panic is bypassed + recovered bool // whether this panic is over + aborted bool // the panic was aborted + goexit bool +} + +// ancestorInfo records details of where a goroutine was started. +type ancestorInfo struct { + pcs []uintptr // pcs from the stack of this goroutine + goid uint64 // goroutine id of this goroutine; original goroutine possibly dead + gopc uintptr // pc of go statement that created this goroutine +} + +const ( + _TraceRuntimeFrames = 1 << iota // include frames for internal runtime functions. + _TraceTrap // the initial PC, SP are from a trap, not a return PC from a call + _TraceJumpStack // if traceback is on a systemstack, resume trace at g that called into it +) + +// The maximum number of frames we print for a traceback +const _TracebackMaxFrames = 100 + +// A waitReason explains why a goroutine has been stopped. +// See gopark. Do not re-use waitReasons, add new ones. 
+type waitReason uint8 + +const ( + waitReasonZero waitReason = iota // "" + waitReasonGCAssistMarking // "GC assist marking" + waitReasonIOWait // "IO wait" + waitReasonChanReceiveNilChan // "chan receive (nil chan)" + waitReasonChanSendNilChan // "chan send (nil chan)" + waitReasonDumpingHeap // "dumping heap" + waitReasonGarbageCollection // "garbage collection" + waitReasonGarbageCollectionScan // "garbage collection scan" + waitReasonPanicWait // "panicwait" + waitReasonSelect // "select" + waitReasonSelectNoCases // "select (no cases)" + waitReasonGCAssistWait // "GC assist wait" + waitReasonGCSweepWait // "GC sweep wait" + waitReasonGCScavengeWait // "GC scavenge wait" + waitReasonChanReceive // "chan receive" + waitReasonChanSend // "chan send" + waitReasonFinalizerWait // "finalizer wait" + waitReasonForceGCIdle // "force gc (idle)" + waitReasonSemacquire // "semacquire" + waitReasonSleep // "sleep" + waitReasonSyncCondWait // "sync.Cond.Wait" + waitReasonSyncMutexLock // "sync.Mutex.Lock" + waitReasonSyncRWMutexRLock // "sync.RWMutex.RLock" + waitReasonSyncRWMutexLock // "sync.RWMutex.Lock" + waitReasonTraceReaderBlocked // "trace reader (blocked)" + waitReasonWaitForGCCycle // "wait for GC cycle" + waitReasonGCWorkerIdle // "GC worker (idle)" + waitReasonGCWorkerActive // "GC worker (active)" + waitReasonPreempted // "preempted" + waitReasonDebugCall // "debug call" + waitReasonGCMarkTermination // "GC mark termination" + waitReasonStoppingTheWorld // "stopping the world" +) + +var waitReasonStrings = [...]string{ + waitReasonZero: "", + waitReasonGCAssistMarking: "GC assist marking", + waitReasonIOWait: "IO wait", + waitReasonChanReceiveNilChan: "chan receive (nil chan)", + waitReasonChanSendNilChan: "chan send (nil chan)", + waitReasonDumpingHeap: "dumping heap", + waitReasonGarbageCollection: "garbage collection", + waitReasonGarbageCollectionScan: "garbage collection scan", + waitReasonPanicWait: "panicwait", + waitReasonSelect: "select", + 
waitReasonSelectNoCases: "select (no cases)", + waitReasonGCAssistWait: "GC assist wait", + waitReasonGCSweepWait: "GC sweep wait", + waitReasonGCScavengeWait: "GC scavenge wait", + waitReasonChanReceive: "chan receive", + waitReasonChanSend: "chan send", + waitReasonFinalizerWait: "finalizer wait", + waitReasonForceGCIdle: "force gc (idle)", + waitReasonSemacquire: "semacquire", + waitReasonSleep: "sleep", + waitReasonSyncCondWait: "sync.Cond.Wait", + waitReasonSyncMutexLock: "sync.Mutex.Lock", + waitReasonSyncRWMutexRLock: "sync.RWMutex.RLock", + waitReasonSyncRWMutexLock: "sync.RWMutex.Lock", + waitReasonTraceReaderBlocked: "trace reader (blocked)", + waitReasonWaitForGCCycle: "wait for GC cycle", + waitReasonGCWorkerIdle: "GC worker (idle)", + waitReasonGCWorkerActive: "GC worker (active)", + waitReasonPreempted: "preempted", + waitReasonDebugCall: "debug call", + waitReasonGCMarkTermination: "GC mark termination", + waitReasonStoppingTheWorld: "stopping the world", +} + +func (w waitReason) String() string { + if w < 0 || w >= waitReason(len(waitReasonStrings)) { + return "unknown wait reason" + } + return waitReasonStrings[w] +} + +func (w waitReason) isMutexWait() bool { + return w == waitReasonSyncMutexLock || + w == waitReasonSyncRWMutexRLock || + w == waitReasonSyncRWMutexLock +} + +var ( + allm *m + gomaxprocs int32 + ncpu int32 + forcegc forcegcstate + sched schedt + newprocs int32 + + // allpLock protects P-less reads and size changes of allp, idlepMask, + // and timerpMask, and all writes to allp. + allpLock mutex + // len(allp) == gomaxprocs; may change at safe points, otherwise + // immutable. + allp []*p + // Bitmask of Ps in _Pidle list, one bit per P. Reads and writes must + // be atomic. Length may change at safe points. + // + // Each P must update only its own bit. 
In order to maintain + // consistency, a P going idle must update the idle mask simultaneously with + // updates to the idle P list under the sched.lock, otherwise a racing + // pidleget may clear the mask before pidleput sets the mask, + // corrupting the bitmap. + // + // N.B., procresize takes ownership of all Ps in stopTheWorldWithSema. + idlepMask pMask + // Bitmask of Ps that may have a timer, one bit per P. Reads and writes + // must be atomic. Length may change at safe points. + timerpMask pMask + + // Pool of GC parked background workers. Entries are type + // *gcBgMarkWorkerNode. + gcBgMarkWorkerPool lfstack + + // Total number of gcBgMarkWorker goroutines. Protected by worldsema. + gcBgMarkWorkerCount int32 + + // Information about what cpu features are available. + // Packages outside the runtime should not use these + // as they are not an external api. + // Set on startup in asm_{386,amd64}.s + processorVersionInfo uint32 + isIntel bool + + goarm uint8 // set by cmd/link on arm systems +) + +// Set by the linker so the runtime can determine the buildmode. +var ( + islibrary bool // -buildmode=c-shared + isarchive bool // -buildmode=c-archive +) + +// Must agree with internal/buildcfg.FramePointerEnabled. +const framepointer_enabled = GOARCH == "amd64" || GOARCH == "arm64" diff --git a/src/runtime/runtime_boring.go b/src/runtime/runtime_boring.go new file mode 100644 index 0000000..5a98b20 --- /dev/null +++ b/src/runtime/runtime_boring.go @@ -0,0 +1,19 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import _ "unsafe" // for go:linkname + +//go:linkname boring_runtime_arg0 crypto/internal/boring.runtime_arg0 +func boring_runtime_arg0() string { + // On Windows, argslice is not set, and it's too much work to find argv0.
+ if len(argslice) == 0 { + return "" + } + return argslice[0] +} + +//go:linkname fipstls_runtime_arg0 crypto/internal/boring/fipstls.runtime_arg0 +func fipstls_runtime_arg0() string { return boring_runtime_arg0() } diff --git a/src/runtime/runtime_linux_test.go b/src/runtime/runtime_linux_test.go new file mode 100644 index 0000000..6af5561 --- /dev/null +++ b/src/runtime/runtime_linux_test.go @@ -0,0 +1,65 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + . "runtime" + "syscall" + "testing" + "time" + "unsafe" +) + +var pid, tid int + +func init() { + // Record pid and tid of init thread for use during test. + // The call to LockOSThread is just to exercise it; + // we can't test that it does anything. + // Instead we're testing that the conditions are good + // for how it is used in init (must be on main thread). + pid, tid = syscall.Getpid(), syscall.Gettid() + LockOSThread() + + sysNanosleep = func(d time.Duration) { + // Invoke a blocking syscall directly; calling time.Sleep() + // would deschedule the goroutine instead. + ts := syscall.NsecToTimespec(d.Nanoseconds()) + for { + if err := syscall.Nanosleep(&ts, &ts); err != syscall.EINTR { + return + } + } + } +} + +func TestLockOSThread(t *testing.T) { + if pid != tid { + t.Fatalf("pid=%d but tid=%d", pid, tid) + } +} + +// Test that error values are negative. +// Use a misaligned pointer to get -EINVAL. +func TestMincoreErrorSign(t *testing.T) { + var dst byte + v := Mincore(Add(unsafe.Pointer(new(int32)), 1), 1, &dst) + + const EINVAL = 0x16 + if v != -EINVAL { + t.Errorf("mincore = %v, want %v", v, -EINVAL) + } +} + +func TestKernelStructSize(t *testing.T) { + // Check that the Go definitions of structures exchanged with the kernel are + // the same size as what the kernel defines. 
+ if have, want := unsafe.Sizeof(Siginfo{}), uintptr(SiginfoMaxSize); have != want { + t.Errorf("Go's siginfo struct is %d bytes long; kernel expects %d", have, want) + } + if have, want := unsafe.Sizeof(Sigevent{}), uintptr(SigeventMaxSize); have != want { + t.Errorf("Go's sigevent struct is %d bytes long; kernel expects %d", have, want) + } +} diff --git a/src/runtime/runtime_mmap_test.go b/src/runtime/runtime_mmap_test.go new file mode 100644 index 0000000..456f913 --- /dev/null +++ b/src/runtime/runtime_mmap_test.go @@ -0,0 +1,53 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime_test + +import ( + "runtime" + "testing" + "unsafe" +) + +// Test that the error value returned by mmap is positive, as that is +// what the code in mem_bsd.go, mem_darwin.go, and mem_linux.go expects. +// See the uses of ENOMEM in sysMap in those files. +func TestMmapErrorSign(t *testing.T) { + p, err := runtime.Mmap(nil, ^uintptr(0)&^(runtime.GetPhysPageSize()-1), 0, runtime.MAP_ANON|runtime.MAP_PRIVATE, -1, 0) + + if p != nil || err != runtime.ENOMEM { + t.Errorf("mmap = %v, %v, want nil, %v", p, err, runtime.ENOMEM) + } +} + +func TestPhysPageSize(t *testing.T) { + // Mmap fails if the address is not page aligned, so we can + // use this to test if the page size is the true page size. + ps := runtime.GetPhysPageSize() + + // Get a region of memory to play with. This should be page-aligned. + b, err := runtime.Mmap(nil, 2*ps, 0, runtime.MAP_ANON|runtime.MAP_PRIVATE, -1, 0) + if err != 0 { + t.Fatalf("Mmap: %v", err) + } + + if runtime.GOOS == "aix" { + // AIX does not allow mapping a range that is already mapped. + runtime.Munmap(unsafe.Pointer(uintptr(b)), 2*ps) + } + + // Mmap should fail at a half page into the buffer. 
+ _, err = runtime.Mmap(unsafe.Pointer(uintptr(b)+ps/2), ps, 0, runtime.MAP_ANON|runtime.MAP_PRIVATE|runtime.MAP_FIXED, -1, 0) + if err == 0 { + t.Errorf("Mmap should have failed with half-page alignment %d, but succeeded: %v", ps/2, err) + } + + // Mmap should succeed at a full page into the buffer. + _, err = runtime.Mmap(unsafe.Pointer(uintptr(b)+ps), ps, 0, runtime.MAP_ANON|runtime.MAP_PRIVATE|runtime.MAP_FIXED, -1, 0) + if err != 0 { + t.Errorf("Mmap at full-page alignment %d failed: %v", ps, err) + } +} diff --git a/src/runtime/runtime_test.go b/src/runtime/runtime_test.go new file mode 100644 index 0000000..2faf06e --- /dev/null +++ b/src/runtime/runtime_test.go @@ -0,0 +1,543 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "flag" + "fmt" + "io" + . "runtime" + "runtime/debug" + "sort" + "strings" + "sync" + "testing" + "time" + "unsafe" +) + +// flagQuick is set by the -quick option to skip some relatively slow tests. +// This is used by the cmd/dist test runtime:cpu124. +// The cmd/dist test passes both -test.short and -quick; +// there are tests that only check testing.Short, and those tests will +// not be skipped if only -quick is used. +var flagQuick = flag.Bool("quick", false, "skip slow tests, for cmd/dist test runtime:cpu124") + +func init() { + // We're testing the runtime, so make tracebacks show things + // in the runtime. This only raises the level, so it won't + // override GOTRACEBACK=crash from the user. 
+ SetTracebackEnv("system") +} + +var errf error + +func errfn() error { + return errf +} + +func errfn1() error { + return io.EOF +} + +func BenchmarkIfaceCmp100(b *testing.B) { + for i := 0; i < b.N; i++ { + for j := 0; j < 100; j++ { + if errfn() == io.EOF { + b.Fatal("bad comparison") + } + } + } +} + +func BenchmarkIfaceCmpNil100(b *testing.B) { + for i := 0; i < b.N; i++ { + for j := 0; j < 100; j++ { + if errfn1() == nil { + b.Fatal("bad comparison") + } + } + } +} + +var efaceCmp1 any +var efaceCmp2 any + +func BenchmarkEfaceCmpDiff(b *testing.B) { + x := 5 + efaceCmp1 = &x + y := 6 + efaceCmp2 = &y + for i := 0; i < b.N; i++ { + for j := 0; j < 100; j++ { + if efaceCmp1 == efaceCmp2 { + b.Fatal("bad comparison") + } + } + } +} + +func BenchmarkEfaceCmpDiffIndirect(b *testing.B) { + efaceCmp1 = [2]int{1, 2} + efaceCmp2 = [2]int{1, 2} + for i := 0; i < b.N; i++ { + for j := 0; j < 100; j++ { + if efaceCmp1 != efaceCmp2 { + b.Fatal("bad comparison") + } + } + } +} + +func BenchmarkDefer(b *testing.B) { + for i := 0; i < b.N; i++ { + defer1() + } +} + +func defer1() { + defer func(x, y, z int) { + if recover() != nil || x != 1 || y != 2 || z != 3 { + panic("bad recover") + } + }(1, 2, 3) +} + +func BenchmarkDefer10(b *testing.B) { + for i := 0; i < b.N/10; i++ { + defer2() + } +} + +func defer2() { + for i := 0; i < 10; i++ { + defer func(x, y, z int) { + if recover() != nil || x != 1 || y != 2 || z != 3 { + panic("bad recover") + } + }(1, 2, 3) + } +} + +func BenchmarkDeferMany(b *testing.B) { + for i := 0; i < b.N; i++ { + defer func(x, y, z int) { + if recover() != nil || x != 1 || y != 2 || z != 3 { + panic("bad recover") + } + }(1, 2, 3) + } +} + +func BenchmarkPanicRecover(b *testing.B) { + for i := 0; i < b.N; i++ { + defer3() + } +} + +func defer3() { + defer func(x, y, z int) { + if recover() == nil { + panic("failed recover") + } + }(1, 2, 3) + panic("hi") +} + +// golang.org/issue/7063 +func TestStopCPUProfilingWithProfilerOff(t *testing.T) { + 
SetCPUProfileRate(0) +} + +// Addresses to test for faulting behavior. +// This is less a test of SetPanicOnFault and more a check that +// the operating system and the runtime can process these faults +// correctly. That is, we're indirectly testing that without SetPanicOnFault +// these would manage to turn into ordinary crashes. +// Note that these are truncated on 32-bit systems, so the bottom 32 bits +// of the larger addresses must themselves be invalid addresses. +// We might get unlucky and the OS might have mapped one of these +// addresses, but probably not: they're all in the first page, very high +// addresses that normally an OS would reserve for itself, or malformed +// addresses. Even so, we might have to remove one or two on different +// systems. We will see. + +var faultAddrs = []uint64{ + // low addresses + 0, + 1, + 0xfff, + // high (kernel) addresses + // or else malformed. + 0xffffffffffffffff, + 0xfffffffffffff001, + 0xffffffffffff0001, + 0xfffffffffff00001, + 0xffffffffff000001, + 0xfffffffff0000001, + 0xffffffff00000001, + 0xfffffff000000001, + 0xffffff0000000001, + 0xfffff00000000001, + 0xffff000000000001, + 0xfff0000000000001, + 0xff00000000000001, + 0xf000000000000001, + 0x8000000000000001, +} + +func TestSetPanicOnFault(t *testing.T) { + old := debug.SetPanicOnFault(true) + defer debug.SetPanicOnFault(old) + + nfault := 0 + for _, addr := range faultAddrs { + testSetPanicOnFault(t, uintptr(addr), &nfault) + } + if nfault == 0 { + t.Fatalf("none of the addresses faulted") + } +} + +// testSetPanicOnFault tests one potentially faulting address. +// It deliberately constructs and uses an invalid pointer, +// so mark it as nocheckptr. 
+// +//go:nocheckptr +func testSetPanicOnFault(t *testing.T, addr uintptr, nfault *int) { + if GOOS == "js" { + t.Skip("js does not support catching faults") + } + + defer func() { + if err := recover(); err != nil { + *nfault++ + } + }() + + // The read should fault, except that sometimes we hit + // addresses that have had C or kernel pages mapped there + // readable by user code. So just log the content. + // If no addresses fault, we'll fail the test. + v := *(*byte)(unsafe.Pointer(addr)) + t.Logf("addr %#x: %#x\n", addr, v) +} + +func eqstring_generic(s1, s2 string) bool { + if len(s1) != len(s2) { + return false + } + // optimization in assembly versions: + // if s1.str == s2.str { return true } + for i := 0; i < len(s1); i++ { + if s1[i] != s2[i] { + return false + } + } + return true +} + +func TestEqString(t *testing.T) { + // This isn't really an exhaustive test of == on strings, it's + // just a convenient way of documenting (via eqstring_generic) + // what == does. + s := []string{ + "", + "a", + "c", + "aaa", + "ccc", + "cccc"[:3], // same contents, different string + "1234567890", + } + for _, s1 := range s { + for _, s2 := range s { + x := s1 == s2 + y := eqstring_generic(s1, s2) + if x != y { + t.Errorf(`("%s" == "%s") = %t, want %t`, s1, s2, x, y) + } + } + } +} + +func TestTrailingZero(t *testing.T) { + // make sure we add padding for structs with trailing zero-sized fields + type T1 struct { + n int32 + z [0]byte + } + if unsafe.Sizeof(T1{}) != 8 { + t.Errorf("sizeof(%#v)==%d, want 8", T1{}, unsafe.Sizeof(T1{})) + } + type T2 struct { + n int64 + z struct{} + } + if unsafe.Sizeof(T2{}) != 8+unsafe.Sizeof(uintptr(0)) { + t.Errorf("sizeof(%#v)==%d, want %d", T2{}, unsafe.Sizeof(T2{}), 8+unsafe.Sizeof(uintptr(0))) + } + type T3 struct { + n byte + z [4]struct{} + } + if unsafe.Sizeof(T3{}) != 2 { + t.Errorf("sizeof(%#v)==%d, want 2", T3{}, unsafe.Sizeof(T3{})) + } + // make sure padding can double for both zerosize and alignment + type T4 struct { + 
a int32 + b int16 + c int8 + z struct{} + } + if unsafe.Sizeof(T4{}) != 8 { + t.Errorf("sizeof(%#v)==%d, want 8", T4{}, unsafe.Sizeof(T4{})) + } + // make sure we don't pad a zero-sized thing + type T5 struct { + } + if unsafe.Sizeof(T5{}) != 0 { + t.Errorf("sizeof(%#v)==%d, want 0", T5{}, unsafe.Sizeof(T5{})) + } +} + +func TestAppendGrowth(t *testing.T) { + var x []int64 + check := func(want int) { + if cap(x) != want { + t.Errorf("len=%d, cap=%d, want cap=%d", len(x), cap(x), want) + } + } + + check(0) + want := 1 + for i := 1; i <= 100; i++ { + x = append(x, 1) + check(want) + if i&(i-1) == 0 { + want = 2 * i + } + } +} + +var One = []int64{1} + +func TestAppendSliceGrowth(t *testing.T) { + var x []int64 + check := func(want int) { + if cap(x) != want { + t.Errorf("len=%d, cap=%d, want cap=%d", len(x), cap(x), want) + } + } + + check(0) + want := 1 + for i := 1; i <= 100; i++ { + x = append(x, One...) + check(want) + if i&(i-1) == 0 { + want = 2 * i + } + } +} + +func TestGoroutineProfileTrivial(t *testing.T) { + // Calling GoroutineProfile twice in a row should find the same number of goroutines, + // but it's possible there are goroutines just about to exit, so we might end up + // with fewer in the second call. Try a few times; it should converge once those + // zombies are gone. 
+ for i := 0; ; i++ { + n1, ok := GoroutineProfile(nil) // should fail, there's at least 1 goroutine + if n1 < 1 || ok { + t.Fatalf("GoroutineProfile(nil) = %d, %v, want >0, false", n1, ok) + } + n2, ok := GoroutineProfile(make([]StackRecord, n1)) + if n2 == n1 && ok { + break + } + t.Logf("GoroutineProfile(%d) = %d, %v, want %d, true", n1, n2, ok, n1) + if i >= 10 { + t.Fatalf("GoroutineProfile not converging") + } + } +} + +func BenchmarkGoroutineProfile(b *testing.B) { + run := func(fn func() bool) func(b *testing.B) { + runOne := func(b *testing.B) { + latencies := make([]time.Duration, 0, b.N) + + b.ResetTimer() + for i := 0; i < b.N; i++ { + start := time.Now() + ok := fn() + if !ok { + b.Fatal("goroutine profile failed") + } + latencies = append(latencies, time.Since(start)) + } + b.StopTimer() + + // Sort latencies then report percentiles. + sort.Slice(latencies, func(i, j int) bool { + return latencies[i] < latencies[j] + }) + b.ReportMetric(float64(latencies[len(latencies)*50/100]), "p50-ns") + b.ReportMetric(float64(latencies[len(latencies)*90/100]), "p90-ns") + b.ReportMetric(float64(latencies[len(latencies)*99/100]), "p99-ns") + } + return func(b *testing.B) { + b.Run("idle", runOne) + + b.Run("loaded", func(b *testing.B) { + stop := applyGCLoad(b) + runOne(b) + // Make sure to stop the timer before we wait! The load created above + // is very heavy-weight and not easy to stop, so we could end up + // confusing the benchmarking framework for small b.N. 
+ b.StopTimer() + stop() + }) + } + } + + // Measure the cost of counting goroutines + b.Run("small-nil", run(func() bool { + GoroutineProfile(nil) + return true + })) + + // Measure the cost with a small set of goroutines + n := NumGoroutine() + p := make([]StackRecord, 2*n+2*GOMAXPROCS(0)) + b.Run("small", run(func() bool { + _, ok := GoroutineProfile(p) + return ok + })) + + // Measure the cost with a large set of goroutines + ch := make(chan int) + var ready, done sync.WaitGroup + for i := 0; i < 5000; i++ { + ready.Add(1) + done.Add(1) + go func() { ready.Done(); <-ch; done.Done() }() + } + ready.Wait() + + // Count goroutines with a large allgs list + b.Run("large-nil", run(func() bool { + GoroutineProfile(nil) + return true + })) + + n = NumGoroutine() + p = make([]StackRecord, 2*n+2*GOMAXPROCS(0)) + b.Run("large", run(func() bool { + _, ok := GoroutineProfile(p) + return ok + })) + + close(ch) + done.Wait() + + // Count goroutines with a large (but unused) allgs list + b.Run("sparse-nil", run(func() bool { + GoroutineProfile(nil) + return true + })) + + // Measure the cost of a large (but unused) allgs list + n = NumGoroutine() + p = make([]StackRecord, 2*n+2*GOMAXPROCS(0)) + b.Run("sparse", run(func() bool { + _, ok := GoroutineProfile(p) + return ok + })) +} + +func TestVersion(t *testing.T) { + // Test that version does not contain \r or \n. + vers := Version() + if strings.Contains(vers, "\r") || strings.Contains(vers, "\n") { + t.Fatalf("cr/nl in version: %q", vers) + } +} + +func TestTimediv(t *testing.T) { + for _, tc := range []struct { + num int64 + div int32 + ret int32 + rem int32 + }{ + { + num: 8, + div: 2, + ret: 4, + rem: 0, + }, + { + num: 9, + div: 2, + ret: 4, + rem: 1, + }, + { + // Used by runtime.check. + num: 12345*1000000000 + 54321, + div: 1000000000, + ret: 12345, + rem: 54321, + }, + { + num: 1<<32 - 1, + div: 2, + ret: 1<<31 - 1, // no overflow. + rem: 1, + }, + { + num: 1 << 32, + div: 2, + ret: 1<<31 - 1, // overflow. 
+ rem: 0, + }, + { + num: 1 << 40, + div: 2, + ret: 1<<31 - 1, // overflow. + rem: 0, + }, + { + num: 1<<40 + 1, + div: 1 << 10, + ret: 1 << 30, + rem: 1, + }, + } { + name := fmt.Sprintf("%d div %d", tc.num, tc.div) + t.Run(name, func(t *testing.T) { + // Double check that the inputs make sense using + // standard 64-bit division. + ret64 := tc.num / int64(tc.div) + rem64 := tc.num % int64(tc.div) + if ret64 != int64(int32(ret64)) { + // Simulate timediv overflow value. + ret64 = 1<<31 - 1 + rem64 = 0 + } + if ret64 != int64(tc.ret) { + t.Errorf("%d / %d got ret %d rem %d want ret %d rem %d", tc.num, tc.div, ret64, rem64, tc.ret, tc.rem) + } + + var rem int32 + ret := Timediv(tc.num, tc.div, &rem) + if ret != tc.ret || rem != tc.rem { + t.Errorf("timediv %d / %d got ret %d rem %d want ret %d rem %d", tc.num, tc.div, ret, rem, tc.ret, tc.rem) + } + }) + } +} diff --git a/src/runtime/runtime_unix_test.go b/src/runtime/runtime_unix_test.go new file mode 100644 index 0000000..642a946 --- /dev/null +++ b/src/runtime/runtime_unix_test.go @@ -0,0 +1,56 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Only works on systems with syscall.Close. +// We need a fast system call to provoke the race, +// and Close(-1) is nearly universally fast. + +//go:build aix || darwin || dragonfly || freebsd || linux || netbsd || openbsd || plan9 + +package runtime_test + +import ( + "runtime" + "sync" + "sync/atomic" + "syscall" + "testing" +) + +func TestGoroutineProfile(t *testing.T) { + // GoroutineProfile used to use the wrong starting sp for + // goroutines coming out of system calls, causing possible + // crashes. 
+ defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(100)) + + var stop uint32 + defer atomic.StoreUint32(&stop, 1) // in case of panic + + var wg sync.WaitGroup + for i := 0; i < 4; i++ { + wg.Add(1) + go func() { + for atomic.LoadUint32(&stop) == 0 { + syscall.Close(-1) + } + wg.Done() + }() + } + + max := 10000 + if testing.Short() { + max = 100 + } + stk := make([]runtime.StackRecord, 128) + for n := 0; n < max; n++ { + _, ok := runtime.GoroutineProfile(stk) + if !ok { + t.Fatalf("GoroutineProfile failed") + } + } + + // If the program didn't crash, we passed. + atomic.StoreUint32(&stop, 1) + wg.Wait() +} diff --git a/src/runtime/rwmutex.go b/src/runtime/rwmutex.go new file mode 100644 index 0000000..34d8f67 --- /dev/null +++ b/src/runtime/rwmutex.go @@ -0,0 +1,167 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/atomic" +) + +// This is a copy of sync/rwmutex.go rewritten to work in the runtime. + +// A rwmutex is a reader/writer mutual exclusion lock. +// The lock can be held by an arbitrary number of readers or a single writer. +// This is a variant of sync.RWMutex, for the runtime package. +// Like mutex, rwmutex blocks the calling M. +// It does not interact with the goroutine scheduler. 
+type rwmutex struct { + rLock mutex // protects readers, readerPass, writer + readers muintptr // list of pending readers + readerPass uint32 // number of pending readers to skip readers list + + wLock mutex // serializes writers + writer muintptr // pending writer waiting for completing readers + + readerCount atomic.Int32 // number of pending readers + readerWait atomic.Int32 // number of departing readers + + readRank lockRank // semantic lock rank for read locking +} + +// Lock ranking an rwmutex has two aspects: +// +// Semantic ranking: this rwmutex represents some higher level lock that +// protects some resource (e.g., allocmLock protects creation of new Ms). The +// read and write locks of that resource need to be represented in the lock +// rank. +// +// Internal ranking: as an implementation detail, rwmutex uses two mutexes: +// rLock and wLock. These have lock order requirements: wLock must be locked +// before rLock. This also needs to be represented in the lock rank. +// +// Semantic ranking is represented by acquiring readRank during read lock and +// writeRank during write lock. +// +// wLock is held for the duration of a write lock, so it uses writeRank +// directly, both for semantic and internal ranking. rLock is only held +// temporarily inside the rlock/lock methods, so it uses readRankInternal to +// represent internal ranking. Semantic ranking is represented by a separate +// acquire of readRank for the duration of a read lock. +// +// The lock ranking must document this ordering: +// - readRankInternal is a leaf lock. +// - readRank is taken before readRankInternal. +// - writeRank is taken before readRankInternal. +// - readRank is placed in the lock order wherever a read lock of this rwmutex +// belongs. +// - writeRank is placed in the lock order wherever a write lock of this +// rwmutex belongs. 
+func (rw *rwmutex) init(readRank, readRankInternal, writeRank lockRank) { + rw.readRank = readRank + + lockInit(&rw.rLock, readRankInternal) + lockInit(&rw.wLock, writeRank) +} + +const rwmutexMaxReaders = 1 << 30 + +// rlock locks rw for reading. +func (rw *rwmutex) rlock() { + // The reader must not be allowed to lose its P or else other + // things blocking on the lock may consume all of the Ps and + // deadlock (issue #20903). Alternatively, we could drop the P + // while sleeping. + acquirem() + + acquireLockRank(rw.readRank) + lockWithRankMayAcquire(&rw.rLock, getLockRank(&rw.rLock)) + + if rw.readerCount.Add(1) < 0 { + // A writer is pending. Park on the reader queue. + systemstack(func() { + lock(&rw.rLock) + if rw.readerPass > 0 { + // Writer finished. + rw.readerPass -= 1 + unlock(&rw.rLock) + } else { + // Queue this reader to be woken by + // the writer. + m := getg().m + m.schedlink = rw.readers + rw.readers.set(m) + unlock(&rw.rLock) + notesleep(&m.park) + noteclear(&m.park) + } + }) + } +} + +// runlock undoes a single rlock call on rw. +func (rw *rwmutex) runlock() { + if r := rw.readerCount.Add(-1); r < 0 { + if r+1 == 0 || r+1 == -rwmutexMaxReaders { + throw("runlock of unlocked rwmutex") + } + // A writer is pending. + if rw.readerWait.Add(-1) == 0 { + // The last reader unblocks the writer. + lock(&rw.rLock) + w := rw.writer.ptr() + if w != nil { + notewakeup(&w.park) + } + unlock(&rw.rLock) + } + } + releaseLockRank(rw.readRank) + releasem(getg().m) +} + +// lock locks rw for writing. +func (rw *rwmutex) lock() { + // Resolve competition with other writers and stick to our P. + lock(&rw.wLock) + m := getg().m + // Announce that there is a pending writer. + r := rw.readerCount.Add(-rwmutexMaxReaders) + rwmutexMaxReaders + // Wait for any active readers to complete. + lock(&rw.rLock) + if r != 0 && rw.readerWait.Add(r) != 0 { + // Wait for reader to wake us up. 
+ systemstack(func() { + rw.writer.set(m) + unlock(&rw.rLock) + notesleep(&m.park) + noteclear(&m.park) + }) + } else { + unlock(&rw.rLock) + } +} + +// unlock unlocks rw for writing. +func (rw *rwmutex) unlock() { + // Announce to readers that there is no active writer. + r := rw.readerCount.Add(rwmutexMaxReaders) + if r >= rwmutexMaxReaders { + throw("unlock of unlocked rwmutex") + } + // Unblock blocked readers. + lock(&rw.rLock) + for rw.readers.ptr() != nil { + reader := rw.readers.ptr() + rw.readers = reader.schedlink + reader.schedlink.set(nil) + notewakeup(&reader.park) + r -= 1 + } + // If r > 0, there are pending readers that aren't on the + // queue. Tell them to skip waiting. + rw.readerPass += uint32(r) + unlock(&rw.rLock) + // Allow other writers to proceed. + unlock(&rw.wLock) +} diff --git a/src/runtime/rwmutex_test.go b/src/runtime/rwmutex_test.go new file mode 100644 index 0000000..bdeb9c4 --- /dev/null +++ b/src/runtime/rwmutex_test.go @@ -0,0 +1,195 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// GOMAXPROCS=10 go test + +// This is a copy of sync/rwmutex_test.go rewritten to test the +// runtime rwmutex. + +package runtime_test + +import ( + "fmt" + . "runtime" + "runtime/debug" + "sync/atomic" + "testing" +) + +func parallelReader(m *RWMutex, clocked chan bool, cunlock *atomic.Bool, cdone chan bool) { + m.RLock() + clocked <- true + for !cunlock.Load() { + } + m.RUnlock() + cdone <- true +} + +func doTestParallelReaders(numReaders int) { + GOMAXPROCS(numReaders + 1) + var m RWMutex + m.Init() + clocked := make(chan bool, numReaders) + var cunlock atomic.Bool + cdone := make(chan bool) + for i := 0; i < numReaders; i++ { + go parallelReader(&m, clocked, &cunlock, cdone) + } + // Wait for all parallel RLock()s to succeed. 
+ for i := 0; i < numReaders; i++ { + <-clocked + } + cunlock.Store(true) + // Wait for the goroutines to finish. + for i := 0; i < numReaders; i++ { + <-cdone + } +} + +func TestParallelRWMutexReaders(t *testing.T) { + if GOARCH == "wasm" { + t.Skip("wasm has no threads yet") + } + defer GOMAXPROCS(GOMAXPROCS(-1)) + // If runtime triggers a forced GC during this test then it will deadlock, + // since the goroutines can't be stopped/preempted. + // Disable GC for this test (see issue #10958). + defer debug.SetGCPercent(debug.SetGCPercent(-1)) + // SetGCPercent waits until the mark phase is over, but the runtime + // also preempts at the start of the sweep phase, so make sure that's + // done too. + GC() + + doTestParallelReaders(1) + doTestParallelReaders(3) + doTestParallelReaders(4) +} + +func reader(rwm *RWMutex, num_iterations int, activity *int32, cdone chan bool) { + for i := 0; i < num_iterations; i++ { + rwm.RLock() + n := atomic.AddInt32(activity, 1) + if n < 1 || n >= 10000 { + panic(fmt.Sprintf("wlock(%d)\n", n)) + } + for i := 0; i < 100; i++ { + } + atomic.AddInt32(activity, -1) + rwm.RUnlock() + } + cdone <- true +} + +func writer(rwm *RWMutex, num_iterations int, activity *int32, cdone chan bool) { + for i := 0; i < num_iterations; i++ { + rwm.Lock() + n := atomic.AddInt32(activity, 10000) + if n != 10000 { + panic(fmt.Sprintf("wlock(%d)\n", n)) + } + for i := 0; i < 100; i++ { + } + atomic.AddInt32(activity, -10000) + rwm.Unlock() + } + cdone <- true +} + +func HammerRWMutex(gomaxprocs, numReaders, num_iterations int) { + GOMAXPROCS(gomaxprocs) + // Number of active readers + 10000 * number of active writers. 
+ var activity int32 + var rwm RWMutex + rwm.Init() + cdone := make(chan bool) + go writer(&rwm, num_iterations, &activity, cdone) + var i int + for i = 0; i < numReaders/2; i++ { + go reader(&rwm, num_iterations, &activity, cdone) + } + go writer(&rwm, num_iterations, &activity, cdone) + for ; i < numReaders; i++ { + go reader(&rwm, num_iterations, &activity, cdone) + } + // Wait for the 2 writers and all readers to finish. + for i := 0; i < 2+numReaders; i++ { + <-cdone + } +} + +func TestRWMutex(t *testing.T) { + defer GOMAXPROCS(GOMAXPROCS(-1)) + n := 1000 + if testing.Short() { + n = 5 + } + HammerRWMutex(1, 1, n) + HammerRWMutex(1, 3, n) + HammerRWMutex(1, 10, n) + HammerRWMutex(4, 1, n) + HammerRWMutex(4, 3, n) + HammerRWMutex(4, 10, n) + HammerRWMutex(10, 1, n) + HammerRWMutex(10, 3, n) + HammerRWMutex(10, 10, n) + HammerRWMutex(10, 5, n) +} + +func BenchmarkRWMutexUncontended(b *testing.B) { + type PaddedRWMutex struct { + RWMutex + pad [32]uint32 + } + b.RunParallel(func(pb *testing.PB) { + var rwm PaddedRWMutex + rwm.Init() + for pb.Next() { + rwm.RLock() + rwm.RLock() + rwm.RUnlock() + rwm.RUnlock() + rwm.Lock() + rwm.Unlock() + } + }) +} + +func benchmarkRWMutex(b *testing.B, localWork, writeRatio int) { + var rwm RWMutex + rwm.Init() + b.RunParallel(func(pb *testing.PB) { + foo := 0 + for pb.Next() { + foo++ + if foo%writeRatio == 0 { + rwm.Lock() + rwm.Unlock() + } else { + rwm.RLock() + for i := 0; i != localWork; i += 1 { + foo *= 2 + foo /= 2 + } + rwm.RUnlock() + } + } + _ = foo + }) +} + +func BenchmarkRWMutexWrite100(b *testing.B) { + benchmarkRWMutex(b, 0, 100) +} + +func BenchmarkRWMutexWrite10(b *testing.B) { + benchmarkRWMutex(b, 0, 10) +} + +func BenchmarkRWMutexWorkWrite100(b *testing.B) { + benchmarkRWMutex(b, 100, 100) +} + +func BenchmarkRWMutexWorkWrite10(b *testing.B) { + benchmarkRWMutex(b, 100, 10) +} diff --git a/src/runtime/security_aix.go b/src/runtime/security_aix.go new file mode 100644 index 0000000..c11b9c3 --- /dev/null +++ 
b/src/runtime/security_aix.go @@ -0,0 +1,17 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// secureMode is only ever mutated in schedinit, so we don't need to worry about +// synchronization primitives. +var secureMode bool + +func initSecureMode() { + secureMode = !(getuid() == geteuid() && getgid() == getegid()) +} + +func isSecureMode() bool { + return secureMode +} diff --git a/src/runtime/security_issetugid.go b/src/runtime/security_issetugid.go new file mode 100644 index 0000000..5048632 --- /dev/null +++ b/src/runtime/security_issetugid.go @@ -0,0 +1,19 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build darwin || dragonfly || freebsd || illumos || netbsd || openbsd || solaris + +package runtime + +// secureMode is only ever mutated in schedinit, so we don't need to worry about +// synchronization primitives. +var secureMode bool + +func initSecureMode() { + secureMode = issetugid() == 1 +} + +func isSecureMode() bool { + return secureMode +} diff --git a/src/runtime/security_linux.go b/src/runtime/security_linux.go new file mode 100644 index 0000000..181f3a1 --- /dev/null +++ b/src/runtime/security_linux.go @@ -0,0 +1,15 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import _ "unsafe" + +func initSecureMode() { + // We have already initialized the secureMode bool in sysauxv. +} + +func isSecureMode() bool { + return secureMode +} diff --git a/src/runtime/security_nonunix.go b/src/runtime/security_nonunix.go new file mode 100644 index 0000000..fc9571c --- /dev/null +++ b/src/runtime/security_nonunix.go @@ -0,0 +1,13 @@ +// Copyright 2023 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !unix + +package runtime + +func isSecureMode() bool { + return false +} + +func secure() {} diff --git a/src/runtime/security_test.go b/src/runtime/security_test.go new file mode 100644 index 0000000..1d30411 --- /dev/null +++ b/src/runtime/security_test.go @@ -0,0 +1,143 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime_test + +import ( + "bytes" + "context" + "fmt" + "internal/testenv" + "io" + "os" + "os/exec" + "path/filepath" + "runtime" + "strings" + "testing" + "time" +) + +func privesc(command string, args ...string) error { + ctx, cancel := context.WithTimeout(context.Background(), time.Second*5) + defer cancel() + var cmd *exec.Cmd + if runtime.GOOS == "darwin" { + cmd = exec.CommandContext(ctx, "sudo", append([]string{"-n", command}, args...)...) + } else { + cmd = exec.CommandContext(ctx, "su", highPrivUser, "-c", fmt.Sprintf("%s %s", command, strings.Join(args, " "))) + } + _, err := cmd.CombinedOutput() + return err +} + +const highPrivUser = "root" + +func setSetuid(t *testing.T, user, bin string) { + t.Helper() + // We escalate privileges here even if we are root, because for some reason on some builders + // (at least freebsd-amd64-13_0) the default PATH doesn't include /usr/sbin, which is where + // chown lives, but using 'su root -c' gives us the correct PATH. + + // buildTestProg uses os.MkdirTemp which creates directories with 0700, which prevents + // setuid binaries from executing because of the missing g+rx, so we need to set the parent + // directory to better permissions before anything else. We created this directory, so we + // shouldn't need to do any privilege trickery. 
+ if err := privesc("chmod", "0777", filepath.Dir(bin)); err != nil { + t.Skipf("unable to set permissions on %q, likely no passwordless sudo/su: %s", filepath.Dir(bin), err) + } + + if err := privesc("chown", user, bin); err != nil { + t.Skipf("unable to set permissions on test binary, likely no passwordless sudo/su: %s", err) + } + if err := privesc("chmod", "u+s", bin); err != nil { + t.Skipf("unable to set permissions on test binary, likely no passwordless sudo/su: %s", err) + } +} + +func TestSUID(t *testing.T) { + // This test is relatively simple, we build a test program which opens a + // file passed via the TEST_OUTPUT envvar, prints the value of the + // GOTRACEBACK envvar to stdout, and prints "hello" to stderr. We then chown + // the program to "nobody" and set u+s on it. We execute the program, only + // passing it two files, for stdin and stdout, and passing + // GOTRACEBACK=system in the env. + // + // We expect that the program will trigger the SUID protections, resetting + // the value of GOTRACEBACK, and opening the missing stderr descriptor, such + // that the program prints "GOTRACEBACK=none" to stdout, and nothing gets + // written to the file pointed at by TEST_OUTPUT. 
+ + if *flagQuick { + t.Skip("-quick") + } + + testenv.MustHaveGoBuild(t) + + helloBin, err := buildTestProg(t, "testsuid") + if err != nil { + t.Fatal(err) + } + + f, err := os.CreateTemp(t.TempDir(), "suid-output") + if err != nil { + t.Fatal(err) + } + tempfilePath := f.Name() + f.Close() + + lowPrivUser := "nobody" + setSetuid(t, lowPrivUser, helloBin) + + b := bytes.NewBuffer(nil) + pr, pw, err := os.Pipe() + if err != nil { + t.Fatal(err) + } + + proc, err := os.StartProcess(helloBin, []string{helloBin}, &os.ProcAttr{ + Env: []string{"GOTRACEBACK=system", "TEST_OUTPUT=" + tempfilePath}, + Files: []*os.File{os.Stdin, pw}, + }) + if err != nil { + if os.IsPermission(err) { + t.Skip("don't have execute permission on setuid binary, possibly directory permission issue?") + } + t.Fatal(err) + } + done := make(chan bool, 1) + go func() { + io.Copy(b, pr) + pr.Close() + done <- true + }() + ps, err := proc.Wait() + if err != nil { + t.Fatal(err) + } + pw.Close() + <-done + output := b.String() + + if ps.ExitCode() == 99 { + t.Skip("binary wasn't setuid (uid == euid), unable to effectively test") + } + + expected := "GOTRACEBACK=none\n" + if output != expected { + t.Errorf("unexpected output, got: %q, want %q", output, expected) + } + + fc, err := os.ReadFile(tempfilePath) + if err != nil { + t.Fatal(err) + } + if string(fc) != "" { + t.Errorf("unexpected file content, got: %q", string(fc)) + } + + // TODO: check the registers aren't leaked? +} diff --git a/src/runtime/security_unix.go b/src/runtime/security_unix.go new file mode 100644 index 0000000..16fc87e --- /dev/null +++ b/src/runtime/security_unix.go @@ -0,0 +1,72 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime + +func secure() { + initSecureMode() + + if !isSecureMode() { + return + } + + // When secure mode is enabled, we do two things: + // 1. 
ensure the file descriptors 0, 1, and 2 are open, and if not open them, + // pointing at /dev/null (or fail) + // 2. enforce specific environment variable values (currently we only force + // GOTRACEBACK=none) + // + // Other packages may also disable specific functionality when secure mode + // is enabled (determined by using linkname to call isSecureMode). + // + // NOTE: we may eventually want to enforce (1) regardless of whether secure + // mode is enabled or not. + + secureFDs() + secureEnv() +} + +func secureEnv() { + var hasTraceback bool + for i := 0; i < len(envs); i++ { + if hasPrefix(envs[i], "GOTRACEBACK=") { + hasTraceback = true + envs[i] = "GOTRACEBACK=none" + } + } + if !hasTraceback { + envs = append(envs, "GOTRACEBACK=none") + } +} + +func secureFDs() { + const ( + // F_GETFD and EBADF are standard across all unixes, define + // them here rather than in each of the OS specific files + F_GETFD = 0x01 + EBADF = 0x09 + ) + + devNull := []byte("/dev/null\x00") + for i := 0; i < 3; i++ { + ret, errno := fcntl(int32(i), F_GETFD, 0) + if ret >= 0 { + continue + } + if errno != EBADF { + print("runtime: unexpected error while checking standard file descriptor ", i, ", errno=", errno, "\n") + throw("cannot secure fds") + } + + if ret := open(&devNull[0], 2 /* O_RDWR */, 0); ret < 0 { + print("runtime: standard file descriptor ", i, " closed, unable to open /dev/null, errno=", errno, "\n") + throw("cannot secure fds") + } else if ret != int32(i) { + print("runtime: opened unexpected file descriptor ", ret, " when attempting to open ", i, "\n") + throw("cannot secure fds") + } + } +} diff --git a/src/runtime/select.go b/src/runtime/select.go new file mode 100644 index 0000000..1072465 --- /dev/null +++ b/src/runtime/select.go @@ -0,0 +1,632 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +// This file contains the implementation of Go select statements. + +import ( + "internal/abi" + "unsafe" +) + +const debugSelect = false + +// Select case descriptor. +// Known to compiler. +// Changes here must also be made in src/cmd/compile/internal/walk/select.go's scasetype. +type scase struct { + c *hchan // chan + elem unsafe.Pointer // data element +} + +var ( + chansendpc = abi.FuncPCABIInternal(chansend) + chanrecvpc = abi.FuncPCABIInternal(chanrecv) +) + +func selectsetpc(pc *uintptr) { + *pc = getcallerpc() +} + +func sellock(scases []scase, lockorder []uint16) { + var c *hchan + for _, o := range lockorder { + c0 := scases[o].c + if c0 != c { + c = c0 + lock(&c.lock) + } + } +} + +func selunlock(scases []scase, lockorder []uint16) { + // We must be very careful here to not touch sel after we have unlocked + // the last lock, because sel can be freed right after the last unlock. + // Consider the following situation. + // First M calls runtime·park() in runtime·selectgo() passing the sel. + // Once runtime·park() has unlocked the last lock, another M makes + // the G that calls select runnable again and schedules it for execution. + // When the G runs on another M, it locks all the locks and frees sel. + // Now if the first M touches sel, it will access freed memory. + for i := len(lockorder) - 1; i >= 0; i-- { + c := scases[lockorder[i]].c + if i > 0 && c == scases[lockorder[i-1]].c { + continue // will unlock it on the next iteration + } + unlock(&c.lock) + } +} + +func selparkcommit(gp *g, _ unsafe.Pointer) bool { + // There are unlocked sudogs that point into gp's stack. Stack + // copying must lock the channels of those sudogs. + // Set activeStackChans here instead of before we try parking + // because we could self-deadlock in stack growth on a + // channel lock. 
+ gp.activeStackChans = true + // Mark that it's safe for stack shrinking to occur now, + // because any thread acquiring this G's stack for shrinking + // is guaranteed to observe activeStackChans after this store. + gp.parkingOnChan.Store(false) + // Make sure we unlock after setting activeStackChans and + // unsetting parkingOnChan. The moment we unlock any of the + // channel locks we risk gp getting readied by a channel operation + // and so gp could continue running before everything before the + // unlock is visible (even to gp itself). + + // This must not access gp's stack (see gopark). In + // particular, it must not access the *hselect. That's okay, + // because by the time this is called, gp.waiting has all + // channels in lock order. + var lastc *hchan + for sg := gp.waiting; sg != nil; sg = sg.waitlink { + if sg.c != lastc && lastc != nil { + // As soon as we unlock the channel, fields in + // any sudog with that channel may change, + // including c and waitlink. Since multiple + // sudogs may have the same channel, we unlock + // only after we've passed the last instance + // of a channel. + unlock(&lastc.lock) + } + lastc = sg.c + } + if lastc != nil { + unlock(&lastc.lock) + } + return true +} + +func block() { + gopark(nil, nil, waitReasonSelectNoCases, traceEvGoStop, 1) // forever +} + +// selectgo implements the select statement. +// +// cas0 points to an array of type [ncases]scase, and order0 points to +// an array of type [2*ncases]uint16 where ncases must be <= 65536. +// Both reside on the goroutine's stack (regardless of any escaping in +// selectgo). +// +// For race detector builds, pc0 points to an array of type +// [ncases]uintptr (also on the stack); for other builds, it's set to +// nil. +// +// selectgo returns the index of the chosen scase, which matches the +// ordinal position of its respective select{recv,send,default} call. +// Also, if the chosen scase was a receive operation, it reports whether +// a value was received. 
+func selectgo(cas0 *scase, order0 *uint16, pc0 *uintptr, nsends, nrecvs int, block bool) (int, bool) { + if debugSelect { + print("select: cas0=", cas0, "\n") + } + + // NOTE: In order to maintain a lean stack size, the number of scases + // is capped at 65536. + cas1 := (*[1 << 16]scase)(unsafe.Pointer(cas0)) + order1 := (*[1 << 17]uint16)(unsafe.Pointer(order0)) + + ncases := nsends + nrecvs + scases := cas1[:ncases:ncases] + pollorder := order1[:ncases:ncases] + lockorder := order1[ncases:][:ncases:ncases] + // NOTE: pollorder/lockorder's underlying array was not zero-initialized by compiler. + + // Even when raceenabled is true, there might be select + // statements in packages compiled without -race (e.g., + // ensureSigM in runtime/signal_unix.go). + var pcs []uintptr + if raceenabled && pc0 != nil { + pc1 := (*[1 << 16]uintptr)(unsafe.Pointer(pc0)) + pcs = pc1[:ncases:ncases] + } + casePC := func(casi int) uintptr { + if pcs == nil { + return 0 + } + return pcs[casi] + } + + var t0 int64 + if blockprofilerate > 0 { + t0 = cputicks() + } + + // The compiler rewrites selects that statically have + // only 0 or 1 cases plus default into simpler constructs. + // The only way we can end up with such small sel.ncase + // values here is for a larger select in which most channels + // have been nilled out. The general code handles those + // cases correctly, and they are rare enough not to bother + // optimizing (and needing to test). + + // generate permuted order + norder := 0 + for i := range scases { + cas := &scases[i] + + // Omit cases without channels from the poll and lock orders. + if cas.c == nil { + cas.elem = nil // allow GC + continue + } + + j := fastrandn(uint32(norder + 1)) + pollorder[norder] = pollorder[j] + pollorder[j] = uint16(i) + norder++ + } + pollorder = pollorder[:norder] + lockorder = lockorder[:norder] + + // sort the cases by Hchan address to get the locking order. 
+ // simple heap sort, to guarantee n log n time and constant stack footprint. + for i := range lockorder { + j := i + // Start with the pollorder to permute cases on the same channel. + c := scases[pollorder[i]].c + for j > 0 && scases[lockorder[(j-1)/2]].c.sortkey() < c.sortkey() { + k := (j - 1) / 2 + lockorder[j] = lockorder[k] + j = k + } + lockorder[j] = pollorder[i] + } + for i := len(lockorder) - 1; i >= 0; i-- { + o := lockorder[i] + c := scases[o].c + lockorder[i] = lockorder[0] + j := 0 + for { + k := j*2 + 1 + if k >= i { + break + } + if k+1 < i && scases[lockorder[k]].c.sortkey() < scases[lockorder[k+1]].c.sortkey() { + k++ + } + if c.sortkey() < scases[lockorder[k]].c.sortkey() { + lockorder[j] = lockorder[k] + j = k + continue + } + break + } + lockorder[j] = o + } + + if debugSelect { + for i := 0; i+1 < len(lockorder); i++ { + if scases[lockorder[i]].c.sortkey() > scases[lockorder[i+1]].c.sortkey() { + print("i=", i, " x=", lockorder[i], " y=", lockorder[i+1], "\n") + throw("select: broken sort") + } + } + } + + // lock all the channels involved in the select + sellock(scases, lockorder) + + var ( + gp *g + sg *sudog + c *hchan + k *scase + sglist *sudog + sgnext *sudog + qp unsafe.Pointer + nextp **sudog + ) + + // pass 1 - look for something already waiting + var casi int + var cas *scase + var caseSuccess bool + var caseReleaseTime int64 = -1 + var recvOK bool + for _, casei := range pollorder { + casi = int(casei) + cas = &scases[casi] + c = cas.c + + if casi >= nsends { + sg = c.sendq.dequeue() + if sg != nil { + goto recv + } + if c.qcount > 0 { + goto bufrecv + } + if c.closed != 0 { + goto rclose + } + } else { + if raceenabled { + racereadpc(c.raceaddr(), casePC(casi), chansendpc) + } + if c.closed != 0 { + goto sclose + } + sg = c.recvq.dequeue() + if sg != nil { + goto send + } + if c.qcount < c.dataqsiz { + goto bufsend + } + } + } + + if !block { + selunlock(scases, lockorder) + casi = -1 + goto retc + } + + // pass 2 - enqueue on all 
chans + gp = getg() + if gp.waiting != nil { + throw("gp.waiting != nil") + } + nextp = &gp.waiting + for _, casei := range lockorder { + casi = int(casei) + cas = &scases[casi] + c = cas.c + sg := acquireSudog() + sg.g = gp + sg.isSelect = true + // No stack splits between assigning elem and enqueuing + // sg on gp.waiting where copystack can find it. + sg.elem = cas.elem + sg.releasetime = 0 + if t0 != 0 { + sg.releasetime = -1 + } + sg.c = c + // Construct waiting list in lock order. + *nextp = sg + nextp = &sg.waitlink + + if casi < nsends { + c.sendq.enqueue(sg) + } else { + c.recvq.enqueue(sg) + } + } + + // wait for someone to wake us up + gp.param = nil + // Signal to anyone trying to shrink our stack that we're about + // to park on a channel. The window between when this G's status + // changes and when we set gp.activeStackChans is not safe for + // stack shrinking. + gp.parkingOnChan.Store(true) + gopark(selparkcommit, nil, waitReasonSelect, traceEvGoBlockSelect, 1) + gp.activeStackChans = false + + sellock(scases, lockorder) + + gp.selectDone.Store(0) + sg = (*sudog)(gp.param) + gp.param = nil + + // pass 3 - dequeue from unsuccessful chans + // otherwise they stack up on quiet channels + // record the successful case, if any. + // We singly-linked up the SudoGs in lock order. + casi = -1 + cas = nil + caseSuccess = false + sglist = gp.waiting + // Clear all elem before unlinking from gp.waiting. + for sg1 := gp.waiting; sg1 != nil; sg1 = sg1.waitlink { + sg1.isSelect = false + sg1.elem = nil + sg1.c = nil + } + gp.waiting = nil + + for _, casei := range lockorder { + k = &scases[casei] + if sg == sglist { + // sg has already been dequeued by the G that woke us up. 
+ casi = int(casei) + cas = k + caseSuccess = sglist.success + if sglist.releasetime > 0 { + caseReleaseTime = sglist.releasetime + } + } else { + c = k.c + if int(casei) < nsends { + c.sendq.dequeueSudoG(sglist) + } else { + c.recvq.dequeueSudoG(sglist) + } + } + sgnext = sglist.waitlink + sglist.waitlink = nil + releaseSudog(sglist) + sglist = sgnext + } + + if cas == nil { + throw("selectgo: bad wakeup") + } + + c = cas.c + + if debugSelect { + print("wait-return: cas0=", cas0, " c=", c, " cas=", cas, " send=", casi < nsends, "\n") + } + + if casi < nsends { + if !caseSuccess { + goto sclose + } + } else { + recvOK = caseSuccess + } + + if raceenabled { + if casi < nsends { + raceReadObjectPC(c.elemtype, cas.elem, casePC(casi), chansendpc) + } else if cas.elem != nil { + raceWriteObjectPC(c.elemtype, cas.elem, casePC(casi), chanrecvpc) + } + } + if msanenabled { + if casi < nsends { + msanread(cas.elem, c.elemtype.size) + } else if cas.elem != nil { + msanwrite(cas.elem, c.elemtype.size) + } + } + if asanenabled { + if casi < nsends { + asanread(cas.elem, c.elemtype.size) + } else if cas.elem != nil { + asanwrite(cas.elem, c.elemtype.size) + } + } + + selunlock(scases, lockorder) + goto retc + +bufrecv: + // can receive from buffer + if raceenabled { + if cas.elem != nil { + raceWriteObjectPC(c.elemtype, cas.elem, casePC(casi), chanrecvpc) + } + racenotify(c, c.recvx, nil) + } + if msanenabled && cas.elem != nil { + msanwrite(cas.elem, c.elemtype.size) + } + if asanenabled && cas.elem != nil { + asanwrite(cas.elem, c.elemtype.size) + } + recvOK = true + qp = chanbuf(c, c.recvx) + if cas.elem != nil { + typedmemmove(c.elemtype, cas.elem, qp) + } + typedmemclr(c.elemtype, qp) + c.recvx++ + if c.recvx == c.dataqsiz { + c.recvx = 0 + } + c.qcount-- + selunlock(scases, lockorder) + goto retc + +bufsend: + // can send to buffer + if raceenabled { + racenotify(c, c.sendx, nil) + raceReadObjectPC(c.elemtype, cas.elem, casePC(casi), chansendpc) + } + if msanenabled { + 
msanread(cas.elem, c.elemtype.size) + } + if asanenabled { + asanread(cas.elem, c.elemtype.size) + } + typedmemmove(c.elemtype, chanbuf(c, c.sendx), cas.elem) + c.sendx++ + if c.sendx == c.dataqsiz { + c.sendx = 0 + } + c.qcount++ + selunlock(scases, lockorder) + goto retc + +recv: + // can receive from sleeping sender (sg) + recv(c, sg, cas.elem, func() { selunlock(scases, lockorder) }, 2) + if debugSelect { + print("syncrecv: cas0=", cas0, " c=", c, "\n") + } + recvOK = true + goto retc + +rclose: + // read at end of closed channel + selunlock(scases, lockorder) + recvOK = false + if cas.elem != nil { + typedmemclr(c.elemtype, cas.elem) + } + if raceenabled { + raceacquire(c.raceaddr()) + } + goto retc + +send: + // can send to a sleeping receiver (sg) + if raceenabled { + raceReadObjectPC(c.elemtype, cas.elem, casePC(casi), chansendpc) + } + if msanenabled { + msanread(cas.elem, c.elemtype.size) + } + if asanenabled { + asanread(cas.elem, c.elemtype.size) + } + send(c, sg, cas.elem, func() { selunlock(scases, lockorder) }, 2) + if debugSelect { + print("syncsend: cas0=", cas0, " c=", c, "\n") + } + goto retc + +retc: + if caseReleaseTime > 0 { + blockevent(caseReleaseTime-t0, 1) + } + return casi, recvOK + +sclose: + // send on closed channel + selunlock(scases, lockorder) + panic(plainError("send on closed channel")) +} + +func (c *hchan) sortkey() uintptr { + return uintptr(unsafe.Pointer(c)) +} + +// A runtimeSelect is a single case passed to rselect. +// This must match ../reflect/value.go:/runtimeSelect +type runtimeSelect struct { + dir selectDir + typ unsafe.Pointer // channel type (not used here) + ch *hchan // channel + val unsafe.Pointer // ptr to data (SendDir) or ptr to receive buffer (RecvDir) +} + +// These values must match ../reflect/value.go:/SelectDir. 
+type selectDir int + +const ( + _ selectDir = iota + selectSend // case Chan <- Send + selectRecv // case <-Chan: + selectDefault // default +) + +//go:linkname reflect_rselect reflect.rselect +func reflect_rselect(cases []runtimeSelect) (int, bool) { + if len(cases) == 0 { + block() + } + sel := make([]scase, len(cases)) + orig := make([]int, len(cases)) + nsends, nrecvs := 0, 0 + dflt := -1 + for i, rc := range cases { + var j int + switch rc.dir { + case selectDefault: + dflt = i + continue + case selectSend: + j = nsends + nsends++ + case selectRecv: + nrecvs++ + j = len(cases) - nrecvs + } + + sel[j] = scase{c: rc.ch, elem: rc.val} + orig[j] = i + } + + // Only a default case. + if nsends+nrecvs == 0 { + return dflt, false + } + + // Compact sel and orig if necessary. + if nsends+nrecvs < len(cases) { + copy(sel[nsends:], sel[len(cases)-nrecvs:]) + copy(orig[nsends:], orig[len(cases)-nrecvs:]) + } + + order := make([]uint16, 2*(nsends+nrecvs)) + var pc0 *uintptr + if raceenabled { + pcs := make([]uintptr, nsends+nrecvs) + for i := range pcs { + selectsetpc(&pcs[i]) + } + pc0 = &pcs[0] + } + + chosen, recvOK := selectgo(&sel[0], &order[0], pc0, nsends, nrecvs, dflt == -1) + + // Translate chosen back to caller's ordering. + if chosen < 0 { + chosen = dflt + } else { + chosen = orig[chosen] + } + return chosen, recvOK +} + +func (q *waitq) dequeueSudoG(sgp *sudog) { + x := sgp.prev + y := sgp.next + if x != nil { + if y != nil { + // middle of queue + x.next = y + y.prev = x + sgp.next = nil + sgp.prev = nil + return + } + // end of queue + x.next = nil + q.last = x + sgp.prev = nil + return + } + if y != nil { + // start of queue + y.prev = nil + q.first = y + sgp.next = nil + return + } + + // x==y==nil. Either sgp is the only element in the queue, + // or it has already been removed. Use q.first to disambiguate. 
+ if q.first == sgp { + q.first = nil + q.last = nil + } +} diff --git a/src/runtime/sema.go b/src/runtime/sema.go new file mode 100644 index 0000000..bc23a85 --- /dev/null +++ b/src/runtime/sema.go @@ -0,0 +1,633 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Semaphore implementation exposed to Go. +// Intended use is provide a sleep and wakeup +// primitive that can be used in the contended case +// of other synchronization primitives. +// Thus it targets the same goal as Linux's futex, +// but it has much simpler semantics. +// +// That is, don't think of these as semaphores. +// Think of them as a way to implement sleep and wakeup +// such that every sleep is paired with a single wakeup, +// even if, due to races, the wakeup happens before the sleep. +// +// See Mullender and Cox, ``Semaphores in Plan 9,'' +// https://swtch.com/semaphore.pdf + +package runtime + +import ( + "internal/cpu" + "runtime/internal/atomic" + "unsafe" +) + +// Asynchronous semaphore for sync.Mutex. + +// A semaRoot holds a balanced tree of sudog with distinct addresses (s.elem). +// Each of those sudog may in turn point (through s.waitlink) to a list +// of other sudogs waiting on the same address. +// The operations on the inner lists of sudogs with the same address +// are all O(1). The scanning of the top-level semaRoot list is O(log n), +// where n is the number of distinct addresses with goroutines blocked +// on them that hash to the given semaRoot. +// See golang.org/issue/17953 for a program that worked badly +// before we introduced the second level of list, and +// BenchmarkSemTable/OneAddrCollision/* for a benchmark that exercises this. +type semaRoot struct { + lock mutex + treap *sudog // root of balanced tree of unique waiters. + nwait atomic.Uint32 // Number of waiters. Read w/o the lock. 
+} + +var semtable semTable + +// Prime to not correlate with any user patterns. +const semTabSize = 251 + +type semTable [semTabSize]struct { + root semaRoot + pad [cpu.CacheLinePadSize - unsafe.Sizeof(semaRoot{})]byte +} + +func (t *semTable) rootFor(addr *uint32) *semaRoot { + return &t[(uintptr(unsafe.Pointer(addr))>>3)%semTabSize].root +} + +//go:linkname sync_runtime_Semacquire sync.runtime_Semacquire +func sync_runtime_Semacquire(addr *uint32) { + semacquire1(addr, false, semaBlockProfile, 0, waitReasonSemacquire) +} + +//go:linkname poll_runtime_Semacquire internal/poll.runtime_Semacquire +func poll_runtime_Semacquire(addr *uint32) { + semacquire1(addr, false, semaBlockProfile, 0, waitReasonSemacquire) +} + +//go:linkname sync_runtime_Semrelease sync.runtime_Semrelease +func sync_runtime_Semrelease(addr *uint32, handoff bool, skipframes int) { + semrelease1(addr, handoff, skipframes) +} + +//go:linkname sync_runtime_SemacquireMutex sync.runtime_SemacquireMutex +func sync_runtime_SemacquireMutex(addr *uint32, lifo bool, skipframes int) { + semacquire1(addr, lifo, semaBlockProfile|semaMutexProfile, skipframes, waitReasonSyncMutexLock) +} + +//go:linkname sync_runtime_SemacquireRWMutexR sync.runtime_SemacquireRWMutexR +func sync_runtime_SemacquireRWMutexR(addr *uint32, lifo bool, skipframes int) { + semacquire1(addr, lifo, semaBlockProfile|semaMutexProfile, skipframes, waitReasonSyncRWMutexRLock) +} + +//go:linkname sync_runtime_SemacquireRWMutex sync.runtime_SemacquireRWMutex +func sync_runtime_SemacquireRWMutex(addr *uint32, lifo bool, skipframes int) { + semacquire1(addr, lifo, semaBlockProfile|semaMutexProfile, skipframes, waitReasonSyncRWMutexLock) +} + +//go:linkname poll_runtime_Semrelease internal/poll.runtime_Semrelease +func poll_runtime_Semrelease(addr *uint32) { + semrelease(addr) +} + +func readyWithTime(s *sudog, traceskip int) { + if s.releasetime != 0 { + s.releasetime = cputicks() + } + goready(s.g, traceskip) +} + +type semaProfileFlags int + 
+const ( + semaBlockProfile semaProfileFlags = 1 << iota + semaMutexProfile +) + +// Called from runtime. +func semacquire(addr *uint32) { + semacquire1(addr, false, 0, 0, waitReasonSemacquire) +} + +func semacquire1(addr *uint32, lifo bool, profile semaProfileFlags, skipframes int, reason waitReason) { + gp := getg() + if gp != gp.m.curg { + throw("semacquire not on the G stack") + } + + // Easy case. + if cansemacquire(addr) { + return + } + + // Harder case: + // increment waiter count + // try cansemacquire one more time, return if succeeded + // enqueue itself as a waiter + // sleep + // (waiter descriptor is dequeued by signaler) + s := acquireSudog() + root := semtable.rootFor(addr) + t0 := int64(0) + s.releasetime = 0 + s.acquiretime = 0 + s.ticket = 0 + if profile&semaBlockProfile != 0 && blockprofilerate > 0 { + t0 = cputicks() + s.releasetime = -1 + } + if profile&semaMutexProfile != 0 && mutexprofilerate > 0 { + if t0 == 0 { + t0 = cputicks() + } + s.acquiretime = t0 + } + for { + lockWithRank(&root.lock, lockRankRoot) + // Add ourselves to nwait to disable "easy case" in semrelease. + root.nwait.Add(1) + // Check cansemacquire to avoid missed wakeup. + if cansemacquire(addr) { + root.nwait.Add(-1) + unlock(&root.lock) + break + } + // Any semrelease after the cansemacquire knows we're waiting + // (we set nwait above), so go to sleep. + root.queue(addr, s, lifo) + goparkunlock(&root.lock, reason, traceEvGoBlockSync, 4+skipframes) + if s.ticket != 0 || cansemacquire(addr) { + break + } + } + if s.releasetime > 0 { + blockevent(s.releasetime-t0, 3+skipframes) + } + releaseSudog(s) +} + +func semrelease(addr *uint32) { + semrelease1(addr, false, 0) +} + +func semrelease1(addr *uint32, handoff bool, skipframes int) { + root := semtable.rootFor(addr) + atomic.Xadd(addr, 1) + + // Easy case: no waiters? + // This check must happen after the xadd, to avoid a missed wakeup + // (see loop in semacquire). 
+ if root.nwait.Load() == 0 { + return + } + + // Harder case: search for a waiter and wake it. + lockWithRank(&root.lock, lockRankRoot) + if root.nwait.Load() == 0 { + // The count is already consumed by another goroutine, + // so no need to wake up another goroutine. + unlock(&root.lock) + return + } + s, t0 := root.dequeue(addr) + if s != nil { + root.nwait.Add(-1) + } + unlock(&root.lock) + if s != nil { // May be slow or even yield, so unlock first + acquiretime := s.acquiretime + if acquiretime != 0 { + mutexevent(t0-acquiretime, 3+skipframes) + } + if s.ticket != 0 { + throw("corrupted semaphore ticket") + } + if handoff && cansemacquire(addr) { + s.ticket = 1 + } + readyWithTime(s, 5+skipframes) + if s.ticket == 1 && getg().m.locks == 0 { + // Direct G handoff + // readyWithTime has added the waiter G as runnext in the + // current P; we now call the scheduler so that we start running + // the waiter G immediately. + // Note that waiter inherits our time slice: this is desirable + // to avoid having a highly contended semaphore hog the P + // indefinitely. goyield is like Gosched, but it emits a + // "preempted" trace event instead and, more importantly, puts + // the current G on the local runq instead of the global one. + // We only do this in the starving regime (handoff=true), as in + // the non-starving case it is possible for a different waiter + // to acquire the semaphore while we are yielding/scheduling, + // and this would be wasteful. We wait instead to enter starving + // regime, and then we start to do direct handoffs of ticket and + // P. + // See issue 33747 for discussion. + goyield() + } + } +} + +func cansemacquire(addr *uint32) bool { + for { + v := atomic.Load(addr) + if v == 0 { + return false + } + if atomic.Cas(addr, v, v-1) { + return true + } + } +} + +// queue adds s to the blocked goroutines in semaRoot. 
+func (root *semaRoot) queue(addr *uint32, s *sudog, lifo bool) { + s.g = getg() + s.elem = unsafe.Pointer(addr) + s.next = nil + s.prev = nil + + var last *sudog + pt := &root.treap + for t := *pt; t != nil; t = *pt { + if t.elem == unsafe.Pointer(addr) { + // Already have addr in list. + if lifo { + // Substitute s in t's place in treap. + *pt = s + s.ticket = t.ticket + s.acquiretime = t.acquiretime + s.parent = t.parent + s.prev = t.prev + s.next = t.next + if s.prev != nil { + s.prev.parent = s + } + if s.next != nil { + s.next.parent = s + } + // Add t first in s's wait list. + s.waitlink = t + s.waittail = t.waittail + if s.waittail == nil { + s.waittail = t + } + t.parent = nil + t.prev = nil + t.next = nil + t.waittail = nil + } else { + // Add s to end of t's wait list. + if t.waittail == nil { + t.waitlink = s + } else { + t.waittail.waitlink = s + } + t.waittail = s + s.waitlink = nil + } + return + } + last = t + if uintptr(unsafe.Pointer(addr)) < uintptr(t.elem) { + pt = &t.prev + } else { + pt = &t.next + } + } + + // Add s as new leaf in tree of unique addrs. + // The balanced tree is a treap using ticket as the random heap priority. + // That is, it is a binary tree ordered according to the elem addresses, + // but then among the space of possible binary trees respecting those + // addresses, it is kept balanced on average by maintaining a heap ordering + // on the ticket: s.ticket <= both s.prev.ticket and s.next.ticket. + // https://en.wikipedia.org/wiki/Treap + // https://faculty.washington.edu/aragon/pubs/rst89.pdf + // + // s.ticket compared with zero in couple of places, therefore set lowest bit. + // It will not affect treap's quality noticeably. + s.ticket = fastrand() | 1 + s.parent = last + *pt = s + + // Rotate up into tree according to ticket (priority). 
+ for s.parent != nil && s.parent.ticket > s.ticket { + if s.parent.prev == s { + root.rotateRight(s.parent) + } else { + if s.parent.next != s { + panic("semaRoot queue") + } + root.rotateLeft(s.parent) + } + } +} + +// dequeue searches for and finds the first goroutine +// in semaRoot blocked on addr. +// If the sudog was being profiled, dequeue returns the time +// at which it was woken up as now. Otherwise now is 0. +func (root *semaRoot) dequeue(addr *uint32) (found *sudog, now int64) { + ps := &root.treap + s := *ps + for ; s != nil; s = *ps { + if s.elem == unsafe.Pointer(addr) { + goto Found + } + if uintptr(unsafe.Pointer(addr)) < uintptr(s.elem) { + ps = &s.prev + } else { + ps = &s.next + } + } + return nil, 0 + +Found: + now = int64(0) + if s.acquiretime != 0 { + now = cputicks() + } + if t := s.waitlink; t != nil { + // Substitute t, also waiting on addr, for s in root tree of unique addrs. + *ps = t + t.ticket = s.ticket + t.parent = s.parent + t.prev = s.prev + if t.prev != nil { + t.prev.parent = t + } + t.next = s.next + if t.next != nil { + t.next.parent = t + } + if t.waitlink != nil { + t.waittail = s.waittail + } else { + t.waittail = nil + } + t.acquiretime = now + s.waitlink = nil + s.waittail = nil + } else { + // Rotate s down to be leaf of tree for removal, respecting priorities. + for s.next != nil || s.prev != nil { + if s.next == nil || s.prev != nil && s.prev.ticket < s.next.ticket { + root.rotateRight(s) + } else { + root.rotateLeft(s) + } + } + // Remove s, now a leaf. + if s.parent != nil { + if s.parent.prev == s { + s.parent.prev = nil + } else { + s.parent.next = nil + } + } else { + root.treap = nil + } + } + s.parent = nil + s.elem = nil + s.next = nil + s.prev = nil + s.ticket = 0 + return s, now +} + +// rotateLeft rotates the tree rooted at node x. +// turning (x a (y b c)) into (y (x a b) c). 
+func (root *semaRoot) rotateLeft(x *sudog) { + // p -> (x a (y b c)) + p := x.parent + y := x.next + b := y.prev + + y.prev = x + x.parent = y + x.next = b + if b != nil { + b.parent = x + } + + y.parent = p + if p == nil { + root.treap = y + } else if p.prev == x { + p.prev = y + } else { + if p.next != x { + throw("semaRoot rotateLeft") + } + p.next = y + } +} + +// rotateRight rotates the tree rooted at node y. +// turning (y (x a b) c) into (x a (y b c)). +func (root *semaRoot) rotateRight(y *sudog) { + // p -> (y (x a b) c) + p := y.parent + x := y.prev + b := x.next + + x.next = y + y.parent = x + y.prev = b + if b != nil { + b.parent = y + } + + x.parent = p + if p == nil { + root.treap = x + } else if p.prev == y { + p.prev = x + } else { + if p.next != y { + throw("semaRoot rotateRight") + } + p.next = x + } +} + +// notifyList is a ticket-based notification list used to implement sync.Cond. +// +// It must be kept in sync with the sync package. +type notifyList struct { + // wait is the ticket number of the next waiter. It is atomically + // incremented outside the lock. + wait atomic.Uint32 + + // notify is the ticket number of the next waiter to be notified. It can + // be read outside the lock, but is only written to with lock held. + // + // Both wait & notify can wrap around, and such cases will be correctly + // handled as long as their "unwrapped" difference is bounded by 2^31. + // For this not to be the case, we'd need to have 2^31+ goroutines + // blocked on the same condvar, which is currently not possible. + notify uint32 + + // List of parked waiters. + lock mutex + head *sudog + tail *sudog +} + +// less checks if a < b, considering a & b running counts that may overflow the +// 32-bit range, and that their "unwrapped" difference is always less than 2^31. +func less(a, b uint32) bool { + return int32(a-b) < 0 +} + +// notifyListAdd adds the caller to a notify list such that it can receive +// notifications. 
The caller must eventually call notifyListWait to wait for +// such a notification, passing the returned ticket number. +// +//go:linkname notifyListAdd sync.runtime_notifyListAdd +func notifyListAdd(l *notifyList) uint32 { + // This may be called concurrently, for example, when called from + // sync.Cond.Wait while holding a RWMutex in read mode. + return l.wait.Add(1) - 1 +} + +// notifyListWait waits for a notification. If one has been sent since +// notifyListAdd was called, it returns immediately. Otherwise, it blocks. +// +//go:linkname notifyListWait sync.runtime_notifyListWait +func notifyListWait(l *notifyList, t uint32) { + lockWithRank(&l.lock, lockRankNotifyList) + + // Return right away if this ticket has already been notified. + if less(t, l.notify) { + unlock(&l.lock) + return + } + + // Enqueue itself. + s := acquireSudog() + s.g = getg() + s.ticket = t + s.releasetime = 0 + t0 := int64(0) + if blockprofilerate > 0 { + t0 = cputicks() + s.releasetime = -1 + } + if l.tail == nil { + l.head = s + } else { + l.tail.next = s + } + l.tail = s + goparkunlock(&l.lock, waitReasonSyncCondWait, traceEvGoBlockCond, 3) + if t0 != 0 { + blockevent(s.releasetime-t0, 2) + } + releaseSudog(s) +} + +// notifyListNotifyAll notifies all entries in the list. +// +//go:linkname notifyListNotifyAll sync.runtime_notifyListNotifyAll +func notifyListNotifyAll(l *notifyList) { + // Fast-path: if there are no new waiters since the last notification + // we don't need to acquire the lock. + if l.wait.Load() == atomic.Load(&l.notify) { + return + } + + // Pull the list out into a local variable, waiters will be readied + // outside the lock. + lockWithRank(&l.lock, lockRankNotifyList) + s := l.head + l.head = nil + l.tail = nil + + // Update the next ticket to be notified. We can set it to the current + // value of wait because any previous waiters are already in the list + // or will notice that they have already been notified when trying to + // add themselves to the list. 
+ atomic.Store(&l.notify, l.wait.Load()) + unlock(&l.lock) + + // Go through the local list and ready all waiters. + for s != nil { + next := s.next + s.next = nil + readyWithTime(s, 4) + s = next + } +} + +// notifyListNotifyOne notifies one entry in the list. +// +//go:linkname notifyListNotifyOne sync.runtime_notifyListNotifyOne +func notifyListNotifyOne(l *notifyList) { + // Fast-path: if there are no new waiters since the last notification + // we don't need to acquire the lock at all. + if l.wait.Load() == atomic.Load(&l.notify) { + return + } + + lockWithRank(&l.lock, lockRankNotifyList) + + // Re-check under the lock if we need to do anything. + t := l.notify + if t == l.wait.Load() { + unlock(&l.lock) + return + } + + // Update the next notify ticket number. + atomic.Store(&l.notify, t+1) + + // Try to find the g that needs to be notified. + // If it hasn't made it to the list yet we won't find it, + // but it won't park itself once it sees the new notify number. + // + // This scan looks linear but essentially always stops quickly. + // Because g's queue separately from taking numbers, + // there may be minor reorderings in the list, but we + // expect the g we're looking for to be near the front. + // The g has others in front of it on the list only to the + // extent that it lost the race, so the iteration will not + // be too long. This applies even when the g is missing: + // it hasn't yet gotten to sleep and has lost the race to + // the (few) other g's that we find on the list. 
+ for p, s := (*sudog)(nil), l.head; s != nil; p, s = s, s.next { + if s.ticket == t { + n := s.next + if p != nil { + p.next = n + } else { + l.head = n + } + if n == nil { + l.tail = p + } + unlock(&l.lock) + s.next = nil + readyWithTime(s, 4) + return + } + } + unlock(&l.lock) +} + +//go:linkname notifyListCheck sync.runtime_notifyListCheck +func notifyListCheck(sz uintptr) { + if sz != unsafe.Sizeof(notifyList{}) { + print("runtime: bad notifyList size - sync=", sz, " runtime=", unsafe.Sizeof(notifyList{}), "\n") + throw("bad notifyList size") + } +} + +//go:linkname sync_nanotime sync.runtime_nanotime +func sync_nanotime() int64 { + return nanotime() +} diff --git a/src/runtime/sema_test.go b/src/runtime/sema_test.go new file mode 100644 index 0000000..9943d2e --- /dev/null +++ b/src/runtime/sema_test.go @@ -0,0 +1,170 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + . "runtime" + "sync" + "sync/atomic" + "testing" +) + +// TestSemaHandoff checks that when semrelease+handoff is +// requested, the G that releases the semaphore yields its +// P directly to the first waiter in line. +// See issue 33747 for discussion. +func TestSemaHandoff(t *testing.T) { + const iter = 10000 + ok := 0 + for i := 0; i < iter; i++ { + if testSemaHandoff() { + ok++ + } + } + // As long as two thirds of handoffs are direct, we + // consider the test successful. The scheduler is + // nondeterministic, so this test checks that we get the + // desired outcome in a significant majority of cases. + // The actual ratio of direct handoffs is much higher + // (>90%) but we use a lower threshold to minimize the + // chances that unrelated changes in the runtime will + // cause the test to fail or become flaky. 
+ if ok < iter*2/3 { + t.Fatal("direct handoff < 2/3:", ok, iter) + } +} + +func TestSemaHandoff1(t *testing.T) { + if GOMAXPROCS(-1) <= 1 { + t.Skip("GOMAXPROCS <= 1") + } + defer GOMAXPROCS(GOMAXPROCS(-1)) + GOMAXPROCS(1) + TestSemaHandoff(t) +} + +func TestSemaHandoff2(t *testing.T) { + if GOMAXPROCS(-1) <= 2 { + t.Skip("GOMAXPROCS <= 2") + } + defer GOMAXPROCS(GOMAXPROCS(-1)) + GOMAXPROCS(2) + TestSemaHandoff(t) +} + +func testSemaHandoff() bool { + var sema, res uint32 + done := make(chan struct{}) + + // We're testing that the current goroutine is able to yield its time slice + // to another goroutine. Stop the current goroutine from migrating to + // another CPU where it can win the race (and appear to have not yielded) by + // keeping the CPUs slightly busy. + var wg sync.WaitGroup + for i := 0; i < GOMAXPROCS(-1); i++ { + wg.Add(1) + go func() { + defer wg.Done() + for { + select { + case <-done: + return + default: + } + Gosched() + } + }() + } + + wg.Add(1) + go func() { + defer wg.Done() + Semacquire(&sema) + atomic.CompareAndSwapUint32(&res, 0, 1) + + Semrelease1(&sema, true, 0) + close(done) + }() + for SemNwait(&sema) == 0 { + Gosched() // wait for goroutine to block in Semacquire + } + + // The crux of the test: we release the semaphore with handoff + // and immediately perform a CAS both here and in the waiter; we + // want the CAS in the waiter to execute first. + Semrelease1(&sema, true, 0) + atomic.CompareAndSwapUint32(&res, 0, 2) + + wg.Wait() // wait for goroutines to finish to avoid data races + + return res == 1 // did the waiter run first? +} + +func BenchmarkSemTable(b *testing.B) { + for _, n := range []int{1000, 2000, 4000, 8000} { + b.Run(fmt.Sprintf("OneAddrCollision/n=%d", n), func(b *testing.B) { + tab := Escape(new(SemTable)) + u := make([]uint32, SemTableSize+1) + + b.ResetTimer() + + for j := 0; j < b.N; j++ { + // Simulate two locks colliding on the same semaRoot. 
+ // + // Specifically enqueue all the waiters for the first lock, + // then all the waiters for the second lock. + // + // Then, dequeue all the waiters from the first lock, then + // the second. + // + // Each enqueue/dequeue operation should be O(1), because + // there are exactly 2 locks. This could be O(n) if all + // the waiters for both locks are on the same list, as it + // once was. + for i := 0; i < n; i++ { + if i < n/2 { + tab.Enqueue(&u[0]) + } else { + tab.Enqueue(&u[SemTableSize]) + } + } + for i := 0; i < n; i++ { + var ok bool + if i < n/2 { + ok = tab.Dequeue(&u[0]) + } else { + ok = tab.Dequeue(&u[SemTableSize]) + } + if !ok { + b.Fatal("failed to dequeue") + } + } + } + }) + b.Run(fmt.Sprintf("ManyAddrCollision/n=%d", n), func(b *testing.B) { + tab := Escape(new(SemTable)) + u := make([]uint32, n*SemTableSize) + + b.ResetTimer() + + for j := 0; j < b.N; j++ { + // Simulate n locks colliding on the same semaRoot. + // + // Each enqueue/dequeue operation should be O(log n), because + // each semaRoot is a tree. This could be O(n) if it was + // some simpler data structure. + for i := 0; i < n; i++ { + tab.Enqueue(&u[i*SemTableSize]) + } + for i := 0; i < n; i++ { + if !tab.Dequeue(&u[i*SemTableSize]) { + b.Fatal("failed to dequeue") + } + } + } + }) + } +} diff --git a/src/runtime/semasleep_test.go b/src/runtime/semasleep_test.go new file mode 100644 index 0000000..7262853 --- /dev/null +++ b/src/runtime/semasleep_test.go @@ -0,0 +1,121 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows && !js + +package runtime_test + +import ( + "io" + "os/exec" + "syscall" + "testing" + "time" +) + +// Issue #27520. Spurious wakeups to pthread_cond_timedwait_relative_np +// shouldn't cause semasleep to retry with the same timeout which would +// cause indefinite spinning.
+func TestSpuriousWakeupsNeverHangSemasleep(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + t.Parallel() // Waits for a program to sleep for 1s. + + exe, err := buildTestProg(t, "testprog") + if err != nil { + t.Fatal(err) + } + + cmd := exec.Command(exe, "After1") + stdout, err := cmd.StdoutPipe() + if err != nil { + t.Fatalf("StdoutPipe: %v", err) + } + beforeStart := time.Now() + if err := cmd.Start(); err != nil { + t.Fatalf("Failed to start command: %v", err) + } + + waiting := false + doneCh := make(chan error, 1) + t.Cleanup(func() { + cmd.Process.Kill() + if waiting { + <-doneCh + } else { + cmd.Wait() + } + }) + + // Wait for After1 to close its stdout so that we know the runtime's SIGIO + // handler is registered. + b, err := io.ReadAll(stdout) + if len(b) > 0 { + t.Logf("read from testprog stdout: %s", b) + } + if err != nil { + t.Fatalf("error reading from testprog: %v", err) + } + + // Wait for child exit. + // + // Note that we must do this after waiting for the write/child end of + // stdout to close. Wait closes the read/parent end of stdout, so + // starting this goroutine prior to io.ReadAll introduces a race + // condition where ReadAll may get fs.ErrClosed if the child exits too + // quickly. + waiting = true + go func() { + doneCh <- cmd.Wait() + close(doneCh) + }() + + // Wait for an arbitrary timeout longer than one second. The subprocess itself + // attempts to sleep for one second, but if the machine running the test is + // heavily loaded that subprocess may not schedule very quickly even if the + // bug remains fixed. (This is fine, because if the bug really is unfixed we + // can keep the process hung indefinitely, as long as we signal it often + // enough.) + timeout := 10 * time.Second + + // The subprocess begins sleeping for 1s after it writes to stdout, so measure + // the timeout from here (not from when we started creating the process). + // That should reduce noise from process startup overhead. 
+ ready := time.Now() + + // With the repro running, we can continuously send to it + // a signal that the runtime considers non-terminal, + // such as SIGIO, to spuriously wake up + // pthread_cond_timedwait_relative_np. + ticker := time.NewTicker(200 * time.Millisecond) + defer ticker.Stop() + for { + select { + case now := <-ticker.C: + if now.Sub(ready) > timeout { + t.Error("Program failed to return on time and has to be killed, issue #27520 still exists") + // Send SIGQUIT to get a goroutine dump. + // Stop sending SIGIO so that the program can clean up and actually terminate. + cmd.Process.Signal(syscall.SIGQUIT) + return + } + + // Send the pesky signal that toggles spinning + // indefinitely if #27520 is not fixed. + cmd.Process.Signal(syscall.SIGIO) + + case err := <-doneCh: + if err != nil { + t.Fatalf("The program returned but unfortunately with an error: %v", err) + } + if time.Since(beforeStart) < 1*time.Second { + // The program was supposed to sleep for a full (monotonic) second; + // it should not return before that has elapsed. + t.Fatalf("The program stopped too quickly.") + } + return + } + } +} diff --git a/src/runtime/sigaction.go b/src/runtime/sigaction.go new file mode 100644 index 0000000..05f44f6 --- /dev/null +++ b/src/runtime/sigaction.go @@ -0,0 +1,16 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && !amd64 && !arm64 && !ppc64le) || (freebsd && !amd64) + +package runtime + +// This version is used on Linux and FreeBSD systems on which we don't +// use cgo to call the C version of sigaction. 
+ +//go:nosplit +//go:nowritebarrierrec +func sigaction(sig uint32, new, old *sigactiont) { + sysSigaction(sig, new, old) +} diff --git a/src/runtime/signal_386.go b/src/runtime/signal_386.go new file mode 100644 index 0000000..aa66032 --- /dev/null +++ b/src/runtime/signal_386.go @@ -0,0 +1,59 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build dragonfly || freebsd || linux || netbsd || openbsd + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("eax ", hex(c.eax()), "\n") + print("ebx ", hex(c.ebx()), "\n") + print("ecx ", hex(c.ecx()), "\n") + print("edx ", hex(c.edx()), "\n") + print("edi ", hex(c.edi()), "\n") + print("esi ", hex(c.esi()), "\n") + print("ebp ", hex(c.ebp()), "\n") + print("esp ", hex(c.esp()), "\n") + print("eip ", hex(c.eip()), "\n") + print("eflags ", hex(c.eflags()), "\n") + print("cs ", hex(c.cs()), "\n") + print("fs ", hex(c.fs()), "\n") + print("gs ", hex(c.gs()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.eip()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.esp()) } +func (c *sigctxt) siglr() uintptr { return 0 } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + pc := uintptr(c.eip()) + sp := uintptr(c.esp()) + + if shouldPushSigpanic(gp, pc, *(*uintptr)(unsafe.Pointer(sp))) { + c.pushCall(abi.FuncPCABIInternal(sigpanic), pc) + } else { + // Not safe to push the call. Just clobber the frame. + c.set_eip(uint32(abi.FuncPCABIInternal(sigpanic))) + } +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Make it look like we called target at resumePC. 
+ sp := uintptr(c.esp()) + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = resumePC + c.set_esp(uint32(sp)) + c.set_eip(uint32(targetPC)) +} diff --git a/src/runtime/signal_aix_ppc64.go b/src/runtime/signal_aix_ppc64.go new file mode 100644 index 0000000..c6cb91a --- /dev/null +++ b/src/runtime/signal_aix_ppc64.go @@ -0,0 +1,85 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build aix + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *context64 { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint64 { return c.regs().gpr[0] } +func (c *sigctxt) r1() uint64 { return c.regs().gpr[1] } +func (c *sigctxt) r2() uint64 { return c.regs().gpr[2] } +func (c *sigctxt) r3() uint64 { return c.regs().gpr[3] } +func (c *sigctxt) r4() uint64 { return c.regs().gpr[4] } +func (c *sigctxt) r5() uint64 { return c.regs().gpr[5] } +func (c *sigctxt) r6() uint64 { return c.regs().gpr[6] } +func (c *sigctxt) r7() uint64 { return c.regs().gpr[7] } +func (c *sigctxt) r8() uint64 { return c.regs().gpr[8] } +func (c *sigctxt) r9() uint64 { return c.regs().gpr[9] } +func (c *sigctxt) r10() uint64 { return c.regs().gpr[10] } +func (c *sigctxt) r11() uint64 { return c.regs().gpr[11] } +func (c *sigctxt) r12() uint64 { return c.regs().gpr[12] } +func (c *sigctxt) r13() uint64 { return c.regs().gpr[13] } +func (c *sigctxt) r14() uint64 { return c.regs().gpr[14] } +func (c *sigctxt) r15() uint64 { return c.regs().gpr[15] } +func (c *sigctxt) r16() uint64 { return c.regs().gpr[16] } +func (c *sigctxt) r17() uint64 { return c.regs().gpr[17] } +func (c *sigctxt) r18() uint64 { return c.regs().gpr[18] } +func (c *sigctxt) r19() uint64 { return c.regs().gpr[19] } +func (c *sigctxt) r20() uint64 { return
c.regs().gpr[20] } +func (c *sigctxt) r21() uint64 { return c.regs().gpr[21] } +func (c *sigctxt) r22() uint64 { return c.regs().gpr[22] } +func (c *sigctxt) r23() uint64 { return c.regs().gpr[23] } +func (c *sigctxt) r24() uint64 { return c.regs().gpr[24] } +func (c *sigctxt) r25() uint64 { return c.regs().gpr[25] } +func (c *sigctxt) r26() uint64 { return c.regs().gpr[26] } +func (c *sigctxt) r27() uint64 { return c.regs().gpr[27] } +func (c *sigctxt) r28() uint64 { return c.regs().gpr[28] } +func (c *sigctxt) r29() uint64 { return c.regs().gpr[29] } +func (c *sigctxt) r30() uint64 { return c.regs().gpr[30] } +func (c *sigctxt) r31() uint64 { return c.regs().gpr[31] } +func (c *sigctxt) sp() uint64 { return c.regs().gpr[1] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().iar } + +func (c *sigctxt) ctr() uint64 { return c.regs().ctr } +func (c *sigctxt) link() uint64 { return c.regs().lr } +func (c *sigctxt) xer() uint32 { return c.regs().xer } +func (c *sigctxt) ccr() uint32 { return c.regs().cr } +func (c *sigctxt) fpscr() uint32 { return c.regs().fpscr } +func (c *sigctxt) fpscrx() uint32 { return c.regs().fpscrx } + +// TODO(aix): find trap equivalent +func (c *sigctxt) trap() uint32 { return 0x0 } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return uint64(c.info.si_addr) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +func (c *sigctxt) set_r0(x uint64) { c.regs().gpr[0] = x } +func (c *sigctxt) set_r12(x uint64) { c.regs().gpr[12] = x } +func (c *sigctxt) set_r30(x uint64) { c.regs().gpr[30] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().iar = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().gpr[1] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().lr = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 
2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_amd64.go b/src/runtime/signal_amd64.go new file mode 100644 index 0000000..8ade208 --- /dev/null +++ b/src/runtime/signal_amd64.go @@ -0,0 +1,87 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 && (darwin || dragonfly || freebsd || linux || netbsd || openbsd || solaris) + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("rax ", hex(c.rax()), "\n") + print("rbx ", hex(c.rbx()), "\n") + print("rcx ", hex(c.rcx()), "\n") + print("rdx ", hex(c.rdx()), "\n") + print("rdi ", hex(c.rdi()), "\n") + print("rsi ", hex(c.rsi()), "\n") + print("rbp ", hex(c.rbp()), "\n") + print("rsp ", hex(c.rsp()), "\n") + print("r8 ", hex(c.r8()), "\n") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\n") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\n") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\n") + print("r15 ", hex(c.r15()), "\n") + print("rip ", hex(c.rip()), "\n") + print("rflags ", hex(c.rflags()), "\n") + print("cs ", hex(c.cs()), "\n") + print("fs ", hex(c.fs()), "\n") + print("gs ", hex(c.gs()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.rip()) } + +func (c *sigctxt) setsigpc(x uint64) { c.set_rip(x) } +func (c *sigctxt) sigsp() uintptr { return uintptr(c.rsp()) } +func (c *sigctxt) siglr() uintptr { return 0 } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // Work around Leopard bug that doesn't set FPE_INTDIV. + // Look at instruction to see if it is a divide. + // Not necessary in Snow Leopard (si_code will be != 0). 
+ if GOOS == "darwin" && sig == _SIGFPE && gp.sigcode0 == 0 { + pc := (*[4]byte)(unsafe.Pointer(gp.sigpc)) + i := 0 + if pc[i]&0xF0 == 0x40 { // 64-bit REX prefix + i++ + } else if pc[i] == 0x66 { // 16-bit instruction prefix + i++ + } + if pc[i] == 0xF6 || pc[i] == 0xF7 { + gp.sigcode0 = _FPE_INTDIV + } + } + + pc := uintptr(c.rip()) + sp := uintptr(c.rsp()) + + // In case we are panicking from external code, we need to initialize + // Go special registers. We inject sigpanic0 (instead of sigpanic), + // which takes care of that. + if shouldPushSigpanic(gp, pc, *(*uintptr)(unsafe.Pointer(sp))) { + c.pushCall(abi.FuncPCABI0(sigpanic0), pc) + } else { + // Not safe to push the call. Just clobber the frame. + c.set_rip(uint64(abi.FuncPCABI0(sigpanic0))) + } +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Make it look like we called target at resumePC. + sp := uintptr(c.rsp()) + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = resumePC + c.set_rsp(uint64(sp)) + c.set_rip(uint64(targetPC)) +} diff --git a/src/runtime/signal_arm.go b/src/runtime/signal_arm.go new file mode 100644 index 0000000..fff302f --- /dev/null +++ b/src/runtime/signal_arm.go @@ -0,0 +1,81 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build dragonfly || freebsd || linux || netbsd || openbsd + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("trap ", hex(c.trap()), "\n") + print("error ", hex(c.error()), "\n") + print("oldmask ", hex(c.oldmask()), "\n") + print("r0 ", hex(c.r0()), "\n") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\n") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\n") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\n") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\n") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\n") + print("fp ", hex(c.fp()), "\n") + print("ip ", hex(c.ip()), "\n") + print("sp ", hex(c.sp()), "\n") + print("lr ", hex(c.lr()), "\n") + print("pc ", hex(c.pc()), "\n") + print("cpsr ", hex(c.cpsr()), "\n") + print("fault ", hex(c.fault()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.lr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange lr, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save LR to stack so that panics in leaf + // functions are correctly handled. This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - 4 + c.set_sp(sp) + *(*uint32)(unsafe.Pointer(uintptr(sp))) = c.lr() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.lr())) { + // Make it look like the faulting PC called sigpanic.
+ c.set_lr(uint32(pc)) + } + + // In case we are panicking from external C code + c.set_r10(uint32(uintptr(unsafe.Pointer(gp)))) + c.set_pc(uint32(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - 4 + c.set_sp(sp) + *(*uint32)(unsafe.Pointer(uintptr(sp))) = c.lr() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_lr(uint32(resumePC)) + c.set_pc(uint32(targetPC)) +} diff --git a/src/runtime/signal_arm64.go b/src/runtime/signal_arm64.go new file mode 100644 index 0000000..c8b8781 --- /dev/null +++ b/src/runtime/signal_arm64.go @@ -0,0 +1,96 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build darwin || freebsd || linux || netbsd || openbsd + +package runtime + +import ( + "internal/abi" + "runtime/internal/sys" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\n") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\n") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\n") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\n") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\n") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\n") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\n") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\n") + print("r15 ", hex(c.r15()), "\n") + print("r16 ", hex(c.r16()), "\n") + print("r17 ", hex(c.r17()), "\n") + print("r18 ", hex(c.r18()), "\n") + print("r19 ", hex(c.r19()), "\n") + print("r20 ", hex(c.r20()), "\n") + print("r21 ", hex(c.r21()), "\n") + print("r22 ", hex(c.r22()), "\n") + print("r23 ", hex(c.r23()), "\n") + print("r24 ", hex(c.r24()), "\n") + print("r25 ", hex(c.r25()), "\n") + print("r26 ", hex(c.r26()), "\n") + print("r27 ", hex(c.r27()), "\n") + print("r28 ", hex(c.r28()), "\n") + print("r29 ", hex(c.r29()), "\n") + print("lr ", hex(c.lr()), "\n") + print("sp ", hex(c.sp()), "\n") + print("pc ", hex(c.pc()), "\n") + print("fault ", hex(c.fault()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) setsigpc(x uint64) { c.set_pc(x) } +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.lr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange lr, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save LR to stack so that panics in leaf + // functions are correctly handled. 
This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - sys.StackAlign // needs only sizeof uint64, but must align the stack + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.lr() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.lr())) { + // Make it look like the faulting PC called sigpanic. + c.set_lr(uint64(pc)) + } + + // In case we are panicking from external C code + c.set_r28(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_pc(uint64(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra space is known to gentraceback. + sp := c.sp() - 16 // SP needs 16-byte alignment + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.lr() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_lr(uint64(resumePC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_darwin.go b/src/runtime/signal_darwin.go new file mode 100644 index 0000000..8090fb2 --- /dev/null +++ b/src/runtime/signal_darwin.go @@ -0,0 +1,40 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT: emulate instruction executed"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 17 */ {0, "SIGSTOP: stop"}, + /* 18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 19 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue after stop"}, + /* 20 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify + _SigIgn, "SIGIO: i/o now possible"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 26 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify + _SigIgn, "SIGINFO: status request from keyboard"}, + /* 30 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 
31 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, +} diff --git a/src/runtime/signal_darwin_amd64.go b/src/runtime/signal_darwin_amd64.go new file mode 100644 index 0000000..20544d8 --- /dev/null +++ b/src/runtime/signal_darwin_amd64.go @@ -0,0 +1,96 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *regs64 { return &(*ucontext)(c.ctxt).uc_mcontext.ss } + +func (c *sigctxt) rax() uint64 { return c.regs().rax } +func (c *sigctxt) rbx() uint64 { return c.regs().rbx } +func (c *sigctxt) rcx() uint64 { return c.regs().rcx } +func (c *sigctxt) rdx() uint64 { return c.regs().rdx } +func (c *sigctxt) rdi() uint64 { return c.regs().rdi } +func (c *sigctxt) rsi() uint64 { return c.regs().rsi } +func (c *sigctxt) rbp() uint64 { return c.regs().rbp } +func (c *sigctxt) rsp() uint64 { return c.regs().rsp } +func (c *sigctxt) r8() uint64 { return c.regs().r8 } +func (c *sigctxt) r9() uint64 { return c.regs().r9 } +func (c *sigctxt) r10() uint64 { return c.regs().r10 } +func (c *sigctxt) r11() uint64 { return c.regs().r11 } +func (c *sigctxt) r12() uint64 { return c.regs().r12 } +func (c *sigctxt) r13() uint64 { return c.regs().r13 } +func (c *sigctxt) r14() uint64 { return c.regs().r14 } +func (c *sigctxt) r15() uint64 { return c.regs().r15 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().rip } + +func (c *sigctxt) rflags() uint64 { return c.regs().rflags } +func (c *sigctxt) cs() uint64 { return c.regs().cs } +func (c *sigctxt) fs() uint64 { return c.regs().fs } +func (c *sigctxt) gs() uint64 { return c.regs().gs } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + 
+func (c *sigctxt) set_rip(x uint64) { c.regs().rip = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().rsp = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { c.info.si_addr = x } + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { + switch sig { + case _SIGTRAP: + // OS X sets c.sigcode() == TRAP_BRKPT unconditionally for all SIGTRAPs, + // leaving no way to distinguish a breakpoint-induced SIGTRAP + // from an asynchronous signal SIGTRAP. + // They all look breakpoint-induced by default. + // Try looking at the code to see if it's a breakpoint. + // The assumption is that we're very unlikely to get an + // asynchronous SIGTRAP at just the moment that the + // PC started to point at unmapped memory. + pc := uintptr(c.rip()) + // OS X will leave the pc just after the INT 3 instruction. + // INT 3 is usually 1 byte, but there is a 2-byte form. + code := (*[2]byte)(unsafe.Pointer(pc - 2)) + if code[1] != 0xCC && (code[0] != 0xCD || code[1] != 3) { + // SIGTRAP on something other than INT 3. + c.set_sigcode(_SI_USER) + } + + case _SIGSEGV: + // x86-64 has 48-bit virtual addresses. The top 16 bits must echo bit 47. + // The hardware delivers a different kind of fault for a malformed address + // than it does for an attempt to access a valid but unmapped address. + // OS X 10.9.2 mishandles the malformed address case, making it look like + // a user-generated signal (like someone ran kill -SEGV ourpid). + // We pass user-generated signals to os/signal, or else ignore them. + // Doing that here - and returning to the faulting code - results in an + // infinite loop. It appears the best we can do is rewrite what the kernel + // delivers into something more like the truth. 
The address used below + // has very little chance of being the one that caused the fault, but it is + // malformed, it is clearly not a real pointer, and if it does get printed + // in real life, people will probably search for it and find this code. + // There are no Google hits for b01dfacedebac1e or 0xb01dfacedebac1e + // as I type this comment. + // + // Note: if this code is removed, please consider + // enabling TestSignalForwardingGo for darwin-amd64 in + // misc/cgo/testcarchive/carchive_test.go. + if c.sigcode() == _SI_USER { + c.set_sigcode(_SI_USER + 1) + c.set_sigaddr(0xb01dfacedebac1e) + } + } +} diff --git a/src/runtime/signal_darwin_arm64.go b/src/runtime/signal_darwin_arm64.go new file mode 100644 index 0000000..690ffe4 --- /dev/null +++ b/src/runtime/signal_darwin_arm64.go @@ -0,0 +1,90 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *regs64 { return &(*ucontext)(c.ctxt).uc_mcontext.ss } + +func (c *sigctxt) r0() uint64 { return c.regs().x[0] } +func (c *sigctxt) r1() uint64 { return c.regs().x[1] } +func (c *sigctxt) r2() uint64 { return c.regs().x[2] } +func (c *sigctxt) r3() uint64 { return c.regs().x[3] } +func (c *sigctxt) r4() uint64 { return c.regs().x[4] } +func (c *sigctxt) r5() uint64 { return c.regs().x[5] } +func (c *sigctxt) r6() uint64 { return c.regs().x[6] } +func (c *sigctxt) r7() uint64 { return c.regs().x[7] } +func (c *sigctxt) r8() uint64 { return c.regs().x[8] } +func (c *sigctxt) r9() uint64 { return c.regs().x[9] } +func (c *sigctxt) r10() uint64 { return c.regs().x[10] } +func (c *sigctxt) r11() uint64 { return c.regs().x[11] } +func (c *sigctxt) r12() uint64 { return c.regs().x[12] } +func (c *sigctxt) r13() uint64 { return c.regs().x[13] } 
+func (c *sigctxt) r14() uint64 { return c.regs().x[14] } +func (c *sigctxt) r15() uint64 { return c.regs().x[15] } +func (c *sigctxt) r16() uint64 { return c.regs().x[16] } +func (c *sigctxt) r17() uint64 { return c.regs().x[17] } +func (c *sigctxt) r18() uint64 { return c.regs().x[18] } +func (c *sigctxt) r19() uint64 { return c.regs().x[19] } +func (c *sigctxt) r20() uint64 { return c.regs().x[20] } +func (c *sigctxt) r21() uint64 { return c.regs().x[21] } +func (c *sigctxt) r22() uint64 { return c.regs().x[22] } +func (c *sigctxt) r23() uint64 { return c.regs().x[23] } +func (c *sigctxt) r24() uint64 { return c.regs().x[24] } +func (c *sigctxt) r25() uint64 { return c.regs().x[25] } +func (c *sigctxt) r26() uint64 { return c.regs().x[26] } +func (c *sigctxt) r27() uint64 { return c.regs().x[27] } +func (c *sigctxt) r28() uint64 { return c.regs().x[28] } +func (c *sigctxt) r29() uint64 { return c.regs().fp } +func (c *sigctxt) lr() uint64 { return c.regs().lr } +func (c *sigctxt) sp() uint64 { return c.regs().sp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().pc } + +func (c *sigctxt) fault() uintptr { return uintptr(unsafe.Pointer(c.info.si_addr)) } + +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return uint64(uintptr(unsafe.Pointer(c.info.si_addr))) } + +func (c *sigctxt) set_pc(x uint64) { c.regs().pc = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sp = x } +func (c *sigctxt) set_lr(x uint64) { c.regs().lr = x } +func (c *sigctxt) set_r28(x uint64) { c.regs().x[28] = x } + +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + c.info.si_addr = (*byte)(unsafe.Pointer(uintptr(x))) +} + +//go:nosplit +func (c *sigctxt) fixsigcode(sig uint32) { + switch sig { + case _SIGTRAP: + // OS X sets c.sigcode() == TRAP_BRKPT unconditionally for all SIGTRAPs, + // leaving no way to distinguish a 
breakpoint-induced SIGTRAP + // from an asynchronous signal SIGTRAP. + // They all look breakpoint-induced by default. + // Try looking at the code to see if it's a breakpoint. + // The assumption is that we're very unlikely to get an + // asynchronous SIGTRAP at just the moment that the + // PC started to point at unmapped memory. + pc := uintptr(c.pc()) + // OS X will leave the pc just after the instruction. + code := (*uint32)(unsafe.Pointer(pc - 4)) + if *code != 0xd4200000 { + // SIGTRAP on something other than breakpoint. + c.set_sigcode(_SI_USER) + } + } +} diff --git a/src/runtime/signal_dragonfly.go b/src/runtime/signal_dragonfly.go new file mode 100644 index 0000000..f2b26e7 --- /dev/null +++ b/src/runtime/signal_dragonfly.go @@ -0,0 +1,41 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT: emulate instruction executed"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 17 */ {0, "SIGSTOP: stop"}, + /* 
18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 19 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue after stop"}, + /* 20 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify + _SigIgn, "SIGIO: i/o now possible"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 26 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify + _SigIgn, "SIGINFO: status request from keyboard"}, + /* 30 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 31 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 32 */ {_SigNotify, "SIGTHR: reserved"}, +} diff --git a/src/runtime/signal_dragonfly_amd64.go b/src/runtime/signal_dragonfly_amd64.go new file mode 100644 index 0000000..c473edd --- /dev/null +++ b/src/runtime/signal_dragonfly_amd64.go @@ -0,0 +1,51 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { + return (*mcontext)(unsafe.Pointer(&(*ucontext)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) rax() uint64 { return c.regs().mc_rax } +func (c *sigctxt) rbx() uint64 { return c.regs().mc_rbx } +func (c *sigctxt) rcx() uint64 { return c.regs().mc_rcx } +func (c *sigctxt) rdx() uint64 { return c.regs().mc_rdx } +func (c *sigctxt) rdi() uint64 { return c.regs().mc_rdi } +func (c *sigctxt) rsi() uint64 { return c.regs().mc_rsi } +func (c *sigctxt) rbp() uint64 { return c.regs().mc_rbp } +func (c *sigctxt) rsp() uint64 { return c.regs().mc_rsp } +func (c *sigctxt) r8() uint64 { return c.regs().mc_r8 } +func (c *sigctxt) r9() uint64 { return c.regs().mc_r9 } +func (c *sigctxt) r10() uint64 { return c.regs().mc_r10 } +func (c *sigctxt) r11() uint64 { return c.regs().mc_r11 } +func (c *sigctxt) r12() uint64 { return c.regs().mc_r12 } +func (c *sigctxt) r13() uint64 { return c.regs().mc_r13 } +func (c *sigctxt) r14() uint64 { return c.regs().mc_r14 } +func (c *sigctxt) r15() uint64 { return c.regs().mc_r15 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().mc_rip } + +func (c *sigctxt) rflags() uint64 { return c.regs().mc_rflags } +func (c *sigctxt) cs() uint64 { return c.regs().mc_cs } +func (c *sigctxt) fs() uint64 { return c.regs().mc_ss } +func (c *sigctxt) gs() uint64 { return c.regs().mc_ss } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_rip(x uint64) { c.regs().mc_rip = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().mc_rsp = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { c.info.si_addr = x } diff --git a/src/runtime/signal_freebsd.go 
b/src/runtime/signal_freebsd.go new file mode 100644 index 0000000..2812c69 --- /dev/null +++ b/src/runtime/signal_freebsd.go @@ -0,0 +1,41 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT: emulate instruction executed"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigNotify, "SIGSYS: bad system call"}, // see golang.org/issues/15204 + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 17 */ {0, "SIGSTOP: stop"}, + /* 18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 19 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue after stop"}, + /* 20 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify + _SigIgn, "SIGIO: i/o now possible"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + 
/* 26 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify + _SigIgn, "SIGINFO: status request from keyboard"}, + /* 30 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 31 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 32 */ {_SigNotify, "SIGTHR: reserved"}, +} diff --git a/src/runtime/signal_freebsd_386.go b/src/runtime/signal_freebsd_386.go new file mode 100644 index 0000000..f7cc0df --- /dev/null +++ b/src/runtime/signal_freebsd_386.go @@ -0,0 +1,41 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) eax() uint32 { return c.regs().mc_eax } +func (c *sigctxt) ebx() uint32 { return c.regs().mc_ebx } +func (c *sigctxt) ecx() uint32 { return c.regs().mc_ecx } +func (c *sigctxt) edx() uint32 { return c.regs().mc_edx } +func (c *sigctxt) edi() uint32 { return c.regs().mc_edi } +func (c *sigctxt) esi() uint32 { return c.regs().mc_esi } +func (c *sigctxt) ebp() uint32 { return c.regs().mc_ebp } +func (c *sigctxt) esp() uint32 { return c.regs().mc_esp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) eip() uint32 { return c.regs().mc_eip } + +func (c *sigctxt) eflags() uint32 { return c.regs().mc_eflags } +func (c *sigctxt) cs() uint32 { return c.regs().mc_cs } +func (c *sigctxt) fs() uint32 { return c.regs().mc_fs } +func (c *sigctxt) gs() uint32 { return c.regs().mc_gs } +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { return uint32(c.info.si_addr) } + +func (c *sigctxt) 
set_eip(x uint32) { c.regs().mc_eip = x } +func (c *sigctxt) set_esp(x uint32) { c.regs().mc_esp = x } +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { c.info.si_addr = uintptr(x) } diff --git a/src/runtime/signal_freebsd_amd64.go b/src/runtime/signal_freebsd_amd64.go new file mode 100644 index 0000000..20b86e7 --- /dev/null +++ b/src/runtime/signal_freebsd_amd64.go @@ -0,0 +1,51 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { + return (*mcontext)(unsafe.Pointer(&(*ucontext)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) rax() uint64 { return c.regs().mc_rax } +func (c *sigctxt) rbx() uint64 { return c.regs().mc_rbx } +func (c *sigctxt) rcx() uint64 { return c.regs().mc_rcx } +func (c *sigctxt) rdx() uint64 { return c.regs().mc_rdx } +func (c *sigctxt) rdi() uint64 { return c.regs().mc_rdi } +func (c *sigctxt) rsi() uint64 { return c.regs().mc_rsi } +func (c *sigctxt) rbp() uint64 { return c.regs().mc_rbp } +func (c *sigctxt) rsp() uint64 { return c.regs().mc_rsp } +func (c *sigctxt) r8() uint64 { return c.regs().mc_r8 } +func (c *sigctxt) r9() uint64 { return c.regs().mc_r9 } +func (c *sigctxt) r10() uint64 { return c.regs().mc_r10 } +func (c *sigctxt) r11() uint64 { return c.regs().mc_r11 } +func (c *sigctxt) r12() uint64 { return c.regs().mc_r12 } +func (c *sigctxt) r13() uint64 { return c.regs().mc_r13 } +func (c *sigctxt) r14() uint64 { return c.regs().mc_r14 } +func (c *sigctxt) r15() uint64 { return c.regs().mc_r15 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().mc_rip } + +func (c *sigctxt) rflags() uint64 { return c.regs().mc_rflags } +func (c *sigctxt) 
cs() uint64 { return c.regs().mc_cs } +func (c *sigctxt) fs() uint64 { return uint64(c.regs().mc_fs) } +func (c *sigctxt) gs() uint64 { return uint64(c.regs().mc_gs) } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_rip(x uint64) { c.regs().mc_rip = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().mc_rsp = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { c.info.si_addr = x } diff --git a/src/runtime/signal_freebsd_arm.go b/src/runtime/signal_freebsd_arm.go new file mode 100644 index 0000000..2135c1e --- /dev/null +++ b/src/runtime/signal_freebsd_arm.go @@ -0,0 +1,55 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint32 { return c.regs().__gregs[0] } +func (c *sigctxt) r1() uint32 { return c.regs().__gregs[1] } +func (c *sigctxt) r2() uint32 { return c.regs().__gregs[2] } +func (c *sigctxt) r3() uint32 { return c.regs().__gregs[3] } +func (c *sigctxt) r4() uint32 { return c.regs().__gregs[4] } +func (c *sigctxt) r5() uint32 { return c.regs().__gregs[5] } +func (c *sigctxt) r6() uint32 { return c.regs().__gregs[6] } +func (c *sigctxt) r7() uint32 { return c.regs().__gregs[7] } +func (c *sigctxt) r8() uint32 { return c.regs().__gregs[8] } +func (c *sigctxt) r9() uint32 { return c.regs().__gregs[9] } +func (c *sigctxt) r10() uint32 { return c.regs().__gregs[10] } +func (c *sigctxt) fp() uint32 { return c.regs().__gregs[11] } +func (c *sigctxt) ip() uint32 { return c.regs().__gregs[12] } +func (c *sigctxt) sp() uint32 { return 
c.regs().__gregs[13] } +func (c *sigctxt) lr() uint32 { return c.regs().__gregs[14] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint32 { return c.regs().__gregs[15] } + +func (c *sigctxt) cpsr() uint32 { return c.regs().__gregs[16] } +func (c *sigctxt) fault() uintptr { return uintptr(c.info.si_addr) } +func (c *sigctxt) trap() uint32 { return 0 } +func (c *sigctxt) error() uint32 { return 0 } +func (c *sigctxt) oldmask() uint32 { return 0 } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { return uint32(c.info.si_addr) } + +func (c *sigctxt) set_pc(x uint32) { c.regs().__gregs[15] = x } +func (c *sigctxt) set_sp(x uint32) { c.regs().__gregs[13] = x } +func (c *sigctxt) set_lr(x uint32) { c.regs().__gregs[14] = x } +func (c *sigctxt) set_r10(x uint32) { c.regs().__gregs[10] = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + c.info.si_addr = uintptr(x) +} diff --git a/src/runtime/signal_freebsd_arm64.go b/src/runtime/signal_freebsd_arm64.go new file mode 100644 index 0000000..159e965 --- /dev/null +++ b/src/runtime/signal_freebsd_arm64.go @@ -0,0 +1,66 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint64 { return c.regs().mc_gpregs.gp_x[0] } +func (c *sigctxt) r1() uint64 { return c.regs().mc_gpregs.gp_x[1] } +func (c *sigctxt) r2() uint64 { return c.regs().mc_gpregs.gp_x[2] } +func (c *sigctxt) r3() uint64 { return c.regs().mc_gpregs.gp_x[3] } +func (c *sigctxt) r4() uint64 { return c.regs().mc_gpregs.gp_x[4] } +func (c *sigctxt) r5() uint64 { return c.regs().mc_gpregs.gp_x[5] } +func (c *sigctxt) r6() uint64 { return c.regs().mc_gpregs.gp_x[6] } +func (c *sigctxt) r7() uint64 { return c.regs().mc_gpregs.gp_x[7] } +func (c *sigctxt) r8() uint64 { return c.regs().mc_gpregs.gp_x[8] } +func (c *sigctxt) r9() uint64 { return c.regs().mc_gpregs.gp_x[9] } +func (c *sigctxt) r10() uint64 { return c.regs().mc_gpregs.gp_x[10] } +func (c *sigctxt) r11() uint64 { return c.regs().mc_gpregs.gp_x[11] } +func (c *sigctxt) r12() uint64 { return c.regs().mc_gpregs.gp_x[12] } +func (c *sigctxt) r13() uint64 { return c.regs().mc_gpregs.gp_x[13] } +func (c *sigctxt) r14() uint64 { return c.regs().mc_gpregs.gp_x[14] } +func (c *sigctxt) r15() uint64 { return c.regs().mc_gpregs.gp_x[15] } +func (c *sigctxt) r16() uint64 { return c.regs().mc_gpregs.gp_x[16] } +func (c *sigctxt) r17() uint64 { return c.regs().mc_gpregs.gp_x[17] } +func (c *sigctxt) r18() uint64 { return c.regs().mc_gpregs.gp_x[18] } +func (c *sigctxt) r19() uint64 { return c.regs().mc_gpregs.gp_x[19] } +func (c *sigctxt) r20() uint64 { return c.regs().mc_gpregs.gp_x[20] } +func (c *sigctxt) r21() uint64 { return c.regs().mc_gpregs.gp_x[21] } +func (c *sigctxt) r22() uint64 { return c.regs().mc_gpregs.gp_x[22] } +func (c *sigctxt) r23() uint64 { return c.regs().mc_gpregs.gp_x[23] } +func (c *sigctxt) r24() uint64 { return c.regs().mc_gpregs.gp_x[24] } +func (c *sigctxt) 
r25() uint64 { return c.regs().mc_gpregs.gp_x[25] } +func (c *sigctxt) r26() uint64 { return c.regs().mc_gpregs.gp_x[26] } +func (c *sigctxt) r27() uint64 { return c.regs().mc_gpregs.gp_x[27] } +func (c *sigctxt) r28() uint64 { return c.regs().mc_gpregs.gp_x[28] } +func (c *sigctxt) r29() uint64 { return c.regs().mc_gpregs.gp_x[29] } +func (c *sigctxt) lr() uint64 { return c.regs().mc_gpregs.gp_lr } +func (c *sigctxt) sp() uint64 { return c.regs().mc_gpregs.gp_sp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().mc_gpregs.gp_elr } + +func (c *sigctxt) fault() uint64 { return c.info.si_addr } + +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_pc(x uint64) { c.regs().mc_gpregs.gp_elr = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().mc_gpregs.gp_sp = x } +func (c *sigctxt) set_lr(x uint64) { c.regs().mc_gpregs.gp_lr = x } +func (c *sigctxt) set_r28(x uint64) { c.regs().mc_gpregs.gp_x[28] = x } + +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { c.info.si_addr = x } diff --git a/src/runtime/signal_freebsd_riscv64.go b/src/runtime/signal_freebsd_riscv64.go new file mode 100644 index 0000000..fbf6c63 --- /dev/null +++ b/src/runtime/signal_freebsd_riscv64.go @@ -0,0 +1,63 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) ra() uint64 { return c.regs().mc_gpregs.gp_ra } +func (c *sigctxt) sp() uint64 { return c.regs().mc_gpregs.gp_sp } +func (c *sigctxt) gp() uint64 { return c.regs().mc_gpregs.gp_gp } +func (c *sigctxt) tp() uint64 { return c.regs().mc_gpregs.gp_tp } +func (c *sigctxt) t0() uint64 { return c.regs().mc_gpregs.gp_t[0] } +func (c *sigctxt) t1() uint64 { return c.regs().mc_gpregs.gp_t[1] } +func (c *sigctxt) t2() uint64 { return c.regs().mc_gpregs.gp_t[2] } +func (c *sigctxt) s0() uint64 { return c.regs().mc_gpregs.gp_s[0] } +func (c *sigctxt) s1() uint64 { return c.regs().mc_gpregs.gp_s[1] } +func (c *sigctxt) a0() uint64 { return c.regs().mc_gpregs.gp_a[0] } +func (c *sigctxt) a1() uint64 { return c.regs().mc_gpregs.gp_a[1] } +func (c *sigctxt) a2() uint64 { return c.regs().mc_gpregs.gp_a[2] } +func (c *sigctxt) a3() uint64 { return c.regs().mc_gpregs.gp_a[3] } +func (c *sigctxt) a4() uint64 { return c.regs().mc_gpregs.gp_a[4] } +func (c *sigctxt) a5() uint64 { return c.regs().mc_gpregs.gp_a[5] } +func (c *sigctxt) a6() uint64 { return c.regs().mc_gpregs.gp_a[6] } +func (c *sigctxt) a7() uint64 { return c.regs().mc_gpregs.gp_a[7] } +func (c *sigctxt) s2() uint64 { return c.regs().mc_gpregs.gp_s[2] } +func (c *sigctxt) s3() uint64 { return c.regs().mc_gpregs.gp_s[3] } +func (c *sigctxt) s4() uint64 { return c.regs().mc_gpregs.gp_s[4] } +func (c *sigctxt) s5() uint64 { return c.regs().mc_gpregs.gp_s[5] } +func (c *sigctxt) s6() uint64 { return c.regs().mc_gpregs.gp_s[6] } +func (c *sigctxt) s7() uint64 { return c.regs().mc_gpregs.gp_s[7] } +func (c *sigctxt) s8() uint64 { return c.regs().mc_gpregs.gp_s[8] } +func (c *sigctxt) s9() uint64 { return c.regs().mc_gpregs.gp_s[9] } +func (c *sigctxt) s10() uint64 { return 
c.regs().mc_gpregs.gp_s[10] } +func (c *sigctxt) s11() uint64 { return c.regs().mc_gpregs.gp_s[11] } +func (c *sigctxt) t3() uint64 { return c.regs().mc_gpregs.gp_t[3] } +func (c *sigctxt) t4() uint64 { return c.regs().mc_gpregs.gp_t[4] } +func (c *sigctxt) t5() uint64 { return c.regs().mc_gpregs.gp_t[5] } +func (c *sigctxt) t6() uint64 { return c.regs().mc_gpregs.gp_t[6] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().mc_gpregs.gp_sepc } + +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_pc(x uint64) { c.regs().mc_gpregs.gp_sepc = x } +func (c *sigctxt) set_ra(x uint64) { c.regs().mc_gpregs.gp_ra = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().mc_gpregs.gp_sp = x } +func (c *sigctxt) set_gp(x uint64) { c.regs().mc_gpregs.gp_gp = x } + +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { c.info.si_addr = x } diff --git a/src/runtime/signal_linux_386.go b/src/runtime/signal_linux_386.go new file mode 100644 index 0000000..321518c --- /dev/null +++ b/src/runtime/signal_linux_386.go @@ -0,0 +1,46 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) eax() uint32 { return c.regs().eax } +func (c *sigctxt) ebx() uint32 { return c.regs().ebx } +func (c *sigctxt) ecx() uint32 { return c.regs().ecx } +func (c *sigctxt) edx() uint32 { return c.regs().edx } +func (c *sigctxt) edi() uint32 { return c.regs().edi } +func (c *sigctxt) esi() uint32 { return c.regs().esi } +func (c *sigctxt) ebp() uint32 { return c.regs().ebp } +func (c *sigctxt) esp() uint32 { return c.regs().esp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) eip() uint32 { return c.regs().eip } + +func (c *sigctxt) eflags() uint32 { return c.regs().eflags } +func (c *sigctxt) cs() uint32 { return uint32(c.regs().cs) } +func (c *sigctxt) fs() uint32 { return uint32(c.regs().fs) } +func (c *sigctxt) gs() uint32 { return uint32(c.regs().gs) } +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { return c.info.si_addr } + +func (c *sigctxt) set_eip(x uint32) { c.regs().eip = x } +func (c *sigctxt) set_esp(x uint32) { c.regs().esp = x } +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_amd64.go b/src/runtime/signal_linux_amd64.go new file mode 100644 index 0000000..573b118 --- /dev/null +++ b/src/runtime/signal_linux_amd64.go @@ -0,0 +1,56 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(unsafe.Pointer(&(*ucontext)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) rax() uint64 { return c.regs().rax } +func (c *sigctxt) rbx() uint64 { return c.regs().rbx } +func (c *sigctxt) rcx() uint64 { return c.regs().rcx } +func (c *sigctxt) rdx() uint64 { return c.regs().rdx } +func (c *sigctxt) rdi() uint64 { return c.regs().rdi } +func (c *sigctxt) rsi() uint64 { return c.regs().rsi } +func (c *sigctxt) rbp() uint64 { return c.regs().rbp } +func (c *sigctxt) rsp() uint64 { return c.regs().rsp } +func (c *sigctxt) r8() uint64 { return c.regs().r8 } +func (c *sigctxt) r9() uint64 { return c.regs().r9 } +func (c *sigctxt) r10() uint64 { return c.regs().r10 } +func (c *sigctxt) r11() uint64 { return c.regs().r11 } +func (c *sigctxt) r12() uint64 { return c.regs().r12 } +func (c *sigctxt) r13() uint64 { return c.regs().r13 } +func (c *sigctxt) r14() uint64 { return c.regs().r14 } +func (c *sigctxt) r15() uint64 { return c.regs().r15 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().rip } + +func (c *sigctxt) rflags() uint64 { return c.regs().eflags } +func (c *sigctxt) cs() uint64 { return uint64(c.regs().cs) } +func (c *sigctxt) fs() uint64 { return uint64(c.regs().fs) } +func (c *sigctxt) gs() uint64 { return uint64(c.regs().gs) } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_rip(x uint64) { c.regs().rip = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().rsp = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git 
a/src/runtime/signal_linux_arm.go b/src/runtime/signal_linux_arm.go new file mode 100644 index 0000000..eb107d6 --- /dev/null +++ b/src/runtime/signal_linux_arm.go @@ -0,0 +1,58 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint32 { return c.regs().r0 } +func (c *sigctxt) r1() uint32 { return c.regs().r1 } +func (c *sigctxt) r2() uint32 { return c.regs().r2 } +func (c *sigctxt) r3() uint32 { return c.regs().r3 } +func (c *sigctxt) r4() uint32 { return c.regs().r4 } +func (c *sigctxt) r5() uint32 { return c.regs().r5 } +func (c *sigctxt) r6() uint32 { return c.regs().r6 } +func (c *sigctxt) r7() uint32 { return c.regs().r7 } +func (c *sigctxt) r8() uint32 { return c.regs().r8 } +func (c *sigctxt) r9() uint32 { return c.regs().r9 } +func (c *sigctxt) r10() uint32 { return c.regs().r10 } +func (c *sigctxt) fp() uint32 { return c.regs().fp } +func (c *sigctxt) ip() uint32 { return c.regs().ip } +func (c *sigctxt) sp() uint32 { return c.regs().sp } +func (c *sigctxt) lr() uint32 { return c.regs().lr } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint32 { return c.regs().pc } + +func (c *sigctxt) cpsr() uint32 { return c.regs().cpsr } +func (c *sigctxt) fault() uintptr { return uintptr(c.regs().fault_address) } +func (c *sigctxt) trap() uint32 { return c.regs().trap_no } +func (c *sigctxt) error() uint32 { return c.regs().error_code } +func (c *sigctxt) oldmask() uint32 { return c.regs().oldmask } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { return c.info.si_addr } + +func (c *sigctxt) set_pc(x 
uint32) { c.regs().pc = x } +func (c *sigctxt) set_sp(x uint32) { c.regs().sp = x } +func (c *sigctxt) set_lr(x uint32) { c.regs().lr = x } +func (c *sigctxt) set_r10(x uint32) { c.regs().r10 = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_arm64.go b/src/runtime/signal_linux_arm64.go new file mode 100644 index 0000000..4ccc030 --- /dev/null +++ b/src/runtime/signal_linux_arm64.go @@ -0,0 +1,71 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint64 { return c.regs().regs[0] } +func (c *sigctxt) r1() uint64 { return c.regs().regs[1] } +func (c *sigctxt) r2() uint64 { return c.regs().regs[2] } +func (c *sigctxt) r3() uint64 { return c.regs().regs[3] } +func (c *sigctxt) r4() uint64 { return c.regs().regs[4] } +func (c *sigctxt) r5() uint64 { return c.regs().regs[5] } +func (c *sigctxt) r6() uint64 { return c.regs().regs[6] } +func (c *sigctxt) r7() uint64 { return c.regs().regs[7] } +func (c *sigctxt) r8() uint64 { return c.regs().regs[8] } +func (c *sigctxt) r9() uint64 { return c.regs().regs[9] } +func (c *sigctxt) r10() uint64 { return c.regs().regs[10] } +func (c *sigctxt) r11() uint64 { return c.regs().regs[11] } +func (c *sigctxt) r12() uint64 { return c.regs().regs[12] } +func (c *sigctxt) r13() uint64 { return c.regs().regs[13] } +func (c *sigctxt) r14() uint64 { return c.regs().regs[14] } +func (c *sigctxt) r15() uint64 { return c.regs().regs[15] } +func (c *sigctxt) r16() uint64 { 
return c.regs().regs[16] } +func (c *sigctxt) r17() uint64 { return c.regs().regs[17] } +func (c *sigctxt) r18() uint64 { return c.regs().regs[18] } +func (c *sigctxt) r19() uint64 { return c.regs().regs[19] } +func (c *sigctxt) r20() uint64 { return c.regs().regs[20] } +func (c *sigctxt) r21() uint64 { return c.regs().regs[21] } +func (c *sigctxt) r22() uint64 { return c.regs().regs[22] } +func (c *sigctxt) r23() uint64 { return c.regs().regs[23] } +func (c *sigctxt) r24() uint64 { return c.regs().regs[24] } +func (c *sigctxt) r25() uint64 { return c.regs().regs[25] } +func (c *sigctxt) r26() uint64 { return c.regs().regs[26] } +func (c *sigctxt) r27() uint64 { return c.regs().regs[27] } +func (c *sigctxt) r28() uint64 { return c.regs().regs[28] } +func (c *sigctxt) r29() uint64 { return c.regs().regs[29] } +func (c *sigctxt) lr() uint64 { return c.regs().regs[30] } +func (c *sigctxt) sp() uint64 { return c.regs().sp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().pc } + +func (c *sigctxt) pstate() uint64 { return c.regs().pstate } +func (c *sigctxt) fault() uintptr { return uintptr(c.regs().fault_address) } + +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_pc(x uint64) { c.regs().pc = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sp = x } +func (c *sigctxt) set_lr(x uint64) { c.regs().regs[30] = x } +func (c *sigctxt) set_r28(x uint64) { c.regs().regs[28] = x } + +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_loong64.go b/src/runtime/signal_linux_loong64.go new file mode 100644 index 0000000..51aaacb --- /dev/null +++ b/src/runtime/signal_linux_loong64.go @@ -0,0 +1,75 @@ +// Copyright 2022 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && loong64 + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint64 { return c.regs().sc_regs[0] } +func (c *sigctxt) r1() uint64 { return c.regs().sc_regs[1] } +func (c *sigctxt) r2() uint64 { return c.regs().sc_regs[2] } +func (c *sigctxt) r3() uint64 { return c.regs().sc_regs[3] } +func (c *sigctxt) r4() uint64 { return c.regs().sc_regs[4] } +func (c *sigctxt) r5() uint64 { return c.regs().sc_regs[5] } +func (c *sigctxt) r6() uint64 { return c.regs().sc_regs[6] } +func (c *sigctxt) r7() uint64 { return c.regs().sc_regs[7] } +func (c *sigctxt) r8() uint64 { return c.regs().sc_regs[8] } +func (c *sigctxt) r9() uint64 { return c.regs().sc_regs[9] } +func (c *sigctxt) r10() uint64 { return c.regs().sc_regs[10] } +func (c *sigctxt) r11() uint64 { return c.regs().sc_regs[11] } +func (c *sigctxt) r12() uint64 { return c.regs().sc_regs[12] } +func (c *sigctxt) r13() uint64 { return c.regs().sc_regs[13] } +func (c *sigctxt) r14() uint64 { return c.regs().sc_regs[14] } +func (c *sigctxt) r15() uint64 { return c.regs().sc_regs[15] } +func (c *sigctxt) r16() uint64 { return c.regs().sc_regs[16] } +func (c *sigctxt) r17() uint64 { return c.regs().sc_regs[17] } +func (c *sigctxt) r18() uint64 { return c.regs().sc_regs[18] } +func (c *sigctxt) r19() uint64 { return c.regs().sc_regs[19] } +func (c *sigctxt) r20() uint64 { return c.regs().sc_regs[20] } +func (c *sigctxt) r21() uint64 { return c.regs().sc_regs[21] } +func (c *sigctxt) r22() uint64 { return c.regs().sc_regs[22] } +func (c *sigctxt) r23() uint64 { return c.regs().sc_regs[23] } +func (c *sigctxt) r24() uint64 { return c.regs().sc_regs[24] } +func (c *sigctxt) r25() 
uint64 { return c.regs().sc_regs[25] } +func (c *sigctxt) r26() uint64 { return c.regs().sc_regs[26] } +func (c *sigctxt) r27() uint64 { return c.regs().sc_regs[27] } +func (c *sigctxt) r28() uint64 { return c.regs().sc_regs[28] } +func (c *sigctxt) r29() uint64 { return c.regs().sc_regs[29] } +func (c *sigctxt) r30() uint64 { return c.regs().sc_regs[30] } +func (c *sigctxt) r31() uint64 { return c.regs().sc_regs[31] } +func (c *sigctxt) sp() uint64 { return c.regs().sc_regs[3] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().sc_pc } + +func (c *sigctxt) link() uint64 { return c.regs().sc_regs[1] } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_r31(x uint64) { c.regs().sc_regs[31] = x } +func (c *sigctxt) set_r22(x uint64) { c.regs().sc_regs[22] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().sc_pc = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sc_regs[3] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().sc_regs[1] = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_mips64x.go b/src/runtime/signal_linux_mips64x.go new file mode 100644 index 0000000..9c2a286 --- /dev/null +++ b/src/runtime/signal_linux_mips64x.go @@ -0,0 +1,77 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (mips64 || mips64le) + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint64 { return c.regs().sc_regs[0] } +func (c *sigctxt) r1() uint64 { return c.regs().sc_regs[1] } +func (c *sigctxt) r2() uint64 { return c.regs().sc_regs[2] } +func (c *sigctxt) r3() uint64 { return c.regs().sc_regs[3] } +func (c *sigctxt) r4() uint64 { return c.regs().sc_regs[4] } +func (c *sigctxt) r5() uint64 { return c.regs().sc_regs[5] } +func (c *sigctxt) r6() uint64 { return c.regs().sc_regs[6] } +func (c *sigctxt) r7() uint64 { return c.regs().sc_regs[7] } +func (c *sigctxt) r8() uint64 { return c.regs().sc_regs[8] } +func (c *sigctxt) r9() uint64 { return c.regs().sc_regs[9] } +func (c *sigctxt) r10() uint64 { return c.regs().sc_regs[10] } +func (c *sigctxt) r11() uint64 { return c.regs().sc_regs[11] } +func (c *sigctxt) r12() uint64 { return c.regs().sc_regs[12] } +func (c *sigctxt) r13() uint64 { return c.regs().sc_regs[13] } +func (c *sigctxt) r14() uint64 { return c.regs().sc_regs[14] } +func (c *sigctxt) r15() uint64 { return c.regs().sc_regs[15] } +func (c *sigctxt) r16() uint64 { return c.regs().sc_regs[16] } +func (c *sigctxt) r17() uint64 { return c.regs().sc_regs[17] } +func (c *sigctxt) r18() uint64 { return c.regs().sc_regs[18] } +func (c *sigctxt) r19() uint64 { return c.regs().sc_regs[19] } +func (c *sigctxt) r20() uint64 { return c.regs().sc_regs[20] } +func (c *sigctxt) r21() uint64 { return c.regs().sc_regs[21] } +func (c *sigctxt) r22() uint64 { return c.regs().sc_regs[22] } +func (c *sigctxt) r23() uint64 { return c.regs().sc_regs[23] } +func (c *sigctxt) r24() uint64 { return c.regs().sc_regs[24] } +func (c *sigctxt) r25() uint64 { return c.regs().sc_regs[25] } +func (c *sigctxt) r26() uint64 { return 
c.regs().sc_regs[26] } +func (c *sigctxt) r27() uint64 { return c.regs().sc_regs[27] } +func (c *sigctxt) r28() uint64 { return c.regs().sc_regs[28] } +func (c *sigctxt) r29() uint64 { return c.regs().sc_regs[29] } +func (c *sigctxt) r30() uint64 { return c.regs().sc_regs[30] } +func (c *sigctxt) r31() uint64 { return c.regs().sc_regs[31] } +func (c *sigctxt) sp() uint64 { return c.regs().sc_regs[29] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().sc_pc } + +func (c *sigctxt) link() uint64 { return c.regs().sc_regs[31] } +func (c *sigctxt) lo() uint64 { return c.regs().sc_mdlo } +func (c *sigctxt) hi() uint64 { return c.regs().sc_mdhi } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_r28(x uint64) { c.regs().sc_regs[28] = x } +func (c *sigctxt) set_r30(x uint64) { c.regs().sc_regs[30] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().sc_pc = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sc_regs[29] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().sc_regs[31] = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_mipsx.go b/src/runtime/signal_linux_mipsx.go new file mode 100644 index 0000000..f11bfc9 --- /dev/null +++ b/src/runtime/signal_linux_mipsx.go @@ -0,0 +1,64 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (mips || mipsle) + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } +func (c *sigctxt) r0() uint32 { return uint32(c.regs().sc_regs[0]) } +func (c *sigctxt) r1() uint32 { return uint32(c.regs().sc_regs[1]) } +func (c *sigctxt) r2() uint32 { return uint32(c.regs().sc_regs[2]) } +func (c *sigctxt) r3() uint32 { return uint32(c.regs().sc_regs[3]) } +func (c *sigctxt) r4() uint32 { return uint32(c.regs().sc_regs[4]) } +func (c *sigctxt) r5() uint32 { return uint32(c.regs().sc_regs[5]) } +func (c *sigctxt) r6() uint32 { return uint32(c.regs().sc_regs[6]) } +func (c *sigctxt) r7() uint32 { return uint32(c.regs().sc_regs[7]) } +func (c *sigctxt) r8() uint32 { return uint32(c.regs().sc_regs[8]) } +func (c *sigctxt) r9() uint32 { return uint32(c.regs().sc_regs[9]) } +func (c *sigctxt) r10() uint32 { return uint32(c.regs().sc_regs[10]) } +func (c *sigctxt) r11() uint32 { return uint32(c.regs().sc_regs[11]) } +func (c *sigctxt) r12() uint32 { return uint32(c.regs().sc_regs[12]) } +func (c *sigctxt) r13() uint32 { return uint32(c.regs().sc_regs[13]) } +func (c *sigctxt) r14() uint32 { return uint32(c.regs().sc_regs[14]) } +func (c *sigctxt) r15() uint32 { return uint32(c.regs().sc_regs[15]) } +func (c *sigctxt) r16() uint32 { return uint32(c.regs().sc_regs[16]) } +func (c *sigctxt) r17() uint32 { return uint32(c.regs().sc_regs[17]) } +func (c *sigctxt) r18() uint32 { return uint32(c.regs().sc_regs[18]) } +func (c *sigctxt) r19() uint32 { return uint32(c.regs().sc_regs[19]) } +func (c *sigctxt) r20() uint32 { return uint32(c.regs().sc_regs[20]) } +func (c *sigctxt) r21() uint32 { return uint32(c.regs().sc_regs[21]) } +func (c *sigctxt) r22() uint32 { return uint32(c.regs().sc_regs[22]) } +func (c *sigctxt) r23() uint32 { return uint32(c.regs().sc_regs[23]) } +func (c *sigctxt) r24() uint32 { return 
uint32(c.regs().sc_regs[24]) } +func (c *sigctxt) r25() uint32 { return uint32(c.regs().sc_regs[25]) } +func (c *sigctxt) r26() uint32 { return uint32(c.regs().sc_regs[26]) } +func (c *sigctxt) r27() uint32 { return uint32(c.regs().sc_regs[27]) } +func (c *sigctxt) r28() uint32 { return uint32(c.regs().sc_regs[28]) } +func (c *sigctxt) r29() uint32 { return uint32(c.regs().sc_regs[29]) } +func (c *sigctxt) r30() uint32 { return uint32(c.regs().sc_regs[30]) } +func (c *sigctxt) r31() uint32 { return uint32(c.regs().sc_regs[31]) } +func (c *sigctxt) sp() uint32 { return uint32(c.regs().sc_regs[29]) } +func (c *sigctxt) pc() uint32 { return uint32(c.regs().sc_pc) } +func (c *sigctxt) link() uint32 { return uint32(c.regs().sc_regs[31]) } +func (c *sigctxt) lo() uint32 { return uint32(c.regs().sc_mdlo) } +func (c *sigctxt) hi() uint32 { return uint32(c.regs().sc_mdhi) } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { return c.info.si_addr } + +func (c *sigctxt) set_r30(x uint32) { c.regs().sc_regs[30] = uint64(x) } +func (c *sigctxt) set_pc(x uint32) { c.regs().sc_pc = uint64(x) } +func (c *sigctxt) set_sp(x uint32) { c.regs().sc_regs[29] = uint64(x) } +func (c *sigctxt) set_link(x uint32) { c.regs().sc_regs[31] = uint64(x) } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { c.info.si_addr = x } diff --git a/src/runtime/signal_linux_ppc64x.go b/src/runtime/signal_linux_ppc64x.go new file mode 100644 index 0000000..3175428 --- /dev/null +++ b/src/runtime/signal_linux_ppc64x.go @@ -0,0 +1,81 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (ppc64 || ppc64le) + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *ptregs { return (*ucontext)(c.ctxt).uc_mcontext.regs } + +func (c *sigctxt) r0() uint64 { return c.regs().gpr[0] } +func (c *sigctxt) r1() uint64 { return c.regs().gpr[1] } +func (c *sigctxt) r2() uint64 { return c.regs().gpr[2] } +func (c *sigctxt) r3() uint64 { return c.regs().gpr[3] } +func (c *sigctxt) r4() uint64 { return c.regs().gpr[4] } +func (c *sigctxt) r5() uint64 { return c.regs().gpr[5] } +func (c *sigctxt) r6() uint64 { return c.regs().gpr[6] } +func (c *sigctxt) r7() uint64 { return c.regs().gpr[7] } +func (c *sigctxt) r8() uint64 { return c.regs().gpr[8] } +func (c *sigctxt) r9() uint64 { return c.regs().gpr[9] } +func (c *sigctxt) r10() uint64 { return c.regs().gpr[10] } +func (c *sigctxt) r11() uint64 { return c.regs().gpr[11] } +func (c *sigctxt) r12() uint64 { return c.regs().gpr[12] } +func (c *sigctxt) r13() uint64 { return c.regs().gpr[13] } +func (c *sigctxt) r14() uint64 { return c.regs().gpr[14] } +func (c *sigctxt) r15() uint64 { return c.regs().gpr[15] } +func (c *sigctxt) r16() uint64 { return c.regs().gpr[16] } +func (c *sigctxt) r17() uint64 { return c.regs().gpr[17] } +func (c *sigctxt) r18() uint64 { return c.regs().gpr[18] } +func (c *sigctxt) r19() uint64 { return c.regs().gpr[19] } +func (c *sigctxt) r20() uint64 { return c.regs().gpr[20] } +func (c *sigctxt) r21() uint64 { return c.regs().gpr[21] } +func (c *sigctxt) r22() uint64 { return c.regs().gpr[22] } +func (c *sigctxt) r23() uint64 { return c.regs().gpr[23] } +func (c *sigctxt) r24() uint64 { return c.regs().gpr[24] } +func (c *sigctxt) r25() uint64 { return c.regs().gpr[25] } +func (c *sigctxt) r26() uint64 { return c.regs().gpr[26] } +func (c *sigctxt) r27() uint64 { return c.regs().gpr[27] } +func (c *sigctxt) r28() uint64 { return 
c.regs().gpr[28] } +func (c *sigctxt) r29() uint64 { return c.regs().gpr[29] } +func (c *sigctxt) r30() uint64 { return c.regs().gpr[30] } +func (c *sigctxt) r31() uint64 { return c.regs().gpr[31] } +func (c *sigctxt) sp() uint64 { return c.regs().gpr[1] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().nip } + +func (c *sigctxt) trap() uint64 { return c.regs().trap } +func (c *sigctxt) ctr() uint64 { return c.regs().ctr } +func (c *sigctxt) link() uint64 { return c.regs().link } +func (c *sigctxt) xer() uint64 { return c.regs().xer } +func (c *sigctxt) ccr() uint64 { return c.regs().ccr } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } +func (c *sigctxt) fault() uintptr { return uintptr(c.regs().dar) } + +func (c *sigctxt) set_r0(x uint64) { c.regs().gpr[0] = x } +func (c *sigctxt) set_r12(x uint64) { c.regs().gpr[12] = x } +func (c *sigctxt) set_r30(x uint64) { c.regs().gpr[30] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().nip = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().gpr[1] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().link = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_riscv64.go b/src/runtime/signal_linux_riscv64.go new file mode 100644 index 0000000..b26450d --- /dev/null +++ b/src/runtime/signal_linux_riscv64.go @@ -0,0 +1,68 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { return &(*ucontext)(c.ctxt).uc_mcontext } + +func (c *sigctxt) ra() uint64 { return c.regs().sc_regs.ra } +func (c *sigctxt) sp() uint64 { return c.regs().sc_regs.sp } +func (c *sigctxt) gp() uint64 { return c.regs().sc_regs.gp } +func (c *sigctxt) tp() uint64 { return c.regs().sc_regs.tp } +func (c *sigctxt) t0() uint64 { return c.regs().sc_regs.t0 } +func (c *sigctxt) t1() uint64 { return c.regs().sc_regs.t1 } +func (c *sigctxt) t2() uint64 { return c.regs().sc_regs.t2 } +func (c *sigctxt) s0() uint64 { return c.regs().sc_regs.s0 } +func (c *sigctxt) s1() uint64 { return c.regs().sc_regs.s1 } +func (c *sigctxt) a0() uint64 { return c.regs().sc_regs.a0 } +func (c *sigctxt) a1() uint64 { return c.regs().sc_regs.a1 } +func (c *sigctxt) a2() uint64 { return c.regs().sc_regs.a2 } +func (c *sigctxt) a3() uint64 { return c.regs().sc_regs.a3 } +func (c *sigctxt) a4() uint64 { return c.regs().sc_regs.a4 } +func (c *sigctxt) a5() uint64 { return c.regs().sc_regs.a5 } +func (c *sigctxt) a6() uint64 { return c.regs().sc_regs.a6 } +func (c *sigctxt) a7() uint64 { return c.regs().sc_regs.a7 } +func (c *sigctxt) s2() uint64 { return c.regs().sc_regs.s2 } +func (c *sigctxt) s3() uint64 { return c.regs().sc_regs.s3 } +func (c *sigctxt) s4() uint64 { return c.regs().sc_regs.s4 } +func (c *sigctxt) s5() uint64 { return c.regs().sc_regs.s5 } +func (c *sigctxt) s6() uint64 { return c.regs().sc_regs.s6 } +func (c *sigctxt) s7() uint64 { return c.regs().sc_regs.s7 } +func (c *sigctxt) s8() uint64 { return c.regs().sc_regs.s8 } +func (c *sigctxt) s9() uint64 { return c.regs().sc_regs.s9 } +func (c *sigctxt) s10() uint64 { return c.regs().sc_regs.s10 } +func (c *sigctxt) s11() uint64 { return c.regs().sc_regs.s11 } +func (c *sigctxt) t3() uint64 { return c.regs().sc_regs.t3 } 
+func (c *sigctxt) t4() uint64 { return c.regs().sc_regs.t4 } +func (c *sigctxt) t5() uint64 { return c.regs().sc_regs.t5 } +func (c *sigctxt) t6() uint64 { return c.regs().sc_regs.t6 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().sc_regs.pc } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_pc(x uint64) { c.regs().sc_regs.pc = x } +func (c *sigctxt) set_ra(x uint64) { c.regs().sc_regs.ra = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sc_regs.sp = x } +func (c *sigctxt) set_gp(x uint64) { c.regs().sc_regs.gp = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} diff --git a/src/runtime/signal_linux_s390x.go b/src/runtime/signal_linux_s390x.go new file mode 100644 index 0000000..18c3b11 --- /dev/null +++ b/src/runtime/signal_linux_s390x.go @@ -0,0 +1,127 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(unsafe.Pointer(&(*ucontext)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) r0() uint64 { return c.regs().gregs[0] } +func (c *sigctxt) r1() uint64 { return c.regs().gregs[1] } +func (c *sigctxt) r2() uint64 { return c.regs().gregs[2] } +func (c *sigctxt) r3() uint64 { return c.regs().gregs[3] } +func (c *sigctxt) r4() uint64 { return c.regs().gregs[4] } +func (c *sigctxt) r5() uint64 { return c.regs().gregs[5] } +func (c *sigctxt) r6() uint64 { return c.regs().gregs[6] } +func (c *sigctxt) r7() uint64 { return c.regs().gregs[7] } +func (c *sigctxt) r8() uint64 { return c.regs().gregs[8] } +func (c *sigctxt) r9() uint64 { return c.regs().gregs[9] } +func (c *sigctxt) r10() uint64 { return c.regs().gregs[10] } +func (c *sigctxt) r11() uint64 { return c.regs().gregs[11] } +func (c *sigctxt) r12() uint64 { return c.regs().gregs[12] } +func (c *sigctxt) r13() uint64 { return c.regs().gregs[13] } +func (c *sigctxt) r14() uint64 { return c.regs().gregs[14] } +func (c *sigctxt) r15() uint64 { return c.regs().gregs[15] } +func (c *sigctxt) link() uint64 { return c.regs().gregs[14] } +func (c *sigctxt) sp() uint64 { return c.regs().gregs[15] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().psw_addr } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { return c.info.si_addr } + +func (c *sigctxt) set_r0(x uint64) { c.regs().gregs[0] = x } +func (c *sigctxt) set_r13(x uint64) { c.regs().gregs[13] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().gregs[14] = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().gregs[15] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().psw_addr = x } +func (c *sigctxt) 
set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(add(unsafe.Pointer(c.info), 2*goarch.PtrSize)) = uintptr(x) +} + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\t") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\t") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\t") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\t") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\t") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\t") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\t") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\t") + print("r15 ", hex(c.r15()), "\n") + print("pc ", hex(c.pc()), "\t") + print("link ", hex(c.link()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.link()) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange link, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save LINK to stack so that panics in leaf + // functions are correctly handled. This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - sys.MinFrameSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + + pc := uintptr(gp.sigpc) + + if shouldPushSigpanic(gp, pc, uintptr(c.link())) { + // Make it look like the faulting PC called sigpanic.
+ c.set_link(uint64(pc)) + } + + // In case we are panicking from external C code + c.set_r0(0) + c.set_r13(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_pc(uint64(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - 8 + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_link(uint64(resumePC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_loong64.go b/src/runtime/signal_loong64.go new file mode 100644 index 0000000..26717a6 --- /dev/null +++ b/src/runtime/signal_loong64.go @@ -0,0 +1,98 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && loong64 + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\t") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\t") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\t") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\t") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\t") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\t") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\t") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\t") + print("r15 ", hex(c.r15()), "\n") + print("r16 ", hex(c.r16()), "\t") + print("r17 ", hex(c.r17()), "\n") + print("r18 ", hex(c.r18()), "\t") + print("r19 ", hex(c.r19()), "\n") + print("r20 ", hex(c.r20()), "\t") + print("r21 ", hex(c.r21()), "\n") + print("r22 ", hex(c.r22()), "\t") + print("r23 ", hex(c.r23()), "\n") + print("r24 ", hex(c.r24()), "\t") + print("r25 ", hex(c.r25()), "\n") + print("r26 ", hex(c.r26()), "\t") + print("r27 ", hex(c.r27()), "\n") + print("r28 ", hex(c.r28()), "\t") + print("r29 ", hex(c.r29()), "\n") + print("r30 ", hex(c.r30()), "\t") + print("r31 ", hex(c.r31()), "\n") + print("pc ", hex(c.pc()), "\t") + print("link ", hex(c.link()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.link()) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange link, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save LINK to stack so that panics in leaf + // functions are correctly handled. 
This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - goarch.PtrSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.link())) { + // Make it look like the faulting PC called sigpanic. + c.set_link(uint64(pc)) + } + + // In case we are panicking from external C code + sigpanicPC := uint64(abi.FuncPCABIInternal(sigpanic)) + c.set_r31(sigpanicPC >> 32 << 32) // RSB register + c.set_r22(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_pc(sigpanicPC) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - 8 + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_link(uint64(resumePC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_mips64x.go b/src/runtime/signal_mips64x.go new file mode 100644 index 0000000..cee1bf7 --- /dev/null +++ b/src/runtime/signal_mips64x.go @@ -0,0 +1,100 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +//go:build (linux || openbsd) && (mips64 || mips64le) + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\t") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\t") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\t") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\t") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\t") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\t") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\t") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\t") + print("r15 ", hex(c.r15()), "\n") + print("r16 ", hex(c.r16()), "\t") + print("r17 ", hex(c.r17()), "\n") + print("r18 ", hex(c.r18()), "\t") + print("r19 ", hex(c.r19()), "\n") + print("r20 ", hex(c.r20()), "\t") + print("r21 ", hex(c.r21()), "\n") + print("r22 ", hex(c.r22()), "\t") + print("r23 ", hex(c.r23()), "\n") + print("r24 ", hex(c.r24()), "\t") + print("r25 ", hex(c.r25()), "\n") + print("r26 ", hex(c.r26()), "\t") + print("r27 ", hex(c.r27()), "\n") + print("r28 ", hex(c.r28()), "\t") + print("r29 ", hex(c.r29()), "\n") + print("r30 ", hex(c.r30()), "\t") + print("r31 ", hex(c.r31()), "\n") + print("pc ", hex(c.pc()), "\t") + print("link ", hex(c.link()), "\n") + print("lo ", hex(c.lo()), "\t") + print("hi ", hex(c.hi()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.link()) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange link, and pc to pretend the panicking + // function calls sigpanic directly. 
+ // Always save LINK to stack so that panics in leaf + // functions are correctly handled. This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - goarch.PtrSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.link())) { + // Make it look like the faulting PC called sigpanic. + c.set_link(uint64(pc)) + } + + // In case we are panicking from external C code + sigpanicPC := uint64(abi.FuncPCABIInternal(sigpanic)) + c.set_r28(sigpanicPC >> 32 << 32) // RSB register + c.set_r30(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_pc(sigpanicPC) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - 8 + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_link(uint64(resumePC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_mipsx.go b/src/runtime/signal_mipsx.go new file mode 100644 index 0000000..ba92655 --- /dev/null +++ b/src/runtime/signal_mipsx.go @@ -0,0 +1,95 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +//go:build linux && (mips || mipsle) + +package runtime + +import ( + "internal/abi" + "runtime/internal/sys" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\t") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\t") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\t") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\t") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\t") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\t") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\t") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\t") + print("r15 ", hex(c.r15()), "\n") + print("r16 ", hex(c.r16()), "\t") + print("r17 ", hex(c.r17()), "\n") + print("r18 ", hex(c.r18()), "\t") + print("r19 ", hex(c.r19()), "\n") + print("r20 ", hex(c.r20()), "\t") + print("r21 ", hex(c.r21()), "\n") + print("r22 ", hex(c.r22()), "\t") + print("r23 ", hex(c.r23()), "\n") + print("r24 ", hex(c.r24()), "\t") + print("r25 ", hex(c.r25()), "\n") + print("r26 ", hex(c.r26()), "\t") + print("r27 ", hex(c.r27()), "\n") + print("r28 ", hex(c.r28()), "\t") + print("r29 ", hex(c.r29()), "\n") + print("r30 ", hex(c.r30()), "\t") + print("r31 ", hex(c.r31()), "\n") + print("pc ", hex(c.pc()), "\t") + print("link ", hex(c.link()), "\n") + print("lo ", hex(c.lo()), "\t") + print("hi ", hex(c.hi()), "\n") +} + +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.link()) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange link, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save LINK to stack so that panics in leaf + // functions are correctly handled. 
This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - sys.MinFrameSize + c.set_sp(sp) + *(*uint32)(unsafe.Pointer(uintptr(sp))) = c.link() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.link())) { + // Make it look like the faulting PC called sigpanic. + c.set_link(uint32(pc)) + } + + // In case we are panicking from external C code + c.set_r30(uint32(uintptr(unsafe.Pointer(gp)))) + c.set_pc(uint32(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - 4 + c.set_sp(sp) + *(*uint32)(unsafe.Pointer(uintptr(sp))) = c.link() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_link(uint32(resumePC)) + c.set_pc(uint32(targetPC)) +} diff --git a/src/runtime/signal_netbsd.go b/src/runtime/signal_netbsd.go new file mode 100644 index 0000000..ca51084 --- /dev/null +++ b/src/runtime/signal_netbsd.go @@ -0,0 +1,41 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT: emulate instruction executed"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 17 */ {0, "SIGSTOP: stop"}, + /* 18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 19 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue after stop"}, + /* 20 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify + _SigIgn, "SIGIO: i/o now possible"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 26 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify + _SigIgn, "SIGINFO: status request from keyboard"}, + /* 30 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 
31 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 32 */ {_SigNotify, "SIGTHR: reserved"}, +} diff --git a/src/runtime/signal_netbsd_386.go b/src/runtime/signal_netbsd_386.go new file mode 100644 index 0000000..845a575 --- /dev/null +++ b/src/runtime/signal_netbsd_386.go @@ -0,0 +1,45 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontextt { return &(*ucontextt)(c.ctxt).uc_mcontext } + +func (c *sigctxt) eax() uint32 { return c.regs().__gregs[_REG_EAX] } +func (c *sigctxt) ebx() uint32 { return c.regs().__gregs[_REG_EBX] } +func (c *sigctxt) ecx() uint32 { return c.regs().__gregs[_REG_ECX] } +func (c *sigctxt) edx() uint32 { return c.regs().__gregs[_REG_EDX] } +func (c *sigctxt) edi() uint32 { return c.regs().__gregs[_REG_EDI] } +func (c *sigctxt) esi() uint32 { return c.regs().__gregs[_REG_ESI] } +func (c *sigctxt) ebp() uint32 { return c.regs().__gregs[_REG_EBP] } +func (c *sigctxt) esp() uint32 { return c.regs().__gregs[_REG_UESP] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) eip() uint32 { return c.regs().__gregs[_REG_EIP] } + +func (c *sigctxt) eflags() uint32 { return c.regs().__gregs[_REG_EFL] } +func (c *sigctxt) cs() uint32 { return c.regs().__gregs[_REG_CS] } +func (c *sigctxt) fs() uint32 { return c.regs().__gregs[_REG_FS] } +func (c *sigctxt) gs() uint32 { return c.regs().__gregs[_REG_GS] } +func (c *sigctxt) sigcode() uint32 { return uint32(c.info._code) } +func (c *sigctxt) sigaddr() uint32 { + return *(*uint32)(unsafe.Pointer(&c.info._reason[0])) +} + +func (c *sigctxt) set_eip(x uint32) { c.regs().__gregs[_REG_EIP] = x } +func (c *sigctxt) set_esp(x uint32) { c.regs().__gregs[_REG_UESP] = x } +func (c *sigctxt) set_sigcode(x uint32) { 
c.info._code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + *(*uint32)(unsafe.Pointer(&c.info._reason[0])) = x +} diff --git a/src/runtime/signal_netbsd_amd64.go b/src/runtime/signal_netbsd_amd64.go new file mode 100644 index 0000000..67fe437 --- /dev/null +++ b/src/runtime/signal_netbsd_amd64.go @@ -0,0 +1,55 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontextt { + return (*mcontextt)(unsafe.Pointer(&(*ucontextt)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) rax() uint64 { return c.regs().__gregs[_REG_RAX] } +func (c *sigctxt) rbx() uint64 { return c.regs().__gregs[_REG_RBX] } +func (c *sigctxt) rcx() uint64 { return c.regs().__gregs[_REG_RCX] } +func (c *sigctxt) rdx() uint64 { return c.regs().__gregs[_REG_RDX] } +func (c *sigctxt) rdi() uint64 { return c.regs().__gregs[_REG_RDI] } +func (c *sigctxt) rsi() uint64 { return c.regs().__gregs[_REG_RSI] } +func (c *sigctxt) rbp() uint64 { return c.regs().__gregs[_REG_RBP] } +func (c *sigctxt) rsp() uint64 { return c.regs().__gregs[_REG_RSP] } +func (c *sigctxt) r8() uint64 { return c.regs().__gregs[_REG_R8] } +func (c *sigctxt) r9() uint64 { return c.regs().__gregs[_REG_R9] } +func (c *sigctxt) r10() uint64 { return c.regs().__gregs[_REG_R10] } +func (c *sigctxt) r11() uint64 { return c.regs().__gregs[_REG_R11] } +func (c *sigctxt) r12() uint64 { return c.regs().__gregs[_REG_R12] } +func (c *sigctxt) r13() uint64 { return c.regs().__gregs[_REG_R13] } +func (c *sigctxt) r14() uint64 { return c.regs().__gregs[_REG_R14] } +func (c *sigctxt) r15() uint64 { return c.regs().__gregs[_REG_R15] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().__gregs[_REG_RIP] } + +func (c *sigctxt) 
rflags() uint64 { return c.regs().__gregs[_REG_RFLAGS] } +func (c *sigctxt) cs() uint64 { return c.regs().__gregs[_REG_CS] } +func (c *sigctxt) fs() uint64 { return c.regs().__gregs[_REG_FS] } +func (c *sigctxt) gs() uint64 { return c.regs().__gregs[_REG_GS] } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info._code) } +func (c *sigctxt) sigaddr() uint64 { + return *(*uint64)(unsafe.Pointer(&c.info._reason[0])) +} + +func (c *sigctxt) set_rip(x uint64) { c.regs().__gregs[_REG_RIP] = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().__gregs[_REG_RSP] = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info._code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uint64)(unsafe.Pointer(&c.info._reason[0])) = x +} diff --git a/src/runtime/signal_netbsd_arm.go b/src/runtime/signal_netbsd_arm.go new file mode 100644 index 0000000..fdb3078 --- /dev/null +++ b/src/runtime/signal_netbsd_arm.go @@ -0,0 +1,55 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontextt { return &(*ucontextt)(c.ctxt).uc_mcontext } + +func (c *sigctxt) r0() uint32 { return c.regs().__gregs[_REG_R0] } +func (c *sigctxt) r1() uint32 { return c.regs().__gregs[_REG_R1] } +func (c *sigctxt) r2() uint32 { return c.regs().__gregs[_REG_R2] } +func (c *sigctxt) r3() uint32 { return c.regs().__gregs[_REG_R3] } +func (c *sigctxt) r4() uint32 { return c.regs().__gregs[_REG_R4] } +func (c *sigctxt) r5() uint32 { return c.regs().__gregs[_REG_R5] } +func (c *sigctxt) r6() uint32 { return c.regs().__gregs[_REG_R6] } +func (c *sigctxt) r7() uint32 { return c.regs().__gregs[_REG_R7] } +func (c *sigctxt) r8() uint32 { return c.regs().__gregs[_REG_R8] } +func (c *sigctxt) r9() uint32 { return c.regs().__gregs[_REG_R9] } +func (c *sigctxt) r10() uint32 { return c.regs().__gregs[_REG_R10] } +func (c *sigctxt) fp() uint32 { return c.regs().__gregs[_REG_R11] } +func (c *sigctxt) ip() uint32 { return c.regs().__gregs[_REG_R12] } +func (c *sigctxt) sp() uint32 { return c.regs().__gregs[_REG_R13] } +func (c *sigctxt) lr() uint32 { return c.regs().__gregs[_REG_R14] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint32 { return c.regs().__gregs[_REG_R15] } + +func (c *sigctxt) cpsr() uint32 { return c.regs().__gregs[_REG_CPSR] } +func (c *sigctxt) fault() uintptr { return uintptr(c.info._reason) } +func (c *sigctxt) trap() uint32 { return 0 } +func (c *sigctxt) error() uint32 { return 0 } +func (c *sigctxt) oldmask() uint32 { return 0 } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info._code) } +func (c *sigctxt) sigaddr() uint32 { return uint32(c.info._reason) } + +func (c *sigctxt) set_pc(x uint32) { c.regs().__gregs[_REG_R15] = x } +func (c *sigctxt) set_sp(x uint32) { c.regs().__gregs[_REG_R13] = x } +func (c *sigctxt) set_lr(x uint32) { c.regs().__gregs[_REG_R14] = 
x } +func (c *sigctxt) set_r10(x uint32) { c.regs().__gregs[_REG_R10] = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info._code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + c.info._reason = uintptr(x) +} diff --git a/src/runtime/signal_netbsd_arm64.go b/src/runtime/signal_netbsd_arm64.go new file mode 100644 index 0000000..8dfdfea --- /dev/null +++ b/src/runtime/signal_netbsd_arm64.go @@ -0,0 +1,73 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontextt { + return (*mcontextt)(unsafe.Pointer(&(*ucontextt)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) r0() uint64 { return c.regs().__gregs[_REG_X0] } +func (c *sigctxt) r1() uint64 { return c.regs().__gregs[_REG_X1] } +func (c *sigctxt) r2() uint64 { return c.regs().__gregs[_REG_X2] } +func (c *sigctxt) r3() uint64 { return c.regs().__gregs[_REG_X3] } +func (c *sigctxt) r4() uint64 { return c.regs().__gregs[_REG_X4] } +func (c *sigctxt) r5() uint64 { return c.regs().__gregs[_REG_X5] } +func (c *sigctxt) r6() uint64 { return c.regs().__gregs[_REG_X6] } +func (c *sigctxt) r7() uint64 { return c.regs().__gregs[_REG_X7] } +func (c *sigctxt) r8() uint64 { return c.regs().__gregs[_REG_X8] } +func (c *sigctxt) r9() uint64 { return c.regs().__gregs[_REG_X9] } +func (c *sigctxt) r10() uint64 { return c.regs().__gregs[_REG_X10] } +func (c *sigctxt) r11() uint64 { return c.regs().__gregs[_REG_X11] } +func (c *sigctxt) r12() uint64 { return c.regs().__gregs[_REG_X12] } +func (c *sigctxt) r13() uint64 { return c.regs().__gregs[_REG_X13] } +func (c *sigctxt) r14() uint64 { return c.regs().__gregs[_REG_X14] } +func (c *sigctxt) r15() uint64 { return c.regs().__gregs[_REG_X15] } +func (c *sigctxt) r16() uint64 { return 
c.regs().__gregs[_REG_X16] } +func (c *sigctxt) r17() uint64 { return c.regs().__gregs[_REG_X17] } +func (c *sigctxt) r18() uint64 { return c.regs().__gregs[_REG_X18] } +func (c *sigctxt) r19() uint64 { return c.regs().__gregs[_REG_X19] } +func (c *sigctxt) r20() uint64 { return c.regs().__gregs[_REG_X20] } +func (c *sigctxt) r21() uint64 { return c.regs().__gregs[_REG_X21] } +func (c *sigctxt) r22() uint64 { return c.regs().__gregs[_REG_X22] } +func (c *sigctxt) r23() uint64 { return c.regs().__gregs[_REG_X23] } +func (c *sigctxt) r24() uint64 { return c.regs().__gregs[_REG_X24] } +func (c *sigctxt) r25() uint64 { return c.regs().__gregs[_REG_X25] } +func (c *sigctxt) r26() uint64 { return c.regs().__gregs[_REG_X26] } +func (c *sigctxt) r27() uint64 { return c.regs().__gregs[_REG_X27] } +func (c *sigctxt) r28() uint64 { return c.regs().__gregs[_REG_X28] } +func (c *sigctxt) r29() uint64 { return c.regs().__gregs[_REG_X29] } +func (c *sigctxt) lr() uint64 { return c.regs().__gregs[_REG_X30] } +func (c *sigctxt) sp() uint64 { return c.regs().__gregs[_REG_X31] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().__gregs[_REG_ELR] } + +func (c *sigctxt) fault() uintptr { return uintptr(c.info._reason) } +func (c *sigctxt) trap() uint64 { return 0 } +func (c *sigctxt) error() uint64 { return 0 } +func (c *sigctxt) oldmask() uint64 { return 0 } + +func (c *sigctxt) sigcode() uint64 { return uint64(c.info._code) } +func (c *sigctxt) sigaddr() uint64 { return uint64(c.info._reason) } + +func (c *sigctxt) set_pc(x uint64) { c.regs().__gregs[_REG_ELR] = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().__gregs[_REG_X31] = x } +func (c *sigctxt) set_lr(x uint64) { c.regs().__gregs[_REG_X30] = x } +func (c *sigctxt) set_r28(x uint64) { c.regs().__gregs[_REG_X28] = x } + +func (c *sigctxt) set_sigcode(x uint64) { c.info._code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + c.info._reason = uintptr(x) +} diff --git 
a/src/runtime/signal_openbsd.go b/src/runtime/signal_openbsd.go new file mode 100644 index 0000000..d2c5c5e --- /dev/null +++ b/src/runtime/signal_openbsd.go @@ -0,0 +1,41 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT: emulate instruction executed"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 17 */ {0, "SIGSTOP: stop"}, + /* 18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 19 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue after stop"}, + /* 20 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify, "SIGIO: i/o now possible"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 26 */ 
{_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify, "SIGINFO: status request from keyboard"}, + /* 30 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 31 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 32 */ {0, "SIGTHR: reserved"}, // thread AST - cannot be registered. +} diff --git a/src/runtime/signal_openbsd_386.go b/src/runtime/signal_openbsd_386.go new file mode 100644 index 0000000..2fc4b1d --- /dev/null +++ b/src/runtime/signal_openbsd_386.go @@ -0,0 +1,47 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(c.ctxt) +} + +func (c *sigctxt) eax() uint32 { return c.regs().sc_eax } +func (c *sigctxt) ebx() uint32 { return c.regs().sc_ebx } +func (c *sigctxt) ecx() uint32 { return c.regs().sc_ecx } +func (c *sigctxt) edx() uint32 { return c.regs().sc_edx } +func (c *sigctxt) edi() uint32 { return c.regs().sc_edi } +func (c *sigctxt) esi() uint32 { return c.regs().sc_esi } +func (c *sigctxt) ebp() uint32 { return c.regs().sc_ebp } +func (c *sigctxt) esp() uint32 { return c.regs().sc_esp } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) eip() uint32 { return c.regs().sc_eip } + +func (c *sigctxt) eflags() uint32 { return c.regs().sc_eflags } +func (c *sigctxt) cs() uint32 { return c.regs().sc_cs } +func (c *sigctxt) fs() uint32 { return c.regs().sc_fs } +func (c *sigctxt) gs() uint32 { return c.regs().sc_gs } +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { + return *(*uint32)(add(unsafe.Pointer(c.info), 12)) +} + +func 
(c *sigctxt) set_eip(x uint32) { c.regs().sc_eip = x } +func (c *sigctxt) set_esp(x uint32) { c.regs().sc_esp = x } +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + *(*uint32)(add(unsafe.Pointer(c.info), 12)) = x +} diff --git a/src/runtime/signal_openbsd_amd64.go b/src/runtime/signal_openbsd_amd64.go new file mode 100644 index 0000000..091a88a --- /dev/null +++ b/src/runtime/signal_openbsd_amd64.go @@ -0,0 +1,55 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(c.ctxt) +} + +func (c *sigctxt) rax() uint64 { return c.regs().sc_rax } +func (c *sigctxt) rbx() uint64 { return c.regs().sc_rbx } +func (c *sigctxt) rcx() uint64 { return c.regs().sc_rcx } +func (c *sigctxt) rdx() uint64 { return c.regs().sc_rdx } +func (c *sigctxt) rdi() uint64 { return c.regs().sc_rdi } +func (c *sigctxt) rsi() uint64 { return c.regs().sc_rsi } +func (c *sigctxt) rbp() uint64 { return c.regs().sc_rbp } +func (c *sigctxt) rsp() uint64 { return c.regs().sc_rsp } +func (c *sigctxt) r8() uint64 { return c.regs().sc_r8 } +func (c *sigctxt) r9() uint64 { return c.regs().sc_r9 } +func (c *sigctxt) r10() uint64 { return c.regs().sc_r10 } +func (c *sigctxt) r11() uint64 { return c.regs().sc_r11 } +func (c *sigctxt) r12() uint64 { return c.regs().sc_r12 } +func (c *sigctxt) r13() uint64 { return c.regs().sc_r13 } +func (c *sigctxt) r14() uint64 { return c.regs().sc_r14 } +func (c *sigctxt) r15() uint64 { return c.regs().sc_r15 } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return c.regs().sc_rip } + +func (c *sigctxt) rflags() uint64 { return c.regs().sc_rflags } +func (c *sigctxt) cs() 
uint64 { return c.regs().sc_cs } +func (c *sigctxt) fs() uint64 { return c.regs().sc_fs } +func (c *sigctxt) gs() uint64 { return c.regs().sc_gs } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { + return *(*uint64)(add(unsafe.Pointer(c.info), 16)) +} + +func (c *sigctxt) set_rip(x uint64) { c.regs().sc_rip = x } +func (c *sigctxt) set_rsp(x uint64) { c.regs().sc_rsp = x } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uint64)(add(unsafe.Pointer(c.info), 16)) = x +} diff --git a/src/runtime/signal_openbsd_arm.go b/src/runtime/signal_openbsd_arm.go new file mode 100644 index 0000000..f796550 --- /dev/null +++ b/src/runtime/signal_openbsd_arm.go @@ -0,0 +1,59 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(c.ctxt) +} + +func (c *sigctxt) r0() uint32 { return c.regs().sc_r0 } +func (c *sigctxt) r1() uint32 { return c.regs().sc_r1 } +func (c *sigctxt) r2() uint32 { return c.regs().sc_r2 } +func (c *sigctxt) r3() uint32 { return c.regs().sc_r3 } +func (c *sigctxt) r4() uint32 { return c.regs().sc_r4 } +func (c *sigctxt) r5() uint32 { return c.regs().sc_r5 } +func (c *sigctxt) r6() uint32 { return c.regs().sc_r6 } +func (c *sigctxt) r7() uint32 { return c.regs().sc_r7 } +func (c *sigctxt) r8() uint32 { return c.regs().sc_r8 } +func (c *sigctxt) r9() uint32 { return c.regs().sc_r9 } +func (c *sigctxt) r10() uint32 { return c.regs().sc_r10 } +func (c *sigctxt) fp() uint32 { return c.regs().sc_r11 } +func (c *sigctxt) ip() uint32 { return c.regs().sc_r12 } +func (c *sigctxt) sp() uint32 { return c.regs().sc_usr_sp } +func (c 
*sigctxt) lr() uint32 { return c.regs().sc_usr_lr } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint32 { return c.regs().sc_pc } + +func (c *sigctxt) cpsr() uint32 { return c.regs().sc_spsr } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } +func (c *sigctxt) trap() uint32 { return 0 } +func (c *sigctxt) error() uint32 { return 0 } +func (c *sigctxt) oldmask() uint32 { return 0 } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint32 { + return *(*uint32)(add(unsafe.Pointer(c.info), 16)) +} + +func (c *sigctxt) set_pc(x uint32) { c.regs().sc_pc = x } +func (c *sigctxt) set_sp(x uint32) { c.regs().sc_usr_sp = x } +func (c *sigctxt) set_lr(x uint32) { c.regs().sc_usr_lr = x } +func (c *sigctxt) set_r10(x uint32) { c.regs().sc_r10 = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint32) { + *(*uint32)(add(unsafe.Pointer(c.info), 16)) = x +} diff --git a/src/runtime/signal_openbsd_arm64.go b/src/runtime/signal_openbsd_arm64.go new file mode 100644 index 0000000..3747b4f --- /dev/null +++ b/src/runtime/signal_openbsd_arm64.go @@ -0,0 +1,75 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(c.ctxt) +} + +func (c *sigctxt) r0() uint64 { return (uint64)(c.regs().sc_x[0]) } +func (c *sigctxt) r1() uint64 { return (uint64)(c.regs().sc_x[1]) } +func (c *sigctxt) r2() uint64 { return (uint64)(c.regs().sc_x[2]) } +func (c *sigctxt) r3() uint64 { return (uint64)(c.regs().sc_x[3]) } +func (c *sigctxt) r4() uint64 { return (uint64)(c.regs().sc_x[4]) } +func (c *sigctxt) r5() uint64 { return (uint64)(c.regs().sc_x[5]) } +func (c *sigctxt) r6() uint64 { return (uint64)(c.regs().sc_x[6]) } +func (c *sigctxt) r7() uint64 { return (uint64)(c.regs().sc_x[7]) } +func (c *sigctxt) r8() uint64 { return (uint64)(c.regs().sc_x[8]) } +func (c *sigctxt) r9() uint64 { return (uint64)(c.regs().sc_x[9]) } +func (c *sigctxt) r10() uint64 { return (uint64)(c.regs().sc_x[10]) } +func (c *sigctxt) r11() uint64 { return (uint64)(c.regs().sc_x[11]) } +func (c *sigctxt) r12() uint64 { return (uint64)(c.regs().sc_x[12]) } +func (c *sigctxt) r13() uint64 { return (uint64)(c.regs().sc_x[13]) } +func (c *sigctxt) r14() uint64 { return (uint64)(c.regs().sc_x[14]) } +func (c *sigctxt) r15() uint64 { return (uint64)(c.regs().sc_x[15]) } +func (c *sigctxt) r16() uint64 { return (uint64)(c.regs().sc_x[16]) } +func (c *sigctxt) r17() uint64 { return (uint64)(c.regs().sc_x[17]) } +func (c *sigctxt) r18() uint64 { return (uint64)(c.regs().sc_x[18]) } +func (c *sigctxt) r19() uint64 { return (uint64)(c.regs().sc_x[19]) } +func (c *sigctxt) r20() uint64 { return (uint64)(c.regs().sc_x[20]) } +func (c *sigctxt) r21() uint64 { return (uint64)(c.regs().sc_x[21]) } +func (c *sigctxt) r22() uint64 { return (uint64)(c.regs().sc_x[22]) } +func (c *sigctxt) r23() uint64 { return (uint64)(c.regs().sc_x[23]) } +func (c *sigctxt) r24() uint64 { return (uint64)(c.regs().sc_x[24]) } +func (c *sigctxt) r25() 
uint64 { return (uint64)(c.regs().sc_x[25]) } +func (c *sigctxt) r26() uint64 { return (uint64)(c.regs().sc_x[26]) } +func (c *sigctxt) r27() uint64 { return (uint64)(c.regs().sc_x[27]) } +func (c *sigctxt) r28() uint64 { return (uint64)(c.regs().sc_x[28]) } +func (c *sigctxt) r29() uint64 { return (uint64)(c.regs().sc_x[29]) } +func (c *sigctxt) lr() uint64 { return (uint64)(c.regs().sc_lr) } +func (c *sigctxt) sp() uint64 { return (uint64)(c.regs().sc_sp) } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return (uint64)(c.regs().sc_lr) } /* XXX */ + +func (c *sigctxt) fault() uint64 { return c.sigaddr() } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { + return *(*uint64)(add(unsafe.Pointer(c.info), 16)) +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return uint64(c.regs().sc_elr) } + +func (c *sigctxt) set_pc(x uint64) { c.regs().sc_elr = uintptr(x) } +func (c *sigctxt) set_sp(x uint64) { c.regs().sc_sp = uintptr(x) } +func (c *sigctxt) set_lr(x uint64) { c.regs().sc_lr = uintptr(x) } +func (c *sigctxt) set_r28(x uint64) { c.regs().sc_x[28] = uintptr(x) } + +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uint64)(add(unsafe.Pointer(c.info), 16)) = x +} diff --git a/src/runtime/signal_openbsd_mips64.go b/src/runtime/signal_openbsd_mips64.go new file mode 100644 index 0000000..54ed523 --- /dev/null +++ b/src/runtime/signal_openbsd_mips64.go @@ -0,0 +1,78 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "unsafe" +) + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *sigcontext { + return (*sigcontext)(c.ctxt) +} + +func (c *sigctxt) r0() uint64 { return c.regs().sc_regs[0] } +func (c *sigctxt) r1() uint64 { return c.regs().sc_regs[1] } +func (c *sigctxt) r2() uint64 { return c.regs().sc_regs[2] } +func (c *sigctxt) r3() uint64 { return c.regs().sc_regs[3] } +func (c *sigctxt) r4() uint64 { return c.regs().sc_regs[4] } +func (c *sigctxt) r5() uint64 { return c.regs().sc_regs[5] } +func (c *sigctxt) r6() uint64 { return c.regs().sc_regs[6] } +func (c *sigctxt) r7() uint64 { return c.regs().sc_regs[7] } +func (c *sigctxt) r8() uint64 { return c.regs().sc_regs[8] } +func (c *sigctxt) r9() uint64 { return c.regs().sc_regs[9] } +func (c *sigctxt) r10() uint64 { return c.regs().sc_regs[10] } +func (c *sigctxt) r11() uint64 { return c.regs().sc_regs[11] } +func (c *sigctxt) r12() uint64 { return c.regs().sc_regs[12] } +func (c *sigctxt) r13() uint64 { return c.regs().sc_regs[13] } +func (c *sigctxt) r14() uint64 { return c.regs().sc_regs[14] } +func (c *sigctxt) r15() uint64 { return c.regs().sc_regs[15] } +func (c *sigctxt) r16() uint64 { return c.regs().sc_regs[16] } +func (c *sigctxt) r17() uint64 { return c.regs().sc_regs[17] } +func (c *sigctxt) r18() uint64 { return c.regs().sc_regs[18] } +func (c *sigctxt) r19() uint64 { return c.regs().sc_regs[19] } +func (c *sigctxt) r20() uint64 { return c.regs().sc_regs[20] } +func (c *sigctxt) r21() uint64 { return c.regs().sc_regs[21] } +func (c *sigctxt) r22() uint64 { return c.regs().sc_regs[22] } +func (c *sigctxt) r23() uint64 { return c.regs().sc_regs[23] } +func (c *sigctxt) r24() uint64 { return c.regs().sc_regs[24] } +func (c *sigctxt) r25() uint64 { return c.regs().sc_regs[25] } +func (c *sigctxt) r26() uint64 { return c.regs().sc_regs[26] } +func (c *sigctxt) r27() uint64 { return c.regs().sc_regs[27] } 
+func (c *sigctxt) r28() uint64 { return c.regs().sc_regs[28] } +func (c *sigctxt) r29() uint64 { return c.regs().sc_regs[29] } +func (c *sigctxt) r30() uint64 { return c.regs().sc_regs[30] } +func (c *sigctxt) r31() uint64 { return c.regs().sc_regs[31] } +func (c *sigctxt) sp() uint64 { return c.regs().sc_regs[29] } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) pc() uint64 { return c.regs().sc_pc } + +func (c *sigctxt) link() uint64 { return c.regs().sc_regs[31] } +func (c *sigctxt) lo() uint64 { return c.regs().mullo } +func (c *sigctxt) hi() uint64 { return c.regs().mulhi } + +func (c *sigctxt) sigcode() uint32 { return uint32(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { + return *(*uint64)(add(unsafe.Pointer(c.info), 16)) +} + +func (c *sigctxt) set_r28(x uint64) { c.regs().sc_regs[28] = x } +func (c *sigctxt) set_r30(x uint64) { c.regs().sc_regs[30] = x } +func (c *sigctxt) set_pc(x uint64) { c.regs().sc_pc = x } +func (c *sigctxt) set_sp(x uint64) { c.regs().sc_regs[29] = x } +func (c *sigctxt) set_link(x uint64) { c.regs().sc_regs[31] = x } + +func (c *sigctxt) set_sigcode(x uint32) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uint64)(add(unsafe.Pointer(c.info), 16)) = x +} diff --git a/src/runtime/signal_plan9.go b/src/runtime/signal_plan9.go new file mode 100644 index 0000000..d3894c8 --- /dev/null +++ b/src/runtime/signal_plan9.go @@ -0,0 +1,57 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +type sigTabT struct { + flags int + name string +} + +// Incoming notes are compared against this table using strncmp, so the +// order matters: longer patterns must appear before their prefixes. +// There are _SIG constants in os2_plan9.go for the table index of some +// of these. 
+// +// If you add entries to this table, you must respect the prefix ordering +// and also update the constant values in os2_plan9.go. +var sigtable = [...]sigTabT{ + // Traps that cannot be recovered. + {_SigThrow, "sys: trap: debug exception"}, + {_SigThrow, "sys: trap: invalid opcode"}, + + // We can recover from some memory errors in runtime·sigpanic. + {_SigPanic, "sys: trap: fault read"}, // SIGRFAULT + {_SigPanic, "sys: trap: fault write"}, // SIGWFAULT + + // We can also recover from math errors. + {_SigPanic, "sys: trap: divide error"}, // SIGINTDIV + {_SigPanic, "sys: fp:"}, // SIGFLOAT + + // All other traps are normally handled as if they were marked SigThrow. + // We mark them SigPanic here so that debug.SetPanicOnFault will work. + {_SigPanic, "sys: trap:"}, // SIGTRAP + + // Writes to a closed pipe can be handled if desired, otherwise they're ignored. + {_SigNotify, "sys: write on closed pipe"}, + + // Other system notes are more serious and cannot be recovered. + {_SigThrow, "sys:"}, + + // Issued to all other procs when calling runtime·exit. + {_SigGoExit, "go: exit "}, + + // Kill is sent by external programs to cause an exit. + {_SigKill, "kill"}, + + // Interrupts can be handled if desired, otherwise they cause an exit. + {_SigNotify + _SigKill, "interrupt"}, + {_SigNotify + _SigKill, "hangup"}, + + // Alarms can be handled if desired, otherwise they're ignored. + {_SigNotify, "alarm"}, + + // Aborts can be handled if desired, otherwise they cause a stack trace. + {_SigNotify + _SigThrow, "abort"}, +} diff --git a/src/runtime/signal_ppc64x.go b/src/runtime/signal_ppc64x.go new file mode 100644 index 0000000..bdd3540 --- /dev/null +++ b/src/runtime/signal_ppc64x.go @@ -0,0 +1,111 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build (aix || linux) && (ppc64 || ppc64le) + +package runtime + +import ( + "internal/abi" + "runtime/internal/sys" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("r0 ", hex(c.r0()), "\t") + print("r1 ", hex(c.r1()), "\n") + print("r2 ", hex(c.r2()), "\t") + print("r3 ", hex(c.r3()), "\n") + print("r4 ", hex(c.r4()), "\t") + print("r5 ", hex(c.r5()), "\n") + print("r6 ", hex(c.r6()), "\t") + print("r7 ", hex(c.r7()), "\n") + print("r8 ", hex(c.r8()), "\t") + print("r9 ", hex(c.r9()), "\n") + print("r10 ", hex(c.r10()), "\t") + print("r11 ", hex(c.r11()), "\n") + print("r12 ", hex(c.r12()), "\t") + print("r13 ", hex(c.r13()), "\n") + print("r14 ", hex(c.r14()), "\t") + print("r15 ", hex(c.r15()), "\n") + print("r16 ", hex(c.r16()), "\t") + print("r17 ", hex(c.r17()), "\n") + print("r18 ", hex(c.r18()), "\t") + print("r19 ", hex(c.r19()), "\n") + print("r20 ", hex(c.r20()), "\t") + print("r21 ", hex(c.r21()), "\n") + print("r22 ", hex(c.r22()), "\t") + print("r23 ", hex(c.r23()), "\n") + print("r24 ", hex(c.r24()), "\t") + print("r25 ", hex(c.r25()), "\n") + print("r26 ", hex(c.r26()), "\t") + print("r27 ", hex(c.r27()), "\n") + print("r28 ", hex(c.r28()), "\t") + print("r29 ", hex(c.r29()), "\n") + print("r30 ", hex(c.r30()), "\t") + print("r31 ", hex(c.r31()), "\n") + print("pc ", hex(c.pc()), "\t") + print("ctr ", hex(c.ctr()), "\n") + print("link ", hex(c.link()), "\t") + print("xer ", hex(c.xer()), "\n") + print("ccr ", hex(c.ccr()), "\t") + print("trap ", hex(c.trap()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.link()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange link, and pc to pretend the panicking + // function calls sigpanic directly. 
+ // Always save LINK to stack so that panics in leaf + // functions are correctly handled. This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - sys.MinFrameSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.link())) { + // Make it look like the faulting PC called sigpanic. + c.set_link(uint64(pc)) + } + + // In case we are panicking from external C code + c.set_r0(0) + c.set_r30(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_r12(uint64(abi.FuncPCABIInternal(sigpanic))) + c.set_pc(uint64(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra space is known to gentraceback. + sp := c.sp() - sys.MinFrameSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.link() + // In PIC mode, we'll set up (i.e. clobber) R2 on function + // entry. Save it ahead of time. + // PIC mode also requires that R12 point to the function entry, + // so we'll set it up when pushing the call. Save it ahead + // of time as well. + // 8(SP) and 16(SP) are unused space in the reserved + // MinFrameSize (32) bytes. + *(*uint64)(unsafe.Pointer(uintptr(sp) + 8)) = c.r2() + *(*uint64)(unsafe.Pointer(uintptr(sp) + 16)) = c.r12() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_link(uint64(resumePC)) + c.set_r12(uint64(targetPC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_riscv64.go b/src/runtime/signal_riscv64.go new file mode 100644 index 0000000..b8d7b97 --- /dev/null +++ b/src/runtime/signal_riscv64.go @@ -0,0 +1,94 @@ +// Copyright 2016 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux || freebsd) && riscv64 + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +func dumpregs(c *sigctxt) { + print("ra ", hex(c.ra()), "\t") + print("sp ", hex(c.sp()), "\n") + print("gp ", hex(c.gp()), "\t") + print("tp ", hex(c.tp()), "\n") + print("t0 ", hex(c.t0()), "\t") + print("t1 ", hex(c.t1()), "\n") + print("t2 ", hex(c.t2()), "\t") + print("s0 ", hex(c.s0()), "\n") + print("s1 ", hex(c.s1()), "\t") + print("a0 ", hex(c.a0()), "\n") + print("a1 ", hex(c.a1()), "\t") + print("a2 ", hex(c.a2()), "\n") + print("a3 ", hex(c.a3()), "\t") + print("a4 ", hex(c.a4()), "\n") + print("a5 ", hex(c.a5()), "\t") + print("a6 ", hex(c.a6()), "\n") + print("a7 ", hex(c.a7()), "\t") + print("s2 ", hex(c.s2()), "\n") + print("s3 ", hex(c.s3()), "\t") + print("s4 ", hex(c.s4()), "\n") + print("s5 ", hex(c.s5()), "\t") + print("s6 ", hex(c.s6()), "\n") + print("s7 ", hex(c.s7()), "\t") + print("s8 ", hex(c.s8()), "\n") + print("s9 ", hex(c.s9()), "\t") + print("s10 ", hex(c.s10()), "\n") + print("s11 ", hex(c.s11()), "\t") + print("t3 ", hex(c.t3()), "\n") + print("t4 ", hex(c.t4()), "\t") + print("t5 ", hex(c.t5()), "\n") + print("t6 ", hex(c.t6()), "\t") + print("pc ", hex(c.pc()), "\n") +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) sigpc() uintptr { return uintptr(c.pc()) } + +func (c *sigctxt) sigsp() uintptr { return uintptr(c.sp()) } +func (c *sigctxt) siglr() uintptr { return uintptr(c.ra()) } +func (c *sigctxt) fault() uintptr { return uintptr(c.sigaddr()) } + +// preparePanic sets up the stack to look like a call to sigpanic. +func (c *sigctxt) preparePanic(sig uint32, gp *g) { + // We arrange RA, and pc to pretend the panicking + // function calls sigpanic directly. + // Always save RA to stack so that panics in leaf + // functions are correctly handled. 
This smashes + // the stack frame but we're not going back there + // anyway. + sp := c.sp() - goarch.PtrSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.ra() + + pc := gp.sigpc + + if shouldPushSigpanic(gp, pc, uintptr(c.ra())) { + // Make it look like the faulting PC called sigpanic. + c.set_ra(uint64(pc)) + } + + // In case we are panicking from external C code + c.set_gp(uint64(uintptr(unsafe.Pointer(gp)))) + c.set_pc(uint64(abi.FuncPCABIInternal(sigpanic))) +} + +func (c *sigctxt) pushCall(targetPC, resumePC uintptr) { + // Push the LR to stack, as we'll clobber it in order to + // push the call. The function being pushed is responsible + // for restoring the LR and setting the SP back. + // This extra slot is known to gentraceback. + sp := c.sp() - goarch.PtrSize + c.set_sp(sp) + *(*uint64)(unsafe.Pointer(uintptr(sp))) = c.ra() + // Set up PC and LR to pretend the function being signaled + // calls targetPC at resumePC. + c.set_ra(uint64(resumePC)) + c.set_pc(uint64(targetPC)) +} diff --git a/src/runtime/signal_solaris.go b/src/runtime/signal_solaris.go new file mode 100644 index 0000000..25f8ad5 --- /dev/null +++ b/src/runtime/signal_solaris.go @@ -0,0 +1,83 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt (rubout)"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit (ASCII FS)"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction (not reset when caught)"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap (not reset when caught)"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: used by abort, replace SIGIOT in the future"}, + /* 7 */ {_SigThrow, "SIGEMT: EMT instruction"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating point exception"}, + /* 9 */ {0, "SIGKILL: kill (cannot be caught or ignored)"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad argument to system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write on a pipe with no one to read it"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: software termination signal from kill"}, + /* 16 */ {_SigNotify, "SIGUSR1: user defined signal 1"}, + /* 17 */ {_SigNotify, "SIGUSR2: user defined signal 2"}, + /* 18 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status change alias (POSIX)"}, + /* 19 */ {_SigNotify, "SIGPWR: power-fail restart"}, + /* 20 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 21 */ {_SigNotify + _SigIgn, "SIGURG: urgent socket condition"}, + /* 22 */ {_SigNotify, "SIGPOLL: pollable event occurred"}, + /* 23 */ {0, "SIGSTOP: stop (cannot be caught or ignored)"}, + /* 24 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: user stop requested from tty"}, + /* 25 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: stopped process has been continued"}, + /* 26 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background tty read attempted"}, + /* 27 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: 
background tty write attempted"}, + /* 28 */ {_SigNotify, "SIGVTALRM: virtual timer expired"}, + /* 29 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling timer expired"}, + /* 30 */ {_SigNotify, "SIGXCPU: exceeded cpu limit"}, + /* 31 */ {_SigNotify, "SIGXFSZ: exceeded file size limit"}, + /* 32 */ {_SigNotify, "SIGWAITING: reserved signal no longer used by"}, + /* 33 */ {_SigNotify, "SIGLWP: reserved signal no longer used by"}, + /* 34 */ {_SigNotify, "SIGFREEZE: special signal used by CPR"}, + /* 35 */ {_SigNotify, "SIGTHAW: special signal used by CPR"}, + /* 36 */ {_SigSetStack + _SigUnblock, "SIGCANCEL: reserved signal for thread cancellation"}, // Oracle's spelling of cancellation. + /* 37 */ {_SigNotify, "SIGLOST: resource lost (eg, record-lock lost)"}, + /* 38 */ {_SigNotify, "SIGXRES: resource control exceeded"}, + /* 39 */ {_SigNotify, "SIGJVM1: reserved signal for Java Virtual Machine"}, + /* 40 */ {_SigNotify, "SIGJVM2: reserved signal for Java Virtual Machine"}, + + /* TODO(aram): what should we do about these signals? _SigDefault or _SigNotify? Is this set static?
*/ + /* 41 */ {_SigNotify, "real time signal"}, + /* 42 */ {_SigNotify, "real time signal"}, + /* 43 */ {_SigNotify, "real time signal"}, + /* 44 */ {_SigNotify, "real time signal"}, + /* 45 */ {_SigNotify, "real time signal"}, + /* 46 */ {_SigNotify, "real time signal"}, + /* 47 */ {_SigNotify, "real time signal"}, + /* 48 */ {_SigNotify, "real time signal"}, + /* 49 */ {_SigNotify, "real time signal"}, + /* 50 */ {_SigNotify, "real time signal"}, + /* 51 */ {_SigNotify, "real time signal"}, + /* 52 */ {_SigNotify, "real time signal"}, + /* 53 */ {_SigNotify, "real time signal"}, + /* 54 */ {_SigNotify, "real time signal"}, + /* 55 */ {_SigNotify, "real time signal"}, + /* 56 */ {_SigNotify, "real time signal"}, + /* 57 */ {_SigNotify, "real time signal"}, + /* 58 */ {_SigNotify, "real time signal"}, + /* 59 */ {_SigNotify, "real time signal"}, + /* 60 */ {_SigNotify, "real time signal"}, + /* 61 */ {_SigNotify, "real time signal"}, + /* 62 */ {_SigNotify, "real time signal"}, + /* 63 */ {_SigNotify, "real time signal"}, + /* 64 */ {_SigNotify, "real time signal"}, + /* 65 */ {_SigNotify, "real time signal"}, + /* 66 */ {_SigNotify, "real time signal"}, + /* 67 */ {_SigNotify, "real time signal"}, + /* 68 */ {_SigNotify, "real time signal"}, + /* 69 */ {_SigNotify, "real time signal"}, + /* 70 */ {_SigNotify, "real time signal"}, + /* 71 */ {_SigNotify, "real time signal"}, + /* 72 */ {_SigNotify, "real time signal"}, +} diff --git a/src/runtime/signal_solaris_amd64.go b/src/runtime/signal_solaris_amd64.go new file mode 100644 index 0000000..b1da313 --- /dev/null +++ b/src/runtime/signal_solaris_amd64.go @@ -0,0 +1,53 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +type sigctxt struct { + info *siginfo + ctxt unsafe.Pointer +} + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) regs() *mcontext { + return (*mcontext)(unsafe.Pointer(&(*ucontext)(c.ctxt).uc_mcontext)) +} + +func (c *sigctxt) rax() uint64 { return uint64(c.regs().gregs[_REG_RAX]) } +func (c *sigctxt) rbx() uint64 { return uint64(c.regs().gregs[_REG_RBX]) } +func (c *sigctxt) rcx() uint64 { return uint64(c.regs().gregs[_REG_RCX]) } +func (c *sigctxt) rdx() uint64 { return uint64(c.regs().gregs[_REG_RDX]) } +func (c *sigctxt) rdi() uint64 { return uint64(c.regs().gregs[_REG_RDI]) } +func (c *sigctxt) rsi() uint64 { return uint64(c.regs().gregs[_REG_RSI]) } +func (c *sigctxt) rbp() uint64 { return uint64(c.regs().gregs[_REG_RBP]) } +func (c *sigctxt) rsp() uint64 { return uint64(c.regs().gregs[_REG_RSP]) } +func (c *sigctxt) r8() uint64 { return uint64(c.regs().gregs[_REG_R8]) } +func (c *sigctxt) r9() uint64 { return uint64(c.regs().gregs[_REG_R9]) } +func (c *sigctxt) r10() uint64 { return uint64(c.regs().gregs[_REG_R10]) } +func (c *sigctxt) r11() uint64 { return uint64(c.regs().gregs[_REG_R11]) } +func (c *sigctxt) r12() uint64 { return uint64(c.regs().gregs[_REG_R12]) } +func (c *sigctxt) r13() uint64 { return uint64(c.regs().gregs[_REG_R13]) } +func (c *sigctxt) r14() uint64 { return uint64(c.regs().gregs[_REG_R14]) } +func (c *sigctxt) r15() uint64 { return uint64(c.regs().gregs[_REG_R15]) } + +//go:nosplit +//go:nowritebarrierrec +func (c *sigctxt) rip() uint64 { return uint64(c.regs().gregs[_REG_RIP]) } + +func (c *sigctxt) rflags() uint64 { return uint64(c.regs().gregs[_REG_RFLAGS]) } +func (c *sigctxt) cs() uint64 { return uint64(c.regs().gregs[_REG_CS]) } +func (c *sigctxt) fs() uint64 { return uint64(c.regs().gregs[_REG_FS]) } +func (c *sigctxt) gs() uint64 { return uint64(c.regs().gregs[_REG_GS]) } +func (c *sigctxt) sigcode() uint64 { return uint64(c.info.si_code) } +func (c *sigctxt) sigaddr() uint64 { 
return *(*uint64)(unsafe.Pointer(&c.info.__data[0])) } + +func (c *sigctxt) set_rip(x uint64) { c.regs().gregs[_REG_RIP] = int64(x) } +func (c *sigctxt) set_rsp(x uint64) { c.regs().gregs[_REG_RSP] = int64(x) } +func (c *sigctxt) set_sigcode(x uint64) { c.info.si_code = int32(x) } +func (c *sigctxt) set_sigaddr(x uint64) { + *(*uintptr)(unsafe.Pointer(&c.info.__data[0])) = uintptr(x) +} diff --git a/src/runtime/signal_unix.go b/src/runtime/signal_unix.go new file mode 100644 index 0000000..c1abe62 --- /dev/null +++ b/src/runtime/signal_unix.go @@ -0,0 +1,1358 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// sigTabT is the type of an entry in the global sigtable array. +// sigtable is inherently system dependent, and appears in OS-specific files, +// but sigTabT is the same for all Unixy systems. +// The sigtable array is indexed by a system signal number to get the flags +// and printable name of each signal. +type sigTabT struct { + flags int32 + name string +} + +//go:linkname os_sigpipe os.sigpipe +func os_sigpipe() { + systemstack(sigpipe) +} + +func signame(sig uint32) string { + if sig >= uint32(len(sigtable)) { + return "" + } + return sigtable[sig].name +} + +const ( + _SIG_DFL uintptr = 0 + _SIG_IGN uintptr = 1 +) + +// sigPreempt is the signal used for non-cooperative preemption. +// +// There's no good way to choose this signal, but there are some +// heuristics: +// +// 1. It should be a signal that's passed-through by debuggers by +// default. On Linux, this is SIGALRM, SIGURG, SIGCHLD, SIGIO, +// SIGVTALRM, SIGPROF, and SIGWINCH, plus some glibc-internal signals. +// +// 2. 
It shouldn't be used internally by libc in mixed Go/C binaries +// because libc may assume it's the only thing that can handle these +// signals. For example SIGCANCEL or SIGSETXID. +// +// 3. It should be a signal that can happen spuriously without +// consequences. For example, SIGALRM is a bad choice because the +// signal handler can't tell if it was caused by the real process +// alarm or not (arguably this means the signal is broken, but I +// digress). SIGUSR1 and SIGUSR2 are also bad because those are often +// used in meaningful ways by applications. +// +// 4. We need to deal with platforms without real-time signals (like +// macOS), so those are out. +// +// We use SIGURG because it meets all of these criteria, is extremely +// unlikely to be used by an application for its "real" meaning (both +// because out-of-band data is basically unused and because SIGURG +// doesn't report which socket has the condition, making it pretty +// useless), and even if it is, the application has to be ready for +// spurious SIGURG. SIGIO wouldn't be a bad choice either, but is more +// likely to be used for real. +const sigPreempt = _SIGURG + +// Stores the signal handlers registered before Go installed its own. +// These signal handlers will be invoked in cases where Go doesn't want to +// handle a particular signal (e.g., signal occurred on a non-Go thread). +// See sigfwdgo for more information on when the signals are forwarded. +// +// This is read by the signal handler; accesses should use +// atomic.Loaduintptr and atomic.Storeuintptr. +var fwdSig [_NSIG]uintptr + +// handlingSig is indexed by signal number and is non-zero if we are +// currently handling the signal. Or, to put it another way, whether +// the signal handler is currently set to the Go signal handler or not. +// This is uint32 rather than bool so that we can use atomic instructions. 
+var handlingSig [_NSIG]uint32 + +// channels for synchronizing signal mask updates with the signal mask +// thread +var ( + disableSigChan chan uint32 + enableSigChan chan uint32 + maskUpdatedChan chan struct{} +) + +func init() { + // _NSIG is the number of signals on this operating system. + // sigtable should describe what to do for all the possible signals. + if len(sigtable) != _NSIG { + print("runtime: len(sigtable)=", len(sigtable), " _NSIG=", _NSIG, "\n") + throw("bad sigtable len") + } +} + +var signalsOK bool + +// Initialize signals. +// Called by libpreinit so runtime may not be initialized. +// +//go:nosplit +//go:nowritebarrierrec +func initsig(preinit bool) { + if !preinit { + // It's now OK for signal handlers to run. + signalsOK = true + } + + // For c-archive/c-shared this is called by libpreinit with + // preinit == true. + if (isarchive || islibrary) && !preinit { + return + } + + for i := uint32(0); i < _NSIG; i++ { + t := &sigtable[i] + if t.flags == 0 || t.flags&_SigDefault != 0 { + continue + } + + // We don't need to use atomic operations here because + // there shouldn't be any other goroutines running yet. + fwdSig[i] = getsig(i) + + if !sigInstallGoHandler(i) { + // Even if we are not installing a signal handler, + // set SA_ONSTACK if necessary. + if fwdSig[i] != _SIG_DFL && fwdSig[i] != _SIG_IGN { + setsigstack(i) + } else if fwdSig[i] == _SIG_IGN { + sigInitIgnored(i) + } + continue + } + + handlingSig[i] = 1 + setsig(i, abi.FuncPCABIInternal(sighandler)) + } +} + +//go:nosplit +//go:nowritebarrierrec +func sigInstallGoHandler(sig uint32) bool { + // For some signals, we respect an inherited SIG_IGN handler + // rather than insist on installing our own default handler. + // Even these signals can be fetched using the os/signal package. 
+ switch sig { + case _SIGHUP, _SIGINT: + if atomic.Loaduintptr(&fwdSig[sig]) == _SIG_IGN { + return false + } + } + + if (GOOS == "linux" || GOOS == "android") && !iscgo && sig == sigPerThreadSyscall { + // sigPerThreadSyscall is the same signal used by glibc for + // per-thread syscalls on Linux. We use it for the same purpose + // in non-cgo binaries. + return true + } + + t := &sigtable[sig] + if t.flags&_SigSetStack != 0 { + return false + } + + // When built using c-archive or c-shared, only install signal + // handlers for synchronous signals and SIGPIPE and sigPreempt. + if (isarchive || islibrary) && t.flags&_SigPanic == 0 && sig != _SIGPIPE && sig != sigPreempt { + return false + } + + return true +} + +// sigenable enables the Go signal handler to catch the signal sig. +// It is only called while holding the os/signal.handlers lock, +// via os/signal.enableSignal and signal_enable. +func sigenable(sig uint32) { + if sig >= uint32(len(sigtable)) { + return + } + + // SIGPROF is handled specially for profiling. + if sig == _SIGPROF { + return + } + + t := &sigtable[sig] + if t.flags&_SigNotify != 0 { + ensureSigM() + enableSigChan <- sig + <-maskUpdatedChan + if atomic.Cas(&handlingSig[sig], 0, 1) { + atomic.Storeuintptr(&fwdSig[sig], getsig(sig)) + setsig(sig, abi.FuncPCABIInternal(sighandler)) + } + } +} + +// sigdisable disables the Go signal handler for the signal sig. +// It is only called while holding the os/signal.handlers lock, +// via os/signal.disableSignal and signal_disable. +func sigdisable(sig uint32) { + if sig >= uint32(len(sigtable)) { + return + } + + // SIGPROF is handled specially for profiling. + if sig == _SIGPROF { + return + } + + t := &sigtable[sig] + if t.flags&_SigNotify != 0 { + ensureSigM() + disableSigChan <- sig + <-maskUpdatedChan + + // If initsig does not install a signal handler for a + // signal, then to go back to the state before Notify + // we should remove the one we installed. 
+ if !sigInstallGoHandler(sig) { + atomic.Store(&handlingSig[sig], 0) + setsig(sig, atomic.Loaduintptr(&fwdSig[sig])) + } + } +} + +// sigignore ignores the signal sig. +// It is only called while holding the os/signal.handlers lock, +// via os/signal.ignoreSignal and signal_ignore. +func sigignore(sig uint32) { + if sig >= uint32(len(sigtable)) { + return + } + + // SIGPROF is handled specially for profiling. + if sig == _SIGPROF { + return + } + + t := &sigtable[sig] + if t.flags&_SigNotify != 0 { + atomic.Store(&handlingSig[sig], 0) + setsig(sig, _SIG_IGN) + } +} + +// clearSignalHandlers clears all signal handlers that are not ignored +// back to the default. This is called by the child after a fork, so that +// we can enable the signal mask for the exec without worrying about +// running a signal handler in the child. +// +//go:nosplit +//go:nowritebarrierrec +func clearSignalHandlers() { + for i := uint32(0); i < _NSIG; i++ { + if atomic.Load(&handlingSig[i]) != 0 { + setsig(i, _SIG_DFL) + } + } +} + +// setProcessCPUProfilerTimer is called when the profiling timer changes. +// It is called with prof.signalLock held. hz is the new timer, and is 0 if +// profiling is being disabled. Enable or disable the signal as +// required for -buildmode=c-archive. +func setProcessCPUProfilerTimer(hz int32) { + if hz != 0 { + // Enable the Go signal handler if not enabled. + if atomic.Cas(&handlingSig[_SIGPROF], 0, 1) { + h := getsig(_SIGPROF) + // If no signal handler was installed before, then we record + // _SIG_IGN here. When we turn off profiling (below) we'll start + // ignoring SIGPROF signals. We do this, rather than change + // to SIG_DFL, because there may be a pending SIGPROF + // signal that has not yet been delivered to some other thread. + // If we change to SIG_DFL when turning off profiling, the + // program will crash when that SIGPROF is delivered. We assume + // that programs that use profiling don't want to crash on a + // stray SIGPROF. 
See issue 19320. + // We do the change here instead of when turning off profiling, + // because there we may race with a signal handler running + // concurrently, in particular, sigfwdgo may observe _SIG_DFL and + // die. See issue 43828. + if h == _SIG_DFL { + h = _SIG_IGN + } + atomic.Storeuintptr(&fwdSig[_SIGPROF], h) + setsig(_SIGPROF, abi.FuncPCABIInternal(sighandler)) + } + + var it itimerval + it.it_interval.tv_sec = 0 + it.it_interval.set_usec(1000000 / hz) + it.it_value = it.it_interval + setitimer(_ITIMER_PROF, &it, nil) + } else { + setitimer(_ITIMER_PROF, &itimerval{}, nil) + + // If the Go signal handler should be disabled by default, + // switch back to the signal handler that was installed + // when we enabled profiling. We don't try to handle the case + // of a program that changes the SIGPROF handler while Go + // profiling is enabled. + if !sigInstallGoHandler(_SIGPROF) { + if atomic.Cas(&handlingSig[_SIGPROF], 1, 0) { + h := atomic.Loaduintptr(&fwdSig[_SIGPROF]) + setsig(_SIGPROF, h) + } + } + } +} + +// setThreadCPUProfilerHz makes any thread-specific changes required to +// implement profiling at a rate of hz. +// No changes required on Unix systems when using setitimer. +func setThreadCPUProfilerHz(hz int32) { + getg().m.profilehz = hz +} + +func sigpipe() { + if signal_ignored(_SIGPIPE) || sigsend(_SIGPIPE) { + return + } + dieFromSignal(_SIGPIPE) +} + +// doSigPreempt handles a preemption signal on gp. +func doSigPreempt(gp *g, ctxt *sigctxt) { + // Check if this G wants to be preempted and is safe to + // preempt. + if wantAsyncPreempt(gp) { + if ok, newpc := isAsyncSafePoint(gp, ctxt.sigpc(), ctxt.sigsp(), ctxt.siglr()); ok { + // Adjust the PC and inject a call to asyncPreempt. + ctxt.pushCall(abi.FuncPCABI0(asyncPreempt), newpc) + } + } + + // Acknowledge the preemption. 
+ gp.m.preemptGen.Add(1) + gp.m.signalPending.Store(0) + + if GOOS == "darwin" || GOOS == "ios" { + pendingPreemptSignals.Add(-1) + } +} + +const preemptMSupported = true + +// preemptM sends a preemption request to mp. This request may be +// handled asynchronously and may be coalesced with other requests to +// the M. When the request is received, if the running G or P is +// marked for preemption and the goroutine is at an asynchronous +// safe-point, it will preempt the goroutine. It always atomically +// increments mp.preemptGen after handling a preemption request. +func preemptM(mp *m) { + // On Darwin, don't try to preempt threads during exec. + // Issue #41702. + if GOOS == "darwin" || GOOS == "ios" { + execLock.rlock() + } + + if mp.signalPending.CompareAndSwap(0, 1) { + if GOOS == "darwin" || GOOS == "ios" { + pendingPreemptSignals.Add(1) + } + + // If multiple threads are preempting the same M, they may send so + // many signals to that M that it hardly makes progress, causing a + // live-lock. Apparently this could happen on darwin. See + // issue #37741. + // Only send a signal if there isn't already one pending. + signalM(mp, sigPreempt) + } + + if GOOS == "darwin" || GOOS == "ios" { + execLock.runlock() + } +} + +// sigFetchG fetches the value of G safely when running in a signal handler. +// On some architectures, the g value may be clobbered when running in a VDSO. +// See issue #32912. +// +//go:nosplit +func sigFetchG(c *sigctxt) *g { + switch GOARCH { + case "arm", "arm64", "ppc64", "ppc64le", "riscv64", "s390x": + if !iscgo && inVDSOPage(c.sigpc()) { + // When using cgo, we save the g on TLS and load it from there + // in sigtramp. Just use that. + // Otherwise, before making a VDSO call we save the g to the + // bottom of the signal stack. Fetch from there. + // TODO: in efence mode, stack is sysAlloc'd, so this wouldn't + // work.
+ sp := getcallersp() + s := spanOf(sp) + if s != nil && s.state.get() == mSpanManual && s.base() < sp && sp < s.limit { + gp := *(**g)(unsafe.Pointer(s.base())) + return gp + } + return nil + } + } + return getg() +} + +// sigtrampgo is called from the signal handler function, sigtramp, +// written in assembly code. +// This is called by the signal handler, and the world may be stopped. +// +// It must be nosplit because getg() is still the G that was running +// (if any) when the signal was delivered, but it's (usually) called +// on the gsignal stack. Until this switches the G to gsignal, the +// stack bounds check won't work. +// +//go:nosplit +//go:nowritebarrierrec +func sigtrampgo(sig uint32, info *siginfo, ctx unsafe.Pointer) { + if sigfwdgo(sig, info, ctx) { + return + } + c := &sigctxt{info, ctx} + gp := sigFetchG(c) + setg(gp) + if gp == nil { + if sig == _SIGPROF { + // Some platforms (Linux) have per-thread timers, which we use in + // combination with the process-wide timer. Avoid double-counting. + if validSIGPROF(nil, c) { + sigprofNonGoPC(c.sigpc()) + } + return + } + if sig == sigPreempt && preemptMSupported && debug.asyncpreemptoff == 0 { + // This is probably a signal from preemptM sent + // while executing Go code but received while + // executing non-Go code. + // We got past sigfwdgo, so we know that there is + // no non-Go signal handler for sigPreempt. + // The default behavior for sigPreempt is to ignore + // the signal, so badsignal will be a no-op anyway. + if GOOS == "darwin" || GOOS == "ios" { + pendingPreemptSignals.Add(-1) + } + return + } + c.fixsigcode(sig) + badsignal(uintptr(sig), c) + return + } + + setg(gp.m.gsignal) + + // If some non-Go code called sigaltstack, adjust. 
+ var gsignalStack gsignalStack + setStack := adjustSignalStack(sig, gp.m, &gsignalStack) + if setStack { + gp.m.gsignal.stktopsp = getcallersp() + } + + if gp.stackguard0 == stackFork { + signalDuringFork(sig) + } + + c.fixsigcode(sig) + sighandler(sig, info, ctx, gp) + setg(gp) + if setStack { + restoreGsignalStack(&gsignalStack) + } +} + +// If the signal handler receives a SIGPROF signal on a non-Go thread, +// it tries to collect a traceback into sigprofCallers. +// sigprofCallersUse is set to non-zero while sigprofCallers holds a traceback. +var sigprofCallers cgoCallers +var sigprofCallersUse uint32 + +// sigprofNonGo is called if we receive a SIGPROF signal on a non-Go thread, +// and the signal handler collected a stack trace in sigprofCallers. +// When this is called, sigprofCallersUse will be non-zero. +// g is nil, and what we can do is very limited. +// +// It is called from the signal handling functions written in assembly code that +// are active for cgo programs, cgoSigtramp and sigprofNonGoWrapper, which have +// not verified that the SIGPROF delivery corresponds to the best available +// profiling source for this thread. +// +//go:nosplit +//go:nowritebarrierrec +func sigprofNonGo(sig uint32, info *siginfo, ctx unsafe.Pointer) { + if prof.hz.Load() != 0 { + c := &sigctxt{info, ctx} + // Some platforms (Linux) have per-thread timers, which we use in + // combination with the process-wide timer. Avoid double-counting. + if validSIGPROF(nil, c) { + n := 0 + for n < len(sigprofCallers) && sigprofCallers[n] != 0 { + n++ + } + cpuprof.addNonGo(sigprofCallers[:n]) + } + } + + atomic.Store(&sigprofCallersUse, 0) +} + +// sigprofNonGoPC is called when a profiling signal arrived on a +// non-Go thread and we have a single PC value, not a stack trace. +// g is nil, and what we can do is very limited. 
+// +//go:nosplit +//go:nowritebarrierrec +func sigprofNonGoPC(pc uintptr) { + if prof.hz.Load() != 0 { + stk := []uintptr{ + pc, + abi.FuncPCABIInternal(_ExternalCode) + sys.PCQuantum, + } + cpuprof.addNonGo(stk) + } +} + +// adjustSignalStack adjusts the current stack guard based on the +// stack pointer that is actually in use while handling a signal. +// We do this in case some non-Go code called sigaltstack. +// This reports whether the stack was adjusted, and if so stores the old +// signal stack in *gsigstack. +// +//go:nosplit +func adjustSignalStack(sig uint32, mp *m, gsigStack *gsignalStack) bool { + sp := uintptr(unsafe.Pointer(&sig)) + if sp >= mp.gsignal.stack.lo && sp < mp.gsignal.stack.hi { + return false + } + + var st stackt + sigaltstack(nil, &st) + stsp := uintptr(unsafe.Pointer(st.ss_sp)) + if st.ss_flags&_SS_DISABLE == 0 && sp >= stsp && sp < stsp+st.ss_size { + setGsignalStack(&st, gsigStack) + return true + } + + if sp >= mp.g0.stack.lo && sp < mp.g0.stack.hi { + // The signal was delivered on the g0 stack. + // This can happen when linked with C code + // using the thread sanitizer, which collects + // signals then delivers them itself by calling + // the signal handler directly when C code, + // including C code called via cgo, calls a + // TSAN-intercepted function such as malloc. + // + // We check this condition last as g0.stack.lo + // may be not very accurate (see mstart). + st := stackt{ss_size: mp.g0.stack.hi - mp.g0.stack.lo} + setSignalstackSP(&st, mp.g0.stack.lo) + setGsignalStack(&st, gsigStack) + return true + } + + // sp is not within gsignal stack, g0 stack, or sigaltstack. Bad. + setg(nil) + needm() + if st.ss_flags&_SS_DISABLE != 0 { + noSignalStack(sig) + } else { + sigNotOnStack(sig) + } + dropm() + return false +} + +// crashing is the number of m's we have waited for when implementing +// GOTRACEBACK=crash when a signal is received. +var crashing int32 + +// testSigtrap and testSigusr1 are used by the runtime tests. 
If +// non-nil, it is called on SIGTRAP/SIGUSR1. If it returns true, the +// normal behavior on this signal is suppressed. +var testSigtrap func(info *siginfo, ctxt *sigctxt, gp *g) bool +var testSigusr1 func(gp *g) bool + +// sighandler is invoked when a signal occurs. The global g will be +// set to a gsignal goroutine and we will be running on the alternate +// signal stack. The parameter gp will be the value of the global g +// when the signal occurred. The sig, info, and ctxt parameters are +// from the system signal handler: they are the parameters passed when +// the SA is passed to the sigaction system call. +// +// The garbage collector may have stopped the world, so write barriers +// are not allowed. +// +//go:nowritebarrierrec +func sighandler(sig uint32, info *siginfo, ctxt unsafe.Pointer, gp *g) { + // The g executing the signal handler. This is almost always + // mp.gsignal. See delayedSignal for an exception. + gsignal := getg() + mp := gsignal.m + c := &sigctxt{info, ctxt} + + // Cgo TSAN (not the Go race detector) intercepts signals and calls the + // signal handler at a later time. When the signal handler is called, the + // memory may have changed, but the signal context remains old. The + // unmatched signal context and memory makes it unsafe to unwind or inspect + // the stack. So we ignore delayed non-fatal signals that will cause a stack + // inspection (profiling signal and preemption signal). + // cgo_yield is only non-nil for TSAN, and is specifically used to trigger + // signal delivery. We use that as an indicator of delayed signals. + // For delayed signals, the handler is called on the g0 stack (see + // adjustSignalStack). + delayedSignal := *cgo_yield != nil && mp != nil && gsignal.stack == mp.g0.stack + + if sig == _SIGPROF { + // Some platforms (Linux) have per-thread timers, which we use in + // combination with the process-wide timer. Avoid double-counting. 
+ if !delayedSignal && validSIGPROF(mp, c) { + sigprof(c.sigpc(), c.sigsp(), c.siglr(), gp, mp) + } + return + } + + if sig == _SIGTRAP && testSigtrap != nil && testSigtrap(info, (*sigctxt)(noescape(unsafe.Pointer(c))), gp) { + return + } + + if sig == _SIGUSR1 && testSigusr1 != nil && testSigusr1(gp) { + return + } + + if (GOOS == "linux" || GOOS == "android") && sig == sigPerThreadSyscall { + // sigPerThreadSyscall is the same signal used by glibc for + // per-thread syscalls on Linux. We use it for the same purpose + // in non-cgo binaries. Since this signal is not _SigNotify, + // there is nothing more to do once we run the syscall. + runPerThreadSyscall() + return + } + + if sig == sigPreempt && debug.asyncpreemptoff == 0 && !delayedSignal { + // Might be a preemption signal. + doSigPreempt(gp, c) + // Even if this was definitely a preemption signal, it + // may have been coalesced with another signal, so we + // still let it through to the application. + } + + flags := int32(_SigThrow) + if sig < uint32(len(sigtable)) { + flags = sigtable[sig].flags + } + if !c.sigFromUser() && flags&_SigPanic != 0 && gp.throwsplit { + // We can't safely sigpanic because it may grow the + // stack. Abort in the signal handler instead. + flags = _SigThrow + } + if isAbortPC(c.sigpc()) { + // On many architectures, the abort function just + // causes a memory fault. Don't turn that into a panic. + flags = _SigThrow + } + if !c.sigFromUser() && flags&_SigPanic != 0 { + // The signal is going to cause a panic. + // Arrange the stack so that it looks like the point + // where the signal occurred made a call to the + // function sigpanic. Then set the PC to sigpanic. + + // Have to pass arguments out of band since + // augmenting the stack frame would break + // the unwinding code. 
+ gp.sig = sig + gp.sigcode0 = uintptr(c.sigcode()) + gp.sigcode1 = uintptr(c.fault()) + gp.sigpc = c.sigpc() + + c.preparePanic(sig, gp) + return + } + + if c.sigFromUser() || flags&_SigNotify != 0 { + if sigsend(sig) { + return + } + } + + if c.sigFromUser() && signal_ignored(sig) { + return + } + + if flags&_SigKill != 0 { + dieFromSignal(sig) + } + + // _SigThrow means that we should exit now. + // If we get here with _SigPanic, it means that the signal + // was sent to us by a program (c.sigFromUser() is true); + // in that case, if we didn't handle it in sigsend, we exit now. + if flags&(_SigThrow|_SigPanic) == 0 { + return + } + + mp.throwing = throwTypeRuntime + mp.caughtsig.set(gp) + + if crashing == 0 { + startpanic_m() + } + + if sig < uint32(len(sigtable)) { + print(sigtable[sig].name, "\n") + } else { + print("Signal ", sig, "\n") + } + + if isSecureMode() { + exit(2) + } + + print("PC=", hex(c.sigpc()), " m=", mp.id, " sigcode=", c.sigcode(), "\n") + if mp.incgo && gp == mp.g0 && mp.curg != nil { + print("signal arrived during cgo execution\n") + // Switch to curg so that we get a traceback of the Go code + // leading up to the cgocall, which switched from curg to g0. + gp = mp.curg + } + if sig == _SIGILL || sig == _SIGFPE { + // It would be nice to know how long the instruction is. + // Unfortunately, that's complicated to do in general (mostly for x86 + // and s390x, but other archs have non-standard instruction lengths also). + // Opt to print 16 bytes, which covers most instructions. + const maxN = 16 + n := uintptr(maxN) + // We have to be careful, though. If we're near the end of + // a page and the following page isn't mapped, we could + // segfault. So make sure we don't straddle a page (even though + // that could lead to printing an incomplete instruction). + // We're assuming here we can read at least the page containing the PC. + // I suppose it is possible that the page is mapped executable but not readable? 
+ pc := c.sigpc() + if n > physPageSize-pc%physPageSize { + n = physPageSize - pc%physPageSize + } + print("instruction bytes:") + b := (*[maxN]byte)(unsafe.Pointer(pc)) + for i := uintptr(0); i < n; i++ { + print(" ", hex(b[i])) + } + println() + } + print("\n") + + level, _, docrash := gotraceback() + if level > 0 { + goroutineheader(gp) + tracebacktrap(c.sigpc(), c.sigsp(), c.siglr(), gp) + if crashing > 0 && gp != mp.curg && mp.curg != nil && readgstatus(mp.curg)&^_Gscan == _Grunning { + // tracebackothers on original m skipped this one; trace it now. + goroutineheader(mp.curg) + traceback(^uintptr(0), ^uintptr(0), 0, mp.curg) + } else if crashing == 0 { + tracebackothers(gp) + print("\n") + } + dumpregs(c) + } + + if docrash { + crashing++ + if crashing < mcount()-int32(extraMCount) { + // There are other m's that need to dump their stacks. + // Relay SIGQUIT to the next m by sending it to the current process. + // All m's that have already received SIGQUIT have signal masks blocking + // receipt of any signals, so the SIGQUIT will go to an m that hasn't seen it yet. + // When the last m receives the SIGQUIT, it will fall through to the call to + // crash below. Just in case the relaying gets botched, each m involved in + // the relay sleeps for 5 seconds and then does the crash/exit itself. + // In expected operation, the last m has received the SIGQUIT and run + // crash/exit and the process is gone, all long before any of the + // 5-second sleeps have finished. + print("\n-----\n\n") + raiseproc(_SIGQUIT) + usleep(5 * 1000 * 1000) + } + crash() + } + + printDebugLog() + + exit(2) +} + +// sigpanic turns a synchronous signal into a run-time panic. +// If the signal handler sees a synchronous panic, it arranges the +// stack to look like the function where the signal occurred called +// sigpanic, sets the signal's PC value to sigpanic, and returns from +// the signal handler. 
The effect is that the program will act as +// though the function that got the signal simply called sigpanic +// instead. +// +// This must NOT be nosplit because the linker doesn't know where +// sigpanic calls can be injected. +// +// The signal handler must not inject a call to sigpanic if +// getg().throwsplit, since sigpanic may need to grow the stack. +// +// This is exported via linkname to assembly in runtime/cgo. +// +//go:linkname sigpanic +func sigpanic() { + gp := getg() + if !canpanic() { + throw("unexpected signal during runtime execution") + } + + switch gp.sig { + case _SIGBUS: + if gp.sigcode0 == _BUS_ADRERR && gp.sigcode1 < 0x1000 { + panicmem() + } + // Support runtime/debug.SetPanicOnFault. + if gp.paniconfault { + panicmemAddr(gp.sigcode1) + } + print("unexpected fault address ", hex(gp.sigcode1), "\n") + throw("fault") + case _SIGSEGV: + if (gp.sigcode0 == 0 || gp.sigcode0 == _SEGV_MAPERR || gp.sigcode0 == _SEGV_ACCERR) && gp.sigcode1 < 0x1000 { + panicmem() + } + // Support runtime/debug.SetPanicOnFault. + if gp.paniconfault { + panicmemAddr(gp.sigcode1) + } + if inUserArenaChunk(gp.sigcode1) { + // We could check that the arena chunk is explicitly set to fault, + // but the fact that we faulted on accessing it is enough to prove + // that it is. + print("accessed data from freed user arena ", hex(gp.sigcode1), "\n") + } else { + print("unexpected fault address ", hex(gp.sigcode1), "\n") + } + throw("fault") + case _SIGFPE: + switch gp.sigcode0 { + case _FPE_INTDIV: + panicdivide() + case _FPE_INTOVF: + panicoverflow() + } + panicfloat() + } + + if gp.sig >= uint32(len(sigtable)) { + // can't happen: we looked up gp.sig in sigtable to decide to call sigpanic + throw("unexpected signal value") + } + panic(errorString(sigtable[gp.sig].name)) +} + +// dieFromSignal kills the program with a signal. +// This provides the expected exit status for the shell. +// This is only called with fatal signals expected to kill the process. 
+// +//go:nosplit +//go:nowritebarrierrec +func dieFromSignal(sig uint32) { + unblocksig(sig) + // Mark the signal as unhandled to ensure it is forwarded. + atomic.Store(&handlingSig[sig], 0) + raise(sig) + + // That should have killed us. On some systems, though, raise + // sends the signal to the whole process rather than to just + // the current thread, which means that the signal may not yet + // have been delivered. Give other threads a chance to run and + // pick up the signal. + osyield() + osyield() + osyield() + + // If that didn't work, try _SIG_DFL. + setsig(sig, _SIG_DFL) + raise(sig) + + osyield() + osyield() + osyield() + + // If we are still somehow running, just exit with the wrong status. + exit(2) +} + +// raisebadsignal is called when a signal is received on a non-Go +// thread, and the Go program does not want to handle it (that is, the +// program has not called os/signal.Notify for the signal). +func raisebadsignal(sig uint32, c *sigctxt) { + if sig == _SIGPROF { + // Ignore profiling signals that arrive on non-Go threads. + return + } + + var handler uintptr + if sig >= _NSIG { + handler = _SIG_DFL + } else { + handler = atomic.Loaduintptr(&fwdSig[sig]) + } + + // Reset the signal handler and raise the signal. + // We are currently running inside a signal handler, so the + // signal is blocked. We need to unblock it before raising the + // signal, or the signal we raise will be ignored until we return + // from the signal handler. We know that the signal was unblocked + // before entering the handler, or else we would not have received + // it. That means that we don't have to worry about blocking it + // again. + unblocksig(sig) + setsig(sig, handler) + + // If we're linked into a non-Go program we want to try to + // avoid modifying the original context in which the signal + // was raised. If the handler is the default, we know it + // is non-recoverable, so we don't have to worry about + // re-installing sighandler. 
At this point we can just + // return and the signal will be re-raised and caught by + // the default handler with the correct context. + // + // On FreeBSD, the libthr sigaction code prevents + // this from working so we fall through to raise. + if GOOS != "freebsd" && (isarchive || islibrary) && handler == _SIG_DFL && !c.sigFromUser() { + return + } + + raise(sig) + + // Give the signal a chance to be delivered. + // In almost all real cases the program is about to crash, + // so sleeping here is not a waste of time. + usleep(1000) + + // If the signal didn't cause the program to exit, restore the + // Go signal handler and carry on. + // + // We may receive another instance of the signal before we + // restore the Go handler, but that is not so bad: we know + // that the Go program has been ignoring the signal. + setsig(sig, abi.FuncPCABIInternal(sighandler)) +} + +//go:nosplit +func crash() { + // OS X core dumps are linear dumps of the mapped memory, + // from the first virtual byte to the last, with zeros in the gaps. + // Because of the way we arrange the address space on 64-bit systems, + // this means the OS X core file will be >128 GB and even on a zippy + // workstation can take OS X well over an hour to write (uninterruptible). + // Save users from making that mistake. + if GOOS == "darwin" && GOARCH == "amd64" { + return + } + + dieFromSignal(_SIGABRT) +} + +// ensureSigM starts one global, sleeping thread to make sure at least one thread +// is available to catch signals enabled for os/signal. +func ensureSigM() { + if maskUpdatedChan != nil { + return + } + maskUpdatedChan = make(chan struct{}) + disableSigChan = make(chan uint32) + enableSigChan = make(chan uint32) + go func() { + // Signal masks are per-thread, so make sure this goroutine stays on one + // thread. + LockOSThread() + defer UnlockOSThread() + // The sigBlocked mask contains the signals not active for os/signal, + // initially all signals except the essential. 
When signal.Notify()/Stop is called, + // sigenable/sigdisable in turn notify this thread to update its signal + // mask accordingly. + sigBlocked := sigset_all + for i := range sigtable { + if !blockableSig(uint32(i)) { + sigdelset(&sigBlocked, i) + } + } + sigprocmask(_SIG_SETMASK, &sigBlocked, nil) + for { + select { + case sig := <-enableSigChan: + if sig > 0 { + sigdelset(&sigBlocked, int(sig)) + } + case sig := <-disableSigChan: + if sig > 0 && blockableSig(sig) { + sigaddset(&sigBlocked, int(sig)) + } + } + sigprocmask(_SIG_SETMASK, &sigBlocked, nil) + maskUpdatedChan <- struct{}{} + } + }() +} + +// This is called when we receive a signal when there is no signal stack. +// This can only happen if non-Go code calls sigaltstack to disable the +// signal stack. +func noSignalStack(sig uint32) { + println("signal", sig, "received on thread with no signal stack") + throw("non-Go code disabled sigaltstack") +} + +// This is called if we receive a signal when there is a signal stack +// but we are not on it. This can only happen if non-Go code called +// sigaction without setting the SA_ONSTACK flag. +func sigNotOnStack(sig uint32) { + println("signal", sig, "received but handler not on signal stack") + throw("non-Go code set up signal handler without SA_ONSTACK flag") +} + +// signalDuringFork is called if we receive a signal while doing a fork. +// We do not want signals at that time, as a signal sent to the process +// group may be delivered to the child process, causing confusion. +// This should never be called, because we block signals across the fork; +// this function is just a safety check. See issue 18600 for background. +func signalDuringFork(sig uint32) { + println("signal", sig, "received during fork") + throw("signal received during fork") +} + +// This runs on a foreign stack, without an m or a g. No stack split. 
+// +//go:nosplit +//go:norace +//go:nowritebarrierrec +func badsignal(sig uintptr, c *sigctxt) { + if !iscgo && !cgoHasExtraM { + // There is no extra M. needm will not be able to grab + // an M. Instead of hanging, just crash. + // Cannot call split-stack function as there is no G. + writeErrStr("fatal: bad g in signal handler\n") + exit(2) + *(*uintptr)(unsafe.Pointer(uintptr(123))) = 2 + } + needm() + if !sigsend(uint32(sig)) { + // A foreign thread received the signal sig, and the + // Go code does not want to handle it. + raisebadsignal(uint32(sig), c) + } + dropm() +} + +//go:noescape +func sigfwd(fn uintptr, sig uint32, info *siginfo, ctx unsafe.Pointer) + +// Determines if the signal should be handled by Go and if not, forwards the +// signal to the handler that was installed before Go's. Returns whether the +// signal was forwarded. +// This is called by the signal handler, and the world may be stopped. +// +//go:nosplit +//go:nowritebarrierrec +func sigfwdgo(sig uint32, info *siginfo, ctx unsafe.Pointer) bool { + if sig >= uint32(len(sigtable)) { + return false + } + fwdFn := atomic.Loaduintptr(&fwdSig[sig]) + flags := sigtable[sig].flags + + // If we aren't handling the signal, forward it. + if atomic.Load(&handlingSig[sig]) == 0 || !signalsOK { + // If the signal is ignored, doing nothing is the same as forwarding. + if fwdFn == _SIG_IGN || (fwdFn == _SIG_DFL && flags&_SigIgn != 0) { + return true + } + // We are not handling the signal and there is no other handler to forward to. + // Crash with the default behavior. + if fwdFn == _SIG_DFL { + setsig(sig, _SIG_DFL) + dieFromSignal(sig) + return false + } + + sigfwd(fwdFn, sig, info, ctx) + return true + } + + // This function and its caller sigtrampgo assumes SIGPIPE is delivered on the + // originating thread. This property does not hold on macOS (golang.org/issue/33384), + // so we have no choice but to ignore SIGPIPE. 
+ if (GOOS == "darwin" || GOOS == "ios") && sig == _SIGPIPE { + return true + } + + // If there is no handler to forward to, no need to forward. + if fwdFn == _SIG_DFL { + return false + } + + c := &sigctxt{info, ctx} + // Only forward synchronous signals and SIGPIPE. + // Unfortunately, user generated SIGPIPEs will also be forwarded, because si_code + // is set to _SI_USER even for a SIGPIPE raised from a write to a closed socket + // or pipe. + if (c.sigFromUser() || flags&_SigPanic == 0) && sig != _SIGPIPE { + return false + } + // Determine if the signal occurred inside Go code. We test that: + // (1) we weren't in VDSO page, + // (2) we were in a goroutine (i.e., m.curg != nil), and + // (3) we weren't in CGO. + gp := sigFetchG(c) + if gp != nil && gp.m != nil && gp.m.curg != nil && !gp.m.incgo { + return false + } + + // Signal not handled by Go, forward it. + if fwdFn != _SIG_IGN { + sigfwd(fwdFn, sig, info, ctx) + } + + return true +} + +// sigsave saves the current thread's signal mask into *p. +// This is used to preserve the non-Go signal mask when a non-Go +// thread calls a Go function. +// This is nosplit and nowritebarrierrec because it is called by needm +// which may be called on a non-Go thread with no g available. +// +//go:nosplit +//go:nowritebarrierrec +func sigsave(p *sigset) { + sigprocmask(_SIG_SETMASK, nil, p) +} + +// msigrestore sets the current thread's signal mask to sigmask. +// This is used to restore the non-Go signal mask when a non-Go thread +// calls a Go function. +// This is nosplit and nowritebarrierrec because it is called by dropm +// after g has been cleared. +// +//go:nosplit +//go:nowritebarrierrec +func msigrestore(sigmask sigset) { + sigprocmask(_SIG_SETMASK, &sigmask, nil) +} + +// sigsetAllExiting is used by sigblock(true) when a thread is +// exiting. sigset_all is defined in OS specific code, and per GOOS +// behavior may override this default for sigsetAllExiting: see +// osinit(). 
+var sigsetAllExiting = sigset_all + +// sigblock blocks signals in the current thread's signal mask. +// This is used to block signals while setting up and tearing down g +// when a non-Go thread calls a Go function. When a thread is exiting +// we use the sigsetAllExiting value, otherwise the OS specific +// definition of sigset_all is used. +// This is nosplit and nowritebarrierrec because it is called by needm +// which may be called on a non-Go thread with no g available. +// +//go:nosplit +//go:nowritebarrierrec +func sigblock(exiting bool) { + if exiting { + sigprocmask(_SIG_SETMASK, &sigsetAllExiting, nil) + return + } + sigprocmask(_SIG_SETMASK, &sigset_all, nil) +} + +// unblocksig removes sig from the current thread's signal mask. +// This is nosplit and nowritebarrierrec because it is called from +// dieFromSignal, which can be called by sigfwdgo while running in the +// signal handler, on the signal stack, with no g available. +// +//go:nosplit +//go:nowritebarrierrec +func unblocksig(sig uint32) { + var set sigset + sigaddset(&set, int(sig)) + sigprocmask(_SIG_UNBLOCK, &set, nil) +} + +// minitSignals is called when initializing a new m to set the +// thread's alternate signal stack and signal mask. +func minitSignals() { + minitSignalStack() + minitSignalMask() +} + +// minitSignalStack is called when initializing a new m to set the +// alternate signal stack. If the alternate signal stack is not set +// for the thread (the normal case) then set the alternate signal +// stack to the gsignal stack. If the alternate signal stack is set +// for the thread (the case when a non-Go thread sets the alternate +// signal stack and then calls a Go function) then set the gsignal +// stack to the alternate signal stack. We also set the alternate +// signal stack to the gsignal stack if cgo is not used (regardless +// of whether it is already set). Record which choice was made in +// newSigstack, so that it can be undone in unminit. 
+func minitSignalStack() { + mp := getg().m + var st stackt + sigaltstack(nil, &st) + if st.ss_flags&_SS_DISABLE != 0 || !iscgo { + signalstack(&mp.gsignal.stack) + mp.newSigstack = true + } else { + setGsignalStack(&st, &mp.goSigStack) + mp.newSigstack = false + } +} + +// minitSignalMask is called when initializing a new m to set the +// thread's signal mask. When this is called all signals have been +// blocked for the thread. This starts with m.sigmask, which was set +// either from initSigmask for a newly created thread or by calling +// sigsave if this is a non-Go thread calling a Go function. It +// removes all essential signals from the mask, thus causing those +// signals to not be blocked. Then it sets the thread's signal mask. +// After this is called the thread can receive signals. +func minitSignalMask() { + nmask := getg().m.sigmask + for i := range sigtable { + if !blockableSig(uint32(i)) { + sigdelset(&nmask, i) + } + } + sigprocmask(_SIG_SETMASK, &nmask, nil) +} + +// unminitSignals is called from dropm, via unminit, to undo the +// effect of calling minit on a non-Go thread. +// +//go:nosplit +func unminitSignals() { + if getg().m.newSigstack { + st := stackt{ss_flags: _SS_DISABLE} + sigaltstack(&st, nil) + } else { + // We got the signal stack from someone else. Restore + // the Go-allocated stack in case this M gets reused + // for another thread (e.g., it's an extram). Also, on + // Android, libc allocates a signal stack for all + // threads, so it's important to restore the Go stack + // even on Go-created threads so we can free it. + restoreGsignalStack(&getg().m.goSigStack) + } +} + +// blockableSig reports whether sig may be blocked by the signal mask. +// We never want to block the signals marked _SigUnblock; +// these are the synchronous signals that turn into a Go panic. +// We never want to block the preemption signal if it is being used. 
+// In a Go program--not a c-archive/c-shared--we never want to block +// the signals marked _SigKill or _SigThrow, as otherwise it's possible +// for all running threads to block them and delay their delivery until +// we start a new thread. When linked into a C program we let the C code +// decide on the disposition of those signals. +func blockableSig(sig uint32) bool { + flags := sigtable[sig].flags + if flags&_SigUnblock != 0 { + return false + } + if sig == sigPreempt && preemptMSupported && debug.asyncpreemptoff == 0 { + return false + } + if isarchive || islibrary { + return true + } + return flags&(_SigKill|_SigThrow) == 0 +} + +// gsignalStack saves the fields of the gsignal stack changed by +// setGsignalStack. +type gsignalStack struct { + stack stack + stackguard0 uintptr + stackguard1 uintptr + stktopsp uintptr +} + +// setGsignalStack sets the gsignal stack of the current m to an +// alternate signal stack returned from the sigaltstack system call. +// It saves the old values in *old for use by restoreGsignalStack. +// This is used when handling a signal if non-Go code has set the +// alternate signal stack. +// +//go:nosplit +//go:nowritebarrierrec +func setGsignalStack(st *stackt, old *gsignalStack) { + gp := getg() + if old != nil { + old.stack = gp.m.gsignal.stack + old.stackguard0 = gp.m.gsignal.stackguard0 + old.stackguard1 = gp.m.gsignal.stackguard1 + old.stktopsp = gp.m.gsignal.stktopsp + } + stsp := uintptr(unsafe.Pointer(st.ss_sp)) + gp.m.gsignal.stack.lo = stsp + gp.m.gsignal.stack.hi = stsp + st.ss_size + gp.m.gsignal.stackguard0 = stsp + _StackGuard + gp.m.gsignal.stackguard1 = stsp + _StackGuard +} + +// restoreGsignalStack restores the gsignal stack to the value it had +// before entering the signal handler. 
+// +//go:nosplit +//go:nowritebarrierrec +func restoreGsignalStack(st *gsignalStack) { + gp := getg().m.gsignal + gp.stack = st.stack + gp.stackguard0 = st.stackguard0 + gp.stackguard1 = st.stackguard1 + gp.stktopsp = st.stktopsp +} + +// signalstack sets the current thread's alternate signal stack to s. +// +//go:nosplit +func signalstack(s *stack) { + st := stackt{ss_size: s.hi - s.lo} + setSignalstackSP(&st, s.lo) + sigaltstack(&st, nil) +} + +// setsigsegv is used on darwin/arm64 to fake a segmentation fault. +// +// This is exported via linkname to assembly in runtime/cgo. +// +//go:nosplit +//go:linkname setsigsegv +func setsigsegv(pc uintptr) { + gp := getg() + gp.sig = _SIGSEGV + gp.sigpc = pc + gp.sigcode0 = _SEGV_MAPERR + gp.sigcode1 = 0 // TODO: emulate si_addr +} diff --git a/src/runtime/signal_windows.go b/src/runtime/signal_windows.go new file mode 100644 index 0000000..37986cd --- /dev/null +++ b/src/runtime/signal_windows.go @@ -0,0 +1,335 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "runtime/internal/sys" + "unsafe" +) + +func disableWER() { + // do not display Windows Error Reporting dialogue + const ( + SEM_FAILCRITICALERRORS = 0x0001 + SEM_NOGPFAULTERRORBOX = 0x0002 + SEM_NOALIGNMENTFAULTEXCEPT = 0x0004 + SEM_NOOPENFILEERRORBOX = 0x8000 + ) + errormode := uint32(stdcall1(_SetErrorMode, SEM_NOGPFAULTERRORBOX)) + stdcall1(_SetErrorMode, uintptr(errormode)|SEM_FAILCRITICALERRORS|SEM_NOGPFAULTERRORBOX|SEM_NOOPENFILEERRORBOX) +} + +// in sys_windows_386.s and sys_windows_amd64.s +func exceptiontramp() +func firstcontinuetramp() +func lastcontinuetramp() + +func initExceptionHandler() { + stdcall2(_AddVectoredExceptionHandler, 1, abi.FuncPCABI0(exceptiontramp)) + if _AddVectoredContinueHandler == nil || GOARCH == "386" { + // use SetUnhandledExceptionFilter for windows-386 or + // if VectoredContinueHandler is unavailable. + // note: SetUnhandledExceptionFilter handler won't be called, if debugging. + stdcall1(_SetUnhandledExceptionFilter, abi.FuncPCABI0(lastcontinuetramp)) + } else { + stdcall2(_AddVectoredContinueHandler, 1, abi.FuncPCABI0(firstcontinuetramp)) + stdcall2(_AddVectoredContinueHandler, 0, abi.FuncPCABI0(lastcontinuetramp)) + } +} + +// isAbort returns true, if context r describes exception raised +// by calling runtime.abort function. +// +//go:nosplit +func isAbort(r *context) bool { + pc := r.ip() + if GOARCH == "386" || GOARCH == "amd64" || GOARCH == "arm" { + // In the case of an abort, the exception IP is one byte after + // the INT3 (this differs from UNIX OSes). Note that on ARM, + // this means that the exception IP is no longer aligned. + pc-- + } + return isAbortPC(pc) +} + +// isgoexception reports whether this exception should be translated +// into a Go panic or throw. +// +// It is nosplit to avoid growing the stack in case we're aborting +// because of a stack overflow. 
+// +//go:nosplit +func isgoexception(info *exceptionrecord, r *context) bool { + // Only handle exception if executing instructions in Go binary + // (not Windows library code). + // TODO(mwhudson): needs to loop to support shared libs + if r.ip() < firstmoduledata.text || firstmoduledata.etext < r.ip() { + return false + } + + // Go will only handle some exceptions. + switch info.exceptioncode { + default: + return false + case _EXCEPTION_ACCESS_VIOLATION: + case _EXCEPTION_INT_DIVIDE_BY_ZERO: + case _EXCEPTION_INT_OVERFLOW: + case _EXCEPTION_FLT_DENORMAL_OPERAND: + case _EXCEPTION_FLT_DIVIDE_BY_ZERO: + case _EXCEPTION_FLT_INEXACT_RESULT: + case _EXCEPTION_FLT_OVERFLOW: + case _EXCEPTION_FLT_UNDERFLOW: + case _EXCEPTION_BREAKPOINT: + case _EXCEPTION_ILLEGAL_INSTRUCTION: // breakpoint arrives this way on arm64 + } + return true +} + +// Called by sigtramp from Windows VEH handler. +// Return value signals whether the exception has been handled (EXCEPTION_CONTINUE_EXECUTION) +// or should be made available to other handlers in the chain (EXCEPTION_CONTINUE_SEARCH). +// +// This is the first entry into Go code for exception handling. This +// is nosplit to avoid growing the stack until we've checked for +// _EXCEPTION_BREAKPOINT, which is raised if we overflow the g0 stack. +// +//go:nosplit +func exceptionhandler(info *exceptionrecord, r *context, gp *g) int32 { + if !isgoexception(info, r) { + return _EXCEPTION_CONTINUE_SEARCH + } + + if gp.throwsplit || isAbort(r) { + // We can't safely sigpanic because it may grow the stack. + // Or this is a call to abort. + // Don't go through any more of the Windows handler chain. + // Crash now. + winthrow(info, r, gp) + } + + // After this point, it is safe to grow the stack. + + // Make it look like a call to the signal func. + // Have to pass arguments out of band since + // augmenting the stack frame would break + // the unwinding code. 
+ gp.sig = info.exceptioncode + gp.sigcode0 = info.exceptioninformation[0] + gp.sigcode1 = info.exceptioninformation[1] + gp.sigpc = r.ip() + + // Only push runtime·sigpanic if r.ip() != 0. + // If r.ip() == 0, probably panicked because of a + // call to a nil func. Not pushing that onto sp will + // make the trace look like a call to runtime·sigpanic instead. + // (Otherwise the trace will end at runtime·sigpanic and we + // won't get to see who faulted.) + // Also don't push a sigpanic frame if the faulting PC + // is the entry of asyncPreempt. In this case, we suspended + // the thread right between the fault and the exception handler + // starting to run, and we have pushed an asyncPreempt call. + // The exception is not from asyncPreempt, so not to push a + // sigpanic call to make it look like that. Instead, just + // overwrite the PC. (See issue #35773) + if r.ip() != 0 && r.ip() != abi.FuncPCABI0(asyncPreempt) { + sp := unsafe.Pointer(r.sp()) + delta := uintptr(sys.StackAlign) + sp = add(sp, -delta) + r.set_sp(uintptr(sp)) + if usesLR { + *((*uintptr)(sp)) = r.lr() + r.set_lr(r.ip()) + } else { + *((*uintptr)(sp)) = r.ip() + } + } + r.set_ip(abi.FuncPCABI0(sigpanic0)) + return _EXCEPTION_CONTINUE_EXECUTION +} + +// It seems Windows searches ContinueHandler's list even +// if ExceptionHandler returns EXCEPTION_CONTINUE_EXECUTION. +// firstcontinuehandler will stop that search, +// if exceptionhandler did the same earlier. +// +// It is nosplit for the same reason as exceptionhandler. +// +//go:nosplit +func firstcontinuehandler(info *exceptionrecord, r *context, gp *g) int32 { + if !isgoexception(info, r) { + return _EXCEPTION_CONTINUE_SEARCH + } + return _EXCEPTION_CONTINUE_EXECUTION +} + +var testingWER bool + +// lastcontinuehandler is reached, because runtime cannot handle +// current exception. lastcontinuehandler will print crash info and exit. +// +// It is nosplit for the same reason as exceptionhandler. 
+// +//go:nosplit +func lastcontinuehandler(info *exceptionrecord, r *context, gp *g) int32 { + if islibrary || isarchive { + // Go DLL/archive has been loaded in a non-Go program. + // If the exception does not originate from Go, the Go runtime + // should not take responsibility for crashing the process. + return _EXCEPTION_CONTINUE_SEARCH + } + if testingWER { + return _EXCEPTION_CONTINUE_SEARCH + } + + // VEH is called before SEH, but arm64 MSVC DLLs use SEH to trap + // illegal instructions during runtime initialization to determine + // CPU features, so if we make it to the last handler and we're + // arm64 and it's an illegal instruction and this is coming from + // non-Go code, then assume this runtime probing is happening, and + // pass that onward to SEH. + if GOARCH == "arm64" && info.exceptioncode == _EXCEPTION_ILLEGAL_INSTRUCTION && + (r.ip() < firstmoduledata.text || firstmoduledata.etext < r.ip()) { + return _EXCEPTION_CONTINUE_SEARCH + } + + winthrow(info, r, gp) + return 0 // not reached +} + +// Always called on g0. gp is the G where the exception occurred. +// +//go:nosplit +func winthrow(info *exceptionrecord, r *context, gp *g) { + g0 := getg() + + if panicking.Load() != 0 { // traceback already printed + exit(2) + } + panicking.Store(1) + + // In case we're handling a g0 stack overflow, blow away the + // g0 stack bounds so we have room to print the traceback. If + // this somehow overflows the stack, the OS will trap it. 
+ g0.stack.lo = 0 + g0.stackguard0 = g0.stack.lo + _StackGuard + g0.stackguard1 = g0.stackguard0 + + print("Exception ", hex(info.exceptioncode), " ", hex(info.exceptioninformation[0]), " ", hex(info.exceptioninformation[1]), " ", hex(r.ip()), "\n") + + print("PC=", hex(r.ip()), "\n") + if g0.m.incgo && gp == g0.m.g0 && g0.m.curg != nil { + if iscgo { + print("signal arrived during external code execution\n") + } + gp = g0.m.curg + } + print("\n") + + g0.m.throwing = throwTypeRuntime + g0.m.caughtsig.set(gp) + + level, _, docrash := gotraceback() + if level > 0 { + tracebacktrap(r.ip(), r.sp(), r.lr(), gp) + tracebackothers(gp) + dumpregs(r) + } + + if docrash { + crash() + } + + exit(2) +} + +func sigpanic() { + gp := getg() + if !canpanic() { + throw("unexpected signal during runtime execution") + } + + switch gp.sig { + case _EXCEPTION_ACCESS_VIOLATION: + if gp.sigcode1 < 0x1000 { + panicmem() + } + if gp.paniconfault { + panicmemAddr(gp.sigcode1) + } + if inUserArenaChunk(gp.sigcode1) { + // We could check that the arena chunk is explicitly set to fault, + // but the fact that we faulted on accessing it is enough to prove + // that it is. + print("accessed data from freed user arena ", hex(gp.sigcode1), "\n") + } else { + print("unexpected fault address ", hex(gp.sigcode1), "\n") + } + throw("fault") + case _EXCEPTION_INT_DIVIDE_BY_ZERO: + panicdivide() + case _EXCEPTION_INT_OVERFLOW: + panicoverflow() + case _EXCEPTION_FLT_DENORMAL_OPERAND, + _EXCEPTION_FLT_DIVIDE_BY_ZERO, + _EXCEPTION_FLT_INEXACT_RESULT, + _EXCEPTION_FLT_OVERFLOW, + _EXCEPTION_FLT_UNDERFLOW: + panicfloat() + } + throw("fault") +} + +var ( + badsignalmsg [100]byte + badsignallen int32 +) + +func setBadSignalMsg() { + const msg = "runtime: signal received on thread not created by Go.\n" + for i, c := range msg { + badsignalmsg[i] = byte(c) + badsignallen++ + } +} + +// Following are not implemented. 
+ +func initsig(preinit bool) { +} + +func sigenable(sig uint32) { +} + +func sigdisable(sig uint32) { +} + +func sigignore(sig uint32) { +} + +func badsignal2() + +func raisebadsignal(sig uint32) { + badsignal2() +} + +func signame(sig uint32) string { + return "" +} + +//go:nosplit +func crash() { + // TODO: This routine should do whatever is needed + // to make the Windows program abort/crash as it + // would if Go was not intercepting signals. + // On Unix the routine would remove the custom signal + // handler and then raise a signal (like SIGABRT). + // Something like that should happen here. + // It's okay to leave this empty for now: if crash returns + // the ordinary exit-after-panic happens. +} + +// gsignalStack is unused on Windows. +type gsignalStack struct{} diff --git a/src/runtime/signal_windows_test.go b/src/runtime/signal_windows_test.go new file mode 100644 index 0000000..4c7a476 --- /dev/null +++ b/src/runtime/signal_windows_test.go @@ -0,0 +1,312 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "bufio" + "bytes" + "fmt" + "internal/testenv" + "os/exec" + "path/filepath" + "runtime" + "strings" + "syscall" + "testing" +) + +func TestVectoredHandlerExceptionInNonGoThread(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + if strings.HasPrefix(testenv.Builder(), "windows-amd64-2012") { + testenv.SkipFlaky(t, 49681) + } + testenv.MustHaveGoBuild(t) + testenv.MustHaveCGO(t) + testenv.MustHaveExecPath(t, "gcc") + testprog.Lock() + defer testprog.Unlock() + dir := t.TempDir() + + // build c dll + dll := filepath.Join(dir, "veh.dll") + cmd := exec.Command("gcc", "-shared", "-o", dll, "testdata/testwinlibthrow/veh.c") + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build c dll: %s\n%s", err, out) + } + + // build go exe + exe := filepath.Join(dir, "test.exe") + cmd = exec.Command(testenv.GoToolPath(t), "build", "-o", exe, "testdata/testwinlibthrow/main.go") + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build go exe: %s\n%s", err, out) + } + + // run test program in same thread + cmd = exec.Command(exe) + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err == nil { + t.Fatal("error expected") + } + if _, ok := err.(*exec.ExitError); ok && len(out) > 0 { + if !bytes.Contains(out, []byte("Exception 0x2a")) { + t.Fatalf("unexpected failure while running executable: %s\n%s", err, out) + } + } else { + t.Fatalf("unexpected error while running executable: %s\n%s", err, out) + } + // run test program in a new thread + cmd = exec.Command(exe, "thread") + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err == nil { + t.Fatal("error expected") + } + if err, ok := err.(*exec.ExitError); ok { + if err.ExitCode() != 42 { + t.Fatalf("unexpected failure while running executable: %s\n%s", err, out) + } + } else { + t.Fatalf("unexpected error while running executable: %s\n%s", err, out) + } +} + +func 
TestVectoredHandlerDontCrashOnLibrary(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + if runtime.GOARCH != "amd64" { + t.Skip("this test can only run on windows/amd64") + } + testenv.MustHaveGoBuild(t) + testenv.MustHaveCGO(t) + testenv.MustHaveExecPath(t, "gcc") + testprog.Lock() + defer testprog.Unlock() + dir := t.TempDir() + + // build go dll + dll := filepath.Join(dir, "testwinlib.dll") + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", dll, "-buildmode", "c-shared", "testdata/testwinlib/main.go") + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build go library: %s\n%s", err, out) + } + + // build c program + exe := filepath.Join(dir, "test.exe") + cmd = exec.Command("gcc", "-L"+dir, "-I"+dir, "-ltestwinlib", "-o", exe, "testdata/testwinlib/main.c") + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build c exe: %s\n%s", err, out) + } + + // run test program + cmd = exec.Command(exe) + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failure while running executable: %s\n%s", err, out) + } + expectedOutput := "exceptionCount: 1\ncontinueCount: 1\n" + // normalize Windows line endings before comparing + cleanedOut := strings.ReplaceAll(string(out), "\r\n", "\n") + if cleanedOut != expectedOutput { + t.Errorf("expected output %q, got %q", expectedOutput, cleanedOut) + } +} + +func sendCtrlBreak(pid int) error { + kernel32, err := syscall.LoadDLL("kernel32.dll") + if err != nil { + return fmt.Errorf("LoadDLL: %v", err) + } + generateEvent, err := kernel32.FindProc("GenerateConsoleCtrlEvent") + if err != nil { + return fmt.Errorf("FindProc: %v", err) + } + result, _, err := generateEvent.Call(syscall.CTRL_BREAK_EVENT, uintptr(pid)) + if result == 0 { + return fmt.Errorf("GenerateConsoleCtrlEvent: %v", err) + } + return nil +} + +// TestCtrlHandler tests that Go can gracefully handle closing the console window. 
+// See https://golang.org/issues/41884. +func TestCtrlHandler(t *testing.T) { + testenv.MustHaveGoBuild(t) + t.Parallel() + + // build go program + exe := filepath.Join(t.TempDir(), "test.exe") + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", exe, "testdata/testwinsignal/main.go") + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build go exe: %v\n%s", err, out) + } + + // run test program + cmd = exec.Command(exe) + var stdout strings.Builder + var stderr strings.Builder + cmd.Stdout = &stdout + cmd.Stderr = &stderr + inPipe, err := cmd.StdinPipe() + if err != nil { + t.Fatalf("Failed to create stdin pipe: %v", err) + } + // keep inPipe alive until the end of the test + defer inPipe.Close() + + // in a new command window + const _CREATE_NEW_CONSOLE = 0x00000010 + cmd.SysProcAttr = &syscall.SysProcAttr{ + CreationFlags: _CREATE_NEW_CONSOLE, + HideWindow: true, + } + if err := cmd.Start(); err != nil { + t.Fatalf("Start failed: %v", err) + } + defer func() { + cmd.Process.Kill() + cmd.Wait() + }() + + // check child exited gracefully, did not timeout + if err := cmd.Wait(); err != nil { + t.Fatalf("Program exited with error: %v\n%s", err, &stderr) + } + + // check child received, handled SIGTERM + if expected, got := syscall.SIGTERM.String(), strings.TrimSpace(stdout.String()); expected != got { + t.Fatalf("Expected '%s' got: %s", expected, got) + } +} + +// TestLibraryCtrlHandler tests that Go DLL allows calling program to handle console control events. +// See https://golang.org/issues/35965. 
+func TestLibraryCtrlHandler(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + if runtime.GOARCH != "amd64" { + t.Skip("this test can only run on windows/amd64") + } + testenv.MustHaveGoBuild(t) + testenv.MustHaveCGO(t) + testenv.MustHaveExecPath(t, "gcc") + testprog.Lock() + defer testprog.Unlock() + dir := t.TempDir() + + // build go dll + dll := filepath.Join(dir, "dummy.dll") + cmd := exec.Command(testenv.GoToolPath(t), "build", "-o", dll, "-buildmode", "c-shared", "testdata/testwinlibsignal/dummy.go") + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build go library: %s\n%s", err, out) + } + + // build c program + exe := filepath.Join(dir, "test.exe") + cmd = exec.Command("gcc", "-o", exe, "testdata/testwinlibsignal/main.c") + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build c exe: %s\n%s", err, out) + } + + // run test program + cmd = exec.Command(exe) + var stderr bytes.Buffer + cmd.Stderr = &stderr + outPipe, err := cmd.StdoutPipe() + if err != nil { + t.Fatalf("Failed to create stdout pipe: %v", err) + } + outReader := bufio.NewReader(outPipe) + + cmd.SysProcAttr = &syscall.SysProcAttr{ + CreationFlags: syscall.CREATE_NEW_PROCESS_GROUP, + } + if err := cmd.Start(); err != nil { + t.Fatalf("Start failed: %v", err) + } + + errCh := make(chan error, 1) + go func() { + if line, err := outReader.ReadString('\n'); err != nil { + errCh <- fmt.Errorf("could not read stdout: %v", err) + } else if strings.TrimSpace(line) != "ready" { + errCh <- fmt.Errorf("unexpected message: %v", line) + } else { + errCh <- sendCtrlBreak(cmd.Process.Pid) + } + }() + + if err := <-errCh; err != nil { + t.Fatal(err) + } + if err := cmd.Wait(); err != nil { + t.Fatalf("Program exited with error: %v\n%s", err, &stderr) + } +} + +func TestIssue59213(t *testing.T) { + if runtime.GOOS != "windows" { + t.Skip("skipping windows only test") + } + if *flagQuick { + t.Skip("-quick") + } + 
testenv.MustHaveGoBuild(t) + testenv.MustHaveCGO(t) + + goEnv := func(arg string) string { + cmd := testenv.Command(t, testenv.GoToolPath(t), "env", arg) + cmd.Stderr = new(bytes.Buffer) + + line, err := cmd.Output() + if err != nil { + t.Fatalf("%v: %v\n%s", cmd, err, cmd.Stderr) + } + out := string(bytes.TrimSpace(line)) + t.Logf("%v: %q", cmd, out) + return out + } + + cc := goEnv("CC") + cgoCflags := goEnv("CGO_CFLAGS") + + t.Parallel() + + tmpdir := t.TempDir() + dllfile := filepath.Join(tmpdir, "test.dll") + exefile := filepath.Join(tmpdir, "gotest.exe") + + // build go dll + cmd := testenv.Command(t, testenv.GoToolPath(t), "build", "-o", dllfile, "-buildmode", "c-shared", "testdata/testwintls/main.go") + out, err := testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed to build go library: %s\n%s", err, out) + } + + // build c program + cmd = testenv.Command(t, cc, "-o", exefile, "testdata/testwintls/main.c") + testenv.CleanCmdEnv(cmd) + cmd.Env = append(cmd.Env, "CGO_CFLAGS="+cgoCflags) + out, err = cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build c exe: %s\n%s", err, out) + } + + // run test program + cmd = testenv.Command(t, exefile, dllfile, "GoFunc") + out, err = testenv.CleanCmdEnv(cmd).CombinedOutput() + if err != nil { + t.Fatalf("failed: %s\n%s", err, out) + } +} diff --git a/src/runtime/sigqueue.go b/src/runtime/sigqueue.go new file mode 100644 index 0000000..51e424d --- /dev/null +++ b/src/runtime/sigqueue.go @@ -0,0 +1,275 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file implements runtime support for signal handling. +// +// Most synchronization primitives are not available from +// the signal handler (it cannot block, allocate memory, or use locks) +// so the handler communicates with a processing goroutine +// via struct sig, below. 
+// +// sigsend is called by the signal handler to queue a new signal. +// signal_recv is called by the Go program to receive a newly queued signal. +// +// Synchronization between sigsend and signal_recv is based on the sig.state +// variable. It can be in three states: +// * sigReceiving means that signal_recv is blocked on sig.Note and there are +// no new pending signals. +// * sigSending means that sig.mask *may* contain new pending signals, +// signal_recv can't be blocked in this state. +// * sigIdle means that there are no new pending signals and signal_recv is not +// blocked. +// +// Transitions between states are done atomically with CAS. +// +// When signal_recv is unblocked, it resets sig.Note and rechecks sig.mask. +// If several sigsends and signal_recv execute concurrently, it can lead to +// unnecessary rechecks of sig.mask, but it cannot lead to missed signals +// nor deadlocks. + +//go:build !plan9 + +package runtime + +import ( + "runtime/internal/atomic" + _ "unsafe" // for go:linkname +) + +// sig handles communication between the signal handler and os/signal. +// Other than the inuse and recv fields, the fields are accessed atomically. +// +// The wanted and ignored fields are only written by one goroutine at +// a time; access is controlled by the handlers Mutex in os/signal. +// The fields are only read by that one goroutine and by the signal handler. +// We access them atomically to minimize the race between setting them +// in the goroutine calling os/signal and the signal handler, +// which may be running in a different thread. That race is unavoidable, +// as there is no connection between handling a signal and receiving one, +// but atomic instructions should minimize it. 
+var sig struct { + note note + mask [(_NSIG + 31) / 32]uint32 + wanted [(_NSIG + 31) / 32]uint32 + ignored [(_NSIG + 31) / 32]uint32 + recv [(_NSIG + 31) / 32]uint32 + state atomic.Uint32 + delivering atomic.Uint32 + inuse bool +} + +const ( + sigIdle = iota + sigReceiving + sigSending +) + +// sigsend delivers a signal from sighandler to the internal signal delivery queue. +// It reports whether the signal was sent. If not, the caller typically crashes the program. +// It runs from the signal handler, so it's limited in what it can do. +func sigsend(s uint32) bool { + bit := uint32(1) << uint(s&31) + if s >= uint32(32*len(sig.wanted)) { + return false + } + + sig.delivering.Add(1) + // We are running in the signal handler; defer is not available. + + if w := atomic.Load(&sig.wanted[s/32]); w&bit == 0 { + sig.delivering.Add(-1) + return false + } + + // Add signal to outgoing queue. + for { + mask := sig.mask[s/32] + if mask&bit != 0 { + sig.delivering.Add(-1) + return true // signal already in queue + } + if atomic.Cas(&sig.mask[s/32], mask, mask|bit) { + break + } + } + + // Notify receiver that queue has new bit. +Send: + for { + switch sig.state.Load() { + default: + throw("sigsend: inconsistent state") + case sigIdle: + if sig.state.CompareAndSwap(sigIdle, sigSending) { + break Send + } + case sigSending: + // notification already pending + break Send + case sigReceiving: + if sig.state.CompareAndSwap(sigReceiving, sigIdle) { + if GOOS == "darwin" || GOOS == "ios" { + sigNoteWakeup(&sig.note) + break Send + } + notewakeup(&sig.note) + break Send + } + } + } + + sig.delivering.Add(-1) + return true +} + +// Called to receive the next queued signal. +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_recv os/signal.signal_recv +func signal_recv() uint32 { + for { + // Serve any signals from local copy. 
+ for i := uint32(0); i < _NSIG; i++ { + if sig.recv[i/32]&(1<<(i&31)) != 0 { + sig.recv[i/32] &^= 1 << (i & 31) + return i + } + } + + // Wait for updates to be available from signal sender. + Receive: + for { + switch sig.state.Load() { + default: + throw("signal_recv: inconsistent state") + case sigIdle: + if sig.state.CompareAndSwap(sigIdle, sigReceiving) { + if GOOS == "darwin" || GOOS == "ios" { + sigNoteSleep(&sig.note) + break Receive + } + notetsleepg(&sig.note, -1) + noteclear(&sig.note) + break Receive + } + case sigSending: + if sig.state.CompareAndSwap(sigSending, sigIdle) { + break Receive + } + } + } + + // Incorporate updates from sender into local copy. + for i := range sig.mask { + sig.recv[i] = atomic.Xchg(&sig.mask[i], 0) + } + } +} + +// signalWaitUntilIdle waits until the signal delivery mechanism is idle. +// This is used to ensure that we do not drop a signal notification due +// to a race between disabling a signal and receiving a signal. +// This assumes that signal delivery has already been disabled for +// the signal(s) in question, and here we are just waiting to make sure +// that all the signals have been delivered to the user channels +// by the os/signal package. +// +//go:linkname signalWaitUntilIdle os/signal.signalWaitUntilIdle +func signalWaitUntilIdle() { + // Although the signals we care about have been removed from + // sig.wanted, it is possible that another thread has received + // a signal, has read from sig.wanted, is now updating sig.mask, + // and has not yet woken up the processor thread. We need to wait + // until all current signal deliveries have completed. + for sig.delivering.Load() != 0 { + Gosched() + } + + // Although WaitUntilIdle seems like the right name for this + // function, the state we are looking for is sigReceiving, not + // sigIdle. The sigIdle state is really more like sigProcessing. 
+ for sig.state.Load() != sigReceiving { + Gosched() + } +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_enable os/signal.signal_enable +func signal_enable(s uint32) { + if !sig.inuse { + // This is the first call to signal_enable. Initialize. + sig.inuse = true // enable reception of signals; cannot disable + if GOOS == "darwin" || GOOS == "ios" { + sigNoteSetup(&sig.note) + } else { + noteclear(&sig.note) + } + } + + if s >= uint32(len(sig.wanted)*32) { + return + } + + w := sig.wanted[s/32] + w |= 1 << (s & 31) + atomic.Store(&sig.wanted[s/32], w) + + i := sig.ignored[s/32] + i &^= 1 << (s & 31) + atomic.Store(&sig.ignored[s/32], i) + + sigenable(s) +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_disable os/signal.signal_disable +func signal_disable(s uint32) { + if s >= uint32(len(sig.wanted)*32) { + return + } + sigdisable(s) + + w := sig.wanted[s/32] + w &^= 1 << (s & 31) + atomic.Store(&sig.wanted[s/32], w) +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_ignore os/signal.signal_ignore +func signal_ignore(s uint32) { + if s >= uint32(len(sig.wanted)*32) { + return + } + sigignore(s) + + w := sig.wanted[s/32] + w &^= 1 << (s & 31) + atomic.Store(&sig.wanted[s/32], w) + + i := sig.ignored[s/32] + i |= 1 << (s & 31) + atomic.Store(&sig.ignored[s/32], i) +} + +// sigInitIgnored marks the signal as already ignored. This is called at +// program start by initsig. In a shared library initsig is called by +// libpreinit, so the runtime may not be initialized yet. +// +//go:nosplit +func sigInitIgnored(s uint32) { + i := sig.ignored[s/32] + i |= 1 << (s & 31) + atomic.Store(&sig.ignored[s/32], i) +} + +// Checked by signal handlers. 
+// +//go:linkname signal_ignored os/signal.signal_ignored +func signal_ignored(s uint32) bool { + i := atomic.Load(&sig.ignored[s/32]) + return i&(1<<(s&31)) != 0 +} diff --git a/src/runtime/sigqueue_note.go b/src/runtime/sigqueue_note.go new file mode 100644 index 0000000..fb1a517 --- /dev/null +++ b/src/runtime/sigqueue_note.go @@ -0,0 +1,24 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The current implementation of notes on Darwin is not async-signal-safe, +// so on Darwin the sigqueue code uses different functions to wake up the +// signal_recv thread. This file holds the non-Darwin implementations of +// those functions. These functions will never be called. + +//go:build !darwin && !plan9 + +package runtime + +func sigNoteSetup(*note) { + throw("sigNoteSetup") +} + +func sigNoteSleep(*note) { + throw("sigNoteSleep") +} + +func sigNoteWakeup(*note) { + throw("sigNoteWakeup") +} diff --git a/src/runtime/sigqueue_plan9.go b/src/runtime/sigqueue_plan9.go new file mode 100644 index 0000000..9ed6fb5 --- /dev/null +++ b/src/runtime/sigqueue_plan9.go @@ -0,0 +1,161 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This file implements runtime support for signal handling. + +package runtime + +import _ "unsafe" + +const qsize = 64 + +var sig struct { + q noteQueue + inuse bool + + lock mutex + note note + sleeping bool +} + +type noteData struct { + s [_ERRMAX]byte + n int // n bytes of s are valid +} + +type noteQueue struct { + lock mutex + data [qsize]noteData + ri int + wi int + full bool +} + +// It is not allowed to allocate memory in the signal handler. 
+func (q *noteQueue) push(item *byte) bool { + lock(&q.lock) + if q.full { + unlock(&q.lock) + return false + } + s := gostringnocopy(item) + copy(q.data[q.wi].s[:], s) + q.data[q.wi].n = len(s) + q.wi++ + if q.wi == qsize { + q.wi = 0 + } + if q.wi == q.ri { + q.full = true + } + unlock(&q.lock) + return true +} + +func (q *noteQueue) pop() string { + lock(&q.lock) + q.full = false + if q.ri == q.wi { + unlock(&q.lock) + return "" + } + note := &q.data[q.ri] + item := string(note.s[:note.n]) + q.ri++ + if q.ri == qsize { + q.ri = 0 + } + unlock(&q.lock) + return item +} + +// Called from sighandler to send a signal back out of the signal handling thread. +// Reports whether the signal was sent. If not, the caller typically crashes the program. +func sendNote(s *byte) bool { + if !sig.inuse { + return false + } + + // Add signal to outgoing queue. + if !sig.q.push(s) { + return false + } + + lock(&sig.lock) + if sig.sleeping { + sig.sleeping = false + notewakeup(&sig.note) + } + unlock(&sig.lock) + + return true +} + +// Called to receive the next queued signal. +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_recv os/signal.signal_recv +func signal_recv() string { + for { + note := sig.q.pop() + if note != "" { + return note + } + + lock(&sig.lock) + sig.sleeping = true + noteclear(&sig.note) + unlock(&sig.lock) + notetsleepg(&sig.note, -1) + } +} + +// signalWaitUntilIdle waits until the signal delivery mechanism is idle. +// This is used to ensure that we do not drop a signal notification due +// to a race between disabling a signal and receiving a signal. +// This assumes that signal delivery has already been disabled for +// the signal(s) in question, and here we are just waiting to make sure +// that all the signals have been delivered to the user channels +// by the os/signal package. 
+// +//go:linkname signalWaitUntilIdle os/signal.signalWaitUntilIdle +func signalWaitUntilIdle() { + for { + lock(&sig.lock) + sleeping := sig.sleeping + unlock(&sig.lock) + if sleeping { + return + } + Gosched() + } +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_enable os/signal.signal_enable +func signal_enable(s uint32) { + if !sig.inuse { + // This is the first call to signal_enable. Initialize. + sig.inuse = true // enable reception of signals; cannot disable + noteclear(&sig.note) + } +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_disable os/signal.signal_disable +func signal_disable(s uint32) { +} + +// Must only be called from a single goroutine at a time. +// +//go:linkname signal_ignore os/signal.signal_ignore +func signal_ignore(s uint32) { +} + +//go:linkname signal_ignored os/signal.signal_ignored +func signal_ignored(s uint32) bool { + return false +} diff --git a/src/runtime/sigtab_aix.go b/src/runtime/sigtab_aix.go new file mode 100644 index 0000000..42e5606 --- /dev/null +++ b/src/runtime/sigtab_aix.go @@ -0,0 +1,264 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +var sigtable = [...]sigTabT{ + 0: {0, "SIGNONE: no trap"}, + _SIGHUP: {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + _SIGINT: {_SigNotify + _SigKill, "SIGINT: interrupt"}, + _SIGQUIT: {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + _SIGILL: {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + _SIGTRAP: {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + _SIGABRT: {_SigNotify + _SigThrow, "SIGABRT: abort"}, + _SIGBUS: {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + _SIGFPE: {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + _SIGKILL: {0, "SIGKILL: kill"}, + _SIGUSR1: {_SigNotify, "SIGUSR1: user-defined signal 1"}, + _SIGSEGV: {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + _SIGUSR2: {_SigNotify, "SIGUSR2: user-defined signal 2"}, + _SIGPIPE: {_SigNotify, "SIGPIPE: write to broken pipe"}, + _SIGALRM: {_SigNotify, "SIGALRM: alarm clock"}, + _SIGTERM: {_SigNotify + _SigKill, "SIGTERM: termination"}, + _SIGCHLD: {_SigNotify + _SigUnblock, "SIGCHLD: child status has changed"}, + _SIGCONT: {_SigNotify + _SigDefault, "SIGCONT: continue"}, + _SIGSTOP: {0, "SIGSTOP: stop"}, + _SIGTSTP: {_SigNotify + _SigDefault, "SIGTSTP: keyboard stop"}, + _SIGTTIN: {_SigNotify + _SigDefault, "SIGTTIN: background read from tty"}, + _SIGTTOU: {_SigNotify + _SigDefault, "SIGTTOU: background write to tty"}, + _SIGURG: {_SigNotify, "SIGURG: urgent condition on socket"}, + _SIGXCPU: {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + _SIGXFSZ: {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + _SIGVTALRM: {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + _SIGPROF: {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + _SIGWINCH: {_SigNotify, "SIGWINCH: window size change"}, + _SIGSYS: {_SigThrow, "SIGSYS: bad system call"}, + _SIGIO: {_SigNotify, "SIGIO: i/o now possible"}, + _SIGPWR: {_SigNotify, "SIGPWR: power failure restart"}, + _SIGEMT: {_SigThrow, "SIGEMT: emulate instruction executed"}, + _SIGWAITING: {0, 
"SIGWAITING: reserved signal no longer used by"}, + 26: {_SigNotify, "signal 26"}, + 27: {_SigNotify, "signal 27"}, + 33: {_SigNotify, "signal 33"}, + 35: {_SigNotify, "signal 35"}, + 36: {_SigNotify, "signal 36"}, + 37: {_SigNotify, "signal 37"}, + 38: {_SigNotify, "signal 38"}, + 40: {_SigNotify, "signal 40"}, + 41: {_SigNotify, "signal 41"}, + 42: {_SigNotify, "signal 42"}, + 43: {_SigNotify, "signal 43"}, + 44: {_SigNotify, "signal 44"}, + 45: {_SigNotify, "signal 45"}, + 46: {_SigNotify, "signal 46"}, + 47: {_SigNotify, "signal 47"}, + 48: {_SigNotify, "signal 48"}, + 49: {_SigNotify, "signal 49"}, + 50: {_SigNotify, "signal 50"}, + 51: {_SigNotify, "signal 51"}, + 52: {_SigNotify, "signal 52"}, + 53: {_SigNotify, "signal 53"}, + 54: {_SigNotify, "signal 54"}, + 55: {_SigNotify, "signal 55"}, + 56: {_SigNotify, "signal 56"}, + 57: {_SigNotify, "signal 57"}, + 58: {_SigNotify, "signal 58"}, + 59: {_SigNotify, "signal 59"}, + 60: {_SigNotify, "signal 60"}, + 61: {_SigNotify, "signal 61"}, + 62: {_SigNotify, "signal 62"}, + 63: {_SigNotify, "signal 63"}, + 64: {_SigNotify, "signal 64"}, + 65: {_SigNotify, "signal 65"}, + 66: {_SigNotify, "signal 66"}, + 67: {_SigNotify, "signal 67"}, + 68: {_SigNotify, "signal 68"}, + 69: {_SigNotify, "signal 69"}, + 70: {_SigNotify, "signal 70"}, + 71: {_SigNotify, "signal 71"}, + 72: {_SigNotify, "signal 72"}, + 73: {_SigNotify, "signal 73"}, + 74: {_SigNotify, "signal 74"}, + 75: {_SigNotify, "signal 75"}, + 76: {_SigNotify, "signal 76"}, + 77: {_SigNotify, "signal 77"}, + 78: {_SigNotify, "signal 78"}, + 79: {_SigNotify, "signal 79"}, + 80: {_SigNotify, "signal 80"}, + 81: {_SigNotify, "signal 81"}, + 82: {_SigNotify, "signal 82"}, + 83: {_SigNotify, "signal 83"}, + 84: {_SigNotify, "signal 84"}, + 85: {_SigNotify, "signal 85"}, + 86: {_SigNotify, "signal 86"}, + 87: {_SigNotify, "signal 87"}, + 88: {_SigNotify, "signal 88"}, + 89: {_SigNotify, "signal 89"}, + 90: {_SigNotify, "signal 90"}, + 91: {_SigNotify, "signal 91"}, + 
92: {_SigNotify, "signal 92"}, + 93: {_SigNotify, "signal 93"}, + 94: {_SigNotify, "signal 94"}, + 95: {_SigNotify, "signal 95"}, + 96: {_SigNotify, "signal 96"}, + 97: {_SigNotify, "signal 97"}, + 98: {_SigNotify, "signal 98"}, + 99: {_SigNotify, "signal 99"}, + 100: {_SigNotify, "signal 100"}, + 101: {_SigNotify, "signal 101"}, + 102: {_SigNotify, "signal 102"}, + 103: {_SigNotify, "signal 103"}, + 104: {_SigNotify, "signal 104"}, + 105: {_SigNotify, "signal 105"}, + 106: {_SigNotify, "signal 106"}, + 107: {_SigNotify, "signal 107"}, + 108: {_SigNotify, "signal 108"}, + 109: {_SigNotify, "signal 109"}, + 110: {_SigNotify, "signal 110"}, + 111: {_SigNotify, "signal 111"}, + 112: {_SigNotify, "signal 112"}, + 113: {_SigNotify, "signal 113"}, + 114: {_SigNotify, "signal 114"}, + 115: {_SigNotify, "signal 115"}, + 116: {_SigNotify, "signal 116"}, + 117: {_SigNotify, "signal 117"}, + 118: {_SigNotify, "signal 118"}, + 119: {_SigNotify, "signal 119"}, + 120: {_SigNotify, "signal 120"}, + 121: {_SigNotify, "signal 121"}, + 122: {_SigNotify, "signal 122"}, + 123: {_SigNotify, "signal 123"}, + 124: {_SigNotify, "signal 124"}, + 125: {_SigNotify, "signal 125"}, + 126: {_SigNotify, "signal 126"}, + 127: {_SigNotify, "signal 127"}, + 128: {_SigNotify, "signal 128"}, + 129: {_SigNotify, "signal 129"}, + 130: {_SigNotify, "signal 130"}, + 131: {_SigNotify, "signal 131"}, + 132: {_SigNotify, "signal 132"}, + 133: {_SigNotify, "signal 133"}, + 134: {_SigNotify, "signal 134"}, + 135: {_SigNotify, "signal 135"}, + 136: {_SigNotify, "signal 136"}, + 137: {_SigNotify, "signal 137"}, + 138: {_SigNotify, "signal 138"}, + 139: {_SigNotify, "signal 139"}, + 140: {_SigNotify, "signal 140"}, + 141: {_SigNotify, "signal 141"}, + 142: {_SigNotify, "signal 142"}, + 143: {_SigNotify, "signal 143"}, + 144: {_SigNotify, "signal 144"}, + 145: {_SigNotify, "signal 145"}, + 146: {_SigNotify, "signal 146"}, + 147: {_SigNotify, "signal 147"}, + 148: {_SigNotify, "signal 148"}, + 149: {_SigNotify, 
"signal 149"}, + 150: {_SigNotify, "signal 150"}, + 151: {_SigNotify, "signal 151"}, + 152: {_SigNotify, "signal 152"}, + 153: {_SigNotify, "signal 153"}, + 154: {_SigNotify, "signal 154"}, + 155: {_SigNotify, "signal 155"}, + 156: {_SigNotify, "signal 156"}, + 157: {_SigNotify, "signal 157"}, + 158: {_SigNotify, "signal 158"}, + 159: {_SigNotify, "signal 159"}, + 160: {_SigNotify, "signal 160"}, + 161: {_SigNotify, "signal 161"}, + 162: {_SigNotify, "signal 162"}, + 163: {_SigNotify, "signal 163"}, + 164: {_SigNotify, "signal 164"}, + 165: {_SigNotify, "signal 165"}, + 166: {_SigNotify, "signal 166"}, + 167: {_SigNotify, "signal 167"}, + 168: {_SigNotify, "signal 168"}, + 169: {_SigNotify, "signal 169"}, + 170: {_SigNotify, "signal 170"}, + 171: {_SigNotify, "signal 171"}, + 172: {_SigNotify, "signal 172"}, + 173: {_SigNotify, "signal 173"}, + 174: {_SigNotify, "signal 174"}, + 175: {_SigNotify, "signal 175"}, + 176: {_SigNotify, "signal 176"}, + 177: {_SigNotify, "signal 177"}, + 178: {_SigNotify, "signal 178"}, + 179: {_SigNotify, "signal 179"}, + 180: {_SigNotify, "signal 180"}, + 181: {_SigNotify, "signal 181"}, + 182: {_SigNotify, "signal 182"}, + 183: {_SigNotify, "signal 183"}, + 184: {_SigNotify, "signal 184"}, + 185: {_SigNotify, "signal 185"}, + 186: {_SigNotify, "signal 186"}, + 187: {_SigNotify, "signal 187"}, + 188: {_SigNotify, "signal 188"}, + 189: {_SigNotify, "signal 189"}, + 190: {_SigNotify, "signal 190"}, + 191: {_SigNotify, "signal 191"}, + 192: {_SigNotify, "signal 192"}, + 193: {_SigNotify, "signal 193"}, + 194: {_SigNotify, "signal 194"}, + 195: {_SigNotify, "signal 195"}, + 196: {_SigNotify, "signal 196"}, + 197: {_SigNotify, "signal 197"}, + 198: {_SigNotify, "signal 198"}, + 199: {_SigNotify, "signal 199"}, + 200: {_SigNotify, "signal 200"}, + 201: {_SigNotify, "signal 201"}, + 202: {_SigNotify, "signal 202"}, + 203: {_SigNotify, "signal 203"}, + 204: {_SigNotify, "signal 204"}, + 205: {_SigNotify, "signal 205"}, + 206: {_SigNotify, 
"signal 206"}, + 207: {_SigNotify, "signal 207"}, + 208: {_SigNotify, "signal 208"}, + 209: {_SigNotify, "signal 209"}, + 210: {_SigNotify, "signal 210"}, + 211: {_SigNotify, "signal 211"}, + 212: {_SigNotify, "signal 212"}, + 213: {_SigNotify, "signal 213"}, + 214: {_SigNotify, "signal 214"}, + 215: {_SigNotify, "signal 215"}, + 216: {_SigNotify, "signal 216"}, + 217: {_SigNotify, "signal 217"}, + 218: {_SigNotify, "signal 218"}, + 219: {_SigNotify, "signal 219"}, + 220: {_SigNotify, "signal 220"}, + 221: {_SigNotify, "signal 221"}, + 222: {_SigNotify, "signal 222"}, + 223: {_SigNotify, "signal 223"}, + 224: {_SigNotify, "signal 224"}, + 225: {_SigNotify, "signal 225"}, + 226: {_SigNotify, "signal 226"}, + 227: {_SigNotify, "signal 227"}, + 228: {_SigNotify, "signal 228"}, + 229: {_SigNotify, "signal 229"}, + 230: {_SigNotify, "signal 230"}, + 231: {_SigNotify, "signal 231"}, + 232: {_SigNotify, "signal 232"}, + 233: {_SigNotify, "signal 233"}, + 234: {_SigNotify, "signal 234"}, + 235: {_SigNotify, "signal 235"}, + 236: {_SigNotify, "signal 236"}, + 237: {_SigNotify, "signal 237"}, + 238: {_SigNotify, "signal 238"}, + 239: {_SigNotify, "signal 239"}, + 240: {_SigNotify, "signal 240"}, + 241: {_SigNotify, "signal 241"}, + 242: {_SigNotify, "signal 242"}, + 243: {_SigNotify, "signal 243"}, + 244: {_SigNotify, "signal 244"}, + 245: {_SigNotify, "signal 245"}, + 246: {_SigNotify, "signal 246"}, + 247: {_SigNotify, "signal 247"}, + 248: {_SigNotify, "signal 248"}, + 249: {_SigNotify, "signal 249"}, + 250: {_SigNotify, "signal 250"}, + 251: {_SigNotify, "signal 251"}, + 252: {_SigNotify, "signal 252"}, + 253: {_SigNotify, "signal 253"}, + 254: {_SigNotify, "signal 254"}, + 255: {_SigNotify, "signal 255"}, +} diff --git a/src/runtime/sigtab_linux_generic.go b/src/runtime/sigtab_linux_generic.go new file mode 100644 index 0000000..fe93bba --- /dev/null +++ b/src/runtime/sigtab_linux_generic.go @@ -0,0 +1,75 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !mips && !mipsle && !mips64 && !mips64le && linux + +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigThrow + _SigUnblock, "SIGSTKFLT: stack fault"}, + /* 17 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 18 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue"}, + /* 19 */ {0, "SIGSTOP: stop, unblockable"}, + /* 20 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 21 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 22 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 23 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 24 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + /* 25 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 26 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 27 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 28 */ {_SigNotify + 
_SigIgn, "SIGWINCH: window size change"}, + /* 29 */ {_SigNotify, "SIGIO: i/o now possible"}, + /* 30 */ {_SigNotify, "SIGPWR: power failure restart"}, + /* 31 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 32 */ {_SigSetStack + _SigUnblock, "signal 32"}, /* SIGCANCEL; see issue 6997 */ + /* 33 */ {_SigSetStack + _SigUnblock, "signal 33"}, /* SIGSETXID; see issues 3871, 9400, 12498 */ + /* 34 */ {_SigSetStack + _SigUnblock, "signal 34"}, /* musl SIGSYNCCALL; see issue 39343 */ + /* 35 */ {_SigNotify, "signal 35"}, + /* 36 */ {_SigNotify, "signal 36"}, + /* 37 */ {_SigNotify, "signal 37"}, + /* 38 */ {_SigNotify, "signal 38"}, + /* 39 */ {_SigNotify, "signal 39"}, + /* 40 */ {_SigNotify, "signal 40"}, + /* 41 */ {_SigNotify, "signal 41"}, + /* 42 */ {_SigNotify, "signal 42"}, + /* 43 */ {_SigNotify, "signal 43"}, + /* 44 */ {_SigNotify, "signal 44"}, + /* 45 */ {_SigNotify, "signal 45"}, + /* 46 */ {_SigNotify, "signal 46"}, + /* 47 */ {_SigNotify, "signal 47"}, + /* 48 */ {_SigNotify, "signal 48"}, + /* 49 */ {_SigNotify, "signal 49"}, + /* 50 */ {_SigNotify, "signal 50"}, + /* 51 */ {_SigNotify, "signal 51"}, + /* 52 */ {_SigNotify, "signal 52"}, + /* 53 */ {_SigNotify, "signal 53"}, + /* 54 */ {_SigNotify, "signal 54"}, + /* 55 */ {_SigNotify, "signal 55"}, + /* 56 */ {_SigNotify, "signal 56"}, + /* 57 */ {_SigNotify, "signal 57"}, + /* 58 */ {_SigNotify, "signal 58"}, + /* 59 */ {_SigNotify, "signal 59"}, + /* 60 */ {_SigNotify, "signal 60"}, + /* 61 */ {_SigNotify, "signal 61"}, + /* 62 */ {_SigNotify, "signal 62"}, + /* 63 */ {_SigNotify, "signal 63"}, + /* 64 */ {_SigNotify, "signal 64"}, +} diff --git a/src/runtime/sigtab_linux_mipsx.go b/src/runtime/sigtab_linux_mipsx.go new file mode 100644 index 0000000..295ced5 --- /dev/null +++ b/src/runtime/sigtab_linux_mipsx.go @@ -0,0 +1,139 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build (mips || mipsle || mips64 || mips64le) && linux + +package runtime + +var sigtable = [...]sigTabT{ + /* 0 */ {0, "SIGNONE: no trap"}, + /* 1 */ {_SigNotify + _SigKill, "SIGHUP: terminal line hangup"}, + /* 2 */ {_SigNotify + _SigKill, "SIGINT: interrupt"}, + /* 3 */ {_SigNotify + _SigThrow, "SIGQUIT: quit"}, + /* 4 */ {_SigThrow + _SigUnblock, "SIGILL: illegal instruction"}, + /* 5 */ {_SigThrow + _SigUnblock, "SIGTRAP: trace trap"}, + /* 6 */ {_SigNotify + _SigThrow, "SIGABRT: abort"}, + /* 7 */ {_SigThrow, "SIGEMT"}, + /* 8 */ {_SigPanic + _SigUnblock, "SIGFPE: floating-point exception"}, + /* 9 */ {0, "SIGKILL: kill"}, + /* 10 */ {_SigPanic + _SigUnblock, "SIGBUS: bus error"}, + /* 11 */ {_SigPanic + _SigUnblock, "SIGSEGV: segmentation violation"}, + /* 12 */ {_SigThrow, "SIGSYS: bad system call"}, + /* 13 */ {_SigNotify, "SIGPIPE: write to broken pipe"}, + /* 14 */ {_SigNotify, "SIGALRM: alarm clock"}, + /* 15 */ {_SigNotify + _SigKill, "SIGTERM: termination"}, + /* 16 */ {_SigNotify, "SIGUSR1: user-defined signal 1"}, + /* 17 */ {_SigNotify, "SIGUSR2: user-defined signal 2"}, + /* 18 */ {_SigNotify + _SigUnblock + _SigIgn, "SIGCHLD: child status has changed"}, + /* 19 */ {_SigNotify, "SIGPWR: power failure restart"}, + /* 20 */ {_SigNotify + _SigIgn, "SIGWINCH: window size change"}, + /* 21 */ {_SigNotify + _SigIgn, "SIGURG: urgent condition on socket"}, + /* 22 */ {_SigNotify, "SIGIO: i/o now possible"}, + /* 23 */ {0, "SIGSTOP: stop, unblockable"}, + /* 24 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTSTP: keyboard stop"}, + /* 25 */ {_SigNotify + _SigDefault + _SigIgn, "SIGCONT: continue"}, + /* 26 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTIN: background read from tty"}, + /* 27 */ {_SigNotify + _SigDefault + _SigIgn, "SIGTTOU: background write to tty"}, + /* 28 */ {_SigNotify, "SIGVTALRM: virtual alarm clock"}, + /* 29 */ {_SigNotify + _SigUnblock, "SIGPROF: profiling alarm clock"}, + /* 30 */ {_SigNotify, "SIGXCPU: cpu limit exceeded"}, + 
/* 31 */ {_SigNotify, "SIGXFSZ: file size limit exceeded"}, + /* 32 */ {_SigSetStack + _SigUnblock, "signal 32"}, /* SIGCANCEL; see issue 6997 */ + /* 33 */ {_SigSetStack + _SigUnblock, "signal 33"}, /* SIGSETXID; see issues 3871, 9400, 12498 */ + /* 34 */ {_SigSetStack + _SigUnblock, "signal 34"}, /* musl SIGSYNCCALL; see issue 39343 */ + /* 35 */ {_SigNotify, "signal 35"}, + /* 36 */ {_SigNotify, "signal 36"}, + /* 37 */ {_SigNotify, "signal 37"}, + /* 38 */ {_SigNotify, "signal 38"}, + /* 39 */ {_SigNotify, "signal 39"}, + /* 40 */ {_SigNotify, "signal 40"}, + /* 41 */ {_SigNotify, "signal 41"}, + /* 42 */ {_SigNotify, "signal 42"}, + /* 43 */ {_SigNotify, "signal 43"}, + /* 44 */ {_SigNotify, "signal 44"}, + /* 45 */ {_SigNotify, "signal 45"}, + /* 46 */ {_SigNotify, "signal 46"}, + /* 47 */ {_SigNotify, "signal 47"}, + /* 48 */ {_SigNotify, "signal 48"}, + /* 49 */ {_SigNotify, "signal 49"}, + /* 50 */ {_SigNotify, "signal 50"}, + /* 51 */ {_SigNotify, "signal 51"}, + /* 52 */ {_SigNotify, "signal 52"}, + /* 53 */ {_SigNotify, "signal 53"}, + /* 54 */ {_SigNotify, "signal 54"}, + /* 55 */ {_SigNotify, "signal 55"}, + /* 56 */ {_SigNotify, "signal 56"}, + /* 57 */ {_SigNotify, "signal 57"}, + /* 58 */ {_SigNotify, "signal 58"}, + /* 59 */ {_SigNotify, "signal 59"}, + /* 60 */ {_SigNotify, "signal 60"}, + /* 61 */ {_SigNotify, "signal 61"}, + /* 62 */ {_SigNotify, "signal 62"}, + /* 63 */ {_SigNotify, "signal 63"}, + /* 64 */ {_SigNotify, "signal 64"}, + /* 65 */ {_SigNotify, "signal 65"}, + /* 66 */ {_SigNotify, "signal 66"}, + /* 67 */ {_SigNotify, "signal 67"}, + /* 68 */ {_SigNotify, "signal 68"}, + /* 69 */ {_SigNotify, "signal 69"}, + /* 70 */ {_SigNotify, "signal 70"}, + /* 71 */ {_SigNotify, "signal 71"}, + /* 72 */ {_SigNotify, "signal 72"}, + /* 73 */ {_SigNotify, "signal 73"}, + /* 74 */ {_SigNotify, "signal 74"}, + /* 75 */ {_SigNotify, "signal 75"}, + /* 76 */ {_SigNotify, "signal 76"}, + /* 77 */ {_SigNotify, "signal 77"}, + /* 78 */ {_SigNotify, 
"signal 78"}, + /* 79 */ {_SigNotify, "signal 79"}, + /* 80 */ {_SigNotify, "signal 80"}, + /* 81 */ {_SigNotify, "signal 81"}, + /* 82 */ {_SigNotify, "signal 82"}, + /* 83 */ {_SigNotify, "signal 83"}, + /* 84 */ {_SigNotify, "signal 84"}, + /* 85 */ {_SigNotify, "signal 85"}, + /* 86 */ {_SigNotify, "signal 86"}, + /* 87 */ {_SigNotify, "signal 87"}, + /* 88 */ {_SigNotify, "signal 88"}, + /* 89 */ {_SigNotify, "signal 89"}, + /* 90 */ {_SigNotify, "signal 90"}, + /* 91 */ {_SigNotify, "signal 91"}, + /* 92 */ {_SigNotify, "signal 92"}, + /* 93 */ {_SigNotify, "signal 93"}, + /* 94 */ {_SigNotify, "signal 94"}, + /* 95 */ {_SigNotify, "signal 95"}, + /* 96 */ {_SigNotify, "signal 96"}, + /* 97 */ {_SigNotify, "signal 97"}, + /* 98 */ {_SigNotify, "signal 98"}, + /* 99 */ {_SigNotify, "signal 99"}, + /* 100 */ {_SigNotify, "signal 100"}, + /* 101 */ {_SigNotify, "signal 101"}, + /* 102 */ {_SigNotify, "signal 102"}, + /* 103 */ {_SigNotify, "signal 103"}, + /* 104 */ {_SigNotify, "signal 104"}, + /* 105 */ {_SigNotify, "signal 105"}, + /* 106 */ {_SigNotify, "signal 106"}, + /* 107 */ {_SigNotify, "signal 107"}, + /* 108 */ {_SigNotify, "signal 108"}, + /* 109 */ {_SigNotify, "signal 109"}, + /* 110 */ {_SigNotify, "signal 110"}, + /* 111 */ {_SigNotify, "signal 111"}, + /* 112 */ {_SigNotify, "signal 112"}, + /* 113 */ {_SigNotify, "signal 113"}, + /* 114 */ {_SigNotify, "signal 114"}, + /* 115 */ {_SigNotify, "signal 115"}, + /* 116 */ {_SigNotify, "signal 116"}, + /* 117 */ {_SigNotify, "signal 117"}, + /* 118 */ {_SigNotify, "signal 118"}, + /* 119 */ {_SigNotify, "signal 119"}, + /* 120 */ {_SigNotify, "signal 120"}, + /* 121 */ {_SigNotify, "signal 121"}, + /* 122 */ {_SigNotify, "signal 122"}, + /* 123 */ {_SigNotify, "signal 123"}, + /* 124 */ {_SigNotify, "signal 124"}, + /* 125 */ {_SigNotify, "signal 125"}, + /* 126 */ {_SigNotify, "signal 126"}, + /* 127 */ {_SigNotify, "signal 127"}, + /* 128 */ {_SigNotify, "signal 128"}, +} diff --git 
a/src/runtime/sizeclasses.go b/src/runtime/sizeclasses.go new file mode 100644 index 0000000..067871e --- /dev/null +++ b/src/runtime/sizeclasses.go @@ -0,0 +1,97 @@ +// Code generated by mksizeclasses.go; DO NOT EDIT. +//go:generate go run mksizeclasses.go + +package runtime + +// class bytes/obj bytes/span objects tail waste max waste min align +// 1 8 8192 1024 0 87.50% 8 +// 2 16 8192 512 0 43.75% 16 +// 3 24 8192 341 8 29.24% 8 +// 4 32 8192 256 0 21.88% 32 +// 5 48 8192 170 32 31.52% 16 +// 6 64 8192 128 0 23.44% 64 +// 7 80 8192 102 32 19.07% 16 +// 8 96 8192 85 32 15.95% 32 +// 9 112 8192 73 16 13.56% 16 +// 10 128 8192 64 0 11.72% 128 +// 11 144 8192 56 128 11.82% 16 +// 12 160 8192 51 32 9.73% 32 +// 13 176 8192 46 96 9.59% 16 +// 14 192 8192 42 128 9.25% 64 +// 15 208 8192 39 80 8.12% 16 +// 16 224 8192 36 128 8.15% 32 +// 17 240 8192 34 32 6.62% 16 +// 18 256 8192 32 0 5.86% 256 +// 19 288 8192 28 128 12.16% 32 +// 20 320 8192 25 192 11.80% 64 +// 21 352 8192 23 96 9.88% 32 +// 22 384 8192 21 128 9.51% 128 +// 23 416 8192 19 288 10.71% 32 +// 24 448 8192 18 128 8.37% 64 +// 25 480 8192 17 32 6.82% 32 +// 26 512 8192 16 0 6.05% 512 +// 27 576 8192 14 128 12.33% 64 +// 28 640 8192 12 512 15.48% 128 +// 29 704 8192 11 448 13.93% 64 +// 30 768 8192 10 512 13.94% 256 +// 31 896 8192 9 128 15.52% 128 +// 32 1024 8192 8 0 12.40% 1024 +// 33 1152 8192 7 128 12.41% 128 +// 34 1280 8192 6 512 15.55% 256 +// 35 1408 16384 11 896 14.00% 128 +// 36 1536 8192 5 512 14.00% 512 +// 37 1792 16384 9 256 15.57% 256 +// 38 2048 8192 4 0 12.45% 2048 +// 39 2304 16384 7 256 12.46% 256 +// 40 2688 8192 3 128 15.59% 128 +// 41 3072 24576 8 0 12.47% 1024 +// 42 3200 16384 5 384 6.22% 128 +// 43 3456 24576 7 384 8.83% 128 +// 44 4096 8192 2 0 15.60% 4096 +// 45 4864 24576 5 256 16.65% 256 +// 46 5376 16384 3 256 10.92% 256 +// 47 6144 24576 4 0 12.48% 2048 +// 48 6528 32768 5 128 6.23% 128 +// 49 6784 40960 6 256 4.36% 128 +// 50 6912 49152 7 768 3.37% 256 +// 51 8192 8192 1 0 
15.61% 8192 +// 52 9472 57344 6 512 14.28% 256 +// 53 9728 49152 5 512 3.64% 512 +// 54 10240 40960 4 0 4.99% 2048 +// 55 10880 32768 3 128 6.24% 128 +// 56 12288 24576 2 0 11.45% 4096 +// 57 13568 40960 3 256 9.99% 256 +// 58 14336 57344 4 0 5.35% 2048 +// 59 16384 16384 1 0 12.49% 8192 +// 60 18432 73728 4 0 11.11% 2048 +// 61 19072 57344 3 128 3.57% 128 +// 62 20480 40960 2 0 6.87% 4096 +// 63 21760 65536 3 256 6.25% 256 +// 64 24576 24576 1 0 11.45% 8192 +// 65 27264 81920 3 128 10.00% 128 +// 66 28672 57344 2 0 4.91% 4096 +// 67 32768 32768 1 0 12.50% 8192 + +// alignment bits min obj size +// 8 3 8 +// 16 4 32 +// 32 5 256 +// 64 6 512 +// 128 7 768 +// 4096 12 28672 +// 8192 13 32768 + +const ( + _MaxSmallSize = 32768 + smallSizeDiv = 8 + smallSizeMax = 1024 + largeSizeDiv = 128 + _NumSizeClasses = 68 + _PageShift = 13 +) + +var class_to_size = [_NumSizeClasses]uint16{0, 8, 16, 24, 32, 48, 64, 80, 96, 112, 128, 144, 160, 176, 192, 208, 224, 240, 256, 288, 320, 352, 384, 416, 448, 480, 512, 576, 640, 704, 768, 896, 1024, 1152, 1280, 1408, 1536, 1792, 2048, 2304, 2688, 3072, 3200, 3456, 4096, 4864, 5376, 6144, 6528, 6784, 6912, 8192, 9472, 9728, 10240, 10880, 12288, 13568, 14336, 16384, 18432, 19072, 20480, 21760, 24576, 27264, 28672, 32768} +var class_to_allocnpages = [_NumSizeClasses]uint8{0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 2, 1, 2, 1, 3, 2, 3, 1, 3, 2, 3, 4, 5, 6, 1, 7, 6, 5, 4, 3, 5, 7, 2, 9, 7, 5, 8, 3, 10, 7, 4} +var class_to_divmagic = [_NumSizeClasses]uint32{0, ^uint32(0)/8 + 1, ^uint32(0)/16 + 1, ^uint32(0)/24 + 1, ^uint32(0)/32 + 1, ^uint32(0)/48 + 1, ^uint32(0)/64 + 1, ^uint32(0)/80 + 1, ^uint32(0)/96 + 1, ^uint32(0)/112 + 1, ^uint32(0)/128 + 1, ^uint32(0)/144 + 1, ^uint32(0)/160 + 1, ^uint32(0)/176 + 1, ^uint32(0)/192 + 1, ^uint32(0)/208 + 1, ^uint32(0)/224 + 1, ^uint32(0)/240 + 1, ^uint32(0)/256 + 1, ^uint32(0)/288 + 1, ^uint32(0)/320 + 1, ^uint32(0)/352 + 1, 
^uint32(0)/384 + 1, ^uint32(0)/416 + 1, ^uint32(0)/448 + 1, ^uint32(0)/480 + 1, ^uint32(0)/512 + 1, ^uint32(0)/576 + 1, ^uint32(0)/640 + 1, ^uint32(0)/704 + 1, ^uint32(0)/768 + 1, ^uint32(0)/896 + 1, ^uint32(0)/1024 + 1, ^uint32(0)/1152 + 1, ^uint32(0)/1280 + 1, ^uint32(0)/1408 + 1, ^uint32(0)/1536 + 1, ^uint32(0)/1792 + 1, ^uint32(0)/2048 + 1, ^uint32(0)/2304 + 1, ^uint32(0)/2688 + 1, ^uint32(0)/3072 + 1, ^uint32(0)/3200 + 1, ^uint32(0)/3456 + 1, ^uint32(0)/4096 + 1, ^uint32(0)/4864 + 1, ^uint32(0)/5376 + 1, ^uint32(0)/6144 + 1, ^uint32(0)/6528 + 1, ^uint32(0)/6784 + 1, ^uint32(0)/6912 + 1, ^uint32(0)/8192 + 1, ^uint32(0)/9472 + 1, ^uint32(0)/9728 + 1, ^uint32(0)/10240 + 1, ^uint32(0)/10880 + 1, ^uint32(0)/12288 + 1, ^uint32(0)/13568 + 1, ^uint32(0)/14336 + 1, ^uint32(0)/16384 + 1, ^uint32(0)/18432 + 1, ^uint32(0)/19072 + 1, ^uint32(0)/20480 + 1, ^uint32(0)/21760 + 1, ^uint32(0)/24576 + 1, ^uint32(0)/27264 + 1, ^uint32(0)/28672 + 1, ^uint32(0)/32768 + 1} +var size_to_class8 = [smallSizeMax/smallSizeDiv + 1]uint8{0, 1, 2, 3, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12, 13, 13, 14, 14, 15, 15, 16, 16, 17, 17, 18, 18, 19, 19, 19, 19, 20, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 25, 25, 25, 25, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32, 32} +var size_to_class128 = [(_MaxSmallSize-smallSizeMax)/largeSizeDiv + 1]uint8{32, 33, 34, 35, 36, 37, 37, 38, 38, 39, 39, 40, 40, 40, 41, 41, 41, 42, 43, 43, 44, 44, 44, 44, 44, 45, 45, 45, 45, 45, 45, 46, 46, 46, 46, 47, 47, 47, 47, 47, 47, 48, 48, 48, 49, 49, 50, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 53, 53, 54, 54, 54, 54, 55, 55, 55, 55, 55, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 57, 57, 57, 57, 57, 57, 57, 57, 57, 
57, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61, 61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67} diff --git a/src/runtime/sizeof_test.go b/src/runtime/sizeof_test.go new file mode 100644 index 0000000..9ce0a3a --- /dev/null +++ b/src/runtime/sizeof_test.go @@ -0,0 +1,38 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "reflect" + "runtime" + "testing" + "unsafe" +) + +// Assert that the size of important structures do not change unexpectedly. + +func TestSizeof(t *testing.T) { + const _64bit = unsafe.Sizeof(uintptr(0)) == 8 + + var tests = []struct { + val any // type as a value + _32bit uintptr // size on 32bit platforms + _64bit uintptr // size on 64bit platforms + }{ + {runtime.G{}, 240, 392}, // g, but exported for testing + {runtime.Sudog{}, 56, 88}, // sudog, but exported for testing + } + + for _, tt := range tests { + want := tt._32bit + if _64bit { + want = tt._64bit + } + got := reflect.TypeOf(tt.val).Size() + if want != got { + t.Errorf("unsafe.Sizeof(%T) = %d, want %d", tt.val, got, want) + } + } +} diff --git a/src/runtime/slice.go b/src/runtime/slice.go new file mode 100644 index 0000000..459dc88 --- /dev/null +++ b/src/runtime/slice.go @@ -0,0 +1,347 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/math" + "runtime/internal/sys" + "unsafe" +) + +type slice struct { + array unsafe.Pointer + len int + cap int +} + +// A notInHeapSlice is a slice backed by runtime/internal/sys.NotInHeap memory. +type notInHeapSlice struct { + array *notInHeap + len int + cap int +} + +func panicmakeslicelen() { + panic(errorString("makeslice: len out of range")) +} + +func panicmakeslicecap() { + panic(errorString("makeslice: cap out of range")) +} + +// makeslicecopy allocates a slice of "tolen" elements of type "et", +// then copies "fromlen" elements of type "et" into that new allocation from "from". +func makeslicecopy(et *_type, tolen int, fromlen int, from unsafe.Pointer) unsafe.Pointer { + var tomem, copymem uintptr + if uintptr(tolen) > uintptr(fromlen) { + var overflow bool + tomem, overflow = math.MulUintptr(et.size, uintptr(tolen)) + if overflow || tomem > maxAlloc || tolen < 0 { + panicmakeslicelen() + } + copymem = et.size * uintptr(fromlen) + } else { + // fromlen is a known good length providing and equal or greater than tolen, + // thereby making tolen a good slice length too as from and to slices have the + // same element width. + tomem = et.size * uintptr(tolen) + copymem = tomem + } + + var to unsafe.Pointer + if et.ptrdata == 0 { + to = mallocgc(tomem, nil, false) + if copymem < tomem { + memclrNoHeapPointers(add(to, copymem), tomem-copymem) + } + } else { + // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory. + to = mallocgc(tomem, et, true) + if copymem > 0 && writeBarrier.enabled { + // Only shade the pointers in old.array since we know the destination slice to + // only contains nil pointers because it has been cleared during alloc. 
+ bulkBarrierPreWriteSrcOnly(uintptr(to), uintptr(from), copymem) + } + } + + if raceenabled { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(makeslicecopy) + racereadrangepc(from, copymem, callerpc, pc) + } + if msanenabled { + msanread(from, copymem) + } + if asanenabled { + asanread(from, copymem) + } + + memmove(to, from, copymem) + + return to +} + +func makeslice(et *_type, len, cap int) unsafe.Pointer { + mem, overflow := math.MulUintptr(et.size, uintptr(cap)) + if overflow || mem > maxAlloc || len < 0 || len > cap { + // NOTE: Produce a 'len out of range' error instead of a + // 'cap out of range' error when someone does make([]T, bignumber). + // 'cap out of range' is true too, but since the cap is only being + // supplied implicitly, saying len is clearer. + // See golang.org/issue/4085. + mem, overflow := math.MulUintptr(et.size, uintptr(len)) + if overflow || mem > maxAlloc || len < 0 { + panicmakeslicelen() + } + panicmakeslicecap() + } + + return mallocgc(mem, et, true) +} + +func makeslice64(et *_type, len64, cap64 int64) unsafe.Pointer { + len := int(len64) + if int64(len) != len64 { + panicmakeslicelen() + } + + cap := int(cap64) + if int64(cap) != cap64 { + panicmakeslicecap() + } + + return makeslice(et, len, cap) +} + +// This is a wrapper over runtime/internal/math.MulUintptr, +// so the compiler can recognize and treat it as an intrinsic. +func mulUintptr(a, b uintptr) (uintptr, bool) { + return math.MulUintptr(a, b) +} + +// growslice allocates new backing store for a slice. +// +// arguments: +// +// oldPtr = pointer to the slice's backing array +// newLen = new length (= oldLen + num) +// oldCap = original slice's capacity. +// num = number of elements being added +// et = element type +// +// return values: +// +// newPtr = pointer to the new backing store +// newLen = same value as the argument +// newCap = capacity of the new backing store +// +// Requires that uint(newLen) > uint(oldCap). 
+// Assumes the original slice length is newLen - num +// +// A new backing store is allocated with space for at least newLen elements. +// Existing entries [0, oldLen) are copied over to the new backing store. +// Added entries [oldLen, newLen) are not initialized by growslice +// (although for pointer-containing element types, they are zeroed). They +// must be initialized by the caller. +// Trailing entries [newLen, newCap) are zeroed. +// +// growslice's odd calling convention makes the generated code that calls +// this function simpler. In particular, it accepts and returns the +// new length so that the old length is not live (does not need to be +// spilled/restored) and the new length is returned (also does not need +// to be spilled/restored). +func growslice(oldPtr unsafe.Pointer, newLen, oldCap, num int, et *_type) slice { + oldLen := newLen - num + if raceenabled { + callerpc := getcallerpc() + racereadrangepc(oldPtr, uintptr(oldLen*int(et.size)), callerpc, abi.FuncPCABIInternal(growslice)) + } + if msanenabled { + msanread(oldPtr, uintptr(oldLen*int(et.size))) + } + if asanenabled { + asanread(oldPtr, uintptr(oldLen*int(et.size))) + } + + if newLen < 0 { + panic(errorString("growslice: len out of range")) + } + + if et.size == 0 { + // append should not create a slice with nil pointer but non-zero len. + // We assume that append doesn't need to preserve oldPtr in this case. + return slice{unsafe.Pointer(&zerobase), newLen, newLen} + } + + newcap := oldCap + doublecap := newcap + newcap + if newLen > doublecap { + newcap = newLen + } else { + const threshold = 256 + if oldCap < threshold { + newcap = doublecap + } else { + // Check 0 < newcap to detect overflow + // and prevent an infinite loop. + for 0 < newcap && newcap < newLen { + // Transition from growing 2x for small slices + // to growing 1.25x for large slices. This formula + // gives a smooth-ish transition between the two. 
+ newcap += (newcap + 3*threshold) / 4 + } + // Set newcap to the requested cap when + // the newcap calculation overflowed. + if newcap <= 0 { + newcap = newLen + } + } + } + + var overflow bool + var lenmem, newlenmem, capmem uintptr + // Specialize for common values of et.size. + // For 1 we don't need any division/multiplication. + // For goarch.PtrSize, compiler will optimize division/multiplication into a shift by a constant. + // For powers of 2, use a variable shift. + switch { + case et.size == 1: + lenmem = uintptr(oldLen) + newlenmem = uintptr(newLen) + capmem = roundupsize(uintptr(newcap)) + overflow = uintptr(newcap) > maxAlloc + newcap = int(capmem) + case et.size == goarch.PtrSize: + lenmem = uintptr(oldLen) * goarch.PtrSize + newlenmem = uintptr(newLen) * goarch.PtrSize + capmem = roundupsize(uintptr(newcap) * goarch.PtrSize) + overflow = uintptr(newcap) > maxAlloc/goarch.PtrSize + newcap = int(capmem / goarch.PtrSize) + case isPowerOfTwo(et.size): + var shift uintptr + if goarch.PtrSize == 8 { + // Mask shift for better code generation. 
+ shift = uintptr(sys.TrailingZeros64(uint64(et.size))) & 63 + } else { + shift = uintptr(sys.TrailingZeros32(uint32(et.size))) & 31 + } + lenmem = uintptr(oldLen) << shift + newlenmem = uintptr(newLen) << shift + capmem = roundupsize(uintptr(newcap) << shift) + overflow = uintptr(newcap) > (maxAlloc >> shift) + newcap = int(capmem >> shift) + capmem = uintptr(newcap) << shift + default: + lenmem = uintptr(oldLen) * et.size + newlenmem = uintptr(newLen) * et.size + capmem, overflow = math.MulUintptr(et.size, uintptr(newcap)) + capmem = roundupsize(capmem) + newcap = int(capmem / et.size) + capmem = uintptr(newcap) * et.size + } + + // The check of overflow in addition to capmem > maxAlloc is needed + // to prevent an overflow which can be used to trigger a segfault + // on 32bit architectures with this example program: + // + // type T [1<<27 + 1]int64 + // + // var d T + // var s []T + // + // func main() { + // s = append(s, d, d, d, d) + // print(len(s), "\n") + // } + if overflow || capmem > maxAlloc { + panic(errorString("growslice: len out of range")) + } + + var p unsafe.Pointer + if et.ptrdata == 0 { + p = mallocgc(capmem, nil, false) + // The append() that calls growslice is going to overwrite from oldLen to newLen. + // Only clear the part that will not be overwritten. + // The reflect_growslice() that calls growslice will manually clear + // the region not cleared here. + memclrNoHeapPointers(add(p, newlenmem), capmem-newlenmem) + } else { + // Note: can't use rawmem (which avoids zeroing of memory), because then GC can scan uninitialized memory. + p = mallocgc(capmem, et, true) + if lenmem > 0 && writeBarrier.enabled { + // Only shade the pointers in oldPtr since we know the destination slice p + // only contains nil pointers because it has been cleared during alloc. 
+ bulkBarrierPreWriteSrcOnly(uintptr(p), uintptr(oldPtr), lenmem-et.size+et.ptrdata) + } + } + memmove(p, oldPtr, lenmem) + + return slice{p, newLen, newcap} +} + +//go:linkname reflect_growslice reflect.growslice +func reflect_growslice(et *_type, old slice, num int) slice { + // Semantically equivalent to slices.Grow, except that the caller + // is responsible for ensuring that old.len+num > old.cap. + num -= old.cap - old.len // preserve memory of old[old.len:old.cap] + new := growslice(old.array, old.cap+num, old.cap, num, et) + // growslice does not zero out new[old.cap:new.len] since it assumes that + // the memory will be overwritten by an append() that called growslice. + // Since the caller of reflect_growslice is not append(), + // zero out this region before returning the slice to the reflect package. + if et.ptrdata == 0 { + oldcapmem := uintptr(old.cap) * et.size + newlenmem := uintptr(new.len) * et.size + memclrNoHeapPointers(add(new.array, oldcapmem), newlenmem-oldcapmem) + } + new.len = old.len // preserve the old length + return new +} + +func isPowerOfTwo(x uintptr) bool { + return x&(x-1) == 0 +} + +// slicecopy is used to copy from a string or slice of pointerless elements into a slice. +func slicecopy(toPtr unsafe.Pointer, toLen int, fromPtr unsafe.Pointer, fromLen int, width uintptr) int { + if fromLen == 0 || toLen == 0 { + return 0 + } + + n := fromLen + if toLen < n { + n = toLen + } + + if width == 0 { + return n + } + + size := uintptr(n) * width + if raceenabled { + callerpc := getcallerpc() + pc := abi.FuncPCABIInternal(slicecopy) + racereadrangepc(fromPtr, size, callerpc, pc) + racewriterangepc(toPtr, size, callerpc, pc) + } + if msanenabled { + msanread(fromPtr, size) + msanwrite(toPtr, size) + } + if asanenabled { + asanread(fromPtr, size) + asanwrite(toPtr, size) + } + + if size == 1 { // common case worth about 2x to do here + // TODO: is this still worth it with new memmove impl? 
+ *(*byte)(toPtr) = *(*byte)(fromPtr) // known to be a byte pointer + } else { + memmove(toPtr, fromPtr, size) + } + return n +} diff --git a/src/runtime/slice_test.go b/src/runtime/slice_test.go new file mode 100644 index 0000000..cd2bc26 --- /dev/null +++ b/src/runtime/slice_test.go @@ -0,0 +1,501 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "testing" +) + +const N = 20 + +func BenchmarkMakeSliceCopy(b *testing.B) { + const length = 32 + var bytes = make([]byte, 8*length) + var ints = make([]int, length) + var ptrs = make([]*byte, length) + b.Run("mallocmove", func(b *testing.B) { + b.Run("Byte", func(b *testing.B) { + var x []byte + for i := 0; i < b.N; i++ { + x = make([]byte, len(bytes)) + copy(x, bytes) + } + }) + b.Run("Int", func(b *testing.B) { + var x []int + for i := 0; i < b.N; i++ { + x = make([]int, len(ints)) + copy(x, ints) + } + }) + b.Run("Ptr", func(b *testing.B) { + var x []*byte + for i := 0; i < b.N; i++ { + x = make([]*byte, len(ptrs)) + copy(x, ptrs) + } + + }) + }) + b.Run("makecopy", func(b *testing.B) { + b.Run("Byte", func(b *testing.B) { + var x []byte + for i := 0; i < b.N; i++ { + x = make([]byte, 8*length) + copy(x, bytes) + } + }) + b.Run("Int", func(b *testing.B) { + var x []int + for i := 0; i < b.N; i++ { + x = make([]int, length) + copy(x, ints) + } + }) + b.Run("Ptr", func(b *testing.B) { + var x []*byte + for i := 0; i < b.N; i++ { + x = make([]*byte, length) + copy(x, ptrs) + } + + }) + }) + b.Run("nilappend", func(b *testing.B) { + b.Run("Byte", func(b *testing.B) { + var x []byte + for i := 0; i < b.N; i++ { + x = append([]byte(nil), bytes...) + _ = x + } + }) + b.Run("Int", func(b *testing.B) { + var x []int + for i := 0; i < b.N; i++ { + x = append([]int(nil), ints...) 
+ _ = x + } + }) + b.Run("Ptr", func(b *testing.B) { + var x []*byte + for i := 0; i < b.N; i++ { + x = append([]*byte(nil), ptrs...) + _ = x + } + }) + }) +} + +type ( + struct24 struct{ a, b, c int64 } + struct32 struct{ a, b, c, d int64 } + struct40 struct{ a, b, c, d, e int64 } +) + +func BenchmarkMakeSlice(b *testing.B) { + const length = 2 + b.Run("Byte", func(b *testing.B) { + var x []byte + for i := 0; i < b.N; i++ { + x = make([]byte, length, 2*length) + _ = x + } + }) + b.Run("Int16", func(b *testing.B) { + var x []int16 + for i := 0; i < b.N; i++ { + x = make([]int16, length, 2*length) + _ = x + } + }) + b.Run("Int", func(b *testing.B) { + var x []int + for i := 0; i < b.N; i++ { + x = make([]int, length, 2*length) + _ = x + } + }) + b.Run("Ptr", func(b *testing.B) { + var x []*byte + for i := 0; i < b.N; i++ { + x = make([]*byte, length, 2*length) + _ = x + } + }) + b.Run("Struct", func(b *testing.B) { + b.Run("24", func(b *testing.B) { + var x []struct24 + for i := 0; i < b.N; i++ { + x = make([]struct24, length, 2*length) + _ = x + } + }) + b.Run("32", func(b *testing.B) { + var x []struct32 + for i := 0; i < b.N; i++ { + x = make([]struct32, length, 2*length) + _ = x + } + }) + b.Run("40", func(b *testing.B) { + var x []struct40 + for i := 0; i < b.N; i++ { + x = make([]struct40, length, 2*length) + _ = x + } + }) + + }) +} + +func BenchmarkGrowSlice(b *testing.B) { + b.Run("Byte", func(b *testing.B) { + x := make([]byte, 9) + for i := 0; i < b.N; i++ { + _ = append([]byte(nil), x...) + } + }) + b.Run("Int16", func(b *testing.B) { + x := make([]int16, 9) + for i := 0; i < b.N; i++ { + _ = append([]int16(nil), x...) + } + }) + b.Run("Int", func(b *testing.B) { + x := make([]int, 9) + for i := 0; i < b.N; i++ { + _ = append([]int(nil), x...) + } + }) + b.Run("Ptr", func(b *testing.B) { + x := make([]*byte, 9) + for i := 0; i < b.N; i++ { + _ = append([]*byte(nil), x...) 
+ } + }) + b.Run("Struct", func(b *testing.B) { + b.Run("24", func(b *testing.B) { + x := make([]struct24, 9) + for i := 0; i < b.N; i++ { + _ = append([]struct24(nil), x...) + } + }) + b.Run("32", func(b *testing.B) { + x := make([]struct32, 9) + for i := 0; i < b.N; i++ { + _ = append([]struct32(nil), x...) + } + }) + b.Run("40", func(b *testing.B) { + x := make([]struct40, 9) + for i := 0; i < b.N; i++ { + _ = append([]struct40(nil), x...) + } + }) + + }) +} + +var ( + SinkIntSlice []int + SinkIntPointerSlice []*int +) + +func BenchmarkExtendSlice(b *testing.B) { + var length = 4 // Use a variable to prevent stack allocation of slices. + b.Run("IntSlice", func(b *testing.B) { + s := make([]int, 0, length) + for i := 0; i < b.N; i++ { + s = append(s[:0:length/2], make([]int, length)...) + } + SinkIntSlice = s + }) + b.Run("PointerSlice", func(b *testing.B) { + s := make([]*int, 0, length) + for i := 0; i < b.N; i++ { + s = append(s[:0:length/2], make([]*int, length)...) + } + SinkIntPointerSlice = s + }) + b.Run("NoGrow", func(b *testing.B) { + s := make([]int, 0, length) + for i := 0; i < b.N; i++ { + s = append(s[:0:length], make([]int, length)...) 
+ } + SinkIntSlice = s + }) +} + +func BenchmarkAppend(b *testing.B) { + b.StopTimer() + x := make([]int, 0, N) + b.StartTimer() + for i := 0; i < b.N; i++ { + x = x[0:0] + for j := 0; j < N; j++ { + x = append(x, j) + } + } +} + +func BenchmarkAppendGrowByte(b *testing.B) { + for i := 0; i < b.N; i++ { + var x []byte + for j := 0; j < 1<<20; j++ { + x = append(x, byte(j)) + } + } +} + +func BenchmarkAppendGrowString(b *testing.B) { + var s string + for i := 0; i < b.N; i++ { + var x []string + for j := 0; j < 1<<20; j++ { + x = append(x, s) + } + } +} + +func BenchmarkAppendSlice(b *testing.B) { + for _, length := range []int{1, 4, 7, 8, 15, 16, 32} { + b.Run(fmt.Sprint(length, "Bytes"), func(b *testing.B) { + x := make([]byte, 0, N) + y := make([]byte, length) + for i := 0; i < b.N; i++ { + x = x[0:0] + x = append(x, y...) + } + }) + } +} + +var ( + blackhole []byte +) + +func BenchmarkAppendSliceLarge(b *testing.B) { + for _, length := range []int{1 << 10, 4 << 10, 16 << 10, 64 << 10, 256 << 10, 1024 << 10} { + y := make([]byte, length) + b.Run(fmt.Sprint(length, "Bytes"), func(b *testing.B) { + for i := 0; i < b.N; i++ { + blackhole = nil + blackhole = append(blackhole, y...) + } + }) + } +} + +func BenchmarkAppendStr(b *testing.B) { + for _, str := range []string{ + "1", + "1234", + "12345678", + "1234567890123456", + "12345678901234567890123456789012", + } { + b.Run(fmt.Sprint(len(str), "Bytes"), func(b *testing.B) { + x := make([]byte, 0, N) + for i := 0; i < b.N; i++ { + x = x[0:0] + x = append(x, str...) 
+ } + }) + } +} + +func BenchmarkAppendSpecialCase(b *testing.B) { + b.StopTimer() + x := make([]int, 0, N) + b.StartTimer() + for i := 0; i < b.N; i++ { + x = x[0:0] + for j := 0; j < N; j++ { + if len(x) < cap(x) { + x = x[:len(x)+1] + x[len(x)-1] = j + } else { + x = append(x, j) + } + } + } +} + +var x []int + +func f() int { + x[:1][0] = 3 + return 2 +} + +func TestSideEffectOrder(t *testing.T) { + x = make([]int, 0, 10) + x = append(x, 1, f()) + if x[0] != 1 || x[1] != 2 { + t.Error("append failed: ", x[0], x[1]) + } +} + +func TestAppendOverlap(t *testing.T) { + x := []byte("1234") + x = append(x[1:], x...) // p > q in runtime·appendslice. + got := string(x) + want := "2341234" + if got != want { + t.Errorf("overlap failed: got %q want %q", got, want) + } +} + +func BenchmarkCopy(b *testing.B) { + for _, l := range []int{1, 2, 4, 8, 12, 16, 32, 128, 1024} { + buf := make([]byte, 4096) + b.Run(fmt.Sprint(l, "Byte"), func(b *testing.B) { + s := make([]byte, l) + var n int + for i := 0; i < b.N; i++ { + n = copy(buf, s) + } + b.SetBytes(int64(n)) + }) + b.Run(fmt.Sprint(l, "String"), func(b *testing.B) { + s := string(make([]byte, l)) + var n int + for i := 0; i < b.N; i++ { + n = copy(buf, s) + } + b.SetBytes(int64(n)) + }) + } +} + +var ( + sByte []byte + s1Ptr []uintptr + s2Ptr [][2]uintptr + s3Ptr [][3]uintptr + s4Ptr [][4]uintptr +) + +// BenchmarkAppendInPlace tests the performance of append +// when the result is being written back to the same slice. +// In order for the in-place optimization to occur, +// the slice must be referred to by address; +// using a global is an easy way to trigger that. +// We test the "grow" and "no grow" paths separately, +// but not the "normal" (occasionally grow) path, +// because it is a blend of the other two. +// We use small numbers and small sizes in an attempt +// to avoid benchmarking memory allocation and copying. +// We use scalars instead of pointers in an attempt +// to avoid benchmarking the write barriers. 
+// We benchmark four common sizes (byte, pointer, string/interface, slice), +// and one larger size. +func BenchmarkAppendInPlace(b *testing.B) { + b.Run("NoGrow", func(b *testing.B) { + const C = 128 + + b.Run("Byte", func(b *testing.B) { + for i := 0; i < b.N; i++ { + sByte = make([]byte, C) + for j := 0; j < C; j++ { + sByte = append(sByte, 0x77) + } + } + }) + + b.Run("1Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s1Ptr = make([]uintptr, C) + for j := 0; j < C; j++ { + s1Ptr = append(s1Ptr, 0x77) + } + } + }) + + b.Run("2Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s2Ptr = make([][2]uintptr, C) + for j := 0; j < C; j++ { + s2Ptr = append(s2Ptr, [2]uintptr{0x77, 0x88}) + } + } + }) + + b.Run("3Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s3Ptr = make([][3]uintptr, C) + for j := 0; j < C; j++ { + s3Ptr = append(s3Ptr, [3]uintptr{0x77, 0x88, 0x99}) + } + } + }) + + b.Run("4Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s4Ptr = make([][4]uintptr, C) + for j := 0; j < C; j++ { + s4Ptr = append(s4Ptr, [4]uintptr{0x77, 0x88, 0x99, 0xAA}) + } + } + }) + + }) + + b.Run("Grow", func(b *testing.B) { + const C = 5 + + b.Run("Byte", func(b *testing.B) { + for i := 0; i < b.N; i++ { + sByte = make([]byte, 0) + for j := 0; j < C; j++ { + sByte = append(sByte, 0x77) + sByte = sByte[:cap(sByte)] + } + } + }) + + b.Run("1Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s1Ptr = make([]uintptr, 0) + for j := 0; j < C; j++ { + s1Ptr = append(s1Ptr, 0x77) + s1Ptr = s1Ptr[:cap(s1Ptr)] + } + } + }) + + b.Run("2Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s2Ptr = make([][2]uintptr, 0) + for j := 0; j < C; j++ { + s2Ptr = append(s2Ptr, [2]uintptr{0x77, 0x88}) + s2Ptr = s2Ptr[:cap(s2Ptr)] + } + } + }) + + b.Run("3Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s3Ptr = make([][3]uintptr, 0) + for j := 0; j < C; j++ { + s3Ptr = append(s3Ptr, [3]uintptr{0x77, 0x88, 0x99}) + s3Ptr = s3Ptr[:cap(s3Ptr)] + } + } 
+ }) + + b.Run("4Ptr", func(b *testing.B) { + for i := 0; i < b.N; i++ { + s4Ptr = make([][4]uintptr, 0) + for j := 0; j < C; j++ { + s4Ptr = append(s4Ptr, [4]uintptr{0x77, 0x88, 0x99, 0xAA}) + s4Ptr = s4Ptr[:cap(s4Ptr)] + } + } + }) + + }) +} diff --git a/src/runtime/softfloat64.go b/src/runtime/softfloat64.go new file mode 100644 index 0000000..42ef009 --- /dev/null +++ b/src/runtime/softfloat64.go @@ -0,0 +1,627 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Software IEEE754 64-bit floating point. +// Only referred to (and thus linked in) by softfloat targets +// and by tests in this directory. + +package runtime + +const ( + mantbits64 uint = 52 + expbits64 uint = 11 + bias64 = -1<<(expbits64-1) + 1 + + nan64 uint64 = (1<<expbits64-1)<<mantbits64 + 1<<(mantbits64-1) // quiet NaN, 0 payload + inf64 uint64 = (1<<expbits64 - 1) << mantbits64 + neg64 uint64 = 1 << (expbits64 + mantbits64) + + mantbits32 uint = 23 + expbits32 uint = 8 + bias32 = -1<<(expbits32-1) + 1 + + nan32 uint32 = (1<<expbits32-1)<<mantbits32 + 1<<(mantbits32-1) // quiet NaN, 0 payload + inf32 uint32 = (1<<expbits32 - 1) << mantbits32 + neg32 uint32 = 1 << (expbits32 + mantbits32) +) + +func funpack64(f uint64) (sign, mant uint64, exp int, inf, nan bool) { + sign = f & (1 << (mantbits64 + expbits64)) + mant = f & (1<<mantbits64 - 1) + exp = int(f>>mantbits64) & (1<<expbits64 - 1) + + switch exp { + case 1<<expbits64 - 1: + if mant != 0 { + nan = true + return + } + inf = true + return + + case 0: + // denormalized + if mant != 0 { + exp += bias64 + 1 + for mant < 1<<mantbits64 { + mant <<= 1 + exp-- + } + } + + default: + // add implicit top bit + mant |= 1 << mantbits64 + exp += bias64 + } + return +} + +func funpack32(f uint32) (sign, mant uint32, exp int, inf, nan bool) { + sign = f & (1 << (mantbits32 + expbits32)) + mant = f & (1<<mantbits32 - 1) + exp = 
int(f>>mantbits32) & (1<<expbits32 - 1) + + switch exp { + case 1<<expbits32 - 1: + if mant != 0 { + nan = true + return + } + inf = true + return + + case 0: + // denormalized + if mant != 0 { + exp += bias32 + 1 + for mant < 1<<mantbits32 { + mant <<= 1 + exp-- + } + } + + default: + // add implicit top bit + mant |= 1 << mantbits32 + exp += bias32 + } + return +} + +func fpack64(sign, mant uint64, exp int, trunc uint64) uint64 { + mant0, exp0, trunc0 := mant, exp, trunc + if mant == 0 { + return sign + } + for mant < 1<<mantbits64 { + mant <<= 1 + exp-- + } + for mant >= 4<<mantbits64 { + trunc |= mant & 1 + mant >>= 1 + exp++ + } + if mant >= 2<<mantbits64 { + if mant&1 != 0 && (trunc != 0 || mant&2 != 0) { + mant++ + if mant >= 4<<mantbits64 { + mant >>= 1 + exp++ + } + } + mant >>= 1 + exp++ + } + if exp >= 1<<expbits64-1+bias64 { + return sign ^ inf64 + } + if exp < bias64+1 { + if exp < bias64-int(mantbits64) { + return sign | 0 + } + // repeat expecting denormal + mant, exp, trunc = mant0, exp0, trunc0 + for exp < bias64 { + trunc |= mant & 1 + mant >>= 1 + exp++ + } + if mant&1 != 0 && (trunc != 0 || mant&2 != 0) { + mant++ + } + mant >>= 1 + exp++ + if mant < 1<<mantbits64 { + return sign | mant + } + } + return sign | uint64(exp-bias64)<<mantbits64 | mant&(1<<mantbits64-1) +} + +func fpack32(sign, mant uint32, exp int, trunc uint32) uint32 { + mant0, exp0, trunc0 := mant, exp, trunc + if mant == 0 { + return sign + } + for mant < 1<<mantbits32 { + mant <<= 1 + exp-- + } + for mant >= 4<<mantbits32 { + trunc |= mant & 1 + mant >>= 1 + exp++ + } + if mant >= 2<<mantbits32 { + if mant&1 != 0 && (trunc != 0 || mant&2 != 0) { + mant++ + if mant >= 4<<mantbits32 { + mant >>= 1 + exp++ + } + } + mant >>= 1 + exp++ + } + if exp >= 1<<expbits32-1+bias32 { + return sign ^ inf32 + } + if exp < bias32+1 { + if exp < bias32-int(mantbits32) { + return sign | 0 + } + // repeat expecting denormal + mant, exp, trunc = mant0, exp0, trunc0 + for exp < bias32 { + trunc |= 
mant & 1 + mant >>= 1 + exp++ + } + if mant&1 != 0 && (trunc != 0 || mant&2 != 0) { + mant++ + } + mant >>= 1 + exp++ + if mant < 1<<mantbits32 { + return sign | mant + } + } + return sign | uint32(exp-bias32)<<mantbits32 | mant&(1<<mantbits32-1) +} + +func fadd64(f, g uint64) uint64 { + fs, fm, fe, fi, fn := funpack64(f) + gs, gm, ge, gi, gn := funpack64(g) + + // Special cases. + switch { + case fn || gn: // NaN + x or x + NaN = NaN + return nan64 + + case fi && gi && fs != gs: // +Inf + -Inf or -Inf + +Inf = NaN + return nan64 + + case fi: // ±Inf + g = ±Inf + return f + + case gi: // f + ±Inf = ±Inf + return g + + case fm == 0 && gm == 0 && fs != 0 && gs != 0: // -0 + -0 = -0 + return f + + case fm == 0: // 0 + g = g but 0 + -0 = +0 + if gm == 0 { + g ^= gs + } + return g + + case gm == 0: // f + 0 = f + return f + + } + + if fe < ge || fe == ge && fm < gm { + f, g, fs, fm, fe, gs, gm, ge = g, f, gs, gm, ge, fs, fm, fe + } + + shift := uint(fe - ge) + fm <<= 2 + gm <<= 2 + trunc := gm & (1<<shift - 1) + gm >>= shift + if fs == gs { + fm += gm + } else { + fm -= gm + if trunc != 0 { + fm-- + } + } + if fm == 0 { + fs = 0 + } + return fpack64(fs, fm, fe-2, trunc) +} + +func fsub64(f, g uint64) uint64 { + return fadd64(f, fneg64(g)) +} + +func fneg64(f uint64) uint64 { + return f ^ (1 << (mantbits64 + expbits64)) +} + +func fmul64(f, g uint64) uint64 { + fs, fm, fe, fi, fn := funpack64(f) + gs, gm, ge, gi, gn := funpack64(g) + + // Special cases. 
+ switch { + case fn || gn: // NaN * g or f * NaN = NaN + return nan64 + + case fi && gi: // Inf * Inf = Inf (with sign adjusted) + return f ^ gs + + case fi && gm == 0, fm == 0 && gi: // 0 * Inf = Inf * 0 = NaN + return nan64 + + case fm == 0: // 0 * x = 0 (with sign adjusted) + return f ^ gs + + case gm == 0: // x * 0 = 0 (with sign adjusted) + return g ^ fs + } + + // 53-bit * 53-bit = 107- or 108-bit + lo, hi := mullu(fm, gm) + shift := mantbits64 - 1 + trunc := lo & (1<<shift - 1) + mant := hi<<(64-shift) | lo>>shift + return fpack64(fs^gs, mant, fe+ge-1, trunc) +} + +func fdiv64(f, g uint64) uint64 { + fs, fm, fe, fi, fn := funpack64(f) + gs, gm, ge, gi, gn := funpack64(g) + + // Special cases. + switch { + case fn || gn: // NaN / g = f / NaN = NaN + return nan64 + + case fi && gi: // ±Inf / ±Inf = NaN + return nan64 + + case !fi && !gi && fm == 0 && gm == 0: // 0 / 0 = NaN + return nan64 + + case fi, !gi && gm == 0: // Inf / g = f / 0 = Inf + return fs ^ gs ^ inf64 + + case gi, fm == 0: // f / Inf = 0 / g = Inf + return fs ^ gs ^ 0 + } + _, _, _, _ = fi, fn, gi, gn + + // 53-bit<<54 / 53-bit = 53- or 54-bit. 
+ shift := mantbits64 + 2 + q, r := divlu(fm>>(64-shift), fm<<shift, gm) + return fpack64(fs^gs, q, fe-ge-2, r) +} + +func f64to32(f uint64) uint32 { + fs, fm, fe, fi, fn := funpack64(f) + if fn { + return nan32 + } + fs32 := uint32(fs >> 32) + if fi { + return fs32 ^ inf32 + } + const d = mantbits64 - mantbits32 - 1 + return fpack32(fs32, uint32(fm>>d), fe-1, uint32(fm&(1<<d-1))) +} + +func f32to64(f uint32) uint64 { + const d = mantbits64 - mantbits32 + fs, fm, fe, fi, fn := funpack32(f) + if fn { + return nan64 + } + fs64 := uint64(fs) << 32 + if fi { + return fs64 ^ inf64 + } + return fpack64(fs64, uint64(fm)<<d, fe, 0) +} + +func fcmp64(f, g uint64) (cmp int32, isnan bool) { + fs, fm, _, fi, fn := funpack64(f) + gs, gm, _, gi, gn := funpack64(g) + + switch { + case fn, gn: // flag NaN + return 0, true + + case !fi && !gi && fm == 0 && gm == 0: // ±0 == ±0 + return 0, false + + case fs > gs: // f < 0, g > 0 + return -1, false + + case fs < gs: // f > 0, g < 0 + return +1, false + + // Same sign, not NaN. + // Can compare encodings directly now. + // Reverse for sign. 
+ case fs == 0 && f < g, fs != 0 && f > g: + return -1, false + + case fs == 0 && f > g, fs != 0 && f < g: + return +1, false + } + + // f == g + return 0, false +} + +func f64toint(f uint64) (val int64, ok bool) { + fs, fm, fe, fi, fn := funpack64(f) + + switch { + case fi, fn: // NaN + return 0, false + + case fe < -1: // f < 0.5 + return 0, false + + case fe > 63: // f >= 2^63 + if fs != 0 && fm == 0 { // f == -2^63 + return -1 << 63, true + } + if fs != 0 { + return 0, false + } + return 0, false + } + + for fe > int(mantbits64) { + fe-- + fm <<= 1 + } + for fe < int(mantbits64) { + fe++ + fm >>= 1 + } + val = int64(fm) + if fs != 0 { + val = -val + } + return val, true +} + +func fintto64(val int64) (f uint64) { + fs := uint64(val) & (1 << 63) + mant := uint64(val) + if fs != 0 { + mant = -mant + } + return fpack64(fs, mant, int(mantbits64), 0) +} +func fintto32(val int64) (f uint32) { + fs := uint64(val) & (1 << 63) + mant := uint64(val) + if fs != 0 { + mant = -mant + } + // Reduce mantissa size until it fits into a uint32. + // Keep track of the bits we throw away, and if any are + // nonzero or them into the lowest bit. + exp := int(mantbits32) + var trunc uint32 + for mant >= 1<<32 { + trunc |= uint32(mant) & 1 + mant >>= 1 + exp++ + } + + return fpack32(uint32(fs>>32), uint32(mant), exp, trunc) +} + +// 64x64 -> 128 multiply. +// adapted from hacker's delight. +func mullu(u, v uint64) (lo, hi uint64) { + const ( + s = 32 + mask = 1<<s - 1 + ) + u0 := u & mask + u1 := u >> s + v0 := v & mask + v1 := v >> s + w0 := u0 * v0 + t := u1*v0 + w0>>s + w1 := t & mask + w2 := t >> s + w1 += u0 * v1 + return u * v, u1*v1 + w2 + w1>>s +} + +// 128/64 -> 64 quotient, 64 remainder. 
+// adapted from hacker's delight +func divlu(u1, u0, v uint64) (q, r uint64) { + const b = 1 << 32 + + if u1 >= v { + return 1<<64 - 1, 1<<64 - 1 + } + + // s = nlz(v); v <<= s + s := uint(0) + for v&(1<<63) == 0 { + s++ + v <<= 1 + } + + vn1 := v >> 32 + vn0 := v & (1<<32 - 1) + un32 := u1<<s | u0>>(64-s) + un10 := u0 << s + un1 := un10 >> 32 + un0 := un10 & (1<<32 - 1) + q1 := un32 / vn1 + rhat := un32 - q1*vn1 + +again1: + if q1 >= b || q1*vn0 > b*rhat+un1 { + q1-- + rhat += vn1 + if rhat < b { + goto again1 + } + } + + un21 := un32*b + un1 - q1*v + q0 := un21 / vn1 + rhat = un21 - q0*vn1 + +again2: + if q0 >= b || q0*vn0 > b*rhat+un0 { + q0-- + rhat += vn1 + if rhat < b { + goto again2 + } + } + + return q1*b + q0, (un21*b + un0 - q0*v) >> s +} + +func fadd32(x, y uint32) uint32 { + return f64to32(fadd64(f32to64(x), f32to64(y))) +} + +func fmul32(x, y uint32) uint32 { + return f64to32(fmul64(f32to64(x), f32to64(y))) +} + +func fdiv32(x, y uint32) uint32 { + // TODO: are there double-rounding problems here? See issue 48807. 
+ return f64to32(fdiv64(f32to64(x), f32to64(y))) +} + +func feq32(x, y uint32) bool { + cmp, nan := fcmp64(f32to64(x), f32to64(y)) + return cmp == 0 && !nan +} + +func fgt32(x, y uint32) bool { + cmp, nan := fcmp64(f32to64(x), f32to64(y)) + return cmp >= 1 && !nan +} + +func fge32(x, y uint32) bool { + cmp, nan := fcmp64(f32to64(x), f32to64(y)) + return cmp >= 0 && !nan +} + +func feq64(x, y uint64) bool { + cmp, nan := fcmp64(x, y) + return cmp == 0 && !nan +} + +func fgt64(x, y uint64) bool { + cmp, nan := fcmp64(x, y) + return cmp >= 1 && !nan +} + +func fge64(x, y uint64) bool { + cmp, nan := fcmp64(x, y) + return cmp >= 0 && !nan +} + +func fint32to32(x int32) uint32 { + return fintto32(int64(x)) +} + +func fint32to64(x int32) uint64 { + return fintto64(int64(x)) +} + +func fint64to32(x int64) uint32 { + return fintto32(x) +} + +func fint64to64(x int64) uint64 { + return fintto64(x) +} + +func f32toint32(x uint32) int32 { + val, _ := f64toint(f32to64(x)) + return int32(val) +} + +func f32toint64(x uint32) int64 { + val, _ := f64toint(f32to64(x)) + return val +} + +func f64toint32(x uint64) int32 { + val, _ := f64toint(x) + return int32(val) +} + +func f64toint64(x uint64) int64 { + val, _ := f64toint(x) + return val +} + +func f64touint64(x uint64) uint64 { + var m uint64 = 0x43e0000000000000 // float64 1<<63 + if fgt64(m, x) { + return uint64(f64toint64(x)) + } + y := fadd64(x, -m) + z := uint64(f64toint64(y)) + return z | (1 << 63) +} + +func f32touint64(x uint32) uint64 { + var m uint32 = 0x5f000000 // float32 1<<63 + if fgt32(m, x) { + return uint64(f32toint64(x)) + } + y := fadd32(x, -m) + z := uint64(f32toint64(y)) + return z | (1 << 63) +} + +func fuint64to64(x uint64) uint64 { + if int64(x) >= 0 { + return fint64to64(int64(x)) + } + // See ../cmd/compile/internal/ssagen/ssa.go:uint64Tofloat + y := x & 1 + z := x >> 1 + z = z | y + r := fint64to64(int64(z)) + return fadd64(r, r) +} + +func fuint64to32(x uint64) uint32 { + if int64(x) >= 0 { + return 
fint64to32(int64(x)) + } + // See ../cmd/compile/internal/ssagen/ssa.go:uint64Tofloat + y := x & 1 + z := x >> 1 + z = z | y + r := fint64to32(int64(z)) + return fadd32(r, r) +} diff --git a/src/runtime/softfloat64_test.go b/src/runtime/softfloat64_test.go new file mode 100644 index 0000000..3f53e8b --- /dev/null +++ b/src/runtime/softfloat64_test.go @@ -0,0 +1,198 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "math" + "math/rand" + . "runtime" + "testing" +) + +// turn uint64 op into float64 op +func fop(f func(x, y uint64) uint64) func(x, y float64) float64 { + return func(x, y float64) float64 { + bx := math.Float64bits(x) + by := math.Float64bits(y) + return math.Float64frombits(f(bx, by)) + } +} + +func add(x, y float64) float64 { return x + y } +func sub(x, y float64) float64 { return x - y } +func mul(x, y float64) float64 { return x * y } +func div(x, y float64) float64 { return x / y } + +func TestFloat64(t *testing.T) { + base := []float64{ + 0, + math.Copysign(0, -1), + -1, + 1, + math.NaN(), + math.Inf(+1), + math.Inf(-1), + 0.1, + 1.5, + 1.9999999999999998, // all 1s mantissa + 1.3333333333333333, // 1.010101010101... + 1.1428571428571428, // 1.001001001001... + 1.112536929253601e-308, // first normal + 2, + 4, + 8, + 16, + 32, + 64, + 128, + 256, + 3, + 12, + 1234, + 123456, + -0.1, + -1.5, + -1.9999999999999998, + -1.3333333333333333, + -1.1428571428571428, + -2, + -3, + 1e-200, + 1e-300, + 1e-310, + 5e-324, + 1e-105, + 1e-305, + 1e+200, + 1e+306, + 1e+307, + 1e+308, + } + all := make([]float64, 200) + copy(all, base) + for i := len(base); i < len(all); i++ { + all[i] = rand.NormFloat64() + } + + test(t, "+", add, fop(Fadd64), all) + test(t, "-", sub, fop(Fsub64), all) + if GOARCH != "386" { // 386 is not precise! 
+ test(t, "*", mul, fop(Fmul64), all) + test(t, "/", div, fop(Fdiv64), all) + } +} + +// 64 -hw-> 32 -hw-> 64 +func trunc32(f float64) float64 { + return float64(float32(f)) +} + +// 64 -sw->32 -hw-> 64 +func to32sw(f float64) float64 { + return float64(math.Float32frombits(F64to32(math.Float64bits(f)))) +} + +// 64 -hw->32 -sw-> 64 +func to64sw(f float64) float64 { + return math.Float64frombits(F32to64(math.Float32bits(float32(f)))) +} + +// float64 -hw-> int64 -hw-> float64 +func hwint64(f float64) float64 { + return float64(int64(f)) +} + +// float64 -hw-> int32 -hw-> float64 +func hwint32(f float64) float64 { + return float64(int32(f)) +} + +// float64 -sw-> int64 -hw-> float64 +func toint64sw(f float64) float64 { + i, ok := F64toint(math.Float64bits(f)) + if !ok { + // There's no right answer for out of range. + // Match the hardware to pass the test. + i = int64(f) + } + return float64(i) +} + +// float64 -hw-> int64 -sw-> float64 +func fromint64sw(f float64) float64 { + return math.Float64frombits(Fintto64(int64(f))) +} + +var nerr int + +func err(t *testing.T, format string, args ...any) { + t.Errorf(format, args...) + + // cut errors off after a while. + // otherwise we spend all our time + // allocating memory to hold the + // formatted output. 
+ if nerr++; nerr >= 10 { + t.Fatal("too many errors") + } +} + +func test(t *testing.T, op string, hw, sw func(float64, float64) float64, all []float64) { + for _, f := range all { + for _, g := range all { + h := hw(f, g) + s := sw(f, g) + if !same(h, s) { + err(t, "%g %s %g = sw %g, hw %g\n", f, op, g, s, h) + } + testu(t, "to32", trunc32, to32sw, h) + testu(t, "to64", trunc32, to64sw, h) + testu(t, "toint64", hwint64, toint64sw, h) + testu(t, "fromint64", hwint64, fromint64sw, h) + testcmp(t, f, h) + testcmp(t, h, f) + testcmp(t, g, h) + testcmp(t, h, g) + } + } +} + +func testu(t *testing.T, op string, hw, sw func(float64) float64, v float64) { + h := hw(v) + s := sw(v) + if !same(h, s) { + err(t, "%s %g = sw %g, hw %g\n", op, v, s, h) + } +} + +func hwcmp(f, g float64) (cmp int, isnan bool) { + switch { + case f < g: + return -1, false + case f > g: + return +1, false + case f == g: + return 0, false + } + return 0, true // must be NaN +} + +func testcmp(t *testing.T, f, g float64) { + hcmp, hisnan := hwcmp(f, g) + scmp, sisnan := Fcmp64(math.Float64bits(f), math.Float64bits(g)) + if int32(hcmp) != scmp || hisnan != sisnan { + err(t, "cmp(%g, %g) = sw %v, %v, hw %v, %v\n", f, g, scmp, sisnan, hcmp, hisnan) + } +} + +func same(f, g float64) bool { + if math.IsNaN(f) && math.IsNaN(g) { + return true + } + if math.Copysign(1, f) != math.Copysign(1, g) { + return false + } + return f == g +} diff --git a/src/runtime/stack.go b/src/runtime/stack.go new file mode 100644 index 0000000..d5e587a --- /dev/null +++ b/src/runtime/stack.go @@ -0,0 +1,1345 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/cpu" + "internal/goarch" + "internal/goos" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +/* +Stack layout parameters. 
+Included both by runtime (compiled via 6c) and linkers (compiled via gcc). + +The per-goroutine g->stackguard is set to point StackGuard bytes +above the bottom of the stack. Each function compares its stack +pointer against g->stackguard to check for overflow. To cut one +instruction from the check sequence for functions with tiny frames, +the stack is allowed to protrude StackSmall bytes below the stack +guard. Functions with large frames don't bother with the check and +always call morestack. The sequences are (for amd64, others are +similar): + + guard = g->stackguard + frame = function's stack frame size + argsize = size of function arguments (call + return) + + stack frame size <= StackSmall: + CMPQ guard, SP + JHI 3(PC) + MOVQ m->morearg, $(argsize << 32) + CALL morestack(SB) + + stack frame size > StackSmall but < StackBig + LEAQ (frame-StackSmall)(SP), R0 + CMPQ guard, R0 + JHI 3(PC) + MOVQ m->morearg, $(argsize << 32) + CALL morestack(SB) + + stack frame size >= StackBig: + MOVQ m->morearg, $((argsize << 32) | frame) + CALL morestack(SB) + +The bottom StackGuard - StackSmall bytes are important: there has +to be enough room to execute functions that refuse to check for +stack overflow, either because they need to be adjacent to the +actual caller's frame (deferproc) or because they handle the imminent +stack overflow (morestack). + +For example, deferproc might call malloc, which does one of the +above checks (without allocating a full frame), which might trigger +a call to morestack. This sequence needs to fit in the bottom +section of the stack. On amd64, morestack's frame is 40 bytes, and +deferproc's frame is 56 bytes. That fits well within the +StackGuard - StackSmall bytes at the bottom. +The linkers explore all possible call traces involving non-splitting +functions to make sure that this limit cannot be violated. 
+*/ + +const ( + // StackSystem is a number of additional bytes to add + // to each stack below the usual guard area for OS-specific + // purposes like signal handling. Used on Windows, Plan 9, + // and iOS because they do not use a separate stack. + _StackSystem = goos.IsWindows*512*goarch.PtrSize + goos.IsPlan9*512 + goos.IsIos*goarch.IsArm64*1024 + + // The minimum size of stack used by Go code + _StackMin = 2048 + + // The minimum stack size to allocate. + // The hackery here rounds FixedStack0 up to a power of 2. + _FixedStack0 = _StackMin + _StackSystem + _FixedStack1 = _FixedStack0 - 1 + _FixedStack2 = _FixedStack1 | (_FixedStack1 >> 1) + _FixedStack3 = _FixedStack2 | (_FixedStack2 >> 2) + _FixedStack4 = _FixedStack3 | (_FixedStack3 >> 4) + _FixedStack5 = _FixedStack4 | (_FixedStack4 >> 8) + _FixedStack6 = _FixedStack5 | (_FixedStack5 >> 16) + _FixedStack = _FixedStack6 + 1 + + // Functions that need frames bigger than this use an extra + // instruction to do the stack split check, to avoid overflow + // in case SP - framesize wraps below zero. + // This value can be no bigger than the size of the unmapped + // space at zero. + _StackBig = 4096 + + // The stack guard is a pointer this many bytes above the + // bottom of the stack. + // + // The guard leaves enough room for one _StackSmall frame plus + // a _StackLimit chain of NOSPLIT calls plus _StackSystem + // bytes for the OS. + // This arithmetic must match that in cmd/internal/objabi/stack.go:StackLimit. + _StackGuard = 928*sys.StackGuardMultiplier + _StackSystem + + // After a stack split check the SP is allowed to be this + // many bytes below the stack guard. This saves an instruction + // in the checking sequence for tiny frames. + _StackSmall = 128 + + // The maximum number of bytes that a chain of NOSPLIT + // functions can use. + // This arithmetic must match that in cmd/internal/objabi/stack.go:StackLimit. 
+ _StackLimit = _StackGuard - _StackSystem - _StackSmall +) + +const ( + // stackDebug == 0: no logging + // == 1: logging of per-stack operations + // == 2: logging of per-frame operations + // == 3: logging of per-word updates + // == 4: logging of per-word reads + stackDebug = 0 + stackFromSystem = 0 // allocate stacks from system memory instead of the heap + stackFaultOnFree = 0 // old stacks are mapped noaccess to detect use after free + stackPoisonCopy = 0 // fill stack that should not be accessed with garbage, to detect bad dereferences during copy + stackNoCache = 0 // disable per-P small stack caches + + // check the BP links during traceback. + debugCheckBP = false +) + +const ( + uintptrMask = 1<<(8*goarch.PtrSize) - 1 + + // The values below can be stored to g.stackguard0 to force + // the next stack check to fail. + // These are all larger than any real SP. + + // Goroutine preemption request. + // 0xfffffade in hex. + stackPreempt = uintptrMask & -1314 + + // Thread is forking. Causes a split stack check failure. + // 0xfffffb2e in hex. + stackFork = uintptrMask & -1234 + + // Force a stack movement. Used for debugging. + // 0xfffffeed in hex. + stackForceMove = uintptrMask & -275 + + // stackPoisonMin is the lowest allowed stack poison value. + stackPoisonMin = uintptrMask & -4096 +) + +// Global pool of spans that have free stacks. +// Stacks are assigned an order according to size. +// +// order = log_2(size/FixedStack) +// +// There is a free list for each order. +var stackpool [_NumStackOrders]struct { + item stackpoolItem + _ [(cpu.CacheLinePadSize - unsafe.Sizeof(stackpoolItem{})%cpu.CacheLinePadSize) % cpu.CacheLinePadSize]byte +} + +type stackpoolItem struct { + _ sys.NotInHeap + mu mutex + span mSpanList +} + +// Global pool of large stack spans. 
+var stackLarge struct { + lock mutex + free [heapAddrBits - pageShift]mSpanList // free lists by log_2(s.npages) +} + +func stackinit() { + if _StackCacheSize&_PageMask != 0 { + throw("cache size must be a multiple of page size") + } + for i := range stackpool { + stackpool[i].item.span.init() + lockInit(&stackpool[i].item.mu, lockRankStackpool) + } + for i := range stackLarge.free { + stackLarge.free[i].init() + lockInit(&stackLarge.lock, lockRankStackLarge) + } +} + +// stacklog2 returns ⌊log_2(n)⌋. +func stacklog2(n uintptr) int { + log2 := 0 + for n > 1 { + n >>= 1 + log2++ + } + return log2 +} + +// Allocates a stack from the free pool. Must be called with +// stackpool[order].item.mu held. +func stackpoolalloc(order uint8) gclinkptr { + list := &stackpool[order].item.span + s := list.first + lockWithRankMayAcquire(&mheap_.lock, lockRankMheap) + if s == nil { + // no free stacks. Allocate another span worth. + s = mheap_.allocManual(_StackCacheSize>>_PageShift, spanAllocStack) + if s == nil { + throw("out of memory") + } + if s.allocCount != 0 { + throw("bad allocCount") + } + if s.manualFreeList.ptr() != nil { + throw("bad manualFreeList") + } + osStackAlloc(s) + s.elemsize = _FixedStack << order + for i := uintptr(0); i < _StackCacheSize; i += s.elemsize { + x := gclinkptr(s.base() + i) + x.ptr().next = s.manualFreeList + s.manualFreeList = x + } + list.insert(s) + } + x := s.manualFreeList + if x.ptr() == nil { + throw("span has no free stacks") + } + s.manualFreeList = x.ptr().next + s.allocCount++ + if s.manualFreeList.ptr() == nil { + // all stacks in s are allocated. + list.remove(s) + } + return x +} + +// Adds stack x to the free pool. Must be called with stackpool[order].item.mu held. 
+func stackpoolfree(x gclinkptr, order uint8) { + s := spanOfUnchecked(uintptr(x)) + if s.state.get() != mSpanManual { + throw("freeing stack not in a stack span") + } + if s.manualFreeList.ptr() == nil { + // s will now have a free stack + stackpool[order].item.span.insert(s) + } + x.ptr().next = s.manualFreeList + s.manualFreeList = x + s.allocCount-- + if gcphase == _GCoff && s.allocCount == 0 { + // Span is completely free. Return it to the heap + // immediately if we're sweeping. + // + // If GC is active, we delay the free until the end of + // GC to avoid the following type of situation: + // + // 1) GC starts, scans a SudoG but does not yet mark the SudoG.elem pointer + // 2) The stack that pointer points to is copied + // 3) The old stack is freed + // 4) The containing span is marked free + // 5) GC attempts to mark the SudoG.elem pointer. The + // marking fails because the pointer looks like a + // pointer into a free span. + // + // By not freeing, we prevent step #4 until GC is done. + stackpool[order].item.span.remove(s) + s.manualFreeList = 0 + osStackFree(s) + mheap_.freeManual(s, spanAllocStack) + } +} + +// stackcacherefill/stackcacherelease implement a global pool of stack segments. +// The pool is required to prevent unlimited growth of per-thread caches. +// +//go:systemstack +func stackcacherefill(c *mcache, order uint8) { + if stackDebug >= 1 { + print("stackcacherefill order=", order, "\n") + } + + // Grab some stacks from the global cache. + // Grab half of the allowed capacity (to prevent thrashing). 
+ var list gclinkptr + var size uintptr + lock(&stackpool[order].item.mu) + for size < _StackCacheSize/2 { + x := stackpoolalloc(order) + x.ptr().next = list + list = x + size += _FixedStack << order + } + unlock(&stackpool[order].item.mu) + c.stackcache[order].list = list + c.stackcache[order].size = size +} + +//go:systemstack +func stackcacherelease(c *mcache, order uint8) { + if stackDebug >= 1 { + print("stackcacherelease order=", order, "\n") + } + x := c.stackcache[order].list + size := c.stackcache[order].size + lock(&stackpool[order].item.mu) + for size > _StackCacheSize/2 { + y := x.ptr().next + stackpoolfree(x, order) + x = y + size -= _FixedStack << order + } + unlock(&stackpool[order].item.mu) + c.stackcache[order].list = x + c.stackcache[order].size = size +} + +//go:systemstack +func stackcache_clear(c *mcache) { + if stackDebug >= 1 { + print("stackcache clear\n") + } + for order := uint8(0); order < _NumStackOrders; order++ { + lock(&stackpool[order].item.mu) + x := c.stackcache[order].list + for x.ptr() != nil { + y := x.ptr().next + stackpoolfree(x, order) + x = y + } + c.stackcache[order].list = 0 + c.stackcache[order].size = 0 + unlock(&stackpool[order].item.mu) + } +} + +// stackalloc allocates an n byte stack. +// +// stackalloc must run on the system stack because it uses per-P +// resources and must not split the stack. +// +//go:systemstack +func stackalloc(n uint32) stack { + // Stackalloc must be called on scheduler stack, so that we + // never try to grow the stack during the code that stackalloc runs. + // Doing so would cause a deadlock (issue 1547). 
+ thisg := getg() + if thisg != thisg.m.g0 { + throw("stackalloc not on scheduler stack") + } + if n&(n-1) != 0 { + throw("stack size not a power of 2") + } + if stackDebug >= 1 { + print("stackalloc ", n, "\n") + } + + if debug.efence != 0 || stackFromSystem != 0 { + n = uint32(alignUp(uintptr(n), physPageSize)) + v := sysAlloc(uintptr(n), &memstats.stacks_sys) + if v == nil { + throw("out of memory (stackalloc)") + } + return stack{uintptr(v), uintptr(v) + uintptr(n)} + } + + // Small stacks are allocated with a fixed-size free-list allocator. + // If we need a stack of a bigger size, we fall back on allocating + // a dedicated span. + var v unsafe.Pointer + if n < _FixedStack<<_NumStackOrders && n < _StackCacheSize { + order := uint8(0) + n2 := n + for n2 > _FixedStack { + order++ + n2 >>= 1 + } + var x gclinkptr + if stackNoCache != 0 || thisg.m.p == 0 || thisg.m.preemptoff != "" { + // thisg.m.p == 0 can happen in the guts of exitsyscall + // or procresize. Just get a stack from the global pool. + // Also don't touch stackcache during gc + // as it's flushed concurrently. + lock(&stackpool[order].item.mu) + x = stackpoolalloc(order) + unlock(&stackpool[order].item.mu) + } else { + c := thisg.m.p.ptr().mcache + x = c.stackcache[order].list + if x.ptr() == nil { + stackcacherefill(c, order) + x = c.stackcache[order].list + } + c.stackcache[order].list = x.ptr().next + c.stackcache[order].size -= uintptr(n) + } + v = unsafe.Pointer(x) + } else { + var s *mspan + npage := uintptr(n) >> _PageShift + log2npage := stacklog2(npage) + + // Try to get a stack from the large stack cache. + lock(&stackLarge.lock) + if !stackLarge.free[log2npage].isEmpty() { + s = stackLarge.free[log2npage].first + stackLarge.free[log2npage].remove(s) + } + unlock(&stackLarge.lock) + + lockWithRankMayAcquire(&mheap_.lock, lockRankMheap) + + if s == nil { + // Allocate a new stack from the heap. 
+ s = mheap_.allocManual(npage, spanAllocStack) + if s == nil { + throw("out of memory") + } + osStackAlloc(s) + s.elemsize = uintptr(n) + } + v = unsafe.Pointer(s.base()) + } + + if raceenabled { + racemalloc(v, uintptr(n)) + } + if msanenabled { + msanmalloc(v, uintptr(n)) + } + if asanenabled { + asanunpoison(v, uintptr(n)) + } + if stackDebug >= 1 { + print(" allocated ", v, "\n") + } + return stack{uintptr(v), uintptr(v) + uintptr(n)} +} + +// stackfree frees an n byte stack allocation at stk. +// +// stackfree must run on the system stack because it uses per-P +// resources and must not split the stack. +// +//go:systemstack +func stackfree(stk stack) { + gp := getg() + v := unsafe.Pointer(stk.lo) + n := stk.hi - stk.lo + if n&(n-1) != 0 { + throw("stack not a power of 2") + } + if stk.lo+n < stk.hi { + throw("bad stack size") + } + if stackDebug >= 1 { + println("stackfree", v, n) + memclrNoHeapPointers(v, n) // for testing, clobber stack data + } + if debug.efence != 0 || stackFromSystem != 0 { + if debug.efence != 0 || stackFaultOnFree != 0 { + sysFault(v, n) + } else { + sysFree(v, n, &memstats.stacks_sys) + } + return + } + if msanenabled { + msanfree(v, n) + } + if asanenabled { + asanpoison(v, n) + } + if n < _FixedStack<<_NumStackOrders && n < _StackCacheSize { + order := uint8(0) + n2 := n + for n2 > _FixedStack { + order++ + n2 >>= 1 + } + x := gclinkptr(v) + if stackNoCache != 0 || gp.m.p == 0 || gp.m.preemptoff != "" { + lock(&stackpool[order].item.mu) + stackpoolfree(x, order) + unlock(&stackpool[order].item.mu) + } else { + c := gp.m.p.ptr().mcache + if c.stackcache[order].size >= _StackCacheSize { + stackcacherelease(c, order) + } + x.ptr().next = c.stackcache[order].list + c.stackcache[order].list = x + c.stackcache[order].size += n + } + } else { + s := spanOfUnchecked(uintptr(v)) + if s.state.get() != mSpanManual { + println(hex(s.base()), v) + throw("bad span state") + } + if gcphase == _GCoff { + // Free the stack immediately if we're + // 
sweeping. + osStackFree(s) + mheap_.freeManual(s, spanAllocStack) + } else { + // If the GC is running, we can't return a + // stack span to the heap because it could be + // reused as a heap span, and this state + // change would race with GC. Add it to the + // large stack cache instead. + log2npage := stacklog2(s.npages) + lock(&stackLarge.lock) + stackLarge.free[log2npage].insert(s) + unlock(&stackLarge.lock) + } + } +} + +var maxstacksize uintptr = 1 << 20 // enough until runtime.main sets it for real + +var maxstackceiling = maxstacksize + +var ptrnames = []string{ + 0: "scalar", + 1: "ptr", +} + +// Stack frame layout +// +// (x86) +// +------------------+ +// | args from caller | +// +------------------+ <- frame->argp +// | return address | +// +------------------+ +// | caller's BP (*) | (*) if framepointer_enabled && varp < sp +// +------------------+ <- frame->varp +// | locals | +// +------------------+ +// | args to callee | +// +------------------+ <- frame->sp +// +// (arm) +// +------------------+ +// | args from caller | +// +------------------+ <- frame->argp +// | caller's retaddr | +// +------------------+ <- frame->varp +// | locals | +// +------------------+ +// | args to callee | +// +------------------+ +// | return address | +// +------------------+ <- frame->sp + +type adjustinfo struct { + old stack + delta uintptr // ptr distance from old to new stack (newbase - oldbase) + cache pcvalueCache + + // sghi is the highest sudog.elem on the stack. + sghi uintptr +} + +// adjustpointer checks whether *vpp is in the old stack described by adjinfo. +// If so, it rewrites *vpp to point into the new stack. 
+func adjustpointer(adjinfo *adjustinfo, vpp unsafe.Pointer) { + pp := (*uintptr)(vpp) + p := *pp + if stackDebug >= 4 { + print(" ", pp, ":", hex(p), "\n") + } + if adjinfo.old.lo <= p && p < adjinfo.old.hi { + *pp = p + adjinfo.delta + if stackDebug >= 3 { + print(" adjust ptr ", pp, ":", hex(p), " -> ", hex(*pp), "\n") + } + } +} + +// Information from the compiler about the layout of stack frames. +// Note: this type must agree with reflect.bitVector. +type bitvector struct { + n int32 // # of bits + bytedata *uint8 +} + +// ptrbit returns the i'th bit in bv. +// ptrbit is less efficient than iterating directly over bitvector bits, +// and should only be used in non-performance-critical code. +// See adjustpointers for an example of a high-efficiency walk of a bitvector. +func (bv *bitvector) ptrbit(i uintptr) uint8 { + b := *(addb(bv.bytedata, i/8)) + return (b >> (i % 8)) & 1 +} + +// bv describes the memory starting at address scanp. +// Adjust any pointers contained therein. +func adjustpointers(scanp unsafe.Pointer, bv *bitvector, adjinfo *adjustinfo, f funcInfo) { + minp := adjinfo.old.lo + maxp := adjinfo.old.hi + delta := adjinfo.delta + num := uintptr(bv.n) + // If this frame might contain channel receive slots, use CAS + // to adjust pointers. If the slot hasn't been received into + // yet, it may contain stack pointers and a concurrent send + // could race with adjusting those pointers. (The sent value + // itself can never contain stack pointers.) 
+ useCAS := uintptr(scanp) < adjinfo.sghi + for i := uintptr(0); i < num; i += 8 { + if stackDebug >= 4 { + for j := uintptr(0); j < 8; j++ { + print(" ", add(scanp, (i+j)*goarch.PtrSize), ":", ptrnames[bv.ptrbit(i+j)], ":", hex(*(*uintptr)(add(scanp, (i+j)*goarch.PtrSize))), " # ", i, " ", *addb(bv.bytedata, i/8), "\n") + } + } + b := *(addb(bv.bytedata, i/8)) + for b != 0 { + j := uintptr(sys.TrailingZeros8(b)) + b &= b - 1 + pp := (*uintptr)(add(scanp, (i+j)*goarch.PtrSize)) + retry: + p := *pp + if f.valid() && 0 < p && p < minLegalPointer && debug.invalidptr != 0 { + // Looks like a junk value in a pointer slot. + // Live analysis wrong? + getg().m.traceback = 2 + print("runtime: bad pointer in frame ", funcname(f), " at ", pp, ": ", hex(p), "\n") + throw("invalid pointer found on stack") + } + if minp <= p && p < maxp { + if stackDebug >= 3 { + print("adjust ptr ", hex(p), " ", funcname(f), "\n") + } + if useCAS { + ppu := (*unsafe.Pointer)(unsafe.Pointer(pp)) + if !atomic.Casp1(ppu, unsafe.Pointer(p), unsafe.Pointer(p+delta)) { + goto retry + } + } else { + *pp = p + delta + } + } + } + } +} + +// Note: the argument/return area is adjusted by the callee. +func adjustframe(frame *stkframe, arg unsafe.Pointer) bool { + adjinfo := (*adjustinfo)(arg) + if frame.continpc == 0 { + // Frame is dead. + return true + } + f := frame.fn + if stackDebug >= 2 { + print(" adjusting ", funcname(f), " frame=[", hex(frame.sp), ",", hex(frame.fp), "] pc=", hex(frame.pc), " continpc=", hex(frame.continpc), "\n") + } + if f.funcID == funcID_systemstack_switch { + // A special routine at the bottom of stack of a goroutine that does a systemstack call. + // We will allow it to be copied even though we don't + // have full GC info for it (because it is written in asm). + return true + } + + locals, args, objs := frame.getStackMap(&adjinfo.cache, true) + + // Adjust local variables if stack frame has been allocated. 
+ if locals.n > 0 { + size := uintptr(locals.n) * goarch.PtrSize + adjustpointers(unsafe.Pointer(frame.varp-size), &locals, adjinfo, f) + } + + // Adjust saved base pointer if there is one. + // TODO what about arm64 frame pointer adjustment? + if goarch.ArchFamily == goarch.AMD64 && frame.argp-frame.varp == 2*goarch.PtrSize { + if stackDebug >= 3 { + print(" saved bp\n") + } + if debugCheckBP { + // Frame pointers should always point to the next higher frame on + // the Go stack (or be nil, for the top frame on the stack). + bp := *(*uintptr)(unsafe.Pointer(frame.varp)) + if bp != 0 && (bp < adjinfo.old.lo || bp >= adjinfo.old.hi) { + println("runtime: found invalid frame pointer") + print("bp=", hex(bp), " min=", hex(adjinfo.old.lo), " max=", hex(adjinfo.old.hi), "\n") + throw("bad frame pointer") + } + } + adjustpointer(adjinfo, unsafe.Pointer(frame.varp)) + } + + // Adjust arguments. + if args.n > 0 { + if stackDebug >= 3 { + print(" args\n") + } + adjustpointers(unsafe.Pointer(frame.argp), &args, adjinfo, funcInfo{}) + } + + // Adjust pointers in all stack objects (whether they are live or not). + // See comments in mgcmark.go:scanframeworker. + if frame.varp != 0 { + for i := range objs { + obj := &objs[i] + off := obj.off + base := frame.varp // locals base pointer + if off >= 0 { + base = frame.argp // arguments and return values base pointer + } + p := base + uintptr(off) + if p < frame.sp { + // Object hasn't been allocated in the frame yet. + // (Happens when the stack bounds check fails and + // we call into morestack.) 
+ continue + } + ptrdata := obj.ptrdata() + gcdata := obj.gcdata() + var s *mspan + if obj.useGCProg() { + // See comments in mgcmark.go:scanstack + s = materializeGCProg(ptrdata, gcdata) + gcdata = (*byte)(unsafe.Pointer(s.startAddr)) + } + for i := uintptr(0); i < ptrdata; i += goarch.PtrSize { + if *addb(gcdata, i/(8*goarch.PtrSize))>>(i/goarch.PtrSize&7)&1 != 0 { + adjustpointer(adjinfo, unsafe.Pointer(p+i)) + } + } + if s != nil { + dematerializeGCProg(s) + } + } + } + + return true +} + +func adjustctxt(gp *g, adjinfo *adjustinfo) { + adjustpointer(adjinfo, unsafe.Pointer(&gp.sched.ctxt)) + if !framepointer_enabled { + return + } + if debugCheckBP { + bp := gp.sched.bp + if bp != 0 && (bp < adjinfo.old.lo || bp >= adjinfo.old.hi) { + println("runtime: found invalid top frame pointer") + print("bp=", hex(bp), " min=", hex(adjinfo.old.lo), " max=", hex(adjinfo.old.hi), "\n") + throw("bad top frame pointer") + } + } + adjustpointer(adjinfo, unsafe.Pointer(&gp.sched.bp)) +} + +func adjustdefers(gp *g, adjinfo *adjustinfo) { + // Adjust pointers in the Defer structs. + // We need to do this first because we need to adjust the + // defer.link fields so we always work on the new stack. + adjustpointer(adjinfo, unsafe.Pointer(&gp._defer)) + for d := gp._defer; d != nil; d = d.link { + adjustpointer(adjinfo, unsafe.Pointer(&d.fn)) + adjustpointer(adjinfo, unsafe.Pointer(&d.sp)) + adjustpointer(adjinfo, unsafe.Pointer(&d._panic)) + adjustpointer(adjinfo, unsafe.Pointer(&d.link)) + adjustpointer(adjinfo, unsafe.Pointer(&d.varp)) + adjustpointer(adjinfo, unsafe.Pointer(&d.fd)) + } +} + +func adjustpanics(gp *g, adjinfo *adjustinfo) { + // Panics are on stack and already adjusted. + // Update pointer to head of list in G. + adjustpointer(adjinfo, unsafe.Pointer(&gp._panic)) +} + +func adjustsudogs(gp *g, adjinfo *adjustinfo) { + // the data elements pointed to by a SudoG structure + // might be in the stack. 
+ for s := gp.waiting; s != nil; s = s.waitlink { + adjustpointer(adjinfo, unsafe.Pointer(&s.elem)) + } +} + +func fillstack(stk stack, b byte) { + for p := stk.lo; p < stk.hi; p++ { + *(*byte)(unsafe.Pointer(p)) = b + } +} + +func findsghi(gp *g, stk stack) uintptr { + var sghi uintptr + for sg := gp.waiting; sg != nil; sg = sg.waitlink { + p := uintptr(sg.elem) + uintptr(sg.c.elemsize) + if stk.lo <= p && p < stk.hi && p > sghi { + sghi = p + } + } + return sghi +} + +// syncadjustsudogs adjusts gp's sudogs and copies the part of gp's +// stack they refer to while synchronizing with concurrent channel +// operations. It returns the number of bytes of stack copied. +func syncadjustsudogs(gp *g, used uintptr, adjinfo *adjustinfo) uintptr { + if gp.waiting == nil { + return 0 + } + + // Lock channels to prevent concurrent send/receive. + var lastc *hchan + for sg := gp.waiting; sg != nil; sg = sg.waitlink { + if sg.c != lastc { + // There is a ranking cycle here between gscan bit and + // hchan locks. Normally, we only allow acquiring hchan + // locks and then getting a gscan bit. In this case, we + // already have the gscan bit. We allow acquiring hchan + // locks here as a special case, since a deadlock can't + // happen because the G involved must already be + // suspended. So, we get a special hchan lock rank here + // that is lower than gscan, but doesn't allow acquiring + // any other locks other than hchan. + lockWithRank(&sg.c.lock, lockRankHchanLeaf) + } + lastc = sg.c + } + + // Adjust sudogs. + adjustsudogs(gp, adjinfo) + + // Copy the part of the stack the sudogs point in to + // while holding the lock to prevent races on + // send/receive slots. + var sgsize uintptr + if adjinfo.sghi != 0 { + oldBot := adjinfo.old.hi - used + newBot := oldBot + adjinfo.delta + sgsize = adjinfo.sghi - oldBot + memmove(unsafe.Pointer(newBot), unsafe.Pointer(oldBot), sgsize) + } + + // Unlock channels. 
+ lastc = nil + for sg := gp.waiting; sg != nil; sg = sg.waitlink { + if sg.c != lastc { + unlock(&sg.c.lock) + } + lastc = sg.c + } + + return sgsize +} + +// Copies gp's stack to a new stack of a different size. +// Caller must have changed gp status to Gcopystack. +func copystack(gp *g, newsize uintptr) { + if gp.syscallsp != 0 { + throw("stack growth not allowed in system call") + } + old := gp.stack + if old.lo == 0 { + throw("nil stackbase") + } + used := old.hi - gp.sched.sp + // Add just the difference to gcController.addScannableStack. + // g0 stacks never move, so this will never account for them. + // It's also fine if we have no P, addScannableStack can deal with + // that case. + gcController.addScannableStack(getg().m.p.ptr(), int64(newsize)-int64(old.hi-old.lo)) + + // allocate new stack + new := stackalloc(uint32(newsize)) + if stackPoisonCopy != 0 { + fillstack(new, 0xfd) + } + if stackDebug >= 1 { + print("copystack gp=", gp, " [", hex(old.lo), " ", hex(old.hi-used), " ", hex(old.hi), "]", " -> [", hex(new.lo), " ", hex(new.hi-used), " ", hex(new.hi), "]/", newsize, "\n") + } + + // Compute adjustment. + var adjinfo adjustinfo + adjinfo.old = old + adjinfo.delta = new.hi - old.hi + + // Adjust sudogs, synchronizing with channel ops if necessary. + ncopy := used + if !gp.activeStackChans { + if newsize < old.hi-old.lo && gp.parkingOnChan.Load() { + // It's not safe for someone to shrink this stack while we're actively + // parking on a channel, but it is safe to grow since we do that + // ourselves and explicitly don't want to synchronize with channels + // since we could self-deadlock. + throw("racy sudog adjustment due to parking on channel") + } + adjustsudogs(gp, &adjinfo) + } else { + // sudogs may be pointing in to the stack and gp has + // released channel locks, so other goroutines could + // be writing to gp's stack. Find the highest such + // pointer so we can handle everything there and below + // carefully. 
(This shouldn't be far from the bottom + // of the stack, so there's little cost in handling + // everything below it carefully.) + adjinfo.sghi = findsghi(gp, old) + + // Synchronize with channel ops and copy the part of + // the stack they may interact with. + ncopy -= syncadjustsudogs(gp, used, &adjinfo) + } + + // Copy the stack (or the rest of it) to the new location + memmove(unsafe.Pointer(new.hi-ncopy), unsafe.Pointer(old.hi-ncopy), ncopy) + + // Adjust remaining structures that have pointers into stacks. + // We have to do most of these before we traceback the new + // stack because gentraceback uses them. + adjustctxt(gp, &adjinfo) + adjustdefers(gp, &adjinfo) + adjustpanics(gp, &adjinfo) + if adjinfo.sghi != 0 { + adjinfo.sghi += adjinfo.delta + } + + // Swap out old stack for new one + gp.stack = new + gp.stackguard0 = new.lo + _StackGuard // NOTE: might clobber a preempt request + gp.sched.sp = new.hi - used + gp.stktopsp += adjinfo.delta + + // Adjust pointers in the new stack. + gentraceback(^uintptr(0), ^uintptr(0), 0, gp, 0, nil, 0x7fffffff, adjustframe, noescape(unsafe.Pointer(&adjinfo)), 0) + + // free old stack + if stackPoisonCopy != 0 { + fillstack(old, 0xfc) + } + stackfree(old) +} + +// round x up to a power of 2. +func round2(x int32) int32 { + s := uint(0) + for 1<<s < x { + s++ + } + return 1 << s +} + +// Called from runtime·morestack when more stack is needed. +// Allocate larger stack and relocate to new stack. +// Stack growth is multiplicative, for constant amortized cost. +// +// g->atomicstatus will be Grunning or Gscanrunning upon entry. +// If the scheduler is trying to stop this g, then it will set preemptStop. +// +// This must be nowritebarrierrec because it can be called as part of +// stack growth from other nowritebarrierrec functions, but the +// compiler doesn't check this. +// +//go:nowritebarrierrec +func newstack() { + thisg := getg() + // TODO: double check all gp. shouldn't be getg(). 
+ if thisg.m.morebuf.g.ptr().stackguard0 == stackFork { + throw("stack growth after fork") + } + if thisg.m.morebuf.g.ptr() != thisg.m.curg { + print("runtime: newstack called from g=", hex(thisg.m.morebuf.g), "\n"+"\tm=", thisg.m, " m->curg=", thisg.m.curg, " m->g0=", thisg.m.g0, " m->gsignal=", thisg.m.gsignal, "\n") + morebuf := thisg.m.morebuf + traceback(morebuf.pc, morebuf.sp, morebuf.lr, morebuf.g.ptr()) + throw("runtime: wrong goroutine in newstack") + } + + gp := thisg.m.curg + + if thisg.m.curg.throwsplit { + // Update syscallsp, syscallpc in case traceback uses them. + morebuf := thisg.m.morebuf + gp.syscallsp = morebuf.sp + gp.syscallpc = morebuf.pc + pcname, pcoff := "(unknown)", uintptr(0) + f := findfunc(gp.sched.pc) + if f.valid() { + pcname = funcname(f) + pcoff = gp.sched.pc - f.entry() + } + print("runtime: newstack at ", pcname, "+", hex(pcoff), + " sp=", hex(gp.sched.sp), " stack=[", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n", + "\tmorebuf={pc:", hex(morebuf.pc), " sp:", hex(morebuf.sp), " lr:", hex(morebuf.lr), "}\n", + "\tsched={pc:", hex(gp.sched.pc), " sp:", hex(gp.sched.sp), " lr:", hex(gp.sched.lr), " ctxt:", gp.sched.ctxt, "}\n") + + thisg.m.traceback = 2 // Include runtime frames + traceback(morebuf.pc, morebuf.sp, morebuf.lr, gp) + throw("runtime: stack split at bad time") + } + + morebuf := thisg.m.morebuf + thisg.m.morebuf.pc = 0 + thisg.m.morebuf.lr = 0 + thisg.m.morebuf.sp = 0 + thisg.m.morebuf.g = 0 + + // NOTE: stackguard0 may change underfoot, if another thread + // is about to try to preempt gp. Read it just once and use that same + // value now and below. + stackguard0 := atomic.Loaduintptr(&gp.stackguard0) + + // Be conservative about where we preempt. + // We are interested in preempting user Go code, not runtime code. + // If we're holding locks, mallocing, or preemption is disabled, don't + // preempt. 
+ // This check is very early in newstack so that even the status change + // from Grunning to Gwaiting and back doesn't happen in this case. + // That status change by itself can be viewed as a small preemption, + // because the GC might change Gwaiting to Gscanwaiting, and then + // this goroutine has to wait for the GC to finish before continuing. + // If the GC is in some way dependent on this goroutine (for example, + // it needs a lock held by the goroutine), that small preemption turns + // into a real deadlock. + preempt := stackguard0 == stackPreempt + if preempt { + if !canPreemptM(thisg.m) { + // Let the goroutine keep running for now. + // gp->preempt is set, so it will be preempted next time. + gp.stackguard0 = gp.stack.lo + _StackGuard + gogo(&gp.sched) // never return + } + } + + if gp.stack.lo == 0 { + throw("missing stack in newstack") + } + sp := gp.sched.sp + if goarch.ArchFamily == goarch.AMD64 || goarch.ArchFamily == goarch.I386 || goarch.ArchFamily == goarch.WASM { + // The call to morestack cost a word. + sp -= goarch.PtrSize + } + if stackDebug >= 1 || sp < gp.stack.lo { + print("runtime: newstack sp=", hex(sp), " stack=[", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n", + "\tmorebuf={pc:", hex(morebuf.pc), " sp:", hex(morebuf.sp), " lr:", hex(morebuf.lr), "}\n", + "\tsched={pc:", hex(gp.sched.pc), " sp:", hex(gp.sched.sp), " lr:", hex(gp.sched.lr), " ctxt:", gp.sched.ctxt, "}\n") + } + if sp < gp.stack.lo { + print("runtime: gp=", gp, ", goid=", gp.goid, ", gp->status=", hex(readgstatus(gp)), "\n ") + print("runtime: split stack overflow: ", hex(sp), " < ", hex(gp.stack.lo), "\n") + throw("runtime: split stack overflow") + } + + if preempt { + if gp == thisg.m.g0 { + throw("runtime: preempt g0") + } + if thisg.m.p == 0 && thisg.m.locks == 0 { + throw("runtime: g is running but p is not") + } + + if gp.preemptShrink { + // We're at a synchronous safe point now, so + // do the pending stack shrink. 
+ gp.preemptShrink = false + shrinkstack(gp) + } + + if gp.preemptStop { + preemptPark(gp) // never returns + } + + // Act like goroutine called runtime.Gosched. + gopreempt_m(gp) // never return + } + + // Allocate a bigger segment and move the stack. + oldsize := gp.stack.hi - gp.stack.lo + newsize := oldsize * 2 + + // Make sure we grow at least as much as needed to fit the new frame. + // (This is just an optimization - the caller of morestack will + // recheck the bounds on return.) + if f := findfunc(gp.sched.pc); f.valid() { + max := uintptr(funcMaxSPDelta(f)) + needed := max + _StackGuard + used := gp.stack.hi - gp.sched.sp + for newsize-used < needed { + newsize *= 2 + } + } + + if stackguard0 == stackForceMove { + // Forced stack movement used for debugging. + // Don't double the stack (or we may quickly run out + // if this is done repeatedly). + newsize = oldsize + } + + if newsize > maxstacksize || newsize > maxstackceiling { + if maxstacksize < maxstackceiling { + print("runtime: goroutine stack exceeds ", maxstacksize, "-byte limit\n") + } else { + print("runtime: goroutine stack exceeds ", maxstackceiling, "-byte limit\n") + } + print("runtime: sp=", hex(sp), " stack=[", hex(gp.stack.lo), ", ", hex(gp.stack.hi), "]\n") + throw("stack overflow") + } + + // The goroutine must be executing in order to call newstack, + // so it must be Grunning (or Gscanrunning). + casgstatus(gp, _Grunning, _Gcopystack) + + // The concurrent GC will not scan the stack while we are doing the copy since + // the gp is in a Gcopystack status. + copystack(gp, newsize) + if stackDebug >= 1 { + print("stack grow done\n") + } + casgstatus(gp, _Gcopystack, _Grunning) + gogo(&gp.sched) +} + +//go:nosplit +func nilfunc() { + *(*uint8)(nil) = 0 +} + +// adjust Gobuf as if it executed a call to fn +// and then stopped before the first instruction in fn. 
+func gostartcallfn(gobuf *gobuf, fv *funcval) { + var fn unsafe.Pointer + if fv != nil { + fn = unsafe.Pointer(fv.fn) + } else { + fn = unsafe.Pointer(abi.FuncPCABIInternal(nilfunc)) + } + gostartcall(gobuf, fn, unsafe.Pointer(fv)) +} + +// isShrinkStackSafe returns whether it's safe to attempt to shrink +// gp's stack. Shrinking the stack is only safe when we have precise +// pointer maps for all frames on the stack. +func isShrinkStackSafe(gp *g) bool { + // We can't copy the stack if we're in a syscall. + // The syscall might have pointers into the stack and + // often we don't have precise pointer maps for the innermost + // frames. + // + // We also can't copy the stack if we're at an asynchronous + // safe-point because we don't have precise pointer maps for + // all frames. + // + // We also can't *shrink* the stack in the window between the + // goroutine calling gopark to park on a channel and + // gp.activeStackChans being set. + return gp.syscallsp == 0 && !gp.asyncSafePoint && !gp.parkingOnChan.Load() +} + +// Maybe shrink the stack being used by gp. +// +// gp must be stopped and we must own its stack. It may be in +// _Grunning, but only if this is our own user G. +func shrinkstack(gp *g) { + if gp.stack.lo == 0 { + throw("missing stack in shrinkstack") + } + if s := readgstatus(gp); s&_Gscan == 0 { + // We don't own the stack via _Gscan. We could still + // own it if this is our own user G and we're on the + // system stack. + if !(gp == getg().m.curg && getg() != getg().m.curg && s == _Grunning) { + // We don't own the stack. + throw("bad status in shrinkstack") + } + } + if !isShrinkStackSafe(gp) { + throw("shrinkstack at bad time") + } + // Check for self-shrinks while in a libcall. These may have + // pointers into the stack disguised as uintptrs, but these + // code paths should all be nosplit. 
+ if gp == getg().m.curg && gp.m.libcallsp != 0 { + throw("shrinking stack in libcall") + } + + if debug.gcshrinkstackoff > 0 { + return + } + f := findfunc(gp.startpc) + if f.valid() && f.funcID == funcID_gcBgMarkWorker { + // We're not allowed to shrink the gcBgMarkWorker + // stack (see gcBgMarkWorker for explanation). + return + } + + oldsize := gp.stack.hi - gp.stack.lo + newsize := oldsize / 2 + // Don't shrink the allocation below the minimum-sized stack + // allocation. + if newsize < _FixedStack { + return + } + // Compute how much of the stack is currently in use and only + // shrink the stack if gp is using less than a quarter of its + // current stack. The currently used stack includes everything + // down to the SP plus the stack guard space that ensures + // there's room for nosplit functions. + avail := gp.stack.hi - gp.stack.lo + if used := gp.stack.hi - gp.sched.sp + _StackLimit; used >= avail/4 { + return + } + + if stackDebug > 0 { + print("shrinking stack ", oldsize, "->", newsize, "\n") + } + + copystack(gp, newsize) +} + +// freeStackSpans frees unused stack spans at the end of GC. +func freeStackSpans() { + // Scan stack pools for empty stack spans. + for order := range stackpool { + lock(&stackpool[order].item.mu) + list := &stackpool[order].item.span + for s := list.first; s != nil; { + next := s.next + if s.allocCount == 0 { + list.remove(s) + s.manualFreeList = 0 + osStackFree(s) + mheap_.freeManual(s, spanAllocStack) + } + s = next + } + unlock(&stackpool[order].item.mu) + } + + // Free large stack spans. + lock(&stackLarge.lock) + for i := range stackLarge.free { + for s := stackLarge.free[i].first; s != nil; { + next := s.next + stackLarge.free[i].remove(s) + osStackFree(s) + mheap_.freeManual(s, spanAllocStack) + s = next + } + } + unlock(&stackLarge.lock) +} + +// A stackObjectRecord is generated by the compiler for each stack object in a stack frame. 
+// This record must match the generator code in cmd/compile/internal/liveness/plive.go:emitStackObjects. +type stackObjectRecord struct { + // offset in frame + // if negative, offset from varp + // if non-negative, offset from argp + off int32 + size int32 + _ptrdata int32 // ptrdata, or -ptrdata if GC prog is used + gcdataoff uint32 // offset to gcdata from moduledata.rodata +} + +func (r *stackObjectRecord) useGCProg() bool { + return r._ptrdata < 0 +} + +func (r *stackObjectRecord) ptrdata() uintptr { + x := r._ptrdata + if x < 0 { + return uintptr(-x) + } + return uintptr(x) +} + +// gcdata returns pointer map or GC prog of the type. +func (r *stackObjectRecord) gcdata() *byte { + ptr := uintptr(unsafe.Pointer(r)) + var mod *moduledata + for datap := &firstmoduledata; datap != nil; datap = datap.next { + if datap.gofunc <= ptr && ptr < datap.end { + mod = datap + break + } + } + // If you get a panic here due to a nil mod, + // you may have made a copy of a stackObjectRecord. + // You must use the original pointer. + res := mod.rodata + uintptr(r.gcdataoff) + return (*byte)(unsafe.Pointer(res)) +} + +// This is exported as ABI0 via linkname so obj can call it. +// +//go:nosplit +//go:linkname morestackc +func morestackc() { + throw("attempt to execute system stack code on user stack") +} + +// startingStackSize is the amount of stack that new goroutines start with. +// It is a power of 2, and between _FixedStack and maxstacksize, inclusive. +// startingStackSize is updated every GC by tracking the average size of +// stacks scanned during the GC. +var startingStackSize uint32 = _FixedStack + +func gcComputeStartingStackSize() { + if debug.adaptivestackstart == 0 { + return + } + // For details, see the design doc at + // https://docs.google.com/document/d/1YDlGIdVTPnmUiTAavlZxBI1d9pwGQgZT7IKFKlIXohQ/edit?usp=sharing + // The basic algorithm is to track the average size of stacks + // and start goroutines with stack equal to that average size. 
+ // Starting at the average size uses at most 2x the space that + // an ideal algorithm would have used. + // This is just a heuristic to avoid excessive stack growth work + // early in a goroutine's lifetime. See issue 18138. Stacks that + // are allocated too small can still grow, and stacks allocated + // too large can still shrink. + var scannedStackSize uint64 + var scannedStacks uint64 + for _, p := range allp { + scannedStackSize += p.scannedStackSize + scannedStacks += p.scannedStacks + // Reset for next time + p.scannedStackSize = 0 + p.scannedStacks = 0 + } + if scannedStacks == 0 { + startingStackSize = _FixedStack + return + } + avg := scannedStackSize/scannedStacks + _StackGuard + // Note: we add _StackGuard to ensure that a goroutine that + // uses the average space will not trigger a growth. + if avg > uint64(maxstacksize) { + avg = uint64(maxstacksize) + } + if avg < _FixedStack { + avg = _FixedStack + } + // Note: maxstacksize fits in 30 bits, so avg also does. + startingStackSize = uint32(round2(int32(avg))) +} diff --git a/src/runtime/stack_test.go b/src/runtime/stack_test.go new file mode 100644 index 0000000..92d5880 --- /dev/null +++ b/src/runtime/stack_test.go @@ -0,0 +1,939 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "reflect" + "regexp" + . "runtime" + "strings" + "sync" + "sync/atomic" + "testing" + "time" + _ "unsafe" // for go:linkname +) + +// TestStackMem measures per-thread stack segment cache behavior. +// The test consumed up to 500MB in the past. 
+func TestStackMem(t *testing.T) { + const ( + BatchSize = 32 + BatchCount = 256 + ArraySize = 1024 + RecursionDepth = 128 + ) + if testing.Short() { + return + } + defer GOMAXPROCS(GOMAXPROCS(BatchSize)) + s0 := new(MemStats) + ReadMemStats(s0) + for b := 0; b < BatchCount; b++ { + c := make(chan bool, BatchSize) + for i := 0; i < BatchSize; i++ { + go func() { + var f func(k int, a [ArraySize]byte) + f = func(k int, a [ArraySize]byte) { + if k == 0 { + time.Sleep(time.Millisecond) + return + } + f(k-1, a) + } + f(RecursionDepth, [ArraySize]byte{}) + c <- true + }() + } + for i := 0; i < BatchSize; i++ { + <-c + } + + // The goroutines have signaled via c that they are ready to exit. + // Give them a chance to exit by sleeping. If we don't wait, we + // might not reuse them on the next batch. + time.Sleep(10 * time.Millisecond) + } + s1 := new(MemStats) + ReadMemStats(s1) + consumed := int64(s1.StackSys - s0.StackSys) + t.Logf("Consumed %vMB for stack mem", consumed>>20) + estimate := int64(8 * BatchSize * ArraySize * RecursionDepth) // 8 is to reduce flakiness. + if consumed > estimate { + t.Fatalf("Stack mem: want %v, got %v", estimate, consumed) + } + // Due to broken stack memory accounting (https://golang.org/issue/7468), + // StackInuse can decrease during function execution, so we cast the values to int64. + inuse := int64(s1.StackInuse) - int64(s0.StackInuse) + t.Logf("Inuse %vMB for stack mem", inuse>>20) + if inuse > 4<<20 { + t.Fatalf("Stack inuse: want %v, got %v", 4<<20, inuse) + } +} + +// Test stack growing in different contexts. 
+func TestStackGrowth(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + + t.Parallel() + + var wg sync.WaitGroup + + // in a normal goroutine + var growDuration time.Duration // For debugging failures + wg.Add(1) + go func() { + defer wg.Done() + start := time.Now() + growStack(nil) + growDuration = time.Since(start) + }() + wg.Wait() + t.Log("first growStack took", growDuration) + + // in locked goroutine + wg.Add(1) + go func() { + defer wg.Done() + LockOSThread() + growStack(nil) + UnlockOSThread() + }() + wg.Wait() + + // in finalizer + var finalizerStart time.Time + var started atomic.Bool + var progress atomic.Uint32 + wg.Add(1) + s := new(string) // Must be of a type that avoids the tiny allocator, or else the finalizer might not run. + SetFinalizer(s, func(ss *string) { + defer wg.Done() + finalizerStart = time.Now() + started.Store(true) + growStack(&progress) + }) + setFinalizerTime := time.Now() + s = nil + + if d, ok := t.Deadline(); ok { + // Pad the timeout by an arbitrary 5% to give the AfterFunc time to run. + timeout := time.Until(d) * 19 / 20 + timer := time.AfterFunc(timeout, func() { + // Panic — instead of calling t.Error and returning from the test — so + // that we get a useful goroutine dump if the test times out, especially + // if GOTRACEBACK=system or GOTRACEBACK=crash is set. + if !started.Load() { + panic("finalizer did not start") + } else { + panic(fmt.Sprintf("finalizer started %s ago (%s after registration) and ran %d iterations, but did not return", time.Since(finalizerStart), finalizerStart.Sub(setFinalizerTime), progress.Load())) + } + }) + defer timer.Stop() + } + + GC() + wg.Wait() + t.Logf("finalizer started after %s and ran %d iterations in %v", finalizerStart.Sub(setFinalizerTime), progress.Load(), time.Since(finalizerStart)) +} + +// ... 
and in init +//func init() { +// growStack() +//} + +func growStack(progress *atomic.Uint32) { + n := 1 << 10 + if testing.Short() { + n = 1 << 8 + } + for i := 0; i < n; i++ { + x := 0 + growStackIter(&x, i) + if x != i+1 { + panic("stack is corrupted") + } + if progress != nil { + progress.Store(uint32(i)) + } + } + GC() +} + +// This function is not an anonymous func, so that the compiler can do escape +// analysis and place x on stack (and subsequently stack growth update the pointer). +func growStackIter(p *int, n int) { + if n == 0 { + *p = n + 1 + GC() + return + } + *p = n + 1 + x := 0 + growStackIter(&x, n-1) + if x != n { + panic("stack is corrupted") + } +} + +func TestStackGrowthCallback(t *testing.T) { + t.Parallel() + var wg sync.WaitGroup + + // test stack growth at chan op + wg.Add(1) + go func() { + defer wg.Done() + c := make(chan int, 1) + growStackWithCallback(func() { + c <- 1 + <-c + }) + }() + + // test stack growth at map op + wg.Add(1) + go func() { + defer wg.Done() + m := make(map[int]int) + growStackWithCallback(func() { + _, _ = m[1] + m[1] = 1 + }) + }() + + // test stack growth at goroutine creation + wg.Add(1) + go func() { + defer wg.Done() + growStackWithCallback(func() { + done := make(chan bool) + go func() { + done <- true + }() + <-done + }) + }() + wg.Wait() +} + +func growStackWithCallback(cb func()) { + var f func(n int) + f = func(n int) { + if n == 0 { + cb() + return + } + f(n - 1) + } + for i := 0; i < 1<<10; i++ { + f(i) + } +} + +// TestDeferPtrs tests the adjustment of Defer's argument pointers (p aka &y) +// during a stack copy. 
+func set(p *int, x int) { + *p = x +} +func TestDeferPtrs(t *testing.T) { + var y int + + defer func() { + if y != 42 { + t.Errorf("defer's stack references were not adjusted appropriately") + } + }() + defer set(&y, 42) + growStack(nil) +} + +type bigBuf [4 * 1024]byte + +// TestDeferPtrsGoexit is like TestDeferPtrs but exercises the possibility that the +// stack grows as part of starting the deferred function. It calls Goexit at various +// stack depths, forcing the deferred function (with >4kB of args) to be run at +// the bottom of the stack. The goal is to find a stack depth less than 4kB from +// the end of the stack. Each trial runs in a different goroutine so that an earlier +// stack growth does not invalidate a later attempt. +func TestDeferPtrsGoexit(t *testing.T) { + for i := 0; i < 100; i++ { + c := make(chan int, 1) + go testDeferPtrsGoexit(c, i) + if n := <-c; n != 42 { + t.Fatalf("defer's stack references were not adjusted appropriately (i=%d n=%d)", i, n) + } + } +} + +func testDeferPtrsGoexit(c chan int, i int) { + var y int + defer func() { + c <- y + }() + defer setBig(&y, 42, bigBuf{}) + useStackAndCall(i, Goexit) +} + +func setBig(p *int, x int, b bigBuf) { + *p = x +} + +// TestDeferPtrsPanic is like TestDeferPtrsGoexit, but it's using panic instead +// of Goexit to run the Defers. Those two are different execution paths +// in the runtime. +func TestDeferPtrsPanic(t *testing.T) { + for i := 0; i < 100; i++ { + c := make(chan int, 1) + go testDeferPtrsPanic(c, i) + if n := <-c; n != 42 { + t.Fatalf("defer's stack references were not adjusted appropriately (i=%d n=%d)", i, n) + } + } +} + +func testDeferPtrsPanic(c chan int, i int) { + var y int + defer func() { + if recover() == nil { + c <- -1 + return + } + c <- y + }() + defer setBig(&y, 42, bigBuf{}) + useStackAndCall(i, func() { panic(1) }) +} + +//go:noinline +func testDeferLeafSigpanic1() { + // Cause a sigpanic to be injected in this frame.
+ // + // This function has to be declared before + // TestDeferLeafSigpanic so the runtime will crash if we think + // this function's continuation PC is in + // TestDeferLeafSigpanic. + *(*int)(nil) = 0 +} + +// TestDeferLeafSigpanic tests defer matching around leaf functions +// that sigpanic. This is tricky because on LR machines the outer +// function and the inner function have the same SP, but it's critical +// that we match up the defer correctly to get the right liveness map. +// See issue #25499. +func TestDeferLeafSigpanic(t *testing.T) { + // Push a defer that will walk the stack. + defer func() { + if err := recover(); err == nil { + t.Fatal("expected panic from nil pointer") + } + GC() + }() + // Call a leaf function. We must set up the exact call stack: + // + // deferring function -> leaf function -> sigpanic + // + // On LR machines, the leaf function will have the same SP as + // the SP pushed for the defer frame. + testDeferLeafSigpanic1() +} + +// TestPanicUseStack checks that a chain of Panic structs on the stack is +// updated correctly if the stack grows during the deferred execution that +// happens as a result of the panic. +func TestPanicUseStack(t *testing.T) { + pc := make([]uintptr, 10000) + defer func() { + recover() + Callers(0, pc) // force stack walk + useStackAndCall(100, func() { + defer func() { + recover() + Callers(0, pc) // force stack walk + useStackAndCall(200, func() { + defer func() { + recover() + Callers(0, pc) // force stack walk + }() + panic(3) + }) + }() + panic(2) + }) + }() + panic(1) +} + +func TestPanicFar(t *testing.T) { + var xtree *xtreeNode + pc := make([]uintptr, 10000) + defer func() { + // At this point we created a large stack and unwound + // it via recovery. Force a stack walk, which will + // check the stack's consistency. + Callers(0, pc) + }() + defer func() { + recover() + }() + useStackAndCall(100, func() { + // Kick off the GC and make it do something nontrivial.
+ // (This used to force stack barriers to stick around.) + xtree = makeTree(18) + // Give the GC time to start scanning stacks. + time.Sleep(time.Millisecond) + panic(1) + }) + _ = xtree +} + +type xtreeNode struct { + l, r *xtreeNode +} + +func makeTree(d int) *xtreeNode { + if d == 0 { + return new(xtreeNode) + } + return &xtreeNode{makeTree(d - 1), makeTree(d - 1)} +} + +// use about n KB of stack and call f +func useStackAndCall(n int, f func()) { + if n == 0 { + f() + return + } + var b [1024]byte // makes frame about 1KB + useStackAndCall(n-1+int(b[99]), f) +} + +func useStack(n int) { + useStackAndCall(n, func() {}) +} + +func growing(c chan int, done chan struct{}) { + for n := range c { + useStack(n) + done <- struct{}{} + } + done <- struct{}{} +} + +func TestStackCache(t *testing.T) { + // Allocate a bunch of goroutines and grow their stacks. + // Repeat a few times to test the stack cache. + const ( + R = 4 + G = 200 + S = 5 + ) + for i := 0; i < R; i++ { + var reqchans [G]chan int + done := make(chan struct{}) + for j := 0; j < G; j++ { + reqchans[j] = make(chan int) + go growing(reqchans[j], done) + } + for s := 0; s < S; s++ { + for j := 0; j < G; j++ { + reqchans[j] <- 1 << uint(s) + } + for j := 0; j < G; j++ { + <-done + } + } + for j := 0; j < G; j++ { + close(reqchans[j]) + } + for j := 0; j < G; j++ { + <-done + } + } +} + +func TestStackOutput(t *testing.T) { + b := make([]byte, 1024) + stk := string(b[:Stack(b, false)]) + if !strings.HasPrefix(stk, "goroutine ") { + t.Errorf("Stack (len %d):\n%s", len(stk), stk) + t.Errorf("Stack output should begin with \"goroutine \"") + } +} + +func TestStackAllOutput(t *testing.T) { + b := make([]byte, 1024) + stk := string(b[:Stack(b, true)]) + if !strings.HasPrefix(stk, "goroutine ") { + t.Errorf("Stack (len %d):\n%s", len(stk), stk) + t.Errorf("Stack output should begin with \"goroutine \"") + } +} + +func TestStackPanic(t *testing.T) { + // Test that stack copying copies panics correctly. 
This is difficult + // to test because it is very unlikely that the stack will be copied + // in the middle of gopanic. But it can happen. + // To make this test effective, edit panic.go:gopanic and uncomment + // the GC() call just before freedefer(d). + defer func() { + if x := recover(); x == nil { + t.Errorf("recover failed") + } + }() + useStack(32) + panic("test panic") +} + +func BenchmarkStackCopyPtr(b *testing.B) { + c := make(chan bool) + for i := 0; i < b.N; i++ { + go func() { + i := 1000000 + countp(&i) + c <- true + }() + <-c + } +} + +func countp(n *int) { + if *n == 0 { + return + } + *n-- + countp(n) +} + +func BenchmarkStackCopy(b *testing.B) { + c := make(chan bool) + for i := 0; i < b.N; i++ { + go func() { + count(1000000) + c <- true + }() + <-c + } +} + +func count(n int) int { + if n == 0 { + return 0 + } + return 1 + count(n-1) +} + +func BenchmarkStackCopyNoCache(b *testing.B) { + c := make(chan bool) + for i := 0; i < b.N; i++ { + go func() { + count1(1000000) + c <- true + }() + <-c + } +} + +func count1(n int) int { + if n <= 0 { + return 0 + } + return 1 + count2(n-1) +} + +func count2(n int) int { return 1 + count3(n-1) } +func count3(n int) int { return 1 + count4(n-1) } +func count4(n int) int { return 1 + count5(n-1) } +func count5(n int) int { return 1 + count6(n-1) } +func count6(n int) int { return 1 + count7(n-1) } +func count7(n int) int { return 1 + count8(n-1) } +func count8(n int) int { return 1 + count9(n-1) } +func count9(n int) int { return 1 + count10(n-1) } +func count10(n int) int { return 1 + count11(n-1) } +func count11(n int) int { return 1 + count12(n-1) } +func count12(n int) int { return 1 + count13(n-1) } +func count13(n int) int { return 1 + count14(n-1) } +func count14(n int) int { return 1 + count15(n-1) } +func count15(n int) int { return 1 + count16(n-1) } +func count16(n int) int { return 1 + count17(n-1) } +func count17(n int) int { return 1 + count18(n-1) } +func count18(n int) int { return 1 + 
count19(n-1) } +func count19(n int) int { return 1 + count20(n-1) } +func count20(n int) int { return 1 + count21(n-1) } +func count21(n int) int { return 1 + count22(n-1) } +func count22(n int) int { return 1 + count23(n-1) } +func count23(n int) int { return 1 + count1(n-1) } + +type stkobjT struct { + p *stkobjT + x int64 + y [20]int // consume some stack +} + +// Sum creates a linked list of stkobjTs. +func Sum(n int64, p *stkobjT) { + if n == 0 { + return + } + s := stkobjT{p: p, x: n} + Sum(n-1, &s) + p.x += s.x +} + +func BenchmarkStackCopyWithStkobj(b *testing.B) { + c := make(chan bool) + for i := 0; i < b.N; i++ { + go func() { + var s stkobjT + Sum(100000, &s) + c <- true + }() + <-c + } +} + +func BenchmarkIssue18138(b *testing.B) { + // Channel with N "can run a goroutine" tokens + const N = 10 + c := make(chan []byte, N) + for i := 0; i < N; i++ { + c <- make([]byte, 1) + } + + for i := 0; i < b.N; i++ { + <-c // get token + go func() { + useStackPtrs(1000, false) // uses ~1MB max + m := make([]byte, 8192) // make GC trigger occasionally + c <- m // return token + }() + } +} + +func useStackPtrs(n int, b bool) { + if b { + // This code contributes to the stack frame size, and hence to the + // stack copying cost. But since b is always false, it costs no + // execution time (not even the zeroing of a). 
+ var a [128]*int // 1KB of pointers + a[n] = &n + n = *a[0] + } + if n == 0 { + return + } + useStackPtrs(n-1, b) +} + +type structWithMethod struct{} + +func (s structWithMethod) caller() string { + _, file, line, ok := Caller(1) + if !ok { + panic("Caller failed") + } + return fmt.Sprintf("%s:%d", file, line) +} + +func (s structWithMethod) callers() []uintptr { + pc := make([]uintptr, 16) + return pc[:Callers(0, pc)] +} + +func (s structWithMethod) stack() string { + buf := make([]byte, 4<<10) + return string(buf[:Stack(buf, false)]) +} + +func (s structWithMethod) nop() {} + +func TestStackWrapperCaller(t *testing.T) { + var d structWithMethod + // Force the compiler to construct a wrapper method. + wrapper := (*structWithMethod).caller + // Check that the wrapper doesn't affect the stack trace. + if dc, ic := d.caller(), wrapper(&d); dc != ic { + t.Fatalf("direct caller %q != indirect caller %q", dc, ic) + } +} + +func TestStackWrapperCallers(t *testing.T) { + var d structWithMethod + wrapper := (*structWithMethod).callers + // Check that <autogenerated> doesn't appear in the stack trace. + pcs := wrapper(&d) + frames := CallersFrames(pcs) + for { + fr, more := frames.Next() + if fr.File == "<autogenerated>" { + t.Fatalf("<autogenerated> appears in stack trace: %+v", fr) + } + if !more { + break + } + } +} + +func TestStackWrapperStack(t *testing.T) { + var d structWithMethod + wrapper := (*structWithMethod).stack + // Check that <autogenerated> doesn't appear in the stack trace. + stk := wrapper(&d) + if strings.Contains(stk, "<autogenerated>") { + t.Fatalf("<autogenerated> appears in stack trace:\n%s", stk) + } +} + +type I interface { + M() +} + +func TestStackWrapperStackPanic(t *testing.T) { + t.Run("sigpanic", func(t *testing.T) { + // nil calls to interface methods cause a sigpanic. + testStackWrapperPanic(t, func() { I.M(nil) }, "runtime_test.I.M") + }) + t.Run("panicwrap", func(t *testing.T) { + // Nil calls to value method wrappers call panicwrap. 
+ wrapper := (*structWithMethod).nop + testStackWrapperPanic(t, func() { wrapper(nil) }, "runtime_test.(*structWithMethod).nop") + }) +} + +func testStackWrapperPanic(t *testing.T, cb func(), expect string) { + // Test that the stack trace from a panicking wrapper includes + // the wrapper, even though we elide these when they don't panic. + t.Run("CallersFrames", func(t *testing.T) { + defer func() { + err := recover() + if err == nil { + t.Fatalf("expected panic") + } + pcs := make([]uintptr, 10) + n := Callers(0, pcs) + frames := CallersFrames(pcs[:n]) + for { + frame, more := frames.Next() + t.Log(frame.Function) + if frame.Function == expect { + return + } + if !more { + break + } + } + t.Fatalf("panicking wrapper %s missing from stack trace", expect) + }() + cb() + }) + t.Run("Stack", func(t *testing.T) { + defer func() { + err := recover() + if err == nil { + t.Fatalf("expected panic") + } + buf := make([]byte, 4<<10) + stk := string(buf[:Stack(buf, false)]) + if !strings.Contains(stk, "\n"+expect) { + t.Fatalf("panicking wrapper %s missing from stack trace:\n%s", expect, stk) + } + }() + cb() + }) +} + +func TestCallersFromWrapper(t *testing.T) { + // Test that invoking CallersFrames on a stack where the first + // PC is an autogenerated wrapper keeps the wrapper in the + // trace. Normally we elide these, assuming that the wrapper + // calls the thing you actually wanted to see, but in this + // case we need to keep it. + pc := reflect.ValueOf(I.M).Pointer() + frames := CallersFrames([]uintptr{pc}) + frame, more := frames.Next() + if frame.Function != "runtime_test.I.M" { + t.Fatalf("want function %s, got %s", "runtime_test.I.M", frame.Function) + } + if more { + t.Fatalf("want 1 frame, got > 1") + } +} + +func TestTracebackSystemstack(t *testing.T) { + if GOARCH == "ppc64" || GOARCH == "ppc64le" { + t.Skip("systemstack tail call not implemented on ppc64x") + } + + // Test that profiles correctly jump over systemstack, + // including nested systemstack calls.
+ pcs := make([]uintptr, 20) + pcs = pcs[:TracebackSystemstack(pcs, 5)] + // Check that runtime.TracebackSystemstack appears five times + // and that we see TestTracebackSystemstack. + countIn, countOut := 0, 0 + frames := CallersFrames(pcs) + var tb strings.Builder + for { + frame, more := frames.Next() + fmt.Fprintf(&tb, "\n%s+0x%x %s:%d", frame.Function, frame.PC-frame.Entry, frame.File, frame.Line) + switch frame.Function { + case "runtime.TracebackSystemstack": + countIn++ + case "runtime_test.TestTracebackSystemstack": + countOut++ + } + if !more { + break + } + } + if countIn != 5 || countOut != 1 { + t.Fatalf("expected 5 calls to TracebackSystemstack and 1 call to TestTracebackSystemstack, got:%s", tb.String()) + } +} + +func TestTracebackAncestors(t *testing.T) { + goroutineRegex := regexp.MustCompile(`goroutine [0-9]+ \[`) + for _, tracebackDepth := range []int{0, 1, 5, 50} { + output := runTestProg(t, "testprog", "TracebackAncestors", fmt.Sprintf("GODEBUG=tracebackancestors=%d", tracebackDepth)) + + numGoroutines := 3 + numFrames := 2 + ancestorsExpected := numGoroutines + if numGoroutines > tracebackDepth { + ancestorsExpected = tracebackDepth + } + + matches := goroutineRegex.FindAllStringSubmatch(output, -1) + if len(matches) != 2 { + t.Fatalf("want 2 goroutines, got:\n%s", output) + } + + // Check functions in the traceback. 
+ fns := []string{"main.recurseThenCallGo", "main.main", "main.printStack", "main.TracebackAncestors"} + for _, fn := range fns { + if !strings.Contains(output, "\n"+fn+"(") { + t.Fatalf("expected %q function in traceback:\n%s", fn, output) + } + } + + if want, count := "originating from goroutine", ancestorsExpected; strings.Count(output, want) != count { + t.Errorf("output does not contain %d instances of %q:\n%s", count, want, output) + } + + if want, count := "main.recurseThenCallGo(...)", ancestorsExpected*(numFrames+1); strings.Count(output, want) != count { + t.Errorf("output does not contain %d instances of %q:\n%s", count, want, output) + } + + if want, count := "main.recurseThenCallGo(0x", 1; strings.Count(output, want) != count { + t.Errorf("output does not contain %d instances of %q:\n%s", count, want, output) + } + } +} + +// Test that defer closure is correctly scanned when the stack is scanned. +func TestDeferLiveness(t *testing.T) { + output := runTestProg(t, "testprog", "DeferLiveness", "GODEBUG=clobberfree=1") + if output != "" { + t.Errorf("output:\n%s\n\nwant no output", output) + } +} + +func TestDeferHeapAndStack(t *testing.T) { + P := 4 // processors + N := 10000 //iterations + D := 200 // stack depth + + if testing.Short() { + P /= 2 + N /= 10 + D /= 10 + } + c := make(chan bool) + for p := 0; p < P; p++ { + go func() { + for i := 0; i < N; i++ { + if deferHeapAndStack(D) != 2*D { + panic("bad result") + } + } + c <- true + }() + } + for p := 0; p < P; p++ { + <-c + } +} + +// deferHeapAndStack(n) computes 2*n +func deferHeapAndStack(n int) (r int) { + if n == 0 { + return 0 + } + if n%2 == 0 { + // heap-allocated defers + for i := 0; i < 2; i++ { + defer func() { + r++ + }() + } + } else { + // stack-allocated defers + defer func() { + r++ + }() + defer func() { + r++ + }() + } + r = deferHeapAndStack(n - 1) + escapeMe(new([1024]byte)) // force some GCs + return +} + +// Pass a value to escapeMe to force it to escape. 
+var escapeMe = func(x any) {} + +// Test that when F -> G is inlined and F is excluded from stack +// traces, G still appears. +func TestTracebackInlineExcluded(t *testing.T) { + defer func() { + recover() + buf := make([]byte, 4<<10) + stk := string(buf[:Stack(buf, false)]) + + t.Log(stk) + + if not := "tracebackExcluded"; strings.Contains(stk, not) { + t.Errorf("found but did not expect %q", not) + } + if want := "tracebackNotExcluded"; !strings.Contains(stk, want) { + t.Errorf("expected %q in stack", want) + } + }() + tracebackExcluded() +} + +// tracebackExcluded should be excluded from tracebacks. There are +// various ways this could come up. Linking it to a "runtime." name is +// rather synthetic, but it's easy and reliable. See issue #42754 for +// one way this happened in real code. +// +//go:linkname tracebackExcluded runtime.tracebackExcluded +//go:noinline +func tracebackExcluded() { + // Call an inlined function that should not itself be excluded + // from tracebacks. + tracebackNotExcluded() +} + +// tracebackNotExcluded should be inlined into tracebackExcluded, but +// should not itself be excluded from the traceback. +func tracebackNotExcluded() { + var x *int + *x = 0 +} diff --git a/src/runtime/start_line_amd64_test.go b/src/runtime/start_line_amd64_test.go new file mode 100644 index 0000000..305ed0b --- /dev/null +++ b/src/runtime/start_line_amd64_test.go @@ -0,0 +1,23 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime/internal/startlinetest" + "testing" +) + +// TestStartLineAsm tests the start line metadata of an assembly function. This +// is only tested on amd64 to avoid the need for a proliferation of per-arch +// copies of this function. 
+func TestStartLineAsm(t *testing.T) { + startlinetest.CallerStartLine = callerStartLine + + const wantLine = 23 + got := startlinetest.AsmFunc() + if got != wantLine { + t.Errorf("start line got %d want %d", got, wantLine) + } +} diff --git a/src/runtime/start_line_test.go b/src/runtime/start_line_test.go new file mode 100644 index 0000000..6c4faa8 --- /dev/null +++ b/src/runtime/start_line_test.go @@ -0,0 +1,138 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "fmt" + "internal/testenv" + "runtime" + "testing" +) + +// The tests in this file test the function start line metadata included in +// _func and inlinedCall. TestStartLine hard-codes the start lines of functions +// in this file. If code moves, the test will need to be updated. +// +// The "start line" of a function should be the line containing the func +// keyword. + +func normalFunc() int { + return callerStartLine(false) +} + +func multilineDeclarationFunc() int { + return multilineDeclarationFunc1(0, 0, 0) +} + +//go:noinline +func multilineDeclarationFunc1( + a, b, c int) int { + return callerStartLine(false) +} + +func blankLinesFunc() int { + + // Some + // lines + // without + // code + + return callerStartLine(false) +} + +func inlineFunc() int { + return inlineFunc1() +} + +func inlineFunc1() int { + return callerStartLine(true) +} + +var closureFn func() int + +func normalClosure() int { + // Assign to global to ensure this isn't inlined. + closureFn = func() int { + return callerStartLine(false) + } + return closureFn() +} + +func inlineClosure() int { + return func() int { + return callerStartLine(true) + }() +} + +func TestStartLine(t *testing.T) { + // We test inlined vs non-inlined variants. We can't do that if + // optimizations are disabled. 
+ testenv.SkipIfOptimizationOff(t) + + testCases := []struct{ + name string + fn func() int + want int + }{ + { + name: "normal", + fn: normalFunc, + want: 21, + }, + { + name: "multiline-declaration", + fn: multilineDeclarationFunc, + want: 30, + }, + { + name: "blank-lines", + fn: blankLinesFunc, + want: 35, + }, + { + name: "inline", + fn: inlineFunc, + want: 49, + }, + { + name: "normal-closure", + fn: normalClosure, + want: 57, + }, + { + name: "inline-closure", + fn: inlineClosure, + want: 64, + }, + } + + for _, tc := range testCases { + t.Run(tc.name, func(t *testing.T) { + got := tc.fn() + if got != tc.want { + t.Errorf("start line got %d want %d", got, tc.want) + } + }) + } +} + +//go:noinline +func callerStartLine(wantInlined bool) int { + var pcs [1]uintptr + n := runtime.Callers(2, pcs[:]) + if n != 1 { + panic(fmt.Sprintf("no caller of callerStartLine? n = %d", n)) + } + + frames := runtime.CallersFrames(pcs[:]) + frame, _ := frames.Next() + + inlined := frame.Func == nil // Func always set to nil for inlined frames + if wantInlined != inlined { + panic(fmt.Sprintf("caller %s inlined got %v want %v", frame.Function, inlined, wantInlined)) + } + + return runtime.FrameStartLine(&frame) +} diff --git a/src/runtime/stkframe.go b/src/runtime/stkframe.go new file mode 100644 index 0000000..3ecf3a8 --- /dev/null +++ b/src/runtime/stkframe.go @@ -0,0 +1,289 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +// A stkframe holds information about a single physical stack frame. +type stkframe struct { + // fn is the function being run in this frame. If there is + // inlining, this is the outermost function. + fn funcInfo + + // pc is the program counter within fn. 
+ // + // The meaning of this is subtle: + // + // - Typically, this frame performed a regular function call + // and this is the return PC (just after the CALL + // instruction). In this case, pc-1 reflects the CALL + // instruction itself and is the correct source of symbolic + // information. + // + // - If this frame "called" sigpanic, then pc is the + // instruction that panicked, and pc is the correct address + // to use for symbolic information. + // + // - If this is the innermost frame, then PC is where + // execution will continue, but it may not be the + // instruction following a CALL. This may be from + // cooperative preemption, in which case this is the + // instruction after the call to morestack. Or this may be + // from a signal or an un-started goroutine, in which case + // PC could be any instruction, including the first + // instruction in a function. Conventionally, we use pc-1 + // for symbolic information, unless pc == fn.entry(), in + // which case we use pc. + pc uintptr + + // continpc is the PC where execution will continue in fn, or + // 0 if execution will not continue in this frame. + // + // This is usually the same as pc, unless this frame "called" + // sigpanic, in which case it's either the address of + // deferreturn or 0 if this frame will never execute again. + // + // This is the PC to use to look up GC liveness for this frame. + continpc uintptr + + lr uintptr // program counter at caller aka link register + sp uintptr // stack pointer at pc + fp uintptr // stack pointer at caller aka frame pointer + varp uintptr // top of local variables + argp uintptr // pointer to function arguments +} + +// reflectMethodValue is a partial duplicate of reflect.makeFuncImpl +// and reflect.methodValue. +type reflectMethodValue struct { + fn uintptr + stack *bitvector // ptrmap for both args and results + argLen uintptr // just args +} + +// argBytes returns the argument frame size for a call to frame.fn. 
+func (frame *stkframe) argBytes() uintptr { + if frame.fn.args != _ArgsSizeUnknown { + return uintptr(frame.fn.args) + } + // This is an uncommon and complicated case. Fall back to fully + // fetching the argument map to compute its size. + argMap, _ := frame.argMapInternal() + return uintptr(argMap.n) * goarch.PtrSize +} + +// argMapInternal is used internally by stkframe to fetch special +// argument maps. +// +// argMap.n is always populated with the size of the argument map. +// +// argMap.bytedata is only populated for dynamic argument maps (used +// by reflect). If the caller requires the argument map, it should use +// this if non-nil, and otherwise fetch the argument map using the +// current PC. +// +// hasReflectStackObj indicates that this frame also has a reflect +// function stack object, which the caller must synthesize. +func (frame *stkframe) argMapInternal() (argMap bitvector, hasReflectStackObj bool) { + f := frame.fn + if f.args != _ArgsSizeUnknown { + argMap.n = f.args / goarch.PtrSize + return + } + // Extract argument bitmaps for reflect stubs from the calls they made to reflect. + switch funcname(f) { + case "reflect.makeFuncStub", "reflect.methodValueCall": + // These take a *reflect.methodValue as their + // context register and immediately save it to 0(SP). + // Get the methodValue from 0(SP). + arg0 := frame.sp + sys.MinFrameSize + + minSP := frame.fp + if !usesLR { + // The CALL itself pushes a word. + // Undo that adjustment. + minSP -= goarch.PtrSize + } + if arg0 >= minSP { + // The function hasn't started yet. + // This only happens if f was the + // start function of a new goroutine + // that hasn't run yet *and* f takes + // no arguments and has no results + // (otherwise it will get wrapped in a + // closure). In this case, we can't + // reach into its locals because it + // doesn't have locals yet, but we + // also know its argument map is + // empty. 
+ if frame.pc != f.entry() { + print("runtime: confused by ", funcname(f), ": no frame (sp=", hex(frame.sp), " fp=", hex(frame.fp), ") at entry+", hex(frame.pc-f.entry()), "\n") + throw("reflect mismatch") + } + return bitvector{}, false // No locals, so also no stack objects + } + hasReflectStackObj = true + mv := *(**reflectMethodValue)(unsafe.Pointer(arg0)) + // Figure out whether the return values are valid. + // Reflect will update this value after it copies + // in the return values. + retValid := *(*bool)(unsafe.Pointer(arg0 + 4*goarch.PtrSize)) + if mv.fn != f.entry() { + print("runtime: confused by ", funcname(f), "\n") + throw("reflect mismatch") + } + argMap = *mv.stack + if !retValid { + // argMap.n includes the results, but + // those aren't valid, so drop them. + n := int32((uintptr(mv.argLen) &^ (goarch.PtrSize - 1)) / goarch.PtrSize) + if n < argMap.n { + argMap.n = n + } + } + } + return +} + +// getStackMap returns the locals and arguments live pointer maps, and +// stack object list for frame. +func (frame *stkframe) getStackMap(cache *pcvalueCache, debug bool) (locals, args bitvector, objs []stackObjectRecord) { + targetpc := frame.continpc + if targetpc == 0 { + // Frame is dead. Return empty bitvectors. + return + } + + f := frame.fn + pcdata := int32(-1) + if targetpc != f.entry() { + // Back up to the CALL. If we're at the function entry + // point, we want to use the entry map (-1), even if + // the first instruction of the function changes the + // stack map. + targetpc-- + pcdata = pcdatavalue(f, _PCDATA_StackMapIndex, targetpc, cache) + } + if pcdata == -1 { + // We do not have a valid pcdata value but there might be a + // stackmap for this function. It is likely that we are looking + // at the function prologue, assume so and hope for the best. + pcdata = 0 + } + + // Local variables. 
+ size := frame.varp - frame.sp + var minsize uintptr + switch goarch.ArchFamily { + case goarch.ARM64: + minsize = sys.StackAlign + default: + minsize = sys.MinFrameSize + } + if size > minsize { + stackid := pcdata + stkmap := (*stackmap)(funcdata(f, _FUNCDATA_LocalsPointerMaps)) + if stkmap == nil || stkmap.n <= 0 { + print("runtime: frame ", funcname(f), " untyped locals ", hex(frame.varp-size), "+", hex(size), "\n") + throw("missing stackmap") + } + // If nbit == 0, there's no work to do. + if stkmap.nbit > 0 { + if stackid < 0 || stackid >= stkmap.n { + // don't know where we are + print("runtime: pcdata is ", stackid, " and ", stkmap.n, " locals stack map entries for ", funcname(f), " (targetpc=", hex(targetpc), ")\n") + throw("bad symbol table") + } + locals = stackmapdata(stkmap, stackid) + if stackDebug >= 3 && debug { + print(" locals ", stackid, "/", stkmap.n, " ", locals.n, " words ", locals.bytedata, "\n") + } + } else if stackDebug >= 3 && debug { + print(" no locals to adjust\n") + } + } + + // Arguments. First fetch frame size and special-case argument maps. + var isReflect bool + args, isReflect = frame.argMapInternal() + if args.n > 0 && args.bytedata == nil { + // Non-empty argument frame, but not a special map. + // Fetch the argument map at pcdata. + stackmap := (*stackmap)(funcdata(f, _FUNCDATA_ArgsPointerMaps)) + if stackmap == nil || stackmap.n <= 0 { + print("runtime: frame ", funcname(f), " untyped args ", hex(frame.argp), "+", hex(args.n*goarch.PtrSize), "\n") + throw("missing stackmap") + } + if pcdata < 0 || pcdata >= stackmap.n { + // don't know where we are + print("runtime: pcdata is ", pcdata, " and ", stackmap.n, " args stack map entries for ", funcname(f), " (targetpc=", hex(targetpc), ")\n") + throw("bad symbol table") + } + if stackmap.nbit == 0 { + args.n = 0 + } else { + args = stackmapdata(stackmap, pcdata) + } + } + + // stack objects. 
+ if (GOARCH == "amd64" || GOARCH == "arm64" || GOARCH == "ppc64" || GOARCH == "ppc64le" || GOARCH == "riscv64") && + unsafe.Sizeof(abi.RegArgs{}) > 0 && isReflect { + // For reflect.makeFuncStub and reflect.methodValueCall, + // we need to fake the stack object record. + // These frames contain an internal/abi.RegArgs at a hard-coded offset. + // This offset matches the assembly code on amd64 and arm64. + objs = methodValueCallFrameObjs[:] + } else { + p := funcdata(f, _FUNCDATA_StackObjects) + if p != nil { + n := *(*uintptr)(p) + p = add(p, goarch.PtrSize) + r0 := (*stackObjectRecord)(noescape(p)) + objs = unsafe.Slice(r0, int(n)) + // Note: the noescape above is needed to keep + // getStackMap from "leaking param content: + // frame". That leak propagates up to getgcmask, then + // GCMask, then verifyGCInfo, which converts the stack + // gcinfo tests into heap gcinfo tests :( + } + } + + return +} + +var methodValueCallFrameObjs [1]stackObjectRecord // initialized in stkobjinit + +func stkobjinit() { + var abiRegArgsEface any = abi.RegArgs{} + abiRegArgsType := efaceOf(&abiRegArgsEface)._type + if abiRegArgsType.kind&kindGCProg != 0 { + throw("abiRegArgsType needs GC Prog, update methodValueCallFrameObjs") + } + // Set methodValueCallFrameObjs[0].gcdataoff so that + // stackObjectRecord.gcdata() will work correctly with it. + ptr := uintptr(unsafe.Pointer(&methodValueCallFrameObjs[0])) + var mod *moduledata + for datap := &firstmoduledata; datap != nil; datap = datap.next { + if datap.gofunc <= ptr && ptr < datap.end { + mod = datap + break + } + } + if mod == nil { + throw("methodValueCallFrameObjs is not in a module") + } + methodValueCallFrameObjs[0] = stackObjectRecord{ + off: -int32(alignUp(abiRegArgsType.size, 8)), // It's always the highest address local.
+ size: int32(abiRegArgsType.size), + _ptrdata: int32(abiRegArgsType.ptrdata), + gcdataoff: uint32(uintptr(unsafe.Pointer(abiRegArgsType.gcdata)) - mod.rodata), + } +} diff --git a/src/runtime/string.go b/src/runtime/string.go new file mode 100644 index 0000000..a00976b --- /dev/null +++ b/src/runtime/string.go @@ -0,0 +1,584 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/bytealg" + "internal/goarch" + "unsafe" +) + +// The constant is known to the compiler. +// There is no fundamental theory behind this number. +const tmpStringBufSize = 32 + +type tmpBuf [tmpStringBufSize]byte + +// concatstrings implements a Go string concatenation x+y+z+... +// The operands are passed in the slice a. +// If buf != nil, the compiler has determined that the result does not +// escape the calling function, so the string data can be stored in buf +// if small enough. +func concatstrings(buf *tmpBuf, a []string) string { + idx := 0 + l := 0 + count := 0 + for i, x := range a { + n := len(x) + if n == 0 { + continue + } + if l+n < l { + throw("string concatenation too long") + } + l += n + count++ + idx = i + } + if count == 0 { + return "" + } + + // If there is just one string and either it is not on the stack + // or our result does not escape the calling frame (buf != nil), + // then we can return that string directly. 
+ if count == 1 && (buf != nil || !stringDataOnStack(a[idx])) { + return a[idx] + } + s, b := rawstringtmp(buf, l) + for _, x := range a { + copy(b, x) + b = b[len(x):] + } + return s +} + +func concatstring2(buf *tmpBuf, a0, a1 string) string { + return concatstrings(buf, []string{a0, a1}) +} + +func concatstring3(buf *tmpBuf, a0, a1, a2 string) string { + return concatstrings(buf, []string{a0, a1, a2}) +} + +func concatstring4(buf *tmpBuf, a0, a1, a2, a3 string) string { + return concatstrings(buf, []string{a0, a1, a2, a3}) +} + +func concatstring5(buf *tmpBuf, a0, a1, a2, a3, a4 string) string { + return concatstrings(buf, []string{a0, a1, a2, a3, a4}) +} + +// slicebytetostring converts a byte slice to a string. +// It is inserted by the compiler into generated code. +// ptr is a pointer to the first element of the slice; +// n is the length of the slice. +// Buf is a fixed-size buffer for the result, +// it is not nil if the result does not escape. +func slicebytetostring(buf *tmpBuf, ptr *byte, n int) string { + if n == 0 { + // Turns out to be a relatively common case. + // Consider that you want to parse out data between parens in "foo()bar", + // you find the indices and convert the subslice to string. + return "" + } + if raceenabled { + racereadrangepc(unsafe.Pointer(ptr), + uintptr(n), + getcallerpc(), + abi.FuncPCABIInternal(slicebytetostring)) + } + if msanenabled { + msanread(unsafe.Pointer(ptr), uintptr(n)) + } + if asanenabled { + asanread(unsafe.Pointer(ptr), uintptr(n)) + } + if n == 1 { + p := unsafe.Pointer(&staticuint64s[*ptr]) + if goarch.BigEndian { + p = add(p, 7) + } + return unsafe.String((*byte)(p), 1) + } + + var p unsafe.Pointer + if buf != nil && n <= len(buf) { + p = unsafe.Pointer(buf) + } else { + p = mallocgc(uintptr(n), nil, false) + } + memmove(p, unsafe.Pointer(ptr), uintptr(n)) + return unsafe.String((*byte)(p), n) +} + +// stringDataOnStack reports whether the string's data is +// stored on the current goroutine's stack. 
+func stringDataOnStack(s string) bool { + ptr := uintptr(unsafe.Pointer(unsafe.StringData(s))) + stk := getg().stack + return stk.lo <= ptr && ptr < stk.hi +} + +func rawstringtmp(buf *tmpBuf, l int) (s string, b []byte) { + if buf != nil && l <= len(buf) { + b = buf[:l] + s = slicebytetostringtmp(&b[0], len(b)) + } else { + s, b = rawstring(l) + } + return +} + +// slicebytetostringtmp returns a "string" referring to the actual []byte bytes. +// +// Callers need to ensure that the returned string will not be used after +// the calling goroutine modifies the original slice or synchronizes with +// another goroutine. +// +// The function is only called when instrumenting +// and otherwise intrinsified by the compiler. +// +// Some internal compiler optimizations use this function. +// - Used for m[T1{... Tn{..., string(k), ...} ...}] and m[string(k)] +// where k is []byte, T1 to Tn is a nesting of struct and array literals. +// - Used for "<"+string(b)+">" concatenation where b is []byte. +// - Used for string(b)=="foo" comparison where b is []byte. +func slicebytetostringtmp(ptr *byte, n int) string { + if raceenabled && n > 0 { + racereadrangepc(unsafe.Pointer(ptr), + uintptr(n), + getcallerpc(), + abi.FuncPCABIInternal(slicebytetostringtmp)) + } + if msanenabled && n > 0 { + msanread(unsafe.Pointer(ptr), uintptr(n)) + } + if asanenabled && n > 0 { + asanread(unsafe.Pointer(ptr), uintptr(n)) + } + return unsafe.String(ptr, n) +} + +func stringtoslicebyte(buf *tmpBuf, s string) []byte { + var b []byte + if buf != nil && len(s) <= len(buf) { + *buf = tmpBuf{} + b = buf[:len(s)] + } else { + b = rawbyteslice(len(s)) + } + copy(b, s) + return b +} + +func stringtoslicerune(buf *[tmpStringBufSize]rune, s string) []rune { + // two passes. + // unlike slicerunetostring, no race because strings are immutable. 
+ n := 0 + for range s { + n++ + } + + var a []rune + if buf != nil && n <= len(buf) { + *buf = [tmpStringBufSize]rune{} + a = buf[:n] + } else { + a = rawruneslice(n) + } + + n = 0 + for _, r := range s { + a[n] = r + n++ + } + return a +} + +func slicerunetostring(buf *tmpBuf, a []rune) string { + if raceenabled && len(a) > 0 { + racereadrangepc(unsafe.Pointer(&a[0]), + uintptr(len(a))*unsafe.Sizeof(a[0]), + getcallerpc(), + abi.FuncPCABIInternal(slicerunetostring)) + } + if msanenabled && len(a) > 0 { + msanread(unsafe.Pointer(&a[0]), uintptr(len(a))*unsafe.Sizeof(a[0])) + } + if asanenabled && len(a) > 0 { + asanread(unsafe.Pointer(&a[0]), uintptr(len(a))*unsafe.Sizeof(a[0])) + } + var dum [4]byte + size1 := 0 + for _, r := range a { + size1 += encoderune(dum[:], r) + } + s, b := rawstringtmp(buf, size1+3) + size2 := 0 + for _, r := range a { + // check for race + if size2 >= size1 { + break + } + size2 += encoderune(b[size2:], r) + } + return s[:size2] +} + +type stringStruct struct { + str unsafe.Pointer + len int +} + +// Variant with *byte pointer type for DWARF debugging. +type stringStructDWARF struct { + str *byte + len int +} + +func stringStructOf(sp *string) *stringStruct { + return (*stringStruct)(unsafe.Pointer(sp)) +} + +func intstring(buf *[4]byte, v int64) (s string) { + var b []byte + if buf != nil { + b = buf[:] + s = slicebytetostringtmp(&b[0], len(b)) + } else { + s, b = rawstring(4) + } + if int64(rune(v)) != v { + v = runeError + } + n := encoderune(b, rune(v)) + return s[:n] +} + +// rawstring allocates storage for a new string. The returned +// string and byte slice both refer to the same storage. +// The storage is not zeroed. Callers should use +// b to set the string contents and then drop b. +func rawstring(size int) (s string, b []byte) { + p := mallocgc(uintptr(size), nil, false) + return unsafe.String((*byte)(p), size), unsafe.Slice((*byte)(p), size) +} + +// rawbyteslice allocates a new byte slice. The byte slice is not zeroed. 
+func rawbyteslice(size int) (b []byte) { + cap := roundupsize(uintptr(size)) + p := mallocgc(cap, nil, false) + if cap != uintptr(size) { + memclrNoHeapPointers(add(p, uintptr(size)), cap-uintptr(size)) + } + + *(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(cap)} + return +} + +// rawruneslice allocates a new rune slice. The rune slice is not zeroed. +func rawruneslice(size int) (b []rune) { + if uintptr(size) > maxAlloc/4 { + throw("out of memory") + } + mem := roundupsize(uintptr(size) * 4) + p := mallocgc(mem, nil, false) + if mem != uintptr(size)*4 { + memclrNoHeapPointers(add(p, uintptr(size)*4), mem-uintptr(size)*4) + } + + *(*slice)(unsafe.Pointer(&b)) = slice{p, size, int(mem / 4)} + return +} + +// used by cmd/cgo +func gobytes(p *byte, n int) (b []byte) { + if n == 0 { + return make([]byte, 0) + } + + if n < 0 || uintptr(n) > maxAlloc { + panic(errorString("gobytes: length out of range")) + } + + bp := mallocgc(uintptr(n), nil, false) + memmove(bp, unsafe.Pointer(p), uintptr(n)) + + *(*slice)(unsafe.Pointer(&b)) = slice{bp, n, n} + return +} + +// This is exported via linkname to assembly in syscall (for Plan9). +// +//go:linkname gostring +func gostring(p *byte) string { + l := findnull(p) + if l == 0 { + return "" + } + s, b := rawstring(l) + memmove(unsafe.Pointer(&b[0]), unsafe.Pointer(p), uintptr(l)) + return s +} + +// internal_syscall_gostring is a version of gostring for internal/syscall/unix. +// +//go:linkname internal_syscall_gostring internal/syscall/unix.gostring +func internal_syscall_gostring(p *byte) string { + return gostring(p) +} + +func gostringn(p *byte, l int) string { + if l == 0 { + return "" + } + s, b := rawstring(l) + memmove(unsafe.Pointer(&b[0]), unsafe.Pointer(p), uintptr(l)) + return s +} + +func hasPrefix(s, prefix string) bool { + return len(s) >= len(prefix) && s[:len(prefix)] == prefix +} + +const ( + maxUint64 = ^uint64(0) + maxInt64 = int64(maxUint64 >> 1) +) + +// atoi64 parses an int64 from a string s. 
+// The bool result reports whether s is a number +// representable by a value of type int64. +func atoi64(s string) (int64, bool) { + if s == "" { + return 0, false + } + + neg := false + if s[0] == '-' { + neg = true + s = s[1:] + } + + un := uint64(0) + for i := 0; i < len(s); i++ { + c := s[i] + if c < '0' || c > '9' { + return 0, false + } + if un > maxUint64/10 { + // overflow + return 0, false + } + un *= 10 + un1 := un + uint64(c) - '0' + if un1 < un { + // overflow + return 0, false + } + un = un1 + } + + if !neg && un > uint64(maxInt64) { + return 0, false + } + if neg && un > uint64(maxInt64)+1 { + return 0, false + } + + n := int64(un) + if neg { + n = -n + } + + return n, true +} + +// atoi is like atoi64 but for integers +// that fit into an int. +func atoi(s string) (int, bool) { + if n, ok := atoi64(s); n == int64(int(n)) { + return int(n), ok + } + return 0, false +} + +// atoi32 is like atoi but for integers +// that fit into an int32. +func atoi32(s string) (int32, bool) { + if n, ok := atoi64(s); n == int64(int32(n)) { + return int32(n), ok + } + return 0, false +} + +// parseByteCount parses a string that represents a count of bytes. +// +// s must match the following regular expression: +// +// ^[0-9]+(([KMGT]i)?B)?$ +// +// In other words, an integer byte count with an optional unit +// suffix. Acceptable suffixes include one of +// - KiB, MiB, GiB, TiB which represent binary IEC/ISO 80000 units, or +// - B, which just represents bytes. +// +// Returns an int64 because that's what its callers want and receive, +// but the result is always non-negative. +func parseByteCount(s string) (int64, bool) { + // The empty string is not valid. + if s == "" { + return 0, false + } + // Handle the easy non-suffix case. + last := s[len(s)-1] + if last >= '0' && last <= '9' { + n, ok := atoi64(s) + if !ok || n < 0 { + return 0, false + } + return n, ok + } + // Failing a trailing digit, this must always end in 'B'. 
+ // Also at this point there must be at least one digit before + // that B. + if last != 'B' || len(s) < 2 { + return 0, false + } + // The one before that must always be a digit or 'i'. + if c := s[len(s)-2]; c >= '0' && c <= '9' { + // Trivial 'B' suffix. + n, ok := atoi64(s[:len(s)-1]) + if !ok || n < 0 { + return 0, false + } + return n, ok + } else if c != 'i' { + return 0, false + } + // Finally, we need at least 4 characters now, for the unit + // prefix and at least one digit. + if len(s) < 4 { + return 0, false + } + power := 0 + switch s[len(s)-3] { + case 'K': + power = 1 + case 'M': + power = 2 + case 'G': + power = 3 + case 'T': + power = 4 + default: + // Invalid suffix. + return 0, false + } + m := uint64(1) + for i := 0; i < power; i++ { + m *= 1024 + } + n, ok := atoi64(s[:len(s)-3]) + if !ok || n < 0 { + return 0, false + } + un := uint64(n) + if un > maxUint64/m { + // Overflow. + return 0, false + } + un *= m + if un > uint64(maxInt64) { + // Overflow. + return 0, false + } + return int64(un), true +} + +//go:nosplit +func findnull(s *byte) int { + if s == nil { + return 0 + } + + // Avoid IndexByteString on Plan 9 because it uses SSE instructions + // on x86 machines, and those are classified as floating point instructions, + // which are illegal in a note handler. + if GOOS == "plan9" { + p := (*[maxAlloc/2 - 1]byte)(unsafe.Pointer(s)) + l := 0 + for p[l] != 0 { + l++ + } + return l + } + + // pageSize is the unit we scan at a time looking for NULL. + // It must be the minimum page size for any architecture Go + // runs on. It's okay (just a minor performance loss) if the + // actual system page size is larger than this value. + const pageSize = 4096 + + offset := 0 + ptr := unsafe.Pointer(s) + // IndexByteString uses wide reads, so we need to be careful + // with page boundaries. Call IndexByteString on + // [ptr, endOfPage) interval. 
+ safeLen := int(pageSize - uintptr(ptr)%pageSize) + + for { + t := *(*string)(unsafe.Pointer(&stringStruct{ptr, safeLen})) + // Check one page at a time. + if i := bytealg.IndexByteString(t, 0); i != -1 { + return offset + i + } + // Move to next page + ptr = unsafe.Pointer(uintptr(ptr) + uintptr(safeLen)) + offset += safeLen + safeLen = pageSize + } +} + +func findnullw(s *uint16) int { + if s == nil { + return 0 + } + p := (*[maxAlloc/2/2 - 1]uint16)(unsafe.Pointer(s)) + l := 0 + for p[l] != 0 { + l++ + } + return l +} + +//go:nosplit +func gostringnocopy(str *byte) string { + ss := stringStruct{str: unsafe.Pointer(str), len: findnull(str)} + s := *(*string)(unsafe.Pointer(&ss)) + return s +} + +func gostringw(strw *uint16) string { + var buf [8]byte + str := (*[maxAlloc/2/2 - 1]uint16)(unsafe.Pointer(strw)) + n1 := 0 + for i := 0; str[i] != 0; i++ { + n1 += encoderune(buf[:], rune(str[i])) + } + s, b := rawstring(n1 + 4) + n2 := 0 + for i := 0; str[i] != 0; i++ { + // check for race + if n2 >= n1 { + break + } + n2 += encoderune(b[n2:], rune(str[i])) + } + b[n2] = 0 // for luck + return s[:n2] +} diff --git a/src/runtime/string_test.go b/src/runtime/string_test.go new file mode 100644 index 0000000..cfc0ad7 --- /dev/null +++ b/src/runtime/string_test.go @@ -0,0 +1,606 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime" + "strconv" + "strings" + "testing" + "unicode/utf8" +) + +// Strings and slices that don't escape and fit into tmpBuf are stack allocated, +// which defeats using AllocsPerRun to test other optimizations. 
+const sizeNoStack = 100 + +func BenchmarkCompareStringEqual(b *testing.B) { + bytes := []byte("Hello Gophers!") + s1, s2 := string(bytes), string(bytes) + for i := 0; i < b.N; i++ { + if s1 != s2 { + b.Fatal("s1 != s2") + } + } +} + +func BenchmarkCompareStringIdentical(b *testing.B) { + s1 := "Hello Gophers!" + s2 := s1 + for i := 0; i < b.N; i++ { + if s1 != s2 { + b.Fatal("s1 != s2") + } + } +} + +func BenchmarkCompareStringSameLength(b *testing.B) { + s1 := "Hello Gophers!" + s2 := "Hello, Gophers" + for i := 0; i < b.N; i++ { + if s1 == s2 { + b.Fatal("s1 == s2") + } + } +} + +func BenchmarkCompareStringDifferentLength(b *testing.B) { + s1 := "Hello Gophers!" + s2 := "Hello, Gophers!" + for i := 0; i < b.N; i++ { + if s1 == s2 { + b.Fatal("s1 == s2") + } + } +} + +func BenchmarkCompareStringBigUnaligned(b *testing.B) { + bytes := make([]byte, 0, 1<<20) + for len(bytes) < 1<<20 { + bytes = append(bytes, "Hello Gophers!"...) + } + s1, s2 := string(bytes), "hello"+string(bytes) + for i := 0; i < b.N; i++ { + if s1 != s2[len("hello"):] { + b.Fatal("s1 != s2") + } + } + b.SetBytes(int64(len(s1))) +} + +func BenchmarkCompareStringBig(b *testing.B) { + bytes := make([]byte, 0, 1<<20) + for len(bytes) < 1<<20 { + bytes = append(bytes, "Hello Gophers!"...) + } + s1, s2 := string(bytes), string(bytes) + for i := 0; i < b.N; i++ { + if s1 != s2 { + b.Fatal("s1 != s2") + } + } + b.SetBytes(int64(len(s1))) +} + +func BenchmarkConcatStringAndBytes(b *testing.B) { + s1 := []byte("Gophers!") + for i := 0; i < b.N; i++ { + _ = "Hello " + string(s1) + } +} + +var escapeString string + +func BenchmarkSliceByteToString(b *testing.B) { + buf := []byte{'!'} + for n := 0; n < 8; n++ { + b.Run(strconv.Itoa(len(buf)), func(b *testing.B) { + for i := 0; i < b.N; i++ { + escapeString = string(buf) + } + }) + buf = append(buf, buf...) 
+ } +} + +var stringdata = []struct{ name, data string }{ + {"ASCII", "01234567890"}, + {"Japanese", "日本語日本語日本語"}, + {"MixedLength", "$Ѐࠀက퀀𐀀\U00040000\U0010FFFF"}, +} + +var sinkInt int + +func BenchmarkRuneCount(b *testing.B) { + // Each sub-benchmark counts the runes in a string in a different way. + b.Run("lenruneslice", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + sinkInt += len([]rune(sd.data)) + } + }) + } + }) + b.Run("rangeloop", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + n := 0 + for range sd.data { + n++ + } + sinkInt += n + } + }) + } + }) + b.Run("utf8.RuneCountInString", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + sinkInt += utf8.RuneCountInString(sd.data) + } + }) + } + }) +} + +func BenchmarkRuneIterate(b *testing.B) { + b.Run("range", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + for range sd.data { + } + } + }) + } + }) + b.Run("range1", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + for range sd.data { + } + } + }) + } + }) + b.Run("range2", func(b *testing.B) { + for _, sd := range stringdata { + b.Run(sd.name, func(b *testing.B) { + for i := 0; i < b.N; i++ { + for range sd.data { + } + } + }) + } + }) +} + +func BenchmarkArrayEqual(b *testing.B) { + a1 := [16]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16} + a2 := [16]byte{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16} + b.ResetTimer() + for i := 0; i < b.N; i++ { + if a1 != a2 { + b.Fatal("not equal") + } + } +} + +func TestStringW(t *testing.T) { + strings := []string{ + "hello", + "a\u5566\u7788b", + } + + for _, s := range strings { + var b []uint16 + for _, c := range s { + b = 
append(b, uint16(c)) + if c != rune(uint16(c)) { + t.Errorf("bad test: stringW can't handle >16 bit runes") + } + } + b = append(b, 0) + r := runtime.GostringW(b) + if r != s { + t.Errorf("gostringW(%v) = %s, want %s", b, r, s) + } + } +} + +func TestLargeStringConcat(t *testing.T) { + output := runTestProg(t, "testprog", "stringconcat") + want := "panic: " + strings.Repeat("0", 1<<10) + strings.Repeat("1", 1<<10) + + strings.Repeat("2", 1<<10) + strings.Repeat("3", 1<<10) + if !strings.HasPrefix(output, want) { + t.Fatalf("output does not start with %q:\n%s", want, output) + } +} + +func TestConcatTempString(t *testing.T) { + s := "bytes" + b := []byte(s) + n := testing.AllocsPerRun(1000, func() { + if "prefix "+string(b)+" suffix" != "prefix bytes suffix" { + t.Fatalf("strings are not equal: '%v' and '%v'", "prefix "+string(b)+" suffix", "prefix bytes suffix") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestCompareTempString(t *testing.T) { + s := strings.Repeat("x", sizeNoStack) + b := []byte(s) + n := testing.AllocsPerRun(1000, func() { + if string(b) != s { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + if string(b) < s { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + if string(b) > s { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + if string(b) == s { + } else { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + if string(b) <= s { + } else { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + if string(b) >= s { + } else { + t.Fatalf("strings are not equal: '%v' and '%v'", string(b), s) + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestStringIndexHaystack(t *testing.T) { + // See issue 25864. 
+ haystack := []byte("hello") + needle := "ll" + n := testing.AllocsPerRun(1000, func() { + if strings.Index(string(haystack), needle) != 2 { + t.Fatalf("needle not found") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestStringIndexNeedle(t *testing.T) { + // See issue 25864. + haystack := "hello" + needle := []byte("ll") + n := testing.AllocsPerRun(1000, func() { + if strings.Index(haystack, string(needle)) != 2 { + t.Fatalf("needle not found") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestStringOnStack(t *testing.T) { + s := "" + for i := 0; i < 3; i++ { + s = "a" + s + "b" + s + "c" + } + + if want := "aaabcbabccbaabcbabccc"; s != want { + t.Fatalf("want: '%v', got '%v'", want, s) + } +} + +func TestIntString(t *testing.T) { + // Non-escaping result of intstring. + s := "" + for i := rune(0); i < 4; i++ { + s += string(i+'0') + string(i+'0'+1) + } + if want := "01122334"; s != want { + t.Fatalf("want '%v', got '%v'", want, s) + } + + // Escaping result of intstring. 
+ var a [4]string + for i := rune(0); i < 4; i++ { + a[i] = string(i + '0') + } + s = a[0] + a[1] + a[2] + a[3] + if want := "0123"; s != want { + t.Fatalf("want '%v', got '%v'", want, s) + } +} + +func TestIntStringAllocs(t *testing.T) { + unknown := '0' + n := testing.AllocsPerRun(1000, func() { + s1 := string(unknown) + s2 := string(unknown + 1) + if s1 == s2 { + t.Fatalf("bad") + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func TestRangeStringCast(t *testing.T) { + s := strings.Repeat("x", sizeNoStack) + n := testing.AllocsPerRun(1000, func() { + for i, c := range []byte(s) { + if c != s[i] { + t.Fatalf("want '%c' at pos %v, got '%c'", s[i], i, c) + } + } + }) + if n != 0 { + t.Fatalf("want 0 allocs, got %v", n) + } +} + +func isZeroed(b []byte) bool { + for _, x := range b { + if x != 0 { + return false + } + } + return true +} + +func isZeroedR(r []rune) bool { + for _, x := range r { + if x != 0 { + return false + } + } + return true +} + +func TestString2Slice(t *testing.T) { + // Make sure we don't return slices that expose + // an unzeroed section of stack-allocated temp buf + // between len and cap. See issue 14232. 
+ s := "foož" + b := ([]byte)(s) + if !isZeroed(b[len(b):cap(b)]) { + t.Errorf("extra bytes not zeroed") + } + r := ([]rune)(s) + if !isZeroedR(r[len(r):cap(r)]) { + t.Errorf("extra runes not zeroed") + } +} + +const intSize = 32 << (^uint(0) >> 63) + +type atoi64Test struct { + in string + out int64 + ok bool +} + +var atoi64tests = []atoi64Test{ + {"", 0, false}, + {"0", 0, true}, + {"-0", 0, true}, + {"1", 1, true}, + {"-1", -1, true}, + {"12345", 12345, true}, + {"-12345", -12345, true}, + {"012345", 12345, true}, + {"-012345", -12345, true}, + {"12345x", 0, false}, + {"-12345x", 0, false}, + {"98765432100", 98765432100, true}, + {"-98765432100", -98765432100, true}, + {"20496382327982653440", 0, false}, + {"-20496382327982653440", 0, false}, + {"9223372036854775807", 1<<63 - 1, true}, + {"-9223372036854775807", -(1<<63 - 1), true}, + {"9223372036854775808", 0, false}, + {"-9223372036854775808", -1 << 63, true}, + {"9223372036854775809", 0, false}, + {"-9223372036854775809", 0, false}, +} + +func TestAtoi(t *testing.T) { + switch intSize { + case 32: + for i := range atoi32tests { + test := &atoi32tests[i] + out, ok := runtime.Atoi(test.in) + if test.out != int32(out) || test.ok != ok { + t.Errorf("atoi(%q) = (%v, %v) want (%v, %v)", + test.in, out, ok, test.out, test.ok) + } + } + case 64: + for i := range atoi64tests { + test := &atoi64tests[i] + out, ok := runtime.Atoi(test.in) + if test.out != int64(out) || test.ok != ok { + t.Errorf("atoi(%q) = (%v, %v) want (%v, %v)", + test.in, out, ok, test.out, test.ok) + } + } + } +} + +type atoi32Test struct { + in string + out int32 + ok bool +} + +var atoi32tests = []atoi32Test{ + {"", 0, false}, + {"0", 0, true}, + {"-0", 0, true}, + {"1", 1, true}, + {"-1", -1, true}, + {"12345", 12345, true}, + {"-12345", -12345, true}, + {"012345", 12345, true}, + {"-012345", -12345, true}, + {"12345x", 0, false}, + {"-12345x", 0, false}, + {"987654321", 987654321, true}, + {"-987654321", -987654321, true}, + {"2147483647", 
1<<31 - 1, true}, + {"-2147483647", -(1<<31 - 1), true}, + {"2147483648", 0, false}, + {"-2147483648", -1 << 31, true}, + {"2147483649", 0, false}, + {"-2147483649", 0, false}, +} + +func TestAtoi32(t *testing.T) { + for i := range atoi32tests { + test := &atoi32tests[i] + out, ok := runtime.Atoi32(test.in) + if test.out != out || test.ok != ok { + t.Errorf("atoi32(%q) = (%v, %v) want (%v, %v)", + test.in, out, ok, test.out, test.ok) + } + } +} + +func TestParseByteCount(t *testing.T) { + for _, test := range []struct { + in string + out int64 + ok bool + }{ + // Good numeric inputs. + {"1", 1, true}, + {"12345", 12345, true}, + {"012345", 12345, true}, + {"98765432100", 98765432100, true}, + {"9223372036854775807", 1<<63 - 1, true}, + + // Good trivial suffix inputs. + {"1B", 1, true}, + {"12345B", 12345, true}, + {"012345B", 12345, true}, + {"98765432100B", 98765432100, true}, + {"9223372036854775807B", 1<<63 - 1, true}, + + // Good binary suffix inputs. + {"1KiB", 1 << 10, true}, + {"05KiB", 5 << 10, true}, + {"1MiB", 1 << 20, true}, + {"10MiB", 10 << 20, true}, + {"1GiB", 1 << 30, true}, + {"100GiB", 100 << 30, true}, + {"1TiB", 1 << 40, true}, + {"99TiB", 99 << 40, true}, + + // Good zero inputs. + // + // -0 is an edge case, but no harm in supporting it. + {"-0", 0, true}, + {"0", 0, true}, + {"0B", 0, true}, + {"0KiB", 0, true}, + {"0MiB", 0, true}, + {"0GiB", 0, true}, + {"0TiB", 0, true}, + + // Bad inputs. + {"", 0, false}, + {"-1", 0, false}, + {"a12345", 0, false}, + {"a12345B", 0, false}, + {"12345x", 0, false}, + {"0x12345", 0, false}, + + // Bad numeric inputs. + {"9223372036854775808", 0, false}, + {"9223372036854775809", 0, false}, + {"18446744073709551615", 0, false}, + {"20496382327982653440", 0, false}, + {"18446744073709551616", 0, false}, + {"18446744073709551617", 0, false}, + {"9999999999999999999999", 0, false}, + + // Bad trivial suffix inputs. 
+ {"9223372036854775808B", 0, false}, + {"9223372036854775809B", 0, false}, + {"18446744073709551615B", 0, false}, + {"20496382327982653440B", 0, false}, + {"18446744073709551616B", 0, false}, + {"18446744073709551617B", 0, false}, + {"9999999999999999999999B", 0, false}, + + // Bad binary suffix inputs. + {"1Ki", 0, false}, + {"05Ki", 0, false}, + {"10Mi", 0, false}, + {"100Gi", 0, false}, + {"99Ti", 0, false}, + {"22iB", 0, false}, + {"B", 0, false}, + {"iB", 0, false}, + {"KiB", 0, false}, + {"MiB", 0, false}, + {"GiB", 0, false}, + {"TiB", 0, false}, + {"-120KiB", 0, false}, + {"-891MiB", 0, false}, + {"-704GiB", 0, false}, + {"-42TiB", 0, false}, + {"99999999999999999999KiB", 0, false}, + {"99999999999999999MiB", 0, false}, + {"99999999999999GiB", 0, false}, + {"99999999999TiB", 0, false}, + {"555EiB", 0, false}, + + // Mistaken SI suffix inputs. + {"0KB", 0, false}, + {"0MB", 0, false}, + {"0GB", 0, false}, + {"0TB", 0, false}, + {"1KB", 0, false}, + {"05KB", 0, false}, + {"1MB", 0, false}, + {"10MB", 0, false}, + {"1GB", 0, false}, + {"100GB", 0, false}, + {"1TB", 0, false}, + {"99TB", 0, false}, + {"1K", 0, false}, + {"05K", 0, false}, + {"10M", 0, false}, + {"100G", 0, false}, + {"99T", 0, false}, + {"99999999999999999999KB", 0, false}, + {"99999999999999999MB", 0, false}, + {"99999999999999GB", 0, false}, + {"99999999999TB", 0, false}, + {"99999999999TiB", 0, false}, + {"555EB", 0, false}, + } { + out, ok := runtime.ParseByteCount(test.in) + if test.out != out || test.ok != ok { + t.Errorf("parseByteCount(%q) = (%v, %v) want (%v, %v)", + test.in, out, ok, test.out, test.ok) + } + } +} diff --git a/src/runtime/stubs.go b/src/runtime/stubs.go new file mode 100644 index 0000000..42c2612 --- /dev/null +++ b/src/runtime/stubs.go @@ -0,0 +1,480 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import ( + "internal/abi" + "internal/goarch" + "runtime/internal/math" + "unsafe" +) + +// Should be a built-in for unsafe.Pointer? +// +//go:nosplit +func add(p unsafe.Pointer, x uintptr) unsafe.Pointer { + return unsafe.Pointer(uintptr(p) + x) +} + +// getg returns the pointer to the current g. +// The compiler rewrites calls to this function into instructions +// that fetch the g directly (from TLS or from the dedicated register). +func getg() *g + +// mcall switches from the g to the g0 stack and invokes fn(g), +// where g is the goroutine that made the call. +// mcall saves g's current PC/SP in g->sched so that it can be restored later. +// It is up to fn to arrange for that later execution, typically by recording +// g in a data structure, causing something to call ready(g) later. +// mcall returns to the original goroutine g later, when g has been rescheduled. +// fn must not return at all; typically it ends by calling schedule, to let the m +// run other goroutines. +// +// mcall can only be called from g stacks (not g0, not gsignal). +// +// This must NOT be go:noescape: if fn is a stack-allocated closure, +// fn puts g on a run queue, and g executes before fn returns, the +// closure will be invalidated while it is still executing. +func mcall(fn func(*g)) + +// systemstack runs fn on a system stack. +// If systemstack is called from the per-OS-thread (g0) stack, or +// if systemstack is called from the signal handling (gsignal) stack, +// systemstack calls fn directly and returns. +// Otherwise, systemstack is being called from the limited stack +// of an ordinary goroutine. In this case, systemstack switches +// to the per-OS-thread stack, calls fn, and switches back. +// It is common to use a func literal as the argument, in order +// to share inputs and outputs with the code around the call +// to system stack: +// +// ... set up y ... +// systemstack(func() { +// x = bigcall(y) +// }) +// ... use x ... 
+// +//go:noescape +func systemstack(fn func()) + +//go:nosplit +//go:nowritebarrierrec +func badsystemstack() { + writeErrStr("fatal: systemstack called from unexpected goroutine") +} + +// memclrNoHeapPointers clears n bytes starting at ptr. +// +// Usually you should use typedmemclr. memclrNoHeapPointers should be +// used only when the caller knows that *ptr contains no heap pointers +// because either: +// +// *ptr is initialized memory and its type is pointer-free, or +// +// *ptr is uninitialized memory (e.g., memory that's being reused +// for a new allocation) and hence contains only "junk". +// +// memclrNoHeapPointers ensures that if ptr is pointer-aligned, and n +// is a multiple of the pointer size, then any pointer-aligned, +// pointer-sized portion is cleared atomically. Despite the function +// name, this is necessary because this function is the underlying +// implementation of typedmemclr and memclrHasPointers. See the doc of +// memmove for more details. +// +// The (CPU-specific) implementations of this function are in memclr_*.s. +// +//go:noescape +func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) + +//go:linkname reflect_memclrNoHeapPointers reflect.memclrNoHeapPointers +func reflect_memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr) { + memclrNoHeapPointers(ptr, n) +} + +// memmove copies n bytes from "from" to "to". +// +// memmove ensures that any pointer in "from" is written to "to" with +// an indivisible write, so that racy reads cannot observe a +// half-written pointer. This is necessary to prevent the garbage +// collector from observing invalid pointers, and differs from memmove +// in unmanaged languages. However, memmove is only required to do +// this if "from" and "to" may contain pointers, which can only be the +// case if "from", "to", and "n" are all word-aligned. +// +// Implementations are in memmove_*.s. +// +//go:noescape +func memmove(to, from unsafe.Pointer, n uintptr) + +// Outside assembly calls memmove.
Make sure it has ABI wrappers. +// +//go:linkname memmove + +//go:linkname reflect_memmove reflect.memmove +func reflect_memmove(to, from unsafe.Pointer, n uintptr) { + memmove(to, from, n) +} + +// exported value for testing +const hashLoad = float32(loadFactorNum) / float32(loadFactorDen) + +//go:nosplit +func fastrand() uint32 { + mp := getg().m + // Implement wyrand: https://github.com/wangyi-fudan/wyhash + // Only the platform that math.Mul64 can be lowered + // by the compiler should be in this list. + if goarch.IsAmd64|goarch.IsArm64|goarch.IsPpc64| + goarch.IsPpc64le|goarch.IsMips64|goarch.IsMips64le| + goarch.IsS390x|goarch.IsRiscv64|goarch.IsLoong64 == 1 { + mp.fastrand += 0xa0761d6478bd642f + hi, lo := math.Mul64(mp.fastrand, mp.fastrand^0xe7037ed1a0b428db) + return uint32(hi ^ lo) + } + + // Implement xorshift64+: 2 32-bit xorshift sequences added together. + // Shift triplet [17,7,16] was calculated as indicated in Marsaglia's + // Xorshift paper: https://www.jstatsoft.org/article/view/v008i14/xorshift.pdf + // This generator passes the SmallCrush suite, part of TestU01 framework: + // http://simul.iro.umontreal.ca/testu01/tu01.html + t := (*[2]uint32)(unsafe.Pointer(&mp.fastrand)) + s1, s0 := t[0], t[1] + s1 ^= s1 << 17 + s1 = s1 ^ s0 ^ s1>>7 ^ s0>>16 + t[0], t[1] = s0, s1 + return s0 + s1 +} + +//go:nosplit +func fastrandn(n uint32) uint32 { + // This is similar to fastrand() % n, but faster. + // See https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/ + return uint32(uint64(fastrand()) * uint64(n) >> 32) +} + +func fastrand64() uint64 { + mp := getg().m + // Implement wyrand: https://github.com/wangyi-fudan/wyhash + // Only the platform that math.Mul64 can be lowered + // by the compiler should be in this list. 
+ if goarch.IsAmd64|goarch.IsArm64|goarch.IsPpc64| + goarch.IsPpc64le|goarch.IsMips64|goarch.IsMips64le| + goarch.IsS390x|goarch.IsRiscv64 == 1 { + mp.fastrand += 0xa0761d6478bd642f + hi, lo := math.Mul64(mp.fastrand, mp.fastrand^0xe7037ed1a0b428db) + return hi ^ lo + } + + // Implement xorshift64+: 2 32-bit xorshift sequences added together. + // Xorshift paper: https://www.jstatsoft.org/article/view/v008i14/xorshift.pdf + // This generator passes the SmallCrush suite, part of TestU01 framework: + // http://simul.iro.umontreal.ca/testu01/tu01.html + t := (*[2]uint32)(unsafe.Pointer(&mp.fastrand)) + s1, s0 := t[0], t[1] + s1 ^= s1 << 17 + s1 = s1 ^ s0 ^ s1>>7 ^ s0>>16 + r := uint64(s0 + s1) + + s0, s1 = s1, s0 + s1 ^= s1 << 17 + s1 = s1 ^ s0 ^ s1>>7 ^ s0>>16 + r += uint64(s0+s1) << 32 + + t[0], t[1] = s0, s1 + return r +} + +func fastrandu() uint { + if goarch.PtrSize == 4 { + return uint(fastrand()) + } + return uint(fastrand64()) +} + +//go:linkname rand_fastrand64 math/rand.fastrand64 +func rand_fastrand64() uint64 { return fastrand64() } + +//go:linkname sync_fastrandn sync.fastrandn +func sync_fastrandn(n uint32) uint32 { return fastrandn(n) } + +//go:linkname net_fastrandu net.fastrandu +func net_fastrandu() uint { return fastrandu() } + +//go:linkname os_fastrand os.fastrand +func os_fastrand() uint32 { return fastrand() } + +// in internal/bytealg/equal_*.s +// +//go:noescape +func memequal(a, b unsafe.Pointer, size uintptr) bool + +// noescape hides a pointer from escape analysis. noescape is +// the identity function but escape analysis doesn't think the +// output depends on the input. noescape is inlined and currently +// compiles down to zero instructions. +// USE CAREFULLY! +// +//go:nosplit +func noescape(p unsafe.Pointer) unsafe.Pointer { + x := uintptr(p) + return unsafe.Pointer(x ^ 0) +} + +// Not all cgocallback frames are actually cgocallback, +// so not all have these arguments. 
Mark them uintptr so that the GC +// does not misinterpret memory when the arguments are not present. +// cgocallback is not called from Go, only from crosscall2. +// This in turn calls cgocallbackg, which is where we'll find +// pointer-declared arguments. +func cgocallback(fn, frame, ctxt uintptr) + +func gogo(buf *gobuf) + +func asminit() +func setg(gg *g) +func breakpoint() + +// reflectcall calls fn with arguments described by stackArgs, stackArgsSize, +// frameSize, and regArgs. +// +// Arguments passed on the stack and space for return values passed on the stack +// must be laid out at the space pointed to by stackArgs (with total length +// stackArgsSize) according to the ABI. +// +// stackRetOffset must be some value <= stackArgsSize that indicates the +// offset within stackArgs where the return value space begins. +// +// frameSize is the total size of the argument frame at stackArgs and must +// therefore be >= stackArgsSize. It must include additional space for spilling +// register arguments for stack growth and preemption. +// +// TODO(mknyszek): Once we don't need the additional spill space, remove frameSize, +// since frameSize will be redundant with stackArgsSize. +// +// Arguments passed in registers must be laid out in regArgs according to the ABI. +// regArgs will hold any return values passed in registers after the call. +// +// reflectcall copies stack arguments from stackArgs to the goroutine stack, and +// then copies stackArgsSize-stackRetOffset bytes back to the return space +// in stackArgs once fn has completed. It also "unspills" argument registers from +// regArgs before calling fn, and spills them back into regArgs immediately +// following the call to fn. If there are results being returned on the stack, +// the caller should pass the argument frame type as stackArgsType so that +// reflectcall can execute appropriate write barriers during the copy.
+// +// reflectcall expects regArgs.ReturnIsPtr to be populated indicating which +// registers on the return path will contain Go pointers. It will then store +// these pointers in regArgs.Ptrs such that they are visible to the GC. +// +// Package reflect passes a frame type. In package runtime, there is only +// one call that copies results back, in callbackWrap in syscall_windows.go, and it +// does NOT pass a frame type, meaning there are no write barriers invoked. See that +// call site for justification. +// +// Package reflect accesses this symbol through a linkname. +// +// Arguments passed through to reflectcall do not escape. The type is used +// only in a very limited callee of reflectcall, the stackArgs are copied, and +// regArgs is only used in the reflectcall frame. +// +//go:noescape +func reflectcall(stackArgsType *_type, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) + +func procyield(cycles uint32) + +type neverCallThisFunction struct{} + +// goexit is the return stub at the top of every goroutine call stack. +// Each goroutine stack is constructed as if goexit called the +// goroutine's entry point function, so that when the entry point +// function returns, it will return to goexit, which will call goexit1 +// to perform the actual exit. +// +// This function must never be called directly. Call goexit1 instead. +// gentraceback assumes that goexit terminates the stack. A direct +// call on the stack will cause gentraceback to stop walking the stack +// prematurely and if there is leftover state it may panic. +func goexit(neverCallThisFunction) + +// publicationBarrier performs a store/store barrier (a "publication" +// or "export" barrier). Some form of synchronization is required +// between initializing an object and making that object accessible to +// another processor. 
Without synchronization, the initialization +// writes and the "publication" write may be reordered, allowing the +// other processor to follow the pointer and observe an uninitialized +// object. In general, higher-level synchronization should be used, +// such as locking or an atomic pointer write. publicationBarrier is +// for when those aren't an option, such as in the implementation of +// the memory manager. +// +// There's no corresponding barrier for the read side because the read +// side naturally has a data dependency order. All architectures that +// Go supports or seems likely to ever support automatically enforce +// data dependency ordering. +func publicationBarrier() + +// getcallerpc returns the program counter (PC) of its caller's caller. +// getcallersp returns the stack pointer (SP) of its caller's caller. +// The implementation may be a compiler intrinsic; there is not +// necessarily code implementing this on every platform. +// +// For example: +// +// func f(arg1, arg2, arg3 int) { +// pc := getcallerpc() +// sp := getcallersp() +// } +// +// These two lines find the PC and SP immediately following +// the call to f (where f will return). +// +// The call to getcallerpc and getcallersp must be done in the +// frame being asked about. +// +// The result of getcallersp is correct at the time of the return, +// but it may be invalidated by any subsequent call to a function +// that might relocate the stack in order to grow or shrink it. +// A general rule is that the result of getcallersp should be used +// immediately and can only be passed to nosplit functions. + +//go:noescape +func getcallerpc() uintptr + +//go:noescape +func getcallersp() uintptr // implemented as an intrinsic on all platforms + +// getclosureptr returns the pointer to the current closure. +// getclosureptr can only be used in an assignment statement +// at the entry of a function. 
Moreover, the go:nosplit directive +// must be specified at the declaration of the caller function, +// so that the function prolog does not clobber the closure register. +// For example: +// +// //go:nosplit +// func f(arg1, arg2, arg3 int) { +// dx := getclosureptr() +// } +// +// The compiler rewrites calls to this function into instructions that fetch the +// pointer from a well-known register (DX on x86 architecture, etc.) directly. +func getclosureptr() uintptr + +//go:noescape +func asmcgocall(fn, arg unsafe.Pointer) int32 + +func morestack() +func morestack_noctxt() +func rt0_go() + +// return0 is a stub used to return 0 from deferproc. +// It is called at the very end of deferproc to signal +// the calling Go function that it should not jump +// to deferreturn. +// in asm_*.s +func return0() + +// in asm_*.s +// not called directly; definitions here supply type information for traceback. +// These must have the same signature (arg pointer map) as reflectcall. +func call16(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call32(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call64(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call128(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call256(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call512(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call1024(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call2048(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call4096(typ, fn, stackArgs unsafe.Pointer, stackArgsSize,
stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call8192(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call16384(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call32768(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call65536(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call131072(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call262144(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call524288(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call1048576(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call2097152(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call4194304(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call8388608(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call16777216(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call33554432(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call67108864(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call134217728(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call268435456(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, 
frameSize uint32, regArgs *abi.RegArgs) +func call536870912(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) +func call1073741824(typ, fn, stackArgs unsafe.Pointer, stackArgsSize, stackRetOffset, frameSize uint32, regArgs *abi.RegArgs) + +func systemstack_switch() + +// alignUp rounds n up to a multiple of a. a must be a power of 2. +func alignUp(n, a uintptr) uintptr { + return (n + a - 1) &^ (a - 1) +} + +// alignDown rounds n down to a multiple of a. a must be a power of 2. +func alignDown(n, a uintptr) uintptr { + return n &^ (a - 1) +} + +// divRoundUp returns ceil(n / a). +func divRoundUp(n, a uintptr) uintptr { + // a is generally a power of two. This will get inlined and + // the compiler will optimize the division. + return (n + a - 1) / a +} + +// checkASM reports whether assembly runtime checks have passed. +func checkASM() bool + +func memequal_varlen(a, b unsafe.Pointer) bool + +// bool2int returns 0 if x is false or 1 if x is true. +func bool2int(x bool) int { + // Avoid branches. In the SSA compiler, this compiles to + // exactly what you would want it to. + return int(uint8(*(*uint8)(unsafe.Pointer(&x)))) +} + +// abort crashes the runtime in situations where even throw might not +// work. In general it should do something a debugger will recognize +// (e.g., an INT3 on x86). A crash in abort is recognized by the +// signal handler, which will attempt to tear down the runtime +// immediately. +func abort() + +// Called from compiled code; declared for vet; do NOT call from Go. +func gcWriteBarrier() +func duffzero() +func duffcopy() + +// Called from linker-generated .initarray; declared for go vet; do NOT call from Go. +func addmoduledata() + +// Injected by the signal handler for panicking signals. +// Initializes any registers that have fixed meaning at calls but +// are scratch in bodies and calls sigpanic. +// On many platforms it just jumps to sigpanic. 
+func sigpanic0() + +// intArgRegs is used by the various register assignment +// algorithm implementations in the runtime. These include: +// - Finalizers (mfinal.go) +// - Windows callbacks (syscall_windows.go) +// +// Both are stripped-down versions of the algorithm since they +// only have to deal with a subset of cases (finalizers only +// take a pointer or interface argument, Go Windows callbacks +// don't support floating point). +// +// It should be modified with care and is generally only +// modified when testing this package. +// +// It should never be set higher than its internal/abi +// constant counterparts, because the system relies on a +// structure that is at least large enough to hold the +// registers the system supports. +// +// Protected by finlock. +var intArgRegs = abi.IntArgRegs diff --git a/src/runtime/stubs2.go b/src/runtime/stubs2.go new file mode 100644 index 0000000..0d83deb --- /dev/null +++ b/src/runtime/stubs2.go @@ -0,0 +1,44 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !aix && !darwin && !js && !openbsd && !plan9 && !solaris && !windows + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +// read calls the read system call. +// It returns a non-negative number of bytes read or a negative errno value. +func read(fd int32, p unsafe.Pointer, n int32) int32 + +func closefd(fd int32) int32 + +func exit(code int32) +func usleep(usec uint32) + +//go:nosplit +func usleep_no_g(usec uint32) { + usleep(usec) +} + +// write1 calls the write system call. +// It returns a non-negative number of bytes written or a negative errno value. +// +//go:noescape +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 + +//go:noescape +func open(name *byte, mode, perm int32) int32 + +// return value is only set on linux to be used in osinit().
+func madvise(addr unsafe.Pointer, n uintptr, flags int32) int32 + +// exitThread terminates the current thread, writing *wait = freeMStack when +// the stack is safe to reclaim. +// +//go:noescape +func exitThread(wait *atomic.Uint32) diff --git a/src/runtime/stubs3.go b/src/runtime/stubs3.go new file mode 100644 index 0000000..891663b --- /dev/null +++ b/src/runtime/stubs3.go @@ -0,0 +1,9 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !aix && !darwin && !freebsd && !openbsd && !plan9 && !solaris + +package runtime + +func nanotime1() int64 diff --git a/src/runtime/stubs_386.go b/src/runtime/stubs_386.go new file mode 100644 index 0000000..300f167 --- /dev/null +++ b/src/runtime/stubs_386.go @@ -0,0 +1,20 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +func float64touint32(a float64) uint32 +func uint32tofloat64(a uint32) float64 + +// stackcheck checks that SP is in range [g->stack.lo, g->stack.hi). +func stackcheck() + +// Called from assembly only; declared for go vet. +func setldt(slot uintptr, base unsafe.Pointer, size uintptr) +func emptyfunc() + +//go:noescape +func asmcgocall_no_g(fn, arg unsafe.Pointer) diff --git a/src/runtime/stubs_amd64.go b/src/runtime/stubs_amd64.go new file mode 100644 index 0000000..687a506 --- /dev/null +++ b/src/runtime/stubs_amd64.go @@ -0,0 +1,49 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// Called from compiled code; declared for vet; do NOT call from Go. 
+func gcWriteBarrierCX() +func gcWriteBarrierDX() +func gcWriteBarrierBX() +func gcWriteBarrierBP() +func gcWriteBarrierSI() +func gcWriteBarrierR8() +func gcWriteBarrierR9() + +// stackcheck checks that SP is in range [g->stack.lo, g->stack.hi). +func stackcheck() + +// Called from assembly only; declared for go vet. +func settls() // argument in DI + +// Retpolines, used by -spectre=ret flag in cmd/asm, cmd/compile. +func retpolineAX() +func retpolineCX() +func retpolineDX() +func retpolineBX() +func retpolineBP() +func retpolineSI() +func retpolineDI() +func retpolineR8() +func retpolineR9() +func retpolineR10() +func retpolineR11() +func retpolineR12() +func retpolineR13() +func retpolineR14() +func retpolineR15() + +//go:noescape +func asmcgocall_no_g(fn, arg unsafe.Pointer) + +// Used by reflectcall and the reflect package. +// +// Spills/loads arguments in registers to/from an internal/abi.RegArgs +// respectively. Does not follow the Go ABI. +func spillArgs() +func unspillArgs() diff --git a/src/runtime/stubs_arm.go b/src/runtime/stubs_arm.go new file mode 100644 index 0000000..52c3293 --- /dev/null +++ b/src/runtime/stubs_arm.go @@ -0,0 +1,25 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// Called from compiler-generated code; declared for go vet. +func udiv() +func _div() +func _divu() +func _mod() +func _modu() + +// Called from assembly only; declared for go vet. +func usplitR0() +func load_g() +func save_g() +func emptyfunc() +func _initcgo() +func read_tls_fallback() + +//go:noescape +func asmcgocall_no_g(fn, arg unsafe.Pointer) diff --git a/src/runtime/stubs_arm64.go b/src/runtime/stubs_arm64.go new file mode 100644 index 0000000..bd0533d --- /dev/null +++ b/src/runtime/stubs_arm64.go @@ -0,0 +1,23 @@ +// Copyright 2019 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// Called from assembly only; declared for go vet. +func load_g() +func save_g() + +//go:noescape +func asmcgocall_no_g(fn, arg unsafe.Pointer) + +func emptyfunc() + +// Used by reflectcall and the reflect package. +// +// Spills/loads arguments in registers to/from an internal/abi.RegArgs +// respectively. Does not follow the Go ABI. +func spillArgs() +func unspillArgs() diff --git a/src/runtime/stubs_linux.go b/src/runtime/stubs_linux.go new file mode 100644 index 0000000..2367dc2 --- /dev/null +++ b/src/runtime/stubs_linux.go @@ -0,0 +1,20 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux + +package runtime + +import "unsafe" + +func sbrk0() uintptr + +// Called from write_err_android.go only, but defined in sys_linux_*.s; +// declared here (instead of in write_err_android.go) for go vet on non-android builds. +// The return value is the raw syscall result, which may encode an error number. +// +//go:noescape +func access(name *byte, mode int32) int32 +func connect(fd int32, addr unsafe.Pointer, len int32) int32 +func socket(domain int32, typ int32, prot int32) int32 diff --git a/src/runtime/stubs_loong64.go b/src/runtime/stubs_loong64.go new file mode 100644 index 0000000..22366f5 --- /dev/null +++ b/src/runtime/stubs_loong64.go @@ -0,0 +1,11 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build loong64 + +package runtime + +// Called from assembly only; declared for go vet. 
+func load_g() +func save_g() diff --git a/src/runtime/stubs_mips64x.go b/src/runtime/stubs_mips64x.go new file mode 100644 index 0000000..a9ddfc0 --- /dev/null +++ b/src/runtime/stubs_mips64x.go @@ -0,0 +1,16 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +package runtime + +import "unsafe" + +// Called from assembly only; declared for go vet. +func load_g() +func save_g() + +//go:noescape +func asmcgocall_no_g(fn, arg unsafe.Pointer) diff --git a/src/runtime/stubs_mipsx.go b/src/runtime/stubs_mipsx.go new file mode 100644 index 0000000..d48f9b8 --- /dev/null +++ b/src/runtime/stubs_mipsx.go @@ -0,0 +1,11 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +package runtime + +// Called from assembly only; declared for go vet. +func load_g() +func save_g() diff --git a/src/runtime/stubs_nonlinux.go b/src/runtime/stubs_nonlinux.go new file mode 100644 index 0000000..1a06d7c --- /dev/null +++ b/src/runtime/stubs_nonlinux.go @@ -0,0 +1,12 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !linux + +package runtime + +// sbrk0 returns the current process brk, or 0 if not implemented. +func sbrk0() uintptr { + return 0 +} diff --git a/src/runtime/stubs_ppc64.go b/src/runtime/stubs_ppc64.go new file mode 100644 index 0000000..e23e338 --- /dev/null +++ b/src/runtime/stubs_ppc64.go @@ -0,0 +1,12 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux + +package runtime + +// This is needed for vet. 
+// +//go:noescape +func callCgoSigaction(sig uintptr, new, old *sigactiont) int32 diff --git a/src/runtime/stubs_ppc64x.go b/src/runtime/stubs_ppc64x.go new file mode 100644 index 0000000..95e43a5 --- /dev/null +++ b/src/runtime/stubs_ppc64x.go @@ -0,0 +1,17 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64le || ppc64 + +package runtime + +// Called from assembly only; declared for go vet. +func load_g() +func save_g() +func reginit() + +// Spills/loads arguments in registers to/from an internal/abi.RegArgs +// respectively. Does not follow the Go ABI. +func spillArgs() +func unspillArgs() diff --git a/src/runtime/stubs_riscv64.go b/src/runtime/stubs_riscv64.go new file mode 100644 index 0000000..f677117 --- /dev/null +++ b/src/runtime/stubs_riscv64.go @@ -0,0 +1,16 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// Called from assembly only; declared for go vet. +func load_g() +func save_g() + +// Used by reflectcall and the reflect package. +// +// Spills/loads arguments in registers to/from an internal/abi.RegArgs +// respectively. Does not follow the Go ABI. +func spillArgs() +func unspillArgs() diff --git a/src/runtime/stubs_s390x.go b/src/runtime/stubs_s390x.go new file mode 100644 index 0000000..44c566e --- /dev/null +++ b/src/runtime/stubs_s390x.go @@ -0,0 +1,9 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// Called from assembly only; declared for go vet. 
+func load_g() +func save_g() diff --git a/src/runtime/symtab.go b/src/runtime/symtab.go new file mode 100644 index 0000000..dead27e --- /dev/null +++ b/src/runtime/symtab.go @@ -0,0 +1,1214 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// Frames may be used to get function/file/line information for a +// slice of PC values returned by Callers. +type Frames struct { + // callers is a slice of PCs that have not yet been expanded to frames. + callers []uintptr + + // frames is a slice of Frames that have yet to be returned. + frames []Frame + frameStore [2]Frame +} + +// Frame is the information returned by Frames for each call frame. +type Frame struct { + // PC is the program counter for the location in this frame. + // For a frame that calls another frame, this will be the + // program counter of a call instruction. Because of inlining, + // multiple frames may have the same PC value, but different + // symbolic information. + PC uintptr + + // Func is the Func value of this call frame. This may be nil + // for non-Go code or fully inlined functions. + Func *Func + + // Function is the package path-qualified function name of + // this call frame. If non-empty, this string uniquely + // identifies a single function in the program. + // This may be the empty string if not known. + // If Func is not nil then Function == Func.Name(). + Function string + + // File and Line are the file name and line number of the + // location in this frame. For non-leaf frames, this will be + // the location of a call. These may be the empty string and + // zero, respectively, if not known. + File string + Line int + + // startLine is the line number of the beginning of the function in + // this frame. 
Specifically, it is the line number of the func keyword + // for Go functions. Note that //line directives can change the + // filename and/or line number arbitrarily within a function, meaning + // that the Line - startLine offset is not always meaningful. + // + // This may be zero if not known. + startLine int + + // Entry point program counter for the function; may be zero + // if not known. If Func is not nil then Entry == + // Func.Entry(). + Entry uintptr + + // The runtime's internal view of the function. This field + // is set (funcInfo.valid() returns true) only for Go functions, + // not for C functions. + funcInfo funcInfo +} + +// CallersFrames takes a slice of PC values returned by Callers and +// prepares to return function/file/line information. +// Do not change the slice until you are done with the Frames. +func CallersFrames(callers []uintptr) *Frames { + f := &Frames{callers: callers} + f.frames = f.frameStore[:0] + return f +} + +// Next returns a Frame representing the next call frame in the slice +// of PC values. If it has already returned all call frames, Next +// returns a zero Frame. +// +// The more result indicates whether the next call to Next will return +// a valid Frame. It does not necessarily indicate whether this call +// returned one. +// +// See the Frames example for idiomatic usage. +func (ci *Frames) Next() (frame Frame, more bool) { + for len(ci.frames) < 2 { + // Find the next frame. + // We need to look for 2 frames so we know what + // to return for the "more" result. + if len(ci.callers) == 0 { + break + } + pc := ci.callers[0] + ci.callers = ci.callers[1:] + funcInfo := findfunc(pc) + if !funcInfo.valid() { + if cgoSymbolizer != nil { + // Pre-expand cgo frames. We could do this + // incrementally, too, but there's no way to + // avoid allocation in this case anyway. + ci.frames = append(ci.frames, expandCgoFrames(pc)...) 
+ } + continue + } + f := funcInfo._Func() + entry := f.Entry() + if pc > entry { + // We store the pc of the start of the instruction following + // the instruction in question (the call or the inline mark). + // This is done for historical reasons, and to make FuncForPC + // work correctly for entries in the result of runtime.Callers. + pc-- + } + name := funcname(funcInfo) + startLine := f.startLine() + if inldata := funcdata(funcInfo, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + // Non-strict as cgoTraceback may have added bogus PCs + // with a valid funcInfo but invalid PCDATA. + ix := pcdatavalue1(funcInfo, _PCDATA_InlTreeIndex, pc, nil, false) + if ix >= 0 { + // Note: entry is not modified. It always refers to a real frame, not an inlined one. + f = nil + ic := inltree[ix] + name = funcnameFromNameOff(funcInfo, ic.nameOff) + startLine = ic.startLine + // File/line from funcline1 below are already correct. + } + } + ci.frames = append(ci.frames, Frame{ + PC: pc, + Func: f, + Function: name, + Entry: entry, + startLine: int(startLine), + funcInfo: funcInfo, + // Note: File,Line set below + }) + } + + // Pop one frame from the frame list. Keep the rest. + // Avoid allocation in the common case, which is 1 or 2 frames. + switch len(ci.frames) { + case 0: // In the rare case when there are no frames at all, we return Frame{}. + return + case 1: + frame = ci.frames[0] + ci.frames = ci.frameStore[:0] + case 2: + frame = ci.frames[0] + ci.frameStore[0] = ci.frames[1] + ci.frames = ci.frameStore[:1] + default: + frame = ci.frames[0] + ci.frames = ci.frames[1:] + } + more = len(ci.frames) > 0 + if frame.funcInfo.valid() { + // Compute file/line just before we need to return it, + // as it can be expensive. This avoids computing file/line + // for the Frame we find but don't return. See issue 32093. 
+ file, line := funcline1(frame.funcInfo, frame.PC, false) + frame.File, frame.Line = file, int(line) + } + return +} + +// runtime_FrameStartLine returns the start line of the function in a Frame. +// +//go:linkname runtime_FrameStartLine runtime/pprof.runtime_FrameStartLine +func runtime_FrameStartLine(f *Frame) int { + return f.startLine +} + +// runtime_expandFinalInlineFrame expands the final pc in stk to include all +// "callers" if pc is inline. +// +//go:linkname runtime_expandFinalInlineFrame runtime/pprof.runtime_expandFinalInlineFrame +func runtime_expandFinalInlineFrame(stk []uintptr) []uintptr { + if len(stk) == 0 { + return stk + } + pc := stk[len(stk)-1] + tracepc := pc - 1 + + f := findfunc(tracepc) + if !f.valid() { + // Not a Go function. + return stk + } + + inldata := funcdata(f, _FUNCDATA_InlTree) + if inldata == nil { + // Nothing inline in f. + return stk + } + + // Treat the previous func as normal. We haven't actually checked, but + // since this pc was included in the stack, we know it shouldn't be + // elided. + lastFuncID := funcID_normal + + // Remove pc from stk; we'll re-add it below. + stk = stk[:len(stk)-1] + + // See inline expansion in gentraceback. + var cache pcvalueCache + inltree := (*[1 << 20]inlinedCall)(inldata) + for { + // Non-strict as cgoTraceback may have added bogus PCs + // with a valid funcInfo but invalid PCDATA. + ix := pcdatavalue1(f, _PCDATA_InlTreeIndex, tracepc, &cache, false) + if ix < 0 { + break + } + if inltree[ix].funcID == funcID_wrapper && elideWrapperCalling(lastFuncID) { + // ignore wrappers + } else { + stk = append(stk, pc) + } + lastFuncID = inltree[ix].funcID + // Back up to an instruction in the "caller". + tracepc = f.entry() + uintptr(inltree[ix].parentPc) + pc = tracepc + 1 + } + + // N.B. we want to keep the last parentPC which is not inline. 
+ stk = append(stk, pc) + + return stk +} + +// expandCgoFrames expands frame information for pc, known to be +// a non-Go function, using the cgoSymbolizer hook. expandCgoFrames +// returns nil if pc could not be expanded. +func expandCgoFrames(pc uintptr) []Frame { + arg := cgoSymbolizerArg{pc: pc} + callCgoSymbolizer(&arg) + + if arg.file == nil && arg.funcName == nil { + // No useful information from symbolizer. + return nil + } + + var frames []Frame + for { + frames = append(frames, Frame{ + PC: pc, + Func: nil, + Function: gostring(arg.funcName), + File: gostring(arg.file), + Line: int(arg.lineno), + Entry: arg.entry, + // funcInfo is zero, which implies !funcInfo.valid(). + // That ensures that we use the File/Line info given here. + }) + if arg.more == 0 { + break + } + callCgoSymbolizer(&arg) + } + + // No more frames for this PC. Tell the symbolizer we are done. + // We don't try to maintain a single cgoSymbolizerArg for the + // whole use of Frames, because there would be no good way to tell + // the symbolizer when we are done. + arg.pc = 0 + callCgoSymbolizer(&arg) + + return frames +} + +// NOTE: Func does not expose the actual unexported fields, because we return *Func +// values to users, and we want to keep them from being able to overwrite the data +// with (say) *f = Func{}. +// All code operating on a *Func must call raw() to get the *_func +// or funcInfo() to get the funcInfo instead. + +// A Func represents a Go function in the running binary. +type Func struct { + opaque struct{} // unexported field to disallow conversions +} + +func (f *Func) raw() *_func { + return (*_func)(unsafe.Pointer(f)) +} + +func (f *Func) funcInfo() funcInfo { + return f.raw().funcInfo() +} + +func (f *_func) funcInfo() funcInfo { + // Find the module containing fn. fn is located in the pclntable. + // The unsafe.Pointer to uintptr conversions and arithmetic + // are safe because we are working with module addresses. 
+ ptr := uintptr(unsafe.Pointer(f)) + var mod *moduledata + for datap := &firstmoduledata; datap != nil; datap = datap.next { + if len(datap.pclntable) == 0 { + continue + } + base := uintptr(unsafe.Pointer(&datap.pclntable[0])) + if base <= ptr && ptr < base+uintptr(len(datap.pclntable)) { + mod = datap + break + } + } + return funcInfo{f, mod} +} + +// PCDATA and FUNCDATA table indexes. +// +// See funcdata.h and ../cmd/internal/objabi/funcdata.go. +const ( + _PCDATA_UnsafePoint = 0 + _PCDATA_StackMapIndex = 1 + _PCDATA_InlTreeIndex = 2 + _PCDATA_ArgLiveIndex = 3 + + _FUNCDATA_ArgsPointerMaps = 0 + _FUNCDATA_LocalsPointerMaps = 1 + _FUNCDATA_StackObjects = 2 + _FUNCDATA_InlTree = 3 + _FUNCDATA_OpenCodedDeferInfo = 4 + _FUNCDATA_ArgInfo = 5 + _FUNCDATA_ArgLiveInfo = 6 + _FUNCDATA_WrapInfo = 7 + + _ArgsSizeUnknown = -0x80000000 +) + +const ( + // PCDATA_UnsafePoint values. + _PCDATA_UnsafePointSafe = -1 // Safe for async preemption + _PCDATA_UnsafePointUnsafe = -2 // Unsafe for async preemption + + // _PCDATA_Restart1(2) apply on a sequence of instructions, within + // which if an async preemption happens, we should back off the PC + // to the start of the sequence when resuming. + // We need two so we can distinguish the start/end of the sequence + // in case that two sequences are next to each other. + _PCDATA_Restart1 = -3 + _PCDATA_Restart2 = -4 + + // Like _PCDATA_Restart1, but back to function entry if async + // preempted. + _PCDATA_RestartAtEntry = -5 +) + +// A FuncID identifies particular functions that need to be treated +// specially by the runtime. +// Note that in some situations involving plugins, there may be multiple +// copies of a particular special runtime function. +// Note: this list must match the list in cmd/internal/objabi/funcid.go.
+type funcID uint8 + +const ( + funcID_normal funcID = iota // not a special function + funcID_abort + funcID_asmcgocall + funcID_asyncPreempt + funcID_cgocallback + funcID_debugCallV2 + funcID_gcBgMarkWorker + funcID_goexit + funcID_gogo + funcID_gopanic + funcID_handleAsyncEvent + funcID_mcall + funcID_morestack + funcID_mstart + funcID_panicwrap + funcID_rt0_go + funcID_runfinq + funcID_runtime_main + funcID_sigpanic + funcID_systemstack + funcID_systemstack_switch + funcID_wrapper // any autogenerated code (hash/eq algorithms, method wrappers, etc.) +) + +// A FuncFlag holds bits about a function. +// This list must match the list in cmd/internal/objabi/funcid.go. +type funcFlag uint8 + +const ( + // TOPFRAME indicates a function that appears at the top of its stack. + // The traceback routines stop at such a function and consider that a + // successful, complete traversal of the stack. + // Examples of TOPFRAME functions include goexit, which appears + // at the top of a user goroutine stack, and mstart, which appears + // at the top of a system goroutine stack. + funcFlag_TOPFRAME funcFlag = 1 << iota + + // SPWRITE indicates a function that writes an arbitrary value to SP + // (any write other than adding or subtracting a constant amount). + // The traceback routines cannot encode such changes into the + // pcsp tables, so the function traceback cannot safely unwind past + // SPWRITE functions. Stopping at an SPWRITE function is considered + // to be an incomplete unwinding of the stack. In certain contexts + // (in particular garbage collector stack scans) that is a fatal error. + funcFlag_SPWRITE + + // ASM indicates that a function was implemented in assembly. + funcFlag_ASM +) + +// pcHeader holds data used by the pclntab lookups.
+type pcHeader struct { + magic uint32 // 0xFFFFFFF1 + pad1, pad2 uint8 // 0,0 + minLC uint8 // min instruction size + ptrSize uint8 // size of a ptr in bytes + nfunc int // number of functions in the module + nfiles uint // number of entries in the file tab + textStart uintptr // base for function entry PC offsets in this module, equal to moduledata.text + funcnameOffset uintptr // offset to the funcnametab variable from pcHeader + cuOffset uintptr // offset to the cutab variable from pcHeader + filetabOffset uintptr // offset to the filetab variable from pcHeader + pctabOffset uintptr // offset to the pctab variable from pcHeader + pclnOffset uintptr // offset to the pclntab variable from pcHeader +} + +// moduledata records information about the layout of the executable +// image. It is written by the linker. Any changes here must be +// matched by changes to the code in cmd/link/internal/ld/symtab.go:symtab. +// moduledata is stored in statically allocated non-pointer memory; +// none of the pointers here are visible to the garbage collector.
+type moduledata struct { + pcHeader *pcHeader + funcnametab []byte + cutab []uint32 + filetab []byte + pctab []byte + pclntable []byte + ftab []functab + findfunctab uintptr + minpc, maxpc uintptr + + text, etext uintptr + noptrdata, enoptrdata uintptr + data, edata uintptr + bss, ebss uintptr + noptrbss, enoptrbss uintptr + covctrs, ecovctrs uintptr + end, gcdata, gcbss uintptr + types, etypes uintptr + rodata uintptr + gofunc uintptr // go.func.* + + textsectmap []textsect + typelinks []int32 // offsets from types + itablinks []*itab + + ptab []ptabEntry + + pluginpath string + pkghashes []modulehash + + modulename string + modulehashes []modulehash + + hasmain uint8 // 1 if module contains the main function, 0 otherwise + + gcdatamask, gcbssmask bitvector + + typemap map[typeOff]*_type // offset to *_rtype in previous module + + bad bool // module failed to load and should be ignored + + next *moduledata +} + +// A modulehash is used to compare the ABI of a new module or a +// package in a new module with the loaded program. +// +// For each shared library a module links against, the linker creates an entry in the +// moduledata.modulehashes slice containing the name of the module, the abi hash seen +// at link time and a pointer to the runtime abi hash. These are checked in +// moduledataverify1 below. +// +// For each loaded plugin, the pkghashes slice has a modulehash of the +// newly loaded package that can be used to check the plugin's version of +// a package against any previously loaded version of the package. +// This is done in plugin.lastmoduleinit. +type modulehash struct { + modulename string + linktimehash string + runtimehash *string +} + +// pinnedTypemaps are the map[typeOff]*_type from the moduledata objects. +// +// These typemap objects are allocated at run time on the heap, but the +// only direct reference to them is in the moduledata, created by the +// linker and marked SNOPTRDATA so it is ignored by the GC. 
+// +// To make sure the map isn't collected, we keep a second reference here. +var pinnedTypemaps []map[typeOff]*_type + +var firstmoduledata moduledata // linker symbol +var lastmoduledatap *moduledata // linker symbol +var modulesSlice *[]*moduledata // see activeModules + +// activeModules returns a slice of active modules. +// +// A module is active once its gcdatamask and gcbssmask have been +// assembled and it is usable by the GC. +// +// This is nosplit/nowritebarrier because it is called by the +// cgo pointer checking code. +// +//go:nosplit +//go:nowritebarrier +func activeModules() []*moduledata { + p := (*[]*moduledata)(atomic.Loadp(unsafe.Pointer(&modulesSlice))) + if p == nil { + return nil + } + return *p +} + +// modulesinit creates the active modules slice out of all loaded modules. +// +// When a module is first loaded by the dynamic linker, an .init_array +// function (written by cmd/link) is invoked to call addmoduledata, +// appending the module to the linked list that starts with +// firstmoduledata. +// +// There are two times this can happen in the lifecycle of a Go +// program. First, if compiled with -linkshared, a number of modules +// built with -buildmode=shared can be loaded at program initialization. +// Second, a Go program can load a module while running that was built +// with -buildmode=plugin. +// +// After loading, this function is called which initializes the +// moduledata so it is usable by the GC and creates a new activeModules +// list. +// +// Only one goroutine may call modulesinit at a time.
+func modulesinit() { + modules := new([]*moduledata) + for md := &firstmoduledata; md != nil; md = md.next { + if md.bad { + continue + } + *modules = append(*modules, md) + if md.gcdatamask == (bitvector{}) { + scanDataSize := md.edata - md.data + md.gcdatamask = progToPointerMask((*byte)(unsafe.Pointer(md.gcdata)), scanDataSize) + scanBSSSize := md.ebss - md.bss + md.gcbssmask = progToPointerMask((*byte)(unsafe.Pointer(md.gcbss)), scanBSSSize) + gcController.addGlobals(int64(scanDataSize + scanBSSSize)) + } + } + + // Modules appear in the moduledata linked list in the order they are + // loaded by the dynamic loader, with one exception: the + // firstmoduledata itself is the module that contains the runtime. This + // is not always the first module (when using -buildmode=shared, it + // is typically libstd.so, the second module). The order matters for + // typelinksinit, so we swap the first module with whatever module + // contains the main function. + // + // See Issue #18729. + for i, md := range *modules { + if md.hasmain != 0 { + (*modules)[0] = md + (*modules)[i] = &firstmoduledata + break + } + } + + atomicstorep(unsafe.Pointer(&modulesSlice), unsafe.Pointer(modules)) +} + +type functab struct { + entryoff uint32 // relative to runtime.text + funcoff uint32 +} + +// Mapping information for secondary text sections + +type textsect struct { + vaddr uintptr // prelinked section vaddr + end uintptr // vaddr + section length + baseaddr uintptr // relocated section address +} + +const minfunc = 16 // minimum function size +const pcbucketsize = 256 * minfunc // size of bucket in the pc->func lookup table + +// findfuncbucket is an array of these structures. +// Each bucket represents 4096 bytes of the text segment. +// Each subbucket represents 256 bytes of the text segment. +// To find a function given a pc, locate the bucket and subbucket for +// that pc. Add together the idx and subbucket value to obtain a +// function index.
Then scan the functab array starting at that +// index to find the target function. +// This table uses 20 bytes for every 4096 bytes of code, or ~0.5% overhead. +type findfuncbucket struct { + idx uint32 + subbuckets [16]byte +} + +func moduledataverify() { + for datap := &firstmoduledata; datap != nil; datap = datap.next { + moduledataverify1(datap) + } +} + +const debugPcln = false + +func moduledataverify1(datap *moduledata) { + // Check that the pclntab's format is valid. + hdr := datap.pcHeader + if hdr.magic != 0xfffffff1 || hdr.pad1 != 0 || hdr.pad2 != 0 || + hdr.minLC != sys.PCQuantum || hdr.ptrSize != goarch.PtrSize || hdr.textStart != datap.text { + println("runtime: pcHeader: magic=", hex(hdr.magic), "pad1=", hdr.pad1, "pad2=", hdr.pad2, + "minLC=", hdr.minLC, "ptrSize=", hdr.ptrSize, "pcHeader.textStart=", hex(hdr.textStart), + "text=", hex(datap.text), "pluginpath=", datap.pluginpath) + throw("invalid function symbol table") + } + + // ftab is a lookup table for functions by program counter. + nftab := len(datap.ftab) - 1 + for i := 0; i < nftab; i++ { + // NOTE: ftab[nftab].entry is legal; it is the address beyond the final function.
+ if datap.ftab[i].entryoff > datap.ftab[i+1].entryoff { + f1 := funcInfo{(*_func)(unsafe.Pointer(&datap.pclntable[datap.ftab[i].funcoff])), datap} + f2 := funcInfo{(*_func)(unsafe.Pointer(&datap.pclntable[datap.ftab[i+1].funcoff])), datap} + f2name := "end" + if i+1 < nftab { + f2name = funcname(f2) + } + println("function symbol table not sorted by PC offset:", hex(datap.ftab[i].entryoff), funcname(f1), ">", hex(datap.ftab[i+1].entryoff), f2name, ", plugin:", datap.pluginpath) + for j := 0; j <= i; j++ { + println("\t", hex(datap.ftab[j].entryoff), funcname(funcInfo{(*_func)(unsafe.Pointer(&datap.pclntable[datap.ftab[j].funcoff])), datap})) + } + if GOOS == "aix" && isarchive { + println("-Wl,-bnoobjreorder is mandatory on aix/ppc64 with c-archive") + } + throw("invalid runtime symbol table") + } + } + + min := datap.textAddr(datap.ftab[0].entryoff) + max := datap.textAddr(datap.ftab[nftab].entryoff) + if datap.minpc != min || datap.maxpc != max { + println("minpc=", hex(datap.minpc), "min=", hex(min), "maxpc=", hex(datap.maxpc), "max=", hex(max)) + throw("minpc or maxpc invalid") + } + + for _, modulehash := range datap.modulehashes { + if modulehash.linktimehash != *modulehash.runtimehash { + println("abi mismatch detected between", datap.modulename, "and", modulehash.modulename) + throw("abi mismatch") + } + } +} + +// textAddr returns md.text + off, with special handling for multiple text sections. +// off is a (virtual) offset computed at internal linking time, +// before the external linker adjusts the sections' base addresses. +// +// The text, or instruction stream is generated as one large buffer. +// The off (offset) for a function is its offset within this buffer. +// If the total text size gets too large, there can be issues on platforms like ppc64 +// if the target of calls are too far for the call instruction. 
+// To resolve the large text issue, the text is split into multiple text sections +// to allow the linker to generate long calls when necessary. +// When this happens, the vaddr for each text section is set to its offset within the text. +// Each function's offset is compared against the section vaddrs and ends to determine the containing section. +// Then the section relative offset is added to the section's +// relocated baseaddr to compute the function address. +// +// It is nosplit because it is part of the findfunc implementation. +// +//go:nosplit +func (md *moduledata) textAddr(off32 uint32) uintptr { + off := uintptr(off32) + res := md.text + off + if len(md.textsectmap) > 1 { + for i, sect := range md.textsectmap { + // For the last section, include the end address (etext), as it is included in the functab. + if off >= sect.vaddr && off < sect.end || (i == len(md.textsectmap)-1 && off == sect.end) { + res = sect.baseaddr + off - sect.vaddr + break + } + } + if res > md.etext && GOARCH != "wasm" { // on wasm, functions do not live in the same address space as the linear memory + println("runtime: textAddr", hex(res), "out of range", hex(md.text), "-", hex(md.etext)) + throw("runtime: text offset out of range") + } + } + return res +} + +// textOff is the opposite of textAddr. It converts a PC to a (virtual) offset +// to md.text, and reports whether the PC is in any Go text section. +// +// It is nosplit because it is part of the findfunc implementation. +// +//go:nosplit +func (md *moduledata) textOff(pc uintptr) (uint32, bool) { + res := uint32(pc - md.text) + if len(md.textsectmap) > 1 { + for i, sect := range md.textsectmap { + if sect.baseaddr > pc { + // pc is not in any section. + return 0, false + } + end := sect.baseaddr + (sect.end - sect.vaddr) + // For the last section, include the end address (etext), as it is included in the functab.
+ if i == len(md.textsectmap)-1 { + end++ + } + if pc < end { + res = uint32(pc - sect.baseaddr + sect.vaddr) + break + } + } + } + return res, true +} + +// FuncForPC returns a *Func describing the function that contains the +// given program counter address, or else nil. +// +// If pc represents multiple functions because of inlining, it returns +// the *Func describing the innermost function, but with an entry of +// the outermost function. +func FuncForPC(pc uintptr) *Func { + f := findfunc(pc) + if !f.valid() { + return nil + } + if inldata := funcdata(f, _FUNCDATA_InlTree); inldata != nil { + // Note: strict=false so bad PCs (those between functions) don't crash the runtime. + // We just report the preceding function in that situation. See issue 29735. + // TODO: Perhaps we should report no function at all in that case. + // The runtime currently doesn't have function end info, alas. + if ix := pcdatavalue1(f, _PCDATA_InlTreeIndex, pc, nil, false); ix >= 0 { + inltree := (*[1 << 20]inlinedCall)(inldata) + ic := inltree[ix] + name := funcnameFromNameOff(f, ic.nameOff) + file, line := funcline(f, pc) + fi := &funcinl{ + ones: ^uint32(0), + entry: f.entry(), // entry of the real (the outermost) function. + name: name, + file: file, + line: line, + startLine: ic.startLine, + } + return (*Func)(unsafe.Pointer(fi)) + } + } + return f._Func() +} + +// Name returns the name of the function. +func (f *Func) Name() string { + if f == nil { + return "" + } + fn := f.raw() + if fn.isInlined() { // inlined version + fi := (*funcinl)(unsafe.Pointer(fn)) + return fi.name + } + return funcname(f.funcInfo()) +} + +// Entry returns the entry address of the function. +func (f *Func) Entry() uintptr { + fn := f.raw() + if fn.isInlined() { // inlined version + fi := (*funcinl)(unsafe.Pointer(fn)) + return fi.entry + } + return fn.funcInfo().entry() +} + +// FileLine returns the file name and line number of the +// source code corresponding to the program counter pc.
+// The result will not be accurate if pc is not a program +// counter within f. +func (f *Func) FileLine(pc uintptr) (file string, line int) { + fn := f.raw() + if fn.isInlined() { // inlined version + fi := (*funcinl)(unsafe.Pointer(fn)) + return fi.file, int(fi.line) + } + // Pass strict=false here, because anyone can call this function, + // and they might just be wrong about targetpc belonging to f. + file, line32 := funcline1(f.funcInfo(), pc, false) + return file, int(line32) +} + +// startLine returns the starting line number of the function. i.e., the line +// number of the func keyword. +func (f *Func) startLine() int32 { + fn := f.raw() + if fn.isInlined() { // inlined version + fi := (*funcinl)(unsafe.Pointer(fn)) + return fi.startLine + } + return fn.funcInfo().startLine +} + +// findmoduledatap looks up the moduledata for a PC. +// +// It is nosplit because it's part of the isgoexception +// implementation. +// +//go:nosplit +func findmoduledatap(pc uintptr) *moduledata { + for datap := &firstmoduledata; datap != nil; datap = datap.next { + if datap.minpc <= pc && pc < datap.maxpc { + return datap + } + } + return nil +} + +type funcInfo struct { + *_func + datap *moduledata +} + +func (f funcInfo) valid() bool { + return f._func != nil +} + +func (f funcInfo) _Func() *Func { + return (*Func)(unsafe.Pointer(f._func)) +} + +// isInlined reports whether f should be re-interpreted as a *funcinl. +func (f *_func) isInlined() bool { + return f.entryOff == ^uint32(0) // see comment for funcinl.ones +} + +// entry returns the entry PC for f. +func (f funcInfo) entry() uintptr { + return f.datap.textAddr(f.entryOff) +} + +// findfunc looks up function metadata for a PC. +// +// It is nosplit because it's part of the isgoexception +// implementation. 
+// +//go:nosplit +func findfunc(pc uintptr) funcInfo { + datap := findmoduledatap(pc) + if datap == nil { + return funcInfo{} + } + const nsub = uintptr(len(findfuncbucket{}.subbuckets)) + + pcOff, ok := datap.textOff(pc) + if !ok { + return funcInfo{} + } + + x := uintptr(pcOff) + datap.text - datap.minpc // TODO: are datap.text and datap.minpc always equal? + b := x / pcbucketsize + i := x % pcbucketsize / (pcbucketsize / nsub) + + ffb := (*findfuncbucket)(add(unsafe.Pointer(datap.findfunctab), b*unsafe.Sizeof(findfuncbucket{}))) + idx := ffb.idx + uint32(ffb.subbuckets[i]) + + // Find the ftab entry. + for datap.ftab[idx+1].entryoff <= pcOff { + idx++ + } + + funcoff := datap.ftab[idx].funcoff + return funcInfo{(*_func)(unsafe.Pointer(&datap.pclntable[funcoff])), datap} +} + +type pcvalueCache struct { + entries [2][8]pcvalueCacheEnt +} + +type pcvalueCacheEnt struct { + // targetpc and off together are the key of this cache entry. + targetpc uintptr + off uint32 + // val is the value of this cached pcvalue entry. + val int32 +} + +// pcvalueCacheKey returns the outermost index in a pcvalueCache to use for targetpc. +// It must be very cheap to calculate. +// For now, align to goarch.PtrSize and reduce mod the number of entries. +// In practice, this appears to be fairly randomly and evenly distributed. +func pcvalueCacheKey(targetpc uintptr) uintptr { + return (targetpc / goarch.PtrSize) % uintptr(len(pcvalueCache{}.entries)) +} + +// Returns the PCData value, and the PC where this value starts. +// TODO: the start PC is returned only when cache is nil. +func pcvalue(f funcInfo, off uint32, targetpc uintptr, cache *pcvalueCache, strict bool) (int32, uintptr) { + if off == 0 { + return -1, 0 + } + + // Check the cache. This speeds up walks of deep stacks, which + // tend to have the same recursive functions over and over. + // + // This cache is small enough that full associativity is + // cheaper than doing the hashing for a less associative + // cache. 
+ if cache != nil { + x := pcvalueCacheKey(targetpc) + for i := range cache.entries[x] { + // We check off first because we're more + // likely to have multiple entries with + // different offsets for the same targetpc + // than the other way around, so we'll usually + // fail in the first clause. + ent := &cache.entries[x][i] + if ent.off == off && ent.targetpc == targetpc { + return ent.val, 0 + } + } + } + + if !f.valid() { + if strict && panicking.Load() == 0 { + println("runtime: no module data for", hex(f.entry())) + throw("no module data") + } + return -1, 0 + } + datap := f.datap + p := datap.pctab[off:] + pc := f.entry() + prevpc := pc + val := int32(-1) + for { + var ok bool + p, ok = step(p, &pc, &val, pc == f.entry()) + if !ok { + break + } + if targetpc < pc { + // Replace a random entry in the cache. Random + // replacement prevents a performance cliff if + // a recursive stack's cycle is slightly + // larger than the cache. + // Put the new element at the beginning, + // since it is the most likely to be newly used. + if cache != nil { + x := pcvalueCacheKey(targetpc) + e := &cache.entries[x] + ci := fastrandn(uint32(len(cache.entries[x]))) + e[ci] = e[0] + e[0] = pcvalueCacheEnt{ + targetpc: targetpc, + off: off, + val: val, + } + } + + return val, prevpc + } + prevpc = pc + } + + // If there was a table, it should have covered all program counters. + // If not, something is wrong. 
+ if panicking.Load() != 0 || !strict { + return -1, 0 + } + + print("runtime: invalid pc-encoded table f=", funcname(f), " pc=", hex(pc), " targetpc=", hex(targetpc), " tab=", p, "\n") + + p = datap.pctab[off:] + pc = f.entry() + val = -1 + for { + var ok bool + p, ok = step(p, &pc, &val, pc == f.entry()) + if !ok { + break + } + print("\tvalue=", val, " until pc=", hex(pc), "\n") + } + + throw("invalid runtime symbol table") + return -1, 0 +} + +func cfuncname(f funcInfo) *byte { + if !f.valid() || f.nameOff == 0 { + return nil + } + return &f.datap.funcnametab[f.nameOff] +} + +func funcname(f funcInfo) string { + return gostringnocopy(cfuncname(f)) +} + +func funcpkgpath(f funcInfo) string { + name := funcname(f) + i := len(name) - 1 + for ; i > 0; i-- { + if name[i] == '/' { + break + } + } + for ; i < len(name); i++ { + if name[i] == '.' { + break + } + } + return name[:i] +} + +func cfuncnameFromNameOff(f funcInfo, nameOff int32) *byte { + if !f.valid() { + return nil + } + return &f.datap.funcnametab[nameOff] +} + +func funcnameFromNameOff(f funcInfo, nameOff int32) string { + return gostringnocopy(cfuncnameFromNameOff(f, nameOff)) +} + +func funcfile(f funcInfo, fileno int32) string { + datap := f.datap + if !f.valid() { + return "?" + } + // Make sure the cu index and file offset are valid + if fileoff := datap.cutab[f.cuOffset+uint32(fileno)]; fileoff != ^uint32(0) { + return gostringnocopy(&datap.filetab[fileoff]) + } + // pcln section is corrupt. + return "?" 
+} + +func funcline1(f funcInfo, targetpc uintptr, strict bool) (file string, line int32) { + datap := f.datap + if !f.valid() { + return "?", 0 + } + fileno, _ := pcvalue(f, f.pcfile, targetpc, nil, strict) + line, _ = pcvalue(f, f.pcln, targetpc, nil, strict) + if fileno == -1 || line == -1 || int(fileno) >= len(datap.filetab) { + // print("looking for ", hex(targetpc), " in ", funcname(f), " got file=", fileno, " line=", lineno, "\n") + return "?", 0 + } + file = funcfile(f, fileno) + return +} + +func funcline(f funcInfo, targetpc uintptr) (file string, line int32) { + return funcline1(f, targetpc, true) +} + +func funcspdelta(f funcInfo, targetpc uintptr, cache *pcvalueCache) int32 { + x, _ := pcvalue(f, f.pcsp, targetpc, cache, true) + if debugPcln && x&(goarch.PtrSize-1) != 0 { + print("invalid spdelta ", funcname(f), " ", hex(f.entry()), " ", hex(targetpc), " ", hex(f.pcsp), " ", x, "\n") + throw("bad spdelta") + } + return x +} + +// funcMaxSPDelta returns the maximum spdelta at any point in f. +func funcMaxSPDelta(f funcInfo) int32 { + datap := f.datap + p := datap.pctab[f.pcsp:] + pc := f.entry() + val := int32(-1) + max := int32(0) + for { + var ok bool + p, ok = step(p, &pc, &val, pc == f.entry()) + if !ok { + return max + } + if val > max { + max = val + } + } +} + +func pcdatastart(f funcInfo, table uint32) uint32 { + return *(*uint32)(add(unsafe.Pointer(&f.nfuncdata), unsafe.Sizeof(f.nfuncdata)+uintptr(table)*4)) +} + +func pcdatavalue(f funcInfo, table uint32, targetpc uintptr, cache *pcvalueCache) int32 { + if table >= f.npcdata { + return -1 + } + r, _ := pcvalue(f, pcdatastart(f, table), targetpc, cache, true) + return r +} + +func pcdatavalue1(f funcInfo, table uint32, targetpc uintptr, cache *pcvalueCache, strict bool) int32 { + if table >= f.npcdata { + return -1 + } + r, _ := pcvalue(f, pcdatastart(f, table), targetpc, cache, strict) + return r +} + +// Like pcdatavalue, but also return the start PC of this PCData value. 
+// It doesn't take a cache. +func pcdatavalue2(f funcInfo, table uint32, targetpc uintptr) (int32, uintptr) { + if table >= f.npcdata { + return -1, 0 + } + return pcvalue(f, pcdatastart(f, table), targetpc, nil, true) +} + +// funcdata returns a pointer to the ith funcdata for f. +// funcdata should be kept in sync with cmd/link:writeFuncs. +func funcdata(f funcInfo, i uint8) unsafe.Pointer { + if i < 0 || i >= f.nfuncdata { + return nil + } + base := f.datap.gofunc // load gofunc address early so that we calculate during cache misses + p := uintptr(unsafe.Pointer(&f.nfuncdata)) + unsafe.Sizeof(f.nfuncdata) + uintptr(f.npcdata)*4 + uintptr(i)*4 + off := *(*uint32)(unsafe.Pointer(p)) + // Return off == ^uint32(0) ? 0 : f.datap.gofunc + uintptr(off), but without branches. + // The compiler calculates mask on most architectures using conditional assignment. + var mask uintptr + if off == ^uint32(0) { + mask = 1 + } + mask-- + raw := base + uintptr(off) + return unsafe.Pointer(raw & mask) +} + +// step advances to the next pc, value pair in the encoded table. +func step(p []byte, pc *uintptr, val *int32, first bool) (newp []byte, ok bool) { + // For both uvdelta and pcdelta, the common case (~70%) + // is that they are a single byte. If so, avoid calling readvarint. + uvdelta := uint32(p[0]) + if uvdelta == 0 && !first { + return nil, false + } + n := uint32(1) + if uvdelta&0x80 != 0 { + n, uvdelta = readvarint(p) + } + *val += int32(-(uvdelta & 1) ^ (uvdelta >> 1)) + p = p[n:] + + pcdelta := uint32(p[0]) + n = 1 + if pcdelta&0x80 != 0 { + n, pcdelta = readvarint(p) + } + p = p[n:] + *pc += uintptr(pcdelta * sys.PCQuantum) + return p, true +} + +// readvarint reads a varint from p. 
+func readvarint(p []byte) (read uint32, val uint32) { + var v, shift, n uint32 + for { + b := p[n] + n++ + v |= uint32(b&0x7F) << (shift & 31) + if b&0x80 == 0 { + break + } + shift += 7 + } + return n, v +} + +type stackmap struct { + n int32 // number of bitmaps + nbit int32 // number of bits in each bitmap + bytedata [1]byte // bitmaps, each starting on a byte boundary +} + +//go:nowritebarrier +func stackmapdata(stkmap *stackmap, n int32) bitvector { + // Check this invariant only when stackDebug is on at all. + // The invariant is already checked by many of stackmapdata's callers, + // and disabling it by default allows stackmapdata to be inlined. + if stackDebug > 0 && (n < 0 || n >= stkmap.n) { + throw("stackmapdata: index out of range") + } + return bitvector{stkmap.nbit, addb(&stkmap.bytedata[0], uintptr(n*((stkmap.nbit+7)>>3)))} +} + +// inlinedCall is the encoding of entries in the FUNCDATA_InlTree table. +type inlinedCall struct { + funcID funcID // type of the called function + _ [3]byte + nameOff int32 // offset into pclntab for name of called function + parentPc int32 // position of an instruction whose source position is the call site (offset from entry) + startLine int32 // line number of start of function (func keyword/TEXT directive) +} diff --git a/src/runtime/symtab_test.go b/src/runtime/symtab_test.go new file mode 100644 index 0000000..cf20ea7 --- /dev/null +++ b/src/runtime/symtab_test.go @@ -0,0 +1,285 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "runtime" + "strings" + "testing" + "unsafe" +) + +func TestCaller(t *testing.T) { + procs := runtime.GOMAXPROCS(-1) + c := make(chan bool, procs) + for p := 0; p < procs; p++ { + go func() { + for i := 0; i < 1000; i++ { + testCallerFoo(t) + } + c <- true + }() + defer func() { + <-c + }() + } +} + +// These are marked noinline so that we can use FuncForPC +// in testCallerBar. +// +//go:noinline +func testCallerFoo(t *testing.T) { + testCallerBar(t) +} + +//go:noinline +func testCallerBar(t *testing.T) { + for i := 0; i < 2; i++ { + pc, file, line, ok := runtime.Caller(i) + f := runtime.FuncForPC(pc) + if !ok || + !strings.HasSuffix(file, "symtab_test.go") || + (i == 0 && !strings.HasSuffix(f.Name(), "testCallerBar")) || + (i == 1 && !strings.HasSuffix(f.Name(), "testCallerFoo")) || + line < 5 || line > 1000 || + f.Entry() >= pc { + t.Errorf("incorrect symbol info %d: %t %d %d %s %s %d", + i, ok, f.Entry(), pc, f.Name(), file, line) + } + } +} + +func lineNumber() int { + _, _, line, _ := runtime.Caller(1) + return line // return 0 for error +} + +// Do not add/remove lines in this block without updating the line numbers. 
+var firstLine = lineNumber() // 0 +var ( // 1 + lineVar1 = lineNumber() // 2 + lineVar2a, lineVar2b = lineNumber(), lineNumber() // 3 +) // 4 +var compLit = []struct { // 5 + lineA, lineB int // 6 +}{ // 7 + { // 8 + lineNumber(), lineNumber(), // 9 + }, // 10 + { // 11 + lineNumber(), // 12 + lineNumber(), // 13 + }, // 14 + { // 15 + lineB: lineNumber(), // 16 + lineA: lineNumber(), // 17 + }, // 18 +} // 19 +var arrayLit = [...]int{lineNumber(), // 20 + lineNumber(), lineNumber(), // 21 + lineNumber(), // 22 +} // 23 +var sliceLit = []int{lineNumber(), // 24 + lineNumber(), lineNumber(), // 25 + lineNumber(), // 26 +} // 27 +var mapLit = map[int]int{ // 28 + 29: lineNumber(), // 29 + 30: lineNumber(), // 30 + lineNumber(): 31, // 31 + lineNumber(): 32, // 32 +} // 33 +var intLit = lineNumber() + // 34 + lineNumber() + // 35 + lineNumber() // 36 +func trythis() { // 37 + recordLines(lineNumber(), // 38 + lineNumber(), // 39 + lineNumber()) // 40 +} + +// Modifications below this line are okay. 
+ +var l38, l39, l40 int + +func recordLines(a, b, c int) { + l38 = a + l39 = b + l40 = c +} + +func TestLineNumber(t *testing.T) { + trythis() + for _, test := range []struct { + name string + val int + want int + }{ + {"firstLine", firstLine, 0}, + {"lineVar1", lineVar1, 2}, + {"lineVar2a", lineVar2a, 3}, + {"lineVar2b", lineVar2b, 3}, + {"compLit[0].lineA", compLit[0].lineA, 9}, + {"compLit[0].lineB", compLit[0].lineB, 9}, + {"compLit[1].lineA", compLit[1].lineA, 12}, + {"compLit[1].lineB", compLit[1].lineB, 13}, + {"compLit[2].lineA", compLit[2].lineA, 17}, + {"compLit[2].lineB", compLit[2].lineB, 16}, + + {"arrayLit[0]", arrayLit[0], 20}, + {"arrayLit[1]", arrayLit[1], 21}, + {"arrayLit[2]", arrayLit[2], 21}, + {"arrayLit[3]", arrayLit[3], 22}, + + {"sliceLit[0]", sliceLit[0], 24}, + {"sliceLit[1]", sliceLit[1], 25}, + {"sliceLit[2]", sliceLit[2], 25}, + {"sliceLit[3]", sliceLit[3], 26}, + + {"mapLit[29]", mapLit[29], 29}, + {"mapLit[30]", mapLit[30], 30}, + {"mapLit[31]", mapLit[31+firstLine] + firstLine, 31}, // nb it's the key not the value + {"mapLit[32]", mapLit[32+firstLine] + firstLine, 32}, // nb it's the key not the value + + {"intLit", intLit - 2*firstLine, 34 + 35 + 36}, + + {"l38", l38, 38}, + {"l39", l39, 39}, + {"l40", l40, 40}, + } { + if got := test.val - firstLine; got != test.want { + t.Errorf("%s on firstLine+%d want firstLine+%d (firstLine=%d, val=%d)", + test.name, got, test.want, firstLine, test.val) + } + } +} + +func TestNilName(t *testing.T) { + defer func() { + if ex := recover(); ex != nil { + t.Fatalf("expected no nil panic, got=%v", ex) + } + }() + if got := (*runtime.Func)(nil).Name(); got != "" { + t.Errorf("Name() = %q, want %q", got, "") + } +} + +var dummy int + +func inlined() { + // Side effect to prevent elimination of this entire function. + dummy = 42 +} + +// A function with an InlTree. Returns a PC within the function body. +// +// No inline to ensure this complete function appears in output. 
+// +//go:noinline +func tracebackFunc(t *testing.T) uintptr { + // This body must be more complex than a single call to inlined to get + // an inline tree. + inlined() + inlined() + + // Acquire a PC in this function. + pc, _, _, ok := runtime.Caller(0) + if !ok { + t.Fatalf("Caller(0) got ok false, want true") + } + + return pc +} + +// Test that CallersFrames handles PCs in the alignment region between +// functions (int 3 on amd64) without crashing. +// +// Go will never generate a stack trace containing such an address, as it is +// not a valid call site. However, the cgo traceback function passed to +// runtime.SetCgoTraceback may not be completely accurate and may incorrectly +// provide PCs in Go code or the alignment region between functions. +// +// Go obviously doesn't easily expose the problematic PCs to running programs, +// so this test is a bit fragile. Some details: +// +// - tracebackFunc is our target function. We want to get a PC in the +// alignment region following this function. This function also has other +// functions inlined into it to ensure it has an InlTree (this was the source +// of the bug in issue 44971). +// +// - We acquire a PC in tracebackFunc, walking forwards until FuncForPC says +// we're in a new function. The last PC of the function according to FuncForPC +// should be in the alignment region (assuming the function isn't already +// perfectly aligned). +// +// This is a regression test for issue 44971. +func TestFunctionAlignmentTraceback(t *testing.T) { + pc := tracebackFunc(t) + + // Double-check we got the right PC. + f := runtime.FuncForPC(pc) + if !strings.HasSuffix(f.Name(), "tracebackFunc") { + t.Fatalf("Caller(0) = %+v, want tracebackFunc", f) + } + + // Iterate forward until we find a different function. Backing up one + // instruction (hopefully) lands on an alignment instruction. + for runtime.FuncForPC(pc) == f { + pc++ + } + pc-- + + // Is this an alignment region filler instruction?
We only check this + // on amd64 for simplicity. If this function has no filler, then we may + // get a false negative, but will never get a false positive. + if runtime.GOARCH == "amd64" { + code := *(*uint8)(unsafe.Pointer(pc)) + if code != 0xcc { // INT $3 + t.Errorf("PC %v code got %#x want 0xcc", pc, code) + } + } + + // Finally ensure that Frames.Next doesn't crash when processing this + // PC. + frames := runtime.CallersFrames([]uintptr{pc}) + frame, _ := frames.Next() + if frame.Func != f { + t.Errorf("frames.Next() got %+v want %+v", frame.Func, f) + } +} + +func BenchmarkFunc(b *testing.B) { + pc, _, _, ok := runtime.Caller(0) + if !ok { + b.Fatal("failed to look up PC") + } + f := runtime.FuncForPC(pc) + b.Run("Name", func(b *testing.B) { + for i := 0; i < b.N; i++ { + name := f.Name() + if name != "runtime_test.BenchmarkFunc" { + b.Fatalf("unexpected name %q", name) + } + } + }) + b.Run("Entry", func(b *testing.B) { + for i := 0; i < b.N; i++ { + pc := f.Entry() + if pc == 0 { + b.Fatal("zero PC") + } + } + }) + b.Run("FileLine", func(b *testing.B) { + for i := 0; i < b.N; i++ { + file, line := f.FileLine(pc) + if !strings.HasSuffix(file, "symtab_test.go") || line == 0 { + b.Fatalf("unexpected file/line %q:%d", file, line) + } + } + }) +} diff --git a/src/runtime/sys_aix_ppc64.s b/src/runtime/sys_aix_ppc64.s new file mode 100644 index 0000000..ab18c5e --- /dev/null +++ b/src/runtime/sys_aix_ppc64.s @@ -0,0 +1,318 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +// System calls and other sys.stuff for ppc64, AIX +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "asm_ppc64x.h" + +// This function calls a C function with the function descriptor in R12 +TEXT callCfunction<>(SB), NOSPLIT|NOFRAME,$0 + MOVD 0(R12), R12 + MOVD R2, 40(R1) + MOVD 0(R12), R0 + MOVD 8(R12), R2 + MOVD R0, CTR + BR (CTR) + + +// asmsyscall6 calls a library function with a function descriptor +// stored in libcall_fn and stores the results in the libcall structure. +// Up to 6 arguments can be passed to this C function. +// Called by runtime.asmcgocall. +// It reserves a stack of 288 bytes for the C function. It must +// follow AIX convention, thus the first local variable must +// be stored at the offset 112, after the linker area (48 bytes) +// and the argument area (64). +// The AIX convention is described here: +// https://www.ibm.com/docs/en/aix/7.2?topic=overview-runtime-process-stack +// NOT USING GO CALLING CONVENTION +// runtime.asmsyscall6 is a function descriptor to the real asmsyscall6.
+DATA runtime·asmsyscall6+0(SB)/8, $asmsyscall6<>(SB) +DATA runtime·asmsyscall6+8(SB)/8, $TOC(SB) +DATA runtime·asmsyscall6+16(SB)/8, $0 +GLOBL runtime·asmsyscall6(SB), NOPTR, $24 + +TEXT asmsyscall6<>(SB),NOSPLIT,$256 + // Save libcall for later + MOVD R3, 112(R1) + MOVD libcall_fn(R3), R12 + MOVD libcall_args(R3), R9 + MOVD 0(R9), R3 + MOVD 8(R9), R4 + MOVD 16(R9), R5 + MOVD 24(R9), R6 + MOVD 32(R9), R7 + MOVD 40(R9), R8 + BL callCfunction<>(SB) + + // Restore R0 and TOC + XOR R0, R0 + MOVD 40(R1), R2 + + // Store result in libcall + MOVD 112(R1), R5 + MOVD R3, (libcall_r1)(R5) + MOVD $-1, R6 + CMP R6, R3 + BNE skiperrno + + // Save errno in libcall + BL runtime·load_g(SB) + MOVD g_m(g), R4 + MOVD (m_mOS + mOS_perrno)(R4), R9 + MOVW 0(R9), R9 + MOVD R9, (libcall_err)(R5) + RET +skiperrno: + // Reset errno if no error has been returned + MOVD R0, (libcall_err)(R5) + RET + + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R3 + MOVD info+16(FP), R4 + MOVD ctx+24(FP), R5 + MOVD fn+0(FP), R12 + // fn is a function descriptor + // R2 must be saved on restore + MOVD 0(R12), R0 + MOVD R2, 40(R1) + MOVD 8(R12), R2 + MOVD R0, CTR + BL (CTR) + MOVD 40(R1), R2 + BL runtime·reginit(SB) + RET + + +// runtime.sigtramp is a function descriptor to the real sigtramp. +DATA runtime·sigtramp+0(SB)/8, $sigtramp<>(SB) +DATA runtime·sigtramp+8(SB)/8, $TOC(SB) +DATA runtime·sigtramp+16(SB)/8, $0 +GLOBL runtime·sigtramp(SB), NOPTR, $24 + +// This function must not have any frame as we want to control how +// every registers are used. +// TODO(aix): Implement SetCgoTraceback handler. +TEXT sigtramp<>(SB),NOSPLIT|NOFRAME|TOPFRAME,$0 + MOVD LR, R0 + MOVD R0, 16(R1) + // initialize essential registers (just in case) + BL runtime·reginit(SB) + + // Note that we are executing on altsigstack here, so we have + // more stack available than NOSPLIT would have us believe. + // To defeat the linker, we make our own stack frame with + // more space. 
+ SUB $144+FIXED_FRAME, R1 + + // Save registers + MOVD R31, 56(R1) + MOVD g, 64(R1) + MOVD R29, 72(R1) + MOVD R14, 80(R1) + MOVD R15, 88(R1) + + BL runtime·load_g(SB) + + CMP $0, g + BEQ sigtramp // g == nil + MOVD g_m(g), R6 + CMP $0, R6 + BEQ sigtramp // g.m == nil + + // Save m->libcall. We need to do this because we + // might get interrupted by a signal in runtime·asmcgocall. + MOVD (m_libcall+libcall_fn)(R6), R7 + MOVD R7, 96(R1) + MOVD (m_libcall+libcall_args)(R6), R7 + MOVD R7, 104(R1) + MOVD (m_libcall+libcall_n)(R6), R7 + MOVD R7, 112(R1) + MOVD (m_libcall+libcall_r1)(R6), R7 + MOVD R7, 120(R1) + MOVD (m_libcall+libcall_r2)(R6), R7 + MOVD R7, 128(R1) + + // save errno, it might be EINTR; stuff we do here might reset it. + MOVD (m_mOS+mOS_perrno)(R6), R8 + MOVD 0(R8), R8 + MOVD R8, 136(R1) + +sigtramp: + MOVW R3, FIXED_FRAME+0(R1) + MOVD R4, FIXED_FRAME+8(R1) + MOVD R5, FIXED_FRAME+16(R1) + MOVD $runtime·sigtrampgo(SB), R12 + MOVD R12, CTR + BL (CTR) + + CMP $0, g + BEQ exit // g == nil + MOVD g_m(g), R6 + CMP $0, R6 + BEQ exit // g.m == nil + + // restore libcall + MOVD 96(R1), R7 + MOVD R7, (m_libcall+libcall_fn)(R6) + MOVD 104(R1), R7 + MOVD R7, (m_libcall+libcall_args)(R6) + MOVD 112(R1), R7 + MOVD R7, (m_libcall+libcall_n)(R6) + MOVD 120(R1), R7 + MOVD R7, (m_libcall+libcall_r1)(R6) + MOVD 128(R1), R7 + MOVD R7, (m_libcall+libcall_r2)(R6) + + // restore errno + MOVD (m_mOS+mOS_perrno)(R6), R7 + MOVD 136(R1), R8 + MOVD R8, 0(R7) + +exit: + // restore registers + MOVD 56(R1),R31 + MOVD 64(R1),g + MOVD 72(R1),R29 + MOVD 80(R1), R14 + MOVD 88(R1), R15 + + // Don't use RET because we need to restore R31 ! + ADD $144+FIXED_FRAME, R1 + MOVD 16(R1), R0 + MOVD R0, LR + BR (LR) + +// runtime.tstart is a function descriptor to the real tstart. 
+DATA runtime·tstart+0(SB)/8, $tstart<>(SB) +DATA runtime·tstart+8(SB)/8, $TOC(SB) +DATA runtime·tstart+16(SB)/8, $0 +GLOBL runtime·tstart(SB), NOPTR, $24 + +TEXT tstart<>(SB),NOSPLIT,$0 + XOR R0, R0 // reset R0 + + // set g + MOVD m_g0(R3), g + BL runtime·save_g(SB) + MOVD R3, g_m(g) + + // Layout new m scheduler stack on os stack. + MOVD R1, R3 + MOVD R3, (g_stack+stack_hi)(g) + SUB $(const_threadStackSize), R3 // stack size + MOVD R3, (g_stack+stack_lo)(g) + ADD $const__StackGuard, R3 + MOVD R3, g_stackguard0(g) + MOVD R3, g_stackguard1(g) + + BL runtime·mstart(SB) + + MOVD R0, R3 + RET + + +#define CSYSCALL() \ + MOVD 0(R12), R12 \ + MOVD R2, 40(R1) \ + MOVD 0(R12), R0 \ + MOVD 8(R12), R2 \ + MOVD R0, CTR \ + BL (CTR) \ + MOVD 40(R1), R2 \ + BL runtime·reginit(SB) + + +// Runs on OS stack, called from runtime·osyield. +TEXT runtime·osyield1(SB),NOSPLIT,$0 + MOVD $libc_sched_yield(SB), R12 + CSYSCALL() + RET + + +// Runs on OS stack, called from runtime·sigprocmask. +TEXT runtime·sigprocmask1(SB),NOSPLIT,$0-24 + MOVD how+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVD $libpthread_sigthreadmask(SB), R12 + CSYSCALL() + RET + +// Runs on OS stack, called from runtime·usleep. +TEXT runtime·usleep1(SB),NOSPLIT,$0-4 + MOVW us+0(FP), R3 + MOVD $libc_usleep(SB), R12 + CSYSCALL() + RET + +// Runs on OS stack, called from runtime·exit. +TEXT runtime·exit1(SB),NOSPLIT,$0-4 + MOVW code+0(FP), R3 + MOVD $libc_exit(SB), R12 + CSYSCALL() + RET + +// Runs on OS stack, called from runtime·write1. +TEXT runtime·write2(SB),NOSPLIT,$0-28 + MOVD fd+0(FP), R3 + MOVD p+8(FP), R4 + MOVW n+16(FP), R5 + MOVD $libc_write(SB), R12 + CSYSCALL() + MOVW R3, ret+24(FP) + RET + +// Runs on OS stack, called from runtime·pthread_attr_init. +TEXT runtime·pthread_attr_init1(SB),NOSPLIT,$0-12 + MOVD attr+0(FP), R3 + MOVD $libpthread_attr_init(SB), R12 + CSYSCALL() + MOVW R3, ret+8(FP) + RET + +// Runs on OS stack, called from runtime·pthread_attr_setstacksize. 
+TEXT runtime·pthread_attr_setstacksize1(SB),NOSPLIT,$0-20 + MOVD attr+0(FP), R3 + MOVD size+8(FP), R4 + MOVD $libpthread_attr_setstacksize(SB), R12 + CSYSCALL() + MOVW R3, ret+16(FP) + RET + +// Runs on OS stack, called from runtime·pthread_setdetachstate. +TEXT runtime·pthread_attr_setdetachstate1(SB),NOSPLIT,$0-20 + MOVD attr+0(FP), R3 + MOVW state+8(FP), R4 + MOVD $libpthread_attr_setdetachstate(SB), R12 + CSYSCALL() + MOVW R3, ret+16(FP) + RET + +// Runs on OS stack, called from runtime·pthread_create. +TEXT runtime·pthread_create1(SB),NOSPLIT,$0-36 + MOVD tid+0(FP), R3 + MOVD attr+8(FP), R4 + MOVD fn+16(FP), R5 + MOVD arg+24(FP), R6 + MOVD $libpthread_create(SB), R12 + CSYSCALL() + MOVW R3, ret+32(FP) + RET + +// Runs on OS stack, called from runtime·sigaction. +TEXT runtime·sigaction1(SB),NOSPLIT,$0-24 + MOVD sig+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVD $libc_sigaction(SB), R12 + CSYSCALL() + RET diff --git a/src/runtime/sys_arm.go b/src/runtime/sys_arm.go new file mode 100644 index 0000000..730b9c9 --- /dev/null +++ b/src/runtime/sys_arm.go @@ -0,0 +1,21 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} + +// for testing +func usplit(x uint32) (q, r uint32) diff --git a/src/runtime/sys_arm64.go b/src/runtime/sys_arm64.go new file mode 100644 index 0000000..230241d --- /dev/null +++ b/src/runtime/sys_arm64.go @@ -0,0 +1,18 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_darwin.go b/src/runtime/sys_darwin.go new file mode 100644 index 0000000..64d7523 --- /dev/null +++ b/src/runtime/sys_darwin.go @@ -0,0 +1,608 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "unsafe" +) + +// The X versions of syscall expect the libc call to return a 64-bit result. +// Otherwise (the non-X version) expects a 32-bit result. +// This distinction is required because an error is indicated by returning -1, +// and we need to know whether to check 32 or 64 bits of the result. +// (Some libc functions that return 32 bits put junk in the upper 32 bits of AX.) 
+ +//go:linkname syscall_syscall syscall.syscall +//go:nosplit +func syscall_syscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, r1, r2, err uintptr }{fn, a1, a2, a3, r1, r2, err} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1, args.r2, args.err +} +func syscall() + +//go:linkname syscall_syscallX syscall.syscallX +//go:nosplit +func syscall_syscallX(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, r1, r2, err uintptr }{fn, a1, a2, a3, r1, r2, err} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscallX)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1, args.r2, args.err +} +func syscallX() + +//go:linkname syscall_syscall6 syscall.syscall6 +//go:nosplit +func syscall_syscall6(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, a4, a5, a6, r1, r2, err uintptr }{fn, a1, a2, a3, a4, a5, a6, r1, r2, err} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1, args.r2, args.err +} +func syscall6() + +//go:linkname syscall_syscall9 syscall.syscall9 +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall9(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall9)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall9() + +//go:linkname syscall_syscall6X syscall.syscall6X +//go:nosplit +func syscall_syscall6X(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, a4, a5, a6, r1, r2, err uintptr }{fn, a1, a2, a3, a4, a5, a6, r1, r2, err} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6X)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1, args.r2, args.err +} +func syscall6X() + +//go:linkname syscall_syscallPtr syscall.syscallPtr 
+//go:nosplit +func syscall_syscallPtr(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, r1, r2, err uintptr }{fn, a1, a2, a3, r1, r2, err} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscallPtr)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1, args.r2, args.err +} +func syscallPtr() + +//go:linkname syscall_rawSyscall syscall.rawSyscall +//go:nosplit +func syscall_rawSyscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, r1, r2, err uintptr }{fn, a1, a2, a3, r1, r2, err} + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall)), unsafe.Pointer(&args)) + return args.r1, args.r2, args.err +} + +//go:linkname syscall_rawSyscall6 syscall.rawSyscall6 +//go:nosplit +func syscall_rawSyscall6(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + args := struct{ fn, a1, a2, a3, a4, a5, a6, r1, r2, err uintptr }{fn, a1, a2, a3, a4, a5, a6, r1, r2, err} + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6)), unsafe.Pointer(&args)) + return args.r1, args.r2, args.err +} + +// crypto_x509_syscall is used in crypto/x509/internal/macos to call into Security.framework and CF. + +//go:linkname crypto_x509_syscall crypto/x509/internal/macos.syscall +//go:nosplit +func crypto_x509_syscall(fn, a1, a2, a3, a4, a5 uintptr, f1 float64) (r1 uintptr) { + args := struct { + fn, a1, a2, a3, a4, a5 uintptr + f1 float64 + r1 uintptr + }{fn, a1, a2, a3, a4, a5, f1, r1} + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall_x509)), unsafe.Pointer(&args)) + exitsyscall() + return args.r1 +} +func syscall_x509() + +// The *_trampoline functions convert from the Go calling convention to the C calling convention +// and then call the underlying libc function. They are defined in sys_darwin_$ARCH.s. 
+ +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_init(attr *pthreadattr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_init_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + return ret +} +func pthread_attr_init_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_getstacksize(attr *pthreadattr, size *uintptr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_getstacksize_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + KeepAlive(size) + return ret +} +func pthread_attr_getstacksize_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_setdetachstate(attr *pthreadattr, state int) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_setdetachstate_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + return ret +} +func pthread_attr_setdetachstate_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_create(attr *pthreadattr, start uintptr, arg unsafe.Pointer) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_create_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + KeepAlive(arg) // Just for consistency. Arg of course needs to be kept alive for the start function. 
+ return ret +} +func pthread_create_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func raise(sig uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(raise_trampoline)), unsafe.Pointer(&sig)) +} +func raise_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_self() (t pthread) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_self_trampoline)), unsafe.Pointer(&t)) + return +} +func pthread_self_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_kill(t pthread, sig uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_kill_trampoline)), unsafe.Pointer(&t)) + return +} +func pthread_kill_trampoline() + +// osinit_hack is a clumsy hack to work around Apple libc bugs +// causing fork+exec to hang in the child process intermittently. +// See go.dev/issue/33565 and go.dev/issue/56784 for a few reports. +// +// The stacks obtained from the hung child processes are in +// libSystem_atfork_child, which is supposed to reinitialize various +// parts of the C library in the new process. +// +// One common stack dies in _notify_fork_child calling _notify_globals +// (inlined) calling _os_alloc_once, because _os_alloc_once detects that +// the once lock is held by the parent process and then calls +// _os_once_gate_corruption_abort. The allocation is setting up the +// globals for the notification subsystem. See the source code at [1]. +// To work around this, we can allocate the globals earlier in the Go +// program's lifetime, before any execs are involved, by calling any +// notify routine that is exported, calls _notify_globals, and doesn't do +// anything too expensive otherwise. notify_is_valid_token(0) fits the bill. +// +// The other common stack dies in xpc_atfork_child calling +// _objc_msgSend_uncached which ends up in +// WAITING_FOR_ANOTHER_THREAD_TO_FINISH_CALLING_+initialize. Of course, +// whatever thread the child is waiting for is in the parent process and +// is not going to finish anything in the child process. 
There is no +// public source code for these routines, so it is unclear exactly what +// the problem is. An Apple engineer suggests using xpc_date_create_from_current, +// which empirically does fix the problem. +// +// So osinit_hack_trampoline (in sys_darwin_$GOARCH.s) calls +// notify_is_valid_token(0) and xpc_date_create_from_current(), which makes the +// fork+exec hangs stop happening. If Apple fixes the libc bug in +// some future version of macOS, then we can remove this awful code. +// +//go:nosplit +func osinit_hack() { + if GOOS == "darwin" { // not ios + libcCall(unsafe.Pointer(abi.FuncPCABI0(osinit_hack_trampoline)), nil) + } + return +} +func osinit_hack_trampoline() + +// mmap is used to do low-level memory allocation via mmap. Don't allow stack +// splits, since this function (used by sysAlloc) is called in a lot of low-level +// parts of the runtime and callers often assume it won't acquire any locks. +// +//go:nosplit +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (unsafe.Pointer, int) { + args := struct { + addr unsafe.Pointer + n uintptr + prot, flags, fd int32 + off uint32 + ret1 unsafe.Pointer + ret2 int + }{addr, n, prot, flags, fd, off, nil, 0} + libcCall(unsafe.Pointer(abi.FuncPCABI0(mmap_trampoline)), unsafe.Pointer(&args)) + return args.ret1, args.ret2 +} +func mmap_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func munmap(addr unsafe.Pointer, n uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(munmap_trampoline)), unsafe.Pointer(&addr)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. +} +func munmap_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func madvise(addr unsafe.Pointer, n uintptr, flags int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(madvise_trampoline)), unsafe.Pointer(&addr)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. 
+} +func madvise_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func mlock(addr unsafe.Pointer, n uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(mlock_trampoline)), unsafe.Pointer(&addr)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. +} +func mlock_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func read(fd int32, p unsafe.Pointer, n int32) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(read_trampoline)), unsafe.Pointer(&fd)) + KeepAlive(p) + return ret +} +func read_trampoline() + +func pipe() (r, w int32, errno int32) { + var p [2]int32 + errno = libcCall(unsafe.Pointer(abi.FuncPCABI0(pipe_trampoline)), noescape(unsafe.Pointer(&p))) + return p[0], p[1], errno +} +func pipe_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func closefd(fd int32) int32 { + return libcCall(unsafe.Pointer(abi.FuncPCABI0(close_trampoline)), unsafe.Pointer(&fd)) +} +func close_trampoline() + +// This is exported via linkname to assembly in runtime/cgo. 
+// +//go:nosplit +//go:cgo_unsafe_args +//go:linkname exit +func exit(code int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(exit_trampoline)), unsafe.Pointer(&code)) +} +func exit_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func usleep(usec uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(usleep_trampoline)), unsafe.Pointer(&usec)) +} +func usleep_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func usleep_no_g(usec uint32) { + asmcgocall_no_g(unsafe.Pointer(abi.FuncPCABI0(usleep_trampoline)), unsafe.Pointer(&usec)) +} + +//go:nosplit +//go:cgo_unsafe_args +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(write_trampoline)), unsafe.Pointer(&fd)) + KeepAlive(p) + return ret +} +func write_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func open(name *byte, mode, perm int32) (ret int32) { + ret = libcCall(unsafe.Pointer(abi.FuncPCABI0(open_trampoline)), unsafe.Pointer(&name)) + KeepAlive(name) + return +} +func open_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func nanotime1() int64 { + var r struct { + t int64 // raw timer + numer, denom uint32 // conversion factors. nanoseconds = t * numer / denom. + } + libcCall(unsafe.Pointer(abi.FuncPCABI0(nanotime_trampoline)), unsafe.Pointer(&r)) + // Note: Apple seems unconcerned about overflow here. See + // https://developer.apple.com/library/content/qa/qa1398/_index.html + // Note also, numer == denom == 1 is common. 
+ t := r.t + if r.numer != 1 { + t *= int64(r.numer) + } + if r.denom != 1 { + t /= int64(r.denom) + } + return t +} +func nanotime_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func walltime() (int64, int32) { + var t timespec + libcCall(unsafe.Pointer(abi.FuncPCABI0(walltime_trampoline)), unsafe.Pointer(&t)) + return t.tv_sec, int32(t.tv_nsec) +} +func walltime_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigaction(sig uint32, new *usigactiont, old *usigactiont) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(sigaction_trampoline)), unsafe.Pointer(&sig)) + KeepAlive(new) + KeepAlive(old) +} +func sigaction_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigprocmask(how uint32, new *sigset, old *sigset) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(sigprocmask_trampoline)), unsafe.Pointer(&how)) + KeepAlive(new) + KeepAlive(old) +} +func sigprocmask_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigaltstack(new *stackt, old *stackt) { + if new != nil && new.ss_flags&_SS_DISABLE != 0 && new.ss_size == 0 { + // Despite the fact that Darwin's sigaltstack man page says it ignores the size + // when SS_DISABLE is set, it doesn't. sigaltstack returns ENOMEM + // if we don't give it a reasonable size. 
+ // ref: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140421/214296.html + new.ss_size = 32768 + } + libcCall(unsafe.Pointer(abi.FuncPCABI0(sigaltstack_trampoline)), unsafe.Pointer(&new)) + KeepAlive(new) + KeepAlive(old) +} +func sigaltstack_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func raiseproc(sig uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(raiseproc_trampoline)), unsafe.Pointer(&sig)) +} +func raiseproc_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func setitimer(mode int32, new, old *itimerval) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(setitimer_trampoline)), unsafe.Pointer(&mode)) + KeepAlive(new) + KeepAlive(old) +} +func setitimer_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sysctl(mib *uint32, miblen uint32, oldp *byte, oldlenp *uintptr, newp *byte, newlen uintptr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(sysctl_trampoline)), unsafe.Pointer(&mib)) + KeepAlive(mib) + KeepAlive(oldp) + KeepAlive(oldlenp) + KeepAlive(newp) + return ret +} +func sysctl_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sysctlbyname(name *byte, oldp *byte, oldlenp *uintptr, newp *byte, newlen uintptr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(sysctlbyname_trampoline)), unsafe.Pointer(&name)) + KeepAlive(name) + KeepAlive(oldp) + KeepAlive(oldlenp) + KeepAlive(newp) + return ret +} +func sysctlbyname_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) { + args := struct { + fd, cmd, arg int32 + ret, errno int32 + }{fd, cmd, arg, 0, 0} + libcCall(unsafe.Pointer(abi.FuncPCABI0(fcntl_trampoline)), unsafe.Pointer(&args)) + return args.ret, args.errno +} +func fcntl_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func kqueue() int32 { + v := libcCall(unsafe.Pointer(abi.FuncPCABI0(kqueue_trampoline)), nil) + return v +} +func kqueue_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func kevent(kq int32, ch *keventt, nch int32, ev 
*keventt, nev int32, ts *timespec) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(kevent_trampoline)), unsafe.Pointer(&kq)) + KeepAlive(ch) + KeepAlive(ev) + KeepAlive(ts) + return ret +} +func kevent_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_mutex_init(m *pthreadmutex, attr *pthreadmutexattr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_mutex_init_trampoline)), unsafe.Pointer(&m)) + KeepAlive(m) + KeepAlive(attr) + return ret +} +func pthread_mutex_init_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_mutex_lock(m *pthreadmutex) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_mutex_lock_trampoline)), unsafe.Pointer(&m)) + KeepAlive(m) + return ret +} +func pthread_mutex_lock_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_mutex_unlock(m *pthreadmutex) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_mutex_unlock_trampoline)), unsafe.Pointer(&m)) + KeepAlive(m) + return ret +} +func pthread_mutex_unlock_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_cond_init(c *pthreadcond, attr *pthreadcondattr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_cond_init_trampoline)), unsafe.Pointer(&c)) + KeepAlive(c) + KeepAlive(attr) + return ret +} +func pthread_cond_init_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_cond_wait(c *pthreadcond, m *pthreadmutex) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_cond_wait_trampoline)), unsafe.Pointer(&c)) + KeepAlive(c) + KeepAlive(m) + return ret +} +func pthread_cond_wait_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_cond_timedwait_relative_np(c *pthreadcond, m *pthreadmutex, t *timespec) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_cond_timedwait_relative_np_trampoline)), unsafe.Pointer(&c)) + KeepAlive(c) + KeepAlive(m) + KeepAlive(t) + return ret +} +func pthread_cond_timedwait_relative_np_trampoline() + 
+//go:nosplit +//go:cgo_unsafe_args +func pthread_cond_signal(c *pthreadcond) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_cond_signal_trampoline)), unsafe.Pointer(&c)) + KeepAlive(c) + return ret +} +func pthread_cond_signal_trampoline() + +// Not used on Darwin, but must be defined. +func exitThread(wait *atomic.Uint32) { + throw("exitThread") +} + +//go:nosplit +func closeonexec(fd int32) { + fcntl(fd, _F_SETFD, _FD_CLOEXEC) +} + +//go:nosplit +func setNonblock(fd int32) { + flags, _ := fcntl(fd, _F_GETFL, 0) + if flags != -1 { + fcntl(fd, _F_SETFL, flags|_O_NONBLOCK) + } +} + +func issetugid() int32 { + return libcCall(unsafe.Pointer(abi.FuncPCABI0(issetugid_trampoline)), nil) +} +func issetugid_trampoline() + +// Tell the linker that the libc_* functions are to be found +// in a system library, with the libc_ prefix missing. + +//go:cgo_import_dynamic libc_pthread_attr_init pthread_attr_init "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_attr_getstacksize pthread_attr_getstacksize "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_attr_setdetachstate pthread_attr_setdetachstate "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_create pthread_create "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_self pthread_self "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_kill pthread_kill "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_exit _exit "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_raise raise "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_open open "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_close close "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_read read "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_write write "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pipe pipe "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_mmap mmap 
"/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_munmap munmap "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_madvise madvise "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_mlock mlock "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_error __error "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_usleep usleep "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_mach_timebase_info mach_timebase_info "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_mach_absolute_time mach_absolute_time "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_clock_gettime clock_gettime "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_sigaction sigaction "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_sigmask pthread_sigmask "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_sigaltstack sigaltstack "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_getpid getpid "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_kill kill "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_setitimer setitimer "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_sysctl sysctl "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_sysctlbyname sysctlbyname "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_fcntl fcntl "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_kqueue kqueue "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_kevent kevent "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_pthread_mutex_init pthread_mutex_init "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_mutex_lock pthread_mutex_lock "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_mutex_unlock pthread_mutex_unlock "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_cond_init pthread_cond_init "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_cond_wait pthread_cond_wait 
"/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_cond_timedwait_relative_np pthread_cond_timedwait_relative_np "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_cond_signal pthread_cond_signal "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_notify_is_valid_token notify_is_valid_token "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_xpc_date_create_from_current xpc_date_create_from_current "/usr/lib/libSystem.B.dylib" + +//go:cgo_import_dynamic libc_issetugid issetugid "/usr/lib/libSystem.B.dylib" diff --git a/src/runtime/sys_darwin_amd64.s b/src/runtime/sys_darwin_amd64.s new file mode 100644 index 0000000..de7ecdf --- /dev/null +++ b/src/runtime/sys_darwin_amd64.s @@ -0,0 +1,952 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// System calls and other sys.stuff for AMD64, Darwin +// System calls are implemented in libSystem, this file contains +// trampolines that convert from Go to C calling convention. 
+ +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +#define CLOCK_REALTIME 0 + +// Exit the entire program (like C exit) +TEXT runtime·exit_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 exit status + CALL libc_exit(SB) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 flags + MOVL 12(DI), DX // arg 3 mode + MOVQ 0(DI), DI // arg 1 pathname + XORL AX, AX // vararg: say "no float args" + CALL libc_open(SB) + POPQ BP + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 fd + CALL libc_close(SB) + POPQ BP + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 buf + MOVL 16(DI), DX // arg 3 count + MOVL 0(DI), DI // arg 1 fd + CALL libc_read(SB) + TESTL AX, AX + JGE noerr + CALL libc_error(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno value +noerr: + POPQ BP + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 buf + MOVL 16(DI), DX // arg 3 count + MOVQ 0(DI), DI // arg 1 fd + CALL libc_write(SB) + TESTL AX, AX + JGE noerr + CALL libc_error(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno value +noerr: + POPQ BP + RET + +TEXT runtime·pipe_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + CALL libc_pipe(SB) // pointer already in DI + TESTL AX, AX + JEQ 3(PC) + CALL libc_error(SB) // return negative errno value + NEGL AX + POPQ BP + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 which + CALL libc_setitimer(SB) + POPQ BP + RET + +TEXT runtime·madvise_trampoline(SB), NOSPLIT, $0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 len + MOVL 16(DI), DX // arg 3 advice + MOVQ 0(DI), DI // arg 1 addr + CALL 
libc_madvise(SB) + // ignore failure - maybe pages are locked + POPQ BP + RET + +TEXT runtime·mlock_trampoline(SB), NOSPLIT, $0 + UNDEF // unimplemented + +GLOBL timebase<>(SB),NOPTR,$(machTimebaseInfo__size) + +TEXT runtime·nanotime_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ DI, BX + CALL libc_mach_absolute_time(SB) + MOVQ AX, 0(BX) + MOVL timebase<>+machTimebaseInfo_numer(SB), SI + MOVL timebase<>+machTimebaseInfo_denom(SB), DI // atomic read + TESTL DI, DI + JNE initialized + + SUBQ $(machTimebaseInfo__size+15)/16*16, SP + MOVQ SP, DI + CALL libc_mach_timebase_info(SB) + MOVL machTimebaseInfo_numer(SP), SI + MOVL machTimebaseInfo_denom(SP), DI + ADDQ $(machTimebaseInfo__size+15)/16*16, SP + + MOVL SI, timebase<>+machTimebaseInfo_numer(SB) + MOVL DI, AX + XCHGL AX, timebase<>+machTimebaseInfo_denom(SB) // atomic write + +initialized: + MOVL SI, 8(BX) + MOVL DI, 12(BX) + MOVQ BP, SP + POPQ BP + RET + +TEXT runtime·walltime_trampoline(SB),NOSPLIT,$0 + PUSHQ BP // make a frame; keep stack aligned + MOVQ SP, BP + MOVQ DI, SI // arg 2 timespec + MOVL $CLOCK_REALTIME, DI // arg 1 clock_id + CALL libc_clock_gettime(SB) + POPQ BP + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 sig + CALL libc_sigaction(SB) + TESTL AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 how + CALL libc_pthread_sigmask(SB) + TESTL AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 old + MOVQ 0(DI), DI // arg 1 new + CALL libc_sigaltstack(SB) + TESTQ AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + PUSHQ 
BP + MOVQ SP, BP + MOVL 0(DI), BX // signal + CALL libc_getpid(SB) + MOVL AX, DI // arg 1 pid + MOVL BX, SI // arg 2 signal + CALL libc_kill(SB) + POPQ BP + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// This is the function registered during sigaction and is invoked when +// a signal is received. It just redirects to the Go function sigtrampgo. +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Called using C ABI. +TEXT runtime·sigprofNonGoWrapper<>(SB),NOSPLIT,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Call into the Go signal handler + NOP SP // disable vet stack checking + ADJSP $24 + MOVL DI, 0(SP) // sig + MOVQ SI, 8(SP) // info + MOVQ DX, 16(SP) // ctx + CALL ·sigprofNonGo(SB) + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Used instead of sigtramp in programs that use cgo. +// Arguments from kernel are in DI, SI, DX. +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + // If no traceback function, do usual sigtramp. + MOVQ runtime·cgoTraceback(SB), AX + TESTQ AX, AX + JZ sigtramp + + // If no traceback support function, which means that + // runtime/cgo was not linked in, do usual sigtramp. + MOVQ _cgo_callers(SB), AX + TESTQ AX, AX + JZ sigtramp + + // Figure out if we are currently in a cgo call. + // If not, just do usual sigtramp. 
+ get_tls(CX) + MOVQ g(CX),AX + TESTQ AX, AX + JZ sigtrampnog // g == nil + MOVQ g_m(AX), AX + TESTQ AX, AX + JZ sigtramp // g.m == nil + MOVL m_ncgo(AX), CX + TESTL CX, CX + JZ sigtramp // g.m.ncgo == 0 + MOVQ m_curg(AX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg == nil + MOVQ g_syscallsp(CX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg.syscallsp == 0 + MOVQ m_cgoCallers(AX), R8 + TESTQ R8, R8 + JZ sigtramp // g.m.cgoCallers == nil + MOVL m_cgoCallersUse(AX), CX + TESTL CX, CX + JNZ sigtramp // g.m.cgoCallersUse != 0 + + // Jump to a function in runtime/cgo. + // That function, written in C, will call the user's traceback + // function with proper unwind info, and will then call back here. + // The first three arguments, and the fifth, are already in registers. + // Set the two remaining arguments now. + MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigtramp(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +sigtramp: + JMP runtime·sigtramp(SB) + +sigtrampnog: + // Signal arrived on a non-Go thread. If this is SIGPROF, get a + // stack trace. + CMPL DI, $27 // 27 == SIGPROF + JNZ sigtramp + + // Lock sigprofCallersUse. + MOVL $0, AX + MOVL $1, CX + MOVQ $runtime·sigprofCallersUse(SB), R11 + LOCK + CMPXCHGL CX, 0(R11) + JNZ sigtramp // Skip stack trace if already locked. + + // Jump to the traceback function in runtime/cgo. + // It will call back to sigprofNonGo, via sigprofNonGoWrapper, to convert + // the arguments to the Go calling convention. + // First three arguments to traceback function are in registers already. 
+ MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigprofCallers(SB), R8 + MOVQ $runtime·sigprofNonGoWrapper<>(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + PUSHQ BP // make a frame; keep stack aligned + MOVQ SP, BP + MOVQ DI, BX + MOVQ 0(BX), DI // arg 1 addr + MOVQ 8(BX), SI // arg 2 len + MOVL 16(BX), DX // arg 3 prot + MOVL 20(BX), CX // arg 4 flags + MOVL 24(BX), R8 // arg 5 fid + MOVL 28(BX), R9 // arg 6 offset + CALL libc_mmap(SB) + XORL DX, DX + CMPQ AX, $-1 + JNE ok + CALL libc_error(SB) + MOVLQSX (AX), DX // errno + XORL AX, AX +ok: + MOVQ AX, 32(BX) + MOVQ DX, 40(BX) + POPQ BP + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 len + MOVQ 0(DI), DI // arg 1 addr + CALL libc_munmap(SB) + TESTQ AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 usec + CALL libc_usleep(SB) + POPQ BP + RET + +TEXT runtime·settls(SB),NOSPLIT,$32 + // Nothing to do on Darwin, pthread already set thread-local storage up. 
+ RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 miblen + MOVQ 16(DI), DX // arg 3 oldp + MOVQ 24(DI), CX // arg 4 oldlenp + MOVQ 32(DI), R8 // arg 5 newp + MOVQ 40(DI), R9 // arg 6 newlen + MOVQ 0(DI), DI // arg 1 mib + CALL libc_sysctl(SB) + POPQ BP + RET + +TEXT runtime·sysctlbyname_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 oldp + MOVQ 16(DI), DX // arg 3 oldlenp + MOVQ 24(DI), CX // arg 4 newp + MOVQ 32(DI), R8 // arg 5 newlen + MOVQ 0(DI), DI // arg 1 name + CALL libc_sysctlbyname(SB) + POPQ BP + RET + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + CALL libc_kqueue(SB) + POPQ BP + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 keventt + MOVL 16(DI), DX // arg 3 nch + MOVQ 24(DI), CX // arg 4 ev + MOVL 32(DI), R8 // arg 5 nev + MOVQ 40(DI), R9 // arg 6 ts + MOVL 0(DI), DI // arg 1 kq + CALL libc_kevent(SB) + CMPL AX, $-1 + JNE ok + CALL libc_error(SB) + MOVLQSX (AX), AX // errno + NEGQ AX // caller wants it as a negative error code +ok: + POPQ BP + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ DI, BX + MOVL 0(BX), DI // arg 1 fd + MOVL 4(BX), SI // arg 2 cmd + MOVL 8(BX), DX // arg 3 arg + XORL AX, AX // vararg: say "no float args" + CALL libc_fcntl(SB) + XORL DX, DX + CMPQ AX, $-1 + JNE noerr + CALL libc_error(SB) + MOVL (AX), DX + MOVL $-1, AX +noerr: + MOVL AX, 12(BX) + MOVL DX, 16(BX) + POPQ BP + RET + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. +// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$0 + // DI points to the m. + // We are already on m's g0 stack. + + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + MOVQ m_g0(DI), DX // g + + // Initialize TLS entry. 
+ // See cmd/link/internal/ld/sym.go:computeTLSOffset. + MOVQ DX, 0x30(GS) + + CALL runtime·mstart(SB) + + POP_REGS_HOST_TO_ABI0() + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). + XORL AX, AX + RET + +// These trampolines help convert from Go calling convention to C calling convention. +// They should be called with asmcgocall. +// A pointer to the arguments is passed in DI. +// A single int32 result is returned in AX. +// (For more results, make an args/results structure.) +TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + PUSHQ BP // make frame, keep stack 16-byte aligned. + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 attr + CALL libc_pthread_attr_init(SB) + POPQ BP + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 size + MOVQ 0(DI), DI // arg 1 attr + CALL libc_pthread_attr_getstacksize(SB) + POPQ BP + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 state + MOVQ 0(DI), DI // arg 1 attr + CALL libc_pthread_attr_setdetachstate(SB) + POPQ BP + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ 0(DI), SI // arg 2 attr + MOVQ 8(DI), DX // arg 3 start + MOVQ 16(DI), CX // arg 4 arg + MOVQ SP, DI // arg 1 &threadid (which we throw away) + CALL libc_pthread_create(SB) + MOVQ BP, SP + POPQ BP + RET + +TEXT runtime·raise_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 signal + CALL libc_raise(SB) + POPQ BP + RET + +TEXT runtime·pthread_mutex_init_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 attr + MOVQ 0(DI), DI // arg 1 mutex + CALL libc_pthread_mutex_init(SB) + POPQ BP + RET + +TEXT runtime·pthread_mutex_lock_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 mutex + 
CALL libc_pthread_mutex_lock(SB) + POPQ BP + RET + +TEXT runtime·pthread_mutex_unlock_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 mutex + CALL libc_pthread_mutex_unlock(SB) + POPQ BP + RET + +TEXT runtime·pthread_cond_init_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 attr + MOVQ 0(DI), DI // arg 1 cond + CALL libc_pthread_cond_init(SB) + POPQ BP + RET + +TEXT runtime·pthread_cond_wait_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 mutex + MOVQ 0(DI), DI // arg 1 cond + CALL libc_pthread_cond_wait(SB) + POPQ BP + RET + +TEXT runtime·pthread_cond_timedwait_relative_np_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 mutex + MOVQ 16(DI), DX // arg 3 timeout + MOVQ 0(DI), DI // arg 1 cond + CALL libc_pthread_cond_timedwait_relative_np(SB) + POPQ BP + RET + +TEXT runtime·pthread_cond_signal_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 cond + CALL libc_pthread_cond_signal(SB) + POPQ BP + RET + +TEXT runtime·pthread_self_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ DI, BX // BX is caller-save + CALL libc_pthread_self(SB) + MOVQ AX, 0(BX) // return value + POPQ BP + RET + +TEXT runtime·pthread_kill_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 sig + MOVQ 0(DI), DI // arg 1 thread + CALL libc_pthread_kill(SB) + POPQ BP + RET + +TEXT runtime·osinit_hack_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ $0, DI // arg 1 val + CALL libc_notify_is_valid_token(SB) + CALL libc_xpc_date_create_from_current(SB) + POPQ BP + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). 
+// +// syscall expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), CX // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL CX + + MOVQ (SP), DI + MOVQ AX, (4*8)(DI) // r1 + MOVQ DX, (5*8)(DI) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPL AX, $-1 // Note: high 32 bits are junk + JNE ok + + // Get error code from libc. + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (6*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscallX is like syscall but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscallX(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), CX // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL CX + + MOVQ (SP), DI + MOVQ AX, (4*8)(DI) // r1 + MOVQ DX, (5*8)(DI) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPQ AX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (6*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscallPtr is like syscallX except that the libc function reports an +// error by returning NULL and setting errno. 
+TEXT runtime·syscallPtr(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), CX // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL CX + + MOVQ (SP), DI + MOVQ AX, (4*8)(DI) // r1 + MOVQ DX, (5*8)(DI) // r2 + + // syscallPtr libc functions return NULL on error + // and set errno. + TESTQ AX, AX + JNE ok + + // Get error code from libc. + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (6*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall6 calls a function in libc on behalf of the syscall package. +// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall6(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R11// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (SP), DI + MOVQ AX, (7*8)(DI) // r1 + MOVQ DX, (8*8)(DI) // r2 + + CMPL AX, $-1 + JNE ok + + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (9*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall6X calls a function in libc on behalf of the syscall package. 
+// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6X is like syscall6 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscall6X(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R11// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (SP), DI + MOVQ AX, (7*8)(DI) // r1 + MOVQ DX, (8*8)(DI) // r2 + + CMPQ AX, $-1 + JNE ok + + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (9*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall9 calls a function in libc on behalf of the syscall package. +// syscall9 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall9 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall9 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. 
+TEXT runtime·syscall9(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R13// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ (7*8)(DI), R10 // a7 + MOVQ (8*8)(DI), R11 // a8 + MOVQ (9*8)(DI), R12 // a9 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R13 + + MOVQ (SP), DI + MOVQ AX, (10*8)(DI) // r1 + MOVQ DX, (11*8)(DI) // r2 + + CMPL AX, $-1 + JNE ok + + CALL libc_error(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (12*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall_x509 is for crypto/x509. It is like syscall6 but does not check for errors, +// takes 5 uintptrs and 1 float64, and only returns one value, +// for use with standard C ABI functions. +TEXT runtime·syscall_x509(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R11// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), X0 // f1 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (SP), DI + MOVQ AX, (7*8)(DI) // r1 + + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + CALL libc_issetugid(SB) + POPQ BP + RET diff --git a/src/runtime/sys_darwin_arm64.go b/src/runtime/sys_darwin_arm64.go new file mode 100644 index 0000000..6170f4f --- /dev/null +++ b/src/runtime/sys_darwin_arm64.go @@ -0,0 +1,65 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +// libc function wrappers. Must run on system stack. 
+ +//go:nosplit +//go:cgo_unsafe_args +func g0_pthread_key_create(k *pthreadkey, destructor uintptr) int32 { + ret := asmcgocall(unsafe.Pointer(abi.FuncPCABI0(pthread_key_create_trampoline)), unsafe.Pointer(&k)) + KeepAlive(k) + return ret +} +func pthread_key_create_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func g0_pthread_setspecific(k pthreadkey, value uintptr) int32 { + return asmcgocall(unsafe.Pointer(abi.FuncPCABI0(pthread_setspecific_trampoline)), unsafe.Pointer(&k)) +} +func pthread_setspecific_trampoline() + +//go:cgo_import_dynamic libc_pthread_key_create pthread_key_create "/usr/lib/libSystem.B.dylib" +//go:cgo_import_dynamic libc_pthread_setspecific pthread_setspecific "/usr/lib/libSystem.B.dylib" + +// tlsinit allocates a thread-local storage slot for g. +// +// It finds the first available slot using pthread_key_create and uses +// it as the offset value for runtime.tlsg. +// +// This runs at startup on g0 stack, but before g is set, so it must +// not split stack (transitively). g is expected to be nil, so things +// (e.g. asmcgocall) will skip saving or reading g. +// +//go:nosplit +func tlsinit(tlsg *uintptr, tlsbase *[_PTHREAD_KEYS_MAX]uintptr) { + var k pthreadkey + err := g0_pthread_key_create(&k, 0) + if err != 0 { + abort() + } + + const magic = 0xc476c475c47957 + err = g0_pthread_setspecific(k, magic) + if err != 0 { + abort() + } + + for i, x := range tlsbase { + if x == magic { + *tlsg = uintptr(i * goarch.PtrSize) + g0_pthread_setspecific(k, 0) + return + } + } + abort() +} diff --git a/src/runtime/sys_darwin_arm64.s b/src/runtime/sys_darwin_arm64.s new file mode 100644 index 0000000..dc6caf8 --- /dev/null +++ b/src/runtime/sys_darwin_arm64.s @@ -0,0 +1,769 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// System calls and other sys.stuff for ARM64, Darwin +// System calls are implemented in libSystem, this file contains +// trampolines that convert from Go to C calling convention. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_arm64.h" + +#define CLOCK_REALTIME 0 + +TEXT notok<>(SB),NOSPLIT,$0 + MOVD $0, R8 + MOVD R8, (R8) + B 0(PC) + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + SUB $16, RSP + MOVW 8(R0), R1 // arg 2 flags + MOVW 12(R0), R2 // arg 3 mode + MOVW R2, (RSP) // arg 3 is variadic, pass on stack + MOVD 0(R0), R0 // arg 1 pathname + BL libc_open(SB) + ADD $16, RSP + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + MOVW 0(R0), R0 // arg 1 fd + BL libc_close(SB) + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 buf + MOVW 16(R0), R2 // arg 3 count + MOVW 0(R0), R0 // arg 1 fd + BL libc_write(SB) + MOVD $-1, R1 + CMP R0, R1 + BNE noerr + BL libc_error(SB) + MOVW (R0), R0 + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 buf + MOVW 16(R0), R2 // arg 3 count + MOVW 0(R0), R0 // arg 1 fd + BL libc_read(SB) + MOVD $-1, R1 + CMP R0, R1 + BNE noerr + BL libc_error(SB) + MOVW (R0), R0 + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·pipe_trampoline(SB),NOSPLIT,$0 + BL libc_pipe(SB) // pointer already in R0 + CMP $0, R0 + BEQ 3(PC) + BL libc_error(SB) // return negative errno value + NEG R0, R0 + RET + +TEXT runtime·exit_trampoline(SB),NOSPLIT|NOFRAME,$0 + MOVW 0(R0), R0 + BL libc_exit(SB) + MOVD $1234, R0 + MOVD $1002, R1 + MOVD R0, (R1) // fail hard + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R19 // signal + BL libc_getpid(SB) + // arg 1 pid already in R0 from getpid + MOVD R19, R1 // arg 2 signal + BL libc_kill(SB) + RET + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 + MOVD 0(R19), R0 // arg 1 addr + MOVD 8(R19), R1 // arg 2 
len + MOVW 16(R19), R2 // arg 3 prot + MOVW 20(R19), R3 // arg 4 flags + MOVW 24(R19), R4 // arg 5 fd + MOVW 28(R19), R5 // arg 6 off + BL libc_mmap(SB) + MOVD $0, R1 + MOVD $-1, R2 + CMP R0, R2 + BNE ok + BL libc_error(SB) + MOVW (R0), R1 + MOVD $0, R0 +ok: + MOVD R0, 32(R19) // ret 1 p + MOVD R1, 40(R19) // ret 2 err + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 len + MOVD 0(R0), R0 // arg 1 addr + BL libc_munmap(SB) + CMP $0, R0 + BEQ 2(PC) + BL notok<>(SB) + RET + +TEXT runtime·madvise_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 len + MOVW 16(R0), R2 // arg 3 advice + MOVD 0(R0), R0 // arg 1 addr + BL libc_madvise(SB) + RET + +TEXT runtime·mlock_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 len + MOVD 0(R0), R0 // arg 1 addr + BL libc_mlock(SB) + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 new + MOVD 16(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 which + BL libc_setitimer(SB) + RET + +TEXT runtime·walltime_trampoline(SB),NOSPLIT,$0 + MOVD R0, R1 // arg 2 timespec + MOVW $CLOCK_REALTIME, R0 // arg 1 clock_id + BL libc_clock_gettime(SB) + RET + +GLOBL timebase<>(SB),NOPTR,$(machTimebaseInfo__size) + +TEXT runtime·nanotime_trampoline(SB),NOSPLIT,$40 + MOVD R0, R19 + BL libc_mach_absolute_time(SB) + MOVD R0, 0(R19) + MOVW timebase<>+machTimebaseInfo_numer(SB), R20 + MOVD $timebase<>+machTimebaseInfo_denom(SB), R21 + LDARW (R21), R21 // atomic read + CMP $0, R21 + BNE initialized + + SUB $(machTimebaseInfo__size+15)/16*16, RSP + MOVD RSP, R0 + BL libc_mach_timebase_info(SB) + MOVW machTimebaseInfo_numer(RSP), R20 + MOVW machTimebaseInfo_denom(RSP), R21 + ADD $(machTimebaseInfo__size+15)/16*16, RSP + + MOVW R20, timebase<>+machTimebaseInfo_numer(SB) + MOVD $timebase<>+machTimebaseInfo_denom(SB), R22 + STLRW R21, (R22) // atomic write + +initialized: + MOVW R20, 8(R19) + MOVW R21, 12(R19) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R0 + MOVD info+16(FP), R1 + 
MOVD ctx+24(FP), R2 + MOVD fn+0(FP), R11 + BL (R11) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$176 + // Save callee-save registers in the case of signal forwarding. + // Please refer to https://golang.org/issue/31827 . + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + + // Save arguments. + MOVW R0, (8*1)(RSP) // sig + MOVD R1, (8*2)(RSP) // info + MOVD R2, (8*3)(RSP) // ctx + + // this might be called in external code context, + // where g is not set. + BL runtime·load_g(SB) + +#ifdef GOOS_ios + MOVD RSP, R6 + CMP $0, g + BEQ nog + // iOS always use the main stack to run the signal handler. + // We need to switch to gsignal ourselves. + MOVD g_m(g), R11 + MOVD m_gsignal(R11), R5 + MOVD (g_stack+stack_hi)(R5), R6 + +nog: + // Restore arguments. + MOVW (8*1)(RSP), R0 + MOVD (8*2)(RSP), R1 + MOVD (8*3)(RSP), R2 + + // Reserve space for args and the stack pointer on the + // gsignal stack. + SUB $48, R6 + // Save stack pointer. + MOVD RSP, R4 + MOVD R4, (8*4)(R6) + // Switch to gsignal stack. + MOVD R6, RSP + + // Save arguments. + MOVW R0, (8*1)(RSP) + MOVD R1, (8*2)(RSP) + MOVD R2, (8*3)(RSP) +#endif + + // Call sigtrampgo. + MOVD $runtime·sigtrampgo(SB), R11 + BL (R11) + +#ifdef GOOS_ios + // Switch to old stack. + MOVD (8*4)(RSP), R5 + MOVD R5, RSP +#endif + + // Restore callee-save registers. 
+ RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + JMP runtime·sigtramp(SB) + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 new + MOVD 16(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 how + BL libc_pthread_sigmask(SB) + CMP $0, R0 + BEQ 2(PC) + BL notok<>(SB) + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 new + MOVD 16(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 how + BL libc_sigaction(SB) + CMP $0, R0 + BEQ 2(PC) + BL notok<>(SB) + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + MOVW 0(R0), R0 // arg 1 usec + BL libc_usleep(SB) + RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 miblen + MOVD 16(R0), R2 // arg 3 oldp + MOVD 24(R0), R3 // arg 4 oldlenp + MOVD 32(R0), R4 // arg 5 newp + MOVD 40(R0), R5 // arg 6 newlen + MOVD 0(R0), R0 // arg 1 mib + BL libc_sysctl(SB) + RET + +TEXT runtime·sysctlbyname_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 oldp + MOVD 16(R0), R2 // arg 3 oldlenp + MOVD 24(R0), R3 // arg 4 newp + MOVD 32(R0), R4 // arg 5 newlen + MOVD 0(R0), R0 // arg 1 name + BL libc_sysctlbyname(SB) + RET + + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + BL libc_kqueue(SB) + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 keventt + MOVW 16(R0), R2 // arg 3 nch + MOVD 24(R0), R3 // arg 4 ev + MOVW 32(R0), R4 // arg 5 nev + MOVD 40(R0), R5 // arg 6 ts + MOVW 0(R0), R0 // arg 1 kq + BL libc_kevent(SB) + MOVD $-1, R2 + CMP R0, R2 + BNE ok + BL libc_error(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller wants it as a negative error code +ok: + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + SUB $16, RSP + MOVD R0, R19 + MOVW 0(R19), R0 // arg 1 fd + MOVW 4(R19), R1 // arg 2 cmd + MOVW 8(R19), R2 // arg 3 arg + MOVW R2, (RSP) // arg 3 is variadic, pass on stack + BL libc_fcntl(SB) + MOVD $0, R1 + MOVD $-1, R2 + CMP R0, R2 + BNE noerr + BL libc_error(SB) + 
MOVW (R0), R1 + MOVW $-1, R0 +noerr: + MOVW R0, 12(R19) + MOVW R1, 16(R19) + ADD $16, RSP + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 +#ifdef GOOS_ios + // sigaltstack on iOS is not supported and will always + // run the signal handler on the main stack, so our sigtramp has + // to do the stack switch ourselves. + MOVW $43, R0 + BL libc_exit(SB) +#else + MOVD 8(R0), R1 // arg 2 old + MOVD 0(R0), R0 // arg 1 new + CALL libc_sigaltstack(SB) + CBZ R0, 2(PC) + BL notok<>(SB) +#endif + RET + +// Thread related functions + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. +// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$160 + // R0 points to the m. + // We are already on m's g0 stack. + + // Save callee-save registers. + SAVE_R19_TO_R28(8) + SAVE_F8_TO_F15(88) + + MOVD m_g0(R0), g + BL ·save_g(SB) + + BL runtime·mstart(SB) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8) + RESTORE_F8_TO_F15(88) + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). 
+ MOVD $0, R0 + + RET + +TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 attr + BL libc_pthread_attr_init(SB) + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 size + MOVD 0(R0), R0 // arg 1 attr + BL libc_pthread_attr_getstacksize(SB) + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 state + MOVD 0(R0), R0 // arg 1 attr + BL libc_pthread_attr_setdetachstate(SB) + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + SUB $16, RSP + MOVD 0(R0), R1 // arg 2 state + MOVD 8(R0), R2 // arg 3 start + MOVD 16(R0), R3 // arg 4 arg + MOVD RSP, R0 // arg 1 &threadid (which we throw away) + BL libc_pthread_create(SB) + ADD $16, RSP + RET + +TEXT runtime·raise_trampoline(SB),NOSPLIT,$0 + MOVW 0(R0), R0 // arg 1 sig + BL libc_raise(SB) + RET + +TEXT runtime·pthread_mutex_init_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 attr + MOVD 0(R0), R0 // arg 1 mutex + BL libc_pthread_mutex_init(SB) + RET + +TEXT runtime·pthread_mutex_lock_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 mutex + BL libc_pthread_mutex_lock(SB) + RET + +TEXT runtime·pthread_mutex_unlock_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 mutex + BL libc_pthread_mutex_unlock(SB) + RET + +TEXT runtime·pthread_cond_init_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 attr + MOVD 0(R0), R0 // arg 1 cond + BL libc_pthread_cond_init(SB) + RET + +TEXT runtime·pthread_cond_wait_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 mutex + MOVD 0(R0), R0 // arg 1 cond + BL libc_pthread_cond_wait(SB) + RET + +TEXT runtime·pthread_cond_timedwait_relative_np_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 mutex + MOVD 16(R0), R2 // arg 3 timeout + MOVD 0(R0), R0 // arg 1 cond + BL libc_pthread_cond_timedwait_relative_np(SB) + RET + +TEXT runtime·pthread_cond_signal_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 cond + BL libc_pthread_cond_signal(SB) + RET 
+ +TEXT runtime·pthread_self_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 // R19 is callee-save + BL libc_pthread_self(SB) + MOVD R0, 0(R19) // return value + RET + +TEXT runtime·pthread_kill_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 sig + MOVD 0(R0), R0 // arg 1 thread + BL libc_pthread_kill(SB) + RET + +TEXT runtime·pthread_key_create_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 destructor + MOVD 0(R0), R0 // arg 1 *key + BL libc_pthread_key_create(SB) + RET + +TEXT runtime·pthread_setspecific_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 value + MOVD 0(R0), R0 // arg 1 key + BL libc_pthread_setspecific(SB) + RET + +TEXT runtime·osinit_hack_trampoline(SB),NOSPLIT,$0 + MOVD $0, R0 // arg 1 val + BL libc_notify_is_valid_token(SB) + BL libc_xpc_date_create_from_current(SB) + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscall(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, 8(RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 8(R0), R0 // a1 + + // If fn is declared as vararg, we have to pass the vararg arguments on the stack. + // (Because ios decided not to adhere to the standard arm64 calling convention, sigh...) + // The only libSystem calls we support that are vararg are open, fcntl, and ioctl, + // which are all of the form fn(x, y, ...). So we just need to put the 3rd arg + // on the stack as well. + // If we ever have other vararg libSystem calls, we might need to handle more cases. 
+ MOVD R2, (RSP) + + BL (R12) + + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 32(R2) // save r1 + MOVD R1, 40(R2) // save r2 + CMPW $-1, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, 8(RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 48(R2) // save err +ok: + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscallX(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, (RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 8(R0), R0 // a1 + BL (R12) + + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 32(R2) // save r1 + MOVD R1, 40(R2) // save r2 + CMP $-1, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, (RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 48(R2) // save err +ok: + RET + +// syscallPtr is like syscallX except that the libc function reports an +// error by returning NULL and setting errno. +TEXT runtime·syscallPtr(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, (RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 8(R0), R0 // a1 + BL (R12) + + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 32(R2) // save r1 + MOVD R1, 40(R2) // save r2 + CMP $0, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, (RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 48(R2) // save err +ok: + RET + +// syscall6 calls a function in libc on behalf of the syscall package. 
+// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscall6(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, 8(RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 32(R0), R3 // a4 + MOVD 40(R0), R4 // a5 + MOVD 48(R0), R5 // a6 + MOVD 8(R0), R0 // a1 + + // If fn is declared as vararg, we have to pass the vararg arguments on the stack. + // See syscall above. The only function this applies to is openat, for which the 4th + // arg must be on the stack. + MOVD R3, (RSP) + + BL (R12) + + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 56(R2) // save r1 + MOVD R1, 64(R2) // save r2 + CMPW $-1, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, 8(RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 72(R2) // save err +ok: + RET + +// syscall6X calls a function in libc on behalf of the syscall package. +// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). 
+TEXT runtime·syscall6X(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, (RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 32(R0), R3 // a4 + MOVD 40(R0), R4 // a5 + MOVD 48(R0), R5 // a6 + MOVD 8(R0), R0 // a1 + BL (R12) + + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 56(R2) // save r1 + MOVD R1, 64(R2) // save r2 + CMP $-1, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, (RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 72(R2) // save err +ok: + RET + +// syscall9 calls a function in libc on behalf of the syscall package. +// syscall9 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall9 must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscall9(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, 8(RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 32(R0), R3 // a4 + MOVD 40(R0), R4 // a5 + MOVD 48(R0), R5 // a6 + MOVD 56(R0), R6 // a7 + MOVD 64(R0), R7 // a8 + MOVD 72(R0), R8 // a9 + MOVD 8(R0), R0 // a1 + + // If fn is declared as vararg, we have to pass the vararg arguments on the stack. + // See syscall above. The only function this applies to is openat, for which the 4th + // arg must be on the stack. + MOVD R3, (RSP) + + BL (R12) + + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 80(R2) // save r1 + MOVD R1, 88(R2) // save r2 + CMPW $-1, R0 + BNE ok + SUB $16, RSP // push structure pointer + MOVD R2, 8(RSP) + BL libc_error(SB) + MOVW (R0), R0 + MOVD 8(RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 96(R2) // save err +ok: + RET + +// syscall_x509 is for crypto/x509. 
It is like syscall6 but does not check for errors, +// takes 5 uintptrs and 1 float64, and only returns one value, +// for use with standard C ABI functions. +TEXT runtime·syscall_x509(SB),NOSPLIT,$0 + SUB $16, RSP // push structure pointer + MOVD R0, (RSP) + + MOVD 0(R0), R12 // fn + MOVD 16(R0), R1 // a2 + MOVD 24(R0), R2 // a3 + MOVD 32(R0), R3 // a4 + MOVD 40(R0), R4 // a5 + FMOVD 48(R0), F0 // f1 + MOVD 8(R0), R0 // a1 + BL (R12) + + MOVD (RSP), R2 // pop structure pointer + ADD $16, RSP + MOVD R0, 56(R2) // save r1 + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + BL libc_issetugid(SB) + RET diff --git a/src/runtime/sys_dragonfly_amd64.s b/src/runtime/sys_dragonfly_amd64.s new file mode 100644 index 0000000..08f99ca --- /dev/null +++ b/src/runtime/sys_dragonfly_amd64.s @@ -0,0 +1,423 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for AMD64, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +TEXT runtime·sys_umtx_sleep(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - ptr + MOVL val+8(FP), SI // arg 2 - value + MOVL timeout+12(FP), DX // arg 3 - timeout + MOVL $469, AX // umtx_sleep + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·sys_umtx_wakeup(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - ptr + MOVL val+8(FP), SI // arg 2 - count + MOVL $470, AX // umtx_wakeup + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·lwp_create(SB),NOSPLIT,$0 + MOVQ param+0(FP), DI // arg 1 - params + MOVL $495, AX // lwp_create + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·lwp_start(SB),NOSPLIT,$0 + MOVQ DI, R13 // m + + // set up FS to point at m->tls + LEAQ m_tls(R13), DI + CALL runtime·settls(SB) // smashes DI + + // set up m, g + get_tls(CX) + MOVQ m_g0(R13), DI + MOVQ R13, g_m(DI) + MOVQ DI, g(CX) + + CALL runtime·stackcheck(SB) + CALL runtime·mstart(SB) + + MOVQ 0, AX // crash (not reached) + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-8 + MOVL code+0(FP), DI // arg 1 exit status + MOVL $1, AX + SYSCALL + MOVL $0xf1, 0xf1 // crash + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-8 + MOVQ wait+0(FP), AX + // We're done using the stack. 
+ MOVL $0, (AX) + MOVL $0x10000, DI // arg 1 how - EXTEXIT_LWP + MOVL $0, SI // arg 2 status + MOVL $0, DX // arg 3 addr + MOVL $494, AX // extexit + SYSCALL + MOVL $0xf1, 0xf1 // crash + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$-8 + MOVQ name+0(FP), DI // arg 1 pathname + MOVL mode+8(FP), SI // arg 2 flags + MOVL perm+12(FP), DX // arg 3 mode + MOVL $5, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVL $6, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+8(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVQ p+8(FP), SI // arg 2 buf + MOVL n+16(FP), DX // arg 3 count + MOVL $3, AX + SYSCALL + JCC 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-20 + MOVL $0, DI + // dragonfly expects flags as the 2nd argument + MOVL flags+0(FP), SI + MOVL $538, AX + SYSCALL + JCC pipe2ok + MOVL $-1,r+8(FP) + MOVL $-1,w+12(FP) + MOVL AX, errno+16(FP) + RET +pipe2ok: + MOVL AX, r+8(FP) + MOVL DX, w+12(FP) + MOVL $0, errno+16(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$-8 + MOVQ fd+0(FP), DI // arg 1 fd + MOVQ p+8(FP), SI // arg 2 buf + MOVL n+16(FP), DX // arg 3 count + MOVL $4, AX + SYSCALL + JCC 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +TEXT runtime·lwp_gettid(SB),NOSPLIT,$0-4 + MOVL $496, AX // lwp_gettid + SYSCALL + MOVL AX, ret+0(FP) + RET + +TEXT runtime·lwp_kill(SB),NOSPLIT,$0-16 + MOVL pid+0(FP), DI // arg 1 - pid + MOVL tid+4(FP), SI // arg 2 - tid + MOVQ sig+8(FP), DX // arg 3 - signum + MOVL $497, AX // lwp_kill + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$0 + MOVL $20, AX // getpid + SYSCALL + MOVQ AX, DI // arg 1 - pid + MOVL sig+0(FP), SI // arg 2 - signum + MOVL $37, AX // kill + SYSCALL + RET + +TEXT runtime·setitimer(SB), NOSPLIT, $-8 + MOVL mode+0(FP), DI + MOVQ 
new+8(FP), SI + MOVQ old+16(FP), DX + MOVL $83, AX + SYSCALL + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + MOVL $232, AX // clock_gettime + MOVQ $0, DI // CLOCK_REALTIME + LEAQ 8(SP), SI + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + MOVQ AX, sec+0(FP) + MOVL DX, nsec+8(FP) + RET + +TEXT runtime·nanotime1(SB), NOSPLIT, $32 + MOVL $232, AX + MOVQ $4, DI // CLOCK_MONOTONIC + LEAQ 8(SP), SI + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + // return nsec in AX + IMULQ $1000000000, AX + ADDQ DX, AX + MOVQ AX, ret+0(FP) + RET + +TEXT runtime·sigaction(SB),NOSPLIT,$-8 + MOVL sig+0(FP), DI // arg 1 sig + MOVQ new+8(FP), SI // arg 2 act + MOVQ old+16(FP), DX // arg 3 oact + MOVL $342, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. 
+ NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - addr + MOVQ n+8(FP), SI // arg 2 - len + MOVL prot+16(FP), DX // arg 3 - prot + MOVL flags+20(FP), R10 // arg 4 - flags + MOVL fd+24(FP), R8 // arg 5 - fd + MOVL off+28(FP), R9 + SUBQ $16, SP + MOVQ R9, 8(SP) // arg 7 - offset (passed on stack) + MOVQ $0, R9 // arg 6 - pad + MOVL $197, AX + SYSCALL + JCC ok + ADDQ $16, SP + MOVQ $0, p+32(FP) + MOVQ AX, err+40(FP) + RET +ok: + ADDQ $16, SP + MOVQ AX, p+32(FP) + MOVQ $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 addr + MOVQ n+8(FP), SI // arg 2 len + MOVL $73, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL flags+16(FP), DX + MOVQ $75, AX // madvise + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+24(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVQ new+0(FP), DI + MOVQ old+8(FP), SI + MOVQ $53, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVQ AX, 0(SP) // tv_sec + MOVL $1000, AX + MULL DX + MOVQ AX, 8(SP) // tv_nsec + + MOVQ SP, DI // arg 1 - rqtp + MOVQ $0, SI // arg 2 - rmtp + MOVL $240, AX // sys_nanosleep + SYSCALL + RET + +// set tls base to DI +TEXT runtime·settls(SB),NOSPLIT,$16 + ADDQ $8, DI // adjust for ELF: wants to use -8(FS) for g + MOVQ DI, 0(SP) + MOVQ $16, 8(SP) + MOVQ $0, DI // arg 1 - which + MOVQ SP, SI // arg 2 - tls_info + MOVQ $16, DX // arg 3 - infosize + MOVQ $472, AX // set_tls_area + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVQ mib+0(FP), DI // arg 1 - name + 
MOVL miblen+8(FP), SI // arg 2 - namelen + MOVQ out+16(FP), DX // arg 3 - oldp + MOVQ size+24(FP), R10 // arg 4 - oldlenp + MOVQ dst+32(FP), R8 // arg 5 - newp + MOVQ ndst+40(FP), R9 // arg 6 - newlen + MOVQ $202, AX // sys___sysctl + SYSCALL + JCC 4(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + MOVL $0, AX + MOVL AX, ret+48(FP) + RET + +TEXT runtime·osyield(SB),NOSPLIT,$-4 + MOVL $331, AX // sys_sched_yield + SYSCALL + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVL how+0(FP), DI // arg 1 - how + MOVQ new+8(FP), SI // arg 2 - set + MOVQ old+16(FP), DX // arg 3 - oset + MOVL $340, AX // sys_sigprocmask + SYSCALL + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// int32 runtime·kqueue(void); +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $0, DX + MOVL $362, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout); +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVL kq+0(FP), DI + MOVQ ch+8(FP), SI + MOVL nch+16(FP), DX + MOVQ ev+24(FP), R10 + MOVL nev+32(FP), R8 + MOVQ ts+40(FP), R9 + MOVL $363, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (ret int32, errno int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVL cmd+4(FP), SI // cmd + MOVL arg+8(FP), DX // arg + MOVL $92, AX // fcntl + SYSCALL + JCC noerr + MOVL $-1, ret+16(FP) + MOVL AX, errno+20(FP) + RET +noerr: + MOVL AX, ret+16(FP) + MOVL $0, errno+20(FP) + RET + +// void runtime·closeonexec(int32 fd); +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVQ $2, SI // F_SETFD + MOVQ $1, DX // FD_CLOEXEC + MOVL $92, AX // fcntl + SYSCALL + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $0, DX + MOVL $253, AX + SYSCALL + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_freebsd_386.s b/src/runtime/sys_freebsd_386.s new 
file mode 100644 index 0000000..df0c073 --- /dev/null +++ b/src/runtime/sys_freebsd_386.s @@ -0,0 +1,497 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for 386, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 4 +#define FD_CLOEXEC 1 +#define F_SETFD 2 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_sigaltstack 53 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_setitimer 83 +#define SYS_fcntl 92 +#define SYS_sysarch 165 +#define SYS___sysctl 202 +#define SYS_clock_gettime 232 +#define SYS_nanosleep 240 +#define SYS_issetugid 253 +#define SYS_sched_yield 331 +#define SYS_sigprocmask 340 +#define SYS_kqueue 362 +#define SYS_sigaction 416 +#define SYS_sigreturn 417 +#define SYS_thr_exit 431 +#define SYS_thr_self 432 +#define SYS_thr_kill 433 +#define SYS__umtx_op 454 +#define SYS_thr_new 455 +#define SYS_mmap 477 +#define SYS_cpuset_getaffinity 487 +#define SYS_pipe2 542 +#define SYS_kevent 560 + +TEXT runtime·sys_umtx_op(SB),NOSPLIT,$-4 + MOVL $SYS__umtx_op, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+20(FP) + RET + +TEXT runtime·thr_new(SB),NOSPLIT,$-4 + MOVL $SYS_thr_new, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+8(FP) + RET + +// Called by OS using C ABI. 
+TEXT runtime·thr_start(SB),NOSPLIT,$0 + NOP SP // tell vet SP changed - stop checking offsets + MOVL 4(SP), AX // m + MOVL m_g0(AX), BX + LEAL m_tls(AX), BP + MOVL m_id(AX), DI + ADDL $7, DI + PUSHAL + PUSHL $32 + PUSHL BP + PUSHL DI + CALL runtime·setldt(SB) + POPL AX + POPL AX + POPL AX + POPAL + get_tls(CX) + MOVL BX, g(CX) + + MOVL AX, g_m(BX) + CALL runtime·stackcheck(SB) // smashes AX + CALL runtime·mstart(SB) + + MOVL 0, AX // crash (not reached) + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-4 + MOVL $SYS_exit, AX + INT $0x80 + MOVL $0xf1, 0xf1 // crash + RET + +GLOBL exitStack<>(SB),RODATA,$8 +DATA exitStack<>+0x00(SB)/4, $0 +DATA exitStack<>+0x04(SB)/4, $0 + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVL wait+0(FP), AX + // We're done using the stack. + MOVL $0, (AX) + // thr_exit takes a single pointer argument, which it expects + // on the stack. We want to pass 0, so switch over to a fake + // stack of 0s. It won't write to the stack. 
+ MOVL $exitStack<>(SB), SP + MOVL $SYS_thr_exit, AX + INT $0x80 + MOVL $0xf1, 0xf1 // crash + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$-4 + MOVL $SYS_open, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-4 + MOVL $SYS_close, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+4(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$-4 + MOVL $SYS_read, AX + INT $0x80 + JAE 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$12-16 + MOVL $SYS_pipe2, AX + LEAL r+4(FP), BX + MOVL BX, 4(SP) + MOVL flags+0(FP), BX + MOVL BX, 8(SP) + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, errno+12(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$-4 + MOVL $SYS_write, AX + INT $0x80 + JAE 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+12(FP) + RET + +TEXT runtime·thr_self(SB),NOSPLIT,$8-4 + // thr_self(&0(FP)) + LEAL ret+0(FP), AX + MOVL AX, 4(SP) + MOVL $SYS_thr_self, AX + INT $0x80 + RET + +TEXT runtime·thr_kill(SB),NOSPLIT,$-4 + // thr_kill(tid, sig) + MOVL $SYS_thr_kill, AX + INT $0x80 + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$16 + // getpid + MOVL $SYS_getpid, AX + INT $0x80 + // kill(self, sig) + MOVL AX, 4(SP) + MOVL sig+0(FP), AX + MOVL AX, 8(SP) + MOVL $SYS_kill, AX + INT $0x80 + RET + +TEXT runtime·mmap(SB),NOSPLIT,$32 + LEAL addr+0(FP), SI + LEAL 4(SP), DI + CLD + MOVSL + MOVSL + MOVSL + MOVSL + MOVSL + MOVSL + MOVL $0, AX // top 32 bits of file offset + STOSL + MOVL $SYS_mmap, AX + INT $0x80 + JAE ok + MOVL $0, p+24(FP) + MOVL AX, err+28(FP) + RET +ok: + MOVL AX, p+24(FP) + MOVL $0, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$-4 + MOVL $SYS_munmap, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT,$-4 + MOVL $SYS_madvise, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·setitimer(SB), NOSPLIT, 
$-4 + MOVL $SYS_setitimer, AX + INT $0x80 + RET + +// func fallback_walltime() (sec int64, nsec int32) +TEXT runtime·fallback_walltime(SB), NOSPLIT, $32-12 + MOVL $SYS_clock_gettime, AX + LEAL 12(SP), BX + MOVL $CLOCK_REALTIME, 4(SP) + MOVL BX, 8(SP) + INT $0x80 + MOVL 12(SP), AX // sec + MOVL 16(SP), BX // nsec + + // sec is in AX, nsec in BX + MOVL AX, sec_lo+0(FP) + MOVL $0, sec_hi+4(FP) + MOVL BX, nsec+8(FP) + RET + +// func fallback_nanotime() int64 +TEXT runtime·fallback_nanotime(SB), NOSPLIT, $32-8 + MOVL $SYS_clock_gettime, AX + LEAL 12(SP), BX + MOVL $CLOCK_MONOTONIC, 4(SP) + MOVL BX, 8(SP) + INT $0x80 + MOVL 12(SP), AX // sec + MOVL 16(SP), BX // nsec + + // sec is in AX, nsec in BX + // convert to DX:AX nsec + MOVL $1000000000, CX + MULL CX + ADDL BX, AX + ADCL $0, DX + + MOVL AX, ret_lo+0(FP) + MOVL DX, ret_hi+4(FP) + RET + + +TEXT runtime·asmSigaction(SB),NOSPLIT,$-4 + MOVL $SYS_sigaction, AX + INT $0x80 + MOVL AX, ret+12(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$12-16 + MOVL fn+0(FP), AX + MOVL sig+4(FP), BX + MOVL info+8(FP), CX + MOVL ctx+12(FP), DX + MOVL SP, SI + SUBL $32, SP + ANDL $~15, SP // align stack: handler might be a C function + MOVL BX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + MOVL SI, 12(SP) // save SI: handler might be a Go function + CALL AX + MOVL 12(SP), AX + MOVL AX, SP + RET + +// Called by OS using C ABI. 
+TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$12 + NOP SP // tell vet SP changed - stop checking offsets + MOVL 16(SP), BX // signo + MOVL BX, 0(SP) + MOVL 20(SP), BX // info + MOVL BX, 4(SP) + MOVL 24(SP), BX // context + MOVL BX, 8(SP) + CALL runtime·sigtrampgo(SB) + + // call sigreturn + MOVL 24(SP), AX // context + MOVL $0, 0(SP) // syscall gap + MOVL AX, 4(SP) + MOVL $SYS_sigreturn, AX + INT $0x80 + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$0 + MOVL $SYS_sigaltstack, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·usleep(SB),NOSPLIT,$20 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVL AX, 12(SP) // tv_sec + MOVL $1000, AX + MULL DX + MOVL AX, 16(SP) // tv_nsec + + MOVL $0, 0(SP) + LEAL 12(SP), AX + MOVL AX, 4(SP) // arg 1 - rqtp + MOVL $0, 8(SP) // arg 2 - rmtp + MOVL $SYS_nanosleep, AX + INT $0x80 + RET + +/* +descriptor entry format for system call +is the native machine format, ugly as it is: + + 2-byte limit + 3-byte base + 1-byte: 0x80=present, 0x60=dpl<<5, 0x1F=type + 1-byte: 0x80=limit is *4k, 0x40=32-bit operand size, + 0x0F=4 more bits of limit + 1 byte: 8 more bits of base + +int i386_get_ldt(int, union ldt_entry *, int); +int i386_set_ldt(int, const union ldt_entry *, int); + +*/ + +// setldt(int entry, int address, int limit) +TEXT runtime·setldt(SB),NOSPLIT,$32 + MOVL base+4(FP), BX + // see comment in sys_linux_386.s; freebsd is similar + ADDL $0x4, BX + + // set up data_desc + LEAL 16(SP), AX // struct data_desc + MOVL $0, 0(AX) + MOVL $0, 4(AX) + + MOVW BX, 2(AX) + SHRL $16, BX + MOVB BX, 4(AX) + SHRL $8, BX + MOVB BX, 7(AX) + + MOVW $0xffff, 0(AX) + MOVB $0xCF, 6(AX) // 32-bit operand, 4k limit unit, 4 more bits of limit + + MOVB $0xF2, 5(AX) // r/w data descriptor, dpl=3, present + + // call i386_set_ldt(entry, desc, 1) + MOVL $0xffffffff, 0(SP) // auto-allocate entry and return in AX + MOVL AX, 4(SP) + MOVL $1, 8(SP) + CALL i386_set_ldt<>(SB) + + // compute 
segment selector - (entry*8+7) + SHLL $3, AX + ADDL $7, AX + MOVW AX, GS + RET + +TEXT i386_set_ldt<>(SB),NOSPLIT,$16 + LEAL args+0(FP), AX // 0(FP) == 4(SP) before SP got moved + MOVL $0, 0(SP) // syscall gap + MOVL $1, 4(SP) + MOVL AX, 8(SP) + MOVL $SYS_sysarch, AX + INT $0x80 + JAE 2(PC) + INT $3 + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$28 + LEAL mib+0(FP), SI + LEAL 4(SP), DI + CLD + MOVSL // arg 1 - name + MOVSL // arg 2 - namelen + MOVSL // arg 3 - oldp + MOVSL // arg 4 - oldlenp + MOVSL // arg 5 - newp + MOVSL // arg 6 - newlen + MOVL $SYS___sysctl, AX + INT $0x80 + JAE 4(PC) + NEGL AX + MOVL AX, ret+24(FP) + RET + MOVL $0, AX + MOVL AX, ret+24(FP) + RET + +TEXT runtime·osyield(SB),NOSPLIT,$-4 + MOVL $SYS_sched_yield, AX + INT $0x80 + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$16 + MOVL $0, 0(SP) // syscall gap + MOVL how+0(FP), AX // arg 1 - how + MOVL AX, 4(SP) + MOVL new+4(FP), AX + MOVL AX, 8(SP) // arg 2 - set + MOVL old+8(FP), AX + MOVL AX, 12(SP) // arg 3 - oset + MOVL $SYS_sigprocmask, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// int32 runtime·kqueue(void); +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVL $SYS_kqueue, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout); +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVL $SYS_kevent, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+24(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$-4 + MOVL $SYS_fcntl, AX + INT $0x80 + JAE noerr + MOVL $-1, ret+12(FP) + MOVL AX, errno+16(FP) + RET +noerr: + MOVL AX, ret+12(FP) + MOVL $0, errno+16(FP) + RET + +// int32 runtime·closeonexec(int32 fd); +TEXT runtime·closeonexec(SB),NOSPLIT,$32 + MOVL $SYS_fcntl, AX + // 0(SP) is where the caller PC would be; kernel skips it + MOVL fd+0(FP), BX + MOVL BX, 4(SP) // fd + MOVL $F_SETFD, 8(SP) + MOVL $FD_CLOEXEC, 12(SP) + INT $0x80 + 
JAE 2(PC) + NEGL AX + RET + +// func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 +TEXT runtime·cpuset_getaffinity(SB), NOSPLIT, $0-28 + MOVL $SYS_cpuset_getaffinity, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+24(FP) + RET + +GLOBL runtime·tlsoffset(SB),NOPTR,$4 + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVL $SYS_issetugid, AX + INT $0x80 + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_freebsd_amd64.s b/src/runtime/sys_freebsd_amd64.s new file mode 100644 index 0000000..95bf07e --- /dev/null +++ b/src/runtime/sys_freebsd_amd64.s @@ -0,0 +1,601 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for AMD64, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 4 +#define FD_CLOEXEC 1 +#define F_SETFD 2 +#define AMD64_SET_FSBASE 129 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_sigaltstack 53 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_setitimer 83 +#define SYS_fcntl 92 +#define SYS_sysarch 165 +#define SYS___sysctl 202 +#define SYS_clock_gettime 232 +#define SYS_nanosleep 240 +#define SYS_issetugid 253 +#define SYS_sched_yield 331 +#define SYS_sigprocmask 340 +#define SYS_kqueue 362 +#define SYS_sigaction 416 +#define SYS_thr_exit 431 +#define SYS_thr_self 432 +#define SYS_thr_kill 433 +#define SYS__umtx_op 454 +#define SYS_thr_new 455 +#define SYS_mmap 477 +#define SYS_cpuset_getaffinity 487 +#define SYS_pipe2 542 +#define SYS_kevent 560 + +TEXT runtime·sys_umtx_op(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVL mode+8(FP), SI + MOVL val+12(FP), DX + 
MOVQ uaddr1+16(FP), R10 + MOVQ ut+24(FP), R8 + MOVL $SYS__umtx_op, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+32(FP) + RET + +TEXT runtime·thr_new(SB),NOSPLIT,$0 + MOVQ param+0(FP), DI + MOVL size+8(FP), SI + MOVL $SYS_thr_new, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·thr_start(SB),NOSPLIT,$0 + MOVQ DI, R13 // m + + // set up FS to point at m->tls + LEAQ m_tls(R13), DI + CALL runtime·settls(SB) // smashes DI + + // set up m, g + get_tls(CX) + MOVQ m_g0(R13), DI + MOVQ R13, g_m(DI) + MOVQ DI, g(CX) + + CALL runtime·stackcheck(SB) + CALL runtime·mstart(SB) + + MOVQ 0, AX // crash (not reached) + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-8 + MOVL code+0(FP), DI // arg 1 exit status + MOVL $SYS_exit, AX + SYSCALL + MOVL $0xf1, 0xf1 // crash + RET + +// func exitThread(wait *atomic.uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-8 + MOVQ wait+0(FP), AX + // We're done using the stack. + MOVL $0, (AX) + MOVL $0, DI // arg 1 long *state + MOVL $SYS_thr_exit, AX + SYSCALL + MOVL $0xf1, 0xf1 // crash + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$-8 + MOVQ name+0(FP), DI // arg 1 pathname + MOVL mode+8(FP), SI // arg 2 flags + MOVL perm+12(FP), DX // arg 3 mode + MOVL $SYS_open, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVL $SYS_close, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+8(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVQ p+8(FP), SI // arg 2 buf + MOVL n+16(FP), DX // arg 3 count + MOVL $SYS_read, AX + SYSCALL + JCC 2(PC) + NEGQ AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-20 + LEAQ r+8(FP), DI + MOVL flags+0(FP), SI + MOVL $SYS_pipe2, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, errno+16(FP) + RET + +TEXT 
runtime·write1(SB),NOSPLIT,$-8 + MOVQ fd+0(FP), DI // arg 1 fd + MOVQ p+8(FP), SI // arg 2 buf + MOVL n+16(FP), DX // arg 3 count + MOVL $SYS_write, AX + SYSCALL + JCC 2(PC) + NEGQ AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +TEXT runtime·thr_self(SB),NOSPLIT,$0-8 + // thr_self(&0(FP)) + LEAQ ret+0(FP), DI // arg 1 + MOVL $SYS_thr_self, AX + SYSCALL + RET + +TEXT runtime·thr_kill(SB),NOSPLIT,$0-16 + // thr_kill(tid, sig) + MOVQ tid+0(FP), DI // arg 1 id + MOVQ sig+8(FP), SI // arg 2 sig + MOVL $SYS_thr_kill, AX + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$0 + // getpid + MOVL $SYS_getpid, AX + SYSCALL + // kill(self, sig) + MOVQ AX, DI // arg 1 pid + MOVL sig+0(FP), SI // arg 2 sig + MOVL $SYS_kill, AX + SYSCALL + RET + +TEXT runtime·setitimer(SB), NOSPLIT, $-8 + MOVL mode+0(FP), DI + MOVQ new+8(FP), SI + MOVQ old+16(FP), DX + MOVL $SYS_setitimer, AX + SYSCALL + RET + +// func fallback_walltime() (sec int64, nsec int32) +TEXT runtime·fallback_walltime(SB), NOSPLIT, $32-12 + MOVL $SYS_clock_gettime, AX + MOVQ $CLOCK_REALTIME, DI + LEAQ 8(SP), SI + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + MOVQ AX, sec+0(FP) + MOVL DX, nsec+8(FP) + RET + +TEXT runtime·fallback_nanotime(SB), NOSPLIT, $32-8 + MOVL $SYS_clock_gettime, AX + MOVQ $CLOCK_MONOTONIC, DI + LEAQ 8(SP), SI + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + // return nsec in AX + IMULQ $1000000000, AX + ADDQ DX, AX + MOVQ AX, ret+0(FP) + RET + +TEXT runtime·asmSigaction(SB),NOSPLIT,$0 + MOVQ sig+0(FP), DI // arg 1 sig + MOVQ new+8(FP), SI // arg 2 act + MOVQ old+16(FP), DX // arg 3 oact + MOVL $SYS_sigaction, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+24(FP) + RET + +TEXT runtime·callCgoSigaction(SB),NOSPLIT,$16 + MOVQ sig+0(FP), DI // arg 1 sig + MOVQ new+8(FP), SI // arg 2 act + MOVQ old+16(FP), DX // arg 3 oact + MOVQ _cgo_sigaction(SB), AX + MOVQ SP, BX // callee-saved + 
ANDQ $~15, SP // alignment as per amd64 psABI + CALL AX + MOVQ BX, SP + MOVL AX, ret+24(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Called using C ABI. +TEXT runtime·sigprofNonGoWrapper<>(SB),NOSPLIT,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigprofNonGo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Used instead of sigtramp in programs that use cgo. +// Arguments from kernel are in DI, SI, DX. +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + // If no traceback function, do usual sigtramp. + MOVQ runtime·cgoTraceback(SB), AX + TESTQ AX, AX + JZ sigtramp + + // If no traceback support function, which means that + // runtime/cgo was not linked in, do usual sigtramp. + MOVQ _cgo_callers(SB), AX + TESTQ AX, AX + JZ sigtramp + + // Figure out if we are currently in a cgo call. + // If not, just do usual sigtramp. 
+ get_tls(CX) + MOVQ g(CX),AX + TESTQ AX, AX + JZ sigtrampnog // g == nil + MOVQ g_m(AX), AX + TESTQ AX, AX + JZ sigtramp // g.m == nil + MOVL m_ncgo(AX), CX + TESTL CX, CX + JZ sigtramp // g.m.ncgo == 0 + MOVQ m_curg(AX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg == nil + MOVQ g_syscallsp(CX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg.syscallsp == 0 + MOVQ m_cgoCallers(AX), R8 + TESTQ R8, R8 + JZ sigtramp // g.m.cgoCallers == nil + MOVL m_cgoCallersUse(AX), CX + TESTL CX, CX + JNZ sigtramp // g.m.cgoCallersUse != 0 + + // Jump to a function in runtime/cgo. + // That function, written in C, will call the user's traceback + // function with proper unwind info, and will then call back here. + // The first three arguments, and the fifth, are already in registers. + // Set the two remaining arguments now. + MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigtramp(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +sigtramp: + JMP runtime·sigtramp(SB) + +sigtrampnog: + // Signal arrived on a non-Go thread. If this is SIGPROF, get a + // stack trace. + CMPL DI, $27 // 27 == SIGPROF + JNZ sigtramp + + // Lock sigprofCallersUse. + MOVL $0, AX + MOVL $1, CX + MOVQ $runtime·sigprofCallersUse(SB), R11 + LOCK + CMPXCHGL CX, 0(R11) + JNZ sigtramp // Skip stack trace if already locked. + + // Jump to the traceback function in runtime/cgo. + // It will call back to sigprofNonGo, via sigprofNonGoWrapper, to convert + // the arguments to the Go calling convention. + // First three arguments to traceback function are in registers already. 
+ MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigprofCallers(SB), R8 + MOVQ $runtime·sigprofNonGoWrapper<>(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +TEXT runtime·sysMmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 addr + MOVQ n+8(FP), SI // arg 2 len + MOVL prot+16(FP), DX // arg 3 prot + MOVL flags+20(FP), R10 // arg 4 flags + MOVL fd+24(FP), R8 // arg 5 fid + MOVL off+28(FP), R9 // arg 6 offset + MOVL $SYS_mmap, AX + SYSCALL + JCC ok + MOVQ $0, p+32(FP) + MOVQ AX, err+40(FP) + RET +ok: + MOVQ AX, p+32(FP) + MOVQ $0, err+40(FP) + RET + +// Call the function stored in _cgo_mmap using the GCC calling convention. +// This must be called on the system stack. +TEXT runtime·callCgoMmap(SB),NOSPLIT,$16 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL prot+16(FP), DX + MOVL flags+20(FP), CX + MOVL fd+24(FP), R8 + MOVL off+28(FP), R9 + MOVQ _cgo_mmap(SB), AX + MOVQ SP, BX + ANDQ $~15, SP // alignment as per amd64 psABI + MOVQ BX, 0(SP) + CALL AX + MOVQ 0(SP), SP + MOVQ AX, ret+32(FP) + RET + +TEXT runtime·sysMunmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 addr + MOVQ n+8(FP), SI // arg 2 len + MOVL $SYS_munmap, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// Call the function stored in _cgo_munmap using the GCC calling convention. +// This must be called on the system stack. 
+TEXT runtime·callCgoMunmap(SB),NOSPLIT,$16-16 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVQ _cgo_munmap(SB), AX + MOVQ SP, BX + ANDQ $~15, SP // alignment as per amd64 psABI + MOVQ BX, 0(SP) + CALL AX + MOVQ 0(SP), SP + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL flags+16(FP), DX + MOVQ $SYS_madvise, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+24(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVQ new+0(FP), DI + MOVQ old+8(FP), SI + MOVQ $SYS_sigaltstack, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVQ AX, 0(SP) // tv_sec + MOVL $1000, AX + MULL DX + MOVQ AX, 8(SP) // tv_nsec + + MOVQ SP, DI // arg 1 - rqtp + MOVQ $0, SI // arg 2 - rmtp + MOVL $SYS_nanosleep, AX + SYSCALL + RET + +// set tls base to DI +TEXT runtime·settls(SB),NOSPLIT,$8 + ADDQ $8, DI // adjust for ELF: wants to use -8(FS) for g and m + MOVQ DI, 0(SP) + MOVQ SP, SI + MOVQ $AMD64_SET_FSBASE, DI + MOVQ $SYS_sysarch, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVQ mib+0(FP), DI // arg 1 - name + MOVL miblen+8(FP), SI // arg 2 - namelen + MOVQ out+16(FP), DX // arg 3 - oldp + MOVQ size+24(FP), R10 // arg 4 - oldlenp + MOVQ dst+32(FP), R8 // arg 5 - newp + MOVQ ndst+40(FP), R9 // arg 6 - newlen + MOVQ $SYS___sysctl, AX + SYSCALL + JCC 4(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + MOVL $0, AX + MOVL AX, ret+48(FP) + RET + +TEXT runtime·osyield(SB),NOSPLIT,$-4 + MOVL $SYS_sched_yield, AX + SYSCALL + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVL how+0(FP), DI // arg 1 - how + MOVQ new+8(FP), SI // arg 2 - set + MOVQ old+16(FP), DX // arg 3 - oset + MOVL $SYS_sigprocmask, AX + SYSCALL + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// int32 runtime·kqueue(void); +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $0, DX + MOVL 
$SYS_kqueue, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout); +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVL kq+0(FP), DI + MOVQ ch+8(FP), SI + MOVL nch+16(FP), DX + MOVQ ev+24(FP), R10 + MOVL nev+32(FP), R8 + MOVQ ts+40(FP), R9 + MOVL $SYS_kevent, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVL cmd+4(FP), SI // cmd + MOVL arg+8(FP), DX // arg + MOVL $SYS_fcntl, AX + SYSCALL + JCC noerr + MOVL $-1, ret+16(FP) + MOVL AX, errno+20(FP) + RET +noerr: + MOVL AX, ret+16(FP) + MOVL $0, errno+20(FP) + RET + +// void runtime·closeonexec(int32 fd); +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVQ $F_SETFD, SI + MOVQ $FD_CLOEXEC, DX + MOVL $SYS_fcntl, AX + SYSCALL + RET + +// func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 +TEXT runtime·cpuset_getaffinity(SB), NOSPLIT, $0-44 + MOVQ level+0(FP), DI + MOVQ which+8(FP), SI + MOVQ id+16(FP), DX + MOVQ size+24(FP), R10 + MOVQ mask+32(FP), R8 + MOVL $SYS_cpuset_getaffinity, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+40(FP) + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $0, DX + MOVL $SYS_issetugid, AX + SYSCALL + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_freebsd_arm.s b/src/runtime/sys_freebsd_arm.s new file mode 100644 index 0000000..bd2e705 --- /dev/null +++ b/src/runtime/sys_freebsd_arm.s @@ -0,0 +1,465 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for ARM, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// for EABI, as we don't support OABI +#define SYS_BASE 0x0 + +#define SYS_exit (SYS_BASE + 1) +#define SYS_read (SYS_BASE + 3) +#define SYS_write (SYS_BASE + 4) +#define SYS_open (SYS_BASE + 5) +#define SYS_close (SYS_BASE + 6) +#define SYS_getpid (SYS_BASE + 20) +#define SYS_kill (SYS_BASE + 37) +#define SYS_sigaltstack (SYS_BASE + 53) +#define SYS_munmap (SYS_BASE + 73) +#define SYS_madvise (SYS_BASE + 75) +#define SYS_setitimer (SYS_BASE + 83) +#define SYS_fcntl (SYS_BASE + 92) +#define SYS___sysctl (SYS_BASE + 202) +#define SYS_nanosleep (SYS_BASE + 240) +#define SYS_issetugid (SYS_BASE + 253) +#define SYS_clock_gettime (SYS_BASE + 232) +#define SYS_sched_yield (SYS_BASE + 331) +#define SYS_sigprocmask (SYS_BASE + 340) +#define SYS_kqueue (SYS_BASE + 362) +#define SYS_sigaction (SYS_BASE + 416) +#define SYS_thr_exit (SYS_BASE + 431) +#define SYS_thr_self (SYS_BASE + 432) +#define SYS_thr_kill (SYS_BASE + 433) +#define SYS__umtx_op (SYS_BASE + 454) +#define SYS_thr_new (SYS_BASE + 455) +#define SYS_mmap (SYS_BASE + 477) +#define SYS_cpuset_getaffinity (SYS_BASE + 487) +#define SYS_pipe2 (SYS_BASE + 542) +#define SYS_kevent (SYS_BASE + 560) + +TEXT runtime·sys_umtx_op(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW mode+4(FP), R1 + MOVW val+8(FP), R2 + MOVW uaddr1+12(FP), R3 + ADD $20, R13 // arg 5 is passed on stack + MOVW $SYS__umtx_op, R7 + SWI $0 + RSB.CS $0, R0 + SUB $20, R13 + // BCS error + MOVW R0, ret+20(FP) + RET + +TEXT runtime·thr_new(SB),NOSPLIT,$0 + MOVW param+0(FP), R0 + MOVW size+4(FP), R1 + MOVW $SYS_thr_new, R7 + SWI $0 + RSB.CS $0, R0 + MOVW R0, ret+8(FP) + RET + +TEXT runtime·thr_start(SB),NOSPLIT,$0 + // set up g + MOVW m_g0(R0), g + MOVW R0, g_m(g) + BL runtime·emptyfunc(SB) // fault if stack check is wrong + BL runtime·mstart(SB) + + MOVW $2, R8 // crash (not reached) + MOVW R8, (R8) + RET + +// Exit the entire program (like C exit) +TEXT 
runtime·exit(SB),NOSPLIT|NOFRAME,$0 + MOVW code+0(FP), R0 // arg 1 exit status + MOVW $SYS_exit, R7 + SWI $0 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVW wait+0(FP), R0 + // We're done using the stack. + MOVW $0, R2 +storeloop: + LDREX (R0), R4 // loads R4 + STREX R2, (R0), R1 // stores R2 + CMP $0, R1 + BNE storeloop + MOVW $0, R0 // arg 1 long *state + MOVW $SYS_thr_exit, R7 + SWI $0 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0 + MOVW name+0(FP), R0 // arg 1 name + MOVW mode+4(FP), R1 // arg 2 mode + MOVW perm+8(FP), R2 // arg 3 perm + MOVW $SYS_open, R7 + SWI $0 + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 // arg 1 fd + MOVW p+4(FP), R1 // arg 2 buf + MOVW n+8(FP), R2 // arg 3 count + MOVW $SYS_read, R7 + SWI $0 + RSB.CS $0, R0 // caller expects negative errno + MOVW R0, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-16 + MOVW $r+4(FP), R0 + MOVW flags+0(FP), R1 + MOVW $SYS_pipe2, R7 + SWI $0 + RSB.CS $0, R0 + MOVW R0, errno+12(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 // arg 1 fd + MOVW p+4(FP), R1 // arg 2 buf + MOVW n+8(FP), R2 // arg 3 count + MOVW $SYS_write, R7 + SWI $0 + RSB.CS $0, R0 // caller expects negative errno + MOVW R0, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 // arg 1 fd + MOVW $SYS_close, R7 + SWI $0 + MOVW.CS $-1, R0 + MOVW R0, ret+4(FP) + RET + +TEXT runtime·thr_self(SB),NOSPLIT,$0-4 + // thr_self(&0(FP)) + MOVW $ret+0(FP), R0 // arg 1 + MOVW $SYS_thr_self, R7 + SWI $0 + RET + +TEXT runtime·thr_kill(SB),NOSPLIT,$0-8 + // thr_kill(tid, sig) + MOVW tid+0(FP), R0 // arg 1 id + MOVW sig+4(FP), R1 // arg 2 signal + MOVW $SYS_thr_kill, R7 + SWI $0 + RET + +TEXT 
runtime·raiseproc(SB),NOSPLIT,$0 + // getpid + MOVW $SYS_getpid, R7 + SWI $0 + // kill(self, sig) + // arg 1 - pid, now in R0 + MOVW sig+0(FP), R1 // arg 2 - signal + MOVW $SYS_kill, R7 + SWI $0 + RET + +TEXT runtime·setitimer(SB), NOSPLIT|NOFRAME, $0 + MOVW mode+0(FP), R0 + MOVW new+4(FP), R1 + MOVW old+8(FP), R2 + MOVW $SYS_setitimer, R7 + SWI $0 + RET + +// func fallback_walltime() (sec int64, nsec int32) +TEXT runtime·fallback_walltime(SB), NOSPLIT, $32-12 + MOVW $0, R0 // CLOCK_REALTIME + MOVW $8(R13), R1 + MOVW $SYS_clock_gettime, R7 + SWI $0 + + MOVW 8(R13), R0 // sec.low + MOVW 12(R13), R1 // sec.high + MOVW 16(R13), R2 // nsec + + MOVW R0, sec_lo+0(FP) + MOVW R1, sec_hi+4(FP) + MOVW R2, nsec+8(FP) + RET + +// func fallback_nanotime() int64 +TEXT runtime·fallback_nanotime(SB), NOSPLIT, $32 + MOVW $4, R0 // CLOCK_MONOTONIC + MOVW $8(R13), R1 + MOVW $SYS_clock_gettime, R7 + SWI $0 + + MOVW 8(R13), R0 // sec.low + MOVW 12(R13), R4 // sec.high + MOVW 16(R13), R2 // nsec + + MOVW $1000000000, R3 + MULLU R0, R3, (R1, R0) + MUL R3, R4 + ADD.S R2, R0 + ADC R4, R1 + + MOVW R0, ret_lo+0(FP) + MOVW R1, ret_hi+4(FP) + RET + +TEXT runtime·asmSigaction(SB),NOSPLIT|NOFRAME,$0 + MOVW sig+0(FP), R0 // arg 1 sig + MOVW new+4(FP), R1 // arg 2 act + MOVW old+8(FP), R2 // arg 3 oact + MOVW $SYS_sigaction, R7 + SWI $0 + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Reserve space for callee-save registers and arguments. + MOVM.DB.W [R4-R11], (R13) + SUB $16, R13 + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVW R0, 4(R13) // signum + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + BL.NE runtime·load_g(SB) + + MOVW R1, 8(R13) + MOVW R2, 12(R13) + BL runtime·sigtrampgo(SB) + + // Restore callee-save registers. 
+ ADD $16, R13 + MOVM.IA.W (R13), [R4-R11] + + RET + +TEXT runtime·mmap(SB),NOSPLIT,$16 + MOVW addr+0(FP), R0 // arg 1 addr + MOVW n+4(FP), R1 // arg 2 len + MOVW prot+8(FP), R2 // arg 3 prot + MOVW flags+12(FP), R3 // arg 4 flags + // arg 5 (fid) and arg6 (offset_lo, offset_hi) are passed on stack + // note the C runtime only passes the 32-bit offset_lo to us + MOVW fd+16(FP), R4 // arg 5 + MOVW R4, 4(R13) + MOVW off+20(FP), R5 // arg 6 lower 32-bit + // the word at 8(R13) is skipped due to 64-bit argument alignment. + MOVW R5, 12(R13) + MOVW $0, R6 // higher 32-bit for arg 6 + MOVW R6, 16(R13) + ADD $4, R13 + MOVW $SYS_mmap, R7 + SWI $0 + SUB $4, R13 + MOVW $0, R1 + MOVW.CS R0, R1 // if failed, put in R1 + MOVW.CS $0, R0 + MOVW R0, p+24(FP) + MOVW R1, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 // arg 1 addr + MOVW n+4(FP), R1 // arg 2 len + MOVW $SYS_munmap, R7 + SWI $0 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 // arg 1 addr + MOVW n+4(FP), R1 // arg 2 len + MOVW flags+8(FP), R2 // arg 3 flags + MOVW $SYS_madvise, R7 + SWI $0 + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVW new+0(FP), R0 + MOVW old+4(FP), R1 + MOVW $SYS_sigaltstack, R7 + SWI $0 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVW sig+4(FP), R0 + MOVW info+8(FP), R1 + MOVW ctx+12(FP), R2 + MOVW fn+0(FP), R11 + MOVW R13, R4 + SUB $24, R13 + BIC $0x7, R13 // alignment for ELF ABI + BL (R11) + MOVW R4, R13 + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVW usec+0(FP), R0 + CALL runtime·usplitR0(SB) + // 0(R13) is the saved LR, don't use it + MOVW R0, 4(R13) // tv_sec.low + MOVW $0, R0 + MOVW R0, 8(R13) // tv_sec.high + MOVW $1000, R2 + MUL R1, R2 + MOVW R2, 12(R13) // tv_nsec + + MOVW $4(R13), R0 // arg 1 - rqtp + MOVW $0, R1 // arg 2 - rmtp + MOVW 
$SYS_nanosleep, R7 + SWI $0 + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVW mib+0(FP), R0 // arg 1 - name + MOVW miblen+4(FP), R1 // arg 2 - namelen + MOVW out+8(FP), R2 // arg 3 - old + MOVW size+12(FP), R3 // arg 4 - oldlenp + // arg 5 (newp) and arg 6 (newlen) are passed on stack + ADD $20, R13 + MOVW $SYS___sysctl, R7 + SWI $0 + SUB.CS $0, R0, R0 + SUB $20, R13 + MOVW R0, ret+24(FP) + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_sched_yield, R7 + SWI $0 + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVW how+0(FP), R0 // arg 1 - how + MOVW new+4(FP), R1 // arg 2 - set + MOVW old+8(FP), R2 // arg 3 - oset + MOVW $SYS_sigprocmask, R7 + SWI $0 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +// int32 runtime·kqueue(void) +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVW $SYS_kqueue, R7 + SWI $0 + RSB.CS $0, R0 + MOVW R0, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVW kq+0(FP), R0 // kq + MOVW ch+4(FP), R1 // changelist + MOVW nch+8(FP), R2 // nchanges + MOVW ev+12(FP), R3 // eventlist + ADD $20, R13 // pass arg 5 and 6 on stack + MOVW $SYS_kevent, R7 + SWI $0 + RSB.CS $0, R0 + SUB $20, R13 + MOVW R0, ret+24(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 // fd + MOVW cmd+4(FP), R1 // cmd + MOVW arg+8(FP), R2 // arg + MOVW $SYS_fcntl, R7 + SWI $0 + MOVW $0, R1 + MOVW.CS R0, R1 + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + MOVW R1, errno+16(FP) + RET + +// void runtime·closeonexec(int32 fd) +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 // fd + MOVW $2, R1 // F_SETFD + MOVW $1, R2 // FD_CLOEXEC + MOVW $SYS_fcntl, R7 + SWI $0 + RET + +// TODO: this is only valid for ARMv7+ +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + B runtime·armPublicationBarrier(SB) + +// TODO(minux): this only supports ARMv6K+. 
+TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + WORD $0xee1d0f70 // mrc p15, 0, r0, c13, c0, 3 + RET + +// func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 +TEXT runtime·cpuset_getaffinity(SB), NOSPLIT, $0-28 + MOVW level+0(FP), R0 + MOVW which+4(FP), R1 + MOVW id_lo+8(FP), R2 + MOVW id_hi+12(FP), R3 + ADD $20, R13 // Pass size and mask on stack. + MOVW $SYS_cpuset_getaffinity, R7 + SWI $0 + RSB.CS $0, R0 + SUB $20, R13 + MOVW R0, ret+24(FP) + RET + +// func getCntxct(physical bool) uint32 +TEXT runtime·getCntxct(SB),NOSPLIT|NOFRAME,$0-8 + MOVB runtime·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + DMB + + MOVB physical+0(FP), R0 + CMP $1, R0 + B.NE 3(PC) + + // get CNTPCT (Physical Count Register) into R0(low) R1(high) + // mrrc 15, 0, r0, r1, cr14 + WORD $0xec510f0e + B 2(PC) + + // get CNTVCT (Virtual Count Register) into R0(low) R1(high) + // mrrc 15, 1, r0, r1, cr14 + WORD $0xec510f1e + + MOVW R0, ret+4(FP) + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVW $SYS_issetugid, R7 + SWI $0 + MOVW R0, ret+0(FP) + RET diff --git a/src/runtime/sys_freebsd_arm64.s b/src/runtime/sys_freebsd_arm64.s new file mode 100644 index 0000000..fe43062 --- /dev/null +++ b/src/runtime/sys_freebsd_arm64.s @@ -0,0 +1,495 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for arm64, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_arm64.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 4 +#define FD_CLOEXEC 1 +#define F_SETFD 2 +#define F_GETFL 3 +#define F_SETFL 4 +#define O_NONBLOCK 4 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_sigaltstack 53 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_setitimer 83 +#define SYS_fcntl 92 +#define SYS___sysctl 202 +#define SYS_nanosleep 240 +#define SYS_issetugid 253 +#define SYS_clock_gettime 232 +#define SYS_sched_yield 331 +#define SYS_sigprocmask 340 +#define SYS_kqueue 362 +#define SYS_sigaction 416 +#define SYS_thr_exit 431 +#define SYS_thr_self 432 +#define SYS_thr_kill 433 +#define SYS__umtx_op 454 +#define SYS_thr_new 455 +#define SYS_mmap 477 +#define SYS_cpuset_getaffinity 487 +#define SYS_pipe2 542 +#define SYS_kevent 560 + +TEXT emptyfunc<>(SB),0,$0-0 + RET + +// func sys_umtx_op(addr *uint32, mode int32, val uint32, uaddr1 uintptr, ut *umtx_time) int32 +TEXT runtime·sys_umtx_op(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 + MOVW mode+8(FP), R1 + MOVW val+12(FP), R2 + MOVD uaddr1+16(FP), R3 + MOVD ut+24(FP), R4 + MOVD $SYS__umtx_op, R8 + SVC + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+32(FP) + RET + +// func thr_new(param *thrparam, size int32) int32 +TEXT runtime·thr_new(SB),NOSPLIT,$0 + MOVD param+0(FP), R0 + MOVW size+8(FP), R1 + MOVD $SYS_thr_new, R8 + SVC + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+16(FP) + RET + +// func thr_start() +TEXT runtime·thr_start(SB),NOSPLIT,$0 + // set up g + MOVD m_g0(R0), g + MOVD R0, g_m(g) + BL emptyfunc<>(SB) // fault if stack check is wrong + BL runtime·mstart(SB) + + MOVD $2, R8 // crash (not reached) + MOVD R8, (R8) + RET + +// func exit(code int32) +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R0 + MOVD $SYS_exit, R8 + SVC + MOVD $0, R0 + MOVD R0, (R0) + +// func 
exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVD wait+0(FP), R0 + // We're done using the stack. + MOVW $0, R1 + STLRW R1, (R0) + MOVW $0, R0 + MOVD $SYS_thr_exit, R8 + SVC + JMP 0(PC) + +// func open(name *byte, mode, perm int32) int32 +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOVD name+0(FP), R0 + MOVW mode+8(FP), R1 + MOVW perm+12(FP), R2 + MOVD $SYS_open, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+16(FP) + RET + +// func closefd(fd int32) int32 +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R0 + MOVD $SYS_close, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+8(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVD $r+8(FP), R0 + MOVW flags+0(FP), R1 + MOVD $SYS_pipe2, R8 + SVC + BCC ok + NEG R0, R0 +ok: + MOVW R0, errno+16(FP) + RET + +// func write1(fd uintptr, p unsafe.Pointer, n int32) int32 +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVD fd+0(FP), R0 + MOVD p+8(FP), R1 + MOVW n+16(FP), R2 + MOVD $SYS_write, R8 + SVC + BCC ok + NEG R0, R0 // caller expects negative errno +ok: + MOVW R0, ret+24(FP) + RET + +// func read(fd int32, p unsafe.Pointer, n int32) int32 +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R0 + MOVD p+8(FP), R1 + MOVW n+16(FP), R2 + MOVD $SYS_read, R8 + SVC + BCC ok + NEG R0, R0 // caller expects negative errno +ok: + MOVW R0, ret+24(FP) + RET + +// func usleep(usec uint32) +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), R3 + MOVD R3, R5 + MOVW $1000000, R4 + UDIV R4, R3 + MOVD R3, 8(RSP) + MUL R3, R4 + SUB R4, R5 + MOVW $1000, R4 + MUL R4, R5 + MOVD R5, 16(RSP) + + // nanosleep(&ts, 0) + ADD $8, RSP, R0 + MOVD $0, R1 + MOVD $SYS_nanosleep, R8 + SVC + RET + +// func thr_self() thread +TEXT runtime·thr_self(SB),NOSPLIT,$8-8 + MOVD $ptr-8(SP), R0 // arg 1 &8(SP) + MOVD $SYS_thr_self, R8 + SVC + MOVD ptr-8(SP), R0 + MOVD R0, ret+0(FP) + RET + +// func thr_kill(t thread, 
sig int) +TEXT runtime·thr_kill(SB),NOSPLIT,$0-16 + MOVD tid+0(FP), R0 // arg 1 pid + MOVD sig+8(FP), R1 // arg 2 sig + MOVD $SYS_thr_kill, R8 + SVC + RET + +// func raiseproc(sig uint32) +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_getpid, R8 + SVC + MOVW sig+0(FP), R1 + MOVD $SYS_kill, R8 + SVC + RET + +// func setitimer(mode int32, new, old *itimerval) +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVD $SYS_setitimer, R8 + SVC + RET + +// func fallback_walltime() (sec int64, nsec int32) +TEXT runtime·fallback_walltime(SB),NOSPLIT,$24-12 + MOVW $CLOCK_REALTIME, R0 + MOVD $8(RSP), R1 + MOVD $SYS_clock_gettime, R8 + SVC + MOVD 8(RSP), R0 // sec + MOVW 16(RSP), R1 // nsec + MOVD R0, sec+0(FP) + MOVW R1, nsec+8(FP) + RET + +// func fallback_nanotime() int64 +TEXT runtime·fallback_nanotime(SB),NOSPLIT,$24-8 + MOVD $CLOCK_MONOTONIC, R0 + MOVD $8(RSP), R1 + MOVD $SYS_clock_gettime, R8 + SVC + MOVD 8(RSP), R0 // sec + MOVW 16(RSP), R2 // nsec + + // sec is in R0, nsec in R2 + // return nsec in R2 + MOVD $1000000000, R3 + MUL R3, R0 + ADD R2, R0 + + MOVD R0, ret+0(FP) + RET + +// func asmSigaction(sig uintptr, new, old *sigactiont) int32 +TEXT runtime·asmSigaction(SB),NOSPLIT|NOFRAME,$0 + MOVD sig+0(FP), R0 // arg 1 sig + MOVD new+8(FP), R1 // arg 2 act + MOVD old+16(FP), R2 // arg 3 oact + MOVD $SYS_sigaction, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+24(FP) + RET + +// func sigfwd(fn uintptr, sig uint32, info *siginfo, ctx unsafe.Pointer) +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R0 + MOVD info+16(FP), R1 + MOVD ctx+24(FP), R2 + MOVD fn+0(FP), R11 + BL (R11) + RET + +// func sigtramp() +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$176 + // Save callee-save registers in the case of signal forwarding. + // Please refer to https://golang.org/issue/31827 . 
+ SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVW R0, 8(RSP) + MOVBU runtime·iscgo(SB), R0 + CMP $0, R0 + BEQ 2(PC) + BL runtime·load_g(SB) + +#ifdef GOEXPERIMENT_regabiargs + // Restore signum to R0. + MOVW 8(RSP), R0 + // R1 and R2 already contain info and ctx, respectively. +#else + MOVD R1, 16(RSP) + MOVD R2, 24(RSP) +#endif + MOVD $runtime·sigtrampgo<ABIInternal>(SB), R3 + BL (R3) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + + RET + +// func mmap(addr uintptr, n uintptr, prot int, flags int, fd int, off int64) (ret uintptr, err error) +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVW prot+16(FP), R2 + MOVW flags+20(FP), R3 + MOVW fd+24(FP), R4 + MOVW off+28(FP), R5 + MOVD $SYS_mmap, R8 + SVC + BCS fail + MOVD R0, p+32(FP) + MOVD $0, err+40(FP) + RET +fail: + MOVD $0, p+32(FP) + MOVD R0, err+40(FP) + RET + +// func munmap(addr uintptr, n uintptr) (err error) +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVD $SYS_munmap, R8 + SVC + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +// func madvise(addr unsafe.Pointer, n uintptr, flags int32) int32 +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVW flags+16(FP), R2 + MOVD $SYS_madvise, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+24(FP) + RET + +// func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVD mib+0(FP), R0 + MOVD miblen+8(FP), R1 + MOVD out+16(FP), R2 + MOVD size+24(FP), R3 + MOVD dst+32(FP), R4 + MOVD ndst+40(FP), R5 + MOVD $SYS___sysctl, R8 + SVC + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+48(FP) + RET + +// func sigaltstack(new, old *stackt) +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + 
MOVD new+0(FP), R0 + MOVD old+8(FP), R1 + MOVD $SYS_sigaltstack, R8 + SVC + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +// func osyield() +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_sched_yield, R8 + SVC + RET + +// func sigprocmask(how int32, new, old *sigset) +TEXT runtime·sigprocmask(SB),NOSPLIT|NOFRAME,$0-24 + MOVW how+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVD $SYS_sigprocmask, R8 + SVC + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +// func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 +TEXT runtime·cpuset_getaffinity(SB),NOSPLIT|NOFRAME,$0-44 + MOVD level+0(FP), R0 + MOVD which+8(FP), R1 + MOVD id+16(FP), R2 + MOVD size+24(FP), R3 + MOVD mask+32(FP), R4 + MOVD $SYS_cpuset_getaffinity, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+40(FP) + RET + +// func kqueue() int32 +TEXT runtime·kqueue(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_kqueue, R8 + SVC + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+0(FP) + RET + +// func kevent(kq int, ch unsafe.Pointer, nch int, ev unsafe.Pointer, nev int, ts *Timespec) (n int, err error) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVW kq+0(FP), R0 + MOVD ch+8(FP), R1 + MOVW nch+16(FP), R2 + MOVD ev+24(FP), R3 + MOVW nev+32(FP), R4 + MOVD ts+40(FP), R5 + MOVD $SYS_kevent, R8 + SVC + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW cmd+4(FP), R1 + MOVW arg+8(FP), R2 + MOVD $SYS_fcntl, R8 + SVC + BCC noerr + MOVW $-1, R1 + MOVW R1, ret+16(FP) + MOVW R0, errno+20(FP) + RET +noerr: + MOVW R0, ret+16(FP) + MOVW $0, errno+20(FP) + RET + +// func closeonexec(fd int32) +TEXT runtime·closeonexec(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 + MOVD $F_SETFD, R1 + MOVD $FD_CLOEXEC, R2 + MOVD $SYS_fcntl, R8 + SVC + RET + +// func getCntxct(physical bool) uint32 +TEXT runtime·getCntxct(SB),NOSPLIT,$0 + MOVB physical+0(FP), R0 + CMP $0, R0 
+ BEQ 3(PC) + + // get CNTPCT (Physical Count Register) into R0 + MRS CNTPCT_EL0, R0 + B 2(PC) + + // get CNTVCT (Virtual Count Register) into R0 + MRS CNTVCT_EL0, R0 + + MOVW R0, ret+8(FP) + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_issetugid, R8 + SVC + MOVW R0, ret+0(FP) + RET diff --git a/src/runtime/sys_freebsd_riscv64.s b/src/runtime/sys_freebsd_riscv64.s new file mode 100644 index 0000000..4f581f5 --- /dev/null +++ b/src/runtime/sys_freebsd_riscv64.s @@ -0,0 +1,462 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for riscv64, FreeBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 4 +#define FD_CLOEXEC 1 +#define F_SETFD 2 +#define F_GETFL 3 +#define F_SETFL 4 +#define O_NONBLOCK 4 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_sigaltstack 53 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_setitimer 83 +#define SYS_fcntl 92 +#define SYS___sysctl 202 +#define SYS_nanosleep 240 +#define SYS_issetugid 253 +#define SYS_clock_gettime 232 +#define SYS_sched_yield 331 +#define SYS_sigprocmask 340 +#define SYS_kqueue 362 +#define SYS_sigaction 416 +#define SYS_thr_exit 431 +#define SYS_thr_self 432 +#define SYS_thr_kill 433 +#define SYS__umtx_op 454 +#define SYS_thr_new 455 +#define SYS_mmap 477 +#define SYS_cpuset_getaffinity 487 +#define SYS_pipe2 542 +#define SYS_kevent 560 + +TEXT emptyfunc<>(SB),0,$0-0 + RET + +// func sys_umtx_op(addr *uint32, mode int32, val uint32, uaddr1 uintptr, ut *umtx_time) int32 +TEXT runtime·sys_umtx_op(SB),NOSPLIT,$0 + MOV addr+0(FP), A0 + MOVW mode+8(FP), A1 + MOVW 
val+12(FP), A2 + MOV uaddr1+16(FP), A3 + MOV ut+24(FP), A4 + MOV $SYS__umtx_op, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+32(FP) + RET + +// func thr_new(param *thrparam, size int32) int32 +TEXT runtime·thr_new(SB),NOSPLIT,$0 + MOV param+0(FP), A0 + MOVW size+8(FP), A1 + MOV $SYS_thr_new, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+16(FP) + RET + +// func thr_start() +TEXT runtime·thr_start(SB),NOSPLIT,$0 + // set up g + MOV m_g0(A0), g + MOV A0, g_m(g) + CALL emptyfunc<>(SB) // fault if stack check is wrong + CALL runtime·mstart(SB) + + WORD $0 // crash + RET + +// func exit(code int32) +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), A0 + MOV $SYS_exit, T0 + ECALL + WORD $0 // crash + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOV wait+0(FP), A0 + // We're done using the stack. + FENCE + MOVW ZERO, (A0) + FENCE + MOV $0, A0 // exit code + MOV $SYS_thr_exit, T0 + ECALL + JMP 0(PC) + +// func open(name *byte, mode, perm int32) int32 +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOV name+0(FP), A0 + MOVW mode+8(FP), A1 + MOVW perm+12(FP), A2 + MOV $SYS_open, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+16(FP) + RET + +// func closefd(fd int32) int32 +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), A0 + MOV $SYS_close, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+8(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOV $r+8(FP), A0 + MOVW flags+0(FP), A1 + MOV $SYS_pipe2, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, errno+16(FP) + RET + +// func write1(fd uintptr, p unsafe.Pointer, n int32) int32 +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOV fd+0(FP), A0 + MOV p+8(FP), A1 + MOVW n+16(FP), A2 + MOV $SYS_write, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+24(FP) + RET + +// func read(fd int32, p unsafe.Pointer, n 
int32) int32 +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), A0 + MOV p+8(FP), A1 + MOVW n+16(FP), A2 + MOV $SYS_read, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+24(FP) + RET + +// func usleep(usec uint32) +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), A0 + MOV $1000, A1 + MUL A1, A0, A0 + MOV $1000000000, A1 + DIV A1, A0, A2 + MOV A2, 8(X2) + REM A1, A0, A3 + MOV A3, 16(X2) + ADD $8, X2, A0 + MOV ZERO, A1 + MOV $SYS_nanosleep, T0 + ECALL + RET + +// func thr_self() thread +TEXT runtime·thr_self(SB),NOSPLIT,$8-8 + MOV $ptr-8(SP), A0 // arg 1 &8(SP) + MOV $SYS_thr_self, T0 + ECALL + MOV ptr-8(SP), A0 + MOV A0, ret+0(FP) + RET + +// func thr_kill(t thread, sig int) +TEXT runtime·thr_kill(SB),NOSPLIT,$0-16 + MOV tid+0(FP), A0 // arg 1 pid + MOV sig+8(FP), A1 // arg 2 sig + MOV $SYS_thr_kill, T0 + ECALL + RET + +// func raiseproc(sig uint32) +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_getpid, T0 + ECALL + // arg 1 pid - already in A0 + MOVW sig+0(FP), A1 // arg 2 + MOV $SYS_kill, T0 + ECALL + RET + +// func setitimer(mode int32, new, old *itimerval) +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), A0 + MOV new+8(FP), A1 + MOV old+16(FP), A2 + MOV $SYS_setitimer, T0 + ECALL + RET + +// func fallback_walltime() (sec int64, nsec int32) +TEXT runtime·fallback_walltime(SB),NOSPLIT,$24-12 + MOV $CLOCK_REALTIME, A0 + MOV $8(X2), A1 + MOV $SYS_clock_gettime, T0 + ECALL + MOV 8(X2), T0 // sec + MOVW 16(X2), T1 // nsec + MOV T0, sec+0(FP) + MOVW T1, nsec+8(FP) + RET + +// func fallback_nanotime() int64 +TEXT runtime·fallback_nanotime(SB),NOSPLIT,$24-8 + MOV $CLOCK_MONOTONIC, A0 + MOV $8(X2), A1 + MOV $SYS_clock_gettime, T0 + ECALL + MOV 8(X2), T0 // sec + MOV 16(X2), T1 // nsec + + // sec is in T0, nsec in T1 + // return nsec in T0 + MOV $1000000000, T2 + MUL T2, T0 + ADD T1, T0 + + MOV T0, ret+0(FP) + RET + +// func asmSigaction(sig uintptr, new, old *sigactiont) int32 +TEXT 
runtime·asmSigaction(SB),NOSPLIT|NOFRAME,$0 + MOV sig+0(FP), A0 // arg 1 sig + MOV new+8(FP), A1 // arg 2 act + MOV old+16(FP), A2 // arg 3 oact + MOV $SYS_sigaction, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+24(FP) + RET + +// func sigfwd(fn uintptr, sig uint32, info *siginfo, ctx unsafe.Pointer) +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), A0 + MOV info+16(FP), A1 + MOV ctx+24(FP), A2 + MOV fn+0(FP), T1 + JALR RA, T1 + RET + +// func sigtramp(signo, ureg, ctxt unsafe.Pointer) +TEXT runtime·sigtramp(SB),NOSPLIT,$64 + MOVW A0, 8(X2) + MOV A1, 16(X2) + MOV A2, 24(X2) + + // this might be called in external code context, + // where g is not set. + MOVBU runtime·iscgo(SB), A0 + BEQ A0, ZERO, ok + CALL runtime·load_g(SB) +ok: + MOV $runtime·sigtrampgo(SB), A0 + JALR RA, A0 + RET + +// func mmap(addr uintptr, n uintptr, prot int, flags int, fd int, off int64) (ret uintptr, err error) +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOVW prot+16(FP), A2 + MOVW flags+20(FP), A3 + MOVW fd+24(FP), A4 + MOVW off+28(FP), A5 + MOV $SYS_mmap, T0 + ECALL + BNE T0, ZERO, fail + MOV A0, p+32(FP) + MOV ZERO, err+40(FP) + RET +fail: + MOV ZERO, p+32(FP) + MOV A0, err+40(FP) + RET + +// func munmap(addr uintptr, n uintptr) (err error) +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOV $SYS_munmap, T0 + ECALL + BNE T0, ZERO, fail + RET +fail: + WORD $0 // crash + +// func madvise(addr unsafe.Pointer, n uintptr, flags int32) int32 +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOVW flags+16(FP), A2 + MOV $SYS_madvise, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+24(FP) + RET + +// func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOV mib+0(FP), A0 + MOV miblen+8(FP), A1 + MOV out+16(FP), A2 + MOV size+24(FP), A3 + MOV dst+32(FP), A4 + MOV 
ndst+40(FP), A5 + MOV $SYS___sysctl, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+48(FP) + RET + +// func sigaltstack(new, old *stackt) +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOV new+0(FP), A0 + MOV old+8(FP), A1 + MOV $SYS_sigaltstack, T0 + ECALL + BNE T0, ZERO, fail + RET +fail: + WORD $0 // crash + +// func osyield() +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_sched_yield, T0 + ECALL + RET + +// func sigprocmask(how int32, new, old *sigset) +TEXT runtime·sigprocmask(SB),NOSPLIT|NOFRAME,$0-24 + MOVW how+0(FP), A0 + MOV new+8(FP), A1 + MOV old+16(FP), A2 + MOV $SYS_sigprocmask, T0 + ECALL + BNE T0, ZERO, fail + RET +fail: + WORD $0 // crash + + +// func cpuset_getaffinity(level int, which int, id int64, size int, mask *byte) int32 +TEXT runtime·cpuset_getaffinity(SB),NOSPLIT|NOFRAME,$0-44 + MOV level+0(FP), A0 + MOV which+8(FP), A1 + MOV id+16(FP), A2 + MOV size+24(FP), A3 + MOV mask+32(FP), A4 + MOV $SYS_cpuset_getaffinity, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+40(FP) + RET + +// func kqueue() int32 +TEXT runtime·kqueue(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_kqueue, T0 + ECALL + BEQ T0, ZERO, ok + MOV $-1, A0 +ok: + MOVW A0, ret+0(FP) + RET + +// func kevent(kq int, ch unsafe.Pointer, nch int, ev unsafe.Pointer, nev int, ts *Timespec) (n int, err error) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVW kq+0(FP), A0 + MOV ch+8(FP), A1 + MOVW nch+16(FP), A2 + MOV ev+24(FP), A3 + MOVW nev+32(FP), A4 + MOV ts+40(FP), A5 + MOV $SYS_kevent, T0 + ECALL + BEQ T0, ZERO, ok + NEG A0, A0 +ok: + MOVW A0, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), A0 + MOVW cmd+4(FP), A1 + MOVW arg+8(FP), A2 + MOV $SYS_fcntl, T0 + ECALL + BEQ T0, ZERO, noerr + MOV $-1, A1 + MOVW A1, ret+16(FP) + MOVW A0, errno+20(FP) + RET +noerr: + MOVW A0, ret+16(FP) + MOVW ZERO, errno+20(FP) + RET + +// func closeonexec(fd int32) +TEXT 
runtime·closeonexec(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), A0 + MOV $F_SETFD, A1 + MOV $FD_CLOEXEC, A2 + MOV $SYS_fcntl, T0 + ECALL + RET + +// func getCntxct() uint32 +TEXT runtime·getCntxct(SB),NOSPLIT|NOFRAME,$0 + RDTIME A0 + MOVW A0, ret+0(FP) + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_issetugid, T0 + ECALL + MOVW A0, ret+0(FP) + RET + diff --git a/src/runtime/sys_libc.go b/src/runtime/sys_libc.go new file mode 100644 index 0000000..0c6f13c --- /dev/null +++ b/src/runtime/sys_libc.go @@ -0,0 +1,54 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build darwin || (openbsd && !mips64) + +package runtime + +import "unsafe" + +// Call fn with arg as its argument. Return what fn returns. +// fn is the raw pc value of the entry point of the desired function. +// Switches to the system stack, if not already there. +// Preserves the calling point as the location where a profiler traceback will begin. +// +//go:nosplit +func libcCall(fn, arg unsafe.Pointer) int32 { + // Leave caller's PC/SP/G around for traceback. + gp := getg() + var mp *m + if gp != nil { + mp = gp.m + } + if mp != nil && mp.libcallsp == 0 { + mp.libcallg.set(gp) + mp.libcallpc = getcallerpc() + // sp must be the last, because once async cpu profiler finds + // all three values to be non-zero, it will use them + mp.libcallsp = getcallersp() + } else { + // Make sure we don't reset libcallsp. This makes + // libcCall reentrant; We remember the g/pc/sp for the + // first call on an M, until that libcCall instance + // returns. Reentrance only matters for signals, as + // libc never calls back into Go. The tricky case is + // where we call libcX from an M and record g/pc/sp. + // Before that call returns, a signal arrives on the + // same M and the signal handling code calls another + // libc function. 
We don't want that second libcCall + // from within the handler to be recorded, and we + // don't want that call's completion to zero + // libcallsp. + // We don't need to set libcall* while we're in a sighandler + // (even if we're not currently in libc) because we block all + // signals while we're handling a signal. That includes the + // profile signal, which is the one that uses the libcall* info. + mp = nil + } + res := asmcgocall(fn, arg) + if mp != nil { + mp.libcallsp = 0 + } + return res +} diff --git a/src/runtime/sys_linux_386.s b/src/runtime/sys_linux_386.s new file mode 100644 index 0000000..12a2941 --- /dev/null +++ b/src/runtime/sys_linux_386.s @@ -0,0 +1,762 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for 386, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// Most linux systems use glibc's dynamic linker, which puts the +// __kernel_vsyscall vdso helper at 0x10(GS) for easy access from position +// independent code and setldt in runtime does the same in the statically +// linked case. However, systems that use alternative libc such as Android's +// bionic and musl, do not save the helper anywhere, and so the only way to +// invoke a syscall from position independent code is boring old int $0x80 +// (which is also what syscall wrappers in bionic/musl use). +// +// The benchmarks also showed that using int $0x80 is as fast as calling +// *%gs:0x10 except on AMD Opteron. See https://golang.org/cl/19833 +// for the benchmark program and raw data. 
+//#define INVOKE_SYSCALL CALL 0x10(GS) // non-portable +#define INVOKE_SYSCALL INT $0x80 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_access 33 +#define SYS_kill 37 +#define SYS_brk 45 +#define SYS_munmap 91 +#define SYS_socketcall 102 +#define SYS_setittimer 104 +#define SYS_clone 120 +#define SYS_sched_yield 158 +#define SYS_nanosleep 162 +#define SYS_rt_sigreturn 173 +#define SYS_rt_sigaction 174 +#define SYS_rt_sigprocmask 175 +#define SYS_sigaltstack 186 +#define SYS_mmap2 192 +#define SYS_mincore 218 +#define SYS_madvise 219 +#define SYS_gettid 224 +#define SYS_futex 240 +#define SYS_sched_getaffinity 242 +#define SYS_set_thread_area 243 +#define SYS_exit_group 252 +#define SYS_timer_create 259 +#define SYS_timer_settime 260 +#define SYS_timer_delete 263 +#define SYS_clock_gettime 265 +#define SYS_tgkill 270 +#define SYS_pipe2 331 + +TEXT runtime·exit(SB),NOSPLIT,$0 + MOVL $SYS_exit_group, AX + MOVL code+0(FP), BX + INVOKE_SYSCALL + INT $3 // not reached + RET + +TEXT exit1<>(SB),NOSPLIT,$0 + MOVL $SYS_exit, AX + MOVL code+0(FP), BX + INVOKE_SYSCALL + INT $3 // not reached + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVL wait+0(FP), AX + // We're done using the stack. + MOVL $0, (AX) + MOVL $1, AX // exit (just this thread) + MOVL $0, BX // exit code + INT $0x80 // no stack; must not use CALL + // We may not even have a stack any more. 
+ INT $3 + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$0 + MOVL $SYS_open, AX + MOVL name+0(FP), BX + MOVL mode+4(FP), CX + MOVL perm+8(FP), DX + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS 2(PC) + MOVL $-1, AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0 + MOVL $SYS_close, AX + MOVL fd+0(FP), BX + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS 2(PC) + MOVL $-1, AX + MOVL AX, ret+4(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$0 + MOVL $SYS_write, AX + MOVL fd+0(FP), BX + MOVL p+4(FP), CX + MOVL n+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$0 + MOVL $SYS_read, AX + MOVL fd+0(FP), BX + MOVL p+4(FP), CX + MOVL n+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-16 + MOVL $SYS_pipe2, AX + LEAL r+4(FP), BX + MOVL flags+0(FP), CX + INVOKE_SYSCALL + MOVL AX, errno+12(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$8 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVL AX, 0(SP) + MOVL $1000, AX // usec to nsec + MULL DX + MOVL AX, 4(SP) + + // nanosleep(&ts, 0) + MOVL $SYS_nanosleep, AX + LEAL 0(SP), BX + MOVL $0, CX + INVOKE_SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVL $SYS_gettid, AX + INVOKE_SYSCALL + MOVL AX, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT,$12 + MOVL $SYS_getpid, AX + INVOKE_SYSCALL + MOVL AX, BX // arg 1 pid + MOVL $SYS_gettid, AX + INVOKE_SYSCALL + MOVL AX, CX // arg 2 tid + MOVL sig+0(FP), DX // arg 3 signal + MOVL $SYS_tgkill, AX + INVOKE_SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$12 + MOVL $SYS_getpid, AX + INVOKE_SYSCALL + MOVL AX, BX // arg 1 pid + MOVL sig+0(FP), CX // arg 2 signal + MOVL $SYS_kill, AX + INVOKE_SYSCALL + RET + +TEXT ·getpid(SB),NOSPLIT,$0-4 + MOVL $SYS_getpid, AX + INVOKE_SYSCALL + MOVL AX, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT,$0 + MOVL $SYS_tgkill, AX + MOVL tgid+0(FP), BX + MOVL tid+4(FP), CX + MOVL sig+8(FP), DX + 
INVOKE_SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$0-12 + MOVL $SYS_setittimer, AX + MOVL mode+0(FP), BX + MOVL new+4(FP), CX + MOVL old+8(FP), DX + INVOKE_SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-16 + MOVL $SYS_timer_create, AX + MOVL clockid+0(FP), BX + MOVL sevp+4(FP), CX + MOVL timerid+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-20 + MOVL $SYS_timer_settime, AX + MOVL timerid+0(FP), BX + MOVL flags+4(FP), CX + MOVL new+8(FP), DX + MOVL old+12(FP), SI + INVOKE_SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-8 + MOVL $SYS_timer_delete, AX + MOVL timerid+0(FP), BX + INVOKE_SYSCALL + MOVL AX, ret+4(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT,$0-16 + MOVL $SYS_mincore, AX + MOVL addr+0(FP), BX + MOVL n+4(FP), CX + MOVL dst+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $8-12 + // We don't know how much stack space the VDSO code will need, + // so switch to g0. + + MOVL SP, BP // Save old SP; BP unchanged by C code. + + get_tls(CX) + MOVL g(CX), AX + MOVL g_m(AX), SI // SI unchanged by C code. + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVL m_vdsoPC(SI), CX + MOVL m_vdsoSP(SI), DX + MOVL CX, 0(SP) + MOVL DX, 4(SP) + + LEAL sec+0(FP), DX + MOVL -4(DX), CX + MOVL CX, m_vdsoPC(SI) + MOVL DX, m_vdsoSP(SI) + + CMPL AX, m_curg(SI) // Only switch if on curg. 
+ JNE noswitch + + MOVL m_g0(SI), DX + MOVL (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack + +noswitch: + SUBL $16, SP // Space for results + ANDL $~15, SP // Align for C code + + // Stack layout, depending on call path: + // x(SP) vDSO INVOKE_SYSCALL + // 12 ts.tv_nsec ts.tv_nsec + // 8 ts.tv_sec ts.tv_sec + // 4 &ts - + // 0 CLOCK_<id> - + + MOVL runtime·vdsoClockgettimeSym(SB), AX + CMPL AX, $0 + JEQ fallback + + LEAL 8(SP), BX // &ts (struct timespec) + MOVL BX, 4(SP) + MOVL $0, 0(SP) // CLOCK_REALTIME + CALL AX + JMP finish + +fallback: + MOVL $SYS_clock_gettime, AX + MOVL $0, BX // CLOCK_REALTIME + LEAL 8(SP), CX + INVOKE_SYSCALL + +finish: + MOVL 8(SP), AX // sec + MOVL 12(SP), BX // nsec + + MOVL BP, SP // Restore real SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVL 4(SP), CX + MOVL CX, m_vdsoSP(SI) + MOVL 0(SP), CX + MOVL CX, m_vdsoPC(SI) + + // sec is in AX, nsec in BX + MOVL AX, sec_lo+0(FP) + MOVL $0, sec_hi+4(FP) + MOVL BX, nsec+8(FP) + RET + +// int64 nanotime(void) so really +// void nanotime(int64 *nsec) +TEXT runtime·nanotime1(SB), NOSPLIT, $8-8 + // Switch to g0 stack. See comment above in runtime·walltime. + + MOVL SP, BP // Save old SP; BP unchanged by C code. + + get_tls(CX) + MOVL g(CX), AX + MOVL g_m(AX), SI // SI unchanged by C code. + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVL m_vdsoPC(SI), CX + MOVL m_vdsoSP(SI), DX + MOVL CX, 0(SP) + MOVL DX, 4(SP) + + LEAL ret+0(FP), DX + MOVL -4(DX), CX + MOVL CX, m_vdsoPC(SI) + MOVL DX, m_vdsoSP(SI) + + CMPL AX, m_curg(SI) // Only switch if on curg. 
+ JNE noswitch + + MOVL m_g0(SI), DX + MOVL (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack + +noswitch: + SUBL $16, SP // Space for results + ANDL $~15, SP // Align for C code + + MOVL runtime·vdsoClockgettimeSym(SB), AX + CMPL AX, $0 + JEQ fallback + + LEAL 8(SP), BX // &ts (struct timespec) + MOVL BX, 4(SP) + MOVL $1, 0(SP) // CLOCK_MONOTONIC + CALL AX + JMP finish + +fallback: + MOVL $SYS_clock_gettime, AX + MOVL $1, BX // CLOCK_MONOTONIC + LEAL 8(SP), CX + INVOKE_SYSCALL + +finish: + MOVL 8(SP), AX // sec + MOVL 12(SP), BX // nsec + + MOVL BP, SP // Restore real SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVL 4(SP), CX + MOVL CX, m_vdsoSP(SI) + MOVL 0(SP), CX + MOVL CX, m_vdsoPC(SI) + + // sec is in AX, nsec in BX + // convert to DX:AX nsec + MOVL $1000000000, CX + MULL CX + ADDL BX, AX + ADCL $0, DX + + MOVL AX, ret_lo+0(FP) + MOVL DX, ret_hi+4(FP) + RET + +TEXT runtime·rtsigprocmask(SB),NOSPLIT,$0 + MOVL $SYS_rt_sigprocmask, AX + MOVL how+0(FP), BX + MOVL new+4(FP), CX + MOVL old+8(FP), DX + MOVL size+12(FP), SI + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS 2(PC) + INT $3 + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT,$0 + MOVL $SYS_rt_sigaction, AX + MOVL sig+0(FP), BX + MOVL new+4(FP), CX + MOVL old+8(FP), DX + MOVL size+12(FP), SI + INVOKE_SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$12-16 + MOVL fn+0(FP), AX + MOVL sig+4(FP), BX + MOVL info+8(FP), CX + MOVL ctx+12(FP), DX + MOVL SP, SI + SUBL $32, SP + ANDL $-15, SP // align stack: handler might be a C function + MOVL BX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + MOVL SI, 12(SP) // save SI: handler might be a Go function + CALL AX + MOVL 12(SP), AX + MOVL AX, SP + RET + +// Called using C ABI. 
+TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$28 + // Save callee-saved C registers, since the caller may be a C signal handler. + MOVL BX, bx-4(SP) + MOVL BP, bp-8(SP) + MOVL SI, si-12(SP) + MOVL DI, di-16(SP) + // We don't save mxcsr or the x87 control word because sigtrampgo doesn't + // modify them. + + MOVL (28+4)(SP), BX + MOVL BX, 0(SP) + MOVL (28+8)(SP), BX + MOVL BX, 4(SP) + MOVL (28+12)(SP), BX + MOVL BX, 8(SP) + CALL runtime·sigtrampgo(SB) + + MOVL di-16(SP), DI + MOVL si-12(SP), SI + MOVL bp-8(SP), BP + MOVL bx-4(SP), BX + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + JMP runtime·sigtramp(SB) + +TEXT runtime·sigreturn(SB),NOSPLIT,$0 + MOVL $SYS_rt_sigreturn, AX + // Sigreturn expects same SP as signal handler, + // so cannot CALL 0x10(GS) here. + INT $0x80 + INT $3 // not reached + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVL $SYS_mmap2, AX + MOVL addr+0(FP), BX + MOVL n+4(FP), CX + MOVL prot+8(FP), DX + MOVL flags+12(FP), SI + MOVL fd+16(FP), DI + MOVL off+20(FP), BP + SHRL $12, BP + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS ok + NOTL AX + INCL AX + MOVL $0, p+24(FP) + MOVL AX, err+28(FP) + RET +ok: + MOVL AX, p+24(FP) + MOVL $0, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVL $SYS_munmap, AX + MOVL addr+0(FP), BX + MOVL n+4(FP), CX + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS 2(PC) + INT $3 + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVL $SYS_madvise, AX + MOVL addr+0(FP), BX + MOVL n+4(FP), CX + MOVL flags+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// int32 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT,$0 + MOVL $SYS_futex, AX + MOVL addr+0(FP), BX + MOVL op+4(FP), CX + MOVL val+8(FP), DX + MOVL ts+12(FP), SI + MOVL addr2+16(FP), DI + MOVL val3+20(FP), BP + INVOKE_SYSCALL + MOVL AX, ret+24(FP) + RET + +// int32 clone(int32 flags, void *stack, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT,$0 + MOVL $SYS_clone, 
AX + MOVL flags+0(FP), BX + MOVL stk+4(FP), CX + MOVL $0, DX // parent tid ptr + MOVL $0, DI // child tid ptr + + // Copy mp, gp, fn off parent stack for use by child. + SUBL $16, CX + MOVL mp+8(FP), SI + MOVL SI, 0(CX) + MOVL gp+12(FP), SI + MOVL SI, 4(CX) + MOVL fn+16(FP), SI + MOVL SI, 8(CX) + MOVL $1234, 12(CX) + + // cannot use CALL 0x10(GS) here, because the stack changes during the + // system call (after CALL 0x10(GS), the child is still using the + // parent's stack when executing its RET instruction). + INT $0x80 + + // In parent, return. + CMPL AX, $0 + JEQ 3(PC) + MOVL AX, ret+20(FP) + RET + + // Paranoia: check that SP is as we expect. + NOP SP // tell vet SP changed - stop checking offsets + MOVL 12(SP), BP + CMPL BP, $1234 + JEQ 2(PC) + INT $3 + + // Initialize AX to Linux tid + MOVL $SYS_gettid, AX + INVOKE_SYSCALL + + MOVL 0(SP), BX // m + MOVL 4(SP), DX // g + MOVL 8(SP), SI // fn + + CMPL BX, $0 + JEQ nog + CMPL DX, $0 + JEQ nog + + MOVL AX, m_procid(BX) // save tid as m->procid + + // set up ldt 7+id to point at m->tls. + LEAL m_tls(BX), BP + MOVL m_id(BX), DI + ADDL $7, DI // m0 is LDT#7. count up. + // setldt(tls#, &tls, sizeof tls) + PUSHAL // save registers + PUSHL $32 // sizeof tls + PUSHL BP // &tls + PUSHL DI // tls # + CALL runtime·setldt(SB) + POPL AX + POPL AX + POPL AX + POPAL + + // Now segment is established. Initialize m, g. 
+ get_tls(AX) + MOVL DX, g(AX) + MOVL BX, g_m(DX) + + CALL runtime·stackcheck(SB) // smashes AX, CX + MOVL 0(DX), DX // paranoia; check they are not nil + MOVL 0(BX), BX + + // more paranoia; check that stack splitting code works + PUSHAL + CALL runtime·emptyfunc(SB) + POPAL + +nog: + CALL SI // fn() + CALL exit1<>(SB) + MOVL $0x1234, 0x1005 + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVL $SYS_sigaltstack, AX + MOVL new+0(FP), BX + MOVL old+4(FP), CX + INVOKE_SYSCALL + CMPL AX, $0xfffff001 + JLS 2(PC) + INT $3 + RET + +// <asm-i386/ldt.h> +// struct user_desc { +// unsigned int entry_number; +// unsigned long base_addr; +// unsigned int limit; +// unsigned int seg_32bit:1; +// unsigned int contents:2; +// unsigned int read_exec_only:1; +// unsigned int limit_in_pages:1; +// unsigned int seg_not_present:1; +// unsigned int useable:1; +// }; +#define SEG_32BIT 0x01 +// contents are the 2 bits 0x02 and 0x04. +#define CONTENTS_DATA 0x00 +#define CONTENTS_STACK 0x02 +#define CONTENTS_CODE 0x04 +#define READ_EXEC_ONLY 0x08 +#define LIMIT_IN_PAGES 0x10 +#define SEG_NOT_PRESENT 0x20 +#define USEABLE 0x40 + +// `-1` means the kernel will pick a TLS entry on the first setldt call, +// which happens during runtime init, and that we'll store back the saved +// entry and reuse that on subsequent calls when creating new threads. +DATA runtime·tls_entry_number+0(SB)/4, $-1 +GLOBL runtime·tls_entry_number(SB), NOPTR, $4 + +// setldt(int entry, int address, int limit) +// We use set_thread_area, which mucks with the GDT, instead of modify_ldt, +// which would modify the LDT, but is disabled on some kernels. +// The name, setldt, is a misnomer, although we leave this name as it is for +// the compatibility with other platforms. +TEXT runtime·setldt(SB),NOSPLIT,$32 + MOVL base+4(FP), DX + +#ifdef GOOS_android + // Android stores the TLS offset in runtime·tls_g. 
+ SUBL runtime·tls_g(SB), DX + MOVL DX, 0(DX) +#else + /* + * When linking against the system libraries, + * we use its pthread_create and let it set up %gs + * for us. When we do that, the private storage + * we get is not at 0(GS), but -4(GS). + * To insulate the rest of the tool chain from this + * ugliness, 8l rewrites 0(TLS) into -4(GS) for us. + * To accommodate that rewrite, we translate + * the address here and bump the limit to 0xffffffff (no limit) + * so that -4(GS) maps to 0(address). + * Also, the final 0(GS) (current 4(DX)) has to point + * to itself, to mimic ELF. + */ + ADDL $0x4, DX // address + MOVL DX, 0(DX) +#endif + + // get entry number + MOVL runtime·tls_entry_number(SB), CX + + // set up user_desc + LEAL 16(SP), AX // struct user_desc + MOVL CX, 0(AX) // unsigned int entry_number + MOVL DX, 4(AX) // unsigned long base_addr + MOVL $0xfffff, 8(AX) // unsigned int limit + MOVL $(SEG_32BIT|LIMIT_IN_PAGES|USEABLE|CONTENTS_DATA), 12(AX) // flag bits + + // call set_thread_area + MOVL AX, BX // user_desc + MOVL $SYS_set_thread_area, AX + // We can't call this via 0x10(GS) because this is called from setldt0 to set that up. 
+ INT $0x80 + + // breakpoint on error + CMPL AX, $0xfffff001 + JLS 2(PC) + INT $3 + + // read allocated entry number back out of user_desc + LEAL 16(SP), AX // get our user_desc back + MOVL 0(AX), AX + + // store entry number if the kernel allocated it + CMPL CX, $-1 + JNE 2(PC) + MOVL AX, runtime·tls_entry_number(SB) + + // compute segment selector - (entry*8+3) + SHLL $3, AX + ADDL $3, AX + MOVW AX, GS + + RET + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVL $SYS_sched_yield, AX + INVOKE_SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT,$0 + MOVL $SYS_sched_getaffinity, AX + MOVL pid+0(FP), BX + MOVL len+4(FP), CX + MOVL buf+8(FP), DX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// int access(const char *name, int mode) +TEXT runtime·access(SB),NOSPLIT,$0 + MOVL $SYS_access, AX + MOVL name+0(FP), BX + MOVL mode+4(FP), CX + INVOKE_SYSCALL + MOVL AX, ret+8(FP) + RET + +// int connect(int fd, const struct sockaddr *addr, socklen_t addrlen) +TEXT runtime·connect(SB),NOSPLIT,$0-16 + // connect is implemented as socketcall(NR_socket, 3, *(rest of args)) + // stack already should have fd, addr, addrlen. + MOVL $SYS_socketcall, AX + MOVL $3, BX // connect + LEAL fd+0(FP), CX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// int socket(int domain, int type, int protocol) +TEXT runtime·socket(SB),NOSPLIT,$0-16 + // socket is implemented as socketcall(NR_socket, 1, *(rest of args)) + // stack already should have domain, type, protocol. + MOVL $SYS_socketcall, AX + MOVL $1, BX // socket + LEAL domain+0(FP), CX + INVOKE_SYSCALL + MOVL AX, ret+12(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-4 + // Implemented as brk(NULL). + MOVL $SYS_brk, AX + MOVL $0, BX // NULL + INVOKE_SYSCALL + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_linux_amd64.s b/src/runtime/sys_linux_amd64.s new file mode 100644 index 0000000..c7a89ba --- /dev/null +++ b/src/runtime/sys_linux_amd64.s @@ -0,0 +1,703 @@ +// Copyright 2009 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for AMD64, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +#define AT_FDCWD -100 + +#define SYS_read 0 +#define SYS_write 1 +#define SYS_close 3 +#define SYS_mmap 9 +#define SYS_munmap 11 +#define SYS_brk 12 +#define SYS_rt_sigaction 13 +#define SYS_rt_sigprocmask 14 +#define SYS_rt_sigreturn 15 +#define SYS_sched_yield 24 +#define SYS_mincore 27 +#define SYS_madvise 28 +#define SYS_nanosleep 35 +#define SYS_setittimer 38 +#define SYS_getpid 39 +#define SYS_socket 41 +#define SYS_connect 42 +#define SYS_clone 56 +#define SYS_exit 60 +#define SYS_kill 62 +#define SYS_sigaltstack 131 +#define SYS_arch_prctl 158 +#define SYS_gettid 186 +#define SYS_futex 202 +#define SYS_sched_getaffinity 204 +#define SYS_timer_create 222 +#define SYS_timer_settime 223 +#define SYS_timer_delete 226 +#define SYS_clock_gettime 228 +#define SYS_exit_group 231 +#define SYS_tgkill 234 +#define SYS_openat 257 +#define SYS_faccessat 269 +#define SYS_pipe2 293 + +TEXT runtime·exit(SB),NOSPLIT,$0-4 + MOVL code+0(FP), DI + MOVL $SYS_exit_group, AX + SYSCALL + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-8 + MOVQ wait+0(FP), AX + // We're done using the stack. + MOVL $0, (AX) + MOVL $0, DI // exit code + MOVL $SYS_exit, AX + SYSCALL + // We may not even have a stack any more. + INT $3 + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$0-20 + // This uses openat instead of open, because Android O blocks open. 
+ MOVL $AT_FDCWD, DI // AT_FDCWD, so this acts like open + MOVQ name+0(FP), SI + MOVL mode+8(FP), DX + MOVL perm+12(FP), R10 + MOVL $SYS_openat, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $-1, AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0-12 + MOVL fd+0(FP), DI + MOVL $SYS_close, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $-1, AX + MOVL AX, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$0-28 + MOVQ fd+0(FP), DI + MOVQ p+8(FP), SI + MOVL n+16(FP), DX + MOVL $SYS_write, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$0-28 + MOVL fd+0(FP), DI + MOVQ p+8(FP), SI + MOVL n+16(FP), DX + MOVL $SYS_read, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-20 + LEAQ r+8(FP), DI + MOVL flags+0(FP), SI + MOVL $SYS_pipe2, AX + SYSCALL + MOVL AX, errno+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVQ AX, 0(SP) + MOVL $1000, AX // usec to nsec + MULL DX + MOVQ AX, 8(SP) + + // nanosleep(&ts, 0) + MOVQ SP, DI + MOVL $0, SI + MOVL $SYS_nanosleep, AX + SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVL $SYS_gettid, AX + SYSCALL + MOVL AX, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT,$0 + MOVL $SYS_getpid, AX + SYSCALL + MOVL AX, R12 + MOVL $SYS_gettid, AX + SYSCALL + MOVL AX, SI // arg 2 tid + MOVL R12, DI // arg 1 pid + MOVL sig+0(FP), DX // arg 3 + MOVL $SYS_tgkill, AX + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$0 + MOVL $SYS_getpid, AX + SYSCALL + MOVL AX, DI // arg 1 pid + MOVL sig+0(FP), SI // arg 2 + MOVL $SYS_kill, AX + SYSCALL + RET + +TEXT ·getpid(SB),NOSPLIT,$0-8 + MOVL $SYS_getpid, AX + SYSCALL + MOVQ AX, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT,$0 + MOVQ tgid+0(FP), DI + MOVQ tid+8(FP), SI + MOVQ sig+16(FP), DX + MOVL $SYS_tgkill, AX + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$0-24 + 
MOVL mode+0(FP), DI + MOVQ new+8(FP), SI + MOVQ old+16(FP), DX + MOVL $SYS_setittimer, AX + SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVL clockid+0(FP), DI + MOVQ sevp+8(FP), SI + MOVQ timerid+16(FP), DX + MOVL $SYS_timer_create, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVL timerid+0(FP), DI + MOVL flags+4(FP), SI + MOVQ new+8(FP), DX + MOVQ old+16(FP), R10 + MOVL $SYS_timer_settime, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVL timerid+0(FP), DI + MOVL $SYS_timer_delete, AX + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT,$0-28 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVQ dst+16(FP), DX + MOVL $SYS_mincore, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +// func nanotime1() int64 +TEXT runtime·nanotime1(SB),NOSPLIT,$16-8 + // We don't know how much stack space the VDSO code will need, + // so switch to g0. + // In particular, a kernel configured with CONFIG_OPTIMIZE_INLINING=n + // and hardening can use a full page of stack space in gettime_sym + // due to stack probes inserted to avoid stack/heap collisions. + // See issue #20427. + + MOVQ SP, R12 // Save old SP; R12 unchanged by C code. + + MOVQ g_m(R14), BX // BX unchanged by C code. + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVQ m_vdsoPC(BX), CX + MOVQ m_vdsoSP(BX), DX + MOVQ CX, 0(SP) + MOVQ DX, 8(SP) + + LEAQ ret+0(FP), DX + MOVQ -8(DX), CX + MOVQ CX, m_vdsoPC(BX) + MOVQ DX, m_vdsoSP(BX) + + CMPQ R14, m_curg(BX) // Only switch if on curg. 
+ JNE noswitch + + MOVQ m_g0(BX), DX + MOVQ (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack + +noswitch: + SUBQ $16, SP // Space for results + ANDQ $~15, SP // Align for C code + + MOVL $1, DI // CLOCK_MONOTONIC + LEAQ 0(SP), SI + MOVQ runtime·vdsoClockgettimeSym(SB), AX + CMPQ AX, $0 + JEQ fallback + CALL AX +ret: + MOVQ 0(SP), AX // sec + MOVQ 8(SP), DX // nsec + MOVQ R12, SP // Restore real SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVQ 8(SP), CX + MOVQ CX, m_vdsoSP(BX) + MOVQ 0(SP), CX + MOVQ CX, m_vdsoPC(BX) + // sec is in AX, nsec in DX + // return nsec in AX + IMULQ $1000000000, AX + ADDQ DX, AX + MOVQ AX, ret+0(FP) + RET +fallback: + MOVQ $SYS_clock_gettime, AX + SYSCALL + JMP ret + +TEXT runtime·rtsigprocmask(SB),NOSPLIT,$0-28 + MOVL how+0(FP), DI + MOVQ new+8(FP), SI + MOVQ old+16(FP), DX + MOVL size+24(FP), R10 + MOVL $SYS_rt_sigprocmask, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT,$0-36 + MOVQ sig+0(FP), DI + MOVQ new+8(FP), SI + MOVQ old+16(FP), DX + MOVQ size+24(FP), R10 + MOVL $SYS_rt_sigaction, AX + SYSCALL + MOVL AX, ret+32(FP) + RET + +// Call the function stored in _cgo_sigaction using the GCC calling convention. 
+TEXT runtime·callCgoSigaction(SB),NOSPLIT,$16 + MOVQ sig+0(FP), DI + MOVQ new+8(FP), SI + MOVQ old+16(FP), DX + MOVQ _cgo_sigaction(SB), AX + MOVQ SP, BX // callee-saved + ANDQ $~15, SP // alignment as per amd64 psABI + CALL AX + MOVQ BX, SP + MOVL AX, ret+24(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Called using C ABI. +TEXT runtime·sigprofNonGoWrapper<>(SB),NOSPLIT,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigprofNonGo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// Used instead of sigtramp in programs that use cgo. +// Arguments from kernel are in DI, SI, DX. +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + // If no traceback function, do usual sigtramp. + MOVQ runtime·cgoTraceback(SB), AX + TESTQ AX, AX + JZ sigtramp + + // If no traceback support function, which means that + // runtime/cgo was not linked in, do usual sigtramp. 
+ MOVQ _cgo_callers(SB), AX + TESTQ AX, AX + JZ sigtramp + + // Figure out if we are currently in a cgo call. + // If not, just do usual sigtramp. + get_tls(CX) + MOVQ g(CX),AX + TESTQ AX, AX + JZ sigtrampnog // g == nil + MOVQ g_m(AX), AX + TESTQ AX, AX + JZ sigtramp // g.m == nil + MOVL m_ncgo(AX), CX + TESTL CX, CX + JZ sigtramp // g.m.ncgo == 0 + MOVQ m_curg(AX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg == nil + MOVQ g_syscallsp(CX), CX + TESTQ CX, CX + JZ sigtramp // g.m.curg.syscallsp == 0 + MOVQ m_cgoCallers(AX), R8 + TESTQ R8, R8 + JZ sigtramp // g.m.cgoCallers == nil + MOVL m_cgoCallersUse(AX), CX + TESTL CX, CX + JNZ sigtramp // g.m.cgoCallersUse != 0 + + // Jump to a function in runtime/cgo. + // That function, written in C, will call the user's traceback + // function with proper unwind info, and will then call back here. + // The first three arguments, and the fifth, are already in registers. + // Set the two remaining arguments now. + MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigtramp(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +sigtramp: + JMP runtime·sigtramp(SB) + +sigtrampnog: + // Signal arrived on a non-Go thread. If this is SIGPROF, get a + // stack trace. + CMPL DI, $27 // 27 == SIGPROF + JNZ sigtramp + + // Lock sigprofCallersUse. + MOVL $0, AX + MOVL $1, CX + MOVQ $runtime·sigprofCallersUse(SB), R11 + LOCK + CMPXCHGL CX, 0(R11) + JNZ sigtramp // Skip stack trace if already locked. + + // Jump to the traceback function in runtime/cgo. + // It will call back to sigprofNonGo, via sigprofNonGoWrapper, to convert + // the arguments to the Go calling convention. + // First three arguments to traceback function are in registers already. + MOVQ runtime·cgoTraceback(SB), CX + MOVQ $runtime·sigprofCallers(SB), R8 + MOVQ $runtime·sigprofNonGoWrapper<>(SB), R9 + MOVQ _cgo_callers(SB), AX + JMP AX + +// For cgo unwinding to work, this function must look precisely like +// the one in glibc. 
The glibc source code is: +// https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/x86_64/sigaction.c +// The code that cares about the precise instructions used is: +// https://gcc.gnu.org/viewcvs/gcc/trunk/libgcc/config/i386/linux-unwind.h?revision=219188&view=markup +TEXT runtime·sigreturn(SB),NOSPLIT,$0 + MOVQ $SYS_rt_sigreturn, AX + SYSCALL + INT $3 // not reached + +TEXT runtime·sysMmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL prot+16(FP), DX + MOVL flags+20(FP), R10 + MOVL fd+24(FP), R8 + MOVL off+28(FP), R9 + + MOVL $SYS_mmap, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS ok + NOTQ AX + INCQ AX + MOVQ $0, p+32(FP) + MOVQ AX, err+40(FP) + RET +ok: + MOVQ AX, p+32(FP) + MOVQ $0, err+40(FP) + RET + +// Call the function stored in _cgo_mmap using the GCC calling convention. +// This must be called on the system stack. +TEXT runtime·callCgoMmap(SB),NOSPLIT,$16 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL prot+16(FP), DX + MOVL flags+20(FP), CX + MOVL fd+24(FP), R8 + MOVL off+28(FP), R9 + MOVQ _cgo_mmap(SB), AX + MOVQ SP, BX + ANDQ $~15, SP // alignment as per amd64 psABI + MOVQ BX, 0(SP) + CALL AX + MOVQ 0(SP), SP + MOVQ AX, ret+32(FP) + RET + +TEXT runtime·sysMunmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVQ $SYS_munmap, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// Call the function stored in _cgo_munmap using the GCC calling convention. +// This must be called on the system stack. 
+TEXT runtime·callCgoMunmap(SB),NOSPLIT,$16-16 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVQ _cgo_munmap(SB), AX + MOVQ SP, BX + ANDQ $~15, SP // alignment as per amd64 psABI + MOVQ BX, 0(SP) + CALL AX + MOVQ 0(SP), SP + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVQ n+8(FP), SI + MOVL flags+16(FP), DX + MOVQ $SYS_madvise, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI + MOVL op+8(FP), SI + MOVL val+12(FP), DX + MOVQ ts+16(FP), R10 + MOVQ addr2+24(FP), R8 + MOVL val3+32(FP), R9 + MOVL $SYS_futex, AX + SYSCALL + MOVL AX, ret+40(FP) + RET + +// int32 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT,$0 + MOVL flags+0(FP), DI + MOVQ stk+8(FP), SI + MOVQ $0, DX + MOVQ $0, R10 + MOVQ $0, R8 + // Copy mp, gp, fn off parent stack for use by child. + // Careful: Linux system call clobbers CX and R11. + MOVQ mp+16(FP), R13 + MOVQ gp+24(FP), R9 + MOVQ fn+32(FP), R12 + CMPQ R13, $0 // m + JEQ nog1 + CMPQ R9, $0 // g + JEQ nog1 + LEAQ m_tls(R13), R8 +#ifdef GOOS_android + // Android stores the TLS offset in runtime·tls_g. + SUBQ runtime·tls_g(SB), R8 +#else + ADDQ $8, R8 // ELF wants to use -8(FS) +#endif + ORQ $0x00080000, DI //add flag CLONE_SETTLS(0x00080000) to call clone +nog1: + MOVL $SYS_clone, AX + SYSCALL + + // In parent, return. + CMPQ AX, $0 + JEQ 3(PC) + MOVL AX, ret+40(FP) + RET + + // In child, on new stack. + MOVQ SI, SP + + // If g or m are nil, skip Go-related setup. + CMPQ R13, $0 // m + JEQ nog2 + CMPQ R9, $0 // g + JEQ nog2 + + // Initialize m->procid to Linux tid + MOVL $SYS_gettid, AX + SYSCALL + MOVQ AX, m_procid(R13) + + // In child, set up new stack + get_tls(CX) + MOVQ R13, g_m(R9) + MOVQ R9, g(CX) + MOVQ R9, R14 // set g register + CALL runtime·stackcheck(SB) + +nog2: + // Call fn. This is the PC of an ABI0 function. 
+ CALL R12 + + // It shouldn't return. If it does, exit that thread. + MOVL $111, DI + MOVL $SYS_exit, AX + SYSCALL + JMP -3(PC) // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVQ new+0(FP), DI + MOVQ old+8(FP), SI + MOVQ $SYS_sigaltstack, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// set tls base to DI +TEXT runtime·settls(SB),NOSPLIT,$32 +#ifdef GOOS_android + // Android stores the TLS offset in runtime·tls_g. + SUBQ runtime·tls_g(SB), DI +#else + ADDQ $8, DI // ELF wants to use -8(FS) +#endif + MOVQ DI, SI + MOVQ $0x1002, DI // ARCH_SET_FS + MOVQ $SYS_arch_prctl, AX + SYSCALL + CMPQ AX, $0xfffffffffffff001 + JLS 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVL $SYS_sched_yield, AX + SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT,$0 + MOVQ pid+0(FP), DI + MOVQ len+8(FP), SI + MOVQ buf+16(FP), DX + MOVL $SYS_sched_getaffinity, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +// int access(const char *name, int mode) +TEXT runtime·access(SB),NOSPLIT,$0 + // This uses faccessat instead of access, because Android O blocks access. + MOVL $AT_FDCWD, DI // AT_FDCWD, so this acts like access + MOVQ name+0(FP), SI + MOVL mode+8(FP), DX + MOVL $0, R10 + MOVL $SYS_faccessat, AX + SYSCALL + MOVL AX, ret+16(FP) + RET + +// int connect(int fd, const struct sockaddr *addr, socklen_t addrlen) +TEXT runtime·connect(SB),NOSPLIT,$0-28 + MOVL fd+0(FP), DI + MOVQ addr+8(FP), SI + MOVL len+16(FP), DX + MOVL $SYS_connect, AX + SYSCALL + MOVL AX, ret+24(FP) + RET + +// int socket(int domain, int type, int protocol) +TEXT runtime·socket(SB),NOSPLIT,$0-20 + MOVL domain+0(FP), DI + MOVL typ+4(FP), SI + MOVL prot+8(FP), DX + MOVL $SYS_socket, AX + SYSCALL + MOVL AX, ret+16(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-8 + // Implemented as brk(NULL). 
+ MOVQ $0, DI + MOVL $SYS_brk, AX + SYSCALL + MOVQ AX, ret+0(FP) + RET diff --git a/src/runtime/sys_linux_arm.s b/src/runtime/sys_linux_arm.s new file mode 100644 index 0000000..7b8c4f0 --- /dev/null +++ b/src/runtime/sys_linux_arm.s @@ -0,0 +1,655 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for arm, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 1 + +// for EABI, as we don't support OABI +#define SYS_BASE 0x0 + +#define SYS_exit (SYS_BASE + 1) +#define SYS_read (SYS_BASE + 3) +#define SYS_write (SYS_BASE + 4) +#define SYS_open (SYS_BASE + 5) +#define SYS_close (SYS_BASE + 6) +#define SYS_getpid (SYS_BASE + 20) +#define SYS_kill (SYS_BASE + 37) +#define SYS_clone (SYS_BASE + 120) +#define SYS_rt_sigreturn (SYS_BASE + 173) +#define SYS_rt_sigaction (SYS_BASE + 174) +#define SYS_rt_sigprocmask (SYS_BASE + 175) +#define SYS_sigaltstack (SYS_BASE + 186) +#define SYS_mmap2 (SYS_BASE + 192) +#define SYS_futex (SYS_BASE + 240) +#define SYS_exit_group (SYS_BASE + 248) +#define SYS_munmap (SYS_BASE + 91) +#define SYS_madvise (SYS_BASE + 220) +#define SYS_setitimer (SYS_BASE + 104) +#define SYS_mincore (SYS_BASE + 219) +#define SYS_gettid (SYS_BASE + 224) +#define SYS_tgkill (SYS_BASE + 268) +#define SYS_sched_yield (SYS_BASE + 158) +#define SYS_nanosleep (SYS_BASE + 162) +#define SYS_sched_getaffinity (SYS_BASE + 242) +#define SYS_clock_gettime (SYS_BASE + 263) +#define SYS_timer_create (SYS_BASE + 257) +#define SYS_timer_settime (SYS_BASE + 258) +#define SYS_timer_delete (SYS_BASE + 261) +#define SYS_pipe2 (SYS_BASE + 359) +#define SYS_access (SYS_BASE + 33) +#define SYS_connect (SYS_BASE + 283) +#define SYS_socket (SYS_BASE + 281) +#define SYS_brk (SYS_BASE + 45) + +#define ARM_BASE (SYS_BASE + 0x0f0000) + +TEXT 
runtime·open(SB),NOSPLIT,$0 + MOVW name+0(FP), R0 + MOVW mode+4(FP), R1 + MOVW perm+8(FP), R2 + MOVW $SYS_open, R7 + SWI $0 + MOVW $0xfffff001, R1 + CMP R1, R0 + MOVW.HI $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW $SYS_close, R7 + SWI $0 + MOVW $0xfffff001, R1 + CMP R1, R0 + MOVW.HI $-1, R0 + MOVW R0, ret+4(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW p+4(FP), R1 + MOVW n+8(FP), R2 + MOVW $SYS_write, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW p+4(FP), R1 + MOVW n+8(FP), R2 + MOVW $SYS_read, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-16 + MOVW $r+4(FP), R0 + MOVW flags+0(FP), R1 + MOVW $SYS_pipe2, R7 + SWI $0 + MOVW R0, errno+12(FP) + RET + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0 + MOVW code+0(FP), R0 + MOVW $SYS_exit_group, R7 + SWI $0 + MOVW $1234, R0 + MOVW $1002, R1 + MOVW R0, (R1) // fail hard + +TEXT exit1<>(SB),NOSPLIT|NOFRAME,$0 + MOVW code+0(FP), R0 + MOVW $SYS_exit, R7 + SWI $0 + MOVW $1234, R0 + MOVW $1003, R1 + MOVW R0, (R1) // fail hard + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-4 + MOVW wait+0(FP), R0 + // We're done using the stack. + // Alas, there's no reliable way to make this write atomic + // without potentially using the stack. So it goes. 
+ MOVW $0, R1 + MOVW R1, (R0) + MOVW $0, R0 // exit code + MOVW $SYS_exit, R7 + SWI $0 + MOVW $1234, R0 + MOVW $1004, R1 + MOVW R0, (R1) // fail hard + JMP 0(PC) + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVW $SYS_gettid, R7 + SWI $0 + MOVW R0, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_getpid, R7 + SWI $0 + MOVW R0, R4 + MOVW $SYS_gettid, R7 + SWI $0 + MOVW R0, R1 // arg 2 tid + MOVW R4, R0 // arg 1 pid + MOVW sig+0(FP), R2 // arg 3 + MOVW $SYS_tgkill, R7 + SWI $0 + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_getpid, R7 + SWI $0 + // arg 1 tid already in R0 from getpid + MOVW sig+0(FP), R1 // arg 2 - signal + MOVW $SYS_kill, R7 + SWI $0 + RET + +TEXT ·getpid(SB),NOSPLIT,$0-4 + MOVW $SYS_getpid, R7 + SWI $0 + MOVW R0, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT,$0-12 + MOVW tgid+0(FP), R0 + MOVW tid+4(FP), R1 + MOVW sig+8(FP), R2 + MOVW $SYS_tgkill, R7 + SWI $0 + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW n+4(FP), R1 + MOVW prot+8(FP), R2 + MOVW flags+12(FP), R3 + MOVW fd+16(FP), R4 + MOVW off+20(FP), R5 + MOVW $SYS_mmap2, R7 + SWI $0 + MOVW $0xfffff001, R6 + CMP R6, R0 + MOVW $0, R1 + RSB.HI $0, R0 + MOVW.HI R0, R1 // if error, put in R1 + MOVW.HI $0, R0 + MOVW R0, p+24(FP) + MOVW R1, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW n+4(FP), R1 + MOVW $SYS_munmap, R7 + SWI $0 + MOVW $0xfffff001, R6 + CMP R6, R0 + MOVW.HI $0, R8 // crash on syscall failure + MOVW.HI R8, (R8) + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW n+4(FP), R1 + MOVW flags+8(FP), R2 + MOVW $SYS_madvise, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$0 + MOVW mode+0(FP), R0 + MOVW new+4(FP), R1 + MOVW old+8(FP), R2 + MOVW $SYS_setitimer, R7 + SWI $0 + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-16 + MOVW clockid+0(FP), R0 + MOVW sevp+4(FP), R1 + MOVW timerid+8(FP), R2 + MOVW $SYS_timer_create, R7 + SWI $0 + MOVW R0, 
ret+12(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-20 + MOVW timerid+0(FP), R0 + MOVW flags+4(FP), R1 + MOVW new+8(FP), R2 + MOVW old+12(FP), R3 + MOVW $SYS_timer_settime, R7 + SWI $0 + MOVW R0, ret+16(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-8 + MOVW timerid+0(FP), R0 + MOVW $SYS_timer_delete, R7 + SWI $0 + MOVW R0, ret+4(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW n+4(FP), R1 + MOVW dst+8(FP), R2 + MOVW $SYS_mincore, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +// Call a VDSO function. +// +// R0-R3: arguments to VDSO function (C calling convention) +// R4: uintptr function to call +// +// There is no return value. +TEXT runtime·vdsoCall(SB),NOSPLIT,$8-0 + // R0-R3 may be arguments to fn, do not touch. + // R4 is function to call. + // R5-R9 are available as locals. They are unchanged by the C call + // (callee-save). + + // We don't know how much stack space the VDSO code will need, + // so switch to g0. + + // Save old SP. Use R13 instead of SP to avoid linker rewriting the offsets. + MOVW R13, R5 + + MOVW g_m(g), R6 + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVW m_vdsoPC(R6), R7 + MOVW m_vdsoSP(R6), R8 + MOVW R7, 4(R13) + MOVW R8, 8(R13) + + MOVW $sp-4(FP), R7 // caller's SP + MOVW LR, m_vdsoPC(R6) + MOVW R7, m_vdsoSP(R6) + + MOVW m_curg(R6), R7 + + CMP g, R7 // Only switch if on curg. + B.NE noswitch + + MOVW m_g0(R6), R7 + MOVW (g_sched+gobuf_sp)(R7), R13 // Set SP to g0 stack + +noswitch: + BIC $0x7, R13 // Align for C code + + // Store g on gsignal's stack, so if we receive a signal + // during VDSO code we can find the g. + + // When using cgo, we already saved g on TLS, also don't save g here. + MOVB runtime·iscgo(SB), R7 + CMP $0, R7 + BNE nosaveg + // If we don't have a signal stack, we won't receive signal, so don't + // bother saving g. 
+ MOVW m_gsignal(R6), R7 // g.m.gsignal + CMP $0, R7 + BEQ nosaveg + // Don't save g if we are already on the signal stack, as we won't get + // a nested signal. + CMP g, R7 + BEQ nosaveg + // If we don't have a signal stack, we won't receive signal, so don't + // bother saving g. + MOVW (g_stack+stack_lo)(R7), R7 // g.m.gsignal.stack.lo + CMP $0, R7 + BEQ nosaveg + MOVW g, (R7) + + BL (R4) + + MOVW $0, R8 + MOVW R8, (R7) // clear g slot + + JMP finish + +nosaveg: + BL (R4) + +finish: + MOVW R5, R13 // Restore real SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVW 8(R13), R7 + MOVW R7, m_vdsoSP(R6) + MOVW 4(R13), R7 + MOVW R7, m_vdsoPC(R6) + RET + +TEXT runtime·walltime(SB),NOSPLIT,$12-12 + MOVW $CLOCK_REALTIME, R0 + MOVW $spec-12(SP), R1 // timespec + + MOVW runtime·vdsoClockgettimeSym(SB), R4 + CMP $0, R4 + B.EQ fallback + + BL runtime·vdsoCall(SB) + + JMP finish + +fallback: + MOVW $SYS_clock_gettime, R7 + SWI $0 + +finish: + MOVW sec-12(SP), R0 // sec + MOVW nsec-8(SP), R2 // nsec + + MOVW R0, sec_lo+0(FP) + MOVW $0, R1 + MOVW R1, sec_hi+4(FP) + MOVW R2, nsec+8(FP) + RET + +// func nanotime1() int64 +TEXT runtime·nanotime1(SB),NOSPLIT,$12-8 + MOVW $CLOCK_MONOTONIC, R0 + MOVW $spec-12(SP), R1 // timespec + + MOVW runtime·vdsoClockgettimeSym(SB), R4 + CMP $0, R4 + B.EQ fallback + + BL runtime·vdsoCall(SB) + + JMP finish + +fallback: + MOVW $SYS_clock_gettime, R7 + SWI $0 + +finish: + MOVW sec-12(SP), R0 // sec + MOVW nsec-8(SP), R2 // nsec + + MOVW $1000000000, R3 + MULLU R0, R3, (R1, R0) + ADD.S R2, R0 + ADC $0, R1 // Add carry bit to upper half. 
+ + MOVW R0, ret_lo+0(FP) + MOVW R1, ret_hi+4(FP) + + RET + +// int32 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 + MOVW op+4(FP), R1 + MOVW val+8(FP), R2 + MOVW ts+12(FP), R3 + MOVW addr2+16(FP), R4 + MOVW val3+20(FP), R5 + MOVW $SYS_futex, R7 + SWI $0 + MOVW R0, ret+24(FP) + RET + +// int32 clone(int32 flags, void *stack, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT,$0 + MOVW flags+0(FP), R0 + MOVW stk+4(FP), R1 + MOVW $0, R2 // parent tid ptr + MOVW $0, R3 // tls_val + MOVW $0, R4 // child tid ptr + MOVW $0, R5 + + // Copy mp, gp, fn off parent stack for use by child. + MOVW $-16(R1), R1 + MOVW mp+8(FP), R6 + MOVW R6, 0(R1) + MOVW gp+12(FP), R6 + MOVW R6, 4(R1) + MOVW fn+16(FP), R6 + MOVW R6, 8(R1) + MOVW $1234, R6 + MOVW R6, 12(R1) + + MOVW $SYS_clone, R7 + SWI $0 + + // In parent, return. + CMP $0, R0 + BEQ 3(PC) + MOVW R0, ret+20(FP) + RET + + // Paranoia: check that SP is as we expect. Use R13 to avoid linker 'fixup' + NOP R13 // tell vet SP/R13 changed - stop checking offsets + MOVW 12(R13), R0 + MOVW $1234, R1 + CMP R0, R1 + BEQ 2(PC) + BL runtime·abort(SB) + + MOVW 0(R13), R8 // m + MOVW 4(R13), R0 // g + + CMP $0, R8 + BEQ nog + CMP $0, R0 + BEQ nog + + MOVW R0, g + MOVW R8, g_m(g) + + // paranoia; check they are not nil + MOVW 0(R8), R0 + MOVW 0(g), R0 + + BL runtime·emptyfunc(SB) // fault if stack check is wrong + + // Initialize m->procid to Linux tid + MOVW $SYS_gettid, R7 + SWI $0 + MOVW g_m(g), R8 + MOVW R0, m_procid(R8) + +nog: + // Call fn + MOVW 8(R13), R0 + MOVW $16(R13), R13 + BL (R0) + + // It shouldn't return. If it does, exit that thread. 
+ SUB $16, R13 // restore the stack pointer to avoid memory corruption + MOVW $0, R0 + MOVW R0, 4(R13) + BL exit1<>(SB) + + MOVW $1234, R0 + MOVW $1005, R1 + MOVW R0, (R1) + +TEXT runtime·sigaltstack(SB),NOSPLIT,$0 + MOVW new+0(FP), R0 + MOVW old+4(FP), R1 + MOVW $SYS_sigaltstack, R7 + SWI $0 + MOVW $0xfffff001, R6 + CMP R6, R0 + MOVW.HI $0, R8 // crash on syscall failure + MOVW.HI R8, (R8) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVW sig+4(FP), R0 + MOVW info+8(FP), R1 + MOVW ctx+12(FP), R2 + MOVW fn+0(FP), R11 + MOVW R13, R4 + SUB $24, R13 + BIC $0x7, R13 // alignment for ELF ABI + BL (R11) + MOVW R4, R13 + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Reserve space for callee-save registers and arguments. + MOVM.DB.W [R4-R11], (R13) + SUB $16, R13 + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVW R0, 4(R13) + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + BL.NE runtime·load_g(SB) + + MOVW R1, 8(R13) + MOVW R2, 12(R13) + MOVW $runtime·sigtrampgo(SB), R11 + BL (R11) + + // Restore callee-save registers. + ADD $16, R13 + MOVM.IA.W (R13), [R4-R11] + + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + MOVW $runtime·sigtramp(SB), R11 + B (R11) + +TEXT runtime·rtsigprocmask(SB),NOSPLIT,$0 + MOVW how+0(FP), R0 + MOVW new+4(FP), R1 + MOVW old+8(FP), R2 + MOVW size+12(FP), R3 + MOVW $SYS_rt_sigprocmask, R7 + SWI $0 + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT,$0 + MOVW sig+0(FP), R0 + MOVW new+4(FP), R1 + MOVW old+8(FP), R2 + MOVW size+12(FP), R3 + MOVW $SYS_rt_sigaction, R7 + SWI $0 + MOVW R0, ret+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$12 + MOVW usec+0(FP), R0 + CALL runtime·usplitR0(SB) + MOVW R0, 4(R13) + MOVW $1000, R0 // usec to nsec + MUL R0, R1 + MOVW R1, 8(R13) + MOVW $4(R13), R0 + MOVW $0, R1 + MOVW $SYS_nanosleep, R7 + SWI $0 + RET + +// As for cas, memory barriers are complicated on ARM, but the kernel +// provides a user helper. 
ARMv5 does not support SMP and has no +// memory barrier instruction at all. ARMv6 added SMP support and has +// a memory barrier, but it requires writing to a coprocessor +// register. ARMv7 introduced the DMB instruction, but it's expensive +// even on single-core devices. The kernel helper takes care of all of +// this for us. + +TEXT kernelPublicationBarrier<>(SB),NOSPLIT,$0 + // void __kuser_memory_barrier(void); + MOVW $0xffff0fa0, R11 + CALL (R11) + RET + +TEXT ·publicationBarrier(SB),NOSPLIT,$0 + MOVB ·goarm(SB), R11 + CMP $7, R11 + BLT 2(PC) + JMP ·armPublicationBarrier(SB) + JMP kernelPublicationBarrier<>(SB) // extra layer so this function is leaf and no SP adjustment on GOARM=7 + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVW $SYS_sched_yield, R7 + SWI $0 + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT,$0 + MOVW pid+0(FP), R0 + MOVW len+4(FP), R1 + MOVW buf+8(FP), R2 + MOVW $SYS_sched_getaffinity, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +// b __kuser_get_tls @ 0xffff0fe0 +TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + MOVW $0xffff0fe0, R0 + B (R0) + +TEXT runtime·access(SB),NOSPLIT,$0 + MOVW name+0(FP), R0 + MOVW mode+4(FP), R1 + MOVW $SYS_access, R7 + SWI $0 + MOVW R0, ret+8(FP) + RET + +TEXT runtime·connect(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW addr+4(FP), R1 + MOVW len+8(FP), R2 + MOVW $SYS_connect, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·socket(SB),NOSPLIT,$0 + MOVW domain+0(FP), R0 + MOVW typ+4(FP), R1 + MOVW prot+8(FP), R2 + MOVW $SYS_socket, R7 + SWI $0 + MOVW R0, ret+12(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-4 + // Implemented as brk(NULL). + MOVW $0, R0 + MOVW $SYS_brk, R7 + SWI $0 + MOVW R0, ret+0(FP) + RET + +TEXT runtime·sigreturn(SB),NOSPLIT,$0-0 + RET diff --git a/src/runtime/sys_linux_arm64.s b/src/runtime/sys_linux_arm64.s new file mode 100644 index 0000000..38ff6ac --- /dev/null +++ b/src/runtime/sys_linux_arm64.s @@ -0,0 +1,801 @@ +// Copyright 2014 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for arm64, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_arm64.h" + +#define AT_FDCWD -100 + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 1 + +#define SYS_exit 93 +#define SYS_read 63 +#define SYS_write 64 +#define SYS_openat 56 +#define SYS_close 57 +#define SYS_pipe2 59 +#define SYS_nanosleep 101 +#define SYS_mmap 222 +#define SYS_munmap 215 +#define SYS_setitimer 103 +#define SYS_clone 220 +#define SYS_sched_yield 124 +#define SYS_rt_sigreturn 139 +#define SYS_rt_sigaction 134 +#define SYS_rt_sigprocmask 135 +#define SYS_sigaltstack 132 +#define SYS_madvise 233 +#define SYS_mincore 232 +#define SYS_getpid 172 +#define SYS_gettid 178 +#define SYS_kill 129 +#define SYS_tgkill 131 +#define SYS_futex 98 +#define SYS_sched_getaffinity 123 +#define SYS_exit_group 94 +#define SYS_clock_gettime 113 +#define SYS_faccessat 48 +#define SYS_socket 198 +#define SYS_connect 203 +#define SYS_brk 214 +#define SYS_timer_create 107 +#define SYS_timer_settime 110 +#define SYS_timer_delete 111 + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R0 + MOVD $SYS_exit_group, R8 + SVC + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVD wait+0(FP), R0 + // We're done using the stack. 
+ MOVW $0, R1 + STLRW R1, (R0) + MOVW $0, R0 // exit code + MOVD $SYS_exit, R8 + SVC + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOVD $AT_FDCWD, R0 + MOVD name+0(FP), R1 + MOVW mode+8(FP), R2 + MOVW perm+12(FP), R3 + MOVD $SYS_openat, R8 + SVC + CMN $4095, R0 + BCC done + MOVW $-1, R0 +done: + MOVW R0, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R0 + MOVD $SYS_close, R8 + SVC + CMN $4095, R0 + BCC done + MOVW $-1, R0 +done: + MOVW R0, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVD fd+0(FP), R0 + MOVD p+8(FP), R1 + MOVW n+16(FP), R2 + MOVD $SYS_write, R8 + SVC + MOVW R0, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R0 + MOVD p+8(FP), R1 + MOVW n+16(FP), R2 + MOVD $SYS_read, R8 + SVC + MOVW R0, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVD $r+8(FP), R0 + MOVW flags+0(FP), R1 + MOVW $SYS_pipe2, R8 + SVC + MOVW R0, errno+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), R3 + MOVD R3, R5 + MOVW $1000000, R4 + UDIV R4, R3 + MOVD R3, 8(RSP) + MUL R3, R4 + SUB R4, R5 + MOVW $1000, R4 + MUL R4, R5 + MOVD R5, 16(RSP) + + // nanosleep(&ts, 0) + ADD $8, RSP, R0 + MOVD $0, R1 + MOVD $SYS_nanosleep, R8 + SVC + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVD $SYS_gettid, R8 + SVC + MOVW R0, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_getpid, R8 + SVC + MOVW R0, R19 + MOVD $SYS_gettid, R8 + SVC + MOVW R0, R1 // arg 2 tid + MOVW R19, R0 // arg 1 pid + MOVW sig+0(FP), R2 // arg 3 + MOVD $SYS_tgkill, R8 + SVC + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_getpid, R8 + SVC + MOVW R0, R0 // arg 1 pid + MOVW sig+0(FP), R1 // arg 2 + MOVD $SYS_kill, R8 + SVC + RET + +TEXT ·getpid(SB),NOSPLIT|NOFRAME,$0-8 + MOVD $SYS_getpid, R8 + SVC + MOVD R0, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT,$0-24 + MOVD tgid+0(FP), R0 + 
MOVD tid+8(FP), R1 + MOVD sig+16(FP), R2 + MOVD $SYS_tgkill, R8 + SVC + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVD $SYS_setitimer, R8 + SVC + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVW clockid+0(FP), R0 + MOVD sevp+8(FP), R1 + MOVD timerid+16(FP), R2 + MOVD $SYS_timer_create, R8 + SVC + MOVW R0, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVW timerid+0(FP), R0 + MOVW flags+4(FP), R1 + MOVD new+8(FP), R2 + MOVD old+16(FP), R3 + MOVD $SYS_timer_settime, R8 + SVC + MOVW R0, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVW timerid+0(FP), R0 + MOVD $SYS_timer_delete, R8 + SVC + MOVW R0, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVD dst+16(FP), R2 + MOVD $SYS_mincore, R8 + SVC + MOVW R0, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$24-12 + MOVD RSP, R20 // R20 is unchanged by C code + MOVD RSP, R1 + + MOVD g_m(g), R21 // R21 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVD m_vdsoPC(R21), R2 + MOVD m_vdsoSP(R21), R3 + MOVD R2, 8(RSP) + MOVD R3, 16(RSP) + + MOVD $ret-8(FP), R2 // caller's SP + MOVD LR, m_vdsoPC(R21) + MOVD R2, m_vdsoSP(R21) + + MOVD m_curg(R21), R0 + CMP g, R0 + BNE noswitch + + MOVD m_g0(R21), R3 + MOVD (g_sched+gobuf_sp)(R3), R1 // Set RSP to g0 stack + +noswitch: + SUB $16, R1 + BIC $15, R1 // Align for C code + MOVD R1, RSP + + MOVW $CLOCK_REALTIME, R0 + MOVD runtime·vdsoClockgettimeSym(SB), R2 + CBZ R2, fallback + + // Store g on gsignal's stack, so if we receive a signal + // during VDSO code we can find the g. + // If we don't have a signal stack, we won't receive signal, + // so don't bother saving g. + // When using cgo, we already saved g on TLS, also don't save + // g here. 
+ // Also don't save g if we are already on the signal stack. + // We won't get a nested signal. + MOVBU runtime·iscgo(SB), R22 + CBNZ R22, nosaveg + MOVD m_gsignal(R21), R22 // g.m.gsignal + CBZ R22, nosaveg + CMP g, R22 + BEQ nosaveg + MOVD (g_stack+stack_lo)(R22), R22 // g.m.gsignal.stack.lo + MOVD g, (R22) + + BL (R2) + + MOVD ZR, (R22) // clear g slot, R22 is unchanged by C code + + B finish + +nosaveg: + BL (R2) + B finish + +fallback: + MOVD $SYS_clock_gettime, R8 + SVC + +finish: + MOVD 0(RSP), R3 // sec + MOVD 8(RSP), R5 // nsec + + MOVD R20, RSP // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVD 16(RSP), R1 + MOVD R1, m_vdsoSP(R21) + MOVD 8(RSP), R1 + MOVD R1, m_vdsoPC(R21) + + MOVD R3, sec+0(FP) + MOVW R5, nsec+8(FP) + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$24-8 + MOVD RSP, R20 // R20 is unchanged by C code + MOVD RSP, R1 + + MOVD g_m(g), R21 // R21 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVD m_vdsoPC(R21), R2 + MOVD m_vdsoSP(R21), R3 + MOVD R2, 8(RSP) + MOVD R3, 16(RSP) + + MOVD $ret-8(FP), R2 // caller's SP + MOVD LR, m_vdsoPC(R21) + MOVD R2, m_vdsoSP(R21) + + MOVD m_curg(R21), R0 + CMP g, R0 + BNE noswitch + + MOVD m_g0(R21), R3 + MOVD (g_sched+gobuf_sp)(R3), R1 // Set RSP to g0 stack + +noswitch: + SUB $32, R1 + BIC $15, R1 + MOVD R1, RSP + + MOVW $CLOCK_MONOTONIC, R0 + MOVD runtime·vdsoClockgettimeSym(SB), R2 + CBZ R2, fallback + + // Store g on gsignal's stack, so if we receive a signal + // during VDSO code we can find the g. + // If we don't have a signal stack, we won't receive signal, + // so don't bother saving g. + // When using cgo, we already saved g on TLS, also don't save + // g here. 
+ // Also don't save g if we are already on the signal stack. + // We won't get a nested signal. + MOVBU runtime·iscgo(SB), R22 + CBNZ R22, nosaveg + MOVD m_gsignal(R21), R22 // g.m.gsignal + CBZ R22, nosaveg + CMP g, R22 + BEQ nosaveg + MOVD (g_stack+stack_lo)(R22), R22 // g.m.gsignal.stack.lo + MOVD g, (R22) + + BL (R2) + + MOVD ZR, (R22) // clear g slot, R22 is unchanged by C code + + B finish + +nosaveg: + BL (R2) + B finish + +fallback: + MOVD $SYS_clock_gettime, R8 + SVC + +finish: + MOVD 0(RSP), R3 // sec + MOVD 8(RSP), R5 // nsec + + MOVD R20, RSP // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVD 16(RSP), R1 + MOVD R1, m_vdsoSP(R21) + MOVD 8(RSP), R1 + MOVD R1, m_vdsoPC(R21) + + // sec is in R3, nsec in R5 + // return nsec in R3 + MOVD $1000000000, R4 + MUL R4, R3 + ADD R5, R3 + MOVD R3, ret+0(FP) + RET + +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVW size+24(FP), R3 + MOVD $SYS_rt_sigprocmask, R8 + SVC + CMN $4095, R0 + BCC done + MOVD $0, R0 + MOVD R0, (R0) // crash +done: + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOVD sig+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVD size+24(FP), R3 + MOVD $SYS_rt_sigaction, R8 + SVC + MOVW R0, ret+32(FP) + RET + +// Call the function stored in _cgo_sigaction using the GCC calling convention. +TEXT runtime·callCgoSigaction(SB),NOSPLIT,$0 + MOVD sig+0(FP), R0 + MOVD new+8(FP), R1 + MOVD old+16(FP), R2 + MOVD _cgo_sigaction(SB), R3 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. 
+ BL R3 + ADD $16, RSP + MOVW R0, ret+24(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R0 + MOVD info+16(FP), R1 + MOVD ctx+24(FP), R2 + MOVD fn+0(FP), R11 + BL (R11) + RET + +// Called from c-abi, R0: sig, R1: info, R2: cxt +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$176 + // Save callee-save registers in the case of signal forwarding. + // Please refer to https://golang.org/issue/31827 . + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVW R0, 8(RSP) + MOVBU runtime·iscgo(SB), R0 + CBZ R0, 2(PC) + BL runtime·load_g(SB) + +#ifdef GOEXPERIMENT_regabiargs + // Restore signum to R0. + MOVW 8(RSP), R0 + // R1 and R2 already contain info and ctx, respectively. +#else + MOVD R1, 16(RSP) + MOVD R2, 24(RSP) +#endif + MOVD $runtime·sigtrampgo<ABIInternal>(SB), R3 + BL (R3) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + + RET + +// Called from c-abi, R0: sig, R1: info, R2: cxt +TEXT runtime·sigprofNonGoWrapper<>(SB),NOSPLIT,$176 + // Save callee-save registers because it's a callback from c code. + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + +#ifdef GOEXPERIMENT_regabiargs + // R0, R1 and R2 already contain sig, info and ctx, respectively. +#else + MOVW R0, 8(RSP) // sig + MOVD R1, 16(RSP) // info + MOVD R2, 24(RSP) // ctx +#endif + CALL runtime·sigprofNonGo<ABIInternal>(SB) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + RET + +// Called from c-abi, R0: sig, R1: info, R2: cxt +TEXT runtime·cgoSigtramp(SB),NOSPLIT|NOFRAME,$0 + // The stack unwinder, presumably written in C, may not be able to + // handle Go frame correctly. So, this function is NOFRAME, and we + // save/restore LR manually. + MOVD LR, R10 + // Save R27, g because they will be clobbered, + // we need to restore them before jump to sigtramp. 
+ MOVD R27, R11 + MOVD g, R12 + + // If no traceback function, do usual sigtramp. + MOVD runtime·cgoTraceback(SB), R6 + CBZ R6, sigtramp + + // If no traceback support function, which means that + // runtime/cgo was not linked in, do usual sigtramp. + MOVD _cgo_callers(SB), R7 + CBZ R7, sigtramp + + // Figure out if we are currently in a cgo call. + // If not, just do usual sigtramp. + // first save R0, because runtime·load_g will clobber it. + MOVD R0, R8 + // Set up g register. + CALL runtime·load_g(SB) + MOVD R8, R0 + + CBZ g, sigtrampnog // g == nil + MOVD g_m(g), R6 + CBZ R6, sigtramp // g.m == nil + MOVW m_ncgo(R6), R7 + CBZW R7, sigtramp // g.m.ncgo = 0 + MOVD m_curg(R6), R8 + CBZ R8, sigtramp // g.m.curg == nil + MOVD g_syscallsp(R8), R7 + CBZ R7, sigtramp // g.m.curg.syscallsp == 0 + MOVD m_cgoCallers(R6), R4 // R4 is the fifth arg in C calling convention. + CBZ R4, sigtramp // g.m.cgoCallers == nil + MOVW m_cgoCallersUse(R6), R8 + CBNZW R8, sigtramp // g.m.cgoCallersUse != 0 + + // Jump to a function in runtime/cgo. + // That function, written in C, will call the user's traceback + // function with proper unwind info, and will then call back here. + // The first three arguments, and the fifth, are already in registers. + // Set the two remaining arguments now. + MOVD runtime·cgoTraceback(SB), R3 + MOVD $runtime·sigtramp(SB), R5 + MOVD _cgo_callers(SB), R13 + MOVD R10, LR // restore + MOVD R11, R27 + MOVD R12, g + B (R13) + +sigtramp: + MOVD R10, LR // restore + MOVD R11, R27 + MOVD R12, g + B runtime·sigtramp(SB) + +sigtrampnog: + // Signal arrived on a non-Go thread. If this is SIGPROF, get a + // stack trace. + CMPW $27, R0 // 27 == SIGPROF + BNE sigtramp + + // Lock sigprofCallersUse (cas from 0 to 1). + MOVW $1, R7 + MOVD $runtime·sigprofCallersUse(SB), R8 +load_store_loop: + LDAXRW (R8), R9 + CBNZW R9, sigtramp // Skip stack trace if already locked. + STLXRW R7, (R8), R9 + CBNZ R9, load_store_loop + + // Jump to the traceback function in runtime/cgo. 
+ // It will call back to sigprofNonGo, which will ignore the + // arguments passed in registers. + // First three arguments to traceback function are in registers already. + MOVD runtime·cgoTraceback(SB), R3 + MOVD $runtime·sigprofCallers(SB), R4 + MOVD $runtime·sigprofNonGoWrapper<>(SB), R5 + MOVD _cgo_callers(SB), R13 + MOVD R10, LR // restore + MOVD R11, R27 + MOVD R12, g + B (R13) + +TEXT runtime·sysMmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVW prot+16(FP), R2 + MOVW flags+20(FP), R3 + MOVW fd+24(FP), R4 + MOVW off+28(FP), R5 + + MOVD $SYS_mmap, R8 + SVC + CMN $4095, R0 + BCC ok + NEG R0,R0 + MOVD $0, p+32(FP) + MOVD R0, err+40(FP) + RET +ok: + MOVD R0, p+32(FP) + MOVD $0, err+40(FP) + RET + +// Call the function stored in _cgo_mmap using the GCC calling convention. +// This must be called on the system stack. +TEXT runtime·callCgoMmap(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVW prot+16(FP), R2 + MOVW flags+20(FP), R3 + MOVW fd+24(FP), R4 + MOVW off+28(FP), R5 + MOVD _cgo_mmap(SB), R9 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. + BL R9 + ADD $16, RSP + MOVD R0, ret+32(FP) + RET + +TEXT runtime·sysMunmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVD $SYS_munmap, R8 + SVC + CMN $4095, R0 + BCC cool + MOVD R0, 0xf0(R0) +cool: + RET + +// Call the function stored in _cgo_munmap using the GCC calling convention. +// This must be called on the system stack. +TEXT runtime·callCgoMunmap(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVD _cgo_munmap(SB), R9 + SUB $16, RSP // reserve 16 bytes for sp-8 where fp may be saved. 
+ BL R9 + ADD $16, RSP + RET + +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVD n+8(FP), R1 + MOVW flags+16(FP), R2 + MOVD $SYS_madvise, R8 + SVC + MOVW R0, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R0 + MOVW op+8(FP), R1 + MOVW val+12(FP), R2 + MOVD ts+16(FP), R3 + MOVD addr2+24(FP), R4 + MOVW val3+32(FP), R5 + MOVD $SYS_futex, R8 + SVC + MOVW R0, ret+40(FP) + RET + +// int64 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), R0 + MOVD stk+8(FP), R1 + + // Copy mp, gp, fn off parent stack for use by child. + MOVD mp+16(FP), R10 + MOVD gp+24(FP), R11 + MOVD fn+32(FP), R12 + + MOVD R10, -8(R1) + MOVD R11, -16(R1) + MOVD R12, -24(R1) + MOVD $1234, R10 + MOVD R10, -32(R1) + + MOVD $SYS_clone, R8 + SVC + + // In parent, return. + CMP ZR, R0 + BEQ child + MOVW R0, ret+40(FP) + RET +child: + + // In child, on new stack. + MOVD -32(RSP), R10 + MOVD $1234, R0 + CMP R0, R10 + BEQ good + MOVD $0, R0 + MOVD R0, (R0) // crash + +good: + // Initialize m->procid to Linux tid + MOVD $SYS_gettid, R8 + SVC + + MOVD -24(RSP), R12 // fn + MOVD -16(RSP), R11 // g + MOVD -8(RSP), R10 // m + + CMP $0, R10 + BEQ nog + CMP $0, R11 + BEQ nog + + MOVD R0, m_procid(R10) + + // TODO: setup TLS. + + // In child, set up new stack + MOVD R10, g_m(R11) + MOVD R11, g + //CALL runtime·stackcheck(SB) + +nog: + // Call fn + MOVD R12, R0 + BL (R0) + + // It shouldn't return. If it does, exit that thread. 
+ MOVW $111, R0 +again: + MOVD $SYS_exit, R8 + SVC + B again // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVD new+0(FP), R0 + MOVD old+8(FP), R1 + MOVD $SYS_sigaltstack, R8 + SVC + CMN $4095, R0 + BCC ok + MOVD $0, R0 + MOVD R0, (R0) // crash +ok: + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVD $SYS_sched_yield, R8 + SVC + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOVD pid+0(FP), R0 + MOVD len+8(FP), R1 + MOVD buf+16(FP), R2 + MOVD $SYS_sched_getaffinity, R8 + SVC + MOVW R0, ret+24(FP) + RET + +// int access(const char *name, int mode) +TEXT runtime·access(SB),NOSPLIT,$0-20 + MOVD $AT_FDCWD, R0 + MOVD name+0(FP), R1 + MOVW mode+8(FP), R2 + MOVD $SYS_faccessat, R8 + SVC + MOVW R0, ret+16(FP) + RET + +// int connect(int fd, const struct sockaddr *addr, socklen_t len) +TEXT runtime·connect(SB),NOSPLIT,$0-28 + MOVW fd+0(FP), R0 + MOVD addr+8(FP), R1 + MOVW len+16(FP), R2 + MOVD $SYS_connect, R8 + SVC + MOVW R0, ret+24(FP) + RET + +// int socket(int domain, int typ, int prot) +TEXT runtime·socket(SB),NOSPLIT,$0-20 + MOVW domain+0(FP), R0 + MOVW typ+4(FP), R1 + MOVW prot+8(FP), R2 + MOVD $SYS_socket, R8 + SVC + MOVW R0, ret+16(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-8 + // Implemented as brk(NULL). + MOVD $0, R0 + MOVD $SYS_brk, R8 + SVC + MOVD R0, ret+0(FP) + RET + +TEXT runtime·sigreturn(SB),NOSPLIT,$0-0 + RET diff --git a/src/runtime/sys_linux_loong64.s b/src/runtime/sys_linux_loong64.s new file mode 100644 index 0000000..9ce5e72 --- /dev/null +++ b/src/runtime/sys_linux_loong64.s @@ -0,0 +1,555 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +// System calls and other sys.stuff for loong64, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define AT_FDCWD -100 + +#define SYS_exit 93 +#define SYS_read 63 +#define SYS_write 64 +#define SYS_close 57 +#define SYS_getpid 172 +#define SYS_kill 129 +#define SYS_mmap 222 +#define SYS_munmap 215 +#define SYS_setitimer 103 +#define SYS_clone 220 +#define SYS_nanosleep 101 +#define SYS_sched_yield 124 +#define SYS_rt_sigreturn 139 +#define SYS_rt_sigaction 134 +#define SYS_rt_sigprocmask 135 +#define SYS_sigaltstack 132 +#define SYS_madvise 233 +#define SYS_mincore 232 +#define SYS_gettid 178 +#define SYS_futex 98 +#define SYS_sched_getaffinity 123 +#define SYS_exit_group 94 +#define SYS_tgkill 131 +#define SYS_openat 56 +#define SYS_clock_gettime 113 +#define SYS_brk 214 +#define SYS_pipe2 59 +#define SYS_timer_create 107 +#define SYS_timer_settime 110 +#define SYS_timer_delete 111 + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R4 + MOVV $SYS_exit_group, R11 + SYSCALL + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVV wait+0(FP), R19 + // We're done using the stack. 
+ MOVW $0, R11 + DBAR + MOVW R11, (R19) + DBAR + MOVW $0, R4 // exit code + MOVV $SYS_exit, R11 + SYSCALL + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOVW $AT_FDCWD, R4 // AT_FDCWD, so this acts like open + MOVV name+0(FP), R5 + MOVW mode+8(FP), R6 + MOVW perm+12(FP), R7 + MOVV $SYS_openat, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, 2(PC) + MOVW $-1, R4 + MOVW R4, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R4 + MOVV $SYS_close, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, 2(PC) + MOVW $-1, R4 + MOVW R4, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVV fd+0(FP), R4 + MOVV p+8(FP), R5 + MOVW n+16(FP), R6 + MOVV $SYS_write, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R4 + MOVV p+8(FP), R5 + MOVW n+16(FP), R6 + MOVV $SYS_read, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVV $r+8(FP), R4 + MOVW flags+0(FP), R5 + MOVV $SYS_pipe2, R11 + SYSCALL + MOVW R4, errno+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16-4 + MOVWU usec+0(FP), R6 + MOVV R6, R5 + MOVW $1000000, R4 + DIVVU R4, R6, R6 + MOVV R6, 8(R3) + MOVW $1000, R4 + MULVU R6, R4, R4 + SUBVU R4, R5 + MOVV R5, 16(R3) + + // nanosleep(&ts, 0) + ADDV $8, R3, R4 + MOVW $0, R5 + MOVV $SYS_nanosleep, R11 + SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVV $SYS_gettid, R11 + SYSCALL + MOVW R4, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_getpid, R11 + SYSCALL + MOVW R4, R23 + MOVV $SYS_gettid, R11 + SYSCALL + MOVW R4, R5 // arg 2 tid + MOVW R23, R4 // arg 1 pid + MOVW sig+0(FP), R6 // arg 3 + MOVV $SYS_tgkill, R11 + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_getpid, R11 + SYSCALL + //MOVW R4, R4 // arg 1 pid + MOVW sig+0(FP), R5 // arg 2 + MOVV $SYS_kill, R11 + SYSCALL + RET + +TEXT 
·getpid(SB),NOSPLIT|NOFRAME,$0-8 + MOVV $SYS_getpid, R11 + SYSCALL + MOVV R4, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT|NOFRAME,$0-24 + MOVV tgid+0(FP), R4 + MOVV tid+8(FP), R5 + MOVV sig+16(FP), R6 + MOVV $SYS_tgkill, R11 + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVV $SYS_setitimer, R11 + SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVW clockid+0(FP), R4 + MOVV sevp+8(FP), R5 + MOVV timerid+16(FP), R6 + MOVV $SYS_timer_create, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVW timerid+0(FP), R4 + MOVW flags+4(FP), R5 + MOVV new+8(FP), R6 + MOVV old+16(FP), R7 + MOVV $SYS_timer_settime, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVW timerid+0(FP), R4 + MOVV $SYS_timer_delete, R11 + SYSCALL + MOVW R4, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVV dst+16(FP), R6 + MOVV $SYS_mincore, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$16-12 + MOVV R3, R23 // R23 is unchanged by C code + MOVV R3, R25 + + MOVV g_m(g), R24 // R24 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. 
+ MOVV m_vdsoPC(R24), R11 + MOVV m_vdsoSP(R24), R7 + MOVV R11, 8(R3) + MOVV R7, 16(R3) + + MOVV $ret-8(FP), R11 // caller's SP + MOVV R1, m_vdsoPC(R24) + MOVV R11, m_vdsoSP(R24) + + MOVV m_curg(R24), R4 + MOVV g, R5 + BNE R4, R5, noswitch + + MOVV m_g0(R24), R4 + MOVV (g_sched+gobuf_sp)(R4), R25 // Set SP to g0 stack + +noswitch: + SUBV $16, R25 + AND $~15, R25 // Align for C code + MOVV R25, R3 + + MOVW $0, R4 // CLOCK_REALTIME=0 + MOVV $0(R3), R5 + + MOVV runtime·vdsoClockgettimeSym(SB), R20 + BEQ R20, fallback + + JAL (R20) + +finish: + MOVV 0(R3), R7 // sec + MOVV 8(R3), R5 // nsec + + MOVV R23, R3 // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVV 16(R3), R25 + MOVV R25, m_vdsoSP(R24) + MOVV 8(R3), R25 + MOVV R25, m_vdsoPC(R24) + + MOVV R7, sec+0(FP) + MOVW R5, nsec+8(FP) + RET + +fallback: + MOVV $SYS_clock_gettime, R11 + SYSCALL + JMP finish + +TEXT runtime·nanotime1(SB),NOSPLIT,$16-8 + MOVV R3, R23 // R23 is unchanged by C code + MOVV R3, R25 + + MOVV g_m(g), R24 // R24 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. 
+ MOVV m_vdsoPC(R24), R11 + MOVV m_vdsoSP(R24), R7 + MOVV R11, 8(R3) + MOVV R7, 16(R3) + + MOVV $ret-8(FP), R11 // caller's SP + MOVV R1, m_vdsoPC(R24) + MOVV R11, m_vdsoSP(R24) + + MOVV m_curg(R24), R4 + MOVV g, R5 + BNE R4, R5, noswitch + + MOVV m_g0(R24), R4 + MOVV (g_sched+gobuf_sp)(R4), R25 // Set SP to g0 stack + +noswitch: + SUBV $16, R25 + AND $~15, R25 // Align for C code + MOVV R25, R3 + + MOVW $1, R4 // CLOCK_MONOTONIC=1 + MOVV $0(R3), R5 + + MOVV runtime·vdsoClockgettimeSym(SB), R20 + BEQ R20, fallback + + JAL (R20) + +finish: + MOVV 0(R3), R7 // sec + MOVV 8(R3), R5 // nsec + + MOVV R23, R3 // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVV 16(R3), R25 + MOVV R25, m_vdsoSP(R24) + MOVV 8(R3), R25 + MOVV R25, m_vdsoPC(R24) + + // sec is in R7, nsec in R5 + // return nsec in R7 + MOVV $1000000000, R4 + MULVU R4, R7, R7 + ADDVU R5, R7 + MOVV R7, ret+0(FP) + RET + +fallback: + MOVV $SYS_clock_gettime, R11 + SYSCALL + JMP finish + +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVW size+24(FP), R7 + MOVV $SYS_rt_sigprocmask, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, 2(PC) + MOVV R0, 0xf1(R0) // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOVV sig+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVV size+24(FP), R7 + MOVV $SYS_rt_sigaction, R11 + SYSCALL + MOVW R4, ret+32(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R4 + MOVV info+16(FP), R5 + MOVV ctx+24(FP), R6 + MOVV fn+0(FP), R20 + JAL (R20) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$64 + // this might be called in external code context, + // where g is not set. 
+ MOVB runtime·iscgo(SB), R19 + BEQ R19, 2(PC) + JAL runtime·load_g(SB) + + MOVW R4, 8(R3) + MOVV R5, 16(R3) + MOVV R6, 24(R3) + MOVV $runtime·sigtrampgo(SB), R19 + JAL (R19) + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + JMP runtime·sigtramp(SB) + +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVW prot+16(FP), R6 + MOVW flags+20(FP), R7 + MOVW fd+24(FP), R8 + MOVW off+28(FP), R9 + + MOVV $SYS_mmap, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, ok + MOVV $0, p+32(FP) + SUBVU R4, R0, R4 + MOVV R4, err+40(FP) + RET +ok: + MOVV R4, p+32(FP) + MOVV $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVV $SYS_munmap, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, 2(PC) + MOVV R0, 0xf3(R0) // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVW flags+16(FP), R6 + MOVV $SYS_madvise, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVW op+8(FP), R5 + MOVW val+12(FP), R6 + MOVV ts+16(FP), R7 + MOVV addr2+24(FP), R8 + MOVW val3+32(FP), R9 + MOVV $SYS_futex, R11 + SYSCALL + MOVW R4, ret+40(FP) + RET + +// int64 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), R4 + MOVV stk+8(FP), R5 + + // Copy mp, gp, fn off parent stack for use by child. + // Careful: Linux system call clobbers ???. + MOVV mp+16(FP), R23 + MOVV gp+24(FP), R24 + MOVV fn+32(FP), R25 + + MOVV R23, -8(R5) + MOVV R24, -16(R5) + MOVV R25, -24(R5) + MOVV $1234, R23 + MOVV R23, -32(R5) + + MOVV $SYS_clone, R11 + SYSCALL + + // In parent, return. + BEQ R4, 3(PC) + MOVW R4, ret+40(FP) + RET + + // In child, on new stack. 
+ MOVV -32(R3), R23 + MOVV $1234, R19 + BEQ R23, R19, 2(PC) + MOVV R0, 0(R0) + + // Initialize m->procid to Linux tid + MOVV $SYS_gettid, R11 + SYSCALL + + MOVV -24(R3), R25 // fn + MOVV -16(R3), R24 // g + MOVV -8(R3), R23 // m + + BEQ R23, nog + BEQ R24, nog + + MOVV R4, m_procid(R23) + + // TODO: setup TLS. + + // In child, set up new stack + MOVV R23, g_m(R24) + MOVV R24, g + //CALL runtime·stackcheck(SB) + +nog: + // Call fn + JAL (R25) + + // It shouldn't return. If it does, exit that thread. + MOVW $111, R4 + MOVV $SYS_exit, R11 + SYSCALL + JMP -3(PC) // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVV new+0(FP), R4 + MOVV old+8(FP), R5 + MOVV $SYS_sigaltstack, R11 + SYSCALL + MOVW $-4096, R5 + BGEU R5, R4, 2(PC) + MOVV R0, 0xf1(R0) // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_sched_yield, R11 + SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOVV pid+0(FP), R4 + MOVV len+8(FP), R5 + MOVV buf+16(FP), R6 + MOVV $SYS_sched_getaffinity, R11 + SYSCALL + MOVW R4, ret+24(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT|NOFRAME,$0-8 + // Implemented as brk(NULL). + MOVV $0, R4 + MOVV $SYS_brk, R11 + SYSCALL + MOVV R4, ret+0(FP) + RET + +TEXT runtime·access(SB),$0-20 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET + +TEXT runtime·connect(SB),$0-28 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+24(FP) // for vet + RET + +TEXT runtime·socket(SB),$0-20 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET diff --git a/src/runtime/sys_linux_mips64x.s b/src/runtime/sys_linux_mips64x.s new file mode 100644 index 0000000..47f2da5 --- /dev/null +++ b/src/runtime/sys_linux_mips64x.s @@ -0,0 +1,588 @@ +// Copyright 2015 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +// +// System calls and other sys.stuff for mips64, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define AT_FDCWD -100 + +#define SYS_exit 5058 +#define SYS_read 5000 +#define SYS_write 5001 +#define SYS_close 5003 +#define SYS_getpid 5038 +#define SYS_kill 5060 +#define SYS_mmap 5009 +#define SYS_munmap 5011 +#define SYS_setitimer 5036 +#define SYS_clone 5055 +#define SYS_nanosleep 5034 +#define SYS_sched_yield 5023 +#define SYS_rt_sigreturn 5211 +#define SYS_rt_sigaction 5013 +#define SYS_rt_sigprocmask 5014 +#define SYS_sigaltstack 5129 +#define SYS_madvise 5027 +#define SYS_mincore 5026 +#define SYS_gettid 5178 +#define SYS_futex 5194 +#define SYS_sched_getaffinity 5196 +#define SYS_exit_group 5205 +#define SYS_timer_create 5216 +#define SYS_timer_settime 5217 +#define SYS_timer_delete 5220 +#define SYS_tgkill 5225 +#define SYS_openat 5247 +#define SYS_clock_gettime 5222 +#define SYS_brk 5012 +#define SYS_pipe2 5287 + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R4 + MOVV $SYS_exit_group, R2 + SYSCALL + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVV wait+0(FP), R1 + // We're done using the stack. + MOVW $0, R2 + SYNC + MOVW R2, (R1) + SYNC + MOVW $0, R4 // exit code + MOVV $SYS_exit, R2 + SYSCALL + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + // This uses openat instead of open, because Android O blocks open. 
+ MOVW $AT_FDCWD, R4 // AT_FDCWD, so this acts like open + MOVV name+0(FP), R5 + MOVW mode+8(FP), R6 + MOVW perm+12(FP), R7 + MOVV $SYS_openat, R2 + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R4 + MOVV $SYS_close, R2 + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVV fd+0(FP), R4 + MOVV p+8(FP), R5 + MOVW n+16(FP), R6 + MOVV $SYS_write, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R4 + MOVV p+8(FP), R5 + MOVW n+16(FP), R6 + MOVV $SYS_read, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVV $r+8(FP), R4 + MOVW flags+0(FP), R5 + MOVV $SYS_pipe2, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, errno+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16-4 + MOVWU usec+0(FP), R3 + MOVV R3, R5 + MOVW $1000000, R4 + DIVVU R4, R3 + MOVV LO, R3 + MOVV R3, 8(R29) + MOVW $1000, R4 + MULVU R3, R4 + MOVV LO, R4 + SUBVU R4, R5 + MOVV R5, 16(R29) + + // nanosleep(&ts, 0) + ADDV $8, R29, R4 + MOVW $0, R5 + MOVV $SYS_nanosleep, R2 + SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVV $SYS_gettid, R2 + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_getpid, R2 + SYSCALL + MOVW R2, R16 + MOVV $SYS_gettid, R2 + SYSCALL + MOVW R2, R5 // arg 2 tid + MOVW R16, R4 // arg 1 pid + MOVW sig+0(FP), R6 // arg 3 + MOVV $SYS_tgkill, R2 + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_getpid, R2 + SYSCALL + MOVW R2, R4 // arg 1 pid + MOVW sig+0(FP), R5 // arg 2 + MOVV $SYS_kill, R2 + SYSCALL + RET + +TEXT 
·getpid(SB),NOSPLIT|NOFRAME,$0-8 + MOVV $SYS_getpid, R2 + SYSCALL + MOVV R2, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT|NOFRAME,$0-24 + MOVV tgid+0(FP), R4 + MOVV tid+8(FP), R5 + MOVV sig+16(FP), R6 + MOVV $SYS_tgkill, R2 + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVV $SYS_setitimer, R2 + SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVW clockid+0(FP), R4 + MOVV sevp+8(FP), R5 + MOVV timerid+16(FP), R6 + MOVV $SYS_timer_create, R2 + SYSCALL + MOVW R2, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVW timerid+0(FP), R4 + MOVW flags+4(FP), R5 + MOVV new+8(FP), R6 + MOVV old+16(FP), R7 + MOVV $SYS_timer_settime, R2 + SYSCALL + MOVW R2, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVW timerid+0(FP), R4 + MOVV $SYS_timer_delete, R2 + SYSCALL + MOVW R2, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVV dst+16(FP), R6 + MOVV $SYS_mincore, R2 + SYSCALL + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$16-12 + MOVV R29, R16 // R16 is unchanged by C code + MOVV R29, R1 + + MOVV g_m(g), R17 // R17 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. 
+ MOVV m_vdsoPC(R17), R2 + MOVV m_vdsoSP(R17), R3 + MOVV R2, 8(R29) + MOVV R3, 16(R29) + + MOVV $ret-8(FP), R2 // caller's SP + MOVV R31, m_vdsoPC(R17) + MOVV R2, m_vdsoSP(R17) + + MOVV m_curg(R17), R4 + MOVV g, R5 + BNE R4, R5, noswitch + + MOVV m_g0(R17), R4 + MOVV (g_sched+gobuf_sp)(R4), R1 // Set SP to g0 stack + +noswitch: + SUBV $16, R1 + AND $~15, R1 // Align for C code + MOVV R1, R29 + + MOVW $0, R4 // CLOCK_REALTIME + MOVV $0(R29), R5 + + MOVV runtime·vdsoClockgettimeSym(SB), R25 + BEQ R25, fallback + + JAL (R25) + // check on vdso call return for kernel compatibility + // see https://golang.org/issues/39046 + // if we get any error make fallback permanent. + BEQ R2, R0, finish + MOVV R0, runtime·vdsoClockgettimeSym(SB) + MOVW $0, R4 // CLOCK_REALTIME + MOVV $0(R29), R5 + JMP fallback + +finish: + MOVV 0(R29), R3 // sec + MOVV 8(R29), R5 // nsec + + MOVV R16, R29 // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVV 16(R29), R1 + MOVV R1, m_vdsoSP(R17) + MOVV 8(R29), R1 + MOVV R1, m_vdsoPC(R17) + + MOVV R3, sec+0(FP) + MOVW R5, nsec+8(FP) + RET + +fallback: + MOVV $SYS_clock_gettime, R2 + SYSCALL + JMP finish + +TEXT runtime·nanotime1(SB),NOSPLIT,$16-8 + MOVV R29, R16 // R16 is unchanged by C code + MOVV R29, R1 + + MOVV g_m(g), R17 // R17 = m + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. 
+ MOVV m_vdsoPC(R17), R2 + MOVV m_vdsoSP(R17), R3 + MOVV R2, 8(R29) + MOVV R3, 16(R29) + + MOVV $ret-8(FP), R2 // caller's SP + MOVV R31, m_vdsoPC(R17) + MOVV R2, m_vdsoSP(R17) + + MOVV m_curg(R17), R4 + MOVV g, R5 + BNE R4, R5, noswitch + + MOVV m_g0(R17), R4 + MOVV (g_sched+gobuf_sp)(R4), R1 // Set SP to g0 stack + +noswitch: + SUBV $16, R1 + AND $~15, R1 // Align for C code + MOVV R1, R29 + + MOVW $1, R4 // CLOCK_MONOTONIC + MOVV $0(R29), R5 + + MOVV runtime·vdsoClockgettimeSym(SB), R25 + BEQ R25, fallback + + JAL (R25) + // see walltime for detail + BEQ R2, R0, finish + MOVV R0, runtime·vdsoClockgettimeSym(SB) + MOVW $1, R4 // CLOCK_MONOTONIC + MOVV $0(R29), R5 + JMP fallback + +finish: + MOVV 0(R29), R3 // sec + MOVV 8(R29), R5 // nsec + + MOVV R16, R29 // restore SP + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. 
+ MOVV 16(R29), R1 + MOVV R1, m_vdsoSP(R17) + MOVV 8(R29), R1 + MOVV R1, m_vdsoPC(R17) + + // sec is in R3, nsec in R5 + // return nsec in R3 + MOVV $1000000000, R4 + MULVU R4, R3 + MOVV LO, R3 + ADDVU R5, R3 + MOVV R3, ret+0(FP) + RET + +fallback: + MOVV $SYS_clock_gettime, R2 + SYSCALL + JMP finish + +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVW size+24(FP), R7 + MOVV $SYS_rt_sigprocmask, R2 + SYSCALL + BEQ R7, 2(PC) + MOVV R0, 0xf1(R0) // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOVV sig+0(FP), R4 + MOVV new+8(FP), R5 + MOVV old+16(FP), R6 + MOVV size+24(FP), R7 + MOVV $SYS_rt_sigaction, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+32(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R4 + MOVV info+16(FP), R5 + MOVV ctx+24(FP), R6 + MOVV fn+0(FP), R25 + JAL (R25) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$64 + // initialize REGSB = PC&0xffffffff00000000 + BGEZAL R0, 1(PC) + SRLV $32, R31, RSB + SLLV $32, RSB + + // this might be called in external code context, + // where g is not set. 
+ MOVB runtime·iscgo(SB), R1 + BEQ R1, 2(PC) + JAL runtime·load_g(SB) + + MOVW R4, 8(R29) + MOVV R5, 16(R29) + MOVV R6, 24(R29) + MOVV $runtime·sigtrampgo(SB), R1 + JAL (R1) + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + JMP runtime·sigtramp(SB) + +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVW prot+16(FP), R6 + MOVW flags+20(FP), R7 + MOVW fd+24(FP), R8 + MOVW off+28(FP), R9 + + MOVV $SYS_mmap, R2 + SYSCALL + BEQ R7, ok + MOVV $0, p+32(FP) + MOVV R2, err+40(FP) + RET +ok: + MOVV R2, p+32(FP) + MOVV $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVV $SYS_munmap, R2 + SYSCALL + BEQ R7, 2(PC) + MOVV R0, 0xf3(R0) // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVV n+8(FP), R5 + MOVW flags+16(FP), R6 + MOVV $SYS_madvise, R2 + SYSCALL + MOVW R2, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOVV addr+0(FP), R4 + MOVW op+8(FP), R5 + MOVW val+12(FP), R6 + MOVV ts+16(FP), R7 + MOVV addr2+24(FP), R8 + MOVW val3+32(FP), R9 + MOVV $SYS_futex, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+40(FP) + RET + +// int64 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), R4 + MOVV stk+8(FP), R5 + + // Copy mp, gp, fn off parent stack for use by child. + // Careful: Linux system call clobbers ???. + MOVV mp+16(FP), R16 + MOVV gp+24(FP), R17 + MOVV fn+32(FP), R18 + + MOVV R16, -8(R5) + MOVV R17, -16(R5) + MOVV R18, -24(R5) + MOVV $1234, R16 + MOVV R16, -32(R5) + + MOVV $SYS_clone, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + + // In parent, return. + BEQ R2, 3(PC) + MOVW R2, ret+40(FP) + RET + + // In child, on new stack. 
+ MOVV -32(R29), R16 + MOVV $1234, R1 + BEQ R16, R1, 2(PC) + MOVV R0, 0(R0) + + // Initialize m->procid to Linux tid + MOVV $SYS_gettid, R2 + SYSCALL + + MOVV -24(R29), R18 // fn + MOVV -16(R29), R17 // g + MOVV -8(R29), R16 // m + + BEQ R16, nog + BEQ R17, nog + + MOVV R2, m_procid(R16) + + // TODO: setup TLS. + + // In child, set up new stack + MOVV R16, g_m(R17) + MOVV R17, g + //CALL runtime·stackcheck(SB) + +nog: + // Call fn + JAL (R18) + + // It shouldn't return. If it does, exit that thread. + MOVW $111, R4 + MOVV $SYS_exit, R2 + SYSCALL + JMP -3(PC) // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVV new+0(FP), R4 + MOVV old+8(FP), R5 + MOVV $SYS_sigaltstack, R2 + SYSCALL + BEQ R7, 2(PC) + MOVV R0, 0xf1(R0) // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVV $SYS_sched_yield, R2 + SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOVV pid+0(FP), R4 + MOVV len+8(FP), R5 + MOVV buf+16(FP), R6 + MOVV $SYS_sched_getaffinity, R2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT|NOFRAME,$0-8 + // Implemented as brk(NULL). + MOVV $0, R4 + MOVV $SYS_brk, R2 + SYSCALL + MOVV R2, ret+0(FP) + RET + +TEXT runtime·access(SB),$0-20 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET + +TEXT runtime·connect(SB),$0-28 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+24(FP) // for vet + RET + +TEXT runtime·socket(SB),$0-20 + MOVV R0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET diff --git a/src/runtime/sys_linux_mipsx.s b/src/runtime/sys_linux_mipsx.s new file mode 100644 index 0000000..5e6b6c1 --- /dev/null +++ b/src/runtime/sys_linux_mipsx.s @@ -0,0 +1,507 @@ +// Copyright 2016 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips || mipsle) + +// +// System calls and other sys.stuff for mips, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define SYS_exit 4001 +#define SYS_read 4003 +#define SYS_write 4004 +#define SYS_open 4005 +#define SYS_close 4006 +#define SYS_getpid 4020 +#define SYS_kill 4037 +#define SYS_brk 4045 +#define SYS_mmap 4090 +#define SYS_munmap 4091 +#define SYS_setitimer 4104 +#define SYS_clone 4120 +#define SYS_sched_yield 4162 +#define SYS_nanosleep 4166 +#define SYS_rt_sigreturn 4193 +#define SYS_rt_sigaction 4194 +#define SYS_rt_sigprocmask 4195 +#define SYS_sigaltstack 4206 +#define SYS_madvise 4218 +#define SYS_mincore 4217 +#define SYS_gettid 4222 +#define SYS_futex 4238 +#define SYS_sched_getaffinity 4240 +#define SYS_exit_group 4246 +#define SYS_timer_create 4257 +#define SYS_timer_settime 4258 +#define SYS_timer_delete 4261 +#define SYS_clock_gettime 4263 +#define SYS_tgkill 4266 +#define SYS_pipe2 4328 + +TEXT runtime·exit(SB),NOSPLIT,$0-4 + MOVW code+0(FP), R4 + MOVW $SYS_exit_group, R2 + SYSCALL + UNDEF + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVW wait+0(FP), R1 + // We're done using the stack. 
+ MOVW $0, R2 + SYNC + MOVW R2, (R1) + SYNC + MOVW $0, R4 // exit code + MOVW $SYS_exit, R2 + SYSCALL + UNDEF + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$0-16 + MOVW name+0(FP), R4 + MOVW mode+4(FP), R5 + MOVW perm+8(FP), R6 + MOVW $SYS_open, R2 + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0-8 + MOVW fd+0(FP), R4 + MOVW $SYS_close, R2 + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+4(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$0-16 + MOVW fd+0(FP), R4 + MOVW p+4(FP), R5 + MOVW n+8(FP), R6 + MOVW $SYS_write, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+12(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$0-16 + MOVW fd+0(FP), R4 + MOVW p+4(FP), R5 + MOVW n+8(FP), R6 + MOVW $SYS_read, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-16 + MOVW $r+4(FP), R4 + MOVW flags+0(FP), R5 + MOVW $SYS_pipe2, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, errno+12(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$28-4 + MOVW usec+0(FP), R3 + MOVW R3, R5 + MOVW $1000000, R4 + DIVU R4, R3 + MOVW LO, R3 + MOVW R3, 24(R29) + MOVW $1000, R4 + MULU R3, R4 + MOVW LO, R4 + SUBU R4, R5 + MOVW R5, 28(R29) + + // nanosleep(&ts, 0) + ADDU $24, R29, R4 + MOVW $0, R5 + MOVW $SYS_nanosleep, R2 + SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVW $SYS_gettid, R2 + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT,$0-4 + MOVW $SYS_getpid, R2 + SYSCALL + MOVW R2, R16 + MOVW $SYS_gettid, R2 + SYSCALL + MOVW R2, R5 // arg 2 tid + MOVW R16, R4 // arg 1 pid + MOVW sig+0(FP), R6 // arg 3 + MOVW $SYS_tgkill, R2 + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$0 + MOVW $SYS_getpid, R2 + SYSCALL + MOVW R2, R4 // arg 1 pid + MOVW sig+0(FP), R5 // arg 2 + MOVW $SYS_kill, 
R2 + SYSCALL + RET + +TEXT ·getpid(SB),NOSPLIT,$0-4 + MOVW $SYS_getpid, R2 + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT,$0-12 + MOVW tgid+0(FP), R4 + MOVW tid+4(FP), R5 + MOVW sig+8(FP), R6 + MOVW $SYS_tgkill, R2 + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$0-12 + MOVW mode+0(FP), R4 + MOVW new+4(FP), R5 + MOVW old+8(FP), R6 + MOVW $SYS_setitimer, R2 + SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-16 + MOVW clockid+0(FP), R4 + MOVW sevp+4(FP), R5 + MOVW timerid+8(FP), R6 + MOVW $SYS_timer_create, R2 + SYSCALL + MOVW R2, ret+12(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-20 + MOVW timerid+0(FP), R4 + MOVW flags+4(FP), R5 + MOVW new+8(FP), R6 + MOVW old+12(FP), R7 + MOVW $SYS_timer_settime, R2 + SYSCALL + MOVW R2, ret+16(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-8 + MOVW timerid+0(FP), R4 + MOVW $SYS_timer_delete, R2 + SYSCALL + MOVW R2, ret+4(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT,$0-16 + MOVW addr+0(FP), R4 + MOVW n+4(FP), R5 + MOVW dst+8(FP), R6 + MOVW $SYS_mincore, R2 + SYSCALL + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+12(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$8-12 + MOVW $0, R4 // CLOCK_REALTIME + MOVW $4(R29), R5 + MOVW $SYS_clock_gettime, R2 + SYSCALL + MOVW 4(R29), R3 // sec + MOVW 8(R29), R5 // nsec + MOVW $sec+0(FP), R6 +#ifdef GOARCH_mips + MOVW R3, 4(R6) + MOVW R0, 0(R6) +#else + MOVW R3, 0(R6) + MOVW R0, 4(R6) +#endif + MOVW R5, nsec+8(FP) + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$8-8 + MOVW $1, R4 // CLOCK_MONOTONIC + MOVW $4(R29), R5 + MOVW $SYS_clock_gettime, R2 + SYSCALL + MOVW 4(R29), R3 // sec + MOVW 8(R29), R5 // nsec + // sec is in R3, nsec in R5 + // return nsec in R3 + MOVW $1000000000, R4 + MULU R4, R3 + MOVW LO, R3 + ADDU R5, R3 + SGTU R5, R3, R4 + MOVW $ret+0(FP), R6 +#ifdef GOARCH_mips + MOVW R3, 4(R6) +#else + MOVW R3, 0(R6) +#endif + MOVW HI, R3 + ADDU R4, R3 +#ifdef GOARCH_mips + MOVW R3, 
0(R6) +#else + MOVW R3, 4(R6) +#endif + RET + +TEXT runtime·rtsigprocmask(SB),NOSPLIT,$0-16 + MOVW how+0(FP), R4 + MOVW new+4(FP), R5 + MOVW old+8(FP), R6 + MOVW size+12(FP), R7 + MOVW $SYS_rt_sigprocmask, R2 + SYSCALL + BEQ R7, 2(PC) + UNDEF // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT,$0-20 + MOVW sig+0(FP), R4 + MOVW new+4(FP), R5 + MOVW old+8(FP), R6 + MOVW size+12(FP), R7 + MOVW $SYS_rt_sigaction, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+16(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVW sig+4(FP), R4 + MOVW info+8(FP), R5 + MOVW ctx+12(FP), R6 + MOVW fn+0(FP), R25 + MOVW R29, R22 + SUBU $16, R29 + AND $~7, R29 // shadow space for 4 args aligned to 8 bytes as per O32 ABI + JAL (R25) + MOVW R22, R29 + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$12 + // this might be called in external code context, + // where g is not set. + MOVB runtime·iscgo(SB), R1 + BEQ R1, 2(PC) + JAL runtime·load_g(SB) + + MOVW R4, 4(R29) + MOVW R5, 8(R29) + MOVW R6, 12(R29) + MOVW $runtime·sigtrampgo(SB), R1 + JAL (R1) + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + JMP runtime·sigtramp(SB) + +TEXT runtime·mmap(SB),NOSPLIT,$20-32 + MOVW addr+0(FP), R4 + MOVW n+4(FP), R5 + MOVW prot+8(FP), R6 + MOVW flags+12(FP), R7 + MOVW fd+16(FP), R8 + MOVW off+20(FP), R9 + MOVW R8, 16(R29) + MOVW R9, 20(R29) + + MOVW $SYS_mmap, R2 + SYSCALL + BEQ R7, ok + MOVW $0, p+24(FP) + MOVW R2, err+28(FP) + RET +ok: + MOVW R2, p+24(FP) + MOVW $0, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0-8 + MOVW addr+0(FP), R4 + MOVW n+4(FP), R5 + MOVW $SYS_munmap, R2 + SYSCALL + BEQ R7, 2(PC) + UNDEF // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0-16 + MOVW addr+0(FP), R4 + MOVW n+4(FP), R5 + MOVW flags+8(FP), R6 + MOVW $SYS_madvise, R2 + SYSCALL + MOVW R2, ret+12(FP) + RET + +// int32 futex(int32 *uaddr, int32 op, int32 val, struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT,$20-28 + MOVW 
addr+0(FP), R4 + MOVW op+4(FP), R5 + MOVW val+8(FP), R6 + MOVW ts+12(FP), R7 + + MOVW addr2+16(FP), R8 + MOVW val3+20(FP), R9 + + MOVW R8, 16(R29) + MOVW R9, 20(R29) + + MOVW $SYS_futex, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + + +// int32 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0-24 + MOVW flags+0(FP), R4 + MOVW stk+4(FP), R5 + MOVW R0, R6 // ptid + MOVW R0, R7 // tls + + // O32 syscall handler unconditionally copies arguments 5-8 from stack, + // even for syscalls with less than 8 arguments. Reserve 32 bytes of new + // stack so that any syscall invoked immediately in the new thread won't fail. + ADD $-32, R5 + + // Copy mp, gp, fn off parent stack for use by child. + MOVW mp+8(FP), R16 + MOVW gp+12(FP), R17 + MOVW fn+16(FP), R18 + + MOVW $1234, R1 + + MOVW R16, 0(R5) + MOVW R17, 4(R5) + MOVW R18, 8(R5) + + MOVW R1, 12(R5) + + MOVW $SYS_clone, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + + // In parent, return. + BEQ R2, 3(PC) + MOVW R2, ret+20(FP) + RET + + // In child, on new stack. + // Check that SP is as we expect + NOP R29 // tell vet R29/SP changed - stop checking offsets + MOVW 12(R29), R16 + MOVW $1234, R1 + BEQ R16, R1, 2(PC) + MOVW (R0), R0 + + // Initialize m->procid to Linux tid + MOVW $SYS_gettid, R2 + SYSCALL + + MOVW 0(R29), R16 // m + MOVW 4(R29), R17 // g + MOVW 8(R29), R18 // fn + + BEQ R16, nog + BEQ R17, nog + + MOVW R2, m_procid(R16) + + // In child, set up new stack + MOVW R16, g_m(R17) + MOVW R17, g + +// TODO(mips32): doesn't have runtime·stackcheck(SB) + +nog: + // Call fn + ADDU $32, R29 + JAL (R18) + + // It shouldn't return. If it does, exit that thread. 
+ ADDU $-32, R29 + MOVW $0xf4, R4 + MOVW $SYS_exit, R2 + SYSCALL + UNDEF + +TEXT runtime·sigaltstack(SB),NOSPLIT,$0 + MOVW new+0(FP), R4 + MOVW old+4(FP), R5 + MOVW $SYS_sigaltstack, R2 + SYSCALL + BEQ R7, 2(PC) + UNDEF // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVW $SYS_sched_yield, R2 + SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT,$0-16 + MOVW pid+0(FP), R4 + MOVW len+4(FP), R5 + MOVW buf+8(FP), R6 + MOVW $SYS_sched_getaffinity, R2 + SYSCALL + BEQ R7, 2(PC) + SUBU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+12(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-4 + // Implemented as brk(NULL). + MOVW $0, R4 + MOVW $SYS_brk, R2 + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT runtime·access(SB),$0-12 + BREAK // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+8(FP) // for vet + RET + +TEXT runtime·connect(SB),$0-16 + BREAK // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+12(FP) // for vet + RET + +TEXT runtime·socket(SB),$0-16 + BREAK // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+12(FP) // for vet + RET diff --git a/src/runtime/sys_linux_ppc64x.s b/src/runtime/sys_linux_ppc64x.s new file mode 100644 index 0000000..d0427a4 --- /dev/null +++ b/src/runtime/sys_linux_ppc64x.s @@ -0,0 +1,901 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (ppc64 || ppc64le) + +// +// System calls and other sys.stuff for ppc64, Linux +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "asm_ppc64x.h" + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_brk 45 +#define SYS_mmap 90 +#define SYS_munmap 91 +#define SYS_setitimer 104 +#define SYS_clone 120 +#define SYS_sched_yield 158 +#define SYS_nanosleep 162 +#define SYS_rt_sigreturn 172 +#define SYS_rt_sigaction 173 +#define SYS_rt_sigprocmask 174 +#define SYS_sigaltstack 185 +#define SYS_madvise 205 +#define SYS_mincore 206 +#define SYS_gettid 207 +#define SYS_futex 221 +#define SYS_sched_getaffinity 223 +#define SYS_exit_group 234 +#define SYS_timer_create 240 +#define SYS_timer_settime 241 +#define SYS_timer_delete 244 +#define SYS_clock_gettime 246 +#define SYS_tgkill 250 +#define SYS_pipe2 317 + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R3 + SYSCALL $SYS_exit_group + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVD wait+0(FP), R1 + // We're done using the stack. 
+ MOVW $0, R2 + SYNC + MOVW R2, (R1) + MOVW $0, R3 // exit code + SYSCALL $SYS_exit + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOVD name+0(FP), R3 + MOVW mode+8(FP), R4 + MOVW perm+12(FP), R5 + SYSCALL $SYS_open + BVC 2(PC) + MOVW $-1, R3 + MOVW R3, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R3 + SYSCALL $SYS_close + BVC 2(PC) + MOVW $-1, R3 + MOVW R3, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVD fd+0(FP), R3 + MOVD p+8(FP), R4 + MOVW n+16(FP), R5 + SYSCALL $SYS_write + BVC 2(PC) + NEG R3 // caller expects negative errno + MOVW R3, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R3 + MOVD p+8(FP), R4 + MOVW n+16(FP), R5 + SYSCALL $SYS_read + BVC 2(PC) + NEG R3 // caller expects negative errno + MOVW R3, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + ADD $FIXED_FRAME+8, R1, R3 + MOVW flags+0(FP), R4 + SYSCALL $SYS_pipe2 + MOVW R3, errno+16(FP) + RET + +// func usleep(usec uint32) +TEXT runtime·usleep(SB),NOSPLIT,$16-4 + MOVW usec+0(FP), R3 + + // Use magic constant 0x8637bd06 and shift right 51 + // to perform usec/1000000. + MOVD $0x8637bd06, R4 + MULLD R3, R4, R4 // Convert usec to S. + SRD $51, R4, R4 + MOVD R4, 8(R1) // Store to tv_sec + + MOVD $1000000, R5 + MULLW R4, R5, R5 // Convert tv_sec back into uS + SUB R5, R3, R5 // Compute remainder uS. 
+ MULLD $1000, R5, R5 // Convert to nsec + MOVD R5, 16(R1) // Store to tv_nsec + + // nanosleep(&ts, 0) + ADD $8, R1, R3 + MOVW $0, R4 + SYSCALL $SYS_nanosleep + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + SYSCALL $SYS_gettid + MOVW R3, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + SYSCALL $SYS_getpid + MOVW R3, R14 + SYSCALL $SYS_gettid + MOVW R3, R4 // arg 2 tid + MOVW R14, R3 // arg 1 pid + MOVW sig+0(FP), R5 // arg 3 + SYSCALL $SYS_tgkill + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + SYSCALL $SYS_getpid + MOVW R3, R3 // arg 1 pid + MOVW sig+0(FP), R4 // arg 2 + SYSCALL $SYS_kill + RET + +TEXT ·getpid(SB),NOSPLIT|NOFRAME,$0-8 + SYSCALL $SYS_getpid + MOVD R3, ret+0(FP) + RET + +TEXT ·tgkill(SB),NOSPLIT|NOFRAME,$0-24 + MOVD tgid+0(FP), R3 + MOVD tid+8(FP), R4 + MOVD sig+16(FP), R5 + SYSCALL $SYS_tgkill + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + SYSCALL $SYS_setitimer + RET + +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVW clockid+0(FP), R3 + MOVD sevp+8(FP), R4 + MOVD timerid+16(FP), R5 + SYSCALL $SYS_timer_create + MOVW R3, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVW timerid+0(FP), R3 + MOVW flags+4(FP), R4 + MOVD new+8(FP), R5 + MOVD old+16(FP), R6 + SYSCALL $SYS_timer_settime + MOVW R3, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVW timerid+0(FP), R3 + SYSCALL $SYS_timer_delete + MOVW R3, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOVD addr+0(FP), R3 + MOVD n+8(FP), R4 + MOVD dst+16(FP), R5 + SYSCALL $SYS_mincore + NEG R3 // caller expects negative errno + MOVW R3, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$16-12 + MOVD R1, R15 // R15 is unchanged by C code + MOVD g_m(g), R21 // R21 = m + + MOVD $0, R3 // CLOCK_REALTIME + + MOVD runtime·vdsoClockgettimeSym(SB), R12 // Check for VDSO availability + CMP R12, R0 + BEQ 
fallback + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVD m_vdsoPC(R21), R4 + MOVD m_vdsoSP(R21), R5 + MOVD R4, 32(R1) + MOVD R5, 40(R1) + + MOVD LR, R14 + MOVD $ret-FIXED_FRAME(FP), R5 // caller's SP + MOVD R14, m_vdsoPC(R21) + MOVD R5, m_vdsoSP(R21) + + MOVD m_curg(R21), R6 + CMP g, R6 + BNE noswitch + + MOVD m_g0(R21), R7 + MOVD (g_sched+gobuf_sp)(R7), R1 // Set SP to g0 stack + +noswitch: + SUB $16, R1 // Space for results + RLDICR $0, R1, $59, R1 // Align for C code + MOVD R12, CTR + MOVD R1, R4 + + // Store g on gsignal's stack, so if we receive a signal + // during VDSO code we can find the g. + // If we don't have a signal stack, we won't receive signal, + // so don't bother saving g. + // When using cgo, we already saved g on TLS, also don't save + // g here. + // Also don't save g if we are already on the signal stack. + // We won't get a nested signal. + MOVBZ runtime·iscgo(SB), R22 + CMP R22, $0 + BNE nosaveg + MOVD m_gsignal(R21), R22 // g.m.gsignal + CMP R22, $0 + BEQ nosaveg + + CMP g, R22 + BEQ nosaveg + MOVD (g_stack+stack_lo)(R22), R22 // g.m.gsignal.stack.lo + MOVD g, (R22) + + BL (CTR) // Call from VDSO + + MOVD $0, (R22) // clear g slot, R22 is unchanged by C code + + JMP finish + +nosaveg: + BL (CTR) // Call from VDSO + +finish: + MOVD $0, R0 // Restore R0 + MOVD 0(R1), R3 // sec + MOVD 8(R1), R5 // nsec + MOVD R15, R1 // Restore SP + + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. 
+ MOVD 40(R1), R6 + MOVD R6, m_vdsoSP(R21) + MOVD 32(R1), R6 + MOVD R6, m_vdsoPC(R21) + +return: + MOVD R3, sec+0(FP) + MOVW R5, nsec+8(FP) + RET + + // Syscall fallback +fallback: + ADD $32, R1, R4 + SYSCALL $SYS_clock_gettime + MOVD 32(R1), R3 + MOVD 40(R1), R5 + JMP return + +TEXT runtime·nanotime1(SB),NOSPLIT,$16-8 + MOVD $1, R3 // CLOCK_MONOTONIC + + MOVD R1, R15 // R15 is unchanged by C code + MOVD g_m(g), R21 // R21 = m + + MOVD runtime·vdsoClockgettimeSym(SB), R12 // Check for VDSO availability + CMP R12, R0 + BEQ fallback + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVD m_vdsoPC(R21), R4 + MOVD m_vdsoSP(R21), R5 + MOVD R4, 32(R1) + MOVD R5, 40(R1) + + MOVD LR, R14 // R14 is unchanged by C code + MOVD $ret-FIXED_FRAME(FP), R5 // caller's SP + MOVD R14, m_vdsoPC(R21) + MOVD R5, m_vdsoSP(R21) + + MOVD m_curg(R21), R6 + CMP g, R6 + BNE noswitch + + MOVD m_g0(R21), R7 + MOVD (g_sched+gobuf_sp)(R7), R1 // Set SP to g0 stack + +noswitch: + SUB $16, R1 // Space for results + RLDICR $0, R1, $59, R1 // Align for C code + MOVD R12, CTR + MOVD R1, R4 + + // Store g on gsignal's stack, so if we receive a signal + // during VDSO code we can find the g. + // If we don't have a signal stack, we won't receive signal, + // so don't bother saving g. + // When using cgo, we already saved g on TLS, also don't save + // g here. + // Also don't save g if we are already on the signal stack. + // We won't get a nested signal. 
+ MOVBZ runtime·iscgo(SB), R22 + CMP R22, $0 + BNE nosaveg + MOVD m_gsignal(R21), R22 // g.m.gsignal + CMP R22, $0 + BEQ nosaveg + + CMP g, R22 + BEQ nosaveg + MOVD (g_stack+stack_lo)(R22), R22 // g.m.gsignal.stack.lo + MOVD g, (R22) + + BL (CTR) // Call from VDSO + + MOVD $0, (R22) // clear g slot, R22 is unchanged by C code + + JMP finish + +nosaveg: + BL (CTR) // Call from VDSO + +finish: + MOVD $0, R0 // Restore R0 + MOVD 0(R1), R3 // sec + MOVD 8(R1), R5 // nsec + MOVD R15, R1 // Restore SP + + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVD 40(R1), R6 + MOVD R6, m_vdsoSP(R21) + MOVD 32(R1), R6 + MOVD R6, m_vdsoPC(R21) + +return: + // sec is in R3, nsec in R5 + // return nsec in R3 + MOVD $1000000000, R4 + MULLD R4, R3 + ADD R5, R3 + MOVD R3, ret+0(FP) + RET + + // Syscall fallback +fallback: + ADD $32, R1, R4 + SYSCALL $SYS_clock_gettime + MOVD 32(R1), R3 + MOVD 40(R1), R5 + JMP return + +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVW size+24(FP), R6 + SYSCALL $SYS_rt_sigprocmask + BVC 2(PC) + MOVD R0, 0xf0(R0) // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOVD sig+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVD size+24(FP), R6 + SYSCALL $SYS_rt_sigaction + BVC 2(PC) + NEG R3 // caller expects negative errno + MOVW R3, ret+32(FP) + RET + +#ifdef GOARCH_ppc64le +// Call the function stored in _cgo_sigaction using the GCC calling convention. 
+TEXT runtime·callCgoSigaction(SB),NOSPLIT,$0 + MOVD sig+0(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVD _cgo_sigaction(SB), R12 + MOVD R12, CTR // R12 should contain the function address + MOVD R1, R15 // Save R1 + MOVD R2, 24(R1) // Save R2 + SUB $48, R1 // reserve 32 (frame) + 16 bytes for sp-8 where fp may be saved. + RLDICR $0, R1, $59, R1 // Align to 16 bytes for C code + BL (CTR) + XOR R0, R0, R0 // Clear R0 as Go expects + MOVD R15, R1 // Restore R1 + MOVD 24(R1), R2 // Restore R2 + MOVW R3, ret+24(FP) // Return result + RET +#endif + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R3 + MOVD info+16(FP), R4 + MOVD ctx+24(FP), R5 + MOVD fn+0(FP), R12 + MOVD R12, CTR + BL (CTR) + MOVD 24(R1), R2 + RET + +TEXT runtime·sigreturn(SB),NOSPLIT,$0-0 + RET + +#ifdef GOARCH_ppc64le +// ppc64le doesn't need function descriptors +// Save callee-save registers in the case of signal forwarding. +// Same as on ARM64 https://golang.org/issue/31827 . +TEXT runtime·sigtramp(SB),NOSPLIT|NOFRAME,$0 +#else +// function descriptor for the real sigtramp +TEXT runtime·sigtramp(SB),NOSPLIT|NOFRAME,$0 + DWORD $sigtramp<>(SB) + DWORD $0 + DWORD $0 +TEXT sigtramp<>(SB),NOSPLIT|NOFRAME|TOPFRAME,$0 +#endif + // Start with standard C stack frame layout and linkage. + MOVD LR, R0 + MOVD R0, 16(R1) // Save LR in caller's frame. + MOVW CR, R0 // Save CR in caller's frame + MOVD R0, 8(R1) + // The stack must be acquired here and not + // in the automatic way based on stack size + // since that sequence clobbers R31 before it + // gets saved. + // We are being ultra safe here in saving the + // Vregs. The case where they might need to + // be saved is very unlikely. 
+ MOVDU R1, -544(R1) + MOVD R14, 64(R1) + MOVD R15, 72(R1) + MOVD R16, 80(R1) + MOVD R17, 88(R1) + MOVD R18, 96(R1) + MOVD R19, 104(R1) + MOVD R20, 112(R1) + MOVD R21, 120(R1) + MOVD R22, 128(R1) + MOVD R23, 136(R1) + MOVD R24, 144(R1) + MOVD R25, 152(R1) + MOVD R26, 160(R1) + MOVD R27, 168(R1) + MOVD R28, 176(R1) + MOVD R29, 184(R1) + MOVD g, 192(R1) // R30 + MOVD R31, 200(R1) + FMOVD F14, 208(R1) + FMOVD F15, 216(R1) + FMOVD F16, 224(R1) + FMOVD F17, 232(R1) + FMOVD F18, 240(R1) + FMOVD F19, 248(R1) + FMOVD F20, 256(R1) + FMOVD F21, 264(R1) + FMOVD F22, 272(R1) + FMOVD F23, 280(R1) + FMOVD F24, 288(R1) + FMOVD F25, 296(R1) + FMOVD F26, 304(R1) + FMOVD F27, 312(R1) + FMOVD F28, 320(R1) + FMOVD F29, 328(R1) + FMOVD F30, 336(R1) + FMOVD F31, 344(R1) + // Save V regs + // STXVD2X and LXVD2X used since + // we aren't sure of alignment. + // Endianness doesn't matter + // if we are just loading and + // storing values. + MOVD $352, R7 // V20 + STXVD2X VS52, (R7)(R1) + ADD $16, R7 // V21 368 + STXVD2X VS53, (R7)(R1) + ADD $16, R7 // V22 384 + STXVD2X VS54, (R7)(R1) + ADD $16, R7 // V23 400 + STXVD2X VS55, (R7)(R1) + ADD $16, R7 // V24 416 + STXVD2X VS56, (R7)(R1) + ADD $16, R7 // V25 432 + STXVD2X VS57, (R7)(R1) + ADD $16, R7 // V26 448 + STXVD2X VS58, (R7)(R1) + ADD $16, R7 // V27 464 + STXVD2X VS59, (R7)(R1) + ADD $16, R7 // V28 480 + STXVD2X VS60, (R7)(R1) + ADD $16, R7 // V29 496 + STXVD2X VS61, (R7)(R1) + ADD $16, R7 // V30 512 + STXVD2X VS62, (R7)(R1) + ADD $16, R7 // V31 528 + STXVD2X VS63, (R7)(R1) + + // initialize essential registers (just in case) + BL runtime·reginit(SB) + + // this might be called in external code context, + // where g is not set. + MOVBZ runtime·iscgo(SB), R6 + CMP R6, $0 + BEQ 2(PC) + BL runtime·load_g(SB) + + MOVW R3, FIXED_FRAME+0(R1) + MOVD R4, FIXED_FRAME+8(R1) + MOVD R5, FIXED_FRAME+16(R1) + MOVD $runtime·sigtrampgo(SB), R12 + MOVD R12, CTR + BL (CTR) + MOVD 24(R1), R2 // Should this be here? Where is it saved? 
+ // Starts at 64; FIXED_FRAME is 32 + MOVD 64(R1), R14 + MOVD 72(R1), R15 + MOVD 80(R1), R16 + MOVD 88(R1), R17 + MOVD 96(R1), R18 + MOVD 104(R1), R19 + MOVD 112(R1), R20 + MOVD 120(R1), R21 + MOVD 128(R1), R22 + MOVD 136(R1), R23 + MOVD 144(R1), R24 + MOVD 152(R1), R25 + MOVD 160(R1), R26 + MOVD 168(R1), R27 + MOVD 176(R1), R28 + MOVD 184(R1), R29 + MOVD 192(R1), g // R30 + MOVD 200(R1), R31 + FMOVD 208(R1), F14 + FMOVD 216(R1), F15 + FMOVD 224(R1), F16 + FMOVD 232(R1), F17 + FMOVD 240(R1), F18 + FMOVD 248(R1), F19 + FMOVD 256(R1), F20 + FMOVD 264(R1), F21 + FMOVD 272(R1), F22 + FMOVD 280(R1), F23 + FMOVD 288(R1), F24 + FMOVD 296(R1), F25 + FMOVD 304(R1), F26 + FMOVD 312(R1), F27 + FMOVD 320(R1), F28 + FMOVD 328(R1), F29 + FMOVD 336(R1), F30 + FMOVD 344(R1), F31 + MOVD $352, R7 + LXVD2X (R7)(R1), VS52 + ADD $16, R7 // 368 V21 + LXVD2X (R7)(R1), VS53 + ADD $16, R7 // 384 V22 + LXVD2X (R7)(R1), VS54 + ADD $16, R7 // 400 V23 + LXVD2X (R7)(R1), VS55 + ADD $16, R7 // 416 V24 + LXVD2X (R7)(R1), VS56 + ADD $16, R7 // 432 V25 + LXVD2X (R7)(R1), VS57 + ADD $16, R7 // 448 V26 + LXVD2X (R7)(R1), VS58 + ADD $16, R7 // 464 V27 + LXVD2X (R7)(R1), VS59 + ADD $16, R7 // 480 V28 + LXVD2X (R7)(R1), VS60 + ADD $16, R7 // 496 V29 + LXVD2X (R7)(R1), VS61 + ADD $16, R7 // 512 V30 + LXVD2X (R7)(R1), VS62 + ADD $16, R7 // 528 V31 + LXVD2X (R7)(R1), VS63 + ADD $544, R1 + MOVD 8(R1), R0 + MOVFL R0, $0xff + MOVD 16(R1), R0 + MOVD R0, LR + + RET + +#ifdef GOARCH_ppc64le +// ppc64le doesn't need function descriptors +TEXT runtime·cgoSigtramp(SB),NOSPLIT|NOFRAME,$0 + // The stack unwinder, presumably written in C, may not be able to + // handle Go frame correctly. So, this function is NOFRAME, and we + // save/restore LR manually. + MOVD LR, R10 + + // We're coming from C code, initialize essential registers. + CALL runtime·reginit(SB) + + // If no traceback function, do usual sigtramp.
+ MOVD runtime·cgoTraceback(SB), R6 + CMP $0, R6 + BEQ sigtramp + + // If no traceback support function, which means that + // runtime/cgo was not linked in, do usual sigtramp. + MOVD _cgo_callers(SB), R6 + CMP $0, R6 + BEQ sigtramp + + // Set up g register. + CALL runtime·load_g(SB) + + // Figure out if we are currently in a cgo call. + // If not, just do usual sigtramp. + // compared to ARM64 and others. + CMP $0, g + BEQ sigtrampnog // g == nil + MOVD g_m(g), R6 + CMP $0, R6 + BEQ sigtramp // g.m == nil + MOVW m_ncgo(R6), R7 + CMPW $0, R7 + BEQ sigtramp // g.m.ncgo = 0 + MOVD m_curg(R6), R7 + CMP $0, R7 + BEQ sigtramp // g.m.curg == nil + MOVD g_syscallsp(R7), R7 + CMP $0, R7 + BEQ sigtramp // g.m.curg.syscallsp == 0 + MOVD m_cgoCallers(R6), R7 // R7 is the fifth arg in C calling convention. + CMP $0, R7 + BEQ sigtramp // g.m.cgoCallers == nil + MOVW m_cgoCallersUse(R6), R8 + CMPW $0, R8 + BNE sigtramp // g.m.cgoCallersUse != 0 + + // Jump to a function in runtime/cgo. + // That function, written in C, will call the user's traceback + // function with proper unwind info, and will then call back here. + // The first three arguments, and the fifth, are already in registers. + // Set the two remaining arguments now. + MOVD runtime·cgoTraceback(SB), R6 + MOVD $runtime·sigtramp(SB), R8 + MOVD _cgo_callers(SB), R12 + MOVD R12, CTR + MOVD R10, LR // restore LR + JMP (CTR) + +sigtramp: + MOVD R10, LR // restore LR + JMP runtime·sigtramp(SB) + +sigtrampnog: + // Signal arrived on a non-Go thread. If this is SIGPROF, get a + // stack trace. + CMPW R3, $27 // 27 == SIGPROF + BNE sigtramp + + // Lock sigprofCallersUse (cas from 0 to 1). + MOVW $1, R7 + MOVD $runtime·sigprofCallersUse(SB), R8 + SYNC + LWAR (R8), R6 + CMPW $0, R6 + BNE sigtramp + STWCCC R7, (R8) + BNE -4(PC) + ISYNC + + // Jump to the traceback function in runtime/cgo. + // It will call back to sigprofNonGo, which will ignore the + // arguments passed in registers. 
+ // First three arguments to traceback function are in registers already. + MOVD runtime·cgoTraceback(SB), R6 + MOVD $runtime·sigprofCallers(SB), R7 + MOVD $runtime·sigprofNonGoWrapper<>(SB), R8 + MOVD _cgo_callers(SB), R12 + MOVD R12, CTR + MOVD R10, LR // restore LR + JMP (CTR) +#else +// function descriptor for the real sigtramp +TEXT runtime·cgoSigtramp(SB),NOSPLIT|NOFRAME,$0 + DWORD $cgoSigtramp<>(SB) + DWORD $0 + DWORD $0 +TEXT cgoSigtramp<>(SB),NOSPLIT,$0 + JMP sigtramp<>(SB) +#endif + +TEXT runtime·sigprofNonGoWrapper<>(SB),NOSPLIT,$0 + // We're coming from C code, set up essential register, then call sigprofNonGo. + CALL runtime·reginit(SB) + MOVW R3, FIXED_FRAME+0(R1) // sig + MOVD R4, FIXED_FRAME+8(R1) // info + MOVD R5, FIXED_FRAME+16(R1) // ctx + CALL runtime·sigprofNonGo(SB) + RET + +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R3 + MOVD n+8(FP), R4 + MOVW prot+16(FP), R5 + MOVW flags+20(FP), R6 + MOVW fd+24(FP), R7 + MOVW off+28(FP), R8 + + SYSCALL $SYS_mmap + BVC ok + MOVD $0, p+32(FP) + MOVD R3, err+40(FP) + RET +ok: + MOVD R3, p+32(FP) + MOVD $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R3 + MOVD n+8(FP), R4 + SYSCALL $SYS_munmap + BVC 2(PC) + MOVD R0, 0xf0(R0) + RET + +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R3 + MOVD n+8(FP), R4 + MOVW flags+16(FP), R5 + SYSCALL $SYS_madvise + MOVW R3, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R3 + MOVW op+8(FP), R4 + MOVW val+12(FP), R5 + MOVD ts+16(FP), R6 + MOVD addr2+24(FP), R7 + MOVW val3+32(FP), R8 + SYSCALL $SYS_futex + BVC 2(PC) + NEG R3 // caller expects negative errno + MOVW R3, ret+40(FP) + RET + +// int64 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), R3 + MOVD stk+8(FP), R4 + + // Copy mp, gp, fn 
off parent stack for use by child. + // Careful: Linux system call clobbers ???. + MOVD mp+16(FP), R7 + MOVD gp+24(FP), R8 + MOVD fn+32(FP), R12 + + MOVD R7, -8(R4) + MOVD R8, -16(R4) + MOVD R12, -24(R4) + MOVD $1234, R7 + MOVD R7, -32(R4) + + SYSCALL $SYS_clone + BVC 2(PC) + NEG R3 // caller expects negative errno + + // In parent, return. + CMP R3, $0 + BEQ 3(PC) + MOVW R3, ret+40(FP) + RET + + // In child, on new stack. + // initialize essential registers + BL runtime·reginit(SB) + MOVD -32(R1), R7 + CMP R7, $1234 + BEQ 2(PC) + MOVD R0, 0(R0) + + // Initialize m->procid to Linux tid + SYSCALL $SYS_gettid + + MOVD -24(R1), R12 // fn + MOVD -16(R1), R8 // g + MOVD -8(R1), R7 // m + + CMP R7, $0 + BEQ nog + CMP R8, $0 + BEQ nog + + MOVD R3, m_procid(R7) + + // TODO: setup TLS. + + // In child, set up new stack + MOVD R7, g_m(R8) + MOVD R8, g + //CALL runtime·stackcheck(SB) + +nog: + // Call fn + MOVD R12, CTR + BL (CTR) + + // It shouldn't return. If it does, exit that thread. + MOVW $111, R3 + SYSCALL $SYS_exit + BR -2(PC) // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVD new+0(FP), R3 + MOVD old+8(FP), R4 + SYSCALL $SYS_sigaltstack + BVC 2(PC) + MOVD R0, 0xf0(R0) // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + SYSCALL $SYS_sched_yield + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOVD pid+0(FP), R3 + MOVD len+8(FP), R4 + MOVD buf+16(FP), R5 + SYSCALL $SYS_sched_getaffinity + BVC 2(PC) + NEG R3 // caller expects negative errno + MOVW R3, ret+24(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT|NOFRAME,$0 + // Implemented as brk(NULL). 
+ MOVD $0, R3 + SYSCALL $SYS_brk + MOVD R3, ret+0(FP) + RET + +TEXT runtime·access(SB),$0-20 + MOVD R0, 0(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET + +TEXT runtime·connect(SB),$0-28 + MOVD R0, 0(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+24(FP) // for vet + RET + +TEXT runtime·socket(SB),$0-20 + MOVD R0, 0(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) // for vet + RET diff --git a/src/runtime/sys_linux_riscv64.s b/src/runtime/sys_linux_riscv64.s new file mode 100644 index 0000000..d1558fd --- /dev/null +++ b/src/runtime/sys_linux_riscv64.s @@ -0,0 +1,584 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for riscv64, Linux +// + +#include "textflag.h" +#include "go_asm.h" + +#define AT_FDCWD -100 +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 1 + +#define SYS_brk 214 +#define SYS_clock_gettime 113 +#define SYS_clone 220 +#define SYS_close 57 +#define SYS_connect 203 +#define SYS_exit 93 +#define SYS_exit_group 94 +#define SYS_faccessat 48 +#define SYS_futex 98 +#define SYS_getpid 172 +#define SYS_gettid 178 +#define SYS_gettimeofday 169 +#define SYS_kill 129 +#define SYS_madvise 233 +#define SYS_mincore 232 +#define SYS_mmap 222 +#define SYS_munmap 215 +#define SYS_nanosleep 101 +#define SYS_openat 56 +#define SYS_pipe2 59 +#define SYS_pselect6 72 +#define SYS_read 63 +#define SYS_rt_sigaction 134 +#define SYS_rt_sigprocmask 135 +#define SYS_rt_sigreturn 139 +#define SYS_sched_getaffinity 123 +#define SYS_sched_yield 124 +#define SYS_setitimer 103 +#define SYS_sigaltstack 132 +#define SYS_socket 198 +#define SYS_tgkill 131 +#define SYS_timer_create 107 +#define SYS_timer_delete 111 +#define SYS_timer_settime 110 +#define SYS_tkill 130 +#define 
SYS_write 64 + +// func exit(code int32) +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), A0 + MOV $SYS_exit_group, A7 + ECALL + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOV wait+0(FP), A0 + // We're done using the stack. + FENCE + MOVW ZERO, (A0) + FENCE + MOV $0, A0 // exit code + MOV $SYS_exit, A7 + ECALL + JMP 0(PC) + +// func open(name *byte, mode, perm int32) int32 +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOV $AT_FDCWD, A0 + MOV name+0(FP), A1 + MOVW mode+8(FP), A2 + MOVW perm+12(FP), A3 + MOV $SYS_openat, A7 + ECALL + MOV $-4096, T0 + BGEU T0, A0, 2(PC) + MOV $-1, A0 + MOVW A0, ret+16(FP) + RET + +// func closefd(fd int32) int32 +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), A0 + MOV $SYS_close, A7 + ECALL + MOV $-4096, T0 + BGEU T0, A0, 2(PC) + MOV $-1, A0 + MOVW A0, ret+8(FP) + RET + +// func write1(fd uintptr, p unsafe.Pointer, n int32) int32 +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOV fd+0(FP), A0 + MOV p+8(FP), A1 + MOVW n+16(FP), A2 + MOV $SYS_write, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func read(fd int32, p unsafe.Pointer, n int32) int32 +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), A0 + MOV p+8(FP), A1 + MOVW n+16(FP), A2 + MOV $SYS_read, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOV $r+8(FP), A0 + MOVW flags+0(FP), A1 + MOV $SYS_pipe2, A7 + ECALL + MOVW A0, errno+16(FP) + RET + +// func usleep(usec uint32) +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), A0 + MOV $1000, A1 + MUL A1, A0, A0 + MOV $1000000000, A1 + DIV A1, A0, A2 + MOV A2, 8(X2) + REM A1, A0, A3 + MOV A3, 16(X2) + ADD $8, X2, A0 + MOV ZERO, A1 + MOV $SYS_nanosleep, A7 + ECALL + RET + +// func gettid() uint32 +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOV $SYS_gettid, A7 + ECALL + MOVW A0, ret+0(FP) + RET + +// func raise(sig uint32) +TEXT 
runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_gettid, A7 + ECALL + // arg 1 tid - already in A0 + MOVW sig+0(FP), A1 // arg 2 + MOV $SYS_tkill, A7 + ECALL + RET + +// func raiseproc(sig uint32) +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_getpid, A7 + ECALL + // arg 1 pid - already in A0 + MOVW sig+0(FP), A1 // arg 2 + MOV $SYS_kill, A7 + ECALL + RET + +// func getpid() int +TEXT ·getpid(SB),NOSPLIT|NOFRAME,$0-8 + MOV $SYS_getpid, A7 + ECALL + MOV A0, ret+0(FP) + RET + +// func tgkill(tgid, tid, sig int) +TEXT ·tgkill(SB),NOSPLIT|NOFRAME,$0-24 + MOV tgid+0(FP), A0 + MOV tid+8(FP), A1 + MOV sig+16(FP), A2 + MOV $SYS_tgkill, A7 + ECALL + RET + +// func setitimer(mode int32, new, old *itimerval) +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), A0 + MOV new+8(FP), A1 + MOV old+16(FP), A2 + MOV $SYS_setitimer, A7 + ECALL + RET + +// func timer_create(clockid int32, sevp *sigevent, timerid *int32) int32 +TEXT runtime·timer_create(SB),NOSPLIT,$0-28 + MOVW clockid+0(FP), A0 + MOV sevp+8(FP), A1 + MOV timerid+16(FP), A2 + MOV $SYS_timer_create, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func timer_settime(timerid int32, flags int32, new, old *itimerspec) int32 +TEXT runtime·timer_settime(SB),NOSPLIT,$0-28 + MOVW timerid+0(FP), A0 + MOVW flags+4(FP), A1 + MOV new+8(FP), A2 + MOV old+16(FP), A3 + MOV $SYS_timer_settime, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func timer_delete(timerid int32) int32 +TEXT runtime·timer_delete(SB),NOSPLIT,$0-12 + MOVW timerid+0(FP), A0 + MOV $SYS_timer_delete, A7 + ECALL + MOVW A0, ret+8(FP) + RET + +// func mincore(addr unsafe.Pointer, n uintptr, dst *byte) int32 +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOV dst+16(FP), A2 + MOV $SYS_mincore, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$40-12 + MOV $CLOCK_REALTIME, A0 + + MOV runtime·vdsoClockgettimeSym(SB), A7 + BEQZ A7, fallback + 
MOV X2, S2 // S2,S3,S4 is unchanged by C code + MOV g_m(g), S3 // S3 = m + + // Save the old values on stack for reentrant + MOV m_vdsoPC(S3), T0 + MOV T0, 24(X2) + MOV m_vdsoSP(S3), T0 + MOV T0, 32(X2) + + MOV RA, m_vdsoPC(S3) + MOV $ret-8(FP), T1 // caller's SP + MOV T1, m_vdsoSP(S3) + + MOV m_curg(S3), T1 + BNE g, T1, noswitch + + MOV m_g0(S3), T1 + MOV (g_sched+gobuf_sp)(T1), X2 + +noswitch: + ADDI $-24, X2 // Space for result + ANDI $~7, X2 // Align for C code + MOV $8(X2), A1 + + // Store g on gsignal's stack, see sys_linux_arm64.s for detail + MOVBU runtime·iscgo(SB), S4 + BNEZ S4, nosaveg + MOV m_gsignal(S3), S4 // g.m.gsignal + BEQZ S4, nosaveg + BEQ g, S4, nosaveg + MOV (g_stack+stack_lo)(S4), S4 // g.m.gsignal.stack.lo + MOV g, (S4) + + JALR RA, A7 + + MOV ZERO, (S4) + JMP finish + +nosaveg: + JALR RA, A7 + +finish: + MOV 8(X2), T0 // sec + MOV 16(X2), T1 // nsec + + MOV S2, X2 // restore stack + MOV 24(X2), A2 + MOV A2, m_vdsoPC(S3) + + MOV 32(X2), A3 + MOV A3, m_vdsoSP(S3) + + MOV T0, sec+0(FP) + MOVW T1, nsec+8(FP) + RET + +fallback: + MOV $8(X2), A1 + MOV $SYS_clock_gettime, A7 + ECALL + MOV 8(X2), T0 // sec + MOV 16(X2), T1 // nsec + MOV T0, sec+0(FP) + MOVW T1, nsec+8(FP) + RET + +// func nanotime1() int64 +TEXT runtime·nanotime1(SB),NOSPLIT,$40-8 + MOV $CLOCK_MONOTONIC, A0 + + MOV runtime·vdsoClockgettimeSym(SB), A7 + BEQZ A7, fallback + + MOV X2, S2 // S2 = RSP, S2 is unchanged by C code + MOV g_m(g), S3 // S3 = m + // Save the old values on stack for reentrant + MOV m_vdsoPC(S3), T0 + MOV T0, 24(X2) + MOV m_vdsoSP(S3), T0 + MOV T0, 32(X2) + + MOV RA, m_vdsoPC(S3) + MOV $ret-8(FP), T0 // caller's SP + MOV T0, m_vdsoSP(S3) + + MOV m_curg(S3), T1 + BNE g, T1, noswitch + + MOV m_g0(S3), T1 + MOV (g_sched+gobuf_sp)(T1), X2 + +noswitch: + ADDI $-24, X2 // Space for result + ANDI $~7, X2 // Align for C code + MOV $8(X2), A1 + + // Store g on gsignal's stack, see sys_linux_arm64.s for detail + MOVBU runtime·iscgo(SB), S4 + BNEZ S4, nosaveg + MOV 
m_gsignal(S3), S4 // g.m.gsignal + BEQZ S4, nosaveg + BEQ g, S4, nosaveg + MOV (g_stack+stack_lo)(S4), S4 // g.m.gsignal.stack.lo + MOV g, (S4) + + JALR RA, A7 + + MOV ZERO, (S4) + JMP finish + +nosaveg: + JALR RA, A7 + +finish: + MOV 8(X2), T0 // sec + MOV 16(X2), T1 // nsec + // restore stack + MOV S2, X2 + MOV 24(X2), T2 + MOV T2, m_vdsoPC(S3) + + MOV 32(X2), T2 + MOV T2, m_vdsoSP(S3) + // sec is in T0, nsec in T1 + // return nsec in T0 + MOV $1000000000, T2 + MUL T2, T0 + ADD T1, T0 + MOV T0, ret+0(FP) + RET + +fallback: + MOV $8(X2), A1 + MOV $SYS_clock_gettime, A7 + ECALL + MOV 8(X2), T0 // sec + MOV 16(X2), T1 // nsec + MOV $1000000000, T2 + MUL T2, T0 + ADD T1, T0 + MOV T0, ret+0(FP) + RET + +// func rtsigprocmask(how int32, new, old *sigset, size int32) +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), A0 + MOV new+8(FP), A1 + MOV old+16(FP), A2 + MOVW size+24(FP), A3 + MOV $SYS_rt_sigprocmask, A7 + ECALL + MOV $-4096, T0 + BLTU A0, T0, 2(PC) + WORD $0 // crash + RET + +// func rt_sigaction(sig uintptr, new, old *sigactiont, size uintptr) int32 +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOV sig+0(FP), A0 + MOV new+8(FP), A1 + MOV old+16(FP), A2 + MOV size+24(FP), A3 + MOV $SYS_rt_sigaction, A7 + ECALL + MOVW A0, ret+32(FP) + RET + +// func sigfwd(fn uintptr, sig uint32, info *siginfo, ctx unsafe.Pointer) +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), A0 + MOV info+16(FP), A1 + MOV ctx+24(FP), A2 + MOV fn+0(FP), T1 + JALR RA, T1 + RET + +// func sigtramp(signo, ureg, ctxt unsafe.Pointer) +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$64 + MOVW A0, 8(X2) + MOV A1, 16(X2) + MOV A2, 24(X2) + + // this might be called in external code context, + // where g is not set. 
+ MOVBU runtime·iscgo(SB), A0 + BEQ A0, ZERO, 2(PC) + CALL runtime·load_g(SB) + + MOV $runtime·sigtrampgo(SB), A0 + JALR RA, A0 + RET + +// func cgoSigtramp() +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + MOV $runtime·sigtramp(SB), T1 + JALR ZERO, T1 + +// func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (p unsafe.Pointer, err int) +TEXT runtime·mmap(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOVW prot+16(FP), A2 + MOVW flags+20(FP), A3 + MOVW fd+24(FP), A4 + MOVW off+28(FP), A5 + MOV $SYS_mmap, A7 + ECALL + MOV $-4096, T0 + BGEU T0, A0, 5(PC) + SUB A0, ZERO, A0 + MOV ZERO, p+32(FP) + MOV A0, err+40(FP) + RET +ok: + MOV A0, p+32(FP) + MOV ZERO, err+40(FP) + RET + +// func munmap(addr unsafe.Pointer, n uintptr) +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOV $SYS_munmap, A7 + ECALL + MOV $-4096, T0 + BLTU A0, T0, 2(PC) + WORD $0 // crash + RET + +// func madvise(addr unsafe.Pointer, n uintptr, flags int32) +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOV n+8(FP), A1 + MOVW flags+16(FP), A2 + MOV $SYS_madvise, A7 + ECALL + MOVW A0, ret+24(FP) + RET + +// func futex(addr unsafe.Pointer, op int32, val uint32, ts, addr2 unsafe.Pointer, val3 uint32) int32 +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOV addr+0(FP), A0 + MOVW op+8(FP), A1 + MOVW val+12(FP), A2 + MOV ts+16(FP), A3 + MOV addr2+24(FP), A4 + MOVW val3+32(FP), A5 + MOV $SYS_futex, A7 + ECALL + MOVW A0, ret+40(FP) + RET + +// func clone(flags int32, stk, mp, gp, fn unsafe.Pointer) int32 +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), A0 + MOV stk+8(FP), A1 + + // Copy mp, gp, fn off parent stack for use by child. + MOV mp+16(FP), T0 + MOV gp+24(FP), T1 + MOV fn+32(FP), T2 + + MOV T0, -8(A1) + MOV T1, -16(A1) + MOV T2, -24(A1) + MOV $1234, T0 + MOV T0, -32(A1) + + MOV $SYS_clone, A7 + ECALL + + // In parent, return. 
+ BEQ ZERO, A0, child + MOVW ZERO, ret+40(FP) + RET + +child: + // In child, on new stack. + MOV -32(X2), T0 + MOV $1234, A0 + BEQ A0, T0, good + WORD $0 // crash + +good: + // Initialize m->procid to Linux tid + MOV $SYS_gettid, A7 + ECALL + + MOV -24(X2), T2 // fn + MOV -16(X2), T1 // g + MOV -8(X2), T0 // m + + BEQ ZERO, T0, nog + BEQ ZERO, T1, nog + + MOV A0, m_procid(T0) + + // In child, set up new stack + MOV T0, g_m(T1) + MOV T1, g + +nog: + // Call fn + JALR RA, T2 + + // It shouldn't return. If it does, exit this thread. + MOV $111, A0 + MOV $SYS_exit, A7 + ECALL + JMP -3(PC) // keep exiting + +// func sigaltstack(new, old *stackt) +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOV new+0(FP), A0 + MOV old+8(FP), A1 + MOV $SYS_sigaltstack, A7 + ECALL + MOV $-4096, T0 + BLTU A0, T0, 2(PC) + WORD $0 // crash + RET + +// func osyield() +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOV $SYS_sched_yield, A7 + ECALL + RET + +// func sched_getaffinity(pid, len uintptr, buf *uintptr) int32 +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOV pid+0(FP), A0 + MOV len+8(FP), A1 + MOV buf+16(FP), A2 + MOV $SYS_sched_getaffinity, A7 + ECALL + MOV A0, ret+24(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT,$0-8 + // Implemented as brk(NULL). + MOV $0, A0 + MOV $SYS_brk, A7 + ECALL + MOVW A0, ret+0(FP) + RET diff --git a/src/runtime/sys_linux_s390x.s b/src/runtime/sys_linux_s390x.s new file mode 100644 index 0000000..1448670 --- /dev/null +++ b/src/runtime/sys_linux_s390x.s @@ -0,0 +1,609 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// System calls and other system stuff for Linux s390x; see +// /usr/include/asm/unistd.h for the syscall number definitions. 
+ +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_brk 45 +#define SYS_mmap 90 +#define SYS_munmap 91 +#define SYS_setitimer 104 +#define SYS_clone 120 +#define SYS_sched_yield 158 +#define SYS_nanosleep 162 +#define SYS_rt_sigreturn 173 +#define SYS_rt_sigaction 174 +#define SYS_rt_sigprocmask 175 +#define SYS_sigaltstack 186 +#define SYS_madvise 219 +#define SYS_mincore 218 +#define SYS_gettid 236 +#define SYS_futex 238 +#define SYS_sched_getaffinity 240 +#define SYS_tgkill 241 +#define SYS_exit_group 248 +#define SYS_timer_create 254 +#define SYS_timer_settime 255 +#define SYS_timer_delete 258 +#define SYS_clock_gettime 260 +#define SYS_pipe2 325 + +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0-4 + MOVW code+0(FP), R2 + MOVW $SYS_exit_group, R1 + SYSCALL + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT|NOFRAME,$0-8 + MOVD wait+0(FP), R1 + // We're done using the stack. 
+ MOVW $0, R2 + MOVW R2, (R1) + MOVW $0, R2 // exit code + MOVW $SYS_exit, R1 + SYSCALL + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0-20 + MOVD name+0(FP), R2 + MOVW mode+8(FP), R3 + MOVW perm+12(FP), R4 + MOVW $SYS_open, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0-12 + MOVW fd+0(FP), R2 + MOVW $SYS_close, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+8(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0-28 + MOVD fd+0(FP), R2 + MOVD p+8(FP), R3 + MOVW n+16(FP), R4 + MOVW $SYS_write, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0-28 + MOVW fd+0(FP), R2 + MOVD p+8(FP), R3 + MOVW n+16(FP), R4 + MOVW $SYS_read, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVD $r+8(FP), R2 + MOVW flags+0(FP), R3 + MOVW $SYS_pipe2, R1 + SYSCALL + MOVW R2, errno+16(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16-4 + MOVW usec+0(FP), R2 + MOVD R2, R4 + MOVW $1000000, R3 + DIVD R3, R2 + MOVD R2, 8(R15) + MOVW $1000, R3 + MULLD R2, R3 + SUB R3, R4 + MOVD R4, 16(R15) + + // nanosleep(&ts, 0) + ADD $8, R15, R2 + MOVW $0, R3 + MOVW $SYS_nanosleep, R1 + SYSCALL + RET + +TEXT runtime·gettid(SB),NOSPLIT,$0-4 + MOVW $SYS_gettid, R1 + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT runtime·raise(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_getpid, R1 + SYSCALL + MOVW R2, R10 + MOVW $SYS_gettid, R1 + SYSCALL + MOVW R2, R3 // arg 2 tid + MOVW R10, R2 // arg 1 pid + MOVW sig+0(FP), R4 // arg 3 + MOVW $SYS_tgkill, R1 + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_getpid, R1 + SYSCALL + MOVW R2, R2 // arg 1 pid + MOVW sig+0(FP), R3 // arg 2 + MOVW $SYS_kill, R1 + SYSCALL + RET + +TEXT ·getpid(SB),NOSPLIT|NOFRAME,$0-8 + MOVW $SYS_getpid, R1 + SYSCALL + MOVD R2, ret+0(FP) + RET + +TEXT
·tgkill(SB),NOSPLIT|NOFRAME,$0-24 + MOVD tgid+0(FP), R2 + MOVD tid+8(FP), R3 + MOVD sig+16(FP), R4 + MOVW $SYS_tgkill, R1 + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0-24 + MOVW mode+0(FP), R2 + MOVD new+8(FP), R3 + MOVD old+16(FP), R4 + MOVW $SYS_setitimer, R1 + SYSCALL + RET + +TEXT runtime·timer_create(SB),NOSPLIT|NOFRAME,$0-28 + MOVW clockid+0(FP), R2 + MOVD sevp+8(FP), R3 + MOVD timerid+16(FP), R4 + MOVW $SYS_timer_create, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +TEXT runtime·timer_settime(SB),NOSPLIT|NOFRAME,$0-28 + MOVW timerid+0(FP), R2 + MOVW flags+4(FP), R3 + MOVD new+8(FP), R4 + MOVD old+16(FP), R5 + MOVW $SYS_timer_settime, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +TEXT runtime·timer_delete(SB),NOSPLIT|NOFRAME,$0-12 + MOVW timerid+0(FP), R2 + MOVW $SYS_timer_delete, R1 + SYSCALL + MOVW R2, ret+8(FP) + RET + +TEXT runtime·mincore(SB),NOSPLIT|NOFRAME,$0-28 + MOVD addr+0(FP), R2 + MOVD n+8(FP), R3 + MOVD dst+16(FP), R4 + MOVW $SYS_mincore, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$32-12 + MOVW $0, R2 // CLOCK_REALTIME + MOVD R15, R7 // Backup stack pointer + + MOVD g_m(g), R6 //m + + MOVD runtime·vdsoClockgettimeSym(SB), R9 // Check for VDSO availability + CMPBEQ R9, $0, fallback + + MOVD m_vdsoPC(R6), R4 + MOVD R4, 16(R15) + MOVD m_vdsoSP(R6), R4 + MOVD R4, 24(R15) + + MOVD R14, R8 // Backup return address + MOVD $sec+0(FP), R4 // return parameter caller + + MOVD R8, m_vdsoPC(R6) + MOVD R4, m_vdsoSP(R6) + + MOVD m_curg(R6), R5 + CMP g, R5 + BNE noswitch + + MOVD m_g0(R6), R4 + MOVD (g_sched+gobuf_sp)(R4), R15 // Set SP to g0 stack + +noswitch: + SUB $16, R15 // reserve 2x 8 bytes for parameters + MOVD $~7, R4 // align to 8 bytes because of gcc ABI + AND R4, R15 + MOVD R15, R3 // R15 needs to be in R3 as expected by kernel_clock_gettime + + MOVB runtime·iscgo(SB),R12 + CMPBNE R12, $0, nosaveg + + MOVD m_gsignal(R6), R12 // g.m.gsignal + CMPBEQ R12, $0, 
nosaveg + + CMPBEQ g, R12, nosaveg + MOVD (g_stack+stack_lo)(R12), R12 // g.m.gsignal.stack.lo + MOVD g, (R12) + + BL R9 // to vdso lookup + + MOVD $0, (R12) + + JMP finish + +nosaveg: + BL R9 // to vdso lookup + +finish: + MOVD 0(R15), R3 // sec + MOVD 8(R15), R5 // nsec + MOVD R7, R15 // Restore SP + + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVD 24(R15), R12 + MOVD R12, m_vdsoSP(R6) + MOVD 16(R15), R12 + MOVD R12, m_vdsoPC(R6) + +return: + // sec is in R3, nsec in R5 + // return nsec in R3 + MOVD R3, sec+0(FP) + MOVW R5, nsec+8(FP) + RET + + // Syscall fallback +fallback: + MOVD $tp-16(SP), R3 + MOVW $SYS_clock_gettime, R1 + SYSCALL + LMG tp-16(SP), R2, R3 + // sec is in R2, nsec in R3 + MOVD R2, sec+0(FP) + MOVW R3, nsec+8(FP) + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$32-8 + MOVW $1, R2 // CLOCK_MONOTONIC + + MOVD R15, R7 // Backup stack pointer + + MOVD g_m(g), R6 //m + + MOVD runtime·vdsoClockgettimeSym(SB), R9 // Check for VDSO availability + CMPBEQ R9, $0, fallback + + MOVD m_vdsoPC(R6), R4 + MOVD R4, 16(R15) + MOVD m_vdsoSP(R6), R4 + MOVD R4, 24(R15) + + MOVD R14, R8 // Backup return address + MOVD $ret+0(FP), R4 // caller's SP + + MOVD R8, m_vdsoPC(R6) + MOVD R4, m_vdsoSP(R6) + + MOVD m_curg(R6), R5 + CMP g, R5 + BNE noswitch + + MOVD m_g0(R6), R4 + MOVD (g_sched+gobuf_sp)(R4), R15 // Set SP to g0 stack + +noswitch: + SUB $16, R15 // reserve 2x 8 bytes for parameters + MOVD $~7, R4 // align to 8 bytes because of gcc ABI + AND R4, R15 + MOVD R15, R3 // R15 needs to be in R3 as expected by kernel_clock_gettime + + MOVB runtime·iscgo(SB),R12 + CMPBNE R12, $0, nosaveg + + MOVD m_gsignal(R6), R12 // g.m.gsignal + CMPBEQ R12, $0, nosaveg + + CMPBEQ g, R12, nosaveg + MOVD (g_stack+stack_lo)(R12), R12 // g.m.gsignal.stack.lo + MOVD g, 
(R12) + + BL R9 // to vdso lookup + + MOVD $0, (R12) + + JMP finish + +nosaveg: + BL R9 // to vdso lookup + +finish: + MOVD 0(R15), R3 // sec + MOVD 8(R15), R5 // nsec + MOVD R7, R15 // Restore SP + + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + + MOVD 24(R15), R12 + MOVD R12, m_vdsoSP(R6) + MOVD 16(R15), R12 + MOVD R12, m_vdsoPC(R6) + +return: + // sec is in R3, nsec in R5 + // return nsec in R3 + MULLD $1000000000, R3 + ADD R5, R3 + MOVD R3, ret+0(FP) + RET + + // Syscall fallback +fallback: + MOVD $tp-16(SP), R3 + MOVD $SYS_clock_gettime, R1 + SYSCALL + LMG tp-16(SP), R2, R3 + MOVD R3, R5 + MOVD R2, R3 + JMP return + +TEXT runtime·rtsigprocmask(SB),NOSPLIT|NOFRAME,$0-28 + MOVW how+0(FP), R2 + MOVD new+8(FP), R3 + MOVD old+16(FP), R4 + MOVW size+24(FP), R5 + MOVW $SYS_rt_sigprocmask, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, 2(PC) + MOVD R0, 0(R0) // crash + RET + +TEXT runtime·rt_sigaction(SB),NOSPLIT|NOFRAME,$0-36 + MOVD sig+0(FP), R2 + MOVD new+8(FP), R3 + MOVD old+16(FP), R4 + MOVD size+24(FP), R5 + MOVW $SYS_rt_sigaction, R1 + SYSCALL + MOVW R2, ret+32(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R2 + MOVD info+16(FP), R3 + MOVD ctx+24(FP), R4 + MOVD fn+0(FP), R5 + BL R5 + RET + +TEXT runtime·sigreturn(SB),NOSPLIT,$0-0 + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$64 + // initialize essential registers (just in case) + XOR R0, R0 + + // this might be called in external code context, + // where g is not set. 
+ MOVB runtime·iscgo(SB), R6 + CMPBEQ R6, $0, 2(PC) + BL runtime·load_g(SB) + + MOVW R2, 8(R15) + MOVD R3, 16(R15) + MOVD R4, 24(R15) + MOVD $runtime·sigtrampgo(SB), R5 + BL R5 + RET + +TEXT runtime·cgoSigtramp(SB),NOSPLIT,$0 + BR runtime·sigtramp(SB) + +// func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) unsafe.Pointer +TEXT runtime·mmap(SB),NOSPLIT,$48-48 + MOVD addr+0(FP), R2 + MOVD n+8(FP), R3 + MOVW prot+16(FP), R4 + MOVW flags+20(FP), R5 + MOVW fd+24(FP), R6 + MOVWZ off+28(FP), R7 + + // s390x uses old_mmap, so the arguments need to be placed into + // a struct and a pointer to the struct passed to mmap. + MOVD R2, addr-48(SP) + MOVD R3, n-40(SP) + MOVD R4, prot-32(SP) + MOVD R5, flags-24(SP) + MOVD R6, fd-16(SP) + MOVD R7, off-8(SP) + + MOVD $addr-48(SP), R2 + MOVW $SYS_mmap, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, ok + NEG R2 + MOVD $0, p+32(FP) + MOVD R2, err+40(FP) + RET +ok: + MOVD R2, p+32(FP) + MOVD $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R2 + MOVD n+8(FP), R3 + MOVW $SYS_munmap, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, 2(PC) + MOVD R0, 0(R0) // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R2 + MOVD n+8(FP), R3 + MOVW flags+16(FP), R4 + MOVW $SYS_madvise, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +// int64 futex(int32 *uaddr, int32 op, int32 val, +// struct timespec *timeout, int32 *uaddr2, int32 val2); +TEXT runtime·futex(SB),NOSPLIT|NOFRAME,$0 + MOVD addr+0(FP), R2 + MOVW op+8(FP), R3 + MOVW val+12(FP), R4 + MOVD ts+16(FP), R5 + MOVD addr2+24(FP), R6 + MOVW val3+32(FP), R7 + MOVW $SYS_futex, R1 + SYSCALL + MOVW R2, ret+40(FP) + RET + +// int32 clone(int32 flags, void *stk, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·clone(SB),NOSPLIT|NOFRAME,$0 + MOVW flags+0(FP), R3 + MOVD stk+8(FP), R2 + + // Copy mp, gp, fn off parent stack for use by child. + // Careful: Linux system call clobbers ???. 
+ MOVD mp+16(FP), R7 + MOVD gp+24(FP), R8 + MOVD fn+32(FP), R9 + + MOVD R7, -8(R2) + MOVD R8, -16(R2) + MOVD R9, -24(R2) + MOVD $1234, R7 + MOVD R7, -32(R2) + + SYSCALL $SYS_clone + + // In parent, return. + CMPBEQ R2, $0, 3(PC) + MOVW R2, ret+40(FP) + RET + + // In child, on new stack. + // initialize essential registers + XOR R0, R0 + MOVD -32(R15), R7 + CMP R7, $1234 + BEQ 2(PC) + MOVD R0, 0(R0) + + // Initialize m->procid to Linux tid + SYSCALL $SYS_gettid + + MOVD -24(R15), R9 // fn + MOVD -16(R15), R8 // g + MOVD -8(R15), R7 // m + + CMPBEQ R7, $0, nog + CMP R8, $0 + BEQ nog + + MOVD R2, m_procid(R7) + + // In child, set up new stack + MOVD R7, g_m(R8) + MOVD R8, g + //CALL runtime·stackcheck(SB) + +nog: + // Call fn + BL R9 + + // It shouldn't return. If it does, exit that thread. + MOVW $111, R2 + MOVW $SYS_exit, R1 + SYSCALL + BR -2(PC) // keep exiting + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVD new+0(FP), R2 + MOVD old+8(FP), R3 + MOVW $SYS_sigaltstack, R1 + SYSCALL + MOVD $-4095, R3 + CMPUBLT R2, R3, 2(PC) + MOVD R0, 0(R0) // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT|NOFRAME,$0 + MOVW $SYS_sched_yield, R1 + SYSCALL + RET + +TEXT runtime·sched_getaffinity(SB),NOSPLIT|NOFRAME,$0 + MOVD pid+0(FP), R2 + MOVD len+8(FP), R3 + MOVD buf+16(FP), R4 + MOVW $SYS_sched_getaffinity, R1 + SYSCALL + MOVW R2, ret+24(FP) + RET + +// func sbrk0() uintptr +TEXT runtime·sbrk0(SB),NOSPLIT|NOFRAME,$0-8 + // Implemented as brk(NULL). 
+ MOVD $0, R2 + MOVW $SYS_brk, R1 + SYSCALL + MOVD R2, ret+0(FP) + RET + +TEXT runtime·access(SB),$0-20 + MOVD $0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) + RET + +TEXT runtime·connect(SB),$0-28 + MOVD $0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+24(FP) + RET + +TEXT runtime·socket(SB),$0-20 + MOVD $0, 2(R0) // unimplemented, only needed for android; declared in stubs_linux.go + MOVW R0, ret+16(FP) + RET diff --git a/src/runtime/sys_loong64.go b/src/runtime/sys_loong64.go new file mode 100644 index 0000000..812db5c --- /dev/null +++ b/src/runtime/sys_loong64.go @@ -0,0 +1,20 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build loong64 + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_mips64x.go b/src/runtime/sys_mips64x.go new file mode 100644 index 0000000..b715384 --- /dev/null +++ b/src/runtime/sys_mips64x.go @@ -0,0 +1,20 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. 
+func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_mipsx.go b/src/runtime/sys_mipsx.go new file mode 100644 index 0000000..b60135f --- /dev/null +++ b/src/runtime/sys_mipsx.go @@ -0,0 +1,20 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_netbsd_386.s b/src/runtime/sys_netbsd_386.s new file mode 100644 index 0000000..67a04d7 --- /dev/null +++ b/src/runtime/sys_netbsd_386.s @@ -0,0 +1,492 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for 386, NetBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 3 +#define FD_CLOEXEC 1 +#define F_SETFD 2 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_fcntl 92 +#define SYS_mmap 197 +#define SYS___sysctl 202 +#define SYS___sigaltstack14 281 +#define SYS___sigprocmask14 293 +#define SYS_issetugid 305 +#define SYS_getcontext 307 +#define SYS_setcontext 308 +#define SYS__lwp_create 309 +#define SYS__lwp_exit 310 +#define SYS__lwp_self 311 +#define SYS__lwp_setprivate 317 +#define SYS__lwp_kill 318 +#define SYS__lwp_unpark 321 +#define SYS___sigaction_sigtramp 340 +#define SYS_kqueue 344 +#define SYS_sched_yield 350 +#define SYS___setitimer50 425 +#define SYS___clock_gettime50 427 +#define SYS___nanosleep50 430 +#define SYS___kevent50 435 +#define SYS____lwp_park60 478 + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-4 + MOVL $SYS_exit, AX + INT $0x80 + MOVL $0xf1, 0xf1 // crash + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVL wait+0(FP), AX + // We're done using the stack. 
+ MOVL $0, (AX) + MOVL $SYS__lwp_exit, AX + INT $0x80 + MOVL $0xf1, 0xf1 // crash + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$-4 + MOVL $SYS_open, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-4 + MOVL $SYS_close, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+4(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$-4 + MOVL $SYS_read, AX + INT $0x80 + JAE 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$12-16 + MOVL $453, AX + LEAL r+4(FP), BX + MOVL BX, 4(SP) + MOVL flags+0(FP), BX + MOVL BX, 8(SP) + INT $0x80 + MOVL AX, errno+12(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$-4 + MOVL $SYS_write, AX + INT $0x80 + JAE 2(PC) + NEGL AX // caller expects negative errno + MOVL AX, ret+12(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$24 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVL AX, 12(SP) // tv_sec - l32 + MOVL $0, 16(SP) // tv_sec - h32 + MOVL $1000, AX + MULL DX + MOVL AX, 20(SP) // tv_nsec + + MOVL $0, 0(SP) + LEAL 12(SP), AX + MOVL AX, 4(SP) // arg 1 - rqtp + MOVL $0, 8(SP) // arg 2 - rmtp + MOVL $SYS___nanosleep50, AX + INT $0x80 + RET + +TEXT runtime·lwp_kill(SB),NOSPLIT,$12-8 + MOVL $0, 0(SP) + MOVL tid+0(FP), AX + MOVL AX, 4(SP) // arg 1 - target + MOVL sig+4(FP), AX + MOVL AX, 8(SP) // arg 2 - signo + MOVL $SYS__lwp_kill, AX + INT $0x80 + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$12 + MOVL $SYS_getpid, AX + INT $0x80 + MOVL $0, 0(SP) + MOVL AX, 4(SP) // arg 1 - pid + MOVL sig+0(FP), AX + MOVL AX, 8(SP) // arg 2 - signo + MOVL $SYS_kill, AX + INT $0x80 + RET + +TEXT runtime·mmap(SB),NOSPLIT,$36 + LEAL addr+0(FP), SI + LEAL 4(SP), DI + CLD + MOVSL // arg 1 - addr + MOVSL // arg 2 - len + MOVSL // arg 3 - prot + MOVSL // arg 4 - flags + MOVSL // arg 5 - fd + MOVL $0, AX + STOSL // arg 6 - pad + MOVSL // arg 7 - offset + MOVL $0, AX // top 32 bits of file 
offset + STOSL + MOVL $SYS_mmap, AX + INT $0x80 + JAE ok + MOVL $0, p+24(FP) + MOVL AX, err+28(FP) + RET +ok: + MOVL AX, p+24(FP) + MOVL $0, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$-4 + MOVL $SYS_munmap, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·madvise(SB),NOSPLIT,$-4 + MOVL $SYS_madvise, AX + INT $0x80 + JAE 2(PC) + MOVL $-1, AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$-4 + MOVL $SYS___setitimer50, AX + INT $0x80 + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + LEAL 12(SP), BX + MOVL $CLOCK_REALTIME, 4(SP) // arg 1 - clock_id + MOVL BX, 8(SP) // arg 2 - tp + MOVL $SYS___clock_gettime50, AX + INT $0x80 + + MOVL 12(SP), AX // sec - l32 + MOVL AX, sec_lo+0(FP) + MOVL 16(SP), AX // sec - h32 + MOVL AX, sec_hi+4(FP) + + MOVL 20(SP), BX // nsec + MOVL BX, nsec+8(FP) + RET + +// int64 nanotime1(void) so really +// void nanotime1(int64 *nsec) +TEXT runtime·nanotime1(SB),NOSPLIT,$32 + LEAL 12(SP), BX + MOVL $CLOCK_MONOTONIC, 4(SP) // arg 1 - clock_id + MOVL BX, 8(SP) // arg 2 - tp + MOVL $SYS___clock_gettime50, AX + INT $0x80 + + MOVL 16(SP), CX // sec - h32 + IMULL $1000000000, CX + + MOVL 12(SP), AX // sec - l32 + MOVL $1000000000, BX + MULL BX // result in dx:ax + + MOVL 20(SP), BX // nsec + ADDL BX, AX + ADCL CX, DX // add high bits with carry + + MOVL AX, ret_lo+0(FP) + MOVL DX, ret_hi+4(FP) + RET + +TEXT runtime·getcontext(SB),NOSPLIT,$-4 + MOVL $SYS_getcontext, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$-4 + MOVL $SYS___sigprocmask14, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT sigreturn_tramp<>(SB),NOSPLIT,$0 + LEAL 140(SP), AX // Load address of ucontext + MOVL AX, 4(SP) + MOVL $SYS_setcontext, AX + INT $0x80 + MOVL $-1, 4(SP) // Something failed... 
+ MOVL $SYS_exit, AX + INT $0x80 + +TEXT runtime·sigaction(SB),NOSPLIT,$24 + LEAL sig+0(FP), SI + LEAL 4(SP), DI + CLD + MOVSL // arg 1 - sig + MOVSL // arg 2 - act + MOVSL // arg 3 - oact + LEAL sigreturn_tramp<>(SB), AX + STOSL // arg 4 - tramp + MOVL $2, AX + STOSL // arg 5 - vers + MOVL $SYS___sigaction_sigtramp, AX + INT $0x80 + JAE 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$12-16 + MOVL fn+0(FP), AX + MOVL sig+4(FP), BX + MOVL info+8(FP), CX + MOVL ctx+12(FP), DX + MOVL SP, SI + SUBL $32, SP + ANDL $-15, SP // align stack: handler might be a C function + MOVL BX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + MOVL SI, 12(SP) // save SI: handler might be a Go function + CALL AX + MOVL 12(SP), AX + MOVL AX, SP + RET + +// Called by OS using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$28 + NOP SP // tell vet SP changed - stop checking offsets + // Save callee-saved C registers, since the caller may be a C signal handler. + MOVL BX, bx-4(SP) + MOVL BP, bp-8(SP) + MOVL SI, si-12(SP) + MOVL DI, di-16(SP) + // We don't save mxcsr or the x87 control word because sigtrampgo doesn't + // modify them. 
+ + MOVL 32(SP), BX // signo + MOVL BX, 0(SP) + MOVL 36(SP), BX // info + MOVL BX, 4(SP) + MOVL 40(SP), BX // context + MOVL BX, 8(SP) + CALL runtime·sigtrampgo(SB) + + MOVL di-16(SP), DI + MOVL si-12(SP), SI + MOVL bp-8(SP), BP + MOVL bx-4(SP), BX + RET + +// int32 lwp_create(void *context, uintptr flags, void *lwpid); +TEXT runtime·lwp_create(SB),NOSPLIT,$16 + MOVL $0, 0(SP) + MOVL ctxt+0(FP), AX + MOVL AX, 4(SP) // arg 1 - context + MOVL flags+4(FP), AX + MOVL AX, 8(SP) // arg 2 - flags + MOVL lwpid+8(FP), AX + MOVL AX, 12(SP) // arg 3 - lwpid + MOVL $SYS__lwp_create, AX + INT $0x80 + JCC 2(PC) + NEGL AX + MOVL AX, ret+12(FP) + RET + +TEXT runtime·lwp_tramp(SB),NOSPLIT,$0 + + // Set FS to point at m->tls + LEAL m_tls(BX), BP + PUSHAL // save registers + PUSHL BP + CALL lwp_setprivate<>(SB) + POPL AX + POPAL + + // Now segment is established. Initialize m, g. + get_tls(AX) + MOVL DX, g(AX) + MOVL BX, g_m(DX) + + CALL runtime·stackcheck(SB) // smashes AX, CX + MOVL 0(DX), DX // paranoia; check they are not nil + MOVL 0(BX), BX + + // more paranoia; check that stack splitting code works + PUSHAL + CALL runtime·emptyfunc(SB) + POPAL + + // Call fn + CALL SI + + // fn should never return + MOVL $0x1234, 0x1005 + RET + +TEXT ·netbsdMstart(SB),NOSPLIT|TOPFRAME,$0 + CALL ·netbsdMstart0(SB) + RET // not reached + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVL $SYS___sigaltstack14, AX + MOVL new+0(FP), BX + MOVL old+4(FP), CX + INT $0x80 + CMPL AX, $0xfffff001 + JLS 2(PC) + INT $3 + RET + +TEXT runtime·setldt(SB),NOSPLIT,$8 + // Under NetBSD we set the GS base instead of messing with the LDT. 
+ MOVL base+4(FP), AX + MOVL AX, 0(SP) + CALL lwp_setprivate<>(SB) + RET + +TEXT lwp_setprivate<>(SB),NOSPLIT,$16 + // adjust for ELF: wants to use -4(GS) for g + MOVL base+0(FP), CX + ADDL $4, CX + MOVL $0, 0(SP) // syscall gap + MOVL CX, 4(SP) // arg 1 - ptr + MOVL $SYS__lwp_setprivate, AX + INT $0x80 + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·osyield(SB),NOSPLIT,$-4 + MOVL $SYS_sched_yield, AX + INT $0x80 + RET + +TEXT runtime·lwp_park(SB),NOSPLIT,$-4 + MOVL $SYS____lwp_park60, AX + INT $0x80 + MOVL AX, ret+24(FP) + RET + +TEXT runtime·lwp_unpark(SB),NOSPLIT,$-4 + MOVL $SYS__lwp_unpark, AX + INT $0x80 + MOVL AX, ret+8(FP) + RET + +TEXT runtime·lwp_self(SB),NOSPLIT,$-4 + MOVL $SYS__lwp_self, AX + INT $0x80 + MOVL AX, ret+0(FP) + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$28 + LEAL mib+0(FP), SI + LEAL 4(SP), DI + CLD + MOVSL // arg 1 - name + MOVSL // arg 2 - namelen + MOVSL // arg 3 - oldp + MOVSL // arg 4 - oldlenp + MOVSL // arg 5 - newp + MOVSL // arg 6 - newlen + MOVL $SYS___sysctl, AX + INT $0x80 + JAE 4(PC) + NEGL AX + MOVL AX, ret+24(FP) + RET + MOVL $0, AX + MOVL AX, ret+24(FP) + RET + +GLOBL runtime·tlsoffset(SB),NOPTR,$4 + +// int32 runtime·kqueue(void) +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVL $SYS_kqueue, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVL $SYS___kevent50, AX + INT $0x80 + JAE 2(PC) + NEGL AX + MOVL AX, ret+24(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$-4 + MOVL $SYS_fcntl, AX + INT $0x80 + JAE noerr + MOVL $-1, ret+12(FP) + MOVL AX, errno+16(FP) + RET +noerr: + MOVL AX, ret+12(FP) + MOVL $0, errno+16(FP) + RET + +// int32 runtime·closeonexec(int32 fd) +TEXT runtime·closeonexec(SB),NOSPLIT,$32 + MOVL $SYS_fcntl, AX + // 0(SP) is where the caller PC would be; kernel skips it + MOVL fd+0(FP), BX 
+ MOVL BX, 4(SP) // fd + MOVL $F_SETFD, 8(SP) + MOVL $FD_CLOEXEC, 12(SP) + INT $0x80 + JAE 2(PC) + NEGL AX + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVL $SYS_issetugid, AX + INT $0x80 + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_netbsd_amd64.s b/src/runtime/sys_netbsd_amd64.s new file mode 100644 index 0000000..24b3041 --- /dev/null +++ b/src/runtime/sys_netbsd_amd64.s @@ -0,0 +1,471 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for AMD64, NetBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 3 +#define FD_CLOEXEC 1 +#define F_SETFD 2 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_fcntl 92 +#define SYS_mmap 197 +#define SYS___sysctl 202 +#define SYS___sigaltstack14 281 +#define SYS___sigprocmask14 293 +#define SYS_issetugid 305 +#define SYS_getcontext 307 +#define SYS_setcontext 308 +#define SYS__lwp_create 309 +#define SYS__lwp_exit 310 +#define SYS__lwp_self 311 +#define SYS__lwp_setprivate 317 +#define SYS__lwp_kill 318 +#define SYS__lwp_unpark 321 +#define SYS___sigaction_sigtramp 340 +#define SYS_kqueue 344 +#define SYS_sched_yield 350 +#define SYS___setitimer50 425 +#define SYS___clock_gettime50 427 +#define SYS___nanosleep50 430 +#define SYS___kevent50 435 +#define SYS____lwp_park60 478 + +// int32 lwp_create(void *context, uintptr flags, void *lwpid) +TEXT runtime·lwp_create(SB),NOSPLIT,$0 + MOVQ ctxt+0(FP), DI + MOVQ flags+8(FP), SI + MOVQ lwpid+16(FP), DX + MOVL $SYS__lwp_create, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, 
ret+24(FP) + RET + +TEXT runtime·lwp_tramp(SB),NOSPLIT,$0 + + // Set FS to point at m->tls. + LEAQ m_tls(R8), DI + CALL runtime·settls(SB) + + // Set up new stack. + get_tls(CX) + MOVQ R8, g_m(R9) + MOVQ R9, g(CX) + CALL runtime·stackcheck(SB) + + // Call fn. This is an ABI0 PC. + CALL R12 + + // It shouldn't return. If it does, exit. + MOVL $SYS__lwp_exit, AX + SYSCALL + JMP -3(PC) // keep exiting + +TEXT ·netbsdMstart(SB),NOSPLIT|TOPFRAME,$0 + CALL ·netbsdMstart0(SB) + RET // not reached + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVL $SYS_sched_yield, AX + SYSCALL + RET + +TEXT runtime·lwp_park(SB),NOSPLIT,$0 + MOVL clockid+0(FP), DI // arg 1 - clockid + MOVL flags+4(FP), SI // arg 2 - flags + MOVQ ts+8(FP), DX // arg 3 - ts + MOVL unpark+16(FP), R10 // arg 4 - unpark + MOVQ hint+24(FP), R8 // arg 5 - hint + MOVQ unparkhint+32(FP), R9 // arg 6 - unparkhint + MOVL $SYS____lwp_park60, AX + SYSCALL + MOVL AX, ret+40(FP) + RET + +TEXT runtime·lwp_unpark(SB),NOSPLIT,$0 + MOVL lwp+0(FP), DI // arg 1 - lwp + MOVQ hint+8(FP), SI // arg 2 - hint + MOVL $SYS__lwp_unpark, AX + SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·lwp_self(SB),NOSPLIT,$0 + MOVL $SYS__lwp_self, AX + SYSCALL + MOVL AX, ret+0(FP) + RET + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-8 + MOVL code+0(FP), DI // arg 1 - exit status + MOVL $SYS_exit, AX + SYSCALL + MOVL $0xf1, 0xf1 // crash + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-8 + MOVQ wait+0(FP), AX + // We're done using the stack. 
+ MOVL $0, (AX) + MOVL $SYS__lwp_exit, AX + SYSCALL + MOVL $0xf1, 0xf1 // crash + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT,$-8 + MOVQ name+0(FP), DI // arg 1 pathname + MOVL mode+8(FP), SI // arg 2 flags + MOVL perm+12(FP), DX // arg 3 mode + MOVL $SYS_open, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVL $SYS_close, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+8(FP) + RET + +TEXT runtime·read(SB),NOSPLIT,$-8 + MOVL fd+0(FP), DI // arg 1 fd + MOVQ p+8(FP), SI // arg 2 buf + MOVL n+16(FP), DX // arg 3 count + MOVL $SYS_read, AX + SYSCALL + JCC 2(PC) + NEGQ AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-20 + LEAQ r+8(FP), DI + MOVL flags+0(FP), SI + MOVL $453, AX + SYSCALL + MOVL AX, errno+16(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$-8 + MOVQ fd+0(FP), DI // arg 1 - fd + MOVQ p+8(FP), SI // arg 2 - buf + MOVL n+16(FP), DX // arg 3 - nbyte + MOVL $SYS_write, AX + SYSCALL + JCC 2(PC) + NEGQ AX // caller expects negative errno + MOVL AX, ret+24(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVL $0, DX + MOVL usec+0(FP), AX + MOVL $1000000, CX + DIVL CX + MOVQ AX, 0(SP) // tv_sec + MOVL $1000, AX + MULL DX + MOVQ AX, 8(SP) // tv_nsec + + MOVQ SP, DI // arg 1 - rqtp + MOVQ $0, SI // arg 2 - rmtp + MOVL $SYS___nanosleep50, AX + SYSCALL + RET + +TEXT runtime·lwp_kill(SB),NOSPLIT,$0-16 + MOVL tid+0(FP), DI // arg 1 - target + MOVQ sig+8(FP), SI // arg 2 - signo + MOVL $SYS__lwp_kill, AX + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$16 + MOVL $SYS_getpid, AX + SYSCALL + MOVQ AX, DI // arg 1 - pid + MOVL sig+0(FP), SI // arg 2 - signo + MOVL $SYS_kill, AX + SYSCALL + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$-8 + MOVL mode+0(FP), DI // arg 1 - which + MOVQ new+8(FP), SI // arg 2 - itv + MOVQ old+16(FP), DX // arg 3 - oitv + MOVL 
$SYS___setitimer50, AX + SYSCALL + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + MOVQ $CLOCK_REALTIME, DI // arg 1 - clock_id + LEAQ 8(SP), SI // arg 2 - tp + MOVL $SYS___clock_gettime50, AX + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + MOVQ AX, sec+0(FP) + MOVL DX, nsec+8(FP) + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$32 + MOVQ $CLOCK_MONOTONIC, DI // arg 1 - clock_id + LEAQ 8(SP), SI // arg 2 - tp + MOVL $SYS___clock_gettime50, AX + SYSCALL + MOVQ 8(SP), AX // sec + MOVQ 16(SP), DX // nsec + + // sec is in AX, nsec in DX + // return nsec in AX + IMULQ $1000000000, AX + ADDQ DX, AX + MOVQ AX, ret+0(FP) + RET + +TEXT runtime·getcontext(SB),NOSPLIT,$-8 + MOVQ ctxt+0(FP), DI // arg 1 - context + MOVL $SYS_getcontext, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVL how+0(FP), DI // arg 1 - how + MOVQ new+8(FP), SI // arg 2 - set + MOVQ old+16(FP), DX // arg 3 - oset + MOVL $SYS___sigprocmask14, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT sigreturn_tramp<>(SB),NOSPLIT,$-8 + MOVQ R15, DI // Load address of ucontext + MOVQ $SYS_setcontext, AX + SYSCALL + MOVQ $-1, DI // Something failed... + MOVL $SYS_exit, AX + SYSCALL + +TEXT runtime·sigaction(SB),NOSPLIT,$-8 + MOVL sig+0(FP), DI // arg 1 - signum + MOVQ new+8(FP), SI // arg 2 - nsa + MOVQ old+16(FP), DX // arg 3 - osa + // arg 4 - tramp + LEAQ sigreturn_tramp<>(SB), R10 + MOVQ $2, R8 // arg 5 - vers + MOVL $SYS___sigaction_sigtramp, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. 
+ PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - addr + MOVQ n+8(FP), SI // arg 2 - len + MOVL prot+16(FP), DX // arg 3 - prot + MOVL flags+20(FP), R10 // arg 4 - flags + MOVL fd+24(FP), R8 // arg 5 - fd + MOVL off+28(FP), R9 + SUBQ $16, SP + MOVQ R9, 8(SP) // arg 7 - offset (passed on stack) + MOVQ $0, R9 // arg 6 - pad + MOVL $SYS_mmap, AX + SYSCALL + JCC ok + ADDQ $16, SP + MOVQ $0, p+32(FP) + MOVQ AX, err+40(FP) + RET +ok: + ADDQ $16, SP + MOVQ AX, p+32(FP) + MOVQ $0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - addr + MOVQ n+8(FP), SI // arg 2 - len + MOVL $SYS_munmap, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVQ addr+0(FP), DI // arg 1 - addr + MOVQ n+8(FP), SI // arg 2 - len + MOVL flags+16(FP), DX // arg 3 - behav + MOVQ $SYS_madvise, AX + SYSCALL + JCC 2(PC) + MOVL $-1, AX + MOVL AX, ret+24(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$-8 + MOVQ new+0(FP), DI // arg 1 - nss + MOVQ old+8(FP), SI // arg 2 - oss + MOVQ $SYS___sigaltstack14, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +// set tls base to DI +TEXT runtime·settls(SB),NOSPLIT,$8 + // adjust for ELF: wants to use -8(FS) for g + ADDQ $8, DI // arg 1 - ptr + MOVQ $SYS__lwp_setprivate, AX + SYSCALL + JCC 2(PC) + MOVL $0xf1, 0xf1 // crash + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVQ mib+0(FP), DI // arg 1 - name + MOVL miblen+8(FP), SI // arg 2 - namelen + MOVQ out+16(FP), DX // arg 3 - oldp + MOVQ size+24(FP), R10 // arg 4 - oldlenp + MOVQ 
dst+32(FP), R8 // arg 5 - newp + MOVQ ndst+40(FP), R9 // arg 6 - newlen + MOVQ $SYS___sysctl, AX + SYSCALL + JCC 4(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + MOVL $0, AX + MOVL AX, ret+48(FP) + RET + +// int32 runtime·kqueue(void) +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVL $SYS_kqueue, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVL kq+0(FP), DI + MOVQ ch+8(FP), SI + MOVL nch+16(FP), DX + MOVQ ev+24(FP), R10 + MOVL nev+32(FP), R8 + MOVQ ts+40(FP), R9 + MOVL $SYS___kevent50, AX + SYSCALL + JCC 2(PC) + NEGQ AX + MOVL AX, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVL cmd+4(FP), SI // cmd + MOVL arg+8(FP), DX // arg + MOVL $SYS_fcntl, AX + SYSCALL + JCC noerr + MOVL $-1, ret+16(FP) + MOVL AX, errno+20(FP) + RET +noerr: + MOVL AX, ret+16(FP) + MOVL $0, errno+20(FP) + RET + +// void runtime·closeonexec(int32 fd) +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVL fd+0(FP), DI // fd + MOVQ $F_SETFD, SI + MOVQ $FD_CLOEXEC, DX + MOVL $SYS_fcntl, AX + SYSCALL + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVQ $0, DI + MOVQ $0, SI + MOVQ $0, DX + MOVL $SYS_issetugid, AX + SYSCALL + MOVL AX, ret+0(FP) + RET diff --git a/src/runtime/sys_netbsd_arm.s b/src/runtime/sys_netbsd_arm.s new file mode 100644 index 0000000..263c3f0 --- /dev/null +++ b/src/runtime/sys_netbsd_arm.s @@ -0,0 +1,437 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for ARM, NetBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 3 +#define FD_CLOEXEC 1 +#define F_SETFD 2 + +#define SWI_OS_NETBSD 0xa00000 +#define SYS_exit SWI_OS_NETBSD | 1 +#define SYS_read SWI_OS_NETBSD | 3 +#define SYS_write SWI_OS_NETBSD | 4 +#define SYS_open SWI_OS_NETBSD | 5 +#define SYS_close SWI_OS_NETBSD | 6 +#define SYS_getpid SWI_OS_NETBSD | 20 +#define SYS_kill SWI_OS_NETBSD | 37 +#define SYS_munmap SWI_OS_NETBSD | 73 +#define SYS_madvise SWI_OS_NETBSD | 75 +#define SYS_fcntl SWI_OS_NETBSD | 92 +#define SYS_mmap SWI_OS_NETBSD | 197 +#define SYS___sysctl SWI_OS_NETBSD | 202 +#define SYS___sigaltstack14 SWI_OS_NETBSD | 281 +#define SYS___sigprocmask14 SWI_OS_NETBSD | 293 +#define SYS_issetugid SWI_OS_NETBSD | 305 +#define SYS_getcontext SWI_OS_NETBSD | 307 +#define SYS_setcontext SWI_OS_NETBSD | 308 +#define SYS__lwp_create SWI_OS_NETBSD | 309 +#define SYS__lwp_exit SWI_OS_NETBSD | 310 +#define SYS__lwp_self SWI_OS_NETBSD | 311 +#define SYS__lwp_getprivate SWI_OS_NETBSD | 316 +#define SYS__lwp_setprivate SWI_OS_NETBSD | 317 +#define SYS__lwp_kill SWI_OS_NETBSD | 318 +#define SYS__lwp_unpark SWI_OS_NETBSD | 321 +#define SYS___sigaction_sigtramp SWI_OS_NETBSD | 340 +#define SYS_kqueue SWI_OS_NETBSD | 344 +#define SYS_sched_yield SWI_OS_NETBSD | 350 +#define SYS___setitimer50 SWI_OS_NETBSD | 425 +#define SYS___clock_gettime50 SWI_OS_NETBSD | 427 +#define SYS___nanosleep50 SWI_OS_NETBSD | 430 +#define SYS___kevent50 SWI_OS_NETBSD | 435 +#define SYS____lwp_park60 SWI_OS_NETBSD | 478 + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0 + MOVW code+0(FP), R0 // arg 1 exit status + SWI $SYS_exit + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-4 + MOVW wait+0(FP), R0 + // We're done using the stack. 
+ MOVW $0, R2 +storeloop: + LDREX (R0), R4 // loads R4 + STREX R2, (R0), R1 // stores R2 + CMP $0, R1 + BNE storeloop + SWI $SYS__lwp_exit + MOVW $1, R8 // crash + MOVW R8, (R8) + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0 + MOVW name+0(FP), R0 + MOVW mode+4(FP), R1 + MOVW perm+8(FP), R2 + SWI $SYS_open + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 + SWI $SYS_close + MOVW.CS $-1, R0 + MOVW R0, ret+4(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 + MOVW p+4(FP), R1 + MOVW n+8(FP), R2 + SWI $SYS_read + RSB.CS $0, R0 // caller expects negative errno + MOVW R0, ret+12(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT,$0-16 + MOVW $r+4(FP), R0 + MOVW flags+0(FP), R1 + SWI $0xa001c5 + MOVW R0, errno+12(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 // arg 1 - fd + MOVW p+4(FP), R1 // arg 2 - buf + MOVW n+8(FP), R2 // arg 3 - nbyte + SWI $SYS_write + RSB.CS $0, R0 // caller expects negative errno + MOVW R0, ret+12(FP) + RET + +// int32 lwp_create(void *context, uintptr flags, void *lwpid) +TEXT runtime·lwp_create(SB),NOSPLIT,$0 + MOVW ctxt+0(FP), R0 + MOVW flags+4(FP), R1 + MOVW lwpid+8(FP), R2 + SWI $SYS__lwp_create + MOVW R0, ret+12(FP) + RET + +TEXT runtime·osyield(SB),NOSPLIT,$0 + SWI $SYS_sched_yield + RET + +TEXT runtime·lwp_park(SB),NOSPLIT,$8 + MOVW clockid+0(FP), R0 // arg 1 - clock_id + MOVW flags+4(FP), R1 // arg 2 - flags + MOVW ts+8(FP), R2 // arg 3 - ts + MOVW unpark+12(FP), R3 // arg 4 - unpark + MOVW hint+16(FP), R4 // arg 5 - hint + MOVW R4, 4(R13) + MOVW unparkhint+20(FP), R5 // arg 6 - unparkhint + MOVW R5, 8(R13) + SWI $SYS____lwp_park60 + MOVW R0, ret+24(FP) + RET + +TEXT runtime·lwp_unpark(SB),NOSPLIT,$0 + MOVW lwp+0(FP), R0 // arg 1 - lwp + MOVW hint+4(FP), R1 // arg 2 - hint + SWI $SYS__lwp_unpark + MOVW R0, ret+8(FP) + RET + +TEXT runtime·lwp_self(SB),NOSPLIT,$0 + SWI 
$SYS__lwp_self + MOVW R0, ret+0(FP) + RET + +TEXT runtime·lwp_tramp(SB),NOSPLIT,$0 + MOVW R0, g_m(R1) + MOVW R1, g + + BL runtime·emptyfunc(SB) // fault if stack check is wrong + BL (R2) + MOVW $2, R8 // crash (not reached) + MOVW R8, (R8) + RET + +TEXT ·netbsdMstart(SB),NOSPLIT|TOPFRAME,$0 + BL ·netbsdMstart0(SB) + RET // not reached + +TEXT runtime·usleep(SB),NOSPLIT,$16 + MOVW usec+0(FP), R0 + CALL runtime·usplitR0(SB) + // 0(R13) is the saved LR, don't use it + MOVW R0, 4(R13) // tv_sec.low + MOVW $0, R0 + MOVW R0, 8(R13) // tv_sec.high + MOVW $1000, R2 + MUL R1, R2 + MOVW R2, 12(R13) // tv_nsec + + MOVW $4(R13), R0 // arg 1 - rqtp + MOVW $0, R1 // arg 2 - rmtp + SWI $SYS___nanosleep50 + RET + +TEXT runtime·lwp_kill(SB),NOSPLIT,$0-8 + MOVW tid+0(FP), R0 // arg 1 - tid + MOVW sig+4(FP), R1 // arg 2 - signal + SWI $SYS__lwp_kill + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$16 + SWI $SYS_getpid // the returned R0 is arg 1 + MOVW sig+0(FP), R1 // arg 2 - signal + SWI $SYS_kill + RET + +TEXT runtime·setitimer(SB),NOSPLIT|NOFRAME,$0 + MOVW mode+0(FP), R0 // arg 1 - which + MOVW new+4(FP), R1 // arg 2 - itv + MOVW old+8(FP), R2 // arg 3 - oitv + SWI $SYS___setitimer50 + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + MOVW $0, R0 // CLOCK_REALTIME + MOVW $8(R13), R1 + SWI $SYS___clock_gettime50 + + MOVW 8(R13), R0 // sec.low + MOVW 12(R13), R1 // sec.high + MOVW 16(R13), R2 // nsec + + MOVW R0, sec_lo+0(FP) + MOVW R1, sec_hi+4(FP) + MOVW R2, nsec+8(FP) + RET + +// int64 nanotime1(void) so really +// void nanotime1(int64 *nsec) +TEXT runtime·nanotime1(SB), NOSPLIT, $32 + MOVW $3, R0 // CLOCK_MONOTONIC + MOVW $8(R13), R1 + SWI $SYS___clock_gettime50 + + MOVW 8(R13), R0 // sec.low + MOVW 12(R13), R4 // sec.high + MOVW 16(R13), R2 // nsec + + MOVW $1000000000, R3 + MULLU R0, R3, (R1, R0) + MUL R3, R4 + ADD.S R2, R0 + ADC R4, R1 + + MOVW R0, ret_lo+0(FP) + MOVW R1, ret_hi+4(FP) + RET + +TEXT 
runtime·getcontext(SB),NOSPLIT|NOFRAME,$0 + MOVW ctxt+0(FP), R0 // arg 1 - context + SWI $SYS_getcontext + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVW how+0(FP), R0 // arg 1 - how + MOVW new+4(FP), R1 // arg 2 - set + MOVW old+8(FP), R2 // arg 3 - oset + SWI $SYS___sigprocmask14 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT sigreturn_tramp<>(SB),NOSPLIT|NOFRAME,$0 + // on entry, SP points to siginfo, we add sizeof(ucontext) + // to SP to get a pointer to ucontext. + ADD $0x80, R13, R0 // 0x80 == sizeof(UcontextT) + SWI $SYS_setcontext + // something failed, we have to exit + MOVW $0x4242, R0 // magic return number + SWI $SYS_exit + B -2(PC) // continue exit + +TEXT runtime·sigaction(SB),NOSPLIT,$4 + MOVW sig+0(FP), R0 // arg 1 - signum + MOVW new+4(FP), R1 // arg 2 - nsa + MOVW old+8(FP), R2 // arg 3 - osa + MOVW $sigreturn_tramp<>(SB), R3 // arg 4 - tramp + MOVW $2, R4 // arg 5 - vers + MOVW R4, 4(R13) + ADD $4, R13 // pass arg 5 on stack + SWI $SYS___sigaction_sigtramp + SUB $4, R13 + MOVW.CS $3, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVW sig+4(FP), R0 + MOVW info+8(FP), R1 + MOVW ctx+12(FP), R2 + MOVW fn+0(FP), R11 + MOVW R13, R4 + SUB $24, R13 + BIC $0x7, R13 // alignment for ELF ABI + BL (R11) + MOVW R4, R13 + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Reserve space for callee-save registers and arguments. + MOVM.DB.W [R4-R11], (R13) + SUB $16, R13 + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVW R0, 4(R13) // signum + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + BL.NE runtime·load_g(SB) + + MOVW R1, 8(R13) + MOVW R2, 12(R13) + BL runtime·sigtrampgo(SB) + + // Restore callee-save registers. 
+ ADD $16, R13 + MOVM.IA.W (R13), [R4-R11] + + RET + +TEXT runtime·mmap(SB),NOSPLIT,$12 + MOVW addr+0(FP), R0 // arg 1 - addr + MOVW n+4(FP), R1 // arg 2 - len + MOVW prot+8(FP), R2 // arg 3 - prot + MOVW flags+12(FP), R3 // arg 4 - flags + // arg 5 (fid) and arg6 (offset_lo, offset_hi) are passed on stack + // note the C runtime only passes the 32-bit offset_lo to us + MOVW fd+16(FP), R4 // arg 5 + MOVW R4, 4(R13) + MOVW off+20(FP), R5 // arg 6 lower 32-bit + MOVW R5, 8(R13) + MOVW $0, R6 // higher 32-bit for arg 6 + MOVW R6, 12(R13) + ADD $4, R13 // pass arg 5 and arg 6 on stack + SWI $SYS_mmap + SUB $4, R13 + MOVW $0, R1 + MOVW.CS R0, R1 // if error, move to R1 + MOVW.CS $0, R0 + MOVW R0, p+24(FP) + MOVW R1, err+28(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 // arg 1 - addr + MOVW n+4(FP), R1 // arg 2 - len + SWI $SYS_munmap + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVW addr+0(FP), R0 // arg 1 - addr + MOVW n+4(FP), R1 // arg 2 - len + MOVW flags+8(FP), R2 // arg 3 - behav + SWI $SYS_madvise + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT|NOFRAME,$0 + MOVW new+0(FP), R0 // arg 1 - nss + MOVW old+4(FP), R1 // arg 2 - oss + SWI $SYS___sigaltstack14 + MOVW.CS $0, R8 // crash on syscall failure + MOVW.CS R8, (R8) + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$8 + MOVW mib+0(FP), R0 // arg 1 - name + MOVW miblen+4(FP), R1 // arg 2 - namelen + MOVW out+8(FP), R2 // arg 3 - oldp + MOVW size+12(FP), R3 // arg 4 - oldlenp + MOVW dst+16(FP), R4 // arg 5 - newp + MOVW R4, 4(R13) + MOVW ndst+20(FP), R4 // arg 6 - newlen + MOVW R4, 8(R13) + ADD $4, R13 // pass arg 5 and 6 on stack + SWI $SYS___sysctl + SUB $4, R13 + MOVW R0, ret+24(FP) + RET + +// int32 runtime·kqueue(void) +TEXT runtime·kqueue(SB),NOSPLIT,$0 + SWI $SYS_kqueue + RSB.CS $0, R0 + MOVW R0, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent 
*eventlist, int nevents, Timespec *timeout) +TEXT runtime·kevent(SB),NOSPLIT,$8 + MOVW kq+0(FP), R0 // kq + MOVW ch+4(FP), R1 // changelist + MOVW nch+8(FP), R2 // nchanges + MOVW ev+12(FP), R3 // eventlist + MOVW nev+16(FP), R4 // nevents + MOVW R4, 4(R13) + MOVW ts+20(FP), R4 // timeout + MOVW R4, 8(R13) + ADD $4, R13 // pass arg 5 and 6 on stack + SWI $SYS___kevent50 + RSB.CS $0, R0 + SUB $4, R13 + MOVW R0, ret+24(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 + MOVW cmd+4(FP), R1 + MOVW arg+8(FP), R2 + SWI $SYS_fcntl + MOVW $0, R1 + MOVW.CS R0, R1 + MOVW.CS $-1, R0 + MOVW R0, ret+12(FP) + MOVW R1, errno+16(FP) + RET + +// void runtime·closeonexec(int32 fd) +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 // fd + MOVW $F_SETFD, R1 // F_SETFD + MOVW $FD_CLOEXEC, R2 // FD_CLOEXEC + SWI $SYS_fcntl + RET + +// TODO: this is only valid for ARMv7+ +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + B runtime·armPublicationBarrier(SB) + +TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + MOVM.WP [R1, R2, R3, R12], (R13) + SWI $SYS__lwp_getprivate + MOVM.IAW (R13), [R1, R2, R3, R12] + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + SWI $SYS_issetugid + MOVW R0, ret+0(FP) + RET diff --git a/src/runtime/sys_netbsd_arm64.s b/src/runtime/sys_netbsd_arm64.s new file mode 100644 index 0000000..c302adb --- /dev/null +++ b/src/runtime/sys_netbsd_arm64.s @@ -0,0 +1,453 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +// System calls and other sys.stuff for arm64, NetBSD +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_arm64.h" + +#define CLOCK_REALTIME 0 +#define CLOCK_MONOTONIC 3 +#define FD_CLOEXEC 1 +#define F_SETFD 2 +#define F_GETFL 3 +#define F_SETFL 4 +#define O_NONBLOCK 4 + +#define SYS_exit 1 +#define SYS_read 3 +#define SYS_write 4 +#define SYS_open 5 +#define SYS_close 6 +#define SYS_getpid 20 +#define SYS_kill 37 +#define SYS_munmap 73 +#define SYS_madvise 75 +#define SYS_fcntl 92 +#define SYS_mmap 197 +#define SYS___sysctl 202 +#define SYS___sigaltstack14 281 +#define SYS___sigprocmask14 293 +#define SYS_issetugid 305 +#define SYS_getcontext 307 +#define SYS_setcontext 308 +#define SYS__lwp_create 309 +#define SYS__lwp_exit 310 +#define SYS__lwp_self 311 +#define SYS__lwp_kill 318 +#define SYS__lwp_unpark 321 +#define SYS___sigaction_sigtramp 340 +#define SYS_kqueue 344 +#define SYS_sched_yield 350 +#define SYS___setitimer50 425 +#define SYS___clock_gettime50 427 +#define SYS___nanosleep50 430 +#define SYS___kevent50 435 +#define SYS_pipe2 453 +#define SYS_openat 468 +#define SYS____lwp_park60 478 + +// int32 lwp_create(void *context, uintptr flags, void *lwpid) +TEXT runtime·lwp_create(SB),NOSPLIT,$0 + MOVD ctxt+0(FP), R0 + MOVD flags+8(FP), R1 + MOVD lwpid+16(FP), R2 + SVC $SYS__lwp_create + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+24(FP) + RET + +TEXT runtime·lwp_tramp(SB),NOSPLIT,$0 + CMP $0, R1 + BEQ nog + CMP $0, R2 + BEQ nog + + MOVD R0, g_m(R1) + MOVD R1, g +nog: + CALL (R2) + + MOVD $0, R0 // crash (not reached) + MOVD R0, (R8) + +TEXT ·netbsdMstart(SB),NOSPLIT|TOPFRAME,$0 + CALL ·netbsdMstart0(SB) + RET // not reached + +TEXT runtime·osyield(SB),NOSPLIT,$0 + SVC $SYS_sched_yield + RET + +TEXT runtime·lwp_park(SB),NOSPLIT,$0 + MOVW clockid+0(FP), R0 // arg 1 - clockid + MOVW flags+4(FP), R1 // arg 2 - flags + MOVD ts+8(FP), R2 // arg 3 - ts + MOVW unpark+16(FP), R3 // arg 4 - unpark + MOVD hint+24(FP), R4 // 
arg 5 - hint + MOVD unparkhint+32(FP), R5 // arg 6 - unparkhint + SVC $SYS____lwp_park60 + MOVW R0, ret+40(FP) + RET + +TEXT runtime·lwp_unpark(SB),NOSPLIT,$0 + MOVW lwp+0(FP), R0 // arg 1 - lwp + MOVD hint+8(FP), R1 // arg 2 - hint + SVC $SYS__lwp_unpark + MOVW R0, ret+16(FP) + RET + +TEXT runtime·lwp_self(SB),NOSPLIT,$0 + SVC $SYS__lwp_self + MOVW R0, ret+0(FP) + RET + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT,$-8 + MOVW code+0(FP), R0 // arg 1 - exit status + SVC $SYS_exit + MOVD $0, R0 // If we're still running, + MOVD R0, (R0) // crash + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0-8 + MOVD wait+0(FP), R0 + // We're done using the stack. + MOVW $0, R1 + STLRW R1, (R0) + SVC $SYS__lwp_exit + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$-8 + MOVD name+0(FP), R0 // arg 1 - pathname + MOVW mode+8(FP), R1 // arg 2 - flags + MOVW perm+12(FP), R2 // arg 3 - mode + SVC $SYS_open + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$-8 + MOVW fd+0(FP), R0 // arg 1 - fd + SVC $SYS_close + BCC ok + MOVW $-1, R0 +ok: + MOVW R0, ret+8(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R0 // arg 1 - fd + MOVD p+8(FP), R1 // arg 2 - buf + MOVW n+16(FP), R2 // arg 3 - count + SVC $SYS_read + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + ADD $16, RSP, R0 + MOVW flags+0(FP), R1 + SVC $SYS_pipe2 + BCC pipe2ok + NEG R0, R0 +pipe2ok: + MOVW R0, errno+16(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT,$-8 + MOVD fd+0(FP), R0 // arg 1 - fd + MOVD p+8(FP), R1 // arg 2 - buf + MOVW n+16(FP), R2 // arg 3 - nbyte + SVC $SYS_write + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+24(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), R3 + MOVD R3, R5 + MOVW $1000000, R4 + UDIV R4, R3 + MOVD R3, 8(RSP) // sec + MUL R3, R4 + SUB R4, R5 + 
MOVW $1000, R4 + MUL R4, R5 + MOVD R5, 16(RSP) // nsec + + MOVD $8(RSP), R0 // arg 1 - rqtp + MOVD $0, R1 // arg 2 - rmtp + SVC $SYS___nanosleep50 + RET + +TEXT runtime·lwp_kill(SB),NOSPLIT,$0-16 + MOVW tid+0(FP), R0 // arg 1 - target + MOVD sig+8(FP), R1 // arg 2 - signo + SVC $SYS__lwp_kill + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$16 + SVC $SYS_getpid + // arg 1 - pid (from getpid) + MOVD sig+0(FP), R1 // arg 2 - signo + SVC $SYS_kill + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$-8 + MOVW mode+0(FP), R0 // arg 1 - which + MOVD new+8(FP), R1 // arg 2 - itv + MOVD old+16(FP), R2 // arg 3 - oitv + SVC $SYS___setitimer50 + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + MOVW $CLOCK_REALTIME, R0 // arg 1 - clock_id + MOVD $8(RSP), R1 // arg 2 - tp + SVC $SYS___clock_gettime50 + + MOVD 8(RSP), R0 // sec + MOVD 16(RSP), R1 // nsec + + // sec is in R0, nsec in R1 + MOVD R0, sec+0(FP) + MOVW R1, nsec+8(FP) + RET + +// int64 nanotime1(void) so really +// void nanotime1(int64 *nsec) +TEXT runtime·nanotime1(SB), NOSPLIT, $32 + MOVD $CLOCK_MONOTONIC, R0 // arg 1 - clock_id + MOVD $8(RSP), R1 // arg 2 - tp + SVC $SYS___clock_gettime50 + MOVD 8(RSP), R0 // sec + MOVD 16(RSP), R2 // nsec + + // sec is in R0, nsec in R2 + // return nsec in R2 + MOVD $1000000000, R3 + MUL R3, R0 + ADD R2, R0 + + MOVD R0, ret+0(FP) + RET + +TEXT runtime·getcontext(SB),NOSPLIT,$-8 + MOVD ctxt+0(FP), R0 // arg 1 - context + SVC $SYS_getcontext + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +TEXT runtime·sigprocmask(SB),NOSPLIT,$0 + MOVW how+0(FP), R0 // arg 1 - how + MOVD new+8(FP), R1 // arg 2 - set + MOVD old+16(FP), R2 // arg 3 - oset + SVC $SYS___sigprocmask14 + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +TEXT sigreturn_tramp<>(SB),NOSPLIT,$-8 + MOVD g, R0 + SVC $SYS_setcontext + MOVD $0, R0 + MOVD R0, (R0) // crash + +TEXT runtime·sigaction(SB),NOSPLIT,$-8 + MOVW sig+0(FP), R0 // arg 1 - signum + MOVD 
new+8(FP), R1 // arg 2 - nsa + MOVD old+16(FP), R2 // arg 3 - osa + // arg 4 - tramp + MOVD $sigreturn_tramp<>(SB), R3 + MOVW $2, R4 // arg 5 - vers + SVC $SYS___sigaction_sigtramp + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +// XXX ??? +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R0 + MOVD info+16(FP), R1 + MOVD ctx+24(FP), R2 + MOVD fn+0(FP), R11 + BL (R11) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$176 + // Save callee-save registers in the case of signal forwarding. + // Please refer to https://golang.org/issue/31827 . + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + // Unclobber g for now (kernel uses it as ucontext ptr) + // See https://github.com/golang/go/issues/30824#issuecomment-492772426 + // This is only correct in the non-cgo case. + // XXX should use lwp_getprivate as suggested. + // 8*36 is ucontext.uc_mcontext.__gregs[_REG_X28] + MOVD 8*36(g), g + + // this might be called in external code context, + // where g is not set. + // first save R0, because runtime·load_g will clobber it + MOVD R0, 8(RSP) // signum + MOVB runtime·iscgo(SB), R0 + CMP $0, R0 + // XXX branch destination + BEQ 2(PC) + BL runtime·load_g(SB) + +#ifdef GOEXPERIMENT_regabiargs + // Restore signum to R0. + MOVW 8(RSP), R0 + // R1 and R2 already contain info and ctx, respectively. +#else + MOVD R1, 16(RSP) + MOVD R2, 24(RSP) +#endif + BL runtime·sigtrampgo<ABIInternal>(SB) + + // Restore callee-save registers. 
+ RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 // arg 1 - addr + MOVD n+8(FP), R1 // arg 2 - len + MOVW prot+16(FP), R2 // arg 3 - prot + MOVW flags+20(FP), R3 // arg 4 - flags + MOVW fd+24(FP), R4 // arg 5 - fd + MOVW $0, R5 // arg 6 - pad + MOVD off+28(FP), R6 // arg 7 - offset + SVC $SYS_mmap + BCS fail + MOVD R0, p+32(FP) + MOVD $0, err+40(FP) + RET +fail: + MOVD $0, p+32(FP) + MOVD R0, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 // arg 1 - addr + MOVD n+8(FP), R1 // arg 2 - len + SVC $SYS_munmap + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVD addr+0(FP), R0 // arg 1 - addr + MOVD n+8(FP), R1 // arg 2 - len + MOVW flags+16(FP), R2 // arg 3 - behav + SVC $SYS_madvise + BCC ok + MOVD $-1, R0 +ok: + MOVD R0, ret+24(FP) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$0 + MOVD new+0(FP), R0 // arg 1 - nss + MOVD old+8(FP), R1 // arg 2 - oss + SVC $SYS___sigaltstack14 + BCS fail + RET +fail: + MOVD $0, R0 + MOVD R0, (R0) // crash + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVD mib+0(FP), R0 // arg 1 - name + MOVW miblen+8(FP), R1 // arg 2 - namelen + MOVD out+16(FP), R2 // arg 3 - oldp + MOVD size+24(FP), R3 // arg 4 - oldlenp + MOVD dst+32(FP), R4 // arg 5 - newp + MOVD ndst+40(FP), R5 // arg 6 - newlen + SVC $SYS___sysctl + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+48(FP) + RET + +// int32 runtime·kqueue(void) +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVD $0, R0 + SVC $SYS_kqueue + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout) +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVW kq+0(FP), R0 // arg 1 - kq + MOVD ch+8(FP), R1 // arg 2 - changelist + MOVW nch+16(FP), R2 // arg 3 - nchanges + MOVD ev+24(FP), R3 // arg 4 - eventlist + MOVW nev+32(FP), R4 // arg 5 - nevents + MOVD ts+40(FP), R5 // arg 6 
- timeout + SVC $SYS___kevent50 + BCC ok + NEG R0, R0 +ok: + MOVW R0, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 // fd + MOVW cmd+4(FP), R1 // cmd + MOVW arg+8(FP), R2 // arg + SVC $SYS_fcntl + BCC noerr + MOVW $-1, R1 + MOVW R1, ret+16(FP) + MOVW R0, errno+20(FP) + RET +noerr: + MOVW R0, ret+16(FP) + MOVW $0, errno+20(FP) + RET + +// void runtime·closeonexec(int32 fd) +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVW fd+0(FP), R0 // arg 1 - fd + MOVW $F_SETFD, R1 + MOVW $FD_CLOEXEC, R2 + SVC $SYS_fcntl + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT|NOFRAME,$0 + SVC $SYS_issetugid + MOVW R0, ret+0(FP) + RET diff --git a/src/runtime/sys_nonppc64x.go b/src/runtime/sys_nonppc64x.go new file mode 100644 index 0000000..653f1c9 --- /dev/null +++ b/src/runtime/sys_nonppc64x.go @@ -0,0 +1,10 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !ppc64 && !ppc64le + +package runtime + +func prepGoExitFrame(sp uintptr) { +} diff --git a/src/runtime/sys_openbsd.go b/src/runtime/sys_openbsd.go new file mode 100644 index 0000000..c4b8489 --- /dev/null +++ b/src/runtime/sys_openbsd.go @@ -0,0 +1,75 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && !mips64 + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// The *_trampoline functions convert from the Go calling convention to the C calling convention +// and then call the underlying libc function. These are defined in sys_openbsd_$ARCH.s. 
+ +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_init(attr *pthreadattr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_init_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + return ret +} +func pthread_attr_init_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_destroy(attr *pthreadattr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_destroy_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + return ret +} +func pthread_attr_destroy_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_getstacksize(attr *pthreadattr, size *uintptr) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_getstacksize_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + KeepAlive(size) + return ret +} +func pthread_attr_getstacksize_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_attr_setdetachstate(attr *pthreadattr, state int) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_attr_setdetachstate_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + return ret +} +func pthread_attr_setdetachstate_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func pthread_create(attr *pthreadattr, start uintptr, arg unsafe.Pointer) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(pthread_create_trampoline)), unsafe.Pointer(&attr)) + KeepAlive(attr) + KeepAlive(arg) // Just for consistency. Arg of course needs to be kept alive for the start function. + return ret +} +func pthread_create_trampoline() + +// Tell the linker that the libc_* functions are to be found +// in a system library, with the libc_ prefix missing. 
+ +//go:cgo_import_dynamic libc_pthread_attr_init pthread_attr_init "libpthread.so" +//go:cgo_import_dynamic libc_pthread_attr_destroy pthread_attr_destroy "libpthread.so" +//go:cgo_import_dynamic libc_pthread_attr_getstacksize pthread_attr_getstacksize "libpthread.so" +//go:cgo_import_dynamic libc_pthread_attr_setdetachstate pthread_attr_setdetachstate "libpthread.so" +//go:cgo_import_dynamic libc_pthread_create pthread_create "libpthread.so" +//go:cgo_import_dynamic libc_pthread_sigmask pthread_sigmask "libpthread.so" + +//go:cgo_import_dynamic _ _ "libpthread.so" +//go:cgo_import_dynamic _ _ "libc.so" diff --git a/src/runtime/sys_openbsd1.go b/src/runtime/sys_openbsd1.go new file mode 100644 index 0000000..d852e3c --- /dev/null +++ b/src/runtime/sys_openbsd1.go @@ -0,0 +1,46 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && !mips64 + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +//go:nosplit +//go:cgo_unsafe_args +func thrsleep(ident uintptr, clock_id int32, tsp *timespec, lock uintptr, abort *uint32) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(thrsleep_trampoline)), unsafe.Pointer(&ident)) + KeepAlive(tsp) + KeepAlive(abort) + return ret +} +func thrsleep_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func thrwakeup(ident uintptr, n int32) int32 { + return libcCall(unsafe.Pointer(abi.FuncPCABI0(thrwakeup_trampoline)), unsafe.Pointer(&ident)) +} +func thrwakeup_trampoline() + +//go:nosplit +func osyield() { + libcCall(unsafe.Pointer(abi.FuncPCABI0(sched_yield_trampoline)), unsafe.Pointer(nil)) +} +func sched_yield_trampoline() + +//go:nosplit +func osyield_no_g() { + asmcgocall_no_g(unsafe.Pointer(abi.FuncPCABI0(sched_yield_trampoline)), unsafe.Pointer(nil)) +} + +//go:cgo_import_dynamic libc_thrsleep __thrsleep "libc.so" +//go:cgo_import_dynamic libc_thrwakeup __thrwakeup "libc.so" 
+//go:cgo_import_dynamic libc_sched_yield sched_yield "libc.so" + +//go:cgo_import_dynamic _ _ "libc.so" diff --git a/src/runtime/sys_openbsd2.go b/src/runtime/sys_openbsd2.go new file mode 100644 index 0000000..c7efeaf --- /dev/null +++ b/src/runtime/sys_openbsd2.go @@ -0,0 +1,307 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && !mips64 + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "unsafe" +) + +// This is exported via linkname to assembly in runtime/cgo. +// +//go:linkname exit +//go:nosplit +//go:cgo_unsafe_args +func exit(code int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(exit_trampoline)), unsafe.Pointer(&code)) +} +func exit_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func getthrid() (tid int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(getthrid_trampoline)), unsafe.Pointer(&tid)) + return +} +func getthrid_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func raiseproc(sig uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(raiseproc_trampoline)), unsafe.Pointer(&sig)) +} +func raiseproc_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func thrkill(tid int32, sig int) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(thrkill_trampoline)), unsafe.Pointer(&tid)) +} +func thrkill_trampoline() + +// mmap is used to do low-level memory allocation via mmap. Don't allow stack +// splits, since this function (used by sysAlloc) is called in a lot of low-level +// parts of the runtime and callers often assume it won't acquire any locks. 
+// +//go:nosplit +func mmap(addr unsafe.Pointer, n uintptr, prot, flags, fd int32, off uint32) (unsafe.Pointer, int) { + args := struct { + addr unsafe.Pointer + n uintptr + prot, flags, fd int32 + off uint32 + ret1 unsafe.Pointer + ret2 int + }{addr, n, prot, flags, fd, off, nil, 0} + libcCall(unsafe.Pointer(abi.FuncPCABI0(mmap_trampoline)), unsafe.Pointer(&args)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. + return args.ret1, args.ret2 +} +func mmap_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func munmap(addr unsafe.Pointer, n uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(munmap_trampoline)), unsafe.Pointer(&addr)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. +} +func munmap_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func madvise(addr unsafe.Pointer, n uintptr, flags int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(madvise_trampoline)), unsafe.Pointer(&addr)) + KeepAlive(addr) // Just for consistency. Hopefully addr is not a Go address. 
+} +func madvise_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func open(name *byte, mode, perm int32) (ret int32) { + ret = libcCall(unsafe.Pointer(abi.FuncPCABI0(open_trampoline)), unsafe.Pointer(&name)) + KeepAlive(name) + return +} +func open_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func closefd(fd int32) int32 { + return libcCall(unsafe.Pointer(abi.FuncPCABI0(close_trampoline)), unsafe.Pointer(&fd)) +} +func close_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func read(fd int32, p unsafe.Pointer, n int32) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(read_trampoline)), unsafe.Pointer(&fd)) + KeepAlive(p) + return ret +} +func read_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func write1(fd uintptr, p unsafe.Pointer, n int32) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(write_trampoline)), unsafe.Pointer(&fd)) + KeepAlive(p) + return ret +} +func write_trampoline() + +func pipe2(flags int32) (r, w int32, errno int32) { + var p [2]int32 + args := struct { + p unsafe.Pointer + flags int32 + }{noescape(unsafe.Pointer(&p)), flags} + errno = libcCall(unsafe.Pointer(abi.FuncPCABI0(pipe2_trampoline)), unsafe.Pointer(&args)) + return p[0], p[1], errno +} +func pipe2_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func setitimer(mode int32, new, old *itimerval) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(setitimer_trampoline)), unsafe.Pointer(&mode)) + KeepAlive(new) + KeepAlive(old) +} +func setitimer_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func usleep(usec uint32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(usleep_trampoline)), unsafe.Pointer(&usec)) +} +func usleep_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func usleep_no_g(usec uint32) { + asmcgocall_no_g(unsafe.Pointer(abi.FuncPCABI0(usleep_trampoline)), unsafe.Pointer(&usec)) +} + +//go:nosplit +//go:cgo_unsafe_args +func sysctl(mib *uint32, miblen uint32, out *byte, size *uintptr, dst *byte, ndst uintptr) int32 { + ret := 
libcCall(unsafe.Pointer(abi.FuncPCABI0(sysctl_trampoline)), unsafe.Pointer(&mib)) + KeepAlive(mib) + KeepAlive(out) + KeepAlive(size) + KeepAlive(dst) + return ret +} +func sysctl_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func fcntl(fd, cmd, arg int32) (ret int32, errno int32) { + args := struct { + fd, cmd, arg int32 + ret, errno int32 + }{fd, cmd, arg, 0, 0} + libcCall(unsafe.Pointer(abi.FuncPCABI0(fcntl_trampoline)), unsafe.Pointer(&args)) + return args.ret, args.errno +} +func fcntl_trampoline() + +//go:nosplit +func nanotime1() int64 { + var ts timespec + args := struct { + clock_id int32 + tp unsafe.Pointer + }{_CLOCK_MONOTONIC, unsafe.Pointer(&ts)} + if errno := libcCall(unsafe.Pointer(abi.FuncPCABI0(clock_gettime_trampoline)), unsafe.Pointer(&args)); errno < 0 { + // Avoid growing the nosplit stack. + systemstack(func() { + println("runtime: errno", -errno) + throw("clock_gettime failed") + }) + } + return ts.tv_sec*1e9 + int64(ts.tv_nsec) +} +func clock_gettime_trampoline() + +//go:nosplit +func walltime() (int64, int32) { + var ts timespec + args := struct { + clock_id int32 + tp unsafe.Pointer + }{_CLOCK_REALTIME, unsafe.Pointer(&ts)} + if errno := libcCall(unsafe.Pointer(abi.FuncPCABI0(clock_gettime_trampoline)), unsafe.Pointer(&args)); errno < 0 { + // Avoid growing the nosplit stack. 
+ systemstack(func() { + println("runtime: errno", -errno) + throw("clock_gettime failed") + }) + } + return ts.tv_sec, int32(ts.tv_nsec) +} + +//go:nosplit +//go:cgo_unsafe_args +func kqueue() int32 { + return libcCall(unsafe.Pointer(abi.FuncPCABI0(kqueue_trampoline)), nil) +} +func kqueue_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func kevent(kq int32, ch *keventt, nch int32, ev *keventt, nev int32, ts *timespec) int32 { + ret := libcCall(unsafe.Pointer(abi.FuncPCABI0(kevent_trampoline)), unsafe.Pointer(&kq)) + KeepAlive(ch) + KeepAlive(ev) + KeepAlive(ts) + return ret +} +func kevent_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigaction(sig uint32, new *sigactiont, old *sigactiont) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(sigaction_trampoline)), unsafe.Pointer(&sig)) + KeepAlive(new) + KeepAlive(old) +} +func sigaction_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigprocmask(how uint32, new *sigset, old *sigset) { + // sigprocmask is called from sigsave, which is called from needm. + // As such, we have to be able to run with no g here. + asmcgocall_no_g(unsafe.Pointer(abi.FuncPCABI0(sigprocmask_trampoline)), unsafe.Pointer(&how)) + KeepAlive(new) + KeepAlive(old) +} +func sigprocmask_trampoline() + +//go:nosplit +//go:cgo_unsafe_args +func sigaltstack(new *stackt, old *stackt) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(sigaltstack_trampoline)), unsafe.Pointer(&new)) + KeepAlive(new) + KeepAlive(old) +} +func sigaltstack_trampoline() + +// Not used on OpenBSD, but must be defined. +func exitThread(wait *atomic.Uint32) { + throw("exitThread") +} + +//go:nosplit +func closeonexec(fd int32) { + fcntl(fd, _F_SETFD, _FD_CLOEXEC) +} + +//go:cgo_unsafe_args +func issetugid() (ret int32) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(issetugid_trampoline)), unsafe.Pointer(&ret)) + return +} +func issetugid_trampoline() + +// Tell the linker that the libc_* functions are to be found +// in a system library, with the libc_ prefix missing. 
+ +//go:cgo_import_dynamic libc_errno __errno "libc.so" +//go:cgo_import_dynamic libc_exit exit "libc.so" +//go:cgo_import_dynamic libc_getthrid getthrid "libc.so" +//go:cgo_import_dynamic libc_sched_yield sched_yield "libc.so" +//go:cgo_import_dynamic libc_thrkill thrkill "libc.so" + +//go:cgo_import_dynamic libc_mmap mmap "libc.so" +//go:cgo_import_dynamic libc_munmap munmap "libc.so" +//go:cgo_import_dynamic libc_madvise madvise "libc.so" + +//go:cgo_import_dynamic libc_open open "libc.so" +//go:cgo_import_dynamic libc_close close "libc.so" +//go:cgo_import_dynamic libc_read read "libc.so" +//go:cgo_import_dynamic libc_write write "libc.so" +//go:cgo_import_dynamic libc_pipe2 pipe2 "libc.so" + +//go:cgo_import_dynamic libc_clock_gettime clock_gettime "libc.so" +//go:cgo_import_dynamic libc_setitimer setitimer "libc.so" +//go:cgo_import_dynamic libc_usleep usleep "libc.so" +//go:cgo_import_dynamic libc_sysctl sysctl "libc.so" +//go:cgo_import_dynamic libc_fcntl fcntl "libc.so" +//go:cgo_import_dynamic libc_getpid getpid "libc.so" +//go:cgo_import_dynamic libc_kill kill "libc.so" +//go:cgo_import_dynamic libc_kqueue kqueue "libc.so" +//go:cgo_import_dynamic libc_kevent kevent "libc.so" + +//go:cgo_import_dynamic libc_sigaction sigaction "libc.so" +//go:cgo_import_dynamic libc_sigaltstack sigaltstack "libc.so" + +//go:cgo_import_dynamic libc_issetugid issetugid "libc.so" + +//go:cgo_import_dynamic _ _ "libc.so" diff --git a/src/runtime/sys_openbsd3.go b/src/runtime/sys_openbsd3.go new file mode 100644 index 0000000..269bf86 --- /dev/null +++ b/src/runtime/sys_openbsd3.go @@ -0,0 +1,116 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build openbsd && !mips64 + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// The X versions of syscall expect the libc call to return a 64-bit result. 
+// Otherwise (the non-X version) expects a 32-bit result. +// This distinction is required because an error is indicated by returning -1, +// and we need to know whether to check 32 or 64 bits of the result. +// (Some libc functions that return 32 bits put junk in the upper 32 bits of AX.) + +//go:linkname syscall_syscall syscall.syscall +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall() + +//go:linkname syscall_syscallX syscall.syscallX +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscallX(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscallX)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscallX() + +//go:linkname syscall_syscall6 syscall.syscall6 +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall6(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall6() + +//go:linkname syscall_syscall6X syscall.syscall6X +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall6X(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6X)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall6X() + +//go:linkname syscall_syscall10 syscall.syscall10 +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall10(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10 uintptr) (r1, r2, err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall10)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall10() + +//go:linkname syscall_syscall10X syscall.syscall10X +//go:nosplit +//go:cgo_unsafe_args +func syscall_syscall10X(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10 uintptr) (r1, r2, 
err uintptr) { + entersyscall() + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall10X)), unsafe.Pointer(&fn)) + exitsyscall() + return +} +func syscall10X() + +//go:linkname syscall_rawSyscall syscall.rawSyscall +//go:nosplit +//go:cgo_unsafe_args +func syscall_rawSyscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall)), unsafe.Pointer(&fn)) + return +} + +//go:linkname syscall_rawSyscall6 syscall.rawSyscall6 +//go:nosplit +//go:cgo_unsafe_args +func syscall_rawSyscall6(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6)), unsafe.Pointer(&fn)) + return +} + +//go:linkname syscall_rawSyscall6X syscall.rawSyscall6X +//go:nosplit +//go:cgo_unsafe_args +func syscall_rawSyscall6X(fn, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall6X)), unsafe.Pointer(&fn)) + return +} + +//go:linkname syscall_rawSyscall10X syscall.rawSyscall10X +//go:nosplit +//go:cgo_unsafe_args +func syscall_rawSyscall10X(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10 uintptr) (r1, r2, err uintptr) { + libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall10X)), unsafe.Pointer(&fn)) + return +} diff --git a/src/runtime/sys_openbsd_386.s b/src/runtime/sys_openbsd_386.s new file mode 100644 index 0000000..6005c10 --- /dev/null +++ b/src/runtime/sys_openbsd_386.s @@ -0,0 +1,990 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for 386, OpenBSD +// System calls are implemented in libc/libpthread, this file +// contains trampolines that convert from Go to C calling convention. +// Some direct system call implementations currently remain. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_MONOTONIC $3 + +TEXT runtime·setldt(SB),NOSPLIT,$0 + // Nothing to do, pthread already set thread-local storage up. + RET + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. +// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$28 + NOP SP // tell vet SP changed - stop checking offsets + + // We are already on m's g0 stack. + + // Save callee-save registers. + MOVL BX, bx-4(SP) + MOVL BP, bp-8(SP) + MOVL SI, si-12(SP) + MOVL DI, di-16(SP) + + MOVL 32(SP), AX // m + MOVL m_g0(AX), DX + get_tls(CX) + MOVL DX, g(CX) + + CALL runtime·mstart(SB) + + // Restore callee-save registers. + MOVL di-16(SP), DI + MOVL si-12(SP), SI + MOVL bp-8(SP), BP + MOVL bx-4(SP), BX + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). + MOVL $0, AX + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVL fn+0(FP), AX + MOVL sig+4(FP), BX + MOVL info+8(FP), CX + MOVL ctx+12(FP), DX + MOVL SP, SI + SUBL $32, SP + ANDL $~15, SP // align stack: handler might be a C function + MOVL BX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + MOVL SI, 12(SP) // save SI: handler might be a Go function + CALL AX + MOVL 12(SP), AX + MOVL AX, SP + RET + +// Called by OS using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$28 + NOP SP // tell vet SP changed - stop checking offsets + // Save callee-saved C registers, since the caller may be a C signal handler. + MOVL BX, bx-4(SP) + MOVL BP, bp-8(SP) + MOVL SI, si-12(SP) + MOVL DI, di-16(SP) + // We don't save mxcsr or the x87 control word because sigtrampgo doesn't + // modify them. 
+ + MOVL 32(SP), BX // signo + MOVL BX, 0(SP) + MOVL 36(SP), BX // info + MOVL BX, 4(SP) + MOVL 40(SP), BX // context + MOVL BX, 8(SP) + CALL runtime·sigtrampgo(SB) + + MOVL di-16(SP), DI + MOVL si-12(SP), SI + MOVL bp-8(SP), BP + MOVL bx-4(SP), BX + RET + +// These trampolines help convert from Go calling convention to C calling convention. +// They should be called with asmcgocall - note that while asmcgocall does +// stack alignment, creation of a frame undoes it again. +// A pointer to the arguments is passed on the stack. +// A single int32 result is returned in AX. +// (For more results, make an args/results structure.) +TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $4, SP + MOVL 12(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL AX, 0(SP) // arg 1 - attr + CALL libc_pthread_attr_init(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·pthread_attr_destroy_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $4, SP + MOVL 12(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL AX, 0(SP) // arg 1 - attr + CALL libc_pthread_attr_destroy(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - attr + MOVL BX, 4(SP) // arg 2 - size + CALL libc_pthread_attr_getstacksize(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - attr + MOVL BX, 4(SP) // arg 2 - state + CALL libc_pthread_attr_setdetachstate(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $20, SP + MOVL 28(SP), DX // pointer to args + LEAL 16(SP), AX + MOVL AX, 0(SP) // arg 1 - &threadid (discarded) + MOVL 0(DX), AX + 
MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 4(SP) // arg 2 - attr + MOVL BX, 8(SP) // arg 3 - start + MOVL CX, 12(SP) // arg 4 - arg + CALL libc_pthread_create(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·thrkill_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - tid + MOVL BX, 4(SP) // arg 2 - signal + MOVL $0, 8(SP) // arg 3 - tcb + CALL libc_thrkill(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·thrsleep_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $20, SP + MOVL 28(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - id + MOVL BX, 4(SP) // arg 2 - clock_id + MOVL CX, 8(SP) // arg 3 - abstime + MOVL 12(DX), AX + MOVL 16(DX), BX + MOVL AX, 12(SP) // arg 4 - lock + MOVL BX, 16(SP) // arg 5 - abort + CALL libc_thrsleep(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·thrwakeup_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - id + MOVL BX, 4(SP) // arg 2 - count + CALL libc_thrwakeup(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·exit_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $4, SP + MOVL 12(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL AX, 0(SP) // arg 1 - status + CALL libc_exit(SB) + MOVL $0xf1, 0xf1 // crash on failure + MOVL BP, SP + POPL BP + RET + +TEXT runtime·getthrid_trampoline(SB),NOSPLIT,$0 + PUSHL BP + CALL libc_getthrid(SB) + NOP SP // tell vet SP changed - stop checking offsets + MOVL 8(SP), DX // pointer to return value + MOVL AX, 0(DX) + POPL BP + RET + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX + MOVL 0(DX), BX + CALL libc_getpid(SB) + MOVL AX, 0(SP) // arg 1 - pid + MOVL BX, 4(SP) // arg 2 - signal + CALL libc_kill(SB) + MOVL BP, SP + POPL BP + RET + +TEXT 
runtime·sched_yield_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + CALL libc_sched_yield(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $32, SP + MOVL 40(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - addr + MOVL BX, 4(SP) // arg 2 - len + MOVL CX, 8(SP) // arg 3 - prot + MOVL 12(DX), AX + MOVL 16(DX), BX + MOVL 20(DX), CX + MOVL AX, 12(SP) // arg 4 - flags + MOVL BX, 16(SP) // arg 5 - fid + MOVL $0, 20(SP) // pad + MOVL CX, 24(SP) // arg 6 - offset (low 32 bits) + MOVL $0, 28(SP) // offset (high 32 bits) + CALL libc_mmap(SB) + MOVL $0, BX + CMPL AX, $-1 + JNE ok + CALL libc_errno(SB) + MOVL (AX), BX + MOVL $0, AX +ok: + MOVL 40(SP), DX + MOVL AX, 24(DX) + MOVL BX, 28(DX) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - addr + MOVL BX, 4(SP) // arg 2 - len + CALL libc_munmap(SB) + CMPL AX, $-1 + JNE 2(PC) + MOVL $0xf1, 0xf1 // crash on failure + MOVL BP, SP + POPL BP + RET + +TEXT runtime·madvise_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - addr + MOVL BX, 4(SP) // arg 2 - len + MOVL CX, 8(SP) // arg 3 - advice + CALL libc_madvise(SB) + // ignore failure - maybe pages are locked + MOVL BP, SP + POPL BP + RET + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $16, SP + MOVL 24(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - path + MOVL BX, 4(SP) // arg 2 - flags + MOVL CX, 8(SP) // arg 3 - mode + MOVL $0, 12(SP) // vararg + CALL libc_open(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $4, 
SP + MOVL 12(SP), DX + MOVL 0(DX), AX + MOVL AX, 0(SP) // arg 1 - fd + CALL libc_close(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - fd + MOVL BX, 4(SP) // arg 2 - buf + MOVL CX, 8(SP) // arg 3 - count + CALL libc_read(SB) + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno +noerr: + MOVL BP, SP + POPL BP + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - fd + MOVL BX, 4(SP) // arg 2 - buf + MOVL CX, 8(SP) // arg 3 - count + CALL libc_write(SB) + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno +noerr: + MOVL BP, SP + POPL BP + RET + +TEXT runtime·pipe2_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - fds + MOVL BX, 4(SP) // arg 2 - flags + CALL libc_pipe2(SB) + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno +noerr: + MOVL BP, SP + POPL BP + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - which + MOVL BX, 4(SP) // arg 2 - new + MOVL CX, 8(SP) // arg 3 - old + CALL libc_setitimer(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $4, SP + MOVL 12(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL AX, 0(SP) + CALL libc_usleep(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + PUSHL BP 
+ MOVL SP, BP + SUBL $24, SP + MOVL 32(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - name + MOVL BX, 4(SP) // arg 2 - namelen + MOVL CX, 8(SP) // arg 3 - old + MOVL 12(DX), AX + MOVL 16(DX), BX + MOVL 20(DX), CX + MOVL AX, 12(SP) // arg 4 - oldlenp + MOVL BX, 16(SP) // arg 5 - newp + MOVL CX, 20(SP) // arg 6 - newlen + CALL libc_sysctl(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + CALL libc_kqueue(SB) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $24, SP + MOVL 32(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - kq + MOVL BX, 4(SP) // arg 2 - ch + MOVL CX, 8(SP) // arg 3 - nch + MOVL 12(DX), AX + MOVL 16(DX), BX + MOVL 20(DX), CX + MOVL AX, 12(SP) // arg 4 - ev + MOVL BX, 16(SP) // arg 5 - nev + MOVL CX, 20(SP) // arg 6 - ts + CALL libc_kevent(SB) + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno +noerr: + MOVL BP, SP + POPL BP + RET + +TEXT runtime·clock_gettime_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - clock_id + MOVL BX, 4(SP) // arg 2 - tp + CALL libc_clock_gettime(SB) + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), AX + NEGL AX // caller expects negative errno +noerr: + MOVL BP, SP + POPL BP + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $16, SP + MOVL 24(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - fd + MOVL BX, 4(SP) // arg 2 - cmd + MOVL CX, 8(SP) // arg 3 - arg + MOVL $0, 12(SP) // vararg + CALL libc_fcntl(SB) + MOVL $0, BX + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), BX + MOVL $-1, AX +noerr: + MOVL
24(SP), DX // pointer to args + MOVL AX, 12(DX) + MOVL BX, 16(DX) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - sig + MOVL BX, 4(SP) // arg 2 - new + MOVL CX, 8(SP) // arg 3 - old + CALL libc_sigaction(SB) + CMPL AX, $-1 + JNE 2(PC) + MOVL $0xf1, 0xf1 // crash on failure + MOVL BP, SP + POPL BP + RET + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $12, SP + MOVL 20(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL 8(DX), CX + MOVL AX, 0(SP) // arg 1 - how + MOVL BX, 4(SP) // arg 2 - new + MOVL CX, 8(SP) // arg 3 - old + CALL libc_pthread_sigmask(SB) + CMPL AX, $-1 + JNE 2(PC) + MOVL $0xf1, 0xf1 // crash on failure + MOVL BP, SP + POPL BP + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + SUBL $8, SP + MOVL 16(SP), DX // pointer to args + MOVL 0(DX), AX + MOVL 4(DX), BX + MOVL AX, 0(SP) // arg 1 - new + MOVL BX, 4(SP) // arg 2 - old + CALL libc_sigaltstack(SB) + CMPL AX, $-1 + JNE 2(PC) + MOVL $0xf1, 0xf1 // crash on failure + MOVL BP, SP + POPL BP + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. 
+TEXT runtime·syscall(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $12, SP + MOVL 20(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (4*4)(BX) // r1 + MOVL DX, (5*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (6*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscallX is like syscall but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscallX(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $12, SP + MOVL 20(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (4*4)(BX) // r1 + MOVL DX, (5*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + CMPL DX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (6*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +// syscall6 calls a function in libc on behalf of the syscall package. 
+// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall6(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $24, SP + MOVL 32(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + MOVL (4*4)(BX), AX + MOVL (5*4)(BX), CX + MOVL (6*4)(BX), DX + MOVL AX, (3*4)(SP) // a4 + MOVL CX, (4*4)(SP) // a5 + MOVL DX, (5*4)(SP) // a6 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (7*4)(BX) // r1 + MOVL DX, (8*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (9*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +// syscall6X calls a function in libc on behalf of the syscall package. +// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6X is like syscall6 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. 
+TEXT runtime·syscall6X(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $24, SP + MOVL 32(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + MOVL (4*4)(BX), AX + MOVL (5*4)(BX), CX + MOVL (6*4)(BX), DX + MOVL AX, (3*4)(SP) // a4 + MOVL CX, (4*4)(SP) // a5 + MOVL DX, (5*4)(SP) // a6 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (7*4)(BX) // r1 + MOVL DX, (8*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + CMPL DX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (9*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +// syscall10 calls a function in libc on behalf of the syscall package. +// syscall10 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10 must be called on the g0 stack with the +// C calling convention (use libcCall). 
+TEXT runtime·syscall10(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $40, SP + MOVL 48(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + MOVL (4*4)(BX), AX + MOVL (5*4)(BX), CX + MOVL (6*4)(BX), DX + MOVL AX, (3*4)(SP) // a4 + MOVL CX, (4*4)(SP) // a5 + MOVL DX, (5*4)(SP) // a6 + MOVL (7*4)(BX), AX + MOVL (8*4)(BX), CX + MOVL (9*4)(BX), DX + MOVL AX, (6*4)(SP) // a7 + MOVL CX, (7*4)(SP) // a8 + MOVL DX, (8*4)(SP) // a9 + MOVL (10*4)(BX), AX + MOVL AX, (9*4)(SP) // a10 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (11*4)(BX) // r1 + MOVL DX, (12*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (13*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +// syscall10X calls a function in libc on behalf of the syscall package. +// syscall10X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall10X is like syscall10 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error.
+TEXT runtime·syscall10X(SB),NOSPLIT,$0 + PUSHL BP + MOVL SP, BP + + SUBL $40, SP + MOVL 48(SP), BX // pointer to args + + MOVL (1*4)(BX), AX + MOVL (2*4)(BX), CX + MOVL (3*4)(BX), DX + MOVL AX, (0*4)(SP) // a1 + MOVL CX, (1*4)(SP) // a2 + MOVL DX, (2*4)(SP) // a3 + MOVL (4*4)(BX), AX + MOVL (5*4)(BX), CX + MOVL (6*4)(BX), DX + MOVL AX, (3*4)(SP) // a4 + MOVL CX, (4*4)(SP) // a5 + MOVL DX, (5*4)(SP) // a6 + MOVL (7*4)(BX), AX + MOVL (8*4)(BX), CX + MOVL (9*4)(BX), DX + MOVL AX, (6*4)(SP) // a7 + MOVL CX, (7*4)(SP) // a8 + MOVL DX, (8*4)(SP) // a9 + MOVL (10*4)(BX), AX + MOVL AX, (9*4)(SP) // a10 + + MOVL (0*4)(BX), AX // fn + CALL AX + + MOVL AX, (11*4)(BX) // r1 + MOVL DX, (12*4)(BX) // r2 + + // Standard libc functions return -1 on error and set errno. + CMPL AX, $-1 + JNE ok + CMPL DX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVL (AX), AX + MOVW AX, (13*4)(BX) // err + +ok: + MOVL $0, AX // no error (it's ignored anyway) + MOVL BP, SP + POPL BP + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + PUSHL BP + CALL libc_issetugid(SB) + NOP SP // tell vet SP changed - stop checking offsets + MOVL 8(SP), DX // pointer to return value + MOVL AX, 0(DX) + POPL BP + RET diff --git a/src/runtime/sys_openbsd_amd64.s b/src/runtime/sys_openbsd_amd64.s new file mode 100644 index 0000000..1177bc1 --- /dev/null +++ b/src/runtime/sys_openbsd_amd64.s @@ -0,0 +1,792 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for AMD64, OpenBSD. +// System calls are implemented in libc/libpthread, this file +// contains trampolines that convert from Go to C calling convention. +// Some direct system call implementations currently remain. 
+// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_amd64.h" + +#define CLOCK_MONOTONIC $3 + +TEXT runtime·settls(SB),NOSPLIT,$0 + // Nothing to do, pthread already set thread-local storage up. + RET + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. +// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$0 + // DI points to the m. + // We are already on m's g0 stack. + + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Load g and save to TLS entry. + // See cmd/link/internal/ld/sym.go:computeTLSOffset. + MOVQ m_g0(DI), DX // g + MOVQ DX, -8(FS) + + CALL runtime·mstart(SB) + + POP_REGS_HOST_TO_ABI0() + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). + XORL AX, AX + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called using C ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Transition from C ABI to Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Set up ABIInternal environment: g in R14, cleared X15. + get_tls(R12) + MOVQ g(R12), R14 + PXOR X15, X15 + + // Reserve space for spill slots. + NOP SP // disable vet stack checking + ADJSP $24 + + // Call into the Go signal handler + MOVQ DI, AX // sig + MOVQ SI, BX // info + MOVQ DX, CX // ctx + CALL ·sigtrampgo<ABIInternal>(SB) + + ADJSP $-24 + + POP_REGS_HOST_TO_ABI0() + RET + +// +// These trampolines help convert from Go calling convention to C calling convention. +// They should be called with asmcgocall. +// A pointer to the arguments is passed in DI. +// A single int32 result is returned in AX. 
+// (For more results, make an args/results structure.) +TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 - attr + CALL libc_pthread_attr_init(SB) + POPQ BP + RET + +TEXT runtime·pthread_attr_destroy_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 0(DI), DI // arg 1 - attr + CALL libc_pthread_attr_destroy(SB) + POPQ BP + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 - stacksize + MOVQ 0(DI), DI // arg 1 - attr + CALL libc_pthread_attr_getstacksize(SB) + POPQ BP + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 - detachstate + MOVQ 0(DI), DI // arg 1 - attr + CALL libc_pthread_attr_setdetachstate(SB) + POPQ BP + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ 0(DI), SI // arg 2 - attr + MOVQ 8(DI), DX // arg 3 - start + MOVQ 16(DI), CX // arg 4 - arg + MOVQ SP, DI // arg 1 - &thread (discarded) + CALL libc_pthread_create(SB) + MOVQ BP, SP + POPQ BP + RET + +TEXT runtime·thrkill_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 - signal + MOVQ $0, DX // arg 3 - tcb + MOVL 0(DI), DI // arg 1 - tid + CALL libc_thrkill(SB) + POPQ BP + RET + +TEXT runtime·thrsleep_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 - clock_id + MOVQ 16(DI), DX // arg 3 - abstime + MOVQ 24(DI), CX // arg 4 - lock + MOVQ 32(DI), R8 // arg 5 - abort + MOVQ 0(DI), DI // arg 1 - id + CALL libc_thrsleep(SB) + POPQ BP + RET + +TEXT runtime·thrwakeup_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 - count + MOVQ 0(DI), DI // arg 1 - id + CALL libc_thrwakeup(SB) + POPQ BP + RET + +TEXT runtime·exit_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 exit status + CALL libc_exit(SB) + MOVL $0xf1, 0xf1 // crash + 
POPQ BP + RET + +TEXT runtime·getthrid_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ DI, BX // BX is callee-save + CALL libc_getthrid(SB) + MOVL AX, 0(BX) // return value + POPQ BP + RET + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), BX // signal + CALL libc_getpid(SB) + MOVL AX, DI // arg 1 pid + MOVL BX, SI // arg 2 signal + CALL libc_kill(SB) + POPQ BP + RET + +TEXT runtime·sched_yield_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + CALL libc_sched_yield(SB) + POPQ BP + RET + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + PUSHQ BP // make a frame; keep stack aligned + MOVQ SP, BP + MOVQ DI, BX + MOVQ 0(BX), DI // arg 1 addr + MOVQ 8(BX), SI // arg 2 len + MOVL 16(BX), DX // arg 3 prot + MOVL 20(BX), CX // arg 4 flags + MOVL 24(BX), R8 // arg 5 fid + MOVL 28(BX), R9 // arg 6 offset + CALL libc_mmap(SB) + XORL DX, DX + CMPQ AX, $-1 + JNE ok + CALL libc_errno(SB) + MOVLQSX (AX), DX // errno + XORQ AX, AX +ok: + MOVQ AX, 32(BX) + MOVQ DX, 40(BX) + POPQ BP + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 len + MOVQ 0(DI), DI // arg 1 addr + CALL libc_munmap(SB) + TESTQ AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·madvise_trampoline(SB), NOSPLIT, $0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 len + MOVL 16(DI), DX // arg 3 advice + MOVQ 0(DI), DI // arg 1 addr + CALL libc_madvise(SB) + // ignore failure - maybe pages are locked + POPQ BP + RET + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 - flags + MOVL 12(DI), DX // arg 3 - mode + MOVQ 0(DI), DI // arg 1 - path + XORL AX, AX // vararg: say "no float args" + CALL libc_open(SB) + POPQ BP + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 - fd + CALL libc_close(SB) + POPQ BP + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ
8(DI), SI // arg 2 - buf + MOVL 16(DI), DX // arg 3 - count + MOVL 0(DI), DI // arg 1 - fd + CALL libc_read(SB) + TESTL AX, AX + JGE noerr + CALL libc_errno(SB) + MOVL (AX), AX // errno + NEGL AX // caller expects negative errno value +noerr: + POPQ BP + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 buf + MOVL 16(DI), DX // arg 3 count + MOVL 0(DI), DI // arg 1 fd + CALL libc_write(SB) + TESTL AX, AX + JGE noerr + CALL libc_errno(SB) + MOVL (AX), AX // errno + NEGL AX // caller expects negative errno value +noerr: + POPQ BP + RET + +TEXT runtime·pipe2_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 flags + MOVQ 0(DI), DI // arg 1 filedes + CALL libc_pipe2(SB) + TESTL AX, AX + JEQ 3(PC) + CALL libc_errno(SB) + MOVL (AX), AX // errno + NEGL AX // caller expects negative errno value + POPQ BP + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 which + CALL libc_setitimer(SB) + POPQ BP + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 0(DI), DI // arg 1 usec + CALL libc_usleep(SB) + POPQ BP + RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVL 8(DI), SI // arg 2 miblen + MOVQ 16(DI), DX // arg 3 out + MOVQ 24(DI), CX // arg 4 size + MOVQ 32(DI), R8 // arg 5 dst + MOVQ 40(DI), R9 // arg 6 ndst + MOVQ 0(DI), DI // arg 1 mib + CALL libc_sysctl(SB) + POPQ BP + RET + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + CALL libc_kqueue(SB) + POPQ BP + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 keventt + MOVL 16(DI), DX // arg 3 nch + MOVQ 24(DI), CX // arg 4 ev + MOVL 32(DI), R8 // arg 5 nev + MOVQ 40(DI), R9 // arg 6 ts + MOVL 0(DI), DI // arg 1 kq + CALL libc_kevent(SB) + CMPL AX, $-1 + JNE ok + CALL libc_errno(SB) + MOVL 
(AX), AX // errno + NEGL AX // caller expects negative errno value +ok: + POPQ BP + RET + +TEXT runtime·clock_gettime_trampoline(SB),NOSPLIT,$0 + PUSHQ BP // make a frame; keep stack aligned + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 tp + MOVL 0(DI), DI // arg 1 clock_id + CALL libc_clock_gettime(SB) + TESTL AX, AX + JEQ noerr + CALL libc_errno(SB) + MOVL (AX), AX // errno + NEGL AX // caller expects negative errno value +noerr: + POPQ BP + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ DI, BX + MOVL 0(BX), DI // arg 1 fd + MOVL 4(BX), SI // arg 2 cmd + MOVL 8(BX), DX // arg 3 arg + XORL AX, AX // vararg: say "no float args" + CALL libc_fcntl(SB) + XORL DX, DX + CMPL AX, $-1 + JNE noerr + CALL libc_errno(SB) + MOVL (AX), DX + MOVL $-1, AX +noerr: + MOVL AX, 12(BX) + MOVL DX, 16(BX) + POPQ BP + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 sig + CALL libc_sigaction(SB) + TESTL AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 new + MOVQ 16(DI), DX // arg 3 old + MOVL 0(DI), DI // arg 1 how + CALL libc_pthread_sigmask(SB) + TESTL AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + MOVQ 8(DI), SI // arg 2 old + MOVQ 0(DI), DI // arg 1 new + CALL libc_sigaltstack(SB) + TESTQ AX, AX + JEQ 2(PC) + MOVL $0xf1, 0xf1 // crash + POPQ BP + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). 
+// +// syscall expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), CX // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL CX + + MOVQ (SP), DI + MOVQ AX, (4*8)(DI) // r1 + MOVQ DX, (5*8)(DI) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPL AX, $-1 // Note: high 32 bits are junk + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (6*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscallX is like syscall but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscallX(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), CX // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL CX + + MOVQ (SP), DI + MOVQ AX, (4*8)(DI) // r1 + MOVQ DX, (5*8)(DI) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPQ AX, $-1 + JNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (6*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall6 calls a function in libc on behalf of the syscall package. 
+// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall6(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R11// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (SP), DI + MOVQ AX, (7*8)(DI) // r1 + MOVQ DX, (8*8)(DI) // r2 + + CMPL AX, $-1 + JNE ok + + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (9*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall6X calls a function in libc on behalf of the syscall package. +// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6X is like syscall6 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. 
+TEXT runtime·syscall6X(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $16, SP + MOVQ (0*8)(DI), R11// fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (SP), DI + MOVQ AX, (7*8)(DI) // r1 + MOVQ DX, (8*8)(DI) // r2 + + CMPQ AX, $-1 + JNE ok + + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (SP), DI + MOVQ AX, (9*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall10 calls a function in libc on behalf of the syscall package. +// syscall10 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10 must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscall10(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $48, SP + + // Arguments a1 to a6 get passed in registers, with a7 onwards being + // passed via the stack per the x86-64 System V ABI + // (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf). 
+ MOVQ (7*8)(DI), R10 // a7 + MOVQ (8*8)(DI), R11 // a8 + MOVQ (9*8)(DI), R12 // a9 + MOVQ (10*8)(DI), R13 // a10 + MOVQ R10, (0*8)(SP) // a7 + MOVQ R11, (1*8)(SP) // a8 + MOVQ R12, (2*8)(SP) // a9 + MOVQ R13, (3*8)(SP) // a10 + MOVQ (0*8)(DI), R11 // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (4*8)(SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (4*8)(SP), DI + MOVQ AX, (11*8)(DI) // r1 + MOVQ DX, (12*8)(DI) // r2 + + CMPL AX, $-1 + JNE ok + + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (4*8)(SP), DI + MOVQ AX, (13*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +// syscall10X calls a function in libc on behalf of the syscall package. +// syscall10X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall10X is like syscall10 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscall10X(SB),NOSPLIT,$0 + PUSHQ BP + MOVQ SP, BP + SUBQ $48, SP + + // Arguments a1 to a6 get passed in registers, with a7 onwards being + // passed via the stack per the x86-64 System V ABI + // (https://github.com/hjl-tools/x86-psABI/wiki/x86-64-psABI-1.0.pdf). 
+ MOVQ (7*8)(DI), R10 // a7 + MOVQ (8*8)(DI), R11 // a8 + MOVQ (9*8)(DI), R12 // a9 + MOVQ (10*8)(DI), R13 // a10 + MOVQ R10, (0*8)(SP) // a7 + MOVQ R11, (1*8)(SP) // a8 + MOVQ R12, (2*8)(SP) // a9 + MOVQ R13, (3*8)(SP) // a10 + MOVQ (0*8)(DI), R11 // fn + MOVQ (2*8)(DI), SI // a2 + MOVQ (3*8)(DI), DX // a3 + MOVQ (4*8)(DI), CX // a4 + MOVQ (5*8)(DI), R8 // a5 + MOVQ (6*8)(DI), R9 // a6 + MOVQ DI, (4*8)(SP) + MOVQ (1*8)(DI), DI // a1 + XORL AX, AX // vararg: say "no float args" + + CALL R11 + + MOVQ (4*8)(SP), DI + MOVQ AX, (11*8)(DI) // r1 + MOVQ DX, (12*8)(DI) // r2 + + CMPQ AX, $-1 + JNE ok + + CALL libc_errno(SB) + MOVLQSX (AX), AX + MOVQ (4*8)(SP), DI + MOVQ AX, (13*8)(DI) // err + +ok: + XORL AX, AX // no error (it's ignored anyway) + MOVQ BP, SP + POPQ BP + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + MOVQ DI, BX // BX is caller-save + CALL libc_issetugid(SB) + MOVL AX, 0(BX) // return value + RET diff --git a/src/runtime/sys_openbsd_arm.s b/src/runtime/sys_openbsd_arm.s new file mode 100644 index 0000000..61b901b --- /dev/null +++ b/src/runtime/sys_openbsd_arm.s @@ -0,0 +1,827 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for ARM, OpenBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME $0 +#define CLOCK_MONOTONIC $3 + +// With OpenBSD 6.7 onwards, an armv7 syscall returns two instructions +// after the SWI instruction, to allow for a speculative execution +// barrier to be placed after the SWI without impacting performance. +// For now use hardware no-ops as this works with both older and newer +// kernels. After OpenBSD 6.8 is released this should be changed to +// speculation barriers. 
+#define NOOP MOVW R0, R0 +#define INVOKE_SYSCALL \ + SWI $0; \ + NOOP; \ + NOOP + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. +// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$0 + // R0 points to the m. + // We are already on m's g0 stack. + + // Save callee-save registers. + MOVM.DB.W [R4-R11], (R13) + + MOVW m_g0(R0), g + BL runtime·save_g(SB) + + BL runtime·mstart(SB) + + // Restore callee-save registers. + MOVM.IA.W (R13), [R4-R11] + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). + MOVW $0, R0 + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-16 + MOVW sig+4(FP), R0 + MOVW info+8(FP), R1 + MOVW ctx+12(FP), R2 + MOVW fn+0(FP), R3 + MOVW R13, R9 + SUB $24, R13 + BIC $0x7, R13 // alignment for ELF ABI + BL (R3) + MOVW R9, R13 + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Reserve space for callee-save registers and arguments. + MOVM.DB.W [R4-R11], (R13) + SUB $16, R13 + + // If called from an external code context, g will not be set. + // Save R0, since runtime·load_g will clobber it. + MOVW R0, 4(R13) // signum + BL runtime·load_g(SB) + + MOVW R1, 8(R13) + MOVW R2, 12(R13) + BL runtime·sigtrampgo(SB) + + // Restore callee-save registers. + ADD $16, R13 + MOVM.IA.W (R13), [R4-R11] + + RET + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + B runtime·armPublicationBarrier(SB) + +// TODO(jsing): OpenBSD only supports GOARM=7 machines... this +// should not be needed, however the linker still allows GOARM=5 +// on this platform. +TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + MOVM.WP [R1, R2, R3, R12], (R13) + MOVW $330, R12 // sys___get_tcb + INVOKE_SYSCALL + MOVM.IAW (R13), [R1, R2, R3, R12] + RET + +// These trampolines help convert from Go calling convention to C calling convention. 
+// They should be called with asmcgocall - note that while asmcgocall does +// stack alignment, creation of a frame undoes it again. +// A pointer to the arguments is passed in R0. +// A single int32 result is returned in R0. +// (For more results, make an args/results structure.) +TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R0 // arg 1 attr + CALL libc_pthread_attr_init(SB) + MOVW R9, R13 + RET + +TEXT runtime·pthread_attr_destroy_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R0 // arg 1 attr + CALL libc_pthread_attr_destroy(SB) + MOVW R9, R13 + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 size + MOVW 0(R0), R0 // arg 1 attr + CALL libc_pthread_attr_getstacksize(SB) + MOVW R9, R13 + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 state + MOVW 0(R0), R0 // arg 1 attr + CALL libc_pthread_attr_setdetachstate(SB) + MOVW R9, R13 + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $16, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R1 // arg 2 attr + MOVW 4(R0), R2 // arg 3 start + MOVW 8(R0), R3 // arg 4 arg + MOVW R13, R0 // arg 1 &threadid (discarded) + CALL libc_pthread_create(SB) + MOVW R9, R13 + RET + +TEXT runtime·thrkill_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 - signal + MOVW $0, R2 // arg 3 - tcb + MOVW 0(R0), R0 // arg 1 - tid + CALL libc_thrkill(SB) + MOVW R9, R13 + RET + +TEXT runtime·thrsleep_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $16, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 - clock_id + MOVW 8(R0), R2 // arg 3 - abstime + MOVW 12(R0), R3 // arg 4 - lock + MOVW 16(R0), R4 // arg 5 - abort (on 
stack) + MOVW R4, 0(R13) + MOVW 0(R0), R0 // arg 1 - id + CALL libc_thrsleep(SB) + MOVW R9, R13 + RET + +TEXT runtime·thrwakeup_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 - count + MOVW 0(R0), R0 // arg 1 - id + CALL libc_thrwakeup(SB) + MOVW R9, R13 + RET + +TEXT runtime·exit_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R0 // arg 1 exit status + BL libc_exit(SB) + MOVW $0, R8 // crash on failure + MOVW R8, (R8) + MOVW R9, R13 + RET + +TEXT runtime·getthrid_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + MOVW R0, R8 + BIC $0x7, R13 // align for ELF ABI + BL libc_getthrid(SB) + MOVW R0, 0(R8) + MOVW R9, R13 + RET + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW R0, R4 + BL libc_getpid(SB) // arg 1 pid + MOVW R4, R1 // arg 2 signal + BL libc_kill(SB) + MOVW R9, R13 + RET + +TEXT runtime·sched_yield_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + BL libc_sched_yield(SB) + MOVW R9, R13 + RET + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $16, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW R0, R8 + MOVW 4(R0), R1 // arg 2 len + MOVW 8(R0), R2 // arg 3 prot + MOVW 12(R0), R3 // arg 4 flags + MOVW 16(R0), R4 // arg 5 fid (on stack) + MOVW R4, 0(R13) + MOVW $0, R5 // pad (on stack) + MOVW R5, 4(R13) + MOVW 20(R0), R6 // arg 6 offset (on stack) + MOVW R6, 8(R13) // low 32 bits + MOVW $0, R7 + MOVW R7, 12(R13) // high 32 bits + MOVW 0(R0), R0 // arg 1 addr + BL libc_mmap(SB) + MOVW $0, R1 + CMP $-1, R0 + BNE ok + BL libc_errno(SB) + MOVW (R0), R1 // errno + MOVW $0, R0 +ok: + MOVW R0, 24(R8) + MOVW R1, 28(R8) + MOVW R9, R13 + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 len + MOVW 0(R0), R0 // arg 1 addr + BL libc_munmap(SB) + CMP $-1, R0 + BNE 3(PC) + MOVW $0, R8 // crash on 
failure + MOVW R8, (R8) + MOVW R9, R13 + RET + +TEXT runtime·madvise_trampoline(SB), NOSPLIT, $0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 len + MOVW 8(R0), R2 // arg 3 advice + MOVW 0(R0), R0 // arg 1 addr + BL libc_madvise(SB) + // ignore failure - maybe pages are locked + MOVW R9, R13 + RET + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 - flags + MOVW 8(R0), R2 // arg 3 - mode (vararg, on stack) + MOVW R2, 0(R13) + MOVW 0(R0), R0 // arg 1 - path + MOVW R13, R4 + BIC $0x7, R13 // align for ELF ABI + BL libc_open(SB) + MOVW R9, R13 + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R0 // arg 1 - fd + BL libc_close(SB) + MOVW R9, R13 + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 - buf + MOVW 8(R0), R2 // arg 3 - count + MOVW 0(R0), R0 // arg 1 - fd + BL libc_read(SB) + CMP $-1, R0 + BNE noerr + BL libc_errno(SB) + MOVW (R0), R0 // errno + RSB.CS $0, R0 // caller expects negative errno +noerr: + MOVW R9, R13 + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 buf + MOVW 8(R0), R2 // arg 3 count + MOVW 0(R0), R0 // arg 1 fd + BL libc_write(SB) + CMP $-1, R0 + BNE noerr + BL libc_errno(SB) + MOVW (R0), R0 // errno + RSB.CS $0, R0 // caller expects negative errno +noerr: + MOVW R9, R13 + RET + +TEXT runtime·pipe2_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 flags + MOVW 0(R0), R0 // arg 1 filedes + BL libc_pipe2(SB) + CMP $-1, R0 + BNE 3(PC) + BL libc_errno(SB) + MOVW (R0), R0 // errno + RSB.CS $0, R0 // caller expects negative errno + MOVW R9, R13 + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI 
+ MOVW 4(R0), R1 // arg 2 new + MOVW 8(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 which + BL libc_setitimer(SB) + MOVW R9, R13 + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 0(R0), R0 // arg 1 usec + BL libc_usleep(SB) + MOVW R9, R13 + RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 miblen + MOVW 8(R0), R2 // arg 3 out + MOVW 12(R0), R3 // arg 4 size + MOVW 16(R0), R4 // arg 5 dst (on stack) + MOVW R4, 0(R13) + MOVW 20(R0), R5 // arg 6 ndst (on stack) + MOVW R5, 4(R13) + MOVW 0(R0), R0 // arg 1 mib + BL libc_sysctl(SB) + MOVW R9, R13 + RET + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + BL libc_kqueue(SB) + MOVW R9, R13 + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 keventt + MOVW 8(R0), R2 // arg 3 nch + MOVW 12(R0), R3 // arg 4 ev + MOVW 16(R0), R4 // arg 5 nev (on stack) + MOVW R4, 0(R13) + MOVW 20(R0), R5 // arg 6 ts (on stack) + MOVW R5, 4(R13) + MOVW 0(R0), R0 // arg 1 kq + BL libc_kevent(SB) + CMP $-1, R0 + BNE ok + BL libc_errno(SB) + MOVW (R0), R0 // errno + RSB.CS $0, R0 // caller expects negative errno +ok: + MOVW R9, R13 + RET + +TEXT runtime·clock_gettime_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 tp + MOVW 0(R0), R0 // arg 1 clock_id + BL libc_clock_gettime(SB) + CMP $-1, R0 + BNE noerr + BL libc_errno(SB) + MOVW (R0), R0 // errno + RSB.CS $0, R0 // caller expects negative errno +noerr: + MOVW R9, R13 + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + MOVW R0, R8 + MOVW 0(R8), R0 // arg 1 fd + MOVW 4(R8), R1 // arg 2 cmd + MOVW 8(R8), R2 // arg 3 arg (vararg, on stack) + MOVW R2, 0(R13) + BL 
libc_fcntl(SB) + MOVW $0, R1 + CMP $-1, R0 + BNE noerr + BL libc_errno(SB) + MOVW (R0), R1 + MOVW $-1, R0 +noerr: + MOVW R0, 12(R8) + MOVW R1, 16(R8) + MOVW R9, R13 + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 new + MOVW 8(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 sig + BL libc_sigaction(SB) + CMP $-1, R0 + BNE 3(PC) + MOVW $0, R8 // crash on failure + MOVW R8, (R8) + MOVW R9, R13 + RET + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 new + MOVW 8(R0), R2 // arg 3 old + MOVW 0(R0), R0 // arg 1 how + BL libc_pthread_sigmask(SB) + CMP $-1, R0 + BNE 3(PC) + MOVW $0, R8 // crash on failure + MOVW R8, (R8) + MOVW R9, R13 + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + MOVW 4(R0), R1 // arg 2 old + MOVW 0(R0), R0 // arg 1 new + BL libc_sigaltstack(SB) + CMP $-1, R0 + BNE 3(PC) + MOVW $0, R8 // crash on failure + MOVW R8, (R8) + MOVW R9, R13 + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + + BL (R7) + + MOVW R0, (4*4)(R8) // r1 + MOVW R1, (5*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. 
+ BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (6*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscallX is like syscall but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscallX(SB),NOSPLIT,$0 + MOVW R13, R9 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + + BL (R7) + + MOVW R0, (4*4)(R8) // r1 + MOVW R1, (5*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + CMP $-1, R1 + BNE ok + + // Get error code from libc. + BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (6*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +// syscall6 calls a function in libc on behalf of the syscall package. +// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. 
+TEXT runtime·syscall6(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + MOVW (4*4)(R8), R3 // a4 + MOVW (5*4)(R8), R4 // a5 + MOVW R4, 0(R13) + MOVW (6*4)(R8), R5 // a6 + MOVW R5, 4(R13) + + BL (R7) + + MOVW R0, (7*4)(R8) // r1 + MOVW R1, (8*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. + BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (9*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +// syscall6X calls a function in libc on behalf of the syscall package. +// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6X is like syscall6 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscall6X(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $8, R13 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + MOVW (4*4)(R8), R3 // a4 + MOVW (5*4)(R8), R4 // a5 + MOVW R4, 0(R13) + MOVW (6*4)(R8), R5 // a6 + MOVW R5, 4(R13) + + BL (R7) + + MOVW R0, (7*4)(R8) // r1 + MOVW R1, (8*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + CMP $-1, R1 + BNE ok + + // Get error code from libc. + BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (9*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +// syscall10 calls a function in libc on behalf of the syscall package. 
+// syscall10 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10 must be called on the g0 stack with the +// C calling convention (use libcCall). +TEXT runtime·syscall10(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $24, R13 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + MOVW (4*4)(R8), R3 // a4 + MOVW (5*4)(R8), R4 // a5 + MOVW R4, 0(R13) + MOVW (6*4)(R8), R5 // a6 + MOVW R5, 4(R13) + MOVW (7*4)(R8), R6 // a7 + MOVW R6, 8(R13) + MOVW (8*4)(R8), R4 // a8 + MOVW R4, 12(R13) + MOVW (9*4)(R8), R5 // a9 + MOVW R5, 16(R13) + MOVW (10*4)(R8), R6 // a10 + MOVW R6, 20(R13) + + BL (R7) + + MOVW R0, (11*4)(R8) // r1 + MOVW R1, (12*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. + BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (13*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +// syscall10X calls a function in libc on behalf of the syscall package. +// syscall10X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall10X is like syscall10 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. 
+TEXT runtime·syscall10X(SB),NOSPLIT,$0 + MOVW R13, R9 + SUB $24, R13 + BIC $0x7, R13 // align for ELF ABI + + MOVW R0, R8 + + MOVW (0*4)(R8), R7 // fn + MOVW (1*4)(R8), R0 // a1 + MOVW (2*4)(R8), R1 // a2 + MOVW (3*4)(R8), R2 // a3 + MOVW (4*4)(R8), R3 // a4 + MOVW (5*4)(R8), R4 // a5 + MOVW R4, 0(R13) + MOVW (6*4)(R8), R5 // a6 + MOVW R5, 4(R13) + MOVW (7*4)(R8), R6 // a7 + MOVW R6, 8(R13) + MOVW (8*4)(R8), R4 // a8 + MOVW R4, 12(R13) + MOVW (9*4)(R8), R5 // a9 + MOVW R5, 16(R13) + MOVW (10*4)(R8), R6 // a10 + MOVW R6, 20(R13) + + BL (R7) + + MOVW R0, (11*4)(R8) // r1 + MOVW R1, (12*4)(R8) // r2 + + // Standard libc functions return -1 on error and set errno. + CMP $-1, R0 + BNE ok + CMP $-1, R1 + BNE ok + + // Get error code from libc. + BL libc_errno(SB) + MOVW (R0), R1 + MOVW R1, (13*4)(R8) // err + +ok: + MOVW $0, R0 // no error (it's ignored anyway) + MOVW R9, R13 + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + MOVW R13, R9 + MOVW R0, R8 + BIC $0x7, R13 // align for ELF ABI + BL libc_issetugid(SB) + MOVW R0, 0(R8) + MOVW R9, R13 + RET diff --git a/src/runtime/sys_openbsd_arm64.s b/src/runtime/sys_openbsd_arm64.s new file mode 100644 index 0000000..563b88f --- /dev/null +++ b/src/runtime/sys_openbsd_arm64.s @@ -0,0 +1,657 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for arm64, OpenBSD +// System calls are implemented in libc/libpthread, this file +// contains trampolines that convert from Go to C calling convention. +// Some direct system call implementations currently remain. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "cgo/abi_arm64.h" + +#define CLOCK_REALTIME $0 +#define CLOCK_MONOTONIC $3 + +// mstart_stub is the first function executed on a new thread started by pthread_create. +// It just does some low-level setup and then calls mstart. 
+// Note: called with the C calling convention. +TEXT runtime·mstart_stub(SB),NOSPLIT,$144 + // R0 points to the m. + // We are already on m's g0 stack. + + // Save callee-save registers. + SAVE_R19_TO_R28(8) + SAVE_F8_TO_F15(88) + + MOVD m_g0(R0), g + BL runtime·save_g(SB) + + BL runtime·mstart(SB) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8) + RESTORE_F8_TO_F15(88) + + // Go is all done with this OS thread. + // Tell pthread everything is ok (we never join with this thread, so + // the value here doesn't really matter). + MOVD $0, R0 + + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R0 + MOVD info+16(FP), R1 + MOVD ctx+24(FP), R2 + MOVD fn+0(FP), R11 + BL (R11) // Alignment for ELF ABI? + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$192 + // Save callee-save registers in the case of signal forwarding. + // Please refer to https://golang.org/issue/31827 . + SAVE_R19_TO_R28(8*4) + SAVE_F8_TO_F15(8*14) + + // If called from an external code context, g will not be set. + // Save R0, since runtime·load_g will clobber it. + MOVW R0, 8(RSP) // signum + BL runtime·load_g(SB) + +#ifdef GOEXPERIMENT_regabiargs + // Restore signum to R0. + MOVW 8(RSP), R0 + // R1 and R2 already contain info and ctx, respectively. +#else + MOVD R1, 16(RSP) + MOVD R2, 24(RSP) +#endif + BL runtime·sigtrampgo<ABIInternal>(SB) + + // Restore callee-save registers. + RESTORE_R19_TO_R28(8*4) + RESTORE_F8_TO_F15(8*14) + + RET + +// +// These trampolines help convert from Go calling convention to C calling convention. +// They should be called with asmcgocall. +// A pointer to the arguments is passed in R0. +// A single int32 result is returned in R0. +// (For more results, make an args/results structure.) 
+TEXT runtime·pthread_attr_init_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 - attr + CALL libc_pthread_attr_init(SB) + RET + +TEXT runtime·pthread_attr_destroy_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 - attr + CALL libc_pthread_attr_destroy(SB) + RET + +TEXT runtime·pthread_attr_getstacksize_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - size + MOVD 0(R0), R0 // arg 1 - attr + CALL libc_pthread_attr_getstacksize(SB) + RET + +TEXT runtime·pthread_attr_setdetachstate_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - state + MOVD 0(R0), R0 // arg 1 - attr + CALL libc_pthread_attr_setdetachstate(SB) + RET + +TEXT runtime·pthread_create_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R1 // arg 2 - attr + MOVD 8(R0), R2 // arg 3 - start + MOVD 16(R0), R3 // arg 4 - arg + SUB $16, RSP + MOVD RSP, R0 // arg 1 - &threadid (discard) + CALL libc_pthread_create(SB) + ADD $16, RSP + RET + +TEXT runtime·thrkill_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - signal + MOVD $0, R2 // arg 3 - tcb + MOVW 0(R0), R0 // arg 1 - tid + CALL libc_thrkill(SB) + RET + +TEXT runtime·thrsleep_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - clock_id + MOVD 16(R0), R2 // arg 3 - abstime + MOVD 24(R0), R3 // arg 4 - lock + MOVD 32(R0), R4 // arg 5 - abort + MOVD 0(R0), R0 // arg 1 - id + CALL libc_thrsleep(SB) + RET + +TEXT runtime·thrwakeup_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - count + MOVD 0(R0), R0 // arg 1 - id + CALL libc_thrwakeup(SB) + RET + +TEXT runtime·exit_trampoline(SB),NOSPLIT,$0 + MOVW 0(R0), R0 // arg 1 - status + CALL libc_exit(SB) + MOVD $0, R0 // crash on failure + MOVD R0, (R0) + RET + +TEXT runtime·getthrid_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + CALL libc_getthrid(SB) + MOVW R0, 0(R19) // return value + RET + +TEXT runtime·raiseproc_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + CALL libc_getpid(SB) // arg 1 - pid + MOVW 0(R19), R1 // arg 2 - signal + CALL libc_kill(SB) + RET + +TEXT 
runtime·sched_yield_trampoline(SB),NOSPLIT,$0 + CALL libc_sched_yield(SB) + RET + +TEXT runtime·mmap_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + MOVD 0(R19), R0 // arg 1 - addr + MOVD 8(R19), R1 // arg 2 - len + MOVW 16(R19), R2 // arg 3 - prot + MOVW 20(R19), R3 // arg 4 - flags + MOVW 24(R19), R4 // arg 5 - fid + MOVW 28(R19), R5 // arg 6 - offset + CALL libc_mmap(SB) + MOVD $0, R1 + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R1 // errno + MOVD $0, R0 +noerr: + MOVD R0, 32(R19) + MOVD R1, 40(R19) + RET + +TEXT runtime·munmap_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - len + MOVD 0(R0), R0 // arg 1 - addr + CALL libc_munmap(SB) + CMP $-1, R0 + BNE 3(PC) + MOVD $0, R0 // crash on failure + MOVD R0, (R0) + RET + +TEXT runtime·madvise_trampoline(SB), NOSPLIT, $0 + MOVD 8(R0), R1 // arg 2 - len + MOVW 16(R0), R2 // arg 3 - advice + MOVD 0(R0), R0 // arg 1 - addr + CALL libc_madvise(SB) + // ignore failure - maybe pages are locked + RET + +TEXT runtime·open_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - flags + MOVW 12(R0), R2 // arg 3 - mode + MOVD 0(R0), R0 // arg 1 - path + MOVD $0, R3 // varargs + CALL libc_open(SB) + RET + +TEXT runtime·close_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 - fd + CALL libc_close(SB) + RET + +TEXT runtime·read_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - buf + MOVW 16(R0), R2 // arg 3 - count + MOVW 0(R0), R0 // arg 1 - fd + CALL libc_read(SB) + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·write_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - buf + MOVW 16(R0), R2 // arg 3 - count + MOVW 0(R0), R0 // arg 1 - fd + CALL libc_write(SB) + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·pipe2_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - flags + MOVD 
0(R0), R0 // arg 1 - filedes + CALL libc_pipe2(SB) + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·setitimer_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - new + MOVD 16(R0), R2 // arg 3 - old + MOVW 0(R0), R0 // arg 1 - which + CALL libc_setitimer(SB) + RET + +TEXT runtime·usleep_trampoline(SB),NOSPLIT,$0 + MOVD 0(R0), R0 // arg 1 - usec + CALL libc_usleep(SB) + RET + +TEXT runtime·sysctl_trampoline(SB),NOSPLIT,$0 + MOVW 8(R0), R1 // arg 2 - miblen + MOVD 16(R0), R2 // arg 3 - out + MOVD 24(R0), R3 // arg 4 - size + MOVD 32(R0), R4 // arg 5 - dst + MOVD 40(R0), R5 // arg 6 - ndst + MOVD 0(R0), R0 // arg 1 - mib + CALL libc_sysctl(SB) + RET + +TEXT runtime·kqueue_trampoline(SB),NOSPLIT,$0 + CALL libc_kqueue(SB) + RET + +TEXT runtime·kevent_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - keventt + MOVW 16(R0), R2 // arg 3 - nch + MOVD 24(R0), R3 // arg 4 - ev + MOVW 32(R0), R4 // arg 5 - nev + MOVD 40(R0), R5 // arg 6 - ts + MOVW 0(R0), R0 // arg 1 - kq + CALL libc_kevent(SB) + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·clock_gettime_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - tp + MOVD 0(R0), R0 // arg 1 - clock_id + CALL libc_clock_gettime(SB) + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R0 // errno + NEG R0, R0 // caller expects negative errno value +noerr: + RET + +TEXT runtime·fcntl_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 + MOVW 0(R19), R0 // arg 1 - fd + MOVW 4(R19), R1 // arg 2 - cmd + MOVW 8(R19), R2 // arg 3 - arg + MOVD $0, R3 // vararg + CALL libc_fcntl(SB) + MOVD $0, R1 + CMP $-1, R0 + BNE noerr + CALL libc_errno(SB) + MOVW (R0), R1 + MOVW $-1, R0 +noerr: + MOVW R0, 12(R19) + MOVW R1, 16(R19) + RET + +TEXT runtime·sigaction_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - new + MOVD 16(R0), R2 // 
arg 3 - old + MOVW 0(R0), R0 // arg 1 - sig + CALL libc_sigaction(SB) + CMP $-1, R0 + BNE 3(PC) + MOVD $0, R0 // crash on syscall failure + MOVD R0, (R0) + RET + +TEXT runtime·sigprocmask_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - new + MOVD 16(R0), R2 // arg 3 - old + MOVW 0(R0), R0 // arg 1 - how + CALL libc_pthread_sigmask(SB) + CMP $-1, R0 + BNE 3(PC) + MOVD $0, R0 // crash on syscall failure + MOVD R0, (R0) + RET + +TEXT runtime·sigaltstack_trampoline(SB),NOSPLIT,$0 + MOVD 8(R0), R1 // arg 2 - old + MOVD 0(R0), R0 // arg 1 - new + CALL libc_sigaltstack(SB) + CMP $-1, R0 + BNE 3(PC) + MOVD $0, R0 // crash on syscall failure + MOVD R0, (R0) + RET + +// syscall calls a function in libc on behalf of the syscall package. +// syscall takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD $0, R3 // vararg + + CALL R11 + + MOVD R0, (4*8)(R19) // r1 + MOVD R1, (5*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPW $-1, R0 + BNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (6*8)(R19) // err + +ok: + RET + +// syscallX calls a function in libc on behalf of the syscall package. +// syscallX takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscallX must be called on the g0 stack with the +// C calling convention (use libcCall). 
+// +// syscallX is like syscall but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscallX(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD $0, R3 // vararg + + CALL R11 + + MOVD R0, (4*8)(R19) // r1 + MOVD R1, (5*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (6*8)(R19) // err + +ok: + RET + +// syscall6 calls a function in libc on behalf of the syscall package. +// syscall6 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6 must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6 expects a 32-bit result and tests for 32-bit -1 +// to decide there was an error. +TEXT runtime·syscall6(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD (4*8)(R19), R3 // a4 + MOVD (5*8)(R19), R4 // a5 + MOVD (6*8)(R19), R5 // a6 + MOVD $0, R6 // vararg + + CALL R11 + + MOVD R0, (7*8)(R19) // r1 + MOVD R1, (8*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPW $-1, R0 + BNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (9*8)(R19) // err + +ok: + RET + +// syscall6X calls a function in libc on behalf of the syscall package. 
+// syscall6X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall6X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall6X is like syscall6 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscall6X(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD (4*8)(R19), R3 // a4 + MOVD (5*8)(R19), R4 // a5 + MOVD (6*8)(R19), R5 // a6 + MOVD $0, R6 // vararg + + CALL R11 + + MOVD R0, (7*8)(R19) // r1 + MOVD R1, (8*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (9*8)(R19) // err + +ok: + RET + +// syscall10 calls a function in libc on behalf of the syscall package. +// syscall10 takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10 must be called on the g0 stack with the +// C calling convention (use libcCall). 
+TEXT runtime·syscall10(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD (4*8)(R19), R3 // a4 + MOVD (5*8)(R19), R4 // a5 + MOVD (6*8)(R19), R5 // a6 + MOVD (7*8)(R19), R6 // a7 + MOVD (8*8)(R19), R7 // a8 + MOVD (9*8)(R19), R8 // a9 + MOVD (10*8)(R19), R9 // a10 + MOVD $0, R10 // vararg + + CALL R11 + + MOVD R0, (11*8)(R19) // r1 + MOVD R1, (12*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMPW $-1, R0 + BNE ok + + // Get error code from libc. + CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (13*8)(R19) // err + +ok: + RET + +// syscall10X calls a function in libc on behalf of the syscall package. +// syscall10X takes a pointer to a struct like: +// struct { +// fn uintptr +// a1 uintptr +// a2 uintptr +// a3 uintptr +// a4 uintptr +// a5 uintptr +// a6 uintptr +// a7 uintptr +// a8 uintptr +// a9 uintptr +// a10 uintptr +// r1 uintptr +// r2 uintptr +// err uintptr +// } +// syscall10X must be called on the g0 stack with the +// C calling convention (use libcCall). +// +// syscall10X is like syscall10 but expects a 64-bit result +// and tests for 64-bit -1 to decide there was an error. +TEXT runtime·syscall10X(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + + MOVD (0*8)(R19), R11 // fn + MOVD (1*8)(R19), R0 // a1 + MOVD (2*8)(R19), R1 // a2 + MOVD (3*8)(R19), R2 // a3 + MOVD (4*8)(R19), R3 // a4 + MOVD (5*8)(R19), R4 // a5 + MOVD (6*8)(R19), R5 // a6 + MOVD (7*8)(R19), R6 // a7 + MOVD (8*8)(R19), R7 // a8 + MOVD (9*8)(R19), R8 // a9 + MOVD (10*8)(R19), R9 // a10 + MOVD $0, R10 // vararg + + CALL R11 + + MOVD R0, (11*8)(R19) // r1 + MOVD R1, (12*8)(R19) // r2 + + // Standard libc functions return -1 on error + // and set errno. + CMP $-1, R0 + BNE ok + + // Get error code from libc. 
+ CALL libc_errno(SB) + MOVW (R0), R0 + MOVD R0, (13*8)(R19) // err + +ok: + RET + +TEXT runtime·issetugid_trampoline(SB),NOSPLIT,$0 + MOVD R0, R19 // pointer to args + CALL libc_issetugid(SB) + MOVW R0, 0(R19) // return value + RET diff --git a/src/runtime/sys_openbsd_mips64.s b/src/runtime/sys_openbsd_mips64.s new file mode 100644 index 0000000..0a45a07 --- /dev/null +++ b/src/runtime/sys_openbsd_mips64.s @@ -0,0 +1,397 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +// System calls and other sys.stuff for mips64, OpenBSD +// /usr/src/sys/kern/syscalls.master for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define CLOCK_REALTIME $0 +#define CLOCK_MONOTONIC $3 + +// Exit the entire program (like C exit) +TEXT runtime·exit(SB),NOSPLIT|NOFRAME,$0 + MOVW code+0(FP), R4 // arg 1 - status + MOVV $1, R2 // sys_exit + SYSCALL + BEQ R7, 3(PC) + MOVV $0, R2 // crash on syscall failure + MOVV R2, (R2) + RET + +// func exitThread(wait *atomic.Uint32) +TEXT runtime·exitThread(SB),NOSPLIT,$0 + MOVV wait+0(FP), R4 // arg 1 - notdead + MOVV $302, R2 // sys___threxit + SYSCALL + MOVV $0, R2 // crash on syscall failure + MOVV R2, (R2) + JMP 0(PC) + +TEXT runtime·open(SB),NOSPLIT|NOFRAME,$0 + MOVV name+0(FP), R4 // arg 1 - path + MOVW mode+8(FP), R5 // arg 2 - mode + MOVW perm+12(FP), R6 // arg 3 - perm + MOVV $5, R2 // sys_open + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+16(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R4 // arg 1 - fd + MOVV $6, R2 // sys_close + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+8(FP) + RET + +TEXT runtime·read(SB),NOSPLIT|NOFRAME,$0 + MOVW fd+0(FP), R4 // arg 1 - fd + MOVV p+8(FP), R5 // arg 2 - buf + MOVW n+16(FP), R6 // arg 3 - nbyte + MOVV $3, R2 // sys_read + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative 
errno + MOVW R2, ret+24(FP) + RET + +// func pipe2(flags int32) (r, w int32, errno int32) +TEXT runtime·pipe2(SB),NOSPLIT|NOFRAME,$0-20 + MOVV $r+8(FP), R4 + MOVW flags+0(FP), R5 + MOVV $101, R2 // sys_pipe2 + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, errno+16(FP) + RET + +TEXT runtime·write1(SB),NOSPLIT|NOFRAME,$0 + MOVV fd+0(FP), R4 // arg 1 - fd + MOVV p+8(FP), R5 // arg 2 - buf + MOVW n+16(FP), R6 // arg 3 - nbyte + MOVV $4, R2 // sys_write + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+24(FP) + RET + +TEXT runtime·usleep(SB),NOSPLIT,$24-4 + MOVWU usec+0(FP), R3 + MOVV R3, R5 + MOVW $1000000, R4 + DIVVU R4, R3 + MOVV LO, R3 + MOVV R3, 8(R29) // tv_sec + MOVW $1000, R4 + MULVU R3, R4 + MOVV LO, R4 + SUBVU R4, R5 + MOVV R5, 16(R29) // tv_nsec + + ADDV $8, R29, R4 // arg 1 - rqtp + MOVV $0, R5 // arg 2 - rmtp + MOVV $91, R2 // sys_nanosleep + SYSCALL + RET + +TEXT runtime·getthrid(SB),NOSPLIT,$0-4 + MOVV $299, R2 // sys_getthrid + SYSCALL + MOVW R2, ret+0(FP) + RET + +TEXT runtime·thrkill(SB),NOSPLIT,$0-16 + MOVW tid+0(FP), R4 // arg 1 - tid + MOVV sig+8(FP), R5 // arg 2 - signum + MOVW $0, R6 // arg 3 - tcb + MOVV $119, R2 // sys_thrkill + SYSCALL + RET + +TEXT runtime·raiseproc(SB),NOSPLIT,$0 + MOVV $20, R4 // sys_getpid + SYSCALL + MOVV R2, R4 // arg 1 - pid + MOVW sig+0(FP), R5 // arg 2 - signum + MOVV $122, R2 // sys_kill + SYSCALL + RET + +TEXT runtime·mmap(SB),NOSPLIT,$0 + MOVV addr+0(FP), R4 // arg 1 - addr + MOVV n+8(FP), R5 // arg 2 - len + MOVW prot+16(FP), R6 // arg 3 - prot + MOVW flags+20(FP), R7 // arg 4 - flags + MOVW fd+24(FP), R8 // arg 5 - fd + MOVW $0, R9 // arg 6 - pad + MOVW off+28(FP), R10 // arg 7 - offset + MOVV $197, R2 // sys_mmap + SYSCALL + MOVV $0, R4 + BEQ R7, 3(PC) + MOVV R2, R4 // if error, move to R4 + MOVV $0, R2 + MOVV R2, p+32(FP) + MOVV R4, err+40(FP) + RET + +TEXT runtime·munmap(SB),NOSPLIT,$0 + MOVV addr+0(FP), R4 // arg 1 - addr + 
MOVV n+8(FP), R5 // arg 2 - len + MOVV $73, R2 // sys_munmap + SYSCALL + BEQ R7, 3(PC) + MOVV $0, R2 // crash on syscall failure + MOVV R2, (R2) + RET + +TEXT runtime·madvise(SB),NOSPLIT,$0 + MOVV addr+0(FP), R4 // arg 1 - addr + MOVV n+8(FP), R5 // arg 2 - len + MOVW flags+16(FP), R6 // arg 2 - flags + MOVV $75, R2 // sys_madvise + SYSCALL + BEQ R7, 2(PC) + MOVW $-1, R2 + MOVW R2, ret+24(FP) + RET + +TEXT runtime·setitimer(SB),NOSPLIT,$0 + MOVW mode+0(FP), R4 // arg 1 - mode + MOVV new+8(FP), R5 // arg 2 - new value + MOVV old+16(FP), R6 // arg 3 - old value + MOVV $69, R2 // sys_setitimer + SYSCALL + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB), NOSPLIT, $32 + MOVW CLOCK_REALTIME, R4 // arg 1 - clock_id + MOVV $8(R29), R5 // arg 2 - tp + MOVV $87, R2 // sys_clock_gettime + SYSCALL + + MOVV 8(R29), R4 // sec + MOVV 16(R29), R5 // nsec + MOVV R4, sec+0(FP) + MOVW R5, nsec+8(FP) + + RET + +// int64 nanotime1(void) so really +// void nanotime1(int64 *nsec) +TEXT runtime·nanotime1(SB),NOSPLIT,$32 + MOVW CLOCK_MONOTONIC, R4 // arg 1 - clock_id + MOVV $8(R29), R5 // arg 2 - tp + MOVV $87, R2 // sys_clock_gettime + SYSCALL + + MOVV 8(R29), R3 // sec + MOVV 16(R29), R5 // nsec + + MOVV $1000000000, R4 + MULVU R4, R3 + MOVV LO, R3 + ADDVU R5, R3 + MOVV R3, ret+0(FP) + RET + +TEXT runtime·sigaction(SB),NOSPLIT,$0 + MOVW sig+0(FP), R4 // arg 1 - signum + MOVV new+8(FP), R5 // arg 2 - new sigaction + MOVV old+16(FP), R6 // arg 3 - old sigaction + MOVV $46, R2 // sys_sigaction + SYSCALL + BEQ R7, 3(PC) + MOVV $3, R2 // crash on syscall failure + MOVV R2, (R2) + RET + +TEXT runtime·obsdsigprocmask(SB),NOSPLIT,$0 + MOVW how+0(FP), R4 // arg 1 - mode + MOVW new+4(FP), R5 // arg 2 - new + MOVV $48, R2 // sys_sigprocmask + SYSCALL + BEQ R7, 3(PC) + MOVV $3, R2 // crash on syscall failure + MOVV R2, (R2) + MOVW R2, ret+8(FP) + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVW sig+8(FP), R4 + MOVV info+16(FP), R5 + MOVV ctx+24(FP), R6 + MOVV 
fn+0(FP), R25 // Must use R25, needed for PIC code. + CALL (R25) + RET + +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$192 + // initialize REGSB = PC&0xffffffff00000000 + BGEZAL R0, 1(PC) + SRLV $32, R31, RSB + SLLV $32, RSB + + // this might be called in external code context, + // where g is not set. + MOVB runtime·iscgo(SB), R1 + BEQ R1, 2(PC) + JAL runtime·load_g(SB) + + MOVW R4, 8(R29) + MOVV R5, 16(R29) + MOVV R6, 24(R29) + MOVV $runtime·sigtrampgo(SB), R1 + JAL (R1) + RET + +// int32 tfork(void *param, uintptr psize, M *mp, G *gp, void (*fn)(void)); +TEXT runtime·tfork(SB),NOSPLIT,$0 + + // Copy mp, gp and fn off parent stack for use by child. + MOVV mm+16(FP), R16 + MOVV gg+24(FP), R17 + MOVV fn+32(FP), R18 + + MOVV param+0(FP), R4 // arg 1 - param + MOVV psize+8(FP), R5 // arg 2 - psize + MOVV $8, R2 // sys___tfork + SYSCALL + + // Return if syscall failed. + BEQ R7, 4(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+40(FP) + RET + + // In parent, return. + BEQ R2, 3(PC) + MOVW $0, ret+40(FP) + RET + + // Initialise m, g. + MOVV R17, g + MOVV R16, g_m(g) + + // Call fn. + CALL (R18) + + // fn should never return. 
+ MOVV $2, R8 // crash if reached + MOVV R8, (R8) + RET + +TEXT runtime·sigaltstack(SB),NOSPLIT,$0 + MOVV new+0(FP), R4 // arg 1 - new sigaltstack + MOVV old+8(FP), R5 // arg 2 - old sigaltstack + MOVV $288, R2 // sys_sigaltstack + SYSCALL + BEQ R7, 3(PC) + MOVV $0, R8 // crash on syscall failure + MOVV R8, (R8) + RET + +TEXT runtime·osyield(SB),NOSPLIT,$0 + MOVV $298, R2 // sys_sched_yield + SYSCALL + RET + +TEXT runtime·thrsleep(SB),NOSPLIT,$0 + MOVV ident+0(FP), R4 // arg 1 - ident + MOVW clock_id+8(FP), R5 // arg 2 - clock_id + MOVV tsp+16(FP), R6 // arg 3 - tsp + MOVV lock+24(FP), R7 // arg 4 - lock + MOVV abort+32(FP), R8 // arg 5 - abort + MOVV $94, R2 // sys___thrsleep + SYSCALL + MOVW R2, ret+40(FP) + RET + +TEXT runtime·thrwakeup(SB),NOSPLIT,$0 + MOVV ident+0(FP), R4 // arg 1 - ident + MOVW n+8(FP), R5 // arg 2 - n + MOVV $301, R2 // sys___thrwakeup + SYSCALL + MOVW R2, ret+16(FP) + RET + +TEXT runtime·sysctl(SB),NOSPLIT,$0 + MOVV mib+0(FP), R4 // arg 1 - mib + MOVW miblen+8(FP), R5 // arg 2 - miblen + MOVV out+16(FP), R6 // arg 3 - out + MOVV size+24(FP), R7 // arg 4 - size + MOVV dst+32(FP), R8 // arg 5 - dest + MOVV ndst+40(FP), R9 // arg 6 - newlen + MOVV $202, R2 // sys___sysctl + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+48(FP) + RET + +// int32 runtime·kqueue(void); +TEXT runtime·kqueue(SB),NOSPLIT,$0 + MOVV $269, R2 // sys_kqueue + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // caller expects negative errno + MOVW R2, ret+0(FP) + RET + +// int32 runtime·kevent(int kq, Kevent *changelist, int nchanges, Kevent *eventlist, int nevents, Timespec *timeout); +TEXT runtime·kevent(SB),NOSPLIT,$0 + MOVW kq+0(FP), R4 // arg 1 - kq + MOVV ch+8(FP), R5 // arg 2 - changelist + MOVW nch+16(FP), R6 // arg 3 - nchanges + MOVV ev+24(FP), R7 // arg 4 - eventlist + MOVW nev+32(FP), R8 // arg 5 - nevents + MOVV ts+40(FP), R9 // arg 6 - timeout + MOVV $72, R2 // sys_kevent + SYSCALL + BEQ R7, 2(PC) + SUBVU R2, R0, R2 // 
caller expects negative errno + MOVW R2, ret+48(FP) + RET + +// func fcntl(fd, cmd, arg int32) (int32, int32) +TEXT runtime·fcntl(SB),NOSPLIT,$0 + MOVW fd+0(FP), R4 // fd + MOVW cmd+4(FP), R5 // cmd + MOVW arg+8(FP), R6 // arg + MOVV $92, R2 // sys_fcntl + SYSCALL + MOVV $0, R4 + BEQ R7, noerr + MOVV R2, R4 + MOVW $-1, R2 +noerr: + MOVW R2, ret+16(FP) + MOVW R4, errno+20(FP) + RET + +// func closeonexec(fd int32) +TEXT runtime·closeonexec(SB),NOSPLIT,$0 + MOVW fd+0(FP), R4 // arg 1 - fd + MOVV $2, R5 // arg 2 - cmd (F_SETFD) + MOVV $1, R6 // arg 3 - arg (FD_CLOEXEC) + MOVV $92, R2 // sys_fcntl + SYSCALL + RET + +// func issetugid() int32 +TEXT runtime·issetugid(SB),NOSPLIT,$0 + MOVV $253, R2 // sys_issetugid + SYSCALL + MOVW R2, ret+0(FP) + RET diff --git a/src/runtime/sys_plan9_386.s b/src/runtime/sys_plan9_386.s new file mode 100644 index 0000000..bdcb98e --- /dev/null +++ b/src/runtime/sys_plan9_386.s @@ -0,0 +1,256 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// setldt(int entry, int address, int limit) +TEXT runtime·setldt(SB),NOSPLIT,$0 + RET + +TEXT runtime·open(SB),NOSPLIT,$0 + MOVL $14, AX + INT $64 + MOVL AX, ret+12(FP) + RET + +TEXT runtime·pread(SB),NOSPLIT,$0 + MOVL $50, AX + INT $64 + MOVL AX, ret+20(FP) + RET + +TEXT runtime·pwrite(SB),NOSPLIT,$0 + MOVL $51, AX + INT $64 + MOVL AX, ret+20(FP) + RET + +// int32 _seek(int64*, int32, int64, int32) +TEXT _seek<>(SB),NOSPLIT,$0 + MOVL $39, AX + INT $64 + RET + +TEXT runtime·seek(SB),NOSPLIT,$24 + LEAL ret+16(FP), AX + MOVL fd+0(FP), BX + MOVL offset_lo+4(FP), CX + MOVL offset_hi+8(FP), DX + MOVL whence+12(FP), SI + MOVL AX, 0(SP) + MOVL BX, 4(SP) + MOVL CX, 8(SP) + MOVL DX, 12(SP) + MOVL SI, 16(SP) + CALL _seek<>(SB) + CMPL AX, $0 + JGE 3(PC) + MOVL $-1, ret_lo+16(FP) + MOVL $-1, ret_hi+20(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0 + MOVL $4, AX + INT $64 + MOVL AX, ret+4(FP) + RET + +TEXT runtime·exits(SB),NOSPLIT,$0 + MOVL $8, AX + INT $64 + RET + +TEXT runtime·brk_(SB),NOSPLIT,$0 + MOVL $24, AX + INT $64 + MOVL AX, ret+4(FP) + RET + +TEXT runtime·sleep(SB),NOSPLIT,$0 + MOVL $17, AX + INT $64 + MOVL AX, ret+4(FP) + RET + +TEXT runtime·plan9_semacquire(SB),NOSPLIT,$0 + MOVL $37, AX + INT $64 + MOVL AX, ret+8(FP) + RET + +TEXT runtime·plan9_tsemacquire(SB),NOSPLIT,$0 + MOVL $52, AX + INT $64 + MOVL AX, ret+8(FP) + RET + +TEXT nsec<>(SB),NOSPLIT,$0 + MOVL $53, AX + INT $64 + RET + +TEXT runtime·nsec(SB),NOSPLIT,$8 + LEAL ret+4(FP), AX + MOVL AX, 0(SP) + CALL nsec<>(SB) + CMPL AX, $0 + JGE 3(PC) + MOVL $-1, ret_lo+4(FP) + MOVL $-1, ret_hi+8(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$8-12 + CALL runtime·nanotime1(SB) + MOVL 0(SP), AX + MOVL 4(SP), DX + + MOVL $1000000000, CX + DIVL CX + MOVL AX, sec_lo+0(FP) + MOVL $0, sec_hi+4(FP) + MOVL DX, nsec+8(FP) + RET + +TEXT runtime·notify(SB),NOSPLIT,$0 + MOVL $28, AX + INT $64 + MOVL AX, ret+4(FP) + 
RET + +TEXT runtime·noted(SB),NOSPLIT,$0 + MOVL $29, AX + INT $64 + MOVL AX, ret+4(FP) + RET + +TEXT runtime·plan9_semrelease(SB),NOSPLIT,$0 + MOVL $38, AX + INT $64 + MOVL AX, ret+8(FP) + RET + +TEXT runtime·rfork(SB),NOSPLIT,$0 + MOVL $19, AX + INT $64 + MOVL AX, ret+4(FP) + RET + +TEXT runtime·tstart_plan9(SB),NOSPLIT,$4 + MOVL newm+0(FP), CX + MOVL m_g0(CX), DX + + // Layout new m scheduler stack on os stack. + MOVL SP, AX + MOVL AX, (g_stack+stack_hi)(DX) + SUBL $(64*1024), AX // stack size + MOVL AX, (g_stack+stack_lo)(DX) + MOVL AX, g_stackguard0(DX) + MOVL AX, g_stackguard1(DX) + + // Initialize procid from TOS struct. + MOVL _tos(SB), AX + MOVL 48(AX), AX + MOVL AX, m_procid(CX) // save pid as m->procid + + // Finally, initialize g. + get_tls(BX) + MOVL DX, g(BX) + + CALL runtime·stackcheck(SB) // smashes AX, CX + CALL runtime·mstart(SB) + + // Exit the thread. + MOVL $0, 0(SP) + CALL runtime·exits(SB) + JMP 0(PC) + +// void sigtramp(void *ureg, int8 *note) +TEXT runtime·sigtramp(SB),NOSPLIT,$0 + get_tls(AX) + + // check that g exists + MOVL g(AX), BX + CMPL BX, $0 + JNE 3(PC) + CALL runtime·badsignal2(SB) // will exit + RET + + // save args + MOVL ureg+0(FP), CX + MOVL note+4(FP), DX + + // change stack + MOVL g_m(BX), BX + MOVL m_gsignal(BX), BP + MOVL (g_stack+stack_hi)(BP), BP + MOVL BP, SP + + // make room for args and g + SUBL $24, SP + + // save g + MOVL g(AX), BP + MOVL BP, 20(SP) + + // g = m->gsignal + MOVL m_gsignal(BX), DI + MOVL DI, g(AX) + + // load args and call sighandler + MOVL CX, 0(SP) + MOVL DX, 4(SP) + MOVL BP, 8(SP) + + CALL runtime·sighandler(SB) + MOVL 12(SP), AX + + // restore g + get_tls(BX) + MOVL 20(SP), BP + MOVL BP, g(BX) + + // call noted(AX) + MOVL AX, 0(SP) + CALL runtime·noted(SB) + RET + +// Only used by the 64-bit runtime. 
+TEXT runtime·setfpmasks(SB),NOSPLIT,$0 + RET + +#define ERRMAX 128 /* from os_plan9.h */ + +// void errstr(int8 *buf, int32 len) +TEXT errstr<>(SB),NOSPLIT,$0 + MOVL $41, AX + INT $64 + RET + +// func errstr() string +// Only used by package syscall. +// Grab error string due to a syscall made +// in entersyscall mode, without going +// through the allocator (issue 4994). +// See ../syscall/asm_plan9_386.s:/·Syscall/ +TEXT runtime·errstr(SB),NOSPLIT,$8-8 + get_tls(AX) + MOVL g(AX), BX + MOVL g_m(BX), BX + MOVL (m_mOS+mOS_errstr)(BX), CX + MOVL CX, 0(SP) + MOVL $ERRMAX, 4(SP) + CALL errstr<>(SB) + CALL runtime·findnull(SB) + MOVL 4(SP), AX + MOVL AX, ret_len+4(FP) + MOVL 0(SP), AX + MOVL AX, ret_base+0(FP) + RET + +// never called on this platform +TEXT ·sigpanictramp(SB),NOSPLIT,$0-0 + UNDEF diff --git a/src/runtime/sys_plan9_amd64.s b/src/runtime/sys_plan9_amd64.s new file mode 100644 index 0000000..638300d --- /dev/null +++ b/src/runtime/sys_plan9_amd64.s @@ -0,0 +1,257 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +TEXT runtime·open(SB),NOSPLIT,$0 + MOVQ $14, BP + SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·pread(SB),NOSPLIT,$0 + MOVQ $50, BP + SYSCALL + MOVL AX, ret+32(FP) + RET + +TEXT runtime·pwrite(SB),NOSPLIT,$0 + MOVQ $51, BP + SYSCALL + MOVL AX, ret+32(FP) + RET + +// int32 _seek(int64*, int32, int64, int32) +TEXT _seek<>(SB),NOSPLIT,$0 + MOVQ $39, BP + SYSCALL + RET + +// int64 seek(int32, int64, int32) +// Convenience wrapper around _seek, the actual system call. 
+TEXT runtime·seek(SB),NOSPLIT,$32 + LEAQ ret+24(FP), AX + MOVL fd+0(FP), BX + MOVQ offset+8(FP), CX + MOVL whence+16(FP), DX + MOVQ AX, 0(SP) + MOVL BX, 8(SP) + MOVQ CX, 16(SP) + MOVL DX, 24(SP) + CALL _seek<>(SB) + CMPL AX, $0 + JGE 2(PC) + MOVQ $-1, ret+24(FP) + RET + +TEXT runtime·closefd(SB),NOSPLIT,$0 + MOVQ $4, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·exits(SB),NOSPLIT,$0 + MOVQ $8, BP + SYSCALL + RET + +TEXT runtime·brk_(SB),NOSPLIT,$0 + MOVQ $24, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·sleep(SB),NOSPLIT,$0 + MOVQ $17, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·plan9_semacquire(SB),NOSPLIT,$0 + MOVQ $37, BP + SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·plan9_tsemacquire(SB),NOSPLIT,$0 + MOVQ $52, BP + SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·nsec(SB),NOSPLIT,$0 + MOVQ $53, BP + SYSCALL + MOVQ AX, ret+8(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$8-12 + CALL runtime·nanotime1(SB) + MOVQ 0(SP), AX + + // generated code for + // func f(x uint64) (uint64, uint64) { return x/1000000000, x%1000000000 } + // adapted to reduce duplication + MOVQ AX, CX + MOVQ $1360296554856532783, AX + MULQ CX + ADDQ CX, DX + RCRQ $1, DX + SHRQ $29, DX + MOVQ DX, sec+0(FP) + IMULQ $1000000000, DX + SUBQ DX, CX + MOVL CX, nsec+8(FP) + RET + +TEXT runtime·notify(SB),NOSPLIT,$0 + MOVQ $28, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·noted(SB),NOSPLIT,$0 + MOVQ $29, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·plan9_semrelease(SB),NOSPLIT,$0 + MOVQ $38, BP + SYSCALL + MOVL AX, ret+16(FP) + RET + +TEXT runtime·rfork(SB),NOSPLIT,$0 + MOVQ $19, BP + SYSCALL + MOVL AX, ret+8(FP) + RET + +TEXT runtime·tstart_plan9(SB),NOSPLIT,$8 + MOVQ newm+0(FP), CX + MOVQ m_g0(CX), DX + + // Layout new m scheduler stack on os stack. 
+ MOVQ SP, AX + MOVQ AX, (g_stack+stack_hi)(DX) + SUBQ $(64*1024), AX // stack size + MOVQ AX, (g_stack+stack_lo)(DX) + MOVQ AX, g_stackguard0(DX) + MOVQ AX, g_stackguard1(DX) + + // Initialize procid from TOS struct. + MOVQ _tos(SB), AX + MOVL 64(AX), AX + MOVQ AX, m_procid(CX) // save pid as m->procid + + // Finally, initialize g. + get_tls(BX) + MOVQ DX, g(BX) + + CALL runtime·stackcheck(SB) // smashes AX, CX + CALL runtime·mstart(SB) + + // Exit the thread. + MOVQ $0, 0(SP) + CALL runtime·exits(SB) + JMP 0(PC) + +// This is needed by asm_amd64.s +TEXT runtime·settls(SB),NOSPLIT,$0 + RET + +// void sigtramp(void *ureg, int8 *note) +TEXT runtime·sigtramp(SB),NOSPLIT,$0 + get_tls(AX) + + // check that g exists + MOVQ g(AX), BX + CMPQ BX, $0 + JNE 3(PC) + CALL runtime·badsignal2(SB) // will exit + RET + + // save args + MOVQ ureg+0(FP), CX + MOVQ note+8(FP), DX + + // change stack + MOVQ g_m(BX), BX + MOVQ m_gsignal(BX), R10 + MOVQ (g_stack+stack_hi)(R10), BP + MOVQ BP, SP + + // make room for args and g + SUBQ $128, SP + + // save g + MOVQ g(AX), BP + MOVQ BP, 32(SP) + + // g = m->gsignal + MOVQ R10, g(AX) + + // load args and call sighandler + MOVQ CX, 0(SP) + MOVQ DX, 8(SP) + MOVQ BP, 16(SP) + + CALL runtime·sighandler(SB) + MOVL 24(SP), AX + + // restore g + get_tls(BX) + MOVQ 32(SP), R10 + MOVQ R10, g(BX) + + // call noted(AX) + MOVQ AX, 0(SP) + CALL runtime·noted(SB) + RET + +TEXT runtime·setfpmasks(SB),NOSPLIT,$8 + STMXCSR 0(SP) + MOVL 0(SP), AX + ANDL $~0x3F, AX + ORL $(0x3F<<7), AX + MOVL AX, 0(SP) + LDMXCSR 0(SP) + RET + +#define ERRMAX 128 /* from os_plan9.h */ + +// void errstr(int8 *buf, int32 len) +TEXT errstr<>(SB),NOSPLIT,$0 + MOVQ $41, BP + SYSCALL + RET + +// func errstr() string +// Only used by package syscall. +// Grab error string due to a syscall made +// in entersyscall mode, without going +// through the allocator (issue 4994). 
+// See ../syscall/asm_plan9_amd64.s:/·Syscall/ +TEXT runtime·errstr(SB),NOSPLIT,$16-16 + get_tls(AX) + MOVQ g(AX), BX + MOVQ g_m(BX), BX + MOVQ (m_mOS+mOS_errstr)(BX), CX + MOVQ CX, 0(SP) + MOVQ $ERRMAX, 8(SP) + CALL errstr<>(SB) + CALL runtime·findnull(SB) + MOVQ 8(SP), AX + MOVQ AX, ret_len+8(FP) + MOVQ 0(SP), AX + MOVQ AX, ret_base+0(FP) + RET + +// never called on this platform +TEXT ·sigpanictramp(SB),NOSPLIT,$0-0 + UNDEF diff --git a/src/runtime/sys_plan9_arm.s b/src/runtime/sys_plan9_arm.s new file mode 100644 index 0000000..5343085 --- /dev/null +++ b/src/runtime/sys_plan9_arm.s @@ -0,0 +1,320 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// from ../syscall/zsysnum_plan9.go + +#define SYS_SYSR1 0 +#define SYS_BIND 2 +#define SYS_CHDIR 3 +#define SYS_CLOSE 4 +#define SYS_DUP 5 +#define SYS_ALARM 6 +#define SYS_EXEC 7 +#define SYS_EXITS 8 +#define SYS_FAUTH 10 +#define SYS_SEGBRK 12 +#define SYS_OPEN 14 +#define SYS_OSEEK 16 +#define SYS_SLEEP 17 +#define SYS_RFORK 19 +#define SYS_PIPE 21 +#define SYS_CREATE 22 +#define SYS_FD2PATH 23 +#define SYS_BRK_ 24 +#define SYS_REMOVE 25 +#define SYS_NOTIFY 28 +#define SYS_NOTED 29 +#define SYS_SEGATTACH 30 +#define SYS_SEGDETACH 31 +#define SYS_SEGFREE 32 +#define SYS_SEGFLUSH 33 +#define SYS_RENDEZVOUS 34 +#define SYS_UNMOUNT 35 +#define SYS_SEMACQUIRE 37 +#define SYS_SEMRELEASE 38 +#define SYS_SEEK 39 +#define SYS_FVERSION 40 +#define SYS_ERRSTR 41 +#define SYS_STAT 42 +#define SYS_FSTAT 43 +#define SYS_WSTAT 44 +#define SYS_FWSTAT 45 +#define SYS_MOUNT 46 +#define SYS_AWAIT 47 +#define SYS_PREAD 50 +#define SYS_PWRITE 51 +#define SYS_TSEMACQUIRE 52 +#define SYS_NSEC 53 + +//func open(name *byte, mode, perm int32) int32 +TEXT runtime·open(SB),NOSPLIT,$0-16 + MOVW $SYS_OPEN, R0 + SWI $0 + MOVW R0, ret+12(FP) + RET + +//func 
pread(fd int32, buf unsafe.Pointer, nbytes int32, offset int64) int32 +TEXT runtime·pread(SB),NOSPLIT,$0-24 + MOVW $SYS_PREAD, R0 + SWI $0 + MOVW R0, ret+20(FP) + RET + +//func pwrite(fd int32, buf unsafe.Pointer, nbytes int32, offset int64) int32 +TEXT runtime·pwrite(SB),NOSPLIT,$0-24 + MOVW $SYS_PWRITE, R0 + SWI $0 + MOVW R0, ret+20(FP) + RET + +//func seek(fd int32, offset int64, whence int32) int64 +TEXT runtime·seek(SB),NOSPLIT,$0-24 + MOVW $ret_lo+16(FP), R0 + MOVW 0(R13), R1 + MOVW R0, 0(R13) + MOVW.W R1, -4(R13) + MOVW $SYS_SEEK, R0 + SWI $0 + MOVW.W R1, 4(R13) + CMP $-1, R0 + MOVW.EQ R0, ret_lo+16(FP) + MOVW.EQ R0, ret_hi+20(FP) + RET + +//func closefd(fd int32) int32 +TEXT runtime·closefd(SB),NOSPLIT,$0-8 + MOVW $SYS_CLOSE, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func exits(msg *byte) +TEXT runtime·exits(SB),NOSPLIT,$0-4 + MOVW $SYS_EXITS, R0 + SWI $0 + RET + +//func brk_(addr unsafe.Pointer) int32 +TEXT runtime·brk_(SB),NOSPLIT,$0-8 + MOVW $SYS_BRK_, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func sleep(ms int32) int32 +TEXT runtime·sleep(SB),NOSPLIT,$0-8 + MOVW $SYS_SLEEP, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func plan9_semacquire(addr *uint32, block int32) int32 +TEXT runtime·plan9_semacquire(SB),NOSPLIT,$0-12 + MOVW $SYS_SEMACQUIRE, R0 + SWI $0 + MOVW R0, ret+8(FP) + RET + +//func plan9_tsemacquire(addr *uint32, ms int32) int32 +TEXT runtime·plan9_tsemacquire(SB),NOSPLIT,$0-12 + MOVW $SYS_TSEMACQUIRE, R0 + SWI $0 + MOVW R0, ret+8(FP) + RET + +//func nsec(*int64) int64 +TEXT runtime·nsec(SB),NOSPLIT|NOFRAME,$0-12 + MOVW $SYS_NSEC, R0 + SWI $0 + MOVW arg+0(FP), R1 + MOVW 0(R1), R0 + MOVW R0, ret_lo+4(FP) + MOVW 4(R1), R0 + MOVW R0, ret_hi+8(FP) + RET + +// func walltime() (sec int64, nsec int32) +TEXT runtime·walltime(SB),NOSPLIT,$12-12 + // use nsec system call to get current time in nanoseconds + MOVW $sysnsec_lo-8(SP), R0 // destination addr + MOVW R0,res-12(SP) + MOVW $SYS_NSEC, R0 + SWI $0 + MOVW sysnsec_lo-8(SP), R1 // R1:R2 = nsec + 
MOVW sysnsec_hi-4(SP), R2 + + // multiply nanoseconds by reciprocal of 10**9 (scaled by 2**61) + // to get seconds (96 bit scaled result) + MOVW $0x89705f41, R3 // 2**61 * 10**-9 + MULLU R1,R3,(R6,R5) // R5:R6:R7 = R1:R2 * R3 + MOVW $0,R7 + MULALU R2,R3,(R7,R6) + + // unscale by discarding low 32 bits, shifting the rest by 29 + MOVW R6>>29,R6 // R6:R7 = (R5:R6:R7 >> 61) + ORR R7<<3,R6 + MOVW R7>>29,R7 + + // subtract (10**9 * sec) from nsec to get nanosecond remainder + MOVW $1000000000, R5 // 10**9 + MULLU R6,R5,(R9,R8) // R8:R9 = R6:R7 * R5 + MULA R7,R5,R9,R9 + SUB.S R8,R1 // R1:R2 -= R8:R9 + SBC R9,R2 + + // because reciprocal was a truncated repeating fraction, quotient + // may be slightly too small -- adjust to make remainder < 10**9 + CMP R5,R1 // if remainder > 10**9 + SUB.HS R5,R1 // remainder -= 10**9 + ADD.HS $1,R6 // sec += 1 + + MOVW R6,sec_lo+0(FP) + MOVW R7,sec_hi+4(FP) + MOVW R1,nsec+8(FP) + RET + +//func notify(fn unsafe.Pointer) int32 +TEXT runtime·notify(SB),NOSPLIT,$0-8 + MOVW $SYS_NOTIFY, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func noted(mode int32) int32 +TEXT runtime·noted(SB),NOSPLIT,$0-8 + MOVW $SYS_NOTED, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func plan9_semrelease(addr *uint32, count int32) int32 +TEXT runtime·plan9_semrelease(SB),NOSPLIT,$0-12 + MOVW $SYS_SEMRELEASE, R0 + SWI $0 + MOVW R0, ret+8(FP) + RET + +//func rfork(flags int32) int32 +TEXT runtime·rfork(SB),NOSPLIT,$0-8 + MOVW $SYS_RFORK, R0 + SWI $0 + MOVW R0, ret+4(FP) + RET + +//func tstart_plan9(newm *m) +TEXT runtime·tstart_plan9(SB),NOSPLIT,$4-4 + MOVW newm+0(FP), R1 + MOVW m_g0(R1), g + + // Layout new m scheduler stack on os stack. + MOVW R13, R0 + MOVW R0, g_stack+stack_hi(g) + SUB $(64*1024), R0 + MOVW R0, (g_stack+stack_lo)(g) + MOVW R0, g_stackguard0(g) + MOVW R0, g_stackguard1(g) + + // Initialize procid from TOS struct. + MOVW _tos(SB), R0 + MOVW 48(R0), R0 + MOVW R0, m_procid(R1) // save pid as m->procid + + BL runtime·mstart(SB) + + // Exit the thread. 
+ MOVW $0, R0 + MOVW R0, 4(R13) + CALL runtime·exits(SB) + JMP 0(PC) + +//func sigtramp(ureg, note unsafe.Pointer) +TEXT runtime·sigtramp(SB),NOSPLIT,$0-8 + // check that g and m exist + CMP $0, g + BEQ 4(PC) + MOVW g_m(g), R0 + CMP $0, R0 + BNE 2(PC) + BL runtime·badsignal2(SB) // will exit + + // save args + MOVW ureg+0(FP), R1 + MOVW note+4(FP), R2 + + // change stack + MOVW m_gsignal(R0), R3 + MOVW (g_stack+stack_hi)(R3), R13 + + // make room for args, retval and g + SUB $24, R13 + + // save g + MOVW g, R3 + MOVW R3, 20(R13) + + // g = m->gsignal + MOVW m_gsignal(R0), g + + // load args and call sighandler + ADD $4,R13,R5 + MOVM.IA [R1-R3], (R5) + BL runtime·sighandler(SB) + MOVW 16(R13), R0 // retval + + // restore g + MOVW 20(R13), g + + // call noted(R0) + MOVW R0, 4(R13) + BL runtime·noted(SB) + RET + +//func sigpanictramp() +TEXT runtime·sigpanictramp(SB),NOSPLIT,$0-0 + MOVW.W R0, -4(R13) + B runtime·sigpanic(SB) + +//func setfpmasks() +// Only used by the 64-bit runtime. +TEXT runtime·setfpmasks(SB),NOSPLIT,$0 + RET + +#define ERRMAX 128 /* from os_plan9.h */ + +// func errstr() string +// Only used by package syscall. +// Grab error string due to a syscall made +// in entersyscall mode, without going +// through the allocator (issue 4994). 
+// See ../syscall/asm_plan9_arm.s:/·Syscall/ +TEXT runtime·errstr(SB),NOSPLIT,$0-8 + MOVW g_m(g), R0 + MOVW (m_mOS+mOS_errstr)(R0), R1 + MOVW R1, ret_base+0(FP) + MOVW $ERRMAX, R2 + MOVW R2, ret_len+4(FP) + MOVW $SYS_ERRSTR, R0 + SWI $0 + MOVW R1, R2 + MOVBU 0(R2), R0 + CMP $0, R0 + BEQ 3(PC) + ADD $1, R2 + B -4(PC) + SUB R1, R2 + MOVW R2, ret_len+4(FP) + RET + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + B runtime·armPublicationBarrier(SB) + +// never called (cgo not supported) +TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + MOVW $0, R0 + MOVW R0, (R0) + RET diff --git a/src/runtime/sys_ppc64x.go b/src/runtime/sys_ppc64x.go new file mode 100644 index 0000000..56c5c95 --- /dev/null +++ b/src/runtime/sys_ppc64x.go @@ -0,0 +1,22 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} + +func prepGoExitFrame(sp uintptr) diff --git a/src/runtime/sys_riscv64.go b/src/runtime/sys_riscv64.go new file mode 100644 index 0000000..e710840 --- /dev/null +++ b/src/runtime/sys_riscv64.go @@ -0,0 +1,18 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. 
+func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_s390x.go b/src/runtime/sys_s390x.go new file mode 100644 index 0000000..e710840 --- /dev/null +++ b/src/runtime/sys_s390x.go @@ -0,0 +1,18 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then did an immediate Gosave. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + if buf.lr != 0 { + throw("invalid use of gostartcall") + } + buf.lr = buf.pc + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_solaris_amd64.s b/src/runtime/sys_solaris_amd64.s new file mode 100644 index 0000000..7376e06 --- /dev/null +++ b/src/runtime/sys_solaris_amd64.s @@ -0,0 +1,308 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. +// +// System calls and other sys.stuff for AMD64, SunOS +// /usr/include/sys/syscall.h for syscall numbers. +// + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +// This is needed by asm_amd64.s +TEXT runtime·settls(SB),NOSPLIT,$8 + RET + +// void libc_miniterrno(void *(*___errno)(void)); +// +// Set the TLS errno pointer in M. +// +// Called using runtime·asmcgocall from os_solaris.c:/minit. +// NOT USING GO CALLING CONVENTION. +TEXT runtime·miniterrno(SB),NOSPLIT,$0 + // asmcgocall will put first argument into DI. + CALL DI // SysV ABI so returns in AX + get_tls(CX) + MOVQ g(CX), BX + MOVQ g_m(BX), BX + MOVQ AX, (m_mOS+mOS_perrno)(BX) + RET + +// Call a library function with SysV calling conventions. 
+// The called function can take a maximum of 6 INTEGER class arguments, +// see +// Michael Matz, Jan Hubicka, Andreas Jaeger, and Mark Mitchell +// System V Application Binary Interface +// AMD64 Architecture Processor Supplement +// section 3.2.3. +// +// Called by runtime·asmcgocall or runtime·cgocall. +// NOT USING GO CALLING CONVENTION. +TEXT runtime·asmsysvicall6(SB),NOSPLIT,$0 + // asmcgocall will put first argument into DI. + PUSHQ DI // save for later + MOVQ libcall_fn(DI), AX + MOVQ libcall_args(DI), R11 + MOVQ libcall_n(DI), R10 + + get_tls(CX) + MOVQ g(CX), BX + CMPQ BX, $0 + JEQ skiperrno1 + MOVQ g_m(BX), BX + MOVQ (m_mOS+mOS_perrno)(BX), DX + CMPQ DX, $0 + JEQ skiperrno1 + MOVL $0, 0(DX) + +skiperrno1: + CMPQ R11, $0 + JEQ skipargs + // Load 6 args into correspondent registers. + MOVQ 0(R11), DI + MOVQ 8(R11), SI + MOVQ 16(R11), DX + MOVQ 24(R11), CX + MOVQ 32(R11), R8 + MOVQ 40(R11), R9 +skipargs: + + // Call SysV function + CALL AX + + // Return result + POPQ DI + MOVQ AX, libcall_r1(DI) + MOVQ DX, libcall_r2(DI) + + get_tls(CX) + MOVQ g(CX), BX + CMPQ BX, $0 + JEQ skiperrno2 + MOVQ g_m(BX), BX + MOVQ (m_mOS+mOS_perrno)(BX), AX + CMPQ AX, $0 + JEQ skiperrno2 + MOVL 0(AX), AX + MOVQ AX, libcall_err(DI) + +skiperrno2: + RET + +// uint32 tstart_sysvicall(M *newm); +TEXT runtime·tstart_sysvicall(SB),NOSPLIT,$0 + // DI contains first arg newm + MOVQ m_g0(DI), DX // g + + // Make TLS entries point at g and m. + get_tls(BX) + MOVQ DX, g(BX) + MOVQ DI, g_m(DX) + + // Layout new m scheduler stack on os stack. + MOVQ SP, AX + MOVQ AX, (g_stack+stack_hi)(DX) + SUBQ $(0x100000), AX // stack size + MOVQ AX, (g_stack+stack_lo)(DX) + ADDQ $const__StackGuard, AX + MOVQ AX, g_stackguard0(DX) + MOVQ AX, g_stackguard1(DX) + + // Someday the convention will be D is always cleared. 
+ CLD + + CALL runtime·stackcheck(SB) // clobbers AX,CX + CALL runtime·mstart(SB) + + XORL AX, AX // return 0 == success + MOVL AX, ret+8(FP) + RET + +// Careful, this is called by __sighndlr, a libc function. We must preserve +// registers as per AMD 64 ABI. +TEXT runtime·sigtramp(SB),NOSPLIT|TOPFRAME,$0 + // Note that we are executing on altsigstack here, so we have + // more stack available than NOSPLIT would have us believe. + // To defeat the linker, we make our own stack frame with + // more space: + SUBQ $184, SP + + // save registers + MOVQ BX, 32(SP) + MOVQ BP, 40(SP) + MOVQ R12, 48(SP) + MOVQ R13, 56(SP) + MOVQ R14, 64(SP) + MOVQ R15, 72(SP) + + get_tls(BX) + // check that g exists + MOVQ g(BX), R10 + CMPQ R10, $0 + JNE allgood + MOVQ SI, 80(SP) + MOVQ DX, 88(SP) + LEAQ 80(SP), AX + MOVQ DI, 0(SP) + MOVQ AX, 8(SP) + MOVQ $runtime·badsignal(SB), AX + CALL AX + JMP exit + +allgood: + // Save m->libcall and m->scratch. We need to do this because we + // might get interrupted by a signal in runtime·asmcgocall. + + // save m->libcall + MOVQ g_m(R10), BP + LEAQ m_libcall(BP), R11 + MOVQ libcall_fn(R11), R10 + MOVQ R10, 88(SP) + MOVQ libcall_args(R11), R10 + MOVQ R10, 96(SP) + MOVQ libcall_n(R11), R10 + MOVQ R10, 104(SP) + MOVQ libcall_r1(R11), R10 + MOVQ R10, 168(SP) + MOVQ libcall_r2(R11), R10 + MOVQ R10, 176(SP) + + // save m->scratch + LEAQ (m_mOS+mOS_scratch)(BP), R11 + MOVQ 0(R11), R10 + MOVQ R10, 112(SP) + MOVQ 8(R11), R10 + MOVQ R10, 120(SP) + MOVQ 16(R11), R10 + MOVQ R10, 128(SP) + MOVQ 24(R11), R10 + MOVQ R10, 136(SP) + MOVQ 32(R11), R10 + MOVQ R10, 144(SP) + MOVQ 40(R11), R10 + MOVQ R10, 152(SP) + + // save errno, it might be EINTR; stuff we do here might reset it. 
+ MOVQ (m_mOS+mOS_perrno)(BP), R10 + MOVL 0(R10), R10 + MOVQ R10, 160(SP) + + // prepare call + MOVQ DI, 0(SP) + MOVQ SI, 8(SP) + MOVQ DX, 16(SP) + CALL runtime·sigtrampgo(SB) + + get_tls(BX) + MOVQ g(BX), BP + MOVQ g_m(BP), BP + // restore libcall + LEAQ m_libcall(BP), R11 + MOVQ 88(SP), R10 + MOVQ R10, libcall_fn(R11) + MOVQ 96(SP), R10 + MOVQ R10, libcall_args(R11) + MOVQ 104(SP), R10 + MOVQ R10, libcall_n(R11) + MOVQ 168(SP), R10 + MOVQ R10, libcall_r1(R11) + MOVQ 176(SP), R10 + MOVQ R10, libcall_r2(R11) + + // restore scratch + LEAQ (m_mOS+mOS_scratch)(BP), R11 + MOVQ 112(SP), R10 + MOVQ R10, 0(R11) + MOVQ 120(SP), R10 + MOVQ R10, 8(R11) + MOVQ 128(SP), R10 + MOVQ R10, 16(R11) + MOVQ 136(SP), R10 + MOVQ R10, 24(R11) + MOVQ 144(SP), R10 + MOVQ R10, 32(R11) + MOVQ 152(SP), R10 + MOVQ R10, 40(R11) + + // restore errno + MOVQ (m_mOS+mOS_perrno)(BP), R11 + MOVQ 160(SP), R10 + MOVL R10, 0(R11) + +exit: + // restore registers + MOVQ 32(SP), BX + MOVQ 40(SP), BP + MOVQ 48(SP), R12 + MOVQ 56(SP), R13 + MOVQ 64(SP), R14 + MOVQ 72(SP), R15 + + ADDQ $184, SP + RET + +TEXT runtime·sigfwd(SB),NOSPLIT,$0-32 + MOVQ fn+0(FP), AX + MOVL sig+8(FP), DI + MOVQ info+16(FP), SI + MOVQ ctx+24(FP), DX + PUSHQ BP + MOVQ SP, BP + ANDQ $~15, SP // alignment for x86_64 ABI + CALL AX + MOVQ BP, SP + POPQ BP + RET + +// Called from runtime·usleep (Go). Can be called on Go stack, on OS stack, +// can also be called in cgo callback path without a g->m. +TEXT runtime·usleep1(SB),NOSPLIT,$0 + MOVL usec+0(FP), DI + MOVQ $usleep2<>(SB), AX // to hide from 6l + + // Execute call on m->g0. + get_tls(R15) + CMPQ R15, $0 + JE noswitch + + MOVQ g(R15), R13 + CMPQ R13, $0 + JE noswitch + MOVQ g_m(R13), R13 + CMPQ R13, $0 + JE noswitch + // TODO(aram): do something about the cpu profiler here. + + MOVQ m_g0(R13), R14 + CMPQ g(R15), R14 + JNE switch + // executing on m->g0 already + CALL AX + RET + +switch: + // Switch to m->g0 stack and back. 
+ MOVQ (g_sched+gobuf_sp)(R14), R14 + MOVQ SP, -8(R14) + LEAQ -8(R14), SP + CALL AX + MOVQ 0(SP), SP + RET + +noswitch: + // Not a Go-managed thread. Do not switch stack. + CALL AX + RET + +// Runs on OS stack. duration (in µs units) is in DI. +TEXT usleep2<>(SB),NOSPLIT,$0 + LEAQ libc_usleep(SB), AX + CALL AX + RET + +// Runs on OS stack, called from runtime·osyield. +TEXT runtime·osyield1(SB),NOSPLIT,$0 + LEAQ libc_sched_yield(SB), AX + CALL AX + RET diff --git a/src/runtime/sys_wasm.go b/src/runtime/sys_wasm.go new file mode 100644 index 0000000..bf57569 --- /dev/null +++ b/src/runtime/sys_wasm.go @@ -0,0 +1,35 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +type m0Stack struct { + _ [8192 * sys.StackGuardMultiplier]byte +} + +var wasmStack m0Stack + +func wasmDiv() + +func wasmTruncS() +func wasmTruncU() + +func wasmExit(code int32) + +// adjust Gobuf as it if executed a call to fn with context ctxt +// and then stopped before the first instruction in fn. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + sp := buf.sp + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = buf.pc + buf.sp = sp + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/sys_wasm.s b/src/runtime/sys_wasm.s new file mode 100644 index 0000000..f706e00 --- /dev/null +++ b/src/runtime/sys_wasm.s @@ -0,0 +1,135 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "textflag.h" + +TEXT runtime·wasmDiv(SB), NOSPLIT, $0-0 + Get R0 + I64Const $-0x8000000000000000 + I64Eq + If + Get R1 + I64Const $-1 + I64Eq + If + I64Const $-0x8000000000000000 + Return + End + End + Get R0 + Get R1 + I64DivS + Return + +TEXT runtime·wasmTruncS(SB), NOSPLIT, $0-0 + Get R0 + Get R0 + F64Ne // NaN + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + F64Const $0x7ffffffffffffc00p0 // Maximum truncated representation of 0x7fffffffffffffff + F64Gt + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + F64Const $-0x7ffffffffffffc00p0 // Minimum truncated representation of -0x8000000000000000 + F64Lt + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + I64TruncF64S + Return + +TEXT runtime·wasmTruncU(SB), NOSPLIT, $0-0 + Get R0 + Get R0 + F64Ne // NaN + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + F64Const $0xfffffffffffff800p0 // Maximum truncated representation of 0xffffffffffffffff + F64Gt + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + F64Const $0. 
+ F64Lt + If + I64Const $0x8000000000000000 + Return + End + + Get R0 + I64TruncF64U + Return + +TEXT runtime·exitThread(SB), NOSPLIT, $0-0 + UNDEF + +TEXT runtime·osyield(SB), NOSPLIT, $0-0 + UNDEF + +TEXT runtime·usleep(SB), NOSPLIT, $0-0 + RET // TODO(neelance): implement usleep + +TEXT runtime·currentMemory(SB), NOSPLIT, $0 + Get SP + CurrentMemory + I32Store ret+0(FP) + RET + +TEXT runtime·growMemory(SB), NOSPLIT, $0 + Get SP + I32Load pages+0(FP) + GrowMemory + I32Store ret+8(FP) + RET + +TEXT ·resetMemoryDataView(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·wasmExit(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·wasmWrite(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·nanotime1(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·walltime(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·scheduleTimeoutEvent(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·clearTimeoutEvent(SB), NOSPLIT, $0 + CallImport + RET + +TEXT ·getRandomData(SB), NOSPLIT, $0 + CallImport + RET diff --git a/src/runtime/sys_windows_386.s b/src/runtime/sys_windows_386.s new file mode 100644 index 0000000..cf3a439 --- /dev/null +++ b/src/runtime/sys_windows_386.s @@ -0,0 +1,358 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "time_windows.h" + +// void runtime·asmstdcall(void *c); +TEXT runtime·asmstdcall(SB),NOSPLIT,$0 + MOVL fn+0(FP), BX + + // SetLastError(0). + MOVL $0, 0x34(FS) + + // Copy args to the stack. + MOVL SP, BP + MOVL libcall_n(BX), CX // words + MOVL CX, AX + SALL $2, AX + SUBL AX, SP // room for args + MOVL SP, DI + MOVL libcall_args(BX), SI + CLD + REP; MOVSL + + // Call stdcall or cdecl function. + // DI SI BP BX are preserved, SP is not + CALL libcall_fn(BX) + MOVL BP, SP + + // Return result. + MOVL fn+0(FP), BX + MOVL AX, libcall_r1(BX) + MOVL DX, libcall_r2(BX) + + // GetLastError(). 
+ MOVL 0x34(FS), AX + MOVL AX, libcall_err(BX) + + RET + +TEXT runtime·badsignal2(SB),NOSPLIT,$24 + // stderr + MOVL $-12, 0(SP) + MOVL SP, BP + CALL *runtime·_GetStdHandle(SB) + MOVL BP, SP + + MOVL AX, 0(SP) // handle + MOVL $runtime·badsignalmsg(SB), DX // pointer + MOVL DX, 4(SP) + MOVL runtime·badsignallen(SB), DX // count + MOVL DX, 8(SP) + LEAL 20(SP), DX // written count + MOVL $0, 0(DX) + MOVL DX, 12(SP) + MOVL $0, 16(SP) // overlapped + CALL *runtime·_WriteFile(SB) + + // Does not return. + CALL runtime·abort(SB) + RET + +// faster get/set last error +TEXT runtime·getlasterror(SB),NOSPLIT,$0 + MOVL 0x34(FS), AX + MOVL AX, ret+0(FP) + RET + +// Called by Windows as a Vectored Exception Handler (VEH). +// First argument is pointer to struct containing +// exception record and context pointers. +// Handler function is stored in AX. +// Return 0 for 'not handled', -1 for handled. +TEXT sigtramp<>(SB),NOSPLIT,$0-0 + MOVL ptrs+0(FP), CX + SUBL $40, SP + + // save callee-saved registers + MOVL BX, 28(SP) + MOVL BP, 16(SP) + MOVL SI, 20(SP) + MOVL DI, 24(SP) + + MOVL AX, SI // save handler address + + // find g + get_tls(DX) + CMPL DX, $0 + JNE 3(PC) + MOVL $0, AX // continue + JMP done + MOVL g(DX), DX + CMPL DX, $0 + JNE 2(PC) + CALL runtime·badsignal2(SB) + + // save g in case of stack switch + MOVL DX, 32(SP) // g + MOVL SP, 36(SP) + + // do we need to switch to the g0 stack? + MOVL g_m(DX), BX + MOVL m_g0(BX), BX + CMPL DX, BX + JEQ g0 + + // switch to the g0 stack + get_tls(BP) + MOVL BX, g(BP) + MOVL (g_sched+gobuf_sp)(BX), DI + // make room for sighandler arguments + // and re-save old SP for restoring later. + // (note that the 36(DI) here must match the 36(SP) above.) 
+ SUBL $40, DI + MOVL SP, 36(DI) + MOVL DI, SP + +g0: + MOVL 0(CX), BX // ExceptionRecord* + MOVL 4(CX), CX // Context* + MOVL BX, 0(SP) + MOVL CX, 4(SP) + MOVL DX, 8(SP) + CALL SI // call handler + // AX is set to report result back to Windows + MOVL 12(SP), AX + + // switch back to original stack and g + // no-op if we never left. + MOVL 36(SP), SP + MOVL 32(SP), DX // note: different SP + get_tls(BP) + MOVL DX, g(BP) + +done: + // restore callee-saved registers + MOVL 24(SP), DI + MOVL 20(SP), SI + MOVL 16(SP), BP + MOVL 28(SP), BX + + ADDL $40, SP + // RET 4 (return and pop 4 bytes parameters) + BYTE $0xC2; WORD $4 + RET // unreached; make assembler happy + +TEXT runtime·exceptiontramp(SB),NOSPLIT,$0 + MOVL $runtime·exceptionhandler(SB), AX + JMP sigtramp<>(SB) + +TEXT runtime·firstcontinuetramp(SB),NOSPLIT,$0-0 + // is never called + INT $3 + +TEXT runtime·lastcontinuetramp(SB),NOSPLIT,$0-0 + MOVL $runtime·lastcontinuehandler(SB), AX + JMP sigtramp<>(SB) + +GLOBL runtime·cbctxts(SB), NOPTR, $4 + +TEXT runtime·callbackasm1(SB),NOSPLIT,$0 + MOVL 0(SP), AX // will use to find our callback context + + // remove return address from stack, we are not returning to callbackasm, but to its caller. + ADDL $4, SP + + // address to callback parameters into CX + LEAL 4(SP), CX + + // save registers as required for windows callback + PUSHL DI + PUSHL SI + PUSHL BP + PUSHL BX + + // Go ABI requires DF flag to be cleared. + CLD + + // determine index into runtime·cbs table + SUBL $runtime·callbackasm(SB), AX + MOVL $0, DX + MOVL $5, BX // divide by 5 because each call instruction in runtime·callbacks is 5 bytes long + DIVL BX + SUBL $1, AX // subtract 1 because return PC is to the next slot + + // Create a struct callbackArgs on our stack. 
+ SUBL $(12+callbackArgs__size), SP + MOVL AX, (12+callbackArgs_index)(SP) // callback index + MOVL CX, (12+callbackArgs_args)(SP) // address of args vector + MOVL $0, (12+callbackArgs_result)(SP) // result + LEAL 12(SP), AX // AX = &callbackArgs{...} + + // Call cgocallback, which will call callbackWrap(frame). + MOVL $0, 8(SP) // context + MOVL AX, 4(SP) // frame (address of callbackArgs) + LEAL ·callbackWrap(SB), AX + MOVL AX, 0(SP) // PC of function to call + CALL runtime·cgocallback(SB) + + // Get callback result. + MOVL (12+callbackArgs_result)(SP), AX + // Get popRet. + MOVL (12+callbackArgs_retPop)(SP), CX // Can't use a callee-save register + ADDL $(12+callbackArgs__size), SP + + // restore registers as required for windows callback + POPL BX + POPL BP + POPL SI + POPL DI + + // remove callback parameters before return (as per Windows spec) + POPL DX + ADDL CX, SP + PUSHL DX + + CLD + + RET + +// void tstart(M *newm); +TEXT tstart<>(SB),NOSPLIT,$0 + MOVL newm+0(FP), CX // m + MOVL m_g0(CX), DX // g + + // Layout new m scheduler stack on os stack. + MOVL SP, AX + MOVL AX, (g_stack+stack_hi)(DX) + SUBL $(64*1024), AX // initial stack size (adjusted later) + MOVL AX, (g_stack+stack_lo)(DX) + ADDL $const__StackGuard, AX + MOVL AX, g_stackguard0(DX) + MOVL AX, g_stackguard1(DX) + + // Set up tls. + LEAL m_tls(CX), SI + MOVL SI, 0x14(FS) + MOVL CX, g_m(DX) + MOVL DX, g(SI) + + // Someday the convention will be D is always cleared. + CLD + + CALL runtime·stackcheck(SB) // clobbers AX,CX + CALL runtime·mstart(SB) + + RET + +// uint32 tstart_stdcall(M *newm); +TEXT runtime·tstart_stdcall(SB),NOSPLIT,$0 + MOVL newm+0(FP), BX + + PUSHL BX + CALL tstart<>(SB) + POPL BX + + // Adjust stack for stdcall to return properly. 
+ MOVL (SP), AX // save return address + ADDL $4, SP // remove single parameter + MOVL AX, (SP) // restore return address + + XORL AX, AX // return 0 == success + + RET + +// setldt(int entry, int address, int limit) +TEXT runtime·setldt(SB),NOSPLIT,$0 + MOVL base+4(FP), CX + MOVL CX, 0x14(FS) + RET + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g may be nil. +TEXT runtime·usleep2(SB),NOSPLIT,$20-4 + MOVL dt+0(FP), BX + MOVL $-1, hi-4(SP) + MOVL BX, lo-8(SP) + LEAL lo-8(SP), BX + MOVL BX, ptime-12(SP) + MOVL $0, alertable-16(SP) + MOVL $-1, handle-20(SP) + MOVL SP, BP + MOVL runtime·_NtWaitForSingleObject(SB), AX + CALL AX + MOVL BP, SP + RET + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g is valid. +TEXT runtime·usleep2HighRes(SB),NOSPLIT,$36-4 + MOVL dt+0(FP), BX + MOVL $-1, hi-4(SP) + MOVL BX, lo-8(SP) + + get_tls(CX) + MOVL g(CX), CX + MOVL g_m(CX), CX + MOVL (m_mOS+mOS_highResTimer)(CX), CX + MOVL CX, saved_timer-12(SP) + + MOVL $0, fResume-16(SP) + MOVL $0, lpArgToCompletionRoutine-20(SP) + MOVL $0, pfnCompletionRoutine-24(SP) + MOVL $0, lPeriod-28(SP) + LEAL lo-8(SP), BX + MOVL BX, lpDueTime-32(SP) + MOVL CX, hTimer-36(SP) + MOVL SP, BP + MOVL runtime·_SetWaitableTimer(SB), AX + CALL AX + MOVL BP, SP + + MOVL $0, ptime-28(SP) + MOVL $0, alertable-32(SP) + MOVL saved_timer-12(SP), CX + MOVL CX, handle-36(SP) + MOVL SP, BP + MOVL runtime·_NtWaitForSingleObject(SB), AX + CALL AX + MOVL BP, SP + + RET + +// Runs on OS stack. 
+TEXT runtime·switchtothread(SB),NOSPLIT,$0 + MOVL SP, BP + MOVL runtime·_SwitchToThread(SB), AX + CALL AX + MOVL BP, SP + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$0-8 + CMPB runtime·useQPCTime(SB), $0 + JNE useQPC +loop: + MOVL (_INTERRUPT_TIME+time_hi1), AX + MOVL (_INTERRUPT_TIME+time_lo), CX + MOVL (_INTERRUPT_TIME+time_hi2), DI + CMPL AX, DI + JNE loop + + // wintime = DI:CX, multiply by 100 + MOVL $100, AX + MULL CX + IMULL $100, DI + ADDL DI, DX + // wintime*100 = DX:AX + MOVL AX, ret_lo+0(FP) + MOVL DX, ret_hi+4(FP) + RET +useQPC: + JMP runtime·nanotimeQPC(SB) + RET diff --git a/src/runtime/sys_windows_amd64.s b/src/runtime/sys_windows_amd64.s new file mode 100644 index 0000000..4027770 --- /dev/null +++ b/src/runtime/sys_windows_amd64.s @@ -0,0 +1,445 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "time_windows.h" +#include "cgo/abi_amd64.h" + +// Offsets into Thread Environment Block (pointer in GS) +#define TEB_TlsSlots 0x1480 +#define TEB_ArbitraryPtr 0x28 + +// void runtime·asmstdcall(void *c); +TEXT runtime·asmstdcall(SB),NOSPLIT|NOFRAME,$0 + // asmcgocall will put first argument into CX. + PUSHQ CX // save for later + MOVQ libcall_fn(CX), AX + MOVQ libcall_args(CX), SI + MOVQ libcall_n(CX), CX + + // SetLastError(0). + MOVQ 0x30(GS), DI + MOVL $0, 0x68(DI) + + SUBQ $(const_maxArgs*8), SP // room for args + + // Fast version, do not store args on the stack. + CMPL CX, $4 + JLE loadregs + + // Check we have enough room for args. + CMPL CX, $const_maxArgs + JLE 2(PC) + INT $3 // not enough room -> crash + + // Copy args to the stack. + MOVQ SP, DI + CLD + REP; MOVSQ + MOVQ SP, SI + +loadregs: + // Load first 4 args into correspondent registers. 
+ MOVQ 0(SI), CX + MOVQ 8(SI), DX + MOVQ 16(SI), R8 + MOVQ 24(SI), R9 + // Floating point arguments are passed in the XMM + // registers. Set them here in case any of the arguments + // are floating point values. For details see + // https://msdn.microsoft.com/en-us/library/zthk2dkh.aspx + MOVQ CX, X0 + MOVQ DX, X1 + MOVQ R8, X2 + MOVQ R9, X3 + + // Call stdcall function. + CALL AX + + ADDQ $(const_maxArgs*8), SP + + // Return result. + POPQ CX + MOVQ AX, libcall_r1(CX) + // Floating point return values are returned in XMM0. Setting r2 to this + // value in case this call returned a floating point value. For details, + // see https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention + MOVQ X0, libcall_r2(CX) + + // GetLastError(). + MOVQ 0x30(GS), DI + MOVL 0x68(DI), AX + MOVQ AX, libcall_err(CX) + + RET + +TEXT runtime·badsignal2(SB),NOSPLIT|NOFRAME,$48 + // stderr + MOVQ $-12, CX // stderr + MOVQ CX, 0(SP) + MOVQ runtime·_GetStdHandle(SB), AX + CALL AX + + MOVQ AX, CX // handle + MOVQ CX, 0(SP) + MOVQ $runtime·badsignalmsg(SB), DX // pointer + MOVQ DX, 8(SP) + MOVL $runtime·badsignallen(SB), R8 // count + MOVQ R8, 16(SP) + LEAQ 40(SP), R9 // written count + MOVQ $0, 0(R9) + MOVQ R9, 24(SP) + MOVQ $0, 32(SP) // overlapped + MOVQ runtime·_WriteFile(SB), AX + CALL AX + + // Does not return. + CALL runtime·abort(SB) + RET + +// faster get/set last error +TEXT runtime·getlasterror(SB),NOSPLIT,$0 + MOVQ 0x30(GS), AX + MOVL 0x68(AX), AX + MOVL AX, ret+0(FP) + RET + +// Called by Windows as a Vectored Exception Handler (VEH). +// First argument is pointer to struct containing +// exception record and context pointers. +// Handler function is stored in AX. +// Return 0 for 'not handled', -1 for handled. +TEXT sigtramp<>(SB),NOSPLIT|NOFRAME,$0-0 + // CX: PEXCEPTION_POINTERS ExceptionInfo + + // Switch from the host ABI to the Go ABI. + PUSH_REGS_HOST_TO_ABI0() + // Make stack space for the rest of the function. 
+ ADJSP $48 + + MOVQ CX, R13 // save exception address + MOVQ AX, R15 // save handler address + + // find g + get_tls(DX) + CMPQ DX, $0 + JNE 3(PC) + MOVQ $0, AX // continue + JMP done + MOVQ g(DX), DX + CMPQ DX, $0 + JNE 2(PC) + CALL runtime·badsignal2(SB) + + // save g and SP in case of stack switch + MOVQ DX, 32(SP) // g + MOVQ SP, 40(SP) + + // do we need to switch to the g0 stack? + MOVQ g_m(DX), BX + MOVQ m_g0(BX), BX + CMPQ DX, BX + JEQ g0 + + // switch to g0 stack + get_tls(BP) + MOVQ BX, g(BP) + MOVQ (g_sched+gobuf_sp)(BX), DI + // make room for sighandler arguments + // and re-save old SP for restoring later. + // Adjust g0 stack by the space we're using and + // save SP at the same place on the g0 stack. + // The 40(DI) here must match the 40(SP) above. + SUBQ $(REGS_HOST_TO_ABI0_STACK + 48), DI + MOVQ SP, 40(DI) + MOVQ DI, SP + +g0: + MOVQ 0(R13), BX // ExceptionRecord* + MOVQ 8(R13), CX // Context* + MOVQ BX, 0(SP) + MOVQ CX, 8(SP) + MOVQ DX, 16(SP) + CALL R15 // call handler + // AX is set to report result back to Windows + MOVL 24(SP), AX + + MOVQ SP, DI // save g0 SP + + // switch back to original stack and g + // no-op if we never left. + MOVQ 40(SP), SP + MOVQ 32(SP), DX + get_tls(BP) + MOVQ DX, g(BP) + + // if return value is CONTINUE_SEARCH, do not set up control + // flow guard workaround. + CMPQ AX, $0 + JEQ done + + // Check if we need to set up the control flow guard workaround. + // On Windows, the stack pointer in the context must lie within + // system stack limits when we resume from exception. + // Store the resume SP and PC in alternate registers + // and return to sigresume on the g0 stack. + // sigresume makes no use of the stack at all, + // loading SP from R8 and jumping to R9. + // Note that smashing R8 and R9 is only safe because we know sigpanic + // will not actually return to the original frame, so the registers + // are effectively dead. But this does mean we can't use the + // same mechanism for async preemption. 
+ MOVQ 8(R13), CX + MOVQ $sigresume<>(SB), BX + CMPQ BX, context_rip(CX) + JEQ done // do not clobber saved SP/PC + + // Save resume SP and PC into R8, R9. + MOVQ context_rsp(CX), BX + MOVQ BX, context_r8(CX) + MOVQ context_rip(CX), BX + MOVQ BX, context_r9(CX) + + // Set up context record to return to sigresume on g0 stack + MOVD DI, BX + MOVD BX, context_rsp(CX) + MOVD $sigresume<>(SB), BX + MOVD BX, context_rip(CX) + +done: + ADJSP $-48 + POP_REGS_HOST_TO_ABI0() + + RET + +// Trampoline to resume execution from exception handler. +// This is part of the control flow guard workaround. +// It switches stacks and jumps to the continuation address. +// R8 and R9 are set above at the end of sigtramp<> +// in the context that starts executing at sigresume<>. +TEXT sigresume<>(SB),NOSPLIT|NOFRAME,$0 + MOVQ R8, SP + JMP R9 + +TEXT runtime·exceptiontramp(SB),NOSPLIT|NOFRAME,$0 + MOVQ $runtime·exceptionhandler(SB), AX + JMP sigtramp<>(SB) + +TEXT runtime·firstcontinuetramp(SB),NOSPLIT|NOFRAME,$0-0 + MOVQ $runtime·firstcontinuehandler(SB), AX + JMP sigtramp<>(SB) + +TEXT runtime·lastcontinuetramp(SB),NOSPLIT|NOFRAME,$0-0 + MOVQ $runtime·lastcontinuehandler(SB), AX + JMP sigtramp<>(SB) + +GLOBL runtime·cbctxts(SB), NOPTR, $8 + +TEXT runtime·callbackasm1(SB),NOSPLIT,$0 + // Construct args vector for cgocallback(). + // By windows/amd64 calling convention first 4 args are in CX, DX, R8, R9 + // args from the 5th on are on the stack. + // In any case, even if function has 0,1,2,3,4 args, there is reserved + // but uninitialized "shadow space" for the first 4 args. + // The values are in registers. + MOVQ CX, (16+0)(SP) + MOVQ DX, (16+8)(SP) + MOVQ R8, (16+16)(SP) + MOVQ R9, (16+24)(SP) + // R8 = address of args vector + LEAQ (16+0)(SP), R8 + + // remove return address from stack, we are not returning to callbackasm, but to its caller. 
+ MOVQ 0(SP), AX + ADDQ $8, SP + + // determine index into runtime·cbs table + MOVQ $runtime·callbackasm(SB), DX + SUBQ DX, AX + MOVQ $0, DX + MOVQ $5, CX // divide by 5 because each call instruction in runtime·callbacks is 5 bytes long + DIVL CX + SUBQ $1, AX // subtract 1 because return PC is to the next slot + + // Switch from the host ABI to the Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // Create a struct callbackArgs on our stack to be passed as + // the "frame" to cgocallback and on to callbackWrap. + SUBQ $(24+callbackArgs__size), SP + MOVQ AX, (24+callbackArgs_index)(SP) // callback index + MOVQ R8, (24+callbackArgs_args)(SP) // address of args vector + MOVQ $0, (24+callbackArgs_result)(SP) // result + LEAQ 24(SP), AX + // Call cgocallback, which will call callbackWrap(frame). + MOVQ $0, 16(SP) // context + MOVQ AX, 8(SP) // frame (address of callbackArgs) + LEAQ ·callbackWrap<ABIInternal>(SB), BX // cgocallback takes an ABIInternal entry-point + MOVQ BX, 0(SP) // PC of function value to call (callbackWrap) + CALL ·cgocallback(SB) + // Get callback result. + MOVQ (24+callbackArgs_result)(SP), AX + ADDQ $(24+callbackArgs__size), SP + + POP_REGS_HOST_TO_ABI0() + + // The return value was placed in AX above. + RET + +// uint32 tstart_stdcall(M *newm); +TEXT runtime·tstart_stdcall(SB),NOSPLIT,$0 + // Switch from the host ABI to the Go ABI. + PUSH_REGS_HOST_TO_ABI0() + + // CX contains first arg newm + MOVQ m_g0(CX), DX // g + + // Layout new m scheduler stack on os stack. + MOVQ SP, AX + MOVQ AX, (g_stack+stack_hi)(DX) + SUBQ $(64*1024), AX // initial stack size (adjusted later) + MOVQ AX, (g_stack+stack_lo)(DX) + ADDQ $const__StackGuard, AX + MOVQ AX, g_stackguard0(DX) + MOVQ AX, g_stackguard1(DX) + + // Set up tls. 
+ LEAQ m_tls(CX), DI + MOVQ CX, g_m(DX) + MOVQ DX, g(DI) + CALL runtime·settls(SB) // clobbers CX + + CALL runtime·stackcheck(SB) // clobbers AX,CX + CALL runtime·mstart(SB) + + POP_REGS_HOST_TO_ABI0() + + XORL AX, AX // return 0 == success + RET + +// set tls base to DI +TEXT runtime·settls(SB),NOSPLIT,$0 + MOVQ runtime·tls_g(SB), CX + MOVQ DI, 0(CX)(GS) + RET + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g may be nil. +// The function leaves room for 4 syscall parameters +// (as per windows amd64 calling convention). +TEXT runtime·usleep2(SB),NOSPLIT|NOFRAME,$48-4 + MOVLQSX dt+0(FP), BX + MOVQ SP, AX + ANDQ $~15, SP // alignment as per Windows requirement + MOVQ AX, 40(SP) + LEAQ 32(SP), R8 // ptime + MOVQ BX, (R8) + MOVQ $-1, CX // handle + MOVQ $0, DX // alertable + MOVQ runtime·_NtWaitForSingleObject(SB), AX + CALL AX + MOVQ 40(SP), SP + RET + +// Runs on OS stack. duration (in -100ns units) is in dt+0(FP). +// g is valid. +TEXT runtime·usleep2HighRes(SB),NOSPLIT|NOFRAME,$72-4 + MOVLQSX dt+0(FP), BX + get_tls(CX) + + MOVQ SP, AX + ANDQ $~15, SP // alignment as per Windows requirement + MOVQ AX, 64(SP) + + MOVQ g(CX), CX + MOVQ g_m(CX), CX + MOVQ (m_mOS+mOS_highResTimer)(CX), CX // hTimer + MOVQ CX, 48(SP) // save hTimer for later + LEAQ 56(SP), DX // lpDueTime + MOVQ BX, (DX) + MOVQ $0, R8 // lPeriod + MOVQ $0, R9 // pfnCompletionRoutine + MOVQ $0, AX + MOVQ AX, 32(SP) // lpArgToCompletionRoutine + MOVQ AX, 40(SP) // fResume + MOVQ runtime·_SetWaitableTimer(SB), AX + CALL AX + + MOVQ 48(SP), CX // handle + MOVQ $0, DX // alertable + MOVQ $0, R8 // ptime + MOVQ runtime·_NtWaitForSingleObject(SB), AX + CALL AX + + MOVQ 64(SP), SP + RET + +// Runs on OS stack. 
+TEXT runtime·switchtothread(SB),NOSPLIT|NOFRAME,$0 + MOVQ SP, AX + ANDQ $~15, SP // alignment as per Windows requirement + SUBQ $(48), SP // room for SP and 4 args as per Windows requirement + // plus one extra word to keep stack 16 bytes aligned + MOVQ AX, 32(SP) + MOVQ runtime·_SwitchToThread(SB), AX + CALL AX + MOVQ 32(SP), SP + RET + +TEXT runtime·nanotime1(SB),NOSPLIT,$0-8 + CMPB runtime·useQPCTime(SB), $0 + JNE useQPC + MOVQ $_INTERRUPT_TIME, DI + MOVQ time_lo(DI), AX + IMULQ $100, AX + MOVQ AX, ret+0(FP) + RET +useQPC: + JMP runtime·nanotimeQPC(SB) + RET + +// func osSetupTLS(mp *m) +// Setup TLS. for use by needm on Windows. +TEXT runtime·osSetupTLS(SB),NOSPLIT,$0-8 + MOVQ mp+0(FP), AX + LEAQ m_tls(AX), DI + CALL runtime·settls(SB) + RET + +// This is called from rt0_go, which runs on the system stack +// using the initial stack allocated by the OS. +TEXT runtime·wintls(SB),NOSPLIT|NOFRAME,$0 + // Allocate a TLS slot to hold g across calls to external code + MOVQ SP, AX + ANDQ $~15, SP // alignment as per Windows requirement + SUBQ $48, SP // room for SP and 4 args as per Windows requirement + // plus one extra word to keep stack 16 bytes aligned + MOVQ AX, 32(SP) + MOVQ runtime·_TlsAlloc(SB), AX + CALL AX + MOVQ 32(SP), SP + + MOVQ AX, CX // TLS index + + // Assert that slot is less than 64 so we can use _TEB->TlsSlots + CMPQ CX, $64 + JB ok + + // Fallback to the TEB arbitrary pointer. + // TODO: don't use the arbitrary pointer (see go.dev/issue/59824) + MOVQ $TEB_ArbitraryPtr, CX + JMP settls +ok: + // Convert the TLS index at CX into + // an offset from TEB_TlsSlots. + SHLQ $3, CX + + // Save offset from TLS into tls_g. + ADDQ $TEB_TlsSlots, CX +settls: + MOVQ CX, runtime·tls_g(SB) + RET diff --git a/src/runtime/sys_windows_arm.s b/src/runtime/sys_windows_arm.s new file mode 100644 index 0000000..db6d8f1 --- /dev/null +++ b/src/runtime/sys_windows_arm.s @@ -0,0 +1,438 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "time_windows.h" + +// Note: For system ABI, R0-R3 are args, R4-R11 are callee-save. + +// void runtime·asmstdcall(void *c); +TEXT runtime·asmstdcall(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R4, R5, R14], (R13) // push {r4, r5, lr} + MOVW R0, R4 // put libcall * in r4 + MOVW R13, R5 // save stack pointer in r5 + + // SetLastError(0) + MOVW $0, R0 + MRC 15, 0, R1, C13, C0, 2 + MOVW R0, 0x34(R1) + + MOVW 8(R4), R12 // libcall->args + + // Do we have more than 4 arguments? + MOVW 4(R4), R0 // libcall->n + SUB.S $4, R0, R2 + BLE loadregs + + // Reserve stack space for remaining args + SUB R2<<2, R13 + BIC $0x7, R13 // alignment for ABI + + // R0: count of arguments + // R1: + // R2: loop counter, from 0 to (n-4) + // R3: scratch + // R4: pointer to libcall struct + // R12: libcall->args + MOVW $0, R2 +stackargs: + ADD $4, R2, R3 // r3 = args[4 + i] + MOVW R3<<2(R12), R3 + MOVW R3, R2<<2(R13) // stack[i] = r3 + + ADD $1, R2 // i++ + SUB $4, R0, R3 // while (i < (n - 4)) + CMP R3, R2 + BLT stackargs + +loadregs: + CMP $3, R0 + MOVW.GT 12(R12), R3 + + CMP $2, R0 + MOVW.GT 8(R12), R2 + + CMP $1, R0 + MOVW.GT 4(R12), R1 + + CMP $0, R0 + MOVW.GT 0(R12), R0 + + BIC $0x7, R13 // alignment for ABI + MOVW 0(R4), R12 // branch to libcall->fn + BL (R12) + + MOVW R5, R13 // free stack space + MOVW R0, 12(R4) // save return value to libcall->r1 + MOVW R1, 16(R4) + + // GetLastError + MRC 15, 0, R1, C13, C0, 2 + MOVW 0x34(R1), R0 + MOVW R0, 20(R4) // store in libcall->err + + MOVM.IA.W (R13), [R4, R5, R15] + +TEXT runtime·badsignal2(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R4, R14], (R13) // push {r4, lr} + MOVW R13, R4 // save original stack pointer + SUB $8, R13 // space for 2 variables + BIC $0x7, R13 // alignment for ABI + + // stderr + MOVW runtime·_GetStdHandle(SB), R1 + MOVW $-12, R0 + BL (R1) + + MOVW 
$runtime·badsignalmsg(SB), R1 // lpBuffer + MOVW $runtime·badsignallen(SB), R2 // lpNumberOfBytesToWrite + MOVW (R2), R2 + ADD $0x4, R13, R3 // lpNumberOfBytesWritten + MOVW $0, R12 // lpOverlapped + MOVW R12, (R13) + + MOVW runtime·_WriteFile(SB), R12 + BL (R12) + + // Does not return. + B runtime·abort(SB) + +TEXT runtime·getlasterror(SB),NOSPLIT,$0 + MRC 15, 0, R0, C13, C0, 2 + MOVW 0x34(R0), R0 + MOVW R0, ret+0(FP) + RET + +// Called by Windows as a Vectored Exception Handler (VEH). +// First argument is pointer to struct containing +// exception record and context pointers. +// Handler function is stored in R1 +// Return 0 for 'not handled', -1 for handled. +// int32_t sigtramp( +// PEXCEPTION_POINTERS ExceptionInfo, +// func *GoExceptionHandler); +TEXT sigtramp<>(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R0, R4-R11, R14], (R13) // push {r0, r4-r11, lr} (SP-=40) + SUB $(8+20), R13 // reserve space for g, sp, and + // parameters/retval to go call + + MOVW R0, R6 // Save param0 + MOVW R1, R7 // Save param1 + + BL runtime·load_g(SB) + CMP $0, g // is there a current g? + BNE g_ok + ADD $(8+20), R13 // free locals + MOVM.IA.W (R13), [R3, R4-R11, R14] // pop {r3, r4-r11, lr} + MOVW $0, R0 // continue + BEQ return + +g_ok: + + // save g and SP in case of stack switch + MOVW R13, 24(R13) + MOVW g, 20(R13) + + // do we need to switch to the g0 stack? + MOVW g, R5 // R5 = g + MOVW g_m(R5), R2 // R2 = m + MOVW m_g0(R2), R4 // R4 = g0 + CMP R5, R4 // if curg == g0 + BEQ g0 + + // switch to g0 stack + MOVW R4, g // g = g0 + MOVW (g_sched+gobuf_sp)(g), R3 // R3 = g->gobuf.sp + BL runtime·save_g(SB) + + // make room for sighandler arguments + // and re-save old SP for restoring later. + // (note that the 24(R3) here must match the 24(R13) above.) 
+ SUB $40, R3 + MOVW R13, 24(R3) // save old stack pointer + MOVW R3, R13 // switch stack + +g0: + MOVW 0(R6), R2 // R2 = ExceptionPointers->ExceptionRecord + MOVW 4(R6), R3 // R3 = ExceptionPointers->ContextRecord + + MOVW $0, R4 + MOVW R4, 0(R13) // No saved link register. + MOVW R2, 4(R13) // Move arg0 (ExceptionRecord) into position + MOVW R3, 8(R13) // Move arg1 (ContextRecord) into position + MOVW R5, 12(R13) // Move arg2 (original g) into position + BL (R7) // Call the goroutine + MOVW 16(R13), R4 // Fetch return value from stack + + // Save system stack pointer for sigresume setup below. + // The exact value does not matter - nothing is read or written + // from this address. It just needs to be on the system stack. + MOVW R13, R12 + + // switch back to original stack and g + MOVW 24(R13), R13 + MOVW 20(R13), g + BL runtime·save_g(SB) + +done: + MOVW R4, R0 // move retval into position + ADD $(8 + 20), R13 // free locals + MOVM.IA.W (R13), [R3, R4-R11, R14] // pop {r3, r4-r11, lr} + + // if return value is CONTINUE_SEARCH, do not set up control + // flow guard workaround + CMP $0, R0 + BEQ return + + // Check if we need to set up the control flow guard workaround. + // On Windows, the stack pointer in the context must lie within + // system stack limits when we resume from exception. + // Store the resume SP and PC on the g0 stack, + // and return to sigresume on the g0 stack. sigresume + // pops the saved PC and SP from the g0 stack, resuming execution + // at the desired location. + // If sigresume has already been set up by a previous exception + // handler, don't clobber the stored SP and PC on the stack. + MOVW 4(R3), R3 // PEXCEPTION_POINTERS->Context + MOVW context_pc(R3), R2 // load PC from context record + MOVW $sigresume<>(SB), R1 + CMP R1, R2 + B.EQ return // do not clobber saved SP/PC + + // Save resume SP and PC into R0, R1. 
+ MOVW context_spr(R3), R2 + MOVW R2, context_r0(R3) + MOVW context_pc(R3), R2 + MOVW R2, context_r1(R3) + + // Set up context record to return to sigresume on g0 stack + MOVW R12, context_spr(R3) + MOVW $sigresume<>(SB), R2 + MOVW R2, context_pc(R3) + +return: + B (R14) // return + +// Trampoline to resume execution from exception handler. +// This is part of the control flow guard workaround. +// It switches stacks and jumps to the continuation address. +// R0 and R1 are set above at the end of sigtramp<> +// in the context that starts executing at sigresume<>. +TEXT sigresume<>(SB),NOSPLIT|NOFRAME,$0 + // Important: do not smash LR, + // which is set to a live value when handling + // a signal by pushing a call to sigpanic onto the stack. + MOVW R0, R13 + B (R1) + +TEXT runtime·exceptiontramp(SB),NOSPLIT|NOFRAME,$0 + MOVW $runtime·exceptionhandler(SB), R1 + B sigtramp<>(SB) + +TEXT runtime·firstcontinuetramp(SB),NOSPLIT|NOFRAME,$0 + MOVW $runtime·firstcontinuehandler(SB), R1 + B sigtramp<>(SB) + +TEXT runtime·lastcontinuetramp(SB),NOSPLIT|NOFRAME,$0 + MOVW $runtime·lastcontinuehandler(SB), R1 + B sigtramp<>(SB) + +GLOBL runtime·cbctxts(SB), NOPTR, $4 + +TEXT runtime·callbackasm1(SB),NOSPLIT|NOFRAME,$0 + // On entry, the trampoline in zcallback_windows_arm.s left + // the callback index in R12 (which is volatile in the C ABI). + + // Push callback register arguments r0-r3. We do this first so + // they're contiguous with stack arguments. + MOVM.DB.W [R0-R3], (R13) + // Push C callee-save registers r4-r11 and lr. + MOVM.DB.W [R4-R11, R14], (R13) + SUB $(16 + callbackArgs__size), R13 // space for locals + + // Create a struct callbackArgs on our stack. + MOVW R12, (16+callbackArgs_index)(R13) // callback index + MOVW $(16+callbackArgs__size+4*9)(R13), R0 + MOVW R0, (16+callbackArgs_args)(R13) // address of args vector + MOVW $0, R0 + MOVW R0, (16+callbackArgs_result)(R13) // result + + // Prepare for entry to Go. 
+ BL runtime·load_g(SB) + + // Call cgocallback, which will call callbackWrap(frame). + MOVW $0, R0 + MOVW R0, 12(R13) // context + MOVW $16(R13), R1 // R1 = &callbackArgs{...} + MOVW R1, 8(R13) // frame (address of callbackArgs) + MOVW $·callbackWrap(SB), R1 + MOVW R1, 4(R13) // PC of function to call + BL runtime·cgocallback(SB) + + // Get callback result. + MOVW (16+callbackArgs_result)(R13), R0 + + ADD $(16 + callbackArgs__size), R13 // free locals + MOVM.IA.W (R13), [R4-R11, R12] // pop {r4-r11, lr=>r12} + ADD $(4*4), R13 // skip r0-r3 + B (R12) // return + +// uint32 tstart_stdcall(M *newm); +TEXT runtime·tstart_stdcall(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R4-R11, R14], (R13) // push {r4-r11, lr} + + MOVW m_g0(R0), g + MOVW R0, g_m(g) + BL runtime·save_g(SB) + + // Layout new m scheduler stack on os stack. + MOVW R13, R0 + MOVW R0, g_stack+stack_hi(g) + SUB $(64*1024), R0 + MOVW R0, (g_stack+stack_lo)(g) + MOVW R0, g_stackguard0(g) + MOVW R0, g_stackguard1(g) + + BL runtime·emptyfunc(SB) // fault if stack check is wrong + BL runtime·mstart(SB) + + // Exit the thread. + MOVW $0, R0 + MOVM.IA.W (R13), [R4-R11, R15] // pop {r4-r11, pc} + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g may be nil. +TEXT runtime·usleep2(SB),NOSPLIT|NOFRAME,$0-4 + MOVW dt+0(FP), R3 + MOVM.DB.W [R4, R14], (R13) // push {r4, lr} + MOVW R13, R4 // Save SP + SUB $8, R13 // R13 = R13 - 8 + BIC $0x7, R13 // Align SP for ABI + MOVW $0, R1 // R1 = FALSE (alertable) + MOVW $-1, R0 // R0 = handle + MOVW R13, R2 // R2 = pTime + MOVW R3, 0(R2) // time_lo + MOVW R0, 4(R2) // time_hi + MOVW runtime·_NtWaitForSingleObject(SB), R3 + BL (R3) + MOVW R4, R13 // Restore SP + MOVM.IA.W (R13), [R4, R15] // pop {R4, pc} + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g is valid. +// TODO: needs to be implemented properly. +TEXT runtime·usleep2HighRes(SB),NOSPLIT|NOFRAME,$0-4 + B runtime·abort(SB) + +// Runs on OS stack. 
+TEXT runtime·switchtothread(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R4, R14], (R13) // push {R4, lr} + MOVW R13, R4 + BIC $0x7, R13 // alignment for ABI + MOVW runtime·_SwitchToThread(SB), R0 + BL (R0) + MOVW R4, R13 // restore stack pointer + MOVM.IA.W (R13), [R4, R15] // pop {R4, pc} + +TEXT ·publicationBarrier(SB),NOSPLIT|NOFRAME,$0-0 + B runtime·armPublicationBarrier(SB) + +// never called (this is a GOARM=7 platform) +TEXT runtime·read_tls_fallback(SB),NOSPLIT|NOFRAME,$0 + MOVW $0xabcd, R0 + MOVW R0, (R0) + RET + +TEXT runtime·nanotime1(SB),NOSPLIT|NOFRAME,$0-8 + MOVW $0, R0 + MOVB runtime·useQPCTime(SB), R0 + CMP $0, R0 + BNE useQPC + MOVW $_INTERRUPT_TIME, R3 +loop: + MOVW time_hi1(R3), R1 + DMB MB_ISH + MOVW time_lo(R3), R0 + DMB MB_ISH + MOVW time_hi2(R3), R2 + CMP R1, R2 + BNE loop + + // wintime = R1:R0, multiply by 100 + MOVW $100, R2 + MULLU R0, R2, (R4, R3) // R4:R3 = R1:R0 * R2 + MULA R1, R2, R4, R4 + + // wintime*100 = R4:R3 + MOVW R3, ret_lo+0(FP) + MOVW R4, ret_hi+4(FP) + RET +useQPC: + B runtime·nanotimeQPC(SB) // tail call + +// save_g saves the g register (R10) into thread local memory +// so that we can call externally compiled +// ARM code that will overwrite those registers. +// NOTE: runtime.gogo assumes that R1 is preserved by this function. +// runtime.mcall assumes this function only clobbers R0 and R11. +// Returns with g in R0. +// Save the value in the _TEB->TlsSlots array. +// Effectively implements TlsSetValue(). +// tls_g stores the TLS slot allocated TlsAlloc(). +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0 + MRC 15, 0, R0, C13, C0, 2 + ADD $0xe10, R0 + MOVW $runtime·tls_g(SB), R11 + MOVW (R11), R11 + MOVW g, R11<<2(R0) + MOVW g, R0 // preserve R0 across call to setg<> + RET + +// load_g loads the g register from thread-local memory, +// for use after calling externally compiled +// ARM code that overwrote those registers. +// Get the value from the _TEB->TlsSlots array. +// Effectively implements TlsGetValue(). 
+TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0 + MRC 15, 0, R0, C13, C0, 2 + ADD $0xe10, R0 + MOVW $runtime·tls_g(SB), g + MOVW (g), g + MOVW g<<2(R0), g + RET + +// This is called from rt0_go, which runs on the system stack +// using the initial stack allocated by the OS. +// It calls back into standard C using the BL below. +// To do that, the stack pointer must be 8-byte-aligned. +TEXT runtime·_initcgo(SB),NOSPLIT|NOFRAME,$0 + MOVM.DB.W [R4, R14], (R13) // push {r4, lr} + + // Ensure stack is 8-byte aligned before calling C code + MOVW R13, R4 + BIC $0x7, R13 + + // Allocate a TLS slot to hold g across calls to external code + MOVW $runtime·_TlsAlloc(SB), R0 + MOVW (R0), R0 + BL (R0) + + // Assert that slot is less than 64 so we can use _TEB->TlsSlots + CMP $64, R0 + MOVW $runtime·abort(SB), R1 + BL.GE (R1) + + // Save Slot into tls_g + MOVW $runtime·tls_g(SB), R1 + MOVW R0, (R1) + + MOVW R4, R13 + MOVM.IA.W (R13), [R4, R15] // pop {r4, pc} + +// Holds the TLS Slot, which was allocated by TlsAlloc() +GLOBL runtime·tls_g+0(SB), NOPTR, $4 diff --git a/src/runtime/sys_windows_arm64.s b/src/runtime/sys_windows_arm64.s new file mode 100644 index 0000000..e3082a1 --- /dev/null +++ b/src/runtime/sys_windows_arm64.s @@ -0,0 +1,430 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" +#include "funcdata.h" +#include "time_windows.h" +#include "cgo/abi_arm64.h" + +// Offsets into Thread Environment Block (pointer in R18) +#define TEB_error 0x68 +#define TEB_TlsSlots 0x1480 +#define TEB_ArbitraryPtr 0x28 + +// Note: R0-R7 are args, R8 is indirect return value address, +// R9-R15 are caller-save, R19-R29 are callee-save. +// +// load_g and save_g (in tls_arm64.s) clobber R27 (REGTMP) and R0. 
+ +// void runtime·asmstdcall(void *c); +TEXT runtime·asmstdcall(SB),NOSPLIT|NOFRAME,$0 + STP.W (R29, R30), -32(RSP) // allocate C ABI stack frame + STP (R19, R20), 16(RSP) // save old R19, R20 + MOVD R0, R19 // save libcall pointer + MOVD RSP, R20 // save stack pointer + + // SetLastError(0) + MOVD $0, TEB_error(R18_PLATFORM) + MOVD libcall_args(R19), R12 // libcall->args + + // Do we have more than 8 arguments? + MOVD libcall_n(R19), R0 + CMP $0, R0; BEQ _0args + CMP $1, R0; BEQ _1args + CMP $2, R0; BEQ _2args + CMP $3, R0; BEQ _3args + CMP $4, R0; BEQ _4args + CMP $5, R0; BEQ _5args + CMP $6, R0; BEQ _6args + CMP $7, R0; BEQ _7args + CMP $8, R0; BEQ _8args + + // Reserve stack space for remaining args + SUB $8, R0, R2 + ADD $1, R2, R3 // make even number of words for stack alignment + AND $~1, R3 + LSL $3, R3 + SUB R3, RSP + + // R4: size of stack arguments (n-8)*8 + // R5: &args[8] + // R6: loop counter, from 0 to (n-8)*8 + // R7: scratch + // R8: copy of RSP - (R2)(RSP) assembles as (R2)(ZR) + SUB $8, R0, R4 + LSL $3, R4 + ADD $(8*8), R12, R5 + MOVD $0, R6 + MOVD RSP, R8 +stackargs: + MOVD (R6)(R5), R7 + MOVD R7, (R6)(R8) + ADD $8, R6 + CMP R6, R4 + BNE stackargs + +_8args: + MOVD (7*8)(R12), R7 +_7args: + MOVD (6*8)(R12), R6 +_6args: + MOVD (5*8)(R12), R5 +_5args: + MOVD (4*8)(R12), R4 +_4args: + MOVD (3*8)(R12), R3 +_3args: + MOVD (2*8)(R12), R2 +_2args: + MOVD (1*8)(R12), R1 +_1args: + MOVD (0*8)(R12), R0 +_0args: + + MOVD libcall_fn(R19), R12 // branch to libcall->fn + BL (R12) + + MOVD R20, RSP // free stack space + MOVD R0, libcall_r1(R19) // save return value to libcall->r1 + // TODO(rsc) floating point like amd64 in libcall->r2? + + // GetLastError + MOVD TEB_error(R18_PLATFORM), R0 + MOVD R0, libcall_err(R19) + + // Restore callee-saved registers. 
+ LDP 16(RSP), (R19, R20) + LDP.P 32(RSP), (R29, R30) + RET + +TEXT runtime·badsignal2(SB),NOSPLIT,$16-0 + NO_LOCAL_POINTERS + + // stderr + MOVD runtime·_GetStdHandle(SB), R1 + MOVD $-12, R0 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R1) + ADD $16, RSP + + // handle in R0 already + MOVD $runtime·badsignalmsg(SB), R1 // lpBuffer + MOVD $runtime·badsignallen(SB), R2 // lpNumberOfBytesToWrite + MOVD (R2), R2 + // point R3 to stack local that will receive number of bytes written + ADD $16, RSP, R3 // lpNumberOfBytesWritten + MOVD $0, R4 // lpOverlapped + MOVD runtime·_WriteFile(SB), R12 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R12) + + // Does not return. + B runtime·abort(SB) + + RET + +TEXT runtime·getlasterror(SB),NOSPLIT|NOFRAME,$0 + MOVD TEB_error(R18_PLATFORM), R0 + MOVD R0, ret+0(FP) + RET + +// Called by Windows as a Vectored Exception Handler (VEH). +// First argument is pointer to struct containing +// exception record and context pointers. +// Handler function is stored in R1 +// Return 0 for 'not handled', -1 for handled. +// int32_t sigtramp( +// PEXCEPTION_POINTERS ExceptionInfo, +// func *GoExceptionHandler); +TEXT sigtramp<>(SB),NOSPLIT|NOFRAME,$0 + // Save R0, R1 (args) as well as LR, R27, R28 (callee-save). + MOVD R0, R5 + MOVD R1, R6 + MOVD LR, R7 + MOVD R27, R16 // saved R27 (callee-save) + MOVD g, R17 // saved R28 (callee-save from Windows, not really g) + + BL runtime·load_g(SB) // smashes R0, R27, R28 (g) + CMP $0, g // is there a current g? + BNE g_ok + MOVD R7, LR + MOVD R16, R27 // restore R27 + MOVD R17, g // restore R28 + MOVD $0, R0 // continue + RET + +g_ok: + // Do we need to switch to the g0 stack? + MOVD g, R3 // R3 = oldg (for sigtramp_g0) + MOVD g_m(g), R2 // R2 = m + MOVD m_g0(R2), R2 // R2 = g0 + CMP g, R2 // if curg == g0 + BNE switch + + // No: on g0 stack already, tail call to sigtramp_g0. + // Restore all the callee-saves so sigtramp_g0 can return to our caller. 
+ // We also pass R2 = g0, R3 = oldg, both set above. + MOVD R5, R0 + MOVD R6, R1 + MOVD R7, LR + MOVD R16, R27 // restore R27 + MOVD R17, g // restore R28 + B sigtramp_g0<>(SB) + +switch: + // switch to g0 stack (but do not update g - that's sigtramp_g0's job) + MOVD RSP, R8 + MOVD (g_sched+gobuf_sp)(R2), R4 // R4 = g->gobuf.sp + SUB $(6*8), R4 // alloc space for saves - 2 words below SP for frame pointer, 3 for us to use, 1 for alignment + MOVD R4, RSP // switch to g0 stack + + MOVD $0, (0*8)(RSP) // fake saved LR + MOVD R7, (1*8)(RSP) // saved LR + MOVD R8, (2*8)(RSP) // saved SP + + MOVD R5, R0 // original args + MOVD R6, R1 // original args + MOVD R16, R27 + MOVD R17, g // R28 + BL sigtramp_g0<>(SB) + + // switch back to original stack; g already updated + MOVD (1*8)(RSP), R7 // saved LR + MOVD (2*8)(RSP), R8 // saved SP + MOVD R7, LR + MOVD R8, RSP + RET + +// sigtramp_g0 is running on the g0 stack, with R2 = g0, R3 = oldg. +// But g itself is not set - that's R28, a callee-save register, +// and it still holds the value from the Windows DLL caller. +TEXT sigtramp_g0<>(SB),NOSPLIT,$128 + NO_LOCAL_POINTERS + + // Push C callee-save registers R19-R28. LR, FP already saved. + // These registers will occupy the upper 10 words of the frame. + SAVE_R19_TO_R28(8*7) + + MOVD 0(R0), R5 // R5 = ExceptionPointers->ExceptionRecord + MOVD 8(R0), R6 // R6 = ExceptionPointers->ContextRecord + MOVD R6, context-(11*8)(SP) + + MOVD R2, g // g0 + BL runtime·save_g(SB) // smashes R0 + + MOVD R5, (1*8)(RSP) // arg0 (ExceptionRecord) + MOVD R6, (2*8)(RSP) // arg1 (ContextRecord) + MOVD R3, (3*8)(RSP) // arg2 (original g) + MOVD R3, oldg-(12*8)(SP) + BL (R1) + MOVD oldg-(12*8)(SP), g + BL runtime·save_g(SB) // smashes R0 + MOVW (4*8)(RSP), R0 // return value (0 or -1) + + // if return value is CONTINUE_SEARCH, do not set up control + // flow guard workaround + CMP $0, R0 + BEQ return + + // Check if we need to set up the control flow guard workaround. 
+ // On Windows, the stack pointer in the context must lie within + // system stack limits when we resume from exception. + // Store the resume SP and PC in alternate registers + // and return to sigresume on the g0 stack. + // sigresume makes no use of the stack at all, + // loading SP from R0 and jumping to R1. + // Note that smashing R0 and R1 is only safe because we know sigpanic + // will not actually return to the original frame, so the registers + // are effectively dead. But this does mean we can't use the + // same mechanism for async preemption. + MOVD context-(11*8)(SP), R6 + MOVD context_pc(R6), R2 // load PC from context record + MOVD $sigresume<>(SB), R1 + + CMP R1, R2 + BEQ return // do not clobber saved SP/PC + + // Save resume SP and PC into R0, R1. + MOVD context_xsp(R6), R2 + MOVD R2, (context_x+0*8)(R6) + MOVD context_pc(R6), R2 + MOVD R2, (context_x+1*8)(R6) + + // Set up context record to return to sigresume on g0 stack + MOVD RSP, R2 + MOVD R2, context_xsp(R6) + MOVD $sigresume<>(SB), R2 + MOVD R2, context_pc(R6) + +return: + RESTORE_R19_TO_R28(8*7) // smashes g + RET + +// Trampoline to resume execution from exception handler. +// This is part of the control flow guard workaround. +// It switches stacks and jumps to the continuation address. +// R0 and R1 are set above at the end of sigtramp<> +// in the context that starts executing at sigresume<>. +TEXT sigresume<>(SB),NOSPLIT|NOFRAME,$0 + // Important: do not smash LR, + // which is set to a live value when handling + // a signal by pushing a call to sigpanic onto the stack. 
+ MOVD R0, RSP + B (R1) + +TEXT runtime·exceptiontramp(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·exceptionhandler(SB), R1 + B sigtramp<>(SB) + +TEXT runtime·firstcontinuetramp(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·firstcontinuehandler(SB), R1 + B sigtramp<>(SB) + +TEXT runtime·lastcontinuetramp(SB),NOSPLIT|NOFRAME,$0 + MOVD $runtime·lastcontinuehandler(SB), R1 + B sigtramp<>(SB) + +GLOBL runtime·cbctxts(SB), NOPTR, $4 + +TEXT runtime·callbackasm1(SB),NOSPLIT,$208-0 + NO_LOCAL_POINTERS + + // On entry, the trampoline in zcallback_windows_arm64.s left + // the callback index in R12 (which is volatile in the C ABI). + + // Save callback register arguments R0-R7. + // We do this at the top of the frame so they're contiguous with stack arguments. + // The 7*8 setting up R14 looks like a bug but is not: the eighth word + // is the space the assembler reserved for our caller's frame pointer, + // but we are not called from Go so that space is ours to use, + // and we must to be contiguous with the stack arguments. + MOVD $arg0-(7*8)(SP), R14 + STP (R0, R1), (0*8)(R14) + STP (R2, R3), (2*8)(R14) + STP (R4, R5), (4*8)(R14) + STP (R6, R7), (6*8)(R14) + + // Push C callee-save registers R19-R28. + // LR, FP already saved. + SAVE_R19_TO_R28(8*9) + + // Create a struct callbackArgs on our stack. + MOVD $cbargs-(18*8+callbackArgs__size)(SP), R13 + MOVD R12, callbackArgs_index(R13) // callback index + MOVD R14, R0 + MOVD R0, callbackArgs_args(R13) // address of args vector + MOVD $0, R0 + MOVD R0, callbackArgs_result(R13) // result + + // Call cgocallback, which will call callbackWrap(frame). + MOVD $·callbackWrap<ABIInternal>(SB), R0 // PC of function to call, cgocallback takes an ABIInternal entry-point + MOVD R13, R1 // frame (&callbackArgs{...}) + MOVD $0, R2 // context + STP (R0, R1), (1*8)(RSP) + MOVD R2, (3*8)(RSP) + BL runtime·cgocallback(SB) + + // Get callback result. 
+ MOVD $cbargs-(18*8+callbackArgs__size)(SP), R13 + MOVD callbackArgs_result(R13), R0 + + RESTORE_R19_TO_R28(8*9) + + RET + +// uint32 tstart_stdcall(M *newm); +TEXT runtime·tstart_stdcall(SB),NOSPLIT,$96-0 + SAVE_R19_TO_R28(8*3) + + MOVD m_g0(R0), g + MOVD R0, g_m(g) + BL runtime·save_g(SB) + + // Set up stack guards for OS stack. + MOVD RSP, R0 + MOVD R0, g_stack+stack_hi(g) + SUB $(64*1024), R0 + MOVD R0, (g_stack+stack_lo)(g) + MOVD R0, g_stackguard0(g) + MOVD R0, g_stackguard1(g) + + BL runtime·emptyfunc(SB) // fault if stack check is wrong + BL runtime·mstart(SB) + + RESTORE_R19_TO_R28(8*3) + + // Exit the thread. + MOVD $0, R0 + RET + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g may be nil. +TEXT runtime·usleep2(SB),NOSPLIT,$32-4 + MOVW dt+0(FP), R0 + MOVD $16(RSP), R2 // R2 = pTime + MOVD R0, 0(R2) // *pTime = -dt + MOVD $-1, R0 // R0 = handle + MOVD $0, R1 // R1 = FALSE (alertable) + MOVD runtime·_NtWaitForSingleObject(SB), R3 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R3) + ADD $16, RSP + RET + +// Runs on OS stack. +// duration (in -100ns units) is in dt+0(FP). +// g is valid. +// TODO: needs to be implemented properly. +TEXT runtime·usleep2HighRes(SB),NOSPLIT,$0-4 + B runtime·abort(SB) + +// Runs on OS stack. +TEXT runtime·switchtothread(SB),NOSPLIT,$16-0 + MOVD runtime·_SwitchToThread(SB), R0 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R0) + ADD $16, RSP + RET + +TEXT runtime·nanotime1(SB),NOSPLIT|NOFRAME,$0-8 + MOVB runtime·useQPCTime(SB), R0 + CMP $0, R0 + BNE useQPC + MOVD $_INTERRUPT_TIME, R3 + MOVD time_lo(R3), R0 + MOVD $100, R1 + MUL R1, R0 + MOVD R0, ret+0(FP) + RET +useQPC: + B runtime·nanotimeQPC(SB) // tail call + +// This is called from rt0_go, which runs on the system stack +// using the initial stack allocated by the OS. +// It calls back into standard C using the BL below. 
+TEXT runtime·wintls(SB),NOSPLIT,$0 + // Allocate a TLS slot to hold g across calls to external code + MOVD runtime·_TlsAlloc(SB), R0 + SUB $16, RSP // skip over saved frame pointer below RSP + BL (R0) + ADD $16, RSP + + // Assert that slot is less than 64 so we can use _TEB->TlsSlots + CMP $64, R0 + BLT ok + // Fallback to the TEB arbitrary pointer. + // TODO: don't use the arbitrary pointer (see go.dev/issue/59824) + MOVD $TEB_ArbitraryPtr, R0 + B settls +ok: + + // Save offset from R18 into tls_g. + LSL $3, R0 + ADD $TEB_TlsSlots, R0 +settls: + MOVD R0, runtime·tls_g(SB) + RET diff --git a/src/runtime/sys_x86.go b/src/runtime/sys_x86.go new file mode 100644 index 0000000..9fb36c2 --- /dev/null +++ b/src/runtime/sys_x86.go @@ -0,0 +1,23 @@ +// Copyright 2013 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build amd64 || 386 + +package runtime + +import ( + "internal/goarch" + "unsafe" +) + +// adjust Gobuf as if it executed a call to fn with context ctxt +// and then stopped before the first instruction in fn. +func gostartcall(buf *gobuf, fn, ctxt unsafe.Pointer) { + sp := buf.sp + sp -= goarch.PtrSize + *(*uintptr)(unsafe.Pointer(sp)) = buf.pc + buf.sp = sp + buf.pc = uintptr(fn) + buf.ctxt = ctxt +} diff --git a/src/runtime/syscall2_solaris.go b/src/runtime/syscall2_solaris.go new file mode 100644 index 0000000..10a4fa0 --- /dev/null +++ b/src/runtime/syscall2_solaris.go @@ -0,0 +1,45 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +import _ "unsafe" // for go:linkname + +//go:cgo_import_dynamic libc_chdir chdir "libc.so" +//go:cgo_import_dynamic libc_chroot chroot "libc.so" +//go:cgo_import_dynamic libc_close close "libc.so" +//go:cgo_import_dynamic libc_execve execve "libc.so" +//go:cgo_import_dynamic libc_fcntl fcntl "libc.so" +//go:cgo_import_dynamic libc_forkx forkx "libc.so" +//go:cgo_import_dynamic libc_gethostname gethostname "libc.so" +//go:cgo_import_dynamic libc_getpid getpid "libc.so" +//go:cgo_import_dynamic libc_ioctl ioctl "libc.so" +//go:cgo_import_dynamic libc_setgid setgid "libc.so" +//go:cgo_import_dynamic libc_setgroups setgroups "libc.so" +//go:cgo_import_dynamic libc_setrlimit setrlimit "libc.so" +//go:cgo_import_dynamic libc_setsid setsid "libc.so" +//go:cgo_import_dynamic libc_setuid setuid "libc.so" +//go:cgo_import_dynamic libc_setpgid setpgid "libc.so" +//go:cgo_import_dynamic libc_syscall syscall "libc.so" +//go:cgo_import_dynamic libc_wait4 wait4 "libc.so" +//go:cgo_import_dynamic libc_issetugid issetugid "libc.so" + +//go:linkname libc_chdir libc_chdir +//go:linkname libc_chroot libc_chroot +//go:linkname libc_close libc_close +//go:linkname libc_execve libc_execve +//go:linkname libc_fcntl libc_fcntl +//go:linkname libc_forkx libc_forkx +//go:linkname libc_gethostname libc_gethostname +//go:linkname libc_getpid libc_getpid +//go:linkname libc_ioctl libc_ioctl +//go:linkname libc_setgid libc_setgid +//go:linkname libc_setgroups libc_setgroups +//go:linkname libc_setrlimit libc_setrlimit +//go:linkname libc_setsid libc_setsid +//go:linkname libc_setuid libc_setuid +//go:linkname libc_setpgid libc_setpgid +//go:linkname libc_syscall libc_syscall +//go:linkname libc_wait4 libc_wait4 +//go:linkname libc_issetugid libc_issetugid diff --git a/src/runtime/syscall_aix.go b/src/runtime/syscall_aix.go new file mode 100644 index 0000000..e87d4d6 --- /dev/null +++ b/src/runtime/syscall_aix.go @@ -0,0 +1,238 @@ +// Copyright 2018 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +// This file handles some syscalls from the syscall package, +// especially syscalls used during forkAndExecInChild, which must not split the stack. + +//go:cgo_import_dynamic libc_chdir chdir "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_chroot chroot "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_dup2 dup2 "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_execve execve "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_fcntl fcntl "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_fork fork "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_ioctl ioctl "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setgid setgid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setgroups setgroups "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setrlimit setrlimit "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setsid setsid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setuid setuid "libc.a/shr_64.o" +//go:cgo_import_dynamic libc_setpgid setpgid "libc.a/shr_64.o" + +//go:linkname libc_chdir libc_chdir +//go:linkname libc_chroot libc_chroot +//go:linkname libc_dup2 libc_dup2 +//go:linkname libc_execve libc_execve +//go:linkname libc_fcntl libc_fcntl +//go:linkname libc_fork libc_fork +//go:linkname libc_ioctl libc_ioctl +//go:linkname libc_setgid libc_setgid +//go:linkname libc_setgroups libc_setgroups +//go:linkname libc_setrlimit libc_setrlimit +//go:linkname libc_setsid libc_setsid +//go:linkname libc_setuid libc_setuid +//go:linkname libc_setpgid libc_setpgid + +var ( + libc_chdir, + libc_chroot, + libc_dup2, + libc_execve, + libc_fcntl, + libc_fork, + libc_ioctl, + libc_setgid, + libc_setgroups, + libc_setrlimit, + libc_setsid, + libc_setuid, + libc_setpgid libFunc +) + +// In syscall_syscall6 and syscall_rawsyscall6, r2 is always 0 +// as it's never used on AIX +// TODO: remove r2 from zsyscall_aix_$GOARCH.go + +//
Syscall is needed because some packages (like net) need it too. +// The best way is to return EINVAL and let Go handle the failure. +// If the syscall can't fail, this function can redirect it to a real syscall. +// +// This is exported via linkname to assembly in the syscall package. +// +//go:nosplit +//go:linkname syscall_Syscall +func syscall_Syscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + return 0, 0, _EINVAL +} + +// This is syscall.RawSyscall, it exists to satisfy some build dependency, +// but it doesn't work. +// +// This is exported via linkname to assembly in the syscall package. +// +//go:linkname syscall_RawSyscall +func syscall_RawSyscall(trap, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + panic("RawSyscall not available on AIX") +} + +// This is exported via linkname to assembly in the syscall package. +// +//go:nosplit +//go:cgo_unsafe_args +//go:linkname syscall_syscall6 +func syscall_syscall6(fn, nargs, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + c := libcall{ + fn: fn, + n: nargs, + args: uintptr(unsafe.Pointer(&a1)), + } + + entersyscallblock() + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + exitsyscall() + return c.r1, 0, c.err +} + +// This is exported via linkname to assembly in the syscall package.
+// +//go:nosplit +//go:cgo_unsafe_args +//go:linkname syscall_rawSyscall6 +func syscall_rawSyscall6(fn, nargs, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + c := libcall{ + fn: fn, + n: nargs, + args: uintptr(unsafe.Pointer(&a1)), + } + + asmcgocall(unsafe.Pointer(&asmsyscall6), unsafe.Pointer(&c)) + + return c.r1, 0, c.err +} + +//go:linkname syscall_chdir syscall.chdir +//go:nosplit +func syscall_chdir(path uintptr) (err uintptr) { + _, err = syscall1(&libc_chdir, path) + return +} + +//go:linkname syscall_chroot1 syscall.chroot1 +//go:nosplit +func syscall_chroot1(path uintptr) (err uintptr) { + _, err = syscall1(&libc_chroot, path) + return +} + +// like close, but must not split stack, for fork. +// +//go:linkname syscall_closeFD syscall.closeFD +//go:nosplit +func syscall_closeFD(fd int32) int32 { + _, err := syscall1(&libc_close, uintptr(fd)) + return int32(err) +} + +//go:linkname syscall_dup2child syscall.dup2child +//go:nosplit +func syscall_dup2child(old, new uintptr) (val, err uintptr) { + val, err = syscall2(&libc_dup2, old, new) + return +} + +//go:linkname syscall_execve syscall.execve +//go:nosplit +func syscall_execve(path, argv, envp uintptr) (err uintptr) { + _, err = syscall3(&libc_execve, path, argv, envp) + return +} + +// like exit, but must not split stack, for fork. 
+// +//go:linkname syscall_exit syscall.exit +//go:nosplit +func syscall_exit(code uintptr) { + syscall1(&libc_exit, code) +} + +//go:linkname syscall_fcntl1 syscall.fcntl1 +//go:nosplit +func syscall_fcntl1(fd, cmd, arg uintptr) (val, err uintptr) { + val, err = syscall3(&libc_fcntl, fd, cmd, arg) + return + +} + +//go:linkname syscall_forkx syscall.forkx +//go:nosplit +func syscall_forkx(flags uintptr) (pid uintptr, err uintptr) { + pid, err = syscall1(&libc_fork, flags) + return +} + +//go:linkname syscall_getpid syscall.getpid +//go:nosplit +func syscall_getpid() (pid, err uintptr) { + pid, err = syscall0(&libc_getpid) + return +} + +//go:linkname syscall_ioctl syscall.ioctl +//go:nosplit +func syscall_ioctl(fd, req, arg uintptr) (err uintptr) { + _, err = syscall3(&libc_ioctl, fd, req, arg) + return +} + +//go:linkname syscall_setgid syscall.setgid +//go:nosplit +func syscall_setgid(gid uintptr) (err uintptr) { + _, err = syscall1(&libc_setgid, gid) + return +} + +//go:linkname syscall_setgroups1 syscall.setgroups1 +//go:nosplit +func syscall_setgroups1(ngid, gid uintptr) (err uintptr) { + _, err = syscall2(&libc_setgroups, ngid, gid) + return +} + +//go:linkname syscall_setrlimit1 syscall.setrlimit1 +//go:nosplit +func syscall_setrlimit1(which uintptr, lim unsafe.Pointer) (err uintptr) { + _, err = syscall2(&libc_setrlimit, which, uintptr(lim)) + return +} + +//go:linkname syscall_setsid syscall.setsid +//go:nosplit +func syscall_setsid() (pid, err uintptr) { + pid, err = syscall0(&libc_setsid) + return +} + +//go:linkname syscall_setuid syscall.setuid +//go:nosplit +func syscall_setuid(uid uintptr) (err uintptr) { + _, err = syscall1(&libc_setuid, uid) + return +} + +//go:linkname syscall_setpgid syscall.setpgid +//go:nosplit +func syscall_setpgid(pid, pgid uintptr) (err uintptr) { + _, err = syscall2(&libc_setpgid, pid, pgid) + return +} + +//go:linkname syscall_write1 syscall.write1 +//go:nosplit +func syscall_write1(fd, buf, nbyte uintptr) (n, err 
uintptr) { + n, err = syscall3(&libc_write, fd, buf, nbyte) + return +} diff --git a/src/runtime/syscall_solaris.go b/src/runtime/syscall_solaris.go new file mode 100644 index 0000000..11b9c2a --- /dev/null +++ b/src/runtime/syscall_solaris.go @@ -0,0 +1,330 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +var ( + libc_chdir, + libc_chroot, + libc_close, + libc_execve, + libc_fcntl, + libc_forkx, + libc_gethostname, + libc_getpid, + libc_ioctl, + libc_setgid, + libc_setgroups, + libc_setrlimit, + libc_setsid, + libc_setuid, + libc_setpgid, + libc_syscall, + libc_issetugid, + libc_wait4 libcFunc +) + +// Many of these are exported via linkname to assembly in the syscall +// package. + +//go:nosplit +//go:linkname syscall_sysvicall6 +//go:cgo_unsafe_args +func syscall_sysvicall6(fn, nargs, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + call := libcall{ + fn: fn, + n: nargs, + args: uintptr(unsafe.Pointer(&a1)), + } + entersyscallblock() + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + exitsyscall() + return call.r1, call.r2, call.err +} + +//go:nosplit +//go:linkname syscall_rawsysvicall6 +//go:cgo_unsafe_args +func syscall_rawsysvicall6(fn, nargs, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + call := libcall{ + fn: fn, + n: nargs, + args: uintptr(unsafe.Pointer(&a1)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.r1, call.r2, call.err +} + +// TODO(aram): Once we remove all instances of C calling sysvicallN, make +// sysvicallN return errors and replace the body of the following functions +// with calls to sysvicallN. 
+ +//go:nosplit +//go:linkname syscall_chdir +func syscall_chdir(path uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_chdir)), + n: 1, + args: uintptr(unsafe.Pointer(&path)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:nosplit +//go:linkname syscall_chroot +func syscall_chroot(path uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_chroot)), + n: 1, + args: uintptr(unsafe.Pointer(&path)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +// like close, but must not split stack, for forkx. +// +//go:nosplit +//go:linkname syscall_close +func syscall_close(fd int32) int32 { + return int32(sysvicall1(&libc_close, uintptr(fd))) +} + +const _F_DUP2FD = 0x9 + +//go:nosplit +//go:linkname syscall_dup2 +func syscall_dup2(oldfd, newfd uintptr) (val, err uintptr) { + return syscall_fcntl(oldfd, _F_DUP2FD, newfd) +} + +//go:nosplit +//go:linkname syscall_execve +//go:cgo_unsafe_args +func syscall_execve(path, argv, envp uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_execve)), + n: 3, + args: uintptr(unsafe.Pointer(&path)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +// like exit, but must not split stack, for forkx. 
+// +//go:nosplit +//go:linkname syscall_exit +func syscall_exit(code uintptr) { + sysvicall1(&libc_exit, code) +} + +//go:nosplit +//go:linkname syscall_fcntl +//go:cgo_unsafe_args +func syscall_fcntl(fd, cmd, arg uintptr) (val, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_fcntl)), + n: 3, + args: uintptr(unsafe.Pointer(&fd)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.r1, call.err +} + +//go:nosplit +//go:linkname syscall_forkx +func syscall_forkx(flags uintptr) (pid uintptr, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_forkx)), + n: 1, + args: uintptr(unsafe.Pointer(&flags)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + if int(call.r1) != -1 { + call.err = 0 + } + return call.r1, call.err +} + +//go:linkname syscall_gethostname +func syscall_gethostname() (name string, err uintptr) { + cname := new([_MAXHOSTNAMELEN]byte) + var args = [2]uintptr{uintptr(unsafe.Pointer(&cname[0])), _MAXHOSTNAMELEN} + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_gethostname)), + n: 2, + args: uintptr(unsafe.Pointer(&args[0])), + } + entersyscallblock() + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + exitsyscall() + if call.r1 != 0 { + return "", call.err + } + cname[_MAXHOSTNAMELEN-1] = 0 + return gostringnocopy(&cname[0]), 0 +} + +//go:nosplit +//go:linkname syscall_getpid +func syscall_getpid() (pid, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_getpid)), + n: 0, + args: uintptr(unsafe.Pointer(&libc_getpid)), // it's unused but must be non-nil, otherwise crashes + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.r1, call.err +} + +//go:nosplit +//go:linkname syscall_ioctl +//go:cgo_unsafe_args +func syscall_ioctl(fd, req, arg uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_ioctl)), + n: 3, + args: uintptr(unsafe.Pointer(&fd)), + } + 
asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +// This is syscall.RawSyscall, it exists to satisfy some build dependency, +// but it doesn't work. +// +//go:linkname syscall_rawsyscall +func syscall_rawsyscall(trap, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + panic("RawSyscall not available on Solaris") +} + +// This is syscall.RawSyscall6, it exists to avoid a linker error because +// syscall.RawSyscall6 is already declared. See golang.org/issue/24357 +// +//go:linkname syscall_rawsyscall6 +func syscall_rawsyscall6(trap, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + panic("RawSyscall6 not available on Solaris") +} + +//go:nosplit +//go:linkname syscall_setgid +func syscall_setgid(gid uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setgid)), + n: 1, + args: uintptr(unsafe.Pointer(&gid)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:nosplit +//go:linkname syscall_setgroups +//go:cgo_unsafe_args +func syscall_setgroups(ngid, gid uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setgroups)), + n: 2, + args: uintptr(unsafe.Pointer(&ngid)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:nosplit +//go:linkname syscall_setrlimit +//go:cgo_unsafe_args +func syscall_setrlimit(which uintptr, lim unsafe.Pointer) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setrlimit)), + n: 2, + args: uintptr(unsafe.Pointer(&which)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:nosplit +//go:linkname syscall_setsid +func syscall_setsid() (pid, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setsid)), + n: 0, + args: uintptr(unsafe.Pointer(&libc_setsid)), // it's unused but must be non-nil, otherwise crashes + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), 
unsafe.Pointer(&call)) + return call.r1, call.err +} + +//go:nosplit +//go:linkname syscall_setuid +func syscall_setuid(uid uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setuid)), + n: 1, + args: uintptr(unsafe.Pointer(&uid)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:nosplit +//go:linkname syscall_setpgid +//go:cgo_unsafe_args +func syscall_setpgid(pid, pgid uintptr) (err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_setpgid)), + n: 2, + args: uintptr(unsafe.Pointer(&pid)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.err +} + +//go:linkname syscall_syscall +//go:cgo_unsafe_args +func syscall_syscall(trap, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_syscall)), + n: 4, + args: uintptr(unsafe.Pointer(&trap)), + } + entersyscallblock() + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + exitsyscall() + return call.r1, call.r2, call.err +} + +//go:linkname syscall_wait4 +//go:cgo_unsafe_args +func syscall_wait4(pid uintptr, wstatus *uint32, options uintptr, rusage unsafe.Pointer) (wpid int, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_wait4)), + n: 4, + args: uintptr(unsafe.Pointer(&pid)), + } + entersyscallblock() + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + exitsyscall() + KeepAlive(wstatus) + KeepAlive(rusage) + return int(call.r1), call.err +} + +//go:nosplit +//go:linkname syscall_write +//go:cgo_unsafe_args +func syscall_write(fd, buf, nbyte uintptr) (n, err uintptr) { + call := libcall{ + fn: uintptr(unsafe.Pointer(&libc_write)), + n: 3, + args: uintptr(unsafe.Pointer(&fd)), + } + asmcgocall(unsafe.Pointer(&asmsysvicall6x), unsafe.Pointer(&call)) + return call.r1, call.err +} diff --git a/src/runtime/syscall_unix_test.go b/src/runtime/syscall_unix_test.go new file mode 100644 index 
0000000..2a69c40 --- /dev/null +++ b/src/runtime/syscall_unix_test.go @@ -0,0 +1,25 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package runtime_test + +import ( + "runtime" + "syscall" + "testing" +) + +func TestSyscallFlagAlignment(t *testing.T) { + // TODO(mknyszek): Check other flags. + check := func(name string, got, want int) { + if got != want { + t.Errorf("flag %s does not line up: got %d, want %d", name, got, want) + } + } + check("O_WRONLY", runtime.O_WRONLY, syscall.O_WRONLY) + check("O_CREAT", runtime.O_CREAT, syscall.O_CREAT) + check("O_TRUNC", runtime.O_TRUNC, syscall.O_TRUNC) +} diff --git a/src/runtime/syscall_windows.go b/src/runtime/syscall_windows.go new file mode 100644 index 0000000..76036ad --- /dev/null +++ b/src/runtime/syscall_windows.go @@ -0,0 +1,559 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/abi" + "internal/goarch" + "unsafe" +) + +// cbs stores all registered Go callbacks. +var cbs struct { + lock mutex // use cbsLock / cbsUnlock for race instrumentation. + ctxt [cb_max]winCallback + index map[winCallbackKey]int + n int +} + +func cbsLock() { + lock(&cbs.lock) + // compileCallback is used by goenvs prior to completion of schedinit. + // raceacquire involves a racecallback to get the proc, which is not + // safe prior to scheduler initialization. Thus avoid instrumentation + // until then. + if raceenabled && mainStarted { + raceacquire(unsafe.Pointer(&cbs.lock)) + } +} + +func cbsUnlock() { + if raceenabled && mainStarted { + racerelease(unsafe.Pointer(&cbs.lock)) + } + unlock(&cbs.lock) +} + +// winCallback records information about a registered Go callback. 
+type winCallback struct { + fn *funcval // Go function + retPop uintptr // For 386 cdecl, how many bytes to pop on return + abiMap abiDesc +} + +// abiPartKind is the action an abiPart should take. +type abiPartKind int + +const ( + abiPartBad abiPartKind = iota + abiPartStack // Move a value from memory to the stack. + abiPartReg // Move a value from memory to a register. +) + +// abiPart encodes a step in translating between calling ABIs. +type abiPart struct { + kind abiPartKind + srcStackOffset uintptr + dstStackOffset uintptr // used if kind == abiPartStack + dstRegister int // used if kind == abiPartReg + len uintptr +} + +func (a *abiPart) tryMerge(b abiPart) bool { + if a.kind != abiPartStack || b.kind != abiPartStack { + return false + } + if a.srcStackOffset+a.len == b.srcStackOffset && a.dstStackOffset+a.len == b.dstStackOffset { + a.len += b.len + return true + } + return false +} + +// abiDesc specifies how to translate from a C frame to a Go +// frame. This does not specify how to translate back because +// the result is always a uintptr. If the C ABI is fastcall, +// this assumes the four fastcall registers were first spilled +// to the shadow space. +type abiDesc struct { + parts []abiPart + + srcStackSize uintptr // stdcall/fastcall stack space tracking + dstStackSize uintptr // Go stack space used + dstSpill uintptr // Extra stack space for argument spill slots + dstRegisters int // Go ABI int argument registers used + + // retOffset is the offset of the uintptr-sized result in the Go + // frame. + retOffset uintptr +} + +func (p *abiDesc) assignArg(t *_type) { + if t.size > goarch.PtrSize { + // We don't support this right now. In + // stdcall/cdecl, 64-bit ints and doubles are + // passed as two words (little endian); and + // structs are pushed on the stack. In + // fastcall, arguments larger than the word + // size are passed by reference. 
On arm, + // 8-byte aligned arguments round up to the + // next even register and can be split across + // registers and the stack. + panic("compileCallback: argument size is larger than uintptr") + } + if k := t.kind & kindMask; GOARCH != "386" && (k == kindFloat32 || k == kindFloat64) { + // In fastcall, floating-point arguments in + // the first four positions are passed in + // floating-point registers, which we don't + // currently spill. arm passes floating-point + // arguments in VFP registers, which we also + // don't support. + // So basically we only support 386. + panic("compileCallback: float arguments not supported") + } + + if t.size == 0 { + // The Go ABI aligns for zero-sized types. + p.dstStackSize = alignUp(p.dstStackSize, uintptr(t.align)) + return + } + + // In the C ABI, we're already on a word boundary. + // Also, sub-word-sized fastcall register arguments + // are stored to the least-significant bytes of the + // argument word and all supported Windows + // architectures are little endian, so srcStackOffset + // is already pointing to the right place for smaller + // arguments. The same is true on arm. + + oldParts := p.parts + if p.tryRegAssignArg(t, 0) { + // Account for spill space. + // + // TODO(mknyszek): Remove this when we no longer have + // caller reserved spill space. + p.dstSpill = alignUp(p.dstSpill, uintptr(t.align)) + p.dstSpill += t.size + } else { + // Register assignment failed. + // Undo the work and stack assign. + p.parts = oldParts + + // The Go ABI aligns arguments. + p.dstStackSize = alignUp(p.dstStackSize, uintptr(t.align)) + + // Copy just the size of the argument. Note that this + // could be a small by-value struct, but C and Go + // struct layouts are compatible, so we can copy these + // directly, too. + part := abiPart{ + kind: abiPartStack, + srcStackOffset: p.srcStackSize, + dstStackOffset: p.dstStackSize, + len: t.size, + } + // Add this step to the adapter. 
+ if len(p.parts) == 0 || !p.parts[len(p.parts)-1].tryMerge(part) { + p.parts = append(p.parts, part) + } + // The Go ABI packs arguments. + p.dstStackSize += t.size + } + + // cdecl, stdcall, fastcall, and arm pad arguments to word size. + // TODO(rsc): On arm and arm64 do we need to skip the caller's saved LR? + p.srcStackSize += goarch.PtrSize +} + +// tryRegAssignArg tries to register-assign a value of type t. +// If this type is nested in an aggregate type, then offset is the +// offset of this type within its parent type. +// Assumes t.size <= goarch.PtrSize and t.size != 0. +// +// Returns whether the assignment succeeded. +func (p *abiDesc) tryRegAssignArg(t *_type, offset uintptr) bool { + switch k := t.kind & kindMask; k { + case kindBool, kindInt, kindInt8, kindInt16, kindInt32, kindUint, kindUint8, kindUint16, kindUint32, kindUintptr, kindPtr, kindUnsafePointer: + // Assign a register for all these types. + return p.assignReg(t.size, offset) + case kindInt64, kindUint64: + // Only register-assign if the registers are big enough. + if goarch.PtrSize == 8 { + return p.assignReg(t.size, offset) + } + case kindArray: + at := (*arraytype)(unsafe.Pointer(t)) + if at.len == 1 { + return p.tryRegAssignArg(at.elem, offset) + } + case kindStruct: + st := (*structtype)(unsafe.Pointer(t)) + for i := range st.fields { + f := &st.fields[i] + if !p.tryRegAssignArg(f.typ, offset+f.offset) { + return false + } + } + return true + } + // Pointer-sized types such as maps and channels are currently + // not supported. + panic("compileCallback: type " + t.string() + " is currently not supported for use in system callbacks") +} + +// assignReg attempts to assign a single register for an +// argument with the given size, at the given offset into the +// value in the C ABI space. +// +// Returns whether the assignment was successful.
+func (p *abiDesc) assignReg(size, offset uintptr) bool { + if p.dstRegisters >= intArgRegs { + return false + } + p.parts = append(p.parts, abiPart{ + kind: abiPartReg, + srcStackOffset: p.srcStackSize + offset, + dstRegister: p.dstRegisters, + len: size, + }) + p.dstRegisters++ + return true +} + +type winCallbackKey struct { + fn *funcval + cdecl bool +} + +func callbackasm() + +// callbackasmAddr returns address of runtime.callbackasm +// function adjusted by i. +// On x86 and amd64, runtime.callbackasm is a series of CALL instructions, +// and we want callback to arrive at +// correspondent call instruction instead of start of +// runtime.callbackasm. +// On ARM, runtime.callbackasm is a series of mov and branch instructions. +// R12 is loaded with the callback index. Each entry is two instructions, +// hence 8 bytes. +func callbackasmAddr(i int) uintptr { + var entrySize int + switch GOARCH { + default: + panic("unsupported architecture") + case "386", "amd64": + entrySize = 5 + case "arm", "arm64": + // On ARM and ARM64, each entry is a MOV instruction + // followed by a branch instruction + entrySize = 8 + } + return abi.FuncPCABI0(callbackasm) + uintptr(i*entrySize) +} + +const callbackMaxFrame = 64 * goarch.PtrSize + +// compileCallback converts a Go function fn into a C function pointer +// that can be passed to Windows APIs. +// +// On 386, if cdecl is true, the returned C function will use the +// cdecl calling convention; otherwise, it will use stdcall. On amd64, +// it always uses fastcall. On arm, it always uses the ARM convention. +// +//go:linkname compileCallback syscall.compileCallback +func compileCallback(fn eface, cdecl bool) (code uintptr) { + if GOARCH != "386" { + // cdecl is only meaningful on 386. 
+ cdecl = false + } + + if fn._type == nil || (fn._type.kind&kindMask) != kindFunc { + panic("compileCallback: expected function with one uintptr-sized result") + } + ft := (*functype)(unsafe.Pointer(fn._type)) + + // Check arguments and construct ABI translation. + var abiMap abiDesc + for _, t := range ft.in() { + abiMap.assignArg(t) + } + // The Go ABI aligns the result to the word size. src is + // already aligned. + abiMap.dstStackSize = alignUp(abiMap.dstStackSize, goarch.PtrSize) + abiMap.retOffset = abiMap.dstStackSize + + if len(ft.out()) != 1 { + panic("compileCallback: expected function with one uintptr-sized result") + } + if ft.out()[0].size != goarch.PtrSize { + panic("compileCallback: expected function with one uintptr-sized result") + } + if k := ft.out()[0].kind & kindMask; k == kindFloat32 || k == kindFloat64 { + // In cdecl and stdcall, float results are returned in + // ST(0). In fastcall, they're returned in XMM0. + // Either way, it's not AX. + panic("compileCallback: float results not supported") + } + if intArgRegs == 0 { + // Make room for the uintptr-sized result. + // If there are argument registers, the return value will + // be passed in the first register. + abiMap.dstStackSize += goarch.PtrSize + } + + // TODO(mknyszek): Remove dstSpill from this calculation when we no longer have + // caller reserved spill space. + frameSize := alignUp(abiMap.dstStackSize, goarch.PtrSize) + frameSize += abiMap.dstSpill + if frameSize > callbackMaxFrame { + panic("compileCallback: function argument frame too large") + } + + // For cdecl, the callee is responsible for popping its + // arguments from the C stack. + var retPop uintptr + if cdecl { + retPop = abiMap.srcStackSize + } + + key := winCallbackKey{(*funcval)(fn.data), cdecl} + + cbsLock() + + // Check if this callback is already registered. + if n, ok := cbs.index[key]; ok { + cbsUnlock() + return callbackasmAddr(n) + } + + // Register the callback. 
+ if cbs.index == nil { + cbs.index = make(map[winCallbackKey]int) + } + n := cbs.n + if n >= len(cbs.ctxt) { + cbsUnlock() + throw("too many callback functions") + } + c := winCallback{key.fn, retPop, abiMap} + cbs.ctxt[n] = c + cbs.index[key] = n + cbs.n++ + + cbsUnlock() + return callbackasmAddr(n) +} + +type callbackArgs struct { + index uintptr + // args points to the argument block. + // + // For cdecl and stdcall, all arguments are on the stack. + // + // For fastcall, the trampoline spills register arguments to + // the reserved spill slots below the stack arguments, + // resulting in a layout equivalent to stdcall. + // + // For arm, the trampoline stores the register arguments just + // below the stack arguments, so again we can treat it as one + // big stack arguments frame. + args unsafe.Pointer + // Below are out-args from callbackWrap + result uintptr + retPop uintptr // For 386 cdecl, how many bytes to pop on return +} + +// callbackWrap is called by callbackasm to invoke a registered C callback. +func callbackWrap(a *callbackArgs) { + c := cbs.ctxt[a.index] + a.retPop = c.retPop + + // Convert from C to Go ABI. + var regs abi.RegArgs + var frame [callbackMaxFrame]byte + goArgs := unsafe.Pointer(&frame) + for _, part := range c.abiMap.parts { + switch part.kind { + case abiPartStack: + memmove(add(goArgs, part.dstStackOffset), add(a.args, part.srcStackOffset), part.len) + case abiPartReg: + goReg := unsafe.Pointer(&regs.Ints[part.dstRegister]) + memmove(goReg, add(a.args, part.srcStackOffset), part.len) + default: + panic("bad ABI description") + } + } + + // TODO(mknyszek): Remove this when we no longer have + // caller reserved spill space. + frameSize := alignUp(c.abiMap.dstStackSize, goarch.PtrSize) + frameSize += c.abiMap.dstSpill + + // Even though this is copying back results, we can pass a nil + // type because those results must not require write barriers.
+ reflectcall(nil, unsafe.Pointer(c.fn), noescape(goArgs), uint32(c.abiMap.dstStackSize), uint32(c.abiMap.retOffset), uint32(frameSize), &regs) + + // Extract the result. + // + // There's always exactly one return value, one pointer in size. + // If it's on the stack, then we will have reserved space for it + // at the end of the frame, otherwise it was passed in a register. + if c.abiMap.dstStackSize != c.abiMap.retOffset { + a.result = *(*uintptr)(unsafe.Pointer(&frame[c.abiMap.retOffset])) + } else { + var zero int + // On architectures with no registers, Ints[0] would be a compile error, + // so we use a dynamic index. These architectures will never take this + // branch, so this won't cause a runtime panic. + a.result = regs.Ints[zero] + } +} + +const _LOAD_LIBRARY_SEARCH_SYSTEM32 = 0x00000800 + +// When available, this function will use LoadLibraryEx with the filename +// parameter and the important SEARCH_SYSTEM32 argument. But on systems that +// do not have that option, absoluteFilepath should contain a fallback +// to the full path inside of system32 for use with vanilla LoadLibrary. +// +//go:linkname syscall_loadsystemlibrary syscall.loadsystemlibrary +//go:nosplit +//go:cgo_unsafe_args +func syscall_loadsystemlibrary(filename *uint16, absoluteFilepath *uint16) (handle, err uintptr) { + lockOSThread() + c := &getg().m.syscall + + if useLoadLibraryEx { + c.fn = getLoadLibraryEx() + c.n = 3 + args := struct { + lpFileName *uint16 + hFile uintptr // always 0 + flags uint32 + }{filename, 0, _LOAD_LIBRARY_SEARCH_SYSTEM32} + c.args = uintptr(noescape(unsafe.Pointer(&args))) + } else { + c.fn = getLoadLibrary() + c.n = 1 + c.args = uintptr(noescape(unsafe.Pointer(&absoluteFilepath))) + } + + cgocall(asmstdcallAddr, unsafe.Pointer(c)) + KeepAlive(filename) + KeepAlive(absoluteFilepath) + handle = c.r1 + if handle == 0 { + err = c.err + } + unlockOSThread() // not defer'd after the lockOSThread above to save stack frame size.
+ return +} + +//go:linkname syscall_loadlibrary syscall.loadlibrary +//go:nosplit +//go:cgo_unsafe_args +func syscall_loadlibrary(filename *uint16) (handle, err uintptr) { + lockOSThread() + defer unlockOSThread() + c := &getg().m.syscall + c.fn = getLoadLibrary() + c.n = 1 + c.args = uintptr(noescape(unsafe.Pointer(&filename))) + cgocall(asmstdcallAddr, unsafe.Pointer(c)) + KeepAlive(filename) + handle = c.r1 + if handle == 0 { + err = c.err + } + return +} + +//go:linkname syscall_getprocaddress syscall.getprocaddress +//go:nosplit +//go:cgo_unsafe_args +func syscall_getprocaddress(handle uintptr, procname *byte) (outhandle, err uintptr) { + lockOSThread() + defer unlockOSThread() + c := &getg().m.syscall + c.fn = getGetProcAddress() + c.n = 2 + c.args = uintptr(noescape(unsafe.Pointer(&handle))) + cgocall(asmstdcallAddr, unsafe.Pointer(c)) + KeepAlive(procname) + outhandle = c.r1 + if outhandle == 0 { + err = c.err + } + return +} + +//go:linkname syscall_Syscall syscall.Syscall +//go:nosplit +func syscall_Syscall(fn, nargs, a1, a2, a3 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3) +} + +//go:linkname syscall_Syscall6 syscall.Syscall6 +//go:nosplit +func syscall_Syscall6(fn, nargs, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6) +} + +//go:linkname syscall_Syscall9 syscall.Syscall9 +//go:nosplit +func syscall_Syscall9(fn, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9) +} + +//go:linkname syscall_Syscall12 syscall.Syscall12 +//go:nosplit +func syscall_Syscall12(fn, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12) +} + +//go:linkname syscall_Syscall15 syscall.Syscall15 +//go:nosplit +func syscall_Syscall15(fn, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, 
a12, a13, a14, a15 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15) +} + +//go:linkname syscall_Syscall18 syscall.Syscall18 +//go:nosplit +func syscall_Syscall18(fn, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15, a16, a17, a18 uintptr) (r1, r2, err uintptr) { + return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15, a16, a17, a18) +} + +// maxArgs should be divisible by 2, as Windows stack +// must be kept 16-byte aligned on syscall entry. +// +// Although it only permits maximum 42 parameters, it +// is arguably large enough. +const maxArgs = 42 + +//go:linkname syscall_SyscallN syscall.SyscallN +//go:nosplit +func syscall_SyscallN(trap uintptr, args ...uintptr) (r1, r2, err uintptr) { + nargs := len(args) + + // asmstdcall expects it can access the first 4 arguments + // to load them into registers. + var tmp [4]uintptr + switch { + case nargs < 4: + copy(tmp[:], args) + args = tmp[:] + case nargs > maxArgs: + panic("runtime: SyscallN has too many arguments") + } + + lockOSThread() + defer unlockOSThread() + c := &getg().m.syscall + c.fn = trap + c.n = uintptr(nargs) + c.args = uintptr(noescape(unsafe.Pointer(&args[0]))) + cgocall(asmstdcallAddr, unsafe.Pointer(c)) + return c.r1, c.r2, c.err +} diff --git a/src/runtime/syscall_windows_test.go b/src/runtime/syscall_windows_test.go new file mode 100644 index 0000000..abc2838 --- /dev/null +++ b/src/runtime/syscall_windows_test.go @@ -0,0 +1,1366 @@ +// Copyright 2010 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime_test + +import ( + "fmt" + "internal/abi" + "internal/syscall/windows/sysdll" + "internal/testenv" + "io" + "math" + "os" + "os/exec" + "path/filepath" + "reflect" + "runtime" + "strconv" + "strings" + "syscall" + "testing" + "unsafe" +) + +type DLL struct { + *syscall.DLL + t *testing.T +} + +func GetDLL(t *testing.T, name string) *DLL { + d, e := syscall.LoadDLL(name) + if e != nil { + t.Fatal(e) + } + return &DLL{DLL: d, t: t} +} + +func (d *DLL) Proc(name string) *syscall.Proc { + p, e := d.FindProc(name) + if e != nil { + d.t.Fatal(e) + } + return p +} + +func TestStdCall(t *testing.T) { + type Rect struct { + left, top, right, bottom int32 + } + res := Rect{} + expected := Rect{1, 1, 40, 60} + a, _, _ := GetDLL(t, "user32.dll").Proc("UnionRect").Call( + uintptr(unsafe.Pointer(&res)), + uintptr(unsafe.Pointer(&Rect{10, 1, 14, 60})), + uintptr(unsafe.Pointer(&Rect{1, 2, 40, 50}))) + if a != 1 || res.left != expected.left || + res.top != expected.top || + res.right != expected.right || + res.bottom != expected.bottom { + t.Error("stdcall USER32.UnionRect returns", a, "res=", res) + } +} + +func Test64BitReturnStdCall(t *testing.T) { + + const ( + VER_BUILDNUMBER = 0x0000004 + VER_MAJORVERSION = 0x0000002 + VER_MINORVERSION = 0x0000001 + VER_PLATFORMID = 0x0000008 + VER_PRODUCT_TYPE = 0x0000080 + VER_SERVICEPACKMAJOR = 0x0000020 + VER_SERVICEPACKMINOR = 0x0000010 + VER_SUITENAME = 0x0000040 + + VER_EQUAL = 1 + VER_GREATER = 2 + VER_GREATER_EQUAL = 3 + VER_LESS = 4 + VER_LESS_EQUAL = 5 + + ERROR_OLD_WIN_VERSION syscall.Errno = 1150 + ) + + type OSVersionInfoEx struct { + OSVersionInfoSize uint32 + MajorVersion uint32 + MinorVersion uint32 + BuildNumber uint32 + PlatformId uint32 + CSDVersion [128]uint16 + ServicePackMajor uint16 + ServicePackMinor uint16 + SuiteMask uint16 + ProductType byte + Reserve byte + } + + d := GetDLL(t, "kernel32.dll") + + var m1, m2 uintptr + VerSetConditionMask := d.Proc("VerSetConditionMask") + m1, m2, _ = 
VerSetConditionMask.Call(m1, m2, VER_MAJORVERSION, VER_GREATER_EQUAL) + m1, m2, _ = VerSetConditionMask.Call(m1, m2, VER_MINORVERSION, VER_GREATER_EQUAL) + m1, m2, _ = VerSetConditionMask.Call(m1, m2, VER_SERVICEPACKMAJOR, VER_GREATER_EQUAL) + m1, m2, _ = VerSetConditionMask.Call(m1, m2, VER_SERVICEPACKMINOR, VER_GREATER_EQUAL) + + vi := OSVersionInfoEx{ + MajorVersion: 5, + MinorVersion: 1, + ServicePackMajor: 2, + ServicePackMinor: 0, + } + vi.OSVersionInfoSize = uint32(unsafe.Sizeof(vi)) + r, _, e2 := d.Proc("VerifyVersionInfoW").Call( + uintptr(unsafe.Pointer(&vi)), + VER_MAJORVERSION|VER_MINORVERSION|VER_SERVICEPACKMAJOR|VER_SERVICEPACKMINOR, + m1, m2) + if r == 0 && e2 != ERROR_OLD_WIN_VERSION { + t.Errorf("VerifyVersionInfo failed: %s", e2) + } +} + +func TestCDecl(t *testing.T) { + var buf [50]byte + fmtp, _ := syscall.BytePtrFromString("%d %d %d") + a, _, _ := GetDLL(t, "user32.dll").Proc("wsprintfA").Call( + uintptr(unsafe.Pointer(&buf[0])), + uintptr(unsafe.Pointer(fmtp)), + 1000, 2000, 3000) + if string(buf[:a]) != "1000 2000 3000" { + t.Error("cdecl USER32.wsprintfA returns", a, "buf=", buf[:a]) + } +} + +func TestEnumWindows(t *testing.T) { + d := GetDLL(t, "user32.dll") + isWindows := d.Proc("IsWindow") + counter := 0 + cb := syscall.NewCallback(func(hwnd syscall.Handle, lparam uintptr) uintptr { + if lparam != 888 { + t.Error("lparam was not passed to callback") + } + b, _, _ := isWindows.Call(uintptr(hwnd)) + if b == 0 { + t.Error("USER32.IsWindow returns FALSE") + } + counter++ + return 1 // continue enumeration + }) + a, _, _ := d.Proc("EnumWindows").Call(cb, 888) + if a == 0 { + t.Error("USER32.EnumWindows returns FALSE") + } + if counter == 0 { + t.Error("Callback has been never called or your have no windows") + } +} + +func callback(timeFormatString unsafe.Pointer, lparam uintptr) uintptr { + (*(*func())(unsafe.Pointer(&lparam)))() + return 0 // stop enumeration +} + +// nestedCall calls into Windows, back into Go, and finally to f. 
+func nestedCall(t *testing.T, f func()) { + c := syscall.NewCallback(callback) + d := GetDLL(t, "kernel32.dll") + defer d.Release() + const LOCALE_NAME_USER_DEFAULT = 0 + d.Proc("EnumTimeFormatsEx").Call(c, LOCALE_NAME_USER_DEFAULT, 0, uintptr(*(*unsafe.Pointer)(unsafe.Pointer(&f)))) +} + +func TestCallback(t *testing.T) { + var x = false + nestedCall(t, func() { x = true }) + if !x { + t.Fatal("nestedCall did not call func") + } +} + +func TestCallbackGC(t *testing.T) { + nestedCall(t, runtime.GC) +} + +func TestCallbackPanicLocked(t *testing.T) { + runtime.LockOSThread() + defer runtime.UnlockOSThread() + + if !runtime.LockedOSThread() { + t.Fatal("runtime.LockOSThread didn't") + } + defer func() { + s := recover() + if s == nil { + t.Fatal("did not panic") + } + if s.(string) != "callback panic" { + t.Fatal("wrong panic:", s) + } + if !runtime.LockedOSThread() { + t.Fatal("lost lock on OS thread after panic") + } + }() + nestedCall(t, func() { panic("callback panic") }) + panic("nestedCall returned") +} + +func TestCallbackPanic(t *testing.T) { + // Make sure panic during callback unwinds properly. + if runtime.LockedOSThread() { + t.Fatal("locked OS thread on entry to TestCallbackPanic") + } + defer func() { + s := recover() + if s == nil { + t.Fatal("did not panic") + } + if s.(string) != "callback panic" { + t.Fatal("wrong panic:", s) + } + if runtime.LockedOSThread() { + t.Fatal("locked OS thread on exit from TestCallbackPanic") + } + }() + nestedCall(t, func() { panic("callback panic") }) + panic("nestedCall returned") +} + +func TestCallbackPanicLoop(t *testing.T) { + // Make sure we don't blow out m->g0 stack. 
+ for i := 0; i < 100000; i++ { + TestCallbackPanic(t) + } +} + +func TestBlockingCallback(t *testing.T) { + c := make(chan int) + go func() { + for i := 0; i < 10; i++ { + c <- <-c + } + }() + nestedCall(t, func() { + for i := 0; i < 10; i++ { + c <- i + if j := <-c; j != i { + t.Errorf("out of sync %d != %d", j, i) + } + } + }) +} + +func TestCallbackInAnotherThread(t *testing.T) { + d := GetDLL(t, "kernel32.dll") + + f := func(p uintptr) uintptr { + return p + } + r, _, err := d.Proc("CreateThread").Call(0, 0, syscall.NewCallback(f), 123, 0, 0) + if r == 0 { + t.Fatalf("CreateThread failed: %v", err) + } + h := syscall.Handle(r) + defer syscall.CloseHandle(h) + + switch s, err := syscall.WaitForSingleObject(h, 100); s { + case syscall.WAIT_OBJECT_0: + break + case syscall.WAIT_TIMEOUT: + t.Fatal("timeout waiting for thread to exit") + case syscall.WAIT_FAILED: + t.Fatalf("WaitForSingleObject failed: %v", err) + default: + t.Fatalf("WaitForSingleObject returns unexpected value %v", s) + } + + var ec uint32 + r, _, err = d.Proc("GetExitCodeThread").Call(uintptr(h), uintptr(unsafe.Pointer(&ec))) + if r == 0 { + t.Fatalf("GetExitCodeThread failed: %v", err) + } + if ec != 123 { + t.Fatalf("expected 123, but got %d", ec) + } +} + +type cbFunc struct { + goFunc any +} + +func (f cbFunc) cName(cdecl bool) string { + name := "stdcall" + if cdecl { + name = "cdecl" + } + t := reflect.TypeOf(f.goFunc) + for i := 0; i < t.NumIn(); i++ { + name += "_" + t.In(i).Name() + } + return name +} + +func (f cbFunc) cSrc(w io.Writer, cdecl bool) { + // Construct a C function that takes a callback with + // f.goFunc's signature, and calls it with integers 1..N. 
+ funcname := f.cName(cdecl) + attr := "__stdcall" + if cdecl { + attr = "__cdecl" + } + typename := "t" + funcname + t := reflect.TypeOf(f.goFunc) + cTypes := make([]string, t.NumIn()) + cArgs := make([]string, t.NumIn()) + for i := range cTypes { + // We included stdint.h, so this works for all sized + // integer types, and uint8Pair_t. + cTypes[i] = t.In(i).Name() + "_t" + if t.In(i).Name() == "uint8Pair" { + cArgs[i] = fmt.Sprintf("(uint8Pair_t){%d,1}", i) + } else { + cArgs[i] = fmt.Sprintf("%d", i+1) + } + } + fmt.Fprintf(w, ` +typedef uintptr_t %s (*%s)(%s); +uintptr_t %s(%s f) { + return f(%s); +} + `, attr, typename, strings.Join(cTypes, ","), funcname, typename, strings.Join(cArgs, ",")) +} + +func (f cbFunc) testOne(t *testing.T, dll *syscall.DLL, cdecl bool, cb uintptr) { + r1, _, _ := dll.MustFindProc(f.cName(cdecl)).Call(cb) + + want := 0 + for i := 0; i < reflect.TypeOf(f.goFunc).NumIn(); i++ { + want += i + 1 + } + if int(r1) != want { + t.Errorf("wanted result %d; got %d", want, r1) + } +} + +type uint8Pair struct{ x, y uint8 } + +var cbFuncs = []cbFunc{ + {func(i1, i2 uintptr) uintptr { + return i1 + i2 + }}, + {func(i1, i2, i3 uintptr) uintptr { + return i1 + i2 + i3 + }}, + {func(i1, i2, i3, i4 uintptr) uintptr { + return i1 + i2 + i3 + i4 + }}, + {func(i1, i2, i3, i4, i5 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + }}, + {func(i1, i2, i3, i4, i5, i6 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + }}, + {func(i1, i2, i3, i4, i5, i6, i7 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + }}, + {func(i1, i2, i3, i4, i5, i6, i7, i8 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + }}, + {func(i1, i2, i3, i4, i5, i6, i7, i8, i9 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + }}, + + // Non-uintptr parameters. 
+ {func(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint8) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) + }}, + {func(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint16) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) + }}, + {func(i1, i2, i3, i4, i5, i6, i7, i8, i9 int8) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) + }}, + {func(i1 int8, i2 int16, i3 int32, i4, i5 uintptr) uintptr { + return uintptr(i1) + uintptr(i2) + uintptr(i3) + i4 + i5 + }}, + {func(i1, i2, i3, i4, i5 uint8Pair) uintptr { + return uintptr(i1.x + i1.y + i2.x + i2.y + i3.x + i3.y + i4.x + i4.y + i5.x + i5.y) + }}, + {func(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint32) uintptr { + runtime.GC() + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) + }}, +} + +//go:registerparams +func sum2(i1, i2 uintptr) uintptr { + return i1 + i2 +} + +//go:registerparams +func sum3(i1, i2, i3 uintptr) uintptr { + return i1 + i2 + i3 +} + +//go:registerparams +func sum4(i1, i2, i3, i4 uintptr) uintptr { + return i1 + i2 + i3 + i4 +} + +//go:registerparams +func sum5(i1, i2, i3, i4, i5 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 +} + +//go:registerparams +func sum6(i1, i2, i3, i4, i5, i6 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 +} + +//go:registerparams +func sum7(i1, i2, i3, i4, i5, i6, i7 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 +} + +//go:registerparams +func sum8(i1, i2, i3, i4, i5, i6, i7, i8 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 +} + +//go:registerparams +func sum9(i1, i2, i3, i4, i5, i6, i7, i8, i9 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 +} + +//go:registerparams +func sum10(i1, i2, i3, i4, i5, i6, i7, i8, i9, i10 uintptr) uintptr { + return i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9 + i10 +} + +//go:registerparams +func sum9uint8(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint8) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) +} + 
+//go:registerparams +func sum9uint16(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint16) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) +} + +//go:registerparams +func sum9int8(i1, i2, i3, i4, i5, i6, i7, i8, i9 int8) uintptr { + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) +} + +//go:registerparams +func sum5mix(i1 int8, i2 int16, i3 int32, i4, i5 uintptr) uintptr { + return uintptr(i1) + uintptr(i2) + uintptr(i3) + i4 + i5 +} + +//go:registerparams +func sum5andPair(i1, i2, i3, i4, i5 uint8Pair) uintptr { + return uintptr(i1.x + i1.y + i2.x + i2.y + i3.x + i3.y + i4.x + i4.y + i5.x + i5.y) +} + +// This test forces a GC. The idea is to have enough arguments +// that insufficient spill slots allocated (according to the ABI) +// may cause compiler-generated spills to clobber the return PC. +// Then, the GC stack scanning will catch that. +// +//go:registerparams +func sum9andGC(i1, i2, i3, i4, i5, i6, i7, i8, i9 uint32) uintptr { + runtime.GC() + return uintptr(i1 + i2 + i3 + i4 + i5 + i6 + i7 + i8 + i9) +} + +// TODO(register args): Remove this once we switch to using the register +// calling convention by default, since this is redundant with the existing +// tests. 
+var cbFuncsRegABI = []cbFunc{ + {sum2}, + {sum3}, + {sum4}, + {sum5}, + {sum6}, + {sum7}, + {sum8}, + {sum9}, + {sum10}, + {sum9uint8}, + {sum9uint16}, + {sum9int8}, + {sum5mix}, + {sum5andPair}, + {sum9andGC}, +} + +func getCallbackTestFuncs() []cbFunc { + if regs := runtime.SetIntArgRegs(-1); regs > 0 { + return cbFuncsRegABI + } + return cbFuncs +} + +type cbDLL struct { + name string + buildArgs func(out, src string) []string +} + +func (d *cbDLL) makeSrc(t *testing.T, path string) { + f, err := os.Create(path) + if err != nil { + t.Fatalf("failed to create source file: %v", err) + } + defer f.Close() + + fmt.Fprint(f, ` +#include <stdint.h> +typedef struct { uint8_t x, y; } uint8Pair_t; +`) + for _, cbf := range getCallbackTestFuncs() { + cbf.cSrc(f, false) + cbf.cSrc(f, true) + } +} + +func (d *cbDLL) build(t *testing.T, dir string) string { + srcname := d.name + ".c" + d.makeSrc(t, filepath.Join(dir, srcname)) + outname := d.name + ".dll" + args := d.buildArgs(outname, srcname) + cmd := exec.Command(args[0], args[1:]...) 
+ cmd.Dir = dir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + return filepath.Join(dir, outname) +} + +var cbDLLs = []cbDLL{ + { + "test", + func(out, src string) []string { + return []string{"gcc", "-shared", "-s", "-Werror", "-o", out, src} + }, + }, + { + "testO2", + func(out, src string) []string { + return []string{"gcc", "-shared", "-s", "-Werror", "-o", out, "-O2", src} + }, + }, +} + +func TestStdcallAndCDeclCallbacks(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + tmp := t.TempDir() + + oldRegs := runtime.SetIntArgRegs(abi.IntArgRegs) + defer runtime.SetIntArgRegs(oldRegs) + + for _, dll := range cbDLLs { + t.Run(dll.name, func(t *testing.T) { + dllPath := dll.build(t, tmp) + dll := syscall.MustLoadDLL(dllPath) + defer dll.Release() + for _, cbf := range getCallbackTestFuncs() { + t.Run(cbf.cName(false), func(t *testing.T) { + stdcall := syscall.NewCallback(cbf.goFunc) + cbf.testOne(t, dll, false, stdcall) + }) + t.Run(cbf.cName(true), func(t *testing.T) { + cdecl := syscall.NewCallbackCDecl(cbf.goFunc) + cbf.testOne(t, dll, true, cdecl) + }) + } + }) + } +} + +func TestRegisterClass(t *testing.T) { + kernel32 := GetDLL(t, "kernel32.dll") + user32 := GetDLL(t, "user32.dll") + mh, _, _ := kernel32.Proc("GetModuleHandleW").Call(0) + cb := syscall.NewCallback(func(hwnd syscall.Handle, msg uint32, wparam, lparam uintptr) (rc uintptr) { + t.Fatal("callback should never get called") + return 0 + }) + type Wndclassex struct { + Size uint32 + Style uint32 + WndProc uintptr + ClsExtra int32 + WndExtra int32 + Instance syscall.Handle + Icon syscall.Handle + Cursor syscall.Handle + Background syscall.Handle + MenuName *uint16 + ClassName *uint16 + IconSm syscall.Handle + } + name := syscall.StringToUTF16Ptr("test_window") + wc := Wndclassex{ + WndProc: cb, + Instance: syscall.Handle(mh), + ClassName: name, + } + wc.Size = 
uint32(unsafe.Sizeof(wc)) + a, _, err := user32.Proc("RegisterClassExW").Call(uintptr(unsafe.Pointer(&wc))) + if a == 0 { + t.Fatalf("RegisterClassEx failed: %v", err) + } + r, _, err := user32.Proc("UnregisterClassW").Call(uintptr(unsafe.Pointer(name)), 0) + if r == 0 { + t.Fatalf("UnregisterClass failed: %v", err) + } +} + +func TestOutputDebugString(t *testing.T) { + d := GetDLL(t, "kernel32.dll") + p := syscall.StringToUTF16Ptr("testing OutputDebugString") + d.Proc("OutputDebugStringW").Call(uintptr(unsafe.Pointer(p))) +} + +func TestRaiseException(t *testing.T) { + if strings.HasPrefix(testenv.Builder(), "windows-amd64-2012") { + testenv.SkipFlaky(t, 49681) + } + o := runTestProg(t, "testprog", "RaiseException") + if strings.Contains(o, "RaiseException should not return") { + t.Fatalf("RaiseException did not crash program: %v", o) + } + if !strings.Contains(o, "Exception 0xbad") { + t.Fatalf("No stack trace: %v", o) + } +} + +func TestZeroDivisionException(t *testing.T) { + o := runTestProg(t, "testprog", "ZeroDivisionException") + if !strings.Contains(o, "panic: runtime error: integer divide by zero") { + t.Fatalf("No stack trace: %v", o) + } +} + +func TestWERDialogue(t *testing.T) { + if os.Getenv("TESTING_WER_DIALOGUE") == "1" { + defer os.Exit(0) + + *runtime.TestingWER = true + const EXCEPTION_NONCONTINUABLE = 1 + mod := syscall.MustLoadDLL("kernel32.dll") + proc := mod.MustFindProc("RaiseException") + proc.Call(0xbad, EXCEPTION_NONCONTINUABLE, 0, 0) + println("RaiseException should not return") + return + } + cmd := exec.Command(os.Args[0], "-test.run=TestWERDialogue") + cmd.Env = []string{"TESTING_WER_DIALOGUE=1"} + // Child process should not open WER dialogue, but return immediately instead. 
+ cmd.CombinedOutput() +} + +func TestWindowsStackMemory(t *testing.T) { + o := runTestProg(t, "testprog", "StackMemory") + stackUsage, err := strconv.Atoi(o) + if err != nil { + t.Fatalf("Failed to read stack usage: %v", err) + } + if expected, got := 100<<10, stackUsage; got > expected { + t.Fatalf("expected < %d bytes of memory per thread, got %d", expected, got) + } +} + +var used byte + +func use(buf []byte) { + for _, c := range buf { + used += c + } +} + +func forceStackCopy() (r int) { + var f func(int) int + f = func(i int) int { + var buf [256]byte + use(buf[:]) + if i == 0 { + return 0 + } + return i + f(i-1) + } + r = f(128) + return +} + +func TestReturnAfterStackGrowInCallback(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + + const src = ` +#include <stdint.h> +#include <windows.h> + +typedef uintptr_t __stdcall (*callback)(uintptr_t); + +uintptr_t cfunc(callback f, uintptr_t n) { + uintptr_t r; + r = f(n); + SetLastError(333); + return r; +} +` + tmpdir := t.TempDir() + + srcname := "mydll.c" + err := os.WriteFile(filepath.Join(tmpdir, srcname), []byte(src), 0) + if err != nil { + t.Fatal(err) + } + outname := "mydll.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", outname, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + dllpath := filepath.Join(tmpdir, outname) + + dll := syscall.MustLoadDLL(dllpath) + defer dll.Release() + + proc := dll.MustFindProc("cfunc") + + cb := syscall.NewCallback(func(n uintptr) uintptr { + forceStackCopy() + return n + }) + + // Use a new goroutine so that we get a small stack. + type result struct { + r uintptr + err syscall.Errno + } + want := result{ + // Make it large enough to test issue #29331. 
+ r: (^uintptr(0)) >> 24, + err: 333, + } + c := make(chan result) + go func() { + r, _, err := proc.Call(cb, want.r) + c <- result{r, err.(syscall.Errno)} + }() + if got := <-c; got != want { + t.Errorf("got %d want %d", got, want) + } +} + +func TestSyscallN(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + if runtime.GOARCH != "amd64" { + t.Skipf("skipping test: GOARCH=%s", runtime.GOARCH) + } + + for arglen := 0; arglen <= runtime.MaxArgs; arglen++ { + arglen := arglen + t.Run(fmt.Sprintf("arg-%d", arglen), func(t *testing.T) { + t.Parallel() + args := make([]string, arglen) + rets := make([]string, arglen+1) + params := make([]uintptr, arglen) + for i := range args { + args[i] = fmt.Sprintf("int a%d", i) + rets[i] = fmt.Sprintf("(a%d == %d)", i, i) + params[i] = uintptr(i) + } + rets[arglen] = "1" // for arglen == 0 + + src := fmt.Sprintf(` + #include <stdint.h> + #include <windows.h> + int cfunc(%s) { return %s; }`, strings.Join(args, ", "), strings.Join(rets, " && ")) + + tmpdir := t.TempDir() + + srcname := "mydll.c" + err := os.WriteFile(filepath.Join(tmpdir, srcname), []byte(src), 0) + if err != nil { + t.Fatal(err) + } + outname := "mydll.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", outname, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v\n%s", err, out) + } + dllpath := filepath.Join(tmpdir, outname) + + dll := syscall.MustLoadDLL(dllpath) + defer dll.Release() + + proc := dll.MustFindProc("cfunc") + + // proc.Call() will call SyscallN() internally. + r, _, err := proc.Call(params...) 
+ if r != 1 { + t.Errorf("got %d want 1 (err=%v)", r, err) + } + }) + } +} + +func TestFloatArgs(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + if runtime.GOARCH != "amd64" { + t.Skipf("skipping test: GOARCH=%s", runtime.GOARCH) + } + + const src = ` +#include <stdint.h> +#include <windows.h> + +uintptr_t cfunc(uintptr_t a, double b, float c, double d) { + if (a == 1 && b == 2.2 && c == 3.3f && d == 4.4e44) { + return 1; + } + return 0; +} +` + tmpdir := t.TempDir() + + srcname := "mydll.c" + err := os.WriteFile(filepath.Join(tmpdir, srcname), []byte(src), 0) + if err != nil { + t.Fatal(err) + } + outname := "mydll.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", outname, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + dllpath := filepath.Join(tmpdir, outname) + + dll := syscall.MustLoadDLL(dllpath) + defer dll.Release() + + proc := dll.MustFindProc("cfunc") + + r, _, err := proc.Call( + 1, + uintptr(math.Float64bits(2.2)), + uintptr(math.Float32bits(3.3)), + uintptr(math.Float64bits(4.4e44)), + ) + if r != 1 { + t.Errorf("got %d want 1 (err=%v)", r, err) + } +} + +func TestFloatReturn(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + if runtime.GOARCH != "amd64" { + t.Skipf("skipping test: GOARCH=%s", runtime.GOARCH) + } + + const src = ` +#include <stdint.h> +#include <windows.h> + +float cfuncFloat(uintptr_t a, double b, float c, double d) { + if (a == 1 && b == 2.2 && c == 3.3f && d == 4.4e44) { + return 1.5f; + } + return 0; +} + +double cfuncDouble(uintptr_t a, double b, float c, double d) { + if (a == 1 && b == 2.2 && c == 3.3f && d == 4.4e44) { + return 2.5; + } + return 0; +} +` + tmpdir := t.TempDir() + + srcname := "mydll.c" + err := os.WriteFile(filepath.Join(tmpdir, srcname), []byte(src), 0) + if err != nil 
{ + t.Fatal(err) + } + outname := "mydll.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", outname, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + dllpath := filepath.Join(tmpdir, outname) + + dll := syscall.MustLoadDLL(dllpath) + defer dll.Release() + + proc := dll.MustFindProc("cfuncFloat") + + _, r, err := proc.Call( + 1, + uintptr(math.Float64bits(2.2)), + uintptr(math.Float32bits(3.3)), + uintptr(math.Float64bits(4.4e44)), + ) + fr := math.Float32frombits(uint32(r)) + if fr != 1.5 { + t.Errorf("got %f want 1.5 (err=%v)", fr, err) + } + + proc = dll.MustFindProc("cfuncDouble") + + _, r, err = proc.Call( + 1, + uintptr(math.Float64bits(2.2)), + uintptr(math.Float32bits(3.3)), + uintptr(math.Float64bits(4.4e44)), + ) + dr := math.Float64frombits(uint64(r)) + if dr != 2.5 { + t.Errorf("got %f want 2.5 (err=%v)", dr, err) + } +} + +func TestTimeBeginPeriod(t *testing.T) { + const TIMERR_NOERROR = 0 + if *runtime.TimeBeginPeriodRetValue != TIMERR_NOERROR { + t.Fatalf("timeBeginPeriod failed: it returned %d", *runtime.TimeBeginPeriodRetValue) + } +} + +// removeOneCPU removes one (any) cpu from affinity mask. +// It returns new affinity mask. 
+func removeOneCPU(mask uintptr) (uintptr, error) { + if mask == 0 { + return 0, fmt.Errorf("cpu affinity mask is empty") + } + maskbits := int(unsafe.Sizeof(mask) * 8) + for i := 0; i < maskbits; i++ { + newmask := mask & ^(1 << uint(i)) + if newmask != mask { + return newmask, nil + } + + } + panic("not reached") +} + +func resumeChildThread(kernel32 *syscall.DLL, childpid int) error { + _OpenThread := kernel32.MustFindProc("OpenThread") + _ResumeThread := kernel32.MustFindProc("ResumeThread") + _Thread32First := kernel32.MustFindProc("Thread32First") + _Thread32Next := kernel32.MustFindProc("Thread32Next") + + snapshot, err := syscall.CreateToolhelp32Snapshot(syscall.TH32CS_SNAPTHREAD, 0) + if err != nil { + return err + } + defer syscall.CloseHandle(snapshot) + + const _THREAD_SUSPEND_RESUME = 0x0002 + + type ThreadEntry32 struct { + Size uint32 + tUsage uint32 + ThreadID uint32 + OwnerProcessID uint32 + BasePri int32 + DeltaPri int32 + Flags uint32 + } + + var te ThreadEntry32 + te.Size = uint32(unsafe.Sizeof(te)) + ret, _, err := _Thread32First.Call(uintptr(snapshot), uintptr(unsafe.Pointer(&te))) + if ret == 0 { + return err + } + for te.OwnerProcessID != uint32(childpid) { + ret, _, err = _Thread32Next.Call(uintptr(snapshot), uintptr(unsafe.Pointer(&te))) + if ret == 0 { + return err + } + } + h, _, err := _OpenThread.Call(_THREAD_SUSPEND_RESUME, 1, uintptr(te.ThreadID)) + if h == 0 { + return err + } + defer syscall.Close(syscall.Handle(h)) + + ret, _, err = _ResumeThread.Call(h) + if ret == 0xffffffff { + return err + } + return nil +} + +func TestNumCPU(t *testing.T) { + if os.Getenv("GO_WANT_HELPER_PROCESS") == "1" { + // in child process + fmt.Fprintf(os.Stderr, "%d", runtime.NumCPU()) + os.Exit(0) + } + + switch n := runtime.NumberOfProcessors(); { + case n < 1: + t.Fatalf("system cannot have %d cpu(s)", n) + case n == 1: + if runtime.NumCPU() != 1 { + t.Fatalf("runtime.NumCPU() returns %d on single cpu system", runtime.NumCPU()) + } + return + } + + 
const ( + _CREATE_SUSPENDED = 0x00000004 + _PROCESS_ALL_ACCESS = syscall.STANDARD_RIGHTS_REQUIRED | syscall.SYNCHRONIZE | 0xfff + ) + + kernel32 := syscall.MustLoadDLL("kernel32.dll") + _GetProcessAffinityMask := kernel32.MustFindProc("GetProcessAffinityMask") + _SetProcessAffinityMask := kernel32.MustFindProc("SetProcessAffinityMask") + + cmd := exec.Command(os.Args[0], "-test.run=TestNumCPU") + cmd.Env = append(os.Environ(), "GO_WANT_HELPER_PROCESS=1") + var buf strings.Builder + cmd.Stdout = &buf + cmd.Stderr = &buf + cmd.SysProcAttr = &syscall.SysProcAttr{CreationFlags: _CREATE_SUSPENDED} + err := cmd.Start() + if err != nil { + t.Fatal(err) + } + defer func() { + err = cmd.Wait() + childOutput := buf.String() + if err != nil { + t.Fatalf("child failed: %v: %v", err, childOutput) + } + // removeOneCPU should have decreased child cpu count by 1 + want := fmt.Sprintf("%d", runtime.NumCPU()-1) + if childOutput != want { + t.Fatalf("child output: want %q, got %q", want, childOutput) + } + }() + + defer func() { + err = resumeChildThread(kernel32, cmd.Process.Pid) + if err != nil { + t.Fatal(err) + } + }() + + ph, err := syscall.OpenProcess(_PROCESS_ALL_ACCESS, false, uint32(cmd.Process.Pid)) + if err != nil { + t.Fatal(err) + } + defer syscall.CloseHandle(ph) + + var mask, sysmask uintptr + ret, _, err := _GetProcessAffinityMask.Call(uintptr(ph), uintptr(unsafe.Pointer(&mask)), uintptr(unsafe.Pointer(&sysmask))) + if ret == 0 { + t.Fatal(err) + } + + newmask, err := removeOneCPU(mask) + if err != nil { + t.Fatal(err) + } + + ret, _, err = _SetProcessAffinityMask.Call(uintptr(ph), newmask) + if ret == 0 { + t.Fatal(err) + } + ret, _, err = _GetProcessAffinityMask.Call(uintptr(ph), uintptr(unsafe.Pointer(&mask)), uintptr(unsafe.Pointer(&sysmask))) + if ret == 0 { + t.Fatal(err) + } + if newmask != mask { + t.Fatalf("SetProcessAffinityMask didn't set newmask of 0x%x. 
Current mask is 0x%x.", newmask, mask) + } +} + +// See Issue 14959 +func TestDLLPreloadMitigation(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + + tmpdir := t.TempDir() + + dir0, err := os.Getwd() + if err != nil { + t.Fatal(err) + } + defer os.Chdir(dir0) + + const src = ` +#include <stdint.h> +#include <windows.h> + +uintptr_t cfunc(void) { + SetLastError(123); + return 0; +} +` + srcname := "nojack.c" + err = os.WriteFile(filepath.Join(tmpdir, srcname), []byte(src), 0) + if err != nil { + t.Fatal(err) + } + name := "nojack.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", name, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + dllpath := filepath.Join(tmpdir, name) + + dll := syscall.MustLoadDLL(dllpath) + dll.MustFindProc("cfunc") + dll.Release() + + // Get into the directory with the DLL we'll load by base name + // ("nojack.dll") Think of this as the user double-clicking an + // installer from their Downloads directory where a browser + // silently downloaded some malicious DLLs. + os.Chdir(tmpdir) + + // First before we can load a DLL from the current directory, + // loading it only as "nojack.dll", without an absolute path. + delete(sysdll.IsSystemDLL, name) // in case test was run repeatedly + dll, err = syscall.LoadDLL(name) + if err != nil { + t.Fatalf("failed to load %s by base name before sysdll registration: %v", name, err) + } + dll.Release() + + // And now verify that if we register it as a system32-only + // DLL, the implicit loading from the current directory no + // longer works. 
+ sysdll.IsSystemDLL[name] = true + dll, err = syscall.LoadDLL(name) + if err == nil { + dll.Release() + if wantLoadLibraryEx() { + t.Fatalf("Bad: insecure load of DLL by base name %q after sysdll registration: %v", name, err) + } + t.Skip("insecure load of DLL, but expected") + } +} + +// Test that C code called via a DLL can use large Windows thread +// stacks and call back in to Go without crashing. See issue #20975. +// +// See also TestBigStackCallbackCgo. +func TestBigStackCallbackSyscall(t *testing.T) { + if _, err := exec.LookPath("gcc"); err != nil { + t.Skip("skipping test: gcc is missing") + } + + srcname, err := filepath.Abs("testdata/testprogcgo/bigstack_windows.c") + if err != nil { + t.Fatal("Abs failed: ", err) + } + + tmpdir := t.TempDir() + + outname := "mydll.dll" + cmd := exec.Command("gcc", "-shared", "-s", "-Werror", "-o", outname, srcname) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + t.Fatalf("failed to build dll: %v - %v", err, string(out)) + } + dllpath := filepath.Join(tmpdir, outname) + + dll := syscall.MustLoadDLL(dllpath) + defer dll.Release() + + var ok bool + proc := dll.MustFindProc("bigStack") + cb := syscall.NewCallback(func() uintptr { + // Do something interesting to force stack checks. + forceStackCopy() + ok = true + return 0 + }) + proc.Call(cb) + if !ok { + t.Fatalf("callback not called") + } +} + +// wantLoadLibraryEx reports whether we expect LoadLibraryEx to work for tests. +func wantLoadLibraryEx() bool { + return testenv.Builder() != "" && (runtime.GOARCH == "amd64" || runtime.GOARCH == "386") +} + +func TestLoadLibraryEx(t *testing.T) { + use, have, flags := runtime.LoadLibraryExStatus() + if use { + return // success. + } + if wantLoadLibraryEx() { + t.Fatalf("Expected LoadLibraryEx+flags to be available. (LoadLibraryEx=%v; flags=%v)", + have, flags) + } + t.Skipf("LoadLibraryEx not usable, but not expected.
(LoadLibraryEx=%v; flags=%v)", + have, flags) +} + +var ( + modwinmm = syscall.NewLazyDLL("winmm.dll") + modkernel32 = syscall.NewLazyDLL("kernel32.dll") + + procCreateEvent = modkernel32.NewProc("CreateEventW") + procSetEvent = modkernel32.NewProc("SetEvent") +) + +func createEvent() (syscall.Handle, error) { + r0, _, e0 := syscall.Syscall6(procCreateEvent.Addr(), 4, 0, 0, 0, 0, 0, 0) + if r0 == 0 { + return 0, syscall.Errno(e0) + } + return syscall.Handle(r0), nil +} + +func setEvent(h syscall.Handle) error { + r0, _, e0 := syscall.Syscall(procSetEvent.Addr(), 1, uintptr(h), 0, 0) + if r0 == 0 { + return syscall.Errno(e0) + } + return nil +} + +func BenchmarkChanToSyscallPing(b *testing.B) { + n := b.N + ch := make(chan int) + event, err := createEvent() + if err != nil { + b.Fatal(err) + } + go func() { + for i := 0; i < n; i++ { + syscall.WaitForSingleObject(event, syscall.INFINITE) + ch <- 1 + } + }() + for i := 0; i < n; i++ { + err := setEvent(event) + if err != nil { + b.Fatal(err) + } + <-ch + } +} + +func BenchmarkSyscallToSyscallPing(b *testing.B) { + n := b.N + event1, err := createEvent() + if err != nil { + b.Fatal(err) + } + event2, err := createEvent() + if err != nil { + b.Fatal(err) + } + go func() { + for i := 0; i < n; i++ { + syscall.WaitForSingleObject(event1, syscall.INFINITE) + if err := setEvent(event2); err != nil { + b.Errorf("Set event failed: %v", err) + return + } + } + }() + for i := 0; i < n; i++ { + if err := setEvent(event1); err != nil { + b.Fatal(err) + } + if b.Failed() { + break + } + syscall.WaitForSingleObject(event2, syscall.INFINITE) + } +} + +func BenchmarkChanToChanPing(b *testing.B) { + n := b.N + ch1 := make(chan int) + ch2 := make(chan int) + go func() { + for i := 0; i < n; i++ { + <-ch1 + ch2 <- 1 + } + }() + for i := 0; i < n; i++ { + ch1 <- 1 + <-ch2 + } +} + +func BenchmarkOsYield(b *testing.B) { + for i := 0; i < b.N; i++ { + runtime.OsYield() + } +} + +func BenchmarkRunningGoProgram(b *testing.B) { + tmpdir := 
b.TempDir() + + src := filepath.Join(tmpdir, "main.go") + err := os.WriteFile(src, []byte(benchmarkRunningGoProgram), 0666) + if err != nil { + b.Fatal(err) + } + + exe := filepath.Join(tmpdir, "main.exe") + cmd := exec.Command(testenv.GoToolPath(b), "build", "-o", exe, src) + cmd.Dir = tmpdir + out, err := cmd.CombinedOutput() + if err != nil { + b.Fatalf("building main.exe failed: %v\n%s", err, out) + } + + b.ResetTimer() + for i := 0; i < b.N; i++ { + cmd := exec.Command(exe) + out, err := cmd.CombinedOutput() + if err != nil { + b.Fatalf("running main.exe failed: %v\n%s", err, out) + } + } +} + +const benchmarkRunningGoProgram = ` +package main + +import _ "os" // average Go program will use "os" package, do the same here + +func main() { +} +` diff --git a/src/runtime/testdata/testexithooks/testexithooks.go b/src/runtime/testdata/testexithooks/testexithooks.go new file mode 100644 index 0000000..ceb3326 --- /dev/null +++ b/src/runtime/testdata/testexithooks/testexithooks.go @@ -0,0 +1,85 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "flag" + "os" + _ "unsafe" +) + +var modeflag = flag.String("mode", "", "mode to run in") + +func main() { + flag.Parse() + switch *modeflag { + case "simple": + testSimple() + case "goodexit": + testGoodExit() + case "badexit": + testBadExit() + case "panics": + testPanics() + case "callsexit": + testHookCallsExit() + default: + panic("unknown mode") + } +} + +//go:linkname runtime_addExitHook runtime.addExitHook +func runtime_addExitHook(f func(), runOnNonZeroExit bool) + +func testSimple() { + f1 := func() { println("foo") } + f2 := func() { println("bar") } + runtime_addExitHook(f1, false) + runtime_addExitHook(f2, false) + // no explicit call to os.Exit +} + +func testGoodExit() { + f1 := func() { println("apple") } + f2 := func() { println("orange") } + runtime_addExitHook(f1, false) + runtime_addExitHook(f2, false) + // explicit call to os.Exit + os.Exit(0) +} + +func testBadExit() { + f1 := func() { println("blog") } + f2 := func() { println("blix") } + f3 := func() { println("blek") } + f4 := func() { println("blub") } + f5 := func() { println("blat") } + runtime_addExitHook(f1, false) + runtime_addExitHook(f2, true) + runtime_addExitHook(f3, false) + runtime_addExitHook(f4, true) + runtime_addExitHook(f5, false) + os.Exit(1) +} + +func testPanics() { + f1 := func() { println("ok") } + f2 := func() { panic("BADBADBAD") } + f3 := func() { println("good") } + runtime_addExitHook(f1, true) + runtime_addExitHook(f2, true) + runtime_addExitHook(f3, true) + os.Exit(0) +} + +func testHookCallsExit() { + f1 := func() { println("ok") } + f2 := func() { os.Exit(1) } + f3 := func() { println("good") } + runtime_addExitHook(f1, true) + runtime_addExitHook(f2, true) + runtime_addExitHook(f3, true) + os.Exit(1) +} diff --git a/src/runtime/testdata/testfaketime/faketime.go b/src/runtime/testdata/testfaketime/faketime.go new file mode 100644 index 0000000..1fb15eb --- /dev/null +++ b/src/runtime/testdata/testfaketime/faketime.go @@ -0,0 +1,28 
@@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Test faketime support. This is its own test program because we have +// to build it with custom build tags and hence want to minimize +// dependencies. + +package main + +import ( + "os" + "time" +) + +func main() { + println("line 1") + // Stream switch, increments time + os.Stdout.WriteString("line 2\n") + os.Stdout.WriteString("line 3\n") + // Stream switch, increments time + os.Stderr.WriteString("line 4\n") + // Time jump + time.Sleep(1 * time.Second) + os.Stdout.WriteString("line 5\n") + // Print the current time. + os.Stdout.WriteString(time.Now().UTC().Format(time.RFC3339)) +} diff --git a/src/runtime/testdata/testprog/abort.go b/src/runtime/testdata/testprog/abort.go new file mode 100644 index 0000000..9e79d4d --- /dev/null +++ b/src/runtime/testdata/testprog/abort.go @@ -0,0 +1,23 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import _ "unsafe" // for go:linkname + +func init() { + register("Abort", Abort) +} + +//go:linkname runtimeAbort runtime.abort +func runtimeAbort() + +func Abort() { + defer func() { + recover() + panic("BAD: recovered from abort") + }() + runtimeAbort() + println("BAD: after abort") +} diff --git a/src/runtime/testdata/testprog/badtraceback.go b/src/runtime/testdata/testprog/badtraceback.go new file mode 100644 index 0000000..09aa2b8 --- /dev/null +++ b/src/runtime/testdata/testprog/badtraceback.go @@ -0,0 +1,50 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "runtime" + "runtime/debug" + "unsafe" +) + +func init() { + register("BadTraceback", BadTraceback) +} + +func BadTraceback() { + // Disable GC to prevent traceback at unexpected time. + debug.SetGCPercent(-1) + // Out of an abundance of caution, also make sure that there are + // no GCs actively in progress. + runtime.GC() + + // Run badLR1 on its own stack to minimize the stack size and + // exercise the stack bounds logic in the hex dump. + go badLR1() + select {} +} + +//go:noinline +func badLR1() { + // We need two frames on LR machines because we'll smash this + // frame's saved LR. + badLR2(0) +} + +//go:noinline +func badLR2(arg int) { + // Smash the return PC or saved LR. + lrOff := unsafe.Sizeof(uintptr(0)) + if runtime.GOARCH == "ppc64" || runtime.GOARCH == "ppc64le" { + lrOff = 32 // FIXED_FRAME or sys.MinFrameSize + } + lrPtr := (*uintptr)(unsafe.Pointer(uintptr(unsafe.Pointer(&arg)) - lrOff)) + *lrPtr = 0xbad + + // Print a backtrace. This should include diagnostics for the + // bad return PC and a hex dump. + panic("backtrace") +} diff --git a/src/runtime/testdata/testprog/checkptr.go b/src/runtime/testdata/testprog/checkptr.go new file mode 100644 index 0000000..60e71e6 --- /dev/null +++ b/src/runtime/testdata/testprog/checkptr.go @@ -0,0 +1,119 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "runtime" + "time" + "unsafe" +) + +func init() { + register("CheckPtrAlignmentNoPtr", CheckPtrAlignmentNoPtr) + register("CheckPtrAlignmentPtr", CheckPtrAlignmentPtr) + register("CheckPtrAlignmentNilPtr", CheckPtrAlignmentNilPtr) + register("CheckPtrArithmetic", CheckPtrArithmetic) + register("CheckPtrArithmetic2", CheckPtrArithmetic2) + register("CheckPtrSize", CheckPtrSize) + register("CheckPtrSmall", CheckPtrSmall) + register("CheckPtrSliceOK", CheckPtrSliceOK) + register("CheckPtrSliceFail", CheckPtrSliceFail) + register("CheckPtrStringOK", CheckPtrStringOK) + register("CheckPtrStringFail", CheckPtrStringFail) + register("CheckPtrAlignmentNested", CheckPtrAlignmentNested) +} + +func CheckPtrAlignmentNoPtr() { + var x [2]int64 + p := unsafe.Pointer(&x[0]) + sink2 = (*int64)(unsafe.Pointer(uintptr(p) + 1)) +} + +func CheckPtrAlignmentPtr() { + var x [2]int64 + p := unsafe.Pointer(&x[0]) + sink2 = (**int64)(unsafe.Pointer(uintptr(p) + 1)) +} + +// CheckPtrAlignmentNilPtr tests that checkptrAlignment doesn't crash +// on nil pointers (#47430). 
+func CheckPtrAlignmentNilPtr() { + var do func(int) + do = func(n int) { + // Inflate the stack so runtime.shrinkstack gets called during GC + if n > 0 { + do(n - 1) + } + + var p unsafe.Pointer + _ = (*int)(p) + } + + go func() { + for { + runtime.GC() + } + }() + + go func() { + for i := 0; ; i++ { + do(i % 1024) + } + }() + + time.Sleep(time.Second) +} + +func CheckPtrArithmetic() { + var x int + i := uintptr(unsafe.Pointer(&x)) + sink2 = (*int)(unsafe.Pointer(i)) +} + +func CheckPtrArithmetic2() { + var x [2]int64 + p := unsafe.Pointer(&x[1]) + var one uintptr = 1 + sink2 = unsafe.Pointer(uintptr(p) & ^one) +} + +func CheckPtrSize() { + p := new(int64) + sink2 = p + sink2 = (*[100]int64)(unsafe.Pointer(p)) +} + +func CheckPtrSmall() { + sink2 = unsafe.Pointer(uintptr(1)) +} + +func CheckPtrSliceOK() { + p := new([4]int64) + sink2 = unsafe.Slice(&p[1], 3) +} + +func CheckPtrSliceFail() { + p := new(int64) + sink2 = p + sink2 = unsafe.Slice(p, 100) +} + +func CheckPtrStringOK() { + p := new([4]byte) + sink2 = unsafe.String(&p[1], 3) +} + +func CheckPtrStringFail() { + p := new(byte) + sink2 = p + sink2 = unsafe.String(p, 100) +} + +func CheckPtrAlignmentNested() { + s := make([]int8, 100) + p := unsafe.Pointer(&s[0]) + n := 9 + _ = ((*[10]int8)(unsafe.Pointer((*[10]int64)(unsafe.Pointer(&p)))))[:n:n] +} diff --git a/src/runtime/testdata/testprog/crash.go b/src/runtime/testdata/testprog/crash.go new file mode 100644 index 0000000..38c8f6a --- /dev/null +++ b/src/runtime/testdata/testprog/crash.go @@ -0,0 +1,139 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "runtime" +) + +func init() { + register("Crash", Crash) + register("DoublePanic", DoublePanic) + register("ErrorPanic", ErrorPanic) + register("StringerPanic", StringerPanic) + register("DoubleErrorPanic", DoubleErrorPanic) + register("DoubleStringerPanic", DoubleStringerPanic) + register("StringPanic", StringPanic) + register("NilPanic", NilPanic) + register("CircularPanic", CircularPanic) +} + +func test(name string) { + defer func() { + if x := recover(); x != nil { + fmt.Printf(" recovered") + } + fmt.Printf(" done\n") + }() + fmt.Printf("%s:", name) + var s *string + _ = *s + fmt.Print("SHOULD NOT BE HERE") +} + +func testInNewThread(name string) { + c := make(chan bool) + go func() { + runtime.LockOSThread() + test(name) + c <- true + }() + <-c +} + +func Crash() { + runtime.LockOSThread() + test("main") + testInNewThread("new-thread") + testInNewThread("second-new-thread") + test("main-again") +} + +type P string + +func (p P) String() string { + // Try to free the "YYY" string header when the "XXX" + // panic is stringified. + runtime.GC() + runtime.GC() + runtime.GC() + return string(p) +} + +// Test that panic message is not clobbered. +// See issue 30150. 
+func DoublePanic() { + defer func() { + panic(P("YYY")) + }() + panic(P("XXX")) +} + +// Test that panic while panicking discards error message +// See issue 52257 +type exampleError struct{} + +func (e exampleError) Error() string { + panic("important error message") +} + +func ErrorPanic() { + panic(exampleError{}) +} + +type examplePanicError struct{} + +func (e examplePanicError) Error() string { + panic(exampleError{}) +} + +func DoubleErrorPanic() { + panic(examplePanicError{}) +} + +type exampleStringer struct{} + +func (s exampleStringer) String() string { + panic("important stringer message") +} + +func StringerPanic() { + panic(exampleStringer{}) +} + +type examplePanicStringer struct{} + +func (s examplePanicStringer) String() string { + panic(exampleStringer{}) +} + +func DoubleStringerPanic() { + panic(examplePanicStringer{}) +} + +func StringPanic() { + panic("important string message") +} + +func NilPanic() { + panic(nil) +} + +type exampleCircleStartError struct{} + +func (e exampleCircleStartError) Error() string { + panic(exampleCircleEndError{}) +} + +type exampleCircleEndError struct{} + +func (e exampleCircleEndError) Error() string { + panic(exampleCircleStartError{}) +} + +func CircularPanic() { + panic(exampleCircleStartError{}) +} diff --git a/src/runtime/testdata/testprog/crashdump.go b/src/runtime/testdata/testprog/crashdump.go new file mode 100644 index 0000000..bced397 --- /dev/null +++ b/src/runtime/testdata/testprog/crashdump.go @@ -0,0 +1,47 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "os" + "runtime" +) + +func init() { + register("CrashDumpsAllThreads", CrashDumpsAllThreads) +} + +func CrashDumpsAllThreads() { + const count = 4 + runtime.GOMAXPROCS(count + 1) + + chans := make([]chan bool, count) + for i := range chans { + chans[i] = make(chan bool) + go crashDumpsAllThreadsLoop(i, chans[i]) + } + + // Wait for all the goroutines to start executing. + for _, c := range chans { + <-c + } + + // Tell our parent that all the goroutines are executing. + if _, err := os.NewFile(3, "pipe").WriteString("x"); err != nil { + fmt.Fprintf(os.Stderr, "write to pipe failed: %v\n", err) + os.Exit(2) + } + + select {} +} + +func crashDumpsAllThreadsLoop(i int, c chan bool) { + close(c) + for { + for j := 0; j < 0x7fffffff; j++ { + } + } +} diff --git a/src/runtime/testdata/testprog/deadlock.go b/src/runtime/testdata/testprog/deadlock.go new file mode 100644 index 0000000..781acbd --- /dev/null +++ b/src/runtime/testdata/testprog/deadlock.go @@ -0,0 +1,363 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "runtime" + "runtime/debug" + "time" +) + +func init() { + registerInit("InitDeadlock", InitDeadlock) + registerInit("NoHelperGoroutines", NoHelperGoroutines) + + register("SimpleDeadlock", SimpleDeadlock) + register("LockedDeadlock", LockedDeadlock) + register("LockedDeadlock2", LockedDeadlock2) + register("GoexitDeadlock", GoexitDeadlock) + register("StackOverflow", StackOverflow) + register("ThreadExhaustion", ThreadExhaustion) + register("RecursivePanic", RecursivePanic) + register("RecursivePanic2", RecursivePanic2) + register("RecursivePanic3", RecursivePanic3) + register("RecursivePanic4", RecursivePanic4) + register("RecursivePanic5", RecursivePanic5) + register("GoexitExit", GoexitExit) + register("GoNil", GoNil) + register("MainGoroutineID", MainGoroutineID) + register("Breakpoint", Breakpoint) + register("GoexitInPanic", GoexitInPanic) + register("PanicAfterGoexit", PanicAfterGoexit) + register("RecoveredPanicAfterGoexit", RecoveredPanicAfterGoexit) + register("RecoverBeforePanicAfterGoexit", RecoverBeforePanicAfterGoexit) + register("RecoverBeforePanicAfterGoexit2", RecoverBeforePanicAfterGoexit2) + register("PanicTraceback", PanicTraceback) + register("GoschedInPanic", GoschedInPanic) + register("SyscallInPanic", SyscallInPanic) + register("PanicLoop", PanicLoop) +} + +func SimpleDeadlock() { + select {} + panic("not reached") +} + +func InitDeadlock() { + select {} + panic("not reached") +} + +func LockedDeadlock() { + runtime.LockOSThread() + select {} +} + +func LockedDeadlock2() { + go func() { + runtime.LockOSThread() + select {} + }() + time.Sleep(time.Millisecond) + select {} +} + +func GoexitDeadlock() { + F := func() { + for i := 0; i < 10; i++ { + } + } + + go F() + go F() + runtime.Goexit() +} + +func StackOverflow() { + var f func() byte + f = func() byte { + var buf [64 << 10]byte + return buf[0] + f() + } + debug.SetMaxStack(1474560) + f() +} + +func ThreadExhaustion() { + debug.SetMaxThreads(10) + c 
:= make(chan int) + for i := 0; i < 100; i++ { + go func() { + runtime.LockOSThread() + c <- 0 + select {} + }() + <-c + } +} + +func RecursivePanic() { + func() { + defer func() { + fmt.Println(recover()) + }() + var x [8192]byte + func(x [8192]byte) { + defer func() { + if err := recover(); err != nil { + panic("wrap: " + err.(string)) + } + }() + panic("bad") + }(x) + }() + panic("again") +} + +// Same as RecursivePanic, but do the first recover and the second panic in +// separate defers, and make sure they are executed in the correct order. +func RecursivePanic2() { + func() { + defer func() { + fmt.Println(recover()) + }() + var x [8192]byte + func(x [8192]byte) { + defer func() { + panic("second panic") + }() + defer func() { + fmt.Println(recover()) + }() + panic("first panic") + }(x) + }() + panic("third panic") +} + +// Make sure that the first panic finished as a panic, even though the second +// panic was recovered +func RecursivePanic3() { + defer func() { + defer func() { + recover() + }() + panic("second panic") + }() + panic("first panic") +} + +// Test case where a single defer recovers one panic but starts another panic. If +// the second panic is never recovered, then the recovered first panic will still +// appear on the panic stack (labeled '[recovered]') and the runtime stack. +func RecursivePanic4() { + defer func() { + recover() + panic("second panic") + }() + panic("first panic") +} + +// Test case where we have an open-coded defer higher up the stack (in two), and +// in the current function (three) we recover in a defer while we still have +// another defer to be processed. 
+func RecursivePanic5() { + one() + panic("third panic") +} + +//go:noinline +func one() { + two() +} + +//go:noinline +func two() { + defer func() { + }() + + three() +} + +//go:noinline +func three() { + defer func() { + }() + + defer func() { + fmt.Println(recover()) + }() + + defer func() { + fmt.Println(recover()) + panic("second panic") + }() + + panic("first panic") +} + +func GoexitExit() { + println("t1") + go func() { + time.Sleep(time.Millisecond) + }() + i := 0 + println("t2") + runtime.SetFinalizer(&i, func(p *int) {}) + println("t3") + runtime.GC() + println("t4") + runtime.Goexit() +} + +func GoNil() { + defer func() { + recover() + }() + var f func() + go f() + select {} +} + +func MainGoroutineID() { + panic("test") +} + +func NoHelperGoroutines() { + i := 0 + runtime.SetFinalizer(&i, func(p *int) {}) + time.AfterFunc(time.Hour, func() {}) + panic("oops") +} + +func Breakpoint() { + runtime.Breakpoint() +} + +func GoexitInPanic() { + go func() { + defer func() { + runtime.Goexit() + }() + panic("hello") + }() + runtime.Goexit() +} + +type errorThatGosched struct{} + +func (errorThatGosched) Error() string { + runtime.Gosched() + return "errorThatGosched" +} + +func GoschedInPanic() { + panic(errorThatGosched{}) +} + +type errorThatPrint struct{} + +func (errorThatPrint) Error() string { + fmt.Println("1") + fmt.Println("2") + return "3" +} + +func SyscallInPanic() { + panic(errorThatPrint{}) +} + +func PanicAfterGoexit() { + defer func() { + panic("hello") + }() + runtime.Goexit() +} + +func RecoveredPanicAfterGoexit() { + defer func() { + defer func() { + r := recover() + if r == nil { + panic("bad recover") + } + }() + panic("hello") + }() + runtime.Goexit() +} + +func RecoverBeforePanicAfterGoexit() { + // 1. defer a function that recovers + // 2. defer a function that panics + // 3. call goexit + // Goexit runs the #2 defer. Its panic + // is caught by the #1 defer. 
For Goexit, we explicitly + // resume execution in the Goexit loop, instead of resuming + // execution in the caller (which would make the Goexit disappear!) + defer func() { + r := recover() + if r == nil { + panic("bad recover") + } + }() + defer func() { + panic("hello") + }() + runtime.Goexit() +} + +func RecoverBeforePanicAfterGoexit2() { + for i := 0; i < 2; i++ { + defer func() { + }() + } + // 1. defer a function that recovers + // 2. defer a function that panics + // 3. call goexit + // Goexit runs the #2 defer. Its panic + // is caught by the #1 defer. For Goexit, we explicitly + // resume execution in the Goexit loop, instead of resuming + // execution in the caller (which would make the Goexit disappear!) + defer func() { + r := recover() + if r == nil { + panic("bad recover") + } + }() + defer func() { + panic("hello") + }() + runtime.Goexit() +} + +func PanicTraceback() { + pt1() +} + +func pt1() { + defer func() { + panic("panic pt1") + }() + pt2() +} + +func pt2() { + defer func() { + panic("panic pt2") + }() + panic("hello") +} + +type panicError struct{} + +func (*panicError) Error() string { + panic("double error") +} + +func PanicLoop() { + panic(&panicError{}) +} diff --git a/src/runtime/testdata/testprog/gc.go b/src/runtime/testdata/testprog/gc.go new file mode 100644 index 0000000..5dc85fb --- /dev/null +++ b/src/runtime/testdata/testprog/gc.go @@ -0,0 +1,420 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "math" + "os" + "runtime" + "runtime/debug" + "runtime/metrics" + "sync" + "sync/atomic" + "time" + "unsafe" +) + +func init() { + register("GCFairness", GCFairness) + register("GCFairness2", GCFairness2) + register("GCSys", GCSys) + register("GCPhys", GCPhys) + register("DeferLiveness", DeferLiveness) + register("GCZombie", GCZombie) + register("GCMemoryLimit", GCMemoryLimit) + register("GCMemoryLimitNoGCPercent", GCMemoryLimitNoGCPercent) +} + +func GCSys() { + runtime.GOMAXPROCS(1) + memstats := new(runtime.MemStats) + runtime.GC() + runtime.ReadMemStats(memstats) + sys := memstats.Sys + + runtime.MemProfileRate = 0 // disable profiler + + itercount := 100000 + for i := 0; i < itercount; i++ { + workthegc() + } + + // Should only be using a few MB. + // We allocated 100 MB or (if not short) 1 GB. + runtime.ReadMemStats(memstats) + if sys > memstats.Sys { + sys = 0 + } else { + sys = memstats.Sys - sys + } + if sys > 16<<20 { + fmt.Printf("using too much memory: %d bytes\n", sys) + return + } + fmt.Printf("OK\n") +} + +var sink []byte + +func workthegc() []byte { + sink = make([]byte, 1029) + return sink +} + +func GCFairness() { + runtime.GOMAXPROCS(1) + f, err := os.Open("/dev/null") + if os.IsNotExist(err) { + // This test tests what it is intended to test only if writes are fast. + // If there is no /dev/null, we just don't execute the test. + fmt.Println("OK") + return + } + if err != nil { + fmt.Println(err) + os.Exit(1) + } + for i := 0; i < 2; i++ { + go func() { + for { + f.Write([]byte(".")) + } + }() + } + time.Sleep(10 * time.Millisecond) + fmt.Println("OK") +} + +func GCFairness2() { + // Make sure user code can't exploit the GC's high priority + // scheduling to make scheduling of user code unfair. See + // issue #15706. 
+ runtime.GOMAXPROCS(1) + debug.SetGCPercent(1) + var count [3]int64 + var sink [3]any + for i := range count { + go func(i int) { + for { + sink[i] = make([]byte, 1024) + atomic.AddInt64(&count[i], 1) + } + }(i) + } + // Note: If the unfairness is really bad, it may not even get + // past the sleep. + // + // If the scheduling rules change, this may not be enough time + // to let all goroutines run, but for now we cycle through + // them rapidly. + // + // OpenBSD's scheduler makes every usleep() take at least + // 20ms, so we need a long time to ensure all goroutines have + // run. If they haven't run after 30ms, give it another 1000ms + // and check again. + time.Sleep(30 * time.Millisecond) + var fail bool + for i := range count { + if atomic.LoadInt64(&count[i]) == 0 { + fail = true + } + } + if fail { + time.Sleep(1 * time.Second) + for i := range count { + if atomic.LoadInt64(&count[i]) == 0 { + fmt.Printf("goroutine %d did not run\n", i) + return + } + } + } + fmt.Println("OK") +} + +func GCPhys() { + // This test ensures that heap-growth scavenging is working as intended. + // + // It attempts to construct a sizeable "swiss cheese" heap, with many + // allocChunk-sized holes. Then, it triggers a heap growth by trying to + // allocate as much memory as would fit in those holes. + // + // The heap growth should cause a large number of those holes to be + // returned to the OS. + + const ( + // The total amount of memory we're willing to allocate. + allocTotal = 32 << 20 + + // The page cache could hide 64 8-KiB pages from the scavenger today. + maxPageCache = (8 << 10) * 64 + ) + + // How big the allocations are needs to depend on the page size. + // If the page size is too big and the allocations are too small, + // they might not be aligned to the physical page size, so the scavenger + // will gloss over them. 
+ pageSize := os.Getpagesize() + var allocChunk int + if pageSize <= 8<<10 { + allocChunk = 64 << 10 + } else { + allocChunk = 512 << 10 + } + allocs := allocTotal / allocChunk + + // Set GC percent just so this test is a little more consistent in the + // face of varying environments. + debug.SetGCPercent(100) + + // Set GOMAXPROCS to 1 to minimize the amount of memory held in the page cache, + // and to reduce the chance that the background scavenger gets scheduled. + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + + // Allocate allocTotal bytes of memory in allocChunk byte chunks. + // Alternate between whether the chunk will be held live or will be + // condemned to GC to create holes in the heap. + saved := make([][]byte, allocs/2+1) + condemned := make([][]byte, allocs/2) + for i := 0; i < allocs; i++ { + b := make([]byte, allocChunk) + if i%2 == 0 { + saved = append(saved, b) + } else { + condemned = append(condemned, b) + } + } + + // Run a GC cycle just so we're at a consistent state. + runtime.GC() + + // Drop the only reference to all the condemned memory. + condemned = nil + + // Clear the condemned memory. + runtime.GC() + + // At this point, the background scavenger is likely running + // and could pick up the work, so the next line of code doesn't + // end up doing anything. That's fine. What's important is that + // this test fails somewhat regularly if the runtime doesn't + // scavenge on heap growth, and doesn't fail at all otherwise. + + // Make a large allocation that in theory could fit, but won't + // because we turned the heap into swiss cheese. + saved = append(saved, make([]byte, allocTotal/2)) + + // heapBacked is an estimate of the amount of physical memory used by + // this test. HeapSys is an estimate of the size of the mapped virtual + // address space (which may or may not be backed by physical pages) + // whereas HeapReleased is an estimate of the amount of bytes returned + // to the OS. 
Their difference then roughly corresponds to the amount + // of virtual address space that is backed by physical pages. + // + // heapBacked also subtracts out maxPageCache bytes of memory because + // this is memory that may be hidden from the scavenger per-P. Since + // GOMAXPROCS=1 here, subtracting it out once is fine. + var stats runtime.MemStats + runtime.ReadMemStats(&stats) + heapBacked := stats.HeapSys - stats.HeapReleased - maxPageCache + // If heapBacked does not exceed the heap goal by more than retainExtraPercent + // then the scavenger is working as expected; the newly-created holes have been + // scavenged immediately as part of the allocations which cannot fit in the holes. + // + // Since the runtime should scavenge the entirety of the remaining holes, + // theoretically there should be no more free and unscavenged memory. However due + // to other allocations that happen during this test we may still see some physical + // memory over-use. + overuse := (float64(heapBacked) - float64(stats.HeapAlloc)) / float64(stats.HeapAlloc) + // Check against our overuse threshold, which is what the scavenger always reserves + // to encourage allocation of memory that doesn't need to be faulted in. + // + // Add additional slack in case the page size is large and the scavenger + // can't reach that memory because it doesn't constitute a complete aligned + // physical page. Assume the worst case: a full physical page out of each + // allocation. + threshold := 0.1 + float64(pageSize)/float64(allocChunk) + if overuse <= threshold { + fmt.Println("OK") + return + } + // Physical memory utilization exceeds the threshold, so heap-growth scavenging + // did not operate as expected. + // + // In the context of this test, this indicates a large amount of + // fragmentation with physical pages that are otherwise unused but not + // returned to the OS. 
+ fmt.Printf("exceeded physical memory overuse threshold of %3.2f%%: %3.2f%%\n"+ + "(alloc: %d, goal: %d, sys: %d, rel: %d, objs: %d)\n", threshold*100, overuse*100, + stats.HeapAlloc, stats.NextGC, stats.HeapSys, stats.HeapReleased, len(saved)) + runtime.KeepAlive(saved) + runtime.KeepAlive(condemned) +} + +// Test that defer closure is correctly scanned when the stack is scanned. +func DeferLiveness() { + var x [10]int + escape(&x) + fn := func() { + if x[0] != 42 { + panic("FAIL") + } + } + defer fn() + + x[0] = 42 + runtime.GC() + runtime.GC() + runtime.GC() +} + +//go:noinline +func escape(x any) { sink2 = x; sink2 = nil } + +var sink2 any + +// Test zombie object detection and reporting. +func GCZombie() { + // Allocate several objects of unusual size (so free slots are + // unlikely to all be re-allocated by the runtime). + const size = 190 + const count = 8192 / size + keep := make([]*byte, 0, (count+1)/2) + free := make([]uintptr, 0, (count+1)/2) + zombies := make([]*byte, 0, len(free)) + for i := 0; i < count; i++ { + obj := make([]byte, size) + p := &obj[0] + if i%2 == 0 { + keep = append(keep, p) + } else { + free = append(free, uintptr(unsafe.Pointer(p))) + } + } + + // Free the unreferenced objects. + runtime.GC() + + // Bring the free objects back to life. + for _, p := range free { + zombies = append(zombies, (*byte)(unsafe.Pointer(p))) + } + + // GC should detect the zombie objects. + runtime.GC() + println("failed") + runtime.KeepAlive(keep) + runtime.KeepAlive(zombies) +} + +func GCMemoryLimit() { + gcMemoryLimit(100) +} + +func GCMemoryLimitNoGCPercent() { + gcMemoryLimit(-1) +} + +// Test SetMemoryLimit functionality. +// +// This test lives here instead of runtime/debug because the entire +// implementation is in the runtime, and testprog gives us a more +// consistent testing environment to help avoid flakiness. 
+func gcMemoryLimit(gcPercent int) { + if oldProcs := runtime.GOMAXPROCS(4); oldProcs < 4 { + // Fail if the default GOMAXPROCS isn't at least 4. + // Whatever invokes this should check and do a proper t.Skip. + println("insufficient CPUs") + return + } + debug.SetGCPercent(gcPercent) + + const myLimit = 256 << 20 + if limit := debug.SetMemoryLimit(-1); limit != math.MaxInt64 { + print("expected MaxInt64 limit, got ", limit, " bytes instead\n") + return + } + if limit := debug.SetMemoryLimit(myLimit); limit != math.MaxInt64 { + print("expected MaxInt64 limit, got ", limit, " bytes instead\n") + return + } + if limit := debug.SetMemoryLimit(-1); limit != myLimit { + print("expected a ", myLimit, "-byte limit, got ", limit, " bytes instead\n") + return + } + + target := make(chan int64) + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer wg.Done() + + sinkSize := int(<-target / memLimitUnit) + for { + if len(memLimitSink) != sinkSize { + memLimitSink = make([]*[memLimitUnit]byte, sinkSize) + } + for i := 0; i < len(memLimitSink); i++ { + memLimitSink[i] = new([memLimitUnit]byte) + // Write to this memory to slow down the allocator, otherwise + // we get flaky behavior. See #52433. + for j := range memLimitSink[i] { + memLimitSink[i][j] = 9 + } + } + // Again, Gosched to slow down the allocator. + runtime.Gosched() + select { + case newTarget := <-target: + if newTarget == math.MaxInt64 { + return + } + sinkSize = int(newTarget / memLimitUnit) + default: + } + } + }() + var m [2]metrics.Sample + m[0].Name = "/memory/classes/total:bytes" + m[1].Name = "/memory/classes/heap/released:bytes" + + // Don't set this too high, because this is a *live heap* target which + // is not directly comparable to a total memory limit. + maxTarget := int64((myLimit / 10) * 8) + increment := int64((myLimit / 10) * 1) + for i := increment; i < maxTarget; i += increment { + target <- i + + // Check to make sure the memory limit is maintained. 
+ // We're just sampling here so if it transiently goes over we might miss it. + // The internal accounting is inconsistent anyway, so going over by a few + // pages is certainly possible. Just make sure we're within some bound. + // Note that to avoid flakiness due to #52433 (especially since we're allocating + // somewhat heavily here) this bound is kept loose. In practice the Go runtime + // should do considerably better than this bound. + bound := int64(myLimit + 16<<20) + start := time.Now() + for time.Since(start) < 200*time.Millisecond { + metrics.Read(m[:]) + retained := int64(m[0].Value.Uint64() - m[1].Value.Uint64()) + if retained > bound { + print("retained=", retained, " limit=", myLimit, " bound=", bound, "\n") + panic("exceeded memory limit by more than bound allows") + } + runtime.Gosched() + } + } + + if limit := debug.SetMemoryLimit(math.MaxInt64); limit != myLimit { + print("expected a ", myLimit, "-byte limit, got ", limit, " bytes instead\n") + return + } + println("OK") +} + +// Pick a value close to the page size. We want to m +const memLimitUnit = 8000 + +var memLimitSink []*[memLimitUnit]byte diff --git a/src/runtime/testdata/testprog/lockosthread.go b/src/runtime/testdata/testprog/lockosthread.go new file mode 100644 index 0000000..e9d7fdb --- /dev/null +++ b/src/runtime/testdata/testprog/lockosthread.go @@ -0,0 +1,246 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "os" + "runtime" + "sync" + "time" +) + +var mainTID int + +func init() { + registerInit("LockOSThreadMain", func() { + // init is guaranteed to run on the main thread. + mainTID = gettid() + }) + register("LockOSThreadMain", LockOSThreadMain) + + registerInit("LockOSThreadAlt", func() { + // Lock the OS thread now so main runs on the main thread. 
+ runtime.LockOSThread() + }) + register("LockOSThreadAlt", LockOSThreadAlt) + + registerInit("LockOSThreadAvoidsStatePropagation", func() { + // Lock the OS thread now so main runs on the main thread. + runtime.LockOSThread() + }) + register("LockOSThreadAvoidsStatePropagation", LockOSThreadAvoidsStatePropagation) + register("LockOSThreadTemplateThreadRace", LockOSThreadTemplateThreadRace) +} + +func LockOSThreadMain() { + // gettid only works on Linux, so on other platforms this just + // checks that the runtime doesn't do anything terrible. + + // This requires GOMAXPROCS=1 from the beginning to reliably + // start a goroutine on the main thread. + if runtime.GOMAXPROCS(-1) != 1 { + println("requires GOMAXPROCS=1") + os.Exit(1) + } + + ready := make(chan bool, 1) + go func() { + // Because GOMAXPROCS=1, this *should* be on the main + // thread. Stay there. + runtime.LockOSThread() + if mainTID != 0 && gettid() != mainTID { + println("failed to start goroutine on main thread") + os.Exit(1) + } + // Exit with the thread locked, which should exit the + // main thread. + ready <- true + }() + <-ready + time.Sleep(1 * time.Millisecond) + // Check that this goroutine is still running on a different + // thread. + if mainTID != 0 && gettid() == mainTID { + println("goroutine migrated to locked thread") + os.Exit(1) + } + println("OK") +} + +func LockOSThreadAlt() { + // This is running locked to the main OS thread. + + var subTID int + ready := make(chan bool, 1) + go func() { + // This goroutine must be running on a new thread. + runtime.LockOSThread() + subTID = gettid() + ready <- true + // Exit with the thread locked. + }() + <-ready + runtime.UnlockOSThread() + for i := 0; i < 100; i++ { + time.Sleep(1 * time.Millisecond) + // Check that this goroutine is running on a different thread. 
+ if subTID != 0 && gettid() == subTID { + println("locked thread reused") + os.Exit(1) + } + exists, supported := tidExists(subTID) + if !supported || !exists { + goto ok + } + } + println("sub thread", subTID, "still running") + return +ok: + println("OK") +} + +func LockOSThreadAvoidsStatePropagation() { + // This test is similar to LockOSThreadAlt in that it will detect if a thread + // which should have died is still running. However, rather than do this with + // thread IDs, it does this by unsharing state on that thread. This way, it + // also detects whether new threads were cloned from the dead thread, and not + // from a clean thread. Cloning from a locked thread is undesirable since + // cloned threads will inherit potentially unwanted OS state. + // + // unshareFs, getcwd, and chdir("/tmp") are only guaranteed to work on + // Linux, so on other platforms this just checks that the runtime doesn't + // do anything terrible. + // + // This is running locked to the main OS thread. + + // GOMAXPROCS=1 makes this fail much more reliably if a tainted thread is + // cloned from. + if runtime.GOMAXPROCS(-1) != 1 { + println("requires GOMAXPROCS=1") + os.Exit(1) + } + + if err := chdir("/"); err != nil { + println("failed to chdir:", err.Error()) + os.Exit(1) + } + // On systems other than Linux, cwd == "". + cwd, err := getcwd() + if err != nil { + println("failed to get cwd:", err.Error()) + os.Exit(1) + } + if cwd != "" && cwd != "/" { + println("unexpected cwd", cwd, " wanted /") + os.Exit(1) + } + + ready := make(chan bool, 1) + go func() { + // This goroutine must be running on a new thread. + runtime.LockOSThread() + + // Unshare details about the FS, like the CWD, with + // the rest of the process on this thread. + // On systems other than Linux, this is a no-op. 
+ if err := unshareFs(); err != nil { + if err == errNotPermitted { + println("unshare not permitted") + os.Exit(0) + } + println("failed to unshare fs:", err.Error()) + os.Exit(1) + } + // Chdir to somewhere else on this thread. + // On systems other than Linux, this is a no-op. + if err := chdir("/tmp"); err != nil { + println("failed to chdir:", err.Error()) + os.Exit(1) + } + + // The state on this thread is now considered "tainted", but it + // should no longer be observable in any other context. + + ready <- true + // Exit with the thread locked. + }() + <-ready + + // Spawn yet another goroutine and lock it. Since GOMAXPROCS=1, if + // for some reason state from the (hopefully dead) locked thread above + // propagated into a newly created thread (via clone), or that thread + // is actually being re-used, then we should get scheduled on such a + // thread with high likelihood. + done := make(chan bool) + go func() { + runtime.LockOSThread() + + // Get the CWD and check if this is the same as the main thread's + // CWD. Every thread should share the same CWD. + // On systems other than Linux, wd == "". + wd, err := getcwd() + if err != nil { + println("failed to get cwd:", err.Error()) + os.Exit(1) + } + if wd != cwd { + println("bad state from old thread propagated after it should have died") + os.Exit(1) + } + <-done + + runtime.UnlockOSThread() + }() + done <- true + runtime.UnlockOSThread() + println("OK") +} + +func LockOSThreadTemplateThreadRace() { + // This test attempts to reproduce the race described in + // golang.org/issue/38931. To do so, we must have a stop-the-world + // (achieved via ReadMemStats) racing with two LockOSThread calls. + // + // While this test attempts to line up the timing, it is only expected + // to fail (and thus hang) around 2% of the time if the race is + // present. + + // Ensure enough Ps to actually run everything in parallel. Though on + // <4 core machines, we are still at the whim of the kernel scheduler. 
+ runtime.GOMAXPROCS(4) + + go func() { + // Stop the world; race with LockOSThread below. + var m runtime.MemStats + for { + runtime.ReadMemStats(&m) + } + }() + + // Try to synchronize both LockOSThreads. + start := time.Now().Add(10 * time.Millisecond) + + var wg sync.WaitGroup + wg.Add(2) + + for i := 0; i < 2; i++ { + go func() { + for time.Now().Before(start) { + } + + // Add work to the local runq to trigger early startm + // in handoffp. + go func() {}() + + runtime.LockOSThread() + runtime.Gosched() // add a preemption point. + wg.Done() + }() + } + + wg.Wait() + // If both LockOSThreads completed then we did not hit the race. + println("OK") +} diff --git a/src/runtime/testdata/testprog/main.go b/src/runtime/testdata/testprog/main.go new file mode 100644 index 0000000..ae491a2 --- /dev/null +++ b/src/runtime/testdata/testprog/main.go @@ -0,0 +1,35 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import "os" + +var cmds = map[string]func(){} + +func register(name string, f func()) { + if cmds[name] != nil { + panic("duplicate registration: " + name) + } + cmds[name] = f +} + +func registerInit(name string, f func()) { + if len(os.Args) >= 2 && os.Args[1] == name { + f() + } +} + +func main() { + if len(os.Args) < 2 { + println("usage: " + os.Args[0] + " name-of-test") + return + } + f := cmds[os.Args[1]] + if f == nil { + println("unknown function: " + os.Args[1]) + return + } + f() +} diff --git a/src/runtime/testdata/testprog/map.go b/src/runtime/testdata/testprog/map.go new file mode 100644 index 0000000..5524289 --- /dev/null +++ b/src/runtime/testdata/testprog/map.go @@ -0,0 +1,77 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import "runtime" + +func init() { + register("concurrentMapWrites", concurrentMapWrites) + register("concurrentMapReadWrite", concurrentMapReadWrite) + register("concurrentMapIterateWrite", concurrentMapIterateWrite) +} + +func concurrentMapWrites() { + m := map[int]int{} + c := make(chan struct{}) + go func() { + for i := 0; i < 10000; i++ { + m[5] = 0 + runtime.Gosched() + } + c <- struct{}{} + }() + go func() { + for i := 0; i < 10000; i++ { + m[6] = 0 + runtime.Gosched() + } + c <- struct{}{} + }() + <-c + <-c +} + +func concurrentMapReadWrite() { + m := map[int]int{} + c := make(chan struct{}) + go func() { + for i := 0; i < 10000; i++ { + m[5] = 0 + runtime.Gosched() + } + c <- struct{}{} + }() + go func() { + for i := 0; i < 10000; i++ { + _ = m[6] + runtime.Gosched() + } + c <- struct{}{} + }() + <-c + <-c +} + +func concurrentMapIterateWrite() { + m := map[int]int{} + c := make(chan struct{}) + go func() { + for i := 0; i < 10000; i++ { + m[5] = 0 + runtime.Gosched() + } + c <- struct{}{} + }() + go func() { + for i := 0; i < 10000; i++ { + for range m { + } + runtime.Gosched() + } + c <- struct{}{} + }() + <-c + <-c +} diff --git a/src/runtime/testdata/testprog/memprof.go b/src/runtime/testdata/testprog/memprof.go new file mode 100644 index 0000000..0392e60 --- /dev/null +++ b/src/runtime/testdata/testprog/memprof.go @@ -0,0 +1,51 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "bytes" + "fmt" + "os" + "runtime" + "runtime/pprof" +) + +func init() { + register("MemProf", MemProf) +} + +var memProfBuf bytes.Buffer +var memProfStr string + +func MemProf() { + // Force heap sampling for determinism. 
+ runtime.MemProfileRate = 1 + + for i := 0; i < 10; i++ { + fmt.Fprintf(&memProfBuf, "%*d\n", i, i) + } + memProfStr = memProfBuf.String() + + runtime.GC() + + f, err := os.CreateTemp("", "memprof") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := pprof.WriteHeapProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + name := f.Name() + if err := f.Close(); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + fmt.Println(name) +} diff --git a/src/runtime/testdata/testprog/misc.go b/src/runtime/testdata/testprog/misc.go new file mode 100644 index 0000000..7ccd389 --- /dev/null +++ b/src/runtime/testdata/testprog/misc.go @@ -0,0 +1,15 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import "runtime" + +func init() { + register("NumGoroutine", NumGoroutine) +} + +func NumGoroutine() { + println(runtime.NumGoroutine()) +} diff --git a/src/runtime/testdata/testprog/numcpu_freebsd.go b/src/runtime/testdata/testprog/numcpu_freebsd.go new file mode 100644 index 0000000..310c212 --- /dev/null +++ b/src/runtime/testdata/testprog/numcpu_freebsd.go @@ -0,0 +1,140 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "bytes" + "fmt" + "os" + "os/exec" + "regexp" + "runtime" + "strconv" + "strings" + "syscall" +) + +var ( + cpuSetRE = regexp.MustCompile(`(\d,?)+`) +) + +func init() { + register("FreeBSDNumCPU", FreeBSDNumCPU) + register("FreeBSDNumCPUHelper", FreeBSDNumCPUHelper) +} + +func FreeBSDNumCPUHelper() { + fmt.Printf("%d\n", runtime.NumCPU()) +} + +func FreeBSDNumCPU() { + _, err := exec.LookPath("cpuset") + if err != nil { + // Can not test without cpuset command. 
+ fmt.Println("OK") + return + } + _, err = exec.LookPath("sysctl") + if err != nil { + // Can not test without sysctl command. + fmt.Println("OK") + return + } + cmd := exec.Command("sysctl", "-n", "kern.smp.active") + output, err := cmd.CombinedOutput() + if err != nil { + fmt.Printf("fail to launch '%s', error: %s, output: %s\n", strings.Join(cmd.Args, " "), err, output) + return + } + if !bytes.Equal(output, []byte("1\n")) { + // SMP mode deactivated in kernel. + fmt.Println("OK") + return + } + + list, err := getList() + if err != nil { + fmt.Printf("%s\n", err) + return + } + err = checkNCPU(list) + if err != nil { + fmt.Printf("%s\n", err) + return + } + if len(list) >= 2 { + err = checkNCPU(list[:len(list)-1]) + if err != nil { + fmt.Printf("%s\n", err) + return + } + } + fmt.Println("OK") + return +} + +func getList() ([]string, error) { + pid := syscall.Getpid() + + // Launch cpuset to print a list of available CPUs: pid <PID> mask: 0, 1, 2, 3. + cmd := exec.Command("cpuset", "-g", "-p", strconv.Itoa(pid)) + cmdline := strings.Join(cmd.Args, " ") + output, err := cmd.CombinedOutput() + if err != nil { + return nil, fmt.Errorf("fail to execute '%s': %s", cmdline, err) + } + output, _, ok := bytes.Cut(output, []byte("\n")) + if !ok { + return nil, fmt.Errorf("invalid output from '%s', '\\n' not found: %s", cmdline, output) + } + + _, cpus, ok := bytes.Cut(output, []byte(":")) + if !ok { + return nil, fmt.Errorf("invalid output from '%s', ':' not found: %s", cmdline, output) + } + + var list []string + for _, val := range bytes.Split(cpus, []byte(",")) { + index := string(bytes.TrimSpace(val)) + if len(index) == 0 { + continue + } + list = append(list, index) + } + if len(list) == 0 { + return nil, fmt.Errorf("empty CPU list from '%s': %s", cmdline, output) + } + return list, nil +} + +func checkNCPU(list []string) error { + listString := strings.Join(list, ",") + if len(listString) == 0 { + return fmt.Errorf("could not check against an empty CPU list") + } 
+ + cListString := cpuSetRE.FindString(listString) + if len(cListString) == 0 { + return fmt.Errorf("invalid cpuset output '%s'", listString) + } + // Launch FreeBSDNumCPUHelper() with specified CPUs list. + cmd := exec.Command("cpuset", "-l", cListString, os.Args[0], "FreeBSDNumCPUHelper") + cmdline := strings.Join(cmd.Args, " ") + output, err := cmd.CombinedOutput() + if err != nil { + return fmt.Errorf("fail to launch child '%s', error: %s, output: %s", cmdline, err, output) + } + + // NumCPU from FreeBSDNumCPUHelper come with '\n'. + output = bytes.TrimSpace(output) + n, err := strconv.Atoi(string(output)) + if err != nil { + return fmt.Errorf("fail to parse output from child '%s', error: %s, output: %s", cmdline, err, output) + } + if n != len(list) { + return fmt.Errorf("runtime.NumCPU() expected to %d, got %d when run with CPU list %s", len(list), n, cListString) + } + return nil +} diff --git a/src/runtime/testdata/testprog/panicprint.go b/src/runtime/testdata/testprog/panicprint.go new file mode 100644 index 0000000..c8deabe --- /dev/null +++ b/src/runtime/testdata/testprog/panicprint.go @@ -0,0 +1,111 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +type MyBool bool +type MyComplex128 complex128 +type MyComplex64 complex64 +type MyFloat32 float32 +type MyFloat64 float64 +type MyInt int +type MyInt8 int8 +type MyInt16 int16 +type MyInt32 int32 +type MyInt64 int64 +type MyString string +type MyUint uint +type MyUint8 uint8 +type MyUint16 uint16 +type MyUint32 uint32 +type MyUint64 uint64 +type MyUintptr uintptr + +func panicCustomComplex64() { + panic(MyComplex64(0.11 + 3i)) +} + +func panicCustomComplex128() { + panic(MyComplex128(32.1 + 10i)) +} + +func panicCustomString() { + panic(MyString("Panic")) +} + +func panicCustomBool() { + panic(MyBool(true)) +} + +func panicCustomInt() { + panic(MyInt(93)) +} + +func panicCustomInt8() { + panic(MyInt8(93)) +} + +func panicCustomInt16() { + panic(MyInt16(93)) +} + +func panicCustomInt32() { + panic(MyInt32(93)) +} + +func panicCustomInt64() { + panic(MyInt64(93)) +} + +func panicCustomUint() { + panic(MyUint(93)) +} + +func panicCustomUint8() { + panic(MyUint8(93)) +} + +func panicCustomUint16() { + panic(MyUint16(93)) +} + +func panicCustomUint32() { + panic(MyUint32(93)) +} + +func panicCustomUint64() { + panic(MyUint64(93)) +} + +func panicCustomUintptr() { + panic(MyUintptr(93)) +} + +func panicCustomFloat64() { + panic(MyFloat64(-93.70)) +} + +func panicCustomFloat32() { + panic(MyFloat32(-93.70)) +} + +func init() { + register("panicCustomComplex64", panicCustomComplex64) + register("panicCustomComplex128", panicCustomComplex128) + register("panicCustomBool", panicCustomBool) + register("panicCustomFloat32", panicCustomFloat32) + register("panicCustomFloat64", panicCustomFloat64) + register("panicCustomInt", panicCustomInt) + register("panicCustomInt8", panicCustomInt8) + register("panicCustomInt16", panicCustomInt16) + register("panicCustomInt32", panicCustomInt32) + register("panicCustomInt64", panicCustomInt64) + register("panicCustomString", panicCustomString) + register("panicCustomUint", panicCustomUint) + register("panicCustomUint8", 
panicCustomUint8) + register("panicCustomUint16", panicCustomUint16) + register("panicCustomUint32", panicCustomUint32) + register("panicCustomUint64", panicCustomUint64) + register("panicCustomUintptr", panicCustomUintptr) +} diff --git a/src/runtime/testdata/testprog/panicrace.go b/src/runtime/testdata/testprog/panicrace.go new file mode 100644 index 0000000..f058994 --- /dev/null +++ b/src/runtime/testdata/testprog/panicrace.go @@ -0,0 +1,27 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "runtime" + "sync" +) + +func init() { + register("PanicRace", PanicRace) +} + +func PanicRace() { + var wg sync.WaitGroup + wg.Add(1) + go func() { + defer func() { + wg.Done() + runtime.Gosched() + }() + panic("crash") + }() + wg.Wait() +} diff --git a/src/runtime/testdata/testprog/preempt.go b/src/runtime/testdata/testprog/preempt.go new file mode 100644 index 0000000..fb6755a --- /dev/null +++ b/src/runtime/testdata/testprog/preempt.go @@ -0,0 +1,75 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "runtime" + "runtime/debug" + "sync/atomic" +) + +func init() { + register("AsyncPreempt", AsyncPreempt) +} + +func AsyncPreempt() { + // Run with just 1 GOMAXPROCS so the runtime is required to + // use scheduler preemption. + runtime.GOMAXPROCS(1) + // Disable GC so we have complete control of what we're testing. + debug.SetGCPercent(-1) + // Out of an abundance of caution, also make sure that there are + // no GCs actively in progress. The sweep phase of a GC cycle + // for instance tries to preempt Ps at the very beginning. + runtime.GC() + + // Start a goroutine with no sync safe-points. 
+ var ready, ready2 uint32 + go func() { + for { + atomic.StoreUint32(&ready, 1) + dummy() + dummy() + } + }() + // Also start one with a frameless function. + // This is an especially interesting case for + // LR machines. + go func() { + atomic.AddUint32(&ready2, 1) + frameless() + }() + // Also test empty infinite loop. + go func() { + atomic.AddUint32(&ready2, 1) + for { + } + }() + + // Wait for the goroutine to stop passing through sync + // safe-points. + for atomic.LoadUint32(&ready) == 0 || atomic.LoadUint32(&ready2) < 2 { + runtime.Gosched() + } + + // Run a GC, which will have to stop the goroutine for STW and + // for stack scanning. If this doesn't work, the test will + // deadlock and timeout. + runtime.GC() + + println("OK") +} + +//go:noinline +func frameless() { + for i := int64(0); i < 1<<62; i++ { + out += i * i * i * i * i * 12345 + } +} + +var out int64 + +//go:noinline +func dummy() {} diff --git a/src/runtime/testdata/testprog/signal.go b/src/runtime/testdata/testprog/signal.go new file mode 100644 index 0000000..cc5ac8a --- /dev/null +++ b/src/runtime/testdata/testprog/signal.go @@ -0,0 +1,30 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows && !plan9 +// +build !windows,!plan9 + +package main + +import ( + "syscall" + "time" +) + +func init() { + register("SignalExitStatus", SignalExitStatus) +} + +func SignalExitStatus() { + syscall.Kill(syscall.Getpid(), syscall.SIGTERM) + + // Should die immediately, but we've seen flakiness on various + // systems (see issue 14063). It's possible that the signal is + // being delivered to a different thread and we are returning + // and exiting before that thread runs again. Give the program + // a little while to die to make sure we pick up the signal + // before we return and exit the program. The time here + // shouldn't matter--we'll never really sleep this long. 
+ time.Sleep(time.Second) +} diff --git a/src/runtime/testdata/testprog/sleep.go b/src/runtime/testdata/testprog/sleep.go new file mode 100644 index 0000000..b230e60 --- /dev/null +++ b/src/runtime/testdata/testprog/sleep.go @@ -0,0 +1,22 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "os" + "time" +) + +// for golang.org/issue/27250 + +func init() { + register("After1", After1) +} + +func After1() { + os.Stdout.WriteString("ready\n") + os.Stdout.Close() + <-time.After(1 * time.Second) +} diff --git a/src/runtime/testdata/testprog/stringconcat.go b/src/runtime/testdata/testprog/stringconcat.go new file mode 100644 index 0000000..f233e66 --- /dev/null +++ b/src/runtime/testdata/testprog/stringconcat.go @@ -0,0 +1,20 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import "strings" + +func init() { + register("stringconcat", stringconcat) +} + +func stringconcat() { + s0 := strings.Repeat("0", 1<<10) + s1 := strings.Repeat("1", 1<<10) + s2 := strings.Repeat("2", 1<<10) + s3 := strings.Repeat("3", 1<<10) + s := s0 + s1 + s2 + s3 + panic(s) +} diff --git a/src/runtime/testdata/testprog/syscall_windows.go b/src/runtime/testdata/testprog/syscall_windows.go new file mode 100644 index 0000000..71bf384 --- /dev/null +++ b/src/runtime/testdata/testprog/syscall_windows.go @@ -0,0 +1,73 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "internal/syscall/windows" + "runtime" + "sync" + "syscall" + "unsafe" +) + +func init() { + register("RaiseException", RaiseException) + register("ZeroDivisionException", ZeroDivisionException) + register("StackMemory", StackMemory) +} + +func RaiseException() { + const EXCEPTION_NONCONTINUABLE = 1 + mod := syscall.MustLoadDLL("kernel32.dll") + proc := mod.MustFindProc("RaiseException") + proc.Call(0xbad, EXCEPTION_NONCONTINUABLE, 0, 0) + println("RaiseException should not return") +} + +func ZeroDivisionException() { + x := 1 + y := 0 + z := x / y + println(z) +} + +func getPagefileUsage() (uintptr, error) { + p, err := syscall.GetCurrentProcess() + if err != nil { + return 0, err + } + var m windows.PROCESS_MEMORY_COUNTERS + err = windows.GetProcessMemoryInfo(p, &m, uint32(unsafe.Sizeof(m))) + if err != nil { + return 0, err + } + return m.PagefileUsage, nil +} + +func StackMemory() { + mem1, err := getPagefileUsage() + if err != nil { + panic(err) + } + const threadCount = 100 + var wg sync.WaitGroup + for i := 0; i < threadCount; i++ { + wg.Add(1) + go func() { + runtime.LockOSThread() + wg.Done() + select {} + }() + } + wg.Wait() + mem2, err := getPagefileUsage() + if err != nil { + panic(err) + } + // assumes that this process creates 1 thread for each + // thread locked goroutine plus extra 5 threads + // like sysmon and others + print((mem2 - mem1) / (threadCount + 5)) +} diff --git a/src/runtime/testdata/testprog/syscalls.go b/src/runtime/testdata/testprog/syscalls.go new file mode 100644 index 0000000..098d5ca --- /dev/null +++ b/src/runtime/testdata/testprog/syscalls.go @@ -0,0 +1,11 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "errors" +) + +var errNotPermitted = errors.New("operation not permitted") diff --git a/src/runtime/testdata/testprog/syscalls_linux.go b/src/runtime/testdata/testprog/syscalls_linux.go new file mode 100644 index 0000000..48f8014 --- /dev/null +++ b/src/runtime/testdata/testprog/syscalls_linux.go @@ -0,0 +1,58 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "bytes" + "fmt" + "os" + "syscall" +) + +func gettid() int { + return syscall.Gettid() +} + +func tidExists(tid int) (exists, supported bool) { + stat, err := os.ReadFile(fmt.Sprintf("/proc/self/task/%d/stat", tid)) + if os.IsNotExist(err) { + return false, true + } + // Check if it's a zombie thread. + state := bytes.Fields(stat)[2] + return !(len(state) == 1 && state[0] == 'Z'), true +} + +func getcwd() (string, error) { + if !syscall.ImplementsGetwd { + return "", nil + } + // Use the syscall to get the current working directory. + // This is imperative for checking for OS thread state + // after an unshare since os.Getwd might just check the + // environment, or use some other mechanism. + var buf [4096]byte + n, err := syscall.Getcwd(buf[:]) + if err != nil { + return "", err + } + // Subtract one for null terminator. + return string(buf[:n-1]), nil +} + +func unshareFs() error { + err := syscall.Unshare(syscall.CLONE_FS) + if err != nil { + errno, ok := err.(syscall.Errno) + if ok && errno == syscall.EPERM { + return errNotPermitted + } + } + return err +} + +func chdir(path string) error { + return syscall.Chdir(path) +} diff --git a/src/runtime/testdata/testprog/syscalls_none.go b/src/runtime/testdata/testprog/syscalls_none.go new file mode 100644 index 0000000..068bb59 --- /dev/null +++ b/src/runtime/testdata/testprog/syscalls_none.go @@ -0,0 +1,28 @@ +// Copyright 2017 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !linux +// +build !linux + +package main + +func gettid() int { + return 0 +} + +func tidExists(tid int) (exists, supported bool) { + return false, false +} + +func getcwd() (string, error) { + return "", nil +} + +func unshareFs() error { + return nil +} + +func chdir(path string) error { + return nil +} diff --git a/src/runtime/testdata/testprog/timeprof.go b/src/runtime/testdata/testprog/timeprof.go new file mode 100644 index 0000000..1e90af4 --- /dev/null +++ b/src/runtime/testdata/testprog/timeprof.go @@ -0,0 +1,45 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "fmt" + "os" + "runtime/pprof" + "time" +) + +func init() { + register("TimeProf", TimeProf) +} + +func TimeProf() { + f, err := os.CreateTemp("", "timeprof") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := pprof.StartCPUProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + t0 := time.Now() + // We should get a profiling signal 100 times a second, + // so running for 1/10 second should be sufficient. + for time.Since(t0) < time.Second/10 { + } + + pprof.StopCPUProfile() + + name := f.Name() + if err := f.Close(); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + fmt.Println(name) +} diff --git a/src/runtime/testdata/testprog/traceback_ancestors.go b/src/runtime/testdata/testprog/traceback_ancestors.go new file mode 100644 index 0000000..8fc1aa7 --- /dev/null +++ b/src/runtime/testdata/testprog/traceback_ancestors.go @@ -0,0 +1,96 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "bytes" + "fmt" + "runtime" + "strings" +) + +func init() { + register("TracebackAncestors", TracebackAncestors) +} + +const numGoroutines = 3 +const numFrames = 2 + +func TracebackAncestors() { + w := make(chan struct{}) + recurseThenCallGo(w, numGoroutines, numFrames, true) + <-w + printStack() + close(w) +} + +var ignoreGoroutines = make(map[string]bool) + +func printStack() { + buf := make([]byte, 1024) + for { + n := runtime.Stack(buf, true) + if n < len(buf) { + all := string(buf[:n]) + var saved string + + // Delete any ignored goroutines, if present. + for all != "" { + var g string + g, all, _ = strings.Cut(all, "\n\n") + + if strings.HasPrefix(g, "goroutine ") { + id, _, _ := strings.Cut(strings.TrimPrefix(g, "goroutine "), " ") + if ignoreGoroutines[id] { + continue + } + } + if saved != "" { + saved += "\n\n" + } + saved += g + } + + fmt.Print(saved) + return + } + buf = make([]byte, 2*len(buf)) + } +} + +func recurseThenCallGo(w chan struct{}, frames int, goroutines int, main bool) { + if frames == 0 { + // Signal to TracebackAncestors that we are done recursing and starting goroutines. + w <- struct{}{} + <-w + return + } + if goroutines == 0 { + // Record which goroutine this is so we can ignore it + // in the traceback if it hasn't finished exiting by + // the time we printStack. + if !main { + ignoreGoroutines[goroutineID()] = true + } + + // Start the next goroutine now that there are no more recursions left + // for this current goroutine. 
+ go recurseThenCallGo(w, frames-1, numFrames, false) + return + } + recurseThenCallGo(w, frames, goroutines-1, main) +} + +func goroutineID() string { + buf := make([]byte, 128) + runtime.Stack(buf, false) + prefix := []byte("goroutine ") + var found bool + if buf, found = bytes.CutPrefix(buf, prefix); !found { + panic(fmt.Sprintf("expected %q at beginning of traceback:\n%s", prefix, buf)) + } + id, _, _ := bytes.Cut(buf, []byte(" ")) + return string(id) +} diff --git a/src/runtime/testdata/testprog/unsafe.go b/src/runtime/testdata/testprog/unsafe.go new file mode 100644 index 0000000..021b08f --- /dev/null +++ b/src/runtime/testdata/testprog/unsafe.go @@ -0,0 +1,12 @@ +package main + +import "unsafe" + +func init() { + register("panicOnNilAndEleSizeIsZero", panicOnNilAndEleSizeIsZero) +} + +func panicOnNilAndEleSizeIsZero() { + var p *struct{} + _ = unsafe.Slice(p, 5) +} diff --git a/src/runtime/testdata/testprog/vdso.go b/src/runtime/testdata/testprog/vdso.go new file mode 100644 index 0000000..b18bc74 --- /dev/null +++ b/src/runtime/testdata/testprog/vdso.go @@ -0,0 +1,54 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Invoke signal handler in the VDSO context (see issue 32912). + +package main + +import ( + "fmt" + "os" + "runtime/pprof" + "time" +) + +func init() { + register("SignalInVDSO", signalInVDSO) +} + +func signalInVDSO() { + f, err := os.CreateTemp("", "timeprofnow") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := pprof.StartCPUProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + t0 := time.Now() + t1 := t0 + // We should get a profiling signal 100 times a second, + // so running for 1 second should be sufficient. 
+ for t1.Sub(t0) < time.Second { + t1 = time.Now() + } + + pprof.StopCPUProfile() + + name := f.Name() + if err := f.Close(); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := os.Remove(name); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + fmt.Println("success") +} diff --git a/src/runtime/testdata/testprogcgo/aprof.go b/src/runtime/testdata/testprogcgo/aprof.go new file mode 100644 index 0000000..1687014 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/aprof.go @@ -0,0 +1,56 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// Test that SIGPROF received in C code does not crash the process +// looking for the C code's func pointer. + +// This is a regression test for issue 14599, where profiling fails when the +// function is the first C function. Exported functions are the first C +// functions, so we use an exported function. Exported functions are created in +// lexicographical order of source files, so this file is named aprof.go to +// ensure its function is first. + +// extern void CallGoNop(); +import "C" + +import ( + "bytes" + "fmt" + "runtime/pprof" + "time" +) + +func init() { + register("CgoCCodeSIGPROF", CgoCCodeSIGPROF) +} + +//export GoNop +func GoNop() {} + +func CgoCCodeSIGPROF() { + c := make(chan bool) + go func() { + <-c + start := time.Now() + for i := 0; i < 1e7; i++ { + if i%1000 == 0 { + if time.Since(start) > time.Second { + break + } + } + C.CallGoNop() + } + c <- true + }() + + var buf bytes.Buffer + pprof.StartCPUProfile(&buf) + c <- true + <-c + pprof.StopCPUProfile() + + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/aprof_c.c b/src/runtime/testdata/testprogcgo/aprof_c.c new file mode 100644 index 0000000..d588e13 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/aprof_c.c @@ -0,0 +1,9 @@ +// Copyright 2021 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "_cgo_export.h" + +void CallGoNop() { + GoNop(); +} diff --git a/src/runtime/testdata/testprogcgo/bigstack1_windows.c b/src/runtime/testdata/testprogcgo/bigstack1_windows.c new file mode 100644 index 0000000..551fb68 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/bigstack1_windows.c @@ -0,0 +1,12 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This is not in bigstack_windows.c because it needs to be part of +// testprogcgo but is not part of the DLL built from bigstack_windows.c. + +#include "_cgo_export.h" + +void CallGoBigStack1(char* p) { + goBigStack1(p); +} diff --git a/src/runtime/testdata/testprogcgo/bigstack_windows.c b/src/runtime/testdata/testprogcgo/bigstack_windows.c new file mode 100644 index 0000000..cd85ac8 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/bigstack_windows.c @@ -0,0 +1,46 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// This test source is used by both TestBigStackCallbackCgo (linked +// directly into the Go binary) and TestBigStackCallbackSyscall +// (compiled into a DLL). + +#include <windows.h> +#include <stdio.h> + +#ifndef STACK_SIZE_PARAM_IS_A_RESERVATION +#define STACK_SIZE_PARAM_IS_A_RESERVATION 0x00010000 +#endif + +typedef void callback(char*); + +// Allocate a stack that's much larger than the default. +static const int STACK_SIZE = 16<<20; + +static callback *bigStackCallback; + +static void useStack(int bytes) { + // Windows doesn't like huge frames, so we grow the stack 64k at a time. 
+ char x[64<<10]; + if (bytes < sizeof x) { + bigStackCallback(x); + } else { + useStack(bytes - sizeof x); + } +} + +static DWORD WINAPI threadEntry(LPVOID lpParam) { + useStack(STACK_SIZE - (128<<10)); + return 0; +} + +void bigStack(callback *cb) { + bigStackCallback = cb; + HANDLE hThread = CreateThread(NULL, STACK_SIZE, threadEntry, NULL, STACK_SIZE_PARAM_IS_A_RESERVATION, NULL); + if (hThread == NULL) { + fprintf(stderr, "CreateThread failed\n"); + exit(1); + } + WaitForSingleObject(hThread, INFINITE); +} diff --git a/src/runtime/testdata/testprogcgo/bigstack_windows.go b/src/runtime/testdata/testprogcgo/bigstack_windows.go new file mode 100644 index 0000000..135b5fc --- /dev/null +++ b/src/runtime/testdata/testprogcgo/bigstack_windows.go @@ -0,0 +1,27 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +/* +typedef void callback(char*); +extern void CallGoBigStack1(char*); +extern void bigStack(callback*); +*/ +import "C" + +func init() { + register("BigStack", BigStack) +} + +func BigStack() { + // Create a large thread stack and call back into Go to test + // if Go correctly determines the stack bounds. + C.bigStack((*C.callback)(C.CallGoBigStack1)) +} + +//export goBigStack1 +func goBigStack1(x *C.char) { + println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/callback.go b/src/runtime/testdata/testprogcgo/callback.go new file mode 100644 index 0000000..25f0715 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/callback.go @@ -0,0 +1,94 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <pthread.h> + +void go_callback(); + +static void *thr(void *arg) { + go_callback(); + return 0; +} + +static void foo() { + pthread_t th; + pthread_attr_t attr; + pthread_attr_init(&attr); + pthread_attr_setstacksize(&attr, 256 << 10); + pthread_create(&th, &attr, thr, 0); + pthread_join(th, 0); +} +*/ +import "C" + +import ( + "fmt" + "os" + "runtime" +) + +func init() { + register("CgoCallbackGC", CgoCallbackGC) +} + +//export go_callback +func go_callback() { + runtime.GC() + grow() + runtime.GC() +} + +var cnt int + +func grow() { + x := 10000 + sum := 0 + if grow1(&x, &sum) == 0 { + panic("bad") + } +} + +func grow1(x, sum *int) int { + if *x == 0 { + return *sum + 1 + } + *x-- + sum1 := *sum + *x + return grow1(x, &sum1) +} + +func CgoCallbackGC() { + P := 100 + if os.Getenv("RUNTIME_TEST_SHORT") != "" { + P = 10 + } + done := make(chan bool) + // allocate a bunch of stack frames and spray them with pointers + for i := 0; i < P; i++ { + go func() { + grow() + done <- true + }() + } + for i := 0; i < P; i++ { + <-done + } + // now give these stack frames to cgo callbacks + for i := 0; i < P; i++ { + go func() { + C.foo() + done <- true + }() + } + for i := 0; i < P; i++ { + <-done + } + fmt.Printf("OK\n") +} diff --git a/src/runtime/testdata/testprogcgo/catchpanic.go b/src/runtime/testdata/testprogcgo/catchpanic.go new file mode 100644 index 0000000..c722d40 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/catchpanic.go @@ -0,0 +1,47 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <signal.h> +#include <stdlib.h> +#include <string.h> + +static void abrthandler(int signum) { + if (signum == SIGABRT) { + exit(0); // success + } +} + +void registerAbortHandler() { + struct sigaction act; + memset(&act, 0, sizeof act); + act.sa_handler = abrthandler; + sigaction(SIGABRT, &act, NULL); +} + +static void __attribute__ ((constructor)) sigsetup(void) { + if (getenv("CGOCATCHPANIC_EARLY_HANDLER") == NULL) + return; + registerAbortHandler(); +} +*/ +import "C" +import "os" + +func init() { + register("CgoCatchPanic", CgoCatchPanic) +} + +// Test that the SIGABRT raised by panic can be caught by an early signal handler. +func CgoCatchPanic() { + if _, ok := os.LookupEnv("CGOCATCHPANIC_EARLY_HANDLER"); !ok { + C.registerAbortHandler() + } + panic("catch me") +} diff --git a/src/runtime/testdata/testprogcgo/cgo.go b/src/runtime/testdata/testprogcgo/cgo.go new file mode 100644 index 0000000..a587db3 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/cgo.go @@ -0,0 +1,108 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +/* +void foo1(void) {} +void foo2(void* p) {} +*/ +import "C" +import ( + "fmt" + "os" + "runtime" + "strconv" + "time" + "unsafe" +) + +func init() { + register("CgoSignalDeadlock", CgoSignalDeadlock) + register("CgoTraceback", CgoTraceback) + register("CgoCheckBytes", CgoCheckBytes) +} + +func CgoSignalDeadlock() { + runtime.GOMAXPROCS(100) + ping := make(chan bool) + go func() { + for i := 0; ; i++ { + runtime.Gosched() + select { + case done := <-ping: + if done { + ping <- true + return + } + ping <- true + default: + } + func() { + defer func() { + recover() + }() + var s *string + *s = "" + fmt.Printf("continued after expected panic\n") + }() + } + }() + time.Sleep(time.Millisecond) + start := time.Now() + var times []time.Duration + n := 64 + if os.Getenv("RUNTIME_TEST_SHORT") != "" { + n = 16 + } + for i := 0; i < n; i++ { + go func() { + runtime.LockOSThread() + select {} + }() + go func() { + runtime.LockOSThread() + select {} + }() + time.Sleep(time.Millisecond) + ping <- false + select { + case <-ping: + times = append(times, time.Since(start)) + case <-time.After(time.Second): + fmt.Printf("HANG 1 %v\n", times) + return + } + } + ping <- true + select { + case <-ping: + case <-time.After(time.Second): + fmt.Printf("HANG 2 %v\n", times) + return + } + fmt.Printf("OK\n") +} + +func CgoTraceback() { + C.foo1() + buf := make([]byte, 1) + runtime.Stack(buf, true) + fmt.Printf("OK\n") +} + +func CgoCheckBytes() { + try, _ := strconv.Atoi(os.Getenv("GO_CGOCHECKBYTES_TRY")) + if try <= 0 { + try = 1 + } + b := make([]byte, 1e6*try) + start := time.Now() + for i := 0; i < 1e3*try; i++ { + C.foo2(unsafe.Pointer(&b[0])) + if time.Since(start) > time.Second { + break + } + } +} diff --git a/src/runtime/testdata/testprogcgo/crash.go b/src/runtime/testdata/testprogcgo/crash.go new file mode 100644 index 0000000..4d83132 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/crash.go @@ -0,0 +1,45 @@ +// Copyright 2015 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import ( + "fmt" + "runtime" +) + +func init() { + register("Crash", Crash) +} + +func test(name string) { + defer func() { + if x := recover(); x != nil { + fmt.Printf(" recovered") + } + fmt.Printf(" done\n") + }() + fmt.Printf("%s:", name) + var s *string + _ = *s + fmt.Print("SHOULD NOT BE HERE") +} + +func testInNewThread(name string) { + c := make(chan bool) + go func() { + runtime.LockOSThread() + test(name) + c <- true + }() + <-c +} + +func Crash() { + runtime.LockOSThread() + test("main") + testInNewThread("new-thread") + testInNewThread("second-new-thread") + test("main-again") +} diff --git a/src/runtime/testdata/testprogcgo/deadlock.go b/src/runtime/testdata/testprogcgo/deadlock.go new file mode 100644 index 0000000..2cc68a8 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/deadlock.go @@ -0,0 +1,30 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +/* +char *geterror() { + return "cgo error"; +} +*/ +import "C" +import ( + "fmt" +) + +func init() { + register("CgoPanicDeadlock", CgoPanicDeadlock) +} + +type cgoError struct{} + +func (cgoError) Error() string { + fmt.Print("") // necessary to trigger the deadlock + return C.GoString(C.geterror()) +} + +func CgoPanicDeadlock() { + panic(cgoError{}) +} diff --git a/src/runtime/testdata/testprogcgo/dll_windows.go b/src/runtime/testdata/testprogcgo/dll_windows.go new file mode 100644 index 0000000..25380fb --- /dev/null +++ b/src/runtime/testdata/testprogcgo/dll_windows.go @@ -0,0 +1,25 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +/* +#include <windows.h> + +DWORD getthread() { + return GetCurrentThreadId(); +} +*/ +import "C" +import "runtime/testdata/testprogcgo/windows" + +func init() { + register("CgoDLLImportsMain", CgoDLLImportsMain) +} + +func CgoDLLImportsMain() { + C.getthread() + windows.GetThread() + println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/dropm.go b/src/runtime/testdata/testprogcgo/dropm.go new file mode 100644 index 0000000..700b7fa --- /dev/null +++ b/src/runtime/testdata/testprogcgo/dropm.go @@ -0,0 +1,60 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +// Test that a sequence of callbacks from C to Go get the same m. +// This failed to be true on arm and arm64, which was the root cause +// of issue 13881. + +package main + +/* +#include <stddef.h> +#include <pthread.h> + +extern void GoCheckM(); + +static void* thread(void* arg __attribute__ ((unused))) { + GoCheckM(); + return NULL; +} + +static void CheckM() { + pthread_t tid; + pthread_create(&tid, NULL, thread, NULL); + pthread_join(tid, NULL); + pthread_create(&tid, NULL, thread, NULL); + pthread_join(tid, NULL); +} +*/ +import "C" + +import ( + "fmt" + "os" +) + +func init() { + register("EnsureDropM", EnsureDropM) +} + +var savedM uintptr + +//export GoCheckM +func GoCheckM() { + m := runtime_getm_for_test() + if savedM == 0 { + savedM = m + } else if savedM != m { + fmt.Printf("m == %x want %x\n", m, savedM) + os.Exit(1) + } +} + +func EnsureDropM() { + C.CheckM() + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/dropm_stub.go b/src/runtime/testdata/testprogcgo/dropm_stub.go new file mode 100644 index 0000000..6997cfd --- /dev/null +++ b/src/runtime/testdata/testprogcgo/dropm_stub.go @@ -0,0 +1,12 @@ +// Copyright 2016 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import _ "unsafe" // for go:linkname + +// Defined in the runtime package. +// +//go:linkname runtime_getm_for_test runtime.getm +func runtime_getm_for_test() uintptr diff --git a/src/runtime/testdata/testprogcgo/eintr.go b/src/runtime/testdata/testprogcgo/eintr.go new file mode 100644 index 0000000..6e9677f --- /dev/null +++ b/src/runtime/testdata/testprogcgo/eintr.go @@ -0,0 +1,247 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <errno.h> +#include <signal.h> +#include <string.h> + +static int clearRestart(int sig) { + struct sigaction sa; + + memset(&sa, 0, sizeof sa); + if (sigaction(sig, NULL, &sa) < 0) { + return errno; + } + sa.sa_flags &=~ SA_RESTART; + if (sigaction(sig, &sa, NULL) < 0) { + return errno; + } + return 0; +} +*/ +import "C" + +import ( + "bytes" + "errors" + "fmt" + "io" + "log" + "net" + "os" + "os/exec" + "sync" + "syscall" + "time" +) + +func init() { + register("EINTR", EINTR) + register("Block", Block) +} + +// Test various operations when a signal handler is installed without +// the SA_RESTART flag. This tests that the os and net APIs handle EINTR. +func EINTR() { + if errno := C.clearRestart(C.int(syscall.SIGURG)); errno != 0 { + log.Fatal(syscall.Errno(errno)) + } + if errno := C.clearRestart(C.int(syscall.SIGWINCH)); errno != 0 { + log.Fatal(syscall.Errno(errno)) + } + if errno := C.clearRestart(C.int(syscall.SIGCHLD)); errno != 0 { + log.Fatal(syscall.Errno(errno)) + } + + var wg sync.WaitGroup + testPipe(&wg) + testNet(&wg) + testExec(&wg) + wg.Wait() + fmt.Println("OK") +} + +// spin does CPU bound spinning and allocating for a millisecond, +// to get a SIGURG. 
+// +//go:noinline +func spin() (float64, []byte) { + stop := time.Now().Add(time.Millisecond) + r1 := 0.0 + r2 := make([]byte, 200) + for time.Now().Before(stop) { + for i := 1; i < 1e6; i++ { + r1 += r1 / float64(i) + r2 = append(r2, bytes.Repeat([]byte{byte(i)}, 100)...) + r2 = r2[100:] + } + } + return r1, r2 +} + +// winch sends a few SIGWINCH signals to the process. +func winch() { + ticker := time.NewTicker(100 * time.Microsecond) + defer ticker.Stop() + pid := syscall.Getpid() + for n := 10; n > 0; n-- { + syscall.Kill(pid, syscall.SIGWINCH) + <-ticker.C + } +} + +// sendSomeSignals triggers a few SIGURG and SIGWINCH signals. +func sendSomeSignals() { + done := make(chan struct{}) + go func() { + spin() + close(done) + }() + winch() + <-done +} + +// testPipe tests pipe operations. +func testPipe(wg *sync.WaitGroup) { + r, w, err := os.Pipe() + if err != nil { + log.Fatal(err) + } + if err := syscall.SetNonblock(int(r.Fd()), false); err != nil { + log.Fatal(err) + } + if err := syscall.SetNonblock(int(w.Fd()), false); err != nil { + log.Fatal(err) + } + wg.Add(2) + go func() { + defer wg.Done() + defer w.Close() + // Spin before calling Write so that the first ReadFull + // in the other goroutine will likely be interrupted + // by a signal. + sendSomeSignals() + // This Write will likely be interrupted by a signal + // as the other goroutine spins in the middle of reading. + // We write enough data that we should always fill the + // pipe buffer and need multiple write system calls. + if _, err := w.Write(bytes.Repeat([]byte{0}, 2<<20)); err != nil { + log.Fatal(err) + } + }() + go func() { + defer wg.Done() + defer r.Close() + b := make([]byte, 1<<20) + // This ReadFull will likely be interrupted by a signal, + // as the other goroutine spins before writing anything. 
+ if _, err := io.ReadFull(r, b); err != nil { + log.Fatal(err) + } + // Spin after reading half the data so that the Write + // in the other goroutine will likely be interrupted + // before it completes. + sendSomeSignals() + if _, err := io.ReadFull(r, b); err != nil { + log.Fatal(err) + } + }() +} + +// testNet tests network operations. +func testNet(wg *sync.WaitGroup) { + ln, err := net.Listen("tcp4", "127.0.0.1:0") + if err != nil { + if errors.Is(err, syscall.EAFNOSUPPORT) || errors.Is(err, syscall.EPROTONOSUPPORT) { + return + } + log.Fatal(err) + } + wg.Add(2) + go func() { + defer wg.Done() + defer ln.Close() + c, err := ln.Accept() + if err != nil { + log.Fatal(err) + } + defer c.Close() + cf, err := c.(*net.TCPConn).File() + if err != nil { + log.Fatal(err) + } + defer cf.Close() + if err := syscall.SetNonblock(int(cf.Fd()), false); err != nil { + log.Fatal(err) + } + // See comments in testPipe. + sendSomeSignals() + if _, err := cf.Write(bytes.Repeat([]byte{0}, 2<<20)); err != nil { + log.Fatal(err) + } + }() + go func() { + defer wg.Done() + sendSomeSignals() + c, err := net.Dial("tcp", ln.Addr().String()) + if err != nil { + log.Fatal(err) + } + defer c.Close() + cf, err := c.(*net.TCPConn).File() + if err != nil { + log.Fatal(err) + } + defer cf.Close() + if err := syscall.SetNonblock(int(cf.Fd()), false); err != nil { + log.Fatal(err) + } + // See comments in testPipe. 
+ b := make([]byte, 1<<20) + if _, err := io.ReadFull(cf, b); err != nil { + log.Fatal(err) + } + sendSomeSignals() + if _, err := io.ReadFull(cf, b); err != nil { + log.Fatal(err) + } + }() +} + +func testExec(wg *sync.WaitGroup) { + wg.Add(1) + go func() { + defer wg.Done() + cmd := exec.Command(os.Args[0], "Block") + stdin, err := cmd.StdinPipe() + if err != nil { + log.Fatal(err) + } + cmd.Stderr = new(bytes.Buffer) + cmd.Stdout = cmd.Stderr + if err := cmd.Start(); err != nil { + log.Fatal(err) + } + + go func() { + sendSomeSignals() + stdin.Close() + }() + + if err := cmd.Wait(); err != nil { + log.Fatalf("%v:\n%s", err, cmd.Stdout) + } + }() +} + +// Block blocks until stdin is closed. +func Block() { + io.Copy(io.Discard, os.Stdin) +} diff --git a/src/runtime/testdata/testprogcgo/exec.go b/src/runtime/testdata/testprogcgo/exec.go new file mode 100644 index 0000000..c268bcd --- /dev/null +++ b/src/runtime/testdata/testprogcgo/exec.go @@ -0,0 +1,107 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <stddef.h> +#include <signal.h> +#include <pthread.h> + +// Save the signal mask at startup so that we see what it is before +// the Go runtime starts setting up signals. 
+ +static sigset_t mask; + +static void init(void) __attribute__ ((constructor)); + +static void init() { + sigemptyset(&mask); + pthread_sigmask(SIG_SETMASK, NULL, &mask); +} + +int SIGINTBlocked() { + return sigismember(&mask, SIGINT); +} +*/ +import "C" + +import ( + "fmt" + "io/fs" + "os" + "os/exec" + "os/signal" + "sync" + "syscall" +) + +func init() { + register("CgoExecSignalMask", CgoExecSignalMask) +} + +func CgoExecSignalMask() { + if len(os.Args) > 2 && os.Args[2] == "testsigint" { + if C.SIGINTBlocked() != 0 { + os.Exit(1) + } + os.Exit(0) + } + + c := make(chan os.Signal, 1) + signal.Notify(c, syscall.SIGTERM) + go func() { + for range c { + } + }() + + const goCount = 10 + const execCount = 10 + var wg sync.WaitGroup + wg.Add(goCount*execCount + goCount) + for i := 0; i < goCount; i++ { + go func() { + defer wg.Done() + for j := 0; j < execCount; j++ { + c2 := make(chan os.Signal, 1) + signal.Notify(c2, syscall.SIGUSR1) + syscall.Kill(os.Getpid(), syscall.SIGTERM) + go func(j int) { + defer wg.Done() + cmd := exec.Command(os.Args[0], "CgoExecSignalMask", "testsigint") + cmd.Stdin = os.Stdin + cmd.Stdout = os.Stdout + cmd.Stderr = os.Stderr + if err := cmd.Run(); err != nil { + // An overloaded system + // may fail with EAGAIN. + // This doesn't tell us + // anything useful; ignore it. + // Issue #27731. + if isEAGAIN(err) { + return + } + fmt.Printf("iteration %d: %v\n", j, err) + os.Exit(1) + } + }(j) + signal.Stop(c2) + } + }() + } + wg.Wait() + + fmt.Println("OK") +} + +// isEAGAIN reports whether err is an EAGAIN error from a process execution. +func isEAGAIN(err error) bool { + if p, ok := err.(*fs.PathError); ok { + err = p.Err + } + return err == syscall.EAGAIN +} diff --git a/src/runtime/testdata/testprogcgo/gprof.go b/src/runtime/testdata/testprogcgo/gprof.go new file mode 100644 index 0000000..d453b4d --- /dev/null +++ b/src/runtime/testdata/testprogcgo/gprof.go @@ -0,0 +1,46 @@ +// Copyright 2021 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// Test taking a goroutine profile with C traceback. + +/* +// Defined in gprof_c.c. +void CallGoSleep(void); +void gprofCgoTraceback(void* parg); +void gprofCgoContext(void* parg); +*/ +import "C" + +import ( + "fmt" + "io" + "runtime" + "runtime/pprof" + "time" + "unsafe" +) + +func init() { + register("GoroutineProfile", GoroutineProfile) +} + +func GoroutineProfile() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.gprofCgoTraceback), unsafe.Pointer(C.gprofCgoContext), nil) + + go C.CallGoSleep() + go C.CallGoSleep() + go C.CallGoSleep() + time.Sleep(1 * time.Second) + + prof := pprof.Lookup("goroutine") + prof.WriteTo(io.Discard, 1) + fmt.Println("OK") +} + +//export GoSleep +func GoSleep() { + time.Sleep(time.Hour) +} diff --git a/src/runtime/testdata/testprogcgo/gprof_c.c b/src/runtime/testdata/testprogcgo/gprof_c.c new file mode 100644 index 0000000..5c7cd77 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/gprof_c.c @@ -0,0 +1,30 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The C definitions for gprof.go. That file uses //export so +// it can't put function definitions in the "C" import comment. + +#include <stdint.h> +#include <stdlib.h> + +// Functions exported from Go. +extern void GoSleep(); + +struct cgoContextArg { + uintptr_t context; +}; + +void gprofCgoContext(void *arg) { + ((struct cgoContextArg*)arg)->context = 1; +} + +void gprofCgoTraceback(void *arg) { + // spend some time here so the P is more likely to be retaken. 
+ volatile int i; + for (i = 0; i < 123456789; i++); +} + +void CallGoSleep() { + GoSleep(); +} diff --git a/src/runtime/testdata/testprogcgo/issue29707.go b/src/runtime/testdata/testprogcgo/issue29707.go new file mode 100644 index 0000000..7d9299f --- /dev/null +++ b/src/runtime/testdata/testprogcgo/issue29707.go @@ -0,0 +1,60 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +// This is for issue #29707 + +package main + +/* +#include <pthread.h> + +extern void* callbackTraceParser(void*); +typedef void* (*cbTraceParser)(void*); + +static void testCallbackTraceParser(cbTraceParser cb) { + pthread_t thread_id; + pthread_create(&thread_id, NULL, cb, NULL); + pthread_join(thread_id, NULL); +} +*/ +import "C" + +import ( + "bytes" + "fmt" + traceparser "internal/trace" + "runtime/trace" + "time" + "unsafe" +) + +func init() { + register("CgoTraceParser", CgoTraceParser) +} + +//export callbackTraceParser +func callbackTraceParser(unsafe.Pointer) unsafe.Pointer { + time.Sleep(time.Millisecond) + return nil +} + +func CgoTraceParser() { + buf := new(bytes.Buffer) + + trace.Start(buf) + C.testCallbackTraceParser(C.cbTraceParser(C.callbackTraceParser)) + trace.Stop() + + _, err := traceparser.Parse(buf, "") + if err == traceparser.ErrTimeOrder { + fmt.Println("ErrTimeOrder") + } else if err != nil { + fmt.Println("Parse error: ", err) + } else { + fmt.Println("OK") + } +} diff --git a/src/runtime/testdata/testprogcgo/lockosthread.c b/src/runtime/testdata/testprogcgo/lockosthread.c new file mode 100644 index 0000000..b10cc4f --- /dev/null +++ b/src/runtime/testdata/testprogcgo/lockosthread.c @@ -0,0 +1,13 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// +build !plan9,!windows + +#include <stdint.h> + +uint32_t threadExited; + +void setExited(void *x) { + __sync_fetch_and_add(&threadExited, 1); +} diff --git a/src/runtime/testdata/testprogcgo/lockosthread.go b/src/runtime/testdata/testprogcgo/lockosthread.go new file mode 100644 index 0000000..8fcea35 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/lockosthread.go @@ -0,0 +1,112 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +import ( + "os" + "runtime" + "sync/atomic" + "time" + "unsafe" +) + +/* +#include <pthread.h> +#include <stdint.h> + +extern uint32_t threadExited; + +void setExited(void *x); +*/ +import "C" + +var mainThread C.pthread_t + +func init() { + registerInit("LockOSThreadMain", func() { + // init is guaranteed to run on the main thread. + mainThread = C.pthread_self() + }) + register("LockOSThreadMain", LockOSThreadMain) + + registerInit("LockOSThreadAlt", func() { + // Lock the OS thread now so main runs on the main thread. + runtime.LockOSThread() + }) + register("LockOSThreadAlt", LockOSThreadAlt) +} + +func LockOSThreadMain() { + // This requires GOMAXPROCS=1 from the beginning to reliably + // start a goroutine on the main thread. + if runtime.GOMAXPROCS(-1) != 1 { + println("requires GOMAXPROCS=1") + os.Exit(1) + } + + ready := make(chan bool, 1) + go func() { + // Because GOMAXPROCS=1, this *should* be on the main + // thread. Stay there. + runtime.LockOSThread() + self := C.pthread_self() + if C.pthread_equal(mainThread, self) == 0 { + println("failed to start goroutine on main thread") + os.Exit(1) + } + // Exit with the thread locked, which should exit the + // main thread. + ready <- true + }() + <-ready + time.Sleep(1 * time.Millisecond) + // Check that this goroutine is still running on a different + // thread. 
+ self := C.pthread_self() + if C.pthread_equal(mainThread, self) != 0 { + println("goroutine migrated to locked thread") + os.Exit(1) + } + println("OK") +} + +func LockOSThreadAlt() { + // This is running locked to the main OS thread. + + var subThread C.pthread_t + ready := make(chan bool, 1) + C.threadExited = 0 + go func() { + // This goroutine must be running on a new thread. + runtime.LockOSThread() + subThread = C.pthread_self() + // Register a pthread destructor so we can tell this + // thread has exited. + var key C.pthread_key_t + C.pthread_key_create(&key, (*[0]byte)(unsafe.Pointer(C.setExited))) + C.pthread_setspecific(key, unsafe.Pointer(new(int))) + ready <- true + // Exit with the thread locked. + }() + <-ready + for i := 0; i < 100; i++ { + time.Sleep(1 * time.Millisecond) + // Check that this goroutine is running on a different thread. + self := C.pthread_self() + if C.pthread_equal(subThread, self) != 0 { + println("locked thread reused") + os.Exit(1) + } + if atomic.LoadUint32((*uint32)(&C.threadExited)) != 0 { + println("OK") + return + } + } + println("sub thread still running") + os.Exit(1) +} diff --git a/src/runtime/testdata/testprogcgo/main.go b/src/runtime/testdata/testprogcgo/main.go new file mode 100644 index 0000000..ae491a2 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/main.go @@ -0,0 +1,35 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import "os" + +var cmds = map[string]func(){} + +func register(name string, f func()) { + if cmds[name] != nil { + panic("duplicate registration: " + name) + } + cmds[name] = f +} + +func registerInit(name string, f func()) { + if len(os.Args) >= 2 && os.Args[1] == name { + f() + } +} + +func main() { + if len(os.Args) < 2 { + println("usage: " + os.Args[0] + " name-of-test") + return + } + f := cmds[os.Args[1]] + if f == nil { + println("unknown function: " + os.Args[1]) + return + } + f() +} diff --git a/src/runtime/testdata/testprogcgo/needmdeadlock.go b/src/runtime/testdata/testprogcgo/needmdeadlock.go new file mode 100644 index 0000000..b95ec77 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/needmdeadlock.go @@ -0,0 +1,96 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +// This is for issue #42207. +// During a call to needm we could get a SIGCHLD signal +// which would itself call needm, causing a deadlock. + +/* +#include <signal.h> +#include <pthread.h> +#include <sched.h> +#include <unistd.h> + +extern void GoNeedM(); + +#define SIGNALERS 10 + +static void* needmSignalThread(void* p) { + pthread_t* pt = (pthread_t*)(p); + int i; + + for (i = 0; i < 100; i++) { + if (pthread_kill(*pt, SIGCHLD) < 0) { + return NULL; + } + usleep(1); + } + return NULL; +} + +// We don't need many calls, as the deadlock is only likely +// to occur the first couple of times that needm is called. +// After that there will likely be an extra M available. +#define CALLS 10 + +static void* needmCallbackThread(void* p) { + int i; + + for (i = 0; i < SIGNALERS; i++) { + sched_yield(); // Help the signal threads get started. 
+ } + for (i = 0; i < CALLS; i++) { + GoNeedM(); + } + return NULL; +} + +static void runNeedmSignalThread() { + int i; + pthread_t caller; + pthread_t s[SIGNALERS]; + + pthread_create(&caller, NULL, needmCallbackThread, NULL); + for (i = 0; i < SIGNALERS; i++) { + pthread_create(&s[i], NULL, needmSignalThread, &caller); + } + for (i = 0; i < SIGNALERS; i++) { + pthread_join(s[i], NULL); + } + pthread_join(caller, NULL); +} +*/ +import "C" + +import ( + "fmt" + "os" + "time" +) + +func init() { + register("NeedmDeadlock", NeedmDeadlock) +} + +//export GoNeedM +func GoNeedM() { +} + +func NeedmDeadlock() { + // The failure symptom is that the program hangs because of a + // deadlock in needm, so set an alarm. + go func() { + time.Sleep(5 * time.Second) + fmt.Println("Hung for 5 seconds") + os.Exit(1) + }() + + C.runNeedmSignalThread() + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/numgoroutine.go b/src/runtime/testdata/testprogcgo/numgoroutine.go new file mode 100644 index 0000000..1b9f202 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/numgoroutine.go @@ -0,0 +1,93 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <stddef.h> +#include <pthread.h> + +extern void CallbackNumGoroutine(); + +static void* thread2(void* arg __attribute__ ((unused))) { + CallbackNumGoroutine(); + return NULL; +} + +static void CheckNumGoroutine() { + pthread_t tid; + pthread_create(&tid, NULL, thread2, NULL); + pthread_join(tid, NULL); +} +*/ +import "C" + +import ( + "fmt" + "runtime" + "strings" +) + +var baseGoroutines int + +func init() { + register("NumGoroutine", NumGoroutine) +} + +func NumGoroutine() { + // Test that there are just the expected number of goroutines + // running. Specifically, test that the spare M's goroutine + // doesn't show up. 
+ if _, ok := checkNumGoroutine("first", 1+baseGoroutines); !ok { + return + } + + // Test that the goroutine for a callback from C appears. + if C.CheckNumGoroutine(); !callbackok { + return + } + + // Make sure we're back to the initial goroutines. + if _, ok := checkNumGoroutine("third", 1+baseGoroutines); !ok { + return + } + + fmt.Println("OK") +} + +func checkNumGoroutine(label string, want int) (string, bool) { + n := runtime.NumGoroutine() + if n != want { + fmt.Printf("%s NumGoroutine: want %d; got %d\n", label, want, n) + return "", false + } + + sbuf := make([]byte, 32<<10) + sbuf = sbuf[:runtime.Stack(sbuf, true)] + n = strings.Count(string(sbuf), "goroutine ") + if n != want { + fmt.Printf("%s Stack: want %d; got %d:\n%s\n", label, want, n, string(sbuf)) + return "", false + } + return string(sbuf), true +} + +var callbackok bool + +//export CallbackNumGoroutine +func CallbackNumGoroutine() { + stk, ok := checkNumGoroutine("second", 2+baseGoroutines) + if !ok { + return + } + if !strings.Contains(stk, "CallbackNumGoroutine") { + fmt.Printf("missing CallbackNumGoroutine from stack:\n%s\n", stk) + return + } + + callbackok = true +} diff --git a/src/runtime/testdata/testprogcgo/panic.c b/src/runtime/testdata/testprogcgo/panic.c new file mode 100644 index 0000000..deb5ed5 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/panic.c @@ -0,0 +1,9 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +extern void panic_callback(); + +void call_callback(void) { + panic_callback(); +} diff --git a/src/runtime/testdata/testprogcgo/panic.go b/src/runtime/testdata/testprogcgo/panic.go new file mode 100644 index 0000000..57ac895 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/panic.go @@ -0,0 +1,23 @@ +package main + +// This program will crash. +// We want to test unwinding from a cgo callback. 
+ +/* +void call_callback(void); +*/ +import "C" + +func init() { + register("PanicCallback", PanicCallback) +} + +//export panic_callback +func panic_callback() { + var i *int + *i = 42 +} + +func PanicCallback() { + C.call_callback() +} diff --git a/src/runtime/testdata/testprogcgo/pprof.go b/src/runtime/testdata/testprogcgo/pprof.go new file mode 100644 index 0000000..8870d0c --- /dev/null +++ b/src/runtime/testdata/testprogcgo/pprof.go @@ -0,0 +1,93 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// Run a slow C function saving a CPU profile. + +/* +#include <stdint.h> + +int salt1; +int salt2; + +void cpuHog() { + int foo = salt1; + int i; + + for (i = 0; i < 100000; i++) { + if (foo > 0) { + foo *= foo; + } else { + foo *= foo + 1; + } + } + salt2 = foo; +} + +void cpuHog2() { +} + +struct cgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t* buf; + uintptr_t max; +}; + +// pprofCgoTraceback is passed to runtime.SetCgoTraceback. +// For testing purposes it pretends that all CPU hits in C code are in cpuHog. +// Issue #29034: At least 2 frames are required to verify all frames are captured +// since runtime/pprof ignores the runtime.goexit base frame if it exists. 
+void pprofCgoTraceback(void* parg) { + struct cgoTracebackArg* arg = (struct cgoTracebackArg*)(parg); + arg->buf[0] = (uintptr_t)(cpuHog) + 0x10; + arg->buf[1] = (uintptr_t)(cpuHog2) + 0x4; + arg->buf[2] = 0; +} +*/ +import "C" + +import ( + "fmt" + "os" + "runtime" + "runtime/pprof" + "time" + "unsafe" +) + +func init() { + register("CgoPprof", CgoPprof) +} + +func CgoPprof() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.pprofCgoTraceback), nil, nil) + + f, err := os.CreateTemp("", "prof") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := pprof.StartCPUProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + t0 := time.Now() + for time.Since(t0) < time.Second { + C.cpuHog() + } + + pprof.StopCPUProfile() + + name := f.Name() + if err := f.Close(); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + fmt.Println(name) +} diff --git a/src/runtime/testdata/testprogcgo/pprof_callback.go b/src/runtime/testdata/testprogcgo/pprof_callback.go new file mode 100644 index 0000000..fd87eb8 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/pprof_callback.go @@ -0,0 +1,89 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows + +package main + +// Make many C-to-Go callbacks while collecting a CPU profile. +// +// This is a regression test for issue 50936. + +/* +#include <unistd.h> + +void goCallbackPprof(); + +static void callGo() { + // Spend >20us in C so this thread is eligible for sysmon to retake its + // P. + usleep(50); + goCallbackPprof(); +} +*/ +import "C" + +import ( + "fmt" + "os" + "runtime" + "runtime/pprof" + "time" +) + +func init() { + register("CgoPprofCallback", CgoPprofCallback) +} + +//export goCallbackPprof +func goCallbackPprof() { + // No-op. 
We want to stress the cgocall and cgocallback internals, + // landing as many pprof signals there as possible. +} + +func CgoPprofCallback() { + // Issue 50936 was a crash in the SIGPROF handler when the signal + // arrived during the exitsyscall following a cgocall(back) in dropg or + // execute, when updating mp.curg. + // + // These are reachable only when exitsyscall finds no P available. Thus + // we make C calls from significantly more Gs than there are available + // Ps. Lots of runnable work combined with >20us spent in callGo makes + // it possible for sysmon to retake Ps, forcing C calls to go down the + // desired exitsyscall path. + // + // High GOMAXPROCS is used to increase opportunities for failure on + // high CPU machines. + const ( + P = 16 + G = 64 + ) + runtime.GOMAXPROCS(P) + + f, err := os.CreateTemp("", "prof") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + defer f.Close() + + if err := pprof.StartCPUProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + for i := 0; i < G; i++ { + go func() { + for { + C.callGo() + } + }() + } + + time.Sleep(time.Second) + + pprof.StopCPUProfile() + + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/raceprof.go b/src/runtime/testdata/testprogcgo/raceprof.go new file mode 100644 index 0000000..c098e16 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/raceprof.go @@ -0,0 +1,79 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && amd64) || (freebsd && amd64) +// +build linux,amd64 freebsd,amd64 + +package main + +// Test that we can collect a lot of colliding profiling signals from +// an external C thread. This used to fail when built with the race +// detector, because a call of the predeclared function copy was +// turned into a call to runtime.slicecopy, which is not marked nosplit. 
+ +/* +#include <signal.h> +#include <stdint.h> +#include <pthread.h> +#include <sched.h> + +struct cgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t* buf; + uintptr_t max; +}; + +static int raceprofCount; + +// We want a bunch of different profile stacks that collide in the +// hash table maintained in runtime/cpuprof.go. This code knows the +// size of the hash table (1 << 10) and knows that the hash function +// is simply multiplicative. +void raceprofTraceback(void* parg) { + struct cgoTracebackArg* arg = (struct cgoTracebackArg*)(parg); + raceprofCount++; + arg->buf[0] = raceprofCount * (1 << 10); + arg->buf[1] = 0; +} + +static void* raceprofThread(void* p) { + int i; + + for (i = 0; i < 100; i++) { + pthread_kill(pthread_self(), SIGPROF); + sched_yield(); + } + return 0; +} + +void runRaceprofThread() { + pthread_t tid; + pthread_create(&tid, 0, raceprofThread, 0); + pthread_join(tid, 0); +} +*/ +import "C" + +import ( + "bytes" + "fmt" + "runtime" + "runtime/pprof" + "unsafe" +) + +func init() { + register("CgoRaceprof", CgoRaceprof) +} + +func CgoRaceprof() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.raceprofTraceback), nil, nil) + + var buf bytes.Buffer + pprof.StartCPUProfile(&buf) + + C.runRaceprofThread() + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/racesig.go b/src/runtime/testdata/testprogcgo/racesig.go new file mode 100644 index 0000000..9352679 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/racesig.go @@ -0,0 +1,103 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && amd64) || (freebsd && amd64) +// +build linux,amd64 freebsd,amd64 + +package main + +// Test that an external C thread that is calling malloc can be hit +// with SIGCHLD signals. 
This used to fail when built with the race +// detector, because in that case the signal handler would indirectly +// call the C malloc function. + +/* +#include <errno.h> +#include <signal.h> +#include <stdint.h> +#include <stdio.h> +#include <stdlib.h> +#include <pthread.h> +#include <sched.h> +#include <unistd.h> + +#define ALLOCERS 100 +#define SIGNALERS 10 + +static void* signalThread(void* p) { + pthread_t* pt = (pthread_t*)(p); + int i, j; + + for (i = 0; i < 100; i++) { + for (j = 0; j < ALLOCERS; j++) { + if (pthread_kill(pt[j], SIGCHLD) < 0) { + return NULL; + } + } + usleep(1); + } + return NULL; +} + +#define CALLS 100 + +static void* mallocThread(void* p) { + int i; + void *a[CALLS]; + + for (i = 0; i < ALLOCERS; i++) { + sched_yield(); + } + for (i = 0; i < CALLS; i++) { + a[i] = malloc(i); + } + for (i = 0; i < CALLS; i++) { + free(a[i]); + } + return NULL; +} + +void runRaceSignalThread() { + int i; + pthread_t m[ALLOCERS]; + pthread_t s[SIGNALERS]; + + for (i = 0; i < ALLOCERS; i++) { + pthread_create(&m[i], NULL, mallocThread, NULL); + } + for (i = 0; i < SIGNALERS; i++) { + pthread_create(&s[i], NULL, signalThread, &m[0]); + } + for (i = 0; i < SIGNALERS; i++) { + pthread_join(s[i], NULL); + } + for (i = 0; i < ALLOCERS; i++) { + pthread_join(m[i], NULL); + } +} +*/ +import "C" + +import ( + "fmt" + "os" + "time" +) + +func init() { + register("CgoRaceSignal", CgoRaceSignal) +} + +func CgoRaceSignal() { + // The failure symptom is that the program hangs because of a + // deadlock in malloc, so set an alarm. + go func() { + time.Sleep(5 * time.Second) + fmt.Println("Hung for 5 seconds") + os.Exit(1) + }() + + C.runRaceSignalThread() + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/segv.go b/src/runtime/testdata/testprogcgo/segv.go new file mode 100644 index 0000000..bf5aa31 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/segv.go @@ -0,0 +1,55 @@ +// Copyright 2020 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix +// +build unix + +package main + +// #include <unistd.h> +// static void nop() {} +import "C" + +import "syscall" + +func init() { + register("Segv", Segv) + register("SegvInCgo", SegvInCgo) +} + +var Sum int + +func Segv() { + c := make(chan bool) + go func() { + close(c) + for i := 0; ; i++ { + Sum += i + } + }() + + <-c + + syscall.Kill(syscall.Getpid(), syscall.SIGSEGV) + + // Wait for the OS to deliver the signal. + C.pause() +} + +func SegvInCgo() { + c := make(chan bool) + go func() { + close(c) + for { + C.nop() + } + }() + + <-c + + syscall.Kill(syscall.Getpid(), syscall.SIGSEGV) + + // Wait for the OS to deliver the signal. + C.pause() +} diff --git a/src/runtime/testdata/testprogcgo/segv_linux.go b/src/runtime/testdata/testprogcgo/segv_linux.go new file mode 100644 index 0000000..fe93778 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/segv_linux.go @@ -0,0 +1,51 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// #include <unistd.h> +// static void nop() {} +import "C" + +import "syscall" + +func init() { + register("TgkillSegv", TgkillSegv) + register("TgkillSegvInCgo", TgkillSegvInCgo) +} + +func TgkillSegv() { + c := make(chan bool) + go func() { + close(c) + for i := 0; ; i++ { + // Sum defined in segv.go. + Sum += i + } + }() + + <-c + + syscall.Tgkill(syscall.Getpid(), syscall.Gettid(), syscall.SIGSEGV) + + // Wait for the OS to deliver the signal. + C.pause() +} + +func TgkillSegvInCgo() { + c := make(chan bool) + go func() { + close(c) + for { + C.nop() + } + }() + + <-c + + syscall.Tgkill(syscall.Getpid(), syscall.Gettid(), syscall.SIGSEGV) + + // Wait for the OS to deliver the signal. 
+ C.pause() +} diff --git a/src/runtime/testdata/testprogcgo/sigfwd.go b/src/runtime/testdata/testprogcgo/sigfwd.go new file mode 100644 index 0000000..f6a0c03 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/sigfwd.go @@ -0,0 +1,87 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build unix + +package main + +import ( + "fmt" + "os" +) + +/* +#include <signal.h> +#include <stdlib.h> +#include <stdio.h> +#include <string.h> + +sig_atomic_t expectCSigsegv; +int *sigfwdP; + +static void sigsegv() { + expectCSigsegv = 1; + *sigfwdP = 1; + fprintf(stderr, "ERROR: C SIGSEGV not thrown on caught?.\n"); + exit(2); +} + +static void segvhandler(int signum) { + if (signum == SIGSEGV) { + if (expectCSigsegv == 0) { + fprintf(stderr, "SIGSEGV caught in C unexpectedly\n"); + exit(1); + } + fprintf(stdout, "OK\n"); + exit(0); // success + } +} + +static void __attribute__ ((constructor)) sigsetup(void) { + if (getenv("GO_TEST_CGOSIGFWD") == NULL) { + return; + } + + struct sigaction act; + + memset(&act, 0, sizeof act); + act.sa_handler = segvhandler; + sigaction(SIGSEGV, &act, NULL); +} +*/ +import "C" + +func init() { + register("CgoSigfwd", CgoSigfwd) +} + +var nilPtr *byte + +func f() (ret bool) { + defer func() { + if recover() == nil { + fmt.Fprintf(os.Stderr, "ERROR: couldn't raise SIGSEGV in Go\n") + C.exit(2) + } + ret = true + }() + *nilPtr = 1 + return false +} + +func CgoSigfwd() { + if os.Getenv("GO_TEST_CGOSIGFWD") == "" { + fmt.Fprintf(os.Stderr, "test must be run with GO_TEST_CGOSIGFWD set\n") + os.Exit(1) + } + + // Test that the signal originating in Go is handled (and recovered) by Go. + if !f() { + fmt.Fprintf(os.Stderr, "couldn't recover from SIGSEGV in Go.\n") + C.exit(2) + } + + // Test that the signal originating in C is handled by C. 
+ C.sigsegv() +} diff --git a/src/runtime/testdata/testprogcgo/sigpanic.go b/src/runtime/testdata/testprogcgo/sigpanic.go new file mode 100644 index 0000000..cb46030 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/sigpanic.go @@ -0,0 +1,28 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// This program will crash. +// We want to test unwinding from sigpanic into C code (without a C symbolizer). + +/* +#cgo CFLAGS: -O0 + +char *pnil; + +static int f1(void) { + *pnil = 0; + return 0; +} +*/ +import "C" + +func init() { + register("TracebackSigpanic", TracebackSigpanic) +} + +func TracebackSigpanic() { + C.f1() +} diff --git a/src/runtime/testdata/testprogcgo/sigstack.go b/src/runtime/testdata/testprogcgo/sigstack.go new file mode 100644 index 0000000..12ca661 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/sigstack.go @@ -0,0 +1,99 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +// Test handling of Go-allocated signal stacks when calling from +// C-created threads with and without signal stacks. (See issue +// #22930.) + +package main + +/* +#include <pthread.h> +#include <signal.h> +#include <stdio.h> +#include <stdlib.h> +#include <sys/mman.h> + +#ifdef _AIX +// On AIX, SIGSTKSZ is too small to handle Go sighandler. +#define CSIGSTKSZ 0x4000 +#else +#define CSIGSTKSZ SIGSTKSZ +#endif + +extern void SigStackCallback(); + +static void* WithSigStack(void* arg __attribute__((unused))) { + // Set up an alternate system stack. 
+ void* base = mmap(0, CSIGSTKSZ, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANON, -1, 0); + if (base == MAP_FAILED) { + perror("mmap failed"); + abort(); + } + stack_t st = {}, ost = {}; + st.ss_sp = (char*)base; + st.ss_flags = 0; + st.ss_size = CSIGSTKSZ; + if (sigaltstack(&st, &ost) < 0) { + perror("sigaltstack failed"); + abort(); + } + + // Call Go. + SigStackCallback(); + + // Disable signal stack and protect it so we can detect reuse. + if (ost.ss_flags & SS_DISABLE) { + // Darwin libsystem has a bug where it checks ss_size + // even if SS_DISABLE is set. (The kernel gets it right.) + ost.ss_size = CSIGSTKSZ; + } + if (sigaltstack(&ost, NULL) < 0) { + perror("sigaltstack restore failed"); + abort(); + } + mprotect(base, CSIGSTKSZ, PROT_NONE); + return NULL; +} + +static void* WithoutSigStack(void* arg __attribute__((unused))) { + SigStackCallback(); + return NULL; +} + +static void DoThread(int sigstack) { + pthread_t tid; + if (sigstack) { + pthread_create(&tid, NULL, WithSigStack, NULL); + } else { + pthread_create(&tid, NULL, WithoutSigStack, NULL); + } + pthread_join(tid, NULL); +} +*/ +import "C" + +func init() { + register("SigStack", SigStack) +} + +func SigStack() { + C.DoThread(0) + C.DoThread(1) + C.DoThread(0) + C.DoThread(1) + println("OK") +} + +var BadPtr *int + +//export SigStackCallback +func SigStackCallback() { + // Cause the Go signal handler to run. + defer func() { recover() }() + *BadPtr = 42 +} diff --git a/src/runtime/testdata/testprogcgo/sigthrow.go b/src/runtime/testdata/testprogcgo/sigthrow.go new file mode 100644 index 0000000..665e3b0 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/sigthrow.go @@ -0,0 +1,20 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// This program will abort. 
+ +/* +#include <stdlib.h> +*/ +import "C" + +func init() { + register("Abort", Abort) +} + +func Abort() { + C.abort() +} diff --git a/src/runtime/testdata/testprogcgo/stack_windows.go b/src/runtime/testdata/testprogcgo/stack_windows.go new file mode 100644 index 0000000..0be1126 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/stack_windows.go @@ -0,0 +1,57 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import "C" +import ( + "internal/syscall/windows" + "runtime" + "sync" + "syscall" + "unsafe" +) + +func init() { + register("StackMemory", StackMemory) +} + +func getPagefileUsage() (uintptr, error) { + p, err := syscall.GetCurrentProcess() + if err != nil { + return 0, err + } + var m windows.PROCESS_MEMORY_COUNTERS + err = windows.GetProcessMemoryInfo(p, &m, uint32(unsafe.Sizeof(m))) + if err != nil { + return 0, err + } + return m.PagefileUsage, nil +} + +func StackMemory() { + mem1, err := getPagefileUsage() + if err != nil { + panic(err) + } + const threadCount = 100 + var wg sync.WaitGroup + for i := 0; i < threadCount; i++ { + wg.Add(1) + go func() { + runtime.LockOSThread() + wg.Done() + select {} + }() + } + wg.Wait() + mem2, err := getPagefileUsage() + if err != nil { + panic(err) + } + // assumes that this process creates 1 thread for each + // thread locked goroutine plus extra 5 threads + // like sysmon and others + print((mem2 - mem1) / (threadCount + 5)) +} diff --git a/src/runtime/testdata/testprogcgo/threadpanic.go b/src/runtime/testdata/testprogcgo/threadpanic.go new file mode 100644 index 0000000..2d24fe6 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/threadpanic.go @@ -0,0 +1,25 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build !plan9 +// +build !plan9 + +package main + +// void start(void); +import "C" + +func init() { + register("CgoExternalThreadPanic", CgoExternalThreadPanic) +} + +func CgoExternalThreadPanic() { + C.start() + select {} +} + +//export gopanic +func gopanic() { + panic("BOOM") +} diff --git a/src/runtime/testdata/testprogcgo/threadpanic_unix.c b/src/runtime/testdata/testprogcgo/threadpanic_unix.c new file mode 100644 index 0000000..c426452 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/threadpanic_unix.c @@ -0,0 +1,26 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// +build !plan9,!windows + +#include <stdlib.h> +#include <stdio.h> +#include <pthread.h> + +void gopanic(void); + +static void* +die(void* x) +{ + gopanic(); + return 0; +} + +void +start(void) +{ + pthread_t t; + if(pthread_create(&t, 0, die, 0) != 0) + printf("pthread_create failed\n"); +} diff --git a/src/runtime/testdata/testprogcgo/threadpanic_windows.c b/src/runtime/testdata/testprogcgo/threadpanic_windows.c new file mode 100644 index 0000000..ba66d0f --- /dev/null +++ b/src/runtime/testdata/testprogcgo/threadpanic_windows.c @@ -0,0 +1,23 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include <process.h> +#include <stdlib.h> +#include <stdio.h> + +void gopanic(void); + +static unsigned int __attribute__((__stdcall__)) +die(void* x) +{ + gopanic(); + return 0; +} + +void +start(void) +{ + if(_beginthreadex(0, 0, die, 0, 0, 0) != 0) + printf("_beginthreadex failed\n"); +} diff --git a/src/runtime/testdata/testprogcgo/threadpprof.go b/src/runtime/testdata/testprogcgo/threadpprof.go new file mode 100644 index 0000000..70717e0 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/threadpprof.go @@ -0,0 +1,128 @@ +// Copyright 2016 The Go Authors. 
All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +// Run a slow C function saving a CPU profile. + +/* +#include <stdint.h> +#include <time.h> +#include <pthread.h> + +int threadSalt1; +int threadSalt2; + +static pthread_t tid; + +void cpuHogThread() { + int foo = threadSalt1; + int i; + + for (i = 0; i < 100000; i++) { + if (foo > 0) { + foo *= foo; + } else { + foo *= foo + 1; + } + } + threadSalt2 = foo; +} + +void cpuHogThread2() { +} + +struct cgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t* buf; + uintptr_t max; +}; + +// pprofCgoThreadTraceback is passed to runtime.SetCgoTraceback. +// For testing purposes it pretends that all CPU hits on the cpuHog +// C thread are in cpuHog. +void pprofCgoThreadTraceback(void* parg) { + struct cgoTracebackArg* arg = (struct cgoTracebackArg*)(parg); + if (pthread_self() == tid) { + arg->buf[0] = (uintptr_t)(cpuHogThread) + 0x10; + arg->buf[1] = (uintptr_t)(cpuHogThread2) + 0x4; + arg->buf[2] = 0; + } else + arg->buf[0] = 0; +} + +static void* cpuHogDriver(void* arg __attribute__ ((unused))) { + while (1) { + cpuHogThread(); + } + return 0; +} + +void runCPUHogThread(void) { + pthread_create(&tid, 0, cpuHogDriver, 0); +} +*/ +import "C" + +import ( + "context" + "fmt" + "os" + "runtime" + "runtime/pprof" + "time" + "unsafe" +) + +func init() { + register("CgoPprofThread", CgoPprofThread) + register("CgoPprofThreadNoTraceback", CgoPprofThreadNoTraceback) +} + +func CgoPprofThread() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.pprofCgoThreadTraceback), nil, nil) + pprofThread() +} + +func CgoPprofThreadNoTraceback() { + pprofThread() +} + +func pprofThread() { + f, err := os.CreateTemp("", "prof") + if err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + if err := pprof.StartCPUProfile(f); err != nil { + fmt.Fprintln(os.Stderr, err) 
+ os.Exit(2) + } + + // This goroutine may receive a profiling signal while creating the C-owned + // thread. If it does, the SetCgoTraceback handler will make the leaf end of + // the stack look almost (but not exactly) like the stacks the test case is + // trying to find. Attach a profiler label so the test can filter out those + // confusing samples. + pprof.Do(context.Background(), pprof.Labels("ignore", "ignore"), func(ctx context.Context) { + C.runCPUHogThread() + }) + + time.Sleep(1 * time.Second) + + pprof.StopCPUProfile() + + name := f.Name() + if err := f.Close(); err != nil { + fmt.Fprintln(os.Stderr, err) + os.Exit(2) + } + + fmt.Println(name) +} diff --git a/src/runtime/testdata/testprogcgo/threadprof.go b/src/runtime/testdata/testprogcgo/threadprof.go new file mode 100644 index 0000000..d62d4b4 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/threadprof.go @@ -0,0 +1,103 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !plan9 && !windows +// +build !plan9,!windows + +package main + +/* +#include <stdint.h> +#include <stdlib.h> +#include <signal.h> +#include <pthread.h> + +volatile int32_t spinlock; + +// Note that this thread is only started if GO_START_SIGPROF_THREAD +// is set in the environment, which is only done when running the +// CgoExternalThreadSIGPROF test. +static void *thread1(void *p) { + (void)p; + while (spinlock == 0) + ; + pthread_kill(pthread_self(), SIGPROF); + spinlock = 0; + return NULL; +} + +// This constructor function is run when the program starts. +// It is used for the CgoExternalThreadSIGPROF test. 
+__attribute__((constructor)) void issue9456() { + if (getenv("GO_START_SIGPROF_THREAD") != NULL) { + pthread_t tid; + pthread_create(&tid, 0, thread1, NULL); + } +} + +void **nullptr; + +void *crash(void *p) { + *nullptr = p; + return 0; +} + +int start_crashing_thread(void) { + pthread_t tid; + return pthread_create(&tid, 0, crash, 0); +} +*/ +import "C" + +import ( + "fmt" + "os" + "os/exec" + "runtime" + "sync/atomic" + "time" + "unsafe" +) + +func init() { + register("CgoExternalThreadSIGPROF", CgoExternalThreadSIGPROF) + register("CgoExternalThreadSignal", CgoExternalThreadSignal) +} + +func CgoExternalThreadSIGPROF() { + // This test intends to test that sending SIGPROF to foreign threads + // before we make any cgo call will not abort the whole process, so + // we cannot make any cgo call here. See https://golang.org/issue/9456. + atomic.StoreInt32((*int32)(unsafe.Pointer(&C.spinlock)), 1) + for atomic.LoadInt32((*int32)(unsafe.Pointer(&C.spinlock))) == 1 { + runtime.Gosched() + } + println("OK") +} + +func CgoExternalThreadSignal() { + if len(os.Args) > 2 && os.Args[2] == "crash" { + i := C.start_crashing_thread() + if i != 0 { + fmt.Println("pthread_create failed:", i) + // Exit with 0 because parent expects us to crash. + return + } + + // We should crash immediately, but give it plenty of + // time before failing (by exiting 0) in case we are + // running on a slow system. + time.Sleep(5 * time.Second) + return + } + + out, err := exec.Command(os.Args[0], "CgoExternalThreadSignal", "crash").CombinedOutput() + if err == nil { + fmt.Println("C signal did not crash as expected") + fmt.Printf("\n%s\n", out) + os.Exit(1) + } + + fmt.Println("OK") +} diff --git a/src/runtime/testdata/testprogcgo/traceback.go b/src/runtime/testdata/testprogcgo/traceback.go new file mode 100644 index 0000000..e2d7599 --- /dev/null +++ b/src/runtime/testdata/testprogcgo/traceback.go @@ -0,0 +1,54 @@ +// Copyright 2016 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// This program will crash. +// We want the stack trace to include the C functions. +// We use a fake traceback, and a symbolizer that dumps a string we recognize. + +/* +#cgo CFLAGS: -g -O0 + +// Defined in traceback_c.c. +extern int crashInGo; +int tracebackF1(void); +void cgoTraceback(void* parg); +void cgoSymbolizer(void* parg); +*/ +import "C" + +import ( + "runtime" + "unsafe" +) + +func init() { + register("CrashTraceback", CrashTraceback) + register("CrashTracebackGo", CrashTracebackGo) +} + +func CrashTraceback() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.cgoTraceback), nil, unsafe.Pointer(C.cgoSymbolizer)) + C.tracebackF1() +} + +func CrashTracebackGo() { + C.crashInGo = 1 + CrashTraceback() +} + +//export h1 +func h1() { + h2() +} + +func h2() { + h3() +} + +func h3() { + var x *int + *x = 0 +} diff --git a/src/runtime/testdata/testprogcgo/traceback_c.c b/src/runtime/testdata/testprogcgo/traceback_c.c new file mode 100644 index 0000000..56eda8f --- /dev/null +++ b/src/runtime/testdata/testprogcgo/traceback_c.c @@ -0,0 +1,65 @@ +// Copyright 2020 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The C definitions for traceback.go. That file uses //export so +// it can't put function definitions in the "C" import comment. 
+ +#include <stdint.h> + +char *p; + +int crashInGo; +extern void h1(void); + +int tracebackF3(void) { + if (crashInGo) + h1(); + else + *p = 0; + return 0; +} + +int tracebackF2(void) { + return tracebackF3(); +} + +int tracebackF1(void) { + return tracebackF2(); +} + +struct cgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t* buf; + uintptr_t max; +}; + +struct cgoSymbolizerArg { + uintptr_t pc; + const char* file; + uintptr_t lineno; + const char* func; + uintptr_t entry; + uintptr_t more; + uintptr_t data; +}; + +void cgoTraceback(void* parg) { + struct cgoTracebackArg* arg = (struct cgoTracebackArg*)(parg); + arg->buf[0] = 1; + arg->buf[1] = 2; + arg->buf[2] = 3; + arg->buf[3] = 0; +} + +void cgoSymbolizer(void* parg) { + struct cgoSymbolizerArg* arg = (struct cgoSymbolizerArg*)(parg); + if (arg->pc != arg->data + 1) { + arg->file = "unexpected data"; + } else { + arg->file = "cgo symbolizer"; + } + arg->lineno = arg->data + 1; + arg->data++; +} diff --git a/src/runtime/testdata/testprogcgo/tracebackctxt.go b/src/runtime/testdata/testprogcgo/tracebackctxt.go new file mode 100644 index 0000000..62ff8ec --- /dev/null +++ b/src/runtime/testdata/testprogcgo/tracebackctxt.go @@ -0,0 +1,136 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +// Test the context argument to SetCgoTraceback. +// Use fake context, traceback, and symbolizer functions. + +/* +// Defined in tracebackctxt_c.c. 
+extern void C1(void); +extern void C2(void); +extern void tcContext(void*); +extern void tcContextSimple(void*); +extern void tcTraceback(void*); +extern void tcSymbolizer(void*); +extern int getContextCount(void); +extern void TracebackContextPreemptionCallGo(int); +*/ +import "C" + +import ( + "fmt" + "runtime" + "sync" + "unsafe" +) + +func init() { + register("TracebackContext", TracebackContext) + register("TracebackContextPreemption", TracebackContextPreemption) +} + +var tracebackOK bool + +func TracebackContext() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.tcTraceback), unsafe.Pointer(C.tcContext), unsafe.Pointer(C.tcSymbolizer)) + C.C1() + if got := C.getContextCount(); got != 0 { + fmt.Printf("at end contextCount == %d, expected 0\n", got) + tracebackOK = false + } + if tracebackOK { + fmt.Println("OK") + } +} + +//export G1 +func G1() { + C.C2() +} + +//export G2 +func G2() { + pc := make([]uintptr, 32) + n := runtime.Callers(0, pc) + cf := runtime.CallersFrames(pc[:n]) + var frames []runtime.Frame + for { + frame, more := cf.Next() + frames = append(frames, frame) + if !more { + break + } + } + + want := []struct { + function string + line int + }{ + {"main.G2", 0}, + {"cFunction", 0x10200}, + {"cFunction", 0x200}, + {"cFunction", 0x10201}, + {"cFunction", 0x201}, + {"main.G1", 0}, + {"cFunction", 0x10100}, + {"cFunction", 0x100}, + {"main.TracebackContext", 0}, + } + + ok := true + i := 0 +wantLoop: + for _, w := range want { + for ; i < len(frames); i++ { + if w.function == frames[i].Function { + if w.line != 0 && w.line != frames[i].Line { + fmt.Printf("found function %s at wrong line %#x (expected %#x)\n", w.function, frames[i].Line, w.line) + ok = false + } + i++ + continue wantLoop + } + } + fmt.Printf("did not find function %s in\n", w.function) + for _, f := range frames { + fmt.Println(f) + } + ok = false + break + } + tracebackOK = ok + if got := C.getContextCount(); got != 2 { + fmt.Printf("at bottom contextCount == %d, expected 2\n", 
got) + tracebackOK = false + } +} + +// Issue 47441. +func TracebackContextPreemption() { + runtime.SetCgoTraceback(0, unsafe.Pointer(C.tcTraceback), unsafe.Pointer(C.tcContextSimple), unsafe.Pointer(C.tcSymbolizer)) + + const funcs = 10 + const calls = 1e5 + var wg sync.WaitGroup + for i := 0; i < funcs; i++ { + wg.Add(1) + go func(i int) { + defer wg.Done() + for j := 0; j < calls; j++ { + C.TracebackContextPreemptionCallGo(C.int(i*calls + j)) + } + }(i) + } + wg.Wait() + + fmt.Println("OK") +} + +//export TracebackContextPreemptionGoFunction +func TracebackContextPreemptionGoFunction(i C.int) { + // Do some busy work. + fmt.Sprintf("%d\n", i) +} diff --git a/src/runtime/testdata/testprogcgo/tracebackctxt_c.c b/src/runtime/testdata/testprogcgo/tracebackctxt_c.c new file mode 100644 index 0000000..910cb7b --- /dev/null +++ b/src/runtime/testdata/testprogcgo/tracebackctxt_c.c @@ -0,0 +1,103 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// The C definitions for tracebackctxt.go. That file uses //export so +// it can't put function definitions in the "C" import comment. + +#include <stdlib.h> +#include <stdint.h> + +// Functions exported from Go. +extern void G1(void); +extern void G2(void); +extern void TracebackContextPreemptionGoFunction(int); + +void C1() { + G1(); +} + +void C2() { + G2(); +} + +struct cgoContextArg { + uintptr_t context; +}; + +struct cgoTracebackArg { + uintptr_t context; + uintptr_t sigContext; + uintptr_t* buf; + uintptr_t max; +}; + +struct cgoSymbolizerArg { + uintptr_t pc; + const char* file; + uintptr_t lineno; + const char* func; + uintptr_t entry; + uintptr_t more; + uintptr_t data; +}; + +// Uses atomic adds and subtracts to catch the possibility of +// erroneous calls from multiple threads; that should be impossible in +// this test case, but we check just in case. 
+static int contextCount; + +int getContextCount() { + return __sync_add_and_fetch(&contextCount, 0); +} + +void tcContext(void* parg) { + struct cgoContextArg* arg = (struct cgoContextArg*)(parg); + if (arg->context == 0) { + arg->context = __sync_add_and_fetch(&contextCount, 1); + } else { + if (arg->context != __sync_add_and_fetch(&contextCount, 0)) { + abort(); + } + __sync_sub_and_fetch(&contextCount, 1); + } +} + +void tcContextSimple(void* parg) { + struct cgoContextArg* arg = (struct cgoContextArg*)(parg); + if (arg->context == 0) { + arg->context = 1; + } +} + +void tcTraceback(void* parg) { + int base, i; + struct cgoTracebackArg* arg = (struct cgoTracebackArg*)(parg); + if (arg->context == 0 && arg->sigContext == 0) { + // This shouldn't happen in this program. + abort(); + } + // Return a variable number of PC values. + base = arg->context << 8; + for (i = 0; i < arg->context; i++) { + if (i < arg->max) { + arg->buf[i] = base + i; + } + } +} + +void tcSymbolizer(void *parg) { + struct cgoSymbolizerArg* arg = (struct cgoSymbolizerArg*)(parg); + if (arg->pc == 0) { + return; + } + // Report two lines per PC returned by traceback, to test more handling. 
+ arg->more = arg->file == NULL; + arg->file = "tracebackctxt.go"; + arg->func = "cFunction"; + arg->lineno = arg->pc + (arg->more << 16); +} + +void TracebackContextPreemptionCallGo(int i) { + TracebackContextPreemptionGoFunction(i); +} diff --git a/src/runtime/testdata/testprogcgo/windows/win.go b/src/runtime/testdata/testprogcgo/windows/win.go new file mode 100644 index 0000000..9d9f86c --- /dev/null +++ b/src/runtime/testdata/testprogcgo/windows/win.go @@ -0,0 +1,14 @@ +package windows + +/* +#include <windows.h> + +DWORD agetthread() { + return GetCurrentThreadId(); +} +*/ +import "C" + +func GetThread() uint32 { + return uint32(C.agetthread()) +} diff --git a/src/runtime/testdata/testprognet/main.go b/src/runtime/testdata/testprognet/main.go new file mode 100644 index 0000000..ae491a2 --- /dev/null +++ b/src/runtime/testdata/testprognet/main.go @@ -0,0 +1,35 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package main + +import "os" + +var cmds = map[string]func(){} + +func register(name string, f func()) { + if cmds[name] != nil { + panic("duplicate registration: " + name) + } + cmds[name] = f +} + +func registerInit(name string, f func()) { + if len(os.Args) >= 2 && os.Args[1] == name { + f() + } +} + +func main() { + if len(os.Args) < 2 { + println("usage: " + os.Args[0] + " name-of-test") + return + } + f := cmds[os.Args[1]] + if f == nil { + println("unknown function: " + os.Args[1]) + return + } + f() +} diff --git a/src/runtime/testdata/testprognet/net.go b/src/runtime/testdata/testprognet/net.go new file mode 100644 index 0000000..714b101 --- /dev/null +++ b/src/runtime/testdata/testprognet/net.go @@ -0,0 +1,29 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "net" +) + +func init() { + registerInit("NetpollDeadlock", NetpollDeadlockInit) + register("NetpollDeadlock", NetpollDeadlock) +} + +func NetpollDeadlockInit() { + fmt.Println("dialing") + c, err := net.Dial("tcp", "localhost:14356") + if err == nil { + c.Close() + } else { + fmt.Println("error: ", err) + } +} + +func NetpollDeadlock() { + fmt.Println("done") +} diff --git a/src/runtime/testdata/testprognet/signal.go b/src/runtime/testdata/testprognet/signal.go new file mode 100644 index 0000000..dfa2e10 --- /dev/null +++ b/src/runtime/testdata/testprognet/signal.go @@ -0,0 +1,27 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows && !plan9 +// +build !windows,!plan9 + +// This is in testprognet instead of testprog because testprog +// must not import anything (like net, but also like os/signal) +// that kicks off background goroutines during init. + +package main + +import ( + "os/signal" + "syscall" +) + +func init() { + register("SignalIgnoreSIGTRAP", SignalIgnoreSIGTRAP) +} + +func SignalIgnoreSIGTRAP() { + signal.Ignore(syscall.SIGTRAP) + syscall.Kill(syscall.Getpid(), syscall.SIGTRAP) + println("OK") +} diff --git a/src/runtime/testdata/testprognet/signalexec.go b/src/runtime/testdata/testprognet/signalexec.go new file mode 100644 index 0000000..62ebce7 --- /dev/null +++ b/src/runtime/testdata/testprognet/signalexec.go @@ -0,0 +1,71 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build darwin || dragonfly || freebsd || linux || netbsd || openbsd +// +build darwin dragonfly freebsd linux netbsd openbsd + +// This is in testprognet instead of testprog because testprog +// must not import anything (like net, but also like os/signal) +// that kicks off background goroutines during init. + +package main + +import ( + "fmt" + "os" + "os/exec" + "os/signal" + "sync" + "syscall" + "time" +) + +func init() { + register("SignalDuringExec", SignalDuringExec) + register("Nop", Nop) +} + +func SignalDuringExec() { + pgrp := syscall.Getpgrp() + + const tries = 10 + + var wg sync.WaitGroup + c := make(chan os.Signal, tries) + signal.Notify(c, syscall.SIGWINCH) + wg.Add(1) + go func() { + defer wg.Done() + for range c { + } + }() + + for i := 0; i < tries; i++ { + time.Sleep(time.Microsecond) + wg.Add(2) + go func() { + defer wg.Done() + cmd := exec.Command(os.Args[0], "Nop") + cmd.Stdout = os.Stdout + cmd.Stderr = os.Stderr + if err := cmd.Run(); err != nil { + fmt.Printf("Start failed: %v", err) + } + }() + go func() { + defer wg.Done() + syscall.Kill(-pgrp, syscall.SIGWINCH) + }() + } + + signal.Stop(c) + close(c) + wg.Wait() + + fmt.Println("OK") +} + +func Nop() { + // This is just for SignalDuringExec. +} diff --git a/src/runtime/testdata/testsuid/main.go b/src/runtime/testdata/testsuid/main.go new file mode 100644 index 0000000..1949d2d --- /dev/null +++ b/src/runtime/testdata/testsuid/main.go @@ -0,0 +1,25 @@ +// Copyright 2023 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package main + +import ( + "fmt" + "log" + "os" +) + +func main() { + if os.Geteuid() == os.Getuid() { + os.Exit(99) + } + + fmt.Fprintf(os.Stdout, "GOTRACEBACK=%s\n", os.Getenv("GOTRACEBACK")) + f, err := os.OpenFile(os.Getenv("TEST_OUTPUT"), os.O_CREATE|os.O_RDWR, 0600) + if err != nil { + log.Fatalf("os.Open failed: %s", err) + } + defer f.Close() + fmt.Fprintf(os.Stderr, "hello\n") +} diff --git a/src/runtime/testdata/testwinlib/main.c b/src/runtime/testdata/testwinlib/main.c new file mode 100644 index 0000000..c3fe3cb --- /dev/null +++ b/src/runtime/testdata/testwinlib/main.c @@ -0,0 +1,60 @@ +#include <stdio.h> +#include <windows.h> +#include "testwinlib.h" + +int exceptionCount; +int continueCount; +LONG WINAPI customExceptionHandlder(struct _EXCEPTION_POINTERS *ExceptionInfo) +{ + if (ExceptionInfo->ExceptionRecord->ExceptionCode == EXCEPTION_BREAKPOINT) + { + exceptionCount++; + // prepare context to resume execution + CONTEXT *c = ExceptionInfo->ContextRecord; + c->Rip = *(ULONG_PTR *)c->Rsp; + c->Rsp += 8; + return EXCEPTION_CONTINUE_EXECUTION; + } + return EXCEPTION_CONTINUE_SEARCH; +} +LONG WINAPI customContinueHandlder(struct _EXCEPTION_POINTERS *ExceptionInfo) +{ + if (ExceptionInfo->ExceptionRecord->ExceptionCode == EXCEPTION_BREAKPOINT) + { + continueCount++; + return EXCEPTION_CONTINUE_EXECUTION; + } + return EXCEPTION_CONTINUE_SEARCH; +} + +void throwFromC() +{ + DebugBreak(); +} +int main() +{ + // simulate a "lazily" attached debugger, by calling some go code before attaching the exception/continue handler + Dummy(); + exceptionCount = 0; + continueCount = 0; + void *exceptionHandlerHandle = AddVectoredExceptionHandler(0, customExceptionHandlder); + if (NULL == exceptionHandlerHandle) + { + printf("cannot add vectored exception handler\n"); + fflush(stdout); + return 2; + } + void *continueHandlerHandle = AddVectoredContinueHandler(0, customContinueHandlder); + if (NULL == continueHandlerHandle) + { + printf("cannot add vectored continue 
handler\n"); + fflush(stdout); + return 2; + } + CallMeBack(throwFromC); + RemoveVectoredContinueHandler(continueHandlerHandle); + RemoveVectoredExceptionHandler(exceptionHandlerHandle); + printf("exceptionCount: %d\ncontinueCount: %d\n", exceptionCount, continueCount); + fflush(stdout); + return 0; +} diff --git a/src/runtime/testdata/testwinlib/main.go b/src/runtime/testdata/testwinlib/main.go new file mode 100644 index 0000000..407331b --- /dev/null +++ b/src/runtime/testdata/testwinlib/main.go @@ -0,0 +1,31 @@ +//go:build windows && cgo +// +build windows,cgo + +package main + +// #include <windows.h> +// typedef void(*callmeBackFunc)(); +// static void bridgeCallback(callmeBackFunc callback) { +// callback(); +//} +import "C" + +// CallMeBack calls back into C code. +// +//export CallMeBack +func CallMeBack(callback C.callmeBackFunc) { + C.bridgeCallback(callback) +} + +// Dummy is called by the C code before registering the exception/continue handlers, simulating a debugger. +// This makes sure that the Go runtime's lastcontinuehandler is reached before the C continue handler and thus +// validates that it does not crash the program before another handler could take an action. +// The idea here is to reproduce what happens when you attach a debugger to a running program. +// It also simulates the behavior of the .NET debugger, which registers its exception/continue handlers lazily.
+// +//export Dummy +func Dummy() int { + return 42 +} + +func main() {} diff --git a/src/runtime/testdata/testwinlibsignal/dummy.go b/src/runtime/testdata/testwinlibsignal/dummy.go new file mode 100644 index 0000000..e610f15 --- /dev/null +++ b/src/runtime/testdata/testwinlibsignal/dummy.go @@ -0,0 +1,13 @@ +//go:build windows +// +build windows + +package main + +import "C" + +//export Dummy +func Dummy() int { + return 42 +} + +func main() {} diff --git a/src/runtime/testdata/testwinlibsignal/main.c b/src/runtime/testdata/testwinlibsignal/main.c new file mode 100644 index 0000000..37f2482 --- /dev/null +++ b/src/runtime/testdata/testwinlibsignal/main.c @@ -0,0 +1,57 @@ +#include <windows.h> +#include <stdio.h> + +HANDLE waitForCtrlBreakEvent; + +BOOL WINAPI CtrlHandler(DWORD fdwCtrlType) +{ + switch (fdwCtrlType) + { + case CTRL_BREAK_EVENT: + SetEvent(waitForCtrlBreakEvent); + return TRUE; + default: + return FALSE; + } +} + +int main(void) +{ + waitForCtrlBreakEvent = CreateEvent(NULL, TRUE, FALSE, NULL); + if (!waitForCtrlBreakEvent) { + fprintf(stderr, "ERROR: Could not create event\n"); + return 1; + } + + if (!SetConsoleCtrlHandler(CtrlHandler, TRUE)) + { + fprintf(stderr, "ERROR: Could not set control handler\n"); + return 1; + } + + // The library must be loaded after the SetConsoleCtrlHandler call + // so that the library handler registers after the main program. + // This way the library handler gets called first. + HMODULE dummyDll = LoadLibrary("dummy.dll"); + if (!dummyDll) { + fprintf(stderr, "ERROR: Could not load dummy.dll\n"); + return 1; + } + + // Call the Dummy function so that Go initialization completes, since + // all cgo entry points call out to _cgo_wait_runtime_init_done. 
+ if (((int(*)(void))GetProcAddress(dummyDll, "Dummy"))() != 42) { + fprintf(stderr, "ERROR: Dummy function did not return 42\n"); + return 1; + } + + printf("ready\n"); + fflush(stdout); + + if (WaitForSingleObject(waitForCtrlBreakEvent, 5000) != WAIT_OBJECT_0) { + fprintf(stderr, "FAILURE: No signal received\n"); + return 1; + } + + return 0; +} diff --git a/src/runtime/testdata/testwinlibthrow/main.go b/src/runtime/testdata/testwinlibthrow/main.go new file mode 100644 index 0000000..ce0c92f --- /dev/null +++ b/src/runtime/testdata/testwinlibthrow/main.go @@ -0,0 +1,19 @@ +package main
+
+import (
+ "os"
+ "syscall"
+)
+
+func main() {
+ dll := syscall.MustLoadDLL("veh.dll")
+ RaiseNoExcept := dll.MustFindProc("RaiseNoExcept")
+ ThreadRaiseNoExcept := dll.MustFindProc("ThreadRaiseNoExcept")
+
+ thread := len(os.Args) > 1 && os.Args[1] == "thread"
+ if !thread {
+ RaiseNoExcept.Call()
+ } else {
+ ThreadRaiseNoExcept.Call()
+ }
+}
diff --git a/src/runtime/testdata/testwinlibthrow/veh.c b/src/runtime/testdata/testwinlibthrow/veh.c
new file mode 100644
index 0000000..08c1f9e
--- /dev/null
+++ b/src/runtime/testdata/testwinlibthrow/veh.c
@@ -0,0 +1,26 @@
+//go:build ignore
+
+#include <windows.h>
+
+__declspec(dllexport)
+void RaiseNoExcept(void)
+{
+ RaiseException(42, 0, 0, 0);
+}
+
+static DWORD WINAPI ThreadRaiser(void* Context)
+{
+ RaiseNoExcept();
+ return 0;
+}
+
+__declspec(dllexport)
+void ThreadRaiseNoExcept(void)
+{
+ HANDLE thread = CreateThread(0, 0, ThreadRaiser, 0, 0, 0);
+ if (0 != thread)
+ {
+ WaitForSingleObject(thread, INFINITE);
+ CloseHandle(thread);
+ }
+}
diff --git a/src/runtime/testdata/testwinsignal/main.go b/src/runtime/testdata/testwinsignal/main.go
new file mode 100644
index 0000000..e1136f3
--- /dev/null
+++ b/src/runtime/testdata/testwinsignal/main.go
@@ -0,0 +1,53 @@
+package main
+
+import (
+	"fmt"
+	"io"
+	"log"
+	"os"
+	"os/signal"
+	"syscall"
+	"time"
+)
+
+func main() {
+	// Ensure that this process terminates when the test times out,
+	// even if the expected signal never arrives.
+	go func() {
+		io.Copy(io.Discard, os.Stdin)
+		log.Fatal("stdin is closed; terminating")
+	}()
+
+	// Register to receive all signals.
+	c := make(chan os.Signal, 1)
+	signal.Notify(c)
+
+	// Get console window handle.
+	kernel32 := syscall.NewLazyDLL("kernel32.dll")
+	getConsoleWindow := kernel32.NewProc("GetConsoleWindow")
+	hwnd, _, err := getConsoleWindow.Call()
+	if hwnd == 0 {
+		log.Fatal("no associated console: ", err)
+	}
+
+	// Send message to close the console window.
+	const _WM_CLOSE = 0x0010
+	user32 := syscall.NewLazyDLL("user32.dll")
+	postMessage := user32.NewProc("PostMessageW")
+	ok, _, err := postMessage.Call(hwnd, _WM_CLOSE, 0, 0)
+	if ok == 0 {
+		log.Fatal("post message failed: ", err)
+	}
+
+	sig := <-c
+
+	// Allow some time for the handler to complete if it's going to.
+	//
+	// (In https://go.dev/issue/41884 the handler returned immediately,
+	// which caused Windows to terminate the program before the goroutine
+	// that received the SIGTERM had a chance to actually clean up.)
+	time.Sleep(time.Second)
+
+	// Print the signal's name: "terminated" makes the test succeed.
+	fmt.Println(sig)
+}
diff --git a/src/runtime/testdata/testwintls/main.c b/src/runtime/testdata/testwintls/main.c
new file mode 100644
index 0000000..6061828
--- /dev/null
+++ b/src/runtime/testdata/testwintls/main.c
@@ -0,0 +1,29 @@
+// Copyright 2023 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+#include <windows.h>
+
+int main(int argc, char **argv) {
+	if (argc < 3) {
+		return 1;
+	}
+	// Allocate more than 64 TLS indices
+	// so the Go runtime doesn't find
+	// enough space in the TEB TLS slots.
+	for (int i = 0; i < 65; i++) {
+		TlsAlloc();
+	}
+	HMODULE hlib = LoadLibrary(argv[1]);
+	if (hlib == NULL) {
+		return 2;
+	}
+	FARPROC proc = GetProcAddress(hlib, argv[2]);
+	if (proc == NULL) {
+		return 3;
+	}
+	if (proc() != 42) {
+		return 4;
+	}
+	return 0;
+}
\ No newline at end of file
diff --git a/src/runtime/testdata/testwintls/main.go b/src/runtime/testdata/testwintls/main.go
new file mode 100644
index 0000000..1cf296c
--- /dev/null
+++ b/src/runtime/testdata/testwintls/main.go
@@ -0,0 +1,12 @@
+// Copyright 2023 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+package main
+
+import "C"
+
+//export GoFunc
+func GoFunc() int { return 42 }
+
+func main() {}
diff --git a/src/runtime/textflag.h b/src/runtime/textflag.h
new file mode 100644
index 0000000..214075e
--- /dev/null
+++ b/src/runtime/textflag.h
@@ -0,0 +1,39 @@
+// Copyright 2013 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+
+// This file defines flags attached to various functions
+// and data objects. The compilers, assemblers, and linker must
+// all agree on these values.
+//
+// Keep in sync with src/cmd/internal/obj/textflag.go.
+
+// Don't profile the marked routine. This flag is deprecated.
+#define NOPROF 1
+// It is ok for the linker to get multiple of these symbols. It will
+// pick one of the duplicates to use.
+#define DUPOK 2
+// Don't insert stack check preamble.
+#define NOSPLIT 4
+// Put this data in a read-only section.
+#define RODATA 8
+// This data contains no pointers.
+#define NOPTR 16
+// This is a wrapper function and should not count as disabling 'recover'.
+#define WRAPPER 32
+// This function uses its incoming context register.
+#define NEEDCTXT 64
+// Allocate a word of thread local storage and store the offset from the
+// thread local base to the thread local storage in this variable.
+#define TLSBSS 256
+// Do not insert instructions to allocate a stack frame for this function.
+// Only valid on functions that declare a frame size of 0.
+// TODO(mwhudson): only implemented for ppc64x at present.
+#define NOFRAME 512 +// Function can call reflect.Type.Method or reflect.Type.MethodByName. +#define REFLECTMETHOD 1024 +// Function is the outermost frame of the call stack. Call stack unwinders +// should stop at this function. +#define TOPFRAME 2048 +// Function is an ABI wrapper. +#define ABIWRAPPER 4096 diff --git a/src/runtime/time.go b/src/runtime/time.go new file mode 100644 index 0000000..6cd70b7 --- /dev/null +++ b/src/runtime/time.go @@ -0,0 +1,1144 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Time-related runtime and pieces of package time. + +package runtime + +import ( + "internal/abi" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// Package time knows the layout of this structure. +// If this struct changes, adjust ../time/sleep.go:/runtimeTimer. +type timer struct { + // If this timer is on a heap, which P's heap it is on. + // puintptr rather than *p to match uintptr in the versions + // of this struct defined in other packages. + pp puintptr + + // Timer wakes up at when, and then at when+period, ... (period > 0 only) + // each time calling f(arg, now) in the timer goroutine, so f must be + // a well-behaved function and not block. + // + // when must be positive on an active timer. + when int64 + period int64 + f func(any, uintptr) + arg any + seq uintptr + + // What to set the when field to in timerModifiedXX status. + nextwhen int64 + + // The status field holds one of the values below. + status atomic.Uint32 +} + +// Code outside this file has to be careful in using a timer value. +// +// The pp, status, and nextwhen fields may only be used by code in this file. +// +// Code that creates a new timer value can set the when, period, f, +// arg, and seq fields. +// A new timer value may be passed to addtimer (called by time.startTimer). +// After doing that no fields may be touched. 
+// +// An active timer (one that has been passed to addtimer) may be +// passed to deltimer (time.stopTimer), after which it is no longer an +// active timer. It is an inactive timer. +// In an inactive timer the period, f, arg, and seq fields may be modified, +// but not the when field. +// It's OK to just drop an inactive timer and let the GC collect it. +// It's not OK to pass an inactive timer to addtimer. +// Only newly allocated timer values may be passed to addtimer. +// +// An active timer may be passed to modtimer. No fields may be touched. +// It remains an active timer. +// +// An inactive timer may be passed to resettimer to turn into an +// active timer with an updated when field. +// It's OK to pass a newly allocated timer value to resettimer. +// +// Timer operations are addtimer, deltimer, modtimer, resettimer, +// cleantimers, adjusttimers, and runtimer. +// +// We don't permit calling addtimer/deltimer/modtimer/resettimer simultaneously, +// but adjusttimers and runtimer can be called at the same time as any of those. +// +// Active timers live in heaps attached to P, in the timers field. +// Inactive timers live there too temporarily, until they are removed. 
+// +// addtimer: +// timerNoStatus -> timerWaiting +// anything else -> panic: invalid value +// deltimer: +// timerWaiting -> timerModifying -> timerDeleted +// timerModifiedEarlier -> timerModifying -> timerDeleted +// timerModifiedLater -> timerModifying -> timerDeleted +// timerNoStatus -> do nothing +// timerDeleted -> do nothing +// timerRemoving -> do nothing +// timerRemoved -> do nothing +// timerRunning -> wait until status changes +// timerMoving -> wait until status changes +// timerModifying -> wait until status changes +// modtimer: +// timerWaiting -> timerModifying -> timerModifiedXX +// timerModifiedXX -> timerModifying -> timerModifiedYY +// timerNoStatus -> timerModifying -> timerWaiting +// timerRemoved -> timerModifying -> timerWaiting +// timerDeleted -> timerModifying -> timerModifiedXX +// timerRunning -> wait until status changes +// timerMoving -> wait until status changes +// timerRemoving -> wait until status changes +// timerModifying -> wait until status changes +// cleantimers (looks in P's timer heap): +// timerDeleted -> timerRemoving -> timerRemoved +// timerModifiedXX -> timerMoving -> timerWaiting +// adjusttimers (looks in P's timer heap): +// timerDeleted -> timerRemoving -> timerRemoved +// timerModifiedXX -> timerMoving -> timerWaiting +// runtimer (looks in P's timer heap): +// timerNoStatus -> panic: uninitialized timer +// timerWaiting -> timerWaiting or +// timerWaiting -> timerRunning -> timerNoStatus or +// timerWaiting -> timerRunning -> timerWaiting +// timerModifying -> wait until status changes +// timerModifiedXX -> timerMoving -> timerWaiting +// timerDeleted -> timerRemoving -> timerRemoved +// timerRunning -> panic: concurrent runtimer calls +// timerRemoved -> panic: inconsistent timer heap +// timerRemoving -> panic: inconsistent timer heap +// timerMoving -> panic: inconsistent timer heap + +// Values for the timer status field. +const ( + // Timer has no status set yet. 
+ timerNoStatus = iota + + // Waiting for timer to fire. + // The timer is in some P's heap. + timerWaiting + + // Running the timer function. + // A timer will only have this status briefly. + timerRunning + + // The timer is deleted and should be removed. + // It should not be run, but it is still in some P's heap. + timerDeleted + + // The timer is being removed. + // The timer will only have this status briefly. + timerRemoving + + // The timer has been stopped. + // It is not in any P's heap. + timerRemoved + + // The timer is being modified. + // The timer will only have this status briefly. + timerModifying + + // The timer has been modified to an earlier time. + // The new when value is in the nextwhen field. + // The timer is in some P's heap, possibly in the wrong place. + timerModifiedEarlier + + // The timer has been modified to the same or a later time. + // The new when value is in the nextwhen field. + // The timer is in some P's heap, possibly in the wrong place. + timerModifiedLater + + // The timer has been modified and is being moved. + // The timer will only have this status briefly. + timerMoving +) + +// maxWhen is the maximum value for timer's when field. +const maxWhen = 1<<63 - 1 + +// verifyTimers can be set to true to add debugging checks that the +// timer heaps are valid. +const verifyTimers = false + +// Package time APIs. +// Godoc uses the comments in package time, not these. + +// time.now is implemented in assembly. + +// timeSleep puts the current goroutine to sleep for at least ns nanoseconds. +// +//go:linkname timeSleep time.Sleep +func timeSleep(ns int64) { + if ns <= 0 { + return + } + + gp := getg() + t := gp.timer + if t == nil { + t = new(timer) + gp.timer = t + } + t.f = goroutineReady + t.arg = gp + t.nextwhen = nanotime() + ns + if t.nextwhen < 0 { // check for overflow. 
+ t.nextwhen = maxWhen + } + gopark(resetForSleep, unsafe.Pointer(t), waitReasonSleep, traceEvGoSleep, 1) +} + +// resetForSleep is called after the goroutine is parked for timeSleep. +// We can't call resettimer in timeSleep itself because if this is a short +// sleep and there are many goroutines then the P can wind up running the +// timer function, goroutineReady, before the goroutine has been parked. +func resetForSleep(gp *g, ut unsafe.Pointer) bool { + t := (*timer)(ut) + resettimer(t, t.nextwhen) + return true +} + +// startTimer adds t to the timer heap. +// +//go:linkname startTimer time.startTimer +func startTimer(t *timer) { + if raceenabled { + racerelease(unsafe.Pointer(t)) + } + addtimer(t) +} + +// stopTimer stops a timer. +// It reports whether t was stopped before being run. +// +//go:linkname stopTimer time.stopTimer +func stopTimer(t *timer) bool { + return deltimer(t) +} + +// resetTimer resets an inactive timer, adding it to the heap. +// +// Reports whether the timer was modified before it was run. +// +//go:linkname resetTimer time.resetTimer +func resetTimer(t *timer, when int64) bool { + if raceenabled { + racerelease(unsafe.Pointer(t)) + } + return resettimer(t, when) +} + +// modTimer modifies an existing timer. +// +//go:linkname modTimer time.modTimer +func modTimer(t *timer, when, period int64, f func(any, uintptr), arg any, seq uintptr) { + modtimer(t, when, period, f, arg, seq) +} + +// Go runtime. + +// Ready the goroutine arg. +func goroutineReady(arg any, seq uintptr) { + goready(arg.(*g), 0) +} + +// Note: this changes some unsynchronized operations to synchronized operations +// addtimer adds a timer to the current P. +// This should only be called with a newly created timer. +// That avoids the risk of changing the when field of a timer in some P's heap, +// which could cause the heap to become unsorted. +func addtimer(t *timer) { + // when must be positive. 
A negative value will cause runtimer to + // overflow during its delta calculation and never expire other runtime + // timers. Zero will cause checkTimers to fail to notice the timer. + if t.when <= 0 { + throw("timer when must be positive") + } + if t.period < 0 { + throw("timer period must be non-negative") + } + if t.status.Load() != timerNoStatus { + throw("addtimer called with initialized timer") + } + t.status.Store(timerWaiting) + + when := t.when + + // Disable preemption while using pp to avoid changing another P's heap. + mp := acquirem() + + pp := getg().m.p.ptr() + lock(&pp.timersLock) + cleantimers(pp) + doaddtimer(pp, t) + unlock(&pp.timersLock) + + wakeNetPoller(when) + + releasem(mp) +} + +// doaddtimer adds t to the current P's heap. +// The caller must have locked the timers for pp. +func doaddtimer(pp *p, t *timer) { + // Timers rely on the network poller, so make sure the poller + // has started. + if netpollInited.Load() == 0 { + netpollGenericInit() + } + + if t.pp != 0 { + throw("doaddtimer: P already set in timer") + } + t.pp.set(pp) + i := len(pp.timers) + pp.timers = append(pp.timers, t) + siftupTimer(pp.timers, i) + if t == pp.timers[0] { + pp.timer0When.Store(t.when) + } + pp.numTimers.Add(1) +} + +// deltimer deletes the timer t. It may be on some other P, so we can't +// actually remove it from the timers heap. We can only mark it as deleted. +// It will be removed in due course by the P whose heap it is on. +// Reports whether the timer was removed before it was run. +func deltimer(t *timer) bool { + for { + switch s := t.status.Load(); s { + case timerWaiting, timerModifiedLater: + // Prevent preemption while the timer is in timerModifying. + // This could lead to a self-deadlock. See #38070. + mp := acquirem() + if t.status.CompareAndSwap(s, timerModifying) { + // Must fetch t.pp before changing status, + // as cleantimers in another goroutine + // can clear t.pp of a timerDeleted timer. 
+ tpp := t.pp.ptr() + if !t.status.CompareAndSwap(timerModifying, timerDeleted) { + badTimer() + } + releasem(mp) + tpp.deletedTimers.Add(1) + // Timer was not yet run. + return true + } else { + releasem(mp) + } + case timerModifiedEarlier: + // Prevent preemption while the timer is in timerModifying. + // This could lead to a self-deadlock. See #38070. + mp := acquirem() + if t.status.CompareAndSwap(s, timerModifying) { + // Must fetch t.pp before setting status + // to timerDeleted. + tpp := t.pp.ptr() + if !t.status.CompareAndSwap(timerModifying, timerDeleted) { + badTimer() + } + releasem(mp) + tpp.deletedTimers.Add(1) + // Timer was not yet run. + return true + } else { + releasem(mp) + } + case timerDeleted, timerRemoving, timerRemoved: + // Timer was already run. + return false + case timerRunning, timerMoving: + // The timer is being run or moved, by a different P. + // Wait for it to complete. + osyield() + case timerNoStatus: + // Removing timer that was never added or + // has already been run. Also see issue 21874. + return false + case timerModifying: + // Simultaneous calls to deltimer and modtimer. + // Wait for the other call to complete. + osyield() + default: + badTimer() + } + } +} + +// dodeltimer removes timer i from the current P's heap. +// We are locked on the P when this is called. +// It returns the smallest changed index in pp.timers. +// The caller must have locked the timers for pp. +func dodeltimer(pp *p, i int) int { + if t := pp.timers[i]; t.pp.ptr() != pp { + throw("dodeltimer: wrong P") + } else { + t.pp = 0 + } + last := len(pp.timers) - 1 + if i != last { + pp.timers[i] = pp.timers[last] + } + pp.timers[last] = nil + pp.timers = pp.timers[:last] + smallestChanged := i + if i != last { + // Moving to i may have moved the last timer to a new parent, + // so sift up to preserve the heap guarantee. 
+ smallestChanged = siftupTimer(pp.timers, i) + siftdownTimer(pp.timers, i) + } + if i == 0 { + updateTimer0When(pp) + } + n := pp.numTimers.Add(-1) + if n == 0 { + // If there are no timers, then clearly none are modified. + pp.timerModifiedEarliest.Store(0) + } + return smallestChanged +} + +// dodeltimer0 removes timer 0 from the current P's heap. +// We are locked on the P when this is called. +// It reports whether it saw no problems due to races. +// The caller must have locked the timers for pp. +func dodeltimer0(pp *p) { + if t := pp.timers[0]; t.pp.ptr() != pp { + throw("dodeltimer0: wrong P") + } else { + t.pp = 0 + } + last := len(pp.timers) - 1 + if last > 0 { + pp.timers[0] = pp.timers[last] + } + pp.timers[last] = nil + pp.timers = pp.timers[:last] + if last > 0 { + siftdownTimer(pp.timers, 0) + } + updateTimer0When(pp) + n := pp.numTimers.Add(-1) + if n == 0 { + // If there are no timers, then clearly none are modified. + pp.timerModifiedEarliest.Store(0) + } +} + +// modtimer modifies an existing timer. +// This is called by the netpoll code or time.Ticker.Reset or time.Timer.Reset. +// Reports whether the timer was modified before it was run. +func modtimer(t *timer, when, period int64, f func(any, uintptr), arg any, seq uintptr) bool { + if when <= 0 { + throw("timer when must be positive") + } + if period < 0 { + throw("timer period must be non-negative") + } + + status := uint32(timerNoStatus) + wasRemoved := false + var pending bool + var mp *m +loop: + for { + switch status = t.status.Load(); status { + case timerWaiting, timerModifiedEarlier, timerModifiedLater: + // Prevent preemption while the timer is in timerModifying. + // This could lead to a self-deadlock. See #38070. + mp = acquirem() + if t.status.CompareAndSwap(status, timerModifying) { + pending = true // timer not yet run + break loop + } + releasem(mp) + case timerNoStatus, timerRemoved: + // Prevent preemption while the timer is in timerModifying. 
+ // This could lead to a self-deadlock. See #38070. + mp = acquirem() + + // Timer was already run and t is no longer in a heap. + // Act like addtimer. + if t.status.CompareAndSwap(status, timerModifying) { + wasRemoved = true + pending = false // timer already run or stopped + break loop + } + releasem(mp) + case timerDeleted: + // Prevent preemption while the timer is in timerModifying. + // This could lead to a self-deadlock. See #38070. + mp = acquirem() + if t.status.CompareAndSwap(status, timerModifying) { + t.pp.ptr().deletedTimers.Add(-1) + pending = false // timer already stopped + break loop + } + releasem(mp) + case timerRunning, timerRemoving, timerMoving: + // The timer is being run or moved, by a different P. + // Wait for it to complete. + osyield() + case timerModifying: + // Multiple simultaneous calls to modtimer. + // Wait for the other call to complete. + osyield() + default: + badTimer() + } + } + + t.period = period + t.f = f + t.arg = arg + t.seq = seq + + if wasRemoved { + t.when = when + pp := getg().m.p.ptr() + lock(&pp.timersLock) + doaddtimer(pp, t) + unlock(&pp.timersLock) + if !t.status.CompareAndSwap(timerModifying, timerWaiting) { + badTimer() + } + releasem(mp) + wakeNetPoller(when) + } else { + // The timer is in some other P's heap, so we can't change + // the when field. If we did, the other P's heap would + // be out of order. So we put the new when value in the + // nextwhen field, and let the other P set the when field + // when it is prepared to resort the heap. + t.nextwhen = when + + newStatus := uint32(timerModifiedLater) + if when < t.when { + newStatus = timerModifiedEarlier + } + + tpp := t.pp.ptr() + + if newStatus == timerModifiedEarlier { + updateTimerModifiedEarliest(tpp, when) + } + + // Set the new status of the timer. + if !t.status.CompareAndSwap(timerModifying, newStatus) { + badTimer() + } + releasem(mp) + + // If the new status is earlier, wake up the poller. 
+ if newStatus == timerModifiedEarlier { + wakeNetPoller(when) + } + } + + return pending +} + +// resettimer resets the time when a timer should fire. +// If used for an inactive timer, the timer will become active. +// This should be called instead of addtimer if the timer value has been, +// or may have been, used previously. +// Reports whether the timer was modified before it was run. +func resettimer(t *timer, when int64) bool { + return modtimer(t, when, t.period, t.f, t.arg, t.seq) +} + +// cleantimers cleans up the head of the timer queue. This speeds up +// programs that create and delete timers; leaving them in the heap +// slows down addtimer. Reports whether no timer problems were found. +// The caller must have locked the timers for pp. +func cleantimers(pp *p) { + gp := getg() + for { + if len(pp.timers) == 0 { + return + } + + // This loop can theoretically run for a while, and because + // it is holding timersLock it cannot be preempted. + // If someone is trying to preempt us, just return. + // We can clean the timers later. + if gp.preemptStop { + return + } + + t := pp.timers[0] + if t.pp.ptr() != pp { + throw("cleantimers: bad p") + } + switch s := t.status.Load(); s { + case timerDeleted: + if !t.status.CompareAndSwap(s, timerRemoving) { + continue + } + dodeltimer0(pp) + if !t.status.CompareAndSwap(timerRemoving, timerRemoved) { + badTimer() + } + pp.deletedTimers.Add(-1) + case timerModifiedEarlier, timerModifiedLater: + if !t.status.CompareAndSwap(s, timerMoving) { + continue + } + // Now we can change the when field. + t.when = t.nextwhen + // Move t to the right position. + dodeltimer0(pp) + doaddtimer(pp, t) + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + default: + // Head of timers does not need adjustment. + return + } + } +} + +// moveTimers moves a slice of timers to pp. The slice has been taken +// from a different P. 
+// This is currently called when the world is stopped, but the caller +// is expected to have locked the timers for pp. +func moveTimers(pp *p, timers []*timer) { + for _, t := range timers { + loop: + for { + switch s := t.status.Load(); s { + case timerWaiting: + if !t.status.CompareAndSwap(s, timerMoving) { + continue + } + t.pp = 0 + doaddtimer(pp, t) + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + break loop + case timerModifiedEarlier, timerModifiedLater: + if !t.status.CompareAndSwap(s, timerMoving) { + continue + } + t.when = t.nextwhen + t.pp = 0 + doaddtimer(pp, t) + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + break loop + case timerDeleted: + if !t.status.CompareAndSwap(s, timerRemoved) { + continue + } + t.pp = 0 + // We no longer need this timer in the heap. + break loop + case timerModifying: + // Loop until the modification is complete. + osyield() + case timerNoStatus, timerRemoved: + // We should not see these status values in a timers heap. + badTimer() + case timerRunning, timerRemoving, timerMoving: + // Some other P thinks it owns this timer, + // which should not happen. + badTimer() + default: + badTimer() + } + } + } +} + +// adjusttimers looks through the timers in the current P's heap for +// any timers that have been modified to run earlier, and puts them in +// the correct place in the heap. While looking for those timers, +// it also moves timers that have been modified to run later, +// and removes deleted timers. The caller must have locked the timers for pp. +func adjusttimers(pp *p, now int64) { + // If we haven't yet reached the time of the first timerModifiedEarlier + // timer, don't do anything. This speeds up programs that adjust + // a lot of timers back and forth if the timers rarely expire. + // We'll postpone looking through all the adjusted timers until + // one would actually expire. 
+ first := pp.timerModifiedEarliest.Load() + if first == 0 || first > now { + if verifyTimers { + verifyTimerHeap(pp) + } + return + } + + // We are going to clear all timerModifiedEarlier timers. + pp.timerModifiedEarliest.Store(0) + + var moved []*timer + for i := 0; i < len(pp.timers); i++ { + t := pp.timers[i] + if t.pp.ptr() != pp { + throw("adjusttimers: bad p") + } + switch s := t.status.Load(); s { + case timerDeleted: + if t.status.CompareAndSwap(s, timerRemoving) { + changed := dodeltimer(pp, i) + if !t.status.CompareAndSwap(timerRemoving, timerRemoved) { + badTimer() + } + pp.deletedTimers.Add(-1) + // Go back to the earliest changed heap entry. + // "- 1" because the loop will add 1. + i = changed - 1 + } + case timerModifiedEarlier, timerModifiedLater: + if t.status.CompareAndSwap(s, timerMoving) { + // Now we can change the when field. + t.when = t.nextwhen + // Take t off the heap, and hold onto it. + // We don't add it back yet because the + // heap manipulation could cause our + // loop to skip some other timer. + changed := dodeltimer(pp, i) + moved = append(moved, t) + // Go back to the earliest changed heap entry. + // "- 1" because the loop will add 1. + i = changed - 1 + } + case timerNoStatus, timerRunning, timerRemoving, timerRemoved, timerMoving: + badTimer() + case timerWaiting: + // OK, nothing to do. + case timerModifying: + // Check again after modification is complete. + osyield() + i-- + default: + badTimer() + } + } + + if len(moved) > 0 { + addAdjustedTimers(pp, moved) + } + + if verifyTimers { + verifyTimerHeap(pp) + } +} + +// addAdjustedTimers adds any timers we adjusted in adjusttimers +// back to the timer heap. +func addAdjustedTimers(pp *p, moved []*timer) { + for _, t := range moved { + doaddtimer(pp, t) + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + } +} + +// nobarrierWakeTime looks at P's timers and returns the time when we +// should wake up the netpoller. 
It returns 0 if there are no timers. +// This function is invoked when dropping a P, and must run without +// any write barriers. +// +//go:nowritebarrierrec +func nobarrierWakeTime(pp *p) int64 { + next := pp.timer0When.Load() + nextAdj := pp.timerModifiedEarliest.Load() + if next == 0 || (nextAdj != 0 && nextAdj < next) { + next = nextAdj + } + return next +} + +// runtimer examines the first timer in timers. If it is ready based on now, +// it runs the timer and removes or updates it. +// Returns 0 if it ran a timer, -1 if there are no more timers, or the time +// when the first timer should run. +// The caller must have locked the timers for pp. +// If a timer is run, this will temporarily unlock the timers. +// +//go:systemstack +func runtimer(pp *p, now int64) int64 { + for { + t := pp.timers[0] + if t.pp.ptr() != pp { + throw("runtimer: bad p") + } + switch s := t.status.Load(); s { + case timerWaiting: + if t.when > now { + // Not ready to run. + return t.when + } + + if !t.status.CompareAndSwap(s, timerRunning) { + continue + } + // Note that runOneTimer may temporarily unlock + // pp.timersLock. + runOneTimer(pp, t, now) + return 0 + + case timerDeleted: + if !t.status.CompareAndSwap(s, timerRemoving) { + continue + } + dodeltimer0(pp) + if !t.status.CompareAndSwap(timerRemoving, timerRemoved) { + badTimer() + } + pp.deletedTimers.Add(-1) + if len(pp.timers) == 0 { + return -1 + } + + case timerModifiedEarlier, timerModifiedLater: + if !t.status.CompareAndSwap(s, timerMoving) { + continue + } + t.when = t.nextwhen + dodeltimer0(pp) + doaddtimer(pp, t) + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + + case timerModifying: + // Wait for modification to complete. + osyield() + + case timerNoStatus, timerRemoved: + // Should not see a new or inactive timer on the heap. + badTimer() + case timerRunning, timerRemoving, timerMoving: + // These should only be set when timers are locked, + // and we didn't do it. 
+ badTimer() + default: + badTimer() + } + } +} + +// runOneTimer runs a single timer. +// The caller must have locked the timers for pp. +// This will temporarily unlock the timers while running the timer function. +// +//go:systemstack +func runOneTimer(pp *p, t *timer, now int64) { + if raceenabled { + ppcur := getg().m.p.ptr() + if ppcur.timerRaceCtx == 0 { + ppcur.timerRaceCtx = racegostart(abi.FuncPCABIInternal(runtimer) + sys.PCQuantum) + } + raceacquirectx(ppcur.timerRaceCtx, unsafe.Pointer(t)) + } + + f := t.f + arg := t.arg + seq := t.seq + + if t.period > 0 { + // Leave in heap but adjust next time to fire. + delta := t.when - now + t.when += t.period * (1 + -delta/t.period) + if t.when < 0 { // check for overflow. + t.when = maxWhen + } + siftdownTimer(pp.timers, 0) + if !t.status.CompareAndSwap(timerRunning, timerWaiting) { + badTimer() + } + updateTimer0When(pp) + } else { + // Remove from heap. + dodeltimer0(pp) + if !t.status.CompareAndSwap(timerRunning, timerNoStatus) { + badTimer() + } + } + + if raceenabled { + // Temporarily use the current P's racectx for g0. + gp := getg() + if gp.racectx != 0 { + throw("runOneTimer: unexpected racectx") + } + gp.racectx = gp.m.p.ptr().timerRaceCtx + } + + unlock(&pp.timersLock) + + f(arg, seq) + + lock(&pp.timersLock) + + if raceenabled { + gp := getg() + gp.racectx = 0 + } +} + +// clearDeletedTimers removes all deleted timers from the P's timer heap. +// This is used to avoid clogging up the heap if the program +// starts a lot of long-running timers and then stops them. +// For example, this can happen via context.WithTimeout. +// +// This is the only function that walks through the entire timer heap, +// other than moveTimers which only runs when the world is stopped. +// +// The caller must have locked the timers for pp. +func clearDeletedTimers(pp *p) { + // We are going to clear all timerModifiedEarlier timers. + // Do this now in case new ones show up while we are looping. 
+ pp.timerModifiedEarliest.Store(0) + + cdel := int32(0) + to := 0 + changedHeap := false + timers := pp.timers +nextTimer: + for _, t := range timers { + for { + switch s := t.status.Load(); s { + case timerWaiting: + if changedHeap { + timers[to] = t + siftupTimer(timers, to) + } + to++ + continue nextTimer + case timerModifiedEarlier, timerModifiedLater: + if t.status.CompareAndSwap(s, timerMoving) { + t.when = t.nextwhen + timers[to] = t + siftupTimer(timers, to) + to++ + changedHeap = true + if !t.status.CompareAndSwap(timerMoving, timerWaiting) { + badTimer() + } + continue nextTimer + } + case timerDeleted: + if t.status.CompareAndSwap(s, timerRemoving) { + t.pp = 0 + cdel++ + if !t.status.CompareAndSwap(timerRemoving, timerRemoved) { + badTimer() + } + changedHeap = true + continue nextTimer + } + case timerModifying: + // Loop until modification complete. + osyield() + case timerNoStatus, timerRemoved: + // We should not see these status values in a timer heap. + badTimer() + case timerRunning, timerRemoving, timerMoving: + // Some other P thinks it owns this timer, + // which should not happen. + badTimer() + default: + badTimer() + } + } + } + + // Set remaining slots in timers slice to nil, + // so that the timer values can be garbage collected. + for i := to; i < len(timers); i++ { + timers[i] = nil + } + + pp.deletedTimers.Add(-cdel) + pp.numTimers.Add(-cdel) + + timers = timers[:to] + pp.timers = timers + updateTimer0When(pp) + + if verifyTimers { + verifyTimerHeap(pp) + } +} + +// verifyTimerHeap verifies that the timer heap is in a valid state. +// This is only for debugging, and is only called if verifyTimers is true. +// The caller must have locked the timers. +func verifyTimerHeap(pp *p) { + for i, t := range pp.timers { + if i == 0 { + // First timer has no parent. + continue + } + + // The heap is 4-ary. See siftupTimer and siftdownTimer. 
+ p := (i - 1) / 4 + if t.when < pp.timers[p].when { + print("bad timer heap at ", i, ": ", p, ": ", pp.timers[p].when, ", ", i, ": ", t.when, "\n") + throw("bad timer heap") + } + } + if numTimers := int(pp.numTimers.Load()); len(pp.timers) != numTimers { + println("timer heap len", len(pp.timers), "!= numTimers", numTimers) + throw("bad timer heap len") + } +} + +// updateTimer0When sets the P's timer0When field. +// The caller must have locked the timers for pp. +func updateTimer0When(pp *p) { + if len(pp.timers) == 0 { + pp.timer0When.Store(0) + } else { + pp.timer0When.Store(pp.timers[0].when) + } +} + +// updateTimerModifiedEarliest updates the recorded nextwhen field of the +// earlier timerModifiedEarlier value. +// The timers for pp will not be locked. +func updateTimerModifiedEarliest(pp *p, nextwhen int64) { + for { + old := pp.timerModifiedEarliest.Load() + if old != 0 && int64(old) < nextwhen { + return + } + + if pp.timerModifiedEarliest.CompareAndSwap(old, nextwhen) { + return + } + } +} + +// timeSleepUntil returns the time when the next timer should fire. Returns +// maxWhen if there are no timers. +// This is only called by sysmon and checkdead. +func timeSleepUntil() int64 { + next := int64(maxWhen) + + // Prevent allp slice changes. This is like retake. + lock(&allpLock) + for _, pp := range allp { + if pp == nil { + // This can happen if procresize has grown + // allp but not yet created new Ps. + continue + } + + w := pp.timer0When.Load() + if w != 0 && w < next { + next = w + } + + w = pp.timerModifiedEarliest.Load() + if w != 0 && w < next { + next = w + } + } + unlock(&allpLock) + + return next +} + +// Heap maintenance algorithms. +// These algorithms check for slice index errors manually. +// Slice index errors can happen if the program is using racy +// access to timers. We don't want to panic here, because +// it will cause the program to crash with a mysterious +// "panic holding locks" message.
Instead, we panic while not +// holding a lock. + +// siftupTimer puts the timer at position i in the right place +// in the heap by moving it up toward the top of the heap. +// It returns the smallest changed index. +func siftupTimer(t []*timer, i int) int { + if i >= len(t) { + badTimer() + } + when := t[i].when + if when <= 0 { + badTimer() + } + tmp := t[i] + for i > 0 { + p := (i - 1) / 4 // parent + if when >= t[p].when { + break + } + t[i] = t[p] + i = p + } + if tmp != t[i] { + t[i] = tmp + } + return i +} + +// siftdownTimer puts the timer at position i in the right place +// in the heap by moving it down toward the bottom of the heap. +func siftdownTimer(t []*timer, i int) { + n := len(t) + if i >= n { + badTimer() + } + when := t[i].when + if when <= 0 { + badTimer() + } + tmp := t[i] + for { + c := i*4 + 1 // left child + c3 := c + 2 // mid child + if c >= n { + break + } + w := t[c].when + if c+1 < n && t[c+1].when < w { + w = t[c+1].when + c++ + } + if c3 < n { + w3 := t[c3].when + if c3+1 < n && t[c3+1].when < w3 { + w3 = t[c3+1].when + c3++ + } + if w3 < w { + w = w3 + c = c3 + } + } + if w >= when { + break + } + t[i] = t[c] + i = c + } + if tmp != t[i] { + t[i] = tmp + } +} + +// badTimer is called if the timer data structures have been corrupted, +// presumably due to racy use by the program. We panic here rather than +// panicking due to invalid slice access while holding locks. +// See issue #25686. +func badTimer() { + throw("timer data corruption") +} diff --git a/src/runtime/time_fake.go b/src/runtime/time_fake.go new file mode 100644 index 0000000..9e24f70 --- /dev/null +++ b/src/runtime/time_fake.go @@ -0,0 +1,98 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build faketime && !windows + +// Faketime isn't currently supported on Windows.
This would require +// modifying syscall.Write to call syscall.faketimeWrite, +// translating the Stdout and Stderr handles into FDs 1 and 2. +// (See CL 192739 PS 3.) + +package runtime + +import "unsafe" + +// faketime is the simulated time in nanoseconds since 1970 for the +// playground. +var faketime int64 = 1257894000000000000 + +var faketimeState struct { + lock mutex + + // lastfaketime is the last faketime value written to fd 1 or 2. + lastfaketime int64 + + // lastfd is the fd to which lastfaketime was written. + // + // Subsequent writes to the same fd may use the same + // timestamp, but the timestamp must increase if the fd + // changes. + lastfd uintptr +} + +//go:nosplit +func nanotime() int64 { + return faketime +} + +//go:linkname time_now time.now +func time_now() (sec int64, nsec int32, mono int64) { + return faketime / 1e9, int32(faketime % 1e9), faketime +} + +// write is like the Unix write system call. +// We have to avoid write barriers to avoid potential deadlock +// on write calls. +// +//go:nowritebarrierrec +func write(fd uintptr, p unsafe.Pointer, n int32) int32 { + if !(fd == 1 || fd == 2) { + // Do an ordinary write. + return write1(fd, p, n) + } + + // Write with the playback header. + + // First, lock to avoid interleaving writes. + lock(&faketimeState.lock) + + // If the current fd doesn't match the fd of the previous write, + // ensure that the timestamp is strictly greater. That way, we can + // recover the original order even if we read the fds separately. 
+ t := faketimeState.lastfaketime + if fd != faketimeState.lastfd { + t++ + faketimeState.lastfd = fd + } + if faketime > t { + t = faketime + } + faketimeState.lastfaketime = t + + // Playback header: 0 0 P B <8-byte time> <4-byte data length> (big endian) + var buf [4 + 8 + 4]byte + buf[2] = 'P' + buf[3] = 'B' + tu := uint64(t) + buf[4] = byte(tu >> (7 * 8)) + buf[5] = byte(tu >> (6 * 8)) + buf[6] = byte(tu >> (5 * 8)) + buf[7] = byte(tu >> (4 * 8)) + buf[8] = byte(tu >> (3 * 8)) + buf[9] = byte(tu >> (2 * 8)) + buf[10] = byte(tu >> (1 * 8)) + buf[11] = byte(tu >> (0 * 8)) + nu := uint32(n) + buf[12] = byte(nu >> (3 * 8)) + buf[13] = byte(nu >> (2 * 8)) + buf[14] = byte(nu >> (1 * 8)) + buf[15] = byte(nu >> (0 * 8)) + write1(fd, unsafe.Pointer(&buf[0]), int32(len(buf))) + + // Write actual data. + res := write1(fd, p, n) + + unlock(&faketimeState.lock) + return res +} diff --git a/src/runtime/time_linux_amd64.s b/src/runtime/time_linux_amd64.s new file mode 100644 index 0000000..1416d23 --- /dev/null +++ b/src/runtime/time_linux_amd64.s @@ -0,0 +1,87 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !faketime + +#include "go_asm.h" +#include "go_tls.h" +#include "textflag.h" + +#define SYS_clock_gettime 228 + +// func time.now() (sec int64, nsec int32, mono int64) +TEXT time·now<ABIInternal>(SB),NOSPLIT,$16-24 + MOVQ SP, R12 // Save old SP; R12 unchanged by C code. + + MOVQ g_m(R14), BX // BX unchanged by C code. + + // Set vdsoPC and vdsoSP for SIGPROF traceback. + // Save the old values on stack and restore them on exit, + // so this function is reentrant. + MOVQ m_vdsoPC(BX), CX + MOVQ m_vdsoSP(BX), DX + MOVQ CX, 0(SP) + MOVQ DX, 8(SP) + + LEAQ sec+0(FP), DX + MOVQ -8(DX), CX // Sets CX to function return address. + MOVQ CX, m_vdsoPC(BX) + MOVQ DX, m_vdsoSP(BX) + + CMPQ R14, m_curg(BX) // Only switch if on curg. 
+ JNE noswitch + + MOVQ m_g0(BX), DX + MOVQ (g_sched+gobuf_sp)(DX), SP // Set SP to g0 stack + +noswitch: + SUBQ $32, SP // Space for two time results + ANDQ $~15, SP // Align for C code + + MOVL $0, DI // CLOCK_REALTIME + LEAQ 16(SP), SI + MOVQ runtime·vdsoClockgettimeSym(SB), AX + CMPQ AX, $0 + JEQ fallback + CALL AX + + MOVL $1, DI // CLOCK_MONOTONIC + LEAQ 0(SP), SI + MOVQ runtime·vdsoClockgettimeSym(SB), AX + CALL AX + +ret: + MOVQ 16(SP), AX // realtime sec + MOVQ 24(SP), DI // realtime nsec (moved to BX below) + MOVQ 0(SP), CX // monotonic sec + IMULQ $1000000000, CX + MOVQ 8(SP), DX // monotonic nsec + + MOVQ R12, SP // Restore real SP + + // Restore vdsoPC, vdsoSP + // We don't worry about being signaled between the two stores. + // If we are not in a signal handler, we'll restore vdsoSP to 0, + // and no one will care about vdsoPC. If we are in a signal handler, + // we cannot receive another signal. + MOVQ 8(SP), SI + MOVQ SI, m_vdsoSP(BX) + MOVQ 0(SP), SI + MOVQ SI, m_vdsoPC(BX) + + // set result registers; AX is already correct + MOVQ DI, BX + ADDQ DX, CX + RET + +fallback: + MOVQ $SYS_clock_gettime, AX + SYSCALL + + MOVL $1, DI // CLOCK_MONOTONIC + LEAQ 0(SP), SI + MOVQ $SYS_clock_gettime, AX + SYSCALL + + JMP ret diff --git a/src/runtime/time_nofake.go b/src/runtime/time_nofake.go new file mode 100644 index 0000000..70a2102 --- /dev/null +++ b/src/runtime/time_nofake.go @@ -0,0 +1,32 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !faketime + +package runtime + +import "unsafe" + +// faketime is the simulated time in nanoseconds since 1970 for the +// playground. +// +// Zero means not to use faketime. 
+var faketime int64 + +//go:nosplit +func nanotime() int64 { + return nanotime1() +} + +var overrideWrite func(fd uintptr, p unsafe.Pointer, n int32) int32 + +// write must be nosplit on Windows (see write1) +// +//go:nosplit +func write(fd uintptr, p unsafe.Pointer, n int32) int32 { + if overrideWrite != nil { + return overrideWrite(fd, noescape(p), n) + } + return write1(fd, p, n) +} diff --git a/src/runtime/time_test.go b/src/runtime/time_test.go new file mode 100644 index 0000000..afd9af2 --- /dev/null +++ b/src/runtime/time_test.go @@ -0,0 +1,97 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "bytes" + "encoding/binary" + "errors" + "internal/testenv" + "os/exec" + "reflect" + "runtime" + "testing" +) + +func TestFakeTime(t *testing.T) { + if runtime.GOOS == "windows" { + t.Skip("faketime not supported on windows") + } + + // Faketime is advanced in checkdead. External linking brings in cgo, + // which prevents checkdead from working.
+ testenv.MustInternalLink(t) + + t.Parallel() + + exe, err := buildTestProg(t, "testfaketime", "-tags=faketime") + if err != nil { + t.Fatal(err) + } + + var stdout, stderr bytes.Buffer + cmd := exec.Command(exe) + cmd.Stdout = &stdout + cmd.Stderr = &stderr + + err = testenv.CleanCmdEnv(cmd).Run() + if err != nil { + t.Fatalf("exit status: %v\n%s", err, stderr.String()) + } + + t.Logf("raw stdout: %q", stdout.String()) + t.Logf("raw stderr: %q", stderr.String()) + + f1, err1 := parseFakeTime(stdout.Bytes()) + if err1 != nil { + t.Fatal(err1) + } + f2, err2 := parseFakeTime(stderr.Bytes()) + if err2 != nil { + t.Fatal(err2) + } + + const time0 = 1257894000000000000 + got := [][]fakeTimeFrame{f1, f2} + var want = [][]fakeTimeFrame{{ + {time0 + 1, "line 2\n"}, + {time0 + 1, "line 3\n"}, + {time0 + 1e9, "line 5\n"}, + {time0 + 1e9, "2009-11-10T23:00:01Z"}, + }, { + {time0, "line 1\n"}, + {time0 + 2, "line 4\n"}, + }} + if !reflect.DeepEqual(want, got) { + t.Fatalf("want %v, got %v", want, got) + } +} + +type fakeTimeFrame struct { + time uint64 + data string +} + +func parseFakeTime(x []byte) ([]fakeTimeFrame, error) { + var frames []fakeTimeFrame + for len(x) != 0 { + if len(x) < 4+8+4 { + return nil, errors.New("truncated header") + } + const magic = "\x00\x00PB" + if string(x[:len(magic)]) != magic { + return nil, errors.New("bad magic") + } + x = x[len(magic):] + time := binary.BigEndian.Uint64(x) + x = x[8:] + dlen := binary.BigEndian.Uint32(x) + x = x[4:] + data := string(x[:dlen]) + x = x[dlen:] + frames = append(frames, fakeTimeFrame{time, data}) + } + return frames, nil +} diff --git a/src/runtime/time_windows.h b/src/runtime/time_windows.h new file mode 100644 index 0000000..7c2e65c --- /dev/null +++ b/src/runtime/time_windows.h @@ -0,0 +1,17 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +// Constants for fetching time values on Windows for use in asm code. + +// See https://wrkhpi.wordpress.com/2007/08/09/getting-os-information-the-kuser_shared_data-structure/ +// Archived copy at: +// http://web.archive.org/web/20210411000829/https://wrkhpi.wordpress.com/2007/08/09/getting-os-information-the-kuser_shared_data-structure/ + +// Must read hi1, then lo, then hi2. The snapshot is valid if hi1 == hi2. +// Or, on 64-bit, just read lo:hi1 all at once atomically. +#define _INTERRUPT_TIME 0x7ffe0008 +#define _SYSTEM_TIME 0x7ffe0014 +#define time_lo 0 +#define time_hi1 4 +#define time_hi2 8 diff --git a/src/runtime/time_windows_386.s b/src/runtime/time_windows_386.s new file mode 100644 index 0000000..b8b636e --- /dev/null +++ b/src/runtime/time_windows_386.s @@ -0,0 +1,84 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !faketime + +#include "go_asm.h" +#include "textflag.h" +#include "time_windows.h" + +TEXT time·now(SB),NOSPLIT,$0-20 + CMPB runtime·useQPCTime(SB), $0 + JNE useQPC +loop: + MOVL (_INTERRUPT_TIME+time_hi1), AX + MOVL (_INTERRUPT_TIME+time_lo), CX + MOVL (_INTERRUPT_TIME+time_hi2), DI + CMPL AX, DI + JNE loop + + // w = DI:CX + // multiply by 100 + MOVL $100, AX + MULL CX + IMULL $100, DI + ADDL DI, DX + // w*100 = DX:AX + MOVL AX, mono+12(FP) + MOVL DX, mono+16(FP) + +wall: + MOVL (_SYSTEM_TIME+time_hi1), CX + MOVL (_SYSTEM_TIME+time_lo), AX + MOVL (_SYSTEM_TIME+time_hi2), DX + CMPL CX, DX + JNE wall + + // w = DX:AX + // convert to Unix epoch (but still 100ns units) + #define delta 116444736000000000 + SUBL $(delta & 0xFFFFFFFF), AX + SBBL $(delta >> 32), DX + + // nano/100 = DX:AX + // split into two decimal halves by div 1e9. + // (decimal point is two spots over from correct place, + // but we avoid overflow in the high word.) 
+ MOVL $1000000000, CX + DIVL CX + MOVL AX, DI + MOVL DX, SI + + // DI = nano/100/1e9 = nano/1e11 = sec/100, DX = SI = nano/100%1e9 + // split DX into seconds and nanoseconds by div 1e7 magic multiply. + MOVL DX, AX + MOVL $1801439851, CX + MULL CX + SHRL $22, DX + MOVL DX, BX + IMULL $10000000, DX + MOVL SI, CX + SUBL DX, CX + + // DI = sec/100 (still) + // BX = (nano/100%1e9)/1e7 = (nano/1e9)%100 = sec%100 + // CX = (nano/100%1e9)%1e7 = (nano%1e9)/100 = nsec/100 + // store nsec for return + IMULL $100, CX + MOVL CX, nsec+8(FP) + + // DI = sec/100 (still) + // BX = sec%100 + // construct DX:AX = 64-bit sec and store for return + MOVL $0, DX + MOVL $100, AX + MULL DI + ADDL BX, AX + ADCL $0, DX + MOVL AX, sec+0(FP) + MOVL DX, sec+4(FP) + RET +useQPC: + JMP runtime·nowQPC(SB) + RET diff --git a/src/runtime/time_windows_amd64.s b/src/runtime/time_windows_amd64.s new file mode 100644 index 0000000..226f2b5 --- /dev/null +++ b/src/runtime/time_windows_amd64.s @@ -0,0 +1,42 @@ +// Copyright 2011 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build !faketime + +#include "go_asm.h" +#include "textflag.h" +#include "time_windows.h" + +TEXT time·now(SB),NOSPLIT,$0-24 + CMPB runtime·useQPCTime(SB), $0 + JNE useQPC + + MOVQ $_INTERRUPT_TIME, DI + MOVQ time_lo(DI), AX + IMULQ $100, AX + MOVQ AX, mono+16(FP) + + MOVQ $_SYSTEM_TIME, DI + MOVQ time_lo(DI), AX + MOVQ $116444736000000000, DI + SUBQ DI, AX + IMULQ $100, AX + + // generated code for + // func f(x uint64) (uint64, uint64) { return x/1000000000, x%1000000000 } + // adapted to reduce duplication + MOVQ AX, CX + MOVQ $1360296554856532783, AX + MULQ CX + ADDQ CX, DX + RCRQ $1, DX + SHRQ $29, DX + MOVQ DX, sec+0(FP) + IMULQ $1000000000, DX + SUBQ DX, CX + MOVL CX, nsec+8(FP) + RET +useQPC: + JMP runtime·nowQPC(SB) + RET diff --git a/src/runtime/time_windows_arm.s b/src/runtime/time_windows_arm.s new file mode 100644 index 0000000..711af88 --- /dev/null +++ b/src/runtime/time_windows_arm.s @@ -0,0 +1,90 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
 + +//go:build !faketime + +#include "go_asm.h" +#include "textflag.h" +#include "time_windows.h" + +TEXT time·now(SB),NOSPLIT|NOFRAME,$0-20 + MOVW $0, R0 + MOVB runtime·useQPCTime(SB), R0 + CMP $0, R0 + BNE useQPC + MOVW $_INTERRUPT_TIME, R3 +loop: + MOVW time_hi1(R3), R1 + DMB MB_ISH + MOVW time_lo(R3), R0 + DMB MB_ISH + MOVW time_hi2(R3), R2 + CMP R1, R2 + BNE loop + + // wintime = R1:R0, multiply by 100 + MOVW $100, R2 + MULLU R0, R2, (R4, R3) // R4:R3 = R1:R0 * R2 + MULA R1, R2, R4, R4 + + // wintime*100 = R4:R3 + MOVW R3, mono+12(FP) + MOVW R4, mono+16(FP) + + MOVW $_SYSTEM_TIME, R3 +wall: + MOVW time_hi1(R3), R1 + DMB MB_ISH + MOVW time_lo(R3), R0 + DMB MB_ISH + MOVW time_hi2(R3), R2 + CMP R1, R2 + BNE wall + + // w = R1:R0 in 100ns units + // convert to Unix epoch (but still 100ns units) + #define delta 116444736000000000 + SUB.S $(delta & 0xFFFFFFFF), R0 + SBC $(delta >> 32), R1 + + // Convert to nSec + MOVW $100, R2 + MULLU R0, R2, (R4, R3) // R4:R3 = R1:R0 * R2 + MULA R1, R2, R4, R4 + // w = R2:R1 in nSec + MOVW R3, R1 // R4:R3 -> R2:R1 + MOVW R4, R2 + + // multiply nanoseconds by reciprocal of 10**9 (scaled by 2**61) + // to get seconds (96 bit scaled result) + MOVW $0x89705f41, R3 // 2**61 * 10**-9 + MULLU R1,R3,(R6,R5) // R7:R6:R5 = R2:R1 * R3 + MOVW $0,R7 + MULALU R2,R3,(R7,R6) + + // unscale by discarding low 32 bits, shifting the rest by 29 + MOVW R6>>29,R6 // R7:R6 = (R7:R6:R5 >> 61) + ORR R7<<3,R6 + MOVW R7>>29,R7 + + // subtract (10**9 * sec) from nsec to get nanosecond remainder + MOVW $1000000000, R5 // 10**9 + MULLU R6,R5,(R9,R8) // R9:R8 = R7:R6 * R5 + MULA R7,R5,R9,R9 + SUB.S R8,R1 // R2:R1 -= R9:R8 + SBC R9,R2 + + // because reciprocal was a truncated repeating fraction, quotient + // may be slightly too small -- adjust to make remainder < 10**9 + CMP R5,R1 // if remainder >= 10**9 + SUB.HS R5,R1 // remainder -= 10**9 + ADD.HS $1,R6 // sec += 1 + + MOVW R6,sec_lo+0(FP) + MOVW R7,sec_hi+4(FP) + MOVW R1,nsec+8(FP) + RET +useQPC: + B 
runtime·nowQPC(SB) // tail call + diff --git a/src/runtime/time_windows_arm64.s b/src/runtime/time_windows_arm64.s new file mode 100644 index 0000000..e0c7d28 --- /dev/null +++ b/src/runtime/time_windows_arm64.s @@ -0,0 +1,47 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !faketime + +#include "go_asm.h" +#include "textflag.h" +#include "time_windows.h" + +TEXT time·now(SB),NOSPLIT|NOFRAME,$0-24 + MOVB runtime·useQPCTime(SB), R0 + CMP $0, R0 + BNE useQPC + + MOVD $_INTERRUPT_TIME, R3 + MOVD time_lo(R3), R0 + MOVD $100, R1 + MUL R1, R0 + MOVD R0, mono+16(FP) + + MOVD $_SYSTEM_TIME, R3 + MOVD time_lo(R3), R0 + // convert to Unix epoch (but still 100ns units) + #define delta 116444736000000000 + SUB $delta, R0 + // Convert to nSec + MOVD $100, R1 + MUL R1, R0 + + // Code stolen from compiler output for: + // + // var x uint64 + // func f() (sec uint64, nsec uint32) { return x / 1000000000, uint32(x % 1000000000) } + // + LSR $1, R0, R1 + MOVD $-8543223759426509416, R2 + UMULH R1, R2, R1 + LSR $28, R1, R1 + MOVD R1, sec+0(FP) + MOVD $1000000000, R2 + MSUB R1, R0, R2, R0 + MOVW R0, nsec+8(FP) + RET +useQPC: + B runtime·nowQPC(SB) // tail call + diff --git a/src/runtime/timeasm.go b/src/runtime/timeasm.go new file mode 100644 index 0000000..0421388 --- /dev/null +++ b/src/runtime/timeasm.go @@ -0,0 +1,14 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Declarations for operating systems implementing time.now directly in assembly. 
+ +//go:build !faketime && (windows || (linux && amd64)) + +package runtime + +import _ "unsafe" + +//go:linkname time_now time.now +func time_now() (sec int64, nsec int32, mono int64) diff --git a/src/runtime/timestub.go b/src/runtime/timestub.go new file mode 100644 index 0000000..1d2926b --- /dev/null +++ b/src/runtime/timestub.go @@ -0,0 +1,18 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Declarations for operating systems implementing time.now +// indirectly, in terms of walltime and nanotime assembly. + +//go:build !faketime && !windows && !(linux && amd64) + +package runtime + +import _ "unsafe" // for go:linkname + +//go:linkname time_now time.now +func time_now() (sec int64, nsec int32, mono int64) { + sec, nsec = walltime() + return sec, nsec, nanotime() +} diff --git a/src/runtime/timestub2.go b/src/runtime/timestub2.go new file mode 100644 index 0000000..b9a5cc6 --- /dev/null +++ b/src/runtime/timestub2.go @@ -0,0 +1,9 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !aix && !darwin && !freebsd && !openbsd && !solaris && !windows && !(linux && amd64) + +package runtime + +func walltime() (sec int64, nsec int32) diff --git a/src/runtime/tls_arm.s b/src/runtime/tls_arm.s new file mode 100644 index 0000000..d224c55 --- /dev/null +++ b/src/runtime/tls_arm.s @@ -0,0 +1,100 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !windows + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// We have to resort to TLS variable to save g(R10). 
+// One reason is that external code might trigger +// SIGSEGV, and our runtime.sigtramp don't even know we +// are in external code, and will continue to use R10, +// this might as well result in another SIGSEGV. +// Note: both functions will clobber R0 and R11 and +// can be called from 5c ABI code. + +// On android, runtime.tls_g is a normal variable. +// TLS offset is computed in x_cgo_inittls. +#ifdef GOOS_android +#define TLSG_IS_VARIABLE +#endif + +// save_g saves the g register into pthread-provided +// thread-local memory, so that we can call externally compiled +// ARM code that will overwrite those registers. +// NOTE: runtime.gogo assumes that R1 is preserved by this function. +// runtime.mcall assumes this function only clobbers R0 and R11. +// Returns with g in R0. +TEXT runtime·save_g(SB),NOSPLIT,$0 + // If the host does not support MRC the linker will replace it with + // a call to runtime.read_tls_fallback which jumps to __kuser_get_tls. + // The replacement function saves LR in R11 over the call to read_tls_fallback. + // To make stack unwinding work, this function should NOT be marked as NOFRAME, + // as it may contain a call, which clobbers LR even just temporarily. + MRC 15, 0, R0, C13, C0, 3 // fetch TLS base pointer + BIC $3, R0 // Darwin/ARM might return unaligned pointer + MOVW runtime·tls_g(SB), R11 + ADD R11, R0 + MOVW g, 0(R0) + MOVW g, R0 // preserve R0 across call to setg<> + RET + +// load_g loads the g register from pthread-provided +// thread-local memory, for use after calling externally compiled +// ARM code that overwrote those registers. +TEXT runtime·load_g(SB),NOSPLIT,$0 + // See save_g + MRC 15, 0, R0, C13, C0, 3 // fetch TLS base pointer + BIC $3, R0 // Darwin/ARM might return unaligned pointer + MOVW runtime·tls_g(SB), R11 + ADD R11, R0 + MOVW 0(R0), g + RET + +// This is called from rt0_go, which runs on the system stack +// using the initial stack allocated by the OS. 
+// It calls back into standard C using the BL (R4) below. +// To do that, the stack pointer must be 8-byte-aligned +// on some systems, notably FreeBSD. +// The ARM ABI says the stack pointer must be 8-byte-aligned +// on entry to any function, but only FreeBSD's C library seems to care. +// The caller was 8-byte aligned, but we push an LR. +// Declare a dummy word ($4, not $0) to make sure the +// frame is 8 bytes and stays 8-byte-aligned. +TEXT runtime·_initcgo(SB),NOSPLIT,$4 + // if there is an _cgo_init, call it. + MOVW _cgo_init(SB), R4 + CMP $0, R4 + B.EQ nocgo + MRC 15, 0, R0, C13, C0, 3 // load TLS base pointer + MOVW R0, R3 // arg 3: TLS base pointer +#ifdef TLSG_IS_VARIABLE + MOVW $runtime·tls_g(SB), R2 // arg 2: &tls_g +#else + MOVW $0, R2 // arg 2: not used when using platform tls +#endif + MOVW $setg_gcc<>(SB), R1 // arg 1: setg + MOVW g, R0 // arg 0: G + BL (R4) // will clobber R0-R3 +nocgo: + RET + +// void setg_gcc(G*); set g called from gcc. +TEXT setg_gcc<>(SB),NOSPLIT,$0 + MOVW R0, g + B runtime·save_g(SB) + +#ifdef TLSG_IS_VARIABLE +#ifdef GOOS_android +// Use the free TLS_SLOT_APP slot #2 on Android Q. +// Earlier androids are set up in gcc_android.c. +DATA runtime·tls_g+0(SB)/4, $8 +#endif +GLOBL runtime·tls_g+0(SB), NOPTR, $4 +#else +GLOBL runtime·tls_g+0(SB), TLSBSS, $4 +#endif diff --git a/src/runtime/tls_arm64.h b/src/runtime/tls_arm64.h new file mode 100644 index 0000000..3aa8c63 --- /dev/null +++ b/src/runtime/tls_arm64.h @@ -0,0 +1,51 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
 + +#ifdef GOOS_android +#define TLS_linux +#define TLSG_IS_VARIABLE +#endif +#ifdef GOOS_linux +#define TLS_linux +#endif +#ifdef TLS_linux +#define MRS_TPIDR_R0 WORD $0xd53bd040 // MRS TPIDR_EL0, R0 +#endif + +#ifdef GOOS_darwin +#define TLS_darwin +#endif +#ifdef GOOS_ios +#define TLS_darwin +#endif +#ifdef TLS_darwin +#define TLSG_IS_VARIABLE +#define MRS_TPIDR_R0 WORD $0xd53bd060 // MRS TPIDRRO_EL0, R0 +#endif + +#ifdef GOOS_freebsd +#define MRS_TPIDR_R0 WORD $0xd53bd040 // MRS TPIDR_EL0, R0 +#endif + +#ifdef GOOS_netbsd +#define MRS_TPIDR_R0 WORD $0xd53bd040 // MRS TPIDR_EL0, R0 +#endif + +#ifdef GOOS_openbsd +#define MRS_TPIDR_R0 WORD $0xd53bd040 // MRS TPIDR_EL0, R0 +#endif + +#ifdef GOOS_windows +#define TLS_windows +#endif +#ifdef TLS_windows +#define TLSG_IS_VARIABLE +#define MRS_TPIDR_R0 MOVD R18_PLATFORM, R0 +#endif + +// Define something that will break the build if +// the GOOS is unknown. +#ifndef MRS_TPIDR_R0 +#define MRS_TPIDR_R0 unknown_TLS_implementation_in_tls_arm64_h +#endif diff --git a/src/runtime/tls_arm64.s b/src/runtime/tls_arm64.s new file mode 100644 index 0000000..52b3e8f --- /dev/null +++ b/src/runtime/tls_arm64.s @@ -0,0 +1,62 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" +#include "tls_arm64.h" + +TEXT runtime·load_g(SB),NOSPLIT,$0 +#ifndef GOOS_darwin +#ifndef GOOS_openbsd +#ifndef GOOS_windows + MOVB runtime·iscgo(SB), R0 + CBZ R0, nocgo +#endif +#endif +#endif + + MRS_TPIDR_R0 +#ifdef TLS_darwin + // Darwin sometimes returns unaligned pointers + AND $0xfffffffffffffff8, R0 +#endif + MOVD runtime·tls_g(SB), R27 + MOVD (R0)(R27), g + +nocgo: + RET + +TEXT runtime·save_g(SB),NOSPLIT,$0 +#ifndef GOOS_darwin +#ifndef GOOS_openbsd +#ifndef GOOS_windows + MOVB runtime·iscgo(SB), R0 + CBZ R0, nocgo +#endif +#endif +#endif + + MRS_TPIDR_R0 +#ifdef TLS_darwin + // Darwin sometimes returns unaligned pointers + AND $0xfffffffffffffff8, R0 +#endif + MOVD runtime·tls_g(SB), R27 + MOVD g, (R0)(R27) + +nocgo: + RET + +#ifdef TLSG_IS_VARIABLE +#ifdef GOOS_android +// Use the free TLS_SLOT_APP slot #2 on Android Q. +// Earlier androids are set up in gcc_android.c. +DATA runtime·tls_g+0(SB)/8, $16 +#endif +GLOBL runtime·tls_g+0(SB), NOPTR, $8 +#else +GLOBL runtime·tls_g+0(SB), TLSBSS, $8 +#endif diff --git a/src/runtime/tls_loong64.s b/src/runtime/tls_loong64.s new file mode 100644 index 0000000..bc3be3d --- /dev/null +++ b/src/runtime/tls_loong64.s @@ -0,0 +1,26 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// If !iscgo, this is a no-op. +// +// NOTE: mcall() assumes this clobbers only R30 (REGTMP). 
+TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·iscgo(SB), R30 + BEQ R30, nocgo + + MOVV g, runtime·tls_g(SB) + +nocgo: + RET + +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVV runtime·tls_g(SB), g + RET + +GLOBL runtime·tls_g(SB), TLSBSS, $8 diff --git a/src/runtime/tls_mips64x.s b/src/runtime/tls_mips64x.s new file mode 100644 index 0000000..ec2748e --- /dev/null +++ b/src/runtime/tls_mips64x.s @@ -0,0 +1,30 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips64 || mips64le + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// If !iscgo, this is a no-op. +// +// NOTE: mcall() assumes this clobbers only R23 (REGTMP). +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·iscgo(SB), R23 + BEQ R23, nocgo + + MOVV R3, R23 // save R3 + MOVV g, runtime·tls_g(SB) // TLS relocation clobbers R3 + MOVV R23, R3 // restore R3 + +nocgo: + RET + +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVV runtime·tls_g(SB), g // TLS relocation clobbers R3 + RET + +GLOBL runtime·tls_g(SB), TLSBSS, $8 diff --git a/src/runtime/tls_mipsx.s b/src/runtime/tls_mipsx.s new file mode 100644 index 0000000..acc3eb5 --- /dev/null +++ b/src/runtime/tls_mipsx.s @@ -0,0 +1,29 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build mips || mipsle + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// If !iscgo, this is a no-op. 
+// NOTE: gogo assumes load_g only clobbers g (R30) and REGTMP (R23) +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·iscgo(SB), R23 + BEQ R23, nocgo + + MOVW R3, R23 + MOVW g, runtime·tls_g(SB) // TLS relocation clobbers R3 + MOVW R23, R3 + +nocgo: + RET + +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVW runtime·tls_g(SB), g // TLS relocation clobbers R3 + RET + +GLOBL runtime·tls_g(SB), TLSBSS, $4 diff --git a/src/runtime/tls_ppc64x.s b/src/runtime/tls_ppc64x.s new file mode 100644 index 0000000..17aec9f --- /dev/null +++ b/src/runtime/tls_ppc64x.s @@ -0,0 +1,51 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ppc64 || ppc64le + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// We have to resort to TLS variable to save g (R30). +// One reason is that external code might trigger +// SIGSEGV, and our runtime.sigtramp don't even know we +// are in external code, and will continue to use R30, +// this might well result in another SIGSEGV. + +// save_g saves the g register into pthread-provided +// thread-local memory, so that we can call externally compiled +// ppc64 code that will overwrite this register. +// +// If !iscgo, this is a no-op. +// +// NOTE: setg_gcc<> assume this clobbers only R31. +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 +#ifndef GOOS_aix + MOVBZ runtime·iscgo(SB), R31 + CMP R31, $0 + BEQ nocgo +#endif + MOVD runtime·tls_g(SB), R31 + MOVD g, 0(R31) + +nocgo: + RET + +// load_g loads the g register from pthread-provided +// thread-local memory, for use after calling externally compiled +// ppc64 code that overwrote those registers. +// +// This is never called directly from C code (it doesn't have to +// follow the C ABI), but it may be called from a C context, where the +// usual Go registers aren't set up. 
+// +// NOTE: _cgo_topofstack assumes this only clobbers g (R30), and R31. +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVD runtime·tls_g(SB), R31 + MOVD 0(R31), g + RET + +GLOBL runtime·tls_g+0(SB), TLSBSS+DUPOK, $8 diff --git a/src/runtime/tls_riscv64.s b/src/runtime/tls_riscv64.s new file mode 100644 index 0000000..397919a --- /dev/null +++ b/src/runtime/tls_riscv64.s @@ -0,0 +1,30 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// If !iscgo, this is a no-op. +// +// NOTE: mcall() assumes this clobbers only X31 (REG_TMP). +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·iscgo(SB), X31 + BEQ X0, X31, nocgo + + MOV runtime·tls_g(SB), X31 + ADD TP, X31 // add offset to thread pointer (X4) + MOV g, (X31) + +nocgo: + RET + +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOV runtime·tls_g(SB), X31 + ADD TP, X31 // add offset to thread pointer (X4) + MOV (X31), g + RET + +GLOBL runtime·tls_g(SB), TLSBSS, $8 diff --git a/src/runtime/tls_s390x.s b/src/runtime/tls_s390x.s new file mode 100644 index 0000000..cb6a21c --- /dev/null +++ b/src/runtime/tls_s390x.s @@ -0,0 +1,51 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// We have to resort to TLS variable to save g (R13). +// One reason is that external code might trigger +// SIGSEGV, and our runtime.sigtramp don't even know we +// are in external code, and will continue to use R13, +// this might well result in another SIGSEGV. 
+ +// save_g saves the g register into pthread-provided +// thread-local memory, so that we can call externally compiled +// s390x code that will overwrite this register. +// +// If !iscgo, this is a no-op. +// +// NOTE: setg_gcc<> assume this clobbers only R10 and R11. +TEXT runtime·save_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVB runtime·iscgo(SB), R10 + CMPBEQ R10, $0, nocgo + MOVW AR0, R11 + SLD $32, R11 + MOVW AR1, R11 + MOVD runtime·tls_g(SB), R10 + MOVD g, 0(R10)(R11*1) +nocgo: + RET + +// load_g loads the g register from pthread-provided +// thread-local memory, for use after calling externally compiled +// s390x code that overwrote those registers. +// +// This is never called directly from C code (it doesn't have to +// follow the C ABI), but it may be called from a C context, where the +// usual Go registers aren't set up. +// +// NOTE: _cgo_topofstack assumes this only clobbers g (R13), R10 and R11. +TEXT runtime·load_g(SB),NOSPLIT|NOFRAME,$0-0 + MOVW AR0, R11 + SLD $32, R11 + MOVW AR1, R11 + MOVD runtime·tls_g(SB), R10 + MOVD 0(R10)(R11*1), g + RET + +GLOBL runtime·tls_g+0(SB),TLSBSS,$8 diff --git a/src/runtime/tls_stub.go b/src/runtime/tls_stub.go new file mode 100644 index 0000000..7bdfc6b --- /dev/null +++ b/src/runtime/tls_stub.go @@ -0,0 +1,10 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (windows && !amd64) || !windows + +package runtime + +//go:nosplit +func osSetupTLS(mp *m) {} diff --git a/src/runtime/tls_windows_amd64.go b/src/runtime/tls_windows_amd64.go new file mode 100644 index 0000000..cacaa84 --- /dev/null +++ b/src/runtime/tls_windows_amd64.go @@ -0,0 +1,10 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// osSetupTLS is called by needm to set up TLS for non-Go threads. 
+// +// Defined in assembly. +func osSetupTLS(mp *m) diff --git a/src/runtime/trace.go b/src/runtime/trace.go new file mode 100644 index 0000000..e7dfab1 --- /dev/null +++ b/src/runtime/trace.go @@ -0,0 +1,1579 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Go execution tracer. +// The tracer captures a wide range of execution events like goroutine +// creation/blocking/unblocking, syscall enter/exit/block, GC-related events, +// changes of heap size, processor start/stop, etc and writes them to a buffer +// in a compact form. A precise nanosecond-precision timestamp and a stack +// trace is captured for most events. +// See https://golang.org/s/go15trace for more info. + +package runtime + +import ( + "internal/goarch" + "runtime/internal/atomic" + "runtime/internal/sys" + "unsafe" +) + +// Event types in the trace, args are given in square brackets. +const ( + traceEvNone = 0 // unused + traceEvBatch = 1 // start of per-P batch of events [pid, timestamp] + traceEvFrequency = 2 // contains tracer timer frequency [frequency (ticks per second)] + traceEvStack = 3 // stack [stack id, number of PCs, array of {PC, func string ID, file string ID, line}] + traceEvGomaxprocs = 4 // current value of GOMAXPROCS [timestamp, GOMAXPROCS, stack id] + traceEvProcStart = 5 // start of P [timestamp, thread id] + traceEvProcStop = 6 // stop of P [timestamp] + traceEvGCStart = 7 // GC start [timestamp, seq, stack id] + traceEvGCDone = 8 // GC done [timestamp] + traceEvGCSTWStart = 9 // GC STW start [timestamp, kind] + traceEvGCSTWDone = 10 // GC STW done [timestamp] + traceEvGCSweepStart = 11 // GC sweep start [timestamp, stack id] + traceEvGCSweepDone = 12 // GC sweep done [timestamp, swept, reclaimed] + traceEvGoCreate = 13 // goroutine creation [timestamp, new goroutine id, new stack id, stack id] + traceEvGoStart = 14 // goroutine starts running [timestamp, 
goroutine id, seq] + traceEvGoEnd = 15 // goroutine ends [timestamp] + traceEvGoStop = 16 // goroutine stops (like in select{}) [timestamp, stack] + traceEvGoSched = 17 // goroutine calls Gosched [timestamp, stack] + traceEvGoPreempt = 18 // goroutine is preempted [timestamp, stack] + traceEvGoSleep = 19 // goroutine calls Sleep [timestamp, stack] + traceEvGoBlock = 20 // goroutine blocks [timestamp, stack] + traceEvGoUnblock = 21 // goroutine is unblocked [timestamp, goroutine id, seq, stack] + traceEvGoBlockSend = 22 // goroutine blocks on chan send [timestamp, stack] + traceEvGoBlockRecv = 23 // goroutine blocks on chan recv [timestamp, stack] + traceEvGoBlockSelect = 24 // goroutine blocks on select [timestamp, stack] + traceEvGoBlockSync = 25 // goroutine blocks on Mutex/RWMutex [timestamp, stack] + traceEvGoBlockCond = 26 // goroutine blocks on Cond [timestamp, stack] + traceEvGoBlockNet = 27 // goroutine blocks on network [timestamp, stack] + traceEvGoSysCall = 28 // syscall enter [timestamp, stack] + traceEvGoSysExit = 29 // syscall exit [timestamp, goroutine id, seq, real timestamp] + traceEvGoSysBlock = 30 // syscall blocks [timestamp] + traceEvGoWaiting = 31 // denotes that goroutine is blocked when tracing starts [timestamp, goroutine id] + traceEvGoInSyscall = 32 // denotes that goroutine is in syscall when tracing starts [timestamp, goroutine id] + traceEvHeapAlloc = 33 // gcController.heapLive change [timestamp, heap_alloc] + traceEvHeapGoal = 34 // gcController.heapGoal() (formerly next_gc) change [timestamp, heap goal in bytes] + traceEvTimerGoroutine = 35 // not currently used; previously denoted timer goroutine [timer goroutine id] + traceEvFutileWakeup = 36 // denotes that the previous wakeup of this goroutine was futile [timestamp] + traceEvString = 37 // string dictionary entry [ID, length, string] + traceEvGoStartLocal = 38 // goroutine starts running on the same P as the last event [timestamp, goroutine id] + traceEvGoUnblockLocal = 39 // 
goroutine is unblocked on the same P as the last event [timestamp, goroutine id, stack] + traceEvGoSysExitLocal = 40 // syscall exit on the same P as the last event [timestamp, goroutine id, real timestamp] + traceEvGoStartLabel = 41 // goroutine starts running with label [timestamp, goroutine id, seq, label string id] + traceEvGoBlockGC = 42 // goroutine blocks on GC assist [timestamp, stack] + traceEvGCMarkAssistStart = 43 // GC mark assist start [timestamp, stack] + traceEvGCMarkAssistDone = 44 // GC mark assist done [timestamp] + traceEvUserTaskCreate = 45 // trace.NewContext [timestamp, internal task id, internal parent task id, stack, name string] + traceEvUserTaskEnd = 46 // end of a task [timestamp, internal task id, stack] + traceEvUserRegion = 47 // trace.WithRegion [timestamp, internal task id, mode(0:start, 1:end), stack, name string] + traceEvUserLog = 48 // trace.Log [timestamp, internal task id, key string id, stack, value string] + traceEvCPUSample = 49 // CPU profiling sample [timestamp, stack, real timestamp, real P id (-1 when absent), goroutine id] + traceEvCount = 50 + // Byte is used but only 6 bits are available for event type. + // The remaining 2 bits are used to specify the number of arguments. + // That means, the max event type value is 63. +) + +const ( + // Timestamps in trace are cputicks/traceTickDiv. + // This makes absolute values of timestamp diffs smaller, + // and so they are encoded in less number of bytes. + // 64 on x86 is somewhat arbitrary (one tick is ~20ns on a 3GHz machine). + // The suggested increment frequency for PowerPC's time base register is + // 512 MHz according to Power ISA v2.07 section 6.2, so we use 16 on ppc64 + // and ppc64le. + // Tracing won't work reliably for architectures where cputicks is emulated + // by nanotime, so the value doesn't matter for those architectures. + traceTickDiv = 16 + 48*(goarch.Is386|goarch.IsAmd64) + // Maximum number of PCs in a single stack trace. 
+ // Since events contain only stack id rather than whole stack trace, + // we can allow quite large values here. + traceStackSize = 128 + // Identifier of a fake P that is used when we trace without a real P. + traceGlobProc = -1 + // Maximum number of bytes to encode uint64 in base-128. + traceBytesPerNumber = 10 + // Shift of the number of arguments in the first event byte. + traceArgCountShift = 6 + // Flag passed to traceGoPark to denote that the previous wakeup of this + // goroutine was futile. For example, a goroutine was unblocked on a mutex, + // but another goroutine got ahead and acquired the mutex before the first + // goroutine is scheduled, so the first goroutine has to block again. + // Such wakeups happen on buffered channels and sync.Mutex, + // but are generally not interesting for end user. + traceFutileWakeup byte = 128 +) + +// trace is global tracing context. +var trace struct { + // trace.lock must only be acquired on the system stack where + // stack splits cannot happen while it is held. 
+ lock mutex // protects the following members + lockOwner *g // to avoid deadlocks during recursive lock locks + enabled bool // when set runtime traces events + shutdown bool // set when we are waiting for trace reader to finish after setting enabled to false + headerWritten bool // whether ReadTrace has emitted trace header + footerWritten bool // whether ReadTrace has emitted trace footer + shutdownSema uint32 // used to wait for ReadTrace completion + seqStart uint64 // sequence number when tracing was started + ticksStart int64 // cputicks when tracing was started + ticksEnd int64 // cputicks when tracing was stopped + timeStart int64 // nanotime when tracing was started + timeEnd int64 // nanotime when tracing was stopped + seqGC uint64 // GC start/done sequencer + reading traceBufPtr // buffer currently handed off to user + empty traceBufPtr // stack of empty buffers + fullHead traceBufPtr // queue of full buffers + fullTail traceBufPtr + stackTab traceStackTable // maps stack traces to unique ids + // cpuLogRead accepts CPU profile samples from the signal handler where + // they're generated. It uses a two-word header to hold the IDs of the P and + // G (respectively) that were active at the time of the sample. Because + // profBuf uses a record with all zeros in its header to indicate overflow, + // we make sure to make the P field always non-zero: The ID of a real P will + // start at bit 1, and bit 0 will be set. Samples that arrive while no P is + // running (such as near syscalls) will set the first header field to 0b10. + // This careful handling of the first header field allows us to store ID of + // the active G directly in the second field, even though that will be 0 + // when sampling g0. + cpuLogRead *profBuf + // cpuLogBuf is a trace buffer to hold events corresponding to CPU profile + // samples, which arrive out of band and not directly connected to a + // specific P. 
+ cpuLogBuf traceBufPtr + + reader atomic.Pointer[g] // goroutine that called ReadTrace, or nil + + signalLock atomic.Uint32 // protects use of the following member, only usable in signal handlers + cpuLogWrite *profBuf // copy of cpuLogRead for use in signal handlers, set without signalLock + + // Dictionary for traceEvString. + // + // TODO: central lock to access the map is not ideal. + // option: pre-assign ids to all user annotation region names and tags + // option: per-P cache + // option: sync.Map like data structure + stringsLock mutex + strings map[string]uint64 + stringSeq uint64 + + // markWorkerLabels maps gcMarkWorkerMode to string ID. + markWorkerLabels [len(gcMarkWorkerModeStrings)]uint64 + + bufLock mutex // protects buf + buf traceBufPtr // global trace buffer, used when running without a p +} + +// traceBufHeader is per-P tracing buffer. +type traceBufHeader struct { + link traceBufPtr // in trace.empty/full + lastTicks uint64 // when we wrote the last event + pos int // next write offset in arr + stk [traceStackSize]uintptr // scratch buffer for traceback +} + +// traceBuf is per-P tracing buffer. +type traceBuf struct { + _ sys.NotInHeap + traceBufHeader + arr [64<<10 - unsafe.Sizeof(traceBufHeader{})]byte // underlying buffer for traceBufHeader.buf +} + +// traceBufPtr is a *traceBuf that is not traced by the garbage +// collector and doesn't have write barriers. traceBufs are not +// allocated from the GC'd heap, so this is safe, and are often +// manipulated in contexts where write barriers are not allowed, so +// this is necessary. +// +// TODO: Since traceBuf is now embedded runtime/internal/sys.NotInHeap, this isn't necessary. 
+type traceBufPtr uintptr + +func (tp traceBufPtr) ptr() *traceBuf { return (*traceBuf)(unsafe.Pointer(tp)) } +func (tp *traceBufPtr) set(b *traceBuf) { *tp = traceBufPtr(unsafe.Pointer(b)) } +func traceBufPtrOf(b *traceBuf) traceBufPtr { + return traceBufPtr(unsafe.Pointer(b)) +} + +// StartTrace enables tracing for the current process. +// While tracing, the data will be buffered and available via ReadTrace. +// StartTrace returns an error if tracing is already enabled. +// Most clients should use the runtime/trace package or the testing package's +// -test.trace flag instead of calling StartTrace directly. +func StartTrace() error { + // Stop the world so that we can take a consistent snapshot + // of all goroutines at the beginning of the trace. + // Do not stop the world during GC so we ensure we always see + // a consistent view of GC-related events (e.g. a start is always + // paired with an end). + stopTheWorldGC("start tracing") + + // Prevent sysmon from running any code that could generate events. + lock(&sched.sysmonlock) + + // We are in stop-the-world, but syscalls can finish and write to trace concurrently. + // Exitsyscall could check trace.enabled long before and then suddenly wake up + // and decide to write to trace at a random point in time. + // However, such syscall will use the global trace.buf buffer, because we've + // acquired all p's by doing stop-the-world. So this protects us from such races. + lock(&trace.bufLock) + + if trace.enabled || trace.shutdown { + unlock(&trace.bufLock) + unlock(&sched.sysmonlock) + startTheWorldGC() + return errorString("tracing is already enabled") + } + + // Can't set trace.enabled yet. While the world is stopped, exitsyscall could + // already emit a delayed event (see exitTicks in exitsyscall) if we set trace.enabled here. 
+ // That would lead to an inconsistent trace: + // - either GoSysExit appears before EvGoInSyscall, + // - or GoSysExit appears for a goroutine for which we don't emit EvGoInSyscall below. + // To instruct traceEvent that it must not ignore events below, we set startingtrace. + // trace.enabled is set afterwards once we have emitted all preliminary events. + mp := getg().m + mp.startingtrace = true + + // Obtain current stack ID to use in all traceEvGoCreate events below. + stkBuf := make([]uintptr, traceStackSize) + stackID := traceStackID(mp, stkBuf, 2) + + profBuf := newProfBuf(2, profBufWordCount, profBufTagCount) // after the timestamp, header is [pp.id, gp.goid] + trace.cpuLogRead = profBuf + + // We must not acquire trace.signalLock outside of a signal handler: a + // profiling signal may arrive at any time and try to acquire it, leading to + // deadlock. Because we can't use that lock to protect updates to + // trace.cpuLogWrite (only use of the structure it references), reads and + // writes of the pointer must be atomic. (And although this field is never + // the sole pointer to the profBuf value, it's best to allow a write barrier + // here.) + atomicstorep(unsafe.Pointer(&trace.cpuLogWrite), unsafe.Pointer(profBuf)) + + // World is stopped, no need to lock. + forEachGRace(func(gp *g) { + status := readgstatus(gp) + if status != _Gdead { + gp.traceseq = 0 + gp.tracelastp = getg().m.p + // +PCQuantum because traceFrameForPC expects return PCs and subtracts PCQuantum. + id := trace.stackTab.put([]uintptr{startPCforTrace(gp.startpc) + sys.PCQuantum}) + traceEvent(traceEvGoCreate, -1, gp.goid, uint64(id), stackID) + } + if status == _Gwaiting { + // traceEvGoWaiting is implied to have seq=1. 
+ gp.traceseq++ + traceEvent(traceEvGoWaiting, -1, gp.goid) + } + if status == _Gsyscall { + gp.traceseq++ + traceEvent(traceEvGoInSyscall, -1, gp.goid) + } else if status == _Gdead && gp.m != nil && gp.m.isextra { + // Trigger two trace events for the dead g in the extra m, + // since the next event of the g will be traceEvGoSysExit in exitsyscall, + // while calling from C thread to Go. + gp.traceseq = 0 + gp.tracelastp = getg().m.p + // +PCQuantum because traceFrameForPC expects return PCs and subtracts PCQuantum. + id := trace.stackTab.put([]uintptr{startPCforTrace(0) + sys.PCQuantum}) // no start pc + traceEvent(traceEvGoCreate, -1, gp.goid, uint64(id), stackID) + gp.traceseq++ + traceEvent(traceEvGoInSyscall, -1, gp.goid) + } else { + gp.sysblocktraced = false + } + }) + traceProcStart() + traceGoStart() + // Note: ticksStart needs to be set after we emit traceEvGoInSyscall events. + // If we do it the other way around, it is possible that exitsyscall will + // query sysexitticks after ticksStart but before traceEvGoInSyscall timestamp. + // It will lead to a false conclusion that cputicks is broken. + trace.ticksStart = cputicks() + trace.timeStart = nanotime() + trace.headerWritten = false + trace.footerWritten = false + + // string to id mapping + // 0 : reserved for an empty string + // remaining: other strings registered by traceString + trace.stringSeq = 0 + trace.strings = make(map[string]uint64) + + trace.seqGC = 0 + mp.startingtrace = false + trace.enabled = true + + // Register runtime goroutine labels. + _, pid, bufp := traceAcquireBuffer() + for i, label := range gcMarkWorkerModeStrings[:] { + trace.markWorkerLabels[i], bufp = traceString(bufp, pid, label) + } + traceReleaseBuffer(pid) + + unlock(&trace.bufLock) + + unlock(&sched.sysmonlock) + + startTheWorldGC() + return nil +} + +// StopTrace stops tracing, if it was previously enabled. +// StopTrace only returns after all the reads for the trace have completed. 
+func StopTrace() { + // Stop the world so that we can collect the trace buffers from all p's below, + // and also to avoid races with traceEvent. + stopTheWorldGC("stop tracing") + + // See the comment in StartTrace. + lock(&sched.sysmonlock) + + // See the comment in StartTrace. + lock(&trace.bufLock) + + if !trace.enabled { + unlock(&trace.bufLock) + unlock(&sched.sysmonlock) + startTheWorldGC() + return + } + + traceGoSched() + + atomicstorep(unsafe.Pointer(&trace.cpuLogWrite), nil) + trace.cpuLogRead.close() + traceReadCPU() + + // Loop over all allocated Ps because dead Ps may still have + // trace buffers. + for _, p := range allp[:cap(allp)] { + buf := p.tracebuf + if buf != 0 { + traceFullQueue(buf) + p.tracebuf = 0 + } + } + if trace.buf != 0 { + buf := trace.buf + trace.buf = 0 + if buf.ptr().pos != 0 { + traceFullQueue(buf) + } + } + if trace.cpuLogBuf != 0 { + buf := trace.cpuLogBuf + trace.cpuLogBuf = 0 + if buf.ptr().pos != 0 { + traceFullQueue(buf) + } + } + + for { + trace.ticksEnd = cputicks() + trace.timeEnd = nanotime() + // Windows time can tick only every 15ms, wait for at least one tick. + if trace.timeEnd != trace.timeStart { + break + } + osyield() + } + + trace.enabled = false + trace.shutdown = true + unlock(&trace.bufLock) + + unlock(&sched.sysmonlock) + + startTheWorldGC() + + // The world is started but we've set trace.shutdown, so new tracing can't start. + // Wait for the trace reader to flush pending buffers and stop. + semacquire(&trace.shutdownSema) + if raceenabled { + raceacquire(unsafe.Pointer(&trace.shutdownSema)) + } + + systemstack(func() { + // The lock protects us from races with StartTrace/StopTrace because they do stop-the-world. 
+ lock(&trace.lock) + for _, p := range allp[:cap(allp)] { + if p.tracebuf != 0 { + throw("trace: non-empty trace buffer in proc") + } + } + if trace.buf != 0 { + throw("trace: non-empty global trace buffer") + } + if trace.fullHead != 0 || trace.fullTail != 0 { + throw("trace: non-empty full trace buffer") + } + if trace.reading != 0 || trace.reader.Load() != nil { + throw("trace: reading after shutdown") + } + for trace.empty != 0 { + buf := trace.empty + trace.empty = buf.ptr().link + sysFree(unsafe.Pointer(buf), unsafe.Sizeof(*buf.ptr()), &memstats.other_sys) + } + trace.strings = nil + trace.shutdown = false + trace.cpuLogRead = nil + unlock(&trace.lock) + }) +} + +// ReadTrace returns the next chunk of binary tracing data, blocking until data +// is available. If tracing is turned off and all the data accumulated while it +// was on has been returned, ReadTrace returns nil. The caller must copy the +// returned data before calling ReadTrace again. +// ReadTrace must be called from one goroutine at a time. +func ReadTrace() []byte { +top: + var buf []byte + var park bool + systemstack(func() { + buf, park = readTrace0() + }) + if park { + gopark(func(gp *g, _ unsafe.Pointer) bool { + if !trace.reader.CompareAndSwapNoWB(nil, gp) { + // We're racing with another reader. + // Wake up and handle this case. + return false + } + + if g2 := traceReader(); gp == g2 { + // New data arrived between unlocking + // and the CAS and we won the wake-up + // race, so wake up directly. + return false + } else if g2 != nil { + printlock() + println("runtime: got trace reader", g2, g2.goid) + throw("unexpected trace reader") + } + + return true + }, nil, waitReasonTraceReaderBlocked, traceEvGoBlock, 2) + goto top + } + + return buf +} + +// readTrace0 is ReadTrace's continuation on g0. This must run on the +// system stack because it acquires trace.lock. +// +//go:systemstack +func readTrace0() (buf []byte, park bool) { + if raceenabled { + // g0 doesn't have a race context. 
Borrow the user G's. + if getg().racectx != 0 { + throw("expected racectx == 0") + } + getg().racectx = getg().m.curg.racectx + // (This defer should get open-coded, which is safe on + // the system stack.) + defer func() { getg().racectx = 0 }() + } + + // This function may need to lock trace.lock recursively + // (goparkunlock -> traceGoPark -> traceEvent -> traceFlush). + // To allow this we use trace.lockOwner. + // Also this function must not allocate while holding trace.lock: + // allocation can call heap allocate, which will try to emit a trace + // event while holding heap lock. + lock(&trace.lock) + trace.lockOwner = getg().m.curg + + if trace.reader.Load() != nil { + // More than one goroutine reads trace. This is bad. + // But we rather do not crash the program because of tracing, + // because tracing can be enabled at runtime on prod servers. + trace.lockOwner = nil + unlock(&trace.lock) + println("runtime: ReadTrace called from multiple goroutines simultaneously") + return nil, false + } + // Recycle the old buffer. + if buf := trace.reading; buf != 0 { + buf.ptr().link = trace.empty + trace.empty = buf + trace.reading = 0 + } + // Write trace header. + if !trace.headerWritten { + trace.headerWritten = true + trace.lockOwner = nil + unlock(&trace.lock) + return []byte("go 1.19 trace\x00\x00\x00"), false + } + // Optimistically look for CPU profile samples. This may write new stack + // records, and may write new tracing buffers. + if !trace.footerWritten && !trace.shutdown { + traceReadCPU() + } + // Wait for new data. + if trace.fullHead == 0 && !trace.shutdown { + // We don't simply use a note because the scheduler + // executes this goroutine directly when it wakes up + // (also a note would consume an M). + trace.lockOwner = nil + unlock(&trace.lock) + return nil, true + } +newFull: + assertLockHeld(&trace.lock) + // Write a buffer. 
+ if trace.fullHead != 0 { + buf := traceFullDequeue() + trace.reading = buf + trace.lockOwner = nil + unlock(&trace.lock) + return buf.ptr().arr[:buf.ptr().pos], false + } + + // Write footer with timer frequency. + if !trace.footerWritten { + trace.footerWritten = true + // Use float64 because (trace.ticksEnd - trace.ticksStart) * 1e9 can overflow int64. + freq := float64(trace.ticksEnd-trace.ticksStart) * 1e9 / float64(trace.timeEnd-trace.timeStart) / traceTickDiv + if freq <= 0 { + throw("trace: ReadTrace got invalid frequency") + } + trace.lockOwner = nil + unlock(&trace.lock) + + // Write frequency event. + bufp := traceFlush(0, 0) + buf := bufp.ptr() + buf.byte(traceEvFrequency | 0<<traceArgCountShift) + buf.varint(uint64(freq)) + + // Dump stack table. + // This will emit a bunch of full buffers, we will pick them up + // on the next iteration. + bufp = trace.stackTab.dump(bufp) + + // Flush final buffer. + lock(&trace.lock) + traceFullQueue(bufp) + goto newFull // trace.lock should be held at newFull + } + // Done. + if trace.shutdown { + trace.lockOwner = nil + unlock(&trace.lock) + if raceenabled { + // Model synchronization on trace.shutdownSema, which race + // detector does not see. This is required to avoid false + // race reports on writer passed to trace.Start. + racerelease(unsafe.Pointer(&trace.shutdownSema)) + } + // trace.enabled is already reset, so can call traceable functions. + semrelease(&trace.shutdownSema) + return nil, false + } + // Also bad, but see the comment above. + trace.lockOwner = nil + unlock(&trace.lock) + println("runtime: spurious wakeup of trace reader") + return nil, false +} + +// traceReader returns the trace reader that should be woken up, if any. +// Callers should first check that trace.enabled or trace.shutdown is set. +// +// This must run on the system stack because it acquires trace.lock. 
+// +//go:systemstack +func traceReader() *g { + // Optimistic check first + if traceReaderAvailable() == nil { + return nil + } + lock(&trace.lock) + gp := traceReaderAvailable() + if gp == nil || !trace.reader.CompareAndSwapNoWB(gp, nil) { + unlock(&trace.lock) + return nil + } + unlock(&trace.lock) + return gp +} + +// traceReaderAvailable returns the trace reader if it is not currently +// scheduled and should be. Callers should first check that trace.enabled +// or trace.shutdown is set. +func traceReaderAvailable() *g { + if trace.fullHead != 0 || trace.shutdown { + return trace.reader.Load() + } + return nil +} + +// traceProcFree frees trace buffer associated with pp. +// +// This must run on the system stack because it acquires trace.lock. +// +//go:systemstack +func traceProcFree(pp *p) { + buf := pp.tracebuf + pp.tracebuf = 0 + if buf == 0 { + return + } + lock(&trace.lock) + traceFullQueue(buf) + unlock(&trace.lock) +} + +// traceFullQueue queues buf into queue of full buffers. +func traceFullQueue(buf traceBufPtr) { + buf.ptr().link = 0 + if trace.fullHead == 0 { + trace.fullHead = buf + } else { + trace.fullTail.ptr().link = buf + } + trace.fullTail = buf +} + +// traceFullDequeue dequeues from queue of full buffers. +func traceFullDequeue() traceBufPtr { + buf := trace.fullHead + if buf == 0 { + return 0 + } + trace.fullHead = buf.ptr().link + if trace.fullHead == 0 { + trace.fullTail = 0 + } + buf.ptr().link = 0 + return buf +} + +// traceEvent writes a single event to trace buffer, flushing the buffer if necessary. +// ev is event type. +// If skip > 0, write current stack id as the last argument (skipping skip top frames). +// If skip = 0, this event type should contain a stack, but we don't want +// to collect and remember it for this particular call. +func traceEvent(ev byte, skip int, args ...uint64) { + mp, pid, bufp := traceAcquireBuffer() + // Double-check trace.enabled now that we've done m.locks++ and acquired bufLock. 
+ // This protects from races between traceEvent and StartTrace/StopTrace. + + // The caller checked that trace.enabled == true, but trace.enabled might have been + // turned off between the check and now. Check again. traceLockBuffer did mp.locks++, + // StopTrace does stopTheWorld, and stopTheWorld waits for mp.locks to go back to zero, + // so if we see trace.enabled == true now, we know it's true for the rest of the function. + // Exitsyscall can run even during stopTheWorld. The race with StartTrace/StopTrace + // during tracing in exitsyscall is resolved by locking trace.bufLock in traceLockBuffer. + // + // Note trace_userTaskCreate runs the same check. + if !trace.enabled && !mp.startingtrace { + traceReleaseBuffer(pid) + return + } + + if skip > 0 { + if getg() == mp.curg { + skip++ // +1 because stack is captured in traceEventLocked. + } + } + traceEventLocked(0, mp, pid, bufp, ev, 0, skip, args...) + traceReleaseBuffer(pid) +} + +// traceEventLocked writes a single event of type ev to the trace buffer bufp, +// flushing the buffer if necessary. pid is the id of the current P, or +// traceGlobProc if we're tracing without a real P. +// +// Preemption is disabled, and if running without a real P the global tracing +// buffer is locked. +// +// Events types that do not include a stack set skip to -1. Event types that +// include a stack may explicitly reference a stackID from the trace.stackTab +// (obtained by an earlier call to traceStackID). Without an explicit stackID, +// this function will automatically capture the stack of the goroutine currently +// running on mp, skipping skip top frames or, if skip is 0, writing out an +// empty stack record. +// +// It records the event's args to the traceBuf, and also makes an effort to +// reserve extraBytes bytes of additional space immediately following the event, +// in the same traceBuf. 
+func traceEventLocked(extraBytes int, mp *m, pid int32, bufp *traceBufPtr, ev byte, stackID uint32, skip int, args ...uint64) { + buf := bufp.ptr() + // TODO: test on non-zero extraBytes param. + maxSize := 2 + 5*traceBytesPerNumber + extraBytes // event type, length, sequence, timestamp, stack id and two add params + if buf == nil || len(buf.arr)-buf.pos < maxSize { + systemstack(func() { + buf = traceFlush(traceBufPtrOf(buf), pid).ptr() + }) + bufp.set(buf) + } + + // NOTE: ticks might be same after tick division, although the real cputicks is + // linear growth. + ticks := uint64(cputicks()) / traceTickDiv + tickDiff := ticks - buf.lastTicks + if tickDiff == 0 { + ticks = buf.lastTicks + 1 + tickDiff = 1 + } + + buf.lastTicks = ticks + narg := byte(len(args)) + if stackID != 0 || skip >= 0 { + narg++ + } + // We have only 2 bits for number of arguments. + // If number is >= 3, then the event type is followed by event length in bytes. + if narg > 3 { + narg = 3 + } + startPos := buf.pos + buf.byte(ev | narg<<traceArgCountShift) + var lenp *byte + if narg == 3 { + // Reserve the byte for length assuming that length < 128. + buf.varint(0) + lenp = &buf.arr[buf.pos-1] + } + buf.varint(tickDiff) + for _, a := range args { + buf.varint(a) + } + if stackID != 0 { + buf.varint(uint64(stackID)) + } else if skip == 0 { + buf.varint(0) + } else if skip > 0 { + buf.varint(traceStackID(mp, buf.stk[:], skip)) + } + evSize := buf.pos - startPos + if evSize > maxSize { + throw("invalid length of trace event") + } + if lenp != nil { + // Fill in actual length. + *lenp = byte(evSize - 2) + } +} + +// traceCPUSample writes a CPU profile sample stack to the execution tracer's +// profiling buffer. It is called from a signal handler, so is limited in what +// it can do. +func traceCPUSample(gp *g, pp *p, stk []uintptr) { + if !trace.enabled { + // Tracing is usually turned off; don't spend time acquiring the signal + // lock unless it's active. 
+ return + } + + // Match the clock used in traceEventLocked + now := cputicks() + // The "header" here is the ID of the P that was running the profiled code, + // followed by the ID of the goroutine. (For normal CPU profiling, it's + // usually the number of samples with the given stack.) Near syscalls, pp + // may be nil. Reporting goid of 0 is fine for either g0 or a nil gp. + var hdr [2]uint64 + if pp != nil { + // Overflow records in profBuf have all header values set to zero. Make + // sure that real headers have at least one bit set. + hdr[0] = uint64(pp.id)<<1 | 0b1 + } else { + hdr[0] = 0b10 + } + if gp != nil { + hdr[1] = gp.goid + } + + // Allow only one writer at a time + for !trace.signalLock.CompareAndSwap(0, 1) { + // TODO: Is it safe to osyield here? https://go.dev/issue/52672 + osyield() + } + + if log := (*profBuf)(atomic.Loadp(unsafe.Pointer(&trace.cpuLogWrite))); log != nil { + // Note: we don't pass a tag pointer here (how should profiling tags + // interact with the execution tracer?), but if we did we'd need to be + // careful about write barriers. See the long comment in profBuf.write. 
+ log.write(nil, now, hdr[:], stk) + } + + trace.signalLock.Store(0) +} + +func traceReadCPU() { + bufp := &trace.cpuLogBuf + + for { + data, tags, _ := trace.cpuLogRead.read(profBufNonBlocking) + if len(data) == 0 { + break + } + for len(data) > 0 { + if len(data) < 4 || data[0] > uint64(len(data)) { + break // truncated profile + } + if data[0] < 4 || tags != nil && len(tags) < 1 { + break // malformed profile + } + if len(tags) < 1 { + break // mismatched profile records and tags + } + timestamp := data[1] + ppid := data[2] >> 1 + if hasP := (data[2] & 0b1) != 0; !hasP { + ppid = ^uint64(0) + } + goid := data[3] + stk := data[4:data[0]] + empty := len(stk) == 1 && data[2] == 0 && data[3] == 0 + data = data[data[0]:] + // No support here for reporting goroutine tags at the moment; if + // that information is to be part of the execution trace, we'd + // probably want to see when the tags are applied and when they + // change, instead of only seeing them when we get a CPU sample. + tags = tags[1:] + + if empty { + // Looks like an overflow record from the profBuf. Not much to + // do here, we only want to report full records. + // + // TODO: should we start a goroutine to drain the profBuf, + // rather than relying on a high-enough volume of tracing events + // to keep ReadTrace busy? 
https://go.dev/issue/52674 + continue + } + + buf := bufp.ptr() + if buf == nil { + systemstack(func() { + *bufp = traceFlush(*bufp, 0) + }) + buf = bufp.ptr() + } + for i := range stk { + if i >= len(buf.stk) { + break + } + buf.stk[i] = uintptr(stk[i]) + } + stackID := trace.stackTab.put(buf.stk[:len(stk)]) + + traceEventLocked(0, nil, 0, bufp, traceEvCPUSample, stackID, 1, timestamp/traceTickDiv, ppid, goid) + } + } +} + +func traceStackID(mp *m, buf []uintptr, skip int) uint64 { + gp := getg() + curgp := mp.curg + var nstk int + if curgp == gp { + nstk = callers(skip+1, buf) + } else if curgp != nil { + nstk = gcallers(curgp, skip, buf) + } + if nstk > 0 { + nstk-- // skip runtime.goexit + } + if nstk > 0 && curgp.goid == 1 { + nstk-- // skip runtime.main + } + id := trace.stackTab.put(buf[:nstk]) + return uint64(id) +} + +// traceAcquireBuffer returns trace buffer to use and, if necessary, locks it. +func traceAcquireBuffer() (mp *m, pid int32, bufp *traceBufPtr) { + // Any time we acquire a buffer, we may end up flushing it, + // but flushes are rare. Record the lock edge even if it + // doesn't happen this time. + lockRankMayTraceFlush() + + mp = acquirem() + if p := mp.p.ptr(); p != nil { + return mp, p.id, &p.tracebuf + } + lock(&trace.bufLock) + return mp, traceGlobProc, &trace.buf +} + +// traceReleaseBuffer releases a buffer previously acquired with traceAcquireBuffer. +func traceReleaseBuffer(pid int32) { + if pid == traceGlobProc { + unlock(&trace.bufLock) + } + releasem(getg().m) +} + +// lockRankMayTraceFlush records the lock ranking effects of a +// potential call to traceFlush. +func lockRankMayTraceFlush() { + owner := trace.lockOwner + dolock := owner == nil || owner != getg().m.curg + if dolock { + lockWithRankMayAcquire(&trace.lock, getLockRank(&trace.lock)) + } +} + +// traceFlush puts buf onto stack of full buffers and returns an empty buffer. +// +// This must run on the system stack because it acquires trace.lock. 
+// +//go:systemstack +func traceFlush(buf traceBufPtr, pid int32) traceBufPtr { + owner := trace.lockOwner + dolock := owner == nil || owner != getg().m.curg + if dolock { + lock(&trace.lock) + } + if buf != 0 { + traceFullQueue(buf) + } + if trace.empty != 0 { + buf = trace.empty + trace.empty = buf.ptr().link + } else { + buf = traceBufPtr(sysAlloc(unsafe.Sizeof(traceBuf{}), &memstats.other_sys)) + if buf == 0 { + throw("trace: out of memory") + } + } + bufp := buf.ptr() + bufp.link.set(nil) + bufp.pos = 0 + + // initialize the buffer for a new batch + ticks := uint64(cputicks()) / traceTickDiv + if ticks == bufp.lastTicks { + ticks = bufp.lastTicks + 1 + } + bufp.lastTicks = ticks + bufp.byte(traceEvBatch | 1<<traceArgCountShift) + bufp.varint(uint64(pid)) + bufp.varint(ticks) + + if dolock { + unlock(&trace.lock) + } + return buf +} + +// traceString adds a string to the trace.strings and returns the id. +func traceString(bufp *traceBufPtr, pid int32, s string) (uint64, *traceBufPtr) { + if s == "" { + return 0, bufp + } + + lock(&trace.stringsLock) + if raceenabled { + // raceacquire is necessary because the map access + // below is race annotated. + raceacquire(unsafe.Pointer(&trace.stringsLock)) + } + + if id, ok := trace.strings[s]; ok { + if raceenabled { + racerelease(unsafe.Pointer(&trace.stringsLock)) + } + unlock(&trace.stringsLock) + + return id, bufp + } + + trace.stringSeq++ + id := trace.stringSeq + trace.strings[s] = id + + if raceenabled { + racerelease(unsafe.Pointer(&trace.stringsLock)) + } + unlock(&trace.stringsLock) + + // memory allocation in above may trigger tracing and + // cause *bufp changes. Following code now works with *bufp, + // so there must be no memory allocation or any activities + // that causes tracing after this point. 
+ + buf := bufp.ptr() + size := 1 + 2*traceBytesPerNumber + len(s) + if buf == nil || len(buf.arr)-buf.pos < size { + systemstack(func() { + buf = traceFlush(traceBufPtrOf(buf), pid).ptr() + bufp.set(buf) + }) + } + buf.byte(traceEvString) + buf.varint(id) + + // double-check the string and the length can fit. + // Otherwise, truncate the string. + slen := len(s) + if room := len(buf.arr) - buf.pos; room < slen+traceBytesPerNumber { + slen = room + } + + buf.varint(uint64(slen)) + buf.pos += copy(buf.arr[buf.pos:], s[:slen]) + + bufp.set(buf) + return id, bufp +} + +// varint appends v to buf in little-endian-base-128 encoding. +func (buf *traceBuf) varint(v uint64) { + pos := buf.pos + for ; v >= 0x80; v >>= 7 { + buf.arr[pos] = 0x80 | byte(v) + pos++ + } + buf.arr[pos] = byte(v) + pos++ + buf.pos = pos +} + +// varintAt writes varint v at byte position pos in buf. This always +// consumes traceBytesPerNumber bytes. This is intended for when the +// caller needs to reserve space for a varint but can't populate it +// until later. +func (buf *traceBuf) varintAt(pos int, v uint64) { + for i := 0; i < traceBytesPerNumber; i++ { + if i < traceBytesPerNumber-1 { + buf.arr[pos] = 0x80 | byte(v) + } else { + buf.arr[pos] = byte(v) + } + v >>= 7 + pos++ + } +} + +// byte appends v to buf. +func (buf *traceBuf) byte(v byte) { + buf.arr[buf.pos] = v + buf.pos++ +} + +// traceStackTable maps stack traces (arrays of PC's) to unique uint32 ids. +// It is lock-free for reading. +type traceStackTable struct { + lock mutex // Must be acquired on the system stack + seq uint32 + mem traceAlloc + tab [1 << 13]traceStackPtr +} + +// traceStack is a single stack in traceStackTable. +type traceStack struct { + link traceStackPtr + hash uintptr + id uint32 + n int + stk [0]uintptr // real type [n]uintptr +} + +type traceStackPtr uintptr + +func (tp traceStackPtr) ptr() *traceStack { return (*traceStack)(unsafe.Pointer(tp)) } + +// stack returns slice of PCs. 
+func (ts *traceStack) stack() []uintptr { + return (*[traceStackSize]uintptr)(unsafe.Pointer(&ts.stk))[:ts.n] +} + +// put returns a unique id for the stack trace pcs and caches it in the table, +// if it sees the trace for the first time. +func (tab *traceStackTable) put(pcs []uintptr) uint32 { + if len(pcs) == 0 { + return 0 + } + hash := memhash(unsafe.Pointer(&pcs[0]), 0, uintptr(len(pcs))*unsafe.Sizeof(pcs[0])) + // First, search the hashtable w/o the mutex. + if id := tab.find(pcs, hash); id != 0 { + return id + } + // Now, double check under the mutex. + // Switch to the system stack so we can acquire tab.lock + var id uint32 + systemstack(func() { + lock(&tab.lock) + if id = tab.find(pcs, hash); id != 0 { + unlock(&tab.lock) + return + } + // Create new record. + tab.seq++ + stk := tab.newStack(len(pcs)) + stk.hash = hash + stk.id = tab.seq + id = stk.id + stk.n = len(pcs) + stkpc := stk.stack() + for i, pc := range pcs { + stkpc[i] = pc + } + part := int(hash % uintptr(len(tab.tab))) + stk.link = tab.tab[part] + atomicstorep(unsafe.Pointer(&tab.tab[part]), unsafe.Pointer(stk)) + unlock(&tab.lock) + }) + return id +} + +// find checks if the stack trace pcs is already present in the table. +func (tab *traceStackTable) find(pcs []uintptr, hash uintptr) uint32 { + part := int(hash % uintptr(len(tab.tab))) +Search: + for stk := tab.tab[part].ptr(); stk != nil; stk = stk.link.ptr() { + if stk.hash == hash && stk.n == len(pcs) { + for i, stkpc := range stk.stack() { + if stkpc != pcs[i] { + continue Search + } + } + return stk.id + } + } + return 0 +} + +// newStack allocates a new stack of size n. +func (tab *traceStackTable) newStack(n int) *traceStack { + return (*traceStack)(tab.mem.alloc(unsafe.Sizeof(traceStack{}) + uintptr(n)*goarch.PtrSize)) +} + +// traceFrames returns the frames corresponding to pcs. It may +// allocate and may emit trace events. 
+func traceFrames(bufp traceBufPtr, pcs []uintptr) ([]traceFrame, traceBufPtr) { + frames := make([]traceFrame, 0, len(pcs)) + ci := CallersFrames(pcs) + for { + var frame traceFrame + f, more := ci.Next() + frame, bufp = traceFrameForPC(bufp, 0, f) + frames = append(frames, frame) + if !more { + return frames, bufp + } + } +} + +// dump writes all previously cached stacks to trace buffers, +// releases all memory and resets state. +// +// This must run on the system stack because it calls traceFlush. +// +//go:systemstack +func (tab *traceStackTable) dump(bufp traceBufPtr) traceBufPtr { + for i := range tab.tab { + stk := tab.tab[i].ptr() + for ; stk != nil; stk = stk.link.ptr() { + var frames []traceFrame + frames, bufp = traceFrames(bufp, stk.stack()) + + // Estimate the size of this record. This + // bound is pretty loose, but avoids counting + // lots of varint sizes. + maxSize := 1 + traceBytesPerNumber + (2+4*len(frames))*traceBytesPerNumber + // Make sure we have enough buffer space. + if buf := bufp.ptr(); len(buf.arr)-buf.pos < maxSize { + bufp = traceFlush(bufp, 0) + } + + // Emit header, with space reserved for length. + buf := bufp.ptr() + buf.byte(traceEvStack | 3<<traceArgCountShift) + lenPos := buf.pos + buf.pos += traceBytesPerNumber + + // Emit body. + recPos := buf.pos + buf.varint(uint64(stk.id)) + buf.varint(uint64(len(frames))) + for _, frame := range frames { + buf.varint(uint64(frame.PC)) + buf.varint(frame.funcID) + buf.varint(frame.fileID) + buf.varint(frame.line) + } + + // Fill in size header. + buf.varintAt(lenPos, uint64(buf.pos-recPos)) + } + } + + tab.mem.drop() + *tab = traceStackTable{} + lockInit(&((*tab).lock), lockRankTraceStackTab) + + return bufp +} + +type traceFrame struct { + PC uintptr + funcID uint64 + fileID uint64 + line uint64 +} + +// traceFrameForPC records the frame information. +// It may allocate memory. 
+func traceFrameForPC(buf traceBufPtr, pid int32, f Frame) (traceFrame, traceBufPtr) { + bufp := &buf + var frame traceFrame + frame.PC = f.PC + + fn := f.Function + const maxLen = 1 << 10 + if len(fn) > maxLen { + fn = fn[len(fn)-maxLen:] + } + frame.funcID, bufp = traceString(bufp, pid, fn) + frame.line = uint64(f.Line) + file := f.File + if len(file) > maxLen { + file = file[len(file)-maxLen:] + } + frame.fileID, bufp = traceString(bufp, pid, file) + return frame, (*bufp) +} + +// traceAlloc is a non-thread-safe region allocator. +// It holds a linked list of traceAllocBlock. +type traceAlloc struct { + head traceAllocBlockPtr + off uintptr +} + +// traceAllocBlock is a block in traceAlloc. +// +// traceAllocBlock is allocated from non-GC'd memory, so it must not +// contain heap pointers. Writes to pointers to traceAllocBlocks do +// not need write barriers. +type traceAllocBlock struct { + _ sys.NotInHeap + next traceAllocBlockPtr + data [64<<10 - goarch.PtrSize]byte +} + +// TODO: Since traceAllocBlock is now embedded runtime/internal/sys.NotInHeap, this isn't necessary. +type traceAllocBlockPtr uintptr + +func (p traceAllocBlockPtr) ptr() *traceAllocBlock { return (*traceAllocBlock)(unsafe.Pointer(p)) } +func (p *traceAllocBlockPtr) set(x *traceAllocBlock) { *p = traceAllocBlockPtr(unsafe.Pointer(x)) } + +// alloc allocates n-byte block. +func (a *traceAlloc) alloc(n uintptr) unsafe.Pointer { + n = alignUp(n, goarch.PtrSize) + if a.head == 0 || a.off+n > uintptr(len(a.head.ptr().data)) { + if n > uintptr(len(a.head.ptr().data)) { + throw("trace: alloc too large") + } + block := (*traceAllocBlock)(sysAlloc(unsafe.Sizeof(traceAllocBlock{}), &memstats.other_sys)) + if block == nil { + throw("trace: out of memory") + } + block.next.set(a.head.ptr()) + a.head.set(block) + a.off = 0 + } + p := &a.head.ptr().data[a.off] + a.off += n + return unsafe.Pointer(p) +} + +// drop frees all previously allocated memory and resets the allocator. 
+func (a *traceAlloc) drop() { + for a.head != 0 { + block := a.head.ptr() + a.head.set(block.next.ptr()) + sysFree(unsafe.Pointer(block), unsafe.Sizeof(traceAllocBlock{}), &memstats.other_sys) + } +} + +// The following functions write specific events to trace. + +func traceGomaxprocs(procs int32) { + traceEvent(traceEvGomaxprocs, 1, uint64(procs)) +} + +func traceProcStart() { + traceEvent(traceEvProcStart, -1, uint64(getg().m.id)) +} + +func traceProcStop(pp *p) { + // Sysmon and stopTheWorld can stop Ps blocked in syscalls, + // to handle this we temporary employ the P. + mp := acquirem() + oldp := mp.p + mp.p.set(pp) + traceEvent(traceEvProcStop, -1) + mp.p = oldp + releasem(mp) +} + +func traceGCStart() { + traceEvent(traceEvGCStart, 3, trace.seqGC) + trace.seqGC++ +} + +func traceGCDone() { + traceEvent(traceEvGCDone, -1) +} + +func traceGCSTWStart(kind int) { + traceEvent(traceEvGCSTWStart, -1, uint64(kind)) +} + +func traceGCSTWDone() { + traceEvent(traceEvGCSTWDone, -1) +} + +// traceGCSweepStart prepares to trace a sweep loop. This does not +// emit any events until traceGCSweepSpan is called. +// +// traceGCSweepStart must be paired with traceGCSweepDone and there +// must be no preemption points between these two calls. +func traceGCSweepStart() { + // Delay the actual GCSweepStart event until the first span + // sweep. If we don't sweep anything, don't emit any events. + pp := getg().m.p.ptr() + if pp.traceSweep { + throw("double traceGCSweepStart") + } + pp.traceSweep, pp.traceSwept, pp.traceReclaimed = true, 0, 0 +} + +// traceGCSweepSpan traces the sweep of a single page. +// +// This may be called outside a traceGCSweepStart/traceGCSweepDone +// pair; however, it will not emit any trace events in this case. 
+func traceGCSweepSpan(bytesSwept uintptr) { + pp := getg().m.p.ptr() + if pp.traceSweep { + if pp.traceSwept == 0 { + traceEvent(traceEvGCSweepStart, 1) + } + pp.traceSwept += bytesSwept + } +} + +func traceGCSweepDone() { + pp := getg().m.p.ptr() + if !pp.traceSweep { + throw("missing traceGCSweepStart") + } + if pp.traceSwept != 0 { + traceEvent(traceEvGCSweepDone, -1, uint64(pp.traceSwept), uint64(pp.traceReclaimed)) + } + pp.traceSweep = false +} + +func traceGCMarkAssistStart() { + traceEvent(traceEvGCMarkAssistStart, 1) +} + +func traceGCMarkAssistDone() { + traceEvent(traceEvGCMarkAssistDone, -1) +} + +func traceGoCreate(newg *g, pc uintptr) { + newg.traceseq = 0 + newg.tracelastp = getg().m.p + // +PCQuantum because traceFrameForPC expects return PCs and subtracts PCQuantum. + id := trace.stackTab.put([]uintptr{startPCforTrace(pc) + sys.PCQuantum}) + traceEvent(traceEvGoCreate, 2, newg.goid, uint64(id)) +} + +func traceGoStart() { + gp := getg().m.curg + pp := gp.m.p + gp.traceseq++ + if pp.ptr().gcMarkWorkerMode != gcMarkWorkerNotWorker { + traceEvent(traceEvGoStartLabel, -1, gp.goid, gp.traceseq, trace.markWorkerLabels[pp.ptr().gcMarkWorkerMode]) + } else if gp.tracelastp == pp { + traceEvent(traceEvGoStartLocal, -1, gp.goid) + } else { + gp.tracelastp = pp + traceEvent(traceEvGoStart, -1, gp.goid, gp.traceseq) + } +} + +func traceGoEnd() { + traceEvent(traceEvGoEnd, -1) +} + +func traceGoSched() { + gp := getg() + gp.tracelastp = gp.m.p + traceEvent(traceEvGoSched, 1) +} + +func traceGoPreempt() { + gp := getg() + gp.tracelastp = gp.m.p + traceEvent(traceEvGoPreempt, 1) +} + +func traceGoPark(traceEv byte, skip int) { + if traceEv&traceFutileWakeup != 0 { + traceEvent(traceEvFutileWakeup, -1) + } + traceEvent(traceEv & ^traceFutileWakeup, skip) +} + +func traceGoUnpark(gp *g, skip int) { + pp := getg().m.p + gp.traceseq++ + if gp.tracelastp == pp { + traceEvent(traceEvGoUnblockLocal, skip, gp.goid) + } else { + gp.tracelastp = pp + 
traceEvent(traceEvGoUnblock, skip, gp.goid, gp.traceseq) + } +} + +func traceGoSysCall() { + traceEvent(traceEvGoSysCall, 1) +} + +func traceGoSysExit(ts int64) { + if ts != 0 && ts < trace.ticksStart { + // There is a race between the code that initializes sysexitticks + // (in exitsyscall, which runs without a P, and therefore is not + // stopped with the rest of the world) and the code that initializes + // a new trace. The recorded sysexitticks must therefore be treated + // as "best effort". If they are valid for this trace, then great, + // use them for greater accuracy. But if they're not valid for this + // trace, assume that the trace was started after the actual syscall + // exit (but before we actually managed to start the goroutine, + // aka right now), and assign a fresh time stamp to keep the log consistent. + ts = 0 + } + gp := getg().m.curg + gp.traceseq++ + gp.tracelastp = gp.m.p + traceEvent(traceEvGoSysExit, -1, gp.goid, gp.traceseq, uint64(ts)/traceTickDiv) +} + +func traceGoSysBlock(pp *p) { + // Sysmon and stopTheWorld can declare syscalls running on remote Ps as blocked, + // to handle this we temporary employ the P. + mp := acquirem() + oldp := mp.p + mp.p.set(pp) + traceEvent(traceEvGoSysBlock, -1) + mp.p = oldp + releasem(mp) +} + +func traceHeapAlloc(live uint64) { + traceEvent(traceEvHeapAlloc, -1, live) +} + +func traceHeapGoal() { + heapGoal := gcController.heapGoal() + if heapGoal == ^uint64(0) { + // Heap-based triggering is disabled. + traceEvent(traceEvHeapGoal, -1, 0) + } else { + traceEvent(traceEvHeapGoal, -1, heapGoal) + } +} + +// To access runtime functions from runtime/trace. +// See runtime/trace/annotation.go + +//go:linkname trace_userTaskCreate runtime/trace.userTaskCreate +func trace_userTaskCreate(id, parentID uint64, taskType string) { + if !trace.enabled { + return + } + + // Same as in traceEvent. 
+ mp, pid, bufp := traceAcquireBuffer() + if !trace.enabled && !mp.startingtrace { + traceReleaseBuffer(pid) + return + } + + typeStringID, bufp := traceString(bufp, pid, taskType) + traceEventLocked(0, mp, pid, bufp, traceEvUserTaskCreate, 0, 3, id, parentID, typeStringID) + traceReleaseBuffer(pid) +} + +//go:linkname trace_userTaskEnd runtime/trace.userTaskEnd +func trace_userTaskEnd(id uint64) { + traceEvent(traceEvUserTaskEnd, 2, id) +} + +//go:linkname trace_userRegion runtime/trace.userRegion +func trace_userRegion(id, mode uint64, name string) { + if !trace.enabled { + return + } + + mp, pid, bufp := traceAcquireBuffer() + if !trace.enabled && !mp.startingtrace { + traceReleaseBuffer(pid) + return + } + + nameStringID, bufp := traceString(bufp, pid, name) + traceEventLocked(0, mp, pid, bufp, traceEvUserRegion, 0, 3, id, mode, nameStringID) + traceReleaseBuffer(pid) +} + +//go:linkname trace_userLog runtime/trace.userLog +func trace_userLog(id uint64, category, message string) { + if !trace.enabled { + return + } + + mp, pid, bufp := traceAcquireBuffer() + if !trace.enabled && !mp.startingtrace { + traceReleaseBuffer(pid) + return + } + + categoryID, bufp := traceString(bufp, pid, category) + + extraSpace := traceBytesPerNumber + len(message) // extraSpace for the value string + traceEventLocked(extraSpace, mp, pid, bufp, traceEvUserLog, 0, 3, id, categoryID) + // traceEventLocked reserved extra space for val and len(val) + // in buf, so buf now has room for the following. + buf := bufp.ptr() + + // double-check the message and its length can fit. + // Otherwise, truncate the message. + slen := len(message) + if room := len(buf.arr) - buf.pos; room < slen+traceBytesPerNumber { + slen = room + } + buf.varint(uint64(slen)) + buf.pos += copy(buf.arr[buf.pos:], message[:slen]) + + traceReleaseBuffer(pid) +} + +// the start PC of a goroutine for tracing purposes. If pc is a wrapper, +// it returns the PC of the wrapped function. Otherwise it returns pc. 
+func startPCforTrace(pc uintptr) uintptr { + f := findfunc(pc) + if !f.valid() { + return pc // may happen for locked g in extra M since its pc is 0. + } + w := funcdata(f, _FUNCDATA_WrapInfo) + if w == nil { + return pc // not a wrapper + } + return f.datap.textAddr(*(*uint32)(w)) +} diff --git a/src/runtime/trace/annotation.go b/src/runtime/trace/annotation.go new file mode 100644 index 0000000..d47cb85 --- /dev/null +++ b/src/runtime/trace/annotation.go @@ -0,0 +1,198 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package trace + +import ( + "context" + "fmt" + "sync/atomic" + _ "unsafe" +) + +type traceContextKey struct{} + +// NewTask creates a task instance with the type taskType and returns +// it along with a Context that carries the task. +// If the input context contains a task, the new task is its subtask. +// +// The taskType is used to classify task instances. Analysis tools +// like the Go execution tracer may assume there are only a bounded +// number of unique task types in the system. +// +// The returned end function is used to mark the task's end. +// The trace tool measures task latency as the time between task creation +// and when the end function is called, and provides the latency +// distribution per task type. +// If the end function is called multiple times, only the first +// call is used in the latency measurement. +// +// ctx, task := trace.NewTask(ctx, "awesomeTask") +// trace.WithRegion(ctx, "preparation", prepWork) +// // preparation of the task +// go func() { // continue processing the task in a separate goroutine. 
+// defer task.End() +// trace.WithRegion(ctx, "remainingWork", remainingWork) +// }() +func NewTask(pctx context.Context, taskType string) (ctx context.Context, task *Task) { + pid := fromContext(pctx).id + id := newID() + userTaskCreate(id, pid, taskType) + s := &Task{id: id} + return context.WithValue(pctx, traceContextKey{}, s), s + + // We allocate a new task and the end function even when + // the tracing is disabled because the context and the detach + // function can be used across trace enable/disable boundaries, + // which complicates the problem. + // + // For example, consider the following scenario: + // - trace is enabled. + // - trace.WithRegion is called, so a new context ctx + // with a new region is created. + // - trace is disabled. + // - trace is enabled again. + // - trace APIs with the ctx is called. Is the ID in the task + // a valid one to use? + // + // TODO(hyangah): reduce the overhead at least when + // tracing is disabled. Maybe the id can embed a tracing + // round number and ignore ids generated from previous + // tracing round. +} + +func fromContext(ctx context.Context) *Task { + if s, ok := ctx.Value(traceContextKey{}).(*Task); ok { + return s + } + return &bgTask +} + +// Task is a data type for tracing a user-defined, logical operation. +type Task struct { + id uint64 + // TODO(hyangah): record parent id? +} + +// End marks the end of the operation represented by the Task. +func (t *Task) End() { + userTaskEnd(t.id) +} + +var lastTaskID uint64 = 0 // task id issued last time + +func newID() uint64 { + // TODO(hyangah): use per-P cache + return atomic.AddUint64(&lastTaskID, 1) +} + +var bgTask = Task{id: uint64(0)} + +// Log emits a one-off event with the given category and message. +// Category can be empty and the API assumes there are only a handful of +// unique categories in the system. 
+func Log(ctx context.Context, category, message string) { + id := fromContext(ctx).id + userLog(id, category, message) +} + +// Logf is like Log, but the value is formatted using the specified format spec. +func Logf(ctx context.Context, category, format string, args ...any) { + if IsEnabled() { + // Ideally this should be just Log, but that will + // add one more frame in the stack trace. + id := fromContext(ctx).id + userLog(id, category, fmt.Sprintf(format, args...)) + } +} + +const ( + regionStartCode = uint64(0) + regionEndCode = uint64(1) +) + +// WithRegion starts a region associated with its calling goroutine, runs fn, +// and then ends the region. If the context carries a task, the region is +// associated with the task. Otherwise, the region is attached to the background +// task. +// +// The regionType is used to classify regions, so there should be only a +// handful of unique region types. +func WithRegion(ctx context.Context, regionType string, fn func()) { + // NOTE: + // WithRegion helps avoid misuse of the API, but in practice + // it is very restrictive: + // - Use of WithRegion makes the stack traces captured at + // region start and end identical. + // - Refactoring the existing code to use WithRegion is sometimes + // hard and makes the code less readable. + // e.g. a code block nested deep in a loop with various + // exit points with return values + // - Refactoring the code to use this API with a closure can + // cause different GC behavior such as retaining some parameters + // longer. + // This causes more churn in code than I hoped, and sometimes + // makes the code less readable. + + id := fromContext(ctx).id + userRegion(id, regionStartCode, regionType) + defer userRegion(id, regionEndCode, regionType) + fn() +} + +// StartRegion starts a region and returns a Region for marking the +// end of the region. The returned Region's End function must be called +// from the same goroutine where the region was started.
+// Within each goroutine, regions must nest. That is, regions started +// after this region must be ended before this region can be ended. +// Recommended usage is +// +// defer trace.StartRegion(ctx, "myTracedRegion").End() +func StartRegion(ctx context.Context, regionType string) *Region { + if !IsEnabled() { + return noopRegion + } + id := fromContext(ctx).id + userRegion(id, regionStartCode, regionType) + return &Region{id, regionType} +} + +// Region is a region of code whose execution time interval is traced. +type Region struct { + id uint64 + regionType string +} + +var noopRegion = &Region{} + +// End marks the end of the traced code region. +func (r *Region) End() { + if r == noopRegion { + return + } + userRegion(r.id, regionEndCode, r.regionType) +} + +// IsEnabled reports whether tracing is enabled. +// The information is advisory only. The tracing status +// may have changed by the time this function returns. +func IsEnabled() bool { + return tracing.enabled.Load() +} + +// +// Function bodies are defined in runtime/trace.go +// + +// emits UserTaskCreate event. +func userTaskCreate(id, parentID uint64, taskType string) + +// emits UserTaskEnd event. +func userTaskEnd(id uint64) + +// emits UserRegion event. +func userRegion(id, mode uint64, regionType string) + +// emits UserLog event. +func userLog(id uint64, category, message string) diff --git a/src/runtime/trace/annotation_test.go b/src/runtime/trace/annotation_test.go new file mode 100644 index 0000000..69ea8f2 --- /dev/null +++ b/src/runtime/trace/annotation_test.go @@ -0,0 +1,156 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package trace_test + +import ( + "bytes" + "context" + "fmt" + "internal/trace" + "reflect" + . 
"runtime/trace" + "strings" + "sync" + "testing" +) + +func BenchmarkStartRegion(b *testing.B) { + b.ReportAllocs() + ctx, task := NewTask(context.Background(), "benchmark") + defer task.End() + + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + StartRegion(ctx, "region").End() + } + }) +} + +func BenchmarkNewTask(b *testing.B) { + b.ReportAllocs() + pctx, task := NewTask(context.Background(), "benchmark") + defer task.End() + + b.RunParallel(func(pb *testing.PB) { + for pb.Next() { + _, task := NewTask(pctx, "task") + task.End() + } + }) +} + +func TestUserTaskRegion(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + bgctx, cancel := context.WithCancel(context.Background()) + defer cancel() + + preExistingRegion := StartRegion(bgctx, "pre-existing region") + + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + + // Beginning of traced execution + var wg sync.WaitGroup + ctx, task := NewTask(bgctx, "task0") // EvUserTaskCreate("task0") + wg.Add(1) + go func() { + defer wg.Done() + defer task.End() // EvUserTaskEnd("task0") + + WithRegion(ctx, "region0", func() { + // EvUserRegionCreate("region0", start) + WithRegion(ctx, "region1", func() { + Log(ctx, "key0", "0123456789abcdef") // EvUserLog("task0", "key0", "0....f") + }) + // EvUserRegion("region0", end) + }) + }() + + wg.Wait() + + preExistingRegion.End() + postExistingRegion := StartRegion(bgctx, "post-existing region") + + // End of traced execution + Stop() + + postExistingRegion.End() + + saveTrace(t, buf, "TestUserTaskRegion") + res, err := trace.Parse(buf, "") + if err == trace.ErrTimeOrder { + // golang.org/issues/16755 + t.Skipf("skipping trace: %v", err) + } + if err != nil { + t.Fatalf("Parse failed: %v", err) + } + + // Check whether we see all user annotation related records in order + type testData struct { + typ byte + strs []string + args []uint64 + setLink bool + } + + var got []testData 
+ tasks := map[uint64]string{} + for _, e := range res.Events { + t.Logf("%s", e) + switch e.Type { + case trace.EvUserTaskCreate: + taskName := e.SArgs[0] + got = append(got, testData{trace.EvUserTaskCreate, []string{taskName}, nil, e.Link != nil}) + if e.Link != nil && e.Link.Type != trace.EvUserTaskEnd { + t.Errorf("Unexpected linked event %q->%q", e, e.Link) + } + tasks[e.Args[0]] = taskName + case trace.EvUserLog: + key, val := e.SArgs[0], e.SArgs[1] + taskName := tasks[e.Args[0]] + got = append(got, testData{trace.EvUserLog, []string{taskName, key, val}, nil, e.Link != nil}) + case trace.EvUserTaskEnd: + taskName := tasks[e.Args[0]] + got = append(got, testData{trace.EvUserTaskEnd, []string{taskName}, nil, e.Link != nil}) + if e.Link != nil && e.Link.Type != trace.EvUserTaskCreate { + t.Errorf("Unexpected linked event %q->%q", e, e.Link) + } + case trace.EvUserRegion: + taskName := tasks[e.Args[0]] + regionName := e.SArgs[0] + got = append(got, testData{trace.EvUserRegion, []string{taskName, regionName}, []uint64{e.Args[1]}, e.Link != nil}) + if e.Link != nil && (e.Link.Type != trace.EvUserRegion || e.Link.SArgs[0] != regionName) { + t.Errorf("Unexpected linked event %q->%q", e, e.Link) + } + } + } + want := []testData{ + {trace.EvUserTaskCreate, []string{"task0"}, nil, true}, + {trace.EvUserRegion, []string{"task0", "region0"}, []uint64{0}, true}, + {trace.EvUserRegion, []string{"task0", "region1"}, []uint64{0}, true}, + {trace.EvUserLog, []string{"task0", "key0", "0123456789abcdef"}, nil, false}, + {trace.EvUserRegion, []string{"task0", "region1"}, []uint64{1}, false}, + {trace.EvUserRegion, []string{"task0", "region0"}, []uint64{1}, false}, + {trace.EvUserTaskEnd, []string{"task0"}, nil, false}, + // Currently, pre-existing region is not recorded to avoid allocations. 
+ // {trace.EvUserRegion, []string{"", "pre-existing region"}, []uint64{1}, false}, + {trace.EvUserRegion, []string{"", "post-existing region"}, []uint64{0}, false}, + } + if !reflect.DeepEqual(got, want) { + pretty := func(data []testData) string { + var s strings.Builder + for _, d := range data { + fmt.Fprintf(&s, "\t%+v\n", d) + } + return s.String() + } + t.Errorf("Got user region related events\n%+v\nwant:\n%+v", pretty(got), pretty(want)) + } +} diff --git a/src/runtime/trace/example_test.go b/src/runtime/trace/example_test.go new file mode 100644 index 0000000..ba96a82 --- /dev/null +++ b/src/runtime/trace/example_test.go @@ -0,0 +1,39 @@ +// Copyright 2017 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package trace_test + +import ( + "fmt" + "log" + "os" + "runtime/trace" +) + +// Example demonstrates the use of the trace package to trace +// the execution of a Go program. The trace output will be +// written to the file trace.out +func Example() { + f, err := os.Create("trace.out") + if err != nil { + log.Fatalf("failed to create trace output file: %v", err) + } + defer func() { + if err := f.Close(); err != nil { + log.Fatalf("failed to close trace file: %v", err) + } + }() + + if err := trace.Start(f); err != nil { + log.Fatalf("failed to start trace: %v", err) + } + defer trace.Stop() + + // your program here + RunMyProgram() +} + +func RunMyProgram() { + fmt.Printf("this function will be traced") +} diff --git a/src/runtime/trace/trace.go b/src/runtime/trace/trace.go new file mode 100644 index 0000000..86c97e2 --- /dev/null +++ b/src/runtime/trace/trace.go @@ -0,0 +1,154 @@ +// Copyright 2015 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Package trace contains facilities for programs to generate traces +// for the Go execution tracer. 
+// +// # Tracing runtime activities +// +// The execution trace captures a wide range of execution events such as +// goroutine creation/blocking/unblocking, syscall enter/exit/block, +// GC-related events, changes of heap size, processor start/stop, etc. +// When CPU profiling is active, the execution tracer makes an effort to +// include CPU profile samples as well. +// A nanosecond-precision timestamp and a stack trace are +// captured for most events. The generated trace can be interpreted +// using `go tool trace`. +// +// Support for tracing tests and benchmarks built with the standard +// testing package is built into `go test`. For example, the following +// command runs the tests in the current directory and writes the trace +// file (trace.out). +// +// go test -trace=trace.out +// +// This runtime/trace package provides APIs to add equivalent tracing +// support to a standalone program. See the Example that demonstrates +// how to use this API to enable tracing. +// +// There is also a standard HTTP interface to trace data. Adding the +// following line will install a handler under the /debug/pprof/trace URL +// to download a live trace: +// +// import _ "net/http/pprof" +// +// See the net/http/pprof package for more details about all of the +// debug endpoints installed by this import. +// +// # User annotation +// +// Package trace provides user annotation APIs that can be used to +// log interesting events during execution. +// +// There are three types of user annotations: log messages, regions, +// and tasks. +// +// Log emits a timestamped message to the execution trace along with +// additional information such as the category of the message and +// which goroutine called Log. The execution tracer provides UIs to filter +// and group goroutines using the log category and the message supplied +// in Log. +// +// A region is for logging a time interval during a goroutine's execution.
+// By definition, a region starts and ends in the same goroutine. +// Regions can be nested to represent subintervals. +// For example, the following code records four regions in the execution +// trace to trace the durations of sequential steps in a cappuccino making +// operation. +// +// trace.WithRegion(ctx, "makeCappuccino", func() { +// +// // orderID allows to identify a specific order +// // among many cappuccino order region records. +// trace.Log(ctx, "orderID", orderID) +// +// trace.WithRegion(ctx, "steamMilk", steamMilk) +// trace.WithRegion(ctx, "extractCoffee", extractCoffee) +// trace.WithRegion(ctx, "mixMilkCoffee", mixMilkCoffee) +// }) +// +// A task is a higher-level component that aids tracing of logical +// operations such as an RPC request, an HTTP request, or an +// interesting local operation which may require multiple goroutines +// working together. Since tasks can involve multiple goroutines, +// they are tracked via a context.Context object. NewTask creates +// a new task and embeds it in the returned context.Context object. +// Log messages and regions are attached to the task, if any, in the +// Context passed to Log and WithRegion. +// +// For example, assume that we decided to froth milk, extract coffee, +// and mix milk and coffee in separate goroutines. With a task, +// the trace tool can identify the goroutines involved in a specific +// cappuccino order. +// +// ctx, task := trace.NewTask(ctx, "makeCappuccino") +// trace.Log(ctx, "orderID", orderID) +// +// milk := make(chan bool) +// espresso := make(chan bool) +// +// go func() { +// trace.WithRegion(ctx, "steamMilk", steamMilk) +// milk <- true +// }() +// go func() { +// trace.WithRegion(ctx, "extractCoffee", extractCoffee) +// espresso <- true +// }() +// go func() { +// defer task.End() // When assemble is done, the order is complete. 
+// <-espresso +// <-milk +// trace.WithRegion(ctx, "mixMilkCoffee", mixMilkCoffee) +// }() +// +// The trace tool computes the latency of a task by measuring the +// time between the task creation and the task end and provides +// latency distributions for each task type found in the trace. +package trace + +import ( + "io" + "runtime" + "sync" + "sync/atomic" +) + +// Start enables tracing for the current program. +// While tracing, the trace will be buffered and written to w. +// Start returns an error if tracing is already enabled. +func Start(w io.Writer) error { + tracing.Lock() + defer tracing.Unlock() + + if err := runtime.StartTrace(); err != nil { + return err + } + go func() { + for { + data := runtime.ReadTrace() + if data == nil { + break + } + w.Write(data) + } + }() + tracing.enabled.Store(true) + return nil +} + +// Stop stops the current tracing, if any. +// Stop only returns after all the writes for the trace have completed. +func Stop() { + tracing.Lock() + defer tracing.Unlock() + tracing.enabled.Store(false) + + runtime.StopTrace() +} + +var tracing struct { + sync.Mutex // gate mutators (Start, Stop) + enabled atomic.Bool +} diff --git a/src/runtime/trace/trace_stack_test.go b/src/runtime/trace/trace_stack_test.go new file mode 100644 index 0000000..be3adc9 --- /dev/null +++ b/src/runtime/trace/trace_stack_test.go @@ -0,0 +1,333 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package trace_test + +import ( + "bytes" + "fmt" + "internal/testenv" + "internal/trace" + "net" + "os" + "runtime" + . "runtime/trace" + "strings" + "sync" + "testing" + "text/tabwriter" + "time" +) + +// TestTraceSymbolize tests symbolization and that events has proper stacks. +// In particular that we strip bottom uninteresting frames like goexit, +// top uninteresting frames (runtime guts). 
+func TestTraceSymbolize(t *testing.T) { + skipTraceSymbolizeTestIfNecessary(t) + + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + defer Stop() // in case of early return + + // Now we will do a bunch of things for which we verify stacks later. + // It is impossible to ensure that a goroutine has actually blocked + // on a channel, in a select or otherwise. So we kick off goroutines + // that need to block first in the hope that while we are executing + // the rest of the test, they will block. + go func() { // func1 + select {} + }() + go func() { // func2 + var c chan int + c <- 0 + }() + go func() { // func3 + var c chan int + <-c + }() + done1 := make(chan bool) + go func() { // func4 + <-done1 + }() + done2 := make(chan bool) + go func() { // func5 + done2 <- true + }() + c1 := make(chan int) + c2 := make(chan int) + go func() { // func6 + select { + case <-c1: + case <-c2: + } + }() + var mu sync.Mutex + mu.Lock() + go func() { // func7 + mu.Lock() + mu.Unlock() + }() + var wg sync.WaitGroup + wg.Add(1) + go func() { // func8 + wg.Wait() + }() + cv := sync.NewCond(&sync.Mutex{}) + go func() { // func9 + cv.L.Lock() + cv.Wait() + cv.L.Unlock() + }() + ln, err := net.Listen("tcp", "127.0.0.1:0") + if err != nil { + t.Fatalf("failed to listen: %v", err) + } + go func() { // func10 + c, err := ln.Accept() + if err != nil { + t.Errorf("failed to accept: %v", err) + return + } + c.Close() + }() + rp, wp, err := os.Pipe() + if err != nil { + t.Fatalf("failed to create a pipe: %v", err) + } + defer rp.Close() + defer wp.Close() + pipeReadDone := make(chan bool) + go func() { // func11 + var data [1]byte + rp.Read(data[:]) + pipeReadDone <- true + }() + + time.Sleep(100 * time.Millisecond) + runtime.GC() + runtime.Gosched() + time.Sleep(100 * time.Millisecond) // the last chance for the goroutines above to block + done1 <- true + <-done2 + select { + case c1 <- 0: + case c2 <- 0: + } + mu.Unlock() + 
wg.Done() + cv.Signal() + c, err := net.Dial("tcp", ln.Addr().String()) + if err != nil { + t.Fatalf("failed to dial: %v", err) + } + c.Close() + var data [1]byte + wp.Write(data[:]) + <-pipeReadDone + + oldGoMaxProcs := runtime.GOMAXPROCS(0) + runtime.GOMAXPROCS(oldGoMaxProcs + 1) + + Stop() + + runtime.GOMAXPROCS(oldGoMaxProcs) + + events, _ := parseTrace(t, buf) + + // Now check that the stacks are correct. + type eventDesc struct { + Type byte + Stk []frame + } + want := []eventDesc{ + {trace.EvGCStart, []frame{ + {"runtime.GC", 0}, + {"runtime/trace_test.TestTraceSymbolize", 0}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoStart, []frame{ + {"runtime/trace_test.TestTraceSymbolize.func1", 0}, + }}, + {trace.EvGoSched, []frame{ + {"runtime/trace_test.TestTraceSymbolize", 111}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoCreate, []frame{ + {"runtime/trace_test.TestTraceSymbolize", 40}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoStop, []frame{ + {"runtime.block", 0}, + {"runtime/trace_test.TestTraceSymbolize.func1", 0}, + }}, + {trace.EvGoStop, []frame{ + {"runtime.chansend1", 0}, + {"runtime/trace_test.TestTraceSymbolize.func2", 0}, + }}, + {trace.EvGoStop, []frame{ + {"runtime.chanrecv1", 0}, + {"runtime/trace_test.TestTraceSymbolize.func3", 0}, + }}, + {trace.EvGoBlockRecv, []frame{ + {"runtime.chanrecv1", 0}, + {"runtime/trace_test.TestTraceSymbolize.func4", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"runtime.chansend1", 0}, + {"runtime/trace_test.TestTraceSymbolize", 113}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoBlockSend, []frame{ + {"runtime.chansend1", 0}, + {"runtime/trace_test.TestTraceSymbolize.func5", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"runtime.chanrecv1", 0}, + {"runtime/trace_test.TestTraceSymbolize", 114}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoBlockSelect, []frame{ + {"runtime.selectgo", 0}, + {"runtime/trace_test.TestTraceSymbolize.func6", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"runtime.selectgo", 0}, + 
{"runtime/trace_test.TestTraceSymbolize", 115}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoBlockSync, []frame{ + {"sync.(*Mutex).Lock", 0}, + {"runtime/trace_test.TestTraceSymbolize.func7", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"sync.(*Mutex).Unlock", 0}, + {"runtime/trace_test.TestTraceSymbolize", 0}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoBlockSync, []frame{ + {"sync.(*WaitGroup).Wait", 0}, + {"runtime/trace_test.TestTraceSymbolize.func8", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"sync.(*WaitGroup).Add", 0}, + {"sync.(*WaitGroup).Done", 0}, + {"runtime/trace_test.TestTraceSymbolize", 120}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoBlockCond, []frame{ + {"sync.(*Cond).Wait", 0}, + {"runtime/trace_test.TestTraceSymbolize.func9", 0}, + }}, + {trace.EvGoUnblock, []frame{ + {"sync.(*Cond).Signal", 0}, + {"runtime/trace_test.TestTraceSymbolize", 0}, + {"testing.tRunner", 0}, + }}, + {trace.EvGoSleep, []frame{ + {"time.Sleep", 0}, + {"runtime/trace_test.TestTraceSymbolize", 0}, + {"testing.tRunner", 0}, + }}, + {trace.EvGomaxprocs, []frame{ + {"runtime.startTheWorld", 0}, // this is when the current gomaxprocs is logged. + {"runtime.startTheWorldGC", 0}, + {"runtime.GOMAXPROCS", 0}, + {"runtime/trace_test.TestTraceSymbolize", 0}, + {"testing.tRunner", 0}, + }}, + } + // Stacks for the following events are OS-dependent due to OS-specific code in net package. + if runtime.GOOS != "windows" && runtime.GOOS != "plan9" { + want = append(want, []eventDesc{ + {trace.EvGoBlockNet, []frame{ + {"internal/poll.(*FD).Accept", 0}, + {"net.(*netFD).accept", 0}, + {"net.(*TCPListener).accept", 0}, + {"net.(*TCPListener).Accept", 0}, + {"runtime/trace_test.TestTraceSymbolize.func10", 0}, + }}, + {trace.EvGoSysCall, []frame{ + {"syscall.read", 0}, + {"syscall.Read", 0}, + {"internal/poll.ignoringEINTRIO", 0}, + {"internal/poll.(*FD).Read", 0}, + {"os.(*File).read", 0}, + {"os.(*File).Read", 0}, + {"runtime/trace_test.TestTraceSymbolize.func11", 0}, + }}, + }...) 
+ } + matched := make([]bool, len(want)) + for _, ev := range events { + wantLoop: + for i, w := range want { + if matched[i] || w.Type != ev.Type || len(w.Stk) != len(ev.Stk) { + continue + } + + for fi, f := range ev.Stk { + wf := w.Stk[fi] + if wf.Fn != f.Fn || wf.Line != 0 && wf.Line != f.Line { + continue wantLoop + } + } + matched[i] = true + } + } + for i, w := range want { + if matched[i] { + continue + } + seen, n := dumpEventStacks(w.Type, events) + t.Errorf("Did not match event %v with stack\n%s\nSeen %d events of the type\n%s", + trace.EventDescriptions[w.Type].Name, dumpFrames(w.Stk), n, seen) + } +} + +func skipTraceSymbolizeTestIfNecessary(t *testing.T) { + testenv.MustHaveGoBuild(t) + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } +} + +func dumpEventStacks(typ byte, events []*trace.Event) ([]byte, int) { + matched := 0 + o := new(bytes.Buffer) + tw := tabwriter.NewWriter(o, 0, 8, 0, '\t', 0) + for _, ev := range events { + if ev.Type != typ { + continue + } + matched++ + fmt.Fprintf(tw, "Offset %d\n", ev.Off) + for _, f := range ev.Stk { + fname := f.File + if idx := strings.Index(fname, "/go/src/"); idx > 0 { + fname = fname[idx:] + } + fmt.Fprintf(tw, " %v\t%s:%d\n", f.Fn, fname, f.Line) + } + } + tw.Flush() + return o.Bytes(), matched +} + +type frame struct { + Fn string + Line int +} + +func dumpFrames(frames []frame) []byte { + o := new(bytes.Buffer) + tw := tabwriter.NewWriter(o, 0, 8, 0, '\t', 0) + + for _, f := range frames { + fmt.Fprintf(tw, " %v\t :%d\n", f.Fn, f.Line) + } + tw.Flush() + return o.Bytes() +} diff --git a/src/runtime/trace/trace_test.go b/src/runtime/trace/trace_test.go new file mode 100644 index 0000000..19f7dbe --- /dev/null +++ b/src/runtime/trace/trace_test.go @@ -0,0 +1,792 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package trace_test + +import ( + "bytes" + "context" + "flag" + "fmt" + "internal/profile" + "internal/race" + "internal/trace" + "io" + "net" + "os" + "runtime" + "runtime/pprof" + . "runtime/trace" + "strconv" + "strings" + "sync" + "testing" + "time" +) + +var ( + saveTraces = flag.Bool("savetraces", false, "save traces collected by tests") +) + +// TestEventBatch tests that Flush calls that happen during Start +// don't produce corrupted traces. +func TestEventBatch(t *testing.T) { + if race.Enabled { + t.Skip("skipping in race mode") + } + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + if testing.Short() { + t.Skip("skipping in short mode") + } + // During Start, a bunch of records are written to reflect the current + // snapshot of the program, including the state of each goroutine. + // Some string constants are also written to the trace to aid trace + // parsing. This test checks that a Flush of the buffer occurring during + // this process doesn't cause corrupted traces. + // Exactly when a Flush is called during Start is complicated, + // so we test with a range of goroutine counts, hoping that one + // of them triggers a Flush. + // This range was chosen to fill up a ~64KB buffer with traceEvGoCreate + // and traceEvGoWaiting events (12~13 bytes per goroutine). 
+ for g := 4950; g < 5050; g++ { + n := g + t.Run("G="+strconv.Itoa(n), func(t *testing.T) { + var wg sync.WaitGroup + wg.Add(n) + + in := make(chan bool, 1000) + for i := 0; i < n; i++ { + go func() { + <-in + wg.Done() + }() + } + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + + for i := 0; i < n; i++ { + in <- true + } + wg.Wait() + Stop() + + _, err := trace.Parse(buf, "") + if err == trace.ErrTimeOrder { + t.Skipf("skipping trace: %v", err) + } + + if err != nil { + t.Fatalf("failed to parse trace: %v", err) + } + }) + } +} + +func TestTraceStartStop(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + Stop() + size := buf.Len() + if size == 0 { + t.Fatalf("trace is empty") + } + time.Sleep(100 * time.Millisecond) + if size != buf.Len() { + t.Fatalf("trace writes after stop: %v -> %v", size, buf.Len()) + } + saveTrace(t, buf, "TestTraceStartStop") +} + +func TestTraceDoubleStart(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + Stop() + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + if err := Start(buf); err == nil { + t.Fatalf("succeed to start tracing second time") + } + Stop() + Stop() +} + +func TestTrace(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + Stop() + saveTrace(t, buf, "TestTrace") + _, err := trace.Parse(buf, "") + if err == trace.ErrTimeOrder { + t.Skipf("skipping trace: %v", err) + } + if err != nil { + t.Fatalf("failed to parse trace: %v", err) + } +} + +func parseTrace(t *testing.T, r io.Reader) ([]*trace.Event, map[uint64]*trace.GDesc) { + res, err := trace.Parse(r, 
"") + if err == trace.ErrTimeOrder { + t.Skipf("skipping trace: %v", err) + } + if err != nil { + t.Fatalf("failed to parse trace: %v", err) + } + gs := trace.GoroutineStats(res.Events) + for goid := range gs { + // We don't do any particular checks on the result at the moment. + // But still check that RelatedGoroutines does not crash, hang, etc. + _ = trace.RelatedGoroutines(res.Events, goid) + } + return res.Events, gs +} + +func testBrokenTimestamps(t *testing.T, data []byte) { + // On some processors cputicks (used to generate trace timestamps) + // produce non-monotonic timestamps. It is important that the parser + // distinguishes logically inconsistent traces (e.g. missing, excessive + // or misordered events) from broken timestamps. The former is a bug + // in tracer, the latter is a machine issue. + // So now that we have a consistent trace, test that (1) parser does + // not return a logical error in case of broken timestamps + // and (2) broken timestamps are eventually detected and reported. + trace.BreakTimestampsForTesting = true + defer func() { + trace.BreakTimestampsForTesting = false + }() + for i := 0; i < 1e4; i++ { + _, err := trace.Parse(bytes.NewReader(data), "") + if err == trace.ErrTimeOrder { + return + } + if err != nil { + t.Fatalf("failed to parse trace: %v", err) + } + } +} + +func TestTraceStress(t *testing.T) { + if runtime.GOOS == "js" { + t.Skip("no os.Pipe on js") + } + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + if testing.Short() { + t.Skip("skipping in -short mode") + } + + var wg sync.WaitGroup + done := make(chan bool) + + // Create a goroutine blocked before tracing. + wg.Add(1) + go func() { + <-done + wg.Done() + }() + + // Create a goroutine blocked in syscall before tracing. 
+ rp, wp, err := os.Pipe() + if err != nil { + t.Fatalf("failed to create pipe: %v", err) + } + defer func() { + rp.Close() + wp.Close() + }() + wg.Add(1) + go func() { + var tmp [1]byte + rp.Read(tmp[:]) + <-done + wg.Done() + }() + time.Sleep(time.Millisecond) // give the goroutine above time to block + + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + + procs := runtime.GOMAXPROCS(10) + time.Sleep(50 * time.Millisecond) // test proc stop/start events + + go func() { + runtime.LockOSThread() + for { + select { + case <-done: + return + default: + runtime.Gosched() + } + } + }() + + runtime.GC() + // Trigger GC from malloc. + n := int(1e3) + if isMemoryConstrained() { + // Reduce allocation to avoid running out of + // memory on the builder - see issue/12032. + n = 512 + } + for i := 0; i < n; i++ { + _ = make([]byte, 1<<20) + } + + // Create a bunch of busy goroutines to load all Ps. + for p := 0; p < 10; p++ { + wg.Add(1) + go func() { + // Do something useful. + tmp := make([]byte, 1<<16) + for i := range tmp { + tmp[i]++ + } + _ = tmp + <-done + wg.Done() + }() + } + + // Block in syscall. + wg.Add(1) + go func() { + var tmp [1]byte + rp.Read(tmp[:]) + <-done + wg.Done() + }() + + // Test timers. + timerDone := make(chan bool) + go func() { + time.Sleep(time.Millisecond) + timerDone <- true + }() + <-timerDone + + // A bit of network. + ln, err := net.Listen("tcp", "127.0.0.1:0") + if err != nil { + t.Fatalf("listen failed: %v", err) + } + defer ln.Close() + go func() { + c, err := ln.Accept() + if err != nil { + return + } + time.Sleep(time.Millisecond) + var buf [1]byte + c.Write(buf[:]) + c.Close() + }() + c, err := net.Dial("tcp", ln.Addr().String()) + if err != nil { + t.Fatalf("dial failed: %v", err) + } + var tmp [1]byte + c.Read(tmp[:]) + c.Close() + + go func() { + runtime.Gosched() + select {} + }() + + // Unblock helper goroutines and wait them to finish. 
+ wp.Write(tmp[:]) + wp.Write(tmp[:]) + close(done) + wg.Wait() + + runtime.GOMAXPROCS(procs) + + Stop() + saveTrace(t, buf, "TestTraceStress") + trace := buf.Bytes() + parseTrace(t, buf) + testBrokenTimestamps(t, trace) +} + +// isMemoryConstrained reports whether the current machine is likely +// to be memory constrained. +// This was originally for the openbsd/arm builder (Issue 12032). +// TODO: move this to testenv? Make this look at memory? Look at GO_BUILDER_NAME? +func isMemoryConstrained() bool { + if runtime.GOOS == "plan9" { + return true + } + switch runtime.GOARCH { + case "arm", "mips", "mipsle": + return true + } + return false +} + +// Do a bunch of various stuff (timers, GC, network, etc) in a separate goroutine. +// And concurrently with all that start/stop trace 3 times. +func TestTraceStressStartStop(t *testing.T) { + if runtime.GOOS == "js" { + t.Skip("no os.Pipe on js") + } + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(8)) + outerDone := make(chan bool) + + go func() { + defer func() { + outerDone <- true + }() + + var wg sync.WaitGroup + done := make(chan bool) + + wg.Add(1) + go func() { + <-done + wg.Done() + }() + + rp, wp, err := os.Pipe() + if err != nil { + t.Errorf("failed to create pipe: %v", err) + return + } + defer func() { + rp.Close() + wp.Close() + }() + wg.Add(1) + go func() { + var tmp [1]byte + rp.Read(tmp[:]) + <-done + wg.Done() + }() + time.Sleep(time.Millisecond) + + go func() { + runtime.LockOSThread() + for { + select { + case <-done: + return + default: + runtime.Gosched() + } + } + }() + + runtime.GC() + // Trigger GC from malloc. + n := int(1e3) + if isMemoryConstrained() { + // Reduce allocation to avoid running out of + // memory on the builder. + n = 512 + } + for i := 0; i < n; i++ { + _ = make([]byte, 1<<20) + } + + // Create a bunch of busy goroutines to load all Ps. 
+ for p := 0; p < 10; p++ { + wg.Add(1) + go func() { + // Do something useful. + tmp := make([]byte, 1<<16) + for i := range tmp { + tmp[i]++ + } + _ = tmp + <-done + wg.Done() + }() + } + + // Block in syscall. + wg.Add(1) + go func() { + var tmp [1]byte + rp.Read(tmp[:]) + <-done + wg.Done() + }() + + runtime.GOMAXPROCS(runtime.GOMAXPROCS(1)) + + // Test timers. + timerDone := make(chan bool) + go func() { + time.Sleep(time.Millisecond) + timerDone <- true + }() + <-timerDone + + // A bit of network. + ln, err := net.Listen("tcp", "127.0.0.1:0") + if err != nil { + t.Errorf("listen failed: %v", err) + return + } + defer ln.Close() + go func() { + c, err := ln.Accept() + if err != nil { + return + } + time.Sleep(time.Millisecond) + var buf [1]byte + c.Write(buf[:]) + c.Close() + }() + c, err := net.Dial("tcp", ln.Addr().String()) + if err != nil { + t.Errorf("dial failed: %v", err) + return + } + var tmp [1]byte + c.Read(tmp[:]) + c.Close() + + go func() { + runtime.Gosched() + select {} + }() + + // Unblock helper goroutines and wait them to finish. 
+ wp.Write(tmp[:]) + wp.Write(tmp[:]) + close(done) + wg.Wait() + }() + + for i := 0; i < 3; i++ { + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + time.Sleep(time.Millisecond) + Stop() + saveTrace(t, buf, "TestTraceStressStartStop") + trace := buf.Bytes() + parseTrace(t, buf) + testBrokenTimestamps(t, trace) + } + <-outerDone +} + +func TestTraceFutileWakeup(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + + defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(8)) + c0 := make(chan int, 1) + c1 := make(chan int, 1) + c2 := make(chan int, 1) + const procs = 2 + var done sync.WaitGroup + done.Add(4 * procs) + for p := 0; p < procs; p++ { + const iters = 1e3 + go func() { + for i := 0; i < iters; i++ { + runtime.Gosched() + c0 <- 0 + } + done.Done() + }() + go func() { + for i := 0; i < iters; i++ { + runtime.Gosched() + <-c0 + } + done.Done() + }() + go func() { + for i := 0; i < iters; i++ { + runtime.Gosched() + select { + case c1 <- 0: + case c2 <- 0: + } + } + done.Done() + }() + go func() { + for i := 0; i < iters; i++ { + runtime.Gosched() + select { + case <-c1: + case <-c2: + } + } + done.Done() + }() + } + done.Wait() + + Stop() + saveTrace(t, buf, "TestTraceFutileWakeup") + events, _ := parseTrace(t, buf) + // Check that (1) trace does not contain EvFutileWakeup events and + // (2) there are no consecutive EvGoBlock/EvGoStart/EvGoBlock events + // (we call runtime.Gosched between all operations, so these would be futile wakeups). 
+ gs := make(map[uint64]int) + for _, ev := range events { + switch ev.Type { + case trace.EvFutileWakeup: + t.Fatalf("found EvFutileWakeup event") + case trace.EvGoBlockSend, trace.EvGoBlockRecv, trace.EvGoBlockSelect: + if gs[ev.G] == 2 { + t.Fatalf("goroutine %v blocked on %v at %v right after start", + ev.G, trace.EventDescriptions[ev.Type].Name, ev.Ts) + } + if gs[ev.G] == 1 { + t.Fatalf("goroutine %v blocked on %v at %v while blocked", + ev.G, trace.EventDescriptions[ev.Type].Name, ev.Ts) + } + gs[ev.G] = 1 + case trace.EvGoStart: + if gs[ev.G] == 1 { + gs[ev.G] = 2 + } + default: + delete(gs, ev.G) + } + } +} + +func TestTraceCPUProfile(t *testing.T) { + if IsEnabled() { + t.Skip("skipping because -test.trace is set") + } + + cpuBuf := new(bytes.Buffer) + if err := pprof.StartCPUProfile(cpuBuf); err != nil { + t.Skipf("failed to start CPU profile: %v", err) + } + + buf := new(bytes.Buffer) + if err := Start(buf); err != nil { + t.Fatalf("failed to start tracing: %v", err) + } + + dur := 100 * time.Millisecond + func() { + // Create a region in the execution trace. Set and clear goroutine + // labels fully within that region, so we know that any CPU profile + // sample with the label must also be eligible for inclusion in the + // execution trace. + ctx := context.Background() + defer StartRegion(ctx, "cpuHogger").End() + pprof.Do(ctx, pprof.Labels("tracing", "on"), func(ctx context.Context) { + cpuHogger(cpuHog1, &salt1, dur) + }) + // Be sure the execution trace's view, when filtered to this goroutine + // via the explicit goroutine ID in each event, gets many more samples + // than the CPU profiler when filtered to this goroutine via labels. + cpuHogger(cpuHog1, &salt1, dur) + }() + + Stop() + pprof.StopCPUProfile() + saveTrace(t, buf, "TestTraceCPUProfile") + + prof, err := profile.Parse(cpuBuf) + if err != nil { + t.Fatalf("failed to parse CPU profile: %v", err) + } + // Examine the CPU profiler's view. 
Filter it to only include samples from + // the single test goroutine. Use labels to execute that filter: they should + // apply to all work done while that goroutine is getg().m.curg, and they + // should apply to no other goroutines. + pprofSamples := 0 + pprofStacks := make(map[string]int) + for _, s := range prof.Sample { + if s.Label["tracing"] != nil { + var fns []string + var leaf string + for _, loc := range s.Location { + for _, line := range loc.Line { + fns = append(fns, fmt.Sprintf("%s:%d", line.Function.Name, line.Line)) + leaf = line.Function.Name + } + } + // runtime.sigprof synthesizes call stacks when "normal traceback is + // impossible or has failed", using particular placeholder functions + // to represent common failure cases. Look for those functions in + // the leaf position as a sign that the call stack and its + // symbolization are more complex than this test can handle. + // + // TODO: Make the symbolization done by the execution tracer and CPU + // profiler match up even in these harder cases. See #53378. + switch leaf { + case "runtime._System", "runtime._GC", "runtime._ExternalCode", "runtime._VDSO": + continue + } + stack := strings.Join(fns, " ") + samples := int(s.Value[0]) + pprofSamples += samples + pprofStacks[stack] += samples + } + } + if pprofSamples == 0 { + t.Skipf("CPU profile did not include any samples while tracing was active\n%s", prof) + } + + // Examine the execution tracer's view of the CPU profile samples. Filter it + // to only include samples from the single test goroutine. Use the goroutine + // ID that was recorded in the events: that should reflect getg().m.curg, + // same as the profiler's labels (even when the M is using its g0 stack). 
+ totalTraceSamples := 0 + traceSamples := 0 + traceStacks := make(map[string]int) + events, _ := parseTrace(t, buf) + var hogRegion *trace.Event + for _, ev := range events { + if ev.Type == trace.EvUserRegion && ev.Args[1] == 0 && ev.SArgs[0] == "cpuHogger" { + // mode "0" indicates region start + hogRegion = ev + } + } + if hogRegion == nil { + t.Fatalf("execution trace did not identify cpuHogger goroutine") + } else if hogRegion.Link == nil { + t.Fatalf("execution trace did not close cpuHogger region") + } + for _, ev := range events { + if ev.Type == trace.EvCPUSample { + totalTraceSamples++ + if ev.G == hogRegion.G { + traceSamples++ + var fns []string + for _, frame := range ev.Stk { + if frame.Fn != "runtime.goexit" { + fns = append(fns, fmt.Sprintf("%s:%d", frame.Fn, frame.Line)) + } + } + stack := strings.Join(fns, " ") + traceStacks[stack]++ + } + } + } + + // The execution trace may drop CPU profile samples if the profiling buffer + // overflows. Based on the size of profBufWordCount, that takes a bit over + // 1900 CPU samples or 19 thread-seconds at a 100 Hz sample rate. If we've + // hit that case, then we definitely have at least one full buffer's worth + // of CPU samples, so we'll call that success. 
+ overflowed := totalTraceSamples >= 1900 + if traceSamples < pprofSamples { + t.Logf("execution trace did not include all CPU profile samples; %d in profile, %d in trace", pprofSamples, traceSamples) + if !overflowed { + t.Fail() + } + } + + for stack, traceSamples := range traceStacks { + pprofSamples := pprofStacks[stack] + delete(pprofStacks, stack) + if traceSamples < pprofSamples { + t.Logf("execution trace did not include all CPU profile samples for stack %q; %d in profile, %d in trace", + stack, pprofSamples, traceSamples) + if !overflowed { + t.Fail() + } + } + } + for stack, pprofSamples := range pprofStacks { + t.Logf("CPU profile included %d samples at stack %q not present in execution trace", pprofSamples, stack) + if !overflowed { + t.Fail() + } + } + + if t.Failed() { + t.Logf("execution trace CPU samples:") + for stack, samples := range traceStacks { + t.Logf("%d: %q", samples, stack) + } + t.Logf("CPU profile:\n%v", prof) + } +} + +func cpuHogger(f func(x int) int, y *int, dur time.Duration) { + // We only need to get one 100 Hz clock tick, so we've got + // a large safety buffer. + // But do at least 500 iterations (which should take about 100ms), + // otherwise TestCPUProfileMultithreaded can fail if only one + // thread is scheduled during the testing period. + t0 := time.Now() + accum := *y + for i := 0; i < 500 || time.Since(t0) < dur; i++ { + accum = f(accum) + } + *y = accum +} + +var ( + salt1 = 0 +) + +// The actual CPU hogging function. +// Must not call other functions nor access heap/globals in the loop, +// otherwise under race detector the samples will be in the race runtime. 
+func cpuHog1(x int) int { + return cpuHog0(x, 1e5) +} + +func cpuHog0(x, n int) int { + foo := x + for i := 0; i < n; i++ { + if i%1000 == 0 { + // Spend time in mcall, stored as gp.m.curg, with g0 running + runtime.Gosched() + } + if foo > 0 { + foo *= foo + } else { + foo *= foo + 1 + } + } + return foo +} + +func saveTrace(t *testing.T, buf *bytes.Buffer, name string) { + if !*saveTraces { + return + } + if err := os.WriteFile(name+".trace", buf.Bytes(), 0600); err != nil { + t.Errorf("failed to write trace file: %s", err) + } +} diff --git a/src/runtime/traceback.go b/src/runtime/traceback.go new file mode 100644 index 0000000..37f35d5 --- /dev/null +++ b/src/runtime/traceback.go @@ -0,0 +1,1377 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "internal/bytealg" + "internal/goarch" + "runtime/internal/sys" + "unsafe" +) + +// The code in this file implements stack trace walking for all architectures. +// The most important fact about a given architecture is whether it uses a link register. +// On systems with link registers, the prologue for a non-leaf function stores the +// incoming value of LR at the bottom of the newly allocated stack frame. +// On systems without link registers (x86), the architecture pushes a return PC during +// the call instruction, so the return PC ends up above the stack frame. +// In this file, the return PC is always called LR, no matter how it was found. + +const usesLR = sys.MinFrameSize > 0 + +// Generic traceback. Handles runtime stack prints (pcbuf == nil), +// the runtime.Callers function (pcbuf != nil), as well as the garbage +// collector (callback != nil). A little clunky to merge these, but avoids +// duplicating the code and all its subtlety. 
+// +// The skip argument is only valid with pcbuf != nil and counts the number +// of logical frames to skip rather than physical frames (with inlining, a +// PC in pcbuf can represent multiple calls). +func gentraceback(pc0, sp0, lr0 uintptr, gp *g, skip int, pcbuf *uintptr, max int, callback func(*stkframe, unsafe.Pointer) bool, v unsafe.Pointer, flags uint) int { + if skip > 0 && callback != nil { + throw("gentraceback callback cannot be used with non-zero skip") + } + + // Don't call this "g"; it's too easy to get "g" and "gp" confused. + if ourg := getg(); ourg == gp && ourg == ourg.m.curg { + // The starting sp has been passed in as a uintptr, and the caller may + // have other uintptr-typed stack references as well. + // If during one of the calls that got us here or during one of the + // callbacks below the stack must be grown, all these uintptr references + // to the stack will not be updated, and gentraceback will continue + // to inspect the old stack memory, which may no longer be valid. + // Even if all the variables were updated correctly, it is not clear that + // we want to expose a traceback that begins on one stack and ends + // on another stack. That could confuse callers quite a bit. + // Instead, we require that gentraceback and any other function that + // accepts an sp for the current goroutine (typically obtained by + // calling getcallersp) must not run on that goroutine's stack but + // instead on the g0 stack. + throw("gentraceback cannot trace user goroutine on its own stack") + } + level, _, _ := gotraceback() + + if pc0 == ^uintptr(0) && sp0 == ^uintptr(0) { // Signal to fetch saved values from gp. 
+ if gp.syscallsp != 0 { + pc0 = gp.syscallpc + sp0 = gp.syscallsp + if usesLR { + lr0 = 0 + } + } else { + pc0 = gp.sched.pc + sp0 = gp.sched.sp + if usesLR { + lr0 = gp.sched.lr + } + } + } + + nprint := 0 + var frame stkframe + frame.pc = pc0 + frame.sp = sp0 + if usesLR { + frame.lr = lr0 + } + waspanic := false + cgoCtxt := gp.cgoCtxt + stack := gp.stack + printing := pcbuf == nil && callback == nil + + // If the PC is zero, it's likely a nil function call. + // Start in the caller's frame. + if frame.pc == 0 { + if usesLR { + frame.pc = *(*uintptr)(unsafe.Pointer(frame.sp)) + frame.lr = 0 + } else { + frame.pc = uintptr(*(*uintptr)(unsafe.Pointer(frame.sp))) + frame.sp += goarch.PtrSize + } + } + + // runtime/internal/atomic functions call into kernel helpers on + // arm < 7. See runtime/internal/atomic/sys_linux_arm.s. + // + // Start in the caller's frame. + if GOARCH == "arm" && goarm < 7 && GOOS == "linux" && frame.pc&0xffff0000 == 0xffff0000 { + // Note that the calls are simple BL without pushing the return + // address, so we use LR directly. + // + // The kernel helpers are frameless leaf functions, so SP and + // LR are not touched. + frame.pc = frame.lr + frame.lr = 0 + } + + f := findfunc(frame.pc) + if !f.valid() { + if callback != nil || printing { + print("runtime: g ", gp.goid, ": unknown pc ", hex(frame.pc), "\n") + tracebackHexdump(stack, &frame, 0) + } + if callback != nil { + throw("unknown pc") + } + return 0 + } + frame.fn = f + + var cache pcvalueCache + + lastFuncID := funcID_normal + n := 0 + for n < max { + // Typically: + // pc is the PC of the running function. + // sp is the stack pointer at that program counter. + // fp is the frame pointer (caller's stack pointer) at that program counter, or nil if unknown. + // stk is the stack containing sp. + // The caller's program counter is lr, unless lr is zero, in which case it is *(uintptr*)sp. 
+ f = frame.fn + if f.pcsp == 0 { + // No frame information, must be external function, like race support. + // See golang.org/issue/13568. + break + } + + // Compute function info flags. + flag := f.flag + if f.funcID == funcID_cgocallback { + // cgocallback does write SP to switch from the g0 to the curg stack, + // but it carefully arranges that during the transition BOTH stacks + // have cgocallback frame valid for unwinding through. + // So we don't need to exclude it with the other SP-writing functions. + flag &^= funcFlag_SPWRITE + } + if frame.pc == pc0 && frame.sp == sp0 && pc0 == gp.syscallpc && sp0 == gp.syscallsp { + // Some Syscall functions write to SP, but they do so only after + // saving the entry PC/SP using entersyscall. + // Since we are using the entry PC/SP, the later SP write doesn't matter. + flag &^= funcFlag_SPWRITE + } + + // Found an actual function. + // Derive frame pointer and link register. + if frame.fp == 0 { + // Jump over system stack transitions. If we're on g0 and there's a user + // goroutine, try to jump. Otherwise this is a regular call. + // We also defensively check that this won't switch M's on us, + // which could happen at critical points in the scheduler. + // This ensures gp.m doesn't change from a stack jump. + if flags&_TraceJumpStack != 0 && gp == gp.m.g0 && gp.m.curg != nil && gp.m.curg.m == gp.m { + switch f.funcID { + case funcID_morestack: + // morestack does not return normally -- newstack() + // gogo's to curg.sched. Match that. + // This keeps morestack() from showing up in the backtrace, + // but that makes some sense since it'll never be returned + // to. + gp = gp.m.curg + frame.pc = gp.sched.pc + frame.fn = findfunc(frame.pc) + f = frame.fn + flag = f.flag + frame.lr = gp.sched.lr + frame.sp = gp.sched.sp + stack = gp.stack + cgoCtxt = gp.cgoCtxt + case funcID_systemstack: + // systemstack returns normally, so just follow the + // stack transition. 
+ if usesLR && funcspdelta(f, frame.pc, &cache) == 0 { + // We're at the function prologue and the stack + // switch hasn't happened, or epilogue where we're + // about to return. Just unwind normally. + // Do this only on LR machines because on x86 + // systemstack doesn't have an SP delta (the CALL + // instruction opens the frame), therefore no way + // to check. + flag &^= funcFlag_SPWRITE + break + } + gp = gp.m.curg + frame.sp = gp.sched.sp + stack = gp.stack + cgoCtxt = gp.cgoCtxt + flag &^= funcFlag_SPWRITE + } + } + frame.fp = frame.sp + uintptr(funcspdelta(f, frame.pc, &cache)) + if !usesLR { + // On x86, call instruction pushes return PC before entering new function. + frame.fp += goarch.PtrSize + } + } + var flr funcInfo + if flag&funcFlag_TOPFRAME != 0 { + // This function marks the top of the stack. Stop the traceback. + frame.lr = 0 + flr = funcInfo{} + } else if flag&funcFlag_SPWRITE != 0 && (callback == nil || n > 0) { + // The function we are in does a write to SP that we don't know + // how to encode in the spdelta table. Examples include context + // switch routines like runtime.gogo but also any code that switches + // to the g0 stack to run host C code. Since we can't reliably unwind + // the SP (we might not even be on the stack we think we are), + // we stop the traceback here. + // This only applies for profiling signals (callback == nil). + // + // For a GC stack traversal (callback != nil), we should only see + // a function when it has voluntarily preempted itself on entry + // during the stack growth check. In that case, the function has + // not yet had a chance to do any writes to SP and is safe to unwind. + // isAsyncSafePoint does not allow assembly functions to be async preempted, + // and preemptPark double-checks that SPWRITE functions are not async preempted. + // So for GC stack traversal we leave things alone (this if body does not execute for n == 0) + // at the bottom frame of the stack. 
But farther up the stack we'd better not + // find any. + if callback != nil { + println("traceback: unexpected SPWRITE function", funcname(f)) + throw("traceback") + } + frame.lr = 0 + flr = funcInfo{} + } else { + var lrPtr uintptr + if usesLR { + if n == 0 && frame.sp < frame.fp || frame.lr == 0 { + lrPtr = frame.sp + frame.lr = *(*uintptr)(unsafe.Pointer(lrPtr)) + } + } else { + if frame.lr == 0 { + lrPtr = frame.fp - goarch.PtrSize + frame.lr = uintptr(*(*uintptr)(unsafe.Pointer(lrPtr))) + } + } + flr = findfunc(frame.lr) + if !flr.valid() { + // This happens if you get a profiling interrupt at just the wrong time. + // In that context it is okay to stop early. + // But if callback is set, we're doing a garbage collection and must + // get everything, so crash loudly. + doPrint := printing + if doPrint && gp.m.incgo && f.funcID == funcID_sigpanic { + // We can inject sigpanic + // calls directly into C code, + // in which case we'll see a C + // return PC. Don't complain. + doPrint = false + } + if callback != nil || doPrint { + print("runtime: g ", gp.goid, ": unexpected return pc for ", funcname(f), " called from ", hex(frame.lr), "\n") + tracebackHexdump(stack, &frame, lrPtr) + } + if callback != nil { + throw("unknown caller pc") + } + } + } + + frame.varp = frame.fp + if !usesLR { + // On x86, call instruction pushes return PC before entering new function. + frame.varp -= goarch.PtrSize + } + + // For architectures with frame pointers, if there's + // a frame, then there's a saved frame pointer here. + // + // NOTE: This code is not as general as it looks. + // On x86, the ABI is to save the frame pointer word at the + // top of the stack frame, so we have to back down over it. + // On arm64, the frame pointer should be at the bottom of + // the stack (with R29 (aka FP) = RSP), in which case we would + // not want to do the subtraction here. 
But we started out without + // any frame pointer, and when we wanted to add it, we didn't + // want to break all the assembly doing direct writes to 8(RSP) + // to set the first parameter to a called function. + // So we decided to write the FP link *below* the stack pointer + // (with R29 = RSP - 8 in Go functions). + // This is technically ABI-compatible but not standard. + // And it happens to end up mimicking the x86 layout. + // Other architectures may make different decisions. + if frame.varp > frame.sp && framepointer_enabled { + frame.varp -= goarch.PtrSize + } + + frame.argp = frame.fp + sys.MinFrameSize + + // Determine frame's 'continuation PC', where it can continue. + // Normally this is the return address on the stack, but if sigpanic + // is immediately below this function on the stack, then the frame + // stopped executing due to a trap, and frame.pc is probably not + // a safe point for looking up liveness information. In this panicking case, + // the function either doesn't return at all (if it has no defers or if the + // defers do not recover) or it returns from one of the calls to + // deferproc a second time (if the corresponding deferred func recovers). + // In the latter case, use a deferreturn call site as the continuation pc. + frame.continpc = frame.pc + if waspanic { + if frame.fn.deferreturn != 0 { + frame.continpc = frame.fn.entry() + uintptr(frame.fn.deferreturn) + 1 + // Note: this may perhaps keep return variables alive longer than + // strictly necessary, as we are using "function has a defer statement" + // as a proxy for "function actually deferred something". It seems + // to be a minor drawback. (We used to actually look through the + // gp._defer for a defer corresponding to this function, but that + // is hard to do with defer records on the stack during a stack copy.) + // Note: the +1 is to offset the -1 that + // stack.go:getStackMap does to back up a return + // address to make sure the pc is in the CALL instruction. 
+ } else { + frame.continpc = 0 + } + } + + if callback != nil { + if !callback((*stkframe)(noescape(unsafe.Pointer(&frame))), v) { + return n + } + } + + if pcbuf != nil { + pc := frame.pc + // backup to CALL instruction to read inlining info (same logic as below) + tracepc := pc + // Normally, pc is a return address. In that case, we want to look up + // file/line information using pc-1, because that is the pc of the + // call instruction (more precisely, the last byte of the call instruction). + // Callers expect the pc buffer to contain return addresses and do the + // same -1 themselves, so we keep pc unchanged. + // When the pc is from a signal (e.g. profiler or segv) then we want + // to look up file/line information using pc, and we store pc+1 in the + // pc buffer so callers can unconditionally subtract 1 before looking up. + // See issue 34123. + // The pc can be at function entry when the frame is initialized without + // actually running code, like runtime.mstart. + if (n == 0 && flags&_TraceTrap != 0) || waspanic || pc == f.entry() { + pc++ + } else { + tracepc-- + } + + // If there is inlining info, record the inner frames. + if inldata := funcdata(f, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + for { + ix := pcdatavalue(f, _PCDATA_InlTreeIndex, tracepc, &cache) + if ix < 0 { + break + } + if inltree[ix].funcID == funcID_wrapper && elideWrapperCalling(lastFuncID) { + // ignore wrappers + } else if skip > 0 { + skip-- + } else if n < max { + (*[1 << 20]uintptr)(unsafe.Pointer(pcbuf))[n] = pc + n++ + } + lastFuncID = inltree[ix].funcID + // Back up to an instruction in the "caller". + tracepc = frame.fn.entry() + uintptr(inltree[ix].parentPc) + pc = tracepc + 1 + } + } + // Record the main frame. + if f.funcID == funcID_wrapper && elideWrapperCalling(lastFuncID) { + // Ignore wrapper functions (except when they trigger panics). 
+ } else if skip > 0 { + skip-- + } else if n < max { + (*[1 << 20]uintptr)(unsafe.Pointer(pcbuf))[n] = pc + n++ + } + lastFuncID = f.funcID + n-- // offset n++ below + } + + if printing { + // assume skip=0 for printing. + // + // Never elide wrappers if we haven't printed + // any frames. And don't elide wrappers that + // called panic rather than the wrapped + // function. Otherwise, leave them out. + + // backup to CALL instruction to read inlining info (same logic as below) + tracepc := frame.pc + if (n > 0 || flags&_TraceTrap == 0) && frame.pc > f.entry() && !waspanic { + tracepc-- + } + // If there is inlining info, print the inner frames. + if inldata := funcdata(f, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + var inlFunc _func + inlFuncInfo := funcInfo{&inlFunc, f.datap} + for { + ix := pcdatavalue(f, _PCDATA_InlTreeIndex, tracepc, nil) + if ix < 0 { + break + } + + // Create a fake _func for the + // inlined function. + inlFunc.nameOff = inltree[ix].nameOff + inlFunc.funcID = inltree[ix].funcID + inlFunc.startLine = inltree[ix].startLine + + if (flags&_TraceRuntimeFrames) != 0 || showframe(inlFuncInfo, gp, nprint == 0, inlFuncInfo.funcID, lastFuncID) { + name := funcname(inlFuncInfo) + file, line := funcline(f, tracepc) + print(name, "(...)\n") + print("\t", file, ":", line, "\n") + nprint++ + } + lastFuncID = inltree[ix].funcID + // Back up to an instruction in the "caller". + tracepc = frame.fn.entry() + uintptr(inltree[ix].parentPc) + } + } + if (flags&_TraceRuntimeFrames) != 0 || showframe(f, gp, nprint == 0, f.funcID, lastFuncID) { + // Print during crash. 
+ // main(0x1, 0x2, 0x3) + // /home/rsc/go/src/runtime/x.go:23 +0xf + // + name := funcname(f) + file, line := funcline(f, tracepc) + if name == "runtime.gopanic" { + name = "panic" + } + print(name, "(") + argp := unsafe.Pointer(frame.argp) + printArgs(f, argp, tracepc) + print(")\n") + print("\t", file, ":", line) + if frame.pc > f.entry() { + print(" +", hex(frame.pc-f.entry())) + } + if gp.m != nil && gp.m.throwing >= throwTypeRuntime && gp == gp.m.curg || level >= 2 { + print(" fp=", hex(frame.fp), " sp=", hex(frame.sp), " pc=", hex(frame.pc)) + } + print("\n") + nprint++ + } + lastFuncID = f.funcID + } + n++ + + if f.funcID == funcID_cgocallback && len(cgoCtxt) > 0 { + ctxt := cgoCtxt[len(cgoCtxt)-1] + cgoCtxt = cgoCtxt[:len(cgoCtxt)-1] + + // skip only applies to Go frames. + // callback != nil only used when we only care + // about Go frames. + if skip == 0 && callback == nil { + n = tracebackCgoContext(pcbuf, printing, ctxt, n, max) + } + } + + waspanic = f.funcID == funcID_sigpanic + injectedCall := waspanic || f.funcID == funcID_asyncPreempt || f.funcID == funcID_debugCallV2 + + // Do not unwind past the bottom of the stack. + if !flr.valid() { + break + } + + if frame.pc == frame.lr && frame.sp == frame.fp { + // If the next frame is identical to the current frame, we cannot make progress. + print("runtime: traceback stuck. pc=", hex(frame.pc), " sp=", hex(frame.sp), "\n") + tracebackHexdump(stack, &frame, frame.sp) + throw("traceback stuck") + } + + // Unwind to next frame. + frame.fn = flr + frame.pc = frame.lr + frame.lr = 0 + frame.sp = frame.fp + frame.fp = 0 + + // On link register architectures, sighandler saves the LR on stack + // before faking a call. 
+ if usesLR && injectedCall { + x := *(*uintptr)(unsafe.Pointer(frame.sp)) + frame.sp += alignUp(sys.MinFrameSize, sys.StackAlign) + f = findfunc(frame.pc) + frame.fn = f + if !f.valid() { + frame.pc = x + } else if funcspdelta(f, frame.pc, &cache) == 0 { + frame.lr = x + } + } + } + + if printing { + n = nprint + } + + // Note that panic != nil is okay here: there can be leftover panics, + // because the defers on the panic stack do not nest in frame order as + // they do on the defer stack. If you have: + // + // frame 1 defers d1 + // frame 2 defers d2 + // frame 3 defers d3 + // frame 4 panics + // frame 4's panic starts running defers + // frame 5, running d3, defers d4 + // frame 5 panics + // frame 5's panic starts running defers + // frame 6, running d4, garbage collects + // frame 6, running d2, garbage collects + // + // During the execution of d4, the panic stack is d4 -> d3, which + // is nested properly, and we'll treat frame 3 as resumable, because we + // can find d3. (And in fact frame 3 is resumable. If d4 recovers + // and frame 5 continues running, d3, d3 can recover and we'll + // resume execution in (returning from) frame 3.) + // + // During the execution of d2, however, the panic stack is d2 -> d3, + // which is inverted. The scan will match d2 to frame 2 but having + // d2 on the stack until then means it will not match d3 to frame 3. + // This is okay: if we're running d2, then all the defers after d2 have + // completed and their corresponding frames are dead. Not finding d3 + // for frame 3 means we'll set frame 3's continpc == 0, which is correct + // (frame 3 is dead). At the end of the walk the panic stack can thus + // contain defers (d3 in this case) for dead frames. The inversion here + // always indicates a dead frame, and the effect of the inversion on the + // scan is to hide those dead frames, so the scan is still okay: + // what's left on the panic stack are exactly (and only) the dead frames. 
+ // + // We require callback != nil here because only when callback != nil + // do we know that gentraceback is being called in a "must be correct" + // context as opposed to a "best effort" context. The tracebacks with + // callbacks only happen when everything is stopped nicely. + // At other times, such as when gathering a stack for a profiling signal + // or when printing a traceback during a crash, everything may not be + // stopped nicely, and the stack walk may not be able to complete. + if callback != nil && n < max && frame.sp != gp.stktopsp { + print("runtime: g", gp.goid, ": frame.sp=", hex(frame.sp), " top=", hex(gp.stktopsp), "\n") + print("\tstack=[", hex(gp.stack.lo), "-", hex(gp.stack.hi), "] n=", n, " max=", max, "\n") + throw("traceback did not unwind completely") + } + + return n +} + +// printArgs prints function arguments in traceback. +func printArgs(f funcInfo, argp unsafe.Pointer, pc uintptr) { + // The "instruction" of argument printing is encoded in _FUNCDATA_ArgInfo. + // See cmd/compile/internal/ssagen.emitArgInfo for the description of the + // encoding. + // These constants need to be in sync with the compiler. 
+ const ( + _endSeq = 0xff + _startAgg = 0xfe + _endAgg = 0xfd + _dotdotdot = 0xfc + _offsetTooLarge = 0xfb + ) + + const ( + limit = 10 // print no more than 10 args/components + maxDepth = 5 // no more than 5 layers of nesting + maxLen = (maxDepth*3+2)*limit + 1 // max length of _FUNCDATA_ArgInfo (see the compiler side for reasoning) + ) + + p := (*[maxLen]uint8)(funcdata(f, _FUNCDATA_ArgInfo)) + if p == nil { + return + } + + liveInfo := funcdata(f, _FUNCDATA_ArgLiveInfo) + liveIdx := pcdatavalue(f, _PCDATA_ArgLiveIndex, pc, nil) + startOffset := uint8(0xff) // smallest offset that needs liveness info (slots with a lower offset is always live) + if liveInfo != nil { + startOffset = *(*uint8)(liveInfo) + } + + isLive := func(off, slotIdx uint8) bool { + if liveInfo == nil || liveIdx <= 0 { + return true // no liveness info, always live + } + if off < startOffset { + return true + } + bits := *(*uint8)(add(liveInfo, uintptr(liveIdx)+uintptr(slotIdx/8))) + return bits&(1<<(slotIdx%8)) != 0 + } + + print1 := func(off, sz, slotIdx uint8) { + x := readUnaligned64(add(argp, uintptr(off))) + // mask out irrelevant bits + if sz < 8 { + shift := 64 - sz*8 + if goarch.BigEndian { + x = x >> shift + } else { + x = x << shift >> shift + } + } + print(hex(x)) + if !isLive(off, slotIdx) { + print("?") + } + } + + start := true + printcomma := func() { + if !start { + print(", ") + } + } + pi := 0 + slotIdx := uint8(0) // register arg spill slot index +printloop: + for { + o := p[pi] + pi++ + switch o { + case _endSeq: + break printloop + case _startAgg: + printcomma() + print("{") + start = true + continue + case _endAgg: + print("}") + case _dotdotdot: + printcomma() + print("...") + case _offsetTooLarge: + printcomma() + print("_") + default: + printcomma() + sz := p[pi] + pi++ + print1(o, sz, slotIdx) + if o >= startOffset { + slotIdx++ + } + } + start = false + } +} + +// tracebackCgoContext handles tracing back a cgo context value, from +// the context argument to 
setCgoTraceback, for the gentraceback +// function. It returns the new value of n. +func tracebackCgoContext(pcbuf *uintptr, printing bool, ctxt uintptr, n, max int) int { + var cgoPCs [32]uintptr + cgoContextPCs(ctxt, cgoPCs[:]) + var arg cgoSymbolizerArg + anySymbolized := false + for _, pc := range cgoPCs { + if pc == 0 || n >= max { + break + } + if pcbuf != nil { + (*[1 << 20]uintptr)(unsafe.Pointer(pcbuf))[n] = pc + } + if printing { + if cgoSymbolizer == nil { + print("non-Go function at pc=", hex(pc), "\n") + } else { + c := printOneCgoTraceback(pc, max-n, &arg) + n += c - 1 // +1 a few lines down + anySymbolized = true + } + } + n++ + } + if anySymbolized { + arg.pc = 0 + callCgoSymbolizer(&arg) + } + return n +} + +func printcreatedby(gp *g) { + // Show what created goroutine, except main goroutine (goid 1). + pc := gp.gopc + f := findfunc(pc) + if f.valid() && showframe(f, gp, false, funcID_normal, funcID_normal) && gp.goid != 1 { + printcreatedby1(f, pc) + } +} + +func printcreatedby1(f funcInfo, pc uintptr) { + print("created by ", funcname(f), "\n") + tracepc := pc // back up to CALL instruction for funcline. + if pc > f.entry() { + tracepc -= sys.PCQuantum + } + file, line := funcline(f, tracepc) + print("\t", file, ":", line) + if pc > f.entry() { + print(" +", hex(pc-f.entry())) + } + print("\n") +} + +func traceback(pc, sp, lr uintptr, gp *g) { + traceback1(pc, sp, lr, gp, 0) +} + +// tracebacktrap is like traceback but expects that the PC and SP were obtained +// from a trap, not from gp->sched or gp->syscallpc/gp->syscallsp or getcallerpc/getcallersp. +// Because they are from a trap instead of from a saved pair, +// the initial PC must not be rewound to the previous instruction. +// (All the saved pairs record a PC that is a return address, so we +// rewind it into the CALL instruction.) +// If gp.m.libcall{g,pc,sp} information is available, it uses that information in preference to +// the pc/sp/lr passed in. 
+func tracebacktrap(pc, sp, lr uintptr, gp *g) { + if gp.m.libcallsp != 0 { + // We're in C code somewhere, traceback from the saved position. + traceback1(gp.m.libcallpc, gp.m.libcallsp, 0, gp.m.libcallg.ptr(), 0) + return + } + traceback1(pc, sp, lr, gp, _TraceTrap) +} + +func traceback1(pc, sp, lr uintptr, gp *g, flags uint) { + // If the goroutine is in cgo, and we have a cgo traceback, print that. + if iscgo && gp.m != nil && gp.m.ncgo > 0 && gp.syscallsp != 0 && gp.m.cgoCallers != nil && gp.m.cgoCallers[0] != 0 { + // Lock cgoCallers so that a signal handler won't + // change it, copy the array, reset it, unlock it. + // We are locked to the thread and are not running + // concurrently with a signal handler. + // We just have to stop a signal handler from interrupting + // in the middle of our copy. + gp.m.cgoCallersUse.Store(1) + cgoCallers := *gp.m.cgoCallers + gp.m.cgoCallers[0] = 0 + gp.m.cgoCallersUse.Store(0) + + printCgoTraceback(&cgoCallers) + } + + if readgstatus(gp)&^_Gscan == _Gsyscall { + // Override registers if blocked in system call. + pc = gp.syscallpc + sp = gp.syscallsp + flags &^= _TraceTrap + } + if gp.m != nil && gp.m.vdsoSP != 0 { + // Override registers if running in VDSO. This comes after the + // _Gsyscall check to cover VDSO calls after entersyscall. + pc = gp.m.vdsoPC + sp = gp.m.vdsoSP + flags &^= _TraceTrap + } + + // Print traceback. By default, omits runtime frames. + // If that means we print nothing at all, repeat forcing all frames printed. 
+ n := gentraceback(pc, sp, lr, gp, 0, nil, _TracebackMaxFrames, nil, nil, flags) + if n == 0 && (flags&_TraceRuntimeFrames) == 0 { + n = gentraceback(pc, sp, lr, gp, 0, nil, _TracebackMaxFrames, nil, nil, flags|_TraceRuntimeFrames) + } + if n == _TracebackMaxFrames { + print("...additional frames elided...\n") + } + printcreatedby(gp) + + if gp.ancestors == nil { + return + } + for _, ancestor := range *gp.ancestors { + printAncestorTraceback(ancestor) + } +} + +// printAncestorTraceback prints the traceback of the given ancestor. +// TODO: Unify this with gentraceback and CallersFrames. +func printAncestorTraceback(ancestor ancestorInfo) { + print("[originating from goroutine ", ancestor.goid, "]:\n") + for fidx, pc := range ancestor.pcs { + f := findfunc(pc) // f previously validated + if showfuncinfo(f, fidx == 0, funcID_normal, funcID_normal) { + printAncestorTracebackFuncInfo(f, pc) + } + } + if len(ancestor.pcs) == _TracebackMaxFrames { + print("...additional frames elided...\n") + } + // Show what created goroutine, except main goroutine (goid 1). + f := findfunc(ancestor.gopc) + if f.valid() && showfuncinfo(f, false, funcID_normal, funcID_normal) && ancestor.goid != 1 { + printcreatedby1(f, ancestor.gopc) + } +} + +// printAncestorTracebackFuncInfo prints the given function info at a given pc +// within an ancestor traceback. The precision of this info is reduced +// due to only having access to the pcs at the time of the caller +// goroutine being created.
+func printAncestorTracebackFuncInfo(f funcInfo, pc uintptr) { + name := funcname(f) + if inldata := funcdata(f, _FUNCDATA_InlTree); inldata != nil { + inltree := (*[1 << 20]inlinedCall)(inldata) + ix := pcdatavalue(f, _PCDATA_InlTreeIndex, pc, nil) + if ix >= 0 { + name = funcnameFromNameOff(f, inltree[ix].nameOff) + } + } + file, line := funcline(f, pc) + if name == "runtime.gopanic" { + name = "panic" + } + print(name, "(...)\n") + print("\t", file, ":", line) + if pc > f.entry() { + print(" +", hex(pc-f.entry())) + } + print("\n") +} + +func callers(skip int, pcbuf []uintptr) int { + sp := getcallersp() + pc := getcallerpc() + gp := getg() + var n int + systemstack(func() { + n = gentraceback(pc, sp, 0, gp, skip, &pcbuf[0], len(pcbuf), nil, nil, 0) + }) + return n +} + +func gcallers(gp *g, skip int, pcbuf []uintptr) int { + return gentraceback(^uintptr(0), ^uintptr(0), 0, gp, skip, &pcbuf[0], len(pcbuf), nil, nil, 0) +} + +// showframe reports whether the frame with the given characteristics should +// be printed during a traceback. +func showframe(f funcInfo, gp *g, firstFrame bool, funcID, childID funcID) bool { + mp := getg().m + if mp.throwing >= throwTypeRuntime && gp != nil && (gp == mp.curg || gp == mp.caughtsig.ptr()) { + return true + } + return showfuncinfo(f, firstFrame, funcID, childID) +} + +// showfuncinfo reports whether a function with the given characteristics should +// be printed during a traceback. +func showfuncinfo(f funcInfo, firstFrame bool, funcID, childID funcID) bool { + // Note that f may be a synthesized funcInfo for an inlined + // function, in which case only nameOff and funcID are set. + + level, _, _ := gotraceback() + if level > 1 { + // Show all frames. 
+ return true + } + + if !f.valid() { + return false + } + + if funcID == funcID_wrapper && elideWrapperCalling(childID) { + return false + } + + name := funcname(f) + + // Special case: always show runtime.gopanic frame + // in the middle of a stack trace, so that we can + // see the boundary between ordinary code and + // panic-induced deferred code. + // See golang.org/issue/5832. + if name == "runtime.gopanic" && !firstFrame { + return true + } + + return bytealg.IndexByteString(name, '.') >= 0 && (!hasPrefix(name, "runtime.") || isExportedRuntime(name)) +} + +// isExportedRuntime reports whether name is an exported runtime function. +// It is only for runtime functions, so ASCII A-Z is fine. +func isExportedRuntime(name string) bool { + const n = len("runtime.") + return len(name) > n && name[:n] == "runtime." && 'A' <= name[n] && name[n] <= 'Z' +} + +// elideWrapperCalling reports whether a wrapper function that called +// function id should be elided from stack traces. +func elideWrapperCalling(id funcID) bool { + // If the wrapper called a panic function instead of the + // wrapped function, we want to include it in stacks. + return !(id == funcID_gopanic || id == funcID_sigpanic || id == funcID_panicwrap) +} + +var gStatusStrings = [...]string{ + _Gidle: "idle", + _Grunnable: "runnable", + _Grunning: "running", + _Gsyscall: "syscall", + _Gwaiting: "waiting", + _Gdead: "dead", + _Gcopystack: "copystack", + _Gpreempted: "preempted", +} + +func goroutineheader(gp *g) { + gpstatus := readgstatus(gp) + + isScan := gpstatus&_Gscan != 0 + gpstatus &^= _Gscan // drop the scan bit + + // Basic string status + var status string + if 0 <= gpstatus && gpstatus < uint32(len(gStatusStrings)) { + status = gStatusStrings[gpstatus] + } else { + status = "???" + } + + // Override. 
+ if gpstatus == _Gwaiting && gp.waitreason != waitReasonZero { + status = gp.waitreason.String() + } + + // approx time the G is blocked, in minutes + var waitfor int64 + if (gpstatus == _Gwaiting || gpstatus == _Gsyscall) && gp.waitsince != 0 { + waitfor = (nanotime() - gp.waitsince) / 60e9 + } + print("goroutine ", gp.goid, " [", status) + if isScan { + print(" (scan)") + } + if waitfor >= 1 { + print(", ", waitfor, " minutes") + } + if gp.lockedm != 0 { + print(", locked to thread") + } + print("]:\n") +} + +func tracebackothers(me *g) { + level, _, _ := gotraceback() + + // Show the current goroutine first, if we haven't already. + curgp := getg().m.curg + if curgp != nil && curgp != me { + print("\n") + goroutineheader(curgp) + traceback(^uintptr(0), ^uintptr(0), 0, curgp) + } + + // We can't call locking forEachG here because this may be during fatal + // throw/panic, where locking could be out-of-order or a direct + // deadlock. + // + // Instead, use forEachGRace, which requires no locking. We don't lock + // against concurrent creation of new Gs, but even with allglock we may + // miss Gs created after this loop. + forEachGRace(func(gp *g) { + if gp == me || gp == curgp || readgstatus(gp) == _Gdead || isSystemGoroutine(gp, false) && level < 2 { + return + } + print("\n") + goroutineheader(gp) + // Note: gp.m == getg().m occurs when tracebackothers is called + // from a signal handler initiated during a systemstack call. + // The original G is still in the running state, and we want to + // print its stack. + if gp.m != getg().m && readgstatus(gp)&^_Gscan == _Grunning { + print("\tgoroutine running on other thread; stack unavailable\n") + printcreatedby(gp) + } else { + traceback(^uintptr(0), ^uintptr(0), 0, gp) + } + }) +} + +// tracebackHexdump hexdumps part of stk around frame.sp and frame.fp +// for debugging purposes. If the address bad is included in the +// hexdumped range, it will mark it as well. 
+func tracebackHexdump(stk stack, frame *stkframe, bad uintptr) { + const expand = 32 * goarch.PtrSize + const maxExpand = 256 * goarch.PtrSize + // Start around frame.sp. + lo, hi := frame.sp, frame.sp + // Expand to include frame.fp. + if frame.fp != 0 && frame.fp < lo { + lo = frame.fp + } + if frame.fp != 0 && frame.fp > hi { + hi = frame.fp + } + // Expand a bit more. + lo, hi = lo-expand, hi+expand + // But don't go too far from frame.sp. + if lo < frame.sp-maxExpand { + lo = frame.sp - maxExpand + } + if hi > frame.sp+maxExpand { + hi = frame.sp + maxExpand + } + // And don't go outside the stack bounds. + if lo < stk.lo { + lo = stk.lo + } + if hi > stk.hi { + hi = stk.hi + } + + // Print the hex dump. + print("stack: frame={sp:", hex(frame.sp), ", fp:", hex(frame.fp), "} stack=[", hex(stk.lo), ",", hex(stk.hi), ")\n") + hexdumpWords(lo, hi, func(p uintptr) byte { + switch p { + case frame.fp: + return '>' + case frame.sp: + return '<' + case bad: + return '!' + } + return 0 + }) +} + +// isSystemGoroutine reports whether the goroutine g must be omitted +// in stack dumps and deadlock detector. This is any goroutine that +// starts at a runtime.* entry point, except for runtime.main, +// runtime.handleAsyncEvent (wasm only) and sometimes runtime.runfinq. +// +// If fixed is true, any goroutine that can vary between user and +// system (that is, the finalizer goroutine) is considered a user +// goroutine. +func isSystemGoroutine(gp *g, fixed bool) bool { + // Keep this in sync with internal/trace.IsSystemGoroutine. + f := findfunc(gp.startpc) + if !f.valid() { + return false + } + if f.funcID == funcID_runtime_main || f.funcID == funcID_handleAsyncEvent { + return false + } + if f.funcID == funcID_runfinq { + // We include the finalizer goroutine if it's calling + // back into user code. + if fixed { + // This goroutine can vary. In fixed mode, + // always consider it a user goroutine. 
+ return false + } + return fingStatus.Load()&fingRunningFinalizer == 0 + } + return hasPrefix(funcname(f), "runtime.") +} + +// SetCgoTraceback records three C functions to use to gather +// traceback information from C code and to convert that traceback +// information into symbolic information. These are used when printing +// stack traces for a program that uses cgo. +// +// The traceback and context functions may be called from a signal +// handler, and must therefore use only async-signal safe functions. +// The symbolizer function may be called while the program is +// crashing, and so must be cautious about using memory. None of the +// functions may call back into Go. +// +// The context function will be called with a single argument, a +// pointer to a struct: +// +// struct { +// Context uintptr +// } +// +// In C syntax, this struct will be +// +// struct { +// uintptr_t Context; +// }; +// +// If the Context field is 0, the context function is being called to +// record the current traceback context. It should record in the +// Context field whatever information is needed about the current +// point of execution to later produce a stack trace, probably the +// stack pointer and PC. In this case the context function will be +// called from C code. +// +// If the Context field is not 0, then it is a value returned by a +// previous call to the context function. This case is called when the +// context is no longer needed; that is, when the Go code is returning +// to its C code caller. This permits the context function to release +// any associated resources. +// +// While it would be correct for the context function to record a +// complete stack trace whenever it is called, and simply copy that +// out in the traceback function, in a typical program the context +// function will be called many times without ever recording a +// traceback for that context.
Recording a complete stack trace in a +// call to the context function is likely to be inefficient. +// +// The traceback function will be called with a single argument, a +// pointer to a struct: +// +// struct { +// Context uintptr +// SigContext uintptr +// Buf *uintptr +// Max uintptr +// } +// +// In C syntax, this struct will be +// +// struct { +// uintptr_t Context; +// uintptr_t SigContext; +// uintptr_t* Buf; +// uintptr_t Max; +// }; +// +// The Context field will be zero to gather a traceback from the +// current program execution point. In this case, the traceback +// function will be called from C code. +// +// Otherwise Context will be a value previously returned by a call to +// the context function. The traceback function should gather a stack +// trace from that saved point in the program execution. The traceback +// function may be called from an execution thread other than the one +// that recorded the context, but only when the context is known to be +// valid and unchanging. The traceback function may also be called +// deeper in the call stack on the same thread that recorded the +// context. The traceback function may be called multiple times with +// the same Context value; it will usually be appropriate to cache the +// result, if possible, the first time this is called for a specific +// context value. +// +// If the traceback function is called from a signal handler on a Unix +// system, SigContext will be the signal context argument passed to +// the signal handler (a C ucontext_t* cast to uintptr_t). This may be +// used to start tracing at the point where the signal occurred. If +// the traceback function is not called from a signal handler, +// SigContext will be zero. +// +// Buf is where the traceback information should be stored. It should +// be PC values, such that Buf[0] is the PC of the caller, Buf[1] is +// the PC of that function's caller, and so on. Max is the maximum +// number of entries to store. 
The function should store a zero to +// indicate the top of the stack, or that the caller is on a different +// stack, presumably a Go stack. +// +// Unlike runtime.Callers, the PC values returned should, when passed +// to the symbolizer function, return the file/line of the call +// instruction. No additional subtraction is required or appropriate. +// +// On all platforms, the traceback function is invoked when a call from +// Go to C to Go requests a stack trace. On linux/amd64, linux/ppc64le, +// linux/arm64, and freebsd/amd64, the traceback function is also invoked +// when a signal is received by a thread that is executing a cgo call. +// The traceback function should not make assumptions about when it is +// called, as future versions of Go may make additional calls. +// +// The symbolizer function will be called with a single argument, a +// pointer to a struct: +// +// struct { +// PC uintptr // program counter to fetch information for +// File *byte // file name (NUL terminated) +// Lineno uintptr // line number +// Func *byte // function name (NUL terminated) +// Entry uintptr // function entry point +// More uintptr // set non-zero if more info for this PC +// Data uintptr // unused by runtime, available for function +// } +// +// In C syntax, this struct will be +// +// struct { +// uintptr_t PC; +// char* File; +// uintptr_t Lineno; +// char* Func; +// uintptr_t Entry; +// uintptr_t More; +// uintptr_t Data; +// }; +// +// The PC field will be a value returned by a call to the traceback +// function. +// +// The first time the function is called for a particular traceback, +// all the fields except PC will be 0. The function should fill in the +// other fields if possible, setting them to 0/nil if the information +// is not available. The Data field may be used to store any useful +// information across calls. The More field should be set to non-zero +// if there is more information for this PC, zero otherwise. 
If More +// is set non-zero, the function will be called again with the same +// PC, and may return different information (this is intended for use +// with inlined functions). If More is zero, the function will be +// called with the next PC value in the traceback. When the traceback +// is complete, the function will be called once more with PC set to +// zero; this may be used to free any information. Each call will +// leave the fields of the struct set to the same values they had upon +// return, except for the PC field when the More field is zero. The +// function must not keep a copy of the struct pointer between calls. +// +// When calling SetCgoTraceback, the version argument is the version +// number of the structs that the functions expect to receive. +// Currently this must be zero. +// +// The symbolizer function may be nil, in which case the results of +// the traceback function will be displayed as numbers. If the +// traceback function is nil, the symbolizer function will never be +// called. The context function may be nil, in which case the +// traceback function will only be called with the context field set +// to zero. If the context function is nil, then calls from Go to C +// to Go will not show a traceback for the C portion of the call stack. +// +// SetCgoTraceback should be called only once, ideally from an init function. +func SetCgoTraceback(version int, traceback, context, symbolizer unsafe.Pointer) { + if version != 0 { + panic("unsupported version") + } + + if cgoTraceback != nil && cgoTraceback != traceback || + cgoContext != nil && cgoContext != context || + cgoSymbolizer != nil && cgoSymbolizer != symbolizer { + panic("call SetCgoTraceback only once") + } + + cgoTraceback = traceback + cgoContext = context + cgoSymbolizer = symbolizer + + // The context function is called when a C function calls a Go + // function. As such it is only called by C code in runtime/cgo. 
+ if _cgo_set_context_function != nil { + cgocall(_cgo_set_context_function, context) + } +} + +var cgoTraceback unsafe.Pointer +var cgoContext unsafe.Pointer +var cgoSymbolizer unsafe.Pointer + +// cgoTracebackArg is the type passed to cgoTraceback. +type cgoTracebackArg struct { + context uintptr + sigContext uintptr + buf *uintptr + max uintptr +} + +// cgoContextArg is the type passed to the context function. +type cgoContextArg struct { + context uintptr +} + +// cgoSymbolizerArg is the type passed to cgoSymbolizer. +type cgoSymbolizerArg struct { + pc uintptr + file *byte + lineno uintptr + funcName *byte + entry uintptr + more uintptr + data uintptr +} + +// printCgoTraceback prints a traceback of callers. +func printCgoTraceback(callers *cgoCallers) { + if cgoSymbolizer == nil { + for _, c := range callers { + if c == 0 { + break + } + print("non-Go function at pc=", hex(c), "\n") + } + return + } + + var arg cgoSymbolizerArg + for _, c := range callers { + if c == 0 { + break + } + printOneCgoTraceback(c, 0x7fffffff, &arg) + } + arg.pc = 0 + callCgoSymbolizer(&arg) +} + +// printOneCgoTraceback prints the traceback of a single cgo caller. +// This can print more than one line because of inlining. +// Returns the number of frames printed. +func printOneCgoTraceback(pc uintptr, max int, arg *cgoSymbolizerArg) int { + c := 0 + arg.pc = pc + for c <= max { + callCgoSymbolizer(arg) + if arg.funcName != nil { + // Note that we don't print any argument + // information here, not even parentheses. + // The symbolizer must add that if appropriate. + println(gostringnocopy(arg.funcName)) + } else { + println("non-Go function") + } + print("\t") + if arg.file != nil { + print(gostringnocopy(arg.file), ":", arg.lineno, " ") + } + print("pc=", hex(pc), "\n") + c++ + if arg.more == 0 { + break + } + } + return c +} + +// callCgoSymbolizer calls the cgoSymbolizer function. 
+func callCgoSymbolizer(arg *cgoSymbolizerArg) { + call := cgocall + if panicking.Load() > 0 || getg().m.curg != getg() { + // We do not want to call into the scheduler when panicking + // or when on the system stack. + call = asmcgocall + } + if msanenabled { + msanwrite(unsafe.Pointer(arg), unsafe.Sizeof(cgoSymbolizerArg{})) + } + if asanenabled { + asanwrite(unsafe.Pointer(arg), unsafe.Sizeof(cgoSymbolizerArg{})) + } + call(cgoSymbolizer, noescape(unsafe.Pointer(arg))) +} + +// cgoContextPCs gets the PC values from a cgo traceback. +func cgoContextPCs(ctxt uintptr, buf []uintptr) { + if cgoTraceback == nil { + return + } + call := cgocall + if panicking.Load() > 0 || getg().m.curg != getg() { + // We do not want to call into the scheduler when panicking + // or when on the system stack. + call = asmcgocall + } + arg := cgoTracebackArg{ + context: ctxt, + buf: (*uintptr)(noescape(unsafe.Pointer(&buf[0]))), + max: uintptr(len(buf)), + } + if msanenabled { + msanwrite(unsafe.Pointer(&arg), unsafe.Sizeof(arg)) + } + if asanenabled { + asanwrite(unsafe.Pointer(&arg), unsafe.Sizeof(arg)) + } + call(cgoTraceback, noescape(unsafe.Pointer(&arg))) +} diff --git a/src/runtime/traceback_test.go b/src/runtime/traceback_test.go new file mode 100644 index 0000000..97eb921 --- /dev/null +++ b/src/runtime/traceback_test.go @@ -0,0 +1,422 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "bytes" + "internal/abi" + "internal/testenv" + "runtime" + "testing" +) + +var testTracebackArgsBuf [1000]byte + +func TestTracebackArgs(t *testing.T) { + if *flagQuick { + t.Skip("-quick") + } + optimized := !testenv.OptimizationOff() + abiSel := func(x, y string) string { + // select expected output based on ABI + // In noopt build we always spill arguments so the output is the same as stack ABI. 
+ if optimized && abi.IntArgRegs > 0 { + return x + } + return y + } + + tests := []struct { + fn func() int + expect string + }{ + // simple ints + { + func() int { return testTracebackArgs1(1, 2, 3, 4, 5) }, + "testTracebackArgs1(0x1, 0x2, 0x3, 0x4, 0x5)", + }, + // some aggregates + { + func() int { + return testTracebackArgs2(false, struct { + a, b, c int + x [2]int + }{1, 2, 3, [2]int{4, 5}}, [0]int{}, [3]byte{6, 7, 8}) + }, + "testTracebackArgs2(0x0, {0x1, 0x2, 0x3, {0x4, 0x5}}, {}, {0x6, 0x7, 0x8})", + }, + { + func() int { return testTracebackArgs3([3]byte{1, 2, 3}, 4, 5, 6, [3]byte{7, 8, 9}) }, + "testTracebackArgs3({0x1, 0x2, 0x3}, 0x4, 0x5, 0x6, {0x7, 0x8, 0x9})", + }, + // too deeply nested type + { + func() int { return testTracebackArgs4(false, [1][1][1][1][1][1][1][1][1][1]int{}) }, + "testTracebackArgs4(0x0, {{{{{...}}}}})", + }, + // a lot of zero-sized type + { + func() int { + z := [0]int{} + return testTracebackArgs5(false, struct { + x int + y [0]int + z [2][0]int + }{1, z, [2][0]int{}}, z, z, z, z, z, z, z, z, z, z, z, z) + }, + "testTracebackArgs5(0x0, {0x1, {}, {{}, {}}}, {}, {}, {}, {}, {}, ...)", + }, + + // edge cases for ... + // no ... for 10 args + { + func() int { return testTracebackArgs6a(1, 2, 3, 4, 5, 6, 7, 8, 9, 10) }, + "testTracebackArgs6a(0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa)", + }, + // has ... for 11 args + { + func() int { return testTracebackArgs6b(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11) }, + "testTracebackArgs6b(0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, ...)", + }, + // no ... for aggregates with 10 words + { + func() int { return testTracebackArgs7a([10]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}) }, + "testTracebackArgs7a({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa})", + }, + // has ... for aggregates with 11 words + { + func() int { return testTracebackArgs7b([11]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}) }, + "testTracebackArgs7b({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, ...})", + }, + // no ... 
for aggregates, but with more args + { + func() int { return testTracebackArgs7c([10]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, 11) }, + "testTracebackArgs7c({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa}, ...)", + }, + // has ... for aggregates and also for more args + { + func() int { return testTracebackArgs7d([11]int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}, 12) }, + "testTracebackArgs7d({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9, 0xa, ...}, ...)", + }, + // nested aggregates, no ... + { + func() int { return testTracebackArgs8a(testArgsType8a{1, 2, 3, 4, 5, 6, 7, 8, [2]int{9, 10}}) }, + "testTracebackArgs8a({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, {0x9, 0xa}})", + }, + // nested aggregates, ... in inner but not outer + { + func() int { return testTracebackArgs8b(testArgsType8b{1, 2, 3, 4, 5, 6, 7, 8, [3]int{9, 10, 11}}) }, + "testTracebackArgs8b({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, {0x9, 0xa, ...}})", + }, + // nested aggregates, ... in outer but not inner + { + func() int { return testTracebackArgs8c(testArgsType8c{1, 2, 3, 4, 5, 6, 7, 8, [2]int{9, 10}, 11}) }, + "testTracebackArgs8c({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, {0x9, 0xa}, ...})", + }, + // nested aggregates, ... in both inner and outer + { + func() int { return testTracebackArgs8d(testArgsType8d{1, 2, 3, 4, 5, 6, 7, 8, [3]int{9, 10, 11}, 12}) }, + "testTracebackArgs8d({0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, {0x9, 0xa, ...}, ...})", + }, + + // Register argument liveness. + // 1, 3 are used and live, 2, 4 are dead (in register ABI). + // Address-taken (7) and stack ({5, 6}) args are always live. + { + func() int { + poisonStack() // poison arg area to make output deterministic + return testTracebackArgs9(1, 2, 3, 4, [2]int{5, 6}, 7) + }, + abiSel( + "testTracebackArgs9(0x1, 0xffffffff?, 0x3, 0xff?, {0x5, 0x6}, 0x7)", + "testTracebackArgs9(0x1, 0x2, 0x3, 0x4, {0x5, 0x6}, 0x7)"), + }, + // No live. + // (Note: this assumes at least 5 int registers if register ABI is used.)
+ { + func() int { + poisonStack() // poison arg area to make output deterministic + return testTracebackArgs10(1, 2, 3, 4, 5) + }, + abiSel( + "testTracebackArgs10(0xffffffff?, 0xffffffff?, 0xffffffff?, 0xffffffff?, 0xffffffff?)", + "testTracebackArgs10(0x1, 0x2, 0x3, 0x4, 0x5)"), + }, + // Conditional spills. + // Spill in conditional, not executed. + { + func() int { + poisonStack() // poison arg area to make output deterministic + return testTracebackArgs11a(1, 2, 3) + }, + abiSel( + "testTracebackArgs11a(0xffffffff?, 0xffffffff?, 0xffffffff?)", + "testTracebackArgs11a(0x1, 0x2, 0x3)"), + }, + // 2 spills in conditional, not executed; 3 spills in conditional, executed, but not statically known. + // So print 0x3?. + { + func() int { + poisonStack() // poison arg area to make output deterministic + return testTracebackArgs11b(1, 2, 3, 4) + }, + abiSel( + "testTracebackArgs11b(0xffffffff?, 0xffffffff?, 0x3?, 0x4)", + "testTracebackArgs11b(0x1, 0x2, 0x3, 0x4)"), + }, + } + for _, test := range tests { + n := test.fn() + got := testTracebackArgsBuf[:n] + if !bytes.Contains(got, []byte(test.expect)) { + t.Errorf("traceback does not contain expected string: want %q, got\n%s", test.expect, got) + } + } +} + +//go:noinline +func testTracebackArgs1(a, b, c, d, e int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a < 0 { + // use in-reg args to keep them alive + return a + b + c + d + e + } + return n +} + +//go:noinline +func testTracebackArgs2(a bool, b struct { + a, b, c int + x [2]int +}, _ [0]int, d [3]byte) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a { + // use in-reg args to keep them alive + return b.a + b.b + b.c + b.x[0] + b.x[1] + int(d[0]) + int(d[1]) + int(d[2]) + } + return n + +} + +//go:noinline +//go:registerparams +func testTracebackArgs3(x [3]byte, a, b, c int, y [3]byte) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a < 0 { + // use in-reg args to keep them alive + return int(x[0]) + int(x[1]) 
+ int(x[2]) + a + b + c + int(y[0]) + int(y[1]) + int(y[2]) + } + return n +} + +//go:noinline +func testTracebackArgs4(a bool, x [1][1][1][1][1][1][1][1][1][1]int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a { + panic(x) // use args to keep them alive + } + return n +} + +//go:noinline +func testTracebackArgs5(a bool, x struct { + x int + y [0]int + z [2][0]int +}, _, _, _, _, _, _, _, _, _, _, _, _ [0]int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a { + panic(x) // use args to keep them alive + } + return n +} + +//go:noinline +func testTracebackArgs6a(a, b, c, d, e, f, g, h, i, j int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a < 0 { + // use in-reg args to keep them alive + return a + b + c + d + e + f + g + h + i + j + } + return n +} + +//go:noinline +func testTracebackArgs6b(a, b, c, d, e, f, g, h, i, j, k int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a < 0 { + // use in-reg args to keep them alive + return a + b + c + d + e + f + g + h + i + j + k + } + return n +} + +//go:noinline +func testTracebackArgs7a(a [10]int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a[0] < 0 { + // use in-reg args to keep them alive + return a[1] + a[2] + a[3] + a[4] + a[5] + a[6] + a[7] + a[8] + a[9] + } + return n +} + +//go:noinline +func testTracebackArgs7b(a [11]int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a[0] < 0 { + // use in-reg args to keep them alive + return a[1] + a[2] + a[3] + a[4] + a[5] + a[6] + a[7] + a[8] + a[9] + a[10] + } + return n +} + +//go:noinline +func testTracebackArgs7c(a [10]int, b int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a[0] < 0 { + // use in-reg args to keep them alive + return a[1] + a[2] + a[3] + a[4] + a[5] + a[6] + a[7] + a[8] + a[9] + b + } + return n +} + +//go:noinline +func testTracebackArgs7d(a [11]int, b int) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a[0] < 
0 { + // use in-reg args to keep them alive + return a[1] + a[2] + a[3] + a[4] + a[5] + a[6] + a[7] + a[8] + a[9] + a[10] + b + } + return n +} + +type testArgsType8a struct { + a, b, c, d, e, f, g, h int + i [2]int +} +type testArgsType8b struct { + a, b, c, d, e, f, g, h int + i [3]int +} +type testArgsType8c struct { + a, b, c, d, e, f, g, h int + i [2]int + j int +} +type testArgsType8d struct { + a, b, c, d, e, f, g, h int + i [3]int + j int +} + +//go:noinline +func testTracebackArgs8a(a testArgsType8a) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a.a < 0 { + // use in-reg args to keep them alive + return a.b + a.c + a.d + a.e + a.f + a.g + a.h + a.i[0] + a.i[1] + } + return n +} + +//go:noinline +func testTracebackArgs8b(a testArgsType8b) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a.a < 0 { + // use in-reg args to keep them alive + return a.b + a.c + a.d + a.e + a.f + a.g + a.h + a.i[0] + a.i[1] + a.i[2] + } + return n +} + +//go:noinline +func testTracebackArgs8c(a testArgsType8c) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a.a < 0 { + // use in-reg args to keep them alive + return a.b + a.c + a.d + a.e + a.f + a.g + a.h + a.i[0] + a.i[1] + a.j + } + return n +} + +//go:noinline +func testTracebackArgs8d(a testArgsType8d) int { + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a.a < 0 { + // use in-reg args to keep them alive + return a.b + a.c + a.d + a.e + a.f + a.g + a.h + a.i[0] + a.i[1] + a.i[2] + a.j + } + return n +} + +// nosplit to avoid preemption or morestack spilling registers. 
+// +//go:nosplit +//go:noinline +func testTracebackArgs9(a int64, b int32, c int16, d int8, x [2]int, y int) int { + if a < 0 { + println(&y) // take address, make y live, even if no longer used at traceback + } + n := runtime.Stack(testTracebackArgsBuf[:], false) + if a < 0 { + // use half of in-reg args to keep them alive, the other half are dead + return int(a) + int(c) + } + return n +} + +// nosplit to avoid preemption or morestack spilling registers. +// +//go:nosplit +//go:noinline +func testTracebackArgs10(a, b, c, d, e int32) int { + // no use of any args + return runtime.Stack(testTracebackArgsBuf[:], false) +} + +// norace to avoid race instrumentation changing spill locations. +// nosplit to avoid preemption or morestack spilling registers. +// +//go:norace +//go:nosplit +//go:noinline +func testTracebackArgs11a(a, b, c int32) int { + if a < 0 { + println(a, b, c) // spill in a conditional, may not execute + } + if b < 0 { + return int(a + b + c) + } + return runtime.Stack(testTracebackArgsBuf[:], false) +} + +// norace to avoid race instrumentation changing spill locations. +// nosplit to avoid preemption or morestack spilling registers. +// +//go:norace +//go:nosplit +//go:noinline +func testTracebackArgs11b(a, b, c, d int32) int { + var x int32 + if a < 0 { + print() // spill b in a conditional + x = b + } else { + print() // spill c in a conditional + x = c + } + if d < 0 { // d is always needed + return int(x + d) + } + return runtime.Stack(testTracebackArgsBuf[:], false) +} + +// Poison the arg area with deterministic values. +// +//go:noinline +func poisonStack() [20]int { + return [20]int{-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1} +} diff --git a/src/runtime/type.go b/src/runtime/type.go new file mode 100644 index 0000000..1c6103e --- /dev/null +++ b/src/runtime/type.go @@ -0,0 +1,713 @@ +// Copyright 2009 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +// Runtime type representation. + +package runtime + +import ( + "internal/abi" + "unsafe" +) + +// tflag is documented in reflect/type.go. +// +// tflag values must be kept in sync with copies in: +// +// cmd/compile/internal/reflectdata/reflect.go +// cmd/link/internal/ld/decodesym.go +// reflect/type.go +// internal/reflectlite/type.go +type tflag uint8 + +const ( + tflagUncommon tflag = 1 << 0 + tflagExtraStar tflag = 1 << 1 + tflagNamed tflag = 1 << 2 + tflagRegularMemory tflag = 1 << 3 // equal and hash can treat values of this type as a single region of t.size bytes +) + +// Needs to be in sync with ../cmd/link/internal/ld/decodesym.go:/^func.commonsize, +// ../cmd/compile/internal/reflectdata/reflect.go:/^func.dcommontype and +// ../reflect/type.go:/^type.rtype. +// ../internal/reflectlite/type.go:/^type.rtype. +type _type struct { + size uintptr + ptrdata uintptr // size of memory prefix holding all pointers + hash uint32 + tflag tflag + align uint8 + fieldAlign uint8 + kind uint8 + // function for comparing objects of this type + // (ptr to object A, ptr to object B) -> ==? + equal func(unsafe.Pointer, unsafe.Pointer) bool + // gcdata stores the GC type data for the garbage collector. + // If the KindGCProg bit is set in kind, gcdata is a GC program. + // Otherwise it is a ptrmask bitmap. See mbitmap.go for details. 
+ gcdata *byte + str nameOff + ptrToThis typeOff +} + +func (t *_type) string() string { + s := t.nameOff(t.str).name() + if t.tflag&tflagExtraStar != 0 { + return s[1:] + } + return s +} + +func (t *_type) uncommon() *uncommontype { + if t.tflag&tflagUncommon == 0 { + return nil + } + switch t.kind & kindMask { + case kindStruct: + type u struct { + structtype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindPtr: + type u struct { + ptrtype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindFunc: + type u struct { + functype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindSlice: + type u struct { + slicetype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindArray: + type u struct { + arraytype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindChan: + type u struct { + chantype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindMap: + type u struct { + maptype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + case kindInterface: + type u struct { + interfacetype + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + default: + type u struct { + _type + u uncommontype + } + return &(*u)(unsafe.Pointer(t)).u + } +} + +func (t *_type) name() string { + if t.tflag&tflagNamed == 0 { + return "" + } + s := t.string() + i := len(s) - 1 + sqBrackets := 0 + for i >= 0 && (s[i] != '.' || sqBrackets != 0) { + switch s[i] { + case ']': + sqBrackets++ + case '[': + sqBrackets-- + } + i-- + } + return s[i+1:] +} + +// pkgpath returns the path of the package where t was defined, if +// available. This is not the same as the reflect package's PkgPath +// method, in that it returns the package path for struct and interface +// types, not just named types. 
+func (t *_type) pkgpath() string { + if u := t.uncommon(); u != nil { + return t.nameOff(u.pkgpath).name() + } + switch t.kind & kindMask { + case kindStruct: + st := (*structtype)(unsafe.Pointer(t)) + return st.pkgPath.name() + case kindInterface: + it := (*interfacetype)(unsafe.Pointer(t)) + return it.pkgpath.name() + } + return "" +} + +// reflectOffs holds type offsets defined at run time by the reflect package. +// +// When a type is defined at run time, its *rtype data lives on the heap. +// There is a wide range of possible addresses the heap may use, which +// may not be representable as a 32-bit offset. Moreover the GC may +// one day start moving heap memory, in which case there is no stable +// offset that can be defined. +// +// To provide stable offsets, we pin *rtype objects in a global map +// and treat the offset as an identifier. We use negative offsets that +// do not overlap with any compile-time module offsets. +// +// Entries are created by reflect.addReflectOff. +var reflectOffs struct { + lock mutex + next int32 + m map[int32]unsafe.Pointer + minv map[unsafe.Pointer]int32 +} + +func reflectOffsLock() { + lock(&reflectOffs.lock) + if raceenabled { + raceacquire(unsafe.Pointer(&reflectOffs.lock)) + } +} + +func reflectOffsUnlock() { + if raceenabled { + racerelease(unsafe.Pointer(&reflectOffs.lock)) + } + unlock(&reflectOffs.lock) +} + +func resolveNameOff(ptrInModule unsafe.Pointer, off nameOff) name { + if off == 0 { + return name{} + } + base := uintptr(ptrInModule) + for md := &firstmoduledata; md != nil; md = md.next { + if base >= md.types && base < md.etypes { + res := md.types + uintptr(off) + if res > md.etypes { + println("runtime: nameOff", hex(off), "out of range", hex(md.types), "-", hex(md.etypes)) + throw("runtime: name offset out of range") + } + return name{(*byte)(unsafe.Pointer(res))} + } + } + + // No module found. See if it is a run time name.
+ reflectOffsLock() + res, found := reflectOffs.m[int32(off)] + reflectOffsUnlock() + if !found { + println("runtime: nameOff", hex(off), "base", hex(base), "not in ranges:") + for next := &firstmoduledata; next != nil; next = next.next { + println("\ttypes", hex(next.types), "etypes", hex(next.etypes)) + } + throw("runtime: name offset base pointer out of range") + } + return name{(*byte)(res)} +} + +func (t *_type) nameOff(off nameOff) name { + return resolveNameOff(unsafe.Pointer(t), off) +} + +func resolveTypeOff(ptrInModule unsafe.Pointer, off typeOff) *_type { + if off == 0 || off == -1 { + // -1 is the sentinel value for unreachable code. + // See cmd/link/internal/ld/data.go:relocsym. + return nil + } + base := uintptr(ptrInModule) + var md *moduledata + for next := &firstmoduledata; next != nil; next = next.next { + if base >= next.types && base < next.etypes { + md = next + break + } + } + if md == nil { + reflectOffsLock() + res := reflectOffs.m[int32(off)] + reflectOffsUnlock() + if res == nil { + println("runtime: typeOff", hex(off), "base", hex(base), "not in ranges:") + for next := &firstmoduledata; next != nil; next = next.next { + println("\ttypes", hex(next.types), "etypes", hex(next.etypes)) + } + throw("runtime: type offset base pointer out of range") + } + return (*_type)(res) + } + if t := md.typemap[off]; t != nil { + return t + } + res := md.types + uintptr(off) + if res > md.etypes { + println("runtime: typeOff", hex(off), "out of range", hex(md.types), "-", hex(md.etypes)) + throw("runtime: type offset out of range") + } + return (*_type)(unsafe.Pointer(res)) +} + +func (t *_type) typeOff(off typeOff) *_type { + return resolveTypeOff(unsafe.Pointer(t), off) +} + +func (t *_type) textOff(off textOff) unsafe.Pointer { + if off == -1 { + // -1 is the sentinel value for unreachable code. + // See cmd/link/internal/ld/data.go:relocsym. 
+ return unsafe.Pointer(abi.FuncPCABIInternal(unreachableMethod)) + } + base := uintptr(unsafe.Pointer(t)) + var md *moduledata + for next := &firstmoduledata; next != nil; next = next.next { + if base >= next.types && base < next.etypes { + md = next + break + } + } + if md == nil { + reflectOffsLock() + res := reflectOffs.m[int32(off)] + reflectOffsUnlock() + if res == nil { + println("runtime: textOff", hex(off), "base", hex(base), "not in ranges:") + for next := &firstmoduledata; next != nil; next = next.next { + println("\ttypes", hex(next.types), "etypes", hex(next.etypes)) + } + throw("runtime: text offset base pointer out of range") + } + return res + } + res := md.textAddr(uint32(off)) + return unsafe.Pointer(res) +} + +func (t *functype) in() []*_type { + // See funcType in reflect/type.go for details on data layout. + uadd := uintptr(unsafe.Sizeof(functype{})) + if t.typ.tflag&tflagUncommon != 0 { + uadd += unsafe.Sizeof(uncommontype{}) + } + return (*[1 << 20]*_type)(add(unsafe.Pointer(t), uadd))[:t.inCount] +} + +func (t *functype) out() []*_type { + // See funcType in reflect/type.go for details on data layout. 
+ uadd := uintptr(unsafe.Sizeof(functype{})) + if t.typ.tflag&tflagUncommon != 0 { + uadd += unsafe.Sizeof(uncommontype{}) + } + outCount := t.outCount & (1<<15 - 1) + return (*[1 << 20]*_type)(add(unsafe.Pointer(t), uadd))[t.inCount : t.inCount+outCount] +} + +func (t *functype) dotdotdot() bool { + return t.outCount&(1<<15) != 0 +} + +type nameOff int32 +type typeOff int32 +type textOff int32 + +type method struct { + name nameOff + mtyp typeOff + ifn textOff + tfn textOff +} + +type uncommontype struct { + pkgpath nameOff + mcount uint16 // number of methods + xcount uint16 // number of exported methods + moff uint32 // offset from this uncommontype to [mcount]method + _ uint32 // unused +} + +type imethod struct { + name nameOff + ityp typeOff +} + +type interfacetype struct { + typ _type + pkgpath name + mhdr []imethod +} + +type maptype struct { + typ _type + key *_type + elem *_type + bucket *_type // internal type representing a hash bucket + // function for hashing keys (ptr to key, seed) -> hash + hasher func(unsafe.Pointer, uintptr) uintptr + keysize uint8 // size of key slot + elemsize uint8 // size of elem slot + bucketsize uint16 // size of bucket + flags uint32 +} + +// Note: flag values must match those used in the TMAP case +// in ../cmd/compile/internal/reflectdata/reflect.go:writeType. 
+func (mt *maptype) indirectkey() bool { // store ptr to key instead of key itself + return mt.flags&1 != 0 +} +func (mt *maptype) indirectelem() bool { // store ptr to elem instead of elem itself + return mt.flags&2 != 0 +} +func (mt *maptype) reflexivekey() bool { // true if k==k for all keys + return mt.flags&4 != 0 +} +func (mt *maptype) needkeyupdate() bool { // true if we need to update key on an overwrite + return mt.flags&8 != 0 +} +func (mt *maptype) hashMightPanic() bool { // true if hash function might panic + return mt.flags&16 != 0 +} + +type arraytype struct { + typ _type + elem *_type + slice *_type + len uintptr +} + +type chantype struct { + typ _type + elem *_type + dir uintptr +} + +type slicetype struct { + typ _type + elem *_type +} + +type functype struct { + typ _type + inCount uint16 + outCount uint16 +} + +type ptrtype struct { + typ _type + elem *_type +} + +type structfield struct { + name name + typ *_type + offset uintptr +} + +type structtype struct { + typ _type + pkgPath name + fields []structfield +} + +// name is an encoded type name with optional extra data. +// See reflect/type.go for details. 
+type name struct { + bytes *byte +} + +func (n name) data(off int) *byte { + return (*byte)(add(unsafe.Pointer(n.bytes), uintptr(off))) +} + +func (n name) isExported() bool { + return (*n.bytes)&(1<<0) != 0 +} + +func (n name) isEmbedded() bool { + return (*n.bytes)&(1<<3) != 0 +} + +func (n name) readvarint(off int) (int, int) { + v := 0 + for i := 0; ; i++ { + x := *n.data(off + i) + v += int(x&0x7f) << (7 * i) + if x&0x80 == 0 { + return i + 1, v + } + } +} + +func (n name) name() string { + if n.bytes == nil { + return "" + } + i, l := n.readvarint(1) + if l == 0 { + return "" + } + return unsafe.String(n.data(1+i), l) +} + +func (n name) tag() string { + if *n.data(0)&(1<<1) == 0 { + return "" + } + i, l := n.readvarint(1) + i2, l2 := n.readvarint(1 + i + l) + return unsafe.String(n.data(1+i+l+i2), l2) +} + +func (n name) pkgPath() string { + if n.bytes == nil || *n.data(0)&(1<<2) == 0 { + return "" + } + i, l := n.readvarint(1) + off := 1 + i + l + if *n.data(0)&(1<<1) != 0 { + i2, l2 := n.readvarint(off) + off += i2 + l2 + } + var nameOff nameOff + copy((*[4]byte)(unsafe.Pointer(&nameOff))[:], (*[4]byte)(unsafe.Pointer(n.data(off)))[:]) + pkgPathName := resolveNameOff(unsafe.Pointer(n.bytes), nameOff) + return pkgPathName.name() +} + +func (n name) isBlank() bool { + if n.bytes == nil { + return false + } + _, l := n.readvarint(1) + return l == 1 && *n.data(2) == '_' +} + +// typelinksinit scans the types from extra modules and builds the +// moduledata typemap used to de-duplicate type pointers. +func typelinksinit() { + if firstmoduledata.next == nil { + return + } + typehash := make(map[uint32][]*_type, len(firstmoduledata.typelinks)) + + modules := activeModules() + prev := modules[0] + for _, md := range modules[1:] { + // Collect types from the previous module into typehash. 
+ collect: + for _, tl := range prev.typelinks { + var t *_type + if prev.typemap == nil { + t = (*_type)(unsafe.Pointer(prev.types + uintptr(tl))) + } else { + t = prev.typemap[typeOff(tl)] + } + // Add to typehash if not seen before. + tlist := typehash[t.hash] + for _, tcur := range tlist { + if tcur == t { + continue collect + } + } + typehash[t.hash] = append(tlist, t) + } + + if md.typemap == nil { + // If any of this module's typelinks match a type from a + // prior module, prefer that prior type by adding the offset + // to this module's typemap. + tm := make(map[typeOff]*_type, len(md.typelinks)) + pinnedTypemaps = append(pinnedTypemaps, tm) + md.typemap = tm + for _, tl := range md.typelinks { + t := (*_type)(unsafe.Pointer(md.types + uintptr(tl))) + for _, candidate := range typehash[t.hash] { + seen := map[_typePair]struct{}{} + if typesEqual(t, candidate, seen) { + t = candidate + break + } + } + md.typemap[typeOff(tl)] = t + } + } + + prev = md + } +} + +type _typePair struct { + t1 *_type + t2 *_type +} + +// typesEqual reports whether two types are equal. +// +// Everywhere in the runtime and reflect packages, it is assumed that +// there is exactly one *_type per Go type, so that pointer equality +// can be used to test if types are equal. There is one place that +// breaks this assumption: buildmode=shared. In this case a type can +// appear as two different pieces of memory. This is hidden from the +// runtime and reflect package by the per-module typemap built in +// typelinksinit. It uses typesEqual to map types from later modules +// back into earlier ones. +// +// Only typelinksinit needs this function. 
+func typesEqual(t, v *_type, seen map[_typePair]struct{}) bool { + tp := _typePair{t, v} + if _, ok := seen[tp]; ok { + return true + } + + // mark these types as seen, and thus equivalent which prevents an infinite loop if + // the two types are identical, but recursively defined and loaded from + // different modules + seen[tp] = struct{}{} + + if t == v { + return true + } + kind := t.kind & kindMask + if kind != v.kind&kindMask { + return false + } + if t.string() != v.string() { + return false + } + ut := t.uncommon() + uv := v.uncommon() + if ut != nil || uv != nil { + if ut == nil || uv == nil { + return false + } + pkgpatht := t.nameOff(ut.pkgpath).name() + pkgpathv := v.nameOff(uv.pkgpath).name() + if pkgpatht != pkgpathv { + return false + } + } + if kindBool <= kind && kind <= kindComplex128 { + return true + } + switch kind { + case kindString, kindUnsafePointer: + return true + case kindArray: + at := (*arraytype)(unsafe.Pointer(t)) + av := (*arraytype)(unsafe.Pointer(v)) + return typesEqual(at.elem, av.elem, seen) && at.len == av.len + case kindChan: + ct := (*chantype)(unsafe.Pointer(t)) + cv := (*chantype)(unsafe.Pointer(v)) + return ct.dir == cv.dir && typesEqual(ct.elem, cv.elem, seen) + case kindFunc: + ft := (*functype)(unsafe.Pointer(t)) + fv := (*functype)(unsafe.Pointer(v)) + if ft.outCount != fv.outCount || ft.inCount != fv.inCount { + return false + } + tin, vin := ft.in(), fv.in() + for i := 0; i < len(tin); i++ { + if !typesEqual(tin[i], vin[i], seen) { + return false + } + } + tout, vout := ft.out(), fv.out() + for i := 0; i < len(tout); i++ { + if !typesEqual(tout[i], vout[i], seen) { + return false + } + } + return true + case kindInterface: + it := (*interfacetype)(unsafe.Pointer(t)) + iv := (*interfacetype)(unsafe.Pointer(v)) + if it.pkgpath.name() != iv.pkgpath.name() { + return false + } + if len(it.mhdr) != len(iv.mhdr) { + return false + } + for i := range it.mhdr { + tm := &it.mhdr[i] + vm := &iv.mhdr[i] + // Note the mhdr 
array can be relocated from + // another module. See #17724. + tname := resolveNameOff(unsafe.Pointer(tm), tm.name) + vname := resolveNameOff(unsafe.Pointer(vm), vm.name) + if tname.name() != vname.name() { + return false + } + if tname.pkgPath() != vname.pkgPath() { + return false + } + tityp := resolveTypeOff(unsafe.Pointer(tm), tm.ityp) + vityp := resolveTypeOff(unsafe.Pointer(vm), vm.ityp) + if !typesEqual(tityp, vityp, seen) { + return false + } + } + return true + case kindMap: + mt := (*maptype)(unsafe.Pointer(t)) + mv := (*maptype)(unsafe.Pointer(v)) + return typesEqual(mt.key, mv.key, seen) && typesEqual(mt.elem, mv.elem, seen) + case kindPtr: + pt := (*ptrtype)(unsafe.Pointer(t)) + pv := (*ptrtype)(unsafe.Pointer(v)) + return typesEqual(pt.elem, pv.elem, seen) + case kindSlice: + st := (*slicetype)(unsafe.Pointer(t)) + sv := (*slicetype)(unsafe.Pointer(v)) + return typesEqual(st.elem, sv.elem, seen) + case kindStruct: + st := (*structtype)(unsafe.Pointer(t)) + sv := (*structtype)(unsafe.Pointer(v)) + if len(st.fields) != len(sv.fields) { + return false + } + if st.pkgPath.name() != sv.pkgPath.name() { + return false + } + for i := range st.fields { + tf := &st.fields[i] + vf := &sv.fields[i] + if tf.name.name() != vf.name.name() { + return false + } + if !typesEqual(tf.typ, vf.typ, seen) { + return false + } + if tf.name.tag() != vf.name.tag() { + return false + } + if tf.offset != vf.offset { + return false + } + if tf.name.isEmbedded() != vf.name.isEmbedded() { + return false + } + } + return true + default: + println("runtime: impossible type kind", kind) + throw("runtime: impossible type kind") + return false + } +} diff --git a/src/runtime/typekind.go b/src/runtime/typekind.go new file mode 100644 index 0000000..7087a9b --- /dev/null +++ b/src/runtime/typekind.go @@ -0,0 +1,43 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +package runtime + +const ( + kindBool = 1 + iota + kindInt + kindInt8 + kindInt16 + kindInt32 + kindInt64 + kindUint + kindUint8 + kindUint16 + kindUint32 + kindUint64 + kindUintptr + kindFloat32 + kindFloat64 + kindComplex64 + kindComplex128 + kindArray + kindChan + kindFunc + kindInterface + kindMap + kindPtr + kindSlice + kindString + kindStruct + kindUnsafePointer + + kindDirectIface = 1 << 5 + kindGCProg = 1 << 6 + kindMask = (1 << 5) - 1 +) + +// isDirectIface reports whether t is stored directly in an interface value. +func isDirectIface(t *_type) bool { + return t.kind&kindDirectIface != 0 +} diff --git a/src/runtime/unsafe.go b/src/runtime/unsafe.go new file mode 100644 index 0000000..54649e8 --- /dev/null +++ b/src/runtime/unsafe.go @@ -0,0 +1,98 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import ( + "runtime/internal/math" + "unsafe" +) + +func unsafestring(ptr unsafe.Pointer, len int) { + if len < 0 { + panicunsafestringlen() + } + + if uintptr(len) > -uintptr(ptr) { + if ptr == nil { + panicunsafestringnilptr() + } + panicunsafestringlen() + } +} + +// Keep this code in sync with cmd/compile/internal/walk/builtin.go:walkUnsafeString +func unsafestring64(ptr unsafe.Pointer, len64 int64) { + len := int(len64) + if int64(len) != len64 { + panicunsafestringlen() + } + unsafestring(ptr, len) +} + +func unsafestringcheckptr(ptr unsafe.Pointer, len64 int64) { + unsafestring64(ptr, len64) + + // Check that underlying array doesn't straddle multiple heap objects. + // unsafestring64 has already checked for overflow. 
+ if checkptrStraddles(ptr, uintptr(len64)) { + throw("checkptr: unsafe.String result straddles multiple allocations") + } +} + +func panicunsafestringlen() { + panic(errorString("unsafe.String: len out of range")) +} + +func panicunsafestringnilptr() { + panic(errorString("unsafe.String: ptr is nil and len is not zero")) +} + +// Keep this code in sync with cmd/compile/internal/walk/builtin.go:walkUnsafeSlice +func unsafeslice(et *_type, ptr unsafe.Pointer, len int) { + if len < 0 { + panicunsafeslicelen() + } + + if et.size == 0 { + if ptr == nil && len > 0 { + panicunsafeslicenilptr() + } + } + + mem, overflow := math.MulUintptr(et.size, uintptr(len)) + if overflow || mem > -uintptr(ptr) { + if ptr == nil { + panicunsafeslicenilptr() + } + panicunsafeslicelen() + } +} + +// Keep this code in sync with cmd/compile/internal/walk/builtin.go:walkUnsafeSlice +func unsafeslice64(et *_type, ptr unsafe.Pointer, len64 int64) { + len := int(len64) + if int64(len) != len64 { + panicunsafeslicelen() + } + unsafeslice(et, ptr, len) +} + +func unsafeslicecheckptr(et *_type, ptr unsafe.Pointer, len64 int64) { + unsafeslice64(et, ptr, len64) + + // Check that underlying array doesn't straddle multiple heap objects. + // unsafeslice64 has already checked for overflow. + if checkptrStraddles(ptr, uintptr(len64)*et.size) { + throw("checkptr: unsafe.Slice result straddles multiple allocations") + } +} + +func panicunsafeslicelen() { + panic(errorString("unsafe.Slice: len out of range")) +} + +func panicunsafeslicenilptr() { + panic(errorString("unsafe.Slice: ptr is nil and len is not zero")) +} diff --git a/src/runtime/utf8.go b/src/runtime/utf8.go new file mode 100644 index 0000000..52b7576 --- /dev/null +++ b/src/runtime/utf8.go @@ -0,0 +1,132 @@ +// Copyright 2016 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +// Numbers fundamental to the encoding. 
+const ( + runeError = '\uFFFD' // the "error" Rune or "Unicode replacement character" + runeSelf = 0x80 // characters below runeSelf are represented as themselves in a single byte. + maxRune = '\U0010FFFF' // Maximum valid Unicode code point. +) + +// Code points in the surrogate range are not valid for UTF-8. +const ( + surrogateMin = 0xD800 + surrogateMax = 0xDFFF +) + +const ( + t1 = 0x00 // 0000 0000 + tx = 0x80 // 1000 0000 + t2 = 0xC0 // 1100 0000 + t3 = 0xE0 // 1110 0000 + t4 = 0xF0 // 1111 0000 + t5 = 0xF8 // 1111 1000 + + maskx = 0x3F // 0011 1111 + mask2 = 0x1F // 0001 1111 + mask3 = 0x0F // 0000 1111 + mask4 = 0x07 // 0000 0111 + + rune1Max = 1<<7 - 1 + rune2Max = 1<<11 - 1 + rune3Max = 1<<16 - 1 + + // The default lowest and highest continuation byte. + locb = 0x80 // 1000 0000 + hicb = 0xBF // 1011 1111 +) + +// countrunes returns the number of runes in s. +func countrunes(s string) int { + n := 0 + for range s { + n++ + } + return n +} + +// decoderune returns the non-ASCII rune at the start of +// s[k:] and the index after the rune in s. +// +// decoderune assumes that caller has checked that +// the to be decoded rune is a non-ASCII rune. +// +// If the string appears to be incomplete or decoding problems +// are encountered (runeerror, k + 1) is returned to ensure +// progress when decoderune is used to iterate over a string. 
+func decoderune(s string, k int) (r rune, pos int) { + pos = k + + if k >= len(s) { + return runeError, k + 1 + } + + s = s[k:] + + switch { + case t2 <= s[0] && s[0] < t3: + // 0080-07FF two byte sequence + if len(s) > 1 && (locb <= s[1] && s[1] <= hicb) { + r = rune(s[0]&mask2)<<6 | rune(s[1]&maskx) + pos += 2 + if rune1Max < r { + return + } + } + case t3 <= s[0] && s[0] < t4: + // 0800-FFFF three byte sequence + if len(s) > 2 && (locb <= s[1] && s[1] <= hicb) && (locb <= s[2] && s[2] <= hicb) { + r = rune(s[0]&mask3)<<12 | rune(s[1]&maskx)<<6 | rune(s[2]&maskx) + pos += 3 + if rune2Max < r && !(surrogateMin <= r && r <= surrogateMax) { + return + } + } + case t4 <= s[0] && s[0] < t5: + // 10000-1FFFFF four byte sequence + if len(s) > 3 && (locb <= s[1] && s[1] <= hicb) && (locb <= s[2] && s[2] <= hicb) && (locb <= s[3] && s[3] <= hicb) { + r = rune(s[0]&mask4)<<18 | rune(s[1]&maskx)<<12 | rune(s[2]&maskx)<<6 | rune(s[3]&maskx) + pos += 4 + if rune3Max < r && r <= maxRune { + return + } + } + } + + return runeError, k + 1 +} + +// encoderune writes into p (which must be large enough) the UTF-8 encoding of the rune. +// It returns the number of bytes written. +func encoderune(p []byte, r rune) int { + // Negative values are erroneous. Making it unsigned addresses the problem. 
+ switch i := uint32(r); { + case i <= rune1Max: + p[0] = byte(r) + return 1 + case i <= rune2Max: + _ = p[1] // eliminate bounds checks + p[0] = t2 | byte(r>>6) + p[1] = tx | byte(r)&maskx + return 2 + case i > maxRune, surrogateMin <= i && i <= surrogateMax: + r = runeError + fallthrough + case i <= rune3Max: + _ = p[2] // eliminate bounds checks + p[0] = t3 | byte(r>>12) + p[1] = tx | byte(r>>6)&maskx + p[2] = tx | byte(r)&maskx + return 3 + default: + _ = p[3] // eliminate bounds checks + p[0] = t4 | byte(r>>18) + p[1] = tx | byte(r>>12)&maskx + p[2] = tx | byte(r>>6)&maskx + p[3] = tx | byte(r)&maskx + return 4 + } +} diff --git a/src/runtime/vdso_elf32.go b/src/runtime/vdso_elf32.go new file mode 100644 index 0000000..1b8afbe --- /dev/null +++ b/src/runtime/vdso_elf32.go @@ -0,0 +1,79 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (386 || arm) + +package runtime + +// ELF32 structure definitions for use by the vDSO loader + +type elfSym struct { + st_name uint32 + st_value uint32 + st_size uint32 + st_info byte + st_other byte + st_shndx uint16 +} + +type elfVerdef struct { + vd_version uint16 /* Version revision */ + vd_flags uint16 /* Version information */ + vd_ndx uint16 /* Version Index */ + vd_cnt uint16 /* Number of associated aux entries */ + vd_hash uint32 /* Version name hash value */ + vd_aux uint32 /* Offset in bytes to verdaux array */ + vd_next uint32 /* Offset in bytes to next verdef entry */ +} + +type elfEhdr struct { + e_ident [_EI_NIDENT]byte /* Magic number and other info */ + e_type uint16 /* Object file type */ + e_machine uint16 /* Architecture */ + e_version uint32 /* Object file version */ + e_entry uint32 /* Entry point virtual address */ + e_phoff uint32 /* Program header table file offset */ + e_shoff uint32 /* Section header table file offset */ + e_flags uint32 /* Processor-specific flags */ 
+ e_ehsize uint16 /* ELF header size in bytes */ + e_phentsize uint16 /* Program header table entry size */ + e_phnum uint16 /* Program header table entry count */ + e_shentsize uint16 /* Section header table entry size */ + e_shnum uint16 /* Section header table entry count */ + e_shstrndx uint16 /* Section header string table index */ +} + +type elfPhdr struct { + p_type uint32 /* Segment type */ + p_offset uint32 /* Segment file offset */ + p_vaddr uint32 /* Segment virtual address */ + p_paddr uint32 /* Segment physical address */ + p_filesz uint32 /* Segment size in file */ + p_memsz uint32 /* Segment size in memory */ + p_flags uint32 /* Segment flags */ + p_align uint32 /* Segment alignment */ +} + +type elfShdr struct { + sh_name uint32 /* Section name (string tbl index) */ + sh_type uint32 /* Section type */ + sh_flags uint32 /* Section flags */ + sh_addr uint32 /* Section virtual addr at execution */ + sh_offset uint32 /* Section file offset */ + sh_size uint32 /* Section size in bytes */ + sh_link uint32 /* Link to another section */ + sh_info uint32 /* Additional section information */ + sh_addralign uint32 /* Section alignment */ + sh_entsize uint32 /* Entry size if section holds table */ +} + +type elfDyn struct { + d_tag int32 /* Dynamic entry type */ + d_val uint32 /* Integer value */ +} + +type elfVerdaux struct { + vda_name uint32 /* Version or dependency names */ + vda_next uint32 /* Offset in bytes to next verdaux entry */ +} diff --git a/src/runtime/vdso_elf64.go b/src/runtime/vdso_elf64.go new file mode 100644 index 0000000..d41d25e --- /dev/null +++ b/src/runtime/vdso_elf64.go @@ -0,0 +1,79 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. 
+ +//go:build linux && (amd64 || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x) + +package runtime + +// ELF64 structure definitions for use by the vDSO loader + +type elfSym struct { + st_name uint32 + st_info byte + st_other byte + st_shndx uint16 + st_value uint64 + st_size uint64 +} + +type elfVerdef struct { + vd_version uint16 /* Version revision */ + vd_flags uint16 /* Version information */ + vd_ndx uint16 /* Version Index */ + vd_cnt uint16 /* Number of associated aux entries */ + vd_hash uint32 /* Version name hash value */ + vd_aux uint32 /* Offset in bytes to verdaux array */ + vd_next uint32 /* Offset in bytes to next verdef entry */ +} + +type elfEhdr struct { + e_ident [_EI_NIDENT]byte /* Magic number and other info */ + e_type uint16 /* Object file type */ + e_machine uint16 /* Architecture */ + e_version uint32 /* Object file version */ + e_entry uint64 /* Entry point virtual address */ + e_phoff uint64 /* Program header table file offset */ + e_shoff uint64 /* Section header table file offset */ + e_flags uint32 /* Processor-specific flags */ + e_ehsize uint16 /* ELF header size in bytes */ + e_phentsize uint16 /* Program header table entry size */ + e_phnum uint16 /* Program header table entry count */ + e_shentsize uint16 /* Section header table entry size */ + e_shnum uint16 /* Section header table entry count */ + e_shstrndx uint16 /* Section header string table index */ +} + +type elfPhdr struct { + p_type uint32 /* Segment type */ + p_flags uint32 /* Segment flags */ + p_offset uint64 /* Segment file offset */ + p_vaddr uint64 /* Segment virtual address */ + p_paddr uint64 /* Segment physical address */ + p_filesz uint64 /* Segment size in file */ + p_memsz uint64 /* Segment size in memory */ + p_align uint64 /* Segment alignment */ +} + +type elfShdr struct { + sh_name uint32 /* Section name (string tbl index) */ + sh_type uint32 /* Section type */ + sh_flags uint64 /* Section flags */ + sh_addr uint64 /* 
Section virtual addr at execution */ + sh_offset uint64 /* Section file offset */ + sh_size uint64 /* Section size in bytes */ + sh_link uint32 /* Link to another section */ + sh_info uint32 /* Additional section information */ + sh_addralign uint64 /* Section alignment */ + sh_entsize uint64 /* Entry size if section holds table */ +} + +type elfDyn struct { + d_tag int64 /* Dynamic entry type */ + d_val uint64 /* Integer value */ +} + +type elfVerdaux struct { + vda_name uint32 /* Version or dependency names */ + vda_next uint32 /* Offset in bytes to next verdaux entry */ +} diff --git a/src/runtime/vdso_freebsd.go b/src/runtime/vdso_freebsd.go new file mode 100644 index 0000000..0fe21cf --- /dev/null +++ b/src/runtime/vdso_freebsd.go @@ -0,0 +1,114 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build freebsd + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const _VDSO_TH_NUM = 4 // defined in <sys/vdso.h> #ifdef _KERNEL + +var timekeepSharedPage *vdsoTimekeep + +//go:nosplit +func (bt *bintime) Add(bt2 *bintime) { + u := bt.frac + bt.frac += bt2.frac + if u > bt.frac { + bt.sec++ + } + bt.sec += bt2.sec +} + +//go:nosplit +func (bt *bintime) AddX(x uint64) { + u := bt.frac + bt.frac += x + if u > bt.frac { + bt.sec++ + } +} + +var ( + // binuptimeDummy is used in binuptime as the address of an atomic.Load, to simulate + // an atomic_thread_fence_acq() call which behaves as an instruction reordering and + // memory barrier. 
+ binuptimeDummy uint32 + + zeroBintime bintime +) + +// based on /usr/src/lib/libc/sys/__vdso_gettimeofday.c +// +//go:nosplit +func binuptime(abs bool) (bt bintime) { + timehands := (*[_VDSO_TH_NUM]vdsoTimehands)(add(unsafe.Pointer(timekeepSharedPage), vdsoTimekeepSize)) + for { + if timekeepSharedPage.enabled == 0 { + return zeroBintime + } + + curr := atomic.Load(&timekeepSharedPage.current) // atomic_load_acq_32 + th := &timehands[curr] + gen := atomic.Load(&th.gen) // atomic_load_acq_32 + bt = th.offset + + if tc, ok := th.getTimecounter(); !ok { + return zeroBintime + } else { + delta := (tc - th.offset_count) & th.counter_mask + bt.AddX(th.scale * uint64(delta)) + } + if abs { + bt.Add(&th.boottime) + } + + atomic.Load(&binuptimeDummy) // atomic_thread_fence_acq() + if curr == timekeepSharedPage.current && gen != 0 && gen == th.gen { + break + } + } + return bt +} + +//go:nosplit +func vdsoClockGettime(clockID int32) bintime { + if timekeepSharedPage == nil || timekeepSharedPage.ver != _VDSO_TK_VER_CURR { + return zeroBintime + } + abs := false + switch clockID { + case _CLOCK_MONOTONIC: + /* ok */ + case _CLOCK_REALTIME: + abs = true + default: + return zeroBintime + } + return binuptime(abs) +} + +func fallback_nanotime() int64 +func fallback_walltime() (sec int64, nsec int32) + +//go:nosplit +func nanotime1() int64 { + bt := vdsoClockGettime(_CLOCK_MONOTONIC) + if bt == zeroBintime { + return fallback_nanotime() + } + return int64((1e9 * uint64(bt.sec)) + ((1e9 * uint64(bt.frac>>32)) >> 32)) +} + +func walltime() (sec int64, nsec int32) { + bt := vdsoClockGettime(_CLOCK_REALTIME) + if bt == zeroBintime { + return fallback_walltime() + } + return int64(bt.sec), int32((1e9 * uint64(bt.frac>>32)) >> 32) +} diff --git a/src/runtime/vdso_freebsd_arm.go b/src/runtime/vdso_freebsd_arm.go new file mode 100644 index 0000000..669fed0 --- /dev/null +++ b/src/runtime/vdso_freebsd_arm.go @@ -0,0 +1,21 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _VDSO_TH_ALGO_ARM_GENTIM = 1 +) + +func getCntxct(physical bool) uint32 + +//go:nosplit +func (th *vdsoTimehands) getTimecounter() (uint32, bool) { + switch th.algo { + case _VDSO_TH_ALGO_ARM_GENTIM: + return getCntxct(th.physical != 0), true + default: + return 0, false + } +} diff --git a/src/runtime/vdso_freebsd_arm64.go b/src/runtime/vdso_freebsd_arm64.go new file mode 100644 index 0000000..37b26d7 --- /dev/null +++ b/src/runtime/vdso_freebsd_arm64.go @@ -0,0 +1,21 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _VDSO_TH_ALGO_ARM_GENTIM = 1 +) + +func getCntxct(physical bool) uint32 + +//go:nosplit +func (th *vdsoTimehands) getTimecounter() (uint32, bool) { + switch th.algo { + case _VDSO_TH_ALGO_ARM_GENTIM: + return getCntxct(th.physical != 0), true + default: + return 0, false + } +} diff --git a/src/runtime/vdso_freebsd_riscv64.go b/src/runtime/vdso_freebsd_riscv64.go new file mode 100644 index 0000000..a4fff4b --- /dev/null +++ b/src/runtime/vdso_freebsd_riscv64.go @@ -0,0 +1,21 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + _VDSO_TH_ALGO_RISCV_RDTIME = 1 +) + +func getCntxct() uint32 + +//go:nosplit +func (th *vdsoTimehands) getTimecounter() (uint32, bool) { + switch th.algo { + case _VDSO_TH_ALGO_RISCV_RDTIME: + return getCntxct(), true + default: + return 0, false + } +} diff --git a/src/runtime/vdso_freebsd_x86.go b/src/runtime/vdso_freebsd_x86.go new file mode 100644 index 0000000..66d1c65 --- /dev/null +++ b/src/runtime/vdso_freebsd_x86.go @@ -0,0 +1,90 @@ +// Copyright 2018 The Go Authors. All rights reserved. 
+// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build freebsd && (386 || amd64) + +package runtime + +import ( + "runtime/internal/atomic" + "unsafe" +) + +const ( + _VDSO_TH_ALGO_X86_TSC = 1 + _VDSO_TH_ALGO_X86_HPET = 2 +) + +const ( + _HPET_DEV_MAP_MAX = 10 + _HPET_MAIN_COUNTER = 0xf0 /* Main counter register */ + + hpetDevPath = "/dev/hpetX\x00" +) + +var hpetDevMap [_HPET_DEV_MAP_MAX]uintptr + +//go:nosplit +func (th *vdsoTimehands) getTSCTimecounter() uint32 { + tsc := cputicks() + if th.x86_shift > 0 { + tsc >>= th.x86_shift + } + return uint32(tsc) +} + +//go:nosplit +func (th *vdsoTimehands) getHPETTimecounter() (uint32, bool) { + idx := int(th.x86_hpet_idx) + if idx >= len(hpetDevMap) { + return 0, false + } + + p := atomic.Loaduintptr(&hpetDevMap[idx]) + if p == 0 { + systemstack(func() { initHPETTimecounter(idx) }) + p = atomic.Loaduintptr(&hpetDevMap[idx]) + } + if p == ^uintptr(0) { + return 0, false + } + return *(*uint32)(unsafe.Pointer(p + _HPET_MAIN_COUNTER)), true +} + +//go:systemstack +func initHPETTimecounter(idx int) { + const digits = "0123456789" + + var devPath [len(hpetDevPath)]byte + copy(devPath[:], hpetDevPath) + devPath[9] = digits[idx] + + fd := open(&devPath[0], 0 /* O_RDONLY */ |_O_CLOEXEC, 0) + if fd < 0 { + atomic.Casuintptr(&hpetDevMap[idx], 0, ^uintptr(0)) + return + } + + addr, mmapErr := mmap(nil, physPageSize, _PROT_READ, _MAP_SHARED, fd, 0) + closefd(fd) + newP := uintptr(addr) + if mmapErr != 0 { + newP = ^uintptr(0) + } + if !atomic.Casuintptr(&hpetDevMap[idx], 0, newP) && mmapErr == 0 { + munmap(addr, physPageSize) + } +} + +//go:nosplit +func (th *vdsoTimehands) getTimecounter() (uint32, bool) { + switch th.algo { + case _VDSO_TH_ALGO_X86_TSC: + return th.getTSCTimecounter(), true + case _VDSO_TH_ALGO_X86_HPET: + return th.getHPETTimecounter() + default: + return 0, false + } +} diff --git a/src/runtime/vdso_in_none.go b/src/runtime/vdso_in_none.go new 
file mode 100644 index 0000000..3a6ee6f --- /dev/null +++ b/src/runtime/vdso_in_none.go @@ -0,0 +1,13 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build (linux && !386 && !amd64 && !arm && !arm64 && !loong64 && !mips64 && !mips64le && !ppc64 && !ppc64le && !riscv64 && !s390x) || !linux + +package runtime + +// A dummy version of inVDSOPage for targets that don't use a VDSO. + +func inVDSOPage(pc uintptr) bool { + return false +} diff --git a/src/runtime/vdso_linux.go b/src/runtime/vdso_linux.go new file mode 100644 index 0000000..4523615 --- /dev/null +++ b/src/runtime/vdso_linux.go @@ -0,0 +1,295 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (386 || amd64 || arm || arm64 || loong64 || mips64 || mips64le || ppc64 || ppc64le || riscv64 || s390x) + +package runtime + +import "unsafe" + +// Look up symbols in the Linux vDSO. 
+ +// This code was originally based on the sample Linux vDSO parser at +// https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/vDSO/parse_vdso.c + +// This implements the ELF dynamic linking spec at +// http://sco.com/developers/gabi/latest/ch5.dynamic.html + +// The version section is documented at +// https://refspecs.linuxfoundation.org/LSB_3.2.0/LSB-Core-generic/LSB-Core-generic/symversion.html + +const ( + _AT_SYSINFO_EHDR = 33 + + _PT_LOAD = 1 /* Loadable program segment */ + _PT_DYNAMIC = 2 /* Dynamic linking information */ + + _DT_NULL = 0 /* Marks end of dynamic section */ + _DT_HASH = 4 /* Dynamic symbol hash table */ + _DT_STRTAB = 5 /* Address of string table */ + _DT_SYMTAB = 6 /* Address of symbol table */ + _DT_GNU_HASH = 0x6ffffef5 /* GNU-style dynamic symbol hash table */ + _DT_VERSYM = 0x6ffffff0 + _DT_VERDEF = 0x6ffffffc + + _VER_FLG_BASE = 0x1 /* Version definition of file itself */ + + _SHN_UNDEF = 0 /* Undefined section */ + + _SHT_DYNSYM = 11 /* Dynamic linker symbol table */ + + _STT_FUNC = 2 /* Symbol is a code object */ + + _STT_NOTYPE = 0 /* Symbol type is not specified */ + + _STB_GLOBAL = 1 /* Global symbol */ + _STB_WEAK = 2 /* Weak symbol */ + + _EI_NIDENT = 16 + + // Maximum indices for the array types used when traversing the vDSO ELF structures. + // Computed from architecture-specific max provided by vdso_linux_*.go + vdsoSymTabSize = vdsoArrayMax / unsafe.Sizeof(elfSym{}) + vdsoDynSize = vdsoArrayMax / unsafe.Sizeof(elfDyn{}) + vdsoSymStringsSize = vdsoArrayMax // byte + vdsoVerSymSize = vdsoArrayMax / 2 // uint16 + vdsoHashSize = vdsoArrayMax / 4 // uint32 + + // vdsoBloomSizeScale is a scaling factor for gnuhash tables which are uint32 indexed, + // but contain uintptrs + vdsoBloomSizeScale = unsafe.Sizeof(uintptr(0)) / 4 // uint32 +) + +/* How to extract and insert information held in the st_info field. 
*/ +func _ELF_ST_BIND(val byte) byte { return val >> 4 } +func _ELF_ST_TYPE(val byte) byte { return val & 0xf } + +type vdsoSymbolKey struct { + name string + symHash uint32 + gnuHash uint32 + ptr *uintptr +} + +type vdsoVersionKey struct { + version string + verHash uint32 +} + +type vdsoInfo struct { + valid bool + + /* Load information */ + loadAddr uintptr + loadOffset uintptr /* loadAddr - recorded vaddr */ + + /* Symbol table */ + symtab *[vdsoSymTabSize]elfSym + symstrings *[vdsoSymStringsSize]byte + chain []uint32 + bucket []uint32 + symOff uint32 + isGNUHash bool + + /* Version table */ + versym *[vdsoVerSymSize]uint16 + verdef *elfVerdef +} + +// see vdso_linux_*.go for vdsoSymbolKeys[] and vdso*Sym vars + +func vdsoInitFromSysinfoEhdr(info *vdsoInfo, hdr *elfEhdr) { + info.valid = false + info.loadAddr = uintptr(unsafe.Pointer(hdr)) + + pt := unsafe.Pointer(info.loadAddr + uintptr(hdr.e_phoff)) + + // We need two things from the segment table: the load offset + // and the dynamic table. + var foundVaddr bool + var dyn *[vdsoDynSize]elfDyn + for i := uint16(0); i < hdr.e_phnum; i++ { + pt := (*elfPhdr)(add(pt, uintptr(i)*unsafe.Sizeof(elfPhdr{}))) + switch pt.p_type { + case _PT_LOAD: + if !foundVaddr { + foundVaddr = true + info.loadOffset = info.loadAddr + uintptr(pt.p_offset-pt.p_vaddr) + } + + case _PT_DYNAMIC: + dyn = (*[vdsoDynSize]elfDyn)(unsafe.Pointer(info.loadAddr + uintptr(pt.p_offset))) + } + } + + if !foundVaddr || dyn == nil { + return // Failed + } + + // Fish out the useful bits of the dynamic table. 
+ + var hash, gnuhash *[vdsoHashSize]uint32 + info.symstrings = nil + info.symtab = nil + info.versym = nil + info.verdef = nil + for i := 0; dyn[i].d_tag != _DT_NULL; i++ { + dt := &dyn[i] + p := info.loadOffset + uintptr(dt.d_val) + switch dt.d_tag { + case _DT_STRTAB: + info.symstrings = (*[vdsoSymStringsSize]byte)(unsafe.Pointer(p)) + case _DT_SYMTAB: + info.symtab = (*[vdsoSymTabSize]elfSym)(unsafe.Pointer(p)) + case _DT_HASH: + hash = (*[vdsoHashSize]uint32)(unsafe.Pointer(p)) + case _DT_GNU_HASH: + gnuhash = (*[vdsoHashSize]uint32)(unsafe.Pointer(p)) + case _DT_VERSYM: + info.versym = (*[vdsoVerSymSize]uint16)(unsafe.Pointer(p)) + case _DT_VERDEF: + info.verdef = (*elfVerdef)(unsafe.Pointer(p)) + } + } + + if info.symstrings == nil || info.symtab == nil || (hash == nil && gnuhash == nil) { + return // Failed + } + + if info.verdef == nil { + info.versym = nil + } + + if gnuhash != nil { + // Parse the GNU hash table header. + nbucket := gnuhash[0] + info.symOff = gnuhash[1] + bloomSize := gnuhash[2] + info.bucket = gnuhash[4+bloomSize*uint32(vdsoBloomSizeScale):][:nbucket] + info.chain = gnuhash[4+bloomSize*uint32(vdsoBloomSizeScale)+nbucket:] + info.isGNUHash = true + } else { + // Parse the hash table header. + nbucket := hash[0] + nchain := hash[1] + info.bucket = hash[2 : 2+nbucket] + info.chain = hash[2+nbucket : 2+nbucket+nchain] + } + + // That's all we need. 
+ info.valid = true +} + +func vdsoFindVersion(info *vdsoInfo, ver *vdsoVersionKey) int32 { + if !info.valid { + return 0 + } + + def := info.verdef + for { + if def.vd_flags&_VER_FLG_BASE == 0 { + aux := (*elfVerdaux)(add(unsafe.Pointer(def), uintptr(def.vd_aux))) + if def.vd_hash == ver.verHash && ver.version == gostringnocopy(&info.symstrings[aux.vda_name]) { + return int32(def.vd_ndx & 0x7fff) + } + } + + if def.vd_next == 0 { + break + } + def = (*elfVerdef)(add(unsafe.Pointer(def), uintptr(def.vd_next))) + } + + return -1 // cannot match any version +} + +func vdsoParseSymbols(info *vdsoInfo, version int32) { + if !info.valid { + return + } + + apply := func(symIndex uint32, k vdsoSymbolKey) bool { + sym := &info.symtab[symIndex] + typ := _ELF_ST_TYPE(sym.st_info) + bind := _ELF_ST_BIND(sym.st_info) + // On ppc64x, VDSO functions are of type _STT_NOTYPE. + if typ != _STT_FUNC && typ != _STT_NOTYPE || bind != _STB_GLOBAL && bind != _STB_WEAK || sym.st_shndx == _SHN_UNDEF { + return false + } + if k.name != gostringnocopy(&info.symstrings[sym.st_name]) { + return false + } + // Check symbol version. + if info.versym != nil && version != 0 && int32(info.versym[symIndex]&0x7fff) != version { + return false + } + + *k.ptr = info.loadOffset + uintptr(sym.st_value) + return true + } + + if !info.isGNUHash { + // Old-style DT_HASH table. + for _, k := range vdsoSymbolKeys { + if len(info.bucket) > 0 { + for chain := info.bucket[k.symHash%uint32(len(info.bucket))]; chain != 0; chain = info.chain[chain] { + if apply(chain, k) { + break + } + } + } + } + return + } + + // New-style DT_GNU_HASH table. + for _, k := range vdsoSymbolKeys { + symIndex := info.bucket[k.gnuHash%uint32(len(info.bucket))] + if symIndex < info.symOff { + continue + } + for ; ; symIndex++ { + hash := info.chain[symIndex-info.symOff] + if hash|1 == k.gnuHash|1 { + // Found a hash match. + if apply(symIndex, k) { + break + } + } + if hash&1 != 0 { + // End of chain. 
+ break + } + } + } +} + +func vdsoauxv(tag, val uintptr) { + switch tag { + case _AT_SYSINFO_EHDR: + if val == 0 { + // Something went wrong + return + } + var info vdsoInfo + // TODO(rsc): I don't understand why the compiler thinks info escapes + // when passed to the three functions below. + info1 := (*vdsoInfo)(noescape(unsafe.Pointer(&info))) + vdsoInitFromSysinfoEhdr(info1, (*elfEhdr)(unsafe.Pointer(val))) + vdsoParseSymbols(info1, vdsoFindVersion(info1, &vdsoLinuxVersion)) + } +} + +// inVDSOPage reports whether PC is on the VDSO page. +// +//go:nosplit +func inVDSOPage(pc uintptr) bool { + for _, k := range vdsoSymbolKeys { + if *k.ptr != 0 { + page := *k.ptr &^ (physPageSize - 1) + return pc >= page && pc < page+physPageSize + } + } + return false +} diff --git a/src/runtime/vdso_linux_386.go b/src/runtime/vdso_linux_386.go new file mode 100644 index 0000000..5092c7c --- /dev/null +++ b/src/runtime/vdso_linux_386.go @@ -0,0 +1,21 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/x86/galign.go arch.MAXWIDTH initialization, but must also + // be constrained to max +ve int. + vdsoArrayMax = 1<<31 - 1 +) + +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6", 0x3ae75f6} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var vdsoClockgettimeSym uintptr = 0 diff --git a/src/runtime/vdso_linux_amd64.go b/src/runtime/vdso_linux_amd64.go new file mode 100644 index 0000000..4e9f748 --- /dev/null +++ b/src/runtime/vdso_linux_amd64.go @@ -0,0 +1,23 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file.
+ +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/amd64/galign.go arch.MAXWIDTH initialization. + vdsoArrayMax = 1<<50 - 1 +) + +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6", 0x3ae75f6} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_gettimeofday", 0x315ca59, 0xb01bca00, &vdsoGettimeofdaySym}, + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +var ( + vdsoGettimeofdaySym uintptr + vdsoClockgettimeSym uintptr +) diff --git a/src/runtime/vdso_linux_arm.go b/src/runtime/vdso_linux_arm.go new file mode 100644 index 0000000..ac3bdcf --- /dev/null +++ b/src/runtime/vdso_linux_arm.go @@ -0,0 +1,21 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/arm/galign.go arch.MAXWIDTH initialization, but must also + // be constrained to max +ve int. + vdsoArrayMax = 1<<31 - 1 +) + +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6", 0x3ae75f6} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var vdsoClockgettimeSym uintptr = 0 diff --git a/src/runtime/vdso_linux_arm64.go b/src/runtime/vdso_linux_arm64.go new file mode 100644 index 0000000..2f003cd --- /dev/null +++ b/src/runtime/vdso_linux_arm64.go @@ -0,0 +1,21 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/arm64/galign.go arch.MAXWIDTH initialization. 
+ vdsoArrayMax = 1<<50 - 1 +) + +// key and version at man 7 vdso : aarch64 +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6.39", 0x75fcb89} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__kernel_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var vdsoClockgettimeSym uintptr = 0 diff --git a/src/runtime/vdso_linux_loong64.go b/src/runtime/vdso_linux_loong64.go new file mode 100644 index 0000000..e00ef95 --- /dev/null +++ b/src/runtime/vdso_linux_loong64.go @@ -0,0 +1,27 @@ +// Copyright 2022 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && loong64 + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/loong64/galign.go arch.MAXWIDTH initialization. + vdsoArrayMax = 1<<50 - 1 +) + +// not currently described in manpages as of May 2022, but will eventually +// appear +// when that happens, see man 7 vdso : loongarch +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_5.10", 0xae78f70} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var ( + vdsoClockgettimeSym uintptr = 0 +) diff --git a/src/runtime/vdso_linux_mips64x.go b/src/runtime/vdso_linux_mips64x.go new file mode 100644 index 0000000..1444f8e --- /dev/null +++ b/src/runtime/vdso_linux_mips64x.go @@ -0,0 +1,27 @@ +// Copyright 2019 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (mips64 || mips64le) + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/mips64/galign.go arch.MAXWIDTH initialization. 
+ vdsoArrayMax = 1<<50 - 1 +) + +// see man 7 vdso : mips +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6", 0x3ae75f6} + +// The symbol name is not __kernel_clock_gettime as suggested by the manpage; +// according to Linux source code it should be __vdso_clock_gettime instead. +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var ( + vdsoClockgettimeSym uintptr = 0 +) diff --git a/src/runtime/vdso_linux_ppc64x.go b/src/runtime/vdso_linux_ppc64x.go new file mode 100644 index 0000000..09c8d9d --- /dev/null +++ b/src/runtime/vdso_linux_ppc64x.go @@ -0,0 +1,24 @@ +// Copyright 2018 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && (ppc64 || ppc64le) + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/ppc64/galign.go arch.MAXWIDTH initialization. + vdsoArrayMax = 1<<50 - 1 +) + +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6.15", 0x75fcba5} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__kernel_clock_gettime", 0xb0cd725, 0xdfa941fd, &vdsoClockgettimeSym}, +} + +// initialize with vsyscall fallbacks +var ( + vdsoClockgettimeSym uintptr = 0 +) diff --git a/src/runtime/vdso_linux_riscv64.go b/src/runtime/vdso_linux_riscv64.go new file mode 100644 index 0000000..f427124 --- /dev/null +++ b/src/runtime/vdso_linux_riscv64.go @@ -0,0 +1,21 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/riscv64/galign.go arch.MAXWIDTH initialization. 
+ vdsoArrayMax = 1<<50 - 1 +) + +// key and version at man 7 vdso : riscv +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_4.15", 0xae77f75} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__vdso_clock_gettime", 0xd35ec75, 0x6e43a318, &vdsoClockgettimeSym}, +} + +// initialize to fall back to syscall +var vdsoClockgettimeSym uintptr = 0 diff --git a/src/runtime/vdso_linux_s390x.go b/src/runtime/vdso_linux_s390x.go new file mode 100644 index 0000000..c1c0b1b --- /dev/null +++ b/src/runtime/vdso_linux_s390x.go @@ -0,0 +1,25 @@ +// Copyright 2021 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build linux && s390x +// +build linux,s390x + +package runtime + +const ( + // vdsoArrayMax is the byte-size of a maximally sized array on this architecture. + // See cmd/compile/internal/s390x/galign.go arch.MAXWIDTH initialization. + vdsoArrayMax = 1<<50 - 1 +) + +var vdsoLinuxVersion = vdsoVersionKey{"LINUX_2.6.29", 0x75fcbb9} + +var vdsoSymbolKeys = []vdsoSymbolKey{ + {"__kernel_clock_gettime", 0xb0cd725, 0xdfa941fd, &vdsoClockgettimeSym}, +} + +// initialize with vsyscall fallbacks +var ( + vdsoClockgettimeSym uintptr = 0 +) diff --git a/src/runtime/vlop_386.s b/src/runtime/vlop_386.s new file mode 100644 index 0000000..b478ff8 --- /dev/null +++ b/src/runtime/vlop_386.s @@ -0,0 +1,56 @@ +// Inferno's libkern/vlop-386.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/vlop-386.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. 
+// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "textflag.h" + +/* + * C runtime for 64-bit divide. + */ + +// runtime·_mul64by32(lo64 *uint64, a uint64, b uint32) (hi32 uint32) +// sets *lo64 = low 64 bits of 96-bit product a*b; returns high 32 bits. 
+TEXT runtime·_mul64by32(SB), NOSPLIT, $0 + MOVL lo64+0(FP), CX + MOVL a_lo+4(FP), AX + MULL b+12(FP) + MOVL AX, 0(CX) + MOVL DX, BX + MOVL a_hi+8(FP), AX + MULL b+12(FP) + ADDL AX, BX + ADCL $0, DX + MOVL BX, 4(CX) + MOVL DX, AX + MOVL AX, hi32+16(FP) + RET + +TEXT runtime·_div64by32(SB), NOSPLIT, $0 + MOVL r+12(FP), CX + MOVL a_lo+0(FP), AX + MOVL a_hi+4(FP), DX + DIVL b+8(FP) + MOVL DX, 0(CX) + MOVL AX, q+16(FP) + RET diff --git a/src/runtime/vlop_arm.s b/src/runtime/vlop_arm.s new file mode 100644 index 0000000..9e19938 --- /dev/null +++ b/src/runtime/vlop_arm.s @@ -0,0 +1,260 @@ +// Inferno's libkern/vlop-arm.s +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/vlop-arm.s +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +#include "go_asm.h" +#include "go_tls.h" +#include "funcdata.h" +#include "textflag.h" + +// func runtime·udiv(n, d uint32) (q, r uint32) +// compiler knows the register usage of this function +// Reference: +// Sloss, Andrew et al.; ARM System Developer's Guide: Designing and Optimizing System Software +// Morgan Kaufmann; 1 edition (April 8, 2004), ISBN 978-1558608740 +#define Rq R0 // input d, output q +#define Rr R1 // input n, output r +#define Rs R2 // three temporary variables +#define RM R3 +#define Ra R11 + +// Be careful: Ra == R11 will be used by the linker for synthesized instructions. +// Note: this function does not have a frame. +TEXT runtime·udiv(SB),NOSPLIT|NOFRAME,$0 + MOVBU internal∕cpu·ARM+const_offsetARMHasIDIVA(SB), Ra + CMP $0, Ra + BNE udiv_hardware + + CLZ Rq, Rs // find normalizing shift + MOVW.S Rq<<Rs, Ra + MOVW $fast_udiv_tab<>-64(SB), RM + ADD.NE Ra>>25, RM, Ra // index by most significant 7 bits of divisor + MOVBU.NE (Ra), Ra + + SUB.S $7, Rs + RSB $0, Rq, RM // M = -q + MOVW.PL Ra<<Rs, Rq + + // 1st Newton iteration + MUL.PL RM, Rq, Ra // a = -q*d + BMI udiv_by_large_d + MULAWT Ra, Rq, Rq, Rq // q approx q-(q*q*d>>32) + TEQ RM->1, RM // check for d=0 or d=1 + + // 2nd Newton iteration + MUL.NE RM, Rq, Ra + MOVW.NE $0, Rs + MULAL.NE Rq, Ra, (Rq,Rs) + BEQ udiv_by_0_or_1 + + // q now accurate enough for a remainder r, 0<=r<3*d + MULLU Rq, Rr, (Rq,Rs) // q = (r * q) >> 32 + ADD RM, Rr, Rr // r = n - d + MULA RM, Rq, Rr, Rr // r = n - (q+1)*d + + // since 0 <= n-q*d < 3*d; thus -d <= r < 2*d + CMN RM, Rr // t = r-d + SUB.CS RM, Rr, Rr // if (t<-d || t>=0) r=r+d + ADD.CC $1, Rq + ADD.PL RM<<1, Rr + ADD.PL $2, Rq + RET + +// use hardware divider +udiv_hardware: + 
DIVUHW Rq, Rr, Rs + MUL Rs, Rq, RM + RSB Rr, RM, Rr + MOVW Rs, Rq + RET + +udiv_by_large_d: + // at this point we know d>=2^(31-6)=2^25 + SUB $4, Ra, Ra + RSB $0, Rs, Rs + MOVW Ra>>Rs, Rq + MULLU Rq, Rr, (Rq,Rs) + MULA RM, Rq, Rr, Rr + + // q now accurate enough for a remainder r, 0<=r<4*d + CMN Rr>>1, RM // if(r/2 >= d) + ADD.CS RM<<1, Rr + ADD.CS $2, Rq + CMN Rr, RM + ADD.CS RM, Rr + ADD.CS $1, Rq + RET + +udiv_by_0_or_1: + // carry set if d==1, carry clear if d==0 + BCC udiv_by_0 + MOVW Rr, Rq + MOVW $0, Rr + RET + +udiv_by_0: + MOVW $runtime·panicdivide(SB), R11 + B (R11) + +// var tab [64]byte +// tab[0] = 255; for i := 1; i <= 63; i++ { tab[i] = (1<<14)/(64+i) } +// laid out here as little-endian uint32s +DATA fast_udiv_tab<>+0x00(SB)/4, $0xf4f8fcff +DATA fast_udiv_tab<>+0x04(SB)/4, $0xe6eaedf0 +DATA fast_udiv_tab<>+0x08(SB)/4, $0xdadde0e3 +DATA fast_udiv_tab<>+0x0c(SB)/4, $0xcfd2d4d7 +DATA fast_udiv_tab<>+0x10(SB)/4, $0xc5c7cacc +DATA fast_udiv_tab<>+0x14(SB)/4, $0xbcbec0c3 +DATA fast_udiv_tab<>+0x18(SB)/4, $0xb4b6b8ba +DATA fast_udiv_tab<>+0x1c(SB)/4, $0xacaeb0b2 +DATA fast_udiv_tab<>+0x20(SB)/4, $0xa5a7a8aa +DATA fast_udiv_tab<>+0x24(SB)/4, $0x9fa0a2a3 +DATA fast_udiv_tab<>+0x28(SB)/4, $0x999a9c9d +DATA fast_udiv_tab<>+0x2c(SB)/4, $0x93949697 +DATA fast_udiv_tab<>+0x30(SB)/4, $0x8e8f9092 +DATA fast_udiv_tab<>+0x34(SB)/4, $0x898a8c8d +DATA fast_udiv_tab<>+0x38(SB)/4, $0x85868788 +DATA fast_udiv_tab<>+0x3c(SB)/4, $0x81828384 +GLOBL fast_udiv_tab<>(SB), RODATA, $64 + +// The linker will pass numerator in R8 +#define Rn R8 +// The linker expects the result in RTMP +#define RTMP R11 + +TEXT runtime·_divu(SB), NOSPLIT, $16-0 + // It's not strictly true that there are no local pointers. + // It could be that the saved registers Rq, Rr, Rs, and Rm + // contain pointers. However, the only way this can matter + // is if the stack grows (which it can't, udiv is nosplit) + // or if a fault happens and more frames are added to + // the stack due to deferred functions. 
+ // In the latter case, the stack can grow arbitrarily, + // and garbage collection can happen, and those + // operations care about pointers, but in that case + // the calling frame is dead, and so are the saved + // registers. So we can claim there are no pointers here. + NO_LOCAL_POINTERS + MOVW Rq, 4(R13) + MOVW Rr, 8(R13) + MOVW Rs, 12(R13) + MOVW RM, 16(R13) + + MOVW Rn, Rr /* numerator */ + MOVW g_m(g), Rq + MOVW m_divmod(Rq), Rq /* denominator */ + BL runtime·udiv(SB) + MOVW Rq, RTMP + MOVW 4(R13), Rq + MOVW 8(R13), Rr + MOVW 12(R13), Rs + MOVW 16(R13), RM + RET + +TEXT runtime·_modu(SB), NOSPLIT, $16-0 + NO_LOCAL_POINTERS + MOVW Rq, 4(R13) + MOVW Rr, 8(R13) + MOVW Rs, 12(R13) + MOVW RM, 16(R13) + + MOVW Rn, Rr /* numerator */ + MOVW g_m(g), Rq + MOVW m_divmod(Rq), Rq /* denominator */ + BL runtime·udiv(SB) + MOVW Rr, RTMP + MOVW 4(R13), Rq + MOVW 8(R13), Rr + MOVW 12(R13), Rs + MOVW 16(R13), RM + RET + +TEXT runtime·_div(SB),NOSPLIT,$16-0 + NO_LOCAL_POINTERS + MOVW Rq, 4(R13) + MOVW Rr, 8(R13) + MOVW Rs, 12(R13) + MOVW RM, 16(R13) + MOVW Rn, Rr /* numerator */ + MOVW g_m(g), Rq + MOVW m_divmod(Rq), Rq /* denominator */ + CMP $0, Rr + BGE d1 + RSB $0, Rr, Rr + CMP $0, Rq + BGE d2 + RSB $0, Rq, Rq +d0: + BL runtime·udiv(SB) /* none/both neg */ + MOVW Rq, RTMP + B out1 +d1: + CMP $0, Rq + BGE d0 + RSB $0, Rq, Rq +d2: + BL runtime·udiv(SB) /* one neg */ + RSB $0, Rq, RTMP +out1: + MOVW 4(R13), Rq + MOVW 8(R13), Rr + MOVW 12(R13), Rs + MOVW 16(R13), RM + RET + +TEXT runtime·_mod(SB),NOSPLIT,$16-0 + NO_LOCAL_POINTERS + MOVW Rq, 4(R13) + MOVW Rr, 8(R13) + MOVW Rs, 12(R13) + MOVW RM, 16(R13) + MOVW Rn, Rr /* numerator */ + MOVW g_m(g), Rq + MOVW m_divmod(Rq), Rq /* denominator */ + CMP $0, Rq + RSB.LT $0, Rq, Rq + CMP $0, Rr + BGE m1 + RSB $0, Rr, Rr + BL runtime·udiv(SB) /* neg numerator */ + RSB $0, Rr, RTMP + B out +m1: + BL runtime·udiv(SB) /* pos numerator */ + MOVW Rr, RTMP +out: + MOVW 4(R13), Rq + MOVW 8(R13), Rr + MOVW 12(R13), Rs + MOVW 16(R13), RM + 
RET + +// _mul64by32 and _div64by32 not implemented on arm +TEXT runtime·_mul64by32(SB), NOSPLIT, $0 + MOVW $0, R0 + MOVW (R0), R1 // crash + +TEXT runtime·_div64by32(SB), NOSPLIT, $0 + MOVW $0, R0 + MOVW (R0), R1 // crash diff --git a/src/runtime/vlop_arm_test.go b/src/runtime/vlop_arm_test.go new file mode 100644 index 0000000..015126a --- /dev/null +++ b/src/runtime/vlop_arm_test.go @@ -0,0 +1,128 @@ +// Copyright 2012 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime_test + +import ( + "runtime" + "testing" +) + +// arm soft division benchmarks adapted from +// https://ridiculousfish.com/files/division_benchmarks.tar.gz + +const numeratorsSize = 1 << 21 + +var numerators = randomNumerators() + +type randstate struct { + hi, lo uint32 +} + +func (r *randstate) rand() uint32 { + r.hi = r.hi<<16 + r.hi>>16 + r.hi += r.lo + r.lo += r.hi + return r.hi +} + +func randomNumerators() []uint32 { + numerators := make([]uint32, numeratorsSize) + random := &randstate{2147483563, 2147483563 ^ 0x49616E42} + for i := range numerators { + numerators[i] = random.rand() + } + return numerators +} + +func bmUint32Div(divisor uint32, b *testing.B) { + var sum uint32 + for i := 0; i < b.N; i++ { + sum += numerators[i&(numeratorsSize-1)] / divisor + } +} + +func BenchmarkUint32Div7(b *testing.B) { bmUint32Div(7, b) } +func BenchmarkUint32Div37(b *testing.B) { bmUint32Div(37, b) } +func BenchmarkUint32Div123(b *testing.B) { bmUint32Div(123, b) } +func BenchmarkUint32Div763(b *testing.B) { bmUint32Div(763, b) } +func BenchmarkUint32Div1247(b *testing.B) { bmUint32Div(1247, b) } +func BenchmarkUint32Div9305(b *testing.B) { bmUint32Div(9305, b) } +func BenchmarkUint32Div13307(b *testing.B) { bmUint32Div(13307, b) } +func BenchmarkUint32Div52513(b *testing.B) { bmUint32Div(52513, b) } +func BenchmarkUint32Div60978747(b *testing.B) { bmUint32Div(60978747, b) } +func 
BenchmarkUint32Div106956295(b *testing.B) { bmUint32Div(106956295, b) } + +func bmUint32Mod(divisor uint32, b *testing.B) { + var sum uint32 + for i := 0; i < b.N; i++ { + sum += numerators[i&(numeratorsSize-1)] % divisor + } +} + +func BenchmarkUint32Mod7(b *testing.B) { bmUint32Mod(7, b) } +func BenchmarkUint32Mod37(b *testing.B) { bmUint32Mod(37, b) } +func BenchmarkUint32Mod123(b *testing.B) { bmUint32Mod(123, b) } +func BenchmarkUint32Mod763(b *testing.B) { bmUint32Mod(763, b) } +func BenchmarkUint32Mod1247(b *testing.B) { bmUint32Mod(1247, b) } +func BenchmarkUint32Mod9305(b *testing.B) { bmUint32Mod(9305, b) } +func BenchmarkUint32Mod13307(b *testing.B) { bmUint32Mod(13307, b) } +func BenchmarkUint32Mod52513(b *testing.B) { bmUint32Mod(52513, b) } +func BenchmarkUint32Mod60978747(b *testing.B) { bmUint32Mod(60978747, b) } +func BenchmarkUint32Mod106956295(b *testing.B) { bmUint32Mod(106956295, b) } + +func TestUsplit(t *testing.T) { + var den uint32 = 1000000 + for _, x := range []uint32{0, 1, 999999, 1000000, 1010101, 0xFFFFFFFF} { + q1, r1 := runtime.Usplit(x) + q2, r2 := x/den, x%den + if q1 != q2 || r1 != r2 { + t.Errorf("%d/1e6, %d%%1e6 = %d, %d, want %d, %d", x, x, q1, r1, q2, r2) + } + } +} + +//go:noinline +func armFloatWrite(a *[129]float64) { + // This used to miscompile on arm5. + // The offset is too big to fit in a load. + // So the code does: + // ldr r0, [sp, #8] + // bl 6f690 <_sfloat> + // ldr fp, [pc, #32] ; (address of 128.0) + // vldr d0, [fp] + // ldr fp, [pc, #28] ; (1024) + // add fp, fp, r0 + // vstr d0, [fp] + // The software floating-point emulator gives up on the add. + // This causes the store to not work. + // See issue 15440. 
+ a[128] = 128.0 +} +func TestArmFloatBigOffsetWrite(t *testing.T) { + var a [129]float64 + for i := 0; i < 128; i++ { + a[i] = float64(i) + } + armFloatWrite(&a) + for i, x := range a { + if x != float64(i) { + t.Errorf("bad entry %d:%f\n", i, x) + } + } +} + +//go:noinline +func armFloatRead(a *[129]float64) float64 { + return a[128] +} +func TestArmFloatBigOffsetRead(t *testing.T) { + var a [129]float64 + for i := 0; i < 129; i++ { + a[i] = float64(i) + } + if x := armFloatRead(&a); x != 128.0 { + t.Errorf("bad value %f\n", x) + } +} diff --git a/src/runtime/vlrt.go b/src/runtime/vlrt.go new file mode 100644 index 0000000..4b12f59 --- /dev/null +++ b/src/runtime/vlrt.go @@ -0,0 +1,310 @@ +// Inferno's libkern/vlrt-arm.c +// https://bitbucket.org/inferno-os/inferno-os/src/master/libkern/vlrt-arm.c +// +// Copyright © 1994-1999 Lucent Technologies Inc. All rights reserved. +// Revisions Copyright © 2000-2007 Vita Nuova Holdings Limited (www.vitanuova.com). All rights reserved. +// Portions Copyright 2009 The Go Authors. All rights reserved. +// +// Permission is hereby granted, free of charge, to any person obtaining a copy +// of this software and associated documentation files (the "Software"), to deal +// in the Software without restriction, including without limitation the rights +// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell +// copies of the Software, and to permit persons to whom the Software is +// furnished to do so, subject to the following conditions: +// +// The above copyright notice and this permission notice shall be included in +// all copies or substantial portions of the Software. +// +// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR +// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, +// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE +// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER +// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, +// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN +// THE SOFTWARE. + +//go:build arm || 386 || mips || mipsle + +package runtime + +import "unsafe" + +const ( + sign32 = 1 << (32 - 1) + sign64 = 1 << (64 - 1) +) + +func float64toint64(d float64) (y uint64) { + _d2v(&y, d) + return +} + +func float64touint64(d float64) (y uint64) { + _d2v(&y, d) + return +} + +func int64tofloat64(y int64) float64 { + if y < 0 { + return -uint64tofloat64(-uint64(y)) + } + return uint64tofloat64(uint64(y)) +} + +func uint64tofloat64(y uint64) float64 { + hi := float64(uint32(y >> 32)) + lo := float64(uint32(y)) + d := hi*(1<<32) + lo + return d +} + +func int64tofloat32(y int64) float32 { + if y < 0 { + return -uint64tofloat32(-uint64(y)) + } + return uint64tofloat32(uint64(y)) +} + +func uint64tofloat32(y uint64) float32 { + // divide into top 18, mid 23, and bottom 23 bits. + // (23-bit integers fit into a float32 without loss.) + top := uint32(y >> 46) + mid := uint32(y >> 23 & (1<<23 - 1)) + bot := uint32(y & (1<<23 - 1)) + if top == 0 { + return float32(mid)*(1<<23) + float32(bot) + } + if bot != 0 { + // Top is not zero, so the bits in bot + // won't make it into the final mantissa. + // In fact, the bottom bit of mid won't + // make it into the mantissa either. + // We only need to make sure that if top+mid + // is about to round down in a round-to-even + // scenario, and bot is not zero, we make it + // round up instead. 
+ mid |= 1 + } + return float32(top)*(1<<46) + float32(mid)*(1<<23) +} + +func _d2v(y *uint64, d float64) { + x := *(*uint64)(unsafe.Pointer(&d)) + + xhi := uint32(x>>32)&0xfffff | 0x100000 + xlo := uint32(x) + sh := 1075 - int32(uint32(x>>52)&0x7ff) + + var ylo, yhi uint32 + if sh >= 0 { + sh := uint32(sh) + /* v = (hi||lo) >> sh */ + if sh < 32 { + if sh == 0 { + ylo = xlo + yhi = xhi + } else { + ylo = xlo>>sh | xhi<<(32-sh) + yhi = xhi >> sh + } + } else { + if sh == 32 { + ylo = xhi + } else if sh < 64 { + ylo = xhi >> (sh - 32) + } + } + } else { + /* v = (hi||lo) << -sh */ + sh := uint32(-sh) + if sh <= 11 { + ylo = xlo << sh + yhi = xhi<<sh | xlo>>(32-sh) + } else { + /* overflow */ + yhi = uint32(d) /* causes something awful */ + } + } + if x&sign64 != 0 { + if ylo != 0 { + ylo = -ylo + yhi = ^yhi + } else { + yhi = -yhi + } + } + + *y = uint64(yhi)<<32 | uint64(ylo) +} +func uint64div(n, d uint64) uint64 { + // Check for 32 bit operands + if uint32(n>>32) == 0 && uint32(d>>32) == 0 { + if uint32(d) == 0 { + panicdivide() + } + return uint64(uint32(n) / uint32(d)) + } + q, _ := dodiv(n, d) + return q +} + +func uint64mod(n, d uint64) uint64 { + // Check for 32 bit operands + if uint32(n>>32) == 0 && uint32(d>>32) == 0 { + if uint32(d) == 0 { + panicdivide() + } + return uint64(uint32(n) % uint32(d)) + } + _, r := dodiv(n, d) + return r +} + +func int64div(n, d int64) int64 { + // Check for 32 bit operands + if int64(int32(n)) == n && int64(int32(d)) == d { + if int32(n) == -0x80000000 && int32(d) == -1 { + // special case: 32-bit -0x80000000 / -1 = -0x80000000, + // but 64-bit -0x80000000 / -1 = 0x80000000. 
+ return 0x80000000 + } + if int32(d) == 0 { + panicdivide() + } + return int64(int32(n) / int32(d)) + } + + nneg := n < 0 + dneg := d < 0 + if nneg { + n = -n + } + if dneg { + d = -d + } + uq, _ := dodiv(uint64(n), uint64(d)) + q := int64(uq) + if nneg != dneg { + q = -q + } + return q +} + +//go:nosplit +func int64mod(n, d int64) int64 { + // Check for 32 bit operands + if int64(int32(n)) == n && int64(int32(d)) == d { + if int32(d) == 0 { + panicdivide() + } + return int64(int32(n) % int32(d)) + } + + nneg := n < 0 + if nneg { + n = -n + } + if d < 0 { + d = -d + } + _, ur := dodiv(uint64(n), uint64(d)) + r := int64(ur) + if nneg { + r = -r + } + return r +} + +//go:noescape +func _mul64by32(lo64 *uint64, a uint64, b uint32) (hi32 uint32) + +//go:noescape +func _div64by32(a uint64, b uint32, r *uint32) (q uint32) + +//go:nosplit +func dodiv(n, d uint64) (q, r uint64) { + if GOARCH == "arm" { + // arm doesn't have a division instruction, so + // slowdodiv is the best that we can do. + return slowdodiv(n, d) + } + + if GOARCH == "mips" || GOARCH == "mipsle" { + // No _div64by32 on mips and using only _mul64by32 doesn't bring much benefit + return slowdodiv(n, d) + } + + if d > n { + return 0, n + } + + if uint32(d>>32) != 0 { + t := uint32(n>>32) / uint32(d>>32) + var lo64 uint64 + hi32 := _mul64by32(&lo64, d, t) + if hi32 != 0 || lo64 > n { + return slowdodiv(n, d) + } + return uint64(t), n - lo64 + } + + // d is 32 bit + var qhi uint32 + if uint32(n>>32) >= uint32(d) { + if uint32(d) == 0 { + panicdivide() + } + qhi = uint32(n>>32) / uint32(d) + n -= uint64(uint32(d)*qhi) << 32 + } else { + qhi = 0 + } + + var rlo uint32 + qlo := _div64by32(n, uint32(d), &rlo) + return uint64(qhi)<<32 + uint64(qlo), uint64(rlo) +} + +//go:nosplit +func slowdodiv(n, d uint64) (q, r uint64) { + if d == 0 { + panicdivide() + } + + // Set up the divisor and find the number of iterations needed. 
+ capn := n + if n >= sign64 { + capn = sign64 + } + i := 0 + for d < capn { + d <<= 1 + i++ + } + + for ; i >= 0; i-- { + q <<= 1 + if n >= d { + n -= d + q |= 1 + } + d >>= 1 + } + return q, n +} + +// Floating point control word values. +// Bits 0-5 are bits to disable floating-point exceptions. +// Bits 8-9 are the precision control: +// +// 0 = single precision a.k.a. float32 +// 2 = double precision a.k.a. float64 +// +// Bits 10-11 are the rounding mode: +// +// 0 = round to nearest (even on a tie) +// 3 = round toward zero +var ( + controlWord64 uint16 = 0x3f + 2<<8 + 0<<10 + controlWord64trunc uint16 = 0x3f + 2<<8 + 3<<10 +) diff --git a/src/runtime/wincallback.go b/src/runtime/wincallback.go new file mode 100644 index 0000000..9ec2027 --- /dev/null +++ b/src/runtime/wincallback.go @@ -0,0 +1,125 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build ignore + +// Generate Windows callback assembly file. + +package main + +import ( + "bytes" + "fmt" + "os" +) + +const maxCallback = 2000 + +func genasm386Amd64() { + var buf bytes.Buffer + + buf.WriteString(`// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +//go:build 386 || amd64 + +// runtime·callbackasm is called by external code to +// execute Go implemented callback function. It is not +// called from the start, instead runtime·compilecallback +// always returns address into runtime·callbackasm offset +// appropriately so different callbacks start with different +// CALL instruction in runtime·callbackasm. This determines +// which Go callback function is executed later on. 
+ +TEXT runtime·callbackasm(SB),7,$0 +`) + for i := 0; i < maxCallback; i++ { + buf.WriteString("\tCALL\truntime·callbackasm1(SB)\n") + } + + filename := fmt.Sprintf("zcallback_windows.s") + err := os.WriteFile(filename, buf.Bytes(), 0666) + if err != nil { + fmt.Fprintf(os.Stderr, "wincallback: %s\n", err) + os.Exit(2) + } +} + +func genasmArm() { + var buf bytes.Buffer + + buf.WriteString(`// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +// External code calls into callbackasm at an offset corresponding +// to the callback index. Callbackasm is a table of MOV and B instructions. +// The MOV instruction loads R12 with the callback index, and the +// B instruction branches to callbackasm1. +// callbackasm1 takes the callback index from R12 and +// indexes into an array that stores information about each callback. +// It then calls the Go implementation for that callback. +#include "textflag.h" + +TEXT runtime·callbackasm(SB),NOSPLIT|NOFRAME,$0 +`) + for i := 0; i < maxCallback; i++ { + fmt.Fprintf(&buf, "\tMOVW\t$%d, R12\n", i) + buf.WriteString("\tB\truntime·callbackasm1(SB)\n") + } + + err := os.WriteFile("zcallback_windows_arm.s", buf.Bytes(), 0666) + if err != nil { + fmt.Fprintf(os.Stderr, "wincallback: %s\n", err) + os.Exit(2) + } +} + +func genasmArm64() { + var buf bytes.Buffer + + buf.WriteString(`// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +// External code calls into callbackasm at an offset corresponding +// to the callback index. Callbackasm is a table of MOV and B instructions. +// The MOV instruction loads R12 with the callback index, and the +// B instruction branches to callbackasm1. +// callbackasm1 takes the callback index from R12 and +// indexes into an array that stores information about each callback. +// It then calls the Go implementation for that callback. 
+#include "textflag.h" + +TEXT runtime·callbackasm(SB),NOSPLIT|NOFRAME,$0 +`) + for i := 0; i < maxCallback; i++ { + fmt.Fprintf(&buf, "\tMOVD\t$%d, R12\n", i) + buf.WriteString("\tB\truntime·callbackasm1(SB)\n") + } + + err := os.WriteFile("zcallback_windows_arm64.s", buf.Bytes(), 0666) + if err != nil { + fmt.Fprintf(os.Stderr, "wincallback: %s\n", err) + os.Exit(2) + } +} + +func gengo() { + var buf bytes.Buffer + + fmt.Fprintf(&buf, `// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +package runtime + +const cb_max = %d // maximum number of windows callbacks allowed +`, maxCallback) + err := os.WriteFile("zcallback_windows.go", buf.Bytes(), 0666) + if err != nil { + fmt.Fprintf(os.Stderr, "wincallback: %s\n", err) + os.Exit(2) + } +} + +func main() { + genasm386Amd64() + genasmArm() + genasmArm64() + gengo() +} diff --git a/src/runtime/write_err.go b/src/runtime/write_err.go new file mode 100644 index 0000000..81ae872 --- /dev/null +++ b/src/runtime/write_err.go @@ -0,0 +1,13 @@ +// Copyright 2009 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +//go:build !android + +package runtime + +import "unsafe" + +func writeErr(b []byte) { + write(2, unsafe.Pointer(&b[0]), int32(len(b))) +} diff --git a/src/runtime/write_err_android.go b/src/runtime/write_err_android.go new file mode 100644 index 0000000..a876900 --- /dev/null +++ b/src/runtime/write_err_android.go @@ -0,0 +1,162 @@ +// Copyright 2014 The Go Authors. All rights reserved. +// Use of this source code is governed by a BSD-style +// license that can be found in the LICENSE file. + +package runtime + +import "unsafe" + +var ( + writeHeader = []byte{6 /* ANDROID_LOG_ERROR */, 'G', 'o', 0} + writePath = []byte("/dev/log/main\x00") + writeLogd = []byte("/dev/socket/logdw\x00") + + // guarded by printlock/printunlock. 
+ writeFD uintptr + writeBuf [1024]byte + writePos int +) + +// Prior to Android-L, logging was done through writes to /dev/log files implemented +// in kernel ring buffers. In Android-L, those /dev/log files are no longer +// accessible and logging is done through a centralized user-mode logger, logd. +// +// https://android.googlesource.com/platform/system/core/+/refs/tags/android-6.0.1_r78/liblog/logd_write.c +type loggerType int32 + +const ( + unknown loggerType = iota + legacy + logd + // TODO(hakim): logging for emulator? +) + +var logger loggerType + +func writeErr(b []byte) { + if logger == unknown { + // Use logd if /dev/socket/logdw is available. + if v := uintptr(access(&writeLogd[0], 0x02 /* W_OK */)); v == 0 { + logger = logd + initLogd() + } else { + logger = legacy + initLegacy() + } + } + + // Write to stderr for command-line programs. + write(2, unsafe.Pointer(&b[0]), int32(len(b))) + + // Log format: "<header>\x00<message m bytes>\x00" + // + // <header> + // In legacy mode: "<priority 1 byte><tag n bytes>". + // In logd mode: "<android_log_header_t 11 bytes><priority 1 byte><tag n bytes>" + // + // The entire log needs to be delivered in a single syscall (the NDK + // does this with writev). Each log is its own line, so we need to + // buffer writes until we see a newline. + var hlen int + switch logger { + case logd: + hlen = writeLogdHeader() + case legacy: + hlen = len(writeHeader) + } + + dst := writeBuf[hlen:] + for _, v := range b { + if v == 0 { // android logging won't print a zero byte + v = '0' + } + dst[writePos] = v + writePos++ + if v == '\n' || writePos == len(dst)-1 { + dst[writePos] = 0 + write(writeFD, unsafe.Pointer(&writeBuf[0]), int32(hlen+writePos)) + for i := range dst { + dst[i] = 0 + } + writePos = 0 + } + } +} + +func initLegacy() { + // In legacy mode, logs are written to /dev/log/main + writeFD = uintptr(open(&writePath[0], 0x1 /* O_WRONLY */, 0)) + if writeFD == 0 { + // It is hard to do anything here. 
Write to stderr just + // in case user has root on device and has run + // adb shell setprop log.redirect-stdio true + msg := []byte("runtime: cannot open /dev/log/main\x00") + write(2, unsafe.Pointer(&msg[0]), int32(len(msg))) + exit(2) + } + + // Prepopulate the invariant header part. + copy(writeBuf[:len(writeHeader)], writeHeader) +} + +// used in initLogd but defined here to avoid heap allocation. +var logdAddr sockaddr_un + +func initLogd() { + // In logd mode, logs are sent to the logd via a unix domain socket. + logdAddr.family = _AF_UNIX + copy(logdAddr.path[:], writeLogd) + + // We are not using non-blocking I/O because writes taking this path + // are most likely triggered by panic, we cannot think of the advantage of + // non-blocking I/O for panic but see disadvantage (dropping panic message), + // and blocking I/O simplifies the code a lot. + fd := socket(_AF_UNIX, _SOCK_DGRAM|_O_CLOEXEC, 0) + if fd < 0 { + msg := []byte("runtime: cannot create a socket for logging\x00") + write(2, unsafe.Pointer(&msg[0]), int32(len(msg))) + exit(2) + } + + errno := connect(fd, unsafe.Pointer(&logdAddr), int32(unsafe.Sizeof(logdAddr))) + if errno < 0 { + msg := []byte("runtime: cannot connect to /dev/socket/logdw\x00") + write(2, unsafe.Pointer(&msg[0]), int32(len(msg))) + // TODO(hakim): or should we just close fd and hope for better luck next time? + exit(2) + } + writeFD = uintptr(fd) + + // Prepopulate invariant part of the header. + // The first 11 bytes will be populated later in writeLogdHeader. + copy(writeBuf[11:11+len(writeHeader)], writeHeader) +} + +// writeLogdHeader populates the header and returns the length of the header. 
+func writeLogdHeader() int { + hdr := writeBuf[:11] + + // The first 11 bytes of the header correspond to android_log_header_t + // as defined in system/core/include/private/android_logger.h + // hdr[0] log type id (unsigned char), defined in <log/log.h> + // hdr[1:3] tid (uint16_t) + // hdr[3:11] log_time defined in <log/log_read.h> + // hdr[3:7] sec uint32, little endian. + // hdr[7:11] nsec uint32, little endian. + hdr[0] = 0 // LOG_ID_MAIN + sec, nsec, _ := time_now() + packUint32(hdr[3:7], uint32(sec)) + packUint32(hdr[7:11], uint32(nsec)) + + // TODO(hakim): hdr[1:3] = gettid? + + return 11 + len(writeHeader) +} + +func packUint32(b []byte, v uint32) { + // little-endian. + b[0] = byte(v) + b[1] = byte(v >> 8) + b[2] = byte(v >> 16) + b[3] = byte(v >> 24) +} diff --git a/src/runtime/zcallback_windows.go b/src/runtime/zcallback_windows.go new file mode 100644 index 0000000..2c3cb28 --- /dev/null +++ b/src/runtime/zcallback_windows.go @@ -0,0 +1,5 @@ +// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +package runtime + +const cb_max = 2000 // maximum number of windows callbacks allowed diff --git a/src/runtime/zcallback_windows.s b/src/runtime/zcallback_windows.s new file mode 100644 index 0000000..bd23d71 --- /dev/null +++ b/src/runtime/zcallback_windows.s @@ -0,0 +1,2013 @@ +// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +//go:build 386 || amd64 + +// runtime·callbackasm is called by external code to +// execute Go implemented callback function. It is not +// called from the start, instead runtime·compilecallback +// always returns address into runtime·callbackasm offset +// appropriately so different callbacks start with different +// CALL instruction in runtime·callbackasm. This determines +// which Go callback function is executed later on. 
+ +TEXT runtime·callbackasm(SB),7,$0 + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL 
runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) + CALL runtime·callbackasm1(SB) diff --git a/src/runtime/zcallback_windows_arm.s b/src/runtime/zcallback_windows_arm.s new file mode 100644 index 0000000..f943d84 --- /dev/null +++ b/src/runtime/zcallback_windows_arm.s @@ -0,0 +1,4012 @@ +// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +// External code calls into callbackasm at an offset corresponding +// to the callback index. Callbackasm is a table of MOV and B instructions. +// The MOV instruction loads R12 with the callback index, and the +// B instruction branches to callbackasm1. +// callbackasm1 takes the callback index from R12 and +// indexes into an array that stores information about each callback. +// It then calls the Go implementation for that callback. 
+#include "textflag.h"
+
+TEXT runtime·callbackasm(SB),NOSPLIT|NOFRAME,$0
+	MOVW	$0, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$1, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$2, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$3, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$4, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$5, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$6, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$7, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$8, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$9, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$10, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$11, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$12, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$13, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$14, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$15, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$16, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$17, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$18, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$19, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$20, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$21, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$22, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$23, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$24, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$25, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$26, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$27, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$28, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$29, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$30, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$31, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$32, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$33, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$34, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$35, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$36, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$37, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$38, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$39, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$40, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$41, R12
+	B	runtime·callbackasm1(SB)
+	MOVW	$42, R12
+	B	runtime·callbackasm1(SB)
+ MOVW $43, R12 + B runtime·callbackasm1(SB) + MOVW $44, R12 + B runtime·callbackasm1(SB) + MOVW $45, R12 + B runtime·callbackasm1(SB) + MOVW $46, R12 + B runtime·callbackasm1(SB) + MOVW $47, R12 + B runtime·callbackasm1(SB) + MOVW $48, R12 + B runtime·callbackasm1(SB) + MOVW $49, R12 + B runtime·callbackasm1(SB) + MOVW $50, R12 + B runtime·callbackasm1(SB) + MOVW $51, R12 + B runtime·callbackasm1(SB) + MOVW $52, R12 + B runtime·callbackasm1(SB) + MOVW $53, R12 + B runtime·callbackasm1(SB) + MOVW $54, R12 + B runtime·callbackasm1(SB) + MOVW $55, R12 + B runtime·callbackasm1(SB) + MOVW $56, R12 + B runtime·callbackasm1(SB) + MOVW $57, R12 + B runtime·callbackasm1(SB) + MOVW $58, R12 + B runtime·callbackasm1(SB) + MOVW $59, R12 + B runtime·callbackasm1(SB) + MOVW $60, R12 + B runtime·callbackasm1(SB) + MOVW $61, R12 + B runtime·callbackasm1(SB) + MOVW $62, R12 + B runtime·callbackasm1(SB) + MOVW $63, R12 + B runtime·callbackasm1(SB) + MOVW $64, R12 + B runtime·callbackasm1(SB) + MOVW $65, R12 + B runtime·callbackasm1(SB) + MOVW $66, R12 + B runtime·callbackasm1(SB) + MOVW $67, R12 + B runtime·callbackasm1(SB) + MOVW $68, R12 + B runtime·callbackasm1(SB) + MOVW $69, R12 + B runtime·callbackasm1(SB) + MOVW $70, R12 + B runtime·callbackasm1(SB) + MOVW $71, R12 + B runtime·callbackasm1(SB) + MOVW $72, R12 + B runtime·callbackasm1(SB) + MOVW $73, R12 + B runtime·callbackasm1(SB) + MOVW $74, R12 + B runtime·callbackasm1(SB) + MOVW $75, R12 + B runtime·callbackasm1(SB) + MOVW $76, R12 + B runtime·callbackasm1(SB) + MOVW $77, R12 + B runtime·callbackasm1(SB) + MOVW $78, R12 + B runtime·callbackasm1(SB) + MOVW $79, R12 + B runtime·callbackasm1(SB) + MOVW $80, R12 + B runtime·callbackasm1(SB) + MOVW $81, R12 + B runtime·callbackasm1(SB) + MOVW $82, R12 + B runtime·callbackasm1(SB) + MOVW $83, R12 + B runtime·callbackasm1(SB) + MOVW $84, R12 + B runtime·callbackasm1(SB) + MOVW $85, R12 + B runtime·callbackasm1(SB) + MOVW $86, R12 + B runtime·callbackasm1(SB) + MOVW $87, R12 + B 
runtime·callbackasm1(SB) + MOVW $88, R12 + B runtime·callbackasm1(SB) + MOVW $89, R12 + B runtime·callbackasm1(SB) + MOVW $90, R12 + B runtime·callbackasm1(SB) + MOVW $91, R12 + B runtime·callbackasm1(SB) + MOVW $92, R12 + B runtime·callbackasm1(SB) + MOVW $93, R12 + B runtime·callbackasm1(SB) + MOVW $94, R12 + B runtime·callbackasm1(SB) + MOVW $95, R12 + B runtime·callbackasm1(SB) + MOVW $96, R12 + B runtime·callbackasm1(SB) + MOVW $97, R12 + B runtime·callbackasm1(SB) + MOVW $98, R12 + B runtime·callbackasm1(SB) + MOVW $99, R12 + B runtime·callbackasm1(SB) + MOVW $100, R12 + B runtime·callbackasm1(SB) + MOVW $101, R12 + B runtime·callbackasm1(SB) + MOVW $102, R12 + B runtime·callbackasm1(SB) + MOVW $103, R12 + B runtime·callbackasm1(SB) + MOVW $104, R12 + B runtime·callbackasm1(SB) + MOVW $105, R12 + B runtime·callbackasm1(SB) + MOVW $106, R12 + B runtime·callbackasm1(SB) + MOVW $107, R12 + B runtime·callbackasm1(SB) + MOVW $108, R12 + B runtime·callbackasm1(SB) + MOVW $109, R12 + B runtime·callbackasm1(SB) + MOVW $110, R12 + B runtime·callbackasm1(SB) + MOVW $111, R12 + B runtime·callbackasm1(SB) + MOVW $112, R12 + B runtime·callbackasm1(SB) + MOVW $113, R12 + B runtime·callbackasm1(SB) + MOVW $114, R12 + B runtime·callbackasm1(SB) + MOVW $115, R12 + B runtime·callbackasm1(SB) + MOVW $116, R12 + B runtime·callbackasm1(SB) + MOVW $117, R12 + B runtime·callbackasm1(SB) + MOVW $118, R12 + B runtime·callbackasm1(SB) + MOVW $119, R12 + B runtime·callbackasm1(SB) + MOVW $120, R12 + B runtime·callbackasm1(SB) + MOVW $121, R12 + B runtime·callbackasm1(SB) + MOVW $122, R12 + B runtime·callbackasm1(SB) + MOVW $123, R12 + B runtime·callbackasm1(SB) + MOVW $124, R12 + B runtime·callbackasm1(SB) + MOVW $125, R12 + B runtime·callbackasm1(SB) + MOVW $126, R12 + B runtime·callbackasm1(SB) + MOVW $127, R12 + B runtime·callbackasm1(SB) + MOVW $128, R12 + B runtime·callbackasm1(SB) + MOVW $129, R12 + B runtime·callbackasm1(SB) + MOVW $130, R12 + B runtime·callbackasm1(SB) + MOVW 
$131, R12 + B runtime·callbackasm1(SB) + MOVW $132, R12 + B runtime·callbackasm1(SB) + MOVW $133, R12 + B runtime·callbackasm1(SB) + MOVW $134, R12 + B runtime·callbackasm1(SB) + MOVW $135, R12 + B runtime·callbackasm1(SB) + MOVW $136, R12 + B runtime·callbackasm1(SB) + MOVW $137, R12 + B runtime·callbackasm1(SB) + MOVW $138, R12 + B runtime·callbackasm1(SB) + MOVW $139, R12 + B runtime·callbackasm1(SB) + MOVW $140, R12 + B runtime·callbackasm1(SB) + MOVW $141, R12 + B runtime·callbackasm1(SB) + MOVW $142, R12 + B runtime·callbackasm1(SB) + MOVW $143, R12 + B runtime·callbackasm1(SB) + MOVW $144, R12 + B runtime·callbackasm1(SB) + MOVW $145, R12 + B runtime·callbackasm1(SB) + MOVW $146, R12 + B runtime·callbackasm1(SB) + MOVW $147, R12 + B runtime·callbackasm1(SB) + MOVW $148, R12 + B runtime·callbackasm1(SB) + MOVW $149, R12 + B runtime·callbackasm1(SB) + MOVW $150, R12 + B runtime·callbackasm1(SB) + MOVW $151, R12 + B runtime·callbackasm1(SB) + MOVW $152, R12 + B runtime·callbackasm1(SB) + MOVW $153, R12 + B runtime·callbackasm1(SB) + MOVW $154, R12 + B runtime·callbackasm1(SB) + MOVW $155, R12 + B runtime·callbackasm1(SB) + MOVW $156, R12 + B runtime·callbackasm1(SB) + MOVW $157, R12 + B runtime·callbackasm1(SB) + MOVW $158, R12 + B runtime·callbackasm1(SB) + MOVW $159, R12 + B runtime·callbackasm1(SB) + MOVW $160, R12 + B runtime·callbackasm1(SB) + MOVW $161, R12 + B runtime·callbackasm1(SB) + MOVW $162, R12 + B runtime·callbackasm1(SB) + MOVW $163, R12 + B runtime·callbackasm1(SB) + MOVW $164, R12 + B runtime·callbackasm1(SB) + MOVW $165, R12 + B runtime·callbackasm1(SB) + MOVW $166, R12 + B runtime·callbackasm1(SB) + MOVW $167, R12 + B runtime·callbackasm1(SB) + MOVW $168, R12 + B runtime·callbackasm1(SB) + MOVW $169, R12 + B runtime·callbackasm1(SB) + MOVW $170, R12 + B runtime·callbackasm1(SB) + MOVW $171, R12 + B runtime·callbackasm1(SB) + MOVW $172, R12 + B runtime·callbackasm1(SB) + MOVW $173, R12 + B runtime·callbackasm1(SB) + MOVW $174, R12 + B 
runtime·callbackasm1(SB) + MOVW $175, R12 + B runtime·callbackasm1(SB) + MOVW $176, R12 + B runtime·callbackasm1(SB) + MOVW $177, R12 + B runtime·callbackasm1(SB) + MOVW $178, R12 + B runtime·callbackasm1(SB) + MOVW $179, R12 + B runtime·callbackasm1(SB) + MOVW $180, R12 + B runtime·callbackasm1(SB) + MOVW $181, R12 + B runtime·callbackasm1(SB) + MOVW $182, R12 + B runtime·callbackasm1(SB) + MOVW $183, R12 + B runtime·callbackasm1(SB) + MOVW $184, R12 + B runtime·callbackasm1(SB) + MOVW $185, R12 + B runtime·callbackasm1(SB) + MOVW $186, R12 + B runtime·callbackasm1(SB) + MOVW $187, R12 + B runtime·callbackasm1(SB) + MOVW $188, R12 + B runtime·callbackasm1(SB) + MOVW $189, R12 + B runtime·callbackasm1(SB) + MOVW $190, R12 + B runtime·callbackasm1(SB) + MOVW $191, R12 + B runtime·callbackasm1(SB) + MOVW $192, R12 + B runtime·callbackasm1(SB) + MOVW $193, R12 + B runtime·callbackasm1(SB) + MOVW $194, R12 + B runtime·callbackasm1(SB) + MOVW $195, R12 + B runtime·callbackasm1(SB) + MOVW $196, R12 + B runtime·callbackasm1(SB) + MOVW $197, R12 + B runtime·callbackasm1(SB) + MOVW $198, R12 + B runtime·callbackasm1(SB) + MOVW $199, R12 + B runtime·callbackasm1(SB) + MOVW $200, R12 + B runtime·callbackasm1(SB) + MOVW $201, R12 + B runtime·callbackasm1(SB) + MOVW $202, R12 + B runtime·callbackasm1(SB) + MOVW $203, R12 + B runtime·callbackasm1(SB) + MOVW $204, R12 + B runtime·callbackasm1(SB) + MOVW $205, R12 + B runtime·callbackasm1(SB) + MOVW $206, R12 + B runtime·callbackasm1(SB) + MOVW $207, R12 + B runtime·callbackasm1(SB) + MOVW $208, R12 + B runtime·callbackasm1(SB) + MOVW $209, R12 + B runtime·callbackasm1(SB) + MOVW $210, R12 + B runtime·callbackasm1(SB) + MOVW $211, R12 + B runtime·callbackasm1(SB) + MOVW $212, R12 + B runtime·callbackasm1(SB) + MOVW $213, R12 + B runtime·callbackasm1(SB) + MOVW $214, R12 + B runtime·callbackasm1(SB) + MOVW $215, R12 + B runtime·callbackasm1(SB) + MOVW $216, R12 + B runtime·callbackasm1(SB) + MOVW $217, R12 + B 
runtime·callbackasm1(SB) + MOVW $218, R12 + B runtime·callbackasm1(SB) + MOVW $219, R12 + B runtime·callbackasm1(SB) + MOVW $220, R12 + B runtime·callbackasm1(SB) + MOVW $221, R12 + B runtime·callbackasm1(SB) + MOVW $222, R12 + B runtime·callbackasm1(SB) + MOVW $223, R12 + B runtime·callbackasm1(SB) + MOVW $224, R12 + B runtime·callbackasm1(SB) + MOVW $225, R12 + B runtime·callbackasm1(SB) + MOVW $226, R12 + B runtime·callbackasm1(SB) + MOVW $227, R12 + B runtime·callbackasm1(SB) + MOVW $228, R12 + B runtime·callbackasm1(SB) + MOVW $229, R12 + B runtime·callbackasm1(SB) + MOVW $230, R12 + B runtime·callbackasm1(SB) + MOVW $231, R12 + B runtime·callbackasm1(SB) + MOVW $232, R12 + B runtime·callbackasm1(SB) + MOVW $233, R12 + B runtime·callbackasm1(SB) + MOVW $234, R12 + B runtime·callbackasm1(SB) + MOVW $235, R12 + B runtime·callbackasm1(SB) + MOVW $236, R12 + B runtime·callbackasm1(SB) + MOVW $237, R12 + B runtime·callbackasm1(SB) + MOVW $238, R12 + B runtime·callbackasm1(SB) + MOVW $239, R12 + B runtime·callbackasm1(SB) + MOVW $240, R12 + B runtime·callbackasm1(SB) + MOVW $241, R12 + B runtime·callbackasm1(SB) + MOVW $242, R12 + B runtime·callbackasm1(SB) + MOVW $243, R12 + B runtime·callbackasm1(SB) + MOVW $244, R12 + B runtime·callbackasm1(SB) + MOVW $245, R12 + B runtime·callbackasm1(SB) + MOVW $246, R12 + B runtime·callbackasm1(SB) + MOVW $247, R12 + B runtime·callbackasm1(SB) + MOVW $248, R12 + B runtime·callbackasm1(SB) + MOVW $249, R12 + B runtime·callbackasm1(SB) + MOVW $250, R12 + B runtime·callbackasm1(SB) + MOVW $251, R12 + B runtime·callbackasm1(SB) + MOVW $252, R12 + B runtime·callbackasm1(SB) + MOVW $253, R12 + B runtime·callbackasm1(SB) + MOVW $254, R12 + B runtime·callbackasm1(SB) + MOVW $255, R12 + B runtime·callbackasm1(SB) + MOVW $256, R12 + B runtime·callbackasm1(SB) + MOVW $257, R12 + B runtime·callbackasm1(SB) + MOVW $258, R12 + B runtime·callbackasm1(SB) + MOVW $259, R12 + B runtime·callbackasm1(SB) + MOVW $260, R12 + B 
runtime·callbackasm1(SB) + MOVW $261, R12 + B runtime·callbackasm1(SB) + MOVW $262, R12 + B runtime·callbackasm1(SB) + MOVW $263, R12 + B runtime·callbackasm1(SB) + MOVW $264, R12 + B runtime·callbackasm1(SB) + MOVW $265, R12 + B runtime·callbackasm1(SB) + MOVW $266, R12 + B runtime·callbackasm1(SB) + MOVW $267, R12 + B runtime·callbackasm1(SB) + MOVW $268, R12 + B runtime·callbackasm1(SB) + MOVW $269, R12 + B runtime·callbackasm1(SB) + MOVW $270, R12 + B runtime·callbackasm1(SB) + MOVW $271, R12 + B runtime·callbackasm1(SB) + MOVW $272, R12 + B runtime·callbackasm1(SB) + MOVW $273, R12 + B runtime·callbackasm1(SB) + MOVW $274, R12 + B runtime·callbackasm1(SB) + MOVW $275, R12 + B runtime·callbackasm1(SB) + MOVW $276, R12 + B runtime·callbackasm1(SB) + MOVW $277, R12 + B runtime·callbackasm1(SB) + MOVW $278, R12 + B runtime·callbackasm1(SB) + MOVW $279, R12 + B runtime·callbackasm1(SB) + MOVW $280, R12 + B runtime·callbackasm1(SB) + MOVW $281, R12 + B runtime·callbackasm1(SB) + MOVW $282, R12 + B runtime·callbackasm1(SB) + MOVW $283, R12 + B runtime·callbackasm1(SB) + MOVW $284, R12 + B runtime·callbackasm1(SB) + MOVW $285, R12 + B runtime·callbackasm1(SB) + MOVW $286, R12 + B runtime·callbackasm1(SB) + MOVW $287, R12 + B runtime·callbackasm1(SB) + MOVW $288, R12 + B runtime·callbackasm1(SB) + MOVW $289, R12 + B runtime·callbackasm1(SB) + MOVW $290, R12 + B runtime·callbackasm1(SB) + MOVW $291, R12 + B runtime·callbackasm1(SB) + MOVW $292, R12 + B runtime·callbackasm1(SB) + MOVW $293, R12 + B runtime·callbackasm1(SB) + MOVW $294, R12 + B runtime·callbackasm1(SB) + MOVW $295, R12 + B runtime·callbackasm1(SB) + MOVW $296, R12 + B runtime·callbackasm1(SB) + MOVW $297, R12 + B runtime·callbackasm1(SB) + MOVW $298, R12 + B runtime·callbackasm1(SB) + MOVW $299, R12 + B runtime·callbackasm1(SB) + MOVW $300, R12 + B runtime·callbackasm1(SB) + MOVW $301, R12 + B runtime·callbackasm1(SB) + MOVW $302, R12 + B runtime·callbackasm1(SB) + MOVW $303, R12 + B 
runtime·callbackasm1(SB) + MOVW $304, R12 + B runtime·callbackasm1(SB) + MOVW $305, R12 + B runtime·callbackasm1(SB) + MOVW $306, R12 + B runtime·callbackasm1(SB) + MOVW $307, R12 + B runtime·callbackasm1(SB) + MOVW $308, R12 + B runtime·callbackasm1(SB) + MOVW $309, R12 + B runtime·callbackasm1(SB) + MOVW $310, R12 + B runtime·callbackasm1(SB) + MOVW $311, R12 + B runtime·callbackasm1(SB) + MOVW $312, R12 + B runtime·callbackasm1(SB) + MOVW $313, R12 + B runtime·callbackasm1(SB) + MOVW $314, R12 + B runtime·callbackasm1(SB) + MOVW $315, R12 + B runtime·callbackasm1(SB) + MOVW $316, R12 + B runtime·callbackasm1(SB) + MOVW $317, R12 + B runtime·callbackasm1(SB) + MOVW $318, R12 + B runtime·callbackasm1(SB) + MOVW $319, R12 + B runtime·callbackasm1(SB) + MOVW $320, R12 + B runtime·callbackasm1(SB) + MOVW $321, R12 + B runtime·callbackasm1(SB) + MOVW $322, R12 + B runtime·callbackasm1(SB) + MOVW $323, R12 + B runtime·callbackasm1(SB) + MOVW $324, R12 + B runtime·callbackasm1(SB) + MOVW $325, R12 + B runtime·callbackasm1(SB) + MOVW $326, R12 + B runtime·callbackasm1(SB) + MOVW $327, R12 + B runtime·callbackasm1(SB) + MOVW $328, R12 + B runtime·callbackasm1(SB) + MOVW $329, R12 + B runtime·callbackasm1(SB) + MOVW $330, R12 + B runtime·callbackasm1(SB) + MOVW $331, R12 + B runtime·callbackasm1(SB) + MOVW $332, R12 + B runtime·callbackasm1(SB) + MOVW $333, R12 + B runtime·callbackasm1(SB) + MOVW $334, R12 + B runtime·callbackasm1(SB) + MOVW $335, R12 + B runtime·callbackasm1(SB) + MOVW $336, R12 + B runtime·callbackasm1(SB) + MOVW $337, R12 + B runtime·callbackasm1(SB) + MOVW $338, R12 + B runtime·callbackasm1(SB) + MOVW $339, R12 + B runtime·callbackasm1(SB) + MOVW $340, R12 + B runtime·callbackasm1(SB) + MOVW $341, R12 + B runtime·callbackasm1(SB) + MOVW $342, R12 + B runtime·callbackasm1(SB) + MOVW $343, R12 + B runtime·callbackasm1(SB) + MOVW $344, R12 + B runtime·callbackasm1(SB) + MOVW $345, R12 + B runtime·callbackasm1(SB) + MOVW $346, R12 + B 
runtime·callbackasm1(SB) + MOVW $347, R12 + B runtime·callbackasm1(SB) + MOVW $348, R12 + B runtime·callbackasm1(SB) + MOVW $349, R12 + B runtime·callbackasm1(SB) + MOVW $350, R12 + B runtime·callbackasm1(SB) + MOVW $351, R12 + B runtime·callbackasm1(SB) + MOVW $352, R12 + B runtime·callbackasm1(SB) + MOVW $353, R12 + B runtime·callbackasm1(SB) + MOVW $354, R12 + B runtime·callbackasm1(SB) + MOVW $355, R12 + B runtime·callbackasm1(SB) + MOVW $356, R12 + B runtime·callbackasm1(SB) + MOVW $357, R12 + B runtime·callbackasm1(SB) + MOVW $358, R12 + B runtime·callbackasm1(SB) + MOVW $359, R12 + B runtime·callbackasm1(SB) + MOVW $360, R12 + B runtime·callbackasm1(SB) + MOVW $361, R12 + B runtime·callbackasm1(SB) + MOVW $362, R12 + B runtime·callbackasm1(SB) + MOVW $363, R12 + B runtime·callbackasm1(SB) + MOVW $364, R12 + B runtime·callbackasm1(SB) + MOVW $365, R12 + B runtime·callbackasm1(SB) + MOVW $366, R12 + B runtime·callbackasm1(SB) + MOVW $367, R12 + B runtime·callbackasm1(SB) + MOVW $368, R12 + B runtime·callbackasm1(SB) + MOVW $369, R12 + B runtime·callbackasm1(SB) + MOVW $370, R12 + B runtime·callbackasm1(SB) + MOVW $371, R12 + B runtime·callbackasm1(SB) + MOVW $372, R12 + B runtime·callbackasm1(SB) + MOVW $373, R12 + B runtime·callbackasm1(SB) + MOVW $374, R12 + B runtime·callbackasm1(SB) + MOVW $375, R12 + B runtime·callbackasm1(SB) + MOVW $376, R12 + B runtime·callbackasm1(SB) + MOVW $377, R12 + B runtime·callbackasm1(SB) + MOVW $378, R12 + B runtime·callbackasm1(SB) + MOVW $379, R12 + B runtime·callbackasm1(SB) + MOVW $380, R12 + B runtime·callbackasm1(SB) + MOVW $381, R12 + B runtime·callbackasm1(SB) + MOVW $382, R12 + B runtime·callbackasm1(SB) + MOVW $383, R12 + B runtime·callbackasm1(SB) + MOVW $384, R12 + B runtime·callbackasm1(SB) + MOVW $385, R12 + B runtime·callbackasm1(SB) + MOVW $386, R12 + B runtime·callbackasm1(SB) + MOVW $387, R12 + B runtime·callbackasm1(SB) + MOVW $388, R12 + B runtime·callbackasm1(SB) + MOVW $389, R12 + B 
runtime·callbackasm1(SB) + MOVW $390, R12 + B runtime·callbackasm1(SB) + MOVW $391, R12 + B runtime·callbackasm1(SB) + MOVW $392, R12 + B runtime·callbackasm1(SB) + MOVW $393, R12 + B runtime·callbackasm1(SB) + MOVW $394, R12 + B runtime·callbackasm1(SB) + MOVW $395, R12 + B runtime·callbackasm1(SB) + MOVW $396, R12 + B runtime·callbackasm1(SB) + MOVW $397, R12 + B runtime·callbackasm1(SB) + MOVW $398, R12 + B runtime·callbackasm1(SB) + MOVW $399, R12 + B runtime·callbackasm1(SB) + MOVW $400, R12 + B runtime·callbackasm1(SB) + MOVW $401, R12 + B runtime·callbackasm1(SB) + MOVW $402, R12 + B runtime·callbackasm1(SB) + MOVW $403, R12 + B runtime·callbackasm1(SB) + MOVW $404, R12 + B runtime·callbackasm1(SB) + MOVW $405, R12 + B runtime·callbackasm1(SB) + MOVW $406, R12 + B runtime·callbackasm1(SB) + MOVW $407, R12 + B runtime·callbackasm1(SB) + MOVW $408, R12 + B runtime·callbackasm1(SB) + MOVW $409, R12 + B runtime·callbackasm1(SB) + MOVW $410, R12 + B runtime·callbackasm1(SB) + MOVW $411, R12 + B runtime·callbackasm1(SB) + MOVW $412, R12 + B runtime·callbackasm1(SB) + MOVW $413, R12 + B runtime·callbackasm1(SB) + MOVW $414, R12 + B runtime·callbackasm1(SB) + MOVW $415, R12 + B runtime·callbackasm1(SB) + MOVW $416, R12 + B runtime·callbackasm1(SB) + MOVW $417, R12 + B runtime·callbackasm1(SB) + MOVW $418, R12 + B runtime·callbackasm1(SB) + MOVW $419, R12 + B runtime·callbackasm1(SB) + MOVW $420, R12 + B runtime·callbackasm1(SB) + MOVW $421, R12 + B runtime·callbackasm1(SB) + MOVW $422, R12 + B runtime·callbackasm1(SB) + MOVW $423, R12 + B runtime·callbackasm1(SB) + MOVW $424, R12 + B runtime·callbackasm1(SB) + MOVW $425, R12 + B runtime·callbackasm1(SB) + MOVW $426, R12 + B runtime·callbackasm1(SB) + MOVW $427, R12 + B runtime·callbackasm1(SB) + MOVW $428, R12 + B runtime·callbackasm1(SB) + MOVW $429, R12 + B runtime·callbackasm1(SB) + MOVW $430, R12 + B runtime·callbackasm1(SB) + MOVW $431, R12 + B runtime·callbackasm1(SB) + MOVW $432, R12 + B 
runtime·callbackasm1(SB) + MOVW $433, R12 + B runtime·callbackasm1(SB) + MOVW $434, R12 + B runtime·callbackasm1(SB) + MOVW $435, R12 + B runtime·callbackasm1(SB) + MOVW $436, R12 + B runtime·callbackasm1(SB) + MOVW $437, R12 + B runtime·callbackasm1(SB) + MOVW $438, R12 + B runtime·callbackasm1(SB) + MOVW $439, R12 + B runtime·callbackasm1(SB) + MOVW $440, R12 + B runtime·callbackasm1(SB) + MOVW $441, R12 + B runtime·callbackasm1(SB) + MOVW $442, R12 + B runtime·callbackasm1(SB) + MOVW $443, R12 + B runtime·callbackasm1(SB) + MOVW $444, R12 + B runtime·callbackasm1(SB) + MOVW $445, R12 + B runtime·callbackasm1(SB) + MOVW $446, R12 + B runtime·callbackasm1(SB) + MOVW $447, R12 + B runtime·callbackasm1(SB) + MOVW $448, R12 + B runtime·callbackasm1(SB) + MOVW $449, R12 + B runtime·callbackasm1(SB) + MOVW $450, R12 + B runtime·callbackasm1(SB) + MOVW $451, R12 + B runtime·callbackasm1(SB) + MOVW $452, R12 + B runtime·callbackasm1(SB) + MOVW $453, R12 + B runtime·callbackasm1(SB) + MOVW $454, R12 + B runtime·callbackasm1(SB) + MOVW $455, R12 + B runtime·callbackasm1(SB) + MOVW $456, R12 + B runtime·callbackasm1(SB) + MOVW $457, R12 + B runtime·callbackasm1(SB) + MOVW $458, R12 + B runtime·callbackasm1(SB) + MOVW $459, R12 + B runtime·callbackasm1(SB) + MOVW $460, R12 + B runtime·callbackasm1(SB) + MOVW $461, R12 + B runtime·callbackasm1(SB) + MOVW $462, R12 + B runtime·callbackasm1(SB) + MOVW $463, R12 + B runtime·callbackasm1(SB) + MOVW $464, R12 + B runtime·callbackasm1(SB) + MOVW $465, R12 + B runtime·callbackasm1(SB) + MOVW $466, R12 + B runtime·callbackasm1(SB) + MOVW $467, R12 + B runtime·callbackasm1(SB) + MOVW $468, R12 + B runtime·callbackasm1(SB) + MOVW $469, R12 + B runtime·callbackasm1(SB) + MOVW $470, R12 + B runtime·callbackasm1(SB) + MOVW $471, R12 + B runtime·callbackasm1(SB) + MOVW $472, R12 + B runtime·callbackasm1(SB) + MOVW $473, R12 + B runtime·callbackasm1(SB) + MOVW $474, R12 + B runtime·callbackasm1(SB) + MOVW $475, R12 + B 
runtime·callbackasm1(SB) + MOVW $476, R12 + B runtime·callbackasm1(SB) + MOVW $477, R12 + B runtime·callbackasm1(SB) + MOVW $478, R12 + B runtime·callbackasm1(SB) + MOVW $479, R12 + B runtime·callbackasm1(SB) + MOVW $480, R12 + B runtime·callbackasm1(SB) + MOVW $481, R12 + B runtime·callbackasm1(SB) + MOVW $482, R12 + B runtime·callbackasm1(SB) + MOVW $483, R12 + B runtime·callbackasm1(SB) + MOVW $484, R12 + B runtime·callbackasm1(SB) + MOVW $485, R12 + B runtime·callbackasm1(SB) + MOVW $486, R12 + B runtime·callbackasm1(SB) + MOVW $487, R12 + B runtime·callbackasm1(SB) + MOVW $488, R12 + B runtime·callbackasm1(SB) + MOVW $489, R12 + B runtime·callbackasm1(SB) + MOVW $490, R12 + B runtime·callbackasm1(SB) + MOVW $491, R12 + B runtime·callbackasm1(SB) + MOVW $492, R12 + B runtime·callbackasm1(SB) + MOVW $493, R12 + B runtime·callbackasm1(SB) + MOVW $494, R12 + B runtime·callbackasm1(SB) + MOVW $495, R12 + B runtime·callbackasm1(SB) + MOVW $496, R12 + B runtime·callbackasm1(SB) + MOVW $497, R12 + B runtime·callbackasm1(SB) + MOVW $498, R12 + B runtime·callbackasm1(SB) + MOVW $499, R12 + B runtime·callbackasm1(SB) + MOVW $500, R12 + B runtime·callbackasm1(SB) + MOVW $501, R12 + B runtime·callbackasm1(SB) + MOVW $502, R12 + B runtime·callbackasm1(SB) + MOVW $503, R12 + B runtime·callbackasm1(SB) + MOVW $504, R12 + B runtime·callbackasm1(SB) + MOVW $505, R12 + B runtime·callbackasm1(SB) + MOVW $506, R12 + B runtime·callbackasm1(SB) + MOVW $507, R12 + B runtime·callbackasm1(SB) + MOVW $508, R12 + B runtime·callbackasm1(SB) + MOVW $509, R12 + B runtime·callbackasm1(SB) + MOVW $510, R12 + B runtime·callbackasm1(SB) + MOVW $511, R12 + B runtime·callbackasm1(SB) + MOVW $512, R12 + B runtime·callbackasm1(SB) + MOVW $513, R12 + B runtime·callbackasm1(SB) + MOVW $514, R12 + B runtime·callbackasm1(SB) + MOVW $515, R12 + B runtime·callbackasm1(SB) + MOVW $516, R12 + B runtime·callbackasm1(SB) + MOVW $517, R12 + B runtime·callbackasm1(SB) + MOVW $518, R12 + B 
runtime·callbackasm1(SB) + MOVW $519, R12 + B runtime·callbackasm1(SB) + MOVW $520, R12 + B runtime·callbackasm1(SB) + MOVW $521, R12 + B runtime·callbackasm1(SB) + MOVW $522, R12 + B runtime·callbackasm1(SB) + MOVW $523, R12 + B runtime·callbackasm1(SB) + MOVW $524, R12 + B runtime·callbackasm1(SB) + MOVW $525, R12 + B runtime·callbackasm1(SB) + MOVW $526, R12 + B runtime·callbackasm1(SB) + MOVW $527, R12 + B runtime·callbackasm1(SB) + MOVW $528, R12 + B runtime·callbackasm1(SB) + MOVW $529, R12 + B runtime·callbackasm1(SB) + MOVW $530, R12 + B runtime·callbackasm1(SB) + MOVW $531, R12 + B runtime·callbackasm1(SB) + MOVW $532, R12 + B runtime·callbackasm1(SB) + MOVW $533, R12 + B runtime·callbackasm1(SB) + MOVW $534, R12 + B runtime·callbackasm1(SB) + MOVW $535, R12 + B runtime·callbackasm1(SB) + MOVW $536, R12 + B runtime·callbackasm1(SB) + MOVW $537, R12 + B runtime·callbackasm1(SB) + MOVW $538, R12 + B runtime·callbackasm1(SB) + MOVW $539, R12 + B runtime·callbackasm1(SB) + MOVW $540, R12 + B runtime·callbackasm1(SB) + MOVW $541, R12 + B runtime·callbackasm1(SB) + MOVW $542, R12 + B runtime·callbackasm1(SB) + MOVW $543, R12 + B runtime·callbackasm1(SB) + MOVW $544, R12 + B runtime·callbackasm1(SB) + MOVW $545, R12 + B runtime·callbackasm1(SB) + MOVW $546, R12 + B runtime·callbackasm1(SB) + MOVW $547, R12 + B runtime·callbackasm1(SB) + MOVW $548, R12 + B runtime·callbackasm1(SB) + MOVW $549, R12 + B runtime·callbackasm1(SB) + MOVW $550, R12 + B runtime·callbackasm1(SB) + MOVW $551, R12 + B runtime·callbackasm1(SB) + MOVW $552, R12 + B runtime·callbackasm1(SB) + MOVW $553, R12 + B runtime·callbackasm1(SB) + MOVW $554, R12 + B runtime·callbackasm1(SB) + MOVW $555, R12 + B runtime·callbackasm1(SB) + MOVW $556, R12 + B runtime·callbackasm1(SB) + MOVW $557, R12 + B runtime·callbackasm1(SB) + MOVW $558, R12 + B runtime·callbackasm1(SB) + MOVW $559, R12 + B runtime·callbackasm1(SB) + MOVW $560, R12 + B runtime·callbackasm1(SB) + MOVW $561, R12 + B 
runtime·callbackasm1(SB) + MOVW $562, R12 + B runtime·callbackasm1(SB) + MOVW $563, R12 + B runtime·callbackasm1(SB) + MOVW $564, R12 + B runtime·callbackasm1(SB) + MOVW $565, R12 + B runtime·callbackasm1(SB) + MOVW $566, R12 + B runtime·callbackasm1(SB) + MOVW $567, R12 + B runtime·callbackasm1(SB) + MOVW $568, R12 + B runtime·callbackasm1(SB) + MOVW $569, R12 + B runtime·callbackasm1(SB) + MOVW $570, R12 + B runtime·callbackasm1(SB) + MOVW $571, R12 + B runtime·callbackasm1(SB) + MOVW $572, R12 + B runtime·callbackasm1(SB) + MOVW $573, R12 + B runtime·callbackasm1(SB) + MOVW $574, R12 + B runtime·callbackasm1(SB) + MOVW $575, R12 + B runtime·callbackasm1(SB) + MOVW $576, R12 + B runtime·callbackasm1(SB) + MOVW $577, R12 + B runtime·callbackasm1(SB) + MOVW $578, R12 + B runtime·callbackasm1(SB) + MOVW $579, R12 + B runtime·callbackasm1(SB) + MOVW $580, R12 + B runtime·callbackasm1(SB) + MOVW $581, R12 + B runtime·callbackasm1(SB) + MOVW $582, R12 + B runtime·callbackasm1(SB) + MOVW $583, R12 + B runtime·callbackasm1(SB) + MOVW $584, R12 + B runtime·callbackasm1(SB) + MOVW $585, R12 + B runtime·callbackasm1(SB) + MOVW $586, R12 + B runtime·callbackasm1(SB) + MOVW $587, R12 + B runtime·callbackasm1(SB) + MOVW $588, R12 + B runtime·callbackasm1(SB) + MOVW $589, R12 + B runtime·callbackasm1(SB) + MOVW $590, R12 + B runtime·callbackasm1(SB) + MOVW $591, R12 + B runtime·callbackasm1(SB) + MOVW $592, R12 + B runtime·callbackasm1(SB) + MOVW $593, R12 + B runtime·callbackasm1(SB) + MOVW $594, R12 + B runtime·callbackasm1(SB) + MOVW $595, R12 + B runtime·callbackasm1(SB) + MOVW $596, R12 + B runtime·callbackasm1(SB) + MOVW $597, R12 + B runtime·callbackasm1(SB) + MOVW $598, R12 + B runtime·callbackasm1(SB) + MOVW $599, R12 + B runtime·callbackasm1(SB) + MOVW $600, R12 + B runtime·callbackasm1(SB) + MOVW $601, R12 + B runtime·callbackasm1(SB) + MOVW $602, R12 + B runtime·callbackasm1(SB) + MOVW $603, R12 + B runtime·callbackasm1(SB) + MOVW $604, R12 + B 
runtime·callbackasm1(SB) + MOVW $605, R12 + B runtime·callbackasm1(SB) + MOVW $606, R12 + B runtime·callbackasm1(SB) + MOVW $607, R12 + B runtime·callbackasm1(SB) + MOVW $608, R12 + B runtime·callbackasm1(SB) + MOVW $609, R12 + B runtime·callbackasm1(SB) + MOVW $610, R12 + B runtime·callbackasm1(SB) + MOVW $611, R12 + B runtime·callbackasm1(SB) + MOVW $612, R12 + B runtime·callbackasm1(SB) + MOVW $613, R12 + B runtime·callbackasm1(SB) + MOVW $614, R12 + B runtime·callbackasm1(SB) + MOVW $615, R12 + B runtime·callbackasm1(SB) + MOVW $616, R12 + B runtime·callbackasm1(SB) + MOVW $617, R12 + B runtime·callbackasm1(SB) + MOVW $618, R12 + B runtime·callbackasm1(SB) + MOVW $619, R12 + B runtime·callbackasm1(SB) + MOVW $620, R12 + B runtime·callbackasm1(SB) + MOVW $621, R12 + B runtime·callbackasm1(SB) + MOVW $622, R12 + B runtime·callbackasm1(SB) + MOVW $623, R12 + B runtime·callbackasm1(SB) + MOVW $624, R12 + B runtime·callbackasm1(SB) + MOVW $625, R12 + B runtime·callbackasm1(SB) + MOVW $626, R12 + B runtime·callbackasm1(SB) + MOVW $627, R12 + B runtime·callbackasm1(SB) + MOVW $628, R12 + B runtime·callbackasm1(SB) + MOVW $629, R12 + B runtime·callbackasm1(SB) + MOVW $630, R12 + B runtime·callbackasm1(SB) + MOVW $631, R12 + B runtime·callbackasm1(SB) + MOVW $632, R12 + B runtime·callbackasm1(SB) + MOVW $633, R12 + B runtime·callbackasm1(SB) + MOVW $634, R12 + B runtime·callbackasm1(SB) + MOVW $635, R12 + B runtime·callbackasm1(SB) + MOVW $636, R12 + B runtime·callbackasm1(SB) + MOVW $637, R12 + B runtime·callbackasm1(SB) + MOVW $638, R12 + B runtime·callbackasm1(SB) + MOVW $639, R12 + B runtime·callbackasm1(SB) + MOVW $640, R12 + B runtime·callbackasm1(SB) + MOVW $641, R12 + B runtime·callbackasm1(SB) + MOVW $642, R12 + B runtime·callbackasm1(SB) + MOVW $643, R12 + B runtime·callbackasm1(SB) + MOVW $644, R12 + B runtime·callbackasm1(SB) + MOVW $645, R12 + B runtime·callbackasm1(SB) + MOVW $646, R12 + B runtime·callbackasm1(SB) + MOVW $647, R12 + B 
runtime·callbackasm1(SB) + MOVW $648, R12 + B runtime·callbackasm1(SB) + MOVW $649, R12 + B runtime·callbackasm1(SB) + MOVW $650, R12 + B runtime·callbackasm1(SB) + MOVW $651, R12 + B runtime·callbackasm1(SB) + MOVW $652, R12 + B runtime·callbackasm1(SB) + MOVW $653, R12 + B runtime·callbackasm1(SB) + MOVW $654, R12 + B runtime·callbackasm1(SB) + MOVW $655, R12 + B runtime·callbackasm1(SB) + MOVW $656, R12 + B runtime·callbackasm1(SB) + MOVW $657, R12 + B runtime·callbackasm1(SB) + MOVW $658, R12 + B runtime·callbackasm1(SB) + MOVW $659, R12 + B runtime·callbackasm1(SB) + MOVW $660, R12 + B runtime·callbackasm1(SB) + MOVW $661, R12 + B runtime·callbackasm1(SB) + MOVW $662, R12 + B runtime·callbackasm1(SB) + MOVW $663, R12 + B runtime·callbackasm1(SB) + MOVW $664, R12 + B runtime·callbackasm1(SB) + MOVW $665, R12 + B runtime·callbackasm1(SB) + MOVW $666, R12 + B runtime·callbackasm1(SB) + MOVW $667, R12 + B runtime·callbackasm1(SB) + MOVW $668, R12 + B runtime·callbackasm1(SB) + MOVW $669, R12 + B runtime·callbackasm1(SB) + MOVW $670, R12 + B runtime·callbackasm1(SB) + MOVW $671, R12 + B runtime·callbackasm1(SB) + MOVW $672, R12 + B runtime·callbackasm1(SB) + MOVW $673, R12 + B runtime·callbackasm1(SB) + MOVW $674, R12 + B runtime·callbackasm1(SB) + MOVW $675, R12 + B runtime·callbackasm1(SB) + MOVW $676, R12 + B runtime·callbackasm1(SB) + MOVW $677, R12 + B runtime·callbackasm1(SB) + MOVW $678, R12 + B runtime·callbackasm1(SB) + MOVW $679, R12 + B runtime·callbackasm1(SB) + MOVW $680, R12 + B runtime·callbackasm1(SB) + MOVW $681, R12 + B runtime·callbackasm1(SB) + MOVW $682, R12 + B runtime·callbackasm1(SB) + MOVW $683, R12 + B runtime·callbackasm1(SB) + MOVW $684, R12 + B runtime·callbackasm1(SB) + MOVW $685, R12 + B runtime·callbackasm1(SB) + MOVW $686, R12 + B runtime·callbackasm1(SB) + MOVW $687, R12 + B runtime·callbackasm1(SB) + MOVW $688, R12 + B runtime·callbackasm1(SB) + MOVW $689, R12 + B runtime·callbackasm1(SB) + MOVW $690, R12 + B 
runtime·callbackasm1(SB) + MOVW $691, R12 + B runtime·callbackasm1(SB) + MOVW $692, R12 + B runtime·callbackasm1(SB) + MOVW $693, R12 + B runtime·callbackasm1(SB) + MOVW $694, R12 + B runtime·callbackasm1(SB) + MOVW $695, R12 + B runtime·callbackasm1(SB) + MOVW $696, R12 + B runtime·callbackasm1(SB) + MOVW $697, R12 + B runtime·callbackasm1(SB) + MOVW $698, R12 + B runtime·callbackasm1(SB) + MOVW $699, R12 + B runtime·callbackasm1(SB) + MOVW $700, R12 + B runtime·callbackasm1(SB) + MOVW $701, R12 + B runtime·callbackasm1(SB) + MOVW $702, R12 + B runtime·callbackasm1(SB) + MOVW $703, R12 + B runtime·callbackasm1(SB) + MOVW $704, R12 + B runtime·callbackasm1(SB) + MOVW $705, R12 + B runtime·callbackasm1(SB) + MOVW $706, R12 + B runtime·callbackasm1(SB) + MOVW $707, R12 + B runtime·callbackasm1(SB) + MOVW $708, R12 + B runtime·callbackasm1(SB) + MOVW $709, R12 + B runtime·callbackasm1(SB) + MOVW $710, R12 + B runtime·callbackasm1(SB) + MOVW $711, R12 + B runtime·callbackasm1(SB) + MOVW $712, R12 + B runtime·callbackasm1(SB) + MOVW $713, R12 + B runtime·callbackasm1(SB) + MOVW $714, R12 + B runtime·callbackasm1(SB) + MOVW $715, R12 + B runtime·callbackasm1(SB) + MOVW $716, R12 + B runtime·callbackasm1(SB) + MOVW $717, R12 + B runtime·callbackasm1(SB) + MOVW $718, R12 + B runtime·callbackasm1(SB) + MOVW $719, R12 + B runtime·callbackasm1(SB) + MOVW $720, R12 + B runtime·callbackasm1(SB) + MOVW $721, R12 + B runtime·callbackasm1(SB) + MOVW $722, R12 + B runtime·callbackasm1(SB) + MOVW $723, R12 + B runtime·callbackasm1(SB) + MOVW $724, R12 + B runtime·callbackasm1(SB) + MOVW $725, R12 + B runtime·callbackasm1(SB) + MOVW $726, R12 + B runtime·callbackasm1(SB) + MOVW $727, R12 + B runtime·callbackasm1(SB) + MOVW $728, R12 + B runtime·callbackasm1(SB) + MOVW $729, R12 + B runtime·callbackasm1(SB) + MOVW $730, R12 + B runtime·callbackasm1(SB) + MOVW $731, R12 + B runtime·callbackasm1(SB) + MOVW $732, R12 + B runtime·callbackasm1(SB) + MOVW $733, R12 + B 
runtime·callbackasm1(SB) + MOVW $734, R12 + B runtime·callbackasm1(SB) + MOVW $735, R12 + B runtime·callbackasm1(SB) + MOVW $736, R12 + B runtime·callbackasm1(SB) + MOVW $737, R12 + B runtime·callbackasm1(SB) + MOVW $738, R12 + B runtime·callbackasm1(SB) + MOVW $739, R12 + B runtime·callbackasm1(SB) + MOVW $740, R12 + B runtime·callbackasm1(SB) + MOVW $741, R12 + B runtime·callbackasm1(SB) + MOVW $742, R12 + B runtime·callbackasm1(SB) + MOVW $743, R12 + B runtime·callbackasm1(SB) + MOVW $744, R12 + B runtime·callbackasm1(SB) + MOVW $745, R12 + B runtime·callbackasm1(SB) + MOVW $746, R12 + B runtime·callbackasm1(SB) + MOVW $747, R12 + B runtime·callbackasm1(SB) + MOVW $748, R12 + B runtime·callbackasm1(SB) + MOVW $749, R12 + B runtime·callbackasm1(SB) + MOVW $750, R12 + B runtime·callbackasm1(SB) + MOVW $751, R12 + B runtime·callbackasm1(SB) + MOVW $752, R12 + B runtime·callbackasm1(SB) + MOVW $753, R12 + B runtime·callbackasm1(SB) + MOVW $754, R12 + B runtime·callbackasm1(SB) + MOVW $755, R12 + B runtime·callbackasm1(SB) + MOVW $756, R12 + B runtime·callbackasm1(SB) + MOVW $757, R12 + B runtime·callbackasm1(SB) + MOVW $758, R12 + B runtime·callbackasm1(SB) + MOVW $759, R12 + B runtime·callbackasm1(SB) + MOVW $760, R12 + B runtime·callbackasm1(SB) + MOVW $761, R12 + B runtime·callbackasm1(SB) + MOVW $762, R12 + B runtime·callbackasm1(SB) + MOVW $763, R12 + B runtime·callbackasm1(SB) + MOVW $764, R12 + B runtime·callbackasm1(SB) + MOVW $765, R12 + B runtime·callbackasm1(SB) + MOVW $766, R12 + B runtime·callbackasm1(SB) + MOVW $767, R12 + B runtime·callbackasm1(SB) + MOVW $768, R12 + B runtime·callbackasm1(SB) + MOVW $769, R12 + B runtime·callbackasm1(SB) + MOVW $770, R12 + B runtime·callbackasm1(SB) + MOVW $771, R12 + B runtime·callbackasm1(SB) + MOVW $772, R12 + B runtime·callbackasm1(SB) + MOVW $773, R12 + B runtime·callbackasm1(SB) + MOVW $774, R12 + B runtime·callbackasm1(SB) + MOVW $775, R12 + B runtime·callbackasm1(SB) + MOVW $776, R12 + B 
runtime·callbackasm1(SB) + MOVW $777, R12 + B runtime·callbackasm1(SB) + MOVW $778, R12 + B runtime·callbackasm1(SB) + MOVW $779, R12 + B runtime·callbackasm1(SB) + MOVW $780, R12 + B runtime·callbackasm1(SB) + MOVW $781, R12 + B runtime·callbackasm1(SB) + MOVW $782, R12 + B runtime·callbackasm1(SB) + MOVW $783, R12 + B runtime·callbackasm1(SB) + MOVW $784, R12 + B runtime·callbackasm1(SB) + MOVW $785, R12 + B runtime·callbackasm1(SB) + MOVW $786, R12 + B runtime·callbackasm1(SB) + MOVW $787, R12 + B runtime·callbackasm1(SB) + MOVW $788, R12 + B runtime·callbackasm1(SB) + MOVW $789, R12 + B runtime·callbackasm1(SB) + MOVW $790, R12 + B runtime·callbackasm1(SB) + MOVW $791, R12 + B runtime·callbackasm1(SB) + MOVW $792, R12 + B runtime·callbackasm1(SB) + MOVW $793, R12 + B runtime·callbackasm1(SB) + MOVW $794, R12 + B runtime·callbackasm1(SB) + MOVW $795, R12 + B runtime·callbackasm1(SB) + MOVW $796, R12 + B runtime·callbackasm1(SB) + MOVW $797, R12 + B runtime·callbackasm1(SB) + MOVW $798, R12 + B runtime·callbackasm1(SB) + MOVW $799, R12 + B runtime·callbackasm1(SB) + MOVW $800, R12 + B runtime·callbackasm1(SB) + MOVW $801, R12 + B runtime·callbackasm1(SB) + MOVW $802, R12 + B runtime·callbackasm1(SB) + MOVW $803, R12 + B runtime·callbackasm1(SB) + MOVW $804, R12 + B runtime·callbackasm1(SB) + MOVW $805, R12 + B runtime·callbackasm1(SB) + MOVW $806, R12 + B runtime·callbackasm1(SB) + MOVW $807, R12 + B runtime·callbackasm1(SB) + MOVW $808, R12 + B runtime·callbackasm1(SB) + MOVW $809, R12 + B runtime·callbackasm1(SB) + MOVW $810, R12 + B runtime·callbackasm1(SB) + MOVW $811, R12 + B runtime·callbackasm1(SB) + MOVW $812, R12 + B runtime·callbackasm1(SB) + MOVW $813, R12 + B runtime·callbackasm1(SB) + MOVW $814, R12 + B runtime·callbackasm1(SB) + MOVW $815, R12 + B runtime·callbackasm1(SB) + MOVW $816, R12 + B runtime·callbackasm1(SB) + MOVW $817, R12 + B runtime·callbackasm1(SB) + MOVW $818, R12 + B runtime·callbackasm1(SB) + MOVW $819, R12 + B 
runtime·callbackasm1(SB) + MOVW $820, R12 + B runtime·callbackasm1(SB) + MOVW $821, R12 + B runtime·callbackasm1(SB) + MOVW $822, R12 + B runtime·callbackasm1(SB) + MOVW $823, R12 + B runtime·callbackasm1(SB) + MOVW $824, R12 + B runtime·callbackasm1(SB) + MOVW $825, R12 + B runtime·callbackasm1(SB) + MOVW $826, R12 + B runtime·callbackasm1(SB) + MOVW $827, R12 + B runtime·callbackasm1(SB) + MOVW $828, R12 + B runtime·callbackasm1(SB) + MOVW $829, R12 + B runtime·callbackasm1(SB) + MOVW $830, R12 + B runtime·callbackasm1(SB) + MOVW $831, R12 + B runtime·callbackasm1(SB) + MOVW $832, R12 + B runtime·callbackasm1(SB) + MOVW $833, R12 + B runtime·callbackasm1(SB) + MOVW $834, R12 + B runtime·callbackasm1(SB) + MOVW $835, R12 + B runtime·callbackasm1(SB) + MOVW $836, R12 + B runtime·callbackasm1(SB) + MOVW $837, R12 + B runtime·callbackasm1(SB) + MOVW $838, R12 + B runtime·callbackasm1(SB) + MOVW $839, R12 + B runtime·callbackasm1(SB) + MOVW $840, R12 + B runtime·callbackasm1(SB) + MOVW $841, R12 + B runtime·callbackasm1(SB) + MOVW $842, R12 + B runtime·callbackasm1(SB) + MOVW $843, R12 + B runtime·callbackasm1(SB) + MOVW $844, R12 + B runtime·callbackasm1(SB) + MOVW $845, R12 + B runtime·callbackasm1(SB) + MOVW $846, R12 + B runtime·callbackasm1(SB) + MOVW $847, R12 + B runtime·callbackasm1(SB) + MOVW $848, R12 + B runtime·callbackasm1(SB) + MOVW $849, R12 + B runtime·callbackasm1(SB) + MOVW $850, R12 + B runtime·callbackasm1(SB) + MOVW $851, R12 + B runtime·callbackasm1(SB) + MOVW $852, R12 + B runtime·callbackasm1(SB) + MOVW $853, R12 + B runtime·callbackasm1(SB) + MOVW $854, R12 + B runtime·callbackasm1(SB) + MOVW $855, R12 + B runtime·callbackasm1(SB) + MOVW $856, R12 + B runtime·callbackasm1(SB) + MOVW $857, R12 + B runtime·callbackasm1(SB) + MOVW $858, R12 + B runtime·callbackasm1(SB) + MOVW $859, R12 + B runtime·callbackasm1(SB) + MOVW $860, R12 + B runtime·callbackasm1(SB) + MOVW $861, R12 + B runtime·callbackasm1(SB) + MOVW $862, R12 + B 
runtime·callbackasm1(SB) + MOVW $863, R12 + B runtime·callbackasm1(SB) + MOVW $864, R12 + B runtime·callbackasm1(SB) + MOVW $865, R12 + B runtime·callbackasm1(SB) + MOVW $866, R12 + B runtime·callbackasm1(SB) + MOVW $867, R12 + B runtime·callbackasm1(SB) + MOVW $868, R12 + B runtime·callbackasm1(SB) + MOVW $869, R12 + B runtime·callbackasm1(SB) + MOVW $870, R12 + B runtime·callbackasm1(SB) + MOVW $871, R12 + B runtime·callbackasm1(SB) + MOVW $872, R12 + B runtime·callbackasm1(SB) + MOVW $873, R12 + B runtime·callbackasm1(SB) + MOVW $874, R12 + B runtime·callbackasm1(SB) + MOVW $875, R12 + B runtime·callbackasm1(SB) + MOVW $876, R12 + B runtime·callbackasm1(SB) + MOVW $877, R12 + B runtime·callbackasm1(SB) + MOVW $878, R12 + B runtime·callbackasm1(SB) + MOVW $879, R12 + B runtime·callbackasm1(SB) + MOVW $880, R12 + B runtime·callbackasm1(SB) + MOVW $881, R12 + B runtime·callbackasm1(SB) + MOVW $882, R12 + B runtime·callbackasm1(SB) + MOVW $883, R12 + B runtime·callbackasm1(SB) + MOVW $884, R12 + B runtime·callbackasm1(SB) + MOVW $885, R12 + B runtime·callbackasm1(SB) + MOVW $886, R12 + B runtime·callbackasm1(SB) + MOVW $887, R12 + B runtime·callbackasm1(SB) + MOVW $888, R12 + B runtime·callbackasm1(SB) + MOVW $889, R12 + B runtime·callbackasm1(SB) + MOVW $890, R12 + B runtime·callbackasm1(SB) + MOVW $891, R12 + B runtime·callbackasm1(SB) + MOVW $892, R12 + B runtime·callbackasm1(SB) + MOVW $893, R12 + B runtime·callbackasm1(SB) + MOVW $894, R12 + B runtime·callbackasm1(SB) + MOVW $895, R12 + B runtime·callbackasm1(SB) + MOVW $896, R12 + B runtime·callbackasm1(SB) + MOVW $897, R12 + B runtime·callbackasm1(SB) + MOVW $898, R12 + B runtime·callbackasm1(SB) + MOVW $899, R12 + B runtime·callbackasm1(SB) + MOVW $900, R12 + B runtime·callbackasm1(SB) + MOVW $901, R12 + B runtime·callbackasm1(SB) + MOVW $902, R12 + B runtime·callbackasm1(SB) + MOVW $903, R12 + B runtime·callbackasm1(SB) + MOVW $904, R12 + B runtime·callbackasm1(SB) + MOVW $905, R12 + B 
runtime·callbackasm1(SB) + MOVW $906, R12 + B runtime·callbackasm1(SB) + MOVW $907, R12 + B runtime·callbackasm1(SB) + MOVW $908, R12 + B runtime·callbackasm1(SB) + MOVW $909, R12 + B runtime·callbackasm1(SB) + MOVW $910, R12 + B runtime·callbackasm1(SB) + MOVW $911, R12 + B runtime·callbackasm1(SB) + MOVW $912, R12 + B runtime·callbackasm1(SB) + MOVW $913, R12 + B runtime·callbackasm1(SB) + MOVW $914, R12 + B runtime·callbackasm1(SB) + MOVW $915, R12 + B runtime·callbackasm1(SB) + MOVW $916, R12 + B runtime·callbackasm1(SB) + MOVW $917, R12 + B runtime·callbackasm1(SB) + MOVW $918, R12 + B runtime·callbackasm1(SB) + MOVW $919, R12 + B runtime·callbackasm1(SB) + MOVW $920, R12 + B runtime·callbackasm1(SB) + MOVW $921, R12 + B runtime·callbackasm1(SB) + MOVW $922, R12 + B runtime·callbackasm1(SB) + MOVW $923, R12 + B runtime·callbackasm1(SB) + MOVW $924, R12 + B runtime·callbackasm1(SB) + MOVW $925, R12 + B runtime·callbackasm1(SB) + MOVW $926, R12 + B runtime·callbackasm1(SB) + MOVW $927, R12 + B runtime·callbackasm1(SB) + MOVW $928, R12 + B runtime·callbackasm1(SB) + MOVW $929, R12 + B runtime·callbackasm1(SB) + MOVW $930, R12 + B runtime·callbackasm1(SB) + MOVW $931, R12 + B runtime·callbackasm1(SB) + MOVW $932, R12 + B runtime·callbackasm1(SB) + MOVW $933, R12 + B runtime·callbackasm1(SB) + MOVW $934, R12 + B runtime·callbackasm1(SB) + MOVW $935, R12 + B runtime·callbackasm1(SB) + MOVW $936, R12 + B runtime·callbackasm1(SB) + MOVW $937, R12 + B runtime·callbackasm1(SB) + MOVW $938, R12 + B runtime·callbackasm1(SB) + MOVW $939, R12 + B runtime·callbackasm1(SB) + MOVW $940, R12 + B runtime·callbackasm1(SB) + MOVW $941, R12 + B runtime·callbackasm1(SB) + MOVW $942, R12 + B runtime·callbackasm1(SB) + MOVW $943, R12 + B runtime·callbackasm1(SB) + MOVW $944, R12 + B runtime·callbackasm1(SB) + MOVW $945, R12 + B runtime·callbackasm1(SB) + MOVW $946, R12 + B runtime·callbackasm1(SB) + MOVW $947, R12 + B runtime·callbackasm1(SB) + MOVW $948, R12 + B 
runtime·callbackasm1(SB) + MOVW $949, R12 + B runtime·callbackasm1(SB) + MOVW $950, R12 + B runtime·callbackasm1(SB) + MOVW $951, R12 + B runtime·callbackasm1(SB) + MOVW $952, R12 + B runtime·callbackasm1(SB) + MOVW $953, R12 + B runtime·callbackasm1(SB) + MOVW $954, R12 + B runtime·callbackasm1(SB) + MOVW $955, R12 + B runtime·callbackasm1(SB) + MOVW $956, R12 + B runtime·callbackasm1(SB) + MOVW $957, R12 + B runtime·callbackasm1(SB) + MOVW $958, R12 + B runtime·callbackasm1(SB) + MOVW $959, R12 + B runtime·callbackasm1(SB) + MOVW $960, R12 + B runtime·callbackasm1(SB) + MOVW $961, R12 + B runtime·callbackasm1(SB) + MOVW $962, R12 + B runtime·callbackasm1(SB) + MOVW $963, R12 + B runtime·callbackasm1(SB) + MOVW $964, R12 + B runtime·callbackasm1(SB) + MOVW $965, R12 + B runtime·callbackasm1(SB) + MOVW $966, R12 + B runtime·callbackasm1(SB) + MOVW $967, R12 + B runtime·callbackasm1(SB) + MOVW $968, R12 + B runtime·callbackasm1(SB) + MOVW $969, R12 + B runtime·callbackasm1(SB) + MOVW $970, R12 + B runtime·callbackasm1(SB) + MOVW $971, R12 + B runtime·callbackasm1(SB) + MOVW $972, R12 + B runtime·callbackasm1(SB) + MOVW $973, R12 + B runtime·callbackasm1(SB) + MOVW $974, R12 + B runtime·callbackasm1(SB) + MOVW $975, R12 + B runtime·callbackasm1(SB) + MOVW $976, R12 + B runtime·callbackasm1(SB) + MOVW $977, R12 + B runtime·callbackasm1(SB) + MOVW $978, R12 + B runtime·callbackasm1(SB) + MOVW $979, R12 + B runtime·callbackasm1(SB) + MOVW $980, R12 + B runtime·callbackasm1(SB) + MOVW $981, R12 + B runtime·callbackasm1(SB) + MOVW $982, R12 + B runtime·callbackasm1(SB) + MOVW $983, R12 + B runtime·callbackasm1(SB) + MOVW $984, R12 + B runtime·callbackasm1(SB) + MOVW $985, R12 + B runtime·callbackasm1(SB) + MOVW $986, R12 + B runtime·callbackasm1(SB) + MOVW $987, R12 + B runtime·callbackasm1(SB) + MOVW $988, R12 + B runtime·callbackasm1(SB) + MOVW $989, R12 + B runtime·callbackasm1(SB) + MOVW $990, R12 + B runtime·callbackasm1(SB) + MOVW $991, R12 + B 
runtime·callbackasm1(SB) + MOVW $992, R12 + B runtime·callbackasm1(SB) + MOVW $993, R12 + B runtime·callbackasm1(SB) + MOVW $994, R12 + B runtime·callbackasm1(SB) + MOVW $995, R12 + B runtime·callbackasm1(SB) + MOVW $996, R12 + B runtime·callbackasm1(SB) + MOVW $997, R12 + B runtime·callbackasm1(SB) + MOVW $998, R12 + B runtime·callbackasm1(SB) + MOVW $999, R12 + B runtime·callbackasm1(SB) + MOVW $1000, R12 + B runtime·callbackasm1(SB) + MOVW $1001, R12 + B runtime·callbackasm1(SB) + MOVW $1002, R12 + B runtime·callbackasm1(SB) + MOVW $1003, R12 + B runtime·callbackasm1(SB) + MOVW $1004, R12 + B runtime·callbackasm1(SB) + MOVW $1005, R12 + B runtime·callbackasm1(SB) + MOVW $1006, R12 + B runtime·callbackasm1(SB) + MOVW $1007, R12 + B runtime·callbackasm1(SB) + MOVW $1008, R12 + B runtime·callbackasm1(SB) + MOVW $1009, R12 + B runtime·callbackasm1(SB) + MOVW $1010, R12 + B runtime·callbackasm1(SB) + MOVW $1011, R12 + B runtime·callbackasm1(SB) + MOVW $1012, R12 + B runtime·callbackasm1(SB) + MOVW $1013, R12 + B runtime·callbackasm1(SB) + MOVW $1014, R12 + B runtime·callbackasm1(SB) + MOVW $1015, R12 + B runtime·callbackasm1(SB) + MOVW $1016, R12 + B runtime·callbackasm1(SB) + MOVW $1017, R12 + B runtime·callbackasm1(SB) + MOVW $1018, R12 + B runtime·callbackasm1(SB) + MOVW $1019, R12 + B runtime·callbackasm1(SB) + MOVW $1020, R12 + B runtime·callbackasm1(SB) + MOVW $1021, R12 + B runtime·callbackasm1(SB) + MOVW $1022, R12 + B runtime·callbackasm1(SB) + MOVW $1023, R12 + B runtime·callbackasm1(SB) + MOVW $1024, R12 + B runtime·callbackasm1(SB) + MOVW $1025, R12 + B runtime·callbackasm1(SB) + MOVW $1026, R12 + B runtime·callbackasm1(SB) + MOVW $1027, R12 + B runtime·callbackasm1(SB) + MOVW $1028, R12 + B runtime·callbackasm1(SB) + MOVW $1029, R12 + B runtime·callbackasm1(SB) + MOVW $1030, R12 + B runtime·callbackasm1(SB) + MOVW $1031, R12 + B runtime·callbackasm1(SB) + MOVW $1032, R12 + B runtime·callbackasm1(SB) + MOVW $1033, R12 + B runtime·callbackasm1(SB) + MOVW 
$1034, R12 + B runtime·callbackasm1(SB) + MOVW $1035, R12 + B runtime·callbackasm1(SB) + MOVW $1036, R12 + B runtime·callbackasm1(SB) + MOVW $1037, R12 + B runtime·callbackasm1(SB) + MOVW $1038, R12 + B runtime·callbackasm1(SB) + MOVW $1039, R12 + B runtime·callbackasm1(SB) + MOVW $1040, R12 + B runtime·callbackasm1(SB) + MOVW $1041, R12 + B runtime·callbackasm1(SB) + MOVW $1042, R12 + B runtime·callbackasm1(SB) + MOVW $1043, R12 + B runtime·callbackasm1(SB) + MOVW $1044, R12 + B runtime·callbackasm1(SB) + MOVW $1045, R12 + B runtime·callbackasm1(SB) + MOVW $1046, R12 + B runtime·callbackasm1(SB) + MOVW $1047, R12 + B runtime·callbackasm1(SB) + MOVW $1048, R12 + B runtime·callbackasm1(SB) + MOVW $1049, R12 + B runtime·callbackasm1(SB) + MOVW $1050, R12 + B runtime·callbackasm1(SB) + MOVW $1051, R12 + B runtime·callbackasm1(SB) + MOVW $1052, R12 + B runtime·callbackasm1(SB) + MOVW $1053, R12 + B runtime·callbackasm1(SB) + MOVW $1054, R12 + B runtime·callbackasm1(SB) + MOVW $1055, R12 + B runtime·callbackasm1(SB) + MOVW $1056, R12 + B runtime·callbackasm1(SB) + MOVW $1057, R12 + B runtime·callbackasm1(SB) + MOVW $1058, R12 + B runtime·callbackasm1(SB) + MOVW $1059, R12 + B runtime·callbackasm1(SB) + MOVW $1060, R12 + B runtime·callbackasm1(SB) + MOVW $1061, R12 + B runtime·callbackasm1(SB) + MOVW $1062, R12 + B runtime·callbackasm1(SB) + MOVW $1063, R12 + B runtime·callbackasm1(SB) + MOVW $1064, R12 + B runtime·callbackasm1(SB) + MOVW $1065, R12 + B runtime·callbackasm1(SB) + MOVW $1066, R12 + B runtime·callbackasm1(SB) + MOVW $1067, R12 + B runtime·callbackasm1(SB) + MOVW $1068, R12 + B runtime·callbackasm1(SB) + MOVW $1069, R12 + B runtime·callbackasm1(SB) + MOVW $1070, R12 + B runtime·callbackasm1(SB) + MOVW $1071, R12 + B runtime·callbackasm1(SB) + MOVW $1072, R12 + B runtime·callbackasm1(SB) + MOVW $1073, R12 + B runtime·callbackasm1(SB) + MOVW $1074, R12 + B runtime·callbackasm1(SB) + MOVW $1075, R12 + B runtime·callbackasm1(SB) + MOVW $1076, R12 + B 
runtime·callbackasm1(SB) + MOVW $1077, R12 + B runtime·callbackasm1(SB) + MOVW $1078, R12 + B runtime·callbackasm1(SB) + MOVW $1079, R12 + B runtime·callbackasm1(SB) + MOVW $1080, R12 + B runtime·callbackasm1(SB) + MOVW $1081, R12 + B runtime·callbackasm1(SB) + MOVW $1082, R12 + B runtime·callbackasm1(SB) + MOVW $1083, R12 + B runtime·callbackasm1(SB) + MOVW $1084, R12 + B runtime·callbackasm1(SB) + MOVW $1085, R12 + B runtime·callbackasm1(SB) + MOVW $1086, R12 + B runtime·callbackasm1(SB) + MOVW $1087, R12 + B runtime·callbackasm1(SB) + MOVW $1088, R12 + B runtime·callbackasm1(SB) + MOVW $1089, R12 + B runtime·callbackasm1(SB) + MOVW $1090, R12 + B runtime·callbackasm1(SB) + MOVW $1091, R12 + B runtime·callbackasm1(SB) + MOVW $1092, R12 + B runtime·callbackasm1(SB) + MOVW $1093, R12 + B runtime·callbackasm1(SB) + MOVW $1094, R12 + B runtime·callbackasm1(SB) + MOVW $1095, R12 + B runtime·callbackasm1(SB) + MOVW $1096, R12 + B runtime·callbackasm1(SB) + MOVW $1097, R12 + B runtime·callbackasm1(SB) + MOVW $1098, R12 + B runtime·callbackasm1(SB) + MOVW $1099, R12 + B runtime·callbackasm1(SB) + MOVW $1100, R12 + B runtime·callbackasm1(SB) + MOVW $1101, R12 + B runtime·callbackasm1(SB) + MOVW $1102, R12 + B runtime·callbackasm1(SB) + MOVW $1103, R12 + B runtime·callbackasm1(SB) + MOVW $1104, R12 + B runtime·callbackasm1(SB) + MOVW $1105, R12 + B runtime·callbackasm1(SB) + MOVW $1106, R12 + B runtime·callbackasm1(SB) + MOVW $1107, R12 + B runtime·callbackasm1(SB) + MOVW $1108, R12 + B runtime·callbackasm1(SB) + MOVW $1109, R12 + B runtime·callbackasm1(SB) + MOVW $1110, R12 + B runtime·callbackasm1(SB) + MOVW $1111, R12 + B runtime·callbackasm1(SB) + MOVW $1112, R12 + B runtime·callbackasm1(SB) + MOVW $1113, R12 + B runtime·callbackasm1(SB) + MOVW $1114, R12 + B runtime·callbackasm1(SB) + MOVW $1115, R12 + B runtime·callbackasm1(SB) + MOVW $1116, R12 + B runtime·callbackasm1(SB) + MOVW $1117, R12 + B runtime·callbackasm1(SB) + MOVW $1118, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1119, R12 + B runtime·callbackasm1(SB) + MOVW $1120, R12 + B runtime·callbackasm1(SB) + MOVW $1121, R12 + B runtime·callbackasm1(SB) + MOVW $1122, R12 + B runtime·callbackasm1(SB) + MOVW $1123, R12 + B runtime·callbackasm1(SB) + MOVW $1124, R12 + B runtime·callbackasm1(SB) + MOVW $1125, R12 + B runtime·callbackasm1(SB) + MOVW $1126, R12 + B runtime·callbackasm1(SB) + MOVW $1127, R12 + B runtime·callbackasm1(SB) + MOVW $1128, R12 + B runtime·callbackasm1(SB) + MOVW $1129, R12 + B runtime·callbackasm1(SB) + MOVW $1130, R12 + B runtime·callbackasm1(SB) + MOVW $1131, R12 + B runtime·callbackasm1(SB) + MOVW $1132, R12 + B runtime·callbackasm1(SB) + MOVW $1133, R12 + B runtime·callbackasm1(SB) + MOVW $1134, R12 + B runtime·callbackasm1(SB) + MOVW $1135, R12 + B runtime·callbackasm1(SB) + MOVW $1136, R12 + B runtime·callbackasm1(SB) + MOVW $1137, R12 + B runtime·callbackasm1(SB) + MOVW $1138, R12 + B runtime·callbackasm1(SB) + MOVW $1139, R12 + B runtime·callbackasm1(SB) + MOVW $1140, R12 + B runtime·callbackasm1(SB) + MOVW $1141, R12 + B runtime·callbackasm1(SB) + MOVW $1142, R12 + B runtime·callbackasm1(SB) + MOVW $1143, R12 + B runtime·callbackasm1(SB) + MOVW $1144, R12 + B runtime·callbackasm1(SB) + MOVW $1145, R12 + B runtime·callbackasm1(SB) + MOVW $1146, R12 + B runtime·callbackasm1(SB) + MOVW $1147, R12 + B runtime·callbackasm1(SB) + MOVW $1148, R12 + B runtime·callbackasm1(SB) + MOVW $1149, R12 + B runtime·callbackasm1(SB) + MOVW $1150, R12 + B runtime·callbackasm1(SB) + MOVW $1151, R12 + B runtime·callbackasm1(SB) + MOVW $1152, R12 + B runtime·callbackasm1(SB) + MOVW $1153, R12 + B runtime·callbackasm1(SB) + MOVW $1154, R12 + B runtime·callbackasm1(SB) + MOVW $1155, R12 + B runtime·callbackasm1(SB) + MOVW $1156, R12 + B runtime·callbackasm1(SB) + MOVW $1157, R12 + B runtime·callbackasm1(SB) + MOVW $1158, R12 + B runtime·callbackasm1(SB) + MOVW $1159, R12 + B runtime·callbackasm1(SB) + MOVW $1160, R12 + B runtime·callbackasm1(SB) + MOVW $1161, R12 + B 
runtime·callbackasm1(SB) + MOVW $1162, R12 + B runtime·callbackasm1(SB) + MOVW $1163, R12 + B runtime·callbackasm1(SB) + MOVW $1164, R12 + B runtime·callbackasm1(SB) + MOVW $1165, R12 + B runtime·callbackasm1(SB) + MOVW $1166, R12 + B runtime·callbackasm1(SB) + MOVW $1167, R12 + B runtime·callbackasm1(SB) + MOVW $1168, R12 + B runtime·callbackasm1(SB) + MOVW $1169, R12 + B runtime·callbackasm1(SB) + MOVW $1170, R12 + B runtime·callbackasm1(SB) + MOVW $1171, R12 + B runtime·callbackasm1(SB) + MOVW $1172, R12 + B runtime·callbackasm1(SB) + MOVW $1173, R12 + B runtime·callbackasm1(SB) + MOVW $1174, R12 + B runtime·callbackasm1(SB) + MOVW $1175, R12 + B runtime·callbackasm1(SB) + MOVW $1176, R12 + B runtime·callbackasm1(SB) + MOVW $1177, R12 + B runtime·callbackasm1(SB) + MOVW $1178, R12 + B runtime·callbackasm1(SB) + MOVW $1179, R12 + B runtime·callbackasm1(SB) + MOVW $1180, R12 + B runtime·callbackasm1(SB) + MOVW $1181, R12 + B runtime·callbackasm1(SB) + MOVW $1182, R12 + B runtime·callbackasm1(SB) + MOVW $1183, R12 + B runtime·callbackasm1(SB) + MOVW $1184, R12 + B runtime·callbackasm1(SB) + MOVW $1185, R12 + B runtime·callbackasm1(SB) + MOVW $1186, R12 + B runtime·callbackasm1(SB) + MOVW $1187, R12 + B runtime·callbackasm1(SB) + MOVW $1188, R12 + B runtime·callbackasm1(SB) + MOVW $1189, R12 + B runtime·callbackasm1(SB) + MOVW $1190, R12 + B runtime·callbackasm1(SB) + MOVW $1191, R12 + B runtime·callbackasm1(SB) + MOVW $1192, R12 + B runtime·callbackasm1(SB) + MOVW $1193, R12 + B runtime·callbackasm1(SB) + MOVW $1194, R12 + B runtime·callbackasm1(SB) + MOVW $1195, R12 + B runtime·callbackasm1(SB) + MOVW $1196, R12 + B runtime·callbackasm1(SB) + MOVW $1197, R12 + B runtime·callbackasm1(SB) + MOVW $1198, R12 + B runtime·callbackasm1(SB) + MOVW $1199, R12 + B runtime·callbackasm1(SB) + MOVW $1200, R12 + B runtime·callbackasm1(SB) + MOVW $1201, R12 + B runtime·callbackasm1(SB) + MOVW $1202, R12 + B runtime·callbackasm1(SB) + MOVW $1203, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1204, R12 + B runtime·callbackasm1(SB) + MOVW $1205, R12 + B runtime·callbackasm1(SB) + MOVW $1206, R12 + B runtime·callbackasm1(SB) + MOVW $1207, R12 + B runtime·callbackasm1(SB) + MOVW $1208, R12 + B runtime·callbackasm1(SB) + MOVW $1209, R12 + B runtime·callbackasm1(SB) + MOVW $1210, R12 + B runtime·callbackasm1(SB) + MOVW $1211, R12 + B runtime·callbackasm1(SB) + MOVW $1212, R12 + B runtime·callbackasm1(SB) + MOVW $1213, R12 + B runtime·callbackasm1(SB) + MOVW $1214, R12 + B runtime·callbackasm1(SB) + MOVW $1215, R12 + B runtime·callbackasm1(SB) + MOVW $1216, R12 + B runtime·callbackasm1(SB) + MOVW $1217, R12 + B runtime·callbackasm1(SB) + MOVW $1218, R12 + B runtime·callbackasm1(SB) + MOVW $1219, R12 + B runtime·callbackasm1(SB) + MOVW $1220, R12 + B runtime·callbackasm1(SB) + MOVW $1221, R12 + B runtime·callbackasm1(SB) + MOVW $1222, R12 + B runtime·callbackasm1(SB) + MOVW $1223, R12 + B runtime·callbackasm1(SB) + MOVW $1224, R12 + B runtime·callbackasm1(SB) + MOVW $1225, R12 + B runtime·callbackasm1(SB) + MOVW $1226, R12 + B runtime·callbackasm1(SB) + MOVW $1227, R12 + B runtime·callbackasm1(SB) + MOVW $1228, R12 + B runtime·callbackasm1(SB) + MOVW $1229, R12 + B runtime·callbackasm1(SB) + MOVW $1230, R12 + B runtime·callbackasm1(SB) + MOVW $1231, R12 + B runtime·callbackasm1(SB) + MOVW $1232, R12 + B runtime·callbackasm1(SB) + MOVW $1233, R12 + B runtime·callbackasm1(SB) + MOVW $1234, R12 + B runtime·callbackasm1(SB) + MOVW $1235, R12 + B runtime·callbackasm1(SB) + MOVW $1236, R12 + B runtime·callbackasm1(SB) + MOVW $1237, R12 + B runtime·callbackasm1(SB) + MOVW $1238, R12 + B runtime·callbackasm1(SB) + MOVW $1239, R12 + B runtime·callbackasm1(SB) + MOVW $1240, R12 + B runtime·callbackasm1(SB) + MOVW $1241, R12 + B runtime·callbackasm1(SB) + MOVW $1242, R12 + B runtime·callbackasm1(SB) + MOVW $1243, R12 + B runtime·callbackasm1(SB) + MOVW $1244, R12 + B runtime·callbackasm1(SB) + MOVW $1245, R12 + B runtime·callbackasm1(SB) + MOVW $1246, R12 + B 
runtime·callbackasm1(SB) + MOVW $1247, R12 + B runtime·callbackasm1(SB) + MOVW $1248, R12 + B runtime·callbackasm1(SB) + MOVW $1249, R12 + B runtime·callbackasm1(SB) + MOVW $1250, R12 + B runtime·callbackasm1(SB) + MOVW $1251, R12 + B runtime·callbackasm1(SB) + MOVW $1252, R12 + B runtime·callbackasm1(SB) + MOVW $1253, R12 + B runtime·callbackasm1(SB) + MOVW $1254, R12 + B runtime·callbackasm1(SB) + MOVW $1255, R12 + B runtime·callbackasm1(SB) + MOVW $1256, R12 + B runtime·callbackasm1(SB) + MOVW $1257, R12 + B runtime·callbackasm1(SB) + MOVW $1258, R12 + B runtime·callbackasm1(SB) + MOVW $1259, R12 + B runtime·callbackasm1(SB) + MOVW $1260, R12 + B runtime·callbackasm1(SB) + MOVW $1261, R12 + B runtime·callbackasm1(SB) + MOVW $1262, R12 + B runtime·callbackasm1(SB) + MOVW $1263, R12 + B runtime·callbackasm1(SB) + MOVW $1264, R12 + B runtime·callbackasm1(SB) + MOVW $1265, R12 + B runtime·callbackasm1(SB) + MOVW $1266, R12 + B runtime·callbackasm1(SB) + MOVW $1267, R12 + B runtime·callbackasm1(SB) + MOVW $1268, R12 + B runtime·callbackasm1(SB) + MOVW $1269, R12 + B runtime·callbackasm1(SB) + MOVW $1270, R12 + B runtime·callbackasm1(SB) + MOVW $1271, R12 + B runtime·callbackasm1(SB) + MOVW $1272, R12 + B runtime·callbackasm1(SB) + MOVW $1273, R12 + B runtime·callbackasm1(SB) + MOVW $1274, R12 + B runtime·callbackasm1(SB) + MOVW $1275, R12 + B runtime·callbackasm1(SB) + MOVW $1276, R12 + B runtime·callbackasm1(SB) + MOVW $1277, R12 + B runtime·callbackasm1(SB) + MOVW $1278, R12 + B runtime·callbackasm1(SB) + MOVW $1279, R12 + B runtime·callbackasm1(SB) + MOVW $1280, R12 + B runtime·callbackasm1(SB) + MOVW $1281, R12 + B runtime·callbackasm1(SB) + MOVW $1282, R12 + B runtime·callbackasm1(SB) + MOVW $1283, R12 + B runtime·callbackasm1(SB) + MOVW $1284, R12 + B runtime·callbackasm1(SB) + MOVW $1285, R12 + B runtime·callbackasm1(SB) + MOVW $1286, R12 + B runtime·callbackasm1(SB) + MOVW $1287, R12 + B runtime·callbackasm1(SB) + MOVW $1288, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1289, R12 + B runtime·callbackasm1(SB) + MOVW $1290, R12 + B runtime·callbackasm1(SB) + MOVW $1291, R12 + B runtime·callbackasm1(SB) + MOVW $1292, R12 + B runtime·callbackasm1(SB) + MOVW $1293, R12 + B runtime·callbackasm1(SB) + MOVW $1294, R12 + B runtime·callbackasm1(SB) + MOVW $1295, R12 + B runtime·callbackasm1(SB) + MOVW $1296, R12 + B runtime·callbackasm1(SB) + MOVW $1297, R12 + B runtime·callbackasm1(SB) + MOVW $1298, R12 + B runtime·callbackasm1(SB) + MOVW $1299, R12 + B runtime·callbackasm1(SB) + MOVW $1300, R12 + B runtime·callbackasm1(SB) + MOVW $1301, R12 + B runtime·callbackasm1(SB) + MOVW $1302, R12 + B runtime·callbackasm1(SB) + MOVW $1303, R12 + B runtime·callbackasm1(SB) + MOVW $1304, R12 + B runtime·callbackasm1(SB) + MOVW $1305, R12 + B runtime·callbackasm1(SB) + MOVW $1306, R12 + B runtime·callbackasm1(SB) + MOVW $1307, R12 + B runtime·callbackasm1(SB) + MOVW $1308, R12 + B runtime·callbackasm1(SB) + MOVW $1309, R12 + B runtime·callbackasm1(SB) + MOVW $1310, R12 + B runtime·callbackasm1(SB) + MOVW $1311, R12 + B runtime·callbackasm1(SB) + MOVW $1312, R12 + B runtime·callbackasm1(SB) + MOVW $1313, R12 + B runtime·callbackasm1(SB) + MOVW $1314, R12 + B runtime·callbackasm1(SB) + MOVW $1315, R12 + B runtime·callbackasm1(SB) + MOVW $1316, R12 + B runtime·callbackasm1(SB) + MOVW $1317, R12 + B runtime·callbackasm1(SB) + MOVW $1318, R12 + B runtime·callbackasm1(SB) + MOVW $1319, R12 + B runtime·callbackasm1(SB) + MOVW $1320, R12 + B runtime·callbackasm1(SB) + MOVW $1321, R12 + B runtime·callbackasm1(SB) + MOVW $1322, R12 + B runtime·callbackasm1(SB) + MOVW $1323, R12 + B runtime·callbackasm1(SB) + MOVW $1324, R12 + B runtime·callbackasm1(SB) + MOVW $1325, R12 + B runtime·callbackasm1(SB) + MOVW $1326, R12 + B runtime·callbackasm1(SB) + MOVW $1327, R12 + B runtime·callbackasm1(SB) + MOVW $1328, R12 + B runtime·callbackasm1(SB) + MOVW $1329, R12 + B runtime·callbackasm1(SB) + MOVW $1330, R12 + B runtime·callbackasm1(SB) + MOVW $1331, R12 + B 
runtime·callbackasm1(SB) + MOVW $1332, R12 + B runtime·callbackasm1(SB) + MOVW $1333, R12 + B runtime·callbackasm1(SB) + MOVW $1334, R12 + B runtime·callbackasm1(SB) + MOVW $1335, R12 + B runtime·callbackasm1(SB) + MOVW $1336, R12 + B runtime·callbackasm1(SB) + MOVW $1337, R12 + B runtime·callbackasm1(SB) + MOVW $1338, R12 + B runtime·callbackasm1(SB) + MOVW $1339, R12 + B runtime·callbackasm1(SB) + MOVW $1340, R12 + B runtime·callbackasm1(SB) + MOVW $1341, R12 + B runtime·callbackasm1(SB) + MOVW $1342, R12 + B runtime·callbackasm1(SB) + MOVW $1343, R12 + B runtime·callbackasm1(SB) + MOVW $1344, R12 + B runtime·callbackasm1(SB) + MOVW $1345, R12 + B runtime·callbackasm1(SB) + MOVW $1346, R12 + B runtime·callbackasm1(SB) + MOVW $1347, R12 + B runtime·callbackasm1(SB) + MOVW $1348, R12 + B runtime·callbackasm1(SB) + MOVW $1349, R12 + B runtime·callbackasm1(SB) + MOVW $1350, R12 + B runtime·callbackasm1(SB) + MOVW $1351, R12 + B runtime·callbackasm1(SB) + MOVW $1352, R12 + B runtime·callbackasm1(SB) + MOVW $1353, R12 + B runtime·callbackasm1(SB) + MOVW $1354, R12 + B runtime·callbackasm1(SB) + MOVW $1355, R12 + B runtime·callbackasm1(SB) + MOVW $1356, R12 + B runtime·callbackasm1(SB) + MOVW $1357, R12 + B runtime·callbackasm1(SB) + MOVW $1358, R12 + B runtime·callbackasm1(SB) + MOVW $1359, R12 + B runtime·callbackasm1(SB) + MOVW $1360, R12 + B runtime·callbackasm1(SB) + MOVW $1361, R12 + B runtime·callbackasm1(SB) + MOVW $1362, R12 + B runtime·callbackasm1(SB) + MOVW $1363, R12 + B runtime·callbackasm1(SB) + MOVW $1364, R12 + B runtime·callbackasm1(SB) + MOVW $1365, R12 + B runtime·callbackasm1(SB) + MOVW $1366, R12 + B runtime·callbackasm1(SB) + MOVW $1367, R12 + B runtime·callbackasm1(SB) + MOVW $1368, R12 + B runtime·callbackasm1(SB) + MOVW $1369, R12 + B runtime·callbackasm1(SB) + MOVW $1370, R12 + B runtime·callbackasm1(SB) + MOVW $1371, R12 + B runtime·callbackasm1(SB) + MOVW $1372, R12 + B runtime·callbackasm1(SB) + MOVW $1373, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1374, R12 + B runtime·callbackasm1(SB) + MOVW $1375, R12 + B runtime·callbackasm1(SB) + MOVW $1376, R12 + B runtime·callbackasm1(SB) + MOVW $1377, R12 + B runtime·callbackasm1(SB) + MOVW $1378, R12 + B runtime·callbackasm1(SB) + MOVW $1379, R12 + B runtime·callbackasm1(SB) + MOVW $1380, R12 + B runtime·callbackasm1(SB) + MOVW $1381, R12 + B runtime·callbackasm1(SB) + MOVW $1382, R12 + B runtime·callbackasm1(SB) + MOVW $1383, R12 + B runtime·callbackasm1(SB) + MOVW $1384, R12 + B runtime·callbackasm1(SB) + MOVW $1385, R12 + B runtime·callbackasm1(SB) + MOVW $1386, R12 + B runtime·callbackasm1(SB) + MOVW $1387, R12 + B runtime·callbackasm1(SB) + MOVW $1388, R12 + B runtime·callbackasm1(SB) + MOVW $1389, R12 + B runtime·callbackasm1(SB) + MOVW $1390, R12 + B runtime·callbackasm1(SB) + MOVW $1391, R12 + B runtime·callbackasm1(SB) + MOVW $1392, R12 + B runtime·callbackasm1(SB) + MOVW $1393, R12 + B runtime·callbackasm1(SB) + MOVW $1394, R12 + B runtime·callbackasm1(SB) + MOVW $1395, R12 + B runtime·callbackasm1(SB) + MOVW $1396, R12 + B runtime·callbackasm1(SB) + MOVW $1397, R12 + B runtime·callbackasm1(SB) + MOVW $1398, R12 + B runtime·callbackasm1(SB) + MOVW $1399, R12 + B runtime·callbackasm1(SB) + MOVW $1400, R12 + B runtime·callbackasm1(SB) + MOVW $1401, R12 + B runtime·callbackasm1(SB) + MOVW $1402, R12 + B runtime·callbackasm1(SB) + MOVW $1403, R12 + B runtime·callbackasm1(SB) + MOVW $1404, R12 + B runtime·callbackasm1(SB) + MOVW $1405, R12 + B runtime·callbackasm1(SB) + MOVW $1406, R12 + B runtime·callbackasm1(SB) + MOVW $1407, R12 + B runtime·callbackasm1(SB) + MOVW $1408, R12 + B runtime·callbackasm1(SB) + MOVW $1409, R12 + B runtime·callbackasm1(SB) + MOVW $1410, R12 + B runtime·callbackasm1(SB) + MOVW $1411, R12 + B runtime·callbackasm1(SB) + MOVW $1412, R12 + B runtime·callbackasm1(SB) + MOVW $1413, R12 + B runtime·callbackasm1(SB) + MOVW $1414, R12 + B runtime·callbackasm1(SB) + MOVW $1415, R12 + B runtime·callbackasm1(SB) + MOVW $1416, R12 + B 
runtime·callbackasm1(SB) + MOVW $1417, R12 + B runtime·callbackasm1(SB) + MOVW $1418, R12 + B runtime·callbackasm1(SB) + MOVW $1419, R12 + B runtime·callbackasm1(SB) + MOVW $1420, R12 + B runtime·callbackasm1(SB) + MOVW $1421, R12 + B runtime·callbackasm1(SB) + MOVW $1422, R12 + B runtime·callbackasm1(SB) + MOVW $1423, R12 + B runtime·callbackasm1(SB) + MOVW $1424, R12 + B runtime·callbackasm1(SB) + MOVW $1425, R12 + B runtime·callbackasm1(SB) + MOVW $1426, R12 + B runtime·callbackasm1(SB) + MOVW $1427, R12 + B runtime·callbackasm1(SB) + MOVW $1428, R12 + B runtime·callbackasm1(SB) + MOVW $1429, R12 + B runtime·callbackasm1(SB) + MOVW $1430, R12 + B runtime·callbackasm1(SB) + MOVW $1431, R12 + B runtime·callbackasm1(SB) + MOVW $1432, R12 + B runtime·callbackasm1(SB) + MOVW $1433, R12 + B runtime·callbackasm1(SB) + MOVW $1434, R12 + B runtime·callbackasm1(SB) + MOVW $1435, R12 + B runtime·callbackasm1(SB) + MOVW $1436, R12 + B runtime·callbackasm1(SB) + MOVW $1437, R12 + B runtime·callbackasm1(SB) + MOVW $1438, R12 + B runtime·callbackasm1(SB) + MOVW $1439, R12 + B runtime·callbackasm1(SB) + MOVW $1440, R12 + B runtime·callbackasm1(SB) + MOVW $1441, R12 + B runtime·callbackasm1(SB) + MOVW $1442, R12 + B runtime·callbackasm1(SB) + MOVW $1443, R12 + B runtime·callbackasm1(SB) + MOVW $1444, R12 + B runtime·callbackasm1(SB) + MOVW $1445, R12 + B runtime·callbackasm1(SB) + MOVW $1446, R12 + B runtime·callbackasm1(SB) + MOVW $1447, R12 + B runtime·callbackasm1(SB) + MOVW $1448, R12 + B runtime·callbackasm1(SB) + MOVW $1449, R12 + B runtime·callbackasm1(SB) + MOVW $1450, R12 + B runtime·callbackasm1(SB) + MOVW $1451, R12 + B runtime·callbackasm1(SB) + MOVW $1452, R12 + B runtime·callbackasm1(SB) + MOVW $1453, R12 + B runtime·callbackasm1(SB) + MOVW $1454, R12 + B runtime·callbackasm1(SB) + MOVW $1455, R12 + B runtime·callbackasm1(SB) + MOVW $1456, R12 + B runtime·callbackasm1(SB) + MOVW $1457, R12 + B runtime·callbackasm1(SB) + MOVW $1458, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1459, R12 + B runtime·callbackasm1(SB) + MOVW $1460, R12 + B runtime·callbackasm1(SB) + MOVW $1461, R12 + B runtime·callbackasm1(SB) + MOVW $1462, R12 + B runtime·callbackasm1(SB) + MOVW $1463, R12 + B runtime·callbackasm1(SB) + MOVW $1464, R12 + B runtime·callbackasm1(SB) + MOVW $1465, R12 + B runtime·callbackasm1(SB) + MOVW $1466, R12 + B runtime·callbackasm1(SB) + MOVW $1467, R12 + B runtime·callbackasm1(SB) + MOVW $1468, R12 + B runtime·callbackasm1(SB) + MOVW $1469, R12 + B runtime·callbackasm1(SB) + MOVW $1470, R12 + B runtime·callbackasm1(SB) + MOVW $1471, R12 + B runtime·callbackasm1(SB) + MOVW $1472, R12 + B runtime·callbackasm1(SB) + MOVW $1473, R12 + B runtime·callbackasm1(SB) + MOVW $1474, R12 + B runtime·callbackasm1(SB) + MOVW $1475, R12 + B runtime·callbackasm1(SB) + MOVW $1476, R12 + B runtime·callbackasm1(SB) + MOVW $1477, R12 + B runtime·callbackasm1(SB) + MOVW $1478, R12 + B runtime·callbackasm1(SB) + MOVW $1479, R12 + B runtime·callbackasm1(SB) + MOVW $1480, R12 + B runtime·callbackasm1(SB) + MOVW $1481, R12 + B runtime·callbackasm1(SB) + MOVW $1482, R12 + B runtime·callbackasm1(SB) + MOVW $1483, R12 + B runtime·callbackasm1(SB) + MOVW $1484, R12 + B runtime·callbackasm1(SB) + MOVW $1485, R12 + B runtime·callbackasm1(SB) + MOVW $1486, R12 + B runtime·callbackasm1(SB) + MOVW $1487, R12 + B runtime·callbackasm1(SB) + MOVW $1488, R12 + B runtime·callbackasm1(SB) + MOVW $1489, R12 + B runtime·callbackasm1(SB) + MOVW $1490, R12 + B runtime·callbackasm1(SB) + MOVW $1491, R12 + B runtime·callbackasm1(SB) + MOVW $1492, R12 + B runtime·callbackasm1(SB) + MOVW $1493, R12 + B runtime·callbackasm1(SB) + MOVW $1494, R12 + B runtime·callbackasm1(SB) + MOVW $1495, R12 + B runtime·callbackasm1(SB) + MOVW $1496, R12 + B runtime·callbackasm1(SB) + MOVW $1497, R12 + B runtime·callbackasm1(SB) + MOVW $1498, R12 + B runtime·callbackasm1(SB) + MOVW $1499, R12 + B runtime·callbackasm1(SB) + MOVW $1500, R12 + B runtime·callbackasm1(SB) + MOVW $1501, R12 + B 
runtime·callbackasm1(SB) + MOVW $1502, R12 + B runtime·callbackasm1(SB) + MOVW $1503, R12 + B runtime·callbackasm1(SB) + MOVW $1504, R12 + B runtime·callbackasm1(SB) + MOVW $1505, R12 + B runtime·callbackasm1(SB) + MOVW $1506, R12 + B runtime·callbackasm1(SB) + MOVW $1507, R12 + B runtime·callbackasm1(SB) + MOVW $1508, R12 + B runtime·callbackasm1(SB) + MOVW $1509, R12 + B runtime·callbackasm1(SB) + MOVW $1510, R12 + B runtime·callbackasm1(SB) + MOVW $1511, R12 + B runtime·callbackasm1(SB) + MOVW $1512, R12 + B runtime·callbackasm1(SB) + MOVW $1513, R12 + B runtime·callbackasm1(SB) + MOVW $1514, R12 + B runtime·callbackasm1(SB) + MOVW $1515, R12 + B runtime·callbackasm1(SB) + MOVW $1516, R12 + B runtime·callbackasm1(SB) + MOVW $1517, R12 + B runtime·callbackasm1(SB) + MOVW $1518, R12 + B runtime·callbackasm1(SB) + MOVW $1519, R12 + B runtime·callbackasm1(SB) + MOVW $1520, R12 + B runtime·callbackasm1(SB) + MOVW $1521, R12 + B runtime·callbackasm1(SB) + MOVW $1522, R12 + B runtime·callbackasm1(SB) + MOVW $1523, R12 + B runtime·callbackasm1(SB) + MOVW $1524, R12 + B runtime·callbackasm1(SB) + MOVW $1525, R12 + B runtime·callbackasm1(SB) + MOVW $1526, R12 + B runtime·callbackasm1(SB) + MOVW $1527, R12 + B runtime·callbackasm1(SB) + MOVW $1528, R12 + B runtime·callbackasm1(SB) + MOVW $1529, R12 + B runtime·callbackasm1(SB) + MOVW $1530, R12 + B runtime·callbackasm1(SB) + MOVW $1531, R12 + B runtime·callbackasm1(SB) + MOVW $1532, R12 + B runtime·callbackasm1(SB) + MOVW $1533, R12 + B runtime·callbackasm1(SB) + MOVW $1534, R12 + B runtime·callbackasm1(SB) + MOVW $1535, R12 + B runtime·callbackasm1(SB) + MOVW $1536, R12 + B runtime·callbackasm1(SB) + MOVW $1537, R12 + B runtime·callbackasm1(SB) + MOVW $1538, R12 + B runtime·callbackasm1(SB) + MOVW $1539, R12 + B runtime·callbackasm1(SB) + MOVW $1540, R12 + B runtime·callbackasm1(SB) + MOVW $1541, R12 + B runtime·callbackasm1(SB) + MOVW $1542, R12 + B runtime·callbackasm1(SB) + MOVW $1543, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1544, R12 + B runtime·callbackasm1(SB) + MOVW $1545, R12 + B runtime·callbackasm1(SB) + MOVW $1546, R12 + B runtime·callbackasm1(SB) + MOVW $1547, R12 + B runtime·callbackasm1(SB) + MOVW $1548, R12 + B runtime·callbackasm1(SB) + MOVW $1549, R12 + B runtime·callbackasm1(SB) + MOVW $1550, R12 + B runtime·callbackasm1(SB) + MOVW $1551, R12 + B runtime·callbackasm1(SB) + MOVW $1552, R12 + B runtime·callbackasm1(SB) + MOVW $1553, R12 + B runtime·callbackasm1(SB) + MOVW $1554, R12 + B runtime·callbackasm1(SB) + MOVW $1555, R12 + B runtime·callbackasm1(SB) + MOVW $1556, R12 + B runtime·callbackasm1(SB) + MOVW $1557, R12 + B runtime·callbackasm1(SB) + MOVW $1558, R12 + B runtime·callbackasm1(SB) + MOVW $1559, R12 + B runtime·callbackasm1(SB) + MOVW $1560, R12 + B runtime·callbackasm1(SB) + MOVW $1561, R12 + B runtime·callbackasm1(SB) + MOVW $1562, R12 + B runtime·callbackasm1(SB) + MOVW $1563, R12 + B runtime·callbackasm1(SB) + MOVW $1564, R12 + B runtime·callbackasm1(SB) + MOVW $1565, R12 + B runtime·callbackasm1(SB) + MOVW $1566, R12 + B runtime·callbackasm1(SB) + MOVW $1567, R12 + B runtime·callbackasm1(SB) + MOVW $1568, R12 + B runtime·callbackasm1(SB) + MOVW $1569, R12 + B runtime·callbackasm1(SB) + MOVW $1570, R12 + B runtime·callbackasm1(SB) + MOVW $1571, R12 + B runtime·callbackasm1(SB) + MOVW $1572, R12 + B runtime·callbackasm1(SB) + MOVW $1573, R12 + B runtime·callbackasm1(SB) + MOVW $1574, R12 + B runtime·callbackasm1(SB) + MOVW $1575, R12 + B runtime·callbackasm1(SB) + MOVW $1576, R12 + B runtime·callbackasm1(SB) + MOVW $1577, R12 + B runtime·callbackasm1(SB) + MOVW $1578, R12 + B runtime·callbackasm1(SB) + MOVW $1579, R12 + B runtime·callbackasm1(SB) + MOVW $1580, R12 + B runtime·callbackasm1(SB) + MOVW $1581, R12 + B runtime·callbackasm1(SB) + MOVW $1582, R12 + B runtime·callbackasm1(SB) + MOVW $1583, R12 + B runtime·callbackasm1(SB) + MOVW $1584, R12 + B runtime·callbackasm1(SB) + MOVW $1585, R12 + B runtime·callbackasm1(SB) + MOVW $1586, R12 + B 
runtime·callbackasm1(SB) + MOVW $1587, R12 + B runtime·callbackasm1(SB) + MOVW $1588, R12 + B runtime·callbackasm1(SB) + MOVW $1589, R12 + B runtime·callbackasm1(SB) + MOVW $1590, R12 + B runtime·callbackasm1(SB) + MOVW $1591, R12 + B runtime·callbackasm1(SB) + MOVW $1592, R12 + B runtime·callbackasm1(SB) + MOVW $1593, R12 + B runtime·callbackasm1(SB) + MOVW $1594, R12 + B runtime·callbackasm1(SB) + MOVW $1595, R12 + B runtime·callbackasm1(SB) + MOVW $1596, R12 + B runtime·callbackasm1(SB) + MOVW $1597, R12 + B runtime·callbackasm1(SB) + MOVW $1598, R12 + B runtime·callbackasm1(SB) + MOVW $1599, R12 + B runtime·callbackasm1(SB) + MOVW $1600, R12 + B runtime·callbackasm1(SB) + MOVW $1601, R12 + B runtime·callbackasm1(SB) + MOVW $1602, R12 + B runtime·callbackasm1(SB) + MOVW $1603, R12 + B runtime·callbackasm1(SB) + MOVW $1604, R12 + B runtime·callbackasm1(SB) + MOVW $1605, R12 + B runtime·callbackasm1(SB) + MOVW $1606, R12 + B runtime·callbackasm1(SB) + MOVW $1607, R12 + B runtime·callbackasm1(SB) + MOVW $1608, R12 + B runtime·callbackasm1(SB) + MOVW $1609, R12 + B runtime·callbackasm1(SB) + MOVW $1610, R12 + B runtime·callbackasm1(SB) + MOVW $1611, R12 + B runtime·callbackasm1(SB) + MOVW $1612, R12 + B runtime·callbackasm1(SB) + MOVW $1613, R12 + B runtime·callbackasm1(SB) + MOVW $1614, R12 + B runtime·callbackasm1(SB) + MOVW $1615, R12 + B runtime·callbackasm1(SB) + MOVW $1616, R12 + B runtime·callbackasm1(SB) + MOVW $1617, R12 + B runtime·callbackasm1(SB) + MOVW $1618, R12 + B runtime·callbackasm1(SB) + MOVW $1619, R12 + B runtime·callbackasm1(SB) + MOVW $1620, R12 + B runtime·callbackasm1(SB) + MOVW $1621, R12 + B runtime·callbackasm1(SB) + MOVW $1622, R12 + B runtime·callbackasm1(SB) + MOVW $1623, R12 + B runtime·callbackasm1(SB) + MOVW $1624, R12 + B runtime·callbackasm1(SB) + MOVW $1625, R12 + B runtime·callbackasm1(SB) + MOVW $1626, R12 + B runtime·callbackasm1(SB) + MOVW $1627, R12 + B runtime·callbackasm1(SB) + MOVW $1628, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1629, R12 + B runtime·callbackasm1(SB) + MOVW $1630, R12 + B runtime·callbackasm1(SB) + MOVW $1631, R12 + B runtime·callbackasm1(SB) + MOVW $1632, R12 + B runtime·callbackasm1(SB) + MOVW $1633, R12 + B runtime·callbackasm1(SB) + MOVW $1634, R12 + B runtime·callbackasm1(SB) + MOVW $1635, R12 + B runtime·callbackasm1(SB) + MOVW $1636, R12 + B runtime·callbackasm1(SB) + MOVW $1637, R12 + B runtime·callbackasm1(SB) + MOVW $1638, R12 + B runtime·callbackasm1(SB) + MOVW $1639, R12 + B runtime·callbackasm1(SB) + MOVW $1640, R12 + B runtime·callbackasm1(SB) + MOVW $1641, R12 + B runtime·callbackasm1(SB) + MOVW $1642, R12 + B runtime·callbackasm1(SB) + MOVW $1643, R12 + B runtime·callbackasm1(SB) + MOVW $1644, R12 + B runtime·callbackasm1(SB) + MOVW $1645, R12 + B runtime·callbackasm1(SB) + MOVW $1646, R12 + B runtime·callbackasm1(SB) + MOVW $1647, R12 + B runtime·callbackasm1(SB) + MOVW $1648, R12 + B runtime·callbackasm1(SB) + MOVW $1649, R12 + B runtime·callbackasm1(SB) + MOVW $1650, R12 + B runtime·callbackasm1(SB) + MOVW $1651, R12 + B runtime·callbackasm1(SB) + MOVW $1652, R12 + B runtime·callbackasm1(SB) + MOVW $1653, R12 + B runtime·callbackasm1(SB) + MOVW $1654, R12 + B runtime·callbackasm1(SB) + MOVW $1655, R12 + B runtime·callbackasm1(SB) + MOVW $1656, R12 + B runtime·callbackasm1(SB) + MOVW $1657, R12 + B runtime·callbackasm1(SB) + MOVW $1658, R12 + B runtime·callbackasm1(SB) + MOVW $1659, R12 + B runtime·callbackasm1(SB) + MOVW $1660, R12 + B runtime·callbackasm1(SB) + MOVW $1661, R12 + B runtime·callbackasm1(SB) + MOVW $1662, R12 + B runtime·callbackasm1(SB) + MOVW $1663, R12 + B runtime·callbackasm1(SB) + MOVW $1664, R12 + B runtime·callbackasm1(SB) + MOVW $1665, R12 + B runtime·callbackasm1(SB) + MOVW $1666, R12 + B runtime·callbackasm1(SB) + MOVW $1667, R12 + B runtime·callbackasm1(SB) + MOVW $1668, R12 + B runtime·callbackasm1(SB) + MOVW $1669, R12 + B runtime·callbackasm1(SB) + MOVW $1670, R12 + B runtime·callbackasm1(SB) + MOVW $1671, R12 + B 
runtime·callbackasm1(SB) + MOVW $1672, R12 + B runtime·callbackasm1(SB) + MOVW $1673, R12 + B runtime·callbackasm1(SB) + MOVW $1674, R12 + B runtime·callbackasm1(SB) + MOVW $1675, R12 + B runtime·callbackasm1(SB) + MOVW $1676, R12 + B runtime·callbackasm1(SB) + MOVW $1677, R12 + B runtime·callbackasm1(SB) + MOVW $1678, R12 + B runtime·callbackasm1(SB) + MOVW $1679, R12 + B runtime·callbackasm1(SB) + MOVW $1680, R12 + B runtime·callbackasm1(SB) + MOVW $1681, R12 + B runtime·callbackasm1(SB) + MOVW $1682, R12 + B runtime·callbackasm1(SB) + MOVW $1683, R12 + B runtime·callbackasm1(SB) + MOVW $1684, R12 + B runtime·callbackasm1(SB) + MOVW $1685, R12 + B runtime·callbackasm1(SB) + MOVW $1686, R12 + B runtime·callbackasm1(SB) + MOVW $1687, R12 + B runtime·callbackasm1(SB) + MOVW $1688, R12 + B runtime·callbackasm1(SB) + MOVW $1689, R12 + B runtime·callbackasm1(SB) + MOVW $1690, R12 + B runtime·callbackasm1(SB) + MOVW $1691, R12 + B runtime·callbackasm1(SB) + MOVW $1692, R12 + B runtime·callbackasm1(SB) + MOVW $1693, R12 + B runtime·callbackasm1(SB) + MOVW $1694, R12 + B runtime·callbackasm1(SB) + MOVW $1695, R12 + B runtime·callbackasm1(SB) + MOVW $1696, R12 + B runtime·callbackasm1(SB) + MOVW $1697, R12 + B runtime·callbackasm1(SB) + MOVW $1698, R12 + B runtime·callbackasm1(SB) + MOVW $1699, R12 + B runtime·callbackasm1(SB) + MOVW $1700, R12 + B runtime·callbackasm1(SB) + MOVW $1701, R12 + B runtime·callbackasm1(SB) + MOVW $1702, R12 + B runtime·callbackasm1(SB) + MOVW $1703, R12 + B runtime·callbackasm1(SB) + MOVW $1704, R12 + B runtime·callbackasm1(SB) + MOVW $1705, R12 + B runtime·callbackasm1(SB) + MOVW $1706, R12 + B runtime·callbackasm1(SB) + MOVW $1707, R12 + B runtime·callbackasm1(SB) + MOVW $1708, R12 + B runtime·callbackasm1(SB) + MOVW $1709, R12 + B runtime·callbackasm1(SB) + MOVW $1710, R12 + B runtime·callbackasm1(SB) + MOVW $1711, R12 + B runtime·callbackasm1(SB) + MOVW $1712, R12 + B runtime·callbackasm1(SB) + MOVW $1713, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1714, R12 + B runtime·callbackasm1(SB) + MOVW $1715, R12 + B runtime·callbackasm1(SB) + MOVW $1716, R12 + B runtime·callbackasm1(SB) + MOVW $1717, R12 + B runtime·callbackasm1(SB) + MOVW $1718, R12 + B runtime·callbackasm1(SB) + MOVW $1719, R12 + B runtime·callbackasm1(SB) + MOVW $1720, R12 + B runtime·callbackasm1(SB) + MOVW $1721, R12 + B runtime·callbackasm1(SB) + MOVW $1722, R12 + B runtime·callbackasm1(SB) + MOVW $1723, R12 + B runtime·callbackasm1(SB) + MOVW $1724, R12 + B runtime·callbackasm1(SB) + MOVW $1725, R12 + B runtime·callbackasm1(SB) + MOVW $1726, R12 + B runtime·callbackasm1(SB) + MOVW $1727, R12 + B runtime·callbackasm1(SB) + MOVW $1728, R12 + B runtime·callbackasm1(SB) + MOVW $1729, R12 + B runtime·callbackasm1(SB) + MOVW $1730, R12 + B runtime·callbackasm1(SB) + MOVW $1731, R12 + B runtime·callbackasm1(SB) + MOVW $1732, R12 + B runtime·callbackasm1(SB) + MOVW $1733, R12 + B runtime·callbackasm1(SB) + MOVW $1734, R12 + B runtime·callbackasm1(SB) + MOVW $1735, R12 + B runtime·callbackasm1(SB) + MOVW $1736, R12 + B runtime·callbackasm1(SB) + MOVW $1737, R12 + B runtime·callbackasm1(SB) + MOVW $1738, R12 + B runtime·callbackasm1(SB) + MOVW $1739, R12 + B runtime·callbackasm1(SB) + MOVW $1740, R12 + B runtime·callbackasm1(SB) + MOVW $1741, R12 + B runtime·callbackasm1(SB) + MOVW $1742, R12 + B runtime·callbackasm1(SB) + MOVW $1743, R12 + B runtime·callbackasm1(SB) + MOVW $1744, R12 + B runtime·callbackasm1(SB) + MOVW $1745, R12 + B runtime·callbackasm1(SB) + MOVW $1746, R12 + B runtime·callbackasm1(SB) + MOVW $1747, R12 + B runtime·callbackasm1(SB) + MOVW $1748, R12 + B runtime·callbackasm1(SB) + MOVW $1749, R12 + B runtime·callbackasm1(SB) + MOVW $1750, R12 + B runtime·callbackasm1(SB) + MOVW $1751, R12 + B runtime·callbackasm1(SB) + MOVW $1752, R12 + B runtime·callbackasm1(SB) + MOVW $1753, R12 + B runtime·callbackasm1(SB) + MOVW $1754, R12 + B runtime·callbackasm1(SB) + MOVW $1755, R12 + B runtime·callbackasm1(SB) + MOVW $1756, R12 + B 
runtime·callbackasm1(SB) + MOVW $1757, R12 + B runtime·callbackasm1(SB) + MOVW $1758, R12 + B runtime·callbackasm1(SB) + MOVW $1759, R12 + B runtime·callbackasm1(SB) + MOVW $1760, R12 + B runtime·callbackasm1(SB) + MOVW $1761, R12 + B runtime·callbackasm1(SB) + MOVW $1762, R12 + B runtime·callbackasm1(SB) + MOVW $1763, R12 + B runtime·callbackasm1(SB) + MOVW $1764, R12 + B runtime·callbackasm1(SB) + MOVW $1765, R12 + B runtime·callbackasm1(SB) + MOVW $1766, R12 + B runtime·callbackasm1(SB) + MOVW $1767, R12 + B runtime·callbackasm1(SB) + MOVW $1768, R12 + B runtime·callbackasm1(SB) + MOVW $1769, R12 + B runtime·callbackasm1(SB) + MOVW $1770, R12 + B runtime·callbackasm1(SB) + MOVW $1771, R12 + B runtime·callbackasm1(SB) + MOVW $1772, R12 + B runtime·callbackasm1(SB) + MOVW $1773, R12 + B runtime·callbackasm1(SB) + MOVW $1774, R12 + B runtime·callbackasm1(SB) + MOVW $1775, R12 + B runtime·callbackasm1(SB) + MOVW $1776, R12 + B runtime·callbackasm1(SB) + MOVW $1777, R12 + B runtime·callbackasm1(SB) + MOVW $1778, R12 + B runtime·callbackasm1(SB) + MOVW $1779, R12 + B runtime·callbackasm1(SB) + MOVW $1780, R12 + B runtime·callbackasm1(SB) + MOVW $1781, R12 + B runtime·callbackasm1(SB) + MOVW $1782, R12 + B runtime·callbackasm1(SB) + MOVW $1783, R12 + B runtime·callbackasm1(SB) + MOVW $1784, R12 + B runtime·callbackasm1(SB) + MOVW $1785, R12 + B runtime·callbackasm1(SB) + MOVW $1786, R12 + B runtime·callbackasm1(SB) + MOVW $1787, R12 + B runtime·callbackasm1(SB) + MOVW $1788, R12 + B runtime·callbackasm1(SB) + MOVW $1789, R12 + B runtime·callbackasm1(SB) + MOVW $1790, R12 + B runtime·callbackasm1(SB) + MOVW $1791, R12 + B runtime·callbackasm1(SB) + MOVW $1792, R12 + B runtime·callbackasm1(SB) + MOVW $1793, R12 + B runtime·callbackasm1(SB) + MOVW $1794, R12 + B runtime·callbackasm1(SB) + MOVW $1795, R12 + B runtime·callbackasm1(SB) + MOVW $1796, R12 + B runtime·callbackasm1(SB) + MOVW $1797, R12 + B runtime·callbackasm1(SB) + MOVW $1798, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1799, R12 + B runtime·callbackasm1(SB) + MOVW $1800, R12 + B runtime·callbackasm1(SB) + MOVW $1801, R12 + B runtime·callbackasm1(SB) + MOVW $1802, R12 + B runtime·callbackasm1(SB) + MOVW $1803, R12 + B runtime·callbackasm1(SB) + MOVW $1804, R12 + B runtime·callbackasm1(SB) + MOVW $1805, R12 + B runtime·callbackasm1(SB) + MOVW $1806, R12 + B runtime·callbackasm1(SB) + MOVW $1807, R12 + B runtime·callbackasm1(SB) + MOVW $1808, R12 + B runtime·callbackasm1(SB) + MOVW $1809, R12 + B runtime·callbackasm1(SB) + MOVW $1810, R12 + B runtime·callbackasm1(SB) + MOVW $1811, R12 + B runtime·callbackasm1(SB) + MOVW $1812, R12 + B runtime·callbackasm1(SB) + MOVW $1813, R12 + B runtime·callbackasm1(SB) + MOVW $1814, R12 + B runtime·callbackasm1(SB) + MOVW $1815, R12 + B runtime·callbackasm1(SB) + MOVW $1816, R12 + B runtime·callbackasm1(SB) + MOVW $1817, R12 + B runtime·callbackasm1(SB) + MOVW $1818, R12 + B runtime·callbackasm1(SB) + MOVW $1819, R12 + B runtime·callbackasm1(SB) + MOVW $1820, R12 + B runtime·callbackasm1(SB) + MOVW $1821, R12 + B runtime·callbackasm1(SB) + MOVW $1822, R12 + B runtime·callbackasm1(SB) + MOVW $1823, R12 + B runtime·callbackasm1(SB) + MOVW $1824, R12 + B runtime·callbackasm1(SB) + MOVW $1825, R12 + B runtime·callbackasm1(SB) + MOVW $1826, R12 + B runtime·callbackasm1(SB) + MOVW $1827, R12 + B runtime·callbackasm1(SB) + MOVW $1828, R12 + B runtime·callbackasm1(SB) + MOVW $1829, R12 + B runtime·callbackasm1(SB) + MOVW $1830, R12 + B runtime·callbackasm1(SB) + MOVW $1831, R12 + B runtime·callbackasm1(SB) + MOVW $1832, R12 + B runtime·callbackasm1(SB) + MOVW $1833, R12 + B runtime·callbackasm1(SB) + MOVW $1834, R12 + B runtime·callbackasm1(SB) + MOVW $1835, R12 + B runtime·callbackasm1(SB) + MOVW $1836, R12 + B runtime·callbackasm1(SB) + MOVW $1837, R12 + B runtime·callbackasm1(SB) + MOVW $1838, R12 + B runtime·callbackasm1(SB) + MOVW $1839, R12 + B runtime·callbackasm1(SB) + MOVW $1840, R12 + B runtime·callbackasm1(SB) + MOVW $1841, R12 + B 
runtime·callbackasm1(SB) + MOVW $1842, R12 + B runtime·callbackasm1(SB) + MOVW $1843, R12 + B runtime·callbackasm1(SB) + MOVW $1844, R12 + B runtime·callbackasm1(SB) + MOVW $1845, R12 + B runtime·callbackasm1(SB) + MOVW $1846, R12 + B runtime·callbackasm1(SB) + MOVW $1847, R12 + B runtime·callbackasm1(SB) + MOVW $1848, R12 + B runtime·callbackasm1(SB) + MOVW $1849, R12 + B runtime·callbackasm1(SB) + MOVW $1850, R12 + B runtime·callbackasm1(SB) + MOVW $1851, R12 + B runtime·callbackasm1(SB) + MOVW $1852, R12 + B runtime·callbackasm1(SB) + MOVW $1853, R12 + B runtime·callbackasm1(SB) + MOVW $1854, R12 + B runtime·callbackasm1(SB) + MOVW $1855, R12 + B runtime·callbackasm1(SB) + MOVW $1856, R12 + B runtime·callbackasm1(SB) + MOVW $1857, R12 + B runtime·callbackasm1(SB) + MOVW $1858, R12 + B runtime·callbackasm1(SB) + MOVW $1859, R12 + B runtime·callbackasm1(SB) + MOVW $1860, R12 + B runtime·callbackasm1(SB) + MOVW $1861, R12 + B runtime·callbackasm1(SB) + MOVW $1862, R12 + B runtime·callbackasm1(SB) + MOVW $1863, R12 + B runtime·callbackasm1(SB) + MOVW $1864, R12 + B runtime·callbackasm1(SB) + MOVW $1865, R12 + B runtime·callbackasm1(SB) + MOVW $1866, R12 + B runtime·callbackasm1(SB) + MOVW $1867, R12 + B runtime·callbackasm1(SB) + MOVW $1868, R12 + B runtime·callbackasm1(SB) + MOVW $1869, R12 + B runtime·callbackasm1(SB) + MOVW $1870, R12 + B runtime·callbackasm1(SB) + MOVW $1871, R12 + B runtime·callbackasm1(SB) + MOVW $1872, R12 + B runtime·callbackasm1(SB) + MOVW $1873, R12 + B runtime·callbackasm1(SB) + MOVW $1874, R12 + B runtime·callbackasm1(SB) + MOVW $1875, R12 + B runtime·callbackasm1(SB) + MOVW $1876, R12 + B runtime·callbackasm1(SB) + MOVW $1877, R12 + B runtime·callbackasm1(SB) + MOVW $1878, R12 + B runtime·callbackasm1(SB) + MOVW $1879, R12 + B runtime·callbackasm1(SB) + MOVW $1880, R12 + B runtime·callbackasm1(SB) + MOVW $1881, R12 + B runtime·callbackasm1(SB) + MOVW $1882, R12 + B runtime·callbackasm1(SB) + MOVW $1883, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1884, R12 + B runtime·callbackasm1(SB) + MOVW $1885, R12 + B runtime·callbackasm1(SB) + MOVW $1886, R12 + B runtime·callbackasm1(SB) + MOVW $1887, R12 + B runtime·callbackasm1(SB) + MOVW $1888, R12 + B runtime·callbackasm1(SB) + MOVW $1889, R12 + B runtime·callbackasm1(SB) + MOVW $1890, R12 + B runtime·callbackasm1(SB) + MOVW $1891, R12 + B runtime·callbackasm1(SB) + MOVW $1892, R12 + B runtime·callbackasm1(SB) + MOVW $1893, R12 + B runtime·callbackasm1(SB) + MOVW $1894, R12 + B runtime·callbackasm1(SB) + MOVW $1895, R12 + B runtime·callbackasm1(SB) + MOVW $1896, R12 + B runtime·callbackasm1(SB) + MOVW $1897, R12 + B runtime·callbackasm1(SB) + MOVW $1898, R12 + B runtime·callbackasm1(SB) + MOVW $1899, R12 + B runtime·callbackasm1(SB) + MOVW $1900, R12 + B runtime·callbackasm1(SB) + MOVW $1901, R12 + B runtime·callbackasm1(SB) + MOVW $1902, R12 + B runtime·callbackasm1(SB) + MOVW $1903, R12 + B runtime·callbackasm1(SB) + MOVW $1904, R12 + B runtime·callbackasm1(SB) + MOVW $1905, R12 + B runtime·callbackasm1(SB) + MOVW $1906, R12 + B runtime·callbackasm1(SB) + MOVW $1907, R12 + B runtime·callbackasm1(SB) + MOVW $1908, R12 + B runtime·callbackasm1(SB) + MOVW $1909, R12 + B runtime·callbackasm1(SB) + MOVW $1910, R12 + B runtime·callbackasm1(SB) + MOVW $1911, R12 + B runtime·callbackasm1(SB) + MOVW $1912, R12 + B runtime·callbackasm1(SB) + MOVW $1913, R12 + B runtime·callbackasm1(SB) + MOVW $1914, R12 + B runtime·callbackasm1(SB) + MOVW $1915, R12 + B runtime·callbackasm1(SB) + MOVW $1916, R12 + B runtime·callbackasm1(SB) + MOVW $1917, R12 + B runtime·callbackasm1(SB) + MOVW $1918, R12 + B runtime·callbackasm1(SB) + MOVW $1919, R12 + B runtime·callbackasm1(SB) + MOVW $1920, R12 + B runtime·callbackasm1(SB) + MOVW $1921, R12 + B runtime·callbackasm1(SB) + MOVW $1922, R12 + B runtime·callbackasm1(SB) + MOVW $1923, R12 + B runtime·callbackasm1(SB) + MOVW $1924, R12 + B runtime·callbackasm1(SB) + MOVW $1925, R12 + B runtime·callbackasm1(SB) + MOVW $1926, R12 + B 
runtime·callbackasm1(SB) + MOVW $1927, R12 + B runtime·callbackasm1(SB) + MOVW $1928, R12 + B runtime·callbackasm1(SB) + MOVW $1929, R12 + B runtime·callbackasm1(SB) + MOVW $1930, R12 + B runtime·callbackasm1(SB) + MOVW $1931, R12 + B runtime·callbackasm1(SB) + MOVW $1932, R12 + B runtime·callbackasm1(SB) + MOVW $1933, R12 + B runtime·callbackasm1(SB) + MOVW $1934, R12 + B runtime·callbackasm1(SB) + MOVW $1935, R12 + B runtime·callbackasm1(SB) + MOVW $1936, R12 + B runtime·callbackasm1(SB) + MOVW $1937, R12 + B runtime·callbackasm1(SB) + MOVW $1938, R12 + B runtime·callbackasm1(SB) + MOVW $1939, R12 + B runtime·callbackasm1(SB) + MOVW $1940, R12 + B runtime·callbackasm1(SB) + MOVW $1941, R12 + B runtime·callbackasm1(SB) + MOVW $1942, R12 + B runtime·callbackasm1(SB) + MOVW $1943, R12 + B runtime·callbackasm1(SB) + MOVW $1944, R12 + B runtime·callbackasm1(SB) + MOVW $1945, R12 + B runtime·callbackasm1(SB) + MOVW $1946, R12 + B runtime·callbackasm1(SB) + MOVW $1947, R12 + B runtime·callbackasm1(SB) + MOVW $1948, R12 + B runtime·callbackasm1(SB) + MOVW $1949, R12 + B runtime·callbackasm1(SB) + MOVW $1950, R12 + B runtime·callbackasm1(SB) + MOVW $1951, R12 + B runtime·callbackasm1(SB) + MOVW $1952, R12 + B runtime·callbackasm1(SB) + MOVW $1953, R12 + B runtime·callbackasm1(SB) + MOVW $1954, R12 + B runtime·callbackasm1(SB) + MOVW $1955, R12 + B runtime·callbackasm1(SB) + MOVW $1956, R12 + B runtime·callbackasm1(SB) + MOVW $1957, R12 + B runtime·callbackasm1(SB) + MOVW $1958, R12 + B runtime·callbackasm1(SB) + MOVW $1959, R12 + B runtime·callbackasm1(SB) + MOVW $1960, R12 + B runtime·callbackasm1(SB) + MOVW $1961, R12 + B runtime·callbackasm1(SB) + MOVW $1962, R12 + B runtime·callbackasm1(SB) + MOVW $1963, R12 + B runtime·callbackasm1(SB) + MOVW $1964, R12 + B runtime·callbackasm1(SB) + MOVW $1965, R12 + B runtime·callbackasm1(SB) + MOVW $1966, R12 + B runtime·callbackasm1(SB) + MOVW $1967, R12 + B runtime·callbackasm1(SB) + MOVW $1968, R12 + B runtime·callbackasm1(SB) 
+ MOVW $1969, R12 + B runtime·callbackasm1(SB) + MOVW $1970, R12 + B runtime·callbackasm1(SB) + MOVW $1971, R12 + B runtime·callbackasm1(SB) + MOVW $1972, R12 + B runtime·callbackasm1(SB) + MOVW $1973, R12 + B runtime·callbackasm1(SB) + MOVW $1974, R12 + B runtime·callbackasm1(SB) + MOVW $1975, R12 + B runtime·callbackasm1(SB) + MOVW $1976, R12 + B runtime·callbackasm1(SB) + MOVW $1977, R12 + B runtime·callbackasm1(SB) + MOVW $1978, R12 + B runtime·callbackasm1(SB) + MOVW $1979, R12 + B runtime·callbackasm1(SB) + MOVW $1980, R12 + B runtime·callbackasm1(SB) + MOVW $1981, R12 + B runtime·callbackasm1(SB) + MOVW $1982, R12 + B runtime·callbackasm1(SB) + MOVW $1983, R12 + B runtime·callbackasm1(SB) + MOVW $1984, R12 + B runtime·callbackasm1(SB) + MOVW $1985, R12 + B runtime·callbackasm1(SB) + MOVW $1986, R12 + B runtime·callbackasm1(SB) + MOVW $1987, R12 + B runtime·callbackasm1(SB) + MOVW $1988, R12 + B runtime·callbackasm1(SB) + MOVW $1989, R12 + B runtime·callbackasm1(SB) + MOVW $1990, R12 + B runtime·callbackasm1(SB) + MOVW $1991, R12 + B runtime·callbackasm1(SB) + MOVW $1992, R12 + B runtime·callbackasm1(SB) + MOVW $1993, R12 + B runtime·callbackasm1(SB) + MOVW $1994, R12 + B runtime·callbackasm1(SB) + MOVW $1995, R12 + B runtime·callbackasm1(SB) + MOVW $1996, R12 + B runtime·callbackasm1(SB) + MOVW $1997, R12 + B runtime·callbackasm1(SB) + MOVW $1998, R12 + B runtime·callbackasm1(SB) + MOVW $1999, R12 + B runtime·callbackasm1(SB) diff --git a/src/runtime/zcallback_windows_arm64.s b/src/runtime/zcallback_windows_arm64.s new file mode 100644 index 0000000..69fb057 --- /dev/null +++ b/src/runtime/zcallback_windows_arm64.s @@ -0,0 +1,4012 @@ +// Code generated by wincallback.go using 'go generate'. DO NOT EDIT. + +// External code calls into callbackasm at an offset corresponding +// to the callback index. Callbackasm is a table of MOV and B instructions. +// The MOV instruction loads R12 with the callback index, and the +// B instruction branches to callbackasm1. 
+// callbackasm1 takes the callback index from R12 and +// indexes into an array that stores information about each callback. +// It then calls the Go implementation for that callback. +#include "textflag.h" + +TEXT runtime·callbackasm(SB),NOSPLIT|NOFRAME,$0 + MOVD $0, R12 + B runtime·callbackasm1(SB) + MOVD $1, R12 + B runtime·callbackasm1(SB) + MOVD $2, R12 + B runtime·callbackasm1(SB) + MOVD $3, R12 + B runtime·callbackasm1(SB) + MOVD $4, R12 + B runtime·callbackasm1(SB) + MOVD $5, R12 + B runtime·callbackasm1(SB) + MOVD $6, R12 + B runtime·callbackasm1(SB) + MOVD $7, R12 + B runtime·callbackasm1(SB) + MOVD $8, R12 + B runtime·callbackasm1(SB) + MOVD $9, R12 + B runtime·callbackasm1(SB) + MOVD $10, R12 + B runtime·callbackasm1(SB) + MOVD $11, R12 + B runtime·callbackasm1(SB) + MOVD $12, R12 + B runtime·callbackasm1(SB) + MOVD $13, R12 + B runtime·callbackasm1(SB) + MOVD $14, R12 + B runtime·callbackasm1(SB) + MOVD $15, R12 + B runtime·callbackasm1(SB) + MOVD $16, R12 + B runtime·callbackasm1(SB) + MOVD $17, R12 + B runtime·callbackasm1(SB) + MOVD $18, R12 + B runtime·callbackasm1(SB) + MOVD $19, R12 + B runtime·callbackasm1(SB) + MOVD $20, R12 + B runtime·callbackasm1(SB) + MOVD $21, R12 + B runtime·callbackasm1(SB) + MOVD $22, R12 + B runtime·callbackasm1(SB) + MOVD $23, R12 + B runtime·callbackasm1(SB) + MOVD $24, R12 + B runtime·callbackasm1(SB) + MOVD $25, R12 + B runtime·callbackasm1(SB) + MOVD $26, R12 + B runtime·callbackasm1(SB) + MOVD $27, R12 + B runtime·callbackasm1(SB) + MOVD $28, R12 + B runtime·callbackasm1(SB) + MOVD $29, R12 + B runtime·callbackasm1(SB) + MOVD $30, R12 + B runtime·callbackasm1(SB) + MOVD $31, R12 + B runtime·callbackasm1(SB) + MOVD $32, R12 + B runtime·callbackasm1(SB) + MOVD $33, R12 + B runtime·callbackasm1(SB) + MOVD $34, R12 + B runtime·callbackasm1(SB) + MOVD $35, R12 + B runtime·callbackasm1(SB) + MOVD $36, R12 + B runtime·callbackasm1(SB) + MOVD $37, R12 + B runtime·callbackasm1(SB) + MOVD $38, R12 + B 
runtime·callbackasm1(SB) + MOVD $39, R12 + B runtime·callbackasm1(SB) + MOVD $40, R12 + B runtime·callbackasm1(SB) + MOVD $41, R12 + B runtime·callbackasm1(SB) + MOVD $42, R12 + B runtime·callbackasm1(SB) + MOVD $43, R12 + B runtime·callbackasm1(SB) + MOVD $44, R12 + B runtime·callbackasm1(SB) + MOVD $45, R12 + B runtime·callbackasm1(SB) + MOVD $46, R12 + B runtime·callbackasm1(SB) + MOVD $47, R12 + B runtime·callbackasm1(SB) + MOVD $48, R12 + B runtime·callbackasm1(SB) + MOVD $49, R12 + B runtime·callbackasm1(SB) + MOVD $50, R12 + B runtime·callbackasm1(SB) + MOVD $51, R12 + B runtime·callbackasm1(SB) + MOVD $52, R12 + B runtime·callbackasm1(SB) + MOVD $53, R12 + B runtime·callbackasm1(SB) + MOVD $54, R12 + B runtime·callbackasm1(SB) + MOVD $55, R12 + B runtime·callbackasm1(SB) + MOVD $56, R12 + B runtime·callbackasm1(SB) + MOVD $57, R12 + B runtime·callbackasm1(SB) + MOVD $58, R12 + B runtime·callbackasm1(SB) + MOVD $59, R12 + B runtime·callbackasm1(SB) + MOVD $60, R12 + B runtime·callbackasm1(SB) + MOVD $61, R12 + B runtime·callbackasm1(SB) + MOVD $62, R12 + B runtime·callbackasm1(SB) + MOVD $63, R12 + B runtime·callbackasm1(SB) + MOVD $64, R12 + B runtime·callbackasm1(SB) + MOVD $65, R12 + B runtime·callbackasm1(SB) + MOVD $66, R12 + B runtime·callbackasm1(SB) + MOVD $67, R12 + B runtime·callbackasm1(SB) + MOVD $68, R12 + B runtime·callbackasm1(SB) + MOVD $69, R12 + B runtime·callbackasm1(SB) + MOVD $70, R12 + B runtime·callbackasm1(SB) + MOVD $71, R12 + B runtime·callbackasm1(SB) + MOVD $72, R12 + B runtime·callbackasm1(SB) + MOVD $73, R12 + B runtime·callbackasm1(SB) + MOVD $74, R12 + B runtime·callbackasm1(SB) + MOVD $75, R12 + B runtime·callbackasm1(SB) + MOVD $76, R12 + B runtime·callbackasm1(SB) + MOVD $77, R12 + B runtime·callbackasm1(SB) + MOVD $78, R12 + B runtime·callbackasm1(SB) + MOVD $79, R12 + B runtime·callbackasm1(SB) + MOVD $80, R12 + B runtime·callbackasm1(SB) + MOVD $81, R12 + B runtime·callbackasm1(SB) + MOVD $82, R12 + B 
runtime·callbackasm1(SB) + MOVD $83, R12 + B runtime·callbackasm1(SB) + MOVD $84, R12 + B runtime·callbackasm1(SB) + MOVD $85, R12 + B runtime·callbackasm1(SB) + MOVD $86, R12 + B runtime·callbackasm1(SB) + MOVD $87, R12 + B runtime·callbackasm1(SB) + MOVD $88, R12 + B runtime·callbackasm1(SB) + MOVD $89, R12 + B runtime·callbackasm1(SB) + MOVD $90, R12 + B runtime·callbackasm1(SB) + MOVD $91, R12 + B runtime·callbackasm1(SB) + MOVD $92, R12 + B runtime·callbackasm1(SB) + MOVD $93, R12 + B runtime·callbackasm1(SB) + MOVD $94, R12 + B runtime·callbackasm1(SB) + MOVD $95, R12 + B runtime·callbackasm1(SB) + MOVD $96, R12 + B runtime·callbackasm1(SB) + MOVD $97, R12 + B runtime·callbackasm1(SB) + MOVD $98, R12 + B runtime·callbackasm1(SB) + MOVD $99, R12 + B runtime·callbackasm1(SB) + MOVD $100, R12 + B runtime·callbackasm1(SB) + MOVD $101, R12 + B runtime·callbackasm1(SB) + MOVD $102, R12 + B runtime·callbackasm1(SB) + MOVD $103, R12 + B runtime·callbackasm1(SB) + MOVD $104, R12 + B runtime·callbackasm1(SB) + MOVD $105, R12 + B runtime·callbackasm1(SB) + MOVD $106, R12 + B runtime·callbackasm1(SB) + MOVD $107, R12 + B runtime·callbackasm1(SB) + MOVD $108, R12 + B runtime·callbackasm1(SB) + MOVD $109, R12 + B runtime·callbackasm1(SB) + MOVD $110, R12 + B runtime·callbackasm1(SB) + MOVD $111, R12 + B runtime·callbackasm1(SB) + MOVD $112, R12 + B runtime·callbackasm1(SB) + MOVD $113, R12 + B runtime·callbackasm1(SB) + MOVD $114, R12 + B runtime·callbackasm1(SB) + MOVD $115, R12 + B runtime·callbackasm1(SB) + MOVD $116, R12 + B runtime·callbackasm1(SB) + MOVD $117, R12 + B runtime·callbackasm1(SB) + MOVD $118, R12 + B runtime·callbackasm1(SB) + MOVD $119, R12 + B runtime·callbackasm1(SB) + MOVD $120, R12 + B runtime·callbackasm1(SB) + MOVD $121, R12 + B runtime·callbackasm1(SB) + MOVD $122, R12 + B runtime·callbackasm1(SB) + MOVD $123, R12 + B runtime·callbackasm1(SB) + MOVD $124, R12 + B runtime·callbackasm1(SB) + MOVD $125, R12 + B runtime·callbackasm1(SB) + MOVD $126, 
R12 + B runtime·callbackasm1(SB) + MOVD $127, R12 + B runtime·callbackasm1(SB) + MOVD $128, R12 + B runtime·callbackasm1(SB) + MOVD $129, R12 + B runtime·callbackasm1(SB) + MOVD $130, R12 + B runtime·callbackasm1(SB) + MOVD $131, R12 + B runtime·callbackasm1(SB) + MOVD $132, R12 + B runtime·callbackasm1(SB) + MOVD $133, R12 + B runtime·callbackasm1(SB) + MOVD $134, R12 + B runtime·callbackasm1(SB) + MOVD $135, R12 + B runtime·callbackasm1(SB) + MOVD $136, R12 + B runtime·callbackasm1(SB) + MOVD $137, R12 + B runtime·callbackasm1(SB) + MOVD $138, R12 + B runtime·callbackasm1(SB) + MOVD $139, R12 + B runtime·callbackasm1(SB) + MOVD $140, R12 + B runtime·callbackasm1(SB) + MOVD $141, R12 + B runtime·callbackasm1(SB) + MOVD $142, R12 + B runtime·callbackasm1(SB) + MOVD $143, R12 + B runtime·callbackasm1(SB) + MOVD $144, R12 + B runtime·callbackasm1(SB) + MOVD $145, R12 + B runtime·callbackasm1(SB) + MOVD $146, R12 + B runtime·callbackasm1(SB) + MOVD $147, R12 + B runtime·callbackasm1(SB) + MOVD $148, R12 + B runtime·callbackasm1(SB) + MOVD $149, R12 + B runtime·callbackasm1(SB) + MOVD $150, R12 + B runtime·callbackasm1(SB) + MOVD $151, R12 + B runtime·callbackasm1(SB) + MOVD $152, R12 + B runtime·callbackasm1(SB) + MOVD $153, R12 + B runtime·callbackasm1(SB) + MOVD $154, R12 + B runtime·callbackasm1(SB) + MOVD $155, R12 + B runtime·callbackasm1(SB) + MOVD $156, R12 + B runtime·callbackasm1(SB) + MOVD $157, R12 + B runtime·callbackasm1(SB) + MOVD $158, R12 + B runtime·callbackasm1(SB) + MOVD $159, R12 + B runtime·callbackasm1(SB) + MOVD $160, R12 + B runtime·callbackasm1(SB) + MOVD $161, R12 + B runtime·callbackasm1(SB) + MOVD $162, R12 + B runtime·callbackasm1(SB) + MOVD $163, R12 + B runtime·callbackasm1(SB) + MOVD $164, R12 + B runtime·callbackasm1(SB) + MOVD $165, R12 + B runtime·callbackasm1(SB) + MOVD $166, R12 + B runtime·callbackasm1(SB) + MOVD $167, R12 + B runtime·callbackasm1(SB) + MOVD $168, R12 + B runtime·callbackasm1(SB) + MOVD $169, R12 + B 
runtime·callbackasm1(SB) + MOVD $170, R12 + B runtime·callbackasm1(SB) + MOVD $171, R12 + B runtime·callbackasm1(SB) + MOVD $172, R12 + B runtime·callbackasm1(SB) + MOVD $173, R12 + B runtime·callbackasm1(SB) + MOVD $174, R12 + B runtime·callbackasm1(SB) + MOVD $175, R12 + B runtime·callbackasm1(SB) + MOVD $176, R12 + B runtime·callbackasm1(SB) + MOVD $177, R12 + B runtime·callbackasm1(SB) + MOVD $178, R12 + B runtime·callbackasm1(SB) + MOVD $179, R12 + B runtime·callbackasm1(SB) + MOVD $180, R12 + B runtime·callbackasm1(SB) + MOVD $181, R12 + B runtime·callbackasm1(SB) + MOVD $182, R12 + B runtime·callbackasm1(SB) + MOVD $183, R12 + B runtime·callbackasm1(SB) + MOVD $184, R12 + B runtime·callbackasm1(SB) + MOVD $185, R12 + B runtime·callbackasm1(SB) + MOVD $186, R12 + B runtime·callbackasm1(SB) + MOVD $187, R12 + B runtime·callbackasm1(SB) + MOVD $188, R12 + B runtime·callbackasm1(SB) + MOVD $189, R12 + B runtime·callbackasm1(SB) + MOVD $190, R12 + B runtime·callbackasm1(SB) + MOVD $191, R12 + B runtime·callbackasm1(SB) + MOVD $192, R12 + B runtime·callbackasm1(SB) + MOVD $193, R12 + B runtime·callbackasm1(SB) + MOVD $194, R12 + B runtime·callbackasm1(SB) + MOVD $195, R12 + B runtime·callbackasm1(SB) + MOVD $196, R12 + B runtime·callbackasm1(SB) + MOVD $197, R12 + B runtime·callbackasm1(SB) + MOVD $198, R12 + B runtime·callbackasm1(SB) + MOVD $199, R12 + B runtime·callbackasm1(SB) + MOVD $200, R12 + B runtime·callbackasm1(SB) + MOVD $201, R12 + B runtime·callbackasm1(SB) + MOVD $202, R12 + B runtime·callbackasm1(SB) + MOVD $203, R12 + B runtime·callbackasm1(SB) + MOVD $204, R12 + B runtime·callbackasm1(SB) + MOVD $205, R12 + B runtime·callbackasm1(SB) + MOVD $206, R12 + B runtime·callbackasm1(SB) + MOVD $207, R12 + B runtime·callbackasm1(SB) + MOVD $208, R12 + B runtime·callbackasm1(SB) + MOVD $209, R12 + B runtime·callbackasm1(SB) + MOVD $210, R12 + B runtime·callbackasm1(SB) + MOVD $211, R12 + B runtime·callbackasm1(SB) + MOVD $212, R12 + B 
runtime·callbackasm1(SB) + MOVD $213, R12 + B runtime·callbackasm1(SB) + MOVD $214, R12 + B runtime·callbackasm1(SB) + MOVD $215, R12 + B runtime·callbackasm1(SB) + MOVD $216, R12 + B runtime·callbackasm1(SB) + MOVD $217, R12 + B runtime·callbackasm1(SB) + MOVD $218, R12 + B runtime·callbackasm1(SB) + MOVD $219, R12 + B runtime·callbackasm1(SB) + MOVD $220, R12 + B runtime·callbackasm1(SB) + MOVD $221, R12 + B runtime·callbackasm1(SB) + MOVD $222, R12 + B runtime·callbackasm1(SB) + MOVD $223, R12 + B runtime·callbackasm1(SB) + MOVD $224, R12 + B runtime·callbackasm1(SB) + MOVD $225, R12 + B runtime·callbackasm1(SB) + MOVD $226, R12 + B runtime·callbackasm1(SB) + MOVD $227, R12 + B runtime·callbackasm1(SB) + MOVD $228, R12 + B runtime·callbackasm1(SB) + MOVD $229, R12 + B runtime·callbackasm1(SB) + MOVD $230, R12 + B runtime·callbackasm1(SB) + MOVD $231, R12 + B runtime·callbackasm1(SB) + MOVD $232, R12 + B runtime·callbackasm1(SB) + MOVD $233, R12 + B runtime·callbackasm1(SB) + MOVD $234, R12 + B runtime·callbackasm1(SB) + MOVD $235, R12 + B runtime·callbackasm1(SB) + MOVD $236, R12 + B runtime·callbackasm1(SB) + MOVD $237, R12 + B runtime·callbackasm1(SB) + MOVD $238, R12 + B runtime·callbackasm1(SB) + MOVD $239, R12 + B runtime·callbackasm1(SB) + MOVD $240, R12 + B runtime·callbackasm1(SB) + MOVD $241, R12 + B runtime·callbackasm1(SB) + MOVD $242, R12 + B runtime·callbackasm1(SB) + MOVD $243, R12 + B runtime·callbackasm1(SB) + MOVD $244, R12 + B runtime·callbackasm1(SB) + MOVD $245, R12 + B runtime·callbackasm1(SB) + MOVD $246, R12 + B runtime·callbackasm1(SB) + MOVD $247, R12 + B runtime·callbackasm1(SB) + MOVD $248, R12 + B runtime·callbackasm1(SB) + MOVD $249, R12 + B runtime·callbackasm1(SB) + MOVD $250, R12 + B runtime·callbackasm1(SB) + MOVD $251, R12 + B runtime·callbackasm1(SB) + MOVD $252, R12 + B runtime·callbackasm1(SB) + MOVD $253, R12 + B runtime·callbackasm1(SB) + MOVD $254, R12 + B runtime·callbackasm1(SB) + MOVD $255, R12 + B 
runtime·callbackasm1(SB) + MOVD $256, R12 + B runtime·callbackasm1(SB) + MOVD $257, R12 + B runtime·callbackasm1(SB) + MOVD $258, R12 + B runtime·callbackasm1(SB) + MOVD $259, R12 + B runtime·callbackasm1(SB) + MOVD $260, R12 + B runtime·callbackasm1(SB) + MOVD $261, R12 + B runtime·callbackasm1(SB) + MOVD $262, R12 + B runtime·callbackasm1(SB) + MOVD $263, R12 + B runtime·callbackasm1(SB) + MOVD $264, R12 + B runtime·callbackasm1(SB) + MOVD $265, R12 + B runtime·callbackasm1(SB) + MOVD $266, R12 + B runtime·callbackasm1(SB) + MOVD $267, R12 + B runtime·callbackasm1(SB) + MOVD $268, R12 + B runtime·callbackasm1(SB) + MOVD $269, R12 + B runtime·callbackasm1(SB) + MOVD $270, R12 + B runtime·callbackasm1(SB) + MOVD $271, R12 + B runtime·callbackasm1(SB) + MOVD $272, R12 + B runtime·callbackasm1(SB) + MOVD $273, R12 + B runtime·callbackasm1(SB) + MOVD $274, R12 + B runtime·callbackasm1(SB) + MOVD $275, R12 + B runtime·callbackasm1(SB) + MOVD $276, R12 + B runtime·callbackasm1(SB) + MOVD $277, R12 + B runtime·callbackasm1(SB) + MOVD $278, R12 + B runtime·callbackasm1(SB) + MOVD $279, R12 + B runtime·callbackasm1(SB) + MOVD $280, R12 + B runtime·callbackasm1(SB) + MOVD $281, R12 + B runtime·callbackasm1(SB) + MOVD $282, R12 + B runtime·callbackasm1(SB) + MOVD $283, R12 + B runtime·callbackasm1(SB) + MOVD $284, R12 + B runtime·callbackasm1(SB) + MOVD $285, R12 + B runtime·callbackasm1(SB) + MOVD $286, R12 + B runtime·callbackasm1(SB) + MOVD $287, R12 + B runtime·callbackasm1(SB) + MOVD $288, R12 + B runtime·callbackasm1(SB) + MOVD $289, R12 + B runtime·callbackasm1(SB) + MOVD $290, R12 + B runtime·callbackasm1(SB) + MOVD $291, R12 + B runtime·callbackasm1(SB) + MOVD $292, R12 + B runtime·callbackasm1(SB) + MOVD $293, R12 + B runtime·callbackasm1(SB) + MOVD $294, R12 + B runtime·callbackasm1(SB) + MOVD $295, R12 + B runtime·callbackasm1(SB) + MOVD $296, R12 + B runtime·callbackasm1(SB) + MOVD $297, R12 + B runtime·callbackasm1(SB) + MOVD $298, R12 + B 
runtime·callbackasm1(SB) + MOVD $299, R12 + B runtime·callbackasm1(SB) + MOVD $300, R12 + B runtime·callbackasm1(SB) + MOVD $301, R12 + B runtime·callbackasm1(SB) + MOVD $302, R12 + B runtime·callbackasm1(SB) + MOVD $303, R12 + B runtime·callbackasm1(SB) + MOVD $304, R12 + B runtime·callbackasm1(SB) + MOVD $305, R12 + B runtime·callbackasm1(SB) + MOVD $306, R12 + B runtime·callbackasm1(SB) + MOVD $307, R12 + B runtime·callbackasm1(SB) + MOVD $308, R12 + B runtime·callbackasm1(SB) + MOVD $309, R12 + B runtime·callbackasm1(SB) + MOVD $310, R12 + B runtime·callbackasm1(SB) + MOVD $311, R12 + B runtime·callbackasm1(SB) + MOVD $312, R12 + B runtime·callbackasm1(SB) + MOVD $313, R12 + B runtime·callbackasm1(SB) + MOVD $314, R12 + B runtime·callbackasm1(SB) + MOVD $315, R12 + B runtime·callbackasm1(SB) + MOVD $316, R12 + B runtime·callbackasm1(SB) + MOVD $317, R12 + B runtime·callbackasm1(SB) + MOVD $318, R12 + B runtime·callbackasm1(SB) + MOVD $319, R12 + B runtime·callbackasm1(SB) + MOVD $320, R12 + B runtime·callbackasm1(SB) + MOVD $321, R12 + B runtime·callbackasm1(SB) + MOVD $322, R12 + B runtime·callbackasm1(SB) + MOVD $323, R12 + B runtime·callbackasm1(SB) + MOVD $324, R12 + B runtime·callbackasm1(SB) + MOVD $325, R12 + B runtime·callbackasm1(SB) + MOVD $326, R12 + B runtime·callbackasm1(SB) + MOVD $327, R12 + B runtime·callbackasm1(SB) + MOVD $328, R12 + B runtime·callbackasm1(SB) + MOVD $329, R12 + B runtime·callbackasm1(SB) + MOVD $330, R12 + B runtime·callbackasm1(SB) + MOVD $331, R12 + B runtime·callbackasm1(SB) + MOVD $332, R12 + B runtime·callbackasm1(SB) + MOVD $333, R12 + B runtime·callbackasm1(SB) + MOVD $334, R12 + B runtime·callbackasm1(SB) + MOVD $335, R12 + B runtime·callbackasm1(SB) + MOVD $336, R12 + B runtime·callbackasm1(SB) + MOVD $337, R12 + B runtime·callbackasm1(SB) + MOVD $338, R12 + B runtime·callbackasm1(SB) + MOVD $339, R12 + B runtime·callbackasm1(SB) + MOVD $340, R12 + B runtime·callbackasm1(SB) + MOVD $341, R12 + B 
runtime·callbackasm1(SB) + MOVD $342, R12 + B runtime·callbackasm1(SB) + MOVD $343, R12 + B runtime·callbackasm1(SB) + MOVD $344, R12 + B runtime·callbackasm1(SB) + MOVD $345, R12 + B runtime·callbackasm1(SB) + MOVD $346, R12 + B runtime·callbackasm1(SB) + MOVD $347, R12 + B runtime·callbackasm1(SB) + MOVD $348, R12 + B runtime·callbackasm1(SB) + MOVD $349, R12 + B runtime·callbackasm1(SB) + MOVD $350, R12 + B runtime·callbackasm1(SB) + MOVD $351, R12 + B runtime·callbackasm1(SB) + MOVD $352, R12 + B runtime·callbackasm1(SB) + MOVD $353, R12 + B runtime·callbackasm1(SB) + MOVD $354, R12 + B runtime·callbackasm1(SB) + MOVD $355, R12 + B runtime·callbackasm1(SB) + MOVD $356, R12 + B runtime·callbackasm1(SB) + MOVD $357, R12 + B runtime·callbackasm1(SB) + MOVD $358, R12 + B runtime·callbackasm1(SB) + MOVD $359, R12 + B runtime·callbackasm1(SB) + MOVD $360, R12 + B runtime·callbackasm1(SB) + MOVD $361, R12 + B runtime·callbackasm1(SB) + MOVD $362, R12 + B runtime·callbackasm1(SB) + MOVD $363, R12 + B runtime·callbackasm1(SB) + MOVD $364, R12 + B runtime·callbackasm1(SB) + MOVD $365, R12 + B runtime·callbackasm1(SB) + MOVD $366, R12 + B runtime·callbackasm1(SB) + MOVD $367, R12 + B runtime·callbackasm1(SB) + MOVD $368, R12 + B runtime·callbackasm1(SB) + MOVD $369, R12 + B runtime·callbackasm1(SB) + MOVD $370, R12 + B runtime·callbackasm1(SB) + MOVD $371, R12 + B runtime·callbackasm1(SB) + MOVD $372, R12 + B runtime·callbackasm1(SB) + MOVD $373, R12 + B runtime·callbackasm1(SB) + MOVD $374, R12 + B runtime·callbackasm1(SB) + MOVD $375, R12 + B runtime·callbackasm1(SB) + MOVD $376, R12 + B runtime·callbackasm1(SB) + MOVD $377, R12 + B runtime·callbackasm1(SB) + MOVD $378, R12 + B runtime·callbackasm1(SB) + MOVD $379, R12 + B runtime·callbackasm1(SB) + MOVD $380, R12 + B runtime·callbackasm1(SB) + MOVD $381, R12 + B runtime·callbackasm1(SB) + MOVD $382, R12 + B runtime·callbackasm1(SB) + MOVD $383, R12 + B runtime·callbackasm1(SB) + MOVD $384, R12 + B 
runtime·callbackasm1(SB) + MOVD $385, R12 + B runtime·callbackasm1(SB) + MOVD $386, R12 + B runtime·callbackasm1(SB) + MOVD $387, R12 + B runtime·callbackasm1(SB) + MOVD $388, R12 + B runtime·callbackasm1(SB) + MOVD $389, R12 + B runtime·callbackasm1(SB) + MOVD $390, R12 + B runtime·callbackasm1(SB) + MOVD $391, R12 + B runtime·callbackasm1(SB) + MOVD $392, R12 + B runtime·callbackasm1(SB) + MOVD $393, R12 + B runtime·callbackasm1(SB) + MOVD $394, R12 + B runtime·callbackasm1(SB) + MOVD $395, R12 + B runtime·callbackasm1(SB) + MOVD $396, R12 + B runtime·callbackasm1(SB) + MOVD $397, R12 + B runtime·callbackasm1(SB) + MOVD $398, R12 + B runtime·callbackasm1(SB) + MOVD $399, R12 + B runtime·callbackasm1(SB) + MOVD $400, R12 + B runtime·callbackasm1(SB) + MOVD $401, R12 + B runtime·callbackasm1(SB) + MOVD $402, R12 + B runtime·callbackasm1(SB) + MOVD $403, R12 + B runtime·callbackasm1(SB) + MOVD $404, R12 + B runtime·callbackasm1(SB) + MOVD $405, R12 + B runtime·callbackasm1(SB) + MOVD $406, R12 + B runtime·callbackasm1(SB) + MOVD $407, R12 + B runtime·callbackasm1(SB) + MOVD $408, R12 + B runtime·callbackasm1(SB) + MOVD $409, R12 + B runtime·callbackasm1(SB) + MOVD $410, R12 + B runtime·callbackasm1(SB) + MOVD $411, R12 + B runtime·callbackasm1(SB) + MOVD $412, R12 + B runtime·callbackasm1(SB) + MOVD $413, R12 + B runtime·callbackasm1(SB) + MOVD $414, R12 + B runtime·callbackasm1(SB) + MOVD $415, R12 + B runtime·callbackasm1(SB) + MOVD $416, R12 + B runtime·callbackasm1(SB) + MOVD $417, R12 + B runtime·callbackasm1(SB) + MOVD $418, R12 + B runtime·callbackasm1(SB) + MOVD $419, R12 + B runtime·callbackasm1(SB) + MOVD $420, R12 + B runtime·callbackasm1(SB) + MOVD $421, R12 + B runtime·callbackasm1(SB) + MOVD $422, R12 + B runtime·callbackasm1(SB) + MOVD $423, R12 + B runtime·callbackasm1(SB) + MOVD $424, R12 + B runtime·callbackasm1(SB) + MOVD $425, R12 + B runtime·callbackasm1(SB) + MOVD $426, R12 + B runtime·callbackasm1(SB) + MOVD $427, R12 + B 
runtime·callbackasm1(SB) + MOVD $428, R12 + B runtime·callbackasm1(SB) + MOVD $429, R12 + B runtime·callbackasm1(SB) + MOVD $430, R12 + B runtime·callbackasm1(SB) + MOVD $431, R12 + B runtime·callbackasm1(SB) + MOVD $432, R12 + B runtime·callbackasm1(SB) + MOVD $433, R12 + B runtime·callbackasm1(SB) + MOVD $434, R12 + B runtime·callbackasm1(SB) + MOVD $435, R12 + B runtime·callbackasm1(SB) + MOVD $436, R12 + B runtime·callbackasm1(SB) + MOVD $437, R12 + B runtime·callbackasm1(SB) + MOVD $438, R12 + B runtime·callbackasm1(SB) + MOVD $439, R12 + B runtime·callbackasm1(SB) + MOVD $440, R12 + B runtime·callbackasm1(SB) + MOVD $441, R12 + B runtime·callbackasm1(SB) + MOVD $442, R12 + B runtime·callbackasm1(SB) + MOVD $443, R12 + B runtime·callbackasm1(SB) + MOVD $444, R12 + B runtime·callbackasm1(SB) + MOVD $445, R12 + B runtime·callbackasm1(SB) + MOVD $446, R12 + B runtime·callbackasm1(SB) + MOVD $447, R12 + B runtime·callbackasm1(SB) + MOVD $448, R12 + B runtime·callbackasm1(SB) + MOVD $449, R12 + B runtime·callbackasm1(SB) + MOVD $450, R12 + B runtime·callbackasm1(SB) + MOVD $451, R12 + B runtime·callbackasm1(SB) + MOVD $452, R12 + B runtime·callbackasm1(SB) + MOVD $453, R12 + B runtime·callbackasm1(SB) + MOVD $454, R12 + B runtime·callbackasm1(SB) + MOVD $455, R12 + B runtime·callbackasm1(SB) + MOVD $456, R12 + B runtime·callbackasm1(SB) + MOVD $457, R12 + B runtime·callbackasm1(SB) + MOVD $458, R12 + B runtime·callbackasm1(SB) + MOVD $459, R12 + B runtime·callbackasm1(SB) + MOVD $460, R12 + B runtime·callbackasm1(SB) + MOVD $461, R12 + B runtime·callbackasm1(SB) + MOVD $462, R12 + B runtime·callbackasm1(SB) + MOVD $463, R12 + B runtime·callbackasm1(SB) + MOVD $464, R12 + B runtime·callbackasm1(SB) + MOVD $465, R12 + B runtime·callbackasm1(SB) + MOVD $466, R12 + B runtime·callbackasm1(SB) + MOVD $467, R12 + B runtime·callbackasm1(SB) + MOVD $468, R12 + B runtime·callbackasm1(SB) + MOVD $469, R12 + B runtime·callbackasm1(SB) + MOVD $470, R12 + B 
runtime·callbackasm1(SB) + MOVD $471, R12 + B runtime·callbackasm1(SB) + MOVD $472, R12 + B runtime·callbackasm1(SB) + MOVD $473, R12 + B runtime·callbackasm1(SB) + MOVD $474, R12 + B runtime·callbackasm1(SB) + MOVD $475, R12 + B runtime·callbackasm1(SB) + MOVD $476, R12 + B runtime·callbackasm1(SB) + MOVD $477, R12 + B runtime·callbackasm1(SB) + MOVD $478, R12 + B runtime·callbackasm1(SB) + MOVD $479, R12 + B runtime·callbackasm1(SB) + MOVD $480, R12 + B runtime·callbackasm1(SB) + MOVD $481, R12 + B runtime·callbackasm1(SB) + MOVD $482, R12 + B runtime·callbackasm1(SB) + MOVD $483, R12 + B runtime·callbackasm1(SB) + MOVD $484, R12 + B runtime·callbackasm1(SB) + MOVD $485, R12 + B runtime·callbackasm1(SB) + MOVD $486, R12 + B runtime·callbackasm1(SB) + MOVD $487, R12 + B runtime·callbackasm1(SB) + MOVD $488, R12 + B runtime·callbackasm1(SB) + MOVD $489, R12 + B runtime·callbackasm1(SB) + MOVD $490, R12 + B runtime·callbackasm1(SB) + MOVD $491, R12 + B runtime·callbackasm1(SB) + MOVD $492, R12 + B runtime·callbackasm1(SB) + MOVD $493, R12 + B runtime·callbackasm1(SB) + MOVD $494, R12 + B runtime·callbackasm1(SB) + MOVD $495, R12 + B runtime·callbackasm1(SB) + MOVD $496, R12 + B runtime·callbackasm1(SB) + MOVD $497, R12 + B runtime·callbackasm1(SB) + MOVD $498, R12 + B runtime·callbackasm1(SB) + MOVD $499, R12 + B runtime·callbackasm1(SB) + MOVD $500, R12 + B runtime·callbackasm1(SB) + MOVD $501, R12 + B runtime·callbackasm1(SB) + MOVD $502, R12 + B runtime·callbackasm1(SB) + MOVD $503, R12 + B runtime·callbackasm1(SB) + MOVD $504, R12 + B runtime·callbackasm1(SB) + MOVD $505, R12 + B runtime·callbackasm1(SB) + MOVD $506, R12 + B runtime·callbackasm1(SB) + MOVD $507, R12 + B runtime·callbackasm1(SB) + MOVD $508, R12 + B runtime·callbackasm1(SB) + MOVD $509, R12 + B runtime·callbackasm1(SB) + MOVD $510, R12 + B runtime·callbackasm1(SB) + MOVD $511, R12 + B runtime·callbackasm1(SB) + MOVD $512, R12 + B runtime·callbackasm1(SB) + MOVD $513, R12 + B 
runtime·callbackasm1(SB) + MOVD $514, R12 + B runtime·callbackasm1(SB) + MOVD $515, R12 + B runtime·callbackasm1(SB) + MOVD $516, R12 + B runtime·callbackasm1(SB) + MOVD $517, R12 + B runtime·callbackasm1(SB) + MOVD $518, R12 + B runtime·callbackasm1(SB) + MOVD $519, R12 + B runtime·callbackasm1(SB) + MOVD $520, R12 + B runtime·callbackasm1(SB) + MOVD $521, R12 + B runtime·callbackasm1(SB) + MOVD $522, R12 + B runtime·callbackasm1(SB) + MOVD $523, R12 + B runtime·callbackasm1(SB) + MOVD $524, R12 + B runtime·callbackasm1(SB) + MOVD $525, R12 + B runtime·callbackasm1(SB) + MOVD $526, R12 + B runtime·callbackasm1(SB) + MOVD $527, R12 + B runtime·callbackasm1(SB) + MOVD $528, R12 + B runtime·callbackasm1(SB) + MOVD $529, R12 + B runtime·callbackasm1(SB) + MOVD $530, R12 + B runtime·callbackasm1(SB) + MOVD $531, R12 + B runtime·callbackasm1(SB) + MOVD $532, R12 + B runtime·callbackasm1(SB) + MOVD $533, R12 + B runtime·callbackasm1(SB) + MOVD $534, R12 + B runtime·callbackasm1(SB) + MOVD $535, R12 + B runtime·callbackasm1(SB) + MOVD $536, R12 + B runtime·callbackasm1(SB) + MOVD $537, R12 + B runtime·callbackasm1(SB) + MOVD $538, R12 + B runtime·callbackasm1(SB) + MOVD $539, R12 + B runtime·callbackasm1(SB) + MOVD $540, R12 + B runtime·callbackasm1(SB) + MOVD $541, R12 + B runtime·callbackasm1(SB) + MOVD $542, R12 + B runtime·callbackasm1(SB) + MOVD $543, R12 + B runtime·callbackasm1(SB) + MOVD $544, R12 + B runtime·callbackasm1(SB) + MOVD $545, R12 + B runtime·callbackasm1(SB) + MOVD $546, R12 + B runtime·callbackasm1(SB) + MOVD $547, R12 + B runtime·callbackasm1(SB) + MOVD $548, R12 + B runtime·callbackasm1(SB) + MOVD $549, R12 + B runtime·callbackasm1(SB) + MOVD $550, R12 + B runtime·callbackasm1(SB) + MOVD $551, R12 + B runtime·callbackasm1(SB) + MOVD $552, R12 + B runtime·callbackasm1(SB) + MOVD $553, R12 + B runtime·callbackasm1(SB) + MOVD $554, R12 + B runtime·callbackasm1(SB) + MOVD $555, R12 + B runtime·callbackasm1(SB) + MOVD $556, R12 + B 
runtime·callbackasm1(SB) + MOVD $557, R12 + B runtime·callbackasm1(SB) + MOVD $558, R12 + B runtime·callbackasm1(SB) + MOVD $559, R12 + B runtime·callbackasm1(SB) + MOVD $560, R12 + B runtime·callbackasm1(SB) + MOVD $561, R12 + B runtime·callbackasm1(SB) + MOVD $562, R12 + B runtime·callbackasm1(SB) + MOVD $563, R12 + B runtime·callbackasm1(SB) + MOVD $564, R12 + B runtime·callbackasm1(SB) + MOVD $565, R12 + B runtime·callbackasm1(SB) + MOVD $566, R12 + B runtime·callbackasm1(SB) + MOVD $567, R12 + B runtime·callbackasm1(SB) + MOVD $568, R12 + B runtime·callbackasm1(SB) + MOVD $569, R12 + B runtime·callbackasm1(SB) + MOVD $570, R12 + B runtime·callbackasm1(SB) + MOVD $571, R12 + B runtime·callbackasm1(SB) + MOVD $572, R12 + B runtime·callbackasm1(SB) + MOVD $573, R12 + B runtime·callbackasm1(SB) + MOVD $574, R12 + B runtime·callbackasm1(SB) + MOVD $575, R12 + B runtime·callbackasm1(SB) + MOVD $576, R12 + B runtime·callbackasm1(SB) + MOVD $577, R12 + B runtime·callbackasm1(SB) + MOVD $578, R12 + B runtime·callbackasm1(SB) + MOVD $579, R12 + B runtime·callbackasm1(SB) + MOVD $580, R12 + B runtime·callbackasm1(SB) + MOVD $581, R12 + B runtime·callbackasm1(SB) + MOVD $582, R12 + B runtime·callbackasm1(SB) + MOVD $583, R12 + B runtime·callbackasm1(SB) + MOVD $584, R12 + B runtime·callbackasm1(SB) + MOVD $585, R12 + B runtime·callbackasm1(SB) + MOVD $586, R12 + B runtime·callbackasm1(SB) + MOVD $587, R12 + B runtime·callbackasm1(SB) + MOVD $588, R12 + B runtime·callbackasm1(SB) + MOVD $589, R12 + B runtime·callbackasm1(SB) + MOVD $590, R12 + B runtime·callbackasm1(SB) + MOVD $591, R12 + B runtime·callbackasm1(SB) + MOVD $592, R12 + B runtime·callbackasm1(SB) + MOVD $593, R12 + B runtime·callbackasm1(SB) + MOVD $594, R12 + B runtime·callbackasm1(SB) + MOVD $595, R12 + B runtime·callbackasm1(SB) + MOVD $596, R12 + B runtime·callbackasm1(SB) + MOVD $597, R12 + B runtime·callbackasm1(SB) + MOVD $598, R12 + B runtime·callbackasm1(SB) + MOVD $599, R12 + B 
runtime·callbackasm1(SB) + MOVD $600, R12 + B runtime·callbackasm1(SB) + MOVD $601, R12 + B runtime·callbackasm1(SB) + MOVD $602, R12 + B runtime·callbackasm1(SB) + MOVD $603, R12 + B runtime·callbackasm1(SB) + MOVD $604, R12 + B runtime·callbackasm1(SB) + MOVD $605, R12 + B runtime·callbackasm1(SB) + MOVD $606, R12 + B runtime·callbackasm1(SB) + MOVD $607, R12 + B runtime·callbackasm1(SB) + MOVD $608, R12 + B runtime·callbackasm1(SB) + MOVD $609, R12 + B runtime·callbackasm1(SB) + MOVD $610, R12 + B runtime·callbackasm1(SB) + MOVD $611, R12 + B runtime·callbackasm1(SB) + MOVD $612, R12 + B runtime·callbackasm1(SB) + MOVD $613, R12 + B runtime·callbackasm1(SB) + MOVD $614, R12 + B runtime·callbackasm1(SB) + MOVD $615, R12 + B runtime·callbackasm1(SB) + MOVD $616, R12 + B runtime·callbackasm1(SB) + MOVD $617, R12 + B runtime·callbackasm1(SB) + MOVD $618, R12 + B runtime·callbackasm1(SB) + MOVD $619, R12 + B runtime·callbackasm1(SB) + MOVD $620, R12 + B runtime·callbackasm1(SB) + MOVD $621, R12 + B runtime·callbackasm1(SB) + MOVD $622, R12 + B runtime·callbackasm1(SB) + MOVD $623, R12 + B runtime·callbackasm1(SB) + MOVD $624, R12 + B runtime·callbackasm1(SB) + MOVD $625, R12 + B runtime·callbackasm1(SB) + MOVD $626, R12 + B runtime·callbackasm1(SB) + MOVD $627, R12 + B runtime·callbackasm1(SB) + MOVD $628, R12 + B runtime·callbackasm1(SB) + MOVD $629, R12 + B runtime·callbackasm1(SB) + MOVD $630, R12 + B runtime·callbackasm1(SB) + MOVD $631, R12 + B runtime·callbackasm1(SB) + MOVD $632, R12 + B runtime·callbackasm1(SB) + MOVD $633, R12 + B runtime·callbackasm1(SB) + MOVD $634, R12 + B runtime·callbackasm1(SB) + MOVD $635, R12 + B runtime·callbackasm1(SB) + MOVD $636, R12 + B runtime·callbackasm1(SB) + MOVD $637, R12 + B runtime·callbackasm1(SB) + MOVD $638, R12 + B runtime·callbackasm1(SB) + MOVD $639, R12 + B runtime·callbackasm1(SB) + MOVD $640, R12 + B runtime·callbackasm1(SB) + MOVD $641, R12 + B runtime·callbackasm1(SB) + MOVD $642, R12 + B 
runtime·callbackasm1(SB) + MOVD $643, R12 + B runtime·callbackasm1(SB) + MOVD $644, R12 + B runtime·callbackasm1(SB) + MOVD $645, R12 + B runtime·callbackasm1(SB) + MOVD $646, R12 + B runtime·callbackasm1(SB) + MOVD $647, R12 + B runtime·callbackasm1(SB) + MOVD $648, R12 + B runtime·callbackasm1(SB) + MOVD $649, R12 + B runtime·callbackasm1(SB) + MOVD $650, R12 + B runtime·callbackasm1(SB) + MOVD $651, R12 + B runtime·callbackasm1(SB) + MOVD $652, R12 + B runtime·callbackasm1(SB) + MOVD $653, R12 + B runtime·callbackasm1(SB) + MOVD $654, R12 + B runtime·callbackasm1(SB) + MOVD $655, R12 + B runtime·callbackasm1(SB) + MOVD $656, R12 + B runtime·callbackasm1(SB) + MOVD $657, R12 + B runtime·callbackasm1(SB) + MOVD $658, R12 + B runtime·callbackasm1(SB) + MOVD $659, R12 + B runtime·callbackasm1(SB) + MOVD $660, R12 + B runtime·callbackasm1(SB) + MOVD $661, R12 + B runtime·callbackasm1(SB) + MOVD $662, R12 + B runtime·callbackasm1(SB) + MOVD $663, R12 + B runtime·callbackasm1(SB) + MOVD $664, R12 + B runtime·callbackasm1(SB) + MOVD $665, R12 + B runtime·callbackasm1(SB) + MOVD $666, R12 + B runtime·callbackasm1(SB) + MOVD $667, R12 + B runtime·callbackasm1(SB) + MOVD $668, R12 + B runtime·callbackasm1(SB) + MOVD $669, R12 + B runtime·callbackasm1(SB) + MOVD $670, R12 + B runtime·callbackasm1(SB) + MOVD $671, R12 + B runtime·callbackasm1(SB) + MOVD $672, R12 + B runtime·callbackasm1(SB) + MOVD $673, R12 + B runtime·callbackasm1(SB) + MOVD $674, R12 + B runtime·callbackasm1(SB) + MOVD $675, R12 + B runtime·callbackasm1(SB) + MOVD $676, R12 + B runtime·callbackasm1(SB) + MOVD $677, R12 + B runtime·callbackasm1(SB) + MOVD $678, R12 + B runtime·callbackasm1(SB) + MOVD $679, R12 + B runtime·callbackasm1(SB) + MOVD $680, R12 + B runtime·callbackasm1(SB) + MOVD $681, R12 + B runtime·callbackasm1(SB) + MOVD $682, R12 + B runtime·callbackasm1(SB) + MOVD $683, R12 + B runtime·callbackasm1(SB) + MOVD $684, R12 + B runtime·callbackasm1(SB) + MOVD $685, R12 + B 
runtime·callbackasm1(SB) + MOVD $686, R12 + B runtime·callbackasm1(SB) + MOVD $687, R12 + B runtime·callbackasm1(SB) + MOVD $688, R12 + B runtime·callbackasm1(SB) + MOVD $689, R12 + B runtime·callbackasm1(SB) + MOVD $690, R12 + B runtime·callbackasm1(SB) + MOVD $691, R12 + B runtime·callbackasm1(SB) + MOVD $692, R12 + B runtime·callbackasm1(SB) + MOVD $693, R12 + B runtime·callbackasm1(SB) + MOVD $694, R12 + B runtime·callbackasm1(SB) + MOVD $695, R12 + B runtime·callbackasm1(SB) + MOVD $696, R12 + B runtime·callbackasm1(SB) + MOVD $697, R12 + B runtime·callbackasm1(SB) + MOVD $698, R12 + B runtime·callbackasm1(SB) + MOVD $699, R12 + B runtime·callbackasm1(SB) + MOVD $700, R12 + B runtime·callbackasm1(SB) + MOVD $701, R12 + B runtime·callbackasm1(SB) + MOVD $702, R12 + B runtime·callbackasm1(SB) + MOVD $703, R12 + B runtime·callbackasm1(SB) + MOVD $704, R12 + B runtime·callbackasm1(SB) + MOVD $705, R12 + B runtime·callbackasm1(SB) + MOVD $706, R12 + B runtime·callbackasm1(SB) + MOVD $707, R12 + B runtime·callbackasm1(SB) + MOVD $708, R12 + B runtime·callbackasm1(SB) + MOVD $709, R12 + B runtime·callbackasm1(SB) + MOVD $710, R12 + B runtime·callbackasm1(SB) + MOVD $711, R12 + B runtime·callbackasm1(SB) + MOVD $712, R12 + B runtime·callbackasm1(SB) + MOVD $713, R12 + B runtime·callbackasm1(SB) + MOVD $714, R12 + B runtime·callbackasm1(SB) + MOVD $715, R12 + B runtime·callbackasm1(SB) + MOVD $716, R12 + B runtime·callbackasm1(SB) + MOVD $717, R12 + B runtime·callbackasm1(SB) + MOVD $718, R12 + B runtime·callbackasm1(SB) + MOVD $719, R12 + B runtime·callbackasm1(SB) + MOVD $720, R12 + B runtime·callbackasm1(SB) + MOVD $721, R12 + B runtime·callbackasm1(SB) + MOVD $722, R12 + B runtime·callbackasm1(SB) + MOVD $723, R12 + B runtime·callbackasm1(SB) + MOVD $724, R12 + B runtime·callbackasm1(SB) + MOVD $725, R12 + B runtime·callbackasm1(SB) + MOVD $726, R12 + B runtime·callbackasm1(SB) + MOVD $727, R12 + B runtime·callbackasm1(SB) + MOVD $728, R12 + B 
runtime·callbackasm1(SB) + MOVD $729, R12 + B runtime·callbackasm1(SB) + MOVD $730, R12 + B runtime·callbackasm1(SB) + MOVD $731, R12 + B runtime·callbackasm1(SB) + MOVD $732, R12 + B runtime·callbackasm1(SB) + MOVD $733, R12 + B runtime·callbackasm1(SB) + MOVD $734, R12 + B runtime·callbackasm1(SB) + MOVD $735, R12 + B runtime·callbackasm1(SB) + MOVD $736, R12 + B runtime·callbackasm1(SB) + MOVD $737, R12 + B runtime·callbackasm1(SB) + MOVD $738, R12 + B runtime·callbackasm1(SB) + MOVD $739, R12 + B runtime·callbackasm1(SB) + MOVD $740, R12 + B runtime·callbackasm1(SB) + MOVD $741, R12 + B runtime·callbackasm1(SB) + MOVD $742, R12 + B runtime·callbackasm1(SB) + MOVD $743, R12 + B runtime·callbackasm1(SB) + MOVD $744, R12 + B runtime·callbackasm1(SB) + MOVD $745, R12 + B runtime·callbackasm1(SB) + MOVD $746, R12 + B runtime·callbackasm1(SB) + MOVD $747, R12 + B runtime·callbackasm1(SB) + MOVD $748, R12 + B runtime·callbackasm1(SB) + MOVD $749, R12 + B runtime·callbackasm1(SB) + MOVD $750, R12 + B runtime·callbackasm1(SB) + MOVD $751, R12 + B runtime·callbackasm1(SB) + MOVD $752, R12 + B runtime·callbackasm1(SB) + MOVD $753, R12 + B runtime·callbackasm1(SB) + MOVD $754, R12 + B runtime·callbackasm1(SB) + MOVD $755, R12 + B runtime·callbackasm1(SB) + MOVD $756, R12 + B runtime·callbackasm1(SB) + MOVD $757, R12 + B runtime·callbackasm1(SB) + MOVD $758, R12 + B runtime·callbackasm1(SB) + MOVD $759, R12 + B runtime·callbackasm1(SB) + MOVD $760, R12 + B runtime·callbackasm1(SB) + MOVD $761, R12 + B runtime·callbackasm1(SB) + MOVD $762, R12 + B runtime·callbackasm1(SB) + MOVD $763, R12 + B runtime·callbackasm1(SB) + MOVD $764, R12 + B runtime·callbackasm1(SB) + MOVD $765, R12 + B runtime·callbackasm1(SB) + MOVD $766, R12 + B runtime·callbackasm1(SB) + MOVD $767, R12 + B runtime·callbackasm1(SB) + MOVD $768, R12 + B runtime·callbackasm1(SB) + MOVD $769, R12 + B runtime·callbackasm1(SB) + MOVD $770, R12 + B runtime·callbackasm1(SB) + MOVD $771, R12 + B 
runtime·callbackasm1(SB) + MOVD $772, R12 + B runtime·callbackasm1(SB) + MOVD $773, R12 + B runtime·callbackasm1(SB) + MOVD $774, R12 + B runtime·callbackasm1(SB) + MOVD $775, R12 + B runtime·callbackasm1(SB) + MOVD $776, R12 + B runtime·callbackasm1(SB) + MOVD $777, R12 + B runtime·callbackasm1(SB) + MOVD $778, R12 + B runtime·callbackasm1(SB) + MOVD $779, R12 + B runtime·callbackasm1(SB) + MOVD $780, R12 + B runtime·callbackasm1(SB) + MOVD $781, R12 + B runtime·callbackasm1(SB) + MOVD $782, R12 + B runtime·callbackasm1(SB) + MOVD $783, R12 + B runtime·callbackasm1(SB) + MOVD $784, R12 + B runtime·callbackasm1(SB) + MOVD $785, R12 + B runtime·callbackasm1(SB) + MOVD $786, R12 + B runtime·callbackasm1(SB) + MOVD $787, R12 + B runtime·callbackasm1(SB) + MOVD $788, R12 + B runtime·callbackasm1(SB) + MOVD $789, R12 + B runtime·callbackasm1(SB) + MOVD $790, R12 + B runtime·callbackasm1(SB) + MOVD $791, R12 + B runtime·callbackasm1(SB) + MOVD $792, R12 + B runtime·callbackasm1(SB) + MOVD $793, R12 + B runtime·callbackasm1(SB) + MOVD $794, R12 + B runtime·callbackasm1(SB) + MOVD $795, R12 + B runtime·callbackasm1(SB) + MOVD $796, R12 + B runtime·callbackasm1(SB) + MOVD $797, R12 + B runtime·callbackasm1(SB) + MOVD $798, R12 + B runtime·callbackasm1(SB) + MOVD $799, R12 + B runtime·callbackasm1(SB) + MOVD $800, R12 + B runtime·callbackasm1(SB) + MOVD $801, R12 + B runtime·callbackasm1(SB) + MOVD $802, R12 + B runtime·callbackasm1(SB) + MOVD $803, R12 + B runtime·callbackasm1(SB) + MOVD $804, R12 + B runtime·callbackasm1(SB) + MOVD $805, R12 + B runtime·callbackasm1(SB) + MOVD $806, R12 + B runtime·callbackasm1(SB) + MOVD $807, R12 + B runtime·callbackasm1(SB) + MOVD $808, R12 + B runtime·callbackasm1(SB) + MOVD $809, R12 + B runtime·callbackasm1(SB) + MOVD $810, R12 + B runtime·callbackasm1(SB) + MOVD $811, R12 + B runtime·callbackasm1(SB) + MOVD $812, R12 + B runtime·callbackasm1(SB) + MOVD $813, R12 + B runtime·callbackasm1(SB) + MOVD $814, R12 + B 
runtime·callbackasm1(SB) + MOVD $815, R12 + B runtime·callbackasm1(SB) + MOVD $816, R12 + B runtime·callbackasm1(SB) + MOVD $817, R12 + B runtime·callbackasm1(SB) + MOVD $818, R12 + B runtime·callbackasm1(SB) + MOVD $819, R12 + B runtime·callbackasm1(SB) + MOVD $820, R12 + B runtime·callbackasm1(SB) + MOVD $821, R12 + B runtime·callbackasm1(SB) + MOVD $822, R12 + B runtime·callbackasm1(SB) + MOVD $823, R12 + B runtime·callbackasm1(SB) + MOVD $824, R12 + B runtime·callbackasm1(SB) + MOVD $825, R12 + B runtime·callbackasm1(SB) + MOVD $826, R12 + B runtime·callbackasm1(SB) + MOVD $827, R12 + B runtime·callbackasm1(SB) + MOVD $828, R12 + B runtime·callbackasm1(SB) + MOVD $829, R12 + B runtime·callbackasm1(SB) + MOVD $830, R12 + B runtime·callbackasm1(SB) + MOVD $831, R12 + B runtime·callbackasm1(SB) + MOVD $832, R12 + B runtime·callbackasm1(SB) + MOVD $833, R12 + B runtime·callbackasm1(SB) + MOVD $834, R12 + B runtime·callbackasm1(SB) + MOVD $835, R12 + B runtime·callbackasm1(SB) + MOVD $836, R12 + B runtime·callbackasm1(SB) + MOVD $837, R12 + B runtime·callbackasm1(SB) + MOVD $838, R12 + B runtime·callbackasm1(SB) + MOVD $839, R12 + B runtime·callbackasm1(SB) + MOVD $840, R12 + B runtime·callbackasm1(SB) + MOVD $841, R12 + B runtime·callbackasm1(SB) + MOVD $842, R12 + B runtime·callbackasm1(SB) + MOVD $843, R12 + B runtime·callbackasm1(SB) + MOVD $844, R12 + B runtime·callbackasm1(SB) + MOVD $845, R12 + B runtime·callbackasm1(SB) + MOVD $846, R12 + B runtime·callbackasm1(SB) + MOVD $847, R12 + B runtime·callbackasm1(SB) + MOVD $848, R12 + B runtime·callbackasm1(SB) + MOVD $849, R12 + B runtime·callbackasm1(SB) + MOVD $850, R12 + B runtime·callbackasm1(SB) + MOVD $851, R12 + B runtime·callbackasm1(SB) + MOVD $852, R12 + B runtime·callbackasm1(SB) + MOVD $853, R12 + B runtime·callbackasm1(SB) + MOVD $854, R12 + B runtime·callbackasm1(SB) + MOVD $855, R12 + B runtime·callbackasm1(SB) + MOVD $856, R12 + B runtime·callbackasm1(SB) + MOVD $857, R12 + B 
runtime·callbackasm1(SB) + MOVD $858, R12 + B runtime·callbackasm1(SB) + MOVD $859, R12 + B runtime·callbackasm1(SB) + MOVD $860, R12 + B runtime·callbackasm1(SB) + MOVD $861, R12 + B runtime·callbackasm1(SB) + MOVD $862, R12 + B runtime·callbackasm1(SB) + MOVD $863, R12 + B runtime·callbackasm1(SB) + MOVD $864, R12 + B runtime·callbackasm1(SB) + MOVD $865, R12 + B runtime·callbackasm1(SB) + MOVD $866, R12 + B runtime·callbackasm1(SB) + MOVD $867, R12 + B runtime·callbackasm1(SB) + MOVD $868, R12 + B runtime·callbackasm1(SB) + MOVD $869, R12 + B runtime·callbackasm1(SB) + MOVD $870, R12 + B runtime·callbackasm1(SB) + MOVD $871, R12 + B runtime·callbackasm1(SB) + MOVD $872, R12 + B runtime·callbackasm1(SB) + MOVD $873, R12 + B runtime·callbackasm1(SB) + MOVD $874, R12 + B runtime·callbackasm1(SB) + MOVD $875, R12 + B runtime·callbackasm1(SB) + MOVD $876, R12 + B runtime·callbackasm1(SB) + MOVD $877, R12 + B runtime·callbackasm1(SB) + MOVD $878, R12 + B runtime·callbackasm1(SB) + MOVD $879, R12 + B runtime·callbackasm1(SB) + MOVD $880, R12 + B runtime·callbackasm1(SB) + MOVD $881, R12 + B runtime·callbackasm1(SB) + MOVD $882, R12 + B runtime·callbackasm1(SB) + MOVD $883, R12 + B runtime·callbackasm1(SB) + MOVD $884, R12 + B runtime·callbackasm1(SB) + MOVD $885, R12 + B runtime·callbackasm1(SB) + MOVD $886, R12 + B runtime·callbackasm1(SB) + MOVD $887, R12 + B runtime·callbackasm1(SB) + MOVD $888, R12 + B runtime·callbackasm1(SB) + MOVD $889, R12 + B runtime·callbackasm1(SB) + MOVD $890, R12 + B runtime·callbackasm1(SB) + MOVD $891, R12 + B runtime·callbackasm1(SB) + MOVD $892, R12 + B runtime·callbackasm1(SB) + MOVD $893, R12 + B runtime·callbackasm1(SB) + MOVD $894, R12 + B runtime·callbackasm1(SB) + MOVD $895, R12 + B runtime·callbackasm1(SB) + MOVD $896, R12 + B runtime·callbackasm1(SB) + MOVD $897, R12 + B runtime·callbackasm1(SB) + MOVD $898, R12 + B runtime·callbackasm1(SB) + MOVD $899, R12 + B runtime·callbackasm1(SB) + MOVD $900, R12 + B 
runtime·callbackasm1(SB) + MOVD $901, R12 + B runtime·callbackasm1(SB) + MOVD $902, R12 + B runtime·callbackasm1(SB) + MOVD $903, R12 + B runtime·callbackasm1(SB) + MOVD $904, R12 + B runtime·callbackasm1(SB) + MOVD $905, R12 + B runtime·callbackasm1(SB) + MOVD $906, R12 + B runtime·callbackasm1(SB) + MOVD $907, R12 + B runtime·callbackasm1(SB) + MOVD $908, R12 + B runtime·callbackasm1(SB) + MOVD $909, R12 + B runtime·callbackasm1(SB) + MOVD $910, R12 + B runtime·callbackasm1(SB) + MOVD $911, R12 + B runtime·callbackasm1(SB) + MOVD $912, R12 + B runtime·callbackasm1(SB) + MOVD $913, R12 + B runtime·callbackasm1(SB) + MOVD $914, R12 + B runtime·callbackasm1(SB) + MOVD $915, R12 + B runtime·callbackasm1(SB) + MOVD $916, R12 + B runtime·callbackasm1(SB) + MOVD $917, R12 + B runtime·callbackasm1(SB) + MOVD $918, R12 + B runtime·callbackasm1(SB) + MOVD $919, R12 + B runtime·callbackasm1(SB) + MOVD $920, R12 + B runtime·callbackasm1(SB) + MOVD $921, R12 + B runtime·callbackasm1(SB) + MOVD $922, R12 + B runtime·callbackasm1(SB) + MOVD $923, R12 + B runtime·callbackasm1(SB) + MOVD $924, R12 + B runtime·callbackasm1(SB) + MOVD $925, R12 + B runtime·callbackasm1(SB) + MOVD $926, R12 + B runtime·callbackasm1(SB) + MOVD $927, R12 + B runtime·callbackasm1(SB) + MOVD $928, R12 + B runtime·callbackasm1(SB) + MOVD $929, R12 + B runtime·callbackasm1(SB) + MOVD $930, R12 + B runtime·callbackasm1(SB) + MOVD $931, R12 + B runtime·callbackasm1(SB) + MOVD $932, R12 + B runtime·callbackasm1(SB) + MOVD $933, R12 + B runtime·callbackasm1(SB) + MOVD $934, R12 + B runtime·callbackasm1(SB) + MOVD $935, R12 + B runtime·callbackasm1(SB) + MOVD $936, R12 + B runtime·callbackasm1(SB) + MOVD $937, R12 + B runtime·callbackasm1(SB) + MOVD $938, R12 + B runtime·callbackasm1(SB) + MOVD $939, R12 + B runtime·callbackasm1(SB) + MOVD $940, R12 + B runtime·callbackasm1(SB) + MOVD $941, R12 + B runtime·callbackasm1(SB) + MOVD $942, R12 + B runtime·callbackasm1(SB) + MOVD $943, R12 + B 
runtime·callbackasm1(SB) + MOVD $944, R12 + B runtime·callbackasm1(SB) + MOVD $945, R12 + B runtime·callbackasm1(SB) + MOVD $946, R12 + B runtime·callbackasm1(SB) + MOVD $947, R12 + B runtime·callbackasm1(SB) + MOVD $948, R12 + B runtime·callbackasm1(SB) + MOVD $949, R12 + B runtime·callbackasm1(SB) + MOVD $950, R12 + B runtime·callbackasm1(SB) + MOVD $951, R12 + B runtime·callbackasm1(SB) + MOVD $952, R12 + B runtime·callbackasm1(SB) + MOVD $953, R12 + B runtime·callbackasm1(SB) + MOVD $954, R12 + B runtime·callbackasm1(SB) + MOVD $955, R12 + B runtime·callbackasm1(SB) + MOVD $956, R12 + B runtime·callbackasm1(SB) + MOVD $957, R12 + B runtime·callbackasm1(SB) + MOVD $958, R12 + B runtime·callbackasm1(SB) + MOVD $959, R12 + B runtime·callbackasm1(SB) + MOVD $960, R12 + B runtime·callbackasm1(SB) + MOVD $961, R12 + B runtime·callbackasm1(SB) + MOVD $962, R12 + B runtime·callbackasm1(SB) + MOVD $963, R12 + B runtime·callbackasm1(SB) + MOVD $964, R12 + B runtime·callbackasm1(SB) + MOVD $965, R12 + B runtime·callbackasm1(SB) + MOVD $966, R12 + B runtime·callbackasm1(SB) + MOVD $967, R12 + B runtime·callbackasm1(SB) + MOVD $968, R12 + B runtime·callbackasm1(SB) + MOVD $969, R12 + B runtime·callbackasm1(SB) + MOVD $970, R12 + B runtime·callbackasm1(SB) + MOVD $971, R12 + B runtime·callbackasm1(SB) + MOVD $972, R12 + B runtime·callbackasm1(SB) + MOVD $973, R12 + B runtime·callbackasm1(SB) + MOVD $974, R12 + B runtime·callbackasm1(SB) + MOVD $975, R12 + B runtime·callbackasm1(SB) + MOVD $976, R12 + B runtime·callbackasm1(SB) + MOVD $977, R12 + B runtime·callbackasm1(SB) + MOVD $978, R12 + B runtime·callbackasm1(SB) + MOVD $979, R12 + B runtime·callbackasm1(SB) + MOVD $980, R12 + B runtime·callbackasm1(SB) + MOVD $981, R12 + B runtime·callbackasm1(SB) + MOVD $982, R12 + B runtime·callbackasm1(SB) + MOVD $983, R12 + B runtime·callbackasm1(SB) + MOVD $984, R12 + B runtime·callbackasm1(SB) + MOVD $985, R12 + B runtime·callbackasm1(SB) + MOVD $986, R12 + B 
runtime·callbackasm1(SB) + MOVD $987, R12 + B runtime·callbackasm1(SB) + MOVD $988, R12 + B runtime·callbackasm1(SB) + MOVD $989, R12 + B runtime·callbackasm1(SB) + MOVD $990, R12 + B runtime·callbackasm1(SB) + MOVD $991, R12 + B runtime·callbackasm1(SB) + MOVD $992, R12 + B runtime·callbackasm1(SB) + MOVD $993, R12 + B runtime·callbackasm1(SB) + MOVD $994, R12 + B runtime·callbackasm1(SB) + MOVD $995, R12 + B runtime·callbackasm1(SB) + MOVD $996, R12 + B runtime·callbackasm1(SB) + MOVD $997, R12 + B runtime·callbackasm1(SB) + MOVD $998, R12 + B runtime·callbackasm1(SB) + MOVD $999, R12 + B runtime·callbackasm1(SB) + MOVD $1000, R12 + B runtime·callbackasm1(SB) + MOVD $1001, R12 + B runtime·callbackasm1(SB) + MOVD $1002, R12 + B runtime·callbackasm1(SB) + MOVD $1003, R12 + B runtime·callbackasm1(SB) + MOVD $1004, R12 + B runtime·callbackasm1(SB) + MOVD $1005, R12 + B runtime·callbackasm1(SB) + MOVD $1006, R12 + B runtime·callbackasm1(SB) + MOVD $1007, R12 + B runtime·callbackasm1(SB) + MOVD $1008, R12 + B runtime·callbackasm1(SB) + MOVD $1009, R12 + B runtime·callbackasm1(SB) + MOVD $1010, R12 + B runtime·callbackasm1(SB) + MOVD $1011, R12 + B runtime·callbackasm1(SB) + MOVD $1012, R12 + B runtime·callbackasm1(SB) + MOVD $1013, R12 + B runtime·callbackasm1(SB) + MOVD $1014, R12 + B runtime·callbackasm1(SB) + MOVD $1015, R12 + B runtime·callbackasm1(SB) + MOVD $1016, R12 + B runtime·callbackasm1(SB) + MOVD $1017, R12 + B runtime·callbackasm1(SB) + MOVD $1018, R12 + B runtime·callbackasm1(SB) + MOVD $1019, R12 + B runtime·callbackasm1(SB) + MOVD $1020, R12 + B runtime·callbackasm1(SB) + MOVD $1021, R12 + B runtime·callbackasm1(SB) + MOVD $1022, R12 + B runtime·callbackasm1(SB) + MOVD $1023, R12 + B runtime·callbackasm1(SB) + MOVD $1024, R12 + B runtime·callbackasm1(SB) + MOVD $1025, R12 + B runtime·callbackasm1(SB) + MOVD $1026, R12 + B runtime·callbackasm1(SB) + MOVD $1027, R12 + B runtime·callbackasm1(SB) + MOVD $1028, R12 + B runtime·callbackasm1(SB) + MOVD $1029, 
R12 + B runtime·callbackasm1(SB) + MOVD $1030, R12 + B runtime·callbackasm1(SB) + MOVD $1031, R12 + B runtime·callbackasm1(SB) + MOVD $1032, R12 + B runtime·callbackasm1(SB) + MOVD $1033, R12 + B runtime·callbackasm1(SB) + MOVD $1034, R12 + B runtime·callbackasm1(SB) + MOVD $1035, R12 + B runtime·callbackasm1(SB) + MOVD $1036, R12 + B runtime·callbackasm1(SB) + MOVD $1037, R12 + B runtime·callbackasm1(SB) + MOVD $1038, R12 + B runtime·callbackasm1(SB) + MOVD $1039, R12 + B runtime·callbackasm1(SB) + MOVD $1040, R12 + B runtime·callbackasm1(SB) + MOVD $1041, R12 + B runtime·callbackasm1(SB) + MOVD $1042, R12 + B runtime·callbackasm1(SB) + MOVD $1043, R12 + B runtime·callbackasm1(SB) + MOVD $1044, R12 + B runtime·callbackasm1(SB) + MOVD $1045, R12 + B runtime·callbackasm1(SB) + MOVD $1046, R12 + B runtime·callbackasm1(SB) + MOVD $1047, R12 + B runtime·callbackasm1(SB) + MOVD $1048, R12 + B runtime·callbackasm1(SB) + MOVD $1049, R12 + B runtime·callbackasm1(SB) + MOVD $1050, R12 + B runtime·callbackasm1(SB) + MOVD $1051, R12 + B runtime·callbackasm1(SB) + MOVD $1052, R12 + B runtime·callbackasm1(SB) + MOVD $1053, R12 + B runtime·callbackasm1(SB) + MOVD $1054, R12 + B runtime·callbackasm1(SB) + MOVD $1055, R12 + B runtime·callbackasm1(SB) + MOVD $1056, R12 + B runtime·callbackasm1(SB) + MOVD $1057, R12 + B runtime·callbackasm1(SB) + MOVD $1058, R12 + B runtime·callbackasm1(SB) + MOVD $1059, R12 + B runtime·callbackasm1(SB) + MOVD $1060, R12 + B runtime·callbackasm1(SB) + MOVD $1061, R12 + B runtime·callbackasm1(SB) + MOVD $1062, R12 + B runtime·callbackasm1(SB) + MOVD $1063, R12 + B runtime·callbackasm1(SB) + MOVD $1064, R12 + B runtime·callbackasm1(SB) + MOVD $1065, R12 + B runtime·callbackasm1(SB) + MOVD $1066, R12 + B runtime·callbackasm1(SB) + MOVD $1067, R12 + B runtime·callbackasm1(SB) + MOVD $1068, R12 + B runtime·callbackasm1(SB) + MOVD $1069, R12 + B runtime·callbackasm1(SB) + MOVD $1070, R12 + B runtime·callbackasm1(SB) + MOVD $1071, R12 + B 
runtime·callbackasm1(SB) + MOVD $1072, R12 + B runtime·callbackasm1(SB) + MOVD $1073, R12 + B runtime·callbackasm1(SB) + MOVD $1074, R12 + B runtime·callbackasm1(SB) + MOVD $1075, R12 + B runtime·callbackasm1(SB) + MOVD $1076, R12 + B runtime·callbackasm1(SB) + MOVD $1077, R12 + B runtime·callbackasm1(SB) + MOVD $1078, R12 + B runtime·callbackasm1(SB) + MOVD $1079, R12 + B runtime·callbackasm1(SB) + MOVD $1080, R12 + B runtime·callbackasm1(SB) + MOVD $1081, R12 + B runtime·callbackasm1(SB) + MOVD $1082, R12 + B runtime·callbackasm1(SB) + MOVD $1083, R12 + B runtime·callbackasm1(SB) + MOVD $1084, R12 + B runtime·callbackasm1(SB) + MOVD $1085, R12 + B runtime·callbackasm1(SB) + MOVD $1086, R12 + B runtime·callbackasm1(SB) + MOVD $1087, R12 + B runtime·callbackasm1(SB) + MOVD $1088, R12 + B runtime·callbackasm1(SB) + MOVD $1089, R12 + B runtime·callbackasm1(SB) + MOVD $1090, R12 + B runtime·callbackasm1(SB) + MOVD $1091, R12 + B runtime·callbackasm1(SB) + MOVD $1092, R12 + B runtime·callbackasm1(SB) + MOVD $1093, R12 + B runtime·callbackasm1(SB) + MOVD $1094, R12 + B runtime·callbackasm1(SB) + MOVD $1095, R12 + B runtime·callbackasm1(SB) + MOVD $1096, R12 + B runtime·callbackasm1(SB) + MOVD $1097, R12 + B runtime·callbackasm1(SB) + MOVD $1098, R12 + B runtime·callbackasm1(SB) + MOVD $1099, R12 + B runtime·callbackasm1(SB) + MOVD $1100, R12 + B runtime·callbackasm1(SB) + MOVD $1101, R12 + B runtime·callbackasm1(SB) + MOVD $1102, R12 + B runtime·callbackasm1(SB) + MOVD $1103, R12 + B runtime·callbackasm1(SB) + MOVD $1104, R12 + B runtime·callbackasm1(SB) + MOVD $1105, R12 + B runtime·callbackasm1(SB) + MOVD $1106, R12 + B runtime·callbackasm1(SB) + MOVD $1107, R12 + B runtime·callbackasm1(SB) + MOVD $1108, R12 + B runtime·callbackasm1(SB) + MOVD $1109, R12 + B runtime·callbackasm1(SB) + MOVD $1110, R12 + B runtime·callbackasm1(SB) + MOVD $1111, R12 + B runtime·callbackasm1(SB) + MOVD $1112, R12 + B runtime·callbackasm1(SB) + MOVD $1113, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1114, R12 + B runtime·callbackasm1(SB) + MOVD $1115, R12 + B runtime·callbackasm1(SB) + MOVD $1116, R12 + B runtime·callbackasm1(SB) + MOVD $1117, R12 + B runtime·callbackasm1(SB) + MOVD $1118, R12 + B runtime·callbackasm1(SB) + MOVD $1119, R12 + B runtime·callbackasm1(SB) + MOVD $1120, R12 + B runtime·callbackasm1(SB) + MOVD $1121, R12 + B runtime·callbackasm1(SB) + MOVD $1122, R12 + B runtime·callbackasm1(SB) + MOVD $1123, R12 + B runtime·callbackasm1(SB) + MOVD $1124, R12 + B runtime·callbackasm1(SB) + MOVD $1125, R12 + B runtime·callbackasm1(SB) + MOVD $1126, R12 + B runtime·callbackasm1(SB) + MOVD $1127, R12 + B runtime·callbackasm1(SB) + MOVD $1128, R12 + B runtime·callbackasm1(SB) + MOVD $1129, R12 + B runtime·callbackasm1(SB) + MOVD $1130, R12 + B runtime·callbackasm1(SB) + MOVD $1131, R12 + B runtime·callbackasm1(SB) + MOVD $1132, R12 + B runtime·callbackasm1(SB) + MOVD $1133, R12 + B runtime·callbackasm1(SB) + MOVD $1134, R12 + B runtime·callbackasm1(SB) + MOVD $1135, R12 + B runtime·callbackasm1(SB) + MOVD $1136, R12 + B runtime·callbackasm1(SB) + MOVD $1137, R12 + B runtime·callbackasm1(SB) + MOVD $1138, R12 + B runtime·callbackasm1(SB) + MOVD $1139, R12 + B runtime·callbackasm1(SB) + MOVD $1140, R12 + B runtime·callbackasm1(SB) + MOVD $1141, R12 + B runtime·callbackasm1(SB) + MOVD $1142, R12 + B runtime·callbackasm1(SB) + MOVD $1143, R12 + B runtime·callbackasm1(SB) + MOVD $1144, R12 + B runtime·callbackasm1(SB) + MOVD $1145, R12 + B runtime·callbackasm1(SB) + MOVD $1146, R12 + B runtime·callbackasm1(SB) + MOVD $1147, R12 + B runtime·callbackasm1(SB) + MOVD $1148, R12 + B runtime·callbackasm1(SB) + MOVD $1149, R12 + B runtime·callbackasm1(SB) + MOVD $1150, R12 + B runtime·callbackasm1(SB) + MOVD $1151, R12 + B runtime·callbackasm1(SB) + MOVD $1152, R12 + B runtime·callbackasm1(SB) + MOVD $1153, R12 + B runtime·callbackasm1(SB) + MOVD $1154, R12 + B runtime·callbackasm1(SB) + MOVD $1155, R12 + B runtime·callbackasm1(SB) + MOVD $1156, R12 + B 
runtime·callbackasm1(SB) + MOVD $1157, R12 + B runtime·callbackasm1(SB) + MOVD $1158, R12 + B runtime·callbackasm1(SB) + MOVD $1159, R12 + B runtime·callbackasm1(SB) + MOVD $1160, R12 + B runtime·callbackasm1(SB) + MOVD $1161, R12 + B runtime·callbackasm1(SB) + MOVD $1162, R12 + B runtime·callbackasm1(SB) + MOVD $1163, R12 + B runtime·callbackasm1(SB) + MOVD $1164, R12 + B runtime·callbackasm1(SB) + MOVD $1165, R12 + B runtime·callbackasm1(SB) + MOVD $1166, R12 + B runtime·callbackasm1(SB) + MOVD $1167, R12 + B runtime·callbackasm1(SB) + MOVD $1168, R12 + B runtime·callbackasm1(SB) + MOVD $1169, R12 + B runtime·callbackasm1(SB) + MOVD $1170, R12 + B runtime·callbackasm1(SB) + MOVD $1171, R12 + B runtime·callbackasm1(SB) + MOVD $1172, R12 + B runtime·callbackasm1(SB) + MOVD $1173, R12 + B runtime·callbackasm1(SB) + MOVD $1174, R12 + B runtime·callbackasm1(SB) + MOVD $1175, R12 + B runtime·callbackasm1(SB) + MOVD $1176, R12 + B runtime·callbackasm1(SB) + MOVD $1177, R12 + B runtime·callbackasm1(SB) + MOVD $1178, R12 + B runtime·callbackasm1(SB) + MOVD $1179, R12 + B runtime·callbackasm1(SB) + MOVD $1180, R12 + B runtime·callbackasm1(SB) + MOVD $1181, R12 + B runtime·callbackasm1(SB) + MOVD $1182, R12 + B runtime·callbackasm1(SB) + MOVD $1183, R12 + B runtime·callbackasm1(SB) + MOVD $1184, R12 + B runtime·callbackasm1(SB) + MOVD $1185, R12 + B runtime·callbackasm1(SB) + MOVD $1186, R12 + B runtime·callbackasm1(SB) + MOVD $1187, R12 + B runtime·callbackasm1(SB) + MOVD $1188, R12 + B runtime·callbackasm1(SB) + MOVD $1189, R12 + B runtime·callbackasm1(SB) + MOVD $1190, R12 + B runtime·callbackasm1(SB) + MOVD $1191, R12 + B runtime·callbackasm1(SB) + MOVD $1192, R12 + B runtime·callbackasm1(SB) + MOVD $1193, R12 + B runtime·callbackasm1(SB) + MOVD $1194, R12 + B runtime·callbackasm1(SB) + MOVD $1195, R12 + B runtime·callbackasm1(SB) + MOVD $1196, R12 + B runtime·callbackasm1(SB) + MOVD $1197, R12 + B runtime·callbackasm1(SB) + MOVD $1198, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1199, R12 + B runtime·callbackasm1(SB) + MOVD $1200, R12 + B runtime·callbackasm1(SB) + MOVD $1201, R12 + B runtime·callbackasm1(SB) + MOVD $1202, R12 + B runtime·callbackasm1(SB) + MOVD $1203, R12 + B runtime·callbackasm1(SB) + MOVD $1204, R12 + B runtime·callbackasm1(SB) + MOVD $1205, R12 + B runtime·callbackasm1(SB) + MOVD $1206, R12 + B runtime·callbackasm1(SB) + MOVD $1207, R12 + B runtime·callbackasm1(SB) + MOVD $1208, R12 + B runtime·callbackasm1(SB) + MOVD $1209, R12 + B runtime·callbackasm1(SB) + MOVD $1210, R12 + B runtime·callbackasm1(SB) + MOVD $1211, R12 + B runtime·callbackasm1(SB) + MOVD $1212, R12 + B runtime·callbackasm1(SB) + MOVD $1213, R12 + B runtime·callbackasm1(SB) + MOVD $1214, R12 + B runtime·callbackasm1(SB) + MOVD $1215, R12 + B runtime·callbackasm1(SB) + MOVD $1216, R12 + B runtime·callbackasm1(SB) + MOVD $1217, R12 + B runtime·callbackasm1(SB) + MOVD $1218, R12 + B runtime·callbackasm1(SB) + MOVD $1219, R12 + B runtime·callbackasm1(SB) + MOVD $1220, R12 + B runtime·callbackasm1(SB) + MOVD $1221, R12 + B runtime·callbackasm1(SB) + MOVD $1222, R12 + B runtime·callbackasm1(SB) + MOVD $1223, R12 + B runtime·callbackasm1(SB) + MOVD $1224, R12 + B runtime·callbackasm1(SB) + MOVD $1225, R12 + B runtime·callbackasm1(SB) + MOVD $1226, R12 + B runtime·callbackasm1(SB) + MOVD $1227, R12 + B runtime·callbackasm1(SB) + MOVD $1228, R12 + B runtime·callbackasm1(SB) + MOVD $1229, R12 + B runtime·callbackasm1(SB) + MOVD $1230, R12 + B runtime·callbackasm1(SB) + MOVD $1231, R12 + B runtime·callbackasm1(SB) + MOVD $1232, R12 + B runtime·callbackasm1(SB) + MOVD $1233, R12 + B runtime·callbackasm1(SB) + MOVD $1234, R12 + B runtime·callbackasm1(SB) + MOVD $1235, R12 + B runtime·callbackasm1(SB) + MOVD $1236, R12 + B runtime·callbackasm1(SB) + MOVD $1237, R12 + B runtime·callbackasm1(SB) + MOVD $1238, R12 + B runtime·callbackasm1(SB) + MOVD $1239, R12 + B runtime·callbackasm1(SB) + MOVD $1240, R12 + B runtime·callbackasm1(SB) + MOVD $1241, R12 + B 
runtime·callbackasm1(SB) + MOVD $1242, R12 + B runtime·callbackasm1(SB) + MOVD $1243, R12 + B runtime·callbackasm1(SB) + MOVD $1244, R12 + B runtime·callbackasm1(SB) + MOVD $1245, R12 + B runtime·callbackasm1(SB) + MOVD $1246, R12 + B runtime·callbackasm1(SB) + MOVD $1247, R12 + B runtime·callbackasm1(SB) + MOVD $1248, R12 + B runtime·callbackasm1(SB) + MOVD $1249, R12 + B runtime·callbackasm1(SB) + MOVD $1250, R12 + B runtime·callbackasm1(SB) + MOVD $1251, R12 + B runtime·callbackasm1(SB) + MOVD $1252, R12 + B runtime·callbackasm1(SB) + MOVD $1253, R12 + B runtime·callbackasm1(SB) + MOVD $1254, R12 + B runtime·callbackasm1(SB) + MOVD $1255, R12 + B runtime·callbackasm1(SB) + MOVD $1256, R12 + B runtime·callbackasm1(SB) + MOVD $1257, R12 + B runtime·callbackasm1(SB) + MOVD $1258, R12 + B runtime·callbackasm1(SB) + MOVD $1259, R12 + B runtime·callbackasm1(SB) + MOVD $1260, R12 + B runtime·callbackasm1(SB) + MOVD $1261, R12 + B runtime·callbackasm1(SB) + MOVD $1262, R12 + B runtime·callbackasm1(SB) + MOVD $1263, R12 + B runtime·callbackasm1(SB) + MOVD $1264, R12 + B runtime·callbackasm1(SB) + MOVD $1265, R12 + B runtime·callbackasm1(SB) + MOVD $1266, R12 + B runtime·callbackasm1(SB) + MOVD $1267, R12 + B runtime·callbackasm1(SB) + MOVD $1268, R12 + B runtime·callbackasm1(SB) + MOVD $1269, R12 + B runtime·callbackasm1(SB) + MOVD $1270, R12 + B runtime·callbackasm1(SB) + MOVD $1271, R12 + B runtime·callbackasm1(SB) + MOVD $1272, R12 + B runtime·callbackasm1(SB) + MOVD $1273, R12 + B runtime·callbackasm1(SB) + MOVD $1274, R12 + B runtime·callbackasm1(SB) + MOVD $1275, R12 + B runtime·callbackasm1(SB) + MOVD $1276, R12 + B runtime·callbackasm1(SB) + MOVD $1277, R12 + B runtime·callbackasm1(SB) + MOVD $1278, R12 + B runtime·callbackasm1(SB) + MOVD $1279, R12 + B runtime·callbackasm1(SB) + MOVD $1280, R12 + B runtime·callbackasm1(SB) + MOVD $1281, R12 + B runtime·callbackasm1(SB) + MOVD $1282, R12 + B runtime·callbackasm1(SB) + MOVD $1283, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1284, R12 + B runtime·callbackasm1(SB) + MOVD $1285, R12 + B runtime·callbackasm1(SB) + MOVD $1286, R12 + B runtime·callbackasm1(SB) + MOVD $1287, R12 + B runtime·callbackasm1(SB) + MOVD $1288, R12 + B runtime·callbackasm1(SB) + MOVD $1289, R12 + B runtime·callbackasm1(SB) + MOVD $1290, R12 + B runtime·callbackasm1(SB) + MOVD $1291, R12 + B runtime·callbackasm1(SB) + MOVD $1292, R12 + B runtime·callbackasm1(SB) + MOVD $1293, R12 + B runtime·callbackasm1(SB) + MOVD $1294, R12 + B runtime·callbackasm1(SB) + MOVD $1295, R12 + B runtime·callbackasm1(SB) + MOVD $1296, R12 + B runtime·callbackasm1(SB) + MOVD $1297, R12 + B runtime·callbackasm1(SB) + MOVD $1298, R12 + B runtime·callbackasm1(SB) + MOVD $1299, R12 + B runtime·callbackasm1(SB) + MOVD $1300, R12 + B runtime·callbackasm1(SB) + MOVD $1301, R12 + B runtime·callbackasm1(SB) + MOVD $1302, R12 + B runtime·callbackasm1(SB) + MOVD $1303, R12 + B runtime·callbackasm1(SB) + MOVD $1304, R12 + B runtime·callbackasm1(SB) + MOVD $1305, R12 + B runtime·callbackasm1(SB) + MOVD $1306, R12 + B runtime·callbackasm1(SB) + MOVD $1307, R12 + B runtime·callbackasm1(SB) + MOVD $1308, R12 + B runtime·callbackasm1(SB) + MOVD $1309, R12 + B runtime·callbackasm1(SB) + MOVD $1310, R12 + B runtime·callbackasm1(SB) + MOVD $1311, R12 + B runtime·callbackasm1(SB) + MOVD $1312, R12 + B runtime·callbackasm1(SB) + MOVD $1313, R12 + B runtime·callbackasm1(SB) + MOVD $1314, R12 + B runtime·callbackasm1(SB) + MOVD $1315, R12 + B runtime·callbackasm1(SB) + MOVD $1316, R12 + B runtime·callbackasm1(SB) + MOVD $1317, R12 + B runtime·callbackasm1(SB) + MOVD $1318, R12 + B runtime·callbackasm1(SB) + MOVD $1319, R12 + B runtime·callbackasm1(SB) + MOVD $1320, R12 + B runtime·callbackasm1(SB) + MOVD $1321, R12 + B runtime·callbackasm1(SB) + MOVD $1322, R12 + B runtime·callbackasm1(SB) + MOVD $1323, R12 + B runtime·callbackasm1(SB) + MOVD $1324, R12 + B runtime·callbackasm1(SB) + MOVD $1325, R12 + B runtime·callbackasm1(SB) + MOVD $1326, R12 + B 
runtime·callbackasm1(SB) + MOVD $1327, R12 + B runtime·callbackasm1(SB) + MOVD $1328, R12 + B runtime·callbackasm1(SB) + MOVD $1329, R12 + B runtime·callbackasm1(SB) + MOVD $1330, R12 + B runtime·callbackasm1(SB) + MOVD $1331, R12 + B runtime·callbackasm1(SB) + MOVD $1332, R12 + B runtime·callbackasm1(SB) + MOVD $1333, R12 + B runtime·callbackasm1(SB) + MOVD $1334, R12 + B runtime·callbackasm1(SB) + MOVD $1335, R12 + B runtime·callbackasm1(SB) + MOVD $1336, R12 + B runtime·callbackasm1(SB) + MOVD $1337, R12 + B runtime·callbackasm1(SB) + MOVD $1338, R12 + B runtime·callbackasm1(SB) + MOVD $1339, R12 + B runtime·callbackasm1(SB) + MOVD $1340, R12 + B runtime·callbackasm1(SB) + MOVD $1341, R12 + B runtime·callbackasm1(SB) + MOVD $1342, R12 + B runtime·callbackasm1(SB) + MOVD $1343, R12 + B runtime·callbackasm1(SB) + MOVD $1344, R12 + B runtime·callbackasm1(SB) + MOVD $1345, R12 + B runtime·callbackasm1(SB) + MOVD $1346, R12 + B runtime·callbackasm1(SB) + MOVD $1347, R12 + B runtime·callbackasm1(SB) + MOVD $1348, R12 + B runtime·callbackasm1(SB) + MOVD $1349, R12 + B runtime·callbackasm1(SB) + MOVD $1350, R12 + B runtime·callbackasm1(SB) + MOVD $1351, R12 + B runtime·callbackasm1(SB) + MOVD $1352, R12 + B runtime·callbackasm1(SB) + MOVD $1353, R12 + B runtime·callbackasm1(SB) + MOVD $1354, R12 + B runtime·callbackasm1(SB) + MOVD $1355, R12 + B runtime·callbackasm1(SB) + MOVD $1356, R12 + B runtime·callbackasm1(SB) + MOVD $1357, R12 + B runtime·callbackasm1(SB) + MOVD $1358, R12 + B runtime·callbackasm1(SB) + MOVD $1359, R12 + B runtime·callbackasm1(SB) + MOVD $1360, R12 + B runtime·callbackasm1(SB) + MOVD $1361, R12 + B runtime·callbackasm1(SB) + MOVD $1362, R12 + B runtime·callbackasm1(SB) + MOVD $1363, R12 + B runtime·callbackasm1(SB) + MOVD $1364, R12 + B runtime·callbackasm1(SB) + MOVD $1365, R12 + B runtime·callbackasm1(SB) + MOVD $1366, R12 + B runtime·callbackasm1(SB) + MOVD $1367, R12 + B runtime·callbackasm1(SB) + MOVD $1368, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1369, R12 + B runtime·callbackasm1(SB) + MOVD $1370, R12 + B runtime·callbackasm1(SB) + MOVD $1371, R12 + B runtime·callbackasm1(SB) + MOVD $1372, R12 + B runtime·callbackasm1(SB) + MOVD $1373, R12 + B runtime·callbackasm1(SB) + MOVD $1374, R12 + B runtime·callbackasm1(SB) + MOVD $1375, R12 + B runtime·callbackasm1(SB) + MOVD $1376, R12 + B runtime·callbackasm1(SB) + MOVD $1377, R12 + B runtime·callbackasm1(SB) + MOVD $1378, R12 + B runtime·callbackasm1(SB) + MOVD $1379, R12 + B runtime·callbackasm1(SB) + MOVD $1380, R12 + B runtime·callbackasm1(SB) + MOVD $1381, R12 + B runtime·callbackasm1(SB) + MOVD $1382, R12 + B runtime·callbackasm1(SB) + MOVD $1383, R12 + B runtime·callbackasm1(SB) + MOVD $1384, R12 + B runtime·callbackasm1(SB) + MOVD $1385, R12 + B runtime·callbackasm1(SB) + MOVD $1386, R12 + B runtime·callbackasm1(SB) + MOVD $1387, R12 + B runtime·callbackasm1(SB) + MOVD $1388, R12 + B runtime·callbackasm1(SB) + MOVD $1389, R12 + B runtime·callbackasm1(SB) + MOVD $1390, R12 + B runtime·callbackasm1(SB) + MOVD $1391, R12 + B runtime·callbackasm1(SB) + MOVD $1392, R12 + B runtime·callbackasm1(SB) + MOVD $1393, R12 + B runtime·callbackasm1(SB) + MOVD $1394, R12 + B runtime·callbackasm1(SB) + MOVD $1395, R12 + B runtime·callbackasm1(SB) + MOVD $1396, R12 + B runtime·callbackasm1(SB) + MOVD $1397, R12 + B runtime·callbackasm1(SB) + MOVD $1398, R12 + B runtime·callbackasm1(SB) + MOVD $1399, R12 + B runtime·callbackasm1(SB) + MOVD $1400, R12 + B runtime·callbackasm1(SB) + MOVD $1401, R12 + B runtime·callbackasm1(SB) + MOVD $1402, R12 + B runtime·callbackasm1(SB) + MOVD $1403, R12 + B runtime·callbackasm1(SB) + MOVD $1404, R12 + B runtime·callbackasm1(SB) + MOVD $1405, R12 + B runtime·callbackasm1(SB) + MOVD $1406, R12 + B runtime·callbackasm1(SB) + MOVD $1407, R12 + B runtime·callbackasm1(SB) + MOVD $1408, R12 + B runtime·callbackasm1(SB) + MOVD $1409, R12 + B runtime·callbackasm1(SB) + MOVD $1410, R12 + B runtime·callbackasm1(SB) + MOVD $1411, R12 + B 
runtime·callbackasm1(SB) + MOVD $1412, R12 + B runtime·callbackasm1(SB) + MOVD $1413, R12 + B runtime·callbackasm1(SB) + MOVD $1414, R12 + B runtime·callbackasm1(SB) + MOVD $1415, R12 + B runtime·callbackasm1(SB) + MOVD $1416, R12 + B runtime·callbackasm1(SB) + MOVD $1417, R12 + B runtime·callbackasm1(SB) + MOVD $1418, R12 + B runtime·callbackasm1(SB) + MOVD $1419, R12 + B runtime·callbackasm1(SB) + MOVD $1420, R12 + B runtime·callbackasm1(SB) + MOVD $1421, R12 + B runtime·callbackasm1(SB) + MOVD $1422, R12 + B runtime·callbackasm1(SB) + MOVD $1423, R12 + B runtime·callbackasm1(SB) + MOVD $1424, R12 + B runtime·callbackasm1(SB) + MOVD $1425, R12 + B runtime·callbackasm1(SB) + MOVD $1426, R12 + B runtime·callbackasm1(SB) + MOVD $1427, R12 + B runtime·callbackasm1(SB) + MOVD $1428, R12 + B runtime·callbackasm1(SB) + MOVD $1429, R12 + B runtime·callbackasm1(SB) + MOVD $1430, R12 + B runtime·callbackasm1(SB) + MOVD $1431, R12 + B runtime·callbackasm1(SB) + MOVD $1432, R12 + B runtime·callbackasm1(SB) + MOVD $1433, R12 + B runtime·callbackasm1(SB) + MOVD $1434, R12 + B runtime·callbackasm1(SB) + MOVD $1435, R12 + B runtime·callbackasm1(SB) + MOVD $1436, R12 + B runtime·callbackasm1(SB) + MOVD $1437, R12 + B runtime·callbackasm1(SB) + MOVD $1438, R12 + B runtime·callbackasm1(SB) + MOVD $1439, R12 + B runtime·callbackasm1(SB) + MOVD $1440, R12 + B runtime·callbackasm1(SB) + MOVD $1441, R12 + B runtime·callbackasm1(SB) + MOVD $1442, R12 + B runtime·callbackasm1(SB) + MOVD $1443, R12 + B runtime·callbackasm1(SB) + MOVD $1444, R12 + B runtime·callbackasm1(SB) + MOVD $1445, R12 + B runtime·callbackasm1(SB) + MOVD $1446, R12 + B runtime·callbackasm1(SB) + MOVD $1447, R12 + B runtime·callbackasm1(SB) + MOVD $1448, R12 + B runtime·callbackasm1(SB) + MOVD $1449, R12 + B runtime·callbackasm1(SB) + MOVD $1450, R12 + B runtime·callbackasm1(SB) + MOVD $1451, R12 + B runtime·callbackasm1(SB) + MOVD $1452, R12 + B runtime·callbackasm1(SB) + MOVD $1453, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1454, R12 + B runtime·callbackasm1(SB) + MOVD $1455, R12 + B runtime·callbackasm1(SB) + MOVD $1456, R12 + B runtime·callbackasm1(SB) + MOVD $1457, R12 + B runtime·callbackasm1(SB) + MOVD $1458, R12 + B runtime·callbackasm1(SB) + MOVD $1459, R12 + B runtime·callbackasm1(SB) + MOVD $1460, R12 + B runtime·callbackasm1(SB) + MOVD $1461, R12 + B runtime·callbackasm1(SB) + MOVD $1462, R12 + B runtime·callbackasm1(SB) + MOVD $1463, R12 + B runtime·callbackasm1(SB) + MOVD $1464, R12 + B runtime·callbackasm1(SB) + MOVD $1465, R12 + B runtime·callbackasm1(SB) + MOVD $1466, R12 + B runtime·callbackasm1(SB) + MOVD $1467, R12 + B runtime·callbackasm1(SB) + MOVD $1468, R12 + B runtime·callbackasm1(SB) + MOVD $1469, R12 + B runtime·callbackasm1(SB) + MOVD $1470, R12 + B runtime·callbackasm1(SB) + MOVD $1471, R12 + B runtime·callbackasm1(SB) + MOVD $1472, R12 + B runtime·callbackasm1(SB) + MOVD $1473, R12 + B runtime·callbackasm1(SB) + MOVD $1474, R12 + B runtime·callbackasm1(SB) + MOVD $1475, R12 + B runtime·callbackasm1(SB) + MOVD $1476, R12 + B runtime·callbackasm1(SB) + MOVD $1477, R12 + B runtime·callbackasm1(SB) + MOVD $1478, R12 + B runtime·callbackasm1(SB) + MOVD $1479, R12 + B runtime·callbackasm1(SB) + MOVD $1480, R12 + B runtime·callbackasm1(SB) + MOVD $1481, R12 + B runtime·callbackasm1(SB) + MOVD $1482, R12 + B runtime·callbackasm1(SB) + MOVD $1483, R12 + B runtime·callbackasm1(SB) + MOVD $1484, R12 + B runtime·callbackasm1(SB) + MOVD $1485, R12 + B runtime·callbackasm1(SB) + MOVD $1486, R12 + B runtime·callbackasm1(SB) + MOVD $1487, R12 + B runtime·callbackasm1(SB) + MOVD $1488, R12 + B runtime·callbackasm1(SB) + MOVD $1489, R12 + B runtime·callbackasm1(SB) + MOVD $1490, R12 + B runtime·callbackasm1(SB) + MOVD $1491, R12 + B runtime·callbackasm1(SB) + MOVD $1492, R12 + B runtime·callbackasm1(SB) + MOVD $1493, R12 + B runtime·callbackasm1(SB) + MOVD $1494, R12 + B runtime·callbackasm1(SB) + MOVD $1495, R12 + B runtime·callbackasm1(SB) + MOVD $1496, R12 + B 
runtime·callbackasm1(SB) + MOVD $1497, R12 + B runtime·callbackasm1(SB) + MOVD $1498, R12 + B runtime·callbackasm1(SB) + MOVD $1499, R12 + B runtime·callbackasm1(SB) + MOVD $1500, R12 + B runtime·callbackasm1(SB) + MOVD $1501, R12 + B runtime·callbackasm1(SB) + MOVD $1502, R12 + B runtime·callbackasm1(SB) + MOVD $1503, R12 + B runtime·callbackasm1(SB) + MOVD $1504, R12 + B runtime·callbackasm1(SB) + MOVD $1505, R12 + B runtime·callbackasm1(SB) + MOVD $1506, R12 + B runtime·callbackasm1(SB) + MOVD $1507, R12 + B runtime·callbackasm1(SB) + MOVD $1508, R12 + B runtime·callbackasm1(SB) + MOVD $1509, R12 + B runtime·callbackasm1(SB) + MOVD $1510, R12 + B runtime·callbackasm1(SB) + MOVD $1511, R12 + B runtime·callbackasm1(SB) + MOVD $1512, R12 + B runtime·callbackasm1(SB) + MOVD $1513, R12 + B runtime·callbackasm1(SB) + MOVD $1514, R12 + B runtime·callbackasm1(SB) + MOVD $1515, R12 + B runtime·callbackasm1(SB) + MOVD $1516, R12 + B runtime·callbackasm1(SB) + MOVD $1517, R12 + B runtime·callbackasm1(SB) + MOVD $1518, R12 + B runtime·callbackasm1(SB) + MOVD $1519, R12 + B runtime·callbackasm1(SB) + MOVD $1520, R12 + B runtime·callbackasm1(SB) + MOVD $1521, R12 + B runtime·callbackasm1(SB) + MOVD $1522, R12 + B runtime·callbackasm1(SB) + MOVD $1523, R12 + B runtime·callbackasm1(SB) + MOVD $1524, R12 + B runtime·callbackasm1(SB) + MOVD $1525, R12 + B runtime·callbackasm1(SB) + MOVD $1526, R12 + B runtime·callbackasm1(SB) + MOVD $1527, R12 + B runtime·callbackasm1(SB) + MOVD $1528, R12 + B runtime·callbackasm1(SB) + MOVD $1529, R12 + B runtime·callbackasm1(SB) + MOVD $1530, R12 + B runtime·callbackasm1(SB) + MOVD $1531, R12 + B runtime·callbackasm1(SB) + MOVD $1532, R12 + B runtime·callbackasm1(SB) + MOVD $1533, R12 + B runtime·callbackasm1(SB) + MOVD $1534, R12 + B runtime·callbackasm1(SB) + MOVD $1535, R12 + B runtime·callbackasm1(SB) + MOVD $1536, R12 + B runtime·callbackasm1(SB) + MOVD $1537, R12 + B runtime·callbackasm1(SB) + MOVD $1538, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1539, R12 + B runtime·callbackasm1(SB) + MOVD $1540, R12 + B runtime·callbackasm1(SB) + MOVD $1541, R12 + B runtime·callbackasm1(SB) + MOVD $1542, R12 + B runtime·callbackasm1(SB) + MOVD $1543, R12 + B runtime·callbackasm1(SB) + MOVD $1544, R12 + B runtime·callbackasm1(SB) + MOVD $1545, R12 + B runtime·callbackasm1(SB) + MOVD $1546, R12 + B runtime·callbackasm1(SB) + MOVD $1547, R12 + B runtime·callbackasm1(SB) + MOVD $1548, R12 + B runtime·callbackasm1(SB) + MOVD $1549, R12 + B runtime·callbackasm1(SB) + MOVD $1550, R12 + B runtime·callbackasm1(SB) + MOVD $1551, R12 + B runtime·callbackasm1(SB) + MOVD $1552, R12 + B runtime·callbackasm1(SB) + MOVD $1553, R12 + B runtime·callbackasm1(SB) + MOVD $1554, R12 + B runtime·callbackasm1(SB) + MOVD $1555, R12 + B runtime·callbackasm1(SB) + MOVD $1556, R12 + B runtime·callbackasm1(SB) + MOVD $1557, R12 + B runtime·callbackasm1(SB) + MOVD $1558, R12 + B runtime·callbackasm1(SB) + MOVD $1559, R12 + B runtime·callbackasm1(SB) + MOVD $1560, R12 + B runtime·callbackasm1(SB) + MOVD $1561, R12 + B runtime·callbackasm1(SB) + MOVD $1562, R12 + B runtime·callbackasm1(SB) + MOVD $1563, R12 + B runtime·callbackasm1(SB) + MOVD $1564, R12 + B runtime·callbackasm1(SB) + MOVD $1565, R12 + B runtime·callbackasm1(SB) + MOVD $1566, R12 + B runtime·callbackasm1(SB) + MOVD $1567, R12 + B runtime·callbackasm1(SB) + MOVD $1568, R12 + B runtime·callbackasm1(SB) + MOVD $1569, R12 + B runtime·callbackasm1(SB) + MOVD $1570, R12 + B runtime·callbackasm1(SB) + MOVD $1571, R12 + B runtime·callbackasm1(SB) + MOVD $1572, R12 + B runtime·callbackasm1(SB) + MOVD $1573, R12 + B runtime·callbackasm1(SB) + MOVD $1574, R12 + B runtime·callbackasm1(SB) + MOVD $1575, R12 + B runtime·callbackasm1(SB) + MOVD $1576, R12 + B runtime·callbackasm1(SB) + MOVD $1577, R12 + B runtime·callbackasm1(SB) + MOVD $1578, R12 + B runtime·callbackasm1(SB) + MOVD $1579, R12 + B runtime·callbackasm1(SB) + MOVD $1580, R12 + B runtime·callbackasm1(SB) + MOVD $1581, R12 + B 
runtime·callbackasm1(SB) + MOVD $1582, R12 + B runtime·callbackasm1(SB) + MOVD $1583, R12 + B runtime·callbackasm1(SB) + MOVD $1584, R12 + B runtime·callbackasm1(SB) + MOVD $1585, R12 + B runtime·callbackasm1(SB) + MOVD $1586, R12 + B runtime·callbackasm1(SB) + MOVD $1587, R12 + B runtime·callbackasm1(SB) + MOVD $1588, R12 + B runtime·callbackasm1(SB) + MOVD $1589, R12 + B runtime·callbackasm1(SB) + MOVD $1590, R12 + B runtime·callbackasm1(SB) + MOVD $1591, R12 + B runtime·callbackasm1(SB) + MOVD $1592, R12 + B runtime·callbackasm1(SB) + MOVD $1593, R12 + B runtime·callbackasm1(SB) + MOVD $1594, R12 + B runtime·callbackasm1(SB) + MOVD $1595, R12 + B runtime·callbackasm1(SB) + MOVD $1596, R12 + B runtime·callbackasm1(SB) + MOVD $1597, R12 + B runtime·callbackasm1(SB) + MOVD $1598, R12 + B runtime·callbackasm1(SB) + MOVD $1599, R12 + B runtime·callbackasm1(SB) + MOVD $1600, R12 + B runtime·callbackasm1(SB) + MOVD $1601, R12 + B runtime·callbackasm1(SB) + MOVD $1602, R12 + B runtime·callbackasm1(SB) + MOVD $1603, R12 + B runtime·callbackasm1(SB) + MOVD $1604, R12 + B runtime·callbackasm1(SB) + MOVD $1605, R12 + B runtime·callbackasm1(SB) + MOVD $1606, R12 + B runtime·callbackasm1(SB) + MOVD $1607, R12 + B runtime·callbackasm1(SB) + MOVD $1608, R12 + B runtime·callbackasm1(SB) + MOVD $1609, R12 + B runtime·callbackasm1(SB) + MOVD $1610, R12 + B runtime·callbackasm1(SB) + MOVD $1611, R12 + B runtime·callbackasm1(SB) + MOVD $1612, R12 + B runtime·callbackasm1(SB) + MOVD $1613, R12 + B runtime·callbackasm1(SB) + MOVD $1614, R12 + B runtime·callbackasm1(SB) + MOVD $1615, R12 + B runtime·callbackasm1(SB) + MOVD $1616, R12 + B runtime·callbackasm1(SB) + MOVD $1617, R12 + B runtime·callbackasm1(SB) + MOVD $1618, R12 + B runtime·callbackasm1(SB) + MOVD $1619, R12 + B runtime·callbackasm1(SB) + MOVD $1620, R12 + B runtime·callbackasm1(SB) + MOVD $1621, R12 + B runtime·callbackasm1(SB) + MOVD $1622, R12 + B runtime·callbackasm1(SB) + MOVD $1623, R12 + B runtime·callbackasm1(SB) 
+ MOVD $1624, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1625, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1626, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1627, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1628, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1629, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1630, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1631, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1632, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1633, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1634, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1635, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1636, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1637, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1638, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1639, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1640, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1641, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1642, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1643, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1644, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1645, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1646, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1647, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1648, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1649, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1650, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1651, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1652, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1653, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1654, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1655, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1656, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1657, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1658, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1659, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1660, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1661, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1662, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1663, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1664, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1665, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1666, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1667, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1668, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1669, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1670, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1671, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1672, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1673, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1674, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1675, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1676, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1677, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1678, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1679, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1680, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1681, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1682, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1683, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1684, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1685, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1686, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1687, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1688, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1689, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1690, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1691, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1692, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1693, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1694, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1695, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1696, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1697, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1698, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1699, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1700, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1701, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1702, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1703, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1704, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1705, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1706, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1707, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1708, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1709, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1710, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1711, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1712, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1713, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1714, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1715, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1716, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1717, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1718, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1719, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1720, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1721, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1722, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1723, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1724, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1725, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1726, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1727, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1728, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1729, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1730, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1731, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1732, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1733, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1734, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1735, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1736, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1737, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1738, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1739, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1740, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1741, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1742, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1743, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1744, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1745, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1746, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1747, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1748, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1749, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1750, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1751, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1752, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1753, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1754, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1755, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1756, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1757, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1758, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1759, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1760, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1761, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1762, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1763, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1764, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1765, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1766, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1767, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1768, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1769, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1770, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1771, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1772, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1773, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1774, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1775, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1776, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1777, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1778, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1779, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1780, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1781, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1782, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1783, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1784, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1785, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1786, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1787, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1788, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1789, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1790, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1791, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1792, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1793, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1794, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1795, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1796, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1797, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1798, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1799, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1800, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1801, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1802, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1803, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1804, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1805, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1806, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1807, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1808, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1809, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1810, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1811, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1812, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1813, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1814, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1815, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1816, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1817, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1818, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1819, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1820, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1821, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1822, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1823, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1824, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1825, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1826, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1827, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1828, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1829, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1830, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1831, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1832, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1833, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1834, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1835, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1836, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1837, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1838, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1839, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1840, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1841, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1842, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1843, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1844, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1845, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1846, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1847, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1848, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1849, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1850, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1851, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1852, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1853, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1854, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1855, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1856, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1857, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1858, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1859, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1860, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1861, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1862, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1863, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1864, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1865, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1866, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1867, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1868, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1869, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1870, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1871, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1872, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1873, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1874, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1875, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1876, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1877, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1878, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1879, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1880, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1881, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1882, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1883, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1884, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1885, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1886, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1887, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1888, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1889, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1890, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1891, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1892, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1893, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1894, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1895, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1896, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1897, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1898, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1899, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1900, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1901, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1902, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1903, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1904, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1905, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1906, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1907, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1908, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1909, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1910, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1911, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1912, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1913, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1914, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1915, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1916, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1917, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1918, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1919, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1920, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1921, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1922, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1923, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1924, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1925, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1926, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1927, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1928, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1929, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1930, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1931, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1932, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1933, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1934, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1935, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1936, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1937, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1938, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1939, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1940, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1941, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1942, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1943, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1944, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1945, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1946, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1947, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1948, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1949, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1950, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1951, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1952, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1953, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1954, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1955, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1956, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1957, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1958, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1959, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1960, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1961, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1962, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1963, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1964, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1965, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1966, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1967, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1968, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1969, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1970, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1971, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1972, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1973, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1974, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1975, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1976, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1977, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1978, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1979, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1980, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1981, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1982, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1983, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1984, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1985, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1986, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1987, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1988, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1989, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1990, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1991, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1992, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1993, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1994, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1995, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1996, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1997, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1998, R12
+ B runtime·callbackasm1(SB)
+ MOVD $1999, R12
+ B runtime·callbackasm1(SB) |