summaryrefslogtreecommitdiffstats
path: root/src/cmd/compile/internal/ssa/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'src/cmd/compile/internal/ssa/README.md')
-rw-r--r--src/cmd/compile/internal/ssa/README.md222
1 files changed, 222 insertions, 0 deletions
diff --git a/src/cmd/compile/internal/ssa/README.md b/src/cmd/compile/internal/ssa/README.md
new file mode 100644
index 0000000..27ac02b
--- /dev/null
+++ b/src/cmd/compile/internal/ssa/README.md
@@ -0,0 +1,222 @@
+<!---
+// Copyright 2018 The Go Authors. All rights reserved.
+// Use of this source code is governed by a BSD-style
+// license that can be found in the LICENSE file.
+-->
+
+## Introduction to the Go compiler's SSA backend
+
+This package contains the compiler's Static Single Assignment form component. If
+you're not familiar with SSA, its [Wikipedia
+article](https://en.wikipedia.org/wiki/Static_single_assignment_form) is a good
+starting point.
+
+It is recommended that you first read [cmd/compile/README.md](../../README.md)
+if you are not familiar with the Go compiler already. That document gives an
+overview of the compiler, and explains what is SSA's part and purpose in it.
+
+### Key concepts
+
+The names described below may be loosely related to their Go counterparts, but
+note that they are not equivalent. For example, a Go block statement has a
+variable scope, yet SSA has no notion of variables nor variable scopes.
+
+It may also be surprising that values and blocks are named after their unique
+sequential IDs. They rarely correspond to named entities in the original code,
+such as variables or function parameters. The sequential IDs also allow the
+compiler to avoid maps, and it is always possible to track back the values to Go
+code using debug and position information.
+
+#### Values
+
+Values are the basic building blocks of SSA. Per SSA's very definition, a
+value is defined exactly once, but it may be used any number of times. A value
+mainly consists of a unique identifier, an operator, a type, and some arguments.
+
+An operator or `Op` describes the operation that computes the value. The
+semantics of each operator can be found in `_gen/*Ops.go`. For example, `OpAdd8`
+takes two value arguments holding 8-bit integers and results in their addition.
+Here is a possible SSA representation of the addition of two `uint8` values:
+
+ // var c uint8 = a + b
+ v4 = Add8 <uint8> v2 v3
+
+A value's type will usually be a Go type. For example, the value in the example
+above has a `uint8` type, and a constant boolean value will have a `bool` type.
+However, certain types don't come from Go and are special; below we will cover
+`memory`, the most common of them.
+
+See [value.go](value.go) for more information.
+
+#### Memory types
+
+`memory` represents the global memory state. An `Op` that takes a memory
+argument depends on that memory state, and an `Op` which has the memory type
+impacts the state of memory. This ensures that memory operations are kept in the
+right order. For example:
+
+ // *a = 3
+ // *b = *a
+ v10 = Store <mem> {int} v6 v8 v1
+ v14 = Store <mem> {int} v7 v8 v10
+
+Here, `Store` stores its second argument (of type `int`) into the first argument
+(of type `*int`). The last argument is the memory state; since the second store
+depends on the memory value defined by the first store, the two stores cannot be
+reordered.
+
+See [cmd/compile/internal/types/type.go](../types/type.go) for more information.
+
+#### Blocks
+
+A block represents a basic block in the control flow graph of a function. It is,
+essentially, a list of values that define the operation of this block. Besides
+the list of values, blocks mainly consist of a unique identifier, a kind, and a
+list of successor blocks.
+
+The simplest kind is a `plain` block; it simply hands the control flow to
+another block, thus its successors list contains one block.
+
+Another common block kind is the `exit` block. These have a final value, called
+control value, which must return a memory state. This is necessary for functions
+to return some values, for example - the caller needs some memory state to
+depend on, to ensure that it receives those return values correctly.
+
+The last important block kind we will mention is the `if` block. It has a single
+control value that must be a boolean value, and it has exactly two successor
+blocks. The control flow is handed to the first successor if the bool is true,
+and to the second otherwise.
+
+Here is a sample if-else control flow represented with basic blocks:
+
+ // func(b bool) int {
+ // if b {
+ // return 2
+ // }
+ // return 3
+ // }
+ b1:
+ v1 = InitMem <mem>
+ v2 = SP <uintptr>
+ v5 = Addr <*int> {~r1} v2
+ v6 = Arg <bool> {b}
+ v8 = Const64 <int> [2]
+ v12 = Const64 <int> [3]
+ If v6 -> b2 b3
+ b2: <- b1
+ v10 = VarDef <mem> {~r1} v1
+ v11 = Store <mem> {int} v5 v8 v10
+ Ret v11
+ b3: <- b1
+ v14 = VarDef <mem> {~r1} v1
+ v15 = Store <mem> {int} v5 v12 v14
+ Ret v15
+
+<!---
+TODO: can we come up with a shorter example that still shows the control flow?
+-->
+
+See [block.go](block.go) for more information.
+
+#### Functions
+
+A function represents a function declaration along with its body. It mainly
+consists of a name, a type (its signature), a list of blocks that form its body,
+and the entry block within said list.
+
+When a function is called, the control flow is handed to its entry block. If the
+function terminates, the control flow will eventually reach an exit block, thus
+ending the function call.
+
+Note that a function may have zero or multiple exit blocks, just like a Go
+function can have any number of return points, but it must have exactly one
+entry point block.
+
+Also note that some SSA functions are autogenerated, such as the hash functions
+for each type used as a map key.
+
+For example, this is what an empty function can look like in SSA, with a single
+exit block that returns an uninteresting memory state:
+
+ foo func()
+ b1:
+ v1 = InitMem <mem>
+ Ret v1
+
+See [func.go](func.go) for more information.
+
+### Compiler passes
+
+Having a program in SSA form is not very useful on its own. Its advantage lies
+in how easy it is to write optimizations that modify the program to make it
+better. The way the Go compiler accomplishes this is via a list of passes.
+
+Each pass transforms a SSA function in some way. For example, a dead code
+elimination pass will remove blocks and values that it can prove will never be
+executed, and a nil check elimination pass will remove nil checks which it can
+prove to be redundant.
+
+Compiler passes work on one function at a time, and by default run sequentially
+and exactly once.
+
+The `lower` pass is special; it converts the SSA representation from being
+machine-independent to being machine-dependent. That is, some abstract operators
+are replaced with their non-generic counterparts, potentially reducing or
+increasing the final number of values.
+
+<!---
+TODO: Probably explain here why the ordering of the passes matters, and why some
+passes like deadstore have multiple variants at different stages.
+-->
+
+See the `passes` list defined in [compile.go](compile.go) for more information.
+
+### Playing with SSA
+
+A good way to see and get used to the compiler's SSA in action is via
+`GOSSAFUNC`. For example, to see func `Foo`'s initial SSA form and final
+generated assembly, one can run:
+
+ GOSSAFUNC=Foo go build
+
+The generated `ssa.html` file will also contain the SSA func at each of the
+compile passes, making it easy to see what each pass does to a particular
+program. You can also click on values and blocks to highlight them, to help
+follow the control flow and values.
+
+The value specified in GOSSAFUNC can also be a package-qualified function
+name, e.g.
+
+ GOSSAFUNC=blah.Foo go build
+
+This will match any function named "Foo" within a package whose final
+suffix is "blah" (e.g. something/blah.Foo, anotherthing/extra/blah.Foo).
+
+If non-HTML dumps are needed, append a "+" to the GOSSAFUNC value
+and dumps will be written to stdout:
+
+ GOSSAFUNC=Bar+ go build
+
+<!---
+TODO: need more ideas for this section
+-->
+
+### Hacking on SSA
+
+While most compiler passes are implemented directly in Go code, some others are
+code generated. This is currently done via rewrite rules, which have their own
+syntax and are maintained in `_gen/*.rules`. Simpler optimizations can be written
+easily and quickly this way, but rewrite rules are not suitable for more complex
+optimizations.
+
+To read more on rewrite rules, have a look at the top comments in
+[_gen/generic.rules](_gen/generic.rules) and [_gen/rulegen.go](_gen/rulegen.go).
+
+Similarly, the code to manage operators is also code generated from
+`_gen/*Ops.go`, as it is easier to maintain a few tables than a lot of code.
+After changing the rules or operators, see [_gen/README](_gen/README) for
+instructions on how to generate the Go code again.
+
+<!---
+TODO: more tips and info could likely go here
+-->