Diffstat (limited to 'third_party/rust/cranelift-codegen/src/machinst/buffer.rs')
-rw-r--r--  third_party/rust/cranelift-codegen/src/machinst/buffer.rs  1813
1 file changed, 1813 insertions, 0 deletions
diff --git a/third_party/rust/cranelift-codegen/src/machinst/buffer.rs b/third_party/rust/cranelift-codegen/src/machinst/buffer.rs
new file mode 100644
index 0000000000..b2187a9b68
--- /dev/null
+++ b/third_party/rust/cranelift-codegen/src/machinst/buffer.rs
@@ -0,0 +1,1813 @@
+//! In-memory representation of compiled machine code, with labels and fixups to
+//! refer to those labels. Handles constant-pool island insertion and also
+//! veneer insertion for out-of-range jumps.
+//!
+//! This code exists to solve three problems:
+//!
+//! - Branch targets for forward branches are not known until later, when we
+//! emit code in a single pass through the instruction structs.
+//!
+//! - On many architectures, address references or offsets have limited range.
+//! For example, on AArch64, conditional branches can only target code +/- 1MB
+//! from the branch itself.
+//!
+//! - The lowering of control flow from the CFG-with-edges produced by
+//! [BlockLoweringOrder](super::BlockLoweringOrder), combined with many empty
+//! edge blocks when the register allocator does not need to insert any
+//! spills/reloads/moves in edge blocks, results in many suboptimal branch
+//! patterns. The lowering also pays no attention to block order, and so
+//! two-target conditional forms (cond-br followed by uncond-br) can often be
+//! avoided because one of the targets is the fallthrough. There are several
+//! cases here where we can simplify to use fewer branches.
+//!
+//! This "buffer" implements a single-pass code emission strategy (with a later
+//! "fixup" pass, but only through recorded fixups, not all instructions). The
+//! basic idea is:
+//!
+//! - Emit branches as they are, including two-target (cond/uncond) compound
+//! forms, but with zero offsets and optimistically assuming the target will be
+//! in range. Record the "fixup" for later. Targets are denoted instead by
+//! symbolic "labels" that are then bound to certain offsets in the buffer as
+//! we emit code. (Nominally, there is a label at the start of every basic
+//! block.)
+//!
+//! - As we do this, track the offset in the buffer at which the first label
+//! reference "goes out of range". We call this the "deadline". If we reach the
+//! deadline and we still have not bound the label to which an unresolved branch
+//! refers, we have a problem!
+//!
+//! - To solve this problem, we emit "islands" full of "veneers". An island is
+//! simply a chunk of code inserted in the middle of the code actually produced
+//! by the emitter (e.g., vcode iterating over instruction structs). The emitter
+//! has some awareness of this: it either asks for an island between blocks, so
+//! it is not accidentally executed, or else it emits a branch around the island
+//! when all other options fail (see `Inst::EmitIsland` meta-instruction).
+//!
+//! - A "veneer" is an instruction (or sequence of instructions) in an "island"
+//! that implements a longer-range reference to a label. The idea is that, for
+//! example, a branch with a limited range can branch to a "veneer" instead,
+//! which is simply a branch in a form that can use a longer-range reference. On
+//! AArch64, for example, conditionals have a +/- 1 MB range, but a conditional
+//! can branch to an unconditional branch which has a +/- 128 MB range. Hence, a
+//! conditional branch's label reference can be fixed up with a "veneer" to
+//! achieve a longer range.
+//!
+//! - To implement all of this, we require the backend to provide a `LabelUse`
+//! type that implements a trait. This is nominally an enum that records one of
+//! several kinds of references to an offset in code -- basically, a relocation
+//! type -- and will usually correspond to different instruction formats. The
+//! `LabelUse` implementation specifies the maximum range, how to patch in the
+//! actual label location when known, and how to generate a veneer to extend the
+//! range.
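+//!
+//! For illustration only, a backend's `LabelUse` type is nominally an enum
+//! shaped like the following (the variant names here are hypothetical, not
+//! any particular backend's):
+//!
+//! ```ignore
+//! enum LabelUse {
+//!     /// 19-bit PC-relative field, e.g. a conditional branch (+/- 1 MB).
+//!     Branch19,
+//!     /// 26-bit PC-relative field, e.g. an unconditional branch (+/- 128 MB).
+//!     Branch26,
+//! }
+//!
+//! // The `MachInstLabelUse` impl for this type reports its ranges
+//! // (max_pos_range()/max_neg_range()), patches the offset bits into the
+//! // instruction word once the label is known, and can generate a veneer
+//! // (e.g. Branch19 -> an unconditional branch) to extend the range.
+//! ```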
+//!
+//! That satisfies label references, but we still may have suboptimal branch
+//! patterns. To clean up the branches, we do a simple "peephole"-style
+//! optimization on the fly. To do so, the emitter (e.g., `Inst::emit()`)
+//! informs the buffer of branches in the code and, in the case of conditionals,
+//! the code that would have been emitted to invert this branch's condition. We
+//! track the "latest branches": these are branches that are contiguous up to
+//! the current offset. (If any code is emitted after a branch, that branch or
+//! run of contiguous branches is no longer "latest".) The latest branches are
+//! those that we can edit by simply truncating the buffer and doing something
+//! else instead.
+//!
+//! To optimize branches, we implement several simple rules, and try to apply
+//! them to the "latest branches" when possible:
+//!
+//! - A branch with a label target, when that label is bound to the ending
+//! offset of the branch (the fallthrough location), can be removed altogether,
+//! because the branch would have no effect.
+//!
+//! - An unconditional branch that starts at a label location, and branches to
+//! another label, results in a "label alias": all references to the label bound
+//! *to* this branch instruction are instead resolved to the *target* of the
+//! branch instruction. This effectively removes empty blocks that just
+//! unconditionally branch to the next block. We call this "branch threading".
+//!
+//! - A conditional followed by an unconditional, when the conditional branches
+//! to the unconditional's fallthrough, results in (i) the truncation of the
+//! unconditional, (ii) the inversion of the condition's condition, and (iii)
+//! replacement of the conditional's target (using the original target of the
+//! unconditional). This is a fancy way of saying "we can flip a two-target
+//! conditional branch's taken/not-taken targets if it works better with our
+//! fallthrough". To make this work, the emitter actually gives the buffer
+//! *both* forms of every conditional branch: the true form is emitted into the
+//! buffer, and the "inverted" machine-code bytes are provided as part of the
+//! branch-fixup metadata.
+//!
+//! - An unconditional B preceded by another unconditional P, when B's label(s) have
+//! been redirected to target(B), can be removed entirely. This is an extension
+//! of the branch-threading optimization, and is valid because if we know there
+//! will be no fallthrough into this branch instruction (the prior instruction
+//! is an unconditional jump), and if we know we have successfully redirected
+//! all labels, then this branch instruction is unreachable. Note that this
+//! works because the redirection happens before the label is ever resolved
+//! (fixups happen at island emission time, at which point latest-branches are
+//! cleared, or at the end of emission), so we are sure to catch and redirect
+//! all possible paths to this instruction.
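+//!
+//! As a small hypothetical illustration of the effect of these rules, the input
+//!
+//! ```text
+//! block0:
+//!   b block1
+//! block1:
+//!   b block2
+//! block2:
+//!   ret
+//! ```
+//!
+//! collapses to a single `ret`, with all three block labels resolving to its
+//! offset (the unit tests at the bottom of this file exercise these cases on
+//! real AArch64 encodings).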
+//!
+//! # Branch-optimization Correctness
+//!
+//! The branch-optimization mechanism depends on a few data structures with
+//! invariants, which are always held outside the scope of top-level public
+//! methods:
+//!
+//! - The latest-branches list. Each entry describes a span of the buffer
+//! (start/end offsets), the label target, the corresponding fixup-list entry
+//! index, and the bytes (must be the same length) for the inverted form, if
+//! conditional. The list of labels that are bound to the start-offset of this
+//! branch is *complete* (if any label has a resolved offset equal to `start`
+//! and is not an alias, it must appear in this list) and *precise* (no label
+//! in this list can be bound to another offset). No label in this list should
+//! be an alias. No two branch ranges can overlap, and branches are in
+//! ascending-offset order.
+//!
+//! - The labels-at-tail list. This contains all MachLabels that have been bound
+//! to (whose resolved offsets are equal to) the tail offset of the buffer.
+//! No label in this list should be an alias.
+//!
+//! - The label_offsets array, containing the bound offset of a label or
+//! UNKNOWN. No label can be bound at an offset greater than the current
+//! buffer tail.
+//!
+//! - The label_aliases array, containing another label to which a label is
+//! bound or UNKNOWN. A label's resolved offset is the resolved offset
+//! of the label it is aliased to, if this is set.
+//!
+//! We argue below, at each method, how the invariants in these data structures
+//! are maintained (grep for "Post-invariant").
+//!
+//! Given these invariants, we argue why each optimization preserves execution
+//! semantics below (grep for "Preserves execution semantics").
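+//!
+//! # Example (sketch)
+//!
+//! A rough illustration of the intended API flow, for some backend
+//! instruction type `I` (a sketch, not verbatim usage from any backend):
+//!
+//! ```ignore
+//! let mut buf = MachBuffer::<I>::new();
+//! buf.reserve_labels_for_blocks(2);          // one label per block
+//! buf.bind_label(MachLabel::from_block(0));
+//! // ... emit block 0's instructions, calling use_label_at_offset() for any
+//! // branch that refers to a not-yet-bound label ...
+//! buf.bind_label(MachLabel::from_block(1));
+//! // ... emit block 1's instructions ...
+//! let finalized = buf.finish();              // performs all remaining fixups
+//! ```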
+
+use crate::binemit::{Addend, CodeOffset, CodeSink, Reloc, StackMap};
+use crate::ir::{ExternalName, Opcode, SourceLoc, TrapCode};
+use crate::machinst::{BlockIndex, MachInstLabelUse, VCodeConstant, VCodeConstants, VCodeInst};
+use crate::timing;
+use cranelift_entity::{entity_impl, SecondaryMap};
+
+use log::trace;
+use smallvec::SmallVec;
+use std::mem;
+use std::string::String;
+
+/// A buffer of output to be produced, fixed up, and then emitted to a CodeSink
+/// in bulk.
+///
+/// This struct uses `SmallVec`s to support small-ish function bodies without
+/// any heap allocation. As such, it will be several kilobytes large. This is
+/// likely fine as long as it is stack-allocated for function emission then
+/// thrown away; but beware if many buffer objects are retained persistently.
+pub struct MachBuffer<I: VCodeInst> {
+ /// The buffer contents, as raw bytes.
+ data: SmallVec<[u8; 1024]>,
+ /// Any relocations referring to this code. Note that only *external*
+ /// relocations are tracked here; references to labels within the buffer are
+ /// resolved before emission.
+ relocs: SmallVec<[MachReloc; 16]>,
+ /// Any trap records referring to this code.
+ traps: SmallVec<[MachTrap; 16]>,
+ /// Any call site records referring to this code.
+ call_sites: SmallVec<[MachCallSite; 16]>,
+ /// Any source location mappings referring to this code.
+ srclocs: SmallVec<[MachSrcLoc; 64]>,
+ /// Any stack maps referring to this code.
+ stack_maps: SmallVec<[MachStackMap; 8]>,
+ /// The current source location in progress (after `start_srcloc()` and
+ /// before `end_srcloc()`). This is a (start_offset, src_loc) tuple.
+ cur_srcloc: Option<(CodeOffset, SourceLoc)>,
+ /// Known label offsets; `UNKNOWN_LABEL_OFFSET` if unknown.
+ label_offsets: SmallVec<[CodeOffset; 16]>,
+ /// Label aliases: when one label points to an unconditional jump, and that
+ /// jump points to another label, we can redirect references to the first
+ /// label immediately to the second.
+ ///
+ /// Invariant: we don't have label-alias cycles. We ensure this by,
+ /// before setting label A to alias label B, resolving B's alias
+ /// target (iteratively until a non-aliased label); if B is already
+ /// aliased to A, then we cannot alias A back to B.
+ label_aliases: SmallVec<[MachLabel; 16]>,
+ /// Constants that must be emitted at some point.
+ pending_constants: SmallVec<[MachLabelConstant; 16]>,
+ /// Fixups that must be performed after all code is emitted.
+ fixup_records: SmallVec<[MachLabelFixup<I>; 16]>,
+ /// Current deadline at which all constants are flushed and all code labels
+ /// are extended by emitting long-range jumps in an island. This flush
+ /// should be rare (e.g., on AArch64, the shortest-range PC-rel references
+ /// are +/- 1MB for conditional jumps and load-literal instructions), so
+ /// it's acceptable to track a minimum and flush-all rather than doing more
+ /// detailed "current minimum" / sort-by-deadline trickery.
+ island_deadline: CodeOffset,
+ /// How many bytes are needed in the worst case for an island, given all
+ /// pending constants and fixups.
+ island_worst_case_size: CodeOffset,
+ /// Latest branches, to facilitate in-place editing for better fallthrough
+ /// behavior and empty-block removal.
+ latest_branches: SmallVec<[MachBranch; 4]>,
+ /// All labels at the current offset (emission tail). This is lazily
+ /// cleared: it is actually accurate as long as the current offset is
+ /// `labels_at_tail_off`, but if `cur_offset()` has grown larger, it should
+ /// be considered as empty.
+ ///
+ /// For correctness, this *must* be complete (i.e., the vector must contain
+ /// all labels whose offsets are resolved to the current tail), because we
+ /// rely on it to update labels when we truncate branches.
+ labels_at_tail: SmallVec<[MachLabel; 4]>,
+ /// The last offset at which `labels_at_tail` is valid. It is conceptually
+ /// always describing the tail of the buffer, but we do not clear
+ /// `labels_at_tail` eagerly when the tail grows, rather we lazily clear it
+ /// when the offset has grown past this (`labels_at_tail_off`) point.
+ /// Always <= `cur_offset()`.
+ labels_at_tail_off: CodeOffset,
+ /// Map used constants to their [MachLabel].
+ constant_labels: SecondaryMap<VCodeConstant, MachLabel>,
+}
+
+/// A `MachBuffer` once emission is completed: holds generated code and records,
+/// without fixups. This allows the type to be independent of the backend.
+pub struct MachBufferFinalized {
+ /// The buffer contents, as raw bytes.
+ pub data: SmallVec<[u8; 1024]>,
+ /// Any relocations referring to this code. Note that only *external*
+ /// relocations are tracked here; references to labels within the buffer are
+ /// resolved before emission.
+ relocs: SmallVec<[MachReloc; 16]>,
+ /// Any trap records referring to this code.
+ traps: SmallVec<[MachTrap; 16]>,
+ /// Any call site records referring to this code.
+ call_sites: SmallVec<[MachCallSite; 16]>,
+ /// Any source location mappings referring to this code.
+ srclocs: SmallVec<[MachSrcLoc; 64]>,
+ /// Any stack maps referring to this code.
+ stack_maps: SmallVec<[MachStackMap; 8]>,
+}
+
+static UNKNOWN_LABEL_OFFSET: CodeOffset = 0xffff_ffff;
+static UNKNOWN_LABEL: MachLabel = MachLabel(0xffff_ffff);
+
+/// A label refers to some offset in a `MachBuffer`. It may not be resolved at
+/// the point at which it is used by emitted code; the buffer records "fixups"
+/// for references to the label, and will come back and patch the code
+/// appropriately when the label's location is eventually known.
+#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
+pub struct MachLabel(u32);
+entity_impl!(MachLabel);
+
+impl MachLabel {
+ /// Get a label for a block. (The first N MachLabels are always reserved for
+ /// the N blocks in the vcode.)
+ pub fn from_block(bindex: BlockIndex) -> MachLabel {
+ MachLabel(bindex)
+ }
+
+ /// Get the numeric label index.
+ pub fn get(self) -> u32 {
+ self.0
+ }
+
+ /// Creates a string representing this label, for convenience.
+ pub fn to_string(&self) -> String {
+ format!("label{}", self.0)
+ }
+}
+
+impl Default for MachLabel {
+ fn default() -> Self {
+ UNKNOWN_LABEL
+ }
+}
+
+/// A stack map extent, when creating a stack map.
+pub enum StackMapExtent {
+ /// The stack map starts at this instruction, and ends after the number of upcoming bytes
+ /// (note: this is a code offset diff).
+ UpcomingBytes(CodeOffset),
+
+ /// The stack map starts at the given offset and ends at the current one. This is
+ /// useful for architectures whose instructions do not have a fixed length.
+ StartedAtOffset(CodeOffset),
+}
+
+impl<I: VCodeInst> MachBuffer<I> {
+ /// Create a new, empty `MachBuffer`.
+ pub fn new() -> MachBuffer<I> {
+ MachBuffer {
+ data: SmallVec::new(),
+ relocs: SmallVec::new(),
+ traps: SmallVec::new(),
+ call_sites: SmallVec::new(),
+ srclocs: SmallVec::new(),
+ stack_maps: SmallVec::new(),
+ cur_srcloc: None,
+ label_offsets: SmallVec::new(),
+ label_aliases: SmallVec::new(),
+ pending_constants: SmallVec::new(),
+ fixup_records: SmallVec::new(),
+ island_deadline: UNKNOWN_LABEL_OFFSET,
+ island_worst_case_size: 0,
+ latest_branches: SmallVec::new(),
+ labels_at_tail: SmallVec::new(),
+ labels_at_tail_off: 0,
+ constant_labels: SecondaryMap::new(),
+ }
+ }
+
+ /// Debug-only: check invariants of labels and branch-records described
+ /// under "Branch-optimization Correctness" above.
+ #[cfg(debug_assertions)]
+ fn check_label_branch_invariants(&self) {
+ let cur_off = self.cur_offset();
+ // Check that every entry in latest_branches has *correct*
+ // labels_at_this_branch lists. We do not check completeness because
+ // that would require building a reverse index, which is too slow even
+ // for a debug invariant check.
+ let mut last_end = 0;
+ for b in &self.latest_branches {
+ debug_assert!(b.start < b.end);
+ debug_assert!(b.end <= cur_off);
+ debug_assert!(b.start >= last_end);
+ last_end = b.end;
+ for &l in &b.labels_at_this_branch {
+ debug_assert_eq!(self.resolve_label_offset(l), b.start);
+ debug_assert_eq!(self.label_aliases[l.0 as usize], UNKNOWN_LABEL);
+ }
+ }
+
+ // Check that every label is unresolved, or resolved at or before
+ // cur_offset. If at cur_offset, must be in `labels_at_tail`.
+ for (i, &off) in self.label_offsets.iter().enumerate() {
+ let label = MachLabel(i as u32);
+ debug_assert!(off == UNKNOWN_LABEL_OFFSET || off <= cur_off);
+ if off == cur_off {
+ debug_assert!(
+ self.labels_at_tail_off == cur_off && self.labels_at_tail.contains(&label)
+ );
+ }
+ }
+
+ // Check that every label in `labels_at_tail` is precise, i.e.,
+ // resolves to the cur offset.
+ debug_assert!(self.labels_at_tail_off <= cur_off);
+ if self.labels_at_tail_off == cur_off {
+ for &l in &self.labels_at_tail {
+ debug_assert_eq!(self.resolve_label_offset(l), cur_off);
+ debug_assert_eq!(self.label_aliases[l.0 as usize], UNKNOWN_LABEL);
+ }
+ }
+ }
+
+ #[cfg(not(debug_assertions))]
+ fn check_label_branch_invariants(&self) {
+ // Nothing.
+ }
+
+ /// Current offset from start of buffer.
+ pub fn cur_offset(&self) -> CodeOffset {
+ self.data.len() as CodeOffset
+ }
+
+ /// Add a byte.
+ pub fn put1(&mut self, value: u8) {
+ trace!("MachBuffer: put byte @ {}: {:x}", self.cur_offset(), value);
+ self.data.push(value);
+
+ // Post-invariant: conceptual-labels_at_tail contains a complete and
+ // precise list of labels bound at `cur_offset()`. We have advanced
+ // `cur_offset()`, hence if it had been equal to `labels_at_tail_off`
+ // before, it is not anymore (and it cannot become equal, because
+ // `labels_at_tail_off` is always <= `cur_offset()`). Thus the list is
+ // conceptually empty (even though it is only lazily cleared). No labels
+ // can be bound at this new offset (by invariant on `label_offsets`).
+ // Hence the invariant holds.
+ }
+
+ /// Add 2 bytes.
+ pub fn put2(&mut self, value: u16) {
+ trace!(
+ "MachBuffer: put 16-bit word @ {}: {:x}",
+ self.cur_offset(),
+ value
+ );
+ let bytes = value.to_le_bytes();
+ self.data.extend_from_slice(&bytes[..]);
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Add 4 bytes.
+ pub fn put4(&mut self, value: u32) {
+ trace!(
+ "MachBuffer: put 32-bit word @ {}: {:x}",
+ self.cur_offset(),
+ value
+ );
+ let bytes = value.to_le_bytes();
+ self.data.extend_from_slice(&bytes[..]);
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Add 8 bytes.
+ pub fn put8(&mut self, value: u64) {
+ trace!(
+ "MachBuffer: put 64-bit word @ {}: {:x}",
+ self.cur_offset(),
+ value
+ );
+ let bytes = value.to_le_bytes();
+ self.data.extend_from_slice(&bytes[..]);
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Add a slice of bytes.
+ pub fn put_data(&mut self, data: &[u8]) {
+ trace!(
+ "MachBuffer: put data @ {}: len {}",
+ self.cur_offset(),
+ data.len()
+ );
+ self.data.extend_from_slice(data);
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Reserve appended space and return a mutable slice referring to it.
+ pub fn get_appended_space(&mut self, len: usize) -> &mut [u8] {
+ trace!("MachBuffer: put data @ {}: len {}", self.cur_offset(), len);
+ let off = self.data.len();
+ let new_len = self.data.len() + len;
+ self.data.resize(new_len, 0);
+ &mut self.data[off..]
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Align up to the given alignment.
+ pub fn align_to(&mut self, align_to: CodeOffset) {
+ trace!("MachBuffer: align to {}", align_to);
+ assert!(align_to.is_power_of_two());
+ while self.cur_offset() & (align_to - 1) != 0 {
+ self.put1(0);
+ }
+
+ // Post-invariant: as for `put1()`.
+ }
+
+ /// Allocate a `Label` to refer to some offset. May not be bound to a fixed
+ /// offset yet.
+ pub fn get_label(&mut self) -> MachLabel {
+ let l = self.label_offsets.len() as u32;
+ self.label_offsets.push(UNKNOWN_LABEL_OFFSET);
+ self.label_aliases.push(UNKNOWN_LABEL);
+ trace!("MachBuffer: new label -> {:?}", MachLabel(l));
+ MachLabel(l)
+
+ // Post-invariant: the only mutation is to add a new label; it has no
+ // bound offset yet, so it trivially satisfies all invariants.
+ }
+
+ /// Reserve the first N MachLabels for blocks.
+ pub fn reserve_labels_for_blocks(&mut self, blocks: BlockIndex) {
+ trace!("MachBuffer: first {} labels are for blocks", blocks);
+ debug_assert!(self.label_offsets.is_empty());
+ self.label_offsets
+ .resize(blocks as usize, UNKNOWN_LABEL_OFFSET);
+ self.label_aliases.resize(blocks as usize, UNKNOWN_LABEL);
+
+ // Post-invariant: as for `get_label()`.
+ }
+
+ /// Reserve the next N MachLabels for constants.
+ pub fn reserve_labels_for_constants(&mut self, constants: &VCodeConstants) {
+ trace!(
+ "MachBuffer: next {} labels are for constants",
+ constants.len()
+ );
+ for c in constants.keys() {
+ self.constant_labels[c] = self.get_label();
+ }
+
+ // Post-invariant: as for `get_label()`.
+ }
+
+ /// Retrieve the reserved label for a constant.
+ pub fn get_label_for_constant(&self, constant: VCodeConstant) -> MachLabel {
+ self.constant_labels[constant]
+ }
+
+ /// Bind a label to the current offset. A label can only be bound once.
+ pub fn bind_label(&mut self, label: MachLabel) {
+ trace!(
+ "MachBuffer: bind label {:?} at offset {}",
+ label,
+ self.cur_offset()
+ );
+ debug_assert_eq!(self.label_offsets[label.0 as usize], UNKNOWN_LABEL_OFFSET);
+ debug_assert_eq!(self.label_aliases[label.0 as usize], UNKNOWN_LABEL);
+ let offset = self.cur_offset();
+ self.label_offsets[label.0 as usize] = offset;
+ self.lazily_clear_labels_at_tail();
+ self.labels_at_tail.push(label);
+
+ // Invariants hold: bound offset of label is <= cur_offset (in fact it
+ // is equal). If the `labels_at_tail` list was complete and precise
+ // before, it is still, because we have bound this label to the current
+ // offset and added it to the list (which contains all labels at the
+ // current offset).
+
+ self.check_label_branch_invariants();
+ self.optimize_branches();
+
+ // Post-invariant: by `optimize_branches()` (see argument there).
+ self.check_label_branch_invariants();
+ }
+
+ /// Lazily clear `labels_at_tail` if the tail offset has moved beyond the
+ /// offset that it applies to.
+ fn lazily_clear_labels_at_tail(&mut self) {
+ let offset = self.cur_offset();
+ if offset > self.labels_at_tail_off {
+ self.labels_at_tail_off = offset;
+ self.labels_at_tail.clear();
+ }
+
+ // Post-invariant: either labels_at_tail_off was at cur_offset, and
+ // state is untouched, or was less than cur_offset, in which case the
+ // labels_at_tail list was conceptually empty, and is now actually
+ // empty.
+ }
+
+ /// Resolve a label to an offset, if known. May return `UNKNOWN_LABEL_OFFSET`.
+ fn resolve_label_offset(&self, mut label: MachLabel) -> CodeOffset {
+ let mut iters = 0;
+ while self.label_aliases[label.0 as usize] != UNKNOWN_LABEL {
+ label = self.label_aliases[label.0 as usize];
+ // To protect against an infinite loop (despite our assurances to
+ // ourselves that the invariants make this impossible), assert out
+ // after 1M iterations. The number of basic blocks is limited
+ // in most contexts anyway so this should be impossible to hit with
+ // a legitimate input.
+ iters += 1;
+ assert!(iters < 1_000_000, "Unexpected cycle in label aliases");
+ }
+ self.label_offsets[label.0 as usize]
+
+ // Post-invariant: no mutations.
+ }
+
+ /// Emit a reference to the given label with the given reference type (i.e.,
+ /// branch-instruction format) at the current offset. This is like a
+ /// relocation, but handled internally.
+ ///
+ /// This can be called before the branch is actually emitted; fixups will
+ /// not happen until an island is emitted or the buffer is finished.
+ pub fn use_label_at_offset(&mut self, offset: CodeOffset, label: MachLabel, kind: I::LabelUse) {
+ trace!(
+ "MachBuffer: use_label_at_offset: offset {} label {:?} kind {:?}",
+ offset,
+ label,
+ kind
+ );
+
+ // Add the fixup, and update the worst-case island size based on a
+ // veneer for this label use.
+ self.fixup_records.push(MachLabelFixup {
+ label,
+ offset,
+ kind,
+ });
+ if kind.supports_veneer() {
+ self.island_worst_case_size += kind.veneer_size();
+ self.island_worst_case_size &= !(I::LabelUse::ALIGN - 1);
+ }
+ let deadline = offset + kind.max_pos_range();
+ if deadline < self.island_deadline {
+ self.island_deadline = deadline;
+ }
+
+ // Post-invariant: no mutations to branches/labels data structures.
+ self.check_label_branch_invariants();
+ }
+
+ /// Inform the buffer of an unconditional branch at the given offset,
+ /// targeting the given label. May be used to optimize branches.
+ /// The last added label-use must correspond to this branch.
+ /// This must be called when the current offset is equal to `start`; i.e.,
+ /// before actually emitting the branch. This implies that for a branch that
+ /// uses a label and is eligible for optimizations by the MachBuffer, the
+ /// proper sequence is:
+ ///
+ /// - Call `use_label_at_offset()` to emit the fixup record.
+ /// - Call `add_uncond_branch()` to make note of the branch.
+ /// - Emit the bytes for the branch's machine code.
+ ///
+ /// Additional requirement: no labels may be bound between `start` and `end`
+ /// (exclusive on both ends).
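+ ///
+ /// A minimal sketch of that sequence (illustrative only; the label-use kind
+ /// and the emitted bytes below are hypothetical, not a real backend's):
+ ///
+ /// ```ignore
+ /// let start = buf.cur_offset();
+ /// buf.use_label_at_offset(start, target, MyLabelUse::Branch26);
+ /// buf.add_uncond_branch(start, start + 4, target);
+ /// buf.put4(uncond_branch_opcode_with_zero_offset);
+ /// ```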
+ pub fn add_uncond_branch(&mut self, start: CodeOffset, end: CodeOffset, target: MachLabel) {
+ assert!(self.cur_offset() == start);
+ debug_assert!(end > start);
+ assert!(!self.fixup_records.is_empty());
+ let fixup = self.fixup_records.len() - 1;
+ self.lazily_clear_labels_at_tail();
+ self.latest_branches.push(MachBranch {
+ start,
+ end,
+ target,
+ fixup,
+ inverted: None,
+ labels_at_this_branch: self.labels_at_tail.clone(),
+ });
+
+ // Post-invariant: we asserted branch start is current tail; the list of
+ // labels at branch is cloned from list of labels at current tail.
+ self.check_label_branch_invariants();
+ }
+
+ /// Inform the buffer of a conditional branch at the given offset,
+ /// targeting the given label. May be used to optimize branches.
+ /// The last added label-use must correspond to this branch.
+ ///
+ /// Additional requirement: no labels may be bound between `start` and `end`
+ /// (exclusive on both ends).
+ pub fn add_cond_branch(
+ &mut self,
+ start: CodeOffset,
+ end: CodeOffset,
+ target: MachLabel,
+ inverted: &[u8],
+ ) {
+ assert!(self.cur_offset() == start);
+ debug_assert!(end > start);
+ assert!(!self.fixup_records.is_empty());
+ debug_assert!(inverted.len() == (end - start) as usize);
+ let fixup = self.fixup_records.len() - 1;
+ let inverted = Some(SmallVec::from(inverted));
+ self.lazily_clear_labels_at_tail();
+ self.latest_branches.push(MachBranch {
+ start,
+ end,
+ target,
+ fixup,
+ inverted,
+ labels_at_this_branch: self.labels_at_tail.clone(),
+ });
+
+ // Post-invariant: we asserted branch start is current tail; labels at
+ // branch list is cloned from list of labels at current tail.
+ self.check_label_branch_invariants();
+ }
+
+ fn truncate_last_branch(&mut self) {
+ self.lazily_clear_labels_at_tail();
+ // Invariants hold at this point.
+
+ let b = self.latest_branches.pop().unwrap();
+ assert!(b.end == self.cur_offset());
+
+ // State:
+ // [PRE CODE]
+ // Offset b.start, b.labels_at_this_branch:
+ // [BRANCH CODE]
+ // cur_off, self.labels_at_tail -->
+ // (end of buffer)
+ self.data.truncate(b.start as usize);
+ self.fixup_records.truncate(b.fixup);
+ // State:
+ // [PRE CODE]
+ // cur_off, Offset b.start, b.labels_at_this_branch:
+ // (end of buffer)
+ //
+ // self.labels_at_tail --> (past end of buffer)
+ let cur_off = self.cur_offset();
+ self.labels_at_tail_off = cur_off;
+ // State:
+ // [PRE CODE]
+ // cur_off, Offset b.start, b.labels_at_this_branch,
+ // self.labels_at_tail:
+ // (end of buffer)
+ //
+ // resolve_label_offset(l) for l in labels_at_tail:
+ // (past end of buffer)
+
+ trace!(
+ "truncate_last_branch: truncated {:?}; off now {}",
+ b,
+ cur_off
+ );
+
+ // Fix up resolved label offsets for labels at tail.
+ for &l in &self.labels_at_tail {
+ self.label_offsets[l.0 as usize] = cur_off;
+ }
+ // Old labels_at_this_branch are now at cur_off.
+ self.labels_at_tail
+ .extend(b.labels_at_this_branch.into_iter());
+
+ // Post-invariant: this operation is defined to truncate the buffer,
+ // which moves cur_off backward, and to move labels at the end of the
+ // buffer back to the start-of-branch offset.
+ //
+ // latest_branches satisfies all invariants:
+ // - it has no branches past the end of the buffer (branches are in
+ // order, we removed the last one, and we truncated the buffer to just
+ // before the start of that branch)
+ // - no labels were moved to lower offsets than the (new) cur_off, so
+ // the labels_at_this_branch list for any other branch need not change.
+ //
+ // labels_at_tail satisfies all invariants:
+ // - all labels that were at the tail after the truncated branch are
+ // moved backward to just before the branch, which becomes the new tail;
+ // thus every element in the list should remain (ensured by `.extend()`
+ // above).
+ // - all labels that refer to the new tail, which is the start-offset of
+ // the truncated branch, must be present. The `labels_at_this_branch`
+ // list in the truncated branch's record is a complete and precise list
+ // of exactly these labels; we append these to labels_at_tail.
+ // - labels_at_tail_off is at cur_off after truncation occurs, so the
+ // list is valid (not to be lazily cleared).
+ //
+ // The stated operation was performed:
+ // - For each label at the end of the buffer prior to this method, it
+ // now resolves to the new (truncated) end of the buffer: it must have
+ // been in `labels_at_tail` (this list is precise and complete, and
+ // the tail was at the end of the truncated branch on entry), and we
+ // iterate over this list and set `label_offsets` to the new tail.
+ // None of these labels could have been an alias (by invariant), so
+ // `label_offsets` is authoritative for each.
+ // - No other labels will be past the end of the buffer, because of the
+ // requirement that no labels be bound to the middle of branch ranges
+ // (see comments to `add_{cond,uncond}_branch()`).
+ // - The buffer is truncated to just before the last branch, and the
+ // fixup record referring to that last branch is removed.
+ self.check_label_branch_invariants();
+ }
+
+ fn optimize_branches(&mut self) {
+ self.lazily_clear_labels_at_tail();
+ // Invariants valid at this point.
+
+ trace!(
+ "enter optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
+ self.latest_branches,
+ self.labels_at_tail,
+ self.fixup_records
+ );
+
+ // We continue to munch on branches at the tail of the buffer until no
+ // more rules apply. Note that the loop only continues if a branch is
+ // actually truncated (or if labels are redirected away from a branch),
+ // so this always makes progress.
+ while let Some(b) = self.latest_branches.last() {
+ let cur_off = self.cur_offset();
+ trace!("optimize_branches: last branch {:?} at off {}", b, cur_off);
+ // If there has been any code emission since the end of the last branch or
+ // label definition, then there's nothing we can edit (because we
+ // don't move code once placed, only back up and overwrite), so
+ // stop here (the stale branch records are purged after the loop).
+ if b.end < cur_off {
+ break;
+ }
+
+ // Invariant: we are looking at a branch that ends at the tail of
+ // the buffer.
+
+ // For any branch, conditional or unconditional:
+ // - If the target is a label at the current offset, then remove
+ // the conditional branch, and reset all labels that targetted
+ // the current offset (end of branch) to the truncated
+ // end-of-code.
+ //
+ // Preserves execution semantics: a branch to its own fallthrough
+ // address is equivalent to a no-op; in both cases, nextPC is the
+ // fallthrough.
+ if self.resolve_label_offset(b.target) == cur_off {
+ trace!("branch with target == cur off; truncating");
+ self.truncate_last_branch();
+ continue;
+ }
+
+ // If latest is an unconditional branch:
+ //
+ // - If the branch's target is not its own start address, then for
+ // each label at the start of branch, make the label an alias of the
+ // branch target, and remove the label from the "labels at this
+ // branch" list.
+ //
+ // - Preserves execution semantics: an unconditional branch's
+ // only effect is to set PC to a new PC; this change simply
+ // collapses one step in the step-semantics.
+ //
+ // - Post-invariant: the labels that were bound to the start of
+ // this branch become aliases, so they must not be present in any
+ // labels-at-this-branch list or the labels-at-tail list. The
+ // labels are removed from the latest-branch record's
+ // labels-at-this-branch list, and are never placed in the
+ // labels-at-tail list. Furthermore, it is correct that they are
+ // not in either list, because they are now aliases, and labels
+ // that are aliases remain aliases forever.
+ //
+ // - If there is a prior unconditional branch that ends just before
+ // this one begins, and this branch has no labels bound to its
+ // start, then we can truncate this branch, because it is entirely
+ // unreachable (we have redirected all labels that make it
+ // reachable otherwise). Do so and continue around the loop.
+ //
+ // - Preserves execution semantics: the branch is unreachable,
+ // because execution can only flow into an instruction from the
+ // prior instruction's fallthrough or from a branch bound to that
+ // instruction's start offset. Unconditional branches have no
+ // fallthrough, so if the prior instruction is an unconditional
+ // branch, no fallthrough entry can happen. The
+ // labels-at-this-branch list is complete (by invariant), so if it
+ // is empty, then the instruction is entirely unreachable. Thus,
+ // it can be removed.
+ //
+ // - Post-invariant: ensured by truncate_last_branch().
+ //
+ // - If there is a prior conditional branch whose target label
+ // resolves to the current offset (branches around the
+ // unconditional branch), then remove the unconditional branch,
+ // and make the target of the unconditional the target of the
+ // conditional instead.
+ //
+ // - Preserves execution semantics: previously we had:
+ //
+ // L1:
+ // cond_br L2
+ // br L3
+ // L2:
+ // (end of buffer)
+ //
+ // by removing the last branch, we have:
+ //
+ // L1:
+ // cond_br L2
+ // L2:
+ // (end of buffer)
+ //
+ // we then fix up the records for the conditional branch to
+ // have:
+ //
+ // L1:
+ // cond_br.inverted L3
+ // L2:
+ //
+ // In the original code, control flow reaches L2 when the
+ // conditional branch's predicate is true, and L3 otherwise. In
+ // the optimized code, the same is true.
+ //
+ // - Post-invariant: all edits to latest_branches and
+ // labels_at_tail are performed by `truncate_last_branch()`,
+ // which maintains the invariants at each step.
+
+ if b.is_uncond() {
+ // Make any label bound at this branch's start an alias of
+ // the branch's target, if the target is not the branch itself
+ // (i.e., an infinite loop).
+ //
+ // We cannot perform this aliasing if the target of this branch
+ // ultimately aliases back here; if so, we need to keep this
+ // branch, so break out of this loop entirely (and clear the
+ // latest-branches list below).
+ //
+ // Note that this check is what prevents cycles from forming in
+ // `self.label_aliases`. To see why, consider an arbitrary start
+ // state:
+ //
+ // label_aliases[L1] = L2, label_aliases[L2] = L3, ..., up to
+ // Ln, which is not aliased.
+ //
+ // We would create a cycle if we assigned label_aliases[Ln]
+ // = L1. Note that the below assignment is the only write
+ // to label_aliases.
+ //
+ // By our other invariants, we have that Ln (`l` below)
+ // resolves to the offset `b.start`, because it is in the
+ // set `b.labels_at_this_branch`.
+ //
+ // If L1 were already aliased, through some arbitrarily deep
+ // chain, to Ln, then it must also resolve to this offset
+ // `b.start`.
+ //
+ // By checking the resolution of `L1` against this offset,
+ // and aborting this branch-simplification if they are
+ // equal, we prevent the below assignment from ever creating
+ // a cycle.
+ if self.resolve_label_offset(b.target) != b.start {
+ let redirected = b.labels_at_this_branch.len();
+ for &l in &b.labels_at_this_branch {
+ trace!(
+ " -> label at start of branch {:?} redirected to target {:?}",
+ l,
+ b.target
+ );
+ self.label_aliases[l.0 as usize] = b.target;
+ // NOTE: we continue to ensure the invariant that labels
+ // pointing to tail of buffer are in `labels_at_tail`
+ // because we already ensured above that the last branch
+ // cannot have a target of `cur_off`; so we never have
+ // to put the label into `labels_at_tail` when moving it
+ // here.
+ }
+ // Maintain invariant: all branches have been redirected
+ // and are no longer pointing at the start of this branch.
+ let mut_b = self.latest_branches.last_mut().unwrap();
+ mut_b.labels_at_this_branch.clear();
+
+ if redirected > 0 {
+ trace!(" -> after label redirects, restarting loop");
+ continue;
+ }
+ } else {
+ break;
+ }
+
+ let b = self.latest_branches.last().unwrap();
+
+ // Examine any immediately preceding branch.
+ if self.latest_branches.len() > 1 {
+ let prev_b = &self.latest_branches[self.latest_branches.len() - 2];
+ trace!(" -> more than one branch; prev_b = {:?}", prev_b);
+ // This uncond is immediately after another uncond; we
+ // should already have redirected any labels pointing at this uncond away
+ // (but check to be sure); so we can truncate this uncond.
+ if prev_b.is_uncond()
+ && prev_b.end == b.start
+ && b.labels_at_this_branch.is_empty()
+ {
+ trace!(" -> uncond follows another uncond; truncating");
+ self.truncate_last_branch();
+ continue;
+ }
+
+ // This uncond is immediately after a conditional, and the
+ // conditional's target is the end of this uncond, and we've
+ // already redirected any labels pointing at this uncond away; so we can
+ // truncate this uncond, flip the sense of the conditional, and
+ // set the conditional's target (in `latest_branches` and in
+ // `fixup_records`) to the uncond's target.
+ if prev_b.is_cond()
+ && prev_b.end == b.start
+ && self.resolve_label_offset(prev_b.target) == cur_off
+ {
+ trace!(" -> uncond follows a conditional, and conditional's target resolves to current offset");
+ // Save the target of the uncond (this becomes the
+ // target of the cond), and truncate the uncond.
+ let target = b.target;
+ let data = prev_b.inverted.clone().unwrap();
+ self.truncate_last_branch();
+
+ // Mutate the code and cond branch.
+ let off_before_edit = self.cur_offset();
+ let prev_b = self.latest_branches.last_mut().unwrap();
+ let not_inverted = SmallVec::from(
+ &self.data[(prev_b.start as usize)..(prev_b.end as usize)],
+ );
+
+ // Low-level edit: replaces bytes of branch with
+ // inverted form. cur_off remains the same afterward, so
+ // we do not need to modify label data structures.
+ self.data.truncate(prev_b.start as usize);
+ self.data.extend_from_slice(&data[..]);
+
+ // Save the original code as the inversion of the
+ // inverted branch, in case we later edit this branch
+ // again.
+ prev_b.inverted = Some(not_inverted);
+ self.fixup_records[prev_b.fixup].label = target;
+ trace!(" -> reassigning target of condbr to {:?}", target);
+ prev_b.target = target;
+ debug_assert_eq!(off_before_edit, self.cur_offset());
+ continue;
+ }
+ }
+ }
+
+ // If we couldn't do anything with the last branch, then break.
+ break;
+ }
+
+ self.purge_latest_branches();
+
+ trace!(
+ "leave optimize_branches:\n b = {:?}\n l = {:?}\n f = {:?}",
+ self.latest_branches,
+ self.labels_at_tail,
+ self.fixup_records
+ );
+ }
+
+ fn purge_latest_branches(&mut self) {
+ // All of our branch simplification rules work only if a branch ends at
+ // the tail of the buffer, with no following code; and branches are in
+ // order in latest_branches; so if the last entry ends prior to
+ // cur_offset, then clear all entries.
+ let cur_off = self.cur_offset();
+ if let Some(l) = self.latest_branches.last() {
+ if l.end < cur_off {
+ trace!("purge_latest_branches: removing branch {:?}", l);
+ self.latest_branches.clear();
+ }
+ }
+
+ // Post-invariant: no invariant requires any branch to appear in
+ // `latest_branches`; it is always optional. The list-clear above thus
+ // preserves all semantics.
+ }
+
+ /// Emit a constant at some point in the future, binding the given label to
+ /// its offset. The constant will be placed at most `max_distance` from the
+ /// current offset.
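+ ///
+ /// A sketch of deferring an 8-byte constant (the values here are illustrative):
+ ///
+ /// ```ignore
+ /// let lbl = buf.get_label();
+ /// // Loads referring to `lbl` must be able to reach it within 1 MiB.
+ /// buf.defer_constant(lbl, /* align = */ 8, &42u64.to_le_bytes(), 1 << 20);
+ /// ```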
+ pub fn defer_constant(
+ &mut self,
+ label: MachLabel,
+ align: CodeOffset,
+ data: &[u8],
+ max_distance: CodeOffset,
+ ) {
+ trace!(
+ "defer_constant: eventually emit {} bytes aligned to {} at label {:?}",
+ data.len(),
+ align,
+ label
+ );
+ let deadline = self.cur_offset().saturating_add(max_distance);
+ self.island_worst_case_size += data.len() as CodeOffset;
+ self.island_worst_case_size =
+ (self.island_worst_case_size + I::LabelUse::ALIGN - 1) & !(I::LabelUse::ALIGN - 1);
+ self.pending_constants.push(MachLabelConstant {
+ label,
+ align,
+ data: SmallVec::from(data),
+ });
+ if deadline < self.island_deadline {
+ self.island_deadline = deadline;
+ }
+ }
+
+ /// Is an island needed within the next N bytes?
+ pub fn island_needed(&self, distance: CodeOffset) -> bool {
+ let worst_case_end_of_island = self.cur_offset() + distance + self.island_worst_case_size;
+ worst_case_end_of_island > self.island_deadline
+ }
+
+ /// Emit all pending constants and veneers. Should only be called if
+ /// `island_needed()` returns true, i.e., if we actually reach a deadline:
+ /// otherwise, unnecessary veneers may be inserted.
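+ ///
+ /// A sketch of the check an emitter might perform between instructions
+ /// (`space_needed` is a hypothetical worst-case size for the code about to
+ /// be emitted):
+ ///
+ /// ```ignore
+ /// if buf.island_needed(space_needed) {
+ ///     // Only taken near a deadline; flushes pending constants and veneers.
+ ///     buf.emit_island();
+ /// }
+ /// ```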
+ pub fn emit_island(&mut self) {
+ // We're going to purge fixups, so no latest-branch editing can happen
+ // anymore.
+ self.latest_branches.clear();
+
+ let pending_constants = mem::replace(&mut self.pending_constants, SmallVec::new());
+ for MachLabelConstant { label, align, data } in pending_constants.into_iter() {
+ self.align_to(align);
+ self.bind_label(label);
+ self.put_data(&data[..]);
+ }
+
+ let fixup_records = mem::replace(&mut self.fixup_records, SmallVec::new());
+ let mut new_fixups = SmallVec::new();
+ for MachLabelFixup {
+ label,
+ offset,
+ kind,
+ } in fixup_records.into_iter()
+ {
+ trace!(
+ "emit_island: fixup for label {:?} at offset {} kind {:?}",
+ label,
+ offset,
+ kind
+ );
+ // We eagerly perform fixups whose label targets are known, if not out
+ // of range, to avoid unnecessary veneers.
+ let label_offset = self.resolve_label_offset(label);
+ let known = label_offset != UNKNOWN_LABEL_OFFSET;
+ let in_range = if known {
+ if label_offset >= offset {
+ (label_offset - offset) <= kind.max_pos_range()
+ } else {
+ (offset - label_offset) <= kind.max_neg_range()
+ }
+ } else {
+ false
+ };
+
+ trace!(
+ " -> label_offset = {}, known = {}, in_range = {} (pos {} neg {})",
+ label_offset,
+ known,
+ in_range,
+ kind.max_pos_range(),
+ kind.max_neg_range()
+ );
+
+ let start = offset as usize;
+ let end = (offset + kind.patch_size()) as usize;
+ if in_range {
+ debug_assert!(known); // implied by in_range.
+ let slice = &mut self.data[start..end];
+ trace!("patching in-range!");
+ kind.patch(slice, offset, label_offset);
+ } else if !known && !kind.supports_veneer() {
+ // Nothing for now. Keep it for next round.
+ new_fixups.push(MachLabelFixup {
+ label,
+ offset,
+ kind,
+ });
+ } else if !in_range && kind.supports_veneer() {
+ // Allocate space for a veneer in the island.
+ self.align_to(I::LabelUse::ALIGN);
+ let veneer_offset = self.cur_offset();
+ trace!("making a veneer at {}", veneer_offset);
+ let slice = &mut self.data[start..end];
+ // Patch the original label use to refer to the veneer.
+ trace!(
+ "patching original at offset {} to veneer offset {}",
+ offset,
+ veneer_offset
+ );
+ kind.patch(slice, offset, veneer_offset);
+ // Generate the veneer.
+ let veneer_slice = self.get_appended_space(kind.veneer_size() as usize);
+ let (veneer_fixup_off, veneer_label_use) =
+ kind.generate_veneer(veneer_slice, veneer_offset);
+ trace!(
+ "generated veneer; fixup offset {}, label_use {:?}",
+ veneer_fixup_off,
+ veneer_label_use
+ );
+ // If the label is known (but was just out of range), do the
+ // veneer label-use fixup now too; otherwise, save it for later.
+ if known {
+ let start = veneer_fixup_off as usize;
+ let end = (veneer_fixup_off + veneer_label_use.patch_size()) as usize;
+ let veneer_slice = &mut self.data[start..end];
+ trace!("doing veneer fixup right away too");
+ veneer_label_use.patch(veneer_slice, veneer_fixup_off, label_offset);
+ } else {
+ new_fixups.push(MachLabelFixup {
+ label,
+ offset: veneer_fixup_off,
+ kind: veneer_label_use,
+ });
+ }
+ } else {
+ panic!(
+ "Cannot support label-use {:?} (known = {}, in-range = {})",
+ kind, known, in_range
+ );
+ }
+ }
+
+ self.fixup_records = new_fixups;
+ self.island_deadline = UNKNOWN_LABEL_OFFSET;
+ }
+
+ /// Finish any deferred emissions and/or fixups.
+ pub fn finish(mut self) -> MachBufferFinalized {
+ let _tt = timing::vcode_emit_finish();
+
+ while !self.pending_constants.is_empty() || !self.fixup_records.is_empty() {
+ // `emit_island()` will emit any pending veneers and constants, and
+ // as a side-effect, will also take care of any fixups with resolved
+ // labels eagerly.
+ self.emit_island();
+ }
+
+ // Ensure that all labels have been fixed up after the last island is emitted. This is a
+ // full (release-mode) assert because an unresolved label means the emitted code is
+ // incorrect.
+ assert!(self.fixup_records.is_empty());
+
+ MachBufferFinalized {
+ data: self.data,
+ relocs: self.relocs,
+ traps: self.traps,
+ call_sites: self.call_sites,
+ srclocs: self.srclocs,
+ stack_maps: self.stack_maps,
+ }
+ }
+
+ /// Add an external relocation at the current offset.
+ pub fn add_reloc(
+ &mut self,
+ srcloc: SourceLoc,
+ kind: Reloc,
+ name: &ExternalName,
+ addend: Addend,
+ ) {
+ let name = name.clone();
+ self.relocs.push(MachReloc {
+ offset: self.data.len() as CodeOffset,
+ srcloc,
+ kind,
+ name,
+ addend,
+ });
+ }
+
+ /// Add a trap record at the current offset.
+ pub fn add_trap(&mut self, srcloc: SourceLoc, code: TrapCode) {
+ self.traps.push(MachTrap {
+ offset: self.data.len() as CodeOffset,
+ srcloc,
+ code,
+ });
+ }
+
+ /// Add a call-site record at the current offset.
+ pub fn add_call_site(&mut self, srcloc: SourceLoc, opcode: Opcode) {
+ self.call_sites.push(MachCallSite {
+ ret_addr: self.data.len() as CodeOffset,
+ srcloc,
+ opcode,
+ });
+ }
+
+ /// Set the `SourceLoc` for code from this offset until the offset at the
+ /// next call to `end_srcloc()`.
+ pub fn start_srcloc(&mut self, loc: SourceLoc) {
+ self.cur_srcloc = Some((self.cur_offset(), loc));
+ }
+
+ /// Mark the end of the `SourceLoc` segment started at the last
+ /// `start_srcloc()` call.
+ pub fn end_srcloc(&mut self) {
+ let (start, loc) = self
+ .cur_srcloc
+ .take()
+ .expect("end_srcloc() called without start_srcloc()");
+ let end = self.cur_offset();
+ // Skip zero-length extends.
+ debug_assert!(end >= start);
+ if end > start {
+ self.srclocs.push(MachSrcLoc { start, end, loc });
+ }
+ }
+
+ /// Add stack map metadata for this program point: a set of stack offsets
+ /// (from SP upward) that contain live references.
+ ///
+ /// The `offset_to_fp` value is the offset between the nominal SP (at which the stack
+ /// offsets in the map are based) and the FP value. By subtracting `offset_to_fp` from
+ /// each such offset, one can obtain live-reference offsets from FP instead.
+ pub fn add_stack_map(&mut self, extent: StackMapExtent, stack_map: StackMap) {
+ let (start, end) = match extent {
+ StackMapExtent::UpcomingBytes(insn_len) => {
+ let start_offset = self.cur_offset();
+ (start_offset, start_offset + insn_len)
+ }
+ StackMapExtent::StartedAtOffset(start_offset) => {
+ let end_offset = self.cur_offset();
+ debug_assert!(end_offset >= start_offset);
+ (start_offset, end_offset)
+ }
+ };
+ self.stack_maps.push(MachStackMap {
+ offset: start,
+ offset_end: end,
+ stack_map,
+ });
+ }
+}
+
+impl MachBufferFinalized {
+ /// Get a list of source location mapping tuples in sorted-by-start-offset order.
+ pub fn get_srclocs_sorted(&self) -> &[MachSrcLoc] {
+ &self.srclocs[..]
+ }
+
+ /// Get the total required size for the code.
+ pub fn total_size(&self) -> CodeOffset {
+ self.data.len() as CodeOffset
+ }
+
+ /// Emit this buffer to the given CodeSink.
+ pub fn emit<CS: CodeSink>(&self, sink: &mut CS) {
+ // N.B.: we emit every section into the .text section as far as
+ // the `CodeSink` is concerned; we do not bother to segregate
+ // the contents into the actual program text, the jumptable and the
+ // rodata (constant pool). This allows us to generate code assuming
+ // that these will not be relocated relative to each other, and avoids
+ // having to designate each section as belonging in one of the three
+ // fixed categories defined by `CodeSink`. If this becomes a problem
+ // later (e.g. because of memory permissions or similar), we can
+ // add this designation and segregate the output; take care, however,
+ // to add the appropriate relocations in this case.
+
+ let mut next_reloc = 0;
+ let mut next_trap = 0;
+ let mut next_call_site = 0;
+ for (idx, byte) in self.data.iter().enumerate() {
+ if next_reloc < self.relocs.len() {
+ let reloc = &self.relocs[next_reloc];
+ if reloc.offset == idx as CodeOffset {
+ sink.reloc_external(reloc.srcloc, reloc.kind, &reloc.name, reloc.addend);
+ next_reloc += 1;
+ }
+ }
+ if next_trap < self.traps.len() {
+ let trap = &self.traps[next_trap];
+ if trap.offset == idx as CodeOffset {
+ sink.trap(trap.code, trap.srcloc);
+ next_trap += 1;
+ }
+ }
+ if next_call_site < self.call_sites.len() {
+ let call_site = &self.call_sites[next_call_site];
+ if call_site.ret_addr == idx as CodeOffset {
+ sink.add_call_site(call_site.opcode, call_site.srcloc);
+ next_call_site += 1;
+ }
+ }
+ sink.put1(*byte);
+ }
+
+ sink.begin_jumptables();
+ sink.begin_rodata();
+ sink.end_codegen();
+ }
+
+ /// Get the stack map metadata for this code.
+ pub fn stack_maps(&self) -> &[MachStackMap] {
+ &self.stack_maps[..]
+ }
+}
+
+/// A constant that is deferred to the next constant-pool opportunity.
+struct MachLabelConstant {
+ /// This label will refer to the constant's offset.
+ label: MachLabel,
+ /// Required alignment.
+ align: CodeOffset,
+ /// This data will be emitted when able.
+ data: SmallVec<[u8; 16]>,
+}
+
+/// A fixup to perform on the buffer once code is emitted. Fixups always refer
+/// to labels and patch the code based on label offsets. Hence, they are like
+/// relocations, but internal to one buffer.
+#[derive(Debug)]
+struct MachLabelFixup<I: VCodeInst> {
+ /// The label whose offset controls this fixup.
+ label: MachLabel,
+ /// The offset to fix up / patch to refer to this label.
+ offset: CodeOffset,
+ /// The kind of fixup. This is architecture-specific; each architecture may have,
+ /// e.g., several types of branch instructions, each with differently-sized
+ /// offset fields and different places within the instruction to place the
+ /// bits.
+ kind: I::LabelUse,
+}
+
+/// A relocation resulting from a compilation.
+struct MachReloc {
+ /// The offset at which the relocation applies, *relative to the
+ /// containing section*.
+ offset: CodeOffset,
+ /// The original source location.
+ srcloc: SourceLoc,
+ /// The kind of relocation.
+ kind: Reloc,
+ /// The external symbol / name to which this relocation refers.
+ name: ExternalName,
+ /// The addend to add to the symbol value.
+ addend: i64,
+}
+
+/// A trap record resulting from a compilation.
+struct MachTrap {
+ /// The offset at which the trap instruction occurs, *relative to the
+ /// containing section*.
+ offset: CodeOffset,
+ /// The original source location.
+ srcloc: SourceLoc,
+ /// The trap code.
+ code: TrapCode,
+}
+
+/// A call site record resulting from a compilation.
+struct MachCallSite {
+ /// The offset of the call's return address, *relative to the containing section*.
+ ret_addr: CodeOffset,
+ /// The original source location.
+ srcloc: SourceLoc,
+ /// The call's opcode.
+ opcode: Opcode,
+}
+
+/// A source-location mapping resulting from a compilation.
+#[derive(Clone, Debug)]
+pub struct MachSrcLoc {
+ /// The start of the region of code corresponding to a source location.
+ /// This is relative to the start of the function, not to the start of the
+ /// section.
+ pub start: CodeOffset,
+ /// The end of the region of code corresponding to a source location.
+ /// This is relative to the start of the function, not to the start of the
+ /// section.
+ pub end: CodeOffset,
+ /// The source location.
+ pub loc: SourceLoc,
+}
+
+/// Record of stack map metadata: stack offsets containing references.
+#[derive(Clone, Debug)]
+pub struct MachStackMap {
+ /// The code offset at which this stack map applies.
+ pub offset: CodeOffset,
+ /// The code offset just past the "end" of the instruction: that is, the
+ /// offset of the first byte of the following instruction, or equivalently,
+ /// the start offset plus the instruction length.
+ pub offset_end: CodeOffset,
+ /// The stack map itself.
+ pub stack_map: StackMap,
+}
+
+/// Record of branch instruction in the buffer, to facilitate editing.
+#[derive(Clone, Debug)]
+struct MachBranch {
+ start: CodeOffset,
+ end: CodeOffset,
+ target: MachLabel,
+ fixup: usize,
+ inverted: Option<SmallVec<[u8; 8]>>,
+ /// All labels pointing to the start of this branch. For correctness, this
+ /// *must* be complete (i.e., must contain all labels whose resolved offsets
+ /// are at the start of this branch): we rely on being able to redirect all
+ /// labels that could jump to this branch before removing it, if it is
+ /// otherwise unreachable.
+ labels_at_this_branch: SmallVec<[MachLabel; 4]>,
+}
+
+impl MachBranch {
+ fn is_cond(&self) -> bool {
+ self.inverted.is_some()
+ }
+ fn is_uncond(&self) -> bool {
+ self.inverted.is_none()
+ }
+}
+
+// We use an actual instruction definition to do tests, so we depend on the `arm64` feature here.
+#[cfg(all(test, feature = "arm64"))]
+mod test {
+ use super::*;
+ use crate::isa::aarch64::inst::xreg;
+ use crate::isa::aarch64::inst::{BranchTarget, CondBrKind, EmitInfo, Inst};
+ use crate::machinst::MachInstEmit;
+ use crate::settings;
+ use std::default::Default;
+
+ fn label(n: u32) -> MachLabel {
+ MachLabel::from_block(n)
+ }
+ fn target(n: u32) -> BranchTarget {
+ BranchTarget::Label(label(n))
+ }
+
+ #[test]
+ fn test_elide_jump_to_next() {
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(2);
+ buf.bind_label(label(0));
+ let inst = Inst::Jump { dest: target(1) };
+ inst.emit(&mut buf, &info, &mut state);
+ buf.bind_label(label(1));
+ let buf = buf.finish();
+ assert_eq!(0, buf.total_size());
+ }
+
+ #[test]
+ fn test_elide_trivial_jump_blocks() {
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(4);
+
+ buf.bind_label(label(0));
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: target(1),
+ not_taken: target(2),
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ let inst = Inst::Jump { dest: target(3) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(2));
+ let inst = Inst::Jump { dest: target(3) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(3));
+
+ let buf = buf.finish();
+ assert_eq!(0, buf.total_size());
+ }
+
+ #[test]
+ fn test_flip_cond() {
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(4);
+
+ buf.bind_label(label(0));
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: target(1),
+ not_taken: target(2),
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ let inst = Inst::Udf {
+ trap_code: TrapCode::Interrupt,
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(2));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(3));
+
+ let buf = buf.finish();
+
+ let mut buf2 = MachBuffer::new();
+ let mut state = Default::default();
+ let inst = Inst::TrapIf {
+ kind: CondBrKind::NotZero(xreg(0)),
+ trap_code: TrapCode::Interrupt,
+ };
+ inst.emit(&mut buf2, &info, &mut state);
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf2, &info, &mut state);
+
+ let buf2 = buf2.finish();
+
+ assert_eq!(buf.data, buf2.data);
+ }
+
+ #[test]
+ fn test_island() {
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(4);
+
+ buf.bind_label(label(0));
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: target(2),
+ not_taken: target(3),
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ while buf.cur_offset() < 2000000 {
+ if buf.island_needed(0) {
+ buf.emit_island();
+ }
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+ }
+
+ buf.bind_label(label(2));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(3));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ let buf = buf.finish();
+
+ assert_eq!(2000000 + 8, buf.total_size());
+
+ let mut buf2 = MachBuffer::new();
+ let mut state = Default::default();
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: BranchTarget::ResolvedOffset(1048576 - 4),
+ not_taken: BranchTarget::ResolvedOffset(2000000 + 4 - 4),
+ };
+ inst.emit(&mut buf2, &info, &mut state);
+
+ let buf2 = buf2.finish();
+
+ assert_eq!(&buf.data[0..8], &buf2.data[..]);
+ }
+
+ #[test]
+ fn test_island_backward() {
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(4);
+
+ buf.bind_label(label(0));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(2));
+ while buf.cur_offset() < 2000000 {
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+ }
+
+ buf.bind_label(label(3));
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: target(0),
+ not_taken: target(1),
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ let buf = buf.finish();
+
+ assert_eq!(2000000 + 12, buf.total_size());
+
+ let mut buf2 = MachBuffer::new();
+ let mut state = Default::default();
+ let inst = Inst::CondBr {
+ kind: CondBrKind::NotZero(xreg(0)),
+ taken: BranchTarget::ResolvedOffset(8),
+ not_taken: BranchTarget::ResolvedOffset(4 - (2000000 + 4)),
+ };
+ inst.emit(&mut buf2, &info, &mut state);
+ let inst = Inst::Jump {
+ dest: BranchTarget::ResolvedOffset(-(2000000 + 8)),
+ };
+ inst.emit(&mut buf2, &info, &mut state);
+
+ let buf2 = buf2.finish();
+
+ assert_eq!(&buf.data[2000000..], &buf2.data[..]);
+ }
+
+ #[test]
+ fn test_multiple_redirect() {
+ // label0:
+ // cbz x0, label1
+ // b label2
+ // label1:
+ // b label3
+ // label2:
+ // nop
+ // nop
+ // b label0
+ // label3:
+ // b label4
+ // label4:
+ // b label5
+ // label5:
+ // b label7
+ // label6:
+ // nop
+ // label7:
+ // ret
+ //
+ // -- should become:
+ //
+ // label0:
+ // cbz x0, label7
+ // label2:
+ // nop
+ // nop
+ // b label0
+ // label6:
+ // nop
+ // label7:
+ // ret
+
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(8);
+
+ buf.bind_label(label(0));
+ let inst = Inst::CondBr {
+ kind: CondBrKind::Zero(xreg(0)),
+ taken: target(1),
+ not_taken: target(2),
+ };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ let inst = Inst::Jump { dest: target(3) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(2));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+ inst.emit(&mut buf, &info, &mut state);
+ let inst = Inst::Jump { dest: target(0) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(3));
+ let inst = Inst::Jump { dest: target(4) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(4));
+ let inst = Inst::Jump { dest: target(5) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(5));
+ let inst = Inst::Jump { dest: target(7) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(6));
+ let inst = Inst::Nop4;
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(7));
+ let inst = Inst::Ret;
+ inst.emit(&mut buf, &info, &mut state);
+
+ let buf = buf.finish();
+
+ let golden_data = vec![
+ 0xa0, 0x00, 0x00, 0xb4, // cbz x0, 0x14
+ 0x1f, 0x20, 0x03, 0xd5, // nop
+ 0x1f, 0x20, 0x03, 0xd5, // nop
+ 0xfd, 0xff, 0xff, 0x17, // b 0
+ 0x1f, 0x20, 0x03, 0xd5, // nop
+ 0xc0, 0x03, 0x5f, 0xd6, // ret
+ ];
+
+ assert_eq!(&golden_data[..], &buf.data[..]);
+ }
+
+ #[test]
+ fn test_handle_branch_cycle() {
+ // label0:
+ // b label1
+ // label1:
+ // b label2
+ // label2:
+ // b label3
+ // label3:
+ // b label4
+ // label4:
+ // b label1 // note: not label0 (to make it interesting).
+ //
+ // -- should become:
+ //
+ // label0, label1, ..., label4:
+ // b label0
+ let info = EmitInfo::new(settings::Flags::new(settings::builder()));
+ let mut buf = MachBuffer::new();
+ let mut state = Default::default();
+
+ buf.reserve_labels_for_blocks(5);
+
+ buf.bind_label(label(0));
+ let inst = Inst::Jump { dest: target(1) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(1));
+ let inst = Inst::Jump { dest: target(2) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(2));
+ let inst = Inst::Jump { dest: target(3) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(3));
+ let inst = Inst::Jump { dest: target(4) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ buf.bind_label(label(4));
+ let inst = Inst::Jump { dest: target(1) };
+ inst.emit(&mut buf, &info, &mut state);
+
+ let buf = buf.finish();
+
+ let golden_data = vec![
+ 0x00, 0x00, 0x00, 0x14, // b 0
+ ];
+
+ assert_eq!(&golden_data[..], &buf.data[..]);
+ }
+}