summaryrefslogtreecommitdiffstats
path: root/vendor/sharded-slab/src/implementation.rs
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-17 12:02:58 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-17 12:02:58 +0000
commit698f8c2f01ea549d77d7dc3338a12e04c11057b9 (patch)
tree173a775858bd501c378080a10dca74132f05bc50 /vendor/sharded-slab/src/implementation.rs
parentInitial commit. (diff)
downloadrustc-698f8c2f01ea549d77d7dc3338a12e04c11057b9.tar.xz
rustc-698f8c2f01ea549d77d7dc3338a12e04c11057b9.zip
Adding upstream version 1.64.0+dfsg1.upstream/1.64.0+dfsg1
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'vendor/sharded-slab/src/implementation.rs')
-rw-r--r--vendor/sharded-slab/src/implementation.rs138
1 files changed, 138 insertions, 0 deletions
diff --git a/vendor/sharded-slab/src/implementation.rs b/vendor/sharded-slab/src/implementation.rs
new file mode 100644
index 000000000..01f08a582
--- /dev/null
+++ b/vendor/sharded-slab/src/implementation.rs
@@ -0,0 +1,138 @@
+// This module exists only to provide a separate page for the implementation
+// documentation.
+
+//! Notes on `sharded-slab`'s implementation and design.
+//!
+//! # Design
+//!
+//! The sharded slab's design is strongly inspired by the ideas presented by
+//! Leijen, Zorn, and de Moura in [Mimalloc: Free List Sharding in
+//! Action][mimalloc]. In this report, the authors present a novel design for a
+//! memory allocator based on a concept of _free list sharding_.
+//!
+//! Memory allocators must keep track of what memory regions are not currently
+//! allocated ("free") in order to provide them to future allocation requests.
+//! The term [_free list_][freelist] refers to a technique for performing this
+//! bookkeeping, where each free block stores a pointer to the next free block,
+//! forming a linked list. The memory allocator keeps a pointer to the most
+//! recently freed block, the _head_ of the free list. To allocate more memory,
+//! the allocator pops from the free list by setting the head pointer to the
+//! next free block of the current head block, and returning the previous head.
+//! To deallocate a block, the block is pushed to the free list by setting its
+//! first word to the current head pointer, and the head pointer is set to point
+//! to the deallocated block. Most implementations of slab allocators backed by
+//! arrays or vectors use a similar technique, where pointers are replaced by
+//! indices into the backing array.
+//!
+//! When allocations and deallocations can occur concurrently across threads,
+//! they must synchronize accesses to the free list; either by putting the
+//! entire allocator state inside of a lock, or by using atomic operations to
+//! treat the free list as a lock-free structure (such as a [Treiber stack]). In
+//! both cases, there is a significant performance cost — even when the free
+//! list is lock-free, it is likely that a noticeable amount of time will be
+//! spent in compare-and-swap loops. Ideally, the global synchronzation point
+//! created by the single global free list could be avoided as much as possible.
+//!
+//! The approach presented by Leijen, Zorn, and de Moura is to introduce
+//! sharding and thus increase the granularity of synchronization significantly.
+//! In mimalloc, the heap is _sharded_ so that each thread has its own
+//! thread-local heap. Objects are always allocated from the local heap of the
+//! thread where the allocation is performed. Because allocations are always
+//! done from a thread's local heap, they need not be synchronized.
+//!
+//! However, since objects can move between threads before being deallocated,
+//! _deallocations_ may still occur concurrently. Therefore, Leijen et al.
+//! introduce a concept of _local_ and _global_ free lists. When an object is
+//! deallocated on the same thread it was originally allocated on, it is placed
+//! on the local free list; if it is deallocated on another thread, it goes on
+//! the global free list for the heap of the thread from which it originated. To
+//! allocate, the local free list is used first; if it is empty, the entire
+//! global free list is popped onto the local free list. Since the local free
+//! list is only ever accessed by the thread it belongs to, it does not require
+//! synchronization at all, and because the global free list is popped from
+//! infrequently, the cost of synchronization has a reduced impact. A majority
+//! of allocations can occur without any synchronization at all; and
+//! deallocations only require synchronization when an object has left its
+//! parent thread (a relatively uncommon case).
+//!
+//! [mimalloc]: https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf
+//! [freelist]: https://en.wikipedia.org/wiki/Free_list
+//! [Treiber stack]: https://en.wikipedia.org/wiki/Treiber_stack
+//!
+//! # Implementation
+//!
+//! A slab is represented as an array of [`MAX_THREADS`] _shards_. A shard
+//! consists of a vector of one or more _pages_ plus associated metadata.
+//! Finally, a page consists of an array of _slots_, head indices for the local
+//! and remote free lists.
+//!
+//! ```text
+//! ┌─────────────┐
+//! │ shard 1 │
+//! │ │ ┌─────────────┐ ┌────────┐
+//! │ pages───────┼───▶│ page 1 │ │ │
+//! ├─────────────┤ ├─────────────┤ ┌────▶│ next──┼─┐
+//! │ shard 2 │ │ page 2 │ │ ├────────┤ │
+//! ├─────────────┤ │ │ │ │XXXXXXXX│ │
+//! │ shard 3 │ │ local_head──┼──┘ ├────────┤ │
+//! └─────────────┘ │ remote_head─┼──┐ │ │◀┘
+//! ... ├─────────────┤ │ │ next──┼─┐
+//! ┌─────────────┐ │ page 3 │ │ ├────────┤ │
+//! │ shard n │ └─────────────┘ │ │XXXXXXXX│ │
+//! └─────────────┘ ... │ ├────────┤ │
+//! ┌─────────────┐ │ │XXXXXXXX│ │
+//! │ page n │ │ ├────────┤ │
+//! └─────────────┘ │ │ │◀┘
+//! └────▶│ next──┼───▶ ...
+//! ├────────┤
+//! │XXXXXXXX│
+//! └────────┘
+//! ```
+//!
+//!
+//! The size of the first page in a shard is always a power of two, and every
+//! subsequent page added after the first is twice as large as the page that
+//! preceeds it.
+//!
+//! ```text
+//!
+//! pg.
+//! ┌───┐ ┌─┬─┐
+//! │ 0 │───▶ │ │
+//! ├───┤ ├─┼─┼─┬─┐
+//! │ 1 │───▶ │ │ │ │
+//! ├───┤ ├─┼─┼─┼─┼─┬─┬─┬─┐
+//! │ 2 │───▶ │ │ │ │ │ │ │ │
+//! ├───┤ ├─┼─┼─┼─┼─┼─┼─┼─┼─┬─┬─┬─┬─┬─┬─┬─┐
+//! │ 3 │───▶ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │ │
+//! └───┘ └─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
+//! ```
+//!
+//! When searching for a free slot, the smallest page is searched first, and if
+//! it is full, the search proceeds to the next page until either a free slot is
+//! found or all available pages have been searched. If all available pages have
+//! been searched and the maximum number of pages has not yet been reached, a
+//! new page is then allocated.
+//!
+//! Since every page is twice as large as the previous page, and all page sizes
+//! are powers of two, we can determine the page index that contains a given
+//! address by shifting the address down by the smallest page size and
+//! looking at how many twos places necessary to represent that number,
+//! telling us what power of two page size it fits inside of. We can
+//! determine the number of twos places by counting the number of leading
+//! zeros (unused twos places) in the number's binary representation, and
+//! subtracting that count from the total number of bits in a word.
+//!
+//! The formula for determining the page number that contains an offset is thus:
+//!
+//! ```rust,ignore
+//! WIDTH - ((offset + INITIAL_PAGE_SIZE) >> INDEX_SHIFT).leading_zeros()
+//! ```
+//!
+//! where `WIDTH` is the number of bits in a `usize`, and `INDEX_SHIFT` is
+//!
+//! ```rust,ignore
+//! INITIAL_PAGE_SIZE.trailing_zeros() + 1;
+//! ```
+//!
+//! [`MAX_THREADS`]: https://docs.rs/sharded-slab/latest/sharded_slab/trait.Config.html#associatedconstant.MAX_THREADS