From 311bcfc6b3acdd6fd152798c7f287ddf74fa2a98 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Tue, 16 Apr 2024 21:46:48 +0200
Subject: Adding upstream version 15.4.

Signed-off-by: Daniel Baumann
---
 doc/src/sgml/html/hash-implementation.html | 36 ++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)
 create mode 100644 doc/src/sgml/html/hash-implementation.html

diff --git a/doc/src/sgml/html/hash-implementation.html b/doc/src/sgml/html/hash-implementation.html
new file mode 100644
index 0000000..449299b
--- /dev/null
+++ b/doc/src/sgml/html/hash-implementation.html
@@ -0,0 +1,36 @@

72.2. Implementation

There are four kinds of pages in a hash index: the meta page (page zero), which contains statically allocated control information; primary bucket pages; overflow pages; and bitmap pages, which keep track of overflow pages that have been freed and are available for re-use. For addressing purposes, bitmap pages are regarded as a subset of the overflow pages.
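To make that division concrete, here is a minimal illustrative sketch in C of the four page kinds expressed as a per-page type tag; the names are invented for exposition and are not the definitions used in src/include/access/hash.h.

    /*
     * Illustrative only: a tag distinguishing the four kinds of pages a
     * hash index contains.  These names are not PostgreSQL's.
     */
    typedef enum HashPageKind
    {
        HASH_PAGE_META,       /* page zero: statically allocated control data */
        HASH_PAGE_BUCKET,     /* primary bucket page */
        HASH_PAGE_OVERFLOW,   /* overflow page chained onto a bucket */
        HASH_PAGE_BITMAP      /* tracks freed overflow pages available for re-use */
    } HashPageKind;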

Both scanning the index and inserting tuples require locating the bucket where a given tuple ought to be located. To do this, we need the bucket count, highmask, and lowmask from the metapage; however, it's undesirable for performance reasons to have to lock and pin the metapage for every such operation. Instead, we retain a cached copy of the metapage in each backend's relcache entry. This will produce the correct bucket mapping as long as the target bucket hasn't been split since the last cache refresh.
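The mapping itself is cheap once those three fields are at hand. The following sketch shows the masking step, assuming a 32-bit hash value and the maxbucket, highmask, and lowmask values taken from the cached metapage copy; the function and variable names are illustrative, not PostgreSQL's.

    #include <stdint.h>

    /*
     * Sketch of mapping a hash value to a bucket number.  highmask covers
     * the next power of two above the current bucket count, lowmask the
     * power of two below it.
     */
    static uint32_t
    hash_key_to_bucket(uint32_t hashkey, uint32_t maxbucket,
                       uint32_t highmask, uint32_t lowmask)
    {
        uint32_t bucket = hashkey & highmask;

        /* A bucket number beyond maxbucket has not been split off yet,
         * so wrap back to the lower-numbered bucket holding its entries. */
        if (bucket > maxbucket)
            bucket = bucket & lowmask;

        return bucket;
    }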

Primary bucket pages and overflow pages are allocated independently since any given index might need more or fewer overflow pages relative to its number of buckets. The hash code uses an interesting set of addressing rules to support a variable number of overflow pages while not having to move primary bucket pages around after they are created.
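As a rough, much-simplified sketch of how such rules can work (the helper names and the spares[] array below are assumptions made for illustration; the actual rules are spelled out in src/backend/access/hash/README): primary bucket pages are created in power-of-two groups, and counting the overflow pages allocated before each group lets a bucket's block number stay fixed once it has been assigned.

    #include <stdint.h>

    /*
     * Simplified illustration.  Primary buckets are created in power-of-two
     * groups ("split points"): bucket 0, bucket 1, buckets 2-3, 4-7, and so
     * on.  spares[sp] is assumed to hold the number of overflow pages
     * allocated before group sp began, so a primary page never has to move
     * when more overflow pages are added later.
     */
    static uint32_t
    splitpoint_of(uint32_t bucket)
    {
        uint32_t sp = 0;

        while (bucket >> sp)
            sp++;               /* smallest sp with 2^sp > bucket */
        return sp;
    }

    static uint32_t
    bucket_to_blkno(uint32_t bucket, const uint32_t *spares)
    {
        /* Block 0 is the metapage; buckets 0..bucket-1 take one primary
         * page each; overflow pages of earlier groups sit between groups. */
        return 1 + bucket + spares[splitpoint_of(bucket)];
    }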

Each row in the table indexed is represented by a single index tuple in the hash index. Hash index tuples are stored in bucket pages, and if they exist, overflow pages. We speed up searches by keeping the index entries in any one index page sorted by hash code, thus allowing binary search to be used within an index page. Note however that there is *no* assumption about the relative ordering of hash codes across different index pages of a bucket.
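A short sketch of the within-page search, assuming an array holding one page's hash codes in ascending order (the names are illustrative): a lower-bound binary search finds the first entry whose hash code is not below the code being probed, after which matching entries can be scanned sequentially.

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Lower-bound binary search over the hash codes of one index page.
     * Returns the first position whose code is >= target, or ntuples if
     * every code is smaller.
     */
    static size_t
    first_offset_with_hash(const uint32_t *codes, size_t ntuples,
                           uint32_t target)
    {
        size_t lo = 0;
        size_t hi = ntuples;        /* half-open search range [lo, hi) */

        while (lo < hi)
        {
            size_t mid = lo + (hi - lo) / 2;

            if (codes[mid] < target)
                lo = mid + 1;
            else
                hi = mid;
        }
        return lo;
    }

Because only within-page ordering is guaranteed, each page of a bucket's chain has to be searched this way independently.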

The bucket splitting algorithms to expand the hash index are too complex to be worthy of mention here, though they are described in more detail in src/backend/access/hash/README. The split algorithm is crash safe and can be restarted if not completed successfully.

\ No newline at end of file