diff options
Diffstat (limited to 'xpcom/docs/hashtables.rst')
-rw-r--r-- | xpcom/docs/hashtables.rst | 141 |
1 files changed, 141 insertions, 0 deletions
diff --git a/xpcom/docs/hashtables.rst b/xpcom/docs/hashtables.rst new file mode 100644 index 0000000000..8debd11eb7 --- /dev/null +++ b/xpcom/docs/hashtables.rst @@ -0,0 +1,141 @@ +XPCOM Hashtable Guide +===================== + +.. note:: + + For a deep-dive into the underlying mechanisms that power our hashtables, + check out the :ref:`XPCOM Hashtable Technical Details` + document. + +What Is a Hashtable? +-------------------- + +A hashtable is a data construct that stores a set of **items**. Each +item has a **key** that identifies the item. Items are found, added, and +removed from the hashtable by using the key. Hashtables may seem like +arrays, but there are important differences: + ++-------------------------+----------------------+----------------------+ +| | Array | Hashtable | ++=========================+======================+======================+ +| **Keys** | *integer:* arrays | *any type:* almost | +| | are always keyed on | any datatype can be | +| | integers and must | used as key, | +| | be contiguous. | including strings, | +| | | integers, XPCOM | +| | | interface pointers, | +| | | IIDs, and almost | +| | | anything else. Keys | +| | | can be disjunct | +| | | (i.e. you can store | +| | | entries with keys 1, | +| | | 5, and 3000). | ++-------------------------+----------------------+----------------------+ +| **Lookup Time** | *O(1):* lookup time | *O(1):* lookup time | +| | is a simple constant | is mostly-constant, | +| | | but the constant | +| | | time can be larger | +| | | than an array lookup | ++-------------------------+----------------------+----------------------+ +| **Sorting** | *sorted:* stored | *unsorted:* stored | +| | sorted; iterated | unsorted; cannot be | +| | over in a sorted | iterated over in a | +| | fashion. | sorted manner. | ++-------------------------+----------------------+----------------------+ +| **Inserting/Removing** | *O(n):* adding and | *O(1):* adding and | +| | removing items from | removing items from | +| | a large array can be | hashtables is a | +| | time-consuming | quick operation | ++-------------------------+----------------------+----------------------+ +| **Wasted space** | *none:* Arrays are | *some:* hashtables | +| | packed structures, | are not packed | +| | so there is no | structures; | +| | wasted space. | depending on the | +| | | implementation, | +| | | there may be | +| | | significant wasted | +| | | memory. | ++-------------------------+----------------------+----------------------+ + +In their implementation, hashtables take the key and apply a +mathematical **hash function** to **randomize** the key and then use the +hash to find the location in the hashtable. Good hashtable +implementations will automatically resize the hashtable in memory if +extra space is needed, or if too much space has been allocated. + +.. _When_Should_I_Use_a_Hashtable.3F: + +When Should I Use a Hashtable? +------------------------------ + +Hashtables are useful for + +- sets of data that need swift **random access** +- with **non-integral keys** or **non-contiguous integral keys** +- or where **items will be frequently added or removed** + +Hashtables should *not* be used for + +- Sets that need to be **sorted** +- Very small datasets (less than 12-16 items) +- Data that does not need random access + +In these situations, an array, a linked-list, or various tree data +structures are more efficient. + +.. _Which_Hashtable_Should_I_Use.3F: + +Which Hashtable Should I Use? +----------------------------- + +If there is **no** key type, you should use an ``nsTHashSet``. + +If there is a key type, you should use an ``nsTHashMap``. + +``nsTHashMap`` is a template with two parameters. The first is the hash key +and the second is the data to be stored as the value in the map. Most of +the time, you can simply pass the raw key type as the first parameter, +so long as its supported by `nsTHashMap.h <https://searchfox.org/mozilla-central/source/xpcom/ds/nsTHashMap.h>`_. +It is also possible to specify custom keys if necessary. See `nsHashKeys.h +<https://searchfox.org/mozilla-central/source/xpcom/ds/nsHashKeys.h>`_ for examples. + +There are a number of more esoteric hashkey classes in nsHashKeys.h, and +you can always roll your own if none of these fit your needs (make sure +you're not duplicating an existing hashkey class though!) + +Once you've determined what hashtable and hashkey classes you need, you +can put it all together. A few examples: + +- A hashtable that maps UTF-8 origin names to a DOM Window - + ``nsTHashMap<nsCString, nsCOMPtr<nsIDOMWindow>>`` +- A hashtable that maps 32-bit integers to floats - + ``nsTHashMap<uint32_t, float>`` +- A hashtable that maps ``nsISupports`` pointers to reference counted + ``CacheEntry``\ s - + ``nsTHashMap<nsCOMPtr<nsISupports>, RefPtr<CacheEntry>>`` +- A hashtable that maps ``JSContext`` pointers to a ``ContextInfo`` + struct - ``nsTHashMap<JSContext*, UniquePtr<ContextInfo>>`` +- A hashset of strings - ``nsTHashSet<nsString>`` + +.. _nsBaseHashtable_and_friends:_nsDataHashtable.2C_nsInterfaceHashtable.2C_and_nsClassHashtable: + +Hashtable API +------------- + +The hashtable classes all expose the same basic API. There are three +key methods, ``Get``, ``InsertOrUpdate``, and ``Remove``, which retrieve entries from the +hashtable, write entries into the hashtable, and remove entries from the +hashtable respectively. See `nsBaseHashtable.h <https://searchfox.org/mozilla-central/source/xpcom/ds/nsBaseHashtable.h>`_ +for more details. + +The hashtables that hold references to pointers (nsRefPtrHashtable and +nsInterfaceHashtable) also have GetWeak methods that return non-AddRefed +pointers. + +Note that ``nsRefPtrHashtable``, ``nsInterfaceHashtable`` and ``nsClassHashtable`` +are legacy hashtable types which have some extra methods, and don't have automatic +key type handling. + +All of these hashtable classes can be iterated over via the ``Iterator`` +class, with normal C++11 iterators or using the ``Keys()`` / ``Values()`` ranges, +and all can be cleared via the ``Clear`` method. |