summaryrefslogtreecommitdiffstats
path: root/doc/trie_map.rst
diff options
context:
space:
mode:
authorDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-15 05:46:17 +0000
committerDaniel Baumann <daniel.baumann@progress-linux.org>2024-04-15 05:46:17 +0000
commit85adc697d2ec2a379ce6d721f0419ae5df3abdb6 (patch)
tree265f329bc4544c6f11a27ac3fd6022f593a10c11 /doc/trie_map.rst
parentInitial commit. (diff)
downloadmdds-85adc697d2ec2a379ce6d721f0419ae5df3abdb6.tar.xz
mdds-85adc697d2ec2a379ce6d721f0419ae5df3abdb6.zip
Adding upstream version 2.1.1.upstream/2.1.1upstream
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'doc/trie_map.rst')
-rw-r--r--doc/trie_map.rst767
1 files changed, 767 insertions, 0 deletions
diff --git a/doc/trie_map.rst b/doc/trie_map.rst
new file mode 100644
index 0000000..8266771
--- /dev/null
+++ b/doc/trie_map.rst
@@ -0,0 +1,767 @@
+.. highlight:: cpp
+
+
+Trie Maps
+=========
+
+Examples
+--------
+
+Populating Trie Map
+^^^^^^^^^^^^^^^^^^^
+
+This section illustrates how to use :cpp:class:`~mdds::trie_map` to build a
+database of city populations and perform prefix searches. In this example,
+we will use the 2013 populations of cities in North Carolina, and use the city
+names as keys.
+
+Let's define the type first::
+
+ using trie_map_type = mdds::trie_map<mdds::trie::std_string_traits, int>;
+
+The first template argument specifies the trait of the key. In this example,
+we are using a pre-defined trait for std::string, which is defined in
+:cpp:type:`~mdds::trie::std_string_traits`. The second template argument
+specifies the value type, which in this example is simply an ``int``.
+
+Once the type is defined, the next step is instantiation::
+
+ trie_map_type nc_cities;
+
+It's pretty simple as you don't need to pass any arguments to the constructor.
+Now, let's populate this data structure with some population data::
+
+ // Insert key-value pairs.
+ nc_cities.insert("Charlotte", 792862);
+ nc_cities.insert("Raleigh", 431746);
+ nc_cities.insert("Greensboro", 279639);
+ nc_cities.insert("Durham", 245475);
+ nc_cities.insert("Winston-Salem", 236441);
+ nc_cities.insert("Fayetteville", 204408);
+ nc_cities.insert("Cary", 151088);
+ nc_cities.insert("Wilmington", 112067);
+ nc_cities.insert("High Point", 107741);
+ nc_cities.insert("Greenville", 89130);
+ nc_cities.insert("Asheville", 87236);
+ nc_cities.insert("Concord", 83506);
+ nc_cities.insert("Gastonia", 73209);
+ nc_cities.insert("Jacksonville", 69079);
+ nc_cities.insert("Chapel Hill", 59635);
+ nc_cities.insert("Rocky Mount", 56954);
+ nc_cities.insert("Burlington", 51510);
+ nc_cities.insert("Huntersville", 50458);
+ nc_cities.insert("Wilson", 49628);
+ nc_cities.insert("Kannapolis", 44359);
+ nc_cities.insert("Apex", 42214);
+ nc_cities.insert("Hickory", 40361);
+ nc_cities.insert("Goldsboro", 36306);
+
+It's pretty straight-forward. Each :cpp:func:`~mdds::trie_map::insert` call
+expects a pair of string key and an integer value. You can insert your data
+in any order regardless of key's sort order.
+
+Now that the data is in, let's perform prefix search to query all cities whose
+name begins with "Cha"::
+
+ cout << "Cities that start with 'Cha' and their populations:" << endl;
+ auto results = nc_cities.prefix_search("Cha");
+ for (const auto& kv : results)
+ {
+ cout << " " << kv.first << ": " << kv.second << endl;
+ }
+
+You can perform prefix search via :cpp:func:`~mdds::trie_map::prefix_search`
+method, which returns a results object that can be iterated over using a range-based
+for loop. Running this code will produce the following output:
+
+.. code-block:: none
+
+ Cities that start with 'Cha' and their populations:
+ Chapel Hill: 59635
+ Charlotte: 792862
+
+Let's perform another prefix search, this time with a prefix of "W"::
+
+ cout << "Cities that start with 'W' and their populations:" << endl;
+ results = nc_cities.prefix_search("W");
+ for (const auto& kv : results)
+ {
+ cout << " " << kv.first << ": " << kv.second << endl;
+ }
+
+You'll see the following output when running this code:
+
+.. code-block:: none
+
+ Cities that start with 'W' and their populations:
+ Wilmington: 112067
+ Wilson: 49628
+ Winston-Salem: 236441
+
+Note that the results are sorted in key's ascending order.
+
+.. note::
+
+ Results from the prefix search are sorted in key's ascending order.
+
+
+Creating Packed Trie Map from Trie Map
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There is also another variant of trie called :cpp:class:`~mdds::packed_trie_map`
+which is designed to store all its data in contiguous memory region. Unlike
+:cpp:class:`~mdds::trie_map` which is mutable, :cpp:class:`~mdds::packed_trie_map`
+is immutable; once populated, you can only perform queries and it is no longer
+possible to add new entries into the container.
+
+One way to create an instance of :cpp:class:`~mdds::packed_trie_map` is from
+:cpp:class:`~mdds::trie_map` by calling its :cpp:func:`~mdds::trie_map::pack`
+method::
+
+ auto packed = nc_cities.pack();
+
+The query methods of :cpp:class:`~mdds::packed_trie_map` are identical to those
+of :cpp:class:`~mdds::trie_map`. For instance, performing prefix search to find
+all entries whose key begins with "C" can be done as follows::
+
+ cout << "Cities that start with 'C' and their populations:" << endl;
+ auto packed_results = packed.prefix_search("C");
+ for (const auto& kv : packed_results)
+ {
+ cout << " " << kv.first << ": " << kv.second << endl;
+ }
+
+Running this code will generate the following output:
+
+.. code-block:: none
+
+ Cities that start with 'C' and their populations:
+ Cary: 151088
+ Chapel Hill: 59635
+ Charlotte: 792862
+ Concord: 83506
+
+You can also perform an exact-match query via :cpp:func:`~mdds::packed_trie_map::find`
+method which returns an iterator associated with the key-value pair entry::
+
+ // Individual search.
+ auto it = packed.find("Wilmington");
+ cout << "Population of Wilmington: " << it->second << endl;
+
+You'll see the following output with this code:
+
+.. code-block:: none
+
+ Population of Wilmington: 112067
+
+What if you performed an exact-match query with a key that doesn't exist in the
+container? You will basically get the end iterator position as its return value.
+Thus, running this code::
+
+ // You get an end position iterator when the container doesn't have the
+ // specified key.
+ it = packed.find("Asheboro");
+
+ cout << "Population of Asheboro: ";
+
+ if (it == packed.end())
+ cout << "not found";
+ else
+ cout << it->second;
+
+ cout << endl;
+
+will generate the following output:
+
+.. code-block:: none
+
+ Population of Asheboro: not found
+
+The complete source code for the examples in these two sections is available
+`here <https://gitlab.com/mdds/mdds/-/blob/master/example/trie_map.cpp>`__.
+
+
+Using Packed Trie Map directly
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the previous example, we showed a way to create an instance of :cpp:class:`~mdds::packed_trie_map`
+from a populated instance of :cpp:class:`~mdds::trie_map`. There is also a way
+to instantiate and populate an instance of :cpp:class:`~mdds::packed_trie_map`
+directly, and that is what we will cover in this section.
+
+First, declare the type::
+
+ using trie_map_type = mdds::packed_trie_map<mdds::trie::std_string_traits, int>;
+
+Once again, we are using the pre-defined trait for std::string as its key, and int
+as its value type. The next step is to prepare its entries ahead of time::
+
+ trie_map_type::entry entries[] =
+ {
+ { MDDS_ASCII("Apex"), 42214 },
+ { MDDS_ASCII("Asheville"), 87236 },
+ { MDDS_ASCII("Burlington"), 51510 },
+ { MDDS_ASCII("Cary"), 151088 },
+ { MDDS_ASCII("Chapel Hill"), 59635 },
+ { MDDS_ASCII("Charlotte"), 792862 },
+ { MDDS_ASCII("Concord"), 83506 },
+ { MDDS_ASCII("Durham"), 245475 },
+ { MDDS_ASCII("Fayetteville"), 204408 },
+ { MDDS_ASCII("Gastonia"), 73209 },
+ { MDDS_ASCII("Goldsboro"), 36306 },
+ { MDDS_ASCII("Greensboro"), 279639 },
+ { MDDS_ASCII("Greenville"), 89130 },
+ { MDDS_ASCII("Hickory"), 40361 },
+ { MDDS_ASCII("High Point"), 107741 },
+ { MDDS_ASCII("Huntersville"), 50458 },
+ { MDDS_ASCII("Jacksonville"), 69079 },
+ { MDDS_ASCII("Kannapolis"), 44359 },
+ { MDDS_ASCII("Raleigh"), 431746 },
+ { MDDS_ASCII("Rocky Mount"), 56954 },
+ { MDDS_ASCII("Wilmington"), 112067 },
+ { MDDS_ASCII("Wilson"), 49628 },
+ { MDDS_ASCII("Winston-Salem"), 236441 },
+ };
+
+We need to do this since :cpp:class:`~mdds::packed_trie_map` is immutable, and
+the only time we can populate its content is at instantiation time. Here, we
+are using the :c:macro:`MDDS_ASCII` macro to expand a string literal to its
+pointer value and size. Note that you need to ensure that the entries are sorted
+by the key in ascending order.
+
+.. warning::
+
+ When instantiating :cpp:class:`~mdds::packed_trie_map` directly with a static
+ set of entries, the entries must be sorted by the key in ascending order.
+
+You can then pass this list of entries to construct the instance::
+
+ trie_map_type nc_cities(entries, MDDS_N_ELEMENTS(entries));
+
+The :c:macro:`MDDS_N_ELEMENTS` macro will infer the size of a fixed-size array
+from its static definition. Once it's instantiated, the rest of the example
+for performing searches will be the same as in the previous section, which we
+will not repeat here.
+
+The complete source code for the example in this section is available
+`here <https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_map.cpp>`__.
+
+
+Saving and loading Packed Trie Map instances
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are times when you need to save the state of a :cpp:class:`~mdds::packed_trie_map`
+instance to a file, or an in-memory buffer, and load it back later. Doing that
+is now possible by using the :cpp:func:`~mdds::packed_trie_map::save_state` and
+:cpp:func:`~mdds::packed_trie_map::load_state` member methods of the
+:cpp:class:`~mdds::packed_trie_map` class.
+
+First, let's define the type of use::
+
+ using map_type = mdds::packed_trie_map<mdds::trie::std_string_traits, int>;
+
+As with the previous examples, we will use ``std::string`` as the key type and
+``int`` as the value type. In this example, we are going to use `the world's
+largest cities and their 2018 populations
+<https://en.wikipedia.org/wiki/List_of_largest_cities>`__ as the data to store
+in the container.
+
+The following code defines the entries::
+
+ std::vector<map_type::entry> entries =
+ {
+ { MDDS_ASCII("Ahmedabad"), 7681000 },
+ { MDDS_ASCII("Alexandria"), 5086000 },
+ { MDDS_ASCII("Atlanta"), 5572000 },
+ { MDDS_ASCII("Baghdad"), 6812000 },
+ { MDDS_ASCII("Bangalore"), 11440000 },
+ { MDDS_ASCII("Bangkok"), 10156000 },
+ { MDDS_ASCII("Barcelona"), 5494000 },
+ { MDDS_ASCII("Beijing"), 19618000 },
+ { MDDS_ASCII("Belo Horizonte"), 5972000 },
+ { MDDS_ASCII("Bogota"), 10574000 },
+ { MDDS_ASCII("Buenos Aires"), 14967000 },
+ { MDDS_ASCII("Cairo"), 20076000 },
+ { MDDS_ASCII("Chengdu"), 8813000 },
+ { MDDS_ASCII("Chennai"), 10456000 },
+ { MDDS_ASCII("Chicago"), 8864000 },
+ { MDDS_ASCII("Chongqing"), 14838000 },
+ { MDDS_ASCII("Dalian"), 5300000 },
+ { MDDS_ASCII("Dallas"), 6099000 },
+ { MDDS_ASCII("Dar es Salaam"), 6048000 },
+ { MDDS_ASCII("Delhi"), 28514000 },
+ { MDDS_ASCII("Dhaka"), 19578000 },
+ { MDDS_ASCII("Dongguan"), 7360000 },
+ { MDDS_ASCII("Foshan"), 7236000 },
+ { MDDS_ASCII("Fukuoka"), 5551000 },
+ { MDDS_ASCII("Guadalajara"), 5023000 },
+ { MDDS_ASCII("Guangzhou"), 12638000 },
+ { MDDS_ASCII("Hangzhou"), 7236000 },
+ { MDDS_ASCII("Harbin"), 6115000 },
+ { MDDS_ASCII("Ho Chi Minh City"), 8145000 },
+ { MDDS_ASCII("Hong Kong"), 7429000 },
+ { MDDS_ASCII("Houston"), 6115000 },
+ { MDDS_ASCII("Hyderabad"), 9482000 },
+ { MDDS_ASCII("Istanbul"), 14751000 },
+ { MDDS_ASCII("Jakarta"), 10517000 },
+ { MDDS_ASCII("Jinan"), 5052000 },
+ { MDDS_ASCII("Johannesburg"), 5486000 },
+ { MDDS_ASCII("Karachi"), 15400000 },
+ { MDDS_ASCII("Khartoum"), 5534000 },
+ { MDDS_ASCII("Kinshasa"), 13171000 },
+ { MDDS_ASCII("Kolkata"), 14681000 },
+ { MDDS_ASCII("Kuala Lumpur"), 7564000 },
+ { MDDS_ASCII("Lagos"), 13463000 },
+ { MDDS_ASCII("Lahore"), 11738000 },
+ { MDDS_ASCII("Lima"), 10391000 },
+ { MDDS_ASCII("London"), 9046000 },
+ { MDDS_ASCII("Los Angeles"), 12458000 },
+ { MDDS_ASCII("Luanda"), 7774000 },
+ { MDDS_ASCII("Madrid"), 6497000 },
+ { MDDS_ASCII("Manila"), 13482000 },
+ { MDDS_ASCII("Mexico City"), 21581000 },
+ { MDDS_ASCII("Miami"), 6036000 },
+ { MDDS_ASCII("Moscow"), 12410000 },
+ { MDDS_ASCII("Mumbai"), 19980000 },
+ { MDDS_ASCII("Nagoya"), 9507000 },
+ { MDDS_ASCII("Nanjing"), 8245000 },
+ { MDDS_ASCII("New York City"), 18819000 },
+ { MDDS_ASCII("Osaka"), 19281000 },
+ { MDDS_ASCII("Paris"), 10901000 },
+ { MDDS_ASCII("Philadelphia"), 5695000 },
+ { MDDS_ASCII("Pune"), 6276000 },
+ { MDDS_ASCII("Qingdao"), 5381000 },
+ { MDDS_ASCII("Rio de Janeiro"), 13293000 },
+ { MDDS_ASCII("Riyadh"), 6907000 },
+ { MDDS_ASCII("Saint Petersburg"), 5383000 },
+ { MDDS_ASCII("Santiago"), 6680000 },
+ { MDDS_ASCII("Sao Paulo"), 21650000 },
+ { MDDS_ASCII("Seoul"), 9963000 },
+ { MDDS_ASCII("Shanghai"), 25582000 },
+ { MDDS_ASCII("Shenyang"), 6921000 },
+ { MDDS_ASCII("Shenzhen"), 11908000 },
+ { MDDS_ASCII("Singapore"), 5792000 },
+ { MDDS_ASCII("Surat"), 6564000 },
+ { MDDS_ASCII("Suzhou"), 6339000 },
+ { MDDS_ASCII("Tehran"), 8896000 },
+ { MDDS_ASCII("Tianjin"), 13215000 },
+ { MDDS_ASCII("Tokyo"), 37400068 },
+ { MDDS_ASCII("Toronto"), 6082000 },
+ { MDDS_ASCII("Washington, D.C."), 5207000 },
+ { MDDS_ASCII("Wuhan"), 8176000 },
+ { MDDS_ASCII("Xi'an"), 7444000 },
+ { MDDS_ASCII("Yangon"), 5157000 },
+ };
+
+It's a bit long as it contains entries for 81 cities. We are then going to
+create an instance of the :cpp:class:`~mdds::packed_trie_map` class directly::
+
+ map_type cities(entries.data(), entries.size());
+
+Let's print the size of the container to make sure the container has been
+successfully populated::
+
+ cout << "Number of cities: " << cities.size() << endl;
+
+You will see the following output:
+
+.. code-block:: none
+
+ Number of cities: 81
+
+if the container has been successfully populated. Now, let's run a prefix
+search on names beginning with an 'S'::
+
+ cout << "Cities that begin with 'S':" << endl;
+ auto results = cities.prefix_search("S");
+ for (const auto& city : results)
+ cout << " * " << city.first << ": " << city.second << endl;
+
+to make sure you get the following ten cities and their populations as the
+output:
+
+.. code-block:: none
+
+ Cities that begin with 'S':
+ * Saint Petersburg: 5383000
+ * Santiago: 6680000
+ * Sao Paulo: 21650000
+ * Seoul: 9963000
+ * Shanghai: 25582000
+ * Shenyang: 6921000
+ * Shenzhen: 11908000
+ * Singapore: 5792000
+ * Surat: 6564000
+ * Suzhou: 6339000
+
+So far so good. Next, we will use the :cpp:func:`~mdds::packed_trie_map::save_state`
+method to dump the internal state of this container to a file named **cities.bin**::
+
+ std::ofstream outfile("cities.bin", std::ios::binary);
+ cities.save_state(outfile);
+
+This will create a file named **cities.bin** which contains a binary blob
+representing the content of this container in the current working directory.
+Run the ``ls -l cities.bin`` command to make sure the file has been created:
+
+.. code-block:: none
+
+ -rw-r--r-- 1 kohei kohei 17713 Jun 20 12:49 cities.bin
+
+Now that the state of the container has been fully serialized to a file, let's
+work on restoring its content in another, brand-new instance of
+:cpp:class:`~mdds::packed_trie_map`.
+
+::
+
+ map_type cities_loaded;
+
+ std::ifstream infile("cities.bin", std::ios::binary);
+ cities_loaded.load_state(infile);
+
+Here, we used the :cpp:func:`~mdds::packed_trie_map::load_state` method to
+restore the state from the file we have previously created. Let's make sure
+that this new instance has content equivalent to that of the original::
+
+ cout << "Equal to the original? " << std::boolalpha << (cities == cities_loaded) << endl;
+
+If you see the following output:
+
+.. code-block:: none
+
+ Equal to the original? true
+
+then this new instance has equivalent contant as the original one. Let's also
+make sure that it contains the same number of entries as the original::
+
+ cout << "Number of cities: " << cities_loaded.size() << endl;
+
+Hopefully you will see the following output:
+
+.. code-block:: none
+
+ Number of cities: 81
+
+Lastly, let's run on this new instance the same prefix search we did on the
+original instance, to make sure we still get the same results::
+
+ cout << "Cities that begin with 'S':" << endl;
+ auto results = cities_loaded.prefix_search("S");
+ for (const auto& city : results)
+ cout << " * " << city.first << ": " << city.second << endl;
+
+You should see the following output:
+
+.. code-block:: none
+
+ Cities that begin with 'S':
+ * Saint Petersburg: 5383000
+ * Santiago: 6680000
+ * Sao Paulo: 21650000
+ * Seoul: 9963000
+ * Shanghai: 25582000
+ * Shenyang: 6921000
+ * Shenzhen: 11908000
+ * Singapore: 5792000
+ * Surat: 6564000
+ * Suzhou: 6339000
+
+which is the same output we saw in the first prefix search.
+
+The complete source code for this example is found
+`here <https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_state_int.cpp>`__.
+
+
+Saving Packed Trie Map with custom value type
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the previos example, you didn't have to explicitly specify the serializer type
+to the :cpp:func:`~mdds::packed_trie_map::save_state` and
+:cpp:func:`~mdds::packed_trie_map::load_state` methods, even though these two
+methods require the serializer type as their template arguments. That's because
+the library provides default serializer types for
+
+* numeric value types i.e. integers, float and double,
+* ``std::string``, and
+* the standard sequence types, such as ``std::vector``, whose elements are of
+ numeric value types,
+
+and the previous example used ``int`` as the value type.
+
+In this section, we are going to illustrate how you can write your own custom
+serializer to allow serialization of your own custom value type. In this example,
+we are going to use `the list of presidents of the United States
+<https://en.wikipedia.org/wiki/List_of_presidents_of_the_United_States>`__,
+with the names of the presidents as the keys, and their years of inauguration
+and political affiliations as the values.
+
+We will use the following structure to store the values::
+
+ enum affiliated_party_t : uint8_t
+ {
+ unaffiliated = 0,
+ federalist,
+ democratic_republican,
+ democratic,
+ whig,
+ republican,
+ national_union,
+ republican_national_union,
+ };
+
+ struct us_president
+ {
+ uint16_t year;
+ affiliated_party_t party;
+ };
+
+Each entry stores the year as a 16-bit integer and the affiliated party as an enum
+value of 8-bit width.
+
+Next, let's define the container type::
+
+ using map_type = mdds::packed_trie_map<mdds::trie::std_string_traits, us_president>;
+
+As with the previous example, the first step is to define the entries that are
+sorted by the keys, which in this case are the president's names::
+
+ std::vector<map_type::entry> entries =
+ {
+ { MDDS_ASCII("Abraham Lincoln"), { 1861, republican_national_union } },
+ { MDDS_ASCII("Andrew Jackson"), { 1829, democratic } },
+ { MDDS_ASCII("Andrew Johnson"), { 1865, national_union } },
+ { MDDS_ASCII("Barack Obama"), { 2009, democratic } },
+ { MDDS_ASCII("Benjamin Harrison"), { 1889, republican } },
+ { MDDS_ASCII("Bill Clinton"), { 1993, democratic } },
+ { MDDS_ASCII("Calvin Coolidge"), { 1923, republican } },
+ { MDDS_ASCII("Chester A. Arthur"), { 1881, republican } },
+ { MDDS_ASCII("Donald Trump"), { 2017, republican } },
+ { MDDS_ASCII("Dwight D. Eisenhower"), { 1953, republican } },
+ { MDDS_ASCII("Franklin D. Roosevelt"), { 1933, democratic } },
+ { MDDS_ASCII("Franklin Pierce"), { 1853, democratic } },
+ { MDDS_ASCII("George H. W. Bush"), { 1989, republican } },
+ { MDDS_ASCII("George W. Bush"), { 2001, republican } },
+ { MDDS_ASCII("George Washington"), { 1789, unaffiliated } },
+ { MDDS_ASCII("Gerald Ford"), { 1974, republican } },
+ { MDDS_ASCII("Grover Cleveland 1"), { 1885, democratic } },
+ { MDDS_ASCII("Grover Cleveland 2"), { 1893, democratic } },
+ { MDDS_ASCII("Harry S. Truman"), { 1945, democratic } },
+ { MDDS_ASCII("Herbert Hoover"), { 1929, republican } },
+ { MDDS_ASCII("James A. Garfield"), { 1881, republican } },
+ { MDDS_ASCII("James Buchanan"), { 1857, democratic } },
+ { MDDS_ASCII("James K. Polk"), { 1845, democratic } },
+ { MDDS_ASCII("James Madison"), { 1809, democratic_republican } },
+ { MDDS_ASCII("James Monroe"), { 1817, democratic_republican } },
+ { MDDS_ASCII("Jimmy Carter"), { 1977, democratic } },
+ { MDDS_ASCII("John Adams"), { 1797, federalist } },
+ { MDDS_ASCII("John F. Kennedy"), { 1961, democratic } },
+ { MDDS_ASCII("John Quincy Adams"), { 1825, democratic_republican } },
+ { MDDS_ASCII("John Tyler"), { 1841, whig } },
+ { MDDS_ASCII("Lyndon B. Johnson"), { 1963, democratic } },
+ { MDDS_ASCII("Martin Van Buren"), { 1837, democratic } },
+ { MDDS_ASCII("Millard Fillmore"), { 1850, whig } },
+ { MDDS_ASCII("Richard Nixon"), { 1969, republican } },
+ { MDDS_ASCII("Ronald Reagan"), { 1981, republican } },
+ { MDDS_ASCII("Rutherford B. Hayes"), { 1877, republican } },
+ { MDDS_ASCII("Theodore Roosevelt"), { 1901, republican } },
+ { MDDS_ASCII("Thomas Jefferson"), { 1801, democratic_republican } },
+ { MDDS_ASCII("Ulysses S. Grant"), { 1869, republican } },
+ { MDDS_ASCII("Warren G. Harding"), { 1921, republican } },
+ { MDDS_ASCII("William Henry Harrison"), { 1841, whig } },
+ { MDDS_ASCII("William Howard Taft"), { 1909, republican } },
+ { MDDS_ASCII("William McKinley"), { 1897, republican } },
+ { MDDS_ASCII("Woodrow Wilson"), { 1913, democratic } },
+ { MDDS_ASCII("Zachary Taylor"), { 1849, whig } },
+ };
+
+Note that we need to add numeric suffixes to the entries for Grover Cleveland,
+who became president twice in two separate periods, in order to make the keys
+for his entries unique.
+
+Now, proceed to create an instance of :cpp:class:`~mdds::packed_trie_map`::
+
+ map_type us_presidents(entries.data(), entries.size());
+
+and inspect its size to make sure it is instantiated properly::
+
+ cout << "Number of entries: " << us_presidents.size() << endl;
+
+You should see the following output:
+
+.. code-block:: none
+
+ Number of entries: 45
+
+Before we proceed to save the state of this instance, let's define the custom
+serializer type first::
+
+ struct us_president_serializer
+ {
+ union bin_buffer
+ {
+ char buffer[2];
+ uint16_t i16;
+ affiliated_party_t party;
+ };
+
+ static constexpr bool variable_size = false;
+ static constexpr size_t value_size = 3;
+
+ static void write(std::ostream& os, const us_president& v)
+ {
+ bin_buffer buf;
+
+ // Write the year value first.
+ buf.i16 = v.year;
+ os.write(buf.buffer, 2);
+
+ // Write the affiliated party value.
+ buf.party = v.party;
+ os.write(buf.buffer, 1);
+ }
+
+ static void read(std::istream& is, size_t n, us_president& v)
+ {
+ // For a fixed-size value type, this should equal the defined value size.
+ assert(n == 3);
+
+ bin_buffer buf;
+
+ // Read the year value.
+ is.read(buf.buffer, 2);
+ v.year = buf.i16;
+
+ // Read the affiliated party value.
+ is.read(buf.buffer, 1);
+ v.party = buf.party;
+ }
+ };
+
+A custom value type can be either variable-size or fixed-size. For a variable-size
+value type, each value segment is preceded by the byte length of that segment.
+For a fixed-size value type, the byte length of all of the value segments
+is written only once up-front, followed by one or more value segments of equal
+byte length.
+
+Since the value type in this example is fixed-size, we set the value of the
+``variable_size`` static constant to false, and define the size of the value to 3 (bytes)
+as the ``value_size`` static constant. Keep in mind that you need to define
+the ``value_size`` constant *only* for fixed-size value types; if your value
+type is variable-size, you can leave it out.
+
+Additionally, you need to define two static methods - one for writing to the
+output stream, and one for reading from the input stream. The write method
+must have the following signature::
+
+ static void write(std::ostream& os, const T& v);
+
+where the ``T`` is the value type. In the body of this method you write to the
+output stream the bytes that represent the value. The length of the bytes you
+write must match the size specified by the ``value_size`` constant.
+
+The read method must have the following signature::
+
+ static void read(std::istream& is, size_t n, T& v);
+
+where the ``T`` is the value type, and the ``n`` specifies the length of the
+bytes you need to read for the value. For a fixed-size value type, the value
+of ``n`` should equal the ``value_size`` constant. Your job is to read the
+bytes off of the input stream for the length specified by the ``n``, and
+populate the value instance passed to the method as the third argument.
+
+Now that we have defined the custom serializer type, let's proceed to save the
+state to a file::
+
+ std::ofstream outfile("us-presidents.bin", std::ios::binary);
+ us_presidents.save_state<us_president_serializer>(outfile);
+
+This time around, we are specifying the serializer type explicitly as the template
+argument to the :cpp:func:`~mdds::packed_trie_map::save_state` method. Otherwise
+it is no different than what we did in the previous example.
+
+Let's create another instance of :cpp:class:`~mdds::packed_trie_map` and restore
+the state back from the file we just created::
+
+ map_type us_presidents_loaded;
+
+ std::ifstream infile("us-presidents.bin", std::ios::binary);
+ us_presidents_loaded.load_state<us_president_serializer>(infile);
+
+Once again, aside from explicitly specifying the serializer type as the template
+argument to the :cpp:func:`~mdds::packed_trie_map::load_state` method, it is
+identical to the way we did in the previous example.
+
+Let's compare the new instance with the old one to see if the two are equal::
+
+ cout << "Equal to the original? " << std::boolalpha << (us_presidents == us_presidents_loaded) << endl;
+
+The output says:
+
+.. code-block:: none
+
+ Equal to the original? true
+
+They are. While we are at it, let's run a simple prefix search to find out
+all the US presidents whose first name is 'John'::
+
+ cout << "Presidents whose first name is 'John':" << endl;
+ auto results = us_presidents_loaded.prefix_search("John");
+ for (const auto& entry : results)
+ cout << " * " << entry.first << " (" << entry.second.year << "; " << entry.second.party << ")" << endl;
+
+Here is the output:
+
+.. code-block:: none
+
+ Presidents whose first name is 'John':
+ * John Adams (1797; Federalist)
+ * John F. Kennedy (1961; Democratic)
+ * John Quincy Adams (1825; Democratic Republican)
+ * John Tyler (1841; Whig)
+
+This looks like the correct results!
+
+You can find the complete source code for this example `here
+<https://gitlab.com/mdds/mdds/-/blob/master/example/packed_trie_state_custom.cpp>`__.
+
+
+API Reference
+-------------
+
+Trie Map
+^^^^^^^^
+
+.. doxygenclass:: mdds::trie_map
+ :members:
+
+
+Packed Trie Map
+^^^^^^^^^^^^^^^
+
+.. doxygenclass:: mdds::packed_trie_map
+ :members:
+
+
+Traits
+^^^^^^
+
+.. doxygenstruct:: mdds::trie::std_container_traits
+ :members:
+
+.. doxygentypedef:: mdds::trie::std_string_traits
+
+
+Value Serializers
+^^^^^^^^^^^^^^^^^
+
+.. doxygenstruct:: mdds::trie::value_serializer
+ :members:
+
+.. doxygenstruct:: mdds::trie::numeric_value_serializer
+ :members:
+
+.. doxygenstruct:: mdds::trie::variable_value_serializer
+ :members:
+
+.. doxygenstruct:: mdds::trie::numeric_sequence_value_serializer
+ :members: