diff options
Diffstat (limited to '')
-rw-r--r-- | src/libnetdata/adaptive_resortable_list/README.md | 103 |
1 files changed, 103 insertions, 0 deletions
diff --git a/src/libnetdata/adaptive_resortable_list/README.md b/src/libnetdata/adaptive_resortable_list/README.md new file mode 100644 index 000000000..9aa864c9e --- /dev/null +++ b/src/libnetdata/adaptive_resortable_list/README.md @@ -0,0 +1,103 @@ +<!-- +title: "Adaptive Re-sortable List (ARL)" +custom_edit_url: https://github.com/netdata/netdata/edit/master/src/libnetdata/adaptive_resortable_list/README.md +sidebar_label: "Adaptive Re-sortable List (ARL)" +learn_status: "Published" +learn_topic_type: "Tasks" +learn_rel_path: "Developers/libnetdata" +--> + +# Adaptive Re-sortable List (ARL) + +This library allows Netdata to read a series of `name - value` pairs +in the **fastest possible way**. + +ARLs are used all over Netdata, as they are the most +CPU utilization efficient way to process `/proc` files. They are used to +process both vertical (csv like) and horizontal (one pair per line) `name - value` pairs. + +## How ARL works + +It maintains a linked list of all `NAME` (keywords), sorted in the +order found in the data source. The linked list is kept +sorted at all times - the data source may change at any time, the +linked list will adapt at the next iteration. + +### Initialization + +During initialization (just once), the caller: + +- calls `arl_create()` to create the ARL + +- calls `arl_expect()` multiple times to register the expected keywords + +The library will call the `processor()` function (given to +`arl_create()`), for each expected keyword found. +The default `processor()` expects `dst` to be an `unsigned long long *`. + +Each `name` keyword may have a different `processor()` (by calling +`arl_expect_custom()` instead of `arl_expect()`). + +### Data collection iterations + +For each iteration through the data source, the caller: + +- calls `arl_begin()` to initiate a data collection iteration. + This is to be called just ONCE every time the source is re-evaluated. + +- calls `arl_check()` for each entry read from the file. + +### Cleanup + +When the caller exits: + +- calls `arl_free()` to destroy this and free all memory. + +### Performance + +ARL maintains a list of `name` keywords found in the data source (even the ones +that are not useful for data collection). + +If the data source maintains the same order on the `name-value` pairs, for each +each call to `arl_check()` only an `strcmp()` is executed to verify the +expected order has not changed, a counter is incremented and a pointer is changed. +So, if the data source has 100 `name-value` pairs, and their order remains constant +over time, 100 successful `strcmp()` are executed. + +In the unlikely event that an iteration sees the data source with a different order, +for each out-of-order keyword, a full search of the remaining keywords is made. But +this search uses 32bit hashes, not string comparisons, so it should also be fast. + +When all expectations are satisfied (even in the middle of an iteration), +the call to `arl_check()` will return 1, to signal the caller to stop the loop, +saving valuable CPU resources for the rest of the data source. + +In the following test we used alternative methods to process, **1M times**, +a data source like `/proc/meminfo`, already tokenized, in memory, +to extract the same number of expected metrics: + +|test|code|string comparison|number parsing|duration| +|:--:|:--:|:---------------:|:------------:|:------:| +|1|if-else-if-else-if|`strcmp()`|`strtoull()`|4630.337 ms| +|2|nested loops|inline `simple_hash()` and `strcmp()`|`strtoull()`|1597.481 ms| +|3|nested loops|inline `simple_hash()` and `strcmp()`|`str2ull()`|923.523 ms| +|4|if-else-if-else-if|inline `simple_hash()` and `strcmp()`|`strtoull()`|854.574 ms| +|5|if-else-if-else-if|statement expression `simple_hash()` and `strcmp()`|`strtoull()`|912.013 ms| +|6|if-continue|inline `simple_hash()` and `strcmp()`|`strtoull()`|842.279 ms| +|7|if-else-if-else-if|inline `simple_hash()` and `strcmp()`|`str2ull()`|602.837 ms| +|8|ARL|ARL|`strtoull()`|350.360 ms| +|9|ARL|ARL|`str2ull()`|157.237 ms| + +Compared to unoptimized code (test No 1: 4.6sec): + +- before ARL Netdata was using test No **7** with hashing and a custom `str2ull()` to achieve 602ms. +- the current ARL implementation is test No **9** that needs only 157ms (29 times faster vs unoptimized code, about 4 times faster vs optimized code). + +[Check the source code of this test](https://raw.githubusercontent.com/netdata/netdata/master/tests/profile/benchmark-value-pairs.c). + +## Limitations + +Do not use ARL if the a name/keyword may appear more than once in the +source data. + + |