diff options
Diffstat (limited to 'source/configuration/lookup_tables.rst')
-rw-r--r-- | source/configuration/lookup_tables.rst | 251 |
1 files changed, 251 insertions, 0 deletions
diff --git a/source/configuration/lookup_tables.rst b/source/configuration/lookup_tables.rst new file mode 100644 index 0000000..b673022 --- /dev/null +++ b/source/configuration/lookup_tables.rst @@ -0,0 +1,251 @@ +Lookup Tables +============= + +Lookup tables are a powerful construct to obtain *class* information based +on message content. It works on top of a data-file which maps key (to be looked +up) to value (the result of lookup). + +The idea is to use a message properties (or derivatives of it) as an index +into a table which then returns another value. For example, $fromhost-ip +could be used as an index, with the table value representing the type of +server or the department or remote office it is located in. + +This can be emulated using if and else-if stack, but implementing it as a +dedicated component allows ``lookup`` to be made fast. + +The lookup tables itself exists in a separate data file (one per +table). This file is loaded on Rsyslog startup and when a reload is requested. + +There are different types of lookup tables (identified by "type" field in json data-file). + +Types +^^^^^ + +string +------ + +The key to be looked up is an arbitrary string. + +**Match criterion**: The key must be exactly equal to index from one of the entries. + +array +----- + +The value to be looked up is an integer number from a consecutive set. +The set does not need to start at zero or one, but there must be no number missing. +So, for example ``5,6,7,8,9`` would be a valid set of index values, while ``1,2,4,5`` would +not be (due to missing ``3``). + +**Match criterion**: Looked-up number(key) must be exactly equal to index from one of the entries. + +sparseArray +----------- + +The value to be looked up is an integer value, but there may be gaps inside the +set of values (usually there are large gaps). A typical use case would be the +matching of IPv4 address information. + +**Match criterion**: A match happens on the first index that is less than or equal to the looked-up key. + +Note that index integer numbers are represented by unsigned 32 bits. + + +Lookup Table File Format +^^^^^^^^^^^^^^^^^^^^^^^^ + +Lookup table files contain a single JSON object. This object consists of a header and a table part. + +**Header** + +The header is the top-level json. +It has parameters "version", "nomatch", and "type". + +Parameters: + **version** <number, default: 1> : Version of table-definition format (so improvements in the future can be managed in a backward compatible way). + + **nomatch** <string literal, default: ""> : Value to be returned for a lookup when match fails. + + **type** <*string*, *array* or *sparseArray*, default: *string*> : Type of lookup-table (controls how matches are performed). + +**Table** + +This must be an array of elements, even if only a single value exists (for obvious reasons, +we do not expect this to occur often). Each array element must contain two fields "index" +and "value". + +This is a sample of how an ip-to-office mapping may look like: + +:: + + { "version" : 1, + "nomatch" : "unk", + "type" : "string", + "table" : [ + {"index" : "10.0.1.1", "value" : "A" }, + {"index" : "10.0.1.2", "value" : "A" }, + {"index" : "10.0.1.3", "value" : "A" }, + {"index" : "10.0.2.1", "value" : "B" }, + {"index" : "10.0.2.2", "value" : "B" }, + {"index" : "10.0.2.3", "value" : "B" }]} + + +Note: In the example above, if a different IP comes in, the value "unk" is returned thanks to the nomatch parameter in the first line. + +Lookup tables can be accessed via the ``lookup()`` built-in function. A common usage pattern is to set a local variable to the lookup result and later use that variable in templates. + + + +Lookup-table configuration +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Lookup-table configuration involves a **two part setup** (definition and usage(lookup)), with an optional third part, +which allows reloading table using internal trigger. + +lookup_table(name="<table>" file="</path/to/file>"...) (object) +--------------------------------------------------------------- + +**Defines** the table(identified by the table-name) and allows user to set some properties that control behavior of the table. + +:: + + lookup_table(name="msg_per_host") + +Parameters: + **name** <string literal, mandatory> : Name of the table. + + **file** <string literal, file path, mandatory> : Path to external json database file. + + **reloadOnHUP** <on|off, default: on> : Whether or not table should be reloaded when process receives HUP signal. + +A definition setting all the parameters looks like: + +:: + + lookup_table(name="host_bu" file="/var/lib/host_billing_unit_mapping.json" reloadOnHUP="on") + + +lookup("<table>", <expr>) (function) +------------------------------------ + +**Looks up** and returns the value that is associated with the given key (passed as <variable>) +in lookup table identified by table-name. If no match is found (according to table-type +matching-criteria specified above), the "nomatch" string is returned (or an empty string if it is not defined). + +Parameters: + **name** <string literal, mandatory> : Name of the table. + + **expr** <expression resulting in string or number according to lookup-table type, mandatory> : Key to be looked up. + +A ``lookup`` call looks like: + +:: + + set $.bu = lookup("host_bu", $hostname); + + if ($.bu == "unknown") then { + .... + } + +Some examples of different match/no-match scenarios: + +**string table**: + +:: + + { "nomatch" : "none", + "type" : "string", + "table":[ + {"index" : "foo", "value" : "bar" }, + {"index" : "baz", "value" : "quux" }]} + +Match/no-Match behaviour: + +====== ============== +key return +====== ============== +foo bar +baz quux +corge none +====== ============== + +**array table**: + +:: + + { "nomatch" : "nothing", + "type" : "array", + "table":[ + {"index" : 9, "value" : "foo" }, + {"index" : 10, "value" : "bar" }, + {"index" : 11, "value" : "baz" }]} + +Match/no-Match behaviour: + +====== ============== +key return +====== ============== +9 foo +11 baz +15 nothing +0 nothing +====== ============== + +**sparseArray table**: + +:: + + { "nomatch" : "no_num", + "type" : "sparseArray", + "table":[ + {"index" : "9", "value" : "foo" }, + {"index" : "11", "value" : "baz" }]} + +Match/no-Match behaviour: + +====== ============== +key return +====== ============== +8 no_num +9 foo +10 foo +11 baz +12 baz +100 baz +====== ============== + + +reload_lookup_table("<table>", "<stub value>") (statement) +---------------------------------------------------------- + +**Reloads** lookup table identified by given table name **asynchronously** (by internal trigger, as opposed to HUP). + +This statement isn't always useful. It needs to be used only when lookup-table-reload needs to be triggered in response to +a message. + +Messages will continue to be processed while table is asynchronously reloaded. + +Note: For performance reasons, message that triggers reload should be accepted only from a trusted source. + +Parameters: + **name** <string literal, mandatory> : Name of the table. + + **stub value** <string literal, optional> : Value to stub the table in-case reload-attempt fails. + +A ``reload_lookup_table`` invocation looks like: + +:: + + if ($.do_reload == "y") then { + reload_lookup_table("host_bu", "unknown") + } + + +Implementation Details +^^^^^^^^^^^^^^^^^^^^^^ + +The lookup table functionality is implemented via efficient algorithms. + +The string and sparseArray lookup have O(log(n)) time complexity, while array lookup is O(1). + +To preserve space and, more important, increase cache hit performance, equal data values are only stored once, +no matter how often a lookup index points to them. |