1 files changed, 320 insertions, 0 deletions
diff --git a/source/concepts/multi_ruleset.rst b/source/concepts/multi_ruleset.rst
new file mode 100644
index 0000000..512558e
--- /dev/null
+++ b/source/concepts/multi_ruleset.rst
@@ -0,0 +1,320 @@
+Multiple Rulesets in rsyslog
+============================
+
+Starting with version 4.5.0 and 5.1.1,
+`rsyslog <http://www.rsyslog.com>`_ supports multiple rulesets within a
+single configuration. This is especially useful for routing the
+reception of remote messages to a set of specific rules. Note that the
+input module must support binding to non-standard rulesets, so the
+functionality may not be available with all inputs.
+
+In this document, I am using :doc:`imtcp <../configuration/modules/imtcp>`, an input module that
+supports binding to non-standard rulesets since rsyslog started to
+support them.
+
+What is a Ruleset?
+------------------
+
+If you have worked with (r)syslog.conf, you know that it is made up of
+what I call rules (others tend to call them selectors, a sysklogd term).
+Each rule consist of a filter and one or more actions to be carried out
+when the filter evaluates to true. A filter may be as simple as a
+traditional syslog priority based filter (like "\*.\*" or "mail.info" or
+a as complex as a script-like expression. Details on that are covered in
+the config file documentation. After the filter come action specifiers,
+and an action is something that does something to a message, e.g. write
+it to a file or forward it to a remote logging server.
+
+A traditional configuration file is made up of one or more of these
+rules. When a new message arrives, its processing starts with the first
+rule (in order of appearance in rsyslog.conf) and continues for each
+rule until either all rules have been processed or a so-called "discard"
+action happens, in which case processing stops and the message is thrown
+away (what also happens after the last rule has been processed).
+
+The **multi-ruleset** support now permits to specify more than one such
+rule sequence. You can think of a traditional config file just as a
+single default rule set, which is automatically bound to each of the
+inputs. This is even what actually happens. When rsyslog.conf is
+processed, the config file parser looks for the directive
+
+::
+
+    ruleset(name="rulesetname")
+
+Where name is any name the user likes (but must not start with
+"RSYSLOG\_", which is the name space reserved for rsyslog use). If it
+finds this directive, it begins a new rule set (if the name was not yet
+know) or switches to an already-existing one (if the name was known).
+All rules defined between this $RuleSet directive and the next one are
+appended to the named ruleset. Note that the reserved name
+"RSYSLOG\_DefaultRuleset" is used to specify rsyslogd's default ruleset.
+You can use that name wherever you can use a ruleset name, including
+when binding an input to it.
+
+Inside a ruleset, messages are processed as described above: they start
+with the first rule and rules are processed in the order of appearance
+of the configuration file until either there are no more rules or the
+discard action is executed. Note that with multiple rulesets no longer
+**all** rsyslog.conf rules are executed but **only** those that are
+contained within the specific ruleset.
+
+Inputs must explicitly bind to rulesets. If they don't, the default
+ruleset is bound.
+
+This brings up the next question:
+
+What does "To bind to a Ruleset" mean?
+--------------------------------------
+
+This term is used in the same sense as "to bind an IP address to an
+interface": it means that a specific input, or part of an input (like a
+tcp listener) will use a specific ruleset to "pass its messages to". So
+when a new message arrives, it will be processed via the bound ruleset.
+Rules from all other rulesets are irrelevant and will never be processed.
+
+This makes multiple rulesets very handy to process local and remote
+message via separate means: bind the respective receivers to different
+rule sets, and you do not need to separate the messages by any other
+method.
+
+Binding to rulesets is input-specific. For imtcp, this is done via the
+
+::
+
+    input(type="imptcp" port="514" ruleset="rulesetname")
+
+directive. Note that "rulesetname" must be the name of a ruleset that is
+already defined at the time the bind directive is given. There are many
+ways to make sure this happens, but I personally think that it is best
+to define all rule sets at the top of rsyslog.conf and define the inputs
+at the bottom. This kind of reverses the traditional recommended
+ordering, but seems to be a really useful and straightforward way of
+doing things.
+
+Why are rulesets important for different parser configurations?
+---------------------------------------------------------------
+
+Custom message parsers, used to handle different (and potentially
+otherwise-invalid) message formats, can be bound to rulesets. So
+multiple rulesets can be a very useful way to handle devices sending
+messages in different malformed formats in a consistent way.
+Unfortunately, this is not uncommon in the syslog world. An in-depth
+explanation with configuration sample can be found at the
+:doc:`$RulesetParser <../configuration/ruleset/rsconf1_rulesetparser>` configuration directive.
+
+Can I use a different Ruleset as the default?
+---------------------------------------------
+
+This is possible by using the
+
+::
+
+    $DefaultRuleset <name>
+
+Directive. Please note, however, that this directive is actually global:
+that is, it does not modify the ruleset to which the next input is bound
+but rather provides a system-wide default rule set for those inputs that
+did not explicitly bind to one. As such, the directive can not be used
+as a work-around to bind inputs to non-default rulesets that do not
+support ruleset binding.
+
+Rulesets and Queues
+-------------------
+
+By default, rulesets do not have their own queue. It must be activated
+via the $RulesetCreateMainQueue directive, or if using rainerscript
+format, by specifying queue parameters on the ruleset directive, e.g.
+
+::
+
+   ruleset(name="whatever" queue.type="fixedArray" queue. ...)
+
+See :doc:`http://www.rsyslog.com/doc/master/rainerscript/queue\_parameters.html <../rainerscript/queue_parameters>`
+for more details.
+
+Please note that when a ruleset uses its own queue, processing of the ruleset
+happens **asynchronously** to the rest of processing. As such, any modifications
+made to the message object (e.g. message or local variables that are set) or
+discarding of the message object **have no effect outside that ruleset**. So
+if you want to modify the message object inside the ruleset, you **cannot**
+define a queue for it. Most importantly, you cannot call it and expect the
+modified properties to be present when the call returns. Even more so, the
+call will most probably return before the message is even begun to be processed
+by the ruleset in question.
+
+Note that in RainerScript format specifying any "queue.\*" can cause the
+creation of a dedicated queue and as such asynchronous processing. This is
+because queue parameters cannot be specified without a queue. Note, though,
+that the actual creation is **guaranteed** only if "queue.type" is specified
+as above. So if you intentionally want to assign a separate queue to the
+ruleset, do so as shown above.
+
+Examples
+--------
+
+Split local and remote logging
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Let's say you have a pretty standard system that logs its local messages
+to the usual bunch of files that are specified in the default
+rsyslog.conf. As an example, your rsyslog.conf might look like this:
+
+::
+
+    # ... module loading ...
+    # The authpriv file has restricted access.
+    authpriv.*  /var/log/secure
+    # Log all the mail messages in one place.
+    mail.*      /var/log/maillog
+    # Log cron stuff
+    cron.*      /var/log/cron
+    # Everybody gets emergency messages
+    *.emerg     *
+    ... more ...
+
+Now, you want to add receive messages from a remote system and log these
+to a special file, but you do not want to have these messages written to
+the files specified above. The traditional approach is to add a rule in
+front of all others that filters on the message, processes it and then
+discards it:
+
+::
+
+    # ... module loading ...
+    # process remote messages
+    if $fromhost-ip == '192.0.2.1' then {
+            action(type="omfile" file="/var/log/remotefile02")
+            stop
+        }
+
+
+    # only messages not from 192.0.2.1 make it past this point
+
+    # The authpriv file has restricted access.
+    authpriv.*                            /var/log/secure
+    # Log all the mail messages in one place.
+    mail.*                                /var/log/maillog
+    # Log cron stuff
+    cron.*                                /var/log/cron
+    # Everybody gets emergency messages
+    *.emerg                               *
+    ... more ...
+
+Note that "stop" is the discard action!. Also note that we assume that
+192.0.2.1 is the sole remote sender (to keep it simple).
+
+With multiple rulesets, we can simply define a dedicated ruleset for the
+remote reception case and bind it to the receiver. This may be written
+as follows:
+
+::
+
+    # ... module loading ...
+    # process remote messages
+    # define new ruleset and add rules to it:
+    ruleset(name="remote"){
+        action(type="omfile" file="/var/log/remotefile")
+    }
+    # only messages not from 192.0.2.1 make it past this point
+
+    # bind ruleset to tcp listener and activate it:
+    input(type="imptcp" port="10514" ruleset="remote")
+
+Split local and remote logging for three different ports
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This example is almost like the first one, but it extends it a little
+bit. While it is very similar, I hope it is different enough to provide
+a useful example why you may want to have more than two rulesets.
+
+Again, we would like to use the "regular" log files for local logging,
+only. But this time we set up three syslog/tcp listeners, each one
+listening to a different port (in this example 10514, 10515, and 10516).
+Logs received from these receivers shall go into different files. Also,
+logs received from 10516 (and only from that port!) with "mail.\*"
+priority, shall be written into a specific file and **not** be written to
+10516's general log file.
+
+This is the config:
+
+::
+
+    # ... module loading ...
+    # process remote messages
+
+    ruleset(name="remote10514"){
+        action(type="omfile" file="/var/log/remote10514")
+    }
+
+    ruleset(name="remote10515"){
+        action(type="omfile" file="/var/log/remote10515")
+    }
+
+    ruleset(name="remote10516"){
+        if prifilt("mail.*") then {
+            /var/log/mail10516
+            stop
+            # note that the stop-command will prevent this message from 
+            # being written to the remote10516 file - as usual...   
+        }
+        /var/log/remote10516
+    }
+
+
+    # and now define listeners bound to the relevant ruleset
+    input(type="imptcp" port="10514" ruleset="remote10514")
+    input(type="imptcp" port="10515" ruleset="remote10515")
+    input(type="imptcp" port="10516" ruleset="remote10516")
+
+Performance
+-----------
+
+Fewer Filters
+~~~~~~~~~~~~~
+
+No rule processing can be faster than not processing a rule at all. As
+such, it is useful for a high performance system to identify disjunct
+actions and try to split these off to different rule sets. In the
+example section, we had a case where three different tcp listeners need
+to write to three different files. This is a perfect example of where
+multiple rule sets are easier to use and offer more performance. The
+performance is better simply because there is no need to check the
+reception service - instead messages are automatically pushed to the
+right rule set and can be processed by very simple rules (maybe even
+with "\*.\*"-filters, the fastest ones available).
+
+Partitioning of Input Data
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Starting with rsyslog 5.3.4, rulesets permit higher concurrency. They
+offer the ability to run on their own "main" queue. What that means is
+that a own queue is associated with a specific rule set. That means that
+inputs bound to that ruleset do no longer need to compete with each
+other when they enqueue a data element into the queue. Instead, enqueue
+operations can be completed in parallel.
+
+An example: let us assume we have three TCP listeners. Without rulesets,
+each of them needs to insert messages into the main message queue. So if
+each of them wants to submit a newly arrived message into the queue at
+the same time, only one can do so while the others need to wait. With
+multiple rulesets, its own queue can be created for each ruleset. If now
+each listener is bound to its own ruleset, concurrent message submission
+is possible. On a machine with a sufficiently large number of cores,
+this can result in dramatic performance improvement.
+
+It is highly advised that high-performance systems define a dedicated
+ruleset, with a dedicated queue for each of the inputs.
+
+By default, rulesets do **not** have their own queue. It must be
+activated via the
+:doc:`$RulesetCreateMainQueue <../configuration/ruleset/rsconf1_rulesetcreatemainqueue>`
+directive.
+
+See Also
+--------
+.. toctree::
+   :maxdepth: 1
+   
+   ../historical/multi_ruleset_legacy_format_samples
+