Diffstat (limited to '')
-rw-r--r-- | source/configuration/property_replacer.rst | 374 |
1 files changed, 374 insertions, 0 deletions
diff --git a/source/configuration/property_replacer.rst b/source/configuration/property_replacer.rst
new file mode 100644
index 0000000..3d99f5f
--- /dev/null
+++ b/source/configuration/property_replacer.rst
@@ -0,0 +1,374 @@
+The Property Replacer
+=====================
+
+**The property replacer is a core component in rsyslogd's** `string template
+system <templates.html>`_. A syslog message has a number of well-defined properties.
+Each of these properties can be accessed **and** manipulated by
+the property replacer. With it, it is easy to use only part of a
+property value or manipulate the value, e.g. by converting all
+characters to lower case.
+
+Accessing Properties
+--------------------
+
+Syslog message properties are used inside templates. They are accessed
+by putting them between percent signs. Properties can be modified by the
+property replacer. The full syntax is as follows:
+
+::
+
+  %property:fromChar:toChar:options%
+
+Available Properties
+^^^^^^^^^^^^^^^^^^^^
+
+The property replacer can use all :doc:`rsyslog properties <properties>`.
+
+Character Positions
+^^^^^^^^^^^^^^^^^^^^
+
+**FromChar** and **toChar** are used to build substrings. They
+specify the offset within the string that should be copied. Offset
+counting starts at 1, so if you need to obtain the first 2 characters of
+the message text, you can use this syntax: "%msg:1:2%". If you do not
+wish to specify from and to, but you want to specify options, you still
+need to include the colons. For example, if you would like to convert
+the full message text to lower case, use "%msg:::lowercase%". If you
+would like to extract from a position until the end of the string, you
+can place a dollar-sign ("$") in toChar (e.g. %msg:10:$%, which will
+extract from position 10 to the end of the string).
+
+There is also support for **regular expressions**. To use them, you need
+to place an "R" into FromChar. This tells rsyslog that a regular
+expression instead of position-based extraction is desired. The actual
+regular expression must then be provided in toChar. The regular
+expression **must** be followed by the string "--end". It denotes the
+end of the regular expression and will not become part of it. If you are
+using regular expressions, the property replacer will return the part of
+the property text that matches the regular expression. An example of a
+property replacer sequence with a regular expression is:
+"%msg:R:.\*Sev:. \\(.\*\\) \\[.\*--end%"
+
+It is possible to specify some parameters after the "R". These are
+comma-separated. They are:
+
+R,<regexp-type>,<submatch>,<:doc:`nomatch <nomatch>`\ >,<match-number>
+
+regexp-type is either "BRE" for Posix basic regular expressions or "ERE"
+for extended ones. The string must be given in upper case. The default
+is "BRE" to be consistent with earlier versions of rsyslog that did not
+support ERE. The submatch identifies the submatch to be used with the
+result. A single digit is supported. Match 0 is the full match, while 1
+to 9 are the actual submatches. The match-number identifies which match
+to use, if the expression occurs more than once inside the string.
+Please note that the first match is number 0, the second 1 and so on. Up
+to 10 matches (up to number 9) are supported. Please note that it would
+be more natural to have the match-number in front of submatch, but this
+would break backward-compatibility. So the match-number must be
+specified after "nomatch".
+
+:doc:`nomatch <nomatch>` specifies what should be used in
+case no match is found.
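The interplay of submatch, nomatch and match-number can be sketched in Python. This is only an illustration of the semantics described here, not rsyslog's implementation. The helper name ``regex_extract`` is hypothetical, Python's ``re`` syntax stands in for POSIX ERE (the two agree for this simple pattern), and only a nomatch mode that returns the whole property value (the FIELD mode used in this document's samples) is modelled.

```python
import re


def regex_extract(text, pattern, submatch=0, match_number=0):
    """Return submatch `submatch` of the `match_number`-th match (both 0-based).

    Hypothetical helper mimicking "R,ERE,<submatch>,FIELD,<match-number>".
    If the requested match does not exist, the full text is returned,
    which is how the FIELD nomatch mode is described.
    """
    matches = list(re.finditer(pattern, text))
    if match_number >= len(matches):
        return text  # FIELD: fall back to the complete property value
    return matches[match_number].group(submatch)


# Sample message invented for this sketch.
msg = "link up for vlan12: ok, link down for vlan7: failed"
pattern = r"for (vlan[0-9]*):"

first = regex_extract(msg, pattern, submatch=1)                   # first match
second = regex_extract(msg, pattern, submatch=1, match_number=1)  # second match
fallback = regex_extract("nothing to see", pattern, submatch=1)   # no match
```

Here submatch 0 would return the whole matched text (including ``for`` and the colon), while submatch 1 returns only the parenthesised part.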
+
+The following is a sample of an ERE expression that takes the first
+submatch from the message string and returns the full field if no
+match is found:
+
+::
+
+  %msg:R,ERE,1,FIELD:for (vlan[0-9]*):--end%
+
+and this takes the first submatch of the second match of said
+expression:
+
+::
+
+  %msg:R,ERE,1,FIELD,1:for (vlan[0-9]*):--end%
+
+**Please note: there is also a** `rsyslog regular expression
+checker/generator <http://www.rsyslog.com/tool-regex>`_ **online tool
+available.** With that tool, you can check your regular expressions and
+also generate a valid property replacer sequence. Usage of this tool is
+recommended. Depending on the version offered, the tool may not cover
+all subtleties that can be done with the property replacer. It
+concentrates on the most often used cases. So it is still useful to
+hand-craft expressions for demanding environments.
+
+**Also, extraction can be done based on so-called "fields"**. To do so,
+place an "F" into FromChar. A field in its current definition is anything
+that is delimited by a delimiter character. The delimiter by default is
+TAB (US-ASCII value 9). However, it can be changed to any other US-ASCII
+character by specifying a comma and the **decimal** US-ASCII value of
+the delimiter immediately after the "F". For example, to use comma (",")
+as a delimiter, use this field specifier: "F,44". If your syslog data
+is delimited, this is a quicker way to extract than via regular
+expressions (actually, a *much* quicker way). Field counting starts at
+1. Field zero is accepted, but will always lead to a "field not found"
+error. The same happens if a field number higher than the number of
+fields in the property is requested. The field number must be placed in
+the "ToChar" parameter. An example where the 3rd field (delimited by
+TAB) from the msg property is extracted is as follows: "%msg:F:3%". The
+same example with semicolon as delimiter is "%msg:F,59:3%".
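The field rules above (delimiter given as a decimal US-ASCII value, counting from 1, "field not found" for field zero or a too-high number) can be sketched in Python. This is a simplified model for illustration only, not rsyslog's code. The helper name ``extract_field`` is hypothetical, and ``None`` stands in for the "field not found" error.

```python
def extract_field(text, field_no, delim_code=9):
    """Return field `field_no` (1-based) of `text`, or None for "field not found".

    `delim_code` is the decimal US-ASCII value of the delimiter, as in
    "F,44" for a comma; the default 9 is TAB. Hypothetical helper.
    """
    fields = text.split(chr(delim_code))
    if field_no < 1 or field_no > len(fields):
        return None  # field zero, or a number beyond the last field
    return fields[field_no - 1]


third_tab = extract_field("a\tb\tc", 3)      # comparable to %msg:F:3%
third_semi = extract_field("a;b;c", 3, 59)   # comparable to %msg:F,59:3%
missing = extract_field("a;b", 3, 59)        # only two fields present
```

The sample strings are made up; with real messages the ``msg`` property content would take their place.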
+
+The use of fields does not permit selecting substrings, which is rather
+unfortunate. To solve this issue, starting with 6.3.9, fromPos and toPos
+can be specified for strings as well. The syntax is quite ugly,
+but it was the only way to integrate this functionality into the
+already-existing system. To do so, use ",fromPos" and ",toPos" during
+field extraction. Let's assume you want to extract the substring from
+position 5 to 9 in the previous example. Then, the syntax is as follows:
+"%msg:F,59,5:3,9%". As you can see, "F,59" means field-mode, with
+semicolon delimiter and ",5" means starting at position 5. Then "3,9"
+means field 3 and string extraction to position 9.
+
+Please note that the special characters "F" and "R" are case-sensitive.
+Only upper case works, lower case will return an error. There are no
+white spaces permitted inside the sequence (that will lead to error
+messages and will NOT provide the intended result).
+
+Each occurrence of the field delimiter starts a new field. However, if
+you add a plus sign ("+") after the field delimiter, multiple
+delimiters, one immediately after the other, are treated as a single
+delimiter. This can be useful in cases where the syslog message contains
+such sequences. A frequent case may be with code that is written as
+follows:
+
+::
+
+  int n, m;
+  ...
+  syslog(LOG_ERR, "%d test %6d", n, m);
+
+This results in syslog messages like these: "1
+test 2", "1 test 23", "1 test 234567"
+
+As you can see, the fields are delimited by space characters, but their
+exact number is unknown. They can properly be extracted by changing the
+field specifier as follows:
+
+::
+
+  "%msg:F,32:2%" to "%msg:F,32+:2%".
+
+This feature was suggested by Zhuang Yuyao and implemented by him. It is
+modeled after Perl compatible regular expressions.
+
+Property Options
+^^^^^^^^^^^^^^^^
+
+**Property options** are case-insensitive. Currently, the following
+options are defined:
+
+**uppercase**
+  convert property to uppercase only
+
+**lowercase**
+  convert property text to lowercase only
+
+**fixed-width**
+  changes behaviour of toChar so that it pads the source string with spaces
+  up to the value of toChar if the source string is shorter.
+  *This feature was introduced in rsyslog 8.13.0*
+
+**json**
+  encode the value so that it can be used inside a JSON field. This means
+  that several characters (according to the JSON spec) are being escaped, for
+  example US-ASCII LF is replaced by "\\n".
+  The json option cannot be used together with either jsonf or csv options.
+
+**jsonf**\[:outname\]
+  (available in 6.3.9+)
+  This signifies that the property should be expressed as a JSON field.
+  That means not only the property is written, but rather a complete JSON field in
+  the format
+
+    ``"fieldname":"value"``
+
+  where "fieldname" is given in the *outname* property (or the property name
+  if none was assigned) and value is the end result of property replacer operation.
+  Note that value supports all property replacer options, like substrings, case
+  conversion and the like. Values are properly JSON-escaped, however field names are
+  (currently) not, so it is expected that proper field names are configured.
+  The jsonf option cannot be used together with either json or csv options.
+
+  For more information you can read `this article from Rainer's blog
+  <https://rainer.gerhards.net/2012/04/rsyslog-templates-json.html>`_.
+
+**csv**
+  formats the resulting field (after all modifications) in CSV format as
+  specified in `RFC 4180 <http://www.ietf.org/rfc/rfc4180.txt>`_. Rsyslog
+  will always use double quotes. Note that in order to have full
+  CSV-formatted text, you need to define a proper template. An example is
+  this one:
+  $template csvline,"%syslogtag:::csv%,%msg:::csv%"
+  Most importantly, you need to provide the commas between the fields
+  inside the template.
+  *This feature was introduced in rsyslog 4.1.6.*
+
+**drop-last-lf**
+  The last LF in the message (if any), is dropped. Especially useful for
+  PIX.
+
+**date-utc**
+  convert data to UTC prior to outputting it (available since 8.18.0)
+
+**date-mysql**
+  format as mysql date
+
+**date-rfc3164**
+  format as RFC 3164 date
+
+**date-rfc3164-buggyday**
+  similar to date-rfc3164, but emulates a common coding error: RFC 3164
+  demands that a space is written for single-digit days. With this option,
+  a zero is written instead. This format seems to be used by syslog-ng and
+  the date-rfc3164-buggyday option can be used in migration scenarios
+  where otherwise lots of scripts would need to be adjusted. It is
+  recommended *not* to use this option when forwarding to remote hosts -
+  they may treat the date as invalid (especially when parsing strictly
+  according to RFC 3164).
+
+  *This feature was introduced in rsyslog 4.6.2 and v4 versions above and
+  5.5.3 and all versions above.*
+
+**date-rfc3339**
+  format as RFC 3339 date
+
+**date-unixtimestamp**
+  Format as a unix timestamp (seconds since epoch)
+
+**date-year**
+  just the year part (4-digit) of a timestamp
+
+**date-month**
+  just the month part (2-digit) of a timestamp
+
+**date-day**
+  just the day part (2-digit) of a timestamp
+
+**date-hour**
+  just the hour part (2-digit, 24-hour clock) of a timestamp
+
+**date-minute**
+  just the minute part (2-digit) of a timestamp
+
+**date-second**
+  just the second part (2-digit) of a timestamp
+
+**date-subseconds**
+  just the subseconds of a timestamp (always 0 for a low precision
+  timestamp)
+
+**date-tzoffshour**
+  just the timezone offset hour part (2-digit) of a timestamp
+
+**date-tzoffsmin**
+  just the timezone offset minute part (2-digit) of a timestamp. Note
+  that this is usually 0, but there are some time zones that have
+  offsets which are not hourly-granular. If so, this is the minute
+  offset.
+
+**date-tzoffsdirection**
+  just the timezone offset direction part of a timestamp. This
+  specifies if the offset needs to be added ("+") or subtracted ("-")
+  to the timestamp in order to get UTC.
+
+**date-ordinal**
+  returns the ordinal for the given day, e.g. it is 2 for January, 2nd
+
+**date-iso-week** and **date-iso-week-year**
+  return the ISO week number and week-numbering year, which should be used together. See `ISO week date <https://en.wikipedia.org/wiki/ISO_week_date>`_ for more details
+
+**date-week**
+  returns the week number
+
+**date-wday**
+  just the weekday number of the timestamp. This is a single digit,
+  with 0=Sunday, 1=Monday, ..., 6=Saturday.
+
+**date-wdayname**
+  just the abbreviated English name of the weekday (e.g. "Mon", "Sat") of
+  the timestamp.
+
+**escape-cc**
+  replace control characters (ASCII value 127 and values less than 32)
+  with an escape sequence. The sequence is "#<charval>" where charval is
+  the 3-digit decimal value of the control character. For example, a
+  tabulator would be replaced by "#009".
+  Note: using this option requires that
+  `$EscapeControlCharactersOnReceive <rsconf1_escapecontrolcharactersonreceive.html>`_
+  is set to off.
+
+**space-cc**
+  replace control characters by spaces
+  Note: using this option requires that
+  `$EscapeControlCharactersOnReceive <rsconf1_escapecontrolcharactersonreceive.html>`_
+  is set to off.
+
+**drop-cc**
+  drop control characters - the resulting string will neither contain
+  control characters, escape sequences nor any other replacement character
+  like space.
+  Note: using this option requires that
+  `$EscapeControlCharactersOnReceive <rsconf1_escapecontrolcharactersonreceive.html>`_
+  is set to off.
+
+**compressspace**
+  compresses multiple spaces (US-ASCII SP character) inside the
+  string to a single one. This compression happens at a very late
+  stage in processing. Most importantly, it happens after substring
+  extraction, so the **FromChar** and **ToChar** positions are **NOT**
+  affected by this option. (available since v8.18.0)
+
+**sp-if-no-1st-sp**
+  This option looks scary and should probably not be used by a user. For
+  any field given, it returns either a single space character or no
+  character at all. Field content is never returned. A space is returned
+  if (and only if) the first character of the field's content is NOT a
+  space. This option is kind of a hack to solve a problem rooted in RFC
+  3164: 3164 specifies no delimiter between the syslog tag sequence and
+  the actual message text. Almost all implementations in fact delimit the
+  two by a space. According to RFC 3164, this space is part of the message
+  text itself. This leads to a problem when building the message (e.g. when
+  writing to disk or forwarding). Should a delimiting space be included if
+  the message does not start with one? If not, the tag is immediately
+  followed by another non-space character, which can lead some log parsers
+  to misinterpret what is the tag and what the message. The problem
+  finally surfaced when the klog module was restructured and the tag
+  correctly written. It exists with other message sources, too. The
+  solution was the introduction of this special property replacer option.
+  Now, the default template can contain a conditional space, which exists
+  only if the message does not start with one. While this does not solve
+  all issues, it should work well enough in the vast majority of cases.
+  If you read this text and have no idea of what it is talking about -
+  relax: this is a good indication you will never need this option. Simply
+  forget about it ;)
+
+**secpath-drop**
+  Drops slashes inside the field (e.g. "a/b" becomes "ab"). Useful for
+  secure pathname generation (with dynafiles).
+
+**secpath-replace**
+  Replace slashes inside the field by an underscore. (e.g. "a/b" becomes
+  "a\_b"). Useful for secure pathname generation (with dynafiles).
+
+To use multiple options, simply place them one after another with a
+comma delimiting them. For example "escape-cc,sp-if-no-1st-sp". If you
+use conflicting options together, the last one will override the
+previous one. For example, using "escape-cc,drop-cc" will use drop-cc
+and "drop-cc,escape-cc" will use escape-cc mode.
+
+Further Links
+-------------
+
+- Article on ":doc:`Recording the Priority of Syslog
+  Messages <../tutorials/recording_pri>`" (describes use of
+  templates to record severity and facility of a message)
+- `Configuration file syntax <rsyslog_conf.html>`_, this is where you
+  actually use the property replacer.
+
+.. toctree::
+   :maxdepth: 2
+
+   nomatch