author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
commit | be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97 (patch) | |
tree | 9754ff1ca740f6346cf8483ec915d4054bc5da2d /logsmanagement | |
parent | Initial commit. (diff) | |
download | netdata-be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97.tar.xz netdata-be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97.zip |
Adding upstream version 1.44.3. (upstream/1.44.3, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'logsmanagement')
49 files changed, 13630 insertions, 0 deletions
diff --git a/logsmanagement/Makefile.am b/logsmanagement/Makefile.am new file mode 100644 index 00000000..33f08d55 --- /dev/null +++ b/logsmanagement/Makefile.am @@ -0,0 +1,28 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +userlogsmanagconfigdir=$(configdir)/logsmanagement.d + +# Explicitly install directories to avoid permission issues due to umask +install-exec-local: + $(INSTALL) -d $(DESTDIR)$(userlogsmanagconfigdir) + +dist_libconfig_DATA = \ + stock_conf/logsmanagement.d.conf \ + $(NULL) + +logsmanagconfigdir=$(libconfigdir)/logsmanagement.d + +dist_logsmanagconfig_DATA = \ + stock_conf/logsmanagement.d/default.conf \ + stock_conf/logsmanagement.d/example_forward.conf \ + stock_conf/logsmanagement.d/example_mqtt.conf \ + stock_conf/logsmanagement.d/example_serial.conf \ + stock_conf/logsmanagement.d/example_syslog.conf \ + $(NULL) + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/logsmanagement/README.md b/logsmanagement/README.md new file mode 100644 index 00000000..5c3982dc --- /dev/null +++ b/logsmanagement/README.md @@ -0,0 +1,673 @@ +# Logs Management + +## Table of Contents + +- [Summary](#summary) + - [Types of available log collectors](#collector-types) +- [Getting Started](#getting-started) +- [Package Requirements](#package-requirements) +- [General Configuration](#general-configuration) +- [Collector-specific Configuration](#collector-configuration) + - [Kernel logs (kmsg)](#collector-configuration-kmsg) + - [Systemd](#collector-configuration-systemd) + - [Docker events](#collector-configuration-docker-events) + - [Tail](#collector-configuration-tail) + - [Web log](#collector-configuration-web-log) + - [Syslog socket](#collector-configuration-syslog) + - [Serial](#collector-configuration-serial) + - [MQTT](#collector-configuration-mqtt) +- [Custom Charts](#custom-charts) +- [Streaming logs to Netdata](#streaming-in) + - [Example: Systemd log 
streaming](#streaming-systemd) + - [Example: Kernel log streaming](#streaming-kmsg) + - [Example: Generic log streaming](#streaming-generic) + - [Example: Docker Events log streaming](#streaming-docker-events) +- [Streaming logs from Netdata (exporting)](#streaming-out) +- [Troubleshooting](#troubleshooting) + +<a name="summary"/> + +## Summary + +</a> + +The Netdata logs management engine enables collection, processing, storage, streaming and querying of logs through the Netdata agent. The following pipeline depicts a high-level overview of the different stages that collected logs propagate through for this to be achieved: + +![Logs management pipeline](https://github.com/netdata/netdata/assets/5953192/dd73382c-af4b-4840-a3fe-1ba5069304e8 "Logs management pipeline") + +The [Fluent Bit](https://github.com/fluent/fluent-bit) project has been used as the logs collection and exporting / streaming engine, due to its stability and the variety of [collection (input) plugins](https://docs.fluentbit.io/manual/pipeline/inputs) that it offers. Each collected log record passes through the Fluent Bit engine first, before it gets buffered, parsed, compressed and (optionally) stored locally by the logs management engine. It can also be streamed to another Netdata or Fluent Bit instance (using Fluent Bit's [Forward](https://docs.fluentbit.io/manual/pipeline/outputs/forward) protocol), or exported using any other [Fluent Bit output](https://docs.fluentbit.io/manual/pipeline/outputs). + +A bespoke circular buffering implementation has been used to maximize performance and optimize memory utilization. More technical details about how it works can be found [here](https://github.com/netdata/netdata/pull/13291#buffering). 
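As a rough illustration of the general idea behind a bounded log buffer, the following Python sketch shows the two "buffer full" behaviours described for the `circular buffer drop logs if full` option. It is not Netdata's bespoke implementation: the `LogRingBuffer` class, its method names and the byte-based accounting are all assumptions made for this example.

```python
from collections import deque


class LogRingBuffer:
    """Bounded buffer of log batches (hypothetical sketch, not Netdata's code)."""

    def __init__(self, max_bytes, drop_if_full):
        self.max_bytes = max_bytes        # upper bound on buffered payload size
        self.drop_if_full = drop_if_full  # True: evict oldest, False: refuse new
        self.used = 0
        self.batches = deque()

    def push(self, batch):
        """Return True if the batch was buffered, False if the caller must retry."""
        size = len(batch)
        if size > self.max_bytes:
            return False  # a batch larger than the whole buffer can never fit
        while self.used + size > self.max_bytes:
            if not self.drop_if_full:
                return False  # collection stays blocked until flush() frees space
            self.used -= len(self.batches.popleft())  # drop the oldest batch
        self.batches.append(batch)
        self.used += size
        return True

    def flush(self):
        """Hand all buffered batches to the caller, e.g. for a database write."""
        out = list(self.batches)
        self.batches.clear()
        self.used = 0
        return out
```

With `drop_if_full=True` the newest logs always win at the cost of losing old ones; with `drop_if_full=False` nothing is lost, but the producer has to wait for the next flush.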
+ +To configure Netdata's logs management engine properly, please make sure you are aware of the following points first: + +* One collection cycle (at max) occurs per `update every` interval (in seconds - minimum 1 sec) and any log records collected in a collection cycle are grouped together (for compression and performance purposes). As a result of this, a longer `update every` interval will reduce memory and disk space requirements. +* When collected logs contain parsable timestamps, these will be used to display metrics from parsed logs at the correct time in each chart, even if collection of said logs takes place *much* later than the time they were produced. How much later? Up to a configurable value of `update timeout` seconds. This mechanism ensures correct parsing and querying of delayed logs that contain parsable timestamps (such as streamed inputs or buffered logs sources that write logs in batches), but the respective charts may lag behind some seconds up to that timeout. If no parsable timestamp is found, the collection timestamp will be used instead (or the collector can be forced to always use the collection timestamp by setting `use log timestamp = no`). + +<a name="collector-types"/> + +### Types of available log collectors + +</a> + +The following log collectors are supported at the moment. 
The table will be updated as more collectors are added: +| Collector | Log type | Description | +| ------------ | ------------ | ------------ | +| kernel logs (kmsg) | `flb_kmsg` | Collection of new kernel ring buffer logs.| +| systemd | `flb_systemd` | Collection of journald logs.| +| docker events | `flb_docker_events` | Collection of docker events logs, similar to executing the `docker events` command.| +| tail | `flb_tail` | Collection of new logs from files by "tailing" them, similar to `tail -f`.| +| web log | `flb_web_log` | Collection of Apache or Nginx access logs.| +| syslog socket | `flb_syslog` | Collection of RFC-3164 syslog logs by creating listening sockets.| +| serial | `flb_serial` | Collection of logs from a serial interface.| +| mqtt | `flb_mqtt` | Collection of MQTT messages over a TCP connection.| + +<a name="getting-started"/> + +## Getting Started + +</a> + +Since version `XXXXX`, Netdata is distributed with logs management functionality as an external plugin, but it is disabled by default and must be explicitly enabled using `./edit-config netdata.conf` and changing the respective configuration option: + +``` +[plugins] + logs-management = yes +``` + +There are some pre-configured log sources that Netdata will attempt to automatically discover and monitor that can be edited using `./edit-config logsmanagement.d/default.conf` in Netdata's configuration directory. More sources can be configured for monitoring by adding them in `logsmanagement.d/default.conf` or in other `.conf` files in the `logsmanagement.d` directory. + +There are also some example configurations that can be listed using `./edit-config --list`. + +To get familiar with the Logs Management functionality, the user is advised to read at least the [Summary](#summary) and the [General Configuration](#general-configuration) sections and also any [Collector-specific Configuration](#collector-configuration) subsections, according to each use case. 
+ +For any issues, please refer to [Troubleshooting](#troubleshooting) or open a new support ticket on [GitHub](https://github.com/netdata/netdata/issues) or one of Netdata's support channels. + +<a name="package-requirements"/> + +## Package Requirements + +</a> + +Netdata logs management introduces minimal additional package dependencies and those are actually [Fluent Bit dependencies](https://docs.fluentbit.io/manual/installation/requirements). The only extra build-time dependencies are: +- `flex` +- `bison` +- `musl-fts-dev` ([Alpine Linux](https://www.alpinelinux.org/about) only) + +However, there may be some exceptions to this rule as more collectors are added to the logs management engine, so if a specific collector is disabled due to missing dependencies, please refer to this section or check [Troubleshooting](#troubleshooting). + +<a name="general-configuration"/> + +## General Configuration + +</a> + +There are some fundamental configuration options that are common to all log collector types. These options can be set globally in `logsmanagement.d.conf` or they can be customized per log source: + +| Configuration Option | Default | Description | +| :------------: | :------------: | ------------ | +| `update every` | Equivalent value in `logsmanagement.d.conf` (or in `netdata.conf` under `[plugin:logs-management]`, if higher). | How often metrics in charts will be updated (in seconds). +| `update timeout` | Equivalent value in `[logs management]` section of `netdata.conf` (or Netdata global value, if higher). | Maximum time (in seconds) that charts may be delayed while waiting for new logs. +| `use log timestamp` | Equivalent value in `logsmanagement.d.conf` (`auto` by default). | If set to `auto`, log timestamps (when available) will be used for precise metrics aggregation. Otherwise (if set to `no`), collection timestamps will be used instead (which may result in lagged metrics under heavy system load, but it will reduce CPU usage). 
+| `log type` | `flb_tail` | Type of this log collector, see [relevant table](#collector-types) for a complete list of supported collectors. +| `circular buffer max size` | Equivalent value in `logsmanagement.d.conf`. | Maximum RAM that can be used to buffer collected logs until they are saved to the disk database. +| `circular buffer drop logs if full` | Equivalent value in `logsmanagement.d.conf` (`no` by default). | If there are new logs pending to be collected and the circular buffer is full, enabling this setting will allow old buffered logs to be dropped in favor of new ones. If disabled, collection of new logs will be blocked until there is free space again in the buffer (no logs will be lost in this case, but logs will not be ingested in real-time). +| `compression acceleration` | Equivalent value in `logsmanagement.d.conf` (`1` by default). | Fine-tunes tradeoff between log compression speed and compression ratio, see [here](https://github.com/lz4/lz4/blob/90d68e37093d815e7ea06b0ee3c168cccffc84b8/lib/lz4.h#L195) for more details. +| `db mode` | Equivalent value in `logsmanagement.d.conf` (`none` by default). | Mode of logs management database per collector. If set to `none`, logs will be collected, buffered, parsed and then discarded. If set to `full`, buffered logs will be saved to the logs management database instead of being discarded. When mode is `none`, logs management queries cannot be executed. +| `buffer flush to DB` | Equivalent value in `logsmanagement.d.conf` (`6` by default). | Interval in seconds at which logs will be transferred from RAM buffers to the database. +| `disk space limit` | Equivalent value in `logsmanagement.d.conf` (`500 MiB` by default). | Maximum disk space that all compressed logs in database can occupy (per log source). Once exceeded, oldest BLOB of logs will be truncated for new logs to be written over. 
Each log source database can contain a maximum of 10 BLOBs at any point, so each truncation equates to a deletion of about 10% of the oldest logs. The number of BLOBs will be configurable in a future release. +| `collected logs total chart enable` | Equivalent value in `logsmanagement.d.conf` (`no` by default). | Chart that shows the number of log records collected for this log source, since the last Netdata agent restart. Useful for debugging purposes. +| `collected logs rate chart enable` | Equivalent value in `logsmanagement.d.conf` (`yes` by default). | Chart that shows the rate that log records are collected at for this log source. +| `submit logs to system journal` | Equivalent value in `logsmanagement.d.conf` (`no` by default). Available only for `flb_tail`, `flb_web_log`, `flb_serial`, `flb_docker_events` and `flb_mqtt`. | If enabled, it will submit the collected logs to the system journal. + +There is also one setting that cannot be set per log source, but can only be defined in `logsmanagement.d.conf`: + +| Configuration Option | Default | Description | +| :------------: | :------------: | ------------ | +| `db dir` | `/var/cache/netdata/logs_management_db` | Logs management database path, will be created if it does not exist.| + + + +> **Note** +> `log path` must be defined per log source for any collector type, except for `kmsg` and the collectors that listen to network sockets. Some default examples use `log path = auto`. In those cases, an autodetection of the path will be attempted by searching through common paths where each log source is typically expected to be found. + +<a name="collector-configuration"/> + +## Collector-specific Configuration + +</a> + +<a name="collector-configuration-kmsg"/> + +### Kernel logs (kmsg) + +</a> + +This collector will collect logs from the kernel message log buffer. See also documentation of [Fluent Bit kmsg input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/kernel-logs). 
+ +> **Warning** +> If `use log timestamp` is set to `auto` and the system has been suspended and resumed since the last boot, timestamps of new `kmsg` logs will be incorrect and log collection will not work. This is a known limitation when reading the kernel log buffer records and it is recommended to use `use log timestamp = no` in this case. + +> **Note** +> `/dev/kmsg` normally returns all the logs in the kernel log buffer every time it is read. To avoid duplicate logs, the collector will discard any previous logs the first time `/dev/kmsg` is read after an agent restart and it will collect only new kernel logs. + +| Configuration Option | Description | +| :------------: | ------------ | +| `prio level` | Drop kernel logs with priority higher than `prio level`. Default value is 8, so no logs will be dropped. +| `severity chart` | Enable chart showing Syslog Severity values of collected logs. Severity values are in the range of 0 to 7 inclusive.| +| `subsystem chart` | Enable chart showing which subsystems generated the logs.| +| `device chart` | Enable chart showing which devices generated the logs.| + +<a name="collector-configuration-systemd"/> + +### Systemd + +</a> + +This collector will collect logs from the journald daemon. See also documentation of [Fluent Bit systemd input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/systemd). + +| Configuration Option | Description | +| :------------: | ------------ | +| `log path` | Path to the systemd journal directory. If set to `auto`, the default path will be used to read local-only logs. | +| `priority value chart` | Enable chart showing Syslog Priority values (PRIVAL) of collected logs. The Priority value ranges from 0 to 191 and represents both the Facility and Severity. It is calculated by first multiplying the Facility number by 8 and then adding the numerical value of the Severity. 
Please see the [rfc5424: Syslog Protocol](https://www.rfc-editor.org/rfc/rfc5424#section-6.2.1) document for more information.| +| `severity chart` | Enable chart showing Syslog Severity values of collected logs. Severity values are in the range of 0 to 7 inclusive.| +| `facility chart` | Enable chart showing Syslog Facility values of collected logs. Facility values show which subsystem generated the log and are in the range of 0 to 23 inclusive.| + +<a name="collector-configuration-docker-events"/> + +### Docker events + +</a> + +This collector will use the Docker API to collect Docker events logs. See also documentation of [Fluent Bit docker events input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/docker-events). + +| Configuration Option | Description | +| :------------: | ------------ | +| `log path` | Docker socket UNIX path. If set to `auto`, the default path (`/var/run/docker.sock`) will be used. | +| `event type chart` | Enable chart showing the Docker object type of the collected logs. | +| `event action chart` | Enable chart showing the Docker object action of the collected logs. | + +<a name="collector-configuration-tail"/> + +### Tail + +</a> + +This collector will collect any type of logs from a log file, similar to executing the `tail -f` command. See also documentation of [Fluent Bit tail plugin](https://docs.fluentbit.io/manual/pipeline/inputs/tail). + +| Configuration Option | Description | +| :------------: | ------------ | +| `log path` | The path to the log file to be monitored. | +| `use inotify` | Select between inotify and file stat watchers (providing `libfluent-bit.so` has been built with inotify support). It defaults to `yes`. Set to `no` if abnormally high CPU usage is observed or if the log source is expected to consistently produce tens of thousands of (unbuffered) logs per second. 
| + +<a name="collector-configuration-web-log"/> + +### Web log + +</a> + +This collector will collect [Apache](https://httpd.apache.org/) and [Nginx](https://nginx.org/) access logs. + +| Configuration Option | Description | +| :------------: | ------------ | +| `log path` | The path to the web server's `access.log`. If set to `auto`, the collector will attempt to auto-discover it, provided the name of the configuration section is either `Apache access.log` or `Nginx access.log`. | +| `use inotify` | Select between inotify and file stat watchers (providing `libfluent-bit.so` has been built with inotify support). It defaults to `yes`. Set to `no` if abnormally high CPU usage is observed or if the log source is expected to consistently produce tens of thousands of (unbuffered) logs per second. | +| `log format` | The log format to be used for parsing. Unlike the [`GO weblog`]() module, only the `CSV` parser is supported and it can be configured [in the same way](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md#known-fields) as in the `GO` module. If set to `auto`, the collector will attempt to auto-detect the log format using the same logic explained [here](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md#log-parser-auto-detection). | +| `verify parsed logs` | If set to `yes`, the parser will attempt to verify that the parsed fields are valid, before extracting metrics from them. If they are invalid (for example, the response code is less than `100`), the `invalid` dimension will be incremented instead. Setting this to `no` will result in a slight performance gain. | +| `vhosts chart` | Enable chart showing names of the virtual hosts extracted from the collected logs. | +| `ports chart` | Enable chart showing port numbers extracted from the collected logs. | +| `IP versions chart` | Enable chart showing IP versions (`v4` or `v6`) extracted from the collected logs. 
| +| `unique client IPs - current poll chart` | Enable chart showing unique client IPs in each collection interval. | +| `unique client IPs - all-time chart` | Enable chart showing unique client IPs since agent startup. It is recommended to set this to `no` as it can have a negative impact on long-term performance. | +| `http request methods chart` | Enable chart showing HTTP request methods extracted from the collected logs. | +| `http protocol versions chart` | Enable chart showing HTTP protocol versions extracted from the collected logs. | +| `bandwidth chart` | Enable chart showing request and response bandwidth extracted from the collected logs. | +| `timings chart` | Enable chart showing request processing time stats extracted from the collected logs. | +| `response code families chart` | Enable chart showing response code families (`1xx`, `2xx` etc.) extracted from the collected logs. | +| `response codes chart` | Enable chart showing response codes extracted from the collected logs. | +| `response code types chart` | Enable chart showing response code types (`success`, `redirect` etc.) extracted from the collected logs. | +| `SSL protocols chart` | Enable chart showing SSL protocols (`TLSV1`, `TLSV1.1` etc.) extracted from the collected logs. | +| `SSL chipher suites chart` | Enable chart showing SSL cipher suites extracted from the collected logs. | + +<a name="collector-configuration-syslog"/> + +### Syslog socket + +</a> + +This collector will collect logs through a Unix socket server (UDP or TCP) or over the network using TCP or UDP. See also documentation of [Fluent Bit syslog input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/syslog). + +| Configuration Option | Description | +| :------------: | ------------ | +| `mode` | Type of socket to be created to listen for incoming syslog messages. 
Supported modes are: `unix_tcp`, `unix_udp`, `tcp` and `udp`.| +| `log path` | If `mode == unix_tcp` or `mode == unix_udp`, Netdata will create a UNIX socket on this path to listen for syslog messages. Otherwise, this option is not used.| +| `unix_perm` | If `mode == unix_tcp` or `mode == unix_udp`, this sets the permissions of the generated UNIX socket. Otherwise, this option is not used.| +| `listen` | If `mode == tcp` or `mode == udp`, this sets the network interface to bind.| +| `port` | If `mode == tcp` or `mode == udp`, this specifies the port to listen for incoming connections.| +| `log format` | This is a Ruby Regular Expression to define the expected syslog format. Fluent Bit provides some [pre-configured syslog parsers](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf#L65). | +|`priority value chart` | Please see the respective [systemd](#collector-configuration-systemd) configuration.| +| `severity chart` | Please see the respective [systemd](#collector-configuration-systemd) configuration.| +| `facility chart` | Please see the respective [systemd](#collector-configuration-systemd) configuration.| + + For parsing and metrics extraction to work properly, please ensure fields `<PRIVAL>`, `<SYSLOG_TIMESTAMP>`, `<HOSTNAME>`, `<SYSLOG_IDENTIFIER>`, `<PID>` and `<MESSAGE>` are defined in `log format`. For example, to parse incoming `syslog-rfc3164` logs, the following regular expression can be used: + +``` +/^\<(?<PRIVAL>[0-9]+)\>(?<SYSLOG_TIMESTAMP>[^ ]* {1,2}[^ ]* [^ ]* )(?<HOSTNAME>[^ ]*) (?<SYSLOG_IDENTIFIER>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<PID>[0-9]+)\])?(?:[^\:]*\:)? *(?<MESSAGE>.*)$/ +``` + +<a name="collector-configuration-serial"/> + +### Serial + +</a> + +This collector will collect logs through a serial interface. See also documentation of [Fluent Bit serial interface input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/serial-interface). 
+ +| Configuration Option | Description | +| :------------: | ------------ | +| `log path` | Absolute path to the device entry, e.g: `/dev/ttyS0`.| +| `bitrate` | The bitrate for the communication, e.g: 9600, 38400, 115200, etc.| +| `min bytes` | The minimum bytes the serial interface will wait to receive before it begins to process the log message.| +| `separator` | An optional separator string to determine the end of a log message.| +| `format` | Specify the format of the incoming data stream. The only option available is 'json'. Note that Format and Separator cannot be used at the same time.| + +<a name="collector-configuration-mqtt"/> + +### MQTT + +</a> + +This collector will collect MQTT data over a TCP connection, by spawning an MQTT server through Fluent Bit. See also documentation of [Fluent Bit MQTT input plugin](https://docs.fluentbit.io/manual/pipeline/inputs/mqtt). + +| Configuration Option | Description | +| :------------: | ------------ | +| `listen` | Specifies the network interface to bind.| +| `port` | Specifies the port to listen for incoming connections.| +| `topic chart` | Enable chart showing MQTT topic of incoming messages.| + +<a name="custom-charts"/> + +## Custom Charts + +</a> + +In addition to the predefined charts, each log source supports the option to extract +user-defined metrics, by matching log records to [POSIX Extended Regular Expressions](https://en.wikibooks.org/wiki/Regular_Expressions/POSIX-Extended_Regular_Expressions). +This can be very useful particularly for `FLB_TAIL` type log sources, where +there is no parsing at all by default. + +To create a custom chart, the following key-value configuration options must be +added to the respective log source configuration section: + +``` + custom 1 chart = identifier + custom 1 regex name = kernel + custom 1 regex = .*\bkernel\b.* + custom 1 ignore case = no +``` + +where the value denoted by: +- `custom x chart` is the title of the chart. 
+- `custom x regex name` is an optional name for the dimension of this particular metric (if absent, the regex will be used as the dimension name instead). +- `custom x regex` is the POSIX Extended Regular Expression to be used to match log records. +- `custom x ignore case` is equivalent to setting `REG_ICASE` when using POSIX Extended Regular Expressions for case insensitive searches. It is optional and defaults to `yes`. + +`x` must start from number 1 and monotonically increase by 1 every time a new regular expression is configured. +If the titles of two or more charts of a certain log source are the same, the dimensions will be grouped together +in the same chart, rather than a new chart being created. + +Example of configuration for a generic log source collection with custom regex-based parsers: + +``` +[Auth.log] + ## Example: Log collector that will tail auth.log file and count + ## occurrences of certain `sudo` commands, using POSIX regular expressions. + + ## Required settings + enabled = no + log type = flb_tail + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /var/log/auth.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. 
+ use inotify = yes + + custom 1 chart = sudo and su + custom 1 regex name = sudo + custom 1 regex = \bsudo\b + custom 1 ignore case = yes + + custom 2 chart = sudo and su + # custom 2 regex name = su + custom 2 regex = \bsu\b + custom 2 ignore case = yes + + custom 3 chart = sudo or su + custom 3 regex name = sudo or su + custom 3 regex = \bsudo\b|\bsu\b + custom 3 ignore case = yes +``` + +And the generated charts based on this configuration: + +![Auth.log](https://user-images.githubusercontent.com/5953192/197003292-13cf2285-c614-42a1-ad5a-896370c22883.PNG) + +<a name="streaming-in"/> + +## Streaming logs to Netdata + +</a> + +Netdata supports 2 incoming streaming configurations: +1. `syslog` messages over Unix or network sockets. +2. Fluent Bit's [Forward protocol](https://docs.fluentbit.io/manual/pipeline/outputs/forward). + +For option 1, please refer to the [syslog collector](#collector-configuration-syslog) section. This section will be focused on using option 2. + +A Netdata agent can be used as a logs aggregation parent to listen to `Forward` messages, using either Unix or network sockets. This option is separate to [Netdata's metrics streaming](https://github.com/netdata/netdata/blob/master/docs/metrics-storage-management/enable-streaming.md) and can be used independently of whether that's enabled or not (and it uses a different listening socket too). + +This setting can be enabled under the `[forward input]` section in `logsmanagement.d.conf`: + +``` +[forward input] + enable = no + unix path = + unix perm = 0644 + listen = 0.0.0.0 + port = 24224 +``` + +The default settings will listen for incoming `Forward` messages on TCP port 24224. If `unix path` is set to a valid path, `listen` and `port` will be ignored and a unix socket will be created under that path. Make sure that `unix perm` has the correct permissions set for that unix socket. 
Please also see Fluent Bit's [Forward input plugin documentation](https://docs.fluentbit.io/manual/pipeline/inputs/forward). + +The Netdata agent will now listen for incoming `Forward` messages, but by default it won't process or store them. To do that, there must exist at least one log collection, to define how the incoming logs will be processed and stored. This is similar to configuring a local log source, with the difference that `log source = forward` must be set and also a `stream guid` must be defined, matching that of the children log sources. + +The rest of this section contains some examples on how to configure log collections of different types, using a Netdata parent and Fluent Bit children instances (see also `./edit-config logsmanagement.d/example_forward.conf`). Please use the recommended settings on children instances for parsing on parents to work correctly. Also, note that `Forward` output on children supports optional `gzip` compression, by using the `-p Compress=gzip` configuration parameter, as demonstrated in some of the examples. + +<a name="streaming-systemd"/> + +### Example: Systemd log streaming + +</a> + +Example configuration of an `flb_systemd` type parent log collection: +``` +[Forward systemd] + + ## Required settings + enabled = yes + log type = flb_systemd + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Streaming input settings. 
+ log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30735 + + ## Other settings specific to this log source type + priority value chart = yes + severity chart = yes + facility chart = yes +``` + +Any children can be configured as follows: +``` +fluent-bit -i systemd -p Read_From_Tail=on -p Strip_Underscores=on -o forward -p Compress=gzip -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30735' -m '*' +``` + +<a name="streaming-kmsg"/> + +### Example: Kernel log streaming + +</a> + +Example configuration of an `flb_kmsg` type parent log collection: +``` +[Forward kmsg] + + ## Required settings + enabled = yes + log type = flb_kmsg + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + use log timestamp = no + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Streaming input settings. + log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30736 + + ## Other settings specific to this log source type + severity chart = yes + subsystem chart = yes + device chart = yes +``` +Any children can be configured as follows: +``` +fluent-bit -i kmsg -o forward -p Compress=gzip -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30736' -m '*' +``` + +> **Note** +> Fluent Bit's `kmsg` input plugin will collect all kernel logs since boot every time it's started up. Normally, when configured as a local source in a Netdata agent, all these initially collected logs will be discarded at startup so they are not duplicated. This is not possible when streaming from a Fluent Bit child, so every time a child is restarted, all kernel logs since boot will be re-collected and streamed again. 
+ +<a name="streaming-generic"/> + +### Example: Generic log streaming + +</a> + +This is the most flexible option for a parent log collection, as it allows aggregation of logs from multiple children Fluent Bit instances of different log types. Example configuration of a generic parent log collection with `db mode = full`: + +``` +[Forward collection] + + ## Required settings + enabled = yes + log type = flb_tail + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + db mode = full + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Streaming input settings. + log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30738 +``` + +Children can be configured to `tail` local logs using Fluent Bit and stream them to the parent: +``` +fluent-bit -i tail -p Path=/tmp/test.log -p Inotify_Watcher=true -p Refresh_Interval=1 -p Key=msg -o forward -p Compress=gzip -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30738' -m '*' +``` + +Children instances do not have to use the `tail` input plugin specifically. Any of the supported log types can be used for the streaming child. The following configuration for example can stream `systemd` logs to the same parent as the configuration above: +``` +fluent-bit -i systemd -p Read_From_Tail=on -p Strip_Underscores=on -o forward -p Compress=gzip -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30738' -m '*' +``` + +The caveat is that an `flb_tail` log collection on a parent won't generate any type-specific charts by default, but [custom charts](#custom-charts) can be of course manually added by the user. 
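When several children of different types stream into the same generic collection, every child command must carry the same `stream guid` as the parent's `[Forward collection]` section. The following is only a minimal sketch, assuming a POSIX shell: the `child_cmd` helper and the `STREAM_GUID` variable are illustrative names, not part of Netdata or Fluent Bit. It keeps the GUID in one place and prints (rather than runs) the child command line, using only the flags already shown in the examples above.

```sh
#!/bin/sh
# Hypothetical helper, not shipped with Netdata or Fluent Bit.
# The GUID must match `stream guid` in the parent's [Forward collection] section.
STREAM_GUID="6ce266f5-2704-444d-a301-2423b9d30738"

# child_cmd INPUT [KEY=VALUE ...]
# Prints a fluent-bit command line that reads from INPUT, tags every record
# with the shared stream GUID and forwards it with gzip compression.
child_cmd() {
    input="$1"; shift
    printf 'fluent-bit -i %s' "$input"
    for opt in "$@"; do
        printf ' -p %s' "$opt"
    done
    printf " -o forward -p Compress=gzip -F record_modifier -p 'Record=\"stream guid\" %s' -m '*'\n" "$STREAM_GUID"
}

child_cmd tail Path=/tmp/test.log Key=msg
child_cmd systemd Read_From_Tail=on Strip_Underscores=on
```

The printed lines can be reviewed and then pasted on each child, which avoids typing the GUID by hand for every log type.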
+ +<a name="streaming-docker-events"/> + +### Example: Docker Events log streaming + +</a> + +Example configuration of an `flb_docker_events` type parent log collection: +``` +[Forward Docker Events] + + ## Required settings + enabled = yes + log type = flb_docker_events + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Streaming input settings. + log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30737 + + ## Other settings specific to this log source type + event type chart = yes +``` + +Any children streaming to this collection must be set up to use one of the [default `json` or `docker` parsers](https://github.com/fluent/fluent-bit/blob/master/conf/parsers.conf), to send the collected logs as structured messages, so they can be parsed by the parent: + +``` +fluent-bit -R ~/fluent-bit/conf/parsers.conf -i docker_events -p Parser=json -o forward -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30737' -m '*' +``` +or +``` +fluent-bit -R ~/fluent-bit/conf/parsers.conf -i docker_events -p Parser=docker -o forward -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30737' -m '*' +``` + +If instead the user wants to stream to a parent that collects logs into an `flb_tail` log collection, then a parser is not necessary and the unstructured logs can also be streamed in their original JSON format: +``` +fluent-bit -i docker_events -o forward -F record_modifier -p 'Record="stream guid" 6ce266f5-2704-444d-a301-2423b9d30737' -m '*' +``` + +Logs will appear in the parent in their unstructured format: + +``` 
+{"status":"create","id":"de2432a4f00bd26a4899dde5633bb16090a4f367c36f440ebdfdc09020cb462d","from":"hello-world","Type":"container","Action":"create","Actor":{"ID":"de2432a4f00bd26a4899dde5633bb16090a4f367c36f440ebdfdc09020cb462d","Attributes":{"image":"hello-world","name":"lucid_yalow"}},"scope":"local","time":1680263414,"timeNano":1680263414473911042} +``` + +<a name="streaming-out"/> + +## Streaming logs from Netdata (exporting) + +</a> + +Netdata supports real-time log streaming and exporting through any of [Fluent Bit's outgoing streaming configurations](https://docs.fluentbit.io/manual/pipeline/outputs). + +To use any of the outputs, follow Fluent Bit's documentation with the addition of a `output x` prefix to all of the configuration parameters of the output. `x` must start from number 1 and monotonically increase by 1 every time a new output is configured for the log source. + +For example, the following configuration will add 2 outputs to a `docker events` log collector. The first output will stream logs to https://cloud.openobserve.ai/ using Fluent Bit's [http output plugin](https://docs.fluentbit.io/manual/pipeline/outputs/http) and the second one will save the same logs in a file in CSV format, using Fluent Bit's [file output plugin](https://docs.fluentbit.io/manual/pipeline/outputs/file): + +``` +[Docker Events Logs] + ## Example: Log collector that will monitor the Docker daemon socket and + ## collect Docker event logs in a default format similar to executing + ## the `sudo docker events` command. + + ## Required settings + enabled = yes + log type = flb_docker_events + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. 
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Use default Docker socket UNIX path: /var/run/docker.sock + log path = auto + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + event type chart = yes + event action chart = yes + + ## Stream to https://cloud.openobserve.ai/ + output 1 name = http + output 1 URI = YOUR_API_URI + output 1 Host = api.openobserve.ai + output 1 Port = 443 + output 1 tls = On + output 1 Format = json + output 1 Json_date_key = _timestamp + output 1 Json_date_format = iso8601 + output 1 HTTP_User = test@netdata.cloud + output 1 HTTP_Passwd = YOUR_OPENOBSERVE_PASSWORD + output 1 compress = gzip + + ## Real-time export to /tmp/docker_event_logs.csv + output 2 name = file + output 2 Path = /tmp + output 2 File = docker_event_logs.csv +``` + +</a> + +<a name="troubleshooting"/> + +## Troubleshooting + +</a> + +1. I am building Netdata from source or a Git checkout but the `FLB_SYSTEMD` plugin is not available / does not work: + +If during the Fluent Bit build step you are seeing the following message: +``` +-- Could NOT find Journald (missing: JOURNALD_LIBRARY JOURNALD_INCLUDE_DIR) +``` +it means that the systemd development libraries are missing from your system. Please see [how to install them alongside other required packages](https://github.com/netdata/netdata/blob/master/packaging/installer/methods/manual.md). + +2. I am observing very high CPU usage when monitoring a log source using `flb_tail` or `flb_web_log`. + +The log source is probably producing a very high number of unbuffered logs, which results in too many filesystem events. Try setting `use inotify = no` to use file stat watchers instead. + +3. 
I am using Podman instead of Docker, but I cannot see any Podman events logs being collected. + +Please ensure there is a listening service running that answers API calls for Podman. Instructions on how to start such a service can be found [here](https://docs.podman.io/en/latest/markdown/podman-system-service.1.html). + +Once the service is started, you must update the Docker events logs collector `log path` to monitor the generated socket (otherwise, it will search for a `docker.sock` by default). + +You must ensure `podman.sock` has the right permissions for Netdata to be able to access it. diff --git a/logsmanagement/circular_buffer.c b/logsmanagement/circular_buffer.c new file mode 100644 index 00000000..6459da5e --- /dev/null +++ b/logsmanagement/circular_buffer.c @@ -0,0 +1,404 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file circular_buffer.c + * @brief This is the implementation of a circular buffer to be used + * for saving collected logs in memory, until they are stored + * into the database. + */ + +#include "circular_buffer.h" +#include "helper.h" +#include "parser.h" + +struct qsort_item { + Circ_buff_item_t *cbi; + struct File_info *pfi; +}; + +static int qsort_timestamp (const void *item_a, const void *item_b) { + return ( (int64_t)((struct qsort_item*)item_a)->cbi->timestamp - + (int64_t)((struct qsort_item*)item_b)->cbi->timestamp); +} + +static int reverse_qsort_timestamp (const void * item_a, const void * item_b) { + return -qsort_timestamp(item_a, item_b); +} + +/** + * @brief Search circular buffers according to the query_params. + * @details If multiple buffers are to be searched, the results will be sorted + * according to timestamps. + * + * Note that buff->tail can only be changed through circ_buff_read_done(), and + * circ_buff_search() and circ_buff_read_done() are mutually exclusive due + * to uv_mutex_lock() and uv_mutex_unlock() in queries and when writing to DB. 
+ * + * @param p_query_params Query parameters to search according to. + * @param p_file_infos File_info structs to be searched. + */ +void circ_buff_search(logs_query_params_t *const p_query_params, struct File_info *const p_file_infos[]) { + + for(int pfi_off = 0; p_file_infos[pfi_off]; pfi_off++) + uv_rwlock_rdlock(&p_file_infos[pfi_off]->circ_buff->buff_realloc_rwlock); + + int buffs_size = 0, + buff_max_num_of_items = 0; + + while(p_file_infos[buffs_size]){ + if(p_file_infos[buffs_size]->circ_buff->num_of_items > buff_max_num_of_items) + buff_max_num_of_items = p_file_infos[buffs_size]->circ_buff->num_of_items; + buffs_size++; + } + + struct qsort_item items[buffs_size * buff_max_num_of_items + 1]; // worst case allocation + + int items_off = 0; + + for(int buff_off = 0; p_file_infos[buff_off]; buff_off++){ + Circ_buff_t *buff = p_file_infos[buff_off]->circ_buff; + /* TODO: The following 3 operations need to be replaced with a struct + * to guarantee atomicity. */ + int head = __atomic_load_n(&buff->head, __ATOMIC_SEQ_CST) % buff->num_of_items; + int tail = __atomic_load_n(&buff->tail, __ATOMIC_SEQ_CST) % buff->num_of_items; + int full = __atomic_load_n(&buff->full, __ATOMIC_SEQ_CST); + + if ((head == tail) && !full) continue; // Nothing to do if buff is empty + + for (int i = tail; i != head; i = (i + 1) % buff->num_of_items){ + items[items_off].cbi = &buff->items[i]; + items[items_off++].pfi = p_file_infos[buff_off]; + } + } + + items[items_off].cbi = NULL; + items[items_off].pfi = NULL; + + if(items[0].cbi) + qsort(items, items_off, sizeof(items[0]), p_query_params->order_by_asc ? 
qsort_timestamp : reverse_qsort_timestamp); + + + BUFFER *const res_buff = p_query_params->results_buff; + + logs_query_res_hdr_t res_hdr = { // result header + .timestamp = p_query_params->act_to_ts, + .text_size = 0, + .matches = 0, + .log_source = "", + .log_type = "" + }; + + for (int i = 0; items[i].cbi; i++) { + + /* If exceeding quota or timeout is reached and new timestamp is different than previous, + * terminate query but inform caller about act_to_ts to continue from (its next value) in next call. */ + if( (res_buff->len >= p_query_params->quota || terminate_logs_manag_query(p_query_params)) && + items[i].cbi->timestamp != res_hdr.timestamp){ + p_query_params->act_to_ts = res_hdr.timestamp; + break; + } + + res_hdr.timestamp = items[i].cbi->timestamp; + res_hdr.text_size = items[i].cbi->text_size; + strncpyz(res_hdr.log_source, log_src_t_str[items[i].pfi->log_source], sizeof(res_hdr.log_source) - 1); + strncpyz(res_hdr.log_type, log_src_type_t_str[items[i].pfi->log_type], sizeof(res_hdr.log_type) - 1); + strncpyz(res_hdr.basename, items[i].pfi->file_basename, sizeof(res_hdr.basename) - 1); + strncpyz(res_hdr.filename, items[i].pfi->filename, sizeof(res_hdr.filename) - 1); + strncpyz(res_hdr.chartname, items[i].pfi->chartname, sizeof(res_hdr.chartname) - 1); + + if (p_query_params->order_by_asc ? 
+ ( res_hdr.timestamp >= p_query_params->req_from_ts && res_hdr.timestamp <= p_query_params->req_to_ts ) : + ( res_hdr.timestamp >= p_query_params->req_to_ts && res_hdr.timestamp <= p_query_params->req_from_ts) ){ + + /* In case of search_keyword, less than sizeof(res_hdr) + temp_msg.text_size + * space is required, but go for worst case scenario for now */ + buffer_increase(res_buff, sizeof(res_hdr) + res_hdr.text_size); + + if(!p_query_params->keyword || !*p_query_params->keyword || !strcmp(p_query_params->keyword, " ")){ + /* NOTE: relying on items[i]->cbi->num_lines to get number of log lines + * might not be 100% correct, since parsing must have taken place + * already to return correct count. Maybe an issue under heavy load. */ + res_hdr.matches = items[i].cbi->num_lines; + memcpy(&res_buff->buffer[res_buff->len + sizeof(res_hdr)], items[i].cbi->data, res_hdr.text_size); + } + else { + res_hdr.matches = search_keyword( items[i].cbi->data, res_hdr.text_size, + &res_buff->buffer[res_buff->len + sizeof(res_hdr)], + &res_hdr.text_size, p_query_params->keyword, NULL, + p_query_params->ignore_case); + + m_assert( (res_hdr.matches > 0 && res_hdr.text_size > 0) || + (res_hdr.matches == 0 && res_hdr.text_size == 0), + "res_hdr.matches and res_hdr.text_size must both be > 0 or == 0."); + + if(unlikely(res_hdr.matches < 0)) + break; /* res_hdr.matches < 0 - error during keyword search */ + } + + if(res_hdr.text_size){ + res_buff->buffer[res_buff->len + sizeof(res_hdr) + res_hdr.text_size - 1] = '\n'; // replace '\0' with '\n' + memcpy(&res_buff->buffer[res_buff->len], &res_hdr, sizeof(res_hdr)); + res_buff->len += sizeof(res_hdr) + res_hdr.text_size; + p_query_params->num_lines += res_hdr.matches; + } + + m_assert(TEST_MS_TIMESTAMP_VALID(res_hdr.timestamp), "res_hdr.timestamp is invalid"); + } + } + + for(int pfi_off = 0; p_file_infos[pfi_off]; pfi_off++) + uv_rwlock_rdunlock(&p_file_infos[pfi_off]->circ_buff->buff_realloc_rwlock); +} + +/** + * @brief Query circular 
buffer if there is space for item insertion. + * @param buff Circular buffer to query for available space. + * @param requested_text_space Size of raw (uncompressed) space needed. + * @note If buff->allow_dropped_logs is 0, then this function will block and + * it will only return once there is available space as requested. In this + * case, it will never return 0. + * @return \p requested_text_space if there is enough space, else 0. + */ +size_t circ_buff_prepare_write(Circ_buff_t *const buff, size_t const requested_text_space){ + + /* Calculate how much is the maximum compressed space that will + * be required on top of the requested space for the raw data. */ + buff->in->text_compressed_size = (size_t) LZ4_compressBound(requested_text_space); + m_assert(buff->in->text_compressed_size != 0, "requested text compressed space is zero"); + size_t const required_space = requested_text_space + buff->in->text_compressed_size; + + size_t available_text_space = 0; + size_t total_cached_mem_ex_in; + +try_to_acquire_space: + total_cached_mem_ex_in = 0; + for (int i = 0; i < buff->num_of_items; i++){ + total_cached_mem_ex_in += buff->items[i].data_max_size; + } + + /* If the required space is more than the allocated space of the input + * buffer, then we need to check if the input buffer can be reallocated: + * + * a) If the total memory consumption of the circular buffer plus the + * required space is less than the limit set by "circular buffer max size" + * for this log source, then the input buffer can be reallocated. + * + * b) If the total memory consumption of the circular buffer plus the + * required space is more than the limit set by "circular buffer max size" + * for this log source, we will attempt to reclaim some of the circular + * buffer allocated memory from any empty items. 
+ * + * c) If after reclaiming the total memory consumption is still beyond the + * configuration limit, either 0 will be returned as the available space + * for raw logs in the input buffer, or the function will block and repeat + * the same process, until there is available space to be returned, depending + * of the configuration value of buff->allow_dropped_logs. + * */ + if(required_space > buff->in->data_max_size) { + if(likely(total_cached_mem_ex_in + required_space <= buff->total_cached_mem_max)){ + buff->in->data_max_size = required_space; + buff->in->data = reallocz(buff->in->data, buff->in->data_max_size); + + available_text_space = requested_text_space; + } + else if(likely(__atomic_load_n(&buff->full, __ATOMIC_SEQ_CST) == 0)){ + int head = __atomic_load_n(&buff->head, __ATOMIC_SEQ_CST) % buff->num_of_items; + int tail = __atomic_load_n(&buff->tail, __ATOMIC_SEQ_CST) % buff->num_of_items; + + for (int i = (head == tail ? (head + 1) % buff->num_of_items : head); + i != tail; i = (i + 1) % buff->num_of_items) { + + m_assert(i <= buff->num_of_items, "i > buff->num_of_items"); + buff->items[i].data_max_size = 1; + buff->items[i].data = reallocz(buff->items[i].data, buff->items[i].data_max_size); + } + + total_cached_mem_ex_in = 0; + for (int i = 0; i < buff->num_of_items; i++){ + total_cached_mem_ex_in += buff->items[i].data_max_size; + } + + if(total_cached_mem_ex_in + required_space <= buff->total_cached_mem_max){ + buff->in->data_max_size = required_space; + buff->in->data = reallocz(buff->in->data, buff->in->data_max_size); + + available_text_space = requested_text_space; + } + else available_text_space = 0; + } + } else available_text_space = requested_text_space; + + __atomic_store_n(&buff->total_cached_mem, total_cached_mem_ex_in + buff->in->data_max_size, __ATOMIC_RELAXED); + + if(unlikely(!buff->allow_dropped_logs && !available_text_space)){ + sleep_usec(CIRC_BUFF_PREP_WR_RETRY_AFTER_MS * USEC_PER_MS); + goto try_to_acquire_space; + } + + 
m_assert(available_text_space || buff->allow_dropped_logs, "!available_text_space == 0 && !buff->allow_dropped_logs"); + return available_text_space; +} + +/** + * @brief Insert item from temporary input buffer to circular buffer. + * @param buff Circular buffer to insert the item into + * @return 0 in case of success or -1 in case there was an error (e.g. buff + * is out of space). + */ +int circ_buff_insert(Circ_buff_t *const buff){ + + // TODO: Probably can be changed to __ATOMIC_RELAXED, but ideally a mutex should be used here. + int head = __atomic_load_n(&buff->head, __ATOMIC_SEQ_CST) % buff->num_of_items; + int tail = __atomic_load_n(&buff->tail, __ATOMIC_SEQ_CST) % buff->num_of_items; + int full = __atomic_load_n(&buff->full, __ATOMIC_SEQ_CST); + + /* If circular buffer does not have any free items, it will be expanded + * by reallocating the `items` array and adding one more item. */ + if (unlikely(( head == tail ) && full )) { + debug_log( "buff out of space! will be expanded."); + uv_rwlock_wrlock(&buff->buff_realloc_rwlock); + + + Circ_buff_item_t *items_new = callocz(buff->num_of_items + 1, sizeof(Circ_buff_item_t)); + + for(int i = 0; i < buff->num_of_items; i++){ + Circ_buff_item_t *item_old = &buff->items[head++ % buff->num_of_items]; + items_new[i] = *item_old; + } + freez(buff->items); + buff->items = items_new; + + buff->parse = buff->parse - buff->tail; + head = buff->head = buff->num_of_items++; + buff->tail = buff->read = 0; + buff->full = 0; + + __atomic_add_fetch(&buff->buff_realloc_cnt, 1, __ATOMIC_RELAXED); + + uv_rwlock_wrunlock(&buff->buff_realloc_rwlock); + } + + Circ_buff_item_t *cur_item = &buff->items[head]; + + char *tmp_data = cur_item->data; + size_t tmp_data_max_size = cur_item->data_max_size; + + cur_item->status = buff->in->status; + cur_item->timestamp = buff->in->timestamp; + cur_item->data = buff->in->data; + cur_item->text_size = buff->in->text_size; + cur_item->text_compressed = buff->in->text_compressed; + 
cur_item->text_compressed_size = buff->in->text_compressed_size; + cur_item->data_max_size = buff->in->data_max_size; + cur_item->num_lines = buff->in->num_lines; + + buff->in->status = CIRC_BUFF_ITEM_STATUS_UNPROCESSED; + buff->in->timestamp = 0; + buff->in->data = tmp_data; + buff->in->text_size = 0; + // buff->in->text_compressed = tmp_data; + buff->in->text_compressed_size = 0; + buff->in->data_max_size = tmp_data_max_size; + buff->in->num_lines = 0; + + __atomic_add_fetch(&buff->text_size_total, cur_item->text_size, __ATOMIC_SEQ_CST); + + if( __atomic_add_fetch(&buff->text_compressed_size_total, cur_item->text_compressed_size, __ATOMIC_SEQ_CST)){ + __atomic_store_n(&buff->compression_ratio, + __atomic_load_n(&buff->text_size_total, __ATOMIC_SEQ_CST) / + __atomic_load_n(&buff->text_compressed_size_total, __ATOMIC_SEQ_CST), + __ATOMIC_SEQ_CST); + } else __atomic_store_n( &buff->compression_ratio, 0, __ATOMIC_SEQ_CST); + + + if(unlikely(__atomic_add_fetch(&buff->head, 1, __ATOMIC_SEQ_CST) % buff->num_of_items == + __atomic_load_n(&buff->tail, __ATOMIC_SEQ_CST) % buff->num_of_items)){ + __atomic_store_n(&buff->full, 1, __ATOMIC_SEQ_CST); + } + + __atomic_or_fetch(&cur_item->status, CIRC_BUFF_ITEM_STATUS_PARSED | CIRC_BUFF_ITEM_STATUS_STREAMED, __ATOMIC_SEQ_CST); + + return 0; +} + +/** + * @brief Return pointer to next item to be read from the circular buffer. + * @param buff Circular buffer to get next item from. + * @return Pointer to the next circular buffer item to be read, or NULL + * if there are no more items to be read. 
+ */ +Circ_buff_item_t *circ_buff_read_item(Circ_buff_t *const buff) { + + Circ_buff_item_t *item = &buff->items[buff->read % buff->num_of_items]; + + m_assert(__atomic_load_n(&item->status, __ATOMIC_RELAXED) <= CIRC_BUFF_ITEM_STATUS_DONE, "Invalid status"); + + if( /* No more records to be retrieved from the buffer - pay attention that + * there is no `% buff->num_of_items` operation, as we need to check + * the case where buff->read is exactly equal to buff->head. */ + (buff->read == (__atomic_load_n(&buff->head, __ATOMIC_SEQ_CST))) || + /* Current item is not yet both parsed and streamed */ + (__atomic_load_n(&item->status, __ATOMIC_RELAXED) != CIRC_BUFF_ITEM_STATUS_DONE) ){ + + return NULL; + } + + __atomic_sub_fetch(&buff->text_size_total, item->text_size, __ATOMIC_SEQ_CST); + + if( __atomic_sub_fetch(&buff->text_compressed_size_total, item->text_compressed_size, __ATOMIC_SEQ_CST)){ + __atomic_store_n(&buff->compression_ratio, + __atomic_load_n(&buff->text_size_total, __ATOMIC_SEQ_CST) / + __atomic_load_n(&buff->text_compressed_size_total, __ATOMIC_SEQ_CST), + __ATOMIC_SEQ_CST); + } else __atomic_store_n( &buff->compression_ratio, 0, __ATOMIC_SEQ_CST); + + buff->read++; + + return item; +} + +/** + * @brief Complete buffer read process. + * @param buff Circular buffer to complete read process on. + */ +void circ_buff_read_done(Circ_buff_t *const buff){ + /* If even one item was read, the buffer cannot be full anymore */ + if(__atomic_load_n(&buff->tail, __ATOMIC_RELAXED) != buff->read) + __atomic_store_n(&buff->full, 0, __ATOMIC_SEQ_CST); + + __atomic_store_n(&buff->tail, buff->read, __ATOMIC_SEQ_CST); +} + +/** + * @brief Create a new circular buffer. + * @param num_of_items Number of Circ_buff_item_t items in the buffer. + * @param max_size Maximum memory the circular buffer can occupy. + * @param allow_dropped_logs Non-zero if logs may be dropped when the buffer is full. + * @return Pointer to the new circular buffer structure. 
+ */ +Circ_buff_t *circ_buff_init(const int num_of_items, + const size_t max_size, + const int allow_dropped_logs ) { + Circ_buff_t *buff = callocz(1, sizeof(Circ_buff_t)); + buff->num_of_items = num_of_items; + buff->items = callocz(buff->num_of_items, sizeof(Circ_buff_item_t)); + buff->in = callocz(1, sizeof(Circ_buff_item_t)); + + uv_rwlock_init(&buff->buff_realloc_rwlock); + + buff->total_cached_mem_max = max_size; + buff->allow_dropped_logs = allow_dropped_logs; + + return buff; +} + +/** + * @brief Destroy a circular buffer. + * @param buff Circular buffer to be destroyed. + */ +void circ_buff_destroy(Circ_buff_t *buff){ + for (int i = 0; i < buff->num_of_items; i++) freez(buff->items[i].data); + freez(buff->items); + freez(buff->in->data); + freez(buff->in); + freez(buff); +}; diff --git a/logsmanagement/circular_buffer.h b/logsmanagement/circular_buffer.h new file mode 100644 index 00000000..92697824 --- /dev/null +++ b/logsmanagement/circular_buffer.h @@ -0,0 +1,66 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file circular_buffer.h + * @brief Header of circular_buffer.c + */ + +#ifndef CIRCULAR_BUFFER_H_ +#define CIRCULAR_BUFFER_H_ + +#include <stdio.h> +#include <stdlib.h> +#include <string.h> +#include <uv.h> +#include "defaults.h" +#include "query.h" +#include "file_info.h" + +// Forward declaration to break circular dependency +struct File_info; + +typedef enum { + CIRC_BUFF_ITEM_STATUS_UNPROCESSED = 0, + CIRC_BUFF_ITEM_STATUS_PARSED = 1, + CIRC_BUFF_ITEM_STATUS_STREAMED = 2, + CIRC_BUFF_ITEM_STATUS_DONE = 3 // == CIRC_BUFF_ITEM_STATUS_PARSED | CIRC_BUFF_ITEM_STATUS_STREAMED +} circ_buff_item_status_t; + +typedef struct Circ_buff_item { + circ_buff_item_status_t status; /**< Denotes if item is unprocessed, in processing or processed **/ + msec_t timestamp; /**< Epoch datetime of when data was collected **/ + char *data; /**< Base of buffer to store both uncompressed and compressed logs **/ + size_t text_size; /**< Size of uncompressed logs 
**/ + char *text_compressed; /**< Pointer offset within *data that points to start of compressed logs **/ + size_t text_compressed_size; /**< Size of compressed logs **/ + size_t data_max_size; /**< Allocated size of *data **/ + unsigned long num_lines; /**< Number of log records in item */ +} Circ_buff_item_t; + +typedef struct Circ_buff { + int num_of_items; /**< Number of preallocated items in the buffer **/ + Circ_buff_item_t *items; /**< Array of all circular buffer items **/ + Circ_buff_item_t *in; /**< Circular buffer item to write new data into **/ + int head; /**< Position of next item insertion **/ + int read; /**< Index between tail and head, used to read items out of Circ_buff **/ + int tail; /**< Last valid item in Circ_buff **/ + int parse; /**< Points to next item in buffer to be parsed **/ + int full; /**< When head == tail, this indicates if buffer is full or empty **/ + uv_rwlock_t buff_realloc_rwlock; /**< RW lock to lock buffer operations when reallocating or expanding buffer **/ + unsigned int buff_realloc_cnt; /**< Counter of how many buffer reallocations have occurred **/ + size_t total_cached_mem; /**< Total memory allocated for Circ_buff (including *in, as stored by circ_buff_prepare_write()) **/ + size_t total_cached_mem_max; /**< Maximum allowable size for total_cached_mem **/ + int allow_dropped_logs; /**< Boolean to indicate whether logs are allowed to be dropped if buffer is full */ + size_t text_size_total; /**< Total size of items[]->text_size **/ + size_t text_compressed_size_total; /**< Total size of items[]->text_compressed_size **/ + int compression_ratio; /**< text_size_total / text_compressed_size_total **/ +} Circ_buff_t; + +void circ_buff_search(logs_query_params_t *const p_query_params, struct File_info *const p_file_infos[]); +size_t circ_buff_prepare_write(Circ_buff_t *const buff, size_t const requested_text_space); +int circ_buff_insert(Circ_buff_t *const buff); +Circ_buff_item_t *circ_buff_read_item(Circ_buff_t *const buff); +void 
circ_buff_read_done(Circ_buff_t *const buff); +Circ_buff_t *circ_buff_init(const int num_of_items, const size_t max_size, const int allow_dropped_logs); +void circ_buff_destroy(Circ_buff_t *buff); + +#endif // CIRCULAR_BUFFER_H_ diff --git a/logsmanagement/db_api.c b/logsmanagement/db_api.c new file mode 100644 index 00000000..ae091443 --- /dev/null +++ b/logsmanagement/db_api.c @@ -0,0 +1,1396 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + + +/** @file db_api.c + * @brief This is the file implementing the API to the + * logs management database. + */ + +#include "daemon/common.h" +#include "db_api.h" +#include <inttypes.h> +#include <stdio.h> +#include "circular_buffer.h" +#include "helper.h" +#include "lz4.h" +#include "parser.h" + +#define MAIN_DB "main.db" /**< Primary DB with metadata for all the logs management collections **/ +#define MAIN_COLLECTIONS_TABLE "LogCollections" /**< Table name where logs collections metadata is stored in MAIN_DB **/ +#define BLOB_STORE_FILENAME "logs.bin." /**< Filename prefix of the BLOBs where logs are stored **/ +#define METADATA_DB_FILENAME "metadata.db" /**< Metadata DB for each log collection **/ +#define LOGS_TABLE "Logs" /**< Table name where logs metadata is stored in METADATA_DB_FILENAME **/ +#define BLOBS_TABLE "Blobs" /**< Table name where BLOBs metadata is stored in METADATA_DB_FILENAME **/ + +#define LOGS_MANAG_DB_VERSION 1 + +static sqlite3 *main_db = NULL; /**< SQLite DB handler for MAIN_DB **/ +static char *main_db_dir = NULL; /**< Directory where all the log management databases and log blobs are stored **/ +static char *main_db_path = NULL; /**< Path of MAIN_DB **/ + +/* -------------------------------------------------------------------------- */ +/* Database migrations */ +/* -------------------------------------------------------------------------- */ + +/** + * @brief No-op database migration, just to bump up starting version. 
+ * @param database Unused + * @param name Migration name, used only in the log message + * @return Always 0. 
+ */ +static int do_migration_noop(sqlite3 *database, const char *name){ + UNUSED(database); + UNUSED(name); + collector_info("Running database migration %s", name); + return 0; +} + +typedef struct database_func_migration_list{ + char *name; + int (*func)(sqlite3 *database, const char *name); +} DATABASE_FUNC_MIGRATION_LIST; + +DATABASE_FUNC_MIGRATION_LIST migration_list_main_db[] = { + {.name = MAIN_DB " v0 to v1", .func = do_migration_noop}, + // the terminator of this array + {.name = NULL, .func = NULL} +}; + +DATABASE_FUNC_MIGRATION_LIST migration_list_metadata_db[] = { + {.name = METADATA_DB_FILENAME " v0 to v1", .func = do_migration_noop}, + // the terminator of this array + {.name = NULL, .func = NULL} +}; + +typedef enum { + ERR_TYPE_OTHER, + ERR_TYPE_SQLITE, + ERR_TYPE_LIBUV, +} logs_manag_db_error_t; + +/** + * @brief Logs a database error + * @param[in] log_source Log source that caused the error + * @param[in] error_type Type of error + * @param[in] rc Error code + * @param[in] line Line number where the error occurred (__LINE__) + * @param[in] file Source file where the error occurred (__FILE__) + * @param[in] func Function where the error occurred (__FUNCTION__) + */ +static void throw_error(const char *const log_source, + const logs_manag_db_error_t error_type, + const int rc, const int line, + const char *const file, const char *const func){ + collector_error("[%s]: %s database error: (%d) %s (%s:%s:%d)", + log_source ? log_source : "-", + error_type == ERR_TYPE_OTHER ? "" : error_type == ERR_TYPE_SQLITE ? "SQLite" : "libuv", + rc, error_type == ERR_TYPE_OTHER ? "" : error_type == ERR_TYPE_SQLITE ? sqlite3_errstr(rc) : uv_strerror(rc), + file, func, line); +} + +/** + * @brief Get or set user_version of database. + * @param db SQLite database to act upon. + * @param set_user_version If <= 0, just get user_version. Otherwise, set + * user_version first, before returning it. + * @return Database user_version or -1 in case of error. 
+ */ +int db_user_version(sqlite3 *const db, const int set_user_version){ + if(unlikely(!db)) return -1; + int rc = 0; + if(set_user_version <= 0){ + sqlite3_stmt *stmt_get_user_version; + rc = sqlite3_prepare_v2(db, "PRAGMA user_version;", -1, &stmt_get_user_version, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(NULL, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return -1; + } + rc = sqlite3_step(stmt_get_user_version); + if (unlikely(SQLITE_ROW != rc)) { + throw_error(NULL, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return -1; + } + int current_user_version = sqlite3_column_int(stmt_get_user_version, 0); + rc = sqlite3_finalize(stmt_get_user_version); + if (unlikely(SQLITE_OK != rc)) { + throw_error(NULL, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return -1; + } + return current_user_version; + } else { + char buf[25]; + snprintfz(buf, 25, "PRAGMA user_version=%d;", set_user_version); + rc = sqlite3_exec(db, buf, NULL, NULL, NULL); + if (unlikely(SQLITE_OK!= rc)) { + throw_error(NULL, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return -1; + } + return set_user_version; + } +} + +static void db_writer_db_mode_none(void *arg){ + struct File_info *const p_file_info = (struct File_info *) arg; + Circ_buff_item_t *item; + + while(__atomic_load_n(&p_file_info->state, __ATOMIC_RELAXED) == LOG_SRC_READY){ + uv_rwlock_rdlock(&p_file_info->circ_buff->buff_realloc_rwlock); + do{ item = circ_buff_read_item(p_file_info->circ_buff);} while(item); + circ_buff_read_done(p_file_info->circ_buff); + uv_rwlock_rdunlock(&p_file_info->circ_buff->buff_realloc_rwlock); + for(int i = 0; i < p_file_info->buff_flush_to_db_interval * 4; i++){ + if(__atomic_load_n(&p_file_info->state, __ATOMIC_RELAXED) != LOG_SRC_READY) + break; + sleep_usec(250 * USEC_PER_MS); + } + } +} + +#define return_db_writer_db_mode_none(p_file_info, do_mut_unlock){ \ + p_file_info->db_mode = LOGS_MANAG_DB_MODE_NONE; \ + freez((void *) 
p_file_info->db_dir); \ + p_file_info->db_dir = strdupz(""); \ + freez((void *) p_file_info->db_metadata); \ + p_file_info->db_metadata = NULL; \ + sqlite3_finalize(stmt_logs_insert); \ + sqlite3_finalize(stmt_blobs_get_total_filesize); \ + sqlite3_finalize(stmt_blobs_update); \ + sqlite3_finalize(stmt_blobs_set_zero_filesize); \ + sqlite3_finalize(stmt_logs_delete); \ + if(do_mut_unlock){ \ + uv_mutex_unlock(p_file_info->db_mut); \ + uv_rwlock_rdunlock(&p_file_info->circ_buff->buff_realloc_rwlock); \ + } \ + if(__atomic_load_n(&p_file_info->state, __ATOMIC_RELAXED) == LOG_SRC_READY) \ + return fatal_assert(!uv_thread_create( p_file_info->db_writer_thread, \ + db_writer_db_mode_none, \ + p_file_info)); \ +} + +static void db_writer_db_mode_full(void *arg){ + int rc = 0; + struct File_info *const p_file_info = (struct File_info *) arg; + + sqlite3_stmt *stmt_logs_insert = NULL; + sqlite3_stmt *stmt_blobs_get_total_filesize = NULL; + sqlite3_stmt *stmt_blobs_update = NULL; + sqlite3_stmt *stmt_blobs_set_zero_filesize = NULL; + sqlite3_stmt *stmt_logs_delete = NULL; + + /* Prepare LOGS_TABLE INSERT statement */ + rc = sqlite3_prepare_v2(p_file_info->db, + "INSERT INTO " LOGS_TABLE "(" + "FK_BLOB_Id," + "BLOB_Offset," + "Timestamp," + "Msg_compr_size," + "Msg_decompr_size," + "Num_lines" + ") VALUES (?,?,?,?,?,?) 
;", + -1, &stmt_logs_insert, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + /* Prepare BLOBS_TABLE get total filesize statement */ + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT SUM(Filesize) FROM " BLOBS_TABLE " ;", + -1, &stmt_blobs_get_total_filesize, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + /* Prepare BLOBS_TABLE UPDATE statement */ + rc = sqlite3_prepare_v2(p_file_info->db, + "UPDATE " BLOBS_TABLE + " SET Filesize = Filesize + ?" + " WHERE Id = ? ;", + -1, &stmt_blobs_update, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + /* Prepare BLOBS_TABLE UPDATE SET zero filesize statement */ + rc = sqlite3_prepare_v2(p_file_info->db, + "UPDATE " BLOBS_TABLE + " SET Filesize = 0" + " WHERE Id = ? ;", + -1, &stmt_blobs_set_zero_filesize, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + /* Prepare LOGS_TABLE DELETE statement */ + rc = sqlite3_prepare_v2(p_file_info->db, + "DELETE FROM " LOGS_TABLE + " WHERE FK_BLOB_Id = ? ;", + -1, &stmt_logs_delete, NULL); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + /* Get initial filesize of logs.bin.0 BLOB */ + sqlite3_stmt *stmt_retrieve_filesize_from_id = NULL; + if(unlikely( + SQLITE_OK != (rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT Filesize FROM " BLOBS_TABLE + " WHERE Id = ? 
;", + -1, &stmt_retrieve_filesize_from_id, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_retrieve_filesize_from_id, 1, + p_file_info->blob_write_handle_offset)) || + SQLITE_ROW != (rc = sqlite3_step(stmt_retrieve_filesize_from_id)) + )){ + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + int64_t blob_filesize = (int64_t) sqlite3_column_int64(stmt_retrieve_filesize_from_id, 0); + rc = sqlite3_finalize(stmt_retrieve_filesize_from_id); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 0); + } + + struct timespec ts_db_write_start, ts_db_write_end, ts_db_rotate_end; + while(__atomic_load_n(&p_file_info->state, __ATOMIC_RELAXED) == LOG_SRC_READY){ + clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts_db_write_start); + + uv_rwlock_rdlock(&p_file_info->circ_buff->buff_realloc_rwlock); + uv_mutex_lock(p_file_info->db_mut); + + /* --------------------------------------------------------------------- + * Read items from circular buffer and store them in disk BLOBs. + * After that, SQLite metadata is updated. 
+ * ------------------------------------------------------------------ */ + Circ_buff_item_t *item = circ_buff_read_item(p_file_info->circ_buff); + while (item) { + m_assert(TEST_MS_TIMESTAMP_VALID(item->timestamp), "item->timestamp == 0"); + m_assert(item->text_compressed_size != 0, "item->text_compressed_size == 0"); + m_assert(item->text_size != 0, "item->text_size == 0"); + + /* Write logs in BLOB */ + uv_fs_t write_req; + uv_buf_t uv_buf = uv_buf_init((char *) item->text_compressed, (unsigned int) item->text_compressed_size); + rc = uv_fs_write( NULL, &write_req, + p_file_info->blob_handles[p_file_info->blob_write_handle_offset], + &uv_buf, 1, blob_filesize, NULL); // Write synchronously at the end of the BLOB file + uv_fs_req_cleanup(&write_req); + if(unlikely(rc < 0)){ + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + circ_buff_read_done(p_file_info->circ_buff); + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* Ensure data is flushed to BLOB via fdatasync() */ + uv_fs_t dsync_req; + rc = uv_fs_fdatasync( NULL, &dsync_req, + p_file_info->blob_handles[p_file_info->blob_write_handle_offset], NULL); + uv_fs_req_cleanup(&dsync_req); + if (unlikely(rc)){ + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + circ_buff_read_done(p_file_info->circ_buff); + return_db_writer_db_mode_none(p_file_info, 1); + } + + if(unlikely( + /* Write metadata of logs in LOGS_TABLE */ + SQLITE_OK != (rc = sqlite3_exec(p_file_info->db, "BEGIN TRANSACTION;", NULL, NULL, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_logs_insert, 1, p_file_info->blob_write_handle_offset)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_logs_insert, 2, (sqlite3_int64) blob_filesize)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_logs_insert, 3, (sqlite3_int64) item->timestamp)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_logs_insert, 4, (sqlite3_int64) item->text_compressed_size)) || + SQLITE_OK != 
(rc = sqlite3_bind_int64(stmt_logs_insert, 5, (sqlite3_int64)item->text_size)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_logs_insert, 6, (sqlite3_int64)item->num_lines)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_logs_insert)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_logs_insert)) || + + /* Update metadata of BLOBs filesize in BLOBS_TABLE */ + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_blobs_update, 1, (sqlite3_int64)item->text_compressed_size)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_blobs_update, 2, p_file_info->blob_write_handle_offset)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_blobs_update)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_blobs_update)) || + SQLITE_OK != (rc = sqlite3_exec(p_file_info->db, "END TRANSACTION;", NULL, NULL, NULL)) + )) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + rc = sqlite3_exec(p_file_info->db, "ROLLBACK;", NULL, NULL, NULL); + if (unlikely(SQLITE_OK != rc)) + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + circ_buff_read_done(p_file_info->circ_buff); + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* TODO: Should we log it if there is a fatal error in the transaction, + * as there will be a mismatch between BLOBs and SQLite metadata? */ + + /* Increase BLOB offset and read next log message until no more messages in buff */ + blob_filesize += (int64_t) item->text_compressed_size; + item = circ_buff_read_item(p_file_info->circ_buff); + } + circ_buff_read_done(p_file_info->circ_buff); + + clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts_db_write_end); + + /* --------------------------------------------------------------------- + * If the filesize of the current write-to BLOB is > + * p_file_info->blob_max_size, then perform a BLOBs rotation. 
+ * ------------------------------------------------------------------ */ + if(blob_filesize > p_file_info->blob_max_size){ + uv_fs_t rename_req; + char old_path[FILENAME_MAX + 1], new_path[FILENAME_MAX + 1]; + + /* Rotate path of BLOBs */ + for(int i = BLOB_MAX_FILES - 1; i >= 0; i--){ + snprintfz(old_path, FILENAME_MAX, "%s" BLOB_STORE_FILENAME "%d", p_file_info->db_dir, i); + snprintfz(new_path, FILENAME_MAX, "%s" BLOB_STORE_FILENAME "%d", p_file_info->db_dir, i + 1); + rc = uv_fs_rename(NULL, &rename_req, old_path, new_path, NULL); + uv_fs_req_cleanup(&rename_req); + if (unlikely(rc)){ + //TODO: This error case needs better handling, as it will result in mismatch with sqlite metadata. + // We probably require a WAL or something similar. + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + } + + /* Replace the maximum number with 0 in BLOB files. */ + snprintfz(old_path, FILENAME_MAX, "%s" BLOB_STORE_FILENAME "%d", p_file_info->db_dir, BLOB_MAX_FILES); + snprintfz(new_path, FILENAME_MAX, "%s" BLOB_STORE_FILENAME "%d", p_file_info->db_dir, 0); + rc = uv_fs_rename(NULL, &rename_req, old_path, new_path, NULL); + uv_fs_req_cleanup(&rename_req); + if (unlikely(rc)){ + //TODO: This error case needs better handling, as it will result in mismatch with sqlite metadata. + // We probably require a WAL or something similar. 
+ throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* Rotate BLOBS_TABLE Filenames */ + rc = sqlite3_exec(p_file_info->db, + "UPDATE " BLOBS_TABLE + " SET Filename = REPLACE( " + " Filename, " + " substr(Filename, -1), " + " case when " + " (cast(substr(Filename, -1) AS INTEGER) < (" LOGS_MANAG_STR(BLOB_MAX_FILES) " - 1)) then " + " substr(Filename, -1) + 1 else 0 end);", + NULL, NULL, NULL); + if (unlikely(rc != SQLITE_OK)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + //TODO: Undo rotation if possible? + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* ----------------------------------------------------------------- + * (a) Update blob_write_handle_offset, + * (b) truncate new write-to BLOB, + * (c) update filesize of truncated BLOB in SQLite DB, + * (d) delete respective logs in LOGS_TABLE for the truncated BLOB and + * (e) reset blob_filesize + * -------------------------------------------------------------- */ + /* (a) */ + p_file_info->blob_write_handle_offset = + p_file_info->blob_write_handle_offset == 1 ? BLOB_MAX_FILES : p_file_info->blob_write_handle_offset - 1; + + /* (b) */ + uv_fs_t trunc_req; + rc = uv_fs_ftruncate(NULL, &trunc_req, p_file_info->blob_handles[p_file_info->blob_write_handle_offset], 0, NULL); + uv_fs_req_cleanup(&trunc_req); + if (unlikely(rc)){ + //TODO: This error case needs better handling, as it will result in mismatch with sqlite metadata. + // We probably require a WAL or something similar. 
+ throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* (c) */ + if(unlikely( + SQLITE_OK != (rc = sqlite3_exec(p_file_info->db, "BEGIN TRANSACTION;", NULL, NULL, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_blobs_set_zero_filesize, 1, p_file_info->blob_write_handle_offset)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_blobs_set_zero_filesize)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_blobs_set_zero_filesize)) || + + /* (d) */ + SQLITE_OK != (rc = sqlite3_bind_int(stmt_logs_delete, 1, p_file_info->blob_write_handle_offset)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_logs_delete)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_logs_delete)) || + SQLITE_OK != (rc = sqlite3_exec(p_file_info->db, "END TRANSACTION;", NULL, NULL, NULL)) + )) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + rc = sqlite3_exec(p_file_info->db, "ROLLBACK;", NULL, NULL, NULL); + if (unlikely(SQLITE_OK != rc)) + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + + /* (e) */ + blob_filesize = 0; + + } + + clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts_db_rotate_end); + + /* Update database write & rotate timings for this log source */ + __atomic_store_n(&p_file_info->db_write_duration, + (ts_db_write_end.tv_sec - ts_db_write_start.tv_sec) * NSEC_PER_SEC + + (ts_db_write_end.tv_nsec - ts_db_write_start.tv_nsec), __ATOMIC_RELAXED); + __atomic_store_n(&p_file_info->db_rotate_duration, + (ts_db_rotate_end.tv_sec - ts_db_write_end.tv_sec) * NSEC_PER_SEC + + (ts_db_rotate_end.tv_nsec - ts_db_write_end.tv_nsec), __ATOMIC_RELAXED); + + /* Update total disk usage of all BLOBs for this log source */ + rc = sqlite3_step(stmt_blobs_get_total_filesize); + if (unlikely(SQLITE_ROW != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, 
__FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + __atomic_store_n(&p_file_info->blob_total_size, sqlite3_column_int64(stmt_blobs_get_total_filesize, 0), __ATOMIC_RELAXED); + rc = sqlite3_reset(stmt_blobs_get_total_filesize); + if (unlikely(SQLITE_OK != rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + return_db_writer_db_mode_none(p_file_info, 1); + } + + // TODO: Can uv_mutex_unlock(p_file_info->db_mut) be moved before if(blob_filesize > p_file_info-> blob_max_size) ? + uv_mutex_unlock(p_file_info->db_mut); + uv_rwlock_rdunlock(&p_file_info->circ_buff->buff_realloc_rwlock); + for(int i = 0; i < p_file_info->buff_flush_to_db_interval * 4; i++){ + if(__atomic_load_n(&p_file_info->state, __ATOMIC_RELAXED) != LOG_SRC_READY) + break; + sleep_usec(250 * USEC_PER_MS); + } + } + + return_db_writer_db_mode_none(p_file_info, 0); +} + +inline void db_set_main_dir(char *const dir){ + main_db_dir = dir; +} + +int db_init() { + int rc = 0; + char *err_msg = 0; + uv_fs_t mkdir_req; + + if(unlikely(!main_db_dir || !*main_db_dir)){ + rc = -1; + collector_error("main_db_dir is unset"); + throw_error(NULL, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + size_t main_db_path_len = strlen(main_db_dir) + sizeof(MAIN_DB) + 1; + main_db_path = mallocz(main_db_path_len); + snprintfz(main_db_path, main_db_path_len, "%s/" MAIN_DB, main_db_dir); + + /* Create databases directory if it doesn't exist. 
*/ + rc = uv_fs_mkdir(NULL, &mkdir_req, main_db_dir, 0775, NULL); + uv_fs_req_cleanup(&mkdir_req); + if(rc == 0) collector_info("DB directory created: %s", main_db_dir); + else if (rc == UV_EEXIST) collector_info("DB directory %s found", main_db_dir); + else { + throw_error(NULL, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + + /* Create or open main db */ + rc = sqlite3_open(main_db_path, &main_db); + if (unlikely(rc != SQLITE_OK)){ + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + + /* Configure main database */ + rc = sqlite3_exec(main_db, + "PRAGMA auto_vacuum = INCREMENTAL;" + "PRAGMA synchronous = 1;" + "PRAGMA journal_mode = WAL;" + "PRAGMA temp_store = MEMORY;" + "PRAGMA foreign_keys = ON;", + 0, 0, &err_msg); + if (unlikely(rc != SQLITE_OK)) { + collector_error("Failed to configure database, SQL error: %s\n", err_msg); + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } else collector_info("%s configured successfully", MAIN_DB); + + /* Execute pending main database migrations */ + int main_db_ver = db_user_version(main_db, -1); + if (likely(LOGS_MANAG_DB_VERSION == main_db_ver)) + collector_info("Logs management %s database version is %d (no migration needed)", MAIN_DB, main_db_ver); + else { + for(int ver = main_db_ver; ver < LOGS_MANAG_DB_VERSION && migration_list_main_db[ver].func; ver++){ + rc = (migration_list_main_db[ver].func)(main_db, migration_list_main_db[ver].name); + if (unlikely(rc)){ + collector_error("Logs management %s database migration from version %d to version %d failed", MAIN_DB, ver, ver + 1); + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + db_user_version(main_db, ver + 1); + } + } + + /* Create new main DB LogCollections table if it doesn't exist */ + rc = sqlite3_exec(main_db, + "CREATE TABLE IF NOT EXISTS " MAIN_COLLECTIONS_TABLE "(" 
+ "Id INTEGER PRIMARY KEY," + "Stream_Tag TEXT NOT NULL," + "Log_Source_Path TEXT NOT NULL," + "Type INTEGER NOT NULL," + "DB_Dir TEXT NOT NULL," + "UNIQUE(Stream_Tag, DB_Dir) " + ");", + 0, 0, &err_msg); + if (unlikely(SQLITE_OK != rc)) { + collector_error("Failed to create table " MAIN_COLLECTIONS_TABLE ", SQL error: %s", err_msg); + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + + sqlite3_stmt *stmt_search_if_log_source_exists = NULL; + rc = sqlite3_prepare_v2(main_db, + "SELECT COUNT(*), Id, DB_Dir FROM " MAIN_COLLECTIONS_TABLE + " WHERE Stream_Tag = ? AND Log_Source_Path = ? AND Type = ? ;", + -1, &stmt_search_if_log_source_exists, NULL); + if (unlikely(SQLITE_OK != rc)){ + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + + + sqlite3_stmt *stmt_insert_log_collection_metadata = NULL; + rc = sqlite3_prepare_v2(main_db, + "INSERT INTO " MAIN_COLLECTIONS_TABLE + " (Stream_Tag, Log_Source_Path, Type, DB_Dir) VALUES (?,?,?,?) 
;", + -1, &stmt_insert_log_collection_metadata, NULL); + if (unlikely(SQLITE_OK != rc)){ + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + + for (int i = 0; i < p_file_infos_arr->count; i++) { + + struct File_info *const p_file_info = p_file_infos_arr->data[i]; + + if(p_file_info->db_mode == LOGS_MANAG_DB_MODE_NONE){ + p_file_info->db_dir = strdupz(""); + p_file_info->db_writer_thread = mallocz(sizeof(uv_thread_t)); + rc = uv_thread_create(p_file_info->db_writer_thread, db_writer_db_mode_none, p_file_info); + if (unlikely(rc)){ + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + } + else if(p_file_info->db_mode == LOGS_MANAG_DB_MODE_FULL){ + + p_file_info->db_mut = mallocz(sizeof(uv_mutex_t)); + rc = uv_mutex_init(p_file_info->db_mut); + if (unlikely(rc)) fatal("Failed to initialize uv_mutex_t"); + uv_mutex_lock(p_file_info->db_mut); + + // This error check will be used a lot, so define it here. 
+ #define do_sqlite_error_check(p_file_info, rc, rc_expctd) do { \ + if(unlikely(rc_expctd != rc)) { \ + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__);\ + uv_mutex_unlock(p_file_info->db_mut); \ + goto return_error; \ + } \ + } while(0) + + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_text(stmt_search_if_log_source_exists, 1, p_file_info->stream_guid, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_text(stmt_search_if_log_source_exists, 2, p_file_info->filename, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_search_if_log_source_exists, 3, p_file_info->log_type)) || + /* COUNT(*) query should always return SQLITE_ROW */ + SQLITE_ROW != (rc = sqlite3_step(stmt_search_if_log_source_exists)))){ + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + const int log_source_occurences = sqlite3_column_int(stmt_search_if_log_source_exists, 0); + switch (log_source_occurences) { + case 0: { /* Log collection metadata not found in main DB - create a new record */ + + /* Create directory of collection of logs for the particular + * log source (in the form of a UUID) and bind it. */ + uuid_t uuid; + uuid_generate(uuid); + char uuid_str[UUID_STR_LEN]; // ex. 
"1b4e28ba-2fa1-11d2-883f-0016d3cca427" + "\0" + uuid_unparse_lower(uuid, uuid_str); + + p_file_info->db_dir = mallocz(snprintf(NULL, 0, "%s/%s/", main_db_dir, uuid_str) + 1); + sprintf((char *) p_file_info->db_dir, "%s/%s/", main_db_dir, uuid_str); + + rc = uv_fs_mkdir(NULL, &mkdir_req, p_file_info->db_dir, 0775, NULL); + uv_fs_req_cleanup(&mkdir_req); + if (unlikely(rc)) { + if(errno == EEXIST) + collector_error("DB directory %s exists but not found in %s.\n", p_file_info->db_dir, MAIN_DB); + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_text(stmt_insert_log_collection_metadata, 1, p_file_info->stream_guid, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_text(stmt_insert_log_collection_metadata, 2, p_file_info->filename, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_insert_log_collection_metadata, 3, p_file_info->log_type)) || + SQLITE_OK != (rc = sqlite3_bind_text(stmt_insert_log_collection_metadata, 4, p_file_info->db_dir, -1, NULL)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_insert_log_collection_metadata)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_insert_log_collection_metadata)))) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + break; + } + + case 1: { /* File metadata found in DB */ + p_file_info->db_dir = mallocz((size_t)sqlite3_column_bytes(stmt_search_if_log_source_exists, 2) + 1); + sprintf((char*) p_file_info->db_dir, "%s", sqlite3_column_text(stmt_search_if_log_source_exists, 2)); + break; + } + + default: { /* Error, file metadata can exist either 0 or 1 times in DB */ + m_assert(0, "Same file stored in DB more than once!"); + collector_error("[%s]: Record encountered multiple times in DB " MAIN_COLLECTIONS_TABLE " table \n", + p_file_info->filename); + 
throw_error(p_file_info->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + } + rc = sqlite3_reset(stmt_search_if_log_source_exists); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* Create or open metadata DBs for each log collection */ + p_file_info->db_metadata = mallocz(snprintf(NULL, 0, "%s" METADATA_DB_FILENAME, p_file_info->db_dir) + 1); + sprintf((char *) p_file_info->db_metadata, "%s" METADATA_DB_FILENAME, p_file_info->db_dir); + rc = sqlite3_open(p_file_info->db_metadata, &p_file_info->db); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* Configure metadata DB */ + rc = sqlite3_exec(p_file_info->db, + "PRAGMA auto_vacuum = INCREMENTAL;" + "PRAGMA synchronous = 1;" + "PRAGMA journal_mode = WAL;" + "PRAGMA temp_store = MEMORY;" + "PRAGMA foreign_keys = ON;", + 0, 0, &err_msg); + if (unlikely(rc != SQLITE_OK)) { + collector_error("[%s]: Failed to configure database, SQL error: %s", p_file_info->filename, err_msg); + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + /* Execute pending metadata database migrations */ + collector_info("[%s]: About to execute " METADATA_DB_FILENAME " migrations", p_file_info->chartname); + int metadata_db_ver = db_user_version(p_file_info->db, -1); + if (likely(LOGS_MANAG_DB_VERSION == metadata_db_ver)) { + collector_info( "[%s]: Logs management " METADATA_DB_FILENAME " database version is %d (no migration needed)", + p_file_info->chartname, metadata_db_ver); + } else { + for(int ver = metadata_db_ver; ver < LOGS_MANAG_DB_VERSION && migration_list_metadata_db[ver].func; ver++){ + rc = (migration_list_metadata_db[ver].func)(p_file_info->db, migration_list_metadata_db[ver].name); + if (unlikely(rc)){ + collector_error("[%s]: Logs management " METADATA_DB_FILENAME " database migration from version %d to version %d 
failed", + p_file_info->chartname, ver, ver + 1); + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + db_user_version(p_file_info->db, ver + 1); + } + } + + /* ----------------------------------------------------------------- + * Create BLOBS_TABLE and LOGS_TABLE if they don't exist. Do it + * as a transaction, so that it can all be rolled back if something + * goes wrong. + * -------------------------------------------------------------- */ + { + rc = sqlite3_exec(p_file_info->db, "BEGIN TRANSACTION;", NULL, NULL, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* Check if BLOBS_TABLE exists or not */ + sqlite3_stmt *stmt_check_if_BLOBS_TABLE_exists = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT COUNT(*) FROM sqlite_master" + " WHERE type='table' AND name='"BLOBS_TABLE"';", + -1, &stmt_check_if_BLOBS_TABLE_exists, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_check_if_BLOBS_TABLE_exists); + do_sqlite_error_check(p_file_info, rc, SQLITE_ROW); + + /* If BLOBS_TABLE doesn't exist, create and populate it */ + if(sqlite3_column_int(stmt_check_if_BLOBS_TABLE_exists, 0) == 0){ + + /* 1. Create it */ + rc = sqlite3_exec(p_file_info->db, + "CREATE TABLE IF NOT EXISTS " BLOBS_TABLE "(" + "Id INTEGER PRIMARY KEY," + "Filename TEXT NOT NULL," + "Filesize INTEGER NOT NULL" + ");", + 0, 0, &err_msg); + if (unlikely(SQLITE_OK != rc)) { + collector_error("[%s]: Failed to create " BLOBS_TABLE ", SQL error: %s", p_file_info->chartname, err_msg); + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } else collector_info("[%s]: Table " BLOBS_TABLE " created successfully", p_file_info->chartname); + + /* 2. 
Populate it */ + sqlite3_stmt *stmt_init_BLOBS_table = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "INSERT INTO " BLOBS_TABLE + " (Filename, Filesize) VALUES (?,?) ;", + -1, &stmt_init_BLOBS_table, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + for(int i = 0; i < BLOB_MAX_FILES; i++){ + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, BLOB_STORE_FILENAME "%d", i); + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_text(stmt_init_BLOBS_table, 1, filename, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_init_BLOBS_table, 2, (sqlite3_int64) 0)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_init_BLOBS_table)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_init_BLOBS_table)))){ + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + } + rc = sqlite3_finalize(stmt_init_BLOBS_table); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + } + rc = sqlite3_finalize(stmt_check_if_BLOBS_TABLE_exists); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* If LOGS_TABLE doesn't exist, create it */ + rc = sqlite3_exec(p_file_info->db, + "CREATE TABLE IF NOT EXISTS " LOGS_TABLE "(" + "Id INTEGER PRIMARY KEY," + "FK_BLOB_Id INTEGER NOT NULL," + "BLOB_Offset INTEGER NOT NULL," + "Timestamp INTEGER NOT NULL," + "Msg_compr_size INTEGER NOT NULL," + "Msg_decompr_size INTEGER NOT NULL," + "Num_lines INTEGER NOT NULL," + "FOREIGN KEY (FK_BLOB_Id) REFERENCES " BLOBS_TABLE " (Id) ON DELETE CASCADE ON UPDATE CASCADE" + ");", + 0, 0, &err_msg); + if (unlikely(SQLITE_OK != rc)) { + collector_error("[%s]: Failed to create " LOGS_TABLE ", SQL error: %s", p_file_info->chartname, err_msg); + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } else collector_info("[%s]: Table " LOGS_TABLE " created successfully", p_file_info->chartname); 
+ + /* Create index on LOGS_TABLE Timestamp + * TODO: If this doesn't speed up queries, check SQLITE R*tree + * module. Requires benchmarking with/without index. */ + rc = sqlite3_exec(p_file_info->db, + "CREATE INDEX IF NOT EXISTS logs_timestamps_idx " + "ON " LOGS_TABLE "(Timestamp);", + 0, 0, &err_msg); + if (unlikely(SQLITE_OK != rc)) { + collector_error("[%s]: Failed to create logs_timestamps_idx, SQL error: %s", p_file_info->chartname, err_msg); + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } else collector_info("[%s]: logs_timestamps_idx created successfully", p_file_info->chartname); + + rc = sqlite3_exec(p_file_info->db, "END TRANSACTION;", NULL, NULL, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + } + + + /* ----------------------------------------------------------------- + * Remove excess BLOBs beyond BLOB_MAX_FILES (from both DB and disk + * storage). + * + * This is useful if BLOB_MAX_FILES is reduced after an agent + * restart (for example, if in the future it is not hardcoded, + * but instead it is read from the configuration file). LOGS_TABLE + * entries should be deleted automatically (due to ON DELETE CASCADE). 
+ * -------------------------------------------------------------- */ + { + sqlite3_stmt *stmt_get_BLOBS_TABLE_size = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT MAX(Id) FROM " BLOBS_TABLE ";", + -1, &stmt_get_BLOBS_TABLE_size, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_get_BLOBS_TABLE_size); + do_sqlite_error_check(p_file_info, rc, SQLITE_ROW); + + const int blobs_table_max_id = sqlite3_column_int(stmt_get_BLOBS_TABLE_size, 0); + + sqlite3_stmt *stmt_retrieve_filename_last_digits = NULL; // This statement retrieves the last digit(s) from the Filename column of BLOBS_TABLE + rc = sqlite3_prepare_v2(p_file_info->db, + "WITH split(word, str) AS ( SELECT '', (SELECT Filename FROM " BLOBS_TABLE " WHERE Id = ? ) || '.' " + "UNION ALL SELECT substr(str, 0, instr(str, '.')), substr(str, instr(str, '.')+1) FROM split WHERE str!='' ) " + "SELECT word FROM split WHERE word!='' ORDER BY LENGTH(str) LIMIT 1;", + -1, &stmt_retrieve_filename_last_digits, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + sqlite3_stmt *stmt_delete_row_by_id = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "DELETE FROM " BLOBS_TABLE " WHERE Id = ?;", + -1, &stmt_delete_row_by_id, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + for (int id = 1; id <= blobs_table_max_id; id++){ + + rc = sqlite3_bind_int(stmt_retrieve_filename_last_digits, 1, id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_retrieve_filename_last_digits); + do_sqlite_error_check(p_file_info, rc, SQLITE_ROW); + int last_digits = sqlite3_column_int(stmt_retrieve_filename_last_digits, 0); + rc = sqlite3_reset(stmt_retrieve_filename_last_digits); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* If last_digits > BLOB_MAX_FILES - 1, then some BLOB files + * will need to be removed (both from DB BLOBS_TABLE and + * also from the disk). 
*/ + if(last_digits > BLOB_MAX_FILES - 1){ + + /* Delete BLOB file from filesystem */ + char blob_delete_path[FILENAME_MAX + 1]; + snprintfz(blob_delete_path, FILENAME_MAX, "%s" BLOB_STORE_FILENAME "%d", p_file_info->db_dir, last_digits); + uv_fs_t unlink_req; + rc = uv_fs_unlink(NULL, &unlink_req, blob_delete_path, NULL); + uv_fs_req_cleanup(&unlink_req); + if (unlikely(rc)) { + // TODO: If there is an error here, the entry won't be deleted from BLOBS_TABLE. What to do? + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* Delete entry from DB BLOBS_TABLE */ + rc = sqlite3_bind_int(stmt_delete_row_by_id, 1, id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_delete_row_by_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_DONE); + rc = sqlite3_reset(stmt_delete_row_by_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + } + } + rc = sqlite3_finalize(stmt_retrieve_filename_last_digits); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_finalize(stmt_delete_row_by_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* ------------------------------------------------------------- + * BLOBS_TABLE ids after the deletion might not be contiguous. + * This needs to be fixed, by having the ids updated. + * LOGS_TABLE FKs will be updated automatically + * (due to ON UPDATE CASCADE). 
+ * ---------------------------------------------------------- */ + + int old_blobs_table_ids[BLOB_MAX_FILES]; + int off = 0; + sqlite3_stmt *stmt_retrieve_all_ids = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT Id FROM " BLOBS_TABLE " ORDER BY Id ASC;", + -1, &stmt_retrieve_all_ids, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + rc = sqlite3_step(stmt_retrieve_all_ids); + while(rc == SQLITE_ROW){ + old_blobs_table_ids[off++] = sqlite3_column_int(stmt_retrieve_all_ids, 0); + rc = sqlite3_step(stmt_retrieve_all_ids); + } + do_sqlite_error_check(p_file_info, rc, SQLITE_DONE); + rc = sqlite3_finalize(stmt_retrieve_all_ids); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + sqlite3_stmt *stmt_update_id = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "UPDATE " BLOBS_TABLE " SET Id = ? WHERE Id = ?;", + -1, &stmt_update_id, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + for (int i = 0; i < BLOB_MAX_FILES; i++){ + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_int(stmt_update_id, 1, i + 1)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_update_id, 2, old_blobs_table_ids[i])) || + SQLITE_DONE != (rc = sqlite3_step(stmt_update_id)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_update_id)))) { + throw_error(p_file_info->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + } + rc = sqlite3_finalize(stmt_update_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + } + + /* ----------------------------------------------------------------- + * Traverse BLOBS_TABLE, open logs.bin.X files and store their + * file handles in p_file_info array. + * -------------------------------------------------------------- */ + sqlite3_stmt *stmt_retrieve_metadata_from_id = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT Filename, Filesize FROM " BLOBS_TABLE + " WHERE Id = ? 
;", + -1, &stmt_retrieve_metadata_from_id, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + sqlite3_stmt *stmt_retrieve_total_logs_size = NULL; + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT SUM(Msg_compr_size) FROM " LOGS_TABLE + " WHERE FK_BLOB_Id = ? GROUP BY FK_BLOB_Id ;", + -1, &stmt_retrieve_total_logs_size, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + uv_fs_t open_req; + for(int id = 1; id <= BLOB_MAX_FILES; id++){ + + /* Open BLOB file based on filename stored in BLOBS_TABLE. */ + rc = sqlite3_bind_int(stmt_retrieve_metadata_from_id, 1, id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_retrieve_metadata_from_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_ROW); + + char filename[FILENAME_MAX + 1] = {0}; + snprintfz(filename, FILENAME_MAX, "%s%s", p_file_info->db_dir, + sqlite3_column_text(stmt_retrieve_metadata_from_id, 0)); + rc = uv_fs_open(NULL, &open_req, filename, + UV_FS_O_RDWR | UV_FS_O_CREAT | UV_FS_O_APPEND | UV_FS_O_RANDOM, + 0644, NULL); + if (unlikely(rc < 0)){ + uv_fs_req_cleanup(&open_req); + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + // open_req.result of a uv_fs_t is the file descriptor in case of the uv_fs_open + p_file_info->blob_handles[id] = open_req.result; + uv_fs_req_cleanup(&open_req); + + const int64_t metadata_filesize = (int64_t) sqlite3_column_int64(stmt_retrieve_metadata_from_id, 1); + + /* ------------------------------------------------------------- + * Retrieve total log messages compressed size from LOGS_TABLE + * for current FK_BLOB_Id. + * Only to assert whether correct - not used elsewhere. + * + * If no rows are returned, it means it is probably the initial + * execution of the program so still valid (except if rc is other + * than SQLITE_DONE, which is an error then). 
+ * ---------------------------------------------------------- */ + rc = sqlite3_bind_int(stmt_retrieve_total_logs_size, 1, id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_step(stmt_retrieve_total_logs_size); + if (SQLITE_ROW == rc){ + const int64_t total_logs_filesize = (int64_t) sqlite3_column_int64(stmt_retrieve_total_logs_size, 0); + if(unlikely(total_logs_filesize != metadata_filesize)){ + throw_error(p_file_info->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + } else do_sqlite_error_check(p_file_info, rc, SQLITE_DONE); + + + /* Get filesize of BLOB file. */ + uv_fs_t stat_req; + rc = uv_fs_stat(NULL, &stat_req, filename, NULL); + if (unlikely(rc)){ + uv_fs_req_cleanup(&stat_req); + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + const int64_t blob_filesize = (int64_t) stat_req.statbuf.st_size; + uv_fs_req_cleanup(&stat_req); + + do{ + /* Case 1: blob_filesize == metadata_filesize (equal, either both zero or not): All good */ + if(likely(blob_filesize == metadata_filesize)) + break; + + /* Case 2: blob_filesize == 0 && metadata_filesize > 0: fatal(), however could it mean that + * EXT_BLOB_STORE_FILENAME was rotated but the SQLite metadata wasn't updated? So can it + * maybe be recovered by un-rotating? Either way, treat as fatal error for now. */ + // TODO: Can we avoid fatal()? 
+ if(unlikely(blob_filesize == 0 && metadata_filesize > 0)){ + collector_error("[%s]: blob_filesize == 0 but metadata_filesize > 0 for '%s'\n", + p_file_info->chartname, filename); + throw_error(p_file_info->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + /* Case 3: blob_filesize > metadata_filesize: Truncate binary to sqlite filesize, program + * crashed or terminated after writing BLOBs to external file but before metadata was updated */ + if(unlikely(blob_filesize > metadata_filesize)){ + collector_info("[%s]: blob_filesize > metadata_filesize for '%s'. Will attempt to fix it.", + p_file_info->chartname, filename); + uv_fs_t trunc_req; + rc = uv_fs_ftruncate(NULL, &trunc_req, p_file_info->blob_handles[id], metadata_filesize, NULL); + uv_fs_req_cleanup(&trunc_req); + if(unlikely(rc)) { + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + break; + } + + /* Case 4: blob_filesize < metadata_filesize: unrecoverable, + * maybe rotation went horribly wrong? + * TODO: Delete external BLOB and clear metadata from DB, + * start from clean state but the most recent logs. 
*/ + if(unlikely(blob_filesize < metadata_filesize)){ + collector_info("[%s]: blob_filesize < metadata_filesize for '%s'.", + p_file_info->chartname, filename); + throw_error(p_file_info->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } + + /* Case 5: default if none of the above, should never reach here, fatal() */ + m_assert(0, "Code should not reach here"); + throw_error(p_file_info->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + uv_mutex_unlock(p_file_info->db_mut); + goto return_error; + } while(0); + + + /* Initialise blob_write_handle with logs.bin.0 */ + if(filename[strlen(filename) - 1] == '0') + p_file_info->blob_write_handle_offset = id; + + rc = sqlite3_reset(stmt_retrieve_total_logs_size); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + rc = sqlite3_reset(stmt_retrieve_metadata_from_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + } + + rc = sqlite3_finalize(stmt_retrieve_metadata_from_id); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* Prepare statements to be used in single database queries */ + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT Timestamp, Msg_compr_size , Msg_decompr_size, " + "BLOB_Offset, " BLOBS_TABLE".Id, Num_lines " + "FROM " LOGS_TABLE " INNER JOIN " BLOBS_TABLE " " + "ON " LOGS_TABLE ".FK_BLOB_Id = " BLOBS_TABLE ".Id " + "WHERE Timestamp >= ? AND Timestamp <= ? " + "ORDER BY Timestamp;", + -1, &p_file_info->stmt_get_log_msg_metadata_asc, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + rc = sqlite3_prepare_v2(p_file_info->db, + "SELECT Timestamp, Msg_compr_size , Msg_decompr_size, " + "BLOB_Offset, " BLOBS_TABLE".Id, Num_lines " + "FROM " LOGS_TABLE " INNER JOIN " BLOBS_TABLE " " + "ON " LOGS_TABLE ".FK_BLOB_Id = " BLOBS_TABLE ".Id " + "WHERE Timestamp <= ? AND Timestamp >= ? 
" + "ORDER BY Timestamp DESC;", + -1, &p_file_info->stmt_get_log_msg_metadata_desc, NULL); + do_sqlite_error_check(p_file_info, rc, SQLITE_OK); + + /* DB initialisation finished; release lock */ + uv_mutex_unlock(p_file_info->db_mut); + + /* Create synchronous writer thread, one for each log source */ + p_file_info->db_writer_thread = mallocz(sizeof(uv_thread_t)); + rc = uv_thread_create(p_file_info->db_writer_thread, db_writer_db_mode_full, p_file_info); + if (unlikely(rc)){ + throw_error(p_file_info->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + goto return_error; + } + } + } + rc = sqlite3_finalize(stmt_search_if_log_source_exists); + if (unlikely(rc != SQLITE_OK)){ + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + // TODO: Some additional cleanup required here, e.g. terminate db_writer_thread. + goto return_error; + } + rc = sqlite3_finalize(stmt_insert_log_collection_metadata); + if (unlikely(rc != SQLITE_OK)){ + throw_error(MAIN_DB, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + // TODO: Some additional cleanup required here, e.g. terminate db_writer_thread. + goto return_error; + } + + return 0; + +return_error: + freez(main_db_path); + main_db_path = NULL; + + sqlite3_close(main_db); // No-op if main_db == NULL + sqlite3_free(err_msg); // No-op if err_msg == NULL + + m_assert(rc != 0, "rc should not be == 0 in case of error"); + return rc == 0 ? -1 : rc; +} + +/** + * @brief Search database(s) for logs + * @details This function searches one or more databases for any results + * matching the query parameters. If any results are found, it will decompress + * the text of each returned row and add it to the results buffer, up to a + * maximum amount of p_query_params->quota bytes (unless timed out). + * @todo Make decompress buffer static to reduce mallocs/frees. + * @todo Limit number of results returned through SQLite Query to speed up search? 
+ */ +void db_search(logs_query_params_t *const p_query_params, struct File_info *const p_file_infos[]) { + int rc = 0; + + sqlite3_stmt *stmt_get_log_msg_metadata; + sqlite3 *dbt = NULL; // Used only when multiple DBs are searched + + if(!p_file_infos[1]){ /* Single DB to be searched */ + stmt_get_log_msg_metadata = p_query_params->order_by_asc ? + p_file_infos[0]->stmt_get_log_msg_metadata_asc : p_file_infos[0]->stmt_get_log_msg_metadata_desc; + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_get_log_msg_metadata, 1, p_query_params->req_from_ts)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_get_log_msg_metadata, 2, p_query_params->req_to_ts)) || + (SQLITE_ROW != (rc = sqlite3_step(stmt_get_log_msg_metadata)) && (SQLITE_DONE != rc)) + )){ + throw_error(p_file_infos[0]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + // TODO: If there are errors here, should db_writer_db_mode_full() be terminated? + sqlite3_reset(stmt_get_log_msg_metadata); + return; + } + } else { /* Multiple DBs to be searched */ + sqlite3_stmt *stmt_attach_db; + sqlite3_stmt *stmt_create_tmp_view; + int pfi_off = 0; + + /* Open a new DB connection on the first log source DB and attach other DBs */ + if(unlikely( + SQLITE_OK != (rc = sqlite3_open_v2(p_file_infos[0]->db_metadata, &dbt, SQLITE_OPEN_READONLY, NULL)) || + SQLITE_OK != (rc = sqlite3_prepare_v2(dbt,"ATTACH DATABASE ? AS ? 
;", -1, &stmt_attach_db, NULL)) + )){ + throw_error(p_file_infos[0]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + sqlite3_close_v2(dbt); + return; + } + for(pfi_off = 0; p_file_infos[pfi_off]; pfi_off++){ + if(unlikely( + SQLITE_OK != (rc = sqlite3_bind_text(stmt_attach_db, 1, p_file_infos[pfi_off]->db_metadata, -1, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int(stmt_attach_db, 2, pfi_off)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_attach_db)) || + SQLITE_OK != (rc = sqlite3_reset(stmt_attach_db)) + )){ + throw_error(p_file_infos[pfi_off]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + sqlite3_close_v2(dbt); + return; + } + } + + /* Create temporary view, then prepare retrieval of metadata from + * TMP_VIEW_TABLE statement and execute search. + * TODO: Limit number of results returned through SQLite Query to speed up search? */ + #define TMP_VIEW_TABLE "compound_view" + #define TMP_VIEW_QUERY_PREFIX "CREATE TEMP VIEW " TMP_VIEW_TABLE " AS SELECT * FROM (SELECT * FROM '0'."\ + LOGS_TABLE " INNER JOIN (VALUES(0)) ORDER BY Timestamp) " + #define TMP_VIEW_QUERY_BODY_1 "UNION ALL SELECT * FROM (SELECT * FROM '" + #define TMP_VIEW_QUERY_BODY_2 "'." 
LOGS_TABLE " INNER JOIN (VALUES(" + #define TMP_VIEW_QUERY_BODY_3 ")) ORDER BY Timestamp) " + #define TMP_VIEW_QUERY_POSTFIX "ORDER BY Timestamp;" + + char tmp_view_query[sizeof(TMP_VIEW_QUERY_PREFIX) + ( + sizeof(TMP_VIEW_QUERY_BODY_1) + + sizeof(TMP_VIEW_QUERY_BODY_2) + + sizeof(TMP_VIEW_QUERY_BODY_3) + 4 + ) * (LOGS_MANAG_MAX_COMPOUND_QUERY_SOURCES - 1) + + sizeof(TMP_VIEW_QUERY_POSTFIX) + + 50 /* +50 bytes to play it safe */] = TMP_VIEW_QUERY_PREFIX; + int pos = sizeof(TMP_VIEW_QUERY_PREFIX) - 1; + for(pfi_off = 1; p_file_infos[pfi_off]; pfi_off++){ // Skip p_file_infos[0] + int n = snprintf(&tmp_view_query[pos], sizeof(tmp_view_query) - pos, "%s%d%s%d%s", + TMP_VIEW_QUERY_BODY_1, pfi_off, + TMP_VIEW_QUERY_BODY_2, pfi_off, + TMP_VIEW_QUERY_BODY_3); + + if (n < 0 || n >= (int) sizeof(tmp_view_query) - pos){ + throw_error(p_file_infos[pfi_off]->chartname, ERR_TYPE_OTHER, n, __LINE__, __FILE__, __FUNCTION__); + sqlite3_close_v2(dbt); + return; + } + pos += n; + } + snprintf(&tmp_view_query[pos], sizeof(tmp_view_query) - pos, "%s", TMP_VIEW_QUERY_POSTFIX); + + if(unlikely( + SQLITE_OK != (rc = sqlite3_prepare_v2(dbt, tmp_view_query, -1, &stmt_create_tmp_view, NULL)) || + SQLITE_DONE != (rc = sqlite3_step(stmt_create_tmp_view)) || + SQLITE_OK != (rc = sqlite3_prepare_v2(dbt, p_query_params->order_by_asc ? + + "SELECT Timestamp, Msg_compr_size , Msg_decompr_size, " + "BLOB_Offset, FK_BLOB_Id, Num_lines, column1 " + "FROM " TMP_VIEW_TABLE " " + "WHERE Timestamp >= ? AND Timestamp <= ?;" : + + /* TODO: The following can also be done by defining + * a descending order tmp_view_query, which will + * probably be faster. Needs to be measured. */ + + "SELECT Timestamp, Msg_compr_size , Msg_decompr_size, " + "BLOB_Offset, FK_BLOB_Id, Num_lines, column1 " + "FROM " TMP_VIEW_TABLE " " + "WHERE Timestamp <= ? AND Timestamp >= ? 
ORDER BY Timestamp DESC;", + + -1, &stmt_get_log_msg_metadata, NULL)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_get_log_msg_metadata, 1, + (sqlite3_int64)p_query_params->req_from_ts)) || + SQLITE_OK != (rc = sqlite3_bind_int64(stmt_get_log_msg_metadata, 2, + (sqlite3_int64)p_query_params->req_to_ts)) || + (SQLITE_ROW != (rc = sqlite3_step(stmt_get_log_msg_metadata)) && (SQLITE_DONE != rc)) + )){ + throw_error(p_file_infos[0]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + sqlite3_close_v2(dbt); + return; + } + } + + Circ_buff_item_t tmp_itm = {0}; + + BUFFER *const res_buff = p_query_params->results_buff; + logs_query_res_hdr_t res_hdr = { // results header + .timestamp = p_query_params->act_to_ts, + .text_size = 0, + .matches = 0, + .log_source = "", + .log_type = "", + .basename = "", + .filename = "", + .chartname ="" + }; + size_t text_compressed_size_max = 0; + + while (rc == SQLITE_ROW) { + + /* Retrieve metadata from DB */ + tmp_itm.timestamp = (msec_t)sqlite3_column_int64(stmt_get_log_msg_metadata, 0); + tmp_itm.text_compressed_size = (size_t)sqlite3_column_int64(stmt_get_log_msg_metadata, 1); + tmp_itm.text_size = (size_t)sqlite3_column_int64(stmt_get_log_msg_metadata, 2); + int64_t blob_offset = (int64_t) sqlite3_column_int64(stmt_get_log_msg_metadata, 3); + int blob_handles_offset = sqlite3_column_int(stmt_get_log_msg_metadata, 4); + unsigned long num_lines = (unsigned long) sqlite3_column_int64(stmt_get_log_msg_metadata, 5); + int db_off = p_file_infos[1] ? sqlite3_column_int(stmt_get_log_msg_metadata, 6) : 0; + + /* If exceeding quota or timeout is reached and new timestamp + * is different than previous, terminate query. 
*/ + if((res_buff->len >= p_query_params->quota || terminate_logs_manag_query(p_query_params)) && + tmp_itm.timestamp != res_hdr.timestamp){ + p_query_params->act_to_ts = res_hdr.timestamp; + break; + } + + res_hdr.timestamp = tmp_itm.timestamp; + snprintfz(res_hdr.log_source, sizeof(res_hdr.log_source), "%s", log_src_t_str[p_file_infos[db_off]->log_source]); + snprintfz(res_hdr.log_type, sizeof(res_hdr.log_type), "%s", log_src_type_t_str[p_file_infos[db_off]->log_type]); + snprintfz(res_hdr.basename, sizeof(res_hdr.basename), "%s", p_file_infos[db_off]->file_basename); + snprintfz(res_hdr.filename, sizeof(res_hdr.filename), "%s", p_file_infos[db_off]->filename); + snprintfz(res_hdr.chartname, sizeof(res_hdr.chartname), "%s", p_file_infos[db_off]->chartname); + + /* Retrieve compressed log messages from BLOB file */ + if(tmp_itm.text_compressed_size > text_compressed_size_max){ + text_compressed_size_max = tmp_itm.text_compressed_size; + tmp_itm.text_compressed = reallocz(tmp_itm.text_compressed, text_compressed_size_max); + } + uv_fs_t read_req; + uv_buf_t uv_buf = uv_buf_init(tmp_itm.text_compressed, tmp_itm.text_compressed_size); + rc = uv_fs_read(NULL, + &read_req, + p_file_infos[db_off]->blob_handles[blob_handles_offset], + &uv_buf, 1, blob_offset, NULL); + uv_fs_req_cleanup(&read_req); + if (unlikely(rc < 0)){ + throw_error(NULL, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + break; + } + + /* Append retrieved results to BUFFER. 
+ * In the case of search_keyword(), less than sizeof(res_hdr) + tmp_itm.text_size + * space may be required, but go for worst case scenario for now */ + buffer_increase(res_buff, sizeof(res_hdr) + tmp_itm.text_size); + + if(!p_query_params->keyword || !*p_query_params->keyword || !strcmp(p_query_params->keyword, " ")){ + rc = LZ4_decompress_safe(tmp_itm.text_compressed, + &res_buff->buffer[res_buff->len + sizeof(res_hdr)], + tmp_itm.text_compressed_size, + tmp_itm.text_size); + + if(unlikely(rc < 0)){ + throw_error(p_file_infos[db_off]->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + break; + } + + res_hdr.matches = num_lines; + res_hdr.text_size = tmp_itm.text_size; + } + else { + tmp_itm.data = mallocz(tmp_itm.text_size); + rc = LZ4_decompress_safe(tmp_itm.text_compressed, + tmp_itm.data, + tmp_itm.text_compressed_size, + tmp_itm.text_size); + + if(unlikely(rc < 0)){ + freez(tmp_itm.data); + throw_error(p_file_infos[db_off]->chartname, ERR_TYPE_OTHER, rc, __LINE__, __FILE__, __FUNCTION__); + break; + } + + res_hdr.matches = search_keyword( tmp_itm.data, tmp_itm.text_size, + &res_buff->buffer[res_buff->len + sizeof(res_hdr)], + &res_hdr.text_size, p_query_params->keyword, NULL, + p_query_params->ignore_case); + freez(tmp_itm.data); + + m_assert( (res_hdr.matches > 0 && res_hdr.text_size > 0) || + (res_hdr.matches == 0 && res_hdr.text_size == 0), + "res_hdr.matches and res_hdr.text_size must both be > 0 or == 0."); + + if(unlikely(res_hdr.matches < 0)){ /* res_hdr.matches < 0 - error during keyword search */ + throw_error(p_file_infos[db_off]->chartname, ERR_TYPE_LIBUV, rc, __LINE__, __FILE__, __FUNCTION__); + break; + } + } + + if(res_hdr.text_size){ + res_buff->buffer[res_buff->len + sizeof(res_hdr) + res_hdr.text_size - 1] = '\n'; // replace '\0' with '\n' + memcpy(&res_buff->buffer[res_buff->len], &res_hdr, sizeof(res_hdr)); + res_buff->len += sizeof(res_hdr) + res_hdr.text_size; + p_query_params->num_lines += res_hdr.matches; + } + + 
m_assert(TEST_MS_TIMESTAMP_VALID(res_hdr.timestamp), "res_hdr.timestamp is invalid"); + + rc = sqlite3_step(stmt_get_log_msg_metadata); + if (unlikely(rc != SQLITE_ROW && rc != SQLITE_DONE)){ + throw_error(p_file_infos[db_off]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); + // TODO: If there are errors here, should db_writer_db_mode_full() be terminated? + break; + } + } + + if(tmp_itm.text_compressed) + freez(tmp_itm.text_compressed); + + if(p_file_infos[1]) + rc = sqlite3_close_v2(dbt); + else + rc = sqlite3_reset(stmt_get_log_msg_metadata); + + if (unlikely(SQLITE_OK != rc)) + throw_error(p_file_infos[0]->chartname, ERR_TYPE_SQLITE, rc, __LINE__, __FILE__, __FUNCTION__); +} diff --git a/logsmanagement/db_api.h b/logsmanagement/db_api.h new file mode 100644 index 00000000..3f4fe0d3 --- /dev/null +++ b/logsmanagement/db_api.h @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file db_api.h + * @brief Header of db_api.c + */ + +#ifndef DB_API_H_ +#define DB_API_H_ + +#include "../database/sqlite/sqlite3.h" +#include <uv.h> +#include "query.h" +#include "file_info.h" + +#define LOGS_MANAG_DB_SUBPATH "/logs_management_db" + +int db_user_version(sqlite3 *const db, const int set_user_version); +void db_set_main_dir(char *const dir); +int db_init(void); +void db_search(logs_query_params_t *const p_query_params, struct File_info *const p_file_infos[]); + +#endif // DB_API_H_ diff --git a/logsmanagement/defaults.h b/logsmanagement/defaults.h new file mode 100644 index 00000000..2309f781 --- /dev/null +++ b/logsmanagement/defaults.h @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file defaults.h + * @brief Hard-coded configuration settings for the Logs Management engine + */ + +#ifndef LOGSMANAG_DEFAULTS_H_ +#define LOGSMANAG_DEFAULTS_H_ + +/* -------------------------------------------------------------------------- */ +/* General */ +/* 
-------------------------------------------------------------------------- */ + +#define KiB * 1024ULL +#define MiB * 1048576ULL +#define GiB * 1073741824ULL + +#define MAX_LOG_MSG_SIZE 50 MiB /**< Maximum allowable log message size (in Bytes) to be stored in message queue and DB. **/ + +#define MAX_CUS_CHARTS_PER_SOURCE 100 /**< Hard limit of maximum custom charts per log source **/ + +#define MAX_OUTPUTS_PER_SOURCE 100 /**< Hard limit of maximum Fluent Bit outputs per log source **/ + +#define UPDATE_TIMEOUT_DEFAULT 10 /**< Default timeout to use to update charts if they haven't been updated in the meantime. **/ + +#if !defined(LOGS_MANAGEMENT_DEV_MODE) +#define ENABLE_COLLECTED_LOGS_TOTAL_DEFAULT CONFIG_BOOLEAN_NO /**< Default value to enable (or not) metrics of total collected log records **/ +#else +#define ENABLE_COLLECTED_LOGS_TOTAL_DEFAULT CONFIG_BOOLEAN_YES /**< Default value to enable (or not) metrics of total collected log records, if stress tests are enabled **/ +#endif +#define ENABLE_COLLECTED_LOGS_RATE_DEFAULT CONFIG_BOOLEAN_YES /**< Default value to enable (or not) metrics of rate of collected log records **/ + +#define SD_JOURNAL_FIELD_PREFIX "LOGS_MANAG_" /**< Default systemd journal field prefix for sources that log to the system journal */ + +#define SD_JOURNAL_SEND_DEFAULT CONFIG_BOOLEAN_NO /**< Default value to enable (or not) submission of logs to the system journal (where applicable) **/ + +#define LOGS_MANAG_CHARTNAME_SIZE 50 /**< Maximum size of log source chart names, including terminating '\0'. **/ +#define LOGS_MANAG_CHARTNAME_PREFIX "logs_manag_" /**< Prefix of top-level chart names, used also in function sources. 
**/ + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Database */ +/* -------------------------------------------------------------------------- */ + +typedef enum { + LOGS_MANAG_DB_MODE_FULL = 0, + LOGS_MANAG_DB_MODE_NONE +} logs_manag_db_mode_t; + +#define SAVE_BLOB_TO_DB_DEFAULT 6 /**< Global default configuration interval to save buffers from RAM to disk **/ +#define SAVE_BLOB_TO_DB_MIN 2 /**< Minimum allowed interval to save buffers from RAM to disk **/ +#define SAVE_BLOB_TO_DB_MAX 1800 /**< Maximum allowed interval to save buffers from RAM to disk **/ + +#define BLOB_MAX_FILES 10 /**< Maximum allowed number of BLOB files (per collection) that are used to store compressed logs. When exceeded, the oldest one will be overwritten. **/ + +#define DISK_SPACE_LIMIT_DEFAULT 500 /**< Global default configuration maximum database disk space limit per log source **/ + +#if !defined(LOGS_MANAGEMENT_DEV_MODE) +#define GLOBAL_DB_MODE_DEFAULT_STR "none" /**< db mode string to be used as global default in configuration **/ +#define GLOBAL_DB_MODE_DEFAULT LOGS_MANAG_DB_MODE_NONE /**< db mode to be used as global default, matching GLOBAL_DB_MODE_DEFAULT_STR **/ +#else +#define GLOBAL_DB_MODE_DEFAULT_STR "full" /**< db mode string to be used as global default in configuration, if stress tests are enabled **/ +#define GLOBAL_DB_MODE_DEFAULT LOGS_MANAG_DB_MODE_FULL /**< db mode to be used as global default, matching GLOBAL_DB_MODE_DEFAULT_STR, if stress tests are enabled **/ +#endif + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Circular Buffer */ +/* -------------------------------------------------------------------------- */ + +#define CIRCULAR_BUFF_SPARE_ITEMS_DEFAULT 2 /**< Additional circular buffer items to give time to the 
db engine to save buffers to disk **/ + +#define CIRCULAR_BUFF_DEFAULT_MAX_SIZE (64 MiB) /**< Default circular_buffer_max_size **/ +#define CIRCULAR_BUFF_MAX_SIZE_RANGE_MIN (1 MiB) /**< circular_buffer_max_size read from configuration cannot be smaller than this **/ +#define CIRCULAR_BUFF_MAX_SIZE_RANGE_MAX (4 GiB) /**< circular_buffer_max_size read from configuration cannot be larger than this **/ + +#define CIRCULAR_BUFF_DEFAULT_DROP_LOGS 0 /**< Global default configuration value whether to drop logs if circular buffer is full **/ + +#define CIRC_BUFF_PREP_WR_RETRY_AFTER_MS 1000 /**< If circ_buff_prepare_write() fails due to not enough space, how many millisecs to wait before retrying **/ + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Compression */ +/* -------------------------------------------------------------------------- */ + +#define COMPRESSION_ACCELERATION_DEFAULT 1 /**< Global default value for compression acceleration **/ + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Kernel logs (kmsg) plugin */ +/* -------------------------------------------------------------------------- */ + +#define KERNEL_LOGS_COLLECT_INIT_WAIT 5 /**< Wait time (in sec) before kernel log collection starts. Required in order to skip collection and processing of pre-existing logs at Netdata boot. 
**/ + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Fluent Bit */ +/* -------------------------------------------------------------------------- */ + +#define FLB_FLUSH_DEFAULT "0.1" /**< Default Fluent Bit flush interval **/ +#define FLB_HTTP_LISTEN_DEFAULT "0.0.0.0" /**< Default Fluent Bit server listening socket **/ +#define FLB_HTTP_PORT_DEFAULT "2020" /**< Default Fluent Bit server listening port **/ +#define FLB_HTTP_SERVER_DEFAULT "false" /**< Default Fluent Bit server enable status **/ +#define FLB_LOG_FILENAME_DEFAULT "fluentbit.log" /**< Default Fluent Bit log filename **/ +#define FLB_LOG_LEVEL_DEFAULT "info" /**< Default Fluent Bit log level **/ +#define FLB_CORO_STACK_SIZE_DEFAULT "24576" /**< Default Fluent Bit coro stack size - do not change this value unless there is a good reason **/ + +#define FLB_FORWARD_UNIX_PATH_DEFAULT "" /**< Default path for Forward unix socket configuration, see also https://docs.fluentbit.io/manual/pipeline/inputs/forward#configuration-parameters **/ +#define FLB_FORWARD_UNIX_PERM_DEFAULT "0644" /**< Default permissions for Forward unix socket configuration, see also https://docs.fluentbit.io/manual/pipeline/inputs/forward#configuration-parameters **/ +#define FLB_FORWARD_ADDR_DEFAULT "0.0.0.0" /**< Default listen address for Forward socket configuration, see also https://docs.fluentbit.io/manual/pipeline/inputs/forward#configuration-parameters **/ +#define FLB_FORWARD_PORT_DEFAULT "24224" /**< Default listen port for Forward socket configuration, see also https://docs.fluentbit.io/manual/pipeline/inputs/forward#configuration-parameters **/ + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Queries */ +/* -------------------------------------------------------------------------- 
*/ + +#define LOGS_MANAG_MAX_COMPOUND_QUERY_SOURCES 10U /**< Maximum allowed number of log sources that can be searched in a single query **/ +#define LOGS_MANAG_QUERY_QUOTA_DEFAULT (10 MiB) /**< Default logs management query quota **/ +#define LOGS_MANAG_QUERY_QUOTA_MAX MAX_LOG_MSG_SIZE /**< Max logs management query quota **/ +#define LOGS_MANAG_QUERY_IGNORE_CASE_DEFAULT 0 /**< Boolean to indicate whether to ignore case for keyword or not **/ +#define LOGS_MANAG_QUERY_SANITIZE_KEYWORD_DEFAULT 0 /**< Boolean to indicate whether to sanitize keyword or not **/ +#define LOGS_MANAG_QUERY_TIMEOUT_DEFAULT 30 /**< Default timeout of logs management queries (in secs) **/ + +/* -------------------------------------------------------------------------- */ + + +#endif // LOGSMANAG_DEFAULTS_H_ diff --git a/logsmanagement/file_info.h b/logsmanagement/file_info.h new file mode 100644 index 00000000..751b8744 --- /dev/null +++ b/logsmanagement/file_info.h @@ -0,0 +1,165 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file file_info.h + * @brief Includes the File_info structure that is the primary + * structure for configuring each log source. 
+ */ + +#ifndef FILE_INFO_H_ +#define FILE_INFO_H_ + +#include <uv.h> +#include "../database/sqlite/sqlite3.h" +#include "defaults.h" +#include "parser.h" + +// Cool trick --> http://userpage.fu-berlin.de/~ram/pub/pub_jf47ht81Ht/c_preprocessor_applications_en +/* WARNING: DO NOT CHANGE THE ORDER OF LOG_SRC_TYPES, ONLY APPEND NEW TYPES */ +#define LOG_SRC_TYPES LST(FLB_TAIL)LST(FLB_WEB_LOG)LST(FLB_KMSG) \ + LST(FLB_SYSTEMD)LST(FLB_DOCKER_EV)LST(FLB_SYSLOG) \ + LST(FLB_SERIAL)LST(FLB_MQTT) +#define LST(x) x, +enum log_src_type_t {LOG_SRC_TYPES}; +#undef LST +#define LST(x) #x, +static const char * const log_src_type_t_str[] = {LOG_SRC_TYPES}; +#undef LST + +#define LOG_SRCS LST(LOG_SOURCE_LOCAL)LST(LOG_SOURCE_FORWARD) +#define LST(x) x, +enum log_src_t {LOG_SRCS}; +#undef LST +#define LST(x) #x, +static const char * const log_src_t_str[] = {LOG_SRCS}; +#undef LST + +#include "rrd_api/rrd_api.h" + +typedef enum log_src_state { + LOG_SRC_UNINITIALIZED = 0, /*!< config not initialized */ + LOG_SRC_READY, /*!< config initialized (monitoring may have started or not) */ + LOG_SRC_EXITING /*!< cleanup and destroy stage */ +} LOG_SRC_STATE; + +typedef struct flb_tail_config { + int use_inotify; +} Flb_tail_config_t; + +typedef struct flb_kmsg_config { + char *prio_level; +} Flb_kmsg_config_t; + +typedef struct flb_serial_config { + char *bitrate; + char *min_bytes; + char *separator; + char *format; +} Flb_serial_config_t; + +typedef struct flb_socket_config { + char *mode; + char *unix_path; + char *unix_perm; + char *listen; + char *port; +} Flb_socket_config_t; + +typedef struct syslog_parser_config { + char *log_format; + Flb_socket_config_t *socket_config; +} Syslog_parser_config_t; + +typedef struct flb_output_config { + char *plugin; /**< Fluent Bit output plugin name, see: https://docs.fluentbit.io/manual/pipeline/outputs **/ + int id; /**< Incremental id of plugin configuration in linked list, starting from 1 **/ + struct flb_output_config_param { + char *key; /**< 
Key of the parameter configuration **/ + char *val; /**< Value of the parameter configuration **/ + struct flb_output_config_param *next; /**< Next output parameter configuration in the linked list of parameters **/ + } *param; + struct flb_output_config *next; /**< Next output plugin configuration in the linked list of output plugins **/ +} Flb_output_config_t; + +struct File_info { + + /* Struct members core to any log source type */ + const char *chartname; /**< Top level chart name for this log source on web dashboard **/ + char *filename; /**< Full path of log source **/ + const char *file_basename; /**< Basename of log source **/ + const char *stream_guid; /**< Streaming input GUID **/ + enum log_src_t log_source; /**< Defines log source origin - see enum log_src_t for options **/ + enum log_src_type_t log_type; /**< Defines type of log source - see enum log_src_type_t for options **/ + struct Circ_buff *circ_buff; /**< Associated circular buffer - only one should exist per log source. 
**/ + int compression_accel; /**< LZ4 compression acceleration factor for collected logs, see also: https://github.com/lz4/lz4/blob/90d68e37093d815e7ea06b0ee3c168cccffc84b8/lib/lz4.h#L195 **/ + int update_every; /**< Interval (in sec) of how often to collect and update charts **/ + int update_timeout; /**< Timeout to update charts after, since last update */ + int use_log_timestamp; /**< Use log timestamps instead of collection timestamps, if available **/ + int do_sd_journal_send; /**< Write to system journal - not applicable to all log source types **/ + struct Chart_meta *chart_meta; + LOG_SRC_STATE state; /**< State of log source, used to sync status among threads **/ + + /* Struct members related to disk database */ + sqlite3 *db; /**< SQLite3 DB connection to DB that contains metadata for this log source **/ + const char *db_dir; /**< Path to metadata DB and compressed log BLOBs directory **/ + const char *db_metadata; /**< Path to metadata DB file **/ + uv_mutex_t *db_mut; /**< DB access mutex **/ + uv_thread_t *db_writer_thread; /**< Thread responsible for handling the DB writes **/ + uv_file blob_handles[BLOB_MAX_FILES + 1]; /**< File handles for BLOB files. Item 0 not used - just for matching 1-1 with DB ids **/ + logs_manag_db_mode_t db_mode; /**< DB mode as enum. **/ + int blob_write_handle_offset; /**< File offset denoting HEAD of currently open database BLOB file **/ + int buff_flush_to_db_interval; /**< Frequency at which RAM buffers of this log source will be flushed to the database **/ + int64_t blob_max_size; /**< When the size of a BLOB exceeds this value, the BLOB gets rotated. 
**/ + int64_t blob_total_size; /**< This is the total disk space that all BLOBs occupy (for this log source) **/ + int64_t db_write_duration; /**< Holds timing details related to duration of DB write operations **/ + int64_t db_rotate_duration; /**< Holds timing details related to duration of DB rotate operations **/ + sqlite3_stmt *stmt_get_log_msg_metadata_asc; /**< SQLITE3 statement used to retrieve metadata from database during queries in ascending order **/ + sqlite3_stmt *stmt_get_log_msg_metadata_desc; /**< SQLITE3 statement used to retrieve metadata from database during queries in descending order **/ + + /* Struct members related to queries */ + struct { + usec_t user; + usec_t sys; + } cpu_time_per_mib; + + /* Struct members related to log parsing */ + Log_parser_config_t *parser_config; /**< Configuration to be used by log parser - read from logsmanagement.conf **/ + Log_parser_cus_config_t **parser_cus_config; /**< Array of custom log parsing configurations **/ + Log_parser_metrics_t *parser_metrics; /**< Extracted metrics **/ + + /* Struct members related to Fluent-Bit inputs, filters, buffers, outputs */ + int flb_input; /**< Fluent-bit input interface property for this log source **/ + int flb_parser; /**< Fluent-bit parser interface property for this log source **/ + int flb_lib_output; /**< Fluent-bit "lib" output interface property for this log source **/ + void *flb_config; /**< Any other Fluent-Bit configuration specific to this log source only **/ + uv_mutex_t flb_tmp_buff_mut; + uv_timer_t flb_tmp_buff_cpy_timer; + Flb_output_config_t *flb_outputs; /**< Linked list of Fluent Bit outputs for this log source **/ + +}; + +struct File_infos_arr { + struct File_info **data; + uint8_t count; /**< Number of items in array **/ +}; + +extern struct File_infos_arr *p_file_infos_arr; /**< Array that contains all p_file_info structs for all log sources **/ + +typedef struct { + int update_every; + int update_timeout; + int use_log_timestamp; + int 
circ_buff_max_size_in_mib; + int circ_buff_drop_logs; + int compression_acceleration; + logs_manag_db_mode_t db_mode; + int disk_space_limit_in_mib; + int buff_flush_to_db_interval; + int enable_collected_logs_total; + int enable_collected_logs_rate; + char *sd_journal_field_prefix; + int do_sd_journal_send; +} g_logs_manag_config_t; + +extern g_logs_manag_config_t g_logs_manag_config; + +#endif // FILE_INFO_H_ diff --git a/logsmanagement/flb_plugin.c b/logsmanagement/flb_plugin.c new file mode 100644 index 00000000..493749ed --- /dev/null +++ b/logsmanagement/flb_plugin.c @@ -0,0 +1,1536 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file flb_plugin.c + * @brief This file includes all functions that act as an API to + * the Fluent Bit library. + */ + +#include "flb_plugin.h" +#include <lz4.h> +#include "helper.h" +#include "defaults.h" +#include "circular_buffer.h" +#include "daemon/common.h" +#include "libnetdata/libnetdata.h" +#include "../fluent-bit/lib/msgpack-c/include/msgpack/unpack.h" +#include "../fluent-bit/lib/msgpack-c/include/msgpack/object.h" +#include "../fluent-bit/lib/monkey/include/monkey/mk_core/mk_list.h" +#include <dlfcn.h> + +#ifdef HAVE_SYSTEMD +#include <systemd/sd-journal.h> +#define SD_JOURNAL_SEND_DEFAULT_FIELDS \ + "%s_LOG_SOURCE=%s" , sd_journal_field_prefix, log_src_t_str[p_file_info->log_source], \ + "%s_LOG_TYPE=%s" , sd_journal_field_prefix, log_src_type_t_str[p_file_info->log_type] +#endif + +#define LOG_REC_KEY "msg" /**< key to represent log message field in most log sources **/ +#define LOG_REC_KEY_SYSTEMD "MESSAGE" /**< key to represent log message field in systemd log source **/ +#define SYSLOG_TIMESTAMP_SIZE 16 +#define UNKNOWN "unknown" + + +/* Including "../fluent-bit/include/fluent-bit/flb_macros.h" causes issues + * with CI, as it requires mk_core/mk_core_info.h which is generated only + * after Fluent Bit has been built. 
We can instead just redefine a couple + * of macros here: */ +#define FLB_FALSE 0 +#define FLB_TRUE !FLB_FALSE + +/* For similar reasons, (re)define the following macros from "flb_lib.h": */ +/* Lib engine status */ +#define FLB_LIB_ERROR -1 +#define FLB_LIB_NONE 0 +#define FLB_LIB_OK 1 +#define FLB_LIB_NO_CONFIG_MAP 2 + +/* Following structs are the same as defined in fluent-bit/flb_lib.h and + * fluent-bit/flb_time.h, but need to be redefined due to use of dlsym(). */ + +struct flb_time { + struct timespec tm; +}; + +/* Library mode context data */ +struct flb_lib_ctx { + int status; + struct mk_event_loop *event_loop; + struct mk_event *event_channel; + struct flb_config *config; +}; + +struct flb_parser_types { + char *key; + int key_len; + int type; +}; + +struct flb_parser { + /* configuration */ + int type; /* parser type */ + char *name; /* format name */ + char *p_regex; /* pattern for main regular expression */ + int skip_empty; /* skip empty regex matches */ + char *time_fmt; /* time format */ + char *time_fmt_full; /* original given time format */ + char *time_key; /* field name that contains the time */ + int time_offset; /* fixed UTC offset */ + int time_keep; /* keep time field */ + int time_strict; /* parse time field strictly */ + int logfmt_no_bare_keys; /* in logfmt parsers, require all keys to have values */ + char *time_frac_secs; /* time format have fractional seconds ? */ + struct flb_parser_types *types; /* type casting */ + int types_len; + + /* Field decoders */ + struct mk_list *decoders; + + /* internal */ + int time_with_year; /* do time_fmt consider a year (%Y) ? */ + char *time_fmt_year; + int time_with_tz; /* do time_fmt consider a timezone ? 
*/ + struct flb_regex *regex; + struct mk_list _head; +}; + +struct flb_lib_out_cb { + int (*cb) (void *record, size_t size, void *data); + void *data; +}; + +typedef struct flb_lib_ctx flb_ctx_t; + +static flb_ctx_t *(*flb_create)(void); +static int (*flb_service_set)(flb_ctx_t *ctx, ...); +static int (*flb_start)(flb_ctx_t *ctx); +static int (*flb_stop)(flb_ctx_t *ctx); +static void (*flb_destroy)(flb_ctx_t *ctx); +static int (*flb_time_pop_from_msgpack)(struct flb_time *time, msgpack_unpacked *upk, msgpack_object **map); +static int (*flb_lib_free)(void *data); +static struct flb_parser *(*flb_parser_create)( const char *name, const char *format, const char *p_regex, int skip_empty, + const char *time_fmt, const char *time_key, const char *time_offset, + int time_keep, int time_strict, int logfmt_no_bare_keys, + struct flb_parser_types *types, int types_len,struct mk_list *decoders, + struct flb_config *config); +static int (*flb_input)(flb_ctx_t *ctx, const char *input, void *data); +static int (*flb_input_set)(flb_ctx_t *ctx, int ffd, ...); +// static int (*flb_filter)(flb_ctx_t *ctx, const char *filter, void *data); +// static int (*flb_filter_set)(flb_ctx_t *ctx, int ffd, ...); +static int (*flb_output)(flb_ctx_t *ctx, const char *output, struct flb_lib_out_cb *cb); +static int (*flb_output_set)(flb_ctx_t *ctx, int ffd, ...); +static msgpack_unpack_return (*dl_msgpack_unpack_next)(msgpack_unpacked* result, const char* data, size_t len, size_t* off); +static void (*dl_msgpack_zone_free)(msgpack_zone* zone); +static int (*dl_msgpack_object_print_buffer)(char *buffer, size_t buffer_size, msgpack_object o); + +static flb_ctx_t *ctx = NULL; +static void *flb_lib_handle = NULL; + +static struct flb_lib_out_cb *fwd_input_out_cb = NULL; + +static const char *sd_journal_field_prefix = SD_JOURNAL_FIELD_PREFIX; + +extern netdata_mutex_t stdout_mut; + +int flb_init(flb_srvc_config_t flb_srvc_config, + const char *const stock_config_dir, + const char *const 
new_sd_journal_field_prefix){ + int rc = 0; + char *dl_error; + + char *flb_lib_path = strdupz_path_subpath(stock_config_dir, "/../libfluent-bit.so"); + if (unlikely(NULL == (flb_lib_handle = dlopen(flb_lib_path, RTLD_LAZY)))){ + if (NULL != (dl_error = dlerror())) + collector_error("dlopen() libfluent-bit.so error: %s", dl_error); + rc = -1; + goto do_return; + } + + dlerror(); /* Clear any existing error */ + + /* Load Fluent-Bit functions from the shared library */ + #define load_function(FUNC_NAME){ \ + *(void **) (&FUNC_NAME) = dlsym(flb_lib_handle, LOGS_MANAG_STR(FUNC_NAME)); \ + if ((dl_error = dlerror()) != NULL) { \ + collector_error("dlerror loading %s: %s", LOGS_MANAG_STR(FUNC_NAME), dl_error); \ + rc = -1; \ + goto do_return; \ + } \ + } + + load_function(flb_create); + load_function(flb_service_set); + load_function(flb_start); + load_function(flb_stop); + load_function(flb_destroy); + load_function(flb_time_pop_from_msgpack); + load_function(flb_lib_free); + load_function(flb_parser_create); + load_function(flb_input); + load_function(flb_input_set); + // load_function(flb_filter); + // load_function(flb_filter_set); + load_function(flb_output); + load_function(flb_output_set); + *(void **) (&dl_msgpack_unpack_next) = dlsym(flb_lib_handle, "msgpack_unpack_next"); + if ((dl_error = dlerror()) != NULL) { + collector_error("dlerror loading msgpack_unpack_next: %s", dl_error); + rc = -1; + goto do_return; + } + *(void **) (&dl_msgpack_zone_free) = dlsym(flb_lib_handle, "msgpack_zone_free"); + if ((dl_error = dlerror()) != NULL) { + collector_error("dlerror loading msgpack_zone_free: %s", dl_error); + rc = -1; + goto do_return; + } + *(void **) (&dl_msgpack_object_print_buffer) = dlsym(flb_lib_handle, "msgpack_object_print_buffer"); + if ((dl_error = dlerror()) != NULL) { + collector_error("dlerror loading msgpack_object_print_buffer: %s", dl_error); + rc = -1; + goto do_return; + } + + ctx = flb_create(); + if (unlikely(!ctx)){ + rc = -1; + goto 
do_return; + } + + /* Global service settings */ + if(unlikely(flb_service_set(ctx, + "Flush" , flb_srvc_config.flush, + "HTTP_Listen" , flb_srvc_config.http_listen, + "HTTP_Port" , flb_srvc_config.http_port, + "HTTP_Server" , flb_srvc_config.http_server, + "Log_File" , flb_srvc_config.log_path, + "Log_Level" , flb_srvc_config.log_level, + "Coro_stack_size" , flb_srvc_config.coro_stack_size, + NULL) != 0 )){ + rc = -1; + goto do_return; + } + + if(new_sd_journal_field_prefix && *new_sd_journal_field_prefix) + sd_journal_field_prefix = new_sd_journal_field_prefix; + +do_return: + freez(flb_lib_path); + if(unlikely(rc && flb_lib_handle)) + dlclose(flb_lib_handle); + + return rc; +} + +int flb_run(void){ + if (likely(flb_start(ctx) == 0)) return 0; + else return -1; +} + +void flb_terminate(void){ + if(ctx){ + flb_stop(ctx); + flb_destroy(ctx); + ctx = NULL; + } + if(flb_lib_handle) + dlclose(flb_lib_handle); +} + +static void flb_complete_buff_item(struct File_info *p_file_info){ + + Circ_buff_t *buff = p_file_info->circ_buff; + + m_assert(buff->in->timestamp, "buff->in->timestamp cannot be 0"); + m_assert(buff->in->data, "buff->in->data cannot be NULL"); + m_assert(*buff->in->data, "*buff->in->data cannot be 0"); + m_assert(buff->in->text_size, "buff->in->text_size cannot be 0"); + + /* Replace last '\n' with '\0' to null-terminate text */ + buff->in->data[buff->in->text_size - 1] = '\0'; + + /* Store status (timestamp and text_size must have already been + * stored during flb_collect_logs_cb() ). */ + buff->in->status = CIRC_BUFF_ITEM_STATUS_UNPROCESSED; + + /* Load max size of compressed buffer, as calculated previously */ + size_t text_compressed_buff_max_size = buff->in->text_compressed_size; + + /* Do compression. + * TODO: Validate compression option? 
*/ + buff->in->text_compressed = buff->in->data + buff->in->text_size; + buff->in->text_compressed_size = LZ4_compress_fast( buff->in->data, + buff->in->text_compressed, + buff->in->text_size, + text_compressed_buff_max_size, + p_file_info->compression_accel); + m_assert(buff->in->text_compressed_size != 0, "Text_compressed_size should be != 0"); + + p_file_info->parser_metrics->last_update = buff->in->timestamp / MSEC_PER_SEC; + + p_file_info->parser_metrics->num_lines += buff->in->num_lines; + + /* Perform custom log chart parsing */ + for(int i = 0; p_file_info->parser_cus_config[i]; i++){ + p_file_info->parser_metrics->parser_cus[i]->count += + search_keyword( buff->in->data, buff->in->text_size, NULL, NULL, + NULL, &p_file_info->parser_cus_config[i]->regex, 0); + } + + /* Update charts */ + netdata_mutex_lock(&stdout_mut); + p_file_info->chart_meta->update(p_file_info); + fflush(stdout); + netdata_mutex_unlock(&stdout_mut); + + circ_buff_insert(buff); + + uv_timer_again(&p_file_info->flb_tmp_buff_cpy_timer); +} + +void flb_complete_item_timer_timeout_cb(uv_timer_t *handle) { + + struct File_info *p_file_info = handle->data; + Circ_buff_t *buff = p_file_info->circ_buff; + + uv_mutex_lock(&p_file_info->flb_tmp_buff_mut); + if(!buff->in->data || !*buff->in->data || !buff->in->text_size){ + p_file_info->parser_metrics->last_update = now_realtime_sec(); + netdata_mutex_lock(&stdout_mut); + p_file_info->chart_meta->update(p_file_info); + fflush(stdout); + netdata_mutex_unlock(&stdout_mut); + uv_mutex_unlock(&p_file_info->flb_tmp_buff_mut); + return; + } + + flb_complete_buff_item(p_file_info); + + uv_mutex_unlock(&p_file_info->flb_tmp_buff_mut); +} + +static int flb_collect_logs_cb(void *record, size_t size, void *data){ + + /* "data" is NULL for Forward-type sources and non-NULL for local sources */ + struct File_info *p_file_info = (struct File_info *) data; + Circ_buff_t *buff = NULL; + + msgpack_unpacked result; + size_t off = 0; + struct flb_time tmp_time; + 
msgpack_object *x; + + char timestamp_str[TIMESTAMP_MS_STR_SIZE] = ""; + msec_t timestamp = 0; + + struct resizable_key_val_arr { + char **key; + char **val; + size_t *key_size; + size_t *val_size; + int size, max_size; + }; + + /* FLB_WEB_LOG case */ + Log_line_parsed_t line_parsed = (Log_line_parsed_t) {0}; + /* FLB_WEB_LOG case end */ + + /* FLB_KMSG case */ + static int skip_kmsg_log_buffering = 1; + int kmsg_sever = -1; // -1 equals invalid + /* FLB_KMSG case end */ + + /* FLB_SYSTEMD or FLB_SYSLOG case */ + char syslog_prival[4] = ""; + size_t syslog_prival_size = 0; + char syslog_severity[2] = ""; + char syslog_facility[3] = ""; + char *syslog_timestamp = NULL; + size_t syslog_timestamp_size = 0; + char *hostname = NULL; + size_t hostname_size = 0; + char *syslog_identifier = NULL; + size_t syslog_identifier_size = 0; + char *pid = NULL; + size_t pid_size = 0; + char *message = NULL; + size_t message_size = 0; + /* FLB_SYSTEMD or FLB_SYSLOG case end */ + + /* FLB_DOCKER_EV case */ + long docker_ev_time = 0; + long docker_ev_timeNano = 0; + char *docker_ev_type = NULL; + size_t docker_ev_type_size = 0; + char *docker_ev_action = NULL; + size_t docker_ev_action_size = 0; + char *docker_ev_id = NULL; + size_t docker_ev_id_size = 0; + static struct resizable_key_val_arr docker_ev_attr = {0}; + docker_ev_attr.size = 0; + /* FLB_DOCKER_EV case end */ + + /* FLB_MQTT case */ + char *mqtt_topic = NULL; + size_t mqtt_topic_size = 0; + static char *mqtt_message = NULL; + static size_t mqtt_message_size_max = 0; + /* FLB_MQTT case end */ + + size_t new_tmp_text_size = 0; + + msgpack_unpacked_init(&result); + + int iter = 0; + while (dl_msgpack_unpack_next(&result, record, size, &off) == MSGPACK_UNPACK_SUCCESS) { + iter++; + m_assert(iter == 1, "We do not expect more than one loop iteration here"); + + flb_time_pop_from_msgpack(&tmp_time, &result, &x); + + if(likely(x->type == MSGPACK_OBJECT_MAP && x->via.map.size != 0)){ + msgpack_object_kv* p = x->via.map.ptr; + 
msgpack_object_kv* pend = x->via.map.ptr + x->via.map.size; + + /* ================================================================ + * If p_file_info == NULL, it means it is a "Forward" source, so + * we need to search for the associated p_file_info. This code can + * be optimized further. + * ============================================================== */ + if(p_file_info == NULL){ + do{ + if(!strncmp(p->key.via.str.ptr, "stream guid", (size_t) p->key.via.str.size)){ + char *stream_guid = (char *) p->val.via.str.ptr; + size_t stream_guid_size = p->val.via.str.size; + debug_log( "stream guid:%.*s", (int) stream_guid_size, stream_guid); + + for (int i = 0; i < p_file_infos_arr->count; i++) { + if(!strncmp(p_file_infos_arr->data[i]->stream_guid, stream_guid, stream_guid_size)){ + p_file_info = p_file_infos_arr->data[i]; + // debug_log( "p_file_info match found: %s type[%s]", + // p_file_info->stream_guid, + // log_src_type_t_str[p_file_info->log_type]); + break; + } + } + } + ++p; + // continue; + } while(p < pend); + } + if(unlikely(p_file_info == NULL)) + goto skip_collect_and_drop_logs; + + + uv_mutex_lock(&p_file_info->flb_tmp_buff_mut); + buff = p_file_info->circ_buff; + + + p = x->via.map.ptr; + pend = x->via.map.ptr + x->via.map.size; + do{ + switch(p_file_info->log_type){ + + case FLB_TAIL: + case FLB_WEB_LOG: + case FLB_SERIAL: + { + if( !strncmp(p->key.via.str.ptr, LOG_REC_KEY, (size_t) p->key.via.str.size) || + /* The following line is in case we collect systemd logs + * (tagged as "MESSAGE") or docker_events (tagged as + * "message") via a "Forward" source to an FLB_TAIL parent. 
*/ + !strncasecmp(p->key.via.str.ptr, LOG_REC_KEY_SYSTEMD, (size_t) p->key.via.str.size)){ + + message = (char *) p->val.via.str.ptr; + message_size = p->val.via.str.size; + + if(p_file_info->log_type == FLB_WEB_LOG){ + parse_web_log_line( (Web_log_parser_config_t *) p_file_info->parser_config->gen_config, + message, message_size, &line_parsed); + + if(likely(p_file_info->use_log_timestamp)){ + timestamp = line_parsed.timestamp * MSEC_PER_SEC; // convert to msec from sec + + { /* ------------------ FIXME ------------------------ + * Temporary kludge so that metrics don't break when + * a new record has timestamp before the current one. + */ + static msec_t previous_timestamp = 0; + if((((long long) timestamp - (long long) previous_timestamp) < 0)) + timestamp = previous_timestamp; + + previous_timestamp = timestamp; + } + } + } + + new_tmp_text_size = message_size + 1; // +1 for '\n' + + m_assert(message_size, "message_size is 0"); + m_assert(message, "message is NULL"); + } + + break; + } + + case FLB_KMSG: + { + if(unlikely(skip_kmsg_log_buffering)){ + static time_t netdata_start_time = 0; + if (!netdata_start_time) netdata_start_time = now_boottime_sec(); + if(now_boottime_sec() - netdata_start_time < KERNEL_LOGS_COLLECT_INIT_WAIT) + goto skip_collect_and_drop_logs; + else skip_kmsg_log_buffering = 0; + } + + /* NOTE/WARNING: + * kmsg timestamps are tricky. The timestamp will be + * *wrong** if the system has gone into hibernation since + * last boot and "p_file_info->use_log_timestamp" is set. + * Even if "p_file_info->use_log_timestamp" is NOT set, we + * need to use now_realtime_msec() as Fluent Bit timestamp + * will also be wrong. 
*/ + if( !strncmp(p->key.via.str.ptr, "sec", (size_t) p->key.via.str.size)){ + if(p_file_info->use_log_timestamp){ + timestamp += (now_realtime_sec() - now_boottime_sec() + p->val.via.i64) * MSEC_PER_SEC; + } + else if(!timestamp) + timestamp = now_realtime_msec(); + } + else if(!strncmp(p->key.via.str.ptr, "usec", (size_t) p->key.via.str.size) && + p_file_info->use_log_timestamp){ + timestamp += p->val.via.i64 / USEC_PER_MS; + } + else if(!strncmp(p->key.via.str.ptr, LOG_REC_KEY, (size_t) p->key.via.str.size)){ + message = (char *) p->val.via.str.ptr; + message_size = p->val.via.str.size; + + m_assert(message, "message is NULL"); + m_assert(message_size, "message_size is 0"); + + new_tmp_text_size += message_size + 1; // +1 for '\n' + } + else if(!strncmp(p->key.via.str.ptr, "priority", (size_t) p->key.via.str.size)){ + kmsg_sever = (int) p->val.via.u64; + } + + break; + } + + case FLB_SYSTEMD: + case FLB_SYSLOG: + { + if( p_file_info->use_log_timestamp && !strncmp( p->key.via.str.ptr, + "SOURCE_REALTIME_TIMESTAMP", + (size_t) p->key.via.str.size)){ + + m_assert(p->val.via.str.size - 3 == TIMESTAMP_MS_STR_SIZE - 1, + "p->val.via.str.size - 3 != TIMESTAMP_MS_STR_SIZE"); + + strncpyz(timestamp_str, p->val.via.str.ptr, (size_t) p->val.via.str.size); + + char *endptr = NULL; + timestamp = str2ll(timestamp_str, &endptr); + timestamp = *endptr ? 
0 : timestamp / USEC_PER_MS; + } + else if(!strncmp(p->key.via.str.ptr, "PRIVAL", (size_t) p->key.via.str.size)){ + m_assert(p->val.via.str.size <= 3, "p->val.via.str.size > 3"); + strncpyz(syslog_prival, p->val.via.str.ptr, (size_t) p->val.via.str.size); + syslog_prival_size = (size_t) p->val.via.str.size; + + m_assert(syslog_prival, "syslog_prival is NULL"); + } + else if(!strncmp(p->key.via.str.ptr, "PRIORITY", (size_t) p->key.via.str.size)){ + m_assert(p->val.via.str.size <= 1, "p->val.via.str.size > 1"); + strncpyz(syslog_severity, p->val.via.str.ptr, (size_t) p->val.via.str.size); + + m_assert(syslog_severity, "syslog_severity is NULL"); + } + else if(!strncmp(p->key.via.str.ptr, "SYSLOG_FACILITY", (size_t) p->key.via.str.size)){ + m_assert(p->val.via.str.size <= 2, "p->val.via.str.size > 2"); + strncpyz(syslog_facility, p->val.via.str.ptr, (size_t) p->val.via.str.size); + + m_assert(syslog_facility, "syslog_facility is NULL"); + } + else if(!strncmp(p->key.via.str.ptr, "SYSLOG_TIMESTAMP", (size_t) p->key.via.str.size)){ + syslog_timestamp = (char *) p->val.via.str.ptr; + syslog_timestamp_size = p->val.via.str.size; + + m_assert(syslog_timestamp, "syslog_timestamp is NULL"); + m_assert(syslog_timestamp_size, "syslog_timestamp_size is 0"); + + new_tmp_text_size += syslog_timestamp_size; + } + else if(!strncmp(p->key.via.str.ptr, "HOSTNAME", (size_t) p->key.via.str.size)){ + hostname = (char *) p->val.via.str.ptr; + hostname_size = p->val.via.str.size; + + m_assert(hostname, "hostname is NULL"); + m_assert(hostname_size, "hostname_size is 0"); + + new_tmp_text_size += hostname_size + 1; // +1 for ' ' char + } + else if(!strncmp(p->key.via.str.ptr, "SYSLOG_IDENTIFIER", (size_t) p->key.via.str.size)){ + syslog_identifier = (char *) p->val.via.str.ptr; + syslog_identifier_size = p->val.via.str.size; + + new_tmp_text_size += syslog_identifier_size; + } + else if(!strncmp(p->key.via.str.ptr, "PID", (size_t) p->key.via.str.size)){ + pid = (char *) p->val.via.str.ptr; 
+ pid_size = p->val.via.str.size; + + new_tmp_text_size += pid_size; + } + else if(!strncmp(p->key.via.str.ptr, LOG_REC_KEY_SYSTEMD, (size_t) p->key.via.str.size)){ + + message = (char *) p->val.via.str.ptr; + message_size = p->val.via.str.size; + + m_assert(message, "message is NULL"); + m_assert(message_size, "message_size is 0"); + + new_tmp_text_size += message_size; + } + + break; + } + + case FLB_DOCKER_EV: + { + if(!strncmp(p->key.via.str.ptr, "time", (size_t) p->key.via.str.size)){ + docker_ev_time = p->val.via.i64; + + m_assert(docker_ev_time, "docker_ev_time is 0"); + } + else if(!strncmp(p->key.via.str.ptr, "timeNano", (size_t) p->key.via.str.size)){ + docker_ev_timeNano = p->val.via.i64; + + m_assert(docker_ev_timeNano, "docker_ev_timeNano is 0"); + + if(likely(p_file_info->use_log_timestamp)) + timestamp = docker_ev_timeNano / NSEC_PER_MSEC; + } + else if(!strncmp(p->key.via.str.ptr, "Type", (size_t) p->key.via.str.size)){ + docker_ev_type = (char *) p->val.via.str.ptr; + docker_ev_type_size = p->val.via.str.size; + + m_assert(docker_ev_type, "docker_ev_type is NULL"); + m_assert(docker_ev_type_size, "docker_ev_type_size is 0"); + + // debug_log("docker_ev_type: %.*s", docker_ev_type_size, docker_ev_type); + } + else if(!strncmp(p->key.via.str.ptr, "Action", (size_t) p->key.via.str.size)){ + docker_ev_action = (char *) p->val.via.str.ptr; + docker_ev_action_size = p->val.via.str.size; + + m_assert(docker_ev_action, "docker_ev_action is NULL"); + m_assert(docker_ev_action_size, "docker_ev_action_size is 0"); + + // debug_log("docker_ev_action: %.*s", docker_ev_action_size, docker_ev_action); + } + else if(!strncmp(p->key.via.str.ptr, "id", (size_t) p->key.via.str.size)){ + docker_ev_id = (char *) p->val.via.str.ptr; + docker_ev_id_size = p->val.via.str.size; + + m_assert(docker_ev_id, "docker_ev_id is NULL"); + m_assert(docker_ev_id_size, "docker_ev_id_size is 0"); + + // debug_log("docker_ev_id: %.*s", docker_ev_id_size, docker_ev_id); + } + else 
if(!strncmp(p->key.via.str.ptr, "Actor", (size_t) p->key.via.str.size)){ + // debug_log( "msg key:[%.*s]val:[%.*s]", (int) p->key.via.str.size, + // p->key.via.str.ptr, + // (int) p->val.via.str.size, + // p->val.via.str.ptr); + if(likely(p->val.type == MSGPACK_OBJECT_MAP && p->val.via.map.size != 0)){ + msgpack_object_kv* ac = p->val.via.map.ptr; + msgpack_object_kv* const ac_pend= p->val.via.map.ptr + p->val.via.map.size; + do{ + if(!strncmp(ac->key.via.str.ptr, "ID", (size_t) ac->key.via.str.size)){ + docker_ev_id = (char *) ac->val.via.str.ptr; + docker_ev_id_size = ac->val.via.str.size; + + m_assert(docker_ev_id, "docker_ev_id is NULL"); + m_assert(docker_ev_id_size, "docker_ev_id_size is 0"); + + // debug_log("docker_ev_id: %.*s", docker_ev_id_size, docker_ev_id); + } + else if(!strncmp(ac->key.via.str.ptr, "Attributes", (size_t) ac->key.via.str.size)){ + if(likely(ac->val.type == MSGPACK_OBJECT_MAP && ac->val.via.map.size != 0)){ + msgpack_object_kv* att = ac->val.via.map.ptr; + msgpack_object_kv* const att_pend = ac->val.via.map.ptr + ac->val.via.map.size; + do{ + if(unlikely(++docker_ev_attr.size > docker_ev_attr.max_size)){ + docker_ev_attr.max_size = docker_ev_attr.size; + docker_ev_attr.key = reallocz(docker_ev_attr.key, + docker_ev_attr.max_size * sizeof(char *)); + docker_ev_attr.val = reallocz(docker_ev_attr.val, + docker_ev_attr.max_size * sizeof(char *)); + docker_ev_attr.key_size = reallocz(docker_ev_attr.key_size, + docker_ev_attr.max_size * sizeof(size_t)); + docker_ev_attr.val_size = reallocz(docker_ev_attr.val_size, + docker_ev_attr.max_size * sizeof(size_t)); + } + + docker_ev_attr.key[docker_ev_attr.size - 1] = (char *) att->key.via.str.ptr; + docker_ev_attr.val[docker_ev_attr.size - 1] = (char *) att->val.via.str.ptr; + docker_ev_attr.key_size[docker_ev_attr.size - 1] = (size_t) att->key.via.str.size; + docker_ev_attr.val_size[docker_ev_attr.size - 1] = (size_t) att->val.via.str.size; + + att++; + continue; + } while(att < att_pend); + } + 
} + ac++; + continue; + } while(ac < ac_pend); + } + } + + break; + } + + case FLB_MQTT: + { + if(!strncmp(p->key.via.str.ptr, "topic", (size_t) p->key.via.str.size)){ + mqtt_topic = (char *) p->val.via.str.ptr; + mqtt_topic_size = (size_t) p->val.via.str.size; + + while(0 == (message_size = dl_msgpack_object_print_buffer(mqtt_message, mqtt_message_size_max, *x))) + mqtt_message = reallocz(mqtt_message, (mqtt_message_size_max += 10)); + + new_tmp_text_size = message_size + 1; // +1 for '\n' + + m_assert(message_size, "message_size is 0"); + m_assert(mqtt_message, "mqtt_message is NULL"); + + break; // watch out, MQTT requires a 'break' here, as we parse the entire 'x' msgpack_object + } + else m_assert(0, "missing mqtt topic"); + + break; + } + + default: + break; + } + + } while(++p < pend); + } + } + + /* If no log timestamp was found, use Fluent Bit collection timestamp. */ + if(timestamp == 0) + timestamp = (msec_t) tmp_time.tm.tv_sec * MSEC_PER_SEC + (msec_t) tmp_time.tm.tv_nsec / (NSEC_PER_MSEC); + + m_assert(TEST_MS_TIMESTAMP_VALID(timestamp), "timestamp is invalid"); + + /* If input buffer timestamp is not set, now is the time to set it, + * else just be done with the previous buffer */ + if(unlikely(buff->in->timestamp == 0)) buff->in->timestamp = timestamp / 1000 * 1000; // rounding down + else if((timestamp - buff->in->timestamp) >= MSEC_PER_SEC) { + flb_complete_buff_item(p_file_info); + buff->in->timestamp = timestamp / 1000 * 1000; // rounding down + } + + m_assert(TEST_MS_TIMESTAMP_VALID(buff->in->timestamp), "buff->in->timestamp is invalid"); + + new_tmp_text_size += buff->in->text_size; + + /* ======================================================================== + * Step 2: Extract metrics and reconstruct log record + * ====================================================================== */ + + /* Parse number of log lines - common for all log source types */ + buff->in->num_lines++; + + /* FLB_TAIL, FLB_WEB_LOG and FLB_SERIAL case */ + if( 
p_file_info->log_type == FLB_TAIL || + p_file_info->log_type == FLB_WEB_LOG || + p_file_info->log_type == FLB_SERIAL){ + + if(p_file_info->log_type == FLB_WEB_LOG) + extract_web_log_metrics(p_file_info->parser_config, &line_parsed, + p_file_info->parser_metrics->web_log); + + // TODO: Fix: Metrics will still be collected if circ_buff_prepare_write() returns 0. + if(unlikely(!circ_buff_prepare_write(buff, new_tmp_text_size))) + goto skip_collect_and_drop_logs; + + size_t tmp_item_off = buff->in->text_size; + + memcpy_iscntrl_fix(&buff->in->data[tmp_item_off], message, message_size); + tmp_item_off += message_size; + + buff->in->data[tmp_item_off++] = '\n'; + m_assert(tmp_item_off == new_tmp_text_size, "tmp_item_off should be == new_tmp_text_size"); + buff->in->text_size = new_tmp_text_size; + +#ifdef HAVE_SYSTEMD + if(p_file_info->do_sd_journal_send){ + if(p_file_info->log_type == FLB_WEB_LOG){ + sd_journal_send( + SD_JOURNAL_SEND_DEFAULT_FIELDS, + *line_parsed.vhost ? "%sWEB_LOG_VHOST=%s" : "_%s=%s", sd_journal_field_prefix, line_parsed.vhost, + line_parsed.port ? "%sWEB_LOG_PORT=%d" : "_%s=%d", sd_journal_field_prefix, line_parsed.port, + *line_parsed.req_scheme ? "%sWEB_LOG_REQ_SCHEME=%s" : "_%s=%s", sd_journal_field_prefix, line_parsed.req_scheme, + *line_parsed.req_client ? "%sWEB_LOG_REQ_CLIENT=%s" : "_%s=%s", sd_journal_field_prefix, line_parsed.req_client, + "%sWEB_LOG_REQ_METHOD=%s" , sd_journal_field_prefix, line_parsed.req_method, + *line_parsed.req_URL ? "%sWEB_LOG_REQ_URL=%s" : "_%s=%s", sd_journal_field_prefix, line_parsed.req_URL, + *line_parsed.req_proto ? "%sWEB_LOG_REQ_PROTO=%s" : "_%s=%s", sd_journal_field_prefix, line_parsed.req_proto, + line_parsed.req_size ? "%sWEB_LOG_REQ_SIZE=%d" : "_%s=%d", sd_journal_field_prefix, line_parsed.req_size, + line_parsed.req_proc_time ? "%sWEB_LOG_REC_PROC_TIME=%d" : "_%s=%d", sd_journal_field_prefix, line_parsed.req_proc_time, + line_parsed.resp_code ? 
"%sWEB_LOG_RESP_CODE=%d" : "_%s=%d", sd_journal_field_prefix ,line_parsed.resp_code, + line_parsed.ups_resp_time ? "%sWEB_LOG_UPS_RESP_TIME=%d" : "_%s=%d", sd_journal_field_prefix ,line_parsed.ups_resp_time, + *line_parsed.ssl_proto ? "%sWEB_LOG_SSL_PROTO=%s" : "_%s=%s", sd_journal_field_prefix ,line_parsed.ssl_proto, + *line_parsed.ssl_cipher ? "%sWEB_LOB_SSL_CIPHER=%s" : "_%s=%s", sd_journal_field_prefix ,line_parsed.ssl_cipher, + LOG_REC_KEY_SYSTEMD "=%.*s", (int) message_size, message, + NULL + ); + } + else if(p_file_info->log_type == FLB_SERIAL){ + Flb_serial_config_t *serial_config = (Flb_serial_config_t *) p_file_info->flb_config; + sd_journal_send( + SD_JOURNAL_SEND_DEFAULT_FIELDS, + serial_config->bitrate && *serial_config->bitrate ? + "%sSERIAL_BITRATE=%s" : "_%s=%s", sd_journal_field_prefix, serial_config->bitrate, + LOG_REC_KEY_SYSTEMD "=%.*s", (int) message_size, message, + NULL + ); + } + else{ + sd_journal_send( + SD_JOURNAL_SEND_DEFAULT_FIELDS, + LOG_REC_KEY_SYSTEMD "=%.*s", (int) message_size, message, + NULL + ); + } + } +#endif + + } /* FLB_TAIL, FLB_WEB_LOG and FLB_SERIAL case end */ + + /* FLB_KMSG case */ + else if(p_file_info->log_type == FLB_KMSG){ + + char *c; + + // see https://www.kernel.org/doc/Documentation/ABI/testing/dev-kmsg + if((c = memchr(message, '\n', message_size))){ + + const char subsys_str[] = "SUBSYSTEM=", + device_str[] = "DEVICE="; + const size_t subsys_str_len = sizeof(subsys_str) - 1, + device_str_len = sizeof(device_str) - 1; + + size_t bytes_remain = message_size - (c - message); + + /* Extract machine-readable info for charts, such as subsystem and device. 
*/ + while(bytes_remain){ + size_t sz = 0; + while(--bytes_remain && c[++sz] != '\n'); + if(bytes_remain) --sz; + *(c++) = '\\'; + *(c++) = 'n'; + sz--; + + DICTIONARY *dict = NULL; + char *str = NULL; + size_t str_len = 0; + if(!strncmp(c, subsys_str, subsys_str_len)){ + dict = p_file_info->parser_metrics->kernel->subsystem; + str = &c[subsys_str_len]; + str_len = (sz - subsys_str_len); + } + else if (!strncmp(c, device_str, device_str_len)){ + dict = p_file_info->parser_metrics->kernel->device; + str = &c[device_str_len]; + str_len = (sz - device_str_len); + } + + if(likely(str)){ + char *const key = mallocz(str_len + 1); + memcpy(key, str, str_len); + key[str_len] = '\0'; + metrics_dict_item_t item = {.dim_initialized = false, .num_new = 1}; + dictionary_set_advanced(dict, key, str_len + 1, &item, sizeof(item), NULL); + } + c = &c[sz]; + } + } + + if(likely(kmsg_sever >= 0)) + p_file_info->parser_metrics->kernel->sever[kmsg_sever]++; + + // TODO: Fix: Metrics will still be collected if circ_buff_prepare_write() returns 0. 
+ if(unlikely(!circ_buff_prepare_write(buff, new_tmp_text_size))) + goto skip_collect_and_drop_logs; + + size_t tmp_item_off = buff->in->text_size; + + memcpy_iscntrl_fix(&buff->in->data[tmp_item_off], message, message_size); + tmp_item_off += message_size; + + buff->in->data[tmp_item_off++] = '\n'; + m_assert(tmp_item_off == new_tmp_text_size, "tmp_item_off should be == new_tmp_text_size"); + buff->in->text_size = new_tmp_text_size; + } /* FLB_KMSG case end */ + + /* FLB_SYSTEMD or FLB_SYSLOG case */ + else if(p_file_info->log_type == FLB_SYSTEMD || + p_file_info->log_type == FLB_SYSLOG){ + + int syslog_prival_d = SYSLOG_PRIOR_ARR_SIZE - 1; // Initialise to 'unknown' + int syslog_severity_d = SYSLOG_SEVER_ARR_SIZE - 1; // Initialise to 'unknown' + int syslog_facility_d = SYSLOG_FACIL_ARR_SIZE - 1; // Initialise to 'unknown' + + + /* FLB_SYSTEMD case has syslog_severity and syslog_facility values that + * are used to calculate syslog_prival from. FLB_SYSLOG is the opposite + * case, as it has a syslog_prival value that is used to calculate + * syslog_severity and syslog_facility from. */ + if(p_file_info->log_type == FLB_SYSTEMD){ + + /* Parse syslog_severity char* field into int and extract metrics. + * syslog_severity_s will consist of 1 char (plus '\0'), + * see https://datatracker.ietf.org/doc/html/rfc5424#section-6.2.1 */ + if(likely(syslog_severity[0])){ + if(likely(str2int(&syslog_severity_d, syslog_severity, 10) == STR2XX_SUCCESS)){ + p_file_info->parser_metrics->systemd->sever[syslog_severity_d]++; + } // else parsing errors ++ ?? + } else p_file_info->parser_metrics->systemd->sever[SYSLOG_SEVER_ARR_SIZE - 1]++; // 'unknown' + + /* Parse syslog_facility char* field into int and extract metrics. 
+ * syslog_facility_s will consist of up to 2 chars (plus '\0'), + * see https://datatracker.ietf.org/doc/html/rfc5424#section-6.2.1 */ + if(likely(syslog_facility[0])){ + if(likely(str2int(&syslog_facility_d, syslog_facility, 10) == STR2XX_SUCCESS)){ + p_file_info->parser_metrics->systemd->facil[syslog_facility_d]++; + } // else parsing errors ++ ?? + } else p_file_info->parser_metrics->systemd->facil[SYSLOG_FACIL_ARR_SIZE - 1]++; // 'unknown' + + if(likely(syslog_severity[0] && syslog_facility[0])){ + /* Definition of syslog priority value == facility * 8 + severity */ + syslog_prival_d = syslog_facility_d * 8 + syslog_severity_d; + syslog_prival_size = snprintfz(syslog_prival, 4, "%d", syslog_prival_d); + m_assert(syslog_prival_size < 4 && syslog_prival_size > 0, "error with snprintf()"); + + new_tmp_text_size += syslog_prival_size + 2; // +2 for '<' and '>' + + p_file_info->parser_metrics->systemd->prior[syslog_prival_d]++; + } else { + new_tmp_text_size += 3; // +3 for "<->" string + p_file_info->parser_metrics->systemd->prior[SYSLOG_PRIOR_ARR_SIZE - 1]++; // 'unknown' + } + + } else if(p_file_info->log_type == FLB_SYSLOG){ + + if(likely(syslog_prival[0])){ + if(likely(str2int(&syslog_prival_d, syslog_prival, 10) == STR2XX_SUCCESS)){ + syslog_severity_d = syslog_prival_d % 8; + syslog_facility_d = syslog_prival_d / 8; + + p_file_info->parser_metrics->systemd->prior[syslog_prival_d]++; + p_file_info->parser_metrics->systemd->sever[syslog_severity_d]++; + p_file_info->parser_metrics->systemd->facil[syslog_facility_d]++; + + new_tmp_text_size += syslog_prival_size + 2; // +2 for '<' and '>' + + } // else parsing errors ++ ?? 
+ } else { + new_tmp_text_size += 3; // +3 for "<->" string + p_file_info->parser_metrics->systemd->prior[SYSLOG_PRIOR_ARR_SIZE - 1]++; // 'unknown' + p_file_info->parser_metrics->systemd->sever[SYSLOG_SEVER_ARR_SIZE - 1]++; // 'unknown' + p_file_info->parser_metrics->systemd->facil[SYSLOG_FACIL_ARR_SIZE - 1]++; // 'unknown' + } + + } else m_assert(0, "shouldn't get here"); + + char syslog_time_from_flb_time[25]; // 25 just to be on the safe side, but only 16 + 1 bytes are needed. + if(unlikely(!syslog_timestamp)){ + const time_t ts = tmp_time.tm.tv_sec; + struct tm *const tm = localtime(&ts); + + strftime(syslog_time_from_flb_time, sizeof(syslog_time_from_flb_time), "%b %d %H:%M:%S ", tm); + new_tmp_text_size += SYSLOG_TIMESTAMP_SIZE; + } + + if(unlikely(!syslog_identifier)) new_tmp_text_size += sizeof(UNKNOWN) - 1; + if(unlikely(!pid)) new_tmp_text_size += sizeof(UNKNOWN) - 1; + + new_tmp_text_size += 5; // +5 for '[', ']', ':' and ' ' characters around and after pid and '\n' at the end + + /* Metrics extracted, now prepare circular buffer for write */ + // TODO: Fix: Metrics will still be collected if circ_buff_prepare_write() returns 0. 
+ if(unlikely(!circ_buff_prepare_write(buff, new_tmp_text_size))) + goto skip_collect_and_drop_logs; + + size_t tmp_item_off = buff->in->text_size; + + buff->in->data[tmp_item_off++] = '<'; + if(likely(syslog_prival[0])){ + memcpy(&buff->in->data[tmp_item_off], syslog_prival, syslog_prival_size); + m_assert(syslog_prival_size, "syslog_prival_size cannot be 0"); + tmp_item_off += syslog_prival_size; + } else buff->in->data[tmp_item_off++] = '-'; + buff->in->data[tmp_item_off++] = '>'; + + if(likely(syslog_timestamp)){ + memcpy(&buff->in->data[tmp_item_off], syslog_timestamp, syslog_timestamp_size); + // FLB_SYSLOG doesn't add space, but FLB_SYSTEMD does: + // if(buff->in->data[tmp_item_off] != ' ') buff->in->data[tmp_item_off++] = ' '; + tmp_item_off += syslog_timestamp_size; + } else { + memcpy(&buff->in->data[tmp_item_off], syslog_time_from_flb_time, SYSLOG_TIMESTAMP_SIZE); + tmp_item_off += SYSLOG_TIMESTAMP_SIZE; + } + + if(likely(hostname)){ + memcpy(&buff->in->data[tmp_item_off], hostname, hostname_size); + tmp_item_off += hostname_size; + buff->in->data[tmp_item_off++] = ' '; + } + + if(likely(syslog_identifier)){ + memcpy(&buff->in->data[tmp_item_off], syslog_identifier, syslog_identifier_size); + tmp_item_off += syslog_identifier_size; + } else { + memcpy(&buff->in->data[tmp_item_off], UNKNOWN, sizeof(UNKNOWN) - 1); + tmp_item_off += sizeof(UNKNOWN) - 1; + } + + buff->in->data[tmp_item_off++] = '['; + if(likely(pid)){ + memcpy(&buff->in->data[tmp_item_off], pid, pid_size); + tmp_item_off += pid_size; + } else { + memcpy(&buff->in->data[tmp_item_off], UNKNOWN, sizeof(UNKNOWN) - 1); + tmp_item_off += sizeof(UNKNOWN) - 1; + } + buff->in->data[tmp_item_off++] = ']'; + + buff->in->data[tmp_item_off++] = ':'; + buff->in->data[tmp_item_off++] = ' '; + + if(likely(message)){ + memcpy_iscntrl_fix(&buff->in->data[tmp_item_off], message, message_size); + tmp_item_off += message_size; + } + + buff->in->data[tmp_item_off++] = '\n'; + m_assert(tmp_item_off == 
new_tmp_text_size, "tmp_item_off should be == new_tmp_text_size"); + buff->in->text_size = new_tmp_text_size; + } /* FLB_SYSTEMD or FLB_SYSLOG case end */ + + /* FLB_DOCKER_EV case */ + else if(p_file_info->log_type == FLB_DOCKER_EV){ + + const size_t docker_ev_datetime_size = sizeof "2022-08-26T15:33:20.802840200+0000" /* example datetime */; + char docker_ev_datetime[docker_ev_datetime_size]; + docker_ev_datetime[0] = 0; + if(likely(docker_ev_time && docker_ev_timeNano)){ + struct timespec ts; + ts.tv_sec = docker_ev_time; + if(unlikely(0 == strftime( docker_ev_datetime, docker_ev_datetime_size, + "%Y-%m-%dT%H:%M:%S.000000000%z", localtime(&ts.tv_sec)))) { /* TODO: do what if error? */}; + const size_t docker_ev_timeNano_s_size = sizeof "802840200"; + char docker_ev_timeNano_s[docker_ev_timeNano_s_size]; + snprintfz( docker_ev_timeNano_s, docker_ev_timeNano_s_size, "%0*ld", + (int) docker_ev_timeNano_s_size, docker_ev_timeNano % 1000000000); + memcpy(&docker_ev_datetime[20], &docker_ev_timeNano_s, docker_ev_timeNano_s_size - 1); + + new_tmp_text_size += docker_ev_datetime_size; // -1 for null terminator, +1 for ' ' character + } + + if(likely(docker_ev_type && docker_ev_action)){ + int ev_off = -1; + while(++ev_off < NUM_OF_DOCKER_EV_TYPES){ + if(!strncmp(docker_ev_type, docker_ev_type_string[ev_off], docker_ev_type_size)){ + p_file_info->parser_metrics->docker_ev->ev_type[ev_off]++; + + int act_off = -1; + while(docker_ev_action_string[ev_off][++act_off] != NULL){ + if(!strncmp(docker_ev_action, docker_ev_action_string[ev_off][act_off], docker_ev_action_size)){ + p_file_info->parser_metrics->docker_ev->ev_action[ev_off][act_off]++; + break; + } + } + if(unlikely(docker_ev_action_string[ev_off][act_off] == NULL)) + p_file_info->parser_metrics->docker_ev->ev_action[NUM_OF_DOCKER_EV_TYPES - 1][0]++; // 'unknown' + + break; + } + } + if(unlikely(ev_off >= NUM_OF_DOCKER_EV_TYPES - 1)){ + p_file_info->parser_metrics->docker_ev->ev_type[ev_off]++; // 'unknown' + 
p_file_info->parser_metrics->docker_ev->ev_action[NUM_OF_DOCKER_EV_TYPES - 1][0]++; // 'unknown' + } + + new_tmp_text_size += docker_ev_type_size + docker_ev_action_size + 2; // +2 for ' ' chars + } + + if(likely(docker_ev_id)){ + // debug_log("docker_ev_id: %.*s", (int) docker_ev_id_size, docker_ev_id); + + new_tmp_text_size += docker_ev_id_size + 1; // +1 for ' ' char + } + + if(likely(docker_ev_attr.size)){ + for(int i = 0; i < docker_ev_attr.size; i++){ + new_tmp_text_size += docker_ev_attr.key_size[i] + + docker_ev_attr.val_size[i] + 3; // +3 for '=' ',' ' ' characters + } + /* new_tmp_text_size = -2 + 2; + * -2 due to missing ',' ' ' from last attribute and +2 for the two + * '(' and ')' characters, so no need to add or subtract */ + } + + new_tmp_text_size += 1; // +1 for '\n' character at the end + + /* Metrics extracted, now prepare circular buffer for write */ + // TODO: Fix: Metrics will still be collected if circ_buff_prepare_write() returns 0. + if(unlikely(!circ_buff_prepare_write(buff, new_tmp_text_size))) + goto skip_collect_and_drop_logs; + + size_t tmp_item_off = buff->in->text_size; + message_size = new_tmp_text_size - 1 - tmp_item_off; + + if(likely(*docker_ev_datetime)){ + memcpy(&buff->in->data[tmp_item_off], docker_ev_datetime, docker_ev_datetime_size - 1); + tmp_item_off += docker_ev_datetime_size - 1; // -1 due to null terminator + buff->in->data[tmp_item_off++] = ' '; + } + + if(likely(docker_ev_type)){ + memcpy(&buff->in->data[tmp_item_off], docker_ev_type, docker_ev_type_size); + tmp_item_off += docker_ev_type_size; + buff->in->data[tmp_item_off++] = ' '; + } + + if(likely(docker_ev_action)){ + memcpy(&buff->in->data[tmp_item_off], docker_ev_action, docker_ev_action_size); + tmp_item_off += docker_ev_action_size; + buff->in->data[tmp_item_off++] = ' '; + } + + if(likely(docker_ev_id)){ + memcpy(&buff->in->data[tmp_item_off], docker_ev_id, docker_ev_id_size); + tmp_item_off += docker_ev_id_size; + buff->in->data[tmp_item_off++] = ' '; + } 
+ + if(likely(docker_ev_attr.size)){ + buff->in->data[tmp_item_off++] = '('; + for(int i = 0; i < docker_ev_attr.size; i++){ + memcpy(&buff->in->data[tmp_item_off], docker_ev_attr.key[i], docker_ev_attr.key_size[i]); + tmp_item_off += docker_ev_attr.key_size[i]; + buff->in->data[tmp_item_off++] = '='; + memcpy(&buff->in->data[tmp_item_off], docker_ev_attr.val[i], docker_ev_attr.val_size[i]); + tmp_item_off += docker_ev_attr.val_size[i]; + buff->in->data[tmp_item_off++] = ','; + buff->in->data[tmp_item_off++] = ' '; + } + tmp_item_off -= 2; // overwrite last ',' and ' ' characters with a ')' character + buff->in->data[tmp_item_off++] = ')'; + } + + buff->in->data[tmp_item_off++] = '\n'; + m_assert(tmp_item_off == new_tmp_text_size, "tmp_item_off should be == new_tmp_text_size"); + buff->in->text_size = new_tmp_text_size; + +#ifdef HAVE_SYSTEMD + if(p_file_info->do_sd_journal_send){ + sd_journal_send( + SD_JOURNAL_SEND_DEFAULT_FIELDS, + "%sDOCKER_EVENTS_TYPE=%.*s", sd_journal_field_prefix, (int) docker_ev_type_size, docker_ev_type, + "%sDOCKER_EVENTS_ACTION=%.*s", sd_journal_field_prefix, (int) docker_ev_action_size, docker_ev_action, + "%sDOCKER_EVENTS_ID=%.*s", sd_journal_field_prefix, (int) docker_ev_id_size, docker_ev_id, + LOG_REC_KEY_SYSTEMD "=%.*s", (int) message_size, &buff->in->data[tmp_item_off - 1 - message_size], + NULL + ); + } +#endif + + } /* FLB_DOCKER_EV case end */ + + /* FLB_MQTT case */ + else if(p_file_info->log_type == FLB_MQTT){ + if(likely(mqtt_topic)){ + char *const key = mallocz(mqtt_topic_size + 1); + memcpy(key, mqtt_topic, mqtt_topic_size); + key[mqtt_topic_size] = '\0'; + metrics_dict_item_t item = {.dim_initialized = false, .num_new = 1}; + dictionary_set_advanced(p_file_info->parser_metrics->mqtt->topic, key, mqtt_topic_size + 1, &item, sizeof(item), NULL); + + // TODO: Fix: Metrics will still be collected if circ_buff_prepare_write() returns 0. 
+ if(unlikely(!circ_buff_prepare_write(buff, new_tmp_text_size))) + goto skip_collect_and_drop_logs; + + size_t tmp_item_off = buff->in->text_size; + + memcpy(&buff->in->data[tmp_item_off], mqtt_message, message_size); + tmp_item_off += message_size; + + buff->in->data[tmp_item_off++] = '\n'; + m_assert(tmp_item_off == new_tmp_text_size, "tmp_item_off should be == new_tmp_text_size"); + buff->in->text_size = new_tmp_text_size; + +#ifdef HAVE_SYSTEMD + if(p_file_info->do_sd_journal_send){ + sd_journal_send( + SD_JOURNAL_SEND_DEFAULT_FIELDS, + "%sMQTT_TOPIC=%s", sd_journal_field_prefix, key, + LOG_REC_KEY_SYSTEMD "=%.*s", (int) message_size, mqtt_message, + NULL + ); + } +#endif + + } + else m_assert(0, "missing mqtt topic"); + } + +skip_collect_and_drop_logs: + /* Following code is equivalent to msgpack_unpacked_destroy(&result), as + * that function call is unavailable when the library is loaded with dlopen() */ + if(result.zone != NULL) { + dl_msgpack_zone_free(result.zone); + result.zone = NULL; + memset(&result.data, 0, sizeof(msgpack_object)); + } + + if(p_file_info) + uv_mutex_unlock(&p_file_info->flb_tmp_buff_mut); + + flb_lib_free(record); + return 0; + +} + +/** + * @brief Add a Fluent-Bit input that outputs to the "lib" Fluent-Bit plugin. + * @param[in] p_file_info Pointer to the log source struct where the input will + * be registered to. + * @return 0 on success, a negative number for any errors (see enum). 
+ */ +int flb_add_input(struct File_info *const p_file_info){ + + enum return_values { + SUCCESS = 0, + INVALID_LOG_TYPE = -1, + CONFIG_READ_ERROR = -2, + FLB_PARSER_CREATE_ERROR = -3, + FLB_INPUT_ERROR = -4, + FLB_INPUT_SET_ERROR = -5, + FLB_OUTPUT_ERROR = -6, + FLB_OUTPUT_SET_ERROR = -7, + DEFAULT_ERROR = -8 + }; + + const int tag_max_size = 5; + static unsigned tag = 0; // incremental tag id to link flb inputs to outputs + char tag_s[tag_max_size]; + snprintfz(tag_s, tag_max_size, "%u", tag++); + + + switch(p_file_info->log_type){ + case FLB_TAIL: + case FLB_WEB_LOG: { + + char update_every_str[10]; + snprintfz(update_every_str, 10, "%d", p_file_info->update_every); + + debug_log("Setting up %s tail for %s (basename:%s)", + p_file_info->log_type == FLB_TAIL ? "FLB_TAIL" : "FLB_WEB_LOG", + p_file_info->filename, p_file_info->file_basename); + + Flb_tail_config_t *tail_config = (Flb_tail_config_t *) p_file_info->flb_config; + if(unlikely(!tail_config)) return CONFIG_READ_ERROR; + + /* Set up input from log source */ + p_file_info->flb_input = flb_input(ctx, "tail", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Path", p_file_info->filename, + "Key", LOG_REC_KEY, + "Refresh_Interval", update_every_str, + "Skip_Long_Lines", "On", + "Skip_Empty_Lines", "On", +#if defined(FLB_HAVE_INOTIFY) + "Inotify_Watcher", tail_config->use_inotify ? 
"true" : "false", +#endif + NULL) != 0) return FLB_INPUT_SET_ERROR; + + break; + } + case FLB_KMSG: { + debug_log( "Setting up FLB_KMSG collector"); + + Flb_kmsg_config_t *kmsg_config = (Flb_kmsg_config_t *) p_file_info->flb_config; + if(unlikely(!kmsg_config || + !kmsg_config->prio_level || + !*kmsg_config->prio_level)) return CONFIG_READ_ERROR; + + /* Set up kmsg input */ + p_file_info->flb_input = flb_input(ctx, "kmsg", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Prio_Level", kmsg_config->prio_level, + NULL) != 0) return FLB_INPUT_SET_ERROR; + + break; + } + case FLB_SYSTEMD: { + debug_log( "Setting up FLB_SYSTEMD collector"); + + /* Set up systemd input */ + p_file_info->flb_input = flb_input(ctx, "systemd", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(!strcmp(p_file_info->filename, SYSTEMD_DEFAULT_PATH)){ + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Read_From_Tail", "On", + "Strip_Underscores", "On", + NULL) != 0) return FLB_INPUT_SET_ERROR; + } else { + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Read_From_Tail", "On", + "Strip_Underscores", "On", + "Path", p_file_info->filename, + NULL) != 0) return FLB_INPUT_SET_ERROR; + } + + break; + } + case FLB_DOCKER_EV: { + debug_log( "Setting up FLB_DOCKER_EV collector"); + + /* Set up Docker Events parser */ + if(flb_parser_create( "docker_events_parser", /* parser name */ + "json", /* backend type */ + NULL, /* regex */ + FLB_TRUE, /* skip_empty */ + NULL, /* time format */ + NULL, /* time key */ + NULL, /* time offset */ + FLB_TRUE, /* time keep */ + FLB_FALSE, /* time strict */ + FLB_FALSE, /* no bare keys */ + NULL, /* parser types */ + 0, /* types len */ + NULL, /* decoders */ + ctx->config) == NULL) return FLB_PARSER_CREATE_ERROR; + + /* Set up Docker Events input */ + p_file_info->flb_input = flb_input(ctx, "docker_events", NULL); + 
if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Parser", "docker_events_parser", + "Unix_Path", p_file_info->filename, + NULL) != 0) return FLB_INPUT_SET_ERROR; + + break; + } + case FLB_SYSLOG: { + debug_log( "Setting up FLB_SYSLOG collector"); + + /* Set up syslog parser */ + const char syslog_parser_prfx[] = "syslog_parser_"; + size_t parser_name_size = sizeof(syslog_parser_prfx) + tag_max_size - 1; + char parser_name[parser_name_size]; + snprintfz(parser_name, parser_name_size, "%s%u", syslog_parser_prfx, tag); + + Syslog_parser_config_t *syslog_config = (Syslog_parser_config_t *) p_file_info->parser_config->gen_config; + if(unlikely(!syslog_config || + !syslog_config->socket_config || + !syslog_config->socket_config->mode || + !p_file_info->filename)) return CONFIG_READ_ERROR; + + if(flb_parser_create( parser_name, /* parser name */ + "regex", /* backend type */ + syslog_config->log_format, /* regex */ + FLB_TRUE, /* skip_empty */ + NULL, /* time format */ + NULL, /* time key */ + NULL, /* time offset */ + FLB_TRUE, /* time keep */ + FLB_TRUE, /* time strict */ + FLB_FALSE, /* no bare keys */ + NULL, /* parser types */ + 0, /* types len */ + NULL, /* decoders */ + ctx->config) == NULL) return FLB_PARSER_CREATE_ERROR; + + /* Set up syslog input */ + p_file_info->flb_input = flb_input(ctx, "syslog", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if( !strcmp(syslog_config->socket_config->mode, "unix_udp") || + !strcmp(syslog_config->socket_config->mode, "unix_tcp")){ + m_assert(syslog_config->socket_config->unix_perm, "unix_perm is not set"); + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Path", p_file_info->filename, + "Parser", parser_name, + "Mode", syslog_config->socket_config->mode, + "Unix_Perm", syslog_config->socket_config->unix_perm, + NULL) != 0) return FLB_INPUT_SET_ERROR; + } else if( !strcmp(syslog_config->socket_config->mode, "udp") || 
+ !strcmp(syslog_config->socket_config->mode, "tcp")){ + m_assert(syslog_config->socket_config->listen, "listen is not set"); + m_assert(syslog_config->socket_config->port, "port is not set"); + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Parser", parser_name, + "Mode", syslog_config->socket_config->mode, + "Listen", syslog_config->socket_config->listen, + "Port", syslog_config->socket_config->port, + NULL) != 0) return FLB_INPUT_SET_ERROR; + } else return FLB_INPUT_SET_ERROR; // should never reach this line + + break; + } + case FLB_SERIAL: { + debug_log( "Setting up FLB_SERIAL collector"); + + Flb_serial_config_t *serial_config = (Flb_serial_config_t *) p_file_info->flb_config; + if(unlikely(!serial_config || + !serial_config->bitrate || + !*serial_config->bitrate || + !serial_config->min_bytes || + !*serial_config->min_bytes || + !p_file_info->filename)) return CONFIG_READ_ERROR; + + /* Set up serial input */ + p_file_info->flb_input = flb_input(ctx, "serial", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "File", p_file_info->filename, + "Bitrate", serial_config->bitrate, + "Min_Bytes", serial_config->min_bytes, + "Separator", serial_config->separator, + "Format", serial_config->format, + NULL) != 0) return FLB_INPUT_SET_ERROR; + + break; + } + case FLB_MQTT: { + debug_log( "Setting up FLB_MQTT collector"); + + Flb_socket_config_t *socket_config = (Flb_socket_config_t *) p_file_info->flb_config; + if(unlikely(!socket_config || !socket_config->listen || !*socket_config->listen || + !socket_config->port || !*socket_config->port)) return CONFIG_READ_ERROR; + + /* Set up MQTT input */ + p_file_info->flb_input = flb_input(ctx, "mqtt", NULL); + if(p_file_info->flb_input < 0 ) return FLB_INPUT_ERROR; + if(flb_input_set(ctx, p_file_info->flb_input, + "Tag", tag_s, + "Listen", socket_config->listen, + "Port", socket_config->port, + NULL) != 0) return 
FLB_INPUT_SET_ERROR; + + break; + } + default: { + m_assert(0, "default: case in flb_add_input() error"); + return DEFAULT_ERROR; // Shouldn't reach here + } + } + + /* Set up user-configured outputs */ + for(Flb_output_config_t *output = p_file_info->flb_outputs; output; output = output->next){ + debug_log( "setting up user output [%s]", output->plugin); + + int out = flb_output(ctx, output->plugin, NULL); + if(out < 0) return FLB_OUTPUT_ERROR; + if(flb_output_set(ctx, out, + "Match", tag_s, + NULL) != 0) return FLB_OUTPUT_SET_ERROR; + for(struct flb_output_config_param *param = output->param; param; param = param->next){ + debug_log( "setting up param [%s][%s] of output [%s]", param->key, param->val, output->plugin); + if(flb_output_set(ctx, out, + param->key, param->val, + NULL) != 0) return FLB_OUTPUT_SET_ERROR; + } + } + + /* Set up "lib" output */ + struct flb_lib_out_cb *callback = mallocz(sizeof(struct flb_lib_out_cb)); + callback->cb = flb_collect_logs_cb; + callback->data = p_file_info; + if(((p_file_info->flb_lib_output = flb_output(ctx, "lib", callback)) < 0) || + (flb_output_set(ctx, p_file_info->flb_lib_output, "Match", tag_s, NULL) != 0)){ + freez(callback); + return FLB_OUTPUT_ERROR; + } + + return SUCCESS; +} + +/** + * @brief Add a Fluent-Bit Forward input. + * @details This creates a unix or network socket to accept logs using + * Fluent Bit's Forward protocol. For more information see: + * https://docs.fluentbit.io/manual/pipeline/inputs/forward + * @param[in] forward_in_config Configuration of the Forward input socket. + * @return 0 on success, -1 on error. 
+ */ +int flb_add_fwd_input(Flb_socket_config_t *forward_in_config){ + + if(forward_in_config == NULL){ + debug_log( "forward: forward_in_config is NULL"); + collector_info("forward_in_config is NULL"); + return 0; + } + + do{ + debug_log( "forward: Setting up flb_add_fwd_input()"); + + int input, output; + + if((input = flb_input(ctx, "forward", NULL)) < 0) break; + + if( forward_in_config->unix_path && *forward_in_config->unix_path && + forward_in_config->unix_perm && *forward_in_config->unix_perm){ + if(flb_input_set(ctx, input, + "Tag_Prefix", "fwd", + "Unix_Path", forward_in_config->unix_path, + "Unix_Perm", forward_in_config->unix_perm, + NULL) != 0) break; + } else if( forward_in_config->listen && *forward_in_config->listen && + forward_in_config->port && *forward_in_config->port){ + if(flb_input_set(ctx, input, + "Tag_Prefix", "fwd", + "Listen", forward_in_config->listen, + "Port", forward_in_config->port, + NULL) != 0) break; + } else break; // should never reach this line + + fwd_input_out_cb = mallocz(sizeof(struct flb_lib_out_cb)); + + /* Set up output */ + fwd_input_out_cb->cb = flb_collect_logs_cb; + fwd_input_out_cb->data = NULL; + if((output = flb_output(ctx, "lib", fwd_input_out_cb)) < 0) break; + if(flb_output_set(ctx, output, + "Match", "fwd*", + NULL) != 0) break; + + debug_log( "forward: Set up flb_add_fwd_input() with success"); + return 0; + } while(0); + + /* Error */ + if(fwd_input_out_cb) freez(fwd_input_out_cb); + return -1; +} + +void flb_free_fwd_input_out_cb(void){ + freez(fwd_input_out_cb); +}
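The FLB_SYSTEMD and FLB_SYSLOG branches of the collection callback above convert between a syslog priority value and its facility and severity parts, using `prival == facility * 8 + severity`, and write `<->` when the priority is unknown. A minimal sketch of that arithmetic follows. The helper names and the `SYSLOG_NUM_SEVERITIES` constant are hypothetical and are not part of flb_plugin.c, and a negative `prival` stands in here for the "unknown" case, which the real code tracks differently.

```c
#include <stdio.h>

/* Hypothetical constant: RFC 5424 defines 8 severity levels (0..7). */
enum { SYSLOG_NUM_SEVERITIES = 8 };

/* Compose a priority value from its parts (the FLB_SYSTEMD direction). */
static int syslog_prival_from(int facility, int severity){
    return facility * SYSLOG_NUM_SEVERITIES + severity;
}

/* Split a priority value into its parts (the FLB_SYSLOG direction). */
static void syslog_prival_split(int prival, int *facility, int *severity){
    *facility = prival / SYSLOG_NUM_SEVERITIES;
    *severity = prival % SYSLOG_NUM_SEVERITIES;
}

/* Write "<PRIVAL>" into out, or "<->" when prival is negative (assumed
 * "unknown" marker for this sketch). Returns the snprintf() result. */
static int syslog_format_pri(char *out, size_t out_size, int prival){
    if(prival < 0) return snprintf(out, out_size, "<->");
    return snprintf(out, out_size, "<%d>", prival);
}
```

For example, facility 4 (auth) with severity 6 (info) composes to 38, which formats as `<38>` and splits back into 4 and 6.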
\ No newline at end of file diff --git a/logsmanagement/flb_plugin.h b/logsmanagement/flb_plugin.h new file mode 100644 index 00000000..5c35315b --- /dev/null +++ b/logsmanagement/flb_plugin.h @@ -0,0 +1,35 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file flb_plugin.h + * @brief Header of flb_plugin.c + */ + +#ifndef FLB_PLUGIN_H_ +#define FLB_PLUGIN_H_ + +#include "file_info.h" +#include <uv.h> + +#define LOG_PATH_AUTO "auto" +#define KMSG_DEFAULT_PATH "/dev/kmsg" +#define SYSTEMD_DEFAULT_PATH "SD_JOURNAL_LOCAL_ONLY" +#define DOCKER_EV_DEFAULT_PATH "/var/run/docker.sock" + +typedef struct { + char *flush, + *http_listen, *http_port, *http_server, + *log_path, *log_level, + *coro_stack_size; +} flb_srvc_config_t ; + +int flb_init(flb_srvc_config_t flb_srvc_config, + const char *const stock_config_dir, + const char *const new_sd_journal_field_prefix); +int flb_run(void); +void flb_terminate(void); +void flb_complete_item_timer_timeout_cb(uv_timer_t *handle); +int flb_add_input(struct File_info *const p_file_info); +int flb_add_fwd_input(Flb_socket_config_t *const forward_in_config); +void flb_free_fwd_input_out_cb(void); + +#endif // FLB_PLUGIN_H_ diff --git a/logsmanagement/fluent_bit_build/CMakeLists.patch b/logsmanagement/fluent_bit_build/CMakeLists.patch new file mode 100644 index 00000000..e2b8cab1 --- /dev/null +++ b/logsmanagement/fluent_bit_build/CMakeLists.patch @@ -0,0 +1,19 @@ +diff --git a/CMakeLists.txt b/CMakeLists.txt +index ae853815b..8b81a052f 100644 +--- a/CMakeLists.txt ++++ b/CMakeLists.txt +@@ -70,12 +70,14 @@ set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -D__FLB_FILENAME__=__FILE__") + if(${CMAKE_SYSTEM_PROCESSOR} MATCHES "armv7l") + set(CMAKE_C_LINK_FLAGS "${CMAKE_C_LINK_FLAGS} -latomic") + set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} -latomic") ++ set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -latomic") + endif() + if(${CMAKE_SYSTEM_NAME} MATCHES "FreeBSD") + set(FLB_SYSTEM_FREEBSD On) + 
add_definitions(-DFLB_SYSTEM_FREEBSD) + set(CMAKE_C_LINK_FLAGS "${CMAKE_C_LINK_FLAGS} -lutil") + set(CMAKE_CXX_LINK_FLAGS "${CMAKE_CXX_LINK_FLAGS} -lutil") ++ set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -lutil") + endif() + + # *BSD is not supported platform for wasm-micro-runtime except for FreeBSD. diff --git a/logsmanagement/fluent_bit_build/chunkio-static-lib-fts.patch b/logsmanagement/fluent_bit_build/chunkio-static-lib-fts.patch new file mode 100644 index 00000000..f3c4dd83 --- /dev/null +++ b/logsmanagement/fluent_bit_build/chunkio-static-lib-fts.patch @@ -0,0 +1,10 @@ +--- a/lib/chunkio/src/CMakeLists.txt ++++ b/lib/chunkio/src/CMakeLists.txt +@@ -14,6 +14,7 @@ + ) + + set(libs cio-crc32) ++set(libs ${libs} fts) + + if(${CMAKE_SYSTEM_NAME} MATCHES "Windows") + set(src diff --git a/logsmanagement/fluent_bit_build/config.cmake b/logsmanagement/fluent_bit_build/config.cmake new file mode 100644 index 00000000..7bb2e86a --- /dev/null +++ b/logsmanagement/fluent_bit_build/config.cmake @@ -0,0 +1,178 @@ +set(FLB_ALL OFF CACHE BOOL "Enable all features") +set(FLB_DEBUG OFF CACHE BOOL "Build with debug mode (-g)") +set(FLB_RELEASE OFF CACHE BOOL "Build with release mode (-O2 -g -DNDEBUG)") +# set(FLB_IPO "ReleaseOnly" CACHE STRING "Build with interprocedural optimization") +# set_property(CACHE FLB_IPO PROPERTY STRINGS "On;Off;ReleaseOnly") +set(FLB_SMALL OFF CACHE BOOL "Optimise for small size") +set(FLB_COVERAGE OFF CACHE BOOL "Build with code-coverage") +set(FLB_JEMALLOC OFF CACHE BOOL "Build with Jemalloc support") +set(FLB_REGEX ON CACHE BOOL "Build with Regex support") +set(FLB_UTF8_ENCODER ON CACHE BOOL "Build with UTF8 encoding support") +set(FLB_PARSER ON CACHE BOOL "Build with Parser support") +set(FLB_TLS ON CACHE BOOL "Build with SSL/TLS support") +set(FLB_BINARY OFF CACHE BOOL "Build executable binary") +set(FLB_EXAMPLES OFF CACHE BOOL "Build examples") +set(FLB_SHARED_LIB ON CACHE BOOL "Build shared library") +set(FLB_VALGRIND OFF 
CACHE BOOL "Enable Valgrind support") +set(FLB_TRACE OFF CACHE BOOL "Enable trace mode") +set(FLB_CHUNK_TRACE OFF CACHE BOOL "Enable chunk traces") +set(FLB_TESTS_RUNTIME OFF CACHE BOOL "Enable runtime tests") +set(FLB_TESTS_INTERNAL OFF CACHE BOOL "Enable internal tests") +set(FLB_TESTS_INTERNAL_FUZZ OFF CACHE BOOL "Enable internal fuzz tests") +set(FLB_TESTS_OSSFUZZ OFF CACHE BOOL "Enable OSS-Fuzz build") +set(FLB_MTRACE OFF CACHE BOOL "Enable mtrace support") +set(FLB_POSIX_TLS OFF CACHE BOOL "Force POSIX thread storage") +set(FLB_INOTIFY ON CACHE BOOL "Enable inotify support") +set(FLB_SQLDB ON CACHE BOOL "Enable SQL embedded DB") +set(FLB_HTTP_SERVER ON CACHE BOOL "Enable HTTP Server") +set(FLB_BACKTRACE OFF CACHE BOOL "Enable stacktrace support") +set(FLB_LUAJIT OFF CACHE BOOL "Enable Lua Scripting support") +set(FLB_RECORD_ACCESSOR ON CACHE BOOL "Enable record accessor") +set(FLB_SIGNV4 ON CACHE BOOL "Enable AWS Signv4 support") +set(FLB_AWS ON CACHE BOOL "Enable AWS support") +# set(FLB_STATIC_CONF "Build binary using static configuration") +set(FLB_STREAM_PROCESSOR OFF CACHE BOOL "Enable Stream Processor") +set(FLB_CORO_STACK_SIZE 24576 CACHE STRING "Set coroutine stack size") +set(FLB_AVRO_ENCODER OFF CACHE BOOL "Build with Avro encoding support") +set(FLB_AWS_ERROR_REPORTER ON CACHE BOOL "Build with aws error reporting support") +set(FLB_ARROW OFF CACHE BOOL "Build with Apache Arrow support") +set(FLB_WINDOWS_DEFAULTS OFF CACHE BOOL "Build with predefined Windows settings") +set(FLB_WASM OFF CACHE BOOL "Build with WASM runtime support") +set(FLB_WAMRC OFF CACHE BOOL "Build with WASM AOT compiler executable") +set(FLB_WASM_STACK_PROTECT OFF CACHE BOOL "Build with WASM runtime with strong stack protector flags") + +# Native Metrics Support (cmetrics) +set(FLB_METRICS OFF CACHE BOOL "Enable metrics support") + +# Proxy Plugins +set(FLB_PROXY_GO OFF CACHE BOOL "Enable Go plugins support") + +# Built-in Custom Plugins +set(FLB_CUSTOM_CALYPTIA OFF CACHE BOOL 
"Enable Calyptia Support") + +# Config formats +set(FLB_CONFIG_YAML OFF CACHE BOOL "Enable YAML config format") + +# Built-in Plugins +set(FLB_IN_CPU OFF CACHE BOOL "Enable CPU input plugin") +set(FLB_IN_THERMAL OFF CACHE BOOL "Enable Thermal plugin") +set(FLB_IN_DISK OFF CACHE BOOL "Enable Disk input plugin") +set(FLB_IN_DOCKER OFF CACHE BOOL "Enable Docker input plugin") +set(FLB_IN_DOCKER_EVENTS ON CACHE BOOL "Enable Docker events input plugin") +set(FLB_IN_EXEC OFF CACHE BOOL "Enable Exec input plugin") +set(FLB_IN_EXEC_WASI OFF CACHE BOOL "Enable Exec WASI input plugin") +set(FLB_IN_EVENT_TEST OFF CACHE BOOL "Enable Events test plugin") +set(FLB_IN_EVENT_TYPE OFF CACHE BOOL "Enable event type plugin") +set(FLB_IN_FLUENTBIT_METRICS OFF CACHE BOOL "Enable Fluent Bit metrics plugin") +set(FLB_IN_FORWARD ON CACHE BOOL "Enable Forward input plugin") +set(FLB_IN_HEALTH OFF CACHE BOOL "Enable Health input plugin") +set(FLB_IN_HTTP OFF CACHE BOOL "Enable HTTP input plugin") +set(FLB_IN_MEM OFF CACHE BOOL "Enable Memory input plugin") +set(FLB_IN_KUBERNETES_EVENTS OFF CACHE BOOL "Enable Kubernetes Events plugin") +set(FLB_IN_KAFKA OFF CACHE BOOL "Enable Kafka input plugin") +set(FLB_IN_KMSG ON CACHE BOOL "Enable Kernel log input plugin") +set(FLB_IN_LIB ON CACHE BOOL "Enable library mode input plugin") +set(FLB_IN_RANDOM OFF CACHE BOOL "Enable random input plugin") +set(FLB_IN_SERIAL ON CACHE BOOL "Enable Serial input plugin") +set(FLB_IN_STDIN OFF CACHE BOOL "Enable Standard input plugin") +set(FLB_IN_SYSLOG ON CACHE BOOL "Enable Syslog input plugin") +set(FLB_IN_TAIL ON CACHE BOOL "Enable Tail input plugin") +set(FLB_IN_UDP OFF CACHE BOOL "Enable UDP input plugin") +set(FLB_IN_TCP OFF CACHE BOOL "Enable TCP input plugin") +set(FLB_IN_UNIX_SOCKET OFF CACHE BOOL "Enable Unix socket input plugin") +set(FLB_IN_MQTT ON CACHE BOOL "Enable MQTT Broker input plugin") +set(FLB_IN_HEAD OFF CACHE BOOL "Enable Head input plugin") +set(FLB_IN_PROC OFF CACHE BOOL "Enable Process 
input plugin") +set(FLB_IN_SYSTEMD ON CACHE BOOL "Enable Systemd input plugin") +set(FLB_IN_DUMMY OFF CACHE BOOL "Enable Dummy input plugin") +set(FLB_IN_NGINX_EXPORTER_METRICS OFF CACHE BOOL "Enable Nginx Metrics input plugin") +set(FLB_IN_NETIF OFF CACHE BOOL "Enable NetworkIF input plugin") +set(FLB_IN_WINLOG OFF CACHE BOOL "Enable Windows Log input plugin") +set(FLB_IN_WINSTAT OFF CACHE BOOL "Enable Windows Stat input plugin") +set(FLB_IN_WINEVTLOG OFF CACHE BOOL "Enable Windows EvtLog input plugin") +set(FLB_IN_COLLECTD OFF CACHE BOOL "Enable Collectd input plugin") +set(FLB_IN_PROMETHEUS_SCRAPE OFF CACHE BOOL "Enable Prometheus Scrape input plugin") +set(FLB_IN_STATSD OFF CACHE BOOL "Enable StatsD input plugin") +set(FLB_IN_EVENT_TEST OFF CACHE BOOL "Enable event test plugin") +set(FLB_IN_STORAGE_BACKLOG OFF CACHE BOOL "Enable storage backlog input plugin") +set(FLB_IN_EMITTER OFF CACHE BOOL "Enable emitter input plugin") +set(FLB_IN_NODE_EXPORTER_METRICS OFF CACHE BOOL "Enable node exporter metrics input plugin") +set(FLB_IN_WINDOWS_EXPORTER_METRICS OFF CACHE BOOL "Enable windows exporter metrics input plugin") +set(FLB_IN_PODMAN_METRICS OFF CACHE BOOL "Enable Podman Metrics input plugin") +set(FLB_IN_OPENTELEMETRY OFF CACHE BOOL "Enable OpenTelemetry input plugin") +set(FLB_IN_ELASTICSEARCH OFF CACHE BOOL "Enable Elasticsearch (Bulk API) input plugin") +set(FLB_IN_CALYPTIA_FLEET OFF CACHE BOOL "Enable Calyptia Fleet input plugin") +set(FLB_IN_SPLUNK OFF CACHE BOOL "Enable Splunk HTTP HEC input plugin") +set(FLB_OUT_AZURE ON CACHE BOOL "Enable Azure output plugin") +set(FLB_OUT_AZURE_BLOB ON CACHE BOOL "Enable Azure output plugin") +set(FLB_OUT_AZURE_LOGS_INGESTION ON CACHE BOOL "Enable Azure Logs Ingestion output plugin") +set(FLB_OUT_AZURE_KUSTO ON CACHE BOOL "Enable Azure Kusto output plugin") +set(FLB_OUT_BIGQUERY ON CACHE BOOL "Enable BigQuery output plugin") +set(FLB_OUT_CALYPTIA OFF CACHE BOOL "Enable Calyptia monitoring plugin") +set(FLB_OUT_COUNTER 
OFF CACHE BOOL "Enable Counter output plugin") +set(FLB_OUT_DATADOG ON CACHE BOOL "Enable DataDog output plugin") +set(FLB_OUT_ES ON CACHE BOOL "Enable Elasticsearch output plugin") +set(FLB_OUT_EXIT OFF CACHE BOOL "Enable Exit output plugin") +set(FLB_OUT_FORWARD ON CACHE BOOL "Enable Forward output plugin") +set(FLB_OUT_GELF ON CACHE BOOL "Enable GELF output plugin") +set(FLB_OUT_HTTP ON CACHE BOOL "Enable HTTP output plugin") +set(FLB_OUT_INFLUXDB ON CACHE BOOL "Enable InfluxDB output plugin") +set(FLB_OUT_NATS ON CACHE BOOL "Enable NATS output plugin") +set(FLB_OUT_NRLOGS ON CACHE BOOL "Enable New Relic output plugin") +set(FLB_OUT_OPENSEARCH ON CACHE BOOL "Enable OpenSearch output plugin") +set(FLB_OUT_TCP ON CACHE BOOL "Enable TCP output plugin") +set(FLB_OUT_UDP ON CACHE BOOL "Enable UDP output plugin") +set(FLB_OUT_PLOT ON CACHE BOOL "Enable Plot output plugin") +set(FLB_OUT_FILE ON CACHE BOOL "Enable file output plugin") +set(FLB_OUT_TD ON CACHE BOOL "Enable Treasure Data output plugin") +set(FLB_OUT_RETRY OFF CACHE BOOL "Enable Retry test output plugin") +set(FLB_OUT_PGSQL ON CACHE BOOL "Enable PostgreSQL output plugin") +set(FLB_OUT_SKYWALKING ON CACHE BOOL "Enable Apache SkyWalking output plugin") +set(FLB_OUT_SLACK ON CACHE BOOL "Enable Slack output plugin") +set(FLB_OUT_SPLUNK ON CACHE BOOL "Enable Splunk output plugin") +set(FLB_OUT_STACKDRIVER ON CACHE BOOL "Enable Stackdriver output plugin") +set(FLB_OUT_STDOUT OFF CACHE BOOL "Enable STDOUT output plugin") +set(FLB_OUT_SYSLOG ON CACHE BOOL "Enable Syslog output plugin") +set(FLB_OUT_LIB ON CACHE BOOL "Enable library mode output plugin") +set(FLB_OUT_NULL OFF CACHE BOOL "Enable dev null output plugin") +set(FLB_OUT_FLOWCOUNTER ON CACHE BOOL "Enable flowcount output plugin") +set(FLB_OUT_LOGDNA ON CACHE BOOL "Enable LogDNA output plugin") +set(FLB_OUT_LOKI ON CACHE BOOL "Enable Loki output plugin") +set(FLB_OUT_KAFKA ON CACHE BOOL "Enable Kafka output plugin") +set(FLB_OUT_KAFKA_REST ON CACHE BOOL 
"Enable Kafka Rest output plugin") +set(FLB_OUT_CLOUDWATCH_LOGS ON CACHE BOOL "Enable AWS CloudWatch output plugin") +set(FLB_OUT_KINESIS_FIREHOSE ON CACHE BOOL "Enable AWS Firehose output plugin") +set(FLB_OUT_KINESIS_STREAMS ON CACHE BOOL "Enable AWS Kinesis output plugin") +set(FLB_OUT_OPENTELEMETRY ON CACHE BOOL "Enable OpenTelemetry plugin") +set(FLB_OUT_PROMETHEUS_EXPORTER ON CACHE BOOL "Enable Prometheus exporter plugin") +set(FLB_OUT_PROMETHEUS_REMOTE_WRITE ON CACHE BOOL "Enable Prometheus remote write plugin") +set(FLB_OUT_S3 ON CACHE BOOL "Enable AWS S3 output plugin") +set(FLB_OUT_VIVO_EXPORTER ON CACHE BOOL "Enabel Vivo exporter output plugin") +set(FLB_OUT_WEBSOCKET ON CACHE BOOL "Enable Websocket output plugin") +set(FLB_OUT_CHRONICLE ON CACHE BOOL "Enable Google Chronicle output plugin") +set(FLB_FILTER_ALTER_SIZE OFF CACHE BOOL "Enable alter_size filter") +set(FLB_FILTER_AWS OFF CACHE BOOL "Enable aws filter") +set(FLB_FILTER_ECS OFF CACHE BOOL "Enable AWS ECS filter") +set(FLB_FILTER_CHECKLIST OFF CACHE BOOL "Enable checklist filter") +set(FLB_FILTER_EXPECT OFF CACHE BOOL "Enable expect filter") +set(FLB_FILTER_GREP OFF CACHE BOOL "Enable grep filter") +set(FLB_FILTER_MODIFY OFF CACHE BOOL "Enable modify filter") +set(FLB_FILTER_STDOUT OFF CACHE BOOL "Enable stdout filter") +set(FLB_FILTER_PARSER ON CACHE BOOL "Enable parser filter") +set(FLB_FILTER_KUBERNETES OFF CACHE BOOL "Enable kubernetes filter") +set(FLB_FILTER_REWRITE_TAG OFF CACHE BOOL "Enable tag rewrite filter") +set(FLB_FILTER_THROTTLE OFF CACHE BOOL "Enable throttle filter") +set(FLB_FILTER_THROTTLE_SIZE OFF CACHE BOOL "Enable throttle size filter") +set(FLB_FILTER_TYPE_CONVERTER OFF CACHE BOOL "Enable type converter filter") +set(FLB_FILTER_MULTILINE OFF CACHE BOOL "Enable multiline filter") +set(FLB_FILTER_NEST OFF CACHE BOOL "Enable nest filter") +set(FLB_FILTER_LOG_TO_METRICS OFF CACHE BOOL "Enable log-derived metrics filter") +set(FLB_FILTER_LUA OFF CACHE BOOL "Enable Lua 
scripting filter") +set(FLB_FILTER_LUA_USE_MPACK OFF CACHE BOOL "Enable mpack on the lua filter") +set(FLB_FILTER_RECORD_MODIFIER ON CACHE BOOL "Enable record_modifier filter") +set(FLB_FILTER_TENSORFLOW OFF CACHE BOOL "Enable tensorflow filter") +set(FLB_FILTER_GEOIP2 OFF CACHE BOOL "Enable geoip2 filter") +set(FLB_FILTER_NIGHTFALL OFF CACHE BOOL "Enable Nightfall filter") +set(FLB_FILTER_WASM OFF CACHE BOOL "Enable WASM filter") +set(FLB_PROCESSOR_LABELS OFF CACHE BOOL "Enable metrics label manipulation processor") +set(FLB_PROCESSOR_ATTRIBUTES OFF CACHE BOOL "Enable attributes manipulation processor") diff --git a/logsmanagement/fluent_bit_build/exclude-luajit.patch b/logsmanagement/fluent_bit_build/exclude-luajit.patch new file mode 100644 index 00000000..4055f59c --- /dev/null +++ b/logsmanagement/fluent_bit_build/exclude-luajit.patch @@ -0,0 +1,10 @@ +diff --git a/cmake/luajit.cmake b/cmake/luajit.cmake +index b6774eb..f8042ae 100644 +--- a/cmake/luajit.cmake ++++ b/cmake/luajit.cmake +@@ -1,4 +1,4 @@ + # luajit cmake + option(LUAJIT_DIR "Path of LuaJIT 2.1 source dir" ON) + set(LUAJIT_DIR ${FLB_PATH_ROOT_SOURCE}/${FLB_PATH_LIB_LUAJIT}) +-add_subdirectory("lib/luajit-cmake") ++add_subdirectory("lib/luajit-cmake" EXCLUDE_FROM_ALL) diff --git a/logsmanagement/fluent_bit_build/flb-log-fmt.patch b/logsmanagement/fluent_bit_build/flb-log-fmt.patch new file mode 100644 index 00000000..b3429c41 --- /dev/null +++ b/logsmanagement/fluent_bit_build/flb-log-fmt.patch @@ -0,0 +1,52 @@ +diff --git a/src/flb_log.c b/src/flb_log.c +index d004af8af..6ed27b8c6 100644 +--- a/src/flb_log.c ++++ b/src/flb_log.c +@@ -509,31 +509,31 @@ int flb_log_construct(struct log_message *msg, int *ret_len, + + switch (type) { + case FLB_LOG_HELP: +- header_title = "help"; ++ header_title = "HELP"; + header_color = ANSI_CYAN; + break; + case FLB_LOG_INFO: +- header_title = "info"; ++ header_title = "INFO"; + header_color = ANSI_GREEN; + break; + case FLB_LOG_WARN: +- header_title = "warn"; ++ 
header_title = "WARN"; + header_color = ANSI_YELLOW; + break; + case FLB_LOG_ERROR: +- header_title = "error"; ++ header_title = "ERROR"; + header_color = ANSI_RED; + break; + case FLB_LOG_DEBUG: +- header_title = "debug"; ++ header_title = "DEBUG"; + header_color = ANSI_YELLOW; + break; + case FLB_LOG_IDEBUG: +- header_title = "debug"; ++ header_title = "DEBUG"; + header_color = ANSI_CYAN; + break; + case FLB_LOG_TRACE: +- header_title = "trace"; ++ header_title = "TRACE"; + header_color = ANSI_BLUE; + break; + } +@@ -559,7 +559,7 @@ int flb_log_construct(struct log_message *msg, int *ret_len, + } + + len = snprintf(msg->msg, sizeof(msg->msg) - 1, +- "%s[%s%i/%02i/%02i %02i:%02i:%02i%s]%s [%s%5s%s] ", ++ "%s%s%i-%02i-%02i %02i:%02i:%02i%s:%s fluent-bit %s%s%s: ", + /* time */ /* type */ + + /* time variables */ diff --git a/logsmanagement/fluent_bit_build/xsi-strerror.patch b/logsmanagement/fluent_bit_build/xsi-strerror.patch new file mode 100644 index 00000000..527de209 --- /dev/null +++ b/logsmanagement/fluent_bit_build/xsi-strerror.patch @@ -0,0 +1,15 @@ +--- a/src/flb_network.c ++++ b/src/flb_network.c +@@ -523,9 +523,10 @@ + } + + /* Connection is broken, not much to do here */ +- str = strerror_r(error, so_error_buf, sizeof(so_error_buf)); ++ /* XXX: XSI */ ++ int _err = strerror_r(error, so_error_buf, sizeof(so_error_buf)); + flb_error("[net] TCP connection failed: %s:%i (%s)", +- u->tcp_host, u->tcp_port, str); ++ u->tcp_host, u->tcp_port, so_error_buf); + return -1; + } + } diff --git a/logsmanagement/functions.c b/logsmanagement/functions.c new file mode 100644 index 00000000..d53c3ed7 --- /dev/null +++ b/logsmanagement/functions.c @@ -0,0 +1,754 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file functions.c + * + * @brief This is the file containing the implementation of the + * logs management functions API. 
+ */ + +#include "functions.h" +#include "helper.h" +#include "query.h" + +#define LOGS_MANAG_MAX_PARAMS 100 +#define LOGS_MANAGEMENT_DEFAULT_QUERY_DURATION_IN_SEC 3600 +#define LOGS_MANAGEMENT_DEFAULT_ITEMS_PER_QUERY 200 + +#define LOGS_MANAG_FUNC_PARAM_HELP "help" +#define LOGS_MANAG_FUNC_PARAM_ANCHOR "anchor" +#define LOGS_MANAG_FUNC_PARAM_LAST "last" +#define LOGS_MANAG_FUNC_PARAM_QUERY "query" +#define LOGS_MANAG_FUNC_PARAM_FACETS "facets" +#define LOGS_MANAG_FUNC_PARAM_HISTOGRAM "histogram" +#define LOGS_MANAG_FUNC_PARAM_DIRECTION "direction" +#define LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE "if_modified_since" +#define LOGS_MANAG_FUNC_PARAM_DATA_ONLY "data_only" +#define LOGS_MANAG_FUNC_PARAM_SOURCE "source" +#define LOGS_MANAG_FUNC_PARAM_INFO "info" +#define LOGS_MANAG_FUNC_PARAM_ID "id" +#define LOGS_MANAG_FUNC_PARAM_PROGRESS "progress" +#define LOGS_MANAG_FUNC_PARAM_SLICE "slice" +#define LOGS_MANAG_FUNC_PARAM_DELTA "delta" +#define LOGS_MANAG_FUNC_PARAM_TAIL "tail" + +#define LOGS_MANAG_DEFAULT_DIRECTION FACETS_ANCHOR_DIRECTION_BACKWARD + +#define FACET_MAX_VALUE_LENGTH 8192 + +#define FUNCTION_LOGSMANAGEMENT_HELP_LONG \ + LOGS_MANAGEMENT_PLUGIN_STR " / " LOGS_MANAG_FUNC_NAME"\n" \ + "\n" \ + FUNCTION_LOGSMANAGEMENT_HELP_SHORT"\n" \ + "\n" \ + "The following parameters are supported:\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_HELP"\n" \ + " Shows this help message\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_INFO"\n" \ + " Request initial configuration information about the plugin.\n" \ + " The key entity returned is the required_params array, which includes\n" \ + " all the available "LOGS_MANAG_FUNC_NAME" sources.\n" \ + " When `"LOGS_MANAG_FUNC_PARAM_INFO"` is requested, all other parameters are ignored.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_DATA_ONLY":true or "LOGS_MANAG_FUNC_PARAM_DATA_ONLY":false\n" \ + " Quickly respond with data requested, without generating a\n" \ + " `histogram`, `facets` counters and `items`.\n" \ + "\n" \ + " 
"LOGS_MANAG_FUNC_PARAM_SOURCE":SOURCE\n" \ + " Query only the specified "LOGS_MANAG_FUNC_NAME" sources.\n" \ + " Do an `"LOGS_MANAG_FUNC_PARAM_INFO"` query to find the sources.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_BEFORE":TIMESTAMP_IN_SECONDS\n" \ + " Absolute or relative (to now) timestamp in seconds, to start the query.\n" \ + " The query is always executed from the most recent to the oldest log entry.\n" \ + " If not given the default is: now.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_AFTER":TIMESTAMP_IN_SECONDS\n" \ + " Absolute or relative (to `before`) timestamp in seconds, to end the query.\n" \ + " If not given, the default is "LOGS_MANAG_STR(-LOGS_MANAGEMENT_DEFAULT_QUERY_DURATION_IN_SEC)".\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_LAST":ITEMS\n" \ + " The number of items to return.\n" \ + " The default is "LOGS_MANAG_STR(LOGS_MANAGEMENT_DEFAULT_ITEMS_PER_QUERY)".\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_ANCHOR":TIMESTAMP_IN_MICROSECONDS\n" \ + " Return items relative to this timestamp.\n" \ + " The exact items to be returned depend on the query `"LOGS_MANAG_FUNC_PARAM_DIRECTION"`.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_DIRECTION":forward or "LOGS_MANAG_FUNC_PARAM_DIRECTION":backward\n" \ + " When set to `backward` (default) the items returned are the newest before the\n" \ + " `"LOGS_MANAG_FUNC_PARAM_ANCHOR"`, (or `"LOGS_MANAG_FUNC_PARAM_BEFORE"` if `"LOGS_MANAG_FUNC_PARAM_ANCHOR"` is not set)\n" \ + " When set to `forward` the items returned are the oldest after the\n" \ + " `"LOGS_MANAG_FUNC_PARAM_ANCHOR"`, (or `"LOGS_MANAG_FUNC_PARAM_AFTER"` if `"LOGS_MANAG_FUNC_PARAM_ANCHOR"` is not set)\n" \ + " The default is: backward\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_QUERY":SIMPLE_PATTERN\n" \ + " Do a full text search to find the log entries matching the pattern given.\n" \ + " The plugin is searching for matches on all fields of the database.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE":TIMESTAMP_IN_MICROSECONDS\n" \ + " Each successful 
response, includes a `last_modified` field.\n" \ + " By providing the timestamp to the `"LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE"` parameter,\n" \ + " the plugin will return 200 with a successful response, or 304 if the source has not\n" \ + " been modified since that timestamp.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_HISTOGRAM":facet_id\n" \ + " Use the given `facet_id` for the histogram.\n" \ + " This parameter is ignored in `"LOGS_MANAG_FUNC_PARAM_DATA_ONLY"` mode.\n" \ + "\n" \ + " "LOGS_MANAG_FUNC_PARAM_FACETS":facet_id1,facet_id2,facet_id3,...\n" \ + " Add the given facets to the list of fields for which analysis is required.\n" \ + " The plugin will offer both a histogram and facet value counters for its values.\n" \ + " This parameter is ignored in `"LOGS_MANAG_FUNC_PARAM_DATA_ONLY"` mode.\n" \ + "\n" \ + " facet_id:value_id1,value_id2,value_id3,...\n" \ + " Apply filters to the query, based on the facet IDs returned.\n" \ + " Each `facet_id` can be given once, but multiple `facet_ids` can be given.\n" \ + "\n" + + +extern netdata_mutex_t stdout_mut; + +static DICTIONARY *function_query_status_dict = NULL; + +static DICTIONARY *used_hashes_registry = NULL; + +typedef struct function_query_status { + bool *cancelled; // a pointer to the cancelling boolean + usec_t stop_monotonic_ut; + + usec_t started_monotonic_ut; + + // request + STRING *source; + usec_t after_ut; + usec_t before_ut; + + struct { + usec_t start_ut; + usec_t stop_ut; + } anchor; + + FACETS_ANCHOR_DIRECTION direction; + size_t entries; + usec_t if_modified_since; + bool delta; + bool tail; + bool data_only; + bool slice; + size_t filters; + usec_t last_modified; + const char *query; + const char *histogram; + + // per file progress info + size_t cached_count; + + // progress statistics + usec_t matches_setup_ut; + size_t rows_useful; + size_t rows_read; + size_t bytes_read; + size_t files_matched; + size_t file_working; +} FUNCTION_QUERY_STATUS; + + +#define 
LOGS_MANAG_KEYS_INCLUDED_IN_FACETS \ + "log_source" \ + "|log_type" \ + "|filename" \ + "|basename" \ + "|chartname" \ + "|message" \ + "" + +static void logsmanagement_function_facets(const char *transaction, char *function, int timeout, bool *cancelled){ + + struct rusage start, end; + getrusage(RUSAGE_THREAD, &start); + + const logs_qry_res_err_t *ret = &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_SERVER_ERR]; + + BUFFER *wb = buffer_create(0, NULL); + buffer_flush(wb); + buffer_json_initialize(wb, "\"", "\"", 0, true, BUFFER_JSON_OPTIONS_MINIFY); + + usec_t now_monotonic_ut = now_monotonic_usec(); + FUNCTION_QUERY_STATUS tmp_fqs = { + .cancelled = cancelled, + .started_monotonic_ut = now_monotonic_ut, + .stop_monotonic_ut = now_monotonic_ut + (timeout * USEC_PER_SEC), + }; + FUNCTION_QUERY_STATUS *fqs = NULL; + const DICTIONARY_ITEM *fqs_item = NULL; + + FACETS *facets = facets_create(50, FACETS_OPTION_ALL_KEYS_FTS, + NULL, + LOGS_MANAG_KEYS_INCLUDED_IN_FACETS, + NULL); + + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_INFO); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_SOURCE); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_AFTER); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_BEFORE); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_ANCHOR); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_DIRECTION); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_LAST); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_QUERY); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_FACETS); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_HISTOGRAM); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_DATA_ONLY); + // facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_ID); + // facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_PROGRESS); + facets_accepted_param(facets, LOGS_MANAG_FUNC_PARAM_DELTA); + // facets_accepted_param(facets, 
JOURNAL_PARAMETER_TAIL); + +// #ifdef HAVE_SD_JOURNAL_RESTART_FIELDS +// facets_accepted_param(facets, JOURNAL_PARAMETER_SLICE); +// #endif // HAVE_SD_JOURNAL_RESTART_FIELDS + + // register the fields in the order you want them on the dashboard + + facets_register_key_name(facets, "log_source", FACET_KEY_OPTION_FACET | + FACET_KEY_OPTION_FTS); + + facets_register_key_name(facets, "log_type", FACET_KEY_OPTION_FACET | + FACET_KEY_OPTION_FTS); + + facets_register_key_name(facets, "filename", FACET_KEY_OPTION_FACET | + FACET_KEY_OPTION_FTS); + + facets_register_key_name(facets, "basename", FACET_KEY_OPTION_FACET | + FACET_KEY_OPTION_FTS); + + facets_register_key_name(facets, "chartname", FACET_KEY_OPTION_VISIBLE | + FACET_KEY_OPTION_FACET | + FACET_KEY_OPTION_FTS); + + facets_register_key_name(facets, "message", FACET_KEY_OPTION_NEVER_FACET | + FACET_KEY_OPTION_MAIN_TEXT | + FACET_KEY_OPTION_VISIBLE | + FACET_KEY_OPTION_FTS); + + bool info = false, + data_only = false, + progress = false, + /* slice = true, */ + delta = false, + tail = false; + time_t after_s = 0, before_s = 0; + usec_t anchor = 0; + usec_t if_modified_since = 0; + size_t last = 0; + FACETS_ANCHOR_DIRECTION direction = LOGS_MANAG_DEFAULT_DIRECTION; + const char *query = NULL; + const char *chart = NULL; + const char *source = NULL; + const char *progress_id = NULL; + // size_t filters = 0; + + buffer_json_member_add_object(wb, "_request"); + + logs_query_params_t query_params = {0}; + unsigned long req_quota = 0; + + // unsigned int fn_off = 0, cn_off = 0; + + char *words[LOGS_MANAG_MAX_PARAMS] = { NULL }; + size_t num_words = quoted_strings_splitter_pluginsd(function, words, LOGS_MANAG_MAX_PARAMS); + for(int i = 1; i < LOGS_MANAG_MAX_PARAMS ; i++) { + char *keyword = get_word(words, num_words, i); + if(!keyword) break; + + + if(!strcmp(keyword, LOGS_MANAG_FUNC_PARAM_HELP)){ + BUFFER *wb = buffer_create(0, NULL); + buffer_sprintf(wb, FUNCTION_LOGSMANAGEMENT_HELP_LONG); + 
netdata_mutex_lock(&stdout_mut); + pluginsd_function_result_to_stdout(transaction, HTTP_RESP_OK, "text/plain", now_realtime_sec() + 3600, wb); + netdata_mutex_unlock(&stdout_mut); + buffer_free(wb); + goto cleanup; + } + else if(!strcmp(keyword, LOGS_MANAG_FUNC_PARAM_INFO)){ + info = true; + } + else if(!strcmp(keyword, LOGS_MANAG_FUNC_PARAM_PROGRESS)){ + progress = true; + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_DELTA ":", sizeof(LOGS_MANAG_FUNC_PARAM_DELTA ":") - 1) == 0) { + char *v = &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_DELTA ":") - 1]; + + if(strcmp(v, "false") == 0 || strcmp(v, "no") == 0 || strcmp(v, "0") == 0) + delta = false; + else + delta = true; + } + // else if(strncmp(keyword, JOURNAL_PARAMETER_TAIL ":", sizeof(JOURNAL_PARAMETER_TAIL ":") - 1) == 0) { + // char *v = &keyword[sizeof(JOURNAL_PARAMETER_TAIL ":") - 1]; + + // if(strcmp(v, "false") == 0 || strcmp(v, "no") == 0 || strcmp(v, "0") == 0) + // tail = false; + // else + // tail = true; + // } + else if(!strncmp( keyword, + LOGS_MANAG_FUNC_PARAM_DATA_ONLY ":", + sizeof(LOGS_MANAG_FUNC_PARAM_DATA_ONLY ":") - 1)) { + + char *v = &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_DATA_ONLY ":") - 1]; + + if(!strcmp(v, "false") || !strcmp(v, "no") || !strcmp(v, "0")) + data_only = false; + else + data_only = true; + } + // else if(strncmp(keyword, JOURNAL_PARAMETER_SLICE ":", sizeof(JOURNAL_PARAMETER_SLICE ":") - 1) == 0) { + // char *v = &keyword[sizeof(JOURNAL_PARAMETER_SLICE ":") - 1]; + + // if(strcmp(v, "false") == 0 || strcmp(v, "no") == 0 || strcmp(v, "0") == 0) + // slice = false; + // else + // slice = true; + // } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_ID ":", sizeof(LOGS_MANAG_FUNC_PARAM_ID ":") - 1) == 0) { + char *id = &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_ID ":") - 1]; + + if(*id) + progress_id = id; + } + + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_SOURCE ":", sizeof(LOGS_MANAG_FUNC_PARAM_SOURCE ":") - 1) == 0) { + source = !strcmp("all", 
&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_SOURCE ":") - 1]) ? + NULL : &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_SOURCE ":") - 1]; + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_AFTER ":", sizeof(LOGS_MANAG_FUNC_PARAM_AFTER ":") - 1) == 0) { + after_s = str2l(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_AFTER ":") - 1]); + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_BEFORE ":", sizeof(LOGS_MANAG_FUNC_PARAM_BEFORE ":") - 1) == 0) { + before_s = str2l(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_BEFORE ":") - 1]); + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE ":", sizeof(LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE ":") - 1) == 0) { + if_modified_since = str2ull(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE ":") - 1], NULL); + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_ANCHOR ":", sizeof(LOGS_MANAG_FUNC_PARAM_ANCHOR ":") - 1) == 0) { + anchor = str2ull(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_ANCHOR ":") - 1], NULL); + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_DIRECTION ":", sizeof(LOGS_MANAG_FUNC_PARAM_DIRECTION ":") - 1) == 0) { + direction = !strcasecmp(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_DIRECTION ":") - 1], "forward") ? 
+ FACETS_ANCHOR_DIRECTION_FORWARD : FACETS_ANCHOR_DIRECTION_BACKWARD; + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_LAST ":", sizeof(LOGS_MANAG_FUNC_PARAM_LAST ":") - 1) == 0) { + last = str2ul(&keyword[sizeof(LOGS_MANAG_FUNC_PARAM_LAST ":") - 1]); + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_QUERY ":", sizeof(LOGS_MANAG_FUNC_PARAM_QUERY ":") - 1) == 0) { + query= &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_QUERY ":") - 1]; + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_HISTOGRAM ":", sizeof(LOGS_MANAG_FUNC_PARAM_HISTOGRAM ":") - 1) == 0) { + chart = &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_HISTOGRAM ":") - 1]; + } + else if(strncmp(keyword, LOGS_MANAG_FUNC_PARAM_FACETS ":", sizeof(LOGS_MANAG_FUNC_PARAM_FACETS ":") - 1) == 0) { + char *value = &keyword[sizeof(LOGS_MANAG_FUNC_PARAM_FACETS ":") - 1]; + if(*value) { + buffer_json_member_add_array(wb, LOGS_MANAG_FUNC_PARAM_FACETS); + + while(value) { + char *sep = strchr(value, ','); + if(sep) + *sep++ = '\0'; + + facets_register_facet_id(facets, value, FACET_KEY_OPTION_FACET|FACET_KEY_OPTION_FTS|FACET_KEY_OPTION_REORDER); + buffer_json_add_array_item_string(wb, value); + + value = sep; + } + + buffer_json_array_close(wb); // LOGS_MANAG_FUNC_PARAM_FACETS + } + } + else { + char *value = strchr(keyword, ':'); + if(value) { + *value++ = '\0'; + + buffer_json_member_add_array(wb, keyword); + + while(value) { + char *sep = strchr(value, ','); + if(sep) + *sep++ = '\0'; + + facets_register_facet_id_filter(facets, keyword, value, FACET_KEY_OPTION_FACET|FACET_KEY_OPTION_FTS|FACET_KEY_OPTION_REORDER); + buffer_json_add_array_item_string(wb, value); + // filters++; + + value = sep; + } + + buffer_json_array_close(wb); // keyword + } + } + } + + // ------------------------------------------------------------------------ + // put this request into the progress db + + if(progress_id && *progress_id) { + fqs_item = dictionary_set_and_acquire_item(function_query_status_dict, progress_id, &tmp_fqs, sizeof(tmp_fqs)); + fqs 
= dictionary_acquired_item_value(fqs_item); + } + else { + // no progress id given, proceed without registering our progress in the dictionary + fqs = &tmp_fqs; + fqs_item = NULL; + } + + // ------------------------------------------------------------------------ + // validate parameters + + time_t now_s = now_realtime_sec(); + time_t expires = now_s + 1; + + if(!after_s && !before_s) { + before_s = now_s; + after_s = before_s - LOGS_MANAGEMENT_DEFAULT_QUERY_DURATION_IN_SEC; + } + else + rrdr_relative_window_to_absolute(&after_s, &before_s, now_s); + + if(after_s > before_s) { + time_t tmp = after_s; + after_s = before_s; + before_s = tmp; + } + + if(after_s == before_s) + after_s = before_s - LOGS_MANAGEMENT_DEFAULT_QUERY_DURATION_IN_SEC; + + if(!last) + last = LOGS_MANAGEMENT_DEFAULT_ITEMS_PER_QUERY; + + + // ------------------------------------------------------------------------ + // set query time-frame, anchors and direction + + fqs->after_ut = after_s * USEC_PER_SEC; + fqs->before_ut = (before_s * USEC_PER_SEC) + USEC_PER_SEC - 1; + fqs->if_modified_since = if_modified_since; + fqs->data_only = data_only; + fqs->delta = (fqs->data_only) ? delta : false; + fqs->tail = (fqs->data_only && fqs->if_modified_since) ? tail : false; + fqs->source = string_strdupz(source); + fqs->entries = last; + fqs->last_modified = 0; + // fqs->filters = filters; + fqs->query = (query && *query) ? query : NULL; + fqs->histogram = (chart && *chart) ? 
chart : NULL; + fqs->direction = direction; + fqs->anchor.start_ut = anchor; + fqs->anchor.stop_ut = 0; + + if(fqs->anchor.start_ut && fqs->tail) { + // a tail request + // we need the top X entries from BEFORE + // but, we need to calculate the facets and the + // histogram up to the anchor + fqs->direction = direction = FACETS_ANCHOR_DIRECTION_BACKWARD; + fqs->anchor.start_ut = 0; + fqs->anchor.stop_ut = anchor; + } + + if(anchor && anchor < fqs->after_ut) { + // log_fqs(fqs, "received anchor is too small for query timeframe, ignoring anchor"); + anchor = 0; + fqs->anchor.start_ut = 0; + fqs->anchor.stop_ut = 0; + fqs->direction = direction = FACETS_ANCHOR_DIRECTION_BACKWARD; + } + else if(anchor > fqs->before_ut) { + // log_fqs(fqs, "received anchor is too big for query timeframe, ignoring anchor"); + anchor = 0; + fqs->anchor.start_ut = 0; + fqs->anchor.stop_ut = 0; + fqs->direction = direction = FACETS_ANCHOR_DIRECTION_BACKWARD; + } + + facets_set_anchor(facets, fqs->anchor.start_ut, fqs->anchor.stop_ut, fqs->direction); + + facets_set_additional_options(facets, + ((fqs->data_only) ? FACETS_OPTION_DATA_ONLY : 0) | + ((fqs->delta) ? FACETS_OPTION_SHOW_DELTAS : 0)); + + // ------------------------------------------------------------------------ + // set the rest of the query parameters + + facets_set_items(facets, fqs->entries); + facets_set_query(facets, fqs->query); + +// #ifdef HAVE_SD_JOURNAL_RESTART_FIELDS +// fqs->slice = slice; +// if(slice) +// facets_enable_slice_mode(facets); +// #else +// fqs->slice = false; +// #endif + + if(fqs->histogram) + facets_set_timeframe_and_histogram_by_id(facets, fqs->histogram, fqs->after_ut, fqs->before_ut); + else + facets_set_timeframe_and_histogram_by_name(facets, chart ? 
chart : "chartname", fqs->after_ut, fqs->before_ut); + + + // ------------------------------------------------------------------------ + // complete the request object + + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_INFO, false); + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_SLICE, fqs->slice); + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_DATA_ONLY, fqs->data_only); + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_PROGRESS, false); + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_DELTA, fqs->delta); + buffer_json_member_add_boolean(wb, LOGS_MANAG_FUNC_PARAM_TAIL, fqs->tail); + buffer_json_member_add_string(wb, LOGS_MANAG_FUNC_PARAM_ID, progress_id); + buffer_json_member_add_string(wb, LOGS_MANAG_FUNC_PARAM_SOURCE, string2str(fqs->source)); + buffer_json_member_add_uint64(wb, LOGS_MANAG_FUNC_PARAM_AFTER, fqs->after_ut / USEC_PER_SEC); + buffer_json_member_add_uint64(wb, LOGS_MANAG_FUNC_PARAM_BEFORE, fqs->before_ut / USEC_PER_SEC); + buffer_json_member_add_uint64(wb, LOGS_MANAG_FUNC_PARAM_IF_MODIFIED_SINCE, fqs->if_modified_since); + buffer_json_member_add_uint64(wb, LOGS_MANAG_FUNC_PARAM_ANCHOR, anchor); + buffer_json_member_add_string(wb, LOGS_MANAG_FUNC_PARAM_DIRECTION, + fqs->direction == FACETS_ANCHOR_DIRECTION_FORWARD ? 
"forward" : "backward"); + buffer_json_member_add_uint64(wb, LOGS_MANAG_FUNC_PARAM_LAST, fqs->entries); + buffer_json_member_add_string(wb, LOGS_MANAG_FUNC_PARAM_QUERY, fqs->query); + buffer_json_member_add_string(wb, LOGS_MANAG_FUNC_PARAM_HISTOGRAM, fqs->histogram); + buffer_json_object_close(wb); // request + + // buffer_json_journal_versions(wb); + + // ------------------------------------------------------------------------ + // run the request + + if(info) { + facets_accepted_parameters_to_json_array(facets, wb, false); + buffer_json_member_add_array(wb, "required_params"); + { + buffer_json_add_array_item_object(wb); + { + buffer_json_member_add_string(wb, "id", "source"); + buffer_json_member_add_string(wb, "name", "source"); + buffer_json_member_add_string(wb, "help", "Select the Logs Management source to query"); + buffer_json_member_add_string(wb, "type", "select"); + buffer_json_member_add_array(wb, "options"); + ret = fetch_log_sources(wb); + buffer_json_array_close(wb); // options array + } + buffer_json_object_close(wb); // required params object + } + buffer_json_array_close(wb); // required_params array + + facets_table_config(wb); + + buffer_json_member_add_uint64(wb, "status", HTTP_RESP_OK); + buffer_json_member_add_string(wb, "type", "table"); + buffer_json_member_add_string(wb, "help", FUNCTION_LOGSMANAGEMENT_HELP_SHORT); + buffer_json_finalize(wb); + goto output; + } + + if(progress) { + // TODO: Add progress function + // function_logsmanagement_progress(wb, transaction, progress_id); + goto cleanup; + } + + if(!req_quota) + query_params.quota = LOGS_MANAG_QUERY_QUOTA_DEFAULT; + else if(req_quota > LOGS_MANAG_QUERY_QUOTA_MAX) + query_params.quota = LOGS_MANAG_QUERY_QUOTA_MAX; + else query_params.quota = req_quota; + + + if(fqs->source) + query_params.chartname[0] = (char *) string2str(fqs->source); + + query_params.order_by_asc = 0; + + + // NOTE: Always perform descending timestamp query, req_from_ts >= req_to_ts. 
+ if(fqs->direction == FACETS_ANCHOR_DIRECTION_BACKWARD){ + query_params.req_from_ts = + (fqs->data_only && fqs->anchor.start_ut) ? fqs->anchor.start_ut / USEC_PER_MS : before_s * MSEC_PER_SEC; + query_params.req_to_ts = + (fqs->data_only && fqs->anchor.stop_ut) ? fqs->anchor.stop_ut / USEC_PER_MS : after_s * MSEC_PER_SEC; + } + else{ + query_params.req_from_ts = + (fqs->data_only && fqs->anchor.stop_ut) ? fqs->anchor.stop_ut / USEC_PER_MS : before_s * MSEC_PER_SEC; + query_params.req_to_ts = + (fqs->data_only && fqs->anchor.start_ut) ? fqs->anchor.start_ut / USEC_PER_MS : after_s * MSEC_PER_SEC; + } + + query_params.cancelled = cancelled; + query_params.stop_monotonic_ut = now_monotonic_usec() + (timeout - 1) * USEC_PER_SEC; + query_params.results_buff = buffer_create(query_params.quota, NULL); + + facets_rows_begin(facets); + + do{ + if(query_params.act_to_ts) + query_params.req_from_ts = query_params.act_to_ts - 1000; + + ret = execute_logs_manag_query(&query_params); + + + size_t res_off = 0; + logs_query_res_hdr_t *p_res_hdr; + while(query_params.results_buff->len - res_off > 0){ + p_res_hdr = (logs_query_res_hdr_t *) &query_params.results_buff->buffer[res_off]; + + ssize_t remaining = p_res_hdr->text_size; + char *ls = &query_params.results_buff->buffer[res_off] + sizeof(*p_res_hdr) + p_res_hdr->text_size - 1; + *ls = '\0'; + int timestamp_off = p_res_hdr->matches; + do{ + do{ + --remaining; + --ls; + } while(remaining > 0 && *ls != '\n'); + *ls = '\0'; + --remaining; + --ls; + + usec_t timestamp = p_res_hdr->timestamp * USEC_PER_MS + --timestamp_off; + + if(unlikely(!fqs->last_modified)) { + if(timestamp == if_modified_since){ + ret = &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_UNMODIFIED]; + goto output; + } + else + fqs->last_modified = timestamp; + } + + facets_add_key_value(facets, "log_source", p_res_hdr->log_source[0] ? p_res_hdr->log_source : "-"); + + facets_add_key_value(facets, "log_type", p_res_hdr->log_type[0] ? 
p_res_hdr->log_type : "-"); + + facets_add_key_value(facets, "filename", p_res_hdr->filename[0] ? p_res_hdr->filename : "-"); + + facets_add_key_value(facets, "basename", p_res_hdr->basename[0] ? p_res_hdr->basename : "-"); + + facets_add_key_value(facets, "chartname", p_res_hdr->chartname[0] ? p_res_hdr->chartname : "-"); + + size_t ls_len = strlen(ls + 2); + facets_add_key_value_length(facets, "message", sizeof("message") - 1, + ls + 2, ls_len <= FACET_MAX_VALUE_LENGTH ? ls_len : FACET_MAX_VALUE_LENGTH); + + facets_row_finished(facets, timestamp); + + } while(remaining > 0); + + res_off += sizeof(*p_res_hdr) + p_res_hdr->text_size; + + } + + buffer_flush(query_params.results_buff); + + } while(query_params.act_to_ts > query_params.req_to_ts); + + m_assert(query_params.req_from_ts == query_params.act_from_ts, "query_params.req_from_ts != query_params.act_from_ts"); + m_assert(query_params.req_to_ts == query_params.act_to_ts , "query_params.req_to_ts != query_params.act_to_ts"); + + + getrusage(RUSAGE_THREAD, &end); + time_t user_time = end.ru_utime.tv_sec * USEC_PER_SEC + end.ru_utime.tv_usec - + start.ru_utime.tv_sec * USEC_PER_SEC - start.ru_utime.tv_usec; + time_t sys_time = end.ru_stime.tv_sec * USEC_PER_SEC + end.ru_stime.tv_usec - + start.ru_stime.tv_sec * USEC_PER_SEC - start.ru_stime.tv_usec; + + buffer_json_member_add_object(wb, "logs_management_meta"); + buffer_json_member_add_string(wb, "api_version", LOGS_QRY_VERSION); + buffer_json_member_add_uint64(wb, "num_lines", query_params.num_lines); + buffer_json_member_add_uint64(wb, "user_time", user_time); + buffer_json_member_add_uint64(wb, "system_time", sys_time); + buffer_json_member_add_uint64(wb, "total_time", user_time + sys_time); + buffer_json_member_add_uint64(wb, "error_code", (uint64_t) ret->err_code); + buffer_json_member_add_string(wb, "error_string", ret->err_str); + buffer_json_object_close(wb); // logs_management_meta + + buffer_json_member_add_uint64(wb, "status", ret->http_code); + 
buffer_json_member_add_boolean(wb, "partial", ret->http_code != HTTP_RESP_OK || + ret->err_code == LOGS_QRY_RES_ERR_CODE_TIMEOUT); + buffer_json_member_add_string(wb, "type", "table"); + + + if(!fqs->data_only) { + buffer_json_member_add_time_t(wb, "update_every", 1); + buffer_json_member_add_string(wb, "help", FUNCTION_LOGSMANAGEMENT_HELP_SHORT); + } + + if(!fqs->data_only || fqs->tail) + buffer_json_member_add_uint64(wb, "last_modified", fqs->last_modified); + + facets_sort_and_reorder_keys(facets); + facets_report(facets, wb, used_hashes_registry); + + buffer_json_member_add_time_t(wb, "expires", now_realtime_sec() + (fqs->data_only ? 3600 : 0)); + buffer_json_finalize(wb); // logs_management_meta + + + // ------------------------------------------------------------------------ + // cleanup query params + + string_freez(fqs->source); + fqs->source = NULL; + + // ------------------------------------------------------------------------ + // handle error response + +output: + netdata_mutex_lock(&stdout_mut); + if(ret->http_code != HTTP_RESP_OK) + pluginsd_function_json_error_to_stdout(transaction, ret->http_code, ret->err_str); + else + pluginsd_function_result_to_stdout(transaction, ret->http_code, "application/json", expires, wb); + netdata_mutex_unlock(&stdout_mut); + +cleanup: + facets_destroy(facets); + buffer_free(query_params.results_buff); + buffer_free(wb); + + if(fqs_item) { + dictionary_del(function_query_status_dict, dictionary_acquired_item_name(fqs_item)); + dictionary_acquired_item_release(function_query_status_dict, fqs_item); + dictionary_garbage_collect(function_query_status_dict); + } +} + +struct functions_evloop_globals *logsmanagement_func_facets_init(bool *p_logsmanagement_should_exit){ + + function_query_status_dict = dictionary_create_advanced( + DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, + NULL, sizeof(FUNCTION_QUERY_STATUS)); + + used_hashes_registry = dictionary_create(DICT_OPTION_DONT_OVERWRITE_VALUE); + + 
netdata_mutex_lock(&stdout_mut); + fprintf(stdout, PLUGINSD_KEYWORD_FUNCTION " GLOBAL \"%s\" %d \"%s\"\n", + LOGS_MANAG_FUNC_NAME, + LOGS_MANAG_QUERY_TIMEOUT_DEFAULT, + FUNCTION_LOGSMANAGEMENT_HELP_SHORT); + netdata_mutex_unlock(&stdout_mut); + + struct functions_evloop_globals *wg = functions_evloop_init(1, "LGSMNGM", + &stdout_mut, + p_logsmanagement_should_exit); + + functions_evloop_add_function( wg, LOGS_MANAG_FUNC_NAME, + logsmanagement_function_facets, + LOGS_MANAG_QUERY_TIMEOUT_DEFAULT); + + return wg; +} diff --git a/logsmanagement/functions.h b/logsmanagement/functions.h new file mode 100644 index 00000000..16824d43 --- /dev/null +++ b/logsmanagement/functions.h @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file functions.h + * @brief Header of functions.c + */ + +#ifndef FUNCTIONS_H_ +#define FUNCTIONS_H_ + +#include "../database/rrdfunctions.h" + +#define LOGS_MANAG_FUNC_NAME "logs-management" +#define FUNCTION_LOGSMANAGEMENT_HELP_SHORT "View, search and analyze logs monitored through the logs management engine." + +int logsmanagement_function_execute_cb( BUFFER *dest_wb, int timeout, + const char *function, void *collector_data, + void (*callback)(BUFFER *wb, int code, void *callback_data), + void *callback_data); + +struct functions_evloop_globals *logsmanagement_func_facets_init(bool *p_logsmanagement_should_exit); + +#endif // FUNCTIONS_H_
\ No newline at end of file diff --git a/logsmanagement/helper.h b/logsmanagement/helper.h new file mode 100644 index 00000000..6d1d51f7 --- /dev/null +++ b/logsmanagement/helper.h @@ -0,0 +1,238 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file helper.h + * @brief Includes helper functions for the Logs Management project. + */ + +#ifndef HELPER_H_ +#define HELPER_H_ + +#include "libnetdata/libnetdata.h" +#include <assert.h> + +#define LOGS_MANAGEMENT_PLUGIN_STR "logs-management.plugin" + +#define LOGS_MANAG_STR_HELPER(x) #x +#define LOGS_MANAG_STR(x) LOGS_MANAG_STR_HELPER(x) + +#ifndef m_assert +#if defined(LOGS_MANAGEMENT_DEV_MODE) +#define m_assert(expr, msg) assert(((void)(msg), (expr))) +#else +#define m_assert(expr, msg) do{} while(0) +#endif // LOGS_MANAGEMENT_DEV_MODE +#endif // m_assert + +/* Test if a timestamp is within a valid range + * 1649175852000 equals Tuesday, 5 April 2022 16:24:12, + * 2532788652000 equals Tuesday, 5 April 2050 16:24:12 + */ +#define TEST_MS_TIMESTAMP_VALID(x) (((x) > 1649175852000 && (x) < 2532788652000)? 1:0) + +#define TIMESTAMP_MS_STR_SIZE sizeof("1649175852000") + +#ifdef ENABLE_LOGSMANAGEMENT_TESTS +#define UNIT_STATIC +#else +#define UNIT_STATIC static +#endif // ENABLE_LOGSMANAGEMENT_TESTS + +#ifndef COMPILE_TIME_ASSERT // https://stackoverflow.com/questions/3385515/static-assert-in-c +#define STATIC_ASSERT(COND,MSG) typedef char static_assertion_##MSG[(!!(COND))*2-1] +// token pasting madness: +#define COMPILE_TIME_ASSERT3(X,L) STATIC_ASSERT(X,static_assertion_at_line_##L) +#define COMPILE_TIME_ASSERT2(X,L) COMPILE_TIME_ASSERT3(X,L) +#define COMPILE_TIME_ASSERT(X) COMPILE_TIME_ASSERT2(X,__LINE__) +#endif // COMPILE_TIME_ASSERT + +#if defined(NETDATA_INTERNAL_CHECKS) && defined(LOGS_MANAGEMENT_DEV_MODE) +#define debug_log(args...) netdata_logger(NDLS_COLLECTORS, NDLP_DEBUG, __FILE__, __FUNCTION__, __LINE__, ##args) +#else +#define debug_log(fmt, args...) 
do {} while(0) +#endif + +/** + * @brief Extract file_basename from full file path + * @param path String containing the full path. + * @return Pointer to the file_basename string + */ +static inline char *get_basename(const char *const path) { + if(!path) return NULL; + char *s = strrchr(path, '/'); + if (!s) + return strdupz(path); + else + return strdupz(s + 1); +} + +typedef enum { + STR2XX_SUCCESS = 0, + STR2XX_OVERFLOW, + STR2XX_UNDERFLOW, + STR2XX_INCONVERTIBLE +} str2xx_errno; + +/* Convert string s to int out. + * https://stackoverflow.com/questions/7021725/how-to-convert-a-string-to-integer-in-c + * + * @param[out] out The converted int. Cannot be NULL. + * @param[in] s Input string to be converted. + * + * The format is the same as strtol, + * except that the following are inconvertible: + * - empty string + * - leading whitespace + * - any trailing characters that are not part of the number + * Cannot be NULL. + * + * @param[in] base Base to interpret string in. Same range as strtol (2 to 36). + * @return Indicates if the operation succeeded, or why it failed. + */ +static inline str2xx_errno str2int(int *out, char *s, int base) { + char *end; + if (unlikely(s[0] == '\0' || isspace(s[0]))){ + // debug_log( "str2int error: STR2XX_INCONVERTIBLE 1"); + // m_assert(0, "str2int error: STR2XX_INCONVERTIBLE"); + return STR2XX_INCONVERTIBLE; + } + errno = 0; + long l = strtol(s, &end, base); + /* Both checks are needed because INT_MAX == LONG_MAX is possible. 
 */ + if (unlikely(l > INT_MAX || (errno == ERANGE && l == LONG_MAX))){ + debug_log( "str2int error: STR2XX_OVERFLOW"); + // m_assert(0, "str2int error: STR2XX_OVERFLOW"); + return STR2XX_OVERFLOW; + } + if (unlikely(l < INT_MIN || (errno == ERANGE && l == LONG_MIN))){ + debug_log( "str2int error: STR2XX_UNDERFLOW"); + // m_assert(0, "str2int error: STR2XX_UNDERFLOW"); + return STR2XX_UNDERFLOW; + } + if (unlikely(*end != '\0')){ + debug_log( "str2int error: STR2XX_INCONVERTIBLE 2"); + // m_assert(0, "str2int error: STR2XX_INCONVERTIBLE 2"); + return STR2XX_INCONVERTIBLE; + } + *out = l; + return STR2XX_SUCCESS; +} + +static inline str2xx_errno str2float(float *out, char *s) { + char *end; + if (unlikely(s[0] == '\0' || isspace(s[0]))){ + // debug_log( "str2float error: STR2XX_INCONVERTIBLE 1\n"); + // m_assert(0, "str2float error: STR2XX_INCONVERTIBLE"); + return STR2XX_INCONVERTIBLE; + } + errno = 0; + float f = strtof(s, &end); + /* strtof() sets errno to ERANGE and returns HUGE_VALF or -HUGE_VALF when the value is out of range. */ + if (unlikely((errno == ERANGE && f == HUGE_VALF))){ + debug_log( "str2float error: STR2XX_OVERFLOW\n"); + // m_assert(0, "str2float error: STR2XX_OVERFLOW"); + return STR2XX_OVERFLOW; + } + if (unlikely((errno == ERANGE && f == -HUGE_VALF))){ + debug_log( "str2float error: STR2XX_UNDERFLOW\n"); + // m_assert(0, "str2float error: STR2XX_UNDERFLOW"); + return STR2XX_UNDERFLOW; + } + if (unlikely((*end != '\0'))){ + debug_log( "str2float error: STR2XX_INCONVERTIBLE 2\n"); + // m_assert(0, "str2float error: STR2XX_INCONVERTIBLE"); + return STR2XX_INCONVERTIBLE; + } + *out = f; + return STR2XX_SUCCESS; +} + +/** + * @brief Read last line of *filename, up to max_line_width characters. + * @note This function should be used carefully as it is not the most + * efficient one. But it is a quick-n-dirty way of reading the last line + * of a file. + * @param[in] filename File to be read. + * @param[in] max_line_width Integer indicating the max line width to be read. 
+ * If a line is longer than that, it will be truncated. If zero or negative, a + * default value will be used instead. + * @return Pointer to a string holding the line that was read, or NULL if error. + */ +static inline char *read_last_line(const char *filename, int max_line_width){ + uv_fs_t req; + int64_t start_pos, end_pos; + uv_file file_handle = -1; + uv_buf_t uvBuf; + char *buff = NULL; + int rc, line_pos = -1, bytes_read; + + max_line_width = max_line_width > 0 ? max_line_width : 1024; // 1024 == default value + + rc = uv_fs_stat(NULL, &req, filename, NULL); + end_pos = req.statbuf.st_size; + uv_fs_req_cleanup(&req); + if (unlikely(rc)) { + collector_error("[%s]: uv_fs_stat() error: (%d) %s", filename, rc, uv_strerror(rc)); + m_assert(0, "uv_fs_stat() failed during read_last_line()"); + goto error; + } + + if(end_pos == 0) goto error; + start_pos = end_pos - max_line_width; + if(start_pos < 0) start_pos = 0; + + rc = uv_fs_open(NULL, &req, filename, O_RDONLY, 0, NULL); + uv_fs_req_cleanup(&req); + if (unlikely(rc < 0)) { + collector_error("[%s]: uv_fs_open() error: (%d) %s",filename, rc, uv_strerror(rc)); + m_assert(0, "uv_fs_open() failed during read_last_line()"); + goto error; + } + file_handle = rc; + + buff = callocz(1, (size_t) (end_pos - start_pos + 1) * sizeof(char)); + uvBuf = uv_buf_init(buff, (unsigned int) (end_pos - start_pos)); + rc = uv_fs_read(NULL, &req, file_handle, &uvBuf, 1, start_pos, NULL); + uv_fs_req_cleanup(&req); + if (unlikely(rc < 0)){ + collector_error("[%s]: uv_fs_read() error: (%d) %s", filename, rc, uv_strerror(rc)); + m_assert(0, "uv_fs_read() failed during read_last_line()"); + goto error; + } + bytes_read = rc; + + buff[bytes_read] = '\0'; + + for(int i = bytes_read - 2; i >= 0; i--){ // -2 because -1 could be '\n' + if (buff[i] == '\n'){ + line_pos = i; + break; + } + } + + if(line_pos >= 0){ + char *line = callocz(1, (size_t) (bytes_read - line_pos) * sizeof(char)); + memcpy(line, &buff[line_pos + 1], (size_t) 
(bytes_read - line_pos)); + freez(buff); + uv_fs_close(NULL, &req, file_handle, NULL); + return line; + } + + if(start_pos == 0){ + uv_fs_close(NULL, &req, file_handle, NULL); + return buff; + } + +error: + if(buff) freez(buff); + if(file_handle >= 0) uv_fs_close(NULL, &req, file_handle, NULL); + return NULL; +} + +static inline void memcpy_iscntrl_fix(char *dest, char *src, size_t num){ + while(num--){ + *dest++ = unlikely(!iscntrl(*src)) ? *src : ' '; + src++; + } +} + +#endif // HELPER_H_ diff --git a/logsmanagement/logsmanag_config.c b/logsmanagement/logsmanag_config.c new file mode 100644 index 00000000..5be52389 --- /dev/null +++ b/logsmanagement/logsmanag_config.c @@ -0,0 +1,1410 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file logsmanag_config.c + * @brief This file includes functions to manage + * the logs management configuration. + */ + +#include "logsmanag_config.h" +#include "db_api.h" +#include "rrd_api/rrd_api.h" +#include "helper.h" + +g_logs_manag_config_t g_logs_manag_config = { + .update_every = UPDATE_EVERY, + .update_timeout = UPDATE_TIMEOUT_DEFAULT, + .use_log_timestamp = CONFIG_BOOLEAN_AUTO, + .circ_buff_max_size_in_mib = CIRCULAR_BUFF_DEFAULT_MAX_SIZE / (1 MiB), + .circ_buff_drop_logs = CIRCULAR_BUFF_DEFAULT_DROP_LOGS, + .compression_acceleration = COMPRESSION_ACCELERATION_DEFAULT, + .db_mode = GLOBAL_DB_MODE_DEFAULT, + .disk_space_limit_in_mib = DISK_SPACE_LIMIT_DEFAULT, + .buff_flush_to_db_interval = SAVE_BLOB_TO_DB_DEFAULT, + .enable_collected_logs_total = ENABLE_COLLECTED_LOGS_TOTAL_DEFAULT, + .enable_collected_logs_rate = ENABLE_COLLECTED_LOGS_RATE_DEFAULT, + .sd_journal_field_prefix = SD_JOURNAL_FIELD_PREFIX, + .do_sd_journal_send = SD_JOURNAL_SEND_DEFAULT +}; + +static logs_manag_db_mode_t db_mode_str_to_db_mode(const char *const db_mode_str){ + if(!db_mode_str || !*db_mode_str) return g_logs_manag_config.db_mode; + else if(!strcasecmp(db_mode_str, "full")) return LOGS_MANAG_DB_MODE_FULL; + else 
if(!strcasecmp(db_mode_str, "none")) return LOGS_MANAG_DB_MODE_NONE; + else return g_logs_manag_config.db_mode; +} + +static struct config log_management_config = { + .first_section = NULL, + .last_section = NULL, + .mutex = NETDATA_MUTEX_INITIALIZER, + .index = { + .avl_tree = { + .root = NULL, + .compar = appconfig_section_compare + }, + .rwlock = AVL_LOCK_INITIALIZER + } +}; + +static struct Chart_meta chart_types[] = { + {.type = FLB_TAIL, .init = generic_chart_init, .update = generic_chart_update}, + {.type = FLB_WEB_LOG, .init = web_log_chart_init, .update = web_log_chart_update}, + {.type = FLB_KMSG, .init = kernel_chart_init, .update = kernel_chart_update}, + {.type = FLB_SYSTEMD, .init = systemd_chart_init, .update = systemd_chart_update}, + {.type = FLB_DOCKER_EV, .init = docker_ev_chart_init, .update = docker_ev_chart_update}, + {.type = FLB_SYSLOG, .init = generic_chart_init, .update = generic_chart_update}, + {.type = FLB_SERIAL, .init = generic_chart_init, .update = generic_chart_update}, + {.type = FLB_MQTT, .init = mqtt_chart_init, .update = mqtt_chart_update} +}; + +char *get_user_config_dir(void){ + char *dir = getenv("NETDATA_USER_CONFIG_DIR"); + + return dir ? dir : CONFIG_DIR; +} + +char *get_stock_config_dir(void){ + char *dir = getenv("NETDATA_STOCK_CONFIG_DIR"); + + return dir ? dir : LIBCONFIG_DIR; +} + +char *get_log_dir(void){ + char *dir = getenv("NETDATA_LOG_DIR"); + + return dir ? dir : LOG_DIR; +} + +char *get_cache_dir(void){ + char *dir = getenv("NETDATA_CACHE_DIR"); + + return dir ? dir : CACHE_DIR; +} + +/** + * @brief Cleanup p_file_info struct + * @param p_file_info The struct of File_info type to be cleaned up. + * @todo Pass p_file_info by reference, so that it can be set to NULL. */ +static void p_file_info_destroy(void *arg){ + struct File_info *p_file_info = (struct File_info *) arg; + + // TODO: Clean up rrd / chart stuff. 
+ // p_file_info->chart_meta + + if(unlikely(!p_file_info)){ + collector_info("p_file_info_destroy() called but p_file_info == NULL - already destroyed?"); + return; + } + + char chartname[100]; + snprintfz(chartname, 100, "%s", p_file_info->chartname ? p_file_info->chartname : "Unknown"); + collector_info("[%s]: p_file_info_destroy() cleanup...", chartname); + + __atomic_store_n(&p_file_info->state, LOG_SRC_EXITING, __ATOMIC_RELAXED); + + if(uv_is_active((uv_handle_t *) &p_file_info->flb_tmp_buff_cpy_timer)){ + uv_timer_stop(&p_file_info->flb_tmp_buff_cpy_timer); + if (!uv_is_closing((uv_handle_t *) &p_file_info->flb_tmp_buff_cpy_timer)) + uv_close((uv_handle_t *) &p_file_info->flb_tmp_buff_cpy_timer, NULL); + } + + // TODO: Need to do proper termination of DB threads and allocated memory. + if(p_file_info->db_writer_thread){ + uv_thread_join(p_file_info->db_writer_thread); + sqlite3_finalize(p_file_info->stmt_get_log_msg_metadata_asc); + sqlite3_finalize(p_file_info->stmt_get_log_msg_metadata_desc); + if(sqlite3_close(p_file_info->db) != SQLITE_OK) + collector_error("[%s]: Failed to close database", chartname); + freez(p_file_info->db_mut); + freez((void *) p_file_info->db_metadata); + freez((void *) p_file_info->db_dir); + freez(p_file_info->db_writer_thread); + } + + freez((void *) p_file_info->chartname); + freez(p_file_info->filename); + freez((void *) p_file_info->file_basename); + freez((void *) p_file_info->stream_guid); + + for(int i = 1; i <= BLOB_MAX_FILES; i++){ + if(p_file_info->blob_handles[i]){ + uv_fs_close(NULL, NULL, p_file_info->blob_handles[i], NULL); + p_file_info->blob_handles[i] = 0; + } + } + + if(p_file_info->circ_buff) + circ_buff_destroy(p_file_info->circ_buff); + + if(p_file_info->parser_metrics){ + switch(p_file_info->log_type){ + case FLB_WEB_LOG: { + if(p_file_info->parser_metrics->web_log) + freez(p_file_info->parser_metrics->web_log); + break; + } + case FLB_KMSG: { + if(p_file_info->parser_metrics->kernel){ + 
dictionary_destroy(p_file_info->parser_metrics->kernel->subsystem); + dictionary_destroy(p_file_info->parser_metrics->kernel->device); + freez(p_file_info->parser_metrics->kernel); + } + break; + } + case FLB_SYSTEMD: + case FLB_SYSLOG: { + if(p_file_info->parser_metrics->systemd) + freez(p_file_info->parser_metrics->systemd); + break; + } + case FLB_DOCKER_EV: { + if(p_file_info->parser_metrics->docker_ev) + freez(p_file_info->parser_metrics->docker_ev); + break; + } + case FLB_MQTT: { + if(p_file_info->parser_metrics->mqtt){ + dictionary_destroy(p_file_info->parser_metrics->mqtt->topic); + freez(p_file_info->parser_metrics->mqtt); + } + break; + } + default: + break; + } + + for(int i = 0; p_file_info->parser_cus_config && + p_file_info->parser_metrics->parser_cus && + p_file_info->parser_cus_config[i]; i++){ + freez(p_file_info->parser_cus_config[i]->chartname); + freez(p_file_info->parser_cus_config[i]->regex_str); + freez(p_file_info->parser_cus_config[i]->regex_name); + regfree(&p_file_info->parser_cus_config[i]->regex); + freez(p_file_info->parser_cus_config[i]); + freez(p_file_info->parser_metrics->parser_cus[i]); + } + + freez(p_file_info->parser_cus_config); + freez(p_file_info->parser_metrics->parser_cus); + + freez(p_file_info->parser_metrics); + } + + if(p_file_info->parser_config){ + freez(p_file_info->parser_config->gen_config); + freez(p_file_info->parser_config); + } + + Flb_output_config_t *output_next = p_file_info->flb_outputs; + while(output_next){ + Flb_output_config_t *output = output_next; + output_next = output_next->next; + + struct flb_output_config_param *param_next = output->param; + while(param_next){ + struct flb_output_config_param *param = param_next; + param_next = param->next; + freez(param->key); + freez(param->val); + freez(param); + } + freez(output->plugin); + freez(output); + } + + freez(p_file_info->flb_config); + + freez(p_file_info); + + collector_info("[%s]: p_file_info_destroy() cleanup done", chartname); +} + +void 
p_file_info_destroy_all(void){ + if(p_file_infos_arr){ + uv_thread_t thread_id[p_file_infos_arr->count]; + for(int i = 0; i < p_file_infos_arr->count; i++){ + fatal_assert(0 == uv_thread_create(&thread_id[i], p_file_info_destroy, p_file_infos_arr->data[i])); + } + for(int i = 0; i < p_file_infos_arr->count; i++){ + uv_thread_join(&thread_id[i]); + } + freez(p_file_infos_arr); + p_file_infos_arr = NULL; + } +} + +/** + * @brief Load logs management configuration. + * @returns 0 if success, + * -1 if config file not found + * -2 if p_flb_srvc_config is NULL (no flb_srvc_config_t provided) + */ +int logs_manag_config_load( flb_srvc_config_t *p_flb_srvc_config, + Flb_socket_config_t **forward_in_config_p, + int g_update_every){ + int rc = LOGS_MANAG_CONFIG_LOAD_ERROR_OK; + char section[100]; + char temp_path[FILENAME_MAX + 1]; + + struct config logsmanagement_d_conf = { + .first_section = NULL, + .last_section = NULL, + .mutex = NETDATA_MUTEX_INITIALIZER, + .index = { + .avl_tree = { + .root = NULL, + .compar = appconfig_section_compare + }, + .rwlock = AVL_LOCK_INITIALIZER + } + }; + + char *filename = strdupz_path_subpath(get_user_config_dir(), "logsmanagement.d.conf"); + if(!appconfig_load(&logsmanagement_d_conf, filename, 0, NULL)) { + collector_info("CONFIG: cannot load user config '%s'. Will try stock config.", filename); + freez(filename); + + filename = strdupz_path_subpath(get_stock_config_dir(), "logsmanagement.d.conf"); + if(!appconfig_load(&logsmanagement_d_conf, filename, 0, NULL)){ + collector_error("CONFIG: cannot load stock config '%s'. 
Logs management will be disabled.", filename); + rc = LOGS_MANAG_CONFIG_LOAD_ERROR_NO_STOCK_CONFIG; + } + } + freez(filename); + + + /* [global] section */ + + snprintfz(section, 100, "global"); + + g_logs_manag_config.update_every = appconfig_get_number( + &logsmanagement_d_conf, + section, + "update every", + g_logs_manag_config.update_every); + + g_logs_manag_config.update_every = + g_update_every && g_update_every > g_logs_manag_config.update_every ? + g_update_every : g_logs_manag_config.update_every; + + g_logs_manag_config.update_timeout = appconfig_get_number( + &logsmanagement_d_conf, + section, + "update timeout", + UPDATE_TIMEOUT_DEFAULT); + + if(g_logs_manag_config.update_timeout < g_logs_manag_config.update_every) + g_logs_manag_config.update_timeout = g_logs_manag_config.update_every; + + g_logs_manag_config.use_log_timestamp = appconfig_get_boolean_ondemand( + &logsmanagement_d_conf, + section, + "use log timestamp", + g_logs_manag_config.use_log_timestamp); + + g_logs_manag_config.circ_buff_max_size_in_mib = appconfig_get_number( + &logsmanagement_d_conf, + section, + "circular buffer max size MiB", + g_logs_manag_config.circ_buff_max_size_in_mib); + + g_logs_manag_config.circ_buff_drop_logs = appconfig_get_boolean( + &logsmanagement_d_conf, + section, + "circular buffer drop logs if full", + g_logs_manag_config.circ_buff_drop_logs); + + g_logs_manag_config.compression_acceleration = appconfig_get_number( + &logsmanagement_d_conf, + section, + "compression acceleration", + g_logs_manag_config.compression_acceleration); + + g_logs_manag_config.enable_collected_logs_total = appconfig_get_boolean( + &logsmanagement_d_conf, + section, + "collected logs total chart enable", + g_logs_manag_config.enable_collected_logs_total); + + g_logs_manag_config.enable_collected_logs_rate = appconfig_get_boolean( + &logsmanagement_d_conf, + section, + "collected logs rate chart enable", + g_logs_manag_config.enable_collected_logs_rate); + + 
g_logs_manag_config.do_sd_journal_send = appconfig_get_boolean( + &logsmanagement_d_conf, + section, + "submit logs to system journal", + g_logs_manag_config.do_sd_journal_send); + + g_logs_manag_config.sd_journal_field_prefix = appconfig_get( + &logsmanagement_d_conf, + section, + "systemd journal fields prefix", + g_logs_manag_config.sd_journal_field_prefix); + + if(!rc){ + collector_info("CONFIG: [%s] update every: %d", section, g_logs_manag_config.update_every); + collector_info("CONFIG: [%s] update timeout: %d", section, g_logs_manag_config.update_timeout); + collector_info("CONFIG: [%s] use log timestamp: %d", section, g_logs_manag_config.use_log_timestamp); + collector_info("CONFIG: [%s] circular buffer max size MiB: %d", section, g_logs_manag_config.circ_buff_max_size_in_mib); + collector_info("CONFIG: [%s] circular buffer drop logs if full: %d", section, g_logs_manag_config.circ_buff_drop_logs); + collector_info("CONFIG: [%s] compression acceleration: %d", section, g_logs_manag_config.compression_acceleration); + collector_info("CONFIG: [%s] collected logs total chart enable: %d", section, g_logs_manag_config.enable_collected_logs_total); + collector_info("CONFIG: [%s] collected logs rate chart enable: %d", section, g_logs_manag_config.enable_collected_logs_rate); + collector_info("CONFIG: [%s] submit logs to system journal: %d", section, g_logs_manag_config.do_sd_journal_send); + collector_info("CONFIG: [%s] systemd journal fields prefix: %s", section, g_logs_manag_config.sd_journal_field_prefix); + } + + + /* [db] section */ + + snprintfz(section, 100, "db"); + + const char *const db_mode_str = appconfig_get( + &logsmanagement_d_conf, + section, + "db mode", + GLOBAL_DB_MODE_DEFAULT_STR); + g_logs_manag_config.db_mode = db_mode_str_to_db_mode(db_mode_str); + + snprintfz(temp_path, FILENAME_MAX, "%s" LOGS_MANAG_DB_SUBPATH, get_cache_dir()); + db_set_main_dir(appconfig_get(&logsmanagement_d_conf, section, "db dir", temp_path)); + + 
g_logs_manag_config.buff_flush_to_db_interval = appconfig_get_number( + &logsmanagement_d_conf, + section, + "circular buffer flush to db", + g_logs_manag_config.buff_flush_to_db_interval); + + g_logs_manag_config.disk_space_limit_in_mib = appconfig_get_number( + &logsmanagement_d_conf, + section, + "disk space limit MiB", + g_logs_manag_config.disk_space_limit_in_mib); + + if(!rc){ + collector_info("CONFIG: [%s] db mode: %s [%d]", section, db_mode_str, (int) g_logs_manag_config.db_mode); + collector_info("CONFIG: [%s] db dir: %s", section, temp_path); + collector_info("CONFIG: [%s] circular buffer flush to db: %d", section, g_logs_manag_config.buff_flush_to_db_interval); + collector_info("CONFIG: [%s] disk space limit MiB: %d", section, g_logs_manag_config.disk_space_limit_in_mib); + } + + + /* [forward input] section */ + + snprintfz(section, 100, "forward input"); + + const int fwd_enable = appconfig_get_boolean( + &logsmanagement_d_conf, + section, + "enabled", + CONFIG_BOOLEAN_NO); + + *forward_in_config_p = (Flb_socket_config_t *) callocz(1, sizeof(Flb_socket_config_t)); + + (*forward_in_config_p)->unix_path = appconfig_get( + &logsmanagement_d_conf, + section, + "unix path", + FLB_FORWARD_UNIX_PATH_DEFAULT); + + (*forward_in_config_p)->unix_perm = appconfig_get( + &logsmanagement_d_conf, + section, + "unix perm", + FLB_FORWARD_UNIX_PERM_DEFAULT); + + // TODO: Check if listen is in valid format + (*forward_in_config_p)->listen = appconfig_get( + &logsmanagement_d_conf, + section, + "listen", + FLB_FORWARD_ADDR_DEFAULT); + + (*forward_in_config_p)->port = appconfig_get( + &logsmanagement_d_conf, + section, + "port", + FLB_FORWARD_PORT_DEFAULT); + + if(!rc){ + collector_info("CONFIG: [%s] enabled: %s", section, fwd_enable ? 
"yes" : "no"); + collector_info("CONFIG: [%s] unix path: %s", section, (*forward_in_config_p)->unix_path); + collector_info("CONFIG: [%s] unix perm: %s", section, (*forward_in_config_p)->unix_perm); + collector_info("CONFIG: [%s] listen: %s", section, (*forward_in_config_p)->listen); + collector_info("CONFIG: [%s] port: %s", section, (*forward_in_config_p)->port); + } + + if(!fwd_enable) { + freez(*forward_in_config_p); + *forward_in_config_p = NULL; + } + + + /* [fluent bit] section */ + + snprintfz(section, 100, "fluent bit"); + + snprintfz(temp_path, FILENAME_MAX, "%s/%s", get_log_dir(), FLB_LOG_FILENAME_DEFAULT); + + if(p_flb_srvc_config){ + p_flb_srvc_config->flush = appconfig_get( + &logsmanagement_d_conf, + section, + "flush", + p_flb_srvc_config->flush); + + p_flb_srvc_config->http_listen = appconfig_get( + &logsmanagement_d_conf, + section, + "http listen", + p_flb_srvc_config->http_listen); + + p_flb_srvc_config->http_port = appconfig_get( + &logsmanagement_d_conf, + section, + "http port", + p_flb_srvc_config->http_port); + + p_flb_srvc_config->http_server = appconfig_get( + &logsmanagement_d_conf, + section, + "http server", + p_flb_srvc_config->http_server); + + p_flb_srvc_config->log_path = appconfig_get( + &logsmanagement_d_conf, + section, + "log file", + temp_path); + + p_flb_srvc_config->log_level = appconfig_get( + &logsmanagement_d_conf, + section, + "log level", + p_flb_srvc_config->log_level); + + p_flb_srvc_config->coro_stack_size = appconfig_get( + &logsmanagement_d_conf, + section, + "coro stack size", + p_flb_srvc_config->coro_stack_size); + } + else + rc = LOGS_MANAG_CONFIG_LOAD_ERROR_P_FLB_SRVC_NULL; + + if(!rc){ + collector_info("CONFIG: [%s] flush: %s", section, p_flb_srvc_config->flush); + collector_info("CONFIG: [%s] http listen: %s", section, p_flb_srvc_config->http_listen); + collector_info("CONFIG: [%s] http port: %s", section, p_flb_srvc_config->http_port); + collector_info("CONFIG: [%s] http server: %s", section, 
p_flb_srvc_config->http_server); + collector_info("CONFIG: [%s] log file: %s", section, p_flb_srvc_config->log_path); + collector_info("CONFIG: [%s] log level: %s", section, p_flb_srvc_config->log_level); + collector_info("CONFIG: [%s] coro stack size: %s", section, p_flb_srvc_config->coro_stack_size); + } + + return rc; +} + +static bool metrics_dict_conflict_cb(const DICTIONARY_ITEM *item __maybe_unused, void *old_value, void *new_value, void *data __maybe_unused){ + ((metrics_dict_item_t *)old_value)->num_new += ((metrics_dict_item_t *)new_value)->num_new; + return true; +} + +#define FLB_OUTPUT_PLUGIN_NAME_KEY "name" + +static int flb_output_param_get_cb(void *entry, void *data){ + struct config_option *option = (struct config_option *) entry; + Flb_output_config_t *flb_output = (Flb_output_config_t *) data; + + char *param_prefix = callocz(1, snprintf(NULL, 0, "output %d", MAX_OUTPUTS_PER_SOURCE) + 1); + sprintf(param_prefix, "output %d", flb_output->id); + size_t param_prefix_len = strlen(param_prefix); + + if(!strncasecmp(option->name, param_prefix, param_prefix_len)){ // param->name looks like "output 1 host" + char *param_key = &option->name[param_prefix_len]; // param_key should look like " host" + while(*param_key == ' ') param_key++; // remove whitespace so it looks like "host" + + if(*param_key && strcasecmp(param_key, FLB_OUTPUT_PLUGIN_NAME_KEY)){ // ignore param_key "name" + // debug_log( "config_option: name[%s], value[%s]", option->name, option->value); + // debug_log( "config option kv:[%s][%s]", param_key, option->value); + + struct flb_output_config_param **p = &flb_output->param; + while((*p) != NULL) p = &((*p)->next); // Go to last param of linked list + + (*p) = callocz(1, sizeof(struct flb_output_config_param)); + (*p)->key = strdupz(param_key); + (*p)->val = strdupz(option->value); + } + } + + freez(param_prefix); + + return 0; +} + +/** + * @brief Initialize logs management based on a section configuration. 
+ * @note On error, calls p_file_info_destroy() to clean up before returning. + * @param config_section Section to read configuration from. + * @todo How to handle duplicate entries? + */ +static void config_section_init(uv_loop_t *main_loop, + struct section *config_section, + Flb_socket_config_t *forward_in_config, + flb_srvc_config_t *p_flb_srvc_config, + netdata_mutex_t *stdout_mut){ + + struct File_info *p_file_info = callocz(1, sizeof(struct File_info)); + + /* ------------------------------------------------------------------------- + * Check if config_section->name is valid and if so, use it as chartname. + * ------------------------------------------------------------------------- */ + if(config_section->name && *config_section->name){ + char tmp[LOGS_MANAG_CHARTNAME_SIZE] = {0}; + + snprintfz(tmp, sizeof(tmp), "%s%s", LOGS_MANAG_CHARTNAME_PREFIX, config_section->name); + + netdata_fix_chart_id(tmp); + + for(char *ch = (char *) tmp; *ch; ch++) + *ch = *ch == '.' ? '_' : *ch; // Convert dots to underscores + + p_file_info->chartname = strdupz(tmp); + + collector_info("[%s]: Initializing config loading", p_file_info->chartname); + } else { + collector_error("Invalid logs management config section."); + return p_file_info_destroy(p_file_info); + } + + + /* ------------------------------------------------------------------------- + * Check if this log source is enabled. + * ------------------------------------------------------------------------- */ + if(appconfig_get_boolean(&log_management_config, config_section->name, "enabled", CONFIG_BOOLEAN_NO)){ + collector_info("[%s]: enabled = yes", p_file_info->chartname); + } else { + collector_info("[%s]: enabled = no", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } + + + /* ------------------------------------------------------------------------- + * Check log type. 
+ * ------------------------------------------------------------------------- */ + char *type = appconfig_get(&log_management_config, config_section->name, "log type", "flb_tail"); + if(!type || !*type) p_file_info->log_type = FLB_TAIL; // Default + else{ + if(!strcasecmp(type, "flb_tail")) p_file_info->log_type = FLB_TAIL; + else if (!strcasecmp(type, "flb_web_log")) p_file_info->log_type = FLB_WEB_LOG; + else if (!strcasecmp(type, "flb_kmsg")) p_file_info->log_type = FLB_KMSG; + else if (!strcasecmp(type, "flb_systemd")) p_file_info->log_type = FLB_SYSTEMD; + else if (!strcasecmp(type, "flb_docker_events")) p_file_info->log_type = FLB_DOCKER_EV; + else if (!strcasecmp(type, "flb_syslog")) p_file_info->log_type = FLB_SYSLOG; + else if (!strcasecmp(type, "flb_serial")) p_file_info->log_type = FLB_SERIAL; + else if (!strcasecmp(type, "flb_mqtt")) p_file_info->log_type = FLB_MQTT; + else p_file_info->log_type = FLB_TAIL; + } + freez(type); + collector_info("[%s]: log type = %s", p_file_info->chartname, log_src_type_t_str[p_file_info->log_type]); + + + /* ------------------------------------------------------------------------- + * Read log source. 
+ * ------------------------------------------------------------------------- */ + char *source = appconfig_get(&log_management_config, config_section->name, "log source", "local"); + if(!source || !*source) p_file_info->log_source = LOG_SOURCE_LOCAL; // Default + else if(!strcasecmp(source, "forward")) p_file_info->log_source = LOG_SOURCE_FORWARD; + else p_file_info->log_source = LOG_SOURCE_LOCAL; + freez(source); + collector_info("[%s]: log source = %s", p_file_info->chartname, log_src_t_str[p_file_info->log_source]); + + if(p_file_info->log_source == LOG_SOURCE_FORWARD && !forward_in_config){ + collector_info("[%s]: forward_in_config == NULL - this log source will be disabled", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } + + + /* ------------------------------------------------------------------------- + * Read stream uuid. + * ------------------------------------------------------------------------- */ + p_file_info->stream_guid = appconfig_get(&log_management_config, config_section->name, "stream guid", ""); + collector_info("[%s]: stream guid = %s", p_file_info->chartname, p_file_info->stream_guid); + + + /* ------------------------------------------------------------------------- + * Read log path configuration and check if it is valid. 
+ * ------------------------------------------------------------------------- */ + p_file_info->filename = appconfig_get(&log_management_config, config_section->name, "log path", LOG_PATH_AUTO); + if( /* path doesn't matter when log source is not local */ + (p_file_info->log_source == LOG_SOURCE_LOCAL) && + + /* FLB_SYSLOG is special case, may or may not require a path */ + (p_file_info->log_type != FLB_SYSLOG) && + + /* FLB_MQTT is special case, does not require a path */ + (p_file_info->log_type != FLB_MQTT) && + + (!p_file_info->filename /* Sanity check */ || + !*p_file_info->filename || + !strcmp(p_file_info->filename, LOG_PATH_AUTO) || + access(p_file_info->filename, R_OK) + )){ + + freez(p_file_info->filename); + p_file_info->filename = NULL; + + switch(p_file_info->log_type){ + case FLB_TAIL: + if(!strcasecmp(p_file_info->chartname, LOGS_MANAG_CHARTNAME_PREFIX "netdata_daemon_log")){ + char path[FILENAME_MAX + 1]; + snprintfz(path, FILENAME_MAX, "%s/daemon.log", get_log_dir()); + if(access(path, R_OK)) { + collector_error("[%s]: 'Netdata daemon.log' path (%s) invalid, unknown or needs permissions", + p_file_info->chartname, path); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(path); + } else if(!strcasecmp(p_file_info->chartname, LOGS_MANAG_CHARTNAME_PREFIX "fluentbit_log")){ + if(access(p_flb_srvc_config->log_path, R_OK)){ + collector_error("[%s]: Netdata fluentbit.log path (%s) invalid, unknown or needs permissions", + p_file_info->chartname, p_flb_srvc_config->log_path); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(p_flb_srvc_config->log_path); + } else if(!strcasecmp(p_file_info->chartname, LOGS_MANAG_CHARTNAME_PREFIX "auth_log_tail")){ + const char * const auth_path_default[] = { + "/var/log/auth.log", + NULL + }; + int i = 0; + while(auth_path_default[i] && access(auth_path_default[i], R_OK)){i++;}; + if(!auth_path_default[i]){ + collector_error("[%s]: auth.log path invalid, 
unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(auth_path_default[i]); + } else if(!strcasecmp(p_file_info->chartname, "syslog_tail")){ + const char * const syslog_path_default[] = { + "/var/log/syslog", /* Debian, Ubuntu */ + "/var/log/messages", /* RHEL, Red Hat, CentOS, Fedora */ + NULL + }; + int i = 0; + while(syslog_path_default[i] && access(syslog_path_default[i], R_OK)){i++;}; + if(!syslog_path_default[i]){ + collector_error("[%s]: syslog path invalid, unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(syslog_path_default[i]); + } + break; + case FLB_WEB_LOG: + if(!strcasecmp(p_file_info->chartname, LOGS_MANAG_CHARTNAME_PREFIX "apache_access_log")){ + const char * const apache_access_path_default[] = { + "/var/log/apache/access.log", + "/var/log/apache2/access.log", + "/var/log/apache2/access_log", + "/var/log/httpd/access_log", + "/var/log/httpd-access.log", + NULL + }; + int i = 0; + while(apache_access_path_default[i] && access(apache_access_path_default[i], R_OK)){i++;}; + if(!apache_access_path_default[i]){ + collector_error("[%s]: Apache access.log path invalid, unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(apache_access_path_default[i]); + } else if(!strcasecmp(p_file_info->chartname, LOGS_MANAG_CHARTNAME_PREFIX "nginx_access_log")){ + const char * const nginx_access_path_default[] = { + "/var/log/nginx/access.log", + NULL + }; + int i = 0; + while(nginx_access_path_default[i] && access(nginx_access_path_default[i], R_OK)){i++;}; + if(!nginx_access_path_default[i]){ + collector_error("[%s]: Nginx access.log path invalid, unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = 
strdupz(nginx_access_path_default[i]); + } + break; + case FLB_KMSG: + if(access(KMSG_DEFAULT_PATH, R_OK)){ + collector_error("[%s]: kmsg default path invalid, unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(KMSG_DEFAULT_PATH); + break; + case FLB_SYSTEMD: + p_file_info->filename = strdupz(SYSTEMD_DEFAULT_PATH); + break; + case FLB_DOCKER_EV: + if(access(DOCKER_EV_DEFAULT_PATH, R_OK)){ + collector_error("[%s]: Docker socket default Unix path invalid, unknown or needs permissions", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } else p_file_info->filename = strdupz(DOCKER_EV_DEFAULT_PATH); + break; + default: + collector_error("[%s]: log path invalid or unknown", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } + } + p_file_info->file_basename = get_basename(p_file_info->filename); + collector_info("[%s]: p_file_info->filename: %s", p_file_info->chartname, + p_file_info->filename ? p_file_info->filename : "NULL"); + collector_info("[%s]: p_file_info->file_basename: %s", p_file_info->chartname, + p_file_info->file_basename ? p_file_info->file_basename : "NULL"); + if(unlikely(!p_file_info->filename)) return p_file_info_destroy(p_file_info); + + + /* ------------------------------------------------------------------------- + * Read "update every" and "update timeout" configuration. 
+ * ------------------------------------------------------------------------- */ + p_file_info->update_every = appconfig_get_number( &log_management_config, config_section->name, + "update every", g_logs_manag_config.update_every); + collector_info("[%s]: update every = %d", p_file_info->chartname, p_file_info->update_every); + + p_file_info->update_timeout = appconfig_get_number( &log_management_config, config_section->name, + "update timeout", g_logs_manag_config.update_timeout); + if(p_file_info->update_timeout < p_file_info->update_every) p_file_info->update_timeout = p_file_info->update_every; + collector_info("[%s]: update timeout = %d", p_file_info->chartname, p_file_info->update_timeout); + + + /* ------------------------------------------------------------------------- + * Read "use log timestamp" configuration. + * ------------------------------------------------------------------------- */ + p_file_info->use_log_timestamp = appconfig_get_boolean_ondemand(&log_management_config, config_section->name, + "use log timestamp", + g_logs_manag_config.use_log_timestamp); + collector_info("[%s]: use log timestamp = %s", p_file_info->chartname, + p_file_info->use_log_timestamp ? "auto or yes" : "no"); + + + /* ------------------------------------------------------------------------- + * Read compression acceleration configuration. + * ------------------------------------------------------------------------- */ + p_file_info->compression_accel = appconfig_get_number( &log_management_config, config_section->name, + "compression acceleration", + g_logs_manag_config.compression_acceleration); + collector_info("[%s]: compression acceleration = %d", p_file_info->chartname, p_file_info->compression_accel); + + + /* ------------------------------------------------------------------------- + * Read DB mode. 
+ * ------------------------------------------------------------------------- */ + const char *const db_mode_str = appconfig_get(&log_management_config, config_section->name, "db mode", NULL); + collector_info("[%s]: db mode = %s", p_file_info->chartname, db_mode_str ? db_mode_str : "NULL"); + p_file_info->db_mode = db_mode_str_to_db_mode(db_mode_str); + freez((void *)db_mode_str); + + + /* ------------------------------------------------------------------------- + * Read save logs from buffers to DB interval configuration. + * ------------------------------------------------------------------------- */ + p_file_info->buff_flush_to_db_interval = appconfig_get_number( &log_management_config, config_section->name, + "circular buffer flush to db", + g_logs_manag_config.buff_flush_to_db_interval); + if(p_file_info->buff_flush_to_db_interval > SAVE_BLOB_TO_DB_MAX) { + p_file_info->buff_flush_to_db_interval = SAVE_BLOB_TO_DB_MAX; + collector_info("[%s]: circular buffer flush to db out of range. Using maximum permitted value: %d", + p_file_info->chartname, p_file_info->buff_flush_to_db_interval); + + } else if(p_file_info->buff_flush_to_db_interval < SAVE_BLOB_TO_DB_MIN) { + p_file_info->buff_flush_to_db_interval = SAVE_BLOB_TO_DB_MIN; + collector_info("[%s]: circular buffer flush to db out of range. Using minimum permitted value: %d", + p_file_info->chartname, p_file_info->buff_flush_to_db_interval); + } + collector_info("[%s]: circular buffer flush to db = %d", p_file_info->chartname, p_file_info->buff_flush_to_db_interval); + + + /* ------------------------------------------------------------------------- + * Read BLOB max size configuration. 
+ * ------------------------------------------------------------------------- */ + p_file_info->blob_max_size = appconfig_get_number( &log_management_config, config_section->name, + "disk space limit MiB", + g_logs_manag_config.disk_space_limit_in_mib) MiB / BLOB_MAX_FILES; + collector_info("[%s]: BLOB max size = %lld", p_file_info->chartname, (long long)p_file_info->blob_max_size); + + + /* ------------------------------------------------------------------------- + * Read configuration about sending logs to system journal. + * ------------------------------------------------------------------------- */ + p_file_info->do_sd_journal_send = appconfig_get_boolean(&log_management_config, config_section->name, + "submit logs to system journal", + g_logs_manag_config.do_sd_journal_send); + + /* ------------------------------------------------------------------------- + * Read collected logs chart configuration. + * ------------------------------------------------------------------------- */ + p_file_info->parser_config = callocz(1, sizeof(Log_parser_config_t)); + + if(appconfig_get_boolean(&log_management_config, config_section->name, + "collected logs total chart enable", + g_logs_manag_config.enable_collected_logs_total)){ + p_file_info->parser_config->chart_config |= CHART_COLLECTED_LOGS_TOTAL; + } + collector_info( "[%s]: collected logs total chart enable = %s", p_file_info->chartname, + (p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_TOTAL) ? "yes" : "no"); + + if(appconfig_get_boolean(&log_management_config, config_section->name, + "collected logs rate chart enable", + g_logs_manag_config.enable_collected_logs_rate)){ + p_file_info->parser_config->chart_config |= CHART_COLLECTED_LOGS_RATE; + } + collector_info( "[%s]: collected logs rate chart enable = %s", p_file_info->chartname, + (p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_RATE) ? 
"yes" : "no"); + + + /* ------------------------------------------------------------------------- + * Deal with log-type-specific configuration options. + * ------------------------------------------------------------------------- */ + + if(p_file_info->log_type == FLB_TAIL || p_file_info->log_type == FLB_WEB_LOG){ + Flb_tail_config_t *tail_config = callocz(1, sizeof(Flb_tail_config_t)); + if(appconfig_get_boolean(&log_management_config, config_section->name, "use inotify", CONFIG_BOOLEAN_YES)) + tail_config->use_inotify = 1; + collector_info( "[%s]: use inotify = %s", p_file_info->chartname, tail_config->use_inotify? "yes" : "no"); + + p_file_info->flb_config = tail_config; + } + + if(p_file_info->log_type == FLB_WEB_LOG){ + /* Check if a valid web log format configuration is detected */ + char *log_format = appconfig_get(&log_management_config, config_section->name, "log format", LOG_PATH_AUTO); + const char delimiter = ' '; // TODO!!: TO READ FROM CONFIG + collector_info("[%s]: log format = %s", p_file_info->chartname, log_format ? log_format : "NULL!"); + + /* If "log format = auto" or no "log format" config is detected, + * try log format autodetection based on last log file line. + * TODO 1: Add another case in OR where log_format is compared with a valid reg exp. + * TODO 2: Set default log format and delimiter if not found in config? Or auto-detect? 
*/ + if(!log_format || !*log_format || !strcmp(log_format, LOG_PATH_AUTO)){ + collector_info("[%s]: Attempting auto-detection of log format", p_file_info->chartname); + char *line = read_last_line(p_file_info->filename, 0); + if(!line){ + collector_error("[%s]: read_last_line() returned NULL", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } + p_file_info->parser_config->gen_config = auto_detect_web_log_parser_config(line, delimiter); + freez(line); + } + else{ + p_file_info->parser_config->gen_config = read_web_log_parser_config(log_format, delimiter); + collector_info( "[%s]: Read web log parser config: %s", p_file_info->chartname, + p_file_info->parser_config->gen_config ? "success!" : "failed!"); + } + freez(log_format); + + if(!p_file_info->parser_config->gen_config){ + collector_error("[%s]: No valid web log parser config found", p_file_info->chartname); + return p_file_info_destroy(p_file_info); + } + + /* Check whether metrics verification during parsing is required */ + Web_log_parser_config_t *wblp_config = (Web_log_parser_config_t *) p_file_info->parser_config->gen_config; + wblp_config->verify_parsed_logs = appconfig_get_boolean( &log_management_config, config_section->name, + "verify parsed logs", CONFIG_BOOLEAN_NO); + collector_info("[%s]: verify parsed logs = %d", p_file_info->chartname, wblp_config->verify_parsed_logs); + + wblp_config->skip_timestamp_parsing = p_file_info->use_log_timestamp ? 
0 : 1; + collector_info("[%s]: skip_timestamp_parsing = %d", p_file_info->chartname, wblp_config->skip_timestamp_parsing); + + for(int j = 0; j < wblp_config->num_fields; j++){ + if((wblp_config->fields[j] == VHOST_WITH_PORT || wblp_config->fields[j] == VHOST) + && appconfig_get_boolean(&log_management_config, config_section->name, "vhosts chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_VHOST; + } + if((wblp_config->fields[j] == VHOST_WITH_PORT || wblp_config->fields[j] == PORT) + && appconfig_get_boolean(&log_management_config, config_section->name, "ports chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_PORT; + } + if((wblp_config->fields[j] == REQ_CLIENT) + && appconfig_get_boolean(&log_management_config, config_section->name, "IP versions chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_IP_VERSION; + } + if((wblp_config->fields[j] == REQ_CLIENT) + && appconfig_get_boolean(&log_management_config, config_section->name, "unique client IPs - current poll chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_REQ_CLIENT_CURRENT; + } + if((wblp_config->fields[j] == REQ_CLIENT) + && appconfig_get_boolean(&log_management_config, config_section->name, "unique client IPs - all-time chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_REQ_CLIENT_ALL_TIME; + } + if((wblp_config->fields[j] == REQ || wblp_config->fields[j] == REQ_METHOD) + && appconfig_get_boolean(&log_management_config, config_section->name, "http request methods chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_REQ_METHODS; + } + if((wblp_config->fields[j] == REQ || wblp_config->fields[j] == REQ_PROTO) + && appconfig_get_boolean(&log_management_config, config_section->name, "http protocol versions chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_REQ_PROTO; + } + if((wblp_config->fields[j] == 
REQ_SIZE || wblp_config->fields[j] == RESP_SIZE) + && appconfig_get_boolean(&log_management_config, config_section->name, "bandwidth chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_BANDWIDTH; + } + if((wblp_config->fields[j] == REQ_PROC_TIME) + && appconfig_get_boolean(&log_management_config, config_section->name, "timings chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_REQ_PROC_TIME; + } + if((wblp_config->fields[j] == RESP_CODE) + && appconfig_get_boolean(&log_management_config, config_section->name, "response code families chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_RESP_CODE_FAMILY; + } + if((wblp_config->fields[j] == RESP_CODE) + && appconfig_get_boolean(&log_management_config, config_section->name, "response codes chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_RESP_CODE; + } + if((wblp_config->fields[j] == RESP_CODE) + && appconfig_get_boolean(&log_management_config, config_section->name, "response code types chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_RESP_CODE_TYPE; + } + if((wblp_config->fields[j] == SSL_PROTO) + && appconfig_get_boolean(&log_management_config, config_section->name, "SSL protocols chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_SSL_PROTO; + } + if((wblp_config->fields[j] == SSL_CIPHER_SUITE) + && appconfig_get_boolean(&log_management_config, config_section->name, "SSL chipher suites chart", CONFIG_BOOLEAN_NO)){ + p_file_info->parser_config->chart_config |= CHART_SSL_CIPHER; + } + } + } + else if(p_file_info->log_type == FLB_KMSG){ + Flb_kmsg_config_t *kmsg_config = callocz(1, sizeof(Flb_kmsg_config_t)); + + kmsg_config->prio_level = appconfig_get(&log_management_config, config_section->name, "prio level", "8"); + + p_file_info->flb_config = kmsg_config; + + if(appconfig_get_boolean(&log_management_config, config_section->name, "severity 
chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_SYSLOG_SEVER; + } + if(appconfig_get_boolean(&log_management_config, config_section->name, "subsystem chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_KMSG_SUBSYSTEM; + } + if(appconfig_get_boolean(&log_management_config, config_section->name, "device chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_KMSG_DEVICE; + } + } + else if(p_file_info->log_type == FLB_SYSTEMD || p_file_info->log_type == FLB_SYSLOG){ + if(p_file_info->log_type == FLB_SYSLOG){ + Syslog_parser_config_t *syslog_config = callocz(1, sizeof(Syslog_parser_config_t)); + + /* Read syslog format */ + syslog_config->log_format = appconfig_get( &log_management_config, + config_section->name, + "log format", NULL); + collector_info("[%s]: log format = %s", p_file_info->chartname, + syslog_config->log_format ? syslog_config->log_format : "NULL!"); + if(!syslog_config->log_format || !*syslog_config->log_format || !strcasecmp(syslog_config->log_format, "auto")){ + freez(syslog_config->log_format); + freez(syslog_config); + return p_file_info_destroy(p_file_info); + } + + syslog_config->socket_config = callocz(1, sizeof(Flb_socket_config_t)); + + /* Read syslog socket mode + * see also https://docs.fluentbit.io/manual/pipeline/inputs/syslog#configuration-parameters */ + syslog_config->socket_config->mode = appconfig_get( &log_management_config, + config_section->name, + "mode", "unix_udp"); + collector_info("[%s]: mode = %s", p_file_info->chartname, syslog_config->socket_config->mode); + + /* Check for valid socket path if (mode == unix_udp) or + * (mode == unix_tcp), else read syslog network interface to bind, + * if (mode == udp) or (mode == tcp). 
*/ + if( !strcasecmp(syslog_config->socket_config->mode, "unix_udp") || + !strcasecmp(syslog_config->socket_config->mode, "unix_tcp")){ + if(!p_file_info->filename || !*p_file_info->filename || !strcasecmp(p_file_info->filename, LOG_PATH_AUTO)){ + // freez(syslog_config->socket_config->mode); + freez(syslog_config->socket_config); + freez(syslog_config->log_format); + freez(syslog_config); + return p_file_info_destroy(p_file_info); + } + syslog_config->socket_config->unix_perm = appconfig_get(&log_management_config, + config_section->name, + "unix_perm", "0644"); + collector_info("[%s]: unix_perm = %s", p_file_info->chartname, syslog_config->socket_config->unix_perm); + } else if( !strcasecmp(syslog_config->socket_config->mode, "udp") || + !strcasecmp(syslog_config->socket_config->mode, "tcp")){ + // TODO: Check if listen is in valid format + syslog_config->socket_config->listen = appconfig_get( &log_management_config, + config_section->name, + "listen", "0.0.0.0"); + collector_info("[%s]: listen = %s", p_file_info->chartname, syslog_config->socket_config->listen); + syslog_config->socket_config->port = appconfig_get( &log_management_config, + config_section->name, + "port", "5140"); + collector_info("[%s]: port = %s", p_file_info->chartname, syslog_config->socket_config->port); + } else { + /* Any other modes are invalid */ + // freez(syslog_config->socket_config->mode); + freez(syslog_config->socket_config); + freez(syslog_config->log_format); + freez(syslog_config); + return p_file_info_destroy(p_file_info); + } + + p_file_info->parser_config->gen_config = syslog_config; + } + if(appconfig_get_boolean(&log_management_config, config_section->name, "priority value chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_SYSLOG_PRIOR; + } + if(appconfig_get_boolean(&log_management_config, config_section->name, "severity chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_SYSLOG_SEVER; + } + 
if(appconfig_get_boolean(&log_management_config, config_section->name, "facility chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_SYSLOG_FACIL; + } + } + else if(p_file_info->log_type == FLB_DOCKER_EV){ + if(appconfig_get_boolean(&log_management_config, config_section->name, "event type chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_DOCKER_EV_TYPE; + } + if(appconfig_get_boolean(&log_management_config, config_section->name, "event action chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_DOCKER_EV_ACTION; + } + } + else if(p_file_info->log_type == FLB_SERIAL){ + Flb_serial_config_t *serial_config = callocz(1, sizeof(Flb_serial_config_t)); + + serial_config->bitrate = appconfig_get(&log_management_config, config_section->name, "bitrate", "115200"); + serial_config->min_bytes = appconfig_get(&log_management_config, config_section->name, "min bytes", "1"); + serial_config->separator = appconfig_get(&log_management_config, config_section->name, "separator", ""); + serial_config->format = appconfig_get(&log_management_config, config_section->name, "format", ""); + + p_file_info->flb_config = serial_config; + } + else if(p_file_info->log_type == FLB_MQTT){ + Flb_socket_config_t *socket_config = callocz(1, sizeof(Flb_socket_config_t)); + + socket_config->listen = appconfig_get(&log_management_config, config_section->name, "listen", "0.0.0.0"); + socket_config->port = appconfig_get(&log_management_config, config_section->name, "port", "1883"); + + p_file_info->flb_config = socket_config; + + if(appconfig_get_boolean(&log_management_config, config_section->name, "topic chart", CONFIG_BOOLEAN_NO)) { + p_file_info->parser_config->chart_config |= CHART_MQTT_TOPIC; + } + } + + + /* ------------------------------------------------------------------------- + * Allocate p_file_info->parser_metrics memory. 
+ * ------------------------------------------------------------------------- */ + p_file_info->parser_metrics = callocz(1, sizeof(Log_parser_metrics_t)); + switch(p_file_info->log_type){ + case FLB_WEB_LOG:{ + p_file_info->parser_metrics->web_log = callocz(1, sizeof(Web_log_metrics_t)); + break; + } + case FLB_KMSG: { + p_file_info->parser_metrics->kernel = callocz(1, sizeof(Kernel_metrics_t)); + p_file_info->parser_metrics->kernel->subsystem = dictionary_create( DICT_OPTION_SINGLE_THREADED | + DICT_OPTION_NAME_LINK_DONT_CLONE | + DICT_OPTION_DONT_OVERWRITE_VALUE); + dictionary_register_conflict_callback(p_file_info->parser_metrics->kernel->subsystem, metrics_dict_conflict_cb, NULL); + p_file_info->parser_metrics->kernel->device = dictionary_create(DICT_OPTION_SINGLE_THREADED | + DICT_OPTION_NAME_LINK_DONT_CLONE | + DICT_OPTION_DONT_OVERWRITE_VALUE); + dictionary_register_conflict_callback(p_file_info->parser_metrics->kernel->device, metrics_dict_conflict_cb, NULL); + break; + } + case FLB_SYSTEMD: + case FLB_SYSLOG: { + p_file_info->parser_metrics->systemd = callocz(1, sizeof(Systemd_metrics_t)); + break; + } + case FLB_DOCKER_EV: { + p_file_info->parser_metrics->docker_ev = callocz(1, sizeof(Docker_ev_metrics_t)); + break; + } + case FLB_MQTT: { + p_file_info->parser_metrics->mqtt = callocz(1, sizeof(Mqtt_metrics_t)); + p_file_info->parser_metrics->mqtt->topic = dictionary_create( DICT_OPTION_SINGLE_THREADED | + DICT_OPTION_NAME_LINK_DONT_CLONE | + DICT_OPTION_DONT_OVERWRITE_VALUE); + dictionary_register_conflict_callback(p_file_info->parser_metrics->mqtt->topic, metrics_dict_conflict_cb, NULL); + break; + } + default: + break; + } + + + /* ------------------------------------------------------------------------- + * Configure (optional) custom charts. 
+ * ------------------------------------------------------------------------- */ + p_file_info->parser_cus_config = callocz(1, sizeof(Log_parser_cus_config_t *)); + p_file_info->parser_metrics->parser_cus = callocz(1, sizeof(Log_parser_cus_metrics_t *)); + for(int cus_off = 1; cus_off <= MAX_CUS_CHARTS_PER_SOURCE; cus_off++){ + + /* Read chart name config */ + char *cus_chart_k = mallocz(snprintf(NULL, 0, "custom %d chart", MAX_CUS_CHARTS_PER_SOURCE) + 1); + sprintf(cus_chart_k, "custom %d chart", cus_off); + char *cus_chart_v = appconfig_get(&log_management_config, config_section->name, cus_chart_k, NULL); + debug_log( "cus chart: (%s:%s)", cus_chart_k, cus_chart_v ? cus_chart_v : "NULL"); + freez(cus_chart_k); + if(unlikely(!cus_chart_v)){ + collector_error("[%s]: custom %d chart = NULL, custom charts for this log source will be disabled.", + p_file_info->chartname, cus_off); + break; + } + netdata_fix_chart_id(cus_chart_v); + + /* Read regex config */ + char *cus_regex_k = mallocz(snprintf(NULL, 0, "custom %d regex", MAX_CUS_CHARTS_PER_SOURCE) + 1); + sprintf(cus_regex_k, "custom %d regex", cus_off); + char *cus_regex_v = appconfig_get(&log_management_config, config_section->name, cus_regex_k, NULL); + debug_log( "cus regex: (%s:%s)", cus_regex_k, cus_regex_v ? cus_regex_v : "NULL"); + freez(cus_regex_k); + if(unlikely(!cus_regex_v)) { + collector_error("[%s]: custom %d regex = NULL, custom charts for this log source will be disabled.", + p_file_info->chartname, cus_off); + freez(cus_chart_v); + break; + } + + /* Read regex name config */ + char *cus_regex_name_k = mallocz(snprintf(NULL, 0, "custom %d regex name", MAX_CUS_CHARTS_PER_SOURCE) + 1); + sprintf(cus_regex_name_k, "custom %d regex name", cus_off); + char *cus_regex_name_v = appconfig_get( &log_management_config, config_section->name, + cus_regex_name_k, cus_regex_v); + debug_log( "cus regex name: (%s:%s)", cus_regex_name_k, cus_regex_name_v ? 
cus_regex_name_v : "NULL"); + freez(cus_regex_name_k); + m_assert(cus_regex_name_v, "cus_regex_name_v cannot be NULL, should be cus_regex_v"); + + + /* Escape any backslashes in the regex name, to ensure dimension is displayed correctly in charts */ + int regex_name_bslashes = 0; + char **p_regex_name = &cus_regex_name_v; + for(char *p = *p_regex_name; *p; p++) if(unlikely(*p == '\\')) regex_name_bslashes++; + if(regex_name_bslashes) { + *p_regex_name = reallocz(*p_regex_name, strlen(*p_regex_name) + 1 + regex_name_bslashes); + for(char *p = *p_regex_name; *p; p++){ + if(unlikely(*p == '\\')){ + memmove(p + 1, p, strlen(p) + 1); + *p++ = '\\'; + } + } + } + + /* Read ignore case config */ + char *cus_ignore_case_k = mallocz(snprintf(NULL, 0, "custom %d ignore case", MAX_CUS_CHARTS_PER_SOURCE) + 1); + sprintf(cus_ignore_case_k, "custom %d ignore case", cus_off); + int cus_ignore_case_v = appconfig_get_boolean( &log_management_config, + config_section->name, cus_ignore_case_k, CONFIG_BOOLEAN_YES); + debug_log( "cus case: (%s:%s)", cus_ignore_case_k, cus_ignore_case_v ? "yes" : "no"); + freez(cus_ignore_case_k); + + int regex_flags = cus_ignore_case_v ? REG_EXTENDED | REG_NEWLINE | REG_ICASE : REG_EXTENDED | REG_NEWLINE; + + int rc; + regex_t regex; + if (unlikely((rc = regcomp(&regex, cus_regex_v, regex_flags)))){ + size_t regcomp_err_str_size = regerror(rc, &regex, 0, 0); + char *regcomp_err_str = mallocz(regcomp_err_str_size); + regerror(rc, &regex, regcomp_err_str, regcomp_err_str_size); + collector_error("[%s]: could not compile regex for custom %d chart: %s due to error: %s. 
" + "Custom charts for this log source will be disabled.", + p_file_info->chartname, cus_off, cus_chart_v, regcomp_err_str); + freez(regcomp_err_str); + freez(cus_chart_v); + freez(cus_regex_v); + freez(cus_regex_name_v); + break; + }; + + /* Allocate memory and copy config to p_file_info->parser_cus_config struct */ + p_file_info->parser_cus_config = reallocz( p_file_info->parser_cus_config, + (cus_off + 1) * sizeof(Log_parser_cus_config_t *)); + p_file_info->parser_cus_config[cus_off - 1] = callocz(1, sizeof(Log_parser_cus_config_t)); + + p_file_info->parser_cus_config[cus_off - 1]->chartname = cus_chart_v; + p_file_info->parser_cus_config[cus_off - 1]->regex_str = cus_regex_v; + p_file_info->parser_cus_config[cus_off - 1]->regex_name = cus_regex_name_v; + p_file_info->parser_cus_config[cus_off - 1]->regex = regex; + + /* Initialise custom log parser metrics struct array */ + p_file_info->parser_metrics->parser_cus = reallocz( p_file_info->parser_metrics->parser_cus, + (cus_off + 1) * sizeof(Log_parser_cus_metrics_t *)); + p_file_info->parser_metrics->parser_cus[cus_off - 1] = callocz(1, sizeof(Log_parser_cus_metrics_t)); + + + p_file_info->parser_cus_config[cus_off] = NULL; + p_file_info->parser_metrics->parser_cus[cus_off] = NULL; + } + + + /* ------------------------------------------------------------------------- + * Configure (optional) Fluent Bit outputs. 
+ * ------------------------------------------------------------------------- */ + + Flb_output_config_t **output_next_p = &p_file_info->flb_outputs; + for(int out_off = 1; out_off <= MAX_OUTPUTS_PER_SOURCE; out_off++){ + + /* Read output plugin */ + char *out_plugin_k = callocz(1, snprintf(NULL, 0, "output %d " FLB_OUTPUT_PLUGIN_NAME_KEY, MAX_OUTPUTS_PER_SOURCE) + 1); + sprintf(out_plugin_k, "output %d " FLB_OUTPUT_PLUGIN_NAME_KEY, out_off); + char *out_plugin_v = appconfig_get(&log_management_config, config_section->name, out_plugin_k, NULL); + debug_log( "output %d "FLB_OUTPUT_PLUGIN_NAME_KEY": %s", out_off, out_plugin_v ? out_plugin_v : "NULL"); + freez(out_plugin_k); + if(unlikely(!out_plugin_v)){ + collector_error("[%s]: output %d "FLB_OUTPUT_PLUGIN_NAME_KEY" = NULL, outputs for this log source will be disabled.", + p_file_info->chartname, out_off); + break; + } + + Flb_output_config_t *output = callocz(1, sizeof(Flb_output_config_t)); + output->id = out_off; + output->plugin = out_plugin_v; + + /* Read parameters for this output */ + avl_traverse_lock(&config_section->values_index, flb_output_param_get_cb, output); + + *output_next_p = output; + output_next_p = &output->next; + } + + + /* ------------------------------------------------------------------------- + * Read circular buffer configuration and initialize the buffer. + * ------------------------------------------------------------------------- */ + size_t circular_buffer_max_size = ((size_t)appconfig_get_number(&log_management_config, + config_section->name, + "circular buffer max size MiB", + g_logs_manag_config.circ_buff_max_size_in_mib)) MiB; + if(circular_buffer_max_size > CIRCULAR_BUFF_MAX_SIZE_RANGE_MAX) { + circular_buffer_max_size = CIRCULAR_BUFF_MAX_SIZE_RANGE_MAX; + collector_info( "[%s]: circular buffer max size out of range. 
Using maximum permitted value (MiB): %zu", + p_file_info->chartname, (size_t) (circular_buffer_max_size / (1 MiB))); + } else if(circular_buffer_max_size < CIRCULAR_BUFF_MAX_SIZE_RANGE_MIN) { + circular_buffer_max_size = CIRCULAR_BUFF_MAX_SIZE_RANGE_MIN; + collector_info( "[%s]: circular buffer max size out of range. Using minimum permitted value (MiB): %zu", + p_file_info->chartname, (size_t) (circular_buffer_max_size / (1 MiB))); + } + collector_info("[%s]: circular buffer max size MiB = %zu", p_file_info->chartname, (size_t) (circular_buffer_max_size / (1 MiB))); + + int circular_buffer_allow_dropped_logs = appconfig_get_boolean( &log_management_config, + config_section->name, + "circular buffer drop logs if full", + g_logs_manag_config.circ_buff_drop_logs); + collector_info("[%s]: circular buffer drop logs if full = %s", p_file_info->chartname, + circular_buffer_allow_dropped_logs ? "yes" : "no"); + + p_file_info->circ_buff = circ_buff_init(p_file_info->buff_flush_to_db_interval, + circular_buffer_max_size, + circular_buffer_allow_dropped_logs); + + + /* ------------------------------------------------------------------------- + * Initialize rrd related structures. + * ------------------------------------------------------------------------- */ + p_file_info->chart_meta = callocz(1, sizeof(struct Chart_meta)); + memcpy(p_file_info->chart_meta, &chart_types[p_file_info->log_type], sizeof(struct Chart_meta)); + p_file_info->chart_meta->base_prio = NETDATA_CHART_PRIO_LOGS_BASE + p_file_infos_arr->count * NETDATA_CHART_PRIO_LOGS_INCR; + netdata_mutex_lock(stdout_mut); + p_file_info->chart_meta->init(p_file_info); + fflush(stdout); + netdata_mutex_unlock(stdout_mut); + + /* ------------------------------------------------------------------------- + * Initialize input plugin for local log sources. 
+ * ------------------------------------------------------------------------- */ + if(p_file_info->log_source == LOG_SOURCE_LOCAL){ + int rc = flb_add_input(p_file_info); + if(unlikely(rc)){ + collector_error("[%s]: flb_add_input() error: %d", p_file_info->chartname, rc); + return p_file_info_destroy(p_file_info); + } + } + + /* flb_complete_item_timer_timeout_cb() is needed for both local and + * non-local sources. */ + p_file_info->flb_tmp_buff_cpy_timer.data = p_file_info; + if(unlikely(0 != uv_mutex_init(&p_file_info->flb_tmp_buff_mut))) + fatal("uv_mutex_init(&p_file_info->flb_tmp_buff_mut) failed"); + + fatal_assert(0 == uv_timer_init( main_loop, + &p_file_info->flb_tmp_buff_cpy_timer)); + + fatal_assert(0 == uv_timer_start( &p_file_info->flb_tmp_buff_cpy_timer, + flb_complete_item_timer_timeout_cb, 0, + p_file_info->update_timeout * MSEC_PER_SEC)); + + + /* ------------------------------------------------------------------------- + * All set up successfully - add p_file_info to list of all p_file_info structs. 
+ * ------------------------------------------------------------------------- */ + p_file_infos_arr->data = reallocz(p_file_infos_arr->data, (++p_file_infos_arr->count) * (sizeof p_file_info)); + p_file_infos_arr->data[p_file_infos_arr->count - 1] = p_file_info; + + __atomic_store_n(&p_file_info->state, LOG_SRC_READY, __ATOMIC_RELAXED); + + collector_info("[%s]: initialization completed", p_file_info->chartname); +} + +void config_file_load( uv_loop_t *main_loop, + Flb_socket_config_t *p_forward_in_config, + flb_srvc_config_t *p_flb_srvc_config, + netdata_mutex_t *stdout_mut){ + + int user_default_conf_found = 0; + + struct section *config_section; + + char tmp_name[FILENAME_MAX + 1]; + snprintfz(tmp_name, FILENAME_MAX, "%s/logsmanagement.d", get_user_config_dir()); + DIR *dir = opendir(tmp_name); + + if(dir){ + struct dirent *de = NULL; + while ((de = readdir(dir))) { + size_t d_name_len = strlen(de->d_name); + if (de->d_type == DT_DIR || d_name_len < 6 || strncmp(&de->d_name[d_name_len - 5], ".conf", sizeof(".conf"))) + continue; + + if(!user_default_conf_found && !strncmp(de->d_name, "default.conf", sizeof("default.conf"))) + user_default_conf_found = 1; + + snprintfz(tmp_name, FILENAME_MAX, "%s/logsmanagement.d/%s", get_user_config_dir(), de->d_name); + collector_info("loading config:%s", tmp_name); + log_management_config = (struct config){ + .first_section = NULL, + .last_section = NULL, + .mutex = NETDATA_MUTEX_INITIALIZER, + .index = { + .avl_tree = { + .root = NULL, + .compar = appconfig_section_compare + }, + .rwlock = AVL_LOCK_INITIALIZER + } + }; + if(!appconfig_load(&log_management_config, tmp_name, 0, NULL)) + continue; + + config_section = log_management_config.first_section; + do { + config_section_init(main_loop, config_section, p_forward_in_config, p_flb_srvc_config, stdout_mut); + config_section = config_section->next; + } while(config_section); + + } + closedir(dir); + } + + if(!user_default_conf_found){ + collector_info("CONFIG: cannot load 
user config '%s/logsmanagement.d/default.conf'. Will try stock config.", get_user_config_dir()); + snprintfz(tmp_name, FILENAME_MAX, "%s/logsmanagement.d/default.conf", get_stock_config_dir()); + if(!appconfig_load(&log_management_config, tmp_name, 0, NULL)){ + collector_error("CONFIG: cannot load stock config '%s/logsmanagement.d/default.conf'. Logs management will be disabled.", get_stock_config_dir()); + exit(1); + } + + config_section = log_management_config.first_section; + do { + config_section_init(main_loop, config_section, p_forward_in_config, p_flb_srvc_config, stdout_mut); + config_section = config_section->next; + } while(config_section); + } +} diff --git a/logsmanagement/logsmanag_config.h b/logsmanagement/logsmanag_config.h new file mode 100644 index 00000000..34693922 --- /dev/null +++ b/logsmanagement/logsmanag_config.h @@ -0,0 +1,31 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file logsmanag_config.h + * @brief Header of logsmanag_config.c + */ + +#include "file_info.h" +#include "flb_plugin.h" + +char *get_user_config_dir(void); + +char *get_stock_config_dir(void); + +char *get_log_dir(void); + +char *get_cache_dir(void); + +void p_file_info_destroy_all(void); + +#define LOGS_MANAG_CONFIG_LOAD_ERROR_OK 0 +#define LOGS_MANAG_CONFIG_LOAD_ERROR_NO_STOCK_CONFIG -1 +#define LOGS_MANAG_CONFIG_LOAD_ERROR_P_FLB_SRVC_NULL -2 + +int logs_manag_config_load( flb_srvc_config_t *p_flb_srvc_config, + Flb_socket_config_t **forward_in_config_p, + int g_update_every); + +void config_file_load( uv_loop_t *main_loop, + Flb_socket_config_t *p_forward_in_config, + flb_srvc_config_t *p_flb_srvc_config, + netdata_mutex_t *stdout_mut);
\ No newline at end of file diff --git a/logsmanagement/logsmanagement.c b/logsmanagement/logsmanagement.c new file mode 100644 index 00000000..05c18d34 --- /dev/null +++ b/logsmanagement/logsmanagement.c @@ -0,0 +1,252 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file logsmanagement.c + * @brief This is the main file of the Netdata logs management project + * + * The aim of the project is to add the capability to collect, parse and + * query logs in the Netdata agent. For more information please refer + * to the project's [README](README.md) file. + */ + +#include <uv.h> +#include "daemon/common.h" +#include "db_api.h" +#include "file_info.h" +#include "flb_plugin.h" +#include "functions.h" +#include "helper.h" +#include "libnetdata/required_dummies.h" +#include "logsmanag_config.h" +#include "rrd_api/rrd_api_stats.h" + +#if defined(ENABLE_LOGSMANAGEMENT_TESTS) +#include "logsmanagement/unit_test/unit_test.h" +#endif + +netdata_mutex_t stdout_mut = NETDATA_MUTEX_INITIALIZER; + +bool logsmanagement_should_exit = false; + +struct File_infos_arr *p_file_infos_arr = NULL; + +static uv_loop_t *main_loop; + +static struct { + uv_signal_t sig; + const int signum; +} signals[] = { + // Add here signals that will terminate the plugin + {.signum = SIGINT}, + {.signum = SIGQUIT}, + {.signum = SIGPIPE}, + {.signum = SIGTERM} +}; + +static void signal_handler(uv_signal_t *handle, int signum __maybe_unused) { + UNUSED(handle); + + debug_log("Signal received: %d\n", signum); + + __atomic_store_n(&logsmanagement_should_exit, true, __ATOMIC_RELAXED); + +} + +static void on_walk_cleanup(uv_handle_t* handle, void* data){ + UNUSED(data); + if (!uv_is_closing(handle)) + uv_close(handle, NULL); +} + +/** + * @brief The main function of the logs management plugin. + * @details Any static asserts are most likely going to be included here. After + * any initialisation routines, the default uv_loop_t is executed indefinitely. 
+ */ +int main(int argc, char **argv) { + + /* Static asserts */ + #pragma GCC diagnostic push + #pragma GCC diagnostic ignored "-Wunused-local-typedefs" + COMPILE_TIME_ASSERT(SAVE_BLOB_TO_DB_MIN <= SAVE_BLOB_TO_DB_MAX); + COMPILE_TIME_ASSERT(CIRCULAR_BUFF_DEFAULT_MAX_SIZE >= CIRCULAR_BUFF_MAX_SIZE_RANGE_MIN); + COMPILE_TIME_ASSERT(CIRCULAR_BUFF_DEFAULT_MAX_SIZE <= CIRCULAR_BUFF_MAX_SIZE_RANGE_MAX); + #pragma GCC diagnostic pop + + clocks_init(); + + program_name = LOGS_MANAGEMENT_PLUGIN_STR; + + nd_log_initialize_for_external_plugins(program_name); + + // netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX"); + // if(verify_netdata_host_prefix(true) == -1) exit(1); + + int g_update_every = 0; + for(int i = 1; i < argc ; i++) { + if(isdigit(*argv[i]) && !g_update_every && str2i(argv[i]) > 0 && str2i(argv[i]) < 86400) { + g_update_every = str2i(argv[i]); + debug_log("new update_every received: %d", g_update_every); + } + else if(!strcmp("--unittest", argv[i])) { +#if defined(ENABLE_LOGSMANAGEMENT_TESTS) + exit(logs_management_unittest()); +#else + collector_error("%s was not built with unit test support.", program_name); +#endif + } + else if(!strcmp("version", argv[i]) || + !strcmp("-version", argv[i]) || + !strcmp("--version", argv[i]) || + !strcmp("-v", argv[i]) || + !strcmp("-V", argv[i])) { + printf(VERSION"\n"); + exit(0); + } + else if(!strcmp("-h", argv[i]) || + !strcmp("--help", argv[i])) { + fprintf(stderr, + "\n" + " netdata %s %s\n" + " Copyright (C) 2023 Netdata Inc.\n" + " Released under GNU General Public License v3 or later.\n" + " All rights reserved.\n" + "\n" + " This program is the logs management plugin for netdata.\n" + "\n" + " Available command line options:\n" + "\n" + " --unittest run unit tests and exit\n" + "\n" + " -v\n" + " -V\n" + " --version print version and exit\n" + "\n" + " -h\n" + " --help print this message and exit\n" + "\n" + " For more information:\n" + " 
https://github.com/netdata/netdata/tree/master/collectors/logs-management.plugin\n" + "\n", + program_name, + VERSION + ); + exit(1); + } + else + collector_error("%s(): ignoring parameter '%s'", __FUNCTION__, argv[i]); + } + + Flb_socket_config_t *p_forward_in_config = NULL; + + main_loop = mallocz(sizeof(uv_loop_t)); + fatal_assert(uv_loop_init(main_loop) == 0); + + flb_srvc_config_t flb_srvc_config = { + .flush = FLB_FLUSH_DEFAULT, + .http_listen = FLB_HTTP_LISTEN_DEFAULT, + .http_port = FLB_HTTP_PORT_DEFAULT, + .http_server = FLB_HTTP_SERVER_DEFAULT, + .log_path = "NULL", + .log_level = FLB_LOG_LEVEL_DEFAULT, + .coro_stack_size = FLB_CORO_STACK_SIZE_DEFAULT + }; + + p_file_infos_arr = callocz(1, sizeof(struct File_infos_arr)); + + if(logs_manag_config_load(&flb_srvc_config, &p_forward_in_config, g_update_every)) + exit(1); + + if(flb_init(flb_srvc_config, get_stock_config_dir(), g_logs_manag_config.sd_journal_field_prefix)){ + collector_error("flb_init() failed - logs management will be disabled"); + exit(1); + } + + if(flb_add_fwd_input(p_forward_in_config)) + collector_error("flb_add_fwd_input() failed - logs management forward input will be disabled"); + + /* Initialize logs management for each configuration section */ + config_file_load(main_loop, p_forward_in_config, &flb_srvc_config, &stdout_mut); + + if(p_file_infos_arr->count == 0){ + collector_info("No valid configuration could be found for any log source - logs management will be disabled"); + exit(1); + } + + /* Run Fluent Bit engine + * NOTE: flb_run() ideally would be executed after db_init(), but in case of + * a db_init() failure, it is easier to call flb_stop_and_cleanup() rather + * than the other way round (i.e. cleaning up after db_init(), if flb_run() + * fails). 
*/ + if(flb_run()){ + collector_error("flb_run() failed - logs management will be disabled"); + exit(1); + } + + if(db_init()){ + collector_error("db_init() failed - logs management will be disabled"); + exit(1); + } + + uv_thread_t *p_stats_charts_thread_id = NULL; + const char *const netdata_internals_monitoring = getenv("NETDATA_INTERNALS_MONITORING"); + if( netdata_internals_monitoring && + *netdata_internals_monitoring && + strcmp(netdata_internals_monitoring, "YES") == 0){ + + p_stats_charts_thread_id = mallocz(sizeof(uv_thread_t)); + fatal_assert(0 == uv_thread_create(p_stats_charts_thread_id, stats_charts_init, &stdout_mut)); + } + +#if defined(__STDC_VERSION__) + debug_log( "__STDC_VERSION__: %ld", __STDC_VERSION__); +#else + debug_log( "__STDC_VERSION__ undefined"); +#endif // defined(__STDC_VERSION__) + debug_log( "libuv version: %s", uv_version_string()); + debug_log( "LZ4 version: %s", LZ4_versionString()); + debug_log( "SQLITE version: " SQLITE_VERSION); + + for(int i = 0; i < (int) (sizeof(signals) / sizeof(signals[0])); i++){ + uv_signal_init(main_loop, &signals[i].sig); + uv_signal_start(&signals[i].sig, signal_handler, signals[i].signum); + } + + struct functions_evloop_globals *wg = logsmanagement_func_facets_init(&logsmanagement_should_exit); + + collector_info("%s setup completed successfully", program_name); + + /* Run uvlib loop. 
*/ + while(!__atomic_load_n(&logsmanagement_should_exit, __ATOMIC_RELAXED)) + uv_run(main_loop, UV_RUN_ONCE); + + /* If there are valid log sources, there should always be valid handles */ + collector_info("uv_run(main_loop, ...); no handles or requests - cleaning up..."); + + nd_log_limits_unlimited(); + + // TODO: Clean up stats charts memory + if(p_stats_charts_thread_id){ + uv_thread_join(p_stats_charts_thread_id); + freez(p_stats_charts_thread_id); + } + + uv_stop(main_loop); + + flb_terminate(); + + flb_free_fwd_input_out_cb(); + + p_file_info_destroy_all(); + + uv_walk(main_loop, on_walk_cleanup, NULL); + while(0 != uv_run(main_loop, UV_RUN_ONCE)); + if(uv_loop_close(main_loop)) + m_assert(0, "uv_loop_close() result not 0"); + freez(main_loop); + + functions_evloop_cancel_threads(wg); + + collector_info("logs management clean up done - exiting"); + + exit(0); +} diff --git a/logsmanagement/parser.c b/logsmanagement/parser.c new file mode 100644 index 00000000..272352bb --- /dev/null +++ b/logsmanagement/parser.c @@ -0,0 +1,1500 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file parser.c + * @brief API to parse and search logs + */ + +#if !defined(_XOPEN_SOURCE) && !defined(__DARWIN__) && !defined(__APPLE__) && !defined(__FreeBSD__) +/* _XOPEN_SOURCE 700 required by strptime (POSIX 2004) and strndup (POSIX 2008) + * Will need to find a cleaner way of doing this, as currently defining + * _XOPEN_SOURCE 700 can cause issues on Centos 7, MacOS and FreeBSD too. */ +#define _XOPEN_SOURCE 700 +/* _BSD_SOURCE (glibc <= 2.19) and _DEFAULT_SOURCE (glibc >= 2.20) are required + * to silence "warning: implicit declaration of function ‘strsep’;" that is + * included through libnetdata/inlined.h. 
*/ +#define _BSD_SOURCE +#define _DEFAULT_SOURCE +#include <time.h> +#endif + +#include "parser.h" +#include "helper.h" +#include <stdio.h> +#include <sys/resource.h> +#include <math.h> +#include <string.h> + +static regex_t vhost_regex, req_client_regex, cipher_suite_regex; + +const char* const csv_auto_format_guess_matrix[] = { + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time", // csvVhostCustom4 + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - - $request_length $request_time", // csvVhostCustom3 + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - -", // csvVhostCombined + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent $request_length $request_time $upstream_response_time", // csvVhostCustom2 + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent $request_length $request_time", // csvVhostCustom1 + "$host:$server_port $remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent", // csvVhostCommon + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time", // csvCustom4 + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - - $request_length $request_time", // csvCustom3 + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent - -", // csvCombined + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent $request_length $request_time $upstream_response_time", // csvCustom2 + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent $request_length $request_time", // csvCustom1 + "$remote_addr - - [$time_local] \"$request\" $status $body_bytes_sent", // csvCommon + NULL} +; + +UNIT_STATIC int count_fields(const char *line, const char delimiter){ + const char *ptr; + 
int cnt, fQuote; + + for (cnt = 1, fQuote = 0, ptr = line; *ptr != '\n' && *ptr != '\r' && *ptr != '\0'; ptr++ ){ + if (fQuote) { + if (*ptr == '\"') { + if ( ptr[1] == '\"' ) { + ptr++; + continue; + } + fQuote = 0; + } + continue; + } + + if(*ptr == '\"'){ + fQuote = 1; + continue; + } + if(*ptr == delimiter){ + cnt++; + while(*(ptr+1) == delimiter){ ptr++;}; + continue; + } + } + + if (fQuote) { + return -1; + } + + return cnt; +} + +/** + * @brief Parse a delimited string into an array of strings. + * @details Given a string containing no linebreaks, or containing line breaks + * which are escaped by "double quotes", extract a NULL-terminated + * array of strings, one for every delimiter-separated value in the row. + * @param[in] line The input string to be parsed. + * @param[in] delimiter The delimiter to be used to split the string. + * @param[in] num_fields The expected number of fields in \p line. If a negative + * number is provided, they will be counted. + * @return A NULL-terminated array of strings with the delimited values in \p line, + * or NULL in any other case. + * @todo This function has not been benchmarked or optimised. 
+ */ +static inline char **parse_csv( const char *line, const char delimiter, int num_fields) { + char **buf, **bptr, *tmp, *tptr; + const char *ptr; + int fQuote, fEnd; + + if(num_fields < 0){ + num_fields = count_fields(line, delimiter); + + if ( num_fields == -1 ) { + return NULL; + } + } + + buf = mallocz( sizeof(char*) * (num_fields+1) ); + + tmp = mallocz( strlen(line) + 1 ); + + bptr = buf; + + for ( ptr = line, fQuote = 0, *tmp = '\0', tptr = tmp, fEnd = 0; ; ptr++ ) { + if ( fQuote ) { + if ( !*ptr ) { + break; + } + + if ( *ptr == '\"' ) { + if ( ptr[1] == '\"' ) { + *tptr++ = '\"'; + ptr++; + continue; + } + fQuote = 0; + } + else { + *tptr++ = *ptr; + } + + continue; + } + + + if(*ptr == '\"'){ + fQuote = 1; + continue; + } + else if(*ptr == '\0'){ + fEnd = 1; + *tptr = '\0'; + *bptr = strdupz( tmp ); + + if ( !*bptr ) { + for ( bptr--; bptr >= buf; bptr-- ) { + freez( *bptr ); + } + freez( buf ); + freez( tmp ); + + return NULL; + } + + bptr++; + tptr = tmp; + break; + } + else if(*ptr == delimiter){ + *tptr = '\0'; + *bptr = strdupz( tmp ); + + if ( !*bptr ) { + for ( bptr--; bptr >= buf; bptr-- ) { + freez( *bptr ); + } + freez( buf ); + freez( tmp ); + + return NULL; + } + + bptr++; + tptr = tmp; + + continue; + } + else{ + *tptr++ = *ptr; + continue; + } + + if ( fEnd ) { + break; + } + } + + *bptr = NULL; + freez( tmp ); + return buf; +} + +/** + * @brief Search a buffer for a keyword (or regular expression) + * @details Search the source buffer for a keyword (or regular expression) and + * copy matches to the destination buffer. + * @param[in] src The source buffer to be searched + * @param[in] src_sz Size of \p src + * @param[in, out] dest The destination buffer where the results will be + * written out to. If NULL, the results will just be discarded. 
+ * @param[out] dest_sz Size of \p dest + * @param[in] keyword The keyword or pattern to be searched in the src buffer + * @param[in] regex The precompiled regular expression to be search in the + * src buffer. If NULL, \p keyword will be used instead. + * @param[in] ignore_case Perform case insensitive search if 1. + * @return Number of matches, or -1 in case of error + */ +int search_keyword( char *src, size_t src_sz __maybe_unused, + char *dest, size_t *dest_sz, + const char *keyword, regex_t *regex, + const int ignore_case){ + + m_assert(src[src_sz - 1] == '\0', "src[src_sz - 1] should be '\\0' but it's not"); + m_assert((dest && dest_sz) || (!dest && !dest_sz), "either both dest and dest_sz exist, or none does"); + + if(unlikely(dest && !dest_sz)) + return -1; + + regex_t regex_compiled; + + if(regex) + regex_compiled = *regex; + else{ + char regexString[MAX_REGEX_SIZE]; + const int regex_flags = ignore_case ? REG_EXTENDED | REG_NEWLINE | REG_ICASE : REG_EXTENDED | REG_NEWLINE; + snprintf(regexString, MAX_REGEX_SIZE, ".*(%s).*", keyword); + int rc; + if (unlikely((rc = regcomp(&regex_compiled, regexString, regex_flags)))){ + size_t regcomp_err_str_size = regerror(rc, &regex_compiled, 0, 0); + char *regcomp_err_str = mallocz(regcomp_err_str_size); + regerror(rc, &regex_compiled, regcomp_err_str, regcomp_err_str_size); + fatal("Could not compile regular expression:%.*s, error: %s", (int) MAX_REGEX_SIZE, regexString, regcomp_err_str); + freez(regcomp_err_str); + }; + } + + regmatch_t groupArray[1]; + int matches = 0; + char *cursor = src; + + if(dest_sz) + *dest_sz = 0; + + for ( ; ; matches++){ + if (regexec(&regex_compiled, cursor, 1, groupArray, REG_NOTBOL | REG_NOTEOL)) + break; // No more matches + if (groupArray[0].rm_so == -1) + break; // No more groups + + size_t match_len = (size_t) (groupArray[0].rm_eo - groupArray[0].rm_so); + + // debug_log( "Match %d [%2d-%2d]:%.*s\n", matches, groupArray[0].rm_so, + // groupArray[0].rm_eo, (int) match_len, cursor + 
groupArray[0].rm_so); + + if(dest && dest_sz){ + memcpy( &dest[*dest_sz], cursor + groupArray[0].rm_so, match_len); + *dest_sz += match_len + 1; + dest[*dest_sz - 1] = '\n'; + } + + cursor += groupArray[0].rm_eo; + } + + if(!regex) + regfree(&regex_compiled); + + return matches; +} + +/** + * @brief Extract web log parser configuration from string + * @param[in] log_format String that describes the log format + * @param[in] delimiter Delimiter to be used when parsing a CSV log format + * @return Pointer to struct that contains the extracted log format + * configuration or NULL if no fields found in log_format. + */ +Web_log_parser_config_t *read_web_log_parser_config(const char *log_format, const char delimiter){ + int num_fields = count_fields(log_format, delimiter); + if(num_fields <= 0) return NULL; + + /* If first execution of this function, initialise regexs */ + static int regexs_initialised = 0; + + // TODO: Tests needed for following regexs. + if(!regexs_initialised){ + assert(regcomp(&vhost_regex, "^[a-zA-Z0-9:.-]+$", REG_NOSUB | REG_EXTENDED) == 0); + assert(regcomp(&req_client_regex, "^([0-9a-f:.]+|localhost)$", REG_NOSUB | REG_EXTENDED) == 0); + assert(regcomp(&cipher_suite_regex, "^[A-Z0-9_-]+$", REG_NOSUB | REG_EXTENDED) == 0); + regexs_initialised = 1; + } + + Web_log_parser_config_t *wblp_config = callocz(1, sizeof(Web_log_parser_config_t)); + wblp_config->num_fields = num_fields; + wblp_config->delimiter = delimiter; + + char **parsed_format = parse_csv(log_format, delimiter, num_fields); // parsed_format is NULL-terminated + wblp_config->fields = callocz(num_fields, sizeof(web_log_line_field_t)); + unsigned int fields_off = 0; + + for(int i = 0; i < num_fields; i++ ){ + + if(strcmp(parsed_format[i], "$host:$server_port") == 0 || + strcmp(parsed_format[i], "%v:%p") == 0) { + wblp_config->fields[fields_off++] = VHOST_WITH_PORT; + continue; + } + + if(strcmp(parsed_format[i], "$host") == 0 || + strcmp(parsed_format[i], "$http_host") == 0 || + 
strcmp(parsed_format[i], "%v") == 0) { + wblp_config->fields[fields_off++] = VHOST; + continue; + } + + if(strcmp(parsed_format[i], "$server_port") == 0 || + strcmp(parsed_format[i], "%p") == 0) { + wblp_config->fields[fields_off++] = PORT; + continue; + } + + if(strcmp(parsed_format[i], "$scheme") == 0) { + wblp_config->fields[fields_off++] = REQ_SCHEME; + continue; + } + + if(strcmp(parsed_format[i], "$remote_addr") == 0 || + strcmp(parsed_format[i], "%a") == 0 || + strcmp(parsed_format[i], "%h") == 0) { + wblp_config->fields[fields_off++] = REQ_CLIENT; + continue; + } + + if(strcmp(parsed_format[i], "$request") == 0 || + strcmp(parsed_format[i], "%r") == 0) { + wblp_config->fields[fields_off++] = REQ; + continue; + } + + if(strcmp(parsed_format[i], "$request_method") == 0 || + strcmp(parsed_format[i], "%m") == 0) { + wblp_config->fields[fields_off++] = REQ_METHOD; + continue; + } + + if(strcmp(parsed_format[i], "$request_uri") == 0 || + strcmp(parsed_format[i], "%U") == 0) { + wblp_config->fields[fields_off++] = REQ_URL; + continue; + } + + if(strcmp(parsed_format[i], "$server_protocol") == 0 || + strcmp(parsed_format[i], "%H") == 0) { + wblp_config->fields[fields_off++] = REQ_PROTO; + continue; + } + + if(strcmp(parsed_format[i], "$request_length") == 0 || + strcmp(parsed_format[i], "%I") == 0) { + wblp_config->fields[fields_off++] = REQ_SIZE; + continue; + } + + if(strcmp(parsed_format[i], "$request_time") == 0 || + strcmp(parsed_format[i], "%D") == 0) { + wblp_config->fields[fields_off++] = REQ_PROC_TIME; + continue; + } + + if(strcmp(parsed_format[i], "$status") == 0 || + strcmp(parsed_format[i], "%>s") == 0 || + strcmp(parsed_format[i], "%s") == 0) { + wblp_config->fields[fields_off++] = RESP_CODE; + continue; + } + + if(strcmp(parsed_format[i], "$bytes_sent") == 0 || + strcmp(parsed_format[i], "$body_bytes_sent") == 0 || + strcmp(parsed_format[i], "%b") == 0 || + strcmp(parsed_format[i], "%O") == 0 || + strcmp(parsed_format[i], "%B") == 0) { + 
wblp_config->fields[fields_off++] = RESP_SIZE; + continue; + } + + if(strcmp(parsed_format[i], "$upstream_response_time") == 0) { + wblp_config->fields[fields_off++] = UPS_RESP_TIME; + continue; + } + + if(strcmp(parsed_format[i], "$ssl_protocol") == 0) { + wblp_config->fields[fields_off++] = SSL_PROTO; + continue; + } + + if(strcmp(parsed_format[i], "$ssl_cipher") == 0) { + wblp_config->fields[fields_off++] = SSL_CIPHER_SUITE; + continue; + } + + if(strcmp(parsed_format[i], "$time_local") == 0 || strcmp(parsed_format[i], "[$time_local]") == 0 || + strcmp(parsed_format[i], "%t") == 0 || strcmp(parsed_format[i], "[%t]") == 0) { + wblp_config->fields = reallocz(wblp_config->fields, (num_fields + 1) * sizeof(web_log_line_field_t)); + wblp_config->fields[fields_off++] = TIME; + wblp_config->fields[fields_off++] = TIME; // TIME takes 2 fields + wblp_config->num_fields++; // TIME takes 2 fields + continue; + } + + wblp_config->fields[fields_off++] = CUSTOM; + + } + + for(int i = 0; parsed_format[i] != NULL; i++) + freez(parsed_format[i]); + + freez(parsed_format); + return wblp_config; +} + +/** + * @brief Parse a web log line to extract individual fields. + * @param[in] wblp_config Configuration that specifies how to parse the line. + * @param[in] line Web log record to be parsed. '\n', '\r' or '\0' terminated. + * @param[out] log_line_parsed Struct that stores the results of parsing. 
+ */ +void parse_web_log_line(const Web_log_parser_config_t *wblp_config, + char *line, size_t line_len, + Log_line_parsed_t *log_line_parsed){ + + /* Read parsing configuration */ + web_log_line_field_t *fields_format = wblp_config->fields; + const int num_fields_config = wblp_config->num_fields; + const char delimiter = wblp_config->delimiter; + const int verify = wblp_config->verify_parsed_logs; + + /* Consume new lines and spaces at end of line */ + for(; line[line_len-1] == '\n' || line[line_len-1] == '\r' || line[line_len-1] == ' '; line_len--); + + char *field = line; + char *offset = line; + size_t field_size = 0; + + for(int i = 0; i < num_fields_config; i++ ){ + + /* Consume double quotes and extra delimiters at beginning of field */ + while(*field == '"' || *field == delimiter) field++, offset++; + + /* Find offset boundaries of next field in line */ + while(((size_t)(offset - line) < line_len) && *offset != delimiter) offset++; + + if(unlikely(*(offset - 1) == '"')) offset--; + + field_size = (size_t) (offset - field); + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Field[%d]:%.*s", i, (int)field_size, field); + #endif + + if(fields_format[i] == CUSTOM){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: CUSTOM or UNKNOWN):%.*s", i, (int)field_size, field); + #endif + goto next_item; + } + + + char *port = field; + size_t port_size = 0; + size_t vhost_size = 0; + + if(fields_format[i] == VHOST_WITH_PORT){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: VHOST_WITH_PORT):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->vhost[0] = '\0'; + log_line_parsed->port = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + goto next_item; + } + + while(*port != ':' && vhost_size < field_size) { port++; vhost_size++; }; + if(likely(vhost_size < field_size)){ + /* ':' detected in string */ + port++; + port_size = field_size - vhost_size - 1; + 
field_size = vhost_size; // now field represents vhost and port is separate + } + else { + /* no ':' detected in string - invalid */ + log_line_parsed->vhost[0] = '\0'; + log_line_parsed->port = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + goto next_item; + } + } + + if(fields_format[i] == VHOST_WITH_PORT || fields_format[i] == VHOST){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: VHOST):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->vhost[0] = '\0'; + log_line_parsed->parsing_errors++; + goto next_item; + } + + // TODO: Add below case in code!!! + // nginx $host and $http_host return ipv6 in [], apache doesn't + // TODO: TEST! This case hasn't been tested! + // char *pch = strchr(parsed[i], ']'); + // if(pch){ + // *pch = '\0'; + // memmove(parsed[i], parsed[i]+1, strlen(parsed[i])); + // } + + snprintfz(log_line_parsed->vhost, VHOST_MAX_LEN, "%.*s", (int) field_size, field); + + if(verify){ + // if(field_size >= VHOST_MAX_LEN){ + // #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + // collector_error("VHOST is invalid"); + // #endif + // log_line_parsed->vhost[0] = '\0'; + // log_line_parsed->parsing_errors++; + // goto next_item; // TODO: Not entirely right, as it will also skip PORT parsing in case of VHOST_WITH_PORT + // } + + if(unlikely(regexec(&vhost_regex, log_line_parsed->vhost, 0, NULL, 0) == REG_NOMATCH)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("VHOST is invalid"); + #endif + // log_line_parsed->vhost[0] = 'invalid'; + snprintf(log_line_parsed->vhost, sizeof(WEB_LOG_INVALID_HOST_STR), WEB_LOG_INVALID_HOST_STR); + log_line_parsed->parsing_errors++; + } + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted VHOST:%s", log_line_parsed->vhost); + #endif + + if(fields_format[i] == VHOST) goto next_item; + } + + if(fields_format[i] == VHOST_WITH_PORT || fields_format[i] == PORT){ + + if(fields_format[i] != VHOST_WITH_PORT){ + port = 
field; + port_size = field_size; + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: PORT):%.*s", i, (int) port_size, port); + #endif + + if(unlikely(port[0] == '-' && port_size == 1)){ + log_line_parsed->port = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + goto next_item; + } + + char port_d[PORT_MAX_LEN]; + snprintfz( port_d, PORT_MAX_LEN, "%.*s", (int) port_size, port); + + if(likely(str2int(&log_line_parsed->port, port_d, 10) == STR2XX_SUCCESS)){ + if(verify){ + if(unlikely(log_line_parsed->port < 80 || log_line_parsed->port > 49151)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("PORT is invalid (<80 or >49151)"); + #endif + log_line_parsed->port = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + } + } + } + else{ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting PORT from string"); + #endif + log_line_parsed->port = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted PORT:%d", log_line_parsed->port); + #endif + + goto next_item; + } + + if(fields_format[i] == REQ_SCHEME){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ_SCHEME):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->req_scheme[0] = '\0'; + log_line_parsed->parsing_errors++; + goto next_item; + } + + snprintfz(log_line_parsed->req_scheme, REQ_SCHEME_MAX_LEN, "%.*s", (int) field_size, field); + + if(verify){ + if(unlikely( strcmp(log_line_parsed->req_scheme, "http") && + strcmp(log_line_parsed->req_scheme, "https"))){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_SCHEME is invalid (must be either 'http' or 'https')"); + #endif + log_line_parsed->req_scheme[0] = '\0'; + log_line_parsed->parsing_errors++; + } + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_SCHEME:%s", log_line_parsed->req_scheme); + #endif + goto 
next_item; + } + + if(fields_format[i] == REQ_CLIENT){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ_CLIENT):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->req_client[0] = '\0'; + log_line_parsed->parsing_errors++; + goto next_item; + } + + snprintfz(log_line_parsed->req_client, REQ_CLIENT_MAX_LEN, "%.*s", (int)field_size, field); + + if(verify){ + int regex_rc = regexec(&req_client_regex, log_line_parsed->req_client, 0, NULL, 0); + if (likely(regex_rc == 0)) {/* do nothing */} + else if (unlikely(regex_rc == REG_NOMATCH)) { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_CLIENT is invalid"); + #endif + snprintf(log_line_parsed->req_client, REQ_CLIENT_MAX_LEN, "%s", WEB_LOG_INVALID_CLIENT_IP_STR); + log_line_parsed->parsing_errors++; + } + else { + size_t err_msg_size = regerror(regex_rc, &req_client_regex, NULL, 0); + char *err_msg = mallocz(err_msg_size); + regerror(regex_rc, &req_client_regex, err_msg, err_msg_size); + collector_error("req_client_regex error:%s", err_msg); + freez(err_msg); + m_assert(0, "req_client_regex has failed"); + } + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_CLIENT:%s", log_line_parsed->req_client); + #endif + + goto next_item; + } + + if(fields_format[i] == REQ || fields_format[i] == REQ_METHOD){ + + /* If fields_format[i] == REQ, then field is filled in with request in the previous code */ + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ or REQ_METHOD):%.*s", i, (int)field_size, field); + #endif + + snprintfz( log_line_parsed->req_method, REQ_METHOD_MAX_LEN, "%.*s", (int)field_size, field); + + if(verify){ + if( unlikely( + /* GET and POST are the most common requests, so check them first */ + strcmp(log_line_parsed->req_method, "GET") && + strcmp(log_line_parsed->req_method, "POST") && + + strcmp(log_line_parsed->req_method, "ACL") && + strcmp(log_line_parsed->req_method, 
"BASELINE-CONTROL") && + strcmp(log_line_parsed->req_method, "BIND") && + strcmp(log_line_parsed->req_method, "CHECKIN") && + strcmp(log_line_parsed->req_method, "CHECKOUT") && + strcmp(log_line_parsed->req_method, "CONNECT") && + strcmp(log_line_parsed->req_method, "COPY") && + strcmp(log_line_parsed->req_method, "DELETE") && + strcmp(log_line_parsed->req_method, "HEAD") && + strcmp(log_line_parsed->req_method, "LABEL") && + strcmp(log_line_parsed->req_method, "LINK") && + strcmp(log_line_parsed->req_method, "LOCK") && + strcmp(log_line_parsed->req_method, "MERGE") && + strcmp(log_line_parsed->req_method, "MKACTIVITY") && + strcmp(log_line_parsed->req_method, "MKCALENDAR") && + strcmp(log_line_parsed->req_method, "MKCOL") && + strcmp(log_line_parsed->req_method, "MKREDIRECTREF") && + strcmp(log_line_parsed->req_method, "MKWORKSPACE") && + strcmp(log_line_parsed->req_method, "MOVE") && + strcmp(log_line_parsed->req_method, "OPTIONS") && + strcmp(log_line_parsed->req_method, "ORDERPATCH") && + strcmp(log_line_parsed->req_method, "PATCH") && + strcmp(log_line_parsed->req_method, "PRI") && + strcmp(log_line_parsed->req_method, "PROPFIND") && + strcmp(log_line_parsed->req_method, "PROPPATCH") && + strcmp(log_line_parsed->req_method, "PUT") && + strcmp(log_line_parsed->req_method, "REBIND") && + strcmp(log_line_parsed->req_method, "REPORT") && + strcmp(log_line_parsed->req_method, "SEARCH") && + strcmp(log_line_parsed->req_method, "TRACE") && + strcmp(log_line_parsed->req_method, "UNBIND") && + strcmp(log_line_parsed->req_method, "UNCHECKOUT") && + strcmp(log_line_parsed->req_method, "UNLINK") && + strcmp(log_line_parsed->req_method, "UNLOCK") && + strcmp(log_line_parsed->req_method, "UPDATE") && + strcmp(log_line_parsed->req_method, "UPDATEREDIRECTREF") && + strcmp(log_line_parsed->req_method, "-"))) { + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_METHOD is invalid"); + #endif + log_line_parsed->req_method[0] = '\0'; + 
log_line_parsed->parsing_errors++; + } + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_METHOD:%s", log_line_parsed->req_method); + #endif + + if(fields_format[i] == REQ && field[0] != '-') { + while(*(offset + 1) == delimiter) offset++; // Consume extra whitespace characters + field = ++offset; + while(*offset != delimiter && ((size_t)(offset - line) < line_len)) offset++; + field_size = (size_t) (offset - field); + } + else goto next_item; + } + + if(fields_format[i] == REQ || fields_format[i] == REQ_URL){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ or REQ_URL):%.*s", i, (int)field_size, field); + #endif + + snprintfz( log_line_parsed->req_URL, REQ_URL_MAX_LEN, "%.*s", (int)field_size, field); + + // if(unlikely(field[0] == '-' && field_size == 1)){ + // log_line_parsed->req_method[0] = '\0'; + // log_line_parsed->parsing_errors++; + // } + + //if(verify){} ?? + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_URL:%s", log_line_parsed->req_URL ? 
log_line_parsed->req_URL : "NULL!"); + #endif + + if(fields_format[i] == REQ) { + while(*(offset + 1) == delimiter) offset++; // Consume extra whitespace characters + field = ++offset; + while(*offset != delimiter && ((size_t)(offset - line) < line_len)) offset++; + field_size = (size_t) (offset - field); + } + else goto next_item; + } + + if(fields_format[i] == REQ || fields_format[i] == REQ_PROTO){ + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ or REQ_PROTO):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->req_proto[0] = '\0'; + log_line_parsed->parsing_errors++; + goto next_item; + } + + if(unlikely( field_size > REQ_PROTO_PREF_SIZE + REQ_PROTO_MAX_LEN - 1)){ + field_size = REQ_PROTO_PREF_SIZE + REQ_PROTO_MAX_LEN - 1; + } + + size_t req_proto_num_size = field_size - REQ_PROTO_PREF_SIZE; + + if(verify){ + if(unlikely(field_size < 6 || + req_proto_num_size == 0 || + strncmp(field, "HTTP/", REQ_PROTO_PREF_SIZE) || + ( strncmp(&field[REQ_PROTO_PREF_SIZE], "1", req_proto_num_size) && + strncmp(&field[REQ_PROTO_PREF_SIZE], "1.0", req_proto_num_size) && + strncmp(&field[REQ_PROTO_PREF_SIZE], "1.1", req_proto_num_size) && + strncmp(&field[REQ_PROTO_PREF_SIZE], "2", req_proto_num_size) && + strncmp(&field[REQ_PROTO_PREF_SIZE], "2.0", req_proto_num_size)))) { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_PROTO is invalid"); + #endif + log_line_parsed->req_proto[0] = '\0'; + log_line_parsed->parsing_errors++; + } + else snprintfz( log_line_parsed->req_proto, req_proto_num_size + 1, + "%.*s", (int)req_proto_num_size, &field[REQ_PROTO_PREF_SIZE]); + } + else snprintfz( log_line_parsed->req_proto, req_proto_num_size + 1, + "%.*s", (int)req_proto_num_size, &field[REQ_PROTO_PREF_SIZE]); + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_PROTO:%s", log_line_parsed->req_proto); + #endif + + goto next_item; + } + + if(fields_format[i] == REQ_SIZE){ + /* 
TODO: Differentiate between '-' or 0 and an invalid request size. + * right now, all these will set req_size == 0 */ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ_SIZE):%.*s", i, (int)field_size, field); + #endif + + char req_size_d[REQ_SIZE_MAX_LEN]; + snprintfz( req_size_d, REQ_SIZE_MAX_LEN, "%.*s", (int) field_size, field); + + if(field[0] == '-' && field_size == 1) { + log_line_parsed->req_size = 0; // Request size can be '-' + } + else if(likely(str2int(&log_line_parsed->req_size, req_size_d, 10) == STR2XX_SUCCESS)){ + if(verify){ + if(unlikely(log_line_parsed->req_size < 0)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_SIZE is invalid (<0)"); + #endif + log_line_parsed->req_size = 0; + log_line_parsed->parsing_errors++; + } + } + } + else{ + collector_error("Error while extracting REQ_SIZE from string"); + log_line_parsed->req_size = 0; + log_line_parsed->parsing_errors++; + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_SIZE:%d", log_line_parsed->req_size); + #endif + + goto next_item; + } + + if(fields_format[i] == REQ_PROC_TIME){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: REQ_PROC_TIME):%.*s", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->req_proc_time = WEB_LOG_INVALID_PORT; + log_line_parsed->parsing_errors++; + goto next_item; + } + + float f = 0; + + char req_proc_time_d[REQ_PROC_TIME_MAX_LEN]; + snprintfz( req_proc_time_d, REQ_PROC_TIME_MAX_LEN, "%.*s", (int) field_size, field); + + if(memchr(field, '.', field_size)){ // nginx time is in seconds with a milliseconds resolution. 
+ if(likely(str2float(&f, req_proc_time_d) == STR2XX_SUCCESS)){ + log_line_parsed->req_proc_time = (int) (f * 1.0E6); + } + else { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting REQ_PROC_TIME from string"); + #endif + log_line_parsed->req_proc_time = 0; + log_line_parsed->parsing_errors++; + } + } + else{ // apache time is in microseconds + if(unlikely(str2int(&log_line_parsed->req_proc_time, req_proc_time_d, 10) != STR2XX_SUCCESS)) { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting REQ_PROC_TIME from string"); + #endif + log_line_parsed->req_proc_time = 0; + log_line_parsed->parsing_errors++; + } + } + + if(verify){ + if(unlikely(log_line_parsed->req_proc_time < 0)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("REQ_PROC_TIME is invalid (<0)"); + #endif + log_line_parsed->req_proc_time = 0; + log_line_parsed->parsing_errors++; + } + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted REQ_PROC_TIME:%d", log_line_parsed->req_proc_time); + #endif + + goto next_item; + } + + if(fields_format[i] == RESP_CODE){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: RESP_CODE):%.*s\n", i, (int)field_size, field); + #endif + + if(unlikely(field[0] == '-' && field_size == 1)){ + log_line_parsed->resp_code = 0; + log_line_parsed->parsing_errors++; + goto next_item; + } + + char resp_code_d[REQ_RESP_CODE_MAX_LEN]; + snprintfz( resp_code_d, REQ_RESP_CODE_MAX_LEN, "%.*s", (int)field_size, field); + + if(likely(str2int(&log_line_parsed->resp_code, resp_code_d, 10) == STR2XX_SUCCESS)){ + if(verify){ + /* rfc7231 + * Informational responses (100–199), + * Successful responses (200–299), + * Redirects (300–399), + * Client errors (400–499), + * Server errors (500–599). 
*/ + if(unlikely(log_line_parsed->resp_code < 100 || log_line_parsed->resp_code > 599)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("RESP_CODE is invalid (<100 or >599)"); + #endif + log_line_parsed->resp_code = 0; + log_line_parsed->parsing_errors++; + } + } + } + else{ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting RESP_CODE from string"); + #endif + log_line_parsed->resp_code = 0; + log_line_parsed->parsing_errors++; + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted RESP_CODE:%d", log_line_parsed->resp_code); + #endif + + goto next_item; + } + + if(fields_format[i] == RESP_SIZE){ + /* TODO: Differentiate between '-' or 0 and an invalid response size. + * right now, all these will set resp_size == 0 */ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: RESP_SIZE):%.*s", i, (int)field_size, field); + #endif + + char resp_size_d[REQ_RESP_SIZE_MAX_LEN]; + snprintfz( resp_size_d, REQ_RESP_SIZE_MAX_LEN, "%.*s", (int)field_size, field); + + if(field[0] == '-' && field_size == 1) { + log_line_parsed->resp_size = 0; // Response size can be '-' + } + else if(likely(str2int(&log_line_parsed->resp_size, resp_size_d, 10) == STR2XX_SUCCESS)){ + if(verify){ + if(unlikely(log_line_parsed->resp_size < 0)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("RESP_SIZE is invalid (<0)"); + #endif + log_line_parsed->resp_size = 0; + log_line_parsed->parsing_errors++; + } + } + } + else { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting RESP_SIZE from string"); + #endif + log_line_parsed->resp_size = 0; + log_line_parsed->parsing_errors++; + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted RESP_SIZE:%d", log_line_parsed->resp_size); + #endif + + goto next_item; + } + + if(fields_format[i] == UPS_RESP_TIME){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: UPS_RESP_TIME):%.*s", i, (int)field_size, field); + #endif + + if(field[0] == '-' && 
field_size == 1) { + log_line_parsed->ups_resp_time = 0; + log_line_parsed->parsing_errors++; + goto next_item; + } + + /* Times of several responses are separated by commas and colons. Following the + * Go parser implementation, where only the first one is kept, the others are + * discarded. Also, there must be no space in between them. Needs testing... */ + char *pch = memchr(field, ',', field_size); + if(pch) field_size = pch - field; + + float f = 0; + + char ups_resp_time_d[UPS_RESP_TIME_MAX_LEN]; + snprintfz( ups_resp_time_d, UPS_RESP_TIME_MAX_LEN, "%.*s", (int)field_size, field); + + if(memchr(field, '.', field_size)){ // nginx time is in seconds with a milliseconds resolution. + if(likely(str2float(&f, ups_resp_time_d) == STR2XX_SUCCESS)){ + log_line_parsed->ups_resp_time = (int) (f * 1.0E6); + } + else { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting UPS_RESP_TIME from string"); + #endif + log_line_parsed->ups_resp_time = 0; + log_line_parsed->parsing_errors++; + } + } + else{ // unlike in the REQ_PROC_TIME case, apache doesn't have an equivalent here + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("Error while extracting UPS_RESP_TIME from string"); + #endif + log_line_parsed->ups_resp_time = 0; + log_line_parsed->parsing_errors++; + } + if(verify){ + if(unlikely(log_line_parsed->ups_resp_time < 0)){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("UPS_RESP_TIME is invalid (<0)"); + #endif + log_line_parsed->ups_resp_time = 0; + log_line_parsed->parsing_errors++; + } + } + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted UPS_RESP_TIME:%d", log_line_parsed->ups_resp_time); + #endif + + goto next_item; + } + + if(fields_format[i] == SSL_PROTO){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: SSL_PROTO):%.*s", i, (int)field_size, field); + #endif + + if(field[0] == '-' && field_size == 1) { + log_line_parsed->ssl_proto[0] = '\0'; + log_line_parsed->parsing_errors++; + goto 
next_item; + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "SSL_PROTO field size:%zu", field_size); + #endif + + snprintfz( log_line_parsed->ssl_proto, SSL_PROTO_MAX_LEN, "%.*s", (int)field_size, field); + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "log_line_parsed->ssl_proto:%s", log_line_parsed->ssl_proto); + #endif + + if(verify){ + if(unlikely(strcmp(log_line_parsed->ssl_proto, "TLSv1") && + strcmp(log_line_parsed->ssl_proto, "TLSv1.1") && + strcmp(log_line_parsed->ssl_proto, "TLSv1.2") && + strcmp(log_line_parsed->ssl_proto, "TLSv1.3") && + strcmp(log_line_parsed->ssl_proto, "SSLv2") && + strcmp(log_line_parsed->ssl_proto, "SSLv3"))) { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("SSL_PROTO is invalid"); + #endif + log_line_parsed->ssl_proto[0] = '\0'; + log_line_parsed->parsing_errors++; + } + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted SSL_PROTO:%s", log_line_parsed->ssl_proto); + #endif + + goto next_item; + } + + if(fields_format[i] == SSL_CIPHER_SUITE){ + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: SSL_CIPHER_SUITE):%.*s", i, (int)field_size, field); + #endif + + if(field[0] == '-' && field_size == 1) { + log_line_parsed->ssl_cipher[0] = '\0'; + log_line_parsed->parsing_errors++; + } + + snprintfz( log_line_parsed->ssl_cipher, SSL_CIPHER_SUITE_MAX_LEN, "%.*s", (int)field_size, field); + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "before: SSL_CIPHER_SUITE:%s", log_line_parsed->ssl_cipher); + #endif + + if(verify){ + int regex_rc = regexec(&cipher_suite_regex, log_line_parsed->ssl_cipher, 0, NULL, 0); + if (likely(regex_rc == 0)){/* do nothing */} + else if (unlikely(regex_rc == REG_NOMATCH)) { + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + collector_error("SSL_CIPHER_SUITE is invalid"); + #endif + log_line_parsed->ssl_cipher[0] = '\0'; + log_line_parsed->parsing_errors++; + } + else { + size_t err_msg_size = regerror(regex_rc, &cipher_suite_regex, NULL, 0); + char *err_msg = 
mallocz(err_msg_size); + regerror(regex_rc, &cipher_suite_regex, err_msg, err_msg_size); + collector_error("cipher_suite_regex error:%s", err_msg); + freez(err_msg); + m_assert(0, "cipher_suite_regex has failed"); + } + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Extracted SSL_CIPHER_SUITE:%s", log_line_parsed->ssl_cipher); + #endif + + goto next_item; + } + + if(fields_format[i] == TIME){ + + if(wblp_config->skip_timestamp_parsing){ + while(*offset != ']') {offset++;}; + i++; + offset++; + goto next_item; + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Item %d (type: TIME - 1st of 2 fields):%.*s", i, (int)field_size, field); + #endif + + // TODO: What if TIME is invalid? + // if(field[0] == '-' && field_size == 1) { + // log_line_parsed->timestamp = 0; + // log_line_parsed->parsing_errors++; + // ++i; + // goto next_item; + // } + + char *datetime = field; + + if(memchr(datetime, '[', field_size)) { + datetime++; + field_size--; + } + + struct tm ltm = {0}; + char *tz_str = strptime(datetime, "%d/%b/%Y:%H:%M:%S", &ltm); + if(unlikely(tz_str == NULL)){ + collector_error("TIME datetime parsing failed"); + log_line_parsed->timestamp = 0; + log_line_parsed->parsing_errors++; + goto next_item; + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "strptime() result: year:%d mon:%d day:%d hour:%d min:%d sec:%d", + ltm.tm_year, ltm.tm_mon, ltm.tm_mday, + ltm.tm_hour, ltm.tm_min, ltm.tm_sec); + #endif + + /* Deal with 2nd part of datetime i.e. 
timezone */ + + m_assert(*tz_str == ' ', "Invalid TIME timezone"); + ++tz_str; + m_assert(*tz_str == '+' || *tz_str == '-', "Invalid TIME timezone"); + char tz_sign = *tz_str; + + char *tz_str_end = ++tz_str; + while(*tz_str_end != ']') tz_str_end++; + + m_assert(tz_str_end - tz_str == 4, "Invalid TIME timezone string length"); + + char tz_num[4]; + memcpy(tz_num, tz_str, tz_str_end - tz_str); + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "TIME 2nd part: %.*s", (int)(tz_str_end - tz_str), tz_str); + #endif + + long int tz = strtol(tz_str, NULL, 10); + long int tz_h = tz / 100; + long int tz_m = tz % 100; + int64_t tz_adj = (int64_t) tz_h * 3600 + (int64_t) tz_m * 60; + if(tz_sign == '+') tz_adj *= -1; // if timezone is positive, we need to subtract it to get GMT + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + debug_log( "Timezone: int:%ld, hrs:%ld, mins:%ld", tz, tz_h, tz_m); + #endif + + if(-1 == (log_line_parsed->timestamp = timegm(&ltm) + tz_adj)){ + collector_error("TIME datetime parsing failed"); + log_line_parsed->timestamp = 0; + log_line_parsed->parsing_errors++; + } + + #if ENABLE_PARSE_WEB_LOG_LINE_DEBUG + char tb[80]; + strftime(tb, sizeof(tb), "%c", &ltm ); + debug_log( "Extracted TIME:%ld", log_line_parsed->timestamp); + debug_log( "Extracted TIME string:%s", tb); + #endif + + offset = tz_str_end + 1; // WARNING! this modifies the offset but it is required in the TIME case. + ++i; // TIME takes up 2 fields_format[] spaces, so skip the next one + + goto next_item; + } + +next_item: + /* If offset is located beyond the end of the line, terminate parsing */ + if(unlikely((size_t) (offset - line) >= line_len)) break; + + field = ++offset; + } +} + +/** + * @brief Extract web log metrics from a group of web log fields. + * @param[in] parser_config Configuration specifying how and what web log + * metrics to extract. + * @param[in] line_parsed Web logs fields extracted from a web log line. 
+ * @param[out] metrics Web logs metrics exctracted from the \p line_parsed + * web log fields, using the \p parser_config configuration. + */ +void extract_web_log_metrics(Log_parser_config_t *parser_config, + Log_line_parsed_t *line_parsed, + Web_log_metrics_t *metrics){ + + /* Extract number of parsed lines */ + /* NOTE: Commented out as it is done in flb_collect_logs_cb() now. */ + // metrics->num_lines++; + + /* Extract vhost */ + // TODO: Reduce number of reallocs + if((parser_config->chart_config & CHART_VHOST) && *line_parsed->vhost){ + int i; + for(i = 0; i < metrics->vhost_arr.size; i++){ + if(!strcmp(metrics->vhost_arr.vhosts[i].name, line_parsed->vhost)){ + metrics->vhost_arr.vhosts[i].count++; + break; + } + } + if(metrics->vhost_arr.size == i){ // Vhost not found in array - need to append + metrics->vhost_arr.size++; + if(metrics->vhost_arr.size >= metrics->vhost_arr.size_max){ + metrics->vhost_arr.size_max = metrics->vhost_arr.size * VHOST_BUFFS_SCALE_FACTOR + 1; + metrics->vhost_arr.vhosts = reallocz( metrics->vhost_arr.vhosts, + metrics->vhost_arr.size_max * sizeof(struct log_parser_metrics_vhost)); + } + snprintf(metrics->vhost_arr.vhosts[metrics->vhost_arr.size - 1].name, VHOST_MAX_LEN, "%s", line_parsed->vhost); + metrics->vhost_arr.vhosts[metrics->vhost_arr.size - 1].count = 1; + } + } + + /* Extract port */ + // TODO: Reduce number of reallocs + if((parser_config->chart_config & CHART_PORT) && line_parsed->port){ + int i; + for(i = 0; i < metrics->port_arr.size; i++){ + if(metrics->port_arr.ports[i].port == line_parsed->port){ + metrics->port_arr.ports[i].count++; + break; + } + } + if(metrics->port_arr.size == i){ // Port not found in array - need to append + metrics->port_arr.size++; + if(metrics->port_arr.size >= metrics->port_arr.size_max){ + metrics->port_arr.size_max = metrics->port_arr.size * PORT_BUFFS_SCALE_FACTOR + 1; + metrics->port_arr.ports = reallocz( metrics->port_arr.ports, + metrics->port_arr.size_max * sizeof(struct 
log_parser_metrics_port)); + } + if(line_parsed->port == WEB_LOG_INVALID_PORT) + snprintfz(metrics->port_arr.ports[metrics->port_arr.size - 1].name, PORT_MAX_LEN, WEB_LOG_INVALID_PORT_STR); + else + snprintfz(metrics->port_arr.ports[metrics->port_arr.size - 1].name, PORT_MAX_LEN, "%d", line_parsed->port); + metrics->port_arr.ports[metrics->port_arr.size - 1].port = line_parsed->port; + metrics->port_arr.ports[metrics->port_arr.size - 1].count = 1; + } + } + + /* Extract client metrics */ + if(( parser_config->chart_config & ( CHART_IP_VERSION | CHART_REQ_CLIENT_CURRENT | CHART_REQ_CLIENT_ALL_TIME)) && *line_parsed->req_client) { + + /* Invalid IP version */ + if(unlikely(!strcmp(line_parsed->req_client, WEB_LOG_INVALID_CLIENT_IP_STR))){ + if(parser_config->chart_config & CHART_IP_VERSION) metrics->ip_ver.invalid++; + } + + else if(strchr(line_parsed->req_client, ':')){ + /* IPv6 version */ + if(parser_config->chart_config & CHART_IP_VERSION) metrics->ip_ver.v6++; + + /* Unique Client IPv6 Address current poll */ + if(parser_config->chart_config & CHART_REQ_CLIENT_CURRENT){ + int i; + for(i = 0; i < metrics->req_clients_current_arr.ipv6_size; i++){ + if(!strcmp(metrics->req_clients_current_arr.ipv6_req_clients[i], line_parsed->req_client)) break; + } + if(metrics->req_clients_current_arr.ipv6_size == i){ // Req client not found in array - need to append + metrics->req_clients_current_arr.ipv6_size++; + metrics->req_clients_current_arr.ipv6_req_clients = reallocz(metrics->req_clients_current_arr.ipv6_req_clients, + metrics->req_clients_current_arr.ipv6_size * sizeof(*metrics->req_clients_current_arr.ipv6_req_clients)); + snprintf(metrics->req_clients_current_arr.ipv6_req_clients[metrics->req_clients_current_arr.ipv6_size - 1], + REQ_CLIENT_MAX_LEN, "%s", line_parsed->req_client); + } + } + + /* Unique Client IPv6 Address all-time */ + if(parser_config->chart_config & CHART_REQ_CLIENT_ALL_TIME){ + int i; + for(i = 0; i < metrics->req_clients_alltime_arr.ipv6_size; 
i++){ + if(!strcmp(metrics->req_clients_alltime_arr.ipv6_req_clients[i], line_parsed->req_client)) break; + } + if(metrics->req_clients_alltime_arr.ipv6_size == i){ // Req client not found in array - need to append + metrics->req_clients_alltime_arr.ipv6_size++; + metrics->req_clients_alltime_arr.ipv6_req_clients = reallocz(metrics->req_clients_alltime_arr.ipv6_req_clients, + metrics->req_clients_alltime_arr.ipv6_size * sizeof(*metrics->req_clients_alltime_arr.ipv6_req_clients)); + snprintf(metrics->req_clients_alltime_arr.ipv6_req_clients[metrics->req_clients_alltime_arr.ipv6_size - 1], + REQ_CLIENT_MAX_LEN, "%s", line_parsed->req_client); + } + } + } + + + else{ + /* IPv4 version */ + if(parser_config->chart_config & CHART_IP_VERSION) metrics->ip_ver.v4++; + + /* Unique Client IPv4 Address current poll */ + if(parser_config->chart_config & CHART_REQ_CLIENT_CURRENT){ + int i; + for(i = 0; i < metrics->req_clients_current_arr.ipv4_size; i++){ + if(!strcmp(metrics->req_clients_current_arr.ipv4_req_clients[i], line_parsed->req_client)) break; + } + if(metrics->req_clients_current_arr.ipv4_size == i){ // Req client not found in array - need to append + metrics->req_clients_current_arr.ipv4_size++; + metrics->req_clients_current_arr.ipv4_req_clients = reallocz(metrics->req_clients_current_arr.ipv4_req_clients, + metrics->req_clients_current_arr.ipv4_size * sizeof(*metrics->req_clients_current_arr.ipv4_req_clients)); + snprintf(metrics->req_clients_current_arr.ipv4_req_clients[metrics->req_clients_current_arr.ipv4_size - 1], + REQ_CLIENT_MAX_LEN, "%s", line_parsed->req_client); + } + } + + /* Unique Client IPv4 Address all-time */ + if(parser_config->chart_config & CHART_REQ_CLIENT_ALL_TIME){ + int i; + for(i = 0; i < metrics->req_clients_alltime_arr.ipv4_size; i++){ + if(!strcmp(metrics->req_clients_alltime_arr.ipv4_req_clients[i], line_parsed->req_client)) break; + } + if(metrics->req_clients_alltime_arr.ipv4_size == i){ // Req client not found in array - need to 
append + metrics->req_clients_alltime_arr.ipv4_size++; + metrics->req_clients_alltime_arr.ipv4_req_clients = reallocz(metrics->req_clients_alltime_arr.ipv4_req_clients, + metrics->req_clients_alltime_arr.ipv4_size * sizeof(*metrics->req_clients_alltime_arr.ipv4_req_clients)); + snprintf(metrics->req_clients_alltime_arr.ipv4_req_clients[metrics->req_clients_alltime_arr.ipv4_size - 1], + REQ_CLIENT_MAX_LEN, "%s", line_parsed->req_client); + } + } + } + } + + /* Extract request method */ + if(parser_config->chart_config & CHART_REQ_METHODS){ + for(int i = 0; i < REQ_METHOD_ARR_SIZE; i++){ + if(!strcmp(line_parsed->req_method, req_method_str[i])){ + metrics->req_method[i]++; + break; + } + } + } + + /* Extract request protocol */ + if(parser_config->chart_config & CHART_REQ_PROTO){ + if(!strcmp(line_parsed->req_proto, "1") || !strcmp(line_parsed->req_proto, "1.0")) metrics->req_proto.http_1++; + else if(!strcmp(line_parsed->req_proto, "1.1")) metrics->req_proto.http_1_1++; + else if(!strcmp(line_parsed->req_proto, "2") || !strcmp(line_parsed->req_proto, "2.0")) metrics->req_proto.http_2++; + else metrics->req_proto.other++; + } + + /* Extract bytes received and sent */ + if(parser_config->chart_config & CHART_BANDWIDTH){ + metrics->bandwidth.req_size += line_parsed->req_size; + metrics->bandwidth.resp_size += line_parsed->resp_size; + } + + /* Extract request processing time */ + if((parser_config->chart_config & CHART_REQ_PROC_TIME) && line_parsed->req_proc_time){ + if(line_parsed->req_proc_time < metrics->req_proc_time.min || metrics->req_proc_time.min == 0){ + metrics->req_proc_time.min = line_parsed->req_proc_time; + } + if(line_parsed->req_proc_time > metrics->req_proc_time.max || metrics->req_proc_time.max == 0){ + metrics->req_proc_time.max = line_parsed->req_proc_time; + } + metrics->req_proc_time.sum += line_parsed->req_proc_time; + metrics->req_proc_time.count++; + } + + /* Extract response code family, response code & response code type */ + 
if(parser_config->chart_config & (CHART_RESP_CODE_FAMILY | CHART_RESP_CODE | CHART_RESP_CODE_TYPE)){ + switch(line_parsed->resp_code / 100){ + /* Note: 304 and 401 should be treated as resp_success */ + case 1: + metrics->resp_code_family.resp_1xx++; + metrics->resp_code[line_parsed->resp_code - 100]++; + metrics->resp_code_type.resp_success++; + break; + case 2: + metrics->resp_code_family.resp_2xx++; + metrics->resp_code[line_parsed->resp_code - 100]++; + metrics->resp_code_type.resp_success++; + break; + case 3: + metrics->resp_code_family.resp_3xx++; + metrics->resp_code[line_parsed->resp_code - 100]++; + if(line_parsed->resp_code == 304) metrics->resp_code_type.resp_success++; + else metrics->resp_code_type.resp_redirect++; + break; + case 4: + metrics->resp_code_family.resp_4xx++; + metrics->resp_code[line_parsed->resp_code - 100]++; + if(line_parsed->resp_code == 401) metrics->resp_code_type.resp_success++; + else metrics->resp_code_type.resp_bad++; + break; + case 5: + metrics->resp_code_family.resp_5xx++; + metrics->resp_code[line_parsed->resp_code - 100]++; + metrics->resp_code_type.resp_error++; + break; + default: + metrics->resp_code_family.other++; + metrics->resp_code[RESP_CODE_ARR_SIZE - 1]++; + metrics->resp_code_type.other++; + break; + } + } + + /* Extract SSL protocol */ + if(parser_config->chart_config & CHART_SSL_PROTO){ + if(!strcmp(line_parsed->ssl_proto, "TLSv1")) metrics->ssl_proto.tlsv1++; + else if(!strcmp(line_parsed->ssl_proto, "TLSv1.1")) metrics->ssl_proto.tlsv1_1++; + else if(!strcmp(line_parsed->ssl_proto, "TLSv1.2")) metrics->ssl_proto.tlsv1_2++; + else if(!strcmp(line_parsed->ssl_proto, "TLSv1.3")) metrics->ssl_proto.tlsv1_3++; + else if(!strcmp(line_parsed->ssl_proto, "SSLv2")) metrics->ssl_proto.sslv2++; + else if(!strcmp(line_parsed->ssl_proto, "SSLv3")) metrics->ssl_proto.sslv3++; + else metrics->ssl_proto.other++; + } + + /* Extract SSL cipher suite */ + // TODO: Reduce number of reallocs + if((parser_config->chart_config & 
CHART_SSL_CIPHER) && *line_parsed->ssl_cipher){ + int i; + for(i = 0; i < metrics->ssl_cipher_arr.size; i++){ + if(!strcmp(metrics->ssl_cipher_arr.ssl_ciphers[i].name, line_parsed->ssl_cipher)){ + metrics->ssl_cipher_arr.ssl_ciphers[i].count++; + break; + } + } + if(metrics->ssl_cipher_arr.size == i){ // SSL cipher suite not found in array - need to append + metrics->ssl_cipher_arr.size++; + metrics->ssl_cipher_arr.ssl_ciphers = reallocz(metrics->ssl_cipher_arr.ssl_ciphers, + metrics->ssl_cipher_arr.size * sizeof(struct log_parser_metrics_ssl_cipher)); + snprintf( metrics->ssl_cipher_arr.ssl_ciphers[metrics->ssl_cipher_arr.size - 1].name, + SSL_CIPHER_SUITE_MAX_LEN, "%s", line_parsed->ssl_cipher); + metrics->ssl_cipher_arr.ssl_ciphers[metrics->ssl_cipher_arr.size - 1].count = 1; + } + } + + metrics->timestamp = line_parsed->timestamp; +} + +/** + * @brief Try to automatically detect the configuration for a web log parser. + * @details It tries to automatically detect the configuration to be used for + * a web log parser, by parsing a single web log line record and trying to pick + * a matching configuration (from a static list of predefined ones). + * @param[in] line Null-terminated web log line to use in guessing the configuration. + * @param[in] delimiter Delimiter used to break down \p line into separate fields. + * @returns Pointer to the web log parser configuration if automatic detection + * was successful, otherwise NULL. 
+ */ +Web_log_parser_config_t *auto_detect_web_log_parser_config(char *line, const char delimiter){ + for(int i = 0; csv_auto_format_guess_matrix[i] != NULL; i++){ + Web_log_parser_config_t *wblp_config = read_web_log_parser_config(csv_auto_format_guess_matrix[i], delimiter); + if(count_fields(line, delimiter) == wblp_config->num_fields){ + wblp_config->verify_parsed_logs = 1; // Verification must be turned on to be able to pick up parsing_errors + Log_line_parsed_t line_parsed = (Log_line_parsed_t) {0}; + parse_web_log_line(wblp_config, line, strlen(line), &line_parsed); + if(line_parsed.parsing_errors == 0){ + return wblp_config; + } + } + + freez(wblp_config->fields); + freez(wblp_config); + } + return NULL; +} diff --git a/logsmanagement/parser.h b/logsmanagement/parser.h new file mode 100644 index 00000000..c0cf284b --- /dev/null +++ b/logsmanagement/parser.h @@ -0,0 +1,436 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file parser.h + * @brief Header of parser.c + */ + +#ifndef PARSER_H_ +#define PARSER_H_ + +#include <regex.h> +#include "daemon/common.h" +#include "libnetdata/libnetdata.h" + +// Forward declaration +typedef struct log_parser_metrics Log_parser_metrics_t; + + +/* -------------------------------------------------------------------------- */ +/* Configuration-related */ +/* -------------------------------------------------------------------------- */ + +typedef enum{ + + CHART_COLLECTED_LOGS_TOTAL = 1 << 0, + CHART_COLLECTED_LOGS_RATE = 1 << 1, + + /* FLB_WEB_LOG charts */ + CHART_VHOST = 1 << 2, + CHART_PORT = 1 << 3, + CHART_IP_VERSION = 1 << 4, + CHART_REQ_CLIENT_CURRENT = 1 << 5, + CHART_REQ_CLIENT_ALL_TIME = 1 << 6, + CHART_REQ_METHODS = 1 << 7, + CHART_REQ_PROTO = 1 << 8, + CHART_BANDWIDTH = 1 << 9, + CHART_REQ_PROC_TIME = 1 << 10, + CHART_RESP_CODE_FAMILY = 1 << 11, + CHART_RESP_CODE = 1 << 12, + CHART_RESP_CODE_TYPE = 1 << 13, + CHART_SSL_PROTO = 1 << 14, + CHART_SSL_CIPHER = 1 << 15, + + /* FLB_SYSTEMD or FLB_SYSLOG charts */ 
+ CHART_SYSLOG_PRIOR = 1 << 16, + CHART_SYSLOG_SEVER = 1 << 17, + CHART_SYSLOG_FACIL = 1 << 18, + + /* FLB_KMSG charts */ + CHART_KMSG_SUBSYSTEM = 1 << 19, + CHART_KMSG_DEVICE = 1 << 20, + + /* FLB_DOCKER_EV charts */ + CHART_DOCKER_EV_TYPE = 1 << 21, + CHART_DOCKER_EV_ACTION = 1 << 22, + + /* FLB_MQTT charts*/ + CHART_MQTT_TOPIC = 1 << 23 + +} chart_type_t; + +typedef struct log_parser_config{ + void *gen_config; /**< Pointer to (optional) generic configuration, as per use case. */ + unsigned long int chart_config; /**< Configuration of which charts to enable according to chart_type_t **/ +} Log_parser_config_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Web Log parsing and metrics */ +/* -------------------------------------------------------------------------- */ + +#define VHOST_MAX_LEN 255 /**< Max vhost string length, inclding terminating \0 **/ +#define PORT_MAX_LEN 6 /**< Max port string length, inclding terminating \0 **/ +#define REQ_SCHEME_MAX_LEN 6 /**< Max request scheme length, including terminating \0 **/ +#define REQ_CLIENT_MAX_LEN 46 /**< https://superuser.com/questions/381022/how-many-characters-can-an-ip-address-be#comment2219013_381029 **/ +#define REQ_METHOD_MAX_LEN 18 /**< Max request method length, including terminating \0 **/ +#define REQ_URL_MAX_LEN 128 /**< Max request URL length, including terminating \0 **/ +#define REQ_PROTO_PREF_SIZE (sizeof("HTTP/") - 1) +#define REQ_PROTO_MAX_LEN 4 /**< Max request protocol numerical part length, including terminating \0 **/ +#define REQ_SIZE_MAX_LEN 11 /**< Max size of bytes received, including terminating \0 **/ +#define REQ_PROC_TIME_MAX_LEN 11 /**< Max size of request processing time, including terminating \0 **/ +#define REQ_RESP_CODE_MAX_LEN 4 /**< Max size of response code, including terminating \0 **/ +#define REQ_RESP_SIZE_MAX_LEN 11 /**< Max size of request 
response size, including terminating \0 **/ +#define UPS_RESP_TIME_MAX_LEN 10 /**< Max size of upstream response time, including terminating \0 **/ +#define SSL_PROTO_MAX_LEN 8 /**< Max SSL protocol length, inclding terminating \0 **/ +#define SSL_CIPHER_SUITE_MAX_LEN 256 /**< TODO: Check max len for ssl cipher suite string is indeed 256 **/ + +#define RESP_CODE_ARR_SIZE 501 /**< Size of resp_code array, assuming 500 valid resp codes + 1 for "other" **/ + +#define WEB_LOG_INVALID_HOST_STR "invalid" +#define WEB_LOG_INVALID_PORT -1 +#define WEB_LOG_INVALID_PORT_STR "inv" +#define WEB_LOG_INVALID_CLIENT_IP_STR WEB_LOG_INVALID_PORT_STR + +/* Web log configuration */ +#define ENABLE_PARSE_WEB_LOG_LINE_DEBUG 0 + +#define VHOST_BUFFS_SCALE_FACTOR 1.5 +#define PORT_BUFFS_SCALE_FACTOR 8 // Unlike Vhosts, ports are stored as integers, so scale factor can be bigger + + +typedef enum{ + VHOST_WITH_PORT, // nginx: $host:$server_port apache: %v:%p + VHOST, // nginx: $host ($http_host) apache: %v + PORT, // nginx: $server_port apache: %p + REQ_SCHEME, // nginx: $scheme apache: - + REQ_CLIENT, // nginx: $remote_addr apache: %a (%h) + REQ, // nginx: $request apache: %r + REQ_METHOD, // nginx: $request_method apache: %m + REQ_URL, // nginx: $request_uri apache: %U + REQ_PROTO, // nginx: $server_protocol apache: %H + REQ_SIZE, // nginx: $request_length apache: %I + REQ_PROC_TIME, // nginx: $request_time apache: %D + RESP_CODE, // nginx: $status apache: %s, %>s + RESP_SIZE, // nginx: $bytes_sent, $body_bytes_sent apache: %b, %O, %B // TODO: Should separate %b from %O ? + UPS_RESP_TIME, // nginx: $upstream_response_time apache: - + SSL_PROTO, // nginx: $ssl_protocol apache: - + SSL_CIPHER_SUITE, // nginx: $ssl_cipher apache: - + TIME, // nginx: $time_local apache: %t + CUSTOM +} web_log_line_field_t; + +typedef struct web_log_parser_config{ + web_log_line_field_t *fields; + int num_fields; /**< Number of strings in the fields array. 
**/ + char delimiter; /**< Delimiter that separates the fields in the log format. **/ + int verify_parsed_logs; /**< Boolean whether to try and verify parsed log fields or not **/ + int skip_timestamp_parsing; /**< Boolean whether to skip parsing of timestamp fields **/ +} Web_log_parser_config_t; + +static const char *const req_method_str[] = { + "ACL", + "BASELINE-CONTROL", + "BIND", + "CHECKIN", + "CHECKOUT", + "CONNECT", + "COPY", + "DELETE", + "GET", + "HEAD", + "LABEL", + "LINK", + "LOCK", + "MERGE", + "MKACTIVITY", + "MKCALENDAR", + "MKCOL", + "MKREDIRECTREF", + "MKWORKSPACE", + "MOVE", + "OPTIONS", + "ORDERPATCH", + "PATCH", + "POST", + "PRI", + "PROPFIND", + "PROPPATCH", + "PUT", + "REBIND", + "REPORT", + "SEARCH", + "TRACE", + "UNBIND", + "UNCHECKOUT", + "UNLINK", + "UNLOCK", + "UPDATE", + "UPDATEREDIRECTREF", + "-" +}; + +#define REQ_METHOD_ARR_SIZE (int)(sizeof(req_method_str) / sizeof(req_method_str[0])) + +typedef struct web_log_metrics{ + /* Web log metrics */ + struct log_parser_metrics_vhosts_array{ + struct log_parser_metrics_vhost{ + char name[VHOST_MAX_LEN]; /**< Name of the vhost **/ + int count; /**< Occurrences of the vhost **/ + } *vhosts; + int size; /**< Size of vhosts array **/ + int size_max; + } vhost_arr; + struct log_parser_metrics_ports_array{ + struct log_parser_metrics_port{ + char name[PORT_MAX_LEN]; /**< Port number as a string */ + int port; /**< Port number **/ + int count; /**< Occurrences of the port **/ + } *ports; + int size; /**< Size of ports array **/ + int size_max; + } port_arr; + struct log_parser_metrics_ip_ver{ + int v4, v6, invalid; + } ip_ver; + /**< req_clients_current_arr is used by parser.c to save unique client IPs + * extracted per circular buffer item and also in p_file_info to save unique + * client IPs per collection (poll) iteration of plugin_logsmanagement.c. 
+ * req_clients_alltime_arr is used in p_file_info to save unique client IPs + * of all time (and so ipv4_size and ipv6_size can only grow and are never reset to 0). **/ + struct log_parser_metrics_req_clients_array{ + char (*ipv4_req_clients)[REQ_CLIENT_MAX_LEN]; + int ipv4_size; + int ipv4_size_max; + char (*ipv6_req_clients)[REQ_CLIENT_MAX_LEN]; + int ipv6_size; + int ipv6_size_max; + } req_clients_current_arr, req_clients_alltime_arr; + int req_method[REQ_METHOD_ARR_SIZE]; + struct log_parser_metrics_req_proto{ + int http_1, http_1_1, http_2, other; + } req_proto; + struct log_parser_metrics_bandwidth{ + long long req_size, resp_size; + } bandwidth; + struct log_parser_metrics_req_proc_time{ + int min, max, sum, count; + } req_proc_time; + struct log_parser_metrics_resp_code_family{ + int resp_1xx, resp_2xx, resp_3xx, resp_4xx, resp_5xx, other; // TODO: Can there be "other"? + } resp_code_family; + /**< Array counting occurrences of response codes. Each item represents the + * respective response code by adding 100 to its index, e.g. resp_code[102] + * counts how many 202 codes were detected. The 501st item represents "other". */ + unsigned int resp_code[RESP_CODE_ARR_SIZE]; + struct log_parser_metrics_resp_code_type{ /* Note: 304 and 401 should be treated as resp_success */ + int resp_success, resp_redirect, resp_bad, resp_error, other; // TODO: Can there be "other"? 
+ } resp_code_type; + struct log_parser_metrics_ssl_proto{ + int tlsv1, tlsv1_1, tlsv1_2, tlsv1_3, sslv2, sslv3, other; + } ssl_proto; + struct log_parser_metrics_ssl_cipher_array{ + struct log_parser_metrics_ssl_cipher{ + char name[SSL_CIPHER_SUITE_MAX_LEN]; /**< SSL cipher suite string **/ + int count; /**< Occurences of the SSL cipher **/ + } *ssl_ciphers; + int size; /**< Size of SSL ciphers array **/ + } ssl_cipher_arr; + int64_t timestamp; +} Web_log_metrics_t; + +typedef struct log_line_parsed{ + char vhost[VHOST_MAX_LEN]; + int port; + char req_scheme[REQ_SCHEME_MAX_LEN]; + char req_client[REQ_CLIENT_MAX_LEN]; + char req_method[REQ_METHOD_MAX_LEN]; + char req_URL[REQ_URL_MAX_LEN]; + char req_proto[REQ_PROTO_MAX_LEN]; + int req_size; + int req_proc_time; + int resp_code; + int resp_size; + int ups_resp_time; + char ssl_proto[SSL_PROTO_MAX_LEN]; + char ssl_cipher[SSL_CIPHER_SUITE_MAX_LEN]; + int64_t timestamp; + int parsing_errors; +} Log_line_parsed_t; + +Web_log_parser_config_t *read_web_log_parser_config(const char *log_format, const char delimiter); +#ifdef ENABLE_LOGSMANAGEMENT_TESTS +/* Used as public only for unit testing, normally defined as static */ +int count_fields(const char *line, const char delimiter); +#endif // ENABLE_LOGSMANAGEMENT_TESTS +void parse_web_log_line(const Web_log_parser_config_t *wblp_config, + char *line, const size_t line_len, + Log_line_parsed_t *log_line_parsed); +void extract_web_log_metrics(Log_parser_config_t *parser_config, + Log_line_parsed_t *line_parsed, + Web_log_metrics_t *metrics); +Web_log_parser_config_t *auto_detect_web_log_parser_config(char *line, const char delimiter); + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Kernel logs (kmsg) metrics */ +/* -------------------------------------------------------------------------- */ + +#define SYSLOG_SEVER_ARR_SIZE 9 /**< Number of severity 
levels plus 1 for 'unknown' **/ + +typedef struct metrics_dict_item{ + bool dim_initialized; + int num; + int num_new; +} metrics_dict_item_t; + +typedef struct kernel_metrics{ + unsigned int sever[SYSLOG_SEVER_ARR_SIZE]; /**< Syslog severity, 0-7 plus 1 space for 'unknown' **/ + DICTIONARY *subsystem; + DICTIONARY *device; +} Kernel_metrics_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Systemd and Syslog metrics */ +/* -------------------------------------------------------------------------- */ + +#define SYSLOG_FACIL_ARR_SIZE 25 /**< Number of facility levels plus 1 for 'unknown' **/ +#define SYSLOG_PRIOR_ARR_SIZE 193 /**< Number of priority values plus 1 for 'unknown' **/ + +typedef struct systemd_metrics{ + unsigned int sever[SYSLOG_SEVER_ARR_SIZE]; /**< Syslog severity, 0-7 plus 1 space for 'unknown' **/ + unsigned int facil[SYSLOG_FACIL_ARR_SIZE]; /**< Syslog facility, 0-23 plus 1 space for 'unknown' **/ + unsigned int prior[SYSLOG_PRIOR_ARR_SIZE]; /**< Syslog priority value, 0-191 plus 1 space for 'unknown' **/ +} Systemd_metrics_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Docker Events metrics */ +/* -------------------------------------------------------------------------- */ + +static const char *const docker_ev_type_string[] = { + "container", "image", "plugin", "volume", "network", "daemon", "service", "node", "secret", "config", "unknown" +}; + +#define NUM_OF_DOCKER_EV_TYPES ((int) (sizeof docker_ev_type_string / sizeof docker_ev_type_string[0])) + +#define NUM_OF_CONTAINER_ACTIONS 25 /**< == size of 'Containers actions' array, largest array in docker_ev_action_string **/ + +static const char *const docker_ev_action_string[NUM_OF_DOCKER_EV_TYPES][NUM_OF_CONTAINER_ACTIONS] = { + /* Order of 
arrays is important, it must match the order of docker_ev_type_string[] strings. */ + + /* Containers actions */ + {"attach", "commit", "copy", "create", "destroy", "detach", "die", "exec_create", "exec_detach", "exec_die", + "exec_start", "export", "health_status", "kill", "oom", "pause", "rename", "resize", "restart", "start", "stop", + "top", "unpause", "update", NULL}, + + /* Images actions */ + {"delete", "import", "load", "pull", "push", "save", "tag", "untag", NULL}, + + /* Plugins actions */ + {"enable", "disable", "install", "remove", NULL}, + + /* Volumes actions */ + {"create", "destroy", "mount", "unmount", NULL}, + + /* Networks actions */ + {"create", "connect", "destroy", "disconnect", "remove", NULL}, + + /* Daemons actions */ + {"reload", NULL}, + + /* Services actions */ + {"create", "remove", "update", NULL}, + + /* Nodes actions */ + {"create", "remove", "update", NULL}, + + /* Secrets actions */ + {"create", "remove", "update", NULL}, + + /* Configs actions */ + {"create", "remove", "update", NULL}, + + {"unknown", NULL} +}; + +typedef struct docker_ev_metrics{ + unsigned int ev_type[NUM_OF_DOCKER_EV_TYPES]; + unsigned int ev_action[NUM_OF_DOCKER_EV_TYPES][NUM_OF_CONTAINER_ACTIONS]; +} Docker_ev_metrics_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* MQTT metrics */ +/* -------------------------------------------------------------------------- */ + +typedef struct mqtt_metrics{ + DICTIONARY *topic; +} Mqtt_metrics_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Regex / Keyword search */ +/* -------------------------------------------------------------------------- */ + +#define MAX_KEYWORD_LEN 100 /**< Max size of keyword used in keyword search, in bytes */ +#define MAX_REGEX_SIZE 
MAX_KEYWORD_LEN + 7 /**< Max size of regular expression (used in keyword search) in bytes **/ + +int search_keyword( char *src, size_t src_sz, + char *dest, size_t *dest_sz, + const char *keyword, regex_t *regex, + const int ignore_case); + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* Custom Charts configuration and metrics */ +/* -------------------------------------------------------------------------- */ + +typedef struct log_parser_cus_config{ + char *chartname; /**< Chart name where the regex metrics will appear in **/ + char *regex_str; /**< String representation of the regex **/ + char *regex_name; /**< If regex is named, this is where its name is stored **/ + regex_t regex; /**< The compiled regex **/ +} Log_parser_cus_config_t; + +typedef struct log_parser_cus_metrics{ + unsigned long long count; +} Log_parser_cus_metrics_t; + +/* -------------------------------------------------------------------------- */ + + +/* -------------------------------------------------------------------------- */ +/* General / Other */ +/* -------------------------------------------------------------------------- */ + +struct log_parser_metrics{ + unsigned long long num_lines; + // struct timeval tv; + time_t last_update; + union { + Web_log_metrics_t *web_log; + Kernel_metrics_t *kernel; + Systemd_metrics_t *systemd; + Docker_ev_metrics_t *docker_ev; + Mqtt_metrics_t *mqtt; + }; + Log_parser_cus_metrics_t **parser_cus; /**< Array storing custom chart metrics structs **/ +} ; + +#endif // PARSER_H_ diff --git a/logsmanagement/query.c b/logsmanagement/query.c new file mode 100644 index 00000000..a94c9f70 --- /dev/null +++ b/logsmanagement/query.c @@ -0,0 +1,239 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file query.c + * + * @brief This is the file containing the implementation of the + * logs management querying API. 
+ */ + +#define _GNU_SOURCE + +#include "query.h" +#include <uv.h> +#include <sys/resource.h> +#include "circular_buffer.h" +#include "db_api.h" +#include "file_info.h" +#include "helper.h" + +static const char esc_ch[] = "[]\\^$.|?*+(){}"; + +/** + * @brief Sanitise string to work with regular expressions + * @param[in,out] s Input string to be sanitised - truncated in place if longer than MAX_KEYWORD_LEN + * @return Sanitised string (escaped characters according to esc_ch[] array) + */ +UNIT_STATIC char *sanitise_string(char *const s){ + size_t s_len = strlen(s); + /* Truncate keyword if longer than maximum allowed length */ + if(unlikely(s_len > MAX_KEYWORD_LEN)){ + s_len = MAX_KEYWORD_LEN; + s[s_len] = '\0'; + } + /* Worst case: every character is escaped (2 bytes each), plus the terminating \0 */ + char *s_san = mallocz(s_len * 2 + 1); + + char *s_off = s; + char *s_san_off = s_san; + while(*s_off) { + for(char *esc_ch_off = (char *) esc_ch; *esc_ch_off; esc_ch_off++){ + if(*s_off == *esc_ch_off){ + *s_san_off++ = '\\'; + break; + } + } + *s_san_off++ = *s_off++; + } + *s_san_off = '\0'; + return s_san; +} + +const logs_qry_res_err_t *fetch_log_sources(BUFFER *wb){ + if(unlikely(!p_file_infos_arr || !p_file_infos_arr->count)) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_SERVER_ERR]; + + buffer_json_add_array_item_object(wb); + buffer_json_member_add_string(wb, "id", "all"); + buffer_json_member_add_string(wb, "name", "all"); + buffer_json_member_add_string(wb, "pill", "100"); // TODO + + buffer_json_member_add_string(wb, "info", "All log sources"); + + buffer_json_member_add_string(wb, "basename", ""); + buffer_json_member_add_string(wb, "filename", ""); + buffer_json_member_add_string(wb, "log_type", ""); + buffer_json_member_add_string(wb, "db_dir", ""); + buffer_json_member_add_uint64(wb, "db_version", 0); + buffer_json_member_add_uint64(wb, "db_flush_freq", 0); + buffer_json_member_add_int64( wb, "db_disk_space_limit", 0); + buffer_json_object_close(wb); // options object + + bool queryable_sources = false; + for (int i = 0; i < p_file_infos_arr->count; i++) { + 
if(p_file_infos_arr->data[i]->db_mode == LOGS_MANAG_DB_MODE_FULL) + queryable_sources = true; + } + + if(!queryable_sources) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_NOT_FOUND_ERR]; + + for (int i = 0; i < p_file_infos_arr->count; i++) { + buffer_json_add_array_item_object(wb); + buffer_json_member_add_string(wb, "id", p_file_infos_arr->data[i]->chartname); + buffer_json_member_add_string(wb, "name", p_file_infos_arr->data[i]->chartname); + buffer_json_member_add_string(wb, "pill", "100"); // TODO + + char info[1024]; + snprintfz(info, sizeof(info), "Chart '%s' from log source '%s'", + p_file_infos_arr->data[i]->chartname, + p_file_infos_arr->data[i]->file_basename); + + buffer_json_member_add_string(wb, "info", info); + + buffer_json_member_add_string(wb, "basename", p_file_infos_arr->data[i]->file_basename); + buffer_json_member_add_string(wb, "filename", p_file_infos_arr->data[i]->filename); + buffer_json_member_add_string(wb, "log_type", log_src_type_t_str[p_file_infos_arr->data[i]->log_type]); + buffer_json_member_add_string(wb, "db_dir", p_file_infos_arr->data[i]->db_dir); + buffer_json_member_add_uint64(wb, "db_version", db_user_version(p_file_infos_arr->data[i]->db, -1)); + buffer_json_member_add_uint64(wb, "db_flush_freq", db_user_version(p_file_infos_arr->data[i]->db, -1)); + buffer_json_member_add_int64( wb, "db_disk_space_limit", p_file_infos_arr->data[i]->blob_max_size * BLOB_MAX_FILES); + buffer_json_object_close(wb); // options object + } + + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_OK]; +} + +bool terminate_logs_manag_query(logs_query_params_t *const p_query_params){ + if(p_query_params->cancelled && __atomic_load_n(p_query_params->cancelled, __ATOMIC_RELAXED)) { + return true; + } + + if(now_monotonic_usec() > p_query_params->stop_monotonic_ut) + return true; + + return false; +} + +const logs_qry_res_err_t *execute_logs_manag_query(logs_query_params_t *p_query_params) { + struct File_info 
*p_file_infos[LOGS_MANAG_MAX_COMPOUND_QUERY_SOURCES] = {NULL}; + + /* Check all required query parameters are present */ + if(unlikely(!p_query_params->req_from_ts || !p_query_params->req_to_ts)) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_INV_TS_ERR]; + + /* Start with maximum possible actual timestamp range and reduce it + * accordingly when searching DB and circular buffer. */ + p_query_params->act_from_ts = p_query_params->req_from_ts; + p_query_params->act_to_ts = p_query_params->req_to_ts; + + if(p_file_infos_arr == NULL) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_NOT_INIT_ERR]; + + /* Find p_file_infos for this query according to chartnames or filenames + * if the former is not valid. Only one of the two will be used, + * charts_names and filenames cannot be mixed. + * If neither list is provided, search all available log sources. */ + if(p_query_params->chartname[0]){ + int pfi_off = 0; + for(int cn_off = 0; p_query_params->chartname[cn_off]; cn_off++) { + for(int pfi_arr_off = 0; pfi_arr_off < p_file_infos_arr->count; pfi_arr_off++) { + if( !strcmp(p_file_infos_arr->data[pfi_arr_off]->chartname, p_query_params->chartname[cn_off]) && + p_file_infos_arr->data[pfi_arr_off]->db_mode != LOGS_MANAG_DB_MODE_NONE) { + p_file_infos[pfi_off++] = p_file_infos_arr->data[pfi_arr_off]; + break; + } + } + } + } + else if(p_query_params->filename[0]){ + int pfi_off = 0; + for(int fn_off = 0; p_query_params->filename[fn_off]; fn_off++) { + for(int pfi_arr_off = 0; pfi_arr_off < p_file_infos_arr->count; pfi_arr_off++) { + if( !strcmp(p_file_infos_arr->data[pfi_arr_off]->filename, p_query_params->filename[fn_off]) && + p_file_infos_arr->data[pfi_arr_off]->db_mode != LOGS_MANAG_DB_MODE_NONE) { + p_file_infos[pfi_off++] = p_file_infos_arr->data[pfi_arr_off]; + break; + } + } + } + } + else{ + int pfi_off = 0; + for(int pfi_arr_off = 0; pfi_arr_off < p_file_infos_arr->count; pfi_arr_off++) { + if(p_file_infos_arr->data[pfi_arr_off]->db_mode != 
LOGS_MANAG_DB_MODE_NONE) + p_file_infos[pfi_off++] = p_file_infos_arr->data[pfi_arr_off]; + } + } + + if(unlikely(!p_file_infos[0])) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_NOT_FOUND_ERR]; + + + if( p_query_params->sanitize_keyword && p_query_params->keyword && + *p_query_params->keyword && strcmp(p_query_params->keyword, " ")){ + p_query_params->keyword = sanitise_string(p_query_params->keyword); // freez(p_query_params->keyword) in this case + } + + if(p_query_params->stop_monotonic_ut == 0) + p_query_params->stop_monotonic_ut = now_monotonic_usec() + (LOGS_MANAG_QUERY_TIMEOUT_DEFAULT - 1) * USEC_PER_SEC; + + struct rusage ru_start, ru_end; + getrusage(RUSAGE_THREAD, &ru_start); + + /* Secure DB lock to ensure no data will be transferred from the buffers to + * the DB during the query execution and also no other execute_logs_manag_query + * will try to access the DB at the same time. The operations happen + * atomically and the DB searches in series. */ + for(int pfi_off = 0; p_file_infos[pfi_off]; pfi_off++) + uv_mutex_lock(p_file_infos[pfi_off]->db_mut); + + /* If results are requested in ascending timestamp order, search DB(s) first + * and then the circular buffers. Otherwise, search the circular buffers + * first and the DB(s) second. In both cases, the quota must be respected. 
*/ + if(p_query_params->order_by_asc) + db_search(p_query_params, p_file_infos); + + if( p_query_params->results_buff->len < p_query_params->quota && + !terminate_logs_manag_query(p_query_params)) + circ_buff_search(p_query_params, p_file_infos); + + if(!p_query_params->order_by_asc && + p_query_params->results_buff->len < p_query_params->quota && + !terminate_logs_manag_query(p_query_params)) + db_search(p_query_params, p_file_infos); + + for(int pfi_off = 0; p_file_infos[pfi_off]; pfi_off++) + uv_mutex_unlock(p_file_infos[pfi_off]->db_mut); + + getrusage(RUSAGE_THREAD, &ru_end); + + __atomic_add_fetch(&p_file_infos[0]->cpu_time_per_mib.user, + p_query_params->results_buff->len ? ( ru_end.ru_utime.tv_sec * USEC_PER_SEC - + ru_start.ru_utime.tv_sec * USEC_PER_SEC + + ru_end.ru_utime.tv_usec - + ru_start.ru_utime.tv_usec ) * (1 MiB) / p_query_params->results_buff->len : 0 + , __ATOMIC_RELAXED); + + __atomic_add_fetch(&p_file_infos[0]->cpu_time_per_mib.sys, + p_query_params->results_buff->len ? ( ru_end.ru_stime.tv_sec * USEC_PER_SEC - + ru_start.ru_stime.tv_sec * USEC_PER_SEC + + ru_end.ru_stime.tv_usec - + ru_start.ru_stime.tv_usec ) * (1 MiB) / p_query_params->results_buff->len : 0 + , __ATOMIC_RELAXED); + + /* If keyword has been sanitised, it needs to be freed - otherwise it's just a pointer to a substring */ + if(p_query_params->sanitize_keyword && p_query_params->keyword){ + freez(p_query_params->keyword); + } + + if(terminate_logs_manag_query(p_query_params)){ + return (p_query_params->cancelled && + __atomic_load_n(p_query_params->cancelled, __ATOMIC_RELAXED)) ? 
+ &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_CANCELLED] /* cancelled */ : + &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_TIMEOUT] /* timed out */ ; + } + + if(!p_query_params->results_buff->len) + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_NOT_FOUND_ERR]; + + return &logs_qry_res_err[LOGS_QRY_RES_ERR_CODE_OK]; +} diff --git a/logsmanagement/query.h b/logsmanagement/query.h new file mode 100644 index 00000000..0576f86e --- /dev/null +++ b/logsmanagement/query.h @@ -0,0 +1,157 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file query.h + * @brief Header of query.c + */ + +#ifndef QUERY_H_ +#define QUERY_H_ + +#include <inttypes.h> +#include <stdlib.h> +#include "libnetdata/libnetdata.h" +#include "defaults.h" + +#define LOGS_QRY_VERSION "1" + +#define LOGS_MANAG_FUNC_PARAM_AFTER "after" +#define LOGS_MANAG_FUNC_PARAM_BEFORE "before" +#define LOGS_QRY_KW_QUOTA "quota" +#define LOGS_QRY_KW_CHARTNAME "chartname" +#define LOGS_QRY_KW_FILENAME "filename" +#define LOGS_QRY_KW_KEYWORD "keyword" +#define LOGS_QRY_KW_IGNORE_CASE "ignore_case" +#define LOGS_QRY_KW_SANITIZE_KW "sanitize_keyword" + +typedef struct { + const enum {LOGS_QRY_RES_ERR_CODE_OK = 0, + LOGS_QRY_RES_ERR_CODE_INV_TS_ERR, + LOGS_QRY_RES_ERR_CODE_NOT_FOUND_ERR, + LOGS_QRY_RES_ERR_CODE_NOT_INIT_ERR, + LOGS_QRY_RES_ERR_CODE_SERVER_ERR, + LOGS_QRY_RES_ERR_CODE_UNMODIFIED, + LOGS_QRY_RES_ERR_CODE_CANCELLED, + LOGS_QRY_RES_ERR_CODE_TIMEOUT } err_code; + char const *const err_str; + const int http_code; +} logs_qry_res_err_t; + +static const logs_qry_res_err_t logs_qry_res_err[] = { + { LOGS_QRY_RES_ERR_CODE_OK, "success", HTTP_RESP_OK }, + { LOGS_QRY_RES_ERR_CODE_INV_TS_ERR, "invalid timestamp range", HTTP_RESP_BAD_REQUEST }, + { LOGS_QRY_RES_ERR_CODE_NOT_FOUND_ERR, "no results found", HTTP_RESP_OK }, + { LOGS_QRY_RES_ERR_CODE_NOT_INIT_ERR, "logs management engine not running", HTTP_RESP_SERVICE_UNAVAILABLE }, + { LOGS_QRY_RES_ERR_CODE_SERVER_ERR, "server error", HTTP_RESP_INTERNAL_SERVER_ERROR }, + { 
LOGS_QRY_RES_ERR_CODE_UNMODIFIED, "not modified", HTTP_RESP_NOT_MODIFIED }, + { LOGS_QRY_RES_ERR_CODE_CANCELLED, "cancelled", HTTP_RESP_CLIENT_CLOSED_REQUEST }, + { LOGS_QRY_RES_ERR_CODE_TIMEOUT, "query timed out", HTTP_RESP_OK } +}; + +const logs_qry_res_err_t *fetch_log_sources(BUFFER *wb); + + +/** + * @brief Parameters of the query. + * @param req_from_ts Requested start timestamp of query in epoch + * milliseconds. + * + * @param req_to_ts Requested end timestamp of query in epoch milliseconds. + * If it doesn't match the requested start timestamp, there may be more results + * to be retrieved (for descending timestamp order queries). + * + * @param act_from_ts Actual start timestamp of query in epoch milliseconds. + * + * @param act_to_ts Actual end timestamp of query in epoch milliseconds. + * If it doesn't match the requested end timestamp, there may be more results to + * be retrieved (for ascending timestamp order queries). + * + * @param order_by_asc Equal to 1 if req_from_ts <= req_to_ts, otherwise 0. + * + * @param quota Request quota for results. When exceeded, query will + * return, even if there are more pending results. + * + * @param stop_monotonic_ut Monotonic time in usec after which the query + * will be timed out. + * + * @param chartname Chart name of log source to be queried, as it appears + * on the netdata dashboard. If this is defined and not an empty string, the + * filename parameter is ignored. + * + * @param filename Full path of log source to be queried. Will only be used + * if the chartname is not used. + * + * @param keyword The keyword to be searched. IMPORTANT! Regular expressions + * are supported (if sanitize_keyword is not set) but have not been tested + * extensively, so use with caution! + * + * @param ignore_case If set to any integer other than 0, the query will be + * case-insensitive. 
If not set or if set to 0, the query will be case-sensitive. + * + * @param sanitize_keyword If set to any integer other than 0, the keyword + * will be sanitized before being used by the regex engine (which means the keyword + * cannot be a regular expression, as it will be taken as a literal input). + * + * @param results_buff Buffer of BUFFER type to store the results of the + * query in. + * + * @param results_buff->size Defines the maximum quota of results to be + * expected. If exceeded, the query will return the results obtained so far. + * + * @param results_buff->len The exact size of the results matched. + * + * @param results_buff->buffer String containing the results of the query. + * + * @param num_lines Number of log records that match the keyword. + * + * @warning results_buff->size argument must be <= MAX_LOG_MSG_SIZE. + */ +typedef struct logs_query_params { + msec_t req_from_ts; + msec_t req_to_ts; + msec_t act_from_ts; + msec_t act_to_ts; + int order_by_asc; + unsigned long quota; + bool *cancelled; + usec_t stop_monotonic_ut; + char *chartname[LOGS_MANAG_MAX_COMPOUND_QUERY_SOURCES]; + char *filename[LOGS_MANAG_MAX_COMPOUND_QUERY_SOURCES]; + char *keyword; + int ignore_case; + int sanitize_keyword; + BUFFER *results_buff; + unsigned long num_lines; +} logs_query_params_t; + +typedef struct logs_query_res_hdr { + msec_t timestamp; + size_t text_size; + int matches; + char log_source[20]; + char log_type[20]; + char basename[20]; + char filename[50]; + char chartname[20]; +} logs_query_res_hdr_t; + +/** + * @brief Check if query should be terminated. + * @param p_query_params See documentation of logs_query_params_t struct. + * @return true if query should be terminated or false otherwise. +*/ +bool terminate_logs_manag_query(logs_query_params_t *p_query_params); + +/** + * @brief Primary query API. + * @param p_query_params See documentation of logs_query_params_t struct. 
+ * @return enum of LOGS_QRY_RES_ERR_CODE with result of query + * @todo Cornercase if filename not found in DB? Return specific message? + */ +const logs_qry_res_err_t *execute_logs_manag_query(logs_query_params_t *p_query_params); + +#ifdef ENABLE_LOGSMANAGEMENT_TESTS +/* Used as public only for unit testing, normally defined as static */ +char *sanitise_string(char *s); +#endif // ENABLE_LOGSMANAGEMENT_TESTS + +#endif // QUERY_H_ diff --git a/logsmanagement/rrd_api/rrd_api.h b/logsmanagement/rrd_api/rrd_api.h new file mode 100644 index 00000000..eecaec99 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api.h @@ -0,0 +1,312 @@ +/** @file rrd_api.h + */ + +#ifndef RRD_API_H_ +#define RRD_API_H_ + +#include "daemon/common.h" +#include "../circular_buffer.h" +#include "../helper.h" + +struct Chart_meta; +struct Chart_str { + const char *type; + const char *id; + const char *title; + const char *units; + const char *family; + const char *context; + const char *chart_type; + long priority; + int update_every; +}; + +#include "rrd_api_generic.h" +#include "rrd_api_web_log.h" +#include "rrd_api_kernel.h" +#include "rrd_api_systemd.h" +#include "rrd_api_docker_ev.h" +#include "rrd_api_mqtt.h" + +#define CHART_TITLE_TOTAL_COLLECTED_LOGS "Total collected log records" +#define CHART_TITLE_RATE_COLLECTED_LOGS "Rate of collected log records" +#define NETDATA_CHART_PRIO_LOGS_INCR 100 /**< PRIO increment step from one log source to another **/ + +typedef struct Chart_data_cus { + char *id; + + struct chart_data_cus_dim { + char *name; + collected_number val; + unsigned long long *p_counter; + } *dims; + + int dims_size; + + struct Chart_data_cus *next; + +} Chart_data_cus_t ; + +struct Chart_meta { + enum log_src_type_t type; + long base_prio; + + union { + chart_data_generic_t *chart_data_generic; + chart_data_web_log_t *chart_data_web_log; + chart_data_kernel_t *chart_data_kernel; + chart_data_systemd_t *chart_data_systemd; + chart_data_docker_ev_t *chart_data_docker_ev; + 
chart_data_mqtt_t *chart_data_mqtt; + }; + + Chart_data_cus_t *chart_data_cus_arr; + + void (*init)(struct File_info *p_file_info); + void (*update)(struct File_info *p_file_info); + +}; + +static inline struct Chart_str lgs_mng_create_chart(const char *type, + const char *id, + const char *title, + const char *units, + const char *family, + const char *context, + const char *chart_type, + long priority, + int update_every){ + + struct Chart_str cs = { + .type = type, + .id = id, + .title = title, + .units = units, + .family = family ? family : "", + .context = context ? context : "", + .chart_type = chart_type ? chart_type : "", + .priority = priority, + .update_every = update_every + }; + + printf("CHART '%s.%s' '' '%s' '%s' '%s' '%s' '%s' %ld %d '' '" LOGS_MANAGEMENT_PLUGIN_STR "' ''\n", + cs.type, + cs.id, + cs.title, + cs.units, + cs.family, + cs.context, + cs.chart_type, + cs.priority, + cs.update_every + ); + + return cs; +} + +static inline void lgs_mng_add_dim( const char *id, + const char *algorithm, + collected_number multiplier, + collected_number divisor){ + + printf("DIMENSION '%s' '' '%s' %lld %lld\n", id, algorithm, multiplier, divisor); +} + +static inline void lgs_mng_add_dim_post_init( struct Chart_str *cs, + const char *dim_id, + const char *algorithm, + collected_number multiplier, + collected_number divisor){ + + printf("CHART '%s.%s' '' '%s' '%s' '%s' '%s' '%s' %ld %d '' '" LOGS_MANAGEMENT_PLUGIN_STR "' ''\n", + cs->type, + cs->id, + cs->title, + cs->units, + cs->family, + cs->context, + cs->chart_type, + cs->priority, + cs->update_every + ); + lgs_mng_add_dim(dim_id, algorithm, multiplier, divisor); +} + +static inline void lgs_mng_update_chart_begin(const char *type, const char *id){ + + printf("BEGIN '%s.%s'\n", type, id); +} + +static inline void lgs_mng_update_chart_set(const char *id, collected_number val){ + printf("SET '%s' = %lld\n", id, val); +} + +static inline void lgs_mng_update_chart_end(time_t sec){ + printf("END %" PRId64 " 0 
1\n", sec); +} + +#define lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio){ \ + \ + /* Number of collected logs total - initialise */ \ + if(p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_TOTAL){ \ + lgs_mng_create_chart( \ + (char *) p_file_info->chartname /* type */ \ + , "collected_logs_total" /* id */ \ + , CHART_TITLE_TOTAL_COLLECTED_LOGS /* title */ \ + , "log records" /* units */ \ + , "collected_logs" /* family */ \ + , NULL /* context */ \ + , RRDSET_TYPE_AREA_NAME /* chart_type */ \ + , ++chart_prio /* priority */ \ + , p_file_info->update_every /* update_every */ \ + ); \ + lgs_mng_add_dim("total records", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); \ + } \ + \ + /* Number of collected logs rate - initialise */ \ + if(p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_RATE){ \ + lgs_mng_create_chart( \ + (char *) p_file_info->chartname /* type */ \ + , "collected_logs_rate" /* id */ \ + , CHART_TITLE_RATE_COLLECTED_LOGS /* title */ \ + , "log records" /* units */ \ + , "collected_logs" /* family */ \ + , NULL /* context */ \ + , RRDSET_TYPE_LINE_NAME /* chart_type */ \ + , ++chart_prio /* priority */ \ + , p_file_info->update_every /* update_every */ \ + ); \ + lgs_mng_add_dim("records", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); \ + } \ + \ +} \ + +#define lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data){ \ + \ + /* Number of collected logs total - update previous values */ \ + if(p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_TOTAL){ \ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; \ + sec < p_file_info->parser_metrics->last_update; \ + sec++){ \ + lgs_mng_update_chart_begin(p_file_info->chartname, "collected_logs_total"); \ + lgs_mng_update_chart_set("total records", chart_data->num_lines); \ + lgs_mng_update_chart_end(sec); \ + } \ + } \ + \ + /* Number of collected logs rate - update previous values */ \ + if(p_file_info->parser_config->chart_config & 
CHART_COLLECTED_LOGS_RATE){ \ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; \ + sec < p_file_info->parser_metrics->last_update; \ + sec++){ \ + lgs_mng_update_chart_begin(p_file_info->chartname, "collected_logs_rate"); \ + lgs_mng_update_chart_set("records", chart_data->num_lines); \ + lgs_mng_update_chart_end(sec); \ + } \ + } \ + \ + chart_data->num_lines = p_file_info->parser_metrics->num_lines; \ + \ + /* Number of collected logs total - update */ \ + if(p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_TOTAL){ \ + lgs_mng_update_chart_begin( (char *) p_file_info->chartname, "collected_logs_total"); \ + lgs_mng_update_chart_set("total records", chart_data->num_lines); \ + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); \ + } \ + \ + /* Number of collected logs rate - update */ \ + if(p_file_info->parser_config->chart_config & CHART_COLLECTED_LOGS_RATE){ \ + lgs_mng_update_chart_begin( (char *) p_file_info->chartname, "collected_logs_rate"); \ + lgs_mng_update_chart_set("records", chart_data->num_lines); \ + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); \ + } \ +} + +#define lgs_mng_do_custom_charts_init(p_file_info) { \ + \ + for(int cus_off = 0; p_file_info->parser_cus_config[cus_off]; cus_off++){ \ + \ + Chart_data_cus_t *cus; \ + Chart_data_cus_t **p_cus = &p_file_info->chart_meta->chart_data_cus_arr; \ + \ + for(cus = p_file_info->chart_meta->chart_data_cus_arr; \ + cus; \ + cus = cus->next){ \ + \ + if(!strcmp(cus->id, p_file_info->parser_cus_config[cus_off]->chartname)) \ + break; \ + \ + p_cus = &(cus->next); \ + } \ + \ + if(!cus){ \ + cus = callocz(1, sizeof(Chart_data_cus_t)); \ + *p_cus = cus; \ + \ + cus->id = p_file_info->parser_cus_config[cus_off]->chartname; \ + \ + lgs_mng_create_chart( \ + (char *) p_file_info->chartname /* type */ \ + , cus->id /* id */ \ + , cus->id /* title */ \ + , "matches" /* units */ \ + , "custom_charts" /* family */ \ + , NULL /* 
context */ \ + , RRDSET_TYPE_AREA_NAME /* chart_type */ \ + , p_file_info->chart_meta->base_prio + 1000 + cus_off /* priority */ \ + , p_file_info->update_every /* update_every */ \ + ); \ + } \ + \ + cus->dims = reallocz(cus->dims, ++cus->dims_size * sizeof(struct chart_data_cus_dim)); \ + cus->dims[cus->dims_size - 1].name = \ + p_file_info->parser_cus_config[cus_off]->regex_name; \ + cus->dims[cus->dims_size - 1].val = 0; \ + cus->dims[cus->dims_size - 1].p_counter = \ + &p_file_info->parser_metrics->parser_cus[cus_off]->count; \ + \ + lgs_mng_add_dim(cus->dims[cus->dims_size - 1].name, \ + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); \ + \ + } \ +} + +#define lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec) { \ + \ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; \ + sec < p_file_info->parser_metrics->last_update; \ + sec++){ \ + \ + for(Chart_data_cus_t *cus = p_file_info->chart_meta->chart_data_cus_arr; \ + cus; \ + cus = cus->next){ \ + \ + lgs_mng_update_chart_begin(p_file_info->chartname, cus->id); \ + \ + for(int d_idx = 0; d_idx < cus->dims_size; d_idx++) \ + lgs_mng_update_chart_set(cus->dims[d_idx].name, cus->dims[d_idx].val); \ + \ + lgs_mng_update_chart_end(sec); \ + } \ + \ + } \ + \ + for(Chart_data_cus_t *cus = p_file_info->chart_meta->chart_data_cus_arr; \ + cus; \ + cus = cus->next){ \ + \ + lgs_mng_update_chart_begin(p_file_info->chartname, cus->id); \ + \ + for(int d_idx = 0; d_idx < cus->dims_size; d_idx++){ \ + \ + cus->dims[d_idx].val += *(cus->dims[d_idx].p_counter); \ + *(cus->dims[d_idx].p_counter) = 0; \ + \ + lgs_mng_update_chart_set(cus->dims[d_idx].name, cus->dims[d_idx].val); \ + } \ + \ + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); \ + } \ +} + +#endif // RRD_API_H_ diff --git a/logsmanagement/rrd_api/rrd_api_docker_ev.c b/logsmanagement/rrd_api/rrd_api_docker_ev.c new file mode 100644 index 00000000..743d256a --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_docker_ev.c @@ -0,0 
+1,137 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_docker_ev.h" + +void docker_ev_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_docker_ev = callocz(1, sizeof (struct Chart_data_docker_ev)); + p_file_info->chart_meta->chart_data_docker_ev->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + /* Docker events type - initialise */ + if(p_file_info->parser_config->chart_config & CHART_DOCKER_EV_TYPE){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "events_type" // id + , "Events type" // title + , "events types" // units + , "event_type" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int idx = 0; idx < NUM_OF_DOCKER_EV_TYPES; idx++) + lgs_mng_add_dim(docker_ev_type_string[idx], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Docker events actions - initialise */ + if(p_file_info->parser_config->chart_config & CHART_DOCKER_EV_ACTION){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "events_action" // id + , "Events action" // title + , "events actions" // units + , "event_action" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int ev_off = 0; ev_off < NUM_OF_DOCKER_EV_TYPES; ev_off++){ + int act_off = -1; + while(docker_ev_action_string[ev_off][++act_off] != NULL){ + + char dim[50]; + snprintfz(dim, 50, "%s %s", + docker_ev_type_string[ev_off], + docker_ev_action_string[ev_off][act_off]); + + lgs_mng_add_dim(dim, RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + } + } + + lgs_mng_do_custom_charts_init(p_file_info); +} + +void docker_ev_chart_update(struct File_info *p_file_info){ + chart_data_docker_ev_t *chart_data 
= p_file_info->chart_meta->chart_data_docker_ev; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + /* Docker events type - update */ + if(p_file_info->parser_config->chart_config & CHART_DOCKER_EV_TYPE){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "events_type"); + for(int idx = 0; idx < NUM_OF_DOCKER_EV_TYPES; idx++) + lgs_mng_update_chart_set(docker_ev_type_string[idx], chart_data->num_dock_ev_type[idx]); + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "events_type"); + for(int idx = 0; idx < NUM_OF_DOCKER_EV_TYPES; idx++){ + chart_data->num_dock_ev_type[idx] = p_file_info->parser_metrics->docker_ev->ev_type[idx]; + lgs_mng_update_chart_set(docker_ev_type_string[idx], chart_data->num_dock_ev_type[idx]); + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Docker events action - update */ + if(p_file_info->parser_config->chart_config & CHART_DOCKER_EV_ACTION){ + char dim[50]; + + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "events_action"); + for(int ev_off = 0; ev_off < NUM_OF_DOCKER_EV_TYPES; ev_off++){ + int act_off = -1; + while(docker_ev_action_string[ev_off][++act_off] != NULL){ + if(chart_data->num_dock_ev_action[ev_off][act_off]){ + snprintfz(dim, 50, "%s %s", + docker_ev_type_string[ev_off], + docker_ev_action_string[ev_off][act_off]); + lgs_mng_update_chart_set(dim, chart_data->num_dock_ev_action[ev_off][act_off]); + } + } + } + lgs_mng_update_chart_end(sec); + } + + 
lgs_mng_update_chart_begin(p_file_info->chartname, "events_action"); + for(int ev_off = 0; ev_off < NUM_OF_DOCKER_EV_TYPES; ev_off++){ + int act_off = -1; + while(docker_ev_action_string[ev_off][++act_off] != NULL){ + chart_data->num_dock_ev_action[ev_off][act_off] = + p_file_info->parser_metrics->docker_ev->ev_action[ev_off][act_off]; + + if(chart_data->num_dock_ev_action[ev_off][act_off]){ + snprintfz(dim, 50, "%s %s", + docker_ev_type_string[ev_off], + docker_ev_action_string[ev_off][act_off]); + lgs_mng_update_chart_set(dim, chart_data->num_dock_ev_action[ev_off][act_off]); + } + } + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + + } + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + chart_data->last_update = p_file_info->parser_metrics->last_update; + } + +} diff --git a/logsmanagement/rrd_api/rrd_api_docker_ev.h b/logsmanagement/rrd_api/rrd_api_docker_ev.h new file mode 100644 index 00000000..69341326 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_docker_ev.h @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_docker_ev.h + * @brief Includes the structure and function definitions + * for the Docker event log charts.
+ */ + +#ifndef RRD_API_DOCKER_EV_H_ +#define RRD_API_DOCKER_EV_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_docker_ev chart_data_docker_ev_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +struct Chart_data_docker_ev { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + + /* Docker events metrics - event type */ + collected_number num_dock_ev_type[NUM_OF_DOCKER_EV_TYPES]; + + /* Docker events metrics - action type */ + collected_number num_dock_ev_action[NUM_OF_DOCKER_EV_TYPES][NUM_OF_CONTAINER_ACTIONS]; +}; + +void docker_ev_chart_init(struct File_info *p_file_info); +void docker_ev_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_DOCKER_EV_H_ diff --git a/logsmanagement/rrd_api/rrd_api_generic.c b/logsmanagement/rrd_api/rrd_api_generic.c new file mode 100644 index 00000000..752f5af7 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_generic.c @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_generic.h" + +void generic_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_generic = callocz(1, sizeof (struct Chart_data_generic)); + p_file_info->chart_meta->chart_data_generic->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + lgs_mng_do_custom_charts_init(p_file_info); +} + +void generic_chart_update(struct File_info *p_file_info){ + chart_data_generic_t *chart_data = p_file_info->chart_meta->chart_data_generic; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + 
chart_data->last_update = p_file_info->parser_metrics->last_update; + } +} diff --git a/logsmanagement/rrd_api/rrd_api_generic.h b/logsmanagement/rrd_api/rrd_api_generic.h new file mode 100644 index 00000000..25b801a0 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_generic.h @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_generic.h + * @brief Includes the structure and function definitions for + * generic log charts. + */ + +#ifndef RRD_API_GENERIC_H_ +#define RRD_API_GENERIC_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_generic chart_data_generic_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +struct Chart_data_generic { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + +}; + +void generic_chart_init(struct File_info *p_file_info); +void generic_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_GENERIC_H_ diff --git a/logsmanagement/rrd_api/rrd_api_kernel.c b/logsmanagement/rrd_api/rrd_api_kernel.c new file mode 100644 index 00000000..9372f773 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_kernel.c @@ -0,0 +1,168 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_kernel.h" + +void kernel_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_kernel = callocz(1, sizeof (struct Chart_data_kernel)); + chart_data_kernel_t *chart_data = p_file_info->chart_meta->chart_data_kernel; + chart_data->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + /* Syslog severity level (== Systemd priority) - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_SEVER){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "severity_levels" // id + , "Severity Levels" // title + ,
"severity levels" // units + , "severity" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int i = 0; i < SYSLOG_SEVER_ARR_SIZE; i++) + lgs_mng_add_dim(dim_sever_str[i], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + + } + + /* Subsystem - initialise */ + if(p_file_info->parser_config->chart_config & CHART_KMSG_SUBSYSTEM){ + chart_data->cs_subsys = lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "subsystems" // id + , "Subsystems" // title + , "subsystems" // units + , "subsystem" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + } + + /* Device - initialise */ + if(p_file_info->parser_config->chart_config & CHART_KMSG_DEVICE){ + chart_data->cs_device = lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "devices" // id + , "Devices" // title + , "devices" // units + , "device" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + } + + lgs_mng_do_custom_charts_init(p_file_info); +} + +void kernel_chart_update(struct File_info *p_file_info){ + chart_data_kernel_t *chart_data = p_file_info->chart_meta->chart_data_kernel; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + /* Syslog severity level (== Systemd priority) - update */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_SEVER){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "severity_levels"); + for(int idx = 0; idx < 
SYSLOG_SEVER_ARR_SIZE; idx++) + lgs_mng_update_chart_set(dim_sever_str[idx], chart_data->num_sever[idx]); + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "severity_levels"); + for(int idx = 0; idx < SYSLOG_SEVER_ARR_SIZE; idx++){ + chart_data->num_sever[idx] = p_file_info->parser_metrics->kernel->sever[idx]; + lgs_mng_update_chart_set(dim_sever_str[idx], chart_data->num_sever[idx]); + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Subsystem - update */ + if(p_file_info->parser_config->chart_config & CHART_KMSG_SUBSYSTEM){ + metrics_dict_item_t *it; + + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "subsystems"); + dfe_start_read(p_file_info->parser_metrics->kernel->subsystem, it){ + if(it->dim_initialized) + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(sec); + } + + dfe_start_write(p_file_info->parser_metrics->kernel->subsystem, it){ + if(!it->dim_initialized){ + it->dim_initialized = true; + lgs_mng_add_dim_post_init( &chart_data->cs_subsys, it_dfe.name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + } + dfe_done(it); + + lgs_mng_update_chart_begin(p_file_info->chartname, "subsystems"); + dfe_start_write(p_file_info->parser_metrics->kernel->subsystem, it){ + it->num = it->num_new; + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Device - update */ + if(p_file_info->parser_config->chart_config & CHART_KMSG_DEVICE){ + metrics_dict_item_t *it; + + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "devices"); + 
dfe_start_read(p_file_info->parser_metrics->kernel->device, it){ + if(it->dim_initialized) + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(sec); + } + + dfe_start_write(p_file_info->parser_metrics->kernel->device, it){ + if(!it->dim_initialized){ + it->dim_initialized = true; + lgs_mng_add_dim_post_init( &chart_data->cs_device, it_dfe.name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + } + dfe_done(it); + + lgs_mng_update_chart_begin(p_file_info->chartname, "devices"); + dfe_start_write(p_file_info->parser_metrics->kernel->device, it){ + it->num = it->num_new; + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + chart_data->last_update = p_file_info->parser_metrics->last_update; + } +} diff --git a/logsmanagement/rrd_api/rrd_api_kernel.h b/logsmanagement/rrd_api/rrd_api_kernel.h new file mode 100644 index 00000000..ccb4a752 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_kernel.h @@ -0,0 +1,46 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_kernel.h + * @brief Includes the structure and function definitions + * for the kernel log charts.
+ */ + +#ifndef RRD_API_KERNEL_H_ +#define RRD_API_KERNEL_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_kernel chart_data_kernel_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +#include "rrd_api_systemd.h" // required for dim_sever_str[] + +struct Chart_data_kernel { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + + /* Kernel metrics - Syslog Severity value */ + collected_number num_sever[SYSLOG_SEVER_ARR_SIZE]; + + /* Kernel metrics - Subsystem */ + struct Chart_str cs_subsys; + // Special case: Subsystem dimension and number are part of Kernel_metrics_t + + /* Kernel metrics - Device */ + struct Chart_str cs_device; + // Special case: Device dimension and number are part of Kernel_metrics_t +}; + +void kernel_chart_init(struct File_info *p_file_info); +void kernel_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_KERNEL_H_ diff --git a/logsmanagement/rrd_api/rrd_api_mqtt.c b/logsmanagement/rrd_api/rrd_api_mqtt.c new file mode 100644 index 00000000..eb90b2ab --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_mqtt.c @@ -0,0 +1,79 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_mqtt.h" + +void mqtt_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_mqtt = callocz(1, sizeof (struct Chart_data_mqtt)); + chart_data_mqtt_t *chart_data = p_file_info->chart_meta->chart_data_mqtt; + chart_data->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + /* Topic - initialise */ + if(p_file_info->parser_config->chart_config & CHART_MQTT_TOPIC){ + chart_data->cs_topic = lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "topics" // id + , "Topics" // title + , "topics" // units + , "topic" // family + , NULL // context + , 
RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + } + + lgs_mng_do_custom_charts_init(p_file_info); +} + +void mqtt_chart_update(struct File_info *p_file_info){ + chart_data_mqtt_t *chart_data = p_file_info->chart_meta->chart_data_mqtt; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + /* Topic - update */ + if(p_file_info->parser_config->chart_config & CHART_MQTT_TOPIC){ + metrics_dict_item_t *it; + + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "topics"); + dfe_start_read(p_file_info->parser_metrics->mqtt->topic, it){ + if(it->dim_initialized) + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(sec); + } + + dfe_start_write(p_file_info->parser_metrics->mqtt->topic, it){ + if(!it->dim_initialized){ + it->dim_initialized = true; + lgs_mng_add_dim_post_init( &chart_data->cs_topic, it_dfe.name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + } + dfe_done(it); + + lgs_mng_update_chart_begin(p_file_info->chartname, "topics"); + dfe_start_write(p_file_info->parser_metrics->mqtt->topic, it){ + it->num = it->num_new; + lgs_mng_update_chart_set(it_dfe.name, (collected_number) it->num); + } + dfe_done(it); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + chart_data->last_update = p_file_info->parser_metrics->last_update; + } +} diff --git a/logsmanagement/rrd_api/rrd_api_mqtt.h b/logsmanagement/rrd_api/rrd_api_mqtt.h new file mode 100644 index 00000000..13c5cff3 --- /dev/null +++ 
b/logsmanagement/rrd_api/rrd_api_mqtt.h @@ -0,0 +1,37 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_mqtt.h + * @brief Includes the structure and function definitions + * for the MQTT log charts. + */ + +#ifndef RRD_API_MQTT_H_ +#define RRD_API_MQTT_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_mqtt chart_data_mqtt_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +struct Chart_data_mqtt { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + + /* MQTT metrics - Topic */ + struct Chart_str cs_topic; + // Special case: Topic dimension and number are part of Mqtt_metrics_t +}; + +void mqtt_chart_init(struct File_info *p_file_info); +void mqtt_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_MQTT_H_ diff --git a/logsmanagement/rrd_api/rrd_api_stats.c b/logsmanagement/rrd_api/rrd_api_stats.c new file mode 100644 index 00000000..e845d041 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_stats.c @@ -0,0 +1,298 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_stats.h" + +static const char *const rrd_type = "netdata"; + +static char **dim_db_timings_write, **dim_db_timings_rotate; + +extern bool logsmanagement_should_exit; + +static void stats_charts_update(void){ + + /* Circular buffer total memory stats - update */ + lgs_mng_update_chart_begin(rrd_type, "circular_buffers_mem_total_cached"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->circ_buff->total_cached_mem, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* Circular buffer number of items - update */ + lgs_mng_update_chart_begin(rrd_type, "circular_buffers_num_of_items"); + for(int i = 0; i < p_file_infos_arr->count; i++){ +
struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, p_file_info->circ_buff->num_of_items); + } + lgs_mng_update_chart_end(0); + + /* Circular buffer uncompressed buffered items memory stats - update */ + lgs_mng_update_chart_begin(rrd_type, "circular_buffers_mem_uncompressed_used"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->circ_buff->text_size_total, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* Circular buffer compressed buffered items memory stats - update */ + lgs_mng_update_chart_begin(rrd_type, "circular_buffers_mem_compressed_used"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->circ_buff->text_compressed_size_total, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* Compression stats - update */ + lgs_mng_update_chart_begin(rrd_type, "average_compression_ratio"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->circ_buff->compression_ratio, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* DB disk usage stats - update */ + lgs_mng_update_chart_begin(rrd_type, "database_disk_usage"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + 
__atomic_load_n(&p_file_info->blob_total_size, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* DB timings - update */ + lgs_mng_update_chart_begin(rrd_type, "database_timings"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(dim_db_timings_write[i], + __atomic_exchange_n(&p_file_info->db_write_duration, 0, __ATOMIC_RELAXED)); + + lgs_mng_update_chart_set(dim_db_timings_rotate[i], + __atomic_exchange_n(&p_file_info->db_rotate_duration, 0, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* Query CPU time per byte (user) - update */ + lgs_mng_update_chart_begin(rrd_type, "query_cpu_time_per_MiB_user"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->cpu_time_per_mib.user, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + + /* Query CPU time per byte (system) - update */ + lgs_mng_update_chart_begin(rrd_type, "query_cpu_time_per_MiB_sys"); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + if(!p_file_info->parser_config) + continue; + + lgs_mng_update_chart_set(p_file_info->chartname, + __atomic_load_n(&p_file_info->cpu_time_per_mib.sys, __ATOMIC_RELAXED)); + } + lgs_mng_update_chart_end(0); + +} + +void stats_charts_init(void *arg){ + + netdata_mutex_t *p_stdout_mut = (netdata_mutex_t *) arg; + + netdata_mutex_lock(p_stdout_mut); + + int chart_prio = NETDATA_CHART_PRIO_LOGS_STATS_BASE; + + /* Circular buffer total memory stats - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "circular_buffers_mem_total_cached" // id + , "Circular buffers total cached memory" // title + , "bytes" // units + , "logsmanagement" // family + , NULL // 
context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* Circular buffer number of items - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "circular_buffers_num_of_items" // id + , "Circular buffers number of items" // title + , "items" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_LINE_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* Circular buffer uncompressed buffered items memory stats - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "circular_buffers_mem_uncompressed_used" // id + , "Circular buffers used memory for uncompressed logs" // title + , "bytes" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* Circular buffer compressed buffered items memory stats - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "circular_buffers_mem_compressed_used" // id + , "Circular buffers used memory for compressed logs" // title + , "bytes" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* Compression stats 
- initialise */ + lgs_mng_create_chart( + rrd_type // type + , "average_compression_ratio" // id + , "Average compression ratio" // title + , "uncompressed / compressed ratio" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_LINE_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* DB disk usage stats - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "database_disk_usage" // id + , "Database disk usage" // title + , "bytes" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + /* DB timings - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "database_timings" // id + , "Database timings" // title + , "ns" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++){ + struct File_info *p_file_info = p_file_infos_arr->data[i]; + + dim_db_timings_write = reallocz(dim_db_timings_write, (i + 1) * sizeof(char *)); + dim_db_timings_rotate = reallocz(dim_db_timings_rotate, (i + 1) * sizeof(char *)); + + dim_db_timings_write[i] = mallocz(snprintf(NULL, 0, "%s_write", p_file_info->chartname) + 1); + sprintf(dim_db_timings_write[i], "%s_write", p_file_info->chartname); + lgs_mng_add_dim(dim_db_timings_write[i], RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + + dim_db_timings_rotate[i] = mallocz(snprintf(NULL, 0, "%s_rotate", p_file_info->chartname) + 1); + 
sprintf(dim_db_timings_rotate[i], "%s_rotate", p_file_info->chartname); + lgs_mng_add_dim(dim_db_timings_rotate[i], RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + } + + /* Query CPU time per byte (user) - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "query_cpu_time_per_MiB_user" // id + , "CPU user time per MiB of query results" // title + , "usec/MiB" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + + /* Query CPU time per byte (system) - initialise */ + lgs_mng_create_chart( + rrd_type // type + , "query_cpu_time_per_MiB_sys" // id + , "CPU system time per MiB of query results" // title + , "usec/MiB" // units + , "logsmanagement" // family + , NULL // context + , RRDSET_TYPE_STACKED_NAME // chart_type + , ++chart_prio // priority + , g_logs_manag_config.update_every // update_every + ); + for(int i = 0; i < p_file_infos_arr->count; i++) + lgs_mng_add_dim(p_file_infos_arr->data[i]->chartname, RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + + netdata_mutex_unlock(p_stdout_mut); + + + heartbeat_t hb; + heartbeat_init(&hb); + usec_t step_ut = g_logs_manag_config.update_every * USEC_PER_SEC; + + while (0 == __atomic_load_n(&logsmanagement_should_exit, __ATOMIC_RELAXED)) { + heartbeat_next(&hb, step_ut); + + netdata_mutex_lock(p_stdout_mut); + stats_charts_update(); + fflush(stdout); + netdata_mutex_unlock(p_stdout_mut); + } + + collector_info("[stats charts]: thread exiting..."); +} + diff --git a/logsmanagement/rrd_api/rrd_api_stats.h b/logsmanagement/rrd_api/rrd_api_stats.h new file mode 100644 index 00000000..79a0f15d --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_stats.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_stats.h + * @brief 
Includes the structure and function definitions + * for logs management stats charts. + */ + +#ifndef RRD_API_STATS_H_ +#define RRD_API_STATS_H_ + +#include "daemon/common.h" + +struct File_info; + +#include "../file_info.h" + +void stats_charts_init(void *arg); + +#endif // RRD_API_STATS_H_
\ No newline at end of file diff --git a/logsmanagement/rrd_api/rrd_api_systemd.c b/logsmanagement/rrd_api/rrd_api_systemd.c new file mode 100644 index 00000000..1d489389 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_systemd.c @@ -0,0 +1,206 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_systemd.h" + +const char *dim_sever_str[SYSLOG_SEVER_ARR_SIZE] = { + "0:Emergency", + "1:Alert", + "2:Critical", + "3:Error", + "4:Warning", + "5:Notice", + "6:Informational", + "7:Debug", + "uknown" +}; + +static const char *dim_facil_str[SYSLOG_FACIL_ARR_SIZE] = { + "0:kernel", + "1:user-level", + "2:mail", + "3:system", + "4:sec/auth", + "5:syslog", + "6:lpd/printer", + "7:news/nntp", + "8:uucp", + "9:time", + "10:sec/auth", + "11:ftp", + "12:ntp", + "13:logaudit", + "14:logalert", + "15:clock", + "16:local0", + "17:local1", + "18:local2", + "19:local3", + "20:local4", + "21:local5", + "22:local6", + "23:local7", + "uknown" +}; + +void systemd_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_systemd = callocz(1, sizeof (struct Chart_data_systemd)); + chart_data_systemd_t *chart_data = p_file_info->chart_meta->chart_data_systemd; + chart_data->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + /* Syslog priority value - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_PRIOR){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "priority_values" // id + , "Priority Values" // title + , "priority values" // units + , "priority" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int i = 0; i < SYSLOG_PRIOR_ARR_SIZE - 1; i++){ + char dim_id[4]; + snprintfz(dim_id, 4, "%d", i); + chart_data->dim_prior[i] = strdupz(dim_id); + 
lgs_mng_add_dim(chart_data->dim_prior[i], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + chart_data->dim_prior[SYSLOG_PRIOR_ARR_SIZE - 1] = "uknown"; + lgs_mng_add_dim(chart_data->dim_prior[SYSLOG_PRIOR_ARR_SIZE - 1], + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + + } + + /* Syslog severity level (== Systemd priority) - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_SEVER){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "severity_levels" // id + , "Severity Levels" // title + , "severity levels" // units + , "priority" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int i = 0; i < SYSLOG_SEVER_ARR_SIZE; i++) + lgs_mng_add_dim(dim_sever_str[i], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Syslog facility level - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_FACIL){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "facility_levels" // id + , "Facility Levels" // title + , "facility levels" // units + , "priority" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int i = 0; i < SYSLOG_FACIL_ARR_SIZE; i++) + lgs_mng_add_dim(dim_facil_str[i], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + lgs_mng_do_custom_charts_init(p_file_info); +} + +void systemd_chart_update(struct File_info *p_file_info){ + chart_data_systemd_t *chart_data = p_file_info->chart_meta->chart_data_systemd; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + /* Syslog priority value - update */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_PRIOR){ + for(time_t sec = 
p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "priority_values"); + for(int idx = 0; idx < SYSLOG_PRIOR_ARR_SIZE; idx++){ + if(chart_data->num_prior[idx]) + lgs_mng_update_chart_set(chart_data->dim_prior[idx], chart_data->num_prior[idx]); + } + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "priority_values"); + for(int idx = 0; idx < SYSLOG_PRIOR_ARR_SIZE; idx++){ + if(p_file_info->parser_metrics->systemd->prior[idx]){ + chart_data->num_prior[idx] = p_file_info->parser_metrics->systemd->prior[idx]; + lgs_mng_update_chart_set(chart_data->dim_prior[idx], chart_data->num_prior[idx]); + } + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + + } + + /* Syslog severity level (== Systemd priority) - update chart */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_SEVER){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "severity_levels"); + for(int idx = 0; idx < SYSLOG_SEVER_ARR_SIZE; idx++){ + if(chart_data->num_sever[idx]) + lgs_mng_update_chart_set(dim_sever_str[idx], chart_data->num_sever[idx]); + } + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "severity_levels"); + for(int idx = 0; idx < SYSLOG_SEVER_ARR_SIZE; idx++){ + if(p_file_info->parser_metrics->systemd->sever[idx]){ + chart_data->num_sever[idx] = p_file_info->parser_metrics->systemd->sever[idx]; + lgs_mng_update_chart_set(dim_sever_str[idx], chart_data->num_sever[idx]); + } + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + + } + + /* Syslog facility value - update chart */ + if(p_file_info->parser_config->chart_config & CHART_SYSLOG_FACIL){ + for(time_t sec = p_file_info->parser_metrics->last_update - 
lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "facility_levels"); + for(int idx = 0; idx < SYSLOG_FACIL_ARR_SIZE; idx++){ + if(chart_data->num_facil[idx]) + lgs_mng_update_chart_set(dim_facil_str[idx], chart_data->num_facil[idx]); + } + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "facility_levels"); + for(int idx = 0; idx < SYSLOG_FACIL_ARR_SIZE; idx++){ + if(p_file_info->parser_metrics->systemd->facil[idx]){ + chart_data->num_facil[idx] = p_file_info->parser_metrics->systemd->facil[idx]; + lgs_mng_update_chart_set(dim_facil_str[idx], chart_data->num_facil[idx]); + } + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + + } + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + chart_data->last_update = p_file_info->parser_metrics->last_update; + } +} diff --git a/logsmanagement/rrd_api/rrd_api_systemd.h b/logsmanagement/rrd_api/rrd_api_systemd.h new file mode 100644 index 00000000..3497168f --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_systemd.h @@ -0,0 +1,45 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_systemd.h + * @brief Includes the structure and function definitions + * for the systemd log charts. 
+ */ + +#ifndef RRD_API_SYSTEMD_H_ +#define RRD_API_SYSTEMD_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_systemd chart_data_systemd_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +extern const char *dim_sever_str[SYSLOG_SEVER_ARR_SIZE]; + +struct Chart_data_systemd { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + + /* Systemd metrics - Syslog Priority value */ + char *dim_prior[193]; + collected_number num_prior[193]; + + /* Systemd metrics - Syslog Severity value */ + collected_number num_sever[9]; + + /* Systemd metrics - Syslog Facility value */ + collected_number num_facil[25]; +}; + +void systemd_chart_init(struct File_info *p_file_info); +void systemd_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_SYSTEMD_H_ diff --git a/logsmanagement/rrd_api/rrd_api_web_log.c b/logsmanagement/rrd_api/rrd_api_web_log.c new file mode 100644 index 00000000..5ab60204 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_web_log.c @@ -0,0 +1,716 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrd_api_web_log.h" + +void web_log_chart_init(struct File_info *p_file_info){ + p_file_info->chart_meta->chart_data_web_log = callocz(1, sizeof (struct Chart_data_web_log)); + chart_data_web_log_t *chart_data = p_file_info->chart_meta->chart_data_web_log; + chart_data->last_update = now_realtime_sec(); // initial value shouldn't be 0 + long chart_prio = p_file_info->chart_meta->base_prio; + + lgs_mng_do_num_of_logs_charts_init(p_file_info, chart_prio); + + /* Vhost - initialise */ + if(p_file_info->parser_config->chart_config & CHART_VHOST){ + chart_data->cs_vhosts = lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "vhost" // id + , "Requests by Vhost" // title + , "requests" // units + , "vhost" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , 
p_file_info->update_every // update_every + ); + } + + /* Port - initialise */ + if(p_file_info->parser_config->chart_config & CHART_PORT){ + chart_data->cs_ports = lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "port" // id + , "Requests by Port" // title + , "requests" // units + , "port" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + } + + /* IP Version - initialise */ + if(p_file_info->parser_config->chart_config & CHART_IP_VERSION){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "ip_version" // id + , "Requests by IP version" // title + , "requests" // units + , "ip_version" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("ipv4", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("ipv6", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("invalid", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Request client current poll - initialise */ + if(p_file_info->parser_config->chart_config & CHART_REQ_CLIENT_CURRENT){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "clients" // id + , "Current Poll Unique Client IPs" // title + , "unique ips" // units + , "clients" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("ipv4", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("ipv6", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Request client all-time - initialise */ + if(p_file_info->parser_config->chart_config & CHART_REQ_CLIENT_ALL_TIME){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "clients_all" // id + , "All Time Unique Client IPs" // title + , "unique ips" // units + , "clients" // family + , NULL // context 
+ , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("ipv4", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + lgs_mng_add_dim("ipv6", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1); + } + + /* Request methods - initialise */ + if(p_file_info->parser_config->chart_config & CHART_REQ_METHODS){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "http_methods" // id + , "Requests Per HTTP Method" // title + , "requests" // units + , "http_methods" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int j = 0; j < REQ_METHOD_ARR_SIZE; j++) + lgs_mng_add_dim(req_method_str[j], RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Request protocol - initialise */ + if(p_file_info->parser_config->chart_config & CHART_REQ_PROTO){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "http_versions" // id + , "Requests Per HTTP Version" // title + , "requests" // units + , "http_versions" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("1.0", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("1.1", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("2.0", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("other", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Request bandwidth - initialise */ + if(p_file_info->parser_config->chart_config & CHART_BANDWIDTH){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "bandwidth" // id + , "Bandwidth" // title + , "kilobits" // units + , "bandwidth" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("received", RRD_ALGORITHM_INCREMENTAL_NAME, 8, 1000); + 
lgs_mng_add_dim("sent", RRD_ALGORITHM_INCREMENTAL_NAME, -8, 1000); + } + + /* Request processing time - initialise */ + if(p_file_info->parser_config->chart_config & CHART_REQ_PROC_TIME){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "timings" // id + , "Request Processing Time" // title + , "milliseconds" // units + , "timings" // family + , NULL // context + , RRDSET_TYPE_LINE_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("min", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1000); + lgs_mng_add_dim("max", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1000); + lgs_mng_add_dim("avg", RRD_ALGORITHM_ABSOLUTE_NAME, 1, 1000); + } + + /* Response code family - initialise */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE_FAMILY){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "responses" // id + , "Response Codes" // title + , "requests" // units + , "responses" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("1xx", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("2xx", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("3xx", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("4xx", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("5xx", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("other", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* Response code - initialise */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "detailed_responses" // id + , "Detailed Response Codes" // title + , "requests" // units + , "responses" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + for(int idx = 0; idx < RESP_CODE_ARR_SIZE - 1; 
idx++){ + char dim_name[4]; + snprintfz(dim_name, 4, "%d", idx + 100); + lgs_mng_add_dim(dim_name, RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + } + + /* Response code type - initialise */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE_TYPE){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "response_types" // id + , "Response Statuses" // title + , "requests" // units + , "responses" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("success", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("redirect", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("bad", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("error", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("other", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* SSL protocol - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SSL_PROTO){ + lgs_mng_create_chart( + (char *) p_file_info->chartname // type + , "ssl_protocol" // id + , "Requests Per SSL Protocol" // title + , "requests" // units + , "ssl_protocol" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + + lgs_mng_add_dim("TLSV1", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("TLSV1.1", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("TLSV1.2", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("TLSV1.3", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("SSLV2", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("SSLV3", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + lgs_mng_add_dim("other", RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + /* SSL cipher suite - initialise */ + if(p_file_info->parser_config->chart_config & CHART_SSL_CIPHER){ + chart_data->cs_ssl_ciphers = lgs_mng_create_chart( + (char *) 
p_file_info->chartname // type + , "ssl_cipher_suite" // id + , "Requests by SSL cipher suite" // title + , "requests" // units + , "ssl_cipher_suite" // family + , NULL // context + , RRDSET_TYPE_AREA_NAME // chart_type + , ++chart_prio // priority + , p_file_info->update_every // update_every + ); + } + + lgs_mng_do_custom_charts_init(p_file_info); +} + + +void web_log_chart_update(struct File_info *p_file_info){ + chart_data_web_log_t *chart_data = p_file_info->chart_meta->chart_data_web_log; + Web_log_metrics_t *wlm = p_file_info->parser_metrics->web_log; + + if(chart_data->last_update != p_file_info->parser_metrics->last_update){ + + time_t lag_in_sec = p_file_info->parser_metrics->last_update - chart_data->last_update - 1; + + lgs_mng_do_num_of_logs_charts_update(p_file_info, lag_in_sec, chart_data); + + /* Vhost - update */ + if(p_file_info->parser_config->chart_config & CHART_VHOST){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "vhost"); + for(int idx = 0; idx < chart_data->vhost_size; idx++) + lgs_mng_update_chart_set(wlm->vhost_arr.vhosts[idx].name, chart_data->num_vhosts[idx]); + lgs_mng_update_chart_end(sec); + } + + if(wlm->vhost_arr.size > chart_data->vhost_size){ + if(wlm->vhost_arr.size >= chart_data->vhost_size_max){ + chart_data->vhost_size_max = wlm->vhost_arr.size * VHOST_BUFFS_SCALE_FACTOR + 1; + chart_data->num_vhosts = reallocz( chart_data->num_vhosts, + chart_data->vhost_size_max * sizeof(collected_number)); + + } + + for(int idx = chart_data->vhost_size; idx < wlm->vhost_arr.size; idx++){ + chart_data->num_vhosts[idx] = 0; + lgs_mng_add_dim_post_init( &chart_data->cs_vhosts, + wlm->vhost_arr.vhosts[idx].name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + chart_data->vhost_size = wlm->vhost_arr.size; + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "vhost"); + for(int idx = 0; idx < 
chart_data->vhost_size; idx++){ + chart_data->num_vhosts[idx] += wlm->vhost_arr.vhosts[idx].count; + wlm->vhost_arr.vhosts[idx].count = 0; + lgs_mng_update_chart_set(wlm->vhost_arr.vhosts[idx].name, chart_data->num_vhosts[idx]); + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Port - update */ + if(p_file_info->parser_config->chart_config & CHART_PORT){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "port"); + for(int idx = 0; idx < chart_data->port_size; idx++) + lgs_mng_update_chart_set(wlm->port_arr.ports[idx].name, chart_data->num_ports[idx]); + lgs_mng_update_chart_end(sec); + } + + if(wlm->port_arr.size > chart_data->port_size){ + if(wlm->port_arr.size >= chart_data->port_size_max){ + chart_data->port_size_max = wlm->port_arr.size * PORT_BUFFS_SCALE_FACTOR + 1; + chart_data->num_ports = reallocz( chart_data->num_ports, + chart_data->port_size_max * sizeof(collected_number)); + } + + for(int idx = chart_data->port_size; idx < wlm->port_arr.size; idx++){ + chart_data->num_ports[idx] = 0; + lgs_mng_add_dim_post_init( &chart_data->cs_ports, + wlm->port_arr.ports[idx].name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + chart_data->port_size = wlm->port_arr.size; + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "port"); + for(int idx = 0; idx < chart_data->port_size; idx++){ + chart_data->num_ports[idx] += wlm->port_arr.ports[idx].count; + wlm->port_arr.ports[idx].count = 0; + lgs_mng_update_chart_set(wlm->port_arr.ports[idx].name, chart_data->num_ports[idx]); + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* IP Version - update */ + if(p_file_info->parser_config->chart_config & CHART_IP_VERSION){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + 
lgs_mng_update_chart_begin(p_file_info->chartname, "ip_version"); + lgs_mng_update_chart_set("ipv4", chart_data->num_ip_ver_4); + lgs_mng_update_chart_set("ipv6", chart_data->num_ip_ver_6); + lgs_mng_update_chart_set("invalid", chart_data->num_ip_ver_invalid); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_ip_ver_4 += wlm->ip_ver.v4; + chart_data->num_ip_ver_6 += wlm->ip_ver.v6; + chart_data->num_ip_ver_invalid += wlm->ip_ver.invalid; + memset(&wlm->ip_ver, 0, sizeof(wlm->ip_ver)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "ip_version"); + lgs_mng_update_chart_set("ipv4", chart_data->num_ip_ver_4); + lgs_mng_update_chart_set("ipv6", chart_data->num_ip_ver_6); + lgs_mng_update_chart_set("invalid", chart_data->num_ip_ver_invalid); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request client current poll - update */ + if(p_file_info->parser_config->chart_config & CHART_REQ_CLIENT_CURRENT){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "clients"); + lgs_mng_update_chart_set("ipv4", chart_data->num_req_client_current_ipv4); + lgs_mng_update_chart_set("ipv6", chart_data->num_req_client_current_ipv6); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_req_client_current_ipv4 += wlm->req_clients_current_arr.ipv4_size; + wlm->req_clients_current_arr.ipv4_size = 0; + chart_data->num_req_client_current_ipv6 += wlm->req_clients_current_arr.ipv6_size; + wlm->req_clients_current_arr.ipv6_size = 0; + + lgs_mng_update_chart_begin(p_file_info->chartname, "clients"); + lgs_mng_update_chart_set("ipv4", chart_data->num_req_client_current_ipv4); + lgs_mng_update_chart_set("ipv6", chart_data->num_req_client_current_ipv6); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request client all-time - update */ + if(p_file_info->parser_config->chart_config & 
CHART_REQ_CLIENT_ALL_TIME){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "clients_all"); + lgs_mng_update_chart_set("ipv4", chart_data->num_req_client_all_time_ipv4); + lgs_mng_update_chart_set("ipv6", chart_data->num_req_client_all_time_ipv6); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_req_client_all_time_ipv4 = wlm->req_clients_alltime_arr.ipv4_size; + chart_data->num_req_client_all_time_ipv6 = wlm->req_clients_alltime_arr.ipv6_size; + + lgs_mng_update_chart_begin(p_file_info->chartname, "clients_all"); + lgs_mng_update_chart_set("ipv4", chart_data->num_req_client_all_time_ipv4); + lgs_mng_update_chart_set("ipv6", chart_data->num_req_client_all_time_ipv6); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request methods - update */ + if(p_file_info->parser_config->chart_config & CHART_REQ_METHODS){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "http_methods"); + for(int idx = 0; idx < REQ_METHOD_ARR_SIZE; idx++){ + if(chart_data->num_req_method[idx]) + lgs_mng_update_chart_set(req_method_str[idx], chart_data->num_req_method[idx]); + } + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "http_methods"); + for(int idx = 0; idx < REQ_METHOD_ARR_SIZE; idx++){ + chart_data->num_req_method[idx] += wlm->req_method[idx]; + wlm->req_method[idx] = 0; + if(chart_data->num_req_method[idx]) + lgs_mng_update_chart_set(req_method_str[idx], chart_data->num_req_method[idx]); + } + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request protocol - update */ + if(p_file_info->parser_config->chart_config & CHART_REQ_PROTO){ + for(time_t sec = p_file_info->parser_metrics->last_update - 
lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "http_versions"); + lgs_mng_update_chart_set("1.0", chart_data->num_req_proto_http_1); + lgs_mng_update_chart_set("1.1", chart_data->num_req_proto_http_1_1); + lgs_mng_update_chart_set("2.0", chart_data->num_req_proto_http_2); + lgs_mng_update_chart_set("other", chart_data->num_req_proto_other); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_req_proto_http_1 += wlm->req_proto.http_1; + chart_data->num_req_proto_http_1_1 += wlm->req_proto.http_1_1; + chart_data->num_req_proto_http_2 += wlm->req_proto.http_2; + chart_data->num_req_proto_other += wlm->req_proto.other; + memset(&wlm->req_proto, 0, sizeof(wlm->req_proto)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "http_versions"); + lgs_mng_update_chart_set("1.0", chart_data->num_req_proto_http_1); + lgs_mng_update_chart_set("1.1", chart_data->num_req_proto_http_1_1); + lgs_mng_update_chart_set("2.0", chart_data->num_req_proto_http_2); + lgs_mng_update_chart_set("other", chart_data->num_req_proto_other); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request bandwidth - update */ + if(p_file_info->parser_config->chart_config & CHART_BANDWIDTH){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "bandwidth"); + lgs_mng_update_chart_set("received", chart_data->num_bandwidth_req_size); + lgs_mng_update_chart_set("sent", chart_data->num_bandwidth_resp_size); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_bandwidth_req_size += wlm->bandwidth.req_size; + chart_data->num_bandwidth_resp_size += wlm->bandwidth.resp_size; + memset(&wlm->bandwidth, 0, sizeof(wlm->bandwidth)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "bandwidth"); + lgs_mng_update_chart_set("received", 
chart_data->num_bandwidth_req_size); + lgs_mng_update_chart_set("sent", chart_data->num_bandwidth_resp_size); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Request proc time - update */ + if(p_file_info->parser_config->chart_config & CHART_REQ_PROC_TIME){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "timings"); + lgs_mng_update_chart_set("min", chart_data->num_req_proc_time_min); + lgs_mng_update_chart_set("max", chart_data->num_req_proc_time_max); + lgs_mng_update_chart_set("avg", chart_data->num_req_proc_time_avg); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_req_proc_time_min = wlm->req_proc_time.min; + chart_data->num_req_proc_time_max = wlm->req_proc_time.max; + chart_data->num_req_proc_time_avg = wlm->req_proc_time.count ? + wlm->req_proc_time.sum / wlm->req_proc_time.count : 0; + memset(&wlm->req_proc_time, 0, sizeof(wlm->req_proc_time)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "timings"); + lgs_mng_update_chart_set("min", chart_data->num_req_proc_time_min); + lgs_mng_update_chart_set("max", chart_data->num_req_proc_time_max); + lgs_mng_update_chart_set("avg", chart_data->num_req_proc_time_avg); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Response code family - update */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE_FAMILY){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "responses"); + lgs_mng_update_chart_set("1xx", chart_data->num_resp_code_family_1xx); + lgs_mng_update_chart_set("2xx", chart_data->num_resp_code_family_2xx); + lgs_mng_update_chart_set("3xx", chart_data->num_resp_code_family_3xx); + lgs_mng_update_chart_set("4xx", 
chart_data->num_resp_code_family_4xx); + lgs_mng_update_chart_set("5xx", chart_data->num_resp_code_family_5xx); + lgs_mng_update_chart_set("other", chart_data->num_resp_code_family_other); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_resp_code_family_1xx += wlm->resp_code_family.resp_1xx; + chart_data->num_resp_code_family_2xx += wlm->resp_code_family.resp_2xx; + chart_data->num_resp_code_family_3xx += wlm->resp_code_family.resp_3xx; + chart_data->num_resp_code_family_4xx += wlm->resp_code_family.resp_4xx; + chart_data->num_resp_code_family_5xx += wlm->resp_code_family.resp_5xx; + chart_data->num_resp_code_family_other += wlm->resp_code_family.other; + memset(&wlm->resp_code_family, 0, sizeof(wlm->resp_code_family)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "responses"); + lgs_mng_update_chart_set("1xx", chart_data->num_resp_code_family_1xx); + lgs_mng_update_chart_set("2xx", chart_data->num_resp_code_family_2xx); + lgs_mng_update_chart_set("3xx", chart_data->num_resp_code_family_3xx); + lgs_mng_update_chart_set("4xx", chart_data->num_resp_code_family_4xx); + lgs_mng_update_chart_set("5xx", chart_data->num_resp_code_family_5xx); + lgs_mng_update_chart_set("other", chart_data->num_resp_code_family_other); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Response code - update */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE){ + char dim_name[4]; + + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "detailed_responses"); + for(int idx = 0; idx < RESP_CODE_ARR_SIZE - 1; idx++){ + if(chart_data->num_resp_code[idx]){ + snprintfz(dim_name, 4, "%d", idx + 100); + lgs_mng_update_chart_set(dim_name, chart_data->num_resp_code[idx]); + } + } + if(chart_data->num_resp_code[RESP_CODE_ARR_SIZE - 1]) + lgs_mng_update_chart_set("other", 
chart_data->num_resp_code[RESP_CODE_ARR_SIZE - 1]); + lgs_mng_update_chart_end(sec); + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "detailed_responses"); + for(int idx = 0; idx < RESP_CODE_ARR_SIZE - 1; idx++){ + chart_data->num_resp_code[idx] += wlm->resp_code[idx]; + wlm->resp_code[idx] = 0; + if(chart_data->num_resp_code[idx]){ + snprintfz(dim_name, 4, "%d", idx + 100); + lgs_mng_update_chart_set(dim_name, chart_data->num_resp_code[idx]); + } + } + chart_data->num_resp_code[RESP_CODE_ARR_SIZE - 1] += wlm->resp_code[RESP_CODE_ARR_SIZE - 1]; + wlm->resp_code[RESP_CODE_ARR_SIZE - 1] = 0; + if(chart_data->num_resp_code[RESP_CODE_ARR_SIZE - 1]) + lgs_mng_update_chart_set("other", chart_data->num_resp_code[RESP_CODE_ARR_SIZE - 1]); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* Response code type - update */ + if(p_file_info->parser_config->chart_config & CHART_RESP_CODE_TYPE){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "response_types"); + lgs_mng_update_chart_set("success", chart_data->num_resp_code_type_success); + lgs_mng_update_chart_set("redirect", chart_data->num_resp_code_type_redirect); + lgs_mng_update_chart_set("bad", chart_data->num_resp_code_type_bad); + lgs_mng_update_chart_set("error", chart_data->num_resp_code_type_error); + lgs_mng_update_chart_set("other", chart_data->num_resp_code_type_other); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_resp_code_type_success += wlm->resp_code_type.resp_success; + chart_data->num_resp_code_type_redirect += wlm->resp_code_type.resp_redirect; + chart_data->num_resp_code_type_bad += wlm->resp_code_type.resp_bad; + chart_data->num_resp_code_type_error += wlm->resp_code_type.resp_error; + chart_data->num_resp_code_type_other += wlm->resp_code_type.other; + memset(&wlm->resp_code_type, 0, sizeof(wlm->resp_code_type)); + 
+ lgs_mng_update_chart_begin(p_file_info->chartname, "response_types"); + lgs_mng_update_chart_set("success", chart_data->num_resp_code_type_success); + lgs_mng_update_chart_set("redirect", chart_data->num_resp_code_type_redirect); + lgs_mng_update_chart_set("bad", chart_data->num_resp_code_type_bad); + lgs_mng_update_chart_set("error", chart_data->num_resp_code_type_error); + lgs_mng_update_chart_set("other", chart_data->num_resp_code_type_other); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* SSL protocol - update */ + if(p_file_info->parser_config->chart_config & CHART_SSL_PROTO){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "ssl_protocol"); + lgs_mng_update_chart_set("TLSV1", chart_data->num_ssl_proto_tlsv1); + lgs_mng_update_chart_set("TLSV1.1", chart_data->num_ssl_proto_tlsv1_1); + lgs_mng_update_chart_set("TLSV1.2", chart_data->num_ssl_proto_tlsv1_2); + lgs_mng_update_chart_set("TLSV1.3", chart_data->num_ssl_proto_tlsv1_3); + lgs_mng_update_chart_set("SSLV2", chart_data->num_ssl_proto_sslv2); + lgs_mng_update_chart_set("SSLV3", chart_data->num_ssl_proto_sslv3); + lgs_mng_update_chart_set("other", chart_data->num_ssl_proto_other); + lgs_mng_update_chart_end(sec); + } + + chart_data->num_ssl_proto_tlsv1 += wlm->ssl_proto.tlsv1; + chart_data->num_ssl_proto_tlsv1_1 += wlm->ssl_proto.tlsv1_1; + chart_data->num_ssl_proto_tlsv1_2 += wlm->ssl_proto.tlsv1_2; + chart_data->num_ssl_proto_tlsv1_3 += wlm->ssl_proto.tlsv1_3; + chart_data->num_ssl_proto_sslv2 += wlm->ssl_proto.sslv2; + chart_data->num_ssl_proto_sslv3 += wlm->ssl_proto.sslv3; + chart_data->num_ssl_proto_other += wlm->ssl_proto.other; + memset(&wlm->ssl_proto, 0, sizeof(wlm->ssl_proto)); + + lgs_mng_update_chart_begin(p_file_info->chartname, "ssl_protocol"); + lgs_mng_update_chart_set("TLSV1", chart_data->num_ssl_proto_tlsv1); + 
lgs_mng_update_chart_set("TLSV1.1", chart_data->num_ssl_proto_tlsv1_1); + lgs_mng_update_chart_set("TLSV1.2", chart_data->num_ssl_proto_tlsv1_2); + lgs_mng_update_chart_set("TLSV1.3", chart_data->num_ssl_proto_tlsv1_3); + lgs_mng_update_chart_set("SSLV2", chart_data->num_ssl_proto_sslv2); + lgs_mng_update_chart_set("SSLV3", chart_data->num_ssl_proto_sslv3); + lgs_mng_update_chart_set("other", chart_data->num_ssl_proto_other); + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + /* SSL cipher suite - update */ + if(p_file_info->parser_config->chart_config & CHART_SSL_CIPHER){ + for(time_t sec = p_file_info->parser_metrics->last_update - lag_in_sec; + sec < p_file_info->parser_metrics->last_update; + sec++){ + + lgs_mng_update_chart_begin(p_file_info->chartname, "ssl_cipher_suite"); + for(int idx = 0; idx < chart_data->ssl_cipher_size; idx++){ + lgs_mng_update_chart_set( wlm->ssl_cipher_arr.ssl_ciphers[idx].name, + chart_data->num_ssl_ciphers[idx]); + } + lgs_mng_update_chart_end(sec); + } + + if(wlm->ssl_cipher_arr.size > chart_data->ssl_cipher_size){ + /* Grow to the new size, but keep the old ssl_cipher_size until the new dimensions are initialised below */ + chart_data->num_ssl_ciphers = reallocz( chart_data->num_ssl_ciphers, + wlm->ssl_cipher_arr.size * sizeof(collected_number)); + + for(int idx = chart_data->ssl_cipher_size; idx < wlm->ssl_cipher_arr.size; idx++){ + chart_data->num_ssl_ciphers[idx] = 0; + lgs_mng_add_dim_post_init( &chart_data->cs_ssl_ciphers, + wlm->ssl_cipher_arr.ssl_ciphers[idx].name, + RRD_ALGORITHM_INCREMENTAL_NAME, 1, 1); + } + + chart_data->ssl_cipher_size = wlm->ssl_cipher_arr.size; + } + + lgs_mng_update_chart_begin(p_file_info->chartname, "ssl_cipher_suite"); + for(int idx = 0; idx < chart_data->ssl_cipher_size; idx++){ + chart_data->num_ssl_ciphers[idx] += wlm->ssl_cipher_arr.ssl_ciphers[idx].count; + wlm->ssl_cipher_arr.ssl_ciphers[idx].count = 0; + lgs_mng_update_chart_set( wlm->ssl_cipher_arr.ssl_ciphers[idx].name, + chart_data->num_ssl_ciphers[idx]); + 
} + lgs_mng_update_chart_end(p_file_info->parser_metrics->last_update); + } + + lgs_mng_do_custom_charts_update(p_file_info, lag_in_sec); + + chart_data->last_update = p_file_info->parser_metrics->last_update; + } +} diff --git a/logsmanagement/rrd_api/rrd_api_web_log.h b/logsmanagement/rrd_api/rrd_api_web_log.h new file mode 100644 index 00000000..de0c88e3 --- /dev/null +++ b/logsmanagement/rrd_api/rrd_api_web_log.h @@ -0,0 +1,88 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file rrd_api_web_log.h + * @brief Includes the structure and function definitions for + * the web log charts. + */ + +#ifndef RRD_API_WEB_LOG_H_ +#define RRD_API_WEB_LOG_H_ + +#include "daemon/common.h" + +struct File_info; + +typedef struct Chart_data_web_log chart_data_web_log_t; + +#include "../file_info.h" +#include "../circular_buffer.h" + +#include "rrd_api.h" + +struct Chart_data_web_log { + + time_t last_update; + + /* Number of collected log records */ + collected_number num_lines; + + /* Vhosts */ + struct Chart_str cs_vhosts; + collected_number *num_vhosts; + int vhost_size, vhost_size_max; /**< Actual size and maximum allocated size of dim_vhosts, num_vhosts arrays **/ + + /* Ports */ + struct Chart_str cs_ports; + collected_number *num_ports; + int port_size, port_size_max; /**< Actual size and maximum allocated size of dim_ports, num_ports and ports arrays **/ + + /* IP Version */ + collected_number num_ip_ver_4, num_ip_ver_6, num_ip_ver_invalid; + + /* Request client current poll */ + collected_number num_req_client_current_ipv4, num_req_client_current_ipv6; + + /* Request client all-time */ + collected_number num_req_client_all_time_ipv4, num_req_client_all_time_ipv6; + + /* Request methods */ + collected_number num_req_method[REQ_METHOD_ARR_SIZE]; + + /* Request protocol */ + collected_number num_req_proto_http_1, num_req_proto_http_1_1, + num_req_proto_http_2, num_req_proto_other; + + /* Request bandwidth */ + collected_number num_bandwidth_req_size, 
num_bandwidth_resp_size; + + /* Request processing time */ + collected_number num_req_proc_time_min, num_req_proc_time_max, num_req_proc_time_avg; + + /* Response code family */ + collected_number num_resp_code_family_1xx, num_resp_code_family_2xx, + num_resp_code_family_3xx, num_resp_code_family_4xx, + num_resp_code_family_5xx, num_resp_code_family_other; + + /* Response code */ + collected_number num_resp_code[RESP_CODE_ARR_SIZE]; + + /* Response code type */ + collected_number num_resp_code_type_success, num_resp_code_type_redirect, + num_resp_code_type_bad, num_resp_code_type_error, num_resp_code_type_other; + + /* SSL protocol */ + collected_number num_ssl_proto_tlsv1, num_ssl_proto_tlsv1_1, + num_ssl_proto_tlsv1_2, num_ssl_proto_tlsv1_3, + num_ssl_proto_sslv2, num_ssl_proto_sslv3, num_ssl_proto_other; + + /* SSL cipher suite */ + struct Chart_str cs_ssl_ciphers; + collected_number *num_ssl_ciphers; + int ssl_cipher_size; + +}; + +void web_log_chart_init(struct File_info *p_file_info); +void web_log_chart_update(struct File_info *p_file_info); + +#endif // RRD_API_WEB_LOG_H_ diff --git a/logsmanagement/stock_conf/logsmanagement.d.conf b/logsmanagement/stock_conf/logsmanagement.d.conf new file mode 100644 index 00000000..1089aee1 --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d.conf @@ -0,0 +1,33 @@ +[global] + update every = 1 + update timeout = 10 + use log timestamp = auto + circular buffer max size MiB = 64 + circular buffer drop logs if full = no + compression acceleration = 1 + collected logs total chart enable = no + collected logs rate chart enable = yes + submit logs to system journal = no + systemd journal fields prefix = LOGS_MANAG_ + +[db] + db mode = none + # db dir = change to use non-default path + circular buffer flush to db = 6 + disk space limit MiB = 500 + +[forward input] + enabled = no + unix path = + unix perm = 0644 + listen = 0.0.0.0 + port = 24224 + +[fluent bit] + flush = 0.1 + http listen = 0.0.0.0 + http port = 2020 + 
http server = false + # log file = change to use non-default path + log level = info + coro stack size = 24576 diff --git a/logsmanagement/stock_conf/logsmanagement.d/default.conf b/logsmanagement/stock_conf/logsmanagement.d/default.conf new file mode 100644 index 00000000..80ea790c --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d/default.conf @@ -0,0 +1,455 @@ +# ------------------------------------------------------------------------------ +# Netdata Logs Management default configuration +# See full explanation on https://github.com/netdata/netdata/blob/master/logsmanagement/README.md +# +# To add a new log source, a new section must be added in this +# file with at least the following settings: +# +# [LOG SOURCE NAME] +# enabled = yes +# log type = flb_tail +# +# For a list of all available log types, see: +# https://github.com/netdata/netdata/blob/master/logsmanagement/README.md#types-of-available-collectors +# +# ------------------------------------------------------------------------------ + +[kmsg Logs] + ## Example: Log collector that will collect new kernel ring buffer logs + + ## Required settings + enabled = yes + log type = flb_kmsg + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + use log timestamp = no + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Drop kernel logs with priority higher than prio_level. 
+ # prio level = 8 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + severity chart = yes + subsystem chart = yes + device chart = yes + + ## Example of capturing specific kmsg events: + # custom 1 chart = USB connect/disconnect + # custom 1 regex name = connect + # custom 1 regex = .*\bNew USB device found\b.* + + # custom 2 chart = USB connect/disconnect + # custom 2 regex name = disconnect + # custom 2 regex = .*\bUSB disconnect\b.* + +[Systemd Logs] + ## Example: Log collector that will query journald to collect system logs + + ## Required settings + enabled = yes + log type = flb_systemd + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Use default path to Systemd Journal + log path = auto + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + priority value chart = yes + severity chart = yes + facility chart = yes + +[Docker Events Logs] + ## Example: Log collector that will monitor the Docker daemon socket and + ## collect Docker event logs in a default format similar to executing + ## the `sudo docker events` command. + + ## Required settings + enabled = yes + log type = flb_docker_events + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. 
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Use default Docker socket UNIX path: /var/run/docker.sock + log path = auto + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + event type chart = yes + event action chart = yes + + ## Example of how to capture create / attach / die events for a named container: + # custom 1 chart = serverA events + # custom 1 regex name = container create + # custom 1 regex = .*\bcontainer create\b.*\bname=serverA\b.* + + # custom 2 chart = serverA events + # custom 2 regex name = container attach + # custom 2 regex = .*\bcontainer attach\b.*\bname=serverA\b.* + + # custom 3 chart = serverA events + # custom 3 regex name = container die + # custom 3 regex = .*\bcontainer die\b.*\bname=serverA\b.* + + ## Stream to https://cloud.openobserve.ai/ + # output 1 name = http + # output 1 URI = YOUR_API_URI + # output 1 Host = api.openobserve.ai + # output 1 Port = 443 + # output 1 tls = On + # output 1 Format = json + # output 1 Json_date_key = _timestamp + # output 1 Json_date_format = iso8601 + # output 1 HTTP_User = test@netdata.cloud + # output 1 HTTP_Passwd = YOUR_OPENOBSERVE_PASSWORD + # output 1 compress = gzip + + ## Real-time export to /tmp/docker_event_logs.csv + # output 2 name = file + # output 2 Path = /tmp + # output 2 File = docker_event_logs.csv + +[Apache access.log] + ## Example: Log collector that will tail Apache's access.log file and + ## parse each new record to extract common web server metrics. + + ## Required settings + enabled = yes + log type = flb_web_log + + ## Optional settings, common to all log source. 
+ ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /var/log/apache2/access.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## Auto-detect web log format, otherwise it can be set manually, e.g.: + ## log format = %h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-agent}i" + ## see https://httpd.apache.org/docs/2.4/logs.html#accesslog + log format = auto + + ## Detect errors such as illegal port numbers or response codes. + verify parsed logs = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + vhosts chart = yes + ports chart = yes + IP versions chart = yes + unique client IPs - current poll chart = yes + unique client IPs - all-time chart = no + http request methods chart = yes + http protocol versions chart = yes + bandwidth chart = yes + timings chart = yes + response code families chart = yes + response codes chart = yes + response code types chart = yes + SSL protocols chart = yes + SSL chipher suites chart = yes + +[Nginx access.log] + ## Example: Log collector that will tail Nginx's access.log file and + ## parse each new record to extract common web server metrics. + + ## Required settings + enabled = yes + log type = flb_web_log + + ## Optional settings, common to all log source. 
+ ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /var/log/nginx/access.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## see https://docs.nginx.com/nginx/admin-guide/monitoring/logging/#setting-up-the-access-log + log format = $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent $request_length $request_time "$http_referer" "$http_user_agent" + + ## Detect errors such as illegal port numbers or response codes. + verify parsed logs = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + vhosts chart = yes + ports chart = yes + IP versions chart = yes + unique client IPs - current poll chart = yes + unique client IPs - all-time chart = no + http request methods chart = yes + http protocol versions chart = yes + bandwidth chart = yes + timings chart = yes + response code families chart = yes + response codes chart = yes + response code types chart = yes + SSL protocols chart = yes + SSL chipher suites chart = yes + +[Netdata daemon.log] + ## Example: Log collector that will tail Netdata's daemon.log and + ## it will generate log level charts based on custom regular expressions. + + ## Required settings + enabled = yes + log type = flb_tail + + ## Optional settings, common to all log source. 
+ ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /tmp/netdata/var/log/netdata/daemon.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + + ## Examples of extracting custom metrics from Netdata's daemon.log: + + ## log level chart + custom 1 chart = log level + custom 1 regex name = emergency + custom 1 regex = level=emergency + custom 1 ignore case = no + + custom 2 chart = log level + custom 2 regex name = alert + custom 2 regex = level=alert + custom 2 ignore case = no + + custom 3 chart = log level + custom 3 regex name = critical + custom 3 regex = level=critical + custom 3 ignore case = no + + custom 4 chart = log level + custom 4 regex name = error + custom 4 regex = level=error + custom 4 ignore case = no + + custom 5 chart = log level + custom 5 regex name = warning + custom 5 regex = level=warning + custom 5 ignore case = no + + custom 6 chart = log level + custom 6 regex name = notice + custom 6 regex = level=notice + custom 6 ignore case = no + + custom 7 chart = log level + custom 7 regex name = info + custom 7 regex = level=info + custom 7 ignore case = no + + custom 8 chart = log level + custom 8 regex name = debug + custom 8 regex = level=debug + custom 8 ignore case = no + 
+[Netdata fluentbit.log] + ## Example: Log collector that will tail Netdata's + ## embedded Fluent Bit's logs + + ## Required settings + enabled = no + log type = flb_tail + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /tmp/netdata/var/log/netdata/fluentbit.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + + ## Examples of extracting custom metrics from fluentbit.log: + + ## log level chart + custom 1 chart = log level + custom 1 regex name = error + custom 1 regex = \[error\] + custom 1 ignore case = no + + custom 2 chart = log level + custom 2 regex name = warning + custom 2 regex = \[warning\] + custom 2 ignore case = no + + custom 3 chart = log level + custom 3 regex name = info + custom 3 regex = \[ info\] + custom 3 ignore case = no + + custom 4 chart = log level + custom 4 regex name = debug + custom 4 regex = \[debug\] + custom 4 ignore case = no + + custom 5 chart = log level + custom 5 regex name = trace + custom 5 regex = \[trace\] + custom 5 ignore case = no + +[auth.log tail] + ## Example: Log collector that will tail auth.log file and count + ## occurrences of certain `sudo` commands, using POSIX regular expressions. 
+ + ## Required settings + enabled = no + log type = flb_tail + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /var/log/auth.log + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + + ## Examples of extracting custom metrics from auth.log: + # custom 1 chart = failed su + # # custom 1 regex name = + # custom 1 regex = .*\bsu\b.*\bFAILED SU\b.* + # custom 1 ignore case = no + + # custom 2 chart = sudo commands + # custom 2 regex name = sudo su + # custom 2 regex = .*\bsudo\b.*\bCOMMAND=/usr/bin/su\b.* + # custom 2 ignore case = yes + + # custom 3 chart = sudo commands + # custom 3 regex name = sudo docker run + # custom 3 regex = .*\bsudo\b.*\bCOMMAND=/usr/bin/docker run\b.* + # custom 3 ignore case = yes diff --git a/logsmanagement/stock_conf/logsmanagement.d/example_forward.conf b/logsmanagement/stock_conf/logsmanagement.d/example_forward.conf new file mode 100644 index 00000000..87921d25 --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d/example_forward.conf @@ -0,0 +1,96 @@ +[Forward systemd] + ## Example: Log collector that will collect streamed Systemd logs + ## only for parsing, according to global "forward in" configuration 
+ ## found in logsmanagement.d.conf . + + ## Required settings + enabled = no + log type = flb_systemd + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Streaming input settings. + log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30735 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + priority value chart = yes + severity chart = yes + facility chart = yes + +[Forward Docker Events] + ## Example: Log collector that will collect streamed Docker Events logs + ## only for parsing, according to global "forward in" configuration + ## found in logsmanagement.d.conf . + + ## Required settings + enabled = no + log type = flb_docker_events + + ## Optional settings, common to all log source. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Streaming input settings. 
+ log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30736 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + event type chart = yes + +[Forward collection] + ## Example: Log collector that will collect streamed logs of any type + ## according to global "forward in" configuration found in + ## logsmanagement.d.conf and will also save them in the logs database. + + ## Required settings + enabled = no + log type = flb_tail + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf. + # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + db mode = full + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Streaming input settings. + log source = forward + stream guid = 6ce266f5-2704-444d-a301-2423b9d30737 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes diff --git a/logsmanagement/stock_conf/logsmanagement.d/example_mqtt.conf b/logsmanagement/stock_conf/logsmanagement.d/example_mqtt.conf new file mode 100644 index 00000000..2481795d --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d/example_mqtt.conf @@ -0,0 +1,31 @@ +[MQTT messages] + ## Example: Log collector that will create a server to listen for MQTT logs over a TCP connection. + + ## Required settings + enabled = no + log type = flb_mqtt + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf.
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Set up configuration specific to flb_mqtt + ## see also https://docs.fluentbit.io/manual/pipeline/inputs/mqtt + # listen = 0.0.0.0 + # port = 1883 + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + topic chart = yes diff --git a/logsmanagement/stock_conf/logsmanagement.d/example_serial.conf b/logsmanagement/stock_conf/logsmanagement.d/example_serial.conf new file mode 100644 index 00000000..7b0bb0bc --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d/example_serial.conf @@ -0,0 +1,38 @@ +[Serial logs] + ## Example: Log collector that will collect logs from a serial interface. + + ## Required settings + enabled = no + log type = flb_serial + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf.
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Set up configuration specific to flb_serial + log path = /dev/pts/4 + bitrate = 115200 + min bytes = 1 + # separator = X + # format = json + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + + ## Example of extracting custom metrics from serial interface messages: + # custom 1 chart = UART0 + # # custom 1 regex name = test + # custom 1 regex = .*\bUART0\b.* + # # custom 1 ignore case = no diff --git a/logsmanagement/stock_conf/logsmanagement.d/example_syslog.conf b/logsmanagement/stock_conf/logsmanagement.d/example_syslog.conf new file mode 100644 index 00000000..2dbd416e --- /dev/null +++ b/logsmanagement/stock_conf/logsmanagement.d/example_syslog.conf @@ -0,0 +1,145 @@ +[syslog tail] + ## Example: Log collector that will tail the syslog file and count + ## occurrences of certain keywords, using POSIX regular expressions. + + ## Required settings + enabled = no + log type = flb_tail + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf.
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## This section supports auto-detection of log file path if section name + ## is left unchanged, otherwise it can be set manually, e.g.: + ## log path = /var/log/syslog + ## log path = /var/log/messages + ## See README for more information on 'log path = auto' option + log path = auto + + ## Use inotify instead of file stat watcher. Set to 'no' to reduce CPU usage. + use inotify = yes + + ## Submit structured log entries to the system journal + # submit logs to system journal = no + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + + ## Examples of extracting custom metrics from syslog: + # custom 1 chart = identifier + # custom 1 regex name = kernel + # custom 1 regex = .*\bkernel\b.* + # custom 1 ignore case = no + + # custom 2 chart = identifier + # custom 2 regex name = systemd + # custom 2 regex = .*\bsystemd\b.* + # custom 2 ignore case = no + + # custom 3 chart = identifier + # custom 3 regex name = CRON + # custom 3 regex = .*\bCRON\b.* + # custom 3 ignore case = no + + # custom 4 chart = identifier + # custom 4 regex name = netdata + # custom 4 regex = .*\bnetdata\b.* + # custom 4 ignore case = no + +[syslog Unix socket] + ## Example: Log collector that will listen for RFC-3164 syslog on a UNIX + ## socket that will be created on /tmp/netdata-syslog.sock . + + ## Required settings + enabled = no + log type = flb_syslog + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf.
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Netdata will create this socket if mode == unix_tcp or mode == unix_udp, + ## please ensure the right permissions exist for this path + log path = /tmp/netdata-syslog.sock + + ## Ruby Regular Expression to define expected syslog format + ## Please make sure <PRIVAL>, <SYSLOG_TIMESTAMP>, <HOSTNAME>, <SYSLOG_IDENTIFIER>, <PID> and <MESSAGE> are defined + ## see also https://docs.fluentbit.io/manual/pipeline/parsers/regular-expression + log format = /^\<(?<PRIVAL>[0-9]+)\>(?<SYSLOG_TIMESTAMP>[^ ]* {1,2}[^ ]* [^ ]* )(?<HOSTNAME>[^ ]*) (?<SYSLOG_IDENTIFIER>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<PID>[0-9]+)\])?(?:[^\:]*\:)? *(?<MESSAGE>.*)$/ + + ## Set up configuration specific to flb_syslog + ## see also https://docs.fluentbit.io/manual/pipeline/inputs/syslog#configuration-parameters + ## Modes supported are: unix_tcp, unix_udp, tcp, udp + mode = unix_udp + # listen = 0.0.0.0 + # port = 5140 + unix_perm = 0666 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + priority value chart = yes + severity chart = yes + facility chart = yes + +[syslog TCP socket] + ## Example: Log collector that will listen for RFC-3164 syslog, + ## incoming via TCP on localhost IP and port 5140. + + ## Required settings + enabled = no + log type = flb_syslog + + ## Optional settings, common to all log sources. + ## Uncomment to override global equivalents in netdata.conf.
+ # update every = 1 + # update timeout = 10 + # use log timestamp = auto + # circular buffer max size MiB = 64 + # circular buffer drop logs if full = no + # compression acceleration = 1 + # db mode = none + # circular buffer flush to db = 6 + # disk space limit MiB = 500 + + ## Netdata will create this socket if mode == unix_tcp or mode == unix_udp, + ## please ensure the right permissions exist for this path + # log path = /tmp/netdata-syslog.sock + + ## Ruby Regular Expression to define expected syslog format + ## Please make sure <PRIVAL>, <SYSLOG_TIMESTAMP>, <HOSTNAME>, <SYSLOG_IDENTIFIER>, <PID> and <MESSAGE> are defined + ## see also https://docs.fluentbit.io/manual/pipeline/parsers/regular-expression + log format = /^\<(?<PRIVAL>[0-9]+)\>(?<SYSLOG_TIMESTAMP>[^ ]* {1,2}[^ ]* [^ ]* )(?<HOSTNAME>[^ ]*) (?<SYSLOG_IDENTIFIER>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<PID>[0-9]+)\])?(?:[^\:]*\:)? *(?<MESSAGE>.*)$/ + + ## Set up configuration specific to flb_syslog + ## see also https://docs.fluentbit.io/manual/pipeline/inputs/syslog#configuration-parameters + ## Modes supported are: unix_tcp, unix_udp, tcp, udp + mode = tcp + listen = 0.0.0.0 + port = 5140 + # unix_perm = 0666 + + ## Charts to enable + # collected logs total chart enable = no + # collected logs rate chart enable = yes + priority value chart = yes + severity chart = yes + facility chart = yes diff --git a/logsmanagement/unit_test/unit_test.c b/logsmanagement/unit_test/unit_test.c new file mode 100644 index 00000000..9ee50458 --- /dev/null +++ b/logsmanagement/unit_test/unit_test.c @@ -0,0 +1,780 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file unit_test.c + * @brief Includes unit tests for the Logs Management project + */ + +#include "unit_test.h" +#include <stdlib.h> +#include <stdio.h> +#define __USE_XOPEN_EXTENDED +#include <ftw.h> +#include <unistd.h> +#include "../circular_buffer.h" +#include "../helper.h" +#include "../logsmanag_config.h" +#include "../parser.h" +#include "../query.h"
+#include "../db_api.h" + +static int old_stdout = STDOUT_FILENO; +static int old_stderr = STDERR_FILENO; + +#define SUPRESS_STDX(stream_no) \ +{ \ + if(stream_no == STDOUT_FILENO) \ + old_stdout = dup(old_stdout); \ + else \ + old_stderr = dup(old_stderr); \ + if(!freopen("/dev/null", "w", stream_no == STDOUT_FILENO ? stdout : stderr)) \ + exit(-1); \ +} + +#define UNSUPRESS_STDX(stream_no) \ +{ \ + fclose(stream_no == STDOUT_FILENO ? stdout : stderr); \ + if(stream_no == STDOUT_FILENO) \ + stdout = fdopen(old_stdout, "w"); \ + else \ + stderr = fdopen(old_stderr, "w"); \ +} + +#define SUPRESS_STDOUT() SUPRESS_STDX(STDOUT_FILENO) +#define SUPRESS_STDERR() SUPRESS_STDX(STDERR_FILENO) +#define UNSUPRESS_STDOUT() UNSUPRESS_STDX(STDOUT_FILENO) +#define UNSUPRESS_STDERR() UNSUPRESS_STDX(STDERR_FILENO) + +#define LOG_RECORDS_PARTIAL "\ +127.0.0.1 - - [30/Jun/2022:16:43:51 +0300] \"GET / HTTP/1.0\" 200 11192 \"-\" \"ApacheBench/2.3\"\n\ +192.168.2.1 - - [30/Jun/2022:16:43:51 +0300] \"PUT / HTTP/1.0\" 400 11192 \"-\" \"ApacheBench/2.3\"\n\ +255.91.204.202 - mann1475 [30/Jun/2023:21:05:09 +0000] \"POST /vertical/turn-key/engineer/e-enable HTTP/1.0\" 401 11411\n\ +91.126.60.234 - ritchie4302 [30/Jun/2023:21:05:09 +0000] \"PATCH /empower/interfaces/deploy HTTP/2.0\" 404 29063\n\ +120.134.242.160 - runte5364 [30/Jun/2023:21:05:09 +0000] \"GET /visualize/enterprise/optimize/embrace HTTP/1.0\" 400 10637\n\ +61.134.57.25 - - [30/Jun/2023:21:05:09 +0000] \"HEAD /metrics/optimize/bandwidth HTTP/1.1\" 200 26713\n\ +18.90.118.50 - - [30/Jun/2023:21:05:09 +0000] \"PATCH /methodologies/extend HTTP/2.0\" 205 15708\n\ +21.174.251.223 - zulauf8852 [30/Jun/2023:21:05:09 +0000] \"POST /proactive HTTP/2.0\" 100 9456\n\ +20.217.190.46 - - [30/Jun/2023:21:05:09 +0000] \"GET /mesh/frictionless HTTP/1.1\" 301 3153\n\ +130.43.250.80 - hintz5738 [30/Jun/2023:21:05:09 +0000] \"PATCH /e-markets/supply-chains/mindshare HTTP/2.0\" 401 13039\n\ +222.36.95.121 - pouros3514 [30/Jun/2023:21:05:09 +0000] 
\"DELETE /e-commerce/scale/customized/best-of-breed HTTP/1.0\" 406 8304\n\ +133.117.9.29 - hoeger7673 [30/Jun/2023:21:05:09 +0000] \"PUT /extensible/maximize/visualize/bricks-and-clicks HTTP/1.0\" 403 17067\n\ +65.145.39.136 - heathcote3368 [30/Jun/2023:21:05:09 +0000] \"DELETE /technologies/iterate/viral HTTP/1.1\" 501 29982\n\ +153.132.199.122 - murray8217 [30/Jun/2023:21:05:09 +0000] \"PUT /orchestrate/visionary/visualize HTTP/1.1\" 500 12705\n\ +140.149.178.196 - hickle8613 [30/Jun/2023:21:05:09 +0000] \"PATCH /drive/front-end/infomediaries/maximize HTTP/1.1\" 406 20179\n\ +237.31.189.207 - - [30/Jun/2023:21:05:09 +0000] \"GET /bleeding-edge/recontextualize HTTP/1.1\" 406 24815\n\ +210.217.232.107 - - [30/Jun/2023:21:05:09 +0000] \"POST /redefine/next-generation/relationships/intuitive HTTP/2.0\" 205 14028\n\ +121.2.189.119 - marvin5528 [30/Jun/2023:21:05:09 +0000] \"PUT /sexy/innovative HTTP/2.0\" 204 10689\n\ +120.13.121.164 - jakubowski1027 [30/Jun/2023:21:05:09 +0000] \"PUT /sexy/initiatives/morph/eyeballs HTTP/1.0\" 502 22287\n\ +28.229.107.175 - wilderman8830 [30/Jun/2023:21:05:09 +0000] \"PATCH /visionary/best-of-breed HTTP/1.1\" 503 6010\n\ +210.147.186.50 - - [30/Jun/2023:21:05:09 +0000] \"PUT /paradigms HTTP/2.0\" 501 18054\n\ +185.157.236.127 - - [30/Jun/2023:21:05:09 +0000] \"GET /maximize HTTP/1.0\" 400 13650\n\ +236.90.19.165 - - [30/Jun/2023:21:23:34 +0000] \"GET /next-generation/user-centric/24%2f365 HTTP/1.0\" 400 5212\n\ +233.182.111.100 - torphy3512 [30/Jun/2023:21:23:34 +0000] \"PUT /seamless/incentivize HTTP/1.0\" 304 27750\n\ +80.185.129.193 - - [30/Jun/2023:21:23:34 +0000] \"HEAD /strategic HTTP/1.1\" 502 6146\n\ +182.145.92.52 - - [30/Jun/2023:21:23:34 +0000] \"PUT /dot-com/grow/networks HTTP/1.0\" 301 1763\n\ +46.14.122.16 - - [30/Jun/2023:21:23:34 +0000] \"HEAD /deliverables HTTP/1.0\" 301 7608\n\ +162.111.143.158 - bruen3883 [30/Jun/2023:21:23:34 +0000] \"POST /extensible HTTP/2.0\" 403 22752\n\ +201.13.111.255 - hilpert8768 
[30/Jun/2023:21:23:34 +0000] \"PATCH /applications/engage/frictionless/content HTTP/1.0\" 406 24866\n\ +76.90.243.15 - - [30/Jun/2023:21:23:34 +0000] \"PATCH /24%2f7/seamless/target/enable HTTP/1.1\" 503 8176\n\ +187.79.114.48 - - [30/Jun/2023:21:23:34 +0000] \"GET /synergistic HTTP/1.0\" 503 14251\n\ +59.52.178.62 - kirlin3704 [30/Jun/2023:21:23:34 +0000] \"POST /web-readiness/grow/evolve HTTP/1.0\" 501 13305\n\ +27.46.78.167 - - [30/Jun/2023:21:23:34 +0000] \"PATCH /interfaces/schemas HTTP/2.0\" 100 4860\n\ +191.9.15.43 - goodwin7310 [30/Jun/2023:21:23:34 +0000] \"POST /engage/innovate/web-readiness/roi HTTP/2.0\" 404 4225\n\ +195.153.126.148 - klein8350 [30/Jun/2023:21:23:34 +0000] \"DELETE /killer/synthesize HTTP/1.0\" 204 15134\n\ +162.207.64.184 - mayert4426 [30/Jun/2023:21:23:34 +0000] \"HEAD /intuitive/vertical/incentivize HTTP/1.0\" 204 23666\n\ +185.96.7.205 - - [30/Jun/2023:21:23:34 +0000] \"DELETE /communities/deliver/user-centric HTTP/1.0\" 416 18210\n\ +187.180.105.55 - - [30/Jun/2023:21:23:34 +0000] \"POST /customized HTTP/2.0\" 200 1396\n\ +216.82.243.54 - kunze7200 [30/Jun/2023:21:23:34 +0000] \"PUT /e-tailers/evolve/leverage/engage HTTP/2.0\" 504 1665\n\ +170.128.69.228 - - [30/Jun/2023:21:23:34 +0000] \"DELETE /matrix/open-source/proactive HTTP/1.0\" 301 18326\n\ +253.200.84.66 - steuber5220 [30/Jun/2023:21:23:34 +0000] \"POST /benchmark/experiences HTTP/1.1\" 504 18944\n\ +28.240.40.161 - - [30/Jun/2023:21:23:34 +0000] \"PATCH /initiatives HTTP/1.0\" 500 6500\n\ +134.163.236.75 - - [30/Jun/2023:21:23:34 +0000] \"HEAD /platforms/recontextualize HTTP/1.0\" 203 22188\n\ +241.64.230.66 - - [30/Jun/2023:21:23:34 +0000] \"GET /cutting-edge/methodologies/b2c/cross-media HTTP/1.1\" 403 20698\n\ +210.216.183.157 - okuneva6218 [30/Jun/2023:21:23:34 +0000] \"POST /generate/incentivize HTTP/2.0\" 403 25900\n\ +164.219.134.242 - - [30/Jun/2023:21:23:34 +0000] \"HEAD /efficient/killer/whiteboard HTTP/2.0\" 501 22081\n\ +173.156.54.99 - harvey6165 
[30/Jun/2023:21:23:34 +0000] \"HEAD /dynamic/cutting-edge/sexy/user-centric HTTP/2.0\" 200 2995\n\ +215.242.74.14 - - [30/Jun/2023:21:23:34 +0000] \"PUT /roi HTTP/1.0\" 204 9674\n\ +133.77.49.187 - lockman3141 [30/Jun/2023:21:23:34 +0000] \"PUT /mindshare/transition HTTP/2.0\" 503 2726\n\ +159.77.190.255 - - [30/Jun/2023:21:23:34 +0000] \"DELETE /world-class/bricks-and-clicks HTTP/1.1\" 501 21712\n\ +65.6.237.113 - - [30/Jun/2023:21:23:34 +0000] \"PATCH /e-enable HTTP/2.0\" 405 11865\n\ +194.76.211.16 - champlin6280 [30/Jun/2023:21:23:34 +0000] \"PUT /applications/redefine/eyeballs/mindshare HTTP/1.0\" 302 27679\n\ +96.206.219.202 - - [30/Jun/2023:21:23:34 +0000] \"PUT /solutions/mindshare/vortals/transition HTTP/1.0\" 403 7385\n\ +255.80.116.201 - hintz8162 [30/Jun/2023:21:23:34 +0000] \"POST /frictionless/e-commerce HTTP/1.0\" 302 9235\n\ +89.66.165.183 - smith2655 [30/Jun/2023:21:23:34 +0000] \"HEAD /markets/synergize HTTP/2.0\" 501 28055\n\ +39.210.168.14 - - [30/Jun/2023:21:23:34 +0000] \"GET /integrate/killer/end-to-end/infrastructures HTTP/1.0\" 302 11311\n\ +173.99.112.210 - - [30/Jun/2023:21:23:34 +0000] \"GET /interfaces HTTP/2.0\" 503 1471\n\ +108.4.157.6 - morissette1161 [30/Jun/2023:21:23:34 +0000] \"POST /mesh/convergence HTTP/1.1\" 403 18708\n\ +174.160.107.162 - - [30/Jun/2023:21:23:34 +0000] \"POST /vortals/monetize/utilize/synergistic HTTP/1.1\" 302 13252\n\ +188.8.105.56 - beatty6880 [30/Jun/2023:21:23:34 +0000] \"POST /web+services/innovate/generate/leverage HTTP/1.1\" 301 29856\n\ +115.179.64.255 - - [30/Jun/2023:21:23:34 +0000] \"PATCH /transform/transparent/b2c/holistic HTTP/1.1\" 406 10208\n\ +48.104.215.32 - - [30/Jun/2023:21:23:34 +0000] \"DELETE /drive/clicks-and-mortar HTTP/1.0\" 501 13752\n\ +75.212.115.12 - pfannerstill5140 [30/Jun/2023:21:23:34 +0000] \"PATCH /leading-edge/mesh/methodologies HTTP/1.0\" 503 4946\n\ +52.75.2.117 - osinski2030 [30/Jun/2023:21:23:34 +0000] \"PUT /incentivize/recontextualize HTTP/1.1\" 301 8785\n" + 
+#define LOG_RECORD_WITHOUT_NEW_LINE \ +"82.39.169.93 - streich5722 [30/Jun/2023:21:23:34 +0000] \"GET /action-items/leading-edge/reinvent/maximize HTTP/1.1\" 500 1228" + +#define LOG_RECORDS_WITHOUT_TERMINATING_NEW_LINE \ + LOG_RECORDS_PARTIAL \ + LOG_RECORD_WITHOUT_NEW_LINE + +#define LOG_RECORD_WITH_NEW_LINE \ +"131.128.33.109 - turcotte6735 [30/Jun/2023:21:23:34 +0000] \"PUT /distributed/strategize HTTP/1.1\" 401 16471\n" + +#define LOG_RECORDS_WITH_TERMINATING_NEW_LINE \ + LOG_RECORDS_PARTIAL \ + LOG_RECORD_WITH_NEW_LINE + +static int test_compression_decompression() { + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + Circ_buff_item_t item; + item.text_size = sizeof(LOG_RECORDS_WITH_TERMINATING_NEW_LINE); + fprintf(stderr, "Testing LZ4_compressBound()...\n"); + size_t required_compressed_space = LZ4_compressBound(item.text_size); + if(!required_compressed_space){ + fprintf(stderr, "- Error while using LZ4_compressBound()\n"); + return ++errors; + } + + item.data_max_size = item.text_size + required_compressed_space; + item.data = mallocz(item.data_max_size); + memcpy(item.data, LOG_RECORDS_WITH_TERMINATING_NEW_LINE, sizeof(LOG_RECORDS_WITH_TERMINATING_NEW_LINE)); + + fprintf(stderr, "Testing LZ4_compress_fast()...\n"); + item.text_compressed = item.data + item.text_size; + + item.text_compressed_size = LZ4_compress_fast( item.data, item.text_compressed, + item.text_size, required_compressed_space, 1); + if(!item.text_compressed_size){ + fprintf(stderr, "- Error while using LZ4_compress_fast()\n"); + return ++errors; + } + + char *decompressed_text = mallocz(item.text_size); + + if(LZ4_decompress_safe( item.text_compressed, + decompressed_text, + item.text_compressed_size, + item.text_size) < 0){ + fprintf(stderr, "- Error in decompress_text()\n"); + return ++errors; + } + + if(memcmp(item.data, decompressed_text, item.text_size)){ + fprintf(stderr, "- Error, original and decompressed data not the same\n"); + ++errors; + } + + fprintf(stderr, 
"%s\n", errors ? "FAIL" : "OK"); + return errors; +} + +static int test_read_last_line() { + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + #if defined(_WIN32) || defined(_WIN64) + char tmpname[MAX_PATH] = "/tmp/tmp.XXXXXX"; + #else + char tmpname[] = "/tmp/tmp.XXXXXX"; + #endif + + int fd = mkstemp(tmpname); + if (fd == -1){ + fprintf(stderr, "mkstemp() Failed with error %s\n", strerror(errno)); + exit(EXIT_FAILURE); + } + + FILE *tmpfp = fdopen(fd, "r+"); + if (tmpfp == NULL) { + close(fd); + unlink(tmpname); + exit(EXIT_FAILURE); + } + + if(fprintf(tmpfp, "%s", LOG_RECORDS_WITHOUT_TERMINATING_NEW_LINE) <= 0){ + close(fd); + unlink(tmpname); + exit(EXIT_FAILURE); + } + fflush(tmpfp); + + fprintf(stderr, "Testing read of LOG_RECORD_WITHOUT_NEW_LINE...\n"); + errors += strcmp(LOG_RECORD_WITHOUT_NEW_LINE, read_last_line(tmpname, 0)) ? 1 : 0; + + if(fprintf(tmpfp, "\n%s", LOG_RECORD_WITH_NEW_LINE) <= 0){ + close(fd); + unlink(tmpname); + exit(EXIT_FAILURE); + } + fflush(tmpfp); + + fprintf(stderr, "Testing read of LOG_RECORD_WITH_NEW_LINE...\n"); + errors += strcmp(LOG_RECORD_WITH_NEW_LINE, read_last_line(tmpname, 0)) ? 1 : 0; + + unlink(tmpname); + close(fd); + + fprintf(stderr, "%s\n", errors ? 
"FAIL" : "OK"); + return errors; +} + +const char * const parse_configs_to_test[] = { + /* [1] Apache csvCombined 1 */ + "127.0.0.1 - - [15/Oct/2020:04:43:51 -0700] \"GET / HTTP/1.0\" 200 11228 \"-\" \"ApacheBench/2.3\"", + + /* [2] Apache csvCombined 2 - extra white space */ + "::1 - - [01/Sep/2022:19:04:42 +0100] \"GET / HTTP/1.1\" 200 3477 \"-\" \"Mozilla/5.0 (Windows NT 10.0; \ +Win64; x64; rv:103.0) Gecko/20100101 Firefox/103.0\"", + + /* [3] Apache csvCombined 3 - with new line */ + "209.202.252.202 - rosenbaum7551 [20/Jun/2023:14:42:27 +0000] \"PUT /harness/networks/initiatives/engineer HTTP/2.0\"\ + 403 42410 \"https://www.senioriterate.name/streamline/exploit\" \"Opera/10.54 (Macintosh; Intel Mac OS X 10_7_6;\ + en-US) Presto/2.12.334 Version/10.00\"\n", + + /* [4] Apache csvCombined 4 - invalid request field */ + "::1 - - [13/Jul/2023:21:00:56 +0100] \"-\" 408 - \"-\" \"-\"", + + /* [5] Apache csvVhostCombined */ + "XPS-wsl.localdomain:80 ::1 - - [30/Jun/2022:20:59:29 +0300] \"GET / HTTP/1.1\" 200 3477 \"-\" \"Mozilla\ +/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.5060.53 Safari/537.36\ + Edg/103.0.1264.37\"", + + /* [6] Apache csvCommon 1 */ + "127.0.0.1 - - [30/Jun/2022:16:43:51 +0300] \"GET / HTTP/1.0\" 200 11228", + + /* [7] Apache csvCommon 2 - with carriage return */ + "180.89.137.89 - barrows1527 [05/Jun/2023:17:46:08 +0000]\ + \"DELETE /b2c/viral/innovative/reintermediate HTTP/1.0\" 416 99\r", + + /* [8] Apache csvCommon 3 - with new line */ + "212.113.230.101 - - [20/Jun/2023:14:29:49 +0000] \"PATCH /strategic HTTP/1.1\" 404 1217\n", + + /* [9] Apache csvVhostCommon 1 */ + "XPS-wsl.localdomain:80 127.0.0.1 - - [30/Jun/2022:16:43:51 +0300] \"GET / HTTP/1.0\" 200 11228", + + /* [10] Apache csvVhostCommon 2 - with new line and extra white space */ + "XPS-wsl.localdomain:80 2001:0db8:85a3:0000:0000:8a2e:0370:7334 - - [30/Jun/2022:16:43:51 +0300] \"GET /\ + HTTP/1.0\" 200 11228\n", + + /* [11] Nginx 
csvCombined */ + "47.29.201.179 - - [28/Feb/2019:13:17:10 +0000] \"GET /?p=1 HTTP/2.0\" 200 5316 \"https://dot.com/?p=1\"\ + \"Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36\"", +}; +const web_log_line_field_t parse_config_expected[][15] = { + /* [1] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM , -1, -1, -1, -1, -1}, /* Apache csvCombined 1 */ + /* [2] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM , -1, -1, -1, -1, -1}, /* Apache csvCombined 2 */ + /* [3] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM , -1, -1, -1, -1, -1}, /* Apache csvCombined 3 */ + /* [4] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM , -1, -1, -1, -1, -1}, /* Apache csvCombined 4 */ + /* [5] */ {VHOST_WITH_PORT, REQ_CLIENT, CUSTOM, CUSTOM, TIME, TIME, REQ , RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM, -1, -1, -1, -1}, /* Apache csvVhostCombined */ + /* [6] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, -1 , -1 , -1, -1, -1, -1, -1}, /* Apache csvCommon 1 */ + /* [7] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, -1 , -1 , -1, -1, -1, -1, -1}, /* Apache csvCommon 2 */ + /* [8] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ , RESP_CODE, RESP_SIZE, -1 , -1 , -1, -1, -1, -1, -1}, /* Apache csvCommon 3 */ + /* [9] */ {VHOST_WITH_PORT, REQ_CLIENT, CUSTOM, CUSTOM, TIME, TIME, REQ , RESP_CODE, RESP_SIZE, -1 , -1, -1, -1, -1, -1}, /* Apache csvVhostCommon 1 */ + /* [10] */ {VHOST_WITH_PORT, REQ_CLIENT, CUSTOM, CUSTOM, TIME, TIME, REQ , RESP_CODE, RESP_SIZE, -1 , -1, -1, -1, -1, -1}, /* Apache csvVhostCommon 2 */ + /* [11] */ {REQ_CLIENT , CUSTOM , CUSTOM, TIME , TIME, REQ, RESP_CODE, RESP_SIZE, CUSTOM , CUSTOM , -1, -1, -1, -1, -1}, /* Nginx csvCombined */ +}; +static const char parse_config_delim = ' '; +static int 
*parse_config_expected_num_fields = NULL; + +static void setup_parse_config_expected_num_fields() { + fprintf(stderr, "%s():\n", __FUNCTION__); + + for(int i = 0; i < (int) (sizeof(parse_configs_to_test) / sizeof(parse_configs_to_test[0])); i++){ + parse_config_expected_num_fields = reallocz(parse_config_expected_num_fields, (i + 1) * sizeof(int)); + parse_config_expected_num_fields[i] = 0; + for(int j = 0; (int) parse_config_expected[i][j] != -1; j++){ + parse_config_expected_num_fields[i]++; + } + } + + fprintf(stderr, "OK\n"); +} + +static int test_count_fields() { + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + for(int i = 0; i < (int) (sizeof(parse_configs_to_test) / sizeof(parse_configs_to_test[0])); i++){ + if(count_fields(parse_configs_to_test[i], parse_config_delim) != parse_config_expected_num_fields[i]){ + fprintf(stderr, "- Error (count_fields() result incorrect) for:\n%s", parse_configs_to_test[i]); + ++errors; + } + } + + fprintf(stderr, "%s\n", errors ? "FAIL" : "OK"); + return errors; +} + +static int test_auto_detect_web_log_parser_config() { + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + for(int i = 0; i < (int) (sizeof(parse_configs_to_test) / sizeof(parse_configs_to_test[0])); i++){ + size_t line_sz = strlen(parse_configs_to_test[i]) + 1; + char *line = strdupz(parse_configs_to_test[i]); + if(line[line_sz - 2] != '\n' && line[line_sz - 2] != '\r'){ + line = reallocz(line, ++line_sz); // +1 to add '\n' char + line[line_sz - 1] = '\0'; + line[line_sz - 2] = '\n'; + } + Web_log_parser_config_t *wblp_conf = auto_detect_web_log_parser_config(line, parse_config_delim); + if(!wblp_conf){ + fprintf(stderr, "- Error (NULL wblp_conf) for:\n%s", line); + ++errors; + } else if(wblp_conf->num_fields != parse_config_expected_num_fields[i]){ + fprintf(stderr, "- Error (number of fields mismatch) for:\n%s", line); + fprintf(stderr, "Expected %d fields but auto-detected %d\n", parse_config_expected_num_fields[i], 
wblp_conf->num_fields); + ++errors; + } else { + for(int j = 0; (int) parse_config_expected[i][j] != -1; j++){ + if(wblp_conf->fields[j] != parse_config_expected[i][j]){ + fprintf(stderr, "- Error (field type mismatch) for:\n%s", line); + ++errors; + break; + } + } + } + + freez(line); + if(wblp_conf) freez(wblp_conf->fields); + freez(wblp_conf); + } + + fprintf(stderr, "%s\n", errors ? "FAIL" : "OK"); + return errors; +} + +Log_line_parsed_t log_line_parsed_expected[] = { + /* -------------------------------------- + char vhost[VHOST_MAX_LEN]; + int port; + char req_scheme[REQ_SCHEME_MAX_LEN]; + char req_client[REQ_CLIENT_MAX_LEN]; + char req_method[REQ_METHOD_MAX_LEN]; + char req_URL[REQ_URL_MAX_LEN]; + char req_proto[REQ_PROTO_MAX_LEN]; + int req_size; + int req_proc_time; + int resp_code; + int resp_size; + int ups_resp_time; + char ssl_proto[SSL_PROTO_MAX_LEN]; + char ssl_cipher[SSL_CIPHER_SUITE_MAX_LEN]; + int64_t timestamp; + int parsing_errors; + ------------------------------------------ */ + /* [1] */ {"", 0, "", "127.0.0.1", "GET", "/", "1.0", 0, 0, 200, 11228, 0, "", "", 1602762231, 0}, + /* [2] */ {"", 0, "", "::1", "GET", "/", "1.1", 0, 0, 200, 3477 , 0, "", "", 1662055482, 0}, + /* [3] */ {"", 0, "", "209.202.252.202", "PUT", "/harness/networks/initiatives/engineer", "2.0", 0, 0, 403, 42410, 0, "", "", 1687272147, 0}, + /* [4] */ {"", 0, "", "::1", "-", "", "", 0, 0, 408, 0, 0, "", "", 1689278456, 0}, + /* [5] */ {"XPS-wsl.localdomain", 80, "", "::1", "GET", "/", "1.1", 0, 0, 200, 3477 , 0, "", "", 1656611969, 0}, + /* [6] */ {"", 0, "", "127.0.0.1", "GET", "/", "1.0", 0, 0, 200, 11228, 0, "", "", 1656596631, 0}, + /* [7] */ {"", 0, "", "180.89.137.89", "DELETE", "/b2c/viral/innovative/reintermediate", "1.0", 0, 0, 416, 99 , 0, "", "", 1685987168, 0}, + /* [8] */ {"", 0, "", "212.113.230.101", "PATCH", "/strategic", "1.1", 0, 0, 404, 1217 , 0, "", "", 1687271389, 0}, + /* [9] */ {"XPS-wsl.localdomain", 80, "", "127.0.0.1", "GET", "/", "1.0", 0, 0, 
200, 11228, 0, "", "", 1656596631, 0}, + /* [10] */ {"XPS-wsl.localdomain", 80, "", "2001:0db8:85a3:0000:0000:8a2e:0370:7334", "GET", "/", "1.0", 0, 0, 200, 11228, 0, "", "", 1656596631, 0}, + /* [11] */ {"", 0, "", "47.29.201.179", "GET", "/?p=1", "2.0", 0, 0, 200, 5316 , 0, "", "", 1551359830, 0} +}; +static int test_parse_web_log_line(){ + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + Web_log_parser_config_t *wblp_conf = callocz(1, sizeof(Web_log_parser_config_t)); + + wblp_conf->delimiter = parse_config_delim; + wblp_conf->verify_parsed_logs = 1; + + for(int i = 0; i < (int) (sizeof(parse_configs_to_test) / sizeof(parse_configs_to_test[0])); i++){ + wblp_conf->num_fields = parse_config_expected_num_fields[i]; + wblp_conf->fields = (web_log_line_field_t *) parse_config_expected[i]; + + Log_line_parsed_t log_line_parsed = (Log_line_parsed_t) {0}; + parse_web_log_line( wblp_conf, + (char *) parse_configs_to_test[i], + strlen(parse_configs_to_test[i]), + &log_line_parsed); + + if(strcmp(log_line_parsed_expected[i].vhost, log_line_parsed.vhost)) + fprintf(stderr, "- Error (parsed vhost:%s != expected vhost:%s) for:\n%s", + log_line_parsed.vhost, log_line_parsed_expected[i].vhost, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].port != log_line_parsed.port) + fprintf(stderr, "- Error (parsed port:%d != expected port:%d) for:\n%s", + log_line_parsed.port, log_line_parsed_expected[i].port, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].req_scheme, log_line_parsed.req_scheme)) + fprintf(stderr, "- Error (parsed req_scheme:%s != expected req_scheme:%s) for:\n%s", + log_line_parsed.req_scheme, log_line_parsed_expected[i].req_scheme, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].req_client, log_line_parsed.req_client)) + fprintf(stderr, "- Error (parsed req_client:%s != expected req_client:%s) for:\n%s", + log_line_parsed.req_client, log_line_parsed_expected[i].req_client, 
parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].req_method, log_line_parsed.req_method)) + fprintf(stderr, "- Error (parsed req_method:%s != expected req_method:%s) for:\n%s", + log_line_parsed.req_method, log_line_parsed_expected[i].req_method, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].req_URL, log_line_parsed.req_URL)) + fprintf(stderr, "- Error (parsed req_URL:%s != expected req_URL:%s) for:\n%s", + log_line_parsed.req_URL, log_line_parsed_expected[i].req_URL, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].req_proto, log_line_parsed.req_proto)) + fprintf(stderr, "- Error (parsed req_proto:%s != expected req_proto:%s) for:\n%s", + log_line_parsed.req_proto, log_line_parsed_expected[i].req_proto, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].req_size != log_line_parsed.req_size) + fprintf(stderr, "- Error (parsed req_size:%d != expected req_size:%d) for:\n%s", + log_line_parsed.req_size, log_line_parsed_expected[i].req_size, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].req_proc_time != log_line_parsed.req_proc_time) + fprintf(stderr, "- Error (parsed req_proc_time:%d != expected req_proc_time:%d) for:\n%s", + log_line_parsed.req_proc_time, log_line_parsed_expected[i].req_proc_time, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].resp_code != log_line_parsed.resp_code) + fprintf(stderr, "- Error (parsed resp_code:%d != expected resp_code:%d) for:\n%s", + log_line_parsed.resp_code, log_line_parsed_expected[i].resp_code, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].resp_size != log_line_parsed.resp_size) + fprintf(stderr, "- Error (parsed resp_size:%d != expected resp_size:%d) for:\n%s", + log_line_parsed.resp_size, log_line_parsed_expected[i].resp_size, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].ups_resp_time != log_line_parsed.ups_resp_time) + 
fprintf(stderr, "- Error (parsed ups_resp_time:%d != expected ups_resp_time:%d) for:\n%s", + log_line_parsed.ups_resp_time, log_line_parsed_expected[i].ups_resp_time, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].ssl_proto, log_line_parsed.ssl_proto)) + fprintf(stderr, "- Error (parsed ssl_proto:%s != expected ssl_proto:%s) for:\n%s", + log_line_parsed.ssl_proto, log_line_parsed_expected[i].ssl_proto, parse_configs_to_test[i]), ++errors; + if(strcmp(log_line_parsed_expected[i].ssl_cipher, log_line_parsed.ssl_cipher)) + fprintf(stderr, "- Error (parsed ssl_cipher:%s != expected ssl_cipher:%s) for:\n%s", + log_line_parsed.ssl_cipher, log_line_parsed_expected[i].ssl_cipher, parse_configs_to_test[i]), ++errors; + if(log_line_parsed_expected[i].timestamp != log_line_parsed.timestamp) + fprintf(stderr, "- Error (parsed timestamp:%" PRId64 " != expected timestamp:%" PRId64 ") for:\n%s", + log_line_parsed.timestamp, log_line_parsed_expected[i].timestamp, parse_configs_to_test[i]), ++errors; + } + + freez(wblp_conf); + + fprintf(stderr, "%s\n", errors ? "FAIL" : "OK"); + return errors ; +} + +const char * const unsanitised_strings[] = { "[test]", "^test$", "{test}", + "(test)", "\\test\\", "test*+.?|", "test&£@"}; +const char * const expected_sanitised_strings[] = { "\\[test\\]", "\\^test\\$", "\\{test\\}", + "\\(test\\)", "\\\\test\\\\", "test\\*\\+\\.\\?\\|", "test&£@"}; +static int test_sanitise_string(){ + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + for(int i = 0; i < (int) (sizeof(unsanitised_strings) / sizeof(unsanitised_strings[0])); i++){ + char *sanitised = sanitise_string((char *) unsanitised_strings[i]); + if(strcmp(expected_sanitised_strings[i], sanitised)){ + fprintf(stderr, "- Error during sanitise_string() for:%s\n", unsanitised_strings[i]); + ++errors; + }; + freez(sanitised); + } + + fprintf(stderr, "%s\n", errors ? 
"FAIL" : "OK"); + return errors; +} + +char * const regex_src[] = { +"2022-11-07T11:28:27.427519600Z container create e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.932624500Z container start e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.971060500Z container die e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (exitCode=0, image=hello-world, name=xenodochial_lumiere)", + +"2022-11-07T11:28:27.427519600Z container create e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.932624500Z container start e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.971060500Z container die e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (exitCode=0, image=hello-world, name=xenodochial_lumiere)", + +"2022-11-07T11:28:27.427519600Z container create e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.932624500Z container start e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.971060500Z container die e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (exitCode=0, image=hello-world, name=xenodochial_lumiere)", + +"2022-11-07T20:06:36.919980700Z container create bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:36.927728700Z container attach bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:36.958906200Z network connect 
178a1988c4173559c721d5e24970eef32aaca41e0e363ff9792c731f917683ed (container=bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234, name=bridge, type=bridge)\n\ +2022-11-07T20:06:37.564947300Z container start bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:37.596428500Z container die bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (exitCode=0, image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:38.134325100Z network disconnect 178a1988c4173559c721d5e24970eef32aaca41e0e363ff9792c731f917683ed (container=bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234, name=bridge, type=bridge)", + +"Nov 7 21:54:24 X-PC sudo: john : TTY=pts/7 ; PWD=/home/john ; USER=root ; COMMAND=/usr/bin/docker run hello-world\n\ +Nov 7 21:54:24 X-PC sudo: pam_unix(sudo:session): session opened for user root by john(uid=0)\n\ +Nov 7 21:54:25 X-PC sudo: pam_unix(sudo:session): session closed for user root\n\ +Nov 7 21:54:24 X-PC sudo: john : TTY=pts/7 ; PWD=/home/john ; USER=root ; COMMAND=/usr/bin/docker run hello-world\n" +}; +const char * const regex_keyword[] = { + "start", + "CONTAINER", + "CONTAINER", + NULL, + NULL +}; +const char * const regex_pat_str[] = { + NULL, + NULL, + NULL, + ".*\\bcontainer\\b.*\\bhello-world\\b.*", + ".*\\bsudo\\b.*\\bCOMMAND=/usr/bin/docker run\\b.*" + +}; +const int regex_ignore_case[] = { + 1, + 1, + 0, + 1, + 1 +}; +const int regex_exp_matches[] = { + 1, + 3, + 0, + 4, + 2 +}; +const char * const regex_exp_dst[] = { +"2022-11-07T11:28:27.932624500Z container start e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n", + +"2022-11-07T11:28:27.427519600Z container create e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.932624500Z container start 
e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (image=hello-world, name=xenodochial_lumiere)\n\ +2022-11-07T11:28:27.971060500Z container die e0c3c6120c29beb393e4b92773c9aa60006747bddabd352b77bf0b4ad23747a7 (exitCode=0, image=hello-world, name=xenodochial_lumiere)", + +"", + +"2022-11-07T20:06:36.919980700Z container create bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:36.927728700Z container attach bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:37.564947300Z container start bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (image=hello-world, name=distracted_sinoussi)\n\ +2022-11-07T20:06:37.596428500Z container die bd8d4a3338c3e9ab4ca555c6d869dc980f04f10ebdcd9284321c0afecbec1234 (exitCode=0, image=hello-world, name=distracted_sinoussi)", + +"Nov 7 21:54:24 X-PC sudo: john : TTY=pts/7 ; PWD=/home/john ; USER=root ; COMMAND=/usr/bin/docker run hello-world\n\ +Nov 7 21:54:24 X-PC sudo: john : TTY=pts/7 ; PWD=/home/john ; USER=root ; COMMAND=/usr/bin/docker run hello-world\n" +}; +static int test_search_keyword(){ + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + for(int i = 0; i < (int) (sizeof(regex_src) / sizeof(regex_src[0])); i++){ + regex_t *regex_c = regex_pat_str[i] ? mallocz(sizeof(regex_t)) : NULL; + if(regex_c && regcomp( regex_c, regex_pat_str[i], + regex_ignore_case[i] ? 
REG_EXTENDED | REG_NEWLINE | REG_ICASE : REG_EXTENDED | REG_NEWLINE)) + fatal("Could not compile regular expression:%s", regex_pat_str[i]); + + size_t regex_src_sz = strlen(regex_src[i]) + 1; + char *res = callocz(1 , regex_src_sz); + size_t res_sz; + int matches = search_keyword( regex_src[i], regex_src_sz, + res, &res_sz, + regex_keyword[i], regex_c, + regex_ignore_case[i]); + // fprintf(stderr, "\nMatches:%d\nResults:\n%.*s\n", matches, (int) res_sz, res); + if(regex_exp_matches[i] != matches){ + fprintf(stderr, "- Error in matches returned from search_keyword() for: regex_src[%d]\n", i); + ++errors; + }; + if(strncmp(regex_exp_dst[i], res, res_sz - 1)){ + fprintf(stderr, "- Error in strncmp() of results from search_keyword() for: regex_src[%d]\n", i); + ++errors; + } + + if(regex_c) freez(regex_c); + freez(res); + } + + fprintf(stderr, "%s\n", errors ? "FAIL" : "OK"); + return errors; +} + +static Flb_socket_config_t *p_forward_in_config = NULL; + +static flb_srvc_config_t flb_srvc_config = { + .flush = FLB_FLUSH_DEFAULT, + .http_listen = FLB_HTTP_LISTEN_DEFAULT, + .http_port = FLB_HTTP_PORT_DEFAULT, + .http_server = FLB_HTTP_SERVER_DEFAULT, + .log_path = "NULL", + .log_level = FLB_LOG_LEVEL_DEFAULT, + .coro_stack_size = FLB_CORO_STACK_SIZE_DEFAULT +}; + +static flb_srvc_config_t *p_flb_srvc_config = NULL; + +static int test_logsmanag_config_funcs(){ + int errors = 0, rc; + fprintf(stderr, "%s():\n", __FUNCTION__); + + fprintf(stderr, "Testing get_X_dir() functions...\n"); + if(NULL == get_user_config_dir()){ + fprintf(stderr, "- Error, get_user_config_dir() returns NULL.\n"); + ++errors; + } + + if(NULL == get_stock_config_dir()){ + fprintf(stderr, "- Error, get_stock_config_dir() returns NULL.\n"); + ++errors; + } + + if(NULL == get_log_dir()){ + fprintf(stderr, "- Error, get_log_dir() returns NULL.\n"); + ++errors; + } + + if(NULL == get_cache_dir()){ + fprintf(stderr, "- Error, get_cache_dir() returns NULL.\n"); + ++errors; + } + + fprintf(stderr, "Testing 
logs_manag_config_load() when p_flb_srvc_config is NULL...\n"); + + SUPRESS_STDERR(); + rc = logs_manag_config_load(p_flb_srvc_config, &p_forward_in_config, 1); + UNSUPRESS_STDERR(); + + if(LOGS_MANAG_CONFIG_LOAD_ERROR_P_FLB_SRVC_NULL != rc){ + fprintf(stderr, "- Error, logs_manag_config_load() returns %d.\n", rc); + ++errors; + } + + p_flb_srvc_config = &flb_srvc_config; + + fprintf(stderr, "Testing logs_manag_config_load() can load stock config...\n"); + + SUPRESS_STDERR(); + rc = logs_manag_config_load(&flb_srvc_config, &p_forward_in_config, 1); + UNSUPRESS_STDERR(); + + if( LOGS_MANAG_CONFIG_LOAD_ERROR_OK != rc){ + fprintf(stderr, "- Error, logs_manag_config_load() returns %d.\n", rc); + ++errors; + } + + fprintf(stderr, "%s\n", errors ? "FAIL" : "OK"); + return errors; +} + +uv_loop_t *main_loop; + +static void setup_p_file_infos_arr_and_main_loop() { + fprintf(stderr, "%s():\n", __FUNCTION__); + + p_file_infos_arr = callocz(1, sizeof(struct File_infos_arr)); + main_loop = mallocz(sizeof(uv_loop_t)); + if(uv_loop_init(main_loop)) + exit(EXIT_FAILURE); + + fprintf(stderr, "OK\n"); +} + +static int test_flb_init(){ + int errors = 0, rc; + fprintf(stderr, "%s():\n", __FUNCTION__); + + fprintf(stderr, "Testing flb_init() with wrong stock_config_dir...\n"); + + SUPRESS_STDERR(); + rc = flb_init(flb_srvc_config, "/tmp", "example_prefix_"); + UNSUPRESS_STDERR(); + if(!rc){ + fprintf(stderr, "- Error, flb_init() should fail but it returns %d.\n", rc); + ++errors; + } + + fprintf(stderr, "Testing flb_init() with correct stock_config_dir...\n"); + + rc = flb_init(flb_srvc_config, get_stock_config_dir(), "example_prefix_"); + if(rc){ + fprintf(stderr, "- Error, flb_init() should succeed but it returns %d.\n", rc); + ++errors; + } + + fprintf(stderr, "%s\n", errors ? 
"FAIL" : "OK"); + return errors; +} + +static int unlink_cb(const char *fpath, const struct stat *sb, int typeflag, struct FTW *ftwbuf){ + UNUSED(sb); + UNUSED(typeflag); + UNUSED(ftwbuf); + + return remove(fpath); +} + +static int test_db_init(){ + int errors = 0; + fprintf(stderr, "%s():\n", __FUNCTION__); + + extern netdata_mutex_t stdout_mut; + + SUPRESS_STDOUT(); + SUPRESS_STDERR(); + config_file_load(main_loop, p_forward_in_config, &flb_srvc_config, &stdout_mut); + UNSUPRESS_STDOUT(); + UNSUPRESS_STDERR(); + + fprintf(stderr, "Testing db_init() with main_db_dir == NULL...\n"); + + SUPRESS_STDERR(); + db_set_main_dir(NULL); + int rc = db_init(); + UNSUPRESS_STDERR(); + + if(!rc){ + fprintf(stderr, "- Error, db_init() returns %d even though db_set_main_dir(NULL); was called.\n", rc); + ++errors; + } + + char tmpdir[] = "/tmp/tmpdir.XXXXXX"; + char *main_db_dir = mkdtemp (tmpdir); + fprintf(stderr, "Testing db_init() with main_db_dir == %s...\n", main_db_dir); + + SUPRESS_STDERR(); + db_set_main_dir(main_db_dir); + rc = db_init(); + UNSUPRESS_STDERR(); + + if(rc){ + fprintf(stderr, "- Error, db_init() returns %d.\n", rc); + ++errors; + } + + fprintf(stderr, "Cleaning up %s...\n", main_db_dir); + + if(nftw(main_db_dir, unlink_cb, 64, FTW_DEPTH | FTW_PHYS) == -1){ + fprintf(stderr, "Error while remove path:%s. Will exit...\n", strerror(errno)); + exit(EXIT_FAILURE); + } + + fprintf(stderr, "%s\n", errors ? 
"FAIL" : "OK"); + return errors; +} + +int logs_management_unittest(void){ + int errors = 0; + + fprintf(stderr, "\n\n======================================================\n"); + fprintf(stderr, " ** Starting logs management tests **\n"); + fprintf(stderr, "======================================================\n"); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_compression_decompression(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_read_last_line(); + fprintf(stderr, "------------------------------------------------------\n"); + setup_parse_config_expected_num_fields(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_count_fields(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_auto_detect_web_log_parser_config(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_parse_web_log_line(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_sanitise_string(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_search_keyword(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_logsmanag_config_funcs(); + fprintf(stderr, "------------------------------------------------------\n"); + setup_p_file_infos_arr_and_main_loop(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_flb_init(); + fprintf(stderr, "------------------------------------------------------\n"); + errors += test_db_init(); + fprintf(stderr, "------------------------------------------------------\n"); + fprintf(stderr, "[%s] Total errors: %d\n", errors ? 
"FAILED" : "SUCCEEDED", errors); + fprintf(stderr, "======================================================\n"); + fprintf(stderr, " ** Finished logs management tests **\n"); + fprintf(stderr, "======================================================\n"); + fflush(stderr); + + return errors; +} diff --git a/logsmanagement/unit_test/unit_test.h b/logsmanagement/unit_test/unit_test.h new file mode 100644 index 00000000..364f2ea5 --- /dev/null +++ b/logsmanagement/unit_test/unit_test.h @@ -0,0 +1,12 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +/** @file unit_test.h + * @brief This is the header for unit_test.c + */ + +#ifndef LOGS_MANAGEMENT_UNIT_TEST_H_ +#define LOGS_MANAGEMENT_UNIT_TEST_H_ + +int logs_management_unittest(void); + +#endif // LOGS_MANAGEMENT_UNIT_TEST_H_ |