author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-03-09 13:19:48 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-03-09 13:20:02 +0000 |
commit | 58daab21cd043e1dc37024a7f99b396788372918 (patch) | |
tree | 96771e43bb69f7c1c2b0b4f7374cb74d7866d0cb /collectors/systemd-journal.plugin | |
parent | Releasing debian version 1.43.2-1. (diff) | |
Merging upstream version 1.44.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'collectors/systemd-journal.plugin')
15 files changed, 6002 insertions, 1709 deletions
diff --git a/collectors/systemd-journal.plugin/Makefile.am b/collectors/systemd-journal.plugin/Makefile.am index fd8f4ab21..48f667c1b 100644 --- a/collectors/systemd-journal.plugin/Makefile.am +++ b/collectors/systemd-journal.plugin/Makefile.am @@ -5,6 +5,11 @@ MAINTAINERCLEANFILES = $(srcdir)/Makefile.in dist_noinst_DATA = \ README.md \ + systemd-journal-self-signed-certs.sh \ + forward_secure_sealing.md \ + active_journal_centralization_guide_no_encryption.md \ + passive_journal_centralization_guide_no_encryption.md \ + passive_journal_centralization_guide_self_signed_certs.md \ $(NULL) dist_libconfig_DATA = \ diff --git a/collectors/systemd-journal.plugin/README.md b/collectors/systemd-journal.plugin/README.md index 51aa1b7cd..c3c639045 100644 --- a/collectors/systemd-journal.plugin/README.md +++ b/collectors/systemd-journal.plugin/README.md @@ -40,31 +40,34 @@ For more information check [this discussion](https://github.com/netdata/netdata/ The following are limitations related to the availability of the plugin: -- This plugin is not available when Netdata is installed in a container. The problem is that `libsystemd` is not - available in Alpine Linux (there is a `libsystemd`, but it is a dummy that returns failure on all calls). We plan to - change this, by shipping Netdata containers based on Debian. +- Netdata versions prior to 1.44 shipped in a docker container do not include this plugin. + The problem is that `libsystemd` is not available in Alpine Linux (there is a `libsystemd`, but it is a dummy that + returns failure on all calls). Starting with Netdata version 1.44, Netdata containers use a Debian base image + making this plugin available when Netdata is running in a container. - For the same reason (lack of `systemd` support for Alpine Linux), the plugin is not available on `static` builds of - Netdata (which are based on `muslc`, not `glibc`). + Netdata (which are based on `muslc`, not `glibc`). 
If your Netdata is installed in `/opt/netdata` you most likely have + a static build of Netdata. - On old systemd systems (like Centos 7), the plugin runs always in "full data query" mode, which makes it slower. The reason, is that systemd API is missing some important calls we need to use the field indexes of `systemd` journal. However, when running in this mode, the plugin offers also negative matches on the data (like filtering for all logs that do not have set some field), and this is the reason "full data query" mode is also offered as an option even on newer versions of `systemd`. -To use the plugin, install one of our native distribution packages, or install it from source. - #### `systemd` journal features The following are limitations related to the features of `systemd` journal: -- This plugin does not support binary field values. `systemd` journal has the ability to assign fields with binary data. - This plugin assumes all fields contain text values (text in this context includes numbers). +- This plugin assumes that binary field values are text fields with newlines in them. `systemd-journal` has the ability + to support binary fields, without specifying the nature of the binary data. However, binary fields are commonly used + to store log entries that include multiple lines of text. The plugin treats all binary fields are multi-line text. - This plugin does not support multiple values per field for any given log entry. `systemd` journal has the ability to accept the same field key, multiple times, with multiple values on a single log entry. This plugin will present the last value and ignore the others for this log entry. -- This plugin will only read journal files located in `/var/log/journal` or `/run/log/journal`. `systemd-remote` has the +- This plugin will only read journal files located in `/var/log/journal` or `/run/log/journal`. `systemd-journal-remote` has the ability to store journal files anywhere (user configured). 
If journal files are not located in `/var/log/journal` - or `/run/log/journal` (and any of their subdirectories), the plugin will not find them. + or `/run/log/journal` (and any of their subdirectories), the plugin will not find them. A simple solution is to link + the other directories somewhere inside `/var/log/journal`. The plugin will pick them up, even if a sub-directory of + `/var/log/journal` is a link to a directory outside `/var/log/journal`. Other than the above, this plugin supports all features of `systemd` journals. @@ -125,8 +128,8 @@ Usually `remote` journals are named by the IP of the server sending these logs. extracts these IPs and performs a reverse DNS lookup to find their hostnames. When this is successful, `remote` journals are named by the hostnames of the origin servers. -For information about configuring a journals' centralization server, -check [this FAQ item](#how-do-i-configure-a-journals-centralization-server). +For information about configuring a journal centralization server, +check [this FAQ item](#how-do-i-configure-a-journal-centralization-server). ## Journal Fields @@ -153,6 +156,7 @@ The plugin automatically enriches certain fields to make them more user-friendly - `_GID`, `OBJECT_GID`: the local group database is consulted to annotate them with group names. - `_CAP_EFFECTIVE`: the encoded value is annotated with a human-readable list of the linux capabilities. - `_SOURCE_REALTIME_TIMESTAMP`: the numeric value is annotated with human-readable datetime in UTC. +- `MESSAGE_ID`: for the known `MESSAGE_ID`s, the value is replaced with the well known name of the event. The values of all other fields are presented as found in the journals. @@ -237,6 +241,11 @@ Full text search is combined with the selected filters. The text box accepts asterisks `*` as wildcards. So, `a*b*c` means match anything that contains `a`, then `b` and then `c` with anything between them. +Spaces are treated as OR expressions. 
So that `a*b c*d` means `a*b OR c*d`. + +Negative expressions are supported, by prefixing any string with `!`. Example: `!systemd *` means match anything that +does not contain `systemd` on any of its fields. + ## Query performance Journal files are designed to be accessed by multiple readers and one writer, concurrently. @@ -278,9 +287,9 @@ multiple journal files, over long time-frames. During the development of this plugin, we submitted, to `systemd`, a number of patches to improve `journalctl` performance by a factor of 14: -- https://github.com/systemd/systemd/pull/29365 -- https://github.com/systemd/systemd/pull/29366 -- https://github.com/systemd/systemd/pull/29261 +- <https://github.com/systemd/systemd/pull/29365> +- <https://github.com/systemd/systemd/pull/29366> +- <https://github.com/systemd/systemd/pull/29261> However, even after these patches are merged, `journalctl` will still be 2x slower than this Netdata plugin, on multi-journal queries. @@ -290,13 +299,85 @@ the Netdata plugin queries each file individually and it then it merges the resu This is transparent, thanks to the `facets` library in `libnetdata` that handles on-the-fly indexing, filtering, and searching of any dataset, independently of its source. +## Performance at scale + +On busy logs servers, or when querying long timeframes that match millions of log entries, the plugin has a sampling +algorithm to allow it respond promptly. It works like this: + +1. The latest 500k log entries are queried in full, evaluating all the fields of every single log entry. This evaluation + allows counting the unique values per field, updating the counters next to each value at the filters section of the + dashboard. +2. When the latest 500k log entries have been processed and there are more data to read, the plugin divides evenly 500k + more log entries to the number of journal files matched by the query. 
So, it will continue to evaluate all the fields + of all log entries, up to the budget per file, aiming to fully query 1 million log entries in total. +3. When the budget is hit for a given file, the plugin continues to scan log entries, but it no longer evaluates + the fields and their values, so the counters per field and value are not updated. These unsampled log entries are + shown in the histogram with the label `[unsampled]`. +4. The plugin continues to count `[unsampled]` entries until as many entries as were sampled have been seen and at least + 1% of the journal file has been processed. +5. When the `[unsampled]` budget is exhausted, the plugin stops processing the journal file and, based on the processing + completed so far and the number of entries in the journal file, it estimates the remaining number of log entries in + that file. This is shown as `[estimated]` in the histogram. +6. In systemd versions 254 or later, the plugin fetches the unique sequence number of each log entry and calculates + the percentage of the file matched by the query, versus the total number of log entries in the journal file. +7. In systemd versions prior to 254, the plugin estimates the number of entries the journal file contributes to the + query, using the number of log entries it matched vs. the total duration the log file has entries for. + +The above allows the plugin to respond promptly even when the journal files contain several +dozen million log entries, while providing accurate estimations of the log entries over time in the histogram and enough counters +in the fields filtering section to help users get an overview of the whole timeframe. + +Because the latest 500k log entries and 1% of all journal files (which are spread over time) have been fully +evaluated, including counting the number of appearances for each field value, the plugin usually provides an accurate +representation of the whole timeframe. 
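The budget mechanics of the sampling steps above can be modelled with a toy shell function (illustrative only — `classify_entries` and the variable names are inventions of this sketch, not the plugin's code, which is written in C and samples along the time axis rather than strictly file-by-file):

```sh
# Toy model of the sampling budgets described above (illustrative only).
FULL_BUDGET=500000    # phase 1: evaluate every field of every entry
EXTRA_BUDGET=500000   # phase 2: split evenly across the matched files

# classify_entries <entries-per-file>...  ->  "sampled unsampled"
classify_entries() {
    files=$#
    [ "$files" -eq 0 ] && { echo "0 0"; return; }
    remaining=$FULL_BUDGET
    per_file=$((EXTRA_BUDGET / files))
    sampled=0 rest=0
    for count in "$@"; do
        # phase 1: consume the shared 500k budget first
        take=$((count < remaining ? count : remaining))
        remaining=$((remaining - take)); count=$((count - take)); sampled=$((sampled + take))
        # phase 2: this file's share of the second 500k
        take=$((count < per_file ? count : per_file))
        sampled=$((sampled + take)); count=$((count - take))
        # anything left over is [unsampled]/[estimated]
        rest=$((rest + count))
    done
    echo "$sampled $rest"
}
```

For example, `classify_entries 2000000` prints `1000000 1000000`: one million entries fully evaluated, one million left as `[unsampled]`/`[estimated]`.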
+ +Keep in mind that although the plugin is quite effective and responds promptly when there are hundreds of journal files +matching a query, response times may be longer when there are several thousands of smaller files. systemd versions 254+ +attempt to solve this problem by allowing `systemd-journal-remote` to create larger files. However, for systemd +versions prior to 254, `systemd-journal-remote` creates files of up to 32MB each, which when running very busy +journals centralization servers aggregating several thousands of log entries per second, the number of files can grow +to several dozens of thousands quickly. In such setups, the plugin should ideally skip processing journal files +entirely, relying solely on the estimations of the sequence of files each file is part of. However, this has not been +implemented yet. To improve the query performance in such setups, the user has to query smaller timeframes. + +Another optimization taking place in huge journal centralization points, is the initial scan of the database. The plugin +needs to know the list of all journal files available, including the details of the first and the last message in each +of them. When there are several thousands of files in a directory (like it usually happens in `/var/log/journal/remote`), +directory listing and examination of each file can take a considerable amount of time (even `ls -l` takes minutes). +To work around this problem, the plugin uses `inotify` to receive file updates immediately and scans the library from +the newest to the oldest file, allowing the user interface to work immediately after startup, for the most recent +timeframes. + +### Best practices for better performance + +systemd-journal has been designed **first to be reliable** and then to be fast. It includes several mechanisms to ensure +minimal data loss under all conditions (e.g. 
disk corruption, tampering, forward secure sealing) and despite the fact +that it utilizes several techniques to require minimal disk footprint (like deduplication of log entries, linking of +values and fields, compression) the disk footprint of journal files remains significantly higher compared to other log +management solutions. + +The higher disk footprint results in higher disk I/O during querying, since a lot more data have to read from disk to +evaluate a query. Query performance at scale can greatly benefit by utilizing a compressed filesystem (ext4, btrfs, zfs) +to store systemd-journal files. + +systemd-journal files are cached by the operating system. There is no database server to serve queries. Each file is +opened and the query runs by directly accessing the data in it. + +Therefore systemd-journal relies on the caching layer of the operating system to optimize query performance. The more +RAM the system has, although it will not be reported as `used` (it will be reported as `cache`), the faster the queries +will get. The first time a timeframe is accessed the query performance will be slower, but further queries on the same +timeframe will be significantly faster since journal data are now cached in memory. + +So, on busy logs centralization systems, queries performance can be improved significantly by using a compressed +filesystem for storing the journal files, and higher amounts of RAM. + ## Configuration and maintenance This Netdata plugin does not require any configuration or maintenance. ## FAQ -### Can I use this plugin on journals' centralization servers? +### Can I use this plugin on journal centralization servers? Yes. You can centralize your logs using `systemd-journal-remote`, and then install Netdata on this logs centralization server to explore the logs of all your infrastructure. 
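As a quick orientation, the centralization flow boils down to the following (a sketch only — assumes a Debian-like system; package names vary by distro, and the centralization guides referenced in this document cover the details, including changing the default port `19532`):

```sh
# on the centralization server: receive journals over the network
sudo apt-get install systemd-journal-remote
sudo systemctl enable --now systemd-journal-remote.socket

# on every client: push the local journal to the server
# (set URL=http://server.ip:19532 in /etc/systemd/journal-upload.conf first)
sudo apt-get install systemd-journal-remote
sudo systemctl enable --now systemd-journal-upload
```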
@@ -304,7 +385,7 @@ on this logs centralization server to explore the logs of all your infrastructur This plugin will automatically provide multi-node views of your logs and also give you the ability to combine the logs of multiple servers, as you see fit. -Check [configuring a logs centralization server](#configuring-a-journals-centralization-server). +Check [configuring a logs centralization server](#how-do-i-configure-a-journal-centralization-server). ### Can I use this plugin from a parent Netdata? @@ -364,7 +445,7 @@ Yes. It is simple, fast and the software to do it is already in your systems. For application and system logs, `systemd` journal is ideal and the visibility you can get by centralizing your system logs and the use of this Netdata plugin, is unparalleled. -### How do I configure a journals' centralization server? +### How do I configure a journal centralization server? A short summary to get journal server running can be found below. There are two strategies you can apply, when it comes down to a centralized server for `systemd` journal logs. @@ -374,294 +455,13 @@ There are two strategies you can apply, when it comes down to a centralized serv For more options and reference to documentation, check `man systemd-journal-remote` and `man systemd-journal-upload`. -#### _passive_ journals' centralization without encryption - -> ℹ️ _passive_ is a journal server that waits for clients to push their metrics to it. - -> ⚠️ **IMPORTANT** -> These instructions will copy your logs to a central server, without any encryption or authorization. -> DO NOT USE THIS ON NON-TRUSTED NETWORKS. 
- -##### _passive_ server, without encryption - -On the centralization server install `systemd-journal-remote`: - -```sh -# change this according to your distro -sudo apt-get install systemd-journal-remote -``` - -Make sure the journal transfer protocol is `http`: - -```sh -sudo cp /lib/systemd/system/systemd-journal-remote.service /etc/systemd/system/ - -# edit it to make sure it says: -# --listen-http=-3 -# not: -# --listen-https=-3 -sudo nano /etc/systemd/system/systemd-journal-remote.service - -# reload systemd -sudo systemctl daemon-reload -``` - -Optionally, if you want to change the port (the default is `19532`), edit `systemd-journal-remote.socket` - -```sh -# edit the socket file -sudo systemctl edit systemd-journal-remote.socket -``` - -and add the following lines into the instructed place, and choose your desired port; save and exit. - -```sh -[Socket] -ListenStream=<DESIRED_PORT> -``` - -Finally, enable it, so that it will start automatically upon receiving a connection: - -``` -# enable systemd-journal-remote -sudo systemctl enable --now systemd-journal-remote.socket -sudo systemctl enable systemd-journal-remote.service -``` - -`systemd-journal-remote` is now listening for incoming journals from remote hosts. 
- -##### _passive_ client, without encryption - -On the clients, install `systemd-journal-remote`: - -```sh -# change this according to your distro -sudo apt-get install systemd-journal-remote -``` - -Edit `/etc/systemd/journal-upload.conf` and set the IP address and the port of the server, like so: - -``` -[Upload] -URL=http://centralization.server.ip:19532 -``` - -Edit `systemd-journal-upload`, and add `Restart=always` to make sure the client will keep trying to push logs, even if the server is temporarily not there, like this: - -```sh -sudo systemctl edit systemd-journal-upload -``` - -At the top, add: - -``` -[Service] -Restart=always -``` - -Enable and start `systemd-journal-upload`, like this: - -```sh -sudo systemctl enable systemd-journal-upload -sudo systemctl start systemd-journal-upload -``` - -##### verify it works - -To verify the central server is receiving logs, run this on the central server: - -```sh -sudo ls -l /var/log/journal/remote/ -``` - -You should see new files from the client's IP. - -Also, `systemctl status systemd-journal-remote` should show something like this: - -``` -systemd-journal-remote.service - Journal Remote Sink Service - Loaded: loaded (/etc/systemd/system/systemd-journal-remote.service; indirect; preset: disabled) - Active: active (running) since Sun 2023-10-15 14:29:46 EEST; 2h 24min ago -TriggeredBy: ● systemd-journal-remote.socket - Docs: man:systemd-journal-remote(8) - man:journal-remote.conf(5) - Main PID: 2118153 (systemd-journal) - Status: "Processing requests..." - Tasks: 1 (limit: 154152) - Memory: 2.2M - CPU: 71ms - CGroup: /system.slice/systemd-journal-remote.service - └─2118153 /usr/lib/systemd/systemd-journal-remote --listen-http=-3 --output=/var/log/journal/remote/ -``` - -Note the `status: "Processing requests..."` and the PID under `CGroup`. 
- -On the client `systemctl status systemd-journal-upload` should show something like this: - -``` -● systemd-journal-upload.service - Journal Remote Upload Service - Loaded: loaded (/lib/systemd/system/systemd-journal-upload.service; enabled; vendor preset: disabled) - Drop-In: /etc/systemd/system/systemd-journal-upload.service.d - └─override.conf - Active: active (running) since Sun 2023-10-15 10:39:04 UTC; 3h 17min ago - Docs: man:systemd-journal-upload(8) - Main PID: 4169 (systemd-journal) - Status: "Processing input..." - Tasks: 1 (limit: 13868) - Memory: 3.5M - CPU: 1.081s - CGroup: /system.slice/systemd-journal-upload.service - └─4169 /lib/systemd/systemd-journal-upload --save-state -``` - -Note the `Status: "Processing input..."` and the PID under `CGroup`. - -#### _passive_ journals' centralization with encryption using self-signed certificates - -> ℹ️ _passive_ is a journal server that waits for clients to push their metrics to it. +#### _passive_ journal centralization without encryption -##### _passive_ server, with encryption and self-singed certificates +If you want to setup your own passive journal centralization setup without encryption, [check out guide on it](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/passive_journal_centralization_guide_no_encryption.md). 
-On the centralization server install `systemd-journal-remote` and `openssl`: - -```sh -# change this according to your distro -sudo apt-get install systemd-journal-remote openssl -``` - -Make sure the journal transfer protocol is `https`: - -```sh -sudo cp /lib/systemd/system/systemd-journal-remote.service /etc/systemd/system/ - -# edit it to make sure it says: -# --listen-https=-3 -# not: -# --listen-http=-3 -sudo nano /etc/systemd/system/systemd-journal-remote.service - -# reload systemd -sudo systemctl daemon-reload -``` - -Optionally, if you want to change the port (the default is `19532`), edit `systemd-journal-remote.socket` - -```sh -# edit the socket file -sudo systemctl edit systemd-journal-remote.socket -``` - -and add the following lines into the instructed place, and choose your desired port; save and exit. - -```sh -[Socket] -ListenStream=<DESIRED_PORT> -``` - -Finally, enable it, so that it will start automatically upon receiving a connection: - -```sh -# enable systemd-journal-remote -sudo systemctl enable --now systemd-journal-remote.socket -sudo systemctl enable systemd-journal-remote.service -``` - -`systemd-journal-remote` is now listening for incoming journals from remote hosts. - -Use [this script](https://gist.github.com/ktsaou/d62b8a6501cf9a0da94f03cbbb71c5c7) to create a self-signed certificates authority and certificates for all your servers. - -```sh -wget -O systemd-journal-self-signed-certs.sh "https://gist.githubusercontent.com/ktsaou/d62b8a6501cf9a0da94f03cbbb71c5c7/raw/c346e61e0a66f45dc4095d254bd23917f0a01bd0/systemd-journal-self-signed-certs.sh" -chmod 755 systemd-journal-self-signed-certs.sh -``` - -Edit the script and at its top, set your settings: - -```sh -# The directory to save the generated certificates (and everything about this certificate authority). -# This is only used on the node generating the certificates (usually on the journals server). 
-DIR="/etc/ssl/systemd-journal-remote" - -# The journals centralization server name (the CN of the server certificate). -SERVER="server-hostname" - -# All the DNS names or IPs this server is reachable at (the certificate will include them). -# Journal clients can use any of them to connect to this server. -# systemd-journal-upload validates its URL= hostname, against this list. -SERVER_ALIASES=("DNS:server-hostname1" "DNS:server-hostname2" "IP:1.2.3.4" "IP:10.1.1.1" "IP:172.16.1.1") - -# All the names of the journal clients that will be sending logs to the server (the CNs of their certificates). -# These names are used by systemd-journal-remote to name the journal files in /var/log/journal/remote/. -# Also the remote hosts will be presented using these names on Netdata dashboards. -CLIENTS=("vm1" "vm2" "vm3" "add_as_may_as_needed") -``` - -Then run the script: - -```sh -sudo ./systemd-journal-self-signed-certs.sh -``` - -The script will create the directory `/etc/ssl/systemd-journal-remote` and in it you will find all the certificates needed. - -There will also be files named `runme-on-XXX.sh`. There will be 1 script for the server and 1 script for each of the clients. You can copy and paste (or `scp`) these scripts on your server and each of your clients and run them as root: - -```sh -scp /etc/ssl/systemd-journal-remote/runme-on-XXX.sh XXX:/tmp/ -``` - -Once the above is done, `ssh` to each server/client and do: - -```sh -sudo bash /tmp/runme-on-XXX.sh -``` - -The scripts install the needed certificates, fix their file permissions to be accessible by systemd-journal-remote/upload, change `/etc/systemd/journal-remote.conf` (on the server) or `/etc/systemd/journal-upload.conf` on the clients and restart the relevant services. 
- - -##### _passive_ client, with encryption and self-singed certificates - -On the clients, install `systemd-journal-remote`: - -```sh -# change this according to your distro -sudo apt-get install systemd-journal-remote -``` - -Edit `/etc/systemd/journal-upload.conf` and set the IP address and the port of the server, like so: - -``` -[Upload] -URL=https://centralization.server.ip:19532 -``` - -Make sure that `centralization.server.ip` is one of the `SERVER_ALIASES` when you created the certificates. - -Edit `systemd-journal-upload`, and add `Restart=always` to make sure the client will keep trying to push logs, even if the server is temporarily not there, like this: - -```sh -sudo systemctl edit systemd-journal-upload -``` - -At the top, add: - -``` -[Service] -Restart=always -``` - -Enable and start `systemd-journal-upload`, like this: - -```sh -sudo systemctl enable systemd-journal-upload -``` - -Copy the relevant `runme-on-XXX.sh` script as described on server setup and run it: - -```sh -sudo bash /tmp/runme-on-XXX.sh -``` +#### _passive_ journal centralization with encryption using self-signed certificates +If you want to setup your own passive journal centralization setup using self-signed certificates for encryption, [check out guide on it](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/passive_journal_centralization_guide_self_signed_certs.md). #### Limitations when using a logs centralization server @@ -670,4 +470,3 @@ As of this writing `namespaces` support by `systemd` is limited: - Docker containers cannot log to namespaces. Check [this issue](https://github.com/moby/moby/issues/41879). - `systemd-journal-upload` automatically uploads `system` and `user` journals, but not `namespaces` journals. For this you need to spawn a `systemd-journal-upload` per namespace. 
- diff --git a/collectors/systemd-journal.plugin/active_journal_centralization_guide_no_encryption.md b/collectors/systemd-journal.plugin/active_journal_centralization_guide_no_encryption.md new file mode 100644 index 000000000..cbed1e81e --- /dev/null +++ b/collectors/systemd-journal.plugin/active_journal_centralization_guide_no_encryption.md @@ -0,0 +1,126 @@ +# Active journal source without encryption + +This page will guide you through creating an active journal source without the use of encryption. + +Once you enable an active journal source on a server, `systemd-journal-gatewayd` will expose an REST API on TCP port 19531. This API can be used for querying the logs, exporting the logs, or monitoring new log entries, remotely. + +> ⚠️ **IMPORTANT**<br/> +> These instructions will expose your logs to the network, without any encryption or authorization.<br/> +> DO NOT USE THIS ON NON-TRUSTED NETWORKS. + +## Configuring an active journal source + +On the server you want to expose their logs, install `systemd-journal-gateway`. + +```bash +# change this according to your distro +sudo apt-get install systemd-journal-gateway +``` + +Optionally, if you want to change the port (the default is `19531`), edit `systemd-journal-gatewayd.socket` + +```bash +# edit the socket file +sudo systemctl edit systemd-journal-gatewayd.socket +``` + +and add the following lines into the instructed place, and choose your desired port; save and exit. + +```bash +[Socket] +ListenStream=<DESIRED_PORT> +``` + +Finally, enable it, so that it will start automatically upon receiving a connection: + +```bash +# enable systemd-journal-remote +sudo systemctl daemon-reload +sudo systemctl enable --now systemd-journal-gatewayd.socket +``` + +## Using the active journal source + +### Simple Logs Explorer + +`systemd-journal-gateway` provides a simple HTML5 application to browse the logs. 
+ +To use it, open your web browser and navigate to: + +``` +http://server.ip:19531/browse +``` + +A simple page like this will be presented: + +![image](https://github.com/netdata/netdata/assets/2662304/4da88bf8-6398-468b-a359-68db0c9ad419) + +### Use it with `curl` + +`man systemd-journal-gatewayd` documents the supported API methods and provides examples to query the API using `curl` commands. + +### Copying the logs to a central journals server + +`systemd-journal-remote` has the ability to query instances of `systemd-journal-gatewayd` to fetch their logs, so that the central server fetches the logs, instead of waiting for the individual servers to push their logs to it. + +However, this kind of logs centralization has a key problem: **there is no guarantee that there will be no gaps in the logs replicated**. Theoretically, the REST API of `systemd-journal-gatewayd` supports querying past data, and `systemd-journal-remote` could keep track of the state of replication and automatically continue from the point it stopped last time. But it does not. So, currently the best logs centralization option is to use a **passive** centralization, where the clients push their logs to the server. 
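As noted in the `curl` section above, the gatewayd API can also be scripted directly. A couple of illustrative requests follow (a sketch — `server.ip` is a placeholder; `man systemd-journal-gatewayd` remains the authoritative reference for the supported headers and parameters):

```bash
# fetch 10 entries from near the end of the journal, as JSON
curl --silent \
     --header 'Accept: application/json' \
     --header 'Range: entries=:-3:10' \
     'http://server.ip:19531/entries'

# follow new entries as they are written (similar to journalctl -f)
curl --silent 'http://server.ip:19531/entries?follow'
```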
+ +Given these limitations, if you still want to configure an **active** journals centralization, this is what you need to do: + +On the centralization server install `systemd-journal-remote`: + +```bash +# change this according to your distro +sudo apt-get install systemd-journal-remote +``` + +Then, copy `systemd-journal-remote.service` to configure it for querying the active source: + +```bash +# replace "clientX" with the name of the active client node +sudo cp /lib/systemd/system/systemd-journal-remote.service /etc/systemd/system/systemd-journal-remote-clientX.service + +# edit it to make sure it the ExecStart line is like this: +# ExecStart=/usr/lib/systemd/systemd-journal-remote --url http://clientX:19531/entries?follow +sudo nano /etc/systemd/system/systemd-journal-remote-clientX.service + +# reload systemd +sudo systemctl daemon-reload +``` + +```bash +# enable systemd-journal-remote +sudo systemctl enable --now systemd-journal-remote-clientX.service +``` + +You can repeat this process to create as many `systemd-journal-remote` services, as the active source you have. + +## Verify it works + +To verify the central server is receiving logs, run this on the central server: + +```bash +sudo ls -l /var/log/journal/remote/ +``` + +You should see new files from the client's hostname or IP. 
+ +Also, any of the new service files (`systemctl status systemd-journal-clientX`) should show something like this: + +```bash +● systemd-journal-clientX.service - Fetching systemd journal logs from 192.168.2.146 + Loaded: loaded (/etc/systemd/system/systemd-journal-clientX.service; enabled; preset: disabled) + Drop-In: /usr/lib/systemd/system/service.d + └─10-timeout-abort.conf + Active: active (running) since Wed 2023-10-18 07:35:52 EEST; 23min ago + Main PID: 77959 (systemd-journal) + Tasks: 2 (limit: 6928) + Memory: 7.7M + CPU: 518ms + CGroup: /system.slice/systemd-journal-clientX.service + ├─77959 /usr/lib/systemd/systemd-journal-remote --url "http://192.168.2.146:19531/entries?follow" + └─77962 curl "-HAccept: application/vnd.fdo.journal" --silent --show-error "http://192.168.2.146:19531/entries?follow" + +Oct 18 07:35:52 systemd-journal-server systemd[1]: Started systemd-journal-clientX.service - Fetching systemd journal logs from 192.168.2.146. +Oct 18 07:35:52 systemd-journal-server systemd-journal-remote[77959]: Spawning curl http://192.168.2.146:19531/entries?follow... +``` diff --git a/collectors/systemd-journal.plugin/forward_secure_sealing.md b/collectors/systemd-journal.plugin/forward_secure_sealing.md new file mode 100644 index 000000000..b41570d68 --- /dev/null +++ b/collectors/systemd-journal.plugin/forward_secure_sealing.md @@ -0,0 +1,80 @@ +# Forward Secure Sealing (FSS) in Systemd-Journal + +Forward Secure Sealing (FSS) is a feature in the systemd journal designed to detect log file tampering. +Given that attackers often try to hide their actions by modifying or deleting log file entries, +FSS provides administrators with a mechanism to identify any such unauthorized alterations. + +## Importance +Logs are a crucial component of system monitoring and auditing. Ensuring their integrity means administrators can trust +the data, detect potential breaches, and trace actions back to their origins. 
Traditional methods to maintain this
+integrity involve writing logs to external systems or printing them out. While these methods are effective, they are
+not foolproof. FSS offers a more streamlined approach, allowing for log verification directly on the local system.
+
+## How FSS Works
+FSS operates by "sealing" binary logs at regular intervals. This seal is a cryptographic operation, ensuring that any
+tampering with the logs prior to the sealing can be detected. If an attacker modifies logs before they are sealed,
+these changes become a permanent part of the sealed record, highlighting any malicious activity.
+
+The technology behind FSS is based on "Forward Secure Pseudo Random Generators" (FSPRG), a concept stemming from
+academic research.
+
+Two keys are central to FSS:
+
+- **Sealing Key**: Kept on the system, used to seal the logs.
+- **Verification Key**: Stored securely off-system, used to verify the sealed logs.
+
+At regular intervals, the sealing key is evolved in a non-reversible process, so that old keys become obsolete and the
+latest logs are sealed with a fresh key. The off-site verification key can regenerate any past sealing key, allowing
+administrators to verify older seals. If logs are tampered with, verification will fail, alerting administrators to the
+breach.
+
+## Enabling FSS
+To enable FSS, use the following command:
+
+```bash
+journalctl --setup-keys
+```
+
+This prints the verification key; store it securely off-system. By default, systemd will seal the logs every 15
+minutes. However, this interval can be adjusted using a flag during key generation. For example, to seal logs every
+10 seconds:
+
+```bash
+journalctl --setup-keys --interval=10s
+```
+
+## Verifying Journals
+After enabling FSS, you can verify the integrity of your logs by passing the verification key:
+
+```bash
+journalctl --verify --verify-key=<verification-key>
+```
+
+Without `--verify-key`, `journalctl --verify` only checks the internal consistency of the journal files, not the
+seals. If any discrepancies are found, you'll be alerted, indicating potential tampering.
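The forward-secure key evolution described above can be illustrated with a toy hash-chain sketch. This is illustrative only; the real FSPRG construction used by journald is different, but the property is the same: the sealer discards old keys, so it cannot forge past seals, while the off-system key can re-derive every epoch.

```python
import hashlib

def evolve(key: bytes) -> bytes:
    """One-way key evolution: the new key cannot be reversed into the old one."""
    return hashlib.sha256(b"evolve:" + key).digest()

def seal(key: bytes, log: bytes) -> bytes:
    """Seal (MAC) a log chunk with the current epoch's key."""
    return hashlib.sha256(key + log).digest()

# The verifier keeps the initial key off-system and can re-derive any epoch's key.
k0 = hashlib.sha256(b"initial secret").digest()
logs = [b"epoch-0 entries", b"epoch-1 entries", b"epoch-2 entries"]

# Sealer: evolves its key each epoch, discarding the old one.
k = k0
seals = []
for entry in logs:
    seals.append(seal(k, entry))
    k = evolve(k)  # after this, the sealer can no longer re-create past seals

# Verifier: re-derives each epoch key from k0 and checks every seal.
k = k0
for entry, s in zip(logs, seals):
    assert seal(k, entry) == s
    k = evolve(k)

# Tampering with a past epoch is detected: the seal no longer matches.
assert seal(k0, b"tampered entries") != seals[0]
print("all seals verified")
```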
+
+## Disabling FSS
+Should you wish to disable FSS:
+
+**Delete the Sealing Key**: This stops new log entries from being sealed. The sealing key is kept in the journal
+directory of the local machine:
+
+```bash
+sudo rm /var/log/journal/$(cat /etc/machine-id)/fss
+```
+
+**Rotate and Prune the Journals**: This will start a new unsealed journal and can remove old sealed journals.
+
+```bash
+journalctl --rotate
+journalctl --vacuum-time=1s
+```
+
+**Adjust Systemd Configuration (Optional)**: If you've made changes to facilitate FSS in `/etc/systemd/journald.conf`,
+consider reverting or adjusting those. Restart the systemd-journald service afterward:
+
+```bash
+systemctl restart systemd-journald
+```
+
+## Conclusion
+FSS is a significant advancement in maintaining log integrity. While not a replacement for all traditional integrity
+methods, it offers a valuable tool in the battle against unauthorized log tampering. By integrating FSS into your log
+management strategy, you ensure a more transparent, reliable, and tamper-evident logging system.
diff --git a/collectors/systemd-journal.plugin/passive_journal_centralization_guide_no_encryption.md b/collectors/systemd-journal.plugin/passive_journal_centralization_guide_no_encryption.md
new file mode 100644
index 000000000..b70c22033
--- /dev/null
+++ b/collectors/systemd-journal.plugin/passive_journal_centralization_guide_no_encryption.md
@@ -0,0 +1,150 @@
+# Passive journal centralization without encryption
+
+This page will guide you through creating a passive journal centralization setup without the use of encryption.
+
+Once you centralize your infrastructure logs to a server, Netdata will automatically detect all the logs from all servers and organize them into sources.
+With the setup described in this document, journal files are identified by the IPs of the clients sending the logs. Netdata will automatically do
+reverse DNS lookups to find the names of the servers and name the sources on the dashboard accordingly.
+
+A _passive_ journal server waits for clients to push their logs to it, so in this setup we will:
+
+1. 
configure `systemd-journal-remote` on the server, to listen for incoming connections.
+2. configure `systemd-journal-upload` on the clients, to push their logs to the server.
+
+> ⚠️ **IMPORTANT**<br/>
+> These instructions will copy your logs to a central server, without any encryption or authorization.<br/>
+> DO NOT USE THIS ON NON-TRUSTED NETWORKS.
+
+## Server configuration
+
+On the centralization server install `systemd-journal-remote`:
+
+```bash
+# change this according to your distro
+sudo apt-get install systemd-journal-remote
+```
+
+Make sure the journal transfer protocol is `http`:
+
+```bash
+sudo cp /lib/systemd/system/systemd-journal-remote.service /etc/systemd/system/
+
+# edit it to make sure it says:
+# --listen-http=-3
+# not:
+# --listen-https=-3
+sudo nano /etc/systemd/system/systemd-journal-remote.service
+
+# reload systemd
+sudo systemctl daemon-reload
+```
+
+Optionally, if you want to change the port (the default is `19532`), edit `systemd-journal-remote.socket`:
+
+```bash
+# edit the socket file
+sudo systemctl edit systemd-journal-remote.socket
+```
+
+and add the following lines in the place indicated, choosing your desired port; save and exit. (The empty
+`ListenStream=` clears the default port first; socket drop-ins otherwise add an extra listener.)
+
+```conf
+[Socket]
+ListenStream=
+ListenStream=<DESIRED_PORT>
+```
+
+Finally, enable it, so that it will start automatically upon receiving a connection:
+
+```bash
+# enable systemd-journal-remote
+sudo systemctl enable --now systemd-journal-remote.socket
+sudo systemctl enable systemd-journal-remote.service
+```
+
+`systemd-journal-remote` is now listening for incoming journals from remote hosts.
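Before configuring any clients, you can sanity-check that the receiver is actually listening (port `19532` assumes the default):

```shell
# The socket unit should now be listening on TCP port 19532
ss -ltn | grep -w 19532 || echo "not listening yet - check systemd-journal-remote.socket"
```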
+
+## Client configuration
+
+On the clients, install `systemd-journal-remote` (it includes `systemd-journal-upload`):
+
+```bash
+# change this according to your distro
+sudo apt-get install systemd-journal-remote
+```
+
+Edit `/etc/systemd/journal-upload.conf` and set the IP address and the port of the server, like so:
+
+```conf
+[Upload]
+URL=http://centralization.server.ip:19532
+```
+
+Edit the `systemd-journal-upload` service, and add `Restart=always` to make sure the client will keep trying to push logs, even if the server is temporarily not there, like this:
+
+```bash
+sudo systemctl edit systemd-journal-upload
+```
+
+At the top, add:
+
+```conf
+[Service]
+Restart=always
+```
+
+Enable and start `systemd-journal-upload`, like this:
+
+```bash
+sudo systemctl enable systemd-journal-upload
+sudo systemctl start systemd-journal-upload
+```
+
+## Verify it works
+
+To verify the central server is receiving logs, run this on the central server:
+
+```bash
+sudo ls -l /var/log/journal/remote/
+```
+
+You should see new files from the client's IP.
+
+Also, `systemctl status systemd-journal-remote` should show something like this:
+
+```bash
+systemd-journal-remote.service - Journal Remote Sink Service
+     Loaded: loaded (/etc/systemd/system/systemd-journal-remote.service; indirect; preset: disabled)
+     Active: active (running) since Sun 2023-10-15 14:29:46 EEST; 2h 24min ago
+TriggeredBy: ● systemd-journal-remote.socket
+       Docs: man:systemd-journal-remote(8)
+             man:journal-remote.conf(5)
+   Main PID: 2118153 (systemd-journal)
+     Status: "Processing requests..."
+      Tasks: 1 (limit: 154152)
+     Memory: 2.2M
+        CPU: 71ms
+     CGroup: /system.slice/systemd-journal-remote.service
+             └─2118153 /usr/lib/systemd/systemd-journal-remote --listen-http=-3 --output=/var/log/journal/remote/
+```
+
+Note the `Status: "Processing requests..."` and the PID under `CGroup`.
+
+On the client `systemctl status systemd-journal-upload` should show something like this:
+
+```bash
+● systemd-journal-upload.service - Journal Remote Upload Service
+     Loaded: loaded (/lib/systemd/system/systemd-journal-upload.service; enabled; vendor preset: disabled)
+    Drop-In: /etc/systemd/system/systemd-journal-upload.service.d
+             └─override.conf
+     Active: active (running) since Sun 2023-10-15 10:39:04 UTC; 3h 17min ago
+       Docs: man:systemd-journal-upload(8)
+   Main PID: 4169 (systemd-journal)
+     Status: "Processing input..."
+      Tasks: 1 (limit: 13868)
+     Memory: 3.5M
+        CPU: 1.081s
+     CGroup: /system.slice/systemd-journal-upload.service
+             └─4169 /lib/systemd/systemd-journal-upload --save-state
+```
+
+Note the `Status: "Processing input..."` and the PID under `CGroup`.
diff --git a/collectors/systemd-journal.plugin/passive_journal_centralization_guide_self_signed_certs.md b/collectors/systemd-journal.plugin/passive_journal_centralization_guide_self_signed_certs.md
new file mode 100644
index 000000000..722d1ceae
--- /dev/null
+++ b/collectors/systemd-journal.plugin/passive_journal_centralization_guide_self_signed_certs.md
@@ -0,0 +1,250 @@
+# Passive journal centralization with encryption using self-signed certificates
+
+This page will guide you through creating a **passive** journal centralization setup using **self-signed certificates** for encryption and authorization.
+
+Once you centralize your infrastructure logs to a server, Netdata will automatically detect all the logs from all servers and organize them into sources. With the setup described in this document, on recent systemd versions, Netdata will automatically name all remote sources using the names of the clients, as they are described in their certificates (on older versions, the names will be IPs or reverse DNS lookups of the IPs).
+
+A **passive** journal server waits for clients to push their logs to it, so in this setup we will:
+
+1. 
configure a certificate authority and issue self-signed certificates for your servers.
+2. configure `systemd-journal-remote` on the server, to listen for incoming connections.
+3. configure `systemd-journal-upload` on the clients, to push their logs to the server.
+
+Keep in mind that the authorization involved works like this:
+
+1. The server (`systemd-journal-remote`) validates that the client (`systemd-journal-upload`) uses a trusted certificate (a certificate issued by the same certificate authority as its own).
+   So, **the server will accept logs from any client having a valid certificate**.
+2. The client (`systemd-journal-upload`) validates that the receiver (`systemd-journal-remote`) uses a trusted certificate (like the server does) and it also checks that the hostname or IP of the URL specified in its configuration matches one of the names or IPs of the server it connects to. So, **the client validates that it connected to the right server**, by checking the URL hostname against the names and IPs on the server's certificate.
+
+This means that, if both certificates are issued by the same certificate authority, only the client can potentially reject the server.
+
+## Self-signed certificates
+
+To simplify the process of creating and managing self-signed certificates, we have created [this bash script](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/systemd-journal-self-signed-certs.sh).
+
+This also helps to automate the distribution of the certificates to your servers (it generates a new bash script for each of your servers, which includes everything required, including the certificates).
+
+We suggest keeping this script and all the involved certificates on the journals centralization server, in the directory `/etc/ssl/systemd-journal`, so that you can make future changes as required.
If you prefer to keep the certificate authority and all the certificates at a more secure location, just use the script at that location.
+
+On the server that will issue the certificates (usually the centralization server), do the following:
+
+```bash
+# install systemd-journal-remote to add the users and groups required and openssl for the certs
+# change this according to your distro
+sudo apt-get install systemd-journal-remote openssl
+
+# download the script and make it executable
+curl >systemd-journal-self-signed-certs.sh "https://raw.githubusercontent.com/netdata/netdata/master/collectors/systemd-journal.plugin/systemd-journal-self-signed-certs.sh"
+chmod 750 systemd-journal-self-signed-certs.sh
+```
+
+To create certificates for your servers, run this:
+
+```bash
+sudo ./systemd-journal-self-signed-certs.sh "server1" "DNS:hostname1" "IP:10.0.0.1"
+```
+
+Where:
+
+ - `server1` is the canonical name of the server. On newer systemd versions, this name will be used by `systemd-journal-remote` and Netdata when you view the logs on the dashboard.
+ - `DNS:hostname1` is a DNS name that the server is reachable at. Add `"DNS:xyz"` multiple times to define multiple DNS names for the server.
+ - `IP:10.0.0.1` is an IP that the server is reachable at. Add `"IP:xyz"` multiple times to define multiple IPs for the server.
+
+Repeat this process to create the certificates for all your servers. You can add servers as required, at any time in the future.
+
+Existing certificates are never re-generated. Typically, certificates would need to be revoked and new ones issued, but the `systemd-journal-remote` tools do not support handling revocations. So, the only way to re-issue a certificate is to delete its files in `/etc/ssl/systemd-journal` and run the script again to create a new one.
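To double-check which names a generated certificate actually carries (the names the client will validate the URL host against), you can inspect it with `openssl`. The certificate path below is an assumption based on the directory used above; adjust it to the file the script produced for your server:

```shell
# Print the subject and the Subject Alternative Name (DNS:/IP:) entries of a certificate
openssl x509 -in /etc/ssl/systemd-journal/server1.pem -noout -text | grep -E -A1 "Subject:|Subject Alternative Name"
```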
+
+Once you run the script for each of your servers, in `/etc/ssl/systemd-journal` you will find shell scripts named `runme-on-XXX.sh`, where `XXX` are the canonical names of your servers.
+
+These `runme-on-XXX.sh` scripts include everything needed to install the certificates, fix their file permissions to be accessible by `systemd-journal-remote` and `systemd-journal-upload`, and update `/etc/systemd/journal-remote.conf` and `/etc/systemd/journal-upload.conf`.
+
+You can copy (or `scp`) these scripts to your server and each of your clients:
+
+```bash
+sudo scp /etc/ssl/systemd-journal/runme-on-XXX.sh XXX:/tmp/
+```
+
+For the rest of this guide, we assume that you have copied the right `runme-on-XXX.sh` to the `/tmp` of all the servers for which you issued certificates.
+
+### Note about certificate file permissions
+
+It is worth noting that `systemd-journal` certificates need to be owned by `systemd-journal-remote:systemd-journal`.
+
+Both the user `systemd-journal-remote` and the group `systemd-journal` are automatically added by the `systemd-journal-remote` package. However, `systemd-journal-upload` (and `systemd-journal-gatewayd` - which is not used in this guide) use dynamic users. Thankfully, they are added to the `systemd-journal` group.
+
+So, having the certificates owned by `systemd-journal-remote:systemd-journal` satisfies both `systemd-journal-remote`, which is not in the `systemd-journal` group, and `systemd-journal-upload` (and `systemd-journal-gatewayd`), which use dynamic users.
+
+You don't need to do anything about it (the scripts take care of everything), but it is worth noting how this works.
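After running a `runme-on-XXX.sh` script, you can confirm the resulting ownership with a quick check (the directory is the one used throughout this guide):

```shell
# Every file should be reported as systemd-journal-remote:systemd-journal
stat -c '%U:%G  %n' /etc/ssl/systemd-journal/*
```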
+
+## Server configuration
+
+On the centralization server install `systemd-journal-remote`:
+
+```bash
+# change this according to your distro
+sudo apt-get install systemd-journal-remote
+```
+
+Make sure the journal transfer protocol is `https`:
+
+```bash
+sudo cp /lib/systemd/system/systemd-journal-remote.service /etc/systemd/system/
+
+# edit it to make sure it says:
+# --listen-https=-3
+# not:
+# --listen-http=-3
+sudo nano /etc/systemd/system/systemd-journal-remote.service
+
+# reload systemd
+sudo systemctl daemon-reload
+```
+
+Optionally, if you want to change the port (the default is `19532`), edit `systemd-journal-remote.socket`:
+
+```bash
+# edit the socket file
+sudo systemctl edit systemd-journal-remote.socket
+```
+
+and add the following lines in the place indicated, choosing your desired port; save and exit. (The empty
+`ListenStream=` clears the default port first; socket drop-ins otherwise add an extra listener.)
+
+```conf
+[Socket]
+ListenStream=
+ListenStream=<DESIRED_PORT>
+```
+
+Next, run the `runme-on-XXX.sh` script on the server:
+
+```bash
+# if you run the certificate authority on the server:
+sudo /etc/ssl/systemd-journal/runme-on-XXX.sh
+
+# if you run the certificate authority elsewhere,
+# assuming you have copied the runme-on-XXX.sh script (as described above):
+sudo bash /tmp/runme-on-XXX.sh
+```
+
+This will install the certificates in `/etc/ssl/systemd-journal`, set the right file permissions, and update `/etc/systemd/journal-remote.conf` and `/etc/systemd/journal-upload.conf` to use the right certificate files.
+
+Finally, enable it, so that it will start automatically upon receiving a connection:
+
+```bash
+# enable systemd-journal-remote
+sudo systemctl enable --now systemd-journal-remote.socket
+sudo systemctl enable systemd-journal-remote.service
+```
+
+`systemd-journal-remote` is now listening for incoming journals from remote hosts.
+
+> When done, remember to `rm /tmp/runme-on-*.sh` to make sure your certificates are secure.
+
+## Client configuration
+
+On the clients, install `systemd-journal-remote` (it includes `systemd-journal-upload`):
+
+```bash
+# change this according to your distro
+sudo apt-get install systemd-journal-remote
+```
+
+Edit `/etc/systemd/journal-upload.conf` and set the IP address and the port of the server, like so:
+
+```conf
+[Upload]
+URL=https://centralization.server.ip:19532
+```
+
+Make sure that `centralization.server.ip` is one of the `DNS:` or `IP:` parameters you defined when you created the centralization server certificates. If it is not, the client may refuse to connect.
+
+Next, edit `systemd-journal-upload.service`, and add `Restart=always` to make sure the client will keep trying to push logs, even if the server is temporarily not there, like this:
+
+```bash
+sudo systemctl edit systemd-journal-upload.service
+```
+
+At the top, add:
+
+```conf
+[Service]
+Restart=always
+```
+
+Enable `systemd-journal-upload.service`, like this:
+
+```bash
+sudo systemctl enable systemd-journal-upload.service
+```
+
+Assuming that you have in `/tmp` the relevant `runme-on-XXX.sh` script for this client, run:
+
+```bash
+sudo bash /tmp/runme-on-XXX.sh
+```
+
+This will install the certificates in `/etc/ssl/systemd-journal`, set the right file permissions, and update `/etc/systemd/journal-remote.conf` and `/etc/systemd/journal-upload.conf` to use the right certificate files.
+
+Finally, restart `systemd-journal-upload.service`:
+
+```bash
+sudo systemctl restart systemd-journal-upload.service
+```
+
+The client should now be pushing logs to the central server.
+
+> When done, remember to `rm /tmp/runme-on-*.sh` to make sure your certificates are secure.
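If the client does not connect, you can test TLS connectivity and certificate validation by hand from the client. The certificate file names below are assumptions - use the paths that `runme-on-XXX.sh` wrote into `/etc/systemd/journal-upload.conf`:

```shell
# Attempt a TLS handshake with the server using the client's certificate.
# A TLS-level failure here means systemd-journal-upload will fail too.
sudo curl -sv -o /dev/null \
     --cacert /etc/ssl/systemd-journal/ca.pem \
     --cert   /etc/ssl/systemd-journal/clientX.pem \
     --key    /etc/ssl/systemd-journal/clientX.key \
     https://centralization.server.ip:19532/
```

Certificate errors printed by `curl` (unknown CA, hostname mismatch) point directly at which side's certificate needs fixing.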
+
+Here it is in action, in Netdata:
+
+![2023-10-18 16-23-05](https://github.com/netdata/netdata/assets/2662304/83bec232-4770-455b-8f1c-46b5de5f93a2)
+
+## Verify it works
+
+To verify the central server is receiving logs, run this on the central server:
+
+```bash
+sudo ls -l /var/log/journal/remote/
+```
+
+Depending on the `systemd` version you use, you should see new files from the clients' canonical names (as defined in their certificates) or IPs.
+
+Also, `systemctl status systemd-journal-remote` should show something like this:
+
+```bash
+systemd-journal-remote.service - Journal Remote Sink Service
+     Loaded: loaded (/etc/systemd/system/systemd-journal-remote.service; indirect; preset: disabled)
+     Active: active (running) since Sun 2023-10-15 14:29:46 EEST; 2h 24min ago
+TriggeredBy: ● systemd-journal-remote.socket
+       Docs: man:systemd-journal-remote(8)
+             man:journal-remote.conf(5)
+   Main PID: 2118153 (systemd-journal)
+     Status: "Processing requests..."
+      Tasks: 1 (limit: 154152)
+     Memory: 2.2M
+        CPU: 71ms
+     CGroup: /system.slice/systemd-journal-remote.service
+             └─2118153 /usr/lib/systemd/systemd-journal-remote --listen-https=-3 --output=/var/log/journal/remote/
+```
+
+Note the `Status: "Processing requests..."` and the PID under `CGroup`.
+
+On the client `systemctl status systemd-journal-upload` should show something like this:
+
+```bash
+● systemd-journal-upload.service - Journal Remote Upload Service
+     Loaded: loaded (/lib/systemd/system/systemd-journal-upload.service; enabled; vendor preset: disabled)
+    Drop-In: /etc/systemd/system/systemd-journal-upload.service.d
+             └─override.conf
+     Active: active (running) since Sun 2023-10-15 10:39:04 UTC; 3h 17min ago
+       Docs: man:systemd-journal-upload(8)
+   Main PID: 4169 (systemd-journal)
+     Status: "Processing input..."
+ Tasks: 1 (limit: 13868) + Memory: 3.5M + CPU: 1.081s + CGroup: /system.slice/systemd-journal-upload.service + └─4169 /lib/systemd/systemd-journal-upload --save-state +``` + +Note the `Status: "Processing input..."` and the PID under `CGroup`. diff --git a/collectors/systemd-journal.plugin/systemd-internals.h b/collectors/systemd-journal.plugin/systemd-internals.h new file mode 100644 index 000000000..e1ae44d4f --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-internals.h @@ -0,0 +1,162 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_COLLECTORS_SYSTEMD_INTERNALS_H +#define NETDATA_COLLECTORS_SYSTEMD_INTERNALS_H + +#include "collectors/all.h" +#include "libnetdata/libnetdata.h" + +#include <linux/capability.h> +#include <systemd/sd-journal.h> +#include <syslog.h> + +#define SYSTEMD_JOURNAL_FUNCTION_DESCRIPTION "View, search and analyze systemd journal entries." +#define SYSTEMD_JOURNAL_FUNCTION_NAME "systemd-journal" +#define SYSTEMD_JOURNAL_DEFAULT_TIMEOUT 60 +#define SYSTEMD_JOURNAL_ENABLE_ESTIMATIONS_FILE_PERCENTAGE 0.01 +#define SYSTEMD_JOURNAL_EXECUTE_WATCHER_PENDING_EVERY_MS 250 +#define SYSTEMD_JOURNAL_ALL_FILES_SCAN_EVERY_USEC (5 * 60 * USEC_PER_SEC) + +#define SYSTEMD_UNITS_FUNCTION_DESCRIPTION "View the status of systemd units" +#define SYSTEMD_UNITS_FUNCTION_NAME "systemd-list-units" +#define SYSTEMD_UNITS_DEFAULT_TIMEOUT 30 + +extern __thread size_t fstat_thread_calls; +extern __thread size_t fstat_thread_cached_responses; +void fstat_cache_enable_on_thread(void); +void fstat_cache_disable_on_thread(void); + +extern netdata_mutex_t stdout_mutex; + +typedef enum { + ND_SD_JOURNAL_NO_FILE_MATCHED, + ND_SD_JOURNAL_FAILED_TO_OPEN, + ND_SD_JOURNAL_FAILED_TO_SEEK, + ND_SD_JOURNAL_TIMED_OUT, + ND_SD_JOURNAL_OK, + ND_SD_JOURNAL_NOT_MODIFIED, + ND_SD_JOURNAL_CANCELLED, +} ND_SD_JOURNAL_STATUS; + +typedef enum { + SDJF_NONE = 0, + SDJF_ALL = (1 << 0), + SDJF_LOCAL_ALL = (1 << 1), + SDJF_REMOTE_ALL = (1 << 2), + SDJF_LOCAL_SYSTEM = 
(1 << 3), + SDJF_LOCAL_USER = (1 << 4), + SDJF_LOCAL_NAMESPACE = (1 << 5), + SDJF_LOCAL_OTHER = (1 << 6), +} SD_JOURNAL_FILE_SOURCE_TYPE; + +struct journal_file { + const char *filename; + size_t filename_len; + STRING *source; + SD_JOURNAL_FILE_SOURCE_TYPE source_type; + usec_t file_last_modified_ut; + usec_t msg_first_ut; + usec_t msg_last_ut; + size_t size; + bool logged_failure; + bool logged_journalctl_failure; + usec_t max_journal_vs_realtime_delta_ut; + + usec_t last_scan_monotonic_ut; + usec_t last_scan_header_vs_last_modified_ut; + + uint64_t first_seqnum; + uint64_t last_seqnum; + sd_id128_t first_writer_id; + sd_id128_t last_writer_id; + + uint64_t messages_in_file; +}; + +#define SDJF_SOURCE_ALL_NAME "all" +#define SDJF_SOURCE_LOCAL_NAME "all-local-logs" +#define SDJF_SOURCE_LOCAL_SYSTEM_NAME "all-local-system-logs" +#define SDJF_SOURCE_LOCAL_USERS_NAME "all-local-user-logs" +#define SDJF_SOURCE_LOCAL_OTHER_NAME "all-uncategorized" +#define SDJF_SOURCE_NAMESPACES_NAME "all-local-namespaces" +#define SDJF_SOURCE_REMOTES_NAME "all-remote-systems" + +#define ND_SD_JOURNAL_OPEN_FLAGS (0) + +#define JOURNAL_VS_REALTIME_DELTA_DEFAULT_UT (5 * USEC_PER_SEC) // assume always 5 seconds latency +#define JOURNAL_VS_REALTIME_DELTA_MAX_UT (2 * 60 * USEC_PER_SEC) // up to 2 minutes latency + +extern DICTIONARY *journal_files_registry; +extern DICTIONARY *used_hashes_registry; +extern DICTIONARY *function_query_status_dict; +extern DICTIONARY *boot_ids_to_first_ut; + +int journal_file_dict_items_backward_compar(const void *a, const void *b); +int journal_file_dict_items_forward_compar(const void *a, const void *b); +void buffer_json_journal_versions(BUFFER *wb); +void available_journal_file_sources_to_json_array(BUFFER *wb); +bool journal_files_completed_once(void); +void journal_files_registry_update(void); +void journal_directory_scan_recursively(DICTIONARY *files, DICTIONARY *dirs, const char *dirname, int depth); + +FACET_ROW_SEVERITY 
syslog_priority_to_facet_severity(FACETS *facets, FACET_ROW *row, void *data); + +void netdata_systemd_journal_dynamic_row_id(FACETS *facets, BUFFER *json_array, FACET_ROW_KEY_VALUE *rkv, FACET_ROW *row, void *data); +void netdata_systemd_journal_transform_priority(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_syslog_facility(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_errno(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_boot_id(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_uid(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_gid(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_cap_effective(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); +void netdata_systemd_journal_transform_timestamp_usec(FACETS *facets, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope, void *data); + +usec_t journal_file_update_annotation_boot_id(sd_journal *j, struct journal_file *jf, const char *boot_id); + +#define MAX_JOURNAL_DIRECTORIES 100 +struct journal_directory { + char *path; +}; +extern struct journal_directory journal_directories[MAX_JOURNAL_DIRECTORIES]; + +void journal_init_files_and_directories(void); +void journal_init_query_status(void); +void function_systemd_journal(const char *transaction, char *function, int timeout, bool *cancelled); +void journal_file_update_header(const char *filename, struct journal_file *jf); + +void netdata_systemd_journal_message_ids_init(void); +void netdata_systemd_journal_transform_message_id(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused); + 
+void *journal_watcher_main(void *arg); + +#ifdef ENABLE_SYSTEMD_DBUS +void function_systemd_units(const char *transaction, char *function, int timeout, bool *cancelled); +#endif + +static inline void send_newline_and_flush(void) { + netdata_mutex_lock(&stdout_mutex); + fprintf(stdout, "\n"); + fflush(stdout); + netdata_mutex_unlock(&stdout_mutex); +} + +static inline bool parse_journal_field(const char *data, size_t data_length, const char **key, size_t *key_length, const char **value, size_t *value_length) { + const char *k = data; + const char *equal = strchr(k, '='); + if(unlikely(!equal)) + return false; + + size_t kl = equal - k; + + const char *v = ++equal; + size_t vl = data_length - kl - 1; + + *key = k; + *key_length = kl; + *value = v; + *value_length = vl; + + return true; +} + +#endif //NETDATA_COLLECTORS_SYSTEMD_INTERNALS_H diff --git a/collectors/systemd-journal.plugin/systemd-journal-annotations.c b/collectors/systemd-journal.plugin/systemd-journal-annotations.c new file mode 100644 index 000000000..b12356110 --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-journal-annotations.c @@ -0,0 +1,719 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" + +const char *errno_map[] = { + [1] = "1 (EPERM)", // "Operation not permitted", + [2] = "2 (ENOENT)", // "No such file or directory", + [3] = "3 (ESRCH)", // "No such process", + [4] = "4 (EINTR)", // "Interrupted system call", + [5] = "5 (EIO)", // "Input/output error", + [6] = "6 (ENXIO)", // "No such device or address", + [7] = "7 (E2BIG)", // "Argument list too long", + [8] = "8 (ENOEXEC)", // "Exec format error", + [9] = "9 (EBADF)", // "Bad file descriptor", + [10] = "10 (ECHILD)", // "No child processes", + [11] = "11 (EAGAIN)", // "Resource temporarily unavailable", + [12] = "12 (ENOMEM)", // "Cannot allocate memory", + [13] = "13 (EACCES)", // "Permission denied", + [14] = "14 (EFAULT)", // "Bad address", + [15] = "15 (ENOTBLK)", // "Block device 
required", + [16] = "16 (EBUSY)", // "Device or resource busy", + [17] = "17 (EEXIST)", // "File exists", + [18] = "18 (EXDEV)", // "Invalid cross-device link", + [19] = "19 (ENODEV)", // "No such device", + [20] = "20 (ENOTDIR)", // "Not a directory", + [21] = "21 (EISDIR)", // "Is a directory", + [22] = "22 (EINVAL)", // "Invalid argument", + [23] = "23 (ENFILE)", // "Too many open files in system", + [24] = "24 (EMFILE)", // "Too many open files", + [25] = "25 (ENOTTY)", // "Inappropriate ioctl for device", + [26] = "26 (ETXTBSY)", // "Text file busy", + [27] = "27 (EFBIG)", // "File too large", + [28] = "28 (ENOSPC)", // "No space left on device", + [29] = "29 (ESPIPE)", // "Illegal seek", + [30] = "30 (EROFS)", // "Read-only file system", + [31] = "31 (EMLINK)", // "Too many links", + [32] = "32 (EPIPE)", // "Broken pipe", + [33] = "33 (EDOM)", // "Numerical argument out of domain", + [34] = "34 (ERANGE)", // "Numerical result out of range", + [35] = "35 (EDEADLK)", // "Resource deadlock avoided", + [36] = "36 (ENAMETOOLONG)", // "File name too long", + [37] = "37 (ENOLCK)", // "No locks available", + [38] = "38 (ENOSYS)", // "Function not implemented", + [39] = "39 (ENOTEMPTY)", // "Directory not empty", + [40] = "40 (ELOOP)", // "Too many levels of symbolic links", + [42] = "42 (ENOMSG)", // "No message of desired type", + [43] = "43 (EIDRM)", // "Identifier removed", + [44] = "44 (ECHRNG)", // "Channel number out of range", + [45] = "45 (EL2NSYNC)", // "Level 2 not synchronized", + [46] = "46 (EL3HLT)", // "Level 3 halted", + [47] = "47 (EL3RST)", // "Level 3 reset", + [48] = "48 (ELNRNG)", // "Link number out of range", + [49] = "49 (EUNATCH)", // "Protocol driver not attached", + [50] = "50 (ENOCSI)", // "No CSI structure available", + [51] = "51 (EL2HLT)", // "Level 2 halted", + [52] = "52 (EBADE)", // "Invalid exchange", + [53] = "53 (EBADR)", // "Invalid request descriptor", + [54] = "54 (EXFULL)", // "Exchange full", + [55] = "55 (ENOANO)", // "No 
anode", + [56] = "56 (EBADRQC)", // "Invalid request code", + [57] = "57 (EBADSLT)", // "Invalid slot", + [59] = "59 (EBFONT)", // "Bad font file format", + [60] = "60 (ENOSTR)", // "Device not a stream", + [61] = "61 (ENODATA)", // "No data available", + [62] = "62 (ETIME)", // "Timer expired", + [63] = "63 (ENOSR)", // "Out of streams resources", + [64] = "64 (ENONET)", // "Machine is not on the network", + [65] = "65 (ENOPKG)", // "Package not installed", + [66] = "66 (EREMOTE)", // "Object is remote", + [67] = "67 (ENOLINK)", // "Link has been severed", + [68] = "68 (EADV)", // "Advertise error", + [69] = "69 (ESRMNT)", // "Srmount error", + [70] = "70 (ECOMM)", // "Communication error on send", + [71] = "71 (EPROTO)", // "Protocol error", + [72] = "72 (EMULTIHOP)", // "Multihop attempted", + [73] = "73 (EDOTDOT)", // "RFS specific error", + [74] = "74 (EBADMSG)", // "Bad message", + [75] = "75 (EOVERFLOW)", // "Value too large for defined data type", + [76] = "76 (ENOTUNIQ)", // "Name not unique on network", + [77] = "77 (EBADFD)", // "File descriptor in bad state", + [78] = "78 (EREMCHG)", // "Remote address changed", + [79] = "79 (ELIBACC)", // "Can not access a needed shared library", + [80] = "80 (ELIBBAD)", // "Accessing a corrupted shared library", + [81] = "81 (ELIBSCN)", // ".lib section in a.out corrupted", + [82] = "82 (ELIBMAX)", // "Attempting to link in too many shared libraries", + [83] = "83 (ELIBEXEC)", // "Cannot exec a shared library directly", + [84] = "84 (EILSEQ)", // "Invalid or incomplete multibyte or wide character", + [85] = "85 (ERESTART)", // "Interrupted system call should be restarted", + [86] = "86 (ESTRPIPE)", // "Streams pipe error", + [87] = "87 (EUSERS)", // "Too many users", + [88] = "88 (ENOTSOCK)", // "Socket operation on non-socket", + [89] = "89 (EDESTADDRREQ)", // "Destination address required", + [90] = "90 (EMSGSIZE)", // "Message too long", + [91] = "91 (EPROTOTYPE)", // "Protocol wrong type for socket", + [92] = "92 
(ENOPROTOOPT)", // "Protocol not available", + [93] = "93 (EPROTONOSUPPORT)", // "Protocol not supported", + [94] = "94 (ESOCKTNOSUPPORT)", // "Socket type not supported", + [95] = "95 (ENOTSUP)", // "Operation not supported", + [96] = "96 (EPFNOSUPPORT)", // "Protocol family not supported", + [97] = "97 (EAFNOSUPPORT)", // "Address family not supported by protocol", + [98] = "98 (EADDRINUSE)", // "Address already in use", + [99] = "99 (EADDRNOTAVAIL)", // "Cannot assign requested address", + [100] = "100 (ENETDOWN)", // "Network is down", + [101] = "101 (ENETUNREACH)", // "Network is unreachable", + [102] = "102 (ENETRESET)", // "Network dropped connection on reset", + [103] = "103 (ECONNABORTED)", // "Software caused connection abort", + [104] = "104 (ECONNRESET)", // "Connection reset by peer", + [105] = "105 (ENOBUFS)", // "No buffer space available", + [106] = "106 (EISCONN)", // "Transport endpoint is already connected", + [107] = "107 (ENOTCONN)", // "Transport endpoint is not connected", + [108] = "108 (ESHUTDOWN)", // "Cannot send after transport endpoint shutdown", + [109] = "109 (ETOOMANYREFS)", // "Too many references: cannot splice", + [110] = "110 (ETIMEDOUT)", // "Connection timed out", + [111] = "111 (ECONNREFUSED)", // "Connection refused", + [112] = "112 (EHOSTDOWN)", // "Host is down", + [113] = "113 (EHOSTUNREACH)", // "No route to host", + [114] = "114 (EALREADY)", // "Operation already in progress", + [115] = "115 (EINPROGRESS)", // "Operation now in progress", + [116] = "116 (ESTALE)", // "Stale file handle", + [117] = "117 (EUCLEAN)", // "Structure needs cleaning", + [118] = "118 (ENOTNAM)", // "Not a XENIX named type file", + [119] = "119 (ENAVAIL)", // "No XENIX semaphores available", + [120] = "120 (EISNAM)", // "Is a named type file", + [121] = "121 (EREMOTEIO)", // "Remote I/O error", + [122] = "122 (EDQUOT)", // "Disk quota exceeded", + [123] = "123 (ENOMEDIUM)", // "No medium found", + [124] = "124 (EMEDIUMTYPE)", // "Wrong medium 
type", + [125] = "125 (ECANCELED)", // "Operation canceled", + [126] = "126 (ENOKEY)", // "Required key not available", + [127] = "127 (EKEYEXPIRED)", // "Key has expired", + [128] = "128 (EKEYREVOKED)", // "Key has been revoked", + [129] = "129 (EKEYREJECTED)", // "Key was rejected by service", + [130] = "130 (EOWNERDEAD)", // "Owner died", + [131] = "131 (ENOTRECOVERABLE)", // "State not recoverable", + [132] = "132 (ERFKILL)", // "Operation not possible due to RF-kill", + [133] = "133 (EHWPOISON)", // "Memory page has hardware error", +}; + +const char *linux_capabilities[] = { + [CAP_CHOWN] = "CHOWN", + [CAP_DAC_OVERRIDE] = "DAC_OVERRIDE", + [CAP_DAC_READ_SEARCH] = "DAC_READ_SEARCH", + [CAP_FOWNER] = "FOWNER", + [CAP_FSETID] = "FSETID", + [CAP_KILL] = "KILL", + [CAP_SETGID] = "SETGID", + [CAP_SETUID] = "SETUID", + [CAP_SETPCAP] = "SETPCAP", + [CAP_LINUX_IMMUTABLE] = "LINUX_IMMUTABLE", + [CAP_NET_BIND_SERVICE] = "NET_BIND_SERVICE", + [CAP_NET_BROADCAST] = "NET_BROADCAST", + [CAP_NET_ADMIN] = "NET_ADMIN", + [CAP_NET_RAW] = "NET_RAW", + [CAP_IPC_LOCK] = "IPC_LOCK", + [CAP_IPC_OWNER] = "IPC_OWNER", + [CAP_SYS_MODULE] = "SYS_MODULE", + [CAP_SYS_RAWIO] = "SYS_RAWIO", + [CAP_SYS_CHROOT] = "SYS_CHROOT", + [CAP_SYS_PTRACE] = "SYS_PTRACE", + [CAP_SYS_PACCT] = "SYS_PACCT", + [CAP_SYS_ADMIN] = "SYS_ADMIN", + [CAP_SYS_BOOT] = "SYS_BOOT", + [CAP_SYS_NICE] = "SYS_NICE", + [CAP_SYS_RESOURCE] = "SYS_RESOURCE", + [CAP_SYS_TIME] = "SYS_TIME", + [CAP_SYS_TTY_CONFIG] = "SYS_TTY_CONFIG", + [CAP_MKNOD] = "MKNOD", + [CAP_LEASE] = "LEASE", + [CAP_AUDIT_WRITE] = "AUDIT_WRITE", + [CAP_AUDIT_CONTROL] = "AUDIT_CONTROL", + [CAP_SETFCAP] = "SETFCAP", + [CAP_MAC_OVERRIDE] = "MAC_OVERRIDE", + [CAP_MAC_ADMIN] = "MAC_ADMIN", + [CAP_SYSLOG] = "SYSLOG", + [CAP_WAKE_ALARM] = "WAKE_ALARM", + [CAP_BLOCK_SUSPEND] = "BLOCK_SUSPEND", + [37 /*CAP_AUDIT_READ*/] = "AUDIT_READ", + [38 /*CAP_PERFMON*/] = "PERFMON", + [39 /*CAP_BPF*/] = "BPF", + [40 /* CAP_CHECKPOINT_RESTORE */] = "CHECKPOINT_RESTORE", +}; + 
+static const char *syslog_facility_to_name(int facility) { + switch (facility) { + case LOG_FAC(LOG_KERN): return "kern"; + case LOG_FAC(LOG_USER): return "user"; + case LOG_FAC(LOG_MAIL): return "mail"; + case LOG_FAC(LOG_DAEMON): return "daemon"; + case LOG_FAC(LOG_AUTH): return "auth"; + case LOG_FAC(LOG_SYSLOG): return "syslog"; + case LOG_FAC(LOG_LPR): return "lpr"; + case LOG_FAC(LOG_NEWS): return "news"; + case LOG_FAC(LOG_UUCP): return "uucp"; + case LOG_FAC(LOG_CRON): return "cron"; + case LOG_FAC(LOG_AUTHPRIV): return "authpriv"; + case LOG_FAC(LOG_FTP): return "ftp"; + case LOG_FAC(LOG_LOCAL0): return "local0"; + case LOG_FAC(LOG_LOCAL1): return "local1"; + case LOG_FAC(LOG_LOCAL2): return "local2"; + case LOG_FAC(LOG_LOCAL3): return "local3"; + case LOG_FAC(LOG_LOCAL4): return "local4"; + case LOG_FAC(LOG_LOCAL5): return "local5"; + case LOG_FAC(LOG_LOCAL6): return "local6"; + case LOG_FAC(LOG_LOCAL7): return "local7"; + default: return NULL; + } +} + +static const char *syslog_priority_to_name(int priority) { + switch (priority) { + case LOG_ALERT: return "alert"; + case LOG_CRIT: return "critical"; + case LOG_DEBUG: return "debug"; + case LOG_EMERG: return "panic"; + case LOG_ERR: return "error"; + case LOG_INFO: return "info"; + case LOG_NOTICE: return "notice"; + case LOG_WARNING: return "warning"; + default: return NULL; + } +} + +FACET_ROW_SEVERITY syslog_priority_to_facet_severity(FACETS *facets __maybe_unused, FACET_ROW *row, void *data __maybe_unused) { + // same as + // https://github.com/systemd/systemd/blob/aab9e4b2b86905a15944a1ac81e471b5b7075932/src/basic/terminal-util.c#L1501 + // function get_log_colors() + + FACET_ROW_KEY_VALUE *priority_rkv = dictionary_get(row->dict, "PRIORITY"); + if(!priority_rkv || priority_rkv->empty) + return FACET_ROW_SEVERITY_NORMAL; + + int priority = str2i(buffer_tostring(priority_rkv->wb)); + + if(priority <= LOG_ERR) + return FACET_ROW_SEVERITY_CRITICAL; + + else if (priority <= LOG_WARNING) + return 
FACET_ROW_SEVERITY_WARNING; + + else if(priority <= LOG_NOTICE) + return FACET_ROW_SEVERITY_NOTICE; + + else if(priority >= LOG_DEBUG) + return FACET_ROW_SEVERITY_DEBUG; + + return FACET_ROW_SEVERITY_NORMAL; +} + +static char *uid_to_username(uid_t uid, char *buffer, size_t buffer_size) { + static __thread char tmp[1024 + 1]; + struct passwd pw, *result = NULL; + + if (getpwuid_r(uid, &pw, tmp, sizeof(tmp), &result) != 0 || !result || !pw.pw_name || !(*pw.pw_name)) + snprintfz(buffer, buffer_size - 1, "%u", uid); + else + snprintfz(buffer, buffer_size - 1, "%u (%s)", uid, pw.pw_name); + + return buffer; +} + +static char *gid_to_groupname(gid_t gid, char* buffer, size_t buffer_size) { + static __thread char tmp[1024]; + struct group grp, *result = NULL; + + if (getgrgid_r(gid, &grp, tmp, sizeof(tmp), &result) != 0 || !result || !grp.gr_name || !(*grp.gr_name)) + snprintfz(buffer, buffer_size - 1, "%u", gid); + else + snprintfz(buffer, buffer_size - 1, "%u (%s)", gid, grp.gr_name); + + return buffer; +} + +void netdata_systemd_journal_transform_syslog_facility(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + int facility = str2i(buffer_tostring(wb)); + const char *name = syslog_facility_to_name(facility); + if (name) { + buffer_flush(wb); + buffer_strcat(wb, name); + } + } +} + +void netdata_systemd_journal_transform_priority(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + int priority = str2i(buffer_tostring(wb)); + const char *name = syslog_priority_to_name(priority); + if (name) { + buffer_flush(wb); + buffer_strcat(wb, name); + } + } +} + +void netdata_systemd_journal_transform_errno(FACETS *facets __maybe_unused, BUFFER *wb, 
FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + unsigned err_no = str2u(buffer_tostring(wb)); + if(err_no > 0 && err_no < sizeof(errno_map) / sizeof(*errno_map)) { + const char *name = errno_map[err_no]; + if(name) { + buffer_flush(wb); + buffer_strcat(wb, name); + } + } + } +} + +// ---------------------------------------------------------------------------- +// UID and GID transformation + +#define UID_GID_HASHTABLE_SIZE 10000 + +struct word_t2str_hashtable_entry { + struct word_t2str_hashtable_entry *next; + Word_t hash; + size_t len; + char str[]; +}; + +struct word_t2str_hashtable { + SPINLOCK spinlock; + size_t size; + struct word_t2str_hashtable_entry *hashtable[UID_GID_HASHTABLE_SIZE]; +}; + +struct word_t2str_hashtable uid_hashtable = { + .size = UID_GID_HASHTABLE_SIZE, +}; + +struct word_t2str_hashtable gid_hashtable = { + .size = UID_GID_HASHTABLE_SIZE, +}; + +struct word_t2str_hashtable_entry **word_t2str_hashtable_slot(struct word_t2str_hashtable *ht, Word_t hash) { + size_t slot = hash % ht->size; + struct word_t2str_hashtable_entry **e = &ht->hashtable[slot]; + + while(*e && (*e)->hash != hash) + e = &((*e)->next); + + return e; +} + +const char *uid_to_username_cached(uid_t uid, size_t *length) { + spinlock_lock(&uid_hashtable.spinlock); + + struct word_t2str_hashtable_entry **e = word_t2str_hashtable_slot(&uid_hashtable, uid); + if(!(*e)) { + static __thread char buf[1024]; + const char *name = uid_to_username(uid, buf, sizeof(buf)); + size_t size = strlen(name) + 1; + + *e = callocz(1, sizeof(struct word_t2str_hashtable_entry) + size); + (*e)->len = size - 1; + (*e)->hash = uid; + memcpy((*e)->str, name, size); + } + + spinlock_unlock(&uid_hashtable.spinlock); + + *length = (*e)->len; + return (*e)->str; +} + +const char *gid_to_groupname_cached(gid_t gid, size_t *length) { + 
spinlock_lock(&gid_hashtable.spinlock); + + struct word_t2str_hashtable_entry **e = word_t2str_hashtable_slot(&gid_hashtable, gid); + if(!(*e)) { + static __thread char buf[1024]; + const char *name = gid_to_groupname(gid, buf, sizeof(buf)); + size_t size = strlen(name) + 1; + + *e = callocz(1, sizeof(struct word_t2str_hashtable_entry) + size); + (*e)->len = size - 1; + (*e)->hash = gid; + memcpy((*e)->str, name, size); + } + + spinlock_unlock(&gid_hashtable.spinlock); + + *length = (*e)->len; + return (*e)->str; +} + +DICTIONARY *boot_ids_to_first_ut = NULL; + +void netdata_systemd_journal_transform_boot_id(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + const char *boot_id = buffer_tostring(wb); + if(*boot_id && isxdigit(*boot_id)) { + usec_t ut = UINT64_MAX; + usec_t *p_ut = dictionary_get(boot_ids_to_first_ut, boot_id); + if(!p_ut) { +#ifndef HAVE_SD_JOURNAL_RESTART_FIELDS + struct journal_file *jf; + dfe_start_read(journal_files_registry, jf) { + const char *files[2] = { + [0] = jf_dfe.name, + [1] = NULL, + }; + + sd_journal *j = NULL; + int r = sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS); + if(r < 0 || !j) { + internal_error(true, "JOURNAL: while looking for the first timestamp of boot_id '%s', " + "sd_journal_open_files('%s') returned %d", + boot_id, jf_dfe.name, r); + continue; + } + + ut = journal_file_update_annotation_boot_id(j, jf, boot_id); + sd_journal_close(j); + } + dfe_done(jf); +#endif + } + else + ut = *p_ut; + + if(ut && ut != UINT64_MAX) { + char buffer[RFC3339_MAX_LENGTH]; + rfc3339_datetime_ut(buffer, sizeof(buffer), ut, 0, true); + + switch(scope) { + default: + case FACETS_TRANSFORM_DATA: + case FACETS_TRANSFORM_VALUE: + buffer_sprintf(wb, " (%s) ", buffer); + break; + + case FACETS_TRANSFORM_FACET: + case FACETS_TRANSFORM_FACET_SORT: + case FACETS_TRANSFORM_HISTOGRAM: + buffer_flush(wb); + buffer_sprintf(wb, "%s", buffer); + break; + } + } + } +} + 
+void netdata_systemd_journal_transform_uid(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + uid_t uid = str2i(buffer_tostring(wb)); + size_t len; + const char *name = uid_to_username_cached(uid, &len); + buffer_contents_replace(wb, name, len); + } +} + +void netdata_systemd_journal_transform_gid(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + gid_t gid = str2i(buffer_tostring(wb)); + size_t len; + const char *name = gid_to_groupname_cached(gid, &len); + buffer_contents_replace(wb, name, len); + } +} + +void netdata_systemd_journal_transform_cap_effective(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + uint64_t cap = strtoul(buffer_tostring(wb), NULL, 16); + if(cap) { + buffer_fast_strcat(wb, " (", 2); + for (size_t i = 0, added = 0; i < sizeof(linux_capabilities) / sizeof(linux_capabilities[0]); i++) { + if (linux_capabilities[i] && (cap & (1ULL << i))) { + + if (added) + buffer_fast_strcat(wb, " | ", 3); + + buffer_strcat(wb, linux_capabilities[i]); + added++; + } + } + buffer_fast_strcat(wb, ")", 1); + } + } +} + +void netdata_systemd_journal_transform_timestamp_usec(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + if(scope == FACETS_TRANSFORM_FACET_SORT) + return; + + const char *v = buffer_tostring(wb); + if(*v && isdigit(*v)) { + uint64_t ut = str2ull(buffer_tostring(wb), NULL); + if(ut) { + char 
buffer[RFC3339_MAX_LENGTH]; + rfc3339_datetime_ut(buffer, sizeof(buffer), ut, 6, true); + buffer_sprintf(wb, " (%s)", buffer); + } + } +} + +// ---------------------------------------------------------------------------- + +void netdata_systemd_journal_dynamic_row_id(FACETS *facets __maybe_unused, BUFFER *json_array, FACET_ROW_KEY_VALUE *rkv, FACET_ROW *row, void *data __maybe_unused) { + FACET_ROW_KEY_VALUE *pid_rkv = dictionary_get(row->dict, "_PID"); + const char *pid = pid_rkv ? buffer_tostring(pid_rkv->wb) : FACET_VALUE_UNSET; + + const char *identifier = NULL; + FACET_ROW_KEY_VALUE *container_name_rkv = dictionary_get(row->dict, "CONTAINER_NAME"); + if(container_name_rkv && !container_name_rkv->empty) + identifier = buffer_tostring(container_name_rkv->wb); + + if(!identifier) { + FACET_ROW_KEY_VALUE *syslog_identifier_rkv = dictionary_get(row->dict, "SYSLOG_IDENTIFIER"); + if(syslog_identifier_rkv && !syslog_identifier_rkv->empty) + identifier = buffer_tostring(syslog_identifier_rkv->wb); + + if(!identifier) { + FACET_ROW_KEY_VALUE *comm_rkv = dictionary_get(row->dict, "_COMM"); + if(comm_rkv && !comm_rkv->empty) + identifier = buffer_tostring(comm_rkv->wb); + } + } + + buffer_flush(rkv->wb); + + if(!identifier || !*identifier) + buffer_strcat(rkv->wb, FACET_VALUE_UNSET); + else if(!pid || !*pid) + buffer_sprintf(rkv->wb, "%s", identifier); + else + buffer_sprintf(rkv->wb, "%s[%s]", identifier, pid); + + buffer_json_add_array_item_string(json_array, buffer_tostring(rkv->wb)); +} + + +// ---------------------------------------------------------------------------- + +struct message_id_info { + const char *msg; +}; + +static DICTIONARY *known_journal_messages_ids = NULL; + +void netdata_systemd_journal_message_ids_init(void) { + known_journal_messages_ids = dictionary_create(DICT_OPTION_DONT_OVERWRITE_VALUE); + + struct message_id_info i = { 0 }; + i.msg = "Journal start"; dictionary_set(known_journal_messages_ids, "f77379a8490b408bbe5f6940505a777b", &i, 
sizeof(i)); + i.msg = "Journal stop"; dictionary_set(known_journal_messages_ids, "d93fb3c9c24d451a97cea615ce59c00b", &i, sizeof(i)); + i.msg = "Journal dropped"; dictionary_set(known_journal_messages_ids, "a596d6fe7bfa4994828e72309e95d61e", &i, sizeof(i)); + i.msg = "Journal missed"; dictionary_set(known_journal_messages_ids, "e9bf28e6e834481bb6f48f548ad13606", &i, sizeof(i)); + i.msg = "Journal usage"; dictionary_set(known_journal_messages_ids, "ec387f577b844b8fa948f33cad9a75e6", &i, sizeof(i)); + i.msg = "Coredump"; dictionary_set(known_journal_messages_ids, "fc2e22bc6ee647b6b90729ab34a250b1", &i, sizeof(i)); + i.msg = "Truncated core"; dictionary_set(known_journal_messages_ids, "5aadd8e954dc4b1a8c954d63fd9e1137", &i, sizeof(i)); + i.msg = "Backtrace"; dictionary_set(known_journal_messages_ids, "1f4e0a44a88649939aaea34fc6da8c95", &i, sizeof(i)); + i.msg = "Session start"; dictionary_set(known_journal_messages_ids, "8d45620c1a4348dbb17410da57c60c66", &i, sizeof(i)); + i.msg = "Session stop"; dictionary_set(known_journal_messages_ids, "3354939424b4456d9802ca8333ed424a", &i, sizeof(i)); + i.msg = "Seat start"; dictionary_set(known_journal_messages_ids, "fcbefc5da23d428093f97c82a9290f7b", &i, sizeof(i)); + i.msg = "Seat stop"; dictionary_set(known_journal_messages_ids, "e7852bfe46784ed0accde04bc864c2d5", &i, sizeof(i)); + i.msg = "Machine start"; dictionary_set(known_journal_messages_ids, "24d8d4452573402496068381a6312df2", &i, sizeof(i)); + i.msg = "Machine stop"; dictionary_set(known_journal_messages_ids, "58432bd3bace477cb514b56381b8a758", &i, sizeof(i)); + i.msg = "Time change"; dictionary_set(known_journal_messages_ids, "c7a787079b354eaaa9e77b371893cd27", &i, sizeof(i)); + i.msg = "Timezone change"; dictionary_set(known_journal_messages_ids, "45f82f4aef7a4bbf942ce861d1f20990", &i, sizeof(i)); + i.msg = "Tainted"; dictionary_set(known_journal_messages_ids, "50876a9db00f4c40bde1a2ad381c3a1b", &i, sizeof(i)); + i.msg = "Startup finished"; 
dictionary_set(known_journal_messages_ids, "b07a249cd024414a82dd00cd181378ff", &i, sizeof(i)); + i.msg = "User startup finished"; dictionary_set(known_journal_messages_ids, "eed00a68ffd84e31882105fd973abdd1", &i, sizeof(i)); + i.msg = "Sleep start"; dictionary_set(known_journal_messages_ids, "6bbd95ee977941e497c48be27c254128", &i, sizeof(i)); + i.msg = "Sleep stop"; dictionary_set(known_journal_messages_ids, "8811e6df2a8e40f58a94cea26f8ebf14", &i, sizeof(i)); + i.msg = "Shutdown"; dictionary_set(known_journal_messages_ids, "98268866d1d54a499c4e98921d93bc40", &i, sizeof(i)); + i.msg = "Factory reset"; dictionary_set(known_journal_messages_ids, "c14aaf76ec284a5fa1f105f88dfb061c", &i, sizeof(i)); + i.msg = "Crash exit"; dictionary_set(known_journal_messages_ids, "d9ec5e95e4b646aaaea2fd05214edbda", &i, sizeof(i)); + i.msg = "Crash failed"; dictionary_set(known_journal_messages_ids, "3ed0163e868a4417ab8b9e210407a96c", &i, sizeof(i)); + i.msg = "Crash freeze"; dictionary_set(known_journal_messages_ids, "645c735537634ae0a32b15a7c6cba7d4", &i, sizeof(i)); + i.msg = "Crash no coredump"; dictionary_set(known_journal_messages_ids, "5addb3a06a734d3396b794bf98fb2d01", &i, sizeof(i)); + i.msg = "Crash no fork"; dictionary_set(known_journal_messages_ids, "5c9e98de4ab94c6a9d04d0ad793bd903", &i, sizeof(i)); + i.msg = "Crash unknown signal"; dictionary_set(known_journal_messages_ids, "5e6f1f5e4db64a0eaee3368249d20b94", &i, sizeof(i)); + i.msg = "Crash systemd signal"; dictionary_set(known_journal_messages_ids, "83f84b35ee264f74a3896a9717af34cb", &i, sizeof(i)); + i.msg = "Crash process signal"; dictionary_set(known_journal_messages_ids, "3a73a98baf5b4b199929e3226c0be783", &i, sizeof(i)); + i.msg = "Crash waitpid failed"; dictionary_set(known_journal_messages_ids, "2ed18d4f78ca47f0a9bc25271c26adb4", &i, sizeof(i)); + i.msg = "Crash coredump failed"; dictionary_set(known_journal_messages_ids, "56b1cd96f24246c5b607666fda952356", &i, sizeof(i)); + i.msg = "Crash coredump pid"; 
dictionary_set(known_journal_messages_ids, "4ac7566d4d7548f4981f629a28f0f829", &i, sizeof(i)); + i.msg = "Crash shell fork failed"; dictionary_set(known_journal_messages_ids, "38e8b1e039ad469291b18b44c553a5b7", &i, sizeof(i)); + i.msg = "Crash execle failed"; dictionary_set(known_journal_messages_ids, "872729b47dbe473eb768ccecd477beda", &i, sizeof(i)); + i.msg = "Selinux failed"; dictionary_set(known_journal_messages_ids, "658a67adc1c940b3b3316e7e8628834a", &i, sizeof(i)); + i.msg = "Battery low warning"; dictionary_set(known_journal_messages_ids, "e6f456bd92004d9580160b2207555186", &i, sizeof(i)); + i.msg = "Battery low poweroff"; dictionary_set(known_journal_messages_ids, "267437d33fdd41099ad76221cc24a335", &i, sizeof(i)); + i.msg = "Core mainloop failed"; dictionary_set(known_journal_messages_ids, "79e05b67bc4545d1922fe47107ee60c5", &i, sizeof(i)); + i.msg = "Core no xdgdir path"; dictionary_set(known_journal_messages_ids, "dbb136b10ef4457ba47a795d62f108c9", &i, sizeof(i)); + i.msg = "Core capability bounding user"; dictionary_set(known_journal_messages_ids, "ed158c2df8884fa584eead2d902c1032", &i, sizeof(i)); + i.msg = "Core capability bounding"; dictionary_set(known_journal_messages_ids, "42695b500df048298bee37159caa9f2e", &i, sizeof(i)); + i.msg = "Core disable privileges"; dictionary_set(known_journal_messages_ids, "bfc2430724ab44499735b4f94cca9295", &i, sizeof(i)); + i.msg = "Core start target failed"; dictionary_set(known_journal_messages_ids, "59288af523be43a28d494e41e26e4510", &i, sizeof(i)); + i.msg = "Core isolate target failed"; dictionary_set(known_journal_messages_ids, "689b4fcc97b4486ea5da92db69c9e314", &i, sizeof(i)); + i.msg = "Core fd set failed"; dictionary_set(known_journal_messages_ids, "5ed836f1766f4a8a9fc5da45aae23b29", &i, sizeof(i)); + i.msg = "Core pid1 environment"; dictionary_set(known_journal_messages_ids, "6a40fbfbd2ba4b8db02fb40c9cd090d7", &i, sizeof(i)); + i.msg = "Core manager allocate"; dictionary_set(known_journal_messages_ids, 
"0e54470984ac419689743d957a119e2e", &i, sizeof(i)); + i.msg = "Smack failed write"; dictionary_set(known_journal_messages_ids, "d67fa9f847aa4b048a2ae33535331adb", &i, sizeof(i)); + i.msg = "Shutdown error"; dictionary_set(known_journal_messages_ids, "af55a6f75b544431b72649f36ff6d62c", &i, sizeof(i)); + i.msg = "Valgrind helper fork"; dictionary_set(known_journal_messages_ids, "d18e0339efb24a068d9c1060221048c2", &i, sizeof(i)); + i.msg = "Unit starting"; dictionary_set(known_journal_messages_ids, "7d4958e842da4a758f6c1cdc7b36dcc5", &i, sizeof(i)); + i.msg = "Unit started"; dictionary_set(known_journal_messages_ids, "39f53479d3a045ac8e11786248231fbf", &i, sizeof(i)); + i.msg = "Unit failed"; dictionary_set(known_journal_messages_ids, "be02cf6855d2428ba40df7e9d022f03d", &i, sizeof(i)); + i.msg = "Unit stopping"; dictionary_set(known_journal_messages_ids, "de5b426a63be47a7b6ac3eaac82e2f6f", &i, sizeof(i)); + i.msg = "Unit stopped"; dictionary_set(known_journal_messages_ids, "9d1aaa27d60140bd96365438aad20286", &i, sizeof(i)); + i.msg = "Unit reloading"; dictionary_set(known_journal_messages_ids, "d34d037fff1847e6ae669a370e694725", &i, sizeof(i)); + i.msg = "Unit reloaded"; dictionary_set(known_journal_messages_ids, "7b05ebc668384222baa8881179cfda54", &i, sizeof(i)); + i.msg = "Unit restart scheduled"; dictionary_set(known_journal_messages_ids, "5eb03494b6584870a536b337290809b3", &i, sizeof(i)); + i.msg = "Unit resources"; dictionary_set(known_journal_messages_ids, "ae8f7b866b0347b9af31fe1c80b127c0", &i, sizeof(i)); + i.msg = "Unit success"; dictionary_set(known_journal_messages_ids, "7ad2d189f7e94e70a38c781354912448", &i, sizeof(i)); + i.msg = "Unit skipped"; dictionary_set(known_journal_messages_ids, "0e4284a0caca4bfc81c0bb6786972673", &i, sizeof(i)); + i.msg = "Unit failure result"; dictionary_set(known_journal_messages_ids, "d9b373ed55a64feb8242e02dbe79a49c", &i, sizeof(i)); + i.msg = "Spawn failed"; dictionary_set(known_journal_messages_ids, 
"641257651c1b4ec9a8624d7a40a9e1e7", &i, sizeof(i)); + i.msg = "Unit process exit"; dictionary_set(known_journal_messages_ids, "98e322203f7a4ed290d09fe03c09fe15", &i, sizeof(i)); + i.msg = "Forward syslog missed"; dictionary_set(known_journal_messages_ids, "0027229ca0644181a76c4e92458afa2e", &i, sizeof(i)); + i.msg = "Overmounting"; dictionary_set(known_journal_messages_ids, "1dee0369c7fc4736b7099b38ecb46ee7", &i, sizeof(i)); + i.msg = "Unit oomd kill"; dictionary_set(known_journal_messages_ids, "d989611b15e44c9dbf31e3c81256e4ed", &i, sizeof(i)); + i.msg = "Unit out of memory"; dictionary_set(known_journal_messages_ids, "fe6faa94e7774663a0da52717891d8ef", &i, sizeof(i)); + i.msg = "Lid opened"; dictionary_set(known_journal_messages_ids, "b72ea4a2881545a0b50e200e55b9b06f", &i, sizeof(i)); + i.msg = "Lid closed"; dictionary_set(known_journal_messages_ids, "b72ea4a2881545a0b50e200e55b9b070", &i, sizeof(i)); + i.msg = "System docked"; dictionary_set(known_journal_messages_ids, "f5f416b862074b28927a48c3ba7d51ff", &i, sizeof(i)); + i.msg = "System undocked"; dictionary_set(known_journal_messages_ids, "51e171bd585248568110144c517cca53", &i, sizeof(i)); + i.msg = "Power key"; dictionary_set(known_journal_messages_ids, "b72ea4a2881545a0b50e200e55b9b071", &i, sizeof(i)); + i.msg = "Power key long press"; dictionary_set(known_journal_messages_ids, "3e0117101eb243c1b9a50db3494ab10b", &i, sizeof(i)); + i.msg = "Reboot key"; dictionary_set(known_journal_messages_ids, "9fa9d2c012134ec385451ffe316f97d0", &i, sizeof(i)); + i.msg = "Reboot key long press"; dictionary_set(known_journal_messages_ids, "f1c59a58c9d943668965c337caec5975", &i, sizeof(i)); + i.msg = "Suspend key"; dictionary_set(known_journal_messages_ids, "b72ea4a2881545a0b50e200e55b9b072", &i, sizeof(i)); + i.msg = "Suspend key long press"; dictionary_set(known_journal_messages_ids, "bfdaf6d312ab4007bc1fe40a15df78e8", &i, sizeof(i)); + i.msg = "Hibernate key"; dictionary_set(known_journal_messages_ids, 
"b72ea4a2881545a0b50e200e55b9b073", &i, sizeof(i)); + i.msg = "Hibernate key long press"; dictionary_set(known_journal_messages_ids, "167836df6f7f428e98147227b2dc8945", &i, sizeof(i)); + i.msg = "Invalid configuration"; dictionary_set(known_journal_messages_ids, "c772d24e9a884cbeb9ea12625c306c01", &i, sizeof(i)); + i.msg = "Dnssec failure"; dictionary_set(known_journal_messages_ids, "1675d7f172174098b1108bf8c7dc8f5d", &i, sizeof(i)); + i.msg = "Dnssec trust anchor revoked"; dictionary_set(known_journal_messages_ids, "4d4408cfd0d144859184d1e65d7c8a65", &i, sizeof(i)); + i.msg = "Dnssec downgrade"; dictionary_set(known_journal_messages_ids, "36db2dfa5a9045e1bd4af5f93e1cf057", &i, sizeof(i)); + i.msg = "Unsafe user name"; dictionary_set(known_journal_messages_ids, "b61fdac612e94b9182285b998843061f", &i, sizeof(i)); + i.msg = "Mount point path not suitable"; dictionary_set(known_journal_messages_ids, "1b3bb94037f04bbf81028e135a12d293", &i, sizeof(i)); + i.msg = "Device path not suitable"; dictionary_set(known_journal_messages_ids, "010190138f494e29a0ef6669749531aa", &i, sizeof(i)); + i.msg = "Nobody user unsuitable"; dictionary_set(known_journal_messages_ids, "b480325f9c394a7b802c231e51a2752c", &i, sizeof(i)); + i.msg = "Systemd udev settle deprecated"; dictionary_set(known_journal_messages_ids, "1c0454c1bd2241e0ac6fefb4bc631433", &i, sizeof(i)); + i.msg = "Time sync"; dictionary_set(known_journal_messages_ids, "7c8a41f37b764941a0e1780b1be2f037", &i, sizeof(i)); + i.msg = "Time bump"; dictionary_set(known_journal_messages_ids, "7db73c8af0d94eeb822ae04323fe6ab6", &i, sizeof(i)); + i.msg = "Shutdown scheduled"; dictionary_set(known_journal_messages_ids, "9e7066279dc8403da79ce4b1a69064b2", &i, sizeof(i)); + i.msg = "Shutdown canceled"; dictionary_set(known_journal_messages_ids, "249f6fb9e6e2428c96f3f0875681ffa3", &i, sizeof(i)); + i.msg = "TPM pcr extend"; dictionary_set(known_journal_messages_ids, "3f7d5ef3e54f4302b4f0b143bb270cab", &i, sizeof(i)); + i.msg = "Memory 
trim"; dictionary_set(known_journal_messages_ids, "f9b0be465ad540d0850ad32172d57c21", &i, sizeof(i)); + i.msg = "Sysv generator deprecated"; dictionary_set(known_journal_messages_ids, "a8fa8dacdb1d443e9503b8be367a6adb", &i, sizeof(i)); + + // gnome + // https://gitlab.gnome.org/GNOME/gnome-session/-/blob/main/gnome-session/gsm-manager.c + i.msg = "Gnome SM startup succeeded"; dictionary_set(known_journal_messages_ids, "0ce153587afa4095832d233c17a88001", &i, sizeof(i)); + i.msg = "Gnome SM unrecoverable failure"; dictionary_set(known_journal_messages_ids, "10dd2dc188b54a5e98970f56499d1f73", &i, sizeof(i)); + + // gnome-shell + // https://gitlab.gnome.org/GNOME/gnome-shell/-/blob/main/js/ui/main.js#L56 + i.msg = "Gnome shell started";dictionary_set(known_journal_messages_ids, "f3ea493c22934e26811cd62abe8e203a", &i, sizeof(i)); + + // flathub + // https://docs.flatpak.org/de/latest/flatpak-command-reference.html + i.msg = "Flatpak cache"; dictionary_set(known_journal_messages_ids, "c7b39b1e006b464599465e105b361485", &i, sizeof(i)); + + // ??? + i.msg = "Flathub pulls"; dictionary_set(known_journal_messages_ids, "75ba3deb0af041a9a46272ff85d9e73e", &i, sizeof(i)); + i.msg = "Flathub pull errors"; dictionary_set(known_journal_messages_ids, "f02bce89a54e4efab3a94a797d26204a", &i, sizeof(i)); + + // ?? 
+ i.msg = "Boltd starting"; dictionary_set(known_journal_messages_ids, "dd11929c788e48bdbb6276fb5f26b08a", &i, sizeof(i)); + + // Netdata + i.msg = "Netdata connection from child"; dictionary_set(known_journal_messages_ids, "ed4cdb8f1beb4ad3b57cb3cae2d162fa", &i, sizeof(i)); + i.msg = "Netdata connection to parent"; dictionary_set(known_journal_messages_ids, "6e2e3839067648968b646045dbf28d66", &i, sizeof(i)); + i.msg = "Netdata alert transition"; dictionary_set(known_journal_messages_ids, "9ce0cb58ab8b44df82c4bf1ad9ee22de", &i, sizeof(i)); + i.msg = "Netdata alert notification"; dictionary_set(known_journal_messages_ids, "6db0018e83e34320ae2a659d78019fb7", &i, sizeof(i)); +} + +void netdata_systemd_journal_transform_message_id(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { + const char *message_id = buffer_tostring(wb); + struct message_id_info *i = dictionary_get(known_journal_messages_ids, message_id); + + if(!i) + return; + + switch(scope) { + default: + case FACETS_TRANSFORM_DATA: + case FACETS_TRANSFORM_VALUE: + buffer_sprintf(wb, " (%s)", i->msg); + break; + + case FACETS_TRANSFORM_FACET: + case FACETS_TRANSFORM_FACET_SORT: + case FACETS_TRANSFORM_HISTOGRAM: + buffer_flush(wb); + buffer_strcat(wb, i->msg); + break; + } +} + +// ---------------------------------------------------------------------------- + +static void netdata_systemd_journal_rich_message(FACETS *facets __maybe_unused, BUFFER *json_array, FACET_ROW_KEY_VALUE *rkv, FACET_ROW *row __maybe_unused, void *data __maybe_unused) { + buffer_json_add_array_item_object(json_array); + buffer_json_member_add_string(json_array, "value", buffer_tostring(rkv->wb)); + buffer_json_object_close(json_array); +} diff --git a/collectors/systemd-journal.plugin/systemd-journal-files.c b/collectors/systemd-journal.plugin/systemd-journal-files.c new file mode 100644 index 000000000..56496df22 --- /dev/null +++ 
b/collectors/systemd-journal.plugin/systemd-journal-files.c @@ -0,0 +1,857 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" + +#define SYSTEMD_JOURNAL_MAX_SOURCE_LEN 64 +#define VAR_LOG_JOURNAL_MAX_DEPTH 10 + +struct journal_directory journal_directories[MAX_JOURNAL_DIRECTORIES] = { 0 }; +DICTIONARY *journal_files_registry = NULL; +DICTIONARY *used_hashes_registry = NULL; + +static usec_t systemd_journal_session = 0; + +void buffer_json_journal_versions(BUFFER *wb) { + buffer_json_member_add_object(wb, "versions"); + { + buffer_json_member_add_uint64(wb, "sources", + systemd_journal_session + dictionary_version(journal_files_registry)); + } + buffer_json_object_close(wb); +} + +static bool journal_sd_id128_parse(const char *in, sd_id128_t *ret) { + while(isspace(*in)) + in++; + + char uuid[33]; + strncpyz(uuid, in, 32); + uuid[32] = '\0'; + + if(strlen(uuid) == 32) { + sd_id128_t read; + if(sd_id128_from_string(uuid, &read) == 0) { + *ret = read; + return true; + } + } + + return false; +} + +static void journal_file_get_header_from_journalctl(const char *filename, struct journal_file *jf) { + // unfortunately, our capabilities are not inherited by journalctl, + // so it fails to give us the information we need. 
+ + bool read_writer = false, read_head = false, read_tail = false; + + char cmd[FILENAME_MAX * 2]; + snprintfz(cmd, sizeof(cmd), "journalctl --header --file '%s'", filename); + CLEAN_BUFFER *wb = run_command_and_get_output_to_buffer(cmd, 1024); + if(wb) { + const char *s = buffer_tostring(wb); + + const char *sequential_id_header = "Sequential Number ID:"; + const char *sequential_id_data = strcasestr(s, sequential_id_header); + if(sequential_id_data) { + sequential_id_data += strlen(sequential_id_header); + if(journal_sd_id128_parse(sequential_id_data, &jf->first_writer_id)) + read_writer = true; + } + + const char *head_sequential_number_header = "Head sequential number:"; + const char *head_sequential_number_data = strcasestr(s, head_sequential_number_header); + if(head_sequential_number_data) { + head_sequential_number_data += strlen(head_sequential_number_header); + + while(isspace(*head_sequential_number_data)) + head_sequential_number_data++; + + if(isdigit(*head_sequential_number_data)) { + jf->first_seqnum = strtoul(head_sequential_number_data, NULL, 10); + if(jf->first_seqnum) + read_head = true; + } + } + + const char *tail_sequential_number_header = "Tail sequential number:"; + const char *tail_sequential_number_data = strcasestr(s, tail_sequential_number_header); + if(tail_sequential_number_data) { + tail_sequential_number_data += strlen(tail_sequential_number_header); + + while(isspace(*tail_sequential_number_data)) + tail_sequential_number_data++; + + if(isdigit(*tail_sequential_number_data)) { + jf->last_seqnum = strtoul(tail_sequential_number_data, NULL, 10); + if(jf->last_seqnum) + read_tail = true; + } + } + + if(read_head && read_tail && jf->last_seqnum > jf->first_seqnum) + jf->messages_in_file = jf->last_seqnum - jf->first_seqnum; + } + + if(!jf->logged_journalctl_failure && (!read_writer || !read_head || !read_tail)) { + + nd_log(NDLS_COLLECTORS, NDLP_NOTICE, + "Failed to read %s%s%s from journalctl's output on filename '%s', using the 
command: %s", + read_writer?"":"writer id,", + read_head?"":"head id,", + read_tail?"":"tail id,", + filename, cmd); + + jf->logged_journalctl_failure = true; + } +} + +usec_t journal_file_update_annotation_boot_id(sd_journal *j, struct journal_file *jf, const char *boot_id) { + usec_t ut = UINT64_MAX; + int r; + + char m[100]; + size_t len = snprintfz(m, sizeof(m), "_BOOT_ID=%s", boot_id); + + sd_journal_flush_matches(j); + + r = sd_journal_add_match(j, m, len); + if(r < 0) { + errno = -r; + internal_error(true, + "JOURNAL: while looking for the first timestamp of boot_id '%s', " + "sd_journal_add_match('%s') on file '%s' returned %d", + boot_id, m, jf->filename, r); + return UINT64_MAX; + } + + r = sd_journal_seek_head(j); + if(r < 0) { + errno = -r; + internal_error(true, + "JOURNAL: while looking for the first timestamp of boot_id '%s', " + "sd_journal_seek_head() on file '%s' returned %d", + boot_id, jf->filename, r); + return UINT64_MAX; + } + + r = sd_journal_next(j); + if(r < 0) { + errno = -r; + internal_error(true, + "JOURNAL: while looking for the first timestamp of boot_id '%s', " + "sd_journal_next() on file '%s' returned %d", + boot_id, jf->filename, r); + return UINT64_MAX; + } + + r = sd_journal_get_realtime_usec(j, &ut); + if(r < 0 || !ut || ut == UINT64_MAX) { + errno = -r; + internal_error(r != -EADDRNOTAVAIL, + "JOURNAL: while looking for the first timestamp of boot_id '%s', " + "sd_journal_get_realtime_usec() on file '%s' returned %d", + boot_id, jf->filename, r); + return UINT64_MAX; + } + + if(ut && ut != UINT64_MAX) { + dictionary_set(boot_ids_to_first_ut, boot_id, &ut, sizeof(ut)); + return ut; + } + + return UINT64_MAX; +} + +static void journal_file_get_boot_id_annotations(sd_journal *j __maybe_unused, struct journal_file *jf __maybe_unused) { +#ifdef HAVE_SD_JOURNAL_RESTART_FIELDS + sd_journal_flush_matches(j); + + int r = sd_journal_query_unique(j, "_BOOT_ID"); + if (r < 0) { + errno = -r; + internal_error(true, + "JOURNAL: while 
querying for the unique _BOOT_ID values, " + "sd_journal_query_unique() on file '%s' returned %d", + jf->filename, r); + errno = -r; + return; + } + + const void *data = NULL; + size_t data_length; + + DICTIONARY *dict = dictionary_create(DICT_OPTION_SINGLE_THREADED); + + SD_JOURNAL_FOREACH_UNIQUE(j, data, data_length) { + const char *key, *value; + size_t key_length, value_length; + + if(!parse_journal_field(data, data_length, &key, &key_length, &value, &value_length)) + continue; + + if(value_length != 32) + continue; + + char buf[33]; + memcpy(buf, value, 32); + buf[32] = '\0'; + + dictionary_set(dict, buf, NULL, 0); + } + + void *nothing; + dfe_start_read(dict, nothing){ + journal_file_update_annotation_boot_id(j, jf, nothing_dfe.name); + } + dfe_done(nothing); + + dictionary_destroy(dict); +#endif +} + +void journal_file_update_header(const char *filename, struct journal_file *jf) { + if(jf->last_scan_header_vs_last_modified_ut == jf->file_last_modified_ut) + return; + + fstat_cache_enable_on_thread(); + + const char *files[2] = { + [0] = filename, + [1] = NULL, + }; + + sd_journal *j = NULL; + if(sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS) < 0 || !j) { + netdata_log_error("JOURNAL: cannot open file '%s' to update msg_ut", filename); + fstat_cache_disable_on_thread(); + + if(!jf->logged_failure) { + netdata_log_error("cannot open journal file '%s', using file timestamps to understand time-frame.", filename); + jf->logged_failure = true; + } + + jf->msg_first_ut = 0; + jf->msg_last_ut = jf->file_last_modified_ut; + jf->last_scan_header_vs_last_modified_ut = jf->file_last_modified_ut; + return; + } + + usec_t first_ut = 0, last_ut = 0; + uint64_t first_seqnum = 0, last_seqnum = 0; + sd_id128_t first_writer_id = SD_ID128_NULL, last_writer_id = SD_ID128_NULL; + + if(sd_journal_seek_head(j) < 0 || sd_journal_next(j) < 0 || sd_journal_get_realtime_usec(j, &first_ut) < 0 || !first_ut) { + internal_error(true, "cannot find the timestamp of the first 
message in '%s'", filename); + first_ut = 0; + } +#ifdef HAVE_SD_JOURNAL_GET_SEQNUM + else { + if(sd_journal_get_seqnum(j, &first_seqnum, &first_writer_id) < 0 || !first_seqnum) { + internal_error(true, "cannot find the seqnum of the first message in '%s'", filename); + first_seqnum = 0; + memset(&first_writer_id, 0, sizeof(first_writer_id)); + } + } +#endif + + if(sd_journal_seek_tail(j) < 0 || sd_journal_previous(j) < 0 || sd_journal_get_realtime_usec(j, &last_ut) < 0 || !last_ut) { + internal_error(true, "cannot find the timestamp of the last message in '%s'", filename); + last_ut = jf->file_last_modified_ut; + } +#ifdef HAVE_SD_JOURNAL_GET_SEQNUM + else { + if(sd_journal_get_seqnum(j, &last_seqnum, &last_writer_id) < 0 || !last_seqnum) { + internal_error(true, "cannot find the seqnum of the last message in '%s'", filename); + last_seqnum = 0; + memset(&last_writer_id, 0, sizeof(last_writer_id)); + } + } +#endif + + if(first_ut > last_ut) { + internal_error(true, "timestamps are flipped in file '%s'", filename); + usec_t t = first_ut; + first_ut = last_ut; + last_ut = t; + } + + if(!first_seqnum || !first_ut) { + // extract these from the filename - if possible + + const char *at = strchr(filename, '@'); + if(at) { + const char *dash_seqnum = strchr(at + 1, '-'); + if(dash_seqnum) { + const char *dash_first_msg_ut = strchr(dash_seqnum + 1, '-'); + if(dash_first_msg_ut) { + const char *dot_journal = strstr(dash_first_msg_ut + 1, ".journal"); + if(dot_journal) { + if(dash_seqnum - at - 1 == 32 && + dash_first_msg_ut - dash_seqnum - 1 == 16 && + dot_journal - dash_first_msg_ut - 1 == 16) { + sd_id128_t writer; + if(journal_sd_id128_parse(at + 1, &writer)) { + char *endptr = NULL; + uint64_t seqnum = strtoul(dash_seqnum + 1, &endptr, 16); + if(endptr == dash_first_msg_ut) { + uint64_t ts = strtoul(dash_first_msg_ut + 1, &endptr, 16); + if(endptr == dot_journal) { + first_seqnum = seqnum; + first_writer_id = writer; + first_ut = ts; + } + } + } + } + } + } + } + } + } 
+ } + } + } + } + + jf->first_seqnum = first_seqnum; + jf->last_seqnum = last_seqnum; + + jf->first_writer_id = first_writer_id; + jf->last_writer_id = last_writer_id; + + jf->msg_first_ut = first_ut; + jf->msg_last_ut = last_ut; + + if(!jf->msg_last_ut) + jf->msg_last_ut = jf->file_last_modified_ut; + + if(last_seqnum > first_seqnum) { + if(!sd_id128_equal(first_writer_id, last_writer_id)) { + jf->messages_in_file = 0; + nd_log(NDLS_COLLECTORS, NDLP_NOTICE, + "The writers of the first and the last message in file '%s' differ." + , filename); + } + else + jf->messages_in_file = last_seqnum - first_seqnum + 1; + } + else + jf->messages_in_file = 0; + +// if(!jf->messages_in_file) +// journal_file_get_header_from_journalctl(filename, jf); + + journal_file_get_boot_id_annotations(j, jf); + sd_journal_close(j); + fstat_cache_disable_on_thread(); + + jf->last_scan_header_vs_last_modified_ut = jf->file_last_modified_ut; + + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "Journal file header updated '%s'", + jf->filename); +} + +static STRING *string_strdupz_source(const char *s, const char *e, size_t max_len, const char *prefix) { + char buf[max_len]; + size_t len; + char *dst = buf; + + if(prefix) { + len = strlen(prefix); + memcpy(buf, prefix, len); + dst = &buf[len]; + max_len -= len; + } + + len = e - s; + if(len >= max_len) + len = max_len - 1; + memcpy(dst, s, len); + dst[len] = '\0'; + buf[max_len - 1] = '\0'; + + for(size_t i = 0; buf[i] ;i++) + if(!isalnum(buf[i]) && buf[i] != '-' && buf[i] != '.' 
&& buf[i] != ':') + buf[i] = '_'; + + return string_strdupz(buf); +} + +static void files_registry_insert_cb(const DICTIONARY_ITEM *item, void *value, void *data __maybe_unused) { + struct journal_file *jf = value; + jf->filename = dictionary_acquired_item_name(item); + jf->filename_len = strlen(jf->filename); + jf->source_type = SDJF_ALL; + + // based on the filename + // decide the source to show to the user + const char *s = strrchr(jf->filename, '/'); + if(s) { + if(strstr(jf->filename, "/remote/")) { + jf->source_type |= SDJF_REMOTE_ALL; + + if(strncmp(s, "/remote-", 8) == 0) { + s = &s[8]; // skip "/remote-" + + char *e = strchr(s, '@'); + if(!e) + e = strstr(s, ".journal"); + + if(e) { + const char *d = s; + for(; d < e && (isdigit(*d) || *d == '.' || *d == ':') ; d++) ; + if(d == e) { + // a valid IP address + char ip[e - s + 1]; + memcpy(ip, s, e - s); + ip[e - s] = '\0'; + char buf[SYSTEMD_JOURNAL_MAX_SOURCE_LEN]; + if(ip_to_hostname(ip, buf, sizeof(buf))) + jf->source = string_strdupz_source(buf, &buf[strlen(buf)], SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); + else { + internal_error(true, "Cannot find the hostname for IP '%s'", ip); + jf->source = string_strdupz_source(s, e, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); + } + } + else + jf->source = string_strdupz_source(s, e, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); + } + } + } + else { + jf->source_type |= SDJF_LOCAL_ALL; + + const char *t = s - 1; + while(t >= jf->filename && *t != '.' 
&& *t != '/') + t--; + + if(t >= jf->filename && *t == '.') { + jf->source_type |= SDJF_LOCAL_NAMESPACE; + jf->source = string_strdupz_source(t + 1, s, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "namespace-"); + } + else if(strncmp(s, "/system", 7) == 0) + jf->source_type |= SDJF_LOCAL_SYSTEM; + + else if(strncmp(s, "/user", 5) == 0) + jf->source_type |= SDJF_LOCAL_USER; + + else + jf->source_type |= SDJF_LOCAL_OTHER; + } + } + else + jf->source_type |= SDJF_LOCAL_ALL | SDJF_LOCAL_OTHER; + + jf->msg_last_ut = jf->file_last_modified_ut; + + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "Journal file added to the journal files registry: '%s'", + jf->filename); +} + +static bool files_registry_conflict_cb(const DICTIONARY_ITEM *item, void *old_value, void *new_value, void *data __maybe_unused) { + struct journal_file *jf = old_value; + struct journal_file *njf = new_value; + + if(njf->last_scan_monotonic_ut > jf->last_scan_monotonic_ut) + jf->last_scan_monotonic_ut = njf->last_scan_monotonic_ut; + + if(njf->file_last_modified_ut > jf->file_last_modified_ut) { + jf->file_last_modified_ut = njf->file_last_modified_ut; + jf->size = njf->size; + + jf->msg_last_ut = jf->file_last_modified_ut; + + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "Journal file updated in the journal files registry: '%s'", + jf->filename); + } + + return false; +} + +struct journal_file_source { + usec_t first_ut; + usec_t last_ut; + size_t count; + uint64_t size; +}; + +static void human_readable_size_ib(uint64_t size, char *dst, size_t dst_len) { + if(size > 1024ULL * 1024 * 1024 * 1024) + snprintfz(dst, dst_len, "%0.2f TiB", (double)size / 1024.0 / 1024.0 / 1024.0 / 1024.0); + else if(size > 1024ULL * 1024 * 1024) + snprintfz(dst, dst_len, "%0.2f GiB", (double)size / 1024.0 / 1024.0 / 1024.0); + else if(size > 1024ULL * 1024) + snprintfz(dst, dst_len, "%0.2f MiB", (double)size / 1024.0 / 1024.0); + else if(size > 1024ULL) + snprintfz(dst, dst_len, "%0.2f KiB", (double)size / 1024.0); + else + snprintfz(dst, dst_len, 
"%"PRIu64" B", size); +} + +#define print_duration(dst, dst_len, pos, remaining, duration, one, many, printed) do { \ + if((remaining) > (duration)) { \ + uint64_t _count = (remaining) / (duration); \ + uint64_t _rem = (remaining) - (_count * (duration)); \ + (pos) += snprintfz(&(dst)[pos], (dst_len) - (pos), "%s%s%"PRIu64" %s", (printed) ? ", " : "", _rem ? "" : "and ", _count, _count > 1 ? (many) : (one)); \ + (remaining) = _rem; \ + (printed) = true; \ + } \ +} while(0) + +static void human_readable_duration_s(time_t duration_s, char *dst, size_t dst_len) { + if(duration_s < 0) + duration_s = -duration_s; + + size_t pos = 0; + dst[0] = 0 ; + + bool printed = false; + print_duration(dst, dst_len, pos, duration_s, 86400 * 365, "year", "years", printed); + print_duration(dst, dst_len, pos, duration_s, 86400 * 30, "month", "months", printed); + print_duration(dst, dst_len, pos, duration_s, 86400 * 1, "day", "days", printed); + print_duration(dst, dst_len, pos, duration_s, 3600 * 1, "hour", "hours", printed); + print_duration(dst, dst_len, pos, duration_s, 60 * 1, "min", "mins", printed); + print_duration(dst, dst_len, pos, duration_s, 1, "sec", "secs", printed); +} + +static int journal_file_to_json_array_cb(const DICTIONARY_ITEM *item, void *entry, void *data) { + struct journal_file_source *jfs = entry; + BUFFER *wb = data; + + const char *name = dictionary_acquired_item_name(item); + + buffer_json_add_array_item_object(wb); + { + char size_for_humans[100]; + human_readable_size_ib(jfs->size, size_for_humans, sizeof(size_for_humans)); + + char duration_for_humans[1024]; + human_readable_duration_s((time_t)((jfs->last_ut - jfs->first_ut) / USEC_PER_SEC), + duration_for_humans, sizeof(duration_for_humans)); + + char info[1024]; + snprintfz(info, sizeof(info), "%zu files, with a total size of %s, covering %s", + jfs->count, size_for_humans, duration_for_humans); + + buffer_json_member_add_string(wb, "id", name); + buffer_json_member_add_string(wb, "name", name); + 
buffer_json_member_add_string(wb, "pill", size_for_humans); + buffer_json_member_add_string(wb, "info", info); + } + buffer_json_object_close(wb); // options object + + return 1; +} + +static bool journal_file_merge_sizes(const DICTIONARY_ITEM *item __maybe_unused, void *old_value, void *new_value , void *data __maybe_unused) { + struct journal_file_source *jfs = old_value, *njfs = new_value; + jfs->count += njfs->count; + jfs->size += njfs->size; + + if(njfs->first_ut && njfs->first_ut < jfs->first_ut) + jfs->first_ut = njfs->first_ut; + + if(njfs->last_ut && njfs->last_ut > jfs->last_ut) + jfs->last_ut = njfs->last_ut; + + return false; +} + +void available_journal_file_sources_to_json_array(BUFFER *wb) { + DICTIONARY *dict = dictionary_create(DICT_OPTION_SINGLE_THREADED|DICT_OPTION_NAME_LINK_DONT_CLONE|DICT_OPTION_DONT_OVERWRITE_VALUE); + dictionary_register_conflict_callback(dict, journal_file_merge_sizes, NULL); + + struct journal_file_source t = { 0 }; + + struct journal_file *jf; + dfe_start_read(journal_files_registry, jf) { + t.first_ut = jf->msg_first_ut; + t.last_ut = jf->msg_last_ut; + t.count = 1; + t.size = jf->size; + + dictionary_set(dict, SDJF_SOURCE_ALL_NAME, &t, sizeof(t)); + + if(jf->source_type & SDJF_LOCAL_ALL) + dictionary_set(dict, SDJF_SOURCE_LOCAL_NAME, &t, sizeof(t)); + if(jf->source_type & SDJF_LOCAL_SYSTEM) + dictionary_set(dict, SDJF_SOURCE_LOCAL_SYSTEM_NAME, &t, sizeof(t)); + if(jf->source_type & SDJF_LOCAL_USER) + dictionary_set(dict, SDJF_SOURCE_LOCAL_USERS_NAME, &t, sizeof(t)); + if(jf->source_type & SDJF_LOCAL_OTHER) + dictionary_set(dict, SDJF_SOURCE_LOCAL_OTHER_NAME, &t, sizeof(t)); + if(jf->source_type & SDJF_LOCAL_NAMESPACE) + dictionary_set(dict, SDJF_SOURCE_NAMESPACES_NAME, &t, sizeof(t)); + if(jf->source_type & SDJF_REMOTE_ALL) + dictionary_set(dict, SDJF_SOURCE_REMOTES_NAME, &t, sizeof(t)); + if(jf->source) + dictionary_set(dict, string2str(jf->source), &t, sizeof(t)); + } + dfe_done(jf); + + 
dictionary_sorted_walkthrough_read(dict, journal_file_to_json_array_cb, wb); + + dictionary_destroy(dict); +} + +static void files_registry_delete_cb(const DICTIONARY_ITEM *item, void *value, void *data __maybe_unused) { + struct journal_file *jf = value; (void)jf; + const char *filename = dictionary_acquired_item_name(item); (void)filename; + + internal_error(true, "removed journal file '%s'", filename); + string_freez(jf->source); +} + +void journal_directory_scan_recursively(DICTIONARY *files, DICTIONARY *dirs, const char *dirname, int depth) { + static const char *ext = ".journal"; + static const ssize_t ext_len = sizeof(".journal") - 1; + + if (depth > VAR_LOG_JOURNAL_MAX_DEPTH) + return; + + DIR *dir; + struct dirent *entry; + char full_path[FILENAME_MAX]; + + // Open the directory. + if ((dir = opendir(dirname)) == NULL) { + if(errno != ENOENT && errno != ENOTDIR) + netdata_log_error("Cannot opendir() '%s'", dirname); + return; + } + + bool existing = false; + bool *found = dictionary_set(dirs, dirname, &existing, sizeof(existing)); + if(*found) return; + *found = true; + + // Read each entry in the directory. 
+ while ((entry = readdir(dir)) != NULL) { + if (strcmp(entry->d_name, ".") == 0 || strcmp(entry->d_name, "..") == 0) + continue; + + ssize_t len = snprintfz(full_path, sizeof(full_path), "%s/%s", dirname, entry->d_name); + + if (entry->d_type == DT_DIR) { + // recurse with depth + 1 (post-increment would pass the unchanged depth, + // defeating the VAR_LOG_JOURNAL_MAX_DEPTH limit) + journal_directory_scan_recursively(files, dirs, full_path, depth + 1); + } + else if (entry->d_type == DT_REG && len > ext_len && strcmp(full_path + len - ext_len, ext) == 0) { + if(files) + dictionary_set(files, full_path, NULL, 0); + + send_newline_and_flush(); + } + else if (entry->d_type == DT_LNK) { + struct stat info; + if (stat(full_path, &info) == -1) + continue; + + if (S_ISDIR(info.st_mode)) { + // The symbolic link points to a directory + char resolved_path[FILENAME_MAX + 1]; + if (realpath(full_path, resolved_path) != NULL) { + journal_directory_scan_recursively(files, dirs, resolved_path, depth + 1); + } + } + else if(S_ISREG(info.st_mode) && len > ext_len && strcmp(full_path + len - ext_len, ext) == 0) { + if(files) + dictionary_set(files, full_path, NULL, 0); + + send_newline_and_flush(); + } + } + } + + closedir(dir); +} + +static size_t journal_files_scans = 0; +bool journal_files_completed_once(void) { + return journal_files_scans > 0; +} + +int filenames_compar(const void *a, const void *b) { + const char *p1 = *(const char **)a; + const char *p2 = *(const char **)b; + + const char *at1 = strchr(p1, '@'); + const char *at2 = strchr(p2, '@'); + + if(!at1 && at2) + return -1; + + if(at1 && !at2) + return 1; + + if(!at1 && !at2) + return strcmp(p1, p2); + + const char *dash1 = strrchr(at1, '-'); + const char *dash2 = strrchr(at2, '-'); + + if(!dash1 || !dash2) + return strcmp(p1, p2); + + uint64_t ts1 = strtoul(dash1 + 1, NULL, 16); + uint64_t ts2 = strtoul(dash2 + 1, NULL, 16); + + if(ts1 > ts2) + return -1; + + if(ts1 < ts2) + return 1; + + return -strcmp(p1, p2); +} + +void journal_files_registry_update(void) { + static SPINLOCK spinlock = NETDATA_SPINLOCK_INITIALIZER; + 
if(spinlock_trylock(&spinlock)) { + usec_t scan_monotonic_ut = now_monotonic_usec(); + + DICTIONARY *files = dictionary_create(DICT_OPTION_SINGLE_THREADED | DICT_OPTION_DONT_OVERWRITE_VALUE); + DICTIONARY *dirs = dictionary_create(DICT_OPTION_SINGLE_THREADED|DICT_OPTION_DONT_OVERWRITE_VALUE); + + for(unsigned i = 0; i < MAX_JOURNAL_DIRECTORIES; i++) { + if(!journal_directories[i].path) break; + journal_directory_scan_recursively(files, dirs, journal_directories[i].path, 0); + } + + const char **array = mallocz(sizeof(const char *) * dictionary_entries(files)); + size_t used = 0; + + void *x; + dfe_start_read(files, x) { + if(used >= dictionary_entries(files)) continue; + array[used++] = x_dfe.name; + } + dfe_done(x); + + qsort(array, used, sizeof(const char *), filenames_compar); + + for(size_t i = 0; i < used ;i++) { + const char *full_path = array[i]; + + struct stat info; + if (stat(full_path, &info) == -1) + continue; + + struct journal_file t = { + .file_last_modified_ut = info.st_mtim.tv_sec * USEC_PER_SEC + info.st_mtim.tv_nsec / NSEC_PER_USEC, + .last_scan_monotonic_ut = scan_monotonic_ut, + .size = info.st_size, + .max_journal_vs_realtime_delta_ut = JOURNAL_VS_REALTIME_DELTA_DEFAULT_UT, + }; + struct journal_file *jf = dictionary_set(journal_files_registry, full_path, &t, sizeof(t)); + journal_file_update_header(jf->filename, jf); + } + freez(array); + dictionary_destroy(files); + dictionary_destroy(dirs); + + struct journal_file *jf; + dfe_start_write(journal_files_registry, jf){ + if(jf->last_scan_monotonic_ut < scan_monotonic_ut) + dictionary_del(journal_files_registry, jf_dfe.name); + } + dfe_done(jf); + + journal_files_scans++; + spinlock_unlock(&spinlock); + + internal_error(true, + "Journal library scan completed in %.3f ms", + (double)(now_monotonic_usec() - scan_monotonic_ut) / (double)USEC_PER_MS); + } +} + +// ---------------------------------------------------------------------------- + +int journal_file_dict_items_backward_compar(const void 
*a, const void *b) { + const DICTIONARY_ITEM **ad = (const DICTIONARY_ITEM **)a, **bd = (const DICTIONARY_ITEM **)b; + struct journal_file *jfa = dictionary_acquired_item_value(*ad); + struct journal_file *jfb = dictionary_acquired_item_value(*bd); + + // compare the last message timestamps + if(jfa->msg_last_ut < jfb->msg_last_ut) + return 1; + + if(jfa->msg_last_ut > jfb->msg_last_ut) + return -1; + + // compare the file last modification timestamps + if(jfa->file_last_modified_ut < jfb->file_last_modified_ut) + return 1; + + if(jfa->file_last_modified_ut > jfb->file_last_modified_ut) + return -1; + + // compare the first message timestamps + if(jfa->msg_first_ut < jfb->msg_first_ut) + return 1; + + if(jfa->msg_first_ut > jfb->msg_first_ut) + return -1; + + return 0; +} + +int journal_file_dict_items_forward_compar(const void *a, const void *b) { + return -journal_file_dict_items_backward_compar(a, b); +} + +static bool boot_id_conflict_cb(const DICTIONARY_ITEM *item, void *old_value, void *new_value, void *data __maybe_unused) { + usec_t *old_usec = old_value; + usec_t *new_usec = new_value; + + if(*new_usec < *old_usec) { + *old_usec = *new_usec; + return true; + } + + return false; +} + +void journal_init_files_and_directories(void) { + unsigned d = 0; + + // ------------------------------------------------------------------------ + // setup the journal directories + + journal_directories[d++].path = strdupz("/run/log/journal"); + journal_directories[d++].path = strdupz("/var/log/journal"); + + if(*netdata_configured_host_prefix) { + char path[PATH_MAX]; + snprintfz(path, sizeof(path), "%s/var/log/journal", netdata_configured_host_prefix); + journal_directories[d++].path = strdupz(path); + snprintfz(path, sizeof(path), "%s/run/log/journal", netdata_configured_host_prefix); + journal_directories[d++].path = strdupz(path); + } + + // terminate the list + journal_directories[d].path = NULL; + + // 
------------------------------------------------------------------------ + // initialize the used hashes files registry + + used_hashes_registry = dictionary_create(DICT_OPTION_DONT_OVERWRITE_VALUE); + + systemd_journal_session = (now_realtime_usec() / USEC_PER_SEC) * USEC_PER_SEC; + + journal_files_registry = dictionary_create_advanced( + DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, + NULL, sizeof(struct journal_file)); + + dictionary_register_insert_callback(journal_files_registry, files_registry_insert_cb, NULL); + dictionary_register_delete_callback(journal_files_registry, files_registry_delete_cb, NULL); + dictionary_register_conflict_callback(journal_files_registry, files_registry_conflict_cb, NULL); + + boot_ids_to_first_ut = dictionary_create_advanced( + DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, + NULL, sizeof(usec_t)); + + dictionary_register_conflict_callback(boot_ids_to_first_ut, boot_id_conflict_cb, NULL); + +} diff --git a/collectors/systemd-journal.plugin/systemd-journal-fstat.c b/collectors/systemd-journal.plugin/systemd-journal-fstat.c new file mode 100644 index 000000000..45ea78174 --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-journal-fstat.c @@ -0,0 +1,74 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" + + +// ---------------------------------------------------------------------------- +// fstat64 overloading to speed up libsystemd +// https://github.com/systemd/systemd/pull/29261 + +#include <dlfcn.h> +#include <sys/stat.h> + +#define FSTAT_CACHE_MAX 1024 +struct fdstat64_cache_entry { + bool enabled; + bool updated; + int err_no; + struct stat64 stat; + int ret; + size_t cached_count; + size_t session; +}; + +struct fdstat64_cache_entry fstat64_cache[FSTAT_CACHE_MAX] = {0 }; +__thread size_t fstat_thread_calls = 0; +__thread size_t fstat_thread_cached_responses = 0; +static __thread bool enable_thread_fstat = false; +static __thread size_t 
fstat_caching_thread_session = 0; +static size_t fstat_caching_global_session = 0; + +void fstat_cache_enable_on_thread(void) { + fstat_caching_thread_session = __atomic_add_fetch(&fstat_caching_global_session, 1, __ATOMIC_ACQUIRE); + enable_thread_fstat = true; +} + +void fstat_cache_disable_on_thread(void) { + fstat_caching_thread_session = __atomic_add_fetch(&fstat_caching_global_session, 1, __ATOMIC_RELEASE); + enable_thread_fstat = false; +} + +int fstat64(int fd, struct stat64 *buf) { + static int (*real_fstat)(int, struct stat64 *) = NULL; + if (!real_fstat) + real_fstat = dlsym(RTLD_NEXT, "fstat64"); + + fstat_thread_calls++; + + if(fd >= 0 && fd < FSTAT_CACHE_MAX) { + if(enable_thread_fstat && fstat64_cache[fd].session != fstat_caching_thread_session) { + fstat64_cache[fd].session = fstat_caching_thread_session; + fstat64_cache[fd].enabled = true; + fstat64_cache[fd].updated = false; + } + + if(fstat64_cache[fd].enabled && fstat64_cache[fd].updated && fstat64_cache[fd].session == fstat_caching_thread_session) { + fstat_thread_cached_responses++; + errno = fstat64_cache[fd].err_no; + *buf = fstat64_cache[fd].stat; + fstat64_cache[fd].cached_count++; + return fstat64_cache[fd].ret; + } + } + + int ret = real_fstat(fd, buf); + + if(fd >= 0 && fd < FSTAT_CACHE_MAX && fstat64_cache[fd].enabled && fstat64_cache[fd].session == fstat_caching_thread_session) { + fstat64_cache[fd].ret = ret; + fstat64_cache[fd].updated = true; + fstat64_cache[fd].err_no = errno; + fstat64_cache[fd].stat = *buf; + } + + return ret; +} diff --git a/collectors/systemd-journal.plugin/systemd-journal-self-signed-certs.sh b/collectors/systemd-journal.plugin/systemd-journal-self-signed-certs.sh new file mode 100755 index 000000000..ada735f1f --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-journal-self-signed-certs.sh @@ -0,0 +1,267 @@ +#!/usr/bin/env bash + +me="${0}" +dst="/etc/ssl/systemd-journal" + +show_usage() { + cat <<EOFUSAGE + +${me} [options] server_name alias1 
alias2 ... + +server_name + the canonical name of the server on the certificates + +aliasN + a hostname or IP this server is reachable with + DNS names should be like DNS:hostname + IPs should be like IP:1.2.3.4 + Any number of aliases are accepted per server + +options can be: + + -h, --help + show this message + + -d, --directory DIRECTORY + change the default certificates install dir + default: ${dst} + +EOFUSAGE +} + +while [ ! -z "${1}" ]; do + case "${1}" in + -h|--help) + show_usage + exit 0 + ;; + + -d|--directory) + dst="${2}" + echo >&2 "directory set to: ${dst}" + shift + ;; + + *) + break 2 + ;; + esac + + shift +done + +if [ -z "${1}" ]; then + show_usage + exit 1 +fi + + +# Define a regular expression pattern for a valid canonical name +valid_canonical_name_pattern="^[a-zA-Z0-9][a-zA-Z0-9.-]+$" + +# Check if ${1} matches the pattern +if [[ ! "${1}" =~ ${valid_canonical_name_pattern} ]]; then + echo "Certificate name '${1}' is not valid." + exit 1 +fi + +# ----------------------------------------------------------------------------- +# Create the CA + +# stop on all errors +set -e + +if [ $UID -ne 0 ] +then + echo >&2 "Hey! sudo me: sudo ${me}" + exit 1 +fi + +if ! getent group systemd-journal >/dev/null 2>&1; then + echo >&2 "Missing system group: systemd-journal. Did you install systemd-journald?" + exit 1 +fi + +if ! getent passwd systemd-journal-remote >/dev/null 2>&1; then + echo >&2 "Missing system user: systemd-journal-remote. Did you install systemd-journal-remote?" + exit 1 +fi + +if [ ! -d "${dst}" ] +then + mkdir -p "${dst}" + chown systemd-journal-remote:systemd-journal "${dst}" + chmod 750 "${dst}" +fi + +cd "${dst}" + +test ! -f ca.conf && cat >ca.conf <<EOF +[ ca ] +default_ca = CA_default +[ CA_default ] +new_certs_dir = . 
+certificate = ca.pem +database = ./index +private_key = ca.key +serial = ./serial +default_days = 3650 +default_md = default +policy = policy_anything +[ policy_anything ] +countryName = optional +stateOrProvinceName = optional +localityName = optional +organizationName = optional +organizationalUnitName = optional +commonName = supplied +emailAddress = optional +EOF + +test ! -f index && touch index +test ! -f serial && echo 0001 >serial + +if [ ! -f ca.pem -o ! -f ca.key ]; then + echo >&2 "Generating ca.pem ..." + + openssl req -newkey rsa:2048 -days 3650 -x509 -nodes -out ca.pem -keyout ca.key -subj "/CN=systemd-journal-remote-ca/" + chown systemd-journal-remote:systemd-journal ca.pem + chmod 0640 ca.pem +fi + +# ----------------------------------------------------------------------------- +# Create a server certificate + +generate_server_certificate() { + local cn="${1}"; shift + + if [ ! -f "${cn}.pem" -o ! -f "${cn}.key" ]; then + if [ -z "${*}" ]; then + echo >"${cn}.conf" + else + echo "subjectAltName = $(echo "${@}" | tr " " ",")" >"${cn}.conf" + fi + + echo >&2 "Generating server: ${cn}.pem and ${cn}.key ..." + + openssl req -newkey rsa:2048 -nodes -out "${cn}.csr" -keyout "${cn}.key" -subj "/CN=${cn}/" + openssl ca -batch -config ca.conf -notext -in "${cn}.csr" -out "${cn}.pem" -extfile "${cn}.conf" + else + echo >&2 "certificates for ${cn} are already available." + fi + + chown systemd-journal-remote:systemd-journal "${cn}.pem" "${cn}.key" + chmod 0640 "${cn}.pem" "${cn}.key" +} + + +# ----------------------------------------------------------------------------- +# Create a script to install the certificate on each server + +generate_install_script() { + local cn="${1}" + local dst="/etc/ssl/systemd-journal" + + cat >"runme-on-${cn}.sh" <<EOFC1 +#!/usr/bin/env bash + +# stop on all errors +set -e + +if [ \$UID -ne 0 ]; then + echo >&2 "Hey! 
sudo me: sudo \${0}" + exit 1 +fi + +# make sure the systemd-journal group exists +# all certificates will be owned by this group +if ! getent group systemd-journal >/dev/null 2>&1; then + echo >&2 "Missing system group: systemd-journal. Did you install systemd-journald?" + exit 1 +fi + +if ! getent passwd systemd-journal-remote >/dev/null 2>&1; then + echo >&2 "Missing system user: systemd-journal-remote. Did you install systemd-journal-remote?" + exit 1 +fi + +if [ ! -d ${dst} ]; then + echo >&2 "creating directory: ${dst}" + mkdir -p "${dst}" +fi +chown systemd-journal-remote:systemd-journal "${dst}" +chmod 750 "${dst}" +cd "${dst}" + +echo >&2 "saving trusted certificate file as: ${dst}/ca.pem" +cat >ca.pem <<EOFCAPEM +$(cat ca.pem) +EOFCAPEM + +chown systemd-journal-remote:systemd-journal ca.pem +chmod 0640 ca.pem + +echo >&2 "saving server ${cn} certificate file as: ${dst}/${cn}.pem" +cat >"${cn}.pem" <<EOFSERPEM +$(cat "${cn}.pem") +EOFSERPEM + +chown systemd-journal-remote:systemd-journal "${cn}.pem" +chmod 0640 "${cn}.pem" + +echo >&2 "saving server ${cn} key file as: ${dst}/${cn}.key" +cat >"${cn}.key" <<EOFSERKEY +$(cat "${cn}.key") +EOFSERKEY + +chown systemd-journal-remote:systemd-journal "${cn}.key" +chmod 0640 "${cn}.key" + +for cfg in /etc/systemd/journal-remote.conf /etc/systemd/journal-upload.conf +do + if [ -f \${cfg} ]; then + # keep a backup of the file + test ! 
-f \${cfg}.orig && cp \${cfg} \${cfg}.orig + + # fix its contents + echo >&2 "updating the certificates in \${cfg}" + sed -i "s|^#\\?\\s*ServerKeyFile=.*$|ServerKeyFile=${dst}/${cn}.key|" \${cfg} + sed -i "s|^#\\?\\s*ServerCertificateFile=.*$|ServerCertificateFile=${dst}/${cn}.pem|" \${cfg} + sed -i "s|^#\\?\\s*TrustedCertificateFile=.*$|TrustedCertificateFile=${dst}/ca.pem|" \${cfg} + fi +done + +echo >&2 "certificates installed - you may need to restart services to activate them" +echo >&2 +echo >&2 "If this is a central server:" +echo >&2 "# systemctl restart systemd-journal-remote.socket" +echo >&2 +echo >&2 "If this is a passive client:" +echo >&2 "# systemctl restart systemd-journal-upload.service" +echo >&2 +echo >&2 "If this is an active client:" +echo >&2 "# systemctl restart systemd-journal-gateway.socket" +EOFC1 + + chmod 0700 "runme-on-${cn}.sh" +} + +# ----------------------------------------------------------------------------- +# Create the client certificates + +generate_server_certificate "${@}" +generate_install_script "${1}" + + +# Set ANSI escape code for colors +yellow_color="\033[1;33m" +green_color="\033[0;32m" +# Reset ANSI color after the message +reset_color="\033[0m" + + +echo >&2 -e "use this script to install it on ${1}: ${yellow_color}$(ls ${dst}/runme-on-${1}.sh)${reset_color}" +echo >&2 "copy it to your server ${1}, like this:" +echo >&2 -e "# ${green_color}scp ${dst}/runme-on-${1}.sh ${1}:/tmp/${reset_color}" +echo >&2 "and then run it on that server to install the certificates" +echo >&2 diff --git a/collectors/systemd-journal.plugin/systemd-journal-watcher.c b/collectors/systemd-journal.plugin/systemd-journal-watcher.c new file mode 100644 index 000000000..ed41f6247 --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-journal-watcher.c @@ -0,0 +1,379 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" +#include <sys/inotify.h> + +#define EVENT_SIZE (sizeof(struct inotify_event)) +#define 
INITIAL_WATCHES 256 + +#define WATCH_FOR (IN_CREATE | IN_MODIFY | IN_DELETE | IN_DELETE_SELF | IN_MOVED_FROM | IN_MOVED_TO | IN_UNMOUNT) + +typedef struct watch_entry { + int slot; + + int wd; // Watch descriptor + char *path; // Dynamically allocated path + + struct watch_entry *next; // for the free list +} WatchEntry; + +typedef struct { + WatchEntry *watchList; + WatchEntry *freeList; + int watchCount; + int watchListSize; + + size_t errors; + + DICTIONARY *pending; +} Watcher; + +static WatchEntry *get_slot(Watcher *watcher) { + WatchEntry *t; + + if (watcher->freeList != NULL) { + t = watcher->freeList; + watcher->freeList = t->next; + t->next = NULL; + return t; + } + + if (watcher->watchCount == watcher->watchListSize) { + watcher->watchListSize *= 2; + watcher->watchList = reallocz(watcher->watchList, watcher->watchListSize * sizeof(WatchEntry)); + } + + watcher->watchList[watcher->watchCount] = (WatchEntry){ + .slot = watcher->watchCount, + .wd = -1, + .path = NULL, + .next = NULL, + }; + t = &watcher->watchList[watcher->watchCount]; + watcher->watchCount++; + + return t; +} + +static void free_slot(Watcher *watcher, WatchEntry *t) { + t->wd = -1; + freez(t->path); + t->path = NULL; + + // link it to the free list + t->next = watcher->freeList; + watcher->freeList = t; +} + +static int add_watch(Watcher *watcher, int inotifyFd, const char *path) { + WatchEntry *t = get_slot(watcher); + + t->wd = inotify_add_watch(inotifyFd, path, WATCH_FOR); + if (t->wd == -1) { + nd_log(NDLS_COLLECTORS, NDLP_ERR, + "JOURNAL WATCHER: cannot watch directory: '%s'", + path); + + free_slot(watcher, t); + + struct stat info; + if(stat(path, &info) == 0 && S_ISDIR(info.st_mode)) { + // the directory exists, but we failed to add the watch + // increase errors + watcher->errors++; + } + } + else { + t->path = strdupz(path); + + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: watching directory: '%s'", + path); + + } + return t->wd; +} + +static void remove_watch(Watcher 
*watcher, int inotifyFd, int wd) { + int i; + for (i = 0; i < watcher->watchCount; ++i) { + if (watcher->watchList[i].wd == wd) { + + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: removing watch from directory: '%s'", + watcher->watchList[i].path); + + inotify_rm_watch(inotifyFd, watcher->watchList[i].wd); + free_slot(watcher, &watcher->watchList[i]); + return; + } + } + + nd_log(NDLS_COLLECTORS, NDLP_WARNING, + "JOURNAL WATCHER: cannot find directory watch %d to remove.", + wd); +} + +static void free_watches(Watcher *watcher, int inotifyFd) { + for (int i = 0; i < watcher->watchCount; ++i) { + if (watcher->watchList[i].wd != -1) { + inotify_rm_watch(inotifyFd, watcher->watchList[i].wd); + free_slot(watcher, &watcher->watchList[i]); + } + } + freez(watcher->watchList); + watcher->watchList = NULL; + + dictionary_destroy(watcher->pending); + watcher->pending = NULL; +} + +static char* get_path_from_wd(Watcher *watcher, int wd) { + for (int i = 0; i < watcher->watchCount; ++i) { + if (watcher->watchList[i].wd == wd) + return watcher->watchList[i].path; + } + return NULL; +} + +static bool is_directory_watched(Watcher *watcher, const char *path) { + for (int i = 0; i < watcher->watchCount; ++i) { + if (watcher->watchList[i].wd != -1 && strcmp(watcher->watchList[i].path, path) == 0) { + return true; + } + } + return false; +} + +static void watch_directory_and_subdirectories(Watcher *watcher, int inotifyFd, const char *basePath) { + DICTIONARY *dirs = dictionary_create(DICT_OPTION_SINGLE_THREADED | DICT_OPTION_DONT_OVERWRITE_VALUE); + + journal_directory_scan_recursively(NULL, dirs, basePath, 0); + + void *x; + dfe_start_read(dirs, x) { + const char *dirname = x_dfe.name; + // Check if this directory is already being watched + if (!is_directory_watched(watcher, dirname)) { + add_watch(watcher, inotifyFd, dirname); + } + } + dfe_done(x); + + dictionary_destroy(dirs); +} + +static bool is_subpath(const char *path, const char *subpath) { + // Use strncmp to 
compare the paths + if (strncmp(path, subpath, strlen(path)) == 0) { + // Ensure that the next character is a '/' or '\0' + char next_char = subpath[strlen(path)]; + return next_char == '/' || next_char == '\0'; + } + + return false; +} + +void remove_directory_watch(Watcher *watcher, int inotifyFd, const char *dirPath) { + for (int i = 0; i < watcher->watchCount; ++i) { + WatchEntry *t = &watcher->watchList[i]; + if (t->wd != -1 && is_subpath(t->path, dirPath)) { + inotify_rm_watch(inotifyFd, t->wd); + free_slot(watcher, t); + } + } + + struct journal_file *jf; + dfe_start_write(journal_files_registry, jf) { + if(is_subpath(jf->filename, dirPath)) + dictionary_del(journal_files_registry, jf->filename); + } + dfe_done(jf); + + dictionary_garbage_collect(journal_files_registry); +} + +void process_event(Watcher *watcher, int inotifyFd, struct inotify_event *event) { + if(!event->len) { + nd_log(NDLS_COLLECTORS, NDLP_NOTICE + , "JOURNAL WATCHER: received event with mask %u and len %u (this is zero) for path: '%s' - ignoring it." + , event->mask, event->len, event->name); + return; + } + + char *dirPath = get_path_from_wd(watcher, event->wd); + if(!dirPath) { + nd_log(NDLS_COLLECTORS, NDLP_NOTICE, + "JOURNAL WATCHER: received event with mask %u and len %u for path: '%s' - " + "but we can't find its watch descriptor - ignoring it." 
+ , event->mask, event->len, event->name); + return; + } + + if(event->mask & IN_DELETE_SELF) { + remove_watch(watcher, inotifyFd, event->wd); + return; + } + + static __thread char fullPath[PATH_MAX]; + snprintfz(fullPath, sizeof(fullPath), "%s/%s", dirPath, event->name); + // fullPath contains the full path to the file + + size_t len = strlen(event->name); + + if(event->mask & IN_ISDIR) { + if (event->mask & (IN_DELETE | IN_MOVED_FROM)) { + // A directory is deleted or moved out + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: Directory deleted or moved out: '%s'", + fullPath); + + // Remove the watch - implement this function based on how you manage your watches + remove_directory_watch(watcher, inotifyFd, fullPath); + } + else if (event->mask & (IN_CREATE | IN_MOVED_TO)) { + // A new directory is created or moved in + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: New directory created or moved in: '%s'", + fullPath); + + // Start watching the new directory - recursive watch + watch_directory_and_subdirectories(watcher, inotifyFd, fullPath); + } + else + nd_log(NDLS_COLLECTORS, NDLP_WARNING, + "JOURNAL WATCHER: Received unhandled event with mask %u for directory '%s'", + event->mask, fullPath); + } + else if(len > sizeof(".journal") - 1 && strcmp(&event->name[len - (sizeof(".journal") - 1)], ".journal") == 0) { + // It is a file that ends in .journal + // add it to our pending list + dictionary_set(watcher->pending, fullPath, NULL, 0); + } + else + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: ignoring event with mask %u for file '%s'", + event->mask, fullPath); +} + +static void process_pending(Watcher *watcher) { + void *x; + dfe_start_write(watcher->pending, x) { + struct stat info; + const char *fullPath = x_dfe.name; + + if(stat(fullPath, &info) != 0) { + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: file '%s' no longer exists, removing it from the registry", + fullPath); + + dictionary_del(journal_files_registry, 
fullPath); + } + else if(S_ISREG(info.st_mode)) { + nd_log(NDLS_COLLECTORS, NDLP_DEBUG, + "JOURNAL WATCHER: file '%s' has been added/updated, updating the registry", + fullPath); + + struct journal_file t = { + .file_last_modified_ut = info.st_mtim.tv_sec * USEC_PER_SEC + + info.st_mtim.tv_nsec / NSEC_PER_USEC, + .last_scan_monotonic_ut = now_monotonic_usec(), + .size = info.st_size, + .max_journal_vs_realtime_delta_ut = JOURNAL_VS_REALTIME_DELTA_DEFAULT_UT, + }; + struct journal_file *jf = dictionary_set(journal_files_registry, fullPath, &t, sizeof(t)); + journal_file_update_header(jf->filename, jf); + } + + dictionary_del(watcher->pending, fullPath); + } + dfe_done(x); + + dictionary_garbage_collect(watcher->pending); +} + +void *journal_watcher_main(void *arg __maybe_unused) { + while(1) { + Watcher watcher = { + .watchList = mallocz(INITIAL_WATCHES * sizeof(WatchEntry)), + .freeList = NULL, + .watchCount = 0, + .watchListSize = INITIAL_WATCHES, + .pending = dictionary_create(DICT_OPTION_DONT_OVERWRITE_VALUE|DICT_OPTION_SINGLE_THREADED), + .errors = 0, + }; + + int inotifyFd = inotify_init(); + if (inotifyFd < 0) { + nd_log(NDLS_COLLECTORS, NDLP_ERR, "inotify_init() failed."); + free_watches(&watcher, inotifyFd); + return NULL; + } + + for (unsigned i = 0; i < MAX_JOURNAL_DIRECTORIES; i++) { + if (!journal_directories[i].path) break; + watch_directory_and_subdirectories(&watcher, inotifyFd, journal_directories[i].path); + } + + usec_t last_headers_update_ut = now_monotonic_usec(); + struct buffered_reader reader; + while (1) { + buffered_reader_ret_t rc = buffered_reader_read_timeout( + &reader, inotifyFd, SYSTEMD_JOURNAL_EXECUTE_WATCHER_PENDING_EVERY_MS, false); + + if (rc != BUFFERED_READER_READ_OK && rc != BUFFERED_READER_READ_POLL_TIMEOUT) { + nd_log(NDLS_COLLECTORS, NDLP_CRIT, + "JOURNAL WATCHER: cannot read inotify events, buffered_reader_read_timeout() returned %d - " + "restarting the watcher.", + rc); + break; + } + + if(rc == BUFFERED_READER_READ_OK) { 
+ bool unmount_event = false; + + ssize_t i = 0; + while (i < reader.read_len) { + struct inotify_event *event = (struct inotify_event *) &reader.read_buffer[i]; + + if(event->mask & IN_UNMOUNT) { + unmount_event = true; + break; + } + + process_event(&watcher, inotifyFd, event); + i += (ssize_t)EVENT_SIZE + event->len; + } + + reader.read_buffer[0] = '\0'; + reader.read_len = 0; + reader.pos = 0; + + if(unmount_event) + break; + } + + usec_t ut = now_monotonic_usec(); + if (dictionary_entries(watcher.pending) && (rc == BUFFERED_READER_READ_POLL_TIMEOUT || + last_headers_update_ut + (SYSTEMD_JOURNAL_EXECUTE_WATCHER_PENDING_EVERY_MS * USEC_PER_MS) <= ut)) { + process_pending(&watcher); + last_headers_update_ut = ut; + } + + if(watcher.errors) { + nd_log(NDLS_COLLECTORS, NDLP_NOTICE, + "JOURNAL WATCHER: there were errors in setting up inotify watches - restarting the watcher."); + } + } + + close(inotifyFd); + free_watches(&watcher, inotifyFd); + + // this will scan the directories and cleanup the registry + journal_files_registry_update(); + + sleep_usec(5 * USEC_PER_SEC); + } + + return NULL; +} diff --git a/collectors/systemd-journal.plugin/systemd-journal.c b/collectors/systemd-journal.plugin/systemd-journal.c index 877371120..f812b2161 100644 --- a/collectors/systemd-journal.plugin/systemd-journal.c +++ b/collectors/systemd-journal.plugin/systemd-journal.c @@ -5,13 +5,7 @@ * GPL v3+ */ -#include "collectors/all.h" -#include "libnetdata/libnetdata.h" -#include "libnetdata/required_dummies.h" - -#include <linux/capability.h> -#include <systemd/sd-journal.h> -#include <syslog.h> +#include "systemd-internals.h" /* * TODO @@ -20,95 +14,17 @@ * */ - -// ---------------------------------------------------------------------------- -// fstat64 overloading to speed up libsystemd -// https://github.com/systemd/systemd/pull/29261 - -#define ND_SD_JOURNAL_OPEN_FLAGS (0) - -#include <dlfcn.h> -#include <sys/stat.h> - -#define FSTAT_CACHE_MAX 1024 -struct fdstat64_cache_entry 
{ - bool enabled; - bool updated; - int err_no; - struct stat64 stat; - int ret; - size_t cached_count; - size_t session; -}; - -struct fdstat64_cache_entry fstat64_cache[FSTAT_CACHE_MAX] = {0 }; -static __thread size_t fstat_thread_calls = 0; -static __thread size_t fstat_thread_cached_responses = 0; -static __thread bool enable_thread_fstat = false; -static __thread size_t fstat_caching_thread_session = 0; -static size_t fstat_caching_global_session = 0; - -static void fstat_cache_enable_on_thread(void) { - fstat_caching_thread_session = __atomic_add_fetch(&fstat_caching_global_session, 1, __ATOMIC_ACQUIRE); - enable_thread_fstat = true; -} - -static void fstat_cache_disable_on_thread(void) { - fstat_caching_thread_session = __atomic_add_fetch(&fstat_caching_global_session, 1, __ATOMIC_RELEASE); - enable_thread_fstat = false; -} - -int fstat64(int fd, struct stat64 *buf) { - static int (*real_fstat)(int, struct stat64 *) = NULL; - if (!real_fstat) - real_fstat = dlsym(RTLD_NEXT, "fstat64"); - - fstat_thread_calls++; - - if(fd >= 0 && fd < FSTAT_CACHE_MAX) { - if(enable_thread_fstat && fstat64_cache[fd].session != fstat_caching_thread_session) { - fstat64_cache[fd].session = fstat_caching_thread_session; - fstat64_cache[fd].enabled = true; - fstat64_cache[fd].updated = false; - } - - if(fstat64_cache[fd].enabled && fstat64_cache[fd].updated && fstat64_cache[fd].session == fstat_caching_thread_session) { - fstat_thread_cached_responses++; - errno = fstat64_cache[fd].err_no; - *buf = fstat64_cache[fd].stat; - fstat64_cache[fd].cached_count++; - return fstat64_cache[fd].ret; - } - } - - int ret = real_fstat(fd, buf); - - if(fd >= 0 && fd < FSTAT_CACHE_MAX && fstat64_cache[fd].enabled) { - fstat64_cache[fd].ret = ret; - fstat64_cache[fd].updated = true; - fstat64_cache[fd].err_no = errno; - fstat64_cache[fd].stat = *buf; - fstat64_cache[fd].session = fstat_caching_thread_session; - } - - return ret; -} - -// 
---------------------------------------------------------------------------- - #define FACET_MAX_VALUE_LENGTH 8192 -#define SYSTEMD_JOURNAL_MAX_SOURCE_LEN 64 #define SYSTEMD_JOURNAL_FUNCTION_DESCRIPTION "View, search and analyze systemd journal entries." #define SYSTEMD_JOURNAL_FUNCTION_NAME "systemd-journal" #define SYSTEMD_JOURNAL_DEFAULT_TIMEOUT 60 -#define SYSTEMD_JOURNAL_MAX_PARAMS 100 +#define SYSTEMD_JOURNAL_MAX_PARAMS 1000 #define SYSTEMD_JOURNAL_DEFAULT_QUERY_DURATION (1 * 3600) #define SYSTEMD_JOURNAL_DEFAULT_ITEMS_PER_QUERY 200 -#define SYSTEMD_JOURNAL_WORKER_THREADS 5 - -#define JOURNAL_VS_REALTIME_DELTA_DEFAULT_UT (5 * USEC_PER_SEC) // assume always 5 seconds latency -#define JOURNAL_VS_REALTIME_DELTA_MAX_UT (2 * 60 * USEC_PER_SEC) // up to 2 minutes latency +#define SYSTEMD_JOURNAL_DEFAULT_ITEMS_SAMPLING 1000000 +#define SYSTEMD_JOURNAL_SAMPLING_SLOTS 1000 +#define SYSTEMD_JOURNAL_SAMPLING_RECALIBRATE 10000 #define JOURNAL_PARAMETER_HELP "help" #define JOURNAL_PARAMETER_AFTER "after" @@ -128,6 +44,7 @@ int fstat64(int fd, struct stat64 *buf) { #define JOURNAL_PARAMETER_SLICE "slice" #define JOURNAL_PARAMETER_DELTA "delta" #define JOURNAL_PARAMETER_TAIL "tail" +#define JOURNAL_PARAMETER_SAMPLING "sampling" #define JOURNAL_KEY_ND_JOURNAL_FILE "ND_JOURNAL_FILE" #define JOURNAL_KEY_ND_JOURNAL_PROCESS "ND_JOURNAL_PROCESS" @@ -138,7 +55,8 @@ int fstat64(int fd, struct stat64 *buf) { #define SYSTEMD_ALWAYS_VISIBLE_KEYS NULL #define SYSTEMD_KEYS_EXCLUDED_FROM_FACETS \ - "*MESSAGE*" \ + "!MESSAGE_ID" \ + "|*MESSAGE*" \ "|*_RAW" \ "|*_USEC" \ "|*_NSEC" \ @@ -153,7 +71,7 @@ int fstat64(int fd, struct stat64 *buf) { /* --- USER JOURNAL FIELDS --- */ \ \ /* "|MESSAGE" */ \ - /* "|MESSAGE_ID" */ \ + "|MESSAGE_ID" \ "|PRIORITY" \ "|CODE_FILE" \ /* "|CODE_LINE" */ \ @@ -247,33 +165,22 @@ int fstat64(int fd, struct stat64 *buf) { "|IMAGE_NAME" /* undocumented */ \ /* "|CONTAINER_PARTIAL_MESSAGE" */ \ \ + \ + /* --- NETDATA --- */ \ + \ + "|ND_NIDL_NODE" \ + 
"|ND_NIDL_CONTEXT" \ + "|ND_LOG_SOURCE" \ + /*"|ND_MODULE" */ \ + "|ND_ALERT_NAME" \ + "|ND_ALERT_CLASS" \ + "|ND_ALERT_COMPONENT" \ + "|ND_ALERT_TYPE" \ + \ "" -static netdata_mutex_t stdout_mutex = NETDATA_MUTEX_INITIALIZER; -static bool plugin_should_exit = false; - // ---------------------------------------------------------------------------- -typedef enum { - ND_SD_JOURNAL_NO_FILE_MATCHED, - ND_SD_JOURNAL_FAILED_TO_OPEN, - ND_SD_JOURNAL_FAILED_TO_SEEK, - ND_SD_JOURNAL_TIMED_OUT, - ND_SD_JOURNAL_OK, - ND_SD_JOURNAL_NOT_MODIFIED, - ND_SD_JOURNAL_CANCELLED, -} ND_SD_JOURNAL_STATUS; - -typedef enum { - SDJF_ALL = 0, - SDJF_LOCAL = (1 << 0), - SDJF_REMOTE = (1 << 1), - SDJF_SYSTEM = (1 << 2), - SDJF_USER = (1 << 3), - SDJF_NAMESPACE = (1 << 4), - SDJF_OTHER = (1 << 5), -} SD_JOURNAL_FILE_SOURCE_TYPE; - typedef struct function_query_status { bool *cancelled; // a pointer to the cancelling boolean usec_t stop_monotonic_ut; @@ -282,7 +189,7 @@ typedef struct function_query_status { // request SD_JOURNAL_FILE_SOURCE_TYPE source_type; - STRING *source; + SIMPLE_PATTERN *sources; usec_t after_ut; usec_t before_ut; @@ -298,13 +205,50 @@ typedef struct function_query_status { bool tail; bool data_only; bool slice; + size_t sampling; size_t filters; usec_t last_modified; const char *query; const char *histogram; + struct { + usec_t start_ut; // the starting time of the query - we start from this + usec_t stop_ut; // the ending time of the query - we stop at this + usec_t first_msg_ut; + + sd_id128_t first_msg_writer; + uint64_t first_msg_seqnum; + } query_file; + + struct { + uint32_t enable_after_samples; + uint32_t slots; + uint32_t sampled; + uint32_t unsampled; + uint32_t estimated; + } samples; + + struct { + uint32_t enable_after_samples; + uint32_t every; + uint32_t skipped; + uint32_t recalibrate; + uint32_t sampled; + uint32_t unsampled; + uint32_t estimated; + } samples_per_file; + + struct { + usec_t start_ut; + usec_t end_ut; + usec_t step_ut; + uint32_t 
enable_after_samples; + uint32_t sampled[SYSTEMD_JOURNAL_SAMPLING_SLOTS]; + uint32_t unsampled[SYSTEMD_JOURNAL_SAMPLING_SLOTS]; + } samples_per_time_slot; + // per file progress info - size_t cached_count; + // size_t cached_count; // progress statistics usec_t matches_setup_ut; @@ -315,20 +259,6 @@ typedef struct function_query_status { size_t file_working; } FUNCTION_QUERY_STATUS; -struct journal_file { - const char *filename; - size_t filename_len; - STRING *source; - SD_JOURNAL_FILE_SOURCE_TYPE source_type; - usec_t file_last_modified_ut; - usec_t msg_first_ut; - usec_t msg_last_ut; - usec_t last_scan_ut; - size_t size; - bool logged_failure; - usec_t max_journal_vs_realtime_delta_ut; -}; - static void log_fqs(FUNCTION_QUERY_STATUS *fqs, const char *msg) { netdata_log_error("ERROR: %s, on query " "timeframe [%"PRIu64" - %"PRIu64"], " @@ -359,25 +289,369 @@ static inline bool netdata_systemd_journal_seek_to(sd_journal *j, usec_t timesta #define JD_SOURCE_REALTIME_TIMESTAMP "_SOURCE_REALTIME_TIMESTAMP" -static inline bool parse_journal_field(const char *data, size_t data_length, const char **key, size_t *key_length, const char **value, size_t *value_length) { - const char *k = data; - const char *equal = strchr(k, '='); - if(unlikely(!equal)) - return false; +// ---------------------------------------------------------------------------- +// sampling support + +static void sampling_query_init(FUNCTION_QUERY_STATUS *fqs, FACETS *facets) { + if(!fqs->sampling) + return; - size_t kl = equal - k; + if(!fqs->slice) { + // the user is doing a full data query + // disable sampling + fqs->sampling = 0; + return; + } - const char *v = ++equal; - size_t vl = data_length - kl - 1; + if(fqs->data_only) { + // the user is doing a data query + // disable sampling + fqs->sampling = 0; + return; + } - *key = k; - *key_length = kl; - *value = v; - *value_length = vl; + if(!fqs->files_matched) { + // no files have been matched + // disable sampling + fqs->sampling = 0; + return; + 
} - return true; + fqs->samples.slots = facets_histogram_slots(facets); + if(fqs->samples.slots < 2) fqs->samples.slots = 2; + if(fqs->samples.slots > SYSTEMD_JOURNAL_SAMPLING_SLOTS) + fqs->samples.slots = SYSTEMD_JOURNAL_SAMPLING_SLOTS; + + if(!fqs->after_ut || !fqs->before_ut || fqs->after_ut >= fqs->before_ut) { + // we don't have enough information for sampling + fqs->sampling = 0; + return; + } + + usec_t delta = fqs->before_ut - fqs->after_ut; + usec_t step = delta / facets_histogram_slots(facets) - 1; + if(step < 1) step = 1; + + fqs->samples_per_time_slot.start_ut = fqs->after_ut; + fqs->samples_per_time_slot.end_ut = fqs->before_ut; + fqs->samples_per_time_slot.step_ut = step; + + // the minimum number of rows to enable sampling + fqs->samples.enable_after_samples = fqs->sampling / 2; + + size_t files_matched = fqs->files_matched; + if(!files_matched) + files_matched = 1; + + // the minimum number of rows per file to enable sampling + fqs->samples_per_file.enable_after_samples = (fqs->sampling / 4) / files_matched; + if(fqs->samples_per_file.enable_after_samples < fqs->entries) + fqs->samples_per_file.enable_after_samples = fqs->entries; + + // the minimum number of rows per time slot to enable sampling + fqs->samples_per_time_slot.enable_after_samples = (fqs->sampling / 4) / fqs->samples.slots; + if(fqs->samples_per_time_slot.enable_after_samples < fqs->entries) + fqs->samples_per_time_slot.enable_after_samples = fqs->entries; +} + +static void sampling_file_init(FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf __maybe_unused) { + fqs->samples_per_file.sampled = 0; + fqs->samples_per_file.unsampled = 0; + fqs->samples_per_file.estimated = 0; + fqs->samples_per_file.every = 0; + fqs->samples_per_file.skipped = 0; + fqs->samples_per_file.recalibrate = 0; +} + +static size_t sampling_file_lines_scanned_so_far(FUNCTION_QUERY_STATUS *fqs) { + size_t sampled = fqs->samples_per_file.sampled + fqs->samples_per_file.unsampled; + if(!sampled) sampled = 1; + 
return sampled; +} + +static void sampling_running_file_query_overlapping_timeframe_ut( + FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, FACETS_ANCHOR_DIRECTION direction, + usec_t msg_ut, usec_t *after_ut, usec_t *before_ut) { + + // find the overlap of the query and file timeframes + // taking into account the first message we encountered + + usec_t oldest_ut, newest_ut; + if(direction == FACETS_ANCHOR_DIRECTION_FORWARD) { + // the first message we know (oldest) + oldest_ut = fqs->query_file.first_msg_ut ? fqs->query_file.first_msg_ut : jf->msg_first_ut; + if(!oldest_ut) oldest_ut = fqs->query_file.start_ut; + + if(jf->msg_last_ut) + newest_ut = MIN(fqs->query_file.stop_ut, jf->msg_last_ut); + else if(jf->file_last_modified_ut) + newest_ut = MIN(fqs->query_file.stop_ut, jf->file_last_modified_ut); + else + newest_ut = fqs->query_file.stop_ut; + + if(msg_ut < oldest_ut) + oldest_ut = msg_ut - 1; + } + else /* BACKWARD */ { + // the latest message we know (newest) + newest_ut = fqs->query_file.first_msg_ut ? 
fqs->query_file.first_msg_ut : jf->msg_last_ut; + if(!newest_ut) newest_ut = fqs->query_file.start_ut; + + if(jf->msg_first_ut) + oldest_ut = MAX(fqs->query_file.stop_ut, jf->msg_first_ut); + else + oldest_ut = fqs->query_file.stop_ut; + + if(newest_ut < msg_ut) + newest_ut = msg_ut + 1; + } + + *after_ut = oldest_ut; + *before_ut = newest_ut; +} + +static double sampling_running_file_query_progress_by_time(FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, + FACETS_ANCHOR_DIRECTION direction, usec_t msg_ut) { + + usec_t after_ut, before_ut, elapsed_ut; + sampling_running_file_query_overlapping_timeframe_ut(fqs, jf, direction, msg_ut, &after_ut, &before_ut); + + if(direction == FACETS_ANCHOR_DIRECTION_FORWARD) + elapsed_ut = msg_ut - after_ut; + else + elapsed_ut = before_ut - msg_ut; + + usec_t total_ut = before_ut - after_ut; + double progress = (double)elapsed_ut / (double)total_ut; + + return progress; +} + +static usec_t sampling_running_file_query_remaining_time(FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, + FACETS_ANCHOR_DIRECTION direction, usec_t msg_ut, + usec_t *total_time_ut, usec_t *remaining_start_ut, + usec_t *remaining_end_ut) { + usec_t after_ut, before_ut; + sampling_running_file_query_overlapping_timeframe_ut(fqs, jf, direction, msg_ut, &after_ut, &before_ut); + + // since we have a timestamp in msg_ut + // this timestamp can extend the overlap + if(msg_ut <= after_ut) + after_ut = msg_ut - 1; + + if(msg_ut >= before_ut) + before_ut = msg_ut + 1; + + // return the remaining duration + usec_t remaining_from_ut, remaining_to_ut; + if(direction == FACETS_ANCHOR_DIRECTION_FORWARD) { + remaining_from_ut = msg_ut; + remaining_to_ut = before_ut; + } + else { + remaining_from_ut = after_ut; + remaining_to_ut = msg_ut; + } + + usec_t remaining_ut = remaining_to_ut - remaining_from_ut; + + if(total_time_ut) + *total_time_ut = (before_ut > after_ut) ? 
before_ut - after_ut : 1; + + if(remaining_start_ut) + *remaining_start_ut = remaining_from_ut; + + if(remaining_end_ut) + *remaining_end_ut = remaining_to_ut; + + return remaining_ut; +} + +static size_t sampling_running_file_query_estimate_remaining_lines_by_time(FUNCTION_QUERY_STATUS *fqs, + struct journal_file *jf, + FACETS_ANCHOR_DIRECTION direction, + usec_t msg_ut) { + size_t scanned_lines = sampling_file_lines_scanned_so_far(fqs); + + // Calculate the proportion of time covered + usec_t total_time_ut, remaining_start_ut, remaining_end_ut; + usec_t remaining_time_ut = sampling_running_file_query_remaining_time(fqs, jf, direction, msg_ut, &total_time_ut, + &remaining_start_ut, &remaining_end_ut); + if (total_time_ut == 0) total_time_ut = 1; + + double proportion_by_time = (double) (total_time_ut - remaining_time_ut) / (double) total_time_ut; + + if (proportion_by_time == 0 || proportion_by_time > 1.0 || !isfinite(proportion_by_time)) + proportion_by_time = 1.0; + + // Estimate the total number of lines in the file + size_t expected_matching_logs_by_time = (size_t)((double)scanned_lines / proportion_by_time); + + if(jf->messages_in_file && expected_matching_logs_by_time > jf->messages_in_file) + expected_matching_logs_by_time = jf->messages_in_file; + + // Calculate the estimated number of remaining lines + size_t remaining_logs_by_time = expected_matching_logs_by_time - scanned_lines; + if (remaining_logs_by_time < 1) remaining_logs_by_time = 1; + +// nd_log(NDLS_COLLECTORS, NDLP_INFO, +// "JOURNAL ESTIMATION: '%s' " +// "scanned_lines=%zu [sampled=%zu, unsampled=%zu, estimated=%zu], " +// "file [%"PRIu64" - %"PRIu64", duration %"PRId64", known lines in file %zu], " +// "query [%"PRIu64" - %"PRIu64", duration %"PRId64"], " +// "first message read from the file at %"PRIu64", current message at %"PRIu64", " +// "proportion of time %.2f %%, " +// "expected total lines in file %zu, " +// "remaining lines %zu, " +// "remaining time %"PRIu64" [%"PRIu64" - 
%"PRIu64", duration %"PRId64"]" +// , jf->filename +// , scanned_lines, fqs->samples_per_file.sampled, fqs->samples_per_file.unsampled, fqs->samples_per_file.estimated +// , jf->msg_first_ut, jf->msg_last_ut, jf->msg_last_ut - jf->msg_first_ut, jf->messages_in_file +// , fqs->query_file.start_ut, fqs->query_file.stop_ut, fqs->query_file.stop_ut - fqs->query_file.start_ut +// , fqs->query_file.first_msg_ut, msg_ut +// , proportion_by_time * 100.0 +// , expected_matching_logs_by_time +// , remaining_logs_by_time +// , remaining_time_ut, remaining_start_ut, remaining_end_ut, remaining_end_ut - remaining_start_ut +// ); + + return remaining_logs_by_time; +} + +static size_t sampling_running_file_query_estimate_remaining_lines(sd_journal *j, FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, FACETS_ANCHOR_DIRECTION direction, usec_t msg_ut) { + size_t expected_matching_logs_by_seqnum = 0; + double proportion_by_seqnum = 0.0; + size_t remaining_logs_by_seqnum = 0; + +#ifdef HAVE_SD_JOURNAL_GET_SEQNUM + uint64_t current_msg_seqnum; + sd_id128_t current_msg_writer; + if(!fqs->query_file.first_msg_seqnum || sd_journal_get_seqnum(j, ¤t_msg_seqnum, ¤t_msg_writer) < 0) { + fqs->query_file.first_msg_seqnum = 0; + fqs->query_file.first_msg_writer = SD_ID128_NULL; + } + else if(jf->messages_in_file) { + size_t scanned_lines = sampling_file_lines_scanned_so_far(fqs); + + double proportion_of_all_lines_so_far; + if(direction == FACETS_ANCHOR_DIRECTION_FORWARD) + proportion_of_all_lines_so_far = (double)scanned_lines / (double)(current_msg_seqnum - jf->first_seqnum); + else + proportion_of_all_lines_so_far = (double)scanned_lines / (double)(jf->last_seqnum - current_msg_seqnum); + + if(proportion_of_all_lines_so_far > 1.0) + proportion_of_all_lines_so_far = 1.0; + + expected_matching_logs_by_seqnum = (size_t)(proportion_of_all_lines_so_far * (double)jf->messages_in_file); + + proportion_by_seqnum = (double)scanned_lines / (double)expected_matching_logs_by_seqnum; + + if 
(proportion_by_seqnum == 0 || proportion_by_seqnum > 1.0 || !isfinite(proportion_by_seqnum)) + proportion_by_seqnum = 1.0; + + remaining_logs_by_seqnum = expected_matching_logs_by_seqnum - scanned_lines; + if(!remaining_logs_by_seqnum) remaining_logs_by_seqnum = 1; + } +#endif + + if(remaining_logs_by_seqnum) + return remaining_logs_by_seqnum; + + return sampling_running_file_query_estimate_remaining_lines_by_time(fqs, jf, direction, msg_ut); +} + +static void sampling_decide_file_sampling_every(sd_journal *j, FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, FACETS_ANCHOR_DIRECTION direction, usec_t msg_ut) { + size_t files_matched = fqs->files_matched; + if(!files_matched) files_matched = 1; + + size_t remaining_lines = sampling_running_file_query_estimate_remaining_lines(j, fqs, jf, direction, msg_ut); + size_t wanted_samples = (fqs->sampling / 2) / files_matched; + if(!wanted_samples) wanted_samples = 1; + + fqs->samples_per_file.every = remaining_lines / wanted_samples; + + if(fqs->samples_per_file.every < 1) + fqs->samples_per_file.every = 1; +} + +typedef enum { + SAMPLING_STOP_AND_ESTIMATE = -1, + SAMPLING_FULL = 0, + SAMPLING_SKIP_FIELDS = 1, +} sampling_t; + +static inline sampling_t is_row_in_sample(sd_journal *j, FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, usec_t msg_ut, FACETS_ANCHOR_DIRECTION direction, bool candidate_to_keep) { + if(!fqs->sampling || candidate_to_keep) + return SAMPLING_FULL; + + if(unlikely(msg_ut < fqs->samples_per_time_slot.start_ut)) + msg_ut = fqs->samples_per_time_slot.start_ut; + if(unlikely(msg_ut > fqs->samples_per_time_slot.end_ut)) + msg_ut = fqs->samples_per_time_slot.end_ut; + + size_t slot = (msg_ut - fqs->samples_per_time_slot.start_ut) / fqs->samples_per_time_slot.step_ut; + if(slot >= fqs->samples.slots) + slot = fqs->samples.slots - 1; + + bool should_sample = false; + + if(fqs->samples.sampled < fqs->samples.enable_after_samples || + fqs->samples_per_file.sampled < 
fqs->samples_per_file.enable_after_samples || + fqs->samples_per_time_slot.sampled[slot] < fqs->samples_per_time_slot.enable_after_samples) + should_sample = true; + + else if(fqs->samples_per_file.recalibrate >= SYSTEMD_JOURNAL_SAMPLING_RECALIBRATE || !fqs->samples_per_file.every) { + // this is the first to be unsampled for this file + sampling_decide_file_sampling_every(j, fqs, jf, direction, msg_ut); + fqs->samples_per_file.recalibrate = 0; + should_sample = true; + } + else { + // we sample 1 every fqs->samples_per_file.every + if(fqs->samples_per_file.skipped >= fqs->samples_per_file.every) { + fqs->samples_per_file.skipped = 0; + should_sample = true; + } + else + fqs->samples_per_file.skipped++; + } + + if(should_sample) { + fqs->samples.sampled++; + fqs->samples_per_file.sampled++; + fqs->samples_per_time_slot.sampled[slot]++; + + return SAMPLING_FULL; + } + + fqs->samples_per_file.recalibrate++; + + fqs->samples.unsampled++; + fqs->samples_per_file.unsampled++; + fqs->samples_per_time_slot.unsampled[slot]++; + + if(fqs->samples_per_file.unsampled > fqs->samples_per_file.sampled) { + double progress_by_time = sampling_running_file_query_progress_by_time(fqs, jf, direction, msg_ut); + + if(progress_by_time > SYSTEMD_JOURNAL_ENABLE_ESTIMATIONS_FILE_PERCENTAGE) + return SAMPLING_STOP_AND_ESTIMATE; + } + + return SAMPLING_SKIP_FIELDS; +} + +static void sampling_update_running_query_file_estimates(FACETS *facets, sd_journal *j, FUNCTION_QUERY_STATUS *fqs, struct journal_file *jf, usec_t msg_ut, FACETS_ANCHOR_DIRECTION direction) { + usec_t total_time_ut, remaining_start_ut, remaining_end_ut; + sampling_running_file_query_remaining_time(fqs, jf, direction, msg_ut, &total_time_ut, &remaining_start_ut, + &remaining_end_ut); + size_t remaining_lines = sampling_running_file_query_estimate_remaining_lines(j, fqs, jf, direction, msg_ut); + facets_update_estimations(facets, remaining_start_ut, remaining_end_ut, remaining_lines); + fqs->samples.estimated += 
remaining_lines; + fqs->samples_per_file.estimated += remaining_lines; } +// ---------------------------------------------------------------------------- + static inline size_t netdata_systemd_journal_process_row(sd_journal *j, FACETS *facets, struct journal_file *jf, usec_t *msg_ut) { const void *data; size_t length, bytes = 0; @@ -454,11 +728,15 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_backward( usec_t stop_ut = (fqs->data_only && fqs->anchor.stop_ut) ? fqs->anchor.stop_ut : fqs->after_ut; bool stop_when_full = (fqs->data_only && !fqs->anchor.stop_ut); + fqs->query_file.start_ut = start_ut; + fqs->query_file.stop_ut = stop_ut; + if(!netdata_systemd_journal_seek_to(j, start_ut)) return ND_SD_JOURNAL_FAILED_TO_SEEK; size_t errors_no_timestamp = 0; - usec_t earliest_msg_ut = 0; + usec_t latest_msg_ut = 0; // the biggest timestamp we have seen so far + usec_t first_msg_ut = 0; // the first message we got from the db size_t row_counter = 0, last_row_counter = 0, rows_useful = 0; size_t bytes = 0, last_bytes = 0; @@ -475,44 +753,68 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_backward( continue; } - if(unlikely(msg_ut > earliest_msg_ut)) - earliest_msg_ut = msg_ut; - if (unlikely(msg_ut > start_ut)) continue; if (unlikely(msg_ut < stop_ut)) break; - bytes += netdata_systemd_journal_process_row(j, facets, jf, &msg_ut); + if(unlikely(msg_ut > latest_msg_ut)) + latest_msg_ut = msg_ut; - // make sure each line gets a unique timestamp - if(unlikely(msg_ut >= last_usec_from && msg_ut <= last_usec_to)) - msg_ut = --last_usec_from; - else - last_usec_from = last_usec_to = msg_ut; - - if(facets_row_finished(facets, msg_ut)) - rows_useful++; - - row_counter++; - if(unlikely((row_counter % FUNCTION_DATA_ONLY_CHECK_EVERY_ROWS) == 0 && - stop_when_full && - facets_rows(facets) >= fqs->entries)) { - // stop the data only query - usec_t oldest = facets_row_oldest_ut(facets); - if(oldest && msg_ut < (oldest - anchor_delta)) - break; + if(unlikely(!first_msg_ut)) 
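The `is_row_in_sample()` logic above keeps one row every `samples_per_file.every` rows once the warm-up budgets are exhausted, counting the rest as unsampled. The skip counter can be sketched in isolation like this (hypothetical standalone names; the plugin keeps this state inside `FUNCTION_QUERY_STATUS`):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal sketch of the "sample 1 row every N" counter used
 * by is_row_in_sample() above: once warm-up is over, only one out of every
 * (every + 1) rows is fully processed; the rest are merely counted. */
struct sampler {
    size_t every;    /* skip this many rows between kept ones */
    size_t skipped;  /* rows skipped since the last kept one  */
};

static bool sampler_keep_row(struct sampler *s) {
    if(s->every <= 1)
        return true;             /* sampling effectively disabled: keep everything */

    if(s->skipped >= s->every) { /* time to keep a row again */
        s->skipped = 0;
        return true;
    }

    s->skipped++;                /* skip this row, remember we did */
    return false;
}
```

For `every = 4` the pattern is skip, skip, skip, skip, keep - 20 rows kept out of 100.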
{ + first_msg_ut = msg_ut; + fqs->query_file.first_msg_ut = msg_ut; + +#ifdef HAVE_SD_JOURNAL_GET_SEQNUM + if(sd_journal_get_seqnum(j, &fqs->query_file.first_msg_seqnum, &fqs->query_file.first_msg_writer) < 0) { + fqs->query_file.first_msg_seqnum = 0; + fqs->query_file.first_msg_writer = SD_ID128_NULL; + } +#endif } - if(unlikely(row_counter % FUNCTION_PROGRESS_EVERY_ROWS == 0)) { - FUNCTION_PROGRESS_UPDATE_ROWS(fqs->rows_read, row_counter - last_row_counter); - last_row_counter = row_counter; + sampling_t sample = is_row_in_sample(j, fqs, jf, msg_ut, + FACETS_ANCHOR_DIRECTION_BACKWARD, + facets_row_candidate_to_keep(facets, msg_ut)); + + if(sample == SAMPLING_FULL) { + bytes += netdata_systemd_journal_process_row(j, facets, jf, &msg_ut); + + // make sure each line gets a unique timestamp + if(unlikely(msg_ut >= last_usec_from && msg_ut <= last_usec_to)) + msg_ut = --last_usec_from; + else + last_usec_from = last_usec_to = msg_ut; + + if(facets_row_finished(facets, msg_ut)) + rows_useful++; + + row_counter++; + if(unlikely((row_counter % FUNCTION_DATA_ONLY_CHECK_EVERY_ROWS) == 0 && + stop_when_full && + facets_rows(facets) >= fqs->entries)) { + // stop the data only query + usec_t oldest = facets_row_oldest_ut(facets); + if(oldest && msg_ut < (oldest - anchor_delta)) + break; + } + + if(unlikely(row_counter % FUNCTION_PROGRESS_EVERY_ROWS == 0)) { + FUNCTION_PROGRESS_UPDATE_ROWS(fqs->rows_read, row_counter - last_row_counter); + last_row_counter = row_counter; - FUNCTION_PROGRESS_UPDATE_BYTES(fqs->bytes_read, bytes - last_bytes); - last_bytes = bytes; + FUNCTION_PROGRESS_UPDATE_BYTES(fqs->bytes_read, bytes - last_bytes); + last_bytes = bytes; - status = check_stop(fqs->cancelled, &fqs->stop_monotonic_ut); + status = check_stop(fqs->cancelled, &fqs->stop_monotonic_ut); + } + } + else if(sample == SAMPLING_SKIP_FIELDS) + facets_row_finished_unsampled(facets, msg_ut); + else { + sampling_update_running_query_file_estimates(facets, j, fqs, jf, msg_ut, 
FACETS_ANCHOR_DIRECTION_BACKWARD); + break; } } @@ -524,8 +826,8 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_backward( if(errors_no_timestamp) netdata_log_error("SYSTEMD-JOURNAL: %zu lines did not have timestamps", errors_no_timestamp); - if(earliest_msg_ut > fqs->last_modified) - fqs->last_modified = earliest_msg_ut; + if(latest_msg_ut > fqs->last_modified) + fqs->last_modified = latest_msg_ut; return status; } @@ -540,11 +842,15 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_forward( usec_t stop_ut = ((fqs->data_only && fqs->anchor.stop_ut) ? fqs->anchor.stop_ut : fqs->before_ut) + anchor_delta; bool stop_when_full = (fqs->data_only && !fqs->anchor.stop_ut); + fqs->query_file.start_ut = start_ut; + fqs->query_file.stop_ut = stop_ut; + if(!netdata_systemd_journal_seek_to(j, start_ut)) return ND_SD_JOURNAL_FAILED_TO_SEEK; size_t errors_no_timestamp = 0; - usec_t earliest_msg_ut = 0; + usec_t latest_msg_ut = 0; // the biggest timestamp we have seen so far + usec_t first_msg_ut = 0; // the first message we got from the db size_t row_counter = 0, last_row_counter = 0, rows_useful = 0; size_t bytes = 0, last_bytes = 0; @@ -561,44 +867,61 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_forward( continue; } - if(likely(msg_ut > earliest_msg_ut)) - earliest_msg_ut = msg_ut; - if (unlikely(msg_ut < start_ut)) continue; if (unlikely(msg_ut > stop_ut)) break; - bytes += netdata_systemd_journal_process_row(j, facets, jf, &msg_ut); + if(likely(msg_ut > latest_msg_ut)) + latest_msg_ut = msg_ut; - // make sure each line gets a unique timestamp - if(unlikely(msg_ut >= last_usec_from && msg_ut <= last_usec_to)) - msg_ut = ++last_usec_to; - else - last_usec_from = last_usec_to = msg_ut; - - if(facets_row_finished(facets, msg_ut)) - rows_useful++; - - row_counter++; - if(unlikely((row_counter % FUNCTION_DATA_ONLY_CHECK_EVERY_ROWS) == 0 && - stop_when_full && - facets_rows(facets) >= fqs->entries)) { - // stop the data only query - usec_t newest = 
facets_row_newest_ut(facets); - if(newest && msg_ut > (newest + anchor_delta)) - break; + if(unlikely(!first_msg_ut)) { + first_msg_ut = msg_ut; + fqs->query_file.first_msg_ut = msg_ut; } - if(unlikely(row_counter % FUNCTION_PROGRESS_EVERY_ROWS == 0)) { - FUNCTION_PROGRESS_UPDATE_ROWS(fqs->rows_read, row_counter - last_row_counter); - last_row_counter = row_counter; + sampling_t sample = is_row_in_sample(j, fqs, jf, msg_ut, + FACETS_ANCHOR_DIRECTION_FORWARD, + facets_row_candidate_to_keep(facets, msg_ut)); + + if(sample == SAMPLING_FULL) { + bytes += netdata_systemd_journal_process_row(j, facets, jf, &msg_ut); - FUNCTION_PROGRESS_UPDATE_BYTES(fqs->bytes_read, bytes - last_bytes); - last_bytes = bytes; + // make sure each line gets a unique timestamp + if(unlikely(msg_ut >= last_usec_from && msg_ut <= last_usec_to)) + msg_ut = ++last_usec_to; + else + last_usec_from = last_usec_to = msg_ut; + + if(facets_row_finished(facets, msg_ut)) + rows_useful++; + + row_counter++; + if(unlikely((row_counter % FUNCTION_DATA_ONLY_CHECK_EVERY_ROWS) == 0 && + stop_when_full && + facets_rows(facets) >= fqs->entries)) { + // stop the data only query + usec_t newest = facets_row_newest_ut(facets); + if(newest && msg_ut > (newest + anchor_delta)) + break; + } + + if(unlikely(row_counter % FUNCTION_PROGRESS_EVERY_ROWS == 0)) { + FUNCTION_PROGRESS_UPDATE_ROWS(fqs->rows_read, row_counter - last_row_counter); + last_row_counter = row_counter; + + FUNCTION_PROGRESS_UPDATE_BYTES(fqs->bytes_read, bytes - last_bytes); + last_bytes = bytes; - status = check_stop(fqs->cancelled, &fqs->stop_monotonic_ut); + status = check_stop(fqs->cancelled, &fqs->stop_monotonic_ut); + } + } + else if(sample == SAMPLING_SKIP_FIELDS) + facets_row_finished_unsampled(facets, msg_ut); + else { + sampling_update_running_query_file_estimates(facets, j, fqs, jf, msg_ut, FACETS_ANCHOR_DIRECTION_FORWARD); + break; } } @@ -610,8 +933,8 @@ ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_forward( if(errors_no_timestamp) 
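Both query directions carry a "make sure each line gets a unique timestamp" step: the backward query decrements below the occupied microsecond range, while the forward query increments above it, so facet rows never collide on their timestamp key. A standalone sketch of the forward variant (assumed semantics, hypothetical names):

```c
#include <stdint.h>

typedef uint64_t usec_t;

/* Sketch of the forward-direction timestamp de-duplication above: when
 * consecutive journal entries share the same microsecond timestamp, bump
 * the duplicate one microsecond above the last timestamp handed out.
 * (Standalone sketch; the plugin keeps this state in local variables.) */
static usec_t unique_ut_forward(usec_t msg_ut, usec_t *last_from, usec_t *last_to) {
    if(msg_ut >= *last_from && msg_ut <= *last_to)
        return ++(*last_to);          /* duplicate: take the next free microsecond */

    *last_from = *last_to = msg_ut;   /* new timestamp: reset the occupied window */
    return msg_ut;
}
```

The backward variant mirrors this with `--last_usec_from`, handing out microseconds below the window instead.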
netdata_log_error("SYSTEMD-JOURNAL: %zu lines did not have timestamps", errors_no_timestamp); - if(earliest_msg_ut > fqs->last_modified) - fqs->last_modified = earliest_msg_ut; + if(latest_msg_ut > fqs->last_modified) + fqs->last_modified = latest_msg_ut; return status; } @@ -723,6 +1046,7 @@ static ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_one_file( }; if(sd_journal_open_files(&j, paths, ND_SD_JOURNAL_OPEN_FLAGS) < 0 || !j) { + netdata_log_error("JOURNAL: cannot open file '%s' for query", filename); fstat_cache_disable_on_thread(); return ND_SD_JOURNAL_FAILED_TO_OPEN; } @@ -756,432 +1080,18 @@ static ND_SD_JOURNAL_STATUS netdata_systemd_journal_query_one_file( return status; } -// ---------------------------------------------------------------------------- -// journal files registry - -#define VAR_LOG_JOURNAL_MAX_DEPTH 10 -#define MAX_JOURNAL_DIRECTORIES 100 - -struct journal_directory { - char *path; - bool logged_failure; -}; - -static struct journal_directory journal_directories[MAX_JOURNAL_DIRECTORIES] = { 0 }; -static DICTIONARY *journal_files_registry = NULL; -static DICTIONARY *used_hashes_registry = NULL; - -static usec_t systemd_journal_session = 0; - -static void buffer_json_journal_versions(BUFFER *wb) { - buffer_json_member_add_object(wb, "versions"); - { - buffer_json_member_add_uint64(wb, "sources", - systemd_journal_session + dictionary_version(journal_files_registry)); - } - buffer_json_object_close(wb); -} - -static void journal_file_update_msg_ut(const char *filename, struct journal_file *jf) { - fstat_cache_enable_on_thread(); - - const char *files[2] = { - [0] = filename, - [1] = NULL, - }; - - sd_journal *j = NULL; - if(sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS) < 0 || !j) { - fstat_cache_disable_on_thread(); - - if(!jf->logged_failure) { - netdata_log_error("cannot open journal file '%s', using file timestamps to understand time-frame.", filename); - jf->logged_failure = true; - } - - jf->msg_first_ut = 0; - 
jf->msg_last_ut = jf->file_last_modified_ut; - return; - } - - usec_t first_ut = 0, last_ut = 0; - - if(sd_journal_seek_head(j) < 0 || sd_journal_next(j) < 0 || sd_journal_get_realtime_usec(j, &first_ut) < 0 || !first_ut) { - internal_error(true, "cannot find the timestamp of the first message in '%s'", filename); - first_ut = 0; - } - - if(sd_journal_seek_tail(j) < 0 || sd_journal_previous(j) < 0 || sd_journal_get_realtime_usec(j, &last_ut) < 0 || !last_ut) { - internal_error(true, "cannot find the timestamp of the last message in '%s'", filename); - last_ut = jf->file_last_modified_ut; - } - - sd_journal_close(j); - fstat_cache_disable_on_thread(); - - if(first_ut > last_ut) { - internal_error(true, "timestamps are flipped in file '%s'", filename); - usec_t t = first_ut; - first_ut = last_ut; - last_ut = t; - } - - jf->msg_first_ut = first_ut; - jf->msg_last_ut = last_ut; -} - -static STRING *string_strdupz_source(const char *s, const char *e, size_t max_len, const char *prefix) { - char buf[max_len]; - size_t len; - char *dst = buf; - - if(prefix) { - len = strlen(prefix); - memcpy(buf, prefix, len); - dst = &buf[len]; - max_len -= len; - } - - len = e - s; - if(len >= max_len) - len = max_len - 1; - memcpy(dst, s, len); - dst[len] = '\0'; - buf[max_len - 1] = '\0'; - - for(size_t i = 0; buf[i] ;i++) - if(!isalnum(buf[i]) && buf[i] != '-' && buf[i] != '.' && buf[i] != ':') - buf[i] = '_'; - - return string_strdupz(buf); -} - -static void files_registry_insert_cb(const DICTIONARY_ITEM *item, void *value, void *data __maybe_unused) { - struct journal_file *jf = value; - jf->filename = dictionary_acquired_item_name(item); - jf->filename_len = strlen(jf->filename); - - // based on the filename - // decide the source to show to the user - const char *s = strrchr(jf->filename, '/'); - if(s) { - if(strstr(jf->filename, "/remote/")) - jf->source_type = SDJF_REMOTE; - else { - const char *t = s - 1; - while(t >= jf->filename && *t != '.' 
&& *t != '/') - t--; - - if(t >= jf->filename && *t == '.') { - jf->source_type = SDJF_NAMESPACE; - jf->source = string_strdupz_source(t + 1, s, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "namespace-"); - } - else - jf->source_type = SDJF_LOCAL; - } - - if(strncmp(s, "/system", 7) == 0) - jf->source_type |= SDJF_SYSTEM; - - else if(strncmp(s, "/user", 5) == 0) - jf->source_type |= SDJF_USER; - - else if(strncmp(s, "/remote-", 8) == 0) { - jf->source_type |= SDJF_REMOTE; - - s = &s[8]; // skip "/remote-" - - char *e = strchr(s, '@'); - if(!e) - e = strstr(s, ".journal"); - - if(e) { - const char *d = s; - for(; d < e && (isdigit(*d) || *d == '.' || *d == ':') ; d++) ; - if(d == e) { - // a valid IP address - char ip[e - s + 1]; - memcpy(ip, s, e - s); - ip[e - s] = '\0'; - char buf[SYSTEMD_JOURNAL_MAX_SOURCE_LEN]; - if(ip_to_hostname(ip, buf, sizeof(buf))) - jf->source = string_strdupz_source(buf, &buf[strlen(buf)], SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); - else { - internal_error(true, "Cannot find the hostname for IP '%s'", ip); - jf->source = string_strdupz_source(s, e, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); - } - } - else - jf->source = string_strdupz_source(s, e, SYSTEMD_JOURNAL_MAX_SOURCE_LEN, "remote-"); - } - else - jf->source_type |= SDJF_OTHER; - } - else - jf->source_type |= SDJF_OTHER; - } - else - jf->source_type = SDJF_LOCAL | SDJF_OTHER; - - journal_file_update_msg_ut(jf->filename, jf); - - internal_error(true, - "found journal file '%s', type %d, source '%s', " - "file modified: %"PRIu64", " - "msg {first: %"PRIu64", last: %"PRIu64"}", - jf->filename, jf->source_type, jf->source ? 
string2str(jf->source) : "<unset>", - jf->file_last_modified_ut, - jf->msg_first_ut, jf->msg_last_ut); -} - -static bool files_registry_conflict_cb(const DICTIONARY_ITEM *item, void *old_value, void *new_value, void *data __maybe_unused) { - struct journal_file *jf = old_value; - struct journal_file *njf = new_value; - - if(njf->last_scan_ut > jf->last_scan_ut) - jf->last_scan_ut = njf->last_scan_ut; - - if(njf->file_last_modified_ut > jf->file_last_modified_ut) { - jf->file_last_modified_ut = njf->file_last_modified_ut; - jf->size = njf->size; - - const char *filename = dictionary_acquired_item_name(item); - journal_file_update_msg_ut(filename, jf); - -// internal_error(true, -// "updated journal file '%s', type %d, " -// "file modified: %"PRIu64", " -// "msg {first: %"PRIu64", last: %"PRIu64"}", -// filename, jf->source_type, -// jf->file_last_modified_ut, -// jf->msg_first_ut, jf->msg_last_ut); - } - - return false; -} - -#define SDJF_SOURCE_ALL_NAME "all" -#define SDJF_SOURCE_LOCAL_NAME "all-local-logs" -#define SDJF_SOURCE_LOCAL_SYSTEM_NAME "all-local-system-logs" -#define SDJF_SOURCE_LOCAL_USERS_NAME "all-local-user-logs" -#define SDJF_SOURCE_LOCAL_OTHER_NAME "all-uncategorized" -#define SDJF_SOURCE_NAMESPACES_NAME "all-local-namespaces" -#define SDJF_SOURCE_REMOTES_NAME "all-remote-systems" - -struct journal_file_source { - usec_t first_ut; - usec_t last_ut; - size_t count; - uint64_t size; -}; - -static void human_readable_size_ib(uint64_t size, char *dst, size_t dst_len) { - if(size > 1024ULL * 1024 * 1024 * 1024) - snprintfz(dst, dst_len, "%0.2f TiB", (double)size / 1024.0 / 1024.0 / 1024.0 / 1024.0); - else if(size > 1024ULL * 1024 * 1024) - snprintfz(dst, dst_len, "%0.2f GiB", (double)size / 1024.0 / 1024.0 / 1024.0); - else if(size > 1024ULL * 1024) - snprintfz(dst, dst_len, "%0.2f MiB", (double)size / 1024.0 / 1024.0); - else if(size > 1024ULL) - snprintfz(dst, dst_len, "%0.2f KiB", (double)size / 1024.0); - else - snprintfz(dst, dst_len, "%"PRIu64" 
B", size); -} - -#define print_duration(dst, dst_len, pos, remaining, duration, one, many, printed) do { \ - if((remaining) > (duration)) { \ - uint64_t _count = (remaining) / (duration); \ - uint64_t _rem = (remaining) - (_count * (duration)); \ - (pos) += snprintfz(&(dst)[pos], (dst_len) - (pos), "%s%s%"PRIu64" %s", (printed) ? ", " : "", _rem ? "" : "and ", _count, _count > 1 ? (many) : (one)); \ - (remaining) = _rem; \ - (printed) = true; \ - } \ -} while(0) - -static void human_readable_duration_s(time_t duration_s, char *dst, size_t dst_len) { - if(duration_s < 0) - duration_s = -duration_s; - - size_t pos = 0; - dst[0] = 0 ; - - bool printed = false; - print_duration(dst, dst_len, pos, duration_s, 86400 * 365, "year", "years", printed); - print_duration(dst, dst_len, pos, duration_s, 86400 * 30, "month", "months", printed); - print_duration(dst, dst_len, pos, duration_s, 86400 * 1, "day", "days", printed); - print_duration(dst, dst_len, pos, duration_s, 3600 * 1, "hour", "hours", printed); - print_duration(dst, dst_len, pos, duration_s, 60 * 1, "min", "mins", printed); - print_duration(dst, dst_len, pos, duration_s, 1, "sec", "secs", printed); -} - -static int journal_file_to_json_array_cb(const DICTIONARY_ITEM *item, void *entry, void *data) { - struct journal_file_source *jfs = entry; - BUFFER *wb = data; - - const char *name = dictionary_acquired_item_name(item); - - buffer_json_add_array_item_object(wb); - { - char size_for_humans[100]; - human_readable_size_ib(jfs->size, size_for_humans, sizeof(size_for_humans)); - - char duration_for_humans[1024]; - human_readable_duration_s((time_t)((jfs->last_ut - jfs->first_ut) / USEC_PER_SEC), - duration_for_humans, sizeof(duration_for_humans)); - - char info[1024]; - snprintfz(info, sizeof(info), "%zu files, with a total size of %s, covering %s", - jfs->count, size_for_humans, duration_for_humans); - - buffer_json_member_add_string(wb, "id", name); - buffer_json_member_add_string(wb, "name", name); - 
buffer_json_member_add_string(wb, "pill", size_for_humans); - buffer_json_member_add_string(wb, "info", info); - } - buffer_json_object_close(wb); // options object - - return 1; -} - -static bool journal_file_merge_sizes(const DICTIONARY_ITEM *item __maybe_unused, void *old_value, void *new_value , void *data __maybe_unused) { - struct journal_file_source *jfs = old_value, *njfs = new_value; - jfs->count += njfs->count; - jfs->size += njfs->size; - - if(njfs->first_ut && njfs->first_ut < jfs->first_ut) - jfs->first_ut = njfs->first_ut; - - if(njfs->last_ut && njfs->last_ut > jfs->last_ut) - jfs->last_ut = njfs->last_ut; - - return false; -} - -static void available_journal_file_sources_to_json_array(BUFFER *wb) { - DICTIONARY *dict = dictionary_create(DICT_OPTION_SINGLE_THREADED|DICT_OPTION_NAME_LINK_DONT_CLONE|DICT_OPTION_DONT_OVERWRITE_VALUE); - dictionary_register_conflict_callback(dict, journal_file_merge_sizes, NULL); - - struct journal_file_source t = { 0 }; - - struct journal_file *jf; - dfe_start_read(journal_files_registry, jf) { - t.first_ut = jf->msg_first_ut; - t.last_ut = jf->msg_last_ut; - t.count = 1; - t.size = jf->size; - - dictionary_set(dict, SDJF_SOURCE_ALL_NAME, &t, sizeof(t)); - - if((jf->source_type & (SDJF_LOCAL)) == (SDJF_LOCAL)) - dictionary_set(dict, SDJF_SOURCE_LOCAL_NAME, &t, sizeof(t)); - if((jf->source_type & (SDJF_LOCAL | SDJF_SYSTEM)) == (SDJF_LOCAL | SDJF_SYSTEM)) - dictionary_set(dict, SDJF_SOURCE_LOCAL_SYSTEM_NAME, &t, sizeof(t)); - if((jf->source_type & (SDJF_LOCAL | SDJF_USER)) == (SDJF_LOCAL | SDJF_USER)) - dictionary_set(dict, SDJF_SOURCE_LOCAL_USERS_NAME, &t, sizeof(t)); - if((jf->source_type & (SDJF_LOCAL | SDJF_OTHER)) == (SDJF_LOCAL | SDJF_OTHER)) - dictionary_set(dict, SDJF_SOURCE_LOCAL_OTHER_NAME, &t, sizeof(t)); - if((jf->source_type & (SDJF_NAMESPACE)) == (SDJF_NAMESPACE)) - dictionary_set(dict, SDJF_SOURCE_NAMESPACES_NAME, &t, sizeof(t)); - if((jf->source_type & (SDJF_REMOTE)) == (SDJF_REMOTE)) - 
dictionary_set(dict, SDJF_SOURCE_REMOTES_NAME, &t, sizeof(t)); - if(jf->source) - dictionary_set(dict, string2str(jf->source), &t, sizeof(t)); - } - dfe_done(jf); - - dictionary_sorted_walkthrough_read(dict, journal_file_to_json_array_cb, wb); - - dictionary_destroy(dict); -} - -static void files_registry_delete_cb(const DICTIONARY_ITEM *item, void *value, void *data __maybe_unused) { - struct journal_file *jf = value; (void)jf; - const char *filename = dictionary_acquired_item_name(item); (void)filename; - - string_freez(jf->source); - internal_error(true, "removed journal file '%s'", filename); -} - -void journal_directory_scan(const char *dirname, int depth, usec_t last_scan_ut) { - static const char *ext = ".journal"; - static const size_t ext_len = sizeof(".journal") - 1; - - if (depth > VAR_LOG_JOURNAL_MAX_DEPTH) - return; - - DIR *dir; - struct dirent *entry; - struct stat info; - char absolute_path[FILENAME_MAX]; - - // Open the directory. - if ((dir = opendir(dirname)) == NULL) { - if(errno != ENOENT && errno != ENOTDIR) - netdata_log_error("Cannot opendir() '%s'", dirname); - return; - } - - // Read each entry in the directory. - while ((entry = readdir(dir)) != NULL) { - snprintfz(absolute_path, sizeof(absolute_path), "%s/%s", dirname, entry->d_name); - if (stat(absolute_path, &info) != 0) { - netdata_log_error("Failed to stat() '%s", absolute_path); - continue; - } - - if (S_ISDIR(info.st_mode)) { - // If entry is a directory, call traverse recursively. - if (strcmp(entry->d_name, ".") != 0 && strcmp(entry->d_name, "..") != 0) - journal_directory_scan(absolute_path, depth + 1, last_scan_ut); - - } - else if (S_ISREG(info.st_mode)) { - // If entry is a regular file, check if it ends with .journal. 
- char *filename = entry->d_name; - size_t len = strlen(filename); - - if (len > ext_len && strcmp(filename + len - ext_len, ext) == 0) { - struct journal_file t = { - .file_last_modified_ut = info.st_mtim.tv_sec * USEC_PER_SEC + info.st_mtim.tv_nsec / NSEC_PER_USEC, - .last_scan_ut = last_scan_ut, - .size = info.st_size, - .max_journal_vs_realtime_delta_ut = JOURNAL_VS_REALTIME_DELTA_DEFAULT_UT, - }; - dictionary_set(journal_files_registry, absolute_path, &t, sizeof(t)); - } - } - } - - closedir(dir); -} - -static void journal_files_registry_update() { - usec_t scan_ut = now_monotonic_usec(); - - for(unsigned i = 0; i < MAX_JOURNAL_DIRECTORIES ;i++) { - if(!journal_directories[i].path) - break; - - journal_directory_scan(journal_directories[i].path, 0, scan_ut); - } - - struct journal_file *jf; - dfe_start_write(journal_files_registry, jf) { - if(jf->last_scan_ut < scan_ut) - dictionary_del(journal_files_registry, jf_dfe.name); - } - dfe_done(jf); -} - -// ---------------------------------------------------------------------------- - static bool jf_is_mine(struct journal_file *jf, FUNCTION_QUERY_STATUS *fqs) { - if((fqs->source_type == SDJF_ALL || (jf->source_type & fqs->source_type) == fqs->source_type) && - (!fqs->source || fqs->source == jf->source)) { + if((fqs->source_type == SDJF_NONE && !fqs->sources) || (jf->source_type & fqs->source_type) || + (fqs->sources && simple_pattern_matches(fqs->sources, string2str(jf->source)))) { + + if(!jf->msg_first_ut || !jf->msg_last_ut) + // the file is not scanned yet, or the timestamps have not been updated, + // so we don't know if it can contribute or not - let's add it. 
+ return true; usec_t anchor_delta = JOURNAL_VS_REALTIME_DELTA_MAX_UT; - usec_t first_ut = jf->msg_first_ut; + usec_t first_ut = jf->msg_first_ut - anchor_delta; usec_t last_ut = jf->msg_last_ut + anchor_delta; if(last_ut >= fqs->after_ut && first_ut <= fqs->before_ut) @@ -1191,30 +1101,6 @@ static bool jf_is_mine(struct journal_file *jf, FUNCTION_QUERY_STATUS *fqs) { return false; } -static int journal_file_dict_items_backward_compar(const void *a, const void *b) { - const DICTIONARY_ITEM **ad = (const DICTIONARY_ITEM **)a, **bd = (const DICTIONARY_ITEM **)b; - struct journal_file *jfa = dictionary_acquired_item_value(*ad); - struct journal_file *jfb = dictionary_acquired_item_value(*bd); - - if(jfa->msg_last_ut < jfb->msg_last_ut) - return 1; - - if(jfa->msg_last_ut > jfb->msg_last_ut) - return -1; - - if(jfa->msg_first_ut < jfb->msg_first_ut) - return 1; - - if(jfa->msg_first_ut > jfb->msg_first_ut) - return -1; - - return 0; -} - -static int journal_file_dict_items_forward_compar(const void *a, const void *b) { - return -journal_file_dict_items_backward_compar(a, b); -} - static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QUERY_STATUS *fqs) { ND_SD_JOURNAL_STATUS status = ND_SD_JOURNAL_NO_FILE_MATCHED; struct journal_file *jf; @@ -1260,8 +1146,12 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU } bool partial = false; - usec_t started_ut; - usec_t ended_ut = now_monotonic_usec(); + usec_t query_started_ut = now_monotonic_usec(); + usec_t started_ut = query_started_ut; + usec_t ended_ut = started_ut; + usec_t duration_ut = 0, max_duration_ut = 0; + + sampling_query_init(fqs, facets); buffer_json_member_add_array(wb, "_journal_files"); for(size_t f = 0; f < files_used ;f++) { @@ -1271,8 +1161,19 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU if(!jf_is_mine(jf, fqs)) continue; + started_ut = ended_ut; + + // do not even try to do the query if we expect it to pass the 
timeout + if(ended_ut > (query_started_ut + (fqs->stop_monotonic_ut - query_started_ut) * 3 / 4) && + ended_ut + max_duration_ut * 2 >= fqs->stop_monotonic_ut) { + + partial = true; + status = ND_SD_JOURNAL_TIMED_OUT; + break; + } + fqs->file_working++; - fqs->cached_count = 0; + // fqs->cached_count = 0; size_t fs_calls = fstat_thread_calls; size_t fs_cached = fstat_thread_cached_responses; @@ -1281,8 +1182,22 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU size_t bytes_read = fqs->bytes_read; size_t matches_setup_ut = fqs->matches_setup_ut; + sampling_file_init(fqs, jf); + ND_SD_JOURNAL_STATUS tmp_status = netdata_systemd_journal_query_one_file(filename, wb, facets, jf, fqs); +// nd_log(NDLS_COLLECTORS, NDLP_INFO, +// "JOURNAL ESTIMATION FINAL: '%s' " +// "total lines %zu [sampled=%zu, unsampled=%zu, estimated=%zu], " +// "file [%"PRIu64" - %"PRIu64", duration %"PRId64", known lines in file %zu], " +// "query [%"PRIu64" - %"PRIu64", duration %"PRId64"], " +// , jf->filename +// , fqs->samples_per_file.sampled + fqs->samples_per_file.unsampled + fqs->samples_per_file.estimated +// , fqs->samples_per_file.sampled, fqs->samples_per_file.unsampled, fqs->samples_per_file.estimated +// , jf->msg_first_ut, jf->msg_last_ut, jf->msg_last_ut - jf->msg_first_ut, jf->messages_in_file +// , fqs->query_file.start_ut, fqs->query_file.stop_ut, fqs->query_file.stop_ut - fqs->query_file.start_ut +// ); + rows_useful = fqs->rows_useful - rows_useful; rows_read = fqs->rows_read - rows_read; bytes_read = fqs->bytes_read - bytes_read; @@ -1290,9 +1205,11 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU fs_calls = fstat_thread_calls - fs_calls; fs_cached = fstat_thread_cached_responses - fs_cached; - started_ut = ended_ut; ended_ut = now_monotonic_usec(); - usec_t duration_ut = ended_ut - started_ut; + duration_ut = ended_ut - started_ut; + + if(duration_ut > max_duration_ut) + max_duration_ut = duration_ut; 
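The timeout guard above skips the remaining files once three quarters of the query budget has elapsed and twice the slowest per-file duration seen so far would no longer fit before the deadline. A standalone sketch of that predicate (hypothetical names, assumed semantics):

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Sketch of the per-file timeout guard above: once 3/4 of the query budget
 * has elapsed, skip a file if twice the slowest per-file duration observed
 * so far would overrun the deadline - better to return partial data than
 * to time out in the middle of a file. */
static bool would_overrun(usec_t now_ut, usec_t started_ut, usec_t stop_ut, usec_t max_file_duration_ut) {
    usec_t three_quarters_ut = started_ut + (stop_ut - started_ut) * 3 / 4;
    return now_ut > three_quarters_ut &&
           now_ut + max_file_duration_ut * 2 >= stop_ut;
}
```

When the guard fires, the query is marked `partial` and returns `ND_SD_JOURNAL_TIMED_OUT`, as in the hunk above.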
buffer_json_add_array_item_object(wb); // journal file { @@ -1315,6 +1232,16 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU buffer_json_member_add_uint64(wb, "duration_matches_ut", matches_setup_ut); buffer_json_member_add_uint64(wb, "fstat_query_calls", fs_calls); buffer_json_member_add_uint64(wb, "fstat_query_cached_responses", fs_cached); + + if(fqs->sampling) { + buffer_json_member_add_object(wb, "_sampling"); + { + buffer_json_member_add_uint64(wb, "sampled", fqs->samples_per_file.sampled); + buffer_json_member_add_uint64(wb, "unsampled", fqs->samples_per_file.unsampled); + buffer_json_member_add_uint64(wb, "estimated", fqs->samples_per_file.estimated); + } + buffer_json_object_close(wb); // _sampling + } } buffer_json_object_close(wb); // journal file @@ -1384,6 +1311,64 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU buffer_json_member_add_boolean(wb, "partial", partial); buffer_json_member_add_string(wb, "type", "table"); + // build a message for the query + if(!fqs->data_only) { + CLEAN_BUFFER *msg = buffer_create(0, NULL); + CLEAN_BUFFER *msg_description = buffer_create(0, NULL); + ND_LOG_FIELD_PRIORITY msg_priority = NDLP_INFO; + + if(!journal_files_completed_once()) { + buffer_strcat(msg, "Journals are still being scanned. "); + buffer_strcat(msg_description + , "LIBRARY SCAN: The journal files are still being scanned, you are probably viewing incomplete data. "); + msg_priority = NDLP_WARNING; + } + + if(partial) { + buffer_strcat(msg, "Query timed-out, incomplete data. "); + buffer_strcat(msg_description + , "QUERY TIMEOUT: The query timed out and may not include all the data of the selected window. 
"); + msg_priority = NDLP_WARNING; + } + + if(fqs->samples.estimated || fqs->samples.unsampled) { + double percent = (double) (fqs->samples.sampled * 100.0 / + (fqs->samples.estimated + fqs->samples.unsampled + fqs->samples.sampled)); + buffer_sprintf(msg, "%.2f%% real data", percent); + buffer_sprintf(msg_description, "ACTUAL DATA: The filters counters reflect %0.2f%% of the data. ", percent); + msg_priority = MIN(msg_priority, NDLP_NOTICE); + } + + if(fqs->samples.unsampled) { + double percent = (double) (fqs->samples.unsampled * 100.0 / + (fqs->samples.estimated + fqs->samples.unsampled + fqs->samples.sampled)); + buffer_sprintf(msg, ", %.2f%% unsampled", percent); + buffer_sprintf(msg_description + , "UNSAMPLED DATA: %0.2f%% of the events exist and have been counted, but their values have not been evaluated, so they are not included in the filters counters. " + , percent); + msg_priority = MIN(msg_priority, NDLP_NOTICE); + } + + if(fqs->samples.estimated) { + double percent = (double) (fqs->samples.estimated * 100.0 / + (fqs->samples.estimated + fqs->samples.unsampled + fqs->samples.sampled)); + buffer_sprintf(msg, ", %.2f%% estimated", percent); + buffer_sprintf(msg_description + , "ESTIMATED DATA: The query selected a large amount of data, so to avoid delaying too much, the presented data are estimated by %0.2f%%. 
" + , percent); + msg_priority = MIN(msg_priority, NDLP_NOTICE); + } + + buffer_json_member_add_object(wb, "message"); + if(buffer_tostring(msg)) { + buffer_json_member_add_string(wb, "title", buffer_tostring(msg)); + buffer_json_member_add_string(wb, "description", buffer_tostring(msg_description)); + buffer_json_member_add_string(wb, "status", nd_log_id2priority(msg_priority)); + } + // else send an empty object if there is nothing to tell + buffer_json_object_close(wb); // message + } + if(!fqs->data_only) { buffer_json_member_add_time_t(wb, "update_every", 1); buffer_json_member_add_string(wb, "help", SYSTEMD_JOURNAL_FUNCTION_DESCRIPTION); @@ -1403,6 +1388,17 @@ static int netdata_systemd_journal_query(BUFFER *wb, FACETS *facets, FUNCTION_QU buffer_json_member_add_uint64(wb, "cached", fstat_thread_cached_responses); } buffer_json_object_close(wb); // _fstat_caching + + if(fqs->sampling) { + buffer_json_member_add_object(wb, "_sampling"); + { + buffer_json_member_add_uint64(wb, "sampled", fqs->samples.sampled); + buffer_json_member_add_uint64(wb, "unsampled", fqs->samples.unsampled); + buffer_json_member_add_uint64(wb, "estimated", fqs->samples.estimated); + } + buffer_json_object_close(wb); // _sampling + } + buffer_json_finalize(wb); return HTTP_RESP_OK; @@ -1471,6 +1467,10 @@ static void netdata_systemd_journal_function_help(const char *transaction) { " The number of items to return.\n" " The default is %d.\n" "\n" + " "JOURNAL_PARAMETER_SAMPLING":ITEMS\n" + " The number of log entries to sample to estimate facets counters and histogram.\n" + " The default is %d.\n" + "\n" " "JOURNAL_PARAMETER_ANCHOR":TIMESTAMP_IN_MICROSECONDS\n" " Return items relative to this timestamp.\n" " The exact items to be returned depend on the query `"JOURNAL_PARAMETER_DIRECTION"`.\n" @@ -1511,6 +1511,7 @@ static void netdata_systemd_journal_function_help(const char *transaction) { , JOURNAL_DEFAULT_SLICE_MODE ? 
"true" : "false" // slice , -SYSTEMD_JOURNAL_DEFAULT_QUERY_DURATION , SYSTEMD_JOURNAL_DEFAULT_ITEMS_PER_QUERY + , SYSTEMD_JOURNAL_DEFAULT_ITEMS_SAMPLING , JOURNAL_DEFAULT_DIRECTION == FACETS_ANCHOR_DIRECTION_BACKWARD ? "backward" : "forward" ); @@ -1521,572 +1522,6 @@ static void netdata_systemd_journal_function_help(const char *transaction) { buffer_free(wb); } -const char *errno_map[] = { - [1] = "1 (EPERM)", // "Operation not permitted", - [2] = "2 (ENOENT)", // "No such file or directory", - [3] = "3 (ESRCH)", // "No such process", - [4] = "4 (EINTR)", // "Interrupted system call", - [5] = "5 (EIO)", // "Input/output error", - [6] = "6 (ENXIO)", // "No such device or address", - [7] = "7 (E2BIG)", // "Argument list too long", - [8] = "8 (ENOEXEC)", // "Exec format error", - [9] = "9 (EBADF)", // "Bad file descriptor", - [10] = "10 (ECHILD)", // "No child processes", - [11] = "11 (EAGAIN)", // "Resource temporarily unavailable", - [12] = "12 (ENOMEM)", // "Cannot allocate memory", - [13] = "13 (EACCES)", // "Permission denied", - [14] = "14 (EFAULT)", // "Bad address", - [15] = "15 (ENOTBLK)", // "Block device required", - [16] = "16 (EBUSY)", // "Device or resource busy", - [17] = "17 (EEXIST)", // "File exists", - [18] = "18 (EXDEV)", // "Invalid cross-device link", - [19] = "19 (ENODEV)", // "No such device", - [20] = "20 (ENOTDIR)", // "Not a directory", - [21] = "21 (EISDIR)", // "Is a directory", - [22] = "22 (EINVAL)", // "Invalid argument", - [23] = "23 (ENFILE)", // "Too many open files in system", - [24] = "24 (EMFILE)", // "Too many open files", - [25] = "25 (ENOTTY)", // "Inappropriate ioctl for device", - [26] = "26 (ETXTBSY)", // "Text file busy", - [27] = "27 (EFBIG)", // "File too large", - [28] = "28 (ENOSPC)", // "No space left on device", - [29] = "29 (ESPIPE)", // "Illegal seek", - [30] = "30 (EROFS)", // "Read-only file system", - [31] = "31 (EMLINK)", // "Too many links", - [32] = "32 (EPIPE)", // "Broken pipe", - [33] = "33 (EDOM)", // 
"Numerical argument out of domain", - [34] = "34 (ERANGE)", // "Numerical result out of range", - [35] = "35 (EDEADLK)", // "Resource deadlock avoided", - [36] = "36 (ENAMETOOLONG)", // "File name too long", - [37] = "37 (ENOLCK)", // "No locks available", - [38] = "38 (ENOSYS)", // "Function not implemented", - [39] = "39 (ENOTEMPTY)", // "Directory not empty", - [40] = "40 (ELOOP)", // "Too many levels of symbolic links", - [42] = "42 (ENOMSG)", // "No message of desired type", - [43] = "43 (EIDRM)", // "Identifier removed", - [44] = "44 (ECHRNG)", // "Channel number out of range", - [45] = "45 (EL2NSYNC)", // "Level 2 not synchronized", - [46] = "46 (EL3HLT)", // "Level 3 halted", - [47] = "47 (EL3RST)", // "Level 3 reset", - [48] = "48 (ELNRNG)", // "Link number out of range", - [49] = "49 (EUNATCH)", // "Protocol driver not attached", - [50] = "50 (ENOCSI)", // "No CSI structure available", - [51] = "51 (EL2HLT)", // "Level 2 halted", - [52] = "52 (EBADE)", // "Invalid exchange", - [53] = "53 (EBADR)", // "Invalid request descriptor", - [54] = "54 (EXFULL)", // "Exchange full", - [55] = "55 (ENOANO)", // "No anode", - [56] = "56 (EBADRQC)", // "Invalid request code", - [57] = "57 (EBADSLT)", // "Invalid slot", - [59] = "59 (EBFONT)", // "Bad font file format", - [60] = "60 (ENOSTR)", // "Device not a stream", - [61] = "61 (ENODATA)", // "No data available", - [62] = "62 (ETIME)", // "Timer expired", - [63] = "63 (ENOSR)", // "Out of streams resources", - [64] = "64 (ENONET)", // "Machine is not on the network", - [65] = "65 (ENOPKG)", // "Package not installed", - [66] = "66 (EREMOTE)", // "Object is remote", - [67] = "67 (ENOLINK)", // "Link has been severed", - [68] = "68 (EADV)", // "Advertise error", - [69] = "69 (ESRMNT)", // "Srmount error", - [70] = "70 (ECOMM)", // "Communication error on send", - [71] = "71 (EPROTO)", // "Protocol error", - [72] = "72 (EMULTIHOP)", // "Multihop attempted", - [73] = "73 (EDOTDOT)", // "RFS specific error", - [74] = "74 
(EBADMSG)", // "Bad message", - [75] = "75 (EOVERFLOW)", // "Value too large for defined data type", - [76] = "76 (ENOTUNIQ)", // "Name not unique on network", - [77] = "77 (EBADFD)", // "File descriptor in bad state", - [78] = "78 (EREMCHG)", // "Remote address changed", - [79] = "79 (ELIBACC)", // "Can not access a needed shared library", - [80] = "80 (ELIBBAD)", // "Accessing a corrupted shared library", - [81] = "81 (ELIBSCN)", // ".lib section in a.out corrupted", - [82] = "82 (ELIBMAX)", // "Attempting to link in too many shared libraries", - [83] = "83 (ELIBEXEC)", // "Cannot exec a shared library directly", - [84] = "84 (EILSEQ)", // "Invalid or incomplete multibyte or wide character", - [85] = "85 (ERESTART)", // "Interrupted system call should be restarted", - [86] = "86 (ESTRPIPE)", // "Streams pipe error", - [87] = "87 (EUSERS)", // "Too many users", - [88] = "88 (ENOTSOCK)", // "Socket operation on non-socket", - [89] = "89 (EDESTADDRREQ)", // "Destination address required", - [90] = "90 (EMSGSIZE)", // "Message too long", - [91] = "91 (EPROTOTYPE)", // "Protocol wrong type for socket", - [92] = "92 (ENOPROTOOPT)", // "Protocol not available", - [93] = "93 (EPROTONOSUPPORT)", // "Protocol not supported", - [94] = "94 (ESOCKTNOSUPPORT)", // "Socket type not supported", - [95] = "95 (ENOTSUP)", // "Operation not supported", - [96] = "96 (EPFNOSUPPORT)", // "Protocol family not supported", - [97] = "97 (EAFNOSUPPORT)", // "Address family not supported by protocol", - [98] = "98 (EADDRINUSE)", // "Address already in use", - [99] = "99 (EADDRNOTAVAIL)", // "Cannot assign requested address", - [100] = "100 (ENETDOWN)", // "Network is down", - [101] = "101 (ENETUNREACH)", // "Network is unreachable", - [102] = "102 (ENETRESET)", // "Network dropped connection on reset", - [103] = "103 (ECONNABORTED)", // "Software caused connection abort", - [104] = "104 (ECONNRESET)", // "Connection reset by peer", - [105] = "105 (ENOBUFS)", // "No buffer space available", - 
[106] = "106 (EISCONN)", // "Transport endpoint is already connected", - [107] = "107 (ENOTCONN)", // "Transport endpoint is not connected", - [108] = "108 (ESHUTDOWN)", // "Cannot send after transport endpoint shutdown", - [109] = "109 (ETOOMANYREFS)", // "Too many references: cannot splice", - [110] = "110 (ETIMEDOUT)", // "Connection timed out", - [111] = "111 (ECONNREFUSED)", // "Connection refused", - [112] = "112 (EHOSTDOWN)", // "Host is down", - [113] = "113 (EHOSTUNREACH)", // "No route to host", - [114] = "114 (EALREADY)", // "Operation already in progress", - [115] = "115 (EINPROGRESS)", // "Operation now in progress", - [116] = "116 (ESTALE)", // "Stale file handle", - [117] = "117 (EUCLEAN)", // "Structure needs cleaning", - [118] = "118 (ENOTNAM)", // "Not a XENIX named type file", - [119] = "119 (ENAVAIL)", // "No XENIX semaphores available", - [120] = "120 (EISNAM)", // "Is a named type file", - [121] = "121 (EREMOTEIO)", // "Remote I/O error", - [122] = "122 (EDQUOT)", // "Disk quota exceeded", - [123] = "123 (ENOMEDIUM)", // "No medium found", - [124] = "124 (EMEDIUMTYPE)", // "Wrong medium type", - [125] = "125 (ECANCELED)", // "Operation canceled", - [126] = "126 (ENOKEY)", // "Required key not available", - [127] = "127 (EKEYEXPIRED)", // "Key has expired", - [128] = "128 (EKEYREVOKED)", // "Key has been revoked", - [129] = "129 (EKEYREJECTED)", // "Key was rejected by service", - [130] = "130 (EOWNERDEAD)", // "Owner died", - [131] = "131 (ENOTRECOVERABLE)", // "State not recoverable", - [132] = "132 (ERFKILL)", // "Operation not possible due to RF-kill", - [133] = "133 (EHWPOISON)", // "Memory page has hardware error", -}; - -static const char *syslog_facility_to_name(int facility) { - switch (facility) { - case LOG_FAC(LOG_KERN): return "kern"; - case LOG_FAC(LOG_USER): return "user"; - case LOG_FAC(LOG_MAIL): return "mail"; - case LOG_FAC(LOG_DAEMON): return "daemon"; - case LOG_FAC(LOG_AUTH): return "auth"; - case LOG_FAC(LOG_SYSLOG): 
return "syslog"; - case LOG_FAC(LOG_LPR): return "lpr"; - case LOG_FAC(LOG_NEWS): return "news"; - case LOG_FAC(LOG_UUCP): return "uucp"; - case LOG_FAC(LOG_CRON): return "cron"; - case LOG_FAC(LOG_AUTHPRIV): return "authpriv"; - case LOG_FAC(LOG_FTP): return "ftp"; - case LOG_FAC(LOG_LOCAL0): return "local0"; - case LOG_FAC(LOG_LOCAL1): return "local1"; - case LOG_FAC(LOG_LOCAL2): return "local2"; - case LOG_FAC(LOG_LOCAL3): return "local3"; - case LOG_FAC(LOG_LOCAL4): return "local4"; - case LOG_FAC(LOG_LOCAL5): return "local5"; - case LOG_FAC(LOG_LOCAL6): return "local6"; - case LOG_FAC(LOG_LOCAL7): return "local7"; - default: return NULL; - } -} - -static const char *syslog_priority_to_name(int priority) { - switch (priority) { - case LOG_ALERT: return "alert"; - case LOG_CRIT: return "critical"; - case LOG_DEBUG: return "debug"; - case LOG_EMERG: return "panic"; - case LOG_ERR: return "error"; - case LOG_INFO: return "info"; - case LOG_NOTICE: return "notice"; - case LOG_WARNING: return "warning"; - default: return NULL; - } -} - -static FACET_ROW_SEVERITY syslog_priority_to_facet_severity(FACETS *facets __maybe_unused, FACET_ROW *row, void *data __maybe_unused) { - // same to - // https://github.com/systemd/systemd/blob/aab9e4b2b86905a15944a1ac81e471b5b7075932/src/basic/terminal-util.c#L1501 - // function get_log_colors() - - FACET_ROW_KEY_VALUE *priority_rkv = dictionary_get(row->dict, "PRIORITY"); - if(!priority_rkv || priority_rkv->empty) - return FACET_ROW_SEVERITY_NORMAL; - - int priority = str2i(buffer_tostring(priority_rkv->wb)); - - if(priority <= LOG_ERR) - return FACET_ROW_SEVERITY_CRITICAL; - - else if (priority <= LOG_WARNING) - return FACET_ROW_SEVERITY_WARNING; - - else if(priority <= LOG_NOTICE) - return FACET_ROW_SEVERITY_NOTICE; - - else if(priority >= LOG_DEBUG) - return FACET_ROW_SEVERITY_DEBUG; - - return FACET_ROW_SEVERITY_NORMAL; -} - -static char *uid_to_username(uid_t uid, char *buffer, size_t buffer_size) { - static __thread char 
tmp[1024 + 1]; - struct passwd pw, *result = NULL; - - if (getpwuid_r(uid, &pw, tmp, sizeof(tmp), &result) != 0 || !result || !pw.pw_name || !(*pw.pw_name)) - snprintfz(buffer, buffer_size - 1, "%u", uid); - else - snprintfz(buffer, buffer_size - 1, "%u (%s)", uid, pw.pw_name); - - return buffer; -} - -static char *gid_to_groupname(gid_t gid, char* buffer, size_t buffer_size) { - static __thread char tmp[1024]; - struct group grp, *result = NULL; - - if (getgrgid_r(gid, &grp, tmp, sizeof(tmp), &result) != 0 || !result || !grp.gr_name || !(*grp.gr_name)) - snprintfz(buffer, buffer_size - 1, "%u", gid); - else - snprintfz(buffer, buffer_size - 1, "%u (%s)", gid, grp.gr_name); - - return buffer; -} - -static void netdata_systemd_journal_transform_syslog_facility(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - int facility = str2i(buffer_tostring(wb)); - const char *name = syslog_facility_to_name(facility); - if (name) { - buffer_flush(wb); - buffer_strcat(wb, name); - } - } -} - -static void netdata_systemd_journal_transform_priority(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - int priority = str2i(buffer_tostring(wb)); - const char *name = syslog_priority_to_name(priority); - if (name) { - buffer_flush(wb); - buffer_strcat(wb, name); - } - } -} - -static void netdata_systemd_journal_transform_errno(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - unsigned err_no = str2u(buffer_tostring(wb)); - if(err_no > 0 && err_no < sizeof(errno_map) / 
sizeof(*errno_map)) { - const char *name = errno_map[err_no]; - if(name) { - buffer_flush(wb); - buffer_strcat(wb, name); - } - } - } -} - -// ---------------------------------------------------------------------------- -// UID and GID transformation - -#define UID_GID_HASHTABLE_SIZE 10000 - -struct word_t2str_hashtable_entry { - struct word_t2str_hashtable_entry *next; - Word_t hash; - size_t len; - char str[]; -}; - -struct word_t2str_hashtable { - SPINLOCK spinlock; - size_t size; - struct word_t2str_hashtable_entry *hashtable[UID_GID_HASHTABLE_SIZE]; -}; - -struct word_t2str_hashtable uid_hashtable = { - .size = UID_GID_HASHTABLE_SIZE, -}; - -struct word_t2str_hashtable gid_hashtable = { - .size = UID_GID_HASHTABLE_SIZE, -}; - -struct word_t2str_hashtable_entry **word_t2str_hashtable_slot(struct word_t2str_hashtable *ht, Word_t hash) { - size_t slot = hash % ht->size; - struct word_t2str_hashtable_entry **e = &ht->hashtable[slot]; - - while(*e && (*e)->hash != hash) - e = &((*e)->next); - - return e; -} - -const char *uid_to_username_cached(uid_t uid, size_t *length) { - spinlock_lock(&uid_hashtable.spinlock); - - struct word_t2str_hashtable_entry **e = word_t2str_hashtable_slot(&uid_hashtable, uid); - if(!(*e)) { - static __thread char buf[1024]; - const char *name = uid_to_username(uid, buf, sizeof(buf)); - size_t size = strlen(name) + 1; - - *e = callocz(1, sizeof(struct word_t2str_hashtable_entry) + size); - (*e)->len = size - 1; - (*e)->hash = uid; - memcpy((*e)->str, name, size); - } - - spinlock_unlock(&uid_hashtable.spinlock); - - *length = (*e)->len; - return (*e)->str; -} - -const char *gid_to_groupname_cached(gid_t gid, size_t *length) { - spinlock_lock(&gid_hashtable.spinlock); - - struct word_t2str_hashtable_entry **e = word_t2str_hashtable_slot(&gid_hashtable, gid); - if(!(*e)) { - static __thread char buf[1024]; - const char *name = gid_to_groupname(gid, buf, sizeof(buf)); - size_t size = strlen(name) + 1; - - *e = callocz(1, sizeof(struct 
word_t2str_hashtable_entry) + size); - (*e)->len = size - 1; - (*e)->hash = gid; - memcpy((*e)->str, name, size); - } - - spinlock_unlock(&gid_hashtable.spinlock); - - *length = (*e)->len; - return (*e)->str; -} - -DICTIONARY *boot_ids_to_first_ut = NULL; - -static void netdata_systemd_journal_transform_boot_id(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - const char *boot_id = buffer_tostring(wb); - if(*boot_id && isxdigit(*boot_id)) { - usec_t ut = UINT64_MAX; - usec_t *p_ut = dictionary_get(boot_ids_to_first_ut, boot_id); - if(!p_ut) { - struct journal_file *jf; - dfe_start_read(journal_files_registry, jf) { - const char *files[2] = { - [0] = jf_dfe.name, - [1] = NULL, - }; - - sd_journal *j = NULL; - if(sd_journal_open_files(&j, files, ND_SD_JOURNAL_OPEN_FLAGS) < 0 || !j) - continue; - - char m[100]; - size_t len = snprintfz(m, sizeof(m), "_BOOT_ID=%s", boot_id); - usec_t t_ut = 0; - if(sd_journal_add_match(j, m, len) < 0 || - sd_journal_seek_head(j) < 0 || - sd_journal_next(j) < 0 || - sd_journal_get_realtime_usec(j, &t_ut) < 0 || !t_ut) { - sd_journal_close(j); - continue; - } - - if(t_ut < ut) - ut = t_ut; - - sd_journal_close(j); - } - dfe_done(jf); - - dictionary_set(boot_ids_to_first_ut, boot_id, &ut, sizeof(ut)); - } - else - ut = *p_ut; - - if(ut != UINT64_MAX) { - time_t timestamp_sec = (time_t)(ut / USEC_PER_SEC); - struct tm tm; - char buffer[30]; - - gmtime_r(&timestamp_sec, &tm); - strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", &tm); - - switch(scope) { - default: - case FACETS_TRANSFORM_DATA: - case FACETS_TRANSFORM_VALUE: - buffer_sprintf(wb, " (%s UTC) ", buffer); - break; - - case FACETS_TRANSFORM_FACET: - case FACETS_TRANSFORM_FACET_SORT: - case FACETS_TRANSFORM_HISTOGRAM: - buffer_flush(wb); - buffer_sprintf(wb, "%s UTC", buffer); - break; - } - } - } -} - -static void netdata_systemd_journal_transform_uid(FACETS *facets __maybe_unused, BUFFER *wb,
FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - uid_t uid = str2i(buffer_tostring(wb)); - size_t len; - const char *name = uid_to_username_cached(uid, &len); - buffer_contents_replace(wb, name, len); - } -} - -static void netdata_systemd_journal_transform_gid(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - gid_t gid = str2i(buffer_tostring(wb)); - size_t len; - const char *name = gid_to_groupname_cached(gid, &len); - buffer_contents_replace(wb, name, len); - } -} - -const char *linux_capabilities[] = { - [CAP_CHOWN] = "CHOWN", - [CAP_DAC_OVERRIDE] = "DAC_OVERRIDE", - [CAP_DAC_READ_SEARCH] = "DAC_READ_SEARCH", - [CAP_FOWNER] = "FOWNER", - [CAP_FSETID] = "FSETID", - [CAP_KILL] = "KILL", - [CAP_SETGID] = "SETGID", - [CAP_SETUID] = "SETUID", - [CAP_SETPCAP] = "SETPCAP", - [CAP_LINUX_IMMUTABLE] = "LINUX_IMMUTABLE", - [CAP_NET_BIND_SERVICE] = "NET_BIND_SERVICE", - [CAP_NET_BROADCAST] = "NET_BROADCAST", - [CAP_NET_ADMIN] = "NET_ADMIN", - [CAP_NET_RAW] = "NET_RAW", - [CAP_IPC_LOCK] = "IPC_LOCK", - [CAP_IPC_OWNER] = "IPC_OWNER", - [CAP_SYS_MODULE] = "SYS_MODULE", - [CAP_SYS_RAWIO] = "SYS_RAWIO", - [CAP_SYS_CHROOT] = "SYS_CHROOT", - [CAP_SYS_PTRACE] = "SYS_PTRACE", - [CAP_SYS_PACCT] = "SYS_PACCT", - [CAP_SYS_ADMIN] = "SYS_ADMIN", - [CAP_SYS_BOOT] = "SYS_BOOT", - [CAP_SYS_NICE] = "SYS_NICE", - [CAP_SYS_RESOURCE] = "SYS_RESOURCE", - [CAP_SYS_TIME] = "SYS_TIME", - [CAP_SYS_TTY_CONFIG] = "SYS_TTY_CONFIG", - [CAP_MKNOD] = "MKNOD", - [CAP_LEASE] = "LEASE", - [CAP_AUDIT_WRITE] = "AUDIT_WRITE", - [CAP_AUDIT_CONTROL] = "AUDIT_CONTROL", - [CAP_SETFCAP] = "SETFCAP", - [CAP_MAC_OVERRIDE] = "MAC_OVERRIDE", - [CAP_MAC_ADMIN] = "MAC_ADMIN", - 
[CAP_SYSLOG] = "SYSLOG", - [CAP_WAKE_ALARM] = "WAKE_ALARM", - [CAP_BLOCK_SUSPEND] = "BLOCK_SUSPEND", - [37 /*CAP_AUDIT_READ*/] = "AUDIT_READ", - [38 /*CAP_PERFMON*/] = "PERFMON", - [39 /*CAP_BPF*/] = "BPF", - [40 /* CAP_CHECKPOINT_RESTORE */] = "CHECKPOINT_RESTORE", -}; - -static void netdata_systemd_journal_transform_cap_effective(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - uint64_t cap = strtoul(buffer_tostring(wb), NULL, 16); - if(cap) { - buffer_fast_strcat(wb, " (", 2); - for (size_t i = 0, added = 0; i < sizeof(linux_capabilities) / sizeof(linux_capabilities[0]); i++) { - if (linux_capabilities[i] && (cap & (1ULL << i))) { - - if (added) - buffer_fast_strcat(wb, " | ", 3); - - buffer_strcat(wb, linux_capabilities[i]); - added++; - } - } - buffer_fast_strcat(wb, ")", 1); - } - } -} - -static void netdata_systemd_journal_transform_timestamp_usec(FACETS *facets __maybe_unused, BUFFER *wb, FACETS_TRANSFORMATION_SCOPE scope __maybe_unused, void *data __maybe_unused) { - if(scope == FACETS_TRANSFORM_FACET_SORT) - return; - - const char *v = buffer_tostring(wb); - if(*v && isdigit(*v)) { - uint64_t ut = str2ull(buffer_tostring(wb), NULL); - if(ut) { - time_t timestamp_sec = ut / USEC_PER_SEC; - struct tm tm; - char buffer[30]; - - gmtime_r(&timestamp_sec, &tm); - strftime(buffer, sizeof(buffer), "%Y-%m-%d %H:%M:%S", &tm); - buffer_sprintf(wb, " (%s.%06llu UTC)", buffer, ut % USEC_PER_SEC); - } - } -} - -// ---------------------------------------------------------------------------- - -static void netdata_systemd_journal_dynamic_row_id(FACETS *facets __maybe_unused, BUFFER *json_array, FACET_ROW_KEY_VALUE *rkv, FACET_ROW *row, void *data __maybe_unused) { - FACET_ROW_KEY_VALUE *pid_rkv = dictionary_get(row->dict, "_PID"); - const char *pid = pid_rkv ?
buffer_tostring(pid_rkv->wb) : FACET_VALUE_UNSET; - - const char *identifier = NULL; - FACET_ROW_KEY_VALUE *container_name_rkv = dictionary_get(row->dict, "CONTAINER_NAME"); - if(container_name_rkv && !container_name_rkv->empty) - identifier = buffer_tostring(container_name_rkv->wb); - - if(!identifier) { - FACET_ROW_KEY_VALUE *syslog_identifier_rkv = dictionary_get(row->dict, "SYSLOG_IDENTIFIER"); - if(syslog_identifier_rkv && !syslog_identifier_rkv->empty) - identifier = buffer_tostring(syslog_identifier_rkv->wb); - - if(!identifier) { - FACET_ROW_KEY_VALUE *comm_rkv = dictionary_get(row->dict, "_COMM"); - if(comm_rkv && !comm_rkv->empty) - identifier = buffer_tostring(comm_rkv->wb); - } - } - - buffer_flush(rkv->wb); - - if(!identifier) - buffer_strcat(rkv->wb, FACET_VALUE_UNSET); - else - buffer_sprintf(rkv->wb, "%s[%s]", identifier, pid); - - buffer_json_add_array_item_string(json_array, buffer_tostring(rkv->wb)); -} - -static void netdata_systemd_journal_rich_message(FACETS *facets __maybe_unused, BUFFER *json_array, FACET_ROW_KEY_VALUE *rkv, FACET_ROW *row __maybe_unused, void *data __maybe_unused) { - buffer_json_add_array_item_object(json_array); - buffer_json_member_add_string(json_array, "value", buffer_tostring(rkv->wb)); - buffer_json_object_close(json_array); -} - DICTIONARY *function_query_status_dict = NULL; static void function_systemd_journal_progress(BUFFER *wb, const char *transaction, const char *progress_id) { @@ -2129,7 +1564,7 @@ static void function_systemd_journal_progress(BUFFER *wb, const char *transactio buffer_json_member_add_uint64(wb, "running_duration_usec", duration_ut); buffer_json_member_add_double(wb, "progress", (double)file_working * 100.0 / (double)files_matched); char msg[1024 + 1]; - snprintfz(msg, 1024, + snprintfz(msg, sizeof(msg) - 1, "Read %zu rows (%0.0f rows/s), " "data %0.1f MB (%0.1f MB/s), " "file %zu of %zu", @@ -2147,10 +1582,9 @@ static void function_systemd_journal_progress(BUFFER *wb, const char *transactio 
dictionary_acquired_item_release(function_query_status_dict, item); } -static void function_systemd_journal(const char *transaction, char *function, int timeout, bool *cancelled) { +void function_systemd_journal(const char *transaction, char *function, int timeout, bool *cancelled) { fstat_thread_calls = 0; fstat_thread_cached_responses = 0; - journal_files_registry_update(); BUFFER *wb = buffer_create(0, NULL); buffer_flush(wb); @@ -2186,6 +1620,7 @@ static void function_systemd_journal(const char *transaction, char *function, in facets_accepted_param(facets, JOURNAL_PARAMETER_PROGRESS); facets_accepted_param(facets, JOURNAL_PARAMETER_DELTA); facets_accepted_param(facets, JOURNAL_PARAMETER_TAIL); + facets_accepted_param(facets, JOURNAL_PARAMETER_SAMPLING); #ifdef HAVE_SD_JOURNAL_RESTART_FIELDS facets_accepted_param(facets, JOURNAL_PARAMETER_SLICE); @@ -2196,10 +1631,10 @@ static void function_systemd_journal(const char *transaction, char *function, in facets_register_row_severity(facets, syslog_priority_to_facet_severity, NULL); facets_register_key_name(facets, "_HOSTNAME", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_VISIBLE | FACET_KEY_OPTION_FTS); + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_VISIBLE); facets_register_dynamic_key_name(facets, JOURNAL_KEY_ND_JOURNAL_PROCESS, - FACET_KEY_OPTION_NEVER_FACET | FACET_KEY_OPTION_VISIBLE | FACET_KEY_OPTION_FTS, + FACET_KEY_OPTION_NEVER_FACET | FACET_KEY_OPTION_VISIBLE, netdata_systemd_journal_dynamic_row_id, NULL); facets_register_key_name(facets, "MESSAGE", @@ -2212,71 +1647,78 @@ static void function_systemd_journal(const char *transaction, char *function, in // netdata_systemd_journal_rich_message, NULL); facets_register_key_name_transformation(facets, "PRIORITY", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW | + FACET_KEY_OPTION_EXPANDED_FILTER, netdata_systemd_journal_transform_priority, NULL); 
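The transformation callbacks being moved out of this file all share one shape: when the facet value is numeric, rewrite the buffer contents into a human-readable name, and skip the rewrite for the sort scope so ordering stays numeric. A minimal stand-alone sketch of that idea for the PRIORITY transform, using plain char arrays instead of netdata's BUFFER API (so this is an illustration of the pattern, not the plugin's actual code):

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

// Maps a numeric syslog priority (0-7) to its name, mirroring the
// switch in syslog_priority_to_name(). Returns NULL when the value is
// out of range so the caller can keep the original text.
static const char *priority_to_name(int priority) {
    static const char *names[] = {
        "panic", "alert", "critical", "error",
        "warning", "notice", "info", "debug",
    };
    if (priority < 0 || priority > 7)
        return NULL;
    return names[priority];
}

// Simplified transform: the real callback rewrites a BUFFER in place;
// here we rewrite a caller-supplied char array of `size` bytes.
static void transform_priority(char *value, size_t size) {
    if (!isdigit((unsigned char)*value))
        return;  // leave non-numeric values untouched
    const char *name = priority_to_name(atoi(value));
    if (name) {
        strncpy(value, name, size - 1);
        value[size - 1] = '\0';
    }
}
```

The out-of-range fallback matters: journal fields are free-form, so a value like `99` passes through unchanged instead of being mislabeled.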
facets_register_key_name_transformation(facets, "SYSLOG_FACILITY", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW | + FACET_KEY_OPTION_EXPANDED_FILTER, netdata_systemd_journal_transform_syslog_facility, NULL); facets_register_key_name_transformation(facets, "ERRNO", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_errno, NULL); facets_register_key_name(facets, JOURNAL_KEY_ND_JOURNAL_FILE, FACET_KEY_OPTION_NEVER_FACET); facets_register_key_name(facets, "SYSLOG_IDENTIFIER", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS); + FACET_KEY_OPTION_FACET); facets_register_key_name(facets, "UNIT", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS); + FACET_KEY_OPTION_FACET); facets_register_key_name(facets, "USER_UNIT", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS); + FACET_KEY_OPTION_FACET); + + facets_register_key_name_transformation(facets, "MESSAGE_ID", + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW | + FACET_KEY_OPTION_EXPANDED_FILTER, + netdata_systemd_journal_transform_message_id, NULL); facets_register_key_name_transformation(facets, "_BOOT_ID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_boot_id, NULL); facets_register_key_name_transformation(facets, "_SYSTEMD_OWNER_UID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); facets_register_key_name_transformation(facets, "_UID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); 
facets_register_key_name_transformation(facets, "OBJECT_SYSTEMD_OWNER_UID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); facets_register_key_name_transformation(facets, "OBJECT_UID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); facets_register_key_name_transformation(facets, "_GID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_gid, NULL); facets_register_key_name_transformation(facets, "OBJECT_GID", - FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_FACET | FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_gid, NULL); facets_register_key_name_transformation(facets, "_CAP_EFFECTIVE", - FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_cap_effective, NULL); facets_register_key_name_transformation(facets, "_AUDIT_LOGINUID", - FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); facets_register_key_name_transformation(facets, "OBJECT_AUDIT_LOGINUID", - FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_uid, NULL); facets_register_key_name_transformation(facets, "_SOURCE_REALTIME_TIMESTAMP", - FACET_KEY_OPTION_FTS | FACET_KEY_OPTION_TRANSFORM_VIEW, + FACET_KEY_OPTION_TRANSFORM_VIEW, netdata_systemd_journal_transform_timestamp_usec, NULL); // ------------------------------------------------------------------------ @@ -2290,10 +1732,11 @@ static void 
function_systemd_journal(const char *transaction, char *function, in FACETS_ANCHOR_DIRECTION direction = JOURNAL_DEFAULT_DIRECTION; const char *query = NULL; const char *chart = NULL; - const char *source = NULL; + SIMPLE_PATTERN *sources = NULL; const char *progress_id = NULL; SD_JOURNAL_FILE_SOURCE_TYPE source_type = SDJF_ALL; size_t filters = 0; + size_t sampling = SYSTEMD_JOURNAL_DEFAULT_ITEMS_SAMPLING; buffer_json_member_add_object(wb, "_request"); @@ -2329,6 +1772,9 @@ static void function_systemd_journal(const char *transaction, char *function, in else tail = true; } + else if(strncmp(keyword, JOURNAL_PARAMETER_SAMPLING ":", sizeof(JOURNAL_PARAMETER_SAMPLING ":") - 1) == 0) { + sampling = str2ul(&keyword[sizeof(JOURNAL_PARAMETER_SAMPLING ":") - 1]); + } else if(strncmp(keyword, JOURNAL_PARAMETER_DATA_ONLY ":", sizeof(JOURNAL_PARAMETER_DATA_ONLY ":") - 1) == 0) { char *v = &keyword[sizeof(JOURNAL_PARAMETER_DATA_ONLY ":") - 1]; @@ -2352,40 +1798,67 @@ static void function_systemd_journal(const char *transaction, char *function, in progress_id = id; } else if(strncmp(keyword, JOURNAL_PARAMETER_SOURCE ":", sizeof(JOURNAL_PARAMETER_SOURCE ":") - 1) == 0) { - source = &keyword[sizeof(JOURNAL_PARAMETER_SOURCE ":") - 1]; + const char *value = &keyword[sizeof(JOURNAL_PARAMETER_SOURCE ":") - 1]; - if(strcmp(source, SDJF_SOURCE_ALL_NAME) == 0) { - source_type = SDJF_ALL; - source = NULL; - } - else if(strcmp(source, SDJF_SOURCE_LOCAL_NAME) == 0) { - source_type = SDJF_LOCAL; - source = NULL; - } - else if(strcmp(source, SDJF_SOURCE_REMOTES_NAME) == 0) { - source_type = SDJF_REMOTE; - source = NULL; - } - else if(strcmp(source, SDJF_SOURCE_NAMESPACES_NAME) == 0) { - source_type = SDJF_NAMESPACE; - source = NULL; - } - else if(strcmp(source, SDJF_SOURCE_LOCAL_SYSTEM_NAME) == 0) { - source_type = SDJF_LOCAL | SDJF_SYSTEM; - source = NULL; - } - else if(strcmp(source, SDJF_SOURCE_LOCAL_USERS_NAME) == 0) { - source_type = SDJF_LOCAL | SDJF_USER; - source = NULL; - } - else 
if(strcmp(source, SDJF_SOURCE_LOCAL_OTHER_NAME) == 0) { - source_type = SDJF_LOCAL | SDJF_OTHER; - source = NULL; + buffer_json_member_add_array(wb, JOURNAL_PARAMETER_SOURCE); + + BUFFER *sources_list = buffer_create(0, NULL); + + source_type = SDJF_NONE; + while(value) { + char *sep = strchr(value, ','); + if(sep) + *sep++ = '\0'; + + buffer_json_add_array_item_string(wb, value); + + if(strcmp(value, SDJF_SOURCE_ALL_NAME) == 0) { + source_type |= SDJF_ALL; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_LOCAL_NAME) == 0) { + source_type |= SDJF_LOCAL_ALL; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_REMOTES_NAME) == 0) { + source_type |= SDJF_REMOTE_ALL; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_NAMESPACES_NAME) == 0) { + source_type |= SDJF_LOCAL_NAMESPACE; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_LOCAL_SYSTEM_NAME) == 0) { + source_type |= SDJF_LOCAL_SYSTEM; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_LOCAL_USERS_NAME) == 0) { + source_type |= SDJF_LOCAL_USER; + value = NULL; + } + else if(strcmp(value, SDJF_SOURCE_LOCAL_OTHER_NAME) == 0) { + source_type |= SDJF_LOCAL_OTHER; + value = NULL; + } + else { + // else, match the source, whatever it is + if(buffer_strlen(sources_list)) + buffer_strcat(sources_list, ","); + + buffer_strcat(sources_list, value); + } + + value = sep; } - else { - source_type = SDJF_ALL; - // else, match the source, whatever it is + + if(buffer_strlen(sources_list)) { + simple_pattern_free(sources); + sources = simple_pattern_create(buffer_tostring(sources_list), ",", SIMPLE_PATTERN_EXACT, false); } + + buffer_free(sources_list); + + buffer_json_array_close(wb); // source } else if(strncmp(keyword, JOURNAL_PARAMETER_AFTER ":", sizeof(JOURNAL_PARAMETER_AFTER ":") - 1) == 0) { after_s = str2l(&keyword[sizeof(JOURNAL_PARAMETER_AFTER ":") - 1]); @@ -2502,7 +1975,7 @@ static void function_systemd_journal(const char *transaction, char *function, in fqs->data_only = data_only; 
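The new `source:` handling accepts a comma-separated list: well-known names OR bits into `source_type`, and any remaining tokens are collected and compiled into a `SIMPLE_PATTERN` for later matching. A simplified sketch of that split-and-classify loop (the flag values are illustrative, not netdata's real `SDJF_*` constants; note the real code stops consuming the list once a well-known name matches, while this sketch keeps scanning):

```c
#include <stdio.h>
#include <string.h>

// Illustrative flags -- stand-ins for the SDJF_* source-type bits.
#define SRC_ALL    (1u << 0)
#define SRC_LOCAL  (1u << 1)
#define SRC_REMOTE (1u << 2)

// Splits `value` in place on commas. Known names set flag bits;
// unknown tokens are appended, comma-separated, into `others`
// (which the plugin would feed to simple_pattern_create()).
static unsigned parse_sources(char *value, char *others, size_t others_size) {
    unsigned flags = 0;
    others[0] = '\0';

    while (value && *value) {
        char *sep = strchr(value, ',');
        if (sep) *sep++ = '\0';

        if (strcmp(value, "all") == 0)          flags |= SRC_ALL;
        else if (strcmp(value, "local") == 0)   flags |= SRC_LOCAL;
        else if (strcmp(value, "remotes") == 0) flags |= SRC_REMOTE;
        else {
            // unknown token: keep it for pattern matching later
            if (*others)
                strncat(others, ",", others_size - strlen(others) - 1);
            strncat(others, value, others_size - strlen(others) - 1);
        }
        value = sep;
    }
    return flags;
}
```

This is what turns the UI's new multiselect source picker into a single query filter: fixed categories become a bitmask, everything else becomes a pattern.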
fqs->delta = (fqs->data_only) ? delta : false; fqs->tail = (fqs->data_only && fqs->if_modified_since) ? tail : false; - fqs->source = string_strdupz(source); + fqs->sources = sources; fqs->source_type = source_type; fqs->entries = last; fqs->last_modified = 0; @@ -2512,6 +1985,7 @@ static void function_systemd_journal(const char *transaction, char *function, in fqs->direction = direction; fqs->anchor.start_ut = anchor; fqs->anchor.stop_ut = 0; + fqs->sampling = sampling; if(fqs->anchor.start_ut && fqs->tail) { // a tail request @@ -2574,8 +2048,8 @@ static void function_systemd_journal(const char *transaction, char *function, in buffer_json_member_add_boolean(wb, JOURNAL_PARAMETER_PROGRESS, false); buffer_json_member_add_boolean(wb, JOURNAL_PARAMETER_DELTA, fqs->delta); buffer_json_member_add_boolean(wb, JOURNAL_PARAMETER_TAIL, fqs->tail); + buffer_json_member_add_uint64(wb, JOURNAL_PARAMETER_SAMPLING, fqs->sampling); buffer_json_member_add_string(wb, JOURNAL_PARAMETER_ID, progress_id); - buffer_json_member_add_string(wb, JOURNAL_PARAMETER_SOURCE, string2str(fqs->source)); buffer_json_member_add_uint64(wb, "source_type", fqs->source_type); buffer_json_member_add_uint64(wb, JOURNAL_PARAMETER_AFTER, fqs->after_ut / USEC_PER_SEC); buffer_json_member_add_uint64(wb, JOURNAL_PARAMETER_BEFORE, fqs->before_ut / USEC_PER_SEC); @@ -2603,7 +2077,7 @@ static void function_systemd_journal(const char *transaction, char *function, in buffer_json_member_add_string(wb, "id", "source"); buffer_json_member_add_string(wb, "name", "source"); buffer_json_member_add_string(wb, "help", "Select the SystemD Journal source to query"); - buffer_json_member_add_string(wb, "type", "select"); + buffer_json_member_add_string(wb, "type", "multiselect"); buffer_json_member_add_array(wb, "options"); { available_journal_file_sources_to_json_array(wb); @@ -2632,12 +2106,6 @@ static void function_systemd_journal(const char *transaction, char *function, in response = netdata_systemd_journal_query(wb, 
facets, fqs); // ------------------------------------------------------------------------ - // cleanup query params - - string_freez(fqs->source); - fqs->source = NULL; - - // ------------------------------------------------------------------------ // handle error response if(response != HTTP_RESP_OK) { @@ -2653,6 +2121,7 @@ output: netdata_mutex_unlock(&stdout_mutex); cleanup: + simple_pattern_free(sources); facets_destroy(facets); buffer_free(wb); @@ -2663,129 +2132,8 @@ cleanup: } } -// ---------------------------------------------------------------------------- - -int main(int argc __maybe_unused, char **argv __maybe_unused) { - stderror = stderr; - clocks_init(); - - program_name = "systemd-journal.plugin"; - - // disable syslog - error_log_syslog = 0; - - // set errors flood protection to 100 logs per hour - error_log_errors_per_period = 100; - error_log_throttle_period = 3600; - - log_set_global_severity_for_external_plugins(); - - netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX"); - if(verify_netdata_host_prefix() == -1) exit(1); - - // ------------------------------------------------------------------------ - // setup the journal directories - - unsigned d = 0; - - journal_directories[d++].path = strdupz("/var/log/journal"); - journal_directories[d++].path = strdupz("/run/log/journal"); - - if(*netdata_configured_host_prefix) { - char path[PATH_MAX]; - snprintfz(path, sizeof(path), "%s/var/log/journal", netdata_configured_host_prefix); - journal_directories[d++].path = strdupz(path); - snprintfz(path, sizeof(path), "%s/run/log/journal", netdata_configured_host_prefix); - journal_directories[d++].path = strdupz(path); - } - - // terminate the list - journal_directories[d].path = NULL; - - // ------------------------------------------------------------------------ - +void journal_init_query_status(void) { function_query_status_dict = dictionary_create_advanced( DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, NULL, 
sizeof(FUNCTION_QUERY_STATUS)); - - // ------------------------------------------------------------------------ - // initialize the used hashes files registry - - used_hashes_registry = dictionary_create(DICT_OPTION_DONT_OVERWRITE_VALUE); - - - // ------------------------------------------------------------------------ - // initialize the journal files registry - - systemd_journal_session = (now_realtime_usec() / USEC_PER_SEC) * USEC_PER_SEC; - - journal_files_registry = dictionary_create_advanced( - DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, - NULL, sizeof(struct journal_file)); - - dictionary_register_insert_callback(journal_files_registry, files_registry_insert_cb, NULL); - dictionary_register_delete_callback(journal_files_registry, files_registry_delete_cb, NULL); - dictionary_register_conflict_callback(journal_files_registry, files_registry_conflict_cb, NULL); - - boot_ids_to_first_ut = dictionary_create_advanced( - DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, - NULL, sizeof(usec_t)); - - journal_files_registry_update(); - - - // ------------------------------------------------------------------------ - // debug - - if(argc == 2 && strcmp(argv[1], "debug") == 0) { - bool cancelled = false; - char buf[] = "systemd-journal after:-16000000 before:0 last:1"; - // char buf[] = "systemd-journal after:1695332964 before:1695937764 direction:backward last:100 slice:true source:all DHKucpqUoe1:PtVoyIuX.MU"; - // char buf[] = "systemd-journal after:1694511062 before:1694514662 anchor:1694514122024403"; - function_systemd_journal("123", buf, 600, &cancelled); - exit(1); - } - - // ------------------------------------------------------------------------ - // the event loop for functions - - struct functions_evloop_globals *wg = - functions_evloop_init(SYSTEMD_JOURNAL_WORKER_THREADS, "SDJ", &stdout_mutex, &plugin_should_exit); - - functions_evloop_add_function(wg, SYSTEMD_JOURNAL_FUNCTION_NAME, function_systemd_journal, - 
SYSTEMD_JOURNAL_DEFAULT_TIMEOUT); - - - // ------------------------------------------------------------------------ - - time_t started_t = now_monotonic_sec(); - - size_t iteration = 0; - usec_t step = 1000 * USEC_PER_MS; - bool tty = isatty(fileno(stderr)) == 1; - - netdata_mutex_lock(&stdout_mutex); - fprintf(stdout, PLUGINSD_KEYWORD_FUNCTION " GLOBAL \"%s\" %d \"%s\"\n", - SYSTEMD_JOURNAL_FUNCTION_NAME, SYSTEMD_JOURNAL_DEFAULT_TIMEOUT, SYSTEMD_JOURNAL_FUNCTION_DESCRIPTION); - - heartbeat_t hb; - heartbeat_init(&hb); - while(!plugin_should_exit) { - iteration++; - - netdata_mutex_unlock(&stdout_mutex); - heartbeat_next(&hb, step); - netdata_mutex_lock(&stdout_mutex); - - if(!tty) - fprintf(stdout, "\n"); - - fflush(stdout); - - time_t now = now_monotonic_sec(); - if(now - started_t > 86400) - break; - } - - exit(0); } diff --git a/collectors/systemd-journal.plugin/systemd-main.c b/collectors/systemd-journal.plugin/systemd-main.c new file mode 100644 index 000000000..a3510b0ed --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-main.c @@ -0,0 +1,112 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" +#include "libnetdata/required_dummies.h" + +#define SYSTEMD_JOURNAL_WORKER_THREADS 5 + +netdata_mutex_t stdout_mutex = NETDATA_MUTEX_INITIALIZER; +static bool plugin_should_exit = false; + +int main(int argc __maybe_unused, char **argv __maybe_unused) { + clocks_init(); + netdata_thread_set_tag("SDMAIN"); + nd_log_initialize_for_external_plugins("systemd-journal.plugin"); + + netdata_configured_host_prefix = getenv("NETDATA_HOST_PREFIX"); + if(verify_netdata_host_prefix(true) == -1) exit(1); + + // ------------------------------------------------------------------------ + // initialization + + netdata_systemd_journal_message_ids_init(); + journal_init_query_status(); + journal_init_files_and_directories(); + + // ------------------------------------------------------------------------ + // debug + + if(argc == 2 && 
strcmp(argv[1], "debug") == 0) { + journal_files_registry_update(); + + bool cancelled = false; + char buf[] = "systemd-journal after:-8640000 before:0 direction:backward last:200 data_only:false slice:true source:all"; + // char buf[] = "systemd-journal after:1695332964 before:1695937764 direction:backward last:100 slice:true source:all DHKucpqUoe1:PtVoyIuX.MU"; + // char buf[] = "systemd-journal after:1694511062 before:1694514662 anchor:1694514122024403"; + function_systemd_journal("123", buf, 600, &cancelled); +// function_systemd_units("123", "systemd-units", 600, &cancelled); + exit(1); + } +#ifdef ENABLE_SYSTEMD_DBUS + if(argc == 2 && strcmp(argv[1], "debug-units") == 0) { + bool cancelled = false; + function_systemd_units("123", "systemd-units", 600, &cancelled); + exit(1); + } +#endif + + // ------------------------------------------------------------------------ + // watcher thread + + netdata_thread_t watcher_thread; + netdata_thread_create(&watcher_thread, "SDWATCH", + NETDATA_THREAD_OPTION_DONT_LOG, journal_watcher_main, NULL); + + // ------------------------------------------------------------------------ + // the event loop for functions + + struct functions_evloop_globals *wg = + functions_evloop_init(SYSTEMD_JOURNAL_WORKER_THREADS, "SDJ", &stdout_mutex, &plugin_should_exit); + + functions_evloop_add_function(wg, SYSTEMD_JOURNAL_FUNCTION_NAME, function_systemd_journal, + SYSTEMD_JOURNAL_DEFAULT_TIMEOUT); + +#ifdef ENABLE_SYSTEMD_DBUS + functions_evloop_add_function(wg, SYSTEMD_UNITS_FUNCTION_NAME, function_systemd_units, + SYSTEMD_UNITS_DEFAULT_TIMEOUT); +#endif + + // ------------------------------------------------------------------------ + // register functions to netdata + + netdata_mutex_lock(&stdout_mutex); + + fprintf(stdout, PLUGINSD_KEYWORD_FUNCTION " GLOBAL \"%s\" %d \"%s\"\n", + SYSTEMD_JOURNAL_FUNCTION_NAME, SYSTEMD_JOURNAL_DEFAULT_TIMEOUT, SYSTEMD_JOURNAL_FUNCTION_DESCRIPTION); + +#ifdef ENABLE_SYSTEMD_DBUS + fprintf(stdout, 
PLUGINSD_KEYWORD_FUNCTION " GLOBAL \"%s\" %d \"%s\"\n", + SYSTEMD_UNITS_FUNCTION_NAME, SYSTEMD_UNITS_DEFAULT_TIMEOUT, SYSTEMD_UNITS_FUNCTION_DESCRIPTION); +#endif + + fflush(stdout); + netdata_mutex_unlock(&stdout_mutex); + + // ------------------------------------------------------------------------ + + usec_t step_ut = 100 * USEC_PER_MS; + usec_t send_newline_ut = 0; + usec_t since_last_scan_ut = SYSTEMD_JOURNAL_ALL_FILES_SCAN_EVERY_USEC * 2; // something big to trigger scanning at start + bool tty = isatty(fileno(stderr)) == 1; + + heartbeat_t hb; + heartbeat_init(&hb); + while(!plugin_should_exit) { + + if(since_last_scan_ut > SYSTEMD_JOURNAL_ALL_FILES_SCAN_EVERY_USEC) { + journal_files_registry_update(); + since_last_scan_ut = 0; + } + + usec_t dt_ut = heartbeat_next(&hb, step_ut); + since_last_scan_ut += dt_ut; + send_newline_ut += dt_ut; + + if(!tty && send_newline_ut > USEC_PER_SEC) { + send_newline_and_flush(); + send_newline_ut = 0; + } + } + + exit(0); +} diff --git a/collectors/systemd-journal.plugin/systemd-units.c b/collectors/systemd-journal.plugin/systemd-units.c new file mode 100644 index 000000000..dac158817 --- /dev/null +++ b/collectors/systemd-journal.plugin/systemd-units.c @@ -0,0 +1,1965 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "systemd-internals.h" + +#ifdef ENABLE_SYSTEMD_DBUS +#include <systemd/sd-bus.h> + +#define SYSTEMD_UNITS_MAX_PARAMS 10 +#define SYSTEMD_UNITS_DBUS_TYPES "(ssssssouso)" + +// ---------------------------------------------------------------------------- +// copied from systemd: string-table.h + +typedef char sd_char; +#define XCONCATENATE(x, y) x ## y +#define CONCATENATE(x, y) XCONCATENATE(x, y) + +#ifndef __COVERITY__ +# define VOID_0 ((void)0) +#else +# define VOID_0 ((void*)0) +#endif + +#define ELEMENTSOF(x) \ + (__builtin_choose_expr( \ + !__builtin_types_compatible_p(typeof(x), typeof(&*(x))), \ + sizeof(x)/sizeof((x)[0]), \ + VOID_0)) + +#define UNIQ_T(x, uniq) CONCATENATE(__unique_prefix_, 
CONCATENATE(x, uniq)) +#define UNIQ __COUNTER__ +#define __CMP(aq, a, bq, b) \ + ({ \ + const typeof(a) UNIQ_T(A, aq) = (a); \ + const typeof(b) UNIQ_T(B, bq) = (b); \ + UNIQ_T(A, aq) < UNIQ_T(B, bq) ? -1 : \ + UNIQ_T(A, aq) > UNIQ_T(B, bq) ? 1 : 0; \ + }) +#define CMP(a, b) __CMP(UNIQ, (a), UNIQ, (b)) + +static inline int strcmp_ptr(const sd_char *a, const sd_char *b) { + if (a && b) + return strcmp(a, b); + + return CMP(a, b); +} + +static inline bool streq_ptr(const sd_char *a, const sd_char *b) { + return strcmp_ptr(a, b) == 0; +} + +ssize_t string_table_lookup(const char * const *table, size_t len, const char *key) { + if (!key || !*key) + return -EINVAL; + + for (size_t i = 0; i < len; ++i) + if (streq_ptr(table[i], key)) + return (ssize_t) i; + + return -EINVAL; +} + +/* For basic lookup tables with strictly enumerated entries */ +#define _DEFINE_STRING_TABLE_LOOKUP_TO_STRING(name,type,scope) \ + scope const char *name##_to_string(type i) { \ + if (i < 0 || i >= (type) ELEMENTSOF(name##_table)) \ + return NULL; \ + return name##_table[i]; \ + } + +#define _DEFINE_STRING_TABLE_LOOKUP_FROM_STRING(name,type,scope) \ + scope type name##_from_string(const char *s) { \ + return (type) string_table_lookup(name##_table, ELEMENTSOF(name##_table), s); \ + } + +#define _DEFINE_STRING_TABLE_LOOKUP(name,type,scope) \ + _DEFINE_STRING_TABLE_LOOKUP_TO_STRING(name,type,scope) \ + _DEFINE_STRING_TABLE_LOOKUP_FROM_STRING(name,type,scope) + +#define DEFINE_STRING_TABLE_LOOKUP(name,type) _DEFINE_STRING_TABLE_LOOKUP(name,type,) + +// ---------------------------------------------------------------------------- +// copied from systemd: unit-def.h + +typedef enum UnitType { + UNIT_SERVICE, + UNIT_MOUNT, + UNIT_SWAP, + UNIT_SOCKET, + UNIT_TARGET, + UNIT_DEVICE, + UNIT_AUTOMOUNT, + UNIT_TIMER, + UNIT_PATH, + UNIT_SLICE, + UNIT_SCOPE, + _UNIT_TYPE_MAX, + _UNIT_TYPE_INVALID = -EINVAL, +} UnitType; + +typedef enum UnitLoadState { + UNIT_STUB, + UNIT_LOADED, + UNIT_NOT_FOUND, /* error 
condition #1: unit file not found */ + UNIT_BAD_SETTING, /* error condition #2: we couldn't parse some essential unit file setting */ + UNIT_ERROR, /* error condition #3: other "system" error, catchall for the rest */ + UNIT_MERGED, + UNIT_MASKED, + _UNIT_LOAD_STATE_MAX, + _UNIT_LOAD_STATE_INVALID = -EINVAL, +} UnitLoadState; + +typedef enum UnitActiveState { + UNIT_ACTIVE, + UNIT_RELOADING, + UNIT_INACTIVE, + UNIT_FAILED, + UNIT_ACTIVATING, + UNIT_DEACTIVATING, + UNIT_MAINTENANCE, + _UNIT_ACTIVE_STATE_MAX, + _UNIT_ACTIVE_STATE_INVALID = -EINVAL, +} UnitActiveState; + +typedef enum AutomountState { + AUTOMOUNT_DEAD, + AUTOMOUNT_WAITING, + AUTOMOUNT_RUNNING, + AUTOMOUNT_FAILED, + _AUTOMOUNT_STATE_MAX, + _AUTOMOUNT_STATE_INVALID = -EINVAL, +} AutomountState; + +typedef enum DeviceState { + DEVICE_DEAD, + DEVICE_TENTATIVE, /* mounted or swapped, but not (yet) announced by udev */ + DEVICE_PLUGGED, /* announced by udev */ + _DEVICE_STATE_MAX, + _DEVICE_STATE_INVALID = -EINVAL, +} DeviceState; + +typedef enum MountState { + MOUNT_DEAD, + MOUNT_MOUNTING, /* /usr/bin/mount is running, but the mount is not done yet. */ + MOUNT_MOUNTING_DONE, /* /usr/bin/mount is running, and the mount is done. 
*/ + MOUNT_MOUNTED, + MOUNT_REMOUNTING, + MOUNT_UNMOUNTING, + MOUNT_REMOUNTING_SIGTERM, + MOUNT_REMOUNTING_SIGKILL, + MOUNT_UNMOUNTING_SIGTERM, + MOUNT_UNMOUNTING_SIGKILL, + MOUNT_FAILED, + MOUNT_CLEANING, + _MOUNT_STATE_MAX, + _MOUNT_STATE_INVALID = -EINVAL, +} MountState; + +typedef enum PathState { + PATH_DEAD, + PATH_WAITING, + PATH_RUNNING, + PATH_FAILED, + _PATH_STATE_MAX, + _PATH_STATE_INVALID = -EINVAL, +} PathState; + +typedef enum ScopeState { + SCOPE_DEAD, + SCOPE_START_CHOWN, + SCOPE_RUNNING, + SCOPE_ABANDONED, + SCOPE_STOP_SIGTERM, + SCOPE_STOP_SIGKILL, + SCOPE_FAILED, + _SCOPE_STATE_MAX, + _SCOPE_STATE_INVALID = -EINVAL, +} ScopeState; + +typedef enum ServiceState { + SERVICE_DEAD, + SERVICE_CONDITION, + SERVICE_START_PRE, + SERVICE_START, + SERVICE_START_POST, + SERVICE_RUNNING, + SERVICE_EXITED, /* Nothing is running anymore, but RemainAfterExit is true hence this is OK */ + SERVICE_RELOAD, /* Reloading via ExecReload= */ + SERVICE_RELOAD_SIGNAL, /* Reloading via SIGHUP requested */ + SERVICE_RELOAD_NOTIFY, /* Waiting for READY=1 after RELOADING=1 notify */ + SERVICE_STOP, /* No STOP_PRE state, instead just register multiple STOP executables */ + SERVICE_STOP_WATCHDOG, + SERVICE_STOP_SIGTERM, + SERVICE_STOP_SIGKILL, + SERVICE_STOP_POST, + SERVICE_FINAL_WATCHDOG, /* In case the STOP_POST executable needs to be aborted. 
*/ + SERVICE_FINAL_SIGTERM, /* In case the STOP_POST executable hangs, we shoot that down, too */ + SERVICE_FINAL_SIGKILL, + SERVICE_FAILED, + SERVICE_DEAD_BEFORE_AUTO_RESTART, + SERVICE_FAILED_BEFORE_AUTO_RESTART, + SERVICE_DEAD_RESOURCES_PINNED, /* Like SERVICE_DEAD, but with pinned resources */ + SERVICE_AUTO_RESTART, + SERVICE_AUTO_RESTART_QUEUED, + SERVICE_CLEANING, + _SERVICE_STATE_MAX, + _SERVICE_STATE_INVALID = -EINVAL, +} ServiceState; + +typedef enum SliceState { + SLICE_DEAD, + SLICE_ACTIVE, + _SLICE_STATE_MAX, + _SLICE_STATE_INVALID = -EINVAL, +} SliceState; + +typedef enum SocketState { + SOCKET_DEAD, + SOCKET_START_PRE, + SOCKET_START_CHOWN, + SOCKET_START_POST, + SOCKET_LISTENING, + SOCKET_RUNNING, + SOCKET_STOP_PRE, + SOCKET_STOP_PRE_SIGTERM, + SOCKET_STOP_PRE_SIGKILL, + SOCKET_STOP_POST, + SOCKET_FINAL_SIGTERM, + SOCKET_FINAL_SIGKILL, + SOCKET_FAILED, + SOCKET_CLEANING, + _SOCKET_STATE_MAX, + _SOCKET_STATE_INVALID = -EINVAL, +} SocketState; + +typedef enum SwapState { + SWAP_DEAD, + SWAP_ACTIVATING, /* /sbin/swapon is running, but the swap not yet enabled. */ + SWAP_ACTIVATING_DONE, /* /sbin/swapon is running, and the swap is done. 
*/ + SWAP_ACTIVE, + SWAP_DEACTIVATING, + SWAP_DEACTIVATING_SIGTERM, + SWAP_DEACTIVATING_SIGKILL, + SWAP_FAILED, + SWAP_CLEANING, + _SWAP_STATE_MAX, + _SWAP_STATE_INVALID = -EINVAL, +} SwapState; + +typedef enum TargetState { + TARGET_DEAD, + TARGET_ACTIVE, + _TARGET_STATE_MAX, + _TARGET_STATE_INVALID = -EINVAL, +} TargetState; + +typedef enum TimerState { + TIMER_DEAD, + TIMER_WAITING, + TIMER_RUNNING, + TIMER_ELAPSED, + TIMER_FAILED, + _TIMER_STATE_MAX, + _TIMER_STATE_INVALID = -EINVAL, +} TimerState; + +typedef enum FreezerState { + FREEZER_RUNNING, + FREEZER_FREEZING, + FREEZER_FROZEN, + FREEZER_THAWING, + _FREEZER_STATE_MAX, + _FREEZER_STATE_INVALID = -EINVAL, +} FreezerState; + +// ---------------------------------------------------------------------------- +// copied from systemd: unit-def.c + +static const char* const unit_type_table[_UNIT_TYPE_MAX] = { + [UNIT_SERVICE] = "service", + [UNIT_SOCKET] = "socket", + [UNIT_TARGET] = "target", + [UNIT_DEVICE] = "device", + [UNIT_MOUNT] = "mount", + [UNIT_AUTOMOUNT] = "automount", + [UNIT_SWAP] = "swap", + [UNIT_TIMER] = "timer", + [UNIT_PATH] = "path", + [UNIT_SLICE] = "slice", + [UNIT_SCOPE] = "scope", +}; + +DEFINE_STRING_TABLE_LOOKUP(unit_type, UnitType); + +static const char* const unit_load_state_table[_UNIT_LOAD_STATE_MAX] = { + [UNIT_STUB] = "stub", + [UNIT_LOADED] = "loaded", + [UNIT_NOT_FOUND] = "not-found", + [UNIT_BAD_SETTING] = "bad-setting", + [UNIT_ERROR] = "error", + [UNIT_MERGED] = "merged", + [UNIT_MASKED] = "masked" +}; + +DEFINE_STRING_TABLE_LOOKUP(unit_load_state, UnitLoadState); + +static const char* const unit_active_state_table[_UNIT_ACTIVE_STATE_MAX] = { + [UNIT_ACTIVE] = "active", + [UNIT_RELOADING] = "reloading", + [UNIT_INACTIVE] = "inactive", + [UNIT_FAILED] = "failed", + [UNIT_ACTIVATING] = "activating", + [UNIT_DEACTIVATING] = "deactivating", + [UNIT_MAINTENANCE] = "maintenance", +}; + +DEFINE_STRING_TABLE_LOOKUP(unit_active_state, UnitActiveState); + +static const char* const 
automount_state_table[_AUTOMOUNT_STATE_MAX] = { + [AUTOMOUNT_DEAD] = "dead", + [AUTOMOUNT_WAITING] = "waiting", + [AUTOMOUNT_RUNNING] = "running", + [AUTOMOUNT_FAILED] = "failed" +}; + +DEFINE_STRING_TABLE_LOOKUP(automount_state, AutomountState); + +static const char* const device_state_table[_DEVICE_STATE_MAX] = { + [DEVICE_DEAD] = "dead", + [DEVICE_TENTATIVE] = "tentative", + [DEVICE_PLUGGED] = "plugged", +}; + +DEFINE_STRING_TABLE_LOOKUP(device_state, DeviceState); + +static const char* const mount_state_table[_MOUNT_STATE_MAX] = { + [MOUNT_DEAD] = "dead", + [MOUNT_MOUNTING] = "mounting", + [MOUNT_MOUNTING_DONE] = "mounting-done", + [MOUNT_MOUNTED] = "mounted", + [MOUNT_REMOUNTING] = "remounting", + [MOUNT_UNMOUNTING] = "unmounting", + [MOUNT_REMOUNTING_SIGTERM] = "remounting-sigterm", + [MOUNT_REMOUNTING_SIGKILL] = "remounting-sigkill", + [MOUNT_UNMOUNTING_SIGTERM] = "unmounting-sigterm", + [MOUNT_UNMOUNTING_SIGKILL] = "unmounting-sigkill", + [MOUNT_FAILED] = "failed", + [MOUNT_CLEANING] = "cleaning", +}; + +DEFINE_STRING_TABLE_LOOKUP(mount_state, MountState); + +static const char* const path_state_table[_PATH_STATE_MAX] = { + [PATH_DEAD] = "dead", + [PATH_WAITING] = "waiting", + [PATH_RUNNING] = "running", + [PATH_FAILED] = "failed" +}; + +DEFINE_STRING_TABLE_LOOKUP(path_state, PathState); + +static const char* const scope_state_table[_SCOPE_STATE_MAX] = { + [SCOPE_DEAD] = "dead", + [SCOPE_START_CHOWN] = "start-chown", + [SCOPE_RUNNING] = "running", + [SCOPE_ABANDONED] = "abandoned", + [SCOPE_STOP_SIGTERM] = "stop-sigterm", + [SCOPE_STOP_SIGKILL] = "stop-sigkill", + [SCOPE_FAILED] = "failed", +}; + +DEFINE_STRING_TABLE_LOOKUP(scope_state, ScopeState); + +static const char* const service_state_table[_SERVICE_STATE_MAX] = { + [SERVICE_DEAD] = "dead", + [SERVICE_CONDITION] = "condition", + [SERVICE_START_PRE] = "start-pre", + [SERVICE_START] = "start", + [SERVICE_START_POST] = "start-post", + [SERVICE_RUNNING] = "running", + [SERVICE_EXITED] = "exited", + 
[SERVICE_RELOAD] = "reload", + [SERVICE_RELOAD_SIGNAL] = "reload-signal", + [SERVICE_RELOAD_NOTIFY] = "reload-notify", + [SERVICE_STOP] = "stop", + [SERVICE_STOP_WATCHDOG] = "stop-watchdog", + [SERVICE_STOP_SIGTERM] = "stop-sigterm", + [SERVICE_STOP_SIGKILL] = "stop-sigkill", + [SERVICE_STOP_POST] = "stop-post", + [SERVICE_FINAL_WATCHDOG] = "final-watchdog", + [SERVICE_FINAL_SIGTERM] = "final-sigterm", + [SERVICE_FINAL_SIGKILL] = "final-sigkill", + [SERVICE_FAILED] = "failed", + [SERVICE_DEAD_BEFORE_AUTO_RESTART] = "dead-before-auto-restart", + [SERVICE_FAILED_BEFORE_AUTO_RESTART] = "failed-before-auto-restart", + [SERVICE_DEAD_RESOURCES_PINNED] = "dead-resources-pinned", + [SERVICE_AUTO_RESTART] = "auto-restart", + [SERVICE_AUTO_RESTART_QUEUED] = "auto-restart-queued", + [SERVICE_CLEANING] = "cleaning", +}; + +DEFINE_STRING_TABLE_LOOKUP(service_state, ServiceState); + +static const char* const slice_state_table[_SLICE_STATE_MAX] = { + [SLICE_DEAD] = "dead", + [SLICE_ACTIVE] = "active" +}; + +DEFINE_STRING_TABLE_LOOKUP(slice_state, SliceState); + +static const char* const socket_state_table[_SOCKET_STATE_MAX] = { + [SOCKET_DEAD] = "dead", + [SOCKET_START_PRE] = "start-pre", + [SOCKET_START_CHOWN] = "start-chown", + [SOCKET_START_POST] = "start-post", + [SOCKET_LISTENING] = "listening", + [SOCKET_RUNNING] = "running", + [SOCKET_STOP_PRE] = "stop-pre", + [SOCKET_STOP_PRE_SIGTERM] = "stop-pre-sigterm", + [SOCKET_STOP_PRE_SIGKILL] = "stop-pre-sigkill", + [SOCKET_STOP_POST] = "stop-post", + [SOCKET_FINAL_SIGTERM] = "final-sigterm", + [SOCKET_FINAL_SIGKILL] = "final-sigkill", + [SOCKET_FAILED] = "failed", + [SOCKET_CLEANING] = "cleaning", +}; + +DEFINE_STRING_TABLE_LOOKUP(socket_state, SocketState); + +static const char* const swap_state_table[_SWAP_STATE_MAX] = { + [SWAP_DEAD] = "dead", + [SWAP_ACTIVATING] = "activating", + [SWAP_ACTIVATING_DONE] = "activating-done", + [SWAP_ACTIVE] = "active", + [SWAP_DEACTIVATING] = "deactivating", + [SWAP_DEACTIVATING_SIGTERM] = 
"deactivating-sigterm", + [SWAP_DEACTIVATING_SIGKILL] = "deactivating-sigkill", + [SWAP_FAILED] = "failed", + [SWAP_CLEANING] = "cleaning", +}; + +DEFINE_STRING_TABLE_LOOKUP(swap_state, SwapState); + +static const char* const target_state_table[_TARGET_STATE_MAX] = { + [TARGET_DEAD] = "dead", + [TARGET_ACTIVE] = "active" +}; + +DEFINE_STRING_TABLE_LOOKUP(target_state, TargetState); + +static const char* const timer_state_table[_TIMER_STATE_MAX] = { + [TIMER_DEAD] = "dead", + [TIMER_WAITING] = "waiting", + [TIMER_RUNNING] = "running", + [TIMER_ELAPSED] = "elapsed", + [TIMER_FAILED] = "failed" +}; + +DEFINE_STRING_TABLE_LOOKUP(timer_state, TimerState); + +static const char* const freezer_state_table[_FREEZER_STATE_MAX] = { + [FREEZER_RUNNING] = "running", + [FREEZER_FREEZING] = "freezing", + [FREEZER_FROZEN] = "frozen", + [FREEZER_THAWING] = "thawing", +}; + +DEFINE_STRING_TABLE_LOOKUP(freezer_state, FreezerState); + +// ---------------------------------------------------------------------------- +// our code + +typedef struct UnitAttribute { + union { + int boolean; + char *str; + uint64_t uint64; + int64_t int64; + uint32_t uint32; + int32_t int32; + double dbl; + }; +} UnitAttribute; + +struct UnitInfo; +typedef void (*attribute_handler_t)(struct UnitInfo *u, UnitAttribute *ua); + +static void update_freezer_state(struct UnitInfo *u, UnitAttribute *ua); + +struct { + const char *member; + char value_type; + + const char *show_as; + const char *info; + RRDF_FIELD_OPTIONS options; + RRDF_FIELD_FILTER filter; + + attribute_handler_t handler; +} unit_attributes[] = { + { + .member = "Type", + .value_type = SD_BUS_TYPE_STRING, + .show_as = "ServiceType", + .info = "Service Type", + .options = RRDF_FIELD_OPTS_VISIBLE, + .filter = RRDF_FIELD_FILTER_MULTISELECT, + }, { + .member = "Result", + .value_type = SD_BUS_TYPE_STRING, + .show_as = "Result", + .info = "Result", + .options = RRDF_FIELD_OPTS_VISIBLE, + .filter = RRDF_FIELD_FILTER_MULTISELECT, + }, { + .member = 
"UnitFileState", + .value_type = SD_BUS_TYPE_STRING, + .show_as = "Enabled", + .info = "Unit File State", + .options = RRDF_FIELD_OPTS_NONE, + .filter = RRDF_FIELD_FILTER_MULTISELECT, + }, { + .member = "UnitFilePreset", + .value_type = SD_BUS_TYPE_STRING, + .show_as = "Preset", + .info = "Unit File Preset", + .options = RRDF_FIELD_OPTS_NONE, + .filter = RRDF_FIELD_FILTER_MULTISELECT, + }, { + .member = "FreezerState", + .value_type = SD_BUS_TYPE_STRING, + .show_as = "FreezerState", + .info = "Freezer State", + .options = RRDF_FIELD_OPTS_NONE, + .filter = RRDF_FIELD_FILTER_MULTISELECT, + .handler = update_freezer_state, + }, +// { .member = "Id", .signature = "s", }, +// { .member = "LoadState", .signature = "s", }, +// { .member = "ActiveState", .signature = "s", }, +// { .member = "SubState", .signature = "s", }, +// { .member = "Description", .signature = "s", }, +// { .member = "Following", .signature = "s", }, +// { .member = "Documentation", .signature = "as", }, +// { .member = "FragmentPath", .signature = "s", }, +// { .member = "SourcePath", .signature = "s", }, +// { .member = "ControlGroup", .signature = "s", }, +// { .member = "DropInPaths", .signature = "as", }, +// { .member = "LoadError", .signature = "(ss)", }, +// { .member = "TriggeredBy", .signature = "as", }, +// { .member = "Triggers", .signature = "as", }, +// { .member = "InactiveExitTimestamp", .signature = "t", }, +// { .member = "InactiveExitTimestampMonotonic", .signature = "t", }, +// { .member = "ActiveEnterTimestamp", .signature = "t", }, +// { .member = "ActiveExitTimestamp", .signature = "t", }, +// { .member = "RuntimeMaxUSec", .signature = "t", }, +// { .member = "InactiveEnterTimestamp", .signature = "t", }, +// { .member = "NeedDaemonReload", .signature = "b", }, +// { .member = "Transient", .signature = "b", }, +// { .member = "ExecMainPID", .signature = "u", }, +// { .member = "MainPID", .signature = "u", }, +// { .member = "ControlPID", .signature = "u", }, +// { .member = 
"StatusText", .signature = "s", }, +// { .member = "PIDFile", .signature = "s", }, +// { .member = "StatusErrno", .signature = "i", }, +// { .member = "FileDescriptorStoreMax", .signature = "u", }, +// { .member = "NFileDescriptorStore", .signature = "u", }, +// { .member = "ExecMainStartTimestamp", .signature = "t", }, +// { .member = "ExecMainExitTimestamp", .signature = "t", }, +// { .member = "ExecMainCode", .signature = "i", }, +// { .member = "ExecMainStatus", .signature = "i", }, +// { .member = "LogNamespace", .signature = "s", }, +// { .member = "ConditionTimestamp", .signature = "t", }, +// { .member = "ConditionResult", .signature = "b", }, +// { .member = "Conditions", .signature = "a(sbbsi)", }, +// { .member = "AssertTimestamp", .signature = "t", }, +// { .member = "AssertResult", .signature = "b", }, +// { .member = "Asserts", .signature = "a(sbbsi)", }, +// { .member = "NextElapseUSecRealtime", .signature = "t", }, +// { .member = "NextElapseUSecMonotonic", .signature = "t", }, +// { .member = "NAccepted", .signature = "u", }, +// { .member = "NConnections", .signature = "u", }, +// { .member = "NRefused", .signature = "u", }, +// { .member = "Accept", .signature = "b", }, +// { .member = "Listen", .signature = "a(ss)", }, +// { .member = "SysFSPath", .signature = "s", }, +// { .member = "Where", .signature = "s", }, +// { .member = "What", .signature = "s", }, +// { .member = "MemoryCurrent", .signature = "t", }, +// { .member = "MemoryAvailable", .signature = "t", }, +// { .member = "DefaultMemoryMin", .signature = "t", }, +// { .member = "DefaultMemoryLow", .signature = "t", }, +// { .member = "DefaultStartupMemoryLow", .signature = "t", }, +// { .member = "MemoryMin", .signature = "t", }, +// { .member = "MemoryLow", .signature = "t", }, +// { .member = "StartupMemoryLow", .signature = "t", }, +// { .member = "MemoryHigh", .signature = "t", }, +// { .member = "StartupMemoryHigh", .signature = "t", }, +// { .member = "MemoryMax", .signature = 
"t", }, +// { .member = "StartupMemoryMax", .signature = "t", }, +// { .member = "MemorySwapMax", .signature = "t", }, +// { .member = "StartupMemorySwapMax", .signature = "t", }, +// { .member = "MemoryZSwapMax", .signature = "t", }, +// { .member = "StartupMemoryZSwapMax", .signature = "t", }, +// { .member = "MemoryLimit", .signature = "t", }, +// { .member = "CPUUsageNSec", .signature = "t", }, +// { .member = "TasksCurrent", .signature = "t", }, +// { .member = "TasksMax", .signature = "t", }, +// { .member = "IPIngressBytes", .signature = "t", }, +// { .member = "IPEgressBytes", .signature = "t", }, +// { .member = "IOReadBytes", .signature = "t", }, +// { .member = "IOWriteBytes", .signature = "t", }, +// { .member = "ExecCondition", .signature = "a(sasbttttuii)", }, +// { .member = "ExecConditionEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecStartPre", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStartPreEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecStart", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStartEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecStartPost", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStartPostEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecReload", .signature = "a(sasbttttuii)", }, +// { .member = "ExecReloadEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecStopPre", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStop", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStopEx", .signature = "a(sasasttttuii)", }, +// { .member = "ExecStopPost", .signature = "a(sasbttttuii)", }, +// { .member = "ExecStopPostEx", .signature = "a(sasasttttuii)", }, +}; + +#define _UNIT_ATTRIBUTE_MAX (sizeof(unit_attributes) / sizeof(unit_attributes[0])) + +typedef struct UnitInfo { + char *id; + char *type; + char *description; + char *load_state; + char *active_state; + char *sub_state; + char *following; + char *unit_path; + uint32_t job_id; + 
char *job_type; + char *job_path; + + UnitType UnitType; + UnitLoadState UnitLoadState; + UnitActiveState UnitActiveState; + FreezerState FreezerState; + + union { + AutomountState AutomountState; + DeviceState DeviceState; + MountState MountState; + PathState PathState; + ScopeState ScopeState; + ServiceState ServiceState; + SliceState SliceState; + SocketState SocketState; + SwapState SwapState; + TargetState TargetState; + TimerState TimerState; + }; + + struct UnitAttribute attributes[_UNIT_ATTRIBUTE_MAX]; + + FACET_ROW_SEVERITY severity; + uint32_t prio; + + struct UnitInfo *prev, *next; +} UnitInfo; + +static void update_freezer_state(UnitInfo *u, UnitAttribute *ua) { + u->FreezerState = freezer_state_from_string(ua->str); +} + +// ---------------------------------------------------------------------------- +// common helpers + +static void log_dbus_error(int r, const char *msg) { + netdata_log_error("SYSTEMD_UNITS: %s failed with error %d (%s)", msg, r, strerror(-r)); +} + +// ---------------------------------------------------------------------------- +// attributes management + +static inline ssize_t unit_property_slot_from_string(const char *s) { + if(!s || !*s) + return -EINVAL; + + for(size_t i = 0; i < _UNIT_ATTRIBUTE_MAX ;i++) + if(streq_ptr(unit_attributes[i].member, s)) + return (ssize_t)i; + + return -EINVAL; +} + +static inline const char *unit_property_name_to_string_from_slot(ssize_t i) { + if(i >= 0 && i < (ssize_t)_UNIT_ATTRIBUTE_MAX) + return unit_attributes[i].member; + + return NULL; +} + +static inline void systemd_unit_free_property(char type, struct UnitAttribute *at) { + switch(type) { + case SD_BUS_TYPE_STRING: + case SD_BUS_TYPE_OBJECT_PATH: + freez(at->str); + at->str = NULL; + break; + + default: + break; + } +} + +static int systemd_unit_get_property(sd_bus_message *m, UnitInfo *u, const char *name) { + int r; + char type; + + r = sd_bus_message_peek_type(m, &type, NULL); + if(r < 0) { + log_dbus_error(r, 
"sd_bus_message_peek_type()"); + return r; + } + + ssize_t slot = unit_property_slot_from_string(name); + if(slot < 0) { + // internal_error(true, "unused attribute '%s' for unit '%s'", name, u->id); + sd_bus_message_skip(m, NULL); + return 0; + } + + systemd_unit_free_property(unit_attributes[slot].value_type, &u->attributes[slot]); + + if(unit_attributes[slot].value_type != type) { + netdata_log_error("Type of field '%s' expected to be '%c' but found '%c'. Ignoring field.", + unit_attributes[slot].member, unit_attributes[slot].value_type, type); + sd_bus_message_skip(m, NULL); + return 0; + } + + switch (type) { + case SD_BUS_TYPE_OBJECT_PATH: + case SD_BUS_TYPE_STRING: { + char *s; + + r = sd_bus_message_read_basic(m, type, &s); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + + if(s && *s) + u->attributes[slot].str = strdupz(s); + } + break; + + case SD_BUS_TYPE_BOOLEAN: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].boolean); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + } + break; + + case SD_BUS_TYPE_UINT64: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].uint64); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + } + break; + + case SD_BUS_TYPE_INT64: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].int64); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + } + break; + + case SD_BUS_TYPE_UINT32: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].uint32); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + } + break; + + case SD_BUS_TYPE_INT32: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].int32); + if(r < 0) { + log_dbus_error(r, "sd_bus_message_read_basic()"); + return r; + } + } + break; + + case SD_BUS_TYPE_DOUBLE: { + r = sd_bus_message_read_basic(m, type, &u->attributes[slot].dbl); + if(r < 0) { + 
log_dbus_error(r, "sd_bus_message_read_basic()");
+                return r;
+            }
+        }
+        break;
+
+        case SD_BUS_TYPE_ARRAY: {
+            internal_error(true, "member '%s' is an array", name);
+            sd_bus_message_skip(m, NULL);
+            return 0;
+        }
+        break;
+
+        default: {
+            internal_error(true, "unknown field type '%c' for key '%s'", type, name);
+            sd_bus_message_skip(m, NULL);
+            return 0;
+        }
+        break;
+    }
+
+    if(unit_attributes[slot].handler)
+        unit_attributes[slot].handler(u, &u->attributes[slot]);
+
+    return 0;
+}
+
+static int systemd_unit_get_all_properties(sd_bus *bus, UnitInfo *u) {
+    _cleanup_(sd_bus_message_unrefp) sd_bus_message *m = NULL;
+    _cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
+    int r;
+
+    r = sd_bus_call_method(bus,
+                           "org.freedesktop.systemd1",
+                           u->unit_path,
+                           "org.freedesktop.DBus.Properties",
+                           "GetAll",
+                           &error,
+                           &m,
+                           "s", "");
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_call_method(p1)");
+        return r;
+    }
+
+    r = sd_bus_message_enter_container(m, SD_BUS_TYPE_ARRAY, "{sv}");
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_message_enter_container(p2)");
+        return r;
+    }
+
+    int c = 0;
+    while ((r = sd_bus_message_enter_container(m, SD_BUS_TYPE_DICT_ENTRY, "sv")) > 0) {
+        const char *member, *contents;
+        c++;
+
+        r = sd_bus_message_read_basic(m, SD_BUS_TYPE_STRING, &member);
+        if (r < 0) {
+            log_dbus_error(r, "sd_bus_message_read_basic(p3)");
+            return r;
+        }
+
+        r = sd_bus_message_peek_type(m, NULL, &contents);
+        if (r < 0) {
+            log_dbus_error(r, "sd_bus_message_peek_type(p4)");
+            return r;
+        }
+
+        r = sd_bus_message_enter_container(m, SD_BUS_TYPE_VARIANT, contents);
+        if (r < 0) {
+            log_dbus_error(r, "sd_bus_message_enter_container(p5)");
+            return r;
+        }
+
+        systemd_unit_get_property(m, u, member);
+
+        r = sd_bus_message_exit_container(m);
+        if(r < 0) {
+            log_dbus_error(r, "sd_bus_message_exit_container(p6)");
+            return r;
+        }
+
+        r = sd_bus_message_exit_container(m);
+        if(r < 0) {
+            log_dbus_error(r, "sd_bus_message_exit_container(p7)");
+            return r;
+        }
+    }
+    if(r < 0) {
+        log_dbus_error(r, "sd_bus_message_enter_container(p8)");
+        return r;
+    }
+
+    r = sd_bus_message_exit_container(m);
+    if(r < 0) {
+        log_dbus_error(r, "sd_bus_message_exit_container(p9)");
+        return r;
+    }
+
+    return 0;
+}
+
+static void systemd_units_get_all_properties(sd_bus *bus, UnitInfo *base) {
+    for(UnitInfo *u = base ; u ;u = u->next)
+        systemd_unit_get_all_properties(bus, u);
+}
+
+
+
+// ----------------------------------------------------------------------------
+// main unit info
+
+int bus_parse_unit_info(sd_bus_message *message, UnitInfo *u) {
+    assert(message);
+    assert(u);
+
+    u->type = NULL;
+
+    int r = sd_bus_message_read(
+        message,
+        SYSTEMD_UNITS_DBUS_TYPES,
+        &u->id,
+        &u->description,
+        &u->load_state,
+        &u->active_state,
+        &u->sub_state,
+        &u->following,
+        &u->unit_path,
+        &u->job_id,
+        &u->job_type,
+        &u->job_path);
+
+    if(r <= 0)
+        return r;
+
+    char *dot;
+    if(u->id && (dot = strrchr(u->id, '.')) != NULL)
+        u->type = &dot[1];
+    else
+        u->type = "unknown";
+
+    u->UnitType = unit_type_from_string(u->type);
+    u->UnitLoadState = unit_load_state_from_string(u->load_state);
+    u->UnitActiveState = unit_active_state_from_string(u->active_state);
+
+    switch(u->UnitType) {
+        case UNIT_SERVICE:
+            u->ServiceState = service_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_MOUNT:
+            u->MountState = mount_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_SWAP:
+            u->SwapState = swap_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_SOCKET:
+            u->SocketState = socket_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_TARGET:
+            u->TargetState = target_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_DEVICE:
+            u->DeviceState = device_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_AUTOMOUNT:
+            u->AutomountState = automount_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_TIMER:
+            u->TimerState = timer_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_PATH:
+            u->PathState = path_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_SLICE:
+            u->SliceState = slice_state_from_string(u->sub_state);
+            break;
+
+        case UNIT_SCOPE:
+            u->ScopeState = scope_state_from_string(u->sub_state);
+            break;
+
+        default:
+            break;
+    }
+
+    return r;
+}
+
+static int hex_to_int(char c) {
+    if (c >= '0' && c <= '9') return c - '0';
+    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
+    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
+    return 0;
+}
+
+// un-escape hex sequences (\xNN) in id
+static void txt_decode(char *txt) {
+    if(!txt || !*txt)
+        return;
+
+    char *src = txt, *dst = txt;
+
+    size_t id_len = strlen(src);
+    size_t s = 0, d = 0;
+    for(; s < id_len ; s++) {
+        if(src[s] == '\\' && src[s + 1] == 'x' && isxdigit(src[s + 2]) && isxdigit(src[s + 3])) {
+            int value = (hex_to_int(src[s + 2]) << 4) + hex_to_int(src[s + 3]);
+            dst[d++] = (char)value;
+            s += 3;
+        }
+        else
+            dst[d++] = src[s];
+    }
+    dst[d] = '\0';
+}
+
+static UnitInfo *systemd_units_get_all(void) {
+    _cleanup_(sd_bus_unrefp) sd_bus *bus = NULL;
+    _cleanup_(sd_bus_error_free) sd_bus_error error = SD_BUS_ERROR_NULL;
+    _cleanup_(sd_bus_message_unrefp) sd_bus_message *reply = NULL;
+
+    UnitInfo *base = NULL;
+    int r;
+
+    r = sd_bus_default_system(&bus);
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_default_system()");
+        return base;
+    }
+
+    // This calls the ListUnits method of the org.freedesktop.systemd1.Manager interface
+    // Replace "ListUnits" with "ListUnitsFiltered" to get specific units based on filters
+    r = sd_bus_call_method(bus,
+                           "org.freedesktop.systemd1",         /* service to contact */
+                           "/org/freedesktop/systemd1",        /* object path */
+                           "org.freedesktop.systemd1.Manager", /* interface name */
+                           "ListUnits",                        /* method name */
+                           &error,                             /* object to return error in */
+                           &reply,                             /* return message on success */
+                           NULL);                              /* input signature */
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_call_method()");
+        return base;
+    }
+
+    r = sd_bus_message_enter_container(reply, SD_BUS_TYPE_ARRAY, SYSTEMD_UNITS_DBUS_TYPES);
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_message_enter_container()");
+        return base;
+    }
+
+    UnitInfo u;
+    memset(&u, 0, sizeof(u));
+    while ((r = bus_parse_unit_info(reply, &u)) > 0) {
+        UnitInfo *i = callocz(1, sizeof(u));
+        *i = u;
+
+        i->id = strdupz(u.id && *u.id ? u.id : "-");
+        txt_decode(i->id);
+
+        i->type = strdupz(u.type && *u.type ? u.type : "-");
+        i->description = strdupz(u.description && *u.description ? u.description : "-");
+        txt_decode(i->description);
+
+        i->load_state = strdupz(u.load_state && *u.load_state ? u.load_state : "-");
+        i->active_state = strdupz(u.active_state && *u.active_state ? u.active_state : "-");
+        i->sub_state = strdupz(u.sub_state && *u.sub_state ? u.sub_state : "-");
+        i->following = strdupz(u.following && *u.following ? u.following : "-");
+        i->unit_path = strdupz(u.unit_path && *u.unit_path ? u.unit_path : "-");
+        i->job_type = strdupz(u.job_type && *u.job_type ? u.job_type : "-");
+        i->job_path = strdupz(u.job_path && *u.job_path ? u.job_path : "-");
+        i->job_id = u.job_id;
+
+        DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE(base, i, prev, next);
+        memset(&u, 0, sizeof(u));
+    }
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_message_read()");
+        return base;
+    }
+
+    r = sd_bus_message_exit_container(reply);
+    if (r < 0) {
+        log_dbus_error(r, "sd_bus_message_exit_container()");
+        return base;
+    }
+
+    systemd_units_get_all_properties(bus, base);
+
+    return base;
+}
+
+void systemd_units_free_all(UnitInfo *base) {
+    while(base) {
+        UnitInfo *u = base;
+        DOUBLE_LINKED_LIST_REMOVE_ITEM_UNSAFE(base, u, prev, next);
+        freez((void *)u->id);
+        freez((void *)u->type);
+        freez((void *)u->description);
+        freez((void *)u->load_state);
+        freez((void *)u->active_state);
+        freez((void *)u->sub_state);
+        freez((void *)u->following);
+        freez((void *)u->unit_path);
+        freez((void *)u->job_type);
+        freez((void *)u->job_path);
+
+        for(int i = 0; i < (ssize_t)_UNIT_ATTRIBUTE_MAX ;i++)
+            systemd_unit_free_property(unit_attributes[i].value_type, &u->attributes[i]);
+
+        freez(u);
+    }
+}
+
+// ----------------------------------------------------------------------------
+
+static void netdata_systemd_units_function_help(const char *transaction) {
+    BUFFER *wb = buffer_create(0, NULL);
+    buffer_sprintf(wb,
+                   "%s / %s\n"
+                   "\n"
+                   "%s\n"
+                   "\n"
+                   "The following parameters are supported:\n"
+                   "\n"
+                   "   help\n"
+                   "      Shows this help message.\n"
+                   "\n"
+                   "   info\n"
+                   "      Request initial configuration information about the plugin.\n"
+                   "      The key entity returned is the required_params array, which includes\n"
+                   "      all the available systemd journal sources.\n"
+                   "      When `info` is requested, all other parameters are ignored.\n"
+                   "\n"
+                   , program_name
+                   , SYSTEMD_UNITS_FUNCTION_NAME
+                   , SYSTEMD_UNITS_FUNCTION_DESCRIPTION
+    );
+
+    netdata_mutex_lock(&stdout_mutex);
+    pluginsd_function_result_to_stdout(transaction, HTTP_RESP_OK, "text/plain", now_realtime_sec() + 3600, wb);
+    netdata_mutex_unlock(&stdout_mutex);
+
+    buffer_free(wb);
+}
+
+static void netdata_systemd_units_function_info(const char *transaction) {
+    BUFFER *wb = buffer_create(0, NULL);
+    buffer_json_initialize(wb, "\"", "\"", 0, true, BUFFER_JSON_OPTIONS_MINIFY);
+
+    buffer_json_member_add_uint64(wb, "status", HTTP_RESP_OK);
+    buffer_json_member_add_string(wb, "type", "table");
+    buffer_json_member_add_string(wb, "help", SYSTEMD_UNITS_FUNCTION_DESCRIPTION);
+
+    buffer_json_finalize(wb);
+    netdata_mutex_lock(&stdout_mutex);
+    pluginsd_function_result_to_stdout(transaction, HTTP_RESP_OK, "text/plain", now_realtime_sec() + 3600, wb);
+    netdata_mutex_unlock(&stdout_mutex);
+
+    buffer_free(wb);
+}
+
+// ----------------------------------------------------------------------------
+
+static void systemd_unit_priority(UnitInfo *u, size_t units) {
+    uint32_t prio;
+
+    switch(u->severity) {
+        case FACET_ROW_SEVERITY_CRITICAL:
+            prio = 0;
+            break;
+
+        default:
+        case FACET_ROW_SEVERITY_WARNING:
+            prio = 1;
+            break;
+
+        case FACET_ROW_SEVERITY_NOTICE:
+            prio = 2;
+            break;
+
+        case FACET_ROW_SEVERITY_NORMAL:
+            prio = 3;
+            break;
+
+        case FACET_ROW_SEVERITY_DEBUG:
+            prio = 4;
+            break;
+    }
+
+    prio = prio * (uint32_t)(_UNIT_TYPE_MAX + 1) + (uint32_t)u->UnitType;
+    u->prio = (prio * units) + u->prio;
+}
+
+#define if_less(current, max, target) ({                    \
+    typeof(current) _wanted = (current);                    \
+    if((current) < (target))                                \
+        _wanted = (target) > (max) ? (max) : (target);      \
+    _wanted;                                                \
+})
+
+#define if_normal(current, max, target) ({                  \
+    typeof(current) _wanted = (current);                    \
+    if((current) == FACET_ROW_SEVERITY_NORMAL)              \
+        _wanted = (target) > (max) ? (max) : (target);      \
+    _wanted;                                                \
+})
+
+FACET_ROW_SEVERITY system_unit_severity(UnitInfo *u) {
+    FACET_ROW_SEVERITY severity, max_severity;
+
+    switch(u->UnitLoadState) {
+        case UNIT_ERROR:
+        case UNIT_BAD_SETTING:
+            severity = FACET_ROW_SEVERITY_CRITICAL;
+            max_severity = FACET_ROW_SEVERITY_CRITICAL;
+            break;
+
+        default:
+            severity = FACET_ROW_SEVERITY_WARNING;
+            max_severity = FACET_ROW_SEVERITY_CRITICAL;
+            break;
+
+        case UNIT_NOT_FOUND:
+            severity = FACET_ROW_SEVERITY_NOTICE;
+            max_severity = FACET_ROW_SEVERITY_NOTICE;
+            break;
+
+        case UNIT_LOADED:
+            severity = FACET_ROW_SEVERITY_NORMAL;
+            max_severity = FACET_ROW_SEVERITY_CRITICAL;
+            break;
+
+        case UNIT_MERGED:
+        case UNIT_MASKED:
+        case UNIT_STUB:
+            severity = FACET_ROW_SEVERITY_DEBUG;
+            max_severity = FACET_ROW_SEVERITY_DEBUG;
+            break;
+    }
+
+    switch(u->UnitActiveState) {
+        case UNIT_FAILED:
+            severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+            break;
+
+        default:
+        case UNIT_RELOADING:
+        case UNIT_ACTIVATING:
+        case UNIT_DEACTIVATING:
+            severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+            break;
+
+        case UNIT_MAINTENANCE:
+            severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+            break;
+
+        case UNIT_ACTIVE:
+            break;
+
+        case UNIT_INACTIVE:
+            severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+            break;
+    }
+
+    switch(u->FreezerState) {
+        default:
+        case FREEZER_FROZEN:
+        case FREEZER_FREEZING:
+        case FREEZER_THAWING:
+            severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+            break;
+
+        case FREEZER_RUNNING:
+            break;
+    }
+
+    switch(u->UnitType) {
+        case UNIT_SERVICE:
+            switch(u->ServiceState) {
+                case SERVICE_FAILED:
+                case SERVICE_FAILED_BEFORE_AUTO_RESTART:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case SERVICE_STOP:
+                case SERVICE_STOP_WATCHDOG:
+                case SERVICE_STOP_SIGTERM:
+                case SERVICE_STOP_SIGKILL:
+                case SERVICE_STOP_POST:
+                case SERVICE_FINAL_WATCHDOG:
+                case SERVICE_FINAL_SIGTERM:
+                case SERVICE_FINAL_SIGKILL:
+                case SERVICE_AUTO_RESTART:
+                case SERVICE_AUTO_RESTART_QUEUED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case SERVICE_CONDITION:
+                case SERVICE_START_PRE:
+                case SERVICE_START:
+                case SERVICE_START_POST:
+                case SERVICE_RELOAD:
+                case SERVICE_RELOAD_SIGNAL:
+                case SERVICE_RELOAD_NOTIFY:
+                case SERVICE_DEAD_RESOURCES_PINNED:
+                case SERVICE_CLEANING:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case SERVICE_EXITED:
+                case SERVICE_RUNNING:
+                    break;
+
+                case SERVICE_DEAD:
+                case SERVICE_DEAD_BEFORE_AUTO_RESTART:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_MOUNT:
+            switch(u->MountState) {
+                case MOUNT_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case MOUNT_REMOUNTING_SIGTERM:
+                case MOUNT_REMOUNTING_SIGKILL:
+                case MOUNT_UNMOUNTING_SIGTERM:
+                case MOUNT_UNMOUNTING_SIGKILL:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case MOUNT_MOUNTING:
+                case MOUNT_MOUNTING_DONE:
+                case MOUNT_REMOUNTING:
+                case MOUNT_UNMOUNTING:
+                case MOUNT_CLEANING:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case MOUNT_MOUNTED:
+                    break;
+
+                case MOUNT_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_SWAP:
+            switch(u->SwapState) {
+                case SWAP_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case SWAP_DEACTIVATING_SIGTERM:
+                case SWAP_DEACTIVATING_SIGKILL:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case SWAP_ACTIVATING:
+                case SWAP_ACTIVATING_DONE:
+                case SWAP_DEACTIVATING:
+                case SWAP_CLEANING:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case SWAP_ACTIVE:
+                    break;
+
+                case SWAP_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_SOCKET:
+            switch(u->SocketState) {
+                case SOCKET_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case SOCKET_STOP_PRE_SIGTERM:
+                case SOCKET_STOP_PRE_SIGKILL:
+                case SOCKET_FINAL_SIGTERM:
+                case SOCKET_FINAL_SIGKILL:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case SOCKET_START_PRE:
+                case SOCKET_START_CHOWN:
+                case SOCKET_START_POST:
+                case SOCKET_STOP_PRE:
+                case SOCKET_STOP_POST:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case SOCKET_RUNNING:
+                case SOCKET_LISTENING:
+                    break;
+
+                case SOCKET_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_TARGET:
+            switch(u->TargetState) {
+                default:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case TARGET_ACTIVE:
+                    break;
+
+                case TARGET_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_DEVICE:
+            switch(u->DeviceState) {
+                default:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case DEVICE_TENTATIVE:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case DEVICE_PLUGGED:
+                    break;
+
+                case DEVICE_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_AUTOMOUNT:
+            switch(u->AutomountState) {
+                case AUTOMOUNT_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case AUTOMOUNT_WAITING:
+                case AUTOMOUNT_RUNNING:
+                    break;
+
+                case AUTOMOUNT_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_TIMER:
+            switch(u->TimerState) {
+                case TIMER_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case TIMER_ELAPSED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case TIMER_WAITING:
+                case TIMER_RUNNING:
+                    break;
+
+                case TIMER_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_PATH:
+            switch(u->PathState) {
+                case PATH_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case PATH_WAITING:
+                case PATH_RUNNING:
+                    break;
+
+                case PATH_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_SLICE:
+            switch(u->SliceState) {
+                default:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case SLICE_ACTIVE:
+                    break;
+
+                case SLICE_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        case UNIT_SCOPE:
+            switch(u->ScopeState) {
+                case SCOPE_FAILED:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_CRITICAL);
+                    break;
+
+                default:
+                case SCOPE_STOP_SIGTERM:
+                case SCOPE_STOP_SIGKILL:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+                    break;
+
+                case SCOPE_ABANDONED:
+                case SCOPE_START_CHOWN:
+                    severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_NOTICE);
+                    break;
+
+                case SCOPE_RUNNING:
+                    break;
+
+                case SCOPE_DEAD:
+                    severity = if_normal(severity, max_severity, FACET_ROW_SEVERITY_DEBUG);
+                    break;
+            }
+            break;
+
+        default:
+            severity = if_less(severity, max_severity, FACET_ROW_SEVERITY_WARNING);
+            break;
+    }
+
+    u->severity = severity;
+    return severity;
+}
+
+int unit_info_compar(const void *a, const void *b) {
+    UnitInfo *u1 = *((UnitInfo **)a);
+    UnitInfo *u2 = *((UnitInfo **)b);
+
+    return strcasecmp(u1->id, u2->id);
+}
+
+void systemd_units_assign_priority(UnitInfo *base) {
+    size_t units = 0, c = 0, prio = 0;
+    for(UnitInfo *u = base; u ; u = u->next)
+        units++;
+
+    UnitInfo *array[units];
+    for(UnitInfo *u = base; u ; u = u->next)
+        array[c++] = u;
+
+    qsort(array, units, sizeof(UnitInfo *), unit_info_compar);
+
+    for(c = 0; c < units ; c++) {
+        array[c]->prio = prio++;
+        system_unit_severity(array[c]);
+        systemd_unit_priority(array[c], units);
+    }
+}
+
+void function_systemd_units(const char *transaction, char *function, int timeout, bool *cancelled) {
+    char *words[SYSTEMD_UNITS_MAX_PARAMS] = { NULL };
+    size_t num_words = quoted_strings_splitter_pluginsd(function, words, SYSTEMD_UNITS_MAX_PARAMS);
+    for(int i = 1; i < SYSTEMD_UNITS_MAX_PARAMS ;i++) {
+        char *keyword = get_word(words, num_words, i);
+        if(!keyword) break;
+
+        if(strcmp(keyword, "info") == 0) {
+            netdata_systemd_units_function_info(transaction);
+            return;
+        }
+        else if(strcmp(keyword, "help") == 0) {
+            netdata_systemd_units_function_help(transaction);
+            return;
+        }
+    }
+
+    UnitInfo *base = systemd_units_get_all();
+    systemd_units_assign_priority(base);
+
+    BUFFER *wb = buffer_create(0, NULL);
+    buffer_json_initialize(wb, "\"", "\"", 0, true, BUFFER_JSON_OPTIONS_MINIFY);
+
+    buffer_json_member_add_uint64(wb, "status", HTTP_RESP_OK);
+    buffer_json_member_add_string(wb, "type", "table");
+    buffer_json_member_add_time_t(wb, "update_every", 10);
+    buffer_json_member_add_string(wb, "help", SYSTEMD_UNITS_FUNCTION_DESCRIPTION);
+    buffer_json_member_add_array(wb, "data");
+
+    size_t count[_UNIT_ATTRIBUTE_MAX] = { 0 };
+    struct UnitAttribute max[_UNIT_ATTRIBUTE_MAX];
+
+    for(UnitInfo *u = base; u ;u = u->next) {
+        buffer_json_add_array_item_array(wb);
+        {
+            buffer_json_add_array_item_string(wb, u->id);
+
+            buffer_json_add_array_item_object(wb);
+            {
+                buffer_json_member_add_string(wb, "severity", facets_severity_to_string(u->severity));
+            }
+            buffer_json_object_close(wb);
+
+            buffer_json_add_array_item_string(wb, u->type);
+            buffer_json_add_array_item_string(wb, u->description);
+            buffer_json_add_array_item_string(wb, u->load_state);
+            buffer_json_add_array_item_string(wb, u->active_state);
+            buffer_json_add_array_item_string(wb, u->sub_state);
+            buffer_json_add_array_item_string(wb, u->following);
+            buffer_json_add_array_item_string(wb, u->unit_path);
+            buffer_json_add_array_item_uint64(wb, u->job_id);
+            buffer_json_add_array_item_string(wb, u->job_type);
+            buffer_json_add_array_item_string(wb, u->job_path);
+
+            for(ssize_t i = 0; i < (ssize_t)_UNIT_ATTRIBUTE_MAX ;i++) {
+                switch(unit_attributes[i].value_type) {
+                    case SD_BUS_TYPE_OBJECT_PATH:
+                    case SD_BUS_TYPE_STRING:
+                        buffer_json_add_array_item_string(wb, u->attributes[i].str && *u->attributes[i].str ? u->attributes[i].str : "-");
+                        break;
+
+                    case SD_BUS_TYPE_UINT64:
+                        buffer_json_add_array_item_uint64(wb, u->attributes[i].uint64);
+                        if(!count[i]++) max[i].uint64 = 0;
+                        max[i].uint64 = MAX(max[i].uint64, u->attributes[i].uint64);
+                        break;
+
+                    case SD_BUS_TYPE_UINT32:
+                        buffer_json_add_array_item_uint64(wb, u->attributes[i].uint32);
+                        if(!count[i]++) max[i].uint32 = 0;
+                        max[i].uint32 = MAX(max[i].uint32, u->attributes[i].uint32);
+                        break;
+
+                    case SD_BUS_TYPE_INT64:
+                        buffer_json_add_array_item_uint64(wb, u->attributes[i].int64);
+                        if(!count[i]++) max[i].uint64 = 0;
+                        max[i].int64 = MAX(max[i].int64, u->attributes[i].int64);
+                        break;
+
+                    case SD_BUS_TYPE_INT32:
+                        buffer_json_add_array_item_uint64(wb, u->attributes[i].int32);
+                        if(!count[i]++) max[i].int32 = 0;
+                        max[i].int32 = MAX(max[i].int32, u->attributes[i].int32);
+                        break;
+
+                    case SD_BUS_TYPE_DOUBLE:
+                        buffer_json_add_array_item_double(wb, u->attributes[i].dbl);
+                        if(!count[i]++) max[i].dbl = 0.0;
+                        max[i].dbl = MAX(max[i].dbl, u->attributes[i].dbl);
+                        break;
+
+                    case SD_BUS_TYPE_BOOLEAN:
+                        buffer_json_add_array_item_boolean(wb, u->attributes[i].boolean);
+                        break;
+
+                    default:
+                        break;
+                }
+            }
+
+            buffer_json_add_array_item_uint64(wb, u->prio);
+            buffer_json_add_array_item_uint64(wb, 1); // count
+        }
+        buffer_json_array_close(wb);
+    }
+
+    buffer_json_array_close(wb); // data
+
+    buffer_json_member_add_object(wb, "columns");
+    {
+        size_t field_id = 0;
+
+        buffer_rrdf_table_add_field(wb, field_id++, "id", "Unit ID",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_UNIQUE_KEY | RRDF_FIELD_OPTS_WRAP | RRDF_FIELD_OPTS_FULL_WIDTH,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(
+            wb, field_id++,
+            "rowOptions", "rowOptions",
+            RRDF_FIELD_TYPE_NONE,
+            RRDR_FIELD_VISUAL_ROW_OPTIONS,
+            RRDF_FIELD_TRANSFORM_NONE, 0, NULL, NAN,
+            RRDF_FIELD_SORT_FIXED,
+            NULL,
+            RRDF_FIELD_SUMMARY_COUNT,
+            RRDF_FIELD_FILTER_NONE,
+            RRDF_FIELD_OPTS_DUMMY,
+            NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "type", "Unit Type",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_MULTISELECT,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_EXPANDED_FILTER,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "description", "Unit Description",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_WRAP | RRDF_FIELD_OPTS_FULL_WIDTH,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "loadState", "Unit Load State",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_MULTISELECT,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_EXPANDED_FILTER,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "activeState", "Unit Active State",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_MULTISELECT,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_EXPANDED_FILTER,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "subState", "Unit Sub State",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_MULTISELECT,
+                                    RRDF_FIELD_OPTS_VISIBLE | RRDF_FIELD_OPTS_EXPANDED_FILTER,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "following", "Unit Following",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_WRAP,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "path", "Unit Path",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_WRAP | RRDF_FIELD_OPTS_FULL_WIDTH,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "jobId", "Unit Job ID",
+                                    RRDF_FIELD_TYPE_INTEGER, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_NONE,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "jobType", "Unit Job Type",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_MULTISELECT,
+                                    RRDF_FIELD_OPTS_NONE,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "jobPath", "Unit Job Path",
+                                    RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_WRAP | RRDF_FIELD_OPTS_FULL_WIDTH,
+                                    NULL);
+
+        for(ssize_t i = 0; i < (ssize_t)_UNIT_ATTRIBUTE_MAX ;i++) {
+            char key[256], name[256];
+
+            if(unit_attributes[i].show_as)
+                snprintfz(key, sizeof(key), "%s", unit_attributes[i].show_as);
+            else
+                snprintfz(key, sizeof(key), "attribute%s", unit_property_name_to_string_from_slot(i));
+
+            if(unit_attributes[i].info)
+                snprintfz(name, sizeof(name), "%s", unit_attributes[i].info);
+            else
+                snprintfz(name, sizeof(name), "Attribute %s", unit_property_name_to_string_from_slot(i));
+
+            RRDF_FIELD_OPTIONS options = unit_attributes[i].options;
+            RRDF_FIELD_FILTER filter = unit_attributes[i].filter;
+
+            switch(unit_attributes[i].value_type) {
+                case SD_BUS_TYPE_OBJECT_PATH:
+                case SD_BUS_TYPE_STRING:
+                    buffer_rrdf_table_add_field(wb, field_id++, key, name,
+                                                RRDF_FIELD_TYPE_STRING, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                                0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                                RRDF_FIELD_SUMMARY_COUNT, filter,
+                                                RRDF_FIELD_OPTS_WRAP | options,
+                                                NULL);
+                    break;
+
+                case SD_BUS_TYPE_INT32:
+                case SD_BUS_TYPE_UINT32:
+                case SD_BUS_TYPE_INT64:
+                case SD_BUS_TYPE_UINT64: {
+                    double m;
+                    if(unit_attributes[i].value_type == SD_BUS_TYPE_UINT64)
+                        m = (double)max[i].uint64;
+                    else if(unit_attributes[i].value_type == SD_BUS_TYPE_INT64)
+                        m = (double)max[i].int64;
+                    else if(unit_attributes[i].value_type == SD_BUS_TYPE_UINT32)
+                        m = (double)max[i].uint32;
+                    else if(unit_attributes[i].value_type == SD_BUS_TYPE_INT32)
+                        m = (double)max[i].int32;
+
+                    buffer_rrdf_table_add_field(wb, field_id++, key, name,
+                                                RRDF_FIELD_TYPE_INTEGER, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                                0, NULL, m, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                                RRDF_FIELD_SUMMARY_SUM, filter,
+                                                RRDF_FIELD_OPTS_WRAP | options,
+                                                NULL);
+                }
+                break;
+
+                case SD_BUS_TYPE_DOUBLE:
+                    buffer_rrdf_table_add_field(wb, field_id++, key, name,
+                                                RRDF_FIELD_TYPE_INTEGER, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                                2, NULL, max[i].dbl, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                                RRDF_FIELD_SUMMARY_SUM, filter,
+                                                RRDF_FIELD_OPTS_WRAP | options,
+                                                NULL);
+                    break;
+
+                case SD_BUS_TYPE_BOOLEAN:
+                    buffer_rrdf_table_add_field(wb, field_id++, key, name,
+                                                RRDF_FIELD_TYPE_BOOLEAN, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                                0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                                RRDF_FIELD_SUMMARY_COUNT, filter,
+                                                RRDF_FIELD_OPTS_WRAP | options,
+                                                NULL);
+                    break;
+
+                default:
+                    break;
+            }
+
+        }
+
+        buffer_rrdf_table_add_field(wb, field_id++, "priority", "Priority",
+                                    RRDF_FIELD_TYPE_INTEGER, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_NONE,
+                                    NULL);
+
+        buffer_rrdf_table_add_field(wb, field_id++, "count", "Count",
+                                    RRDF_FIELD_TYPE_INTEGER, RRDF_FIELD_VISUAL_VALUE, RRDF_FIELD_TRANSFORM_NONE,
+                                    0, NULL, NAN, RRDF_FIELD_SORT_ASCENDING, NULL,
+                                    RRDF_FIELD_SUMMARY_COUNT, RRDF_FIELD_FILTER_NONE,
+                                    RRDF_FIELD_OPTS_NONE,
+                                    NULL);
+    }
+
+    buffer_json_object_close(wb); // columns
+    buffer_json_member_add_string(wb, "default_sort_column", "priority");
+
+    buffer_json_member_add_object(wb, "charts");
+    {
+        buffer_json_member_add_object(wb, "count");
+        {
+            buffer_json_member_add_string(wb, "name", "count");
+            buffer_json_member_add_string(wb, "type", "stacked-bar");
+            buffer_json_member_add_array(wb, "columns");
+            {
+                buffer_json_add_array_item_string(wb, "count");
+            }
+            buffer_json_array_close(wb);
+        }
+        buffer_json_object_close(wb);
+    }
+    buffer_json_object_close(wb); // charts
+
+    buffer_json_member_add_array(wb, "default_charts");
+    {
+        buffer_json_add_array_item_array(wb);
+        buffer_json_add_array_item_string(wb, "count");
+        buffer_json_add_array_item_string(wb, "activeState");
+        buffer_json_array_close(wb);
+
+        buffer_json_add_array_item_array(wb);
+        buffer_json_add_array_item_string(wb, "count");
+        buffer_json_add_array_item_string(wb, "subState");
+        buffer_json_array_close(wb);
+    }
+    buffer_json_array_close(wb);
+
+    buffer_json_member_add_object(wb, "group_by");
+    {
+        buffer_json_member_add_object(wb, "type");
+        {
+            buffer_json_member_add_string(wb, "name", "Top Down Tree");
+            buffer_json_member_add_array(wb, "columns");
+            {
+                buffer_json_add_array_item_string(wb, "type");
+                buffer_json_add_array_item_string(wb, "loadState");
+                buffer_json_add_array_item_string(wb, "activeState");
+                buffer_json_add_array_item_string(wb, "subState");
+            }
+            buffer_json_array_close(wb);
+        }
+        buffer_json_object_close(wb);
+
+        buffer_json_member_add_object(wb, "subState");
+        {
+            buffer_json_member_add_string(wb, "name", "Bottom Up Tree");
+            buffer_json_member_add_array(wb, "columns");
+            {
+                buffer_json_add_array_item_string(wb, "subState");
+                buffer_json_add_array_item_string(wb, "activeState");
+                buffer_json_add_array_item_string(wb, "loadState");
+                buffer_json_add_array_item_string(wb, "type");
+            }
+            buffer_json_array_close(wb);
+        }
+        buffer_json_object_close(wb);
+    }
+    buffer_json_object_close(wb); // group_by
+
+    buffer_json_member_add_time_t(wb, "expires", now_realtime_sec() + 1);
+    buffer_json_finalize(wb);
+
+    netdata_mutex_lock(&stdout_mutex);
+    pluginsd_function_result_to_stdout(transaction, HTTP_RESP_OK, "application/json", now_realtime_sec() + 1, wb);
+    netdata_mutex_unlock(&stdout_mutex);
+
+    buffer_free(wb);
+    systemd_units_free_all(base);
+}
+
+#endif // ENABLE_SYSTEMD_DBUS