summaryrefslogtreecommitdiffstats
path: root/collectors/systemd-journal.plugin/README.md
diff options
context:
space:
mode:
Diffstat (limited to 'collectors/systemd-journal.plugin/README.md')
-rw-r--r--collectors/systemd-journal.plugin/README.md472
1 files changed, 0 insertions, 472 deletions
diff --git a/collectors/systemd-journal.plugin/README.md b/collectors/systemd-journal.plugin/README.md
deleted file mode 100644
index c3c639045..000000000
--- a/collectors/systemd-journal.plugin/README.md
+++ /dev/null
@@ -1,472 +0,0 @@
-
-# `systemd` journal plugin
-
-[KEY FEATURES](#key-features) | [JOURNAL SOURCES](#journal-sources) | [JOURNAL FIELDS](#journal-fields) |
-[PLAY MODE](#play-mode) | [FULL TEXT SEARCH](#full-text-search) | [PERFORMANCE](#query-performance) |
-[CONFIGURATION](#configuration-and-maintenance) | [FAQ](#faq)
-
-The `systemd` journal plugin by Netdata makes viewing, exploring and analyzing `systemd` journal logs simple and
-efficient.
-It automatically discovers available journal sources, allows advanced filtering, offers interactive visual
-representations and supports exploring the logs of both individual servers and the logs on infrastructure wide
-journal centralization servers.
-
-![image](https://github.com/netdata/netdata/assets/2662304/691b7470-ec56-430c-8b81-0c9e49012679)
-
-## Key features
-
-- Works on both **individual servers** and **journal centralization servers**.
-- Supports `persistent` and `volatile` journals.
-- Supports `system`, `user`, `namespaces` and `remote` journals.
-- Allows filtering on **any journal field** or **field value**, for any time-frame.
-- Allows **full text search** (`grep`) on all journal fields, for any time-frame.
-- Provides a **histogram** for log entries over time, with a break down per field-value, for any field and any
- time-frame.
-- Works directly on journal files, without any other third-party components.
-- Supports coloring log entries, the same way `journalctl` does.
-- In PLAY mode provides the same experience as `journalctl -f`, showing new log entries immediately after they are
- received.
-
-### Prerequisites
-
-`systemd-journal.plugin` is a Netdata Function Plugin.
-
-To protect your privacy, as with all Netdata Functions, a free Netdata Cloud user account is required to access it.
-For more information check [this discussion](https://github.com/netdata/netdata/discussions/16136).
-
-### Limitations
-
-#### Plugin availability
-
-The following are limitations related to the availability of the plugin:
-
-- Netdata versions prior to 1.44 shipped in a docker container do not include this plugin.
- The problem is that `libsystemd` is not available in Alpine Linux (there is a `libsystemd`, but it is a dummy that
- returns failure on all calls). Starting with Netdata version 1.44, Netdata containers use a Debian base image
- making this plugin available when Netdata is running in a container.
-- For the same reason (lack of `systemd` support for Alpine Linux), the plugin is not available on `static` builds of
- Netdata (which are based on `muslc`, not `glibc`). If your Netdata is installed in `/opt/netdata` you most likely have
- a static build of Netdata.
-- On old systemd systems (like Centos 7), the plugin runs always in "full data query" mode, which makes it slower. The
- reason, is that systemd API is missing some important calls we need to use the field indexes of `systemd` journal.
- However, when running in this mode, the plugin offers also negative matches on the data (like filtering for all logs
- that do not have set some field), and this is the reason "full data query" mode is also offered as an option even on
- newer versions of `systemd`.
-
-#### `systemd` journal features
-
-The following are limitations related to the features of `systemd` journal:
-
-- This plugin assumes that binary field values are text fields with newlines in them. `systemd-journal` has the ability
- to support binary fields, without specifying the nature of the binary data. However, binary fields are commonly used
- to store log entries that include multiple lines of text. The plugin treats all binary fields are multi-line text.
-- This plugin does not support multiple values per field for any given log entry. `systemd` journal has the ability to
- accept the same field key, multiple times, with multiple values on a single log entry. This plugin will present the
- last value and ignore the others for this log entry.
-- This plugin will only read journal files located in `/var/log/journal` or `/run/log/journal`. `systemd-journal-remote` has the
- ability to store journal files anywhere (user configured). If journal files are not located in `/var/log/journal`
- or `/run/log/journal` (and any of their subdirectories), the plugin will not find them. A simple solution is to link
- the other directories somewhere inside `/var/log/journal`. The plugin will pick them up, even if a sub-directory of
- `/var/log/journal` is a link to a directory outside `/var/log/journal`.
-
-Other than the above, this plugin supports all features of `systemd` journals.
-
-## Journal Sources
-
-The plugin automatically detects the available journal sources, based on the journal files available in
-`/var/log/journal` (persistent logs) and `/run/log/journal` (volatile logs).
-
-![journal-sources](https://github.com/netdata/netdata/assets/2662304/28e63a3e-6809-4586-b3b0-80755f340e31)
-
-The plugin, by default, merges all journal sources together, to provide a unified view of all log messages available.
-
-> To improve query performance, we recommend selecting the relevant journal source, before doing more analysis on the
-> logs.
-
-### `system` journals
-
-`system` journals are the default journals available on all `systemd` based systems.
-
-`system` journals contain:
-
-- kernel log messages (via `kmsg`),
-- audit records, originating from the kernel audit subsystem,
-- messages received by `systemd-journald` via `syslog`,
-- messages received via the standard output and error of service units,
-- structured messages received via the native journal API.
-
-### `user` journals
-
-Unlike `journalctl`, the Netdata plugin allows viewing, exploring and querying the journal files of **all users**.
-
-By default, each user, with a UID outside the range of system users (0 - 999), dynamic service users,
-and the nobody user (65534), will get their own set of `user` journal files. For more information about
-this policy check [Users, Groups, UIDs and GIDs on systemd Systems](https://systemd.io/UIDS-GIDS/).
-
-Keep in mind that `user` journals are merged with the `system` journals when they are propagated to a journal
-centralization server. So, at the centralization server, the `remote` journals contain both the `system` and `user`
-journals of the sender.
-
-### `namespaces` journals
-
-The plugin auto-detects the namespaces available and provides a list of all namespaces at the "sources" list on the UI.
-
-Journal namespaces are both a mechanism for logically isolating the log stream of projects consisting
-of one or more services from the rest of the system and a mechanism for improving performance.
-
-`systemd` service units may be assigned to a specific journal namespace through the `LogNamespace=` unit file setting.
-
-Keep in mind that namespaces require special configuration to be propagated to a journal centralization server.
-This makes them a little more difficult to handle, from the administration perspective.
-
-### `remote` journals
-
-Remote journals are created by `systemd-journal-remote`. This `systemd` feature allows creating logs centralization
-points within your infrastructure, based exclusively on `systemd`.
-
-Usually `remote` journals are named by the IP of the server sending these logs. The Netdata plugin automatically
-extracts these IPs and performs a reverse DNS lookup to find their hostnames. When this is successful,
-`remote` journals are named by the hostnames of the origin servers.
-
-For information about configuring a journal centralization server,
-check [this FAQ item](#how-do-i-configure-a-journal-centralization-server).
-
-## Journal Fields
-
-`systemd` journals are designed to support multiple fields per log entry. The power of `systemd` journals is that,
-unlike other log management systems, it supports dynamic and variable fields for each log message,
-while all fields and their values are indexed for fast querying.
-
-This means that each application can log messages annotated with its own unique fields and values, and `systemd`
-journals will automatically index all of them, without any configuration or manual action.
-
-For a description of the most frequent fields found in `systemd` journals, check `man systemd.journal-fields`.
-
-Fields found in the journal files are automatically added to the UI in multiple places to help you explore
-and filter the data.
-
-The plugin automatically enriches certain fields to make them more user-friendly:
-
-- `_BOOT_ID`: the hex value is annotated with the timestamp of the first message encountered for this boot id.
-- `PRIORITY`: the numeric value is replaced with the human-readable name of each priority.
-- `SYSLOG_FACILITY`: the encoded value is replaced with the human-readable name of each facility.
-- `ERRNO`: the numeric value is annotated with the short name of each value.
-- `_UID` `_AUDIT_LOGINUID`, `_SYSTEMD_OWNER_UID`, `OBJECT_UID`, `OBJECT_SYSTEMD_OWNER_UID`, `OBJECT_AUDIT_LOGINUID`:
- the local user database is consulted to annotate them with usernames.
-- `_GID`, `OBJECT_GID`: the local group database is consulted to annotate them with group names.
-- `_CAP_EFFECTIVE`: the encoded value is annotated with a human-readable list of the linux capabilities.
-- `_SOURCE_REALTIME_TIMESTAMP`: the numeric value is annotated with human-readable datetime in UTC.
-- `MESSAGE_ID`: for the known `MESSAGE_ID`s, the value is replaced with the well known name of the event.
-
-The values of all other fields are presented as found in the journals.
-
-> IMPORTANT:
-> The UID and GID annotations are added during presentation and are taken from the server running the plugin.
-> For `remote` sources, the names presented may not reflect the actual user and group names on the origin server.
-> The numeric value will still be visible though, as-is on the origin server.
-
-The annotations are not searchable with full-text search. They are only added for the presentation of the fields.
-
-### Journal fields as columns in the table
-
-All journal fields available in the journal files are offered as columns on the UI. Use the gear button above the table:
-
-![image](https://github.com/netdata/netdata/assets/2662304/cd75fb55-6821-43d4-a2aa-033792c7f7ac)
-
-### Journal fields as additional info to each log entry
-
-When you click a log line, the `info` sidebar will open on the right of the screen, to provide the full list of fields
-related to this log line. You can close this `info` sidebar, by selecting the filter icon at its top.
-
-![image](https://github.com/netdata/netdata/assets/2662304/3207794c-a61b-444c-8ffe-6c07cbc90ae2)
-
-### Journal fields as filters
-
-The plugin presents a select list of fields as filters to the query, with counters for each of the possible values
-for the field. This list can used to quickly check which fields and values are available for the entire time-frame
-of the query.
-
-Internally the plugin has:
-
-1. A white-list of fields, to be presented as filters.
-2. A black-list of fields, to prevent them from becoming filters. This list includes fields with a very high
- cardinality, like timestamps, unique message ids, etc. This is mainly for protecting the server's performance,
- to avoid building in memory indexes for the fields that almost each of their values is unique.
-
-Keep in mind that the values presented in the filters, and their sorting is affected by the "full data queries"
-setting:
-
-![image](https://github.com/netdata/netdata/assets/2662304/ac710d46-07c2-487b-8ce3-e7f767b9ae0f)
-
-When "full data queries" is off, empty values are hidden and cannot be selected. This is due to a limitation of
-`libsystemd` that does not allow negative or empty matches. Also, values with zero counters may appear in the list.
-
-When "full data queries" is on, Netdata is applying all filtering to the data (not `libsystemd`), but this means
-that all the data of the entire time-frame, without any filtering applied, have to be read by the plugin to prepare
-the response required. So, "full data queries" can be significantly slower over long time-frames.
-
-### Journal fields as histogram sources
-
-The plugin presents a histogram of the number of log entries across time.
-
-The data source of this histogram can be any of the fields that are available as filters.
-For each of the values this field has, across the entire time-frame of the query, the histogram will get corresponding
-dimensions, showing the number of log entries, per value, over time.
-
-The granularity of the histogram is adjusted automatically to have about 150 columns visible on screen.
-
-The histogram presented by the plugin is interactive:
-
-- **Zoom**, either with the global date-time picker, or the zoom tool in the histogram's toolbox.
-- **Pan**, either with global date-time picker, or by dragging with the mouse the chart to the left or the right.
-- **Click**, to quickly jump to the highlighted point in time in the log entries.
-
-![image](https://github.com/netdata/netdata/assets/2662304/d3dcb1d1-daf4-49cf-9663-91b5b3099c2d)
-
-## PLAY mode
-
-The plugin supports PLAY mode, to continuously update the screen with new log entries found in the journal files.
-Just hit the "play" button at the top of the Netdata dashboard screen.
-
-On centralized log servers, PLAY mode provides a unified view of all the new logs encountered across the entire
-infrastructure,
-from all hosts sending logs to the central logs server via `systemd-remote`.
-
-## Full-text search
-
-The plugin supports searching for any text on all fields of the log entries.
-
-Full text search is combined with the selected filters.
-
-The text box accepts asterisks `*` as wildcards. So, `a*b*c` means match anything that contains `a`, then `b` and
-then `c` with anything between them.
-
-Spaces are treated as OR expressions. So that `a*b c*d` means `a*b OR c*d`.
-
-Negative expressions are supported, by prefixing any string with `!`. Example: `!systemd *` means match anything that
-does not contain `systemd` on any of its fields.
-
-## Query performance
-
-Journal files are designed to be accessed by multiple readers and one writer, concurrently.
-
-Readers (like this Netdata plugin), open the journal files and `libsystemd`, behind the scenes, maps regions
-of the files into memory, to satisfy each query.
-
-On logs aggregation servers, the performance of the queries depend on the following factors:
-
-1. The **number of files** involved in each query.
-
- This is why we suggest to select a source when possible.
-
-2. The **speed of the disks** hosting the journal files.
-
- Journal files perform a lot of reading while querying, so the fastest the disks, the faster the query will finish.
-
-3. The **memory available** for caching parts of the files.
-
- Increased memory will help the kernel cache the most frequently used parts of the journal files, avoiding disk I/O
- and speeding up queries.
-
-4. The **number of filters** applied.
-
- Queries are significantly faster when just a few filters are selected.
-
-In general, for a faster experience, **keep a low number of rows within the visible timeframe**.
-
-Even on long timeframes, selecting a couple of filters that will result in a **few dozen thousand** log entries
-will provide fast / rapid responses, usually less than a second. To the contrary, viewing timeframes with **millions
-of entries** may result in longer delays.
-
-The plugin aborts journal queries when your browser cancels inflight requests. This allows you to work on the UI
-while there are background queries running.
-
-At the time of this writing, this Netdata plugin is about 25-30 times faster than `journalctl` on queries that access
-multiple journal files, over long time-frames.
-
-During the development of this plugin, we submitted, to `systemd`, a number of patches to improve `journalctl`
-performance by a factor of 14:
-
-- <https://github.com/systemd/systemd/pull/29365>
-- <https://github.com/systemd/systemd/pull/29366>
-- <https://github.com/systemd/systemd/pull/29261>
-
-However, even after these patches are merged, `journalctl` will still be 2x slower than this Netdata plugin,
-on multi-journal queries.
-
-The problem lies in the way `libsystemd` handles multi-journal file queries. To overcome this problem,
-the Netdata plugin queries each file individually and it then it merges the results to be returned.
-This is transparent, thanks to the `facets` library in `libnetdata` that handles on-the-fly indexing, filtering,
-and searching of any dataset, independently of its source.
-
-## Performance at scale
-
-On busy logs servers, or when querying long timeframes that match millions of log entries, the plugin has a sampling
-algorithm to allow it respond promptly. It works like this:
-
-1. The latest 500k log entries are queried in full, evaluating all the fields of every single log entry. This evaluation
- allows counting the unique values per field, updating the counters next to each value at the filters section of the
- dashboard.
-2. When the latest 500k log entries have been processed and there are more data to read, the plugin divides evenly 500k
- more log entries to the number of journal files matched by the query. So, it will continue to evaluate all the fields
- of all log entries, up to the budget per file, aiming to fully query 1 million log entries in total.
-3. When the budget is hit for a given file, the plugin continues to scan log entries, but this time it does not evaluate
- the fields and their values, so the counters per field and value are not updated. These unsampled log entries are
- shown in the histogram with the label `[unsampled]`.
-4. The plugin continues to count `[unsampled]` entries until as many as sampled entries have been evaluated and at least
- 1% of the journal file has been processed.
-5. When the `[unsampled]` budget is exhausted, the plugin stops processing the journal file and based on the processing
- completed so far and the number of entries in the journal file, it estimates the remaining number of log entries in
- that file. This is shown as `[estimated]` at the histogram.
-6. In systemd versions 254 or later, the plugin fetches the unique sequence number of each log entry and calculates the
- the percentage of the file matched by the query, versus the total number of the log entries in the journal file.
-7. In systemd versions prior to 254, the plugin estimates the number of entries the journal file contributes to the
- query, using the amount of log entries matched it vs. the total duration the log file has entries for.
-
-The above allow the plugin to respond promptly even when the number of log entries in the journal files is several
-dozens millions, while providing accurate estimations of the log entries over time at the histogram and enough counters
-at the fields filtering section to help users get an overview of the whole timeframe.
-
-The fact that the latest 500k log entries and 1% of all journal files (which are spread over time) have been fully
-evaluated, including counting the number of appearances for each field value, the plugin usually provides an accurate
-representation of the whole timeframe.
-
-Keep in mind that although the plugin is quite effective and responds promptly when there are hundreds of journal files
-matching a query, response times may be longer when there are several thousands of smaller files. systemd versions 254+
-attempt to solve this problem by allowing `systemd-journal-remote` to create larger files. However, for systemd
-versions prior to 254, `systemd-journal-remote` creates files of up to 32MB each, which when running very busy
-journals centralization servers aggregating several thousands of log entries per second, the number of files can grow
-to several dozens of thousands quickly. In such setups, the plugin should ideally skip processing journal files
-entirely, relying solely on the estimations of the sequence of files each file is part of. However, this has not been
-implemented yet. To improve the query performance in such setups, the user has to query smaller timeframes.
-
-Another optimization taking place in huge journal centralization points, is the initial scan of the database. The plugin
-needs to know the list of all journal files available, including the details of the first and the last message in each
-of them. When there are several thousands of files in a directory (like it usually happens in `/var/log/journal/remote`),
-directory listing and examination of each file can take a considerable amount of time (even `ls -l` takes minutes).
-To work around this problem, the plugin uses `inotify` to receive file updates immediately and scans the library from
-the newest to the oldest file, allowing the user interface to work immediately after startup, for the most recent
-timeframes.
-
-### Best practices for better performance
-
-systemd-journal has been designed **first to be reliable** and then to be fast. It includes several mechanisms to ensure
-minimal data loss under all conditions (e.g. disk corruption, tampering, forward secure sealing) and despite the fact
-that it utilizes several techniques to require minimal disk footprint (like deduplication of log entries, linking of
-values and fields, compression) the disk footprint of journal files remains significantly higher compared to other log
-management solutions.
-
-The higher disk footprint results in higher disk I/O during querying, since a lot more data have to read from disk to
-evaluate a query. Query performance at scale can greatly benefit by utilizing a compressed filesystem (ext4, btrfs, zfs)
-to store systemd-journal files.
-
-systemd-journal files are cached by the operating system. There is no database server to serve queries. Each file is
-opened and the query runs by directly accessing the data in it.
-
-Therefore systemd-journal relies on the caching layer of the operating system to optimize query performance. The more
-RAM the system has, although it will not be reported as `used` (it will be reported as `cache`), the faster the queries
-will get. The first time a timeframe is accessed the query performance will be slower, but further queries on the same
-timeframe will be significantly faster since journal data are now cached in memory.
-
-So, on busy logs centralization systems, queries performance can be improved significantly by using a compressed
-filesystem for storing the journal files, and higher amounts of RAM.
-
-## Configuration and maintenance
-
-This Netdata plugin does not require any configuration or maintenance.
-
-## FAQ
-
-### Can I use this plugin on journal centralization servers?
-
-Yes. You can centralize your logs using `systemd-journal-remote`, and then install Netdata
-on this logs centralization server to explore the logs of all your infrastructure.
-
-This plugin will automatically provide multi-node views of your logs and also give you the ability to combine the logs
-of multiple servers, as you see fit.
-
-Check [configuring a logs centralization server](#how-do-i-configure-a-journal-centralization-server).
-
-### Can I use this plugin from a parent Netdata?
-
-Yes. When your nodes are connected to a Netdata parent, all their functions are available
-via the parent's UI. So, from the parent UI, you can access the functions of all your nodes.
-
-Keep in mind that to protect your privacy, in order to access Netdata functions, you need a
-free Netdata Cloud account.
-
-### Is any of my data exposed to Netdata Cloud from this plugin?
-
-No. When you access the agent directly, none of your data passes through Netdata Cloud.
-You need a free Netdata Cloud account only to verify your identity and enable the use of
-Netdata Functions. Once this is done, all the data flow directly from your Netdata agent
-to your web browser.
-
-Also check [this discussion](https://github.com/netdata/netdata/discussions/16136).
-
-When you access Netdata via `https://app.netdata.cloud`, your data travel via Netdata Cloud,
-but they are not stored in Netdata Cloud. This is to allow you access your Netdata agents from
-anywhere. All communication from/to Netdata Cloud is encrypted.
-
-### What are `volatile` and `persistent` journals?
-
-`systemd` `journald` allows creating both `volatile` journals in a `tmpfs` ram drive,
-and `persistent` journals stored on disk.
-
-`volatile` journals are particularly useful when the system monitored is sensitive to
-disk I/O, or does not have any writable disks at all.
-
-For more information check `man systemd-journald`.
-
-### I centralize my logs with Loki. Why to use Netdata for my journals?
-
-`systemd` journals have almost infinite cardinality at their labels and all of them are indexed,
-even if every single message has unique fields and values.
-
-When you send `systemd` journal logs to Loki, even if you use the `relabel_rules` argument to
-`loki.source.journal` with a JSON format, you need to specify which of the fields from journald
-you want inherited by Loki. This means you need to know the most important fields beforehand.
-At the same time you loose all the flexibility `systemd` journal provides:
-**indexing on all fields and all their values**.
-
-Loki generally assumes that all logs are like a table. All entries in a stream share the same
-fields. But journald does exactly the opposite. Each log entry is unique and may have its own unique fields.
-
-So, Loki and `systemd-journal` are good for different use cases.
-
-`systemd-journal` already runs in your systems. You use it today. It is there inside all your systems
-collecting the system and applications logs. And for its use case, it has advantages over other
-centralization solutions. So, why not use it?
-
-### Is it worth to build a `systemd` logs centralization server?
-
-Yes. It is simple, fast and the software to do it is already in your systems.
-
-For application and system logs, `systemd` journal is ideal and the visibility you can get
-by centralizing your system logs and the use of this Netdata plugin, is unparalleled.
-
-### How do I configure a journal centralization server?
-
-A short summary to get journal server running can be found below.
-There are two strategies you can apply, when it comes down to a centralized server for `systemd` journal logs.
-
-1. _Active sources_, where the centralized server fetches the logs from each individual server
-2. _Passive sources_, where the centralized server accepts a log stream from an individual server.
-
-For more options and reference to documentation, check `man systemd-journal-remote` and `man systemd-journal-upload`.
-
-#### _passive_ journal centralization without encryption
-
-If you want to setup your own passive journal centralization setup without encryption, [check out guide on it](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/passive_journal_centralization_guide_no_encryption.md).
-
-#### _passive_ journal centralization with encryption using self-signed certificates
-
-If you want to setup your own passive journal centralization setup using self-signed certificates for encryption, [check out guide on it](https://github.com/netdata/netdata/blob/master/collectors/systemd-journal.plugin/passive_journal_centralization_guide_self_signed_certs.md).
-
-#### Limitations when using a logs centralization server
-
-As of this writing `namespaces` support by `systemd` is limited:
-
-- Docker containers cannot log to namespaces. Check [this issue](https://github.com/moby/moby/issues/41879).
-- `systemd-journal-upload` automatically uploads `system` and `user` journals, but not `namespaces` journals. For this
- you need to spawn a `systemd-journal-upload` per namespace.