# Netdata Logging

This document describes how Netdata generates its own logs, not how Netdata manages and queries log databases.

## Log sources

Netdata supports the following log sources:

1. **daemon**, logs generated by the Netdata daemon.
2. **collector**, logs generated by Netdata collectors, including internal and external ones.
3. **access**, API requests received by Netdata.
4. **health**, all alert transitions and notifications.

## Log outputs

For each log source, Netdata supports the following output methods:

- **off**, to disable this log source.
- **journal**, to send the logs to systemd-journal.
- **syslog**, to send the logs to syslog.
- **system**, to send the output to `stderr` or `stdout` depending on the log source.
- **stdout**, to write the logs to Netdata's `stdout`.
- **stderr**, to write the logs to Netdata's `stderr`.
- **filename**, to send the logs to a file.

For `daemon` and `collector` the default is `journal` when systemd-journal is available.
To decide if systemd-journal is available, Netdata checks whether:

1. `stderr` is connected to systemd-journald
2. `/run/systemd/journal/socket` exists
3. `/host/run/systemd/journal/socket` exists (`/host` is configurable in containers)

If any of the above is detected, Netdata will select `journal` for `daemon` and `collector` sources.

All other sources default to a file.

## Log formats

| Format  | Description                                                                                            |
|---------|--------------------------------------------------------------------------------------------------------|
| journal | journald-specific log format. Automatically selected when logging to systemd-journal.                  |
| logfmt  | logs data as a series of key/value pairs. The default when logging to any output other than `journal`. |
| json    | logs data in JSON format.                                                                              |

## Log levels

Each time Netdata logs, it assigns a priority to the log.
It can be one of these (in order of importance):

| Level     | Description                                                                             |
|-----------|-----------------------------------------------------------------------------------------|
| emergency | a fatal condition, Netdata will most likely exit immediately after.                     |
| alert     | a very important issue that may affect how Netdata operates.                            |
| critical  | a very important issue the user should know about, which Netdata thinks it can survive. |
| error     | an error condition indicating that Netdata tried to do something, but it failed.        |
| warning   | something unexpected has happened that may or may not affect the operation of Netdata.  |
| notice    | something that does not affect the operation of Netdata, but the user should notice.    |
| info      | the default log level, for information the user should know.                            |
| debug     | more verbose logs that can be ignored.                                                  |

## Logs Configuration

In `netdata.conf`, there are the following settings:

```
[logs]
    # logs to trigger flood protection = 1000
    # logs flood protection period = 60
    # facility = daemon
    # level = info
    # daemon = journal
    # collector = journal
    # access = /var/log/netdata/access.log
    # health = /var/log/netdata/health.log
```

- `logs to trigger flood protection` and `logs flood protection period` enable log flood protection for `daemon` and `collector` sources. They can also be configured per log source.
- `facility` is used only when Netdata logs to syslog.
- `level` defines the minimum [log level](#log-levels) of logs that will be logged. This setting is applied only to `daemon` and `collector` sources. It can also be configured per source.
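To make the effect of the `level` threshold concrete, here is a minimal sketch in shell. It is an illustration only, not Netdata's implementation: it assumes the level names map to the standard syslog numeric priorities (0 for `emergency` through 7 for `debug`), and the helper names `level_to_priority` and `passes_level` are hypothetical.

```bash
#!/usr/bin/env bash
# Illustration only: these helpers are hypothetical and not part of Netdata.

# Map a level name from the table above to a numeric priority,
# assuming the standard syslog numbering (lower number = more important).
level_to_priority() {
  case "$1" in
    emergency) echo 0 ;;
    alert)     echo 1 ;;
    critical)  echo 2 ;;
    error)     echo 3 ;;
    warning)   echo 4 ;;
    notice)    echo 5 ;;
    info)      echo 6 ;;
    debug)     echo 7 ;;
    *)         return 1 ;;   # unknown level name
  esac
}

# Exit 0 when a message at level $1 would pass the minimum level $2,
# exit 1 when it would be dropped, exit 2 on an unknown level name.
passes_level() {
  local msg min
  msg=$(level_to_priority "$1") || return 2
  min=$(level_to_priority "$2") || return 2
  [ "$msg" -le "$min" ]
}
```

With `level = info`, for example, `passes_level error info` succeeds while `passes_level debug info` fails, which matches the idea that `debug` messages are dropped unless the minimum level is lowered to `debug`.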
### Configuring log sources

Each of the sources (`daemon`, `collector`, `access`, `health`) accepts the following:

```
source = {FORMAT},level={LEVEL},protection={LOGS}/{PERIOD}@{OUTPUT}
```

Where:

- `{FORMAT}` is one of the [log formats](#log-formats),
- `{LEVEL}` is the minimum [log level](#log-levels) to be logged,
- `{LOGS}` is the number of `logs to trigger flood protection` configured per output,
- `{PERIOD}` is the equivalent of `logs flood protection period` configured per output,
- `{OUTPUT}` is one of the [log outputs](#log-outputs).

All parameters can be omitted, except `{OUTPUT}`. If `{OUTPUT}` is the only given parameter, `@` can be omitted.

### Logs rotation

Netdata comes with a `logrotate` configuration to rotate its log files periodically.

The default is usually found in `/etc/logrotate.d/netdata`.

Sending a `SIGHUP` to Netdata will instruct it to re-open all its log files.

## Log Fields

Netdata exposes the following fields to its logs:

| journal | logfmt | json | Description |
|:---:|:---:|:---:|:---:|
| `_SOURCE_REALTIME_TIMESTAMP` | `time` | `time` | the timestamp of the event |
| `SYSLOG_IDENTIFIER` | `comm` | `comm` | the program logging the event |
| `ND_LOG_SOURCE` | `source` | `source` | one of the [log sources](#log-sources) |
| `PRIORITY` (numeric) | `level` (text) | `level` (numeric) | one of the [log levels](#log-levels) |
| `ERRNO` | `errno` | `errno` | the numeric value of `errno` |
| `INVOCATION_ID` | - | - | a unique UUID of the Netdata session, reset on every Netdata restart, inherited from systemd when available |
| `CODE_LINE` | - | - | the line number of the source code logging this event |
| `CODE_FILE` | - | - | the filename of the source code logging this event |
| `CODE_FUNCTION` | - | - | the function name of the source code logging this event |
| `TID` | `tid` | `tid` | the thread id of the thread logging this event |
| `THREAD_TAG` | `thread` | `thread` | the name of the thread logging this event |
| `MESSAGE_ID` | `msg_id` | `msg_id` | see [message IDs](#message-ids) |
| `ND_MODULE` | `module` | `module` | the Netdata module logging this event |
| `ND_NIDL_NODE` | `node` | `node` | the hostname of the node the event is related to |
| `ND_NIDL_INSTANCE` | `instance` | `instance` | the instance of the node the event is related to |
| `ND_NIDL_CONTEXT` | `context` | `context` | the context the event is related to (this is usually the chart name, as shown on Netdata dashboards) |
| `ND_NIDL_DIMENSION` | `dimension` | `dimension` | the dimension the event is related to |
| `ND_SRC_TRANSPORT` | `src_transport` | `src_transport` | when the event happened during a request, this is the request transport |
| `ND_SRC_IP` | `src_ip` | `src_ip` | when the event happened during an inbound request, this is the IP the request came from |
| `ND_SRC_PORT` | `src_port` | `src_port` | when the event happened during an inbound request, this is the port the request came from |
| `ND_SRC_CAPABILITIES` | `src_capabilities` | `src_capabilities` | when the request came from a child, this is the communication capabilities of the child |
| `ND_DST_TRANSPORT` | `dst_transport` | `dst_transport` | when the event happened during an outbound request, this is the outbound request transport |
| `ND_DST_IP` | `dst_ip` | `dst_ip` | when the event happened during an outbound request, this is the IP of the request destination |
| `ND_DST_PORT` | `dst_port` | `dst_port` | when the event happened during an outbound request, this is the port of the request destination |
| `ND_DST_CAPABILITIES` | `dst_capabilities` | `dst_capabilities` | when the request goes to a parent, this is the communication capabilities of the parent |
| `ND_REQUEST_METHOD` | `req_method` | `req_method` | when the event happened during an inbound request, this is the method of the received request |
| `ND_RESPONSE_CODE` | `code` | `code` | when responding to a request, this is the response code |
| `ND_CONNECTION_ID` | `conn` | `conn` | when there is a connection id for an inbound connection, this is the connection id |
| `ND_TRANSACTION_ID` | `transaction` | `transaction` | the transaction id (UUID) of all API requests |
| `ND_RESPONSE_SENT_BYTES` | `sent_bytes` | `sent_bytes` | the bytes sent in API responses |
| `ND_RESPONSE_SIZE_BYTES` | `size_bytes` | `size_bytes` | the uncompressed bytes of the API responses |
| `ND_RESPONSE_PREP_TIME_USEC` | `prep_ut` | `prep_ut` | the time needed to prepare a response |
| `ND_RESPONSE_SENT_TIME_USEC` | `sent_ut` | `sent_ut` | the time needed to send a response |
| `ND_RESPONSE_TOTAL_TIME_USEC` | `total_ut` | `total_ut` | the total time needed to complete a response |
| `ND_ALERT_ID` | `alert_id` | `alert_id` | the alert id this event is related to |
| `ND_ALERT_EVENT_ID` | `alert_event_id` | `alert_event_id` | a sequential number of the alert transition (per host) |
| `ND_ALERT_UNIQUE_ID` | `alert_unique_id` | `alert_unique_id` | a sequential number of the alert transition (per alert) |
| `ND_ALERT_TRANSITION_ID` | `alert_transition_id` | `alert_transition_id` | the unique UUID of this alert transition |
| `ND_ALERT_CONFIG` | `alert_config` | `alert_config` | the alert configuration hash (UUID) |
| `ND_ALERT_NAME` | `alert` | `alert` | the alert name |
| `ND_ALERT_CLASS` | `alert_class` | `alert_class` | the alert classification |
| `ND_ALERT_COMPONENT` | `alert_component` | `alert_component` | the alert component |
| `ND_ALERT_TYPE` | `alert_type` | `alert_type` | the alert type |
| `ND_ALERT_EXEC` | `alert_exec` | `alert_exec` | the alert notification program |
| `ND_ALERT_RECIPIENT` | `alert_recipient` | `alert_recipient` | the alert recipient(s) |
| `ND_ALERT_VALUE` | `alert_value` | `alert_value` | the current alert value |
| `ND_ALERT_VALUE_OLD` | `alert_value_old` | `alert_value_old` | the previous alert value |
| `ND_ALERT_STATUS` | `alert_status` | `alert_status` | the current alert status |
| `ND_ALERT_STATUS_OLD` | `alert_value_old` | `alert_value_old` | the previous alert status |
| `ND_ALERT_UNITS` | `alert_units` | `alert_units` | the units of the alert |
| `ND_ALERT_SUMMARY` | `alert_summary` | `alert_summary` | the summary text of the alert |
| `ND_ALERT_INFO` | `alert_info` | `alert_info` | the info text of the alert |
| `ND_ALERT_DURATION` | `alert_duration` | `alert_duration` | the duration the alert was in its previous state |
| `ND_ALERT_NOTIFICATION_TIMESTAMP_USEC` | `alert_notification_timestamp` | `alert_notification_timestamp` | the timestamp the notification delivery is scheduled |
| `ND_REQUEST` | `request` | `request` | the full request during which the event happened |
| `MESSAGE` | `msg` | `msg` | the event message |

### Message IDs

Netdata assigns specific message IDs to certain events:

- `ed4cdb8f1beb4ad3b57cb3cae2d162fa` when a Netdata child connects to this Netdata
- `6e2e3839067648968b646045dbf28d66` when this Netdata connects to a Netdata parent
- `9ce0cb58ab8b44df82c4bf1ad9ee22de` when alerts change state
- `6db0018e83e34320ae2a659d78019fb7` when notifications are sent

You can view these events using the `MESSAGE_ID` filter of the Netdata systemd-journal.plugin, or using `journalctl` like this:

```bash
# query children connection
journalctl MESSAGE_ID=ed4cdb8f1beb4ad3b57cb3cae2d162fa

# query parent connection
journalctl MESSAGE_ID=6e2e3839067648968b646045dbf28d66

# query alert transitions
journalctl MESSAGE_ID=9ce0cb58ab8b44df82c4bf1ad9ee22de

# query alert notifications
journalctl MESSAGE_ID=6db0018e83e34320ae2a659d78019fb7
```
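
Since the message IDs are hard to remember, a small wrapper can translate readable event names into the match expression that `journalctl` expects. The sketch below is only a convenience idea: the function names `nd_message_id` and `nd_journal_match` and the short event names are hypothetical and not part of Netdata, while the IDs are the ones listed above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Netdata).

# Translate a short event name into the corresponding Netdata MESSAGE_ID.
nd_message_id() {
  case "$1" in
    child-connect)      echo ed4cdb8f1beb4ad3b57cb3cae2d162fa ;;
    parent-connect)     echo 6e2e3839067648968b646045dbf28d66 ;;
    alert-transition)   echo 9ce0cb58ab8b44df82c4bf1ad9ee22de ;;
    alert-notification) echo 6db0018e83e34320ae2a659d78019fb7 ;;
    *) echo "unknown event: $1" >&2; return 1 ;;
  esac
}

# Print the journalctl field match for a short event name.
nd_journal_match() {
  local id
  id=$(nd_message_id "$1") || return 1
  echo "MESSAGE_ID=${id}"
}
```

On a system where Netdata logs to systemd-journal, this could be used as `journalctl "$(nd_journal_match alert-transition)" --since today`.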