author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:23 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:44 +0000 |
commit | 836b47cb7e99a977c5a23b059ca1d0b5065d310e (patch) | |
tree | 1604da8f482d02effa033c94a84be42bc0c848c3 /src/streaming | |
parent | Releasing debian version 1.44.3-2. (diff) | |
Merging upstream version 1.46.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/streaming')
-rw-r--r-- | src/streaming/README.md | 635 | ||||
-rw-r--r-- | src/streaming/common.h | 9 | ||||
-rw-r--r-- | src/streaming/compression.c | 707 | ||||
-rw-r--r-- | src/streaming/compression.h | 175 | ||||
-rw-r--r-- | src/streaming/compression_brotli.c | 142 | ||||
-rw-r--r-- | src/streaming/compression_brotli.h | 15 | ||||
-rw-r--r-- | src/streaming/compression_gzip.c | 164 | ||||
-rw-r--r-- | src/streaming/compression_gzip.h | 15 | ||||
-rw-r--r-- | src/streaming/compression_lz4.c | 143 | ||||
-rw-r--r-- | src/streaming/compression_lz4.h | 19 | ||||
-rw-r--r-- | src/streaming/compression_zstd.c | 163 | ||||
-rw-r--r-- | src/streaming/compression_zstd.h | 19 | ||||
-rw-r--r-- | src/streaming/receiver.c | 948 | ||||
-rw-r--r-- | src/streaming/replication.c | 2032 | ||||
-rw-r--r-- | src/streaming/replication.h | 36 | ||||
-rw-r--r-- | src/streaming/rrdpush.c | 1418 | ||||
-rw-r--r-- | src/streaming/rrdpush.h | 761 | ||||
-rw-r--r-- | src/streaming/sender.c | 1907 | ||||
-rw-r--r-- | src/streaming/stream.conf | 263 |
19 files changed, 9571 insertions, 0 deletions
diff --git a/src/streaming/README.md b/src/streaming/README.md new file mode 100644 index 000000000..fe4e01bae --- /dev/null +++ b/src/streaming/README.md @@ -0,0 +1,635 @@ +# Streaming and replication reference + +This document contains advanced streaming options and suggested deployment options for production. +If you haven't already done so, we suggest you first go through the +[quick introduction to streaming](/docs/observability-centralization-points/README.md) +, for your first, basic parent child setup. + +## Configuration + +There are two files responsible for configuring Netdata's streaming capabilities: `stream.conf` and `netdata.conf`. + +From within your Netdata config directory (typically `/etc/netdata`), [use `edit-config`](/docs/netdata-agent/configuration/README.md) to +open either `stream.conf` or `netdata.conf`. + +``` +sudo ./edit-config stream.conf +sudo ./edit-config netdata.conf +``` + +### `stream.conf` + +The `stream.conf` file contains three sections. The `[stream]` section is for configuring child nodes. + +The `[API_KEY]` and `[MACHINE_GUID]` sections are both for configuring parent nodes, and share the same settings. +`[API_KEY]` settings affect every child node using that key, whereas `[MACHINE_GUID]` settings affect only the child +node with a matching GUID. + +The file `/var/lib/netdata/registry/netdata.public.unique.id` contains a random GUID that **uniquely identifies each +node**. This file is automatically generated by Netdata the first time it is started and remains unaltered forever. 
+ +#### `[stream]` section + +| Setting | Default | Description | +|-------------------------------------------------|---------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `enabled` | `no` | Whether this node streams metrics to any parent. Change to `yes` to enable streaming. | +| [`destination`](#destination) | | A space-separated list of parent nodes to attempt to stream to, with the first available parent receiving metrics, using the following format: `[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]`. [Read more →](#destination) | +| `ssl skip certificate verification` | `yes` | If you want to accept self-signed or expired certificates, set to `yes` and uncomment. | +| `CApath` | `/etc/ssl/certs/` | The directory where known certificates are found. Defaults to OpenSSL's default path. | +| `CAfile` | `/etc/ssl/certs/cert.pem` | Add a parent node certificate to the list of known certificates in `CAPath`. | +| `api key` | | The `API_KEY` to use as the child node. | +| `timeout seconds` | `60` | The timeout to connect and send metrics to a parent. | +| `default port` | `19999` | The port to use if `destination` does not specify one. | +| [`send charts matching`](#send-charts-matching) | `*` | A space-separated list of [Netdata simple patterns](/src/libnetdata/simple_pattern/README.md) to filter which charts are streamed. [Read more →](#send-charts-matching) | +| `buffer size bytes` | `10485760` | The size of the buffer to use when sending metrics. The default `10485760` equals a buffer of 10MB, which is good for 60 seconds of data. Increase this if you expect latencies higher than that. The buffer is flushed on reconnect. | +| `reconnect delay seconds` | `5` | How long to wait until retrying to connect to the parent node. 
| +| `initial clock resync iterations` | `60` | Sync the clock of charts for how many seconds when starting. | +| `parent using h2o` | `no` | Set to `yes` if you are connecting to the parent through its h2o webserver/port. Currently there is no reason to set this to `yes` unless you are testing the new h2o-based Netdata webserver. When production ready, this will be set to `yes` by default. | + +### `[API_KEY]` and `[MACHINE_GUID]` sections + +| Setting | Default | Description | +|-----------------------------------------------|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `enabled` | `no` | Whether this API key is enabled or disabled. | +| [`allow from`](#allow-from) | `*` | A space-separated list of [Netdata simple patterns](/src/libnetdata/simple_pattern/README.md) matching the IPs of nodes that will stream metrics using this API key. [Read more →](#allow-from) | +| `default history` | `3600` | The default amount of child metrics history to retain when using the `ram` memory mode. | +| [`default memory mode`](#default-memory-mode) | `ram` | The [database](/src/database/README.md) to use for all nodes using this `API_KEY`. Valid settings are `dbengine`, `ram`, or `none`. [Read more →](#default-memory-mode) | +| `health enabled by default` | `auto` | Whether alerts and notifications should be enabled for nodes using this `API_KEY`. `auto` enables alerts when the child is connected. `yes` enables alerts always, and `no` disables alerts. | +| `default postpone alarms on connect seconds` | `60` | Postpone alerts and notifications for a period of time after the child connects. | +| `default health log history` | `432000` | History of health log events (in seconds) kept in the database. | +| `default proxy enabled` | | Route metrics through a proxy.
| +| `default proxy destination` | | Space-separated list of `IP:PORT` for proxies. | +| `default proxy api key` | | The `API_KEY` of the proxy. | +| `default send charts matching` | `*` | See [`send charts matching`](#send-charts-matching). | +| `enable compression` | `yes` | Enable/disable stream compression. | +| `enable replication` | `yes` | Enable/disable replication. | +| `seconds to replicate` | `86400` | How many seconds of data to replicate from each child at a time | +| `seconds per replication step` | `600` | The duration we want to replicate per each replication step. | +| `is ephemeral node` | `no` | Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable after the specified duration of "cleanup ephemeral hosts after secs" from the time of the node's last connection. | + +#### `destination` + +A space-separated list of parent nodes to attempt to stream to, with the first available parent receiving metrics, using +the following format: `[PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]`. + +- `PROTOCOL`: `tcp`, `udp`, or `unix`. (only tcp and unix are supported by parent nodes) +- `HOST`: A IPv4, IPv6 IP, or a hostname, or a unix domain socket path. IPv6 IPs should be given with brackets + `[ip:address]`. +- `INTERFACE` (IPv6 only): The network interface to use. +- `PORT`: The port number or service name (`/etc/services`) to use. +- `SSL`: To enable TLS/SSL encryption of the streaming connection. + +To enable TCP streaming to a parent node at `203.0.113.0` on port `20000` and with TLS/SSL encryption: + +```conf +[stream] + destination = tcp:203.0.113.0:20000:SSL +``` + +#### `send charts matching` + +A space-separated list of [Netdata simple patterns](/src/libnetdata/simple_pattern/README.md) to filter which charts are streamed. + +The default is a single wildcard `*`, which streams all charts. + +To send only a few charts, list them explicitly, or list a group using a wildcard. 
To send _only_ the `apps.cpu` chart +and charts with contexts beginning with `system.`: + +```conf +[stream] + send charts matching = apps.cpu system.* +``` + +To send all but a few charts, use `!` to create a negative match. To send _all_ charts _but_ `apps.cpu`: + +```conf +[stream] + send charts matching = !apps.cpu * +``` + +#### `allow from` + +A space-separated list of [Netdata simple patterns](/src/libnetdata/simple_pattern/README.md) matching the IPs of nodes that +will stream metrics using this API key. The order is important, left to right, as the first positive or negative match is used. + +The default is `*`, which accepts all requests including the `API_KEY`. + +To allow only a specific IP address: + +```conf +[API_KEY] + allow from = 203.0.113.10 +``` + +To allow all IPs matching `10.*`, except `10.1.2.3`: + +```conf +[API_KEY] + allow from = !10.1.2.3 10.* +``` + +> If you set specific IP addresses here, and also use the `allow connections` setting in the `[web]` section of +> `netdata.conf`, be sure to add the IP address there so that it can access the API port. + +#### `default memory mode` + +The [database](/src/database/README.md) to use for all nodes using this `API_KEY`. +Valid settings are `dbengine`, `ram`, or `none`. + +- `dbengine`: The default, recommended time-series database (TSDB) for Netdata. Stores recent metrics in memory, then + efficiently spills them to disk for long-term storage. +- `ram`: Stores metrics _only_ in memory, which means metrics are lost when Netdata stops or restarts. Ideal for + streaming configurations that use ephemeral nodes. +- `none`: No database. + +When using `default memory mode = dbengine`, the parent node creates a separate instance of the TSDB to store metrics +from child nodes. 
The [size of _each_ instance is configurable](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md) with the `page +cache size` and `dbengine multihost disk space` settings in the `[global]` section in `netdata.conf`. + +### `netdata.conf` + +| Setting | Default | Description | +|--------------------------------------------|-------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `[global]` section | | | +| `memory mode` | `dbengine` | Determines the [database type](/src/database/README.md) to be used on that node. Other options settings include `none`, and `ram`. `none` disables the database at this host. This also disables alerts and notifications, as those can't run without a database. | +| `[web]` section | | | +| `mode` | `static-threaded` | Determines the [web server](/src/web/server/README.md) type. The other option is `none`, which disables the dashboard, API, and registry. | +| `accept a streaming request every seconds` | `0` | Set a limit on how often a parent node accepts streaming requests from child nodes. `0` equals no limit. If this is set, you may see `... too busy to accept new streaming request. Will be allowed in X secs` in Netdata's `error.log`. 
| + +### Basic use cases + +This is an overview of how the main options can be combined: + +| target | memory<br/>mode | web<br/>mode | stream<br/>enabled | exporting | alerts | dashboard | +|--------------------|:---------------:|:------------:|:------------------:|:-------------------------------------:|:------------:|:---------:| +| headless collector | `none` | `none` | `yes` | only for `data source = as collected` | not possible | no | +| headless proxy | `none` | not `none` | `yes` | only for `data source = as collected` | not possible | no | +| proxy with db | not `none` | not `none` | `yes` | possible | possible | yes | +| central netdata | not `none` | not `none` | `no` | possible | possible | yes | + +### Per-child settings + +While the `[API_KEY]` section applies settings for any child node using that key, you can also use per-child settings +with the `[MACHINE_GUID]` section. + +For example, the metrics streamed from only the child node with `MACHINE_GUID` are saved in memory, not using the +default `dbengine` as specified by the `API_KEY`, and alerts are disabled. + +```conf +[API_KEY] + enabled = yes + default memory mode = dbengine + health enabled by default = auto + allow from = * + +[MACHINE_GUID] + enabled = yes + memory mode = ram + health enabled = no +``` + +### Streaming compression + +[![Supported version Netdata Agent release](https://img.shields.io/badge/Supported%20Netdata%20Agent-v1.33%2B-brightgreen)](https://github.com/netdata/netdata/releases/latest) + +[![Supported version Netdata Agent release](https://img.shields.io/badge/Supported%20Netdata%20stream%20version-v5%2B-blue)](https://github.com/netdata/netdata/releases/latest) + +#### OS dependencies +* Streaming compression is based on [lz4 v1.9.0+](https://github.com/lz4/lz4). The [lz4 v1.9.0+](https://github.com/lz4/lz4) library must be installed in your OS in order to enable streaming compression. 
Any lower version will disable Netdata streaming compression for compatibility purposes between the older versions of Netdata agents. + +To check if your Netdata Agent supports stream compression run the following GET request in your browser or terminal: + +``` +curl -X GET http://localhost:19999/api/v1/info | grep 'Stream Compression' +``` + +**Output** +``` +"buildinfo": "dbengine|Native HTTPS|Netdata Cloud|ACLK Next Generation|New Cloud Protocol Support|ACLK Legacy|TLS Host Verification|Machine Learning|Stream Compression|protobuf|JSON-C|libcrypto|libm|LWS v3.2.2|mosquitto|zlib|apps|cgroup Network Tracking|EBPF|perf|slabinfo", +``` +> Note: If your OS doesn't support Netdata compression the `buildinfo` will not contain the `Stream Compression` statement. + +To check if your Netdata Agent has stream compression enabled, run the following GET request in your browser or terminal: + +``` + curl -X GET http://localhost:19999/api/v1/info | grep 'stream-compression' +``` +**Output** +``` +"stream-compression": "enabled" +``` +Note: The `stream-compression` status can be `"enabled" | "disabled" | "N/A"`. + +A compressed data packet is determined and decompressed on the fly. + +#### Limitations +This limitation will be withdrawn asap and is work-in-progress. + +The current implementation of streaming data compression can support only a few number of dimensions in a chart with names that cannot exceed the size of 16384 bytes. In case your instance hit this limitation, the agent will deactivate compression during runtime to avoid stream corruption. This limitation can be seen in the error.log file with the sequence of the following messages: +``` +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: connecting... +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: initializing communication... +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: waiting response from remote netdata... 
+netdata INFO : STREAM_SENDER[child01] : STREAM_COMPRESSION: Compressor Reset +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: established communication with a parent using protocol version 5 - ready to send metrics... +... +netdata ERROR : PLUGINSD[go.d] : STREAM_COMPRESSION: Compression Failed - Message size 27847 above compression buffer limit: 16384 (errno 9, Bad file descriptor) +netdata ERROR : PLUGINSD[go.d] : STREAM_COMPRESSION: Deactivating compression to avoid stream corruption +netdata ERROR : PLUGINSD[go.d] : STREAM_COMPRESSION child01 [send to my.parent.IP]: Restarting connection without compression +... +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: connecting... +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: initializing communication... +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: waiting response from remote netdata... +netdata INFO : STREAM_SENDER[child01] : Stream is uncompressed! One of the agents (my.parent.IP <-> child01) does not support compression OR compression is disabled. +netdata INFO : STREAM_SENDER[child01] : STREAM child01 [send to my.parent.IP]: established communication with a parent using protocol version 4 - ready to send metrics... +netdata INFO : WEB_SERVER[static4] : STREAM child01 [send]: sending metrics... +``` + +#### How to enable stream compression +Netdata Agents are shipped with data compression enabled by default. You can also configure which streams will use compression. + +With enabled stream compression, a Netdata Agent can negotiate streaming compression with other Netdata Agents. During the negotiation of streaming compression both Netdata Agents should support and enable compression in order to communicate over a compressed stream. The negotiation will result into an uncompressed stream, if one of the Netdata Agents doesn't support **or** has compression disabled. 
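
The negotiation rule described above reduces to a logical AND across both agents. The following is a minimal Python sketch of that rule, not Netdata's implementation; the `Agent` class and `negotiate_compression` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    """Hypothetical view of one side of a streaming connection."""

    supports_compression: bool  # the agent was built with a compression library
    compression_enabled: bool   # `enable compression` is set to yes in stream.conf


def negotiate_compression(parent: Agent, child: Agent) -> str:
    """Return "compressed" only when both sides support and enable compression."""
    both_ready = all(
        agent.supports_compression and agent.compression_enabled
        for agent in (parent, child)
    )
    return "compressed" if both_ready else "uncompressed"
```

Any single side that lacks support or has compression switched off is enough to make the stream fall back to uncompressed.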
+ +To enable stream compression: + +1. Edit `stream.conf` by using the `edit-config` script: +`/etc/netdata/edit-config stream.conf`. + +2. In the `[stream]` section, set `enable compression` to `yes`. +``` +# This is the default stream compression flag for an agent. + +[stream] + enable compression = yes | no +``` + + +| Parent | Stream compression | Child | +|--------------------------------------|--------------------|--------------------------------------| +| Supported & Enabled | compressed | Supported & Enabled | +| (Supported & Disabled)/Not supported | uncompressed | Supported & Enabled | +| Supported & Enabled | uncompressed | (Supported & Disabled)/Not supported | +| (Supported & Disabled)/Not supported | uncompressed | (Supported & Disabled)/Not supported | + +In case of parents with multiple children you can select which streams will be compressed by using the same configuration under the `[API_KEY]`, `[MACHINE_GUID]` section. + +This configuration uses AND logic with the default stream compression configuration under the `[stream]` section. This means the stream compression from child to parent will be enabled only if the outcome of the AND logic operation is true (`default compression enabled` && `api key compression enabled`). So both should be enabled to get stream compression otherwise stream compression is disabled. +``` +[API_KEY] + enable compression = yes | no +``` +Same thing applies with the `[MACHINE_GUID]` configuration. +``` +[MACHINE_GUID] + enable compression = yes | no +``` + +### Securing streaming with TLS/SSL + +Netdata does not activate TLS encryption by default. To encrypt streaming connections, you first need to [enable TLS +support](/src/web/server/README.md#enabling-tls-support) on the parent. With encryption enabled on the receiving side, you +need to instruct the child to use TLS/SSL as well. 
On the child's `stream.conf`, configure the destination as follows: + +``` +[stream] + destination = host:port:SSL +``` + +The word `SSL` appended to the end of the destination tells the child that connections must be encrypted. + +> While Netdata uses Transport Layer Security (TLS) 1.2 to encrypt communications rather than the obsolete SSL protocol, +> it's still common practice to refer to encrypted web connections as `SSL`. Many vendors, like Nginx and even Netdata +> itself, use `SSL` in configuration files, whereas documentation will always refer to encrypted communications as `TLS` +> or `TLS/SSL`. + +#### Certificate verification + +When TLS/SSL is enabled on the child, the default behavior will be to not connect with the parent unless the server's +certificate can be verified via the default chain. In case you want to avoid this check, add the following to the +child's `stream.conf` file: + +``` +[stream] + ssl skip certificate verification = yes +``` + +#### Trusted certificate + +If you've enabled [certificate verification](#certificate-verification), you might see errors from the OpenSSL library +when there's a problem with checking the certificate chain (`X509_V_ERR_UNABLE_TO_GET_ISSUER_CERT_LOCALLY`). More +importantly, OpenSSL will reject self-signed certificates. + +Given these known issues, you have two options. If you trust your certificate, you can set the options `CApath` and +`CAfile` to inform Netdata where your certificates, and the certificate trusted file, are stored. + +For more details about these options, you can read about [verify +locations](https://www.openssl.org/docs/man1.1.1/man3/SSL_CTX_load_verify_locations.html). + +Before you changed your streaming configuration, you need to copy your trusted certificate to your child system and add +the certificate to OpenSSL's list. + +On most Linux distributions, the `update-ca-certificates` command searches inside the `/usr/share/ca-certificates` +directory for certificates. 
You should double-check by reading the `update-ca-certificates` manual (`man +update-ca-certificates`), and then change the directory in the below commands if needed. + +If you have `sudo` configured on your child system, you can use that to run the following commands. If not, you'll have +to log in as `root` to complete them. + +``` +# mkdir /usr/share/ca-certificates/netdata +# cp parent_cert.pem /usr/share/ca-certificates/netdata/parent_cert.crt +# chown -R netdata:netdata /usr/share/ca-certificates/netdata/ +``` + +First, you create a new directory to store your certificates for Netdata. Next, you need to change the extension on your +certificate from `.pem` to `.crt` so it's compatible with `update-ca-certificates`. Finally, you need to change +ownership so the user that runs Netdata can access the directory where you copied in your certificate. + +Next, edit the file `/etc/ca-certificates.conf` and add the following line: + +``` +netdata/parent_cert.crt +``` + +Now update the list of certificates by running the following, again either with `sudo` or as `root`: + +``` +# update-ca-certificates +``` + +> Some Linux distributions have different methods of updating the certificate list. For more details, please read this +> guide on [adding trusted root certificates](https://github.com/Busindre/How-to-Add-trusted-root-certificates). + +Once you update your certificate list, you can set the stream parameters for Netdata to trust the parent certificate. +Open `stream.conf` for editing and change the following lines: + +``` +[stream] + CApath = /etc/ssl/certs/ + CAfile = /etc/ssl/certs/parent_cert.pem +``` + +With this configuration, the `CApath` option tells Netdata to search for trusted certificates inside `/etc/ssl/certs`. +The `CAfile` option specifies that the Netdata parent certificate is located at `/etc/ssl/certs/parent_cert.pem`. With this +configuration, you can skip using the system's entire list of certificates and use Netdata's parent certificate instead. 
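
To see how the three options interact, it can help to reproduce them outside Netdata. The sketch below uses Python's standard `ssl` module, which wraps the same OpenSSL verify-location mechanism. It is an illustration only, not Netdata's code: the `build_client_context` function and its parameter names are hypothetical and merely mirror `ssl skip certificate verification`, `CApath`, and `CAfile`.

```python
import ssl
from typing import Optional


def build_client_context(
    skip_verification: bool = False,
    capath: Optional[str] = None,
    cafile: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context mirroring the three stream.conf options."""
    # With both arguments left as None, Python loads the system's default CA store.
    # Otherwise only the given directory and/or file are used as trust anchors.
    context = ssl.create_default_context(capath=capath, cafile=cafile)
    if skip_verification:
        # Order matters: hostname checking must be off before CERT_NONE is accepted.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context
```

Wrapping a socket to the parent with a context built from your `CApath` and `CAfile` values is one way to check, independently of Netdata, whether the parent's certificate chain verifies.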
+ +#### Expected behaviors + +With the introduction of TLS/SSL, the parent-child communication behaves as shown in the table below, depending on the +following configurations: + +- **Parent TLS (Yes/No)**: Whether the `[web]` section in `netdata.conf` has `ssl key` and `ssl certificate`. +- **Parent port TLS (-/force/optional)**: Depends on whether the `[web]` section `bind to` contains a `^SSL=force` or + `^SSL=optional` directive on the port(s) used for streaming. +- **Child TLS (Yes/No)**: Whether the destination in the child's `stream.conf` has `:SSL` at the end. +- **Child TLS Verification (yes/no)**: Value of the child's `stream.conf` `ssl skip certificate verification` + parameter (default is no). + +| Parent TLS enabled | Parent port SSL | Child TLS | Child SSL Ver. | Behavior | +|:-------------------|:-----------------|:----------|:---------------|:-----------------------------------------------------------------------------------------------------------------------------------------| +| No | - | No | no | Legacy behavior. The parent-child stream is unencrypted. | +| Yes | force | No | no | The parent rejects the child connection. | +| Yes | -/optional | No | no | The parent-child stream is unencrypted (expected situation for legacy child nodes and newer parent nodes) | +| Yes | -/force/optional | Yes | no | The parent-child stream is encrypted, provided that the parent has a valid TLS/SSL certificate. Otherwise, the child refuses to connect. | +| Yes | -/force/optional | Yes | yes | The parent-child stream is encrypted. | + +### Proxy + +A proxy is a node that receives metrics from a child, then streams them onward to a parent. To configure a proxy, +configure it as a receiving and a sending Netdata at the same time. + +Netdata proxies may or may not maintain a database for the metrics passing through them. When they maintain a database, +they can also run health checks (alerts and notifications) for the remote host that is streaming the metrics. 
+ +In the following example, the proxy receives metrics from a child node using the `API_KEY` of +`66666666-7777-8888-9999-000000000000`, then stores metrics using `dbengine`. It then uses the `API_KEY` of +`11111111-2222-3333-4444-555555555555` to proxy those same metrics on to a parent node at `203.0.113.0`. + +```conf +[stream] + enabled = yes + destination = 203.0.113.0 + api key = 11111111-2222-3333-4444-555555555555 + +[66666666-7777-8888-9999-000000000000] + enabled = yes + default memory mode = dbengine +``` + +### Ephemeral nodes + +Netdata can help you monitor ephemeral nodes, such as containers in an auto-scaling infrastructure, by always streaming +metrics to any number of permanently-running parent nodes. + +On the parent, set the following in `stream.conf`: + +```conf +[11111111-2222-3333-4444-555555555555] + # enable/disable this API key + enabled = yes + + # one hour of data for each of the child nodes + default history = 3600 + + # do not save child metrics on disk + default memory mode = ram + + # alert checks, only while the child is connected + health enabled by default = auto +``` + +On the child nodes, set the following in `stream.conf`: + +```conf +[stream] + # stream metrics to another Netdata + enabled = yes + + # the IP and PORT of the parent + destination = 10.11.12.13:19999 + + # the API key to use + api key = 11111111-2222-3333-4444-555555555555 +``` + +In addition, edit `netdata.conf` on each child node to disable the database and alerts. + +```conf +[global] + # disable the local database + memory mode = none + +[health] + # disable health checks + enabled = no +``` + +## Replication + +Netdata streaming automatically replicates data from child nodes to parent nodes, ensuring that the parent node has a complete and up-to-date view of all metrics. +This replication process ensures data continuity even if child nodes temporarily disconnect. 
+ +Replication is enabled by default in Netdata, but you can customize the replication behavior by modifying the `[API_KEY]` section of the `stream.conf` file. Here's an example configuration: + +```conf +[11111111-2222-3333-4444-555555555555] + # Enable replication for all hosts using this api key. Default: yes. + enable replication = yes + + # How many seconds of data to replicate from each child at a time. Default: a day (86400 seconds). + seconds to replicate = 86400 + + # The duration we want to replicate per each replication step. Default: 600 seconds (10 minutes). + seconds per replication step = 600 +``` + +You can monitor the replication process in two ways: + +1. **Netdata Monitoring**: access the Netdata Monitoring section and look for the Replication charts. +2. **Streaming Function**: use the Streaming function (Top) to see the replication status of children nodes. This function provides real-time insights into the replication status of each child node. + +### Replication history + +Replication history in [dbengine](/src/database/README.md) mode is limited +by [Tier 0 retention](/docs/netdata-agent/configuration/optimizing-metrics-database/change-metrics-storage.md#effect-of-storage-tiers-and-disk-space-on-retention): + +- Child instances replicate only Tier 0 data. +- Parent instance calculates higher-level tiers using Tier 0 as the basis. + +Extend replication history by increasing Tier 0 retention. + +Checking Tier 0 retention: + +- Using a web browser: + - Navigate to `http://{CHILD_IP}:19999/api/v2/node_instances`. + - Locate the `expected_retention` value for Tier 0 of your Agent. + - Convert the value from seconds to days for a more meaningful representation. 
+- Using `curl` and `jq`: + - Execute the following command: + ```bash + $ curl -s "http://{CHILD_IP}:19999/api/v2/node_instances" | jq '.agents[] | {nm, retention: (.db_size[0].retention / 86400 | .*100 | round/100) }' + ``` + - Example output: + ```json + { + "nm": "myhost", + "retention": 12.73 + } + ``` + +## Troubleshooting + +Both parent and child nodes log information at `/var/log/netdata/error.log`. + +If the child manages to connect to the parent you will see something like (on the parent): + +``` +2017-03-09 09:38:52: netdata: INFO : STREAM [receive from [10.11.12.86]:38564]: new client connection. +2017-03-09 09:38:52: netdata: INFO : STREAM xxx [10.11.12.86]:38564: receive thread created (task id 27721) +2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: client willing to stream metrics for host 'xxx' with machine_guid '1234567-1976-11e6-ae19-7cdd9077342a': update every = 1, history = 3600, memory mode = ram, health auto +2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: initializing communication... +2017-03-09 09:38:52: netdata: INFO : STREAM xxx [receive from [10.11.12.86]:38564]: receiving metrics... +``` + +and something like this on the child: + +``` +2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: connecting... +2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: initializing communication... +2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: waiting response from remote netdata... +2017-03-09 09:38:28: netdata: INFO : STREAM xxx [send to box:19999]: established communication - sending metrics... +``` + +The following sections describe the most common issues you might encounter when connecting parent and child nodes. + +### Slow connections between parent and child + +When you have a slow connection between parent and child, Netdata raises a few different errors. Most of the +errors will appear in the child's `error.log`. 
+ +```bash +netdata ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM CHILD HOSTNAME [send to PARENT IP:PARENT PORT]: too many data pending - buffer is X bytes long, +Y unsent - we have sent Z bytes in total, W on this connection. Closing connection to flush the data. +``` + +On the parent side, you may see various error messages, most commonly the following: + +``` +netdata ERROR : STREAM_PARENT[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : read failed: end of file +``` + +Another common problem in slow connections is the child sending a partial message to the parent. In this case, the +parent will write the following to its `error.log`: + +``` +ERROR : STREAM_RECEIVER[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : sent command 'B' which is not known by netdata, for host 'HOSTNAME'. Disabling it. +``` + +In this example, `B` was part of a `BEGIN` message that was cut due to connection problems. + +Slow connections can also cause problems when the parent misses a message and then receives a command related to the +missed message. For example, a parent might miss a message containing the child's charts, and then doesn't know +what to do with the `SET` message that follows. When that happens, the parent will show a message like this: + +``` +ERROR : STREAM_RECEIVER[CHILD HOSTNAME,[CHILD IP]:CHILD PORT] : requested a SET on chart 'CHART NAME' of host 'HOSTNAME', without a dimension. Disabling it. +``` + +### Child cannot connect to parent + +When the child can't connect to a parent for any reason (misconfiguration, networking, firewalls, parent +down), you will see the following in the child's `error.log`. + +``` +ERROR : STREAM_SENDER[HOSTNAME] : Failed to connect to 'PARENT IP', port 'PARENT PORT' (errno 113, No route to host) +``` + +### 'Is this a Netdata?' + +This question can appear when Netdata starts the stream and receives an unexpected response. This error can appear when +the parent is using SSL and the child tries to connect using plain text. 
You will also see this message when
+Netdata connects to another server that isn't Netdata. The complete error message will look like this:
+
+```
+ERROR : STREAM_SENDER[CHILD HOSTNAME] : STREAM child HOSTNAME [send to PARENT HOSTNAME:PARENT PORT]: server is not replying properly (is it a netdata?).
+```
+
+### Stream charts wrong
+
+Chart data needs to be consistent between child and parent nodes. If there are differences between chart data on
+a parent and a child, such as gaps in metrics collection, it most often means your child's `memory mode`
+does not match the parent's. To learn more about the different ways Netdata can store metrics, and thus keep chart
+data consistent, read our [memory mode documentation](/src/database/README.md).
+
+### Forbidding access
+
+You may see errors about "forbidding access" for a number of reasons. It could be because of a slow connection between
+the parent and child nodes, but it could also be due to other failures. Look in your parent's `error.log` for errors
+that look like this:
+
+```
+STREAM [receive from [child HOSTNAME]:child IP]: `MESSAGE`. Forbidding access.
+```
+
+`MESSAGE` will have one of the following patterns:
+
+- `request without KEY`: The message received is incomplete. `KEY` can be the API key, the hostname, or the machine GUID.
+- `API key 'VALUE' is not valid GUID`: The UUID received from the child does not have the format defined in [RFC
+  4122](https://tools.ietf.org/html/rfc4122).
+- `machine GUID 'VALUE' is not GUID.`: The same problem as the previous one, but for the machine GUID.
+- `API key 'VALUE' is not allowed`: This stream has the wrong API key.
+- `API key 'VALUE' is not permitted from this IP`: The IP is not allowed to stream to this parent.
+- `machine GUID 'VALUE' is not allowed.`: The GUID that is trying to stream is not allowed.
+- `Machine GUID 'VALUE' is not permitted from this IP.`: The IP does not match the pattern or IP that is allowed to
+  connect and stream.
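
The two GUID-format messages in the list above come down to checking that a string has the textual UUID layout from RFC 4122. The following is only a minimal sketch of such a check, under stated assumptions: `looks_like_guid` is a hypothetical helper written for illustration, it is not a Netdata function, and Netdata's own validation may accept or reject different inputs. It tests only the canonical 36-character form (8-4-4-4-12 hexadecimal digits separated by hyphens).

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of Netdata): returns true when `s` has the
// canonical textual UUID layout, i.e. 36 characters made of hexadecimal
// digits with hyphens at offsets 8, 13, 18 and 23.
static bool looks_like_guid(const char *s) {
    if (s == NULL || strlen(s) != 36)
        return false;

    for (size_t i = 0; i < 36; i++) {
        if (i == 8 || i == 13 || i == 18 || i == 23) {
            // separator positions must hold a hyphen
            if (s[i] != '-')
                return false;
        }
        else if (!isxdigit((unsigned char)s[i])) {
            // every other position must be a hexadecimal digit
            return false;
        }
    }

    return true;
}
```

A check like this can be useful when preparing `stream.conf`: an API key or machine GUID that fails it is likely to be refused by the parent with one of the GUID-format messages listed above.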
+ +### Netdata could not create a stream + +The connection between parent and child is a stream. When the parent can't convert the initial connection into +a stream, it will write the following message inside `error.log`: + +``` +file descriptor given is not a valid stream +``` + +After logging this error, Netdata will close the stream. diff --git a/src/streaming/common.h b/src/streaming/common.h new file mode 100644 index 000000000..b7292f4d0 --- /dev/null +++ b/src/streaming/common.h @@ -0,0 +1,9 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef STREAMING_COMMON_H +#define STREAMING_COMMON_H + +#define NETDATA_STREAM_URL "/stream" +#define NETDATA_STREAM_PROTO_NAME "netdata_stream/2.0" + +#endif /* STREAMING_COMMON_H */ diff --git a/src/streaming/compression.c b/src/streaming/compression.c new file mode 100644 index 000000000..a94c8a0a6 --- /dev/null +++ b/src/streaming/compression.c @@ -0,0 +1,707 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression.h" + +#include "compression_gzip.h" + +#ifdef ENABLE_LZ4 +#include "compression_lz4.h" +#endif + +#ifdef ENABLE_ZSTD +#include "compression_zstd.h" +#endif + +#ifdef ENABLE_BROTLI +#include "compression_brotli.h" +#endif + +int rrdpush_compression_levels[COMPRESSION_ALGORITHM_MAX] = { + [COMPRESSION_ALGORITHM_NONE] = 0, + [COMPRESSION_ALGORITHM_ZSTD] = 3, // 1 (faster) - 22 (smaller) + [COMPRESSION_ALGORITHM_LZ4] = 1, // 1 (smaller) - 9 (faster) + [COMPRESSION_ALGORITHM_BROTLI] = 3, // 0 (faster) - 11 (smaller) + [COMPRESSION_ALGORITHM_GZIP] = 1, // 1 (faster) - 9 (smaller) +}; + +void rrdpush_parse_compression_order(struct receiver_state *rpt, const char *order) { + // empty all slots + for(size_t i = 0; i < COMPRESSION_ALGORITHM_MAX ;i++) + rpt->config.compression_priorities[i] = STREAM_CAP_NONE; + + char *s = strdupz(order); + + char *words[COMPRESSION_ALGORITHM_MAX + 100] = { NULL }; + size_t num_words = quoted_strings_splitter_pluginsd(s, words, COMPRESSION_ALGORITHM_MAX + 100); + 
size_t slot = 0; + STREAM_CAPABILITIES added = STREAM_CAP_NONE; + for(size_t i = 0; i < num_words && slot < COMPRESSION_ALGORITHM_MAX ;i++) { + if((STREAM_CAP_ZSTD_AVAILABLE) && strcasecmp(words[i], "zstd") == 0 && !(added & STREAM_CAP_ZSTD)) { + rpt->config.compression_priorities[slot++] = STREAM_CAP_ZSTD; + added |= STREAM_CAP_ZSTD; + } + else if((STREAM_CAP_LZ4_AVAILABLE) && strcasecmp(words[i], "lz4") == 0 && !(added & STREAM_CAP_LZ4)) { + rpt->config.compression_priorities[slot++] = STREAM_CAP_LZ4; + added |= STREAM_CAP_LZ4; + } + else if((STREAM_CAP_BROTLI_AVAILABLE) && strcasecmp(words[i], "brotli") == 0 && !(added & STREAM_CAP_BROTLI)) { + rpt->config.compression_priorities[slot++] = STREAM_CAP_BROTLI; + added |= STREAM_CAP_BROTLI; + } + else if(strcasecmp(words[i], "gzip") == 0 && !(added & STREAM_CAP_GZIP)) { + rpt->config.compression_priorities[slot++] = STREAM_CAP_GZIP; + added |= STREAM_CAP_GZIP; + } + } + + freez(s); + + // make sure all participate + if((STREAM_CAP_ZSTD_AVAILABLE) && slot < COMPRESSION_ALGORITHM_MAX && !(added & STREAM_CAP_ZSTD)) + rpt->config.compression_priorities[slot++] = STREAM_CAP_ZSTD; + if((STREAM_CAP_LZ4_AVAILABLE) && slot < COMPRESSION_ALGORITHM_MAX && !(added & STREAM_CAP_LZ4)) + rpt->config.compression_priorities[slot++] = STREAM_CAP_LZ4; + if((STREAM_CAP_BROTLI_AVAILABLE) && slot < COMPRESSION_ALGORITHM_MAX && !(added & STREAM_CAP_BROTLI)) + rpt->config.compression_priorities[slot++] = STREAM_CAP_BROTLI; + if(slot < COMPRESSION_ALGORITHM_MAX && !(added & STREAM_CAP_GZIP)) + rpt->config.compression_priorities[slot++] = STREAM_CAP_GZIP; +} + +void rrdpush_select_receiver_compression_algorithm(struct receiver_state *rpt) { + if (!rpt->config.rrdpush_compression) + rpt->capabilities &= ~STREAM_CAP_COMPRESSIONS_AVAILABLE; + + // select the right compression before sending our capabilities to the child + if(stream_has_more_than_one_capability_of(rpt->capabilities, STREAM_CAP_COMPRESSIONS_AVAILABLE)) { + STREAM_CAPABILITIES 
compressions = rpt->capabilities & STREAM_CAP_COMPRESSIONS_AVAILABLE; + for(int i = 0; i < COMPRESSION_ALGORITHM_MAX; i++) { + STREAM_CAPABILITIES c = rpt->config.compression_priorities[i]; + + if(!(c & STREAM_CAP_COMPRESSIONS_AVAILABLE)) + continue; + + if(compressions & c) { + STREAM_CAPABILITIES exclude = compressions; + exclude &= ~c; + + rpt->capabilities &= ~exclude; + break; + } + } + } +} + +bool rrdpush_compression_initialize(struct sender_state *s) { + rrdpush_compressor_destroy(&s->compressor); + + // IMPORTANT + // KEEP THE SAME ORDER IN DECOMPRESSION + + if(stream_has_capability(s, STREAM_CAP_ZSTD)) + s->compressor.algorithm = COMPRESSION_ALGORITHM_ZSTD; + else if(stream_has_capability(s, STREAM_CAP_LZ4)) + s->compressor.algorithm = COMPRESSION_ALGORITHM_LZ4; + else if(stream_has_capability(s, STREAM_CAP_BROTLI)) + s->compressor.algorithm = COMPRESSION_ALGORITHM_BROTLI; + else if(stream_has_capability(s, STREAM_CAP_GZIP)) + s->compressor.algorithm = COMPRESSION_ALGORITHM_GZIP; + else + s->compressor.algorithm = COMPRESSION_ALGORITHM_NONE; + + if(s->compressor.algorithm != COMPRESSION_ALGORITHM_NONE) { + s->compressor.level = rrdpush_compression_levels[s->compressor.algorithm]; + rrdpush_compressor_init(&s->compressor); + return true; + } + + return false; +} + +bool rrdpush_decompression_initialize(struct receiver_state *rpt) { + rrdpush_decompressor_destroy(&rpt->decompressor); + + // IMPORTANT + // KEEP THE SAME ORDER IN COMPRESSION + + if(stream_has_capability(rpt, STREAM_CAP_ZSTD)) + rpt->decompressor.algorithm = COMPRESSION_ALGORITHM_ZSTD; + else if(stream_has_capability(rpt, STREAM_CAP_LZ4)) + rpt->decompressor.algorithm = COMPRESSION_ALGORITHM_LZ4; + else if(stream_has_capability(rpt, STREAM_CAP_BROTLI)) + rpt->decompressor.algorithm = COMPRESSION_ALGORITHM_BROTLI; + else if(stream_has_capability(rpt, STREAM_CAP_GZIP)) + rpt->decompressor.algorithm = COMPRESSION_ALGORITHM_GZIP; + else + rpt->decompressor.algorithm = COMPRESSION_ALGORITHM_NONE; + 
+ + if(rpt->decompressor.algorithm != COMPRESSION_ALGORITHM_NONE) { + rrdpush_decompressor_init(&rpt->decompressor); + return true; + } + + return false; +} + +/* +* In case of stream compression buffer overflow +* Inform the user through the error log file and +* deactivate compression by downgrading the stream protocol. +*/ +void rrdpush_compression_deactivate(struct sender_state *s) { + switch(s->compressor.algorithm) { + case COMPRESSION_ALGORITHM_MAX: + case COMPRESSION_ALGORITHM_NONE: + netdata_log_error("STREAM_COMPRESSION: compression error on 'host:%s' without any compression enabled. Ignoring error.", + rrdhost_hostname(s->host)); + break; + + case COMPRESSION_ALGORITHM_GZIP: + netdata_log_error("STREAM_COMPRESSION: GZIP compression error on 'host:%s'. Disabling GZIP for this node.", + rrdhost_hostname(s->host)); + s->disabled_capabilities |= STREAM_CAP_GZIP; + break; + + case COMPRESSION_ALGORITHM_LZ4: + netdata_log_error("STREAM_COMPRESSION: LZ4 compression error on 'host:%s'. Disabling LZ4 for this node.", + rrdhost_hostname(s->host)); + s->disabled_capabilities |= STREAM_CAP_LZ4; + break; + + case COMPRESSION_ALGORITHM_ZSTD: + netdata_log_error("STREAM_COMPRESSION: ZSTD compression error on 'host:%s'. Disabling ZSTD for this node.", + rrdhost_hostname(s->host)); + s->disabled_capabilities |= STREAM_CAP_ZSTD; + break; + + case COMPRESSION_ALGORITHM_BROTLI: + netdata_log_error("STREAM_COMPRESSION: BROTLI compression error on 'host:%s'. 
Disabling BROTLI for this node.", + rrdhost_hostname(s->host)); + s->disabled_capabilities |= STREAM_CAP_BROTLI; + break; + } +} + +// ---------------------------------------------------------------------------- +// compressor public API + +void rrdpush_compressor_init(struct compressor_state *state) { + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + rrdpush_compressor_init_zstd(state); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + rrdpush_compressor_init_lz4(state); + break; +#endif + +#ifdef ENABLE_BROTLI + case COMPRESSION_ALGORITHM_BROTLI: + rrdpush_compressor_init_brotli(state); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + rrdpush_compressor_init_gzip(state); + break; + } + + simple_ring_buffer_reset(&state->input); + simple_ring_buffer_reset(&state->output); +} + +void rrdpush_compressor_destroy(struct compressor_state *state) { + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + rrdpush_compressor_destroy_zstd(state); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + rrdpush_compressor_destroy_lz4(state); + break; +#endif + +#ifdef ENABLE_BROTLI + case COMPRESSION_ALGORITHM_BROTLI: + rrdpush_compressor_destroy_brotli(state); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + rrdpush_compressor_destroy_gzip(state); + break; + } + + state->initialized = false; + + simple_ring_buffer_destroy(&state->input); + simple_ring_buffer_destroy(&state->output); +} + +size_t rrdpush_compress(struct compressor_state *state, const char *data, size_t size, const char **out) { + size_t ret = 0; + + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + ret = rrdpush_compress_zstd(state, data, size, out); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + ret = rrdpush_compress_lz4(state, data, size, out); + break; +#endif + +#ifdef ENABLE_BROTLI + case 
COMPRESSION_ALGORITHM_BROTLI: + ret = rrdpush_compress_brotli(state, data, size, out); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + ret = rrdpush_compress_gzip(state, data, size, out); + break; + } + + if(unlikely(ret >= COMPRESSION_MAX_CHUNK)) { + netdata_log_error("RRDPUSH_COMPRESS: compressed data is %zu bytes, which is >= than the max chunk size %d", + ret, COMPRESSION_MAX_CHUNK); + return 0; + } + + return ret; +} + +// ---------------------------------------------------------------------------- +// decompressor public API + +void rrdpush_decompressor_destroy(struct decompressor_state *state) { + if(unlikely(!state->initialized)) + return; + + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + rrdpush_decompressor_destroy_zstd(state); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + rrdpush_decompressor_destroy_lz4(state); + break; +#endif + +#ifdef ENABLE_BROTLI + case COMPRESSION_ALGORITHM_BROTLI: + rrdpush_decompressor_destroy_brotli(state); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + rrdpush_decompressor_destroy_gzip(state); + break; + } + + simple_ring_buffer_destroy(&state->output); + + state->initialized = false; +} + +void rrdpush_decompressor_init(struct decompressor_state *state) { + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + rrdpush_decompressor_init_zstd(state); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + rrdpush_decompressor_init_lz4(state); + break; +#endif + +#ifdef ENABLE_BROTLI + case COMPRESSION_ALGORITHM_BROTLI: + rrdpush_decompressor_init_brotli(state); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + rrdpush_decompressor_init_gzip(state); + break; + } + + state->signature_size = RRDPUSH_COMPRESSION_SIGNATURE_SIZE; + simple_ring_buffer_reset(&state->output); +} + +size_t rrdpush_decompress(struct decompressor_state *state, const char *compressed_data, 
size_t compressed_size) { + if (unlikely(state->output.read_pos != state->output.write_pos)) + fatal("RRDPUSH_DECOMPRESS: asked to decompress new data, while there are unread data in the decompression buffer!"); + + size_t ret = 0; + + switch(state->algorithm) { +#ifdef ENABLE_ZSTD + case COMPRESSION_ALGORITHM_ZSTD: + ret = rrdpush_decompress_zstd(state, compressed_data, compressed_size); + break; +#endif + +#ifdef ENABLE_LZ4 + case COMPRESSION_ALGORITHM_LZ4: + ret = rrdpush_decompress_lz4(state, compressed_data, compressed_size); + break; +#endif + +#ifdef ENABLE_BROTLI + case COMPRESSION_ALGORITHM_BROTLI: + ret = rrdpush_decompress_brotli(state, compressed_data, compressed_size); + break; +#endif + + default: + case COMPRESSION_ALGORITHM_GZIP: + ret = rrdpush_decompress_gzip(state, compressed_data, compressed_size); + break; + } + + // for backwards compatibility we cannot check for COMPRESSION_MAX_MSG_SIZE, + // because old children may send this big payloads. + if(unlikely(ret > COMPRESSION_MAX_CHUNK)) { + netdata_log_error("RRDPUSH_DECOMPRESS: decompressed data is %zu bytes, which is bigger than the max msg size %d", + ret, COMPRESSION_MAX_CHUNK); + return 0; + } + + return ret; +} + +// ---------------------------------------------------------------------------- +// unit test + +static inline long int my_random (void) { + return random(); +} + +void unittest_generate_random_name(char *dst, size_t size) { + if(size < 7) + size = 7; + + size_t len = 5 + my_random() % (size - 6); + + for(size_t i = 0; i < len ; i++) { + if(my_random() % 2 == 0) + dst[i] = 'A' + my_random() % 26; + else + dst[i] = 'a' + my_random() % 26; + } + + dst[len] = '\0'; +} + +void unittest_generate_message(BUFFER *wb, time_t now_s, size_t counter) { + bool with_slots = true; + NUMBER_ENCODING integer_encoding = NUMBER_ENCODING_BASE64; + NUMBER_ENCODING doubles_encoding = NUMBER_ENCODING_BASE64; + time_t update_every = 1; + time_t point_end_time_s = now_s; + time_t wall_clock_time_s = 
now_s; + size_t chart_slot = counter + 1; + size_t dimensions = 2 + my_random() % 5; + char chart[RRD_ID_LENGTH_MAX + 1] = "name"; + unittest_generate_random_name(chart, 5 + my_random() % 30); + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_BEGIN_V2, sizeof(PLUGINSD_KEYWORD_BEGIN_V2) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, chart_slot); + } + + buffer_fast_strcat(wb, " '", 2); + buffer_strcat(wb, chart); + buffer_fast_strcat(wb, "' ", 2); + buffer_print_uint64_encoded(wb, integer_encoding, update_every); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, point_end_time_s); + buffer_fast_strcat(wb, " ", 1); + if(point_end_time_s == wall_clock_time_s) + buffer_fast_strcat(wb, "#", 1); + else + buffer_print_uint64_encoded(wb, integer_encoding, wall_clock_time_s); + buffer_fast_strcat(wb, "\n", 1); + + + for(size_t d = 0; d < dimensions ;d++) { + size_t dim_slot = d + 1; + char dim_id[RRD_ID_LENGTH_MAX + 1] = "dimension"; + unittest_generate_random_name(dim_id, 10 + my_random() % 20); + int64_t last_collected_value = (my_random() % 2 == 0) ? (int64_t)(counter + d) : (int64_t)my_random(); + NETDATA_DOUBLE value = (my_random() % 2 == 0) ? (NETDATA_DOUBLE)my_random() / ((NETDATA_DOUBLE)my_random() + 1) : (NETDATA_DOUBLE)last_collected_value; + SN_FLAGS flags = (my_random() % 1000 == 0) ? 
SN_FLAG_NONE : SN_FLAG_NOT_ANOMALOUS; + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_SET_V2, sizeof(PLUGINSD_KEYWORD_SET_V2) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, dim_slot); + } + + buffer_fast_strcat(wb, " '", 2); + buffer_strcat(wb, dim_id); + buffer_fast_strcat(wb, "' ", 2); + buffer_print_int64_encoded(wb, integer_encoding, last_collected_value); + buffer_fast_strcat(wb, " ", 1); + + if((NETDATA_DOUBLE)last_collected_value == value) + buffer_fast_strcat(wb, "#", 1); + else + buffer_print_netdata_double_encoded(wb, doubles_encoding, value); + + buffer_fast_strcat(wb, " ", 1); + buffer_print_sn_flags(wb, flags, true); + buffer_fast_strcat(wb, "\n", 1); + } + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_END_V2 "\n", sizeof(PLUGINSD_KEYWORD_END_V2) - 1 + 1); +} + +int unittest_rrdpush_compression_speed(compression_algorithm_t algorithm, const char *name) { + fprintf(stderr, "\nTesting streaming compression speed with %s\n", name); + + struct compressor_state cctx = { + .initialized = false, + .algorithm = algorithm, + }; + struct decompressor_state dctx = { + .initialized = false, + .algorithm = algorithm, + }; + + rrdpush_compressor_init(&cctx); + rrdpush_decompressor_init(&dctx); + + int errors = 0; + + BUFFER *wb = buffer_create(COMPRESSION_MAX_MSG_SIZE, NULL); + time_t now_s = now_realtime_sec(); + usec_t compression_ut = 0; + usec_t decompression_ut = 0; + size_t bytes_compressed = 0; + size_t bytes_uncompressed = 0; + + usec_t compression_started_ut = now_monotonic_usec(); + usec_t decompression_started_ut = compression_started_ut; + + for(int i = 0; i < 10000 ;i++) { + compression_started_ut = now_monotonic_usec(); + decompression_ut += compression_started_ut - decompression_started_ut; + + buffer_flush(wb); + while(buffer_strlen(wb) < COMPRESSION_MAX_MSG_SIZE - 1024) + unittest_generate_message(wb, now_s, i); + + const char *txt = 
buffer_tostring(wb); + size_t txt_len = buffer_strlen(wb); + bytes_uncompressed += txt_len; + + const char *out; + size_t size = rrdpush_compress(&cctx, txt, txt_len, &out); + + bytes_compressed += size; + decompression_started_ut = now_monotonic_usec(); + compression_ut += decompression_started_ut - compression_started_ut; + + if(size == 0) { + fprintf(stderr, "iteration %d: compressed size %zu is zero\n", + i, size); + errors++; + goto cleanup; + } + else if(size >= COMPRESSION_MAX_CHUNK) { + fprintf(stderr, "iteration %d: compressed size %zu exceeds max allowed size\n", + i, size); + errors++; + goto cleanup; + } + else { + size_t dtxt_len = rrdpush_decompress(&dctx, out, size); + char *dtxt = (char *) &dctx.output.data[dctx.output.read_pos]; + + if(rrdpush_decompressed_bytes_in_buffer(&dctx) != dtxt_len) { + fprintf(stderr, "iteration %d: decompressed size %zu does not rrdpush_decompressed_bytes_in_buffer() %zu\n", + i, dtxt_len, rrdpush_decompressed_bytes_in_buffer(&dctx) + ); + errors++; + goto cleanup; + } + + if(!dtxt_len) { + fprintf(stderr, "iteration %d: decompressed size is zero\n", i); + errors++; + goto cleanup; + } + else if(dtxt_len != txt_len) { + fprintf(stderr, "iteration %d: decompressed size %zu does not match original size %zu\n", + i, dtxt_len, txt_len + ); + errors++; + goto cleanup; + } + else { + if(memcmp(txt, dtxt, txt_len) != 0) { + fprintf(stderr, "iteration %d: decompressed data '%s' do not match original data length %zu\n", + i, dtxt, txt_len); + errors++; + goto cleanup; + } + } + } + + // here we are supposed to copy the data and advance the position + dctx.output.read_pos += rrdpush_decompressed_bytes_in_buffer(&dctx); + } + +cleanup: + rrdpush_compressor_destroy(&cctx); + rrdpush_decompressor_destroy(&dctx); + + if(errors) + fprintf(stderr, "Compression with %s: FAILED (%d errors)\n", name, errors); + else + fprintf(stderr, "Compression with %s: OK " + "(compression %zu usec, decompression %zu usec, bytes raw %zu, compressed %zu, 
savings ratio %0.2f%%)\n", + name, compression_ut, decompression_ut, + bytes_uncompressed, bytes_compressed, + 100.0 - (double)bytes_compressed * 100.0 / (double)bytes_uncompressed); + + return errors; +} + +int unittest_rrdpush_compression(compression_algorithm_t algorithm, const char *name) { + fprintf(stderr, "\nTesting streaming compression with %s\n", name); + + struct compressor_state cctx = { + .initialized = false, + .algorithm = algorithm, + }; + struct decompressor_state dctx = { + .initialized = false, + .algorithm = algorithm, + }; + + char txt[COMPRESSION_MAX_MSG_SIZE]; + + rrdpush_compressor_init(&cctx); + rrdpush_decompressor_init(&dctx); + + int errors = 0; + + memset(txt, '=', COMPRESSION_MAX_MSG_SIZE); + + for(int i = 0; i < COMPRESSION_MAX_MSG_SIZE ;i++) { + txt[i] = 'A' + (i % 26); + size_t txt_len = i + 1; + + const char *out; + size_t size = rrdpush_compress(&cctx, txt, txt_len, &out); + + if(size == 0) { + fprintf(stderr, "iteration %d: compressed size %zu is zero\n", + i, size); + errors++; + goto cleanup; + } + else if(size >= COMPRESSION_MAX_CHUNK) { + fprintf(stderr, "iteration %d: compressed size %zu exceeds max allowed size\n", + i, size); + errors++; + goto cleanup; + } + else { + size_t dtxt_len = rrdpush_decompress(&dctx, out, size); + char *dtxt = (char *) &dctx.output.data[dctx.output.read_pos]; + + if(rrdpush_decompressed_bytes_in_buffer(&dctx) != dtxt_len) { + fprintf(stderr, "iteration %d: decompressed size %zu does not rrdpush_decompressed_bytes_in_buffer() %zu\n", + i, dtxt_len, rrdpush_decompressed_bytes_in_buffer(&dctx) + ); + errors++; + goto cleanup; + } + + if(!dtxt_len) { + fprintf(stderr, "iteration %d: decompressed size is zero\n", i); + errors++; + goto cleanup; + } + else if(dtxt_len != txt_len) { + fprintf(stderr, "iteration %d: decompressed size %zu does not match original size %zu\n", + i, dtxt_len, txt_len + ); + errors++; + goto cleanup; + } + else { + if(memcmp(txt, dtxt, txt_len) != 0) { + txt[txt_len] = '\0'; 
+ dtxt[txt_len + 5] = '\0'; + + fprintf(stderr, "iteration %d: decompressed data '%s' do not match original data '%s' of length %zu\n", + i, dtxt, txt, txt_len); + errors++; + goto cleanup; + } + } + } + + // fill the compressed buffer with garbage + memset((void *)out, 'x', size); + + // here we are supposed to copy the data and advance the position + dctx.output.read_pos += rrdpush_decompressed_bytes_in_buffer(&dctx); + } + +cleanup: + rrdpush_compressor_destroy(&cctx); + rrdpush_decompressor_destroy(&dctx); + + if(errors) + fprintf(stderr, "Compression with %s: FAILED (%d errors)\n", name, errors); + else + fprintf(stderr, "Compression with %s: OK\n", name); + + return errors; +} + +int unittest_rrdpush_compressions(void) { + int ret = 0; + + ret += unittest_rrdpush_compression(COMPRESSION_ALGORITHM_ZSTD, "ZSTD"); + ret += unittest_rrdpush_compression(COMPRESSION_ALGORITHM_LZ4, "LZ4"); + ret += unittest_rrdpush_compression(COMPRESSION_ALGORITHM_BROTLI, "BROTLI"); + ret += unittest_rrdpush_compression(COMPRESSION_ALGORITHM_GZIP, "GZIP"); + + ret += unittest_rrdpush_compression_speed(COMPRESSION_ALGORITHM_ZSTD, "ZSTD"); + ret += unittest_rrdpush_compression_speed(COMPRESSION_ALGORITHM_LZ4, "LZ4"); + ret += unittest_rrdpush_compression_speed(COMPRESSION_ALGORITHM_BROTLI, "BROTLI"); + ret += unittest_rrdpush_compression_speed(COMPRESSION_ALGORITHM_GZIP, "GZIP"); + + return ret; +} diff --git a/src/streaming/compression.h b/src/streaming/compression.h new file mode 100644 index 000000000..285fb2cf6 --- /dev/null +++ b/src/streaming/compression.h @@ -0,0 +1,175 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrdpush.h" + +#ifndef NETDATA_RRDPUSH_COMPRESSION_H +#define NETDATA_RRDPUSH_COMPRESSION_H 1 + +// signature MUST end with a newline + +#if COMPRESSION_MAX_MSG_SIZE >= (COMPRESSION_MAX_CHUNK - COMPRESSION_MAX_OVERHEAD) +#error "COMPRESSION_MAX_MSG_SIZE >= (COMPRESSION_MAX_CHUNK - COMPRESSION_MAX_OVERHEAD)" +#endif + +typedef uint32_t 
rrdpush_signature_t; +#define RRDPUSH_COMPRESSION_SIGNATURE ((rrdpush_signature_t)('z' | 0x80) | (0x80 << 8) | (0x80 << 16) | ('\n' << 24)) +#define RRDPUSH_COMPRESSION_SIGNATURE_MASK ((rrdpush_signature_t) 0xffU | (0x80U << 8) | (0x80U << 16) | (0xffU << 24)) +#define RRDPUSH_COMPRESSION_SIGNATURE_SIZE sizeof(rrdpush_signature_t) + +static inline rrdpush_signature_t rrdpush_compress_encode_signature(size_t compressed_data_size) { + rrdpush_signature_t len = ((compressed_data_size & 0x7f) | 0x80 | (((compressed_data_size & (0x7f << 7)) << 1) | 0x8000)) << 8; + return len | RRDPUSH_COMPRESSION_SIGNATURE; +} + +typedef enum { + COMPRESSION_ALGORITHM_NONE = 0, + COMPRESSION_ALGORITHM_ZSTD, + COMPRESSION_ALGORITHM_LZ4, + COMPRESSION_ALGORITHM_GZIP, + COMPRESSION_ALGORITHM_BROTLI, + + // terminator + COMPRESSION_ALGORITHM_MAX, +} compression_algorithm_t; + +extern int rrdpush_compression_levels[COMPRESSION_ALGORITHM_MAX]; + +// this defines the order the algorithms will be selected by the receiver (parent) +#define RRDPUSH_COMPRESSION_ALGORITHMS_ORDER "zstd lz4 brotli gzip" + +// ---------------------------------------------------------------------------- + +typedef struct simple_ring_buffer { + const char *data; + size_t size; + size_t read_pos; + size_t write_pos; +} SIMPLE_RING_BUFFER; + +static inline void simple_ring_buffer_reset(SIMPLE_RING_BUFFER *b) { + b->read_pos = b->write_pos = 0; +} + +static inline void simple_ring_buffer_make_room(SIMPLE_RING_BUFFER *b, size_t size) { + if(b->write_pos + size > b->size) { + if(!b->size) + b->size = COMPRESSION_MAX_CHUNK; + else + b->size *= 2; + + if(b->write_pos + size > b->size) + b->size += size; + + b->data = (const char *)reallocz((void *)b->data, b->size); + } +} + +static inline void simple_ring_buffer_append_data(SIMPLE_RING_BUFFER *b, const void *data, size_t size) { + simple_ring_buffer_make_room(b, size); + memcpy((void *)(b->data + b->write_pos), data, size); + b->write_pos += size; +} + +static inline void 
simple_ring_buffer_destroy(SIMPLE_RING_BUFFER *b) { + freez((void *)b->data); + b->data = NULL; + b->read_pos = b->write_pos = b->size = 0; +} + +// ---------------------------------------------------------------------------- + +struct compressor_state { + bool initialized; + compression_algorithm_t algorithm; + + SIMPLE_RING_BUFFER input; + SIMPLE_RING_BUFFER output; + + int level; + void *stream; + + struct { + size_t total_compressed; + size_t total_uncompressed; + size_t total_compressions; + } sender_locked; +}; + +void rrdpush_compressor_init(struct compressor_state *state); +void rrdpush_compressor_destroy(struct compressor_state *state); +size_t rrdpush_compress(struct compressor_state *state, const char *data, size_t size, const char **out); + +// ---------------------------------------------------------------------------- + +struct decompressor_state { + bool initialized; + compression_algorithm_t algorithm; + size_t signature_size; + + size_t total_compressed; + size_t total_uncompressed; + size_t total_compressions; + + SIMPLE_RING_BUFFER output; + + void *stream; +}; + +void rrdpush_decompressor_destroy(struct decompressor_state *state); +void rrdpush_decompressor_init(struct decompressor_state *state); +size_t rrdpush_decompress(struct decompressor_state *state, const char *compressed_data, size_t compressed_size); + +static inline size_t rrdpush_decompress_decode_signature(const char *data, size_t data_size) { + if (unlikely(!data || !data_size)) + return 0; + + if (unlikely(data_size != RRDPUSH_COMPRESSION_SIGNATURE_SIZE)) + return 0; + + rrdpush_signature_t sign = *(rrdpush_signature_t *)data; + if (unlikely((sign & RRDPUSH_COMPRESSION_SIGNATURE_MASK) != RRDPUSH_COMPRESSION_SIGNATURE)) + return 0; + + size_t length = ((sign >> 8) & 0x7f) | ((sign >> 9) & (0x7f << 7)); + return length; +} + +static inline size_t rrdpush_decompressor_start(struct decompressor_state *state, const char *header, size_t header_size) { + if(unlikely(state->output.read_pos 
!= state->output.write_pos)) + fatal("RRDPUSH DECOMPRESS: asked to decompress new data, while there are unread data in the decompression buffer!"); + + return rrdpush_decompress_decode_signature(header, header_size); +} + +static inline size_t rrdpush_decompressed_bytes_in_buffer(struct decompressor_state *state) { + if(unlikely(state->output.read_pos > state->output.write_pos)) + fatal("RRDPUSH DECOMPRESS: invalid read/write stream positions"); + + return state->output.write_pos - state->output.read_pos; +} + +static inline size_t rrdpush_decompressor_get(struct decompressor_state *state, char *dst, size_t size) { + if (unlikely(!state || !size || !dst)) + return 0; + + size_t remaining = rrdpush_decompressed_bytes_in_buffer(state); + + if(unlikely(!remaining)) + return 0; + + size_t bytes_to_return = size; + if(bytes_to_return > remaining) + bytes_to_return = remaining; + + memcpy(dst, state->output.data + state->output.read_pos, bytes_to_return); + state->output.read_pos += bytes_to_return; + + if(unlikely(state->output.read_pos > state->output.write_pos)) + fatal("RRDPUSH DECOMPRESS: invalid read/write stream positions"); + + return bytes_to_return; +} + +// ---------------------------------------------------------------------------- + +#endif // NETDATA_RRDPUSH_COMPRESSION_H 1 diff --git a/src/streaming/compression_brotli.c b/src/streaming/compression_brotli.c new file mode 100644 index 000000000..cf52f3bca --- /dev/null +++ b/src/streaming/compression_brotli.c @@ -0,0 +1,142 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression_brotli.h" + +#ifdef ENABLE_BROTLI +#include <brotli/encode.h> +#include <brotli/decode.h> + +void rrdpush_compressor_init_brotli(struct compressor_state *state) { + if (!state->initialized) { + state->initialized = true; + state->stream = BrotliEncoderCreateInstance(NULL, NULL, NULL); + + if (state->level < BROTLI_MIN_QUALITY) { + state->level = BROTLI_MIN_QUALITY; + } else if (state->level > BROTLI_MAX_QUALITY) { + 
state->level = BROTLI_MAX_QUALITY; + } + + BrotliEncoderSetParameter(state->stream, BROTLI_PARAM_QUALITY, state->level); + } +} + +void rrdpush_compressor_destroy_brotli(struct compressor_state *state) { + if (state->stream) { + BrotliEncoderDestroyInstance(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_compress_brotli(struct compressor_state *state, const char *data, size_t size, const char **out) { + if (unlikely(!state || !size || !out)) + return 0; + + simple_ring_buffer_make_room(&state->output, MAX(BrotliEncoderMaxCompressedSize(size), COMPRESSION_MAX_CHUNK)); + + size_t available_out = state->output.size; + + size_t available_in = size; + const uint8_t *next_in = (const uint8_t *)data; + uint8_t *next_out = (uint8_t *)state->output.data; + + if (!BrotliEncoderCompressStream(state->stream, BROTLI_OPERATION_FLUSH, &available_in, &next_in, &available_out, &next_out, NULL)) { + netdata_log_error("STREAM: Brotli compression failed."); + return 0; + } + + if(available_in != 0) { + netdata_log_error("STREAM: BrotliEncoderCompressStream() did not use all the input buffer, %zu bytes out of %zu remain", + available_in, size); + return 0; + } + + size_t compressed_size = state->output.size - available_out; + if(available_out == 0) { + netdata_log_error("STREAM: BrotliEncoderCompressStream() needs a bigger output buffer than the one we provided " + "(output buffer %zu bytes, compressed payload %zu bytes)", + state->output.size, size); + return 0; + } + + if(compressed_size == 0) { + netdata_log_error("STREAM: BrotliEncoderCompressStream() did not produce any output from the input provided " + "(input buffer %zu bytes)", + size); + return 0; + } + + state->sender_locked.total_compressions++; + state->sender_locked.total_uncompressed += size - available_in; + state->sender_locked.total_compressed += compressed_size; + + *out = state->output.data; + return compressed_size; +} + +void rrdpush_decompressor_init_brotli(struct decompressor_state *state) { + 
if (!state->initialized) { + state->initialized = true; + state->stream = BrotliDecoderCreateInstance(NULL, NULL, NULL); + + simple_ring_buffer_make_room(&state->output, COMPRESSION_MAX_CHUNK); + } +} + +void rrdpush_decompressor_destroy_brotli(struct decompressor_state *state) { + if (state->stream) { + BrotliDecoderDestroyInstance(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_decompress_brotli(struct decompressor_state *state, const char *compressed_data, size_t compressed_size) { + if (unlikely(!state || !compressed_data || !compressed_size)) + return 0; + + // The state.output ring buffer is always EMPTY at this point, + // meaning that (state->output.read_pos == state->output.write_pos) + // However, THEY ARE NOT ZERO. + + size_t available_out = state->output.size; + size_t available_in = compressed_size; + const uint8_t *next_in = (const uint8_t *)compressed_data; + uint8_t *next_out = (uint8_t *)state->output.data; + + if (BrotliDecoderDecompressStream(state->stream, &available_in, &next_in, &available_out, &next_out, NULL) == BROTLI_DECODER_RESULT_ERROR) { + netdata_log_error("STREAM: Brotli decompression failed."); + return 0; + } + + if(available_in != 0) { + netdata_log_error("STREAM: BrotliDecoderDecompressStream() did not use all the input buffer, %zu bytes out of %zu remain", + available_in, compressed_size); + return 0; + } + + size_t decompressed_size = state->output.size - available_out; + if(available_out == 0) { + netdata_log_error("STREAM: BrotliDecoderDecompressStream() needs a bigger output buffer than the one we provided " + "(output buffer %zu bytes, compressed payload %zu bytes)", + state->output.size, compressed_size); + return 0; + } + + if(decompressed_size == 0) { + netdata_log_error("STREAM: BrotliDecoderDecompressStream() did not produce any output from the input provided " + "(input buffer %zu bytes)", + compressed_size); + return 0; + } + + state->output.read_pos = 0; + state->output.write_pos = decompressed_size; 
+ + state->total_compressed += compressed_size - available_in; + state->total_uncompressed += decompressed_size; + state->total_compressions++; + + return decompressed_size; +} + +#endif // ENABLE_BROTLI diff --git a/src/streaming/compression_brotli.h b/src/streaming/compression_brotli.h new file mode 100644 index 000000000..4955e5a82 --- /dev/null +++ b/src/streaming/compression_brotli.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression.h" + +#ifndef NETDATA_STREAMING_COMPRESSION_BROTLI_H +#define NETDATA_STREAMING_COMPRESSION_BROTLI_H + +void rrdpush_compressor_init_brotli(struct compressor_state *state); +void rrdpush_compressor_destroy_brotli(struct compressor_state *state); +size_t rrdpush_compress_brotli(struct compressor_state *state, const char *data, size_t size, const char **out); +size_t rrdpush_decompress_brotli(struct decompressor_state *state, const char *compressed_data, size_t compressed_size); +void rrdpush_decompressor_init_brotli(struct decompressor_state *state); +void rrdpush_decompressor_destroy_brotli(struct decompressor_state *state); + +#endif //NETDATA_STREAMING_COMPRESSION_BROTLI_H diff --git a/src/streaming/compression_gzip.c b/src/streaming/compression_gzip.c new file mode 100644 index 000000000..c4ef3af05 --- /dev/null +++ b/src/streaming/compression_gzip.c @@ -0,0 +1,164 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression_gzip.h" +#include <zlib.h> + +void rrdpush_compressor_init_gzip(struct compressor_state *state) { + if (!state->initialized) { + state->initialized = true; + + // Initialize deflate stream + z_stream *strm = state->stream = (z_stream *) mallocz(sizeof(z_stream)); + strm->zalloc = Z_NULL; + strm->zfree = Z_NULL; + strm->opaque = Z_NULL; + + if(state->level < Z_BEST_SPEED) + state->level = Z_BEST_SPEED; + + if(state->level > Z_BEST_COMPRESSION) + state->level = Z_BEST_COMPRESSION; + + // int r = deflateInit2(strm, Z_BEST_COMPRESSION, Z_DEFLATED, 15 + 16, 8, 
Z_DEFAULT_STRATEGY); + int r = deflateInit2(strm, state->level, Z_DEFLATED, 15 + 16, 8, Z_DEFAULT_STRATEGY); + if (r != Z_OK) { + netdata_log_error("Failed to initialize deflate with error: %d", r); + freez(state->stream); + state->initialized = false; + return; + } + + } +} + +void rrdpush_compressor_destroy_gzip(struct compressor_state *state) { + if (state->stream) { + deflateEnd(state->stream); + freez(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_compress_gzip(struct compressor_state *state, const char *data, size_t size, const char **out) { + if (unlikely(!state || !size || !out)) + return 0; + + simple_ring_buffer_make_room(&state->output, deflateBound(state->stream, size)); + + z_stream *strm = state->stream; + strm->avail_in = (uInt)size; + strm->next_in = (Bytef *)data; + strm->avail_out = (uInt)state->output.size; + strm->next_out = (Bytef *)state->output.data; + + int ret = deflate(strm, Z_SYNC_FLUSH); + if (ret != Z_OK && ret != Z_STREAM_END) { + netdata_log_error("STREAM: deflate() failed with error %d", ret); + return 0; + } + + if(strm->avail_in != 0) { + netdata_log_error("STREAM: deflate() did not use all the input buffer, %u bytes out of %zu remain", + strm->avail_in, size); + return 0; + } + + if(strm->avail_out == 0) { + netdata_log_error("STREAM: deflate() needs a bigger output buffer than the one we provided " + "(output buffer %zu bytes, compressed payload %zu bytes)", + state->output.size, size); + return 0; + } + + size_t compressed_data_size = state->output.size - strm->avail_out; + + if(compressed_data_size == 0) { + netdata_log_error("STREAM: deflate() did not produce any output " + "(output buffer %zu bytes, compressed payload %zu bytes)", + state->output.size, size); + return 0; + } + + state->sender_locked.total_compressions++; + state->sender_locked.total_uncompressed += size; + state->sender_locked.total_compressed += compressed_data_size; + + *out = state->output.data; + return compressed_data_size; +} + +void 
rrdpush_decompressor_init_gzip(struct decompressor_state *state) { + if (!state->initialized) { + state->initialized = true; + + // Initialize inflate stream + z_stream *strm = state->stream = (z_stream *)mallocz(sizeof(z_stream)); + strm->zalloc = Z_NULL; + strm->zfree = Z_NULL; + strm->opaque = Z_NULL; + + int r = inflateInit2(strm, 15 + 16); + if (r != Z_OK) { + netdata_log_error("Failed to initialize inflateInit2() with error: %d", r); + freez(state->stream); + state->initialized = false; + return; + } + + simple_ring_buffer_make_room(&state->output, COMPRESSION_MAX_CHUNK); + } +} + +void rrdpush_decompressor_destroy_gzip(struct decompressor_state *state) { + if (state->stream) { + inflateEnd(state->stream); + freez(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_decompress_gzip(struct decompressor_state *state, const char *compressed_data, size_t compressed_size) { + if (unlikely(!state || !compressed_data || !compressed_size)) + return 0; + + // The state.output ring buffer is always EMPTY at this point, + // meaning that (state->output.read_pos == state->output.write_pos) + // However, THEY ARE NOT ZERO. 
+ + z_stream *strm = state->stream; + strm->avail_in = (uInt)compressed_size; + strm->next_in = (Bytef *)compressed_data; + strm->avail_out = (uInt)state->output.size; + strm->next_out = (Bytef *)state->output.data; + + int ret = inflate(strm, Z_SYNC_FLUSH); + if (ret != Z_STREAM_END && ret != Z_OK) { + netdata_log_error("RRDPUSH DECOMPRESS: inflate() failed with error %d", ret); + return 0; + } + + if(strm->avail_in != 0) { + netdata_log_error("RRDPUSH DECOMPRESS: inflate() did not use all compressed data we provided " + "(compressed payload %zu bytes, remaining to be uncompressed %u)" + , compressed_size, strm->avail_in); + return 0; + } + + if(strm->avail_out == 0) { + netdata_log_error("RRDPUSH DECOMPRESS: inflate() needs a bigger output buffer than the one we provided " + "(compressed payload %zu bytes, output buffer size %zu bytes)" + , compressed_size, state->output.size); + return 0; + } + + size_t decompressed_size = state->output.size - strm->avail_out; + + state->output.read_pos = 0; + state->output.write_pos = decompressed_size; + + state->total_compressed += compressed_size; + state->total_uncompressed += decompressed_size; + state->total_compressions++; + + return decompressed_size; +} diff --git a/src/streaming/compression_gzip.h b/src/streaming/compression_gzip.h new file mode 100644 index 000000000..85f34bc6d --- /dev/null +++ b/src/streaming/compression_gzip.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression.h" + +#ifndef NETDATA_STREAMING_COMPRESSION_GZIP_H +#define NETDATA_STREAMING_COMPRESSION_GZIP_H + +void rrdpush_compressor_init_gzip(struct compressor_state *state); +void rrdpush_compressor_destroy_gzip(struct compressor_state *state); +size_t rrdpush_compress_gzip(struct compressor_state *state, const char *data, size_t size, const char **out); +size_t rrdpush_decompress_gzip(struct decompressor_state *state, const char *compressed_data, size_t compressed_size); +void 
rrdpush_decompressor_init_gzip(struct decompressor_state *state); +void rrdpush_decompressor_destroy_gzip(struct decompressor_state *state); + +#endif //NETDATA_STREAMING_COMPRESSION_GZIP_H diff --git a/src/streaming/compression_lz4.c b/src/streaming/compression_lz4.c new file mode 100644 index 000000000..f5174134e --- /dev/null +++ b/src/streaming/compression_lz4.c @@ -0,0 +1,143 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression_lz4.h" + +#ifdef ENABLE_LZ4 +#include "lz4.h" + +// ---------------------------------------------------------------------------- +// compress + +void rrdpush_compressor_init_lz4(struct compressor_state *state) { + if(!state->initialized) { + state->initialized = true; + state->stream = LZ4_createStream(); + + // LZ4 needs access to the last 64KB of source data + // so, we keep twice the size of each message + simple_ring_buffer_make_room(&state->input, 65536 + COMPRESSION_MAX_CHUNK * 2); + } +} + +void rrdpush_compressor_destroy_lz4(struct compressor_state *state) { + if (state->stream) { + LZ4_freeStream(state->stream); + state->stream = NULL; + } +} + +/* + * Compress the given block of data + * Compressed data will remain in the internal buffer until the next invocation + * Return the size of compressed data block as result and the pointer to internal buffer using the last argument + * or 0 in case of error + */ +size_t rrdpush_compress_lz4(struct compressor_state *state, const char *data, size_t size, const char **out) { + if(unlikely(!state || !size || !out)) + return 0; + + // we need to keep the last 64K of our previous source data + // as they were in the ring buffer + + simple_ring_buffer_make_room(&state->output, LZ4_COMPRESSBOUND(size)); + + if(state->input.write_pos + size > state->input.size) + // the input buffer cannot fit out data, restart from zero + simple_ring_buffer_reset(&state->input); + + simple_ring_buffer_append_data(&state->input, data, size); + + long int compressed_data_size = 
LZ4_compress_fast_continue( + state->stream, + state->input.data + state->input.read_pos, + (char *)state->output.data, + (int)(state->input.write_pos - state->input.read_pos), + (int)state->output.size, + state->level); + + if (compressed_data_size <= 0) { + netdata_log_error("STREAM: LZ4_compress_fast_continue() returned %ld " + "(source is %zu bytes, output buffer can fit %zu bytes)", + compressed_data_size, size, state->output.size); + return 0; + } + + state->input.read_pos = state->input.write_pos; + + state->sender_locked.total_compressions++; + state->sender_locked.total_uncompressed += size; + state->sender_locked.total_compressed += compressed_data_size; + + *out = state->output.data; + return compressed_data_size; +} + +// ---------------------------------------------------------------------------- +// decompress + +void rrdpush_decompressor_init_lz4(struct decompressor_state *state) { + if(!state->initialized) { + state->initialized = true; + state->stream = LZ4_createStreamDecode(); + simple_ring_buffer_make_room(&state->output, 65536 + COMPRESSION_MAX_CHUNK * 2); + } +} + +void rrdpush_decompressor_destroy_lz4(struct decompressor_state *state) { + if (state->stream) { + LZ4_freeStreamDecode(state->stream); + state->stream = NULL; + } +} + +/* + * Decompress the compressed data in the internal buffer + * Return the size of uncompressed data or 0 for error + */ +size_t rrdpush_decompress_lz4(struct decompressor_state *state, const char *compressed_data, size_t compressed_size) { + if (unlikely(!state || !compressed_data || !compressed_size)) + return 0; + + // The state.output ring buffer is always EMPTY at this point, + // meaning that (state->output.read_pos == state->output.write_pos) + // However, THEY ARE NOT ZERO. 
+ + if (unlikely(state->output.write_pos + COMPRESSION_MAX_CHUNK > state->output.size)) + // the input buffer cannot fit out data, restart from zero + simple_ring_buffer_reset(&state->output); + + long int decompressed_size = LZ4_decompress_safe_continue( + state->stream + , compressed_data + , (char *)(state->output.data + state->output.write_pos) + , (int)compressed_size + , (int)(state->output.size - state->output.write_pos) + ); + + if (unlikely(decompressed_size < 0)) { + netdata_log_error("RRDPUSH DECOMPRESS: LZ4_decompress_safe_continue() returned negative value: %ld " + "(compressed chunk is %zu bytes)" + , decompressed_size, compressed_size); + return 0; + } + + if(unlikely(decompressed_size + state->output.write_pos > state->output.size)) + fatal("RRDPUSH DECOMPRESS: LZ4_decompress_safe_continue() overflown the stream_buffer " + "(size: %zu, pos: %zu, added: %ld, exceeding the buffer by %zu)" + , state->output.size + , state->output.write_pos + , decompressed_size + , (size_t)(state->output.write_pos + decompressed_size - state->output.size) + ); + + state->output.write_pos += decompressed_size; + + // statistics + state->total_compressed += compressed_size; + state->total_uncompressed += decompressed_size; + state->total_compressions++; + + return decompressed_size; +} + +#endif // ENABLE_LZ4 diff --git a/src/streaming/compression_lz4.h b/src/streaming/compression_lz4.h new file mode 100644 index 000000000..69f0fadcc --- /dev/null +++ b/src/streaming/compression_lz4.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression.h" + +#ifndef NETDATA_STREAMING_COMPRESSION_LZ4_H +#define NETDATA_STREAMING_COMPRESSION_LZ4_H + +#ifdef ENABLE_LZ4 + +void rrdpush_compressor_init_lz4(struct compressor_state *state); +void rrdpush_compressor_destroy_lz4(struct compressor_state *state); +size_t rrdpush_compress_lz4(struct compressor_state *state, const char *data, size_t size, const char **out); +size_t rrdpush_decompress_lz4(struct 
decompressor_state *state, const char *compressed_data, size_t compressed_size); +void rrdpush_decompressor_init_lz4(struct decompressor_state *state); +void rrdpush_decompressor_destroy_lz4(struct decompressor_state *state); + +#endif // ENABLE_LZ4 + +#endif //NETDATA_STREAMING_COMPRESSION_LZ4_H diff --git a/src/streaming/compression_zstd.c b/src/streaming/compression_zstd.c new file mode 100644 index 000000000..dabc044f7 --- /dev/null +++ b/src/streaming/compression_zstd.c @@ -0,0 +1,163 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression_zstd.h" + +#ifdef ENABLE_ZSTD +#include <zstd.h> + +void rrdpush_compressor_init_zstd(struct compressor_state *state) { + if(!state->initialized) { + state->initialized = true; + state->stream = ZSTD_createCStream(); + + if(state->level < 1) + state->level = 1; + + if(state->level > ZSTD_maxCLevel()) + state->level = ZSTD_maxCLevel(); + + size_t ret = ZSTD_initCStream(state->stream, state->level); + if(ZSTD_isError(ret)) + netdata_log_error("STREAM: ZSTD_initCStream() returned error: %s", ZSTD_getErrorName(ret)); + + // ZSTD_CCtx_setParameter(state->stream, ZSTD_c_compressionLevel, 1); + // ZSTD_CCtx_setParameter(state->stream, ZSTD_c_strategy, ZSTD_fast); + } +} + +void rrdpush_compressor_destroy_zstd(struct compressor_state *state) { + if(state->stream) { + ZSTD_freeCStream(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_compress_zstd(struct compressor_state *state, const char *data, size_t size, const char **out) { + if(unlikely(!state || !size || !out)) + return 0; + + ZSTD_inBuffer inBuffer = { + .pos = 0, + .size = size, + .src = data, + }; + + size_t wanted_size = MAX(ZSTD_compressBound(inBuffer.size - inBuffer.pos), ZSTD_CStreamOutSize()); + simple_ring_buffer_make_room(&state->output, wanted_size); + + ZSTD_outBuffer outBuffer = { + .pos = 0, + .size = state->output.size, + .dst = (void *)state->output.data, + }; + + // compress + size_t ret = ZSTD_compressStream(state->stream, 
&outBuffer, &inBuffer); + + // error handling + if(ZSTD_isError(ret)) { + netdata_log_error("STREAM: ZSTD_compressStream() return error: %s", ZSTD_getErrorName(ret)); + return 0; + } + + if(inBuffer.pos < inBuffer.size) { + netdata_log_error("STREAM: ZSTD_compressStream() left unprocessed input (source payload %zu bytes, consumed %zu bytes)", + inBuffer.size, inBuffer.pos); + return 0; + } + + if(outBuffer.pos == 0) { + // ZSTD needs more input to flush the output, so let's flush it manually + ret = ZSTD_flushStream(state->stream, &outBuffer); + + if(ZSTD_isError(ret)) { + netdata_log_error("STREAM: ZSTD_flushStream() return error: %s", ZSTD_getErrorName(ret)); + return 0; + } + + if(outBuffer.pos == 0) { + netdata_log_error("STREAM: ZSTD_compressStream() returned zero compressed bytes " + "(source is %zu bytes, output buffer can fit %zu bytes) " + , size, outBuffer.size); + return 0; + } + } + + state->sender_locked.total_compressions++; + state->sender_locked.total_uncompressed += size; + state->sender_locked.total_compressed += outBuffer.pos; + + // return values + *out = state->output.data; + return outBuffer.pos; +} + +void rrdpush_decompressor_init_zstd(struct decompressor_state *state) { + if(!state->initialized) { + state->initialized = true; + state->stream = ZSTD_createDStream(); + + size_t ret = ZSTD_initDStream(state->stream); + if(ZSTD_isError(ret)) + netdata_log_error("STREAM: ZSTD_initDStream() returned error: %s", ZSTD_getErrorName(ret)); + + simple_ring_buffer_make_room(&state->output, MAX(COMPRESSION_MAX_CHUNK, ZSTD_DStreamOutSize())); + } +} + +void rrdpush_decompressor_destroy_zstd(struct decompressor_state *state) { + if (state->stream) { + ZSTD_freeDStream(state->stream); + state->stream = NULL; + } +} + +size_t rrdpush_decompress_zstd(struct decompressor_state *state, const char *compressed_data, size_t compressed_size) { + if (unlikely(!state || !compressed_data || !compressed_size)) + return 0; + + // The state.output ring buffer is always 
EMPTY at this point, + // meaning that (state->output.read_pos == state->output.write_pos) + // However, THEY ARE NOT ZERO. + + ZSTD_inBuffer inBuffer = { + .pos = 0, + .size = compressed_size, + .src = compressed_data, + }; + + ZSTD_outBuffer outBuffer = { + .pos = 0, + .dst = (char *)state->output.data, + .size = state->output.size, + }; + + size_t ret = ZSTD_decompressStream( + state->stream + , &outBuffer + , &inBuffer); + + if(ZSTD_isError(ret)) { + netdata_log_error("STREAM: ZSTD_decompressStream() return error: %s", ZSTD_getErrorName(ret)); + return 0; + } + + if(inBuffer.pos < inBuffer.size) + fatal("RRDPUSH DECOMPRESS: ZSTD ZSTD_decompressStream() decompressed %zu bytes, " + "but %zu bytes of compressed data remain", + inBuffer.pos, inBuffer.size); + + size_t decompressed_size = outBuffer.pos; + + state->output.read_pos = 0; + state->output.write_pos = outBuffer.pos; + + // statistics + state->total_compressed += compressed_size; + state->total_uncompressed += decompressed_size; + state->total_compressions++; + + return decompressed_size; +} + +#endif // ENABLE_ZSTD diff --git a/src/streaming/compression_zstd.h b/src/streaming/compression_zstd.h new file mode 100644 index 000000000..bfabbf89d --- /dev/null +++ b/src/streaming/compression_zstd.h @@ -0,0 +1,19 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "compression.h" + +#ifndef NETDATA_STREAMING_COMPRESSION_ZSTD_H +#define NETDATA_STREAMING_COMPRESSION_ZSTD_H + +#ifdef ENABLE_ZSTD + +void rrdpush_compressor_init_zstd(struct compressor_state *state); +void rrdpush_compressor_destroy_zstd(struct compressor_state *state); +size_t rrdpush_compress_zstd(struct compressor_state *state, const char *data, size_t size, const char **out); +size_t rrdpush_decompress_zstd(struct decompressor_state *state, const char *compressed_data, size_t compressed_size); +void rrdpush_decompressor_init_zstd(struct decompressor_state *state); +void rrdpush_decompressor_destroy_zstd(struct decompressor_state 
*state); + +#endif // ENABLE_ZSTD + +#endif //NETDATA_STREAMING_COMPRESSION_ZSTD_H diff --git a/src/streaming/receiver.c b/src/streaming/receiver.c new file mode 100644 index 000000000..2cbf247dc --- /dev/null +++ b/src/streaming/receiver.c @@ -0,0 +1,948 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrdpush.h" +#include "web/server/h2o/http_server.h" + +extern struct config stream_config; + +void receiver_state_free(struct receiver_state *rpt) { + + freez(rpt->key); + freez(rpt->hostname); + freez(rpt->registry_hostname); + freez(rpt->machine_guid); + freez(rpt->os); + freez(rpt->timezone); + freez(rpt->abbrev_timezone); + freez(rpt->client_ip); + freez(rpt->client_port); + freez(rpt->program_name); + freez(rpt->program_version); + +#ifdef ENABLE_HTTPS + netdata_ssl_close(&rpt->ssl); +#endif + + if(rpt->fd != -1) { + internal_error(true, "closing socket..."); + close(rpt->fd); + } + + rrdpush_decompressor_destroy(&rpt->decompressor); + + if(rpt->system_info) + rrdhost_system_info_free(rpt->system_info); + + __atomic_sub_fetch(&netdata_buffers_statistics.rrdhost_receivers, sizeof(*rpt), __ATOMIC_RELAXED); + + freez(rpt); +} + +#include "collectors/plugins.d/pluginsd_parser.h" + +// IMPORTANT: to add workers, you have to edit WORKER_PARSER_FIRST_JOB accordingly +#define WORKER_RECEIVER_JOB_BYTES_READ (WORKER_PARSER_FIRST_JOB - 1) +#define WORKER_RECEIVER_JOB_BYTES_UNCOMPRESSED (WORKER_PARSER_FIRST_JOB - 2) + +// this has to be the same at parser.h +#define WORKER_RECEIVER_JOB_REPLICATION_COMPLETION (WORKER_PARSER_FIRST_JOB - 3) + +#if WORKER_PARSER_FIRST_JOB < 1 +#error The define WORKER_PARSER_FIRST_JOB needs to be at least 1 +#endif + +static inline int read_stream(struct receiver_state *r, char* buffer, size_t size) { + if(unlikely(!size)) { + internal_error(true, "%s() asked to read zero bytes", __FUNCTION__); + return 0; + } + +#ifdef ENABLE_H2O + if (is_h2o_rrdpush(r)) { + if(nd_thread_signaled_to_cancel()) + return -4; + + return 
(int)h2o_stream_read(r->h2o_ctx, buffer, size); + } +#endif + + int tries = 100; + ssize_t bytes_read; + + do { + errno = 0; + + switch(wait_on_socket_or_cancel_with_timeout( +#ifdef ENABLE_HTTPS + &r->ssl, +#endif + r->fd, 0, POLLIN, NULL)) + { + case 0: // data are waiting + break; + + case 1: // timeout reached + netdata_log_error("STREAM: %s(): timeout while waiting for data on socket!", __FUNCTION__); + return -3; + + case -1: // thread cancelled + netdata_log_error("STREAM: %s(): thread has been cancelled timeout while waiting for data on socket!", __FUNCTION__); + return -4; + + default: + case 2: // error on socket + netdata_log_error("STREAM: %s() socket error!", __FUNCTION__); + return -2; + } + +#ifdef ENABLE_HTTPS + if (SSL_connection(&r->ssl)) + bytes_read = netdata_ssl_read(&r->ssl, buffer, size); + else + bytes_read = read(r->fd, buffer, size); +#else + bytes_read = read(r->fd, buffer, size); +#endif + + } while(bytes_read < 0 && errno == EINTR && tries--); + + if((bytes_read == 0 || bytes_read == -1) && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINPROGRESS)) { + netdata_log_error("STREAM: %s(): timeout while waiting for data on socket!", __FUNCTION__); + bytes_read = -3; + } + else if (bytes_read == 0) { + netdata_log_error("STREAM: %s(): EOF while reading data from socket!", __FUNCTION__); + bytes_read = -1; + } + else if (bytes_read < 0) { + netdata_log_error("STREAM: %s() failed to read from socket!", __FUNCTION__); + bytes_read = -2; + } + + return (int)bytes_read; +} + +static inline STREAM_HANDSHAKE read_stream_error_to_reason(int code) { + if(code > 0) + return 0; + + switch(code) { + case 0: + // asked to read zero bytes + return STREAM_HANDSHAKE_DISCONNECT_NOT_SUFFICIENT_READ_BUFFER; + + case -1: + // EOF + return STREAM_HANDSHAKE_DISCONNECT_SOCKET_EOF; + + case -2: + // failed to read + return STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_FAILED; + + case -3: + // timeout + return STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_TIMEOUT; + + 
case -4: + // the thread is cancelled + return STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN; + + default: + // anything else + return STREAM_HANDSHAKE_DISCONNECT_UNKNOWN_SOCKET_READ_ERROR; + } +} + +static inline bool receiver_read_uncompressed(struct receiver_state *r, STREAM_HANDSHAKE *reason) { +#ifdef NETDATA_INTERNAL_CHECKS + if(r->reader.read_buffer[r->reader.read_len] != '\0') + fatal("%s(): read_buffer does not start with zero", __FUNCTION__ ); +#endif + + int bytes_read = read_stream(r, r->reader.read_buffer + r->reader.read_len, sizeof(r->reader.read_buffer) - r->reader.read_len - 1); + if(unlikely(bytes_read <= 0)) { + *reason = read_stream_error_to_reason(bytes_read); + return false; + } + + worker_set_metric(WORKER_RECEIVER_JOB_BYTES_READ, (NETDATA_DOUBLE)bytes_read); + worker_set_metric(WORKER_RECEIVER_JOB_BYTES_UNCOMPRESSED, (NETDATA_DOUBLE)bytes_read); + + r->reader.read_len += bytes_read; + r->reader.read_buffer[r->reader.read_len] = '\0'; + + return true; +} + +static inline bool receiver_read_compressed(struct receiver_state *r, STREAM_HANDSHAKE *reason) { + + internal_fatal(r->reader.read_buffer[r->reader.read_len] != '\0', + "%s: read_buffer does not start with zero #2", __FUNCTION__ ); + + // first use any available uncompressed data + if (likely(rrdpush_decompressed_bytes_in_buffer(&r->decompressor))) { + size_t available = sizeof(r->reader.read_buffer) - r->reader.read_len - 1; + if (likely(available)) { + size_t len = rrdpush_decompressor_get(&r->decompressor, r->reader.read_buffer + r->reader.read_len, available); + if (unlikely(!len)) { + internal_error(true, "decompressor returned zero length #1"); + return false; + } + + r->reader.read_len += (int)len; + r->reader.read_buffer[r->reader.read_len] = '\0'; + } + else + internal_fatal(true, "The line to read is too big! 
Already have %zd bytes in read_buffer.", r->reader.read_len); + + return true; + } + + // no decompressed data available + // read the compression signature of the next block + + if(unlikely(r->reader.read_len + r->decompressor.signature_size > sizeof(r->reader.read_buffer) - 1)) { + internal_error(true, "The last incomplete line does not leave enough room for the next compression header! " + "Already have %zd bytes in read_buffer.", r->reader.read_len); + return false; + } + + // read the compression signature from the stream + // we have to do a loop here, because read_stream() may return less than the data we need + int bytes_read = 0; + do { + int ret = read_stream(r, r->reader.read_buffer + r->reader.read_len + bytes_read, r->decompressor.signature_size - bytes_read); + if (unlikely(ret <= 0)) { + *reason = read_stream_error_to_reason(ret); + return false; + } + + bytes_read += ret; + } while(unlikely(bytes_read < (int)r->decompressor.signature_size)); + + worker_set_metric(WORKER_RECEIVER_JOB_BYTES_READ, (NETDATA_DOUBLE)bytes_read); + + if(unlikely(bytes_read != (int)r->decompressor.signature_size)) + fatal("read %d bytes, but expected compression signature of size %zu", bytes_read, r->decompressor.signature_size); + + size_t compressed_message_size = rrdpush_decompressor_start(&r->decompressor, r->reader.read_buffer + r->reader.read_len, bytes_read); + if (unlikely(!compressed_message_size)) { + internal_error(true, "multiplexed uncompressed data in compressed stream!"); + r->reader.read_len += bytes_read; + r->reader.read_buffer[r->reader.read_len] = '\0'; + return true; + } + + if(unlikely(compressed_message_size > COMPRESSION_MAX_MSG_SIZE)) { + netdata_log_error("received a compressed message of %zu bytes, which is bigger than the max compressed message size supported of %zu. 
Ignoring message.", + compressed_message_size, (size_t)COMPRESSION_MAX_MSG_SIZE); + return false; + } + + // delete compression header from our read buffer + r->reader.read_buffer[r->reader.read_len] = '\0'; + + // Read the entire compressed block of compressed data + char compressed[compressed_message_size]; + size_t compressed_bytes_read = 0; + do { + size_t start = compressed_bytes_read; + size_t remaining = compressed_message_size - start; + + int last_read_bytes = read_stream(r, &compressed[start], remaining); + if (unlikely(last_read_bytes <= 0)) { + *reason = read_stream_error_to_reason(last_read_bytes); + return false; + } + + compressed_bytes_read += last_read_bytes; + + } while(unlikely(compressed_message_size > compressed_bytes_read)); + + worker_set_metric(WORKER_RECEIVER_JOB_BYTES_READ, (NETDATA_DOUBLE)compressed_bytes_read); + + // decompress the compressed block + size_t bytes_to_parse = rrdpush_decompress(&r->decompressor, compressed, compressed_bytes_read); + if (unlikely(!bytes_to_parse)) { + internal_error(true, "no bytes to parse."); + return false; + } + + worker_set_metric(WORKER_RECEIVER_JOB_BYTES_UNCOMPRESSED, (NETDATA_DOUBLE)bytes_to_parse); + + // fill read buffer with decompressed data + size_t len = (int) rrdpush_decompressor_get(&r->decompressor, r->reader.read_buffer + r->reader.read_len, sizeof(r->reader.read_buffer) - r->reader.read_len - 1); + if (unlikely(!len)) { + internal_error(true, "decompressor returned zero length #2"); + return false; + } + r->reader.read_len += (int)len; + r->reader.read_buffer[r->reader.read_len] = '\0'; + + return true; +} + +bool plugin_is_enabled(struct plugind *cd); + +static void receiver_set_exit_reason(struct receiver_state *rpt, STREAM_HANDSHAKE reason, bool force) { + if(force || !rpt->exit.reason) + rpt->exit.reason = reason; +} + +static inline bool receiver_should_stop(struct receiver_state *rpt) { + static __thread size_t counter = 0; + + if(nd_thread_signaled_to_cancel()) { + 
receiver_set_exit_reason(rpt, STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN, false);
+        return true;
+    }
+
+    if(unlikely(rpt->exit.shutdown)) {
+        receiver_set_exit_reason(rpt, STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN, false);
+        return true;
+    }
+
+    if(unlikely(!service_running(SERVICE_STREAMING))) {
+        receiver_set_exit_reason(rpt, STREAM_HANDSHAKE_DISCONNECT_NETDATA_EXIT, false);
+        return true;
+    }
+
+    if(unlikely((counter++ % 1000) == 0))
+        rpt->last_msg_t = now_monotonic_sec();
+
+    return false;
+}
+
+static size_t streaming_parser(struct receiver_state *rpt, struct plugind *cd, int fd, void *ssl) {
+    size_t result = 0;
+
+    PARSER *parser = NULL;
+    {
+        PARSER_USER_OBJECT user = {
+                .enabled = plugin_is_enabled(cd),
+                .host = rpt->host,
+                .opaque = rpt,
+                .cd = cd,
+                .trust_durations = 1,
+                .capabilities = rpt->capabilities,
+        };
+
+        parser = parser_init(&user, NULL, NULL, fd, PARSER_INPUT_SPLIT, ssl);
+    }
+
+#ifdef ENABLE_H2O
+    parser->h2o_ctx = rpt->h2o_ctx;
+#endif
+
+    pluginsd_keywords_init(parser, PARSER_INIT_STREAMING);
+
+    rrd_collector_started();
+
+    // this keeps the parser with its current value
+    // so, parser needs to be allocated before pushing it
+    CLEANUP_FUNCTION_REGISTER(pluginsd_process_thread_cleanup) parser_ptr = parser;
+
+    bool compressed_connection = rrdpush_decompression_initialize(rpt);
+    buffered_reader_init(&rpt->reader);
+
+#ifdef NETDATA_LOG_STREAM_RECEIVE
+    {
+        char filename[FILENAME_MAX + 1];
+        snprintfz(filename, FILENAME_MAX, "/tmp/stream-receiver-%s.txt", rpt->host ? rrdhost_hostname(
+                rpt->host) : "unknown"
+        );
+        parser->user.stream_log_fp = fopen(filename, "w");
+        parser->user.stream_log_repertoire = PARSER_REP_METADATA;
+    }
+#endif
+
+    CLEAN_BUFFER *buffer = buffer_create(sizeof(rpt->reader.read_buffer), NULL);
+
+    ND_LOG_STACK lgs[] = {
+            ND_LOG_FIELD_CB(NDF_REQUEST, line_splitter_reconstruct_line, &parser->line),
+            ND_LOG_FIELD_CB(NDF_NIDL_NODE, parser_reconstruct_node, parser),
+            ND_LOG_FIELD_CB(NDF_NIDL_INSTANCE, parser_reconstruct_instance, parser),
+            ND_LOG_FIELD_CB(NDF_NIDL_CONTEXT, parser_reconstruct_context, parser),
+            ND_LOG_FIELD_END(),
+    };
+    ND_LOG_STACK_PUSH(lgs);
+
+    while(!receiver_should_stop(rpt)) {
+
+        if(!buffered_reader_next_line(&rpt->reader, buffer)) {
+            STREAM_HANDSHAKE reason = STREAM_HANDSHAKE_DISCONNECT_UNKNOWN_SOCKET_READ_ERROR;
+
+            bool have_new_data = compressed_connection ? receiver_read_compressed(rpt, &reason)
+                                                       : receiver_read_uncompressed(rpt, &reason);
+
+            if(unlikely(!have_new_data)) {
+                receiver_set_exit_reason(rpt, reason, false);
+                break;
+            }
+
+            continue;
+        }
+
+        if(unlikely(parser_action(parser, buffer->buffer))) {
+            receiver_set_exit_reason(rpt, STREAM_HANDSHAKE_DISCONNECT_PARSER_FAILED, false);
+            break;
+        }
+
+        buffer->len = 0;
+        buffer->buffer[0] = '\0';
+    }
+    result = parser->user.data_collections_count;
+    return result;
+}
+
+static void rrdpush_receiver_replication_reset(RRDHOST *host) {
+    RRDSET *st;
+    rrdset_foreach_read(st, host) {
+        rrdset_flag_clear(st, RRDSET_FLAG_RECEIVER_REPLICATION_IN_PROGRESS);
+        rrdset_flag_set(st, RRDSET_FLAG_RECEIVER_REPLICATION_FINISHED);
+    }
+    rrdset_foreach_done(st);
+    rrdhost_receiver_replicating_charts_zero(host);
+}
+
+static bool rrdhost_set_receiver(RRDHOST *host, struct receiver_state *rpt) {
+    bool signal_rrdcontext = false;
+    bool set_this = false;
+
+    netdata_mutex_lock(&host->receiver_lock);
+
+    if (!host->receiver) {
+        rrdhost_flag_clear(host, RRDHOST_FLAG_ORPHAN);
+
+        host->rrdpush_receiver_connection_counter++;
+        __atomic_add_fetch(&localhost->connected_children_count, 1, __ATOMIC_RELAXED);
+
+        host->receiver = rpt;
+        rpt->host = host;
+
+        host->child_connect_time = now_realtime_sec();
+        host->child_disconnected_time = 0;
+        host->child_last_chart_command = 0;
+        host->trigger_chart_obsoletion_check = 1;
+
+        if (rpt->config.health_enabled != CONFIG_BOOLEAN_NO) {
+            if (rpt->config.alarms_delay > 0) {
+                host->health.health_delay_up_to = now_realtime_sec() + rpt->config.alarms_delay;
+                nd_log(NDLS_DAEMON, NDLP_DEBUG,
+                       "[%s]: Postponing health checks for %" PRId64 " seconds, because it was just connected.",
+                       rrdhost_hostname(host),
+                       (int64_t) rpt->config.alarms_delay);
+            }
+        }
+
+        host->health_log.health_log_history = rpt->config.alarms_history;
+
+// this is a test
+// if(rpt->hops <= host->sender->hops)
+// rrdpush_sender_thread_stop(host, "HOPS MISMATCH", false);
+
+        signal_rrdcontext = true;
+        rrdpush_receiver_replication_reset(host);
+
+        rrdhost_flag_clear(rpt->host, RRDHOST_FLAG_RRDPUSH_RECEIVER_DISCONNECTED);
+        aclk_queue_node_info(rpt->host, true);
+
+        rrdpush_reset_destinations_postpone_time(host);
+
+        set_this = true;
+    }
+
+    netdata_mutex_unlock(&host->receiver_lock);
+
+    if(signal_rrdcontext)
+        rrdcontext_host_child_connected(host);
+
+    return set_this;
+}
+
+static void rrdhost_clear_receiver(struct receiver_state *rpt) {
+    RRDHOST *host = rpt->host;
+    if(host) {
+        bool signal_rrdcontext = false;
+        netdata_mutex_lock(&host->receiver_lock);
+
+        // Make sure that we detach this thread and don't kill a freshly arriving receiver
+        if(host->receiver == rpt) {
+            __atomic_sub_fetch(&localhost->connected_children_count, 1, __ATOMIC_RELAXED);
+            rrdhost_flag_set(rpt->host, RRDHOST_FLAG_RRDPUSH_RECEIVER_DISCONNECTED);
+
+            host->trigger_chart_obsoletion_check = 0;
+            host->child_connect_time = 0;
+            host->child_disconnected_time = now_realtime_sec();
+
+            host->health.health_enabled = 0;
+
+            rrdpush_sender_thread_stop(host, STREAM_HANDSHAKE_DISCONNECT_RECEIVER_LEFT, false);
+
+            signal_rrdcontext = true;
+            rrdpush_receiver_replication_reset(host);
+
+            rrdhost_flag_set(host, RRDHOST_FLAG_ORPHAN);
+            host->receiver = NULL;
+            host->rrdpush_last_receiver_exit_reason = rpt->exit.reason;
+
+            if(rpt->config.health_enabled)
+                rrdcalc_child_disconnected(host);
+        }
+
+        netdata_mutex_unlock(&host->receiver_lock);
+
+        if(signal_rrdcontext)
+            rrdcontext_host_child_disconnected(host);
+
+        rrdpush_reset_destinations_postpone_time(host);
+    }
+}
+
+bool stop_streaming_receiver(RRDHOST *host, STREAM_HANDSHAKE reason) {
+    bool ret = false;
+
+    netdata_mutex_lock(&host->receiver_lock);
+
+    if(host->receiver) {
+        if(!host->receiver->exit.shutdown) {
+            host->receiver->exit.shutdown = true;
+            receiver_set_exit_reason(host->receiver, reason, true);
+            shutdown(host->receiver->fd, SHUT_RDWR);
+        }
+
+        nd_thread_signal_cancel(host->receiver->thread);
+    }
+
+    int count = 2000;
+    while (host->receiver && count-- > 0) {
+        netdata_mutex_unlock(&host->receiver_lock);
+
+        // let the lock for the receiver thread to exit
+        sleep_usec(1 * USEC_PER_MS);
+
+        netdata_mutex_lock(&host->receiver_lock);
+    }
+
+    if(host->receiver)
+        netdata_log_error("STREAM '%s' [receive from [%s]:%s]: "
+              "thread %d takes too long to stop, giving up..."
+              , rrdhost_hostname(host)
+              , host->receiver->client_ip, host->receiver->client_port
+              , host->receiver->tid);
+    else
+        ret = true;
+
+    netdata_mutex_unlock(&host->receiver_lock);
+
+    return ret;
+}
+
+static void rrdpush_send_error_on_taken_over_connection(struct receiver_state *rpt, const char *msg) {
+    (void) send_timeout(
+#ifdef ENABLE_HTTPS
+            &rpt->ssl,
+#endif
+            rpt->fd,
+            (char *)msg,
+            strlen(msg),
+            0,
+            5);
+}
+
+void rrdpush_receive_log_status(struct receiver_state *rpt, const char *msg, const char *status, ND_LOG_FIELD_PRIORITY priority) {
+    // this function may be called BEFORE we spawn the receiver thread
+    // so, we need to add the fields again (it does not harm)
+    ND_LOG_STACK lgs[] = {
+            ND_LOG_FIELD_TXT(NDF_SRC_IP, rpt->client_ip),
+            ND_LOG_FIELD_TXT(NDF_SRC_PORT, rpt->client_port),
+            ND_LOG_FIELD_TXT(NDF_NIDL_NODE, (rpt->hostname && *rpt->hostname) ? rpt->hostname : ""),
+            ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, status),
+            ND_LOG_FIELD_UUID(NDF_MESSAGE_ID, &streaming_from_child_msgid),
+            ND_LOG_FIELD_END(),
+    };
+    ND_LOG_STACK_PUSH(lgs);
+
+    nd_log(NDLS_ACCESS, priority, "api_key:'%s' machine_guid:'%s' msg:'%s'"
+           , (rpt->key && *rpt->key)? rpt->key : ""
+           , (rpt->machine_guid && *rpt->machine_guid) ? rpt->machine_guid : ""
+           , msg);
+
+    nd_log(NDLS_DAEMON, priority, "STREAM_RECEIVER for '%s': %s %s%s%s"
+           , (rpt->hostname && *rpt->hostname) ? rpt->hostname : ""
+           , msg
+           , rpt->exit.reason != STREAM_HANDSHAKE_NEVER?" (":""
+           , stream_handshake_error_to_string(rpt->exit.reason)
+           , rpt->exit.reason != STREAM_HANDSHAKE_NEVER?")":""
+           );
+}
+
+static void rrdpush_receive(struct receiver_state *rpt)
+{
+    rpt->config.mode = default_rrd_memory_mode;
+    rpt->config.history = default_rrd_history_entries;
+
+    rpt->config.health_enabled = health_plugin_enabled();
+    rpt->config.alarms_delay = 60;
+    rpt->config.alarms_history = HEALTH_LOG_DEFAULT_HISTORY;
+
+    rpt->config.rrdpush_enabled = (int)default_rrdpush_enabled;
+    rpt->config.rrdpush_destination = default_rrdpush_destination;
+    rpt->config.rrdpush_api_key = default_rrdpush_api_key;
+    rpt->config.rrdpush_send_charts_matching = default_rrdpush_send_charts_matching;
+
+    rpt->config.rrdpush_enable_replication = default_rrdpush_enable_replication;
+    rpt->config.rrdpush_seconds_to_replicate = default_rrdpush_seconds_to_replicate;
+    rpt->config.rrdpush_replication_step = default_rrdpush_replication_step;
+
+    rpt->config.update_every = (int)appconfig_get_number(&stream_config, rpt->machine_guid, "update every", rpt->config.update_every);
+    if(rpt->config.update_every < 0) rpt->config.update_every = 1;
+
+    rpt->config.history = (int)appconfig_get_number(&stream_config, rpt->key, "default history", rpt->config.history);
+    rpt->config.history = (int)appconfig_get_number(&stream_config, rpt->machine_guid, "history", rpt->config.history);
+    if(rpt->config.history < 5) rpt->config.history = 5;
+
+    rpt->config.mode = rrd_memory_mode_id(appconfig_get(&stream_config, rpt->key, "default memory mode", rrd_memory_mode_name(rpt->config.mode)));
+    rpt->config.mode = rrd_memory_mode_id(appconfig_get(&stream_config, rpt->machine_guid, "memory mode", rrd_memory_mode_name(rpt->config.mode)));
+
+    if (unlikely(rpt->config.mode == RRD_MEMORY_MODE_DBENGINE && !dbengine_enabled)) {
+        netdata_log_error("STREAM '%s' [receive from %s:%s]: "
+              "dbengine is not enabled, falling back to default."
+              , rpt->hostname
+              , rpt->client_ip, rpt->client_port
+              );
+
+        rpt->config.mode = default_rrd_memory_mode;
+    }
+
+    rpt->config.health_enabled = appconfig_get_boolean_ondemand(&stream_config, rpt->key, "health enabled by default", rpt->config.health_enabled);
+    rpt->config.health_enabled = appconfig_get_boolean_ondemand(&stream_config, rpt->machine_guid, "health enabled", rpt->config.health_enabled);
+
+    rpt->config.alarms_delay = appconfig_get_number(&stream_config, rpt->key, "default postpone alarms on connect seconds", rpt->config.alarms_delay);
+    rpt->config.alarms_delay = appconfig_get_number(&stream_config, rpt->machine_guid, "postpone alarms on connect seconds", rpt->config.alarms_delay);
+
+    rpt->config.alarms_history = appconfig_get_number(&stream_config, rpt->key, "default health log history", rpt->config.alarms_history);
+    rpt->config.alarms_history = appconfig_get_number(&stream_config, rpt->machine_guid, "health log history", rpt->config.alarms_history);
+
+    rpt->config.rrdpush_enabled = appconfig_get_boolean(&stream_config, rpt->key, "default proxy enabled", rpt->config.rrdpush_enabled);
+    rpt->config.rrdpush_enabled = appconfig_get_boolean(&stream_config, rpt->machine_guid, "proxy enabled", rpt->config.rrdpush_enabled);
+
+    rpt->config.rrdpush_destination = appconfig_get(&stream_config, rpt->key, "default proxy destination", rpt->config.rrdpush_destination);
+    rpt->config.rrdpush_destination = appconfig_get(&stream_config, rpt->machine_guid, "proxy destination", rpt->config.rrdpush_destination);
+
+    rpt->config.rrdpush_api_key = appconfig_get(&stream_config, rpt->key, "default proxy api key", rpt->config.rrdpush_api_key);
+    rpt->config.rrdpush_api_key = appconfig_get(&stream_config, rpt->machine_guid, "proxy api key", rpt->config.rrdpush_api_key);
+
+    rpt->config.rrdpush_send_charts_matching = appconfig_get(&stream_config, rpt->key, "default proxy send charts matching", rpt->config.rrdpush_send_charts_matching);
+    rpt->config.rrdpush_send_charts_matching = appconfig_get(&stream_config, rpt->machine_guid, "proxy send charts matching", rpt->config.rrdpush_send_charts_matching);
+
+    rpt->config.rrdpush_enable_replication = appconfig_get_boolean(&stream_config, rpt->key, "enable replication", rpt->config.rrdpush_enable_replication);
+    rpt->config.rrdpush_enable_replication = appconfig_get_boolean(&stream_config, rpt->machine_guid, "enable replication", rpt->config.rrdpush_enable_replication);
+
+    rpt->config.rrdpush_seconds_to_replicate = appconfig_get_number(&stream_config, rpt->key, "seconds to replicate", rpt->config.rrdpush_seconds_to_replicate);
+    rpt->config.rrdpush_seconds_to_replicate = appconfig_get_number(&stream_config, rpt->machine_guid, "seconds to replicate", rpt->config.rrdpush_seconds_to_replicate);
+
+    rpt->config.rrdpush_replication_step = appconfig_get_number(&stream_config, rpt->key, "seconds per replication step", rpt->config.rrdpush_replication_step);
+    rpt->config.rrdpush_replication_step = appconfig_get_number(&stream_config, rpt->machine_guid, "seconds per replication step", rpt->config.rrdpush_replication_step);
+
+    rpt->config.rrdpush_compression = default_rrdpush_compression_enabled;
+    rpt->config.rrdpush_compression = appconfig_get_boolean(&stream_config, rpt->key, "enable compression", rpt->config.rrdpush_compression);
+    rpt->config.rrdpush_compression = appconfig_get_boolean(&stream_config, rpt->machine_guid, "enable compression", rpt->config.rrdpush_compression);
+
+    bool is_ephemeral = false;
+    is_ephemeral = appconfig_get_boolean(&stream_config, rpt->key, "is ephemeral node", CONFIG_BOOLEAN_NO);
+    is_ephemeral = appconfig_get_boolean(&stream_config, rpt->machine_guid, "is ephemeral node", is_ephemeral);
+
+    if(rpt->config.rrdpush_compression) {
+        char *order = appconfig_get(&stream_config, rpt->key, "compression algorithms order", RRDPUSH_COMPRESSION_ALGORITHMS_ORDER);
+        order = appconfig_get(&stream_config, rpt->machine_guid, "compression algorithms order", order);
+        rrdpush_parse_compression_order(rpt, order);
+    }
+
+    // find the host for this receiver
+    {
+        // this will also update the host with our system_info
+        RRDHOST *host = rrdhost_find_or_create(
+            rpt->hostname,
+            rpt->registry_hostname,
+            rpt->machine_guid,
+            rpt->os,
+            rpt->timezone,
+            rpt->abbrev_timezone,
+            rpt->utc_offset,
+            rpt->program_name,
+            rpt->program_version,
+            rpt->config.update_every,
+            rpt->config.history,
+            rpt->config.mode,
+            (unsigned int)(rpt->config.health_enabled != CONFIG_BOOLEAN_NO),
+            (unsigned int)(rpt->config.rrdpush_enabled && rpt->config.rrdpush_destination &&
+                           *rpt->config.rrdpush_destination && rpt->config.rrdpush_api_key &&
+                           *rpt->config.rrdpush_api_key),
+            rpt->config.rrdpush_destination,
+            rpt->config.rrdpush_api_key,
+            rpt->config.rrdpush_send_charts_matching,
+            rpt->config.rrdpush_enable_replication,
+            rpt->config.rrdpush_seconds_to_replicate,
+            rpt->config.rrdpush_replication_step,
+            rpt->system_info,
+            0);
+
+        if(!host) {
+            rrdpush_receive_log_status(
+                    rpt,"failed to find/create host structure, rejecting connection",
+                    RRDPUSH_STATUS_INTERNAL_SERVER_ERROR, NDLP_ERR);
+
+            rrdpush_send_error_on_taken_over_connection(rpt, START_STREAMING_ERROR_INTERNAL_ERROR);
+            goto cleanup;
+        }
+
+        if (unlikely(rrdhost_flag_check(host, RRDHOST_FLAG_PENDING_CONTEXT_LOAD))) {
+            rrdpush_receive_log_status(
+                    rpt, "host is initializing, retry later",
+                    RRDPUSH_STATUS_INITIALIZATION_IN_PROGRESS, NDLP_NOTICE);
+
+            rrdpush_send_error_on_taken_over_connection(rpt, START_STREAMING_ERROR_INITIALIZATION);
+            goto cleanup;
+        }
+
+        // system_info has been consumed by the host structure
+        rpt->system_info = NULL;
+
+        if(!rrdhost_set_receiver(host, rpt)) {
+            rrdpush_receive_log_status(
+                    rpt, "host is already served by another receiver",
+                    RRDPUSH_STATUS_DUPLICATE_RECEIVER, NDLP_INFO);
+
+            rrdpush_send_error_on_taken_over_connection(rpt, START_STREAMING_ERROR_ALREADY_STREAMING);
+            goto cleanup;
+        }
+    }
+
+#ifdef NETDATA_INTERNAL_CHECKS
+    netdata_log_info("STREAM '%s' [receive from [%s]:%s]: "
+         "client willing to stream metrics for host '%s' with machine_guid '%s': "
+         "update every = %d, history = %d, memory mode = %s, health %s,%s"
+         , rpt->hostname
+         , rpt->client_ip
+         , rpt->client_port
+         , rrdhost_hostname(rpt->host)
+         , rpt->host->machine_guid
+         , rpt->host->rrd_update_every
+         , rpt->host->rrd_history_entries
+         , rrd_memory_mode_name(rpt->host->rrd_memory_mode)
+         , (rpt->config.health_enabled == CONFIG_BOOLEAN_NO)?"disabled":((rpt->config.health_enabled == CONFIG_BOOLEAN_YES)?"enabled":"auto")
+#ifdef ENABLE_HTTPS
+         , (rpt->ssl.conn != NULL) ? " SSL," : ""
+#else
+         , ""
+#endif
+    );
+#endif // NETDATA_INTERNAL_CHECKS
+
+
+    struct plugind cd = {
+            .update_every = default_rrd_update_every,
+            .unsafe = {
+                    .spinlock = NETDATA_SPINLOCK_INITIALIZER,
+                    .running = true,
+                    .enabled = true,
+            },
+            .started_t = now_realtime_sec(),
+    };
+
+    // put the client IP and port into the buffers used by plugins.d
+    snprintfz(cd.id, CONFIG_MAX_NAME, "%s:%s", rpt->client_ip, rpt->client_port);
+    snprintfz(cd.filename, FILENAME_MAX, "%s:%s", rpt->client_ip, rpt->client_port);
+    snprintfz(cd.fullfilename, FILENAME_MAX, "%s:%s", rpt->client_ip, rpt->client_port);
+    snprintfz(cd.cmd, PLUGINSD_CMD_MAX, "%s:%s", rpt->client_ip, rpt->client_port);
+
+    rrdpush_select_receiver_compression_algorithm(rpt);
+
+    {
+        // netdata_log_info("STREAM %s [receive from [%s]:%s]: initializing communication...", rrdhost_hostname(rpt->host), rpt->client_ip, rpt->client_port);
+        char initial_response[HTTP_HEADER_SIZE];
+        if (stream_has_capability(rpt, STREAM_CAP_VCAPS)) {
+            log_receiver_capabilities(rpt);
+            sprintf(initial_response, "%s%u", START_STREAMING_PROMPT_VN, rpt->capabilities);
+        }
+        else if (stream_has_capability(rpt, STREAM_CAP_VN)) {
+            log_receiver_capabilities(rpt);
+            sprintf(initial_response, "%s%d", START_STREAMING_PROMPT_VN, stream_capabilities_to_vn(rpt->capabilities));
+        }
+        else if (stream_has_capability(rpt, STREAM_CAP_V2)) {
+            log_receiver_capabilities(rpt);
+            sprintf(initial_response, "%s", START_STREAMING_PROMPT_V2);
+        }
+        else { // stream_has_capability(rpt, STREAM_CAP_V1)
+            log_receiver_capabilities(rpt);
+            sprintf(initial_response, "%s", START_STREAMING_PROMPT_V1);
+        }
+
+        netdata_log_debug(D_STREAM, "Initial response to %s: %s", rpt->client_ip, initial_response);
+#ifdef ENABLE_H2O
+        if (is_h2o_rrdpush(rpt)) {
+            h2o_stream_write(rpt->h2o_ctx, initial_response, strlen(initial_response));
+        } else {
+#endif
+            ssize_t bytes_sent = send_timeout(
+#ifdef ENABLE_HTTPS
+                    &rpt->ssl,
+#endif
+                    rpt->fd, initial_response, strlen(initial_response), 0, 60);
+
+            if(bytes_sent != (ssize_t)strlen(initial_response)) {
+                internal_error(true, "Cannot send response, got %zd bytes, expecting %zu bytes", bytes_sent, strlen(initial_response));
+                rrdpush_receive_log_status(
+                        rpt, "cannot reply back, dropping connection",
+                        RRDPUSH_STATUS_CANT_REPLY, NDLP_ERR);
+                goto cleanup;
+            }
+#ifdef ENABLE_H2O
+        }
+#endif
+    }
+
+#ifdef ENABLE_H2O
+    unless_h2o_rrdpush(rpt)
+#endif
+    {
+        // remove the non-blocking flag from the socket
+        if(sock_delnonblock(rpt->fd) < 0)
+            netdata_log_error("STREAM '%s' [receive from [%s]:%s]: "
+                  "cannot remove the non-blocking flag from socket %d"
+                  , rrdhost_hostname(rpt->host)
+                  , rpt->client_ip, rpt->client_port
+                  , rpt->fd);
+
+        struct timeval timeout;
+        timeout.tv_sec = 600;
+        timeout.tv_usec = 0;
+        if (unlikely(setsockopt(rpt->fd, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof timeout) != 0))
+            netdata_log_error("STREAM '%s' [receive from [%s]:%s]: "
+                  "cannot set timeout for socket %d"
+                  , rrdhost_hostname(rpt->host)
+                  , rpt->client_ip, rpt->client_port
+                  , rpt->fd);
+    }
+
+    rrdpush_receive_log_status(
+            rpt, "connected and ready to receive data",
+            RRDPUSH_STATUS_CONNECTED, NDLP_INFO);
+
+#ifdef ENABLE_ACLK
+    // in case we have cloud connection we inform cloud
+    // new child connected
+    if (netdata_cloud_enabled)
+        aclk_host_state_update(rpt->host, 1, 1);
+#endif
+
+    rrdhost_set_is_parent_label();
+
+    if (is_ephemeral)
+        rrdhost_option_set(rpt->host, RRDHOST_OPTION_EPHEMERAL_HOST);
+
+    // let it reconnect to parent immediately
+    rrdpush_reset_destinations_postpone_time(rpt->host);
+
+    size_t count = streaming_parser(rpt, &cd, rpt->fd,
+#ifdef ENABLE_HTTPS
+                                    (rpt->ssl.conn) ? &rpt->ssl : NULL
+#else
+                                    NULL
+#endif
+                                    );
+
+    receiver_set_exit_reason(rpt, STREAM_HANDSHAKE_DISCONNECT_PARSER_EXIT, false);
+
+    {
+        char msg[100 + 1];
+        snprintfz(msg, sizeof(msg) - 1, "disconnected (completed %zu updates)", count);
+        rrdpush_receive_log_status(
+                rpt, msg,
+                RRDPUSH_STATUS_DISCONNECTED, NDLP_WARNING);
+    }
+
+#ifdef ENABLE_ACLK
+    // in case we have cloud connection we inform cloud
+    // a child disconnected
+    if (netdata_cloud_enabled)
+        aclk_host_state_update(rpt->host, 0, 1);
+#endif
+
+cleanup:
+    ;
+}
+
+static void rrdpush_receiver_thread_cleanup(void *pptr) {
+    struct receiver_state *rpt = CLEANUP_FUNCTION_GET_PTR(pptr);
+    if(!rpt) return;
+
+    netdata_log_info("STREAM '%s' [receive from [%s]:%s]: "
+         "receive thread ended (task id %d)"
+         , rpt->hostname ? rpt->hostname : "-"
+         , rpt->client_ip ? rpt->client_ip : "-", rpt->client_port ? rpt->client_port : "-", gettid_cached());
+
+    worker_unregister();
+    rrdhost_clear_receiver(rpt);
+    receiver_state_free(rpt);
+    rrdhost_set_is_parent_label();
+}
+
+static bool stream_receiver_log_capabilities(BUFFER *wb, void *ptr) {
+    struct receiver_state *rpt = ptr;
+    if(!rpt)
+        return false;
+
+    stream_capabilities_to_string(wb, rpt->capabilities);
+    return true;
+}
+
+static bool stream_receiver_log_transport(BUFFER *wb, void *ptr) {
+    struct receiver_state *rpt = ptr;
+    if(!rpt)
+        return false;
+
+#ifdef ENABLE_HTTPS
+    buffer_strcat(wb, SSL_connection(&rpt->ssl) ? "https" : "http");
+#else
+    buffer_strcat(wb, "http");
+#endif
+    return true;
+}
+
+void *rrdpush_receiver_thread(void *ptr) {
+    CLEANUP_FUNCTION_REGISTER(rrdpush_receiver_thread_cleanup) cleanup_ptr = ptr;
+    worker_register("STREAMRCV");
+
+    worker_register_job_custom_metric(WORKER_RECEIVER_JOB_BYTES_READ,
+                                      "received bytes", "bytes/s",
+                                      WORKER_METRIC_INCREMENT);
+
+    worker_register_job_custom_metric(WORKER_RECEIVER_JOB_BYTES_UNCOMPRESSED,
+                                      "uncompressed bytes", "bytes/s",
+                                      WORKER_METRIC_INCREMENT);
+
+    worker_register_job_custom_metric(WORKER_RECEIVER_JOB_REPLICATION_COMPLETION,
+                                      "replication completion", "%",
+                                      WORKER_METRIC_ABSOLUTE);
+
+    struct receiver_state *rpt = (struct receiver_state *) ptr;
+    rpt->tid = gettid_cached();
+
+    ND_LOG_STACK lgs[] = {
+            ND_LOG_FIELD_TXT(NDF_SRC_IP, rpt->client_ip),
+            ND_LOG_FIELD_TXT(NDF_SRC_PORT, rpt->client_port),
+            ND_LOG_FIELD_TXT(NDF_NIDL_NODE, rpt->hostname),
+            ND_LOG_FIELD_CB(NDF_SRC_TRANSPORT, stream_receiver_log_transport, rpt),
+            ND_LOG_FIELD_CB(NDF_SRC_CAPABILITIES, stream_receiver_log_capabilities, rpt),
+            ND_LOG_FIELD_END(),
+    };
+    ND_LOG_STACK_PUSH(lgs);
+
+    netdata_log_info("STREAM %s [%s]:%s: receive thread started", rpt->hostname, rpt->client_ip
+                     , rpt->client_port);
+
+    rrdpush_receive(rpt);
+    return NULL;
+}
diff --git a/src/streaming/replication.c b/src/streaming/replication.c
new file mode 100644
index 000000000..1f5aeb34c
--- /dev/null
+++ b/src/streaming/replication.c
@@ -0,0 +1,2032 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include "replication.h"
+#include "Judy.h"
+
+#define STREAMING_START_MAX_SENDER_BUFFER_PERCENTAGE_ALLOWED 50ULL
+#define MAX_REPLICATION_MESSAGE_PERCENT_SENDER_BUFFER 25ULL
+#define MAX_SENDER_BUFFER_PERCENTAGE_ALLOWED 50ULL
+#define MIN_SENDER_BUFFER_PERCENTAGE_ALLOWED 10ULL
+
+#define WORKER_JOB_FIND_NEXT 1
+#define WORKER_JOB_QUERYING 2
+#define WORKER_JOB_DELETE_ENTRY 3
+#define WORKER_JOB_FIND_CHART 4
+#define WORKER_JOB_PREPARE_QUERY 5
+#define WORKER_JOB_CHECK_CONSISTENCY 6
+#define WORKER_JOB_BUFFER_COMMIT 7
+#define WORKER_JOB_CLEANUP 8
+#define WORKER_JOB_WAIT 9
+
+// master thread worker jobs
+#define WORKER_JOB_STATISTICS 10
+#define WORKER_JOB_CUSTOM_METRIC_PENDING_REQUESTS 11
+#define WORKER_JOB_CUSTOM_METRIC_SKIPPED_NO_ROOM 12
+#define WORKER_JOB_CUSTOM_METRIC_COMPLETION 13
+#define WORKER_JOB_CUSTOM_METRIC_ADDED 14
+#define WORKER_JOB_CUSTOM_METRIC_DONE 15
+#define WORKER_JOB_CUSTOM_METRIC_SENDER_RESETS 16
+#define WORKER_JOB_CUSTOM_METRIC_SENDER_FULL 17
+
+#define ITERATIONS_IDLE_WITHOUT_PENDING_TO_RUN_SENDER_VERIFICATION 30
+#define SECONDS_TO_RESET_POINT_IN_TIME 10
+
+static struct replication_query_statistics replication_queries = {
+        .spinlock = NETDATA_SPINLOCK_INITIALIZER,
+        .queries_started = 0,
+        .queries_finished = 0,
+        .points_read = 0,
+        .points_generated = 0,
+};
+
+struct replication_query_statistics replication_get_query_statistics(void) {
+    spinlock_lock(&replication_queries.spinlock);
+    struct replication_query_statistics ret = replication_queries;
+    spinlock_unlock(&replication_queries.spinlock);
+    return ret;
+}
+
+size_t replication_buffers_allocated = 0;
+
+size_t replication_allocated_buffers(void) {
+    return __atomic_load_n(&replication_buffers_allocated, __ATOMIC_RELAXED);
+}
+
+// ----------------------------------------------------------------------------
+// sending replication replies
+
+struct replication_dimension {
+    STORAGE_POINT sp;
+    struct storage_engine_query_handle handle;
+    bool enabled;
+    bool skip;
+
+    DICTIONARY *dict;
+    const DICTIONARY_ITEM *rda;
+    RRDDIM *rd;
+};
+
+struct replication_query {
+    RRDSET *st;
+
+    struct {
+        time_t first_entry_t;
+        time_t last_entry_t;
+    } db;
+
+    struct { // what the parent requested
+        time_t after;
+        time_t before;
+        bool enable_streaming;
+    } request;
+
+    struct { // what the child will do
+        time_t after;
+        time_t before;
+        bool enable_streaming;
+
+        bool locked_data_collection;
+        bool execute;
+        bool interrupted;
+        STREAM_CAPABILITIES capabilities;
+    } query;
+
+    time_t wall_clock_time;
+
+    size_t points_read;
+    size_t points_generated;
+
+    STORAGE_ENGINE_BACKEND backend;
+    struct replication_request *rq;
+
+    size_t dimensions;
+    struct replication_dimension data[];
+};
+
+static struct replication_query *replication_query_prepare(
+        RRDSET *st,
+        time_t db_first_entry,
+        time_t db_last_entry,
+        time_t requested_after,
+        time_t requested_before,
+        bool requested_enable_streaming,
+        time_t query_after,
+        time_t query_before,
+        bool query_enable_streaming,
+        time_t wall_clock_time,
+        STREAM_CAPABILITIES capabilities
+) {
+    size_t dimensions = rrdset_number_of_dimensions(st);
+    struct replication_query *q = callocz(1, sizeof(struct replication_query) + dimensions * sizeof(struct replication_dimension));
+    __atomic_add_fetch(&replication_buffers_allocated, sizeof(struct replication_query) + dimensions * sizeof(struct replication_dimension), __ATOMIC_RELAXED);
+
+    q->dimensions = dimensions;
+    q->st = st;
+
+    q->db.first_entry_t = db_first_entry;
+    q->db.last_entry_t = db_last_entry;
+
+    q->request.after = requested_after,
+    q->request.before = requested_before,
+    q->request.enable_streaming = requested_enable_streaming,
+
+    q->query.after = query_after;
+    q->query.before = query_before;
+    q->query.enable_streaming = query_enable_streaming;
+    q->query.capabilities = capabilities;
+
+    q->wall_clock_time = wall_clock_time;
+
+    if (!q->dimensions || !q->query.after || !q->query.before) {
+        q->query.execute = false;
+        q->dimensions = 0;
+        return q;
+    }
+
+    if(q->query.enable_streaming) {
+        spinlock_lock(&st->data_collection_lock);
+        q->query.locked_data_collection = true;
+
+        if (st->last_updated.tv_sec > q->query.before) {
+#ifdef NETDATA_LOG_REPLICATION_REQUESTS
+            internal_error(true,
+                           "STREAM_SENDER REPLAY: 'host:%s/chart:%s' "
+                           "has start_streaming = true, "
+                           "adjusting replication before timestamp from %llu to %llu",
+                           rrdhost_hostname(st->rrdhost), rrdset_id(st),
+                           (unsigned long long) q->query.before,
+                           (unsigned long long) st->last_updated.tv_sec
+            );
+#endif
+            q->query.before = MIN(st->last_updated.tv_sec, wall_clock_time);
+        }
+    }
+
+    q->backend = st->rrdhost->db[0].eng->seb;
+
+    // prepare our array of dimensions
+    size_t count = 0;
+    RRDDIM *rd;
+    rrddim_foreach_read(rd, st) {
+        if (unlikely(!rd || !rd_dfe.item || !rrddim_check_upstream_exposed(rd)))
+            continue;
+
+        if (unlikely(rd_dfe.counter >= q->dimensions)) {
+            internal_error(true,
+                           "STREAM_SENDER REPLAY ERROR: 'host:%s/chart:%s' has more dimensions than the replicated ones",
+                           rrdhost_hostname(st->rrdhost), rrdset_id(st));
+            break;
+        }
+
+        struct replication_dimension *d = &q->data[rd_dfe.counter];
+
+        d->dict = rd_dfe.dict;
+        d->rda = dictionary_acquired_item_dup(rd_dfe.dict, rd_dfe.item);
+        d->rd = rd;
+
+        storage_engine_query_init(q->backend, rd->tiers[0].smh, &d->handle, q->query.after, q->query.before,
+                                  q->query.locked_data_collection ? STORAGE_PRIORITY_HIGH : STORAGE_PRIORITY_LOW);
+        d->enabled = true;
+        d->skip = false;
+        count++;
+    }
+    rrddim_foreach_done(rd);
+
+    if(!count) {
+        // no data for this chart
+
+        q->query.execute = false;
+
+        if(q->query.locked_data_collection) {
+            spinlock_unlock(&st->data_collection_lock);
+            q->query.locked_data_collection = false;
+        }
+
+    }
+    else {
+        // we have data for this chart
+
+        q->query.execute = true;
+    }
+
+    return q;
+}
+
+static void replication_send_chart_collection_state(BUFFER *wb, RRDSET *st, STREAM_CAPABILITIES capabilities) {
+    bool with_slots = (capabilities & STREAM_CAP_SLOTS) ? true : false;
+    NUMBER_ENCODING integer_encoding = (capabilities & STREAM_CAP_IEEE754) ? NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_DECIMAL;
+    RRDDIM *rd;
+    rrddim_foreach_read(rd, st){
+        if (!rrddim_check_upstream_exposed(rd)) continue;
+
+        buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_RRDDIM_STATE, sizeof(PLUGINSD_KEYWORD_REPLAY_RRDDIM_STATE) - 1);
+
+        if(with_slots) {
+            buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2);
+            buffer_print_uint64_encoded(wb, integer_encoding, rd->rrdpush.sender.dim_slot);
+        }
+
+        buffer_fast_strcat(wb, " '", 2);
+        buffer_fast_strcat(wb, rrddim_id(rd), string_strlen(rd->id));
+        buffer_fast_strcat(wb, "' ", 2);
+        buffer_print_uint64_encoded(wb, integer_encoding, (usec_t) rd->collector.last_collected_time.tv_sec * USEC_PER_SEC +
+                                                          (usec_t) rd->collector.last_collected_time.tv_usec);
+        buffer_fast_strcat(wb, " ", 1);
+        buffer_print_int64_encoded(wb, integer_encoding, rd->collector.last_collected_value);
+        buffer_fast_strcat(wb, " ", 1);
+        buffer_print_netdata_double_encoded(wb, integer_encoding, rd->collector.last_calculated_value);
+        buffer_fast_strcat(wb, " ", 1);
+        buffer_print_netdata_double_encoded(wb, integer_encoding, rd->collector.last_stored_value);
+        buffer_fast_strcat(wb, "\n", 1);
+    }
+    rrddim_foreach_done(rd);
+
+    buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_RRDSET_STATE " ", sizeof(PLUGINSD_KEYWORD_REPLAY_RRDSET_STATE) - 1 + 1);
+    buffer_print_uint64_encoded(wb, integer_encoding, (usec_t) st->last_collected_time.tv_sec * USEC_PER_SEC + (usec_t) st->last_collected_time.tv_usec);
+    buffer_fast_strcat(wb, " ", 1);
+    buffer_print_uint64_encoded(wb, integer_encoding, (usec_t) st->last_updated.tv_sec * USEC_PER_SEC + (usec_t) st->last_updated.tv_usec);
+    buffer_fast_strcat(wb, "\n", 1);
+}
+
+static void replication_query_finalize(BUFFER *wb, struct replication_query *q, bool executed) {
+    size_t dimensions = q->dimensions;
+
+    if(wb && q->query.enable_streaming)
+        replication_send_chart_collection_state(wb, q->st, q->query.capabilities);
+
+    if(q->query.locked_data_collection) {
+        spinlock_unlock(&q->st->data_collection_lock);
+        q->query.locked_data_collection = false;
+    }
+
+    // release all the dictionary items acquired
+    // finalize the queries
+    size_t queries = 0;
+
+    for (size_t i = 0; i < dimensions; i++) {
+        struct replication_dimension *d = &q->data[i];
+        if (unlikely(!d->enabled)) continue;
+
+        storage_engine_query_finalize(&d->handle);
+
+        dictionary_acquired_item_release(d->dict, d->rda);
+
+        // update global statistics
+        queries++;
+    }
+
+    if(executed) {
+        spinlock_lock(&replication_queries.spinlock);
+        replication_queries.queries_started += queries;
+        replication_queries.queries_finished += queries;
+        replication_queries.points_read += q->points_read;
+        replication_queries.points_generated += q->points_generated;
+
+        if(q->st && q->st->rrdhost->sender) {
+            struct sender_state *s = q->st->rrdhost->sender;
+            s->replication.latest_completed_before_t = q->query.before;
+        }
+
+        spinlock_unlock(&replication_queries.spinlock);
+    }
+
+    __atomic_sub_fetch(&replication_buffers_allocated, sizeof(struct replication_query) + dimensions * sizeof(struct replication_dimension), __ATOMIC_RELAXED);
+    freez(q);
+}
+
+static void replication_query_align_to_optimal_before(struct replication_query *q) {
+    if(!q->query.execute || q->query.enable_streaming)
+        return;
+
+    size_t dimensions = q->dimensions;
+    time_t expanded_before = 0;
+
+    for (size_t i = 0; i < dimensions; i++) {
+        struct replication_dimension *d = &q->data[i];
+        if(unlikely(!d->enabled)) continue;
+
+        time_t new_before = storage_engine_align_to_optimal_before(&d->handle);
+        if (!expanded_before || new_before < expanded_before)
+            expanded_before = new_before;
+    }
+
+    if(expanded_before > q->query.before && // it is later than the original
+        (expanded_before - q->query.before) / q->st->update_every < 1024 && // it is reasonable (up to a page)
+        expanded_before < q->st->last_updated.tv_sec && // it is not the chart's last updated time
+        expanded_before < q->wall_clock_time) // it is not later than the wall clock time
+        q->query.before = expanded_before;
+}
+
+static bool replication_query_execute(BUFFER *wb, struct replication_query *q, size_t max_msg_size) {
+    replication_query_align_to_optimal_before(q);
+
+    bool with_slots = (q->query.capabilities & STREAM_CAP_SLOTS) ? true : false;
+    NUMBER_ENCODING integer_encoding = (q->query.capabilities & STREAM_CAP_IEEE754) ? NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_DECIMAL;
+    time_t after = q->query.after;
+    time_t before = q->query.before;
+    size_t dimensions = q->dimensions;
+    time_t wall_clock_time = q->wall_clock_time;
+
+    bool finished_with_gap = false;
+    size_t points_read = 0, points_generated = 0;
+
+#ifdef NETDATA_LOG_REPLICATION_REQUESTS
+    time_t actual_after = 0, actual_before = 0;
+#endif
+
+    time_t now = after + 1;
+    time_t last_end_time_in_buffer = 0;
+    while(now <= before) {
+        time_t min_start_time = 0, max_start_time = 0, min_end_time = 0, max_end_time = 0, min_update_every = 0, max_update_every = 0;
+        for (size_t i = 0; i < dimensions ;i++) {
+            struct replication_dimension *d = &q->data[i];
+            if(unlikely(!d->enabled || d->skip)) continue;
+
+            // fetch the first valid point for the dimension
+            int max_skip = 1000;
+            while(d->sp.end_time_s < now && !storage_engine_query_is_finished(&d->handle) && max_skip-- >= 0) {
+                d->sp = storage_engine_query_next_metric(&d->handle);
+                points_read++;
+            }
+
+            if(max_skip <= 0) {
+                d->skip = true;
+
+                nd_log_limit_static_global_var(erl, 1, 0);
+                nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR,
+                             "STREAM_SENDER REPLAY ERROR: 'host:%s/chart:%s/dim:%s': db does not advance the query "
+                             "beyond time %llu (tried 1000 times to get the next point and always got back a point in the past)",
+                             rrdhost_hostname(q->st->rrdhost), rrdset_id(q->st), rrddim_id(d->rd),
+                             (unsigned long long) now);
+
+                continue;
+            }
+
+            if(unlikely(d->sp.end_time_s < now || d->sp.end_time_s < d->sp.start_time_s))
+                // this dimension does not provide any data
+                continue;
+
+            time_t update_every = d->sp.end_time_s - d->sp.start_time_s;
+            if(unlikely(!update_every))
+                update_every = q->st->update_every;
+
+            if(unlikely(!min_update_every))
+                min_update_every = update_every;
+
+            if(unlikely(!min_start_time))
+                min_start_time = d->sp.start_time_s;
+
+            if(unlikely(!min_end_time))
+                min_end_time = d->sp.end_time_s;
+
+            min_update_every = MIN(min_update_every, update_every);
+            max_update_every = MAX(max_update_every, update_every);
+
+            min_start_time = MIN(min_start_time, d->sp.start_time_s);
+            max_start_time = MAX(max_start_time, d->sp.start_time_s);
+
+            min_end_time = MIN(min_end_time, d->sp.end_time_s);
+            max_end_time = MAX(max_end_time, d->sp.end_time_s);
+        }
+
+        if (unlikely(min_update_every != max_update_every ||
+                     min_start_time != max_start_time)) {
+
+            time_t fix_min_start_time;
+            if(last_end_time_in_buffer &&
+                last_end_time_in_buffer >= min_start_time &&
+                last_end_time_in_buffer <= max_start_time) {
+                fix_min_start_time = last_end_time_in_buffer;
+            }
+            else
+                fix_min_start_time = min_end_time - min_update_every;
+
+#ifdef NETDATA_INTERNAL_CHECKS
+            nd_log_limit_static_global_var(erl, 1, 0);
+            nd_log_limit(&erl, NDLS_DAEMON, NDLP_WARNING,
+                         "REPLAY WARNING: 'host:%s/chart:%s' "
+                         "misaligned dimensions, "
+                         "update every (min: %ld, max: %ld), "
+                         "start time (min: %ld, max: %ld), "
+                         "end time (min %ld, max %ld), "
+                         "now %ld, last end time sent %ld, "
+                         "min start time is fixed to %ld",
+                         rrdhost_hostname(q->st->rrdhost), rrdset_id(q->st),
+                         min_update_every, max_update_every,
+                         min_start_time, max_start_time,
+                         min_end_time, max_end_time,
+                         now, last_end_time_in_buffer,
+                         fix_min_start_time
+                         );
+#endif
+
+            min_start_time = fix_min_start_time;
+        }
+
+        if(likely(min_start_time <= now && min_end_time >= now)) {
+            // we have a valid point
+
+            if (unlikely(min_end_time == min_start_time))
+                min_start_time = min_end_time - q->st->update_every;
+
+#ifdef NETDATA_LOG_REPLICATION_REQUESTS
+            if (unlikely(!actual_after))
+                actual_after = min_end_time;
+
+            actual_before = min_end_time;
+#endif
+
+            if(buffer_strlen(wb) > max_msg_size && last_end_time_in_buffer) {
+                q->query.before = last_end_time_in_buffer;
+                q->query.enable_streaming = false;
+
+                internal_error(true, "REPLICATION: current buffer size %zu is more than the "
+                                     "max message size %zu for chart '%s' of host '%s'. "
+                                     "Interrupting replication request (%ld to %ld, %s) at %ld to %ld, %s.",
+                               buffer_strlen(wb), max_msg_size, rrdset_id(q->st), rrdhost_hostname(q->st->rrdhost),
+                               q->request.after, q->request.before, q->request.enable_streaming?"true":"false",
+                               q->query.after, q->query.before, q->query.enable_streaming?"true":"false");
+
+                q->query.interrupted = true;
+
+                break;
+            }
+            last_end_time_in_buffer = min_end_time;
+
+            buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_BEGIN, sizeof(PLUGINSD_KEYWORD_REPLAY_BEGIN) - 1);
+
+            if(with_slots) {
+                buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2);
+                buffer_print_uint64_encoded(wb, integer_encoding, q->st->rrdpush.sender.chart_slot);
+            }
+
+            buffer_fast_strcat(wb, " '' ", 4);
+            buffer_print_uint64_encoded(wb, integer_encoding, min_start_time);
+            buffer_fast_strcat(wb, " ", 1);
+            buffer_print_uint64_encoded(wb, integer_encoding, min_end_time);
+            buffer_fast_strcat(wb, " ", 1);
+            buffer_print_uint64_encoded(wb, integer_encoding, wall_clock_time);
+            buffer_fast_strcat(wb, "\n", 1);
+
+            // output the replay values for this time
+            for (size_t i = 0; i < dimensions; i++) {
+                struct replication_dimension *d = &q->data[i];
+                if (unlikely(!d->enabled)) continue;
+
+                if (likely( d->sp.start_time_s <= min_end_time &&
+                            d->sp.end_time_s >= min_end_time &&
+                            !storage_point_is_unset(d->sp) &&
+                            !storage_point_is_gap(d->sp))) {
+
+                    buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_SET, sizeof(PLUGINSD_KEYWORD_REPLAY_SET) - 1);
+
+                    if(with_slots) {
+                        buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT)
- 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, d->rd->rrdpush.sender.dim_slot); + } + + buffer_fast_strcat(wb, " \"", 2); + buffer_fast_strcat(wb, rrddim_id(d->rd), string_strlen(d->rd->id)); + buffer_fast_strcat(wb, "\" ", 2); + buffer_print_netdata_double_encoded(wb, integer_encoding, d->sp.sum); + buffer_fast_strcat(wb, " ", 1); + buffer_print_sn_flags(wb, d->sp.flags, q->query.capabilities & STREAM_CAP_INTERPOLATED); + buffer_fast_strcat(wb, "\n", 1); + + points_generated++; + } + } + + now = min_end_time + 1; + } + else if(unlikely(min_end_time < now)) + // the query does not progress + break; + else { + // we have gap - all points are in the future + now = min_start_time; + + if(min_start_time > before && !points_generated) { + before = q->query.before = min_start_time - 1; + finished_with_gap = true; + break; + } + } + } + +#ifdef NETDATA_LOG_REPLICATION_REQUESTS + if(actual_after) { + char actual_after_buf[LOG_DATE_LENGTH + 1], actual_before_buf[LOG_DATE_LENGTH + 1]; + log_date(actual_after_buf, LOG_DATE_LENGTH, actual_after); + log_date(actual_before_buf, LOG_DATE_LENGTH, actual_before); + internal_error(true, + "STREAM_SENDER REPLAY: 'host:%s/chart:%s': sending data %llu [%s] to %llu [%s] (requested %llu [delta %lld] to %llu [delta %lld])", + rrdhost_hostname(q->st->rrdhost), rrdset_id(q->st), + (unsigned long long)actual_after, actual_after_buf, (unsigned long long)actual_before, actual_before_buf, + (unsigned long long)after, (long long)(actual_after - after), (unsigned long long)before, (long long)(actual_before - before)); + } + else + internal_error(true, + "STREAM_SENDER REPLAY: 'host:%s/chart:%s': nothing to send (requested %llu to %llu)", + rrdhost_hostname(q->st->rrdhost), rrdset_id(q->st), + (unsigned long long)after, (unsigned long long)before); +#endif // NETDATA_LOG_REPLICATION_REQUESTS + + q->points_read += points_read; + q->points_generated += points_generated; + + if(last_end_time_in_buffer < before - q->st->update_every) + 
finished_with_gap = true; + + return finished_with_gap; +} + +static struct replication_query *replication_response_prepare( + RRDSET *st, + bool requested_enable_streaming, + time_t requested_after, + time_t requested_before, + STREAM_CAPABILITIES capabilities + ) { + time_t wall_clock_time = now_realtime_sec(); + + if(requested_after > requested_before) { + // flip them + time_t t = requested_before; + requested_before = requested_after; + requested_after = t; + } + + if(requested_after > wall_clock_time) { + requested_after = 0; + requested_before = 0; + requested_enable_streaming = true; + } + + if(requested_before > wall_clock_time) { + requested_before = wall_clock_time; + requested_enable_streaming = true; + } + + time_t query_after = requested_after; + time_t query_before = requested_before; + bool query_enable_streaming = requested_enable_streaming; + + time_t db_first_entry = 0, db_last_entry = 0; + rrdset_get_retention_of_tier_for_collected_chart( + st, &db_first_entry, &db_last_entry, wall_clock_time, 0); + + if(requested_after == 0 && requested_before == 0 && requested_enable_streaming == true) { + // no data requested - just enable streaming + ; + } + else { + if (query_after < db_first_entry) + query_after = db_first_entry; + + if (query_before > db_last_entry) + query_before = db_last_entry; + + // if the parent asked us to start streaming, then fill the rest with the data that we have + if (requested_enable_streaming) + query_before = db_last_entry; + + if (query_after > query_before) { + time_t tmp = query_before; + query_before = query_after; + query_after = tmp; + } + + query_enable_streaming = (requested_enable_streaming || + query_before == db_last_entry || + !requested_after || + !requested_before) ? 
true : false; + } + + return replication_query_prepare( + st, + db_first_entry, db_last_entry, + requested_after, requested_before, requested_enable_streaming, + query_after, query_before, query_enable_streaming, + wall_clock_time, capabilities); +} + +void replication_response_cancel_and_finalize(struct replication_query *q) { + replication_query_finalize(NULL, q, false); +} + +static bool sender_is_still_connected_for_this_request(struct replication_request *rq); + +bool replication_response_execute_and_finalize(struct replication_query *q, size_t max_msg_size) { + bool with_slots = (q->query.capabilities & STREAM_CAP_SLOTS) ? true : false; + NUMBER_ENCODING integer_encoding = (q->query.capabilities & STREAM_CAP_IEEE754) ? NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_DECIMAL; + struct replication_request *rq = q->rq; + RRDSET *st = q->st; + RRDHOST *host = st->rrdhost; + + // we might want to optimize this by filling a temporary buffer + // and copying the result to the host's buffer in order to avoid + // holding the host's buffer lock for too long + BUFFER *wb = sender_start(host->sender); + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_BEGIN, sizeof(PLUGINSD_KEYWORD_REPLAY_BEGIN) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, q->st->rrdpush.sender.chart_slot); + } + + buffer_fast_strcat(wb, " '", 2); + buffer_fast_strcat(wb, rrdset_id(st), string_strlen(st->id)); + buffer_fast_strcat(wb, "'\n", 2); + + bool locked_data_collection = q->query.locked_data_collection; + q->query.locked_data_collection = false; + + bool finished_with_gap = false; + if(q->query.execute) + finished_with_gap = replication_query_execute(wb, q, max_msg_size); + + time_t after = q->request.after; + time_t before = q->query.before; + bool enable_streaming = q->query.enable_streaming; + + replication_query_finalize(wb, q, q->query.execute); + q = NULL; // IMPORTANT: 
q is invalid now + + // get a fresh retention to send to the parent + time_t wall_clock_time = now_realtime_sec(); + time_t db_first_entry, db_last_entry; + rrdset_get_retention_of_tier_for_collected_chart(st, &db_first_entry, &db_last_entry, wall_clock_time, 0); + + // end with first/last entries we have, and the first start time and + // last end time of the data we sent + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_REPLAY_END " ", sizeof(PLUGINSD_KEYWORD_REPLAY_END) - 1 + 1); + buffer_print_int64_encoded(wb, integer_encoding, st->update_every); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, db_first_entry); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, db_last_entry); + + buffer_fast_strcat(wb, enable_streaming ? " true " : " false ", 7); + + buffer_print_uint64_encoded(wb, integer_encoding, after); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, before); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, wall_clock_time); + buffer_fast_strcat(wb, "\n", 1); + + worker_is_busy(WORKER_JOB_BUFFER_COMMIT); + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_REPLICATION); + worker_is_busy(WORKER_JOB_CLEANUP); + + if(enable_streaming) { + if(sender_is_still_connected_for_this_request(rq)) { + // enable normal streaming if we have to + // but only if the sender buffer has not been flushed since we started + + if(rrdset_flag_check(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS)) { + rrdset_flag_clear(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS); + rrdset_flag_set(st, RRDSET_FLAG_SENDER_REPLICATION_FINISHED); + rrdhost_sender_replicating_charts_minus_one(st->rrdhost); + + if(!finished_with_gap) + st->rrdpush.sender.resync_time_s = 0; + +#ifdef NETDATA_LOG_REPLICATION_REQUESTS + internal_error(true, "STREAM_SENDER REPLAY: 'host:%s/chart:%s' streaming starts", + rrdhost_hostname(st->rrdhost), rrdset_id(st)); +#endif + } 
+ else + internal_error(true, "REPLAY ERROR: 'host:%s/chart:%s' received start streaming command, but the chart is not in progress replicating", + rrdhost_hostname(st->rrdhost), rrdset_id(st)); + } + } + + if(locked_data_collection) + spinlock_unlock(&st->data_collection_lock); + + return enable_streaming; +} + +// ---------------------------------------------------------------------------- +// sending replication requests + +struct replication_request_details { + struct { + send_command callback; + void *data; + } caller; + + RRDHOST *host; + RRDSET *st; + + struct { + time_t first_entry_t; // the first entry time the child has + time_t last_entry_t; // the last entry time the child has + time_t wall_clock_time; // the current time of the child + bool fixed_last_entry; // true when we clamped the last entry to the child's wall clock time + } child_db; + + struct { + time_t first_entry_t; // the first entry time we have + time_t last_entry_t; // the last entry time we have + time_t wall_clock_time; // the current local wall clock time + } local_db; + + struct { + time_t from; // the starting time of the entire gap we have + time_t to; // the ending time of the entire gap we have + } gap; + + struct { + time_t after; // the start time we requested previously from this child + time_t before; // the end time we requested previously from this child + } last_request; + + struct { + time_t after; // the start time of this replication request - the child will add 1 second + time_t before; // the end time of this replication request + bool start_streaming; // true when we want the child to send anything remaining and start streaming - the child will overwrite 'before' + } wanted; +}; + +static void replicate_log_request(struct replication_request_details *r, const char *msg) { +#ifdef NETDATA_INTERNAL_CHECKS + internal_error(true, +#else + nd_log_limit_static_global_var(erl, 1, 0); + nd_log_limit(&erl, NDLS_DAEMON, NDLP_ERR, +#endif + "REPLAY ERROR: 'host:%s/chart:%s' child sent: " + "db 
from %ld to %ld%s, wall clock time %ld, " + "last request from %ld to %ld, " + "issue: %s - " + "sending replication request from %ld to %ld, start streaming %s", + rrdhost_hostname(r->st->rrdhost), rrdset_id(r->st), + r->child_db.first_entry_t, + r->child_db.last_entry_t, r->child_db.fixed_last_entry ? " (fixed)" : "", + r->child_db.wall_clock_time, + r->last_request.after, + r->last_request.before, + msg, + r->wanted.after, + r->wanted.before, + r->wanted.start_streaming ? "true" : "false"); +} + +static bool send_replay_chart_cmd(struct replication_request_details *r, const char *msg, bool log) { + RRDSET *st = r->st; + + if(log) + replicate_log_request(r, msg); + + if(st->rrdhost->receiver && (!st->rrdhost->receiver->replication_first_time_t || r->wanted.after < st->rrdhost->receiver->replication_first_time_t)) + st->rrdhost->receiver->replication_first_time_t = r->wanted.after; + +#ifdef NETDATA_LOG_REPLICATION_REQUESTS + st->replay.log_next_data_collection = true; + + char wanted_after_buf[LOG_DATE_LENGTH + 1] = "", wanted_before_buf[LOG_DATE_LENGTH + 1] = ""; + + if(r->wanted.after) + log_date(wanted_after_buf, LOG_DATE_LENGTH, r->wanted.after); + + if(r->wanted.before) + log_date(wanted_before_buf, LOG_DATE_LENGTH, r->wanted.before); + + internal_error(true, + "REPLAY: 'host:%s/chart:%s' sending replication request %ld [%s] to %ld [%s], start streaming '%s': %s: " + "last[%ld - %ld] child[%ld - %ld, now %ld %s] local[%ld - %ld, now %ld] gap[%ld - %ld %s] %s" + , rrdhost_hostname(r->host), rrdset_id(r->st) + , r->wanted.after, wanted_after_buf + , r->wanted.before, wanted_before_buf + , r->wanted.start_streaming ? "YES" : "NO" + , msg + , r->last_request.after, r->last_request.before + , r->child_db.first_entry_t, r->child_db.last_entry_t + , r->child_db.wall_clock_time, (r->child_db.wall_clock_time == r->local_db.wall_clock_time) ? "SAME" : (r->child_db.wall_clock_time < r->local_db.wall_clock_time) ? 
"BEHIND" : "AHEAD" + , r->local_db.first_entry_t, r->local_db.last_entry_t + , r->local_db.wall_clock_time + , r->gap.from, r->gap.to + , (r->gap.from == r->wanted.after) ? "FULL" : "PARTIAL" + , (st->replay.after != 0 || st->replay.before != 0) ? "OVERLAPPING" : "" + ); + + st->replay.start_streaming = r->wanted.start_streaming; + st->replay.after = r->wanted.after; + st->replay.before = r->wanted.before; +#endif // NETDATA_LOG_REPLICATION_REQUESTS + + char buffer[2048 + 1]; + snprintfz(buffer, sizeof(buffer) - 1, PLUGINSD_KEYWORD_REPLAY_CHART " \"%s\" \"%s\" %llu %llu\n", + rrdset_id(st), r->wanted.start_streaming ? "true" : "false", + (unsigned long long)r->wanted.after, (unsigned long long)r->wanted.before); + + ssize_t ret = r->caller.callback(buffer, r->caller.data); + if (ret < 0) { + netdata_log_error("REPLAY ERROR: 'host:%s/chart:%s' failed to send replication request to child (error %zd)", + rrdhost_hostname(r->host), rrdset_id(r->st), ret); + return false; + } + + return true; +} + +bool replicate_chart_request(send_command callback, void *callback_data, RRDHOST *host, RRDSET *st, + time_t child_first_entry, time_t child_last_entry, time_t child_wall_clock_time, + time_t prev_first_entry_wanted, time_t prev_last_entry_wanted) +{ + struct replication_request_details r = { + .caller = { + .callback = callback, + .data = callback_data, + }, + + .host = host, + .st = st, + + .child_db = { + .first_entry_t = child_first_entry, + .last_entry_t = child_last_entry, + .wall_clock_time = child_wall_clock_time, + .fixed_last_entry = false, + }, + + .local_db = { + .first_entry_t = 0, + .last_entry_t = 0, + .wall_clock_time = now_realtime_sec(), + }, + + .last_request = { + .after = prev_first_entry_wanted, + .before = prev_last_entry_wanted, + }, + + .wanted = { + .after = 0, + .before = 0, + .start_streaming = true, + }, + }; + + if(r.child_db.last_entry_t > r.child_db.wall_clock_time) { + replicate_log_request(&r, "child's db last entry > child's wall clock 
time"); + r.child_db.last_entry_t = r.child_db.wall_clock_time; + r.child_db.fixed_last_entry = true; + } + + rrdset_get_retention_of_tier_for_collected_chart(r.st, &r.local_db.first_entry_t, &r.local_db.last_entry_t, r.local_db.wall_clock_time, 0); + + // let's find the GAP we have + if(!r.last_request.after || !r.last_request.before) { + // there is no previous request + + if(r.local_db.last_entry_t) + // we have some data, let's continue from the last point we have + r.gap.from = r.local_db.last_entry_t; + else + // we don't have any data, the gap is the max timeframe we are allowed to replicate + r.gap.from = r.local_db.wall_clock_time - r.host->rrdpush_seconds_to_replicate; + + } + else { + // we had sent a request - let's continue at the point we left it + // for this we don't take into account the actual data in our db + // because the child may also have gaps, and we need to get over it + r.gap.from = r.last_request.before; + } + + // we want all the data up to now + r.gap.to = r.local_db.wall_clock_time; + + // The gap is now r.gap.from -> r.gap.to + + if (unlikely(!rrdhost_option_check(host, RRDHOST_OPTION_REPLICATION))) + return send_replay_chart_cmd(&r, "empty replication request, replication is disabled", false); + + if (unlikely(!rrdset_number_of_dimensions(st))) + return send_replay_chart_cmd(&r, "empty replication request, chart has no dimensions", false); + + if (unlikely(!r.child_db.first_entry_t || !r.child_db.last_entry_t)) + return send_replay_chart_cmd(&r, "empty replication request, child has no stored data", false); + + if (unlikely(r.child_db.first_entry_t < 0 || r.child_db.last_entry_t < 0)) + return send_replay_chart_cmd(&r, "empty replication request, child db timestamps are invalid", true); + + if (unlikely(r.child_db.first_entry_t > r.child_db.wall_clock_time)) + return send_replay_chart_cmd(&r, "empty replication request, child db first entry is after its wall clock time", true); + + if (unlikely(r.child_db.first_entry_t > 
r.child_db.last_entry_t)) + return send_replay_chart_cmd(&r, "empty replication request, child timings are invalid (first entry > last entry)", true); + + if (unlikely(r.local_db.last_entry_t > r.child_db.last_entry_t)) + return send_replay_chart_cmd(&r, "empty replication request, local last entry is later than the child one", false); + + // let's find what the child can provide to fill that gap + + if(r.child_db.first_entry_t > r.gap.from) + // the child does not have all the data - let's get what it has + r.wanted.after = r.child_db.first_entry_t; + else + // ok, the child can fill the entire gap we have + r.wanted.after = r.gap.from; + + if(r.gap.to - r.wanted.after > host->rrdpush_replication_step) + // the duration is too big for one request - let's take the first step + r.wanted.before = r.wanted.after + host->rrdpush_replication_step; + else + // wow, we can do it in one request + r.wanted.before = r.gap.to; + + // don't ask from the child more than it has + if(r.wanted.before > r.child_db.last_entry_t) + r.wanted.before = r.child_db.last_entry_t; + + if(r.wanted.after > r.wanted.before) { + r.wanted.after = 0; + r.wanted.before = 0; + r.wanted.start_streaming = true; + return send_replay_chart_cmd(&r, "empty replication request, wanted after computed bigger than wanted before", true); + } + + // the child should start streaming immediately if the wanted duration is small, or we reached the last entry of the child + r.wanted.start_streaming = (r.local_db.wall_clock_time - r.wanted.after <= host->rrdpush_replication_step || + r.wanted.before >= r.child_db.last_entry_t || + r.wanted.before >= r.child_db.wall_clock_time || + r.wanted.before >= r.local_db.wall_clock_time); + + // the wanted timeframe is now r.wanted.after -> r.wanted.before + // send it + return send_replay_chart_cmd(&r, "OK", false); +} + +// ---------------------------------------------------------------------------- +// replication thread + +// replication request in sender DICTIONARY +// 
used for de-duplicating the requests +struct replication_request { + struct sender_state *sender; // the sender we should put the reply at + STRING *chart_id; // the chart of the request + time_t after; // the start time of the query (maybe zero) key for sorting (JudyL) + time_t before; // the end time of the query (maybe zero) + + usec_t sender_last_flush_ut; // the timestamp of the sender, at the time we indexed this request + Word_t unique_id; // auto-increment, later requests have bigger ids + + bool start_streaming; // true when the parent wants the child to send the rest of the data (before is overwritten) and enable normal streaming + bool indexed_in_judy; // true when the request is indexed in judy + bool not_indexed_buffer_full; // true when the request is not indexed because the sender is full + bool not_indexed_preprocessing; // true when the request is not indexed, but it is pending in preprocessing + + // prepare ahead members - preprocessing + bool found; // used as a result boolean for the find call + bool executed; // used to detect if we have skipped requests while preprocessing + RRDSET *st; // caching of the chart during preprocessing + struct replication_query *q; // the preprocessing query initialization +}; + +// replication sort entry in JudyL array +// used for sorting all requests, across all nodes +struct replication_sort_entry { + struct replication_request *rq; + + size_t unique_id; // used as a key to identify the sort entry - we never access its contents +}; + +#define MAX_REPLICATION_THREADS 20 // + 1 for the main thread + +// the global variables for the replication thread +static struct replication_thread { + ARAL *aral_rse; + + SPINLOCK spinlock; + + struct { + size_t pending; // number of requests pending in the queue + + // statistics + size_t added; // number of requests added to the queue + size_t removed; // number of requests removed from the queue + size_t pending_no_room; // number of requests skipped, because the sender has no room for 
responses + size_t senders_full; // number of times a sender's buffer was found full + size_t sender_resets; // number of times a sender reset our last position in the queue + time_t first_time_t; // the minimum 'after' we encountered + + struct { + Word_t after; + Word_t unique_id; + Pvoid_t JudyL_array; + } queue; + + } unsafe; // protected by replication_recursive_lock() + + struct { + Word_t unique_id; // the last unique id we gave to a request (auto-increment, starting from 1) + size_t executed; // the number of replication requests executed + size_t latest_first_time; // the 'after' timestamp of the last request we executed + size_t memory; // the total memory allocated by replication + } atomic; // access should be with atomic operations + + struct { + size_t last_executed; // caching of the atomic.executed to report number of requests executed since last time + + ND_THREAD **threads_ptrs; + size_t threads; + } main_thread; // access is allowed only by the main thread + +} replication_globals = { + .aral_rse = NULL, + .spinlock = NETDATA_SPINLOCK_INITIALIZER, + .unsafe = { + .pending = 0, + + .added = 0, + .removed = 0, + .pending_no_room = 0, + .sender_resets = 0, + .senders_full = 0, + + .first_time_t = 0, + + .queue = { + .after = 0, + .unique_id = 0, + .JudyL_array = NULL, + }, + }, + .atomic = { + .unique_id = 0, + .executed = 0, + .latest_first_time = 0, + .memory = 0, + }, + .main_thread = { + .last_executed = 0, + .threads = 0, + .threads_ptrs = NULL, + }, +}; + +size_t replication_allocated_memory(void) { + return __atomic_load_n(&replication_globals.atomic.memory, __ATOMIC_RELAXED); +} + +#define replication_set_latest_first_time(t) __atomic_store_n(&replication_globals.atomic.latest_first_time, t, __ATOMIC_RELAXED) +#define replication_get_latest_first_time() __atomic_load_n(&replication_globals.atomic.latest_first_time, __ATOMIC_RELAXED) + +static inline bool replication_recursive_lock_mode(char mode) { + static __thread int 
recursions = 0; + + if(mode == 'L') { // (L)ock + if(++recursions == 1) + spinlock_lock(&replication_globals.spinlock); + } + else if(mode == 'U') { // (U)nlock + if(--recursions == 0) + spinlock_unlock(&replication_globals.spinlock); + } + else if(mode == 'C') { // (C)heck + if(recursions > 0) + return true; + else + return false; + } + else + fatal("REPLICATION: unknown lock mode '%c'", mode); + +#ifdef NETDATA_INTERNAL_CHECKS + if(recursions < 0) + fatal("REPLICATION: recursions is %d", recursions); +#endif + + return true; +} + +#define replication_recursive_lock() replication_recursive_lock_mode('L') +#define replication_recursive_unlock() replication_recursive_lock_mode('U') +#define fatal_when_replication_is_not_locked_for_me() do { \ + if(!replication_recursive_lock_mode('C')) \ + fatal("REPLICATION: reached %s, but replication is not locked by this thread.", __FUNCTION__); \ +} while(0) + +void replication_set_next_point_in_time(time_t after, size_t unique_id) { + replication_recursive_lock(); + replication_globals.unsafe.queue.after = after; + replication_globals.unsafe.queue.unique_id = unique_id; + replication_recursive_unlock(); +} + +// ---------------------------------------------------------------------------- +// replication sort entry management + +static inline struct replication_sort_entry *replication_sort_entry_create(struct replication_request *rq) { + struct replication_sort_entry *rse = aral_mallocz(replication_globals.aral_rse); + __atomic_add_fetch(&replication_globals.atomic.memory, sizeof(struct replication_sort_entry), __ATOMIC_RELAXED); + + rrdpush_sender_pending_replication_requests_plus_one(rq->sender); + + // copy the request + rse->rq = rq; + rse->unique_id = __atomic_add_fetch(&replication_globals.atomic.unique_id, 1, __ATOMIC_SEQ_CST); + + // save the unique id into the request, to be able to delete it later + rq->unique_id = rse->unique_id; + rq->indexed_in_judy = false; + rq->not_indexed_buffer_full = false; + 
rq->not_indexed_preprocessing = false; + return rse; +} + +static void replication_sort_entry_destroy(struct replication_sort_entry *rse) { + aral_freez(replication_globals.aral_rse, rse); + __atomic_sub_fetch(&replication_globals.atomic.memory, sizeof(struct replication_sort_entry), __ATOMIC_RELAXED); +} + +static void replication_sort_entry_add(struct replication_request *rq) { + if(unlikely(rrdpush_sender_replication_buffer_full_get(rq->sender))) { + rq->indexed_in_judy = false; + rq->not_indexed_buffer_full = true; + rq->not_indexed_preprocessing = false; + replication_recursive_lock(); + replication_globals.unsafe.pending_no_room++; + replication_recursive_unlock(); + return; + } + + // cache this, because it will be changed + bool decrement_no_room = rq->not_indexed_buffer_full; + + struct replication_sort_entry *rse = replication_sort_entry_create(rq); + + replication_recursive_lock(); + + if(decrement_no_room) + replication_globals.unsafe.pending_no_room--; + +// if(rq->after < (time_t)replication_globals.protected.queue.after && +// rq->sender->buffer_used_percentage <= MAX_SENDER_BUFFER_PERCENTAGE_ALLOWED && +// !replication_globals.protected.skipped_no_room_since_last_reset) { +// +// // make it find this request first +// replication_set_next_point_in_time(rq->after, rq->unique_id); +// } + + replication_globals.unsafe.added++; + replication_globals.unsafe.pending++; + + Pvoid_t *inner_judy_ptr; + + // find the outer judy entry, using after as key + size_t mem_before_outer_judyl = JudyLMemUsed(replication_globals.unsafe.queue.JudyL_array); + inner_judy_ptr = JudyLIns(&replication_globals.unsafe.queue.JudyL_array, (Word_t) rq->after, PJE0); + size_t mem_after_outer_judyl = JudyLMemUsed(replication_globals.unsafe.queue.JudyL_array); + if(unlikely(!inner_judy_ptr || inner_judy_ptr == PJERR)) + fatal("REPLICATION: corrupted outer judyL"); + + // add it to the inner judy, using unique_id as key + size_t mem_before_inner_judyl = JudyLMemUsed(*inner_judy_ptr); 
+ Pvoid_t *item = JudyLIns(inner_judy_ptr, rq->unique_id, PJE0); + size_t mem_after_inner_judyl = JudyLMemUsed(*inner_judy_ptr); + if(unlikely(!item || item == PJERR)) + fatal("REPLICATION: corrupted inner judyL"); + + *item = rse; + rq->indexed_in_judy = true; + rq->not_indexed_buffer_full = false; + rq->not_indexed_preprocessing = false; + + if(!replication_globals.unsafe.first_time_t || rq->after < replication_globals.unsafe.first_time_t) + replication_globals.unsafe.first_time_t = rq->after; + + replication_recursive_unlock(); + + __atomic_add_fetch(&replication_globals.atomic.memory, (mem_after_inner_judyl - mem_before_inner_judyl) + (mem_after_outer_judyl - mem_before_outer_judyl), __ATOMIC_RELAXED); +} + +static bool replication_sort_entry_unlink_and_free_unsafe(struct replication_sort_entry *rse, Pvoid_t **inner_judy_ppptr, bool preprocessing) { + fatal_when_replication_is_not_locked_for_me(); + + bool inner_judy_deleted = false; + + replication_globals.unsafe.removed++; + replication_globals.unsafe.pending--; + + rrdpush_sender_pending_replication_requests_minus_one(rse->rq->sender); + + rse->rq->indexed_in_judy = false; + rse->rq->not_indexed_preprocessing = preprocessing; + + size_t memory_saved = 0; + + // delete it from the inner judy + size_t mem_before_inner_judyl = JudyLMemUsed(**inner_judy_ppptr); + JudyLDel(*inner_judy_ppptr, rse->rq->unique_id, PJE0); + size_t mem_after_inner_judyl = JudyLMemUsed(**inner_judy_ppptr); + memory_saved = mem_before_inner_judyl - mem_after_inner_judyl; + + // if no items left, delete it from the outer judy + if(**inner_judy_ppptr == NULL) { + size_t mem_before_outer_judyl = JudyLMemUsed(replication_globals.unsafe.queue.JudyL_array); + JudyLDel(&replication_globals.unsafe.queue.JudyL_array, rse->rq->after, PJE0); + size_t mem_after_outer_judyl = JudyLMemUsed(replication_globals.unsafe.queue.JudyL_array); + memory_saved += mem_before_outer_judyl - mem_after_outer_judyl; + inner_judy_deleted = true; + } + + // free 
memory + replication_sort_entry_destroy(rse); + + __atomic_sub_fetch(&replication_globals.atomic.memory, memory_saved, __ATOMIC_RELAXED); + + return inner_judy_deleted; +} + +static void replication_sort_entry_del(struct replication_request *rq, bool buffer_full) { + Pvoid_t *inner_judy_pptr; + struct replication_sort_entry *rse_to_delete = NULL; + + replication_recursive_lock(); + if(rq->indexed_in_judy) { + + inner_judy_pptr = JudyLGet(replication_globals.unsafe.queue.JudyL_array, rq->after, PJE0); + if (inner_judy_pptr) { + Pvoid_t *our_item_pptr = JudyLGet(*inner_judy_pptr, rq->unique_id, PJE0); + if (our_item_pptr) { + rse_to_delete = *our_item_pptr; + replication_sort_entry_unlink_and_free_unsafe(rse_to_delete, &inner_judy_pptr, false); + + if(buffer_full) { + replication_globals.unsafe.pending_no_room++; + rq->not_indexed_buffer_full = true; + } + } + } + + if (!rse_to_delete) + fatal("REPLAY: 'host:%s/chart:%s' Cannot find sort entry to delete for time %ld.", + rrdhost_hostname(rq->sender->host), string2str(rq->chart_id), rq->after); + + } + + replication_recursive_unlock(); +} + +static struct replication_request replication_request_get_first_available() { + Pvoid_t *inner_judy_pptr; + + replication_recursive_lock(); + + struct replication_request rq_to_return = (struct replication_request){ .found = false }; + + if(unlikely(!replication_globals.unsafe.queue.after || !replication_globals.unsafe.queue.unique_id)) { + replication_globals.unsafe.queue.after = 0; + replication_globals.unsafe.queue.unique_id = 0; + } + + Word_t started_after = replication_globals.unsafe.queue.after; + + size_t round = 0; + while(!rq_to_return.found) { + round++; + + if(round > 2) + break; + + if(round == 2) { + if(started_after == 0) + break; + + replication_globals.unsafe.queue.after = 0; + replication_globals.unsafe.queue.unique_id = 0; + } + + bool find_same_after = true; + while (!rq_to_return.found && (inner_judy_pptr = 
JudyLFirstThenNext(replication_globals.unsafe.queue.JudyL_array, &replication_globals.unsafe.queue.after, &find_same_after))) { + Pvoid_t *our_item_pptr; + + if(unlikely(round == 2 && replication_globals.unsafe.queue.after > started_after)) + break; + + while (!rq_to_return.found && (our_item_pptr = JudyLNext(*inner_judy_pptr, &replication_globals.unsafe.queue.unique_id, PJE0))) { + struct replication_sort_entry *rse = *our_item_pptr; + struct replication_request *rq = rse->rq; + + // copy the request to return it + rq_to_return = *rq; + rq_to_return.chart_id = string_dup(rq_to_return.chart_id); + + // set the return result to found + rq_to_return.found = true; + + if (replication_sort_entry_unlink_and_free_unsafe(rse, &inner_judy_pptr, true)) + // we removed the item from the outer JudyL + break; + } + + // prepare for the next iteration on the outer loop + replication_globals.unsafe.queue.unique_id = 0; + } + } + + replication_recursive_unlock(); + return rq_to_return; +} + +// ---------------------------------------------------------------------------- +// replication request management + +static void replication_request_react_callback(const DICTIONARY_ITEM *item __maybe_unused, void *value __maybe_unused, void *sender_state __maybe_unused) { + struct sender_state *s = sender_state; (void)s; + struct replication_request *rq = value; + + // IMPORTANT: + // We use the react instead of the insert callback + // because we want the item to be atomically visible + // to our replication thread, immediately after. + + // If we put this at the insert callback, the item is not guaranteed + // to be atomically visible to others, so the replication thread + // may see the replication sort entry, but fail to find the dictionary item + // related to it. 
+ + replication_sort_entry_add(rq); + + // this request is about a unique chart for this sender + rrdpush_sender_replicating_charts_plus_one(s); +} + +static bool replication_request_conflict_callback(const DICTIONARY_ITEM *item __maybe_unused, void *old_value, void *new_value, void *sender_state) { + struct sender_state *s = sender_state; (void)s; + struct replication_request *rq = old_value; (void)rq; + struct replication_request *rq_new = new_value; + + replication_recursive_lock(); + + if(!rq->indexed_in_judy && rq->not_indexed_buffer_full && !rq->not_indexed_preprocessing) { + // we can replace this command + internal_error( + true, + "STREAM %s [send to %s]: REPLAY: 'host:%s/chart:%s' replacing duplicate replication command received (existing from %llu to %llu [%s], new from %llu to %llu [%s])", + rrdhost_hostname(s->host), s->connected_to, rrdhost_hostname(s->host), dictionary_acquired_item_name(item), + (unsigned long long)rq->after, (unsigned long long)rq->before, rq->start_streaming ? "true" : "false", + (unsigned long long)rq_new->after, (unsigned long long)rq_new->before, rq_new->start_streaming ? "true" : "false"); + + rq->after = rq_new->after; + rq->before = rq_new->before; + rq->start_streaming = rq_new->start_streaming; + } + else if(!rq->indexed_in_judy && !rq->not_indexed_preprocessing) { + replication_sort_entry_add(rq); + internal_error( + true, + "STREAM %s [send to %s]: REPLAY: 'host:%s/chart:%s' adding duplicate replication command received (existing from %llu to %llu [%s], new from %llu to %llu [%s])", + rrdhost_hostname(s->host), s->connected_to, rrdhost_hostname(s->host), dictionary_acquired_item_name(item), + (unsigned long long)rq->after, (unsigned long long)rq->before, rq->start_streaming ? "true" : "false", + (unsigned long long)rq_new->after, (unsigned long long)rq_new->before, rq_new->start_streaming ? 
"true" : "false"); + } + else { + internal_error( + true, + "STREAM %s [send to %s]: REPLAY: 'host:%s/chart:%s' ignoring duplicate replication command received (existing from %llu to %llu [%s], new from %llu to %llu [%s])", + rrdhost_hostname(s->host), s->connected_to, rrdhost_hostname(s->host), + dictionary_acquired_item_name(item), + (unsigned long long) rq->after, (unsigned long long) rq->before, rq->start_streaming ? "true" : "false", + (unsigned long long) rq_new->after, (unsigned long long) rq_new->before, rq_new->start_streaming ? "true" : "false"); + } + + replication_recursive_unlock(); + + string_freez(rq_new->chart_id); + return false; +} + +static void replication_request_delete_callback(const DICTIONARY_ITEM *item __maybe_unused, void *value, void *sender_state __maybe_unused) { + struct replication_request *rq = value; + + // this request is about a unique chart for this sender + rrdpush_sender_replicating_charts_minus_one(rq->sender); + + if(rq->indexed_in_judy) + replication_sort_entry_del(rq, false); + + else if(rq->not_indexed_buffer_full) { + replication_recursive_lock(); + replication_globals.unsafe.pending_no_room--; + replication_recursive_unlock(); + } + + string_freez(rq->chart_id); +} + +static bool sender_is_still_connected_for_this_request(struct replication_request *rq) { + return rq->sender_last_flush_ut == rrdpush_sender_get_flush_time(rq->sender); +} + +static bool replication_execute_request(struct replication_request *rq, bool workers) { + bool ret = false; + + if(!rq->st) { + if(likely(workers)) + worker_is_busy(WORKER_JOB_FIND_CHART); + + rq->st = rrdset_find(rq->sender->host, string2str(rq->chart_id)); + } + + if(!rq->st) { + internal_error(true, "REPLAY ERROR: 'host:%s/chart:%s' not found", + rrdhost_hostname(rq->sender->host), string2str(rq->chart_id)); + + goto cleanup; + } + + if(!rq->q) { + if(likely(workers)) + worker_is_busy(WORKER_JOB_PREPARE_QUERY); + + rq->q = replication_response_prepare( + rq->st, + 
rq->start_streaming, + rq->after, + rq->before, + rq->sender->capabilities); + } + + if(likely(workers)) + worker_is_busy(WORKER_JOB_QUERYING); + + // send the replication data + rq->q->rq = rq; + replication_response_execute_and_finalize( + rq->q, (size_t)((unsigned long long)rq->sender->host->sender->buffer->max_size * MAX_REPLICATION_MESSAGE_PERCENT_SENDER_BUFFER / 100ULL)); + + rq->q = NULL; + + __atomic_add_fetch(&replication_globals.atomic.executed, 1, __ATOMIC_RELAXED); + + ret = true; + +cleanup: + if(rq->q) { + replication_response_cancel_and_finalize(rq->q); + rq->q = NULL; + } + + string_freez(rq->chart_id); + worker_is_idle(); + return ret; +} + +// ---------------------------------------------------------------------------- +// public API + +void replication_add_request(struct sender_state *sender, const char *chart_id, time_t after, time_t before, bool start_streaming) { + struct replication_request rq = { + .sender = sender, + .chart_id = string_strdupz(chart_id), + .after = after, + .before = before, + .start_streaming = start_streaming, + .sender_last_flush_ut = rrdpush_sender_get_flush_time(sender), + .indexed_in_judy = false, + .not_indexed_buffer_full = false, + .not_indexed_preprocessing = false, + }; + + if(!sender->replication.oldest_request_after_t || rq.after < sender->replication.oldest_request_after_t) + sender->replication.oldest_request_after_t = rq.after; + + if(start_streaming && rrdpush_sender_get_buffer_used_percent(sender) <= STREAMING_START_MAX_SENDER_BUFFER_PERCENTAGE_ALLOWED) + replication_execute_request(&rq, false); + + else + dictionary_set(sender->replication.requests, chart_id, &rq, sizeof(struct replication_request)); +} + +void replication_sender_delete_pending_requests(struct sender_state *sender) { + // allow the dictionary destructor to go faster on locks + dictionary_flush(sender->replication.requests); +} + +void replication_init_sender(struct sender_state *sender) { + sender->replication.requests = 
dictionary_create_advanced(DICT_OPTION_DONT_OVERWRITE_VALUE | DICT_OPTION_FIXED_SIZE, + NULL, sizeof(struct replication_request)); + + dictionary_register_react_callback(sender->replication.requests, replication_request_react_callback, sender); + dictionary_register_conflict_callback(sender->replication.requests, replication_request_conflict_callback, sender); + dictionary_register_delete_callback(sender->replication.requests, replication_request_delete_callback, sender); +} + +void replication_cleanup_sender(struct sender_state *sender) { + // allow the dictionary destructor to go faster on locks + replication_recursive_lock(); + dictionary_destroy(sender->replication.requests); + replication_recursive_unlock(); +} + +void replication_recalculate_buffer_used_ratio_unsafe(struct sender_state *s) { + size_t available = cbuffer_available_size_unsafe(s->host->sender->buffer); + size_t percentage = (s->buffer->max_size - available) * 100 / s->buffer->max_size; + + if(unlikely(percentage > MAX_SENDER_BUFFER_PERCENTAGE_ALLOWED && !rrdpush_sender_replication_buffer_full_get(s))) { + rrdpush_sender_replication_buffer_full_set(s, true); + + struct replication_request *rq; + dfe_start_read(s->replication.requests, rq) { + if(rq->indexed_in_judy) + replication_sort_entry_del(rq, true); + } + dfe_done(rq); + + replication_recursive_lock(); + replication_globals.unsafe.senders_full++; + replication_recursive_unlock(); + } + else if(unlikely(percentage < MIN_SENDER_BUFFER_PERCENTAGE_ALLOWED && rrdpush_sender_replication_buffer_full_get(s))) { + rrdpush_sender_replication_buffer_full_set(s, false); + + struct replication_request *rq; + dfe_start_read(s->replication.requests, rq) { + if(!rq->indexed_in_judy && (rq->not_indexed_buffer_full || rq->not_indexed_preprocessing)) + replication_sort_entry_add(rq); + } + dfe_done(rq); + + replication_recursive_lock(); + replication_globals.unsafe.senders_full--; + replication_globals.unsafe.sender_resets++; + // 
replication_set_next_point_in_time(0, 0); + replication_recursive_unlock(); + } + + rrdpush_sender_set_buffer_used_percent(s, percentage); +} + +// ---------------------------------------------------------------------------- +// replication thread + +static size_t verify_host_charts_are_streaming_now(RRDHOST *host) { + internal_error( + host->sender && + !rrdpush_sender_pending_replication_requests(host->sender) && + dictionary_entries(host->sender->replication.requests) != 0, + "REPLICATION SUMMARY: 'host:%s' reports %zu pending replication requests, but its chart replication index says there are %zu charts pending replication", + rrdhost_hostname(host), + rrdpush_sender_pending_replication_requests(host->sender), + dictionary_entries(host->sender->replication.requests) + ); + + size_t ok = 0; + size_t errors = 0; + + RRDSET *st; + rrdset_foreach_read(st, host) { + RRDSET_FLAGS flags = rrdset_flag_check(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS | RRDSET_FLAG_SENDER_REPLICATION_FINISHED); + + bool is_error = false; + + if(!flags) { + internal_error( + true, + "REPLICATION SUMMARY: 'host:%s/chart:%s' is neither IN PROGRESS nor FINISHED", + rrdhost_hostname(host), rrdset_id(st) + ); + is_error = true; + } + + if(!(flags & RRDSET_FLAG_SENDER_REPLICATION_FINISHED) || (flags & RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS)) { + internal_error( + true, + "REPLICATION SUMMARY: 'host:%s/chart:%s' is IN PROGRESS although replication is finished", + rrdhost_hostname(host), rrdset_id(st) + ); + is_error = true; + } + + if(is_error) + errors++; + else + ok++; + } + rrdset_foreach_done(st); + + internal_error(errors, + "REPLICATION SUMMARY: 'host:%s' finished replicating %zu charts, but %zu charts are still in progress although replication finished", + rrdhost_hostname(host), ok, errors); + + return errors; +} + +static void verify_all_hosts_charts_are_streaming_now(void) { + worker_is_busy(WORKER_JOB_CHECK_CONSISTENCY); + + size_t errors = 0; + RRDHOST *host; + 
dfe_start_read(rrdhost_root_index, host) + errors += verify_host_charts_are_streaming_now(host); + dfe_done(host); + + size_t executed = __atomic_load_n(&replication_globals.atomic.executed, __ATOMIC_RELAXED); + netdata_log_info("REPLICATION SUMMARY: finished, executed %zu replication requests, %zu charts pending replication", + executed - replication_globals.main_thread.last_executed, errors); + replication_globals.main_thread.last_executed = executed; +} + +static void replication_initialize_workers(bool master) { + worker_register("REPLICATION"); + worker_register_job_name(WORKER_JOB_FIND_NEXT, "find next"); + worker_register_job_name(WORKER_JOB_QUERYING, "querying"); + worker_register_job_name(WORKER_JOB_DELETE_ENTRY, "dict delete"); + worker_register_job_name(WORKER_JOB_FIND_CHART, "find chart"); + worker_register_job_name(WORKER_JOB_PREPARE_QUERY, "prepare query"); + worker_register_job_name(WORKER_JOB_CHECK_CONSISTENCY, "check consistency"); + worker_register_job_name(WORKER_JOB_BUFFER_COMMIT, "commit"); + worker_register_job_name(WORKER_JOB_CLEANUP, "cleanup"); + worker_register_job_name(WORKER_JOB_WAIT, "wait"); + + if(master) { + worker_register_job_name(WORKER_JOB_STATISTICS, "statistics"); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_PENDING_REQUESTS, "pending requests", "requests", WORKER_METRIC_ABSOLUTE); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_SKIPPED_NO_ROOM, "no room requests", "requests", WORKER_METRIC_ABSOLUTE); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_COMPLETION, "completion", "%", WORKER_METRIC_ABSOLUTE); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_ADDED, "added requests", "requests/s", WORKER_METRIC_INCREMENTAL_TOTAL); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_DONE, "finished requests", "requests/s", WORKER_METRIC_INCREMENTAL_TOTAL); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_SENDER_RESETS, "sender resets", "resets/s", 
WORKER_METRIC_INCREMENTAL_TOTAL); + worker_register_job_custom_metric(WORKER_JOB_CUSTOM_METRIC_SENDER_FULL, "senders full", "senders", WORKER_METRIC_ABSOLUTE); + } +} + +#define REQUEST_OK (0) +#define REQUEST_QUEUE_EMPTY (-1) +#define REQUEST_CHART_NOT_FOUND (-2) + +static __thread struct replication_thread_pipeline { + int max_requests_ahead; + struct replication_request *rqs; + int rqs_last_executed, rqs_last_prepared; + size_t queue_rounds; +} rtp = { + .max_requests_ahead = 0, + .rqs = NULL, + .rqs_last_executed = 0, + .rqs_last_prepared = 0, + .queue_rounds = 0, +}; + +static void replication_pipeline_cancel_and_cleanup(void) { + if(!rtp.rqs) + return; + + struct replication_request *rq; + size_t cancelled = 0; + + do { + if (++rtp.rqs_last_executed >= rtp.max_requests_ahead) + rtp.rqs_last_executed = 0; + + rq = &rtp.rqs[rtp.rqs_last_executed]; + + if (rq->q) { + internal_fatal(rq->executed, "REPLAY FATAL: query has already been executed!"); + internal_fatal(!rq->found, "REPLAY FATAL: orphan q in rq"); + + replication_response_cancel_and_finalize(rq->q); + rq->q = NULL; + cancelled++; + } + + rq->executed = true; + rq->found = false; + + } while (rtp.rqs_last_executed != rtp.rqs_last_prepared); + + internal_error(true, "REPLICATION: cancelled %zu inflight queries", cancelled); + + freez(rtp.rqs); + rtp.rqs = NULL; + rtp.max_requests_ahead = 0; + rtp.rqs_last_executed = 0; + rtp.rqs_last_prepared = 0; + rtp.queue_rounds = 0; +} + +static int replication_pipeline_execute_next(void) { + struct replication_request *rq; + + if(unlikely(!rtp.rqs)) { + rtp.max_requests_ahead = (int)get_netdata_cpus() / 2; + + if(rtp.max_requests_ahead > libuv_worker_threads * 2) + rtp.max_requests_ahead = libuv_worker_threads * 2; + + if(rtp.max_requests_ahead < 2) + rtp.max_requests_ahead = 2; + + rtp.rqs = callocz(rtp.max_requests_ahead, sizeof(struct replication_request)); + __atomic_add_fetch(&replication_buffers_allocated, rtp.max_requests_ahead * sizeof(struct 
replication_request), __ATOMIC_RELAXED); + } + + // fill the queue + do { + if(++rtp.rqs_last_prepared >= rtp.max_requests_ahead) { + rtp.rqs_last_prepared = 0; + rtp.queue_rounds++; + } + + internal_fatal(rtp.rqs[rtp.rqs_last_prepared].q, + "REPLAY FATAL: slot is used by query that has not been executed!"); + + worker_is_busy(WORKER_JOB_FIND_NEXT); + rtp.rqs[rtp.rqs_last_prepared] = replication_request_get_first_available(); + rq = &rtp.rqs[rtp.rqs_last_prepared]; + + if(rq->found) { + if (!rq->st) { + worker_is_busy(WORKER_JOB_FIND_CHART); + rq->st = rrdset_find(rq->sender->host, string2str(rq->chart_id)); + } + + if (rq->st && !rq->q) { + worker_is_busy(WORKER_JOB_PREPARE_QUERY); + rq->q = replication_response_prepare( + rq->st, + rq->start_streaming, + rq->after, + rq->before, + rq->sender->capabilities); + } + + rq->executed = false; + } + + } while(rq->found && rtp.rqs_last_prepared != rtp.rqs_last_executed); + + // pick the first usable + do { + if (++rtp.rqs_last_executed >= rtp.max_requests_ahead) + rtp.rqs_last_executed = 0; + + rq = &rtp.rqs[rtp.rqs_last_executed]; + + if(rq->found) { + internal_fatal(rq->executed, "REPLAY FATAL: query has already been executed!"); + + if (rq->sender_last_flush_ut != rrdpush_sender_get_flush_time(rq->sender)) { + // the sender has reconnected since this request was queued, + // we can safely throw it away, since the parent will resend it + replication_response_cancel_and_finalize(rq->q); + rq->executed = true; + rq->found = false; + rq->q = NULL; + } + else if (rrdpush_sender_replication_buffer_full_get(rq->sender)) { + // the sender buffer is full, so we can ignore this request, + // it has already been marked as 'preprocessed' in the dictionary, + // and the sender will put it back in when there is + // enough room in the buffer for processing replication requests + replication_response_cancel_and_finalize(rq->q); + rq->executed = true; + rq->found = false; + rq->q = NULL; + } + else { + // we can execute this, + // 
delete it from the dictionary + worker_is_busy(WORKER_JOB_DELETE_ENTRY); + dictionary_del(rq->sender->replication.requests, string2str(rq->chart_id)); + } + } + else + internal_fatal(rq->q, "REPLAY FATAL: slot status says slot is empty, but it has a pending query!"); + + } while(!rq->found && rtp.rqs_last_executed != rtp.rqs_last_prepared); + + if(unlikely(!rq->found)) { + worker_is_idle(); + return REQUEST_QUEUE_EMPTY; + } + + replication_set_latest_first_time(rq->after); + + bool chart_found = replication_execute_request(rq, true); + rq->executed = true; + rq->found = false; + rq->q = NULL; + + if(unlikely(!chart_found)) { + worker_is_idle(); + return REQUEST_CHART_NOT_FOUND; + } + + worker_is_idle(); + return REQUEST_OK; +} + +static void replication_worker_cleanup(void *pptr) { + if(CLEANUP_FUNCTION_GET_PTR(pptr) != (void *)0x01) return; + replication_pipeline_cancel_and_cleanup(); + worker_unregister(); +} + +static void *replication_worker_thread(void *ptr __maybe_unused) { + CLEANUP_FUNCTION_REGISTER(replication_worker_cleanup) cleanup_ptr = (void *)0x1; + replication_initialize_workers(false); + + while (service_running(SERVICE_REPLICATION)) { + if (unlikely(replication_pipeline_execute_next() == REQUEST_QUEUE_EMPTY)) { + sender_thread_buffer_free(); + worker_is_busy(WORKER_JOB_WAIT); + worker_is_idle(); + sleep_usec(1 * USEC_PER_SEC); + } + } + + return NULL; +} + +static void replication_main_cleanup(void *pptr) { + struct netdata_static_thread *static_thread = CLEANUP_FUNCTION_GET_PTR(pptr); + if(!static_thread) return; + + static_thread->enabled = NETDATA_MAIN_THREAD_EXITING; + + replication_pipeline_cancel_and_cleanup(); + + int threads = (int)replication_globals.main_thread.threads; + for(int i = 0; i < threads ;i++) { + nd_thread_join(replication_globals.main_thread.threads_ptrs[i]); + __atomic_sub_fetch(&replication_buffers_allocated, sizeof(ND_THREAD *), __ATOMIC_RELAXED); + } + freez(replication_globals.main_thread.threads_ptrs); + 
replication_globals.main_thread.threads_ptrs = NULL; + __atomic_sub_fetch(&replication_buffers_allocated, threads * sizeof(ND_THREAD *), __ATOMIC_RELAXED); + + aral_destroy(replication_globals.aral_rse); + replication_globals.aral_rse = NULL; + + // custom code + worker_unregister(); + + static_thread->enabled = NETDATA_MAIN_THREAD_EXITED; +} + +void replication_initialize(void) { + replication_globals.aral_rse = aral_create("rse", sizeof(struct replication_sort_entry), + 0, 65536, aral_by_size_statistics(), + NULL, NULL, false, false); +} + +void *replication_thread_main(void *ptr __maybe_unused) { + replication_initialize_workers(true); + + int threads = config_get_number(CONFIG_SECTION_DB, "replication threads", 1); + if(threads < 1 || threads > MAX_REPLICATION_THREADS) { + netdata_log_error("replication threads given %d is invalid, resetting to 1", threads); + threads = 1; + } + + if(--threads) { + replication_globals.main_thread.threads = threads; + replication_globals.main_thread.threads_ptrs = mallocz(threads * sizeof(ND_THREAD *)); + __atomic_add_fetch(&replication_buffers_allocated, threads * sizeof(ND_THREAD *), __ATOMIC_RELAXED); + + for(int i = 0; i < threads ;i++) { + char tag[NETDATA_THREAD_TAG_MAX + 1]; + snprintfz(tag, NETDATA_THREAD_TAG_MAX, "REPLAY[%d]", i + 2); + replication_globals.main_thread.threads_ptrs[i] = mallocz(sizeof(ND_THREAD *)); + __atomic_add_fetch(&replication_buffers_allocated, sizeof(ND_THREAD *), __ATOMIC_RELAXED); + replication_globals.main_thread.threads_ptrs[i] = nd_thread_create(tag, NETDATA_THREAD_OPTION_JOINABLE, + replication_worker_thread, NULL); + } + } + + CLEANUP_FUNCTION_REGISTER(replication_main_cleanup) cleanup_ptr = ptr; + + // start from 100% completed + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_COMPLETION, 100.0); + + long run_verification_countdown = LONG_MAX; // LONG_MAX to prevent an initial verification when no replication ever took place + bool slow = true; // control the time we sleep - it has to start 
with true! + usec_t last_now_mono_ut = now_monotonic_usec(); + time_t replication_reset_next_point_in_time_countdown = SECONDS_TO_RESET_POINT_IN_TIME; // restart from the beginning every 10 seconds + + size_t last_executed = 0; + size_t last_sender_resets = 0; + + while(service_running(SERVICE_REPLICATION)) { + + // statistics + usec_t now_mono_ut = now_monotonic_usec(); + if(unlikely(now_mono_ut - last_now_mono_ut > default_rrd_update_every * USEC_PER_SEC)) { + last_now_mono_ut = now_mono_ut; + + worker_is_busy(WORKER_JOB_STATISTICS); + replication_recursive_lock(); + + size_t current_executed = __atomic_load_n(&replication_globals.atomic.executed, __ATOMIC_RELAXED); + if(last_executed != current_executed) { + run_verification_countdown = ITERATIONS_IDLE_WITHOUT_PENDING_TO_RUN_SENDER_VERIFICATION; + last_executed = current_executed; + slow = false; + } + + if(replication_reset_next_point_in_time_countdown-- == 0) { + // when the countdown expires, make it scan all the pending requests next time + replication_set_next_point_in_time(0, 0); +// replication_globals.protected.skipped_no_room_since_last_reset = 0; + replication_reset_next_point_in_time_countdown = SECONDS_TO_RESET_POINT_IN_TIME; + } + + if(--run_verification_countdown == 0) { + if (!replication_globals.unsafe.pending && !replication_globals.unsafe.pending_no_room) { + // reset the statistics about completion percentage + replication_globals.unsafe.first_time_t = 0; + replication_set_latest_first_time(0); + + verify_all_hosts_charts_are_streaming_now(); + + run_verification_countdown = LONG_MAX; + slow = true; + } + else + run_verification_countdown = ITERATIONS_IDLE_WITHOUT_PENDING_TO_RUN_SENDER_VERIFICATION; + } + + time_t latest_first_time_t = replication_get_latest_first_time(); + if(latest_first_time_t && replication_globals.unsafe.pending) { + // completion percentage statistics + time_t now = now_realtime_sec(); + time_t total = now - replication_globals.unsafe.first_time_t; + time_t done = 
latest_first_time_t - replication_globals.unsafe.first_time_t; + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_COMPLETION, + (NETDATA_DOUBLE) done * 100.0 / (NETDATA_DOUBLE) total); + } + else + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_COMPLETION, 100.0); + + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_PENDING_REQUESTS, (NETDATA_DOUBLE)replication_globals.unsafe.pending); + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_ADDED, (NETDATA_DOUBLE)replication_globals.unsafe.added); + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_DONE, (NETDATA_DOUBLE)__atomic_load_n(&replication_globals.atomic.executed, __ATOMIC_RELAXED)); + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_SKIPPED_NO_ROOM, (NETDATA_DOUBLE)replication_globals.unsafe.pending_no_room); + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_SENDER_RESETS, (NETDATA_DOUBLE)replication_globals.unsafe.sender_resets); + worker_set_metric(WORKER_JOB_CUSTOM_METRIC_SENDER_FULL, (NETDATA_DOUBLE)replication_globals.unsafe.senders_full); + + replication_recursive_unlock(); + worker_is_idle(); + } + + if(unlikely(replication_pipeline_execute_next() == REQUEST_QUEUE_EMPTY)) { + + worker_is_busy(WORKER_JOB_WAIT); + replication_recursive_lock(); + + // the timeout also defines how frequently we will traverse all the pending requests + // when the outbound buffers of all senders are full + usec_t timeout; + if(slow) { + // no work to be done, wait for a request to come in + timeout = 1000 * USEC_PER_MS; + sender_thread_buffer_free(); + } + + else if(replication_globals.unsafe.pending > 0) { + if(replication_globals.unsafe.sender_resets == last_sender_resets) + timeout = 1000 * USEC_PER_MS; + + else { + // there are pending requests waiting to be executed, + // but none could be executed at this time. + // try again after this time. 
+ timeout = 100 * USEC_PER_MS; + } + + last_sender_resets = replication_globals.unsafe.sender_resets; + } + else { + // no requests pending, but there were requests recently (run_verification_countdown) + // so, try in a short time. + // if this is big, one chart replicating will be slow to finish (ping - pong just one chart) + timeout = 10 * USEC_PER_MS; + last_sender_resets = replication_globals.unsafe.sender_resets; + } + + replication_recursive_unlock(); + + worker_is_idle(); + sleep_usec(timeout); + + // make it scan all the pending requests next time + replication_set_next_point_in_time(0, 0); + replication_reset_next_point_in_time_countdown = SECONDS_TO_RESET_POINT_IN_TIME; + + continue; + } + } + + return NULL; +} diff --git a/src/streaming/replication.h b/src/streaming/replication.h new file mode 100644 index 000000000..507b7c32f --- /dev/null +++ b/src/streaming/replication.h @@ -0,0 +1,36 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef REPLICATION_H +#define REPLICATION_H + +#include "daemon/common.h" + +struct replication_query_statistics { + SPINLOCK spinlock; + size_t queries_started; + size_t queries_finished; + size_t points_read; + size_t points_generated; +}; + +struct replication_query_statistics replication_get_query_statistics(void); + +bool replicate_chart_response(RRDHOST *rh, RRDSET *rs, bool start_streaming, time_t after, time_t before); + +typedef ssize_t (*send_command)(const char *txt, void *data); + +bool replicate_chart_request(send_command callback, void *callback_data, + RRDHOST *rh, RRDSET *rs, + time_t child_first_entry, time_t child_last_entry, time_t child_wall_clock_time, + time_t response_first_start_time, time_t response_last_end_time); + +void replication_init_sender(struct sender_state *sender); +void replication_cleanup_sender(struct sender_state *sender); +void replication_sender_delete_pending_requests(struct sender_state *sender); +void replication_add_request(struct sender_state *sender, const char 
*chart_id, time_t after, time_t before, bool start_streaming); +void replication_recalculate_buffer_used_ratio_unsafe(struct sender_state *s); + +size_t replication_allocated_memory(void); +size_t replication_allocated_buffers(void); + +#endif /* REPLICATION_H */ diff --git a/src/streaming/rrdpush.c b/src/streaming/rrdpush.c new file mode 100644 index 000000000..1ce8e4ea8 --- /dev/null +++ b/src/streaming/rrdpush.c @@ -0,0 +1,1418 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrdpush.h" + +/* + * rrdpush + * + * 3 threads are involved for all stream operations + * + * 1. a random data collection thread, calling rrdset_done_push() + * this is called for each chart. + * + * the output of this work is kept in a thread BUFFER + * the sender thread is signalled via a pipe (in RRDHOST) + * + * 2. a sender thread running at the sending netdata + * this is spawned automatically on the first chart to be pushed + * + * It tries to push the metrics to the remote netdata, as fast + * as possible (i.e. immediately after they are collected). + * + * 3. a receiver thread, running at the receiving netdata + * this is spawned automatically when the sender connects to + * the receiver. 
+ * + */ + +struct config stream_config = { + .first_section = NULL, + .last_section = NULL, + .mutex = NETDATA_MUTEX_INITIALIZER, + .index = { + .avl_tree = { + .root = NULL, + .compar = appconfig_section_compare + }, + .rwlock = AVL_LOCK_INITIALIZER + } +}; + +unsigned int default_rrdpush_enabled = 0; +STREAM_CAPABILITIES globally_disabled_capabilities = STREAM_CAP_NONE; + +unsigned int default_rrdpush_compression_enabled = 1; +char *default_rrdpush_destination = NULL; +char *default_rrdpush_api_key = NULL; +char *default_rrdpush_send_charts_matching = NULL; +bool default_rrdpush_enable_replication = true; +time_t default_rrdpush_seconds_to_replicate = 86400; +time_t default_rrdpush_replication_step = 600; +#ifdef ENABLE_HTTPS +char *netdata_ssl_ca_path = NULL; +char *netdata_ssl_ca_file = NULL; +#endif + +static void load_stream_conf() { + errno = 0; + char *filename = strdupz_path_subpath(netdata_configured_user_config_dir, "stream.conf"); + if(!appconfig_load(&stream_config, filename, 0, NULL)) { + nd_log_daemon(NDLP_NOTICE, "CONFIG: cannot load user config '%s'. Will try stock config.", filename); + freez(filename); + + filename = strdupz_path_subpath(netdata_configured_stock_config_dir, "stream.conf"); + if(!appconfig_load(&stream_config, filename, 0, NULL)) + nd_log_daemon(NDLP_NOTICE, "CONFIG: cannot load stock config '%s'. 
Running with internal defaults.", filename); + } + freez(filename); +} + +bool rrdpush_receiver_needs_dbengine() { + struct section *co; + + for(co = stream_config.first_section; co; co = co->next) { + if(strcmp(co->name, "stream") == 0) + continue; // the first section is not relevant + + char *s; + + s = appconfig_get_by_section(co, "enabled", NULL); + if(!s || !appconfig_test_boolean_value(s)) + continue; + + s = appconfig_get_by_section(co, "default memory mode", NULL); + if(s && strcmp(s, "dbengine") == 0) + return true; + + s = appconfig_get_by_section(co, "memory mode", NULL); + if(s && strcmp(s, "dbengine") == 0) + return true; + } + + return false; +} + +int rrdpush_init() { + // -------------------------------------------------------------------- + // load stream.conf + load_stream_conf(); + + default_rrdpush_enabled = (unsigned int)appconfig_get_boolean(&stream_config, CONFIG_SECTION_STREAM, "enabled", default_rrdpush_enabled); + default_rrdpush_destination = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "destination", ""); + default_rrdpush_api_key = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "api key", ""); + default_rrdpush_send_charts_matching = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "send charts matching", "*"); + + default_rrdpush_enable_replication = config_get_boolean(CONFIG_SECTION_DB, "enable replication", default_rrdpush_enable_replication); + default_rrdpush_seconds_to_replicate = config_get_number(CONFIG_SECTION_DB, "seconds to replicate", default_rrdpush_seconds_to_replicate); + default_rrdpush_replication_step = config_get_number(CONFIG_SECTION_DB, "seconds per replication step", default_rrdpush_replication_step); + + rrdhost_free_orphan_time_s = config_get_number(CONFIG_SECTION_DB, "cleanup orphan hosts after secs", rrdhost_free_orphan_time_s); + + default_rrdpush_compression_enabled = (unsigned int)appconfig_get_boolean(&stream_config, CONFIG_SECTION_STREAM, + "enable compression", 
default_rrdpush_compression_enabled); + + rrdpush_compression_levels[COMPRESSION_ALGORITHM_BROTLI] = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "brotli compression level", + rrdpush_compression_levels[COMPRESSION_ALGORITHM_BROTLI]); + + rrdpush_compression_levels[COMPRESSION_ALGORITHM_ZSTD] = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "zstd compression level", + rrdpush_compression_levels[COMPRESSION_ALGORITHM_ZSTD]); + + rrdpush_compression_levels[COMPRESSION_ALGORITHM_LZ4] = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "lz4 compression acceleration", + rrdpush_compression_levels[COMPRESSION_ALGORITHM_LZ4]); + + rrdpush_compression_levels[COMPRESSION_ALGORITHM_GZIP] = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "gzip compression level", + rrdpush_compression_levels[COMPRESSION_ALGORITHM_GZIP]); + + if(default_rrdpush_enabled && (!default_rrdpush_destination || !*default_rrdpush_destination || !default_rrdpush_api_key || !*default_rrdpush_api_key)) { + nd_log_daemon(NDLP_WARNING, "STREAM [send]: cannot enable sending thread - information is missing."); + default_rrdpush_enabled = 0; + } + +#ifdef ENABLE_HTTPS + netdata_ssl_validate_certificate_sender = !appconfig_get_boolean(&stream_config, CONFIG_SECTION_STREAM, "ssl skip certificate verification", !netdata_ssl_validate_certificate); + + if(!netdata_ssl_validate_certificate_sender) + nd_log_daemon(NDLP_NOTICE, "SSL: streaming senders will skip SSL certificates verification."); + + netdata_ssl_ca_path = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "CApath", NULL); + netdata_ssl_ca_file = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "CAfile", NULL); +#endif + + return default_rrdpush_enabled; +} + +// data collection happens from multiple threads +// each of these threads calls rrdset_done() +// which in turn calls rrdset_done_push() +// which uses this pipe to notify the streaming thread +// that there 
is more data ready to be sent +#define PIPE_READ 0 +#define PIPE_WRITE 1 + +// to make the remote netdata re-sync each chart +// to its current clock, we send a BEGIN line +// without microseconds for this many iterations, +// counted from the first iterations of each chart +unsigned int remote_clock_resync_iterations = 60; + +static inline bool should_send_chart_matching(RRDSET *st, RRDSET_FLAGS flags) { + if(!(flags & RRDSET_FLAG_RECEIVER_REPLICATION_FINISHED)) + return false; + + if(unlikely(!(flags & (RRDSET_FLAG_UPSTREAM_SEND | RRDSET_FLAG_UPSTREAM_IGNORE)))) { + RRDHOST *host = st->rrdhost; + + if (flags & RRDSET_FLAG_ANOMALY_DETECTION) { + if(ml_streaming_enabled()) + rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_SEND); + else + rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_IGNORE); + } + else if(simple_pattern_matches_string(host->rrdpush_send_charts_matching, st->id) || + simple_pattern_matches_string(host->rrdpush_send_charts_matching, st->name)) + + rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_SEND); + else + rrdset_flag_set(st, RRDSET_FLAG_UPSTREAM_IGNORE); + + // get the flags again, to know how to respond + flags = rrdset_flag_check(st, RRDSET_FLAG_UPSTREAM_SEND|RRDSET_FLAG_UPSTREAM_IGNORE); + } + + return flags & RRDSET_FLAG_UPSTREAM_SEND; +} + +int configured_as_parent() { + struct section *section = NULL; + int is_parent = 0; + + appconfig_wrlock(&stream_config); + for (section = stream_config.first_section; section; section = section->next) { + nd_uuid_t uuid; + + if (uuid_parse(section->name, uuid) != -1 && + appconfig_get_boolean_by_section(section, "enabled", 0)) { + is_parent = 1; + break; + } + } + appconfig_unlock(&stream_config); + + return is_parent; +} + +// chart labels +static int send_clabels_callback(const char *name, const char *value, RRDLABEL_SRC ls, void *data) { + BUFFER *wb = (BUFFER *)data; + buffer_sprintf(wb, PLUGINSD_KEYWORD_CLABEL " \"%s\" \"%s\" %d\n", name, value, ls & ~(RRDLABEL_FLAG_INTERNAL)); + return 1; +} + +static void 
rrdpush_send_clabels(BUFFER *wb, RRDSET *st) { + if (st->rrdlabels) { + if(rrdlabels_walkthrough_read(st->rrdlabels, send_clabels_callback, wb) > 0) + buffer_sprintf(wb, PLUGINSD_KEYWORD_CLABEL_COMMIT "\n"); + } +} + +// Send the current chart definition. +// Assumes that collector thread has already called sender_start for mutex / buffer state. +static inline bool rrdpush_send_chart_definition(BUFFER *wb, RRDSET *st) { + uint32_t version = rrdset_metadata_version(st); + + RRDHOST *host = st->rrdhost; + NUMBER_ENCODING integer_encoding = stream_has_capability(host->sender, STREAM_CAP_IEEE754) ? NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_HEX; + bool with_slots = stream_has_capability(host->sender, STREAM_CAP_SLOTS) ? true : false; + + bool replication_progress = false; + + // properly set the name for the remote end to parse it + char *name = ""; + if(likely(st->name)) { + if(unlikely(st->id != st->name)) { + // they differ + name = strchr(rrdset_name(st), '.'); + if(name) + name++; + else + name = ""; + } + } + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_CHART, sizeof(PLUGINSD_KEYWORD_CHART) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, st->rrdpush.sender.chart_slot); + } + + // send the chart + buffer_sprintf( + wb + , " \"%s\" \"%s\" \"%s\" \"%s\" \"%s\" \"%s\" \"%s\" %d %d \"%s %s %s %s\" \"%s\" \"%s\"\n" + , rrdset_id(st) + , name + , rrdset_title(st) + , rrdset_units(st) + , rrdset_family(st) + , rrdset_context(st) + , rrdset_type_name(st->chart_type) + , st->priority + , st->update_every + , rrdset_flag_check(st, RRDSET_FLAG_OBSOLETE)?"obsolete":"" + , rrdset_flag_check(st, RRDSET_FLAG_DETAIL)?"detail":"" + , rrdset_flag_check(st, RRDSET_FLAG_STORE_FIRST)?"store_first":"" + , rrdset_flag_check(st, RRDSET_FLAG_HIDDEN)?"hidden":"" + , rrdset_plugin_name(st) + , rrdset_module_name(st) + ); + + // send the chart labels + if 
(stream_has_capability(host->sender, STREAM_CAP_CLABELS)) + rrdpush_send_clabels(wb, st); + + // send the dimensions + RRDDIM *rd; + rrddim_foreach_read(rd, st) { + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_DIMENSION, sizeof(PLUGINSD_KEYWORD_DIMENSION) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, rd->rrdpush.sender.dim_slot); + } + + buffer_sprintf( + wb + , " \"%s\" \"%s\" \"%s\" %d %d \"%s %s %s\"\n" + , rrddim_id(rd) + , rrddim_name(rd) + , rrd_algorithm_name(rd->algorithm) + , rd->multiplier + , rd->divisor + , rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)?"obsolete":"" + , rrddim_option_check(rd, RRDDIM_OPTION_HIDDEN)?"hidden":"" + , rrddim_option_check(rd, RRDDIM_OPTION_DONT_DETECT_RESETS_OR_OVERFLOWS)?"noreset":"" + ); + } + rrddim_foreach_done(rd); + + // send the chart functions + if(stream_has_capability(host->sender, STREAM_CAP_FUNCTIONS)) + rrd_chart_functions_expose_rrdpush(st, wb); + + // send the chart local custom variables + rrdvar_print_to_streaming_custom_chart_variables(st, wb); + + if (stream_has_capability(host->sender, STREAM_CAP_REPLICATION)) { + time_t db_first_time_t, db_last_time_t; + + time_t now = now_realtime_sec(); + rrdset_get_retention_of_tier_for_collected_chart(st, &db_first_time_t, &db_last_time_t, now, 0); + + buffer_sprintf(wb, PLUGINSD_KEYWORD_CHART_DEFINITION_END " %llu %llu %llu\n", + (unsigned long long)db_first_time_t, + (unsigned long long)db_last_time_t, + (unsigned long long)now); + + if(!rrdset_flag_check(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS)) { + rrdset_flag_set(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS); + rrdset_flag_clear(st, RRDSET_FLAG_SENDER_REPLICATION_FINISHED); + rrdhost_sender_replicating_charts_plus_one(st->rrdhost); + } + replication_progress = true; + +#ifdef NETDATA_LOG_REPLICATION_REQUESTS + internal_error(true, "REPLAY: 'host:%s/chart:%s' replication starts", + 
rrdhost_hostname(st->rrdhost), rrdset_id(st)); +#endif + } + + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_METADATA); + + // we can set the exposed flag, after we commit the buffer + // because replication may pick it up prematurely + rrddim_foreach_read(rd, st) { + rrddim_metadata_exposed_upstream(rd, version); + } + rrddim_foreach_done(rd); + rrdset_metadata_exposed_upstream(st, version); + + st->rrdpush.sender.resync_time_s = st->last_collected_time.tv_sec + (remote_clock_resync_iterations * st->update_every); + return replication_progress; +} + +// sends the current chart dimensions +static void rrdpush_send_chart_metrics(BUFFER *wb, RRDSET *st, struct sender_state *s __maybe_unused, RRDSET_FLAGS flags) { + buffer_fast_strcat(wb, "BEGIN \"", 7); + buffer_fast_strcat(wb, rrdset_id(st), string_strlen(st->id)); + buffer_fast_strcat(wb, "\" ", 2); + + if(st->last_collected_time.tv_sec > st->rrdpush.sender.resync_time_s) + buffer_print_uint64(wb, st->usec_since_last_update); + else + buffer_fast_strcat(wb, "0", 1); + + buffer_fast_strcat(wb, "\n", 1); + + RRDDIM *rd; + rrddim_foreach_read(rd, st) { + if(unlikely(!rrddim_check_updated(rd))) + continue; + + if(likely(rrddim_check_upstream_exposed_collector(rd))) { + buffer_fast_strcat(wb, "SET \"", 5); + buffer_fast_strcat(wb, rrddim_id(rd), string_strlen(rd->id)); + buffer_fast_strcat(wb, "\" = ", 4); + buffer_print_int64(wb, rd->collector.collected_value); + buffer_fast_strcat(wb, "\n", 1); + } + else { + internal_error(true, "STREAM: 'host:%s/chart:%s/dim:%s' flag 'exposed' is updated but not exposed", + rrdhost_hostname(st->rrdhost), rrdset_id(st), rrddim_id(rd)); + // we will include it in the next iteration + rrddim_metadata_updated(rd); + } + } + rrddim_foreach_done(rd); + + if(unlikely(flags & RRDSET_FLAG_UPSTREAM_SEND_VARIABLES)) + rrdvar_print_to_streaming_custom_chart_variables(st, wb); + + buffer_fast_strcat(wb, "END\n", 4); +} + +static void rrdpush_sender_thread_spawn(RRDHOST *host); + +// Called 
from the internal collectors to push a chart's definition immediately, for example when the chart is marked obsolete. +bool rrdset_push_chart_definition_now(RRDSET *st) { + RRDHOST *host = st->rrdhost; + + if(unlikely(!rrdhost_can_send_definitions_to_parent(host) + || !should_send_chart_matching(st, rrdset_flag_get(st)))) { + return false; + } + + BUFFER *wb = sender_start(host->sender); + rrdpush_send_chart_definition(wb, st); + sender_thread_buffer_free(); + + return true; +} + +void rrdset_push_metrics_v1(RRDSET_STREAM_BUFFER *rsb, RRDSET *st) { + RRDHOST *host = st->rrdhost; + rrdpush_send_chart_metrics(rsb->wb, st, host->sender, rsb->rrdset_flags); +} + +void rrddim_push_metrics_v2(RRDSET_STREAM_BUFFER *rsb, RRDDIM *rd, usec_t point_end_time_ut, NETDATA_DOUBLE n, SN_FLAGS flags) { + if(!rsb->wb || !rsb->v2 || !netdata_double_isnumber(n) || !does_storage_number_exist(flags)) + return; + + bool with_slots = stream_has_capability(rsb, STREAM_CAP_SLOTS) ? true : false; + NUMBER_ENCODING integer_encoding = stream_has_capability(rsb, STREAM_CAP_IEEE754) ? NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_HEX; + NUMBER_ENCODING doubles_encoding = stream_has_capability(rsb, STREAM_CAP_IEEE754) ? 
NUMBER_ENCODING_BASE64 : NUMBER_ENCODING_DECIMAL; + BUFFER *wb = rsb->wb; + time_t point_end_time_s = (time_t)(point_end_time_ut / USEC_PER_SEC); + if(unlikely(rsb->last_point_end_time_s != point_end_time_s)) { + + if(unlikely(rsb->begin_v2_added)) + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_END_V2 "\n", sizeof(PLUGINSD_KEYWORD_END_V2) - 1 + 1); + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_BEGIN_V2, sizeof(PLUGINSD_KEYWORD_BEGIN_V2) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, rd->rrdset->rrdpush.sender.chart_slot); + } + + buffer_fast_strcat(wb, " '", 2); + buffer_fast_strcat(wb, rrdset_id(rd->rrdset), string_strlen(rd->rrdset->id)); + buffer_fast_strcat(wb, "' ", 2); + buffer_print_uint64_encoded(wb, integer_encoding, rd->rrdset->update_every); + buffer_fast_strcat(wb, " ", 1); + buffer_print_uint64_encoded(wb, integer_encoding, point_end_time_s); + buffer_fast_strcat(wb, " ", 1); + if(point_end_time_s == rsb->wall_clock_time) + buffer_fast_strcat(wb, "#", 1); + else + buffer_print_uint64_encoded(wb, integer_encoding, rsb->wall_clock_time); + buffer_fast_strcat(wb, "\n", 1); + + rsb->last_point_end_time_s = point_end_time_s; + rsb->begin_v2_added = true; + } + + buffer_fast_strcat(wb, PLUGINSD_KEYWORD_SET_V2, sizeof(PLUGINSD_KEYWORD_SET_V2) - 1); + + if(with_slots) { + buffer_fast_strcat(wb, " "PLUGINSD_KEYWORD_SLOT":", sizeof(PLUGINSD_KEYWORD_SLOT) - 1 + 2); + buffer_print_uint64_encoded(wb, integer_encoding, rd->rrdpush.sender.dim_slot); + } + + buffer_fast_strcat(wb, " '", 2); + buffer_fast_strcat(wb, rrddim_id(rd), string_strlen(rd->id)); + buffer_fast_strcat(wb, "' ", 2); + buffer_print_int64_encoded(wb, integer_encoding, rd->collector.last_collected_value); + buffer_fast_strcat(wb, " ", 1); + + if((NETDATA_DOUBLE)rd->collector.last_collected_value == n) + buffer_fast_strcat(wb, "#", 1); + else + buffer_print_netdata_double_encoded(wb, 
doubles_encoding, n); + + buffer_fast_strcat(wb, " ", 1); + buffer_print_sn_flags(wb, flags, true); + buffer_fast_strcat(wb, "\n", 1); +} + +void rrdset_push_metrics_finished(RRDSET_STREAM_BUFFER *rsb, RRDSET *st) { + if(!rsb->wb) + return; + + if(rsb->v2 && rsb->begin_v2_added) { + if(unlikely(rsb->rrdset_flags & RRDSET_FLAG_UPSTREAM_SEND_VARIABLES)) + rrdvar_print_to_streaming_custom_chart_variables(st, rsb->wb); + + buffer_fast_strcat(rsb->wb, PLUGINSD_KEYWORD_END_V2 "\n", sizeof(PLUGINSD_KEYWORD_END_V2) - 1 + 1); + } + + sender_commit(st->rrdhost->sender, rsb->wb, STREAM_TRAFFIC_TYPE_DATA); + + *rsb = (RRDSET_STREAM_BUFFER){ .wb = NULL, }; +} + +RRDSET_STREAM_BUFFER rrdset_push_metric_initialize(RRDSET *st, time_t wall_clock_time) { + RRDHOST *host = st->rrdhost; + + // fetch the flags we need to check with one atomic operation + RRDHOST_FLAGS host_flags = __atomic_load_n(&host->flags, __ATOMIC_SEQ_CST); + + // check if we are not connected + if(unlikely(!(host_flags & RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS))) { + + if(unlikely(!(host_flags & (RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN | RRDHOST_FLAG_RRDPUSH_RECEIVER_DISCONNECTED)))) + rrdpush_sender_thread_spawn(host); + + if(unlikely(!(host_flags & RRDHOST_FLAG_RRDPUSH_SENDER_LOGGED_STATUS))) { + rrdhost_flag_set(host, RRDHOST_FLAG_RRDPUSH_SENDER_LOGGED_STATUS); + nd_log_daemon(NDLP_NOTICE, "STREAM %s [send]: not ready - collected metrics are not sent to parent.", rrdhost_hostname(host)); + } + + return (RRDSET_STREAM_BUFFER) { .wb = NULL, }; + } + else if(unlikely(host_flags & RRDHOST_FLAG_RRDPUSH_SENDER_LOGGED_STATUS)) { + nd_log_daemon(NDLP_INFO, "STREAM %s [send]: sending metrics to parent...", rrdhost_hostname(host)); + rrdhost_flag_clear(host, RRDHOST_FLAG_RRDPUSH_SENDER_LOGGED_STATUS); + } + + if(unlikely(host_flags & RRDHOST_FLAG_GLOBAL_FUNCTIONS_UPDATED)) { + BUFFER *wb = sender_start(host->sender); + rrd_global_functions_expose_rrdpush(host, wb, stream_has_capability(host->sender, STREAM_CAP_DYNCFG)); + 
sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_FUNCTIONS); + } + + bool exposed_upstream = rrdset_check_upstream_exposed(st); + RRDSET_FLAGS rrdset_flags = rrdset_flag_get(st); + bool replication_in_progress = !(rrdset_flags & RRDSET_FLAG_SENDER_REPLICATION_FINISHED); + + if(unlikely((exposed_upstream && replication_in_progress) || + !should_send_chart_matching(st, rrdset_flags))) + return (RRDSET_STREAM_BUFFER) { .wb = NULL, }; + + if(unlikely(!exposed_upstream)) { + BUFFER *wb = sender_start(host->sender); + replication_in_progress = rrdpush_send_chart_definition(wb, st); + } + + if(replication_in_progress) + return (RRDSET_STREAM_BUFFER) { .wb = NULL, }; + + return (RRDSET_STREAM_BUFFER) { + .capabilities = host->sender->capabilities, + .v2 = stream_has_capability(host->sender, STREAM_CAP_INTERPOLATED), + .rrdset_flags = rrdset_flags, + .wb = sender_start(host->sender), + .wall_clock_time = wall_clock_time, + }; +} + +// labels +static int send_labels_callback(const char *name, const char *value, RRDLABEL_SRC ls, void *data) { + BUFFER *wb = (BUFFER *)data; + buffer_sprintf(wb, "LABEL \"%s\" = %d \"%s\"\n", name, ls, value); + return 1; +} + +void rrdpush_send_host_labels(RRDHOST *host) { + if(unlikely(!rrdhost_can_send_definitions_to_parent(host) + || !stream_has_capability(host->sender, STREAM_CAP_HLABELS))) + return; + + BUFFER *wb = sender_start(host->sender); + + rrdlabels_walkthrough_read(host->rrdlabels, send_labels_callback, wb); + buffer_sprintf(wb, "OVERWRITE %s\n", "labels"); + + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_METADATA); + + sender_thread_buffer_free(); +} + +void rrdpush_send_global_functions(RRDHOST *host) { + if(!stream_has_capability(host->sender, STREAM_CAP_FUNCTIONS)) + return; + + if(unlikely(!rrdhost_can_send_definitions_to_parent(host))) + return; + + BUFFER *wb = sender_start(host->sender); + + rrd_global_functions_expose_rrdpush(host, wb, stream_has_capability(host->sender, STREAM_CAP_DYNCFG)); + + 
sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_FUNCTIONS); + + sender_thread_buffer_free(); +} + +void rrdpush_send_claimed_id(RRDHOST *host) { + if(!stream_has_capability(host->sender, STREAM_CAP_CLAIM)) + return; + + if(unlikely(!rrdhost_can_send_definitions_to_parent(host))) + return; + + BUFFER *wb = sender_start(host->sender); + rrdhost_aclk_state_lock(host); + + buffer_sprintf(wb, "CLAIMED_ID %s %s\n", host->machine_guid, (host->aclk_state.claimed_id ? host->aclk_state.claimed_id : "NULL") ); + + rrdhost_aclk_state_unlock(host); + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_METADATA); + + sender_thread_buffer_free(); +} + +int connect_to_one_of_destinations( + RRDHOST *host, + int default_port, + struct timeval *timeout, + size_t *reconnects_counter, + char *connected_to, + size_t connected_to_size, + struct rrdpush_destinations **destination) +{ + int sock = -1; + + for (struct rrdpush_destinations *d = host->destinations; d; d = d->next) { + time_t now = now_realtime_sec(); + + if(nd_thread_signaled_to_cancel()) + return -1; + + if(d->postpone_reconnection_until > now) + continue; + + nd_log(NDLS_DAEMON, NDLP_DEBUG, + "STREAM %s: connecting to '%s' (default port: %d)...", + rrdhost_hostname(host), string2str(d->destination), default_port); + + if (reconnects_counter) + *reconnects_counter += 1; + + d->since = now; + d->attempts++; + sock = connect_to_this(string2str(d->destination), default_port, timeout); + + if (sock != -1) { + if (connected_to && connected_to_size) + strncpyz(connected_to, string2str(d->destination), connected_to_size); + + *destination = d; + + // move the current item to the end of the list + // without this, this destination will break the loop again and again + // not advancing the destinations to find one that may work + DOUBLE_LINKED_LIST_REMOVE_ITEM_UNSAFE(host->destinations, d, prev, next); + DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE(host->destinations, d, prev, next); + + break; + } + } + + return sock; +} + +struct 
destinations_init_tmp { + RRDHOST *host; + struct rrdpush_destinations *list; + int count; +}; + +bool destinations_init_add_one(char *entry, void *data) { + struct destinations_init_tmp *t = data; + + struct rrdpush_destinations *d = callocz(1, sizeof(struct rrdpush_destinations)); + char *colon_ssl = strstr(entry, ":SSL"); + if(colon_ssl) { + *colon_ssl = '\0'; + d->ssl = true; + } + else + d->ssl = false; + + d->destination = string_strdupz(entry); + + __atomic_add_fetch(&netdata_buffers_statistics.rrdhost_senders, sizeof(struct rrdpush_destinations), __ATOMIC_RELAXED); + + DOUBLE_LINKED_LIST_APPEND_ITEM_UNSAFE(t->list, d, prev, next); + + t->count++; + nd_log_daemon(NDLP_INFO, "STREAM: added streaming destination No %d: '%s' to host '%s'", t->count, string2str(d->destination), rrdhost_hostname(t->host)); + + return false; // we return false, so that we will get all defined destinations +} + +void rrdpush_destinations_init(RRDHOST *host) { + if(!host->rrdpush_send_destination) return; + + rrdpush_destinations_free(host); + + struct destinations_init_tmp t = { + .host = host, + .list = NULL, + .count = 0, + }; + + foreach_entry_in_connection_string(host->rrdpush_send_destination, destinations_init_add_one, &t); + + host->destinations = t.list; +} + +void rrdpush_destinations_free(RRDHOST *host) { + while (host->destinations) { + struct rrdpush_destinations *tmp = host->destinations; + DOUBLE_LINKED_LIST_REMOVE_ITEM_UNSAFE(host->destinations, tmp, prev, next); + string_freez(tmp->destination); + freez(tmp); + __atomic_sub_fetch(&netdata_buffers_statistics.rrdhost_senders, sizeof(struct rrdpush_destinations), __ATOMIC_RELAXED); + } + + host->destinations = NULL; +} + +// ---------------------------------------------------------------------------- +// rrdpush sender thread + +// Either the receiver lost the connection or the host is being destroyed. +// The sender mutex guards thread creation, any spurious data is wiped on reconnection. 
+void rrdpush_sender_thread_stop(RRDHOST *host, STREAM_HANDSHAKE reason, bool wait) { + if (!host->sender) + return; + + sender_lock(host->sender); + + if(rrdhost_flag_check(host, RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN)) { + + host->sender->exit.shutdown = true; + host->sender->exit.reason = reason; + + // signal it to cancel + nd_thread_signal_cancel(host->rrdpush_sender_thread); + } + + sender_unlock(host->sender); + + if(wait) { + sender_lock(host->sender); + while(host->sender->tid) { + sender_unlock(host->sender); + sleep_usec(10 * USEC_PER_MS); + sender_lock(host->sender); + } + sender_unlock(host->sender); + } +} + +// ---------------------------------------------------------------------------- +// rrdpush receiver thread + +static void rrdpush_sender_thread_spawn(RRDHOST *host) { + sender_lock(host->sender); + + if(!rrdhost_flag_check(host, RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN)) { + char tag[NETDATA_THREAD_TAG_MAX + 1]; + snprintfz(tag, NETDATA_THREAD_TAG_MAX, THREAD_TAG_STREAM_SENDER "[%s]", rrdhost_hostname(host)); + + host->rrdpush_sender_thread = nd_thread_create(tag, NETDATA_THREAD_OPTION_DEFAULT, + rrdpush_sender_thread, (void *)host->sender); + if(!host->rrdpush_sender_thread) + nd_log_daemon(NDLP_ERR, "STREAM %s [send]: failed to create new thread for client.", rrdhost_hostname(host)); + else + rrdhost_flag_set(host, RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN); + } + + sender_unlock(host->sender); +} + +int rrdpush_receiver_permission_denied(struct web_client *w) { + // we always respond with the same message and error code + // to prevent an attacker from gaining info about the error + buffer_flush(w->response.data); + buffer_strcat(w->response.data, START_STREAMING_ERROR_NOT_PERMITTED); + return HTTP_RESP_UNAUTHORIZED; +} + +int rrdpush_receiver_too_busy_now(struct web_client *w) { + // we always respond with the same message and error code + // to prevent an attacker from gaining info about the error + buffer_flush(w->response.data); + 
buffer_strcat(w->response.data, START_STREAMING_ERROR_BUSY_TRY_LATER); + return HTTP_RESP_SERVICE_UNAVAILABLE; +} + +static void rrdpush_receiver_takeover_web_connection(struct web_client *w, struct receiver_state *rpt) { + rpt->fd = w->ifd; + +#ifdef ENABLE_HTTPS + rpt->ssl.conn = w->ssl.conn; + rpt->ssl.state = w->ssl.state; + + w->ssl = NETDATA_SSL_UNSET_CONNECTION; +#endif + + WEB_CLIENT_IS_DEAD(w); + + if(web_server_mode == WEB_SERVER_MODE_STATIC_THREADED) { + web_client_flag_set(w, WEB_CLIENT_FLAG_DONT_CLOSE_SOCKET); + } + else { + if(w->ifd == w->ofd) + w->ifd = w->ofd = -1; + else + w->ifd = -1; + } + + buffer_flush(w->response.data); +} + +void *rrdpush_receiver_thread(void *ptr); +int rrdpush_receiver_thread_spawn(struct web_client *w, char *decoded_query_string, void *h2o_ctx __maybe_unused) { + + if(!service_running(ABILITY_STREAMING_CONNECTIONS)) + return rrdpush_receiver_too_busy_now(w); + + struct receiver_state *rpt = callocz(1, sizeof(*rpt)); + rpt->last_msg_t = now_monotonic_sec(); + rpt->hops = 1; + + rpt->capabilities = STREAM_CAP_INVALID; + +#ifdef ENABLE_H2O + rpt->h2o_ctx = h2o_ctx; +#endif + + __atomic_add_fetch(&netdata_buffers_statistics.rrdhost_receivers, sizeof(*rpt), __ATOMIC_RELAXED); + __atomic_add_fetch(&netdata_buffers_statistics.rrdhost_allocations_size, sizeof(struct rrdhost_system_info), __ATOMIC_RELAXED); + + rpt->system_info = callocz(1, sizeof(struct rrdhost_system_info)); + rpt->system_info->hops = rpt->hops; + + rpt->fd = -1; + rpt->client_ip = strdupz(w->client_ip); + rpt->client_port = strdupz(w->client_port); + +#ifdef ENABLE_HTTPS + rpt->ssl = NETDATA_SSL_UNSET_CONNECTION; +#endif + + rpt->config.update_every = default_rrd_update_every; + + // parse the parameters and fill rpt and rpt->system_info + + while(decoded_query_string) { + char *value = strsep_skip_consecutive_separators(&decoded_query_string, "&"); + if(!value || !*value) continue; + + char *name = strsep_skip_consecutive_separators(&value, "="); + if(!name || 
!*name) continue; + if(!value || !*value) continue; + + if(!strcmp(name, "key") && !rpt->key) + rpt->key = strdupz(value); + + else if(!strcmp(name, "hostname") && !rpt->hostname) + rpt->hostname = strdupz(value); + + else if(!strcmp(name, "registry_hostname") && !rpt->registry_hostname) + rpt->registry_hostname = strdupz(value); + + else if(!strcmp(name, "machine_guid") && !rpt->machine_guid) + rpt->machine_guid = strdupz(value); + + else if(!strcmp(name, "update_every")) + rpt->config.update_every = (int)strtoul(value, NULL, 0); + + else if(!strcmp(name, "os") && !rpt->os) + rpt->os = strdupz(value); + + else if(!strcmp(name, "timezone") && !rpt->timezone) + rpt->timezone = strdupz(value); + + else if(!strcmp(name, "abbrev_timezone") && !rpt->abbrev_timezone) + rpt->abbrev_timezone = strdupz(value); + + else if(!strcmp(name, "utc_offset")) + rpt->utc_offset = (int32_t)strtol(value, NULL, 0); + + else if(!strcmp(name, "hops")) + rpt->hops = rpt->system_info->hops = (uint16_t) strtoul(value, NULL, 0); + + else if(!strcmp(name, "ml_capable")) + rpt->system_info->ml_capable = strtoul(value, NULL, 0); + + else if(!strcmp(name, "ml_enabled")) + rpt->system_info->ml_enabled = strtoul(value, NULL, 0); + + else if(!strcmp(name, "mc_version")) + rpt->system_info->mc_version = strtoul(value, NULL, 0); + + else if(!strcmp(name, "ver") && (rpt->capabilities & STREAM_CAP_INVALID)) + rpt->capabilities = convert_stream_version_to_capabilities(strtoul(value, NULL, 0), NULL, false); + + else { + // An old Netdata child does not have a compatible streaming protocol, map to something sane. 
+ if (!strcmp(name, "NETDATA_SYSTEM_OS_NAME")) + name = "NETDATA_HOST_OS_NAME"; + + else if (!strcmp(name, "NETDATA_SYSTEM_OS_ID")) + name = "NETDATA_HOST_OS_ID"; + + else if (!strcmp(name, "NETDATA_SYSTEM_OS_ID_LIKE")) + name = "NETDATA_HOST_OS_ID_LIKE"; + + else if (!strcmp(name, "NETDATA_SYSTEM_OS_VERSION")) + name = "NETDATA_HOST_OS_VERSION"; + + else if (!strcmp(name, "NETDATA_SYSTEM_OS_VERSION_ID")) + name = "NETDATA_HOST_OS_VERSION_ID"; + + else if (!strcmp(name, "NETDATA_SYSTEM_OS_DETECTION")) + name = "NETDATA_HOST_OS_DETECTION"; + + else if(!strcmp(name, "NETDATA_PROTOCOL_VERSION") && (rpt->capabilities & STREAM_CAP_INVALID)) + rpt->capabilities = convert_stream_version_to_capabilities(1, NULL, false); + + if (unlikely(rrdhost_set_system_info_variable(rpt->system_info, name, value))) { + nd_log_daemon(NDLP_NOTICE, "STREAM '%s' [receive from [%s]:%s]: " + "request has parameter '%s' = '%s', which is not used." + , (rpt->hostname && *rpt->hostname) ? rpt->hostname : "-" + , rpt->client_ip, rpt->client_port + , name, value); + } + } + } + + if (rpt->capabilities & STREAM_CAP_INVALID) + // no version is supplied, assume version 0; + rpt->capabilities = convert_stream_version_to_capabilities(0, NULL, false); + + // find the program name and version + if(w->user_agent && w->user_agent[0]) { + char *t = strchr(w->user_agent, '/'); + if(t && *t) { + *t = '\0'; + t++; + } + + rpt->program_name = strdupz(w->user_agent); + if(t && *t) rpt->program_version = strdupz(t); + } + + // check if we should accept this connection + + if(!rpt->key || !*rpt->key) { + rrdpush_receive_log_status( + rpt, "request without an API key, rejecting connection", + RRDPUSH_STATUS_NO_API_KEY, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + if(!rpt->hostname || !*rpt->hostname) { + rrdpush_receive_log_status( + rpt, "request without a hostname, rejecting connection", + RRDPUSH_STATUS_NO_HOSTNAME, NDLP_WARNING); + + 
receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + if(!rpt->registry_hostname) + rpt->registry_hostname = strdupz(rpt->hostname); + + if(!rpt->machine_guid || !*rpt->machine_guid) { + rrdpush_receive_log_status( + rpt, "request without a machine GUID, rejecting connection", + RRDPUSH_STATUS_NO_MACHINE_GUID, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + { + char buf[GUID_LEN + 1]; + + if (regenerate_guid(rpt->key, buf) == -1) { + rrdpush_receive_log_status( + rpt, "API key is not a valid UUID (use the command uuidgen to generate one)", + RRDPUSH_STATUS_INVALID_API_KEY, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + if (regenerate_guid(rpt->machine_guid, buf) == -1) { + rrdpush_receive_log_status( + rpt, "machine GUID is not a valid UUID", + RRDPUSH_STATUS_INVALID_MACHINE_GUID, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + } + + const char *api_key_type = appconfig_get(&stream_config, rpt->key, "type", "api"); + if(!api_key_type || !*api_key_type) api_key_type = "unknown"; + if(strcmp(api_key_type, "api") != 0) { + rrdpush_receive_log_status( + rpt, "API key is a machine GUID", + RRDPUSH_STATUS_INVALID_API_KEY, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + if(!appconfig_get_boolean(&stream_config, rpt->key, "enabled", 0)) { + rrdpush_receive_log_status( + rpt, "API key is not enabled", + RRDPUSH_STATUS_API_KEY_DISABLED, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + { + SIMPLE_PATTERN *key_allow_from = simple_pattern_create( + appconfig_get(&stream_config, rpt->key, "allow from", "*"), + NULL, SIMPLE_PATTERN_EXACT, true); + + if(key_allow_from) { + if(!simple_pattern_matches(key_allow_from, w->client_ip)) { + simple_pattern_free(key_allow_from); + + 
rrdpush_receive_log_status( + rpt, "API key is not allowed from this IP", + RRDPUSH_STATUS_NOT_ALLOWED_IP, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + simple_pattern_free(key_allow_from); + } + } + + { + const char *machine_guid_type = appconfig_get(&stream_config, rpt->machine_guid, "type", "machine"); + if (!machine_guid_type || !*machine_guid_type) machine_guid_type = "unknown"; + + if (strcmp(machine_guid_type, "machine") != 0) { + rrdpush_receive_log_status( + rpt, "machine GUID is an API key", + RRDPUSH_STATUS_INVALID_MACHINE_GUID, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + } + + if(!appconfig_get_boolean(&stream_config, rpt->machine_guid, "enabled", 1)) { + rrdpush_receive_log_status( + rpt, "machine GUID is not enabled", + RRDPUSH_STATUS_MACHINE_GUID_DISABLED, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + { + SIMPLE_PATTERN *machine_allow_from = simple_pattern_create( + appconfig_get(&stream_config, rpt->machine_guid, "allow from", "*"), + NULL, SIMPLE_PATTERN_EXACT, true); + + if(machine_allow_from) { + if(!simple_pattern_matches(machine_allow_from, w->client_ip)) { + simple_pattern_free(machine_allow_from); + + rrdpush_receive_log_status( + rpt, "machine GUID is not allowed from this IP", + RRDPUSH_STATUS_NOT_ALLOWED_IP, NDLP_WARNING); + + receiver_state_free(rpt); + return rrdpush_receiver_permission_denied(w); + } + + simple_pattern_free(machine_allow_from); + } + } + + if (strcmp(rpt->machine_guid, localhost->machine_guid) == 0) { + + rrdpush_receiver_takeover_web_connection(w, rpt); + + rrdpush_receive_log_status( + rpt, "machine GUID is my own", + RRDPUSH_STATUS_LOCALHOST, NDLP_DEBUG); + + char initial_response[HTTP_HEADER_SIZE + 1]; + snprintfz(initial_response, HTTP_HEADER_SIZE, "%s", START_STREAMING_ERROR_SAME_LOCALHOST); + + if(send_timeout( +#ifdef ENABLE_HTTPS + &rpt->ssl, 
+#endif + rpt->fd, initial_response, strlen(initial_response), 0, 60) != (ssize_t)strlen(initial_response)) { + + nd_log_daemon(NDLP_ERR, "STREAM '%s' [receive from [%s]:%s]: " + "failed to reply." + , rpt->hostname + , rpt->client_ip, rpt->client_port + ); + } + + receiver_state_free(rpt); + return HTTP_RESP_OK; + } + + if(unlikely(web_client_streaming_rate_t > 0)) { + static SPINLOCK spinlock = NETDATA_SPINLOCK_INITIALIZER; + static time_t last_stream_accepted_t = 0; + + time_t now = now_realtime_sec(); + spinlock_lock(&spinlock); + + if(unlikely(last_stream_accepted_t == 0)) + last_stream_accepted_t = now; + + if(now - last_stream_accepted_t < web_client_streaming_rate_t) { + spinlock_unlock(&spinlock); + + char msg[100 + 1]; + snprintfz(msg, sizeof(msg) - 1, + "rate limit, will accept new connection in %ld secs", + (long)(web_client_streaming_rate_t - (now - last_stream_accepted_t))); + + rrdpush_receive_log_status( + rpt, msg, + RRDPUSH_STATUS_RATE_LIMIT, NDLP_NOTICE); + + receiver_state_free(rpt); + return rrdpush_receiver_too_busy_now(w); + } + + last_stream_accepted_t = now; + spinlock_unlock(&spinlock); + } + + /* + * Quick path for rejecting multiple connections. The lock taken is fine-grained - it only protects the receiver + * pointer within the host (if a host exists). This protects against multiple concurrent web requests hitting + * separate threads within the web-server and landing here. The lock guards the thread-shutdown sequence that + * detaches the receiver from the host. If the host is being created (first time-access) then we also use the + * lock to prevent race-hazard (two threads try to create the host concurrently, one wins and the other does a + * lookup to the now-attached structure). 
+ */ + + { + time_t age = 0; + bool receiver_stale = false; + bool receiver_working = false; + + rrd_rdlock(); + RRDHOST *host = rrdhost_find_by_guid(rpt->machine_guid); + if (unlikely(host && rrdhost_flag_check(host, RRDHOST_FLAG_ARCHIVED))) /* Ignore archived hosts. */ + host = NULL; + + if (host) { + netdata_mutex_lock(&host->receiver_lock); + if (host->receiver) { + age = now_monotonic_sec() - host->receiver->last_msg_t; + + if (age < 30) + receiver_working = true; + else + receiver_stale = true; + } + netdata_mutex_unlock(&host->receiver_lock); + } + rrd_rdunlock(); + + if (receiver_stale && stop_streaming_receiver(host, STREAM_HANDSHAKE_DISCONNECT_STALE_RECEIVER)) { + // we stopped the receiver + // we can proceed with this connection + receiver_stale = false; + + nd_log_daemon(NDLP_NOTICE, "STREAM '%s' [receive from [%s]:%s]: " + "stopped previous stale receiver to accept this one." + , rpt->hostname + , rpt->client_ip, rpt->client_port + ); + } + + if (receiver_working || receiver_stale) { + // another receiver is already connected + // try again later + + char msg[200 + 1]; + snprintfz(msg, sizeof(msg) - 1, + "multiple connections for same host, " + "old connection was last used %ld secs ago%s", + (long)age, receiver_stale ? 
" (signaled old receiver to stop)" : " (new connection not accepted)"); + + rrdpush_receive_log_status( + rpt, msg, + RRDPUSH_STATUS_ALREADY_CONNECTED, NDLP_DEBUG); + + // Have not set WEB_CLIENT_FLAG_DONT_CLOSE_SOCKET - caller should clean up + buffer_flush(w->response.data); + buffer_strcat(w->response.data, START_STREAMING_ERROR_ALREADY_STREAMING); + receiver_state_free(rpt); + return HTTP_RESP_CONFLICT; + } + } + + rrdpush_receiver_takeover_web_connection(w, rpt); + + char tag[NETDATA_THREAD_TAG_MAX + 1]; + snprintfz(tag, NETDATA_THREAD_TAG_MAX, THREAD_TAG_STREAM_RECEIVER "[%s]", rpt->hostname); + tag[NETDATA_THREAD_TAG_MAX] = '\0'; + + rpt->thread = nd_thread_create(tag, NETDATA_THREAD_OPTION_DEFAULT, rrdpush_receiver_thread, (void *)rpt); + if(!rpt->thread) { + rrdpush_receive_log_status( + rpt, "can't create receiver thread", + RRDPUSH_STATUS_INTERNAL_SERVER_ERROR, NDLP_ERR); + + buffer_flush(w->response.data); + buffer_strcat(w->response.data, "Can't handle this request"); + receiver_state_free(rpt); + return HTTP_RESP_INTERNAL_SERVER_ERROR; + } + + // prevent the caller from closing the streaming socket + return HTTP_RESP_OK; +} + +void rrdpush_reset_destinations_postpone_time(RRDHOST *host) { + uint32_t wait = (host->sender) ? 
host->sender->reconnect_delay : 5; + time_t now = now_realtime_sec(); + for (struct rrdpush_destinations *d = host->destinations; d; d = d->next) + d->postpone_reconnection_until = now + wait; +} + +static struct { + STREAM_HANDSHAKE err; + const char *str; +} handshake_errors[] = { + { STREAM_HANDSHAKE_OK_V3, "CONNECTED" }, + { STREAM_HANDSHAKE_OK_V2, "CONNECTED" }, + { STREAM_HANDSHAKE_OK_V1, "CONNECTED" }, + { STREAM_HANDSHAKE_NEVER, "" }, + { STREAM_HANDSHAKE_ERROR_BAD_HANDSHAKE, "BAD HANDSHAKE" }, + { STREAM_HANDSHAKE_ERROR_LOCALHOST, "LOCALHOST" }, + { STREAM_HANDSHAKE_ERROR_ALREADY_CONNECTED, "ALREADY CONNECTED" }, + { STREAM_HANDSHAKE_ERROR_DENIED, "DENIED" }, + { STREAM_HANDSHAKE_ERROR_SEND_TIMEOUT, "SEND TIMEOUT" }, + { STREAM_HANDSHAKE_ERROR_RECEIVE_TIMEOUT, "RECEIVE TIMEOUT" }, + { STREAM_HANDSHAKE_ERROR_INVALID_CERTIFICATE, "INVALID CERTIFICATE" }, + { STREAM_HANDSHAKE_ERROR_SSL_ERROR, "SSL ERROR" }, + { STREAM_HANDSHAKE_ERROR_CANT_CONNECT, "CANT CONNECT" }, + { STREAM_HANDSHAKE_BUSY_TRY_LATER, "BUSY TRY LATER" }, + { STREAM_HANDSHAKE_INTERNAL_ERROR, "INTERNAL ERROR" }, + { STREAM_HANDSHAKE_INITIALIZATION, "REMOTE IS INITIALIZING" }, + { STREAM_HANDSHAKE_DISCONNECT_HOST_CLEANUP, "DISCONNECTED HOST CLEANUP" }, + { STREAM_HANDSHAKE_DISCONNECT_STALE_RECEIVER, "DISCONNECTED STALE RECEIVER" }, + { STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN, "DISCONNECTED SHUTDOWN REQUESTED" }, + { STREAM_HANDSHAKE_DISCONNECT_NETDATA_EXIT, "DISCONNECTED NETDATA EXIT" }, + { STREAM_HANDSHAKE_DISCONNECT_PARSER_EXIT, "DISCONNECTED PARSE ENDED" }, + {STREAM_HANDSHAKE_DISCONNECT_UNKNOWN_SOCKET_READ_ERROR, "DISCONNECTED UNKNOWN SOCKET READ ERROR" }, + { STREAM_HANDSHAKE_DISCONNECT_PARSER_FAILED, "DISCONNECTED PARSE ERROR" }, + { STREAM_HANDSHAKE_DISCONNECT_RECEIVER_LEFT, "DISCONNECTED RECEIVER LEFT" }, + { STREAM_HANDSHAKE_DISCONNECT_ORPHAN_HOST, "DISCONNECTED ORPHAN HOST" }, + { STREAM_HANDSHAKE_NON_STREAMABLE_HOST, "NON STREAMABLE HOST" }, + { 
STREAM_HANDSHAKE_DISCONNECT_NOT_SUFFICIENT_READ_BUFFER, "DISCONNECTED NOT SUFFICIENT READ BUFFER" }, + {STREAM_HANDSHAKE_DISCONNECT_SOCKET_EOF, "DISCONNECTED SOCKET EOF" }, + {STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_FAILED, "DISCONNECTED SOCKET READ FAILED" }, + {STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_TIMEOUT, "DISCONNECTED SOCKET READ TIMEOUT" }, + { 0, NULL }, +}; + +const char *stream_handshake_error_to_string(STREAM_HANDSHAKE handshake_error) { + if(handshake_error >= STREAM_HANDSHAKE_OK_V1) + // handshake_error is the whole version / capabilities number + return "CONNECTED"; + + for(size_t i = 0; handshake_errors[i].str ; i++) { + if(handshake_error == handshake_errors[i].err) + return handshake_errors[i].str; + } + + return "UNKNOWN"; +} + +static struct { + STREAM_CAPABILITIES cap; + const char *str; +} capability_names[] = { + {STREAM_CAP_V1, "V1" }, + {STREAM_CAP_V2, "V2" }, + {STREAM_CAP_VN, "VN" }, + {STREAM_CAP_VCAPS, "VCAPS" }, + {STREAM_CAP_HLABELS, "HLABELS" }, + {STREAM_CAP_CLAIM, "CLAIM" }, + {STREAM_CAP_CLABELS, "CLABELS" }, + {STREAM_CAP_LZ4, "LZ4" }, + {STREAM_CAP_FUNCTIONS, "FUNCTIONS" }, + {STREAM_CAP_REPLICATION, "REPLICATION" }, + {STREAM_CAP_BINARY, "BINARY" }, + {STREAM_CAP_INTERPOLATED, "INTERPOLATED" }, + {STREAM_CAP_IEEE754, "IEEE754" }, + {STREAM_CAP_DATA_WITH_ML, "ML" }, + {STREAM_CAP_DYNCFG, "DYNCFG" }, + {STREAM_CAP_SLOTS, "SLOTS" }, + {STREAM_CAP_ZSTD, "ZSTD" }, + {STREAM_CAP_GZIP, "GZIP" }, + {STREAM_CAP_BROTLI, "BROTLI" }, + {STREAM_CAP_PROGRESS, "PROGRESS" }, + {0 , NULL }, +}; + +void stream_capabilities_to_string(BUFFER *wb, STREAM_CAPABILITIES caps) { + for(size_t i = 0; capability_names[i].str ; i++) { + if(caps & capability_names[i].cap) { + buffer_strcat(wb, capability_names[i].str); + buffer_strcat(wb, " "); + } + } +} + +void stream_capabilities_to_json_array(BUFFER *wb, STREAM_CAPABILITIES caps, const char *key) { + if(key) + buffer_json_member_add_array(wb, key); + else + buffer_json_add_array_item_array(wb); + + 
for(size_t i = 0; capability_names[i].str ; i++) { + if(caps & capability_names[i].cap) + buffer_json_add_array_item_string(wb, capability_names[i].str); + } + + buffer_json_array_close(wb); +} + +void log_receiver_capabilities(struct receiver_state *rpt) { + BUFFER *wb = buffer_create(100, NULL); + stream_capabilities_to_string(wb, rpt->capabilities); + + nd_log_daemon(NDLP_INFO, "STREAM %s [receive from [%s]:%s]: established link with negotiated capabilities: %s", + rrdhost_hostname(rpt->host), rpt->client_ip, rpt->client_port, buffer_tostring(wb)); + + buffer_free(wb); +} + +void log_sender_capabilities(struct sender_state *s) { + BUFFER *wb = buffer_create(100, NULL); + stream_capabilities_to_string(wb, s->capabilities); + + nd_log_daemon(NDLP_INFO, "STREAM %s [send to %s]: established link with negotiated capabilities: %s", + rrdhost_hostname(s->host), s->connected_to, buffer_tostring(wb)); + + buffer_free(wb); +} + +STREAM_CAPABILITIES stream_our_capabilities(RRDHOST *host, bool sender) { + STREAM_CAPABILITIES disabled_capabilities = globally_disabled_capabilities; + + if(host && sender) { + // we have DATA_WITH_ML capability + // we should remove the DATA_WITH_ML capability if our database does not have anomaly info + // this can happen under these conditions: 1. we don't run ML, and 2. 
we don't receive ML + netdata_mutex_lock(&host->receiver_lock); + + if(!ml_host_running(host) && !stream_has_capability(host->receiver, STREAM_CAP_DATA_WITH_ML)) + disabled_capabilities |= STREAM_CAP_DATA_WITH_ML; + + netdata_mutex_unlock(&host->receiver_lock); + + if(host->sender) + disabled_capabilities |= host->sender->disabled_capabilities; + } + + return (STREAM_CAP_V1 | + STREAM_CAP_V2 | + STREAM_CAP_VN | + STREAM_CAP_VCAPS | + STREAM_CAP_HLABELS | + STREAM_CAP_CLAIM | + STREAM_CAP_CLABELS | + STREAM_CAP_FUNCTIONS | + STREAM_CAP_REPLICATION | + STREAM_CAP_BINARY | + STREAM_CAP_INTERPOLATED | + STREAM_CAP_SLOTS | + STREAM_CAP_PROGRESS | + STREAM_CAP_COMPRESSIONS_AVAILABLE | + STREAM_CAP_DYNCFG | + STREAM_CAP_IEEE754 | + STREAM_CAP_DATA_WITH_ML | + 0) & ~disabled_capabilities; +} + +STREAM_CAPABILITIES convert_stream_version_to_capabilities(int32_t version, RRDHOST *host, bool sender) { + STREAM_CAPABILITIES caps = 0; + + if(version <= 1) caps = STREAM_CAP_V1; + else if(version < STREAM_OLD_VERSION_CLAIM) caps = STREAM_CAP_V2 | STREAM_CAP_HLABELS; + else if(version <= STREAM_OLD_VERSION_CLAIM) caps = STREAM_CAP_VN | STREAM_CAP_HLABELS | STREAM_CAP_CLAIM; + else if(version <= STREAM_OLD_VERSION_CLABELS) caps = STREAM_CAP_VN | STREAM_CAP_HLABELS | STREAM_CAP_CLAIM | STREAM_CAP_CLABELS; + else if(version <= STREAM_OLD_VERSION_LZ4) caps = STREAM_CAP_VN | STREAM_CAP_HLABELS | STREAM_CAP_CLAIM | STREAM_CAP_CLABELS | STREAM_CAP_LZ4_AVAILABLE; + else caps = version; + + if(caps & STREAM_CAP_VCAPS) + caps &= ~(STREAM_CAP_V1|STREAM_CAP_V2|STREAM_CAP_VN); + + if(caps & STREAM_CAP_VN) + caps &= ~(STREAM_CAP_V1|STREAM_CAP_V2); + + if(caps & STREAM_CAP_V2) + caps &= ~(STREAM_CAP_V1); + + STREAM_CAPABILITIES common_caps = caps & stream_our_capabilities(host, sender); + + if(!(common_caps & STREAM_CAP_INTERPOLATED)) + // DATA WITH ML requires INTERPOLATED + common_caps &= ~STREAM_CAP_DATA_WITH_ML; + + return common_caps; +} + +int32_t stream_capabilities_to_vn(uint32_t caps) { 
+ if(caps & STREAM_CAP_LZ4) return STREAM_OLD_VERSION_LZ4; + if(caps & STREAM_CAP_CLABELS) return STREAM_OLD_VERSION_CLABELS; + return STREAM_OLD_VERSION_CLAIM; // if(caps & STREAM_CAP_CLAIM) +} diff --git a/src/streaming/rrdpush.h b/src/streaming/rrdpush.h new file mode 100644 index 000000000..d55a07675 --- /dev/null +++ b/src/streaming/rrdpush.h @@ -0,0 +1,761 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_RRDPUSH_H +#define NETDATA_RRDPUSH_H 1 + +#include "libnetdata/libnetdata.h" +#include "daemon/common.h" +#include "web/server/web_client.h" +#include "database/rrdfunctions.h" +#include "database/rrd.h" + +#define CONNECTED_TO_SIZE 100 +#define CBUFFER_INITIAL_SIZE (16 * 1024) +#define THREAD_BUFFER_INITIAL_SIZE (CBUFFER_INITIAL_SIZE / 2) + +// ---------------------------------------------------------------------------- +// obsolete versions - do not use anymore + +#define STREAM_OLD_VERSION_CLAIM 3 +#define STREAM_OLD_VERSION_CLABELS 4 +#define STREAM_OLD_VERSION_LZ4 5 + +// ---------------------------------------------------------------------------- +// capabilities negotiation + +typedef enum { + STREAM_CAP_NONE = 0, + + // do not use the first 3 bits + // they used to be versions 1, 2 and 3 + // before we introduce capabilities + + STREAM_CAP_V1 = (1 << 3), // v1 = the oldest protocol + STREAM_CAP_V2 = (1 << 4), // v2 = the second version of the protocol (with host labels) + STREAM_CAP_VN = (1 << 5), // version negotiation supported (for versions 3, 4, 5 of the protocol) + // v3 = claiming supported + // v4 = chart labels supported + // v5 = lz4 compression supported + STREAM_CAP_VCAPS = (1 << 6), // capabilities negotiation supported + STREAM_CAP_HLABELS = (1 << 7), // host labels supported + STREAM_CAP_CLAIM = (1 << 8), // claiming supported + STREAM_CAP_CLABELS = (1 << 9), // chart labels supported + STREAM_CAP_LZ4 = (1 << 10), // lz4 compression supported + STREAM_CAP_FUNCTIONS = (1 << 11), // plugin functions supported + 
STREAM_CAP_REPLICATION = (1 << 12), // replication supported + STREAM_CAP_BINARY = (1 << 13), // streaming supports binary data + STREAM_CAP_INTERPOLATED = (1 << 14), // streaming supports interpolated streaming of values + STREAM_CAP_IEEE754 = (1 << 15), // streaming supports binary/hex transfer of double values + STREAM_CAP_DATA_WITH_ML = (1 << 16), // streaming supports transferring anomaly bit + // STREAM_CAP_DYNCFG = (1 << 17), // leave this unused for as long as possible + STREAM_CAP_SLOTS = (1 << 18), // the sender can appoint a unique slot for each chart + STREAM_CAP_ZSTD = (1 << 19), // ZSTD compression supported + STREAM_CAP_GZIP = (1 << 20), // GZIP compression supported + STREAM_CAP_BROTLI = (1 << 21), // BROTLI compression supported + STREAM_CAP_PROGRESS = (1 << 22), // Functions PROGRESS support + STREAM_CAP_DYNCFG = (1 << 23), // support for DYNCFG + + STREAM_CAP_INVALID = (1 << 30), // used as an invalid value for capabilities when this is set + // this must be signed int, so don't use the last bit + // needed for negotiating errors between parent and child +} STREAM_CAPABILITIES; + +#ifdef ENABLE_LZ4 +#define STREAM_CAP_LZ4_AVAILABLE STREAM_CAP_LZ4 +#else +#define STREAM_CAP_LZ4_AVAILABLE 0 +#endif // ENABLE_LZ4 + +#ifdef ENABLE_ZSTD +#define STREAM_CAP_ZSTD_AVAILABLE STREAM_CAP_ZSTD +#else +#define STREAM_CAP_ZSTD_AVAILABLE 0 +#endif // ENABLE_ZSTD + +#ifdef ENABLE_BROTLI +#define STREAM_CAP_BROTLI_AVAILABLE STREAM_CAP_BROTLI +#else +#define STREAM_CAP_BROTLI_AVAILABLE 0 +#endif // ENABLE_BROTLI + +#define STREAM_CAP_COMPRESSIONS_AVAILABLE (STREAM_CAP_LZ4_AVAILABLE|STREAM_CAP_ZSTD_AVAILABLE|STREAM_CAP_BROTLI_AVAILABLE|STREAM_CAP_GZIP) + +extern STREAM_CAPABILITIES globally_disabled_capabilities; + +STREAM_CAPABILITIES stream_our_capabilities(RRDHOST *host, bool sender); + +#define stream_has_capability(rpt, capability) ((rpt) && ((rpt)->capabilities & (capability)) == (capability)) + +static inline bool 
stream_has_more_than_one_capability_of(STREAM_CAPABILITIES caps, STREAM_CAPABILITIES mask) { + STREAM_CAPABILITIES common = (STREAM_CAPABILITIES)(caps & mask); + return (common & (common - 1)) != 0 && common != 0; +} + +// ---------------------------------------------------------------------------- +// stream handshake + +#define HTTP_HEADER_SIZE 8192 + +#define STREAMING_PROTOCOL_VERSION "1.1" +#define START_STREAMING_PROMPT_V1 "Hit me baby, push them over..." +#define START_STREAMING_PROMPT_V2 "Hit me baby, push them over and bring the host labels..." +#define START_STREAMING_PROMPT_VN "Hit me baby, push them over with the version=" + +#define START_STREAMING_ERROR_SAME_LOCALHOST "Don't hit me baby, you are trying to stream my localhost back" +#define START_STREAMING_ERROR_ALREADY_STREAMING "This GUID is already streaming to this server" +#define START_STREAMING_ERROR_NOT_PERMITTED "You are not permitted to access this. Check the logs for more info." +#define START_STREAMING_ERROR_BUSY_TRY_LATER "The server is too busy now to accept this request. Try later." +#define START_STREAMING_ERROR_INTERNAL_ERROR "The server encountered an internal error. Try later." +#define START_STREAMING_ERROR_INITIALIZATION "The server is initializing. Try later." 
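The `stream_has_more_than_one_capability_of()` helper defined earlier in this header relies on the `x & (x - 1)` idiom, which clears the lowest set bit of `x`; a non-zero result means at least two bits of the mask were present. The following is a minimal standalone sketch of that idiom, not the Netdata implementation. The `CAP_*` names and their values are placeholders chosen for illustration and do not correspond to the real `STREAM_CAP_*` constants.

```c
#include <stdbool.h>
#include <stdint.h>

// Hypothetical capability bits, for illustration only.
#define CAP_LZ4   (1u << 0)
#define CAP_ZSTD  (1u << 1)
#define CAP_GZIP  (1u << 2)
#define CAP_OTHER (1u << 3)

// Hypothetical mask grouping the compression bits.
#define CAP_COMPRESSIONS (CAP_LZ4 | CAP_ZSTD | CAP_GZIP)

// Returns true when more than one bit of `mask` is set in `caps`.
static bool has_more_than_one_of(uint32_t caps, uint32_t mask) {
    uint32_t common = caps & mask;

    // common & (common - 1) clears the lowest set bit of common.
    // If anything remains, a second bit was set. When common is 0,
    // the expression is 0 as well, so the result is false.
    return (common & (common - 1)) != 0;
}
```

A caller could use such a check to detect that several compression algorithms are still enabled at once, for example `has_more_than_one_of(caps, CAP_COMPRESSIONS)`, before selecting a single one.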
+ +#define RRDPUSH_STATUS_CONNECTED "CONNECTED" +#define RRDPUSH_STATUS_ALREADY_CONNECTED "ALREADY CONNECTED" +#define RRDPUSH_STATUS_DISCONNECTED "DISCONNECTED" +#define RRDPUSH_STATUS_RATE_LIMIT "RATE LIMIT TRY LATER" +#define RRDPUSH_STATUS_INITIALIZATION_IN_PROGRESS "INITIALIZATION IN PROGRESS RETRY LATER" +#define RRDPUSH_STATUS_INTERNAL_SERVER_ERROR "INTERNAL SERVER ERROR DROPPING CONNECTION" +#define RRDPUSH_STATUS_DUPLICATE_RECEIVER "DUPLICATE RECEIVER DROPPING CONNECTION" +#define RRDPUSH_STATUS_CANT_REPLY "CANT REPLY DROPPING CONNECTION" +#define RRDPUSH_STATUS_NO_HOSTNAME "NO HOSTNAME PERMISSION DENIED" +#define RRDPUSH_STATUS_NO_API_KEY "NO API KEY PERMISSION DENIED" +#define RRDPUSH_STATUS_INVALID_API_KEY "INVALID API KEY PERMISSION DENIED" +#define RRDPUSH_STATUS_NO_MACHINE_GUID "NO MACHINE GUID PERMISSION DENIED" +#define RRDPUSH_STATUS_MACHINE_GUID_DISABLED "MACHINE GUID DISABLED PERMISSION DENIED" +#define RRDPUSH_STATUS_INVALID_MACHINE_GUID "INVALID MACHINE GUID PERMISSION DENIED" +#define RRDPUSH_STATUS_API_KEY_DISABLED "API KEY DISABLED PERMISSION DENIED" +#define RRDPUSH_STATUS_NOT_ALLOWED_IP "NOT ALLOWED IP PERMISSION DENIED" +#define RRDPUSH_STATUS_LOCALHOST "LOCALHOST PERMISSION DENIED" +#define RRDPUSH_STATUS_PERMISSION_DENIED "PERMISSION DENIED" +#define RRDPUSH_STATUS_BAD_HANDSHAKE "BAD HANDSHAKE" +#define RRDPUSH_STATUS_TIMEOUT "TIMEOUT" +#define RRDPUSH_STATUS_CANT_UPGRADE_CONNECTION "CANT UPGRADE CONNECTION" +#define RRDPUSH_STATUS_SSL_ERROR "SSL ERROR" +#define RRDPUSH_STATUS_INVALID_SSL_CERTIFICATE "INVALID SSL CERTIFICATE" +#define RRDPUSH_STATUS_CANT_ESTABLISH_SSL_CONNECTION "CANT ESTABLISH SSL CONNECTION" + +typedef enum { + STREAM_HANDSHAKE_OK_V3 = 3, // v3+ + STREAM_HANDSHAKE_OK_V2 = 2, // v2 + STREAM_HANDSHAKE_OK_V1 = 1, // v1 + STREAM_HANDSHAKE_NEVER = 0, // never tried to connect + STREAM_HANDSHAKE_ERROR_BAD_HANDSHAKE = -1, + STREAM_HANDSHAKE_ERROR_LOCALHOST = -2, + STREAM_HANDSHAKE_ERROR_ALREADY_CONNECTED = -3, + 
STREAM_HANDSHAKE_ERROR_DENIED = -4, + STREAM_HANDSHAKE_ERROR_SEND_TIMEOUT = -5, + STREAM_HANDSHAKE_ERROR_RECEIVE_TIMEOUT = -6, + STREAM_HANDSHAKE_ERROR_INVALID_CERTIFICATE = -7, + STREAM_HANDSHAKE_ERROR_SSL_ERROR = -8, + STREAM_HANDSHAKE_ERROR_CANT_CONNECT = -9, + STREAM_HANDSHAKE_BUSY_TRY_LATER = -10, + STREAM_HANDSHAKE_INTERNAL_ERROR = -11, + STREAM_HANDSHAKE_INITIALIZATION = -12, + STREAM_HANDSHAKE_DISCONNECT_HOST_CLEANUP = -13, + STREAM_HANDSHAKE_DISCONNECT_STALE_RECEIVER = -14, + STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN = -15, + STREAM_HANDSHAKE_DISCONNECT_NETDATA_EXIT = -16, + STREAM_HANDSHAKE_DISCONNECT_PARSER_EXIT = -17, + STREAM_HANDSHAKE_DISCONNECT_UNKNOWN_SOCKET_READ_ERROR = -18, + STREAM_HANDSHAKE_DISCONNECT_PARSER_FAILED = -19, + STREAM_HANDSHAKE_DISCONNECT_RECEIVER_LEFT = -20, + STREAM_HANDSHAKE_DISCONNECT_ORPHAN_HOST = -21, + STREAM_HANDSHAKE_NON_STREAMABLE_HOST = -22, + STREAM_HANDSHAKE_DISCONNECT_NOT_SUFFICIENT_READ_BUFFER = -23, + STREAM_HANDSHAKE_DISCONNECT_SOCKET_EOF = -24, + STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_FAILED = -25, + STREAM_HANDSHAKE_DISCONNECT_SOCKET_READ_TIMEOUT = -26, + STREAM_HANDSHAKE_ERROR_HTTP_UPGRADE = -27, + +} STREAM_HANDSHAKE; + + +// ---------------------------------------------------------------------------- + +typedef struct { + char *os_name; + char *os_id; + char *os_version; + char *kernel_name; + char *kernel_version; +} stream_encoded_t; + +#include "compression.h" + +// Thread-local storage +// Metric transmission: collector threads asynchronously fill the buffer, sender thread uses it. 
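The `STREAM_HANDSHAKE` codes above are turned into text by `stream_handshake_error_to_string()` in `rrdpush.c`, which walks a table terminated by a `NULL` string and treats every value at or above `STREAM_HANDSHAKE_OK_V1` as connected. The sketch below reproduces that lookup pattern in isolation. It is an illustration under assumptions: the `HANDSHAKE_*` names, the reduced table, and `handshake_to_string()` are hypothetical stand-ins, not the actual Netdata symbols.

```c
#include <stddef.h>

// Reduced, hypothetical subset of handshake codes (illustration only).
typedef enum {
    HANDSHAKE_OK_V1 = 1,
    HANDSHAKE_NEVER = 0,
    HANDSHAKE_ERROR_BAD_HANDSHAKE = -1,
    HANDSHAKE_ERROR_DENIED = -4,
} HANDSHAKE;

static const struct {
    int code;
    const char *str;
} handshake_names[] = {
    { HANDSHAKE_NEVER, "" },
    { HANDSHAKE_ERROR_BAD_HANDSHAKE, "BAD HANDSHAKE" },
    { HANDSHAKE_ERROR_DENIED, "DENIED" },
    { 0, NULL }, // sentinel: a NULL string ends the table
};

static const char *handshake_to_string(int code) {
    // Positive values carry the negotiated version or capabilities,
    // so all of them mean the link is up.
    if (code >= HANDSHAKE_OK_V1)
        return "CONNECTED";

    // The loop stops on the NULL string, not on the code value,
    // which is why the entry for code 0 (empty string) is still visited.
    for (size_t i = 0; handshake_names[i].str; i++) {
        if (code == handshake_names[i].code)
            return handshake_names[i].str;
    }

    return "UNKNOWN";
}
```

With this pattern, any code that has no row in the table falls through to `"UNKNOWN"`. In the table shown in this diff, `STREAM_HANDSHAKE_ERROR_HTTP_UPGRADE` (-27) appears to have no row, so it would take that path.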
+ +typedef enum __attribute__((packed)) { + STREAM_TRAFFIC_TYPE_REPLICATION = 0, + STREAM_TRAFFIC_TYPE_FUNCTIONS, + STREAM_TRAFFIC_TYPE_METADATA, + STREAM_TRAFFIC_TYPE_DATA, + STREAM_TRAFFIC_TYPE_DYNCFG, + + // terminator + STREAM_TRAFFIC_TYPE_MAX, +} STREAM_TRAFFIC_TYPE; + +typedef enum __attribute__((packed)) { + SENDER_FLAG_OVERFLOW = (1 << 0), // The buffer has been overflown +} SENDER_FLAGS; + +struct sender_state { + RRDHOST *host; + pid_t tid; // the thread id of the sender, from gettid_cached() + SENDER_FLAGS flags; + int timeout; + int default_port; + uint32_t reconnect_delay; + char connected_to[CONNECTED_TO_SIZE + 1]; // We don't know which proxy we connect to, passed back from socket.c + size_t begin; + size_t reconnects_counter; + size_t sent_bytes; + size_t sent_bytes_on_this_connection; + size_t send_attempts; + time_t last_traffic_seen_t; + time_t last_state_since_t; // the timestamp of the last state (online/offline) change + size_t not_connected_loops; + // Metrics are collected asynchronously by collector threads calling rrdset_done_push(). This can also trigger + // the lazy creation of the sender thread - both cases (buffer access and thread creation) are guarded here. 
+ SPINLOCK spinlock; + struct circular_buffer *buffer; + char read_buffer[PLUGINSD_LINE_MAX + 1]; + ssize_t read_len; + STREAM_CAPABILITIES capabilities; + STREAM_CAPABILITIES disabled_capabilities; + + size_t sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_MAX]; + + int rrdpush_sender_pipe[2]; // collector to sender thread signaling + int rrdpush_sender_socket; + + uint16_t hops; + + struct line_splitter line; + struct compressor_state compressor; + +#ifdef NETDATA_LOG_STREAM_SENDER + FILE *stream_log_fp; +#endif + +#ifdef ENABLE_HTTPS + NETDATA_SSL ssl; // structure used to encrypt the connection +#endif + + struct { + bool shutdown; + STREAM_HANDSHAKE reason; + } exit; + + struct { + DICTIONARY *requests; // de-duplication of replication requests, per chart + time_t oldest_request_after_t; // the timestamp of the oldest replication request + time_t latest_completed_before_t; // the timestamp of the latest replication request + + struct { + size_t pending_requests; // the currently outstanding replication requests + size_t charts_replicating; // the number of unique charts having pending replication requests (on every request one is added and is removed when we finish it - it does not track completion of the replication for this chart) + bool reached_max; // true when the sender buffer should not get more replication responses + } atomic; + + } replication; + + struct { + bool pending_data; + size_t buffer_used_percentage; // the current utilization of the sending buffer + usec_t last_flush_time_ut; // the last time the sender flushed the sending buffer in USEC + time_t last_buffer_recreate_s; // the last time the sender buffer was re-created, in seconds + } atomic; + + struct { + bool intercept_input; + const char *transaction; + const char *timeout_s; + const char *function; + const char *access; + const char *source; + BUFFER *payload; + } functions; + + int parent_using_h2o; +}; + +#define sender_lock(sender) spinlock_lock(&(sender)->spinlock) +#define 
sender_unlock(sender) spinlock_unlock(&(sender)->spinlock) + +#define rrdpush_sender_pipe_has_pending_data(sender) __atomic_load_n(&(sender)->atomic.pending_data, __ATOMIC_RELAXED) +#define rrdpush_sender_pipe_set_pending_data(sender) __atomic_store_n(&(sender)->atomic.pending_data, true, __ATOMIC_RELAXED) +#define rrdpush_sender_pipe_clear_pending_data(sender) __atomic_store_n(&(sender)->atomic.pending_data, false, __ATOMIC_RELAXED) + +#define rrdpush_sender_last_buffer_recreate_get(sender) __atomic_load_n(&(sender)->atomic.last_buffer_recreate_s, __ATOMIC_RELAXED) +#define rrdpush_sender_last_buffer_recreate_set(sender, value) __atomic_store_n(&(sender)->atomic.last_buffer_recreate_s, value, __ATOMIC_RELAXED) + +#define rrdpush_sender_replication_buffer_full_set(sender, value) __atomic_store_n(&((sender)->replication.atomic.reached_max), value, __ATOMIC_SEQ_CST) +#define rrdpush_sender_replication_buffer_full_get(sender) __atomic_load_n(&((sender)->replication.atomic.reached_max), __ATOMIC_SEQ_CST) + +#define rrdpush_sender_set_buffer_used_percent(sender, value) __atomic_store_n(&((sender)->atomic.buffer_used_percentage), value, __ATOMIC_RELAXED) +#define rrdpush_sender_get_buffer_used_percent(sender) __atomic_load_n(&((sender)->atomic.buffer_used_percentage), __ATOMIC_RELAXED) + +#define rrdpush_sender_set_flush_time(sender) __atomic_store_n(&((sender)->atomic.last_flush_time_ut), now_realtime_usec(), __ATOMIC_RELAXED) +#define rrdpush_sender_get_flush_time(sender) __atomic_load_n(&((sender)->atomic.last_flush_time_ut), __ATOMIC_RELAXED) + +#define rrdpush_sender_replicating_charts(sender) __atomic_load_n(&((sender)->replication.atomic.charts_replicating), __ATOMIC_RELAXED) +#define rrdpush_sender_replicating_charts_plus_one(sender) __atomic_add_fetch(&((sender)->replication.atomic.charts_replicating), 1, __ATOMIC_RELAXED) +#define rrdpush_sender_replicating_charts_minus_one(sender) __atomic_sub_fetch(&((sender)->replication.atomic.charts_replicating), 1, 
__ATOMIC_RELAXED) +#define rrdpush_sender_replicating_charts_zero(sender) __atomic_store_n(&((sender)->replication.atomic.charts_replicating), 0, __ATOMIC_RELAXED) + +#define rrdpush_sender_pending_replication_requests(sender) __atomic_load_n(&((sender)->replication.atomic.pending_requests), __ATOMIC_RELAXED) +#define rrdpush_sender_pending_replication_requests_plus_one(sender) __atomic_add_fetch(&((sender)->replication.atomic.pending_requests), 1, __ATOMIC_RELAXED) +#define rrdpush_sender_pending_replication_requests_minus_one(sender) __atomic_sub_fetch(&((sender)->replication.atomic.pending_requests), 1, __ATOMIC_RELAXED) +#define rrdpush_sender_pending_replication_requests_zero(sender) __atomic_store_n(&((sender)->replication.atomic.pending_requests), 0, __ATOMIC_RELAXED) + +/* +typedef enum { + STREAM_NODE_INSTANCE_FEATURE_CLOUD_ONLINE = (1 << 0), + STREAM_NODE_INSTANCE_FEATURE_VIRTUAL_HOST = (1 << 1), + STREAM_NODE_INSTANCE_FEATURE_HEALTH_ENABLED = (1 << 2), + STREAM_NODE_INSTANCE_FEATURE_ML_SELF = (1 << 3), + STREAM_NODE_INSTANCE_FEATURE_ML_RECEIVED = (1 << 4), + STREAM_NODE_INSTANCE_FEATURE_SSL = (1 << 5), +} STREAM_NODE_INSTANCE_FEATURES; + +typedef struct stream_node_instance { + uuid_t uuid; + STRING *agent; + STREAM_NODE_INSTANCE_FEATURES features; + uint32_t hops; + + // receiver information on that agent + int32_t capabilities; + uint32_t local_port; + uint32_t remote_port; + STRING *local_ip; + STRING *remote_ip; +} STREAM_NODE_INSTANCE; +*/ + +struct receiver_state { + RRDHOST *host; + pid_t tid; + ND_THREAD *thread; + int fd; + char *key; + char *hostname; + char *registry_hostname; + char *machine_guid; + char *os; + char *timezone; // Unused? 
+ char *abbrev_timezone; + int32_t utc_offset; + char *client_ip; // Duplicated in pluginsd + char *client_port; // Duplicated in pluginsd + char *program_name; // Duplicated in pluginsd + char *program_version; + struct rrdhost_system_info *system_info; + STREAM_CAPABILITIES capabilities; + time_t last_msg_t; + + struct buffered_reader reader; + + uint16_t hops; + + struct { + bool shutdown; // signal the streaming parser to exit + STREAM_HANDSHAKE reason; + } exit; + + struct { + RRD_MEMORY_MODE mode; + int history; + int update_every; + int health_enabled; // CONFIG_BOOLEAN_YES, CONFIG_BOOLEAN_NO, CONFIG_BOOLEAN_AUTO + time_t alarms_delay; + uint32_t alarms_history; + int rrdpush_enabled; + char *rrdpush_api_key; // DONT FREE - it is allocated in appconfig + char *rrdpush_send_charts_matching; // DONT FREE - it is allocated in appconfig + bool rrdpush_enable_replication; + time_t rrdpush_seconds_to_replicate; + time_t rrdpush_replication_step; + char *rrdpush_destination; // DONT FREE - it is allocated in appconfig + unsigned int rrdpush_compression; + STREAM_CAPABILITIES compression_priorities[COMPRESSION_ALGORITHM_MAX]; + } config; + +#ifdef ENABLE_HTTPS + NETDATA_SSL ssl; +#endif + + time_t replication_first_time_t; + + struct decompressor_state decompressor; +/* + struct { + uint32_t count; + STREAM_NODE_INSTANCE *array; + } instances; +*/ + +#ifdef ENABLE_H2O + void *h2o_ctx; +#endif +}; + +#ifdef ENABLE_H2O +#define is_h2o_rrdpush(x) ((x)->h2o_ctx != NULL) +#define unless_h2o_rrdpush(x) if(!is_h2o_rrdpush(x)) +#endif + +struct rrdpush_destinations { + STRING *destination; + bool ssl; + uint32_t attempts; + time_t since; + time_t postpone_reconnection_until; + STREAM_HANDSHAKE reason; + + struct rrdpush_destinations *prev; + struct rrdpush_destinations *next; +}; + +extern unsigned int default_rrdpush_enabled; +extern unsigned int default_rrdpush_compression_enabled; +extern char *default_rrdpush_destination; +extern char *default_rrdpush_api_key; +extern 
char *default_rrdpush_send_charts_matching; +extern bool default_rrdpush_enable_replication; +extern time_t default_rrdpush_seconds_to_replicate; +extern time_t default_rrdpush_replication_step; +extern unsigned int remote_clock_resync_iterations; + +void rrdpush_destinations_init(RRDHOST *host); +void rrdpush_destinations_free(RRDHOST *host); + +BUFFER *sender_start(struct sender_state *s); +void sender_commit(struct sender_state *s, BUFFER *wb, STREAM_TRAFFIC_TYPE type); +int rrdpush_init(); +bool rrdpush_receiver_needs_dbengine(); +int configured_as_parent(); + +typedef struct rrdset_stream_buffer { + STREAM_CAPABILITIES capabilities; + bool v2; + bool begin_v2_added; + time_t wall_clock_time; + uint64_t rrdset_flags; // RRDSET_FLAGS + time_t last_point_end_time_s; + BUFFER *wb; +} RRDSET_STREAM_BUFFER; + +RRDSET_STREAM_BUFFER rrdset_push_metric_initialize(RRDSET *st, time_t wall_clock_time); +void rrdset_push_metrics_v1(RRDSET_STREAM_BUFFER *rsb, RRDSET *st); +void rrdset_push_metrics_finished(RRDSET_STREAM_BUFFER *rsb, RRDSET *st); +void rrddim_push_metrics_v2(RRDSET_STREAM_BUFFER *rsb, RRDDIM *rd, usec_t point_end_time_ut, NETDATA_DOUBLE n, SN_FLAGS flags); + +bool rrdset_push_chart_definition_now(RRDSET *st); +void *rrdpush_sender_thread(void *ptr); +void rrdpush_send_host_labels(RRDHOST *host); +void rrdpush_send_claimed_id(RRDHOST *host); +void rrdpush_send_global_functions(RRDHOST *host); + +int rrdpush_receiver_thread_spawn(struct web_client *w, char *decoded_query_string, void *h2o_ctx); +void rrdpush_sender_thread_stop(RRDHOST *host, STREAM_HANDSHAKE reason, bool wait); + +void rrdpush_sender_send_this_host_variable_now(RRDHOST *host, const RRDVAR_ACQUIRED *rva); +int connect_to_one_of_destinations( + RRDHOST *host, + int default_port, + struct timeval *timeout, + size_t *reconnects_counter, + char *connected_to, + size_t connected_to_size, + struct rrdpush_destinations **destination); + +void rrdpush_signal_sender_to_wake_up(struct sender_state *s); + 
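The capability helpers declared in this header (`stream_our_capabilities()` and `convert_stream_version_to_capabilities()`) reduce negotiation to two steps, as seen in `rrdpush.c`: intersect the remote bitmask with the local one, then drop any capability whose prerequisite did not survive (the anomaly-bit capability requires interpolated streaming). The sketch below shows only that shape. The `CAP_*` values and `negotiate_caps()` are hypothetical and simplified; the real code also handles legacy version numbers and per-host disabled capabilities.

```c
#include <stdint.h>

// Hypothetical capability bits, for illustration only.
#define CAP_INTERPOLATED (1u << 0)
#define CAP_DATA_WITH_ML (1u << 1)
#define CAP_ZSTD         (1u << 2)

// Returns the capabilities both sides can use.
static uint32_t negotiate_caps(uint32_t remote, uint32_t local) {
    // Step 1: keep only what both peers advertise.
    uint32_t common = remote & local;

    // Step 2: prune dependent capabilities. Here, ML data is only
    // meaningful when interpolated streaming was also agreed upon.
    if (!(common & CAP_INTERPOLATED))
        common &= ~CAP_DATA_WITH_ML;

    return common;
}
```

For example, if the remote side offers `CAP_INTERPOLATED | CAP_DATA_WITH_ML | CAP_ZSTD` and the local side offers `CAP_DATA_WITH_ML | CAP_ZSTD`, only `CAP_ZSTD` remains, because the ML bit loses its prerequisite.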
+void rrdpush_reset_destinations_postpone_time(RRDHOST *host); +const char *stream_handshake_error_to_string(STREAM_HANDSHAKE handshake_error); +void stream_capabilities_to_json_array(BUFFER *wb, STREAM_CAPABILITIES caps, const char *key); +void rrdpush_receive_log_status(struct receiver_state *rpt, const char *msg, const char *status, ND_LOG_FIELD_PRIORITY priority); +void log_receiver_capabilities(struct receiver_state *rpt); +void log_sender_capabilities(struct sender_state *s); +STREAM_CAPABILITIES convert_stream_version_to_capabilities(int32_t version, RRDHOST *host, bool sender); +int32_t stream_capabilities_to_vn(uint32_t caps); +void stream_capabilities_to_string(BUFFER *wb, STREAM_CAPABILITIES caps); + +void receiver_state_free(struct receiver_state *rpt); +bool stop_streaming_receiver(RRDHOST *host, STREAM_HANDSHAKE reason); + +void sender_thread_buffer_free(void); + +#include "replication.h" + +typedef enum __attribute__((packed)) { + RRDHOST_DB_STATUS_INITIALIZING = 0, + RRDHOST_DB_STATUS_QUERYABLE, +} RRDHOST_DB_STATUS; + +static inline const char *rrdhost_db_status_to_string(RRDHOST_DB_STATUS status) { + switch(status) { + default: + case RRDHOST_DB_STATUS_INITIALIZING: + return "initializing"; + + case RRDHOST_DB_STATUS_QUERYABLE: + return "online"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_DB_LIVENESS_STALE = 0, + RRDHOST_DB_LIVENESS_LIVE, +} RRDHOST_DB_LIVENESS; + +static inline const char *rrdhost_db_liveness_to_string(RRDHOST_DB_LIVENESS status) { + switch(status) { + default: + case RRDHOST_DB_LIVENESS_STALE: + return "stale"; + + case RRDHOST_DB_LIVENESS_LIVE: + return "live"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_INGEST_STATUS_ARCHIVED = 0, + RRDHOST_INGEST_STATUS_INITIALIZING, + RRDHOST_INGEST_STATUS_REPLICATING, + RRDHOST_INGEST_STATUS_ONLINE, + RRDHOST_INGEST_STATUS_OFFLINE, +} RRDHOST_INGEST_STATUS; + +static inline const char *rrdhost_ingest_status_to_string(RRDHOST_INGEST_STATUS status) { + 
switch(status) { + case RRDHOST_INGEST_STATUS_ARCHIVED: + return "archived"; + + case RRDHOST_INGEST_STATUS_INITIALIZING: + return "initializing"; + + case RRDHOST_INGEST_STATUS_REPLICATING: + return "replicating"; + + case RRDHOST_INGEST_STATUS_ONLINE: + return "online"; + + default: + case RRDHOST_INGEST_STATUS_OFFLINE: + return "offline"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_INGEST_TYPE_LOCALHOST = 0, + RRDHOST_INGEST_TYPE_VIRTUAL, + RRDHOST_INGEST_TYPE_CHILD, + RRDHOST_INGEST_TYPE_ARCHIVED, +} RRDHOST_INGEST_TYPE; + +static inline const char *rrdhost_ingest_type_to_string(RRDHOST_INGEST_TYPE type) { + switch(type) { + case RRDHOST_INGEST_TYPE_LOCALHOST: + return "localhost"; + + case RRDHOST_INGEST_TYPE_VIRTUAL: + return "virtual"; + + case RRDHOST_INGEST_TYPE_CHILD: + return "child"; + + default: + case RRDHOST_INGEST_TYPE_ARCHIVED: + return "archived"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_STREAM_STATUS_DISABLED = 0, + RRDHOST_STREAM_STATUS_REPLICATING, + RRDHOST_STREAM_STATUS_ONLINE, + RRDHOST_STREAM_STATUS_OFFLINE, +} RRDHOST_STREAMING_STATUS; + +static inline const char *rrdhost_streaming_status_to_string(RRDHOST_STREAMING_STATUS status) { + switch(status) { + case RRDHOST_STREAM_STATUS_DISABLED: + return "disabled"; + + case RRDHOST_STREAM_STATUS_REPLICATING: + return "replicating"; + + case RRDHOST_STREAM_STATUS_ONLINE: + return "online"; + + default: + case RRDHOST_STREAM_STATUS_OFFLINE: + return "offline"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_ML_STATUS_DISABLED = 0, + RRDHOST_ML_STATUS_OFFLINE, + RRDHOST_ML_STATUS_RUNNING, +} RRDHOST_ML_STATUS; + +static inline const char *rrdhost_ml_status_to_string(RRDHOST_ML_STATUS status) { + switch(status) { + case RRDHOST_ML_STATUS_RUNNING: + return "online"; + + case RRDHOST_ML_STATUS_OFFLINE: + return "offline"; + + default: + case RRDHOST_ML_STATUS_DISABLED: + return "disabled"; + } +} + +typedef enum __attribute__((packed)) { + 
RRDHOST_ML_TYPE_DISABLED = 0, + RRDHOST_ML_TYPE_SELF, + RRDHOST_ML_TYPE_RECEIVED, +} RRDHOST_ML_TYPE; + +static inline const char *rrdhost_ml_type_to_string(RRDHOST_ML_TYPE type) { + switch(type) { + case RRDHOST_ML_TYPE_SELF: + return "self"; + + case RRDHOST_ML_TYPE_RECEIVED: + return "received"; + + default: + case RRDHOST_ML_TYPE_DISABLED: + return "disabled"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_HEALTH_STATUS_DISABLED = 0, + RRDHOST_HEALTH_STATUS_INITIALIZING, + RRDHOST_HEALTH_STATUS_RUNNING, +} RRDHOST_HEALTH_STATUS; + +static inline const char *rrdhost_health_status_to_string(RRDHOST_HEALTH_STATUS status) { + switch(status) { + default: + case RRDHOST_HEALTH_STATUS_DISABLED: + return "disabled"; + + case RRDHOST_HEALTH_STATUS_INITIALIZING: + return "initializing"; + + case RRDHOST_HEALTH_STATUS_RUNNING: + return "online"; + } +} + +typedef enum __attribute__((packed)) { + RRDHOST_DYNCFG_STATUS_UNAVAILABLE = 0, + RRDHOST_DYNCFG_STATUS_AVAILABLE, +} RRDHOST_DYNCFG_STATUS; + +static inline const char *rrdhost_dyncfg_status_to_string(RRDHOST_DYNCFG_STATUS status) { + switch(status) { + default: + case RRDHOST_DYNCFG_STATUS_UNAVAILABLE: + return "unavailable"; + + case RRDHOST_DYNCFG_STATUS_AVAILABLE: + return "online"; + } +} + +typedef struct rrdhost_status { + RRDHOST *host; + time_t now; + + struct { + RRDHOST_DYNCFG_STATUS status; + } dyncfg; + + struct { + RRDHOST_DB_STATUS status; + RRDHOST_DB_LIVENESS liveness; + RRD_MEMORY_MODE mode; + time_t first_time_s; + time_t last_time_s; + size_t metrics; + size_t instances; + size_t contexts; + } db; + + struct { + RRDHOST_ML_STATUS status; + RRDHOST_ML_TYPE type; + struct ml_metrics_statistics metrics; + } ml; + + struct { + size_t hops; + RRDHOST_INGEST_TYPE type; + RRDHOST_INGEST_STATUS status; + SOCKET_PEERS peers; + bool ssl; + STREAM_CAPABILITIES capabilities; + uint32_t id; + time_t since; + STREAM_HANDSHAKE reason; + + struct { + bool in_progress; + NETDATA_DOUBLE completion; + 
size_t instances; + } replication; + } ingest; + + struct { + size_t hops; + RRDHOST_STREAMING_STATUS status; + SOCKET_PEERS peers; + bool ssl; + bool compression; + STREAM_CAPABILITIES capabilities; + uint32_t id; + time_t since; + STREAM_HANDSHAKE reason; + + struct { + bool in_progress; + NETDATA_DOUBLE completion; + size_t instances; + } replication; + + size_t sent_bytes_on_this_connection_per_type[STREAM_TRAFFIC_TYPE_MAX]; + } stream; + + struct { + RRDHOST_HEALTH_STATUS status; + struct { + uint32_t undefined; + uint32_t uninitialized; + uint32_t clear; + uint32_t warning; + uint32_t critical; + } alerts; + } health; +} RRDHOST_STATUS; + +void rrdhost_status(RRDHOST *host, time_t now, RRDHOST_STATUS *s); +bool rrdhost_state_cloud_emulation(RRDHOST *host); + +bool rrdpush_compression_initialize(struct sender_state *s); +bool rrdpush_decompression_initialize(struct receiver_state *rpt); +void rrdpush_parse_compression_order(struct receiver_state *rpt, const char *order); +void rrdpush_select_receiver_compression_algorithm(struct receiver_state *rpt); +void rrdpush_compression_deactivate(struct sender_state *s); + +#endif //NETDATA_RRDPUSH_H diff --git a/src/streaming/sender.c b/src/streaming/sender.c new file mode 100644 index 000000000..3432e6927 --- /dev/null +++ b/src/streaming/sender.c @@ -0,0 +1,1907 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrdpush.h" +#include "common.h" +#include "aclk/https_client.h" + +#define WORKER_SENDER_JOB_CONNECT 0 +#define WORKER_SENDER_JOB_PIPE_READ 1 +#define WORKER_SENDER_JOB_SOCKET_RECEIVE 2 +#define WORKER_SENDER_JOB_EXECUTE 3 +#define WORKER_SENDER_JOB_SOCKET_SEND 4 +#define WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE 5 +#define WORKER_SENDER_JOB_DISCONNECT_OVERFLOW 6 +#define WORKER_SENDER_JOB_DISCONNECT_TIMEOUT 7 +#define WORKER_SENDER_JOB_DISCONNECT_POLL_ERROR 8 +#define WORKER_SENDER_JOB_DISCONNECT_SOCKET_ERROR 9 +#define WORKER_SENDER_JOB_DISCONNECT_SSL_ERROR 10 +#define 
WORKER_SENDER_JOB_DISCONNECT_PARENT_CLOSED 11 +#define WORKER_SENDER_JOB_DISCONNECT_RECEIVE_ERROR 12 +#define WORKER_SENDER_JOB_DISCONNECT_SEND_ERROR 13 +#define WORKER_SENDER_JOB_DISCONNECT_NO_COMPRESSION 14 +#define WORKER_SENDER_JOB_BUFFER_RATIO 15 +#define WORKER_SENDER_JOB_BYTES_RECEIVED 16 +#define WORKER_SENDER_JOB_BYTES_SENT 17 +#define WORKER_SENDER_JOB_BYTES_COMPRESSED 18 +#define WORKER_SENDER_JOB_BYTES_UNCOMPRESSED 19 +#define WORKER_SENDER_JOB_BYTES_COMPRESSION_RATIO 20 +#define WORKER_SENDER_JOB_REPLAY_REQUEST 21 +#define WORKER_SENDER_JOB_FUNCTION_REQUEST 22 +#define WORKER_SENDER_JOB_REPLAY_DICT_SIZE 23 +#define WORKER_SENDER_JOB_DISCONNECT_CANT_UPGRADE_CONNECTION 24 + +#if WORKER_UTILIZATION_MAX_JOB_TYPES < 25 +#error WORKER_UTILIZATION_MAX_JOB_TYPES has to be at least 25 +#endif + +extern struct config stream_config; +extern char *netdata_ssl_ca_path; +extern char *netdata_ssl_ca_file; + +static __thread BUFFER *sender_thread_buffer = NULL; +static __thread bool sender_thread_buffer_used = false; +static __thread time_t sender_thread_buffer_last_reset_s = 0; + +void sender_thread_buffer_free(void) { + buffer_free(sender_thread_buffer); + sender_thread_buffer = NULL; + sender_thread_buffer_used = false; +} + +// Collector thread starting a transmission +BUFFER *sender_start(struct sender_state *s) { + if(unlikely(sender_thread_buffer_used)) + fatal("STREAMING: thread buffer is used multiple times concurrently."); + + if(unlikely(rrdpush_sender_last_buffer_recreate_get(s) > sender_thread_buffer_last_reset_s)) { + if(unlikely(sender_thread_buffer && sender_thread_buffer->size > THREAD_BUFFER_INITIAL_SIZE)) { + buffer_free(sender_thread_buffer); + sender_thread_buffer = NULL; + } + } + + if(unlikely(!sender_thread_buffer)) { + sender_thread_buffer = buffer_create(THREAD_BUFFER_INITIAL_SIZE, &netdata_buffers_statistics.buffers_streaming); + sender_thread_buffer_last_reset_s = rrdpush_sender_last_buffer_recreate_get(s); + } + + sender_thread_buffer_used 
= true; + buffer_flush(sender_thread_buffer); + return sender_thread_buffer; +} + +static inline void rrdpush_sender_thread_close_socket(RRDHOST *host); + +#define SENDER_BUFFER_ADAPT_TO_TIMES_MAX_SIZE 3 + +// Collector thread finishing a transmission +void sender_commit(struct sender_state *s, BUFFER *wb, STREAM_TRAFFIC_TYPE type) { + + if(unlikely(wb != sender_thread_buffer)) + fatal("STREAMING: sender is trying to commit a buffer that is not this thread's buffer."); + + if(unlikely(!sender_thread_buffer_used)) + fatal("STREAMING: sender is committing a buffer twice."); + + sender_thread_buffer_used = false; + + char *src = (char *)buffer_tostring(wb); + size_t src_len = buffer_strlen(wb); + + if(unlikely(!src || !src_len)) + return; + + sender_lock(s); + +#ifdef NETDATA_LOG_STREAM_SENDER + if(type == STREAM_TRAFFIC_TYPE_METADATA) { + if(!s->stream_log_fp) { + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "/tmp/stream-sender-%s.txt", s->host ? rrdhost_hostname(s->host) : "unknown"); + + s->stream_log_fp = fopen(filename, "w"); + } + + fprintf(s->stream_log_fp, "\n--- SEND MESSAGE START: %s ----\n" + "%s" + "--- SEND MESSAGE END ----------------------------------------\n" + , rrdhost_hostname(s->host), src + ); + } +#endif + + if(unlikely(s->buffer->max_size < (src_len + 1) * SENDER_BUFFER_ADAPT_TO_TIMES_MAX_SIZE)) { + netdata_log_info("STREAM %s [send to %s]: max buffer size of %zu is too small for a data message of size %zu. 
Increasing the max buffer size to %d times the max data message size.", + rrdhost_hostname(s->host), s->connected_to, s->buffer->max_size, buffer_strlen(wb) + 1, SENDER_BUFFER_ADAPT_TO_TIMES_MAX_SIZE); + + s->buffer->max_size = (src_len + 1) * SENDER_BUFFER_ADAPT_TO_TIMES_MAX_SIZE; + } + + if (s->compressor.initialized) { + while(src_len) { + size_t size_to_compress = src_len; + + if(unlikely(size_to_compress > COMPRESSION_MAX_MSG_SIZE)) { + if (stream_has_capability(s, STREAM_CAP_BINARY)) + size_to_compress = COMPRESSION_MAX_MSG_SIZE; + else { + if (size_to_compress > COMPRESSION_MAX_MSG_SIZE) { + // we need to find the last newline + // so that the decompressor will have a whole line to work with + + const char *t = &src[COMPRESSION_MAX_MSG_SIZE]; + while (--t >= src) + if (unlikely(*t == '\n')) + break; + + if (t <= src) { + size_to_compress = COMPRESSION_MAX_MSG_SIZE; + } else + size_to_compress = t - src + 1; + } + } + } + + const char *dst; + size_t dst_len = rrdpush_compress(&s->compressor, src, size_to_compress, &dst); + if (!dst_len) { + netdata_log_error("STREAM %s [send to %s]: COMPRESSION failed. Resetting compressor and re-trying", + rrdhost_hostname(s->host), s->connected_to); + + rrdpush_compression_initialize(s); + dst_len = rrdpush_compress(&s->compressor, src, size_to_compress, &dst); + if(!dst_len) { + netdata_log_error("STREAM %s [send to %s]: COMPRESSION failed again. 
Deactivating compression", + rrdhost_hostname(s->host), s->connected_to); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_NO_COMPRESSION); + rrdpush_compression_deactivate(s); + rrdpush_sender_thread_close_socket(s->host); + sender_unlock(s); + return; + } + } + + rrdpush_signature_t signature = rrdpush_compress_encode_signature(dst_len); + +#ifdef NETDATA_INTERNAL_CHECKS + // check if reversing the signature provides the same length + size_t decoded_dst_len = rrdpush_decompress_decode_signature((const char *)&signature, sizeof(signature)); + if(decoded_dst_len != dst_len) + fatal("RRDPUSH COMPRESSION: invalid signature, original payload %zu bytes, " + "compressed payload length %zu bytes, but signature says payload is %zu bytes", + size_to_compress, dst_len, decoded_dst_len); +#endif + + if(cbuffer_add_unsafe(s->buffer, (const char *)&signature, sizeof(signature))) + s->flags |= SENDER_FLAG_OVERFLOW; + else { + if(cbuffer_add_unsafe(s->buffer, dst, dst_len)) + s->flags |= SENDER_FLAG_OVERFLOW; + else + s->sent_bytes_on_this_connection_per_type[type] += dst_len + sizeof(signature); + } + + src = src + size_to_compress; + src_len -= size_to_compress; + } + } + else if(cbuffer_add_unsafe(s->buffer, src, src_len)) + s->flags |= SENDER_FLAG_OVERFLOW; + else + s->sent_bytes_on_this_connection_per_type[type] += src_len; + + replication_recalculate_buffer_used_ratio_unsafe(s); + + bool signal_sender = false; + if(!rrdpush_sender_pipe_has_pending_data(s)) { + rrdpush_sender_pipe_set_pending_data(s); + signal_sender = true; + } + + sender_unlock(s); + + if(signal_sender && (!stream_has_capability(s, STREAM_CAP_INTERPOLATED) || type != STREAM_TRAFFIC_TYPE_DATA)) + rrdpush_signal_sender_to_wake_up(s); +} + +static inline void rrdpush_sender_add_host_variable_to_buffer(BUFFER *wb, const RRDVAR_ACQUIRED *rva) { + buffer_sprintf( + wb + , "VARIABLE HOST %s = " NETDATA_DOUBLE_FORMAT "\n" + , rrdvar_name(rva) + , rrdvar2number(rva) + ); + + netdata_log_debug(D_STREAM, "RRDVAR 
pushed HOST VARIABLE %s = " NETDATA_DOUBLE_FORMAT, rrdvar_name(rva), rrdvar2number(rva)); +} + +void rrdpush_sender_send_this_host_variable_now(RRDHOST *host, const RRDVAR_ACQUIRED *rva) { + if(rrdhost_can_send_definitions_to_parent(host)) { + BUFFER *wb = sender_start(host->sender); + rrdpush_sender_add_host_variable_to_buffer(wb, rva); + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_METADATA); + sender_thread_buffer_free(); + } +} + +struct custom_host_variables_callback { + BUFFER *wb; +}; + +static int rrdpush_sender_thread_custom_host_variables_callback(const DICTIONARY_ITEM *item __maybe_unused, void *rrdvar_ptr __maybe_unused, void *struct_ptr) { + const RRDVAR_ACQUIRED *rv = (const RRDVAR_ACQUIRED *)item; + struct custom_host_variables_callback *tmp = struct_ptr; + BUFFER *wb = tmp->wb; + + rrdpush_sender_add_host_variable_to_buffer(wb, rv); + return 1; +} + +static void rrdpush_sender_thread_send_custom_host_variables(RRDHOST *host) { + if(rrdhost_can_send_definitions_to_parent(host)) { + BUFFER *wb = sender_start(host->sender); + struct custom_host_variables_callback tmp = { + .wb = wb + }; + int ret = rrdvar_walkthrough_read(host->rrdvars, rrdpush_sender_thread_custom_host_variables_callback, &tmp); + (void)ret; + sender_commit(host->sender, wb, STREAM_TRAFFIC_TYPE_METADATA); + sender_thread_buffer_free(); + + netdata_log_debug(D_STREAM, "RRDVAR sent %d VARIABLES", ret); + } +} + +// resets all the chart, so that their definitions +// will be resent to the central netdata +static void rrdpush_sender_thread_reset_all_charts(RRDHOST *host) { + RRDSET *st; + rrdset_foreach_read(st, host) { + rrdset_flag_clear(st, RRDSET_FLAG_SENDER_REPLICATION_IN_PROGRESS); + rrdset_flag_set(st, RRDSET_FLAG_SENDER_REPLICATION_FINISHED); + + st->rrdpush.sender.resync_time_s = 0; + + RRDDIM *rd; + rrddim_foreach_read(rd, st) + rrddim_metadata_exposed_upstream_clear(rd); + rrddim_foreach_done(rd); + + rrdset_metadata_updated(st); + } + rrdset_foreach_done(st); + + 
rrdhost_sender_replicating_charts_zero(host); +} + +static void rrdpush_sender_cbuffer_recreate_timed(struct sender_state *s, time_t now_s, bool have_mutex, bool force) { + static __thread time_t last_reset_time_s = 0; + + if(!force && now_s - last_reset_time_s < 300) + return; + + if(!have_mutex) + sender_lock(s); + + rrdpush_sender_last_buffer_recreate_set(s, now_s); + last_reset_time_s = now_s; + + if(s->buffer && s->buffer->size > CBUFFER_INITIAL_SIZE) { + size_t max = s->buffer->max_size; + cbuffer_free(s->buffer); + s->buffer = cbuffer_new(CBUFFER_INITIAL_SIZE, max, &netdata_buffers_statistics.cbuffers_streaming); + } + + sender_thread_buffer_free(); + + if(!have_mutex) + sender_unlock(s); +} + +static void rrdpush_sender_cbuffer_flush(RRDHOST *host) { + rrdpush_sender_set_flush_time(host->sender); + + sender_lock(host->sender); + + // flush the output buffer from any data it may have + cbuffer_flush(host->sender->buffer); + rrdpush_sender_cbuffer_recreate_timed(host->sender, now_monotonic_sec(), true, true); + replication_recalculate_buffer_used_ratio_unsafe(host->sender); + + sender_unlock(host->sender); +} + +static void rrdpush_sender_charts_and_replication_reset(RRDHOST *host) { + rrdpush_sender_set_flush_time(host->sender); + + // stop all replication commands inflight + replication_sender_delete_pending_requests(host->sender); + + // reset the state of all charts + rrdpush_sender_thread_reset_all_charts(host); + + rrdpush_sender_replicating_charts_zero(host->sender); +} + +static void rrdpush_sender_on_connect(RRDHOST *host) { + rrdpush_sender_cbuffer_flush(host); + rrdpush_sender_charts_and_replication_reset(host); +} + +static void rrdpush_sender_after_connect(RRDHOST *host) { + rrdpush_sender_thread_send_custom_host_variables(host); +} + +static inline void rrdpush_sender_thread_close_socket(RRDHOST *host) { +#ifdef ENABLE_HTTPS + netdata_ssl_close(&host->sender->ssl); +#endif + + if(host->sender->rrdpush_sender_socket != -1) { + 
close(host->sender->rrdpush_sender_socket); + host->sender->rrdpush_sender_socket = -1; + } + + rrdhost_flag_clear(host, RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + rrdhost_flag_clear(host, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED); + + // do not flush the circular buffer here + // this function is called sometimes with the mutex lock, sometimes without the lock + rrdpush_sender_charts_and_replication_reset(host); +} + +void rrdpush_encode_variable(stream_encoded_t *se, RRDHOST *host) { + se->os_name = (host->system_info->host_os_name)?url_encode(host->system_info->host_os_name):strdupz(""); + se->os_id = (host->system_info->host_os_id)?url_encode(host->system_info->host_os_id):strdupz(""); + se->os_version = (host->system_info->host_os_version)?url_encode(host->system_info->host_os_version):strdupz(""); + se->kernel_name = (host->system_info->kernel_name)?url_encode(host->system_info->kernel_name):strdupz(""); + se->kernel_version = (host->system_info->kernel_version)?url_encode(host->system_info->kernel_version):strdupz(""); +} + +void rrdpush_clean_encoded(stream_encoded_t *se) { + if (se->os_name) { + freez(se->os_name); + se->os_name = NULL; + } + + if (se->os_id) { + freez(se->os_id); + se->os_id = NULL; + } + + if (se->os_version) { + freez(se->os_version); + se->os_version = NULL; + } + + if (se->kernel_name) { + freez(se->kernel_name); + se->kernel_name = NULL; + } + + if (se->kernel_version) { + freez(se->kernel_version); + se->kernel_version = NULL; + } +} + +struct { + const char *response; + const char *status; + size_t length; + int32_t version; + bool dynamic; + const char *error; + int worker_job_id; + int postpone_reconnect_seconds; + ND_LOG_FIELD_PRIORITY priority; +} stream_responses[] = { + { + .response = START_STREAMING_PROMPT_VN, + .length = sizeof(START_STREAMING_PROMPT_VN) - 1, + .status = RRDPUSH_STATUS_CONNECTED, + .version = STREAM_HANDSHAKE_OK_V3, // and above + .dynamic = true, // dynamic = we will parse the version / capabilities + 
.error = NULL, + .worker_job_id = 0, + .postpone_reconnect_seconds = 0, + .priority = NDLP_INFO, + }, + { + .response = START_STREAMING_PROMPT_V2, + .length = sizeof(START_STREAMING_PROMPT_V2) - 1, + .status = RRDPUSH_STATUS_CONNECTED, + .version = STREAM_HANDSHAKE_OK_V2, + .dynamic = false, + .error = NULL, + .worker_job_id = 0, + .postpone_reconnect_seconds = 0, + .priority = NDLP_INFO, + }, + { + .response = START_STREAMING_PROMPT_V1, + .length = sizeof(START_STREAMING_PROMPT_V1) - 1, + .status = RRDPUSH_STATUS_CONNECTED, + .version = STREAM_HANDSHAKE_OK_V1, + .dynamic = false, + .error = NULL, + .worker_job_id = 0, + .postpone_reconnect_seconds = 0, + .priority = NDLP_INFO, + }, + { + .response = START_STREAMING_ERROR_SAME_LOCALHOST, + .length = sizeof(START_STREAMING_ERROR_SAME_LOCALHOST) - 1, + .status = RRDPUSH_STATUS_LOCALHOST, + .version = STREAM_HANDSHAKE_ERROR_LOCALHOST, + .dynamic = false, + .error = "remote server rejected this stream, the host we are trying to stream is its localhost", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 60 * 60, // the IP may change, try it every hour + .priority = NDLP_DEBUG, + }, + { + .response = START_STREAMING_ERROR_ALREADY_STREAMING, + .length = sizeof(START_STREAMING_ERROR_ALREADY_STREAMING) - 1, + .status = RRDPUSH_STATUS_ALREADY_CONNECTED, + .version = STREAM_HANDSHAKE_ERROR_ALREADY_CONNECTED, + .dynamic = false, + .error = "remote server rejected this stream, the host we are trying to stream is already streamed to it", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 2 * 60, // 2 minutes + .priority = NDLP_DEBUG, + }, + { + .response = START_STREAMING_ERROR_NOT_PERMITTED, + .length = sizeof(START_STREAMING_ERROR_NOT_PERMITTED) - 1, + .status = RRDPUSH_STATUS_PERMISSION_DENIED, + .version = STREAM_HANDSHAKE_ERROR_DENIED, + .dynamic = false, + .error = "remote server denied access, probably we don't have the right API 
key?", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 1 * 60, // 1 minute + .priority = NDLP_ERR, + }, + { + .response = START_STREAMING_ERROR_BUSY_TRY_LATER, + .length = sizeof(START_STREAMING_ERROR_BUSY_TRY_LATER) - 1, + .status = RRDPUSH_STATUS_RATE_LIMIT, + .version = STREAM_HANDSHAKE_BUSY_TRY_LATER, + .dynamic = false, + .error = "remote server is currently busy, we should try later", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 2 * 60, // 2 minutes + .priority = NDLP_NOTICE, + }, + { + .response = START_STREAMING_ERROR_INTERNAL_ERROR, + .length = sizeof(START_STREAMING_ERROR_INTERNAL_ERROR) - 1, + .status = RRDPUSH_STATUS_INTERNAL_SERVER_ERROR, + .version = STREAM_HANDSHAKE_INTERNAL_ERROR, + .dynamic = false, + .error = "remote server is encountered an internal error, we should try later", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 5 * 60, // 5 minutes + .priority = NDLP_CRIT, + }, + { + .response = START_STREAMING_ERROR_INITIALIZATION, + .length = sizeof(START_STREAMING_ERROR_INITIALIZATION) - 1, + .status = RRDPUSH_STATUS_INITIALIZATION_IN_PROGRESS, + .version = STREAM_HANDSHAKE_INITIALIZATION, + .dynamic = false, + .error = "remote server is initializing, we should try later", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 2 * 60, // 2 minute + .priority = NDLP_NOTICE, + }, + + // terminator + { + .response = NULL, + .length = 0, + .status = RRDPUSH_STATUS_BAD_HANDSHAKE, + .version = STREAM_HANDSHAKE_ERROR_BAD_HANDSHAKE, + .dynamic = false, + .error = "remote node response is not understood, is it Netdata?", + .worker_job_id = WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, + .postpone_reconnect_seconds = 1 * 60, // 1 minute + .priority = NDLP_ERR, + } +}; + +static inline bool rrdpush_sender_validate_response(RRDHOST *host, struct sender_state *s, char *http, 
size_t http_length) { + int32_t version = STREAM_HANDSHAKE_ERROR_BAD_HANDSHAKE; + + int i; + for(i = 0; stream_responses[i].response ; i++) { + if(stream_responses[i].dynamic && + http_length > stream_responses[i].length && http_length < (stream_responses[i].length + 30) && + strncmp(http, stream_responses[i].response, stream_responses[i].length) == 0) { + + version = str2i(&http[stream_responses[i].length]); + break; + } + else if(http_length == stream_responses[i].length && strcmp(http, stream_responses[i].response) == 0) { + version = stream_responses[i].version; + + break; + } + } + + if(version >= STREAM_HANDSHAKE_OK_V1) { + host->destination->reason = version; + host->destination->postpone_reconnection_until = now_realtime_sec() + s->reconnect_delay; + s->capabilities = convert_stream_version_to_capabilities(version, host, true); + return true; + } + + ND_LOG_FIELD_PRIORITY priority = stream_responses[i].priority; + const char *error = stream_responses[i].error; + const char *status = stream_responses[i].status; + int worker_job_id = stream_responses[i].worker_job_id; + int delay = stream_responses[i].postpone_reconnect_seconds; + + worker_is_busy(worker_job_id); + rrdpush_sender_thread_close_socket(host); + host->destination->reason = version; + host->destination->postpone_reconnection_until = now_realtime_sec() + delay; + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, status), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + char buf[RFC3339_MAX_LENGTH]; + rfc3339_datetime_ut(buf, sizeof(buf), host->destination->postpone_reconnection_until * USEC_PER_SEC, 0, false); + + nd_log(NDLS_DAEMON, priority, + "STREAM %s [send to %s]: %s - will retry in %d secs, at %s", + rrdhost_hostname(host), s->connected_to, error, delay, buf); + + return false; +} + +unsigned char alpn_proto_list[] = { + 18, 'n', 'e', 't', 'd', 'a', 't', 'a', '_', 's', 't', 'r', 'e', 'a', 'm', '/', '2', '.', '0', + 8, 'h', 't', 't', 'p', '/', '1', '.', '1' +}; + +#define 
CONN_UPGRADE_VAL "upgrade" + +static bool rrdpush_sender_connect_ssl(struct sender_state *s __maybe_unused) { +#ifdef ENABLE_HTTPS + RRDHOST *host = s->host; + bool ssl_required = host->destination && host->destination->ssl; + + netdata_ssl_close(&host->sender->ssl); + + if(!ssl_required) + return true; + + if (netdata_ssl_open_ext(&host->sender->ssl, netdata_ssl_streaming_sender_ctx, s->rrdpush_sender_socket, alpn_proto_list, sizeof(alpn_proto_list))) { + if(!netdata_ssl_connect(&host->sender->ssl)) { + // couldn't connect + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_SSL_ERROR), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_SSL_ERROR); + rrdpush_sender_thread_close_socket(host); + host->destination->reason = STREAM_HANDSHAKE_ERROR_SSL_ERROR; + host->destination->postpone_reconnection_until = now_realtime_sec() + 5 * 60; + return false; + } + + if (netdata_ssl_validate_certificate_sender && + security_test_certificate(host->sender->ssl.conn)) { + // certificate is not valid + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_INVALID_SSL_CERTIFICATE), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_SSL_ERROR); + netdata_log_error("SSL: closing the stream connection, because the server SSL certificate is not valid."); + rrdpush_sender_thread_close_socket(host); + host->destination->reason = STREAM_HANDSHAKE_ERROR_INVALID_CERTIFICATE; + host->destination->postpone_reconnection_until = now_realtime_sec() + 5 * 60; + return false; + } + + return true; + } + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_CANT_ESTABLISH_SSL_CONNECTION), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + netdata_log_error("SSL: failed to establish connection."); + return false; + +#else + // SSL is not enabled + return true; +#endif +} + +static int rrdpush_http_upgrade_prelude(RRDHOST 
*host, struct sender_state *s) { + + char http[HTTP_HEADER_SIZE + 1]; + snprintfz(http, HTTP_HEADER_SIZE, + "GET " NETDATA_STREAM_URL HTTP_1_1 HTTP_ENDL + "Upgrade: " NETDATA_STREAM_PROTO_NAME HTTP_ENDL + "Connection: Upgrade" + HTTP_HDR_END); + + ssize_t bytes = send_timeout( +#ifdef ENABLE_HTTPS + &host->sender->ssl, +#endif + s->rrdpush_sender_socket, + http, + strlen(http), + 0, + 1000); + + bytes = recv_timeout( +#ifdef ENABLE_HTTPS + &host->sender->ssl, +#endif + s->rrdpush_sender_socket, + http, + HTTP_HEADER_SIZE, + 0, + 1000); + + if (bytes <= 0) { + error_report("Error reading from remote"); + return 1; + } + + rbuf_t buf = rbuf_create(bytes); + rbuf_push(buf, http, bytes); + + http_parse_ctx ctx; + http_parse_ctx_create(&ctx, HTTP_PARSE_INITIAL); + ctx.flags |= HTTP_PARSE_FLAG_DONT_WAIT_FOR_CONTENT; + + int rc; +// while((rc = parse_http_response(buf, &ctx)) == HTTP_PARSE_NEED_MORE_DATA); + rc = parse_http_response(buf, &ctx); + + if (rc != HTTP_PARSE_SUCCESS) { + error_report("Failed to parse HTTP response sent. (%d)", rc); + goto err_cleanup; + } + if (ctx.http_code == HTTP_RESP_MOVED_PERM) { + const char *hdr = get_http_header_by_name(&ctx, "location"); + if (hdr) + error_report("HTTP response is %d Moved Permanently (location: \"%s\") instead of expected %d Switching Protocols.", ctx.http_code, hdr, HTTP_RESP_SWITCH_PROTO); + else + error_report("HTTP response is %d instead of expected %d Switching Protocols.", ctx.http_code, HTTP_RESP_SWITCH_PROTO); + goto err_cleanup; + } + if (ctx.http_code == HTTP_RESP_NOT_FOUND) { + error_report("HTTP response is %d instead of expected %d Switching Protocols. 
Parent version too old.", ctx.http_code, HTTP_RESP_SWITCH_PROTO); + // TODO set some flag here that will signify parent is older version + // and to try connection without rrdpush_http_upgrade_prelude next time + goto err_cleanup; + } + if (ctx.http_code != HTTP_RESP_SWITCH_PROTO) { + error_report("HTTP response is %d instead of expected %d Switching Protocols", ctx.http_code, HTTP_RESP_SWITCH_PROTO); + goto err_cleanup; + } + + const char *hdr = get_http_header_by_name(&ctx, "connection"); + if (!hdr) { + error_report("Missing \"connection\" header in reply"); + goto err_cleanup; + } + if (strncmp(hdr, CONN_UPGRADE_VAL, strlen(CONN_UPGRADE_VAL))) { + error_report("Expected \"connection: " CONN_UPGRADE_VAL "\""); + goto err_cleanup; + } + + hdr = get_http_header_by_name(&ctx, "upgrade"); + if (!hdr) { + error_report("Missing \"upgrade\" header in reply"); + goto err_cleanup; + } + if (strncmp(hdr, NETDATA_STREAM_PROTO_NAME, strlen(NETDATA_STREAM_PROTO_NAME))) { + error_report("Expected \"upgrade: " NETDATA_STREAM_PROTO_NAME "\""); + goto err_cleanup; + } + + netdata_log_debug(D_STREAM, "Stream sender upgrade to \"" NETDATA_STREAM_PROTO_NAME "\" successful"); + rbuf_free(buf); + http_parse_ctx_destroy(&ctx); + return 0; +err_cleanup: + rbuf_free(buf); + http_parse_ctx_destroy(&ctx); + return 1; +} + +static bool rrdpush_sender_thread_connect_to_parent(RRDHOST *host, int default_port, int timeout, struct sender_state *s) { + + struct timeval tv = { + .tv_sec = timeout, + .tv_usec = 0 + }; + + // make sure the socket is closed + rrdpush_sender_thread_close_socket(host); + + s->rrdpush_sender_socket = connect_to_one_of_destinations( + host + , default_port + , &tv + , &s->reconnects_counter + , s->connected_to + , sizeof(s->connected_to)-1 + , &host->destination + ); + + if(unlikely(s->rrdpush_sender_socket == -1)) { + // netdata_log_error("STREAM %s [send to %s]: could not connect to parent node at this time.", rrdhost_hostname(host), host->rrdpush_send_destination); 
+ return false; + } + + // netdata_log_info("STREAM %s [send to %s]: initializing communication...", rrdhost_hostname(host), s->connected_to); + + // reset our capabilities to default + s->capabilities = stream_our_capabilities(host, true); + + /* TODO: During the implementation of #7265 switch the set of variables to HOST_* and CONTAINER_* if the + version negotiation resulted in a high enough version. + */ + stream_encoded_t se; + rrdpush_encode_variable(&se, host); + + host->sender->hops = host->system_info->hops + 1; + + char http[HTTP_HEADER_SIZE + 1]; + int eol = snprintfz(http, HTTP_HEADER_SIZE, + "STREAM " + "key=%s" + "&hostname=%s" + "&registry_hostname=%s" + "&machine_guid=%s" + "&update_every=%d" + "&os=%s" + "&timezone=%s" + "&abbrev_timezone=%s" + "&utc_offset=%d" + "&hops=%d" + "&ml_capable=%d" + "&ml_enabled=%d" + "&mc_version=%d" + "&ver=%u" + "&NETDATA_INSTANCE_CLOUD_TYPE=%s" + "&NETDATA_INSTANCE_CLOUD_INSTANCE_TYPE=%s" + "&NETDATA_INSTANCE_CLOUD_INSTANCE_REGION=%s" + "&NETDATA_SYSTEM_OS_NAME=%s" + "&NETDATA_SYSTEM_OS_ID=%s" + "&NETDATA_SYSTEM_OS_ID_LIKE=%s" + "&NETDATA_SYSTEM_OS_VERSION=%s" + "&NETDATA_SYSTEM_OS_VERSION_ID=%s" + "&NETDATA_SYSTEM_OS_DETECTION=%s" + "&NETDATA_HOST_IS_K8S_NODE=%s" + "&NETDATA_SYSTEM_KERNEL_NAME=%s" + "&NETDATA_SYSTEM_KERNEL_VERSION=%s" + "&NETDATA_SYSTEM_ARCHITECTURE=%s" + "&NETDATA_SYSTEM_VIRTUALIZATION=%s" + "&NETDATA_SYSTEM_VIRT_DETECTION=%s" + "&NETDATA_SYSTEM_CONTAINER=%s" + "&NETDATA_SYSTEM_CONTAINER_DETECTION=%s" + "&NETDATA_CONTAINER_OS_NAME=%s" + "&NETDATA_CONTAINER_OS_ID=%s" + "&NETDATA_CONTAINER_OS_ID_LIKE=%s" + "&NETDATA_CONTAINER_OS_VERSION=%s" + "&NETDATA_CONTAINER_OS_VERSION_ID=%s" + "&NETDATA_CONTAINER_OS_DETECTION=%s" + "&NETDATA_SYSTEM_CPU_LOGICAL_CPU_COUNT=%s" + "&NETDATA_SYSTEM_CPU_FREQ=%s" + "&NETDATA_SYSTEM_TOTAL_RAM=%s" + "&NETDATA_SYSTEM_TOTAL_DISK_SIZE=%s" + "&NETDATA_PROTOCOL_VERSION=%s" + HTTP_1_1 HTTP_ENDL + "User-Agent: %s/%s\r\n" + "Accept: */*\r\n\r\n" + , host->rrdpush_send_api_key + , 
rrdhost_hostname(host) + , rrdhost_registry_hostname(host) + , host->machine_guid + , default_rrd_update_every + , rrdhost_os(host) + , rrdhost_timezone(host) + , rrdhost_abbrev_timezone(host) + , host->utc_offset + , host->sender->hops + , host->system_info->ml_capable + , host->system_info->ml_enabled + , host->system_info->mc_version + , s->capabilities + , (host->system_info->cloud_provider_type) ? host->system_info->cloud_provider_type : "" + , (host->system_info->cloud_instance_type) ? host->system_info->cloud_instance_type : "" + , (host->system_info->cloud_instance_region) ? host->system_info->cloud_instance_region : "" + , se.os_name + , se.os_id + , (host->system_info->host_os_id_like) ? host->system_info->host_os_id_like : "" + , se.os_version + , (host->system_info->host_os_version_id) ? host->system_info->host_os_version_id : "" + , (host->system_info->host_os_detection) ? host->system_info->host_os_detection : "" + , (host->system_info->is_k8s_node) ? host->system_info->is_k8s_node : "" + , se.kernel_name + , se.kernel_version + , (host->system_info->architecture) ? host->system_info->architecture : "" + , (host->system_info->virtualization) ? host->system_info->virtualization : "" + , (host->system_info->virt_detection) ? host->system_info->virt_detection : "" + , (host->system_info->container) ? host->system_info->container : "" + , (host->system_info->container_detection) ? host->system_info->container_detection : "" + , (host->system_info->container_os_name) ? host->system_info->container_os_name : "" + , (host->system_info->container_os_id) ? host->system_info->container_os_id : "" + , (host->system_info->container_os_id_like) ? host->system_info->container_os_id_like : "" + , (host->system_info->container_os_version) ? host->system_info->container_os_version : "" + , (host->system_info->container_os_version_id) ? host->system_info->container_os_version_id : "" + , (host->system_info->container_os_detection) ? 
host->system_info->container_os_detection : "" + , (host->system_info->host_cores) ? host->system_info->host_cores : "" + , (host->system_info->host_cpu_freq) ? host->system_info->host_cpu_freq : "" + , (host->system_info->host_ram_total) ? host->system_info->host_ram_total : "" + , (host->system_info->host_disk_space) ? host->system_info->host_disk_space : "" + , STREAMING_PROTOCOL_VERSION + , rrdhost_program_name(host) + , rrdhost_program_version(host) + ); + http[eol] = 0x00; + rrdpush_clean_encoded(&se); + + if(!rrdpush_sender_connect_ssl(s)) + return false; + + if (s->parent_using_h2o && rrdpush_http_upgrade_prelude(host, s)) { + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_CANT_UPGRADE_CONNECTION), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_CANT_UPGRADE_CONNECTION); + rrdpush_sender_thread_close_socket(host); + host->destination->reason = STREAM_HANDSHAKE_ERROR_HTTP_UPGRADE; + host->destination->postpone_reconnection_until = now_realtime_sec() + 1 * 60; + return false; + } + + ssize_t len = (ssize_t)strlen(http); + ssize_t bytes = send_timeout( +#ifdef ENABLE_HTTPS + &host->sender->ssl, +#endif + s->rrdpush_sender_socket, + http, + len, + 0, + timeout); + + if(bytes <= 0) { // timeout is 0 + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_TIMEOUT), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_TIMEOUT); + rrdpush_sender_thread_close_socket(host); + + nd_log(NDLS_DAEMON, NDLP_ERR, + "STREAM %s [send to %s]: failed to send HTTP header to remote netdata.", + rrdhost_hostname(host), s->connected_to); + + host->destination->reason = STREAM_HANDSHAKE_ERROR_SEND_TIMEOUT; + host->destination->postpone_reconnection_until = now_realtime_sec() + 1 * 60; + return false; + } + + bytes = recv_timeout( +#ifdef ENABLE_HTTPS + &host->sender->ssl, +#endif + s->rrdpush_sender_socket, + http, + 
HTTP_HEADER_SIZE, + 0, + timeout); + + if(bytes <= 0) { // timeout is 0 + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_TIMEOUT), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_TIMEOUT); + rrdpush_sender_thread_close_socket(host); + + nd_log(NDLS_DAEMON, NDLP_ERR, + "STREAM %s [send to %s]: remote netdata does not respond.", + rrdhost_hostname(host), s->connected_to); + + host->destination->reason = STREAM_HANDSHAKE_ERROR_RECEIVE_TIMEOUT; + host->destination->postpone_reconnection_until = now_realtime_sec() + 30; + return false; + } + + if(sock_setnonblock(s->rrdpush_sender_socket) < 0) + nd_log(NDLS_DAEMON, NDLP_WARNING, + "STREAM %s [send to %s]: cannot set non-blocking mode for socket.", + rrdhost_hostname(host), s->connected_to); + sock_setcloexec(s->rrdpush_sender_socket); + + if(sock_enlarge_out(s->rrdpush_sender_socket) < 0) + nd_log(NDLS_DAEMON, NDLP_WARNING, + "STREAM %s [send to %s]: cannot enlarge the socket buffer.", + rrdhost_hostname(host), s->connected_to); + + http[bytes] = '\0'; + if(!rrdpush_sender_validate_response(host, s, http, bytes)) + return false; + + rrdpush_compression_initialize(s); + + log_sender_capabilities(s); + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_TXT(NDF_RESPONSE_CODE, RRDPUSH_STATUS_CONNECTED), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + nd_log(NDLS_DAEMON, NDLP_DEBUG, + "STREAM %s: connected to %s...", + rrdhost_hostname(host), s->connected_to); + + return true; +} + +static bool attempt_to_connect(struct sender_state *state) { + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_UUID(NDF_MESSAGE_ID, &streaming_to_parent_msgid), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + state->send_attempts = 0; + + // reset the bytes we have sent for this session + state->sent_bytes_on_this_connection = 0; + memset(state->sent_bytes_on_this_connection_per_type, 0, sizeof(state->sent_bytes_on_this_connection_per_type)); + + 
if(rrdpush_sender_thread_connect_to_parent(state->host, state->default_port, state->timeout, state)) { + // reset the buffer, to properly send charts and metrics + rrdpush_sender_on_connect(state->host); + + // send from the beginning + state->begin = 0; + + // make sure the next reconnection will be immediate + state->not_connected_loops = 0; + + // let the data collection threads know we are ready + rrdhost_flag_set(state->host, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED); + + rrdpush_sender_after_connect(state->host); + + return true; + } + + // we couldn't connect + + // increase the failed connections counter + state->not_connected_loops++; + + // slow re-connection on repeating errors + usec_t now_ut = now_monotonic_usec(); + usec_t end_ut = now_ut + USEC_PER_SEC * state->reconnect_delay; + while(now_ut < end_ut) { + if(nd_thread_signaled_to_cancel()) + return false; + + sleep_usec(100 * USEC_PER_MS); // seconds + now_ut = now_monotonic_usec(); + } + + return false; +} + +// TCP window is open, and we have data to transmit. +static ssize_t attempt_to_send(struct sender_state *s) { + ssize_t ret; + +#ifdef NETDATA_INTERNAL_CHECKS + struct circular_buffer *cb = s->buffer; +#endif + + sender_lock(s); + char *chunk; + size_t outstanding = cbuffer_next_unsafe(s->buffer, &chunk); + netdata_log_debug(D_STREAM, "STREAM: Sending data. 
Buffer r=%zu w=%zu s=%zu, next chunk=%zu", cb->read, cb->write, cb->size, outstanding); + +#ifdef ENABLE_HTTPS + if(SSL_connection(&s->ssl)) + ret = netdata_ssl_write(&s->ssl, chunk, outstanding); + else + ret = send(s->rrdpush_sender_socket, chunk, outstanding, MSG_DONTWAIT); +#else + ret = send(s->rrdpush_sender_socket, chunk, outstanding, MSG_DONTWAIT); +#endif + + if (likely(ret > 0)) { + cbuffer_remove_unsafe(s->buffer, ret); + s->sent_bytes_on_this_connection += ret; + s->sent_bytes += ret; + netdata_log_debug(D_STREAM, "STREAM %s [send to %s]: Sent %zd bytes", rrdhost_hostname(s->host), s->connected_to, ret); + } + else if (ret == -1 && (errno == EAGAIN || errno == EINTR || errno == EWOULDBLOCK)) + netdata_log_debug(D_STREAM, "STREAM %s [send to %s]: unavailable after polling POLLOUT", rrdhost_hostname(s->host), s->connected_to); + else if (ret == -1) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_SEND_ERROR); + netdata_log_debug(D_STREAM, "STREAM: Send failed - closing socket..."); + netdata_log_error("STREAM %s [send to %s]: failed to send metrics - closing connection - we have sent %zu bytes on this connection.", rrdhost_hostname(s->host), s->connected_to, s->sent_bytes_on_this_connection); + rrdpush_sender_thread_close_socket(s->host); + } + else + netdata_log_debug(D_STREAM, "STREAM: send() returned 0 -> no error but no transmission"); + + replication_recalculate_buffer_used_ratio_unsafe(s); + sender_unlock(s); + + return ret; +} + +static ssize_t attempt_read(struct sender_state *s) { + ssize_t ret; + +#ifdef ENABLE_HTTPS + if (SSL_connection(&s->ssl)) + ret = netdata_ssl_read(&s->ssl, s->read_buffer + s->read_len, sizeof(s->read_buffer) - s->read_len - 1); + else + ret = recv(s->rrdpush_sender_socket, s->read_buffer + s->read_len, sizeof(s->read_buffer) - s->read_len - 1,MSG_DONTWAIT); +#else + ret = recv(s->rrdpush_sender_socket, s->read_buffer + s->read_len, sizeof(s->read_buffer) - s->read_len - 1,MSG_DONTWAIT); +#endif + + if (ret > 0) { + 
s->read_len += ret; + return ret; + } + + if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)) + return ret; + +#ifdef ENABLE_HTTPS + if (SSL_connection(&s->ssl)) + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_SSL_ERROR); + else +#endif + + if (ret == 0 || errno == ECONNRESET) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_PARENT_CLOSED); + netdata_log_error("STREAM %s [send to %s]: connection closed by far end.", rrdhost_hostname(s->host), s->connected_to); + } + else { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_RECEIVE_ERROR); + netdata_log_error("STREAM %s [send to %s]: error during receive (%zd) - closing connection.", rrdhost_hostname(s->host), s->connected_to, ret); + } + + rrdpush_sender_thread_close_socket(s->host); + + return ret; +} + +struct inflight_stream_function { + struct sender_state *sender; + STRING *transaction; + usec_t received_ut; +}; + +static void stream_execute_function_callback(BUFFER *func_wb, int code, void *data) { + struct inflight_stream_function *tmp = data; + struct sender_state *s = tmp->sender; + + if(rrdhost_can_send_definitions_to_parent(s->host)) { + BUFFER *wb = sender_start(s); + + pluginsd_function_result_begin_to_buffer(wb + , string2str(tmp->transaction) + , code + , content_type_id2string(func_wb->content_type) + , func_wb->expires); + + buffer_fast_strcat(wb, buffer_tostring(func_wb), buffer_strlen(func_wb)); + pluginsd_function_result_end_to_buffer(wb); + + sender_commit(s, wb, STREAM_TRAFFIC_TYPE_FUNCTIONS); + sender_thread_buffer_free(); + + internal_error(true, "STREAM %s [send to %s] FUNCTION transaction %s sending back response (%zu bytes, %"PRIu64" usec).", + rrdhost_hostname(s->host), s->connected_to, + string2str(tmp->transaction), + buffer_strlen(func_wb), + now_realtime_usec() - tmp->received_ut); + } + + string_freez(tmp->transaction); + buffer_free(func_wb); + freez(tmp); +} + +static void stream_execute_function_progress_callback(void *data, size_t done, size_t all) { + struct 
inflight_stream_function *tmp = data; + struct sender_state *s = tmp->sender; + + if(rrdhost_can_send_definitions_to_parent(s->host)) { + BUFFER *wb = sender_start(s); + + buffer_sprintf(wb, PLUGINSD_KEYWORD_FUNCTION_PROGRESS " '%s' %zu %zu\n", + string2str(tmp->transaction), done, all); + + sender_commit(s, wb, STREAM_TRAFFIC_TYPE_FUNCTIONS); + } +} + +static void execute_commands_function(struct sender_state *s, const char *command, const char *transaction, const char *timeout_s, const char *function, BUFFER *payload, const char *access, const char *source) { + worker_is_busy(WORKER_SENDER_JOB_FUNCTION_REQUEST); + nd_log(NDLS_ACCESS, NDLP_INFO, NULL); + + if(!transaction || !*transaction || !timeout_s || !*timeout_s || !function || !*function) { + netdata_log_error("STREAM %s [send to %s] %s execution command is incomplete (transaction = '%s', timeout = '%s', function = '%s'). Ignoring it.", + rrdhost_hostname(s->host), s->connected_to, + command, + transaction?transaction:"(unset)", + timeout_s?timeout_s:"(unset)", + function?function:"(unset)"); + } + else { + int timeout = str2i(timeout_s); + if(timeout <= 0) timeout = PLUGINS_FUNCTIONS_TIMEOUT_DEFAULT; + + struct inflight_stream_function *tmp = callocz(1, sizeof(struct inflight_stream_function)); + tmp->received_ut = now_realtime_usec(); + tmp->sender = s; + tmp->transaction = string_strdupz(transaction); + BUFFER *wb = buffer_create(1024, &netdata_buffers_statistics.buffers_functions); + + int code = rrd_function_run(s->host, wb, timeout, + http_access_from_hex_mapping_old_roles(access), function, false, transaction, + stream_execute_function_callback, tmp, + stream_has_capability(s, STREAM_CAP_PROGRESS) ? stream_execute_function_progress_callback : NULL, + stream_has_capability(s, STREAM_CAP_PROGRESS) ? 
tmp : NULL, + NULL, NULL, payload, source); + + if(code != HTTP_RESP_OK) { + if (!buffer_strlen(wb)) + rrd_call_function_error(wb, "Failed to route request to collector", code); + } + } +} + +static void cleanup_intercepting_input(struct sender_state *s) { + freez((void *)s->functions.transaction); + freez((void *)s->functions.timeout_s); + freez((void *)s->functions.function); + freez((void *)s->functions.access); + freez((void *)s->functions.source); + buffer_free(s->functions.payload); + + s->functions.transaction = NULL; + s->functions.timeout_s = NULL; + s->functions.function = NULL; + s->functions.payload = NULL; + s->functions.access = NULL; + s->functions.source = NULL; + s->functions.intercept_input = false; +} + +static void execute_commands_cleanup(struct sender_state *s) { + cleanup_intercepting_input(s); +} + +// This is just a placeholder until the gap filling state machine is inserted +void execute_commands(struct sender_state *s) { + worker_is_busy(WORKER_SENDER_JOB_EXECUTE); + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_CB(NDF_REQUEST, line_splitter_reconstruct_line, &s->line), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + char *start = s->read_buffer, *end = &s->read_buffer[s->read_len], *newline; + *end = '\0'; + for( ; start < end ; start = newline + 1) { + newline = strchr(start, '\n'); + + if(!newline) { + if(s->functions.intercept_input) { + buffer_strcat(s->functions.payload, start); + start = end; + } + break; + } + + *newline = '\0'; + s->line.count++; + + if(s->functions.intercept_input) { + if(strcmp(start, PLUGINSD_CALL_FUNCTION_PAYLOAD_END) == 0) { + execute_commands_function(s, + PLUGINSD_CALL_FUNCTION_PAYLOAD_END, + s->functions.transaction, s->functions.timeout_s, + s->functions.function, s->functions.payload, + s->functions.access, s->functions.source); + + cleanup_intercepting_input(s); + } + else { + buffer_strcat(s->functions.payload, start); + buffer_fast_charcat(s->functions.payload, '\n'); + } + + continue; + } + + 
s->line.num_words = quoted_strings_splitter_pluginsd(start, s->line.words, PLUGINSD_MAX_WORDS); + const char *command = get_word(s->line.words, s->line.num_words, 0); + + if(command && strcmp(command, PLUGINSD_CALL_FUNCTION) == 0) { + char *transaction = get_word(s->line.words, s->line.num_words, 1); + char *timeout_s = get_word(s->line.words, s->line.num_words, 2); + char *function = get_word(s->line.words, s->line.num_words, 3); + char *access = get_word(s->line.words, s->line.num_words, 4); + char *source = get_word(s->line.words, s->line.num_words, 5); + + execute_commands_function(s, command, transaction, timeout_s, function, NULL, access, source); + } + else if(command && strcmp(command, PLUGINSD_CALL_FUNCTION_PAYLOAD_BEGIN) == 0) { + char *transaction = get_word(s->line.words, s->line.num_words, 1); + char *timeout_s = get_word(s->line.words, s->line.num_words, 2); + char *function = get_word(s->line.words, s->line.num_words, 3); + char *access = get_word(s->line.words, s->line.num_words, 4); + char *source = get_word(s->line.words, s->line.num_words, 5); + char *content_type = get_word(s->line.words, s->line.num_words, 6); + + s->functions.transaction = strdupz(transaction ? transaction : ""); + s->functions.timeout_s = strdupz(timeout_s ? timeout_s : ""); + s->functions.function = strdupz(function ? function : ""); + s->functions.access = strdupz(access ? access : ""); + s->functions.source = strdupz(source ? 
source : ""); + s->functions.payload = buffer_create(0, NULL); + s->functions.payload->content_type = content_type_string2id(content_type); + s->functions.intercept_input = true; + } + else if(command && strcmp(command, PLUGINSD_CALL_FUNCTION_CANCEL) == 0) { + worker_is_busy(WORKER_SENDER_JOB_FUNCTION_REQUEST); + nd_log(NDLS_ACCESS, NDLP_DEBUG, NULL); + + char *transaction = get_word(s->line.words, s->line.num_words, 1); + if(transaction && *transaction) + rrd_function_cancel(transaction); + } + else if(command && strcmp(command, PLUGINSD_CALL_FUNCTION_PROGRESS) == 0) { + worker_is_busy(WORKER_SENDER_JOB_FUNCTION_REQUEST); + nd_log(NDLS_ACCESS, NDLP_DEBUG, NULL); + + char *transaction = get_word(s->line.words, s->line.num_words, 1); + if(transaction && *transaction) + rrd_function_progress(transaction); + } + else if (command && strcmp(command, PLUGINSD_KEYWORD_REPLAY_CHART) == 0) { + worker_is_busy(WORKER_SENDER_JOB_REPLAY_REQUEST); + nd_log(NDLS_ACCESS, NDLP_DEBUG, NULL); + + const char *chart_id = get_word(s->line.words, s->line.num_words, 1); + const char *start_streaming = get_word(s->line.words, s->line.num_words, 2); + const char *after = get_word(s->line.words, s->line.num_words, 3); + const char *before = get_word(s->line.words, s->line.num_words, 4); + + if (!chart_id || !start_streaming || !after || !before) { + netdata_log_error("STREAM %s [send to %s] %s command is incomplete" + " (chart=%s, start_streaming=%s, after=%s, before=%s)", + rrdhost_hostname(s->host), s->connected_to, + command, + chart_id ? chart_id : "(unset)", + start_streaming ? start_streaming : "(unset)", + after ? after : "(unset)", + before ? 
before : "(unset)"); + } + else { + replication_add_request(s, chart_id, + strtoll(after, NULL, 0), + strtoll(before, NULL, 0), + !strcmp(start_streaming, "true") + ); + } + } + else { + netdata_log_error("STREAM %s [send to %s] received unknown command over connection: %s", rrdhost_hostname(s->host), s->connected_to, s->line.words[0]?s->line.words[0]:"(unset)"); + } + + line_splitter_reset(&s->line); + worker_is_busy(WORKER_SENDER_JOB_EXECUTE); + } + + if (start < end) { + memmove(s->read_buffer, start, end-start); + s->read_len = end - start; + } + else { + s->read_buffer[0] = '\0'; + s->read_len = 0; + } +} + +struct rrdpush_sender_thread_data { + RRDHOST *host; + char *pipe_buffer; +}; + +static bool rrdpush_sender_pipe_close(RRDHOST *host, int *pipe_fds, bool reopen) { + static netdata_mutex_t mutex = NETDATA_MUTEX_INITIALIZER; + + bool ret = true; + + netdata_mutex_lock(&mutex); + + int new_pipe_fds[2]; + if(reopen) { + if(pipe(new_pipe_fds) != 0) { + netdata_log_error("STREAM %s [send]: cannot create required pipe.", rrdhost_hostname(host)); + new_pipe_fds[PIPE_READ] = -1; + new_pipe_fds[PIPE_WRITE] = -1; + ret = false; + } + } + + int old_pipe_fds[2]; + old_pipe_fds[PIPE_READ] = pipe_fds[PIPE_READ]; + old_pipe_fds[PIPE_WRITE] = pipe_fds[PIPE_WRITE]; + + if(reopen) { + pipe_fds[PIPE_READ] = new_pipe_fds[PIPE_READ]; + pipe_fds[PIPE_WRITE] = new_pipe_fds[PIPE_WRITE]; + } + else { + pipe_fds[PIPE_READ] = -1; + pipe_fds[PIPE_WRITE] = -1; + } + + if(old_pipe_fds[PIPE_READ] > 2) + close(old_pipe_fds[PIPE_READ]); + + if(old_pipe_fds[PIPE_WRITE] > 2) + close(old_pipe_fds[PIPE_WRITE]); + + netdata_mutex_unlock(&mutex); + return ret; +} + +void rrdpush_signal_sender_to_wake_up(struct sender_state *s) { + if(unlikely(s->tid == gettid_cached())) + return; + + RRDHOST *host = s->host; + + int pipe_fd = s->rrdpush_sender_pipe[PIPE_WRITE]; + + // signal the sender there are more data + if (pipe_fd != -1 && write(pipe_fd, " ", 1) == -1) { + netdata_log_error("STREAM %s 
[send]: cannot write to internal pipe.", rrdhost_hostname(host)); + rrdpush_sender_pipe_close(host, s->rrdpush_sender_pipe, true); + } +} + +static bool rrdhost_set_sender(RRDHOST *host) { + if(unlikely(!host->sender)) return false; + + bool ret = false; + sender_lock(host->sender); + if(!host->sender->tid) { + rrdhost_flag_clear(host, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED | RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + rrdhost_flag_set(host, RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN); + host->rrdpush_sender_connection_counter++; + host->sender->tid = gettid_cached(); + host->sender->last_state_since_t = now_realtime_sec(); + host->sender->exit.reason = STREAM_HANDSHAKE_NEVER; + ret = true; + } + sender_unlock(host->sender); + + rrdpush_reset_destinations_postpone_time(host); + + return ret; +} + +static void rrdhost_clear_sender___while_having_sender_mutex(RRDHOST *host) { + if(unlikely(!host->sender)) return; + + if(host->sender->tid == gettid_cached()) { + host->sender->tid = 0; + host->sender->exit.shutdown = false; + rrdhost_flag_clear(host, RRDHOST_FLAG_RRDPUSH_SENDER_SPAWN | RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED | RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + host->sender->last_state_since_t = now_realtime_sec(); + if(host->destination) { + host->destination->since = host->sender->last_state_since_t; + host->destination->reason = host->sender->exit.reason; + } + } + + rrdpush_reset_destinations_postpone_time(host); +} + +static bool rrdhost_sender_should_exit(struct sender_state *s) { + if(unlikely(nd_thread_signaled_to_cancel())) { + if(!s->exit.reason) + s->exit.reason = STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN; + return true; + } + + if(unlikely(!service_running(SERVICE_STREAMING))) { + if(!s->exit.reason) + s->exit.reason = STREAM_HANDSHAKE_DISCONNECT_NETDATA_EXIT; + return true; + } + + if(unlikely(!rrdhost_has_rrdpush_sender_enabled(s->host))) { + if(!s->exit.reason) + s->exit.reason = STREAM_HANDSHAKE_NON_STREAMABLE_HOST; + return true; + } + + 
if(unlikely(s->exit.shutdown)) { + if(!s->exit.reason) + s->exit.reason = STREAM_HANDSHAKE_DISCONNECT_SHUTDOWN; + return true; + } + + if(unlikely(rrdhost_flag_check(s->host, RRDHOST_FLAG_ORPHAN))) { + if(!s->exit.reason) + s->exit.reason = STREAM_HANDSHAKE_DISCONNECT_ORPHAN_HOST; + return true; + } + + return false; +} + +static void rrdpush_sender_thread_cleanup_callback(void *pptr) { + struct rrdpush_sender_thread_data *s = CLEANUP_FUNCTION_GET_PTR(pptr); + if(!s) return; + + worker_unregister(); + + RRDHOST *host = s->host; + + sender_lock(host->sender); + netdata_log_info("STREAM %s [send]: sending thread exits %s", + rrdhost_hostname(host), + host->sender->exit.reason != STREAM_HANDSHAKE_NEVER ? stream_handshake_error_to_string(host->sender->exit.reason) : ""); + + rrdpush_sender_thread_close_socket(host); + rrdpush_sender_pipe_close(host, host->sender->rrdpush_sender_pipe, false); + execute_commands_cleanup(host->sender); + + rrdhost_clear_sender___while_having_sender_mutex(host); + +#ifdef NETDATA_LOG_STREAM_SENDER + if(host->sender->stream_log_fp) { + fclose(host->sender->stream_log_fp); + host->sender->stream_log_fp = NULL; + } +#endif + + sender_unlock(host->sender); + + freez(s->pipe_buffer); + freez(s); +} + +void rrdpush_initialize_ssl_ctx(RRDHOST *host __maybe_unused) { +#ifdef ENABLE_HTTPS + static SPINLOCK sp = NETDATA_SPINLOCK_INITIALIZER; + spinlock_lock(&sp); + + if(netdata_ssl_streaming_sender_ctx || !host) { + spinlock_unlock(&sp); + return; + } + + for(struct rrdpush_destinations *d = host->destinations; d ; d = d->next) { + if (d->ssl) { + // we need to initialize SSL + + netdata_ssl_initialize_ctx(NETDATA_SSL_STREAMING_SENDER_CTX); + ssl_security_location_for_context(netdata_ssl_streaming_sender_ctx, netdata_ssl_ca_file, netdata_ssl_ca_path); + + // stop the loop + break; + } + } + + spinlock_unlock(&sp); +#endif +} + +static bool stream_sender_log_capabilities(BUFFER *wb, void *ptr) { + struct sender_state *state = ptr; + if(!state) + 
return false; + + stream_capabilities_to_string(wb, state->capabilities); + return true; +} + +static bool stream_sender_log_transport(BUFFER *wb, void *ptr) { + struct sender_state *state = ptr; + if(!state) + return false; + +#ifdef ENABLE_HTTPS + buffer_strcat(wb, SSL_connection(&state->ssl) ? "https" : "http"); +#else + buffer_strcat(wb, "http"); +#endif + return true; +} + +static bool stream_sender_log_dst_ip(BUFFER *wb, void *ptr) { + struct sender_state *state = ptr; + if(!state || state->rrdpush_sender_socket == -1) + return false; + + SOCKET_PEERS peers = socket_peers(state->rrdpush_sender_socket); + buffer_strcat(wb, peers.peer.ip); + return true; +} + +static bool stream_sender_log_dst_port(BUFFER *wb, void *ptr) { + struct sender_state *state = ptr; + if(!state || state->rrdpush_sender_socket == -1) + return false; + + SOCKET_PEERS peers = socket_peers(state->rrdpush_sender_socket); + buffer_print_uint64(wb, peers.peer.port); + return true; +} + +void *rrdpush_sender_thread(void *ptr) { + struct sender_state *s = ptr; + + ND_LOG_STACK lgs[] = { + ND_LOG_FIELD_STR(NDF_NIDL_NODE, s->host->hostname), + ND_LOG_FIELD_CB(NDF_DST_IP, stream_sender_log_dst_ip, s), + ND_LOG_FIELD_CB(NDF_DST_PORT, stream_sender_log_dst_port, s), + ND_LOG_FIELD_CB(NDF_DST_TRANSPORT, stream_sender_log_transport, s), + ND_LOG_FIELD_CB(NDF_SRC_CAPABILITIES, stream_sender_log_capabilities, s), + ND_LOG_FIELD_END(), + }; + ND_LOG_STACK_PUSH(lgs); + + worker_register("STREAMSND"); + worker_register_job_name(WORKER_SENDER_JOB_CONNECT, "connect"); + worker_register_job_name(WORKER_SENDER_JOB_PIPE_READ, "pipe read"); + worker_register_job_name(WORKER_SENDER_JOB_SOCKET_RECEIVE, "receive"); + worker_register_job_name(WORKER_SENDER_JOB_EXECUTE, "execute"); + worker_register_job_name(WORKER_SENDER_JOB_SOCKET_SEND, "send"); + + // disconnection reasons + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_TIMEOUT, "disconnect timeout"); + 
worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_POLL_ERROR, "disconnect poll error"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_SOCKET_ERROR, "disconnect socket error"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_OVERFLOW, "disconnect overflow"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_SSL_ERROR, "disconnect ssl error"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_PARENT_CLOSED, "disconnect parent closed"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_RECEIVE_ERROR, "disconnect receive error"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_SEND_ERROR, "disconnect send error"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_NO_COMPRESSION, "disconnect no compression"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_BAD_HANDSHAKE, "disconnect bad handshake"); + worker_register_job_name(WORKER_SENDER_JOB_DISCONNECT_CANT_UPGRADE_CONNECTION, "disconnect cant upgrade"); + + worker_register_job_name(WORKER_SENDER_JOB_REPLAY_REQUEST, "replay request"); + worker_register_job_name(WORKER_SENDER_JOB_FUNCTION_REQUEST, "function"); + + worker_register_job_custom_metric(WORKER_SENDER_JOB_BUFFER_RATIO, "used buffer ratio", "%", WORKER_METRIC_ABSOLUTE); + worker_register_job_custom_metric(WORKER_SENDER_JOB_BYTES_RECEIVED, "bytes received", "bytes/s", WORKER_METRIC_INCREMENT); + worker_register_job_custom_metric(WORKER_SENDER_JOB_BYTES_SENT, "bytes sent", "bytes/s", WORKER_METRIC_INCREMENT); + worker_register_job_custom_metric(WORKER_SENDER_JOB_BYTES_COMPRESSED, "bytes compressed", "bytes/s", WORKER_METRIC_INCREMENTAL_TOTAL); + worker_register_job_custom_metric(WORKER_SENDER_JOB_BYTES_UNCOMPRESSED, "bytes uncompressed", "bytes/s", WORKER_METRIC_INCREMENTAL_TOTAL); + worker_register_job_custom_metric(WORKER_SENDER_JOB_BYTES_COMPRESSION_RATIO, "cumulative compression savings ratio", "%", WORKER_METRIC_ABSOLUTE); + worker_register_job_custom_metric(WORKER_SENDER_JOB_REPLAY_DICT_SIZE, 
"replication dict entries", "entries", WORKER_METRIC_ABSOLUTE); + + if(!rrdhost_has_rrdpush_sender_enabled(s->host) || !s->host->rrdpush_send_destination || + !*s->host->rrdpush_send_destination || !s->host->rrdpush_send_api_key || + !*s->host->rrdpush_send_api_key) { + netdata_log_error("STREAM %s [send]: thread created (task id %d), but host has streaming disabled.", + rrdhost_hostname(s->host), gettid_cached()); + return NULL; + } + + if(!rrdhost_set_sender(s->host)) { + netdata_log_error("STREAM %s [send]: thread created (task id %d), but there is another sender running for this host.", + rrdhost_hostname(s->host), gettid_cached()); + return NULL; + } + + rrdpush_initialize_ssl_ctx(s->host); + + netdata_log_info("STREAM %s [send]: thread created (task id %d)", rrdhost_hostname(s->host), gettid_cached()); + + s->timeout = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "timeout seconds", 600); + + s->default_port = (int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "default port", 19999); + + s->buffer->max_size = (size_t)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "buffer size bytes", 1024 * 1024 * 10); + + s->reconnect_delay = (unsigned int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, "reconnect delay seconds", 5); + + remote_clock_resync_iterations = (unsigned int)appconfig_get_number( + &stream_config, CONFIG_SECTION_STREAM, + "initial clock resync iterations", + remote_clock_resync_iterations); // TODO: REMOVE FOR SLEW / GAPFILLING + + s->parent_using_h2o = appconfig_get_boolean( + &stream_config, CONFIG_SECTION_STREAM, "parent using h2o", false); + + // initialize rrdpush globals + rrdhost_flag_clear(s->host, RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + rrdhost_flag_clear(s->host, RRDHOST_FLAG_RRDPUSH_SENDER_CONNECTED); + + int pipe_buffer_size = 10 * 1024; +#ifdef F_GETPIPE_SZ + pipe_buffer_size = fcntl(s->rrdpush_sender_pipe[PIPE_READ], F_GETPIPE_SZ); +#endif + 
if(pipe_buffer_size < 10 * 1024) + pipe_buffer_size = 10 * 1024; + + if(!rrdpush_sender_pipe_close(s->host, s->rrdpush_sender_pipe, true)) { + netdata_log_error("STREAM %s [send]: cannot create inter-thread communication pipe. Disabling streaming.", + rrdhost_hostname(s->host)); + return NULL; + } + + struct rrdpush_sender_thread_data *thread_data = callocz(1, sizeof(struct rrdpush_sender_thread_data)); + thread_data->pipe_buffer = mallocz(pipe_buffer_size); + thread_data->host = s->host; + + CLEANUP_FUNCTION_REGISTER(rrdpush_sender_thread_cleanup_callback) cleanup_ptr = thread_data; + + size_t iterations = 0; + time_t now_s = now_monotonic_sec(); + while(!rrdhost_sender_should_exit(s)) { + iterations++; + + // The connection attempt blocks (after which we use the socket in nonblocking) + if(unlikely(s->rrdpush_sender_socket == -1)) { + worker_is_busy(WORKER_SENDER_JOB_CONNECT); + + now_s = now_monotonic_sec(); + rrdpush_sender_cbuffer_recreate_timed(s, now_s, false, true); + execute_commands_cleanup(s); + + rrdhost_flag_clear(s->host, RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + s->flags &= ~SENDER_FLAG_OVERFLOW; + s->read_len = 0; + s->buffer->read = 0; + s->buffer->write = 0; + + if(!attempt_to_connect(s)) + continue; + + if(rrdhost_sender_should_exit(s)) + break; + + now_s = s->last_traffic_seen_t = now_monotonic_sec(); + rrdpush_send_claimed_id(s->host); + rrdpush_send_host_labels(s->host); + rrdpush_send_global_functions(s->host); + s->replication.oldest_request_after_t = 0; + + rrdhost_flag_set(s->host, RRDHOST_FLAG_RRDPUSH_SENDER_READY_4_METRICS); + + nd_log(NDLS_DAEMON, NDLP_DEBUG, + "STREAM %s [send to %s]: enabling metrics streaming...", + rrdhost_hostname(s->host), s->connected_to); + + continue; + } + + if(iterations % 1000 == 0) + now_s = now_monotonic_sec(); + + // If the TCP window never opened then something is wrong, restart connection + if(unlikely(now_s - s->last_traffic_seen_t > s->timeout && + !rrdpush_sender_pending_replication_requests(s) 
&& + !rrdpush_sender_replicating_charts(s) + )) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_TIMEOUT); + netdata_log_error("STREAM %s [send to %s]: could not send metrics for %d seconds - closing connection - we have sent %zu bytes on this connection via %zu send attempts.", rrdhost_hostname(s->host), s->connected_to, s->timeout, s->sent_bytes_on_this_connection, s->send_attempts); + rrdpush_sender_thread_close_socket(s->host); + continue; + } + + sender_lock(s); + size_t outstanding = cbuffer_next_unsafe(s->buffer, NULL); + size_t available = cbuffer_available_size_unsafe(s->buffer); + if (unlikely(!outstanding)) { + rrdpush_sender_pipe_clear_pending_data(s); + rrdpush_sender_cbuffer_recreate_timed(s, now_s, true, false); + } + + if(s->compressor.initialized) { + size_t bytes_uncompressed = s->compressor.sender_locked.total_uncompressed; + size_t bytes_compressed = s->compressor.sender_locked.total_compressed + s->compressor.sender_locked.total_compressions * sizeof(rrdpush_signature_t); + NETDATA_DOUBLE ratio = 100.0 - ((NETDATA_DOUBLE)bytes_compressed * 100.0 / (NETDATA_DOUBLE)bytes_uncompressed); + worker_set_metric(WORKER_SENDER_JOB_BYTES_UNCOMPRESSED, (NETDATA_DOUBLE)bytes_uncompressed); + worker_set_metric(WORKER_SENDER_JOB_BYTES_COMPRESSED, (NETDATA_DOUBLE)bytes_compressed); + worker_set_metric(WORKER_SENDER_JOB_BYTES_COMPRESSION_RATIO, ratio); + } + sender_unlock(s); + + worker_set_metric(WORKER_SENDER_JOB_BUFFER_RATIO, (NETDATA_DOUBLE)(s->buffer->max_size - available) * 100.0 / (NETDATA_DOUBLE)s->buffer->max_size); + + if(outstanding) + s->send_attempts++; + + if(unlikely(s->rrdpush_sender_pipe[PIPE_READ] == -1)) { + if(!rrdpush_sender_pipe_close(s->host, s->rrdpush_sender_pipe, true)) { + netdata_log_error("STREAM %s [send]: cannot create inter-thread communication pipe. 
Disabling streaming.", + rrdhost_hostname(s->host)); + rrdpush_sender_thread_close_socket(s->host); + break; + } + } + + worker_is_idle(); + + // Wait until buffer opens in the socket or a rrdset_done_push wakes us + enum { + Collector = 0, + Socket = 1, + }; + struct pollfd fds[2] = { + [Collector] = { + .fd = s->rrdpush_sender_pipe[PIPE_READ], + .events = POLLIN, + .revents = 0, + }, + [Socket] = { + .fd = s->rrdpush_sender_socket, + .events = POLLIN | (outstanding ? POLLOUT : 0 ), + .revents = 0, + } + }; + + int poll_rc = poll(fds, 2, 50); // timeout in milliseconds + + netdata_log_debug(D_STREAM, "STREAM: poll() finished collector=%d socket=%d (current chunk %zu bytes)...", + fds[Collector].revents, fds[Socket].revents, outstanding); + + if(unlikely(rrdhost_sender_should_exit(s))) + break; + + internal_error(fds[Collector].fd != s->rrdpush_sender_pipe[PIPE_READ], + "STREAM %s [send to %s]: pipe changed after poll().", rrdhost_hostname(s->host), s->connected_to); + + internal_error(fds[Socket].fd != s->rrdpush_sender_socket, + "STREAM %s [send to %s]: socket changed after poll().", rrdhost_hostname(s->host), s->connected_to); + + // Spurious wake-ups without error - loop again + if (poll_rc == 0 || ((poll_rc == -1) && (errno == EAGAIN || errno == EINTR))) { + netdata_log_debug(D_STREAM, "Spurious wakeup"); + now_s = now_monotonic_sec(); + continue; + } + + // Only errors from poll() are internal, but try restarting the connection + if(unlikely(poll_rc == -1)) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_POLL_ERROR); + netdata_log_error("STREAM %s [send to %s]: failed to poll(). Closing socket.", rrdhost_hostname(s->host), s->connected_to); + rrdpush_sender_pipe_close(s->host, s->rrdpush_sender_pipe, true); + rrdpush_sender_thread_close_socket(s->host); + continue; + } + + // If we have data and have seen the TCP window open then try to close it by a transmission. 
+ if(likely(outstanding && (fds[Socket].revents & POLLOUT))) { + worker_is_busy(WORKER_SENDER_JOB_SOCKET_SEND); + ssize_t bytes = attempt_to_send(s); + if(bytes > 0) { + s->last_traffic_seen_t = now_monotonic_sec(); + worker_set_metric(WORKER_SENDER_JOB_BYTES_SENT, (NETDATA_DOUBLE)bytes); + } + } + + // If the collector woke us up then empty the pipe to remove the signal + if (fds[Collector].revents & (POLLIN|POLLPRI)) { + worker_is_busy(WORKER_SENDER_JOB_PIPE_READ); + netdata_log_debug(D_STREAM, "STREAM: Data added to send buffer (current buffer chunk %zu bytes)...", outstanding); + + if (read(fds[Collector].fd, thread_data->pipe_buffer, pipe_buffer_size) == -1) + netdata_log_error("STREAM %s [send to %s]: cannot read from internal pipe.", rrdhost_hostname(s->host), s->connected_to); + } + + // Read as much as possible to fill the buffer, split into full lines for execution. + if (fds[Socket].revents & POLLIN) { + worker_is_busy(WORKER_SENDER_JOB_SOCKET_RECEIVE); + ssize_t bytes = attempt_read(s); + if(bytes > 0) { + s->last_traffic_seen_t = now_monotonic_sec(); + worker_set_metric(WORKER_SENDER_JOB_BYTES_RECEIVED, (NETDATA_DOUBLE)bytes); + } + } + + if(unlikely(s->read_len)) + execute_commands(s); + + if(unlikely(fds[Collector].revents & (POLLERR|POLLHUP|POLLNVAL))) { + char *error = NULL; + + if (unlikely(fds[Collector].revents & POLLERR)) + error = "pipe reports errors (POLLERR)"; + else if (unlikely(fds[Collector].revents & POLLHUP)) + error = "pipe closed (POLLHUP)"; + else if (unlikely(fds[Collector].revents & POLLNVAL)) + error = "pipe is invalid (POLLNVAL)"; + + if(error) { + rrdpush_sender_pipe_close(s->host, s->rrdpush_sender_pipe, true); + netdata_log_error("STREAM %s [send to %s]: restarting internal pipe: %s.", + rrdhost_hostname(s->host), s->connected_to, error); + } + } + + if(unlikely(fds[Socket].revents & (POLLERR|POLLHUP|POLLNVAL))) { + char *error = NULL; + + if (unlikely(fds[Socket].revents & POLLERR)) + error = "socket reports errors 
(POLLERR)"; + else if (unlikely(fds[Socket].revents & POLLHUP)) + error = "connection closed by remote end (POLLHUP)"; + else if (unlikely(fds[Socket].revents & POLLNVAL)) + error = "connection is invalid (POLLNVAL)"; + + if(unlikely(error)) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_SOCKET_ERROR); + netdata_log_error("STREAM %s [send to %s]: restarting connection: %s - %zu bytes transmitted.", + rrdhost_hostname(s->host), s->connected_to, error, s->sent_bytes_on_this_connection); + rrdpush_sender_thread_close_socket(s->host); + } + } + + // protection from overflow + if(unlikely(s->flags & SENDER_FLAG_OVERFLOW)) { + worker_is_busy(WORKER_SENDER_JOB_DISCONNECT_OVERFLOW); + errno = 0; + netdata_log_error("STREAM %s [send to %s]: buffer full (allocated %zu bytes) after sending %zu bytes. Restarting connection", + rrdhost_hostname(s->host), s->connected_to, s->buffer->size, s->sent_bytes_on_this_connection); + rrdpush_sender_thread_close_socket(s->host); + } + + worker_set_metric(WORKER_SENDER_JOB_REPLAY_DICT_SIZE, (NETDATA_DOUBLE) dictionary_entries(s->replication.requests)); + } + + return NULL; +} diff --git a/src/streaming/stream.conf b/src/streaming/stream.conf new file mode 100644 index 000000000..9dc154e2f --- /dev/null +++ b/src/streaming/stream.conf @@ -0,0 +1,263 @@ +# netdata configuration for aggregating data from remote hosts +# +# API keys authorize a pair of sending-receiving netdata servers. +# Once their communication is authorized, they can exchange metrics for any +# number of hosts. +# +# You can generate API keys, with the linux command: uuidgen + + +# ----------------------------------------------------------------------------- +# 1. ON CHILD NETDATA - THE ONE THAT WILL BE SENDING METRICS + +[stream] + # Enable this on child nodes, to have them send metrics. + enabled = no + + # Where is the receiving netdata? 
+ # A space separated list of: + # + # [PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL] + # + # If many are given, the first available will get the metrics. + # + # PROTOCOL = tcp, udp, or unix (only tcp and unix are supported by parent nodes) + # HOST = an IPv4, IPv6 IP, or a hostname, or a unix domain socket path. + # IPv6 IPs should be given with brackets [ip:address] + # INTERFACE = the network interface to use (only for IPv6) + # PORT = the port number or service name (/etc/services) + # SSL = when this word appear at the end of the destination string + # the Netdata will encrypt the connection with the parent. + # + # This communication is not HTTP (it cannot be proxied by web proxies). + destination = + + # Skip Certificate verification? + # The netdata child is configurated to avoid invalid SSL/TLS certificate, + # so certificates that are self-signed or expired will stop the streaming. + # Case the server certificate is not valid, you can enable the use of + # 'bad' certificates setting the next option as 'yes'. + #ssl skip certificate verification = yes + + # Certificate Authority Path + # OpenSSL has a default directory where the known certificates are stored. + # In case it is necessary, it is possible to change this rule using the variable + # "CApath", e.g. CApath = /etc/ssl/certs/ + # + #CApath = + + # Certificate Authority file + # When the Netdata parent has a certificate that is not recognized as valid, + # we can add it to the list of known certificates in "CApath" and give it to + # Netdata as an argument, e.g. 
CAfile = /etc/ssl/certs/cert.pem + # + #CAfile = + + # The API_KEY to use (as the sender) + api key = + + # Stream Compression + # The default is enabled + # You can control stream compression in this agent with options: yes | no + #enable compression = yes + + # The timeout to connect and send metrics + timeout seconds = 60 + + # If the destination line above does not specify a port, use this + default port = 19999 + + # filter the charts to be streamed + # netdata SIMPLE PATTERN: + # - space separated list of patterns (use \ to include spaces in patterns) + # - use * as wildcard, any number of times within each pattern + # - prefix a pattern with ! for a negative match (ie not stream the charts it matches) + # - the order of patterns is important (left to right) + # To send all except a few, use: !this !that * (ie append a wildcard pattern) + send charts matching = * + + # The buffer to use for sending metrics. + # 10MB is good for 60 seconds of data, so increase this if you expect latencies. + # The buffer is flushed on reconnects (this will not prevent gaps at the charts). + buffer size bytes = 10485760 + + # If the connection fails, or it disconnects, + # retry after that many seconds. + reconnect delay seconds = 5 + + # Sync the clock of the charts for that many iterations, when starting. + # It is ignored when replication is enabled + initial clock resync iterations = 60 + +# ----------------------------------------------------------------------------- +# 2. ON PARENT NETDATA - THE ONE THAT WILL BE RECEIVING METRICS + +# You can have one API key per child, +# or the same API key for all child nodes. +# +# netdata searches for options in this order: +# +# a) parent netdata settings (netdata.conf) +# b) [stream] section (above) +# c) [API_KEY] section (below, settings for the API key) +# d) [MACHINE_GUID] section (below, settings for each machine) +# +# You can combine the above (the more specific setting will be used). 
+ +# API key authentication +# If the key is not listed here, it will not be able to push metrics. + +# [API_KEY] is [YOUR-API-KEY], i.e [11111111-2222-3333-4444-555555555555] +[API_KEY] + # Default settings for this API key + + # This GUID is to be used as an API key from remote agents connecting + # to this machine. Failure to match such a key, denies access. + # YOU MUST SET THIS FIELD ON ALL API KEYS. + type = api + + # You can disable the API key, by setting this to: no + # The default (for unknown API keys) is: no + enabled = no + + # A list of simple patterns matching the IPs of the servers that + # will be pushing metrics using this API key. + # The metrics are received via the API port, so the same IPs + # should also be matched at netdata.conf [web].allow connections from + allow from = * + + # The default history in entries, for all hosts using this API key. + # You can also set it per host below. + # For the default db mode (dbengine), this is ignored. + #default history = 3600 + + # The default memory mode to be used for all hosts using this API key. + # You can also set it per host below. + # If you don't set it here, the memory mode of netdata.conf will be used. + # Valid modes: + # ram keep it in RAM, don't touch the disk + # none no database at all (use this on headless proxies) + # dbengine like a traditional database + #default memory mode = dbengine + + # Shall we enable health monitoring for the hosts using this API key? + # 3 possible values: + # yes enable alarms + # no do not enable alarms + # auto enable alarms, only when the sending netdata is connected. + # Health monitoring will be disabled as soon as the connection is closed. + # You can also set it per host, below. 
+ # The default is taken from [health].enabled of netdata.conf + #health enabled by default = auto + + # postpone alarms for a short period after the sender is connected + default postpone alarms on connect seconds = 60 + + # seconds of health log events to keep + #default health log history = 432000 + + # need to route metrics differently? set these. + # the defaults are the ones at the [stream] section (above) + #default proxy enabled = yes | no + #default proxy destination = IP:PORT IP:PORT ... + #default proxy api key = API_KEY + #default proxy send charts matching = * + + # Stream Compression + # By default it is enabled. + # You can control stream compression in this parent agent stream with options: yes | no + #enable compression = yes + + # select the order the compression algorithms will be used, when multiple are offered by the child + #compression algorithms order = zstd lz4 brotli gzip + + # Replication + # Enable replication for all hosts using this api key. Default: enabled + #enable replication = yes + + # How many seconds to replicate from each child. Default: a day + #seconds to replicate = 86400 + + # The duration we want to replicate per each step. + #seconds per replication step = 600 + + # Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable + # after the specified duration of "cleanup ephemeral hosts after secs" (as defined in the db section of netdata.conf) + # from the time of the node's last connection. + #is ephemeral node = no + +# ----------------------------------------------------------------------------- +# 3. PER SENDING HOST SETTINGS, ON PARENT NETDATA +# THIS IS OPTIONAL - YOU DON'T HAVE TO CONFIGURE IT + +# This section exists to give you finer control of the parent settings for each +# child host, when the same API key is used by many netdata child nodes / proxies. +# +# Each netdata has a unique GUID - generated the first time netdata starts. 
+# You can find it at /var/lib/netdata/registry/netdata.public.unique.id +# (at the child). +# +# The host sending data will have one. If the host is not ephemeral, +# you can give settings for each sending host here. + +[MACHINE_GUID] + # This GUID is to be used as a MACHINE GUID from remote agents connecting + # to this machine, not an API key. + # YOU MUST SET THIS FIELD ON ALL MACHINE GUIDs. + type = machine + + # enable this host: yes | no + # When disabled, the parent will not receive metrics for this host. + # THIS IS NOT A SECURITY MECHANISM - AN ATTACKER CAN SET ANY OTHER GUID. + # Use only the API key for security. + enabled = no + + # A list of simple patterns matching the IPs of the servers that + # will be pushing metrics using this MACHINE GUID. + # The metrics are received via the API port, so the same IPs + # should also be matched at netdata.conf [web].allow connections from + # and at stream.conf [API_KEY].allow from + allow from = * + + # The number of entries in the database. + # This is ignored for db mode dbengine. + #history = 3600 + + # The memory mode of the database: ram | none | dbengine + #memory mode = dbengine + + # Health / alarms control: yes | no | auto + #health enabled = auto + + # postpone alarms when the sender connects + postpone alarms on connect seconds = 60 + + # seconds of health log events to keep + #health log history = 432000 + + # need to route metrics differently? + # the defaults are the ones at the [API KEY] section + #proxy enabled = yes | no + #proxy destination = IP:PORT IP:PORT ... + #proxy api key = API_KEY + #proxy send charts matching = * + + # Stream Compression + # By default, enabled. + # You can control stream compression in this parent agent stream with options: yes | no + #enable compression = yes + + # Replication + # Enable replication for all hosts using this api key. + #enable replication = yes + + # How many seconds to replicate from each child. 
+ #seconds to replicate = 86400 + + # The duration we want to replicate per each step. + #seconds per replication step = 600 + + # Indicate whether this child is an ephemeral node. An ephemeral node will become unavailable + # after the specified duration of "cleanup ephemeral hosts after secs" (as defined in the db section of netdata.conf) + # from the time of the node's last connection. + #is ephemeral node = no
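
As a rough illustration of how the three sections of the `stream.conf` above fit together, here is a minimal sketch of a child/parent pair. It only uses option names that appear in the file. The address `192.0.2.10` and the key `11111111-2222-3333-4444-555555555555` are placeholders chosen for this example (the key is the sample value from the file's own comment), not values taken from a real deployment.

```
# --- on the child: stream.conf ---
[stream]
    # turn on sending (the shipped default is "no")
    enabled = yes
    # placeholder parent address; the port falls back to "default port" if omitted
    destination = 192.0.2.10:19999
    # must match a section name in the parent's stream.conf
    api key = 11111111-2222-3333-4444-555555555555

# --- on the parent: stream.conf ---
# the section name is the API key itself, replacing the literal [API_KEY]
[11111111-2222-3333-4444-555555555555]
    # required on every API key section
    type = api
    # unknown or disabled keys cannot push metrics
    enabled = yes
    # simple pattern of allowed child IPs; "*" accepts any
    allow from = *
```

A `[MACHINE_GUID]` section with `type = machine` would only be needed on the parent if one child sharing this key requires different settings from the rest. As the file itself notes, that section is not a security mechanism.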