author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2022-04-14 18:12:10 +0000
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2022-04-14 18:12:10 +0000
commit | b5321aff06d6ea8d730d62aec2ffd8e9271c1ffc (patch)
tree | 36c41e35994786456154f9d3bf88c324763aeea4 /backends
parent | Adding upstream version 1.33.1. (diff)
download | netdata-b5321aff06d6ea8d730d62aec2ffd8e9271c1ffc.tar.xz netdata-b5321aff06d6ea8d730d62aec2ffd8e9271c1ffc.zip
Adding upstream version 1.34.0. (upstream/1.34.0)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'backends')
38 files changed, 0 insertions, 4798 deletions
diff --git a/backends/Makefile.am b/backends/Makefile.am deleted file mode 100644 index dace0132a..000000000 --- a/backends/Makefile.am +++ /dev/null @@ -1,22 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in - -SUBDIRS = \ - graphite \ - json \ - opentsdb \ - prometheus \ - aws_kinesis \ - mongodb \ - $(NULL) - -dist_noinst_DATA = \ - README.md \ - WALKTHROUGH.md \ - $(NULL) - -dist_noinst_SCRIPTS = \ - nc-backend.sh \ - $(NULL) diff --git a/backends/README.md b/backends/README.md deleted file mode 100644 index 8d53fd664..000000000 --- a/backends/README.md +++ /dev/null @@ -1,236 +0,0 @@ -<!-- -title: "Metrics long term archiving" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/README.md ---> - -# Metrics long term archiving - -> ⚠️ The backends system is now deprecated in favor of the [exporting engine](/exporting/README.md). - -Netdata supports backends for archiving the metrics, or providing long term dashboards, using Grafana or other tools, -like this: - -![image](https://cloud.githubusercontent.com/assets/2662304/20649711/29f182ba-b4ce-11e6-97c8-ab2c0ab59833.png) - -Since Netdata collects thousands of metrics per server per second, which would easily congest any backend server when -several Netdata servers are sending data to it, Netdata allows sending metrics at a lower frequency, by resampling them. - -So, although Netdata collects metrics every second, it can send to the backend servers averages or sums every X seconds -(though, it can send them per second if you need it to). - -## features - -1. Supported backends - - - **graphite** (`plaintext interface`, used by **Graphite**, **InfluxDB**, **KairosDB**, **Blueflood**, - **ElasticSearch** via logstash tcp input and the graphite codec, etc) - - metrics are sent to the backend server as `prefix.hostname.chart.dimension`. `prefix` is configured below, - `hostname` is the hostname of the machine (can also be configured). - - - **opentsdb** (`telnet or HTTP interfaces`, used by **OpenTSDB**, **InfluxDB**, **KairosDB**, etc) - - metrics are sent to opentsdb as `prefix.chart.dimension` with tag `host=hostname`. - - - **json** document DBs - - metrics are sent to a document db, `JSON` formatted. - - - **prometheus** is described at [prometheus page](/backends/prometheus/README.md) since it pulls data from - Netdata. - - - **prometheus remote write** (a binary snappy-compressed protocol buffer encoding over HTTP used by - **Elasticsearch**, **Gnocchi**, **Graphite**, **InfluxDB**, **Kafka**, **OpenTSDB**, **PostgreSQL/TimescaleDB**, - **Splunk**, **VictoriaMetrics**, and a lot of other [storage - providers](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage)) - - metrics are labeled in the format, which is used by Netdata for the [plaintext prometheus - protocol](/backends/prometheus/README.md). Notes on using the remote write backend are [here](/backends/prometheus/remote_write/README.md). - - - **TimescaleDB** via [community-built connector](/backends/TIMESCALE.md) that takes JSON streams from a Netdata - client and writes them to a TimescaleDB table. - - - **AWS Kinesis Data Streams** - - metrics are sent to the service in `JSON` format. - - - **MongoDB** - - metrics are sent to the database in `JSON` format. - -2. Only one backend may be active at a time. - -3. Netdata can filter metrics (at the chart level), to send only a subset of the collected metrics. - -4. 
Netdata supports three modes of operation for all backends:

   - `as-collected` sends to backends the metrics as they are collected, in the units they are collected. So,
     counters are sent as counters and gauges are sent as gauges, much like all data collectors do. For example, to
     calculate CPU utilization in this format, you need to know how to convert kernel ticks to percentage.

   - `average` sends to backends normalized metrics from the Netdata database. In this mode, all metrics are sent as
     gauges, in the units Netdata uses. This abstracts data collection and simplifies visualization, but you will not
     be able to copy and paste queries from other sources to convert units. For example, CPU utilization percentage
     is calculated by Netdata, so Netdata will convert ticks to percentage and send the average percentage to the
     backend.

   - `sum` or `volume`: the sum of the interpolated values shown on the Netdata graphs is sent to the backend. So, if
     Netdata is configured to send data to the backend every 10 seconds, the sum of the 10 values shown on the
     Netdata charts will be used.

   Time-series databases suggest collecting the raw values (`as-collected`). If you plan to invest in building your
   monitoring around a time-series database and you already know (or will invest in learning) how to convert units
   and normalize the metrics in Grafana or other visualization tools, we suggest using `as-collected`.

   If, on the other hand, you just need long term archiving of Netdata metrics and you plan to mainly work with
   Netdata, we suggest using `average`. It decouples visualization from data collection, so it will generally be a lot
   simpler. Furthermore, if you use `average`, the charts shown in the backend will match exactly what you see in
   Netdata, which is not necessarily true for the other modes of operation.

5. This code is smart enough not to slow down Netdata, regardless of the speed of the backend server.

## configuration

In `/etc/netdata/netdata.conf` you should have something like this (if not, download the latest version of
`netdata.conf` from your Netdata):

```conf
[backend]
    enabled = yes | no
    type = graphite | opentsdb:telnet | opentsdb:http | opentsdb:https | prometheus_remote_write | json | kinesis | mongodb
    host tags = list of TAG=VALUE
    destination = space separated list of [PROTOCOL:]HOST[:PORT] - the first working will be used, or a region for kinesis
    data source = average | sum | as collected
    prefix = Netdata
    hostname = my-name
    update every = 10
    buffer on failures = 10
    timeout ms = 20000
    send charts matching = *
    send hosts matching = localhost *
    send names instead of ids = yes
```

- `enabled = yes | no`, enables or disables sending data to a backend

- `type = graphite | opentsdb:telnet | opentsdb:http | opentsdb:https | json | kinesis | mongodb`, selects the backend
  type

- `destination = host1 host2 host3 ...`, accepts **a space separated list** of hostnames, IPs (IPv4 and IPv6) and
  ports to connect to. Netdata will use the **first available** to send the metrics.

  The format of each item in this list is: `[PROTOCOL:]IP[:PORT]`.

  `PROTOCOL` can be `udp` or `tcp`. `tcp` is the default and the only one supported by the current backends.

  `IP` can be `XX.XX.XX.XX` (IPv4), or `[XX:XX...XX:XX]` (IPv6). For IPv6 you have to enclose the IP in `[]` to
  separate it from the port.

  `PORT` can be a number or a service name.
If omitted, the default port for the backend will be used - (graphite = 2003, opentsdb = 4242). - - Example IPv4: - -```conf - destination = 10.11.14.2:4242 10.11.14.3:4242 10.11.14.4:4242 -``` - - Example IPv6 and IPv4 together: - -```conf - destination = [ffff:...:0001]:2003 10.11.12.1:2003 -``` - - When multiple servers are defined, Netdata will try the next one when the first one fails. This allows you to - load-balance different servers: give your backend servers in different order on each Netdata. - - Netdata also ships `nc-backend.sh`, a script that can be used as a fallback backend to save the - metrics to disk and push them to the time-series database when it becomes available again. It can also be used to - monitor / trace / debug the metrics Netdata generates. - - For kinesis backend `destination` should be set to an AWS region (for example, `us-east-1`). - - The MongoDB backend doesn't use the `destination` option for its configuration. It uses the `mongodb.conf` - [configuration file](/backends/mongodb/README.md) instead. - -- `data source = as collected`, or `data source = average`, or `data source = sum`, selects the kind of data that will - be sent to the backend. - -- `hostname = my-name`, is the hostname to be used for sending data to the backend server. By default this is - `[global].hostname`. - -- `prefix = Netdata`, is the prefix to add to all metrics. - -- `update every = 10`, is the number of seconds between sending data to the backend. Netdata will add some randomness - to this number, to prevent stressing the backend server when many Netdata servers send data to the same backend. - This randomness does not affect the quality of the data, only the time they are sent. - -- `buffer on failures = 10`, is the number of iterations (each iteration is `[backend].update every` seconds) to - buffer data, when the backend is not available. If the backend fails to receive the data after that many failures, - data loss on the backend is expected (Netdata will also log it). - -- `timeout ms = 20000`, is the timeout in milliseconds to wait for the backend server to process the data. By default - this is `2 * update_every * 1000`. - -- `send hosts matching = localhost *` includes one or more space separated patterns, using `*` as wildcard (any number - of times within each pattern). The patterns are checked against the hostname (the localhost is always checked as - `localhost`), allowing us to filter which hosts will be sent to the backend when this Netdata is a central Netdata - aggregating multiple hosts. A pattern starting with `!` gives a negative match. So to match all hosts named `*db*` - except hosts containing `*child*`, use `!*child* *db*` (so, the order is important: the first pattern - matching the hostname will be used - positive or negative). - -- `send charts matching = *` includes one or more space separated patterns, using `*` as wildcard (any number of times - within each pattern). The patterns are checked against both chart id and chart name. A pattern starting with `!` - gives a negative match. So to match all charts named `apps.*` except charts ending in `*reads`, use `!*reads - apps.*` (so, the order is important: the first pattern matching the chart id or the chart name will be used - - positive or negative). - -- `send names instead of ids = yes | no` controls the metric names Netdata should send to backend. Netdata supports - names and IDs for charts and dimensions. 
Usually IDs are unique identifiers as read by the system and names are
  human-friendly labels (also unique). Most charts and metrics have the same ID and name, but in several cases they
  are different: disks with device-mapper, interrupts, QoS classes, statsd synthetic charts, etc.

- `host tags = list of TAG=VALUE` defines tags that should be appended on all metrics for the given host. These are
  currently only sent to graphite, json, opentsdb and prometheus. Please use the appropriate format for each
  time-series db. For example, opentsdb likes them like `TAG1=VALUE1 TAG2=VALUE2`, while prometheus likes them like
  `tag1="value1", tag2="value2"`. Host tags are mirrored with database replication (streaming of metrics between
  Netdata servers).

  Starting from Netdata v1.20, the host tags are parsed in accordance with the configured backend type and stored as
  host labels, so that they can be reused in API responses and exporting connectors. The parsing is supported for the
  graphite, json, opentsdb, and prometheus (default) backend types. You can check how the host tags were parsed using
  the `/api/v1/info` API call (see the example after the alarms section below).

## monitoring operation

Netdata provides 5 charts:

1. **Buffered metrics**, the number of metrics Netdata added to the buffer for dispatching them to the
   backend server.

2. **Buffered data size**, the amount of data (in KB) Netdata added to the buffer.

3. ~~**Backend latency**, the time the backend server needed to process the data Netdata sent. If there was a
   re-connection involved, this includes the connection time.~~ (This chart has been removed, because it only measures
   the time Netdata needs to hand the data to the O/S - since the backend servers do not acknowledge the reception,
   Netdata has no means to measure this properly.)

4. **Backend operations**, the number of operations performed by Netdata.

5. **Backend thread CPU usage**, the CPU resources consumed by the Netdata thread that is responsible for sending the
   metrics to the backend server.

![image](https://cloud.githubusercontent.com/assets/2662304/20463536/eb196084-af3d-11e6-8ee5-ddbd3b4d8449.png)

## alarms

Netdata adds 4 alarms:

1. `backend_last_buffering`, the number of seconds since the last successful buffering of backend data
2. `backend_metrics_sent`, the percentage of metrics sent to the backend server
3. `backend_metrics_lost`, the number of metrics lost due to repeated failures to contact the backend server
4. ~~`backend_slow`, the percentage of time between iterations needed by the backend server to process the data sent
   by Netdata~~ (this was misleading and has been removed).
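As mentioned under the `host tags` option above, Netdata v1.20+ also exposes the parsed tags as host labels through its API. A quick way to inspect what was parsed, assuming Netdata listens on the default port `19999` and `jq` is available (both are assumptions of this sketch, not requirements of the backend), is:

```sh
# Print the host labels reported by the local Netdata agent.
# "host_labels" is where parsed host tags are normally exposed; the exact
# layout of the response can differ between Netdata versions.
curl -s "http://localhost:19999/api/v1/info" | jq '.host_labels'
```

If `jq` is not installed, the raw JSON returned by `curl` can be inspected directly.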
- -![image](https://cloud.githubusercontent.com/assets/2662304/20463779/a46ed1c2-af43-11e6-91a5-07ca4533cac3.png) - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/TIMESCALE.md b/backends/TIMESCALE.md deleted file mode 100644 index 05a3c3b47..000000000 --- a/backends/TIMESCALE.md +++ /dev/null @@ -1,57 +0,0 @@ -<!-- -title: "Writing metrics to TimescaleDB" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/TIMESCALE.md ---> - -# Writing metrics to TimescaleDB - -Thanks to Netdata's community of developers and system administrators, and Mahlon Smith -([GitHub](https://github.com/mahlonsmith)/[Website](http://www.martini.nu/)) in particular, Netdata now supports -archiving metrics directly to TimescaleDB. - -What's TimescaleDB? Here's how their team defines the project on their [GitHub page](https://github.com/timescale/timescaledb): - -> TimescaleDB is an open-source database designed to make SQL scalable for time-series data. It is engineered up from -> PostgreSQL, providing automatic partitioning across time and space (partitioning key), as well as full SQL support. - -## Quickstart - -To get started archiving metrics to TimescaleDB right away, check out Mahlon's [`netdata-timescale-relay` -repository](https://github.com/mahlonsmith/netdata-timescale-relay) on GitHub. - -This small program takes JSON streams from a Netdata client and writes them to a PostgreSQL (aka TimescaleDB) table. -You'll run this program in parallel with Netdata, and after a short [configuration -process](https://github.com/mahlonsmith/netdata-timescale-relay#configuration), your metrics should start populating -TimescaleDB. - -Finally, another member of Netdata's community has built a project that quickly launches Netdata, TimescaleDB, and -Grafana in easy-to-manage Docker containers. Rune Juhl Jacobsen's -[project](https://github.com/runejuhl/grafana-timescaledb) uses a `Makefile` to create everything, which makes it -perfect for testing and experimentation. - -## Netdata↔TimescaleDB in action - -Aside from creating incredible contributions to Netdata, Mahlon works at [LAIKA](https://www.laika.com/), an -Oregon-based animation studio that's helped create acclaimed films like _Coraline_ and _Kubo and the Two Strings_. - -As part of his work to maintain the company's infrastructure of render farms, workstations, and virtual machines, he's -using Netdata, `netdata-timescale-relay`, and TimescaleDB to store Netdata metrics alongside other data from other -sources. - -> LAIKA is a long-time PostgreSQL user and added TimescaleDB to their infrastructure in 2018 to help manage and store -> their IT metrics and time-series data. So far, the tool has been in production at LAIKA for over a year and helps them -> with their use case of time-based logging, where they record over 8 million metrics an hour for netdata content alone. - -By archiving Netdata metrics to a backend like TimescaleDB, LAIKA can consolidate metrics data from distributed machines -efficiently. Mahlon can then correlate Netdata metrics with other sources directly in TimescaleDB. - -And, because LAIKA will soon be storing years worth of Netdata metrics data in TimescaleDB, they can analyze long-term -metrics as their films move from concept to final cut. 
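Since `netdata-timescale-relay` consumes the JSON stream produced by Netdata's `json` backend, the Netdata side of such a setup is just a small `[backend]` section along these lines. This is only a sketch: the destination host and port are placeholders for wherever your relay instance is configured to listen, not documented defaults.

```conf
[backend]
    enabled = yes
    type = json
    # Placeholder: point this at the address your netdata-timescale-relay listens on.
    destination = localhost:8080
    update every = 10
```

The relay then takes care of writing the received JSON documents into the TimescaleDB table, as described in its configuration guide linked above.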
- -Read the full blog post from LAIKA at the [TimescaleDB -blog](https://blog.timescale.com/blog/writing-it-metrics-from-netdata-to-timescaledb/amp/). - -Thank you to Mahlon, Rune, TimescaleDB, and the members of the Netdata community that requested and then built this -backend connection between Netdata and TimescaleDB! - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FTIMESCALE&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/WALKTHROUGH.md b/backends/WALKTHROUGH.md deleted file mode 100644 index bb38e7c1c..000000000 --- a/backends/WALKTHROUGH.md +++ /dev/null @@ -1,258 +0,0 @@ -<!-- -title: "Netdata, Prometheus, Grafana stack" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/WALKTHROUGH.md ---> - -# Netdata, Prometheus, Grafana stack - -## Intro - -In this article I will walk you through the basics of getting Netdata, Prometheus and Grafana all working together and -monitoring your application servers. This article will be using docker on your local workstation. We will be working -with docker in an ad-hoc way, launching containers that run ‘/bin/bash’ and attaching a TTY to them. I use docker here -in a purely academic fashion and do not condone running Netdata in a container. I pick this method so individuals -without cloud accounts or access to VMs can try this out and for it’s speed of deployment. - -## Why Netdata, Prometheus, and Grafana - -Some time ago I was introduced to Netdata by a coworker. We were attempting to troubleshoot python code which seemed to -be bottlenecked. I was instantly impressed by the amount of metrics Netdata exposes to you. I quickly added Netdata to -my set of go-to tools when troubleshooting systems performance. - -Some time ago, even later, I was introduced to Prometheus. Prometheus is a monitoring application which flips the normal -architecture around and polls rest endpoints for its metrics. This architectural change greatly simplifies and decreases -the time necessary to begin monitoring your applications. Compared to current monitoring solutions the time spent on -designing the infrastructure is greatly reduced. Running a single Prometheus server per application becomes feasible -with the help of Grafana. - -Grafana has been the go to graphing tool for… some time now. It’s awesome, anyone that has used it knows it’s awesome. -We can point Grafana at Prometheus and use Prometheus as a data source. This allows a pretty simple overall monitoring -architecture: Install Netdata on your application servers, point Prometheus at Netdata, and then point Grafana at -Prometheus. - -I’m omitting an important ingredient in this stack in order to keep this tutorial simple and that is service discovery. -My personal preference is to use Consul. Prometheus can plug into consul and automatically begin to scrape new hosts -that register a Netdata client with Consul. - -At the end of this tutorial you will understand how each technology fits together to create a modern monitoring stack. -This stack will offer you visibility into your application and systems performance. - -## Getting Started - Netdata - -To begin let’s create our container which we will install Netdata on. We need to run a container, forward the necessary -port that Netdata listens on, and attach a tty so we can interact with the bash shell on the container. 
But before we do -this we want name resolution between the two containers to work. In order to accomplish this we will create a -user-defined network and attach both containers to this network. The first command we should run is: - -```sh -docker network create --driver bridge netdata-tutorial -``` - -With this user-defined network created we can now launch our container we will install Netdata on and point it to this -network. - -```sh -docker run -it --name netdata --hostname netdata --network=netdata-tutorial -p 19999:19999 centos:latest '/bin/bash' -``` - -This command creates an interactive tty session (-it), gives the container both a name in relation to the docker daemon -and a hostname (this is so you know what container is which when working in the shells and docker maps hostname -resolution to this container), forwards the local port 19999 to the container’s port 19999 (-p 19999:19999), sets the -command to run (/bin/bash) and then chooses the base container images (centos:latest). After running this you should be -sitting inside the shell of the container. - -After we have entered the shell we can install Netdata. This process could not be easier. If you take a look at [this -link](/packaging/installer/README.md), the Netdata devs give us several one-liners to install Netdata. I have not had -any issues with these one liners and their bootstrapping scripts so far (If you guys run into anything do share). Run -the following command in your container. - -```sh -bash <(curl -Ss https://my-netdata.io/kickstart.sh) --dont-wait -``` - -After the install completes you should be able to hit the Netdata dashboard at <http://localhost:19999/> (replace -localhost if you’re doing this on a VM or have the docker container hosted on a machine not on your local system). If -this is your first time using Netdata I suggest you take a look around. The amount of time I’ve spent digging through -/proc and calculating my own metrics has been greatly reduced by this tool. Take it all in. - -Next I want to draw your attention to a particular endpoint. Navigate to -<http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes> In your browser. This is the endpoint which -publishes all the metrics in a format which Prometheus understands. Let’s take a look at one of these metrics. -`netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000` This -metric is representing several things which I will go in more details in the section on prometheus. For now understand -that this metric: `netdata_system_cpu_percentage_average` has several labels: (chart, family, dimension). This -corresponds with the first cpu chart you see on the Netdata dashboard. - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%204.00.45%20PM.png) - -This CHART is called ‘system.cpu’, The FAMILY is cpu, and the DIMENSION we are observing is “system”. You can begin to -draw links between the charts in Netdata to the prometheus metrics format in this manner. - -## Prometheus - -We will be installing prometheus in a container for purpose of demonstration. While prometheus does have an official -container I would like to walk through the install process and setup on a fresh container. This will allow anyone -reading to migrate this tutorial to a VM or Server of any sort. - -Let’s start another container in the same fashion as we did the Netdata container. 
- -```sh -docker run -it --name prometheus --hostname prometheus ---network=netdata-tutorial -p 9090:9090 centos:latest '/bin/bash' -``` - -This should drop you into a shell once again. Once there quickly install your favorite editor as we will be editing -files later in this tutorial. - -```sh -yum install vim -y -``` - -Prometheus provides a tarball of their latest stable versions [here](https://prometheus.io/download/). - -Let’s download the latest version and install into your container. - -```sh -cd /tmp && curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest \ -| grep "browser_download_url.*linux-amd64.tar.gz" \ -| cut -d '"' -f 4 \ -| wget -qi - - -mkdir /opt/prometheus - -sudo tar -xvf /tmp/prometheus-*linux-amd64.tar.gz -C /opt/prometheus --strip=1 -``` - -This should get prometheus installed into the container. Let’s test that we can run prometheus and connect to it’s web -interface. - -```sh -/opt/prometheus/prometheus -``` - -Now attempt to go to <http://localhost:9090/>. You should be presented with the prometheus homepage. This is a good -point to talk about Prometheus’s data model which can be viewed here: <https://prometheus.io/docs/concepts/data_model/> -As explained we have two key elements in Prometheus metrics. We have the ‘metric’ and its ‘labels’. Labels allow for -granularity between metrics. Let’s use our previous example to further explain. - -```conf -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000 -``` - -Here our metric is ‘netdata_system_cpu_percentage_average’ and our labels are ‘chart’, ‘family’, and ‘dimension. The -last two values constitute the actual metric value for the metric type (gauge, counter, etc…). We can begin graphing -system metrics with this information, but first we need to hook up Prometheus to poll Netdata stats. - -Let’s move our attention to Prometheus’s configuration. Prometheus gets it config from the file located (in our example) -at `/opt/prometheus/prometheus.yml`. I won’t spend an extensive amount of time going over the configuration values -documented here: <https://prometheus.io/docs/operating/configuration/>. We will be adding a new“job” under the -“scrape_configs”. Let’s make the “scrape_configs” section look like this (we can use the dns name Netdata due to the -custom user-defined network we created in docker beforehand). - -```yaml -scrape_configs: - # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config. - - job_name: 'prometheus' - - # metrics_path defaults to '/metrics' - # scheme defaults to 'http'. - - static_configs: - - targets: ['localhost:9090'] - - - job_name: 'netdata' - - metrics_path: /api/v1/allmetrics - params: - format: [ prometheus ] - - static_configs: - - targets: ['netdata:19999'] -``` - -Let’s start prometheus once again by running `/opt/prometheus/prometheus`. If we now navigate to prometheus at -‘<http://localhost:9090/targets’> we should see our target being successfully scraped. If we now go back to the -Prometheus’s homepage and begin to type ‘netdata\_’ Prometheus should auto complete metrics it is now scraping. - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.13.43%20PM.png) - -Let’s now start exploring how we can graph some metrics. Back in our Netdata container lets get the CPU spinning with a -pointless busy loop. 
On the shell do the following: - -```sh -[root@netdata /]# while true; do echo "HOT HOT HOT CPU"; done -``` - -Our Netdata cpu graph should be showing some activity. Let’s represent this in Prometheus. In order to do this let’s -keep our metrics page open for reference: <http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes> We are -setting out to graph the data in the CPU chart so let’s search for “system.cpu”in the metrics page above. We come across -a section of metrics with the first comments `# COMMENT homogeneous chart "system.cpu", context "system.cpu", family -"cpu", units "percentage"` Followed by the metrics. This is a good start now let us drill down to the specific metric we -would like to graph. - -```conf -# COMMENT -netdata_system_cpu_percentage_average: dimension "system", value is percentage, gauge, dt 1501275951 to 1501275951 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0000000 1501275951000 -``` - -Here we learn that the metric name we care about is‘netdata_system_cpu_percentage_average’ so throw this into Prometheus -and see what we get. We should see something similar to this (I shut off my busy loop) - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.47.53%20PM.png) - -This is a good step toward what we want. Also make note that Prometheus will tag on an ‘instance’ label for us which -corresponds to our statically defined job in the configuration file. This allows us to tailor our queries to specific -instances. Now we need to isolate the dimension we want in our query. To do this let us refine the query slightly. Let’s -query the dimension also. Place this into our query text box. -`netdata_system_cpu_percentage_average{dimension="system"}` We now wind up with the following graph. - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.54.40%20PM.png) - -Awesome, this is exactly what we wanted. If you haven’t caught on yet we can emulate entire charts from Netdata by using -the `chart` dimension. If you’d like you can combine the ‘chart’ and ‘instance’ dimension to create per-instance charts. -Let’s give this a try: `netdata_system_cpu_percentage_average{chart="system.cpu", instance="netdata:19999"}` - -This is the basics of using Prometheus to query Netdata. I’d advise everyone at this point to read [this -page](/backends/prometheus/README.md#using-netdata-with-prometheus). The key point here is that Netdata can export metrics from -its internal DB or can send metrics “as-collected” by specifying the ‘source=as-collected’ url parameter like so. -<http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes&types=yes&source=as-collected> If you choose to use -this method you will need to use Prometheus's set of functions here: <https://prometheus.io/docs/querying/functions/> to -obtain useful metrics as you are now dealing with raw counters from the system. For example you will have to use the -`irate()` function over a counter to get that metric's rate per second. If your graphing needs are met by using the -metrics returned by Netdata's internal database (not specifying any source= url parameter) then use that. If you find -limitations then consider re-writing your queries using the raw data and using Prometheus functions to get the desired -chart. - -## Grafana - -Finally we make it to grafana. This is the easiest part in my opinion. 
This time we will actually run the official -grafana docker container as all configuration we need to do is done via the GUI. Let’s run the following command: - -```sh -docker run -i -p 3000:3000 --network=netdata-tutorial grafana/grafana -``` - -This will get grafana running at ‘<http://localhost:3000/’> Let’s go there and - -login using the credentials Admin:Admin. - -The first thing we want to do is click ‘Add data source’. Let’s make it look like the following screenshot - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%206.36.55%20PM.png) - -With this completed let’s graph! Create a new Dashboard by clicking on the top left Grafana Icon and create a new graph -in that dashboard. Fill in the query like we did above and save. - -![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%206.39.38%20PM.png) - -## Conclusion - -There you have it, a complete systems monitoring stack which is very easy to deploy. From here I would begin to -understand how Prometheus and a service discovery mechanism such as Consul can play together nicely. My current prod -deployments automatically register Netdata services into Consul and Prometheus automatically begins to scrape them. Once -achieved you do not have to think about the monitoring system until Prometheus cannot keep up with your scale. Once this -happens there are options presented in the Prometheus documentation for solving this. Hope this was helpful, happy -monitoring. - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FWALKTHROUGH&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/aws_kinesis/Makefile.am b/backends/aws_kinesis/Makefile.am deleted file mode 100644 index 1fec72c1f..000000000 --- a/backends/aws_kinesis/Makefile.am +++ /dev/null @@ -1,12 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in - -dist_noinst_DATA = \ - README.md \ - $(NULL) - -dist_libconfig_DATA = \ - aws_kinesis.conf \ - $(NULL) diff --git a/backends/aws_kinesis/README.md b/backends/aws_kinesis/README.md deleted file mode 100644 index a2b682517..000000000 --- a/backends/aws_kinesis/README.md +++ /dev/null @@ -1,53 +0,0 @@ -<!-- -title: "Using Netdata with AWS Kinesis Data Streams" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/aws_kinesis/README.md ---> - -# Using Netdata with AWS Kinesis Data Streams - -## Prerequisites - -To use AWS Kinesis as a backend AWS SDK for C++ should be -[installed](https://docs.aws.amazon.com/en_us/sdk-for-cpp/v1/developer-guide/setup.html) first. `libcrypto`, `libssl`, -and `libcurl` are also required to compile Netdata with Kinesis support enabled. Next, Netdata should be re-installed -from the source. The installer will detect that the required libraries are now available. - -If the AWS SDK for C++ is being installed from source, it is useful to set `-DBUILD_ONLY="kinesis"`. Otherwise, the -building process could take a very long time. Take a note, that the default installation path for the libraries is -`/usr/local/lib64`. 
Many Linux distributions don't include this path as the default one for a library search, so it is -advisable to use the following options to `cmake` while building the AWS SDK: - -```sh -cmake -DCMAKE_INSTALL_LIBDIR=/usr/lib -DCMAKE_INSTALL_INCLUDEDIR=/usr/include -DBUILD_SHARED_LIBS=OFF -DBUILD_ONLY=kinesis <aws-sdk-cpp sources> -``` - -## Configuration - -To enable data sending to the kinesis backend set the following options in `netdata.conf`: - -```conf -[backend] - enabled = yes - type = kinesis - destination = us-east-1 -``` - -set the `destination` option to an AWS region. - -In the Netdata configuration directory run `./edit-config aws_kinesis.conf` and set AWS credentials and stream name: - -```yaml -# AWS credentials -aws_access_key_id = your_access_key_id -aws_secret_access_key = your_secret_access_key - -# destination stream -stream name = your_stream_name -``` - -Alternatively, AWS credentials can be set for the `netdata` user using AWS SDK for C++ [standard methods](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/credentials.html). - -A partition key for every record is computed automatically by Netdata with the purpose to distribute records across -available shards evenly. - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Faws_kinesis%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/aws_kinesis/aws_kinesis.c b/backends/aws_kinesis/aws_kinesis.c deleted file mode 100644 index b1ea47862..000000000 --- a/backends/aws_kinesis/aws_kinesis.c +++ /dev/null @@ -1,94 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "aws_kinesis.h" - -#define CONFIG_FILE_LINE_MAX ((CONFIG_MAX_NAME + CONFIG_MAX_VALUE + 1024) * 2) - -// ---------------------------------------------------------------------------- -// kinesis backend - -// read the aws_kinesis.conf file -int read_kinesis_conf(const char *path, char **access_key_id_p, char **secret_access_key_p, char **stream_name_p) -{ - char *access_key_id = *access_key_id_p; - char *secret_access_key = *secret_access_key_p; - char *stream_name = *stream_name_p; - - if(unlikely(access_key_id)) freez(access_key_id); - if(unlikely(secret_access_key)) freez(secret_access_key); - if(unlikely(stream_name)) freez(stream_name); - access_key_id = NULL; - secret_access_key = NULL; - stream_name = NULL; - - int line = 0; - - char filename[FILENAME_MAX + 1]; - snprintfz(filename, FILENAME_MAX, "%s/aws_kinesis.conf", path); - - char buffer[CONFIG_FILE_LINE_MAX + 1], *s; - - debug(D_BACKEND, "BACKEND: opening config file '%s'", filename); - - FILE *fp = fopen(filename, "r"); - if(!fp) { - return 1; - } - - while(fgets(buffer, CONFIG_FILE_LINE_MAX, fp) != NULL) { - buffer[CONFIG_FILE_LINE_MAX] = '\0'; - line++; - - s = trim(buffer); - if(!s || *s == '#') { - debug(D_BACKEND, "BACKEND: ignoring line %d of file '%s', it is empty.", line, filename); - continue; - } - - char *name = s; - char *value = strchr(s, '='); - if(unlikely(!value)) { - error("BACKEND: ignoring line %d ('%s') of file '%s', there is no = in it.", line, s, filename); - continue; - } - *value = '\0'; - value++; - - name = trim(name); - value = trim(value); - - if(unlikely(!name || *name == '#')) { - error("BACKEND: ignoring line %d of file '%s', name is empty.", line, filename); - continue; - } - - if(!value) - value = ""; - else - value = 
strip_quotes(value); - - if(name[0] == 'a' && name[4] == 'a' && !strcmp(name, "aws_access_key_id")) { - access_key_id = strdupz(value); - } - else if(name[0] == 'a' && name[4] == 's' && !strcmp(name, "aws_secret_access_key")) { - secret_access_key = strdupz(value); - } - else if(name[0] == 's' && !strcmp(name, "stream name")) { - stream_name = strdupz(value); - } - } - - fclose(fp); - - if(unlikely(!stream_name || !*stream_name)) { - error("BACKEND: stream name is a mandatory Kinesis parameter but it is not configured"); - return 1; - } - - *access_key_id_p = access_key_id; - *secret_access_key_p = secret_access_key; - *stream_name_p = stream_name; - - return 0; -} diff --git a/backends/aws_kinesis/aws_kinesis.conf b/backends/aws_kinesis/aws_kinesis.conf deleted file mode 100644 index cc54b5fa2..000000000 --- a/backends/aws_kinesis/aws_kinesis.conf +++ /dev/null @@ -1,10 +0,0 @@ -# AWS Kinesis Data Streams backend configuration -# -# All options in this file are mandatory - -# AWS credentials -aws_access_key_id = -aws_secret_access_key = - -# destination stream -stream name =
\ No newline at end of file diff --git a/backends/aws_kinesis/aws_kinesis.h b/backends/aws_kinesis/aws_kinesis.h deleted file mode 100644 index 50a4631c5..000000000 --- a/backends/aws_kinesis/aws_kinesis.h +++ /dev/null @@ -1,14 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_KINESIS_H -#define NETDATA_BACKEND_KINESIS_H - -#include "backends/backends.h" -#include "aws_kinesis_put_record.h" - -#define KINESIS_PARTITION_KEY_MAX 256 -#define KINESIS_RECORD_MAX 1024 * 1024 - -extern int read_kinesis_conf(const char *path, char **auth_key_id_p, char **secure_key_p, char **stream_name_p); - -#endif //NETDATA_BACKEND_KINESIS_H diff --git a/backends/aws_kinesis/aws_kinesis_put_record.cc b/backends/aws_kinesis/aws_kinesis_put_record.cc deleted file mode 100644 index a8ba4aaca..000000000 --- a/backends/aws_kinesis/aws_kinesis_put_record.cc +++ /dev/null @@ -1,87 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#include <aws/core/Aws.h> -#include <aws/core/client/ClientConfiguration.h> -#include <aws/core/auth/AWSCredentials.h> -#include <aws/core/utils/Outcome.h> -#include <aws/kinesis/KinesisClient.h> -#include <aws/kinesis/model/PutRecordRequest.h> -#include "aws_kinesis_put_record.h" - -using namespace Aws; - -static SDKOptions options; - -static Kinesis::KinesisClient *client; - -struct request_outcome { - Kinesis::Model::PutRecordOutcomeCallable future_outcome; - size_t data_len; -}; - -static Vector<request_outcome> request_outcomes; - -void backends_kinesis_init(const char *region, const char *access_key_id, const char *secret_key, const long timeout) { - InitAPI(options); - - Client::ClientConfiguration config; - - config.region = region; - config.requestTimeoutMs = timeout; - config.connectTimeoutMs = timeout; - - if(access_key_id && *access_key_id && secret_key && *secret_key) { - client = New<Kinesis::KinesisClient>("client", Auth::AWSCredentials(access_key_id, secret_key), config); - } else { - client = New<Kinesis::KinesisClient>("client", config); - } -} - -void backends_kinesis_shutdown() { - Delete(client); - - ShutdownAPI(options); -} - -int backends_kinesis_put_record(const char *stream_name, const char *partition_key, - const char *data, size_t data_len) { - Kinesis::Model::PutRecordRequest request; - - request.SetStreamName(stream_name); - request.SetPartitionKey(partition_key); - request.SetData(Utils::ByteBuffer((unsigned char*) data, data_len)); - - request_outcomes.push_back({client->PutRecordCallable(request), data_len}); - - return 0; -} - -int backends_kinesis_get_result(char *error_message, size_t *sent_bytes, size_t *lost_bytes) { - Kinesis::Model::PutRecordOutcome outcome; - *sent_bytes = 0; - *lost_bytes = 0; - - for(auto request_outcome = request_outcomes.begin(); request_outcome != request_outcomes.end(); ) { - std::future_status status = request_outcome->future_outcome.wait_for(std::chrono::microseconds(100)); - - if(status == std::future_status::ready || status == std::future_status::deferred) { - outcome = request_outcome->future_outcome.get(); - *sent_bytes += request_outcome->data_len; - - if(!outcome.IsSuccess()) { - *lost_bytes += request_outcome->data_len; - outcome.GetError().GetMessage().copy(error_message, ERROR_LINE_MAX); - } - - request_outcomes.erase(request_outcome); - } else { - ++request_outcome; - } - } - - if(*lost_bytes) { - return 1; - } - - return 0; -}
\ No newline at end of file diff --git a/backends/aws_kinesis/aws_kinesis_put_record.h b/backends/aws_kinesis/aws_kinesis_put_record.h deleted file mode 100644 index fa3d03459..000000000 --- a/backends/aws_kinesis/aws_kinesis_put_record.h +++ /dev/null @@ -1,25 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_KINESIS_PUT_RECORD_H -#define NETDATA_BACKEND_KINESIS_PUT_RECORD_H - -#define ERROR_LINE_MAX 1023 - -#ifdef __cplusplus -extern "C" { -#endif - -void backends_kinesis_init(const char *region, const char *access_key_id, const char *secret_key, const long timeout); - -void backends_kinesis_shutdown(); - -int backends_kinesis_put_record(const char *stream_name, const char *partition_key, - const char *data, size_t data_len); - -int backends_kinesis_get_result(char *error_message, size_t *sent_bytes, size_t *lost_bytes); - -#ifdef __cplusplus -} -#endif - -#endif //NETDATA_BACKEND_KINESIS_PUT_RECORD_H diff --git a/backends/backends.c b/backends/backends.c deleted file mode 100644 index dca21ef1c..000000000 --- a/backends/backends.c +++ /dev/null @@ -1,1247 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "backends.h" - -// ---------------------------------------------------------------------------- -// How backends work in netdata: -// -// 1. There is an independent thread that runs at the required interval -// (for example, once every 10 seconds) -// -// 2. Every time it wakes, it calls the backend formatting functions to build -// a buffer of data. This is a very fast, memory only operation. -// -// 3. If the buffer already includes data, the new data are appended. -// If the buffer becomes too big, because the data cannot be sent, a -// log is written and the buffer is discarded. -// -// 4. Then it tries to send all the data. It blocks until all the data are sent -// or the socket returns an error. -// If the time required for this is above the interval, it starts skipping -// intervals, but the calculated values include the entire database, without -// gaps (it remembers the timestamps and continues from where it stopped). -// -// 5. repeats the above forever. -// - -const char *global_backend_prefix = "netdata"; -const char *global_backend_send_charts_matching = "*"; -int global_backend_update_every = 10; -BACKEND_OPTIONS global_backend_options = BACKEND_SOURCE_DATA_AVERAGE | BACKEND_OPTION_SEND_NAMES; -const char *global_backend_source = NULL; - -// ---------------------------------------------------------------------------- -// helper functions for backends - -size_t backend_name_copy(char *d, const char *s, size_t usable) { - size_t n; - - for(n = 0; *s && n < usable ; d++, s++, n++) { - char c = *s; - - if(c != '.' 
&& !isalnum(c)) *d = '_'; - else *d = c; - } - *d = '\0'; - - return n; -} - -// calculate the SUM or AVERAGE of a dimension, for any timeframe -// may return NAN if the database does not have any value in the give timeframe - -calculated_number backend_calculate_value_from_stored_data( - RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap - , time_t *first_timestamp // the first point of the database used in this response - , time_t *last_timestamp // the timestamp that should be reported to backend -) { - RRDHOST *host = st->rrdhost; - (void)host; - - // find the edges of the rrd database for this chart - time_t first_t = rd->state->query_ops.oldest_time(rd); - time_t last_t = rd->state->query_ops.latest_time(rd); - time_t update_every = st->update_every; - struct rrddim_query_handle handle; - storage_number n; - - // step back a little, to make sure we have complete data collection - // for all metrics - after -= update_every * 2; - before -= update_every * 2; - - // align the time-frame - after = after - (after % update_every); - before = before - (before % update_every); - - // for before, loose another iteration - // the latest point will be reported the next time - before -= update_every; - - if(unlikely(after > before)) - // this can happen when update_every > before - after - after = before; - - if(unlikely(after < first_t)) - after = first_t; - - if(unlikely(before > last_t)) - before = last_t; - - if(unlikely(before < first_t || after > last_t)) { - // the chart has not been updated in the wanted timeframe - debug(D_BACKEND, "BACKEND: %s.%s.%s: aligned timeframe %lu to %lu is outside the chart's database range %lu to %lu", - host->hostname, st->id, rd->id, - (unsigned long)after, (unsigned long)before, - (unsigned long)first_t, (unsigned long)last_t - ); - return NAN; - } - - *first_timestamp = after; - *last_timestamp = before; - - size_t counter = 0; - calculated_number sum = 0; - -/* - long start_at_slot = rrdset_time2slot(st, before), - stop_at_slot = rrdset_time2slot(st, after), - slot, stop_now = 0; - - for(slot = start_at_slot; !stop_now ; slot--) { - - if(unlikely(slot < 0)) slot = st->entries - 1; - if(unlikely(slot == stop_at_slot)) stop_now = 1; - - storage_number n = rd->values[slot]; - - if(unlikely(!does_storage_number_exist(n))) { - // not collected - continue; - } - - calculated_number value = unpack_storage_number(n); - sum += value; - - counter++; - } -*/ - for(rd->state->query_ops.init(rd, &handle, after, before) ; !rd->state->query_ops.is_finished(&handle) ; ) { - time_t curr_t; - n = rd->state->query_ops.next_metric(&handle, &curr_t); - - if(unlikely(!does_storage_number_exist(n))) { - // not collected - continue; - } - - calculated_number value = unpack_storage_number(n); - sum += value; - - counter++; - } - rd->state->query_ops.finalize(&handle); - if(unlikely(!counter)) { - debug(D_BACKEND, "BACKEND: %s.%s.%s: no values stored in database for range %lu to %lu", - host->hostname, st->id, rd->id, - (unsigned long)after, (unsigned long)before - ); - return NAN; - } - - if(unlikely(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM)) - return sum; - - return sum / (calculated_number)counter; -} - - -// discard a response received by a backend -// after logging a simple of it to error.log - -int discard_response(BUFFER *b, const char *backend) { - char sample[1024]; - const char *s = 
buffer_tostring(b); - char *d = sample, *e = &sample[sizeof(sample) - 1]; - - for(; *s && d < e ;s++) { - char c = *s; - if(unlikely(!isprint(c))) c = ' '; - *d++ = c; - } - *d = '\0'; - - info("BACKEND: received %zu bytes from %s backend. Ignoring them. Sample: '%s'", buffer_strlen(b), backend, sample); - buffer_flush(b); - return 0; -} - - -// ---------------------------------------------------------------------------- -// the backend thread - -static SIMPLE_PATTERN *charts_pattern = NULL; -static SIMPLE_PATTERN *hosts_pattern = NULL; - -inline int backends_can_send_rrdset(BACKEND_OPTIONS backend_options, RRDSET *st) { - RRDHOST *host = st->rrdhost; - (void)host; - - if(unlikely(rrdset_flag_check(st, RRDSET_FLAG_BACKEND_IGNORE))) - return 0; - - if(unlikely(!rrdset_flag_check(st, RRDSET_FLAG_BACKEND_SEND))) { - // we have not checked this chart - if(simple_pattern_matches(charts_pattern, st->id) || simple_pattern_matches(charts_pattern, st->name)) - rrdset_flag_set(st, RRDSET_FLAG_BACKEND_SEND); - else { - rrdset_flag_set(st, RRDSET_FLAG_BACKEND_IGNORE); - debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s', because it is disabled for backends.", st->id, host->hostname); - return 0; - } - } - - if(unlikely(!rrdset_is_available_for_exporting_and_alarms(st))) { - debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s', because it is not available for backends.", st->id, host->hostname); - return 0; - } - - if(unlikely(st->rrd_memory_mode == RRD_MEMORY_MODE_NONE && !(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED))) { - debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s' because its memory mode is '%s' and the backend requires database access.", st->id, host->hostname, rrd_memory_mode_name(host->rrd_memory_mode)); - return 0; - } - - return 1; -} - -inline BACKEND_OPTIONS backend_parse_data_source(const char *source, BACKEND_OPTIONS backend_options) { - if(!strcmp(source, "raw") || !strcmp(source, "as collected") || !strcmp(source, "as-collected") || !strcmp(source, "as_collected") || !strcmp(source, "ascollected")) { - backend_options |= BACKEND_SOURCE_DATA_AS_COLLECTED; - backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_AS_COLLECTED); - } - else if(!strcmp(source, "average")) { - backend_options |= BACKEND_SOURCE_DATA_AVERAGE; - backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_AVERAGE); - } - else if(!strcmp(source, "sum") || !strcmp(source, "volume")) { - backend_options |= BACKEND_SOURCE_DATA_SUM; - backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_SUM); - } - else { - error("BACKEND: invalid data source method '%s'.", source); - } - - return backend_options; -} - -static void backends_main_cleanup(void *ptr) { - struct netdata_static_thread *static_thread = (struct netdata_static_thread *)ptr; - static_thread->enabled = NETDATA_MAIN_THREAD_EXITING; - - info("cleaning up..."); - - static_thread->enabled = NETDATA_MAIN_THREAD_EXITED; -} - -/** - * Set Kinesis variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. 
- * @param brf function called to format the message to the backend - */ -void backend_set_kinesis_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - (void)default_port; -#ifndef HAVE_KINESIS - (void)brc; - (void)brf; -#endif - -#if HAVE_KINESIS - *brc = process_json_response; - if (BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_json_plaintext; - else - *brf = backends_format_dimension_stored_json_plaintext; -#endif -} - -/** - * Set Prometheus variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_prometheus_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - (void)default_port; - (void)brf; -#ifndef ENABLE_PROMETHEUS_REMOTE_WRITE - (void)brc; -#endif - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - *brc = backends_process_prometheus_remote_write_response; -#endif /* ENABLE_PROMETHEUS_REMOTE_WRITE */ -} - -/** - * Set MongoDB variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_mongodb_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - (void)default_port; -#ifndef HAVE_MONGOC - (void)brc; - (void)brf; -#endif - -#if HAVE_MONGOC - *brc = process_json_response; - if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_json_plaintext; - else - *brf = backends_format_dimension_stored_json_plaintext; -#endif -} - -/** - * Set JSON variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_json_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - *default_port = 5448; - *brc = process_json_response; - - if (BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_json_plaintext; - else - *brf = backends_format_dimension_stored_json_plaintext; -} - -/** - * Set OpenTSDB HTTP variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_opentsdb_http_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - *default_port = 4242; - *brc = process_opentsdb_response; - - if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_opentsdb_http; - else - *brf = backends_format_dimension_stored_opentsdb_http; - -} - -/** - * Set OpenTSDB Telnet variables - * - * Set the variables necessary to work with this specific backend. 
- * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_opentsdb_telnet_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - *default_port = 4242; - *brc = process_opentsdb_response; - - if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_opentsdb_telnet; - else - *brf = backends_format_dimension_stored_opentsdb_telnet; -} - -/** - * Set Graphite variables - * - * Set the variables necessary to work with this specific backend. - * - * @param default_port the default port of the backend - * @param brc function called to check the result. - * @param brf function called to format the message to the backend - */ -void backend_set_graphite_variables(int *default_port, - backend_response_checker_t brc, - backend_request_formatter_t brf) -{ - *default_port = 2003; - *brc = process_graphite_response; - - if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - *brf = backends_format_dimension_collected_graphite_plaintext; - else - *brf = backends_format_dimension_stored_graphite_plaintext; -} - -/** - * Select Type - * - * Select the backend type based in the user input - * - * @param type is the string that defines the backend type - * - * @return It returns the backend id. - */ -BACKEND_TYPE backend_select_type(const char *type) { - if(!strcmp(type, "graphite") || !strcmp(type, "graphite:plaintext")) { - return BACKEND_TYPE_GRAPHITE; - } - else if(!strcmp(type, "opentsdb") || !strcmp(type, "opentsdb:telnet")) { - return BACKEND_TYPE_OPENTSDB_USING_TELNET; - } - else if(!strcmp(type, "opentsdb:http") || !strcmp(type, "opentsdb:https")) { - return BACKEND_TYPE_OPENTSDB_USING_HTTP; - } - else if (!strcmp(type, "json") || !strcmp(type, "json:plaintext")) { - return BACKEND_TYPE_JSON; - } - else if (!strcmp(type, "prometheus_remote_write")) { - return BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE; - } - else if (!strcmp(type, "kinesis") || !strcmp(type, "kinesis:plaintext")) { - return BACKEND_TYPE_KINESIS; - } - else if (!strcmp(type, "mongodb") || !strcmp(type, "mongodb:plaintext")) { - return BACKEND_TYPE_MONGODB; - } - - return BACKEND_TYPE_UNKNOWN; -} - -/** - * Backend main - * - * The main thread used to control the backends. - * - * @param ptr a pointer to netdata_static_structure. - * - * @return It always return NULL. 
- */ -void *backends_main(void *ptr) { - netdata_thread_cleanup_push(backends_main_cleanup, ptr); - - int default_port = 0; - int sock = -1; - BUFFER *b = buffer_create(1), *response = buffer_create(1); - int (*backend_request_formatter)(BUFFER *, const char *, RRDHOST *, const char *, RRDSET *, RRDDIM *, time_t, time_t, BACKEND_OPTIONS) = NULL; - int (*backend_response_checker)(BUFFER *) = NULL; - -#if HAVE_KINESIS - int do_kinesis = 0; - char *kinesis_auth_key_id = NULL, *kinesis_secure_key = NULL, *kinesis_stream_name = NULL; -#endif - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - int do_prometheus_remote_write = 0; - BUFFER *http_request_header = NULL; -#endif - -#if HAVE_MONGOC - int do_mongodb = 0; - char *mongodb_uri = NULL; - char *mongodb_database = NULL; - char *mongodb_collection = NULL; - - // set the default socket timeout in ms - int32_t mongodb_default_socket_timeout = (int32_t)(global_backend_update_every >= 2)?(global_backend_update_every * MSEC_PER_SEC - 500):1000; - -#endif - -#ifdef ENABLE_HTTPS - struct netdata_ssl opentsdb_ssl = {NULL , NETDATA_SSL_START}; -#endif - - // ------------------------------------------------------------------------ - // collect configuration options - - struct timeval timeout = { - .tv_sec = 0, - .tv_usec = 0 - }; - int enabled = config_get_boolean(CONFIG_SECTION_BACKEND, "enabled", 0); - const char *source = config_get(CONFIG_SECTION_BACKEND, "data source", "average"); - const char *type = config_get(CONFIG_SECTION_BACKEND, "type", "graphite"); - const char *destination = config_get(CONFIG_SECTION_BACKEND, "destination", "localhost"); - global_backend_prefix = config_get(CONFIG_SECTION_BACKEND, "prefix", "netdata"); - const char *hostname = config_get(CONFIG_SECTION_BACKEND, "hostname", localhost->hostname); - global_backend_update_every = (int)config_get_number(CONFIG_SECTION_BACKEND, "update every", global_backend_update_every); - int buffer_on_failures = (int)config_get_number(CONFIG_SECTION_BACKEND, "buffer on failures", 10); - long timeoutms = config_get_number(CONFIG_SECTION_BACKEND, "timeout ms", global_backend_update_every * 2 * 1000); - - if(config_get_boolean(CONFIG_SECTION_BACKEND, "send names instead of ids", (global_backend_options & BACKEND_OPTION_SEND_NAMES))) - global_backend_options |= BACKEND_OPTION_SEND_NAMES; - else - global_backend_options &= ~BACKEND_OPTION_SEND_NAMES; - - charts_pattern = simple_pattern_create( - global_backend_send_charts_matching = config_get(CONFIG_SECTION_BACKEND, "send charts matching", "*"), - NULL, - SIMPLE_PATTERN_EXACT); - hosts_pattern = simple_pattern_create(config_get(CONFIG_SECTION_BACKEND, "send hosts matching", "localhost *"), NULL, SIMPLE_PATTERN_EXACT); - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - const char *remote_write_path = config_get(CONFIG_SECTION_BACKEND, "remote write URL path", "/receive"); -#endif - - // ------------------------------------------------------------------------ - // validate configuration options - // and prepare for sending data to our backend - - global_backend_options = backend_parse_data_source(source, global_backend_options); - global_backend_source = source; - - if(timeoutms < 1) { - error("BACKEND: invalid timeout %ld ms given. 
Assuming %d ms.", timeoutms, global_backend_update_every * 2 * 1000); - timeoutms = global_backend_update_every * 2 * 1000; - } - timeout.tv_sec = (timeoutms * 1000) / 1000000; - timeout.tv_usec = (timeoutms * 1000) % 1000000; - - if(!enabled || global_backend_update_every < 1) - goto cleanup; - - // ------------------------------------------------------------------------ - // select the backend type - BACKEND_TYPE work_type = backend_select_type(type); - if (work_type == BACKEND_TYPE_UNKNOWN) { - error("BACKEND: Unknown backend type '%s'", type); - goto cleanup; - } - - switch (work_type) { - case BACKEND_TYPE_OPENTSDB_USING_HTTP: { -#ifdef ENABLE_HTTPS - if (!strcmp(type, "opentsdb:https")) { - security_start_ssl(NETDATA_SSL_CONTEXT_EXPORTING); - } -#endif - backend_set_opentsdb_http_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE: { -#if ENABLE_PROMETHEUS_REMOTE_WRITE - do_prometheus_remote_write = 1; - - http_request_header = buffer_create(1); - backends_init_write_request(); -#else - error("BACKEND: Prometheus remote write support isn't compiled"); -#endif // ENABLE_PROMETHEUS_REMOTE_WRITE - backend_set_prometheus_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_KINESIS: { -#if HAVE_KINESIS - do_kinesis = 1; - - if(unlikely(read_kinesis_conf(netdata_configured_user_config_dir, &kinesis_auth_key_id, &kinesis_secure_key, &kinesis_stream_name))) { - error("BACKEND: kinesis backend type is set but cannot read its configuration from %s/aws_kinesis.conf", netdata_configured_user_config_dir); - goto cleanup; - } - - backends_kinesis_init(destination, kinesis_auth_key_id, kinesis_secure_key, timeout.tv_sec * 1000 + timeout.tv_usec / 1000); -#else - error("BACKEND: AWS Kinesis support isn't compiled"); -#endif // HAVE_KINESIS - backend_set_kinesis_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_MONGODB: { -#if HAVE_MONGOC - if(unlikely(read_mongodb_conf(netdata_configured_user_config_dir, - &mongodb_uri, - &mongodb_database, - &mongodb_collection))) { - error("BACKEND: mongodb backend type is set but cannot read its configuration from %s/mongodb.conf", - netdata_configured_user_config_dir); - goto cleanup; - } - - if(likely(!backends_mongodb_init(mongodb_uri, mongodb_database, mongodb_collection, mongodb_default_socket_timeout))) { - backend_set_mongodb_variables(&default_port, &backend_response_checker, &backend_request_formatter); - do_mongodb = 1; - } - else { - error("BACKEND: cannot initialize MongoDB backend"); - goto cleanup; - } -#else - error("BACKEND: MongoDB support isn't compiled"); -#endif // HAVE_MONGOC - break; - } - case BACKEND_TYPE_GRAPHITE: { - backend_set_graphite_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_OPENTSDB_USING_TELNET: { - backend_set_opentsdb_telnet_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_JSON: { - backend_set_json_variables(&default_port,&backend_response_checker,&backend_request_formatter); - break; - } - case BACKEND_TYPE_UNKNOWN: { - break; - } - default: { - break; - } - } - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - if((backend_request_formatter == NULL && !do_prometheus_remote_write) || backend_response_checker == NULL) { -#else - if(backend_request_formatter == NULL || backend_response_checker == NULL) { 
-#endif - error("BACKEND: backend is misconfigured - disabling it."); - goto cleanup; - } - - -// ------------------------------------------------------------------------ -// prepare the charts for monitoring the backend operation - - struct rusage thread; - - collected_number - chart_buffered_metrics = 0, - chart_lost_metrics = 0, - chart_sent_metrics = 0, - chart_buffered_bytes = 0, - chart_received_bytes = 0, - chart_sent_bytes = 0, - chart_receptions = 0, - chart_transmission_successes = 0, - chart_transmission_failures = 0, - chart_data_lost_events = 0, - chart_lost_bytes = 0, - chart_backend_reconnects = 0; - // chart_backend_latency = 0; - - RRDSET *chart_metrics = rrdset_create_localhost("netdata", "backend_metrics", NULL, "backend", NULL, "Netdata Buffered Metrics", "metrics", "backends", NULL, 130600, global_backend_update_every, RRDSET_TYPE_LINE); - rrddim_add(chart_metrics, "buffered", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_metrics, "lost", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_metrics, "sent", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - - RRDSET *chart_bytes = rrdset_create_localhost("netdata", "backend_bytes", NULL, "backend", NULL, "Netdata Backend Data Size", "KiB", "backends", NULL, 130610, global_backend_update_every, RRDSET_TYPE_AREA); - rrddim_add(chart_bytes, "buffered", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_bytes, "lost", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_bytes, "sent", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_bytes, "received", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE); - - RRDSET *chart_ops = rrdset_create_localhost("netdata", "backend_ops", NULL, "backend", NULL, "Netdata Backend Operations", "operations", "backends", NULL, 130630, global_backend_update_every, RRDSET_TYPE_LINE); - rrddim_add(chart_ops, "write", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_ops, "discard", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_ops, "reconnect", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_ops, "failure", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - rrddim_add(chart_ops, "read", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE); - - /* - * this is misleading - we can only measure the time we need to send data - * this time is not related to the time required for the data to travel to - * the backend database and the time that server needed to process them - * - * issue #1432 and https://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-2.html - * - RRDSET *chart_latency = rrdset_create_localhost("netdata", "backend_latency", NULL, "backend", NULL, "Netdata Backend Latency", "ms", "backends", NULL, 130620, global_backend_update_every, RRDSET_TYPE_AREA); - rrddim_add(chart_latency, "latency", NULL, 1, 1000, RRD_ALGORITHM_ABSOLUTE); - */ - - RRDSET *chart_rusage = rrdset_create_localhost("netdata", "backend_thread_cpu", NULL, "backend", NULL, "Netdata Backend Thread CPU usage", "milliseconds/s", "backends", NULL, 130630, global_backend_update_every, RRDSET_TYPE_STACKED); - rrddim_add(chart_rusage, "user", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL); - rrddim_add(chart_rusage, "system", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL); - - - // ------------------------------------------------------------------------ - // prepare the backend main loop - - info("BACKEND: configured ('%s' on '%s' sending '%s' data, every %d seconds, as host '%s', with prefix '%s')", type, destination, source, global_backend_update_every, hostname, global_backend_prefix); - 
send_statistics("BACKEND_START", "OK", type); - - usec_t step_ut = global_backend_update_every * USEC_PER_SEC; - time_t after = now_realtime_sec(); - int failures = 0; - heartbeat_t hb; - heartbeat_init(&hb); - - while(!netdata_exit) { - - // ------------------------------------------------------------------------ - // Wait for the next iteration point. - - heartbeat_next(&hb, step_ut); - time_t before = now_realtime_sec(); - debug(D_BACKEND, "BACKEND: preparing buffer for timeframe %lu to %lu", (unsigned long)after, (unsigned long)before); - - // ------------------------------------------------------------------------ - // add to the buffer the data we need to send to the backend - - netdata_thread_disable_cancelability(); - - size_t count_hosts = 0; - size_t count_charts_total = 0; - size_t count_dims_total = 0; - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - if(do_prometheus_remote_write) - backends_clear_write_request(); -#endif - rrd_rdlock(); - RRDHOST *host; - rrdhost_foreach_read(host) { - if(unlikely(!rrdhost_flag_check(host, RRDHOST_FLAG_BACKEND_SEND|RRDHOST_FLAG_BACKEND_DONT_SEND))) { - char *name = (host == localhost)?"localhost":host->hostname; - if (!hosts_pattern || simple_pattern_matches(hosts_pattern, name)) { - rrdhost_flag_set(host, RRDHOST_FLAG_BACKEND_SEND); - info("enabled backend for host '%s'", name); - } - else { - rrdhost_flag_set(host, RRDHOST_FLAG_BACKEND_DONT_SEND); - info("disabled backend for host '%s'", name); - } - } - - if(unlikely(!rrdhost_flag_check(host, RRDHOST_FLAG_BACKEND_SEND))) - continue; - - rrdhost_rdlock(host); - - count_hosts++; - size_t count_charts = 0; - size_t count_dims = 0; - size_t count_dims_skipped = 0; - - const char *__hostname = (host == localhost)?hostname:host->hostname; - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - if(do_prometheus_remote_write) { - backends_rrd_stats_remote_write_allmetrics_prometheus( - host - , __hostname - , global_backend_prefix - , global_backend_options - , after - , before - , &count_charts - , &count_dims - , &count_dims_skipped - ); - chart_buffered_metrics += count_dims; - } - else -#endif - { - RRDSET *st; - rrdset_foreach_read(st, host) { - if(likely(backends_can_send_rrdset(global_backend_options, st))) { - rrdset_rdlock(st); - - count_charts++; - - RRDDIM *rd; - rrddim_foreach_read(rd, st) { - if (likely(rd->last_collected_time.tv_sec >= after)) { - chart_buffered_metrics += backend_request_formatter(b, global_backend_prefix, host, __hostname, st, rd, after, before, global_backend_options); - count_dims++; - } - else { - debug(D_BACKEND, "BACKEND: not sending dimension '%s' of chart '%s' from host '%s', its last data collection (%lu) is not within our timeframe (%lu to %lu)", rd->id, st->id, __hostname, (unsigned long)rd->last_collected_time.tv_sec, (unsigned long)after, (unsigned long)before); - count_dims_skipped++; - } - } - - rrdset_unlock(st); - } - } - } - - debug(D_BACKEND, "BACKEND: sending host '%s', metrics of %zu dimensions, of %zu charts. 
Skipped %zu dimensions.", __hostname, count_dims, count_charts, count_dims_skipped); - count_charts_total += count_charts; - count_dims_total += count_dims; - - rrdhost_unlock(host); - } - rrd_unlock(); - - netdata_thread_enable_cancelability(); - - debug(D_BACKEND, "BACKEND: buffer has %zu bytes, added metrics for %zu dimensions, of %zu charts, from %zu hosts", buffer_strlen(b), count_dims_total, count_charts_total, count_hosts); - - // ------------------------------------------------------------------------ - - chart_buffered_bytes = (collected_number)buffer_strlen(b); - - // reset the monitoring chart counters - chart_received_bytes = - chart_sent_bytes = - chart_sent_metrics = - chart_lost_metrics = - chart_receptions = - chart_transmission_successes = - chart_transmission_failures = - chart_data_lost_events = - chart_lost_bytes = - chart_backend_reconnects = 0; - // chart_backend_latency = 0; - - if(unlikely(netdata_exit)) break; - - //fprintf(stderr, "\nBACKEND BEGIN:\n%s\nBACKEND END\n", buffer_tostring(b)); - //fprintf(stderr, "after = %lu, before = %lu\n", after, before); - - // prepare for the next iteration - // to add incrementally data to buffer - after = before; - -#if HAVE_KINESIS - if(do_kinesis) { - unsigned long long partition_key_seq = 0; - - size_t buffer_len = buffer_strlen(b); - size_t sent = 0; - - while(sent < buffer_len) { - char partition_key[KINESIS_PARTITION_KEY_MAX + 1]; - snprintf(partition_key, KINESIS_PARTITION_KEY_MAX, "netdata_%llu", partition_key_seq++); - size_t partition_key_len = strnlen(partition_key, KINESIS_PARTITION_KEY_MAX); - - const char *first_char = buffer_tostring(b) + sent; - - size_t record_len = 0; - - // split buffer into chunks of maximum allowed size - if(buffer_len - sent < KINESIS_RECORD_MAX - partition_key_len) { - record_len = buffer_len - sent; - } - else { - record_len = KINESIS_RECORD_MAX - partition_key_len; - while(*(first_char + record_len) != '\n' && record_len) record_len--; - } - - char error_message[ERROR_LINE_MAX + 1] = ""; - - debug(D_BACKEND, "BACKEND: backends_kinesis_put_record(): dest = %s, id = %s, key = %s, stream = %s, partition_key = %s, \ - buffer = %zu, record = %zu", destination, kinesis_auth_key_id, kinesis_secure_key, kinesis_stream_name, - partition_key, buffer_len, record_len); - - backends_kinesis_put_record(kinesis_stream_name, partition_key, first_char, record_len); - - sent += record_len; - chart_transmission_successes++; - - size_t sent_bytes = 0, lost_bytes = 0; - - if(unlikely(backends_kinesis_get_result(error_message, &sent_bytes, &lost_bytes))) { - // oops! we couldn't send (all or some of the) data - error("BACKEND: %s", error_message); - error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zu bytes.", - destination, sent_bytes, sent_bytes - lost_bytes); - - chart_transmission_failures++; - chart_data_lost_events++; - chart_lost_bytes += lost_bytes; - - // estimate the number of lost metrics - chart_lost_metrics += (collected_number)(chart_buffered_metrics - * (buffer_len && (lost_bytes > buffer_len) ? 
(double)lost_bytes / buffer_len : 1)); - - break; - } - else { - chart_receptions++; - } - - if(unlikely(netdata_exit)) break; - } - - chart_sent_bytes += sent; - if(likely(sent == buffer_len)) - chart_sent_metrics = chart_buffered_metrics; - - buffer_flush(b); - } else -#endif /* HAVE_KINESIS */ - -#if HAVE_MONGOC - if(do_mongodb) { - size_t buffer_len = buffer_strlen(b); - size_t sent = 0; - - while(sent < buffer_len) { - const char *first_char = buffer_tostring(b); - - debug(D_BACKEND, "BACKEND: backends_mongodb_insert(): uri = %s, database = %s, collection = %s, \ - buffer = %zu", mongodb_uri, mongodb_database, mongodb_collection, buffer_len); - - if(likely(!backends_mongodb_insert((char *)first_char, (size_t)chart_buffered_metrics))) { - sent += buffer_len; - chart_transmission_successes++; - chart_receptions++; - } - else { - // oops! we couldn't send (all or some of the) data - error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zu bytes.", - mongodb_uri, buffer_len, 0UL); - - chart_transmission_failures++; - chart_data_lost_events++; - chart_lost_bytes += buffer_len; - - // estimate the number of lost metrics - chart_lost_metrics += (collected_number)chart_buffered_metrics; - - break; - } - - if(unlikely(netdata_exit)) break; - } - - chart_sent_bytes += sent; - if(likely(sent == buffer_len)) - chart_sent_metrics = chart_buffered_metrics; - - buffer_flush(b); - } else -#endif /* HAVE_MONGOC */ - - { - - // ------------------------------------------------------------------------ - // if we are connected, receive a response, without blocking - - if(likely(sock != -1)) { - errno = 0; - - // loop through to collect all data - while(sock != -1 && errno != EWOULDBLOCK) { - buffer_need_bytes(response, 4096); - - ssize_t r; -#ifdef ENABLE_HTTPS - if(opentsdb_ssl.conn && !opentsdb_ssl.flags) { - r = SSL_read(opentsdb_ssl.conn, &response->buffer[response->len], response->size - response->len); - } else { - r = recv(sock, &response->buffer[response->len], response->size - response->len, MSG_DONTWAIT); - } -#else - r = recv(sock, &response->buffer[response->len], response->size - response->len, MSG_DONTWAIT); -#endif - if(likely(r > 0)) { - // we received some data - response->len += r; - chart_received_bytes += r; - chart_receptions++; - } - else if(r == 0) { - error("BACKEND: '%s' closed the socket", destination); - close(sock); - sock = -1; - } - else { - // failed to receive data - if(errno != EAGAIN && errno != EWOULDBLOCK) { - error("BACKEND: cannot receive data from backend '%s'.", destination); - } - } - } - - // if we received data, process them - if(buffer_strlen(response)) - backend_response_checker(response); - } - - // ------------------------------------------------------------------------ - // if we are not connected, connect to a backend server - - if(unlikely(sock == -1)) { - // usec_t start_ut = now_monotonic_usec(); - size_t reconnects = 0; - - sock = connect_to_one_of(destination, default_port, &timeout, &reconnects, NULL, 0); -#ifdef ENABLE_HTTPS - if(sock != -1) { - if(netdata_exporting_ctx) { - if(!opentsdb_ssl.conn) { - opentsdb_ssl.conn = SSL_new(netdata_exporting_ctx); - if(!opentsdb_ssl.conn) { - error("Failed to allocate SSL structure %d.", sock); - opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE; - } - } else { - SSL_clear(opentsdb_ssl.conn); - } - } - - if(opentsdb_ssl.conn) { - if(SSL_set_fd(opentsdb_ssl.conn, sock) != 1) { - error("Failed to set the socket to the SSL on socket fd %d.", host->rrdpush_sender_socket); - 
opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE; - } else { - opentsdb_ssl.flags = NETDATA_SSL_HANDSHAKE_COMPLETE; - SSL_set_connect_state(opentsdb_ssl.conn); - int err = SSL_connect(opentsdb_ssl.conn); - if (err != 1) { - err = SSL_get_error(opentsdb_ssl.conn, err); - error("SSL cannot connect with the server: %s ", ERR_error_string((long)SSL_get_error(opentsdb_ssl.conn, err), NULL)); - opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE; - } //TODO: check certificate here - } - } - } -#endif - chart_backend_reconnects += reconnects; - // chart_backend_latency += now_monotonic_usec() - start_ut; - } - - if(unlikely(netdata_exit)) break; - - // ------------------------------------------------------------------------ - // if we are connected, send our buffer to the backend server - - if(likely(sock != -1)) { - size_t len = buffer_strlen(b); - // usec_t start_ut = now_monotonic_usec(); - int flags = 0; - #ifdef MSG_NOSIGNAL - flags += MSG_NOSIGNAL; - #endif - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - if(do_prometheus_remote_write) { - size_t data_size = backends_get_write_request_size(); - - if(unlikely(!data_size)) { - error("BACKEND: write request size is out of range"); - continue; - } - - buffer_flush(b); - buffer_need_bytes(b, data_size); - if(unlikely(backends_pack_write_request(b->buffer, &data_size))) { - error("BACKEND: cannot pack write request"); - continue; - } - b->len = data_size; - chart_buffered_bytes = (collected_number)buffer_strlen(b); - - buffer_flush(http_request_header); - buffer_sprintf(http_request_header, - "POST %s HTTP/1.1\r\n" - "Host: %s\r\n" - "Accept: */*\r\n" - "Content-Length: %zu\r\n" - "Content-Type: application/x-www-form-urlencoded\r\n\r\n", - remote_write_path, - destination, - data_size - ); - - len = buffer_strlen(http_request_header); - send(sock, buffer_tostring(http_request_header), len, flags); - - len = data_size; - } -#endif - - ssize_t written; -#ifdef ENABLE_HTTPS - if(opentsdb_ssl.conn && !opentsdb_ssl.flags) { - written = SSL_write(opentsdb_ssl.conn, buffer_tostring(b), len); - } else { - written = send(sock, buffer_tostring(b), len, flags); - } -#else - written = send(sock, buffer_tostring(b), len, flags); -#endif - - // chart_backend_latency += now_monotonic_usec() - start_ut; - if(written != -1 && (size_t)written == len) { - // we sent the data successfully - chart_transmission_successes++; - chart_sent_bytes += written; - chart_sent_metrics = chart_buffered_metrics; - - // reset the failures count - failures = 0; - - // empty the buffer - buffer_flush(b); - } - else { - // oops! we couldn't send (all or some of the) data - error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zd bytes. 
Will re-connect.", destination, len, written); - chart_transmission_failures++; - - if(written != -1) - chart_sent_bytes += written; - - // increment the counter we check for data loss - failures++; - - // close the socket - we will re-open it next time - close(sock); - sock = -1; - } - } - else { - error("BACKEND: failed to update database backend '%s'", destination); - chart_transmission_failures++; - - // increment the counter we check for data loss - failures++; - } - } - - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - if(do_prometheus_remote_write && failures) { - (void) buffer_on_failures; - failures = 0; - chart_lost_bytes = chart_buffered_bytes = backends_get_write_request_size(); // estimated write request size - chart_data_lost_events++; - chart_lost_metrics = chart_buffered_metrics; - } else -#endif - if(failures > buffer_on_failures) { - // too bad! we are going to lose data - chart_lost_bytes += buffer_strlen(b); - error("BACKEND: reached %d backend failures. Flushing buffers to protect this host - this results in data loss on back-end server '%s'", failures, destination); - buffer_flush(b); - failures = 0; - chart_data_lost_events++; - chart_lost_metrics = chart_buffered_metrics; - } - - if(unlikely(netdata_exit)) break; - - // ------------------------------------------------------------------------ - // update the monitoring charts - - if(likely(chart_ops->counter_done)) rrdset_next(chart_ops); - rrddim_set(chart_ops, "read", chart_receptions); - rrddim_set(chart_ops, "write", chart_transmission_successes); - rrddim_set(chart_ops, "discard", chart_data_lost_events); - rrddim_set(chart_ops, "failure", chart_transmission_failures); - rrddim_set(chart_ops, "reconnect", chart_backend_reconnects); - rrdset_done(chart_ops); - - if(likely(chart_metrics->counter_done)) rrdset_next(chart_metrics); - rrddim_set(chart_metrics, "buffered", chart_buffered_metrics); - rrddim_set(chart_metrics, "lost", chart_lost_metrics); - rrddim_set(chart_metrics, "sent", chart_sent_metrics); - rrdset_done(chart_metrics); - - if(likely(chart_bytes->counter_done)) rrdset_next(chart_bytes); - rrddim_set(chart_bytes, "buffered", chart_buffered_bytes); - rrddim_set(chart_bytes, "lost", chart_lost_bytes); - rrddim_set(chart_bytes, "sent", chart_sent_bytes); - rrddim_set(chart_bytes, "received", chart_received_bytes); - rrdset_done(chart_bytes); - - /* - if(likely(chart_latency->counter_done)) rrdset_next(chart_latency); - rrddim_set(chart_latency, "latency", chart_backend_latency); - rrdset_done(chart_latency); - */ - - getrusage(RUSAGE_THREAD, &thread); - if(likely(chart_rusage->counter_done)) rrdset_next(chart_rusage); - rrddim_set(chart_rusage, "user", thread.ru_utime.tv_sec * 1000000ULL + thread.ru_utime.tv_usec); - rrddim_set(chart_rusage, "system", thread.ru_stime.tv_sec * 1000000ULL + thread.ru_stime.tv_usec); - rrdset_done(chart_rusage); - - if(likely(buffer_strlen(b) == 0)) - chart_buffered_metrics = 0; - - if(unlikely(netdata_exit)) break; - } - -cleanup: -#if HAVE_KINESIS - if(do_kinesis) { - backends_kinesis_shutdown(); - freez(kinesis_auth_key_id); - freez(kinesis_secure_key); - freez(kinesis_stream_name); - } -#endif - -#if ENABLE_PROMETHEUS_REMOTE_WRITE - buffer_free(http_request_header); - if(do_prometheus_remote_write) - backends_protocol_buffers_shutdown(); -#endif - -#if HAVE_MONGOC - if(do_mongodb) { - backends_mongodb_cleanup(); - freez(mongodb_uri); - freez(mongodb_database); - freez(mongodb_collection); - } -#endif - - if(sock != -1) - close(sock); - - buffer_free(b); - buffer_free(response); 
- -#ifdef ENABLE_HTTPS - if(netdata_exporting_ctx) { - if(opentsdb_ssl.conn) { - SSL_free(opentsdb_ssl.conn); - } - } -#endif - - netdata_thread_cleanup_pop(1); - return NULL; -} diff --git a/backends/backends.h b/backends/backends.h deleted file mode 100644 index 77c58c9e4..000000000 --- a/backends/backends.h +++ /dev/null @@ -1,98 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKENDS_H -#define NETDATA_BACKENDS_H 1 - -#include "daemon/common.h" - -typedef enum backend_options { - BACKEND_OPTION_NONE = 0, - - BACKEND_SOURCE_DATA_AS_COLLECTED = (1 << 0), - BACKEND_SOURCE_DATA_AVERAGE = (1 << 1), - BACKEND_SOURCE_DATA_SUM = (1 << 2), - - BACKEND_OPTION_SEND_NAMES = (1 << 16) -} BACKEND_OPTIONS; - -typedef enum backend_types { - BACKEND_TYPE_UNKNOWN, // Invalid type - BACKEND_TYPE_GRAPHITE, // Send plain text to Graphite - BACKEND_TYPE_OPENTSDB_USING_TELNET, // Send data to OpenTSDB using telnet API - BACKEND_TYPE_OPENTSDB_USING_HTTP, // Send data to OpenTSDB using HTTP API - BACKEND_TYPE_JSON, // Stores the data using JSON. - BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE, // The user selected to use Prometheus backend - BACKEND_TYPE_KINESIS, // Send message to AWS Kinesis - BACKEND_TYPE_MONGODB, // Send data to MongoDB collection - BACKEND_TYPE_NUM // Number of backend types -} BACKEND_TYPE; - -typedef int (**backend_response_checker_t)(BUFFER *); -typedef int (**backend_request_formatter_t)(BUFFER *, const char *, RRDHOST *, const char *, RRDSET *, RRDDIM *, time_t, time_t, BACKEND_OPTIONS); - -#define BACKEND_OPTIONS_SOURCE_BITS (BACKEND_SOURCE_DATA_AS_COLLECTED|BACKEND_SOURCE_DATA_AVERAGE|BACKEND_SOURCE_DATA_SUM) -#define BACKEND_OPTIONS_DATA_SOURCE(backend_options) (backend_options & BACKEND_OPTIONS_SOURCE_BITS) - -extern int global_backend_update_every; -extern BACKEND_OPTIONS global_backend_options; -extern const char *global_backend_source; -extern const char *global_backend_prefix; -extern const char *global_backend_send_charts_matching; - -extern void *backends_main(void *ptr); -BACKEND_TYPE backend_select_type(const char *type); - -extern BACKEND_OPTIONS backend_parse_data_source(const char *source, BACKEND_OPTIONS backend_options); - -#ifdef BACKENDS_INTERNALS - -extern int backends_can_send_rrdset(BACKEND_OPTIONS backend_options, RRDSET *st); -extern calculated_number backend_calculate_value_from_stored_data( - RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap - , time_t *first_timestamp // the timestamp of the first point used in this response - , time_t *last_timestamp // the timestamp that should be reported to backend -); - -extern size_t backend_name_copy(char *d, const char *s, size_t usable); -extern int discard_response(BUFFER *b, const char *backend); - -static inline char *strip_quotes(char *str) { - if(*str == '"' || *str == '\'') { - char *s; - - str++; - - s = str; - while(*s) s++; - if(s != str) s--; - - if(*s == '"' || *s == '\'') *s = '\0'; - } - - return str; -} - -#endif // BACKENDS_INTERNALS - -#include "backends/prometheus/backend_prometheus.h" -#include "backends/graphite/graphite.h" -#include "backends/json/json.h" -#include "backends/opentsdb/opentsdb.h" - -#if HAVE_KINESIS -#include "backends/aws_kinesis/aws_kinesis.h" -#endif - -#if ENABLE_PROMETHEUS_REMOTE_WRITE -#include "backends/prometheus/remote_write/remote_write.h" -#endif - -#if HAVE_MONGOC -#include "backends/mongodb/mongodb.h" 
-#endif - -#endif /* NETDATA_BACKENDS_H */ diff --git a/backends/graphite/Makefile.am b/backends/graphite/Makefile.am deleted file mode 100644 index babdcf0df..000000000 --- a/backends/graphite/Makefile.am +++ /dev/null @@ -1,4 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in diff --git a/backends/graphite/graphite.c b/backends/graphite/graphite.c deleted file mode 100644 index f75a93a0f..000000000 --- a/backends/graphite/graphite.c +++ /dev/null @@ -1,90 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "graphite.h" - -// ---------------------------------------------------------------------------- -// graphite backend - -int backends_format_dimension_collected_graphite_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - (void)after; - (void)before; - - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - buffer_sprintf( - b - , "%s.%s.%s.%s%s%s " COLLECTED_NUMBER_FORMAT " %llu\n" - , prefix - , hostname - , chart_name - , dimension_name - , (host->tags)?";":"" - , (host->tags)?host->tags:"" - , rd->last_collected_value - , (unsigned long long)rd->last_collected_time.tv_sec - ); - - return 1; -} - -int backends_format_dimension_stored_graphite_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - if(!isnan(value)) { - - buffer_sprintf( - b - , "%s.%s.%s.%s%s%s " CALCULATED_NUMBER_FORMAT " %llu\n" - , prefix - , hostname - , chart_name - , dimension_name - , (host->tags)?";":"" - , (host->tags)?host->tags:"" - , value - , (unsigned long long) last_t - ); - - return 1; - } - return 0; -} - -int process_graphite_response(BUFFER *b) { - return discard_response(b, "graphite"); -} - - diff --git a/backends/graphite/graphite.h b/backends/graphite/graphite.h deleted file mode 100644 index 498a7fcdf..000000000 --- a/backends/graphite/graphite.h +++ /dev/null @@ -1,35 +0,0 @@ -// 
SPDX-License-Identifier: GPL-3.0-or-later - - -#ifndef NETDATA_BACKEND_GRAPHITE_H -#define NETDATA_BACKEND_GRAPHITE_H - -#include "backends/backends.h" - -extern int backends_format_dimension_collected_graphite_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int backends_format_dimension_stored_graphite_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int process_graphite_response(BUFFER *b); - -#endif //NETDATA_BACKEND_GRAPHITE_H diff --git a/backends/json/Makefile.am b/backends/json/Makefile.am deleted file mode 100644 index babdcf0df..000000000 --- a/backends/json/Makefile.am +++ /dev/null @@ -1,4 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in diff --git a/backends/json/json.c b/backends/json/json.c deleted file mode 100644 index 0c7cc738f..000000000 --- a/backends/json/json.c +++ /dev/null @@ -1,152 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "json.h" - -// ---------------------------------------------------------------------------- -// json backend - -int backends_format_dimension_collected_json_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - (void)after; - (void)before; - (void)backend_options; - - const char *tags_pre = "", *tags_post = "", *tags = host->tags; - if(!tags) tags = ""; - - if(*tags) { - if(*tags == '{' || *tags == '[' || *tags == '"') { - tags_pre = "\"host_tags\":"; - tags_post = ","; - } - else { - tags_pre = "\"host_tags\":\""; - tags_post = "\","; - } - } - - buffer_sprintf(b, "{" - "\"prefix\":\"%s\"," - "\"hostname\":\"%s\"," - "%s%s%s" - - "\"chart_id\":\"%s\"," - "\"chart_name\":\"%s\"," - "\"chart_family\":\"%s\"," - "\"chart_context\": \"%s\"," - "\"chart_type\":\"%s\"," - "\"units\": \"%s\"," - - "\"id\":\"%s\"," - "\"name\":\"%s\"," - "\"value\":" COLLECTED_NUMBER_FORMAT "," - - "\"timestamp\": %llu}\n", - prefix, - hostname, - tags_pre, tags, tags_post, - - st->id, - st->name, - st->family, - st->context, - st->type, - st->units, - - rd->id, - rd->name, - rd->last_collected_value, - - (unsigned long long) rd->last_collected_time.tv_sec - ); - - return 1; -} - -int backends_format_dimension_stored_json_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname 
// the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - if(!isnan(value)) { - const char *tags_pre = "", *tags_post = "", *tags = host->tags; - if(!tags) tags = ""; - - if(*tags) { - if(*tags == '{' || *tags == '[' || *tags == '"') { - tags_pre = "\"host_tags\":"; - tags_post = ","; - } - else { - tags_pre = "\"host_tags\":\""; - tags_post = "\","; - } - } - - buffer_sprintf(b, "{" - "\"prefix\":\"%s\"," - "\"hostname\":\"%s\"," - "%s%s%s" - - "\"chart_id\":\"%s\"," - "\"chart_name\":\"%s\"," - "\"chart_family\":\"%s\"," - "\"chart_context\": \"%s\"," - "\"chart_type\":\"%s\"," - "\"units\": \"%s\"," - - "\"id\":\"%s\"," - "\"name\":\"%s\"," - "\"value\":" CALCULATED_NUMBER_FORMAT "," - - "\"timestamp\": %llu}\n", - prefix, - hostname, - tags_pre, tags, tags_post, - - st->id, - st->name, - st->family, - st->context, - st->type, - st->units, - - rd->id, - rd->name, - value, - - (unsigned long long) last_t - ); - - return 1; - } - return 0; -} - -int process_json_response(BUFFER *b) { - return discard_response(b, "json"); -} - - diff --git a/backends/json/json.h b/backends/json/json.h deleted file mode 100644 index 78ac37609..000000000 --- a/backends/json/json.h +++ /dev/null @@ -1,34 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_JSON_H -#define NETDATA_BACKEND_JSON_H - -#include "backends/backends.h" - -extern int backends_format_dimension_collected_json_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int backends_format_dimension_stored_json_plaintext( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int process_json_response(BUFFER *b); - -#endif //NETDATA_BACKEND_JSON_H diff --git a/backends/mongodb/Makefile.am b/backends/mongodb/Makefile.am deleted file mode 100644 index 161784b8f..000000000 --- a/backends/mongodb/Makefile.am +++ /dev/null @@ -1,8 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in - -dist_noinst_DATA = \ - README.md \ - $(NULL) diff --git a/backends/mongodb/README.md b/backends/mongodb/README.md deleted file mode 100644 index 7c7996e1b..000000000 --- a/backends/mongodb/README.md +++ /dev/null @@ -1,41 +0,0 @@ -<!-- -title: "MongoDB backend" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/mongodb/README.md ---> - -# MongoDB backend - -## Prerequisites - -To use MongoDB as a backend, `libmongoc` 
1.7.0 or higher should be -[installed](http://mongoc.org/libmongoc/current/installing.html) first. Next, Netdata should be re-installed from the -source. The installer will detect that the required libraries are now available. - -## Configuration - -To enable data sending to the MongoDB backend set the following options in `netdata.conf`: - -```conf -[backend] - enabled = yes - type = mongodb -``` - -In the Netdata configuration directory run `./edit-config mongodb.conf` and set [MongoDB -URI](https://docs.mongodb.com/manual/reference/connection-string/), database name, and collection name: - -```yaml -# URI -uri = mongodb://<hostname> - -# database name -database = your_database_name - -# collection name -collection = your_collection_name -``` - -The default socket timeout depends on the backend update interval. The timeout is 500 ms shorter than the interval (but -not less than 1000 ms). You can alter the timeout using the `sockettimeoutms` MongoDB URI option. - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fmongodb%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/mongodb/mongodb.c b/backends/mongodb/mongodb.c deleted file mode 100644 index d0527a723..000000000 --- a/backends/mongodb/mongodb.c +++ /dev/null @@ -1,189 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "mongodb.h" -#include <mongoc.h> - -#define CONFIG_FILE_LINE_MAX ((CONFIG_MAX_NAME + CONFIG_MAX_VALUE + 1024) * 2) - -static mongoc_client_t *mongodb_client; -static mongoc_collection_t *mongodb_collection; - -int backends_mongodb_init(const char *uri_string, - const char *database_string, - const char *collection_string, - int32_t default_socket_timeout) { - mongoc_uri_t *uri; - bson_error_t error; - - mongoc_init(); - - uri = mongoc_uri_new_with_error(uri_string, &error); - if(unlikely(!uri)) { - error("BACKEND: failed to parse URI: %s. 
Error message: %s", uri_string, error.message); - return 1; - } - - int32_t socket_timeout = mongoc_uri_get_option_as_int32(uri, MONGOC_URI_SOCKETTIMEOUTMS, default_socket_timeout); - if(!mongoc_uri_set_option_as_int32(uri, MONGOC_URI_SOCKETTIMEOUTMS, socket_timeout)) { - error("BACKEND: failed to set %s to the value %d", MONGOC_URI_SOCKETTIMEOUTMS, socket_timeout); - return 1; - }; - - mongodb_client = mongoc_client_new_from_uri(uri); - if(unlikely(!mongodb_client)) { - error("BACKEND: failed to create a new client"); - return 1; - } - - if(!mongoc_client_set_appname(mongodb_client, "netdata")) { - error("BACKEND: failed to set client appname"); - }; - - mongodb_collection = mongoc_client_get_collection(mongodb_client, database_string, collection_string); - - mongoc_uri_destroy(uri); - - return 0; -} - -void backends_free_bson(bson_t **insert, size_t n_documents) { - size_t i; - - for(i = 0; i < n_documents; i++) - bson_destroy(insert[i]); - - free(insert); -} - -int backends_mongodb_insert(char *data, size_t n_metrics) { - bson_t **insert = calloc(n_metrics, sizeof(bson_t *)); - bson_error_t error; - char *start = data, *end = data; - size_t n_documents = 0; - - while(*end && n_documents <= n_metrics) { - while(*end && *end != '\n') end++; - - if(likely(*end)) { - *end = '\0'; - end++; - } - else { - break; - } - - insert[n_documents] = bson_new_from_json((const uint8_t *)start, -1, &error); - - if(unlikely(!insert[n_documents])) { - error("BACKEND: %s", error.message); - backends_free_bson(insert, n_documents); - return 1; - } - - start = end; - - n_documents++; - } - - if(unlikely(!mongoc_collection_insert_many(mongodb_collection, (const bson_t **)insert, n_documents, NULL, NULL, &error))) { - error("BACKEND: %s", error.message); - backends_free_bson(insert, n_documents); - return 1; - } - - backends_free_bson(insert, n_documents); - - return 0; -} - -void backends_mongodb_cleanup() { - mongoc_collection_destroy(mongodb_collection); - mongoc_client_destroy(mongodb_client); - mongoc_cleanup(); - - return; -} - -int read_mongodb_conf(const char *path, char **uri_p, char **database_p, char **collection_p) { - char *uri = *uri_p; - char *database = *database_p; - char *collection = *collection_p; - - if(unlikely(uri)) freez(uri); - if(unlikely(database)) freez(database); - if(unlikely(collection)) freez(collection); - uri = NULL; - database = NULL; - collection = NULL; - - int line = 0; - - char filename[FILENAME_MAX + 1]; - snprintfz(filename, FILENAME_MAX, "%s/mongodb.conf", path); - - char buffer[CONFIG_FILE_LINE_MAX + 1], *s; - - debug(D_BACKEND, "BACKEND: opening config file '%s'", filename); - - FILE *fp = fopen(filename, "r"); - if(!fp) { - return 1; - } - - while(fgets(buffer, CONFIG_FILE_LINE_MAX, fp) != NULL) { - buffer[CONFIG_FILE_LINE_MAX] = '\0'; - line++; - - s = trim(buffer); - if(!s || *s == '#') { - debug(D_BACKEND, "BACKEND: ignoring line %d of file '%s', it is empty.", line, filename); - continue; - } - - char *name = s; - char *value = strchr(s, '='); - if(unlikely(!value)) { - error("BACKEND: ignoring line %d ('%s') of file '%s', there is no = in it.", line, s, filename); - continue; - } - *value = '\0'; - value++; - - name = trim(name); - value = trim(value); - - if(unlikely(!name || *name == '#')) { - error("BACKEND: ignoring line %d of file '%s', name is empty.", line, filename); - continue; - } - - if(!value) - value = ""; - else - value = strip_quotes(value); - - if(name[0] == 'u' && !strcmp(name, "uri")) { - uri = strdupz(value); - } - else if(name[0] == 'd' && 
!strcmp(name, "database")) { - database = strdupz(value); - } - else if(name[0] == 'c' && !strcmp(name, "collection")) { - collection = strdupz(value); - } - } - - fclose(fp); - - if(unlikely(!collection || !*collection)) { - error("BACKEND: collection name is a mandatory MongoDB parameter, but it is not configured"); - return 1; - } - - *uri_p = uri; - *database_p = database; - *collection_p = collection; - - return 0; -} diff --git a/backends/mongodb/mongodb.conf b/backends/mongodb/mongodb.conf deleted file mode 100644 index 11ea6efb2..000000000 --- a/backends/mongodb/mongodb.conf +++ /dev/null @@ -1,12 +0,0 @@ -# MongoDB backend configuration -# -# All options in this file are mandatory - -# URI -uri = - -# database name -database = - -# collection name -collection = diff --git a/backends/mongodb/mongodb.h b/backends/mongodb/mongodb.h deleted file mode 100644 index cae9b093e..000000000 --- a/backends/mongodb/mongodb.h +++ /dev/null @@ -1,16 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_MONGODB_H -#define NETDATA_BACKEND_MONGODB_H - -#include "backends/backends.h" - -extern int backends_mongodb_init(const char *uri_string, const char *database_string, const char *collection_string, const int32_t socket_timeout); - -extern int backends_mongodb_insert(char *data, size_t n_metrics); - -extern void backends_mongodb_cleanup(); - -extern int read_mongodb_conf(const char *path, char **uri_p, char **database_p, char **collection_p); - -#endif //NETDATA_BACKEND_MONGODB_H diff --git a/backends/nc-backend.sh b/backends/nc-backend.sh deleted file mode 100755 index 65704b98f..000000000 --- a/backends/nc-backend.sh +++ /dev/null @@ -1,158 +0,0 @@ -#!/usr/bin/env bash - -# SPDX-License-Identifier: GPL-3.0-or-later - -# This is a simple backend database proxy, written in BASH, using the nc command. -# Run the script without any parameters for help. - -MODE="${1}" -MY_PORT="${2}" -BACKEND_HOST="${3}" -BACKEND_PORT="${4}" -FILE="${NETDATA_NC_BACKEND_DIR-/tmp}/netdata-nc-backend-${MY_PORT}" - -log() { - logger --stderr --id=$$ --tag "netdata-nc-backend" "${*}" -} - -mync() { - local ret - - log "Running: nc ${*}" - nc "${@}" - ret=$? - - log "nc stopped with return code ${ret}." - - return ${ret} -} - -listen_save_replay_forever() { - local file="${1}" port="${2}" real_backend_host="${3}" real_backend_port="${4}" ret delay=1 started ended - - while true - do - log "Starting nc to listen on port ${port} and save metrics to ${file}" - - started=$(date +%s) - mync -l -p "${port}" | tee -a -p --output-error=exit "${file}" - ended=$(date +%s) - - if [ -s "${file}" ] - then - if [ ! -z "${real_backend_host}" ] && [ ! -z "${real_backend_port}" ] - then - log "Attempting to send the metrics to the real backend at ${real_backend_host}:${real_backend_port}" - - mync "${real_backend_host}" "${real_backend_port}" <"${file}" - ret=$? - - if [ ${ret} -eq 0 ] - then - log "Successfully sent the metrics to ${real_backend_host}:${real_backend_port}" - mv "${file}" "${file}.old" - touch "${file}" - else - log "Failed to send the metrics to ${real_backend_host}:${real_backend_port} (nc returned ${ret}) - appending more data to ${file}" - fi - else - log "No backend configured - appending more data to ${file}" - fi - fi - - # prevent a CPU hungry infinite loop - # if nc cannot listen to port - if [ $((ended - started)) -lt 5 ] - then - log "nc has been stopped too fast." - delay=30 - else - delay=1 - fi - - log "Waiting ${delay} seconds before listening again for data." 
- sleep ${delay} - done -} - -if [ "${MODE}" = "start" ] - then - - # start the listener, in exclusive mode - # only one can use the same file/port at a time - { - flock -n 9 - # shellcheck disable=SC2181 - if [ $? -ne 0 ] - then - log "Cannot get exclusive lock on file ${FILE}.lock - Am I running multiple times?" - exit 2 - fi - - # save our PID to the lock file - echo "$$" >"${FILE}.lock" - - listen_save_replay_forever "${FILE}" "${MY_PORT}" "${BACKEND_HOST}" "${BACKEND_PORT}" - ret=$? - - log "listener exited." - exit ${ret} - - } 9>>"${FILE}.lock" - - # we can only get here if ${FILE}.lock cannot be created - log "Cannot create file ${FILE}." - exit 3 - -elif [ "${MODE}" = "stop" ] - then - - { - flock -n 9 - # shellcheck disable=SC2181 - if [ $? -ne 0 ] - then - pid=$(<"${FILE}".lock) - log "Killing process ${pid}..." - kill -TERM "-${pid}" - exit 0 - fi - - log "File ${FILE}.lock has been locked by me but it shouldn't. Is a collector running?" - exit 4 - - } 9<"${FILE}.lock" - - log "File ${FILE}.lock does not exist. Is a collector running?" - exit 5 - -else - - cat <<EOF -Usage: - - "${0}" start|stop PORT [BACKEND_HOST BACKEND_PORT] - - PORT The port this script will listen - (configure netdata to use this as a second backend) - - BACKEND_HOST The real backend host - BACKEND_PORT The real backend port - - This script can act as fallback backend for netdata. - It will receive metrics from netdata, save them to - ${FILE} - and once netdata reconnects to the real-backend, this script - will push all metrics collected to the real-backend too and - wait for a failure to happen again. - - Only one netdata can connect to this script at a time. - If you need fallback for multiple netdata, run this script - multiple times with different ports. - - You can run me in the background with this: - - screen -d -m "${0}" start PORT [BACKEND_HOST BACKEND_PORT] -EOF - exit 1 -fi diff --git a/backends/opentsdb/Makefile.am b/backends/opentsdb/Makefile.am deleted file mode 100644 index babdcf0df..000000000 --- a/backends/opentsdb/Makefile.am +++ /dev/null @@ -1,4 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in diff --git a/backends/opentsdb/README.md b/backends/opentsdb/README.md deleted file mode 100644 index 5ba7b12c5..000000000 --- a/backends/opentsdb/README.md +++ /dev/null @@ -1,38 +0,0 @@ -<!-- -title: "OpenTSDB with HTTP" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/opentsdb/README.md ---> - -# OpenTSDB with HTTP - -Netdata can easily communicate with OpenTSDB using HTTP API. To enable this channel, set the following options in your -`netdata.conf`: - -```conf -[backend] - type = opentsdb:http - destination = localhost:4242 -``` - -In this example, OpenTSDB is running with its default port, which is `4242`. If you run OpenTSDB on a different port, -change the `destination = localhost:4242` line accordingly. - -## HTTPS - -As of [v1.16.0](https://github.com/netdata/netdata/releases/tag/v1.16.0), Netdata can send metrics to OpenTSDB using -TLS/SSL. Unfortunately, OpenTDSB does not support encrypted connections, so you will have to configure a reverse proxy -to enable HTTPS communication between Netdata and OpenTSDB. You can set up a reverse proxy with -[Nginx](/docs/Running-behind-nginx.md). 
- -After your proxy is configured, make the following changes to `netdata.conf`: - -```conf -[backend] - type = opentsdb:https - destination = localhost:8082 -``` - -In this example, we used the port `8082` for our reverse proxy. If your reverse proxy listens on a different port, -change the `destination = localhost:8082` line accordingly. - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fopentsdb%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]() diff --git a/backends/opentsdb/opentsdb.c b/backends/opentsdb/opentsdb.c deleted file mode 100644 index 965b4c092..000000000 --- a/backends/opentsdb/opentsdb.c +++ /dev/null @@ -1,205 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "opentsdb.h" - -// ---------------------------------------------------------------------------- -// opentsdb backend - -int backends_format_dimension_collected_opentsdb_telnet( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - (void)after; - (void)before; - - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - buffer_sprintf( - b - , "put %s.%s.%s %llu " COLLECTED_NUMBER_FORMAT " host=%s%s%s\n" - , prefix - , chart_name - , dimension_name - , (unsigned long long)rd->last_collected_time.tv_sec - , rd->last_collected_value - , hostname - , (host->tags)?" ":"" - , (host->tags)?host->tags:"" - ); - - return 1; -} - -int backends_format_dimension_stored_opentsdb_telnet( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - if(!isnan(value)) { - - buffer_sprintf( - b - , "put %s.%s.%s %llu " CALCULATED_NUMBER_FORMAT " host=%s%s%s\n" - , prefix - , chart_name - , dimension_name - , (unsigned long long) last_t - , value - , hostname - , (host->tags)?" 
":"" - , (host->tags)?host->tags:"" - ); - - return 1; - } - - return 0; -} - -int process_opentsdb_response(BUFFER *b) { - return discard_response(b, "opentsdb"); -} - -static inline void opentsdb_build_message(BUFFER *b, char *message, const char *hostname, int length) { - buffer_sprintf( - b - , "POST /api/put HTTP/1.1\r\n" - "Host: %s\r\n" - "Content-Type: application/json\r\n" - "Content-Length: %d\r\n" - "\r\n" - "%s" - , hostname - , length - , message - ); -} - -int backends_format_dimension_collected_opentsdb_http( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - (void)after; - (void)before; - - char message[1024]; - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - int length = snprintfz(message - , sizeof(message) - , "{" - " \"metric\": \"%s.%s.%s\"," - " \"timestamp\": %llu," - " \"value\": "COLLECTED_NUMBER_FORMAT "," - " \"tags\": {" - " \"host\": \"%s%s%s\"" - " }" - "}" - , prefix - , chart_name - , dimension_name - , (unsigned long long)rd->last_collected_time.tv_sec - , rd->last_collected_value - , hostname - , (host->tags)?" ":"" - , (host->tags)?host->tags:"" - ); - - if(length > 0) { - opentsdb_build_message(b, message, hostname, length); - } - - return 1; -} - -int backends_format_dimension_stored_opentsdb_http( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -) { - (void)host; - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - if(!isnan(value)) { - char chart_name[RRD_ID_LENGTH_MAX + 1]; - char dimension_name[RRD_ID_LENGTH_MAX + 1]; - backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX); - backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX); - - char message[1024]; - int length = snprintfz(message - , sizeof(message) - , "{" - " \"metric\": \"%s.%s.%s\"," - " \"timestamp\": %llu," - " \"value\": "CALCULATED_NUMBER_FORMAT "," - " \"tags\": {" - " \"host\": \"%s%s%s\"" - " }" - "}" - , prefix - , chart_name - , dimension_name - , (unsigned long long)last_t - , value - , hostname - , (host->tags)?" 
":"" - , (host->tags)?host->tags:"" - ); - - if(length > 0) { - opentsdb_build_message(b, message, hostname, length); - } - - return 1; - } - - return 0; -} diff --git a/backends/opentsdb/opentsdb.h b/backends/opentsdb/opentsdb.h deleted file mode 100644 index 87d9c5cd7..000000000 --- a/backends/opentsdb/opentsdb.h +++ /dev/null @@ -1,58 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_OPENTSDB_H -#define NETDATA_BACKEND_OPENTSDB_H - -#include "backends/backends.h" - -extern int backends_format_dimension_collected_opentsdb_telnet( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int backends_format_dimension_stored_opentsdb_telnet( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -extern int process_opentsdb_response(BUFFER *b); - -int backends_format_dimension_collected_opentsdb_http( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -int backends_format_dimension_stored_opentsdb_http( - BUFFER *b // the buffer to write data to - , const char *prefix // the prefix to use - , RRDHOST *host // the host this chart comes from - , const char *hostname // the hostname (to override host->hostname) - , RRDSET *st // the chart - , RRDDIM *rd // the dimension - , time_t after // the start timestamp - , time_t before // the end timestamp - , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap -); - -#endif //NETDATA_BACKEND_OPENTSDB_H diff --git a/backends/prometheus/Makefile.am b/backends/prometheus/Makefile.am deleted file mode 100644 index 334fca81c..000000000 --- a/backends/prometheus/Makefile.am +++ /dev/null @@ -1,12 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in - -SUBDIRS = \ - remote_write \ - $(NULL) - -dist_noinst_DATA = \ - README.md \ - $(NULL) diff --git a/backends/prometheus/README.md b/backends/prometheus/README.md deleted file mode 100644 index a0460d1d8..000000000 --- a/backends/prometheus/README.md +++ /dev/null @@ -1,457 +0,0 @@ -<!-- -title: "Using Netdata with Prometheus" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/prometheus/README.md ---> - -# Using Netdata with Prometheus - -> IMPORTANT: the format Netdata sends metrics to prometheus has changed since Netdata v1.7. The new prometheus backend -> for Netdata supports a lot more features and is aligned to the development of the rest of the Netdata backends. 
- -Prometheus is a distributed monitoring system which offers a very simple setup along with a robust data model. Recently -Netdata added support for Prometheus. I'm going to quickly show you how to install both Netdata and prometheus on the -same server. We can then use grafana pointed at Prometheus to obtain long term metrics Netdata offers. I'm assuming we -are starting at a fresh ubuntu shell (whether you'd like to follow along in a VM or a cloud instance is up to you). - -## Installing Netdata and prometheus - -### Installing Netdata - -There are number of ways to install Netdata according to [Installation](/packaging/installer/README.md). The suggested way -of installing the latest Netdata and keep it upgrade automatically. Using one line installation: - -```sh -bash <(curl -Ss https://my-netdata.io/kickstart.sh) -``` - -At this point we should have Netdata listening on port 19999. Attempt to take your browser here: - -```sh -http://your.netdata.ip:19999 -``` - -_(replace `your.netdata.ip` with the IP or hostname of the server running Netdata)_ - -### Installing Prometheus - -In order to install prometheus we are going to introduce our own systemd startup script along with an example of -prometheus.yaml configuration. Prometheus needs to be pointed to your server at a specific target url for it to scrape -Netdata's api. Prometheus is always a pull model meaning Netdata is the passive client within this architecture. -Prometheus always initiates the connection with Netdata. - -#### Download Prometheus - -```sh -cd /tmp && curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest \ -| grep "browser_download_url.*linux-amd64.tar.gz" \ -| cut -d '"' -f 4 \ -| wget -qi - -``` - -#### Create prometheus system user - -```sh -sudo useradd -r prometheus -``` - -#### Create prometheus directory - -```sh -sudo mkdir /opt/prometheus -sudo chown prometheus:prometheus /opt/prometheus -``` - -#### Untar prometheus directory - -```sh -sudo tar -xvf /tmp/prometheus-*linux-amd64.tar.gz -C /opt/prometheus --strip=1 -``` - -#### Install prometheus.yml - -We will use the following `prometheus.yml` file. Save it at `/opt/prometheus/prometheus.yml`. - -Make sure to replace `your.netdata.ip` with the IP or hostname of the host running Netdata. - -```yaml -# my global config -global: - scrape_interval: 5s # Set the scrape interval to every 5 seconds. Default is every 1 minute. - evaluation_interval: 5s # Evaluate rules every 5 seconds. The default is every 1 minute. - # scrape_timeout is set to the global default (10s). - - # Attach these labels to any time series or alerts when communicating with - # external systems (federation, remote storage, Alertmanager). - external_labels: - monitor: 'codelab-monitor' - -# Load rules once and periodically evaluate them according to the global 'evaluation_interval'. -rule_files: - # - "first.rules" - # - "second.rules" - -# A scrape configuration containing exactly one endpoint to scrape: -# Here it's Prometheus itself. -scrape_configs: - # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config. - - job_name: 'prometheus' - - # metrics_path defaults to '/metrics' - # scheme defaults to 'http'. 
- - static_configs: - - targets: ['0.0.0.0:9090'] - - - job_name: 'netdata-scrape' - - metrics_path: '/api/v1/allmetrics' - params: - # format: prometheus | prometheus_all_hosts - # You can use `prometheus_all_hosts` if you want Prometheus to set the `instance` to your hostname instead of IP - format: [prometheus] - # - # source: as-collected | raw | average | sum | volume - # default is: average - #source: [as-collected] - # - # server name for this prometheus - the default is the client IP - # for Netdata to uniquely identify it - #server: ['prometheus1'] - honor_labels: true - - static_configs: - - targets: ['{your.netdata.ip}:19999'] -``` - -#### Install nodes.yml - -The following is completely optional, it will enable Prometheus to generate alerts from some Netdata sources. Tweak the -values to your own needs. We will use the following `nodes.yml` file below. Save it at `/opt/prometheus/nodes.yml`, and -add a _- "nodes.yml"_ entry under the _rule_files:_ section in the example prometheus.yml file above. - -```yaml -groups: - - name: nodes - - rules: - - alert: node_high_cpu_usage_70 - expr: sum(sum_over_time(netdata_system_cpu_percentage_average{dimension=~"(user|system|softirq|irq|guest)"}[10m])) by (job) / sum(count_over_time(netdata_system_cpu_percentage_average{dimension="idle"}[10m])) by (job) > 70 - for: 1m - annotations: - description: '{{ $labels.job }} on ''{{ $labels.job }}'' CPU usage is at {{ humanize $value }}%.' - summary: CPU alert for container node '{{ $labels.job }}' - - - alert: node_high_memory_usage_70 - expr: 100 / sum(netdata_system_ram_MB_average) by (job) - * sum(netdata_system_ram_MB_average{dimension=~"free|cached"}) by (job) < 30 - for: 1m - annotations: - description: '{{ $labels.job }} memory usage is {{ humanize $value}}%.' - summary: Memory alert for container node '{{ $labels.job }}' - - - alert: node_low_root_filesystem_space_20 - expr: 100 / sum(netdata_disk_space_GB_average{family="/"}) by (job) - * sum(netdata_disk_space_GB_average{family="/",dimension=~"avail|cached"}) by (job) < 20 - for: 1m - annotations: - description: '{{ $labels.job }} root filesystem space is {{ humanize $value}}%.' - summary: Root filesystem alert for container node '{{ $labels.job }}' - - - alert: node_root_filesystem_fill_rate_6h - expr: predict_linear(netdata_disk_space_GB_average{family="/",dimension=~"avail|cached"}[1h], 6 * 3600) < 0 - for: 1h - labels: - severity: critical - annotations: - description: Container node {{ $labels.job }} root filesystem is going to fill up in 6h. - summary: Disk fill alert for Swarm node '{{ $labels.job }}' -``` - -#### Install prometheus.service - -Save this service file as `/etc/systemd/system/prometheus.service`: - -```sh -[Unit] -Description=Prometheus Server -AssertPathExists=/opt/prometheus - -[Service] -Type=simple -WorkingDirectory=/opt/prometheus -User=prometheus -Group=prometheus -ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --log.level=info -ExecReload=/bin/kill -SIGHUP $MAINPID -ExecStop=/bin/kill -SIGINT $MAINPID - -[Install] -WantedBy=multi-user.target -``` - -##### Start Prometheus - -```sh -sudo systemctl start prometheus -sudo systemctl enable prometheus -``` - -Prometheus should now start and listen on port 9090. Attempt to head there with your browser. - -If everything is working correctly when you fetch `http://your.prometheus.ip:9090` you will see a 'Status' tab. Click -this and click on 'targets' We should see the Netdata host as a scraped target. 
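
You can also verify the scrape from the command line through the Prometheus HTTP API. This is a minimal sketch assuming Prometheus is reachable at `localhost:9090`; adjust the host and port to match your installation:

```sh
# list the state of all configured scrape targets
curl -s http://localhost:9090/api/v1/targets

# query the netdata_info metric that every scraped Netdata exposes
curl -s 'http://localhost:9090/api/v1/query?query=netdata_info'
```

A working `netdata-scrape` job appears in the first response with `"health": "up"`, and the second response returns one `netdata_info` series per scraped Netdata host.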
- ---- - -## Netdata support for prometheus - -> IMPORTANT: the format Netdata sends metrics to prometheus has changed since Netdata v1.6. The new format allows easier -> queries for metrics and supports both `as collected` and normalized metrics. - -Before explaining the changes, we have to understand the key differences between Netdata and prometheus. - -### understanding Netdata metrics - -#### charts - -Each chart in Netdata has several properties (common to all its metrics): - -- `chart_id` - uniquely identifies a chart. - -- `chart_name` - a more human friendly name for `chart_id`, also unique. - -- `context` - this is the template of the chart. All disk I/O charts have the same context, all mysql requests charts - have the same context, etc. This is used for alarm templates to match all the charts they should be attached to. - -- `family` groups a set of charts together. It is used as the submenu of the dashboard. - -- `units` is the units for all the metrics attached to the chart. - -#### dimensions - -Then each Netdata chart contains metrics called `dimensions`. All the dimensions of a chart have the same units of -measurement, and are contextually in the same category (ie. the metrics for disk bandwidth are `read` and `write` and -they are both in the same chart). - -### Netdata data source - -Netdata can send metrics to prometheus from 3 data sources: - -- `as collected` or `raw` - this data source sends the metrics to prometheus as they are collected. No conversion is - done by Netdata. The latest value for each metric is just given to prometheus. This is the most preferred method by - prometheus, but it is also the harder to work with. To work with this data source, you will need to understand how - to get meaningful values out of them. - - The format of the metrics is: `CONTEXT{chart="CHART",family="FAMILY",dimension="DIMENSION"}`. - - If the metric is a counter (`incremental` in Netdata lingo), `_total` is appended the context. - - Unlike prometheus, Netdata allows each dimension of a chart to have a different algorithm and conversion constants - (`multiplier` and `divisor`). In this case, that the dimensions of a charts are heterogeneous, Netdata will use this - format: `CONTEXT_DIMENSION{chart="CHART",family="FAMILY"}` - -- `average` - this data source uses the Netdata database to send the metrics to prometheus as they are presented on - the Netdata dashboard. So, all the metrics are sent as gauges, at the units they are presented in the Netdata - dashboard charts. This is the easiest to work with. - - The format of the metrics is: `CONTEXT_UNITS_average{chart="CHART",family="FAMILY",dimension="DIMENSION"}`. - - When this source is used, Netdata keeps track of the last access time for each prometheus server fetching the - metrics. This last access time is used at the subsequent queries of the same prometheus server to identify the - time-frame the `average` will be calculated. - - So, no matter how frequently prometheus scrapes Netdata, it will get all the database data. - To identify each prometheus server, Netdata uses by default the IP of the client fetching the metrics. - - If there are multiple prometheus servers fetching data from the same Netdata, using the same IP, each prometheus - server can append `server=NAME` to the URL. Netdata will use this `NAME` to uniquely identify the prometheus server. - -- `sum` or `volume`, is like `average` but instead of averaging the values, it sums them. 
- - The format of the metrics is: `CONTEXT_UNITS_sum{chart="CHART",family="FAMILY",dimension="DIMENSION"}`. All the - other operations are the same with `average`. - - To change the data source to `sum` or `as-collected` you need to provide the `source` parameter in the request URL. - e.g.: `http://your.netdata.ip:19999/api/v1/allmetrics?format=prometheus&help=yes&source=as-collected` - - Keep in mind that early versions of Netdata were sending the metrics as: `CHART_DIMENSION{}`. - -### Querying Metrics - -Fetch with your web browser this URL: - -`http://your.netdata.ip:19999/api/v1/allmetrics?format=prometheus&help=yes` - -_(replace `your.netdata.ip` with the ip or hostname of your Netdata server)_ - -Netdata will respond with all the metrics it sends to prometheus. - -If you search that page for `"system.cpu"` you will find all the metrics Netdata is exporting to prometheus for this -chart. `system.cpu` is the chart name on the Netdata dashboard (on the Netdata dashboard all charts have a text heading -such as : `Total CPU utilization (system.cpu)`. What we are interested here in the chart name: `system.cpu`). - -Searching for `"system.cpu"` reveals: - -```sh -# COMMENT homogeneous chart "system.cpu", context "system.cpu", family "cpu", units "percentage" -# COMMENT netdata_system_cpu_percentage_average: dimension "guest_nice", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="guest_nice"} 0.0000000 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "guest", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="guest"} 1.7837326 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "steal", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="steal"} 0.0000000 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "softirq", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="softirq"} 0.5275442 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "irq", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="irq"} 0.2260836 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "user", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="user"} 2.3362762 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "system", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 1.7961062 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "nice", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="nice"} 0.0000000 1500066662000 -# COMMENT netdata_system_cpu_percentage_average: dimension "iowait", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="iowait"} 0.9671802 1500066662000 -# COMMENT 
netdata_system_cpu_percentage_average: dimension "idle", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive -netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="idle"} 92.3630770 1500066662000 -``` - -_(Netdata response for `system.cpu` with source=`average`)_ - -In `average` or `sum` data sources, all values are normalized and are reported to prometheus as gauges. Now, use the -'expression' text form in prometheus. Begin to type the metrics we are looking for: `netdata_system_cpu`. You should see -that the text form begins to auto-fill as prometheus knows about this metric. - -If the data source was `as collected`, the response would be: - -```sh -# COMMENT homogeneous chart "system.cpu", context "system.cpu", family "cpu", units "percentage" -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "guest_nice", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="guest_nice"} 0 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "guest", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="guest"} 63945 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "steal", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="steal"} 0 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "softirq", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="softirq"} 8295 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "irq", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="irq"} 4079 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "user", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="user"} 116488 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "system", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="system"} 35084 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "nice", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="nice"} 505 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "iowait", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="iowait"} 23314 1500066716438 -# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "idle", value * 1 / 1 delta gives percentage (counter) -netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="idle"} 918470 1500066716438 -``` - -_(Netdata response for `system.cpu` with source=`as-collected`)_ - -For more information check prometheus documentation. 
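
Because the `as collected` metrics are counters, you normally wrap them in `rate()` and build the ratio you need yourself. The following is only a sketch of that idea, assuming Prometheus listens on `localhost:9090`; the metric and dimension names are the ones shown in the response above:

```sh
# approximate CPU utilization (%) derived from the as-collected counters:
# rate of the non-idle dimensions divided by the rate of all dimensions
curl -s http://localhost:9090/api/v1/query --data-urlencode \
  'query=100 * sum(rate(netdata_system_cpu_total{dimension!="idle"}[1m])) / sum(rate(netdata_system_cpu_total[1m]))'
```

With the `average` or `sum` sources no such conversion is needed, since the values are already normalized gauges.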
- -### Streaming data from upstream hosts - -The `format=prometheus` parameter only exports the host's Netdata metrics. If you are using the parent-child -functionality of Netdata this ignores any upstream hosts - so you should consider using the below in your -**prometheus.yml**: - -```yaml - metrics_path: '/api/v1/allmetrics' - params: - format: [prometheus_all_hosts] - honor_labels: true -``` - -This will report all upstream host data, and `honor_labels` will make Prometheus take note of the instance names -provided. - -### Timestamps - -To pass the metrics through prometheus pushgateway, Netdata supports the option `×tamps=no` to send the metrics -without timestamps. - -## Netdata host variables - -Netdata collects various system configuration metrics, like the max number of TCP sockets supported, the max number of -files allowed system-wide, various IPC sizes, etc. These metrics are not exposed to prometheus by default. - -To expose them, append `variables=yes` to the Netdata URL. - -### TYPE and HELP - -To save bandwidth, and because prometheus does not use them anyway, `# TYPE` and `# HELP` lines are suppressed. If -wanted they can be re-enabled via `types=yes` and `help=yes`, e.g. -`/api/v1/allmetrics?format=prometheus&types=yes&help=yes` - -Note that if enabled, the `# TYPE` and `# HELP` lines are repeated for every occurrence of a metric, which goes against the Prometheus documentation's [specification for these lines](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#comments-help-text-and-type-information). - -### Names and IDs - -Netdata supports names and IDs for charts and dimensions. Usually IDs are unique identifiers as read by the system and -names are human friendly labels (also unique). - -Most charts and metrics have the same ID and name, but in several cases they are different: disks with device-mapper, -interrupts, QoS classes, statsd synthetic charts, etc. - -The default is controlled in `netdata.conf`: - -```conf -[backend] - send names instead of ids = yes | no -``` - -You can overwrite it from prometheus, by appending to the URL: - -- `&names=no` to get IDs (the old behaviour) -- `&names=yes` to get names - -### Filtering metrics sent to prometheus - -Netdata can filter the metrics it sends to prometheus with this setting: - -```conf -[backend] - send charts matching = * -``` - -This settings accepts a space separated list of patterns to match the **charts** to be sent to prometheus. Each pattern -can use `*` as wildcard, any number of times (e.g `*a*b*c*` is valid). Patterns starting with `!` give a negative match -(e.g `!*.bad users.* groups.*` will send all the users and groups except `bad` user and `bad` group). The order is -important: the first match (positive or negative) left to right, is used. - -### Changing the prefix of Netdata metrics - -Netdata sends all metrics prefixed with `netdata_`. You can change this in `netdata.conf`, like this: - -```conf -[backend] - prefix = netdata -``` - -It can also be changed from the URL, by appending `&prefix=netdata`. - -### Metric Units - -The default source `average` adds the unit of measurement to the name of each metric (e.g. `_KiB_persec`). To hide the -units and get the same metric names as with the other sources, append to the URL `&hideunits=yes`. - -The units were standardized in v1.12, with the effect of changing the metric names. 
To get the metric names as they were -before v1.12, append to the URL `&oldunits=yes` - -### Accuracy of `average` and `sum` data sources - -When the data source is set to `average` or `sum`, Netdata remembers the last access of each client accessing prometheus -metrics and uses this last access time to respond with the `average` or `sum` of all the entries in the database since -that. This means that prometheus servers are not losing data when they access Netdata with data source = `average` or -`sum`. - -To uniquely identify each prometheus server, Netdata uses the IP of the client accessing the metrics. If however the IP -is not good enough for identifying a single prometheus server (e.g. when prometheus servers are accessing Netdata -through a web proxy, or when multiple prometheus servers are NATed to a single IP), each prometheus may append -`&server=NAME` to the URL. This `NAME` is used by Netdata to uniquely identify each prometheus server and keep track of -its last access time. - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fprometheus%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/prometheus/backend_prometheus.c b/backends/prometheus/backend_prometheus.c deleted file mode 100644 index 1fb3fd42c..000000000 --- a/backends/prometheus/backend_prometheus.c +++ /dev/null @@ -1,797 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#define BACKENDS_INTERNALS -#include "backend_prometheus.h" - -// ---------------------------------------------------------------------------- -// PROMETHEUS -// /api/v1/allmetrics?format=prometheus and /api/v1/allmetrics?format=prometheus_all_hosts - -static struct prometheus_server { - const char *server; - uint32_t hash; - RRDHOST *host; - time_t last_access; - struct prometheus_server *next; -} *prometheus_server_root = NULL; - -static inline time_t prometheus_server_last_access(const char *server, RRDHOST *host, time_t now) { - static netdata_mutex_t prometheus_server_root_mutex = NETDATA_MUTEX_INITIALIZER; - - uint32_t hash = simple_hash(server); - - netdata_mutex_lock(&prometheus_server_root_mutex); - - struct prometheus_server *ps; - for(ps = prometheus_server_root; ps ;ps = ps->next) { - if (host == ps->host && hash == ps->hash && !strcmp(server, ps->server)) { - time_t last = ps->last_access; - ps->last_access = now; - netdata_mutex_unlock(&prometheus_server_root_mutex); - return last; - } - } - - ps = callocz(1, sizeof(struct prometheus_server)); - ps->server = strdupz(server); - ps->hash = hash; - ps->host = host; - ps->last_access = now; - ps->next = prometheus_server_root; - prometheus_server_root = ps; - - netdata_mutex_unlock(&prometheus_server_root_mutex); - return 0; -} - -static inline size_t backends_prometheus_name_copy(char *d, const char *s, size_t usable) { - size_t n; - - for(n = 0; *s && n < usable ; d++, s++, n++) { - register char c = *s; - - if(!isalnum(c)) *d = '_'; - else *d = c; - } - *d = '\0'; - - return n; -} - -static inline size_t backends_prometheus_label_copy(char *d, const char *s, size_t usable) { - size_t n; - - // make sure we can escape one character without overflowing the buffer - usable--; - - for(n = 0; *s && n < usable ; d++, s++, n++) { - register char c = *s; - - if(unlikely(c == '"' || c == '\\' || c == '\n')) { - *d++ = '\\'; - n++; - } - *d = c; - } - *d = '\0'; - - return n; -} - -static inline 
char *backends_prometheus_units_copy(char *d, const char *s, size_t usable, int showoldunits) { - const char *sorig = s; - char *ret = d; - size_t n; - - // Fix for issue 5227 - if (unlikely(showoldunits)) { - static struct { - const char *newunit; - uint32_t hash; - const char *oldunit; - } units[] = { - {"KiB/s", 0, "kilobytes/s"} - , {"MiB/s", 0, "MB/s"} - , {"GiB/s", 0, "GB/s"} - , {"KiB" , 0, "KB"} - , {"MiB" , 0, "MB"} - , {"GiB" , 0, "GB"} - , {"inodes" , 0, "Inodes"} - , {"percentage" , 0, "percent"} - , {"faults/s" , 0, "page faults/s"} - , {"KiB/operation", 0, "kilobytes per operation"} - , {"milliseconds/operation", 0, "ms per operation"} - , {NULL, 0, NULL} - }; - static int initialized = 0; - int i; - - if(unlikely(!initialized)) { - for (i = 0; units[i].newunit; i++) - units[i].hash = simple_hash(units[i].newunit); - initialized = 1; - } - - uint32_t hash = simple_hash(s); - for(i = 0; units[i].newunit ; i++) { - if(unlikely(hash == units[i].hash && !strcmp(s, units[i].newunit))) { - // info("matched extension for filename '%s': '%s'", filename, last_dot); - s=units[i].oldunit; - sorig = s; - break; - } - } - } - *d++ = '_'; - for(n = 1; *s && n < usable ; d++, s++, n++) { - register char c = *s; - - if(!isalnum(c)) *d = '_'; - else *d = c; - } - - if(n == 2 && sorig[0] == '%') { - n = 0; - d = ret; - s = "_percent"; - for( ; *s && n < usable ; n++) *d++ = *s++; - } - else if(n > 3 && sorig[n-3] == '/' && sorig[n-2] == 's') { - n = n - 2; - d -= 2; - s = "_persec"; - for( ; *s && n < usable ; n++) *d++ = *s++; - } - - *d = '\0'; - - return ret; -} - - -#define PROMETHEUS_ELEMENT_MAX 256 -#define PROMETHEUS_LABELS_MAX 1024 -#define PROMETHEUS_VARIABLE_MAX 256 - -#define PROMETHEUS_LABELS_MAX_NUMBER 128 - -struct host_variables_callback_options { - RRDHOST *host; - BUFFER *wb; - BACKEND_OPTIONS backend_options; - BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options; - const char *prefix; - const char *labels; - time_t now; - int host_header_printed; - char name[PROMETHEUS_VARIABLE_MAX+1]; -}; - -static int print_host_variables(RRDVAR *rv, void *data) { - struct host_variables_callback_options *opts = data; - - if(rv->options & (RRDVAR_OPTION_CUSTOM_HOST_VAR|RRDVAR_OPTION_CUSTOM_CHART_VAR)) { - if(!opts->host_header_printed) { - opts->host_header_printed = 1; - - if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP) { - buffer_sprintf(opts->wb, "\n# COMMENT global host and chart variables\n"); - } - } - - calculated_number value = rrdvar2number(rv); - if(isnan(value) || isinf(value)) { - if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP) - buffer_sprintf(opts->wb, "# COMMENT variable \"%s\" is %s. 
Skipped.\n", rv->name, (isnan(value))?"NAN":"INF"); - - return 0; - } - - char *label_pre = ""; - char *label_post = ""; - if(opts->labels && *opts->labels) { - label_pre = "{"; - label_post = "}"; - } - - backends_prometheus_name_copy(opts->name, rv->name, sizeof(opts->name)); - - if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(opts->wb - , "%s_%s%s%s%s " CALCULATED_NUMBER_FORMAT " %llu\n" - , opts->prefix - , opts->name - , label_pre - , opts->labels - , label_post - , value - , opts->now * 1000ULL - ); - else - buffer_sprintf(opts->wb, "%s_%s%s%s%s " CALCULATED_NUMBER_FORMAT "\n" - , opts->prefix - , opts->name - , label_pre - , opts->labels - , label_post - , value - ); - - return 1; - } - - return 0; -} - -static void rrd_stats_api_v1_charts_allmetrics_prometheus(RRDHOST *host, BUFFER *wb, const char *prefix, BACKEND_OPTIONS backend_options, time_t after, time_t before, int allhosts, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) { - rrdhost_rdlock(host); - - char hostname[PROMETHEUS_ELEMENT_MAX + 1]; - backends_prometheus_label_copy(hostname, host->hostname, PROMETHEUS_ELEMENT_MAX); - - char labels[PROMETHEUS_LABELS_MAX + 1] = ""; - if(allhosts) { - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1 %llu\n", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS); - else - buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1\n", hostname, host->program_name, host->program_version); - - if(host->tags && *(host->tags)) { - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) { - buffer_sprintf(wb, "netdata_host_tags_info{instance=\"%s\",%s} 1 %llu\n", hostname, host->tags, now_realtime_usec() / USEC_PER_MS); - - // deprecated, exists only for compatibility with older queries - buffer_sprintf(wb, "netdata_host_tags{instance=\"%s\",%s} 1 %llu\n", hostname, host->tags, now_realtime_usec() / USEC_PER_MS); - } - else { - buffer_sprintf(wb, "netdata_host_tags_info{instance=\"%s\",%s} 1\n", hostname, host->tags); - - // deprecated, exists only for compatibility with older queries - buffer_sprintf(wb, "netdata_host_tags{instance=\"%s\",%s} 1\n", hostname, host->tags); - } - - } - - snprintfz(labels, PROMETHEUS_LABELS_MAX, ",instance=\"%s\"", hostname); - } - else { - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1 %llu\n", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS); - else - buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1\n", hostname, host->program_name, host->program_version); - - if(host->tags && *(host->tags)) { - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) { - buffer_sprintf(wb, "netdata_host_tags_info{%s} 1 %llu\n", host->tags, now_realtime_usec() / USEC_PER_MS); - - // deprecated, exists only for compatibility with older queries - buffer_sprintf(wb, "netdata_host_tags{%s} 1 %llu\n", host->tags, now_realtime_usec() / USEC_PER_MS); - } - else { - buffer_sprintf(wb, "netdata_host_tags_info{%s} 1\n", host->tags); - - // deprecated, exists only for compatibility with older queries - buffer_sprintf(wb, "netdata_host_tags{%s} 1\n", host->tags); - } - } - } - - // send custom variables set for the host - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_VARIABLES){ - struct 
host_variables_callback_options opts = { - .host = host, - .wb = wb, - .labels = (labels[0] == ',')?&labels[1]:labels, - .backend_options = backend_options, - .output_options = output_options, - .prefix = prefix, - .now = now_realtime_sec(), - .host_header_printed = 0 - }; - foreach_host_variable_callback(host, print_host_variables, &opts); - } - - // for each chart - RRDSET *st; - rrdset_foreach_read(st, host) { - char chart[PROMETHEUS_ELEMENT_MAX + 1]; - char context[PROMETHEUS_ELEMENT_MAX + 1]; - char family[PROMETHEUS_ELEMENT_MAX + 1]; - char units[PROMETHEUS_ELEMENT_MAX + 1] = ""; - - backends_prometheus_label_copy(chart, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name)?st->name:st->id, PROMETHEUS_ELEMENT_MAX); - backends_prometheus_label_copy(family, st->family, PROMETHEUS_ELEMENT_MAX); - backends_prometheus_name_copy(context, st->context, PROMETHEUS_ELEMENT_MAX); - - if(likely(backends_can_send_rrdset(backend_options, st))) { - rrdset_rdlock(st); - - int as_collected = (BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED); - int homogeneous = 1; - if(as_collected) { - if(rrdset_flag_check(st, RRDSET_FLAG_HOMOGENEOUS_CHECK)) - rrdset_update_heterogeneous_flag(st); - - if(rrdset_flag_check(st, RRDSET_FLAG_HETEROGENEOUS)) - homogeneous = 0; - } - else { - if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE && !(output_options & BACKENDS_PROMETHEUS_OUTPUT_HIDEUNITS)) - backends_prometheus_units_copy(units, st->units, PROMETHEUS_ELEMENT_MAX, output_options & BACKENDS_PROMETHEUS_OUTPUT_OLDUNITS); - } - - if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP)) - buffer_sprintf(wb, "\n# COMMENT %s chart \"%s\", context \"%s\", family \"%s\", units \"%s\"\n" - , (homogeneous)?"homogeneous":"heterogeneous" - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id - , st->context - , st->family - , st->units - ); - - // for each dimension - RRDDIM *rd; - rrddim_foreach_read(rd, st) { - if(rd->collections_counter && !rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) { - char dimension[PROMETHEUS_ELEMENT_MAX + 1]; - char *suffix = ""; - - if (as_collected) { - // we need as-collected / raw data - - if(unlikely(rd->last_collected_time.tv_sec < after)) - continue; - - const char *t = "gauge", *h = "gives"; - if(rd->algorithm == RRD_ALGORITHM_INCREMENTAL || - rd->algorithm == RRD_ALGORITHM_PCENT_OVER_DIFF_TOTAL) { - t = "counter"; - h = "delta gives"; - suffix = "_total"; - } - - if(homogeneous) { - // all the dimensions of the chart, has the same algorithm, multiplier and divisor - // we add all dimensions as labels - - backends_prometheus_label_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - - if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP)) - buffer_sprintf(wb - , "# COMMENT %s_%s%s: chart \"%s\", context \"%s\", family \"%s\", dimension \"%s\", value * " COLLECTED_NUMBER_FORMAT " / " COLLECTED_NUMBER_FORMAT " %s %s (%s)\n" - , prefix - , context - , suffix - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id - , st->context - , st->family - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? 
rd->name : rd->id - , rd->multiplier - , rd->divisor - , h - , st->units - , t - ); - - if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES)) - buffer_sprintf(wb, "# TYPE %s_%s%s %s\n" - , prefix - , context - , suffix - , t - ); - - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(wb - , "%s_%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " COLLECTED_NUMBER_FORMAT " %llu\n" - , prefix - , context - , suffix - , chart - , family - , dimension - , labels - , rd->last_collected_value - , timeval_msec(&rd->last_collected_time) - ); - else - buffer_sprintf(wb - , "%s_%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " COLLECTED_NUMBER_FORMAT "\n" - , prefix - , context - , suffix - , chart - , family - , dimension - , labels - , rd->last_collected_value - ); - } - else { - // the dimensions of the chart, do not have the same algorithm, multiplier or divisor - // we create a metric per dimension - - backends_prometheus_name_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - - if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP)) - buffer_sprintf(wb - , "# COMMENT %s_%s_%s%s: chart \"%s\", context \"%s\", family \"%s\", dimension \"%s\", value * " COLLECTED_NUMBER_FORMAT " / " COLLECTED_NUMBER_FORMAT " %s %s (%s)\n" - , prefix - , context - , dimension - , suffix - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id - , st->context - , st->family - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id - , rd->multiplier - , rd->divisor - , h - , st->units - , t - ); - - if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES)) - buffer_sprintf(wb, "# TYPE %s_%s_%s%s %s\n" - , prefix - , context - , dimension - , suffix - , t - ); - - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(wb - , "%s_%s_%s%s{chart=\"%s\",family=\"%s\"%s} " COLLECTED_NUMBER_FORMAT " %llu\n" - , prefix - , context - , dimension - , suffix - , chart - , family - , labels - , rd->last_collected_value - , timeval_msec(&rd->last_collected_time) - ); - else - buffer_sprintf(wb - , "%s_%s_%s%s{chart=\"%s\",family=\"%s\"%s} " COLLECTED_NUMBER_FORMAT "\n" - , prefix - , context - , dimension - , suffix - , chart - , family - , labels - , rd->last_collected_value - ); - } - } - else { - // we need average or sum of the data - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - if(!isnan(value) && !isinf(value)) { - - if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE) - suffix = "_average"; - else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM) - suffix = "_sum"; - - backends_prometheus_label_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - - if (unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP)) - buffer_sprintf(wb, "# COMMENT %s_%s%s%s: dimension \"%s\", value is %s, gauge, dt %llu to %llu inclusive\n" - , prefix - , context - , units - , suffix - , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? 
rd->name : rd->id - , st->units - , (unsigned long long)first_t - , (unsigned long long)last_t - ); - - if (unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES)) - buffer_sprintf(wb, "# TYPE %s_%s%s%s gauge\n" - , prefix - , context - , units - , suffix - ); - - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) - buffer_sprintf(wb, "%s_%s%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " CALCULATED_NUMBER_FORMAT " %llu\n" - , prefix - , context - , units - , suffix - , chart - , family - , dimension - , labels - , value - , last_t * MSEC_PER_SEC - ); - else - buffer_sprintf(wb, "%s_%s%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " CALCULATED_NUMBER_FORMAT "\n" - , prefix - , context - , units - , suffix - , chart - , family - , dimension - , labels - , value - ); - } - } - } - } - - rrdset_unlock(st); - } - } - - rrdhost_unlock(host); -} - -#if ENABLE_PROMETHEUS_REMOTE_WRITE -inline static void remote_write_split_words(char *str, char **words, int max_words) { - char *s = str; - int i = 0; - - while(*s && i < max_words - 1) { - while(*s && isspace(*s)) s++; // skip spaces to the beginning of a tag name - - if(*s) - words[i] = s; - - while(*s && !isspace(*s) && *s != '=') s++; // find the end of the tag name - - if(*s != '=') { - words[i] = NULL; - break; - } - *s = '\0'; - s++; - i++; - - while(*s && isspace(*s)) s++; // skip spaces to the beginning of a tag value - - if(*s && *s == '"') s++; // strip an opening quote - if(*s) - words[i] = s; - - while(*s && !isspace(*s) && *s != ',') s++; // find the end of the tag value - - if(*s && *s != ',') { - words[i] = NULL; - break; - } - if(s != words[i] && *(s - 1) == '"') *(s - 1) = '\0'; // strip a closing quote - if(*s != '\0') { - *s = '\0'; - s++; - i++; - } - } -} - -void backends_rrd_stats_remote_write_allmetrics_prometheus( - RRDHOST *host - , const char *__hostname - , const char *prefix - , BACKEND_OPTIONS backend_options - , time_t after - , time_t before - , size_t *count_charts - , size_t *count_dims - , size_t *count_dims_skipped -) { - char hostname[PROMETHEUS_ELEMENT_MAX + 1]; - backends_prometheus_label_copy(hostname, __hostname, PROMETHEUS_ELEMENT_MAX); - - backends_add_host_info("netdata_info", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS); - - if(host->tags && *(host->tags)) { - char tags[PROMETHEUS_LABELS_MAX + 1]; - strncpy(tags, host->tags, PROMETHEUS_LABELS_MAX); - char *words[PROMETHEUS_LABELS_MAX_NUMBER] = {NULL}; - int i; - - remote_write_split_words(tags, words, PROMETHEUS_LABELS_MAX_NUMBER); - - backends_add_host_info("netdata_host_tags_info", hostname, NULL, NULL, now_realtime_usec() / USEC_PER_MS); - - for(i = 0; words[i] != NULL && words[i + 1] != NULL && (i + 1) < PROMETHEUS_LABELS_MAX_NUMBER; i += 2) { - backends_add_tag(words[i], words[i + 1]); - } - } - - // for each chart - RRDSET *st; - rrdset_foreach_read(st, host) { - char chart[PROMETHEUS_ELEMENT_MAX + 1]; - char context[PROMETHEUS_ELEMENT_MAX + 1]; - char family[PROMETHEUS_ELEMENT_MAX + 1]; - char units[PROMETHEUS_ELEMENT_MAX + 1] = ""; - - backends_prometheus_label_copy(chart, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, PROMETHEUS_ELEMENT_MAX); - backends_prometheus_label_copy(family, st->family, PROMETHEUS_ELEMENT_MAX); - backends_prometheus_name_copy(context, st->context, PROMETHEUS_ELEMENT_MAX); - - if(likely(backends_can_send_rrdset(backend_options, st))) { - rrdset_rdlock(st); - - (*count_charts)++; - - int as_collected = 
(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED); - int homogeneous = 1; - if(as_collected) { - if(rrdset_flag_check(st, RRDSET_FLAG_HOMOGENEOUS_CHECK)) - rrdset_update_heterogeneous_flag(st); - - if(rrdset_flag_check(st, RRDSET_FLAG_HETEROGENEOUS)) - homogeneous = 0; - } - else { - if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE) - backends_prometheus_units_copy(units, st->units, PROMETHEUS_ELEMENT_MAX, 0); - } - - // for each dimension - RRDDIM *rd; - rrddim_foreach_read(rd, st) { - if(rd->collections_counter && !rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) { - char name[PROMETHEUS_LABELS_MAX + 1]; - char dimension[PROMETHEUS_ELEMENT_MAX + 1]; - char *suffix = ""; - - if (as_collected) { - // we need as-collected / raw data - - if(unlikely(rd->last_collected_time.tv_sec < after)) { - debug(D_BACKEND, "BACKEND: not sending dimension '%s' of chart '%s' from host '%s', its last data collection (%lu) is not within our timeframe (%lu to %lu)", rd->id, st->id, __hostname, (unsigned long)rd->last_collected_time.tv_sec, (unsigned long)after, (unsigned long)before); - (*count_dims_skipped)++; - continue; - } - - if(homogeneous) { - // all the dimensions of the chart, has the same algorithm, multiplier and divisor - // we add all dimensions as labels - - backends_prometheus_label_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s%s", prefix, context, suffix); - - backends_add_metric(name, chart, family, dimension, hostname, rd->last_collected_value, timeval_msec(&rd->last_collected_time)); - (*count_dims)++; - } - else { - // the dimensions of the chart, do not have the same algorithm, multiplier or divisor - // we create a metric per dimension - - backends_prometheus_name_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s_%s%s", prefix, context, dimension, suffix); - - backends_add_metric(name, chart, family, NULL, hostname, rd->last_collected_value, timeval_msec(&rd->last_collected_time)); - (*count_dims)++; - } - } - else { - // we need average or sum of the data - - time_t first_t = after, last_t = before; - calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t); - - if(!isnan(value) && !isinf(value)) { - - if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE) - suffix = "_average"; - else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM) - suffix = "_sum"; - - backends_prometheus_label_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? 
rd->name : rd->id, PROMETHEUS_ELEMENT_MAX); - snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s%s%s", prefix, context, units, suffix); - - backends_add_metric(name, chart, family, dimension, hostname, value, last_t * MSEC_PER_SEC); - (*count_dims)++; - } - } - } - } - - rrdset_unlock(st); - } - } -} -#endif /* ENABLE_PROMETHEUS_REMOTE_WRITE */ - -static inline time_t prometheus_preparation(RRDHOST *host, BUFFER *wb, BACKEND_OPTIONS backend_options, const char *server, time_t now, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) { - if(!server || !*server) server = "default"; - - time_t after = prometheus_server_last_access(server, host, now); - - int first_seen = 0; - if(!after) { - after = now - global_backend_update_every; - first_seen = 1; - } - - if(after > now) { - // oops! this should never happen - after = now - global_backend_update_every; - } - - if(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP) { - char *mode; - if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED) - mode = "as collected"; - else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE) - mode = "average"; - else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM) - mode = "sum"; - else - mode = "unknown"; - - buffer_sprintf(wb, "# COMMENT netdata \"%s\" to %sprometheus \"%s\", source \"%s\", last seen %lu %s, time range %lu to %lu\n\n" - , host->hostname - , (first_seen)?"FIRST SEEN ":"" - , server - , mode - , (unsigned long)((first_seen)?0:(now - after)) - , (first_seen)?"never":"seconds ago" - , (unsigned long)after, (unsigned long)now - ); - } - - return after; -} - -void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_single_host(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) { - time_t before = now_realtime_sec(); - - // we start at the point we had stopped before - time_t after = prometheus_preparation(host, wb, backend_options, server, before, output_options); - - rrd_stats_api_v1_charts_allmetrics_prometheus(host, wb, prefix, backend_options, after, before, 0, output_options); -} - -void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_all_hosts(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) { - time_t before = now_realtime_sec(); - - // we start at the point we had stopped before - time_t after = prometheus_preparation(host, wb, backend_options, server, before, output_options); - - rrd_rdlock(); - rrdhost_foreach_read(host) { - rrd_stats_api_v1_charts_allmetrics_prometheus(host, wb, prefix, backend_options, after, before, 1, output_options); - } - rrd_unlock(); -} - -#if ENABLE_PROMETHEUS_REMOTE_WRITE -int backends_process_prometheus_remote_write_response(BUFFER *b) { - if(unlikely(!b)) return 1; - - const char *s = buffer_tostring(b); - int len = buffer_strlen(b); - - // do nothing with HTTP responses 200 or 204 - - while(!isspace(*s) && len) { - s++; - len--; - } - s++; - len--; - - if(likely(len > 4 && (!strncmp(s, "200 ", 4) || !strncmp(s, "204 ", 4)))) - return 0; - else - return discard_response(b, "prometheus remote write"); -} -#endif diff --git a/backends/prometheus/backend_prometheus.h b/backends/prometheus/backend_prometheus.h deleted file mode 100644 index 8c14ddc26..000000000 --- a/backends/prometheus/backend_prometheus.h +++ /dev/null @@ -1,37 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later 
- -#ifndef NETDATA_BACKEND_PROMETHEUS_H -#define NETDATA_BACKEND_PROMETHEUS_H 1 - -#include "backends/backends.h" - -typedef enum backends_prometheus_output_flags { - BACKENDS_PROMETHEUS_OUTPUT_NONE = 0, - BACKENDS_PROMETHEUS_OUTPUT_HELP = (1 << 0), - BACKENDS_PROMETHEUS_OUTPUT_TYPES = (1 << 1), - BACKENDS_PROMETHEUS_OUTPUT_NAMES = (1 << 2), - BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS = (1 << 3), - BACKENDS_PROMETHEUS_OUTPUT_VARIABLES = (1 << 4), - BACKENDS_PROMETHEUS_OUTPUT_OLDUNITS = (1 << 5), - BACKENDS_PROMETHEUS_OUTPUT_HIDEUNITS = (1 << 6) -} BACKENDS_PROMETHEUS_OUTPUT_OPTIONS; - -extern void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_single_host(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options); -extern void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_all_hosts(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options); - -#if ENABLE_PROMETHEUS_REMOTE_WRITE -extern void backends_rrd_stats_remote_write_allmetrics_prometheus( - RRDHOST *host - , const char *__hostname - , const char *prefix - , BACKEND_OPTIONS backend_options - , time_t after - , time_t before - , size_t *count_charts - , size_t *count_dims - , size_t *count_dims_skipped -); -extern int backends_process_prometheus_remote_write_response(BUFFER *b); -#endif - -#endif //NETDATA_BACKEND_PROMETHEUS_H diff --git a/backends/prometheus/remote_write/Makefile.am b/backends/prometheus/remote_write/Makefile.am deleted file mode 100644 index d049ef48c..000000000 --- a/backends/prometheus/remote_write/Makefile.am +++ /dev/null @@ -1,14 +0,0 @@ -# SPDX-License-Identifier: GPL-3.0-or-later - -AUTOMAKE_OPTIONS = subdir-objects -MAINTAINERCLEANFILES = $(srcdir)/Makefile.in - -CLEANFILES = \ - remote_write.pb.cc \ - remote_write.pb.h \ - $(NULL) - -dist_noinst_DATA = \ - remote_write.proto \ - README.md \ - $(NULL) diff --git a/backends/prometheus/remote_write/README.md b/backends/prometheus/remote_write/README.md deleted file mode 100644 index b83575e10..000000000 --- a/backends/prometheus/remote_write/README.md +++ /dev/null @@ -1,41 +0,0 @@ -<!-- -title: "Prometheus remote write backend" -custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/prometheus/remote_write/README.md ---> - -# Prometheus remote write backend - -## Prerequisites - -To use the prometheus remote write API with [storage -providers](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage) -[protobuf](https://developers.google.com/protocol-buffers/) and [snappy](https://github.com/google/snappy) libraries -should be installed first. Next, Netdata should be re-installed from the source. The installer will detect that the -required libraries and utilities are now available. - -## Configuration - -An additional option in the backend configuration section is available for the remote write backend: - -```conf -[backend] - remote write URL path = /receive -``` - -The default value is `/receive`. `remote write URL path` is used to set an endpoint path for the remote write protocol. 
-For example, if your endpoint is `http://example.domain:example_port/storage/read` you should set - -```conf -[backend] - destination = example.domain:example_port - remote write URL path = /storage/read -``` - -`buffered` and `lost` dimensions in the Netdata Backend Data Size operation monitoring chart estimate uncompressed -buffer size on failures. - -## Notes - -The remote write backend does not support `buffer on failures` - -[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fprometheus%2Fremote_write%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>) diff --git a/backends/prometheus/remote_write/remote_write.cc b/backends/prometheus/remote_write/remote_write.cc deleted file mode 100644 index b919cffad..000000000 --- a/backends/prometheus/remote_write/remote_write.cc +++ /dev/null @@ -1,120 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#include <snappy.h> -#include "exporting/prometheus/remote_write/remote_write.pb.h" -#include "remote_write.h" - -using namespace prometheus; - -static google::protobuf::Arena arena; -static WriteRequest *write_request; - -void backends_init_write_request() { - GOOGLE_PROTOBUF_VERIFY_VERSION; - write_request = google::protobuf::Arena::CreateMessage<WriteRequest>(&arena); -} - -void backends_clear_write_request() { - write_request->clear_timeseries(); -} - -void backends_add_host_info(const char *name, const char *instance, const char *application, const char *version, const int64_t timestamp) { - TimeSeries *timeseries; - Sample *sample; - Label *label; - - timeseries = write_request->add_timeseries(); - - label = timeseries->add_labels(); - label->set_name("__name__"); - label->set_value(name); - - label = timeseries->add_labels(); - label->set_name("instance"); - label->set_value(instance); - - if(application) { - label = timeseries->add_labels(); - label->set_name("application"); - label->set_value(application); - } - - if(version) { - label = timeseries->add_labels(); - label->set_name("version"); - label->set_value(version); - } - - sample = timeseries->add_samples(); - sample->set_value(1); - sample->set_timestamp(timestamp); -} - -// adds tag to the last created timeseries -void backends_add_tag(char *tag, char *value) { - TimeSeries *timeseries; - Label *label; - - timeseries = write_request->mutable_timeseries(write_request->timeseries_size() - 1); - - label = timeseries->add_labels(); - label->set_name(tag); - label->set_value(value); -} - -void backends_add_metric(const char *name, const char *chart, const char *family, const char *dimension, const char *instance, const double value, const int64_t timestamp) { - TimeSeries *timeseries; - Sample *sample; - Label *label; - - timeseries = write_request->add_timeseries(); - - label = timeseries->add_labels(); - label->set_name("__name__"); - label->set_value(name); - - label = timeseries->add_labels(); - label->set_name("chart"); - label->set_value(chart); - - label = timeseries->add_labels(); - label->set_name("family"); - label->set_value(family); - - if(dimension) { - label = timeseries->add_labels(); - label->set_name("dimension"); - label->set_value(dimension); - } - - label = timeseries->add_labels(); - label->set_name("instance"); - label->set_value(instance); - - sample = timeseries->add_samples(); - sample->set_value(value); - sample->set_timestamp(timestamp); -} - -size_t backends_get_write_request_size(){ 
-#if GOOGLE_PROTOBUF_VERSION < 3001000 - size_t size = (size_t)snappy::MaxCompressedLength(write_request->ByteSize()); -#else - size_t size = (size_t)snappy::MaxCompressedLength(write_request->ByteSizeLong()); -#endif - - return (size < INT_MAX)?size:0; -} - -int backends_pack_write_request(char *buffer, size_t *size) { - std::string uncompressed_write_request; - if(write_request->SerializeToString(&uncompressed_write_request) == false) return 1; - - snappy::RawCompress(uncompressed_write_request.data(), uncompressed_write_request.size(), buffer, size); - - return 0; -} - -void backends_protocol_buffers_shutdown() { - google::protobuf::ShutdownProtobufLibrary(); -} diff --git a/backends/prometheus/remote_write/remote_write.h b/backends/prometheus/remote_write/remote_write.h deleted file mode 100644 index 1307d7281..000000000 --- a/backends/prometheus/remote_write/remote_write.h +++ /dev/null @@ -1,30 +0,0 @@ -// SPDX-License-Identifier: GPL-3.0-or-later - -#ifndef NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H -#define NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H - -#ifdef __cplusplus -extern "C" { -#endif - -void backends_init_write_request(); - -void backends_clear_write_request(); - -void backends_add_host_info(const char *name, const char *instance, const char *application, const char *version, const int64_t timestamp); - -void backends_add_tag(char *tag, char *value); - -void backends_add_metric(const char *name, const char *chart, const char *family, const char *dimension, const char *instance, const double value, const int64_t timestamp); - -size_t backends_get_write_request_size(); - -int backends_pack_write_request(char *buffer, size_t *size); - -void backends_protocol_buffers_shutdown(); - -#ifdef __cplusplus -} -#endif - -#endif //NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H diff --git a/backends/prometheus/remote_write/remote_write.proto b/backends/prometheus/remote_write/remote_write.proto deleted file mode 100644 index dfde254e1..000000000 --- a/backends/prometheus/remote_write/remote_write.proto +++ /dev/null @@ -1,29 +0,0 @@ -syntax = "proto3"; -package prometheus; - -option cc_enable_arenas = true; - -import "google/protobuf/descriptor.proto"; - -message WriteRequest { - repeated TimeSeries timeseries = 1 [(nullable) = false]; -} - -message TimeSeries { - repeated Label labels = 1 [(nullable) = false]; - repeated Sample samples = 2 [(nullable) = false]; -} - -message Label { - string name = 1; - string value = 2; -} - -message Sample { - double value = 1; - int64 timestamp = 2; -} - -extend google.protobuf.FieldOptions { - bool nullable = 65001; -} |
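
For reference, the `remote_write.pb.cc` and `remote_write.pb.h` files listed in `CLEANFILES` in the Makefile above are generated from this `.proto` definition during the build. A rough sketch of that generation step, assuming `protoc` and the protobuf development files are installed (as required by the remote write prerequisites):

```sh
# generate the C++ sources included by remote_write.cc
protoc --proto_path=backends/prometheus/remote_write \
       --cpp_out=backends/prometheus/remote_write \
       backends/prometheus/remote_write/remote_write.proto
```

In the actual Netdata build this step is driven by the build system; the command above only illustrates what the `CLEANFILES` entries correspond to.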