path: root/backends
Diffstat (limited to 'backends')
-rw-r--r--  backends/Makefile.am                                    22
-rw-r--r--  backends/README.md                                     236
-rw-r--r--  backends/TIMESCALE.md                                   57
-rw-r--r--  backends/WALKTHROUGH.md                                258
-rw-r--r--  backends/aws_kinesis/Makefile.am                        12
-rw-r--r--  backends/aws_kinesis/README.md                          53
-rw-r--r--  backends/aws_kinesis/aws_kinesis.c                      94
-rw-r--r--  backends/aws_kinesis/aws_kinesis.conf                   10
-rw-r--r--  backends/aws_kinesis/aws_kinesis.h                      14
-rw-r--r--  backends/aws_kinesis/aws_kinesis_put_record.cc          87
-rw-r--r--  backends/aws_kinesis/aws_kinesis_put_record.h           25
-rw-r--r--  backends/backends.c                                   1243
-rw-r--r--  backends/backends.h                                     97
-rw-r--r--  backends/graphite/Makefile.am                            4
-rw-r--r--  backends/graphite/graphite.c                            90
-rw-r--r--  backends/graphite/graphite.h                            35
-rw-r--r--  backends/json/Makefile.am                                4
-rw-r--r--  backends/json/json.c                                   152
-rw-r--r--  backends/json/json.h                                    34
-rw-r--r--  backends/mongodb/Makefile.am                             8
-rw-r--r--  backends/mongodb/README.md                              41
-rw-r--r--  backends/mongodb/mongodb.c                             189
-rw-r--r--  backends/mongodb/mongodb.conf                           12
-rw-r--r--  backends/mongodb/mongodb.h                              16
-rwxr-xr-x  backends/nc-backend.sh                                 158
-rw-r--r--  backends/opentsdb/Makefile.am                            4
-rw-r--r--  backends/opentsdb/README.md                             38
-rw-r--r--  backends/opentsdb/opentsdb.c                           205
-rw-r--r--  backends/opentsdb/opentsdb.h                            58
-rw-r--r--  backends/prometheus/Makefile.am                         12
-rw-r--r--  backends/prometheus/README.md                          457
-rw-r--r--  backends/prometheus/backend_prometheus.c               797
-rw-r--r--  backends/prometheus/backend_prometheus.h                37
-rw-r--r--  backends/prometheus/remote_write/Makefile.am            14
-rw-r--r--  backends/prometheus/remote_write/README.md              41
-rw-r--r--  backends/prometheus/remote_write/remote_write.cc       120
-rw-r--r--  backends/prometheus/remote_write/remote_write.h         30
-rw-r--r--  backends/prometheus/remote_write/remote_write.proto     29
38 files changed, 4793 insertions, 0 deletions
diff --git a/backends/Makefile.am b/backends/Makefile.am
new file mode 100644
index 0000000..dace013
--- /dev/null
+++ b/backends/Makefile.am
@@ -0,0 +1,22 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+SUBDIRS = \
+ graphite \
+ json \
+ opentsdb \
+ prometheus \
+ aws_kinesis \
+ mongodb \
+ $(NULL)
+
+dist_noinst_DATA = \
+ README.md \
+ WALKTHROUGH.md \
+ $(NULL)
+
+dist_noinst_SCRIPTS = \
+ nc-backend.sh \
+ $(NULL)
diff --git a/backends/README.md b/backends/README.md
new file mode 100644
index 0000000..8d53fd6
--- /dev/null
+++ b/backends/README.md
@@ -0,0 +1,236 @@
+<!--
+title: "Metrics long term archiving"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/README.md
+-->
+
+# Metrics long term archiving
+
+> ⚠️ The backends system is now deprecated in favor of the [exporting engine](/exporting/README.md).
+
+Netdata supports backends for archiving metrics, or for providing long-term dashboards, using Grafana or other tools,
+like this:
+
+![image](https://cloud.githubusercontent.com/assets/2662304/20649711/29f182ba-b4ce-11e6-97c8-ab2c0ab59833.png)
+
+Since Netdata collects thousands of metrics per server per second, which would easily congest any backend server when
+several Netdata servers are sending data to it, Netdata allows sending metrics at a lower frequency, by resampling them.
+
+So, although Netdata collects metrics every second, it can send averages or sums to the backend servers every X seconds
+(though it can send them per second if you need it to).
+
+## features
+
+1. Supported backends
+
+ - **graphite** (`plaintext interface`, used by **Graphite**, **InfluxDB**, **KairosDB**, **Blueflood**,
+ **ElasticSearch** via logstash tcp input and the graphite codec, etc)
+
+     metrics are sent to the backend server as `prefix.hostname.chart.dimension`. `prefix` is configured below,
+     `hostname` is the hostname of the machine (can also be configured). See the example after this list.
+
+ - **opentsdb** (`telnet or HTTP interfaces`, used by **OpenTSDB**, **InfluxDB**, **KairosDB**, etc)
+
+ metrics are sent to opentsdb as `prefix.chart.dimension` with tag `host=hostname`.
+
+ - **json** document DBs
+
+ metrics are sent to a document db, `JSON` formatted.
+
+ - **prometheus** is described at [prometheus page](/backends/prometheus/README.md) since it pulls data from
+ Netdata.
+
+ - **prometheus remote write** (a binary snappy-compressed protocol buffer encoding over HTTP used by
+ **Elasticsearch**, **Gnocchi**, **Graphite**, **InfluxDB**, **Kafka**, **OpenTSDB**, **PostgreSQL/TimescaleDB**,
+ **Splunk**, **VictoriaMetrics**, and a lot of other [storage
+ providers](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage))
+
+     metrics are labeled in the same format that Netdata uses for the [plaintext prometheus
+     protocol](/backends/prometheus/README.md). Notes on using the remote write backend are [here](/backends/prometheus/remote_write/README.md).
+
+ - **TimescaleDB** via [community-built connector](/backends/TIMESCALE.md) that takes JSON streams from a Netdata
+ client and writes them to a TimescaleDB table.
+
+ - **AWS Kinesis Data Streams**
+
+ metrics are sent to the service in `JSON` format.
+
+ - **MongoDB**
+
+ metrics are sent to the database in `JSON` format.
+
+2. Only one backend may be active at a time.
+
+3. Netdata can filter metrics (at the chart level), to send only a subset of the collected metrics.
+
+4. Netdata supports three modes of operation for all backends:
+
+ - `as-collected` sends to backends the metrics as they are collected, in the units they are collected. So,
+ counters are sent as counters and gauges are sent as gauges, much like all data collectors do. For example, to
+ calculate CPU utilization in this format, you need to know how to convert kernel ticks to percentage.
+
+ - `average` sends to backends normalized metrics from the Netdata database. In this mode, all metrics are sent as
+ gauges, in the units Netdata uses. This abstracts data collection and simplifies visualization, but you will not
+ be able to copy and paste queries from other sources to convert units. For example, CPU utilization percentage
+ is calculated by Netdata, so Netdata will convert ticks to percentage and send the average percentage to the
+ backend.
+
+ - `sum` or `volume`: the sum of the interpolated values shown on the Netdata graphs is sent to the backend. So, if
+ Netdata is configured to send data to the backend every 10 seconds, the sum of the 10 values shown on the
+ Netdata charts will be used.
+
+   Time-series databases suggest collecting the raw values (`as-collected`). If you plan to invest in building your
+   monitoring around a time-series database and you already know (or will invest in learning) how to convert units
+   and normalize the metrics in Grafana or other visualization tools, we suggest using `as-collected`.
+
+   If, on the other hand, you just need long term archiving of Netdata metrics and you plan to mainly work with
+   Netdata, we suggest using `average`. It decouples visualization from data collection, so it will generally be a lot
+   simpler. Furthermore, if you use `average`, the charts shown in the backend will match exactly what you see in
+   Netdata, which is not necessarily true for the other modes of operation.
+
+5. This code is smart enough not to slow down Netdata, regardless of the speed of the backend server.
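+
+As an illustration of the `prefix.hostname.chart.dimension` naming used by the graphite backend (see the first item in
+the list above), a hypothetical graphite plaintext line with `prefix = netdata` and `hostname = my-name` could look like
+this; the value sent depends on the selected data source mode (average, sum, or as-collected):
+
+```conf
+netdata.my-name.system.cpu.user 13.5 1585931440
+```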
+
+## configuration
+
+In `/etc/netdata/netdata.conf` you should have something like this (if not, download the latest version of `netdata.conf`
+from your Netdata):
+
+```conf
+[backend]
+ enabled = yes | no
+ type = graphite | opentsdb:telnet | opentsdb:http | opentsdb:https | prometheus_remote_write | json | kinesis | mongodb
+ host tags = list of TAG=VALUE
+ destination = space separated list of [PROTOCOL:]HOST[:PORT] - the first working will be used, or a region for kinesis
+ data source = average | sum | as collected
+ prefix = Netdata
+ hostname = my-name
+ update every = 10
+ buffer on failures = 10
+ timeout ms = 20000
+ send charts matching = *
+ send hosts matching = localhost *
+ send names instead of ids = yes
+```
+
+- `enabled = yes | no`, enables or disables sending data to a backend
+
+- `type = graphite | opentsdb:telnet | opentsdb:http | opentsdb:https | json | kinesis | mongodb`, selects the backend
+ type
+
+- `destination = host1 host2 host3 ...`, accepts **a space separated list** of hostnames, IPs (IPv4 and IPv6) and
+ ports to connect to. Netdata will use the **first available** to send the metrics.
+
+  The format of each item in this list is: `[PROTOCOL:]IP[:PORT]`.
+
+  `PROTOCOL` can be `udp` or `tcp`. `tcp` is the default and the only protocol supported by the current backends.
+
+  `IP` can be `XX.XX.XX.XX` (IPv4), or `[XX:XX...XX:XX]` (IPv6). For IPv6 you need to enclose the IP in `[]` to
+  separate it from the port.
+
+  `PORT` can be a number or a service name. If omitted, the default port for the backend will be used
+  (graphite = 2003, opentsdb = 4242).
+
+ Example IPv4:
+
+```conf
+ destination = 10.11.14.2:4242 10.11.14.3:4242 10.11.14.4:4242
+```
+
+ Example IPv6 and IPv4 together:
+
+```conf
+ destination = [ffff:...:0001]:2003 10.11.12.1:2003
+```
+
+  When multiple servers are defined, Netdata will try the next one when the first one fails. This allows you to
+  load-balance different servers: list your backend servers in a different order on each Netdata.
+
+ Netdata also ships `nc-backend.sh`, a script that can be used as a fallback backend to save the
+ metrics to disk and push them to the time-series database when it becomes available again. It can also be used to
+ monitor / trace / debug the metrics Netdata generates.
+
+ For kinesis backend `destination` should be set to an AWS region (for example, `us-east-1`).
+
+ The MongoDB backend doesn't use the `destination` option for its configuration. It uses the `mongodb.conf`
+ [configuration file](/backends/mongodb/README.md) instead.
+
+- `data source = as collected`, or `data source = average`, or `data source = sum`, selects the kind of data that will
+ be sent to the backend.
+
+- `hostname = my-name`, is the hostname to be used for sending data to the backend server. By default this is
+ `[global].hostname`.
+
+- `prefix = Netdata`, is the prefix to add to all metrics.
+
+- `update every = 10`, is the number of seconds between sending data to the backend. Netdata will add some randomness
+ to this number, to prevent stressing the backend server when many Netdata servers send data to the same backend.
+ This randomness does not affect the quality of the data, only the time they are sent.
+
+- `buffer on failures = 10`, is the number of iterations (each iteration is `[backend].update every` seconds) to
+ buffer data, when the backend is not available. If the backend fails to receive the data after that many failures,
+ data loss on the backend is expected (Netdata will also log it).
+
+- `timeout ms = 20000`, is the timeout in milliseconds to wait for the backend server to process the data. By default
+ this is `2 * update_every * 1000`.
+
+- `send hosts matching = localhost *` includes one or more space separated patterns, using `*` as wildcard (any number
+ of times within each pattern). The patterns are checked against the hostname (the localhost is always checked as
+ `localhost`), allowing us to filter which hosts will be sent to the backend when this Netdata is a central Netdata
+ aggregating multiple hosts. A pattern starting with `!` gives a negative match. So to match all hosts named `*db*`
+ except hosts containing `*child*`, use `!*child* *db*` (so, the order is important: the first pattern
+ matching the hostname will be used - positive or negative).
+
+- `send charts matching = *` includes one or more space separated patterns, using `*` as wildcard (any number of times
+  within each pattern). The patterns are checked against both chart id and chart name. A pattern starting with `!`
+  gives a negative match. So to match all charts named `apps.*` except charts ending in `*reads`, use
+  `!*reads apps.*` (the order is important: the first pattern matching the chart id or the chart name will be used,
+  positive or negative). See the example after this list.
+
+- `send names instead of ids = yes | no` controls the metric names Netdata should send to the backend. Netdata supports
+  names and IDs for charts and dimensions. Usually IDs are unique identifiers as read by the system and names are
+  human friendly labels (also unique). Most charts and metrics have the same ID and name, but in several cases they
+  are different: disks with device-mapper, interrupts, QoS classes, statsd synthetic charts, etc.
+
+- `host tags = list of TAG=VALUE` defines tags that should be appended on all metrics for the given host. These are
+  currently only sent to graphite, json, opentsdb and prometheus. Please use the appropriate format for each
+  time-series db. For example opentsdb likes them like `TAG1=VALUE1 TAG2=VALUE2`, but prometheus likes them like
+  `tag1="value1", tag2="value2"`. Host tags are mirrored with database replication (streaming of metrics between
+  Netdata servers).
+
+  Starting from Netdata v1.20 the host tags are parsed in accordance with the configured backend type and stored as
+  host labels so that they can be reused in API responses and exporting connectors. The parsing is supported for the
+  graphite, json, opentsdb, and prometheus (default) backend types. You can check how the host tags were parsed using
+  the `/api/v1/info` API call.
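+
+As an example of the pattern matching described above, a central Netdata that should only forward database-related
+hosts and application charts (a hypothetical setup) might combine:
+
+```conf
+[backend]
+    send hosts matching = !*child* *db*
+    send charts matching = !*reads apps.*
+```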
+
+## monitoring operation
+
+Netdata provides 5 charts:
+
+1. **Buffered metrics**, the number of metrics Netdata added to the buffer for dispatching them to the
+ backend server.
+
+2. **Buffered data size**, the amount of data (in KB) Netdata added to the buffer.
+
+3. ~~**Backend latency**, the time the backend server needed to process the data Netdata sent. If there was a
+   re-connection involved, this includes the connection time.~~ (this chart has been removed, because it only measures
+   the time Netdata needs to give the data to the O/S - since the backend servers do not acknowledge the reception,
+   Netdata has no means to measure this properly).
+
+4. **Backend operations**, the number of operations performed by Netdata.
+
+5. **Backend thread CPU usage**, the CPU resources consumed by the Netdata thread that is responsible for sending the
+   metrics to the backend server.
+
+![image](https://cloud.githubusercontent.com/assets/2662304/20463536/eb196084-af3d-11e6-8ee5-ddbd3b4d8449.png)
+
+## alarms
+
+Netdata adds 4 alarms:
+
+1. `backend_last_buffering`, number of seconds since the last successful buffering of backend data
+2. `backend_metrics_sent`, percentage of metrics sent to the backend server
+3. `backend_metrics_lost`, number of metrics lost due to repeating failures to contact the backend server
+4. ~~`backend_slow`, the percentage of the time between iterations that the backend needed to process the data sent by
+   Netdata~~ (this was misleading and has been removed).
+
+![image](https://cloud.githubusercontent.com/assets/2662304/20463779/a46ed1c2-af43-11e6-91a5-07ca4533cac3.png)
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/TIMESCALE.md b/backends/TIMESCALE.md
new file mode 100644
index 0000000..05a3c3b
--- /dev/null
+++ b/backends/TIMESCALE.md
@@ -0,0 +1,57 @@
+<!--
+title: "Writing metrics to TimescaleDB"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/TIMESCALE.md
+-->
+
+# Writing metrics to TimescaleDB
+
+Thanks to Netdata's community of developers and system administrators, and Mahlon Smith
+([GitHub](https://github.com/mahlonsmith)/[Website](http://www.martini.nu/)) in particular, Netdata now supports
+archiving metrics directly to TimescaleDB.
+
+What's TimescaleDB? Here's how their team defines the project on their [GitHub page](https://github.com/timescale/timescaledb):
+
+> TimescaleDB is an open-source database designed to make SQL scalable for time-series data. It is engineered up from
+> PostgreSQL, providing automatic partitioning across time and space (partitioning key), as well as full SQL support.
+
+## Quickstart
+
+To get started archiving metrics to TimescaleDB right away, check out Mahlon's [`netdata-timescale-relay`
+repository](https://github.com/mahlonsmith/netdata-timescale-relay) on GitHub.
+
+This small program takes JSON streams from a Netdata client and writes them to a PostgreSQL (aka TimescaleDB) table.
+You'll run this program in parallel with Netdata, and after a short [configuration
+process](https://github.com/mahlonsmith/netdata-timescale-relay#configuration), your metrics should start populating
+TimescaleDB.
+
+Finally, another member of Netdata's community has built a project that quickly launches Netdata, TimescaleDB, and
+Grafana in easy-to-manage Docker containers. Rune Juhl Jacobsen's
+[project](https://github.com/runejuhl/grafana-timescaledb) uses a `Makefile` to create everything, which makes it
+perfect for testing and experimentation.
+
+## Netdata&#8596;TimescaleDB in action
+
+Aside from creating incredible contributions to Netdata, Mahlon works at [LAIKA](https://www.laika.com/), an
+Oregon-based animation studio that's helped create acclaimed films like _Coraline_ and _Kubo and the Two Strings_.
+
+As part of his work to maintain the company's infrastructure of render farms, workstations, and virtual machines, he's
+using Netdata, `netdata-timescale-relay`, and TimescaleDB to store Netdata metrics alongside other data from other
+sources.
+
+> LAIKA is a long-time PostgreSQL user and added TimescaleDB to their infrastructure in 2018 to help manage and store
+> their IT metrics and time-series data. So far, the tool has been in production at LAIKA for over a year and helps them
+> with their use case of time-based logging, where they record over 8 million metrics an hour for netdata content alone.
+
+By archiving Netdata metrics to a backend like TimescaleDB, LAIKA can consolidate metrics data from distributed machines
+efficiently. Mahlon can then correlate Netdata metrics with other sources directly in TimescaleDB.
+
+And, because LAIKA will soon be storing years' worth of Netdata metrics data in TimescaleDB, they can analyze long-term
+metrics as their films move from concept to final cut.
+
+Read the full blog post from LAIKA at the [TimescaleDB
+blog](https://blog.timescale.com/blog/writing-it-metrics-from-netdata-to-timescaledb/amp/).
+
+Thank you to Mahlon, Rune, TimescaleDB, and the members of the Netdata community that requested and then built this
+backend connection between Netdata and TimescaleDB!
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FTIMESCALE&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/WALKTHROUGH.md b/backends/WALKTHROUGH.md
new file mode 100644
index 0000000..76dd62f
--- /dev/null
+++ b/backends/WALKTHROUGH.md
@@ -0,0 +1,258 @@
+<!--
+title: "Netdata, Prometheus, Grafana stack"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/WALKTHROUGH.md
+-->
+
+# Netdata, Prometheus, Grafana stack
+
+## Intro
+
+In this article I will walk you through the basics of getting Netdata, Prometheus and Grafana all working together and
+monitoring your application servers. This article will be using docker on your local workstation. We will be working
+with docker in an ad-hoc way, launching containers that run ‘/bin/bash’ and attaching a TTY to them. I use docker here
+in a purely academic fashion and do not condone running Netdata in a container. I pick this method so individuals
+without cloud accounts or access to VMs can try this out and for its speed of deployment.
+
+## Why Netdata, Prometheus, and Grafana
+
+Some time ago I was introduced to Netdata by a coworker. We were attempting to troubleshoot python code which seemed to
+be bottlenecked. I was instantly impressed by the number of metrics Netdata exposes to you. I quickly added Netdata to
+my set of go-to tools when troubleshooting systems performance.
+
+Some time later, I was introduced to Prometheus. Prometheus is a monitoring application which flips the normal
+architecture around and polls rest endpoints for its metrics. This architectural change greatly simplifies and decreases
+the time necessary to begin monitoring your applications. Compared to current monitoring solutions the time spent on
+designing the infrastructure is greatly reduced. Running a single Prometheus server per application becomes feasible
+with the help of Grafana.
+
+Grafana has been the go-to graphing tool for… some time now. It’s awesome; anyone that has used it knows it’s awesome.
+We can point Grafana at Prometheus and use Prometheus as a data source. This allows a pretty simple overall monitoring
+architecture: Install Netdata on your application servers, point Prometheus at Netdata, and then point Grafana at
+Prometheus.
+
+I’m omitting an important ingredient in this stack in order to keep this tutorial simple and that is service discovery.
+My personal preference is to use Consul. Prometheus can plug into consul and automatically begin to scrape new hosts
+that register a Netdata client with Consul.
+
+At the end of this tutorial you will understand how each technology fits together to create a modern monitoring stack.
+This stack will offer you visibility into your application and systems performance.
+
+## Getting Started - Netdata
+
+To begin let’s create our container which we will install Netdata on. We need to run a container, forward the necessary
+port that Netdata listens on, and attach a tty so we can interact with the bash shell on the container. But before we do
+this we want name resolution between the two containers to work. In order to accomplish this we will create a
+user-defined network and attach both containers to this network. The first command we should run is:
+
+```sh
+docker network create --driver bridge netdata-tutorial
+```
+
+With this user-defined network created we can now launch our container we will install Netdata on and point it to this
+network.
+
+```sh
+docker run -it --name netdata --hostname netdata --network=netdata-tutorial -p 19999:19999 centos:latest '/bin/bash'
+```
+
+This command creates an interactive tty session (-it), gives the container both a name in relation to the docker daemon
+and a hostname (this is so you know what container is which when working in the shells and docker maps hostname
+resolution to this container), forwards the local port 19999 to the container’s port 19999 (-p 19999:19999), sets the
+command to run (/bin/bash) and then chooses the base container images (centos:latest). After running this you should be
+sitting inside the shell of the container.
+
+After we have entered the shell we can install Netdata. This process could not be easier. If you take a look at [this
+link](../packaging/installer/README.md), the Netdata devs give us several one-liners to install Netdata. I have not had
+any issues with these one-liners and their bootstrapping scripts so far (if you guys run into anything, do share). Run
+the following command in your container.
+
+```sh
+bash <(curl -Ss https://my-netdata.io/kickstart.sh) --dont-wait
+```
+
+After the install completes you should be able to hit the Netdata dashboard at <http://localhost:19999/> (replace
+localhost if you’re doing this on a VM or have the docker container hosted on a machine not on your local system). If
+this is your first time using Netdata I suggest you take a look around. The amount of time I’ve spent digging through
+/proc and calculating my own metrics has been greatly reduced by this tool. Take it all in.
+
+Next I want to draw your attention to a particular endpoint. Navigate to
+<http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes> in your browser. This is the endpoint which
+publishes all the metrics in a format which Prometheus understands. Let’s take a look at one of these metrics.
+`netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000` This
+metric represents several things which I will go into in more detail in the section on Prometheus. For now understand
+that this metric, `netdata_system_cpu_percentage_average`, has several labels: (chart, family, dimension). This
+corresponds to the first CPU chart you see on the Netdata dashboard.
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%204.00.45%20PM.png)
+
+This CHART is called ‘system.cpu’, the FAMILY is ‘cpu’, and the DIMENSION we are observing is “system”. You can begin to
+draw links between the charts in Netdata and the Prometheus metrics format in this manner.
+
+## Prometheus
+
+We will be installing prometheus in a container for the purpose of demonstration. While prometheus does have an official
+container I would like to walk through the install process and setup on a fresh container. This will allow anyone
+reading to migrate this tutorial to a VM or server of any sort.
+
+Let’s start another container in the same fashion as we did the Netdata container.
+
+```sh
+docker run -it --name prometheus --hostname prometheus \
+--network=netdata-tutorial -p 9090:9090 centos:latest '/bin/bash'
+```
+
+This should drop you into a shell once again. Once there quickly install your favorite editor as we will be editing
+files later in this tutorial.
+
+```sh
+yum install vim -y
+```
+
+Prometheus provides a tarball of their latest stable versions [here](https://prometheus.io/download/).
+
+Let’s download the latest version and install into your container.
+
+```sh
+cd /tmp && curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest \
+| grep "browser_download_url.*linux-amd64.tar.gz" \
+| cut -d '"' -f 4 \
+| wget -qi -
+
+mkdir /opt/prometheus
+
+sudo tar -xvf /tmp/prometheus-*linux-amd64.tar.gz -C /opt/prometheus --strip=1
+```
+
+This should get prometheus installed into the container. Let’s test that we can run prometheus and connect to its web
+interface.
+
+```sh
+/opt/prometheus/prometheus
+```
+
+Now attempt to go to <http://localhost:9090/>. You should be presented with the prometheus homepage. This is a good
+point to talk about Prometheus’s data model which can be viewed here: <https://prometheus.io/docs/concepts/data_model/>
+As explained, we have two key elements in Prometheus metrics: the ‘metric’ and its ‘labels’. Labels allow for
+granularity between metrics. Let’s use our previous example to further explain.
+
+```conf
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0831255 1501271696000
+```
+
+Here our metric is ‘netdata_system_cpu_percentage_average’ and our labels are ‘chart’, ‘family’, and ‘dimension’. The
+last two values are the metric’s value and its timestamp (in milliseconds). We can begin graphing
+system metrics with this information, but first we need to hook up Prometheus to poll Netdata stats.
+
+Let’s move our attention to Prometheus’s configuration. Prometheus gets its config from the file located (in our example)
+at `/opt/prometheus/prometheus.yml`. I won’t spend an extensive amount of time going over the configuration values
+documented here: <https://prometheus.io/docs/operating/configuration/>. We will be adding a new “job” under
+“scrape_configs”. Let’s make the “scrape_configs” section look like this (we can use the DNS name `netdata` due to the
+custom user-defined network we created in docker beforehand).
+
+```yaml
+scrape_configs:
+ # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
+ - job_name: 'prometheus'
+
+ # metrics_path defaults to '/metrics'
+ # scheme defaults to 'http'.
+
+ static_configs:
+ - targets: ['localhost:9090']
+
+ - job_name: 'netdata'
+
+ metrics_path: /api/v1/allmetrics
+ params:
+ format: [ prometheus ]
+
+ static_configs:
+ - targets: ['netdata:19999']
+```
+
+Let’s start prometheus once again by running `/opt/prometheus/prometheus`. If we now navigate to prometheus at
+<http://localhost:9090/targets> we should see our target being successfully scraped. If we now go back to
+Prometheus’s homepage and begin to type ‘netdata\_’ Prometheus should autocomplete metrics it is now scraping.
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.13.43%20PM.png)
+
+Let’s now start exploring how we can graph some metrics. Back in our NetData container let’s get the CPU spinning with a
+pointless busy loop. On the shell do the following:
+
+```sh
+[root@netdata /]# while true; do echo "HOT HOT HOT CPU"; done
+```
+
+Our NetData CPU graph should be showing some activity. Let’s represent this in Prometheus. In order to do this let’s
+keep our metrics page open for reference: <http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes> We are
+setting out to graph the data in the CPU chart, so let’s search for “system.cpu” in the metrics page above. We come across
+a section of metrics with the first comments `# COMMENT homogeneous chart "system.cpu", context "system.cpu", family
+"cpu", units "percentage"` followed by the metrics. This is a good start; now let us drill down to the specific metric we
+would like to graph.
+
+```conf
+# COMMENT
+netdata_system_cpu_percentage_average: dimension "system", value is percentage, gauge, dt 1501275951 to 1501275951 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 0.0000000 1501275951000
+```
+
+Here we learn that the metric name we care about is ‘netdata_system_cpu_percentage_average’, so throw this into Prometheus
+and see what we get. We should see something similar to this (I shut off my busy loop):
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.47.53%20PM.png)
+
+This is a good step toward what we want. Also make note that Prometheus will tag on an ‘instance’ label for us which
+corresponds to our statically defined job in the configuration file. This allows us to tailor our queries to specific
+instances. Now we need to isolate the dimension we want in our query. To do this let us refine the query slightly. Let’s
+query the dimension also. Place this into our query text box.
+`netdata_system_cpu_percentage_average{dimension="system"}` We now wind up with the following graph.
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%205.54.40%20PM.png)
+
+Awesome, this is exactly what we wanted. If you haven’t caught on yet we can emulate entire charts from NetData by using
+the `chart` label. If you’d like you can combine the ‘chart’ and ‘instance’ labels to create per-instance charts.
+Let’s give this a try: `netdata_system_cpu_percentage_average{chart="system.cpu", instance="netdata:19999"}`
+
+This is the basics of using Prometheus to query NetData. I’d advise everyone at this point to read [this
+page](../backends/prometheus/#using-netdata-with-prometheus). The key point here is that NetData can export metrics from
+its internal DB or can send metrics “as-collected” by specifying the ‘source=as-collected’ url parameter like so.
+<http://localhost:19999/api/v1/allmetrics?format=prometheus&help=yes&types=yes&source=as-collected> If you choose to use
+this method you will need to use Prometheus's set of functions here: <https://prometheus.io/docs/querying/functions/> to
+obtain useful metrics as you are now dealing with raw counters from the system. For example you will have to use the
+`irate()` function over a counter to get that metric's rate per second. If your graphing needs are met by using the
+metrics returned by NetData's internal database (not specifying any source= url parameter) then use that. If you find
+limitations then consider re-writing your queries using the raw data and using Prometheus functions to get the desired
+chart.
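+
+As a sketch of what such a query could look like (the counter name below is hypothetical - check the as-collected
+metrics page above for the real names on your system):
+
+```conf
+irate(netdata_system_cpu_total{chart="system.cpu",dimension="user"}[5m])
+```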
+
+## Grafana
+
+Finally we make it to grafana. This is the easiest part in my opinion. This time we will actually run the official
+grafana docker container as all configuration we need to do is done via the GUI. Let’s run the following command:
+
+```sh
+docker run -i -p 3000:3000 --network=netdata-tutorial grafana/grafana
+```
+
+This will get grafana running at <http://localhost:3000/>. Let’s go there and log in using the credentials Admin:Admin.
+
+The first thing we want to do is click ‘Add data source’. Let’s make it look like the following screenshot:
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%206.36.55%20PM.png)
+
+With this completed let’s graph! Create a new Dashboard by clicking on the top left Grafana Icon and create a new graph
+in that dashboard. Fill in the query like we did above and save.
+
+![](https://github.com/ldelossa/NetdataTutorial/raw/master/Screen%20Shot%202017-07-28%20at%206.39.38%20PM.png)
+
+## Conclusion
+
+There you have it, a complete systems monitoring stack which is very easy to deploy. From here I would begin to
+understand how Prometheus and a service discovery mechanism such as Consul can play together nicely. My current prod
+deployments automatically register Netdata services into Consul and Prometheus automatically begins to scrape them. Once
+achieved you do not have to think about the monitoring system until Prometheus cannot keep up with your scale. Once this
+happens there are options presented in the Prometheus documentation for solving this. Hope this was helpful, happy
+monitoring.
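+
+To give a flavor of that setup, a minimal Prometheus scrape job using Consul service discovery might look like the
+sketch below (the Consul address and the `netdata` service name are assumptions; adapt them to your environment):
+
+```yaml
+scrape_configs:
+  - job_name: 'netdata-consul'
+    metrics_path: /api/v1/allmetrics
+    params:
+      format: [ prometheus ]
+    consul_sd_configs:
+      - server: 'consul.example.internal:8500'
+        services: ['netdata']
+```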
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2FWALKTHROUGH&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/aws_kinesis/Makefile.am b/backends/aws_kinesis/Makefile.am
new file mode 100644
index 0000000..1fec72c
--- /dev/null
+++ b/backends/aws_kinesis/Makefile.am
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+dist_noinst_DATA = \
+ README.md \
+ $(NULL)
+
+dist_libconfig_DATA = \
+ aws_kinesis.conf \
+ $(NULL)
diff --git a/backends/aws_kinesis/README.md b/backends/aws_kinesis/README.md
new file mode 100644
index 0000000..a2b6825
--- /dev/null
+++ b/backends/aws_kinesis/README.md
@@ -0,0 +1,53 @@
+<!--
+title: "Using Netdata with AWS Kinesis Data Streams"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/aws_kinesis/README.md
+-->
+
+# Using Netdata with AWS Kinesis Data Streams
+
+## Prerequisites
+
+To use AWS Kinesis as a backend, the AWS SDK for C++ should be
+[installed](https://docs.aws.amazon.com/en_us/sdk-for-cpp/v1/developer-guide/setup.html) first. `libcrypto`, `libssl`,
+and `libcurl` are also required to compile Netdata with Kinesis support enabled. Next, Netdata should be re-installed
+from source. The installer will detect that the required libraries are now available.
+
+If the AWS SDK for C++ is being installed from source, it is useful to set `-DBUILD_ONLY="kinesis"`. Otherwise, the
+build process could take a very long time. Note that the default installation path for the libraries is
+`/usr/local/lib64`. Many Linux distributions don't include this path in the default library search path, so it is
+advisable to use the following options to `cmake` while building the AWS SDK:
+
+```sh
+cmake -DCMAKE_INSTALL_LIBDIR=/usr/lib -DCMAKE_INSTALL_INCLUDEDIR=/usr/include -DBUILD_SHARED_LIBS=OFF -DBUILD_ONLY=kinesis <aws-sdk-cpp sources>
+```
+
+## Configuration
+
+To enable sending data to the kinesis backend, set the following options in `netdata.conf`:
+
+```conf
+[backend]
+ enabled = yes
+ type = kinesis
+ destination = us-east-1
+```
+
+Set the `destination` option to an AWS region.
+
+In the Netdata configuration directory run `./edit-config aws_kinesis.conf` and set AWS credentials and stream name:
+
+```conf
+# AWS credentials
+aws_access_key_id = your_access_key_id
+aws_secret_access_key = your_secret_access_key
+
+# destination stream
+stream name = your_stream_name
+```
+
+Alternatively, AWS credentials can be set for the `netdata` user using AWS SDK for C++ [standard methods](https://docs.aws.amazon.com/sdk-for-cpp/v1/developer-guide/credentials.html).
+
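+For example, one of those standard methods is the shared credentials file in the home directory of the user that runs
+Netdata (a sketch; the exact location depends on how the `netdata` user account is set up):
+
+```conf
+# ~/.aws/credentials of the user running Netdata
+[default]
+aws_access_key_id = your_access_key_id
+aws_secret_access_key = your_secret_access_key
+```
+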
+A partition key for every record is computed automatically by Netdata in order to distribute records evenly across
+available shards.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Faws_kinesis%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/aws_kinesis/aws_kinesis.c b/backends/aws_kinesis/aws_kinesis.c
new file mode 100644
index 0000000..b1ea478
--- /dev/null
+++ b/backends/aws_kinesis/aws_kinesis.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "aws_kinesis.h"
+
+#define CONFIG_FILE_LINE_MAX ((CONFIG_MAX_NAME + CONFIG_MAX_VALUE + 1024) * 2)
+
+// ----------------------------------------------------------------------------
+// kinesis backend
+
+// read the aws_kinesis.conf file
+int read_kinesis_conf(const char *path, char **access_key_id_p, char **secret_access_key_p, char **stream_name_p)
+{
+ char *access_key_id = *access_key_id_p;
+ char *secret_access_key = *secret_access_key_p;
+ char *stream_name = *stream_name_p;
+
+ if(unlikely(access_key_id)) freez(access_key_id);
+ if(unlikely(secret_access_key)) freez(secret_access_key);
+ if(unlikely(stream_name)) freez(stream_name);
+ access_key_id = NULL;
+ secret_access_key = NULL;
+ stream_name = NULL;
+
+ int line = 0;
+
+ char filename[FILENAME_MAX + 1];
+ snprintfz(filename, FILENAME_MAX, "%s/aws_kinesis.conf", path);
+
+ char buffer[CONFIG_FILE_LINE_MAX + 1], *s;
+
+ debug(D_BACKEND, "BACKEND: opening config file '%s'", filename);
+
+ FILE *fp = fopen(filename, "r");
+ if(!fp) {
+ return 1;
+ }
+
+ while(fgets(buffer, CONFIG_FILE_LINE_MAX, fp) != NULL) {
+ buffer[CONFIG_FILE_LINE_MAX] = '\0';
+ line++;
+
+ s = trim(buffer);
+ if(!s || *s == '#') {
+ debug(D_BACKEND, "BACKEND: ignoring line %d of file '%s', it is empty.", line, filename);
+ continue;
+ }
+
+ char *name = s;
+ char *value = strchr(s, '=');
+ if(unlikely(!value)) {
+ error("BACKEND: ignoring line %d ('%s') of file '%s', there is no = in it.", line, s, filename);
+ continue;
+ }
+ *value = '\0';
+ value++;
+
+ name = trim(name);
+ value = trim(value);
+
+ if(unlikely(!name || *name == '#')) {
+ error("BACKEND: ignoring line %d of file '%s', name is empty.", line, filename);
+ continue;
+ }
+
+ if(!value)
+ value = "";
+ else
+ value = strip_quotes(value);
+
+ if(name[0] == 'a' && name[4] == 'a' && !strcmp(name, "aws_access_key_id")) {
+ access_key_id = strdupz(value);
+ }
+ else if(name[0] == 'a' && name[4] == 's' && !strcmp(name, "aws_secret_access_key")) {
+ secret_access_key = strdupz(value);
+ }
+ else if(name[0] == 's' && !strcmp(name, "stream name")) {
+ stream_name = strdupz(value);
+ }
+ }
+
+ fclose(fp);
+
+ if(unlikely(!stream_name || !*stream_name)) {
+ error("BACKEND: stream name is a mandatory Kinesis parameter but it is not configured");
+ return 1;
+ }
+
+ *access_key_id_p = access_key_id;
+ *secret_access_key_p = secret_access_key;
+ *stream_name_p = stream_name;
+
+ return 0;
+}
diff --git a/backends/aws_kinesis/aws_kinesis.conf b/backends/aws_kinesis/aws_kinesis.conf
new file mode 100644
index 0000000..cc54b5f
--- /dev/null
+++ b/backends/aws_kinesis/aws_kinesis.conf
@@ -0,0 +1,10 @@
+# AWS Kinesis Data Streams backend configuration
+#
+# All options in this file are mandatory
+
+# AWS credentials
+aws_access_key_id =
+aws_secret_access_key =
+
+# destination stream
+stream name =
\ No newline at end of file
diff --git a/backends/aws_kinesis/aws_kinesis.h b/backends/aws_kinesis/aws_kinesis.h
new file mode 100644
index 0000000..50a4631
--- /dev/null
+++ b/backends/aws_kinesis/aws_kinesis.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_KINESIS_H
+#define NETDATA_BACKEND_KINESIS_H
+
+#include "backends/backends.h"
+#include "aws_kinesis_put_record.h"
+
+#define KINESIS_PARTITION_KEY_MAX 256
+#define KINESIS_RECORD_MAX 1024 * 1024
+
+extern int read_kinesis_conf(const char *path, char **auth_key_id_p, char **secure_key_p, char **stream_name_p);
+
+#endif //NETDATA_BACKEND_KINESIS_H
diff --git a/backends/aws_kinesis/aws_kinesis_put_record.cc b/backends/aws_kinesis/aws_kinesis_put_record.cc
new file mode 100644
index 0000000..a8ba4aa
--- /dev/null
+++ b/backends/aws_kinesis/aws_kinesis_put_record.cc
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include <aws/core/Aws.h>
+#include <aws/core/client/ClientConfiguration.h>
+#include <aws/core/auth/AWSCredentials.h>
+#include <aws/core/utils/Outcome.h>
+#include <aws/kinesis/KinesisClient.h>
+#include <aws/kinesis/model/PutRecordRequest.h>
+#include "aws_kinesis_put_record.h"
+
+using namespace Aws;
+
+static SDKOptions options;
+
+static Kinesis::KinesisClient *client;
+
+struct request_outcome {
+ Kinesis::Model::PutRecordOutcomeCallable future_outcome;
+ size_t data_len;
+};
+
+static Vector<request_outcome> request_outcomes;
+
+void backends_kinesis_init(const char *region, const char *access_key_id, const char *secret_key, const long timeout) {
+ InitAPI(options);
+
+ Client::ClientConfiguration config;
+
+ config.region = region;
+ config.requestTimeoutMs = timeout;
+ config.connectTimeoutMs = timeout;
+
+ if(access_key_id && *access_key_id && secret_key && *secret_key) {
+ client = New<Kinesis::KinesisClient>("client", Auth::AWSCredentials(access_key_id, secret_key), config);
+ } else {
+ client = New<Kinesis::KinesisClient>("client", config);
+ }
+}
+
+void backends_kinesis_shutdown() {
+ Delete(client);
+
+ ShutdownAPI(options);
+}
+
+int backends_kinesis_put_record(const char *stream_name, const char *partition_key,
+ const char *data, size_t data_len) {
+ Kinesis::Model::PutRecordRequest request;
+
+ request.SetStreamName(stream_name);
+ request.SetPartitionKey(partition_key);
+ request.SetData(Utils::ByteBuffer((unsigned char*) data, data_len));
+
+ request_outcomes.push_back({client->PutRecordCallable(request), data_len});
+
+ return 0;
+}
+
+int backends_kinesis_get_result(char *error_message, size_t *sent_bytes, size_t *lost_bytes) {
+ Kinesis::Model::PutRecordOutcome outcome;
+ *sent_bytes = 0;
+ *lost_bytes = 0;
+
+ for(auto request_outcome = request_outcomes.begin(); request_outcome != request_outcomes.end(); ) {
+ std::future_status status = request_outcome->future_outcome.wait_for(std::chrono::microseconds(100));
+
+ if(status == std::future_status::ready || status == std::future_status::deferred) {
+ outcome = request_outcome->future_outcome.get();
+ *sent_bytes += request_outcome->data_len;
+
+ if(!outcome.IsSuccess()) {
+ *lost_bytes += request_outcome->data_len;
+ outcome.GetError().GetMessage().copy(error_message, ERROR_LINE_MAX);
+ }
+
+ request_outcomes.erase(request_outcome);
+ } else {
+ ++request_outcome;
+ }
+ }
+
+ if(*lost_bytes) {
+ return 1;
+ }
+
+ return 0;
+}
\ No newline at end of file
diff --git a/backends/aws_kinesis/aws_kinesis_put_record.h b/backends/aws_kinesis/aws_kinesis_put_record.h
new file mode 100644
index 0000000..fa3d034
--- /dev/null
+++ b/backends/aws_kinesis/aws_kinesis_put_record.h
@@ -0,0 +1,25 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_KINESIS_PUT_RECORD_H
+#define NETDATA_BACKEND_KINESIS_PUT_RECORD_H
+
+#define ERROR_LINE_MAX 1023
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void backends_kinesis_init(const char *region, const char *access_key_id, const char *secret_key, const long timeout);
+
+void backends_kinesis_shutdown();
+
+int backends_kinesis_put_record(const char *stream_name, const char *partition_key,
+ const char *data, size_t data_len);
+
+int backends_kinesis_get_result(char *error_message, size_t *sent_bytes, size_t *lost_bytes);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif //NETDATA_BACKEND_KINESIS_PUT_RECORD_H
diff --git a/backends/backends.c b/backends/backends.c
new file mode 100644
index 0000000..6bf583e
--- /dev/null
+++ b/backends/backends.c
@@ -0,0 +1,1243 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "backends.h"
+
+// ----------------------------------------------------------------------------
+// How backends work in netdata:
+//
+// 1. There is an independent thread that runs at the required interval
+// (for example, once every 10 seconds)
+//
+// 2. Every time it wakes, it calls the backend formatting functions to build
+// a buffer of data. This is a very fast, memory only operation.
+//
+// 3. If the buffer already includes data, the new data are appended.
+// If the buffer becomes too big, because the data cannot be sent, a
+// log is written and the buffer is discarded.
+//
+// 4. Then it tries to send all the data. It blocks until all the data are sent
+// or the socket returns an error.
+// If the time required for this is above the interval, it starts skipping
+// intervals, but the calculated values include the entire database, without
+// gaps (it remembers the timestamps and continues from where it stopped).
+//
+// 5. repeats the above forever.
+//
+
+const char *global_backend_prefix = "netdata";
+int global_backend_update_every = 10;
+BACKEND_OPTIONS global_backend_options = BACKEND_SOURCE_DATA_AVERAGE | BACKEND_OPTION_SEND_NAMES;
+const char *global_backend_source = NULL;
+
+// ----------------------------------------------------------------------------
+// helper functions for backends
+
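+// copy the name in 's' into 'd', replacing any character that is not alphanumeric
+// or '.' with '_', so that chart and dimension names are safe to use in backend
+// metric names; copies at most 'usable' characters and returns the number copied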
+size_t backend_name_copy(char *d, const char *s, size_t usable) {
+ size_t n;
+
+ for(n = 0; *s && n < usable ; d++, s++, n++) {
+ char c = *s;
+
+ if(c != '.' && !isalnum(c)) *d = '_';
+ else *d = c;
+ }
+ *d = '\0';
+
+ return n;
+}
+
+// calculate the SUM or AVERAGE of a dimension, for any timeframe
+// may return NAN if the database does not have any value in the given timeframe
+
+calculated_number backend_calculate_value_from_stored_data(
+ RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+ , time_t *first_timestamp // the first point of the database used in this response
+ , time_t *last_timestamp // the timestamp that should be reported to backend
+) {
+ RRDHOST *host = st->rrdhost;
+ (void)host;
+
+ // find the edges of the rrd database for this chart
+ time_t first_t = rd->state->query_ops.oldest_time(rd);
+ time_t last_t = rd->state->query_ops.latest_time(rd);
+ time_t update_every = st->update_every;
+ struct rrddim_query_handle handle;
+ storage_number n;
+
+ // step back a little, to make sure we have complete data collection
+ // for all metrics
+ after -= update_every * 2;
+ before -= update_every * 2;
+
+ // align the time-frame
+ after = after - (after % update_every);
+ before = before - (before % update_every);
+
+    // for before, lose another iteration
+ // the latest point will be reported the next time
+ before -= update_every;
+
+ if(unlikely(after > before))
+ // this can happen when update_every > before - after
+ after = before;
+
+ if(unlikely(after < first_t))
+ after = first_t;
+
+ if(unlikely(before > last_t))
+ before = last_t;
+
+ if(unlikely(before < first_t || after > last_t)) {
+ // the chart has not been updated in the wanted timeframe
+ debug(D_BACKEND, "BACKEND: %s.%s.%s: aligned timeframe %lu to %lu is outside the chart's database range %lu to %lu",
+ host->hostname, st->id, rd->id,
+ (unsigned long)after, (unsigned long)before,
+ (unsigned long)first_t, (unsigned long)last_t
+ );
+ return NAN;
+ }
+
+ *first_timestamp = after;
+ *last_timestamp = before;
+
+ size_t counter = 0;
+ calculated_number sum = 0;
+
+/*
+ long start_at_slot = rrdset_time2slot(st, before),
+ stop_at_slot = rrdset_time2slot(st, after),
+ slot, stop_now = 0;
+
+ for(slot = start_at_slot; !stop_now ; slot--) {
+
+ if(unlikely(slot < 0)) slot = st->entries - 1;
+ if(unlikely(slot == stop_at_slot)) stop_now = 1;
+
+ storage_number n = rd->values[slot];
+
+ if(unlikely(!does_storage_number_exist(n))) {
+ // not collected
+ continue;
+ }
+
+ calculated_number value = unpack_storage_number(n);
+ sum += value;
+
+ counter++;
+ }
+*/
+ for(rd->state->query_ops.init(rd, &handle, after, before) ; !rd->state->query_ops.is_finished(&handle) ; ) {
+ time_t curr_t;
+ n = rd->state->query_ops.next_metric(&handle, &curr_t);
+
+ if(unlikely(!does_storage_number_exist(n))) {
+ // not collected
+ continue;
+ }
+
+ calculated_number value = unpack_storage_number(n);
+ sum += value;
+
+ counter++;
+ }
+ rd->state->query_ops.finalize(&handle);
+ if(unlikely(!counter)) {
+ debug(D_BACKEND, "BACKEND: %s.%s.%s: no values stored in database for range %lu to %lu",
+ host->hostname, st->id, rd->id,
+ (unsigned long)after, (unsigned long)before
+ );
+ return NAN;
+ }
+
+ if(unlikely(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM))
+ return sum;
+
+ return sum / (calculated_number)counter;
+}
+
+
+// discard a response received by a backend
+// after logging a sample of it to error.log
+
+int discard_response(BUFFER *b, const char *backend) {
+ char sample[1024];
+ const char *s = buffer_tostring(b);
+ char *d = sample, *e = &sample[sizeof(sample) - 1];
+
+ for(; *s && d < e ;s++) {
+ char c = *s;
+ if(unlikely(!isprint(c))) c = ' ';
+ *d++ = c;
+ }
+ *d = '\0';
+
+ info("BACKEND: received %zu bytes from %s backend. Ignoring them. Sample: '%s'", buffer_strlen(b), backend, sample);
+ buffer_flush(b);
+ return 0;
+}
+
+
+// ----------------------------------------------------------------------------
+// the backend thread
+
+static SIMPLE_PATTERN *charts_pattern = NULL;
+static SIMPLE_PATTERN *hosts_pattern = NULL;
+
+inline int backends_can_send_rrdset(BACKEND_OPTIONS backend_options, RRDSET *st) {
+ RRDHOST *host = st->rrdhost;
+ (void)host;
+
+ if(unlikely(rrdset_flag_check(st, RRDSET_FLAG_BACKEND_IGNORE)))
+ return 0;
+
+ if(unlikely(!rrdset_flag_check(st, RRDSET_FLAG_BACKEND_SEND))) {
+ // we have not checked this chart
+ if(simple_pattern_matches(charts_pattern, st->id) || simple_pattern_matches(charts_pattern, st->name))
+ rrdset_flag_set(st, RRDSET_FLAG_BACKEND_SEND);
+ else {
+ rrdset_flag_set(st, RRDSET_FLAG_BACKEND_IGNORE);
+ debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s', because it is disabled for backends.", st->id, host->hostname);
+ return 0;
+ }
+ }
+
+ if(unlikely(!rrdset_is_available_for_backends(st))) {
+ debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s', because it is not available for backends.", st->id, host->hostname);
+ return 0;
+ }
+
+ if(unlikely(st->rrd_memory_mode == RRD_MEMORY_MODE_NONE && !(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED))) {
+ debug(D_BACKEND, "BACKEND: not sending chart '%s' of host '%s' because its memory mode is '%s' and the backend requires database access.", st->id, host->hostname, rrd_memory_mode_name(host->rrd_memory_mode));
+ return 0;
+ }
+
+ return 1;
+}
+
+inline BACKEND_OPTIONS backend_parse_data_source(const char *source, BACKEND_OPTIONS backend_options) {
+ if(!strcmp(source, "raw") || !strcmp(source, "as collected") || !strcmp(source, "as-collected") || !strcmp(source, "as_collected") || !strcmp(source, "ascollected")) {
+ backend_options |= BACKEND_SOURCE_DATA_AS_COLLECTED;
+ backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_AS_COLLECTED);
+ }
+ else if(!strcmp(source, "average")) {
+ backend_options |= BACKEND_SOURCE_DATA_AVERAGE;
+ backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_AVERAGE);
+ }
+ else if(!strcmp(source, "sum") || !strcmp(source, "volume")) {
+ backend_options |= BACKEND_SOURCE_DATA_SUM;
+ backend_options &= ~(BACKEND_OPTIONS_SOURCE_BITS ^ BACKEND_SOURCE_DATA_SUM);
+ }
+ else {
+ error("BACKEND: invalid data source method '%s'.", source);
+ }
+
+ return backend_options;
+}
+
+static void backends_main_cleanup(void *ptr) {
+ struct netdata_static_thread *static_thread = (struct netdata_static_thread *)ptr;
+ static_thread->enabled = NETDATA_MAIN_THREAD_EXITING;
+
+ info("cleaning up...");
+
+ static_thread->enabled = NETDATA_MAIN_THREAD_EXITED;
+}
+
+/**
+ * Set Kinesis variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_kinesis_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ (void)default_port;
+#ifndef HAVE_KINESIS
+ (void)brc;
+ (void)brf;
+#endif
+
+#if HAVE_KINESIS
+ *brc = process_json_response;
+ if (BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_json_plaintext;
+ else
+ *brf = backends_format_dimension_stored_json_plaintext;
+#endif
+}
+
+/**
+ * Set Prometheus variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_prometheus_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ (void)default_port;
+ (void)brf;
+#ifndef ENABLE_PROMETHEUS_REMOTE_WRITE
+ (void)brc;
+#endif
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ *brc = backends_process_prometheus_remote_write_response;
+#endif /* ENABLE_PROMETHEUS_REMOTE_WRITE */
+}
+
+/**
+ * Set MongoDB variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_mongodb_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ (void)default_port;
+#ifndef HAVE_MONGOC
+ (void)brc;
+ (void)brf;
+#endif
+
+#if HAVE_MONGOC
+ *brc = process_json_response;
+ if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_json_plaintext;
+ else
+ *brf = backends_format_dimension_stored_json_plaintext;
+#endif
+}
+
+/**
+ * Set JSON variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_json_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ *default_port = 5448;
+ *brc = process_json_response;
+
+ if (BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_json_plaintext;
+ else
+ *brf = backends_format_dimension_stored_json_plaintext;
+}
+
+/**
+ * Set OpenTSDB HTTP variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_opentsdb_http_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ *default_port = 4242;
+ *brc = process_opentsdb_response;
+
+ if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_opentsdb_http;
+ else
+ *brf = backends_format_dimension_stored_opentsdb_http;
+
+}
+
+/**
+ * Set OpenTSDB Telnet variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_opentsdb_telnet_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ *default_port = 4242;
+ *brc = process_opentsdb_response;
+
+ if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_opentsdb_telnet;
+ else
+ *brf = backends_format_dimension_stored_opentsdb_telnet;
+}
+
+/**
+ * Set Graphite variables
+ *
+ * Set the variables necessary to work with this specific backend.
+ *
+ * @param default_port the default port of the backend
+ * @param brc function called to check the result.
+ * @param brf function called to format the message to the backend
+ */
+void backend_set_graphite_variables(int *default_port,
+ backend_response_checker_t brc,
+ backend_request_formatter_t brf)
+{
+ *default_port = 2003;
+ *brc = process_graphite_response;
+
+ if(BACKEND_OPTIONS_DATA_SOURCE(global_backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ *brf = backends_format_dimension_collected_graphite_plaintext;
+ else
+ *brf = backends_format_dimension_stored_graphite_plaintext;
+}
+
+/**
+ * Select Type
+ *
+ * Select the backend type based on the user input
+ *
+ * @param type is the string that defines the backend type
+ *
+ * @return It returns the backend id.
+ */
+BACKEND_TYPE backend_select_type(const char *type) {
+ if(!strcmp(type, "graphite") || !strcmp(type, "graphite:plaintext")) {
+ return BACKEND_TYPE_GRAPHITE;
+ }
+ else if(!strcmp(type, "opentsdb") || !strcmp(type, "opentsdb:telnet")) {
+ return BACKEND_TYPE_OPENTSDB_USING_TELNET;
+ }
+ else if(!strcmp(type, "opentsdb:http") || !strcmp(type, "opentsdb:https")) {
+ return BACKEND_TYPE_OPENTSDB_USING_HTTP;
+ }
+ else if (!strcmp(type, "json") || !strcmp(type, "json:plaintext")) {
+ return BACKEND_TYPE_JSON;
+ }
+ else if (!strcmp(type, "prometheus_remote_write")) {
+ return BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE;
+ }
+ else if (!strcmp(type, "kinesis") || !strcmp(type, "kinesis:plaintext")) {
+ return BACKEND_TYPE_KINESIS;
+ }
+ else if (!strcmp(type, "mongodb") || !strcmp(type, "mongodb:plaintext")) {
+ return BACKEND_TYPE_MONGODB;
+ }
+
+ return BACKEND_TYPE_UNKNOWN;
+}
+
+/**
+ * Backend main
+ *
+ * The main thread used to control the backends.
+ *
+ * @param ptr a pointer to netdata_static_structure.
+ *
+ * @return It always returns NULL.
+ */
+void *backends_main(void *ptr) {
+ netdata_thread_cleanup_push(backends_main_cleanup, ptr);
+
+ int default_port = 0;
+ int sock = -1;
+ BUFFER *b = buffer_create(1), *response = buffer_create(1);
+ int (*backend_request_formatter)(BUFFER *, const char *, RRDHOST *, const char *, RRDSET *, RRDDIM *, time_t, time_t, BACKEND_OPTIONS) = NULL;
+ int (*backend_response_checker)(BUFFER *) = NULL;
+
+#if HAVE_KINESIS
+ int do_kinesis = 0;
+ char *kinesis_auth_key_id = NULL, *kinesis_secure_key = NULL, *kinesis_stream_name = NULL;
+#endif
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ int do_prometheus_remote_write = 0;
+ BUFFER *http_request_header = NULL;
+#endif
+
+#if HAVE_MONGOC
+ int do_mongodb = 0;
+ char *mongodb_uri = NULL;
+ char *mongodb_database = NULL;
+ char *mongodb_collection = NULL;
+
+    // set the default socket timeout in ms: 500 ms shorter than the backend update interval, but never less than 1000 ms
+    int32_t mongodb_default_socket_timeout = (int32_t)((global_backend_update_every >= 2)?(global_backend_update_every * MSEC_PER_SEC - 500):1000);
+
+#endif
+
+#ifdef ENABLE_HTTPS
+ struct netdata_ssl opentsdb_ssl = {NULL , NETDATA_SSL_START};
+#endif
+
+ // ------------------------------------------------------------------------
+ // collect configuration options
+
+ struct timeval timeout = {
+ .tv_sec = 0,
+ .tv_usec = 0
+ };
+ int enabled = config_get_boolean(CONFIG_SECTION_BACKEND, "enabled", 0);
+ const char *source = config_get(CONFIG_SECTION_BACKEND, "data source", "average");
+ const char *type = config_get(CONFIG_SECTION_BACKEND, "type", "graphite");
+ const char *destination = config_get(CONFIG_SECTION_BACKEND, "destination", "localhost");
+ global_backend_prefix = config_get(CONFIG_SECTION_BACKEND, "prefix", "netdata");
+ const char *hostname = config_get(CONFIG_SECTION_BACKEND, "hostname", localhost->hostname);
+ global_backend_update_every = (int)config_get_number(CONFIG_SECTION_BACKEND, "update every", global_backend_update_every);
+ int buffer_on_failures = (int)config_get_number(CONFIG_SECTION_BACKEND, "buffer on failures", 10);
+ long timeoutms = config_get_number(CONFIG_SECTION_BACKEND, "timeout ms", global_backend_update_every * 2 * 1000);
+
+ if(config_get_boolean(CONFIG_SECTION_BACKEND, "send names instead of ids", (global_backend_options & BACKEND_OPTION_SEND_NAMES)))
+ global_backend_options |= BACKEND_OPTION_SEND_NAMES;
+ else
+ global_backend_options &= ~BACKEND_OPTION_SEND_NAMES;
+
+ charts_pattern = simple_pattern_create(config_get(CONFIG_SECTION_BACKEND, "send charts matching", "*"), NULL, SIMPLE_PATTERN_EXACT);
+ hosts_pattern = simple_pattern_create(config_get(CONFIG_SECTION_BACKEND, "send hosts matching", "localhost *"), NULL, SIMPLE_PATTERN_EXACT);
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ const char *remote_write_path = config_get(CONFIG_SECTION_BACKEND, "remote write URL path", "/receive");
+#endif
+
+ // ------------------------------------------------------------------------
+ // validate configuration options
+ // and prepare for sending data to our backend
+
+ global_backend_options = backend_parse_data_source(source, global_backend_options);
+ global_backend_source = source;
+
+ if(timeoutms < 1) {
+ error("BACKEND: invalid timeout %ld ms given. Assuming %d ms.", timeoutms, global_backend_update_every * 2 * 1000);
+ timeoutms = global_backend_update_every * 2 * 1000;
+ }
+ timeout.tv_sec = (timeoutms * 1000) / 1000000;
+ timeout.tv_usec = (timeoutms * 1000) % 1000000;
+
+ if(!enabled || global_backend_update_every < 1)
+ goto cleanup;
+
+ // ------------------------------------------------------------------------
+ // select the backend type
+ BACKEND_TYPE work_type = backend_select_type(type);
+ if (work_type == BACKEND_TYPE_UNKNOWN) {
+ error("BACKEND: Unknown backend type '%s'", type);
+ goto cleanup;
+ }
+
+ switch (work_type) {
+ case BACKEND_TYPE_OPENTSDB_USING_HTTP: {
+#ifdef ENABLE_HTTPS
+ if (!strcmp(type, "opentsdb:https")) {
+ security_start_ssl(NETDATA_SSL_CONTEXT_EXPORTING);
+ }
+#endif
+ backend_set_opentsdb_http_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE: {
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ do_prometheus_remote_write = 1;
+
+ http_request_header = buffer_create(1);
+ backends_init_write_request();
+#else
+ error("BACKEND: Prometheus remote write support isn't compiled");
+#endif // ENABLE_PROMETHEUS_REMOTE_WRITE
+ backend_set_prometheus_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_KINESIS: {
+#if HAVE_KINESIS
+ do_kinesis = 1;
+
+ if(unlikely(read_kinesis_conf(netdata_configured_user_config_dir, &kinesis_auth_key_id, &kinesis_secure_key, &kinesis_stream_name))) {
+ error("BACKEND: kinesis backend type is set but cannot read its configuration from %s/aws_kinesis.conf", netdata_configured_user_config_dir);
+ goto cleanup;
+ }
+
+ backends_kinesis_init(destination, kinesis_auth_key_id, kinesis_secure_key, timeout.tv_sec * 1000 + timeout.tv_usec / 1000);
+#else
+ error("BACKEND: AWS Kinesis support isn't compiled");
+#endif // HAVE_KINESIS
+ backend_set_kinesis_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_MONGODB: {
+#if HAVE_MONGOC
+ if(unlikely(read_mongodb_conf(netdata_configured_user_config_dir,
+ &mongodb_uri,
+ &mongodb_database,
+ &mongodb_collection))) {
+ error("BACKEND: mongodb backend type is set but cannot read its configuration from %s/mongodb.conf",
+ netdata_configured_user_config_dir);
+ goto cleanup;
+ }
+
+ if(likely(!backends_mongodb_init(mongodb_uri, mongodb_database, mongodb_collection, mongodb_default_socket_timeout))) {
+ backend_set_mongodb_variables(&default_port, &backend_response_checker, &backend_request_formatter);
+ do_mongodb = 1;
+ }
+ else {
+ error("BACKEND: cannot initialize MongoDB backend");
+ goto cleanup;
+ }
+#else
+ error("BACKEND: MongoDB support isn't compiled");
+#endif // HAVE_MONGOC
+ break;
+ }
+ case BACKEND_TYPE_GRAPHITE: {
+ backend_set_graphite_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_OPENTSDB_USING_TELNET: {
+ backend_set_opentsdb_telnet_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_JSON: {
+ backend_set_json_variables(&default_port,&backend_response_checker,&backend_request_formatter);
+ break;
+ }
+ case BACKEND_TYPE_UNKNOWN: {
+ break;
+ }
+ default: {
+ break;
+ }
+ }
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ if((backend_request_formatter == NULL && !do_prometheus_remote_write) || backend_response_checker == NULL) {
+#else
+ if(backend_request_formatter == NULL || backend_response_checker == NULL) {
+#endif
+ error("BACKEND: backend is misconfigured - disabling it.");
+ goto cleanup;
+ }
+
+
+// ------------------------------------------------------------------------
+// prepare the charts for monitoring the backend operation
+
+ struct rusage thread;
+
+ collected_number
+ chart_buffered_metrics = 0,
+ chart_lost_metrics = 0,
+ chart_sent_metrics = 0,
+ chart_buffered_bytes = 0,
+ chart_received_bytes = 0,
+ chart_sent_bytes = 0,
+ chart_receptions = 0,
+ chart_transmission_successes = 0,
+ chart_transmission_failures = 0,
+ chart_data_lost_events = 0,
+ chart_lost_bytes = 0,
+ chart_backend_reconnects = 0;
+ // chart_backend_latency = 0;
+
+ RRDSET *chart_metrics = rrdset_create_localhost("netdata", "backend_metrics", NULL, "backend", NULL, "Netdata Buffered Metrics", "metrics", "backends", NULL, 130600, global_backend_update_every, RRDSET_TYPE_LINE);
+ rrddim_add(chart_metrics, "buffered", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_metrics, "lost", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_metrics, "sent", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+
+ RRDSET *chart_bytes = rrdset_create_localhost("netdata", "backend_bytes", NULL, "backend", NULL, "Netdata Backend Data Size", "KiB", "backends", NULL, 130610, global_backend_update_every, RRDSET_TYPE_AREA);
+ rrddim_add(chart_bytes, "buffered", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_bytes, "lost", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_bytes, "sent", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_bytes, "received", NULL, 1, 1024, RRD_ALGORITHM_ABSOLUTE);
+
+ RRDSET *chart_ops = rrdset_create_localhost("netdata", "backend_ops", NULL, "backend", NULL, "Netdata Backend Operations", "operations", "backends", NULL, 130630, global_backend_update_every, RRDSET_TYPE_LINE);
+ rrddim_add(chart_ops, "write", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_ops, "discard", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_ops, "reconnect", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_ops, "failure", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+ rrddim_add(chart_ops, "read", NULL, 1, 1, RRD_ALGORITHM_ABSOLUTE);
+
+ /*
+ * this is misleading - we can only measure the time we need to send data
+ * this time is not related to the time required for the data to travel to
+ * the backend database and the time that server needed to process them
+ *
+ * issue #1432 and https://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-2.html
+ *
+ RRDSET *chart_latency = rrdset_create_localhost("netdata", "backend_latency", NULL, "backend", NULL, "Netdata Backend Latency", "ms", "backends", NULL, 130620, global_backend_update_every, RRDSET_TYPE_AREA);
+ rrddim_add(chart_latency, "latency", NULL, 1, 1000, RRD_ALGORITHM_ABSOLUTE);
+ */
+
+ RRDSET *chart_rusage = rrdset_create_localhost("netdata", "backend_thread_cpu", NULL, "backend", NULL, "NetData Backend Thread CPU usage", "milliseconds/s", "backends", NULL, 130630, global_backend_update_every, RRDSET_TYPE_STACKED);
+ rrddim_add(chart_rusage, "user", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL);
+ rrddim_add(chart_rusage, "system", NULL, 1, 1000, RRD_ALGORITHM_INCREMENTAL);
+
+
+ // ------------------------------------------------------------------------
+ // prepare the backend main loop
+
+ info("BACKEND: configured ('%s' on '%s' sending '%s' data, every %d seconds, as host '%s', with prefix '%s')", type, destination, source, global_backend_update_every, hostname, global_backend_prefix);
+ send_statistics("BACKEND_START", "OK", type);
+
+ usec_t step_ut = global_backend_update_every * USEC_PER_SEC;
+ time_t after = now_realtime_sec();
+ int failures = 0;
+ heartbeat_t hb;
+ heartbeat_init(&hb);
+
+ while(!netdata_exit) {
+
+ // ------------------------------------------------------------------------
+ // Wait for the next iteration point.
+
+ heartbeat_next(&hb, step_ut);
+ time_t before = now_realtime_sec();
+ debug(D_BACKEND, "BACKEND: preparing buffer for timeframe %lu to %lu", (unsigned long)after, (unsigned long)before);
+
+ // ------------------------------------------------------------------------
+ // add to the buffer the data we need to send to the backend
+
+ netdata_thread_disable_cancelability();
+
+ size_t count_hosts = 0;
+ size_t count_charts_total = 0;
+ size_t count_dims_total = 0;
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ if(do_prometheus_remote_write)
+ backends_clear_write_request();
+#endif
+ rrd_rdlock();
+ RRDHOST *host;
+ rrdhost_foreach_read(host) {
+ if(unlikely(!rrdhost_flag_check(host, RRDHOST_FLAG_BACKEND_SEND|RRDHOST_FLAG_BACKEND_DONT_SEND))) {
+ char *name = (host == localhost)?"localhost":host->hostname;
+ if (!hosts_pattern || simple_pattern_matches(hosts_pattern, name)) {
+ rrdhost_flag_set(host, RRDHOST_FLAG_BACKEND_SEND);
+ info("enabled backend for host '%s'", name);
+ }
+ else {
+ rrdhost_flag_set(host, RRDHOST_FLAG_BACKEND_DONT_SEND);
+ info("disabled backend for host '%s'", name);
+ }
+ }
+
+ if(unlikely(!rrdhost_flag_check(host, RRDHOST_FLAG_BACKEND_SEND)))
+ continue;
+
+ rrdhost_rdlock(host);
+
+ count_hosts++;
+ size_t count_charts = 0;
+ size_t count_dims = 0;
+ size_t count_dims_skipped = 0;
+
+ const char *__hostname = (host == localhost)?hostname:host->hostname;
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ if(do_prometheus_remote_write) {
+ backends_rrd_stats_remote_write_allmetrics_prometheus(
+ host
+ , __hostname
+ , global_backend_prefix
+ , global_backend_options
+ , after
+ , before
+ , &count_charts
+ , &count_dims
+ , &count_dims_skipped
+ );
+ chart_buffered_metrics += count_dims;
+ }
+ else
+#endif
+ {
+ RRDSET *st;
+ rrdset_foreach_read(st, host) {
+ if(likely(backends_can_send_rrdset(global_backend_options, st))) {
+ rrdset_rdlock(st);
+
+ count_charts++;
+
+ RRDDIM *rd;
+ rrddim_foreach_read(rd, st) {
+ if (likely(rd->last_collected_time.tv_sec >= after)) {
+ chart_buffered_metrics += backend_request_formatter(b, global_backend_prefix, host, __hostname, st, rd, after, before, global_backend_options);
+ count_dims++;
+ }
+ else {
+ debug(D_BACKEND, "BACKEND: not sending dimension '%s' of chart '%s' from host '%s', its last data collection (%lu) is not within our timeframe (%lu to %lu)", rd->id, st->id, __hostname, (unsigned long)rd->last_collected_time.tv_sec, (unsigned long)after, (unsigned long)before);
+ count_dims_skipped++;
+ }
+ }
+
+ rrdset_unlock(st);
+ }
+ }
+ }
+
+ debug(D_BACKEND, "BACKEND: sending host '%s', metrics of %zu dimensions, of %zu charts. Skipped %zu dimensions.", __hostname, count_dims, count_charts, count_dims_skipped);
+ count_charts_total += count_charts;
+ count_dims_total += count_dims;
+
+ rrdhost_unlock(host);
+ }
+ rrd_unlock();
+
+ netdata_thread_enable_cancelability();
+
+ debug(D_BACKEND, "BACKEND: buffer has %zu bytes, added metrics for %zu dimensions, of %zu charts, from %zu hosts", buffer_strlen(b), count_dims_total, count_charts_total, count_hosts);
+
+ // ------------------------------------------------------------------------
+
+ chart_buffered_bytes = (collected_number)buffer_strlen(b);
+
+ // reset the monitoring chart counters
+ chart_received_bytes =
+ chart_sent_bytes =
+ chart_sent_metrics =
+ chart_lost_metrics =
+ chart_receptions =
+ chart_transmission_successes =
+ chart_transmission_failures =
+ chart_data_lost_events =
+ chart_lost_bytes =
+ chart_backend_reconnects = 0;
+ // chart_backend_latency = 0;
+
+ if(unlikely(netdata_exit)) break;
+
+ //fprintf(stderr, "\nBACKEND BEGIN:\n%s\nBACKEND END\n", buffer_tostring(b));
+ //fprintf(stderr, "after = %lu, before = %lu\n", after, before);
+
+ // prepare for the next iteration
+ // to add incrementally data to buffer
+ after = before;
+
+#if HAVE_KINESIS
+ if(do_kinesis) {
+ unsigned long long partition_key_seq = 0;
+
+ size_t buffer_len = buffer_strlen(b);
+ size_t sent = 0;
+
+ while(sent < buffer_len) {
+ char partition_key[KINESIS_PARTITION_KEY_MAX + 1];
+ snprintf(partition_key, KINESIS_PARTITION_KEY_MAX, "netdata_%llu", partition_key_seq++);
+ size_t partition_key_len = strnlen(partition_key, KINESIS_PARTITION_KEY_MAX);
+
+ const char *first_char = buffer_tostring(b) + sent;
+
+ size_t record_len = 0;
+
+ // split buffer into chunks of maximum allowed size
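+                // when the remaining data exceeds the record limit, cut the chunk at the
+                // last newline before the limit, so a metric line is never split across records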
+ if(buffer_len - sent < KINESIS_RECORD_MAX - partition_key_len) {
+ record_len = buffer_len - sent;
+ }
+ else {
+ record_len = KINESIS_RECORD_MAX - partition_key_len;
+ while(*(first_char + record_len) != '\n' && record_len) record_len--;
+ }
+
+ char error_message[ERROR_LINE_MAX + 1] = "";
+
+ debug(D_BACKEND, "BACKEND: backends_kinesis_put_record(): dest = %s, id = %s, key = %s, stream = %s, partition_key = %s, \
+ buffer = %zu, record = %zu", destination, kinesis_auth_key_id, kinesis_secure_key, kinesis_stream_name,
+ partition_key, buffer_len, record_len);
+
+ backends_kinesis_put_record(kinesis_stream_name, partition_key, first_char, record_len);
+
+ sent += record_len;
+ chart_transmission_successes++;
+
+ size_t sent_bytes = 0, lost_bytes = 0;
+
+ if(unlikely(backends_kinesis_get_result(error_message, &sent_bytes, &lost_bytes))) {
+ // oops! we couldn't send (all or some of the) data
+ error("BACKEND: %s", error_message);
+ error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zu bytes.",
+ destination, sent_bytes, sent_bytes - lost_bytes);
+
+ chart_transmission_failures++;
+ chart_data_lost_events++;
+ chart_lost_bytes += lost_bytes;
+
+ // estimate the number of lost metrics
+ chart_lost_metrics += (collected_number)(chart_buffered_metrics
+ * (buffer_len && (lost_bytes > buffer_len) ? (double)lost_bytes / buffer_len : 1));
+
+ break;
+ }
+ else {
+ chart_receptions++;
+ }
+
+ if(unlikely(netdata_exit)) break;
+ }
+
+ chart_sent_bytes += sent;
+ if(likely(sent == buffer_len))
+ chart_sent_metrics = chart_buffered_metrics;
+
+ buffer_flush(b);
+ } else
+#endif /* HAVE_KINESIS */
+
+#if HAVE_MONGOC
+ if(do_mongodb) {
+ size_t buffer_len = buffer_strlen(b);
+ size_t sent = 0;
+
+ while(sent < buffer_len) {
+ const char *first_char = buffer_tostring(b);
+
+ debug(D_BACKEND, "BACKEND: backends_mongodb_insert(): uri = %s, database = %s, collection = %s, \
+ buffer = %zu", mongodb_uri, mongodb_database, mongodb_collection, buffer_len);
+
+ if(likely(!backends_mongodb_insert((char *)first_char, (size_t)chart_buffered_metrics))) {
+ sent += buffer_len;
+ chart_transmission_successes++;
+ chart_receptions++;
+ }
+ else {
+ // oops! we couldn't send (all or some of the) data
+ error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zu bytes.",
+ mongodb_uri, buffer_len, 0UL);
+
+ chart_transmission_failures++;
+ chart_data_lost_events++;
+ chart_lost_bytes += buffer_len;
+
+ // estimate the number of lost metrics
+ chart_lost_metrics += (collected_number)chart_buffered_metrics;
+
+ break;
+ }
+
+ if(unlikely(netdata_exit)) break;
+ }
+
+ chart_sent_bytes += sent;
+ if(likely(sent == buffer_len))
+ chart_sent_metrics = chart_buffered_metrics;
+
+ buffer_flush(b);
+ } else
+#endif /* HAVE_MONGOC */
+
+ {
+
+ // ------------------------------------------------------------------------
+ // if we are connected, receive a response, without blocking
+
+ if(likely(sock != -1)) {
+ errno = 0;
+
+ // loop through to collect all data
+ while(sock != -1 && errno != EWOULDBLOCK) {
+ buffer_need_bytes(response, 4096);
+
+ ssize_t r;
+#ifdef ENABLE_HTTPS
+ if(opentsdb_ssl.conn && !opentsdb_ssl.flags) {
+ r = SSL_read(opentsdb_ssl.conn, &response->buffer[response->len], response->size - response->len);
+ } else {
+ r = recv(sock, &response->buffer[response->len], response->size - response->len, MSG_DONTWAIT);
+ }
+#else
+ r = recv(sock, &response->buffer[response->len], response->size - response->len, MSG_DONTWAIT);
+#endif
+ if(likely(r > 0)) {
+ // we received some data
+ response->len += r;
+ chart_received_bytes += r;
+ chart_receptions++;
+ }
+ else if(r == 0) {
+ error("BACKEND: '%s' closed the socket", destination);
+ close(sock);
+ sock = -1;
+ }
+ else {
+ // failed to receive data
+ if(errno != EAGAIN && errno != EWOULDBLOCK) {
+ error("BACKEND: cannot receive data from backend '%s'.", destination);
+ }
+ }
+ }
+
+ // if we received data, process them
+ if(buffer_strlen(response))
+ backend_response_checker(response);
+ }
+
+ // ------------------------------------------------------------------------
+ // if we are not connected, connect to a backend server
+
+ if(unlikely(sock == -1)) {
+ // usec_t start_ut = now_monotonic_usec();
+ size_t reconnects = 0;
+
+ sock = connect_to_one_of(destination, default_port, &timeout, &reconnects, NULL, 0);
+#ifdef ENABLE_HTTPS
+ if(sock != -1) {
+ if(netdata_exporting_ctx) {
+ if(!opentsdb_ssl.conn) {
+ opentsdb_ssl.conn = SSL_new(netdata_exporting_ctx);
+ if(!opentsdb_ssl.conn) {
+ error("Failed to allocate SSL structure %d.", sock);
+ opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+ }
+ } else {
+ SSL_clear(opentsdb_ssl.conn);
+ }
+ }
+
+ if(opentsdb_ssl.conn) {
+ if(SSL_set_fd(opentsdb_ssl.conn, sock) != 1) {
+                            error("Failed to set the socket to the SSL on socket fd %d.", sock);
+ opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+ } else {
+ opentsdb_ssl.flags = NETDATA_SSL_HANDSHAKE_COMPLETE;
+ SSL_set_connect_state(opentsdb_ssl.conn);
+                            int err = SSL_connect(opentsdb_ssl.conn);
+                            if (err != 1) {
+                                err = SSL_get_error(opentsdb_ssl.conn, err);
+                                error("SSL cannot connect with the server: %s ", ERR_error_string((long)err, NULL));
+                                opentsdb_ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+                            } //TODO: check certificate here
+ }
+ }
+ }
+#endif
+ chart_backend_reconnects += reconnects;
+ // chart_backend_latency += now_monotonic_usec() - start_ut;
+ }
+
+ if(unlikely(netdata_exit)) break;
+
+ // ------------------------------------------------------------------------
+ // if we are connected, send our buffer to the backend server
+
+ if(likely(sock != -1)) {
+ size_t len = buffer_strlen(b);
+ // usec_t start_ut = now_monotonic_usec();
+ int flags = 0;
+ #ifdef MSG_NOSIGNAL
+ flags += MSG_NOSIGNAL;
+ #endif
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ if(do_prometheus_remote_write) {
+ size_t data_size = backends_get_write_request_size();
+
+ if(unlikely(!data_size)) {
+ error("BACKEND: write request size is out of range");
+ continue;
+ }
+
+ buffer_flush(b);
+ buffer_need_bytes(b, data_size);
+ if(unlikely(backends_pack_write_request(b->buffer, &data_size))) {
+ error("BACKEND: cannot pack write request");
+ continue;
+ }
+ b->len = data_size;
+ chart_buffered_bytes = (collected_number)buffer_strlen(b);
+
+ buffer_flush(http_request_header);
+ buffer_sprintf(http_request_header,
+ "POST %s HTTP/1.1\r\n"
+ "Host: %s\r\n"
+ "Accept: */*\r\n"
+ "Content-Length: %zu\r\n"
+ "Content-Type: application/x-www-form-urlencoded\r\n\r\n",
+ remote_write_path,
+ destination,
+ data_size
+ );
+
+ len = buffer_strlen(http_request_header);
+ send(sock, buffer_tostring(http_request_header), len, flags);
+
+ len = data_size;
+ }
+#endif
+
+ ssize_t written;
+#ifdef ENABLE_HTTPS
+ if(opentsdb_ssl.conn && !opentsdb_ssl.flags) {
+ written = SSL_write(opentsdb_ssl.conn, buffer_tostring(b), len);
+ } else {
+ written = send(sock, buffer_tostring(b), len, flags);
+ }
+#else
+ written = send(sock, buffer_tostring(b), len, flags);
+#endif
+
+ // chart_backend_latency += now_monotonic_usec() - start_ut;
+ if(written != -1 && (size_t)written == len) {
+ // we sent the data successfully
+ chart_transmission_successes++;
+ chart_sent_bytes += written;
+ chart_sent_metrics = chart_buffered_metrics;
+
+ // reset the failures count
+ failures = 0;
+
+ // empty the buffer
+ buffer_flush(b);
+ }
+ else {
+ // oops! we couldn't send (all or some of the) data
+ error("BACKEND: failed to write data to database backend '%s'. Willing to write %zu bytes, wrote %zd bytes. Will re-connect.", destination, len, written);
+ chart_transmission_failures++;
+
+ if(written != -1)
+ chart_sent_bytes += written;
+
+ // increment the counter we check for data loss
+ failures++;
+
+ // close the socket - we will re-open it next time
+ close(sock);
+ sock = -1;
+ }
+ }
+ else {
+ error("BACKEND: failed to update database backend '%s'", destination);
+ chart_transmission_failures++;
+
+ // increment the counter we check for data loss
+ failures++;
+ }
+ }
+
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ if(do_prometheus_remote_write && failures) {
+ (void) buffer_on_failures;
+ failures = 0;
+ chart_lost_bytes = chart_buffered_bytes = backends_get_write_request_size(); // estimated write request size
+ chart_data_lost_events++;
+ chart_lost_metrics = chart_buffered_metrics;
+ } else
+#endif
+ if(failures > buffer_on_failures) {
+ // too bad! we are going to lose data
+ chart_lost_bytes += buffer_strlen(b);
+ error("BACKEND: reached %d backend failures. Flushing buffers to protect this host - this results in data loss on back-end server '%s'", failures, destination);
+ buffer_flush(b);
+ failures = 0;
+ chart_data_lost_events++;
+ chart_lost_metrics = chart_buffered_metrics;
+ }
+
+ if(unlikely(netdata_exit)) break;
+
+ // ------------------------------------------------------------------------
+ // update the monitoring charts
+
+ if(likely(chart_ops->counter_done)) rrdset_next(chart_ops);
+ rrddim_set(chart_ops, "read", chart_receptions);
+ rrddim_set(chart_ops, "write", chart_transmission_successes);
+ rrddim_set(chart_ops, "discard", chart_data_lost_events);
+ rrddim_set(chart_ops, "failure", chart_transmission_failures);
+ rrddim_set(chart_ops, "reconnect", chart_backend_reconnects);
+ rrdset_done(chart_ops);
+
+ if(likely(chart_metrics->counter_done)) rrdset_next(chart_metrics);
+ rrddim_set(chart_metrics, "buffered", chart_buffered_metrics);
+ rrddim_set(chart_metrics, "lost", chart_lost_metrics);
+ rrddim_set(chart_metrics, "sent", chart_sent_metrics);
+ rrdset_done(chart_metrics);
+
+ if(likely(chart_bytes->counter_done)) rrdset_next(chart_bytes);
+ rrddim_set(chart_bytes, "buffered", chart_buffered_bytes);
+ rrddim_set(chart_bytes, "lost", chart_lost_bytes);
+ rrddim_set(chart_bytes, "sent", chart_sent_bytes);
+ rrddim_set(chart_bytes, "received", chart_received_bytes);
+ rrdset_done(chart_bytes);
+
+ /*
+ if(likely(chart_latency->counter_done)) rrdset_next(chart_latency);
+ rrddim_set(chart_latency, "latency", chart_backend_latency);
+ rrdset_done(chart_latency);
+ */
+
+ getrusage(RUSAGE_THREAD, &thread);
+ if(likely(chart_rusage->counter_done)) rrdset_next(chart_rusage);
+ rrddim_set(chart_rusage, "user", thread.ru_utime.tv_sec * 1000000ULL + thread.ru_utime.tv_usec);
+ rrddim_set(chart_rusage, "system", thread.ru_stime.tv_sec * 1000000ULL + thread.ru_stime.tv_usec);
+ rrdset_done(chart_rusage);
+
+ if(likely(buffer_strlen(b) == 0))
+ chart_buffered_metrics = 0;
+
+ if(unlikely(netdata_exit)) break;
+ }
+
+cleanup:
+#if HAVE_KINESIS
+ if(do_kinesis) {
+ backends_kinesis_shutdown();
+ freez(kinesis_auth_key_id);
+ freez(kinesis_secure_key);
+ freez(kinesis_stream_name);
+ }
+#endif
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+ buffer_free(http_request_header);
+ if(do_prometheus_remote_write)
+ backends_protocol_buffers_shutdown();
+#endif
+
+#if HAVE_MONGOC
+ if(do_mongodb) {
+ backends_mongodb_cleanup();
+ freez(mongodb_uri);
+ freez(mongodb_database);
+ freez(mongodb_collection);
+ }
+#endif
+
+ if(sock != -1)
+ close(sock);
+
+ buffer_free(b);
+ buffer_free(response);
+
+#ifdef ENABLE_HTTPS
+ if(netdata_exporting_ctx) {
+ if(opentsdb_ssl.conn) {
+ SSL_free(opentsdb_ssl.conn);
+ }
+ }
+#endif
+
+ netdata_thread_cleanup_pop(1);
+ return NULL;
+}
diff --git a/backends/backends.h b/backends/backends.h
new file mode 100644
index 0000000..2f4efd9
--- /dev/null
+++ b/backends/backends.h
@@ -0,0 +1,97 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKENDS_H
+#define NETDATA_BACKENDS_H 1
+
+#include "daemon/common.h"
+
+typedef enum backend_options {
+ BACKEND_OPTION_NONE = 0,
+
+ BACKEND_SOURCE_DATA_AS_COLLECTED = (1 << 0),
+ BACKEND_SOURCE_DATA_AVERAGE = (1 << 1),
+ BACKEND_SOURCE_DATA_SUM = (1 << 2),
+
+ BACKEND_OPTION_SEND_NAMES = (1 << 16)
+} BACKEND_OPTIONS;
+
+typedef enum backend_types {
+ BACKEND_TYPE_UNKNOWN, // Invalid type
+ BACKEND_TYPE_GRAPHITE, // Send plain text to Graphite
+ BACKEND_TYPE_OPENTSDB_USING_TELNET, // Send data to OpenTSDB using telnet API
+ BACKEND_TYPE_OPENTSDB_USING_HTTP, // Send data to OpenTSDB using HTTP API
+ BACKEND_TYPE_JSON, // Stores the data using JSON.
+ BACKEND_TYPE_PROMETHEUS_REMOTE_WRITE, // The user selected to use Prometheus backend
+ BACKEND_TYPE_KINESIS, // Send message to AWS Kinesis
+ BACKEND_TYPE_MONGODB, // Send data to MongoDB collection
+ BACKEND_TYPE_NUM // Number of backend types
+} BACKEND_TYPE;
+
+typedef int (**backend_response_checker_t)(BUFFER *);
+typedef int (**backend_request_formatter_t)(BUFFER *, const char *, RRDHOST *, const char *, RRDSET *, RRDDIM *, time_t, time_t, BACKEND_OPTIONS);
+
+#define BACKEND_OPTIONS_SOURCE_BITS (BACKEND_SOURCE_DATA_AS_COLLECTED|BACKEND_SOURCE_DATA_AVERAGE|BACKEND_SOURCE_DATA_SUM)
+#define BACKEND_OPTIONS_DATA_SOURCE(backend_options) (backend_options & BACKEND_OPTIONS_SOURCE_BITS)
+
+extern int global_backend_update_every;
+extern BACKEND_OPTIONS global_backend_options;
+extern const char *global_backend_source;
+extern const char *global_backend_prefix;
+
+extern void *backends_main(void *ptr);
+BACKEND_TYPE backend_select_type(const char *type);
+
+extern BACKEND_OPTIONS backend_parse_data_source(const char *source, BACKEND_OPTIONS backend_options);
+
+#ifdef BACKENDS_INTERNALS
+
+extern int backends_can_send_rrdset(BACKEND_OPTIONS backend_options, RRDSET *st);
+extern calculated_number backend_calculate_value_from_stored_data(
+ RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+ , time_t *first_timestamp // the timestamp of the first point used in this response
+ , time_t *last_timestamp // the timestamp that should be reported to backend
+);
+
+extern size_t backend_name_copy(char *d, const char *s, size_t usable);
+extern int discard_response(BUFFER *b, const char *backend);
+
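+// strip one pair of surrounding single or double quotes from str, in place, and return the unquoted string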
+static inline char *strip_quotes(char *str) {
+ if(*str == '"' || *str == '\'') {
+ char *s;
+
+ str++;
+
+ s = str;
+ while(*s) s++;
+ if(s != str) s--;
+
+ if(*s == '"' || *s == '\'') *s = '\0';
+ }
+
+ return str;
+}
+
+#endif // BACKENDS_INTERNALS
+
+#include "backends/prometheus/backend_prometheus.h"
+#include "backends/graphite/graphite.h"
+#include "backends/json/json.h"
+#include "backends/opentsdb/opentsdb.h"
+
+#if HAVE_KINESIS
+#include "backends/aws_kinesis/aws_kinesis.h"
+#endif
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+#include "backends/prometheus/remote_write/remote_write.h"
+#endif
+
+#if HAVE_MONGOC
+#include "backends/mongodb/mongodb.h"
+#endif
+
+#endif /* NETDATA_BACKENDS_H */
diff --git a/backends/graphite/Makefile.am b/backends/graphite/Makefile.am
new file mode 100644
index 0000000..babdcf0
--- /dev/null
+++ b/backends/graphite/Makefile.am
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
diff --git a/backends/graphite/graphite.c b/backends/graphite/graphite.c
new file mode 100644
index 0000000..f75a93a
--- /dev/null
+++ b/backends/graphite/graphite.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "graphite.h"
+
+// ----------------------------------------------------------------------------
+// graphite backend
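+//
+// each dimension is sent as a single plaintext line of the form:
+//   <prefix>.<hostname>.<chart>.<dimension>[;<host tags>] <value> <timestamp>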
+
+int backends_format_dimension_collected_graphite_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+ (void)after;
+ (void)before;
+
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ buffer_sprintf(
+ b
+ , "%s.%s.%s.%s%s%s " COLLECTED_NUMBER_FORMAT " %llu\n"
+ , prefix
+ , hostname
+ , chart_name
+ , dimension_name
+ , (host->tags)?";":""
+ , (host->tags)?host->tags:""
+ , rd->last_collected_value
+ , (unsigned long long)rd->last_collected_time.tv_sec
+ );
+
+ return 1;
+}
+
+int backends_format_dimension_stored_graphite_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ if(!isnan(value)) {
+
+ buffer_sprintf(
+ b
+ , "%s.%s.%s.%s%s%s " CALCULATED_NUMBER_FORMAT " %llu\n"
+ , prefix
+ , hostname
+ , chart_name
+ , dimension_name
+ , (host->tags)?";":""
+ , (host->tags)?host->tags:""
+ , value
+ , (unsigned long long) last_t
+ );
+
+ return 1;
+ }
+ return 0;
+}
+
+int process_graphite_response(BUFFER *b) {
+ return discard_response(b, "graphite");
+}
+
+
diff --git a/backends/graphite/graphite.h b/backends/graphite/graphite.h
new file mode 100644
index 0000000..498a7fc
--- /dev/null
+++ b/backends/graphite/graphite.h
@@ -0,0 +1,35 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+
+#ifndef NETDATA_BACKEND_GRAPHITE_H
+#define NETDATA_BACKEND_GRAPHITE_H
+
+#include "backends/backends.h"
+
+extern int backends_format_dimension_collected_graphite_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int backends_format_dimension_stored_graphite_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int process_graphite_response(BUFFER *b);
+
+#endif //NETDATA_BACKEND_GRAPHITE_H
diff --git a/backends/json/Makefile.am b/backends/json/Makefile.am
new file mode 100644
index 0000000..babdcf0
--- /dev/null
+++ b/backends/json/Makefile.am
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
diff --git a/backends/json/json.c b/backends/json/json.c
new file mode 100644
index 0000000..0c7cc73
--- /dev/null
+++ b/backends/json/json.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "json.h"
+
+// ----------------------------------------------------------------------------
+// json backend
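+//
+// each dimension is sent as one JSON document per line, carrying the prefix, the hostname
+// and its tags, the chart and dimension identifiers and metadata, the value and the timestamp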
+
+int backends_format_dimension_collected_json_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+ (void)after;
+ (void)before;
+ (void)backend_options;
+
+ const char *tags_pre = "", *tags_post = "", *tags = host->tags;
+ if(!tags) tags = "";
+
+ if(*tags) {
+ if(*tags == '{' || *tags == '[' || *tags == '"') {
+ tags_pre = "\"host_tags\":";
+ tags_post = ",";
+ }
+ else {
+ tags_pre = "\"host_tags\":\"";
+ tags_post = "\",";
+ }
+ }
+
+ buffer_sprintf(b, "{"
+ "\"prefix\":\"%s\","
+ "\"hostname\":\"%s\","
+ "%s%s%s"
+
+ "\"chart_id\":\"%s\","
+ "\"chart_name\":\"%s\","
+ "\"chart_family\":\"%s\","
+ "\"chart_context\": \"%s\","
+ "\"chart_type\":\"%s\","
+ "\"units\": \"%s\","
+
+ "\"id\":\"%s\","
+ "\"name\":\"%s\","
+ "\"value\":" COLLECTED_NUMBER_FORMAT ","
+
+ "\"timestamp\": %llu}\n",
+ prefix,
+ hostname,
+ tags_pre, tags, tags_post,
+
+ st->id,
+ st->name,
+ st->family,
+ st->context,
+ st->type,
+ st->units,
+
+ rd->id,
+ rd->name,
+ rd->last_collected_value,
+
+ (unsigned long long) rd->last_collected_time.tv_sec
+ );
+
+ return 1;
+}
+
+int backends_format_dimension_stored_json_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ if(!isnan(value)) {
+ const char *tags_pre = "", *tags_post = "", *tags = host->tags;
+ if(!tags) tags = "";
+
+ if(*tags) {
+ if(*tags == '{' || *tags == '[' || *tags == '"') {
+ tags_pre = "\"host_tags\":";
+ tags_post = ",";
+ }
+ else {
+ tags_pre = "\"host_tags\":\"";
+ tags_post = "\",";
+ }
+ }
+
+ buffer_sprintf(b, "{"
+ "\"prefix\":\"%s\","
+ "\"hostname\":\"%s\","
+ "%s%s%s"
+
+ "\"chart_id\":\"%s\","
+ "\"chart_name\":\"%s\","
+ "\"chart_family\":\"%s\","
+ "\"chart_context\": \"%s\","
+ "\"chart_type\":\"%s\","
+ "\"units\": \"%s\","
+
+ "\"id\":\"%s\","
+ "\"name\":\"%s\","
+ "\"value\":" CALCULATED_NUMBER_FORMAT ","
+
+ "\"timestamp\": %llu}\n",
+ prefix,
+ hostname,
+ tags_pre, tags, tags_post,
+
+ st->id,
+ st->name,
+ st->family,
+ st->context,
+ st->type,
+ st->units,
+
+ rd->id,
+ rd->name,
+ value,
+
+ (unsigned long long) last_t
+ );
+
+ return 1;
+ }
+ return 0;
+}
+
+int process_json_response(BUFFER *b) {
+ return discard_response(b, "json");
+}
+
+
diff --git a/backends/json/json.h b/backends/json/json.h
new file mode 100644
index 0000000..78ac376
--- /dev/null
+++ b/backends/json/json.h
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_JSON_H
+#define NETDATA_BACKEND_JSON_H
+
+#include "backends/backends.h"
+
+extern int backends_format_dimension_collected_json_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int backends_format_dimension_stored_json_plaintext(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int process_json_response(BUFFER *b);
+
+#endif //NETDATA_BACKEND_JSON_H
diff --git a/backends/mongodb/Makefile.am b/backends/mongodb/Makefile.am
new file mode 100644
index 0000000..161784b
--- /dev/null
+++ b/backends/mongodb/Makefile.am
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+dist_noinst_DATA = \
+ README.md \
+ $(NULL)
diff --git a/backends/mongodb/README.md b/backends/mongodb/README.md
new file mode 100644
index 0000000..7c7996e
--- /dev/null
+++ b/backends/mongodb/README.md
@@ -0,0 +1,41 @@
+<!--
+title: "MongoDB backend"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/mongodb/README.md
+-->
+
+# MongoDB backend
+
+## Prerequisites
+
+To use MongoDB as a backend, first [install](http://mongoc.org/libmongoc/current/installing.html) `libmongoc` 1.7.0 or
+higher. Then reinstall Netdata from source; the installer will detect that the required libraries are now available.
+
+## Configuration
+
+To enable data sending to the MongoDB backend set the following options in `netdata.conf`:
+
+```conf
+[backend]
+ enabled = yes
+ type = mongodb
+```
+
+In the Netdata configuration directory run `./edit-config mongodb.conf` and set [MongoDB
+URI](https://docs.mongodb.com/manual/reference/connection-string/), database name, and collection name:
+
+```conf
+# URI
+uri = mongodb://<hostname>
+
+# database name
+database = your_database_name
+
+# collection name
+collection = your_collection_name
+```
+
+The default socket timeout depends on the backend update interval. The timeout is 500 ms shorter than the interval (but
+not less than 1000 ms). You can alter the timeout using the `sockettimeoutms` MongoDB URI option.
+
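+For example, to force a 5 second socket timeout you can append the option to the URI in `mongodb.conf` (the hostname
+below is illustrative):
+
+```conf
+uri = mongodb://mongodb.example.com:27017/?sockettimeoutms=5000
+```
+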
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fmongodb%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/mongodb/mongodb.c b/backends/mongodb/mongodb.c
new file mode 100644
index 0000000..d0527a7
--- /dev/null
+++ b/backends/mongodb/mongodb.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "mongodb.h"
+#include <mongoc.h>
+
+#define CONFIG_FILE_LINE_MAX ((CONFIG_MAX_NAME + CONFIG_MAX_VALUE + 1024) * 2)
+
+static mongoc_client_t *mongodb_client;
+static mongoc_collection_t *mongodb_collection;
+
+int backends_mongodb_init(const char *uri_string,
+ const char *database_string,
+ const char *collection_string,
+ int32_t default_socket_timeout) {
+ mongoc_uri_t *uri;
+ bson_error_t error;
+
+ mongoc_init();
+
+ uri = mongoc_uri_new_with_error(uri_string, &error);
+ if(unlikely(!uri)) {
+ error("BACKEND: failed to parse URI: %s. Error message: %s", uri_string, error.message);
+ return 1;
+ }
+
+ int32_t socket_timeout = mongoc_uri_get_option_as_int32(uri, MONGOC_URI_SOCKETTIMEOUTMS, default_socket_timeout);
+ if(!mongoc_uri_set_option_as_int32(uri, MONGOC_URI_SOCKETTIMEOUTMS, socket_timeout)) {
+ error("BACKEND: failed to set %s to the value %d", MONGOC_URI_SOCKETTIMEOUTMS, socket_timeout);
+ return 1;
+ };
+
+ mongodb_client = mongoc_client_new_from_uri(uri);
+ if(unlikely(!mongodb_client)) {
+ error("BACKEND: failed to create a new client");
+ return 1;
+ }
+
+ if(!mongoc_client_set_appname(mongodb_client, "netdata")) {
+ error("BACKEND: failed to set client appname");
+ };
+
+ mongodb_collection = mongoc_client_get_collection(mongodb_client, database_string, collection_string);
+
+ mongoc_uri_destroy(uri);
+
+ return 0;
+}
+
+void backends_free_bson(bson_t **insert, size_t n_documents) {
+ size_t i;
+
+ for(i = 0; i < n_documents; i++)
+ bson_destroy(insert[i]);
+
+ free(insert);
+}
+
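+// data is expected to be newline-delimited JSON, one document per metric (as produced by
+// the JSON formatter); each line is parsed into a BSON document and the whole batch is
+// written to the collection with a single insert_many() call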
+int backends_mongodb_insert(char *data, size_t n_metrics) {
+ bson_t **insert = calloc(n_metrics, sizeof(bson_t *));
+ bson_error_t error;
+ char *start = data, *end = data;
+ size_t n_documents = 0;
+
+    while(*end && n_documents < n_metrics) {
+ while(*end && *end != '\n') end++;
+
+ if(likely(*end)) {
+ *end = '\0';
+ end++;
+ }
+ else {
+ break;
+ }
+
+ insert[n_documents] = bson_new_from_json((const uint8_t *)start, -1, &error);
+
+ if(unlikely(!insert[n_documents])) {
+ error("BACKEND: %s", error.message);
+ backends_free_bson(insert, n_documents);
+ return 1;
+ }
+
+ start = end;
+
+ n_documents++;
+ }
+
+ if(unlikely(!mongoc_collection_insert_many(mongodb_collection, (const bson_t **)insert, n_documents, NULL, NULL, &error))) {
+ error("BACKEND: %s", error.message);
+ backends_free_bson(insert, n_documents);
+ return 1;
+ }
+
+ backends_free_bson(insert, n_documents);
+
+ return 0;
+}
+
+void backends_mongodb_cleanup() {
+ mongoc_collection_destroy(mongodb_collection);
+ mongoc_client_destroy(mongodb_client);
+ mongoc_cleanup();
+
+ return;
+}
+
+int read_mongodb_conf(const char *path, char **uri_p, char **database_p, char **collection_p) {
+ char *uri = *uri_p;
+ char *database = *database_p;
+ char *collection = *collection_p;
+
+ if(unlikely(uri)) freez(uri);
+ if(unlikely(database)) freez(database);
+ if(unlikely(collection)) freez(collection);
+ uri = NULL;
+ database = NULL;
+ collection = NULL;
+
+ int line = 0;
+
+ char filename[FILENAME_MAX + 1];
+ snprintfz(filename, FILENAME_MAX, "%s/mongodb.conf", path);
+
+ char buffer[CONFIG_FILE_LINE_MAX + 1], *s;
+
+ debug(D_BACKEND, "BACKEND: opening config file '%s'", filename);
+
+ FILE *fp = fopen(filename, "r");
+ if(!fp) {
+ return 1;
+ }
+
+ while(fgets(buffer, CONFIG_FILE_LINE_MAX, fp) != NULL) {
+ buffer[CONFIG_FILE_LINE_MAX] = '\0';
+ line++;
+
+ s = trim(buffer);
+ if(!s || *s == '#') {
+ debug(D_BACKEND, "BACKEND: ignoring line %d of file '%s', it is empty.", line, filename);
+ continue;
+ }
+
+ char *name = s;
+ char *value = strchr(s, '=');
+ if(unlikely(!value)) {
+ error("BACKEND: ignoring line %d ('%s') of file '%s', there is no = in it.", line, s, filename);
+ continue;
+ }
+ *value = '\0';
+ value++;
+
+ name = trim(name);
+ value = trim(value);
+
+ if(unlikely(!name || *name == '#')) {
+ error("BACKEND: ignoring line %d of file '%s', name is empty.", line, filename);
+ continue;
+ }
+
+ if(!value)
+ value = "";
+ else
+ value = strip_quotes(value);
+
+ if(name[0] == 'u' && !strcmp(name, "uri")) {
+ uri = strdupz(value);
+ }
+ else if(name[0] == 'd' && !strcmp(name, "database")) {
+ database = strdupz(value);
+ }
+ else if(name[0] == 'c' && !strcmp(name, "collection")) {
+ collection = strdupz(value);
+ }
+ }
+
+ fclose(fp);
+
+ if(unlikely(!collection || !*collection)) {
+ error("BACKEND: collection name is a mandatory MongoDB parameter, but it is not configured");
+ return 1;
+ }
+
+ *uri_p = uri;
+ *database_p = database;
+ *collection_p = collection;
+
+ return 0;
+}
diff --git a/backends/mongodb/mongodb.conf b/backends/mongodb/mongodb.conf
new file mode 100644
index 0000000..11ea6ef
--- /dev/null
+++ b/backends/mongodb/mongodb.conf
@@ -0,0 +1,12 @@
+# MongoDB backend configuration
+#
+# All options in this file are mandatory
+
+# URI
+uri =
+
+# database name
+database =
+
+# collection name
+collection =
diff --git a/backends/mongodb/mongodb.h b/backends/mongodb/mongodb.h
new file mode 100644
index 0000000..cae9b09
--- /dev/null
+++ b/backends/mongodb/mongodb.h
@@ -0,0 +1,16 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_MONGODB_H
+#define NETDATA_BACKEND_MONGODB_H
+
+#include "backends/backends.h"
+
+extern int backends_mongodb_init(const char *uri_string, const char *database_string, const char *collection_string, const int32_t socket_timeout);
+
+extern int backends_mongodb_insert(char *data, size_t n_metrics);
+
+extern void backends_mongodb_cleanup();
+
+extern int read_mongodb_conf(const char *path, char **uri_p, char **database_p, char **collection_p);
+
+#endif //NETDATA_BACKEND_MONGODB_H
diff --git a/backends/nc-backend.sh b/backends/nc-backend.sh
new file mode 100755
index 0000000..7280f86
--- /dev/null
+++ b/backends/nc-backend.sh
@@ -0,0 +1,158 @@
+#!/usr/bin/env bash
+
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+# This is a simple backend database proxy, written in BASH, using the nc command.
+# Run the script without any parameters for help.
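+#
+# Example invocation (ports and backend host are illustrative):
+#
+#   ./nc-backend.sh start 2004 graphite.example.com 2003   # listen on 2004, forward to Graphite on 2003
+#   ./nc-backend.sh stop  2004                             # stop the listener on port 2004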
+
+MODE="${1}"
+MY_PORT="${2}"
+BACKEND_HOST="${3}"
+BACKEND_PORT="${4}"
+FILE="${NETDATA_NC_BACKEND_DIR-/tmp}/netdata-nc-backend-${MY_PORT}"
+
+log() {
+ logger --stderr --id=$$ --tag "netdata-nc-backend" "${*}"
+}
+
+mync() {
+ local ret
+
+ log "Running: nc ${*}"
+ nc "${@}"
+ ret=$?
+
+ log "nc stopped with return code ${ret}."
+
+ return ${ret}
+}
+
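+# listen on the given port, append everything received to the spool file and,
+# whenever a listening session ends, try to replay the spool file to the real
+# backend (if one was configured); repeat forever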
+listen_save_replay_forever() {
+ local file="${1}" port="${2}" real_backend_host="${3}" real_backend_port="${4}" ret delay=1 started ended
+
+ while true
+ do
+ log "Starting nc to listen on port ${port} and save metrics to ${file}"
+
+ started=$(date +%s)
+ mync -l -p "${port}" | tee -a -p --output-error=exit "${file}"
+ ended=$(date +%s)
+
+ if [ -s "${file}" ]
+ then
+ if [ ! -z "${real_backend_host}" ] && [ ! -z "${real_backend_port}" ]
+ then
+ log "Attempting to send the metrics to the real backend at ${real_backend_host}:${real_backend_port}"
+
+ mync "${real_backend_host}" "${real_backend_port}" <"${file}"
+ ret=$?
+
+ if [ ${ret} -eq 0 ]
+ then
+                                        log "Successfully sent the metrics to ${real_backend_host}:${real_backend_port}"
+ mv "${file}" "${file}.old"
+ touch "${file}"
+ else
+ log "Failed to send the metrics to ${real_backend_host}:${real_backend_port} (nc returned ${ret}) - appending more data to ${file}"
+ fi
+ else
+ log "No backend configured - appending more data to ${file}"
+ fi
+ fi
+
+ # prevent a CPU hungry infinite loop
+ # if nc cannot listen to port
+ if [ $((ended - started)) -lt 5 ]
+ then
+ log "nc has been stopped too fast."
+ delay=30
+ else
+ delay=1
+ fi
+
+ log "Waiting ${delay} seconds before listening again for data."
+ sleep ${delay}
+ done
+}
+
+if [ "${MODE}" = "start" ]
+ then
+
+ # start the listener, in exclusive mode
+ # only one can use the same file/port at a time
+ {
+ flock -n 9
+ # shellcheck disable=SC2181
+ if [ $? -ne 0 ]
+ then
+ log "Cannot get exclusive lock on file ${FILE}.lock - Am I running multiple times?"
+ exit 2
+ fi
+
+ # save our PID to the lock file
+ echo "$$" >"${FILE}.lock"
+
+ listen_save_replay_forever "${FILE}" "${MY_PORT}" "${BACKEND_HOST}" "${BACKEND_PORT}"
+ ret=$?
+
+ log "listener exited."
+ exit ${ret}
+
+ } 9>>"${FILE}.lock"
+
+ # we can only get here if ${FILE}.lock cannot be created
+ log "Cannot create file ${FILE}."
+ exit 3
+
+elif [ "${MODE}" = "stop" ]
+ then
+
+ {
+ flock -n 9
+ # shellcheck disable=SC2181
+ if [ $? -ne 0 ]
+ then
+ pid=$(<"${FILE}".lock)
+ log "Killing process ${pid}..."
+ kill -TERM "-${pid}"
+ exit 0
+ fi
+
+ log "File ${FILE}.lock has been locked by me but it shouldn't. Is a collector running?"
+ exit 4
+
+ } 9<"${FILE}.lock"
+
+ log "File ${FILE}.lock does not exist. Is a collector running?"
+ exit 5
+
+else
+
+ cat <<EOF
+Usage:
+
+ "${0}" start|stop PORT [BACKEND_HOST BACKEND_PORT]
+
+ PORT The port this script will listen on
+ (configure netdata to use this as a second backend)
+
+ BACKEND_HOST The real backend host
+ BACKEND_PORT The real backend port
+
+ This script can act as a fallback backend for netdata.
+ It will receive metrics from netdata, save them to
+ ${FILE}
+ and, once the real backend is reachable again, push all
+ the collected metrics to it, then wait for the next
+ failure.
+
+ Only one netdata instance can connect to this script at a time.
+ If you need a fallback for multiple netdata instances, run this
+ script multiple times, each on a different port.
+
+ You can run me in the background with this:
+
+ screen -d -m "${0}" start PORT [BACKEND_HOST BACKEND_PORT]
+EOF
+ exit 1
+fi
diff --git a/backends/opentsdb/Makefile.am b/backends/opentsdb/Makefile.am
new file mode 100644
index 0000000..babdcf0
--- /dev/null
+++ b/backends/opentsdb/Makefile.am
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
diff --git a/backends/opentsdb/README.md b/backends/opentsdb/README.md
new file mode 100644
index 0000000..5ba7b12
--- /dev/null
+++ b/backends/opentsdb/README.md
@@ -0,0 +1,38 @@
+<!--
+title: "OpenTSDB with HTTP"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/opentsdb/README.md
+-->
+
+# OpenTSDB with HTTP
+
+Netdata can communicate with OpenTSDB using its HTTP API. To enable this backend, set the following options in your
+`netdata.conf`:
+
+```conf
+[backend]
+ type = opentsdb:http
+ destination = localhost:4242
+```
+
+In this example, OpenTSDB is running with its default port, which is `4242`. If you run OpenTSDB on a different port,
+change the `destination = localhost:4242` line accordingly.
+
+## HTTPS
+
+As of [v1.16.0](https://github.com/netdata/netdata/releases/tag/v1.16.0), Netdata can send metrics to OpenTSDB using
+TLS/SSL. Unfortunately, OpenTSDB does not support encrypted connections, so you will have to configure a reverse proxy
+to enable HTTPS communication between Netdata and OpenTSDB. You can set up a reverse proxy with
+[Nginx](/docs/Running-behind-nginx.md).
+
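+For example, a minimal Nginx server block for this setup could look like the following sketch. Netdata connects to the
+proxy over TLS and the proxy forwards plain HTTP to OpenTSDB; the certificate paths are illustrative and must match
+your environment:
+
+```conf
+server {
+    listen 8082 ssl;
+    server_name localhost;
+
+    # hypothetical certificate paths
+    ssl_certificate     /etc/nginx/ssl/opentsdb.crt;
+    ssl_certificate_key /etc/nginx/ssl/opentsdb.key;
+
+    location / {
+        # forward the decrypted traffic to OpenTSDB's plain HTTP API
+        proxy_pass http://127.0.0.1:4242;
+    }
+}
+```
+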
+After your proxy is configured, make the following changes to `netdata.conf`:
+
+```conf
+[backend]
+ type = opentsdb:https
+ destination = localhost:8082
+```
+
+In this example, we used the port `8082` for our reverse proxy. If your reverse proxy listens on a different port,
+change the `destination = localhost:8082` line accordingly.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fopentsdb%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/backends/opentsdb/opentsdb.c b/backends/opentsdb/opentsdb.c
new file mode 100644
index 0000000..965b4c0
--- /dev/null
+++ b/backends/opentsdb/opentsdb.c
@@ -0,0 +1,205 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "opentsdb.h"
+
+// ----------------------------------------------------------------------------
+// opentsdb backend
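+//
+// the telnet API expects one line per dimension:
+//   put <prefix>.<chart>.<dimension> <timestamp> <value> host=<hostname>[ <host tags>]
+// the HTTP API wraps the same information into a JSON document POSTed to /api/put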
+
+int backends_format_dimension_collected_opentsdb_telnet(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+ (void)after;
+ (void)before;
+
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ buffer_sprintf(
+ b
+ , "put %s.%s.%s %llu " COLLECTED_NUMBER_FORMAT " host=%s%s%s\n"
+ , prefix
+ , chart_name
+ , dimension_name
+ , (unsigned long long)rd->last_collected_time.tv_sec
+ , rd->last_collected_value
+ , hostname
+ , (host->tags)?" ":""
+ , (host->tags)?host->tags:""
+ );
+
+ return 1;
+}
+
+int backends_format_dimension_stored_opentsdb_telnet(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ if(!isnan(value)) {
+
+ buffer_sprintf(
+ b
+ , "put %s.%s.%s %llu " CALCULATED_NUMBER_FORMAT " host=%s%s%s\n"
+ , prefix
+ , chart_name
+ , dimension_name
+ , (unsigned long long) last_t
+ , value
+ , hostname
+ , (host->tags)?" ":""
+ , (host->tags)?host->tags:""
+ );
+
+ return 1;
+ }
+
+ return 0;
+}
+
+int process_opentsdb_response(BUFFER *b) {
+ return discard_response(b, "opentsdb");
+}
+
+static inline void opentsdb_build_message(BUFFER *b, char *message, const char *hostname, int length) {
+ buffer_sprintf(
+ b
+ , "POST /api/put HTTP/1.1\r\n"
+ "Host: %s\r\n"
+ "Content-Type: application/json\r\n"
+ "Content-Length: %d\r\n"
+ "\r\n"
+ "%s"
+ , hostname
+ , length
+ , message
+ );
+}
+
+int backends_format_dimension_collected_opentsdb_http(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+ (void)after;
+ (void)before;
+
+ char message[1024];
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ int length = snprintfz(message
+ , sizeof(message)
+ , "{"
+ " \"metric\": \"%s.%s.%s\","
+ " \"timestamp\": %llu,"
+ " \"value\": "COLLECTED_NUMBER_FORMAT ","
+ " \"tags\": {"
+ " \"host\": \"%s%s%s\""
+ " }"
+ "}"
+ , prefix
+ , chart_name
+ , dimension_name
+ , (unsigned long long)rd->last_collected_time.tv_sec
+ , rd->last_collected_value
+ , hostname
+ , (host->tags)?" ":""
+ , (host->tags)?host->tags:""
+ );
+
+ if(length > 0) {
+ opentsdb_build_message(b, message, hostname, length);
+ }
+
+ return 1;
+}
+
+int backends_format_dimension_stored_opentsdb_http(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+) {
+ (void)host;
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ if(!isnan(value)) {
+ char chart_name[RRD_ID_LENGTH_MAX + 1];
+ char dimension_name[RRD_ID_LENGTH_MAX + 1];
+ backend_name_copy(chart_name, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, RRD_ID_LENGTH_MAX);
+ backend_name_copy(dimension_name, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name)?rd->name:rd->id, RRD_ID_LENGTH_MAX);
+
+ char message[1024];
+ int length = snprintfz(message
+ , sizeof(message)
+ , "{"
+ " \"metric\": \"%s.%s.%s\","
+ " \"timestamp\": %llu,"
+ " \"value\": "CALCULATED_NUMBER_FORMAT ","
+ " \"tags\": {"
+ " \"host\": \"%s%s%s\""
+ " }"
+ "}"
+ , prefix
+ , chart_name
+ , dimension_name
+ , (unsigned long long)last_t
+ , value
+ , hostname
+ , (host->tags)?" ":""
+ , (host->tags)?host->tags:""
+ );
+
+ if(length > 0) {
+ opentsdb_build_message(b, message, hostname, length);
+ }
+
+ return 1;
+ }
+
+ return 0;
+}
diff --git a/backends/opentsdb/opentsdb.h b/backends/opentsdb/opentsdb.h
new file mode 100644
index 0000000..87d9c5c
--- /dev/null
+++ b/backends/opentsdb/opentsdb.h
@@ -0,0 +1,58 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_OPENTSDB_H
+#define NETDATA_BACKEND_OPENTSDB_H
+
+#include "backends/backends.h"
+
+extern int backends_format_dimension_collected_opentsdb_telnet(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int backends_format_dimension_stored_opentsdb_telnet(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+extern int process_opentsdb_response(BUFFER *b);
+
+int backends_format_dimension_collected_opentsdb_http(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+int backends_format_dimension_stored_opentsdb_http(
+ BUFFER *b // the buffer to write data to
+ , const char *prefix // the prefix to use
+ , RRDHOST *host // the host this chart comes from
+ , const char *hostname // the hostname (to override host->hostname)
+ , RRDSET *st // the chart
+ , RRDDIM *rd // the dimension
+ , time_t after // the start timestamp
+ , time_t before // the end timestamp
+ , BACKEND_OPTIONS backend_options // BACKEND_SOURCE_* bitmap
+);
+
+#endif //NETDATA_BACKEND_OPENTSDB_H
diff --git a/backends/prometheus/Makefile.am b/backends/prometheus/Makefile.am
new file mode 100644
index 0000000..334fca8
--- /dev/null
+++ b/backends/prometheus/Makefile.am
@@ -0,0 +1,12 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+SUBDIRS = \
+ remote_write \
+ $(NULL)
+
+dist_noinst_DATA = \
+ README.md \
+ $(NULL)
diff --git a/backends/prometheus/README.md b/backends/prometheus/README.md
new file mode 100644
index 0000000..10275fa
--- /dev/null
+++ b/backends/prometheus/README.md
@@ -0,0 +1,457 @@
+<!--
+title: "Using Netdata with Prometheus"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/prometheus/README.md
+-->
+
+# Using Netdata with Prometheus
+
+> IMPORTANT: the format Netdata sends metrics to prometheus has changed since Netdata v1.7. The new prometheus backend
+> for Netdata supports a lot more features and is aligned to the development of the rest of the Netdata backends.
+
+Prometheus is a distributed monitoring system which offers a very simple setup along with a robust data model. Recently
+Netdata added support for Prometheus. I'm going to quickly show you how to install both Netdata and Prometheus on the
+same server. We can then use Grafana pointed at Prometheus to obtain the long term metrics Netdata offers. I'm assuming
+we are starting from a fresh Ubuntu shell (whether you'd like to follow along in a VM or a cloud instance is up to you).
+
+## Installing Netdata and prometheus
+
+### Installing Netdata
+
+There are a number of ways to install Netdata, as described in [Installation](/packaging/installer/README.md). The
+suggested way is to install the latest Netdata and keep it upgraded automatically, using the one-line installation:
+
+```sh
+bash <(curl -Ss https://my-netdata.io/kickstart.sh)
+```
+
+At this point we should have Netdata listening on port 19999. Point your browser to:
+
+```sh
+http://your.netdata.ip:19999
+```
+
+_(replace `your.netdata.ip` with the IP or hostname of the server running Netdata)_
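+
+If you prefer to check from the command line, a quick request to Netdata's API (assuming the default port) should
+return a JSON description of the agent, for example:
+
+```sh
+curl -s http://your.netdata.ip:19999/api/v1/info
+```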
+
+### Installing Prometheus
+
+In order to install prometheus we are going to introduce our own systemd startup script along with an example
+`prometheus.yml` configuration. Prometheus needs to be pointed to your server at a specific target URL for it to scrape
+Netdata's API. Prometheus uses a pull model, meaning Netdata is the passive client within this architecture;
+Prometheus always initiates the connection with Netdata.
+
+#### Download Prometheus
+
+```sh
+cd /tmp && curl -s https://api.github.com/repos/prometheus/prometheus/releases/latest \
+| grep "browser_download_url.*linux-amd64.tar.gz" \
+| cut -d '"' -f 4 \
+| wget -qi -
+```
+
+#### Create prometheus system user
+
+```sh
+sudo useradd -r prometheus
+```
+
+#### Create prometheus directory
+
+```sh
+sudo mkdir /opt/prometheus
+sudo chown prometheus:prometheus /opt/prometheus
+```
+
+#### Untar prometheus directory
+
+```sh
+sudo tar -xvf /tmp/prometheus-*linux-amd64.tar.gz -C /opt/prometheus --strip=1
+```
+
+#### Install prometheus.yml
+
+We will use the following `prometheus.yml` file. Save it at `/opt/prometheus/prometheus.yml`.
+
+Make sure to replace `your.netdata.ip` with the IP or hostname of the host running Netdata.
+
+```yaml
+# my global config
+global:
+ scrape_interval: 5s # Set the scrape interval to every 5 seconds. Default is every 1 minute.
+ evaluation_interval: 5s # Evaluate rules every 5 seconds. The default is every 1 minute.
+ # scrape_timeout is set to the global default (10s).
+
+ # Attach these labels to any time series or alerts when communicating with
+ # external systems (federation, remote storage, Alertmanager).
+ external_labels:
+ monitor: 'codelab-monitor'
+
+# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
+rule_files:
+ # - "first.rules"
+ # - "second.rules"
+
+# A scrape configuration containing exactly one endpoint to scrape:
+# Here it's Prometheus itself.
+scrape_configs:
+ # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
+ - job_name: 'prometheus'
+
+ # metrics_path defaults to '/metrics'
+ # scheme defaults to 'http'.
+
+ static_configs:
+ - targets: ['0.0.0.0:9090']
+
+ - job_name: 'netdata-scrape'
+
+ metrics_path: '/api/v1/allmetrics'
+ params:
+ # format: prometheus | prometheus_all_hosts
+ # You can use `prometheus_all_hosts` if you want Prometheus to set the `instance` to your hostname instead of IP
+ format: [prometheus]
+ #
+ # sources: as-collected | raw | average | sum | volume
+ # default is: average
+ #source: [as-collected]
+ #
+ # server name for this prometheus - the default is the client IP
+ # for Netdata to uniquely identify it
+ #server: ['prometheus1']
+ honor_labels: true
+
+ static_configs:
+ - targets: ['{your.netdata.ip}:19999']
+```
+
+#### Install nodes.yml
+
+The following is completely optional; it will enable Prometheus to generate alerts from some Netdata sources. Tweak the
+values to your own needs. We will use the following `nodes.yml` file below. Save it at `/opt/prometheus/nodes.yml`, and
+add a _- "nodes.yml"_ entry under the _rule_files:_ section in the example prometheus.yml file above (see the snippet
+after the rules file below).
+
+```yaml
+groups:
+- name: nodes
+
+ rules:
+ - alert: node_high_cpu_usage_70
+ expr: avg(rate(netdata_cpu_cpu_percentage_average{dimension="idle"}[1m])) by (job) > 70
+ for: 1m
+ annotations:
+ description: '{{ $labels.job }} on ''{{ $labels.job }}'' CPU usage is at {{ humanize $value }}%.'
+ summary: CPU alert for container node '{{ $labels.job }}'
+
+ - alert: node_high_memory_usage_70
+ expr: 100 / sum(netdata_system_ram_MB_average) by (job)
+ * sum(netdata_system_ram_MB_average{dimension=~"free|cached"}) by (job) < 30
+ for: 1m
+ annotations:
+ description: '{{ $labels.job }} memory usage is {{ humanize $value}}%.'
+ summary: Memory alert for container node '{{ $labels.job }}'
+
+ - alert: node_low_root_filesystem_space_20
+ expr: 100 / sum(netdata_disk_space_GB_average{family="/"}) by (job)
+ * sum(netdata_disk_space_GB_average{family="/",dimension=~"avail|cached"}) by (job) < 20
+ for: 1m
+ annotations:
+ description: '{{ $labels.job }} root filesystem space is {{ humanize $value}}%.'
+ summary: Root filesystem alert for container node '{{ $labels.job }}'
+
+ - alert: node_root_filesystem_fill_rate_6h
+ expr: predict_linear(netdata_disk_space_GB_average{family="/",dimension=~"avail|cached"}[1h], 6 * 3600) < 0
+ for: 1h
+ labels:
+ severity: critical
+ annotations:
+ description: Container node {{ $labels.job }} root filesystem is going to fill up in 6h.
+ summary: Disk fill alert for Swarm node '{{ $labels.job }}'
+```
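+
+The corresponding change to `/opt/prometheus/prometheus.yml` is simply adding the file to the `rule_files:` list:
+
+```yaml
+rule_files:
+  - "nodes.yml"
+```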
+
+#### Install prometheus.service
+
+Save this service file as `/etc/systemd/system/prometheus.service`:
+
+```sh
+[Unit]
+Description=Prometheus Server
+AssertPathExists=/opt/prometheus
+
+[Service]
+Type=simple
+WorkingDirectory=/opt/prometheus
+User=prometheus
+Group=prometheus
+ExecStart=/opt/prometheus/prometheus --config.file=/opt/prometheus/prometheus.yml --log.level=info
+ExecReload=/bin/kill -SIGHUP $MAINPID
+ExecStop=/bin/kill -SIGINT $MAINPID
+
+[Install]
+WantedBy=multi-user.target
+```
+
+##### Start Prometheus
+
+```sh
+sudo systemctl daemon-reload
+sudo systemctl start prometheus
+sudo systemctl enable prometheus
+```
+
+Prometheus should now start and listen on port 9090. Point your browser there to verify it is running.
+
+If everything is working correctly, when you fetch `http://your.prometheus.ip:9090` you will see a 'Status' tab. Click
+it and then click on 'Targets'. We should see the Netdata host as a scraped target.
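+
+The same check can be done from the command line through Prometheus' HTTP API; for example (adjust the host and port to
+your setup):
+
+```sh
+curl -s http://your.prometheus.ip:9090/api/v1/targets | grep -o '"health":"[^"]*"'
+```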
+
+---
+
+## Netdata support for prometheus
+
+> IMPORTANT: the format Netdata sends metrics to prometheus has changed since Netdata v1.6. The new format allows easier
+> queries for metrics and supports both `as collected` and normalized metrics.
+
+Before explaining the changes, we have to understand the key differences between Netdata and prometheus.
+
+### understanding Netdata metrics
+
+#### charts
+
+Each chart in Netdata has several properties (common to all its metrics):
+
+- `chart_id` - uniquely identifies a chart.
+
+- `chart_name` - a more human friendly name for `chart_id`, also unique.
+
+- `context` - this is the template of the chart. All disk I/O charts have the same context, all mysql requests charts
+ have the same context, etc. This is used for alarm templates to match all the charts they should be attached to.
+
+- `family` groups a set of charts together. It is used as the submenu of the dashboard.
+
+- `units` is the units for all the metrics attached to the chart.
+
+#### dimensions
+
+Then each Netdata chart contains metrics called `dimensions`. All the dimensions of a chart have the same units of
+measurement, and are contextually in the same category (ie. the metrics for disk bandwidth are `read` and `write` and
+they are both in the same chart).
+
+### Netdata data source
+
+Netdata can send metrics to prometheus from 3 data sources:
+
+- `as collected` or `raw` - this data source sends the metrics to prometheus as they are collected. No conversion is
+ done by Netdata. The latest value for each metric is just given to prometheus. This is the most preferred method by
+ prometheus, but it is also the harder to work with. To work with this data source, you will need to understand how
+ to get meaningful values out of them.
+
+ The format of the metrics is: `CONTEXT{chart="CHART",family="FAMILY",dimension="DIMENSION"}`.
+
+  If the metric is a counter (`incremental` in Netdata lingo), `_total` is appended to the context.
+
+ Unlike prometheus, Netdata allows each dimension of a chart to have a different algorithm and conversion constants
+  (`multiplier` and `divisor`). When the dimensions of a chart are heterogeneous, Netdata will use this
+ format: `CONTEXT_DIMENSION{chart="CHART",family="FAMILY"}`
+
+- `average` - this data source uses the Netdata database to send the metrics to prometheus as they are presented on
+ the Netdata dashboard. So, all the metrics are sent as gauges, at the units they are presented in the Netdata
+ dashboard charts. This is the easiest to work with.
+
+ The format of the metrics is: `CONTEXT_UNITS_average{chart="CHART",family="FAMILY",dimension="DIMENSION"}`.
+
+ When this source is used, Netdata keeps track of the last access time for each prometheus server fetching the
+ metrics. This last access time is used at the subsequent queries of the same prometheus server to identify the
+ time-frame the `average` will be calculated.
+
+ So, no matter how frequently prometheus scrapes Netdata, it will get all the database data.
+ To identify each prometheus server, Netdata uses by default the IP of the client fetching the metrics.
+
+ If there are multiple prometheus servers fetching data from the same Netdata, using the same IP, each prometheus
+ server can append `server=NAME` to the URL. Netdata will use this `NAME` to uniquely identify the prometheus server.
+
+- `sum` or `volume`, is like `average` but instead of averaging the values, it sums them.
+
+ The format of the metrics is: `CONTEXT_UNITS_sum{chart="CHART",family="FAMILY",dimension="DIMENSION"}`. All the
+ other operations are the same with `average`.
+
+ To change the data source to `sum` or `as-collected` you need to provide the `source` parameter in the request URL.
+ e.g.: `http://your.netdata.ip:19999/api/v1/allmetrics?format=prometheus&help=yes&source=as-collected`
+
+ Keep in mind that early versions of Netdata were sending the metrics as: `CHART_DIMENSION{}`.
+
+### Querying Metrics
+
+Fetch with your web browser this URL:
+
+`http://your.netdata.ip:19999/api/v1/allmetrics?format=prometheus&help=yes`
+
+_(replace `your.netdata.ip` with the ip or hostname of your Netdata server)_
+
+Netdata will respond with all the metrics it sends to prometheus.
+
+If you search that page for `"system.cpu"` you will find all the metrics Netdata is exporting to prometheus for this
+chart. `system.cpu` is the chart name on the Netdata dashboard (on the Netdata dashboard all charts have a text heading
+such as: `Total CPU utilization (system.cpu)`; what we are interested in here is the chart name: `system.cpu`).
+
+Searching for `"system.cpu"` reveals:
+
+```sh
+# COMMENT homogeneous chart "system.cpu", context "system.cpu", family "cpu", units "percentage"
+# COMMENT netdata_system_cpu_percentage_average: dimension "guest_nice", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="guest_nice"} 0.0000000 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "guest", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="guest"} 1.7837326 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "steal", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="steal"} 0.0000000 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "softirq", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="softirq"} 0.5275442 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "irq", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="irq"} 0.2260836 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "user", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="user"} 2.3362762 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "system", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="system"} 1.7961062 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "nice", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="nice"} 0.0000000 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "iowait", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="iowait"} 0.9671802 1500066662000
+# COMMENT netdata_system_cpu_percentage_average: dimension "idle", value is percentage, gauge, dt 1500066653 to 1500066662 inclusive
+netdata_system_cpu_percentage_average{chart="system.cpu",family="cpu",dimension="idle"} 92.3630770 1500066662000
+```
+
+_(Netdata response for `system.cpu` with source=`average`)_
+
+In `average` or `sum` data sources, all values are normalized and are reported to prometheus as gauges. Now, use the
+'expression' text form in prometheus. Begin to type the metrics we are looking for: `netdata_system_cpu`. You should see
+that the text form begins to auto-fill as prometheus knows about this metric.
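+
+As a worked example (assuming the default `average` source and the default `netdata` prefix), total non-idle CPU
+utilization per scraped instance can be queried with an expression like:
+
+```sh
+sum by (instance) (netdata_system_cpu_percentage_average{dimension!="idle"})
+```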
+
+If the data source was `as collected`, the response would be:
+
+```sh
+# COMMENT homogeneous chart "system.cpu", context "system.cpu", family "cpu", units "percentage"
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "guest_nice", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="guest_nice"} 0 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "guest", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="guest"} 63945 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "steal", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="steal"} 0 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "softirq", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="softirq"} 8295 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "irq", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="irq"} 4079 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "user", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="user"} 116488 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "system", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="system"} 35084 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "nice", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="nice"} 505 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "iowait", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="iowait"} 23314 1500066716438
+# COMMENT netdata_system_cpu_total: chart "system.cpu", context "system.cpu", family "cpu", dimension "idle", value * 1 / 1 delta gives percentage (counter)
+netdata_system_cpu_total{chart="system.cpu",family="cpu",dimension="idle"} 918470 1500066716438
+```
+
+_(Netdata response for `system.cpu` with source=`as-collected`)_
+
+For more information, check the prometheus documentation.
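+
+As a rough illustration (assuming the `as collected` source and the default prefix), these counters can be turned into
+a per-second rate of non-idle CPU time with an expression such as:
+
+```sh
+sum by (instance) (rate(netdata_system_cpu_total{dimension!="idle"}[1m]))
+```
+
+Since the values are raw counters, the result is a rate in the units the kernel reports, not a percentage.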
+
+### Streaming data from upstream hosts
+
+The `format=prometheus` parameter only exports the host's own Netdata metrics. If you are using the parent-child
+functionality of Netdata, this ignores any upstream hosts, so you should consider using the following in your
+**prometheus.yml**:
+
+```yaml
+ metrics_path: '/api/v1/allmetrics'
+ params:
+ format: [prometheus_all_hosts]
+ honor_labels: true
+```
+
+This will report all upstream host data, and `honor_labels` will make Prometheus take note of the instance names
+provided.
+
+### Timestamps
+
+To pass the metrics through prometheus pushgateway, Netdata supports the option `&timestamps=no` to send the metrics
+without timestamps.
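+
+A minimal sketch of such a push, assuming a Pushgateway reachable at the hypothetical address
+`pushgateway.example.com:9091`, could look like:
+
+```sh
+curl -s 'http://localhost:19999/api/v1/allmetrics?format=prometheus&timestamps=no' | \
+  curl --data-binary @- http://pushgateway.example.com:9091/metrics/job/netdata
+```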
+
+## Netdata host variables
+
+Netdata collects various system configuration metrics, like the max number of TCP sockets supported, the max number of
+files allowed system-wide, various IPC sizes, etc. These metrics are not exposed to prometheus by default.
+
+To expose them, append `variables=yes` to the Netdata URL.
+
+### TYPE and HELP
+
+To save bandwidth, and because prometheus does not use them anyway, `# TYPE` and `# HELP` lines are suppressed. If
+wanted they can be re-enabled via `types=yes` and `help=yes`, e.g.
+`/api/v1/allmetrics?format=prometheus&types=yes&help=yes`
+
+Note that if enabled, the `# TYPE` and `# HELP` lines are repeated for every occurrence of a metric, which goes against the Prometheus documentation's [specification for these lines](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#comments-help-text-and-type-information).
+
+### Names and IDs
+
+Netdata supports names and IDs for charts and dimensions. Usually IDs are unique identifiers as read by the system and
+names are human friendly labels (also unique).
+
+Most charts and metrics have the same ID and name, but in several cases they are different: disks with device-mapper,
+interrupts, QoS classes, statsd synthetic charts, etc.
+
+The default is controlled in `netdata.conf`:
+
+```conf
+[backend]
+ send names instead of ids = yes | no
+```
+
+You can overwrite it from prometheus, by appending to the URL:
+
+- `&names=no` to get IDs (the old behaviour)
+- `&names=yes` to get names
+
+### Filtering metrics sent to prometheus
+
+Netdata can filter the metrics it sends to prometheus with this setting:
+
+```conf
+[backend]
+ send charts matching = *
+```
+
+This setting accepts a space-separated list of patterns to match the **charts** to be sent to prometheus. Each pattern
+can use `*` as a wildcard, any number of times (e.g. `*a*b*c*` is valid). Patterns starting with `!` give a negative
+match (e.g. `!*.bad users.* groups.*` will send all the users and groups except the `bad` user and `bad` group). The
+order is important: the first match (positive or negative), left to right, is used.
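+
+For instance, a hypothetical configuration that exports only system CPU, application CPU and disk charts would look
+like:
+
+```conf
+[backend]
+    send charts matching = system.cpu apps.cpu disk.*
+```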
+
+### Changing the prefix of Netdata metrics
+
+Netdata sends all metrics prefixed with `netdata_`. You can change this in `netdata.conf`, like this:
+
+```conf
+[backend]
+ prefix = netdata
+```
+
+It can also be changed from the URL, by appending `&prefix=netdata`.
+
+### Metric Units
+
+The default source `average` adds the unit of measurement to the name of each metric (e.g. `_KiB_persec`). To hide the
+units and get the same metric names as with the other sources, append to the URL `&hideunits=yes`.
+
+The units were standardized in v1.12, with the effect of changing the metric names. To get the metric names as they
+were before v1.12, append `&oldunits=yes` to the URL.
+
+### Accuracy of `average` and `sum` data sources
+
+When the data source is set to `average` or `sum`, Netdata remembers the last access of each client accessing prometheus
+metrics and uses this last access time to respond with the `average` or `sum` of all the entries in the database since
+that access. This means that prometheus servers do not lose data when they access Netdata with data source = `average`
+or `sum`.
+
+To uniquely identify each prometheus server, Netdata uses the IP of the client accessing the metrics. If however the IP
+is not good enough for identifying a single prometheus server (e.g. when prometheus servers are accessing Netdata
+through a web proxy, or when multiple prometheus servers are NATed to a single IP), each prometheus may append
+`&server=NAME` to the URL. This `NAME` is used by Netdata to uniquely identify each prometheus server and keep track of
+its last access time.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fbackends%2Fprometheus%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/backends/prometheus/backend_prometheus.c b/backends/prometheus/backend_prometheus.c
new file mode 100644
index 0000000..a3ecf16
--- /dev/null
+++ b/backends/prometheus/backend_prometheus.c
@@ -0,0 +1,797 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#define BACKENDS_INTERNALS
+#include "backend_prometheus.h"
+
+// ----------------------------------------------------------------------------
+// PROMETHEUS
+// /api/v1/allmetrics?format=prometheus and /api/v1/allmetrics?format=prometheus_all_hosts
+
+static struct prometheus_server {
+ const char *server;
+ uint32_t hash;
+ RRDHOST *host;
+ time_t last_access;
+ struct prometheus_server *next;
+} *prometheus_server_root = NULL;
+
+static inline time_t prometheus_server_last_access(const char *server, RRDHOST *host, time_t now) {
+ static netdata_mutex_t prometheus_server_root_mutex = NETDATA_MUTEX_INITIALIZER;
+
+ uint32_t hash = simple_hash(server);
+
+ netdata_mutex_lock(&prometheus_server_root_mutex);
+
+ struct prometheus_server *ps;
+ for(ps = prometheus_server_root; ps ;ps = ps->next) {
+ if (host == ps->host && hash == ps->hash && !strcmp(server, ps->server)) {
+ time_t last = ps->last_access;
+ ps->last_access = now;
+ netdata_mutex_unlock(&prometheus_server_root_mutex);
+ return last;
+ }
+ }
+
+ ps = callocz(1, sizeof(struct prometheus_server));
+ ps->server = strdupz(server);
+ ps->hash = hash;
+ ps->host = host;
+ ps->last_access = now;
+ ps->next = prometheus_server_root;
+ prometheus_server_root = ps;
+
+ netdata_mutex_unlock(&prometheus_server_root_mutex);
+ return 0;
+}
+
+static inline size_t backends_prometheus_name_copy(char *d, const char *s, size_t usable) {
+ size_t n;
+
+ for(n = 0; *s && n < usable ; d++, s++, n++) {
+ register char c = *s;
+
+ if(!isalnum(c)) *d = '_';
+ else *d = c;
+ }
+ *d = '\0';
+
+ return n;
+}
+
+static inline size_t backends_prometheus_label_copy(char *d, const char *s, size_t usable) {
+ size_t n;
+
+ // make sure we can escape one character without overflowing the buffer
+ usable--;
+
+ for(n = 0; *s && n < usable ; d++, s++, n++) {
+ register char c = *s;
+
+ if(unlikely(c == '"' || c == '\\' || c == '\n')) {
+ *d++ = '\\';
+ n++;
+ }
+ *d = c;
+ }
+ *d = '\0';
+
+ return n;
+}
+
+static inline char *backends_prometheus_units_copy(char *d, const char *s, size_t usable, int showoldunits) {
+ const char *sorig = s;
+ char *ret = d;
+ size_t n;
+
+ // Fix for issue 5227
+ if (unlikely(showoldunits)) {
+ static struct {
+ const char *newunit;
+ uint32_t hash;
+ const char *oldunit;
+ } units[] = {
+ {"KiB/s", 0, "kilobytes/s"}
+ , {"MiB/s", 0, "MB/s"}
+ , {"GiB/s", 0, "GB/s"}
+ , {"KiB" , 0, "KB"}
+ , {"MiB" , 0, "MB"}
+ , {"GiB" , 0, "GB"}
+ , {"inodes" , 0, "Inodes"}
+ , {"percentage" , 0, "percent"}
+ , {"faults/s" , 0, "page faults/s"}
+ , {"KiB/operation", 0, "kilobytes per operation"}
+ , {"milliseconds/operation", 0, "ms per operation"}
+ , {NULL, 0, NULL}
+ };
+ static int initialized = 0;
+ int i;
+
+ if(unlikely(!initialized)) {
+ for (i = 0; units[i].newunit; i++)
+ units[i].hash = simple_hash(units[i].newunit);
+ initialized = 1;
+ }
+
+ uint32_t hash = simple_hash(s);
+ for(i = 0; units[i].newunit ; i++) {
+ if(unlikely(hash == units[i].hash && !strcmp(s, units[i].newunit))) {
+ // info("matched extension for filename '%s': '%s'", filename, last_dot);
+ s=units[i].oldunit;
+ sorig = s;
+ break;
+ }
+ }
+ }
+ *d++ = '_';
+ for(n = 1; *s && n < usable ; d++, s++, n++) {
+ register char c = *s;
+
+ if(!isalnum(c)) *d = '_';
+ else *d = c;
+ }
+
+ if(n == 2 && sorig[0] == '%') {
+ n = 0;
+ d = ret;
+ s = "_percent";
+ for( ; *s && n < usable ; n++) *d++ = *s++;
+ }
+ else if(n > 3 && sorig[n-3] == '/' && sorig[n-2] == 's') {
+ n = n - 2;
+ d -= 2;
+ s = "_persec";
+ for( ; *s && n < usable ; n++) *d++ = *s++;
+ }
+
+ *d = '\0';
+
+ return ret;
+}
+
+
+#define PROMETHEUS_ELEMENT_MAX 256
+#define PROMETHEUS_LABELS_MAX 1024
+#define PROMETHEUS_VARIABLE_MAX 256
+
+#define PROMETHEUS_LABELS_MAX_NUMBER 128
+
+struct host_variables_callback_options {
+ RRDHOST *host;
+ BUFFER *wb;
+ BACKEND_OPTIONS backend_options;
+ BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options;
+ const char *prefix;
+ const char *labels;
+ time_t now;
+ int host_header_printed;
+ char name[PROMETHEUS_VARIABLE_MAX+1];
+};
+
+static int print_host_variables(RRDVAR *rv, void *data) {
+ struct host_variables_callback_options *opts = data;
+
+ if(rv->options & (RRDVAR_OPTION_CUSTOM_HOST_VAR|RRDVAR_OPTION_CUSTOM_CHART_VAR)) {
+ if(!opts->host_header_printed) {
+ opts->host_header_printed = 1;
+
+ if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP) {
+ buffer_sprintf(opts->wb, "\n# COMMENT global host and chart variables\n");
+ }
+ }
+
+ calculated_number value = rrdvar2number(rv);
+ if(isnan(value) || isinf(value)) {
+ if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP)
+ buffer_sprintf(opts->wb, "# COMMENT variable \"%s\" is %s. Skipped.\n", rv->name, (isnan(value))?"NAN":"INF");
+
+ return 0;
+ }
+
+ char *label_pre = "";
+ char *label_post = "";
+ if(opts->labels && *opts->labels) {
+ label_pre = "{";
+ label_post = "}";
+ }
+
+ backends_prometheus_name_copy(opts->name, rv->name, sizeof(opts->name));
+
+ if(opts->output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(opts->wb
+ , "%s_%s%s%s%s " CALCULATED_NUMBER_FORMAT " %llu\n"
+ , opts->prefix
+ , opts->name
+ , label_pre
+ , opts->labels
+ , label_post
+ , value
+ , opts->now * 1000ULL
+ );
+ else
+ buffer_sprintf(opts->wb, "%s_%s%s%s%s " CALCULATED_NUMBER_FORMAT "\n"
+ , opts->prefix
+ , opts->name
+ , label_pre
+ , opts->labels
+ , label_post
+ , value
+ );
+
+ return 1;
+ }
+
+ return 0;
+}
+
+static void rrd_stats_api_v1_charts_allmetrics_prometheus(RRDHOST *host, BUFFER *wb, const char *prefix, BACKEND_OPTIONS backend_options, time_t after, time_t before, int allhosts, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) {
+ rrdhost_rdlock(host);
+
+ char hostname[PROMETHEUS_ELEMENT_MAX + 1];
+ backends_prometheus_label_copy(hostname, host->hostname, PROMETHEUS_ELEMENT_MAX);
+
+ char labels[PROMETHEUS_LABELS_MAX + 1] = "";
+ if(allhosts) {
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1 %llu\n", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS);
+ else
+ buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1\n", hostname, host->program_name, host->program_version);
+
+ if(host->tags && *(host->tags)) {
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) {
+ buffer_sprintf(wb, "netdata_host_tags_info{instance=\"%s\",%s} 1 %llu\n", hostname, host->tags, now_realtime_usec() / USEC_PER_MS);
+
+ // deprecated, exists only for compatibility with older queries
+ buffer_sprintf(wb, "netdata_host_tags{instance=\"%s\",%s} 1 %llu\n", hostname, host->tags, now_realtime_usec() / USEC_PER_MS);
+ }
+ else {
+ buffer_sprintf(wb, "netdata_host_tags_info{instance=\"%s\",%s} 1\n", hostname, host->tags);
+
+ // deprecated, exists only for compatibility with older queries
+ buffer_sprintf(wb, "netdata_host_tags{instance=\"%s\",%s} 1\n", hostname, host->tags);
+ }
+
+ }
+
+ snprintfz(labels, PROMETHEUS_LABELS_MAX, ",instance=\"%s\"", hostname);
+ }
+ else {
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1 %llu\n", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS);
+ else
+ buffer_sprintf(wb, "netdata_info{instance=\"%s\",application=\"%s\",version=\"%s\"} 1\n", hostname, host->program_name, host->program_version);
+
+ if(host->tags && *(host->tags)) {
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS) {
+ buffer_sprintf(wb, "netdata_host_tags_info{%s} 1 %llu\n", host->tags, now_realtime_usec() / USEC_PER_MS);
+
+ // deprecated, exists only for compatibility with older queries
+ buffer_sprintf(wb, "netdata_host_tags{%s} 1 %llu\n", host->tags, now_realtime_usec() / USEC_PER_MS);
+ }
+ else {
+ buffer_sprintf(wb, "netdata_host_tags_info{%s} 1\n", host->tags);
+
+ // deprecated, exists only for compatibility with older queries
+ buffer_sprintf(wb, "netdata_host_tags{%s} 1\n", host->tags);
+ }
+ }
+ }
+
+ // send custom variables set for the host
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_VARIABLES){
+ struct host_variables_callback_options opts = {
+ .host = host,
+ .wb = wb,
+ .labels = (labels[0] == ',')?&labels[1]:labels,
+ .backend_options = backend_options,
+ .output_options = output_options,
+ .prefix = prefix,
+ .now = now_realtime_sec(),
+ .host_header_printed = 0
+ };
+ foreach_host_variable_callback(host, print_host_variables, &opts);
+ }
+
+ // for each chart
+ RRDSET *st;
+ rrdset_foreach_read(st, host) {
+ char chart[PROMETHEUS_ELEMENT_MAX + 1];
+ char context[PROMETHEUS_ELEMENT_MAX + 1];
+ char family[PROMETHEUS_ELEMENT_MAX + 1];
+ char units[PROMETHEUS_ELEMENT_MAX + 1] = "";
+
+ backends_prometheus_label_copy(chart, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name)?st->name:st->id, PROMETHEUS_ELEMENT_MAX);
+ backends_prometheus_label_copy(family, st->family, PROMETHEUS_ELEMENT_MAX);
+ backends_prometheus_name_copy(context, st->context, PROMETHEUS_ELEMENT_MAX);
+
+ if(likely(backends_can_send_rrdset(backend_options, st))) {
+ rrdset_rdlock(st);
+
+ int as_collected = (BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED);
+ int homogeneous = 1;
+ if(as_collected) {
+ if(rrdset_flag_check(st, RRDSET_FLAG_HOMOGENEOUS_CHECK))
+ rrdset_update_heterogeneous_flag(st);
+
+ if(rrdset_flag_check(st, RRDSET_FLAG_HETEROGENEOUS))
+ homogeneous = 0;
+ }
+ else {
+ if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE && !(output_options & BACKENDS_PROMETHEUS_OUTPUT_HIDEUNITS))
+ backends_prometheus_units_copy(units, st->units, PROMETHEUS_ELEMENT_MAX, output_options & BACKENDS_PROMETHEUS_OUTPUT_OLDUNITS);
+ }
+
+ if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP))
+ buffer_sprintf(wb, "\n# COMMENT %s chart \"%s\", context \"%s\", family \"%s\", units \"%s\"\n"
+ , (homogeneous)?"homogeneous":"heterogeneous"
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id
+ , st->context
+ , st->family
+ , st->units
+ );
+
+ // for each dimension
+ RRDDIM *rd;
+ rrddim_foreach_read(rd, st) {
+ if(rd->collections_counter && !rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) {
+ char dimension[PROMETHEUS_ELEMENT_MAX + 1];
+ char *suffix = "";
+
+ if (as_collected) {
+ // we need as-collected / raw data
+
+ if(unlikely(rd->last_collected_time.tv_sec < after))
+ continue;
+
+ const char *t = "gauge", *h = "gives";
+ if(rd->algorithm == RRD_ALGORITHM_INCREMENTAL ||
+ rd->algorithm == RRD_ALGORITHM_PCENT_OVER_DIFF_TOTAL) {
+ t = "counter";
+ h = "delta gives";
+ suffix = "_total";
+ }
+
+ if(homogeneous) {
+ // all the dimensions of the chart, has the same algorithm, multiplier and divisor
+ // we add all dimensions as labels
+
+ backends_prometheus_label_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+
+ if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP))
+ buffer_sprintf(wb
+ , "# COMMENT %s_%s%s: chart \"%s\", context \"%s\", family \"%s\", dimension \"%s\", value * " COLLECTED_NUMBER_FORMAT " / " COLLECTED_NUMBER_FORMAT " %s %s (%s)\n"
+ , prefix
+ , context
+ , suffix
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id
+ , st->context
+ , st->family
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id
+ , rd->multiplier
+ , rd->divisor
+ , h
+ , st->units
+ , t
+ );
+
+ if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES))
+ buffer_sprintf(wb, "# TYPE %s_%s%s %s\n"
+ , prefix
+ , context
+ , suffix
+ , t
+ );
+
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(wb
+ , "%s_%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " COLLECTED_NUMBER_FORMAT " %llu\n"
+ , prefix
+ , context
+ , suffix
+ , chart
+ , family
+ , dimension
+ , labels
+ , rd->last_collected_value
+ , timeval_msec(&rd->last_collected_time)
+ );
+ else
+ buffer_sprintf(wb
+ , "%s_%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " COLLECTED_NUMBER_FORMAT "\n"
+ , prefix
+ , context
+ , suffix
+ , chart
+ , family
+ , dimension
+ , labels
+ , rd->last_collected_value
+ );
+ }
+ else {
+ // the dimensions of the chart, do not have the same algorithm, multiplier or divisor
+ // we create a metric per dimension
+
+ backends_prometheus_name_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+
+ if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP))
+ buffer_sprintf(wb
+ , "# COMMENT %s_%s_%s%s: chart \"%s\", context \"%s\", family \"%s\", dimension \"%s\", value * " COLLECTED_NUMBER_FORMAT " / " COLLECTED_NUMBER_FORMAT " %s %s (%s)\n"
+ , prefix
+ , context
+ , dimension
+ , suffix
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && st->name) ? st->name : st->id
+ , st->context
+ , st->family
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id
+ , rd->multiplier
+ , rd->divisor
+ , h
+ , st->units
+ , t
+ );
+
+ if(unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES))
+ buffer_sprintf(wb, "# TYPE %s_%s_%s%s %s\n"
+ , prefix
+ , context
+ , dimension
+ , suffix
+ , t
+ );
+
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(wb
+ , "%s_%s_%s%s{chart=\"%s\",family=\"%s\"%s} " COLLECTED_NUMBER_FORMAT " %llu\n"
+ , prefix
+ , context
+ , dimension
+ , suffix
+ , chart
+ , family
+ , labels
+ , rd->last_collected_value
+ , timeval_msec(&rd->last_collected_time)
+ );
+ else
+ buffer_sprintf(wb
+ , "%s_%s_%s%s{chart=\"%s\",family=\"%s\"%s} " COLLECTED_NUMBER_FORMAT "\n"
+ , prefix
+ , context
+ , dimension
+ , suffix
+ , chart
+ , family
+ , labels
+ , rd->last_collected_value
+ );
+ }
+ }
+ else {
+ // we need average or sum of the data
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ if(!isnan(value) && !isinf(value)) {
+
+ if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE)
+ suffix = "_average";
+ else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM)
+ suffix = "_sum";
+
+ backends_prometheus_label_copy(dimension, (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+
+ if (unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP))
+ buffer_sprintf(wb, "# COMMENT %s_%s%s%s: dimension \"%s\", value is %s, gauge, dt %llu to %llu inclusive\n"
+ , prefix
+ , context
+ , units
+ , suffix
+ , (output_options & BACKENDS_PROMETHEUS_OUTPUT_NAMES && rd->name) ? rd->name : rd->id
+ , st->units
+ , (unsigned long long)first_t
+ , (unsigned long long)last_t
+ );
+
+ if (unlikely(output_options & BACKENDS_PROMETHEUS_OUTPUT_TYPES))
+ buffer_sprintf(wb, "# TYPE %s_%s%s%s gauge\n"
+ , prefix
+ , context
+ , units
+ , suffix
+ );
+
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS)
+ buffer_sprintf(wb, "%s_%s%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " CALCULATED_NUMBER_FORMAT " %llu\n"
+ , prefix
+ , context
+ , units
+ , suffix
+ , chart
+ , family
+ , dimension
+ , labels
+ , value
+ , last_t * MSEC_PER_SEC
+ );
+ else
+ buffer_sprintf(wb, "%s_%s%s%s{chart=\"%s\",family=\"%s\",dimension=\"%s\"%s} " CALCULATED_NUMBER_FORMAT "\n"
+ , prefix
+ , context
+ , units
+ , suffix
+ , chart
+ , family
+ , dimension
+ , labels
+ , value
+ );
+ }
+ }
+ }
+ }
+
+ rrdset_unlock(st);
+ }
+ }
+
+ rrdhost_unlock(host);
+}
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+inline static void remote_write_split_words(char *str, char **words, int max_words) {
+ char *s = str;
+ int i = 0;
+
+ while(*s && i < max_words - 1) {
+ while(*s && isspace(*s)) s++; // skip spaces to the begining of a tag name
+
+ if(*s)
+ words[i] = s;
+
+ while(*s && !isspace(*s) && *s != '=') s++; // find the end of the tag name
+
+ if(*s != '=') {
+ words[i] = NULL;
+ break;
+ }
+ *s = '\0';
+ s++;
+ i++;
+
+ while(*s && isspace(*s)) s++; // skip spaces to the begining of a tag value
+
+ if(*s && *s == '"') s++; // strip an opening quote
+ if(*s)
+ words[i] = s;
+
+ while(*s && !isspace(*s) && *s != ',') s++; // find the end of the tag value
+
+ if(*s && *s != ',') {
+ words[i] = NULL;
+ break;
+ }
+ if(s != words[i] && *(s - 1) == '"') *(s - 1) = '\0'; // strip a closing quote
+ if(*s != '\0') {
+ *s = '\0';
+ s++;
+ i++;
+ }
+ }
+}
+
+void backends_rrd_stats_remote_write_allmetrics_prometheus(
+ RRDHOST *host
+ , const char *__hostname
+ , const char *prefix
+ , BACKEND_OPTIONS backend_options
+ , time_t after
+ , time_t before
+ , size_t *count_charts
+ , size_t *count_dims
+ , size_t *count_dims_skipped
+) {
+ char hostname[PROMETHEUS_ELEMENT_MAX + 1];
+ backends_prometheus_label_copy(hostname, __hostname, PROMETHEUS_ELEMENT_MAX);
+
+ backends_add_host_info("netdata_info", hostname, host->program_name, host->program_version, now_realtime_usec() / USEC_PER_MS);
+
+ if(host->tags && *(host->tags)) {
+ char tags[PROMETHEUS_LABELS_MAX + 1];
+ strncpy(tags, host->tags, PROMETHEUS_LABELS_MAX);
+ char *words[PROMETHEUS_LABELS_MAX_NUMBER] = {NULL};
+ int i;
+
+ remote_write_split_words(tags, words, PROMETHEUS_LABELS_MAX_NUMBER);
+
+ backends_add_host_info("netdata_host_tags_info", hostname, NULL, NULL, now_realtime_usec() / USEC_PER_MS);
+
+ for(i = 0; words[i] != NULL && words[i + 1] != NULL && (i + 1) < PROMETHEUS_LABELS_MAX_NUMBER; i += 2) {
+ backends_add_tag(words[i], words[i + 1]);
+ }
+ }
+
+ // for each chart
+ RRDSET *st;
+ rrdset_foreach_read(st, host) {
+ char chart[PROMETHEUS_ELEMENT_MAX + 1];
+ char context[PROMETHEUS_ELEMENT_MAX + 1];
+ char family[PROMETHEUS_ELEMENT_MAX + 1];
+ char units[PROMETHEUS_ELEMENT_MAX + 1] = "";
+
+ backends_prometheus_label_copy(chart, (backend_options & BACKEND_OPTION_SEND_NAMES && st->name)?st->name:st->id, PROMETHEUS_ELEMENT_MAX);
+ backends_prometheus_label_copy(family, st->family, PROMETHEUS_ELEMENT_MAX);
+ backends_prometheus_name_copy(context, st->context, PROMETHEUS_ELEMENT_MAX);
+
+ if(likely(backends_can_send_rrdset(backend_options, st))) {
+ rrdset_rdlock(st);
+
+ (*count_charts)++;
+
+ int as_collected = (BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED);
+ int homogeneous = 1;
+ if(as_collected) {
+ if(rrdset_flag_check(st, RRDSET_FLAG_HOMOGENEOUS_CHECK))
+ rrdset_update_heterogeneous_flag(st);
+
+ if(rrdset_flag_check(st, RRDSET_FLAG_HETEROGENEOUS))
+ homogeneous = 0;
+ }
+ else {
+ if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE)
+ backends_prometheus_units_copy(units, st->units, PROMETHEUS_ELEMENT_MAX, 0);
+ }
+
+ // for each dimension
+ RRDDIM *rd;
+ rrddim_foreach_read(rd, st) {
+ if(rd->collections_counter && !rrddim_flag_check(rd, RRDDIM_FLAG_OBSOLETE)) {
+ char name[PROMETHEUS_LABELS_MAX + 1];
+ char dimension[PROMETHEUS_ELEMENT_MAX + 1];
+ char *suffix = "";
+
+ if (as_collected) {
+ // we need as-collected / raw data
+
+ if(unlikely(rd->last_collected_time.tv_sec < after)) {
+ debug(D_BACKEND, "BACKEND: not sending dimension '%s' of chart '%s' from host '%s', its last data collection (%lu) is not within our timeframe (%lu to %lu)", rd->id, st->id, __hostname, (unsigned long)rd->last_collected_time.tv_sec, (unsigned long)after, (unsigned long)before);
+ (*count_dims_skipped)++;
+ continue;
+ }
+
+ if(homogeneous) {
+ // all the dimensions of the chart, has the same algorithm, multiplier and divisor
+ // we add all dimensions as labels
+
+ backends_prometheus_label_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+ snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s%s", prefix, context, suffix);
+
+ backends_add_metric(name, chart, family, dimension, hostname, rd->last_collected_value, timeval_msec(&rd->last_collected_time));
+ (*count_dims)++;
+ }
+ else {
+ // the dimensions of the chart, do not have the same algorithm, multiplier or divisor
+ // we create a metric per dimension
+
+ backends_prometheus_name_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+ snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s_%s%s", prefix, context, dimension, suffix);
+
+ backends_add_metric(name, chart, family, NULL, hostname, rd->last_collected_value, timeval_msec(&rd->last_collected_time));
+ (*count_dims)++;
+ }
+ }
+ else {
+ // we need average or sum of the data
+
+ time_t first_t = after, last_t = before;
+ calculated_number value = backend_calculate_value_from_stored_data(st, rd, after, before, backend_options, &first_t, &last_t);
+
+ if(!isnan(value) && !isinf(value)) {
+
+ if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE)
+ suffix = "_average";
+ else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM)
+ suffix = "_sum";
+
+ backends_prometheus_label_copy(dimension, (backend_options & BACKEND_OPTION_SEND_NAMES && rd->name) ? rd->name : rd->id, PROMETHEUS_ELEMENT_MAX);
+ snprintf(name, PROMETHEUS_LABELS_MAX, "%s_%s%s%s", prefix, context, units, suffix);
+
+ backends_add_metric(name, chart, family, dimension, hostname, value, last_t * MSEC_PER_SEC);
+ (*count_dims)++;
+ }
+ }
+ }
+ }
+
+ rrdset_unlock(st);
+ }
+ }
+}
+#endif /* ENABLE_PROMETHEUS_REMOTE_WRITE */
+
+static inline time_t prometheus_preparation(RRDHOST *host, BUFFER *wb, BACKEND_OPTIONS backend_options, const char *server, time_t now, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) {
+ if(!server || !*server) server = "default";
+
+ time_t after = prometheus_server_last_access(server, host, now);
+
+ int first_seen = 0;
+ if(!after) {
+ after = now - global_backend_update_every;
+ first_seen = 1;
+ }
+
+ if(after > now) {
+ // oops! this should never happen
+ after = now - global_backend_update_every;
+ }
+
+ if(output_options & BACKENDS_PROMETHEUS_OUTPUT_HELP) {
+ char *mode;
+ if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AS_COLLECTED)
+ mode = "as collected";
+ else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_AVERAGE)
+ mode = "average";
+ else if(BACKEND_OPTIONS_DATA_SOURCE(backend_options) == BACKEND_SOURCE_DATA_SUM)
+ mode = "sum";
+ else
+ mode = "unknown";
+
+ buffer_sprintf(wb, "# COMMENT netdata \"%s\" to %sprometheus \"%s\", source \"%s\", last seen %lu %s, time range %lu to %lu\n\n"
+ , host->hostname
+ , (first_seen)?"FIRST SEEN ":""
+ , server
+ , mode
+ , (unsigned long)((first_seen)?0:(now - after))
+ , (first_seen)?"never":"seconds ago"
+ , (unsigned long)after, (unsigned long)now
+ );
+ }
+
+ return after;
+}
+
+void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_single_host(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) {
+ time_t before = now_realtime_sec();
+
+ // we start at the point we had stopped before
+ time_t after = prometheus_preparation(host, wb, backend_options, server, before, output_options);
+
+ rrd_stats_api_v1_charts_allmetrics_prometheus(host, wb, prefix, backend_options, after, before, 0, output_options);
+}
+
+void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_all_hosts(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options) {
+ time_t before = now_realtime_sec();
+
+ // we start at the point we had stopped before
+ time_t after = prometheus_preparation(host, wb, backend_options, server, before, output_options);
+
+ rrd_rdlock();
+ rrdhost_foreach_read(host) {
+ rrd_stats_api_v1_charts_allmetrics_prometheus(host, wb, prefix, backend_options, after, before, 1, output_options);
+ }
+ rrd_unlock();
+}
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+int backends_process_prometheus_remote_write_response(BUFFER *b) {
+ if(unlikely(!b)) return 1;
+
+ const char *s = buffer_tostring(b);
+ int len = buffer_strlen(b);
+
+ // do nothing with HTTP responses 200 or 204
+
+ while(!isspace(*s) && len) {
+ s++;
+ len--;
+ }
+ s++;
+ len--;
+
+ if(likely(len > 4 && (!strncmp(s, "200 ", 4) || !strncmp(s, "204 ", 4))))
+ return 0;
+ else
+ return discard_response(b, "prometheus remote write");
+}
+#endif
diff --git a/backends/prometheus/backend_prometheus.h b/backends/prometheus/backend_prometheus.h
new file mode 100644
index 0000000..8c14ddc
--- /dev/null
+++ b/backends/prometheus/backend_prometheus.h
@@ -0,0 +1,37 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_PROMETHEUS_H
+#define NETDATA_BACKEND_PROMETHEUS_H 1
+
+#include "backends/backends.h"
+
+typedef enum backends_prometheus_output_flags {
+ BACKENDS_PROMETHEUS_OUTPUT_NONE = 0,
+ BACKENDS_PROMETHEUS_OUTPUT_HELP = (1 << 0),
+ BACKENDS_PROMETHEUS_OUTPUT_TYPES = (1 << 1),
+ BACKENDS_PROMETHEUS_OUTPUT_NAMES = (1 << 2),
+ BACKENDS_PROMETHEUS_OUTPUT_TIMESTAMPS = (1 << 3),
+ BACKENDS_PROMETHEUS_OUTPUT_VARIABLES = (1 << 4),
+ BACKENDS_PROMETHEUS_OUTPUT_OLDUNITS = (1 << 5),
+ BACKENDS_PROMETHEUS_OUTPUT_HIDEUNITS = (1 << 6)
+} BACKENDS_PROMETHEUS_OUTPUT_OPTIONS;
+
+extern void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_single_host(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options);
+extern void backends_rrd_stats_api_v1_charts_allmetrics_prometheus_all_hosts(RRDHOST *host, BUFFER *wb, const char *server, const char *prefix, BACKEND_OPTIONS backend_options, BACKENDS_PROMETHEUS_OUTPUT_OPTIONS output_options);
+
+#if ENABLE_PROMETHEUS_REMOTE_WRITE
+extern void backends_rrd_stats_remote_write_allmetrics_prometheus(
+ RRDHOST *host
+ , const char *__hostname
+ , const char *prefix
+ , BACKEND_OPTIONS backend_options
+ , time_t after
+ , time_t before
+ , size_t *count_charts
+ , size_t *count_dims
+ , size_t *count_dims_skipped
+);
+extern int backends_process_prometheus_remote_write_response(BUFFER *b);
+#endif
+
+#endif //NETDATA_BACKEND_PROMETHEUS_H
diff --git a/backends/prometheus/remote_write/Makefile.am b/backends/prometheus/remote_write/Makefile.am
new file mode 100644
index 0000000..d049ef4
--- /dev/null
+++ b/backends/prometheus/remote_write/Makefile.am
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-3.0-or-later
+
+AUTOMAKE_OPTIONS = subdir-objects
+MAINTAINERCLEANFILES = $(srcdir)/Makefile.in
+
+CLEANFILES = \
+ remote_write.pb.cc \
+ remote_write.pb.h \
+ $(NULL)
+
+dist_noinst_DATA = \
+ remote_write.proto \
+ README.md \
+ $(NULL)
diff --git a/backends/prometheus/remote_write/README.md b/backends/prometheus/remote_write/README.md
new file mode 100644
index 0000000..b83575e
--- /dev/null
+++ b/backends/prometheus/remote_write/README.md
@@ -0,0 +1,41 @@
+<!--
+title: "Prometheus remote write backend"
+custom_edit_url: https://github.com/netdata/netdata/edit/master/backends/prometheus/remote_write/README.md
+-->
+
+# Prometheus remote write backend
+
+## Prerequisites
+
+To use the Prometheus remote write API with [storage
+providers](https://prometheus.io/docs/operating/integrations/#remote-endpoints-and-storage), the
+[protobuf](https://developers.google.com/protocol-buffers/) and [snappy](https://github.com/google/snappy) libraries
+must be installed first. Next, Netdata should be reinstalled from source; the installer will detect that the
+required libraries and utilities are now available.
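+
+Once Netdata is rebuilt, the backend can be enabled in the `[backend]` section of `netdata.conf`. A minimal sketch;
+the destination host and port are placeholders (reused from the example further below):
+
+```conf
+[backend]
+ enabled = yes
+ type = prometheus_remote_write
+ destination = example.domain:example_port
+```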
+
+## Configuration
+
+An additional option in the backend configuration section is available for the remote write backend:
+
+```conf
+[backend]
+ remote write URL path = /receive
+```
+
+The default value is `/receive`. `remote write URL path` sets the endpoint path for the remote write protocol.
+For example, if your endpoint is `http://example.domain:example_port/storage/read`, you should set:
+
+```conf
+[backend]
+ destination = example.domain:example_port
+ remote write URL path = /storage/read
+```
+
+The `buffered` and `lost` dimensions of the Netdata Backend Data Size operation monitoring chart estimate the
+uncompressed buffer size when sends fail.
+
+## Notes
+
+The remote write backend does not support the `buffer on failures` option.
+
diff --git a/backends/prometheus/remote_write/remote_write.cc b/backends/prometheus/remote_write/remote_write.cc
new file mode 100644
index 0000000..9448595
--- /dev/null
+++ b/backends/prometheus/remote_write/remote_write.cc
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#include <snappy.h>
+#include "remote_write.pb.h"
+#include "remote_write.h"
+
+using namespace prometheus;
+
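+// a single arena-allocated WriteRequest is reused for every send; backends_clear_write_request() empties it between sends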
+static google::protobuf::Arena arena;
+static WriteRequest *write_request;
+
+void backends_init_write_request() {
+ GOOGLE_PROTOBUF_VERIFY_VERSION;
+ write_request = google::protobuf::Arena::CreateMessage<WriteRequest>(&arena);
+}
+
+void backends_clear_write_request() {
+ write_request->clear_timeseries();
+}
+
+void backends_add_host_info(const char *name, const char *instance, const char *application, const char *version, const int64_t timestamp) {
+ TimeSeries *timeseries;
+ Sample *sample;
+ Label *label;
+
+ timeseries = write_request->add_timeseries();
+
+ label = timeseries->add_labels();
+ label->set_name("__name__");
+ label->set_value(name);
+
+ label = timeseries->add_labels();
+ label->set_name("instance");
+ label->set_value(instance);
+
+ if(application) {
+ label = timeseries->add_labels();
+ label->set_name("application");
+ label->set_value(application);
+ }
+
+ if(version) {
+ label = timeseries->add_labels();
+ label->set_name("version");
+ label->set_value(version);
+ }
+
+ sample = timeseries->add_samples();
+ sample->set_value(1);
+ sample->set_timestamp(timestamp);
+}
+
+// adds a tag (label) to the most recently added time series
+void backends_add_tag(char *tag, char *value) {
+ TimeSeries *timeseries;
+ Label *label;
+
+ timeseries = write_request->mutable_timeseries(write_request->timeseries_size() - 1);
+
+ label = timeseries->add_labels();
+ label->set_name(tag);
+ label->set_value(value);
+}
+
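+// adds a time series for a single chart dimension; "name" is the full prometheus metric name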
+void backends_add_metric(const char *name, const char *chart, const char *family, const char *dimension, const char *instance, const double value, const int64_t timestamp) {
+ TimeSeries *timeseries;
+ Sample *sample;
+ Label *label;
+
+ timeseries = write_request->add_timeseries();
+
+ label = timeseries->add_labels();
+ label->set_name("__name__");
+ label->set_value(name);
+
+ label = timeseries->add_labels();
+ label->set_name("chart");
+ label->set_value(chart);
+
+ label = timeseries->add_labels();
+ label->set_name("family");
+ label->set_value(family);
+
+ if(dimension) {
+ label = timeseries->add_labels();
+ label->set_name("dimension");
+ label->set_value(dimension);
+ }
+
+ label = timeseries->add_labels();
+ label->set_name("instance");
+ label->set_value(instance);
+
+ sample = timeseries->add_samples();
+ sample->set_value(value);
+ sample->set_timestamp(timestamp);
+}
+
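+// returns an upper bound on the snappy-compressed size of the current write request (0 if it would not fit in an int)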
+size_t backends_get_write_request_size() {
+#if GOOGLE_PROTOBUF_VERSION < 3001000
+ size_t size = (size_t)snappy::MaxCompressedLength(write_request->ByteSize());
+#else
+ size_t size = (size_t)snappy::MaxCompressedLength(write_request->ByteSizeLong());
+#endif
+
+ return (size < INT_MAX) ? size : 0;
+}
+
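+// serializes the write request and snappy-compresses it into "buffer", which must be at least
+// backends_get_write_request_size() bytes; "size" is set to the compressed length; returns 0 on success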
+int backends_pack_write_request(char *buffer, size_t *size) {
+ std::string uncompressed_write_request;
+ if(write_request->SerializeToString(&uncompressed_write_request) == false) return 1;
+
+ snappy::RawCompress(uncompressed_write_request.data(), uncompressed_write_request.size(), buffer, size);
+
+ return 0;
+}
+
+void backends_protocol_buffers_shutdown() {
+ google::protobuf::ShutdownProtobufLibrary();
+}
diff --git a/backends/prometheus/remote_write/remote_write.h b/backends/prometheus/remote_write/remote_write.h
new file mode 100644
index 0000000..1307d72
--- /dev/null
+++ b/backends/prometheus/remote_write/remote_write.h
@@ -0,0 +1,30 @@
+// SPDX-License-Identifier: GPL-3.0-or-later
+
+#ifndef NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H
+#define NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+void backends_init_write_request();
+
+void backends_clear_write_request();
+
+void backends_add_host_info(const char *name, const char *instance, const char *application, const char *version, const int64_t timestamp);
+
+void backends_add_tag(char *tag, char *value);
+
+void backends_add_metric(const char *name, const char *chart, const char *family, const char *dimension, const char *instance, const double value, const int64_t timestamp);
+
+size_t backends_get_write_request_size();
+
+int backends_pack_write_request(char *buffer, size_t *size);
+
+void backends_protocol_buffers_shutdown();
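+
+/* A sketch of the typical call order (based only on the declarations above, not a strict contract):
+ *
+ *   backends_init_write_request();              // once, at backend startup
+ *   // on every flush:
+ *   backends_clear_write_request();
+ *   backends_add_host_info(...);                // one info series per host
+ *   backends_add_metric(...);                   // one series per dimension
+ *   size_t size = backends_get_write_request_size();
+ *   backends_pack_write_request(buffer, &size); // snappy-compressed protobuf
+ *   // ... POST the buffer to the remote write endpoint ...
+ *   backends_protocol_buffers_shutdown();       // once, at shutdown
+ */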
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif //NETDATA_BACKEND_PROMETHEUS_REMOTE_WRITE_H
diff --git a/backends/prometheus/remote_write/remote_write.proto b/backends/prometheus/remote_write/remote_write.proto
new file mode 100644
index 0000000..dfde254
--- /dev/null
+++ b/backends/prometheus/remote_write/remote_write.proto
@@ -0,0 +1,29 @@
+syntax = "proto3";
+package prometheus;
+
+option cc_enable_arenas = true;
+
+import "google/protobuf/descriptor.proto";
+
+message WriteRequest {
+ repeated TimeSeries timeseries = 1 [(nullable) = false];
+}
+
+message TimeSeries {
+ repeated Label labels = 1 [(nullable) = false];
+ repeated Sample samples = 2 [(nullable) = false];
+}
+
+message Label {
+ string name = 1;
+ string value = 2;
+}
+
+message Sample {
+ double value = 1;
+ int64 timestamp = 2;
+}
+
+extend google.protobuf.FieldOptions {
+ bool nullable = 65001;
+}