From 4bf37db76e7dda93e57a9730958c6d467a85c622 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Mon, 8 Jul 2019 22:14:49 +0200
Subject: Merging upstream version 1.16.0.

Signed-off-by: Daniel Baumann
---
 streaming/README.md   |  65 ++++++++++++++++++++++-----
 streaming/rrdpush.c   | 121 ++++++++++++++++++++++++++++++++++++++++++++++++--
 streaming/stream.conf |  16 ++++++-
 3 files changed, 186 insertions(+), 16 deletions(-)

(limited to 'streaming')

diff --git a/streaming/README.md b/streaming/README.md
index 0ad9d7e2e..3e58f1f06 100644
--- a/streaming/README.md
+++ b/streaming/README.md
@@ -18,7 +18,7 @@ a netdata performs:
 
 Local netdata (`slave`), **without any database or alarms**, collects metrics and sends them to another
 netdata (`master`).
 
-The `my-netdata` menu shows a list of all "databases streamed to" the master. Clicking one of those links allows the user to view the full dashboard of the `slave` netdata. The URL has the form http://master-host:master-port/host/slave-host/.
+The node menu shows a list of all "databases streamed to" the master. Clicking one of those links allows the user to view the full dashboard of the `slave` netdata. The URL has the form http://master-host:master-port/host/slave-host/.
 
 Alarms for the `slave` are served by the `master`.
 
@@ -41,6 +41,8 @@ The `slave` and the `master` may have different data retention policies for the
 
 Alarms for the `slave` are triggered by **both** the `slave` and the `master` (and actually each can have different alarms configurations or have alarms disabled).
 
+Take a note, that custom chart names, configured on the `slave`, should be in the form `type.name` to work correctly. The `master` will truncate the `type` part and substitute the original chart `type` to store the name in the database.
+
 ### netdata proxies
 
 Local netdata (`slave`), with or without a database, collects metrics and sends them to another
@@ -81,14 +83,14 @@ monitoring (there cannot be health monitoring without a database).
 
 ```
 [web]
-    mode = none | static-threaded
-    accept a streaming request every seconds = 0
+    mode = none | static-threaded
+    accept a streaming request every seconds = 0
 ```
 
 `[web].mode = none` disables the API (netdata will not listen to any ports).
 This also disables the registry (there cannot be a registry without an API).
 
-`accept a streaming request every seconds` can be used to set a limit on how often a master Netdata server will accept streaming requests from the slaves. 0 sets no limit, 1 means maximum once every second. If this is set, you may see error log entries "... too busy to accept new streaming request. Will be allowed in X secs".
+`accept a streaming request every seconds` can be used to set a limit on how often a master Netdata server will accept streaming requests from the slaves. 0 sets no limit, 1 means maximum once every second. If this is set, you may see error log entries "... too busy to accept new streaming request. Will be allowed in X secs".
 
 ```
 [backend]
@@ -123,7 +125,7 @@ a `proxy`).
 ```
 [stream]
     enabled = yes | no
-    destination = IP:PORT ...
+    destination = IP:PORT[:SSL] ...
     api key = XXXXXXXXXXX
 ```
 
@@ -136,6 +138,8 @@ headless proxy|`none`|not `none`|`yes`|only for `data source = as collected`|not
 proxy with db|not `none`|not `none`|`yes`|possible|possible|yes
 central netdata|not `none`|not `none`|`no`|possible|possible|yes
 
+For the options to encrypt the data stream between the slave and the master, refer to [securing the communication](#securing-the-communication)
+
 ##### options for the receiving node
 
 `stream.conf` looks like this:
@@ -209,11 +213,46 @@ The receiving end (`proxy` or `master`) logs entries like these:
 
 For netdata v1.9+, streaming can also be monitored via `access.log`.
 
+### Securing the communication
+
+Netdata does not activate TLS encryption by default. To encrypt the connection, you first need to [enable TLS support](../web/server/#enabling-tls-support) on the master.
With encryption enabled on the receiving side, we need to instruct the slave to use SSL as well. On the slave's `stream.conf`, configure the destination as follows:
+
+```
+[stream]
+    destination = host:port:SSL
+```
+
+The word SSL appended to the end of the destination tells the slave that the connection must be encrypted.
+
+#### Certificate verification
+
+When SSL is enabled on the slave, the default behavior will be do not connect with the master unless the server's certificate can be verified via the default chain. In case you want to avoid this check, add to the slave's `stream.conf` the following:
+
+```
+[stream]
+    ssl skip certificate verification = yes
+```
+
+#### Expected behaviors
+
+With the introduction of SSL, the master-slave communication behaves as shown in the table below, depending on the following configurations:
+- Master TLS (Yes/No): Whether the `[web]` section in `netdata.conf` has `ssl key` and `ssl certificate`.
+- Master port SSL (-/force/optional): Depends on whether the `[web]` section `bind to` contains a `^SSL=force` or `^SSL=optional` directive on the port(s) used for streaming.
+- Slave TLS (Yes/No): Whether the destination in the slave's `stream.conf` has `:SSL` at the end.
+- Slave SSL Verification (yes/no): Value of the slave's `stream.conf` `ssl skip certificate verification` parameter (default is no).
+
+ Master TLS enabled | Master port SSL | Slave TLS | Slave SSL Ver. | Behavior
+:------:|:-----:|:-----:|:-----:|:--------
+No | - | No | no | Legacy behavior. The master-slave stream is unencrypted.
+Yes | force | No | no | The master rejects the slave connection.
+Yes | -/optional | No | no | The master-slave stream is unencrypted (expected situation for legacy slaves and newer masters)
+Yes | -/force/optional | Yes | no | The master-slave stream is encrypted, provided that the master has a valid SSL certificate. Otherwise, the slave refuses to connect.
+Yes | -/force/optional | Yes | yes | The master-slave stream is encrypted.
 
 ## Viewing remote host dashboards, using mirrored databases
 
 On any receiving netdata, that maintains remote databases and has its web server enabled,
-`my-netdata` menu will include a list of the mirrored databases.
+The node menu will include a list of the mirrored databases.
 
 ![image](https://cloud.githubusercontent.com/assets/2662304/24080824/24cd2d3c-0caf-11e7-909d-a8dd1dbb95d7.png)
 
@@ -289,13 +328,13 @@ On the master, edit `/etc/netdata/stream.conf` (to edit it on your system run `/
 [11111111-2222-3333-4444-555555555555]
     # enable/disable this API key
     enabled = yes
-
+
     # one hour of data for each of the slaves
     default history = 3600
-
+
     # do not save slave metrics on disk
     default memory = ram
-
+
     # alarms checks, only while the slave is connected
     health enabled by default = auto
 ```
@@ -305,6 +344,10 @@ If you used many API keys, you can add one such section for each API key.
 
 When done, restart netdata on the `master` node. It is now ready to receive metrics.
 
+Note that `health enabled by default = auto` will still trigger `last_collected` alarms, if a connected slave does not exit gracefully. If the netdata running on the slave is
+stopped, it will close the connection to the master, ensuring that no `last_collected` alarms are triggered. For example, a proper container restart would first terminate
+the netdata process, but a system power issue would leave the connection open on the master side. In the second case, you will still receive alarms.
+
 #### Configuring the `slaves`
 
 On each of the slaves, edit `/etc/netdata/stream.conf` (to edit it on your system run `/etc/netdata/edit-config stream.conf`) and set these:
@@ -313,10 +356,10 @@ On each of the slaves, edit `/etc/netdata/stream.conf` (to edit it on your syste
 [stream]
     # stream metrics to another netdata
     enabled = yes
-
+
     # the IP and PORT of the master
     destination = 10.11.12.13:19999
-
+
     # the API key to use
     api key = 11111111-2222-3333-4444-555555555555
 ```
diff --git a/streaming/rrdpush.c b/streaming/rrdpush.c
index 2e9050ff2..954b1d7d1 100644
--- a/streaming/rrdpush.c
+++ b/streaming/rrdpush.c
@@ -79,6 +79,25 @@ int rrdpush_init() {
         default_rrdpush_enabled = 0;
     }
 
+#ifdef ENABLE_HTTPS
+    if (netdata_use_ssl_on_stream == NETDATA_SSL_OPTIONAL) {
+        if (default_rrdpush_destination){
+            char *test = strstr(default_rrdpush_destination,":SSL");
+            if(test){
+                *test = 0X00;
+                netdata_use_ssl_on_stream = NETDATA_SSL_FORCE;
+            }
+        }
+    }
+    char *invalid_certificate = appconfig_get(&stream_config, CONFIG_SECTION_STREAM, "ssl skip certificate verification", "no");
+    if ( !strcmp(invalid_certificate,"yes")){
+        if (netdata_validate_server == NETDATA_SSL_VALID_CERTIFICATE){
+            info("The Netdata is configured to accept invalid certificate.");
+            netdata_validate_server = NETDATA_SSL_INVALID_CERTIFICATE;
+        }
+    }
+#endif
+
     return default_rrdpush_enabled;
 }
 
@@ -414,6 +433,7 @@ static inline void rrdpush_sender_thread_close_socket(RRDHOST *host) {
     }
 }
 
+//called from client side
 static int rrdpush_sender_thread_connect_to_master(RRDHOST *host, int default_port, int timeout, size_t *reconnects_counter, char *connected_to, size_t connected_to_size) {
     struct timeval tv = {
             .tv_sec = timeout,
@@ -442,9 +462,38 @@ static int rrdpush_sender_thread_connect_to_master(RRDHOST *host, int default_po
 
     info("STREAM %s [send to %s]: initializing communication...", host->hostname, connected_to);
 
+#ifdef ENABLE_HTTPS
+    if( netdata_client_ctx ){
+        host->ssl.flags = NETDATA_SSL_START;
+        if
(!host->ssl.conn){
+            host->ssl.conn = SSL_new(netdata_client_ctx);
+            if(!host->ssl.conn){
+                error("Failed to allocate SSL structure.");
+                host->ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+            }
+        }
+        else{
+            SSL_clear(host->ssl.conn);
+        }
+
+        if (host->ssl.conn)
+        {
+            if (SSL_set_fd(host->ssl.conn, host->rrdpush_sender_socket) != 1) {
+                error("Failed to set the socket to the SSL on socket fd %d.", host->rrdpush_sender_socket);
+                host->ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+            } else{
+                host->ssl.flags = NETDATA_SSL_HANDSHAKE_COMPLETE;
+            }
+        }
+    }
+    else {
+        host->ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+    }
+#endif
+
 #define HTTP_HEADER_SIZE 8192
     char http[HTTP_HEADER_SIZE + 1];
-    snprintfz(http, HTTP_HEADER_SIZE,
+    int eol = snprintfz(http, HTTP_HEADER_SIZE,
             "STREAM key=%s&hostname=%s&registry_hostname=%s&machine_guid=%s&update_every=%d&os=%s&timezone=%s&tags=%s"
                  "&NETDATA_SYSTEM_OS_NAME=%s"
                  "&NETDATA_SYSTEM_OS_ID=%s"
@@ -486,8 +535,39 @@ static int rrdpush_sender_thread_connect_to_master(RRDHOST *host, int default_po
                  , host->program_name
                  , host->program_version
                  );
-
+    http[eol] = 0x00;
+
+#ifdef ENABLE_HTTPS
+    if (!host->ssl.flags) {
+        ERR_clear_error();
+        SSL_set_connect_state(host->ssl.conn);
+        int err = SSL_connect(host->ssl.conn);
+        if (err != 1){
+            err = SSL_get_error(host->ssl.conn, err);
+            error("SSL cannot connect with the server: %s ",ERR_error_string((long)SSL_get_error(host->ssl.conn,err),NULL));
+            if (netdata_use_ssl_on_stream == NETDATA_SSL_FORCE) {
+                rrdpush_sender_thread_close_socket(host);
+                return 0;
+            }else {
+                host->ssl.flags = NETDATA_SSL_NO_HANDSHAKE;
+            }
+        }
+        else {
+            if (netdata_use_ssl_on_stream == NETDATA_SSL_FORCE) {
+                if (netdata_validate_server == NETDATA_SSL_VALID_CERTIFICATE) {
+                    if ( security_test_certificate(host->ssl.conn)) {
+                        error("Closing the stream connection, because the server SSL certificate is not valid.");
+                        rrdpush_sender_thread_close_socket(host);
+                        return 0;
+                    }
+                }
+            }
+        }
+    }
+
if(send_timeout(&host->ssl,host->rrdpush_sender_socket, http, strlen(http), 0, timeout) == -1) {
+#else
     if(send_timeout(host->rrdpush_sender_socket, http, strlen(http), 0, timeout) == -1) {
+#endif
         error("STREAM %s [send to %s]: failed to send HTTP header to remote netdata.", host->hostname, connected_to);
         rrdpush_sender_thread_close_socket(host);
         return 0;
@@ -495,7 +575,11 @@ static int rrdpush_sender_thread_connect_to_master(RRDHOST *host, int default_po
 
     info("STREAM %s [send to %s]: waiting response from remote netdata...", host->hostname, connected_to);
 
+#ifdef ENABLE_HTTPS
+    if(recv_timeout(&host->ssl,host->rrdpush_sender_socket, http, HTTP_HEADER_SIZE, 0, timeout) == -1) {
+#else
     if(recv_timeout(host->rrdpush_sender_socket, http, HTTP_HEADER_SIZE, 0, timeout) == -1) {
+#endif
         error("STREAM %s [send to %s]: remote netdata does not respond.", host->hostname, connected_to);
         rrdpush_sender_thread_close_socket(host);
         return 0;
@@ -565,6 +649,12 @@ void *rrdpush_sender_thread(void *ptr) {
         return NULL;
     }
 
+#ifdef ENABLE_HTTPS
+    if (netdata_use_ssl_on_stream & NETDATA_SSL_FORCE ){
+        security_start_ssl(NETDATA_SSL_CONTEXT_STREAMING);
+    }
+#endif
+
     info("STREAM %s [send]: thread created (task id %d)", host->hostname, gettid());
 
     int timeout = (int)appconfig_get_number(&stream_config, CONFIG_SECTION_STREAM, "timeout seconds", 60);
@@ -852,6 +942,9 @@ static int rrdpush_receive(int fd
         , int update_every
         , char *client_ip
         , char *client_port
+#ifdef ENABLE_HTTPS
+        , struct netdata_ssl *ssl
+#endif
 ) {
     RRDHOST *host;
     int history = default_rrd_history_entries;
@@ -965,7 +1058,11 @@ static int rrdpush_receive(int fd
     snprintfz(cd.cmd, PLUGINSD_CMD_MAX, "%s:%s", client_ip, client_port);
 
     info("STREAM %s [receive from [%s]:%s]: initializing communication...", host->hostname, client_ip, client_port);
+#ifdef ENABLE_HTTPS
+    if(send_timeout(ssl,fd, START_STREAMING_PROMPT, strlen(START_STREAMING_PROMPT), 0, 60) != strlen(START_STREAMING_PROMPT)) {
+#else
     if(send_timeout(fd,
START_STREAMING_PROMPT, strlen(START_STREAMING_PROMPT), 0, 60) != strlen(START_STREAMING_PROMPT)) {
+#endif
         log_stream_connection(client_ip, client_port, key, host->machine_guid, host->hostname, "FAILED - CANNOT REPLY");
         error("STREAM %s [receive from [%s]:%s]: cannot send ready command.", host->hostname, client_ip, client_port);
         close(fd);
@@ -1058,6 +1155,9 @@ struct rrdpush_thread {
     char *program_version;
     struct rrdhost_system_info *system_info;
     int update_every;
+#ifdef ENABLE_HTTPS
+    struct netdata_ssl ssl;
+#endif
 };
 
 static void rrdpush_receiver_thread_cleanup(void *ptr) {
@@ -1079,8 +1179,13 @@ static void rrdpush_receiver_thread_cleanup(void *ptr) {
         freez(rpt->client_port);
         freez(rpt->program_name);
         freez(rpt->program_version);
-        rrdhost_system_info_free(rpt->system_info);
+#ifdef ENABLE_HTTPS
+        if(rpt->ssl.conn){
+            SSL_free(rpt->ssl.conn);
+        }
+#endif
         freez(rpt);
+
     }
 }
 
@@ -1105,6 +1210,9 @@ static void *rrdpush_receiver_thread(void *ptr) {
             , rpt->update_every
             , rpt->client_ip
             , rpt->client_port
+#ifdef ENABLE_HTTPS
+            , &rpt->ssl
+#endif
     );
 
     netdata_thread_cleanup_pop(1);
@@ -1295,6 +1403,13 @@ int rrdpush_receiver_thread_spawn(RRDHOST *host, struct web_client *w, char *url
     rpt->client_port = strdupz(w->client_port);
     rpt->update_every = update_every;
     rpt->system_info = system_info;
+#ifdef ENABLE_HTTPS
+    rpt->ssl.conn = w->ssl.conn;
+    rpt->ssl.flags = w->ssl.flags;
+
+    w->ssl.conn = NULL;
+    w->ssl.flags = NETDATA_SSL_START;
+#endif
 
     if(w->user_agent && w->user_agent[0]) {
         char *t = strchr(w->user_agent, '/');
diff --git a/streaming/stream.conf b/streaming/stream.conf
index d0d02a7c8..0d360cc24 100644
--- a/streaming/stream.conf
+++ b/streaming/stream.conf
@@ -17,7 +17,7 @@
     # Where is the receiving netdata?
     # A space separated list of:
     #
-    #        [PROTOCOL:]HOST[%INTERFACE][:PORT]
+    #        [PROTOCOL:]HOST[%INTERFACE][:PORT][:SSL]
     #
     # If many are given, the first available will get the metrics.
     #
@@ -26,10 +26,21 @@
     # IPv6 IPs should be given with brackets [ip:address]
     # INTERFACE = the network interface to use (only for IPv6)
     # PORT = the port number or service name (/etc/services)
+    # SSL = when this word appear at the end of the destination string
+    #       the Netdata will do encrypt connection with the master.
     #
     # This communication is not HTTP (it cannot be proxied by web proxies).
     destination =
 
+    # Skip Certificate verification?
+    #
+    # The netdata slave is configurated to avoid invalid SSL/TLS certificate,
+    # so certificates that are self-signed or expired will stop the streaming.
+    # Case the server certificate is not valid, you can enable the use of
+    # 'bad' certificates setting the next option as 'yes'.
+    #
+    #ssl skip certificate verification = yes
+
     # The API_KEY to use (as the sender)
     api key =
 
@@ -114,7 +125,8 @@
     # 3 possible values:
     #    yes     enable alarms
     #    no      do not enable alarms
-    #    auto    enable alarms, only when the sending netdata is connected
+    #    auto    enable alarms, only when the sending netdata is connected. For ephemeral slaves or slave system restarts,
+    #            ensure that the netdata process on the slave is gracefully stopped, to prevent invalid last_collected alarms
     # You can also set it per host, below.
     # The default is taken from [health].enabled of netdata.conf
     health enabled by default = auto
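
The `:SSL` destination suffix that this patch documents in `stream.conf` can be illustrated in isolation. The sketch below is not Netdata code: `destination_strip_ssl_suffix` is a hypothetical helper, and it assumes a single destination string rather than the space-separated list that `stream.conf` allows. It also deliberately differs from the `strstr()` call in `rrdpush_init()` above, in that it only accepts `:SSL` when it is the very end of the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of Netdata: reports whether a single
 * destination string ends with ":SSL" and, if so, removes that suffix
 * in place so that only HOST[:PORT] remains in the buffer. */
bool destination_strip_ssl_suffix(char *destination) {
    static const char suffix[] = ":SSL";
    const size_t suffix_len = sizeof(suffix) - 1;

    assert(destination != NULL);

    size_t len = strlen(destination);
    if (len < suffix_len)
        return false;               /* too short to carry the suffix */

    char *tail = destination + (len - suffix_len);
    if (strcmp(tail, suffix) != 0)
        return false;               /* ":SSL" is not the final token */

    *tail = '\0';                   /* cut the string just before ":SSL" */
    return true;
}
```

Called on a writable buffer holding `10.11.12.13:19999:SSL`, the helper returns true and leaves `10.11.12.13:19999` in the buffer; a destination without the suffix is left untouched and the helper returns false.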