Merging upstream version 1.39.0.

Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
author: Daniel Baumann <daniel.baumann@progress-linux.org> 2023-05-08 16:27:08 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2023-05-08 16:27:08 +0000
commit: 81581f9719bc56f01d5aa08952671d65fda9867a (patch)
tree: 0f5c6b6138bf169c23c9d24b1fc0a3521385cb18 /docs/netdata-security.md
parent: Releasing debian version 1.38.1-1. (diff)
download: netdata-81581f9719bc56f01d5aa08952671d65fda9867a.tar.xz
netdata-81581f9719bc56f01d5aa08952671d65fda9867a.zip
1 files changed, 130 insertions, 163 deletions
diff --git a/docs/netdata-security.md b/docs/netdata-security.md
index 511bc7721..6cd33c061 100644
--- a/docs/netdata-security.md
+++ b/docs/netdata-security.md
@@ -1,222 +1,167 @@
-<!--
-title: "Security design"
-custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/netdata-security.md
--->
+# Security and privacy design
 
-# Security design
+This document serves as the relevant Annex to the [Terms of Service](https://www.netdata.cloud/service-terms/), the [Privacy Policy](https://www.netdata.cloud/privacy/) and
+the Data Processing Addendum, when applicable. It provides more information regarding Netdata’s technical and organizational security and privacy measures.
 
 We have given special attention to all aspects of Netdata, ensuring that everything throughout its operation is as secure as possible. Netdata has been designed with security in mind.
 
-**Table of Contents**
+> When running Netdata in environments requiring Payment Card Industry Data Security Standard (**PCI DSS**), Systems and Organization Controls (**SOC 2**),
+or Health Insurance Portability and Accountability Act (**HIPAA**) compliance, please keep in mind that 
+**even when the user uses Netdata Cloud, all collected data is always stored inside their infrastructure**. 
 
-1.  [Your data is safe with Netdata](#your-data-is-safe-with-netdata)
-2.  [Your systems are safe with Netdata](#your-systems-are-safe-with-netdata)
-3.  [Netdata is read-only](#netdata-is-read-only)
-4.  [Netdata viewers authentication](#netdata-viewers-authentication)
-    *   [Why Netdata should be protected](#why-netdata-should-be-protected)
-    *   [Protect Netdata from the internet](#protect-netdata-from-the-internet)
-        * [Expose Netdata only in a private LAN](#expose-netdata-only-in-a-private-lan)
-        * [Use an authenticating web server in proxy mode](#use-an-authenticating-web-server-in-proxy-mode)
-        * [Other methods](#other-methods)
-5.  [Registry or how to not send any information to a third party server](#registry-or-how-to-not-send-any-information-to-a-third-party-server)
+Dashboard data a user views and alert notifications do travel 
+over Netdata Cloud, as they also travel over third party networks, to reach the user's web browser or the notification integrations the user has configured, 
+but Netdata Cloud does not store metric data. It only transforms them as they pass through it, aggregating them from multiple Agents and Parents, 
+to appear as one data source on the user's browser.
 
-## Your data is safe with Netdata
+## Cloud design
 
-Netdata collects raw data from many sources. For each source, Netdata uses a plugin that connects to the source (or reads the relative files produced by the source), receives raw data and processes them to calculate the metrics shown on Netdata dashboards.
+### User identification and authorization
 
-Even if Netdata plugins connect to your database server, or read your application log file to collect raw data, the product of this data collection process is always a number of **chart metadata and metric values** (summarized data for dashboard visualization). All Netdata plugins (internal to the Netdata daemon, and external ones written in any computer language), convert raw data collected into metrics, and only these metrics are stored in Netdata databases, sent to upstream Netdata servers, or archived to external time-series databases.
+Netdata ensures that only an email address is stored to create an account and use the Service. 
+User identification and authorization are done 
+either via third parties (Google or GitHub accounts) or via short-lived access tokens sent to the user’s email account. 
 
-> The **raw data** collected by Netdata, does not leave the host when collected. **The only data Netdata exposes are chart metadata and metric values.**
+### Personal Data stored
 
-This means that Netdata can safely be used in environments that require the highest level of data isolation (like PCI Level 1).
-
-## Your systems are safe with Netdata
-
-We are very proud that **the Netdata daemon runs as a normal system user, without any special privileges**. This is quite an achievement for a monitoring system that collects all kinds of system and application metrics.
-
-There are a few cases, however, that raw source data are only exposed to processes with escalated privileges. To support these cases, Netdata attempts to minimize and completely isolate the code that runs with escalated privileges.
-
-So, Netdata **plugins**, even those running with escalated capabilities or privileges, perform a **hard coded data collection job**. They do not accept commands from Netdata. The communication is strictly **unidirectional**: from the plugin towards the Netdata daemon. The original application data collected by each plugin do not leave the process they are collected, are not saved and are not transferred to the Netdata daemon. The communication from the plugins to the Netdata daemon includes only chart metadata and processed metric values.
-
-Child nodes use the same protocol when streaming metrics to their parent nodes. The raw data collected by the plugins of
-child Netdata servers are **never leaving the host they are collected**. The only data appearing on the wire are chart
-metadata and metric values. This communication is also **unidirectional**: child nodes never accept commands from
-parent Netdata servers.
-
-## Netdata is read-only
-
-Netdata **dashboards are read-only**. Dashboard users can view and examine metrics collected by Netdata, but cannot instruct Netdata to do something other than present the already collected metrics.
-
-Netdata dashboards do not expose sensitive information. Business data of any kind, the kernel version, O/S version, application versions, host IPs, etc are not stored and are not exposed by Netdata on its dashboards.
-
-## Netdata viewers authentication
-
-Netdata is a monitoring system. It should be protected, the same way you protect all your admin apps. We assume Netdata will be installed privately, for your eyes only.
-
-### Why Netdata should be protected
-
-Viewers will be able to get some information about the system Netdata is running. This information is everything the dashboard provides. The dashboard includes a list of the services each system runs (the legends of the charts under the `Systemd Services` section),  the applications running (the legends of the charts under the `Applications` section), the disks of the system and their names, the user accounts of the system that are running processes (the `Users` and `User Groups` section of the dashboard), the network interfaces and their names (not the IPs) and detailed information about the performance of the system and its applications.
-
-This information is not sensitive (meaning that it is not your business data), but **it is important for possible attackers**. It will give them clues on what to check, what to try and in the case of DDoS against your applications, they will know if they are doing it right or not.
-
-Also, viewers could use Netdata itself to stress your servers. Although the Netdata daemon runs unprivileged, with the minimum process priority (scheduling priority `idle` - lower than nice 19) and adjusts its OutOfMemory (OOM) score to 1000 (so that it will be first to be killed by the kernel if the system starves for memory), some pressure can be applied on your systems if someone attempts a DDoS against Netdata.
-
-### Protect Netdata from the internet
-
-Netdata is a distributed application. Most likely you will have many installations of it. Since it is distributed and you are expected to jump from server to server, there is very little usability to add authentication local on each Netdata.
-
-Until we add a distributed authentication method to Netdata, you have the following options:
-
-#### Expose Netdata only in a private LAN
-
-If your organisation has a private administration and management LAN, you can bind Netdata on this network interface on all your servers. This is done in `Netdata.conf` with these settings:
+Netdata ensures that only an email address is stored to create an account and use the Service. The same email 
+address is used for Netdata product and marketing communications (via Hubspot and Sendgrid). 
 
-```
-[web]
-	bind to = 10.1.1.1:19999 localhost:19999
-```
+Email addresses are stored in our production database on AWS and copied to Google BigQuery, our data lake, 
+for analytics purposes. These analytics are crucial for our product development process.
 
-You can bind Netdata to multiple IPs and ports. If you use hostnames, Netdata will resolve them and use all the IPs (in the above example `localhost` usually resolves to both `127.0.0.1` and `::1`).
+If the user accepts the use of analytical cookies, the email address is also stored in the systems we use to track the 
+usage of the application (Posthog and Gainsight PX).
 
-**This is the best and the suggested way to protect Netdata**. Your systems **should** have a private administration and management LAN, so that all management tasks are performed without any possibility of them being exposed on the internet.
+The IP address used to access Netdata Cloud is stored in web proxy access logs. If the user accepts the use of analytical 
+cookies, the IP is also stored in the systems we use to track the usage of the application (Posthog and Gainsight PX). 
 
-For cloud based installations, if your cloud provider does not provide such a private LAN (or if you use multiple providers), you can create a virtual management and administration LAN with tools like `tincd` or `gvpe`. These tools create a mesh VPN allowing all servers to communicate securely and privately. Your administration stations join this mesh VPN to get access to management and administration tasks on all your cloud servers.
+### Infrastructure data stored
 
-For `gvpe` we have developed a [simple provisioning tool](https://github.com/netdata/netdata-demo-site/tree/master/gvpe) you may find handy (it includes statically compiled `gvpe` binaries for Linux and FreeBSD, and also a script to compile `gvpe` on your macOS system). We use this to create a management and administration LAN for all Netdata demo sites (spread all over the internet using multiple hosting providers).
+The metric data that a user sees in the web browser when using Netdata Cloud is streamed directly from the Netdata Agent 
+to the Netdata Cloud dashboard, via the Agent-Cloud link (see [data transfer](#data-transfer)). The data passes through our systems, but it isn’t stored. 
 
----
+The metadata we do store for each node connected to the user's Spaces in Netdata Cloud is:
+  - Hostname (as it appears in Netdata Cloud)
+  - Information shown in `/api/v1/info`. For example: [https://frankfurt.my-netdata.io/api/v1/info](https://frankfurt.my-netdata.io/api/v1/info).
+  - Metric metadata information shown in `/api/v1/contexts`. For example: [https://frankfurt.my-netdata.io/api/v1/contexts](https://frankfurt.my-netdata.io/api/v1/contexts).
+  - Alarm configurations shown in `/api/v1/alarms?all`. For example: [https://frankfurt.my-netdata.io/api/v1/alarms?all](https://frankfurt.my-netdata.io/api/v1/alarms?all).
+  - Active alarms shown in `/api/v1/alarms`. For example: [https://frankfurt.my-netdata.io/api/v1/alarms](https://frankfurt.my-netdata.io/api/v1/alarms).
 
-In Netdata v1.9+ there is also access list support, like this:
+The infrastructure data is stored in our production database on AWS and copied to Google BigQuery, our data lake, for
+ analytics purposes.
 
-```
-[web]
-	bind to = *
-	allow connections from = localhost 10.* 192.168.*
-```
+### Data transfer
 
-#### Fine-grained access control
+All infrastructure data visible on Netdata Cloud has to pass through the Agent-Cloud link (ACLK) mechanism, which 
+securely connects a Netdata Agent to Netdata Cloud. The Netdata agent initiates and establishes an outgoing secure 
+WebSocket (WSS) connection to Netdata Cloud. The ACLK is encrypted, safe, and is only established if the user connects their node. 
 
-The access list support allows filtering of all incoming connections, by specific IP addresses, ranges
-or validated DNS lookups. Only connections that match an entry on the list will be allowed:
+Data is encrypted when in transit between a user and Netdata Cloud using TLS.
 
-```
-[web]
-	allow connections from = localhost 192.168.* 1.2.3.4 homeip.net
-```
+### Data retention
 
-Connections from the IP addresses are allowed if the connection IP matches one of the patterns given.
-The alias localhost is always checked against 127.0.0.1, any other symbolic names need to resolve in
-both directions using DNS. In the above example the IP address of `homeip.net` must reverse DNS resolve
-to the incoming IP address and a DNS lookup on `homeip.net` must return the incoming IP address as
-one of the resolved addresses.
+Netdata may maintain backups of Netdata Cloud Customer Content, which would remain in place for approximately ninety 
+(90) days following a deletion in Netdata Cloud. 
 
-More specific control of what each incoming connection can do can be specified through the access control
-list settings:
+### Data portability and erasure
 
-```
-[web]
-	allow connections from = 160.1.*
-	allow badges from = 160.1.1.2
-	allow streaming from = 160.1.2.*
-	allow management from = control.subnet.ip
-	allow netdata.conf from = updates.subnet.ip
-	allow dashboard from = frontend.subnet.ip
-```
+Netdata will, as necessary to enable the Customer to meet its obligations under Data Protection Law, provide the Customer 
+via the availability of Netdata Cloud with the ability to access, retrieve, correct and delete the Personal Data stored in 
+Netdata Cloud. The Customer acknowledges that such ability may from time to time be limited due to temporary service outages 
+for maintenance or other updates to Netdata Cloud, or technically not feasible. 
 
-In this example only connections from `160.1.x.x` are allowed, only the specific IP address `160.1.1.2`
-can access badges, only IP addresses in the smaller range `160.1.2.x` can stream data. The three
-hostnames shown can access specific features, this assumes that DNS is setup to resolve these names
-to IP addresses within the `160.1.x.x` range and that reverse DNS is setup for these hosts.
+To the extent that the Customer, in its fulfillment of its Data Protection Law obligations, is unable to access, retrieve, 
+correct or delete Customer Personal Data in Netdata Cloud due to prolonged unavailability of Netdata Cloud due to an issue 
+within Netdata’s control, Netdata will where possible use reasonable efforts to provide, correct or delete such Customer Personal Data.
 
+If a Customer is unable to delete Personal Data via the self-services functionality, then Netdata deletes Personal Data upon 
+the Customer’s written request, within the timeframe specified in the DPA and in accordance with applicable data protection law. 
 
-#### Use an authenticating web server in proxy mode
+#### Delete all personal data
 
-Use one web server to provide authentication in front of **all your Netdata servers**. So, you will be accessing all your Netdata with URLs like `http://{HOST}/netdata/{NETDATA_HOSTNAME}/` and authentication will be shared among all of them (you will sign-in once for all your servers). Instructions are provided on how to set the proxy configuration to have Netdata run behind [nginx](Running-behind-nginx.md), [Apache](Running-behind-apache.md), [lighttpd](Running-behind-lighttpd.md) and [Caddy](Running-behind-caddy.md).
+To remove all personal information we hold about a user (email and activities), the user needs to delete their cloud account by logging in to https://app.netdata.cloud and accessing their profile, at the bottom left of the screen. 
 
-To use this method, you should firewall protect all your Netdata servers, so that only the web server IP will be allowed to directly access Netdata. To do this, run this on each of your servers (or use your firewall manager):
 
-```sh
-PROXY_IP="1.2.3.4"
-iptables -t filter -I INPUT -p tcp --dport 19999 \! -s ${PROXY_IP} -m conntrack --ctstate NEW -j DROP
-```
+## Agent design
 
-*commands to allow direct access to Netdata from a web server proxy*
+### User data is safe with Netdata
 
-The above will prevent anyone except your web server to access a Netdata dashboard running on the host.
+Netdata collects raw data from many sources. For each source, Netdata uses a plugin that connects to the source (or reads the 
+relative files produced by the source), receives raw data and processes them to calculate the metrics shown on Netdata dashboards.
 
-For Netdata v1.9+ you can also use `netdata.conf`:
+Even if Netdata plugins connect to the user's database server, or read the user's application log file to collect raw data, the product of 
+this data collection process is always a number of **chart metadata and metric values** (summarized data for dashboard visualization). 
+All Netdata plugins (internal to the Netdata daemon, and external ones written in any computer language), convert raw data collected 
+into metrics, and only these metrics are stored in Netdata databases, sent to upstream Netdata servers, or archived to external 
+time-series databases.
 
-```
-[web]
-	allow connections from = localhost 1.2.3.4
-```
+The **raw data** collected by Netdata does not leave the host when collected. **The only data Netdata exposes are chart metadata and metric values.**
 
-Of course you can add more IPs.
-
-For Netdata prior to v1.9, if you want to allow multiple IPs, use this:
-
-```sh
-# space separated list of IPs to allow access Netdata
-NETDATA_ALLOWED="1.2.3.4 5.6.7.8 9.10.11.12"
-NETDATA_PORT=19999
+This means that Netdata can safely be used in environments that require the highest level of data isolation (like PCI Level 1).
 
-# create a new filtering chain || or empty an existing one named netdata
-iptables -t filter -N netdata 2>/dev/null || iptables -t filter -F netdata
-for x in ${NETDATA_ALLOWED}
-do
-	# allow this IP
-    iptables -t filter -A netdata -s ${x} -j ACCEPT
-done
+### User systems are safe with Netdata
 
-# drop all other IPs
-iptables -t filter -A netdata -j DROP
+We are very proud that **the Netdata daemon runs as a normal system user, without any special privileges**. This is quite an 
+achievement for a monitoring system that collects all kinds of system and application metrics.
 
-# delete the input chain hook (if it exists)
-iptables -t filter -D INPUT -p tcp --dport ${NETDATA_PORT} -m conntrack --ctstate NEW -j netdata 2>/dev/null
+There are a few cases, however, in which raw source data are only exposed to processes with escalated privileges. To support these 
+cases, Netdata attempts to minimize and completely isolate the code that runs with escalated privileges.
 
-# add the input chain hook (again)
-# to send all new Netdata connections to our filtering chain
-iptables -t filter -I INPUT -p tcp --dport ${NETDATA_PORT} -m conntrack --ctstate NEW -j netdata
-```
+So, Netdata **plugins**, even those running with escalated capabilities or privileges, perform a **hard coded data collection job**. 
+They do not accept commands from Netdata. The communication is **unidirectional** from the plugin towards the Netdata daemon, except 
+for Functions (see below). The original application data collected by each plugin do not leave the process in which they are collected, are 
+not saved and are not transferred to the Netdata daemon. The communication from the plugins to the Netdata daemon includes only chart 
+metadata and processed metric values.
 
-_script to allow access to Netdata only from a number of hosts_
+Child nodes use the same protocol when streaming metrics to their parent nodes. The raw data collected by the plugins of
+child Netdata servers **never leave the host on which they are collected**. The only data appearing on the wire are chart
+metadata and metric values. This communication is also **unidirectional**: child nodes never accept commands from
+parent Netdata servers (except for Functions). 
 
-You can run the above any number of times. Each time it runs it refreshes the list of allowed hosts.
+[Functions](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md) is currently 
+the only feature that routes requests back to origin Netdata Agents via Netdata Parents. The feature allows Netdata Cloud to send 
+a request to the Netdata Agent data collection plugin running at the 
+edge, to provide additional information, such as the process tree of a server, or the long queries of a DB. 
 
-#### Other methods
+<!-- The user has full control over the available functions. For more information see “Controlling Access to Functions” and “Disabling Functions”. -->
 
-Of course, there are many more methods you could use to protect Netdata:
+### Netdata is read-only
 
--   bind Netdata to localhost and use `ssh -L 19998:127.0.0.1:19999 remote.netdata.ip` to forward connections of local port 19998 to remote port 19999. This way you can ssh to a Netdata server and then use `http://127.0.0.1:19998/` on your computer to access the remote Netdata dashboard.
+Netdata **dashboards are read-only**. Dashboard users can view and examine metrics collected by Netdata, but cannot 
+instruct Netdata to do something other than present the already collected metrics.
 
--   If you are always under a static IP, you can use the script given above to allow direct access to your Netdata servers without authentication, from all your static IPs.
+Netdata dashboards do not expose sensitive information. Business data of any kind, the kernel version, O/S version, 
+application versions, host IPs, etc. are not stored and are not exposed by Netdata on its dashboards.
 
--   install all your Netdata in **headless data collector** mode, forwarding all metrics in real-time to a parent
-    Netdata server, which will be protected with authentication using an nginx server running locally at the parent
-    Netdata server. This requires more resources (you will need a bigger parent Netdata server), but does not require
-    any firewall changes, since all the child Netdata servers will not be listening for incoming connections.
+### Protect Netdata from the internet
 
-## Anonymous Statistics
+Users are responsible for taking all appropriate measures to secure their Netdata agent installations, and especially the Netdata web user interface and API, against unauthorized access. Netdata comes with a wide range of options to 
+[secure user nodes](https://github.com/netdata/netdata/blob/master/docs/category-overview-pages/secure-nodes.md) in 
+compliance with the user organization's security policy.
 
-### Registry or how to not send any information to a third party server
+### Anonymous statistics
 
-The default configuration uses a public registry under registry.my-netdata.io (more information about the registry here: [mynetdata-menu-item](https://github.com/netdata/netdata/blob/master/registry/README.md) ). Please be aware that if you use that public registry, you submit the following information to a third party server: 
+#### Netdata registry
 
--   The url where you open the web-ui in the browser (via http request referrer)
--   The hostnames of the Netdata servers
+The default configuration uses a public [registry](https://github.com/netdata/netdata/blob/master/registry/README.md) under registry.my-netdata.io. 
+If the user uses that public registry, they submit the following information to a third party server: 
+ - The URL of the agent's web user interface (via http request referrer)
+ - The hostnames of the user's Netdata servers
 
-If sending this information to the central Netdata registry violates your security policies, you can configure Netdata to [run your own registry](https://github.com/netdata/netdata/blob/master/registry/README.md#run-your-own-registry).
+If sending this information to the central Netdata registry violates the user's security policies, they can configure Netdata to 
+[run their own registry](https://github.com/netdata/netdata/blob/master/registry/README.md#run-your-own-registry).
 
-### Opt-out of anonymous statistics
+#### Anonymous telemetry events
 
 Starting with v1.30, Netdata collects anonymous usage information by default and sends it to a self hosted PostHog instance within the Netdata infrastructure. Read
-about the information collected, and learn how to-opt, on our [anonymous statistics](anonymous-statistics.md) page.
-
-The usage statistics are _vital_ for us, as we use them to discover bugs and prioritize new features. We thank you for
-_actively_ contributing to Netdata's future.
+about the information collected and learn how to opt out on our 
+[anonymous telemetry events](https://github.com/netdata/netdata/blob/master/docs/anonymous-statistics.md) page.
 
-## Netdata directories
+### Netdata directories
 
+The agent stores data in 6 different directories on the user's system. 
+  
 | path|owner|permissions|Netdata|comments|
 |:---|:----|:----------|:------|:-------|
 | `/etc/netdata`|user `root`<br/>group `netdata`|dirs `0755`<br/>files `0640`|reads|**Netdata config files**<br/>may contain sensitive information, so group `netdata` is allowed to read them.|
@@ -226,4 +171,26 @@ _actively_ contributing to Netdata's future.
 | `/var/lib/netdata`|user `netdata`<br/>group `netdata`|dirs `0750`<br/>files `0660`|reads, writes, creates, deletes|**Netdata permanent database files**<br/>Netdata stores here the registry data, health alarm log db, etc.|
 | `/var/log/netdata`|user `netdata`<br/>group `root`|dirs `0755`<br/>files `0644`|writes, creates|**Netdata log files**<br/>all the Netdata applications, logs their errors or other informational messages to files in this directory. These files should be log rotated.|
 
+## Organization processes
+
+### Employee identification and authorization
+
+Netdata operates technical and organizational measures for employee identification and authentication, such as logs, policies, 
+assigning distinct usernames for each employee and utilizing password complexity requirements for access to all platforms. 
+
+The COO or HR are the primary system owners for all platforms and may designate additional system owners, as needed. Additional 
+user access is also established on a role basis, requires the system owner’s approval, and is tracked by HR. User access to each 
+platform is subject to periodic review and testing. When an employee changes roles, HR updates the employee’s access to all systems. 
+Netdata uses on-boarding and off-boarding processes to regulate access by Netdata Personnel. 
+
+Second-layer authentication is employed where available, by way of multi-factor authentication. 
+
+Netdata’s IT control environment is based upon industry-accepted concepts, such as multiple layers of preventive and detective 
+controls, working in concert to provide for the overall protection of Netdata’s computing environment and data assets. 
+
+### Systems security
 
+Netdata maintains a risk-based assessment security program. The framework for Netdata’s security program includes administrative, 
+organizational, technical, and physical safeguards reasonably designed to protect the services and confidentiality, integrity, 
+and availability of user data. The program is intended to be appropriate to the nature of the services and the size and complexity 
+of Netdata’s business operations.