Diffstat (limited to 'claim/README.md')
-rw-r--r-- | claim/README.md | 394 |
1 files changed, 394 insertions, 0 deletions
diff --git a/claim/README.md b/claim/README.md
new file mode 100644
index 000000000..ade6a221f
--- /dev/null
+++ b/claim/README.md
@@ -0,0 +1,394 @@
+<!--
+title: "Agent claiming"
+description: "Agent claiming allows a Netdata Agent, running on a distributed node, to securely connect to Netdata Cloud via the encrypted Agent-Cloud link (ACLK)."
+custom_edit_url: https://github.com/netdata/netdata/edit/master/claim/README.md
+-->
+
+# Agent claiming
+
+Agent claiming allows a Netdata Agent, running on a distributed node, to securely connect to Netdata Cloud. A Space's
+administrator creates a **claiming token**, which is used to add an Agent to their Space via the [Agent-Cloud link
+(ACLK)](/aclk/README.md).
+
+Are you just starting out with Netdata Cloud? See our [get started with
+Cloud](https://learn.netdata.cloud/docs/cloud/get-started) guide for a walkthrough of the process and simplified
+instructions.
+
+Claiming nodes is a security feature in Netdata Cloud. Through the process of claiming, you demonstrate in a few ways
+that you have administrative access to that node and the configuration settings for its Agent. By logging into the node,
+you prove you have access, and by using the claiming script or the Netdata command line, you prove you have write access
+and administrative privileges.
+
+Only the administrators of a Space in Netdata Cloud can view the claiming token and accompanying script generated by
+Netdata Cloud.
+
+> The claiming process ensures no third party can add your node, and then view your node's metrics, in a Cloud account,
+> Space, or War Room that you did not authorize.
+
+By claiming a node, you opt in to sending data from your Agent to Netdata Cloud via the [ACLK](/aclk/README.md). This
+data is encrypted by TLS while it is in transit. We use the RSA keypair created during claiming to authenticate the
+identity of the Agent when it connects to the Cloud. While the data does flow through Netdata Cloud servers on its way
+from Agents to the browser, we do not store or log it.
+
+You can claim a node during the Cloud onboarding process, or after you have created a Space by clicking on the **USER's
+Space** dropdown, then **Manage claimed nodes**.
+
+There are two important notes regarding claiming:
+
+- _You can only claim any given node in a single Space_. You can, however, add that claimed node to multiple War Rooms
+  within that one Space.
+- You must repeat the claiming process on every node you want to add to Netdata Cloud.
+
+## How to claim a node
+
+To claim a node, select which War Rooms you want to add this node to with the dropdown, then copy and paste the script
+given by Cloud into your node's terminal. Hit **Enter**.
+
+```bash
+sudo netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud
+```
+
+The script should return `Agent was successfully claimed.`. If the claiming script returns errors, or if you don't see
+the node in your Space after 60 seconds, see the [troubleshooting information](#troubleshooting). If you prefer not to
+use root privileges via `sudo` to run the claiming script, see the next section.
+
+Repeat this process with every node you want to add to Cloud during onboarding. You can also add more nodes once you've
+finished onboarding.
+
+### Claim an Agent without root privileges
+
+If you don't want to run the claiming script with root privileges, you can discover which user is running the Agent,
+switch to that user, and run the claiming script.
+
+Use `grep` to search your `netdata.conf` file, which is typically located at `/etc/netdata/netdata.conf`, for the `run
+as user` setting. For example:
+
+```bash
+grep "run as user" /etc/netdata/netdata.conf
+    # run as user = netdata
+```
+
+The default user is `netdata`. Yours may be different, so pay attention to the output from `grep`. Switch to that user
+and run the claiming script.
+
+```bash
+netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud
+```
+
+Hit **Enter**. The script should return `Agent was successfully claimed.`. If the claiming script returns errors, or if
+you don't see the node in your Space after 60 seconds, see the [troubleshooting information](#troubleshooting).
+
+### Claim an Agent running in Docker
+
+The claiming process works with Agents running inside of Docker containers. You can use `docker exec` to run the
+claiming script on containers already running, or append the claiming script to `docker run` to create a new container
+and immediately claim it.
+
+#### Running Agent containers
+
+Claim a _running Agent container_ by appending the script offered by Cloud to a `docker exec ...` command, replacing
+`netdata` with the name of your running container:
+
+```bash
+docker exec -it netdata netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud
+```
+
+The script should return `Agent was successfully claimed.`. If the claiming script returns errors, or if
+you don't see the node in your Space after 60 seconds, see the [troubleshooting information](#troubleshooting).
+
+#### New/ephemeral Agent containers
+
+Claim a newly-created container with `docker run ...`.
+
+In the example below, the last line calls the [daemon binary](/daemon/README.md), sets essential variables, and then
+executes claiming using the information after `-W "claim... `. You should copy the relevant token, rooms, and URL from
+Cloud.
+
+```bash
+docker run -d --name=netdata \
+  -p 19999:19999 \
+  -v netdatalib:/var/lib/netdata \
+  -v netdatacache:/var/cache/netdata \
+  -v /etc/passwd:/host/etc/passwd:ro \
+  -v /etc/group:/host/etc/group:ro \
+  -v /proc:/host/proc:ro \
+  -v /sys:/host/sys:ro \
+  -v /etc/os-release:/host/etc/os-release:ro \
+  --restart unless-stopped \
+  --cap-add SYS_PTRACE \
+  --security-opt apparmor=unconfined \
+  netdata/netdata \
+  -W set2 cloud global enabled true -W set2 cloud global "cloud base url" "https://app.netdata.cloud" -W "claim \
+  -token=TOKEN \
+  -rooms=ROOM1,ROOM2 \
+  -url=https://app.netdata.cloud"
+```
+
+The container runs in detached mode, so you won't see any output. If the node does not appear in your Space, you can run
+the following to find any error output and use that to guide your [troubleshooting](#troubleshooting). Replace `netdata`
+with the name of your container if different.
+
+```bash
+docker logs netdata 2>&1 | grep -E --line-buffered 'ACLK|claim|cloud'
+```
+
+### Claim a Kubernetes cluster's parent Netdata pod
+
+Read our [Kubernetes installation](/packaging/installer/methods/kubernetes.md#claim-a-kubernetes-clusters-parent-pod)
+for details on claiming a parent Netdata pod.
+
+### Claim through a proxy
+
+A Space's administrator can claim a node through a SOCKS5 or HTTP(S) proxy.
+
+You should first configure the proxy in the `[cloud]` section of `netdata.conf`. The proxy settings you specify here
+will also be used to tunnel the ACLK. The default `proxy` setting is `none`.
+
+```conf
+[cloud]
+    proxy = none
+```
+
+The `proxy` setting can take one of the following values:
+
+- `none`: Do not use a proxy, even if the system is configured otherwise.
+- `env`: Try to read proxy settings from the environment variables `http_proxy`/`socks_proxy`, if set.
+- `socks5[h]://[user:pass@]host:port`: The ACLK and claiming will use the specified SOCKS5 proxy.
+- `http://[user:pass@]host:port`: The ACLK and claiming will use the specified HTTP(S) proxy.
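
The four forms of the `proxy` value listed above can be sketched as a small shell check. This is only an illustration, not part of Netdata: `classify_proxy` is a hypothetical helper, and it assumes that only the prefixes named in the list are valid, so any other value is reported as `unknown`.

```bash
# Hypothetical helper (not shipped with Netdata): classify a candidate value
# for the `proxy` setting according to the four forms listed above.
classify_proxy() {
    case "$1" in
        none)                   echo "none" ;;    # never use a proxy
        env)                    echo "env" ;;     # read http_proxy/socks_proxy
        socks5://*|socks5h://*) echo "socks5" ;;  # SOCKS5 proxy URL
        http://*)               echo "http" ;;    # HTTP(S) proxy URL
        *)                      echo "unknown"; return 1 ;;  # assumption: anything else is rejected
    esac
}
```

For example, `classify_proxy socks5h://proxy.example.com:1080` prints `socks5`, while a value such as `ftp://proxy.example.com` prints `unknown` and returns a non-zero status.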
+
+For example, a SOCKS5 proxy setting may look like the following:
+
+```conf
+[cloud]
+    proxy = socks5h://203.0.113.0:1080       # With an IP address
+    proxy = socks5h://proxy.example.com:1080 # With a hostname
+```
+
+You can now move on to claiming. When you claim with the `netdata-claim.sh` script, add the `-proxy=` parameter and
+append the same proxy setting you added to `netdata.conf`.
+
+```bash
+sudo netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2 -url=https://app.netdata.cloud -proxy=socks5h://203.0.113.0:1080
+```
+
+Hit **Enter**. The script should return `Agent was successfully claimed.`. If the claiming script returns errors, or if
+you don't see the node in your Space after 60 seconds, see the [troubleshooting information](#troubleshooting).
+
+### Troubleshooting
+
+If you're having trouble claiming a node, this may be because the [ACLK](/aclk/README.md) cannot connect to Cloud.
+
+With the Netdata Agent running, visit `http://NODE:19999/api/v1/info` in your browser, replacing `NODE` with the IP
+address or hostname of your Agent. The returned JSON contains four keys that will be helpful to diagnose any issues you
+might be having with the ACLK or claiming process.
+
+```json
+    "cloud-enabled"
+    "cloud-available"
+    "agent-claimed"
+    "aclk-available"
+```
+
+Use these keys and the information below to troubleshoot the ACLK.
+
+#### bash: netdata-claim.sh: command not found
+
+If you run the claiming script and see a `command not found` error, you either installed Netdata in a non-standard
+location or are using an unsupported package. If you installed Netdata in a non-standard path using the `--install`
+option, you need to update your `$PATH` or run `netdata-claim.sh` using the full path. For example, if you installed
+Netdata to `/opt/netdata`, use `/opt/netdata/bin/netdata-claim.sh` to run the claiming script.
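
The lookup described above (try `$PATH` first, then fall back to the full path under an install prefix) can be sketched in shell. This is a minimal sketch, not something Netdata provides: `find_claim_script` is a hypothetical helper, and the only layout it assumes is the `PREFIX/bin/netdata-claim.sh` location mentioned in the text.

```bash
# Hypothetical helper: print the path of the first netdata-claim.sh it finds.
# Arguments are candidate install prefixes, for example /opt/netdata.
find_claim_script() {
    # Prefer whatever is already reachable through $PATH.
    if command -v netdata-claim.sh >/dev/null 2>&1; then
        command -v netdata-claim.sh
        return 0
    fi
    # Otherwise look for PREFIX/bin/netdata-claim.sh under each given prefix.
    for prefix in "$@"; do
        if [ -x "$prefix/bin/netdata-claim.sh" ]; then
            echo "$prefix/bin/netdata-claim.sh"
            return 0
        fi
    done
    return 1  # nothing found; the caller decides how to report it
}
```

You could then run the claiming script through the path it prints, for example by calling `find_claim_script /opt/netdata` and using the result in place of `netdata-claim.sh`.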
+
+If you are using an unsupported package, such as a third-party `.deb`/`.rpm` package provided by your distribution,
+please remove that package and reinstall using our [recommended kickstart
+script](/docs/get/README.md#install-the-netdata-agent).
+
+#### Claiming on older distributions (Ubuntu 14.04, Debian 8, CentOS 6)
+
+If you're running an older Linux distribution or one that has reached EOL, such as Ubuntu 14.04 LTS, Debian 8, or CentOS
+6, your Agent may not be able to securely connect to Netdata Cloud due to an outdated version of OpenSSL. These old
+versions of OpenSSL cannot perform [hostname validation](https://wiki.openssl.org/index.php/Hostname_validation), which
+confirms that the server's certificate matches the host the Agent intended to reach.
+
+We recommend you reinstall Netdata with a [static build](/packaging/installer/methods/kickstart-64.md), which uses an
+up-to-date version of OpenSSL with hostname validation enabled.
+
+If you choose to continue using the outdated version of OpenSSL, your node will still connect to Netdata Cloud, albeit
+with hostname verification disabled. Without verification, your Netdata Cloud connection could be vulnerable to
+man-in-the-middle attacks.
+
+#### cloud-enabled is false
+
+If `cloud-enabled` is `false`, you probably ran the installer with the `--disable-cloud` option.
+
+Additionally, check that the `enabled` setting in `/var/lib/netdata/cloud.d/cloud.conf` is set to `true`:
+
+```conf
+[global]
+    enabled = true
+```
+
+To fix this issue, reinstall Netdata using your [preferred method](/packaging/installer/README.md) and do not add the
+`--disable-cloud` option.
+
+#### cloud-available is false
+
+If `cloud-available` is `false` after you verified Cloud is enabled in the previous step, the most likely issue is that
+Cloud features failed to build during installation.
+
+If Cloud features fail to build, the installer continues and finishes the process without Cloud functionality as opposed
+to failing the installation altogether. We do this to ensure the Agent will always finish installing.
+
+If you can't see an explicit error in the installer's output, you can run the installer with the `--require-cloud`
+option. This option causes the installation to fail if Cloud functionality can't be built and enabled, and the
+installer's output should give you more error details.
+
+You may see one of the following error messages during installation:
+
+- Failed to build libmosquitto. The install process will continue, but you will not be able to connect this node to
+  Netdata Cloud.
+- Unable to fetch sources for libmosquitto. The install process will continue, but you will not be able to connect
+  this node to Netdata Cloud.
+- Failed to build libwebsockets. The install process will continue, but you may not be able to connect this node to
+  Netdata Cloud.
+- Unable to fetch sources for libwebsockets. The install process will continue, but you may not be able to connect
+  this node to Netdata Cloud.
+- Could not find cmake, which is required to build libwebsockets. The install process will continue, but you may not
+  be able to connect this node to Netdata Cloud.
+- Could not find cmake, which is required to build JSON-C. The install process will continue, but Netdata Cloud
+  support will be disabled.
+- Failed to build JSON-C. Netdata Cloud support will be disabled.
+- Unable to fetch sources for JSON-C. Netdata Cloud support will be disabled.
+
+One common cause of the installer failing to build Cloud features is not having one of the following dependencies on
+your system: `cmake` and OpenSSL, including the `devel` package.
+
+You can also look for error messages in `/var/log/netdata/error.log`. Try one of the following two commands to search
+for ACLK-related errors.
+
+```bash
+less /var/log/netdata/error.log
+grep -i ACLK /var/log/netdata/error.log
+```
+
+If the installer's output does not help you enable Cloud features, contact us by [creating an issue on
+GitHub](https://github.com/netdata/netdata/issues/new?labels=bug%2C+needs+triage%2C+ACLK&template=bug_report.md&title=The+installer+failed+to+prepare+the+required+dependencies+for+Netdata+Cloud+functionality)
+with details about your system and relevant output from `error.log`.
+
+#### agent-claimed is false
+
+You must [claim your node](#how-to-claim-a-node).
+
+#### aclk-available is false
+
+If `aclk-available` is `false` and all other keys are `true`, your Agent is having trouble connecting to the Cloud
+through the ACLK. Please check your system's firewall.
+
+If your Agent needs to use a proxy to access the internet, you must [set up a proxy for
+claiming](#claim-through-a-proxy).
+
+If you are certain firewall and proxy settings are not the issue, you should consult the Agent's `error.log` at
+`/var/log/netdata/error.log` and contact us by [creating an issue on
+GitHub](https://github.com/netdata/netdata/issues/new?labels=bug%2C+needs+triage%2C+ACLK&template=bug_report.md&title=ACLK-available-is-false)
+with details about your system and relevant output from `error.log`.
+
+### Remove and reclaim a node
+
+To remove a node from your Space in Netdata Cloud, delete the `cloud.d/` directory in your Netdata library directory.
+
+```bash
+cd /var/lib/netdata   # Replace with your Netdata library directory, if not /var/lib/netdata/
+sudo rm -rf cloud.d/
+```
+
+This node no longer has access to the credentials it was claimed with and cannot connect to Netdata Cloud via the ACLK.
+You will still be able to see this node in your War Rooms in an **unreachable** state.
+
+If you want to reclaim this node into a different Space, you need to create a new identity by adding `-id=$(uuidgen)` to
+the claiming script parameters. Make sure that you have the `uuid-runtime` package installed, as it provides the `uuidgen` command. For example, using the default claiming script:
+
+```bash
+sudo netdata-claim.sh -token=TOKEN -rooms=ROOM1,ROOM2 -url=https://app.netdata.cloud -id=$(uuidgen)
+```
+
+The Agent _must be restarted_ after this change.
+
+## Claiming reference
+
+In the sections below, you can find reference material for the claiming script, claiming via the Agent's command line
+tool, and details about the files found in `cloud.d`.
+
+### The `cloud.conf` file
+
+This section defines how and whether your Agent connects to [Netdata Cloud](https://learn.netdata.cloud/docs/cloud/)
+using the [ACLK](/aclk/README.md).
+
+| setting        | default                   | info                                                                                                                                   |
+|:-------------- |:------------------------- |:-------------------------------------------------------------------------------------------------------------------------------------- |
+| cloud base url | https://app.netdata.cloud | The URL for the Netdata Cloud web application. You should not change this. If you want to disable Cloud, change the `enabled` setting. |
+| enabled        | yes                       | The runtime option to disable the [Agent-Cloud link](/aclk/README.md) and prevent your Agent from connecting to Netdata Cloud.         |
+
+### Claiming script
+
+A Space's administrator can claim an Agent by directly calling the `netdata-claim.sh` script either with root privileges
+using `sudo`, or as the user running the Agent (typically `netdata`), and passing the following arguments:
+
+```sh
+-token=TOKEN
+    where TOKEN is the Space's claiming token.
+-rooms=ROOM1,ROOM2,...
+    where ROOMX is the War Room this node should be added to. This list is optional.
+-url=URL_BASE
+    where URL_BASE is the Netdata Cloud endpoint base URL. By default, this is https://app.netdata.cloud.
+-id=AGENT_ID
+    where AGENT_ID is the unique identifier of the Agent. This is the Agent's MACHINE_GUID by default.
+-hostname=HOSTNAME
+    where HOSTNAME is the result of the hostname command by default.
+-proxy=PROXY_URL
+    where PROXY_URL is the endpoint of a SOCKS5 or HTTP(S) proxy.
+```
+
+For example, the following command claims an Agent and adds it to rooms `room1` and `room2`:
+
+```sh
+netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2
+```
+
+You should then notify the `netdata` service of the result with `netdatacli`:
+
+```sh
+netdatacli reload-claiming-state
+```
+
+This reloads the Agent claiming state from disk.
+
+### Netdata Agent command line
+
+If a Netdata Agent is running, the Space's administrator can claim a node using the `netdata` service binary with
+additional command line parameters:
+
+```sh
+-W "claim -token=TOKEN -rooms=ROOM1,ROOM2"
+```
+
+For example:
+
+```sh
+/usr/sbin/netdata -D -W "claim -token=MYTOKEN1234567 -rooms=room1,room2"
+```
+
+If need be, the user can override the Agent's defaults by providing additional arguments like those described
+[here](#claiming-script).
+
+### Claiming directory
+
+Netdata stores the Agent's claiming-related state in the Netdata library directory under `cloud.d`. For a default
+installation, this directory exists at `/var/lib/netdata/cloud.d`. The directory and its files should be owned by the
+user that runs the Agent, which is typically the `netdata` user.
+
+The `cloud.d/token` file should contain the claiming token and the `cloud.d/rooms` file should contain the list of War
+Rooms you added that node to.
+
+The user can also put the Cloud endpoint's full certificate chain in `cloud.d/cloud_fullchain.pem` so that the Agent
+can trust the endpoint if necessary.
+
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fclaim%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
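
To see at a glance what the claiming directory described above contains, a small shell check can help. This is a sketch under stated assumptions, not a Netdata tool: `claim_files` is a hypothetical helper, the output wording is invented for illustration, and it only reports whether the `token` and `rooms` files exist and are non-empty. It does not prove that the node is actually claimed.

```bash
# Hypothetical helper: report whether the token and rooms files are present
# in a claiming directory (defaults to /var/lib/netdata/cloud.d).
claim_files() {
    dir="${1:-/var/lib/netdata/cloud.d}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    for name in token rooms; do
        # -s is true only when the file exists and is not empty.
        if [ -s "$dir/$name" ]; then
            echo "$name: present"
        else
            echo "$name: absent or empty"
        fi
    done
}
```

Depending on the directory's permissions, you may need to run it as root or as the user that runs the Agent.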