author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:23 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:44 +0000 |
commit | 836b47cb7e99a977c5a23b059ca1d0b5065d310e | |
tree | 1604da8f482d02effa033c94a84be42bc0c848c3 /src/claim | |
parent | Releasing debian version 1.44.3-2. | |
Merging upstream version 1.46.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/claim')
-rw-r--r-- | src/claim/README.md | 347 | ||||
-rw-r--r-- | src/claim/claim.c | 475 | ||||
-rw-r--r-- | src/claim/claim.h | 32 | ||||
-rwxr-xr-x | src/claim/netdata-claim.sh.in | 451 |
4 files changed, 1305 insertions, 0 deletions
diff --git a/src/claim/README.md b/src/claim/README.md new file mode 100644 index 000000000..51e2a9ebe --- /dev/null +++ b/src/claim/README.md @@ -0,0 +1,347 @@ +# Connect Agent to Cloud + +This section guides you through installing and securely connecting a new Netdata Agent to Netdata Cloud via the +encrypted Agent-Cloud Link ([ACLK](/src/aclk/README.md)). Connecting your agent to Netdata Cloud unlocks additional +features like centralized monitoring and easier collaboration. + +## Connect + +### Install and Connect a New Agent + +There are two places in the UI where you can add/connect your Node: + +- **Space/Room settings**: Click the cogwheel (the bottom-left corner or next to the Room name at the top) and + select "Nodes." Click the "+" button to add + a new node. +- [**Nodes tab**](/docs/dashboards-and-charts/nodes-tab.md): Click on the "Add nodes" button. + +Netdata Cloud will generate a command that you can execute on your Node to install and claim the Agent. The command is +available for different installation methods: + +| Method | Description | +|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| Linux/FreeBSD/macOS | Install directly using the [kickstart.sh](/packaging/installer/methods/kickstart.md) script. | +| Docker | Install as a container using the provided docker run command or YAML files (Docker Compose/Swarm). | +| Kubernetes | Install inside the cluster using `helm`. **Important**: refer to the [Kubernetes installation](/packaging/installer/methods/kubernetes.md#deploy-netdata-on-your-kubernetes-cluster) for detailed instructions. | + +Once you've chosen your installation method, follow the provided instructions to install and connect the Agent. 
+ +### Connect an Existing Agent + +There are two methods to connect an already installed Netdata Agent to your Netdata Cloud Space: + +- using the Netdata Cloud user interface (UI). +- using the claiming script. + +#### Using the UI (recommended) + +The UI method is the easiest and recommended way to connect your Agent. Here's how: + +1. Open your Agent local UI. +2. Sign in to your Netdata Cloud account. +3. Click the "Connect" button. +4. Follow the on-screen instructions to connect your Agent. + +#### Using claiming script + +You can connect an Agent by running +the [netdata-claim.sh](https://github.com/netdata/netdata/blob/master/src/claim/netdata-claim.sh.in) script directly. +You can either run it with root privileges using `sudo` or as the user running the Agent (typically `netdata`). + +The claiming script accepts options that control the connection process. You can specify these options using the +following format: + +```bash +netdata-claim.sh -OPTION=VALUE ... +``` + +Claiming script options: + +| Option | Description | Required | Default value | +|--------|--------------------------------------------------------------------|:--------:|:------------------------------------------------------| +| token | The claiming token for your Netdata Cloud Space. | yes | | +| rooms | A comma-separated list of Rooms to add the Agent to. | no | The Agent will be added to the "All nodes" Room only. | +| id | The unique identifier of the Agent. | no | The Agent's MACHINE_GUID. | +| proxy | The URL of a proxy server to use for the connection, if necessary. | no | | + +Example: + +```bash +netdata-claim.sh -token=MYTOKEN1234567 -rooms=room1,room2 +``` + +This command connects the Agent and adds it to the "room1" and "room2" Rooms using your claiming token +MYTOKEN1234567. + +## Reconnect + +### Linux based installations + +To remove a node from your Space in Netdata Cloud, delete the `cloud.d/` directory in your Netdata library directory. 
+ +```bash +cd /var/lib/netdata # Replace with your Netdata library directory, if not /var/lib/netdata/ +sudo rm -rf cloud.d/ +``` + +This node no longer has access to the credentials it used when connecting to Netdata Cloud via the ACLK. You will +still be able to see this node in your Rooms in an **unreachable** state. + +If you want to reconnect this node, you need to: + +1. Ensure that the `/var/lib/netdata/cloud.d` directory doesn't exist. In some installations, the path + is `/opt/netdata/var/lib/netdata/cloud.d` +2. Stop the Agent +3. Ensure that the `uuidgen-runtime` package is installed. Run ```echo "$(uuidgen)"``` and validate you get back a UUID +4. Copy the kickstart.sh command to add a node from your space and add to the end of it `--claim-id "$(uuidgen)"`. Run + the command and look for the message `Node was successfully claimed.` +5. Start the Agent + +### Docker based installations + +To remove a node from your Space in Netdata Cloud, and connect it to another Space, follow these steps: + +1. Enter the running container you wish to remove from your Space + + ```bash + docker exec -it CONTAINER_NAME sh + ``` + + Replacing `CONTAINER_NAME` with either the container's name or ID. + +2. Delete `/var/lib/netdata/cloud.d` and `/var/lib/netdata/registry/netdata.public.unique.id` + + ```bash + rm -rf /var/lib/netdata/cloud.d/ + + rm /var/lib/netdata/registry/netdata.public.unique.id + ``` + +3. Stop and remove the container + + **Docker CLI:** + + ```bash + docker stop CONTAINER_NAME + + docker rm CONTAINER_NAME + ``` + + Replacing `CONTAINER_NAME` with either the container's name or ID. + + **Docker Compose:** + Inside the directory that has the `docker-compose.yml` file, run: + + ```bash + docker compose down + ``` + + **Docker Swarm:** + Run the following, and replace `STACK` with your Stack's name: + + ```bash + docker stack rm STACK + ``` + +4. Finally, go to your new Space, copy the installation command with the new claim token and run it. 
+ If you are using a `docker-compose.yml` file, you will have to overwrite it with the new claiming token. + The node should now appear online in that Space. + +## Regenerate Claiming Token + +If you need to revoke your previous claiming token and generate a new one, for security or any other reason, you +can do so from the Netdata Cloud UI. + +On any screen that shows the command for connecting a node to Netdata Cloud, you'll see above it, next to +the [updates channel](/docs/netdata-agent/versions-and-platforms.md), a +button to **Regenerate token**. This action will invalidate your previous token and generate a new one. + +Only the administrators of a Space in Netdata Cloud can trigger this action. + +## Troubleshoot + +If you're having trouble connecting a node, this may be because +the [ACLK](/src/aclk/README.md) cannot connect to Cloud. + +With the Netdata Agent running, visit `http://NODE:19999/api/v1/info` in your browser, replacing `NODE` with the IP +address or hostname of your Agent. The returned JSON contains four keys that will be helpful to diagnose any issues you +might be having with the ACLK or connection process. + +``` +"cloud-enabled" +"cloud-available" +"agent-claimed" +"aclk-available" +``` + +> **Note** +> +> On Netdata Agent version `1.32` (`netdata -v` to find your version) and newer, `sudo netdatacli aclk-state` can be +> used to get some diagnostic information about ACLK. Sample output: + +```bash +ACLK Available: Yes +ACLK Implementation: Next Generation +New Cloud Protocol Support: Yes +Claimed: Yes +Claimed Id: 53aa76c2-8af5-448f-849a-b16872cc4ba1 +Online: Yes +Used Cloud Protocol: New +``` + +Use these keys and the information below to troubleshoot the ACLK. 
+ +### kickstart: unsupported Netdata installation + +If you run the kickstart script and get the following +error `Existing install appears to be handled manually or through the system package manager.` you most probably +installed Netdata using an unsupported package. + +> **Note** +> +> If you are using an unsupported package, such as a third-party `.deb`/`.rpm` package provided by your distribution, +> please remove that package and reinstall using +> +our [recommended kickstart script](/packaging/installer/methods/kickstart.md). + +### kickstart: Failed to write new machine GUID + +If you run the kickstart script without the privileges required to connect the Agent to Netdata +Cloud, you will get the following error: + +```bash +Failed to write new machine GUID. Please make sure you have rights to write to /var/lib/netdata/registry/netdata.public.unique.id. +``` + +To avoid this error, run the script with root privileges or as the user that is running +the Agent. + +### bash: netdata-claim.sh: command not found + +If you run the claiming script and see a `command not found` error, you either installed Netdata in a non-standard +location or are using an unsupported package. If you installed Netdata in a non-standard path using +the `--install-prefix` option, you need to update your `$PATH` or run `netdata-claim.sh` using the full path. + +For example, if you installed Netdata to `/opt/netdata`, use `/opt/netdata/bin/netdata-claim.sh` to run the claiming +script. + +> **Note** +> +> If you are using an unsupported package, such as a third-party `.deb`/`.rpm` package provided by your distribution, +> please remove that package and reinstall using +> +our [recommended kickstart script](/packaging/installer/methods/kickstart.md). 
+ +### Connecting on older distributions (Ubuntu 14.04, Debian 8, CentOS 6) + +If you're running an older Linux distribution or one that has reached EOL, such as Ubuntu 14.04 LTS, Debian 8, or CentOS +6, your Agent may not be able to securely connect to Netdata Cloud due to an outdated version of OpenSSL. These old +versions of OpenSSL cannot perform [hostname validation](https://wiki.openssl.org/index.php/Hostname_validation), which +confirms that the server's certificate matches the host you are connecting to. + +We recommend you reinstall Netdata with +a [static build](/packaging/installer/methods/kickstart.md#static-builds), +which uses an up-to-date version of OpenSSL with hostname validation enabled. + +If you choose to continue using the outdated version of OpenSSL, your node will still connect to Netdata Cloud, albeit +with hostname verification disabled. Without verification, your Netdata Cloud connection could be vulnerable to +man-in-the-middle attacks. + +### cloud-enabled is false + +If `cloud-enabled` is `false`, you probably ran the installer with the `--disable-cloud` option. + +Additionally, check that the `enabled` setting in `/var/lib/netdata/cloud.d/cloud.conf` is set to `true`: + +```conf +[global] + enabled = true +``` + +To fix this issue, reinstall Netdata using +your [preferred method](/packaging/installer/README.md) and do not add +the `--disable-cloud` option. + +### cloud-available is false / ACLK Available: No + +If `cloud-available` is `false` after you verified Cloud is enabled in the previous step, the most likely issue is that +Cloud features failed to build during installation. + +If Cloud features fail to build, the installer continues and finishes the process without Cloud functionality as opposed +to failing the installation altogether. + +We do this to ensure the Agent will always finish installing. + +If you can't see an explicit error in the installer's output, you can run the installer with the `--require-cloud` +option.
This option causes the installation to fail if Cloud functionality can't be built and enabled, and the +installer's output should give you more error details. + +You may see one of the following error messages during installation: + +- `Failed to build libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.` +- `Unable to fetch sources for libmosquitto. The install process will continue, but you will not be able to connect this node to Netdata Cloud.` +- `Failed to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.` +- `Unable to fetch sources for libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.` +- `Could not find cmake, which is required to build libwebsockets. The install process will continue, but you may not be able to connect this node to Netdata Cloud.` +- `Could not find cmake, which is required to build JSON-C. The install process will continue, but Netdata Cloud support will be disabled.` +- `Failed to build JSON-C. Netdata Cloud support will be disabled.` +- `Unable to fetch sources for JSON-C. Netdata Cloud support will be disabled.` + +One common cause of the installer failing to build Cloud features is not having one of the following dependencies on +your system: `cmake`, `json-c` and `OpenSSL`, including corresponding `devel` packages. + +You can also look for error messages in `/var/log/netdata/error.log`. Try one of the following two commands to search +for ACLK-related errors. 
+ +```bash +less /var/log/netdata/error.log +grep -i ACLK /var/log/netdata/error.log +``` + +If the installer's output does not help you enable Cloud features, contact us +by [creating an issue on GitHub](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml&title=The+installer+failed+to+prepare+the+required+dependencies+for+Netdata+Cloud+functionality) +with details about your system and relevant output from `error.log`. + +### agent-claimed is false / Claimed: No + +You must [connect your node](#connect). + +### aclk-available is false / Online: No + +If `aclk-available` is `false` and all other keys are `true`, your Agent is having trouble connecting to the Cloud +through the ACLK. Please check your system's firewall. + +If your Agent needs to use a proxy to access the internet, you must set up a proxy for connecting. + +If you are certain firewall and proxy settings are not the issue, you should consult the Agent's `error.log` +at `/var/log/netdata/error.log` and contact us +by [creating an issue on GitHub](https://github.com/netdata/netdata/issues/new?assignees=&labels=bug%2Cneeds+triage&template=BUG_REPORT.yml&title=ACLK-available-is-false) +with details about your system and relevant output from `error.log`. + +## Connecting reference + +In the sections below, you can find reference material for the kickstart script, claiming script, connecting via the +Agent's command line tool, and details about the files found in `cloud.d`. + +### The `cloud.conf` file + +This section defines how and whether your Agent connects to Netdata Cloud using +the [Agent-Cloud link](/src/aclk/README.md)(ACLK). + +| setting | default | info | +|:---------------|:----------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------| +| enabled | yes | Controls whether the ACLK is active. 
Set to no to prevent the Agent from connecting to Netdata Cloud. | +| cloud base url | <https://app.netdata.cloud> | The URL for the Netdata Cloud web application. Typically, this should not be changed. | +| proxy | env | Specifies the proxy setting for the ACLK. Options: none (no proxy), env (use environment's proxy), or a URL (e.g., `http://proxy.example.com:1080`). | + +### Connection directory + +Netdata stores the Agent's connection-related state in the Netdata library directory under `cloud.d`. For a default +installation, this directory exists at `/var/lib/netdata/cloud.d`. The directory and its files should be owned by the +user that runs the Agent, which is typically the `netdata` user. + +The `cloud.d/token` file should contain the claiming-token and the `cloud.d/rooms` file should contain the list of War +Rooms you added that node to. + +The user can also put the Cloud endpoint's full certificate chain in `cloud.d/cloud_fullchain.pem` so that the Agent +can trust the endpoint if necessary. 
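Editor's note (not part of the commit): the README above asks the reader to check the `enabled` setting in `cloud.d/cloud.conf` and lists the defaults for `enabled`, `cloud base url` and `proxy`. A minimal sketch of how such a check could be scripted is shown below. The helper name `cloud_conf_get` is hypothetical and is not provided by Netdata; the sketch assumes the file is a plain INI-style file with a `[global]` section, as in the README's example, and it simply falls back to a caller-supplied default when the file or the key is missing.

```shell
#!/usr/bin/env bash
# Sketch only: cloud_conf_get is a hypothetical helper, not a Netdata tool.
# Prints the value of KEY from the [global] section of an INI-style file,
# or DEFAULT when the file is unreadable or the key is absent or empty.
cloud_conf_get() {
  local file="$1" key="$2" default="$3" value=""
  if [ -r "$file" ]; then
    value=$(awk -v want="$key" '
      # A section header: remember whether we are inside [global].
      /^[ \t]*\[/ { in_global = ($0 ~ /^[ \t]*\[global\][ \t]*$/); next }
      # A "name = value" line inside [global].
      in_global && index($0, "=") {
        name = $0; sub(/=.*/, "", name)
        gsub(/^[ \t]+|[ \t]+$/, "", name)
        if (name == want) {
          val = $0; sub(/^[^=]*=/, "", val)
          gsub(/^[ \t]+|[ \t]+$/, "", val)
          print val
          exit
        }
      }
    ' "$file")
  fi
  printf '%s\n' "${value:-$default}"
}
```

For a default installation, one might call it as `cloud_conf_get /var/lib/netdata/cloud.d/cloud.conf enabled yes`, passing the default from the settings table as the third argument. This only reads the file on disk; the running Agent remains the authority on its effective configuration.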
diff --git a/src/claim/claim.c b/src/claim/claim.c new file mode 100644 index 000000000..5f4ec9a43 --- /dev/null +++ b/src/claim/claim.c @@ -0,0 +1,475 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "claim.h" +#include "registry/registry_internals.h" +#include "aclk/aclk.h" +#include "aclk/aclk_proxy.h" + +char *claiming_pending_arguments = NULL; + +static char *claiming_errors[] = { + "Agent claimed successfully", // 0 + "Unknown argument", // 1 + "Problems with claiming working directory", // 2 + "Missing dependencies", // 3 + "Failure to connect to endpoint", // 4 + "The CLI didn't work", // 5 + "Wrong user", // 6 + "Unknown HTTP error message", // 7 + "invalid node id", // 8 + "invalid node name", // 9 + "invalid room id", // 10 + "invalid public key", // 11 + "token expired/token not found/invalid token", // 12 + "already claimed", // 13 + "processing claiming", // 14 + "Internal Server Error", // 15 + "Gateway Timeout", // 16 + "Service Unavailable", // 17 + "Agent Unique Id Not Readable" // 18 +}; + +/* Retrieve the claim id for the agent. + * Caller owns the string. +*/ +char *get_agent_claimid() +{ + char *result; + rrdhost_aclk_state_lock(localhost); + result = (localhost->aclk_state.claimed_id == NULL) ? 
NULL : strdupz(localhost->aclk_state.claimed_id); + rrdhost_aclk_state_unlock(localhost); + return result; +} + +#define CLAIMING_COMMAND_LENGTH 16384 +#define CLAIMING_PROXY_LENGTH (CLAIMING_COMMAND_LENGTH/4) + +/* rrd_init() and post_conf_load() must have been called before this function */ +CLAIM_AGENT_RESPONSE claim_agent(const char *claiming_arguments, bool force, const char **msg __maybe_unused) +{ + if (!force || !netdata_cloud_enabled) { + netdata_log_error("Refusing to claim agent -> cloud functionality has been disabled"); + return CLAIM_AGENT_CLOUD_DISABLED; + } + +#ifndef DISABLE_CLOUD + int exit_code; + pid_t command_pid; + char command_exec_buffer[CLAIMING_COMMAND_LENGTH + 1]; + char command_line_buffer[CLAIMING_COMMAND_LENGTH + 1]; + FILE *fp_child_output, *fp_child_input; + + // This is guaranteed to be set early in main via post_conf_load() + char *cloud_base_url = appconfig_get(&cloud_config, CONFIG_SECTION_GLOBAL, "cloud base url", NULL); + if (cloud_base_url == NULL) { + internal_fatal(true, "Do not move the cloud base url out of post_conf_load!!"); + return CLAIM_AGENT_NO_CLOUD_URL; + } + + const char *proxy_str; + ACLK_PROXY_TYPE proxy_type; + char proxy_flag[CLAIMING_PROXY_LENGTH] = "-noproxy"; + + proxy_str = aclk_get_proxy(&proxy_type); + + if (proxy_type == PROXY_TYPE_SOCKS5 || proxy_type == PROXY_TYPE_HTTP) + snprintf(proxy_flag, CLAIMING_PROXY_LENGTH, "-proxy=\"%s\"", proxy_str); + + snprintfz(command_exec_buffer, CLAIMING_COMMAND_LENGTH, + "exec \"%s%snetdata-claim.sh\"", + netdata_exe_path[0] ? netdata_exe_path : "", + netdata_exe_path[0] ? 
"/" : "" + ); + + snprintfz(command_line_buffer, + CLAIMING_COMMAND_LENGTH, + "%s %s -hostname=%s -id=%s -url=%s -noreload %s", + command_exec_buffer, + proxy_flag, + netdata_configured_hostname, + localhost->machine_guid, + cloud_base_url, + claiming_arguments); + + netdata_log_info("Executing agent claiming command: %s", command_exec_buffer); + fp_child_output = netdata_popen(command_line_buffer, &command_pid, &fp_child_input); + if(!fp_child_output) { + netdata_log_error("Cannot popen(\"%s\").", command_exec_buffer); + return CLAIM_AGENT_CANNOT_EXECUTE_CLAIM_SCRIPT; + } + + netdata_log_info("Waiting for claiming command '%s' to finish.", command_exec_buffer); + char read_buffer[100 + 1]; + while (fgets(read_buffer, 100, fp_child_output) != NULL) ; + + exit_code = netdata_pclose(fp_child_input, fp_child_output, command_pid); + + netdata_log_info("Agent claiming command '%s' returned with code %d", command_exec_buffer, exit_code); + if (0 == exit_code) { + load_claiming_state(); + return CLAIM_AGENT_OK; + } + if (exit_code < 0) { + netdata_log_error("Agent claiming command '%s' failed to complete its run", command_exec_buffer); + return CLAIM_AGENT_CLAIM_SCRIPT_FAILED; + } + errno = 0; + unsigned maximum_known_exit_code = sizeof(claiming_errors) / sizeof(claiming_errors[0]) - 1; + + if ((unsigned)exit_code > maximum_known_exit_code) { + netdata_log_error("Agent failed to be claimed with an unknown error. Cmd: '%s'", command_exec_buffer); + return CLAIM_AGENT_CLAIM_SCRIPT_RETURNED_INVALID_CODE; + } + + netdata_log_error("Agent failed to be claimed using the command '%s' with the following error message:", + command_exec_buffer); + + netdata_log_error("\"%s\"", claiming_errors[exit_code]); + + if(msg) *msg = claiming_errors[exit_code]; + +#else + UNUSED(claiming_arguments); + UNUSED(claiming_errors); +#endif + + return CLAIM_AGENT_FAILED_WITH_MESSAGE; +} + +/* Change the claimed state of the agent. 
+ * + * This only happens when the user has explicitly requested it: + * - via the cli tool by reloading the claiming state + * - after spawning the claim because of a command-line argument + * If this happens with the ACLK active under an old claim then we MUST KILL THE LINK + */ +void load_claiming_state(void) +{ + // -------------------------------------------------------------------- + // Check if the cloud is enabled +#if defined( DISABLE_CLOUD ) || !defined( ENABLE_ACLK ) + netdata_cloud_enabled = false; +#else + nd_uuid_t uuid; + + // Propagate into aclk and registry. Be kind of atomic... + appconfig_get(&cloud_config, CONFIG_SECTION_GLOBAL, "cloud base url", DEFAULT_CLOUD_BASE_URL); + + rrdhost_aclk_state_lock(localhost); + if (localhost->aclk_state.claimed_id) { + if (aclk_connected) + localhost->aclk_state.prev_claimed_id = strdupz(localhost->aclk_state.claimed_id); + freez(localhost->aclk_state.claimed_id); + localhost->aclk_state.claimed_id = NULL; + } + if (aclk_connected) + { + netdata_log_info("Agent was already connected to Cloud - forcing reconnection under new credentials"); + aclk_kill_link = 1; + } + aclk_disable_runtime = 0; + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/cloud.d/claimed_id", netdata_configured_varlib_dir); + + long bytes_read; + char *claimed_id = read_by_filename(filename, &bytes_read); + if(claimed_id && uuid_parse(claimed_id, uuid)) { + netdata_log_error("claimed_id \"%s\" doesn't look like valid UUID", claimed_id); + freez(claimed_id); + claimed_id = NULL; + } + + if(claimed_id) { + localhost->aclk_state.claimed_id = mallocz(UUID_STR_LEN); + uuid_unparse_lower(uuid, localhost->aclk_state.claimed_id); + } + + rrdhost_aclk_state_unlock(localhost); + invalidate_node_instances(&localhost->host_uuid, claimed_id ? &uuid : NULL); + metaqueue_store_claim_id(&localhost->host_uuid, claimed_id ? 
&uuid : NULL); + + if (!claimed_id) { + netdata_log_info("Unable to load '%s', setting state to AGENT_UNCLAIMED", filename); + return; + } + + freez(claimed_id); + + netdata_log_info("File '%s' was found. Setting state to AGENT_CLAIMED.", filename); + netdata_cloud_enabled = appconfig_get_boolean_ondemand(&cloud_config, CONFIG_SECTION_GLOBAL, "enabled", netdata_cloud_enabled); +#endif +} + +struct config cloud_config = { .first_section = NULL, + .last_section = NULL, + .mutex = NETDATA_MUTEX_INITIALIZER, + .index = { .avl_tree = { .root = NULL, .compar = appconfig_section_compare }, + .rwlock = AVL_LOCK_INITIALIZER } }; + +void load_cloud_conf(int silent) +{ + char *nd_disable_cloud = getenv("NETDATA_DISABLE_CLOUD"); + if (nd_disable_cloud && !strncmp(nd_disable_cloud, "1", 1)) + netdata_cloud_enabled = CONFIG_BOOLEAN_NO; + + char *filename; + errno = 0; + + int ret = 0; + + filename = strdupz_path_subpath(netdata_configured_varlib_dir, "cloud.d/cloud.conf"); + + ret = appconfig_load(&cloud_config, filename, 1, NULL); + if(!ret && !silent) + netdata_log_info("CONFIG: cannot load cloud config '%s'. Running with internal defaults.", filename); + + freez(filename); + + // -------------------------------------------------------------------- + // Check if the cloud is enabled + +#if defined( DISABLE_CLOUD ) || !defined( ENABLE_ACLK ) + netdata_cloud_enabled = CONFIG_BOOLEAN_NO; +#else + netdata_cloud_enabled = appconfig_get_boolean_ondemand(&cloud_config, CONFIG_SECTION_GLOBAL, "enabled", netdata_cloud_enabled); +#endif + + // This must be set before any point in the code that accesses it. Do not move it from this function. 
+ appconfig_get(&cloud_config, CONFIG_SECTION_GLOBAL, "cloud base url", DEFAULT_CLOUD_BASE_URL); +} + +static char *netdata_random_session_id_filename = NULL; +static nd_uuid_t netdata_random_session_id = { 0 }; + +bool netdata_random_session_id_generate(void) { + static char guid[UUID_STR_LEN] = ""; + + uuid_generate_random(netdata_random_session_id); + uuid_unparse_lower(netdata_random_session_id, guid); + + char filename[FILENAME_MAX + 1]; + snprintfz(filename, FILENAME_MAX, "%s/netdata_random_session_id", netdata_configured_varlib_dir); + + bool ret = true; + + (void)unlink(filename); + + // save it + int fd = open(filename, O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 640); + if(fd == -1) { + netdata_log_error("Cannot create random session id file '%s'.", filename); + ret = false; + } + else { + if (write(fd, guid, UUID_STR_LEN - 1) != UUID_STR_LEN - 1) { + netdata_log_error("Cannot write the random session id file '%s'.", filename); + ret = false; + } else { + ssize_t bytes = write(fd, "\n", 1); + UNUSED(bytes); + } + close(fd); + } + + if(ret && (!netdata_random_session_id_filename || strcmp(netdata_random_session_id_filename, filename) != 0)) { + freez(netdata_random_session_id_filename); + netdata_random_session_id_filename = strdupz(filename); + } + + return ret; +} + +const char *netdata_random_session_id_get_filename(void) { + if(!netdata_random_session_id_filename) + netdata_random_session_id_generate(); + + return netdata_random_session_id_filename; +} + +bool netdata_random_session_id_matches(const char *guid) { + if(uuid_is_null(netdata_random_session_id)) + return false; + + nd_uuid_t uuid; + + if(uuid_parse(guid, uuid)) + return false; + + if(uuid_compare(netdata_random_session_id, uuid) == 0) + return true; + + return false; +} + +static bool check_claim_param(const char *s) { + if(!s || !*s) return true; + + do { + if(isalnum((uint8_t)*s) || *s == '.' 
|| *s == ',' || *s == '-' || *s == ':' || *s == '/' || *s == '_') + ; + else + return false; + + } while(*++s); + + return true; +} + +void claim_reload_all(void) { + nd_log_limits_unlimited(); + load_claiming_state(); + registry_update_cloud_base_url(); + rrdpush_send_claimed_id(localhost); + nd_log_limits_reset(); +} + +int api_v2_claim(struct web_client *w, char *url) { + char *key = NULL; + char *token = NULL; + char *rooms = NULL; + char *base_url = NULL; + + while (url) { + char *value = strsep_skip_consecutive_separators(&url, "&"); + if (!value || !*value) continue; + + char *name = strsep_skip_consecutive_separators(&value, "="); + if (!name || !*name) continue; + if (!value || !*value) continue; + + if(!strcmp(name, "key")) + key = value; + else if(!strcmp(name, "token")) + token = value; + else if(!strcmp(name, "rooms")) + rooms = value; + else if(!strcmp(name, "url")) + base_url = value; + } + + BUFFER *wb = w->response.data; + buffer_flush(wb); + buffer_json_initialize(wb, "\"", "\"", 0, true, BUFFER_JSON_OPTIONS_DEFAULT); + + time_t now_s = now_realtime_sec(); + CLOUD_STATUS status = buffer_json_cloud_status(wb, now_s); + + bool can_be_claimed = false; + switch(status) { + case CLOUD_STATUS_AVAILABLE: + case CLOUD_STATUS_DISABLED: + case CLOUD_STATUS_OFFLINE: + can_be_claimed = true; + break; + + case CLOUD_STATUS_UNAVAILABLE: + case CLOUD_STATUS_BANNED: + case CLOUD_STATUS_ONLINE: + can_be_claimed = false; + break; + } + + buffer_json_member_add_boolean(wb, "can_be_claimed", can_be_claimed); + + if(can_be_claimed && key) { + if(!netdata_random_session_id_matches(key)) { + buffer_reset(wb); + buffer_strcat(wb, "invalid key"); + netdata_random_session_id_generate(); // generate a new key, to avoid an attack to find it + return HTTP_RESP_FORBIDDEN; + } + + if(!token || !base_url || !check_claim_param(token) || !check_claim_param(base_url) || (rooms && !check_claim_param(rooms))) { + buffer_reset(wb); + buffer_strcat(wb, "invalid parameters"); + 
netdata_random_session_id_generate(); // generate a new key, to avoid an attack to find it + return HTTP_RESP_BAD_REQUEST; + } + + netdata_random_session_id_generate(); // generate a new key, to avoid an attack to find it + + netdata_cloud_enabled = CONFIG_BOOLEAN_AUTO; + appconfig_set_boolean(&cloud_config, CONFIG_SECTION_GLOBAL, "enabled", CONFIG_BOOLEAN_AUTO); + appconfig_set(&cloud_config, CONFIG_SECTION_GLOBAL, "cloud base url", base_url); + + nd_uuid_t claimed_id; + uuid_generate_random(claimed_id); + char claimed_id_str[UUID_STR_LEN]; + uuid_unparse_lower(claimed_id, claimed_id_str); + + BUFFER *t = buffer_create(1024, NULL); + if(rooms) + buffer_sprintf(t, "-id=%s -token=%s -rooms=%s", claimed_id_str, token, rooms); + else + buffer_sprintf(t, "-id=%s -token=%s", claimed_id_str, token); + + bool success = false; + const char *msg = NULL; + CLAIM_AGENT_RESPONSE rc = claim_agent(buffer_tostring(t), true, &msg); + switch(rc) { + case CLAIM_AGENT_OK: + msg = "ok"; + success = true; + can_be_claimed = false; + claim_reload_all(); + { + int ms = 0; + do { + status = cloud_status(); + if (status == CLOUD_STATUS_ONLINE && __atomic_load_n(&localhost->node_id, __ATOMIC_RELAXED)) + break; + + sleep_usec(50 * USEC_PER_MS); + ms += 50; + } while (ms < 10000); + } + break; + + case CLAIM_AGENT_NO_CLOUD_URL: + msg = "No Netdata Cloud URL."; + break; + + case CLAIM_AGENT_CLAIM_SCRIPT_FAILED: + msg = "Claiming script failed."; + break; + + case CLAIM_AGENT_CLOUD_DISABLED: + msg = "Netdata Cloud is disabled on this agent."; + break; + + case CLAIM_AGENT_CANNOT_EXECUTE_CLAIM_SCRIPT: + msg = "Failed to execute claiming script."; + break; + + case CLAIM_AGENT_CLAIM_SCRIPT_RETURNED_INVALID_CODE: + msg = "Claiming script returned invalid code."; + break; + + default: + case CLAIM_AGENT_FAILED_WITH_MESSAGE: + if(!msg) + msg = "Unknown error"; + break; + } + + // our status may have changed + // refresh the status in our output + buffer_flush(wb); + buffer_json_initialize(wb, "\"", 
"\"", 0, true, BUFFER_JSON_OPTIONS_DEFAULT); + now_s = now_realtime_sec(); + buffer_json_cloud_status(wb, now_s); + + // and this is the status of the claiming command we run + buffer_json_member_add_boolean(wb, "success", success); + buffer_json_member_add_string(wb, "message", msg); + } + + if(can_be_claimed) + buffer_json_member_add_string(wb, "key_filename", netdata_random_session_id_get_filename()); + + buffer_json_agents_v2(wb, NULL, now_s, false, false); + buffer_json_finalize(wb); + + return HTTP_RESP_OK; +} diff --git a/src/claim/claim.h b/src/claim/claim.h new file mode 100644 index 000000000..ccab8aaa1 --- /dev/null +++ b/src/claim/claim.h @@ -0,0 +1,32 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_CLAIM_H +#define NETDATA_CLAIM_H 1 + +#include "daemon/common.h" + +extern char *claiming_pending_arguments; +extern struct config cloud_config; + +typedef enum __attribute__((packed)) { + CLAIM_AGENT_OK, + CLAIM_AGENT_CLOUD_DISABLED, + CLAIM_AGENT_NO_CLOUD_URL, + CLAIM_AGENT_CANNOT_EXECUTE_CLAIM_SCRIPT, + CLAIM_AGENT_CLAIM_SCRIPT_FAILED, + CLAIM_AGENT_CLAIM_SCRIPT_RETURNED_INVALID_CODE, + CLAIM_AGENT_FAILED_WITH_MESSAGE, +} CLAIM_AGENT_RESPONSE; + +CLAIM_AGENT_RESPONSE claim_agent(const char *claiming_arguments, bool force, const char **msg); +char *get_agent_claimid(void); +void load_claiming_state(void); +void load_cloud_conf(int silent); +void claim_reload_all(void); + +bool netdata_random_session_id_generate(void); +const char *netdata_random_session_id_get_filename(void); +bool netdata_random_session_id_matches(const char *guid); +int api_v2_claim(struct web_client *w, char *url); + +#endif //NETDATA_CLAIM_H diff --git a/src/claim/netdata-claim.sh.in b/src/claim/netdata-claim.sh.in new file mode 100755 index 000000000..f4fa382b6 --- /dev/null +++ b/src/claim/netdata-claim.sh.in @@ -0,0 +1,451 @@ +#!/usr/bin/env bash +# netdata +# real-time performance and health monitoring, done right! +# (C) 2023 Netdata Inc. 
+# SPDX-License-Identifier: GPL-3.0-or-later + +# Exit code: 0 - Success +# Exit code: 1 - Unknown argument +# Exit code: 2 - Problems with claiming working directory +# Exit code: 3 - Missing dependencies +# Exit code: 4 - Failure to connect to endpoint +# Exit code: 5 - The CLI didn't work +# Exit code: 6 - Wrong user +# Exit code: 7 - Unknown HTTP error message +# +# OK: Agent claimed successfully +# HTTP Status code: 204 +# Exit code: 0 +# +# Unknown HTTP error message +# HTTP Status code: 422 +# Exit code: 7 +ERROR_KEYS[7]="None" +ERROR_MESSAGES[7]="Unknown HTTP error message" + +# Error: The agent id is invalid; it does not fulfill the constraints +# HTTP Status code: 422 +# Exit code: 8 +ERROR_KEYS[8]="ErrInvalidNodeID" +ERROR_MESSAGES[8]="invalid node id" + +# Error: The agent hostname is invalid; it does not fulfill the constraints +# HTTP Status code: 422 +# Exit code: 9 +ERROR_KEYS[9]="ErrInvalidNodeName" +ERROR_MESSAGES[9]="invalid node name" + +# Error: At least one of the given rooms ids is invalid; it does not fulfill the constraints +# HTTP Status code: 422 +# Exit code: 10 +ERROR_KEYS[10]="ErrInvalidRoomID" +ERROR_MESSAGES[10]="invalid room id" + +# Error: Invalid public key; the public key is empty or not present +# HTTP Status code: 422 +# Exit code: 11 +ERROR_KEYS[11]="ErrInvalidPublicKey" +ERROR_MESSAGES[11]="invalid public key" +# +# Error: Expired, missing or invalid token +# HTTP Status code: 403 +# Exit code: 12 +ERROR_KEYS[12]="ErrForbidden" +ERROR_MESSAGES[12]="token expired/token not found/invalid token" + +# Error: Duplicate agent id; an agent with the same id is already registered in the cloud +# HTTP Status code: 409 +# Exit code: 13 +ERROR_KEYS[13]="ErrAlreadyClaimed" +ERROR_MESSAGES[13]="already claimed" + +# Error: The node claiming process is still in progress. +# HTTP Status code: 102 +# Exit code: 14 +ERROR_KEYS[14]="ErrProcessingClaim" +ERROR_MESSAGES[14]="processing claiming" + +# Error: Internal server error. 
Any other unexpected error (DB problems, etc.) +# HTTP Status code: 500 +# Exit code: 15 +ERROR_KEYS[15]="ErrInternalServerError" +ERROR_MESSAGES[15]="Internal Server Error" + +# Error: There was a timeout processing the claim. +# HTTP Status code: 504 +# Exit code: 16 +ERROR_KEYS[16]="ErrGatewayTimeout" +ERROR_MESSAGES[16]="Gateway Timeout" + +# Error: The service cannot handle the claiming request at this time. +# HTTP Status code: 503 +# Exit code: 17 +ERROR_KEYS[17]="ErrServiceUnavailable" +ERROR_MESSAGES[17]="Service Unavailable" + +# Exit code: 18 - Agent unique id is not generated yet. + +NETDATA_RUNNING=1 + +get_config_value() { + conf_file="${1}" + section="${2}" + key_name="${3}" + if [ "${NETDATA_RUNNING}" -eq 1 ]; then + config_result=$(@sbindir_POST@/netdatacli 2>/dev/null read-config "$conf_file|$section|$key_name"; exit $?) + result="$?" + if [ "${result}" -ne 0 ]; then + echo >&2 "Unable to communicate with Netdata daemon, querying config from disk instead." + NETDATA_RUNNING=0 + fi + fi + if [ "${NETDATA_RUNNING}" -eq 0 ]; then + config_result=$(@sbindir_POST@/netdata 2>/dev/null -W get2 "$conf_file" "$section" "$key_name" unknown_default) + fi + echo "$config_result" +} +if command -v curl >/dev/null 2>&1 ; then + URLTOOL="curl" +elif command -v wget >/dev/null 2>&1 ; then + URLTOOL="wget" +else + echo >&2 "I need curl or wget to proceed, but neither is available on this system." + exit 3 +fi +if ! command -v openssl >/dev/null 2>&1 ; then + echo >&2 "I need openssl to proceed, but it is not available on this system." 
+ exit 3 +fi + +# shellcheck disable=SC2050 +if [ "@enable_cloud_POST@" = "no" ]; then + echo >&2 "This agent was built with --disable-cloud and cannot be claimed" + exit 3 +fi +# shellcheck disable=SC2050 +if [ "@enable_aclk_POST@" != "yes" ]; then + echo >&2 "This agent was built without the dependencies for Cloud and cannot be claimed" + exit 3 +fi + +# ----------------------------------------------------------------------------- +# defaults to allow running this script by hand + +[ -z "${NETDATA_VARLIB_DIR}" ] && NETDATA_VARLIB_DIR="@varlibdir_POST@" +MACHINE_GUID_FILE="@registrydir_POST@/netdata.public.unique.id" +CLAIMING_DIR="${NETDATA_VARLIB_DIR}/cloud.d" +TOKEN="unknown" +URL_BASE=$(get_config_value cloud global "cloud base url") +[ -z "$URL_BASE" ] && URL_BASE="https://app.netdata.cloud" # Cover post-install with --dont-start +ID="unknown" +ROOMS="" +[ -z "$HOSTNAME" ] && HOSTNAME=$(hostname) +CLOUD_CERTIFICATE_FILE="${CLAIMING_DIR}/cloud_fullchain.pem" +VERBOSE=0 +INSECURE=0 +RELOAD=1 +NETDATA_USER=$(get_config_value netdata global "run as user") +[ -z "$EUID" ] && EUID="$(id -u)" + + +gen_id() { + local id + + if command -v uuidgen > /dev/null 2>&1; then + id="$(uuidgen | tr '[:upper:]' '[:lower:]')" + elif [ -r /proc/sys/kernel/random/uuid ]; then + id="$(cat /proc/sys/kernel/random/uuid)" + else + echo >&2 "Unable to generate machine ID." + exit 18 + fi + + if [ "${id}" = "8a795b0c-2311-11e6-8563-000c295076a6" ] || [ "${id}" = "4aed1458-1c3e-11e6-a53f-000c290fc8f5" ]; then + gen_id + else + echo "${id}" + fi +} + +# get the MACHINE_GUID by default +if [ -r "${MACHINE_GUID_FILE}" ]; then + ID="$(cat "${MACHINE_GUID_FILE}")" + MGUID=$ID +elif [ -f "${MACHINE_GUID_FILE}" ]; then + echo >&2 "netdata.public.unique.id is not readable. Please make sure you have rights to read it (Filename: ${MACHINE_GUID_FILE})." 
+ exit 18 +else + if mkdir -p "${MACHINE_GUID_FILE%/*}" && echo -n "$(gen_id)" > "${MACHINE_GUID_FILE}"; then + ID="$(cat "${MACHINE_GUID_FILE}")" + MGUID=$ID + else + echo >&2 "Failed to write new machine GUID. Please make sure you have rights to write to ${MACHINE_GUID_FILE}." + exit 18 + fi +fi + +# get token from file +if [ -r "${CLAIMING_DIR}/token" ]; then + TOKEN="$(cat "${CLAIMING_DIR}/token")" +fi + +# get rooms from file +if [ -r "${CLAIMING_DIR}/rooms" ]; then + ROOMS="$(cat "${CLAIMING_DIR}/rooms")" +fi + +variable_to_set= +for arg in "$@" +do + if [ -z "$variable_to_set" ]; then + case $arg in + --claim-token) variable_to_set="TOKEN" ;; + --claim-rooms) variable_to_set="ROOMS" ;; + --claim-url) variable_to_set="URL_BASE" ;; + -token=*) TOKEN=${arg:7} ;; + -url=*) [ -n "${arg:5}" ] && URL_BASE=${arg:5} ;; + -id=*) ID=$(echo "${arg:4}" | tr '[:upper:]' '[:lower:]');; + -rooms=*) ROOMS=${arg:7} ;; + -hostname=*) HOSTNAME=${arg:10} ;; + -verbose) VERBOSE=1 ;; + -insecure) INSECURE=1 ;; + -proxy=*) PROXY=${arg:7} ;; + -noproxy) NOPROXY=yes ;; + -noreload) RELOAD=0 ;; + -user=*) NETDATA_USER=${arg:6} ;; + -daemon-not-running) NETDATA_RUNNING=0 ;; + *) echo >&2 "Unknown argument ${arg}" + exit 1 ;; + esac + else + case "$variable_to_set" in + TOKEN) TOKEN="$arg" ;; + ROOMS) ROOMS="$arg" ;; + URL_BASE) URL_BASE="$arg" ;; + esac + variable_to_set= + fi + shift 1 +done + +if [ "$EUID" != "0" ] && [ "$(whoami)" != "$NETDATA_USER" ]; then + echo >&2 "This script must be run by the $NETDATA_USER user account" + exit 6 +fi + +# if curl not installed give warning SOCKS can't be used +if [[ "${URLTOOL}" != "curl" && "${PROXY:0:5}" = socks ]] ; then + echo >&2 "wget doesn't support SOCKS. Please install curl or disable SOCKS proxy." 
+ exit 1 +fi + +echo >&2 "Token: ****************" +echo >&2 "Base URL: $URL_BASE" +echo >&2 "Id: $ID" +echo >&2 "Rooms: $ROOMS" +echo >&2 "Hostname: $HOSTNAME" +echo >&2 "Proxy: $PROXY" +echo >&2 "Netdata user: $NETDATA_USER" + +# create the claiming directory for this user +if [ ! -d "${CLAIMING_DIR}" ] ; then + mkdir -p "${CLAIMING_DIR}" && chmod 0770 "${CLAIMING_DIR}" +# shellcheck disable=SC2181 + if [ $? -ne 0 ] ; then + echo >&2 "Failed to create claiming working directory ${CLAIMING_DIR}" + exit 2 + fi +fi +if [ ! -w "${CLAIMING_DIR}" ] ; then + echo >&2 "No write permission in claiming working directory ${CLAIMING_DIR}" + exit 2 +fi + +if [ ! -f "${CLAIMING_DIR}/private.pem" ] ; then + echo >&2 "Generating private/public key for the first time." + if ! openssl genrsa -out "${CLAIMING_DIR}/private.pem" 2048 ; then + echo >&2 "Failed to generate private/public key pair." + exit 2 + fi +fi +if [ ! -f "${CLAIMING_DIR}/public.pem" ] ; then + echo >&2 "Extracting public key from private key." + if ! openssl rsa -in "${CLAIMING_DIR}/private.pem" -outform PEM -pubout -out "${CLAIMING_DIR}/public.pem" ; then + echo >&2 "Failed to extract public key." + exit 2 + fi +fi + +TARGET_URL="${URL_BASE%/}/api/v1/spaces/nodes/${ID}" +# shellcheck disable=SC2002 +KEY=$(cat "${CLAIMING_DIR}/public.pem" | tr '\n' '!' 
| sed -e 's/!/\\n/g') +# shellcheck disable=SC2001 +[ -n "$ROOMS" ] && ROOMS=\"$(echo "$ROOMS" | sed s'/,/", "/g')\" + +cat > "${CLAIMING_DIR}/tmpin.txt" <<EMBED_JSON +{ + "node": { + "id": "$ID", + "hostname": "$HOSTNAME" + }, + "token": "$TOKEN", + "rooms" : [ $ROOMS ], + "publicKey" : "$KEY", + "mGUID" : "$MGUID" +} +EMBED_JSON + +if [ "${VERBOSE}" == 1 ] ; then + echo "Request to server:" + cat "${CLAIMING_DIR}/tmpin.txt" +fi + + +if [ "${URLTOOL}" = "curl" ] ; then + URLCOMMAND="curl --connect-timeout 30 --retry 0 -s -i -X PUT -d \"@${CLAIMING_DIR}/tmpin.txt\"" + if [ "${NOPROXY}" = "yes" ] ; then + URLCOMMAND="${URLCOMMAND} -x \"\"" + elif [ -n "${PROXY}" ] ; then + URLCOMMAND="${URLCOMMAND} -x \"${PROXY}\"" + fi +else + URLCOMMAND="wget -T 15 -O - -q --server-response --content-on-error=on --method=PUT \ + --body-file=\"${CLAIMING_DIR}/tmpin.txt\"" + if [ "${NOPROXY}" = "yes" ] ; then + URLCOMMAND="${URLCOMMAND} --no-proxy" + elif [ "${PROXY:0:4}" = http ] ; then + URLCOMMAND="export http_proxy=${PROXY}; ${URLCOMMAND}" + fi +fi + +if [ "${INSECURE}" == 1 ] ; then + if [ "${URLTOOL}" = "curl" ] ; then + URLCOMMAND="${URLCOMMAND} --insecure" + else + URLCOMMAND="${URLCOMMAND} --no-check-certificate" + fi +fi + +if [ -r "${CLOUD_CERTIFICATE_FILE}" ] ; then + if [ "${URLTOOL}" = "curl" ] ; then + URLCOMMAND="${URLCOMMAND} --cacert \"${CLOUD_CERTIFICATE_FILE}\"" + else + URLCOMMAND="${URLCOMMAND} --ca-certificate \"${CLOUD_CERTIFICATE_FILE}\"" + fi +fi + +if [ "${VERBOSE}" == 1 ]; then + echo "${URLCOMMAND} \"${TARGET_URL}\"" +fi + +attempt_contact () { + if [ "${URLTOOL}" = "curl" ] ; then + eval "${URLCOMMAND} \"${TARGET_URL}\"" >"${CLAIMING_DIR}/tmpout.txt" + else + eval "${URLCOMMAND} \"${TARGET_URL}\"" >"${CLAIMING_DIR}/tmpout.txt" 2>&1 + fi + URLCOMMAND_EXIT_CODE=$? 
+ if [ "${URLTOOL}" = "wget" ] && [ "${URLCOMMAND_EXIT_CODE}" -eq 8 ] ; then + # We consider the server issuing an error response a successful attempt at communicating + URLCOMMAND_EXIT_CODE=0 + fi + + # Check if URLCOMMAND connected and received reply + if [ "${URLCOMMAND_EXIT_CODE}" -ne 0 ] ; then + echo >&2 "Failed to connect to ${URL_BASE}, return code ${URLCOMMAND_EXIT_CODE}" + rm -f "${CLAIMING_DIR}/tmpout.txt" + return 4 + fi + + if [ "${VERBOSE}" == 1 ] ; then + echo "Response from server:" + cat "${CLAIMING_DIR}/tmpout.txt" + fi + + return 0 +} + +for i in {1..3} +do + if attempt_contact ; then + echo "Connection attempt $i successful" + break + fi + echo "Connection attempt $i failed. Retry in ${i}s." + if [ "$i" -eq 5 ] ; then + rm -f "${CLAIMING_DIR}/tmpin.txt" + exit 4 + fi + sleep "$i" +done + +rm -f "${CLAIMING_DIR}/tmpin.txt" + +ERROR_KEY=$(grep "\"errorMsgKey\":" "${CLAIMING_DIR}/tmpout.txt" | awk -F "errorMsgKey\":\"" '{print $2}' | awk -F "\"" '{print $1}') +case ${ERROR_KEY} in + "ErrInvalidNodeID") EXIT_CODE=8 ;; + "ErrInvalidNodeName") EXIT_CODE=9 ;; + "ErrInvalidRoomID") EXIT_CODE=10 ;; + "ErrInvalidPublicKey") EXIT_CODE=11 ;; + "ErrForbidden") EXIT_CODE=12 ;; + "ErrAlreadyClaimed") EXIT_CODE=13 ;; + "ErrProcessingClaim") EXIT_CODE=14 ;; + "ErrInternalServerError") EXIT_CODE=15 ;; + "ErrGatewayTimeout") EXIT_CODE=16 ;; + "ErrServiceUnavailable") EXIT_CODE=17 ;; + *) EXIT_CODE=7 ;; +esac + +HTTP_STATUS_CODE=$(grep "HTTP" "${CLAIMING_DIR}/tmpout.txt" | tail -1 | awk -F " " '{print $2}') +if [ "${HTTP_STATUS_CODE}" = "204" ] ; then + EXIT_CODE=0 +fi + +if [ "${HTTP_STATUS_CODE}" = "204" ] || [ "${ERROR_KEY}" = "ErrAlreadyClaimed" ] ; then + rm -f "${CLAIMING_DIR}/tmpout.txt" + if [ "${HTTP_STATUS_CODE}" = "204" ] ; then + echo -n "${ID}" >"${CLAIMING_DIR}/claimed_id" || (echo >&2 "Claiming failed"; set -e; exit 2) + fi + rm -f "${CLAIMING_DIR}/token" || (echo >&2 "Claiming failed"; set -e; exit 2) + + # Rewrite the cloud.conf on the disk + cat > 
"$CLAIMING_DIR/cloud.conf" <<HERE_DOC +[global] + enabled = yes + cloud base url = $URL_BASE +${PROXY:+ proxy = $PROXY} +HERE_DOC + if [ "$EUID" == "0" ]; then + chown -R "${NETDATA_USER}:${NETDATA_USER}" "${CLAIMING_DIR}" || (echo >&2 "Claiming failed"; set -e; exit 2) + fi + if [ "${RELOAD}" == "0" ] ; then + exit $EXIT_CODE + fi + + # Update cloud.conf in the agent memory + @sbindir_POST@/netdatacli write-config 'cloud|global|enabled|yes' && \ + @sbindir_POST@/netdatacli write-config "cloud|global|cloud base url|$URL_BASE" && \ + @sbindir_POST@/netdatacli reload-claiming-state && \ + if [ "${HTTP_STATUS_CODE}" = "204" ] ; then + echo >&2 "Node was successfully claimed." + else + echo >&2 "The agent cloud base url is set to the url provided." + echo >&2 "The cloud may have different credentials already registered for this agent ID and it cannot be reclaimed under different credentials for security reasons. If you are unable to connect use -id=\$(uuidgen) to overwrite this agent ID with a fresh value if the original credentials cannot be restored." + echo >&2 "Failed to claim node with the following error message:\"${ERROR_MESSAGES[$EXIT_CODE]}\"" + fi && exit $EXIT_CODE + + if [ "${ERROR_KEY}" = "ErrAlreadyClaimed" ] ; then + echo >&2 "The cloud may have different credentials already registered for this agent ID and it cannot be reclaimed under different credentials for security reasons. If you are unable to connect use -id=\$(uuidgen) to overwrite this agent ID with a fresh value if the original credentials cannot be restored." + echo >&2 "Failed to claim node with the following error message:\"${ERROR_MESSAGES[$EXIT_CODE]}\"" + exit $EXIT_CODE + fi + echo >&2 "The claim was successful but the agent could not be notified ($?)- it requires a restart to connect to the cloud." 
+ [ "$NETDATA_RUNNING" -eq 0 ] && exit 0 || exit 5 +fi + +echo >&2 "Failed to claim node with the following error message:\"${ERROR_MESSAGES[$EXIT_CODE]}\"" +if [ "${VERBOSE}" == 1 ]; then + echo >&2 "Error key was:\"${ERROR_KEYS[$EXIT_CODE]}\"" +fi +rm -f "${CLAIMING_DIR}/tmpout.txt" +exit $EXIT_CODE
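
Note on the connection retry loop in `netdata-claim.sh.in`: the loop iterates with `for i in {1..3}`, but its give-up test compares `"$i"` against `5`, so the `exit 4` branch is never reached from inside the loop. After three failed attempts the script falls through to the error-key parsing instead. The following is a minimal sketch of a bounded retry that uses one limit for both the loop and the give-up check. It is not the upstream code: `retry_contact`, `MAX_ATTEMPTS` and `SLEEP_CMD` are hypothetical names introduced here for illustration. The return value 4 follows the script's documented "Failure to connect to endpoint" exit code.

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not part of netdata-claim.sh.
# MAX_ATTEMPTS, SLEEP_CMD and retry_contact are illustrative names.

MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # same attempt count as the {1..3} loop in the diff
SLEEP_CMD="${SLEEP_CMD:-sleep}"     # overridable so the loop can be exercised without waiting

# Run the given command until it succeeds or MAX_ATTEMPTS is reached.
# Returns 0 on success, 4 (the script's "failure to connect" code) otherwise.
retry_contact() {
  local attempt=1
  while [ "${attempt}" -le "${MAX_ATTEMPTS}" ]; do
    if "$@"; then
      echo "Connection attempt ${attempt} successful"
      return 0
    fi
    # Same bound as the loop condition, so the give-up branch is reachable.
    if [ "${attempt}" -eq "${MAX_ATTEMPTS}" ]; then
      echo >&2 "Connection attempt ${attempt} failed. Giving up."
      return 4
    fi
    echo >&2 "Connection attempt ${attempt} failed. Retry in ${attempt}s."
    "${SLEEP_CMD}" "${attempt}"
    attempt=$((attempt + 1))
  done
  return 4
}
```

Under these assumptions the call site could read `retry_contact attempt_contact`, with the caller removing `tmpin.txt` and exiting with the returned code when it is non-zero.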