From b5f8ee61a7f7e9bd291dd26b0585d03eb686c941 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sun, 5 May 2024 13:19:16 +0200 Subject: Adding upstream version 1.46.3. Signed-off-by: Daniel Baumann --- docs/netdata-cloud/README.md | 134 +++++++++++++ .../authentication-and-authorization/README.md | 27 +++ .../authentication-and-authorization/api-tokens.md | 34 ++++ .../enterprise-sso-authentication.md | 36 ++++ .../role-based-access-model.md | 157 +++++++++++++++ docs/netdata-cloud/netdata-cloud-on-prem/README.md | 77 ++++++++ .../netdata-cloud-on-prem/infrastructure.jpeg | Bin 0 -> 517302 bytes .../netdata-cloud-on-prem/installation.md | 212 +++++++++++++++++++++ .../netdata-cloud-on-prem/poc-without-k8s.md | 70 +++++++ .../netdata-cloud-on-prem/troubleshooting.md | 37 ++++ ...rganize-your-infrastructure-invite-your-team.md | 62 ++++++ docs/netdata-cloud/versions.md | 19 ++ docs/netdata-cloud/view-plan-and-billing.md | 121 ++++++++++++ 13 files changed, 986 insertions(+) create mode 100644 docs/netdata-cloud/README.md create mode 100644 docs/netdata-cloud/authentication-and-authorization/README.md create mode 100644 docs/netdata-cloud/authentication-and-authorization/api-tokens.md create mode 100644 docs/netdata-cloud/authentication-and-authorization/enterprise-sso-authentication.md create mode 100644 docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md create mode 100644 docs/netdata-cloud/netdata-cloud-on-prem/README.md create mode 100644 docs/netdata-cloud/netdata-cloud-on-prem/infrastructure.jpeg create mode 100644 docs/netdata-cloud/netdata-cloud-on-prem/installation.md create mode 100644 docs/netdata-cloud/netdata-cloud-on-prem/poc-without-k8s.md create mode 100644 docs/netdata-cloud/netdata-cloud-on-prem/troubleshooting.md create mode 100644 docs/netdata-cloud/organize-your-infrastructure-invite-your-team.md create mode 100644 docs/netdata-cloud/versions.md create mode 100644 docs/netdata-cloud/view-plan-and-billing.md (limited 
to 'docs/netdata-cloud') diff --git a/docs/netdata-cloud/README.md b/docs/netdata-cloud/README.md new file mode 100644 index 000000000..6a2406aeb --- /dev/null +++ b/docs/netdata-cloud/README.md @@ -0,0 +1,134 @@ +# Netdata Cloud + +Netdata Cloud is a service that complements Netdata installations. It is a key component in achieving optimal cost structure for large scale observability. + +Technically, Netdata Cloud is a thin control plane that allows the Netdata ecosystem to be a virtually unlimited scalable and flexible observability pipeline. With Netdata Cloud, this observability pipeline can span multiple teams, cloud providers, data centers and services, while remaining a uniform and highly integrated infrastructure, providing real-time and high-fidelity insights. + +```mermaid +flowchart TB + NC("☁️ Netdata Cloud + access from anywhere, + horizontal scalability, + role based access, + custom dashboards, + central notifications") + Users[["✨ Unified Dashboards + across the infrastructure, + multi-cloud, hybrid-cloud"]] + Notifications["🔔 Alert Notifications + Slack, e-mail, Mobile App, + PagerDuty, and more"] + Users <--> NC + NC -->|deduplicated| Notifications + subgraph On-Prem Infrastructure + direction TB + Agents("🌎 Netdata Agents + Standalone, + Children, Parents + (possibly overlapping)") + TimeSeries[("Time-Series + metric samples + database")] + PrivateAgents("🔒 Private + Netdata Agents") + Agents <--> TimeSeries + Agents ---|stream| PrivateAgents + end + NC <-->|secure connection| Agents +``` + +Netdata Cloud provides the following features, on top of what the Netdata agents already provide: + +1. **Horizontal scalability**: Netdata Cloud allows scaling the observability infrastructure horizontally, by adding more independent Netdata Parents and Children. It can aggregate such, otherwise independent, observability islands into one uniform and integrated infrastructure. 
+
+   Netdata Cloud is a fundamental component for achieving an optimal cost structure and flexibility, in structuring observability the way that is best suited for each case.
+
+2. **Role Based Access Control (RBAC)**: Netdata Cloud has all the mechanisms for user management and access control. It allows assigning a role to every user, segmenting the infrastructure into Rooms, and associating Rooms with roles and users.
+
+3. **Access from anywhere**: Netdata agents are installed on-prem and this is where all your data is always stored. Netdata Cloud queries all the Netdata agents (Standalone, Children and Parents) in real time when dashboards are accessed via Netdata Cloud.
+
+   This enables much simpler access control, eliminating the complexity of setting up VPNs to access observability, and the bandwidth costs of centralizing all metrics in one place.
+
+4. **Central dispatch of alert notifications**: Netdata Cloud allows controlling the dispatch of alert notifications centrally. By default, all Netdata agents (Standalone, Children and Parents) send their own notifications, which becomes increasingly complex to manage as the infrastructure grows. Netdata Cloud steps in to simplify this process and provide central control of all notifications.
+
+   Netdata Cloud also enables the use of the **Netdata Mobile App**, offering mobile push notifications for all users in commercial plans.
+
+5. **Custom Dashboards**: Netdata Cloud enables the creation, storage and sharing of custom dashboards.
+
+   Custom dashboards are created directly from the UI, without the need to learn a query language. Netdata Cloud provides all the APIs the Netdata dashboards need to store, browse and retrieve custom dashboards created by all users.
+
+6. **Advanced Customization**: Netdata Cloud provides all the APIs for the dashboard to have different default settings per Space, per Room and per user, allowing administrators and users to customize the Netdata dashboards and charts the way they see fit.
+
+## Data Exposed to Netdata Cloud
+
+Netdata Cloud is a thin layer on top of Netdata agents. It does not receive the samples collected, or the logs Netdata agents maintain.
+
+This is a key design decision for Netdata. If we were centralizing metric samples and logs, Netdata would have the same constraints and cost structure other observability solutions have, and we would be forced to lower metrics resolution, filter out metrics and eventually significantly increase the cost of observability.
+
+Instead, Netdata Cloud receives and stores only metadata related to the metrics collected, such as the nodes collecting metrics and their labels, the metric names, their labels and their retention, the data collection plugins and modules running, the configured alerts and their transitions.
+
+This information is a small fraction of the total information maintained by Netdata agents, allowing Netdata Cloud to remain high-resolution, high-fidelity and real-time, while being able to:
+
+- dispatch alerts centrally for all alert transitions.
+- know which Netdata agents to query when users view the dashboards.
+
+Metric samples and logs are transferred via Netdata Cloud to your web browser only when you view them via Netdata Cloud. Even then, Netdata Cloud does not store this information. It only aggregates the responses of multiple Netdata agents into a single response for your web browser to visualize.
+
+## High-Availability
+
+You can subscribe to Netdata Cloud updates at the [Netdata Cloud Status](https://status.netdata.cloud/) page.
+
+Netdata Cloud is a highly available, auto-scalable solution. However, since it is a monitoring solution, we need to ensure dashboards remain accessible during a crisis.
+
+Netdata agents provide the same dashboard Netdata Cloud provides, with the following limitations:
+
+1. The dashboards of Netdata agents (Children and Parents) are limited to their own databases, while on Netdata Cloud the dashboard presents the entire infrastructure, from all Netdata agents connected to it.
+
+2. When you are not logged in or the agent is not connected to Netdata Cloud, certain features of the Netdata agent dashboard will not be available.
+
+   When you are logged in and the agent is connected to Netdata Cloud, the agent dashboard has the same functionality as Netdata Cloud.
+
+To ensure dashboard high availability, Netdata agent dashboards are available by directly accessing them, even when the connectivity between Children and Parents or Netdata Cloud faces issues. This allows the use of the individual Netdata agents' dashboards during a crisis, at different levels of aggregation.
+
+## Fidelity and Insights
+
+Netdata Cloud queries Netdata agents, so it provides exactly the same fidelity and insights Netdata agents provide. Dashboards have the same resolution, the same number of metrics, exactly the same data.
+
+## Performance
+
+The Netdata agent and Netdata Cloud have similar query performance, but there are additional network latencies involved when the dashboards are viewed via Netdata Cloud.
+
+Accessing Netdata agents on the same LAN has marginal network latency and their response time is only affected by the queries. However, accessing the same Netdata agents via Netdata Cloud involves a longer network round trip, which looks like this:
+
+1. Your web browser makes a request to Netdata Cloud.
+2. Netdata Cloud sends the request to your Netdata agents. If multiple Netdata agents are involved, they are queried in parallel.
+3. Netdata Cloud receives their responses and aggregates them into a single response.
+4. Netdata Cloud replies to your web browser.
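The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Netdata Cloud's implementation: `Agent`, `make_agent` and `query_via_cloud` are hypothetical names, agents are simulated as local callables instead of network peers, and summing values per metric stands in for whatever aggregation the real query requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical stand-in for an agent: a callable that answers a query with
# {metric_name: value}. Real agents are reached over the network.
Agent = Callable[[str], Dict[str, float]]


def query_via_cloud(agents: List[Agent], query: str) -> Dict[str, float]:
    """Fan a query out to all agents in parallel and merge the answers."""
    merged: Dict[str, float] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(agents))) as pool:
        # Step 2: every agent is queried concurrently.
        for partial in pool.map(lambda agent: agent(query), agents):
            # Step 3: partial responses are combined into one response
            # (summing is an assumption made for this sketch).
            for metric, value in partial.items():
                merged[metric] = merged.get(metric, 0.0) + value
    # Step 4: the single aggregated response goes back to the browser.
    return merged


def make_agent(requests_per_second: float) -> Agent:
    """Build a simulated agent that reports one fixed value."""
    return lambda query: {query: requests_per_second}


agents = [make_agent(10.0), make_agent(5.0), make_agent(2.5)]
result = query_via_cloud(agents, "web.requests")
print(result)  # {'web.requests': 17.5}
```

Because the agents answer concurrently, the time spent waiting is roughly that of the slowest agent rather than the sum of all of them.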
+
+If you are sitting on the same LAN as the Netdata agents, the latency will be twice the round-trip network latency between this LAN and Netdata Cloud.
+
+However, when there are multiple Netdata agents involved, the queries will be faster compared to a monitoring solution that has one centralization point. Netdata Cloud splits each query into multiple parts and each of the Netdata agents involved performs only a small part of the original query. So, when querying a large infrastructure, you enjoy the combined power of all your Netdata agents, which is usually considerably higher than that of any single-centralization-point monitoring solution.
+
+## Does Netdata Cloud require Observability Centralization Points?
+
+No. Any or all Netdata agents can be connected to Netdata Cloud.
+
+We recommend creating [observability centralization points](/docs/observability-centralization-points/README.md), as required for operational efficiency (ephemeral nodes, teams or services isolation, central control of alerts, production systems performance), security policies (internet isolation), or cost optimization (use existing capacities before allocating new ones).
+
+We suggest reviewing the [Best Practices for Observability Centralization Points](/docs/observability-centralization-points/best-practices.md).
+
+## When I have Netdata Parents, do I need to connect Netdata Children to Netdata Cloud too?
+
+No, it is not needed, but it provides high availability.
+
+When Netdata Parents are connected to Netdata Cloud, all their Netdata Children are available via these Parents.
+
+When multiple Netdata Parents maintain a database for the same Netdata Children (e.g. clustered Parents, or Parents and Grandparents), Netdata Cloud is able to detect the unique nodes in an infrastructure and query each node only once, using one of the available Parents.
+
+Netdata Cloud prefers:
+
+- The most distant (from the Child) Parent available, when doing metrics visualization queries (since usually these Parents have been added for this purpose).
+
+- The closest (to the Child) Parent available, for [Top Monitoring](/docs/top-monitoring-netdata-functions.md) (since top-monitoring provides live data, like the processes running, the list of sockets open, etc). The streaming protocol of Netdata Parents and Children is able to forward such requests to the right Child, via the Parents, to respond with live and accurate data.
+
+Netdata Children may be connected to Netdata Cloud for high availability, in case the Netdata Parents are unreachable.
diff --git a/docs/netdata-cloud/authentication-and-authorization/README.md b/docs/netdata-cloud/authentication-and-authorization/README.md
new file mode 100644
index 000000000..5eb7acf24
--- /dev/null
+++ b/docs/netdata-cloud/authentication-and-authorization/README.md
@@ -0,0 +1,27 @@
+# Authentication & Authorization
+
+This section describes how users authenticate with Netdata Cloud, as well as the Authorization flows that control the access and actions of their teammates in Netdata Cloud.
+
+## Authentication
+
+### Email
+
+To sign in/sign up using email, visit [Netdata Cloud](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_email_section), enter your email address, and click the **Sign in by email** button.
+
+Click the **Verify** button in the email you received to start using Netdata Cloud.
+
+### Google and GitHub OAuth
+
+When you use Google/GitHub OAuth, your Netdata Cloud account is associated with the email address that Netdata Cloud receives through OAuth.
+
+To sign in/sign up using Google or GitHub OAuth, visit [Netdata Cloud](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_google_github_section) and select the method you want to use. After the verification steps, you will be signed in to Netdata Cloud.
+
+### Enterprise SSO Authentication
+
+Netdata integrates with SSO tools, allowing you to control how your team connects and authenticates to Netdata Cloud.
+
+For more information, see [Enterprise SSO Authentication](/docs/netdata-cloud/authentication-and-authorization/enterprise-sso-authentication.md).
+
+## Authorization
+
+Once logged in, you can manage role-based access in your space to give each team member the appropriate role. For more information, see [Role-Based Access model](/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md).
diff --git a/docs/netdata-cloud/authentication-and-authorization/api-tokens.md b/docs/netdata-cloud/authentication-and-authorization/api-tokens.md
new file mode 100644
index 000000000..88b73ee68
--- /dev/null
+++ b/docs/netdata-cloud/authentication-and-authorization/api-tokens.md
@@ -0,0 +1,34 @@
+# API Tokens
+
+## Overview
+
+Every user can access Netdata resources programmatically using an API Token, also known as a
+Bearer Token. The token is used for authentication and authorization, and can be issued
+in the Netdata UI under the user Settings:
+
+image
+
+API Tokens do not expire and can be limited to the following scopes:
+
+* `scope:all`
+
+  this token grants the same level of access as the user has; its main use case is the Netdata Terraform provider
+
+* `scope:agent-ui`
+
+  this token is mainly used by the local Netdata agent accessing the Cloud UI
+
+* `scope:grafana-plugin`
+
+  this token is used for the [Netdata Grafana plugin](https://github.com/netdata/netdata-grafana-datasource-plugin/blob/master/README.md)
+  to access Netdata charts
+
+Currently, Netdata Cloud does not expose a stable API.
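A bearer token travels in the `Authorization` header of each request. The Python sketch below only builds such a request, without sending it, against the spaces endpoint that this document's curl example uses. `build_spaces_request` and the `YOUR_API_TOKEN` placeholder are hypothetical, and because the API is not stable the path may change.

```python
import urllib.request

# Endpoint taken from the curl example in this document; Netdata Cloud does
# not expose a stable API yet, so treat the path as subject to change.
SPACES_URL = "https://app.netdata.cloud/api/v2/spaces"


def build_spaces_request(token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request authenticated with an API token."""
    return urllib.request.Request(
        SPACES_URL,
        headers={
            "Accept": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="GET",
    )


# "YOUR_API_TOKEN" is a placeholder, not a real token.
request = build_spaces_request("YOUR_API_TOKEN")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would perform the call. It is left out here so that the sketch runs without network access or a valid token.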
+
+## Example usage
+
+* get the cloud space list
+
+```console
+$ curl -H 'Accept: application/json' -H "Authorization: Bearer " https://app.netdata.cloud/api/v2/spaces
+```
diff --git a/docs/netdata-cloud/authentication-and-authorization/enterprise-sso-authentication.md b/docs/netdata-cloud/authentication-and-authorization/enterprise-sso-authentication.md
new file mode 100644
index 000000000..7657e8bcf
--- /dev/null
+++ b/docs/netdata-cloud/authentication-and-authorization/enterprise-sso-authentication.md
@@ -0,0 +1,36 @@
+# Enterprise SSO Authentication
+
+Netdata provides you with the means to streamline and control how your team connects and authenticates to Netdata Cloud. We provide
+ different Single Sign-On (SSO) integrations that allow you to connect with the tool that your organization is using to manage your
+ user accounts.
+
+ > ❗ This feature focuses on the Authentication flow. It does not support Authorization, that is, managing Users and Roles.
+
+
+## How to set it up?
+
+If you want to set up your Netdata Space to allow user Authentication through an Enterprise SSO tool, you need to:
+* Confirm the integration with the tool you want is available ([Authentication integrations](https://learn.netdata.cloud/docs/netdata-cloud/authentication-&-authorization/cloud-authentication-&-authorization-integrations))
+* Have a Netdata Cloud account
+* Have access to the Space as an administrator
+* Have your Space on the Business plan or higher
+
+Once you meet the above prerequisites, you need to:
+1. Click on the Space settings cog (located above your profile icon)
+2. Click on the Authentication tab
+3. Select the card for the integration you are looking for and click on Configure
+4. Fill in the attributes required to establish the integration with the tool
+
+
+## How to authenticate to Netdata?
+
+### From Netdata Sign-up page
+
+If you're starting your flow from the Netdata sign-in page, you need to:
+1. Click on the link `Sign-in with an Enterprise Single Sign-On (SSO)`
+2. Enter your email address
+3. Go to your mailbox and check the `Sign In to Netdata` email that you have received
+4. Click on the **Sign In** button
+
+Note: If you're not authenticated on the Enterprise SSO tool, you'll be prompted to authenticate there
+first before being allowed to proceed to Netdata Cloud.
diff --git a/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md b/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md
new file mode 100644
index 000000000..fec33ca22
--- /dev/null
+++ b/docs/netdata-cloud/authentication-and-authorization/role-based-access-model.md
@@ -0,0 +1,157 @@
+# Role-Based Access model
+
+Netdata Cloud's role-based-access mechanism allows you to control what functionalities in the app users can access. Each user can be assigned only one role, which fully specifies all the capabilities they are afforded.
+
+## What roles are available?
+
+With the advent of the paid plans, we revamped the roles to cover needs expressed by Netdata users, like providing more limited access to their customers, or
+being able to join any Room. We also aligned the offered roles to the target audience of each plan. The end result is the following:
+
+| **Role** | **Community** | **Homelab** | **Business** | **Enterprise On-Premise** |
+|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------|:-------------------|:-------------------|:--------------------------|
+| **Admins**<br/><br/>Users with this role can control Spaces, Rooms, Nodes, Users and Billing.<br/><br/>They can also access any Room in the Space. | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| **Managers**<br/><br/>Users with this role can manage Rooms and Users.<br/><br/>They can access any Room in the Space. | - | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| **Troubleshooters**<br/><br/>Users with this role can use Netdata to troubleshoot, not manage entities.<br/><br/>They can access any Room in the Space. | - | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| **Observers**<br/><br/>Users with this role can only view data in specific Rooms.<br/><br/>💡 Ideal for restricting your customers' access to their own dedicated Rooms. | - | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| **Billing**<br/><br/>Users with this role can handle billing options and invoices. | - | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| **Member** ⚠️ Legacy role<br/><br/>Users with this role can create Rooms and invite other Members.<br/><br/>They can only see the Rooms they belong to and all Nodes in the All Nodes Room. | - | - | - | - |
+
+## Which functionalities are available for each role?
+
+The following tables detail which functionalities are available for each role in each domain.
+
+### Space Management
+
+| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** |
+|:-----------------------|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|
+| See Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| Leave Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| Delete Space | :heavy_check_mark: | - | - | - | - | - |
+| Change name | :heavy_check_mark: | - | - | - | - | - |
+| Change description | :heavy_check_mark: | - | - | - | - | - |
+| Change slug | :heavy_check_mark: | - | - | - | - | - |
+| Change preferred nodes | :heavy_check_mark: | - | - | - | - | - |
+
+### Node Management
+
+| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes |
+|:------------------------------------------|:------------------:|:------------------:|:------------------:|:------------:|:-----------:|:------------------:|:-------------------------------------------|
+| See all Nodes in Space (_All Nodes_ Room) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | Members are always in the _All Nodes_ Room |
+| Connect Node to Space | :heavy_check_mark: | - | - | - | - | - | - |
+| Delete Node from Space | :heavy_check_mark: | - | - | - | - | - | - |
+
+### User Management
+
+| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes |
+|:-----------------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:|:----------------------------------------------------------------------------------------------| +| See all Users in Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | +| Invite new User to Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | You can't invite a user with a role you don't have permissions to appoint to (see below) | +| Delete Pending Invitation to Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | +| Delete User from Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | You can't delete a user if he has a role you don't have permissions to appoint to (see below) | +| Appoint Administrators | :heavy_check_mark: | - | - | - | - | - | | +| Appoint Billing user | :heavy_check_mark: | - | - | - | - | - | | +| Appoint Managers | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Appoint Troubleshooters | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Appoint Observer | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Appoint Member | :heavy_check_mark: | - | - | - | - | :heavy_check_mark: | Only available on Early Bird plans | +| See all Users in a Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | +| Invite existing user to Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | User already invited to the Space | +| Remove user from Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | + +### Room Management + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | 
+|:-----------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:|:-----------------------------------------------------------------------------------| +| See all Rooms in a Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | | +| Join any Room in a Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | By joining a Room you will be enabled to get notifications from nodes on that Room | +| Leave Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | +| Create a new Room in a Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | +| Delete Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Change Room name | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | If not the _All Nodes_ Room | +| Change Room description | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | +| Add existing Nodes to Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | Node already connected to the Space | +| Remove Nodes from Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | + +### Notifications Management + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | +|:--------------------------------------------------------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| See all configured notifications on a Space | :heavy_check_mark: | 
:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | +| Add new configuration | :heavy_check_mark: | - | - | - | - | - | | +| Enable/Disable configuration | :heavy_check_mark: | - | - | - | - | - | | +| Edit configuration | :heavy_check_mark: | - | - | - | - | - | Some exceptions apply depending on [service level](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md#available-actions-per-notification-method-based-on-service-level) | +| Delete configuration | :heavy_check_mark: | - | - | - | - | - | | +| Edit personal level notification settings | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | [Manage user notification settings](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md#manage-user-notification-settings) | +| See space alert notification silencing rules | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | | +| Add new space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Enable/Disable space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Edit space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| Delete space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | +| See, add, edit or delete personal level alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | | + +> **Note** +> +> Enable, Edit and Add actions over specific notification methods will only be allowed if your plan has access to those ([service 
classification](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md#service-classification)) + +### Dashboards + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | +|:-----------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:| +| See all dashboards in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Add new dashboard to Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Edit any dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | +| Edit own dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Delete any dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | +| Delete own dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | + +### Functions + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | +|:-------------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:|:---------------------------------------------------------------------| +| See all functions in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Run any function in Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| Run read-only function in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: 
| | +| Run sensitive function in Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | There isn't any function on this category yet, so subject to change. | + +### Events feed + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | +|:-----------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:|:-----------------------------------------------| +| See Alert or Topology events | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | +| See Auditing events | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | These are coming soon, not currently available | + +### Billing + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | +|:---------------------------|:------------------:|:-----------:|:------------------:|:------------:|:------------------:|:----------:|:----------------------------------------------------------------| +| See Plan & Billing details | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | Current plan and usage figures | +| Update plans | :heavy_check_mark: | - | - | - | - | - | This includes cancelling current plan (going to Community plan) | +| See invoices | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | +| Manage payment methods | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | +| Update billing email | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | + +### Dynamic Configuration Manager + +Netdata Cloud paid subscription required for all action except "List All". 
+ +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | +|:--------------------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:------------------:| +| List All (see all configurable items) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | +| Enable/Disable | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| Add | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| Update | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| Remove | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| Test | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| View | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | +| View File Format | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | + + +### Other permissions + +| **Functionality** | **Admin** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | +|:---------------------------|:------------------:|:------------------:|:------------------:|:------------------:|:-----------:|:------------------:| +| See Bookmarks in Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Add Bookmark to Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | +| Delete Bookmark from Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | +| See Visited Nodes | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | +| Update Visited Nodes | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | diff --git a/docs/netdata-cloud/netdata-cloud-on-prem/README.md 
b/docs/netdata-cloud/netdata-cloud-on-prem/README.md new file mode 100644 index 000000000..49373c454 --- /dev/null +++ b/docs/netdata-cloud/netdata-cloud-on-prem/README.md @@ -0,0 +1,77 @@ +# Netdata Cloud On-Prem + +Netdata Cloud is built as microservices and is orchestrated by a Kubernetes cluster, providing a highly available and auto-scaled observability platform. + +The overall architecture looks like this: + +```mermaid +flowchart TD + agents("🌍 Netdata Agents
Users' infrastructure
Netdata Children & Parents") + users[["🔥 Unified Dashboards
Integrated Infrastructure
Dashboards"]] + ingress("🛡️ Ingress Gateway
TLS termination") + traefik((("🔒 Traefik
Authentication &
Authorization"))) + emqx(("📤 EMQX
Agents Communication
Message Bus
MQTT")) + pulsar(("⚡ Pulsar
Internal Microservices
Message Bus")) + frontend("🌐 Front-End
Static Web Files") + auth("👨‍💼 Users & Agents
Authorization
Microservices") + spaceroom("🏡 Spaces, Rooms,
Nodes, Settings

Microservices for
managing Spaces,
Rooms, Nodes and
related settings") + charts("📈 Metrics & Queries
Microservices for
dispatching queries
to Netdata agents") + alerts("🔔 Alerts & Notifications
Microservices for
tracking alert
transitions and
deduplicating alerts") + sql[("✨ PostgreSQL
Users, Spaces, Rooms,
Agents, Nodes, Metric
Names, Metrics Retention,
Custom Dashboards,
Settings")] + redis[("🗒️ Redis
Caches needed
by Microservices")] + elk[("🗞️ Elasticsearch
Feed Events Database")] + bridges("🤝 Input & Output
Microservices bridging
agents to internal
components") + notifications("📢 Notifications Integrations
Dispatch alert
notifications to
3rd party services") + feed("📝 Feed & Events
Microservices for
managing the events feed") + users --> ingress + agents --> ingress + ingress --> traefik + ingress ==>|agents
websockets| emqx + traefik -.- auth + traefik ==>|http| spaceroom + traefik ==>|http| frontend + traefik ==>|http| charts + traefik ==>|http| alerts + spaceroom o-...-o pulsar + spaceroom -.- redis + spaceroom x-..-x sql + spaceroom -.-> feed + charts o-.-o pulsar + charts -.- redis + charts x-.-x sql + charts -..-> feed + alerts o-.-o pulsar + alerts -.- redis + alerts x-.-x sql + alerts -..-> feed + auth o-.-o pulsar + auth -.- redis + auth x-.-x sql + auth -.-> feed + feed <--> elk + alerts ----> notifications + %% auth ~~~ spaceroom + emqx <.-> bridges o-..-o pulsar +``` + +## Requirements + +The following components are required to run Netdata Cloud On-Prem: + +- **Kubernetes cluster** version 1.23+ +- **Kubernetes metrics server** (for autoscaling) +- **TLS certificate** for secure connections. A single endpoint is required, but there is an option to split the frontend, API, and MQTT endpoints. The certificate must be trusted by all entities connecting to it. +- Default **storage class configured and working** (persistent volumes based on SSDs are preferred) + +The following 3rd party components are used, which can be pulled with the `netdata-cloud-dependency` package we provide: + +- **Ingress controller** supporting HTTPS +- **PostgreSQL** version 13.7 (main database for all metadata Netdata Cloud maintains) +- **EMQX** version 5.11 (MQTT Broker that allows Agents to send messages to the On-Prem Cloud) +- **Apache Pulsar** version 2.10+ (message broker for inter-container communication) +- **Traefik** version 2.7.x (internal API Gateway) +- **Elasticsearch** version 8.8.x (stores the feed of events) +- **Redis** version 6.2 (caching) +- imagePullSecret (our ECR repos are secured) + +Keep in mind that the pulled versions are not configured for production use. Customers of Netdata Cloud On-Prem are expected to configure these applications according to their needs and policies for production use. 
Netdata Cloud On-Prem can be configured to use all these applications as a shared resource from other existing production installations. diff --git a/docs/netdata-cloud/netdata-cloud-on-prem/infrastructure.jpeg b/docs/netdata-cloud/netdata-cloud-on-prem/infrastructure.jpeg new file mode 100644 index 000000000..a866e141c Binary files /dev/null and b/docs/netdata-cloud/netdata-cloud-on-prem/infrastructure.jpeg differ diff --git a/docs/netdata-cloud/netdata-cloud-on-prem/installation.md b/docs/netdata-cloud/netdata-cloud-on-prem/installation.md new file mode 100644 index 000000000..259ddb5ce --- /dev/null +++ b/docs/netdata-cloud/netdata-cloud-on-prem/installation.md @@ -0,0 +1,212 @@ +# Netdata Cloud On-Prem Installation + +This installation guide assumes the prerequisites for installing Netdata Cloud On-Prem are satisfied. For more information, please refer to the [requirements documentation](/docs/netdata-cloud/netdata-cloud-on-prem/README.md#requirements). + +## Installation Requirements + +The following components are required to install Netdata Cloud On-Prem: + +- **AWS** CLI +- **Helm** version 3.12+ with OCI Configuration (explained in the installation section) +- **Kubectl** + +## Preparations for Installation + +### Configure AWS CLI + +Install [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html). + +There are two options for configuring `aws cli` to work with the provided credentials. 
The first one is to set the environment variables: + +```bash +export AWS_ACCESS_KEY_ID= +export AWS_SECRET_ACCESS_KEY= +``` + +The second one is to use an interactive shell: + +```bash +aws configure +``` + +### Configure helm to use secured ECR repository + +Using `aws` command we will generate a token for helm to access the secured ECR repository: + +```bash +aws ecr get-login-password --region us-east-1 | helm registry login --username AWS --password-stdin 362923047827.dkr.ecr.us-east-1.amazonaws.com +``` + +After this step you should be able to add the repository to your helm or just pull the helm chart: + +```bash +helm pull oci://362923047827.dkr.ecr.us-east-1.amazonaws.com/netdata-cloud-dependency --untar #optional +helm pull oci://362923047827.dkr.ecr.us-east-1.amazonaws.com/netdata-cloud-onprem --untar +``` + +Local folders with the newest versions of helm charts should appear on your working dir. + +## Installation + +Netdata provides access to two helm charts: + +1. `netdata-cloud-dependency` - required applications for `netdata-cloud-onprem`. +2. `netdata-cloud-onprem` - the application itself + provisioning + +### netdata-cloud-dependency + +This helm chart is designed to install the necessary applications: + +- Redis +- Elasticsearch +- EMQX +- Apache Pulsar +- PostgreSQL +- Traefik +- Mailcatcher +- k8s-ecr-login-renew +- kubernetes-ingress + +Although we provide an easy way to install all these applications, we expect users of Netdata Cloud On-Prem to provide production quality versions for them. Therefore, every configuration option is available through `values.yaml` in the folder that contains your netdata-cloud-dependency helm chart. All configuration options are described in `README.md` which is a part of the helm chart. + +Each component can be enabled/disabled individually. It is done by true/false switches in `values.yaml`. This way, it is easier to migrate to production-grade components gradually. 
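
Before flipping any switch, it can help to see which ones a given `values.yaml` actually contains. The helper below is a minimal sketch and not part of the chart: `list_switches` is a hypothetical name, and the real key names are whatever the chart's own `values.yaml` and `README.md` define.

```shell
# Hypothetical helper (not shipped with the chart): print every line of a
# values file whose value is a bare true/false, prefixed with its line number.
list_switches() {
  grep -nE '^[[:space:]]*[A-Za-z0-9_.-]+:[[:space:]]*(true|false)[[:space:]]*$' "$1"
}

# Inspect the chart's values file when run from the chart folder.
if [ -f values.yaml ]; then
  list_switches values.yaml || echo "no true/false switches found"
fi
```

The output only lists candidates; which of them enable or disable a whole component has to be confirmed against the chart's `README.md`.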
+ +Unless you prefer otherwise, `k8s-ecr-login-renew` is responsible for calling the `AWS API` for token regeneration. This token is then injected into the secret that every node uses to authenticate with the secured ECR when pulling the images. + +The default `.global.imagePullSecrets` setting in the `values.yaml` of `netdata-cloud-onprem` is configured to work out of the box with the dependency helm chart. + +For helm chart installation, save your changes in `values.yaml` and execute: + +```shell +cd [your helm chart location] +helm upgrade --wait --install netdata-cloud-dependency -n netdata-cloud --create-namespace -f values.yaml . +``` + +Keep in mind that `netdata-cloud-dependency` is provided only as a proof of concept. Users installing Netdata Cloud On-Prem should properly configure these components. + +### netdata-cloud-onprem + +Every configuration option is available in `values.yaml` in the folder that contains your `netdata-cloud-onprem` helm chart. All configuration options are described in the `README.md` which is a part of the helm chart. + +#### Installing Netdata Cloud On-Prem + +```shell +cd [your helm chart location] +helm upgrade --wait --install netdata-cloud-onprem -n netdata-cloud --create-namespace -f values.yaml . +``` + +##### Important notes + +1. Installation takes care of provisioning the resources with migration services. + +2. During the first installation, a secret called `netdata-cloud-common` is created. It contains several randomly generated entries. Neither deleting the helm chart nor reinstalling the whole On-Prem deletes this secret; it persists until a Kubernetes administrator deletes it manually. The content of this secret is critical: the strings it contains are essential parts of encryption. Losing or changing the data that it contains will result in data loss. + +## Short description of Netdata Cloud microservices + +#### cloud-accounts-service + +Responsible for user registration & authentication. 
Manages user account information. + +#### cloud-agent-data-ctrl-service + +Forwards requests from the cloud to the relevant agents. +The requests include: +- Fetching chart metadata from the agent +- Fetching chart data from the agent +- Fetching function data from the agent + +#### cloud-agent-mqtt-input-service + +Forwards MQTT messages emitted by the agent related to the agent entities to the internal Pulsar broker. These include agent connection state updates. + +#### cloud-agent-mqtt-output-service + +Forwards Pulsar messages emitted in the cloud related to the agent entities to the MQTT broker. From there, the messages reach the relevant agent. + +#### cloud-alarm-config-mqtt-input-service + +Forwards MQTT messages emitted by the agent related to the alarm-config entities to the internal Pulsar broker. These include the data for the alarm configuration as seen by the agent. + +#### cloud-alarm-log-mqtt-input-service + +Forwards MQTT messages emitted by the agent related to the alarm-log entities to the internal Pulsar broker. These contain data about the alarm transitions that occurred in an agent. + +#### cloud-alarm-mqtt-output-service + +Forwards Pulsar messages emitted in the cloud related to the alarm entities to the MQTT broker. From there, the messages reach the relevant agent. + +#### cloud-alarm-processor-service + +Persists the latest alert statuses received from the agent in the cloud. +Aggregates alert statuses from relevant node instances. +Exposes API endpoints to fetch alert data for visualization on the cloud. +Determines if notifications need to be sent when alert statuses change and emits relevant messages to Pulsar. +Exposes API endpoints to store and return notification-silencing data. + +#### cloud-alarm-streaming-service + +Responsible for starting the alert stream between the agent and the cloud. 
+Ensures that messages are processed in the correct order, and starts a reconciliation process between the cloud and the agent if out-of-order processing occurs. + +#### cloud-charts-mqtt-input-service + +Forwards MQTT messages emitted by the agent related to the chart entities to the internal Pulsar broker. These include the chart metadata that is used to display relevant charts on the cloud. + +#### cloud-charts-mqtt-output-service + +Forwards Pulsar messages emitted in the cloud related to the charts entities to the MQTT broker. From there, the messages reach the relevant agent. + +#### cloud-charts-service + +Exposes API endpoints to fetch the chart metadata. +Forwards data requests via the `cloud-agent-data-ctrl-service` to the relevant agents to fetch chart data points. +Exposes API endpoints to call various other endpoints on the agent, for instance, functions. + +#### cloud-custom-dashboard-service + +Exposes API endpoints to fetch and store custom dashboard data. + +#### cloud-environment-service + +Serves as the first contact point between the agent and the cloud. +Returns authentication and MQTT endpoints to connecting agents. + +#### cloud-feed-service + +Processes incoming feed events and stores them in Elasticsearch. +Exposes API endpoints to fetch feed events from Elasticsearch. + +#### cloud-frontend + +Contains the on-prem cloud website. Serves static content. + +#### cloud-iam-user-service + +Acts as a middleware for authentication on most of the API endpoints. Validates incoming token headers, injects the relevant ones, and forwards the requests. + +#### cloud-metrics-exporter + +Exports various metrics from an On-Prem Cloud installation. Uses the Prometheus metric exposition format. + +#### cloud-netdata-assistant + +Exposes API endpoints to fetch a human-friendly explanation of various netdata configuration options, namely the alerts. 
+ +#### cloud-node-mqtt-input-service + +Forwards MQTT messages emitted by the agent related to the node entities to the internal Pulsar broker. These include the node metadata as well as their connectivity state, either direct or via parents. + +#### cloud-node-mqtt-output-service + +Forwards Pulsar messages emitted in the cloud related to the node entities to the MQTT broker. From there, the messages reach the relevant agent. + +#### cloud-notifications-dispatcher-service + +Exposes API endpoints to handle integrations. +Handles incoming notification messages and uses the relevant channels (email, Slack, ...) to notify relevant users. + +#### cloud-spaceroom-service + +Exposes API endpoints to fetch and store relations between agents, nodes, spaces, users, and rooms. +Acts as a provider of authorization for other cloud endpoints. +Exposes API endpoints to authenticate agents connecting to the cloud. diff --git a/docs/netdata-cloud/netdata-cloud-on-prem/poc-without-k8s.md b/docs/netdata-cloud/netdata-cloud-on-prem/poc-without-k8s.md new file mode 100644 index 000000000..6be4066bd --- /dev/null +++ b/docs/netdata-cloud/netdata-cloud-on-prem/poc-without-k8s.md @@ -0,0 +1,70 @@ +# Netdata Cloud On-Prem PoC without k8s + +These instructions cover installing a light version of Netdata Cloud, for clients who do not have a Kubernetes cluster installed. This setup is **only for demonstration purposes**, as it has no built-in resiliency against failures of any kind. + +## Requirements + +- Ubuntu 22.04 (clean installation will work best). +- 10 CPU Cores and 24 GiB of memory. +- Shell access with sudo privileges. +- TLS certificate for Netdata Cloud On-Prem PoC. A single endpoint is required. The certificate must be trusted by all entities connecting to this installation. +- AWS ID and License Key - we should have provided these to you; if not, contact us: . 
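
The hardware and OS items in the list above can be checked on the target host before running the installer. The following is a minimal sketch, not part of the provisioning script: the function names are made up for illustration, and rounding the kernel-reported memory up to the next GiB is an assumption, made because `/proc/meminfo` reports slightly less than the installed memory.

```shell
# Minimums taken from the requirements list above.
REQUIRED_CORES=10
REQUIRED_MEM_GIB=24

# Succeeds when the first integer is at least the second.
at_least() {
  [ "$1" -ge "$2" ]
}

# Convert a kB figure (as reported in /proc/meminfo) to GiB, rounding up.
kib_to_gib_ceil() {
  echo $(( ($1 + 1048575) / 1048576 ))
}

# Print "ok" or "WARN" for one requirement: label, actual value, minimum.
report() {
  if at_least "$2" "$3"; then
    echo "ok:   $1 = $2 (minimum $3)"
  else
    echo "WARN: $1 = $2 (minimum $3)"
  fi
}

# Run the checks only where the Linux interfaces they rely on are present.
if command -v nproc >/dev/null 2>&1 && [ -r /proc/meminfo ]; then
  mem_kib=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
  report "CPU cores" "$(nproc)" "$REQUIRED_CORES"
  report "memory (GiB)" "$(kib_to_gib_ceil "$mem_kib")" "$REQUIRED_MEM_GIB"
fi

# Show the distribution so it can be compared with Ubuntu 22.04.
if [ -r /etc/os-release ]; then
  ( . /etc/os-release && echo "OS:   ${PRETTY_NAME:-unknown}" )
fi
```

The sketch only prints warnings and never aborts, so it is safe to run on any Linux host; it does not check the TLS certificate or the credentials.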
+ +To install the whole environment, log in to the designated host and run: + +```bash +curl https://netdata-cloud-netdata-static-content.s3.amazonaws.com/provision.sh -o provision.sh +chmod +x provision.sh +sudo ./provision.sh install \ + -key-id "" \ + -access-key "" \ + -onprem-license-key "" \ + -onprem-license-subject "" \ + -onprem-url "" \ + -certificate-path "" \ + -private-key-path "" +``` + +What does the script do during installation? + +1. Prompts the user to provide: + - `-key-id` - AWS ECR access key ID. + - `-access-key` - AWS ECR Access Key. + - `-onprem-license-key` - Netdata Cloud On-Prem license key. + - `-onprem-license-subject` - Netdata Cloud On-Prem license subject. + - `-onprem-url` - URL for the On-Prem (without the http(s) protocol). + - `-certificate-path` - path to your PEM encoded certificate. + - `-private-key-path` - path to your PEM encoded key. + +2. Once all of the above is provided, the installation begins. The script will install: + - Helm + - Kubectl + - AWS CLI + - K3s cluster (single node) + +3. Once all the required software is installed, the script provisions the K3s cluster with the gathered data. + +After cluster provisioning, Netdata is ready to be used. + +> WARNING: +> This script will automatically expose not only Netdata but also a mailcatcher under `/mailcatcher`. + +## How to log in? + +Only login by email works without further configuration. Every email this Netdata Cloud On-Prem sends appears in the mailcatcher, which acts as the SMTP server and provides a simple GUI to read the emails. + +Steps: + +1. Open Netdata Cloud On-Prem PoC in the web browser at the URL you specified +2. Provide your email and use the button to confirm +3. Mailcatcher catches all the emails, so go to `/mailcatcher`. Find yours and click the link. +4. You are now logged into Netdata Cloud. Add your first nodes! + +## How to remove Netdata Cloud On-Prem PoC? 
+ +To uninstall the whole PoC, use the same script that installed it, with the `uninstall` switch. + +```shell +cd