author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:23 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:44 +0000 |
commit | 836b47cb7e99a977c5a23b059ca1d0b5065d310e (patch) | |
tree | 1604da8f482d02effa033c94a84be42bc0c848c3 /docs/cloud | |
parent | Releasing debian version 1.44.3-2. (diff) | |
Merging upstream version 1.46.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/cloud')
22 files changed, 0 insertions, 2331 deletions
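The diffstat above can be reproduced locally. This is a minimal sketch that assumes you are inside a clone of the repository containing this commit; the variable and function names are illustrative and do not come from the commit.

```bash
# Commit hash and path filter as shown on this page.
COMMIT=836b47cb7e99a977c5a23b059ca1d0b5065d310e
PATH_FILTER=docs/cloud

# Print only the diffstat for files under docs/cloud.
# --format= suppresses the commit header so only the stat remains.
show_cloud_diffstat() {
  git show --stat --format= "$COMMIT" -- "$PATH_FILTER"
}

# Usage (inside a clone that has the commit): show_cloud_diffstat
```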
diff --git a/docs/cloud/alerts-notifications/manage-alert-notification-silencing-rules.md b/docs/cloud/alerts-notifications/manage-alert-notification-silencing-rules.md
deleted file mode 100644
index b9806c6fa..000000000
--- a/docs/cloud/alerts-notifications/manage-alert-notification-silencing-rules.md
+++ /dev/null
@@ -1,58 +0,0 @@
-# Manage alert notification silencing rules
-
-From the Cloud interface, you can manage your space's alert notification silencing rules settings as well as allow users to define their personal ones.
-
-## Prerequisites
-
-To manage **space's alert notification silencing rule settings**, you will need the following:
-
-- A Netdata Cloud account
-- Access to the space as an **administrator** or **manager** (**troubleshooters** can only view space rules)
-
-
-To manage your **personal alert notification silencing rule settings**, you will need the following:
-
-- A Netdata Cloud account
-- Access to the space with any roles except **billing**
-
-### Steps
-
-1. Click on the **Space settings** cog (located above your profile icon)
-1. Click on the **Alert & Notification** tab on the left hand-side
-1. Click on the **Notification Silencing Rules** tab
-1. You will be presented with a table of the configured alert notification silencing rules for:
-   * the space (if aren't an **observer**)
-   * yourself
-
-   You will be able to:
-   1. **Add a new** alert notification silencing rule configuration.
-      - Choose if it applies to **All users** or **Myself** (All users is only available for **administrators** and **managers**)
-      - You need to provide a name for the configuration so you can easily refer to it
-      - Define criteria for Nodes: To which Rooms will this apply? What Nodes? Does it apply to host labels key-value pairs?
-      - Define criteria for Alerts: Which alert name is being targeted? What alert context? Will it apply to a specific alert role?
-      - Define when it will be applied:
-        - Immediately, from now till until it is turned off or until a specific duration (start and end date automatically set)
-        - Scheduled, you specify the start and end time for when the rule becomes active and then inactive (time is set according to your browser local timezone)
-      Note: You are only able to add a rule if your space is on a [paid plan](https://github.com/netdata/netdata/edit/master/docs/cloud/manage/plans.md).
-   1. **Edit an existing** alert notification silencing rule configurations. You will be able to change:
-      - The name provided for it
-      - Who it applies to
-      - Selection criteria for Nodes and Alert
-      - When it will be applied
-   1. **Enable/Disable** a given alert notification silencing rule configuration.
-      - Use the toggle to enable or disable
-   1. **Delete an existing** alert notification silencing rule.
-      - Use the trash icon to delete your configuration
-
-## Silencing rules examples
-
-| Rule name | War Rooms | Nodes | Host Label | Alert name | Alert context | Alert role | Description |
-| :-- | :-- | :-- | :-- | :-- | :-- | :-- | :--|
-| Space silencing | All Rooms | * | * | * | * | * | This rule silences the entire space, targets all nodes and for all users. E.g. infrastructure wide maintenance window. |
-| DB Servers Rooms | PostgreSQL Servers | * | * | * | * | * | This rules silences the nodes in the room named PostgreSQL Servers, for example it doesn't silence the `All Nodes` room. E.g. My team with membership to this room doesn't want to receive notifications for these nodes. |
-| Node child1 | All Rooms | `child1` | * | * | * | * | This rule silences all alert state transitions for node `child1` on all rooms and for all users. E.g. node could be going under maintenance. |
-| Production nodes | All Rooms | * | `environment:production` | * | * | * | This rule silences all alert state transitions for nodes with the host label key-value pair `environment:production`. E.g. Maintenance window on nodes with specific host labels. |
-| Third party maintenance | All Rooms | * | * | `httpcheck_posthog_netdata_cloud.request_status` | * | * | This rule silences this specific alert since third party partner will be undergoing maintenance. |
-| Intended stress usage on CPU | All Rooms | * | * | * | `system.cpu` | * | This rule silences specific alerts across all nodes and their CPU cores. |
-| Silence role webmaster | All Rooms | * | * | * | * | `webmaster` | This rule silences all alerts configured with the role `webmaster`. |
-| Silence alert on node | All Rooms | `child1` | * | `httpcheck_posthog_netdata_cloud.request_status` | * | * | This rule silences the specific alert on the `child1` node. |
diff --git a/docs/cloud/alerts-notifications/manage-notification-methods.md b/docs/cloud/alerts-notifications/manage-notification-methods.md
deleted file mode 100644
index f61b6bf6f..000000000
--- a/docs/cloud/alerts-notifications/manage-notification-methods.md
+++ /dev/null
@@ -1,73 +0,0 @@
-# Manage notification methods
-
-From the Cloud interface, you can manage your space's notification settings as well as allow users to personalize their notifications setting
-
-## Manage space notification settings
-
-### Prerequisites
-
-To manage space notification settings, you will need the following:
-
-- A Netdata Cloud account
-- Access to the space as an **administrator**
-
-### Available actions per notification methods based on service level
-
-| **Action** | **Personal service level** | **System service level** |
-| :- | :-: | :-: |
-| Enable / Disable | X | X |
-| Edit | | X | |
-| Delete | X | X |
-| Add multiple configurations for same method | | X |
-
-Notes:
-* For Netadata provided ones you can't delete the existing notification method configuration.
-* Enable, Edit and Add actions over specific notification methods will only be allowed if your plan has access to those ([service classification](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#service-classification))
-
-### Steps
-
-1. Click on the **Space settings** cog (located above your profile icon)
-1. Click on the **Alerts & Notification** tab on the left hand-side
-1. Click on the **Notification Methods** tab
-1. You will be presented with a table of the configured notification methods for the space. You will be able to:
-   1. **Add a new** notification method configuration.
-      - Choose the service from the list of the available ones, you'll may see a list of unavailable options if your plan doesn't allow some of them (you will see on the
-        card the plan level that allows a specific service)
-      - You can optionally provide a name for the configuration so you can easily refer to what it
-      - Define filtering criteria. To which Rooms will this apply? What notifications I want to receive? (All Alerts and unreachable, All Alerts, Critical only)
-      - Depending on the service different inputs will be present, please note that there are mandatory and optional inputs
-      - If you doubts on how to configure the service you can find a link at the top of the modal that takes you to the specific documentation page to help you
-   1. **Edit an existing** notification method configuration. Personal level ones can't be edited here, see [Manage user notification settings](#manage-user-notification-settings). You will be able to change:
-      - The name provided for it
-      - Filtering criteria
-      - Service specific inputs
-   1. **Enable/Disable** a given notification method configuration.
-      - Use the toggle to enable or disable the notification method configuration
-   1. **Delete an existing** notification method configuration. Netdata provided ones can't be deleted, e.g. Email
-      - Use the trash icon to delete your configuration
-
-## Manage user notification settings
-
-### Prerequisites
-
-To manage user specific notification settings, you will need the following:
-
-- A Cloud account
-- Have access to, at least, a space
-
-Note: If an administrator has disabled a Personal [service level](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#service-level) notification method this will override any user specific setting.
-
-### Steps
-
-1. Click on the **User notification settings** shortcut on top of the help button
-1. You are presented with:
-   - The Personal [service level](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#service-level) notification methods you can manage
-   - The list spaces and rooms inside those where you have access to
-   - If you're an administrator, Manager or Troubleshooter you'll also see the Rooms from a space you don't have access to on **All Rooms** tab and you can activate notifications for them by joining the room
-1. On this modal you will be able to:
-   1. **Enable/Disable** the notification method for you, this applies accross all spaces and rooms
-      - Use the the toggle enable or disable the notification method
-   1. **Define what notifications you want** to per space/room: All Alerts and unreachable, All Alerts, Critical only or No notifications
-   1. **Activate notifications** for a room you aren't a member of
-      - From the **All Rooms** tab click on the Join button for the room(s) you want
-
diff --git a/docs/cloud/alerts-notifications/notifications.md b/docs/cloud/alerts-notifications/notifications.md
deleted file mode 100644
index cde30a2b4..000000000
--- a/docs/cloud/alerts-notifications/notifications.md
+++ /dev/null
@@ -1,147 +0,0 @@
-# Cloud alert notifications
-
-import Callout from '@site/src/components/Callout'
-
-Netdata Cloud can send centralized alert notifications to your team whenever a node enters a warning, critical, or
-unreachable state. By enabling notifications, you ensure no alert, on any node in your infrastructure, goes unnoticed by
-you or your team.
-
-Having this information centralized helps you:
-* Have a clear view of the health across your infrastructure, seeing all alerts in one place.
-* Easily [set up your alert notification process](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-notification-methods.md):
-methods to use and where to use them, filtering rules, etc.
-* Quickly troubleshoot using [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md)
-or [Anomaly Advisor](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/anomaly-advisor.md)
-
-If a node is getting disconnected often or has many alerts, we protect you and your team from alert fatigue by sending
-you a flood protection notification. Getting one of these notifications is a good signal of health or performance issues
-on that node.
-
-Admins must enable alert notifications for their [Space(s)](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-notification-methods.md#manage-space-notification-settings). All users in a
-Space can then personalize their notifications settings from within their [account
-menu](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/#manage-user-notification-settings).
-
-<Callout type="notice">
-
-Centralized alert notifications from Netdata Cloud is a independent process from [notifications from
-Netdata](https://github.com/netdata/netdata/blob/master/docs/monitor/enable-notifications.md). You can enable one or the other, or both, based on your needs. However,
-the alerts you see in Netdata Cloud are based on those streamed from your Netdata-monitoring nodes. If you want to tweak
-or add new alert that you see in Netdata Cloud, and receive via centralized alert notifications, you must
-[configure](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md) each node's alert watchdog.
-
-</Callout>
-
-## Alert notifications
-
-Netdata Cloud can send centralized alert notifications to your team whenever a node enters a warning, critical, or unreachable state. By enabling notifications,
-you ensure no alert, on any node in your infrastructure, goes unnoticed by you or your team.
-
-If a node is getting disconnected often or has many alerts, we protect you and your team from alert fatigue by sending you a flood protection notification.
-Getting one of these notifications is a good signal of health or performance issues on that node.
-
-Alert notifications can be delivered through different methods, these can go from an Email sent from Netdata to the use of a 3rd party tool like PagerDuty.
-
-Notification methods are classified on two main attributes:
-* Service level: Personal or System
-* Service classification: Community or Business
-
-Only administrators are able to manage the space's alert notification settings.
-All users in a Space can personalize their notifications settings, for Personal service level notification methods, from within their profile menu.
-
-> ⚠️ Netdata Cloud supports different notification methods and their availability will depend on the plan you are at.
-> For more details check [Service classification](#service-classification) or [netdata.cloud/pricing](https://www.netdata.cloud/pricing).
-
-### Service level
-
-#### Personal
-
-The notifications methods classified as **Personal** are what we consider generic, meaning that these can't have specific rules for them set by the administrators.
-
-These notifications are sent to the destination of the channel which is a user-specific attribute, e.g. user's e-mail, and the users are the ones that will then be able to
-manage what specific configurations they want for the Space / Room(s) and the desired Notification level, they can achieve this from their User Profile page under
-**Notifications**.
-
-One example of such a notification method is the E-mail.
-
-#### System
-
-For **System** notification methods, the destination of the channel will be a target that usually isn't specific to a single user, e.g. slack channel.
-
-These notification methods allow for fine-grain rule settings to be done by administrators and more than one configuration can exist for them since. You can specify
-different targets depending on Rooms or Notification level settings.
-
-Some examples of such notification methods are: Webhook, PagerDuty, Slack.
-
-### Service classification
-
-#### Community
-
-Notification methods classified as Community can be used by everyone independent on the plan your space is at.
-These are: Email and discord
-
-#### Pro
-
-Notification methods classified as Pro are only available for **Pro** and **Business** plans
-These are: webhook
-
-#### Business
-
-Notification methods classified as Business are only available for **Business** plans
-These are: PagerDuty, Slack, Opsgenie
-
-## Silencing Alert notifications
-
-Netdata Cloud provides you a Silencing Rule engine which allows you to mute alert notifications. This muting action is specific to alert state transition notifications, it doesn't include node unreachable state transitions.
-
-The Silencing Rule engine is flexible and allows you to enter silence rules for the two main entities involved on alert notifications and can be set using different attributes. The main entities you can enter are **Nodes** and **Alerts** which can be used in combination or isolation to target specific needs - see some examples [here](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-alert-notification-silencing-rules.md#silencing-rules-examples).
-
-### Scope definition for Nodes
-* **Space:** silencing the space, selecting `All Rooms`, silences all alert state transitions from any node claimed to the space.
-* **War Room:** silencing a specific room will silence all alert state transitions from any node in that room. Please note if the node belongs to
-another room which isn't silenced it can trigger alert notifications to the users with membership to that other room.
-* **Node:** silencing a specific node can be done for the entire space, selecting `All Rooms`, or for specific war room(s). The main difference is
-if the node should be silenced for the entire space or just for specific rooms (when specific rooms are selected only users with membership to that room won't receive notifications).
-
-### Scope definition for Alerts
-* **Alert name:** silencing a specific alert name silences all alert state transitions for that specific alert.
-* **Alert context:** silencing a specific alert context will silence all alert state transitions for alerts targeting that chart context, for more details check [alert configuration docs](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alert-line-on).
-* **Alert role:** silencing a specific alert role will silence all the alert state transitions for alerts that are configured to be specific role recipients, for more details check [alert configuration docs](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alert-line-to).
-
-Beside the above two main entities there are another two important settings that you can define on a silencing rule:
-* Who does the rule affect? **All user** in the space or **Myself**
-* When does is to apply? **Immediately** or on a **Schedule** (when setting immediately you can set duration)
-
-For further help on setting alert notification silencing rules go to [Manage Alert Notification Silencing Rules](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-alert-notification-silencing-rules.md).
-
-> ⚠️ This feature is only available for [Netdata paid plans](https://github.com/netdata/netdata/edit/master/docs/cloud/manage/plans.md).
-
-## Flood protection
-
-If a node has too many state changes like firing too many alerts or going from reachable to unreachable, Netdata Cloud
-enables flood protection. As long as a node is in flood protection mode, Netdata Cloud does not send notifications about
-this node. Even with flood protection active, it is possible to access the node directly, either via Netdata Cloud or
-the local Agent dashboard at `http://NODE:19999`.
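Editorial aside, not part of the deleted file: the removed text says a node stays directly reachable on port 19999 while flood protection is active. A minimal sketch of such a check follows. It assumes the Agent's `/api/v1/info` endpoint is exposed on that port; the helper names `agent_url` and `check_agent` are hypothetical.

```bash
# Build the URL for a node's local Agent (port 19999 as quoted in the text).
# Assumption: /api/v1/info is available and unauthenticated on this node.
agent_url() {
  printf 'http://%s:19999/api/v1/info\n' "$1"
}

# Succeeds if the Agent answers within 5 seconds.
# -f: treat HTTP errors as failures, -sS: quiet but still report errors.
check_agent() {
  curl -fsS --max-time 5 "$(agent_url "$1")" >/dev/null
}

# Usage (not run here): check_agent localhost && echo "agent reachable"
```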
-
-## Anatomy of an alert notification
-
-Email alert notifications show the following information:
-
-- The Space's name
-- The node's name
-- Alert status: critical, warning, cleared
-- Previous alert status
-- Time at which the alert triggered
-- Chart context that triggered the alert
-- Name and information about the triggered alert
-- Alert value
-- Total number of warning and critical alerts on that node
-- Threshold for triggering the given alert state
-- Calculation or database lookups that Netdata uses to compute the value
-- Source of the alert, including which file you can edit to configure this alert on an individual node
-
-Email notifications also feature a **Go to Node** button, which takes you directly to the offending chart for that node
-within Cloud's embedded dashboards.
-
-Here's an example email notification for the `ram_available` chart, which is in a critical state:
-
-![Screenshot of an alert notification email from Netdata Cloud](https://user-images.githubusercontent.com/1153921/87461878-e933c480-c5c3-11ea-870b-affdb0801854.png)
diff --git a/docs/cloud/cheatsheet.md b/docs/cloud/cheatsheet.md
deleted file mode 100644
index a3d2f0285..000000000
--- a/docs/cloud/cheatsheet.md
+++ /dev/null
@@ -1,215 +0,0 @@
-# Useful management and configuration actions
-
-Below you will find some of the most common actions that one can take while using Netdata. You can use this page as a quick reference for installing Netdata, connecting a node to the Cloud, properly editing the configuration, accessing Netdata's API, and more!
-
-### Install Netdata
-
-```bash
-wget -O /tmp/netdata-kickstart.sh https://my-netdata.io/kickstart.sh && sh /tmp/netdata-kickstart.sh
-
-# Or, if you have cURL but not wget (such as on macOS):
-curl https://my-netdata.io/kickstart.sh > /tmp/netdata-kickstart.sh && sh /tmp/netdata-kickstart.sh
-```
-
-#### Connect a node to Netdata Cloud
-
-To do so, sign in to Netdata Cloud, on your Space under the Nodes tab, click `Add Nodes` and paste the provided command into your node’s terminal and run it.
-You can also copy the Claim token and pass it to the installation script with `--claim-token` and re-run it.
-
-### Configuration
-
-**Netdata's config directory** is `/etc/netdata/` but in some operating systems it might be `/opt/netdata/etc/netdata/`.
-Look for the `# config directory =` line over at `http://NODE_IP:19999/netdata.conf` to find your config directory.
-
-From within that directory you can run `sudo ./edit-config netdata.conf` **to edit Netdata's configuration.**
-You can edit other config files too, by specifying their filename after `./edit-config`.
-You are expected to use this method in all following configuration changes.
-
-<!-- #### Edit Netdata's other config files (examples):
-
-- `$ sudo ./edit-config apps_groups.conf`
-- `$ sudo ./edit-config ebpf.conf`
-- `$ sudo ./edit-config health.d/load.conf`
-- `$ sudo ./edit-config go.d/prometheus.conf`
-
-#### View the running Netdata configuration: `http://NODE:19999/netdata.conf`
-
-> Replace `NODE` with the IP address or hostname of your node. Often `localhost`.
-
-## Metrics collection & retention
-
-You can tweak your settings in the netdata.conf file.
-📄 [Find your netdata.conf file](https://github.com/netdata/netdata/blob/master/daemon/config/README.md)
-
-Open a new terminal and navigate to the netdata.conf file. Use the edit-config script to make changes: `sudo ./edit-config netdata.conf`
-
-The most popular settings to change are:
-
-#### Increase metrics retention (4GiB)
-
-```
-sudo ./edit-config netdata.conf
-```
-
-```
-[global]
-    dbengine multihost disk space = 4096
-```
-
-#### Reduce the collection frequency (every 5 seconds)
-
-```
-sudo ./edit-config netdata.conf
-```
-
-```
-[global]
-    update every = 5
-``` -->
-
----
-
-#### Enable/disable plugins (groups of collectors)
-
-```bash
-sudo ./edit-config netdata.conf
-```
-
-```conf
-[plugins]
-    go.d = yes # enabled
-    node.d = no # disabled
-```
-
-#### Enable/disable specific collectors
-
-```bash
-sudo ./edit-config go.d.conf # edit a plugin's config
-```
-
-```yaml
-modules:
-  activemq: no # disabled
-  cockroachdb: yes # enabled
-```
-
-#### Edit a collector's config
-
-```bash
-sudo ./edit-config go.d/mysql.conf
-```
-
-### Alerts & notifications
-
-<!-- #### Add a new alert
-
-```
-sudo touch health.d/example-alert.conf
-sudo ./edit-config health.d/example-alert.conf
-``` -->
-After any change, reload the Netdata health configuration:
-
-```bash
-netdatacli reload-health
-#or if that command doesn't work on your installation, use:
-killall -USR2 netdata
-```
-
-#### Configure a specific alert
-
-```bash
-sudo ./edit-config health.d/example-alert.conf
-```
-
-#### Silence a specific alert
-
-```bash
-sudo ./edit-config health.d/example-alert.conf
-```
-
-```
-    to: silent
-```
-
-<!-- #### Disable alerts and notifications
-
-```conf
-[health]
-    enabled = no
-``` -->
-
----
-
-### Manage the daemon
-
-| Intent | Action |
-|:----------------------------|------------------------------------------------------------:|
-| Start Netdata | `$ sudo service netdata start` |
-| Stop Netdata | `$ sudo service netdata stop` |
-| Restart Netdata | `$ sudo service netdata restart` |
-| Reload health configuration | `$ sudo netdatacli reload-health` `$ killall -USR2 netdata` |
-| View error logs | `less /var/log/netdata/error.log` |
-| View collectors logs | `less /var/log/netdata/collector.log` |
-
-#### Change the port Netdata listens to (example, set it to port 39999)
-
-```conf
-[web]
-default port = 39999
-```
-
-### See metrics and dashboards
-
-#### Netdata Cloud: `https://app.netdata.cloud`
-
-#### Local dashboard: `https://NODE:19999`
-
-> Replace `NODE` with the IP address or hostname of your node. Often `localhost`.
-
-### Access the Netdata API
-
-You can access the API like this: `http://NODE:19999/api/VERSION/REQUEST`.
-If you want to take a look at all the API requests, check our API page at <https://learn.netdata.cloud/api>
-<!--
-## Interact with charts
-
-| Intent | Action |
-| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: |
-| Stop a chart from updating | `click` |
-| Zoom | **Cloud** <br/> use the `zoom in` and `zoom out` buttons on any chart (upper right corner) <br/><br/> **Agent**<br/>`SHIFT` or `ALT` + `mouse scrollwheel` <br/> `SHIFT` or `ALT` + `two-finger pinch` (touchscreen) <br/> `SHIFT` or `ALT` + `two-finger scroll` (touchscreen) |
-| Zoom to a specific timeframe | **Cloud**<br/>use the `select and zoom` button on any chart and then do a `mouse selection` <br/><br/> **Agent**<br/>`SHIFT` + `mouse selection` |
-| Pan forward or back in time | `click` & `drag` <br/> `touch` & `drag` (touchpad/touchscreen) |
-| Select a certain timeframe | `ALT` + `mouse selection` <br/> WIP need to evaluate this `command?` + `mouse selection` (macOS) |
-| Reset to default auto refreshing state | `double click` | -->
-
-<!-- ## Dashboards
-
-#### Disable the local dashboard
-
-Use the `edit-config` script to edit the `netdata.conf` file.
-
-```
-[web]
-mode = none
-``` -->
-
-<!-- #### Opt out from anonymous statistics
-
-```
-sudo touch .opt-out-from-anonymous-statistics
-``` -->
-
-<!-- ## Understanding the dashboard
-
-**Charts**: A visualization displaying one or more collected/calculated metrics in a time series. Charts are generated
-by collectors.
-
-**Dimensions**: Any value shown on a chart, which can be raw or calculated values, such as percentages, averages,
-minimums, maximums, and more.
-
-**Families**: One instance of a monitored hardware or software resource that needs to be monitored and displayed
-separately from similar instances. Example, disks named
-**sda**, **sdb**, **sdc**, and so on.
-
-**Contexts**: A grouping of charts based on the types of metrics collected and visualized.
-**disk.io**, **disk.ops**, and **disk.backlog** are all contexts. -->
diff --git a/docs/cloud/insights/anomaly-advisor.md b/docs/cloud/insights/anomaly-advisor.md
deleted file mode 100644
index 4804dbc16..000000000
--- a/docs/cloud/insights/anomaly-advisor.md
+++ /dev/null
@@ -1,87 +0,0 @@
-<!--
-title: "Anomaly Advisor"
-description: "Quickly find anomalous metrics anywhere in your infrastructure."
-custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/cloud/insights/anomaly-advisor.md"
-sidebar_label: "Anomaly Advisor"
-learn_status: "Published"
-learn_topic_type: "Tasks"
-learn_rel_path: "Operations"
--->
-
-# Anomaly Advisor
-
-import ReactPlayer from 'react-player'
-
-The Anomaly Advisor feature lets you quickly surface potentially anomalous metrics and charts related to a particular highlight window of
-interest.
-
-<ReactPlayer playing true controls true url='https://user-images.githubusercontent.com/24860547/165943403-1acb9759-7446-4704-8955-c566d04ad7ab.mp4' />
-
-## Getting Started
-
-If you are running a Netdata version higher than `v1.35.0-29-nightly` you will be able to use the Anomaly Advisor out of the box with zero configuration. If you are on an earlier Netdata version you will need to first enable ML on your nodes by following the steps below.
-
-To enable the Anomaly Advisor you must first enable ML on your nodes via a small config change in `netdata.conf`. Once the anomaly detection models have trained on the Agent (with default settings this takes a couple of hours until enough data has been seen to train the models) you will then be able to enable the Anomaly Advisor feature in Netdata Cloud.
-
-### Enable ML on Netdata Agent
-
-To enable ML on your Netdata Agent, you need to edit the `[ml]` section in your `netdata.conf` to look something like the following example.
-
-```bash
-[ml]
-    enabled = yes
-```
-
-At a minimum you just need to set `enabled = yes` to enable ML with default params. More details about configuration can be found in the [Netdata Agent ML docs](https://github.com/netdata/netdata/blob/master/ml/README.md#configuration).
-
-When you have finished your configuration, restart Netdata with a command like `sudo systemctl restart netdata` for the config changes to take effect. You can find more info on restarting Netdata [here](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md).
-
-After a brief delay, you should see the number of `trained` dimensions start to increase on the "dimensions" chart of the "Anomaly Detection" menu on the Overview page. By default the `minimum num samples to train = 3600` parameter means at least 1 hour of data is required to train initial models, but you could set this to `900` if you want to train initial models quicker but on less data. Over time, they will retrain on up to `maximum num samples to train = 14400` (4 hours by default), but you could increase this is you wanted to train on more data.
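Editorial aside, not part of the deleted file: the sample counts quoted above map to wall-clock time through the collection interval. The sketch below works through that arithmetic. It assumes one sample per `update every` interval (1 second by default); the function name is illustrative.

```bash
# Seconds of data needed to collect a given number of samples.
# $1: number of samples, $2: collection interval in seconds (default 1).
training_window_seconds() {
  samples=$1
  update_every=${2:-1}
  echo $(( samples * update_every ))
}

training_window_seconds 900     # 900 seconds, i.e. 15 minutes
training_window_seconds 3600    # 3600 seconds, i.e. 1 hour
training_window_seconds 14400   # 14400 seconds, i.e. 4 hours
```

Under the same assumption, a node collecting every 5 seconds would need five times as long to gather the same number of samples.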
-
-![image](https://user-images.githubusercontent.com/2178292/166474099-ba6f5ebe-12b2-4ef2-af9f-e84a05349791.png)
-
-Once this line flattens out all configured metrics should have models trained and predicting anomaly scores each second, ready to be used by the new "anomalies" tab of the Anomaly Advisor.
-
-## Using Anomaly Advisor
-
-To use the Anomaly Advisor, go to the "anomalies" tab. Once you highlight a particular timeframe of interest, a selection of the most anomalous dimensions will appear below.
-
-The aim here is to surface the most anomalous metrics in the space or room for the highlighted window to try and cut down on the amount of manual searching required to get to the root cause of your issues.
-
-![image](https://user-images.githubusercontent.com/2178292/164427337-a40820d2-8d36-4a94-8dfb-cfd3194941e0.png)
-
-The "Anomaly Rate" chart shows the percentage of anomalous metrics over time per node. For example, in the following image, 3.21% of the metrics on the "ml-demo-ml-disabled" node were considered anomalous. This elevated anomaly rate could be a sign of something worth investigating.
-
-**Note**: in this example the anomaly rates for this node are actually being calculated on the parent it streams to, you can run ml on the Agent itselt or on a parent the Agent stream to. Read more about the various configuration options in the [Agent docs](https://github.com/netdata/netdata/blob/master/ml/README.md).
-
-![image](https://user-images.githubusercontent.com/2178292/164428307-6a86989a-611d-47f8-a673-911d509cd954.png)
-
-The "Count of Anomalous Metrics" chart (collapsed by default) shows raw counts of anomalous metrics per node so may often be similar to the anomaly rate chart, apart from where nodes may have different numbers of metrics.
-
-The "Anomaly Events Detected" chart (collapsed by default) shows if the anomaly rate per node was sufficiently elevated to trigger a node level anomaly. Anomaly events will appear slightly after the anomaly rate starts to increase in the timeline, this is because a significant number of metrics in the node need to be anomalous before an anomaly event is triggered.
-
-Once you have highlighted a window of interest, you should see an ordered list of anomaly rate sparklines in the "Anomalous metrics" section like below.
-
-![image](https://user-images.githubusercontent.com/2178292/164427592-ab1d0eb1-57e2-4a05-aaeb-da4437a019b1.png)
-
-You can expand any sparkline chart to see the underlying raw data to see how it relates to the corresponding anomaly rate.
-
-![image](https://user-images.githubusercontent.com/2178292/164430105-f747d1e0-f3cb-4495-a5f7-b7bbb71039ae.png)
-
-On the upper right hand side of the page you can select which nodes to filter on if you wish to do so. The ML training status of each node is also displayed.
-
-On the lower right hand side of the page an index of anomaly rates is displayed for the highlighted timeline of interest. The index is sorted from most anomalous metric (highest anomaly rate) to least (lowest anomaly rate). Clicking on an entry in the index will scroll the rest of the page to the corresponding anomaly rate sparkline for that metric.
-
-### Usage Tips
-
-- If you are interested in a subset of specific nodes then filtering to just those nodes before highlighting tends to give better results. This is because when you highlight a region, Netdata Cloud will ask the Agents for a ranking over all metrics so if you can filter this early to just the subset of nodes you are interested in, less 'averaging' will occur and so you might be a less noisy ranking.
-- Ideally try and highlight close to a spike or window of interest so that the resulting ranking can narrow in more easily on the timeline you are interested in.
-
-You can read more detail on how anomaly detection in the Netdata Agent works in our [Agent docs](https://github.com/netdata/netdata/blob/master/ml/README.md).
- -🚧 **Note**: This functionality is still **under active development** and considered experimental. We dogfood it internally and among early adopters within the Netdata community to build the feature. If you would like to get involved and help us with feedback, you can reach us through any of the following channels: - -- Email us at analytics-ml-team@netdata.cloud -- Comment on the [beta launch post](https://community.netdata.cloud/t/anomaly-advisor-beta-launch/2717) in the Netdata community -- Join us in the [🤖-ml-powered-monitoring](https://discord.gg/4eRSEUpJnc) channel of the Netdata discord. -- Or open a discussion in GitHub if that's more your thing diff --git a/docs/cloud/insights/events-feed.md b/docs/cloud/insights/events-feed.md deleted file mode 100644 index a56877ab1..000000000 --- a/docs/cloud/insights/events-feed.md +++ /dev/null @@ -1,99 +0,0 @@ -<!-- -title: "Events feed" -sidebar_label: "Events feed" -custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/cloud/insights/events-feed.md" -sidebar_position: "2800" -learn_status: "Published" -learn_topic_type: "Concepts" -learn_rel_path: "Concepts" -learn_docs_purpose: "Present the Netdata Events feed." ---> - -# Events feed - -Netdata Cloud provides the Events feed which is a powerful feature that tracks events that happen on your infrastructure, or in your Space. The feed lets you investigate events that occurred in the past, which is invaluable for troubleshooting. Common use cases are ones like when a node goes offline, and you want to understand what events happened before that. A detailed event history can also assist in attributing sudden pattern changes in a time series to specific changes in your environment. - -## What are the available events? - -At a high-level view, these are the domains from which the Events feed will provide visibility into. - -> ⚠️ Based on your space's plan, different allowances are defined to query past data. 
- -| **Domains of events** | **Community** | **Pro** | **Business** | -| :-- | :-- | :-- | :-- | -| **[Auditing events](#auditing-events)**<br/>Events related to actions done on your Space, e.g. invite user, change user role or change plan.| 4 hours | 7 days | 90 days | -| **[Topology events](#topology-events)**<br/>Node state transition events, e.g. live or offline.| 4 hours | 7 days | 14 days | -| **[Alert events](#alert-events)**<br/>Alert state transition events, can be seen as an alert history log.| 4 hours | 7 days | 90 days | - -### Auditing events - -| **Event name** | **Description** | **Example** | -| :-- | :-- | :-- | -| Space Created | The space was created.| Space `Acme Space` was **created** | -| Room Created | A room was created on the Space.| Room `DB Servers` was **created** by `John Doe` | -| Room Deleted | A room was deleted from the Space. | Room `DB servers` was **deleted** by `John Doe` | -| User Invited to Space | A user was invited to join the Space.| User `John Smith` was **invited** to this space by `Alan Doe` | -| User Uninvited from Space | An invitation for a user to join the space was revoked.| User `John Smith` was **uninvited** from this space | -| User Added to Space | A user was added to the Space from an invitation (user accepted the invitation).| User `John Smith` was **added** to this space by invite of `Alan Doe` | -| User Removed from Space | A user was removed from the Space. | User `John Smith` was **removed** from this space by `Alan Doe` | -| User Added to Room | A user was added to a room on the Space. | User `John Smith` was **added** to room `DB servers` | -| User Removed from Room | A user was removed from a room on the Space. | User `John Smith` was **removed** from room `DB Servers` by `Alan Doe` | -| User Space Properties Changed | The properties of a user on the Space have changed, e.g.
change user role | User role for `John Smith` was **changed** to `troubleshooter` by `Alan Doe` | -| Node Added To Room | The node was added to a room on the Space. | Node `ip-xyz.ec2.internal` was **added** to room `DB Servers` by `John Doe` | -| Node Removed From Room | The node was removed from a room on the Space. | Node `ip-xyz.ec2.internal` was **removed** from room `DB Servers` by `John Doe` | -| Silencing Rule Created | A new alert notification silencing rule was created on the Space. | Silencing rule `DB Servers schedule silencing` on rooms `All nodes` and `DB Servers` was **created** by `John Smith` | -| Silencing Rule Changed | An existing alert notification silencing rule was modified on the Space. | Silencing rule `DB Servers schedule silencing` on rooms `All nodes` and `DB Servers` was **changed** by `John Doe` | -| Silencing Rule Deleted | An existing alert notification silencing rule was removed from the Space. | Silencing rule `DB Servers schedule silencing` on rooms `All nodes` and `DB Servers` was **deleted** by `Alan Smith` | - -### Topology events - -| **Event name** | **Description** | **Example** | -| :-- | :-- | :-- | -| Node Became Live | The node is collecting and streaming metrics to Cloud.| Node `netdata-k8s-state-xyz` was **live** | -| Node Became Stale | The node is offline and not streaming metrics to Cloud. It can show historical data from a parent node. | Node `ip-xyz.ec2.internal` was **stale** | -| Node Became Offline | The node is offline, not streaming metrics to Cloud and not available in any parent node.| Node `ip-xyz.ec2.internal` was **offline** | -| Node Created | The node is created but it is still `Unseen` on Cloud because it has not yet established a successful connection.| Node `ip-xyz.ec2.internal` was **created** | -| Node Removed |The node was removed from the Space, for example by using the `Delete` action on the node. This is a soft delete in that the node gets marked as deleted, but retains the association with this space.
If it becomes live again, it will be restored (see `Node Restored` below) and reappear in this space as before. | Node `ip-xyz.ec2.internal` was **deleted (soft)** | -| Node Restored | The node was restored. See `Node Removed` above. | Node `ip-xyz.ec2.internal` was **restored** | -| Node Deleted | The node was deleted from the Space. This is a hard delete and no information on the node is retained. | Node `ip-xyz.ec2.internal` was **deleted (hard)** | -| Agent Connected | The agent connected to the Cloud MQTT server (Agent-Cloud Link established).<br/>These events can only be seen on _All nodes_ War Room. | Agent with claim ID `7d87bqs9-cv42-4823-8sd4-3614548850c7` has connected to Cloud. | -| Agent Disconnected | The agent disconnected from the Cloud MQTT server (Agent-Cloud Link severed).<br/>These events can only be seen on _All nodes_ War Room. | Agent with claim ID `7d87bqs9-cv42-4823-8sd4-3614548850c7` has disconnected from Cloud: **Connection Timeout**. | -| Space Statistics | Daily snapshot of space node statistics.<br/>These events can only be seen on _All nodes_ War Room. | Space statistics. Nodes: **22 live**, **21 stale**, **18 removed**, **61 total**. | - - -### Alert events - -| **Event name** | **Description** | **Example** | -| :-- | :-- | :-- | -| Node Alert State Changed | These are node alert state transition events and can be seen as an alert history log. 
You will be able to see transitions to or from any of these states: Cleared, Warning, Critical, Removed, Error or Unknown | Transition to Cleared:<br/>`httpcheck_web_service_bad_status` for `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` recovered with value **8.33%**<br/><br/>Transition from Cleared to Warning or Critical:<br/>`httpcheck_web_service_bad_status` for `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` was raised to **WARNING** with value **10%**<br/><br/>Transition from Warning to Critical:<br/>`httpcheck_web_service_bad_status` for `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` escalated to **CRITICAL** with value **25%**<br/><br/>Transition from Critical to Warning:<br/>`httpcheck_web_service_bad_status` for `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` was demoted to **WARNING** with value **10%**<br/><br/>Transition to Removed:<br/>Alert `httpcheck_web_service_bad_status` for `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` is no longer available, state can't be assessed.<br/><br/>Transition to Error:<br/>For this alert `httpcheck_web_service_bad_status` related to `httpcheck_netdata_cloud.request_status` on `netdata-parent-xyz` we couldn't calculate the current value ⓘ| - -## Who can access the events? - -All users will be able to see events from the Topology and Alert domains, but Auditing events, once these are added, will only be accessible to administrators. For more details, check out the [Netdata Role-Based Access model](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/role-based-access.md). - -## How to use the events feed - -1. Click on the **Events** tab (located near the top of your screen) -1. You will be presented with a table listing the events that occurred in the timeframe defined on the date time picker -1. You can use the filtering capabilities available on the right-hand bar to slice through the results provided.
See more details on event types and filters - -Note: When you try to query a longer period than what your space allows you will see an error message highlighting that you are querying data outside of your plan. - -### Event types and filters - -| Event type | Tags | Nodes | Alert Status | Alert Names | Chart Names | -| :-- | :-- | :-- | :-- | :-- | :-- | -| Node Became Live | node, lifecycle | Node name | - | - | - | -| Node Became Stale | node, lifecycle | Node name | - | - | - | -| Node Became Offline | node, lifecycle | Node name | - | - | - | -| Node Created | node, lifecycle | Node name | - | - | - | -| Node Removed | node, lifecycle | Node name | - | - | - | -| Node Restored | node, lifecycle | Node name | - | - | - | -| Node Deleted | node, lifecycle | Node name | - | - | - | -| Agent Claimed | agent | - | - | - | - | -| Agent Connected | agent | - | - | - | - | -| Agent Disconnected | agent | - | - | - | - | -| Agent Authenticated | agent | - | - | - | - | -| Agent Authentication Failed | agent | - | - | - | - | -| Space Statistics | space, node, statistics | Node name | - | - | - | -| Node Alert State Changed | alert, node | Node name | Cleared, Warning, Critical, Removed, Error or Unknown | Alert name | Chart name | diff --git a/docs/cloud/insights/metric-correlations.md b/docs/cloud/insights/metric-correlations.md deleted file mode 100644 index c8ead9be3..000000000 --- a/docs/cloud/insights/metric-correlations.md +++ /dev/null @@ -1,85 +0,0 @@ -<!-- -title: "Metric Correlations" -description: "Quickly find metrics and charts closely related to a particular timeframe of interest anywhere in your infrastructure to discover the root cause faster." 
-custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md" -sidebar_label: "Metric Correlations" -learn_status: "Published" -learn_topic_type: "Tasks" -learn_rel_path: "Operations" ---> - -# Metric Correlations - -The Metric Correlations (MC) feature lets you quickly find metrics and charts related to a particular window of interest that you want to explore further. By displaying the standard Netdata dashboard, filtered to show only charts that are relevant to the window of interest, you can get to the root cause sooner. - -Because Metric Correlations uses every available metric from your infrastructure, with as high as 1-second granularity, you get the most accurate insights using every possible metric. - -## Using Metric Correlations - -When viewing the overview or a single-node dashboard, the **Metric Correlations** button appears in the top right corner of the page. - -![The Metric Correlations button](https://user-images.githubusercontent.com/2178292/201082551-d805b20d-0472-455d-9f11-b2329adf3098.png) - -To start correlating metrics, click the **Metric Correlations** button, then hold the `Alt` key (or `⌘` on macOS) and click-and-drag a selection of metrics on a single chart. The selected timeframe needs to be at least 15 seconds for Metric Correlation to work. - -The menu then displays information about the selected area and reference baseline. Metric Correlations uses the reference baseline to discover which additional metrics are most closely connected to the selected metrics. The reference baseline is based upon the period immediately preceding the highlighted window and is the length of 4 times the highlighted window. This is to ensure that the reference baseline is always immediately before the highlighted window of interest and a bit longer so as to ensure it's a more representative short term baseline. 
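
The reference baseline rule described above can be sketched in a few lines of Python. This is an illustrative sketch under the assumptions stated in the text (a minimum highlight of 15 seconds and a baseline 4 times as long that ends where the highlight starts). The function name `baseline_window` and the constant names are hypothetical and are not taken from the Netdata code base.

```python
from datetime import datetime, timedelta

MIN_HIGHLIGHT = timedelta(seconds=15)  # minimum selection stated in the text
BASELINE_MULTIPLIER = 4                # baseline is 4x the highlighted window


def baseline_window(highlight_start: datetime, highlight_end: datetime):
    """Return (baseline_start, baseline_end) for a highlighted window."""
    length = highlight_end - highlight_start
    if length < MIN_HIGHLIGHT:
        raise ValueError("highlighted window must be at least 15 seconds")
    # The baseline ends exactly where the highlighted window begins.
    baseline_end = highlight_start
    baseline_start = highlight_start - BASELINE_MULTIPLIER * length
    return baseline_start, baseline_end
```

With a one-minute highlight from 12:00:00 to 12:01:00, this sketch yields a baseline from 11:56:00 to 12:00:00.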
- -Press the **Find Correlations** button to start the correlations process. The button is only enabled when a valid timeframe is selected (at least 15 seconds). Once pressed, the process will score all available metrics on your nodes and return a filtered version of the Netdata dashboard. Now, you'll see only those metrics that have changed the most between a baseline window and the highlighted window you have selected. - -![Metric Correlations results](https://user-images.githubusercontent.com/2178292/181751182-25e0890d-a5f4-4799-9936-1523603cf97d.png) - -These charts are fully interactive, and whenever possible, will only show the _dimensions_ related to the timeline you selected. - -You can interact with all the scored metrics via the slider. Slide toward **show less** for more nuanced and significant results, or toward **show more** to "loosen" the threshold to explore other charts that may have changed too, but in a less significant manner. - -If you find something else interesting in the results, you can select another window and press **Find Correlations** again to restart the process. - -## Metric Correlations options - -MC enables a few input parameters that users can define to iteratively explore their data in different ways. As is usually the case in Machine Learning (ML), there is no "one size fits all" algorithm. The approach that works best will typically depend on the type of data (which can be very different from one metric to the next) and even the nature of the event or incident you might be exploring in Netdata. - -So when you first run MC it will use the most sensible and general defaults. But you can also then vary any of the below options to explore further. - -### Method - -There are two algorithms available that aim to score metrics based on how much they have changed between the baseline and highlight windows.
- -- `KS2` - A statistical test ([Two-sample Kolmogorov Smirnov](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov%E2%80%93Smirnov_test)) comparing the distribution of the highlighted window to the baseline to try and quantify which metrics have most evidence of a significant change. You can explore our implementation [here](https://github.com/netdata/netdata/blob/d917f9831c0a1638ef4a56580f321eb6c9a88037/database/metric_correlations.c#L212). -- `Volume` - A heuristic measure based on the percentage change in averages between highlighted window and baseline, with various edge cases sensibly controlled for. You can explore our implementation [here](https://github.com/netdata/netdata/blob/d917f9831c0a1638ef4a56580f321eb6c9a88037/database/metric_correlations.c#L516). - -### Aggregation - -Behind the scenes, Netdata will aggregate the raw data as needed such that arbitrary window lengths can be selected for MC. By default, Netdata will just `Average` raw data when needed as part of pre-processing. However other aggregations like `Median`, `Min`, `Max`, `Stddev` are also possible. - -### Data - -Netdata is different from typical observability agents since, in addition to just collecting raw metric values, it will by default also assign an "[Anomaly Bit](https://github.com/netdata/netdata/tree/master/ml#anomaly-bit---100--anomalous-0--normal)" related to each collected metric each second. This bit will be 0 for "normal" and 1 for "anomalous". This means that each metric also natively has an "[Anomaly Rate](https://github.com/netdata/netdata/tree/master/ml#anomaly-rate---averageanomaly-bit)" associated with it and, as such, MC can be run against the raw metric values or their corresponding anomaly rates. - -**Note**: Read more [here](https://github.com/netdata/netdata/blob/master/ml/README.md) to learn more about the native anomaly detection features within netdata. - -- `Metrics` - Run MC on the raw metric values. 
-- `Anomaly Rate` - Run MC on the corresponding anomaly rate for each metric. - -## Metric Correlations on the agent - -As of `v1.35.0` Netdata is able to run the Metric Correlations algorithm ([Two Sample Kolmogorov-Smirnov test](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Two-sample_Kolmogorov%E2%80%93Smirnov_test)) on the agent itself. This avoids sending the underlying raw data to the original Netdata Cloud based microservice and so typically will be much much faster as no data moves around and the computation happens instead on the agent. - -When a Metric Correlations request is made to Netdata Cloud, if any node instances have MC enabled then the request will be routed to the node instance with the highest hops (e.g. a parent node if one is found or the node itself if not). If no node instances have MC enabled then the request will be routed to the original Netdata Cloud based service which will request input data from the nodes and run the computation within the Netdata Cloud backend. - -#### Enabling/Disabling Metric Correlations on the agent - -As of `v1.35.0-22-nightly` Metric Correlation has been enabled by default on all agents. After further optimizations to the implementation, the impact of running the metric correlations algorithm on the agent was less than the impact of preparing all the data to send to cloud for MC to run in the cloud, as such running MC on the agent is less impactful on local resources than running via cloud. - -Should you still want to, disabling nodes for Metric Correlation on the agent is a simple one line config change. Just set `enable metric correlations = no` in the `[global]` section of `netdata.conf` - -## Usage tips! 
- -- When running Metric Correlations from the [Overview tab](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md#overview-and-single-node-view) across multiple nodes, you might find better results if you iterate on the initial results by grouping by node to then filter to nodes of interest and run the Metric Correlations again. So a typical workflow in this case would be to: - - If unsure which nodes you are interested in then run MC on all nodes. - - Within the initial results returned group the most interesting chart by node to see if the changes are across all nodes or a subset of nodes. - - If you see a subset of nodes clearly jump out when you group by node, then filter for just those nodes of interest and run the MC again. This will result in less aggregation needing to be done by Netdata and so should help give clearer results as you interact with the slider. -- Use the `Volume` algorithm for metrics with a lot of gaps (e.g. request latency when there are few requests), otherwise stick with `KS2` - - By default, Netdata uses the `KS2` algorithm which is a tried and tested method for change detection in a lot of domains. The [Wikipedia](https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test) article gives a good overview of how this works. Basically, it is comparing, for each metric, its cumulative distribution in the highlight window with its cumulative distribution in the baseline window. The statistical test then seeks to quantify the extent to which we can say these two distributions look similar enough to be considered the same or not. The `Volume` algorithm is a bit more simple than `KS2` in that it basically compares (with some edge cases sensibly handled) the average value of the metric across baseline and highlight and looks at the percentage change. Often both `KS2` and `Volume` will have significant agreement and return similar metrics. 
- - `Volume` might favour picking up more sparse metrics that were relatively flat and then came to life with some spikes (or vice versa). This is because for such metrics that just don't have that many different values in them, it is impossible to construct a cumulative distribution that can then be compared. So `Volume` might be useful in spotting examples of metrics turning on or off. ![example where volume captured network traffic turning on](https://user-images.githubusercontent.com/2178292/182336924-d02fd3d3-7f09-41da-9cfc-809d01396d9d.png) - - `KS2` since it relies on the full distribution might be better at highlighting more complex changes that `Volume` is unable to capture. For example a change in the variation of a metric might be picked up easily by `KS2` but missed (or just much lower scored) by `Volume` since the averages might remain not all that different between baseline and highlight even if their variance has changed a lot. ![example where KS2 captured a change in entropy distribution that volume alone might not have picked up](https://user-images.githubusercontent.com/2178292/182338289-59b61e6b-089d-431c-bc8e-bd19ba6ad5a5.png) -- Use `Volume` and `Anomaly Rate` together to ask what metrics have turned most anomalous from baseline to highlighted window. You can expand the embedded anomaly rate chart once you have results to see this more clearly. 
![example where Volume and Anomaly Rate together help show what dimensions were most anomalous](https://user-images.githubusercontent.com/2178292/182338666-6d19fa92-89d3-4d61-804c-8f10982114f5.png) diff --git a/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md b/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md deleted file mode 100644 index b36e0806b..000000000 --- a/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md +++ /dev/null @@ -1,169 +0,0 @@ -# Organize Your Infrastructure and Invite your Team - -Netdata Cloud provides you with features such as [Spaces](#netdata-cloud-spaces) and [War Rooms](#netdata-cloud-war-rooms) that allow you to better organize your infrastructure and ensure your team can also have access to it through invites. - -## Netdata Cloud Spaces - -Organize your multi-organization infrastructure monitoring on Netdata Cloud by creating Spaces to completely isolate access to your Agent-monitored nodes. - -A Space is a high-level container. It's a collaboration space where you can organize team members, access levels and the -nodes you want to monitor. - -Let's talk through some strategies for creating the most intuitive Cloud experience for your team. - -### How to organize your Netdata Cloud - -You can use any number of Spaces you want, but as you organize your Cloud experience, keep in mind that _you can only -add any given node to a single Space_. This 1:1 relationship between node and Space may dictate whether you use one -encompassing Space for your entire team and separate them by War Rooms, or use different Spaces for teams monitoring -discrete parts of your infrastructure. - -If you have been invited to Netdata Cloud by another user, you will be able to see that space by default. If you are a new -user, the first space is already created. - -The other consideration for the number of Spaces you use to organize your Netdata Cloud experience is the size and -complexity of your organization.
- -For smaller teams and infrastructures, we recommend sticking to a single Space so that you can keep all your nodes and their -respective metrics in one place. You can then use -multiple [War Rooms](#netdata-cloud-war-rooms) -to further organize your infrastructure monitoring. - -Enterprises may want to create multiple Spaces for each of their larger teams, particularly if those teams have -different responsibilities or parts of the overall infrastructure to monitor. For example, you might have one SRE team -for your user-facing SaaS application and a second team for infrastructure tooling. If they don't need to monitor the -same nodes, you can create separate Spaces for each team. - -### Navigate between spaces - -Click on any of the boxes to switch between available Spaces. - -Netdata Cloud abbreviates each Space to the first letter of the name, or the first two letters if the name is two words -or more. Hover over each icon to see the full name in a tooltip. - -To add a new Space click on the green **+** button. Enter the name of the Space and click **Save**. - -![Switch between Spaces](https://github.com/netdata/netdata/assets/70198089/aa0d7a2f-02ec-4c01-a2d9-1f99642f2496) - -### Manage Spaces - -Manage your spaces by selecting a particular space and clicking on the small gear icon in the lower left corner. This -will open a side tab in which you can: - -1. _Configure this Space*_, in the first tab (**Space**) you can change the name, description or/and some privilege - options of this space - -2. _Edit the War Rooms*_, click on the **War rooms** tab to add or remove War Rooms. - -3. _Connect nodes*_, click on **Nodes** tab. Copy the claiming script to your node and run it. See the - [connect to Cloud doc](https://github.com/netdata/netdata/blob/master/claim/README.md) for details. - -4. _Manage the users*_, click on **Users**. - The [invitation doc](#invite-your-team) - details the invitation process. - -5. 
_Manage notification setting*_, click on **Notifications** tab to turn off/on notification methods. - -6. _Manage your bookmarks*_, click on the **Bookmarks** tab to add or remove bookmarks that you need. - -> #### Note -> -> \* This action requires admin rights for this space - -### Obsoleting offline nodes from a Space - -Netdata admin users now have the ability to remove obsolete nodes from a space. - -- Only admin users have the ability to obsolete nodes -- Only offline nodes can be marked obsolete (Live nodes and stale nodes cannot be obsoleted) -- Node obsoletion works across the entire space, so the obsoleted node will be removed from all rooms belonging to the - space -- If the obsoleted nodes eventually become live or online once more they will be automatically re-added to the space - -![Obsoleting an offline node](https://user-images.githubusercontent.com/24860547/173087202-70abfd2d-f0eb-4959-bd0f-74aeee2a2a5a.gif) - -## Netdata Cloud War rooms - -Netdata Cloud uses War Rooms to organize your connected nodes and provide infrastructure-wide dashboards using real-time metrics and visualizations. - -Once you add nodes to a Space, all of your nodes will be visible in the **All nodes** War Room. This is a special War Room -which gives you an overview of all of your nodes in this particular Space. Then you can create functional separations of -your nodes into more War Rooms. Every War Room has its own dashboards, navigation, indicators, and management tools. - -![An example War Room](https://user-images.githubusercontent.com/43294513/225355998-f16730ba-06d4-4953-8fd3-f1c2751e102d.png) - -### War Room organization - -We recommend a few strategies for organizing your War Rooms. 
- -- **Service, purpose, location, etc.** - You can group War Rooms by a service (Nginx, MySQL, Pulsar, and so on), their purpose (webserver, database, application), their physical location, whether they're "bare metal" or a Docker container, the PaaS/cloud provider it runs on, and much more. - This allows you to see entire slices of your infrastructure by moving from one War Room to another. - -- **End-to-end apps/services** - If you have a user-facing SaaS product, or an internal service that this said product relies on, you may want to monitor that entire stack in a single War Room. This might include Kubernetes clusters, Docker containers, proxies, databases, web servers, brokers, and more. - End-to-end War Rooms are valuable tools for ensuring the health and performance of your organization's essential services. - -- **Incident response** - You can also create new War Rooms as one of the first steps in your incident response process. - For example, you have a user-facing web app that relies on Apache Pulsar for a message queue, and one of your nodes using the [Pulsar collector](https://github.com/netdata/go.d.plugin/blob/master/modules/pulsar/README.md) begins reporting a suspiciously low messages rate. - You can create a War Room called `$year-$month-$day-pulsar-rate`, add all your Pulsar nodes in addition to nodes they connect to, and begin diagnosing the root cause in a War Room optimized for getting to resolution as fast as possible. - -### Add War Rooms - -To add new War Rooms to any Space, click on the green plus icon **+** next to the **War Rooms** heading on the left (Space's) sidebar. - -In the panel, give the War Room a name and description, and choose whether it's public or private. -Anyone in your Space can join public War Rooms, but can only join private War Rooms with an invitation. - -### Manage War Rooms - -All the users and nodes involved in a particular Space can be part of a War Room. 
- -Any user can change simple settings of a War Room, like the name or the users participating in it. -Click on the gear icon next to the War Room's name at the top of the page to do that. A sidebar will open with options for this War Room: - -1. To **change a War Room's name, description, or public/private status**, click on the **War Room** tab. - -2. To **include an existing node** in a War Room or **connect a new node\*** click on the **Nodes** tab. Choose any connected node you want to add to this War Room by clicking on the checkbox next to its hostname, then click **+ Add** at the top of the panel. - -3. To **add existing users to a War Room**, click on **Add Users**. - See our [invite section](#invite-your-team) for details on inviting new users to your Space in Netdata Cloud. - -> #### Note -> ->\* This action requires **admin** rights for this Space - -#### More actions - -To **view or remove nodes** in a War Room, click on the **Nodes** tab. To remove a node from the current War Room, click on -the **🗑** icon. - -> #### Info -> -> Removing a node from a War Room does not remove it from your Space. - -## Invite your team - -Invite your entire SRE, DevOps, or ITOps team to Netdata Cloud, to give everyone insights into your infrastructure from a single pane of glass. - -Invite new users to your Space by clicking on **Invite Users** in -the [Space](#netdata-cloud-spaces) management area. - -![image](https://user-images.githubusercontent.com/70198089/227887469-e46bad55-ef5d-441a-83a5-dcc2af038678.png) - - -You will be prompted to enter the email addresses of the users you want to invite to your Space. You can enter any number of email addresses, separated by a comma, to send multiple invitations at once. - -Next, choose the War Rooms you want to invite these users to. Once logged in, these users are not restricted only to -these War Rooms. They can be invited to others, or join any that are public. - -Next, pick a role for the invited user.
You can read more about [which roles are available](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/role-based-access.md#what-roles-are-available) based on your [subscription plan](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md). - -Click the **Send** button to send an email invitation, which will prompt them -to [sign up](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/sign-in.md) and join your Space. - -![image](https://user-images.githubusercontent.com/70198089/227888899-8511081b-0157-4e22-81d9-898cc464dcb0.png) - -Any unaccepted invitations remain under **Invitations awaiting response**. These invitations can be rescinded at any -time by clicking the trash can icon. diff --git a/docs/cloud/manage/plans.md b/docs/cloud/manage/plans.md deleted file mode 100644 index f84adaa8e..000000000 --- a/docs/cloud/manage/plans.md +++ /dev/null @@ -1,123 +0,0 @@ -# Netdata Plans - -This page will guide you through the differences between the Community, Pro, Business and Enterprise plans. - -At Netdata, we believe in providing free and unrestricted access to high-quality monitoring solutions, and our commitment to this principle will not change. We offer our free SaaS offering - what we call **Community plan** - and Open Source Agent, which features unlimited nodes and users, unlimited metrics, and retention, providing real-time, high-fidelity, out-of-the-box infrastructure monitoring for packaged applications, containers, and operating systems. - -We also provide paid subscriptions that are designed to provide additional features and capabilities for businesses that need tighter and customizable integration of the free monitoring solution to their processes. These are divided into three different plans: **Pro**, **Business**, and **Enterprise**. Each plan offers a different set of features and capabilities to meet the needs of businesses of different sizes and with different monitoring requirements.
- -> ### Note -> To avoid disrupting existing space users' access rights, we will keep them in the **Early Bird** plan. The reason for this is to allow users to -> keep using the legacy **Member** role with the exact same permissions as it has currently. -> -> If you move from the **Early Bird** plan to a paid plan, you will not be able to return to the **Early Bird** plan again. The **Community** free plan will always be available to you, but it does not allow -> you to invite or change users using the Member role. See more details on our [roles and plans](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/role-based-access.md#what-roles-are-available) documentation. - -### Plans - -The plan is an attribute that is directly attached to your space(s) and that dictates what capabilities and customizations you have on your space. If you have different spaces you can have different Netdata plans on them. This gives you flexibility to choose what is most adequate for your needs on each of your spaces. - -Netdata Cloud plans, with the exception of Community, work as subscriptions and overall consist of two pricing components: - -* A flat fee component, that is applied on yearly subscriptions for the [committed-nodes](#committed-nodes) charge (the space subscription fee has been waived) -* An on-demand metered component, that is related to your usage of Netdata which directly links to the [number of nodes you have running](#running-nodes-and-billing) - -Netdata provides two billing frequency options: - -* Monthly - Pay as you go, where we charge both the flat fee and the on-demand component every month -* Yearly - Annual prepayment, where we charge upfront the flat fee and committed amount related to your estimated usage of Netdata (more details [here](#committed-nodes)) - -For more details on the plans and subscription conditions please check <https://netdata.cloud/pricing>. 
- -#### Running nodes and billing - -The only dynamic variable we consider for billing is the number of concurrently running nodes or agents. We only charge you for your active running nodes, so we don't count: - -* offline nodes -* stale nodes, nodes that are available to query through a Netdata parent agent but are not actively collecting metrics at the moment - -To ensure we don't overcharge you due to sporadic spikes throughout a month or even at a certain point in a day, we: - -* Calculate a daily P90 figure for your running nodes. To achieve that, we take a daily snapshot of your running nodes, and using the node state change events (live, offline) we guarantee that a daily P90 figure is calculated to remove any daily spikes -* On top of the above, do a running P90 calculation from the start to the end of your billing cycle. Even if you have a yearly billing frequency we keep a monthly subscription linked to that to identify any potential overage over your [committed nodes](#committed-nodes). - -#### Committed nodes - -When you subscribe to a Yearly plan you will need to specify the number of nodes that you will commit to. On these nodes, a discounted price, 25% less than the plan's original cost per node, is applied. This amount will be part of your annual prepayment. - -``` -Node plan discounted price x committed nodes x 12 months -``` - -If, for a given month, your usage is over these committed nodes we will charge the original cost per node for the nodes above the committed number. - -#### Plan changes and credit balance - -It is ok to change your mind. You can change your plan, billing frequency or, on yearly plans, the committed nodes at any time. - -To achieve this you can check the [Update plan](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/view-plan-billing.md#update-plan) section. 
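The committed-nodes pricing described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-node price of 4.00 is hypothetical (the deleted docs point to the pricing page for real figures), the function names are invented for this example, and `billed_nodes` stands for the period's P90 node count.

```python
from decimal import Decimal

# From the text: committed nodes are billed at 25% less than the plan's per-node price.
COMMITTED_DISCOUNT = Decimal("0.25")


def annual_prepayment(price_per_node: Decimal, committed_nodes: int) -> Decimal:
    """Node plan discounted price x committed nodes x 12 months."""
    discounted_price = price_per_node * (1 - COMMITTED_DISCOUNT)
    return discounted_price * committed_nodes * 12


def monthly_overage(price_per_node: Decimal, committed_nodes: int, billed_nodes: int) -> Decimal:
    """Nodes above the committed number are charged at the original per-node price."""
    extra_nodes = max(0, billed_nodes - committed_nodes)
    return price_per_node * extra_nodes


if __name__ == "__main__":
    price = Decimal("4.00")  # hypothetical per-node monthly price, not a real Netdata figure
    print(annual_prepayment(price, 10))
    print(monthly_overage(price, 10, 13))
```

Using `Decimal` rather than floats avoids rounding drift when amounts are summed across invoices.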
- -> ⚠️ On a downgrade (going to a new plan with less benefits) or cancellation of an active subscription, please note that you will have all your notification methods configurations active **for a period of 24 hours**. -> After that, any notification methods unavailable in your new plan at that time will be automatically disabled. You can always re-enable them once you move to a paid plan that includes them. - -> ⚠️ Downgrade or cancellation may affect users in your Space. Please check what roles are available on the [each plans](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#areas-impacted-by-plans). Users with unavailable roles on the new plan will immediately have restricted access to the Space. - -> ⚠️ Any credit given to you will be available to use on future paid subscriptions with us. It will be available until the the **end of the following year**. - -### Areas impacted by plans - -##### Role-Based Access model - -Depending on the plan associated to your space you will have different roles available: - -| **Role** | **Community** | **Pro** | **Business** | **Early Bird** | -| :-- | :--: | :--: | :--: | :--: | -| **Administrators**<p>Users with this role can control Spaces, War Rooms, Nodes, Users and Billing.</p><p>They can also access any War Room in the Space.</p> | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| **Managers**<p>Users with this role can manage War Rooms and Users.</p><p>They can access any War Room in the Space.</p> | - | - | :heavy_check_mark: | - | -| **Troubleshooters**<p>Users with this role can use Netdata to troubleshoot, not manage entities.</p><p>They can access any War Room in the Space.</p> | - | :heavy_check_mark: | :heavy_check_mark: | - | -| **Observers**<p>Users with this role can only view data in specific War Rooms.</p>💡 Ideal for restricting your customer's access to their own dedicated rooms.<p></p> | - | - | :heavy_check_mark: | - | -| **Billing**<p>Users 
with this role can handle billing options and invoices.</p> | - | - | :heavy_check_mark: | - | -| **Member** ⚠️ Legacy role<p>Users with this role can create War Rooms and invite other Members.</p><p>They can only see the War Rooms they belong to and all Nodes in the All Nodes room.</p>| - | - | - | :heavy_check_mark: | - -For more details check the documentation under [Role-Based Access model](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/role-based-access.md). - -##### Events feed - -The plan you have subscribed on your space will determine the amount of historical data you will be able to query: - -| **Type of events** | **Community** | **Pro** | **Business** | -| :-- | :-- | :-- | :-- | -| **Auditing events** - COMING SOON<p>Events related to actions done on your Space, e.g. invite user, change user role or create room.</p>| 4 hours | 7 days | 90 days | -| **Topology events**<p>Node state transition events, e.g. live or offline.</p>| 4 hours | 7 days | 14 days | -| **Alert events**<p>Alert state transition events, can be seen as an alert history log.</p>| 4 hours | 7 days | 90 days | - -For more details check the documentation under [Events feed](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/events-feed.md). - -##### Notification integrations - -The plan on your space will determine what type of notifications methods will be available to you: - -* **Community** - Email and Discord -* **Pro** - Email, Discord and webhook -* **Business** - Unlimited, this includes Slack, PagerDuty, Opsgenie etc. - -For more details check the documentation under [Alert Notifications](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#alert-notifications). - -##### Alert notification silencing rules - -The plan on your space will determine if you are able to add alert notification silencing rules since this feature will only be available for paid plans: **Pro** or **Business**. 
- -For more details check the documentation under [Alert Notifications](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#silencing-alert-notifications). - -### Related Topics - -#### **Related Concepts** - -* [Spaces](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md#netdata-cloud-spaces) -* [Alert Notifications](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md) -* [Events feed](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/events-feed.md) -* [Role-Based Access model](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/role-based-access.md) - -#### Related Tasks - -* [View Plan & Billing](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/view-plan-billing.md) diff --git a/docs/cloud/manage/role-based-access.md b/docs/cloud/manage/role-based-access.md deleted file mode 100644 index a0b387749..000000000 --- a/docs/cloud/manage/role-based-access.md +++ /dev/null @@ -1,143 +0,0 @@ -# Role-Based Access model - -Netdata Cloud's role-based-access mechanism allows you to control what functionalities in the app users can access. Each user can be assigned only one role, which fully specifies all the capabilities they are afforded. - -## What roles are available? - -With the advent of the paid plans we revamped the roles to cover needs expressed by Netdata users, like providing more limited access to their customers, or -being able to join any room. We also aligned the offered roles to the target audience of each plan. 
The end result is the following: - -| **Role** | **Community** | **Pro** | **Business** | **Early Bird** | -| :-- | :--: | :--: | :--: | :--: | -| **Administrators**<p>Users with this role can control Spaces, War Rooms, Nodes, Users and Billing.</p><p>They can also access any War Room in the Space.</p> | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| **Managers**<p>Users with this role can manage War Rooms and Users.</p><p>They can access any War Room in the Space.</p> | - | - | :heavy_check_mark: | - | -| **Troubleshooters**<p>Users with this role can use Netdata to troubleshoot, not manage entities.</p><p>They can access any War Room in the Space.</p> | - | :heavy_check_mark: | :heavy_check_mark: | - | -| **Observers**<p>Users with this role can only view data in specific War Rooms.</p>💡 Ideal for restricting your customer's access to their own dedicated rooms.<p></p> | - | - | :heavy_check_mark: | - | -| **Billing**<p>Users with this role can handle billing options and invoices.</p> | - | - | :heavy_check_mark: | - | -| **Member** ⚠️ Legacy role<p>Users with this role can create War Rooms and invite other Members.</p><p>They can only see the War Rooms they belong to and all Nodes in the All Nodes room.</p>| - | - | - | :heavy_check_mark: | - -## What happens to the previous Member role? - -We will maintain a Early Bird plan for existing users, which will continue to provide access to the Member role. - -## Which functionalities are available for each role? - -In more detail, you can find on the following tables which functionalities are available for each role on each domain. 
- -### Space Management - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | -| See Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| Leave Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | -| Delete Space | :heavy_check_mark: | - | - | - | - | - | -| Change name | :heavy_check_mark: | - | - | - | - | - | -| Change description | :heavy_check_mark: | - | - | - | - | - | - -### Node Management - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See all Nodes in Space (_All Nodes_ room) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | Members are always on the _All Nodes_ room | -| Connect Node to Space | :heavy_check_mark: | - | - | - | - | - | - | -| Delete Node from Space | :heavy_check_mark: | - | - | - | - | - | - | - -### User Management - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See all Users in Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | -| Invite new User to Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | You can't invite a user with a role you don't have permissions to appoint to (see below) | -| Delete Pending Invitation to Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | -| Delete User from Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | You can't delete a user if he has a role you don't have permissions to appoint to 
(see below) | -| Appoint Administrators | :heavy_check_mark: | - | - | - | - | - | | -| Appoint Billing user | :heavy_check_mark: | - | - | - | - | - | | -| Appoint Managers | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Appoint Troubleshooters | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Appoint Observer | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Appoint Member | :heavy_check_mark: | - | - | - | - | :heavy_check_mark: | Only available on Early Bird plans | -| See all Users in a Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | -| Invite existing user to Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | User already invited to the Space | -| Remove user from Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | - -### Room Management - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See all Rooms in a Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | | -| Join any Room in a Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | By joining a room you will be enabled to get notifications from nodes on that room | -| Leave Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | -| Create a new Room in a Space | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | -| Delete Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Change Room name | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | If not the _All Nodes_ room | -| Change Room description | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | -| Add existing Nodes to Room | :heavy_check_mark: 
| :heavy_check_mark: | - | - | - | :heavy_check_mark: | Node already connected to the Space | -| Remove Nodes from Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | :heavy_check_mark: | | - -### Notifications Management - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See all configured notifications on a Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | -| Add new configuration | :heavy_check_mark: | - | - | - | - | - | | -| Enable/Disable configuration | :heavy_check_mark: | - | - | - | - | - | | -| Edit configuration | :heavy_check_mark: | - | - | - | - | - | Some exceptions apply depending on [service level](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-notification-methods.md#available-actions-per-notification-methods-based-on-service-level) | -| Delete configuration | :heavy_check_mark: | - | - | - | - | - | | -| Edit personal level notification settings | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | [Manage user notification settings](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/manage-notification-methods.md#manage-user-notification-settings) | -| See space alert notification silencing rules | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | - | | -| Add new space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Enable/Disable space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Edit space alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | | -| Delete space alert notification silencing rule | :heavy_check_mark: | 
:heavy_check_mark: | - | - | - | - | | -| See, add, edit or delete personal level alert notification silencing rule | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | | - - -Notes: -* Enable, Edit and Add actions over specific notification methods will only be allowed if your plan has access to those ([service classification](https://github.com/netdata/netdata/blob/master/docs/cloud/alerts-notifications/notifications.md#service-classification)) - -### Dashboards - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | -| See all dashboards in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Add new dashboard to Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Edit any dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | -| Edit own dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Delete any dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | -| Delete own dashboard in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | - -### Functions - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See all functions in Room | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Run any function in Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | -| Run read-only function in Room | :heavy_check_mark: | 
:heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | -| Run sensitive function in Room | :heavy_check_mark: | :heavy_check_mark: | - | - | - | - | There isn't any function on this category yet, so subject to change. | - -### Events feed - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See Alert or Topology events | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | | -| See Auditing events | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | These are coming soon, not currently available | - -### Billing - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | Notes | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | :-- | -| See Plan & Billing details | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | Current plan and usage figures | -| Update plans | :heavy_check_mark: | - | - | - | - | - | This includes cancelling current plan (going to Community plan) | -| See invoices | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | -| Manage payment methods | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | -| Update billing email | :heavy_check_mark: | - | - | - | :heavy_check_mark: | - | | - -### Other permissions - -| **Functionality** | **Administrator** | **Manager** | **Troubleshooter** | **Observer** | **Billing** | **Member** | -| :-- | :--: | :--: | :--: | :--: | :--: | :--: | -| See Bookmarks in Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Add Bookmark to Space | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | - | :heavy_check_mark: | -| Delete Bookmark from Space | :heavy_check_mark: | :heavy_check_mark: 
| :heavy_check_mark: | - | - | :heavy_check_mark: | -| See Visited Nodes | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | -| Update Visited Nodes | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | - | :heavy_check_mark: | diff --git a/docs/cloud/manage/sign-in.md b/docs/cloud/manage/sign-in.md deleted file mode 100644 index 53ea3a22a..000000000 --- a/docs/cloud/manage/sign-in.md +++ /dev/null @@ -1,81 +0,0 @@ -# Sign in to Netdata - -This page explains how to sign in to Netdata with your email, Google account, or GitHub account, and provides some tips if you're having trouble signing in. - -You can [sign in to Netdata](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_first_section) through one of three methods: email, Google, or GitHub. Email uses a -time-sensitive link that authenticates your browser, and Google/GitHub both use OAuth to associate your email address -with a Netdata Cloud account. - -No matter the method, your Netdata Cloud account is based around your email address. Netdata Cloud does not store -passwords. - -## Email - -To sign in with email, visit [Netdata Cloud](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_email_section), enter your email address, and click -the **Sign in by email** button. - -![Verify your email!](https://user-images.githubusercontent.com/82235632/125475486-c667635a-067f-4866-9411-9f7f795a0d50.png) - -Click the **Verify** button in the email to begin using Netdata Cloud. - -To use this same Netdata Cloud account on additional devices, request another sign in email, open the email on that -device, and sign in. - -### Don't have a Netdata Cloud account yet? - -If you don't already have a Netdata Cloud account, you don't need to worry about this. During the sign-in process we will create one for you and make the process seamless to you. 
- -After your account is created and you sign in to Netdata, you first are asked to agree to Netdata Cloud's [Privacy -Policy](https://www.netdata.cloud/privacy/) and [Terms of Use](https://www.netdata.cloud/terms/). Once you agree with these you are directed -through the Netdata Cloud onboarding process, which is explained in the [Netdata Cloud -quickstart](https://github.com/netdata/netdata/blob/master/packaging/installer/README.md). - -### Troubleshooting - -You should receive your sign in email in less than a minute. The subject is **Verify your email!** for new sign-ups, **Sign in to Netdata** for sign ins. -The sender is `no-reply@netdata.cloud` via `sendgrid.net`. - -If you don't see the email, try the following: - -- Check your spam folder. -- In Gmail, check the **Updates** category. -- Check [Netdata Cloud status](https://status.netdata.cloud) for ongoing issues with our infrastructure. -- Request another sign in email via the [sign-in page](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_troubleshooting_section). - -You may also want to add `no-reply@netdata.cloud` to your address book or contacts list, especially if you're using -a public email service, such as Gmail. You may also want to whitelist/allowlist either the specific email or the entire -`netdata.cloud` domain. - -In some cases, temporary issues with your mail server or email account may result in your email address being added to a Bounce list by Sendgrid. -If you are added to that list, no Netdata cloud email can reach you, including alert notifications. Let us know in Discord that you have trouble receiving -any email from us and someone will ask you to provide your email address privately, so we can check if you are on the Bounce list. - -## Google and GitHub OAuth - -When you use Google/GitHub OAuth, your Netdata Cloud account is associated with the email address that Netdata Cloud -receives via OAuth. 
- -To sign in with Google or GitHub OAuth, visit [Netdata Cloud](https://app.netdata.cloud/sign-in?cloudRoute=spaces?utm_source=docs&utm_content=sign_in_button_google_github_section) and click the -**Continue with Google/GitHub** button. Enter your Google/GitHub username and your password. Complete two-factor -authentication if you or your organization has it enabled. - -You are then signed in to Netdata Cloud or directed to the new-user onboarding if you have not signed up previously. - -## Reset a password - -Netdata Cloud does not store passwords and does not support password resets. None of our sign in methods -require passwords; they use either links in emails or Google/GitHub OAuth for authentication. - -## Switch between sign in methods - -You can switch between sign in methods if the email account associated with each method is the same. - -For example, you first sign in via your email account, `user@example.com`, and later sign out. You later attempt to sign -in via a GitHub account associated with `user@example.com`. Netdata Cloud recognizes that the two are the same and signs -you in to your original account. - -However, if you first sign in via your `user@example.com` email account and then sign in via a Google account associated -with `user2@example.com`, Netdata Cloud creates a new account and begins the onboarding process. - -It is not currently possible to link an account created with `user@example.com` to a Google account associated with -`user2@example.com`. diff --git a/docs/cloud/manage/themes.md b/docs/cloud/manage/themes.md deleted file mode 100644 index aaf193a87..000000000 --- a/docs/cloud/manage/themes.md +++ /dev/null @@ -1,14 +0,0 @@ -# Choose your Netdata Cloud theme - -The Dark theme is the default for all new Netdata Cloud accounts. - -To change your theme across Netdata Cloud, click on your profile picture, then **Profile**. Click on the **Settings** -tab, then choose your preferred theme: Light or Dark. 
- -**Light**: - -![Dark theme](https://user-images.githubusercontent.com/1153921/108530742-2ca98c00-7293-11eb-8c1e-1e0dd34eb87b.png) - -**Dark (default)**: - -![Light theme](https://user-images.githubusercontent.com/1153921/108530848-4519a680-7293-11eb-897d-1c470b67ceb0.png) diff --git a/docs/cloud/manage/view-plan-billing.md b/docs/cloud/manage/view-plan-billing.md deleted file mode 100644 index 5d381f952..000000000 --- a/docs/cloud/manage/view-plan-billing.md +++ /dev/null @@ -1,141 +0,0 @@ -# View Plan & Billing - -From the Cloud interface, you can view and manage your space's plan and billing settings, and see the space's usage in terms of running nodes. - -To view and manage some specific settings, related to billing options and invoices, you'll be redirected to our billing provider Customer Portal. - -## Prerequisites - -To see your plan and billing settings you need: - -- A Cloud account -- Access to the space as an Administrator or Billing user - -## Steps - -### View current plan and Billing options and Invoices - -1. Click on the **Space settings** cog (located above your profile icon) -1. Click on the **Plan & Billing** tab -1. On this page you will be presented with information on your current plan, billing settings, and usage information: - 1. At the top of the page you will see: - - **Credit** amount which refers to any amount you have available to use on future invoices or subscription changes ([Plan changes and credit balance](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#plan-changes-and-credit-balance)) - this is displayed once you have had an active paid subscription with us - - **Billing email** the email that was specified to be linked to the plan subscription. This is where invoices, payment, and subscription-related notifications will be sent. - - **Billing options and Invoices** is the link to our billing provider Customer Portal where you will be able to: - - See the current subscription. 
There will always be 2 subscriptions active for the two pricing components mentioned on [Netdata Plans documentation page](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#plans) - - Change directly the payment method associated to current subscriptions - - View, add, delete or change your default payment methods - - View or change or Billing information: - - Billing email - - Address - - Phone number - - Tax ID - - View your invoice history - 1. At the middle, you'll see details on your current plan as well as means to: - - Upgrade or cancel your plan - - View **All Plans** details page - 1. At the bottom, you will find your Usage chart that displays: - - Daily count - The weighted 90th percentile of the live node count during the day, taking time as the weight. If you have 30 live nodes throughout the day - except for a two hour peak of 44 live nodes, the daily value is 31. - - Period count: The 90th percentile of the daily counts for this period up to the date. The last value for the period is used as the number of nodes for the bill for that period. See more details in [running nodes and billing](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#running-nodes-and-billing) (only applicable if you are on a paid plan subscription) - - Committed nodes: The number of nodes committed to in the yearly plan. In case the period count is higher than the number of committed nodes, the difference is billed as overage. - - -### Update plan - -1. Click on the **Space settings** cog (located above your profile icon) -1. Click on the **Plan & Billing** tab -1. On this page you will be presented with information on your current plan, billing settings, and usage information - 1. Depending on your plan there could be shortcuts to immediately take you to change, for example, the billing frequency to **Yearly** - 1. 
Most actions will be available under the **Change plan** link that takes you to the **All plans** details page where you can - 1. Downgrade or upgrade your plan - 1. Change the billing frequency - 1. Change committed nodes, in case you are on a Yearly plan - 1. Once you choose an action to update your plan a modal will pop up on the right with - 1. Billing frequency displayed on the top right-corner - 1. Committed Nodes, when applicable - 1. Current billing information: - - Billing email - - Default payment method - - Business name and VAT number, when these are applicable - - Billing Address - Note: Any changes to these need to be done through our billing provider Customer Portal before confirming the checkout. You can click on the link **Change billing info and payment method** to access it. - 1. Promotion code, so you can review any applied promotion or enter one you may have - 1. Detailed view on Node and Space charges - 1. Breakdown of: - - Subscription Total - - Discount from promotion codes, if applicable - - credit value for Unused time from current plan, if applicable - - Credit amount used from balance, if applicable - - Total Before Tax - - VAT rate and amount, if applicable - 1. Summary of: - - Total payable amount - - credit adjustment value for any Remaining Unused time from current plan, if applicable - - Final credit balance - -Notes: -* Since there is an active plan you won't be redirected to our billing provider, the checkout is performed as soon as you click on **Checkout** -* The change to your plan will be applied as soon as the checkout process is completed successfully -* Downgrades or cancellations may affect some notification method settings or user access to your space, for more details please check [Plan changes and credit balance](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#plan-changes-and-credit-balance) - -## FAQ - -### 1. What Payment Methods are accepted? 
- -You can easily pay online via most major Credit/Debit Cards. More payment options are expected to become available in the near future. - -### 2. What happens if a renewal payment fails? - -After an initial failed payment, we will attempt to process your payment every week for the next 15 days. After three failed attempts your Space will be moved to the **Community** plan (free forever). - -For the next 24 hours, you will be able to use all your current notification method configurations. After 24 hours, any of the notification method configurations that aren't available on your space's plan will be automatically disabled. - -Cancellation might affect users in your Space. Please check what roles are available on the [Community plan](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#areas-impacted-by-plans). Users with unavailable roles on the Community plan will immediately have restricted access to the Space. - -### 3. Which currencies do you support? - -We currently accept payments only in US Dollars (USD). We currently have plans to also accept payments in Euros (EUR), but do not currently have an estimate for when such support will be available. - -### 4. Can I get a refund? How? - -Payments for Netdata subscriptions are refundable **only** if you cancel your subscription within 14 days of purchase. The refund will be credited to the Credit/Debit Card used for making the purchase. To request a refund, please email us at [billing@netdata.cloud](mailto:billing@netdata.cloud). - -### 5. How do I cancel my paid Plan? - -Your annual or monthly Netdata Subscription plan will automatically renew until you cancel it. You can cancel your paid plan at any time by clicking ‘Cancel Plan’ from the **Plan & Billing** section under settings. You can also cancel your paid Plan by clicking the _Select_ button under **Community** plan in the **Plan & Billing** Section under Settings. - -### 6. 
How can I access my Invoices/Receipts after I paid for a Plan? - -You can visit the _Billing Options & Invoices_ in the **Plan & Billing** section under settings in your Netdata Space where you can find all your Invoicing history. - -### 7. Why do I see two separate Invoices? - -Every time you purchase or renew a Plan, two separate Invoices are generated: - -- One Invoice includes the recurring fees of the Plan you have chosen - - We have waived the space subscription fee ($0.00), so the only recurring fee will be on annual plans for the committed nodes. - -- The other Invoice includes your monthly “On Demand - Usage”. - - Right after the activation of your subscription, you will receive a zero value Invoice since you had no usage when you subscribed. - - On the following month you will receive an Invoice based on your monthly usage. - -You can find some further details on the [Netdata Plans page](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/plans.md#plans). - -> ⚠️ We expect this to change to a single invoice in the future, but currently do not have a concrete timeline for when this change will happen. - -### 8. How is the **Total Before Tax** value calculated on plan changes? - -When you change your plan we calculate the residual before-tax value you have from the _Unused time on your current plan_ in order to credit you with this value. - -After that, we perform the following calculations: - -1. Get the **Subscription total** (total amount to be paid for Nodes and Space) -2. Deduct any Discount applicable from promotion codes -3. If an amount remains, then we deduct the sum of the _Unused time on current plan_ credit and the Credit amount from any existing credit balance. -4. The result, if positive, is the Total Before Tax. Any applicable sales tax (VAT or other) will then apply. - -If the calculation of step 3 returns a negative amount then this amount will be your new customer credit balance. 
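The four steps of the **Total Before Tax** calculation above can be sketched in Python. This is an illustrative sketch only, not Netdata's billing code: the names `total_before_tax` and `PlanChangeTotals`, the parameter names, and the assumption that both credits are applied in full (with any leftover becoming the new credit balance) are hypothetical readings of the text.

```python
from decimal import Decimal
from typing import NamedTuple


class PlanChangeTotals(NamedTuple):
    """Hypothetical result type: amounts before any sales tax is applied."""
    total_before_tax: Decimal
    new_credit_balance: Decimal


def total_before_tax(
    subscription_total: Decimal,   # step 1: Nodes + Space charges
    discount: Decimal,             # step 2: promotion code discount
    unused_time_credit: Decimal,   # credit for unused time on the current plan
    credit_balance: Decimal,       # existing customer credit balance
) -> PlanChangeTotals:
    # Steps 1-2: start from the subscription total and deduct the discount.
    # Assumption: a discount never pushes the amount below zero.
    after_discount = max(subscription_total - discount, Decimal("0"))

    # Step 3: deduct the sum of the unused-time credit and the credit balance.
    remaining = after_discount - (unused_time_credit + credit_balance)

    # Step 4: a non-negative result is the Total Before Tax.
    if remaining >= 0:
        return PlanChangeTotals(remaining, Decimal("0"))

    # A negative result becomes the new customer credit balance.
    return PlanChangeTotals(Decimal("0"), -remaining)


if __name__ == "__main__":
    totals = total_before_tax(
        Decimal("100"), Decimal("10"), Decimal("30"), Decimal("20")
    )
    print(totals)
```

`Decimal` is used instead of `float` so that currency amounts are not subject to binary rounding. Sales tax (VAT or other) would be applied to `total_before_tax` afterwards and is left out of the sketch.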
diff --git a/docs/cloud/netdata-assistant.md b/docs/cloud/netdata-assistant.md deleted file mode 100644 index afa13f6e9..000000000 --- a/docs/cloud/netdata-assistant.md +++ /dev/null @@ -1,20 +0,0 @@ -# Alert troubleshooting with Netdata Assistant - -The Netdata Assistant is a feature that uses large language models and the Netdata community's collective knowledge to guide you during troubleshooting. It is designed to make understanding and root-causing alerts simpler and faster. - -## Using Netdata Assistant - -- Navigate to the alerts tab -- If there are active alerts, the `Actions` column will have an Assistant button - - ![](https://github-production-user-asset-6210df.s3.amazonaws.com/24860547/253559075-815ca123-e2b6-4d44-a780-eeee64cca420.png) - -- Clicking on the Assistant button opens a floating window with customized information and troubleshooting tips for this alert (note that the window can follow you through your troubleshooting journey on Netdata dashboards) - - ![](https://github-production-user-asset-6210df.s3.amazonaws.com/24860547/253559645-62850c7b-cd1d-45f2-b2dd-474ecbf2b713.png) - -- In case you need more information or want a deeper understanding, Netdata Assistant also provides useful web links to resources that can help. - - ![](https://github-production-user-asset-6210df.s3.amazonaws.com/24860547/253560071-e768fa6d-6c9a-4504-bb1f-17d5f4707627.png) - -- If there are no active alerts, you can still use Netdata Assistant by clicking the Assistant button on the Alert Configuration view. 
diff --git a/docs/cloud/netdata-functions.md b/docs/cloud/netdata-functions.md deleted file mode 100644 index caff9b35d..000000000 --- a/docs/cloud/netdata-functions.md +++ /dev/null @@ -1,79 +0,0 @@ -<!-- -title: "Netdata Functions" -sidebar_label: "Netdata Functions" -custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md" -sidebar_position: "2800" -learn_status: "Published" -learn_topic_type: "Concepts" -learn_rel_path: "Concepts" -learn_docs_purpose: "Present the Netdata Functions what these are and why they should be used." ---> - -# Netdata Functions - -Netdata Agent collectors are able to expose functions that can be executed in run-time and on-demand. These will be -executed on the node - host where the function is made -available. - -#### What is a function? - -Collectors besides the metric collection, storing, and/or streaming work are capable of executing specific routines on -request. These routines will bring additional information -to help you troubleshoot or even trigger some action to happen on the node itself. - -A function is a `key` - `value` pair. The `key` uniquely identifies the function within a node. The `value` is a -function (i.e. code) to be run by a data collector when -the function is invoked. - -For more details please check out documentation on how we use our internal collector to get this from the first collector that exposes -functions - [plugins.d](https://github.com/netdata/netdata/blob/master/collectors/plugins.d/README.md#function). - -#### What functions are currently available? 
- -| Function | Description | Alternative to CLI tools | plugin - module | -|:-------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------|:-----------------------------------------------------------------------------------------------------------| -| block-devices | Disk I/O activity for all block devices, offering insights into both data transfer volume and operation performance. | `iostat` | [proc](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin#readme) | -| containers-vms | Insights into the resource utilization of containers and QEMU virtual machines: CPU usage, memory consumption, disk I/O, and network traffic. | `docker stats`, `systemd-cgtop` | [cgroups](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin#readme) | -| ipmi-sensors | Readings and status of IPMI sensors. | `ipmi-sensors` | [freeipmi](https://github.com/netdata/netdata/tree/master/collectors/freeipmi.plugin#readme) | -| mount-points | Disk usage for each mount point, including used and available space, both in terms of percentage and actual bytes, as well as used and available inode counts. | `df` | [diskspace](https://github.com/netdata/netdata/tree/master/collectors/diskspace.plugin#readme) | -| network interfaces | Network traffic, packet drop rates, interface states, MTU, speed, and duplex mode for all network interfaces. | `bmon`, `bwm-ng` | [proc](https://github.com/netdata/netdata/tree/master/collectors/proc.plugin#readme) | -| processes | Real-time information about the system's resource usage, including CPU utilization, memory consumption, and disk IO for every running process. | `top`, `htop` | [apps](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md) | -| systemd-journal | Viewing, exploring and analyzing systemd journal logs. 
| `journalctl` | [systemd-journal](https://github.com/netdata/netdata/tree/master/collectors/systemd-journal.plugin#readme) | -| systemd-list-units | Information about all systemd units, including their active state, description, whether or not they are enabled, and more. | `systemctl list-units` | [systemd-journal](https://github.com/netdata/netdata/tree/master/collectors/systemd-journal.plugin#readme) | -| systemd-services | System resource utilization for all running systemd services: CPU, memory, and disk IO. | `systemd-cgtop` | [cgroups](https://github.com/netdata/netdata/tree/master/collectors/cgroups.plugin#readme) | -| streaming | Comprehensive overview of all Netdata children instances, offering detailed information about their status, replication completion time, and many more. | | | - - -If you have ideas or requests for other functions: -* Participate in the relevant [GitHub discussion](https://github.com/netdata/netdata/discussions/14412) -* Open a [feature request](https://github.com/netdata/netdata-cloud/issues/new?assignees=&labels=feature+request%2Cneeds+triage&template=FEAT_REQUEST.yml&title=%5BFeat%5D%3A+) on Netdata Cloud repo -* Join the Netdata community on [Discord](https://discord.com/invite/mPZ6WZKKG2) and let us know. - -#### How do functions work with streaming? - -Via streaming, the definitions of functions are transmitted to a parent node, so it knows all the functions available on -any children connected to it. - -If the parent node is the one connected to Netdata Cloud it is capable of triggering the call to the respective children -node to run the function. - -#### Why are they available only on Netdata Cloud? - -Since these functions are able to execute routines on the node and due to the potential use cases that they can cover, our -concern is to ensure no sensitive -information or disruptive actions are exposed through the Agent's API. 
- -With the communication between the Netdata Agent and Netdata Cloud being -through [ACLK](https://github.com/netdata/netdata/blob/master/aclk/README.md) this -concern is addressed. - -## Related Topics - -### **Related Concepts** - -- [ACLK](https://github.com/netdata/netdata/blob/master/aclk/README.md) -- [plugins.d](https://github.com/netdata/netdata/blob/master/collectors/plugins.d/README.md) - -### Related Tasks - -- [Run-time troubleshooting with Functions](https://github.com/netdata/netdata/blob/master/docs/cloud/runtime-troubleshooting-with-functions.md) diff --git a/docs/cloud/runtime-troubleshooting-with-functions.md b/docs/cloud/runtime-troubleshooting-with-functions.md deleted file mode 100644 index 839b8c9ed..000000000 --- a/docs/cloud/runtime-troubleshooting-with-functions.md +++ /dev/null @@ -1,34 +0,0 @@ -# Run-time troubleshooting with Functions - -Netdata Functions feature allows you to execute on-demand a pre-defined routine on a node where a Netdata Agent is running. These routines are exposed by a given collector. -These routines can be used to retrieve additional information to help you troubleshoot or to trigger some action to happen on the node itself. - - -### Prerequisites - -The following is required to be able to run Functions from Netdata Cloud. -* At least one of the nodes claimed to your Space should be on a Netdata agent version higher than `v1.37.1` -* Ensure that the node has the collector that exposes the function you want enabled ([see current available functions](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md#what-functions-are-currently-available)) - -### Execute a function (from the Functions tab) - -1. From the right-hand bar select the **Function** you want to run -2. Still on the right-hand bar select the **Node** where you want to run it -3. Results will be displayed in the central area for you to interact with -4. 
Additional filtering capabilities, depending on the function, should be available on right-hand bar - -### Execute a function (from the Nodes tab) - -1. Click on the functions icon for a node that has this active -2. You are directed to the **Functions** tab -3. Follow the above instructions from step 3. - -> ⚠️ If you get an error saying that your node can't execute Functions please check the [prerequisites](#prerequisites). - -## Related Topics - -### **Related Concepts** -- [Netdata Functions](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md) - -#### Related References documentation -- [External plugins overview](https://github.com/netdata/netdata/blob/master/collectors/plugins.d/README.md#function) diff --git a/docs/cloud/visualize/dashboards.md b/docs/cloud/visualize/dashboards.md deleted file mode 100644 index 4b4baf426..000000000 --- a/docs/cloud/visualize/dashboards.md +++ /dev/null @@ -1,101 +0,0 @@ -# Build new dashboards - -With Netdata Cloud, you can build new dashboards that target your infrastructure's unique needs. Put key metrics from -any number of distributed systems in one place for a bird's eye view of your infrastructure. - -Click on the **Dashboards** tab in any War Room to get started. - -## Create your first dashboard - -From the Dashboards tab, click on the **+** button. - -<img width="98" alt=" Green plus button " src="https://github.com/netdata/netdata/assets/73346910/511e2b38-e751-4a88-bc7d-bcd49764b7f6"/> - - -In the modal, give your new dashboard a name, and click **+ Add**. - -- The **Add Chart** button on the top right of the interface adds your first chart card. From the dropdown, select either **All Nodes** or a specific -node. Next, select the context. You'll see a preview of the chart before you finish adding it. 
In this modal you can also [interact with the chart](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md), meaning you can configure all the aspects of the [NIDL framework](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md#nidl-framework) of the chart and more in detail, you can: - - define which `group by` method to use - - select the aggregation function over the data source - - select nodes - - select instances - - select dimensions - - select labels - - select the aggregation function over time - - After you are done configuring the chart, you can also change the type of the chart from the right hand side of the [Title bar](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md#title-bar), and select which of the final dimensions you want to be visible and in what order, from the [Dimensions bar](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md#dimensions-bar). - -- The **Add Text** button on the top right of the interface creates a new card with user-defined text, which you can use to describe or document a -particular dashboard's meaning and purpose. - -> ### Important -> -> Be sure to click the **Save** button any time you make changes to your dashboard. - - -## Using your dashboard - -Dashboards are designed to be interactive and flexible so you can design them to your exact needs. Dashboards are made -of any number of **cards**, which can contain charts or text. - -### Chart cards - -The charts you add to any dashboard are [fully interactive](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md), just like any other Netdata chart. You can zoom in and out, highlight timeframes, and more. - -Charts also synchronize as you interact with them, even across contexts _or_ nodes. 
- -### Text cards - -You can use text cards as notes to explain to other members of the [War Room](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md#netdata-cloud-war-rooms) the purpose of the dashboard's arrangement. - -### Move cards - -To move any card, click and hold on **Drag & rearrange** at the top right of the card and drag it to a new location. A red placeholder indicates the -new location. Once you release your mouse, other charts re-sort to the grid system automatically. - -### Resize cards - -To resize any card on a dashboard, click on the bottom-right corner and drag to the card's new size. Other cards re-sort -to the grid system automatically. - -## Go to chart - -Quickly jump to the location of the chart in either the Overview tab or if the card refers to a single node, its single node dashboard by clicking the 3-dot icon in the corner of any card to open a menu. Hit the **Go to Chart** item. - -You'll land directly on that chart of interest, but you can now scroll up and down to correlate your findings with other -charts. Of course, you can continue to zoom, highlight, and pan through time just as you're used to with Netdata Charts. - -## Managing your dashboard - -To see dashboards associated with the current War Room, click the **Dashboards** tab in any War Room. You can select -dashboards and delete them using the 🗑️ icon. - -### Update/save a dashboard - -If you've made changes to a dashboard, such as adding or moving cards, the **Save** button is enabled. Click it to save -your most recent changes. Any other members of the War Room will be able to see these changes the next time they load -this dashboard. - -If multiple users attempt to make concurrent changes to the same dashboard, the second user who hits Save will be -prompted to either overwrite the dashboard or reload to see the most recent changes. 
- -### Remove an individual card - -Click on the 3-dot icon in the corner of any card to open a menu. Click the **Remove** item to remove the card. - -### Delete a dashboard - -Delete any dashboard by navigating to it and clicking the **Delete** button. This will remove this entry from the -dropdown for every member of this War Room. - -### Minimum browser viewport - -Because of the visual complexity of individual charts, dashboards require a minimum browser viewport of 800px. - -## What's next? - -Once you've designed a dashboard or two, make sure -to [invite your team](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/organize-your-infrastrucutre-invite-your-team.md#invite-your-team) if -you haven't already. You can add these new users to the same War Room to let them see the same dashboards without any -effort. diff --git a/docs/cloud/visualize/interact-new-charts.md b/docs/cloud/visualize/interact-new-charts.md deleted file mode 100644 index 16db927a8..000000000 --- a/docs/cloud/visualize/interact-new-charts.md +++ /dev/null @@ -1,416 +0,0 @@ -# Netdata Charts - -Learn how to use Netdata's powerful charts to troubleshoot with real-time, per-second metric data. - -Netdata excels in collecting, storing, and organizing metrics in out-of-the-box dashboards. -To make sense of all the metrics, Netdata offers an enhanced version of charts that update every second. 
- -These charts provide a lot of useful information, so that you can: - -- Enjoy the high-resolution, granular metrics collected by Netdata -- Examine all the metrics by hovering over them with your cursor -- Filter the metrics in any way you want using the [Definition bar](#definition-bar) -- View the combined anomaly rate of all underlying data with the [Anomaly Rate ribbon](#anomaly-rate-ribbon) -- Explore even more details about a chart's metrics through [hovering over certain elements of it](#hover-over-the-chart) -- Use intuitive tooling and shortcuts to pan, zoom or highlight areas of interest in your charts -- On highlight, get easy access to [Metric Correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) to see other metrics with similar patterns -- Have the dimensions sorted based on name or value -- View information about the chart, its plugin, context, and type -- View individual metric collection status about a chart - -These charts are available on Netdata Cloud's -[Overview tab](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md), Single Node tab and -on your [Custom Dashboards](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/dashboards.md). - -## Overview - -A Netdata chart looks like this: - -<img src="https://user-images.githubusercontent.com/70198089/236133212-353c102f-a6ed-45b7-9251-34e004c7a10a.png" width="900"/> - -With a quick glance you have immediate information available at your disposal: - -- [Chart title and units](#title-bar) -- [Anomaly Rate ribbon](#anomaly-rate-ribbon) -- [Definition bar](#definition-bar) -- [Tool bar](#tool-bar) -- [Chart area](#hover-over-the-chart) -- [Legend with dimensions](#dimensions-bar) - -## Fundamental elements - -While Netdata's charts require no configuration and are easy to interact with, they have a lot of underlying complexity. 
To meaningfully organize charts out of the box based on what's happening in your nodes, Netdata uses the concepts of [dimensions](#dimensions), [contexts](#contexts), and [families](#families). - -Understanding how these work will help you more easily navigate the dashboard, -[write new alerts](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md), or play around -with the [API](https://github.com/netdata/netdata/blob/master/web/api/README.md). - -### Dimensions - -A **dimension** is a value that gets shown on a chart. The value can be raw data or calculated values, such as the -average (the default), minimum, or maximum. These values can then be given any type of unit. For example, CPU -utilization is represented as a percentage, disk I/O as `MiB/s`, and available RAM as an absolute value in `MiB` or -`GiB`. - -Beneath every chart (or on the right-side if you configure the dashboard) is a legend of dimensions. When there are -multiple dimensions, you'll see a different entry in the legend for each dimension. - -The **Apps CPU Time** chart (with the [context](#contexts) `apps.cpu`), which visualizes CPU utilization of -different types of processes/services/applications on your node, always provides a vibrant example of a chart with -multiple dimensions. - -Dimensions can be [hidden](#show-and-hide-dimensions) to help you focus your attention. - -### Contexts - -A **context** is a way of grouping charts by the types of metrics collected and dimensions displayed. It's like a machine-readable naming and organization scheme. - -For example, the **Apps CPU Time** has the context `apps.cpu`. A little further down on the dashboard is a similar -chart, **Apps Real Memory (w/o shared)** with the context `apps.mem`. The `apps` portion of the context is the **type**, -whereas anything after the `.` is specified either by the chart's developer or by the [family](#families). 
- -By default, a chart's type affects where it fits in the menu, while its family creates submenus. - -Netdata also relies on contexts for [alert configuration](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md) (the [`on` line](https://github.com/netdata/netdata/blob/master/health/REFERENCE.md#alert-line-on)). - -### Families - -**Families** are a _single instance_ of a hardware or software resource that needs to be displayed separately from -similar instances. - -For example, let's look at the **Disks** section, which contains a number of charts with contexts like `disk.io`, -`disk.ops`, `disk.backlog`, and `disk.util`. If your node has multiple disk drives at `sda` and `sdb`, Netdata creates -a separate family for each. - -Netdata now merges the contexts and families to create charts that are grouped by family, following a -`[context].[family]` naming scheme, so that you can see the `disk.io` and `disk.ops` charts for `sda` right next to each -other. - -Given the four example contexts, and two families of `sda` and `sdb`, Netdata will create the following charts and their -names: - -| Context | `sda` family | `sdb` family | -|:---------------|--------------------|--------------------| -| `disk.io` | `disk_io.sda` | `disk_io.sdb` | -| `disk.ops` | `disk_ops.sda` | `disk_ops.sdb` | -| `disk.backlog` | `disk_backlog.sda` | `disk_backlog.sdb` | -| `disk.util` | `disk_util.sda` | `disk_util.sdb` | - -## Title bar - -When you start interacting with a chart, you'll notice valuable information on the top bar: - -<img src="https://user-images.githubusercontent.com/70198089/236133832-fad45e65-5bd6-4fd1-8d68-33acf69fff5c.png" width="900"/> - -The elements that you can find on this top bar are: - -- **Netdata icon**: this indicates that data is continuously being updated, this happens if [Time controls](https://github.com/netdata/netdata/blob/master/docs/dashboard/visualization-date-and-time-controls.md#time-controls) are in Play or Force Play mode. 
-- **Chart title**: on the chart title you can see the title together with the metric being displayed, as well as the unit of measurement. -- **Chart status icon**: possible values are: Loading, Timeout, Error or No data, otherwise this icon is not shown. - -Along with viewing chart type, context and units, on this bar you have access to immediate actions over the chart: - -<img src="https://user-images.githubusercontent.com/70198089/236134195-ecb08f79-1355-4bce-8449-e829f4a6b1c0.png" width="200" /> - -- **Chart info**: get more information relevant to the chart you are interacting with. -- **Chart type**: change the chart type from **line**, **stacked**, **area**, **stacked bar** and **multi bar**. -- **Enter fullscreen mode**: expand the current chart to the full size of your screen. -- **Add chart to dashboard**: add the chart to an existing custom dashboard or directly create a new one that includes the chart. - -## Definition bar - -Each composite chart has a definition bar to provide information and options about the following: - -<img src="https://user-images.githubusercontent.com/70198089/236134615-e53a1d68-8a0f-466b-b2ef-1974085f0e8d.png" width="900"/> - -- Group by option -- Aggregate function to be applied in case multiple data sources exist -- Nodes filter -- Instances filter -- Dimensions filter -- Labels filter -- The aggregate function over time to be applied if one point in the chart consists of multiple data points aggregated -- Resetting the Definition bar - -### NIDL framework - -To help users instantly understand and validate the data they see on charts, we developed the NIDL (Nodes, Instances, Dimensions, Labels) framework. This information is visualized on all charts. - -> You can explore the in-depth infographic, by clicking on this image and opening it in a new tab, -> allowing you to zoom in to the different parts of it. 
-> -> <a href="https://user-images.githubusercontent.com/2662304/235475061-44628011-3b1f-4c44-9528-34452018eb89.png" target="_blank"> -> <img src="https://user-images.githubusercontent.com/2662304/235475061-44628011-3b1f-4c44-9528-34452018eb89.png" width="400" border="0" align="center"/> -> </a> - -You can rapidly access condensed information for collected metrics, grouped by node, monitored instances, dimension, or any key/value label pair. - -At the Definition bar of each chart, there are a few dropdown menus: - -<img src="https://user-images.githubusercontent.com/43294513/235470150-62a3b9ac-51ca-4c0d-81de-8804e3d733eb.png" width="900"/> - -These dropdown menus have 2 functions: - -1. Provide additional information about the visualized chart, to help with understanding the data that is presented. -2. Provide filtering and grouping capabilities, altering the query on the fly, to help get different views of the dataset. - -The NIDL framework attaches metadata to every metric that is collected to provide for each of them the following consolidated data for the visible time frame: - -1. The volume contribution of each metric into the final query. So even if a query comes from 1000 nodes, the contribution of each node in the result can instantly be visualized. The same goes for instances, dimensions and labels. Especially for labels, Netdata also provides the volume contribution of each label `key:value` pair to the final query, so that you can immediately see how much every label value involved in the query affected the chart. -2. The anomaly rate of each of them for the time-frame of the query. This is used to quickly spot which of the nodes, instances, dimensions or labels have anomalies in the requested time-frame. -3. The minimum, average and maximum values of all the points used for the query. This is used to quickly spot which of the nodes, instances, dimensions or labels are responsible for a spike or a dive in the chart. 
- -All of these dropdown menus can be used for instantly filtering the information shown, by including or excluding specific nodes, instances, dimensions or labels, directly from the dropdown menu, without the need to edit a query string and without any additional knowledge of the underlying data. - -### Group by dropdown - -The "Group by" dropdown menu allows selecting 1 or more groupings to be applied at once on the same dataset. - -<img src="https://user-images.githubusercontent.com/43294513/235468819-3af5a1d3-8619-48fb-a8b7-8e8b4cf6a8ff.png" width="900"/> - -It supports: - -1. **Group by Node**, to summarize the data of each node, and provide one dimension on the chart for each of the nodes involved. Filtering nodes is supported at the same time, using the nodes dropdown menu. -2. **Group by Instance**, to summarize the data of each instance and provide one dimension on the chart for each of the instances involved. Filtering instances is supported at the same time, using the instances dropdown menu. -3. **Group by Dimension**, so that each metric in the visualization is the aggregation of a single dimension. This provides a per dimension view of the data from all the nodes in the War Room, taking into account filtering criteria if defined. -4. **Group by Label**, to summarize the data for each label value. Multiple label keys can be selected at the same time. - -Using this menu, you can slice and dice the data in any possible way, to quickly get different views of it, without the need to edit a query string and without any need to better understand the format of the underlying data. - -> ### Tip -> -> A very pertinent example is composite charts over contexts related to cgroups (VMs and containers). -> You have the means to change the default group by or apply filtering to get a better view into what data you are trying to analyze. 
-> For example, if you change the group by to _instance_ you get a view with the data of all the instances (cgroups) that contribute to that chart. -> Then you can use further filtering tools to focus on the data that is important to you and even save the result to your own dashboards. - -> ### Tip -> -> Group by instance, dimension to see the time series of every individual collected metric participating in the chart. - -### Aggregate functions over data sources dropdown - -Each chart uses an opinionated-but-valuable default aggregate function over the data sources. - -<img src="https://user-images.githubusercontent.com/70198089/236136725-778670b4-7e81-44a8-8d3d-f38ded823c94.png" width="500"/> - -For example, the `system.cpu` chart shows the average for each dimension from every contributing chart, while the `net.net` chart shows the sum for each dimension from every contributing chart, which can also come from multiple networking interfaces. - -The following aggregate functions are available for each selected dimension: - -- **Average**: Displays the average value from contributing nodes. If a composite chart has 5 nodes with the following - values for the `out` dimension (`-2.1`, `-5.5`, `-10.2`, `-15`, `-0.1`), the composite chart displays a - value of `-6.58`. -- **Sum**: Displays the sum of contributed values. Using the same nodes, dimension, and values as above, the composite - chart displays a metric value of `-32.9`. -- **Min**: Displays a minimum value. For dimensions with positive values, the min is the value closest to zero. For - charts with negative values, the min is the value with the largest magnitude. -- **Max**: Displays a maximum value. For dimensions with positive values, the max is the value with the largest - magnitude. For charts with negative values, the max is the value closest to zero. - -### Nodes dropdown - -In this dropdown, you can view or filter the nodes contributing time-series metrics to the chart. 
-This menu also provides the contribution of each node to the volume of the chart, and a breakdown of the anomaly rate of the queried data per node. - -<img src="https://user-images.githubusercontent.com/70198089/236137765-b57d5443-3d4b-42f4-9e3d-db1eb606626f.png" width="900"/> - -If one or more nodes can't contribute to a given chart, the definition bar shows a warning symbol plus the number of -affected nodes, then lists them in the dropdown along with the associated error. Nodes might return errors because of -networking issues, a stopped `netdata` service, or because that node does not have any metrics for that context. - -### Instances dropdown - -In this dropdown, you can view or filter the instances contributing time-series metrics to the chart. -This menu also provides the contribution of each instance to the volume of the chart, and a breakdown of the anomaly rate of the queried data per instance. - -<img src="https://user-images.githubusercontent.com/70198089/236138302-4dd4072e-3a0d-43bb-a9d8-4dde79c65e92.png" width="900"/> - -### Dimensions dropdown - -In this dropdown, you can view or filter the original dimensions contributing time-series metrics to the chart. -This menu also presents the contribution of each original dimension on the chart, and a breakdown of the anomaly rate of the data per dimension. - -<img src="https://user-images.githubusercontent.com/70198089/236138796-08dc6ac6-9a50-4913-a46d-d9bbcedd48f6.png" width="900"/> - -### Labels dropdown - -In this dropdown, you can view or filter the contributing time-series labels of the chart. -This menu also presents the contribution of each label on the chart, and a breakdown of the anomaly rate of the data per label.
- -<img src="https://user-images.githubusercontent.com/70198089/236139027-8a51a958-2074-4675-a41b-efff30d8f51a.png" width="900"/> - -### Aggregate functions over time - -When the granularity of the data collected is higher than the plotted points on the chart, an aggregation function over -time is applied. - -<img src="https://user-images.githubusercontent.com/70198089/236411297-e123db06-0117-4e24-a5ac-955b980a8f55.png" width="400"/> - -By default the aggregation applied is _average_ but the user can choose different options from the following: - -- Min, Max, Average or Sum -- Percentile - - you can specify the percentile you want to focus on: 25th, 50th, 75th, 80th, 90th, 95th, 97th, 98th and 99th. - <img src="https://user-images.githubusercontent.com/70198089/236410299-de5f3367-f3b0-4beb-a73f-a49007c543d4.png" width="250"/> -- Trimmed Mean or Trimmed Median - - you can choose the percentage of data that you want to focus on: 1%, 2%, 3%, 5%, 10%, 15%, 20% and 25%. - <img src="https://user-images.githubusercontent.com/70198089/236410858-74b46af9-280a-4ab2-ad26-5a6aa9403aa8.png" width="250"/> -- Median -- Standard deviation -- Coefficient of variation -- Delta -- Single or Double exponential smoothing - -For more details on each, you can refer to our Agent's HTTP API details on [Data Queries - Data Grouping](https://github.com/netdata/netdata/blob/master/web/api/queries/README.md#data-grouping). - -### Reset to defaults - -Finally, you can reset everything to its defaults by clicking the green "Reset" prompt at the end of the definition bar. - -## Anomaly Rate ribbon - -Netdata's unsupervised machine learning algorithm creates a unique model for each metric collected by your agents, using exclusively the metric's past data. -It then uses these unique models during data collection to predict the value that should be collected and check if the collected value is within the range of acceptable values based on past patterns and behavior.
- -If the value collected is an outlier, it is marked as anomalous. - -<img src="https://user-images.githubusercontent.com/70198089/236139886-79d63cf6-61ed-4aa7-842c-b5a1728c870d.png" width="900"/> - -This unmatched capability of real-time predictions as data is collected allows you to **detect anomalies for potentially millions of metrics across your entire infrastructure within a second of occurrence**. - -The Anomaly Rate ribbon on top of each chart visualizes the combined anomaly rate of all the underlying data, highlighting areas of interest that may not be easily visible to the naked eye. - -Hovering over the Anomaly Rate ribbon provides a histogram of the anomaly rates per presented dimension, for the specific point in time. - -Anomaly Rate visualization does not make Netdata slower. Anomaly rate is saved in the Netdata database, together with metric values, and due to the smart design of Netdata, it does not even incur a disk footprint penalty. - -## Hover over the chart - -Hovering over any point in the chart will reveal a more informative overlay. -It includes a bar indicating the volume percentage of each time series compared to the total, the anomaly rate, and a notification if there are data collection issues. - -This overlay sorts all dimensions by value, makes bold the closest dimension to the mouse and presents a histogram based on the values of the dimensions. - -<img src="https://user-images.githubusercontent.com/70198089/236141460-bfa66b99-d63c-4a2c-84b1-2509ed94857f.png" width="500"/> - -When hovering the anomaly ribbon, the overlay sorts all dimensions by anomaly rate, and presents a histogram of these anomaly rates. - -#### Info column - -Additionally, when hovering over the chart, the overlay may display an indication in the "Info" column. - -Currently, this column is used to inform users of any data collection issues that might affect the chart. -Below each chart, there is an information ribbon.
This ribbon currently shows 3 states related to the points presented in the chart: - -1. **[P]: Partial Data** - At least one of the dimensions in the chart has partial data, meaning that not all instances available contributed data to this point. This can happen when a container is stopped, or when a node is restarted. This indicator helps to gain confidence in the dataset, in situations when unusual spikes or dives appear due to infrastructure maintenance, or due to failures in part of the infrastructure. - -2. **[O]: Overflown** - At least one of the data sources included in the chart has a counter that has overflowed at this point. - -3. **[E]: Empty Data** - At least one of the dimensions included in the chart has no data at all for the given points. - -All these indicators are also visualized per dimension, in the pop-over that appears when hovering the chart. - -<img src="https://user-images.githubusercontent.com/70198089/236145768-8ffadd02-93a4-4e9e-b4ae-c1367f614a7e.png" width="700"/> - -## Play, Pause and Reset - -Your charts are controlled using the available [Time controls](https://github.com/netdata/netdata/blob/master/docs/dashboard/visualization-date-and-time-controls.md#time-controls). -Besides these, when interacting with the chart you can also activate these controls by: - -- Hovering over any chart to temporarily pause it - this momentarily switches time control to Pause, so that you can - hover over a specific timeframe. When moving out of the chart time control will go back to Play (if it was its - previous state) -- Clicking on the chart to lock it - this enables the Pause option on the time controls, to the current timeframe. This - is useful if you want to jump to a different chart to look for possible correlations.
-- Double clicking to release a previously locked chart - move the time control back to Play - -| Interaction | Keyboard/mouse | Touchpad/touchscreen | Time control | -|:------------------|:---------------|:---------------------|:----------------------| -| **Pause** a chart | `hover` | `n/a` | Temporarily **Pause** | -| **Stop** a chart | `click` | `tap` | **Pause** | -| **Reset** a chart | `double click` | `n/a` | **Play** | - -Note: These interactions are available when the default "Pan" action is used from the [Tool Bar](#tool-bar). - -## Tool bar - -While exploring the chart, a tool bar will appear. This tool bar is there to support you on this task. -The available manipulation tools you can select are: - -<img src="https://user-images.githubusercontent.com/70198089/236143292-c1d75528-263d-4ddd-9db8-b8d6a31cb83e.png" width="400" /> - -- Pan -- Highlight -- Select and zoom -- Chart zoom -- Reset zoom - -### Pan - -Drag your mouse/finger to the right to pan backward through time, or drag to the left to pan forward in time. Think of -it like pushing the current timeframe off the screen to see what came before or after. 
- -| Interaction | Keyboard | Mouse | Touchpad/touchscreen | -|:------------|:---------|:---------------|:---------------------| -| **Pan** | `n/a` | `click + drag` | `touch drag` | - -### Highlight - -Selecting timeframes is useful when you see an interesting spike or change in a chart and want to investigate further by: - -- Looking at the same period of time on other charts/sections -- Running [metric correlations](https://github.com/netdata/netdata/blob/master/docs/cloud/insights/metric-correlations.md) to filter metrics that also show something different in the selected period, vs the previous one - -| Interaction | Keyboard/mouse | Touchpad/touchscreen | -|:-----------------------------------|:---------------------------------------------------------|:---------------------| -| **Highlight** a specific timeframe | `Alt + mouse selection` or `⌘ + mouse selection` (macOS) | `n/a` | - -### Select and zoom - -You can zoom to a specific timeframe, either horizontally or vertically, by selecting a timeframe. - -| Interaction | Keyboard/mouse | Touchpad/touchscreen | -|:-------------------------------------------|:-------------------------------------|:-----------------------------------------------------| -| **Zoom** to a specific timeframe | `Shift + mouse vertical selection` | `n/a` | -| **Horizontal Zoom** a specific Y-axis area | `Shift + mouse horizontal selection` | `n/a` | - -### Chart zoom - -Zooming in helps you see metrics with maximum granularity, which is useful when you're trying to diagnose the root cause -of an anomaly or outage. - -Zooming out lets you see metrics within the larger context, such as the last hour, day, or week, which is useful in understanding what "normal" looks like, or to identify long-term trends, like a slow creep in memory usage.
- -| Interaction | Keyboard/mouse | Touchpad/touchscreen | -|:-------------------------------------------|:-------------------------------------|:-----------------------------------------------------| -| **Zoom** in or out | `Shift + mouse scrollwheel` | `two-finger pinch` <br />`Shift + two-finger scroll` | - -## Dimensions bar - -### Order dimensions legend - -The bottom legend where you can see the dimensions of the chart can be ordered by: - -<img src="https://user-images.githubusercontent.com/70198089/236144658-6c3d0e31-9bcb-45f3-bb95-4eafdcbb0a58.png" width="300" /> - -- Dimension name (Ascending or Descending) -- Dimension value (Ascending or Descending) -- Dimension Anomaly Rate (Ascending or Descending) - -### Show and hide dimensions - -Hiding dimensions simplifies the chart and can help you better discover exactly which aspect of your system might be -behaving strangely. - -| Interaction | Keyboard/mouse | Touchpad/touchscreen | -|:---------------------------------------|:----------------|:---------------------| -| **Show one** dimension and hide others | `click` | `tap` | -| **Toggle (show/hide)** one dimension | `Shift + click` | `n/a` | - -## Resize a chart - -To resize the chart, click-and-drag the icon on the bottom-right corner of any chart. To restore the chart to its original height, double-click the same icon. diff --git a/docs/cloud/visualize/kubernetes.md b/docs/cloud/visualize/kubernetes.md deleted file mode 100644 index 82c33fd3e..000000000 --- a/docs/cloud/visualize/kubernetes.md +++ /dev/null @@ -1,142 +0,0 @@ -<!-- -title: "Kubernetes visualizations" -description: "Netdata Cloud features rich, zero-configuration Kubernetes monitoring for the resource utilization and application metrics of Kubernetes (k8s) clusters." 
-custom_edit_url: "https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/kubernetes.md" -sidebar_label: "Kubernetes visualizations" -learn_status: "Published" -learn_topic_type: "Concepts" -learn_rel_path: "Operations/Visualizations" ---> - -# Kubernetes visualizations - -Netdata Cloud features enhanced visualizations for the resource utilization of Kubernetes (k8s) clusters, embedded in -the default [Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md) dashboard. - -These visualizations include a health map for viewing the status of k8s pods/containers, in addition to composite charts -for viewing per-second CPU, memory, disk, and networking metrics from k8s nodes. - -See our [Kubernetes deployment instructions](https://github.com/netdata/netdata/blob/master/packaging/installer/methods/kubernetes.md) for details on -installation and connecting to Netdata Cloud. - -## Available Kubernetes metrics - -Netdata Cloud organizes and visualizes the following metrics from your Kubernetes cluster from every container: - -- `cpu_limit`: CPU utilization as a percentage of the limit defined by the [pod specification - `spec.containers[].resources.limits.cpu`](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container) - or a [`LimitRange` - object](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod). -- `cpu`: CPU utilization of the pod/container. 100% usage equals 1 fully-utilized core, 200% equals 2 fully-utilized - cores, and so on. -- `cpu_per_core`: CPU utilization averaged across available cores. 
-- `mem_usage_limit`: Memory utilization, without cache, as a percentage of the limit defined by the [pod specification - `spec.containers[].resources.limits.memory`](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#resource-requests-and-limits-of-pod-and-container) - or a [`LimitRange` - object](https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod). -- `mem_usage`: Used memory, without cache. -- `mem`: The sum of `cache` and `rss` (resident set size) memory usage. -- `writeback`: The size of `dirty` and `writeback` cache. -- `mem_activity`: Sum of `in` and `out` bandwidth. -- `pgfaults`: Sum of page fault bandwidth, which are raised when the Kubernetes cluster tries accessing a memory page - that is mapped into the virtual address space, but not actually loaded into main memory. -- `throttle_io`: Sum of `read` and `write` per second across all PVs/PVCs attached to the container. -- `throttle_serviced_ops`: Sum of the `read` and `write` operations per second across all PVs/PVCs attached to the - container. -- `net.net`: Sum of `received` and `sent` bandwidth per second. -- `net.packets`: Sum of `multicast`, `received`, and `sent` packets. - -When viewing the [health map](#health-map), Netdata Cloud shows the above metrics per container, or aggregated based on -their associated pods. - -When viewing the [composite charts](#composite-charts), Netdata Cloud aggregates metrics from multiple nodes, pods, or -containers, depending on the grouping chosen. For example, if you group the `cpu_limit` composite chart by -`k8s_namespace`, the metrics shown will be the average of `cpu_limit` metrics from all nodes/pods/containers that are -part of that namespace. - -## Health map - -The health map places each container or pod as a single box, then varies the intensity of its color to visualize the -resource utilization of specific k8s pods/containers. 
- -![The Kubernetes health map in Netdata -Cloud](https://user-images.githubusercontent.com/1153921/106964367-39f54100-66ff-11eb-888c-5a04f8abb3d0.png) - -Change the health map's coloring, grouping, and displayed nodes to customize your experience and learn more about the -status of your k8s cluster. - -### Color by - -Color the health map by choosing an aggregate function to apply to an [available Kubernetes -metric](#available-kubernetes-metrics), then whether to display boxes for individual pods or containers. - -The default is the _average, of CPU within the configured limit, organized by container_. - -### Group by - -Group the health map by the `k8s_cluster_id`, `k8s_controller_kind`, `k8s_controller_name`, `k8s_kind`, `k8s_namespace`, -and `k8s_node_name`. The default is `k8s_controller_name`. - -### Filtering - -Filtering behaves identically to the [node filter in War Rooms](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/node-filter.md), with the ability to -filter pods/containers by `container_id` and `namespace`. - -### Detailed information - -Hover over any of the pods/containers in the map to display a modal window, which contains contextual information -and real-time metrics from that resource. - -![The modal containing additional information about a k8s -resource](https://user-images.githubusercontent.com/1153921/106964369-3a8dd780-66ff-11eb-8a8a-a5c8f0d5711f.png) - -The **context** tab provides the following details about a container or pod: - -- Cluster ID -- Node -- Controller Kind -- Controller Name -- Pod Name -- Container -- Kind -- Pod UID - -This information helps orient you as to where the container/pod operates inside your cluster. - -The **Metrics** tab contains charts visualizing the last 15 minutes of the same metrics available in the [color by -option](#color-by).
Use these metrics along with the context, to identify which containers or pods are experiencing -problematic behavior to investigate further, troubleshoot, and remediate with `kubectl` or another tool. - -## Composite charts - -The Kubernetes composite charts show real-time and historical resource utilization metrics from nodes, pods, or -containers within your Kubernetes deployment. - -See the [Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md#definition-bar) doc for details on how composite charts work. These -work similarly, but in addition to visualizing _by dimension_ and _by node_, Kubernetes composite charts can also be -grouped by the following labels: - -- `k8s_cluster_id` -- `k8s_container_id` -- `k8s_container_name` -- `k8s_controller_kind` -- `k8s_kind` -- `k8s_namespace` -- `k8s_node_name` -- `k8s_pod_name` -- `k8s_pod_uid` - -![Composite charts of Kubernetes metrics in Netdata -Cloud](https://user-images.githubusercontent.com/1153921/106964370-3a8dd780-66ff-11eb-8858-05b2253b25c6.png) - -In addition, when you hover over a composite chart, the colors in the heat map change as well, so you can see how -certain pod/container-level metrics change over time. - -## Caveats - -There are some caveats and known issues with Kubernetes monitoring with Netdata Cloud. - -- **No way to remove any nodes** you might have - [drained](https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/) from your Kubernetes cluster. These - drained nodes will be marked "unreachable" and will show up in War Room management screens/dropdowns. The same applies - for any ephemeral nodes created and destroyed during horizontal scaling.
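The grouping behavior described under "Available Kubernetes metrics" in the removed `kubernetes.md` (grouping the `cpu_limit` composite chart by `k8s_namespace` shows the average of the contributing `cpu_limit` values) can be pictured with a minimal sketch. The sample readings and the `group_average` helper below are hypothetical illustrations, not Netdata Cloud's implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-container cpu_limit readings (percent of the configured limit).
samples = [
    {"k8s_namespace": "default", "k8s_pod_name": "web-1", "cpu_limit": 40.0},
    {"k8s_namespace": "default", "k8s_pod_name": "web-2", "cpu_limit": 60.0},
    {"k8s_namespace": "kube-system", "k8s_pod_name": "dns-1", "cpu_limit": 10.0},
]


def group_average(rows, label, metric):
    """Average `metric` over all rows that share the same value of `label`."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[label]].append(row[metric])
    return {key: mean(values) for key, values in buckets.items()}


# One averaged value per namespace, as a "group by k8s_namespace" view would show.
by_namespace = group_average(samples, "k8s_namespace", "cpu_limit")
print(by_namespace)
```

Passing a different label, such as `k8s_pod_name`, to the same helper would give one value per pod instead.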
diff --git a/docs/cloud/visualize/node-filter.md b/docs/cloud/visualize/node-filter.md deleted file mode 100644 index 0dd0ef5a6..000000000 --- a/docs/cloud/visualize/node-filter.md +++ /dev/null @@ -1,17 +0,0 @@ -# Node filter - -The node filter allows you to quickly filter the nodes visualized in a War Room's views. It appears on all views, except on single-node dashboards. - -Inside the filter, the nodes get categorized into three groups: - -| Group | Description | -|---------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Live | Nodes that are currently online, collecting and streaming metrics to Cloud. Live nodes display raised [Alert](https://github.com/netdata/netdata/blob/master/docs/monitor/view-active-alerts.md) counters, [Machine Learning](https://github.com/netdata/netdata/blob/master/ml/README.md) availability, and [Functions](https://github.com/netdata/netdata/blob/master/docs/cloud/netdata-functions.md) availability | -| Stale | Nodes that are offline and not streaming metrics to Cloud. Only historical data can be presented from a parent node. For these nodes you can only see their ML status, as they are not online to provide more information | -| Offline | Nodes that are offline, not streaming metrics to Cloud and not available in any parent node. Offline nodes are automatically deleted after 30 days and can also be deleted manually. | - -By using the search bar, you can narrow down to specific nodes based on their name. - -When you select one or more nodes, the total selected number will appear in the **Nodes** bar on the **Selected** field. 
- -![The node filter](https://user-images.githubusercontent.com/70198089/225249850-60ce4fcc-4398-4412-a6b5-6082308f4e60.png) diff --git a/docs/cloud/visualize/nodes.md b/docs/cloud/visualize/nodes.md deleted file mode 100644 index 3ecf76ca5..000000000 --- a/docs/cloud/visualize/nodes.md +++ /dev/null @@ -1,39 +0,0 @@ -# Nodes tab - -The Nodes tab lets you see and customize key metrics from any number of Agent-monitored nodes and seamlessly navigate -to any node's dashboard for troubleshooting performance issues or anomalies using Netdata's highly-granular metrics. - -![The Nodes tab in Netdata -Cloud](https://user-images.githubusercontent.com/1153921/119035218-2eebb700-b964-11eb-8b74-4ec2df0e457c.png) - -Each War Room's Nodes tab is populated based on the nodes you added to that specific War Room. Each node occupies a -single row, first featuring that node's alert status (yellow for warnings, red for critical alerts) and operating -system, some essential information about the node, followed by columns of user-defined key metrics represented in -real-time charts. - -Use the [Overview](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/overview.md) for monitoring an infrastructure in real time using -composite charts and Netdata's familiar dashboard UI. - -Check the [node -filter](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/node-filter.md) and the [Visualization date time controls -selector](https://github.com/netdata/netdata/blob/master/docs/dashboard/visualization-date-and-time-controls.md) for tools available on the utility bar. - -## Add and customize metrics columns - -Add more metrics columns by clicking the gear icon. Choose the context you'd like to add, give it a relevant name, and -select whether you want to see all dimensions (the default), or only the specific dimensions your team is interested in. - -Click the gear icon and hover over any existing charts, then click the pencil icon. 
This opens a panel to -edit that chart. Edit the context, its title, add or remove dimensions, or delete the chart altogether. - -These customizations appear for anyone else with access to that War Room. - -## See more metrics in Netdata Cloud - -If you want to add more metrics to your War Rooms and they don't show up when you add new metrics to Nodes, you likely -need to configure those nodes to collect from additional data sources. See our [collectors configuration reference](https://github.com/netdata/netdata/blob/master/collectors/REFERENCE.md) -to learn how to use dozens of pre-installed collectors that can instantly collect from your favorite services and applications. - -If you want to see up to 30 days of historical metrics in Cloud (and more on individual node dashboards), read about [changing how long Netdata stores metrics](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md). Also, see our -[calculator](https://github.com/netdata/netdata/blob/master/docs/store/change-metrics-storage.md#calculate-the-system-resources-ram-disk-space-needed-to-store-metrics) -for finding the disk and RAM you need to store metrics for a certain period of time. diff --git a/docs/cloud/visualize/overview.md b/docs/cloud/visualize/overview.md deleted file mode 100644 index 84638f058..000000000 --- a/docs/cloud/visualize/overview.md +++ /dev/null @@ -1,48 +0,0 @@ -# Home, overview and single node tabs - -Learn how to use the Home, Overview, and Single Node tabs in Netdata Cloud, to explore your infrastructure and troubleshoot issues. - -## Home - -The Home tab provides a predefined dashboard of relevant information about entities in the War Room. - -This tab will automatically present summarized information in an easily digestible display. You can see information about your -nodes, data collection and retention stats, alerts, users and dashboards. 
- -## Overview and single node tab - -The Overview tab is another great way to monitor infrastructure using Netdata Cloud. While the interface might look -similar to local dashboards served by an Agent, Overview uses **composite charts**. -These charts display real-time aggregated metrics from all the nodes (or a filtered selection) in a given War Room. - -When you [interact with composite charts](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md) -you can see your infrastructure from a single pane of glass, discover trends or anomalies, and perform root cause analysis. - -The Single Node tab dashboard is exactly the same as the Overview, but with a hard-coded filter to only show a single node. - -### Chart navigation Menu - -Netdata Cloud uses a similar menu to local Agent dashboards, with sections -and sub-menus aggregated from every contributing node. For example, even if only two nodes actively collect from and -monitor an Apache web server, the **Apache** section still appears and displays composite charts from those two nodes. - -![A menu in the Overview screen](https://user-images.githubusercontent.com/1153921/95785094-fa0ad980-0c89-11eb-8328-2ff11ac630b4.png) - -One difference between the Netdata Cloud menu and those found in local Agent dashboards is that -the Overview condenses multiple services, families, or instances into single sections, sub-menus, and associated charts. - -For services, let's say you have two concurrent jobs with the [web_log collector](https://github.com/netdata/go.d.plugin/blob/master/modules/weblog/README.md), one for Apache and another for Nginx. -A single-node or local dashboard shows two sections, **web_log apache** and **web_log nginx**, whereas the Overview condenses these into a -single **web_log** section containing composite charts from both jobs. - -The Cloud also condenses multiple families or multiple instances into a single **all** sub-menu and associated charts.
-For example, if Node A has 5 disks, and Node B has 3, each disk contributes to a single `disk.io` composite chart. -The utility bar should show that there are 8 charts from 2 nodes contributing to that chart. -The aggregation applies to disks, network devices, and other metric types that involve multiple instances of a piece of hardware or software. - -## Persistence of composite chart settings - -Of course you can [change the filtering or grouping](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/interact-new-charts.md) of metrics in the composite charts that aggregate all these instances, to see only the information you are interested in, and save that tab in a custom dashboard. - -When you change a composite chart via its definition bar, Netdata Cloud persists these settings in a query string attached to the URL in your browser. -You can "save" these settings by bookmarking this particular URL, or share it with colleagues by having them copy-paste it into their browser. |
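The condensing of instances described in the removed `overview.md` (Node A with 5 disks and Node B with 3 all feeding one `disk.io` composite chart) can be sketched as follows. The node names, disk names, readings, and the `condense` helper are hypothetical, and summing is only one of the aggregate functions a composite chart may apply.

```python
# Hypothetical latest disk.io readings, keyed by (node, disk instance).
readings = {("node-a", f"sd{c}"): 10.0 for c in "abcde"}        # Node A: 5 disks
readings.update({("node-b", f"sd{c}"): 20.0 for c in "abc"})    # Node B: 3 disks


def condense(per_instance):
    """Collapse per-instance readings into a single composite summary."""
    nodes = {node for node, _instance in per_instance}
    return {
        "instances": len(per_instance),     # contributing instances (disks)
        "nodes": len(nodes),                # distinct contributing nodes
        "sum": sum(per_instance.values()),  # one possible aggregate over all instances
    }


summary = condense(readings)
print(summary)
```

The `instances` and `nodes` counts correspond to the "8 charts from 2 nodes" figure quoted in the text.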