author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:23 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:44 +0000 |
commit | 836b47cb7e99a977c5a23b059ca1d0b5065d310e (patch) | |
tree | 1604da8f482d02effa033c94a84be42bc0c848c3 /docs/alerts-and-notifications | |
parent | Releasing debian version 1.44.3-2. (diff) | |
Merging upstream version 1.46.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/alerts-and-notifications')
5 files changed, 252 insertions, 0 deletions
diff --git a/docs/alerts-and-notifications/creating-alerts-with-netdata-alerts-configuration-manager.md b/docs/alerts-and-notifications/creating-alerts-with-netdata-alerts-configuration-manager.md
new file mode 100644
index 000000000..f9a443c9d
--- /dev/null
+++ b/docs/alerts-and-notifications/creating-alerts-with-netdata-alerts-configuration-manager.md
@@ -0,0 +1,44 @@
+# Creating Alerts with Netdata Alerts Configuration Manager
+
+The Netdata Alerts Configuration Manager enables subscribers to easily set up Alerts directly from the Netdata Dashboard. More details on subscriptions can be found [here](https://www.netdata.cloud/pricing/).
+
+## Using the Alerts Configuration Manager
+
+1. Navigate to the **Metrics** tab and select the chart you want to configure for Alerts.
+2. Click the **Alert icon** in the top right corner of the chart.
+3. The Alerts Configuration Manager will open, showing the default thresholds. Modify these thresholds as needed; the Alert definition on the right will update automatically.
+4. For additional settings, toggle **Show advanced**.
+5. After configuring the Alert, copy the generated Alert definition from the code box. Paste this into an existing or new custom health configuration file located at `<path to netdata install>/etc/netdata/health.d/` on a Parent Agent or a Standalone Child Agent. The guide to edit health configuration files is available [here](/src/health/REFERENCE.md#edit-health-configuration-files).
+6. To activate the new Alert, run the command `<path to netdata install>/usr/sbin/netdatacli reload-health`.
+
+## Alerts Configuration Manager Sections
+
+### Alert Detection Method
+
+An Alert is triggered whenever a metric crosses a threshold:
+
+- **Standard Threshold**: Triggered when a metric crosses a predefined value.
+- **Metric Variance**: Triggered based on the variance of the metric.
+- **Anomaly Rate**: Triggered based on the anomaly rate of the metric.
+
+### Metrics Lookup, Filtering, and Formula Section
+
+You can read more about the different options in the [Alerts reference documentation](/src/health/REFERENCE.md).
+
+- **Metrics Lookup**: Adjust the database lookup parameters directly in the UI, including method (`avg`, `sum`, `min`, `max`, etc.), computation style, dimensions, duration, and options like `absolute` or `percentage`.
+- **Alert Filtering**: The **show advanced** checkbox allows filtering of Alert health checks for specific infrastructure components. Options include selecting hosts, nodes, instances, chart labels, and operating systems.
+- **Formula / Calculation**: The **show advanced** checkbox allows defining a formula for the metric value, which is then used to set Alert thresholds.
+
+### Alerting Conditions
+
+- **Thresholds**: Set thresholds for warning and critical Alert states, specifying whether the Alert should trigger above or below these thresholds. Advanced settings allow for custom formulas.
+  - **Recovery Thresholds**: Set thresholds for downgrading the Alert from critical to warning or from warning to clear.
+- **Check Interval**: Define how frequently the health check should run.
+- **Delay Notifications**: Manage notification delays for Alert escalations or de-escalations.
+- **Agent Specific Options**: Options exclusive to the Netdata Agent, like repeat notification frequencies and notification recipients.
+  - **Custom Exec Script**: Define custom scripts to execute when an Alert triggers.
+
+### Alert Name, Description, and Summary Section
+
+- **Alert Template Name**: Provide a unique name for the Alert.
+- **Alert Template Description**: Offer a brief explanation of what the Alert
diff --git a/docs/alerts-and-notifications/notifications/README.md b/docs/alerts-and-notifications/notifications/README.md
new file mode 100644
index 000000000..3368b4e14
--- /dev/null
+++ b/docs/alerts-and-notifications/notifications/README.md
@@ -0,0 +1,7 @@
+# Notifications
+
+This section includes the documentation of the integrations for both of Netdata's notification methods.
+
+- Netdata Cloud provides centralized alert notifications, utilizing the health status data already sent to Netdata Cloud from connected nodes to send alerts to configured integrations. [Supported integrations](/docs/alerts-&-notifications/notifications/centralized-cloud-notifications) include Amazon SNS, Discord, Slack, Splunk, and others.
+
+- The Netdata Agent offers a [wider range of notification options](/docs/alerts-&-notifications/notifications/agent-dispatched-notifications) directly from the agent itself. You can choose from over a dozen services, including email, Slack, PagerDuty, Twilio, and others, for more granular control over notifications on each node.
diff --git a/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md
new file mode 100644
index 000000000..c9570c470
--- /dev/null
+++ b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md
@@ -0,0 +1,69 @@
+# Centralized Cloud Notifications Reference
+
+Netdata Cloud sends Alert notifications for nodes in warning, critical, or unreachable states, ensuring Alerts are managed centrally and efficiently.
+
+## Benefits of Centralized Notifications
+
+- Consolidate health status views across all infrastructure in one place.
+- Set up and [manage your Alert notifications easily](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md).
+- Expedite troubleshooting with tools like [Metric Correlations](/docs/metric-correlations.md) and the [Anomaly Advisor](/docs/dashboards-and-charts/anomaly-advisor-tab.md).
+
+> **Note**
+>
+> To avoid notification overload, **flood protection** is triggered when a node frequently disconnects or sends excessive Alerts, highlighting potential issues.
+
+Administrators must [enable Alert notifications](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md#manage-space-notification-settings) for their Space(s). All users can then customize their notification preferences through their [account menu](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md#manage-user-notification-settings).
+
+> **Note**
+>
+> Centralized Alerts in Netdata Cloud are separate from the [Netdata Agent](/docs/alerts-and-notifications/notifications/README.md) notifications. Agent Alerts must be [configured individually](/src/health/REFERENCE.md) on each node.
+
+## Alert Notifications
+
+Notifications can be sent via email or through third-party services like PagerDuty or Slack. Administrators can manage notification settings for the entire Space, while individual users can personalize settings in their profile.
+
+### Service Level
+
+#### Personal
+
+Notifications are sent to user-specific destinations, such as email, which are managed by users under their profile settings.
+
+#### System
+
+These notifications go to general targets like a Slack channel, with administrators setting rules for notification targets based on workspace or Alert level.
+
+### Service Classification
+
+#### Community
+
+Available to all plans, includes basic methods like Email and Discord.
+
+#### Business
+
+Exclusive to [paid plans](/docs/netdata-cloud/view-plan-and-billing.md), includes advanced services like PagerDuty and Slack.
+
+## Alert Notification Silencing Rules
+
+Netdata Cloud offers a silencing rule engine to mute Alert notifications based on specific conditions related to nodes or Alert types. Learn how to manage these settings [here](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-alert-notification-silencing-rules.md).
+
+## Flood Protection
+
+If a node repeatedly changes state or raises Alerts, flood protection limits notifications to prevent overload. You can still access node details through Netdata Cloud or directly via the local Agent dashboard.
+
+## Anatomy of an Email Alert Notification
+
+Email notifications provide comprehensive details:
+
+- The Space's name
+- The node's name
+- Alert status: critical, warning, cleared
+- Previous Alert status
+- Time at which the Alert triggered
+- Chart context that triggered the Alert
+- Name and information about the triggered Alert
+- Alert value
+- Total number of warning and critical Alerts on that node
+- Threshold for triggering the given Alert state
+- Calculation or database lookups that Netdata uses to compute the value
+- Source of the Alert, including which file you can edit to configure this Alert on an individual node
+- Direct link to the node’s chart in Cloud dashboards.
diff --git a/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-alert-notification-silencing-rules.md b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-alert-notification-silencing-rules.md
new file mode 100644
index 000000000..d537ef7ea
--- /dev/null
+++ b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-alert-notification-silencing-rules.md
@@ -0,0 +1,60 @@
+# Manage Alert Notification Silencing Rules
+
+From the Cloud interface, you can manage your space's Alert notification silencing rules settings as well as allow users to define their personal ones.
+
+## Prerequisites
+
+To manage **space's Alert notification silencing rule settings**, you will need the following:
+
+- A Netdata Cloud account
+- Access to the space as an **administrator** or **manager** (**troubleshooters** can only view space rules)
+
+To manage your **personal Alert notification silencing rule settings**, you will need the following:
+
+- A Netdata Cloud account
+- Access to the space with any role except **billing**
+
+### Steps
+
+1. Click on the **Space settings** cog (located above your profile icon).
+2. Click on the **Alert & Notification** tab on the left-hand side.
+3. Click on the **Notification Silencing Rules** tab.
+4. You will be presented with a table of the configured Alert notification silencing rules for:
+
+   - The space (if you aren't an **observer**)
+   - Yourself
+
+   You will be able to:
+
+   1. **Add a new** Alert notification silencing rule configuration.
+      - Choose if it applies to **All users** or **Myself** (All users is only available for **administrators** and **managers**).
+      - You need to provide a name for the configuration so you can easily refer to it.
+      - Define criteria for Nodes: which Rooms the rule applies to, which Nodes, and whether it applies to host label key-value pairs.
+      - Define criteria for Alerts, such as which Alert name is targeted and in which Alert context. You can also specify if it will apply to a specific Alert role.
+      - Define when it will be applied:
+        - Immediately, from now until it is turned off or until a specific duration (start and end date automatically set).
+        - Scheduled, you can specify the start and end time for when the rule becomes active and then inactive (time is set according to your browser's local timezone).
+      Note: You are only able to add a rule if your space is on a [paid plan](/docs/netdata-cloud/view-plan-and-billing.md).
+   2. **Edit an existing** Alert notification silencing rule configuration. You will be able to change:
+      - The name provided for it
+      - Who it applies to
+      - Selection criteria for Nodes and Alerts
+      - When it will be applied
+   3. **Enable/Disable** a given Alert notification silencing rule configuration.
+      - Use the toggle to enable or disable
+   4. **Delete an existing** Alert notification silencing rule.
+      - Use the trash icon to delete your configuration
+
+## Silencing Rules Examples
+
+| Rule name | Rooms | Nodes | Host Label | Alert name | Alert context | Alert instance | Alert role | Description |
+|:----------|:------|:------|:-----------|:-----------|:--------------|:---------------|:-----------|:------------|
+| Space silencing | All Rooms | * | * | * | * | * | * | This rule silences the entire space: it targets all nodes and applies to all users. E.g. infrastructure-wide maintenance window. |
+| DB Servers Rooms | PostgreSQL Servers | * | * | * | * | * | * | This rule silences the nodes in the Room named PostgreSQL Servers; for example, it doesn't silence the `All Nodes` Room. E.g. My team with membership to this Room doesn't want to receive notifications for these nodes. |
+| Node child1 | All Rooms | `child1` | * | * | * | * | * | This rule silences all Alert state transitions for node `child1` in all Rooms and for all users. E.g. node could be going under maintenance. |
+| Production nodes | All Rooms | * | `environment:production` | * | * | * | * | This rule silences all Alert state transitions for nodes with the host label key-value pair `environment:production`. E.g. Maintenance window on nodes with specific host labels. |
+| Third party maintenance | All Rooms | * | * | `httpcheck_posthog_netdata_cloud.request_status` | * | * | * | This rule silences this specific Alert since the third-party partner will be undergoing maintenance. |
+| Intended stress usage on CPU | All Rooms | * | * | * | `system.cpu` | * | * | This rule silences specific Alerts across all nodes and their CPU cores. |
+| Silence role webmaster | All Rooms | * | * | * | * | * | `webmaster` | This rule silences all Alerts configured with the role `webmaster`. |
+| Silence Alert on node | All Rooms | `child1` | * | `httpcheck_posthog_netdata_cloud.request_status` | * | * | * | This rule silences the specific Alert on the `child1` node. |
+| Disk Space Alerts on mount point | All Rooms | * | * | `disk_space_usage` | `disk.space` | `disk_space_opt_baddisk` | * | This rule silences the specific Alert instance for the `/opt/baddisk` mount point on all nodes. |
diff --git a/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md
new file mode 100644
index 000000000..6a432ded3
--- /dev/null
+++ b/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/manage-notification-methods.md
@@ -0,0 +1,72 @@
+# Manage Notification Methods
+
+From the Cloud interface, you can manage your Space's notification settings as well as allow users to personalize their notification settings.
+
+## Manage Space Notification Settings
+
+### Prerequisites
+
+To manage Space notification settings, you will need the following:
+
+- A Netdata Cloud account
+- Access to the Space as an **administrator**
+
+### Available Actions per Notification Method Based on Service Level
+
+| **Action** | **Personal Service Level** | **System Service Level** |
+|:------------------------------------------------|:--------------------------:|:------------------------:|
+| Enable / Disable | X | X |
+| Edit | | X |
+| Delete | X | X |
+| Add multiple configurations for the same method | | X |
+
+> **Notes**
+>
+> - For Netdata provided ones, you can't delete the existing notification method configuration.
+> - Enable, Edit, and Add actions over specific notification methods will only be allowed if your plan has access to those ([service classification](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md#service-classification)).
+
+### Steps
+
+1. Click on the **Space settings** cog (located above your profile icon).
+2. Click on the **Alerts & Notifications** tab on the left-hand side.
+3. Click on the **Notification Methods** tab.
+4. You will be presented with a table of the configured notification methods for the Space. You will be able to:
+   1. **Add a new** notification method configuration.
+      - Choose the service from the list of available ones. The available options will depend on your subscription plan.
+      - You can optionally provide a name for the configuration so you can easily refer to it.
+      - You can define the filtering criteria, regarding which Rooms the method will apply, and what notifications you want to receive (All Alerts and unreachable, All Alerts, Critical only).
+      - Depending on the service, different inputs will be present. Please note that there are mandatory and optional inputs.
+      - If you have doubts on how to configure the service, you can find a link at the top of the modal that takes you to the specific documentation page to help you.
+   2. **Edit an existing** notification method configuration. Personal level ones can't be edited here, see [Manage User Notification Settings](#manage-user-notification-settings). You will be able to change:
+      - The name provided for it
+      - Filtering criteria
+      - Service-specific inputs
+   3. **Enable/Disable** a given notification method configuration.
+      - Use the toggle to enable or disable the notification method configuration.
+   4. **Delete an existing** notification method configuration. Netdata provided ones can't be deleted, e.g., Email.
+      - Use the trash icon to delete your configuration.
+
+## Manage User Notification Settings
+
+### Prerequisites
+
+To manage user-specific notification settings, you will need the following:
+
+- A Cloud account
+- Access to at least one Space
+
+Note: If an administrator has disabled a Personal [service level](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md#service-level) notification method, this will override any user-specific setting.
+
+### Steps
+
+1. Click on the **User notification settings** shortcut on top of the help button.
+2. You are presented with:
+   - The Personal [service level](/docs/alerts-and-notifications/notifications/centralized-cloud-notifications/centralized-cloud-notifications-reference.md#service-level) notification methods you can manage.
+   - The list of Spaces, and the Rooms inside them, that you have access to.
+   - If you're an Administrator, Manager, or Troubleshooter, you'll also see the Rooms from a Space you don't have access to on the **All Rooms** tab, and you can activate notifications for them by joining the Room.
+3. On this modal you will be able to:
+   1. **Enable/Disable** the notification method for you; this applies across all Spaces and Rooms.
+      - Use the toggle to enable or disable the notification method.
+   2. **Define what notifications you want** per Space/Room: All Alerts and unreachable, All Alerts, Critical only, or No notifications.
+   3. **Activate notifications** for a Room you aren't a member of.
+      - From the **All Rooms** tab, click on the Join button for the Room(s) you want.
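
Editor's sketch (not part of the commit): the Silencing Rules Examples table added in `manage-alert-notification-silencing-rules.md` can be read as a set of per-field filters, where `*` matches anything and `All Rooms` matches every Room. The Python below illustrates that reading only. The class names, the `is_silenced` helper, and the exact-match semantics are assumptions made for illustration; the diff does not describe how Netdata Cloud evaluates rules internally, and the sample Alert name, instance, role, and host label in `event` are hypothetical.

```python
from dataclasses import dataclass

WILDCARD = "*"          # "match anything", as used in the examples table
ALL_ROOMS = "All Rooms"  # Rooms value that covers every Room


@dataclass(frozen=True)
class SilencingRule:
    """One row of the examples table (hypothetical model)."""
    name: str
    rooms: str = ALL_ROOMS
    nodes: str = WILDCARD
    host_label: str = WILDCARD
    alert_name: str = WILDCARD
    alert_context: str = WILDCARD
    alert_instance: str = WILDCARD
    alert_role: str = WILDCARD


@dataclass(frozen=True)
class AlertEvent:
    """An Alert state transition to test against rules (hypothetical model)."""
    rooms: frozenset        # names of the Rooms the node belongs to
    node: str
    host_labels: frozenset  # "key:value" strings
    alert_name: str
    alert_context: str
    alert_instance: str
    alert_role: str


def _matches(pattern: str, value: str) -> bool:
    # Assumption: a field matches on the wildcard or on exact equality.
    return pattern == WILDCARD or pattern == value


def is_silenced(rule: SilencingRule, event: AlertEvent) -> bool:
    """Return True when every criterion of the rule matches the event."""
    room_ok = rule.rooms == ALL_ROOMS or rule.rooms in event.rooms
    label_ok = rule.host_label == WILDCARD or rule.host_label in event.host_labels
    return (
        room_ok
        and label_ok
        and _matches(rule.nodes, event.node)
        and _matches(rule.alert_name, event.alert_name)
        and _matches(rule.alert_context, event.alert_context)
        and _matches(rule.alert_instance, event.alert_instance)
        and _matches(rule.alert_role, event.alert_role)
    )


# A few rows taken from the examples table.
rules = [
    SilencingRule("Node child1", nodes="child1"),
    SilencingRule("Production nodes", host_label="environment:production"),
    SilencingRule("Intended stress usage on CPU", alert_context="system.cpu"),
    SilencingRule("Silence role webmaster", alert_role="webmaster"),
    SilencingRule("DB Servers Rooms", rooms="PostgreSQL Servers"),
]

# Hypothetical event: a CPU Alert on node child1 in a staging environment.
event = AlertEvent(
    rooms=frozenset({"All Nodes"}),
    node="child1",
    host_labels=frozenset({"environment:staging"}),
    alert_name="example_cpu_alert",
    alert_context="system.cpu",
    alert_instance="system.cpu",
    alert_role="sysadmin",
)

silenced_by = [rule.name for rule in rules if is_silenced(rule, event)]
print(silenced_by)
```

Under these assumptions, a notification is muted as soon as one enabled rule matches, so narrower rules (for example, one naming both a node and an Alert name) only add value when the broader ones are disabled.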