Diffstat (limited to 'health')
-rw-r--r-- | health/README.md | 53 |
-rw-r--r-- | health/health.d/net.conf | 14 |
-rw-r--r-- | health/health_config.c | 2 |
-rw-r--r-- | health/notifications/alarm-notify.sh | 73 |
-rwxr-xr-x | health/notifications/alarm-notify.sh.in | 73 |
-rw-r--r-- | health/notifications/alerta/README.md | 238 |
-rwxr-xr-x | health/notifications/health_alarm_notify.conf | 5 |
7 files changed, 181 insertions, 277 deletions
diff --git a/health/README.md b/health/README.md index 597bd3c32..5d68d752a 100644 --- a/health/README.md +++ b/health/README.md @@ -1,4 +1,3 @@ - # Health monitoring Each netdata node runs an independent thread evaluating health monitoring checks. @@ -40,16 +39,16 @@ killall -USR2 netdata There are 2 entities: -1. **alarms**, which are attached to specific charts, and +1. **alarms**, which are attached to specific charts, and -2. **templates**, which define rules that should be applied to all charts having a +1. **templates**, which define rules that should be applied to all charts having a specific `context`. You can use this feature to apply **alarms** to all disks, all network interfaces, all mysql databases, all nginx web servers, etc. Both of these entities have exactly the same format and feature set. The only difference is the label `alarm` or `template`. -netdata supports overriding **templates** with **alarms**. +Netdata supports overriding **templates** with **alarms**. For example, when a template is defined for a set of charts, an alarm with exactly the same name attached to the same chart the template matches, will have higher precedence (i.e. netdata will use the alarm on this chart and prevent the template from being applied @@ -59,7 +58,7 @@ to it). The following lines are parsed. -#### alarm line `alarm` or `template` +#### Alarm line `alarm` or `template` This line starts an alarm or alarm template. @@ -78,7 +77,7 @@ This line has to be first on each alarm or template. --- -#### alarm line `on` +#### Alarm line `on` This line defines the data the alarm should be attached to. @@ -112,7 +111,7 @@ So, `plugin = proc`, `module = /proc/net/dev` and `context = net.net`. --- -#### alarm line `os` +#### Alarm line `os` This alarm or template will be used only if the O/S of the host loading it, matches this pattern list. 
The value is a space separated list of simple patterns (use `*` as wildcard, @@ -124,7 +123,7 @@ os: linux freebsd macos --- -#### alarm line `hosts` +#### Alarm line `hosts` This alarm or template will be used only if the hostname of the host loading it, matches this pattern list. The value is a space separated list of simple patterns (use `*` as wildcard, @@ -141,7 +140,7 @@ This is useful when you centralize metrics from multiple hosts, to one netdata. --- -#### alarm line `families` +#### Alarm line `families` This line is only used in alarm templates. It filters the charts. So, if you need to create an alarm template for a few of a kind of chart (a few of your disks, or a few of your network @@ -165,7 +164,7 @@ The family of a chart is usually the submenu of the netdata dashboard it appears --- -#### alarm line `lookup` +#### Alarm line `lookup` This lines makes a database lookup to find a value. This result of this lookup is available as `$this`. @@ -205,7 +204,7 @@ The timestamps of the timeframe evaluated by the database lookup is available as --- -#### alarm line `calc` +#### Alarm line `calc` This expression is evaluated just after the `lookup` (if any). Its purpose is to apply some calculation before using the value looked up from the db. @@ -225,7 +224,7 @@ Check [Expressions](#expressions) for more information. --- -#### alarm line `every` +#### Alarm line `every` Sets the update frequency of this alarm. This is the same to the `every DURATION` given in the `lookup` lines. @@ -240,7 +239,7 @@ every: DURATION --- -#### alarm lines `green` and `red` +#### Alarm lines `green` and `red` Set the green and red thresholds of a chart. Both are available as `$green` and `$red` in expressions. If multiple alarms define different thresholds, the ones defined by the first @@ -257,7 +256,7 @@ red: NUMBER --- -#### alarm lines `warn` and `crit` +#### Alarm lines `warn` and `crit` These expressions should evaluate to true or false (alternatively non-zero or zero). 
They trigger the alarm. Both are optional. @@ -272,7 +271,7 @@ Check [Expressions](#expressions) for more information. --- -#### alarm line `to` +#### Alarm line `to` This will be the first parameter of the script to be executed when the alarm switches status. Its meaning is left up to the `exec` script. @@ -288,7 +287,7 @@ to: ROLE1 ROLE2 ROLE3 ... --- -#### alarm line `exec` +#### Alarm line `exec` The script that will be executed when the alarm changes status. @@ -303,7 +302,7 @@ methods netdata supports, including custom hooks. --- -#### alarm line `delay` +#### Alarm line `delay` This is used to provide optional hysteresis settings for the notifications, to defend against notification floods. These settings do not affect the actual alarm - only the time @@ -374,13 +373,9 @@ Expressions can have variables. Variables start with `$`. Check below for more i There are two special values you can use: - - `nan`, for example `$this != nan` will check if the variable `this` is available. - A variable can be `nan` if the database lookup failed. All calculations (i.e. addition, - multiplication, etc) with a `nan` result in a `nan`. +- `nan`, for example `$this != nan` will check if the variable `this` is available. A variable can be `nan` if the database lookup failed. All calculations (i.e. addition, multiplication, etc) with a `nan` result in a `nan`. - - `inf`, for example `$this != inf` will check if `this` is not infinite. A value or - variable can be infinite if divided by zero. All calculations (i.e. addition, - multiplication, etc) with a `inf` result in a `inf`. +- `inf`, for example `$this != inf` will check if `this` is not infinite. A value or variable can be infinite if divided by zero. All calculations (i.e. addition, multiplication, etc) with a `inf` result in a `inf`. 
--- @@ -412,10 +407,10 @@ Which in turn, results in the following behavior: * While the value is falling, it will return to a warning state when it goes below 85, and a normal state when it goes below 75. - + * If the value is constantly varying between 80 and 90, then it will trigger a warning the first time it goes above 85, but will remain a warning until it goes below 75 (or goes above 85). - + * If the value is constantly varying between 90 and 100, then it will trigger a critical alert the first time it goes above 95, but will remain a critical alert goes below 85 (at which point it will return to being a warning). @@ -490,8 +485,7 @@ The external script will be called for all status changes. ## Examples - -Check the **[health.d directory](health.d)** for all alarms shipped with netdata. +Check the `health/health.d/` directory for all alarms shipped with netdata. Here are a few examples: @@ -650,8 +644,5 @@ Important: this will generate a lot of output in debug.log. You can find the context of charts by looking up the chart in either `http://your.netdata:19999/netdata.conf` or `http://your.netdata:19999/api/v1/charts`. -You can find how netdata interpreted the expressions by examining the alarm at -`http://your.netdata:19999/api/v1/alarms?all`. For each expression, netdata will return the -expression as given in its config file, and the same expression with additional parentheses -added to indicate the evaluation flow of the expression. +You can find how netdata interpreted the expressions by examining the alarm at `http://your.netdata:19999/api/v1/alarms?all`. For each expression, netdata will return the expression as given in its config file, and the same expression with additional parentheses added to indicate the evaluation flow of the expression. 
diff --git a/health/health.d/net.conf b/health/health.d/net.conf index 489016dd5..ae3c26ec6 100644 --- a/health/health.d/net.conf +++ b/health/health.d/net.conf @@ -4,13 +4,23 @@ # ----------------------------------------------------------------------------- # net traffic overflow + template: interface_speed + on: net.net + os: * + hosts: * + families: * + calc: ( $nic_speed_max > 0 ) ? ( $nic_speed_max) : ( nan ) + units: Mbit + every: 10s + info: The current speed of the physical network interface + template: 1m_received_traffic_overflow on: net.net os: linux hosts: * families: * lookup: average -1m unaligned absolute of received - calc: ($nic_speed_max > 0) ? ($this * 100 / ($nic_speed_max * 1000)) : ( nan ) + calc: ($interface_speed > 0) ? ($this * 100 / ($interface_speed * 1000)) : ( nan ) units: % every: 10s warn: $this > (($status >= $WARNING) ? (80) : (85)) @@ -25,7 +35,7 @@ hosts: * families: * lookup: average -1m unaligned absolute of sent - calc: ($nic_speed_max > 0) ? ($this * 100 / ($nic_speed_max * 1000)) : ( nan ) + calc: ($interface_speed > 0) ? ($this * 100 / ($interface_speed * 1000)) : ( nan ) units: % every: 10s warn: $this > (($status >= $WARNING) ? 
(80) : (85)) diff --git a/health/health_config.c b/health/health_config.c index d4af9776f..d4cf78d97 100644 --- a/health/health_config.c +++ b/health/health_config.c @@ -84,7 +84,7 @@ static inline int rrdcalctemplate_add_template_from_config(RRDHOST *host, RRDCAL return 0; } - if(unlikely(!RRDCALCTEMPLATE_HAS_CALCULATION(rt) && !rt->warning && !rt->critical)) { + if(unlikely(!RRDCALCTEMPLATE_HAS_DB_LOOKUP(rt) && !rt->calculation && !rt->warning && !rt->critical)) { error("Health configuration for template '%s' is useless (no calculation, no warning and no critical evaluation)", rt->name); return 0; } diff --git a/health/notifications/alarm-notify.sh b/health/notifications/alarm-notify.sh index 33a59590e..3331dcd94 100644 --- a/health/notifications/alarm-notify.sh +++ b/health/notifications/alarm-notify.sh @@ -896,7 +896,7 @@ date=$(date --date=@${when} "${date_format}" 2>/dev/null) # ---------------------------------------------------------------------------- # prepare some extra headers if we've been asked to thread e-mails -if [ "${SEND_EMAIL}" == "YES" -a "${EMAIL_THREADING}" == "YES" ] ; then +if [ "${SEND_EMAIL}" == "YES" -a "${EMAIL_THREADING}" != "NO" ] ; then email_thread_headers="In-Reply-To: <${chart}-${name}@${host}>\nReferences: <${chart}-${name}@${host}>" else email_thread_headers= @@ -1480,7 +1480,7 @@ send_slack() { { "channel": "#${channel}", "username": "netdata on ${host}", - "icon_url": "${images_base_url}/images/seo-performance-128.png", + "icon_url": "${images_base_url}/images/banner-icon-144x144.png", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", "attachments": [ { @@ -1545,7 +1545,7 @@ send_rocketchat() { { "channel": "#${channel}", "alias": "netdata on ${host}", - "avatar": "${images_base_url}/images/seo-performance-128.png", + "avatar": "${images_base_url}/images/banner-icon-144x144.png", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", "attachments": [ { @@ -1592,39 +1592,68 @@ 
EOF # alerta sender send_alerta() { - local webhook="${1}" channels="${2}" httpcode sent=0 channel severity content + local webhook="${1}" channels="${2}" httpcode sent=0 channel severity resource event payload auth [ "${SEND_ALERTA}" != "YES" ] && return 1 case "${status}" in - WARNING) severity="warning" ;; CRITICAL) severity="critical" ;; + WARNING) severity="warning" ;; CLEAR) severity="cleared" ;; - *) severity="unknown" ;; + *) severity="indeterminate" ;; esac - info=$( echo -n ${info}) + if [[ "${chart}" == httpcheck* ]] + then + resource=$chart + event=$name + else + resource="${host}:${family}" + event="${chart}.${name}" + fi - # the "event" property must be unique and repetible between states to let alerta do automatic correlation using severity value for channel in ${channels} do - content="{" - content="$content \"environment\": \"${channel}\"," - content="$content \"service\": [\"${host}\"]," - content="$content \"resource\": \"${host}\"," - content="$content \"event\": \"${name}.${chart} (${family})\"," - content="$content \"severity\": \"${severity}\"," - content="$content \"value\": \"${alarm}\"," - content="$content \"text\": \"${info}\"" - content="$content }" + payload="$(cat <<EOF + { + "resource": "${resource}", + "event": "${event}", + "environment": "${channel}", + "severity": "${severity}", + "service": ["Netdata"], + "group": "Performance", + "value": "${value_string}", + "text": "${info}", + "tags": ["alarm_id:${alarm_id}"], + "attributes": { + "roles": "${roles}", + "name": "${name}", + "chart": "${chart}", + "family": "${family}", + "source": "${src}", + "moreInfo": "<a href=\"${goto_url}\">View Netdata</a>" + }, + "origin": "netdata/${this_host}", + "type": "netdataAlarm", + "rawData": "${BASH_ARGV[@]}" + } +EOF + )" + if [[ -n "${ALERTA_API_KEY}" ]] + then + auth="Key ${ALERTA_API_KEY}" + fi - httpcode=$(docurl -X POST "${webhook}/alert" -H "Content-Type: application/json" -H "Authorization: Key $ALERTA_API_KEY" -d "$content" ) + 
httpcode=$(docurl -X POST "${webhook}/alert" -H "Content-Type: application/json" -H "Authorization: $auth" --data "${payload}") if [[ "${httpcode}" = "200" || "${httpcode}" = "201" ]] then info "sent alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'" sent=$((sent + 1)) + elif [[ "${httpcode}" = "202" ]] + then + info "suppressed alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'" else error "failed to send alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP error code ${httpcode}." fi @@ -1655,7 +1684,7 @@ send_flock() { httpcode=$(docurl -X POST "${webhook}" -H "Content-Type: application/json" -d "{ \"sendAs\": { \"name\" : \"netdata on ${host}\", - \"profileImage\" : \"${images_base_url}/images/seo-performance-128.png\" + \"profileImage\" : \"${images_base_url}/images/banner-icon-144x144.png\" }, \"text\": \"${host} *${status_message}*\", \"timestamp\": \"${when}\", @@ -1715,7 +1744,7 @@ send_discord() { "channel": "#${channel}", "username": "${username}", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", - "icon_url": "${images_base_url}/images/seo-performance-128.png", + "icon_url": "${images_base_url}/images/banner-icon-144x144.png", "attachments": [ { "color": "${color}", @@ -1729,7 +1758,7 @@ send_discord() { } ], "thumb_url": "${image}", - "footer_icon": "${images_base_url}/images/seo-performance-128.png", + "footer_icon": "${images_base_url}/images/banner-icon-144x144.png", "footer": "${this_host}", "ts": ${when} } @@ -1952,7 +1981,7 @@ color="grey" alarm="${name//_/ } = ${value_string}" # the image of the alarm -image="${images_base_url}/images/seo-performance-128.png" +image="${images_base_url}/images/banner-icon-144x144.png" # prepare the title based on status case "${status}" in diff --git a/health/notifications/alarm-notify.sh.in b/health/notifications/alarm-notify.sh.in index 4aef3a521..ea8223097 100755 --- 
a/health/notifications/alarm-notify.sh.in +++ b/health/notifications/alarm-notify.sh.in @@ -896,7 +896,7 @@ date=$(date --date=@${when} "${date_format}" 2>/dev/null) # ---------------------------------------------------------------------------- # prepare some extra headers if we've been asked to thread e-mails -if [ "${SEND_EMAIL}" == "YES" -a "${EMAIL_THREADING}" == "YES" ] ; then +if [ "${SEND_EMAIL}" == "YES" -a "${EMAIL_THREADING}" != "NO" ] ; then email_thread_headers="In-Reply-To: <${chart}-${name}@${host}>\nReferences: <${chart}-${name}@${host}>" else email_thread_headers= @@ -1480,7 +1480,7 @@ send_slack() { { "channel": "#${channel}", "username": "netdata on ${host}", - "icon_url": "${images_base_url}/images/seo-performance-128.png", + "icon_url": "${images_base_url}/images/banner-icon-144x144.png", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", "attachments": [ { @@ -1545,7 +1545,7 @@ send_rocketchat() { { "channel": "#${channel}", "alias": "netdata on ${host}", - "avatar": "${images_base_url}/images/seo-performance-128.png", + "avatar": "${images_base_url}/images/banner-icon-144x144.png", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", "attachments": [ { @@ -1592,39 +1592,68 @@ EOF # alerta sender send_alerta() { - local webhook="${1}" channels="${2}" httpcode sent=0 channel severity content + local webhook="${1}" channels="${2}" httpcode sent=0 channel severity resource event payload auth [ "${SEND_ALERTA}" != "YES" ] && return 1 case "${status}" in - WARNING) severity="warning" ;; CRITICAL) severity="critical" ;; + WARNING) severity="warning" ;; CLEAR) severity="cleared" ;; - *) severity="unknown" ;; + *) severity="indeterminate" ;; esac - info=$( echo -n ${info}) + if [[ "${chart}" == httpcheck* ]] + then + resource=$chart + event=$name + else + resource="${host}:${family}" + event="${chart}.${name}" + fi - # the "event" property must be unique and repetible between states to let alerta do 
automatic correlation using severity value for channel in ${channels} do - content="{" - content="$content \"environment\": \"${channel}\"," - content="$content \"service\": [\"${host}\"]," - content="$content \"resource\": \"${host}\"," - content="$content \"event\": \"${name}.${chart} (${family})\"," - content="$content \"severity\": \"${severity}\"," - content="$content \"value\": \"${alarm}\"," - content="$content \"text\": \"${info}\"" - content="$content }" + payload="$(cat <<EOF + { + "resource": "${resource}", + "event": "${event}", + "environment": "${channel}", + "severity": "${severity}", + "service": ["Netdata"], + "group": "Performance", + "value": "${value_string}", + "text": "${info}", + "tags": ["alarm_id:${alarm_id}"], + "attributes": { + "roles": "${roles}", + "name": "${name}", + "chart": "${chart}", + "family": "${family}", + "source": "${src}", + "moreInfo": "<a href=\"${goto_url}\">View Netdata</a>" + }, + "origin": "netdata/${this_host}", + "type": "netdataAlarm", + "rawData": "${BASH_ARGV[@]}" + } +EOF + )" + if [[ -n "${ALERTA_API_KEY}" ]] + then + auth="Key ${ALERTA_API_KEY}" + fi - httpcode=$(docurl -X POST "${webhook}/alert" -H "Content-Type: application/json" -H "Authorization: Key $ALERTA_API_KEY" -d "$content" ) + httpcode=$(docurl -X POST "${webhook}/alert" -H "Content-Type: application/json" -H "Authorization: $auth" --data "${payload}") if [[ "${httpcode}" = "200" || "${httpcode}" = "201" ]] then info "sent alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'" sent=$((sent + 1)) + elif [[ "${httpcode}" = "202" ]] + then + info "suppressed alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}'" else error "failed to send alerta notification for: ${host} ${chart}.${name} is ${status} to '${channel}', with HTTP error code ${httpcode}." 
fi @@ -1655,7 +1684,7 @@ send_flock() { httpcode=$(docurl -X POST "${webhook}" -H "Content-Type: application/json" -d "{ \"sendAs\": { \"name\" : \"netdata on ${host}\", - \"profileImage\" : \"${images_base_url}/images/seo-performance-128.png\" + \"profileImage\" : \"${images_base_url}/images/banner-icon-144x144.png\" }, \"text\": \"${host} *${status_message}*\", \"timestamp\": \"${when}\", @@ -1715,7 +1744,7 @@ send_discord() { "channel": "#${channel}", "username": "${username}", "text": "${host} ${status_message}, \`${chart}\` (_${family}_), *${alarm}*", - "icon_url": "${images_base_url}/images/seo-performance-128.png", + "icon_url": "${images_base_url}/images/banner-icon-144x144.png", "attachments": [ { "color": "${color}", @@ -1729,7 +1758,7 @@ send_discord() { } ], "thumb_url": "${image}", - "footer_icon": "${images_base_url}/images/seo-performance-128.png", + "footer_icon": "${images_base_url}/images/banner-icon-144x144.png", "footer": "${this_host}", "ts": ${when} } @@ -1952,7 +1981,7 @@ color="grey" alarm="${name//_/ } = ${value_string}" # the image of the alarm -image="${images_base_url}/images/seo-performance-128.png" +image="${images_base_url}/images/banner-icon-144x144.png" # prepare the title based on status case "${status}" in diff --git a/health/notifications/alerta/README.md b/health/notifications/alerta/README.md index bbed23bac..cf43621ff 100644 --- a/health/notifications/alerta/README.md +++ b/health/notifications/alerta/README.md @@ -1,207 +1,50 @@ # alerta.io notifications -The alerta monitoring system is a tool used to consolidate and de-duplicate alerts from multiple sources for quick ‘at-a-glance’ visualisation. With just one system you can monitor alerts from many other monitoring tools on a single screen. +The [Alerta](https://alerta.io) monitoring system is a tool used to +consolidate and de-duplicate alerts from multiple sources for quick +‘at-a-glance’ visualisation. 
With just one system you can monitor +alerts from many other monitoring tools on a single screen. -![](http://docs.alerta.io/en/latest/_images/alerta-screen-shot-3.png) +![](https://docs.alerta.io/en/latest/_images/alerta-screen-shot-3.png) -When receiving alerts from multiple sources you can quickly become overwhelmed. With Alerta any alert with the same environment and resource is considered a duplicate if it has the same severity. If it has a different severity it is correlated so that you only see the most recent one. Awesome. +Netadata alarms can be sent to Alerta so you can see in one place +alerts coming from many Netdata hosts or also from a multi-host +Netadata configuration. The big advantage over other notifications +systems is that there is a main view of all active alarms with +the most recent state, and it is also possible to view alarm history. -main site http://www.alerta.io +## Deploying Alerta -We can send Netadata alarms to Alerta so yo can see in one place alerts coming from many Netdata hosts or also from a multihost Netadata configuration.\ -The big advantage over other notifications method is that you have in a main view all active alarms with only las state, but you can also search history. +It is recommended to set up the server in a separated server, VM or +container. If you have other Nginx or Apache server in your organization, +it is recommended to proxy to this new server. -## Setting up an Alerta server with Ubuntu 16.04 +The easiest way to install Alerta is to use the Docker image available +on [Docker hub][1]. Alternatively, follow the ["getting started"][2] +tutorial to deploy Alerta to an Ubuntu server. More advanced +configurations are out os scope of this tutorial but information +about different deployment scenaries can be found in the [docs][3]. -Here we will set a basic Alerta server to test it with Netdata alerts.\ -More advanced configurations are out os scope of this tutorial. 
+[1]: https://hub.docker.com/r/alerta/alerta-web/ +[2]: http://alerta.readthedocs.io/en/latest/gettingstarted/tutorial-1-deploy-alerta.html +[3]: http://docs.alerta.io/en/latest/deployment.html -source: http://alerta.readthedocs.io/en/latest/gettingstarted/tutorial-1-deploy-alerta.html +## Send alarms to Alerta -I recommend to set up the server in a separated server, VM or container.\ -If you have other Nginx or Apache server in your organization, I recommend to proxy to this new server. +Step 1. Create an API key (if authentication is enabled) -Set us as root for easiest working -``` -sudo su -cd -``` - -Install Mongodb https://docs.mongodb.com/manual/tutorial/install-mongodb-on-ubuntu/ -``` -apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2930ADAE8CAF5059EE73BB4B58712A2291FA4AD5 -echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.6 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-3.6.list -apt-get update -apt-get install -y mongodb-org -systemctl enable mongod -systemctl start mongod -systemctl status mongod -``` - -Install Nginx and Alerta uwsgi -``` -apt-get install -y python-pip python-dev nginx -pip install alerta-server uwsgi -``` +You will need an API key to send messages from any source, if +Alerta is configured to use authentication (recommended). To +create an API key go to "Configuration -> API Keys" and create +a new API key called "netdata" with `write:alerts` permission. -Install web console -``` -cd /var/www/html -mkdir alerta -cd alerta -wget -q -O - https://github.com/alerta/angular-alerta-webui/tarball/master | tar zxf - -mv alerta*/app/* . 
-cd -``` -## Services configuration - -Create a wsgi python file -``` -nano /var/www/wsgi.py -``` -fill with -``` -from alerta import app -``` -Create uWsgi configuration file -``` -nano /etc/uwsgi.ini -``` -fill with -``` -[uwsgi] -chdir = /var/www -mount = /alerta/api=wsgi.py -callable = app -manage-script-name = true - -master = true -processes = 5 -logger = syslog:alertad - -socket = /tmp/uwsgi.sock -chmod-socket = 664 -uid = www-data -gid = www-data -vacuum = true - -die-on-term = true -``` -Create a systemd configuration file -``` -nano /etc/systemd/system/uwsgi.service -``` -fill with -``` -[Unit] -Description=uWSGI service - -[Service] -ExecStart=/usr/local/bin/uwsgi --ini /etc/uwsgi.ini - -[Install] -WantedBy=multi-user.target -``` -enable service -``` -systemctl start uwsgi -systemctl status uwsgi -systemctl enable uwsgi -``` -Configure nginx to serve Alerta as a uWsgi application on /alerta/api -``` -nano /etc/nginx/sites-enabled/default -``` -fill with -``` -server { - listen 80 default_server; - listen [::]:80 default_server; - - location /alerta/api { try_files $uri @alerta/api; } - location @alerta/api { - include uwsgi_params; - uwsgi_pass unix:/tmp/uwsgi.sock; - proxy_set_header Host $host:$server_port; - proxy_set_header X-Real-IP $remote_addr; - proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; - } - - location / { - root /var/www/html; - } -} -``` -restart nginx -``` -service nginx restart -``` -## Config web console -``` -nano /var/www/html/config.js -``` -fill with -``` -'use strict'; - -angular.module('config', []) - .constant('config', { - 'endpoint' : "/alerta/api", - 'provider' : "basic", - 'colors' : {}, - 'severity' : {}, - 'audio' : {} - }); -``` - -## Config Alerta server - -source: http://alerta.readthedocs.io/en/latest/configuration.html - -Create a random string to use as SECRET_KEY -``` -cat /dev/urandom | tr -dc A-Za-z0-9_\!\@\#\$\%\^\&\*\(\)-+= | head -c 32 && echo -``` -will output something like -``` 
-0pv8Bw7VKfW6avDAz_TqzYPme_fYV%7g -``` -Edit alertad.conf -``` -nano /etc/alertad.conf -``` -fill with (take care about all single quotes) -``` -BASE_URL='/alerta/api' -AUTH_REQUIRED=True -SECRET_KEY='0pv8Bw7VKfW6avDAz_TqzYPme_fYV%7g' -ADMIN_USERS=['<here put you email for future login>'] -``` - -restart -``` -systemctl restart uwsgi -``` - -* go to console to http://yourserver/alerta/ -* go to Login -> Create an account -* use your email for login so and administrative account will be created - -## create an API KEY - -You need an API KEY to send messages from any source.\ -To create an API KEY go to Configuration -> Api Keys\ -Then create a API KEY with write permisions. - -## configure Netdata to send alarms to Alerta +Step 2. configure Netdata to send alarms to Alerta On your system run: -``` -/etc/netdata/edit-config health_alarm_notify.conf -``` + $ /etc/netdata/edit-config health_alarm_notify.conf -and set +and modify the file as below: ``` # enable/disable sending alerta notifications @@ -214,7 +57,7 @@ ALERTA_WEBHOOK_URL="http://yourserver/alerta/api" # Login with an administrative user to you Alerta server and create an API KEY # with write permissions. -ALERTA_API_KEY="you last created API KEY" +ALERTA_API_KEY="INSERT_YOUR_API_KEY_HERE" # you can define environments in /etc/alertad.conf option ALLOWED_ENVIRONMENTS # standard environments are Production and Development @@ -225,12 +68,13 @@ DEFAULT_RECIPIENT_ALERTA="Production" ## Test alarms -We can test alarms with standard -``` -sudo su -s /bin/bash netdata -/opt/netdata/netdata-plugins/plugins.d/alarm-notify.sh test -exit -``` -But the problem is that Netdata will send 3 alarms, and because last alarm is "CLEAR" you will not se them in main Alerta page, you need to select to see "closed" alarma in top-right lookup. 
+We can test alarms using the standard approach: + + $ /opt/netdata/netdata-plugins/plugins.d/alarm-notify.sh test + +Note: Netdata will send 3 alarms, and because last alarm is "CLEAR" +you will not se them in main Alerta page, you need to select to see +"closed" alarma in top-right lookup. A little change in `alarm-notify.sh` +that let us test each state one by one will be useful. -A little change in alarm-notify.sh that let us test each state one by one will be useful.
\ No newline at end of file
+For more information see [https://docs.alerta.io](https://docs.alerta.io)
diff --git a/health/notifications/health_alarm_notify.conf b/health/notifications/health_alarm_notify.conf
index 9e72aac4d..a997765a6 100755
--- a/health/notifications/health_alarm_notify.conf
+++ b/health/notifications/health_alarm_notify.conf
@@ -183,8 +183,9 @@ DEFAULT_RECIPIENT_EMAIL="root"
 # chart+alarm+host combination as a single thread. This can help
 # simplify tracking of alarms, as it provides an easy wway for scripts
 # to corelate messages and also will cause most clients to group all the
-# messages together. THis is off by default.
-#EMAIL_THREADING="YES"
+# messages together. This is enabled by default, uncomment the line
+# below if you want to disable it.
+#EMAIL_THREADING="NO"
 #------------------------------------------------------------------------------
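
Reviewer note: the `send_alerta()` hunks change how each alarm is mapped onto Alerta's `severity`, `resource` and `event` fields. The sketch below isolates that mapping so it can be checked without a running Alerta server. It is only an illustration: the function names `alerta_severity` and `alerta_resource_event` are hypothetical and do not exist in `alarm-notify.sh`, and the sample host, family, chart and alarm names in the usage lines are made up.

```sh
#!/bin/sh
# Sketch only: these helpers are NOT part of alarm-notify.sh. They restate
# the mapping logic of the patched send_alerta() in standalone form.

# Map a Netdata alarm status to the severity string used in the payload.
alerta_severity() {
    case "${1}" in
        CRITICAL) echo "critical" ;;
        WARNING)  echo "warning" ;;
        CLEAR)    echo "cleared" ;;
        *)        echo "indeterminate" ;;
    esac
}

# Print "resource<TAB>event" for an alarm.
# Arguments: $1=host $2=family $3=chart $4=alarm name
alerta_resource_event() {
    case "${3}" in
        # httpcheck charts: resource is the chart, event is the alarm name
        httpcheck*) printf '%s\t%s\n' "${3}" "${4}" ;;
        # everything else: resource is host:family, event is chart.name
        *)          printf '%s\t%s\n' "${1}:${2}" "${3}.${4}" ;;
    esac
}

# Example usage with made-up values:
alerta_severity WARNING
alerta_resource_event web1 eth0 net.eth0 1m_received_traffic_overflow
```

Because `event` no longer embeds the family in free text and `resource` is no longer just the host, two alarms on different interfaces of the same host produce distinct resource/event pairs, which appears to be what the removed comment about correlation was asking for.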