author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2022-01-26 18:05:15 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2022-01-26 18:05:42 +0000 |
commit | 112b5b91647c3dea45cc1c9bc364df526c8012f1 (patch) | |
tree | 450af925135ec664c4310a1eb28b69481094ee2a /collectors/python.d.plugin | |
parent | Releasing debian version 1.32.1-2. (diff) | |
download | netdata-112b5b91647c3dea45cc1c9bc364df526c8012f1.tar.xz netdata-112b5b91647c3dea45cc1c9bc364df526c8012f1.zip |
Merging upstream version 1.33.0.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'collectors/python.d.plugin')
11 files changed, 148 insertions, 52 deletions
diff --git a/collectors/python.d.plugin/anomalies/README.md b/collectors/python.d.plugin/anomalies/README.md
index c58c858bf..3552053ee 100644
--- a/collectors/python.d.plugin/anomalies/README.md
+++ b/collectors/python.d.plugin/anomalies/README.md
@@ -229,6 +229,7 @@ If you would like to go deeper on what exactly the anomalies collector is doing
 - If you activate this collector on a fresh node, it might take a little while to build up enough data to calculate a realistic and useful model.
 - Some models like `iforest` can be comparatively expensive (on same n1-standard-2 system above ~2s runtime during predict, ~40s training time, ~50% cpu on both train and predict) so if you would like to use it you might be advised to set a relatively high `update_every` maybe 10, 15 or 30 in `anomalies.conf`.
 - Setting a higher `train_every_n` and `update_every` is an easy way to devote less resources on the node to anomaly detection. Specifying less charts and a lower `train_n_secs` will also help reduce resources at the expense of covering less charts and maybe a more noisy model if you set `train_n_secs` to be too small for how your node tends to behave.
+- If you would like to enable this on a Rasberry Pi, then check out [this guide](https://learn.netdata.cloud/guides/monitor/raspberry-pi-anomaly-detection) which will guide you through first installing LLVM.

 ## Useful links and further reading

@@ -240,4 +241,4 @@ If you would like to go deeper on what exactly the anomalies collector is doing
 - Good [blog post](https://www.anodot.com/blog/what-is-anomaly-detection/) from Anodot on time series anomaly detection. Anodot also have some great whitepapers in this space too that some may find useful.
 - Novelty and outlier detection in the [scikit-learn documentation](https://scikit-learn.org/stable/modules/outlier_detection.html).

-[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fcollectors%2Fpython.d.plugin%2Fanomalies%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
\ No newline at end of file
+[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fcollectors%2Fpython.d.plugin%2Fanomalies%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]()
diff --git a/collectors/python.d.plugin/example/README.md b/collectors/python.d.plugin/example/README.md
index 561ea62ed..b1c21ecbc 100644
--- a/collectors/python.d.plugin/example/README.md
+++ b/collectors/python.d.plugin/example/README.md
@@ -5,7 +5,10 @@ custom_edit_url: https://github.com/netdata/netdata/edit/master/collectors/pytho

 # Example

-An example python data collection module.
-You can use this example to help you [write a new Python module](../#how-to-write-a-new-module).
+You can add custom data collectors using Python.
+
+Netdata provides an [example python data collection module](https://github.com/netdata/netdata/tree/master/collectors/python.d.plugin/example).
+
+If you want to write your own collector, read our [writing a new Python module](/collectors/python.d.plugin/README.md#how-to-write-a-new-module) tutorial.

 [![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fcollectors%2Fpython.d.plugin%2Fexample%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)](<>)
diff --git a/collectors/python.d.plugin/fail2ban/README.md b/collectors/python.d.plugin/fail2ban/README.md
index c1ad994a5..90a59dce0 100644
--- a/collectors/python.d.plugin/fail2ban/README.md
+++ b/collectors/python.d.plugin/fail2ban/README.md
@@ -10,14 +10,55 @@ Monitors the fail2ban log file to show all bans for all active jails.

 ## Requirements

-- fail2ban.log file MUST BE readable by Netdata (A good idea is to add **create 0640 root netdata** to fail2ban conf at logrotate.d)
+The `fail2ban.log` file must be readable by the user `netdata`:

-It produces one chart with multiple lines (one line per jail)
+- change the file ownership and access permissions.
+- update `/etc/logrotate.d/fail2ban` to persists the changes after rotating the log file.
+
+<details>
+  <summary>Click to expand the instruction.</summary>
+
+To change the file ownership and access permissions, execute the following:
+
+```shell
+sudo chown root:netdata /var/log/fail2ban.log
+sudo chmod 640 /var/log/fail2ban.log
+```
+
+To persist the changes after rotating the log file, add `create 640 root netdata` to the `/etc/logrotate.d/fail2ban`:
+
+```shell
+/var/log/fail2ban.log {
+
+    weekly
+    rotate 4
+    compress
+
+    delaycompress
+    missingok
+    postrotate
+        fail2ban-client flushlogs 1>/dev/null
+    endscript
+
+    # If fail2ban runs as non-root it still needs to have write access
+    # to logfiles.
+    # create 640 fail2ban adm
+    create 640 root netdata
+}
+```
+
+</details>
+
+## Charts
+
+- Failed attempts in attempts/s
+- Bans in bans/s
+- Banned IP addresses (since the last restart of netdata) in ips

 ## Configuration

-Edit the `python.d/fail2ban.conf` configuration file using `edit-config` from the Netdata [config
-directory](/docs/configure/nodes.md), which is typically at `/etc/netdata`.
+Edit the `python.d/fail2ban.conf` configuration file using `edit-config` from the
+Netdata [config directory](/docs/configure/nodes.md), which is typically at `/etc/netdata`.

 ```bash
 cd /etc/netdata   # Replace this path with your Netdata config directory, if different
@@ -28,13 +69,13 @@ Sample:

 ```yaml
 local:
- log_path: '/var/log/fail2ban.log'
- conf_path: '/etc/fail2ban/jail.local'
- exclude: 'dropbear apache'
+  log_path: '/var/log/fail2ban.log'
+  conf_path: '/etc/fail2ban/jail.local'
+  exclude: 'dropbear apache'
 ```

-If no configuration is given, module will attempt to read log file at `/var/log/fail2ban.log` and conf file at `/etc/fail2ban/jail.local`.
-If conf file is not found default jail is `ssh`.
+If no configuration is given, module will attempt to read log file at `/var/log/fail2ban.log` and conf file
+at `/etc/fail2ban/jail.local`. If conf file is not found default jail is `ssh`.

 ---

diff --git a/collectors/python.d.plugin/fail2ban/fail2ban.chart.py b/collectors/python.d.plugin/fail2ban/fail2ban.chart.py
index 99dbf79dd..76f6d92b4 100644
--- a/collectors/python.d.plugin/fail2ban/fail2ban.chart.py
+++ b/collectors/python.d.plugin/fail2ban/fail2ban.chart.py
@@ -11,8 +11,9 @@ from glob import glob
 from bases.FrameworkServices.LogService import LogService

 ORDER = [
+    'jails_failed_attempts',
     'jails_bans',
-    'jails_in_jail',
+    'jails_banned_ips',
 ]


@@ -23,40 +24,49 @@ def charts(jails):

     ch = {
         ORDER[0]: {
-            'options': [None, 'Jails Ban Rate', 'bans/s', 'bans', 'jail.bans', 'line'],
+            'options': [None, 'Failed attempts', 'attempts/s', 'failed attempts', 'fail2ban.failed_attempts', 'line'],
             'lines': []
         },
         ORDER[1]: {
-            'options': [None, 'Banned IPs (since the last restart of netdata)', 'IPs', 'in jail',
-                        'jail.in_jail', 'line'],
+            'options': [None, 'Bans', 'bans/s', 'bans', 'fail2ban.bans', 'line'],
+            'lines': []
+        },
+        ORDER[2]: {
+            'options': [None, 'Banned IP addresses (since the last restart of netdata)', 'ips', 'banned ips',
+                        'fail2ban.banned_ips', 'line'],
             'lines': []
         },
     }
     for jail in jails:
-        dim = [
-            jail,
-            jail,
-            'incremental',
-        ]
+        dim = ['{0}_failed_attempts'.format(jail), jail, 'incremental']
         ch[ORDER[0]]['lines'].append(dim)

-        dim = [
-            '{0}_in_jail'.format(jail),
-            jail,
-            'absolute',
-        ]
+        dim = [jail, jail, 'incremental']
         ch[ORDER[1]]['lines'].append(dim)

+        dim = ['{0}_in_jail'.format(jail), jail, 'absolute']
+        ch[ORDER[2]]['lines'].append(dim)
+
     return ch


 RE_JAILS = re.compile(r'\[([a-zA-Z0-9_-]+)\][^\[\]]+?enabled\s+= +(true|yes|false|no)')

+ACTION_BAN = 'Ban'
+ACTION_UNBAN = 'Unban'
+ACTION_RESTORE_BAN = 'Restore Ban'
+ACTION_FOUND = 'Found'
+
 # Example:
-# 2018-09-12 11:45:53,715 fail2ban.actions[25029]: WARNING [ssh] Unban 195.201.88.33
-# 2018-09-12 11:45:58,727 fail2ban.actions[25029]: WARNING [ssh] Ban 217.59.246.27
-# 2018-09-12 11:45:58,727 fail2ban.actions[25029]: WARNING [ssh] Restore Ban 217.59.246.27
-RE_DATA = re.compile(r'\[(?P<jail>[A-Za-z-_0-9]+)\] (?P<action>Unban|Ban|Restore Ban) (?P<ip>[a-f0-9.:]+)')
+# 2018-09-12 11:45:58,727 fail2ban.actions[25029]: WARNING [ssh] Found 203.0.113.1
+# 2018-09-12 11:45:58,727 fail2ban.actions[25029]: WARNING [ssh] Ban 203.0.113.1
+# 2018-09-12 11:45:58,727 fail2ban.actions[25029]: WARNING [ssh] Restore Ban 203.0.113.1
+# 2018-09-12 11:45:53,715 fail2ban.actions[25029]: WARNING [ssh] Unban 203.0.113.1
+RE_DATA = re.compile(
+    r'\[(?P<jail>[A-Za-z-_0-9]+)\] (?P<action>{0}|{1}|{2}|{3}) (?P<ip>[a-f0-9.:]+)'.format(
+        ACTION_BAN, ACTION_UNBAN, ACTION_RESTORE_BAN, ACTION_FOUND
+    )
+)

 DEFAULT_JAILS = [
     'ssh',
@@ -94,6 +104,7 @@ class Service(LogService):
         self.monitoring_jails = self.jails_auto_detection()

         for jail in self.monitoring_jails:
+            self.data['{0}_failed_attempts'.format(jail)] = 0
             self.data[jail] = 0
             self.data['{0}_in_jail'.format(jail)] = 0

@@ -124,12 +135,14 @@ class Service(LogService):

             jail, action, ip = match['jail'], match['action'], match['ip']

-            if action == 'Ban' or action == 'Restore Ban':
+            if action == ACTION_FOUND:
+                self.data['{0}_failed_attempts'.format(jail)] += 1
+            elif action in (ACTION_BAN, ACTION_RESTORE_BAN):
                 self.data[jail] += 1
                 if ip not in self.banned_ips[jail]:
                     self.banned_ips[jail].add(ip)
                     self.data['{0}_in_jail'.format(jail)] += 1
-            else:
+            elif action == ACTION_UNBAN:
                 if ip in self.banned_ips[jail]:
                     self.banned_ips[jail].remove(ip)
                     self.data['{0}_in_jail'.format(jail)] -= 1
@@ -196,9 +209,9 @@ class Service(LogService):
             if name in exclude:
                 continue

-            if status in ('true','yes') and name not in active_jails:
+            if status in ('true', 'yes') and name not in active_jails:
                 active_jails.append(name)
-            elif status in ('false','no') and name in active_jails:
+            elif status in ('false', 'no') and name in active_jails:
                 active_jails.remove(name)

         return active_jails or DEFAULT_JAILS
diff --git a/collectors/python.d.plugin/mongodb/README.md b/collectors/python.d.plugin/mongodb/README.md
index c0df123d7..e122736ac 100644
--- a/collectors/python.d.plugin/mongodb/README.md
+++ b/collectors/python.d.plugin/mongodb/README.md
@@ -152,7 +152,7 @@ Number of charts depends on mongodb version, storage engine and other features (

 - member (time when last heartbeat was received from replica set member)

-## prerequisite
+## Prerequisite

 Create a read-only user for Netdata in the admin database.

diff --git a/collectors/python.d.plugin/nvidia_smi/nvidia_smi.chart.py b/collectors/python.d.plugin/nvidia_smi/nvidia_smi.chart.py
index 9c69586dd..00bc7884d 100644
--- a/collectors/python.d.plugin/nvidia_smi/nvidia_smi.chart.py
+++ b/collectors/python.d.plugin/nvidia_smi/nvidia_smi.chart.py
@@ -28,6 +28,7 @@ GPU_UTIL = 'gpu_utilization'
 MEM_UTIL = 'mem_utilization'
 ENCODER_UTIL = 'encoder_utilization'
 MEM_USAGE = 'mem_usage'
+BAR_USAGE = 'bar1_mem_usage'
 TEMPERATURE = 'temperature'
 CLOCKS = 'clocks'
 POWER = 'power'
@@ -42,6 +43,7 @@ ORDER = [
     MEM_UTIL,
     ENCODER_UTIL,
     MEM_USAGE,
+    BAR_USAGE,
     TEMPERATURE,
     CLOCKS,
     POWER,
@@ -95,6 +97,13 @@ def gpu_charts(gpu):
                 ['fb_memory_used', 'used'],
             ]
         },
+        BAR_USAGE: {
+            'options': [None, 'Bar1 Memory Usage', 'MiB', fam, 'nvidia_smi.bar1_memory_usage', 'stacked'],
+            'lines': [
+                ['bar1_memory_free', 'free'],
+                ['bar1_memory_used', 'used'],
+            ]
+        },
         TEMPERATURE: {
             'options': [None, 'Temperature', 'celsius', fam, 'nvidia_smi.temperature', 'line'],
             'lines': [
@@ -285,12 +294,14 @@ def get_username_by_pid_safe(pid, passwd_file):
     try:
         uid = os.stat(path).st_uid
     except (OSError, IOError):
-        return ''
-
-    try:
-        return passwd_file[uid][0]
-    except KeyError:
-        return str(uid)
+        return ''
+    if IS_INSIDE_DOCKER:
+        try:
+            return passwd_file[uid][0]
+        except KeyError:
+            return str(uid)
+    else:
+        return pwd.getpwuid(uid)[0]


 class GPU:
@@ -345,6 +356,14 @@ class GPU:
         return self.root.find('fb_memory_usage').find('free').text.split()[0]

     @handle_attr_error
+    def bar1_memory_used(self):
+        return self.root.find('bar1_memory_usage').find('used').text.split()[0]
+
+    @handle_attr_error
+    def bar1_memory_free(self):
+        return self.root.find('bar1_memory_usage').find('free').text.split()[0]
+
+    @handle_attr_error
     def temperature(self):
         return self.root.find('temperature').find('gpu_temp').text.split()[0]

@@ -399,6 +418,8 @@ class GPU:
             'decoder_util': self.decoder_util(),
             'fb_memory_used': self.fb_memory_used(),
             'fb_memory_free': self.fb_memory_free(),
+            'bar1_memory_used': self.bar1_memory_used(),
+            'bar1_memory_free': self.bar1_memory_free(),
             'gpu_temp': self.temperature(),
             'graphics_clock': self.graphics_clock(),
             'video_clock': self.video_clock(),
diff --git a/collectors/python.d.plugin/postgres/postgres.chart.py b/collectors/python.d.plugin/postgres/postgres.chart.py
index 29026a6a3..bd8f71a66 100644
--- a/collectors/python.d.plugin/postgres/postgres.chart.py
+++ b/collectors/python.d.plugin/postgres/postgres.chart.py
@@ -336,18 +336,18 @@ WHERE d.datallowconn;
 QUERY_TABLE_STATS = {
     DEFAULT: """
 SELECT
-    ((sum(relpages) * 8) * 1024) AS table_size,
-    count(1) AS table_count
+    sum(relpages) * current_setting('block_size')::numeric AS table_size,
+    count(1) AS table_count
 FROM pg_class
-WHERE relkind IN ('r', 't');
+WHERE relkind IN ('r', 't', 'm');
 """,
 }

 QUERY_INDEX_STATS = {
     DEFAULT: """
 SELECT
-    ((sum(relpages) * 8) * 1024) AS index_size,
-    count(1) AS index_count
+    sum(relpages) * current_setting('block_size')::numeric AS index_size,
+    count(1) AS index_count
 FROM pg_class
 WHERE relkind = 'i';
 """,
diff --git a/collectors/python.d.plugin/python.d.plugin.in b/collectors/python.d.plugin/python.d.plugin.in
index b263f229e..b943f3a20 100644
--- a/collectors/python.d.plugin/python.d.plugin.in
+++ b/collectors/python.d.plugin/python.d.plugin.in
@@ -1,6 +1,6 @@
 #!/usr/bin/env bash
 '''':;
-pybinary=$(which python || which python3 || which python2)
+pybinary=$(which python3 || which python || which python2)
 filtered=()
 for arg in "$@"
 do
diff --git a/collectors/python.d.plugin/python_modules/bases/FrameworkServices/ExecutableService.py b/collectors/python.d.plugin/python_modules/bases/FrameworkServices/ExecutableService.py
index dea50eea0..a74b4239e 100644
--- a/collectors/python.d.plugin/python_modules/bases/FrameworkServices/ExecutableService.py
+++ b/collectors/python.d.plugin/python_modules/bases/FrameworkServices/ExecutableService.py
@@ -35,7 +35,7 @@ class ExecutableService(SimpleService):
         for line in std:
             try:
                 data.append(line.decode('utf-8'))
-            except TypeError:
+            except (TypeError, UnicodeDecodeError):
                 continue

         return data
diff --git a/collectors/python.d.plugin/spigotmc/spigotmc.chart.py b/collectors/python.d.plugin/spigotmc/spigotmc.chart.py
index f334113e4..81370fb4c 100644
--- a/collectors/python.d.plugin/spigotmc/spigotmc.chart.py
+++ b/collectors/python.d.plugin/spigotmc/spigotmc.chart.py
@@ -22,6 +22,7 @@ COMMAND_ONLINE = 'online'

 ORDER = [
     'tps',
+    'mem',
     'users',
 ]

@@ -39,15 +40,27 @@ CHARTS = {
         'lines': [
             ['users', 'Users', 'absolute', 1, 1]
         ]
+    },
+    'mem': {
+        'options': [None, 'Minecraft Memory Usage', 'MiB', 'spigotmc', 'spigotmc.mem', 'line'],
+        'lines': [
+            ['mem_used', 'used', 'absolute', 1, 1],
+            ['mem_alloc', 'allocated', 'absolute', 1, 1],
+            ['mem_max', 'max', 'absolute', 1, 1]
+        ]
     }
 }

 _TPS_REGEX = re.compile(
+    # Examples:
+    # §6TPS from last 1m, 5m, 15m: §a*20.0, §a*20.0, §a*20.0
+    # §6Current Memory Usage: §a936/65536 mb (Max: 65536 mb)
     r'^.*: .*?'  # Message lead-in
     r'(\d{1,2}.\d+), .*?'  # 1-minute TPS value
     r'(\d{1,2}.\d+), .*?'  # 5-minute TPS value
-    r'(\d{1,2}\.\d+).*$',  # 15-minute TPS value
-    re.X
+    r'(\d{1,2}\.\d+).*?'  # 15-minute TPS value
+    r'(\s.*?(\d+)\/(\d+).*?: (\d+).*)?',  # Current Memory Usage / Total Memory (Max Memory)
+    re.MULTILINE
 )
 _LIST_REGEX = re.compile(
     # Examples:
@@ -126,6 +139,10 @@ class Service(SimpleService):
                 data['tps1'] = int(float(match.group(1)) * PRECISION)
                 data['tps5'] = int(float(match.group(2)) * PRECISION)
                 data['tps15'] = int(float(match.group(3)) * PRECISION)
+                if match.group(4):
+                    data['mem_used'] = int(match.group(5))
+                    data['mem_alloc'] = int(match.group(6))
+                    data['mem_max'] = int(match.group(7))
             else:
                 self.error('Unable to process TPS values.')
                 if not raw:
diff --git a/collectors/python.d.plugin/web_log/README.md b/collectors/python.d.plugin/web_log/README.md
index 2cf60ed9e..8bbb9a83a 100644
--- a/collectors/python.d.plugin/web_log/README.md
+++ b/collectors/python.d.plugin/web_log/README.md
@@ -31,7 +31,7 @@ If Netdata is installed on a system running a web server, it will detect it and
 ![image](https://cloud.githubusercontent.com/assets/2662304/22900686/e283f636-f237-11e6-93d2-cbdf63de150c.png)
 *[**netdata**](https://my-netdata.io/) charts based on metrics collected by querying the `nginx` API (i.e. `/stub_status`).*

-> [**netdata**](https://my-netdata.io/) supports `apache`, `nginx`, `lighttpd` and `tomcat`. To obtain real-time information from a web server API, the web server needs to expose it. For directions on configuring your web server, check the config files for each web server. There is a directory with a config file for each web server under [`/etc/netdata/python.d/`](../).
+> [**netdata**](https://my-netdata.io/) supports `apache`, `nginx`, `lighttpd` and `tomcat`. To obtain real-time information from a web server API, the web server needs to expose it. For directions on configuring your web server, check the config files for each web server. There is a directory with a config file for each web server under `/etc/netdata/python.d/`.

 ## Configuration

@@ -120,7 +120,7 @@ This is a nice view of the traffic the web server is receiving and is sending.

 What is important to know for this chart, is that the bandwidth used for each request and response is accounted at the time the log is written. Since [**netdata**](https://my-netdata.io/) refreshes this chart every single second, you may have unrealistic spikes is the size of the requests or responses is too big. The reason is simple: a response may have needed 1 minute to be completed, but all the bandwidth used during that minute for the specific response will be accounted at the second the log line is written.

-As the legend on the chart suggests, you can use FireQoS to setup QoS on the web server ports and IPs to accurately measure the bandwidth the web server is using. Actually, [there may be a few more reasons to install QoS on your servers](/collectors/tc.plugin/README.md#tcplugin)...
+As the legend on the chart suggests, you can use FireQOS to setup QoS on the web server ports and IPs to accurately measure the bandwidth the web server is using. Actually, [there may be a few more reasons to install QoS on your servers](/collectors/tc.plugin/README.md#tcplugin)...

 **Bandwidth** KB/s |