Diffstat (limited to 'python.d/README.md')
-rw-r--r-- | python.d/README.md | 657 |
1 files changed, 542 insertions, 115 deletions
diff --git a/python.d/README.md b/python.d/README.md index 009265f72..faabba2c7 100644 --- a/python.d/README.md +++ b/python.d/README.md @@ -29,7 +29,7 @@ local: # job name update_every : 5 # job update frequency other_var1 : some_val # module specific variable -other_job: +other_job: priority : 5 # job position on dashboard retries : 20 # job retries other_var2 : val # module specific variable @@ -43,7 +43,7 @@ The following python.d modules are supported: # apache -This module will monitor one or more apache servers depending on configuration. +This module will monitor one or more apache servers depending on configuration. **Requirements:** * apache with enabled `mod_status` @@ -60,20 +60,20 @@ It produces the following charts: * keepalive * closing * writing - + 4. **Bandwidth** in kilobytes/s * sent - + 5. **Workers** * idle * busy - + 6. **Lifetime Avg. Requests/s** in requests/s * requests_sec - + 7. **Lifetime Avg. Bandwidth/s** in kilobytes/s * size_sec - + 8. **Lifetime Avg. Response Size** in bytes/request * size_req @@ -109,7 +109,7 @@ Module monitors apache mod_cache log and produces only one chart: * hit * miss * other - + ### configuration Sample: @@ -138,14 +138,14 @@ Module provides server and tube level statistics: 1. **Cpu usage** in cpu time * user * system - + 2. **Jobs rate** in jobs/s * total * timeouts - + 3. **Connections rate** in connections/s * connections - + 4. **Commands rate** in commands/s * put * peek @@ -167,27 +167,27 @@ Module provides server and tube level statistics: * list-tube-used * list-tubes-watched * pause-tube - + 5. **Current tubes** in tubes * tubes - + 6. **Current jobs** in jobs * urgent * ready * reserved * delayed * buried - + 7. **Current connections** in connections * written * producers * workers * waiting - + 8. **Binlog** in records/s * written * migrated - + 9. **Uptime** in seconds * uptime @@ -195,7 +195,7 @@ Module provides server and tube level statistics: 1. **Jobs rate** in jobs/s * jobs - + 2. 
**Jobs** in jobs * using * ready @@ -211,12 +211,12 @@ Module provides server and tube level statistics: 4. **Commands** in commands/s * deletes * pauses - + 5. **Pause** in seconds * since * left - + ### configuration Sample: @@ -252,7 +252,7 @@ It produces: * recursion * duplicate * rejections - + 2. **Incoming queries** * RESERVED0 * A @@ -273,7 +273,7 @@ It produces: * SPF * ANY * DLV - + 3. **Outgoing queries** * Same as Incoming queries @@ -323,6 +323,39 @@ local: --- +# ceph + +This module monitors the ceph cluster usage and consuption data of a server. + +It produces: + +* Cluster statistics (usage, available, latency, objects, read/write rate) +* OSD usage +* OSD latency +* Pool usage +* Pool read/write operations +* Pool read/write rate +* number of objects per pool + +**Requirements:** + +- `rados` python module +- Granting read permissions to ceph group from keyring file +```shell +# chmod 640 /etc/ceph/ceph.client.admin.keyring +``` + +### Configuration + +Sample: +```yaml +local: + config_file: '/etc/ceph/ceph.conf' + keyring_file: '/etc/ceph/ceph.client.admin.keyring' +``` + +--- + # couchdb This module monitors vital statistics of a local Apache CouchDB 2.x server, including: @@ -467,13 +500,13 @@ localhost: # dovecot -This module provides statistics information from dovecot server. +This module provides statistics information from dovecot server. Statistics are taken from dovecot socket by executing `EXPORT global` command. More information about dovecot stats can be found on [project wiki page.](http://wiki2.dovecot.org/Statistics) **Requirement:** Dovecot unix socket with R/W permissions for user netdata or dovecot with configured TCP/IP socket. - + Module gives information with following charts: 1. **sessions** @@ -482,25 +515,25 @@ Module gives information with following charts: 2. **logins** * logins -3. **commands** - number of IMAP commands +3. **commands** - number of IMAP commands * commands - + 4. **Faults** * minor * major - -5. 
**Context Switches** + +5. **Context Switches** * volountary * involountary - + 6. **disk** in bytes/s * read * write - + 7. **bytes** in bytes/s * read * write - + 8. **number of syscalls** in syscalls/s * read * write @@ -509,7 +542,7 @@ Module gives information with following charts: * path * attr -10. **hits** - number of cache hits +10. **hits** - number of cache hits * hits 11. **attempts** - authorization attemts @@ -519,7 +552,7 @@ Module gives information with following charts: 12. **cache** - cached authorization hits * hit * miss - + ### configuration Sample: @@ -561,7 +594,7 @@ It produces: * Time spent on garbage collections 4. **Host metrics** charts: - * Available file descriptors in percent + * Available file descriptors in percent * Opened HTTP connections * Cluster communication transport metrics @@ -602,7 +635,7 @@ If no configuration is given, module will fail to run. # exim -Simple module executing `exim -bpc` to grab exim queue. +Simple module executing `exim -bpc` to grab exim queue. This command can take a lot of time to finish its execution thus it is not recommended to run it every second. It produces only one chart: @@ -616,13 +649,13 @@ Configuration is not needed. # fail2ban -Module monitor fail2ban log file to show all bans for all active jails +Module monitor fail2ban log file to show all bans for all active jails **Requirements:** * fail2ban.log file MUST BE readable by netdata (A good idea is to add **create 0640 root netdata** to fail2ban conf at logrotate.d) - + It produces one chart with multiple lines (one line per jail) - + ### configuration Sample: @@ -691,14 +724,14 @@ local: port : '18121' secret : 'adminsecret' acct : False # Freeradius accounting statistics. - proxy_auth : False # Freeradius proxy authentication statistics. + proxy_auth : False # Freeradius proxy authentication statistics. proxy_acct : False # Freeradius proxy accounting statistics. 
``` **Freeradius server configuration:** The configuration for the status server is automatically created in the sites-available directory. -By default, server is enabled and can be queried from every client. +By default, server is enabled and can be queried from every client. FreeRADIUS will only respond to status-server messages, if the status-server virtual server has been enabled. To do this, create a link from the sites-enabled directory to the status file in the sites-available directory: @@ -721,30 +754,30 @@ For the memory statistics, it produces the following charts: 1. **Heap allocations** in kB * alloc: size of objects allocated on the heap - * inuse: size of allocated heap spans - + * inuse: size of allocated heap spans + 2. **Stack allocations** in kB * inuse: size of allocated stack spans - + 3. **MSpan allocations** in kB * inuse: size of allocated mspan structures - + 4. **MCache allocations** in kB * inuse: size of allocated mcache structures - + 5. **Virtual memory** in kB * sys: size of reserved virtual address space - + 6. **Live objects** * live: number of live objects in memory - + 7. **GC pauses average** in ns * avg: average duration of all GC stop-the-world pauses - + ### configuration - + Please see the [wiki page](https://github.com/firehol/netdata/wiki/Monitoring-Go-Applications#using-netdata-go_expvar-module) for detailed info about module configuration. - + --- # haproxy @@ -760,13 +793,13 @@ Socket MUST be readable AND writable by netdata user. It produces: 1. **Frontend** family charts - * Kilobytes in/s + * Kilobytes in/s * Kilobytes out/s * Sessions current * Sessions in queue current 2. **Backend** family charts - * Kilobytes in/s + * Kilobytes in/s * Kilobytes out/s * Sessions current * Sessions in queue current @@ -798,7 +831,7 @@ If no configuration is given, module will fail to run. --- # hddtemp - + Module monitors disk temperatures from one or more hddtemp daemons. 
**Requirement:** @@ -820,6 +853,75 @@ If no configuration is given, module will attempt to connect to hddtemp daemon o --- +# httpcheck + +Module monitors remote http server for availability and response time. + +Following charts are drawn per job: + +1. **Response time** ms + * Time in 0.1 ms resolution in which the server responds. + If the connection failed, the value is missing. + +2. **Status** boolean + * Connection successful + * Unexpected content: No Regex match found in the response + * Unexpected status code: Do we get 500 errors? + * Connection failed: port not listening or blocked + * Connection timed out: host or port unreachable + +### configuration + +Sample configuration and their default values. + +```yaml +server: + url: 'http://host:port/path' # required + status_accepted: # optional + - 200 + timeout: 1 # optional, supports decimals (e.g. 0.2) + update_every: 3 # optional + regex: 'REGULAR_EXPRESSION' # optional, see https://docs.python.org/3/howto/regex.html + redirect: yes # optional +``` + +### notes + + * The status chart is primarily intended for alarms, badges or for access via API. + * A system/service/firewall might block netdata's access if a portscan or + similar is detected. + * This plugin is meant for simple use cases. Currently, the accuracy of the + response time is low and should be used as reference only. + +--- + +# icecast + +This module will monitor number of listeners for active sources. + +**Requirements:** + * icecast version >= 2.4.0 + +It produces the following charts: + +1. **Listeners** in listeners + * source number + +### configuration + +Needs only `url` to server's `/status-json.xsl` + +Here is an example for remote server: + +```yaml +remote: + url : 'http://1.2.3.4:8443/status-json.xsl' +``` + +Without configuration, module attempts to connect to `http://localhost:8443/status-json.xsl` + +--- + # IPFS Module monitors [IPFS](https://ipfs.io) basic information. 
@@ -827,13 +929,13 @@ Module monitors [IPFS](https://ipfs.io) basic information. 1. **Bandwidth** in kbits/s * in * out - + 2. **Peers** * peers - + ### configuration -Only url to IPFS server is needed. +Only url to IPFS server is needed. Sample: @@ -860,11 +962,11 @@ It produces: 2. **Total leases** * leases (overall number of leases for all pools) - + 3. **Active leases** for every pools * leases (number of active leases in pool) - + ### configuration Sample: @@ -888,8 +990,8 @@ Module monitor /proc/mdstat It produces: 1. **Health** Number of failed disks in every array (aggregate chart). - -2. **Disks stats** + +2. **Disks stats** * total (number of devices array ideally would have) * inuse (number of devices currently are in use) @@ -898,11 +1000,11 @@ It produces: * recovery in percent * reshape in percent * check in percent - + 4. **Operation status** (if resync/recovery/reshape/check is active) * finish in minutes * speed in megabytes/s - + ### configuration No configuration is needed. @@ -915,20 +1017,20 @@ Memcached monitoring module. Data grabbed from [stats interface](https://github. 1. **Network** in kilobytes/s * read * written - + 2. **Connections** per second * current * rejected * total - + 3. **Items** in cluster * current * total - + 4. **Evicted and Reclaimed** items * evicted * reclaimed - + 5. **GET** requests/s * hits * misses @@ -938,7 +1040,7 @@ Memcached monitoring module. Data grabbed from [stats interface](https://github. 7. **SET rate** rate in requests/s * rate - + 8. **DELETE** requests/s * hits * misses @@ -947,22 +1049,22 @@ Memcached monitoring module. Data grabbed from [stats interface](https://github. * hits * misses * bad value - + 10. **Increment** requests/s * hits * misses - + 11. **Decrement** requests/s * hits * misses - + 12. **Touch** requests/s * hits * misses - + 13. 
**Touch rate** rate in requests/s * rate - + ### configuration Sample: @@ -1214,7 +1316,7 @@ If no configuration is given, module will attempt to connect to mysql server via # nginx -This module will monitor one or more nginx servers depending on configuration. Servers can be either local or remote. +This module will monitor one or more nginx servers depending on configuration. Servers can be either local or remote. **Requirements:** * nginx with configured 'ngx_http_stub_status_module' @@ -1234,11 +1336,11 @@ It produces following charts: * reading * writing * waiting - + 4. **Connections Rate** in connections/s * accepts * handled - + ### configuration Needs only `url` to server's `stub_status` @@ -1258,6 +1360,132 @@ Without configuration, module attempts to connect to `http://localhost/stub_stat --- +# nginx_plus + +This module will monitor one or more nginx_plus servers depending on configuration. +Servers can be either local or remote. + +Example nginx_plus configuration can be found in 'python.d/nginx_plus.conf' + +It produces following charts: + +1. **Requests total** in requests/s + * total + +2. **Requests current** in requests + * current + +3. **Connection Statistics** in connections/s + * accepted + * dropped + +4. **Workers Statistics** in workers + * idle + * active + +5. **SSL Handshakes** in handshakes/s + * successful + * failed + +6. **SSL Session Reuses** in sessions/s + * reused + +7. **SSL Memory Usage** in percent + * usage + +8. **Processes** in processes + * respawned + +For every server zone: + +1. **Processing** in requests + * processing + +2. **Requests** in requests/s + * requests + +3. **Responses** in requests/s + * 1xx + * 2xx + * 3xx + * 4xx + * 5xx + +4. **Traffic** in kilobits/s + * received + * sent + +For every upstream: + +1. **Peers Requests** in requests/s + * peer name (dimension per peer) + +2. **All Peers Responses** in responses/s + * 1xx + * 2xx + * 3xx + * 4xx + * 5xx + +3. 
**Peer Responses** in requests/s (for every peer) + * 1xx + * 2xx + * 3xx + * 4xx + * 5xx + +4. **Peers Connections** in active + * peer name (dimension per peer) + +5. **Peers Connections Usage** in percent + * peer name (dimension per peer) + +6. **All Peers Traffic** in KB + * received + * sent + +7. **Peer Traffic** in KB/s (for every peer) + * received + * sent + +8. **Peer Timings** in ms (for every peer) + * header + * response + +9. **Memory Usage** in percent + * usage + +10. **Peers Status** in state + * peer name (dimension per peer) + +11. **Peers Total Downtime** in seconds + * peer name (dimension per peer) + +For every cache: + +1. **Traffic** in KB + * served + * written + * bypass + +2. **Memory Usage** in percent + * usage + +### configuration + +Needs only `url` to server's `status` + +Here is an example for local server: + +```yaml +local: + url : 'http://localhost/status' +``` + +Without configuration, module fail to start. + +--- + # nsd Module uses the `nsd-control stats_noreset` command to provide `nsd` statistics. @@ -1313,9 +1541,81 @@ Configuration is not needed. --- +# ntpd + +Module monitors the system variables of the local `ntpd` daemon (optional incl. variables of the polled peers) using the NTP Control Message Protocol via UDP socket, similar to `ntpq`, the [standard NTP query program](http://doc.ntp.org/current-stable/ntpq.html). + +**Requirements:** + * Version: `NTPv4` + * Local interrogation allowed in `/etc/ntp.conf` (default): + +``` +# Local users may interrogate the ntp server more closely. +restrict 127.0.0.1 +restrict ::1 +``` + +It produces: + +1. system + * offset + * jitter + * frequency + * delay + * dispersion + * stratum + * tc + * precision + +2. 
peers + * offset + * delay + * dispersion + * jitter + * rootdelay + * rootdispersion + * stratum + * hmode + * pmode + * hpoll + * ppoll + * precision + +**configuration** + +Sample: + +```yaml +update_every: 10 + +host: 'localhost' +port: '123' +show_peers: yes +# hide peers with source address in ranges 127.0.0.0/8 and 192.168.0.0/16 +peer_filter: '(127\..*)|(192\.168\..*)' +# check for new/changed peers every 60 updates +peer_rescan: 60 +``` + +Sample (multiple jobs): + +Note: `ntp.conf` on the host `otherhost` must be configured to allow queries from our local host by including a line like `restrict <IP> nomodify notrap nopeer`. + +```yaml +local: + host: 'localhost' + +otherhost: + host: 'otherhost' +``` + +If no configuration is given, module will attempt to connect to `ntpd` on `::1:123` or `127.0.0.1:123` and show charts for the systemvars. Use `show_peers: yes` to also show the charts for configured peers. Local peers in the range `127.0.0.0/8` are hidden by default, use `peer_filter: ''` to show all peers. + +--- + # ovpn_status_log -Module monitor openvpn-status log file. +Module monitor openvpn-status log file. **Requirements:** @@ -1325,16 +1625,16 @@ Module monitor openvpn-status log file. * Make sure NETDATA USER CAN READ openvpn-status.log * Update_every interval MUST MATCH interval on which OpenVPN writes operational status to log file. - + It produces: 1. **Users** OpenVPN active users * users - + 2. **Traffic** OpenVPN overall bandwidth usage in kilobit/s * in * out - + ### configuration Sample: @@ -1348,12 +1648,12 @@ default # phpfpm -This module will monitor one or more php-fpm instances depending on configuration. +This module will monitor one or more php-fpm instances depending on configuration. **Requirements:** * php-fpm with enabled `status` page * access to `status` page via web server - + It produces following charts: 1. **Active Connections** @@ -1363,15 +1663,15 @@ It produces following charts: 2. 
**Requests** in requests/s * requests - + 3. **Performance** * reached * slow - + ### configuration Needs only `url` to server's `status` - + Here is an example for local instance: ```yaml @@ -1387,6 +1687,42 @@ Without configuration, module attempts to connect to `http://localhost/status` --- +# portcheck + +Module monitors a remote TCP service. + +Following charts are drawn per host: + +1. **Latency** ms + * Time required to connect to a TCP port. + Displays latency in 0.1 ms resolution. If the connection failed, the value is missing. + +2. **Status** boolean + * Connection successful + * Could not create socket: possible DNS problems + * Connection refused: port not listening or blocked + * Connection timed out: host or port unreachable + + +### configuration + +```yaml +server: + host: 'dns or ip' # required + port: 22 # required + timeout: 1 # optional + update_every: 1 # optional +``` + +### notes + + * The error chart is intended for alarms, badges or for access via API. + * A system/service/firewall might block netdata's access if a portscan or + similar is detected. + * Currently, the accuracy of the latency is low and should be used as reference only. + +--- + # postfix Simple module executing `postfix -p` to grab postfix queue. @@ -1395,7 +1731,7 @@ It produces only two charts: 1. **Postfix Queue Emails** * emails - + 2. **Postfix Queue Emails Size** in KB * size @@ -1427,10 +1763,10 @@ Following charts are drawn: 4. **Checkpoints** writes/s * scheduled * requested - + 5. **Current connections to db** count * connections - + 6. **Tuples returned from db** tuples/s * sequential * bitmap @@ -1451,7 +1787,7 @@ Following charts are drawn: 10. **Locks on db** count per type * locks - + ### configuration ```yaml @@ -1543,10 +1879,13 @@ Following charts are drawn: 6. **Erlang processes** * used processes -7. **Memory** +7. **Erlang run queue** + * Erlang run queue + +8. **Memory** * free memory in megabytes -8. **Disk Space** +9. 
**Disk Space** * free disk space in gigabytes ### configuration @@ -1581,16 +1920,16 @@ Following charts are drawn: * total * lua -4. **Database keys** +4. **Database keys** * lines are creates dynamically based on how many databases are there - + 5. **Clients** * connected * blocked - + 6. **Slaves** * connected - + ### configuration ```yaml @@ -1688,6 +2027,39 @@ Please join this discussion for help. --- +# springboot + +This module will monitor one or more Java Spring-boot applications depending on configuration. + +It produces following charts: + +1. **Response Codes** in requests/s + * 1xx + * 2xx + * 3xx + * 4xx + * 5xx + * others + +2. **Threads** + * daemon + * total + +3. **GC Time** in milliseconds and **GC Operations** in operations/s + * Copy + * MarkSweep + * ... + +4. **Heap Mmeory Usage** in KB + * used + * committed + +### configuration + +Please see the [Monitoring Java Spring Boot Applications](https://github.com/firehol/netdata/wiki/Monitoring-Java-Spring-Boot-Applications) page for detailed info about module configuration. + +--- + # squid This module will monitor one or more squid instances depending on configuration. @@ -1707,11 +2079,11 @@ It produces following charts: 3. **Server Bandwidth** in kilobits/s * in * out - + 4. **Server Requests** in requests/s * requests * errors - + ### configuration ```yaml @@ -1724,7 +2096,7 @@ local: ``` Without any configuration module will try to autodetect where squid presents its `counters` data - + --- # smartd_log @@ -1738,7 +2110,7 @@ It produces following charts (you can add additional attributes in the module co 2. **Start/Stop Count** attribute 4 3. **Reallocated Sectors Count** attribute 5 - + 4. **Seek Error Rate** attribute 7 5. **Power-On Hours Count** attribute 9 @@ -1750,11 +2122,11 @@ It produces following charts (you can add additional attributes in the module co 8. **Temperature** attribute 194 9. **Current Pending Sectors** attribute 197 - + 10. 
**Off-Line Uncorrectable** attribute 198 11. **Write Error Rate** attribute 200 - + ### configuration ```yaml @@ -1763,7 +2135,7 @@ local: ``` If no configuration is given, module will attempt to read log files in /var/log/smartd/ directory. - + --- # tomcat @@ -1781,10 +2153,10 @@ Charts: 3. **Threads** * current * busy - + 4. **JVM Free Memory** in MB * jvm - + ### configuration ```yaml @@ -1795,10 +2167,65 @@ localhost: pass : 'secret_tomcat_password' ``` -Without configuration, module attempts to connect to `http://localhost:8080/manager/status?XML=true`, without any credentials. +Without configuration, module attempts to connect to `http://localhost:8080/manager/status?XML=true`, without any credentials. So it will probably fail. ---- +--- + +# Traefik + +Module uses the `health` API to provide statistics. + +It produces: + +1. **Responses** by statuses + * success (1xx, 2xx, 304) + * error (5xx) + * redirect (3xx except 304) + * bad (4xx) + * other (all other responses) + +2. **Responses** by codes + * 2xx (successful) + * 5xx (internal server errors) + * 3xx (redirect) + * 4xx (bad) + * 1xx (informational) + * other (non-standart responses) + +3. **Detailed Response Codes** requests/s (number of responses for each response code family individually) + +4. **Requests**/s + * request statistics + +5. **Total response time** + * sum of all response time + +6. **Average response time** + +7. **Average response time per iteration** + +8. **Uptime** + * Traefik server uptime + +### configuration + +Needs only `url` to server's `health` + +Here is an example for local server: + +```yaml +update_every : 1 +priority : 60000 + +local: + url : 'http://localhost:8080/health' + retries : 10 +``` + +Without configuration, module attempts to connect to `http://localhost:8080/health`. + +--- # varnish cache @@ -1825,7 +2252,7 @@ It produces: 5. **Expired Objects** in expired/s * objects - + 6. 
**Least Recently Used Nuked Objects** in nuked/s * objects @@ -1837,7 +2264,7 @@ It produces: * created * failed * limited - + 9. **Current Queue Length** in requests * in queue @@ -1848,22 +2275,22 @@ It produces: * closed * resycled * failed - + 10. **Requests To The Backend** in requests/s * received - + 11. **ESI Statistics** in problems/s * errors * warnings - + 12. **Memory Usage** in MB * free * allocated - + 13. **Uptime** in seconds * uptime - - + + ### configuration No configuration is needed. @@ -1893,7 +2320,7 @@ It produces following charts: * unmatched (the lines in the log file that are not matched) 3. **Detailed Response Codes** requests/s (number of responses for each response code family individually) - + 4. **Bandwidth** KB/s * received (bandwidth of requests) * send (bandwidth of responses) @@ -1915,7 +2342,7 @@ It produces following charts: 11. **All Time Unique Client IPs** unique ips/s (unique client IPs since the last restart of netdata) - + ### configuration ```yaml @@ -1933,4 +2360,4 @@ apache_log: Module has preconfigured jobs for nginx, apache and gunicorn on various distros. ---- +--- |
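The `portcheck` section added in this diff describes a per-host latency chart (connect time in 0.1 ms resolution, missing when the connection fails) and a status chart with four outcomes. The following is a minimal sketch of that kind of check, not the module's actual code: `check_port` and the status labels are hypothetical names chosen for illustration, and mapping every non-timeout connect error to "refused" is a simplifying assumption.

```python
import socket
import time

# Hypothetical labels mirroring the four "Status" dimensions listed for portcheck.
SUCCESS = 'success'      # connection successful
NO_SOCKET = 'no_socket'  # could not create socket: possible DNS problems
REFUSED = 'refused'      # connection refused: port not listening or blocked
TIMEOUT = 'timeout'      # connection timed out: host or port unreachable


def check_port(host, port, timeout=1.0):
    """Attempt one TCP connect and return (status, latency_ms or None)."""
    try:
        # Resolve first so that name-resolution failures are reported separately.
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return NO_SOCKET, None

    family, socktype, proto, _, addr = infos[0]
    try:
        sock = socket.socket(family, socktype, proto)
    except OSError:
        return NO_SOCKET, None

    sock.settimeout(timeout)
    start = time.perf_counter()
    try:
        sock.connect(addr)
        elapsed = time.perf_counter() - start
    except socket.timeout:
        return TIMEOUT, None
    except OSError:
        # Assumption: treat every other connect error as "refused".
        return REFUSED, None
    finally:
        sock.close()

    # Report milliseconds rounded to 0.1 ms; failed attempts return no value.
    return SUCCESS, round(elapsed * 1000, 1)
```

Calling `check_port('localhost', 22, timeout=1)` corresponds to the sample job with `host`, `port` and `timeout` set. As the notes in that section say, a latency measured this way is only a rough reference, since it includes name-independent overhead such as socket setup on the local host.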