diff options
Diffstat (limited to 'collectors/python.d.plugin/web_log')
-rw-r--r-- | collectors/python.d.plugin/web_log/Makefile.inc | 13 | ||||
-rw-r--r-- | collectors/python.d.plugin/web_log/README.md | 203 | ||||
-rw-r--r-- | collectors/python.d.plugin/web_log/web_log.chart.py | 1196 | ||||
-rw-r--r-- | collectors/python.d.plugin/web_log/web_log.conf | 204 |
4 files changed, 1616 insertions, 0 deletions
diff --git a/collectors/python.d.plugin/web_log/Makefile.inc b/collectors/python.d.plugin/web_log/Makefile.inc new file mode 100644 index 0000000..8931159 --- /dev/null +++ b/collectors/python.d.plugin/web_log/Makefile.inc @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +# THIS IS NOT A COMPLETE Makefile +# IT IS INCLUDED BY ITS PARENT'S Makefile.am +# IT IS REQUIRED TO REFERENCE ALL FILES RELATIVE TO THE PARENT + +# install these files +dist_python_DATA += web_log/web_log.chart.py +dist_pythonconfig_DATA += web_log/web_log.conf + +# do not install these files, but include them in the distribution +dist_noinst_DATA += web_log/README.md web_log/Makefile.inc + diff --git a/collectors/python.d.plugin/web_log/README.md b/collectors/python.d.plugin/web_log/README.md new file mode 100644 index 0000000..176551c --- /dev/null +++ b/collectors/python.d.plugin/web_log/README.md @@ -0,0 +1,203 @@ +# web_log + +## Motivation + +Web server log files exist for more than 20 years. All web servers of all kinds, from all vendors, [since the time NCSA httpd was powering the web](https://en.wikipedia.org/wiki/NCSA_HTTPd), produce log files, saving in real-time all accesses to web sites and APIs. + +Yet, after the appearance of google analytics and similar services, and the recent rise of APM (Application Performance Monitoring) with sophisticated time-series databases that collect and analyze metrics at the application level, all these web server log files are mostly just filling our disks, rotated every night without any use whatsoever. + +netdata turns this "useless" log file, into a powerful performance and health monitoring tool, capable of detecting, **in real-time**, most common web server problems, such as: + +- too many redirects (i.e. **oops!** *this should not redirect clients to itself*) +- too many bad requests (i.e. **oops!** *a few files were not uploaded*) +- too many internal server errors (i.e. **oops!** *this release crashes too much*) +- unreasonably too many requests (i.e. **oops!** *we are under attack*) +- unreasonably few requests (i.e. **oops!** *call the network guys*) +- unreasonably slow responses (i.e. **oops!** *the database is slow again*) +- too few successful responses (i.e. **oops!** *help us God!*) + +## Usage + +If netdata is installed on a system running a web server, it will detect it and it will automatically present a series of charts, with information obtained from the web server API, like these (*these do not come from the web server log file*): + +![image](https://cloud.githubusercontent.com/assets/2662304/22900686/e283f636-f237-11e6-93d2-cbdf63de150c.png) +*[**netdata**](https://my-netdata.io/) charts based on metrics collected by querying the `nginx` API (i.e. `/stub_status`).* + +> [**netdata**](https://my-netdata.io/) supports `apache`, `nginx`, `lighttpd` and `tomcat`. To obtain real-time information from a web server API, the web server needs to expose it. For directions on configuring your web server, check the config files for each web server. There is a directory with a config file for each web server under [`/etc/netdata/python.d/`](../). + +## Configuration + +[**netdata**](https://my-netdata.io/) has a powerful `web_log` plugin, capable of incrementally parsing any number of web server log files. This plugin is automatically started with [**netdata**](https://my-netdata.io/) and comes, pre-configured, for finding web server log files on popular distributions. Its configuration is at [`/etc/netdata/python.d/web_log.conf`](web_log.conf), like this: + + +```yaml +nginx_log: + name : 'nginx_log' + path : '/var/log/nginx/access.log' + +apache_log: + name : 'apache_log' + path : '/var/log/apache/other_vhosts_access.log' + categories: + cacti : 'cacti.*' + observium : 'observium' +``` + +Theodule has preconfigured jobs for nginx, apache and gunicorn on various distros. +You can add one such section, for each of your web server log files. + +> **Important**<br/>Keep in mind [**netdata**](https://my-netdata.io/) runs as user `netdata`. So, make sure user `netdata` has access to the logs directory and can read the log file. + +## Charts + +Once you have all log files configured and [**netdata**](https://my-netdata.io/) restarted, **for each log file** you will get a section at the [**netdata**](https://my-netdata.io/) dashboard, with the following charts. + +### responses by status + +In this chart we tried to provide a meaningful status for all responses. So: + +- `success` counts all the valid responses (i.e. `1xx` informational, `2xx` successful and `304` not modified). +- `error` are `5xx` internal server errors. These are very bad, they mean your web site or API is facing difficulties. +- `redirect` are `3xx` responses, except `304`. All `3xx` are redirects, but `304` means "not modified" - it tells the browsers the content they already have is still valid and can be used as-is. So, we decided to account it as a successful response. +- `bad` are bad requests that cannot be served. +- `other` as all the other, non-standard, types of responses. + +![image](https://cloud.githubusercontent.com/assets/2662304/22902194/ea0affc6-f23c-11e6-85f1-a4951dd4bb40.png) + +### Responses by type + +Then, we group all responses by code family, without interpreting their meaning. +**Response by type** requests/s + * success (1xx, 2xx, 304) + * error (5xx) + * redirect (3xx except 304) + * bad (4xx) + * other (all other responses) + +![image](https://cloud.githubusercontent.com/assets/2662304/22901883/dea7d33a-f23b-11e6-960d-00a913b58936.png) + +### Responses by code family + +Here we show all the response codes in detail. + +**Response by code family** requests/s + * 1xx (informational) + * 2xx (successful) + * 3xx (redirect) + * 4xx (bad) + * 5xx (internal server errors) + * other (non-standart responses) + * unmatched (the lines in the log file that are not matched) + + +![image](https://cloud.githubusercontent.com/assets/2662304/22901965/1a5d84ba-f23c-11e6-9d38-3deebcc8b879.png) + +>**Important**<br/>If your application is using hundreds of non-standard response codes, your browser may become slow while viewing this chart, so we have added a configuration [option to disable this chart](https://github.com/netdata/netdata/blob/419cd0a237275e5eeef3f92dcded84e735ee6c58/conf.d/python.d/web_log.conf#L63). + +### Detailed Response Codes + +Number of responses for each response code family individually (requests/s) + +### bandwidth + +This is a nice view of the traffic the web server is receiving and is sending. + +What is important to know for this chart, is that the bandwidth used for each request and response is accounted at the time the log is written. Since [**netdata**](https://my-netdata.io/) refreshes this chart every single second, you may have unrealistic spikes is the size of the requests or responses is too big. The reason is simple: a response may have needed 1 minute to be completed, but all the bandwidth used during that minute for the specific response will be accounted at the second the log line is written. + +As the legend on the chart suggests, you can use FireQoS to setup QoS on the web server ports and IPs to accurately measure the bandwidth the web server is using. Actually, [there may be a few more reasons to install QoS on your servers](../../tc.plugin/#tcplugin)... + +**Bandwidth** KB/s + * received (bandwidth of requests) + * send (bandwidth of responses) + +![image](https://cloud.githubusercontent.com/assets/2662304/22902266/245141d6-f23d-11e6-90f9-98729733e0da.png) + +> **Important**<br/>Most web servers do not log the request size by default.<br/>So, [unless you have configured your web server to log the size of requests](https://github.com/netdata/netdata/blob/419cd0a237275e5eeef3f92dcded84e735ee6c58/conf.d/python.d/web_log.conf#L76-L89), the `received` dimension will be always zero. + +### timings + +[**netdata**](https://my-netdata.io/) will also render the `minimum`, `average` and `maximum` time the web server needed to respond to requests. + +Keep in mind most web servers timings start at the reception of the full request, until the dispatch of the last byte of the response. So, they include network latencies of responses, but they do not include network latencies of requests. + +**Timings** ms (request processing time) + * min (bandwidth of requests) + * max (bandwidth of responses) + * average (bandwidth of responses) + +![image](https://cloud.githubusercontent.com/assets/2662304/22902283/369e3f92-f23d-11e6-9359-53e5d4ecb18e.png) + +> **Important**<br/>Most web servers do not log timing information by default.<br/>So, [unless you have configured your web server to also log timings](https://github.com/netdata/netdata/blob/419cd0a237275e5eeef3f92dcded84e735ee6c58/conf.d/python.d/web_log.conf#L76-L89), this chart will not exist. + +### URL patterns + +This is a very interesting chart. It is configured entirely by you. + +[**netdata**](https://my-netdata.io/) can map the URLs found in the log file into categories. You can define these categories, by providing names and regular expressions in `web_log.conf`. + +So, this configuration: + +```yaml +nginx_netdata: # name the charts + path: '/var/log/nginx/access.log' # web server log file + categories: + badges : '^/api/v1/badge\.svg' + charts : '^/api/v1/(data|chart|charts)' + registry : '^/api/v1/registry' + alarms : '^/api/v1/alarm' + allmetrics : '^/api/v1/allmetrics' + api_other : '^/api/' + netdata_conf: '^/netdata.conf' + api_old : '^/(data|datasource|graph|list|all\.json)' +``` + +Produces the following chart. The `categories` section is matched in the order given. So, pay attention to the order you give your patterns. + +![image](https://cloud.githubusercontent.com/assets/2662304/22902302/4d25bf06-f23d-11e6-844d-18c0876bdc3d.png) + +### HTTP versions + +This chart breaks down requests by HTTP version used. + +![image](https://cloud.githubusercontent.com/assets/2662304/22902323/5ee376d4-f23d-11e6-8457-157d3f438843.png) + +### IP versions + +This one provides requests per IP version used by the clients (`IPv4`, `IPv6`). + +![image](https://cloud.githubusercontent.com/assets/2662304/22902370/7091a770-f23d-11e6-8cd2-74e9a67b1397.png) + +### Unique clients + +The last charts are about the unique IPs accessing your web server. + +**Current Poll Unique Client IPs** unique ips/s. This one counts the unique IPs for each data collection iteration (i.e. **unique clients per second**). + +![image](https://cloud.githubusercontent.com/assets/2662304/22902384/835aa168-f23d-11e6-914f-cfc3f06eaff8.png) + +**All Time Unique Client IPs** unique ips/s. Counts the unique IPs, since the last [**netdata**](https://my-netdata.io/) restart. + +![image](https://cloud.githubusercontent.com/assets/2662304/22902407/92dd27e6-f23d-11e6-900d-eede7bc08e64.png) + +>**Important**<br/>To provide this information `web_log` plugin keeps in memory all the IPs seen by the web server. Although this does not require so much memory, if you have a web server with several million unique client IPs, we suggest to [disable this chart](https://github.com/netdata/netdata/blob/419cd0a237275e5eeef3f92dcded84e735ee6c58/conf.d/python.d/web_log.conf#L64). + + +## Alarms + +The magic of [**netdata**](https://my-netdata.io/) is that all metrics are collected per second, and all metrics can be used or correlated to provide real-time alarms. Out of the box, [**netdata**](https://my-netdata.io/) automatically attaches the [following alarms](../../../health/health.d/web_log.conf) to all `web_log` charts (i.e. to all log files configured, individually): + +alarm|description|minimum<br/>requests|warning|critical +:-------|-------|:------:|:-----:|:------: +`1m_redirects`|The ratio of HTTP redirects (3xx except 304) over all the requests, during the last minute.<br/> <br/>*Detects if the site or the web API is suffering from too many or circular redirects.*<br/> <br/>(i.e. **oops!** *this should not redirect clients to itself*)|120/min|> 20%|> 30% +`1m_bad_requests`|The ratio of HTTP bad requests (4xx) over all the requests, during the last minute.<br/> <br/>*Detects if the site or the web API is receiving too many bad requests, including `404`, not found.*<br/> <br/>(i.e. **oops!** *a few files were not uploaded*)|120/min|> 30%|> 50% +`1m_internal_errors`|The ratio of HTTP internal server errors (5xx), over all the requests, during the last minute.<br/> <br/>*Detects if the site is facing difficulties to serve requests.*<br/> <br/>(i.e. **oops!** *this release crashes too much*)|120/min|> 2%|> 5% +`5m_requests_ratio`|The percentage of successful web requests of the last 5 minutes, compared with the previous 5 minutes.<br/> <br/>*Detects if the site or the web API is suddenly getting too many or too few requests.*<br/> <br/>(i.e. too many = **oops!** *we are under attack*)<br/>(i.e. too few = **oops!** *call the network guys*)|120/5min|> double or < half|> 4x or < 1/4x +`web_slow`|The average time to respond to requests, over the last 1 minute, compared to the average of last 10 minutes.<br/> <br/>*Detects if the site or the web API is suddenly a lot slower.*<br/> <br/>(i.e. **oops!** *the database is slow again*)|120/min|> 2x|> 4x +`1m_successful`|The ratio of successful HTTP responses (1xx, 2xx, 304) over all the requests, during the last minute.<br/> <br/>*Detects if the site or the web API is performing within limits.*<br/> <br/>(i.e. **oops!** *help us God!*)|120/min|< 85%|< 75% + +The column `minimum requests` state the minimum number of requests required for the alarm to be evaluated. We found that when the site is receiving requests above this rate, these alarms are pretty accurate (i.e. no false-positives). + +[**netdata**](https://my-netdata.io/) alarms are user configurable. Sample config files can be found under directory `health/health.d` of the netdata github repository. So, even [`web_log` alarms can be adapted to your needs](../../../health/health.d/web_log.conf). + + +[![analytics](https://www.google-analytics.com/collect?v=1&aip=1&t=pageview&_s=1&ds=github&dr=https%3A%2F%2Fgithub.com%2Fnetdata%2Fnetdata&dl=https%3A%2F%2Fmy-netdata.io%2Fgithub%2Fcollectors%2Fpython.d.plugin%2Fweb_log%2FREADME&_u=MAC~&cid=5792dfd7-8dc4-476b-af31-da2fdb9f93d2&tid=UA-64295674-3)]() diff --git a/collectors/python.d.plugin/web_log/web_log.chart.py b/collectors/python.d.plugin/web_log/web_log.chart.py new file mode 100644 index 0000000..9927904 --- /dev/null +++ b/collectors/python.d.plugin/web_log/web_log.chart.py @@ -0,0 +1,1196 @@ +# -*- coding: utf-8 -*- +# Description: web log netdata python.d module +# Author: l2isbad +# SPDX-License-Identifier: GPL-3.0-or-later + +import bisect +import re +import os + +from collections import namedtuple, defaultdict +from copy import deepcopy + +try: + from itertools import filterfalse +except ImportError: + from itertools import ifilter as filter + from itertools import ifilterfalse as filterfalse + +try: + from sys import maxint +except ImportError: + from sys import maxsize as maxint + +from bases.collection import read_last_line +from bases.FrameworkServices.LogService import LogService + + +ORDER_APACHE_CACHE = [ + 'apache_cache', +] + +ORDER_WEB = [ + 'response_statuses', + 'response_codes', + 'bandwidth', + 'response_time', + 'response_time_hist', + 'response_time_upstream', + 'response_time_upstream_hist', + 'requests_per_url', + 'requests_per_user_defined', + 'http_method', + 'vhost', + 'port', + 'http_version', + 'requests_per_ipproto', + 'clients', + 'clients_all' +] + +ORDER_SQUID = [ + 'squid_response_statuses', + 'squid_response_codes', + 'squid_detailed_response_codes', + 'squid_method', + 'squid_mime_type', + 'squid_hier_code', + 'squid_transport_methods', + 'squid_transport_errors', + 'squid_code', + 'squid_handling_opts', + 'squid_object_types', + 'squid_cache_events', + 'squid_bytes', + 'squid_duration', + 'squid_clients', + 'squid_clients_all' +] + +CHARTS_WEB = { + 'response_codes': { + 'options': [None, 'Response Codes', 'requests/s', 'responses', 'web_log.response_codes', 'stacked'], + 'lines': [ + ['2xx', None, 'incremental'], + ['5xx', None, 'incremental'], + ['3xx', None, 'incremental'], + ['4xx', None, 'incremental'], + ['1xx', None, 'incremental'], + ['0xx', 'other', 'incremental'], + ['unmatched', None, 'incremental'] + ] + }, + 'bandwidth': { + 'options': [None, 'Bandwidth', 'kilobits/s', 'bandwidth', 'web_log.bandwidth', 'area'], + 'lines': [ + ['resp_length', 'received', 'incremental', 8, 1000], + ['bytes_sent', 'sent', 'incremental', -8, 1000] + ] + }, + 'response_time': { + 'options': [None, 'Processing Time', 'milliseconds', 'timings', 'web_log.response_time', 'area'], + 'lines': [ + ['resp_time_min', 'min', 'incremental', 1, 1000], + ['resp_time_max', 'max', 'incremental', 1, 1000], + ['resp_time_avg', 'avg', 'incremental', 1, 1000] + ] + }, + 'response_time_hist': { + 'options': [None, 'Processing Time Histogram', 'requests/s', 'timings', 'web_log.response_time_hist', 'line'], + 'lines': [] + }, + 'response_time_upstream': { + 'options': [None, 'Processing Time Upstream', 'milliseconds', 'timings', + 'web_log.response_time_upstream', 'area'], + 'lines': [ + ['resp_time_upstream_min', 'min', 'incremental', 1, 1000], + ['resp_time_upstream_max', 'max', 'incremental', 1, 1000], + ['resp_time_upstream_avg', 'avg', 'incremental', 1, 1000] + ] + }, + 'response_time_upstream_hist': { + 'options': [None, 'Processing Time Histogram', 'requests/s', 'timings', + 'web_log.response_time_upstream_hist', 'line'], + 'lines': [] + }, + 'clients': { + 'options': [None, 'Current Poll Unique Client IPs', 'unique ips', 'clients', 'web_log.clients', 'stacked'], + 'lines': [ + ['unique_cur_ipv4', 'ipv4', 'incremental', 1, 1], + ['unique_cur_ipv6', 'ipv6', 'incremental', 1, 1] + ] + }, + 'clients_all': { + 'options': [None, 'All Time Unique Client IPs', 'unique ips', 'clients', 'web_log.clients_all', 'stacked'], + 'lines': [ + ['unique_tot_ipv4', 'ipv4', 'absolute', 1, 1], + ['unique_tot_ipv6', 'ipv6', 'absolute', 1, 1] + ] + }, + 'http_method': { + 'options': [None, 'Requests Per HTTP Method', 'requests/s', 'http methods', 'web_log.http_method', 'stacked'], + 'lines': [ + ['GET', 'GET', 'incremental', 1, 1] + ] + }, + 'http_version': { + 'options': [None, 'Requests Per HTTP Version', 'requests/s', 'http versions', + 'web_log.http_version', 'stacked'], + 'lines': [] + }, + 'requests_per_ipproto': { + 'options': [None, 'Requests Per IP Protocol', 'requests/s', 'ip protocols', 'web_log.requests_per_ipproto', + 'stacked'], + 'lines': [ + ['req_ipv4', 'ipv4', 'incremental', 1, 1], + ['req_ipv6', 'ipv6', 'incremental', 1, 1] + ] + }, + 'response_statuses': { + 'options': [None, 'Response Statuses', 'requests/s', 'responses', 'web_log.response_statuses', 'stacked'], + 'lines': [ + ['successful_requests', 'success', 'incremental', 1, 1], + ['server_errors', 'error', 'incremental', 1, 1], + ['redirects', 'redirect', 'incremental', 1, 1], + ['bad_requests', 'bad', 'incremental', 1, 1], + ['other_requests', 'other', 'incremental', 1, 1] + ] + }, + 'requests_per_url': { + 'options': [None, 'Requests Per Url', 'requests/s', 'urls', 'web_log.requests_per_url', 'stacked'], + 'lines': [ + ['url_pattern_other', 'other', 'incremental', 1, 1] + ] + }, + 'requests_per_user_defined': { + 'options': [None, 'Requests Per User Defined Pattern', 'requests/s', 'user defined', + 'web_log.requests_per_user_defined', 'stacked'], + 'lines': [ + ['user_pattern_other', 'other', 'incremental', 1, 1] + ] + }, + 'port': { + 'options': [None, 'Requests Per Port', 'requests/s', 'port', 'web_log.port', 'stacked'], + 'lines': [ + ['port_80', 'http', 'incremental', 1, 1], + ['port_443', 'https', 'incremental', 1, 1] + ] + }, + 'vhost': { + 'options': [None, 'Requests Per Vhost', 'requests/s', 'vhost', 'web_log.vhost', 'stacked'], + 'lines': [] + } +} + +CHARTS_APACHE_CACHE = { + 'apache_cache': { + 'options': [None, 'Apache Cached Responses', 'percentage', 'cached', 'web_log.apache_cache_cache', + 'stacked'], + 'lines': [ + ['hit', 'cache', 'percentage-of-absolute-row'], + ['miss', None, 'percentage-of-absolute-row'], + ['other', None, 'percentage-of-absolute-row'] + ] + } +} + +CHARTS_SQUID = { + 'squid_duration': { + 'options': [None, 'Elapsed Time The Transaction Busied The Cache', + 'milliseconds', 'squid_timings', 'web_log.squid_duration', 'area'], + 'lines': [ + ['duration_min', 'min', 'incremental', 1, 1000], + ['duration_max', 'max', 'incremental', 1, 1000], + ['duration_avg', 'avg', 'incremental', 1, 1000] + ] + }, + 'squid_bytes': { + 'options': [None, 'Amount Of Data Delivered To The Clients', + 'kilobits/s', 'squid_bandwidth', 'web_log.squid_bytes', 'area'], + 'lines': [ + ['bytes', 'sent', 'incremental', 8, 1000] + ] + }, + 'squid_response_statuses': { + 'options': [None, 'Response Statuses', 'responses/s', 'squid_responses', 'web_log.squid_response_statuses', + 'stacked'], + 'lines': [ + ['successful_requests', 'success', 'incremental', 1, 1], + ['server_errors', 'error', 'incremental', 1, 1], + ['redirects', 'redirect', 'incremental', 1, 1], + ['bad_requests', 'bad', 'incremental', 1, 1], + ['other_requests', 'other', 'incremental', 1, 1] + ] + }, + 'squid_response_codes': { + 'options': [None, 'Response Codes', 'responses/s', 'squid_responses', + 'web_log.squid_response_codes', 'stacked'], + 'lines': [ + ['2xx', None, 'incremental'], + ['5xx', None, 'incremental'], + ['3xx', None, 'incremental'], + ['4xx', None, 'incremental'], + ['1xx', None, 'incremental'], + ['0xx', None, 'incremental'], + ['other', None, 'incremental'], + ['unmatched', None, 'incremental'] + ] + }, + 'squid_code': { + 'options': [None, 'Responses Per Cache Result Of The Request', + 'requests/s', 'squid_squid_cache', 'web_log.squid_code', 'stacked'], + 'lines': [] + }, + 'squid_detailed_response_codes': { + 'options': [None, 'Detailed Response Codes', + 'responses/s', 'squid_responses', 'web_log.squid_detailed_response_codes', 'stacked'], + 'lines': [] + }, + 'squid_hier_code': { + 'options': [None, 'Responses Per Hierarchy Code', + 'requests/s', 'squid_hierarchy', 'web_log.squid_hier_code', 'stacked'], + 'lines': [] + }, + 'squid_method': { + 'options': [None, 'Requests Per Method', + 'requests/s', 'squid_requests', 'web_log.squid_method', 'stacked'], + 'lines': [] + }, + 'squid_mime_type': { + 'options': [None, 'Requests Per MIME Type', + 'requests/s', 'squid_requests', 'web_log.squid_mime_type', 'stacked'], + 'lines': [] + }, + 'squid_clients': { + 'options': [None, 'Current Poll Unique Client IPs', 'unique ips', 'squid_clients', + 'web_log.squid_clients', 'stacked'], + 'lines': [ + ['unique_ipv4', 'ipv4', 'incremental'], + ['unique_ipv6', 'ipv6', 'incremental'] + ] + }, + 'squid_clients_all': { + 'options': [None, 'All Time Unique Client IPs', 'unique ips', 'squid_clients', + 'web_log.squid_clients_all', 'stacked'], + 'lines': [ + ['unique_tot_ipv4', 'ipv4', 'absolute'], + ['unique_tot_ipv6', 'ipv6', 'absolute'] + ] + }, + 'squid_transport_methods': { + 'options': [None, 'Transport Methods', 'requests/s', 'squid_squid_transport', + 'web_log.squid_transport_methods', 'stacked'], + 'lines': [] + }, + 'squid_transport_errors': { + 'options': [None, 'Transport Errors', 'requests/s', 'squid_squid_transport', + 'web_log.squid_transport_errors', 'stacked'], + 'lines': [] + }, + 'squid_handling_opts': { + 'options': [None, 'Handling Opts', 'requests/s', 'squid_squid_cache', + 'web_log.squid_handling_opts', 'stacked'], + 'lines': [] + }, + 'squid_object_types': { + 'options': [None, 'Object Types', 'objects/s', 'squid_squid_cache', + 'web_log.squid_object_types', 'stacked'], + 'lines': [] + }, + 'squid_cache_events': { + 'options': [None, 'Cache Events', 'events/s', 'squid_squid_cache', + 'web_log.squid_cache_events', 'stacked'], + 'lines': [] + } +} + +NAMED_PATTERN = namedtuple('PATTERN', ['description', 'func']) + +DET_RESP_AGGR = ['', '_1xx', '_2xx', '_3xx', '_4xx', '_5xx', '_Other'] + +SQUID_CODES = { + 'TCP': 'squid_transport_methods', + 'UDP': 'squid_transport_methods', + 'NONE': 'squid_transport_methods', + 'CLIENT': 'squid_handling_opts', + 'IMS': 'squid_handling_opts', + 'ASYNC': 'squid_handling_opts', + 'SWAPFAIL': 'squid_handling_opts', + 'REFRESH': 'squid_handling_opts', + 'SHARED': 'squid_handling_opts', + 'REPLY': 'squid_handling_opts', + 'NEGATIVE': 'squid_object_types', + 'STALE': 'squid_object_types', + 'OFFLINE': 'squid_object_types', + 'INVALID': 'squid_object_types', + 'FAIL': 'squid_object_types', + 'MODIFIED': 'squid_object_types', + 'UNMODIFIED': 'squid_object_types', + 'REDIRECT': 'squid_object_types', + 'HIT': 'squid_cache_events', + 'MEM': 'squid_cache_events', + 'MISS': 'squid_cache_events', + 'DENIED': 'squid_cache_events', + 'NOFETCH': 'squid_cache_events', + 'TUNNEL': 'squid_cache_events', + 'ABORTED': 'squid_transport_errors', + 'TIMEOUT': 'squid_transport_errors' +} + +REQUEST_REGEX = re.compile(r'(?P<method>[A-Z]+) (?P<url>[^ ]+) [A-Z]+/(?P<http_version>\d(?:.\d)?)') + +MIME_TYPES = ['application', 'audio', 'example', 'font', 'image', 'message', 'model', 'multipart', 'text', 'video'] + + +class Service(LogService): + def __init__(self, configuration=None, name=None): + """ + :param configuration: + :param name: + """ + LogService.__init__(self, configuration=configuration, name=name) + self.configuration = configuration + self.log_path = self.configuration.get('path') + self.job = None + + def check(self): + """ + :return: bool + + 1. "log_path" is specified in the module configuration file + 2. "log_path" must be readable by netdata user and must exist + 3. "log_path' must not be empty. We need at least 1 line to find appropriate pattern to parse + 4. other checks depends on log "type" + """ + + log_type = self.configuration.get('type', 'web') + log_types = dict(web=Web, apache_cache=ApacheCache, squid=Squid) + + if log_type not in log_types: + self.error('bad log type {log_type}. Supported types: {types}'.format(log_type=log_type, + types=log_types.keys())) + return False + + if not self.log_path: + self.error('log path is not specified') + return False + + if not (self._find_recent_log_file() and os.access(self.log_path, os.R_OK)): + self.error('{log_file} not readable or not exist'.format(log_file=self.log_path)) + return False + + if not os.path.getsize(self.log_path): + self.error('{log_file} is empty'.format(log_file=self.log_path)) + return False + + self.job = log_types[log_type](self) + if self.job.check(): + self.order = self.job.order + self.definitions = self.job.definitions + return True + return False + + def _get_data(self): + return self.job.get_data(self._get_raw_data()) + + +class Web: + def __init__(self, service): + self.service = service + self.order = ORDER_WEB[:] + self.definitions = deepcopy(CHARTS_WEB) + self.pre_filter = check_patterns('filter', self.configuration.get('filter')) + self.storage = dict() + self.data = { + 'bytes_sent': 0, + 'resp_length': 0, + 'resp_time_min': 0, + 'resp_time_max': 0, + 'resp_time_avg': 0, + 'resp_time_upstream_min': 0, + 'resp_time_upstream_max': 0, + 'resp_time_upstream_avg': 0, + 'unique_cur_ipv4': 0, + 'unique_cur_ipv6': 0, + '2xx': 0, + '5xx': 0, + '3xx': 0, + '4xx': 0, + '1xx': 0, + '0xx': 0, + 'unmatched': 0, + 'req_ipv4': 0, + 'req_ipv6': 0, + 'unique_tot_ipv4': 0, + 'unique_tot_ipv6': 0, + 'successful_requests': 0, + 'redirects': 0, + 'bad_requests': 0, + 'server_errors': 0, + 'other_requests': 0, + 'GET': 0 + } + + def __getattr__(self, item): + return getattr(self.service, item) + + def check(self): + last_line = read_last_line(self.log_path) + if not last_line: + return False + # Custom_log_format or predefined log format. + if self.configuration.get('custom_log_format'): + match_dict, error = self.find_regex_custom(last_line) + else: + match_dict, error = self.find_regex(last_line) + + # "match_dict" is None if there are any problems + if match_dict is None: + self.error(error) + return False + + self.storage['unique_all_time'] = list() + self.storage['url_pattern'] = check_patterns('url_pattern', self.configuration.get('categories')) + self.storage['user_pattern'] = check_patterns('user_pattern', self.configuration.get('user_defined')) + + self.create_web_charts(match_dict) # Create charts + self.info('Collected data: %s' % list(match_dict.keys())) + return True + + def create_web_charts(self, match_dict): + """ + :param match_dict: dict: regex.search.groupdict(). Ex. {'address': '127.0.0.1', 'code': '200', 'method': 'GET'} + :return: + Create/remove additional charts depending on the 'match_dict' keys and configuration file options + """ + if 'resp_time' not in match_dict: + self.order.remove('response_time') + self.order.remove('response_time_hist') + if 'resp_time_upstream' not in match_dict: + self.order.remove('response_time_upstream') + self.order.remove('response_time_upstream_hist') + + # Add 'response_time_hist' and 'response_time_upstream_hist' charts if is specified in the configuration + histogram = self.configuration.get('histogram', None) + if isinstance(histogram, list): + self.storage['bucket_index'] = histogram[:] + self.storage['bucket_index'].append(maxint) + self.storage['buckets'] = [0] * (len(histogram) + 1) + self.storage['upstream_buckets'] = [0] * (len(histogram) + 1) + hist_lines = self.definitions['response_time_hist']['lines'] + upstream_hist_lines = self.definitions['response_time_upstream_hist']['lines'] + for i, le in enumerate(histogram): + hist_key = 'response_time_hist_%d' % i + upstream_hist_key = 'response_time_upstream_hist_%d' % i + hist_lines.append([hist_key, str(le), 'incremental', 1, 1]) + upstream_hist_lines.append([upstream_hist_key, str(le), 'incremental', 1, 1]) + + hist_lines.append(['response_time_hist_%d' % len(histogram), '+Inf', 'incremental', 1, 1]) + upstream_hist_lines.append(['response_time_upstream_hist_%d' % len(histogram), '+Inf', 'incremental', 1, 1]) + elif histogram is not None: + self.error('expect histogram list, but was {0}'.format(type(histogram))) + + if not self.configuration.get('all_time', True): + self.order.remove('clients_all') + + # Add 'detailed_response_codes' chart if specified in the configuration + if self.configuration.get('detailed_response_codes', True): + if self.configuration.get('detailed_response_aggregate', True): + codes = DET_RESP_AGGR[:1] + else: + codes = DET_RESP_AGGR[1:] + + for code in codes: + self.order.append('detailed_response_codes%s' % code) + self.definitions['detailed_response_codes%s' % code] = { + 'options': [None, 'Detailed Response Codes %s' % code[1:], 'requests/s', 'responses', + 'web_log.detailed_response_codes%s' % code, 'stacked'], + 'lines': [] + } + + # Add 'requests_per_url' chart if specified in the configuration + if self.storage['url_pattern']: + for elem in self.storage['url_pattern']: + dim = [elem.description, elem.description[12:], 'incremental'] + self.definitions['requests_per_url']['lines'].append(dim) + self.data[elem.description] = 0 + self.data['url_pattern_other'] = 0 + else: + self.order.remove('requests_per_url') + + # Add 'requests_per_user_defined' chart if specified in the configuration + if self.storage['user_pattern'] and 'user_defined' in match_dict: + for elem in self.storage['user_pattern']: + dim = [elem.description, elem.description[13:], 'incremental'] + self.definitions['requests_per_user_defined']['lines'].append(dim) + self.data[elem.description] = 0 + self.data['user_pattern_other'] = 0 + else: + self.order.remove('requests_per_user_defined') + + def get_data(self, raw_data=None): + """ + Parses new log lines + :return: dict OR None + None if _get_raw_data method fails. + In all other cases - dict. + """ + if not raw_data: + return None if raw_data is None else self.data + + filtered_data = filter_data(raw_data=raw_data, pre_filter=self.pre_filter) + + unique_current = set() + timings = defaultdict(lambda: dict(minimum=None, maximum=0, summary=0, count=0)) + + for line in filtered_data: + match = self.storage['regex'].search(line) + if match: + match_dict = match.groupdict() + try: + code = match_dict['code'][0] + 'xx' + self.data[code] += 1 + except KeyError: + self.data['0xx'] += 1 + # detailed response code + if self.configuration.get('detailed_response_codes', True): + self.get_data_per_response_codes_detailed(code=match_dict['code']) + # response statuses + self.get_data_per_statuses(code=match_dict['code']) + # requests per user defined pattern + if self.storage['user_pattern'] and 'user_defined' in match_dict: + self.get_data_per_pattern(row=match_dict['user_defined'], + other='user_pattern_other', + pattern=self.storage['user_pattern']) + # method, url, http version + self.get_data_from_request_field(match_dict=match_dict) + # bandwidth sent + bytes_sent = match_dict['bytes_sent'] if '-' not in match_dict['bytes_sent'] else 0 + self.data['bytes_sent'] += int(bytes_sent) + # request processing time and bandwidth received + if 'resp_length' in match_dict: + resp_length = match_dict['resp_length'] if '-' not in match_dict['resp_length'] else 0 + self.data['resp_length'] += int(resp_length) + if 'resp_time' in match_dict: + resp_time = self.storage['func_resp_time'](float(match_dict['resp_time'])) + get_timings(timings=timings['resp_time'], time=resp_time) + if 'bucket_index' in self.storage: + get_hist(self.storage['bucket_index'], self.storage['buckets'], resp_time / 1000) + if 'resp_time_upstream' in match_dict and match_dict['resp_time_upstream'] != '-': + resp_time_upstream = self.storage['func_resp_time'](float(match_dict['resp_time_upstream'])) + get_timings(timings=timings['resp_time_upstream'], time=resp_time_upstream) + if 'bucket_index' in self.storage: + get_hist(self.storage['bucket_index'], self.storage['upstream_buckets'], resp_time / 1000) + # requests per ip proto + proto = 'ipv6' if ':' in match_dict['address'] else 'ipv4' + self.data['req_' + proto] += 1 + # unique clients ips + if self.configuration.get('all_time', True): + if address_not_in_pool(pool=self.storage['unique_all_time'], + address=match_dict['address'], + pool_size=self.data['unique_tot_ipv4'] + self.data['unique_tot_ipv6']): + self.data['unique_tot_' + proto] += 1 + if match_dict['address'] not in unique_current: + self.data['unique_cur_' + proto] += 1 + unique_current.add(match_dict['address']) + else: + self.data['unmatched'] += 1 + + # timings + for elem in timings: + self.data[elem + '_min'] += timings[elem]['minimum'] + self.data[elem + '_avg'] += timings[elem]['summary'] / timings[elem]['count'] + self.data[elem + '_max'] += timings[elem]['maximum'] + + # histogram + if 'bucket_index' in self.storage: + buckets = self.storage['buckets'] + upstream_buckets = self.storage['upstream_buckets'] + for i in range(0, len(self.storage['bucket_index'])): + hist_key = 'response_time_hist_%d' % i + upstream_hist_key = 'response_time_upstream_hist_%d' % i + self.data[hist_key] = buckets[i] + self.data[upstream_hist_key] = upstream_buckets[i] + + return self.data + + def find_regex(self, last_line): + """ + :param last_line: str: literally last line from log file + :return: tuple where: + [0]: dict or None: match_dict or None + [1]: str: error description + We need to find appropriate pattern for current log file + All logic is do a regex search through the string for all predefined patterns + until we find something or fail. + """ + # REGEX: 1.IPv4 address 2.HTTP method 3. URL 4. Response code + # 5. Bytes sent 6. Response length 7. Response process time + default = re.compile(r'(?P<address>[\da-f.:]+|localhost)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+|-)') + + apache_ext_insert = re.compile(r'(?P<address>[\da-f.:]+|localhost)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+|-)' + r' (?P<resp_length>\d+|-)' + r' (?P<resp_time>\d+) ') + + apache_ext_append = re.compile(r'(?P<address>[\da-f.:]+|localhost)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+|-)' + r' .*?' + r' (?P<resp_length>\d+|-)' + r' (?P<resp_time>\d+)' + r'(?: |$)') + + nginx_ext_insert = re.compile(r'(?P<address>[\da-f.:]+)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+)' + r' (?P<resp_length>\d+)' + r' (?P<resp_time>\d+\.\d+) ') + + nginx_ext2_insert = re.compile(r'(?P<address>[\da-f.:]+)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+)' + r' (?P<resp_length>\d+)' + r' (?P<resp_time>\d+\.\d+)' + r' (?P<resp_time_upstream>[\d.-]+) ') + + nginx_ext_append = re.compile(r'(?P<address>[\da-f.:]+)' + r' -.*?"(?P<request>[^"]*)"' + r' (?P<code>[1-9]\d{2})' + r' (?P<bytes_sent>\d+)' + r' .*?' + r' (?P<resp_length>\d+)' + r' (?P<resp_time>\d+\.\d+)') + + def func_usec(time): + return time + + def func_sec(time): + return time * 1000000 + + r_regex = [apache_ext_insert, apache_ext_append, + nginx_ext2_insert, nginx_ext_insert, nginx_ext_append, + default] + r_function = [func_usec, func_usec, func_sec, func_sec, func_sec, func_usec] + regex_function = zip(r_regex, r_function) + + match_dict = dict() + for regex, func in regex_function: + match = regex.search(last_line) + if match: + self.storage['regex'] = regex + self.storage['func_resp_time'] = func + match_dict = match.groupdict() + break + + return find_regex_return(match_dict=match_dict or None, + msg='Unknown log format. You need to use "custom_log_format" feature.') + + def find_regex_custom(self, last_line): + """ + :param last_line: str: literally last line from log file + :return: tuple where: + [0]: dict or None: match_dict or None + [1]: str: error description + + We are here only if "custom_log_format" is in logs. We need to make sure: + 1. "custom_log_format" is a dict + 2. "pattern" in "custom_log_format" and pattern is <str> instance + 3. if "time_multiplier" is in "custom_log_format" it must be <int> or <float> instance + + If all parameters is ok we need to make sure: + 1. Pattern search is success + 2. Pattern search contains named subgroups (?P<subgroup_name>) (= "match_dict") + + If pattern search is success we need to make sure: + 1. All mandatory keys ['address', 'code', 'bytes_sent', 'method', 'url'] are in "match_dict" + + If this is True we need to make sure: + 1. All mandatory key values from "match_dict" have the correct format + ("code" is integer, "method" is uppercase word, etc) + + If non mandatory keys in "match_dict" we need to make sure: + 1. All non mandatory key values from match_dict ['resp_length', 'resp_time'] have the correct format + ("resp_length" is integer or "-", "resp_time" is integer or float) + + """ + if not hasattr(self.configuration.get('custom_log_format'), 'keys'): + return find_regex_return(msg='Custom log: "custom_log_format" is not a <dict>') + + pattern = self.configuration.get('custom_log_format', dict()).get('pattern') + if not (pattern and isinstance(pattern, str)): + return find_regex_return(msg='Custom log: "pattern" option is not specified or type is not <str>') + + resp_time_func = self.configuration.get('custom_log_format', dict()).get('time_multiplier') or 0 + + if not isinstance(resp_time_func, (int, float)): + return find_regex_return(msg='Custom log: "time_multiplier" is not an integer or a float') + + try: + regex = re.compile(pattern) + except re.error as error: + return find_regex_return(msg='Pattern compile error: %s' % str(error)) + + match = regex.search(last_line) + if not match: + return find_regex_return(msg='Custom log: pattern search FAILED') + + match_dict = match.groupdict() or None + if match_dict is None: + return find_regex_return(msg='Custom log: search OK but contains no named subgroups' + ' (you need to use ?P<subgroup_name>)') + mandatory_dict = {'address': r'[\w.:-]+', + 'code': r'[1-9]\d{2}', + 'bytes_sent': r'\d+|-'} + optional_dict = {'resp_length': r'\d+|-', + 'resp_time': r'[\d.]+', + 'resp_time_upstream': r'[\d.-]+', + 'method': r'[A-Z]+', + 'http_version': r'\d(?:.\d)?'} + + mandatory_values = set(mandatory_dict) - set(match_dict) + if mandatory_values: + return find_regex_return(msg='Custom log: search OK but some mandatory keys (%s) are missing' + % list(mandatory_values)) + for key in mandatory_dict: + if not re.search(mandatory_dict[key], match_dict[key]): + return find_regex_return(msg='Custom log: can\'t parse "%s": %s' + % (key, match_dict[key])) + + optional_values = set(optional_dict) & set(match_dict) + for key in optional_values: + if not re.search(optional_dict[key], match_dict[key]): + return find_regex_return(msg='Custom log: can\'t parse "%s": %s' + % (key, match_dict[key])) + + dot_in_time = '.' in match_dict.get('resp_time', '') + if dot_in_time: + self.storage['func_resp_time'] = lambda time: time * (resp_time_func or 1000000) + else: + self.storage['func_resp_time'] = lambda time: time * (resp_time_func or 1) + + self.storage['regex'] = regex + return find_regex_return(match_dict=match_dict) + + def get_data_from_request_field(self, match_dict): + if match_dict.get('request'): + match_dict = REQUEST_REGEX.search(match_dict['request']) + if match_dict: + match_dict = match_dict.groupdict() + else: + return + # requests per url + if match_dict.get('url') and self.storage['url_pattern']: + self.get_data_per_pattern(row=match_dict['url'], + other='url_pattern_other', + pattern=self.storage['url_pattern']) + # requests per http method + if match_dict.get('method'): + if match_dict['method'] not in self.data: + self.charts['http_method'].add_dimension([match_dict['method'], + match_dict['method'], + 'incremental']) + self.data[match_dict['method']] = 0 + self.data[match_dict['method']] += 1 + # requests per http version + if match_dict.get('http_version'): + dim_id = match_dict['http_version'].replace('.', '_') + if dim_id not in self.data: + self.charts['http_version'].add_dimension([dim_id, + match_dict['http_version'], + 'incremental']) + self.data[dim_id] = 0 + self.data[dim_id] += 1 + # requests per port number + if match_dict.get('port'): + if match_dict['port'] not in self.data: + self.charts['port'].add_dimension([match_dict['port'], + match_dict['port'], + 'incremental']) + self.data[match_dict['port']] = 0 + self.data[match_dict['port']] += 1 + # requests per vhost + if match_dict.get('vhost'): + dim_id = match_dict['vhost'].replace('.', '_') + if dim_id not in self.data: + self.charts['vhost'].add_dimension([dim_id, + match_dict['vhost'], + 'incremental']) + self.data[dim_id] = 0 + self.data[dim_id] += 1 + + def get_data_per_response_codes_detailed(self, code): + """ + :param code: str: CODE from parsed line. Ex.: '202, '499' + :return: + Calls add_new_dimension method If the value is found for the first time + """ + if code not in self.data: + if self.configuration.get('detailed_response_aggregate', True): + self.charts['detailed_response_codes'].add_dimension([code, code, 'incremental']) + self.data[code] = 0 + else: + code_index = int(code[0]) if int(code[0]) < 6 else 6 + chart_key = 'detailed_response_codes' + DET_RESP_AGGR[code_index] + self.charts[chart_key].add_dimension([code, code, 'incremental']) + self.data[code] = 0 + self.data[code] += 1 + + def get_data_per_pattern(self, row, other, pattern): + """ + :param row: str: + :param other: str: + :param pattern: named tuple: (['pattern_description', 'regular expression']) + :return: + Scan through string looking for the first location where patterns produce a match for all user + defined patterns + """ + match = None + for elem in pattern: + if elem.func(row): + self.data[elem.description] += 1 + match = True + break + if not match: + self.data[other] += 1 + + def get_data_per_statuses(self, code): + """ + :param code: str: response status code. Ex.: '202', '499' + :return: + """ + code_class = code[0] + if code_class == '2' or code == '304' or code_class == '1': + self.data['successful_requests'] += 1 + elif code_class == '3': + self.data['redirects'] += 1 + elif code_class == '4': + self.data['bad_requests'] += 1 + elif code_class == '5': + self.data['server_errors'] += 1 + else: + self.data['other_requests'] += 1 + + +class ApacheCache: + def __init__(self, service): + self.service = service + self.order = ORDER_APACHE_CACHE + self.definitions = CHARTS_APACHE_CACHE + + @staticmethod + def check(): + return True + + @staticmethod + def get_data(raw_data=None): + data = dict(hit=0, miss=0, other=0) + if not raw_data: + return None if raw_data is None else data + + for line in raw_data: + if 'cache hit' in line: + data['hit'] += 1 + elif 'cache miss' in line: + data['miss'] += 1 + else: + data['other'] += 1 + return data + + +class Squid: + def __init__(self, service): + self.service = service + self.order = ORDER_SQUID + self.definitions = CHARTS_SQUID + self.pre_filter = check_patterns('filter', self.configuration.get('filter')) + self.storage = dict() + self.data = { + 'duration_max': 0, + 'duration_avg': 0, + 'duration_min': 0, + 'bytes': 0, + '0xx': 0, + '1xx': 0, + '2xx': 0, + '3xx': 0, + '4xx': 0, + '5xx': 0, + 'other': 0, + 'unmatched': 0, + 'unique_ipv4': 0, + 'unique_ipv6': 0, + 'unique_tot_ipv4': 0, + 'unique_tot_ipv6': 0, + 'successful_requests': 0, + 'redirects': 0, + 'bad_requests': 0, + 'server_errors': 0, + 'other_requests': 0 + } + + def __getattr__(self, item): + return getattr(self.service, item) + + def check(self): + last_line = read_last_line(self.log_path) + if not last_line: + return False + self.storage['unique_all_time'] = list() + self.storage['regex'] = re.compile(r'[0-9.]+\s+(?P<duration>[0-9]+)' + r' (?P<client_address>[\da-f.:]+)' + r' (?P<squid_code>[A-Z_]+)/' + r'(?P<http_code>[0-9]+)' + r' (?P<bytes>[0-9]+)' + r' (?P<method>[A-Z_]+)' + r' (?P<url>[^ ]+)' + r' (?P<user>[^ ]+)' + r' (?P<hier_code>[A-Z_]+)/[\da-z.:-]+' + r' (?P<mime_type>[A-Za-z-]*)') + + match = self.storage['regex'].search(last_line) + if not match: + self.error('Regex not matches (%s)' % self.storage['regex'].pattern) + return False + self.storage['dynamic'] = { + 'http_code': { + 'chart': 'squid_detailed_response_codes', + 'func_dim_id': None, + 'func_dim': None + }, + 'hier_code': { + 'chart': 'squid_hier_code', + 'func_dim_id': None, + 'func_dim': lambda v: v.replace('HIER_', '') + }, + 'method': { + 'chart': 'squid_method', + 'func_dim_id': None, + 'func_dim': None + }, + 'mime_type': { + 'chart': 'squid_mime_type', + 'func_dim_id': lambda v: str.lower(v) if str.lower(v) in MIME_TYPES else 'unknown', + 'func_dim': None + } + } + if not self.configuration.get('all_time', True): + self.order.remove('squid_clients_all') + return True + + def get_data(self, raw_data=None): + if not raw_data: + return None if raw_data is None else self.data + + filtered_data = filter_data(raw_data=raw_data, pre_filter=self.pre_filter) + + unique_ip = set() + timings = defaultdict(lambda: dict(minimum=None, maximum=0, summary=0, count=0)) + + for row in filtered_data: + match = self.storage['regex'].search(row) + if match: + match = match.groupdict() + if match['duration'] != '0': + get_timings(timings=timings['duration'], time=float(match['duration']) * 1000) + try: + self.data[match['http_code'][0] + 'xx'] += 1 + except KeyError: + self.data['other'] += 1 + + self.get_data_per_statuses(match['http_code']) + + self.get_data_per_squid_code(match['squid_code']) + + self.data['bytes'] += int(match['bytes']) + + proto = 'ipv4' if '.' in match['client_address'] else 'ipv6' + # unique clients ips + if self.configuration.get('all_time', True): + if address_not_in_pool(pool=self.storage['unique_all_time'], + address=match['client_address'], + pool_size=self.data['unique_tot_ipv4'] + self.data['unique_tot_ipv6']): + self.data['unique_tot_' + proto] += 1 + + if match['client_address'] not in unique_ip: + self.data['unique_' + proto] += 1 + unique_ip.add(match['client_address']) + + for key, values in self.storage['dynamic'].items(): + if match[key] == '-': + continue + dimension_id = values['func_dim_id'](match[key]) if values['func_dim_id'] else match[key] + if dimension_id not in self.data: + dimension = values['func_dim'](match[key]) if values['func_dim'] else dimension_id + self.charts[values['chart']].add_dimension([dimension_id, + dimension, + 'incremental']) + self.data[dimension_id] = 0 + self.data[dimension_id] += 1 + else: + self.data['unmatched'] += 1 + + for elem in timings: + self.data[elem + '_min'] += timings[elem]['minimum'] + self.data[elem + '_avg'] += timings[elem]['summary'] / timings[elem]['count'] + self.data[elem + '_max'] += timings[elem]['maximum'] + return self.data + + def get_data_per_statuses(self, code): + """ + :param code: str: response status code. Ex.: '202', '499' + :return: + """ + code_class = code[0] + if code_class == '2' or code == '304' or code_class == '1' or code == '000': + self.data['successful_requests'] += 1 + elif code_class == '3': + self.data['redirects'] += 1 + elif code_class == '4': + self.data['bad_requests'] += 1 + elif code_class == '5' or code_class == '6': + self.data['server_errors'] += 1 + else: + self.data['other_requests'] += 1 + + def get_data_per_squid_code(self, code): + """ + :param code: str: squid response code. Ex.: 'TCP_MISS', 'TCP_MISS_ABORTED' + :return: + """ + if code not in self.data: + self.charts['squid_code'].add_dimension([code, code, 'incremental']) + self.data[code] = 0 + self.data[code] += 1 + + for tag in code.split('_'): + try: + chart_key = SQUID_CODES[tag] + except KeyError: + continue + dimension_id = '_'.join(['code_detailed', tag]) + if dimension_id not in self.data: + self.charts[chart_key].add_dimension([dimension_id, tag, 'incremental']) + self.data[dimension_id] = 0 + self.data[dimension_id] += 1 + + +def get_timings(timings, time): + """ + :param timings: + :param time: + :return: + """ + if timings['minimum'] is None: + timings['minimum'] = time + if time > timings['maximum']: + timings['maximum'] = time + elif time < timings['minimum']: + timings['minimum'] = time + timings['summary'] += time + timings['count'] += 1 + + +def get_hist(index, buckets, time): + """ + :param index: histogram index (Ex. [10, 50, 100, 150, ...]) + :param buckets: histogram buckets + :param time: time + :return: None + """ + for i in range(len(index)-1, -1, -1): + if time <= index[i]: + buckets[i] += 1 + else: + break + + +def address_not_in_pool(pool, address, pool_size): + """ + :param pool: list of ip addresses + :param address: ip address + :param pool_size: current pool size + :return: True if address not in pool. False otherwise. + """ + index = bisect.bisect_left(pool, address) + if index < pool_size: + if pool[index] == address: + return False + bisect.insort_left(pool, address) + return True + bisect.insort_left(pool, address) + return True + + +def find_regex_return(match_dict=None, msg='Generic error message'): + """ + :param match_dict: dict: re.search.groupdict() or None + :param msg: str: error description + :return: tuple: + """ + return match_dict, msg + + +def check_patterns(string, dimension_regex_dict): + """ + :param string: str: + :param dimension_regex_dict: dict: ex. {'dim1': '<pattern1>', 'dim2': '<pattern2>'} + :return: list of named tuples or None: + We need to make sure all patterns are valid regular expressions + """ + if not hasattr(dimension_regex_dict, 'keys'): + return None + + result = list() + + def valid_pattern(pattern): + """ + :param pattern: str + :return: re.compile(pattern) or None + """ + if not isinstance(pattern, str): + return False + try: + return re.compile(pattern) + except re.error: + return False + + def func_search(pattern): + def closure(v): + return pattern.search(v) + + return closure + + for dimension, regex in dimension_regex_dict.items(): + valid = valid_pattern(regex) + if isinstance(dimension, str) and valid_pattern: + func = func_search(valid) + result.append(NAMED_PATTERN(description='_'.join([string, dimension]), + func=func)) + return result or None + + +def filter_data(raw_data, pre_filter): + """ + :param raw_data: + :param pre_filter: + :return: + """ + + if not pre_filter: + return raw_data + filtered = raw_data + for elem in pre_filter: + if elem.description == 'filter_include': + filtered = filter(elem.func, filtered) + elif elem.description == 'filter_exclude': + filtered = filterfalse(elem.func, filtered) + return filtered diff --git a/collectors/python.d.plugin/web_log/web_log.conf b/collectors/python.d.plugin/web_log/web_log.conf new file mode 100644 index 0000000..0ac17f6 --- /dev/null +++ b/collectors/python.d.plugin/web_log/web_log.conf @@ -0,0 +1,204 @@ +# netdata python.d.plugin configuration for web log +# +# This file is in YaML format. Generally the format is: +# +# name: value +# +# There are 2 sections: +# - global variables +# - one or more JOBS +# +# JOBS allow you to collect values from multiple sources. +# Each source will have its own set of charts. +# +# JOB parameters have to be indented (using spaces only, example below). + +# ---------------------------------------------------------------------- +# Global Variables +# These variables set the defaults for all JOBs, however each JOB +# may define its own, overriding the defaults. + +# update_every sets the default data collection frequency. +# If unset, the python.d.plugin default is used. +# update_every: 1 + +# priority controls the order of charts at the netdata dashboard. +# Lower numbers move the charts towards the top of the page. +# If unset, the default for python.d.plugin is used. +# priority: 60000 + +# penalty indicates whether to apply penalty to update_every in case of failures. +# Penalty will increase every 5 failed updates in a row. Maximum penalty is 10 minutes. +# penalty: yes + +# autodetection_retry sets the job re-check interval in seconds. +# The job is not deleted if check fails. +# Attempts to start the job are made once every autodetection_retry. +# This feature is disabled by default. +# autodetection_retry: 0 + +# ---------------------------------------------------------------------- +# JOBS (data collection sources) +# +# The default JOBS share the same *name*. JOBS with the same name +# are mutually exclusive. Only one of them will be allowed running at +# any time. This allows autodetection to try several alternatives and +# pick the one that works. +# +# Any number of jobs is supported. + +# ---------------------------------------------------------------------- +# PLUGIN CONFIGURATION +# +# All python.d.plugin JOBS (for all its modules) support a set of +# predefined parameters. These are: +# +# job_name: +# name: myname # the JOB's name as it will appear at the +# # dashboard (by default is the job_name) +# # JOBs sharing a name are mutually exclusive +# update_every: 1 # the JOB's data collection frequency +# priority: 60000 # the JOB's order on the dashboard +# penalty: yes # the JOB's penalty +# autodetection_retry: 0 # the JOB's re-check interval in seconds +# +# Additionally to the above, web_log also supports the following: +# +# path: 'PATH' # the path to web server log file +# path: 'PATH[0-9]*[0-9]' # log files with date suffix are also supported +# detailed_response_codes: yes/no # default: yes. Additional chart where response codes are not grouped +# detailed_response_aggregate: yes/no # default: yes. Not aggregated detailed response codes charts +# all_time : yes/no # default: yes. All time unique client IPs chart (50000 addresses ~ 400KB) +# filter: # filter with regex +# include: 'REGEX' # only those rows that matches the regex +# exclude: 'REGEX' # all rows except those that matches the regex +# categories: # requests per url chart configuration +# cacti: 'cacti.*' # name(dimension): REGEX to match +# observium: 'observium.*' # name(dimension): REGEX to match +# stub_status: 'stub_status' # name(dimension): REGEX to match +# user_defined: # requests per pattern in <user_defined> field (custom_log_format) +# cacti: 'cacti.*' # name(dimension): REGEX to match +# observium: 'observium.*' # name(dimension): REGEX to match +# stub_status: 'stub_status' # name(dimension): REGEX to match +# custom_log_format: # define a custom log format +# pattern: '(?P<address>[\da-f.:]+) -.*?"(?P<method>[A-Z]+) (?P<url>.*?)" (?P<code>[1-9]\d{2}) (?P<bytes_sent>\d+) (?P<resp_length>\d+) (?P<resp_time>\d+\.\d+) ' +# time_multiplier: 1000000 # type <int>/<float> - convert time to microseconds +# histogram: [1,3,10,30,100, ...] # type list of int - Cumulative histogram of response time in milli seconds + +# ---------------------------------------------------------------------- +# WEB SERVER CONFIGURATION +# +# Make sure the web server log directory and the web server log files +# can be read by user 'netdata'. +# +# To enable the timings chart and the requests size dimension, the +# web server needs to log them. This is how to add them: +# +# nginx: +# log_format netdata '$remote_addr - $remote_user [$time_local] ' +# '"$request" $status $body_bytes_sent ' +# '$request_length $request_time $upstream_response_time ' +# '"$http_referer" "$http_user_agent"'; +# access_log /var/log/nginx/access.log netdata; +# +# apache (you need mod_logio enabled): +# LogFormat "%h %l %u %t \"%r\" %>s %O %I %D \"%{Referer}i\" \"%{User-Agent}i\"" vhost_netdata +# LogFormat "%h %l %u %t \"%r\" %>s %O %I %D \"%{Referer}i\" \"%{User-Agent}i\"" netdata +# CustomLog "/var/log/apache2/access.log" netdata + +# ---------------------------------------------------------------------- +# VHOST AND PORT +# if your want to graph the request/sec per virtual host and per port (to check the number of requests in http vs https) + +# in apache : (%v gives the hostname, %p the port number) +# LogFormat "%v %p %h %t \"%r\" %>s %O %I %D \"%{Referer}i\" \"%{User-Agent}i\"" vhost_netdata +# +# and in this file in apache_vhosts_log section, add : +# custom_log_format: +# pattern: '(?P<vhost>[a-zA-Z\d.-_]+) (?P<port>\d+) (?P<address>[\da-f.:]+) \[.*\] "(?P<method>[A-Z]+)[^"]*" (?P<code>[1-9]\d{2}) (?P<bytes_sent>\d+) (?P<resp_length>\d+) (?P<resp_time>\d+)' + +# ---------------------------------------------------------------------- +# AUTO-DETECTION JOBS +# only one of them per web server will run (when they have the same name) + + +# ------------------------------------------- +# nginx log on various distros + +# debian, arch +nginx_log: + name: 'nginx' + path: '/var/log/nginx/access.log' + +# gentoo +nginx_log2: + name: 'nginx' + path: '/var/log/nginx/localhost.access_log' + + +# ------------------------------------------- +# apache log on various distros + +# debian +apache_log: + name: 'apache' + path: '/var/log/apache2/access.log' + +# gentoo +apache_log2: + name: 'apache' + path: '/var/log/apache2/access_log' + +# arch +apache_log3: + name: 'apache' + path: '/var/log/httpd/access_log' + +# debian +apache_vhosts_log: + name: 'apache_vhosts' + path: '/var/log/apache2/other_vhosts_access.log' + + +# ------------------------------------------- +# gunicorn log on various distros + +gunicorn_log: + name: 'gunicorn' + path: '/var/log/gunicorn/access.log' + +gunicorn_log2: + name: 'gunicorn' + path: '/var/log/gunicorn/gunicorn-access.log' + +# ------------------------------------------- +# Apache Cache +apache_cache: + name: 'apache_cache' + type: 'apache_cache' + path: '/var/log/apache/cache.log' + +apache2_cache: + name: 'apache_cache' + type: 'apache_cache' + path: '/var/log/apache2/cache.log' + +httpd_cache: + name: 'apache_cache' + type: 'apache_cache' + path: '/var/log/httpd/cache.log' + +# ------------------------------------------- +# Squid + +# debian/ubuntu +squid_log1: + name: 'squid' + type: 'squid' + path: '/var/log/squid3/access.log' + +#gentoo +squid_log2: + name: 'squid' + type: 'squid' + path: '/var/log/squid/access.log' |