diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2018-11-07 12:19:29 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2018-11-07 12:20:17 +0000 |
commit | a64a253794ac64cb40befee54db53bde17dd0d49 (patch) | |
tree | c1024acc5f6e508814b944d99f112259bb28b1be /collectors/python.d.plugin/README.md | |
parent | New upstream version 1.10.0+dfsg (diff) | |
download | netdata-a64a253794ac64cb40befee54db53bde17dd0d49.tar.xz netdata-a64a253794ac64cb40befee54db53bde17dd0d49.zip |
New upstream version 1.11.0+dfsgupstream/1.11.0+dfsg
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'collectors/python.d.plugin/README.md')
-rw-r--r-- | collectors/python.d.plugin/README.md | 198 |
1 files changed, 198 insertions, 0 deletions
diff --git a/collectors/python.d.plugin/README.md b/collectors/python.d.plugin/README.md new file mode 100644 index 000000000..df24cd18f --- /dev/null +++ b/collectors/python.d.plugin/README.md @@ -0,0 +1,198 @@ +# python.d.plugin + +`python.d.plugin` is a netdata external plugin. It is an **orchestrator** for data collection modules written in `python`. + +1. It runs as an independent process `ps fax` shows it +2. It is started and stopped automatically by netdata +3. It communicates with netdata via a unidirectional pipe (sending data to the netdata daemon) +4. Supports any number of data collection **modules** +5. Allows each **module** to have one or more data collection **jobs** +6. Each **job** is collecting one or more metrics from a single data source + + +## Disclaimer + +Every module should be compatible with python2 and python3. +All third party libraries should be installed system-wide or in `python_modules` directory. +Module configurations are written in YAML and **pyYAML is required**. + +Every configuration file must have one of two formats: + +- Configuration for only one job: + +```yaml +update_every : 2 # update frequency +retries : 1 # how many failures in update() is tolerated +priority : 20000 # where it is shown on dashboard + +other_var1 : bla # variables passed to module +other_var2 : alb +``` + +- Configuration for many jobs (ex. mysql): + +```yaml +# module defaults: +update_every : 2 +retries : 1 +priority : 20000 + +local: # job name + update_every : 5 # job update frequency + other_var1 : some_val # module specific variable + +other_job: + priority : 5 # job position on dashboard + retries : 20 # job retries + other_var2 : val # module specific variable +``` + +`update_every`, `retries`, and `priority` are always optional. + +--- + +## How to write a new module + +Writing new python module is simple. You just need to remember to include 5 major things: +- **ORDER** global list +- **CHART** global dictionary +- **Service** class +- **_get_data** method +- all code needs to be compatible with Python 2 (**≥ 2.7**) *and* 3 (**≥ 3.1**) + +If you plan to submit the module in a PR, make sure and go through the [PR checklist for new modules](https://github.com/netdata/netdata/wiki/New-Module-PR-Checklist) beforehand to make sure you have updated all the files you need to. + +### Global variables `ORDER` and `CHART` + +`ORDER` list should contain the order of chart ids. Example: +```py +ORDER = ['first_chart', 'second_chart', 'third_chart'] +``` + +`CHART` dictionary is a little bit trickier. It should contain the chart definition in following format: +```py +CHART = { + id: { + 'options': [name, title, units, family, context, charttype], + 'lines': [ + [unique_dimension_name, name, algorithm, multiplier, divisor] + ]} +``` + +All names are better explained in the [External Plugins](../) section. +Parameters like `priority` and `update_every` are handled by `python.d.plugin`. + +### `Service` class + +Every module needs to implement its own `Service` class. This class should inherit from one of the framework classes: + +- `SimpleService` +- `UrlService` +- `SocketService` +- `LogService` +- `ExecutableService` + +Also it needs to invoke the parent class constructor in a specific way as well as assign global variables to class variables. + +Simple example: +```py +from base import UrlService +class Service(UrlService): + def __init__(self, configuration=None, name=None): + UrlService.__init__(self, configuration=configuration, name=name) + self.order = ORDER + self.definitions = CHARTS +``` + +### `_get_data` collector/parser + +This method should grab raw data from `_get_raw_data`, parse it, and return a dictionary where keys are unique dimension names or `None` if no data is collected. + +Example: +```py +def _get_data(self): + try: + raw = self._get_raw_data().split(" ") + return {'active': int(raw[2])} + except (ValueError, AttributeError): + return None +``` + +More about framework classes +============================ + +Every framework class has some user-configurable variables which are specific to this particular class. Those variables should have default values initialized in the child class constructor. + +If module needs some additional user-configurable variable, it can be accessed from the `self.configuration` list and assigned in constructor or custom `check` method. Example: +```py +def __init__(self, configuration=None, name=None): + UrlService.__init__(self, configuration=configuration, name=name) + try: + self.baseurl = str(self.configuration['baseurl']) + except (KeyError, TypeError): + self.baseurl = "http://localhost:5001" +``` + +Classes implement `_get_raw_data` which should be used to grab raw data. This method usually returns a list of strings. + +### `SimpleService` + +_This is last resort class, if a new module cannot be written by using other framework class this one can be used._ + +_Example: `mysql`, `sensors`_ + +It is the lowest-level class which implements most of module logic, like: +- threading +- handling run times +- chart formatting +- logging +- chart creation and updating + +### `LogService` + +_Examples: `apache_cache`, `nginx_log`_ + +_Variable from config file_: `log_path`. + +Object created from this class reads new lines from file specified in `log_path` variable. It will check if file exists and is readable. Also `_get_raw_data` returns list of strings where each string is one line from file specified in `log_path`. + +### `ExecutableService` + +_Examples: `exim`, `postfix`_ + +_Variable from config file_: `command`. + +This allows to execute a shell command in a secure way. It will check for invalid characters in `command` variable and won't proceed if there is one of: +- '&' +- '|' +- ';' +- '>' +- '<' + +For additional security it uses python `subprocess.Popen` (without `shell=True` option) to execute command. Command can be specified with absolute or relative name. When using relative name, it will try to find `command` in `PATH` environment variable as well as in `/sbin` and `/usr/sbin`. + +`_get_raw_data` returns list of decoded lines returned by `command`. + +### UrlService + +_Examples: `apache`, `nginx`, `tomcat`_ + +_Variables from config file_: `url`, `user`, `pass`. + +If data is grabbed by accessing service via HTTP protocol, this class can be used. It can handle HTTP Basic Auth when specified with `user` and `pass` credentials. + +`_get_raw_data` returns list of utf-8 decoded strings (lines). + +### SocketService + +_Examples: `dovecot`, `redis`_ + +_Variables from config file_: `unix_socket`, `host`, `port`, `request`. + +Object will try execute `request` using either `unix_socket` or TCP/IP socket with combination of `host` and `port`. This can access unix sockets with SOCK_STREAM or SOCK_DGRAM protocols and TCP/IP sockets in version 4 and 6 with SOCK_STREAM setting. + +Sockets are accessed in non-blocking mode with 15 second timeout. + +After every execution of `_get_raw_data` socket is closed, to prevent this module needs to set `_keep_alive` variable to `True` and implement custom `_check_raw_data` method. + +`_check_raw_data` should take raw data and return `True` if all data is received otherwise it should return `False`. Also it should do it in fast and efficient way.
\ No newline at end of file |