author: Daniel Baumann <daniel.baumann@progress-linux.org> 2023-05-08 16:27:08 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2023-05-08 16:27:08 +0000
commit: 81581f9719bc56f01d5aa08952671d65fda9867a
tree: 0f5c6b6138bf169c23c9d24b1fc0a3521385cb18 /collectors/python.d.plugin/pandas
parent: Releasing debian version 1.38.1-1.
Merging upstream version 1.39.0.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'collectors/python.d.plugin/pandas')
-rw-r--r-- collectors/python.d.plugin/pandas/README.md | 26
-rw-r--r-- collectors/python.d.plugin/pandas/pandas.chart.py | 12
-rw-r--r-- collectors/python.d.plugin/pandas/pandas.conf | 24
3 files changed, 49 insertions, 13 deletions
diff --git a/collectors/python.d.plugin/pandas/README.md b/collectors/python.d.plugin/pandas/README.md
index 141549478..19b11d5be 100644
--- a/collectors/python.d.plugin/pandas/README.md
+++ b/collectors/python.d.plugin/pandas/README.md
@@ -1,16 +1,15 @@
-<!--
-title: "Pandas"
-custom_edit_url: https://github.com/netdata/netdata/edit/master/collectors/python.d.plugin/pandas/README.md
--->
-
-# Pandas Netdata Collector
+# Ingest structured data (Pandas)
<a href="https://pandas.pydata.org/" target="_blank">
<img src="https://pandas.pydata.org/docs/_static/pandas.svg" alt="Pandas" width="100px" height="50px" />
</a>
-A python collector using [pandas](https://pandas.pydata.org/) to pull data and do pandas based
-preprocessing before feeding to Netdata.
+[Pandas](https://pandas.pydata.org/) is a de-facto standard in reading and processing most types of structured data in Python.
+If you have metrics appearing in a CSV, JSON, XML, HTML, or [other supported format](https://pandas.pydata.org/docs/user_guide/io.html),
+either locally or via some HTTP endpoint, you can easily ingest and present those metrics in Netdata, by leveraging the Pandas collector.
+
+The collector uses [pandas](https://pandas.pydata.org/) to pull data and do pandas-based
+preprocessing, before feeding to Netdata.
## Requirements
@@ -20,6 +19,12 @@ This collector depends on some Python (Python 3 only) packages that can usually
sudo pip install pandas requests
```
+Note: If you would like to use [`pandas.read_sql`](https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html) to query a database, you will need to install the below packages as well.
+
+```bash
+sudo pip install 'sqlalchemy<2.0' psycopg2-binary
+```
+
## Configuration
Below is an example configuration to query some json weather data from [Open-Meteo](https://open-meteo.com),
@@ -66,12 +71,11 @@ temperature:
`chart_configs` is a list of dictionary objects where each one defines the sequence of `df_steps` to be run using [`pandas`](https://pandas.pydata.org/),
and the `name`, `title` etc to define the
-[CHART variables](https://learn.netdata.cloud/docs/agent/collectors/python.d.plugin#global-variables-order-and-chart)
+[CHART variables](https://github.com/netdata/netdata/blob/master/docs/guides/python-collector.md#create-charts)
that will control how the results will look in netdata.
The example configuration above would result in a `data` dictionary like the below being collected by Netdata
-at each time step. They keys in this dictionary will be the
-[dimension](https://learn.netdata.cloud/docs/agent/web#dimensions) names on the chart.
+at each time step. They keys in this dictionary will be the "dimensions" of the chart.
```javascript
{'athens_max': 26.2, 'athens_mean': 19.45952380952381, 'athens_min': 12.2, 'berlin_max': 17.4, 'berlin_mean': 10.764285714285714, 'berlin_min': 5.7, 'dublin_max': 15.3, 'dublin_mean': 12.008928571428571, 'dublin_min': 6.6, 'london_max': 18.9, 'london_mean': 12.510714285714286, 'london_min': 5.2, 'paris_max': 19.4, 'paris_mean': 12.054166666666665, 'paris_min': 4.8}
diff --git a/collectors/python.d.plugin/pandas/pandas.chart.py b/collectors/python.d.plugin/pandas/pandas.chart.py
index 8eb4452fb..7977bcb36 100644
--- a/collectors/python.d.plugin/pandas/pandas.chart.py
+++ b/collectors/python.d.plugin/pandas/pandas.chart.py
@@ -3,6 +3,7 @@
# Author: Andrew Maguire (andrewm4894)
# SPDX-License-Identifier: GPL-3.0-or-later
+import os
import pandas as pd
try:
@@ -11,6 +12,12 @@ try:
except ImportError:
HAS_REQUESTS = False
+try:
+    from sqlalchemy import create_engine
+    HAS_SQLALCHEMY = True
+except ImportError:
+    HAS_SQLALCHEMY = False
+
from bases.FrameworkServices.SimpleService import SimpleService
ORDER = []
@@ -46,7 +53,10 @@ class Service(SimpleService):
         """ensure charts and dims all configured and that we can get data"""
         if not HAS_REQUESTS:
-            self.warn('requests library could not be imported')
+            self.warning('requests library could not be imported')
+
+        if not HAS_SQLALCHEMY:
+            self.warning('sqlalchemy library could not be imported')
         if not self.chart_configs:
             self.error('chart_configs must be defined')
diff --git a/collectors/python.d.plugin/pandas/pandas.conf b/collectors/python.d.plugin/pandas/pandas.conf
index 6684af9d5..ca523ed36 100644
--- a/collectors/python.d.plugin/pandas/pandas.conf
+++ b/collectors/python.d.plugin/pandas/pandas.conf
@@ -188,4 +188,26 @@ update_every: 5
# df_steps: >
# pd.read_xml('http://metwdb-openaccess.ichec.ie/metno-wdb2ts/locationforecast?lat=54.7210798611;long=-8.7237392806', xpath='./product/time[1]/location/temperature', parser='etree')|
# df.rename(columns={'value': 'dublin'})|
-# df[['dublin']]|
\ No newline at end of file
+# df[['dublin']]|
+
+# example showing a read_sql from a postgres database using sqlalchemy.
+# note: example assumes a running postgress db on localhost with a netdata users and password netdata.
+# sql:
+# name: "sql"
+# update_every: 5
+# chart_configs:
+# - name: "sql"
+# title: "SQL Example"
+# family: "sql.example"
+# context: "example"
+# type: "line"
+# units: "percent"
+# df_steps: >
+# pd.read_sql_query(
+# sql='\
+# select \
+# random()*100 as metric_1, \
+# random()*100 as metric_2 \
+# ',
+# con=create_engine('postgresql://localhost/postgres?user=netdata&password=netdata')
+# );
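
Note on the optional `sqlalchemy` import in `pandas.chart.py`: the change wraps the import in `try`/`except ImportError` and records the outcome in a `HAS_SQLALCHEMY` flag, so that `check()` can log a warning instead of failing at import time. Below is a minimal, standard-library-only sketch of that general pattern. The helper `optional_import` and the deliberately missing module name are hypothetical and are not part of the collector; the collector itself uses plain `try`/`except` blocks as shown in the diff.

```python
import importlib


def optional_import(module_name):
    """Return (module, True) if module_name imports, else (None, False)."""
    try:
        return importlib.import_module(module_name), True
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught too.
        return None, False


# "json" ships with Python, so this import is expected to succeed.
json_module, HAS_JSON = optional_import("json")

# Hypothetical module name, chosen so that the import fails.
missing_module, HAS_MISSING = optional_import("hypothetical_missing_module_xyz")

if not HAS_MISSING:
    # The collector logs a warning at this point; this sketch only prints one.
    print("warning: hypothetical_missing_module_xyz could not be imported")
```

Keeping the flag separate from the import lets the service start with only the dependencies a given `df_steps` configuration actually needs, which appears to be the intent of warning rather than raising an error.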