author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 14:31:17 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-04 14:31:17 +0000 |
commit | 8020f71afd34d7696d7933659df2d763ab05542f (patch) | |
tree | 2fdf1b5447ffd8bdd61e702ca183e814afdcb4fc /web/api/queries | |
parent | Initial commit. (diff) | |
download | netdata-upstream/1.37.1.tar.xz netdata-upstream/1.37.1.zip |
Adding upstream version 1.37.1.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
56 files changed, 6078 insertions, 0 deletions
diff --git a/web/api/queries/Makefile.am b/web/api/queries/Makefile.am new file mode 100644 index 0000000..7c4c435 --- /dev/null +++ b/web/api/queries/Makefile.am @@ -0,0 +1,23 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +SUBDIRS = \ + average \ + countif \ + des \ + incremental_sum \ + max \ + min \ + sum \ + median \ + percentile \ + ses \ + stddev \ + trimmed_mean \ + $(NULL) + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/README.md b/web/api/queries/README.md new file mode 100644 index 0000000..44cdd05 --- /dev/null +++ b/web/api/queries/README.md @@ -0,0 +1,176 @@ +<!-- +title: "Database Queries" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/README.md +--> + +# Database Queries + +Netdata database can be queried with `/api/v1/data` and `/api/v1/badge.svg` REST API methods. + +Every data query accepts the following parameters: + +|name|required|description| +|:--:|:------:|:----------| +|`chart`|yes|The chart to be queried.| +|`points`|no|The number of points to be returned. Netdata can reduce number of points by applying query grouping methods. If not given, the result will have the same granularity as the database (although this relates to `gtime`).| +|`before`|no|The absolute timestamp or the relative (to now) time the query should finish evaluating data. If not given, it defaults to the timestamp of the latest point in the database.| +|`after`|no|The absolute timestamp or the relative (to `before`) time the query should start evaluating data. if not given, it defaults to the timestamp of the oldest point in the database.| +|`group`|no|The grouping method to use when reducing the points the database has. If not given, it defaults to `average`.| +|`gtime`|no|A resampling period to change the units of the metrics (i.e. setting this to `60` will convert `per second` metrics to `per minute`. 
If not given it defaults to granularity of the database.| +|`options`|no|A bitmap of options that can affect the operation of the query. Only 2 options are used by the query engine: `unaligned` and `percentage`. All the other options are used by the output formatters. The default is to return aligned data.| +|`dimensions`|no|A simple pattern to filter the dimensions to be queried. The default is to return all the dimensions of the chart.| + +## Operation + +The query engine works as follows (in this order): + +#### Time-frame + +`after` and `before` define a time-frame, accepting: + +- **absolute timestamps** (unix timestamps, i.e. seconds since epoch). + +- **relative timestamps**: + + `before` is relative to now and `after` is relative to `before`. + + Example: `before=-60&after=-60` evaluates to the time-frame from -120 up to -60 seconds in + the past, relative to the latest entry of the database of the chart. + +The engine verifies that the time-frame requested is available at the database: + +- If the requested time-frame overlaps with the database, the excess requested + will be truncated. + +- If the requested time-frame does not overlap with the database, the engine will + return an empty data set. + +At the end of this operation, `after` and `before` are absolute timestamps. + +#### Data grouping + +Database points grouping is applied when the caller requests a time-frame to be +expressed with fewer points, compared to what is available at the database. + +There are 2 uses that enable this feature: + +- The caller requests a specific number of `points` to be returned. + + For example, for a time-frame of 10 minutes, the database has 600 points (1/sec), + while the caller requested these 10 minutes to be expressed in 200 points. + + This feature is used by Netdata dashboards when you zoom-out the charts. + The dashboard is requesting the number of points the user's screen has. 
+ This saves bandwidth and speeds up the browser (fewer points to evaluate for drawing the charts). +- The caller requests a **re-sampling** of the database, by setting `gtime` to any value + above the granularity of the chart. + + For example, the chart's units is `requests/sec` and caller wants `requests/min`. + +Using `points` and `gtime` the query engine tries to find a best fit for **database-points** +vs **result-points** (we call this ratio `group points`). It always tries to keep `group points` +an integer. Keep in mind the query engine may shift `after` if required. See also the [example](#example). + +#### Time-frame Alignment + +Alignment is a very important aspect of Netdata queries. Without it, the animated +charts on the dashboards would constantly [change shape](#example) during incremental updates. + +To provide consistent grouping through time, the query engine (by default) aligns +`after` and `before` to be a multiple of `group points`. + +For example, if `group points` is 60 and alignment is enabled, the engine will return +each point with durations XX:XX:00 - XX:XX:59, matching whole minutes. + +To disable alignment, pass `&options=unaligned` to the query. + +#### Query Execution + +To execute the query, the engine evaluates all dimensions of the chart, one after another. + +The engine does not evaluate dimensions that do not match the [simple pattern](/libnetdata/simple_pattern/README.md) +given at the `dimensions` parameter, except when `options=percentage` is given (this option +requires all the dimensions to be evaluated to find the percentage of each dimension vs to chart +total). + +For each dimension, it starts evaluating values starting at `after` (not inclusive) towards +`before` (inclusive). + +For each value it calls the **grouping method** given with the `&group=` query parameter +(the default is `average`). + +## Grouping methods + +The following grouping methods are supported. 
These are given all the values in the time-frame +and they group the values every `group points`. + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min&value_color=blue) finds the minimum value +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max&value_color=lightblue) finds the maximum value +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average&value_color=yellow) finds the average value +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=sum&after=-60&label=sum&units=requests&value_color=orange) adds all the values and returns the sum +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=median&after=-60&label=median&value_color=red) sorts the values and returns the value in the middle of the list +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=stddev&after=-60&label=stddev&value_color=green) finds the standard deviation of the values +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=cv&after=-60&label=cv&units=pcent&value_color=yellow) finds the relative standard deviation (coefficient of variation) of the values +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=ses&after=-60&label=ses&value_color=brown) finds the exponential weighted moving average of the values +- 
![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=des&after=-60&label=des&value_color=blue) applies Holt-Winters double exponential smoothing +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=incremental_sum&after=-60&label=incremental_sum&value_color=red) finds the difference of the last vs the first value + +The examples shown above, are live information from the `successful` web requests of the global Netdata registry. + +## Further processing + +The result of the query engine is always a structure that has dimensions and values +for each dimension. + +Formatting modules are then used to convert this result in many different formats and return it +to the caller. + +## Performance + +The query engine is highly optimized for speed. Most of its modules implement "online" +versions of the algorithms, requiring just one pass on the database values to produce +the result. + +## Example + +When Netdata is reducing metrics, it tries to return always the same boundaries. So, if we want 10s averages, it will always return points starting at a `unix timestamp % 10 = 0`. + +Let's see why this is needed, by looking at the error case. + +Assume we have 5 points: + +|time|value| +|:--:|:---:| +|00:01|1| +|00:02|2| +|00:03|3| +|00:04|4| +|00:05|5| + +At 00:04 you ask for 2 points for 4 seconds in the past. So `group = 2`. Netdata would return: + +|point|time|value| +|:---:|:--:|:---:| +|1|00:01 - 00:02|1.5| +|2|00:03 - 00:04|3.5| + +A second later the chart is to be refreshed, and makes again the same request at 00:05. These are the points that would have been returned: + +|point|time|value| +|:---:|:--:|:---:| +|1|00:02 - 00:03|2.5| +|2|00:04 - 00:05|4.5| + +**Wait a moment!** The chart was shifted just one point and it changed value! Point 2 was 3.5 and when shifted to point 1 is 2.5! 
+If you see this in a chart, it's a mess. The charts change shape constantly. + +For this reason, Netdata always aligns the data it returns to the `group`. + +When you request `points=1`, Netdata understands that you need 1 point for the whole database, so `group = 3600`. Then it tries to find the starting point which would be `timestamp % 3600 = 0`. Within a database of 3600 seconds, there is one such point for sure. Then it tries to find the average of 3600 points. But, most probably it will not find 3600 of them (for just 1 out of 3600 seconds this query will return something). + +So, the proper way to query the database is to also set at least `after`. The following call will return 1 point for the last complete 10-second duration (it starts at `timestamp % 10 = 0`): + +<http://netdata.firehol.org/api/v1/data?chart=system.cpu&points=1&after=-10&options=seconds> + +When you keep calling this URL, you will see that it returns one new value every 10 seconds, and the timestamp always ends with zero. Similarly, if you say `points=1&after=-5` it will always return timestamps ending with 0 or 5. + + diff --git a/web/api/queries/average/Makefile.am b/web/api/queries/average/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/average/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/average/README.md b/web/api/queries/average/README.md new file mode 100644 index 0000000..b8d4ba7 --- /dev/null +++ b/web/api/queries/average/README.md @@ -0,0 +1,46 @@ +<!-- +title: "Average or Mean" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/average/README.md +--> + +# Average or Mean + +> This query is available as `average` and `mean`. + +An average is a single number taken as representative of a list of numbers. 
+ +It is calculated as: + +``` +average = sum(numbers) / count(numbers) +``` + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: average -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`average` does not change the units. For example, if the chart units is `requests/sec`, the result +will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=average` in the URL. + +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Average>. 
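As a rough illustration of the one-pass approach that `average.c` in this commit takes, the sketch below keeps a running sum and a count, and emits the mean when the group is flushed. The names `avg_state`, `avg_add` and `avg_flush` are hypothetical and not part of Netdata; resampling and the `RRDR` plumbing are deliberately left out.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-query accumulator. */
struct avg_state {
    double sum;    /* running sum of the values in the current group */
    size_t count;  /* number of values added to the current group */
};

/* Feed one database value into the current group. */
void avg_add(struct avg_state *s, double value) {
    s->sum += value;
    s->count++;
}

/* Return the mean of the current group and reset the accumulator.
 * *empty is set to 1 when no values were added, 0 otherwise. */
double avg_flush(struct avg_state *s, int *empty) {
    double result = 0.0;

    *empty = (s->count == 0);
    if (!*empty)
        result = s->sum / (double)s->count;

    s->sum = 0.0;
    s->count = 0;
    return result;
}
```

Calling `avg_add()` once per value and `avg_flush()` once per group needs only a single pass over the data, which is the property the "Performance" section of the queries README describes.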
+ + diff --git a/web/api/queries/average/average.c b/web/api/queries/average/average.c new file mode 100644 index 0000000..0719d57 --- /dev/null +++ b/web/api/queries/average/average.c @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "average.h" + +// ---------------------------------------------------------------------------- +// average + +struct grouping_average { + NETDATA_DOUBLE sum; + size_t count; +}; + +void grouping_create_average(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_average)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_average(RRDR *r) { + struct grouping_average *g = (struct grouping_average *)r->internal.grouping_data; + g->sum = 0; + g->count = 0; +} + +void grouping_free_average(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_average(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_average *g = (struct grouping_average *)r->internal.grouping_data; + g->sum += value; + g->count++; +} + +NETDATA_DOUBLE grouping_flush_average(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_average *g = (struct grouping_average *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + if(unlikely(r->internal.resampling_group != 1)) { + if (unlikely(r->result_options & RRDR_RESULT_OPTION_VARIABLE_STEP)) + value = g->sum / g->count / r->internal.resampling_divisor; + else + value = g->sum / r->internal.resampling_divisor; + } else + value = g->sum / g->count; + } + + g->sum = 0.0; + g->count = 0; + + return value; +} diff --git a/web/api/queries/average/average.h b/web/api/queries/average/average.h new file mode 100644 index 0000000..b319668 --- /dev/null +++ 
b/web/api/queries/average/average.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERY_AVERAGE_H +#define NETDATA_API_QUERY_AVERAGE_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_average(RRDR *r, const char *options __maybe_unused); +void grouping_reset_average(RRDR *r); +void grouping_free_average(RRDR *r); +void grouping_add_average(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_average(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_AVERAGE_H diff --git a/web/api/queries/countif/Makefile.am b/web/api/queries/countif/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/countif/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/countif/README.md b/web/api/queries/countif/README.md new file mode 100644 index 0000000..200a4c9 --- /dev/null +++ b/web/api/queries/countif/README.md @@ -0,0 +1,36 @@ +<!-- +title: "CountIf" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/countif/README.md +--> + +# CountIf + +> This query is available as `countif`. + +CountIf returns the percentage of points in the database that satisfy the condition supplied. + +The following conditions are available: + +- `!` or `!=` or `<>`, different from +- `=` or `:`, equal to +- `>`, greater than +- `<`, less than +- `>=`, greater than or equal to +- `<=`, less than or equal to + +The target number and the desired condition can be set using the `group_options` query parameter, as a string, like in these examples: + +- `!0`, to match any number except zero. +- `>=-3`, to match any number greater than or equal to -3. + +When an invalid condition is given, the web server may return an inaccurate response. 
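The condition syntax listed above can be sketched as a small parser plus a counting loop. This is a simplified illustration, not the code Netdata ships: `cmp_op`, `condition`, `parse_condition`, `condition_matches` and `countif_percent` are hypothetical names, and the standard `strtod` is used here in place of Netdata's own number parser. The result is scaled to 0..100, following the `* 100` in the `countif.c` added by this same commit.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical types for this sketch. */
enum cmp_op { OP_EQ, OP_NE, OP_LT, OP_LE, OP_GT, OP_GE };

struct condition {
    enum cmp_op op;
    double target;
};

/* Parse strings such as "!0", ">=-3" or "<> 5".
 * A missing or unknown operator falls back to "equal to". */
struct condition parse_condition(const char *s) {
    struct condition c = { OP_EQ, 0.0 };
    if (!s) return c;

    while (isspace((unsigned char)*s)) s++;

    if (s[0] == '!')                     { c.op = OP_NE; s += (s[1] == '=') ? 2 : 1; }
    else if (s[0] == '<' && s[1] == '>') { c.op = OP_NE; s += 2; }
    else if (s[0] == '<' && s[1] == '=') { c.op = OP_LE; s += 2; }
    else if (s[0] == '>' && s[1] == '=') { c.op = OP_GE; s += 2; }
    else if (s[0] == '<')                { c.op = OP_LT; s += 1; }
    else if (s[0] == '>')                { c.op = OP_GT; s += 1; }
    else if (s[0] == '=' || s[0] == ':') { c.op = OP_EQ; s += 1; }

    /* strtod skips leading whitespace and yields 0.0 when no number follows */
    c.target = strtod(s, NULL);
    return c;
}

/* Return 1 when v satisfies the condition, 0 otherwise. */
int condition_matches(struct condition c, double v) {
    switch (c.op) {
        case OP_NE: return v != c.target;
        case OP_LT: return v <  c.target;
        case OP_LE: return v <= c.target;
        case OP_GT: return v >  c.target;
        case OP_GE: return v >= c.target;
        case OP_EQ:
        default:    return v == c.target;
    }
}

/* Percentage (0..100) of the n values that satisfy the condition. */
double countif_percent(const double *values, size_t n, struct condition c) {
    size_t matched = 0;
    if (n == 0) return 0.0;

    for (size_t i = 0; i < n; i++)
        matched += (size_t)condition_matches(c, values[i]);

    return (double)matched * 100.0 / (double)n;
}
```

For example, `countif_percent(v, 4, parse_condition(">=10"))` over the values `0, 5, 10, 15` counts two matches out of four.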
+ +## how to use + +This query cannot be used in alarms. + +`countif` changes the units of charts. The result of the calculation is always from zero to 1, expressing the percentage of database points that matched the condition. + +In APIs and badges can be used like this: `&group=countif&group_options=>10` in the URL. + + diff --git a/web/api/queries/countif/countif.c b/web/api/queries/countif/countif.c new file mode 100644 index 0000000..369b20b --- /dev/null +++ b/web/api/queries/countif/countif.c @@ -0,0 +1,136 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "countif.h" + +// ---------------------------------------------------------------------------- +// countif + +struct grouping_countif { + size_t (*comparison)(NETDATA_DOUBLE, NETDATA_DOUBLE); + NETDATA_DOUBLE target; + size_t count; + size_t matched; +}; + +static size_t countif_equal(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v == target); +} + +static size_t countif_notequal(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v != target); +} + +static size_t countif_less(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v < target); +} + +static size_t countif_lessequal(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v <= target); +} + +static size_t countif_greater(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v > target); +} + +static size_t countif_greaterequal(NETDATA_DOUBLE v, NETDATA_DOUBLE target) { + return (v >= target); +} + +void grouping_create_countif(RRDR *r, const char *options __maybe_unused) { + struct grouping_countif *g = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_countif)); + r->internal.grouping_data = g; + + if(options && *options) { + // skip any leading spaces + while(isspace(*options)) options++; + + // find the comparison function + switch(*options) { + case '!': + options++; + if(*options != '=' && *options != ':') + options--; + g->comparison = countif_notequal; + break; + + case '>': + options++; + if(*options == '=' 
|| *options == ':') { + g->comparison = countif_greaterequal; + } + else { + options--; + g->comparison = countif_greater; + } + break; + + case '<': + options++; + if(*options == '>') { + g->comparison = countif_notequal; + } + else if(*options == '=' || *options == ':') { + g->comparison = countif_lessequal; + } + else { + options--; + g->comparison = countif_less; + } + break; + + default: + case '=': + case ':': + g->comparison = countif_equal; + break; + } + if(*options) options++; + + // skip everything up to the first digit + while(isspace(*options)) options++; + + g->target = str2ndd(options, NULL); + } + else { + g->target = 0.0; + g->comparison = countif_equal; + } +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_countif(RRDR *r) { + struct grouping_countif *g = (struct grouping_countif *)r->internal.grouping_data; + g->matched = 0; + g->count = 0; +} + +void grouping_free_countif(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_countif(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_countif *g = (struct grouping_countif *)r->internal.grouping_data; + g->matched += g->comparison(value, g->target); + g->count++; +} + +NETDATA_DOUBLE grouping_flush_countif(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_countif *g = (struct grouping_countif *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = (NETDATA_DOUBLE)g->matched * 100 / (NETDATA_DOUBLE)g->count; + } + + g->matched = 0; + g->count = 0; + + return value; +} diff --git a/web/api/queries/countif/countif.h b/web/api/queries/countif/countif.h new file mode 100644 index 0000000..dfe8056 --- /dev/null +++ b/web/api/queries/countif/countif.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef 
NETDATA_API_QUERY_COUNTIF_H +#define NETDATA_API_QUERY_COUNTIF_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_countif(RRDR *r, const char *options __maybe_unused); +void grouping_reset_countif(RRDR *r); +void grouping_free_countif(RRDR *r); +void grouping_add_countif(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_countif(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_COUNTIF_H diff --git a/web/api/queries/des/Makefile.am b/web/api/queries/des/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/des/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/des/README.md b/web/api/queries/des/README.md new file mode 100644 index 0000000..33c5f1a --- /dev/null +++ b/web/api/queries/des/README.md @@ -0,0 +1,73 @@ +<!-- +title: "double exponential smoothing" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/des/README.md +--> + +# double exponential smoothing + +Exponential smoothing is one of many window functions commonly applied to smooth data in signal +processing, acting as low-pass filters to remove high frequency noise. + +Simple exponential smoothing does not do well when there is a trend in the data. +In such situations, several methods were devised under the name "double exponential smoothing" +or "second-order exponential smoothing.", which is the recursive application of an exponential +filter twice, thus being termed "double exponential smoothing". + +In simple terms, this is like an average value, but more recent values are given more weight +and the trend of the values influences significantly the result. 
+ +> **IMPORTANT** +> +> It is common for `des` to provide "average" values that are far beyond the minimum or the maximum +> values found in the time-series. +> `des` estimates these values because it takes into account the trend. + +This module implements the "Holt-Winters double exponential smoothing". + +Netdata automatically adjusts the weight (`alpha`) and the trend (`beta`) based on the number +of values processed, using the formula: + +``` +window = min(number of values, 15) +alpha = 2 / (window + 1) +beta = 2 / (window + 1) +``` + +You can change the fixed value `15` by setting in `netdata.conf`: + +``` +[web] + des max window = 15 +``` + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: des -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`des` does not change the units. For example, if the chart units are `requests/sec`, the result +will again be expressed in the same units. + +It can also be used in APIs and badges as `&group=des` in the URL. 
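The smoothing described above can be sketched in a few lines of C. This is a minimal illustration under stated assumptions: `des_smooth` is a hypothetical function, the window is capped at `max_window` (the `des.c` in this commit caps it the same way), and the level and trend are seeded with the common textbook choice (second value, and second minus first value), which is not exactly how `des.c` initializes them.

```c
#include <stddef.h>

/* Hypothetical helper: returns the smoothed level after processing n values.
 * Assumes max_window >= 2. */
double des_smooth(const double *values, size_t n, size_t max_window) {
    if (n == 0) return 0.0;
    if (n == 1) return values[0];

    /* the window is the number of values, capped at max_window */
    size_t window = (n < max_window) ? n : max_window;
    double alpha = 2.0 / ((double)window + 1.0);
    double beta = alpha;

    /* textbook seeding (an assumption of this sketch) */
    double level = values[1];
    double trend = values[1] - values[0];

    for (size_t i = 2; i < n; i++) {
        double last_level = level;
        level = alpha * values[i] + (1.0 - alpha) * (level + trend);
        trend = beta * (level - last_level) + (1.0 - beta) * trend;
    }

    return level;
}
```

For a perfectly linear series the trend term tracks the slope, so the returned level stays on the line. On noisy data the `level + trend` term can push the estimate outside the observed range.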
+ +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average&value_color=yellow) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=ses&after=-60&label=single+exponential+smoothing&value_color=yellow) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=des&after=-60&label=double+exponential+smoothing&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Exponential_smoothing>. 
+ + diff --git a/web/api/queries/des/des.c b/web/api/queries/des/des.c new file mode 100644 index 0000000..a6c4e40 --- /dev/null +++ b/web/api/queries/des/des.c @@ -0,0 +1,137 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include <web/api/queries/rrdr.h> +#include "des.h" + + +// ---------------------------------------------------------------------------- +// single exponential smoothing + +struct grouping_des { + NETDATA_DOUBLE alpha; + NETDATA_DOUBLE alpha_other; + NETDATA_DOUBLE beta; + NETDATA_DOUBLE beta_other; + + NETDATA_DOUBLE level; + NETDATA_DOUBLE trend; + + size_t count; +}; + +static size_t max_window_size = 15; + +void grouping_init_des(void) { + long long ret = config_get_number(CONFIG_SECTION_WEB, "des max window", (long long)max_window_size); + if(ret <= 1) { + config_set_number(CONFIG_SECTION_WEB, "des max window", (long long)max_window_size); + } + else { + max_window_size = (size_t) ret; + } +} + +static inline NETDATA_DOUBLE window(RRDR *r, struct grouping_des *g) { + (void)g; + + NETDATA_DOUBLE points; + if(r->group == 1) { + // provide a running DES + points = (NETDATA_DOUBLE)r->internal.points_wanted; + } + else { + // provide a SES with flush points + points = (NETDATA_DOUBLE)r->group; + } + + // https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average + // A commonly used value for alpha is 2 / (N + 1) + return (points > (NETDATA_DOUBLE)max_window_size) ? 
(NETDATA_DOUBLE)max_window_size : points; +} + +static inline void set_alpha(RRDR *r, struct grouping_des *g) { + // https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average + // A commonly used value for alpha is 2 / (N + 1) + + g->alpha = 2.0 / (window(r, g) + 1.0); + g->alpha_other = 1.0 - g->alpha; + + //info("alpha for chart '%s' is " CALCULATED_NUMBER_FORMAT, r->st->name, g->alpha); +} + +static inline void set_beta(RRDR *r, struct grouping_des *g) { + // https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average + // A commonly used value for alpha is 2 / (N + 1) + + g->beta = 2.0 / (window(r, g) + 1.0); + g->beta_other = 1.0 - g->beta; + + //info("beta for chart '%s' is " CALCULATED_NUMBER_FORMAT, r->st->name, g->beta); +} + +void grouping_create_des(RRDR *r, const char *options __maybe_unused) { + struct grouping_des *g = (struct grouping_des *)onewayalloc_mallocz(r->internal.owa, sizeof(struct grouping_des)); + set_alpha(r, g); + set_beta(r, g); + g->level = 0.0; + g->trend = 0.0; + g->count = 0; + r->internal.grouping_data = g; +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_des(RRDR *r) { + struct grouping_des *g = (struct grouping_des *)r->internal.grouping_data; + g->level = 0.0; + g->trend = 0.0; + g->count = 0; + + // fprintf(stderr, "\nDES: "); + +} + +void grouping_free_des(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_des(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_des *g = (struct grouping_des *)r->internal.grouping_data; + + if(likely(g->count > 0)) { + // we have at least a number so far + + if(unlikely(g->count == 1)) { + // the second value we got + g->trend = value - g->trend; + g->level = value; + } + + // for the values, except the first + NETDATA_DOUBLE last_level = g->level; + g->level = (g->alpha * value) + (g->alpha_other * (g->level + g->trend)); + g->trend = 
(g->beta * (g->level - last_level)) + (g->beta_other * g->trend); + } + else { + // the first value we got + g->level = g->trend = value; + } + + g->count++; + + //fprintf(stderr, "value: " CALCULATED_NUMBER_FORMAT ", level: " CALCULATED_NUMBER_FORMAT ", trend: " CALCULATED_NUMBER_FORMAT "\n", value, g->level, g->trend); +} + +NETDATA_DOUBLE grouping_flush_des(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_des *g = (struct grouping_des *)r->internal.grouping_data; + + if(unlikely(!g->count || !netdata_double_isnumber(g->level))) { + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + return 0.0; + } + + //fprintf(stderr, " RESULT for %zu values = " CALCULATED_NUMBER_FORMAT " \n", g->count, g->level); + + return g->level; +} diff --git a/web/api/queries/des/des.h b/web/api/queries/des/des.h new file mode 100644 index 0000000..05fa01b --- /dev/null +++ b/web/api/queries/des/des.h @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_DES_H +#define NETDATA_API_QUERIES_DES_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_init_des(void); + +void grouping_create_des(RRDR *r, const char *options __maybe_unused); +void grouping_reset_des(RRDR *r); +void grouping_free_des(RRDR *r); +void grouping_add_des(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_des(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_DES_H diff --git a/web/api/queries/incremental_sum/Makefile.am b/web/api/queries/incremental_sum/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/incremental_sum/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/incremental_sum/README.md b/web/api/queries/incremental_sum/README.md new file mode 100644 index 0000000..4430117 --- 
/dev/null +++ b/web/api/queries/incremental_sum/README.md @@ -0,0 +1,41 @@ +<!-- +title: "Incremental Sum (`incremental_sum`)" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/incremental_sum/README.md +--> + +# Incremental Sum (`incremental_sum`) + +This module finds the incremental sum of a period, which is `last value - first value`. + +The result may be positive (rising) or negative (falling) depending on the first and last values. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: incremental_sum -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`incremental_sum` does not change the units. For example, if the chart units are `requests/sec`, the result +will again be expressed in the same units. + +It can also be used in APIs and badges as `&group=incremental_sum` in the URL. + +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=incremental_sum&after=-60&label=incremental+sum&value_color=orange) + +## References + +- none + + diff --git a/web/api/queries/incremental_sum/incremental_sum.c b/web/api/queries/incremental_sum/incremental_sum.c new file mode 100644 index 0000000..afca530 --- /dev/null +++ b/web/api/queries/incremental_sum/incremental_sum.c @@ -0,0 +1,66 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include 
"incremental_sum.h" + +// ---------------------------------------------------------------------------- +// incremental sum + +struct grouping_incremental_sum { + NETDATA_DOUBLE first; + NETDATA_DOUBLE last; + size_t count; +}; + +void grouping_create_incremental_sum(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_incremental_sum)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_incremental_sum(RRDR *r) { + struct grouping_incremental_sum *g = (struct grouping_incremental_sum *)r->internal.grouping_data; + g->first = 0; + g->last = 0; + g->count = 0; +} + +void grouping_free_incremental_sum(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_incremental_sum(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_incremental_sum *g = (struct grouping_incremental_sum *)r->internal.grouping_data; + + if(unlikely(!g->count)) { + g->first = value; + g->count++; + } + else { + g->last = value; + g->count++; + } +} + +NETDATA_DOUBLE grouping_flush_incremental_sum(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_incremental_sum *g = (struct grouping_incremental_sum *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else if(unlikely(g->count == 1)) { + value = 0.0; + } + else { + value = g->last - g->first; + } + + g->first = 0.0; + g->last = 0.0; + g->count = 0; + + return value; +} diff --git a/web/api/queries/incremental_sum/incremental_sum.h b/web/api/queries/incremental_sum/incremental_sum.h new file mode 100644 index 0000000..c24507f --- /dev/null +++ b/web/api/queries/incremental_sum/incremental_sum.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERY_INCREMENTAL_SUM_H +#define 
NETDATA_API_QUERY_INCREMENTAL_SUM_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_incremental_sum(RRDR *r, const char *options __maybe_unused); +void grouping_reset_incremental_sum(RRDR *r); +void grouping_free_incremental_sum(RRDR *r); +void grouping_add_incremental_sum(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_incremental_sum(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_INCREMENTAL_SUM_H diff --git a/web/api/queries/max/Makefile.am b/web/api/queries/max/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/max/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/max/README.md b/web/api/queries/max/README.md new file mode 100644 index 0000000..48da7cf --- /dev/null +++ b/web/api/queries/max/README.md @@ -0,0 +1,38 @@ +<!-- +title: "Max" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/max/README.md +--> + +# Max + +This module finds the max value in the time-frame given. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: max -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`max` does not change the units. For example, if the chart units is `requests/sec`, the result +will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=max` in the URL. 
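
A note on what "max" means here: judging by `grouping_add_max()` in `max.c` (added by this same commit), values are compared with `fabsndd()`, so the value kept appears to be the one furthest from zero rather than the arithmetic maximum. The standalone sketch below illustrates that rule under stated assumptions: it uses plain `double` in place of `NETDATA_DOUBLE`, and `magnitude` and `max_by_magnitude` are hypothetical names that do not exist in netdata.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without pulling in <math.h>. */
static double magnitude(double v) {
    return v < 0.0 ? -v : v;
}

/* Hypothetical helper, not a netdata function: returns the sample with the
 * largest magnitude, which is how grouping_add_max() compares candidates. */
double max_by_magnitude(const double *values, size_t n) {
    assert(values != NULL && n > 0);

    double best = values[0];
    for (size_t i = 1; i < n; i++) {
        if (magnitude(values[i]) > magnitude(best))
            best = values[i];
    }
    return best;
}
```

For the samples `{1, -7, 3}` this sketch keeps `-7`, which is worth remembering when a `max` lookup is used on a dimension that can go negative.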
+ +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max&value_color=orange) + +## References + +- <https://en.wikipedia.org/wiki/Sample_maximum_and_minimum>. + + diff --git a/web/api/queries/max/max.c b/web/api/queries/max/max.c new file mode 100644 index 0000000..73cf9fa --- /dev/null +++ b/web/api/queries/max/max.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "max.h" + +// ---------------------------------------------------------------------------- +// max + +struct grouping_max { + NETDATA_DOUBLE max; + size_t count; +}; + +void grouping_create_max(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_max)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_max(RRDR *r) { + struct grouping_max *g = (struct grouping_max *)r->internal.grouping_data; + g->max = 0; + g->count = 0; +} + +void grouping_free_max(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_max(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_max *g = (struct grouping_max *)r->internal.grouping_data; + + if(!g->count || fabsndd(value) > fabsndd(g->max)) { + g->max = value; + g->count++; + } +} + +NETDATA_DOUBLE grouping_flush_max(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_max *g = (struct grouping_max 
*)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = g->max; + } + + g->max = 0.0; + g->count = 0; + + return value; +} + diff --git a/web/api/queries/max/max.h b/web/api/queries/max/max.h new file mode 100644 index 0000000..e2427d2 --- /dev/null +++ b/web/api/queries/max/max.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERY_MAX_H +#define NETDATA_API_QUERY_MAX_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_max(RRDR *r, const char *options __maybe_unused); +void grouping_reset_max(RRDR *r); +void grouping_free_max(RRDR *r); +void grouping_add_max(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_max(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_MAX_H diff --git a/web/api/queries/median/Makefile.am b/web/api/queries/median/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/median/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/median/README.md b/web/api/queries/median/README.md new file mode 100644 index 0000000..5600284 --- /dev/null +++ b/web/api/queries/median/README.md @@ -0,0 +1,60 @@ +<!-- +title: "Median" +description: "Use median in API queries and health entities to find the 'middle' value from a sample, eliminating any unwanted spikes in the returned metrics." +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/median/README.md +--> + +# Median + +The median is the value separating the higher half from the lower half of a data sample +(a population or a probability distribution). For a data set, it may be thought of as the +"middle" value. 
+ +`median` is not an accurate average. However, it eliminates all spikes, by sorting +all the values in a period, and selecting the value in the middle of the sorted array. + +Netdata also supports `trimmed-median`, which trims a percentage of the smaller and bigger values prior to finding the +median. The following `trimmed-median` functions are defined: + +- `trimmed-median1` +- `trimmed-median2` +- `trimmed-median3` +- `trimmed-median5` +- `trimmed-median10` +- `trimmed-median15` +- `trimmed-median20` +- `trimmed-median25` + +The function `trimmed-median` is an alias for `trimmed-median5`. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: median -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`median` does not change the units. For example, if the chart units is `requests/sec`, the result +will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=median` in the URL. Additionally, a percentage may be given with +`&group_options=` to trim all small and big values before finding the median. + +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=median&after=-60&label=median&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Median>. 
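
To make the trimming described above more concrete, here is a minimal standalone sketch of the steps `grouping_flush_median()` in `median.c` (same commit) seems to follow: sort, drop values that fall in the lowest and highest `percent` of the value range, then take the median of what remains. Note that, as read from that function, the trim is applied to the range between the smallest and largest value, not to a count of samples. Assumptions: plain `double` replaces `NETDATA_DOUBLE`, `qsort()` stands in for `sort_series()`, `median_sorted` is a conventional median standing in for `median_on_sorted_series()` (whose body is not shown here), and `trimmed_median` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparison for qsort(). */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Conventional median of an already sorted slice (assumed behaviour). */
static double median_sorted(const double *s, size_t n) {
    return (n % 2) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
}

/* Hypothetical helper, not a netdata function. `percent` is 0..50.
 * Sorts `values` in place. */
double trimmed_median(double *values, size_t n, double percent) {
    assert(values != NULL && n > 0 && percent >= 0.0 && percent <= 50.0);
    qsort(values, n, sizeof(double), cmp_double);

    size_t start = 0, end = n - 1;
    if (percent > 0.0) {
        /* Trim by value range: shrink [min, max] by delta on both sides. */
        double delta = (values[n - 1] - values[0]) * (percent / 100.0);
        double wanted_min = values[0] + delta;
        double wanted_max = values[n - 1] - delta;

        while (start < n - 1 && values[start] < wanted_min) start++;
        while (end > start && values[end] > wanted_max) end--;
    }

    return median_sorted(&values[start], end - start + 1);
}
```

With `percent` set to `0` the function reduces to a plain median, matching the `median` method; the `trimmed-medianN` aliases would correspond to calling it with `N`.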
+ + diff --git a/web/api/queries/median/median.c b/web/api/queries/median/median.c new file mode 100644 index 0000000..40fd4ec --- /dev/null +++ b/web/api/queries/median/median.c @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "median.h" + +// ---------------------------------------------------------------------------- +// median + +struct grouping_median { + size_t series_size; + size_t next_pos; + NETDATA_DOUBLE percent; + + NETDATA_DOUBLE *series; +}; + +void grouping_create_median_internal(RRDR *r, const char *options, NETDATA_DOUBLE def) { + long entries = r->group; + if(entries < 10) entries = 10; + + struct grouping_median *g = (struct grouping_median *)onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_median)); + g->series = onewayalloc_mallocz(r->internal.owa, entries * sizeof(NETDATA_DOUBLE)); + g->series_size = (size_t)entries; + + g->percent = def; + if(options && *options) { + g->percent = str2ndd(options, NULL); + if(!netdata_double_isnumber(g->percent)) g->percent = 0.0; + if(g->percent < 0.0) g->percent = 0.0; + if(g->percent > 50.0) g->percent = 50.0; + } + + g->percent = g->percent / 100.0; + r->internal.grouping_data = g; +} + +void grouping_create_median(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 0.0); +} +void grouping_create_trimmed_median1(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 1.0); +} +void grouping_create_trimmed_median2(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 2.0); +} +void grouping_create_trimmed_median3(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 3.0); +} +void grouping_create_trimmed_median5(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 5.0); +} +void grouping_create_trimmed_median10(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 10.0); +} +void grouping_create_trimmed_median15(RRDR *r, 
const char *options) { + grouping_create_median_internal(r, options, 15.0); +} +void grouping_create_trimmed_median20(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 20.0); +} +void grouping_create_trimmed_median25(RRDR *r, const char *options) { + grouping_create_median_internal(r, options, 25.0); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_median(RRDR *r) { + struct grouping_median *g = (struct grouping_median *)r->internal.grouping_data; + g->next_pos = 0; +} + +void grouping_free_median(RRDR *r) { + struct grouping_median *g = (struct grouping_median *)r->internal.grouping_data; + if(g) onewayalloc_freez(r->internal.owa, g->series); + + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_median(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_median *g = (struct grouping_median *)r->internal.grouping_data; + + if(unlikely(g->next_pos >= g->series_size)) { + g->series = onewayalloc_doublesize( r->internal.owa, g->series, g->series_size * sizeof(NETDATA_DOUBLE)); + g->series_size *= 2; + } + + g->series[g->next_pos++] = value; +} + +NETDATA_DOUBLE grouping_flush_median(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_median *g = (struct grouping_median *)r->internal.grouping_data; + + size_t available_slots = g->next_pos; + NETDATA_DOUBLE value; + + if(unlikely(!available_slots)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else if(available_slots == 1) { + value = g->series[0]; + } + else { + sort_series(g->series, available_slots); + + size_t start_slot = 0; + size_t end_slot = available_slots - 1; + + if(g->percent > 0.0) { + NETDATA_DOUBLE min = g->series[0]; + NETDATA_DOUBLE max = g->series[available_slots - 1]; + NETDATA_DOUBLE delta = (max - min) * g->percent; + + NETDATA_DOUBLE wanted_min = min + delta; + NETDATA_DOUBLE wanted_max = max - delta; + + for 
(start_slot = 0; start_slot < available_slots; start_slot++) + if (g->series[start_slot] >= wanted_min) break; + + for (end_slot = available_slots - 1; end_slot > start_slot; end_slot--) + if (g->series[end_slot] <= wanted_max) break; + } + + if(start_slot == end_slot) + value = g->series[start_slot]; + else + value = median_on_sorted_series(&g->series[start_slot], end_slot - start_slot + 1); + } + + if(unlikely(!netdata_double_isnumber(value))) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + + //log_series_to_stderr(g->series, g->next_pos, value, "median"); + + g->next_pos = 0; + + return value; +} diff --git a/web/api/queries/median/median.h b/web/api/queries/median/median.h new file mode 100644 index 0000000..9fc159d --- /dev/null +++ b/web/api/queries/median/median.h @@ -0,0 +1,23 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_MEDIAN_H +#define NETDATA_API_QUERIES_MEDIAN_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_median(RRDR *r, const char *options); +void grouping_create_trimmed_median1(RRDR *r, const char *options); +void grouping_create_trimmed_median2(RRDR *r, const char *options); +void grouping_create_trimmed_median3(RRDR *r, const char *options); +void grouping_create_trimmed_median5(RRDR *r, const char *options); +void grouping_create_trimmed_median10(RRDR *r, const char *options); +void grouping_create_trimmed_median15(RRDR *r, const char *options); +void grouping_create_trimmed_median20(RRDR *r, const char *options); +void grouping_create_trimmed_median25(RRDR *r, const char *options); +void grouping_reset_median(RRDR *r); +void grouping_free_median(RRDR *r); +void grouping_add_median(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_median(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_MEDIAN_H diff --git a/web/api/queries/min/Makefile.am b/web/api/queries/min/Makefile.am new file mode 100644 index 0000000..161784b 
--- /dev/null +++ b/web/api/queries/min/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/min/README.md b/web/api/queries/min/README.md new file mode 100644 index 0000000..495523c --- /dev/null +++ b/web/api/queries/min/README.md @@ -0,0 +1,38 @@ +<!-- +title: "Min" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/min/README.md +--> + +# Min + +This module finds the min value in the time-frame given. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: min -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`min` does not change the units. For example, if the chart units is `requests/sec`, the result +will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=min` in the URL. + +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Sample_maximum_and_minimum>. 
+ + diff --git a/web/api/queries/min/min.c b/web/api/queries/min/min.c new file mode 100644 index 0000000..1752e9e --- /dev/null +++ b/web/api/queries/min/min.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "min.h" + +// ---------------------------------------------------------------------------- +// min + +struct grouping_min { + NETDATA_DOUBLE min; + size_t count; +}; + +void grouping_create_min(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_min)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_min(RRDR *r) { + struct grouping_min *g = (struct grouping_min *)r->internal.grouping_data; + g->min = 0; + g->count = 0; +} + +void grouping_free_min(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_min(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_min *g = (struct grouping_min *)r->internal.grouping_data; + + if(!g->count || fabsndd(value) < fabsndd(g->min)) { + g->min = value; + g->count++; + } +} + +NETDATA_DOUBLE grouping_flush_min(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_min *g = (struct grouping_min *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = g->min; + } + + g->min = 0.0; + g->count = 0; + + return value; +} + diff --git a/web/api/queries/min/min.h b/web/api/queries/min/min.h new file mode 100644 index 0000000..dcdfe25 --- /dev/null +++ b/web/api/queries/min/min.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERY_MIN_H +#define NETDATA_API_QUERY_MIN_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_min(RRDR *r, const char *options __maybe_unused); +void grouping_reset_min(RRDR *r); +void 
grouping_free_min(RRDR *r); +void grouping_add_min(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_min(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_MIN_H diff --git a/web/api/queries/percentile/Makefile.am b/web/api/queries/percentile/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/percentile/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/percentile/README.md b/web/api/queries/percentile/README.md new file mode 100644 index 0000000..70afc74 --- /dev/null +++ b/web/api/queries/percentile/README.md @@ -0,0 +1,58 @@ +<!-- +title: "Percentile" +description: "Use percentile in API queries and health entities to find the 'percentile' value from a sample, eliminating any unwanted spikes in the returned metrics." +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/percentile/README.md +--> + +# Percentile + +The percentile is the average value of a series using only the smaller N percentile of the values. +(a population or a probability distribution). + +Netdata applies linear interpolation on the last point, if the percentile requested does not give a round number of +points. + +The following percentile aliases are defined: + +- `percentile25` +- `percentile50` +- `percentile75` +- `percentile80` +- `percentile90` +- `percentile95` +- `percentile97` +- `percentile98` +- `percentile99` + +The default `percentile` is an alias for `percentile95`. +Any percentile may be requested using the `group_options` query parameter. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: percentile95 -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`percentile` does not change the units. 
For example, if the chart units is `requests/sec`, the result +will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=percentile` in the URL and the additional parameter `group_options` +may be used to request any percentile (e.g. `&group=percentile&group_options=96`). + +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=percentile95&after=-60&label=percentile95&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Percentile>. 
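
The averaging and interpolation described above can be sketched as follows. This is a simplified reading of `grouping_flush_percentile()` in `percentile.c` (same commit), not a copy of it: it averages the smallest `percent` of the sorted samples and, when the request does not land on a whole number of samples, blends in a fraction of the next one. Assumptions: plain `double` replaces `NETDATA_DOUBLE`, `qsort()` stands in for `sort_series()`, only non-negative data is handled (the source walks the series from the other end when negative values are present, and that branch is omitted), and `percentile_average` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Ascending comparison for qsort(). */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not a netdata function. `percent` is 0..100.
 * Sorts `values` in place; non-negative samples only. */
double percentile_average(double *values, size_t n, double percent) {
    assert(values != NULL && n > 0 && percent >= 0.0 && percent <= 100.0);
    qsort(values, n, sizeof(double), cmp_double);
    assert(values[0] >= 0.0); /* negative data is outside this sketch */

    if (values[0] == values[n - 1])
        return values[0]; /* all samples equal: nothing to average */

    double exact = (double)n * (percent / 100.0); /* may be fractional */
    size_t whole = (size_t)exact;
    if (whole == 0) whole = 1; /* always use at least one sample */

    double sum = 0.0;
    for (size_t i = 0; i < whole; i++)
        sum += values[i];

    /* Leftover fraction of one sample: blend the next sample with the last
     * fully used one, and count the blend as one extra sample. */
    size_t counted = whole;
    double partial = exact - (double)whole;
    if (partial > 0.0 && whole < n) {
        sum += values[whole] * partial;
        sum += values[whole - 1] * (1.0 - partial);
        counted++;
    }

    return sum / (double)counted;
}
```

For four samples `1, 2, 3, 4`, a request for `50` averages the two smallest values, while a request such as `62.5` covers two and a half samples and takes the interpolation branch.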
diff --git a/web/api/queries/percentile/percentile.c b/web/api/queries/percentile/percentile.c new file mode 100644 index 0000000..88f8600 --- /dev/null +++ b/web/api/queries/percentile/percentile.c @@ -0,0 +1,169 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "percentile.h" + +// ---------------------------------------------------------------------------- +// median + +struct grouping_percentile { + size_t series_size; + size_t next_pos; + NETDATA_DOUBLE percent; + + NETDATA_DOUBLE *series; +}; + +static void grouping_create_percentile_internal(RRDR *r, const char *options, NETDATA_DOUBLE def) { + long entries = r->group; + if(entries < 10) entries = 10; + + struct grouping_percentile *g = (struct grouping_percentile *)onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_percentile)); + g->series = onewayalloc_mallocz(r->internal.owa, entries * sizeof(NETDATA_DOUBLE)); + g->series_size = (size_t)entries; + + g->percent = def; + if(options && *options) { + g->percent = str2ndd(options, NULL); + if(!netdata_double_isnumber(g->percent)) g->percent = 0.0; + if(g->percent < 0.0) g->percent = 0.0; + if(g->percent > 100.0) g->percent = 100.0; + } + + g->percent = g->percent / 100.0; + r->internal.grouping_data = g; +} + +void grouping_create_percentile25(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 25.0); +} +void grouping_create_percentile50(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 50.0); +} +void grouping_create_percentile75(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 75.0); +} +void grouping_create_percentile80(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 80.0); +} +void grouping_create_percentile90(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 90.0); +} +void grouping_create_percentile95(RRDR *r, const char *options) { + 
grouping_create_percentile_internal(r, options, 95.0); +} +void grouping_create_percentile97(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 97.0); +} +void grouping_create_percentile98(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 98.0); +} +void grouping_create_percentile99(RRDR *r, const char *options) { + grouping_create_percentile_internal(r, options, 99.0); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_percentile(RRDR *r) { + struct grouping_percentile *g = (struct grouping_percentile *)r->internal.grouping_data; + g->next_pos = 0; +} + +void grouping_free_percentile(RRDR *r) { + struct grouping_percentile *g = (struct grouping_percentile *)r->internal.grouping_data; + if(g) onewayalloc_freez(r->internal.owa, g->series); + + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_percentile(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_percentile *g = (struct grouping_percentile *)r->internal.grouping_data; + + if(unlikely(g->next_pos >= g->series_size)) { + g->series = onewayalloc_doublesize( r->internal.owa, g->series, g->series_size * sizeof(NETDATA_DOUBLE)); + g->series_size *= 2; + } + + g->series[g->next_pos++] = value; +} + +NETDATA_DOUBLE grouping_flush_percentile(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_percentile *g = (struct grouping_percentile *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + size_t available_slots = g->next_pos; + + if(unlikely(!available_slots)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else if(available_slots == 1) { + value = g->series[0]; + } + else { + sort_series(g->series, available_slots); + + NETDATA_DOUBLE min = g->series[0]; + NETDATA_DOUBLE max = g->series[available_slots - 1]; + + if (min != max) { + size_t slots_to_use = (size_t)((NETDATA_DOUBLE)available_slots 
* g->percent); + if(!slots_to_use) slots_to_use = 1; + + NETDATA_DOUBLE percent_to_use = (NETDATA_DOUBLE)slots_to_use / (NETDATA_DOUBLE)available_slots; + NETDATA_DOUBLE percent_delta = g->percent - percent_to_use; + + NETDATA_DOUBLE percent_interpolation_slot = 0.0; + NETDATA_DOUBLE percent_last_slot = 0.0; + if(percent_delta > 0.0) { + NETDATA_DOUBLE percent_to_use_plus_1_slot = (NETDATA_DOUBLE)(slots_to_use + 1) / (NETDATA_DOUBLE)available_slots; + NETDATA_DOUBLE percent_1slot = percent_to_use_plus_1_slot - percent_to_use; + + percent_interpolation_slot = percent_delta / percent_1slot; + percent_last_slot = 1 - percent_interpolation_slot; + } + + int start_slot, stop_slot, step, last_slot, interpolation_slot; + if(min >= 0.0 && max >= 0.0) { + start_slot = 0; + stop_slot = start_slot + (int)slots_to_use; + last_slot = stop_slot - 1; + interpolation_slot = stop_slot; + step = 1; + } + else { + start_slot = (int)available_slots - 1; + stop_slot = start_slot - (int)slots_to_use; + last_slot = stop_slot + 1; + interpolation_slot = stop_slot; + step = -1; + } + + value = 0.0; + for(int slot = start_slot; slot != stop_slot ; slot += step) + value += g->series[slot]; + + size_t counted = slots_to_use; + if(percent_interpolation_slot > 0.0 && interpolation_slot >= 0 && interpolation_slot < (int)available_slots) { + value += g->series[interpolation_slot] * percent_interpolation_slot; + value += g->series[last_slot] * percent_last_slot; + counted++; + } + + value = value / (NETDATA_DOUBLE)counted; + } + else + value = min; + } + + if(unlikely(!netdata_double_isnumber(value))) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + + //log_series_to_stderr(g->series, g->next_pos, value, "percentile"); + + g->next_pos = 0; + + return value; +} diff --git a/web/api/queries/percentile/percentile.h b/web/api/queries/percentile/percentile.h new file mode 100644 index 0000000..65e335c --- /dev/null +++ b/web/api/queries/percentile/percentile.h @@ -0,0 +1,23 @@ +// 
SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_PERCENTILE_H +#define NETDATA_API_QUERIES_PERCENTILE_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_percentile25(RRDR *r, const char *options); +void grouping_create_percentile50(RRDR *r, const char *options); +void grouping_create_percentile75(RRDR *r, const char *options); +void grouping_create_percentile80(RRDR *r, const char *options); +void grouping_create_percentile90(RRDR *r, const char *options); +void grouping_create_percentile95(RRDR *r, const char *options); +void grouping_create_percentile97(RRDR *r, const char *options); +void grouping_create_percentile98(RRDR *r, const char *options); +void grouping_create_percentile99(RRDR *r, const char *options ); +void grouping_reset_percentile(RRDR *r); +void grouping_free_percentile(RRDR *r); +void grouping_add_percentile(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_percentile(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_PERCENTILE_H diff --git a/web/api/queries/query.c b/web/api/queries/query.c new file mode 100644 index 0000000..0365b6e --- /dev/null +++ b/web/api/queries/query.c @@ -0,0 +1,2175 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "query.h" +#include "web/api/formatters/rrd2json.h" +#include "rrdr.h" + +#include "average/average.h" +#include "countif/countif.h" +#include "incremental_sum/incremental_sum.h" +#include "max/max.h" +#include "median/median.h" +#include "min/min.h" +#include "sum/sum.h" +#include "stddev/stddev.h" +#include "ses/ses.h" +#include "des/des.h" +#include "percentile/percentile.h" +#include "trimmed_mean/trimmed_mean.h" + +// ---------------------------------------------------------------------------- + +static struct { + const char *name; + uint32_t hash; + RRDR_GROUPING value; + + // One time initialization for the module. + // This is called once, when netdata starts. 
+ void (*init)(void); + + // Allocate all required structures for a query. + // This is called once for each netdata query. + void (*create)(struct rrdresult *r, const char *options); + + // Cleanup collected values, but don't destroy the structures. + // This is called when the query engine switches dimensions, + // as part of the same query (so same chart, switching metric). + void (*reset)(struct rrdresult *r); + + // Free all resources allocated for the query. + void (*free)(struct rrdresult *r); + + // Add a single value into the calculation. + // The module may decide to cache it, or use it in the fly. + void (*add)(struct rrdresult *r, NETDATA_DOUBLE value); + + // Generate a single result for the values added so far. + // More values and points may be requested later. + // It is up to the module to reset its internal structures + // when flushing it (so for a few modules it may be better to + // continue after a flush as if nothing changed, for others a + // cleanup of the internal structures may be required). 
+ NETDATA_DOUBLE (*flush)(struct rrdresult *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + + TIER_QUERY_FETCH tier_query_fetch; +} api_v1_data_groups[] = { + {.name = "average", + .hash = 0, + .value = RRDR_GROUPING_AVERAGE, + .init = NULL, + .create= grouping_create_average, + .reset = grouping_reset_average, + .free = grouping_free_average, + .add = grouping_add_average, + .flush = grouping_flush_average, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "mean", // alias on 'average' + .hash = 0, + .value = RRDR_GROUPING_AVERAGE, + .init = NULL, + .create= grouping_create_average, + .reset = grouping_reset_average, + .free = grouping_free_average, + .add = grouping_add_average, + .flush = grouping_flush_average, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean1", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN1, + .init = NULL, + .create= grouping_create_trimmed_mean1, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean2", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN2, + .init = NULL, + .create= grouping_create_trimmed_mean2, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean3", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN3, + .init = NULL, + .create= grouping_create_trimmed_mean3, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean5", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN5, + .init = NULL, + .create= grouping_create_trimmed_mean5, + .reset = 
grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean10", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN10, + .init = NULL, + .create= grouping_create_trimmed_mean10, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean15", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN15, + .init = NULL, + .create= grouping_create_trimmed_mean15, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean20", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN20, + .init = NULL, + .create= grouping_create_trimmed_mean20, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean25", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN25, + .init = NULL, + .create= grouping_create_trimmed_mean25, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-mean", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEAN5, + .init = NULL, + .create= grouping_create_trimmed_mean5, + .reset = grouping_reset_trimmed_mean, + .free = grouping_free_trimmed_mean, + .add = grouping_add_trimmed_mean, + .flush = grouping_flush_trimmed_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "incremental_sum", + .hash = 0, + .value = 
RRDR_GROUPING_INCREMENTAL_SUM, + .init = NULL, + .create= grouping_create_incremental_sum, + .reset = grouping_reset_incremental_sum, + .free = grouping_free_incremental_sum, + .add = grouping_add_incremental_sum, + .flush = grouping_flush_incremental_sum, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "incremental-sum", + .hash = 0, + .value = RRDR_GROUPING_INCREMENTAL_SUM, + .init = NULL, + .create= grouping_create_incremental_sum, + .reset = grouping_reset_incremental_sum, + .free = grouping_free_incremental_sum, + .add = grouping_add_incremental_sum, + .flush = grouping_flush_incremental_sum, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "median", + .hash = 0, + .value = RRDR_GROUPING_MEDIAN, + .init = NULL, + .create= grouping_create_median, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median1", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN1, + .init = NULL, + .create= grouping_create_trimmed_median1, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median2", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN2, + .init = NULL, + .create= grouping_create_trimmed_median2, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median3", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN3, + .init = NULL, + .create= grouping_create_trimmed_median3, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median5", + .hash = 0, + 
.value = RRDR_GROUPING_TRIMMED_MEDIAN5, + .init = NULL, + .create= grouping_create_trimmed_median5, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median10", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN10, + .init = NULL, + .create= grouping_create_trimmed_median10, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median15", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN15, + .init = NULL, + .create= grouping_create_trimmed_median15, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median20", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN20, + .init = NULL, + .create= grouping_create_trimmed_median20, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median25", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN25, + .init = NULL, + .create= grouping_create_trimmed_median25, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "trimmed-median", + .hash = 0, + .value = RRDR_GROUPING_TRIMMED_MEDIAN5, + .init = NULL, + .create= grouping_create_trimmed_median5, + .reset = grouping_reset_median, + .free = grouping_free_median, + .add = grouping_add_median, + .flush = grouping_flush_median, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile25", + .hash = 0, + .value = 
RRDR_GROUPING_PERCENTILE25, + .init = NULL, + .create= grouping_create_percentile25, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile50", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE50, + .init = NULL, + .create= grouping_create_percentile50, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile75", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE75, + .init = NULL, + .create= grouping_create_percentile75, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile80", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE80, + .init = NULL, + .create= grouping_create_percentile80, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile90", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE90, + .init = NULL, + .create= grouping_create_percentile90, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile95", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE95, + .init = NULL, + .create= grouping_create_percentile95, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile97", + .hash = 0, + 
.value = RRDR_GROUPING_PERCENTILE97, + .init = NULL, + .create= grouping_create_percentile97, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile98", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE98, + .init = NULL, + .create= grouping_create_percentile98, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile99", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE99, + .init = NULL, + .create= grouping_create_percentile99, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "percentile", + .hash = 0, + .value = RRDR_GROUPING_PERCENTILE95, + .init = NULL, + .create= grouping_create_percentile95, + .reset = grouping_reset_percentile, + .free = grouping_free_percentile, + .add = grouping_add_percentile, + .flush = grouping_flush_percentile, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "min", + .hash = 0, + .value = RRDR_GROUPING_MIN, + .init = NULL, + .create= grouping_create_min, + .reset = grouping_reset_min, + .free = grouping_free_min, + .add = grouping_add_min, + .flush = grouping_flush_min, + .tier_query_fetch = TIER_QUERY_FETCH_MIN + }, + {.name = "max", + .hash = 0, + .value = RRDR_GROUPING_MAX, + .init = NULL, + .create= grouping_create_max, + .reset = grouping_reset_max, + .free = grouping_free_max, + .add = grouping_add_max, + .flush = grouping_flush_max, + .tier_query_fetch = TIER_QUERY_FETCH_MAX + }, + {.name = "sum", + .hash = 0, + .value = RRDR_GROUPING_SUM, + .init = NULL, + .create= grouping_create_sum, + .reset = grouping_reset_sum, + .free = 
grouping_free_sum, + .add = grouping_add_sum, + .flush = grouping_flush_sum, + .tier_query_fetch = TIER_QUERY_FETCH_SUM + }, + + // standard deviation + {.name = "stddev", + .hash = 0, + .value = RRDR_GROUPING_STDDEV, + .init = NULL, + .create= grouping_create_stddev, + .reset = grouping_reset_stddev, + .free = grouping_free_stddev, + .add = grouping_add_stddev, + .flush = grouping_flush_stddev, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "cv", // coefficient variation is calculated by stddev + .hash = 0, + .value = RRDR_GROUPING_CV, + .init = NULL, + .create= grouping_create_stddev, // not an error, stddev calculates this too + .reset = grouping_reset_stddev, // not an error, stddev calculates this too + .free = grouping_free_stddev, // not an error, stddev calculates this too + .add = grouping_add_stddev, // not an error, stddev calculates this too + .flush = grouping_flush_coefficient_of_variation, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "rsd", // alias of 'cv' + .hash = 0, + .value = RRDR_GROUPING_CV, + .init = NULL, + .create= grouping_create_stddev, // not an error, stddev calculates this too + .reset = grouping_reset_stddev, // not an error, stddev calculates this too + .free = grouping_free_stddev, // not an error, stddev calculates this too + .add = grouping_add_stddev, // not an error, stddev calculates this too + .flush = grouping_flush_coefficient_of_variation, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + + /* + {.name = "mean", // same as average, no need to define it again + .hash = 0, + .value = RRDR_GROUPING_MEAN, + .setup = NULL, + .create= grouping_create_stddev, + .reset = grouping_reset_stddev, + .free = grouping_free_stddev, + .add = grouping_add_stddev, + .flush = grouping_flush_mean, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + */ + + /* + {.name = "variance", // meaningless to offer + .hash = 0, + .value = RRDR_GROUPING_VARIANCE, + .setup = NULL, + .create= grouping_create_stddev, + 
.reset = grouping_reset_stddev, + .free = grouping_free_stddev, + .add = grouping_add_stddev, + .flush = grouping_flush_variance, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + */ + + // single exponential smoothing + {.name = "ses", + .hash = 0, + .value = RRDR_GROUPING_SES, + .init = grouping_init_ses, + .create= grouping_create_ses, + .reset = grouping_reset_ses, + .free = grouping_free_ses, + .add = grouping_add_ses, + .flush = grouping_flush_ses, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "ema", // alias for 'ses' + .hash = 0, + .value = RRDR_GROUPING_SES, + .init = NULL, + .create= grouping_create_ses, + .reset = grouping_reset_ses, + .free = grouping_free_ses, + .add = grouping_add_ses, + .flush = grouping_flush_ses, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + {.name = "ewma", // alias for ses + .hash = 0, + .value = RRDR_GROUPING_SES, + .init = NULL, + .create= grouping_create_ses, + .reset = grouping_reset_ses, + .free = grouping_free_ses, + .add = grouping_add_ses, + .flush = grouping_flush_ses, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + + // double exponential smoothing + {.name = "des", + .hash = 0, + .value = RRDR_GROUPING_DES, + .init = grouping_init_des, + .create= grouping_create_des, + .reset = grouping_reset_des, + .free = grouping_free_des, + .add = grouping_add_des, + .flush = grouping_flush_des, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + + {.name = "countif", + .hash = 0, + .value = RRDR_GROUPING_COUNTIF, + .init = NULL, + .create= grouping_create_countif, + .reset = grouping_reset_countif, + .free = grouping_free_countif, + .add = grouping_add_countif, + .flush = grouping_flush_countif, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + }, + + // terminator + {.name = NULL, + .hash = 0, + .value = RRDR_GROUPING_UNDEFINED, + .init = NULL, + .create= grouping_create_average, + .reset = grouping_reset_average, + .free = grouping_free_average, + .add = grouping_add_average, + .flush = 
grouping_flush_average, + .tier_query_fetch = TIER_QUERY_FETCH_AVERAGE + } +}; + +void web_client_api_v1_init_grouping(void) { + int i; + + for(i = 0; api_v1_data_groups[i].name ; i++) { + api_v1_data_groups[i].hash = simple_hash(api_v1_data_groups[i].name); + + if(api_v1_data_groups[i].init) + api_v1_data_groups[i].init(); + } +} + +const char *group_method2string(RRDR_GROUPING group) { + int i; + + for(i = 0; api_v1_data_groups[i].name ; i++) { + if(api_v1_data_groups[i].value == group) { + return api_v1_data_groups[i].name; + } + } + + return "unknown-group-method"; +} + +RRDR_GROUPING web_client_api_request_v1_data_group(const char *name, RRDR_GROUPING def) { + int i; + + uint32_t hash = simple_hash(name); + for(i = 0; api_v1_data_groups[i].name ; i++) + if(unlikely(hash == api_v1_data_groups[i].hash && !strcmp(name, api_v1_data_groups[i].name))) + return api_v1_data_groups[i].value; + + return def; +} + +const char *web_client_api_request_v1_data_group_to_string(RRDR_GROUPING group) { + int i; + + for(i = 0; api_v1_data_groups[i].name ; i++) + if(unlikely(group == api_v1_data_groups[i].value)) + return api_v1_data_groups[i].name; + + return "unknown"; +} + +static void rrdr_set_grouping_function(RRDR *r, RRDR_GROUPING group_method) { + int i, found = 0; + for(i = 0; !found && api_v1_data_groups[i].name ;i++) { + if(api_v1_data_groups[i].value == group_method) { + r->internal.grouping_create = api_v1_data_groups[i].create; + r->internal.grouping_reset = api_v1_data_groups[i].reset; + r->internal.grouping_free = api_v1_data_groups[i].free; + r->internal.grouping_add = api_v1_data_groups[i].add; + r->internal.grouping_flush = api_v1_data_groups[i].flush; + r->internal.tier_query_fetch = api_v1_data_groups[i].tier_query_fetch; + found = 1; + } + } + if(!found) { + errno = 0; + internal_error(true, "QUERY: grouping method %u not found. 
Using 'average'", (unsigned int)group_method); + r->internal.grouping_create = grouping_create_average; + r->internal.grouping_reset = grouping_reset_average; + r->internal.grouping_free = grouping_free_average; + r->internal.grouping_add = grouping_add_average; + r->internal.grouping_flush = grouping_flush_average; + r->internal.tier_query_fetch = TIER_QUERY_FETCH_AVERAGE; + } +} + +// ---------------------------------------------------------------------------- +// helpers to find our way in RRDR + +static inline RRDR_VALUE_FLAGS *UNUSED_FUNCTION(rrdr_line_options)(RRDR *r, long rrdr_line) { + return &r->o[ rrdr_line * r->d ]; +} + +static inline NETDATA_DOUBLE *UNUSED_FUNCTION(rrdr_line_values)(RRDR *r, long rrdr_line) { + return &r->v[ rrdr_line * r->d ]; +} + +static inline long rrdr_line_init(RRDR *r, time_t t, long rrdr_line) { + rrdr_line++; + + internal_error(rrdr_line >= (long)r->n, + "QUERY: requested to step above RRDR size for query '%s'", + r->internal.qt->id); + + internal_error(r->t[rrdr_line] != 0 && r->t[rrdr_line] != t, + "QUERY: overwriting the timestamp of RRDR line %zu from %zu to %zu, of query '%s'", + (size_t)rrdr_line, (size_t)r->t[rrdr_line], (size_t)t, r->internal.qt->id); + + // save the time + r->t[rrdr_line] = t; + + return rrdr_line; +} + +static inline void rrdr_done(RRDR *r, long rrdr_line) { + r->rows = rrdr_line + 1; +} + + +// ---------------------------------------------------------------------------- +// tier management + +static bool query_metric_is_valid_tier(QUERY_METRIC *qm, size_t tier) { + if(!qm->tiers[tier].db_metric_handle || !qm->tiers[tier].db_first_time_t || !qm->tiers[tier].db_last_time_t || !qm->tiers[tier].db_update_every) + return false; + + return true; +} + +static size_t query_metric_first_working_tier(QUERY_METRIC *qm) { + for(size_t tier = 0; tier < storage_tiers ; tier++) { + + // find the db time-range for this tier for all metrics + STORAGE_METRIC_HANDLE *db_metric_handle = 
qm->tiers[tier].db_metric_handle; + time_t first_t = qm->tiers[tier].db_first_time_t; + time_t last_t = qm->tiers[tier].db_last_time_t; + time_t update_every = qm->tiers[tier].db_update_every; + + if(!db_metric_handle || !first_t || !last_t || !update_every) + continue; + + return tier; + } + + return 0; +} + +static long query_plan_points_coverage_weight(time_t db_first_t, time_t db_last_t, time_t db_update_every, time_t after_wanted, time_t before_wanted, size_t points_wanted, size_t tier __maybe_unused) { + if(db_first_t == 0 || db_last_t == 0 || db_update_every == 0) + return -LONG_MAX; + + time_t common_first_t = MAX(db_first_t, after_wanted); + time_t common_last_t = MIN(db_last_t, before_wanted); + + long time_coverage = (common_last_t - common_first_t) * 1000000 / (before_wanted - after_wanted); + size_t points_wanted_in_coverage = points_wanted * time_coverage / 1000000; + + long points_available = (common_last_t - common_first_t) / db_update_every; + long points_delta = (long)(points_available - points_wanted_in_coverage); + long points_coverage = (points_delta < 0) ? 
(long)(points_available * time_coverage / points_wanted_in_coverage) : time_coverage; + + // a way to benefit higher tiers + // points_coverage += (long)tier * 10000; + + if(points_available <= 0) + return -LONG_MAX; + + return points_coverage; +} + +static size_t query_metric_best_tier_for_timeframe(QUERY_METRIC *qm, time_t after_wanted, time_t before_wanted, size_t points_wanted) { + if(unlikely(storage_tiers < 2)) + return 0; + + if(unlikely(after_wanted == before_wanted || points_wanted <= 0)) + return query_metric_first_working_tier(qm); + + long weight[storage_tiers]; + + for(size_t tier = 0; tier < storage_tiers ; tier++) { + + // find the db time-range for this tier for all metrics + STORAGE_METRIC_HANDLE *db_metric_handle = qm->tiers[tier].db_metric_handle; + time_t first_t = qm->tiers[tier].db_first_time_t; + time_t last_t = qm->tiers[tier].db_last_time_t; + time_t update_every = qm->tiers[tier].db_update_every; + + if(!db_metric_handle || !first_t || !last_t || !update_every) { + weight[tier] = -LONG_MAX; + continue; + } + + weight[tier] = query_plan_points_coverage_weight(first_t, last_t, update_every, after_wanted, before_wanted, points_wanted, tier); + } + + size_t best_tier = 0; + for(size_t tier = 1; tier < storage_tiers ; tier++) { + if(weight[tier] >= weight[best_tier]) + best_tier = tier; + } + + return best_tier; +} + +static size_t rrddim_find_best_tier_for_timeframe(QUERY_TARGET *qt, time_t after_wanted, time_t before_wanted, size_t points_wanted) { + if(unlikely(storage_tiers < 2)) + return 0; + + if(unlikely(after_wanted == before_wanted || points_wanted <= 0)) { + internal_error(true, "QUERY: '%s' has invalid params to tier calculation", qt->id); + return 0; + } + + long weight[storage_tiers]; + + for(size_t tier = 0; tier < storage_tiers ; tier++) { + + time_t common_first_t = 0; + time_t common_last_t = 0; + time_t common_update_every = 0; + + // find the db time-range for this tier for all metrics + for(size_t i = 0, used = 
qt->query.used; i < used ; i++) { + QUERY_METRIC *qm = &qt->query.array[i]; + + time_t first_t = qm->tiers[tier].db_first_time_t; + time_t last_t = qm->tiers[tier].db_last_time_t; + time_t update_every = qm->tiers[tier].db_update_every; + + if(!first_t || !last_t || !update_every) + continue; + + if(!common_first_t) + common_first_t = first_t; + else + common_first_t = MIN(first_t, common_first_t); + + if(!common_last_t) + common_last_t = last_t; + else + common_last_t = MAX(last_t, common_last_t); + + if(!common_update_every) + common_update_every = update_every; + else + common_update_every = MIN(update_every, common_update_every); + } + + weight[tier] = query_plan_points_coverage_weight(common_first_t, common_last_t, common_update_every, after_wanted, before_wanted, points_wanted, tier); + } + + size_t best_tier = 0; + for(size_t tier = 1; tier < storage_tiers ; tier++) { + if(weight[tier] >= weight[best_tier]) + best_tier = tier; + } + + if(weight[best_tier] == -LONG_MAX) + best_tier = 0; + + return best_tier; +} + +static time_t rrdset_find_natural_update_every_for_timeframe(QUERY_TARGET *qt, time_t after_wanted, time_t before_wanted, size_t points_wanted, RRDR_OPTIONS options, size_t tier) { + size_t best_tier; + if((options & RRDR_OPTION_SELECTED_TIER) && tier < storage_tiers) + best_tier = tier; + else + best_tier = rrddim_find_best_tier_for_timeframe(qt, after_wanted, before_wanted, points_wanted); + + // find the db minimum update every for this tier for all metrics + time_t common_update_every = default_rrd_update_every; + for(size_t i = 0, used = qt->query.used; i < used ; i++) { + QUERY_METRIC *qm = &qt->query.array[i]; + + time_t update_every = qm->tiers[best_tier].db_update_every; + + if(!i) + common_update_every = update_every; + else + common_update_every = MIN(update_every, common_update_every); + } + + return common_update_every; +} + +// ---------------------------------------------------------------------------- +// query ops + +typedef struct 
query_point { + time_t end_time; + time_t start_time; + NETDATA_DOUBLE value; + NETDATA_DOUBLE anomaly; + SN_FLAGS flags; +#ifdef NETDATA_INTERNAL_CHECKS + size_t id; +#endif +} QUERY_POINT; + +QUERY_POINT QUERY_POINT_EMPTY = { + .end_time = 0, + .start_time = 0, + .value = NAN, + .anomaly = 0, + .flags = SN_FLAG_NONE, +#ifdef NETDATA_INTERNAL_CHECKS + .id = 0, +#endif +}; + +#ifdef NETDATA_INTERNAL_CHECKS +#define query_point_set_id(point, point_id) (point).id = point_id +#else +#define query_point_set_id(point, point_id) debug_dummy() +#endif + +typedef struct query_plan_entry { + size_t tier; + time_t after; + time_t before; +} QUERY_PLAN_ENTRY; + +typedef struct query_plan { + size_t entries; + QUERY_PLAN_ENTRY data[RRD_STORAGE_TIERS*2]; +} QUERY_PLAN; + +typedef struct query_engine_ops { + // configuration + RRDR *r; + QUERY_METRIC *qm; + time_t view_update_every; + time_t query_granularity; + TIER_QUERY_FETCH tier_query_fetch; + + // query planer + QUERY_PLAN plan; + size_t current_plan; + time_t current_plan_expire_time; + + // storage queries + size_t tier; + struct query_metric_tier *tier_ptr; + struct storage_engine_query_handle handle; + STORAGE_POINT (*next_metric)(struct storage_engine_query_handle *handle); + int (*is_finished)(struct storage_engine_query_handle *handle); + void (*finalize)(struct storage_engine_query_handle *handle); + + // aggregating points over time + void (*grouping_add)(struct rrdresult *r, NETDATA_DOUBLE value); + NETDATA_DOUBLE (*grouping_flush)(struct rrdresult *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + size_t group_points_non_zero; + size_t group_points_added; + NETDATA_DOUBLE group_anomaly_rate; + RRDR_VALUE_FLAGS group_value_flags; + + // statistics + size_t db_total_points_read; + size_t db_points_read_per_tier[RRD_STORAGE_TIERS]; +} QUERY_ENGINE_OPS; + + +// ---------------------------------------------------------------------------- +// query planer + +#define query_plan_should_switch_plan(ops, now) ((now) >= 
(ops).current_plan_expire_time) + +static void query_planer_activate_plan(QUERY_ENGINE_OPS *ops, size_t plan_id, time_t overwrite_after) { + if(unlikely(plan_id >= ops->plan.entries)) + plan_id = ops->plan.entries - 1; + + time_t after = ops->plan.data[plan_id].after; + time_t before = ops->plan.data[plan_id].before; + + if(overwrite_after > after && overwrite_after < before) + after = overwrite_after; + + ops->tier = ops->plan.data[plan_id].tier; + ops->tier_ptr = &ops->qm->tiers[ops->tier]; + ops->tier_ptr->eng->api.query_ops.init(ops->tier_ptr->db_metric_handle, &ops->handle, after, before); + ops->next_metric = ops->tier_ptr->eng->api.query_ops.next_metric; + ops->is_finished = ops->tier_ptr->eng->api.query_ops.is_finished; + ops->finalize = ops->tier_ptr->eng->api.query_ops.finalize; + ops->current_plan = plan_id; + ops->current_plan_expire_time = ops->plan.data[plan_id].before; +} + +static void query_planer_next_plan(QUERY_ENGINE_OPS *ops, time_t now, time_t last_point_end_time) { + internal_error(now < ops->current_plan_expire_time && now < ops->plan.data[ops->current_plan].before, + "QUERY: switching query plan too early!"); + + size_t old_plan = ops->current_plan; + + time_t next_plan_before_time; + do { + ops->current_plan++; + + if (ops->current_plan >= ops->plan.entries) { + ops->current_plan = old_plan; + ops->current_plan_expire_time = ops->r->internal.qt->window.before; + // let the query run with current plan + // we will not switch it + return; + } + + next_plan_before_time = ops->plan.data[ops->current_plan].before; + } while(now >= next_plan_before_time || last_point_end_time >= next_plan_before_time); + + if(!query_metric_is_valid_tier(ops->qm, ops->plan.data[ops->current_plan].tier)) { + ops->current_plan = old_plan; + ops->current_plan_expire_time = ops->r->internal.qt->window.before; + return; + } + + if(ops->finalize) { + ops->finalize(&ops->handle); + ops->finalize = NULL; + ops->is_finished = NULL; + } + + // internal_error(true, "QUERY: 
switched plan to %zu (all is %zu), previous expiration was %ld, this starts at %ld, now is %ld, last_point_end_time %ld", ops->current_plan, ops->plan.entries, ops->plan.data[ops->current_plan-1].before, ops->plan.data[ops->current_plan].after, now, last_point_end_time); + + query_planer_activate_plan(ops, ops->current_plan, MIN(now, last_point_end_time)); +} + +static int compare_query_plan_entries_on_start_time(const void *a, const void *b) { + QUERY_PLAN_ENTRY *p1 = (QUERY_PLAN_ENTRY *)a; + QUERY_PLAN_ENTRY *p2 = (QUERY_PLAN_ENTRY *)b; + return (p1->after < p2->after)?-1:1; +} + +static bool query_plan(QUERY_ENGINE_OPS *ops, time_t after_wanted, time_t before_wanted, size_t points_wanted) { + //BUFFER *wb = buffer_create(1000); + //buffer_sprintf(wb, "QUERY PLAN for chart '%s' dimension '%s', from %ld to %ld:", rd->rrdset->name, rd->name, after_wanted, before_wanted); + + // put our selected tier as the first plan + size_t selected_tier; + + if(ops->r->internal.query_options & RRDR_OPTION_SELECTED_TIER + && ops->r->internal.qt->window.tier < storage_tiers + && query_metric_is_valid_tier(ops->qm, ops->r->internal.qt->window.tier)) { + selected_tier = ops->r->internal.qt->window.tier; + } + else { + selected_tier = query_metric_best_tier_for_timeframe(ops->qm, after_wanted, before_wanted, points_wanted); + + if(ops->r->internal.query_options & RRDR_OPTION_SELECTED_TIER) + ops->r->internal.query_options &= ~RRDR_OPTION_SELECTED_TIER; + } + + ops->plan.entries = 1; + ops->plan.data[0].tier = selected_tier; + ops->plan.data[0].after = ops->qm->tiers[selected_tier].db_first_time_t; + ops->plan.data[0].before = ops->qm->tiers[selected_tier].db_last_time_t; + + if(!(ops->r->internal.query_options & RRDR_OPTION_SELECTED_TIER)) { + // the selected tier + time_t selected_tier_first_time_t = ops->plan.data[0].after; + time_t selected_tier_last_time_t = ops->plan.data[0].before; + + //buffer_sprintf(wb, ": SELECTED tier %zu, from %ld to %ld", selected_tier, 
ops->plan.data[0].after, ops->plan.data[0].before); + + // check if our selected tier can start the query + if (selected_tier_first_time_t > after_wanted) { + // we need some help from other tiers + for (size_t tr = (int)selected_tier + 1; tr < storage_tiers; tr++) { + if(!query_metric_is_valid_tier(ops->qm, tr)) + continue; + + // find the first time of this tier + time_t first_time_t = ops->qm->tiers[tr].db_first_time_t; + + //buffer_sprintf(wb, ": EVAL AFTER tier %d, %ld", tier, first_time_t); + + // can it help? + if (first_time_t < selected_tier_first_time_t) { + // it can help us add detail at the beginning of the query + QUERY_PLAN_ENTRY t = { + .tier = tr, + .after = (first_time_t < after_wanted) ? after_wanted : first_time_t, + .before = selected_tier_first_time_t}; + ops->plan.data[ops->plan.entries++] = t; + + // prepare for the tier + selected_tier_first_time_t = t.after; + + if (t.after <= after_wanted) + break; + } + } + } + + // check if our selected tier can finish the query + if (selected_tier_last_time_t < before_wanted) { + // we need some help from other tiers + for (int tr = (int)selected_tier - 1; tr >= 0; tr--) { + if(!query_metric_is_valid_tier(ops->qm, tr)) + continue; + + // find the last time of this tier + time_t last_time_t = ops->qm->tiers[tr].db_last_time_t; + + //buffer_sprintf(wb, ": EVAL BEFORE tier %d, %ld", tier, last_time_t); + + // can it help? + if (last_time_t > selected_tier_last_time_t) { + // it can help us add detail at the end of the query + QUERY_PLAN_ENTRY t = { + .tier = tr, + .after = selected_tier_last_time_t, + .before = (last_time_t > before_wanted) ? 
before_wanted : last_time_t}; + ops->plan.data[ops->plan.entries++] = t; + + // prepare for the tier + selected_tier_last_time_t = t.before; + + if (t.before >= before_wanted) + break; + } + } + } + } + + // sort the query plan + if(ops->plan.entries > 1) + qsort(&ops->plan.data, ops->plan.entries, sizeof(QUERY_PLAN_ENTRY), compare_query_plan_entries_on_start_time); + + // make sure it has the whole timeframe we need + if(ops->plan.data[0].after < after_wanted) + ops->plan.data[0].after = after_wanted; + + if(ops->plan.data[ops->plan.entries - 1].before > before_wanted) + ops->plan.data[ops->plan.entries - 1].before = before_wanted; + + //buffer_sprintf(wb, ": FINAL STEPS %zu", ops->plan.entries); + + //for(size_t i = 0; i < ops->plan.entries ;i++) + // buffer_sprintf(wb, ": STEP %zu = use tier %zu from %ld to %ld", i+1, ops->plan.data[i].tier, ops->plan.data[i].after, ops->plan.data[i].before); + + //internal_error(true, "%s", buffer_tostring(wb)); + + if(!query_metric_is_valid_tier(ops->qm, ops->plan.data[0].tier)) + return false; + + query_planer_activate_plan(ops, 0, 0); + + return true; +} + + +// ---------------------------------------------------------------------------- +// dimension level query engine + +#define query_interpolate_point(this_point, last_point, now) do { \ + if(likely( \ + /* the point to interpolate is more than 1s wide */ \ + (this_point).end_time - (this_point).start_time > 1 \ + \ + /* the two points are exactly next to each other */ \ + && (last_point).end_time == (this_point).start_time \ + \ + /* both points are valid numbers */ \ + && netdata_double_isnumber((this_point).value) \ + && netdata_double_isnumber((last_point).value) \ + \ + )) { \ + (this_point).value = (last_point).value + ((this_point).value - (last_point).value) * (1.0 - (NETDATA_DOUBLE)((this_point).end_time - (now)) / (NETDATA_DOUBLE)((this_point).end_time - (this_point).start_time)); \ + (this_point).end_time = now; \ + } \ +} while(0) + +#define 
query_add_point_to_group(r, point, ops) do { \ + if(likely(netdata_double_isnumber((point).value))) { \ + if(likely(fpclassify((point).value) != FP_ZERO)) \ + (ops).group_points_non_zero++; \ + \ + if(unlikely((point).flags & SN_FLAG_RESET)) \ + (ops).group_value_flags |= RRDR_VALUE_RESET; \ + \ + (ops).grouping_add(r, (point).value); \ + } \ + \ + (ops).group_points_added++; \ + (ops).group_anomaly_rate += (point).anomaly; \ +} while(0) + +static inline void rrd2rrdr_do_dimension(RRDR *r, size_t dim_id_in_rrdr) { + QUERY_TARGET *qt = r->internal.qt; + QUERY_METRIC *qm = &qt->query.array[dim_id_in_rrdr]; + size_t points_wanted = qt->window.points; + time_t after_wanted = qt->window.after; + time_t before_wanted = qt->window.before; + +// bool debug_this = false; +// if(strcmp("user", string2str(rd->id)) == 0 && strcmp("system.cpu", string2str(rd->rrdset->id)) == 0) +// debug_this = true; + + time_t max_date = 0, + min_date = 0; + + size_t points_added = 0; + + QUERY_ENGINE_OPS ops = { + .r = r, + .qm = qm, + .grouping_add = r->internal.grouping_add, + .grouping_flush = r->internal.grouping_flush, + .tier_query_fetch = r->internal.tier_query_fetch, + .view_update_every = r->update_every, + .query_granularity = (time_t)(r->update_every / r->group), + .group_value_flags = RRDR_VALUE_NOTHING + }; + + long rrdr_line = -1; + bool use_anomaly_bit_as_value = (r->internal.query_options & RRDR_OPTION_ANOMALY_BIT) ? 
true : false; + + if(!query_plan(&ops, after_wanted, before_wanted, points_wanted)) + return; + + NETDATA_DOUBLE min = r->min, max = r->max; + + QUERY_POINT last2_point = QUERY_POINT_EMPTY; + QUERY_POINT last1_point = QUERY_POINT_EMPTY; + QUERY_POINT new_point = QUERY_POINT_EMPTY; + + time_t now_start_time = after_wanted - ops.query_granularity; + time_t now_end_time = after_wanted + ops.view_update_every - ops.query_granularity; + + size_t db_points_read_since_plan_switch = 0; (void)db_points_read_since_plan_switch; + + // The main loop, based on the query granularity we need + for( ; points_added < points_wanted ; now_start_time = now_end_time, now_end_time += ops.view_update_every) { + + if(unlikely(query_plan_should_switch_plan(ops, now_end_time))) { + query_planer_next_plan(&ops, now_end_time, new_point.end_time); + db_points_read_since_plan_switch = 0; + } + + // read all the points of the db, prior to the time we need (now_end_time) + + size_t count_same_end_time = 0; + while(count_same_end_time < 100) { + if(likely(count_same_end_time == 0)) { + last2_point = last1_point; + last1_point = new_point; + } + + if(unlikely(ops.is_finished(&ops.handle))) { + if(count_same_end_time != 0) { + last2_point = last1_point; + last1_point = new_point; + } + new_point = QUERY_POINT_EMPTY; + new_point.start_time = last1_point.end_time; + new_point.end_time = now_end_time; +// +// if(debug_this) info("QUERY: is finished() returned true"); +// + break; + } + + // fetch the new point + { + db_points_read_since_plan_switch++; + STORAGE_POINT sp = ops.next_metric(&ops.handle); + + ops.db_points_read_per_tier[ops.tier]++; + ops.db_total_points_read++; + + new_point.start_time = sp.start_time; + new_point.end_time = sp.end_time; + new_point.anomaly = sp.count ? 
(NETDATA_DOUBLE)sp.anomaly_count * 100.0 / (NETDATA_DOUBLE)sp.count : 0.0; + query_point_set_id(new_point, ops.db_total_points_read); + +// if(debug_this) +// info("QUERY: got point %zu, from time %ld to %ld // now from %ld to %ld // query from %ld to %ld", +// new_point.id, new_point.start_time, new_point.end_time, now_start_time, now_end_time, after_wanted, before_wanted); +// + // set the right value to the point we got + if(likely(!storage_point_is_unset(sp) && !storage_point_is_empty(sp))) { + + if(unlikely(use_anomaly_bit_as_value)) + new_point.value = new_point.anomaly; + + else { + switch (ops.tier_query_fetch) { + default: + case TIER_QUERY_FETCH_AVERAGE: + new_point.value = sp.sum / sp.count; + break; + + case TIER_QUERY_FETCH_MIN: + new_point.value = sp.min; + break; + + case TIER_QUERY_FETCH_MAX: + new_point.value = sp.max; + break; + + case TIER_QUERY_FETCH_SUM: + new_point.value = sp.sum; + break; + }; + } + } + else { + new_point.value = NAN; + new_point.flags = SN_FLAG_NONE; + } + } + + // check if the db is giving us zero duration points + if(unlikely(new_point.start_time == new_point.end_time)) { + internal_error(true, "QUERY: '%s', dimension '%s' next_metric() returned point %zu start time %ld, end time %ld, that are both equal", + qt->id, string2str(qm->dimension.id), new_point.id, new_point.start_time, new_point.end_time); + + new_point.start_time = new_point.end_time - ops.tier_ptr->db_update_every; + } + + // check if the db is advancing the query + if(unlikely(new_point.end_time <= last1_point.end_time)) { + internal_error(db_points_read_since_plan_switch > 1, + "QUERY: '%s', dimension '%s' next_metric() returned point %zu from %ld to %ld, before the last point %zu from %ld to %ld, now is %ld to %ld", + qt->id, string2str(qm->dimension.id), new_point.id, new_point.start_time, new_point.end_time, + last1_point.id, last1_point.start_time, last1_point.end_time, now_start_time, now_end_time); + + count_same_end_time++; + continue; + } + 
count_same_end_time = 0; + + // decide how to use this point + if(likely(new_point.end_time < now_end_time)) { // likely to favor tier0 + // this db point ends before our now_end_time + + if(likely(new_point.end_time >= now_start_time)) { // likely to favor tier0 + // this db point ends after our now_start time + + query_add_point_to_group(r, new_point, ops); + } + else { + // we don't need this db point + // it is totally outside our current time-frame + + // this is desirable for the first point of the query + // because it allows us to interpolate the next point + // at exactly the time we will want + + // we only log if this is not point 1 + internal_error(new_point.end_time < after_wanted && new_point.id > 1, + "QUERY: '%s', dimension '%s' next_metric() returned point %zu from %ld time %ld, which is entirely before our current timeframe %ld to %ld (and before the entire query, after %ld, before %ld)", + qt->id, string2str(qm->dimension.id), + new_point.id, new_point.start_time, new_point.end_time, + now_start_time, now_end_time, + after_wanted, before_wanted); + } + + } + else { + // the point ends in the future + // so, we will interpolate it below, at the inner loop + break; + } + } + + if(unlikely(count_same_end_time)) { + internal_error(true, + "QUERY: '%s', dimension '%s', the database does not advance the query, it returned an end time less or equal to the end time of the last point we got %ld, %zu times", + qt->id, string2str(qm->dimension.id), last1_point.end_time, count_same_end_time); + + if(unlikely(new_point.end_time <= last1_point.end_time)) + new_point.end_time = now_end_time; + } + + // the inner loop + // we have 3 points in memory: last2, last1, new + // we select the one to use based on their timestamps + + size_t iterations = 0; + for ( ; now_end_time <= new_point.end_time && points_added < points_wanted ; + now_end_time += ops.view_update_every, iterations++) { + + // now_start_time is wrong in this loop + // but, we don't need it + + 
QUERY_POINT current_point; + + if(likely(now_end_time > new_point.start_time)) { + // it is time for our NEW point to be used + current_point = new_point; + query_interpolate_point(current_point, last1_point, now_end_time); + +// internal_error(current_point.id > 0 +// && last1_point.id == 0 +// && current_point.end_time > after_wanted +// && current_point.end_time > now_end_time, +// "QUERY: '%s', dimension '%s', after %ld, before %ld, view update every %ld," +// " query granularity %ld, interpolating point %zu (from %ld to %ld) at %ld," +// " but we could really favor by having last_point1 in this query.", +// qt->id, string2str(qm->dimension.id), +// after_wanted, before_wanted, +// ops.view_update_every, ops.query_granularity, +// current_point.id, current_point.start_time, current_point.end_time, +// now_end_time); + } + else if(likely(now_end_time <= last1_point.end_time)) { + // our LAST point is still valid + current_point = last1_point; + query_interpolate_point(current_point, last2_point, now_end_time); + +// internal_error(current_point.id > 0 +// && last2_point.id == 0 +// && current_point.end_time > after_wanted +// && current_point.end_time > now_end_time, +// "QUERY: '%s', dimension '%s', after %ld, before %ld, view update every %ld," +// " query granularity %ld, interpolating point %zu (from %ld to %ld) at %ld," +// " but we could really favor by having last_point2 in this query.", +// qt->id, string2str(qm->dimension.id), +// after_wanted, before_wanted, ops.view_update_every, ops.query_granularity, +// current_point.id, current_point.start_time, current_point.end_time, +// now_end_time); + } + else { + // a GAP, we don't have a value this time + current_point = QUERY_POINT_EMPTY; + } + + query_add_point_to_group(r, current_point, ops); + + rrdr_line = rrdr_line_init(r, now_end_time, rrdr_line); + size_t rrdr_o_v_index = rrdr_line * r->d + dim_id_in_rrdr; + + if(unlikely(!min_date)) min_date = now_end_time; + max_date = now_end_time; + + // find 
the place to store our values + RRDR_VALUE_FLAGS *rrdr_value_options_ptr = &r->o[rrdr_o_v_index]; + + // update the dimension options + if(likely(ops.group_points_non_zero)) + r->od[dim_id_in_rrdr] |= RRDR_DIMENSION_NONZERO; + + // store the specific point options + *rrdr_value_options_ptr = ops.group_value_flags; + + // store the group value + NETDATA_DOUBLE group_value = ops.grouping_flush(r, rrdr_value_options_ptr); + r->v[rrdr_o_v_index] = group_value; + + // we only store uint8_t anomaly rates, + // so let's get double precision by storing + // anomaly rates in the range 0 - 200 + r->ar[rrdr_o_v_index] = ops.group_anomaly_rate / (NETDATA_DOUBLE)ops.group_points_added; + + if(likely(points_added || dim_id_in_rrdr)) { + // find the min/max across all dimensions + + if(unlikely(group_value < min)) min = group_value; + if(unlikely(group_value > max)) max = group_value; + + } + else { + // runs only when dim_id_in_rrdr == 0 && points_added == 0 + // so, on the first point added for the query. 
+ min = max = group_value; + } + + points_added++; + ops.group_points_added = 0; + ops.group_value_flags = RRDR_VALUE_NOTHING; + ops.group_points_non_zero = 0; + ops.group_anomaly_rate = 0; + } + // the loop above increased "now" by query_granularity, + // but the main loop will increase it too, + // so, let's undo the last iteration of this loop + if(iterations) + now_end_time -= ops.view_update_every; + } + ops.finalize(&ops.handle); + + r->internal.result_points_generated += points_added; + r->internal.db_points_read += ops.db_total_points_read; + for(size_t tr = 0; tr < storage_tiers ; tr++) + r->internal.tier_points_read[tr] += ops.db_points_read_per_tier[tr]; + + r->min = min; + r->max = max; + r->before = max_date; + r->after = min_date - ops.view_update_every + ops.query_granularity; + rrdr_done(r, rrdr_line); + + internal_error(points_added != points_wanted, + "QUERY: '%s', dimension '%s', requested %zu points, but RRDR added %zu (%zu db points read).", + qt->id, string2str(qm->dimension.id), + (size_t)points_wanted, (size_t)points_added, ops.db_total_points_read); +} + +// ---------------------------------------------------------------------------- +// fill the gap of a tier + +void store_metric_at_tier(RRDDIM *rd, size_t tier, struct rrddim_tier *t, STORAGE_POINT sp, usec_t now_ut); +void store_metric_collection_completed(void); + +void rrdr_fill_tier_gap_from_smaller_tiers(RRDDIM *rd, size_t tier, time_t now) { + if(unlikely(tier >= storage_tiers)) return; + if(storage_tiers_backfill[tier] == RRD_BACKFILL_NONE) return; + + struct rrddim_tier *t = rd->tiers[tier]; + if(unlikely(!t)) return; + + time_t latest_time_t = t->query_ops->latest_time(t->db_metric_handle); + time_t granularity = (time_t)t->tier_grouping * (time_t)rd->update_every; + time_t time_diff = now - latest_time_t; + + // if the user wants only NEW backfilling, and we don't have any data + if(storage_tiers_backfill[tier] == RRD_BACKFILL_NEW && latest_time_t <= 0) return; + + // there is 
really nothing we can do + if(now <= latest_time_t || time_diff < granularity) return; + + struct storage_engine_query_handle handle; + + // for each lower tier + for(int read_tier = (int)tier - 1; read_tier >= 0 ; read_tier--){ + time_t smaller_tier_first_time = rd->tiers[read_tier]->query_ops->oldest_time(rd->tiers[read_tier]->db_metric_handle); + time_t smaller_tier_last_time = rd->tiers[read_tier]->query_ops->latest_time(rd->tiers[read_tier]->db_metric_handle); + if(smaller_tier_last_time <= latest_time_t) continue; // it is as bad as we are + + long after_wanted = (latest_time_t < smaller_tier_first_time) ? smaller_tier_first_time : latest_time_t; + long before_wanted = smaller_tier_last_time; + + struct rrddim_tier *tmp = rd->tiers[read_tier]; + tmp->query_ops->init(tmp->db_metric_handle, &handle, after_wanted, before_wanted); + + size_t points_read = 0; + + while(!tmp->query_ops->is_finished(&handle)) { + + STORAGE_POINT sp = tmp->query_ops->next_metric(&handle); + points_read++; + + if(sp.end_time > latest_time_t) { + latest_time_t = sp.end_time; + store_metric_at_tier(rd, tier, t, sp, sp.end_time * USEC_PER_SEC); + } + } + + tmp->query_ops->finalize(&handle); + store_metric_collection_completed(); + global_statistics_backfill_query_completed(points_read); + + //internal_error(true, "DBENGINE: backfilled chart '%s', dimension '%s', tier %d, from %ld to %ld, with %zu points from tier %d", + // rd->rrdset->name, rd->name, tier, after_wanted, before_wanted, points, tr); + } +} + +// ---------------------------------------------------------------------------- +// fill RRDR for the whole chart + +#ifdef NETDATA_INTERNAL_CHECKS +static void rrd2rrdr_log_request_response_metadata(RRDR *r + , RRDR_OPTIONS options __maybe_unused + , RRDR_GROUPING group_method + , bool aligned + , size_t group + , time_t resampling_time + , size_t resampling_group + , time_t after_wanted + , time_t after_requested + , time_t before_wanted + , time_t before_requested + , size_t 
points_requested + , size_t points_wanted + //, size_t after_slot + //, size_t before_slot + , const char *msg + ) { + + time_t first_entry_t = r->internal.qt->db.first_time_t; + time_t last_entry_t = r->internal.qt->db.last_time_t; + + internal_error( + true, + "rrd2rrdr() on %s update every %ld with %s grouping %s (group: %zu, resampling_time: %ld, resampling_group: %zu), " + "after (got: %ld, want: %ld, req: %ld, db: %ld), " + "before (got: %ld, want: %ld, req: %ld, db: %ld), " + "duration (got: %ld, want: %ld, req: %ld, db: %ld), " + "points (got: %zu, want: %zu, req: %zu), " + "%s" + , r->internal.qt->id + , r->internal.qt->window.query_granularity + + // grouping + , (aligned) ? "aligned" : "unaligned" + , group_method2string(group_method) + , group + , resampling_time + , resampling_group + + // after + , r->after + , after_wanted + , after_requested + , first_entry_t + + // before + , r->before + , before_wanted + , before_requested + , last_entry_t + + // duration + , (long)(r->before - r->after + r->internal.qt->window.query_granularity) + , (long)(before_wanted - after_wanted + r->internal.qt->window.query_granularity) + , (long)before_requested - after_requested + , (long)((last_entry_t - first_entry_t) + r->internal.qt->window.query_granularity) + + // points + , r->rows + , points_wanted + , points_requested + + // message + , msg + ); +} +#endif // NETDATA_INTERNAL_CHECKS + +// Returns true if a relative period was requested, or false if it was an absolute period +bool rrdr_relative_window_to_absolute(time_t *after, time_t *before) { + time_t now = now_realtime_sec() - 1; + + int absolute_period_requested = -1; + long long after_requested, before_requested; + + before_requested = *before; + after_requested = *after; + + // allow relative for before (smaller than API_RELATIVE_TIME_MAX) + if(ABS(before_requested) <= API_RELATIVE_TIME_MAX) { + // if the user asked for a positive relative time, + // flip it to a negative + if(before_requested > 0) + 
before_requested = -before_requested; + + before_requested = now + before_requested; + absolute_period_requested = 0; + } + + // allow relative for after (smaller than API_RELATIVE_TIME_MAX) + if(ABS(after_requested) <= API_RELATIVE_TIME_MAX) { + if(after_requested > 0) + after_requested = -after_requested; + + // if the user didn't give an after, use the number of points + // to give a sane default + if(after_requested == 0) + after_requested = -600; + + // since the query engine now returns inclusive timestamps + // it is awkward to return 6 points when after=-5 is given + // so for relative queries we add 1 second, to give + // more predictable results to users. + after_requested = before_requested + after_requested + 1; + absolute_period_requested = 0; + } + + if(absolute_period_requested == -1) + absolute_period_requested = 1; + + // check if the parameters are flipped + if(after_requested > before_requested) { + long long t = before_requested; + before_requested = after_requested; + after_requested = t; + } + + // if the query requests future data + // shift the query back to be in the present time + // (this may also happen because of the rules above) + if(before_requested > now) { + long long delta = before_requested - now; + before_requested -= delta; + after_requested -= delta; + } + + time_t absolute_minimum_time = now - (10 * 365 * 86400); + time_t absolute_maximum_time = now + (1 * 365 * 86400); + + if (after_requested < absolute_minimum_time && !unittest_running) + after_requested = absolute_minimum_time; + + if (after_requested > absolute_maximum_time && !unittest_running) + after_requested = absolute_maximum_time; + + if (before_requested < absolute_minimum_time && !unittest_running) + before_requested = absolute_minimum_time; + + if (before_requested > absolute_maximum_time && !unittest_running) + before_requested = absolute_maximum_time; + + *before = before_requested; + *after = after_requested; + + return (absolute_period_requested != 1); +} + 
+// #define DEBUG_QUERY_LOGIC 1 + +#ifdef DEBUG_QUERY_LOGIC +#define query_debug_log_init() BUFFER *debug_log = buffer_create(1000) +#define query_debug_log(args...) buffer_sprintf(debug_log, ##args) +#define query_debug_log_fin() { \ + info("QUERY: '%s', after:%ld, before:%ld, duration:%ld, points:%zu, res:%ld - wanted => after:%ld, before:%ld, points:%zu, group:%zu, granularity:%ld, resgroup:%ld, resdiv:" NETDATA_DOUBLE_FORMAT_AUTO " %s", qt->id, after_requested, before_requested, before_requested - after_requested, points_requested, resampling_time_requested, after_wanted, before_wanted, points_wanted, group, query_granularity, resampling_group, resampling_divisor, buffer_tostring(debug_log)); \ + buffer_free(debug_log); \ + debug_log = NULL; \ + } +#define query_debug_log_free() do { buffer_free(debug_log); } while(0) +#else +#define query_debug_log_init() debug_dummy() +#define query_debug_log(args...) debug_dummy() +#define query_debug_log_fin() debug_dummy() +#define query_debug_log_free() debug_dummy() +#endif + +bool query_target_calculate_window(QUERY_TARGET *qt) { + if (unlikely(!qt)) return false; + + size_t points_requested = (long)qt->request.points; + time_t after_requested = qt->request.after; + time_t before_requested = qt->request.before; + RRDR_GROUPING group_method = qt->request.group_method; + time_t resampling_time_requested = qt->request.resampling_time; + RRDR_OPTIONS options = qt->request.options; + size_t tier = qt->request.tier; + time_t update_every = qt->db.minimum_latest_update_every; + + // RULES + // points_requested = 0 + // the user wants all the natural points the database has + // + // after_requested = 0 + // the user wants to start the query from the oldest point in our database + // + // before_requested = 0 + // the user wants the query to end to the latest point in our database + // + // when natural points are wanted, the query has to be aligned to the update_every + // of the database + + size_t points_wanted = 
points_requested; + time_t after_wanted = after_requested; + time_t before_wanted = before_requested; + + bool aligned = !(options & RRDR_OPTION_NOT_ALIGNED); + bool automatic_natural_points = (points_wanted == 0); + bool relative_period_requested = false; + bool natural_points = (options & RRDR_OPTION_NATURAL_POINTS) || automatic_natural_points; + bool before_is_aligned_to_db_end = false; + + query_debug_log_init(); + + if (ABS(before_requested) <= API_RELATIVE_TIME_MAX || ABS(after_requested) <= API_RELATIVE_TIME_MAX) { + relative_period_requested = true; + natural_points = true; + options |= RRDR_OPTION_NATURAL_POINTS; + query_debug_log(":relative+natural"); + } + + // if the user wants virtual points, make sure we do it + if (options & RRDR_OPTION_VIRTUAL_POINTS) + natural_points = false; + + // set the right flag about natural and virtual points + if (natural_points) { + options |= RRDR_OPTION_NATURAL_POINTS; + + if (options & RRDR_OPTION_VIRTUAL_POINTS) + options &= ~RRDR_OPTION_VIRTUAL_POINTS; + } + else { + options |= RRDR_OPTION_VIRTUAL_POINTS; + + if (options & RRDR_OPTION_NATURAL_POINTS) + options &= ~RRDR_OPTION_NATURAL_POINTS; + } + + if (after_wanted == 0 || before_wanted == 0) { + relative_period_requested = true; + + time_t first_entry_t = qt->db.first_time_t; + time_t last_entry_t = qt->db.last_time_t; + + if (first_entry_t == 0 || last_entry_t == 0) { + internal_error(true, "QUERY: no data detected on query '%s' (db first_entry_t = %ld, last_entry_t = %ld", qt->id, first_entry_t, last_entry_t); + query_debug_log_free(); + return false; + } + + query_debug_log(":first_entry_t %ld, last_entry_t %ld", first_entry_t, last_entry_t); + + if (after_wanted == 0) { + after_wanted = first_entry_t; + query_debug_log(":zero after_wanted %ld", after_wanted); + } + + if (before_wanted == 0) { + before_wanted = last_entry_t; + before_is_aligned_to_db_end = true; + query_debug_log(":zero before_wanted %ld", before_wanted); + } + + if (points_wanted == 0) { + 
points_wanted = (last_entry_t - first_entry_t) / update_every; + query_debug_log(":zero points_wanted %zu", points_wanted); + } + } + + if (points_wanted == 0) { + points_wanted = 600; + query_debug_log(":zero600 points_wanted %zu", points_wanted); + } + + // convert our before_wanted and after_wanted to absolute + rrdr_relative_window_to_absolute(&after_wanted, &before_wanted); + query_debug_log(":relative2absolute after %ld, before %ld", after_wanted, before_wanted); + + if (natural_points && (options & RRDR_OPTION_SELECTED_TIER) && tier > 0 && storage_tiers > 1) { + update_every = rrdset_find_natural_update_every_for_timeframe( + qt, after_wanted, before_wanted, points_wanted, options, tier); + + if (update_every <= 0) update_every = qt->db.minimum_latest_update_every; + query_debug_log(":natural update every %ld", update_every); + } + + // this is the update_every of the query + // it may be different to the update_every of the database + time_t query_granularity = (natural_points) ? 
update_every : 1; + if (query_granularity <= 0) query_granularity = 1; + query_debug_log(":query_granularity %ld", query_granularity); + + // align before_wanted and after_wanted to query_granularity + if (before_wanted % query_granularity) { + before_wanted -= before_wanted % query_granularity; + query_debug_log(":granularity align before_wanted %ld", before_wanted); + } + + if (after_wanted % query_granularity) { + after_wanted -= after_wanted % query_granularity; + query_debug_log(":granularity align after_wanted %ld", after_wanted); + } + + // automatic_natural_points is set when the user wants all the points available in the database + if (automatic_natural_points) { + points_wanted = (before_wanted - after_wanted + 1) / query_granularity; + if (unlikely(points_wanted <= 0)) points_wanted = 1; + query_debug_log(":auto natural points_wanted %zu", points_wanted); + } + + time_t duration = before_wanted - after_wanted; + + // if the resampling time is too big, extend the duration to the past + if (unlikely(resampling_time_requested > duration)) { + after_wanted = before_wanted - resampling_time_requested; + duration = before_wanted - after_wanted; + query_debug_log(":resampling after_wanted %ld", after_wanted); + } + + // if the duration is not aligned to resampling time + // extend the duration to the past, to avoid a gap at the chart + // only when the missing duration is above 1/10th of a point + if (resampling_time_requested > query_granularity && duration % resampling_time_requested) { + time_t delta = duration % resampling_time_requested; + if (delta > resampling_time_requested / 10) { + after_wanted -= resampling_time_requested - delta; + duration = before_wanted - after_wanted; + query_debug_log(":resampling2 after_wanted %ld", after_wanted); + } + } + + // the available points of the query + size_t points_available = (duration + 1) / query_granularity; + if (unlikely(points_available <= 0)) points_available = 1; + query_debug_log(":points_available %zu", 
points_available); + + if (points_wanted > points_available) { + points_wanted = points_available; + query_debug_log(":max points_wanted %zu", points_wanted); + } + + if(points_wanted > 86400 && !unittest_running) { + points_wanted = 86400; + query_debug_log(":absolute max points_wanted %zu", points_wanted); + } + + // calculate the desired grouping of source data points + size_t group = points_available / points_wanted; + if (group == 0) group = 1; + + // round "group" to the closest integer + if (points_available % points_wanted > points_wanted / 2) + group++; + + query_debug_log(":group %zu", group); + + if (points_wanted * group * query_granularity < (size_t)duration) { + // the grouping we are going to do, is not enough + // to cover the entire duration requested, so + // we have to change the number of points, to make sure we will + // respect the timeframe as closely as possibly + + // let's see how many points are the optimal + points_wanted = points_available / group; + + if (points_wanted * group < points_available) + points_wanted++; + + if (unlikely(points_wanted == 0)) + points_wanted = 1; + + query_debug_log(":optimal points %zu", points_wanted); + } + + // resampling_time_requested enforces a certain grouping multiple + NETDATA_DOUBLE resampling_divisor = 1.0; + size_t resampling_group = 1; + if (unlikely(resampling_time_requested > query_granularity)) { + // the points we should group to satisfy gtime + resampling_group = resampling_time_requested / query_granularity; + if (unlikely(resampling_time_requested % query_granularity)) + resampling_group++; + + query_debug_log(":resampling group %zu", resampling_group); + + // adapt group according to resampling_group + if (unlikely(group < resampling_group)) { + group = resampling_group; // do not allow grouping below the desired one + query_debug_log(":group less res %zu", group); + } + if (unlikely(group % resampling_group)) { + group += resampling_group - (group % resampling_group); // make sure group 
is multiple of resampling_group + query_debug_log(":group mod res %zu", group); + } + + // resampling_divisor = group / resampling_group; + resampling_divisor = (NETDATA_DOUBLE) (group * query_granularity) / (NETDATA_DOUBLE) resampling_time_requested; + query_debug_log(":resampling divisor " NETDATA_DOUBLE_FORMAT, resampling_divisor); + } + + // now that we have group, align the requested timeframe to fit it. + if (aligned && before_wanted % (group * query_granularity)) { + if (before_is_aligned_to_db_end) + before_wanted -= before_wanted % (time_t)(group * query_granularity); + else + before_wanted += (time_t)(group * query_granularity) - before_wanted % (time_t)(group * query_granularity); + query_debug_log(":align before_wanted %ld", before_wanted); + } + + after_wanted = before_wanted - (time_t)(points_wanted * group * query_granularity) + query_granularity; + query_debug_log(":final after_wanted %ld", after_wanted); + + duration = before_wanted - after_wanted; + query_debug_log(":final duration %ld", duration + 1); + + query_debug_log_fin(); + + internal_error(points_wanted != duration / (query_granularity * group) + 1, + "QUERY: points_wanted %zu is not points %zu", + points_wanted, (size_t)(duration / (query_granularity * group) + 1)); + + internal_error(group < resampling_group, + "QUERY: group %zu is less than the desired group points %zu", + group, resampling_group); + + internal_error(group > resampling_group && group % resampling_group, + "QUERY: group %zu is not a multiple of the desired group points %zu", + group, resampling_group); + + // ------------------------------------------------------------------------- + // update QUERY_TARGET with our calculations + + qt->window.after = after_wanted; + qt->window.before = before_wanted; + qt->window.relative = relative_period_requested; + qt->window.points = points_wanted; + qt->window.group = group; + qt->window.group_method = group_method; + qt->window.group_options = qt->request.group_options; + 
qt->window.query_granularity = query_granularity; + qt->window.resampling_group = resampling_group; + qt->window.resampling_divisor = resampling_divisor; + qt->window.options = options; + qt->window.tier = tier; + qt->window.aligned = aligned; + + return true; +} + +RRDR *rrd2rrdr_legacy( + ONEWAYALLOC *owa, + RRDSET *st, size_t points, time_t after, time_t before, + RRDR_GROUPING group_method, time_t resampling_time, RRDR_OPTIONS options, const char *dimensions, + const char *group_options, time_t timeout, size_t tier, QUERY_SOURCE query_source) { + + QUERY_TARGET_REQUEST qtr = { + .st = st, + .points = points, + .after = after, + .before = before, + .group_method = group_method, + .resampling_time = resampling_time, + .options = options, + .dimensions = dimensions, + .group_options = group_options, + .timeout = timeout, + .tier = tier, + .query_source = query_source, + }; + + return rrd2rrdr(owa, query_target_create(&qtr)); +} + +RRDR *rrd2rrdr(ONEWAYALLOC *owa, QUERY_TARGET *qt) { + if(!qt) + return NULL; + + if(!owa) { + query_target_release(qt); + return NULL; + } + + // qt.window members are the WANTED ones. + // qt.request members are the REQUESTED ones. 
+ + RRDR *r = rrdr_create(owa, qt); + if(unlikely(!r)) { + internal_error(true, "QUERY: cannot create RRDR for %s, after=%ld, before=%ld, points=%zu", + qt->id, qt->window.after, qt->window.before, qt->window.points); + return NULL; + } + + if(unlikely(!r->d || !qt->window.points)) { + internal_error(true, "QUERY: returning empty RRDR (no dimensions in RRDSET) for %s, after=%ld, before=%ld, points=%zu", + qt->id, qt->window.after, qt->window.before, qt->window.points); + return r; + } + + if(qt->window.relative) + r->result_options |= RRDR_RESULT_OPTION_RELATIVE; + else + r->result_options |= RRDR_RESULT_OPTION_ABSOLUTE; + + // ------------------------------------------------------------------------- + // initialize RRDR + + r->group = qt->window.group; + r->update_every = (int) (qt->window.group * qt->window.query_granularity); + r->before = qt->window.before; + r->after = qt->window.after; + r->internal.points_wanted = qt->window.points; + r->internal.resampling_group = qt->window.resampling_group; + r->internal.resampling_divisor = qt->window.resampling_divisor; + r->internal.query_options = qt->window.options; + + // ------------------------------------------------------------------------- + // assign the processor functions + rrdr_set_grouping_function(r, qt->window.group_method); + + // allocate any memory required by the grouping method + r->internal.grouping_create(r, qt->window.group_options); + + // ------------------------------------------------------------------------- + // do the work for each dimension + + time_t max_after = 0, min_before = 0; + size_t max_rows = 0; + + long dimensions_used = 0, dimensions_nonzero = 0; + struct timeval query_start_time; + struct timeval query_current_time; + if (qt->request.timeout) + now_realtime_timeval(&query_start_time); + + for(size_t c = 0, max = qt->query.used; c < max ; c++) { + // set the query target dimension options to rrdr + r->od[c] = qt->query.array[c].dimension.options; + + r->od[c] |= 
RRDR_DIMENSION_SELECTED; + + // reset the grouping for the new dimension + r->internal.grouping_reset(r); + + rrd2rrdr_do_dimension(r, c); + if (qt->request.timeout) + now_realtime_timeval(&query_current_time); + + if(r->od[c] & RRDR_DIMENSION_NONZERO) + dimensions_nonzero++; + + // verify all dimensions are aligned + if(unlikely(!dimensions_used)) { + min_before = r->before; + max_after = r->after; + max_rows = r->rows; + } + else { + if(r->after != max_after) { + internal_error(true, "QUERY: 'after' mismatch between dimensions for chart '%s': max is %zu, dimension '%s' has %zu", + string2str(qt->query.array[c].dimension.id), (size_t)max_after, string2str(qt->query.array[c].dimension.name), (size_t)r->after); + + r->after = (r->after > max_after) ? r->after : max_after; + } + + if(r->before != min_before) { + internal_error(true, "QUERY: 'before' mismatch between dimensions for chart '%s': max is %zu, dimension '%s' has %zu", + string2str(qt->query.array[c].dimension.id), (size_t)min_before, string2str(qt->query.array[c].dimension.name), (size_t)r->before); + + r->before = (r->before < min_before) ? r->before : min_before; + } + + if(r->rows != max_rows) { + internal_error(true, "QUERY: 'rows' mismatch between dimensions for chart '%s': max is %zu, dimension '%s' has %zu", + string2str(qt->query.array[c].dimension.id), (size_t)max_rows, string2str(qt->query.array[c].dimension.name), (size_t)r->rows); + + r->rows = (r->rows > max_rows) ? 
r->rows : max_rows; + } + } + + dimensions_used++; + if (qt->request.timeout && ((NETDATA_DOUBLE)dt_usec(&query_start_time, &query_current_time) / 1000.0) > (NETDATA_DOUBLE)qt->request.timeout) { + log_access("QUERY CANCELED RUNTIME EXCEEDED %0.2f ms (LIMIT %lld ms)", + (NETDATA_DOUBLE)dt_usec(&query_start_time, &query_current_time) / 1000.0, (long long)qt->request.timeout); + r->result_options |= RRDR_RESULT_OPTION_CANCEL; + break; + } + } + +#ifdef NETDATA_INTERNAL_CHECKS + if (dimensions_used) { + if(r->internal.log) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before, qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + r->internal.log); + + if(r->rows != qt->window.points) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before, qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + "got 'points' is not wanted 'points'"); + + if(qt->window.aligned && (r->before % (qt->window.group * qt->window.query_granularity)) != 0) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before,qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + "'before' is not aligned but alignment is required"); + + // 'after' should not be aligned, since we start inside the first group + //if(qt->window.aligned && (r->after % group) != 0) + // rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, 
qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, qt->window.after, after_requested, before_wanted, before_requested, points_requested, points_wanted, after_slot, before_slot, "'after' is not aligned but alignment is required"); + + if(r->before != qt->window.before) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before, qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + "chart is not aligned to requested 'before'"); + + if(r->before != qt->window.before) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before, qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + "got 'before' is not wanted 'before'"); + + // reported 'after' varies, depending on group + if(r->after != qt->window.after) + rrd2rrdr_log_request_response_metadata(r, qt->window.options, qt->window.group_method, qt->window.aligned, qt->window.group, qt->request.resampling_time, qt->window.resampling_group, + qt->window.after, qt->request.after, qt->window.before, qt->request.before, + qt->request.points, qt->window.points, /*after_slot, before_slot,*/ + "got 'after' is not wanted 'after'"); + + } +#endif + + // free all resources used by the grouping method + r->internal.grouping_free(r); + + // when all the dimensions are zero, we should return all of them + if(unlikely((qt->window.options & RRDR_OPTION_NONZERO) && !dimensions_nonzero && !(r->result_options & RRDR_RESULT_OPTION_CANCEL))) { + // all the dimensions are zero + // mark them as NONZERO to send them all + for(size_t c = 0, max = qt->query.used; c < 
max ; c++) { + if(unlikely(r->od[c] & RRDR_DIMENSION_HIDDEN)) continue; + r->od[c] |= RRDR_DIMENSION_NONZERO; + } + } + + global_statistics_rrdr_query_completed(dimensions_used, r->internal.db_points_read, + r->internal.result_points_generated, qt->request.query_source); + return r; +} diff --git a/web/api/queries/query.h b/web/api/queries/query.h new file mode 100644 index 0000000..ebad5a1 --- /dev/null +++ b/web/api/queries/query.h @@ -0,0 +1,59 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_DATA_QUERY_H +#define NETDATA_API_DATA_QUERY_H + +#ifdef __cplusplus +extern "C" { +#endif + +typedef enum rrdr_grouping { + RRDR_GROUPING_UNDEFINED = 0, + RRDR_GROUPING_AVERAGE, + RRDR_GROUPING_MIN, + RRDR_GROUPING_MAX, + RRDR_GROUPING_SUM, + RRDR_GROUPING_INCREMENTAL_SUM, + RRDR_GROUPING_TRIMMED_MEAN1, + RRDR_GROUPING_TRIMMED_MEAN2, + RRDR_GROUPING_TRIMMED_MEAN3, + RRDR_GROUPING_TRIMMED_MEAN5, + RRDR_GROUPING_TRIMMED_MEAN10, + RRDR_GROUPING_TRIMMED_MEAN15, + RRDR_GROUPING_TRIMMED_MEAN20, + RRDR_GROUPING_TRIMMED_MEAN25, + RRDR_GROUPING_MEDIAN, + RRDR_GROUPING_TRIMMED_MEDIAN1, + RRDR_GROUPING_TRIMMED_MEDIAN2, + RRDR_GROUPING_TRIMMED_MEDIAN3, + RRDR_GROUPING_TRIMMED_MEDIAN5, + RRDR_GROUPING_TRIMMED_MEDIAN10, + RRDR_GROUPING_TRIMMED_MEDIAN15, + RRDR_GROUPING_TRIMMED_MEDIAN20, + RRDR_GROUPING_TRIMMED_MEDIAN25, + RRDR_GROUPING_PERCENTILE25, + RRDR_GROUPING_PERCENTILE50, + RRDR_GROUPING_PERCENTILE75, + RRDR_GROUPING_PERCENTILE80, + RRDR_GROUPING_PERCENTILE90, + RRDR_GROUPING_PERCENTILE95, + RRDR_GROUPING_PERCENTILE97, + RRDR_GROUPING_PERCENTILE98, + RRDR_GROUPING_PERCENTILE99, + RRDR_GROUPING_STDDEV, + RRDR_GROUPING_CV, + RRDR_GROUPING_SES, + RRDR_GROUPING_DES, + RRDR_GROUPING_COUNTIF, +} RRDR_GROUPING; + +const char *group_method2string(RRDR_GROUPING group); +void web_client_api_v1_init_grouping(void); +RRDR_GROUPING web_client_api_request_v1_data_group(const char *name, RRDR_GROUPING def); +const char 
*web_client_api_request_v1_data_group_to_string(RRDR_GROUPING group); + +#ifdef __cplusplus +} +#endif + +#endif //NETDATA_API_DATA_QUERY_H diff --git a/web/api/queries/rrdr.c b/web/api/queries/rrdr.c new file mode 100644 index 0000000..676224c --- /dev/null +++ b/web/api/queries/rrdr.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "rrdr.h" + +/* +static void rrdr_dump(RRDR *r) +{ + long c, i; + RRDDIM *d; + + fprintf(stderr, "\nCHART %s (%s)\n", r->st->id, r->st->name); + + for(c = 0, d = r->st->dimensions; d ;c++, d = d->next) { + fprintf(stderr, "DIMENSION %s (%s), %s%s%s%s\n" + , d->id + , d->name + , (r->od[c] & RRDR_EMPTY)?"EMPTY ":"" + , (r->od[c] & RRDR_RESET)?"RESET ":"" + , (r->od[c] & RRDR_DIMENSION_HIDDEN)?"HIDDEN ":"" + , (r->od[c] & RRDR_DIMENSION_NONZERO)?"NONZERO ":"" + ); + } + + if(r->rows <= 0) { + fprintf(stderr, "RRDR does not have any values in it.\n"); + return; + } + + fprintf(stderr, "RRDR includes %d values in it:\n", r->rows); + + // for each line in the array + for(i = 0; i < r->rows ;i++) { + NETDATA_DOUBLE *cn = &r->v[ i * r->d ]; + RRDR_DIMENSION_FLAGS *co = &r->o[ i * r->d ]; + + // print the id and the timestamp of the line + fprintf(stderr, "%ld %ld ", i + 1, r->t[i]); + + // for each dimension + for(c = 0, d = r->st->dimensions; d ;c++, d = d->next) { + if(unlikely(r->od[c] & RRDR_DIMENSION_HIDDEN)) continue; + if(unlikely(!(r->od[c] & RRDR_DIMENSION_NONZERO))) continue; + + if(co[c] & RRDR_EMPTY) + fprintf(stderr, "null "); + else + fprintf(stderr, NETDATA_DOUBLE_FORMAT " %s%s%s%s " + , cn[c] + , (co[c] & RRDR_EMPTY)?"E":" " + , (co[c] & RRDR_RESET)?"R":" " + , (co[c] & RRDR_DIMENSION_HIDDEN)?"H":" " + , (co[c] & RRDR_DIMENSION_NONZERO)?"N":" " + ); + } + + fprintf(stderr, "\n"); + } +} +*/ + +inline void rrdr_free(ONEWAYALLOC *owa, RRDR *r) { + if(unlikely(!r)) return; + + query_target_release(r->internal.qt); + onewayalloc_freez(owa, r->t); + onewayalloc_freez(owa, r->v); + onewayalloc_freez(owa, 
r->o); + onewayalloc_freez(owa, r->od); + onewayalloc_freez(owa, r->ar); + onewayalloc_freez(owa, r); +} + +RRDR *rrdr_create(ONEWAYALLOC *owa, QUERY_TARGET *qt) { + if(unlikely(!qt || !qt->query.used || !qt->window.points)) + return NULL; + + size_t dimensions = qt->query.used; + size_t points = qt->window.points; + + // create the rrdr + RRDR *r = onewayalloc_callocz(owa, 1, sizeof(RRDR)); + r->internal.owa = owa; + r->internal.qt = qt; + + r->before = qt->window.before; + r->after = qt->window.after; + r->internal.points_wanted = qt->window.points; + r->d = (int)dimensions; + r->n = (int)points; + + r->t = onewayalloc_callocz(owa, points, sizeof(time_t)); + r->v = onewayalloc_mallocz(owa, points * dimensions * sizeof(NETDATA_DOUBLE)); + r->o = onewayalloc_mallocz(owa, points * dimensions * sizeof(RRDR_VALUE_FLAGS)); + r->ar = onewayalloc_mallocz(owa, points * dimensions * sizeof(NETDATA_DOUBLE)); + r->od = onewayalloc_mallocz(owa, dimensions * sizeof(RRDR_DIMENSION_FLAGS)); + + r->group = 1; + r->update_every = 1; + + return r; +} diff --git a/web/api/queries/rrdr.h b/web/api/queries/rrdr.h new file mode 100644 index 0000000..6151cdd --- /dev/null +++ b/web/api/queries/rrdr.h @@ -0,0 +1,152 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_QUERIES_RRDR_H +#define NETDATA_QUERIES_RRDR_H + +#include "libnetdata/libnetdata.h" +#include "web/api/queries/query.h" + +#ifdef __cplusplus +extern "C" { +#endif + +typedef enum tier_query_fetch { + TIER_QUERY_FETCH_SUM, + TIER_QUERY_FETCH_MIN, + TIER_QUERY_FETCH_MAX, + TIER_QUERY_FETCH_AVERAGE +} TIER_QUERY_FETCH; + +typedef enum rrdr_options { + RRDR_OPTION_NONZERO = 0x00000001, // don't output dimensions with just zero values + RRDR_OPTION_REVERSED = 0x00000002, // output the rows in reverse order (oldest to newest) + RRDR_OPTION_ABSOLUTE = 0x00000004, // values positive, for DATASOURCE_SSV before summing + RRDR_OPTION_MIN2MAX = 0x00000008, // when adding dimensions, use max - min, instead of sum + 
RRDR_OPTION_SECONDS = 0x00000010, // output seconds, instead of dates + RRDR_OPTION_MILLISECONDS = 0x00000020, // output milliseconds, instead of dates + RRDR_OPTION_NULL2ZERO = 0x00000040, // do not show nulls, convert them to zeros + RRDR_OPTION_OBJECTSROWS = 0x00000080, // each row of values should be an object, not an array + RRDR_OPTION_GOOGLE_JSON = 0x00000100, // comply with google JSON/JSONP specs + RRDR_OPTION_JSON_WRAP = 0x00000200, // wrap the response in a JSON header with info about the result + RRDR_OPTION_LABEL_QUOTES = 0x00000400, // in CSV output, wrap header labels in double quotes + RRDR_OPTION_PERCENTAGE = 0x00000800, // give values as percentage of total + RRDR_OPTION_NOT_ALIGNED = 0x00001000, // do not align charts for persistent timeframes + RRDR_OPTION_DISPLAY_ABS = 0x00002000, // for badges, display the absolute value, but calculate colors with sign + RRDR_OPTION_MATCH_IDS = 0x00004000, // when filtering dimensions, match only IDs + RRDR_OPTION_MATCH_NAMES = 0x00008000, // when filtering dimensions, match only names + RRDR_OPTION_NATURAL_POINTS = 0x00020000, // return the natural points of the database + RRDR_OPTION_VIRTUAL_POINTS = 0x00040000, // return virtual points + RRDR_OPTION_ANOMALY_BIT = 0x00080000, // Return the anomaly bit stored in each collected_number + RRDR_OPTION_RETURN_RAW = 0x00100000, // Return raw data for aggregating across multiple nodes + RRDR_OPTION_RETURN_JWAR = 0x00200000, // Return anomaly rates in jsonwrap + RRDR_OPTION_SELECTED_TIER = 0x00400000, // Use the selected tier for the query + RRDR_OPTION_ALL_DIMENSIONS = 0x00800000, // Return the full dimensions list + + // internal ones - not to be exposed to the API + RRDR_OPTION_INTERNAL_AR = 0x10000000, // internal use only, to let the formatters we want to render the anomaly rate + RRDR_OPTION_HEALTH_RSRVD1 = 0x80000000, // reserved for RRDCALC_OPTION_NO_CLEAR_NOTIFICATION +} RRDR_OPTIONS; + +typedef enum rrdr_value_flag { + RRDR_VALUE_NOTHING = 0x00, // no flag 
set (a good default) + RRDR_VALUE_EMPTY = 0x01, // the database value is empty + RRDR_VALUE_RESET = 0x02, // the database value is marked as reset (overflown) +} RRDR_VALUE_FLAGS; + +typedef enum rrdr_dimension_flag { + RRDR_DIMENSION_DEFAULT = 0x00, + RRDR_DIMENSION_HIDDEN = 0x04, // the dimension is hidden (not to be presented to callers) + RRDR_DIMENSION_NONZERO = 0x08, // the dimension is non zero (contains non-zero values) + RRDR_DIMENSION_SELECTED = 0x10, // the dimension is selected for evaluation in this RRDR +} RRDR_DIMENSION_FLAGS; + +// RRDR result options +typedef enum rrdr_result_flags { + RRDR_RESULT_OPTION_ABSOLUTE = 0x00000001, // the query uses absolute time-frames + // (can be cached by browsers and proxies) + RRDR_RESULT_OPTION_RELATIVE = 0x00000002, // the query uses relative time-frames + // (should not to be cached by browsers and proxies) + RRDR_RESULT_OPTION_VARIABLE_STEP = 0x00000004, // the query uses variable-step time-frames + RRDR_RESULT_OPTION_CANCEL = 0x00000008, // the query needs to be cancelled +} RRDR_RESULT_OPTIONS; + +typedef struct rrdresult { + RRDR_RESULT_OPTIONS result_options; // RRDR_RESULT_OPTION_* + + size_t d; // the number of dimensions + size_t n; // the number of values in the arrays + size_t rows; // the number of rows used + + RRDR_DIMENSION_FLAGS *od; // the options for the dimensions + + time_t *t; // array of n timestamps + NETDATA_DOUBLE *v; // array n x d values + RRDR_VALUE_FLAGS *o; // array n x d options for each value returned + NETDATA_DOUBLE *ar; // array n x d of anomaly rates (0 - 100) + + size_t group; // how many collected values were grouped for each row + time_t update_every; // what is the suggested update frequency in seconds + + NETDATA_DOUBLE min; + NETDATA_DOUBLE max; + + time_t before; + time_t after; + + // internal rrd2rrdr() members below this point + struct { + ONEWAYALLOC *owa; // the allocator used + struct query_target *qt; // the QUERY_TARGET + + RRDR_OPTIONS query_options; // 
RRDR_OPTION_* (as run by the query) + + size_t points_wanted; // used by SES and DES + size_t resampling_group; // used by AVERAGE + NETDATA_DOUBLE resampling_divisor; // used by AVERAGE + + // grouping function pointers + void (*grouping_create)(struct rrdresult *r, const char *options); + void (*grouping_reset)(struct rrdresult *r); + void (*grouping_free)(struct rrdresult *r); + void (*grouping_add)(struct rrdresult *r, NETDATA_DOUBLE value); + NETDATA_DOUBLE (*grouping_flush)(struct rrdresult *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + + TIER_QUERY_FETCH tier_query_fetch; // which value to use from STORAGE_POINT + void *grouping_data; // the internal data of the grouping function + +#ifdef NETDATA_INTERNAL_CHECKS + const char *log; +#endif + + // statistics + size_t db_points_read; + size_t result_points_generated; + size_t tier_points_read[RRD_STORAGE_TIERS]; + } internal; +} RRDR; + +#define rrdr_rows(r) ((r)->rows) + +#include "database/rrd.h" +void rrdr_free(ONEWAYALLOC *owa, RRDR *r); +RRDR *rrdr_create(ONEWAYALLOC *owa, struct query_target *qt); + +#include "../web_api_v1.h" +#include "web/api/queries/query.h" + +RRDR *rrd2rrdr_legacy( + ONEWAYALLOC *owa, + RRDSET *st, size_t points, time_t after, time_t before, + RRDR_GROUPING group_method, time_t resampling_time, RRDR_OPTIONS options, const char *dimensions, + const char *group_options, time_t timeout, size_t tier, QUERY_SOURCE query_source); + +RRDR *rrd2rrdr(ONEWAYALLOC *owa, struct query_target *qt); +bool query_target_calculate_window(struct query_target *qt); + +bool rrdr_relative_window_to_absolute(time_t *after, time_t *before); + +#ifdef __cplusplus +} +#endif + +#endif //NETDATA_QUERIES_RRDR_H diff --git a/web/api/queries/ses/Makefile.am b/web/api/queries/ses/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/ses/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = 
$(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/ses/README.md b/web/api/queries/ses/README.md new file mode 100644 index 0000000..b835b81 --- /dev/null +++ b/web/api/queries/ses/README.md @@ -0,0 +1,61 @@ +<!-- +title: "Single (or Simple) Exponential Smoothing (`ses`)" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/ses/README.md +--> + +# Single (or Simple) Exponential Smoothing (`ses`) + +> This query is also available as `ema` and `ewma`. + +An exponential moving average (`ema`), also known as an exponentially weighted moving average (`ewma`), +is a first-order infinite impulse response filter that applies weighting factors which decrease +exponentially. The weighting for each older datum decreases exponentially, never reaching zero. + +In simple terms, this is like an average value, but more recent values are given more weight. + +Netdata automatically adjusts the weight (`alpha`) based on the number of values processed, +using the formula: + +``` +window = min(number of values, 15) +alpha = 2 / (window + 1) +``` + +You can change the default cap of `15` by setting the following in `netdata.conf`: + +``` +[web] + ses max window = 15 +``` + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: ses -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`ses` does not change the units. For example, if the chart units are `requests/sec`, the exponential +moving average is also expressed in `requests/sec`. + +It can also be used in APIs and badges as `&group=ses` in the URL. 
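As a rough illustration of the smoothing described above, the following C sketch applies `alpha = 2 / (window + 1)` to a short series. It assumes the window is simply the number of values, capped at a maximum (15 by default), which is a simplification of what `ses.c` in this commit does. The name `ses_smooth` and its parameters are hypothetical and not part of the Netdata API.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Netdata.
 * Smooths `n` values (oldest first) with alpha = 2 / (window + 1),
 * where window = min(n, max_window). Returns 0.0 for an empty series. */
double ses_smooth(const double *values, size_t n, size_t max_window) {
    if (n == 0)
        return 0.0;

    size_t window = (n > max_window) ? max_window : n;
    double alpha = 2.0 / ((double)window + 1.0);

    /* seed the level with the first value, then fold in every value */
    double level = values[0];
    for (size_t i = 0; i < n; i++)
        level = alpha * values[i] + (1.0 - alpha) * level;

    return level;
}
```

For the series `10, 20` the window is 2, so `alpha` is 2/3 and the smoothed value is about 16.67, which is closer to the newest value than the plain average of 15.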
+ +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average&value_color=yellow) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=ses&after=-60&label=single+exponential+smoothing&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Moving_average#exponential-moving-average> +- <https://en.wikipedia.org/wiki/Exponential_smoothing>. + + diff --git a/web/api/queries/ses/ses.c b/web/api/queries/ses/ses.c new file mode 100644 index 0000000..5e94002 --- /dev/null +++ b/web/api/queries/ses/ses.c @@ -0,0 +1,90 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "ses.h" + + +// ---------------------------------------------------------------------------- +// single exponential smoothing + +struct grouping_ses { + NETDATA_DOUBLE alpha; + NETDATA_DOUBLE alpha_other; + NETDATA_DOUBLE level; + size_t count; +}; + +static size_t max_window_size = 15; + +void grouping_init_ses(void) { + long long ret = config_get_number(CONFIG_SECTION_WEB, "ses max window", (long long)max_window_size); + if(ret <= 1) { + config_set_number(CONFIG_SECTION_WEB, "ses max window", (long long)max_window_size); + } + else { + max_window_size = (size_t) ret; + } +} + +static inline NETDATA_DOUBLE window(RRDR *r, struct grouping_ses *g) { + (void)g; + + NETDATA_DOUBLE points; + if(r->group == 1) { + // provide a running DES + points = (NETDATA_DOUBLE)r->internal.points_wanted; + } + else { + // 
provide a SES with flush points + points = (NETDATA_DOUBLE)r->group; + } + + return (points > (NETDATA_DOUBLE)max_window_size) ? (NETDATA_DOUBLE)max_window_size : points; +} + +static inline void set_alpha(RRDR *r, struct grouping_ses *g) { + // https://en.wikipedia.org/wiki/Moving_average#Exponential_moving_average + // A commonly used value for alpha is 2 / (N + 1) + g->alpha = 2.0 / (window(r, g) + 1.0); + g->alpha_other = 1.0 - g->alpha; +} + +void grouping_create_ses(RRDR *r, const char *options __maybe_unused) { + struct grouping_ses *g = (struct grouping_ses *)onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_ses)); + set_alpha(r, g); + g->level = 0.0; + r->internal.grouping_data = g; +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_ses(RRDR *r) { + struct grouping_ses *g = (struct grouping_ses *)r->internal.grouping_data; + g->level = 0.0; + g->count = 0; +} + +void grouping_free_ses(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_ses(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_ses *g = (struct grouping_ses *)r->internal.grouping_data; + + if(unlikely(!g->count)) + g->level = value; + + g->level = g->alpha * value + g->alpha_other * g->level; + g->count++; +} + +NETDATA_DOUBLE grouping_flush_ses(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_ses *g = (struct grouping_ses *)r->internal.grouping_data; + + if(unlikely(!g->count || !netdata_double_isnumber(g->level))) { + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + return 0.0; + } + + return g->level; +} diff --git a/web/api/queries/ses/ses.h b/web/api/queries/ses/ses.h new file mode 100644 index 0000000..79b09fb --- /dev/null +++ b/web/api/queries/ses/ses.h @@ -0,0 +1,17 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_SES_H +#define NETDATA_API_QUERIES_SES_H + +#include "../query.h" +#include 
"../rrdr.h" + +void grouping_init_ses(void); + +void grouping_create_ses(RRDR *r, const char *options __maybe_unused); +void grouping_reset_ses(RRDR *r); +void grouping_free_ses(RRDR *r); +void grouping_add_ses(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_ses(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_SES_H diff --git a/web/api/queries/stddev/Makefile.am b/web/api/queries/stddev/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/stddev/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/stddev/README.md b/web/api/queries/stddev/README.md new file mode 100644 index 0000000..2fca47d --- /dev/null +++ b/web/api/queries/stddev/README.md @@ -0,0 +1,93 @@ +<!-- +title: "standard deviation (`stddev`)" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/stddev/README.md +--> + +# standard deviation (`stddev`) + +The standard deviation is a measure that is used to quantify the amount of variation or dispersion +of a set of data values. + +A low standard deviation indicates that the data points tend to be close to the mean (also called the +expected value) of the set, while a high standard deviation indicates that the data points are spread +out over a wider range of values. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: stddev -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`stdev` does not change the units. For example, if the chart units is `requests/sec`, the standard +deviation will be again expressed in the same units. + +It can also be used in APIs and badges as `&group=stddev` in the URL. 
+ +## Examples + +Examining the last minute of `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=average&after=-60&label=average&value_color=yellow) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=stddev&after=-60&label=standard+deviation&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=max&after=-60&label=max) + +## References + +Check <https://en.wikipedia.org/wiki/Standard_deviation>. + +--- + +# Coefficient of variation (`cv`) + +> This query is also available as `rsd`. + +The coefficient of variation (`cv`), also known as relative standard deviation (`rsd`), +is a standardized measure of dispersion of a probability distribution or frequency distribution. + +It is defined as the ratio of the **standard deviation** to the **mean**. + +In simple terms, it expresses the variation as a percentage of the mean. So, if the average value of a metric is 1000 +and its standard deviation is 100 (meaning that it typically varies between roughly 900 and 1100), then `cv` is 10%. + +This is an easy way to check the % variation, without using absolute values. + +For example, you may trigger an alarm if your web server requests/sec `cv` is above 20 (`%`) +over the last minute. So if your web server was serving 1000 reqs/sec over the last minute, +it will trigger the alarm if it had spikes below 800/sec or above 1200/sec. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: cv -1m unaligned of my_dimension + units: % + warn: $this > 20 +``` + +The unit reported by `cv` is always `%`. + +It can also be used in APIs and badges as `&group=cv` in the URL. 
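The relation between the standard deviation and `cv` can be sketched in C as follows. This is a minimal sketch, not Netdata's code: it uses a one-pass running mean and variance update (Welford's method, the approach `stddev.c` in this commit also follows) and the sample variance (dividing by `count - 1`). The names `running_stats`, `stats_add`, `stats_stddev`, `stats_cv` and `sqrt_newton` are hypothetical. The square root is hand-rolled only so the sketch has no `libm` link dependency; real code would call a library square root. Returning 0 for a zero mean is also an assumption of this sketch.

```c
#include <stddef.h>

/* Running statistics; must be zero-initialized before the first stats_add(). */
struct running_stats {
    size_t count;   /* number of values seen so far */
    double mean;    /* running mean */
    double m2;      /* running sum of squared deviations from the mean */
};

/* Welford update: one pass, the values themselves are not stored. */
void stats_add(struct running_stats *s, double value) {
    s->count++;
    double delta = value - s->mean;
    s->mean += delta / (double)s->count;
    s->m2 += delta * (value - s->mean);
}

/* Newton iteration for the square root; returns 0 for non-positive input. */
double sqrt_newton(double x) {
    if (x <= 0.0)
        return 0.0;

    double guess = (x > 1.0) ? x : 1.0;
    for (int i = 0; i < 200; i++) {
        double next = 0.5 * (guess + x / guess);
        if (next == guess)
            break;
        guess = next;
    }
    return guess;
}

/* Sample standard deviation; 0 when fewer than two values were added. */
double stats_stddev(const struct running_stats *s) {
    if (s->count < 2)
        return 0.0;
    return sqrt_newton(s->m2 / (double)(s->count - 1));
}

/* Coefficient of variation in percent: 100 * stddev / |mean|.
 * Returns 0 when the mean is 0 (an assumption of this sketch). */
double stats_cv(const struct running_stats *s) {
    double m = (s->mean < 0.0) ? -s->mean : s->mean;
    if (m == 0.0)
        return 0.0;
    return 100.0 * stats_stddev(s) / m;
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sample standard deviation is about 2.14, so `cv` is about 42.8 (`%`).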
+ +## Examples + +Examining last 1 minute `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=average&after=-60&label=average&value_color=yellow) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=cv&after=-60&label=coefficient+of+variation&value_color=orange&units=pcent) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&dimensions=success&group=max&after=-60&label=max) + +## References + +Check <https://en.wikipedia.org/wiki/Coefficient_of_variation>. + + diff --git a/web/api/queries/stddev/stddev.c b/web/api/queries/stddev/stddev.c new file mode 100644 index 0000000..92a67b4 --- /dev/null +++ b/web/api/queries/stddev/stddev.c @@ -0,0 +1,173 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "stddev.h" + + +// ---------------------------------------------------------------------------- +// stddev + +// this implementation comes from: +// https://www.johndcook.com/blog/standard_deviation/ + +struct grouping_stddev { + long count; + NETDATA_DOUBLE m_oldM, m_newM, m_oldS, m_newS; +}; + +void grouping_create_stddev(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_stddev)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_stddev(RRDR *r) { + struct grouping_stddev *g = (struct grouping_stddev *)r->internal.grouping_data; + g->count = 0; +} + +void grouping_free_stddev(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_stddev(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_stddev *g = 
(struct grouping_stddev *)r->internal.grouping_data; + + g->count++; + + // See Knuth TAOCP vol 2, 3rd edition, page 232 + if (g->count == 1) { + g->m_oldM = g->m_newM = value; + g->m_oldS = 0.0; + } + else { + g->m_newM = g->m_oldM + (value - g->m_oldM) / g->count; + g->m_newS = g->m_oldS + (value - g->m_oldM) * (value - g->m_newM); + + // set up for next iteration + g->m_oldM = g->m_newM; + g->m_oldS = g->m_newS; + } +} + +static inline NETDATA_DOUBLE mean(struct grouping_stddev *g) { + return (g->count > 0) ? g->m_newM : 0.0; +} + +static inline NETDATA_DOUBLE variance(struct grouping_stddev *g) { + return ( (g->count > 1) ? g->m_newS/(NETDATA_DOUBLE)(g->count - 1) : 0.0 ); +} +static inline NETDATA_DOUBLE stddev(struct grouping_stddev *g) { + return sqrtndd(variance(g)); +} + +NETDATA_DOUBLE grouping_flush_stddev(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_stddev *g = (struct grouping_stddev *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(likely(g->count > 1)) { + value = stddev(g); + + if(!netdata_double_isnumber(value)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + } + else if(g->count == 1) { + value = 0.0; + } + else { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + + grouping_reset_stddev(r); + + return value; +} + +// https://en.wikipedia.org/wiki/Coefficient_of_variation +NETDATA_DOUBLE grouping_flush_coefficient_of_variation(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_stddev *g = (struct grouping_stddev *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(likely(g->count > 1)) { + NETDATA_DOUBLE m = mean(g); + value = 100.0 * stddev(g) / ((m < 0)? 
-m : m); + + if(unlikely(!netdata_double_isnumber(value))) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + } + else if(g->count == 1) { + // one value collected + value = 0.0; + } + else { + // no values collected + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + + grouping_reset_stddev(r); + + return value; +} + + +/* + * Mean = average + * +NETDATA_DOUBLE grouping_flush_mean(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_stddev *g = (struct grouping_stddev *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = mean(g); + + if(!isnormal(value)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + } + + grouping_reset_stddev(r); + + return value; +} + */ + +/* + * It is not advised to use this version of variance directly + * +NETDATA_DOUBLE grouping_flush_variance(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_stddev *g = (struct grouping_stddev *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = variance(g); + + if(!isnormal(value)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + } + + grouping_reset_stddev(r); + + return value; +} +*/
\ No newline at end of file diff --git a/web/api/queries/stddev/stddev.h b/web/api/queries/stddev/stddev.h new file mode 100644 index 0000000..4b8ffcd --- /dev/null +++ b/web/api/queries/stddev/stddev.h @@ -0,0 +1,18 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_STDDEV_H +#define NETDATA_API_QUERIES_STDDEV_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_stddev(RRDR *r, const char *options __maybe_unused); +void grouping_reset_stddev(RRDR *r); +void grouping_free_stddev(RRDR *r); +void grouping_add_stddev(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_stddev(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); +NETDATA_DOUBLE grouping_flush_coefficient_of_variation(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); +// NETDATA_DOUBLE grouping_flush_mean(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); +// NETDATA_DOUBLE grouping_flush_variance(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_STDDEV_H diff --git a/web/api/queries/sum/Makefile.am b/web/api/queries/sum/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/sum/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/sum/README.md b/web/api/queries/sum/README.md new file mode 100644 index 0000000..d4465bd --- /dev/null +++ b/web/api/queries/sum/README.md @@ -0,0 +1,41 @@ +<!-- +title: "Sum" +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/sum/README.md +--> + +# Sum + +This module sums all the values in the time-frame requested. + +You can use `sum` to find the volume of something over a period. 
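A minimal C sketch of the idea, assuming one sample per second of a per-second rate so that the sum over the window approximates the total volume. The name `sum_values` and the `is_empty` output parameter are hypothetical; the latter only mimics the way the grouping code in this commit flags a result with no values as empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Netdata.
 * Adds up `n` values and reports through `is_empty` whether there was
 * nothing to add (in which case the returned 0.0 carries no information). */
double sum_values(const double *values, size_t n, bool *is_empty) {
    *is_empty = (n == 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += values[i];

    return total;
}
```

With three one-second samples of 100, 120 and 80 requests/sec, the sum is 300, that is, roughly 300 requests served during those three seconds.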
+ +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: sum -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`sum` does not change the units. For example, if the chart units are `requests/sec`, the result +will again be expressed in the same units. + +It can also be used in APIs and badges as `&group=sum` in the URL. + +## Examples + +Examining the last minute of `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=sum&after=-60&label=1m+sum&value_color=orange&units=requests) + +## References + +- <https://en.wikipedia.org/wiki/Summation>. 
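The badges in the sum README compare `min`, `average`, `max` and `sum` over the same window. As a rough illustration of how those reductions relate, here is a minimal C sketch. `struct window_stats` and `sketch_window_stats()` are hypothetical names invented for this example and are not part of Netdata; the real `sum` grouping is implemented in `sum.c`.

```c
#include <stddef.h>

/* Hypothetical summary of one query window; not a Netdata type. */
struct window_stats {
    double min;
    double max;
    double sum;
    double average;
    size_t count;
};

/* Reduce `n` collected values to min, max, sum and average.
 * For an empty window every field stays 0, so the caller can
 * check `count` to tell "no data" apart from a real zero. */
struct window_stats sketch_window_stats(const double *v, size_t n) {
    struct window_stats s = {0.0, 0.0, 0.0, 0.0, 0};

    if (n == 0)
        return s;

    s.min = s.max = v[0];
    for (size_t i = 0; i < n; i++) {
        if (v[i] < s.min) s.min = v[i];
        if (v[i] > s.max) s.max = v[i];
        s.sum += v[i];
    }

    s.count = n;
    s.average = s.sum / (double)n;
    return s;
}
```

Under these assumptions, the sum equals the average multiplied by the number of points in the window, which is why it suits "volume over a period" questions.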
+ + diff --git a/web/api/queries/sum/sum.c b/web/api/queries/sum/sum.c new file mode 100644 index 0000000..eec6e2a --- /dev/null +++ b/web/api/queries/sum/sum.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "sum.h" + +// ---------------------------------------------------------------------------- +// sum + +struct grouping_sum { + NETDATA_DOUBLE sum; + size_t count; +}; + +void grouping_create_sum(RRDR *r, const char *options __maybe_unused) { + r->internal.grouping_data = onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_sum)); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_sum(RRDR *r) { + struct grouping_sum *g = (struct grouping_sum *)r->internal.grouping_data; + g->sum = 0; + g->count = 0; +} + +void grouping_free_sum(RRDR *r) { + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_sum(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_sum *g = (struct grouping_sum *)r->internal.grouping_data; + g->sum += value; + g->count++; +} + +NETDATA_DOUBLE grouping_flush_sum(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_sum *g = (struct grouping_sum *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + + if(unlikely(!g->count)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else { + value = g->sum; + } + + g->sum = 0.0; + g->count = 0; + + return value; +} + + diff --git a/web/api/queries/sum/sum.h b/web/api/queries/sum/sum.h new file mode 100644 index 0000000..8987827 --- /dev/null +++ b/web/api/queries/sum/sum.h @@ -0,0 +1,15 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERY_SUM_H +#define NETDATA_API_QUERY_SUM_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_sum(RRDR *r, const char *options __maybe_unused); +void grouping_reset_sum(RRDR *r); +void grouping_free_sum(RRDR *r); +void grouping_add_sum(RRDR 
*r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_sum(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERY_SUM_H diff --git a/web/api/queries/trimmed_mean/Makefile.am b/web/api/queries/trimmed_mean/Makefile.am new file mode 100644 index 0000000..161784b --- /dev/null +++ b/web/api/queries/trimmed_mean/Makefile.am @@ -0,0 +1,8 @@ +# SPDX-License-Identifier: GPL-3.0-or-later + +AUTOMAKE_OPTIONS = subdir-objects +MAINTAINERCLEANFILES = $(srcdir)/Makefile.in + +dist_noinst_DATA = \ + README.md \ + $(NULL) diff --git a/web/api/queries/trimmed_mean/README.md b/web/api/queries/trimmed_mean/README.md new file mode 100644 index 0000000..71cdb85 --- /dev/null +++ b/web/api/queries/trimmed_mean/README.md @@ -0,0 +1,56 @@ +<!-- +title: "Trimmed Mean" +description: "Use trimmed-mean in API queries and health entities to find the average value from a sample, eliminating any unwanted spikes in the returned metrics." +custom_edit_url: https://github.com/netdata/netdata/edit/master/web/api/queries/trimmed_mean/README.md +--> + +# Trimmed Mean + +The trimmed mean is the average value of a series excluding a given percentage of the smallest and largest points. + +Netdata applies linear interpolation on the last point, if the percentage requested to be excluded does not give a +round number of points. + +The following trimmed-mean aliases are defined: + +- `trimmed-mean1` +- `trimmed-mean2` +- `trimmed-mean3` +- `trimmed-mean5` +- `trimmed-mean10` +- `trimmed-mean15` +- `trimmed-mean20` +- `trimmed-mean25` + +The default `trimmed-mean` is an alias for `trimmed-mean5`. +Any percentage may be requested using the `group_options` query parameter. + +## how to use + +Use it in alarms like this: + +``` + alarm: my_alarm + on: my_chart +lookup: trimmed-mean5 -1m unaligned of my_dimension + warn: $this > 1000 +``` + +`trimmed-mean` does not change the units. For example, if the chart units are `requests/sec`, the result +will again be expressed in the same units. 
+ +It can also be used in APIs and badges as `&group=trimmed-mean` in the URL; the additional parameter `group_options` +may be used to request any percentage (e.g. `&group=trimmed-mean&group_options=29`). + +## Examples + +Examining the last minute of `successful` web server responses: + +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=min&after=-60&label=min) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=average&after=-60&label=average) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=trimmed-mean5&after=-60&label=trimmed-mean5&value_color=orange) +- ![](https://registry.my-netdata.io/api/v1/badge.svg?chart=web_log_nginx.response_statuses&options=unaligned&dimensions=success&group=max&after=-60&label=max) + +## References + +- <https://en.wikipedia.org/wiki/Truncated_mean>. 
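To make the trimming idea in the README above concrete, here is a minimal C sketch of a plain truncated mean. It is a simplification, not Netdata's implementation: it drops a whole number of points from each end and does not perform the linear interpolation the README describes. `sketch_trimmed_mean()` and `cmp_double()` are hypothetical helpers invented for this example; the 0 to 50 clamp mirrors the limits visible in `trimmed_mean.c`.

```c
#include <stddef.h>
#include <stdlib.h>

/* qsort() comparator for doubles, ascending order */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper, not part of Netdata.
 * Sorts `v` in place, drops floor(n * percent / 100) points from
 * EACH end, and returns the mean of the points that remain.
 * `percent` is clamped to [0, 50]; an empty series yields 0. */
double sketch_trimmed_mean(double *v, size_t n, double percent) {
    if (n == 0)
        return 0.0;

    if (percent < 0.0) percent = 0.0;
    if (percent > 50.0) percent = 50.0;

    qsort(v, n, sizeof(double), cmp_double);

    size_t drop = (size_t)((double)n * percent / 100.0);
    if (2 * drop >= n)
        drop = (n - 1) / 2;   /* always keep at least one point */

    double sum = 0.0;
    for (size_t i = drop; i < n - drop; i++)
        sum += v[i];

    return sum / (double)(n - 2 * drop);
}
```

For example, with the ten values 1 to 9 plus a spike of 1000 and `percent` set to 10, one point is dropped from each end, so the spike no longer influences the result.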
diff --git a/web/api/queries/trimmed_mean/trimmed_mean.c b/web/api/queries/trimmed_mean/trimmed_mean.c new file mode 100644 index 0000000..2277208 --- /dev/null +++ b/web/api/queries/trimmed_mean/trimmed_mean.c @@ -0,0 +1,166 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "trimmed_mean.h" + +// ---------------------------------------------------------------------------- +// trimmed mean + +struct grouping_trimmed_mean { + size_t series_size; + size_t next_pos; + NETDATA_DOUBLE percent; + + NETDATA_DOUBLE *series; +}; + +static void grouping_create_trimmed_mean_internal(RRDR *r, const char *options, NETDATA_DOUBLE def) { + long entries = r->group; + if(entries < 10) entries = 10; + + struct grouping_trimmed_mean *g = (struct grouping_trimmed_mean *)onewayalloc_callocz(r->internal.owa, 1, sizeof(struct grouping_trimmed_mean)); + g->series = onewayalloc_mallocz(r->internal.owa, entries * sizeof(NETDATA_DOUBLE)); + g->series_size = (size_t)entries; + + g->percent = def; + if(options && *options) { + g->percent = str2ndd(options, NULL); + if(!netdata_double_isnumber(g->percent)) g->percent = 0.0; + if(g->percent < 0.0) g->percent = 0.0; + if(g->percent > 50.0) g->percent = 50.0; + } + + g->percent = 1.0 - ((g->percent / 100.0) * 2.0); + r->internal.grouping_data = g; +} + +void grouping_create_trimmed_mean1(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 1.0); +} +void grouping_create_trimmed_mean2(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 2.0); +} +void grouping_create_trimmed_mean3(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 3.0); +} +void grouping_create_trimmed_mean5(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 5.0); +} +void grouping_create_trimmed_mean10(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 10.0); +} +void grouping_create_trimmed_mean15(RRDR *r, const char 
*options) { + grouping_create_trimmed_mean_internal(r, options, 15.0); +} +void grouping_create_trimmed_mean20(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 20.0); +} +void grouping_create_trimmed_mean25(RRDR *r, const char *options) { + grouping_create_trimmed_mean_internal(r, options, 25.0); +} + +// resets when switches dimensions +// so, clear everything to restart +void grouping_reset_trimmed_mean(RRDR *r) { + struct grouping_trimmed_mean *g = (struct grouping_trimmed_mean *)r->internal.grouping_data; + g->next_pos = 0; +} + +void grouping_free_trimmed_mean(RRDR *r) { + struct grouping_trimmed_mean *g = (struct grouping_trimmed_mean *)r->internal.grouping_data; + if(g) onewayalloc_freez(r->internal.owa, g->series); + + onewayalloc_freez(r->internal.owa, r->internal.grouping_data); + r->internal.grouping_data = NULL; +} + +void grouping_add_trimmed_mean(RRDR *r, NETDATA_DOUBLE value) { + struct grouping_trimmed_mean *g = (struct grouping_trimmed_mean *)r->internal.grouping_data; + + if(unlikely(g->next_pos >= g->series_size)) { + g->series = onewayalloc_doublesize( r->internal.owa, g->series, g->series_size * sizeof(NETDATA_DOUBLE)); + g->series_size *= 2; + } + + g->series[g->next_pos++] = value; +} + +NETDATA_DOUBLE grouping_flush_trimmed_mean(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr) { + struct grouping_trimmed_mean *g = (struct grouping_trimmed_mean *)r->internal.grouping_data; + + NETDATA_DOUBLE value; + size_t available_slots = g->next_pos; + + if(unlikely(!available_slots)) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + else if(available_slots == 1) { + value = g->series[0]; + } + else { + sort_series(g->series, available_slots); + + NETDATA_DOUBLE min = g->series[0]; + NETDATA_DOUBLE max = g->series[available_slots - 1]; + + if (min != max) { + size_t slots_to_use = (size_t)((NETDATA_DOUBLE)available_slots * g->percent); + if(!slots_to_use) slots_to_use = 1; + + NETDATA_DOUBLE 
percent_to_use = (NETDATA_DOUBLE)slots_to_use / (NETDATA_DOUBLE)available_slots; + NETDATA_DOUBLE percent_delta = g->percent - percent_to_use; + + NETDATA_DOUBLE percent_interpolation_slot = 0.0; + NETDATA_DOUBLE percent_last_slot = 0.0; + if(percent_delta > 0.0) { + NETDATA_DOUBLE percent_to_use_plus_1_slot = (NETDATA_DOUBLE)(slots_to_use + 1) / (NETDATA_DOUBLE)available_slots; + NETDATA_DOUBLE percent_1slot = percent_to_use_plus_1_slot - percent_to_use; + + percent_interpolation_slot = percent_delta / percent_1slot; + percent_last_slot = 1 - percent_interpolation_slot; + } + + int start_slot, stop_slot, step, last_slot, interpolation_slot; + if(min >= 0.0 && max >= 0.0) { + start_slot = (int)((available_slots - slots_to_use) / 2); + stop_slot = start_slot + (int)slots_to_use; + last_slot = stop_slot - 1; + interpolation_slot = stop_slot; + step = 1; + } + else { + start_slot = (int)available_slots - 1 - (int)((available_slots - slots_to_use) / 2); + stop_slot = start_slot - (int)slots_to_use; + last_slot = stop_slot + 1; + interpolation_slot = stop_slot; + step = -1; + } + + value = 0.0; + for(int slot = start_slot; slot != stop_slot ; slot += step) + value += g->series[slot]; + + size_t counted = slots_to_use; + if(percent_interpolation_slot > 0.0 && interpolation_slot >= 0 && interpolation_slot < (int)available_slots) { + value += g->series[interpolation_slot] * percent_interpolation_slot; + value += g->series[last_slot] * percent_last_slot; + counted++; + } + + value = value / (NETDATA_DOUBLE)counted; + } + else + value = min; + } + + if(unlikely(!netdata_double_isnumber(value))) { + value = 0.0; + *rrdr_value_options_ptr |= RRDR_VALUE_EMPTY; + } + + //log_series_to_stderr(g->series, g->next_pos, value, "trimmed_mean"); + + g->next_pos = 0; + + return value; +} diff --git a/web/api/queries/trimmed_mean/trimmed_mean.h b/web/api/queries/trimmed_mean/trimmed_mean.h new file mode 100644 index 0000000..e66d925 --- /dev/null +++ 
b/web/api/queries/trimmed_mean/trimmed_mean.h @@ -0,0 +1,22 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_QUERIES_TRIMMED_MEAN_H +#define NETDATA_API_QUERIES_TRIMMED_MEAN_H + +#include "../query.h" +#include "../rrdr.h" + +void grouping_create_trimmed_mean1(RRDR *r, const char *options); +void grouping_create_trimmed_mean2(RRDR *r, const char *options); +void grouping_create_trimmed_mean3(RRDR *r, const char *options); +void grouping_create_trimmed_mean5(RRDR *r, const char *options); +void grouping_create_trimmed_mean10(RRDR *r, const char *options); +void grouping_create_trimmed_mean15(RRDR *r, const char *options); +void grouping_create_trimmed_mean20(RRDR *r, const char *options); +void grouping_create_trimmed_mean25(RRDR *r, const char *options); +void grouping_reset_trimmed_mean(RRDR *r); +void grouping_free_trimmed_mean(RRDR *r); +void grouping_add_trimmed_mean(RRDR *r, NETDATA_DOUBLE value); +NETDATA_DOUBLE grouping_flush_trimmed_mean(RRDR *r, RRDR_VALUE_FLAGS *rrdr_value_options_ptr); + +#endif //NETDATA_API_QUERIES_TRIMMED_MEAN_H diff --git a/web/api/queries/weights.c b/web/api/queries/weights.c new file mode 100644 index 0000000..a9555a6 --- /dev/null +++ b/web/api/queries/weights.c @@ -0,0 +1,1107 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#include "daemon/common.h" +#include "database/KolmogorovSmirnovDist.h" + +#define MAX_POINTS 10000 +int enable_metric_correlations = CONFIG_BOOLEAN_YES; +int metric_correlations_version = 1; +WEIGHTS_METHOD default_metric_correlations_method = WEIGHTS_METHOD_MC_KS2; + +typedef struct weights_stats { + NETDATA_DOUBLE max_base_high_ratio; + size_t db_points; + size_t result_points; + size_t db_queries; + size_t db_points_per_tier[RRD_STORAGE_TIERS]; + size_t binary_searches; +} WEIGHTS_STATS; + +// ---------------------------------------------------------------------------- +// parse and render metric correlations methods + +static struct { + const char *name; + WEIGHTS_METHOD value; 
+} weights_methods[] = { + { "ks2" , WEIGHTS_METHOD_MC_KS2} + , { "volume" , WEIGHTS_METHOD_MC_VOLUME} + , { "anomaly-rate" , WEIGHTS_METHOD_ANOMALY_RATE} + , { NULL , 0 } +}; + +WEIGHTS_METHOD weights_string_to_method(const char *method) { + for(int i = 0; weights_methods[i].name ;i++) + if(strcmp(method, weights_methods[i].name) == 0) + return weights_methods[i].value; + + return default_metric_correlations_method; +} + +const char *weights_method_to_string(WEIGHTS_METHOD method) { + for(int i = 0; weights_methods[i].name ;i++) + if(weights_methods[i].value == method) + return weights_methods[i].name; + + return "unknown"; +} + +// ---------------------------------------------------------------------------- +// The results per dimension are aggregated into a dictionary + +typedef enum { + RESULT_IS_BASE_HIGH_RATIO = (1 << 0), + RESULT_IS_PERCENTAGE_OF_TIME = (1 << 1), +} RESULT_FLAGS; + +struct register_result { + RESULT_FLAGS flags; + RRDCONTEXT_ACQUIRED *rca; + RRDINSTANCE_ACQUIRED *ria; + RRDMETRIC_ACQUIRED *rma; + NETDATA_DOUBLE value; +}; + +static DICTIONARY *register_result_init() { + DICTIONARY *results = dictionary_create(DICT_OPTION_SINGLE_THREADED); + return results; +} + +static void register_result_destroy(DICTIONARY *results) { + dictionary_destroy(results); +} + +static void register_result(DICTIONARY *results, + RRDCONTEXT_ACQUIRED *rca, + RRDINSTANCE_ACQUIRED *ria, + RRDMETRIC_ACQUIRED *rma, + NETDATA_DOUBLE value, + RESULT_FLAGS flags, + WEIGHTS_STATS *stats, + bool register_zero) { + + if(!netdata_double_isnumber(value)) return; + + // make it positive + NETDATA_DOUBLE v = fabsndd(value); + + // no need to store zero scored values + if(unlikely(fpclassify(v) == FP_ZERO && !register_zero)) + return; + + // keep track of the max of the baseline / highlight ratio + if(flags & RESULT_IS_BASE_HIGH_RATIO && v > stats->max_base_high_ratio) + stats->max_base_high_ratio = v; + + struct register_result t = { + .flags = flags, + .rca = rca, + .ria = ria, 
+ .rma = rma, + .value = v + }; + + // we can use the pointer address or RMA as a unique key for each metric + char buf[20 + 1]; + ssize_t len = snprintfz(buf, 20, "%p", rma); + dictionary_set_advanced(results, buf, len + 1, &t, sizeof(struct register_result), NULL); +} + +// ---------------------------------------------------------------------------- +// Generation of JSON output for the results + +static void results_header_to_json(DICTIONARY *results __maybe_unused, BUFFER *wb, + time_t after, time_t before, + time_t baseline_after, time_t baseline_before, + size_t points, WEIGHTS_METHOD method, + RRDR_GROUPING group, RRDR_OPTIONS options, uint32_t shifts, + size_t examined_dimensions __maybe_unused, usec_t duration, + WEIGHTS_STATS *stats) { + + buffer_sprintf(wb, "{\n" + "\t\"after\": %lld,\n" + "\t\"before\": %lld,\n" + "\t\"duration\": %lld,\n" + "\t\"points\": %zu,\n", + (long long)after, + (long long)before, + (long long)(before - after), + points + ); + + if(method == WEIGHTS_METHOD_MC_KS2 || method == WEIGHTS_METHOD_MC_VOLUME) + buffer_sprintf(wb, "" + "\t\"baseline_after\": %lld,\n" + "\t\"baseline_before\": %lld,\n" + "\t\"baseline_duration\": %lld,\n" + "\t\"baseline_points\": %zu,\n", + (long long)baseline_after, + (long long)baseline_before, + (long long)(baseline_before - baseline_after), + points << shifts + ); + + buffer_sprintf(wb, "" + "\t\"statistics\": {\n" + "\t\t\"query_time_ms\": %f,\n" + "\t\t\"db_queries\": %zu,\n" + "\t\t\"query_result_points\": %zu,\n" + "\t\t\"binary_searches\": %zu,\n" + "\t\t\"db_points_read\": %zu,\n" + "\t\t\"db_points_per_tier\": [ ", + (double)duration / (double)USEC_PER_MS, + stats->db_queries, + stats->result_points, + stats->binary_searches, + stats->db_points + ); + + for(size_t tier = 0; tier < storage_tiers ;tier++) + buffer_sprintf(wb, "%s%zu", tier?", ":"", stats->db_points_per_tier[tier]); + + buffer_sprintf(wb, " ]\n" + "\t},\n" + "\t\"group\": \"%s\",\n" + "\t\"method\": \"%s\",\n" + "\t\"options\": 
\"", + web_client_api_request_v1_data_group_to_string(group), + weights_method_to_string(method) + ); + + web_client_api_request_v1_data_options_to_buffer(wb, options); +} + +static size_t registered_results_to_json_charts(DICTIONARY *results, BUFFER *wb, + time_t after, time_t before, + time_t baseline_after, time_t baseline_before, + size_t points, WEIGHTS_METHOD method, + RRDR_GROUPING group, RRDR_OPTIONS options, uint32_t shifts, + size_t examined_dimensions, usec_t duration, + WEIGHTS_STATS *stats) { + + results_header_to_json(results, wb, after, before, baseline_after, baseline_before, + points, method, group, options, shifts, examined_dimensions, duration, stats); + + buffer_strcat(wb, "\",\n\t\"correlated_charts\": {\n"); + + size_t charts = 0, chart_dims = 0, total_dimensions = 0; + struct register_result *t; + RRDINSTANCE_ACQUIRED *last_ria = NULL; // never access this - we use it only for comparison + dfe_start_read(results, t) { + if(t->ria != last_ria) { + last_ria = t->ria; + + if(charts) buffer_strcat(wb, "\n\t\t\t}\n\t\t},\n"); + buffer_strcat(wb, "\t\t\""); + buffer_strcat(wb, rrdinstance_acquired_id(t->ria)); + buffer_strcat(wb, "\": {\n"); + buffer_strcat(wb, "\t\t\t\"context\": \""); + buffer_strcat(wb, rrdcontext_acquired_id(t->rca)); + buffer_strcat(wb, "\",\n\t\t\t\"dimensions\": {\n"); + charts++; + chart_dims = 0; + } + if (chart_dims) buffer_sprintf(wb, ",\n"); + buffer_sprintf(wb, "\t\t\t\t\"%s\": " NETDATA_DOUBLE_FORMAT, rrdmetric_acquired_name(t->rma), t->value); + chart_dims++; + total_dimensions++; + } + dfe_done(t); + + // close dimensions and chart + if (total_dimensions) + buffer_strcat(wb, "\n\t\t\t}\n\t\t}\n"); + + // close correlated_charts + buffer_sprintf(wb, "\t},\n" + "\t\"correlated_dimensions\": %zu,\n" + "\t\"total_dimensions_count\": %zu\n" + "}\n", + total_dimensions, + examined_dimensions + ); + + return total_dimensions; +} + +static size_t registered_results_to_json_contexts(DICTIONARY *results, BUFFER *wb, + time_t 
after, time_t before, + time_t baseline_after, time_t baseline_before, + size_t points, WEIGHTS_METHOD method, + RRDR_GROUPING group, RRDR_OPTIONS options, uint32_t shifts, + size_t examined_dimensions, usec_t duration, + WEIGHTS_STATS *stats) { + + results_header_to_json(results, wb, after, before, baseline_after, baseline_before, + points, method, group, options, shifts, examined_dimensions, duration, stats); + + buffer_strcat(wb, "\",\n\t\"contexts\": {\n"); + + size_t contexts = 0, charts = 0, total_dimensions = 0, context_dims = 0, chart_dims = 0; + NETDATA_DOUBLE contexts_total_weight = 0.0, charts_total_weight = 0.0; + struct register_result *t; + RRDCONTEXT_ACQUIRED *last_rca = NULL; + RRDINSTANCE_ACQUIRED *last_ria = NULL; + dfe_start_read(results, t) { + + if(t->rca != last_rca) { + last_rca = t->rca; + + if(contexts) + buffer_sprintf(wb, "\n" + "\t\t\t\t\t},\n" + "\t\t\t\t\t\"weight\":" NETDATA_DOUBLE_FORMAT "\n" + "\t\t\t\t}\n\t\t\t},\n" + "\t\t\t\"weight\":" NETDATA_DOUBLE_FORMAT "\n\t\t},\n" + , charts_total_weight / (double)chart_dims + , contexts_total_weight / (double)context_dims); + + buffer_strcat(wb, "\t\t\""); + buffer_strcat(wb, rrdcontext_acquired_id(t->rca)); + buffer_strcat(wb, "\": {\n\t\t\t\"charts\":{\n"); + + contexts++; + charts = 0; + context_dims = 0; + contexts_total_weight = 0.0; + + last_ria = NULL; + } + + if(t->ria != last_ria) { + last_ria = t->ria; + + if(charts) + buffer_sprintf(wb, "\n" + "\t\t\t\t\t},\n" + "\t\t\t\t\t\"weight\":" NETDATA_DOUBLE_FORMAT "\n" + "\t\t\t\t},\n" + , charts_total_weight / (double)chart_dims); + + buffer_strcat(wb, "\t\t\t\t\""); + buffer_strcat(wb, rrdinstance_acquired_id(t->ria)); + buffer_strcat(wb, "\": {\n"); + buffer_strcat(wb, "\t\t\t\t\t\"dimensions\": {\n"); + + charts++; + chart_dims = 0; + charts_total_weight = 0.0; + } + + if (chart_dims) buffer_sprintf(wb, ",\n"); + buffer_sprintf(wb, "\t\t\t\t\t\t\"%s\": " NETDATA_DOUBLE_FORMAT, rrdmetric_acquired_name(t->rma), t->value); + 
charts_total_weight += t->value; + contexts_total_weight += t->value; + chart_dims++; + context_dims++; + total_dimensions++; + } + dfe_done(t); + + // close dimensions and chart + if (total_dimensions) + buffer_sprintf(wb, "\n" + "\t\t\t\t\t},\n" + "\t\t\t\t\t\"weight\":" NETDATA_DOUBLE_FORMAT "\n" + "\t\t\t\t}\n" + "\t\t\t},\n" + "\t\t\t\"weight\":" NETDATA_DOUBLE_FORMAT "\n" + "\t\t}\n" + , charts_total_weight / (double)chart_dims + , contexts_total_weight / (double)context_dims); + + // close correlated_charts + buffer_sprintf(wb, "\t},\n" + "\t\"weighted_dimensions\": %zu,\n" + "\t\"total_dimensions_count\": %zu\n" + "}\n", + total_dimensions, + examined_dimensions + ); + + return total_dimensions; +} + +// ---------------------------------------------------------------------------- +// KS2 algorithm functions + +typedef long int DIFFS_NUMBERS; +#define DOUBLE_TO_INT_MULTIPLIER 100000 + +static inline int binary_search_bigger_than(const DIFFS_NUMBERS arr[], int left, int size, DIFFS_NUMBERS K) { + // binary search to find the index the smallest index + // of the first value in the array that is greater than K + + int right = size; + while(left < right) { + int middle = (int)(((unsigned int)(left + right)) >> 1); + + if(arr[middle] > K) + right = middle; + + else + left = middle + 1; + } + + return left; +} + +int compare_diffs(const void *left, const void *right) { + DIFFS_NUMBERS lt = *(DIFFS_NUMBERS *)left; + DIFFS_NUMBERS rt = *(DIFFS_NUMBERS *)right; + + // https://stackoverflow.com/a/3886497/1114110 + return (lt > rt) - (lt < rt); +} + +static size_t calculate_pairs_diff(DIFFS_NUMBERS *diffs, NETDATA_DOUBLE *arr, size_t size) { + NETDATA_DOUBLE *last = &arr[size - 1]; + size_t added = 0; + + while(last > arr) { + NETDATA_DOUBLE second = *last--; + NETDATA_DOUBLE first = *last; + *diffs++ = (DIFFS_NUMBERS)((first - second) * (NETDATA_DOUBLE)DOUBLE_TO_INT_MULTIPLIER); + added++; + } + + return added; +} + +static double ks_2samp( + DIFFS_NUMBERS 
baseline_diffs[], int base_size, + DIFFS_NUMBERS highlight_diffs[], int high_size, + uint32_t base_shifts) { + + qsort(baseline_diffs, base_size, sizeof(DIFFS_NUMBERS), compare_diffs); + qsort(highlight_diffs, high_size, sizeof(DIFFS_NUMBERS), compare_diffs); + + // Now we should be calculating this: + // + // For each number in the diffs arrays, we should find the index of the + // number bigger than them in both arrays and calculate the % of this index + // vs the total array size. Once we have the 2 percentages, we should find + // the min and max across the delta of all of them. + // + // It should look like this: + // + // base_pcent = binary_search_bigger_than(...) / base_size; + // high_pcent = binary_search_bigger_than(...) / high_size; + // delta = base_pcent - high_pcent; + // if(delta < min) min = delta; + // if(delta > max) max = delta; + // + // This would require a lot of multiplications and divisions. + // + // To speed it up, we do the binary search to find the index of each number + // but, then we divide the base index by the power of two number (shifts) it + // is bigger than high index. So the 2 indexes are now comparable. + // We also keep track of the original indexes with min and max, to properly + // calculate their percentages once the loops finish. 
+ + + // initialize min and max using the first number of baseline_diffs + DIFFS_NUMBERS K = baseline_diffs[0]; + int base_idx = binary_search_bigger_than(baseline_diffs, 1, base_size, K); + int high_idx = binary_search_bigger_than(highlight_diffs, 0, high_size, K); + int delta = base_idx - (high_idx << base_shifts); + int min = delta, max = delta; + int base_min_idx = base_idx; + int base_max_idx = base_idx; + int high_min_idx = high_idx; + int high_max_idx = high_idx; + + // do the baseline_diffs starting from 1 (we did position 0 above) + for(int i = 1; i < base_size; i++) { + K = baseline_diffs[i]; + base_idx = binary_search_bigger_than(baseline_diffs, i + 1, base_size, K); // starting from i, since data1 is sorted + high_idx = binary_search_bigger_than(highlight_diffs, 0, high_size, K); + + delta = base_idx - (high_idx << base_shifts); + if(delta < min) { + min = delta; + base_min_idx = base_idx; + high_min_idx = high_idx; + } + else if(delta > max) { + max = delta; + base_max_idx = base_idx; + high_max_idx = high_idx; + } + } + + // do the highlight_diffs starting from 0 + for(int i = 0; i < high_size; i++) { + K = highlight_diffs[i]; + base_idx = binary_search_bigger_than(baseline_diffs, 0, base_size, K); + high_idx = binary_search_bigger_than(highlight_diffs, i + 1, high_size, K); // starting from i, since data2 is sorted + + delta = base_idx - (high_idx << base_shifts); + if(delta < min) { + min = delta; + base_min_idx = base_idx; + high_min_idx = high_idx; + } + else if(delta > max) { + max = delta; + base_max_idx = base_idx; + high_max_idx = high_idx; + } + } + + // now we have the min, max and their indexes + // properly calculate min and max as dmin and dmax + double dbase_size = (double)base_size; + double dhigh_size = (double)high_size; + double dmin = ((double)base_min_idx / dbase_size) - ((double)high_min_idx / dhigh_size); + double dmax = ((double)base_max_idx / dbase_size) - ((double)high_max_idx / dhigh_size); + + dmin = -dmin; + 
if(islessequal(dmin, 0.0)) dmin = 0.0; + else if(isgreaterequal(dmin, 1.0)) dmin = 1.0; + + double d; + if(isgreaterequal(dmin, dmax)) d = dmin; + else d = dmax; + + double en = round(dbase_size * dhigh_size / (dbase_size + dhigh_size)); + + // under these conditions, KSfbar() crashes + if(unlikely(isnan(en) || isinf(en) || en == 0.0 || isnan(d) || isinf(d))) + return NAN; + + return KSfbar((int)en, d); +} + +static double kstwo( + NETDATA_DOUBLE baseline[], int baseline_points, + NETDATA_DOUBLE highlight[], int highlight_points, + uint32_t base_shifts) { + + // -1 in size, since the calculate_pairs_diffs() returns one less point + DIFFS_NUMBERS baseline_diffs[baseline_points - 1]; + DIFFS_NUMBERS highlight_diffs[highlight_points - 1]; + + int base_size = (int)calculate_pairs_diff(baseline_diffs, baseline, baseline_points); + int high_size = (int)calculate_pairs_diff(highlight_diffs, highlight, highlight_points); + + if(unlikely(!base_size || !high_size)) + return NAN; + + if(unlikely(base_size != baseline_points - 1 || high_size != highlight_points - 1)) { + error("Metric correlations: internal error - calculate_pairs_diff() returns the wrong number of entries"); + return NAN; + } + + return ks_2samp(baseline_diffs, base_size, highlight_diffs, high_size, base_shifts); +} + +NETDATA_DOUBLE *rrd2rrdr_ks2( + ONEWAYALLOC *owa, RRDHOST *host, + RRDCONTEXT_ACQUIRED *rca, RRDINSTANCE_ACQUIRED *ria, RRDMETRIC_ACQUIRED *rma, + time_t after, time_t before, size_t points, RRDR_OPTIONS options, + RRDR_GROUPING group_method, const char *group_options, size_t tier, + WEIGHTS_STATS *stats, + size_t *entries + ) { + + NETDATA_DOUBLE *ret = NULL; + + QUERY_TARGET_REQUEST qtr = { + .host = host, + .rca = rca, + .ria = ria, + .rma = rma, + .after = after, + .before = before, + .points = points, + .options = options, + .group_method = group_method, + .group_options = group_options, + .tier = tier, + .query_source = QUERY_SOURCE_API_WEIGHTS, + }; + + RRDR *r = rrd2rrdr(owa, 
query_target_create(&qtr)); + if(!r) + goto cleanup; + + stats->db_queries++; + stats->result_points += r->internal.result_points_generated; + stats->db_points += r->internal.db_points_read; + for(size_t tr = 0; tr < storage_tiers ; tr++) + stats->db_points_per_tier[tr] += r->internal.tier_points_read[tr]; + + if(r->d != 1) { + error("WEIGHTS: on query '%s' expected 1 dimension in RRDR but got %zu", r->internal.qt->id, r->d); + goto cleanup; + } + + if(unlikely(r->od[0] & RRDR_DIMENSION_HIDDEN)) + goto cleanup; + + if(unlikely(!(r->od[0] & RRDR_DIMENSION_NONZERO))) + goto cleanup; + + if(rrdr_rows(r) < 2) + goto cleanup; + + *entries = rrdr_rows(r); + ret = onewayalloc_mallocz(owa, sizeof(NETDATA_DOUBLE) * rrdr_rows(r)); + + // copy the points of the dimension to a contiguous array + // there is no need to check for empty values, since empty values are already zero + // https://github.com/netdata/netdata/blob/6e3144683a73a2024d51425b20ecfd569034c858/web/api/queries/average/average.c#L41-L43 + memcpy(ret, r->v, rrdr_rows(r) * sizeof(NETDATA_DOUBLE)); + +cleanup: + rrdr_free(owa, r); + return ret; +} + +static void rrdset_metric_correlations_ks2( + RRDHOST *host, + RRDCONTEXT_ACQUIRED *rca, RRDINSTANCE_ACQUIRED *ria, RRDMETRIC_ACQUIRED *rma, + DICTIONARY *results, + time_t baseline_after, time_t baseline_before, + time_t after, time_t before, + size_t points, RRDR_OPTIONS options, + RRDR_GROUPING group_method, const char *group_options, size_t tier, + uint32_t shifts, + WEIGHTS_STATS *stats, bool register_zero + ) { + + options |= RRDR_OPTION_NATURAL_POINTS; + + ONEWAYALLOC *owa = onewayalloc_create(16 * 1024); + + size_t high_points = 0; + NETDATA_DOUBLE *highlight = rrd2rrdr_ks2( + owa, host, rca, ria, rma, after, before, points, + options, group_method, group_options, tier, stats, &high_points); + + if(!highlight) + goto cleanup; + + size_t base_points = 0; + NETDATA_DOUBLE *baseline = rrd2rrdr_ks2( + owa, host, rca, ria, rma, baseline_after, baseline_before, 
high_points << shifts, + options, group_method, group_options, tier, stats, &base_points); + + if(!baseline) + goto cleanup; + + stats->binary_searches += 2 * (base_points - 1) + 2 * (high_points - 1); + + double prob = kstwo(baseline, (int)base_points, highlight, (int)high_points, shifts); + if(!isnan(prob) && !isinf(prob)) { + + // these conditions should never happen, but still let's check + if(unlikely(prob < 0.0)) { + error("Metric correlations: kstwo() returned a negative number: %f", prob); + prob = -prob; + } + if(unlikely(prob > 1.0)) { + error("Metric correlations: kstwo() returned a number above 1.0: %f", prob); + prob = 1.0; + } + + // to spread the results evenly, 0.0 needs to be the less correlated and 1.0 the most correlated + // so, we flip the result of kstwo() + register_result(results, rca, ria, rma, 1.0 - prob, RESULT_IS_BASE_HIGH_RATIO, stats, register_zero); + } + +cleanup: + onewayalloc_destroy(owa); +} + +// ---------------------------------------------------------------------------- +// VOLUME algorithm functions + +static void merge_query_value_to_stats(QUERY_VALUE *qv, WEIGHTS_STATS *stats) { + stats->db_queries++; + stats->result_points += qv->result_points; + stats->db_points += qv->points_read; + for(size_t tier = 0; tier < storage_tiers ; tier++) + stats->db_points_per_tier[tier] += qv->storage_points_per_tier[tier]; +} + +static void rrdset_metric_correlations_volume( + RRDHOST *host, + RRDCONTEXT_ACQUIRED *rca, RRDINSTANCE_ACQUIRED *ria, RRDMETRIC_ACQUIRED *rma, + DICTIONARY *results, + time_t baseline_after, time_t baseline_before, + time_t after, time_t before, + RRDR_OPTIONS options, RRDR_GROUPING group_method, const char *group_options, + size_t tier, + WEIGHTS_STATS *stats, bool register_zero) { + + options |= RRDR_OPTION_MATCH_IDS | RRDR_OPTION_ABSOLUTE | RRDR_OPTION_NATURAL_POINTS; + + QUERY_VALUE baseline_average = rrdmetric2value(host, rca, ria, rma, baseline_after, baseline_before, options, group_method, group_options, 
tier, 0, QUERY_SOURCE_API_WEIGHTS); + merge_query_value_to_stats(&baseline_average, stats); + + if(!netdata_double_isnumber(baseline_average.value)) { + // this means no data for the baseline window, but we may have data for the highlighted one - assume zero + baseline_average.value = 0.0; + } + + QUERY_VALUE highlight_average = rrdmetric2value(host, rca, ria, rma, after, before, options, group_method, group_options, tier, 0, QUERY_SOURCE_API_WEIGHTS); + merge_query_value_to_stats(&highlight_average, stats); + + if(!netdata_double_isnumber(highlight_average.value)) + return; + + if(baseline_average.value == highlight_average.value) { + // they are the same - let's move on + return; + } + + char highlight_countif_options[50 + 1]; + snprintfz(highlight_countif_options, 50, "%s" NETDATA_DOUBLE_FORMAT, highlight_average.value < baseline_average.value ? "<" : ">", baseline_average.value); + QUERY_VALUE highlight_countif = rrdmetric2value(host, rca, ria, rma, after, before, options, RRDR_GROUPING_COUNTIF, highlight_countif_options, tier, 0, QUERY_SOURCE_API_WEIGHTS); + merge_query_value_to_stats(&highlight_countif, stats); + + if(!netdata_double_isnumber(highlight_countif.value)) { + info("WEIGHTS: highlighted countif query failed, but highlighted average worked - strange..."); + return; + } + + // this represents the percentage of time + // the highlighted window was above/below the baseline window + // (above or below depending on their averages) + highlight_countif.value = highlight_countif.value / 100.0; // countif returns 0 - 100.0 + + RESULT_FLAGS flags; + NETDATA_DOUBLE pcent = NAN; + if(isgreater(baseline_average.value, 0.0) || isless(baseline_average.value, 0.0)) { + flags = RESULT_IS_BASE_HIGH_RATIO; + pcent = (highlight_average.value - baseline_average.value) / baseline_average.value * highlight_countif.value; + } + else { + flags = RESULT_IS_PERCENTAGE_OF_TIME; + pcent = highlight_countif.value; + } + + register_result(results, rca, ria, rma, pcent, flags, 
stats, register_zero); +} + +// ---------------------------------------------------------------------------- +// ANOMALY RATE algorithm functions + +static void rrdset_weights_anomaly_rate( + RRDHOST *host, + RRDCONTEXT_ACQUIRED *rca, RRDINSTANCE_ACQUIRED *ria, RRDMETRIC_ACQUIRED *rma, + DICTIONARY *results, + time_t after, time_t before, + RRDR_OPTIONS options, RRDR_GROUPING group_method, const char *group_options, + size_t tier, + WEIGHTS_STATS *stats, bool register_zero) { + + options |= RRDR_OPTION_MATCH_IDS | RRDR_OPTION_ANOMALY_BIT | RRDR_OPTION_NATURAL_POINTS; + + QUERY_VALUE qv = rrdmetric2value(host, rca, ria, rma, after, before, options, group_method, group_options, tier, 0, QUERY_SOURCE_API_WEIGHTS); + merge_query_value_to_stats(&qv, stats); + + if(netdata_double_isnumber(qv.value)) + register_result(results, rca, ria, rma, qv.value, 0, stats, register_zero); +} + +// ---------------------------------------------------------------------------- + +int compare_netdata_doubles(const void *left, const void *right) { + NETDATA_DOUBLE lt = *(NETDATA_DOUBLE *)left; + NETDATA_DOUBLE rt = *(NETDATA_DOUBLE *)right; + + // https://stackoverflow.com/a/3886497/1114110 + return (lt > rt) - (lt < rt); +} + +static inline int binary_search_bigger_than_netdata_double(const NETDATA_DOUBLE arr[], int left, int size, NETDATA_DOUBLE K) { + // binary search to find the index the smallest index + // of the first value in the array that is greater than K + + int right = size; + while(left < right) { + int middle = (int)(((unsigned int)(left + right)) >> 1); + + if(arr[middle] > K) + right = middle; + + else + left = middle + 1; + } + + return left; +} + +// ---------------------------------------------------------------------------- +// spread the results evenly according to their value + +static size_t spread_results_evenly(DICTIONARY *results, WEIGHTS_STATS *stats) { + struct register_result *t; + + // count the dimensions + size_t dimensions = dictionary_entries(results); + 
if(!dimensions) return 0; + + if(stats->max_base_high_ratio == 0.0) + stats->max_base_high_ratio = 1.0; + + // create an array of the right size and copy all the values in it + NETDATA_DOUBLE slots[dimensions]; + dimensions = 0; + dfe_start_read(results, t) { + if(t->flags & (RESULT_IS_PERCENTAGE_OF_TIME)) + t->value = t->value * stats->max_base_high_ratio; + + slots[dimensions++] = t->value; + } + dfe_done(t); + + // sort the array with the values of all dimensions + qsort(slots, dimensions, sizeof(NETDATA_DOUBLE), compare_netdata_doubles); + + // skip the duplicates in the sorted array + NETDATA_DOUBLE last_value = NAN; + size_t unique_values = 0; + for(size_t i = 0; i < dimensions ;i++) { + if(likely(slots[i] != last_value)) + slots[unique_values++] = last_value = slots[i]; + } + + // this cannot happen, but coverity thinks otherwise... + if(!unique_values) + unique_values = dimensions; + + // calculate the weight of each slot, using the number of unique values + NETDATA_DOUBLE slot_weight = 1.0 / (NETDATA_DOUBLE)unique_values; + + dfe_start_read(results, t) { + int slot = binary_search_bigger_than_netdata_double(slots, 0, (int)unique_values, t->value); + NETDATA_DOUBLE v = slot * slot_weight; + if(unlikely(v > 1.0)) v = 1.0; + v = 1.0 - v; + t->value = v; + } + dfe_done(t); + + return dimensions; +} + +// ---------------------------------------------------------------------------- +// The main function + +int web_api_v1_weights( + RRDHOST *host, BUFFER *wb, WEIGHTS_METHOD method, WEIGHTS_FORMAT format, + RRDR_GROUPING group, const char *group_options, + time_t baseline_after, time_t baseline_before, + time_t after, time_t before, + size_t points, RRDR_OPTIONS options, SIMPLE_PATTERN *contexts, size_t tier, size_t timeout) { + + WEIGHTS_STATS stats = {}; + + DICTIONARY *results = register_result_init(); + DICTIONARY *metrics = NULL; + char *error = NULL; + int resp = HTTP_RESP_OK; + + // if the user didn't give a timeout + // assume 60 seconds + if(!timeout) + 
timeout = 60 * MSEC_PER_SEC; + + // if the timeout is less than 1 second + // make it at least 1 second + if(timeout < (long)(1 * MSEC_PER_SEC)) + timeout = 1 * MSEC_PER_SEC; + + usec_t timeout_usec = timeout * USEC_PER_MS; + usec_t started_usec = now_realtime_usec(); + + if(!rrdr_relative_window_to_absolute(&after, &before)) + buffer_no_cacheable(wb); + + if (before <= after) { + resp = HTTP_RESP_BAD_REQUEST; + error = "Invalid selected time-range."; + goto cleanup; + } + + uint32_t shifts = 0; + if(method == WEIGHTS_METHOD_MC_KS2 || method == WEIGHTS_METHOD_MC_VOLUME) { + if(!points) points = 500; + + if(baseline_before <= API_RELATIVE_TIME_MAX) + baseline_before += after; + + rrdr_relative_window_to_absolute(&baseline_after, &baseline_before); + + if (baseline_before <= baseline_after) { + resp = HTTP_RESP_BAD_REQUEST; + error = "Invalid baseline time-range."; + goto cleanup; + } + + // baseline should be a power of two multiple of highlight + long long base_delta = baseline_before - baseline_after; + long long high_delta = before - after; + uint32_t multiplier = (uint32_t)round((double)base_delta / (double)high_delta); + + // check if the multiplier is a power of two + // https://stackoverflow.com/a/600306/1114110 + if((multiplier & (multiplier - 1)) != 0) { + // it is not power of two + // let's find the closest power of two + // https://stackoverflow.com/a/466242/1114110 + multiplier--; + multiplier |= multiplier >> 1; + multiplier |= multiplier >> 2; + multiplier |= multiplier >> 4; + multiplier |= multiplier >> 8; + multiplier |= multiplier >> 16; + multiplier++; + } + + // convert the multiplier to the number of shifts + // we need to do, to divide baseline numbers to match + // the highlight ones + while(multiplier > 1) { + shifts++; + multiplier = multiplier >> 1; + } + + // if the baseline size will not comply to MAX_POINTS + // lower the window of the baseline + while(shifts && (points << shifts) > MAX_POINTS) + shifts--; + + // if the baseline size 
still does not comply to MAX_POINTS + // lower the resolution of the highlight and the baseline + while((points << shifts) > MAX_POINTS) + points = points >> 1; + + if(points < 15) { + resp = HTTP_RESP_BAD_REQUEST; + error = "Too few points available, at least 15 are needed."; + goto cleanup; + } + + // adjust the baseline to be multiplier times bigger than the highlight + baseline_after = baseline_before - (high_delta << shifts); + } + + size_t examined_dimensions = 0; + + bool register_zero = true; + if(options & RRDR_OPTION_NONZERO) { + register_zero = false; + options &= ~RRDR_OPTION_NONZERO; + } + + metrics = rrdcontext_all_metrics_to_dict(host, contexts); + struct metric_entry *me; + + // for every metric_entry in the dictionary + dfe_start_read(metrics, me) { + usec_t now_usec = now_realtime_usec(); + if(now_usec - started_usec > timeout_usec) { + error = "timed out"; + resp = HTTP_RESP_GATEWAY_TIMEOUT; + goto cleanup; + } + + examined_dimensions++; + + switch(method) { + case WEIGHTS_METHOD_ANOMALY_RATE: + options |= RRDR_OPTION_ANOMALY_BIT; + rrdset_weights_anomaly_rate( + host, + me->rca, me->ria, me->rma, + results, + after, before, + options, group, group_options, tier, + &stats, register_zero + ); + break; + + case WEIGHTS_METHOD_MC_VOLUME: + rrdset_metric_correlations_volume( + host, + me->rca, me->ria, me->rma, + results, + baseline_after, baseline_before, + after, before, + options, group, group_options, tier, + &stats, register_zero + ); + break; + + default: + case WEIGHTS_METHOD_MC_KS2: + rrdset_metric_correlations_ks2( + host, + me->rca, me->ria, me->rma, + results, + baseline_after, baseline_before, + after, before, points, + options, group, group_options, tier, shifts, + &stats, register_zero + ); + break; + } + } + dfe_done(me); + + if(!register_zero) + options |= RRDR_OPTION_NONZERO; + + if(!(options & RRDR_OPTION_RETURN_RAW)) + spread_results_evenly(results, &stats); + + usec_t ended_usec = now_realtime_usec(); + + // generate the json 
output we need + buffer_flush(wb); + + size_t added_dimensions = 0; + switch(format) { + case WEIGHTS_FORMAT_CHARTS: + added_dimensions = + registered_results_to_json_charts( + results, wb, + after, before, + baseline_after, baseline_before, + points, method, group, options, shifts, + examined_dimensions, + ended_usec - started_usec, &stats); + break; + + default: + case WEIGHTS_FORMAT_CONTEXTS: + added_dimensions = + registered_results_to_json_contexts( + results, wb, + after, before, + baseline_after, baseline_before, + points, method, group, options, shifts, + examined_dimensions, + ended_usec - started_usec, &stats); + break; + } + + if(!added_dimensions) { + error = "no results produced."; + resp = HTTP_RESP_NOT_FOUND; + } + +cleanup: + if(metrics) dictionary_destroy(metrics); + if(results) register_result_destroy(results); + + if(error) { + buffer_flush(wb); + buffer_sprintf(wb, "{\"error\": \"%s\" }", error); + } + + return resp; +} + +// ---------------------------------------------------------------------------- +// unittest + +/* + +Unit tests against the output of this: + +https://github.com/scipy/scipy/blob/4cf21e753cf937d1c6c2d2a0e372fbc1dbbeea81/scipy/stats/_stats_py.py#L7275-L7449 + +import matplotlib.pyplot as plt +import pandas as pd +import numpy as np +import scipy as sp +from scipy import stats + +data1 = np.array([ 1111, -2222, 33, 100, 100, 15555, -1, 19999, 888, 755, -1, -730 ]) +data2 = np.array([365, -123, 0]) +data1 = np.sort(data1) +data2 = np.sort(data2) +n1 = data1.shape[0] +n2 = data2.shape[0] +data_all = np.concatenate([data1, data2]) +cdf1 = np.searchsorted(data1, data_all, side='right') / n1 +cdf2 = np.searchsorted(data2, data_all, side='right') / n2 +print(data_all) +print("\ndata1", data1, cdf1) +print("\ndata2", data2, cdf2) +cddiffs = cdf1 - cdf2 +print("\ncddiffs", cddiffs) +minS = np.clip(-np.min(cddiffs), 0, 1) +maxS = np.max(cddiffs) +print("\nmin", minS) +print("max", maxS) +m, n = sorted([float(n1), float(n2)], 
reverse=True) +en = m * n / (m + n) +d = max(minS, maxS) +prob = stats.distributions.kstwo.sf(d, np.round(en)) +print("\nprob", prob) + +*/ + +static int double_expect(double v, const char *str, const char *descr) { + char buf[100 + 1]; + snprintfz(buf, 100, "%0.6f", v); + int ret = strcmp(buf, str) ? 1 : 0; + + fprintf(stderr, "%s %s, expected %s, got %s\n", ret?"FAILED":"OK", descr, str, buf); + return ret; +} + +static int mc_unittest1(void) { + int bs = 3, hs = 3; + DIFFS_NUMBERS base[3] = { 1, 2, 3 }; + DIFFS_NUMBERS high[3] = { 3, 4, 6 }; + + double prob = ks_2samp(base, bs, high, hs, 0); + return double_expect(prob, "0.222222", "3x3"); +} + +static int mc_unittest2(void) { + int bs = 6, hs = 3; + DIFFS_NUMBERS base[6] = { 1, 2, 3, 10, 10, 15 }; + DIFFS_NUMBERS high[3] = { 3, 4, 6 }; + + double prob = ks_2samp(base, bs, high, hs, 1); + return double_expect(prob, "0.500000", "6x3"); +} + +static int mc_unittest3(void) { + int bs = 12, hs = 3; + DIFFS_NUMBERS base[12] = { 1, 2, 3, 10, 10, 15, 111, 19999, 8, 55, -1, -73 }; + DIFFS_NUMBERS high[3] = { 3, 4, 6 }; + + double prob = ks_2samp(base, bs, high, hs, 2); + return double_expect(prob, "0.347222", "12x3"); +} + +static int mc_unittest4(void) { + int bs = 12, hs = 3; + DIFFS_NUMBERS base[12] = { 1111, -2222, 33, 100, 100, 15555, -1, 19999, 888, 755, -1, -730 }; + DIFFS_NUMBERS high[3] = { 365, -123, 0 }; + + double prob = ks_2samp(base, bs, high, hs, 2); + return double_expect(prob, "0.777778", "12x3"); +} + +int mc_unittest(void) { + int errors = 0; + + errors += mc_unittest1(); + errors += mc_unittest2(); + errors += mc_unittest3(); + errors += mc_unittest4(); + + return errors; +} + diff --git a/web/api/queries/weights.h b/web/api/queries/weights.h new file mode 100644 index 0000000..50d8634 --- /dev/null +++ b/web/api/queries/weights.h @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-3.0-or-later + +#ifndef NETDATA_API_WEIGHTS_H +#define NETDATA_API_WEIGHTS_H 1 + +#include "query.h" + +typedef enum { + 
WEIGHTS_METHOD_MC_KS2 = 1, + WEIGHTS_METHOD_MC_VOLUME = 2, + WEIGHTS_METHOD_ANOMALY_RATE = 3, +} WEIGHTS_METHOD; + +typedef enum { + WEIGHTS_FORMAT_CHARTS = 1, + WEIGHTS_FORMAT_CONTEXTS = 2, +} WEIGHTS_FORMAT; + +extern int enable_metric_correlations; +extern int metric_correlations_version; +extern WEIGHTS_METHOD default_metric_correlations_method; + +int web_api_v1_weights (RRDHOST *host, BUFFER *wb, WEIGHTS_METHOD method, WEIGHTS_FORMAT format, + RRDR_GROUPING group, const char *group_options, + time_t baseline_after, time_t baseline_before, + time_t after, time_t before, + size_t points, RRDR_OPTIONS options, SIMPLE_PATTERN *contexts, size_t tier, size_t timeout); + +WEIGHTS_METHOD weights_string_to_method(const char *method); +const char *weights_method_to_string(WEIGHTS_METHOD method); +int mc_unittest(void); + +#endif //NETDATA_API_WEIGHTS_H |
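Note on the baseline sizing in web_api_v1_weights(): the comment there says the code finds the "closest" power of two, but the bit-smearing sequence it uses rounds the multiplier up to the next power of two (for example 5 becomes 8, not 4). The following is a minimal standalone sketch of that step and of the MAX_POINTS clamping that follows it. The helper names `baseline_shifts` and `clamp_to_max_points` and the `max_points` parameter are hypothetical and do not exist in weights.c. The sketch also replaces `round()` with integer arithmetic, which gives the same result only for positive window lengths.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: how many right-shifts map the baseline window
// onto the highlight window, following the logic in web_api_v1_weights().
static uint32_t baseline_shifts(long long base_delta, long long high_delta) {
    assert(base_delta > 0 && high_delta > 0);

    // round(base_delta / high_delta), done with integers (assumption:
    // both deltas are positive, so this matches round-half-up)
    uint32_t multiplier =
        (uint32_t)((2 * base_delta + high_delta) / (2 * high_delta));

    // not a power of two: round UP to the next power of two
    if((multiplier & (multiplier - 1)) != 0) {
        multiplier--;
        multiplier |= multiplier >> 1;
        multiplier |= multiplier >> 2;
        multiplier |= multiplier >> 4;
        multiplier |= multiplier >> 8;
        multiplier |= multiplier >> 16;
        multiplier++;
    }

    // log2 of the multiplier
    uint32_t shifts = 0;
    while(multiplier > 1) {
        shifts++;
        multiplier >>= 1;
    }
    return shifts;
}

// Hypothetical helper: first shrink the baseline window, then halve the
// resolution, until (points << shifts) fits within max_points.
static void clamp_to_max_points(size_t *points, uint32_t *shifts, size_t max_points) {
    assert(max_points > 0);

    while(*shifts && (*points << *shifts) > max_points)
        (*shifts)--;

    while((*points << *shifts) > max_points)
        *points >>= 1;
}
```

With a baseline five times longer than the highlight, `baseline_shifts()` yields 3 shifts, so the baseline is then resized to eight times the highlight by the `baseline_after = baseline_before - (high_delta << shifts)` adjustment.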