Diffstat (limited to '')
-rw-r--r-- | docs/collect/application-metrics.md | 81
-rw-r--r-- | docs/collect/container-metrics.md | 99
-rw-r--r-- | docs/collect/enable-configure.md | 68
-rw-r--r-- | docs/collect/how-collectors-work.md | 78
-rw-r--r-- | docs/collect/system-metrics.md | 60
5 files changed, 386 insertions, 0 deletions
diff --git a/docs/collect/application-metrics.md b/docs/collect/application-metrics.md new file mode 100644 index 0000000..c9bc4e2 --- /dev/null +++ b/docs/collect/application-metrics.md @@ -0,0 +1,81 @@ +<!-- +title: "Collect application metrics with Netdata" +sidebar_label: "Application metrics" +description: "Monitor and troubleshoot every application on your infrastructure with per-second metrics, zero configuration, and meaningful charts." +custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/collect/application-metrics.md +--> + +# Collect application metrics with Netdata + +Netdata instantly collects per-second metrics from many different types of applications running on your systems, such as +web servers, databases, message brokers, email servers, search platforms, and much more. Metrics collectors are +pre-installed with every Netdata Agent and usually require zero configuration. Netdata also collects and visualizes +resource utilization per application on Linux systems using `apps.plugin`. + +[**apps.plugin**](/collectors/apps.plugin/README.md) looks at the Linux process tree every second, much like `top` or +`ps fax`, and collects resource utilization information on every running process. By reading the process tree, Netdata +shows CPU, disk, networking, processes, and eBPF for every application or Linux user. Unlike `top` or `ps fax`, Netdata +adds a layer of meaningful visualization on top of the process tree metrics, such as grouping applications into useful +dimensions, and then creates per-application charts under the **Applications** section of a Netdata dashboard, per-user +charts under **Users**, and per-user group charts under **User Groups**. + +Our most popular application collectors: + +- [Prometheus endpoints](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus): Gathers + metrics from one or more Prometheus endpoints that use the OpenMetrics exposition format. 
Auto-detects more than 600 + endpoints. +- [Web server logs (Apache, NGINX)](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog/): + Tail access logs and provide very detailed web server performance statistics. This module is able to parse 200k+ + rows in less than half a second. +- [MySQL](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql/): Collect database global, + replication, and per-user statistics. +- [Redis](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/redis): Monitor database status by + reading the server's response to the `INFO` command. +- [Apache](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/apache/): Collect Apache web server + performance metrics via the `server-status?auto` endpoint. +- [Nginx](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx/): Monitor web server status + information by gathering metrics via `ngx_http_stub_status_module`. +- [Postgres](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/postgres): Collect database health + and performance metrics. +- [ElasticSearch](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/elasticsearch): Collect search + engine performance and health statistics. Optionally collects per-index metrics. +- [PHP-FPM](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/phpfpm/): Collect application summary + and processes health metrics by scraping the status page (`/status?full`). + +Our [supported collectors list](/collectors/COLLECTORS.md#service-and-application-collectors) shows all Netdata's +application metrics collectors, including those for containers/k8s clusters. + +## Collect metrics from applications running on Windows + +Netdata is fully capable of collecting and visualizing metrics from applications running on Windows systems. 
The only +caveat is that you must [install Netdata](/docs/get-started.mdx) on a separate system or a compatible VM because there +is no native Windows version of the Netdata Agent. + +Once you have Netdata running on that separate system, you can follow the [enable and configure +doc](/docs/collect/enable-configure.md) to tell the collector to look for exposed metrics on the Windows system's IP +address or hostname, plus the applicable port. + +For example, if you have a MySQL database with a root password of `my-secret-pw` running on a Windows system with the IP +address 203.0.113.0, you can configure the [MySQL +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql) to look at `203.0.113.0:3306`: + +```yml +jobs: + - name: local + dsn: root:my-secret-pw@tcp(203.0.113.0:3306)/ +``` + +This same logic applies to any application in our [supported collectors +list](/collectors/COLLECTORS.md#service-and-application-collectors) that can run on Windows. + +## What's next? + +If you haven't yet seen the [supported collectors list](/collectors/COLLECTORS.md), give it a once-over for any +additional applications you may want to monitor using Netdata's native collectors, or the [generic Prometheus +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus). + +Collecting all the available metrics on your nodes, and across your entire infrastructure, is just one piece of the +puzzle. Next, learn more about Netdata's famous real-time visualizations by [seeing an overview of your +infrastructure](/docs/visualize/overview-infrastructure.md) using Netdata Cloud. 
+ + diff --git a/docs/collect/container-metrics.md b/docs/collect/container-metrics.md new file mode 100644 index 0000000..5d14536 --- /dev/null +++ b/docs/collect/container-metrics.md @@ -0,0 +1,99 @@ +<!-- +title: "Collect container metrics with Netdata" +sidebar_label: "Container metrics" +description: "Use Netdata to collect per-second utilization and application-level metrics from Linux/Docker containers and Kubernetes clusters." +custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/collect/container-metrics.md +--> + +# Collect container metrics with Netdata + +Thanks to close integration with Linux cgroups and the virtual files it maintains under `/sys/fs/cgroup`, Netdata can +monitor the health, status, and resource utilization of many different types of Linux containers. + +Netdata uses [cgroups.plugin](/collectors/cgroups.plugin/README.md) to poll `/sys/fs/cgroup` and convert the raw data +into human-readable metrics and meaningful visualizations. Through cgroups, Netdata is compatible with **all Linux +containers**, such as Docker, LXC, LXD, Libvirt, systemd-nspawn, and more. Read more about [Docker-specific +monitoring](#collect-docker-metrics) below. + +Netdata also has robust **Kubernetes monitoring** support thanks to a +[Helmchart](/packaging/installer/methods/kubernetes.md) to automate deployment, collectors for k8s agent services, and +robust [service discovery](https://github.com/netdata/agent-service-discovery/#service-discovery) to monitor the +services running inside of pods in your k8s cluster. Read more about [Kubernetes +monitoring](#collect-kubernetes-metrics) below. + +A handful of additional collectors gather metrics from container-related services, such as +[dockerd](/collectors/python.d.plugin/dockerd/README.md) or [Docker +Engine](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/docker_engine/). 
You can find all +container collectors in our supported collectors list under the +[containers/VMs](/collectors/COLLECTORS.md#containers-and-vms) and +[Kubernetes](/collectors/COLLECTORS.md#containers-and-vms) headings. + +## Collect Docker metrics + +Netdata has robust Docker monitoring thanks to the aforementioned +[cgroups.plugin](/collectors/cgroups.plugin/README.md). By polling cgroups every second, Netdata can produce meaningful +visualizations about the CPU, memory, disk, and network utilization of all running containers on the host system with +zero configuration. + +Netdata also collects metrics from applications running inside of Docker containers. For example, if you create a MySQL +database container using `docker run --name some-mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql:tag`, it exposes +metrics on port 3306. You can configure the [MySQL +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql) to look at `127.0.0.1:3306` for +MySQL metrics: + +```yml +jobs: + - name: local + dsn: root:my-secret-pw@tcp(127.0.0.1:3306)/ +``` + +Netdata then collects metrics from the container itself, as well as dozens of [MySQL-specific +metrics](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/mysql#charts). + +### Collect metrics from applications running in Docker containers + +You could use this technique to monitor an entire infrastructure of Docker containers. The same [enable and +configure](/docs/collect/enable-configure.md) procedures apply whether an application runs on the host system or inside +a container. You may need to configure the target endpoint if it's not the application's default. + +Netdata can even [run in a Docker container](/packaging/docker/README.md) itself, and then collect metrics about the +host system, its own container with cgroups, and any applications you want to monitor. 
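As a rough illustration of pointing a collector at a non-default endpoint, the sketch below assembles the same `dsn` string used in the MySQL job above for an arbitrary host and port. The helper names `mysql_dsn` and `render_job` are hypothetical and not part of Netdata, and the output only mirrors the two-key job layout shown in this doc rather than the collector's full set of options.

```python
def mysql_dsn(user, password, host, port=3306):
    """Build a DSN in the user:password@tcp(host:port)/ form used in the job above."""
    return f"{user}:{password}@tcp({host}:{port})/"


def render_job(name, dsn):
    """Render a one-job config fragment with the same layout as the example above."""
    return "\n".join([
        "jobs:",
        f"  - name: {name}",
        f"    dsn: {dsn}",
    ])


if __name__ == "__main__":
    # Same values as the Docker example: MySQL reachable on 127.0.0.1:3306.
    print(render_job("local", mysql_dsn("root", "my-secret-pw", "127.0.0.1")))
```

Changing only the `host` and `port` arguments is enough to target a container or remote system that listens somewhere other than the default.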
+ +See our [application metrics doc](/docs/collect/application-metrics.md) for details about Netdata's application metrics +collection capabilities. + +## Collect Kubernetes metrics + +We already have a few complementary tools and collectors for monitoring the many layers of a Kubernetes cluster, +_entirely for free_. These methods work together to help you troubleshoot performance or availability issues across +your k8s infrastructure. + +- A [Helm chart](https://github.com/netdata/helmchart), which bootstraps a Netdata Agent pod on every node in your + cluster, plus an additional parent pod for storing metrics and managing alarm notifications. +- A [service discovery plugin](https://github.com/netdata/agent-service-discovery), which discovers and creates + configuration files for [compatible + applications](https://github.com/netdata/helmchart#service-discovery-and-supported-services) and any endpoints + covered by our [generic Prometheus + collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/prometheus). With these + configuration files, Netdata collects metrics from any compatible applications as they run _inside_ of a pod. + Service discovery happens without manual intervention as pods are created, destroyed, or moved between nodes. +- A [Kubelet collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubelet), which runs + on each node in a k8s cluster to monitor the number of pods/containers, the volume of operations on each container, + and more. +- A [kube-proxy collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/k8s_kubeproxy), which + also runs on each node and monitors latency and the volume of HTTP requests to the proxy. +- A [cgroups collector](/collectors/cgroups.plugin/README.md), which collects CPU, memory, and bandwidth metrics for + each container running on your k8s cluster. 
+ +For a holistic view of Netdata's Kubernetes monitoring capabilities, see our guide: [_Monitor a Kubernetes (k8s) cluster +with Netdata_](https://learn.netdata.cloud/guides/monitor/kubernetes-k8s-netdata). + +## What's next? + +Netdata is capable of collecting metrics from hundreds of applications, such as web servers, databases, messaging +brokers, and more. See more in the [application metrics doc](/docs/collect/application-metrics.md). + +If you already have all the information you need about collecting metrics, move into Netdata's meaningful visualizations +with [seeing an overview of your infrastructure](/docs/visualize/overview-infrastructure.md) using Netdata Cloud. + + diff --git a/docs/collect/enable-configure.md b/docs/collect/enable-configure.md new file mode 100644 index 0000000..19e680c --- /dev/null +++ b/docs/collect/enable-configure.md @@ -0,0 +1,68 @@ +<!-- +title: "Enable or configure a collector" +description: "Every collector is highly configurable, allowing them to collect metrics from any node and any infrastructure." +custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/collect/enable-configure.md +--> + +# Enable or configure a collector + +When Netdata starts up, each collector searches for exposed metrics on the default endpoint established by that service +or application's standard installation procedure. For example, the [Nginx +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) searches at +`http://127.0.0.1/stub_status` for exposed metrics in the correct format. If an Nginx web server is running and exposes +metrics on that endpoint, the collector begins gathering them. + +However, not every node or infrastructure uses standard ports, paths, files, or naming conventions. You may need to +enable or configure a collector to gather all available metrics from your systems, containers, or applications. 
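To make "exposed metrics in the correct format" more concrete, here is a minimal sketch of reading the plain-text response that Nginx's `ngx_http_stub_status_module` returns. It is not how Netdata's Go collector is implemented; the function name `parse_stub_status` and the `SAMPLE` values are illustrative assumptions.

```python
import re

# Example response body in the stub_status text format (values are made up).
SAMPLE = """\
Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""


def parse_stub_status(text):
    """Return the stub_status counters as a dict, or raise ValueError if the text does not match."""
    active = re.search(r"Active connections:\s*(\d+)", text)
    # The totals line holds three integers: accepts, handled, requests.
    totals = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    states = re.search(r"Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)", text)
    if not (active and totals and states):
        raise ValueError("response is not in the stub_status text format")
    accepts, handled, requests = (int(n) for n in totals.groups())
    reading, writing, waiting = (int(n) for n in states.groups())
    return {
        "active": int(active.group(1)),
        "accepts": accepts,
        "handled": handled,
        "requests": requests,
        "reading": reading,
        "writing": writing,
        "waiting": waiting,
    }


if __name__ == "__main__":
    print(parse_stub_status(SAMPLE))
```

A response that fails this kind of check, such as an HTML error page served from the same URL, is the situation in which a collector would need a different endpoint configured.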
+ +## Enable a collector or its orchestrator + +You can enable/disable collectors individually, or enable/disable entire orchestrators, using their configuration files. +For example, you can change the behavior of the Go orchestrator, or any of its collectors, by editing `go.d.conf`. + +Use `edit-config` from your [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory) to open +the orchestrator's primary configuration file: + +```bash +cd /etc/netdata +sudo ./edit-config go.d.conf +``` + +Within this file, you can either disable the orchestrator entirely (`enabled: no`), or find a specific collector and +enable/disable it with `yes` and `no` settings. Uncomment any line you change to ensure the Netdata daemon reads it on +start. + +After you make your changes, restart the Agent with `sudo systemctl restart netdata`, or the [appropriate +method](/docs/configure/start-stop-restart.md) for your system. + +## Configure a collector + +First, [find the collector](/collectors/COLLECTORS.md) you want to edit and open its documentation. Some software has +collectors written in multiple languages. In these cases, you should always pick the collector written in Go. + +Use `edit-config` from your [Netdata config directory](/docs/configure/nodes.md#the-netdata-config-directory) to open a +collector's configuration file. For example, edit the Nginx collector with the following: + +```bash +sudo ./edit-config go.d/nginx.conf +``` + +Each configuration file describes every available option and offers examples to help you tweak Netdata's settings +according to your needs. In addition, every collector's documentation shows the exact command you need to run to +configure that collector. Uncomment any line you change to ensure the collector's orchestrator or the Netdata daemon +reads it on start. 
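The "uncomment any line you change" rule can be sketched as a small text transformation. This is only an illustration: `edit-config` remains the way to edit these files, the helper `set_option` is hypothetical, and `SAMPLE_CONF` assumes a layout in which collectors are listed as commented `name: yes` entries under a `modules:` key.

```python
import re

# Assumed, simplified layout of an orchestrator config file.
SAMPLE_CONF = "\n".join([
    "enabled: yes",
    "modules:",
    "#  nginx: yes",
    "#  mysql: yes",
])


def set_option(config_text, key, value):
    """Uncomment `key` if needed and set it to `value`; other lines are left as they are."""
    # Group 1: indentation before an optional '#'. Group 2: indentation after the '#'.
    pattern = re.compile(rf"^(\s*)(?:#(\s*))?{re.escape(key)}\s*:")
    out = []
    for line in config_text.splitlines():
        match = pattern.match(line)
        if match:
            indent = match.group(1) + (match.group(2) or "")
            out.append(f"{indent}{key}: {value}")
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Disable only the nginx entry; the mysql entry stays commented out.
    print(set_option(SAMPLE_CONF, "nginx", "no"))
```

The changed line loses its leading `#` and keeps its indentation, so it remains nested under `modules:`, while untouched entries stay commented and continue to use their defaults.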
+ +After you make your changes, restart the Agent with `sudo systemctl restart netdata`, or the [appropriate +method](/docs/configure/start-stop-restart.md) for your system. + +## What's next? + +Read high-level overviews on how Netdata collects [system metrics](/docs/collect/system-metrics.md), [container +metrics](/docs/collect/container-metrics.md), and [application metrics](/docs/collect/application-metrics.md). + +If you're already collecting all metrics from your systems, containers, and applications, it's time to move into +Netdata's visualization features. [See an overview of your infrastructure](/docs/visualize/overview-infrastructure.md) +using Netdata Cloud, or learn how to [interact with dashboards and +charts](/docs/visualize/interact-dashboards-charts.md). + + diff --git a/docs/collect/how-collectors-work.md b/docs/collect/how-collectors-work.md new file mode 100644 index 0000000..07e3485 --- /dev/null +++ b/docs/collect/how-collectors-work.md @@ -0,0 +1,78 @@ +<!-- +title: "How Netdata's metrics collectors work" +description: "When Netdata starts, and with zero configuration, it auto-detects thousands of data sources and immediately collects per-second metrics." +custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/collect/how-collectors-work.md +--> + +# How Netdata's metrics collectors work + +When Netdata starts, and with zero configuration, it auto-detects thousands of data sources and immediately collects +per-second metrics. + +Netdata can immediately collect metrics from these endpoints thanks to 300+ **collectors**, which all come pre-installed +when you [install Netdata](/docs/get-started.mdx). + +Every collector has two primary jobs: + +- Look for exposed metrics at a pre- or user-defined endpoint. +- Gather exposed metrics and use additional logic to build meaningful, interactive visualizations. + +If the collector finds compatible metrics exposed on the configured endpoint, it begins a per-second collection job. 
The +Netdata Agent gathers these metrics, sends them to the [database engine for +storage](/docs/store/change-metrics-storage.md), and immediately [visualizes them +meaningfully](/docs/visualize/interact-dashboards-charts.md) on dashboards. + +Each collector comes with a pre-defined configuration that matches the default setup for that application. The endpoint +in that configuration can be a URL and port, a socket, a file, a web page, and more. + +For example, the [Nginx collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/nginx) searches +at `http://127.0.0.1/stub_status`, which is the default endpoint for exposing Nginx metrics. The [web log collector for +Nginx or Apache](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/weblog) searches at +`/var/log/nginx/access.log` and `/var/log/apache2/access.log`, respectively, both of which are standard locations for +access log files on Linux systems. + +The endpoint is user-configurable, as are many other specifics of what a given collector does. + +## What can Netdata collect? + +To quickly find your answer, see our [list of supported collectors](/collectors/COLLECTORS.md). + +Generally, Netdata's collectors can be grouped into three types: + +- [Systems](/docs/collect/system-metrics.md): Monitor CPU, memory, disk, networking, systemd, eBPF, and much more. + Every metric exposed by `/proc`, `/sys`, and other Linux kernel sources. +- [Containers](/docs/collect/container-metrics.md): Gather metrics from container agents, like `dockerd` or `kubelet`, + along with the resource usage of containers and the applications they run. +- [Applications](/docs/collect/application-metrics.md): Collect per-second metrics from web servers, databases, logs, + message brokers, APM tools, email servers, and much more. + +## Collector architecture and terminology + +**Collector** is a catch-all term for any Netdata process that gathers metrics from an endpoint. 
+ +While we use _collector_ most often in documentation, release notes, and educational content, you may encounter other +terms related to collecting metrics. + +- **Modules** are a type of collector. +- **Orchestrators** are external plugins that run and manage one or more modules. They run as independent processes. + The Go orchestrator is in active development. + - [go.d.plugin](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/): An orchestrator for data + collection modules written in `go`. + - [python.d.plugin](/collectors/python.d.plugin/README.md): An orchestrator for data collection modules written in + `python` v2/v3. + - [charts.d.plugin](/collectors/charts.d.plugin/README.md): An orchestrator for data collection modules written in + `bash` v4+. +- **External plugins** gather metrics from external processes, such as a webserver or database, and run as independent + processes that communicate with the Netdata daemon via pipes. +- **Internal plugins** gather metrics from `/proc`, `/sys`, and other Linux kernel sources. They are written in `C`, + and run as threads within the Netdata daemon. + +## What's next? + +[Enable or configure a collector](/docs/collect/enable-configure.md) if the default settings are not compatible with +your infrastructure. + +See our [collectors reference](/collectors/REFERENCE.md) for detailed information on Netdata's collector architecture, +troubleshooting a collector, developing a custom collector, and more. + + diff --git a/docs/collect/system-metrics.md b/docs/collect/system-metrics.md new file mode 100644 index 0000000..ecd8dad --- /dev/null +++ b/docs/collect/system-metrics.md @@ -0,0 +1,60 @@ +<!-- +title: "Collect system metrics with Netdata" +sidebar_label: "System metrics" +description: "Netdata collects thousands of metrics from physical and virtual systems, IoT/edge devices, and containers with zero configuration." 
+custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/collect/system-metrics.md +--> + +# Collect system metrics with Netdata + +Netdata collects thousands of metrics directly from the operating systems of physical and virtual systems, IoT/edge +devices, and [containers](/docs/collect/container-metrics.md) with zero configuration. + +To gather system metrics, Netdata uses roughly a dozen plugins, each of which has one or more collectors for very +specific metrics exposed by the host. The system metrics Netdata users interact with most for health monitoring and +performance troubleshooting are collected and visualized by `proc.plugin`, `cgroups.plugin`, and `ebpf.plugin`. + +[**proc.plugin**](/collectors/proc.plugin/README.md) gathers metrics from the `/proc` and `/sys` folders in Linux +systems, along with a few other endpoints, and is responsible for the bulk of the system metrics collected and +visualized by Netdata. It collects CPU, memory, disks, load, networking, mount points, and more with zero configuration. +It even allows Netdata to monitor its own resource utilization! + +[**cgroups.plugin**](/collectors/cgroups.plugin/README.md) collects rich metrics about containers and virtual machines +using the virtual files under `/sys/fs/cgroup`. By reading cgroups, Netdata can instantly collect resource utilization +metrics for systemd services, all containers (Docker, LXC, LXD, Libvirt, systemd-nspawn), and more. Learn more in the +[collecting container metrics](/docs/collect/container-metrics.md) doc. + +[**ebpf.plugin**](/collectors/ebpf.plugin/README.md): Netdata's extended Berkeley Packet Filter (eBPF) collector +monitors Linux kernel-level metrics for file descriptors, virtual filesystem IO, and process management. You can use our +eBPF collector to analyze how and when a process accesses files, when it makes system calls, whether it leaks memory or +creates zombie processes, and more. 
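To give a feel for the kind of raw data `proc.plugin` works from, the sketch below derives a CPU busy percentage from two samples of the aggregate `cpu` line in `/proc/stat`. It is a simplified illustration rather than Netdata's own calculation; the function names and the `PREV`/`CURR` sample lines are assumptions made for the example.

```python
# Two made-up samples of the first line of /proc/stat, taken some time apart.
# Field order: user nice system idle iowait irq softirq steal guest guest_nice
PREV = "cpu 100 0 50 800 50 0 0 0 0 0"
CURR = "cpu 150 0 70 860 60 0 0 0 0 0"


def parse_cpu_line(line):
    """Split the aggregate 'cpu' line into a list of integer tick counters."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def cpu_busy_percent(prev_line, curr_line):
    """Percentage of ticks between two samples that were not idle or iowait."""
    prev, curr = parse_cpu_line(prev_line), parse_cpu_line(curr_line)
    deltas = [after - before for before, after in zip(prev, curr)]
    # Simplification: use the first eight counters (user through steal) as the total.
    total = sum(deltas[:8])
    idle = deltas[3] + deltas[4]  # idle + iowait
    if total <= 0:
        return 0.0
    return 100.0 * (total - idle) / total


if __name__ == "__main__":
    print(cpu_busy_percent(PREV, CURR))
```

On a real Linux system the two lines would come from reading `/proc/stat` twice, for example one second apart, which matches the per-second cadence described in this doc.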
+ +While the above plugins and associated collectors are the most important for system metrics, there are many others. You +can find all system collectors in our [supported collectors list](/collectors/COLLECTORS.md#system-collectors). + +## Collect Windows system metrics + +Netdata is also capable of monitoring Windows systems. The [WMI +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/wmi) integrates with +[windows_exporter](https://github.com/prometheus-community/windows_exporter), a small Go-based binary that you can run +on Windows systems. The WMI collector then gathers metrics from an endpoint created by windows_exporter. For more +details, see [the requirements](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/wmi#requirements). + +Next, [configure the WMI +collector](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/wmi#configuration) to point to the URL +and port of your exposed endpoint. Restart Netdata with `sudo systemctl restart netdata`, or the [appropriate +method](/docs/configure/start-stop-restart.md) for your system. You'll start seeing Windows system metrics, such as CPU +utilization, memory, bandwidth per NIC, number of processes, and much more. + +For information about collecting metrics from applications _running on Windows systems_, see the [application metrics +doc](/docs/collect/application-metrics.md#collect-metrics-from-applications-running-on-windows). + +## What's next? + +Because there's some overlap between system metrics and [container metrics](/docs/collect/container-metrics.md), you +should investigate Netdata's container compatibility if you use containers heavily in your infrastructure. + +If you don't use containers, skip ahead to collecting [application metrics](/docs/collect/application-metrics.md) with +Netdata. + +