Diffstat (limited to 'src/go/plugin/go.d/modules/vsphere/integrations/vmware_vcenter_server.md')
-rw-r--r-- | src/go/plugin/go.d/modules/vsphere/integrations/vmware_vcenter_server.md | 357 |
1 files changed, 357 insertions, 0 deletions
diff --git a/src/go/plugin/go.d/modules/vsphere/integrations/vmware_vcenter_server.md b/src/go/plugin/go.d/modules/vsphere/integrations/vmware_vcenter_server.md
new file mode 100644
index 000000000..3f05eadfd
--- /dev/null
+++ b/src/go/plugin/go.d/modules/vsphere/integrations/vmware_vcenter_server.md
@@ -0,0 +1,357 @@
+<!--startmeta
+custom_edit_url: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/vsphere/README.md"
+meta_yaml: "https://github.com/netdata/netdata/edit/master/src/go/plugin/go.d/modules/vsphere/metadata.yaml"
+sidebar_label: "VMware vCenter Server"
+learn_status: "Published"
+learn_rel_path: "Collecting Metrics/Containers and VMs"
+most_popular: True
+message: "DO NOT EDIT THIS FILE DIRECTLY, IT IS GENERATED BY THE COLLECTOR'S metadata.yaml FILE"
+endmeta-->
+
+# VMware vCenter Server
+
+
+<img src="https://netdata.cloud/img/vmware.svg" width="150"/>
+
+
+Plugin: go.d.plugin
+Module: vsphere
+
+<img src="https://img.shields.io/badge/maintained%20by-Netdata-%2300ab44" />
+
+## Overview
+
+This collector monitors hosts and vms performance statistics from `vCenter` servers.
+
+> **Warning**: The `vsphere` collector cannot re-login and continue collecting metrics after a vCenter reboot.
+> go.d.plugin needs to be restarted.
+
+
+
+
+This collector is supported on all platforms.
+
+This collector supports collecting metrics from multiple instances of this integration, including remote instances.
+
+
+### Default Behavior
+
+#### Auto-Detection
+
+This integration doesn't support auto-detection.
+
+#### Limits
+
+The default configuration for this integration does not impose any limits on data collection.
+
+#### Performance Impact
+
+The default `update_every` is 20 seconds, and it doesn't make sense to decrease the value.
+**VMware real-time statistics are generated at the 20-second specificity**.
+
+It is likely that 20 seconds is not enough for big installations and the value should be tuned.
+
+To get a better view we recommend running the collector in debug mode and seeing how much time it will take to collect metrics.
+
+<details>
+<summary>Example (all not related debug lines were removed)</summary>
+
+```
+[ilyam@pc]$ ./go.d.plugin -d -m vsphere
+[ DEBUG ] vsphere[vsphere] discover.go:94 discovering : starting resource discovering process
+[ DEBUG ] vsphere[vsphere] discover.go:102 discovering : found 3 dcs, process took 49.329656ms
+[ DEBUG ] vsphere[vsphere] discover.go:109 discovering : found 12 folders, process took 49.538688ms
+[ DEBUG ] vsphere[vsphere] discover.go:116 discovering : found 3 clusters, process took 47.722692ms
+[ DEBUG ] vsphere[vsphere] discover.go:123 discovering : found 2 hosts, process took 52.966995ms
+[ DEBUG ] vsphere[vsphere] discover.go:130 discovering : found 2 vms, process took 49.832979ms
+[ INFO ] vsphere[vsphere] discover.go:140 discovering : found 3 dcs, 12 folders, 3 clusters (2 dummy), 2 hosts, 3 vms, process took 249.655993ms
+[ DEBUG ] vsphere[vsphere] build.go:12 discovering : building : starting building resources process
+[ INFO ] vsphere[vsphere] build.go:23 discovering : building : built 3/3 dcs, 12/12 folders, 3/3 clusters, 2/2 hosts, 3/3 vms, process took 63.3µs
+[ DEBUG ] vsphere[vsphere] hierarchy.go:10 discovering : hierarchy : start setting resources hierarchy process
+[ INFO ] vsphere[vsphere] hierarchy.go:18 discovering : hierarchy : set 3/3 clusters, 2/2 hosts, 3/3 vms, process took 6.522µs
+[ DEBUG ] vsphere[vsphere] filter.go:24 discovering : filtering : starting filtering resources process
+[ DEBUG ] vsphere[vsphere] filter.go:45 discovering : filtering : removed 0 unmatched hosts
+[ DEBUG ] vsphere[vsphere] filter.go:56 discovering : filtering : removed 0 unmatched vms
+[ INFO ] vsphere[vsphere] filter.go:29 discovering : filtering : filtered 0/2 hosts, 0/3 vms, process took 42.973µs
+[ DEBUG ] vsphere[vsphere] metric_lists.go:14 discovering : metric lists : starting resources metric lists collection process
+[ INFO ] vsphere[vsphere] metric_lists.go:30 discovering : metric lists : collected metric lists for 2/2 hosts, 3/3 vms, process took 275.60764ms
+[ INFO ] vsphere[vsphere] discover.go:74 discovering : discovered 2/2 hosts, 3/3 vms, the whole process took 525.614041ms
+[ INFO ] vsphere[vsphere] discover.go:11 starting discovery process, will do discovery every 5m0s
+[ DEBUG ] vsphere[vsphere] collect.go:11 starting collection process
+[ DEBUG ] vsphere[vsphere] scrape.go:48 scraping : scraped metrics for 2/2 hosts, process took 96.257374ms
+[ DEBUG ] vsphere[vsphere] scrape.go:60 scraping : scraped metrics for 3/3 vms, process took 57.879697ms
+[ DEBUG ] vsphere[vsphere] collect.go:23 metrics collected, process took 154.77997ms
+```
+
+</details>
+
+There you can see that discovering took `525.614041ms`, and collecting metrics took `154.77997ms`. Discovering is a separate thread, it doesn't affect collecting.
+`update_every` and `timeout` parameters should be adjusted based on these numbers.
+
+
+
+## Metrics
+
+Metrics grouped by *scope*.
+
+The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.
+
+
+
+### Per virtual machine
+
+These metrics refer to the Virtual Machine.
+
+Labels:
+
+| Label | Description |
+|:-----------|:----------------|
+| datacenter | Datacenter name |
+| cluster | Cluster name |
+| host | Host name |
+| vm | Virtual Machine name |
+
+Metrics:
+
+| Metric | Dimensions | Unit |
+|:------|:----------|:----|
+| vsphere.vm_cpu_utilization | used | percentage |
+| vsphere.vm_mem_utilization | used | percentage |
+| vsphere.vm_mem_usage | granted, consumed, active, shared | KiB |
+| vsphere.vm_mem_swap_usage | swapped | KiB |
+| vsphere.vm_mem_swap_io | in, out | KiB/s |
+| vsphere.vm_disk_io | read, write | KiB/s |
+| vsphere.vm_disk_max_latency | latency | milliseconds |
+| vsphere.vm_net_traffic | received, sent | KiB/s |
+| vsphere.vm_net_packets | received, sent | packets |
+| vsphere.vm_net_drops | received, sent | packets |
+| vsphere.vm_overall_status | green, red, yellow, gray | status |
+| vsphere.vm_system_uptime | uptime | seconds |
+
+### Per host
+
+These metrics refer to the ESXi host.
+
+Labels:
+
+| Label | Description |
+|:-----------|:----------------|
+| datacenter | Datacenter name |
+| cluster | Cluster name |
+| host | Host name |
+
+Metrics:
+
+| Metric | Dimensions | Unit |
+|:------|:----------|:----|
+| vsphere.host_cpu_utilization | used | percentage |
+| vsphere.host_mem_utilization | used | percentage |
+| vsphere.host_mem_usage | granted, consumed, active, shared, sharedcommon | KiB |
+| vsphere.host_mem_swap_io | in, out | KiB/s |
+| vsphere.host_disk_io | read, write | KiB/s |
+| vsphere.host_disk_max_latency | latency | milliseconds |
+| vsphere.host_net_traffic | received, sent | KiB/s |
+| vsphere.host_net_packets | received, sent | packets |
+| vsphere.host_net_drops | received, sent | packets |
+| vsphere.host_net_errors | received, sent | errors |
+| vsphere.host_overall_status | green, red, yellow, gray | status |
+| vsphere.host_system_uptime | uptime | seconds |
+
+
+
+## Alerts
+
+
+The following alerts are available:
+
+| Alert name | On metric | Description |
+|:------------|:----------|:------------|
+| [ vsphere_vm_cpu_utilization ](https://github.com/netdata/netdata/blob/master/src/health/health.d/vsphere.conf) | vsphere.vm_cpu_utilization | Virtual Machine CPU utilization |
+| [ vsphere_vm_mem_usage ](https://github.com/netdata/netdata/blob/master/src/health/health.d/vsphere.conf) | vsphere.vm_mem_utilization | Virtual Machine memory utilization |
+| [ vsphere_host_cpu_utilization ](https://github.com/netdata/netdata/blob/master/src/health/health.d/vsphere.conf) | vsphere.host_cpu_utilization | ESXi Host CPU utilization |
+| [ vsphere_host_mem_utilization ](https://github.com/netdata/netdata/blob/master/src/health/health.d/vsphere.conf) | vsphere.host_mem_utilization | ESXi Host memory utilization |
+
+
+## Setup
+
+### Prerequisites
+
+No action required.
+
+### Configuration
+
+#### File
+
+The configuration file name for this integration is `go.d/vsphere.conf`.
+
+
+You can edit the configuration file using the `edit-config` script from the
+Netdata [config directory](/docs/netdata-agent/configuration/README.md#the-netdata-config-directory).
+
+```bash
+cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
+sudo ./edit-config go.d/vsphere.conf
+```
+#### Options
+
+The following options can be defined globally: update_every, autodetection_retry.
+
+
+<details open><summary>Config options</summary>
+
+| Name | Description | Default | Required |
+|:----|:-----------|:-------|:--------:|
+| update_every | Data collection frequency. | 20 | no |
+| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
+| url | vCenter server URL. | | yes |
+| host_include | Hosts selector (filter). | | no |
+| vm_include | Virtual machines selector (filter). | | no |
+| discovery_interval | Hosts and VMs discovery interval. | 300 | no |
+| timeout | HTTP request timeout. | 20 | no |
+| username | Username for basic HTTP authentication. | | no |
+| password | Password for basic HTTP authentication. | | no |
+| proxy_url | Proxy URL. | | no |
+| proxy_username | Username for proxy basic HTTP authentication. | | no |
+| proxy_password | Password for proxy basic HTTP authentication. | | no |
+| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
+| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
+| tls_ca | Certification authority that the client uses when verifying the server's certificates. | | no |
+| tls_cert | Client TLS certificate. | | no |
+| tls_key | Client TLS key. | | no |
+
+##### host_include
+
+Metrics of hosts matching the selector will be collected.
+
+- Include pattern syntax: "/Datacenter pattern/Cluster pattern/Host pattern".
+- Match pattern syntax: [simple patterns](/src/libnetdata/simple_pattern/README.md#simple-patterns).
+- Syntax:
+
+  ```yaml
+  host_include:
+    - '/DC1/*'           # select all hosts from datacenter DC1
+    - '/DC2/*/!Host2 *'  # select all hosts from datacenter DC2 except HOST2
+    - '/DC3/Cluster3/*'  # select all hosts from datacenter DC3 cluster Cluster3
+  ```
+
+
+##### vm_include
+
+Metrics of VMs matching the selector will be collected.
+
+- Include pattern syntax: "/Datacenter pattern/Cluster pattern/Host pattern/VM pattern".
+- Match pattern syntax: [simple patterns](/src/libnetdata/simple_pattern/README.md#simple-patterns).
+- Syntax:
+
+  ```yaml
+  vm_include:
+    - '/DC1/*'           # select all VMs from datacenter DC
+    - '/DC2/*/*/!VM2 *'  # select all VMs from datacenter DC2 except VM2
+    - '/DC3/Cluster3/*'  # select all VMs from datacenter DC3 cluster Cluster3
+  ```
+
+
+</details>
+
+#### Examples
+
+##### Basic
+
+A basic example configuration.
+
+```yaml
+jobs:
+  - name     : vcenter1
+    url      : https://203.0.113.1
+    username : admin@vsphere.local
+    password : somepassword
+
+```
+##### Multi-instance
+
+> **Note**: When you define multiple jobs, their names must be unique.
+
+Collecting metrics from local and remote instances.
+
+
+<details open><summary>Config</summary>
+
+```yaml
+jobs:
+  - name     : vcenter1
+    url      : https://203.0.113.1
+    username : admin@vsphere.local
+    password : somepassword
+
+  - name     : vcenter2
+    url      : https://203.0.113.10
+    username : admin@vsphere.local
+    password : somepassword
+
+```
+</details>
+
+
+
+## Troubleshooting
+
+### Debug Mode
+
+**Important**: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.
+
+To troubleshoot issues with the `vsphere` collector, run the `go.d.plugin` with the debug option enabled. The output
+should give you clues as to why the collector isn't working.
+
+- Navigate to the `plugins.d` directory, usually at `/usr/libexec/netdata/plugins.d/`. If that's not the case on
+  your system, open `netdata.conf` and look for the `plugins` setting under `[directories]`.
+
+  ```bash
+  cd /usr/libexec/netdata/plugins.d/
+  ```
+
+- Switch to the `netdata` user.
+
+  ```bash
+  sudo -u netdata -s
+  ```
+
+- Run the `go.d.plugin` to debug the collector:
+
+  ```bash
+  ./go.d.plugin -d -m vsphere
+  ```
+
+### Getting Logs
+
+If you're encountering problems with the `vsphere` collector, follow these steps to retrieve logs and identify potential issues:
+
+- **Run the command** specific to your system (systemd, non-systemd, or Docker container).
+- **Examine the output** for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.
+
+#### System with systemd
+
+Use the following command to view logs generated since the last Netdata service restart:
+
+```bash
+journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep vsphere
+```
+
+#### System without systemd
+
+Locate the collector log file, typically at `/var/log/netdata/collector.log`, and use `grep` to filter for collector's name:
+
+```bash
+grep vsphere /var/log/netdata/collector.log
+```
+
+**Note**: This method shows logs from all restarts. Focus on the **latest entries** for troubleshooting current issues.
+
+#### Docker Container
+
+If your Netdata runs in a Docker container named "netdata" (replace if different), use this command:
+
+```bash
+docker logs netdata 2>&1 | grep vsphere
+```
+
+
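Not part of the patch: a minimal sketch of a `go.d/vsphere.conf` job that applies the tuning advice from the "Performance Impact" section together with the `host_include` and `vm_include` selectors. Only the option names and the selector syntax come from the documentation added by this diff. The concrete values (`update_every: 40`, `timeout: 30`, the `DC1` datacenter) are illustrative assumptions for a hypothetical large installation. `timeout` is assumed to be expressed in seconds.

```yaml
# Hypothetical job definition; every value below is an assumption for illustration.
jobs:
  - name: vcenter1
    url: https://203.0.113.1
    username: admin@vsphere.local
    password: somepassword
    update_every: 40          # assumed: raised from the default of 20 for a big installation
    timeout: 30               # assumed: chosen above the collection time seen in debug mode
    discovery_interval: 300   # default value from the options table
    host_include:
      - '/DC1/*'              # select all hosts from datacenter DC1
    vm_include:
      - '/DC1/*'              # select all VMs from datacenter DC1
```

Running `./go.d.plugin -d -m vsphere` after such a change and comparing the reported "process took" durations with the chosen `update_every` and `timeout` is one way to check whether the values fit a given environment.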