author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
commit | be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97 (patch) | |
tree | 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/consul/consul_autopilot_server_health_status.md | |
parent | Initial commit. (diff) | |
Adding upstream version 1.44.3. (refs: upstream/1.44.3, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r-- | health/guides/consul/consul_autopilot_server_health_status.md | 48 |
1 files changed, 48 insertions, 0 deletions
diff --git a/health/guides/consul/consul_autopilot_server_health_status.md b/health/guides/consul/consul_autopilot_server_health_status.md
new file mode 100644
index 00000000..687c2bb1
--- /dev/null
+++ b/health/guides/consul/consul_autopilot_server_health_status.md
@@ -0,0 +1,48 @@
+### Understand the alert
+
+The `consul_autopilot_server_health_status` alert triggers when a Consul server in your service mesh is marked `unhealthy`. This can affect the overall stability and performance of the service mesh. Regular monitoring and addressing unhealthy servers are crucial in maintaining a smooth functioning environment.
+
+### What is Consul?
+
+`Consul` is a service mesh solution that provides a full-featured control plane with service discovery, configuration, and segmentation functionalities. It is used to connect, secure, and configure services across any runtime platform and public or private cloud.
+
+### Troubleshoot the alert
+
+Follow the steps below to identify and resolve the issue of an unhealthy Consul server:
+
+1. Check Consul server logs
+
+   Inspect the logs of the unhealthy server to identify the root cause of the issue. You can find logs typically in `/var/log/consul` or use `journalctl` with Consul:
+
+   ```
+   journalctl -u consul
+   ```
+
+2. Verify connectivity
+
+   Ensure that the unhealthy server can communicate with other servers in the datacenter. Check for any misconfigurations or network issues.
+
+3. Review server resources
+
+   Monitor the resource usage of the unhealthy server (CPU, memory, disk I/O, network). High resource usage can impact the server's health status. Use tools like `top`, `htop`, `iotop`, or `nload` to monitor the resources.
+
+4. Restart the Consul server
+
+   If the issue persists and you cannot identify the root cause, try restarting the Consul server:
+
+   ```
+   sudo systemctl restart consul
+   ```
+
+5. Refer to Consul's documentation
+
+   Consult the official [Consul troubleshooting documentation](https://developer.hashicorp.com/consul/tutorials/datacenter-operations/troubleshooting) for further assistance.
+
+6. Inspect the Consul UI
+
+   Check the Consul UI for the server health status and any additional information related to the unhealthy server. You can find the Consul UI at `http://<consul-server-ip>:8500/ui/`.
+
+### Useful resources
+
+1. [Consul Documentation](https://www.consul.io/docs)
+2. [Running Consul as a Systemd Service](https://learn.hashicorp.com/tutorials/consul/deployment-guide#systemd-service)
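
The guide added by this commit asks the operator to find the unhealthy server, but does not show how to list it from the command line. The sketch below is not part of the commit. It is a minimal illustration that assumes a Consul agent reachable at `http://127.0.0.1:8500` with ACLs disabled (with ACLs enabled, a token with operator read access would be needed, which this sketch does not handle). It queries Consul's `/v1/operator/autopilot/health` HTTP endpoint and prints the servers whose `Healthy` flag is false. The names `CONSUL_ADDR`, `fetch_autopilot_health`, and `unhealthy_servers` are hypothetical helpers chosen for this example.

```python
import json
import urllib.error
import urllib.request

# Assumption: a local Consul agent on the default HTTP port, ACLs disabled.
CONSUL_ADDR = "http://127.0.0.1:8500"


def unhealthy_servers(health):
    """Return the names of servers whose Autopilot `Healthy` flag is false.

    `health` is the decoded JSON body of /v1/operator/autopilot/health.
    """
    return [
        server.get("Name", "<unknown>")
        for server in health.get("Servers", [])
        if not server.get("Healthy", False)
    ]


def fetch_autopilot_health(addr=CONSUL_ADDR, timeout=5.0):
    """Fetch and decode the Autopilot health report from a Consul agent."""
    url = f"{addr}/v1/operator/autopilot/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as err:
        # Consul replies with HTTP 429 when the cluster is unhealthy,
        # but the response body still carries the JSON report.
        if err.code == 429:
            return json.load(err)
        raise


if __name__ == "__main__":
    try:
        report = fetch_autopilot_health()
    except (OSError, ValueError) as exc:
        # Covers connection failures, HTTP errors, and non-JSON replies.
        print(f"Could not read Autopilot health from {CONSUL_ADDR}: {exc}")
    else:
        bad = unhealthy_servers(report)
        if bad:
            print("Unhealthy servers:", ", ".join(bad))
        else:
            print("All servers report healthy.")
```

Keeping the parsing in `unhealthy_servers` separate from the HTTP call makes it possible to check the logic against a saved JSON report without a running cluster.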