summaryrefslogtreecommitdiffstats
path: root/health/guides/consul/consul_autopilot_server_health_status.md
diff options
context:
space:
mode:
Diffstat (limited to '')
-rw-r--r--health/guides/consul/consul_autopilot_server_health_status.md48
1 files changed, 48 insertions, 0 deletions
diff --git a/health/guides/consul/consul_autopilot_server_health_status.md b/health/guides/consul/consul_autopilot_server_health_status.md
new file mode 100644
index 00000000..687c2bb1
--- /dev/null
+++ b/health/guides/consul/consul_autopilot_server_health_status.md
@@ -0,0 +1,48 @@
+### Understand the alert
+
+The `consul_autopilot_server_health_status` alert triggers when a Consul server in your service mesh is marked `unhealthy`. This can affect the overall stability and performance of the service mesh. Regular monitoring and addressing unhealthy servers are crucial in maintaining a smooth functioning environment.
+
+### What is Consul?
+
+`Consul` is a service mesh solution that provides a full-featured control plane with service discovery, configuration, and segmentation functionalities. It is used to connect, secure, and configure services across any runtime platform and public or private cloud.
+
+### Troubleshoot the alert
+
+Follow the steps below to identify and resolve the issue of an unhealthy Consul server:
+
+1. Check Consul server logs
+
+ Inspect the logs of the unhealthy server to identify the root cause of the issue. You can find logs typically in `/var/log/consul` or use `journalctl` with Consul:
+
+ ```
+ journalctl -u consul
+ ```
+
+2. Verify connectivity
+
+ Ensure that the unhealthy server can communicate with other servers in the datacenter. Check for any misconfigurations or network issues.
+
+3. Review server resources
+
+ Monitor the resource usage of the unhealthy server (CPU, memory, disk I/O, network). High resource usage can impact the server's health status. Use tools like `top`, `htop`, `iotop`, or `nload` to monitor the resources.
+
+4. Restart the Consul server
+
+ If the issue persists and you cannot identify the root cause, try restarting the Consul server:
+
+ ```
+ sudo systemctl restart consul
+ ```
+
+5. Refer to Consul's documentation
+
+ Consult the official [Consul troubleshooting documentation](https://developer.hashicorp.com/consul/tutorials/datacenter-operations/troubleshooting) for further assistance.
+
+6. Inspect the Consul UI
+
+ Check the Consul UI for the server health status and any additional information related to the unhealthy server. You can find the Consul UI at `http://<consul-server-ip>:8500/ui/`.
+
+### Useful resources
+
+1. [Consul Documentation](https://www.consul.io/docs)
+2. [Running Consul as a Systemd Service](https://learn.hashicorp.com/tutorials/consul/deployment-guide#systemd-service)