Diffstat (limited to 'health/guides/portcheck/portcheck_connection_fails.md')
-rw-r--r-- health/guides/portcheck/portcheck_connection_fails.md | 32
1 files changed, 32 insertions, 0 deletions
diff --git a/health/guides/portcheck/portcheck_connection_fails.md b/health/guides/portcheck/portcheck_connection_fails.md
new file mode 100644
index 00000000..781cf7a0
--- /dev/null
+++ b/health/guides/portcheck/portcheck_connection_fails.md
@@ -0,0 +1,32 @@
+### Understand the alert
+
+This alert indicates that too many connection attempts to a specific TCP endpoint have failed over the last 5 minutes. The monitored service on that endpoint is most likely down or unreachable, or access to it is being denied by firewall or security rules.
+
+### Troubleshoot the alert
+
+1. Check the service
+ Check whether the service at the endpoint (a specific IP address and port) is running as expected. Inspect the service logs for errors or signs of a shutdown event.
+
+2. Test the endpoint
+ Try to connect to the flagged endpoint with a tool such as `telnet`, `curl`, or `nc`. These tools give immediate feedback (for example, a refused connection versus a timeout) that helps narrow down the problem.
+
+ Example using `telnet`:
+ ```
+ telnet IP_ADDRESS PORT_NUMBER
+ ```
+
+3. Examine firewall and security group rules
+ Check whether recently changed or newly added firewall or security group rules are causing the connectivity issue. Look for rules that block the monitored port specifically or the relevant IP range.
+
+4. Inspect network connectivity
+ Check the network path between the Netdata Agent and the monitored endpoint. Make sure that intermittent network failures or high latency are not disrupting communication between the two.
+
+5. Examine the alert configuration
+ Review the alert configuration (health alerts are typically defined under `health.d/`, for example `health.d/portcheck.conf`, rather than in `netdata.conf`) to confirm that the thresholds for the percentage of failed connections are set appropriately.
+
+6. Check resource utilization
+ High resource utilization can reduce the availability of the monitored endpoint. Check that the system hosting the service has enough free CPU, memory, and storage to serve incoming requests.
+
+### Useful resources
+
+1. [How to use netcat (nc) command: Examples for network testing/debugging](https://www.nixcraft.com/t/how-to-use-netcat-nc-command-examples-for-network-testing-debugging/3332)
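Note on step 2 of the guide above: the manual `telnet` test can also be scripted. The following is a minimal Python sketch. It is not part of the committed guide and not how Netdata itself performs the check. The function name `check_tcp`, the three result strings, the default timeout, and the example address are assumptions chosen for illustration.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome.

    Returns "success", "timeout", or "failed". These labels are
    illustrative assumptions, not an official Netdata API.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "success"
    except socket.timeout:
        # No answer within `timeout` seconds (often a silently dropping firewall).
        return "timeout"
    except OSError:
        # Refused, unreachable, DNS failure, and similar errors.
        return "failed"


if __name__ == "__main__":
    # Hypothetical endpoint; replace with the IP address and port from the alert.
    print(check_tcp("192.0.2.10", 443))
```

A `failed` result that returns immediately usually means the host is reachable but nothing is listening on the port, while a `timeout` more often points to packets being dropped along the path, which helps decide between steps 1, 3, and 4.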