diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-05 11:19:16 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-05-05 12:07:37 +0000 |
commit | b485aab7e71c1625cfc27e0f92c9509f42378458 (patch) | |
tree | ae9abe108601079d1679194de237c9a435ae5b55 /src/health/guides/portcheck/portcheck_connection_fails.md | |
parent | Adding upstream version 1.44.3. (diff) | |
download | netdata-b485aab7e71c1625cfc27e0f92c9509f42378458.tar.xz netdata-b485aab7e71c1625cfc27e0f92c9509f42378458.zip |
Adding upstream version 1.45.3+dfsg.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/health/guides/portcheck/portcheck_connection_fails.md')
-rw-r--r-- | src/health/guides/portcheck/portcheck_connection_fails.md | 32 |
1 files changed, 32 insertions, 0 deletions
diff --git a/src/health/guides/portcheck/portcheck_connection_fails.md b/src/health/guides/portcheck/portcheck_connection_fails.md new file mode 100644 index 000000000..781cf7a01 --- /dev/null +++ b/src/health/guides/portcheck/portcheck_connection_fails.md @@ -0,0 +1,32 @@ +### Understand the alert + +This alert indicates that too many connections are failing to a specific TCP endpoint in the last 5 minutes. It suggests that the monitored service on that endpoint is most likely down, unreachable, or access is being denied by firewall/security rules. + +### Troubleshoot the alert + +1. Check the service + Investigate if the service at the endpoint (specific IP and port) is running as expected. Inspect service logs for issues, error messages, or indications of a shutdown event. + +2. Test the endpoint + Try to establish a connection to the flagged endpoint using tools like `telnet`, `curl`, or `nc`. These tools provide real-time feedback that can help identify problems with the endpoint: + + Example using `telnet`: + ``` + telnet IP_ADDRESS PORT_NUMBER + ``` + +3. Examine firewall and security group rules + Verify if there are any recent changes or newly added firewall/security group rules that might be causing the connectivity issues. Look for any rules that could be blocking the monitored port specifically or the IP range. + +4. Inspect network connectivity + Check the network connectivity between the Netdata Agent and the monitored endpoint. Ensure there are no intermittent network failures or high latency affecting the communication between the two. + +5. Examine the alert configuration + Validate the alert configuration in the `netdata.conf` file to confirm that the alert thresholds and monitored percentage of failed connections are set appropriately. + +6. Check resource utilization + High resource utilization might affect the availability of the monitored endpoint. Check if the system hosting the service has enough resources available (CPU, memory, and storage) to serve incoming requests. + +### Useful resources + +1. [How to use netcat (nc) command: Examples for network testing/debugging](https://www.nixcraft.com/t/how-to-use-netcat-nc-command-examples-for-network-testing-debugging/3332) |