author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-19 02:57:58 +0000 |
commit | be1c7e50e1e8809ea56f2c9d472eccd8ffd73a97 (patch) | |
tree | 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/ping | |
parent | Initial commit. (diff) | |
Adding upstream version 1.44.3. (upstream/1.44.3, upstream)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'health/guides/ping')
-rw-r--r-- | health/guides/ping/ping_host_latency.md | 48 |
-rw-r--r-- | health/guides/ping/ping_host_reachable.md | 27 |
-rw-r--r-- | health/guides/ping/ping_packet_loss.md | 44 |
3 files changed, 119 insertions, 0 deletions
diff --git a/health/guides/ping/ping_host_latency.md b/health/guides/ping/ping_host_latency.md
new file mode 100644
index 00000000..59ea1be6
--- /dev/null
+++ b/health/guides/ping/ping_host_latency.md
@@ -0,0 +1,48 @@
+### Understand the alert
+
+This alert calculates the average latency (`ping round-trip time`) to a network host (${label:host}) over the last 10 seconds. If you receive this alert, it means there might be issues with your network connectivity or host responsiveness.
+
+### What does latency mean?
+
+Latency is the time it takes for a packet of data to travel from the sender to the receiver, and back from the receiver to the sender. In this case, we're measuring the latency using the `ping` command, which sends an ICMP echo request to the host and then waits for the ICMP echo reply.
+
+### Troubleshoot the alert
+
+1. Double-check the network connection:
+
+   Verify the network connectivity between your system and the target host. Check if the host is accessible via other tools such as `traceroute` or `mtr`.
+
+   ```
+   traceroute ${label:host}
+   mtr ${label:host}
+   ```
+
+2. Check for packet loss:
+
+   Packet loss can make latency appear higher than it actually is. Use the `ping` command to check for packet loss:
+
+   ```
+   ping -c 10 ${label:host}
+   ```
+
+   Look for the percentage of packet loss in the output.
+
+3. Investigate the host:
+
+   If no packet loss is detected and the network connection is stable, the problem might be related to the host itself. Check the host for overloaded resources, such as high CPU usage, disk I/O, or network traffic.
+
+4. Check DNS resolution:
+
+   If the alert's `${label:host}` is a domain name, make sure that DNS resolution is working properly:
+
+   ```
+   nslookup ${label:host}
+   ```
+
+5. Verify firewall and routing:
+
+   Check if any firewall rules or routing policies might be affecting the network traffic between your system and the target host.
+
+### Useful resources
+
+1. [Using Ping and Traceroute to troubleshoot network connectivity](https://support.cloudflare.com/hc/en-us/articles/200169336-Using-Ping-and-Traceroute-to-troubleshoot-network-connectivity)
diff --git a/health/guides/ping/ping_host_reachable.md b/health/guides/ping/ping_host_reachable.md
new file mode 100644
index 00000000..75e24cbe
--- /dev/null
+++ b/health/guides/ping/ping_host_reachable.md
@@ -0,0 +1,27 @@
+### Understand the alert
+
+This `ping_host_reachable` alert checks the network reachability status of a specific host. When you receive this alert, it means that the host is either `up` (reachable) or `down` (unreachable).
+
+### What is network reachability?
+
+Network reachability refers to the ability of a particular host to communicate with other devices or systems within a network. In this alert, the reachability is monitored using the `ping` command, which sends packets to the host and checks for the response. The alert evaluates the packet loss percentage over a 30-second period.
+
+### Troubleshoot the alert
+
+1. Verify if the alert is accurate: Check if there are transient network issues or if there is a problem with the particular host. You can run the `ping` command manually to see if the packet loss percentage is consistent over time.
+
+   ```
+   ping -c 10 <host IP or domain>
+   ```
+
+2. Check the network connectivity: Ensure there are no issues with the local network or the physical connections (switches, routers, etc.). Look for potential network bottlenecks, high traffic, and hardware failures that can affect reachability.
+
+3. Check the host's health: If the host is reachable, log in to the system and examine its performance, stability, and resource usage. Look for indicators of high system load, resource constraints, or unresponsive processes.
+
+4. Examine network security policies and firewalls: Network reachability can be affected by misconfigured firewalls or security policies. Ensure there are no restrictions blocking the communication between the monitoring system and the host.
+
+5. Analyze logs for any relevant information: Check system logs (e.g., `/var/log/syslog`) and application logs on both the monitoring system and the target host. Look for error messages, timeouts, or connectivity problems.
+
+### Useful resources
+
+1. [Understanding High Packet Loss in Networking](https://www.fiberplex.com/blog/understanding-high-packet-loss-in-networking)
diff --git a/health/guides/ping/ping_packet_loss.md b/health/guides/ping/ping_packet_loss.md
new file mode 100644
index 00000000..546ecb00
--- /dev/null
+++ b/health/guides/ping/ping_packet_loss.md
@@ -0,0 +1,44 @@
+### Understand the alert
+
+This alert calculates the `ping packet loss` percentage to the network host over the last 10 minutes. If you receive this alert, it means that your network is experiencing increased packet loss.
+
+### What does ping packet loss mean?
+
+Ping is a command used to test the reachability of a host on a network. It measures the round-trip-time (RTT) for packets sent from the source host to the destination host. Packet loss occurs when these packets are not successfully delivered to their destination.
+
+### Troubleshoot the alert
+
+1. Check for network congestion:
+
+   Excessive network traffic can cause packet loss. Use tools like `iftop`, `nload`, or `bmon` to monitor your network bandwidth usage and identify possible congestion sources.
+
+2. Inspect the network hardware:
+
+   Faulty network hardware like routers, switches, and cables can lead to packet loss. Examine the physical network hardware for possible issues and ensure that all devices are functioning properly.
+
+3. Test the connection to the destination host:
+
+   Use the `ping` command to test the connection to the destination host:
+
+   ```
+   ping <destination_host>
+   ```
+
+   If you experience consistent packet loss, it may indicate an issue with the destination host or the network path leading to it.
+
+4. Check the destination host:
+
+   If the destination host is under heavy load or experiencing issues, it may cause packet loss. Check the host's resources, such as CPU usage, memory usage, and disk space, and resolve any issues if necessary.
+
+5. Investigate possible packet loss causes:
+
+   Some factors that can cause packet loss include network congestion, poor network equipment performance, corrupt data packets, or interference from other devices. Analyze your network traffic and pinpoint the cause of the packet loss.
+
+6. Rectify any identified issues:
+
+   Once you've identified the cause of the packet loss, take appropriate measures to resolve it. This may involve updating network hardware, optimizing network traffic, or fixing issues with the destination host.
+
+### Useful resources
+
+1. [How to Troubleshoot Packet Loss](https://www.lifewire.com/how-to-troubleshoot-packet-loss-on-your-network-4685249)
+2. [Diagnosing Network Issues with MTR](https://www.linode.com/community/questions/17967/diagnosing-network-issues-with-mtr)
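
The guides added in this commit ask the reader to run `ping -c 10` and read the packet loss percentage and round-trip time from its output. As a rough illustration only (this is not part of the commit and not how Netdata's collector works), the sketch below pulls those two numbers out of a ping summary. The helper name `parse_ping_summary`, the regular expressions, and the sample text are assumptions made for this example; the sample follows the summary layout printed by iputils `ping` on Linux and is invented, not measured.

```python
import re

# Matches e.g. "10% packet loss" (Linux iputils) or "10.0% packet loss" (BSD/macOS).
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")

# Matches the RTT summary line; the avg value is the second number.
RTT_RE = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output):
    """Return packet loss (%) and average RTT (ms) from ping summary text.

    Hypothetical helper for illustration. A value is None when the
    corresponding line is absent, e.g. no RTT line at 100% loss.
    """
    loss = LOSS_RE.search(output)
    rtt = RTT_RE.search(output)
    return {
        "packet_loss_pct": float(loss.group(1)) if loss else None,
        "avg_rtt_ms": float(rtt.group(2)) if rtt else None,
    }


# Invented sample in the iputils summary layout, not real measurements.
SAMPLE = (
    "10 packets transmitted, 9 received, 10% packet loss, time 9013ms\n"
    "rtt min/avg/max/mdev = 11.012/12.345/15.678/1.234 ms\n"
)

if __name__ == "__main__":
    print(parse_ping_summary(SAMPLE))
```

In practice the text would come from running `ping -c 10 <host>` by hand or through `subprocess.run`; the parsing is kept separate here so it can be checked without network access.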