From b485aab7e71c1625cfc27e0f92c9509f42378458 Mon Sep 17 00:00:00 2001
From: Daniel Baumann
Date: Sun, 5 May 2024 13:19:16 +0200
Subject: Adding upstream version 1.45.3+dfsg.

Signed-off-by: Daniel Baumann
---
 src/health/guides/redis/redis_master_link_down.md | 50 +++++++++++++++++++++++
 1 file changed, 50 insertions(+)
 create mode 100644 src/health/guides/redis/redis_master_link_down.md

(limited to 'src/health/guides/redis/redis_master_link_down.md')

diff --git a/src/health/guides/redis/redis_master_link_down.md b/src/health/guides/redis/redis_master_link_down.md
new file mode 100644
index 000000000..5a2d24293
--- /dev/null
+++ b/src/health/guides/redis/redis_master_link_down.md
@@ -0,0 +1,50 @@
+### Understand the alert
+
+The `redis_master_link_down` alert is triggered when the link between a Redis master and its slave has been down for more than 10 seconds. It indicates a problem with replication: until the link is restored, the slave stops receiving updates and its data can diverge from the master's.
+
+### Troubleshoot the alert
+
+1. Check the Redis logs
+
+   Examine the Redis logs for errors related to the disconnection between the master and slave instances. On many installations the log file is `/var/log/redis/redis.log`, but the actual path is set by the `logfile` directive in `redis.conf`. Look for messages about replication, network errors, or timeouts.
+
+   ```
+   grep -i "replication" /var/log/redis/redis.log
+   grep -i "timeout" /var/log/redis/redis.log
+   ```
+
+2. Check the Redis replication status
+
+   Connect to the Redis master using the `redis-cli` tool, and run `INFO REPLICATION` to see the replication state of the master instance:
+
+   ```
+   redis-cli
+   INFO REPLICATION
+   ```
+
+   Also check the replication status on the slave instance. If you know the IP address and port of the slave, connect to it and run the same `INFO REPLICATION` command. On the slave, the `master_link_status` and `master_link_down_since_seconds` fields show whether the link is down and for how long.
+
+3. Verify the network connection between the master and slave instances
+
+   Test network connectivity with `ping` and with `telnet` or `nc`, run from one instance against the other, to confirm that the connection between the master and slave is stable and is not blocked by firewalls or network policies. Replace the placeholders below with the address and Redis port of the other instance.
+
+   ```
+   ping <redis_host>
+   telnet <redis_host> <redis_port>
+   ```
+
+4. Restart the Redis instances (if needed)
+
+   If the Redis instances are misbehaving or cannot reconnect, consider restarting them. Be cautious: a restart can cause data loss or consistency issues. The service name depends on the distribution (for example, `redis-server` instead of `redis`).
+
+   ```
+   sudo systemctl restart redis
+   ```
+
+5. Monitor the situation
+
+   After addressing the potential issues, keep an eye on the Redis instances to make sure the problem does not recur.
+
+### Useful resources
+
+1. [Redis Replication Documentation](https://redis.io/topics/replication)
-- 
cgit v1.2.3
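
The replication check in step 2 of the guide can also be scripted. The following is a minimal sketch, not part of the patch: it assumes the text of `INFO replication` arrives on standard input and relies on the `master_link_status` and `master_link_down_since_seconds` fields that Redis reports on a slave. The function name `replica_link_state` is hypothetical.

```shell
#!/bin/sh
# Sketch only: summarize the replication link state from `INFO replication`
# output read on stdin. `replica_link_state` is a hypothetical helper name.
replica_link_state() {
    # Redis ends INFO lines with CRLF, so drop carriage returns before parsing.
    tr -d '\r' | awk -F: '
        $1 == "master_link_status"             { status = $2 }
        $1 == "master_link_down_since_seconds" { down = $2 }
        END {
            # A master does not report master_link_status at all.
            if (status == "")   { print "not-a-replica"; exit }
            if (status == "up") { print "up"; exit }
            # The down-since field is only present while the link is down.
            if (down == "")     { print "down"; exit }
            print "down for " down "s"
        }'
}
```

It could be used as `redis-cli -h <redis_host> -p <redis_port> INFO replication | replica_link_state`, where the host and port are placeholders for the slave instance. Parsing the text output keeps the sketch independent of any client library, at the cost of depending on the field names staying as they are.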