author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-03-09 13:19:22 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-03-09 13:19:22 +0000
commit: c21c3b0befeb46a51b6bf3758ffa30813bea0ff0
tree: 9754ff1ca740f6346cf8483ec915d4054bc5da2d /health/guides/vernemq
parent: Adding upstream version 1.43.2.
Adding upstream version 1.44.3. (upstream/1.44.3)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to '')
-rw-r--r--  health/guides/vernemq/vernemq_average_scheduler_utilization.md  66
-rw-r--r--  health/guides/vernemq/vernemq_cluster_dropped.md  49
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_connack_sent_reason_unsuccessful.md  20
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_disconnect_received_reason_not_normal.md  40
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_disconnect_sent_reason_not_normal.md  45
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_puback_received_reason_unsuccessful.md  33
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_puback_sent_reason_unsuccessful.md  32
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_puback_unexpected.md  34
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubcomp_received_reason_unsuccessful.md  26
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubcomp_sent_reason_unsuccessful.md  35
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubcomp_unexpected.md  29
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_publish_auth_errors.md  36
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_publish_errors.md  44
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubrec_invalid_error.md  34
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubrec_received_reason_unsuccessful.md  26
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubrec_sent_reason_unsuccessful.md  30
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubrel_received_reason_unsuccessful.md  43
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_pubrel_sent_reason_unsuccessful.md  49
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_subscribe_auth_error.md  37
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_subscribe_error.md  58
-rw-r--r--  health/guides/vernemq/vernemq_mqtt_unsubscribe_error.md  39
-rw-r--r--  health/guides/vernemq/vernemq_netsplits.md  44
-rw-r--r--  health/guides/vernemq/vernemq_queue_message_drop.md  53
-rw-r--r--  health/guides/vernemq/vernemq_queue_message_expired.md  53
-rw-r--r--  health/guides/vernemq/vernemq_queue_message_unhandled.md  41
-rw-r--r--  health/guides/vernemq/vernemq_socket_errors.md  33
26 files changed, 1029 insertions, 0 deletions
diff --git a/health/guides/vernemq/vernemq_average_scheduler_utilization.md b/health/guides/vernemq/vernemq_average_scheduler_utilization.md
new file mode 100644
index 000000000..5e5bc6d43
--- /dev/null
+++ b/health/guides/vernemq/vernemq_average_scheduler_utilization.md
@@ -0,0 +1,66 @@
+### Understand the alert
+
+This alert is related to VerneMQ, which is an MQTT broker. The Netdata Agent calculates VerneMQ's average scheduler utilization over the last 10 minutes. If you receive this alert, it means that scheduler utilization is high, which may indicate performance issues or resource constraints.
+
+### What does scheduler utilization mean?
+
+VerneMQ runs on the Erlang virtual machine, which uses schedulers (by default one per CPU core) to run its tasks and processes. In this context, scheduler utilization represents the degree to which those schedulers are busy. High scheduler utilization may delay task processing, leading to performance degradation and possibly affecting the proper functioning of the MQTT broker.
+
+### Troubleshoot the alert
+
+- Verify the VerneMQ scheduler utilization
+
+1. To check the scheduler utilization, you can use the `vmq-admin` command like this:
+
+ ```
+ vmq-admin metrics show | grep scheduler
+ ```
+
+ This command will display the scheduler utilization percentage.
+
+- Analyze the VerneMQ MQTT traffic
+
+1. To analyze the MQTT traffic, use the `vmq-admin session show` subcommand. It lists the current client sessions and their status:
+
+ ```
+ vmq-admin session show
+ ```
+
+ This can help you identify if there is any abnormal activity or an increase in the number of clients or subscriptions that may be affecting the scheduler's performance.
+
+- Evaluate VerneMQ system resources
+
+1. Assess CPU and memory usage of the VerneMQ process using the `top` or `htop` commands:
+
+ ```
+ top -p "$(pgrep -d, -f vernemq)"
+ ```
+
+ This will show you the CPU and memory usage for the VerneMQ process. If the process is consuming too many resources, it might be affecting the scheduler's utilization.
+
+2. Evaluate the system's available resources (CPU, memory, and I/O) using commands like `vmstat`, `free`, and `iostat`.
+
+ ```
+ vmstat
+ free
+ iostat
+ ```
+
+ These commands can help you understand if your system's resources are nearing their limits or if there are any bottlenecks affecting the overall performance.
+
+3. Check the VerneMQ logs for any errors or warnings. The default location for VerneMQ logs is `/var/log/vernemq`. Look for messages that may indicate issues affecting the scheduler's performance.
+
+- Optimize VerneMQ performance or adjust resources
+
+1. If the MQTT traffic is high or has increased recently, consider scaling up your VerneMQ instance by adding more resources (CPU or memory) or by distributing the load across multiple nodes.
+
+2. If your system resources are limited, consider optimizing your VerneMQ configuration to improve performance. Some example options include adjusting the `max_online_messages`, `max_inflight_messages`, or `queue_deliver_mode`.
+
+3. If the alert persists even after evaluating and making changes to the above steps, consult the VerneMQ documentation or community for further assistance.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [VerneMQ Administration Guide](https://vernemq.com/docs/administration/)
+3. [VerneMQ Configuration Guide](https://vernemq.com/docs/configuration/)
\ No newline at end of file
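The scheduler check in the guide above can be sketched in Python as follows. This is a minimal illustration, not part of Netdata or VerneMQ: the function name `average_scheduler_utilization` is hypothetical, and the line format `gauge.system_utilization_scheduler_<N> = <value>` is an assumption about what `vmq-admin metrics show` prints on your version, so adjust the pattern if your output differs.

```python
import re

# Assumed line format (verify against your own `vmq-admin metrics show` output):
#   gauge.system_utilization_scheduler_3 = 12
SCHEDULER_LINE = re.compile(
    r"^\s*gauge\.system_utilization_scheduler_(\d+)\s*=\s*(\d+(?:\.\d+)?)\s*$"
)


def average_scheduler_utilization(metrics_text):
    """Return the mean of all per-scheduler utilization values, or None if none are found."""
    values = []
    for line in metrics_text.splitlines():
        match = SCHEDULER_LINE.match(line)
        if match:
            values.append(float(match.group(2)))
    if not values:
        return None
    return sum(values) / len(values)
```

To use it, capture the output of `vmq-admin metrics show` (for example with `subprocess.run`) and pass the text to the function. Averaging across schedulers mirrors the alert's description; the 10-minute window is applied by Netdata and is not reproduced here.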
diff --git a/health/guides/vernemq/vernemq_cluster_dropped.md b/health/guides/vernemq/vernemq_cluster_dropped.md
new file mode 100644
index 000000000..0bdc6f08d
--- /dev/null
+++ b/health/guides/vernemq/vernemq_cluster_dropped.md
@@ -0,0 +1,49 @@
+### Understand the alert
+
+This alert indicates that VerneMQ, an MQTT broker, is experiencing issues with inter-node message delivery within a clustered environment. The Netdata agent calculates the amount of traffic dropped during communication with cluster nodes in the last minute. If you receive this alert, it means that the outgoing cluster buffer is full and some messages cannot be delivered.
+
+### What do dropped messages mean?
+
+Dropped messages occur when the outgoing cluster buffer becomes full, and VerneMQ cannot deliver messages between its nodes. This can happen due to a remote node being down or unreachable, causing the buffer to fill up and preventing efficient message delivery.
+
+### Troubleshoot the alert
+
+1. Check the connectivity and status of cluster nodes
+
+ Verify that all cluster nodes are up, running and reachable. Use `vmq-admin cluster show` to get an overview of the cluster nodes and their connectivity status.
+
+ ```
+ vmq-admin cluster show
+ ```
+
+2. Investigate logs for any errors or warnings
+
+ Inspect the logs of the VerneMQ node(s) for any errors or warning messages. This can provide insight into any potential problems related to the cluster or network.
+
+ ```
+ sudo journalctl -u vernemq
+ ```
+
+3. Increase the buffer size
+
+ If the issue persists, consider increasing the buffer size. Adjust the `outgoing_clustering_buffer_size` value in the `vernemq.conf` file.
+
+ ```
+ outgoing_clustering_buffer_size = <new_buffer_size>
+ ```
+
+ Replace `<new_buffer_size>` with a larger value, for example, doubling the current buffer size. After updating the configuration, restart the VerneMQ service to apply the changes.
+
+ ```
+ sudo systemctl restart vernemq
+ ```
+
+4. Monitor the dropped messages
+
+ Continue to monitor the dropped messages using Netdata, and check if the issue is resolved after increasing the buffer size.
+
+### Useful resources
+
+1. [VerneMQ Documentation - Clustering](https://vernemq.com/docs/clustering/)
+2. [VerneMQ Logging and Monitoring](https://docs.vernemq.com/monitoring-vernemq/logging)
+3. [Managing VerneMQ Configuration](https://docs.vernemq.com/configuration/)
\ No newline at end of file
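The "double the current buffer size" suggestion in the guide above can be sketched as a small helper. This is only an illustration under stated assumptions: the function name `doubled_buffer_line` is hypothetical, and it assumes the setting appears uncommented in `vernemq.conf` as `outgoing_clustering_buffer_size = <integer>`.

```python
import re

# Matches an uncommented setting line such as:
#   outgoing_clustering_buffer_size = 10000
SETTING = re.compile(
    r"^[ \t]*outgoing_clustering_buffer_size[ \t]*=[ \t]*(\d+)[ \t]*$",
    re.MULTILINE,
)


def doubled_buffer_line(conf_text):
    """Return a replacement config line with twice the current value, or None if unset."""
    match = SETTING.search(conf_text)
    if match is None:
        return None
    return f"outgoing_clustering_buffer_size = {int(match.group(1)) * 2}"
```

The helper only proposes the new line; it does not edit the file or restart the service. If it returns `None`, the setting is commented out or absent and the broker is using its built-in default.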
diff --git a/health/guides/vernemq/vernemq_mqtt_connack_sent_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_connack_sent_reason_unsuccessful.md
new file mode 100644
index 000000000..d68db0d1c
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_connack_sent_reason_unsuccessful.md
@@ -0,0 +1,20 @@
+### Understand the alert
+
+This alert is triggered when there is a significant increase in the number of unsuccessful v3/v5 CONNACK packets sent by the VerneMQ broker within the last minute. A higher-than-normal rate of unsuccessful CONNACKs indicates that clients are experiencing difficulties establishing a connection with the MQTT broker.
+
+### What is a CONNACK packet?
+
+A CONNACK packet is an acknowledgment packet sent by the MQTT broker to a client in response to a CONNECT packet. It tells the client whether the connection was accepted or rejected, as indicated by the return code (MQTT v3) or reason code (MQTT v5). An unsuccessful CONNACK packet indicates a rejected connection.
+
+### Troubleshoot the alert
+
+1. **Check VerneMQ logs**: Inspect the VerneMQ logs for error messages or reasons why the connections are being rejected. By default, these logs are located at `/var/log/vernemq/console.log` and `/var/log/vernemq/error.log`. Look for entries with "CONNACK" and discern the cause of the unsuccessful connections.
+
+2. **Diagnose client configuration issues**: Analyze the rejected connection attempts' client configurations, such as incorrect credentials, unsupported protocol versions, or security settings. Debug the client-side applications, fix the configurations, and try reconnecting to the MQTT broker.
+
+3. **Evaluate broker capacity**: Check the system resources and settings of the VerneMQ broker. An overloaded broker or insufficient system resources, such as CPU and memory, can cause connection rejections. Optimize the VerneMQ configuration, upgrade the broker's hardware, or distribute the load between multiple brokers to resolve the issue.
+
+4. **Assess network issues**: Verify the network topology, firewalls, and router settings to ensure clients can reach the MQTT broker. Network latency or misconfigurations can lead to unsuccessful CONNACKs. Use monitoring tools such as `ping`, `traceroute`, or `netstat` to diagnose network issues and assess connectivity between clients and the broker.
+
+5. **Verify security settings and permissions**: Check the VerneMQ broker's security settings, including access control lists (ACL), user permissions, and authentication/authorization settings. Restricted access or incorrect permissions can lead to connection rejections. Update the security settings accordingly and test the connection again.
+
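To make the return codes mentioned in the CONNACK guide above concrete, here is a small sketch that tallies rejected connections by reason. The code table lists the MQTT v3.1.1 CONNACK return codes. The function name `summarize_rejections` and the idea of collecting codes from client logs are assumptions for illustration only.

```python
from collections import Counter

# MQTT v3.1.1 CONNACK return codes.
CONNACK_V3_RETURN_CODES = {
    0: "connection accepted",
    1: "unacceptable protocol version",
    2: "identifier rejected",
    3: "server unavailable",
    4: "bad user name or password",
    5: "not authorized",
}


def summarize_rejections(return_codes):
    """Count rejected connections by reason, ignoring accepted ones (code 0)."""
    counts = Counter()
    for code in return_codes:
        if code != 0:
            reason = CONNACK_V3_RETURN_CODES.get(code, f"unknown code {code}")
            counts[reason] += 1
    return dict(counts)
```

A tally dominated by codes 4 and 5 points toward the credential and permission checks in steps 2 and 5, while code 3 points toward the capacity check in step 3. MQTT v5 uses a different, larger set of reason codes that this table does not cover.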
diff --git a/health/guides/vernemq/vernemq_mqtt_disconnect_received_reason_not_normal.md b/health/guides/vernemq/vernemq_mqtt_disconnect_received_reason_not_normal.md
new file mode 100644
index 000000000..014c5b0cf
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_disconnect_received_reason_not_normal.md
@@ -0,0 +1,40 @@
+### Understand the alert
+
+This alert is triggered when the number of not normal v5 DISCONNECT packets received by VerneMQ in the last minute is above a certain threshold. This indicates that there is an issue with MQTT clients connecting to your VerneMQ MQTT broker that requires attention.
+
+### What does not normal mean?
+
+In the context of this alert, "not normal" refers to v5 DISCONNECT packets that were received with a reason code other than "normal disconnection", as specified in the MQTT v5 protocol. Normal disconnection refers to clients disconnecting gracefully without any issues.
+
+### Troubleshoot the alert
+
+1. Inspect VerneMQ logs
+
+ Check the VerneMQ logs for any relevant information about the MQTT clients that are experiencing not normal disconnects. This can provide important context to identify the root cause of the issue.
+
+ ```
+ sudo journalctl -u vernemq
+ ```
+
+2. Check the MQTT clients
+
+ Investigate the MQTT clients that are experiencing not normal disconnects. This may involve inspecting client logs or usage patterns, as well as verifying that the clients are using the correct MQTT version (v5) and have the appropriate configurations.
+
+3. Monitor VerneMQ metrics
+
+ Use the VerneMQ metrics to monitor the broker's performance and identify any sudden spikes in abnormal disconnects or other relevant metrics.
+
+ To view the VerneMQ metrics, query the broker's HTTP metrics endpoint, usually available at `http://<your_vernemq_address>:8888/metrics`.
+
+4. Review network conditions
+
+ Verify that there are no networking issues between the MQTT clients and the VerneMQ MQTT broker, as these issues could cause MQTT clients to disconnect unexpectedly.
+
+5. Review VerneMQ configuration
+
+ Review your VerneMQ configuration to ensure it is correctly set up to handle the expected MQTT client load and usage patterns.
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/)
+2. [MQTT v5 specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html)
diff --git a/health/guides/vernemq/vernemq_mqtt_disconnect_sent_reason_not_normal.md b/health/guides/vernemq/vernemq_mqtt_disconnect_sent_reason_not_normal.md
new file mode 100644
index 000000000..7bbc1ba16
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_disconnect_sent_reason_not_normal.md
@@ -0,0 +1,45 @@
+### Understand the alert
+
+This alert indicates that VerneMQ, a high-performance, distributed MQTT message broker, is sending an abnormal number of v5 DISCONNECT packets in the last minute. This may signify an issue in the MQTT messaging system and impact the functioning of IoT devices or other MQTT clients connected to VerneMQ.
+
+### What does an abnormal v5 DISCONNECT packet mean?
+
+In MQTT v5, the DISCONNECT packet is sent by a client or server to indicate that the connection is being closed. A "not normal" DISCONNECT packet is one sent with a reason code other than "Normal Disconnection" (0x00). The causes behind these reason codes might include:
+
+- Protocol errors
+- Invalid DISCONNECT payloads
+- Authorization or authentication violations
+- Exceeded keep-alive timers
+- Server/connection errors
+- User-triggered disconnects
+
+A high number of not normal DISCONNECT packets might indicate an issue in your MQTT infrastructure, misconfigured clients, or security breaches.
+
+### Troubleshoot the alert
+
+1. **Inspect VerneMQ logs**: VerneMQ logs can provide detailed information about connections, disconnections, and possible issues. Check the VerneMQ logs for errors and information about unusual disconnects.
+
+ ```
+ cat /var/log/vernemq/console.log
+ cat /var/log/vernemq/error.log
+ ```
+
+2. **Monitor VerneMQ status**: Use the `vmq-admin` command-line tool to monitor VerneMQ and view its runtime status. Check the number of connected clients, subscriptions, and sessions.
+
+ ```
+ sudo vmq-admin cluster show
+ sudo vmq-admin session show
+ sudo vmq-admin listener show
+ ```
+
+3. **Check clients and configurations**: Review client configurations for potential errors, like incorrect authentication credentials, misconfigured keep-alive timers, or invalid packet formats. If possible, isolate problematic clients and test their behavior.
+
+4. **Consider resource limitations**: If your VerneMQ instance is reaching resource limitations (CPU, memory, network), it might automatically terminate some connections to maintain performance. Monitor system resources using the `top` command or tools like Netdata.
+
+5. **Evaluate security**: If the issue persists, consider checking the security of your MQTT infrastructure. Investigate possible cyber threats, such as a DDoS attack or unauthorized clients attempting to connect.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://docs.vernemq.com/)
+2. [MQTT v5 Specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html)
+3. [MQTT Essentials Part 9: Last Will and Testament](https://www.hivemq.com/blog/mqtt-essentials-part-9-last-will-and-testament/)
\ No newline at end of file
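The distinction the guide above draws between normal and "not normal" DISCONNECT packets can be sketched as follows. The table is a subset of the MQTT v5 DISCONNECT reason codes, and the function name `describe_disconnect` is hypothetical. How VerneMQ labels these codes in its own metrics is not shown here.

```python
NORMAL_DISCONNECTION = 0x00

# A subset of the MQTT v5 DISCONNECT reason codes.
DISCONNECT_REASONS = {
    0x00: "Normal disconnection",
    0x04: "Disconnect with Will Message",
    0x80: "Unspecified error",
    0x81: "Malformed Packet",
    0x82: "Protocol Error",
    0x87: "Not authorized",
    0x89: "Server busy",
    0x8B: "Server shutting down",
    0x8D: "Keep Alive timeout",
    0x8E: "Session taken over",
}


def describe_disconnect(reason_code):
    """Return (reason name, is_not_normal) for a DISCONNECT reason code."""
    name = DISCONNECT_REASONS.get(reason_code, f"other (0x{reason_code:02X})")
    # Following the guide's definition: anything other than 0x00 is "not normal".
    return name, reason_code != NORMAL_DISCONNECTION
```

Grouping observed reason codes this way helps decide which troubleshooting step to start with: keep-alive timeouts suggest network or client timer issues, while "Not authorized" suggests reviewing credentials and ACLs.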
diff --git a/health/guides/vernemq/vernemq_mqtt_puback_received_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_puback_received_reason_unsuccessful.md
new file mode 100644
index 000000000..f7b506669
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_puback_received_reason_unsuccessful.md
@@ -0,0 +1,33 @@
+### Understand the alert
+
+This alert tracks the number of `unsuccessful v5 PUBACK packets` received by the VerneMQ broker within the last minute. If you receive this alert, there might be an issue with your MQTT clients or the packets they send to the VerneMQ broker.
+
+### What are v5 PUBACK packets?
+
+In MQTT v5, the `PUBACK` packet is sent by the receiver of a QoS 1 `PUBLISH` packet (the broker or a subscribing client) to acknowledge receipt. It can carry a reason code indicating whether the message was processed successfully or an error occurred.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ logs: Analyze the logs to check for any errors or issues related to the MQTT clients or the incoming messages. VerneMQ's logs are usually located in the `/var/log/vernemq/` directory; otherwise, check the log location in the VerneMQ configuration files.
+
+ ```
+ less /var/log/vernemq/console.log
+ less /var/log/vernemq/error.log
+ ```
+
+2. Verify MQTT clients' configurations: Review your MQTT clients' settings to ensure that they are configured correctly, especially the protocol version, QoS levels, and any MQTT v5 specific settings. Make any necessary adjustments and restart the clients.
+
+3. Monitor VerneMQ performance: Use the VerneMQ `vmq-admin` tool to monitor the broker's performance, check connections, subscriptions, and session information. This can help you identify potential issues affecting the processing of incoming messages.
+
+ ```
+ vmq-admin metrics show
+ vmq-admin session show
+ vmq-admin listener show
+ ```
+
+4. Check the `PUBLISH` messages: Inspect the contents of `PUBLISH` messages being sent by the MQTT clients to ensure they are correctly formatted and adhere to the MQTT v5 protocol specifications. If necessary, correct any issues and send test messages to confirm the problem is resolved.
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/)
+2. [MQTT v5.0 Specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html)
diff --git a/health/guides/vernemq/vernemq_mqtt_puback_sent_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_puback_sent_reason_unsuccessful.md
new file mode 100644
index 000000000..85a06a220
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_puback_sent_reason_unsuccessful.md
@@ -0,0 +1,32 @@
+### Understand the alert
+
+This alert is related to VerneMQ, an MQTT message broker. If you receive this alert, it means that an increasing number of unsuccessful v5 PUBACK packets have been sent in the last minute.
+
+### What does "unsuccessful v5 PUBACK" mean?
+
+In the MQTT protocol, when a client sends a Publish message with a Quality of Service (QoS) level 1, the message broker sends a PUBACK packet to acknowledge receipt of the message. However, MQTT v5 has added a reason code field in the PUBACK packet, allowing brokers to report any issues or errors that occurred during message delivery. An "unsuccessful v5 PUBACK" refers to a PUBACK packet that reports a delivery problem or issue.
+
+### Troubleshoot the alert
+
+1. Check VerneMQ logs for possible errors or warnings: VerneMQ logs can provide valuable insights into the broker's runtime behavior, including connection issues or problems with authentication/authorization. Look for errors or warnings in the logs that could indicate the cause of the unsuccessful PUBACK packets.
+
+ ```
+ sudo journalctl -u vernemq
+ ```
+
+2. Verify client connections: Connection issues can be a possible cause of unsuccessful PUBACK packets. Use the `vmq-admin session show` command to view the client connections, and check for any abnormal behavior (e.g., frequent disconnects and reconnects).
+
+ ```
+ sudo vmq-admin session show
+ ```
+
+3. Check MQTT client logs: Review the logs from the devices that connect to your VerneMQ broker instance to verify if they encounter any issues or errors when sending messages.
+
+4. Monitor the broker's resources usage: High system load or insufficient resources may affect VerneMQ's performance and prevent it from processing PUBACK packets as expected. Use monitoring tools like `top` and `iotop` to observe CPU and I/O usage, and assess whether the broker has enough resources to handle the MQTT traffic.
+
+5. Update VerneMQ configuration: Double-check your VerneMQ settings for any misconfiguration related to QoS, message storage, or security policies that could prevent PUBACK packets from being sent or processed successfully.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [MQTT Version 5 Features](https://www.hivemq.com/blog/mqtt-5-foundational-changes-in-the-protocol/)
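The reason code field described in the guide above can be illustrated with a short sketch. The table lists the MQTT v5 PUBACK reason codes; per the specification, values of 0x80 and above indicate failure. The function names are hypothetical, and this does not show how VerneMQ itself buckets codes such as 0x10 in its metrics.

```python
# MQTT v5 PUBACK reason codes.
PUBACK_REASONS = {
    0x00: "Success",
    0x10: "No matching subscribers",
    0x80: "Unspecified error",
    0x83: "Implementation specific error",
    0x87: "Not authorized",
    0x90: "Topic Name invalid",
    0x91: "Packet identifier in use",
    0x97: "Quota exceeded",
    0x99: "Payload format invalid",
}


def is_unsuccessful(reason_code):
    """In MQTT v5, reason codes of 0x80 or greater indicate failure."""
    return reason_code >= 0x80


def unsuccessful_pubacks(reason_codes):
    """Return the names of the failing reason codes, in the order they were seen."""
    return [
        PUBACK_REASONS.get(code, hex(code))
        for code in reason_codes
        if is_unsuccessful(code)
    ]
```

Knowing which failing code dominates narrows the troubleshooting steps: "Not authorized" points to security policies, while "Quota exceeded" points to resource or rate limits.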
diff --git a/health/guides/vernemq/vernemq_mqtt_puback_unexpected.md b/health/guides/vernemq/vernemq_mqtt_puback_unexpected.md
new file mode 100644
index 000000000..b2541e867
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_puback_unexpected.md
@@ -0,0 +1,34 @@
+### Understand the alert
+
+This alert is related to VerneMQ, a high-performance MQTT broker. It monitors the number of unexpected v3/v5 PUBACK packets received in the last minute. If you receive this alert, it means that the broker is receiving PUBACK packets that do not correspond to a message it is waiting to have acknowledged, which could indicate an issue with your MQTT broker or your MQTT client application(s).
+
+### What are PUBACK packets?
+
+In the MQTT (Message Queuing Telemetry Transport) protocol, PUBACK packets are acknowledgment packets sent by the receiver of a PUBLISH message with QoS (Quality of Service) level 1 to confirm its receipt. The broker sends a PUBACK for messages that clients publish, and clients send a PUBACK for messages that the broker delivers to them. The sender keeps the message in flight until this acknowledgment arrives.
+
+### Troubleshoot the alert
+
+1. Check VerneMQ logs for any unusual events, errors, or issues that could be related to the PUBACK packets. The VerneMQ logs can be found in `/var/log/vernemq` by default, or any custom location defined in the configuration file.
+
+ ```
+ sudo tail -f /var/log/vernemq/console.log
+ ```
+
+2. Investigate your MQTT client application(s) to ensure they are handling the PUBLISH messages correctly and not causing duplicate or unexpected PUBACK packets. You can use an MQTT client library that supports QoS level 1 to eliminate the possibility of custom code not following the MQTT protocol properly.
+
+3. Monitor your MQTT broker and client application(s) for any network connectivity issues that could cause unexpected PUBACK packets. You can use tools like `ping` and `traceroute` to check the network connectivity between the MQTT broker and client application(s).
+
+4. Analyze the load and performance of your MQTT broker using the various metrics provided by VerneMQ. You can access the VerneMQ status and metrics using the `vmq-admin` command:
+
+ ```
+ sudo vmq-admin metrics show
+ ```
+
+ Look for any unusual spikes or bottlenecks that could cause unexpected PUBACK packets in the output.
+
+5. If none of the above steps resolve the issue, consider reaching out to the VerneMQ community or opening a GitHub issue to seek further assistance.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [Understanding MQTT QoS Levels](https://www.hivemq.com/blog/mqtt-essentials-part-6-mqtt-quality-of-service-levels/)
diff --git a/health/guides/vernemq/vernemq_mqtt_pubcomp_received_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubcomp_received_reason_unsuccessful.md
new file mode 100644
index 000000000..5bdfd5b38
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubcomp_received_reason_unsuccessful.md
@@ -0,0 +1,26 @@
+### Understand the alert
+
+This alert indicates that the VerneMQ broker has received an increased number of unsuccessful MQTT v5 PUBCOMP (Publish Complete) packets in the last minute. The PUBCOMP packet is the fourth and final packet in the QoS 2 publish flow, so this alert means that there are issues in the MQTT message delivery process at Quality of Service (QoS) level 2, which could lead to message loss or duplicated messages.
+
+### What does an unsuccessful PUBCOMP mean?
+
+A PUBCOMP packet is the response to a PUBREL packet and carries a Reason Code indicating the outcome. MQTT v5 defines two values for PUBCOMP: 0x00 (Success) and 0x92 (Packet Identifier not found). An unsuccessful PUBCOMP therefore means that the sender of the PUBCOMP did not recognize the Packet Identifier in the PUBREL it received, for example because its session state was lost or the flow had already completed.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ error logs: VerneMQ logs can provide valuable information on encountered errors or any misconfiguration that leads to unsuccessful PUBCOMP messages. Generally, their location is `/var/log/vernemq/console.log`, `/var/log/vernemq/error.log`, and `/var/log/vernemq/crash.log`.
+
+2. Review MQTT clients' logs: Inspect the logs of the MQTT clients that are publishing or subscribing to the messages on the VerneMQ broker. This may help you identify specific clients causing the problem or any pattern associated with unsuccessful PUBCOMP messages.
+
+3. Verify the Quality of Service (QoS) level: PUBCOMP packets are exchanged only in QoS 2 flows, so confirm that the affected clients publish and subscribe with the QoS level you expect. If necessary, adjust the settings for the MQTT clients to match the expected QoS level.
+
+4. Investigate authorization and access control: In a QoS 2 flow, authorization failures are reported with Reason Code 0x87 (Not authorized) in PUBREC packets. If these appear alongside the unsuccessful PUBCOMP packets, verify that the MQTT clients involved have the correct permissions to publish and subscribe to the topics in question. Make sure that the VerneMQ Access Control List (ACL) or external authentication mechanisms are correctly configured.
+
+5. Monitor network connectivity: Unsuccessful PUBCOMP messages could be due to network issues between the MQTT clients and the VerneMQ broker. Monitor and analyze network latency or packet loss between clients and the VerneMQ server to identify any potential issues.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [MQTT v5 Specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html)
+3. [Troubleshooting VerneMQ](https://vernemq.com/docs/guide/introduction/troubleshooting/)
+4. [VerneMQ ACL Configuration](https://vernemq.com/docs/configuration/acl.html)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_pubcomp_sent_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubcomp_sent_reason_unsuccessful.md
new file mode 100644
index 000000000..cc71b739b
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubcomp_sent_reason_unsuccessful.md
@@ -0,0 +1,35 @@
+### Understand the alert
+
+This alert indicates that the number of unsuccessful v5 PUBCOMP (Publish Complete) packets sent within the last minute has increased. VerneMQ is an MQTT broker, which plays a crucial role in managing and processing the message flow between MQTT clients. If you receive this alert, it implies that there are issues in the message flow, which might affect the communication between MQTT clients and the broker.
+
+### What does PUBCOMP mean?
+
+In the MQTT protocol, PUBCOMP is the fourth and final packet in the Quality of Service (QoS) 2 protocol exchange. The flow consists of PUBLISH, PUBREC (Publish Received), PUBREL (Publish Release), and PUBCOMP packets. PUBCOMP is sent by the receiver (MQTT client or broker) to confirm that it has received and processed the PUBREL packet. Unsuccessful PUBCOMP packets indicate that the receiver was not able to complete the exchange properly.
+
+### Troubleshoot the alert
+
+- Check VerneMQ logs for errors or warnings
+
+ VerneMQ logs can provide valuable information about issues with the message flow. Locate the log file (usually at `/var/log/vernemq/console.log`) and inspect it for any error messages or warnings related to the PUBCOMP packet or its predecessors (PUBLISH, PUBREC, PUBREL) in the QoS 2 flow.
+
+- Identify problematic MQTT clients
+
+ Analyze the logs to identify the MQTT clients that are frequently involved in unsuccessful PUBCOMP packets exchange. These clients might have connection or configuration issues that lead to unsuccessful PUBCOMP packets.
+
+- Validate MQTT clients configurations
+
+ Ensure that the MQTT clients involved in unsuccessful PUBCOMP packets have valid configurations and that they are compatible with the broker (VerneMQ). Check parameters such as QoS level, protocol version, authentication, etc.
+
+- Monitor VerneMQ metrics
+
+ Use Netdata or other monitoring tools to observe VerneMQ metrics and identify unusual patterns in the broker's performance. Increased load on the broker, high memory or CPU usage, slow response times, or network hiccups might contribute to unsuccessful PUBCOMP packets.
+
+- Ensure proper MQTT payload size
+
+ Unsuccessful PUBCOMP packets can be caused by an oversized payload or an incorrect Packet Identifier. Verify that the payload size stays within the maximum message size that the broker and clients accept, and that Packet Identifiers are used as the MQTT protocol specifies.
+
+### Useful resources
+
+1. [VerneMQ - Troubleshooting](https://vernemq.com/docs/troubleshooting/)
+2. [MQTT Protocol Specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html)
+3. [VerneMQ - Monitoring](https://vernemq.com/docs/monitoring/)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_pubcomp_unexpected.md b/health/guides/vernemq/vernemq_mqtt_pubcomp_unexpected.md
new file mode 100644
index 000000000..ab4932177
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubcomp_unexpected.md
@@ -0,0 +1,29 @@
+### Understand the alert
+
+This alert is related to VerneMQ, a high-performance MQTT message broker. It monitors the number of unexpected PUBCOMP (publish complete) packets received in the last minute. If you receive this alert, it means there's an issue with the MQTT message flow between clients and the broker, which might lead to data inconsistencies.
+
+### What are PUBCOMP packets?
+
+In MQTT, the PUBCOMP packet is used when QoS (Quality of Service) 2 is applied. It's the fourth and final packet in the four-packet flow to ensure that messages are delivered exactly once. An unexpected PUBCOMP packet means that the client or the broker received a PUBCOMP packet that it didn't expect in the message flow, which can cause issues in processing the message correctly.
+
+### Troubleshoot the alert
+
+1. Inspect the VerneMQ logs: Check the VerneMQ logs for any error messages or unusual activity that could indicate a problem with the message flow. By default, VerneMQ logs are located in `/var/log/vernemq/`, but this might be different for your system.
+
+ ```
+ sudo tail -f /var/log/vernemq/console.log
+ sudo tail -f /var/log/vernemq/error.log
+ ```
+
+2. Identify problematic clients: Inspect the MQTT client logs to identify which clients are causing the unexpected PUBCOMP packets. Some MQTT client libraries provide logging features, while others might require debugging or setting a higher log level.
+
+3. Check QoS settings: Ensure that the clients and the MQTT broker have the same QoS settings to avoid inconsistencies in the four-packet flow.
+
+4. Monitor the VerneMQ metrics: Use Netdata or other monitoring tools to keep an eye on MQTT message flows and observe any anomalies that require further investigation.
+
+5. Update client libraries and VerneMQ: Ensure that all MQTT client libraries and the VerneMQ server are up-to-date to avoid any incompatibilities or bugs that could lead to unexpected behavior.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/documentation/)
+2. [MQTT Specification - MQTT Control Packets](https://docs.oasis-open.org/mqtt/mqtt/v5.0/os/mqtt-v5.0-os.html#_Toc3901046)
diff --git a/health/guides/vernemq/vernemq_mqtt_publish_auth_errors.md b/health/guides/vernemq/vernemq_mqtt_publish_auth_errors.md
new file mode 100644
index 000000000..46bc7d312
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_publish_auth_errors.md
@@ -0,0 +1,36 @@
+### Understand the alert
+
+This alert is triggered when the Netdata Agent detects a spike in unauthorized MQTT v3/v5 `PUBLISH` attempts in the last minute on your VerneMQ broker. If you receive this alert, it means that clients may be attempting to publish messages without valid credentials or without permission to publish to the topic, which could indicate a misconfiguration or a potential security risk.
+
+### What are MQTT and VerneMQ?
+
+MQTT (Message Queuing Telemetry Transport) is a lightweight, publish-subscribe protocol designed for low-bandwidth, high-latency, or unreliable networks. VerneMQ is a high-performance, distributed MQTT broker that supports a wide range of industry standards and can handle millions of clients.
+
+### Troubleshoot the alert
+
+1. Verify the clients' credentials
+
+ To check if the clients are using the correct credentials while connecting and publishing to the VerneMQ broker, inspect their log files or debug messages to find authentication-related issues.
+
+2. Review VerneMQ broker configuration
+
+ Ensure that the VerneMQ configuration allows for proper authentication of clients. Verify that the correct authentication plugins and settings are enabled. The configuration file is usually located at `/etc/vernemq/vernemq.conf`. For more information on VerneMQ config, please refer to [VerneMQ documentation](https://vernemq.com/docs/configuration/index.html).
+
+3. Analyze VerneMQ logs
+
+ Inspect the VerneMQ logs to identify unauthorized attempts and assess any potential risks. The logs typically reside in the `/var/log/vernemq` directory, and you can tail the logs using the following command:
+
+ ```
+ tail -f /var/log/vernemq/console.log
+ ```
+
+4. Configure firewall rules
+
+ If you find unauthorized or suspicious IP addresses attempting to connect to your VerneMQ broker, consider blocking those addresses using firewall rules to prevent unauthorized access.
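Before writing firewall rules, it helps to know which addresses appear most often in the relevant log lines. The following is a minimal sketch under two assumptions: that your VerneMQ log lines contain the client's IPv4 address, and that you know a text pattern identifying the failures (the exact message wording depends on your VerneMQ version and plugins). The function name `top_source_ips` is hypothetical.

```shell
# Print the ten most frequent IPv4 addresses in a log file, busiest first.
# Usage: top_source_ips <log_file> [pattern]
top_source_ips() {
    log_file=$1
    pattern=${2:-.}   # optional filter; "." matches every non-empty line
    grep -i -- "$pattern" "$log_file" |
        grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}' |   # keep only the IPv4 addresses
        sort | uniq -c | sort -rn | head -n 10     # count and rank them
}
```

A call such as `top_source_ips /var/log/vernemq/console.log 'publish'` prints a count next to each address. Treat the result as a starting point for investigation, since a shared NAT address can hide many legitimate clients.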
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/index.html)
+2. [Getting started with MQTT](https://mqtt.org/getting-started/)
+3. [MQTT Security Fundamentals](https://www.hivemq.com/mqtt-security-fundamentals/)
+4. [VerneMQ configuration options](https://vernemq.com/docs/configuration/)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_publish_errors.md b/health/guides/vernemq/vernemq_mqtt_publish_errors.md
new file mode 100644
index 000000000..9b57b1a74
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_publish_errors.md
@@ -0,0 +1,44 @@
+### Understand the alert
+
+This alert monitors the number of failed v3/v5 PUBLISH operations in the last minute for VerneMQ, an MQTT broker. If you receive this alert, it means that there is an issue with the MQTT message publishing process in your VerneMQ broker.
+
+### What is MQTT?
+
+MQTT (Message Queuing Telemetry Transport) is a lightweight messaging protocol designed for constrained devices and low-bandwidth, high-latency, or unreliable networks. It is based on the publish-subscribe model, where clients (devices or applications) subscribe to topics and publish messages to them.
+
+### What is VerneMQ?
+
+VerneMQ is a high-performance, distributed MQTT message broker. It is designed to handle thousands of concurrent clients while providing low latency and high throughput.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ log files for any error messages or warnings related to the MQTT PUBLISH operation failures. The log files are usually located in the `/var/log/vernemq` directory.
+
+ ```
+ sudo tail -f /var/log/vernemq/console.log
+ ```
+
+2. Check VerneMQ metrics to identify any bottlenecks in the system's performance. You can do this by using the `vmq-admin` tool, which comes with VerneMQ. Run the following command to get an overview of the broker's performance:
+
+ ```
+ sudo vmq-admin metrics show
+ ```
+
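+ Pay attention to the metrics related to PUBLISH operation failures, such as `mqtt_publish_error`. Exact metric names can differ between VerneMQ versions, so search the output for `publish` if that name is missing.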
+
+3. Assess the performance of connected clients. Use the `vmq-admin` tool to list client connections along with details like the client's state and the number of published messages:
+
+ ```
+ sudo vmq-admin session show --client_id --is_online --is_authenticated --session_publish_errors
+ ```
+
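+ Investigate the clients with `session_publish_errors` to find out if there's an issue with specific clients. The columns that `vmq-admin session show` accepts depend on your VerneMQ version; run `vmq-admin session show --help` to list them.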
+
+4. Review your MQTT topic configuration, such as the retained flag, QoS levels, and the permissions for publishing to ensure your setup aligns with the intended behavior.
+
+5. If the issue persists or requires further investigation, consider examining the network conditions, such as latency or connection issues, which might hinder the MQTT PUBLISH operation's efficiency.
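To watch a single counter from step 2 instead of scanning the full metrics dump, a small filter can pull out one value. This is a sketch that assumes `vmq-admin metrics show` prints lines of the form `counter.mqtt_publish_error = 12`; the helper name `metric_value` is hypothetical, and the metric name should be adjusted to whatever your VerneMQ version reports.

```shell
# Read `vmq-admin metrics show` output on stdin and print the value of one metric.
# Assumes lines of the form "counter.mqtt_publish_error = 12".
metric_value() {
    awk -v name="$1" '
        {
            key = $1
            sub(/^(counter|gauge)\./, "", key)   # drop the type prefix if present
            if (key == name && $2 == "=") { print $3; exit }
        }'
}
```

Typical use would be `sudo vmq-admin metrics show | metric_value mqtt_publish_error`. Running it twice, a minute apart, and comparing the two numbers shows whether errors are still accumulating.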
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/)
+2. [An introduction to MQTT](https://www.hivemq.com/mqtt-essentials/)
diff --git a/health/guides/vernemq/vernemq_mqtt_pubrec_invalid_error.md b/health/guides/vernemq/vernemq_mqtt_pubrec_invalid_error.md
new file mode 100644
index 000000000..47cd0fefc
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubrec_invalid_error.md
@@ -0,0 +1,34 @@
+### Understand the alert
+
+This alert is triggered when the Netdata Agent detects an unexpected increase in the number of invalid MQTT v3 `PUBREC` packets received by VerneMQ during the last minute. VerneMQ is an MQTT broker that is essential for message distribution in IoT applications. MQTT v3 is one of the protocol versions that MQTT brokers support.
+
+### What does an invalid PUBREC packet mean?
+
+`PUBREC` is a control packet in the MQTT protocol that acknowledges receipt of a `PUBLISH` packet. This packet is used during Quality of Service (QoS) level 2 message delivery, ensuring that the message is received exactly once. An invalid `PUBREC` packet means that VerneMQ has received a `PUBREC` packet that contains incorrect, unexpected, or duplicate data.
+
+### Troubleshoot the alert
+
+- Check VerneMQ logs
+
+ Investigate the VerneMQ logs to see if there are any error messages or warnings related to the processing of `PUBREC` packets. The logs can be found in `/var/log/vernemq/console.log` or `/usr/local/var/log/vernemq/console.log`. Look for any entries with specific error messages mentioning `PUBREC`.
+
+- Check MQTT Clients
+
+ Monitor the MQTT clients that are connected to the VerneMQ broker to identify which clients are sending invalid `PUBREC` packets. Check the logs or monitoring systems of those clients to understand the root cause of the problem. They might be experiencing issues or bugs causing them to send incorrect `PUBREC` packets.
+
+- Check the MQTT topics
+
+ Monitor the MQTT topics with high levels of QoS 2 message delivery and determine if a specific topic is causing the spike in invalid `PUBREC` packets.
+
+- Upgrade or fix MQTT Clients
+
+ If the issue arises from specific client implementations, consider upgrading the MQTT client libraries, fixing any configuration issues or reporting the bug to the appropriate development teams.
+
+- Review VerneMQ configuration
+
+ Verify that the VerneMQ broker configuration is set up correctly and that MQTT v3 protocol is enabled. If necessary, adjust the configuration to better handle the volume of QoS 2 messages being processed.
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/index.html)
+2. [MQTT v3.1.1 specification](http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html)
diff --git a/health/guides/vernemq/vernemq_mqtt_pubrec_received_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubrec_received_reason_unsuccessful.md
new file mode 100644
index 000000000..b01dc9fbb
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubrec_received_reason_unsuccessful.md
@@ -0,0 +1,26 @@
+### Understand the alert
+
+This alert indicates that the number of received unsuccessful v5 `PUBREC` packets in the last minute is higher than expected. VerneMQ is an open-source MQTT broker. MQTT is a lightweight messaging protocol for small sensors and mobile devices optimized for high-latency or unreliable networks. `PUBREC` is an MQTT packet that is part of the quality of service 2 (QoS 2) message flow for MQTT publish/subscribe model. An unsuccessful `PUBREC` could mean that there are issues with the MQTT messages being processed by the MQTT broker.
+
+### What does PUBREC mean?
+
+`PUBREC` stands for "Publish Received." In MQTT, it is part of the QoS 2 message flow to ensure end-to-end delivery of a message between clients (publishers) and subscribers connected to an MQTT broker. When a client sends a `PUBLISH` message with QoS 2, the broker acknowledges the receipt with a `PUBREC` message.
+
+### Troubleshoot the alert
+
+To address this alert and identify the root cause, follow these steps:
+
+1. **Check the VerneMQ log files**: Inspect the VerneMQ log files to find any issues or errors related to the processing of MQTT messages. Look for messages related to `PUBREC` or QoS 2 issues. The logs are typically located at `/var/log/vernemq/console.log` or `/var/log/vernemq/error.log`.
+
+2. **Monitor the VerneMQ metrics**: Check VerneMQ metrics using tools like `vmq-admin` to get insights into the broker's performance and message statistics. The command `vmq-admin metrics show` provides various metrics, including the number of received `PUBREC` and the number of unsuccessful `PUBREC` messages.
+
+3. **Verify the publisher's configuration**: Check the configuration of the MQTT clients (publishers) that are sending the QoS 2 messages to ensure a proper message flow. It's crucial to confirm that the clients are using the correct version of MQTT and adhere to the limitations set by MQTT v5, like the packet size or the maximum topic aliases used.
+
+4. **Identify unsupported features**: Some MQTT brokers may not support all MQTT v5 features. Verify that the publisher's MQTT library supports MQTT v5 features in use, such as user properties or message expiration interval, and that it is compatible with VerneMQ.
+
+5. **Analyze network conditions**: Unreliable network conditions or high traffic load may cause unsuccessful MQTT messages. Evaluate the network and identify any issues causing packet loss or latency. Often, improving the network conditions, migrating the broker/server to a stronger network, or adjusting the user's connection settings can help with such issues.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [MQTT v5 Specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/cs02/mqtt-v5.0-cs02.html)
diff --git a/health/guides/vernemq/vernemq_mqtt_pubrec_sent_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubrec_sent_reason_unsuccessful.md
new file mode 100644
index 000000000..9b1976494
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubrec_sent_reason_unsuccessful.md
@@ -0,0 +1,30 @@
+### Understand the alert
+
+This alert monitors the number of sent unsuccessful v5 PUBREC packets in the last minute in the VerneMQ MQTT broker. If you receive this alert, it means that there is an issue with successfully acknowledging receipt of PUBLISH packets in the MQTT system.
+
+### What does PUBREC mean?
+
+In the MQTT protocol, when a client sends a PUBLISH message with Quality of Service (QoS) level 2, it expects an acknowledgment from the server in the form of a PUBREC (Publish Received) message. This confirms the successful receipt of the PUBLISH message by the server. If a PUBREC message is marked as unsuccessful, it indicates a problem with the message acknowledgment process.
+
+### Troubleshoot the alert
+
+1. Check VerneMQ log files for any errors or warnings related to unsuccessful PUBREC messages. VerneMQ logs can be found in `/var/log/vernemq` (by default) or the directory specified in your configuration file.
+
+ ```
+ sudo tail -f /var/log/vernemq/console.log
+ sudo tail -f /var/log/vernemq/error.log
+ ```
+
+2. Verify if any clients are having issues with the MQTT connection, such as intermittent network problems or misconfigured settings. Check the client logs for any issues and take appropriate action.
+
+3. Review the MQTT QoS settings for the clients in the system. If possible, consider lowering the QoS level to 1 or 0, which uses fewer resources and less bandwidth. QoS level 2 might not be necessary for some use cases.
+
+4. Inspect the VerneMQ system and environment for resource bottlenecks or other performance issues. Use tools like `top`, `htop`, `vmstat`, or `iotop` to monitor system resources and identify any potential problems.
+
+5. If the issue persists, consider seeking support from the VerneMQ community or the software vendor for further assistance.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/documentation.html)
+2. [MQTT Essentials – All Core MQTT Concepts explained](https://www.hivemq.com/mqtt-essentials/)
+3. [Understanding QoS Levels in MQTT](https://www.hivemq.com/blog/mqtt-essentials-part-6-mqtt-quality-of-service-levels/)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_pubrel_received_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubrel_received_reason_unsuccessful.md
new file mode 100644
index 000000000..67a54f0c3
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubrel_received_reason_unsuccessful.md
@@ -0,0 +1,43 @@
+### Understand the alert
+
+This alert monitors the number of unsuccessful v5 `PUBREL` packets received in the last minute by the VerneMQ MQTT broker. If you receive this alert, it means that there were unsuccessful `PUBREL` attempts in VerneMQ, which might indicate an issue during the message delivery process.
+
+### What are MQTT and PUBREL?
+
+MQTT (Message Queuing Telemetry Transport) is a lightweight, low-overhead messaging protocol built on a publish-subscribe model. It relies on a broker, like VerneMQ, to route messages between clients.
+
+A `PUBREL` packet is the third one in a QoS-2 (Quality of Service level 2) message flow. QoS-2 is the highest available level in MQTT and strives to provide once-and-only-once message delivery to subscribers. The `PUBREL` packet is sent by the publisher to acknowledge its receipt of a `PUBREC` packet and signal that it is OK to release the message.
+
+An unsuccessful `PUBREL` packet indicates that the message release process encountered issues and may not have been completed as expected.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ broker logs for any unusual messages:
+
+ ```
+ sudo journalctl -u vernemq
+ ```
+
+ Look for errors or warnings that might be related to the unsuccessful `PUBREL` packets.
+
+2. Examine the configuration files of VerneMQ:
+
+ ```
+ cat /etc/vernemq/vernemq.conf
+ ```
+
+ Check if there are any misconfigurations or unsupported features that could cause issues with QoS-2 message flow. Refer to the [VerneMQ Documentation](https://docs.vernemq.com/configuration/introduction) for correct configurations.
+
+3. Analyze the clients' logs, which can be publishers or subscribers, for any errors or issues related to MQTT connections and QoS levels. Make sure the clients are using the correct QoS levels and are following the MQTT protocol.
+
+4. Monitor VerneMQ's RAM, CPU, and file descriptor usage to determine if the broker's performance is degraded. Resolve any performance bottlenecks or resource constraints to prevent further unsuccessful `PUBREL` packets.
+
+5. For in-depth analysis, enable VerneMQ's debug logs by setting `log.console.level` to `debug` in its configuration file and restarting the service. Be cautious, as this might generate large amounts of log data.
+
+6. If the issue persists, consider reaching out to the VerneMQ support channels, such as their [GitHub](https://github.com/vernemq/vernemq) repository.
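The configuration change in step 5 can be scripted so that it is easy to apply and to revert. The sketch below assumes the `key = value` layout used by `vernemq.conf`; the helper name `set_conf_value` is hypothetical, and the function keeps a `.bak` copy of the file it edits.

```shell
# Set "key = value" in a VerneMQ-style config file, keeping a .bak copy.
# Usage: set_conf_value <key> <value> <file>
# Note: dots in <key> are treated as regex wildcards, which is harmless for
# ordinary setting names but worth knowing.
set_conf_value() {
    key=$1
    value=$2
    file=$3
    cp "$file" "$file.bak"
    if grep -Eq "^[[:space:]]*$key[[:space:]]*=" "$file"; then
        # Replace the existing, uncommented assignment.
        sed "s|^[[:space:]]*$key[[:space:]]*=.*|$key = $value|" "$file.bak" > "$file"
    else
        # Key not set yet: append it.
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}
```

Running `set_conf_value log.console.level debug /etc/vernemq/vernemq.conf` as root and then restarting the service enables debug logging. Repeat the call with the value from the `.bak` copy once the investigation is over to keep log volume under control.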
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://docs.vernemq.com/)
+2. [MQTT Essentials](https://www.hivemq.com/mqtt-essentials/)
+3. [Understanding MQTT QoS Levels - Part 1](https://www.hivemq.com/blog/mqtt-essentials-part-6-mqtt-quality-of-service-levels/)
diff --git a/health/guides/vernemq/vernemq_mqtt_pubrel_sent_reason_unsuccessful.md b/health/guides/vernemq/vernemq_mqtt_pubrel_sent_reason_unsuccessful.md
new file mode 100644
index 000000000..18e85e12a
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_pubrel_sent_reason_unsuccessful.md
@@ -0,0 +1,49 @@
+### Understand the alert
+
+This alert is related to VerneMQ, a high-performance MQTT broker. It monitors the number of unsuccessful v5 `PUBREL` packets sent in the last minute. If you receive this alert, it means that there was an issue with sending `PUBREL` packets in your VerneMQ instance.
+
+### What does PUBREL mean?
+
+`PUBREL` (Publish Release) is the MQTT control packet that the sender of a QoS 2 `PUBLISH` sends in response to a `PUBREC`; for this alert, the sender is the broker delivering a message to a client. It is the third message in the QoS 2 (Quality of Service level 2) protocol exchange, where QoS 2 ensures that a message is delivered exactly once. An unsuccessful v5 `PUBREL` packet carries a reason code other than Success (for example, Packet Identifier not found), which means the exchange did not complete as expected.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ logs:
+
+ VerneMQ logs can give you valuable information about possible errors that might have occurred during the processing of `PUBREL` packets. Look for any error messages or traces related to the `PUBREL` packets in the logs.
+
+ ```
+ sudo journalctl -u vernemq -f
+ ```
+
+ Alternatively, if you're using a custom log location:
+
+ ```
+ tail -f /path/to/custom/log
+ ```
+
+2. Check the MQTT client-side logs:
+
+ Check the logs of the MQTT client that might have caused the unsuccessful `PUBREL` packets. Look for any connection issues, error messages, or traces related to the MQTT protocol exchanges.
+
+3. Ensure proper configuration for VerneMQ:
+
+ Verify that the VerneMQ configuration settings related to QoS 2 protocol timeouts and retries are correctly set. Check the VerneMQ [documentation](https://docs.vernemq.com/configuration) for guidance on the proper configuration.
+
+ ```
+ cat /etc/vernemq/vernemq.conf
+ ```
+
+4. Monitor VerneMQ metrics:
+
+ Use Netdata to monitor VerneMQ metrics to analyze the MQTT server's performance and resource usage. This can help you identify possible issues with the server.
+
+5. Address network or service issues:
+
+ If the above steps don't resolve the alert, look for possible network or service-related issues that might be causing the unsuccessful `PUBREL` packets. This could require additional investigation based on your specific infrastructure and environment.
+
+### Useful resources
+
+1. [VerneMQ - Official Documentation](https://docs.vernemq.com/)
+2. [MQTT Essentials: Quality of Service 2 (QoS 2)](https://www.hivemq.com/blog/mqtt-essentials-part-6-mqtt-quality-of-service-levels/)
+3. [Netdata - VerneMQ monitoring](https://learn.netdata.cloud/docs/agent/collectors/go.d.plugin/modules/vernemq)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_subscribe_auth_error.md b/health/guides/vernemq/vernemq_mqtt_subscribe_auth_error.md
new file mode 100644
index 000000000..b80118730
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_subscribe_auth_error.md
@@ -0,0 +1,37 @@
+### Understand the alert
+
+This alert indicates that there have been unauthorized MQTT (Message Queuing Telemetry Transport) v3/v5 SUBSCRIBE attempts in the last minute. This could mean that there are clients trying to subscribe to topics without proper authentication or authorization in your VerneMQ broker.
+
+### What does unauthorized subscribe mean?
+
+In the MQTT protocol, clients can subscribe to topics to receive messages published by other clients to the broker. An unauthorized subscribe occurs when a client tries to subscribe to a topic but does not have the required permissions or has not provided valid credentials.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ logs for unauthorized subscribe attempts:
+
+ The first step in troubleshooting this issue is to check the VerneMQ logs to identify the source of the unauthorized attempts. Look for log messages related to authentication or authorization errors in the log files (`/var/log/vernemq/console.log` or `/var/log/vernemq/error.log`).
+
+ Example log message:
+ ```
+ date time [warning] <client_id>@<client_IP> MQTT SUBSCRIBE authorization failure for user "<username>", topic "<topic_name>"
+ ```
+
+2. Verify client authentication and authorization configuration:
+
+ Check the client configurations to ensure they have the correct credentials (username and password) and are authorized to subscribe to the intended topics. Remember that topic permissions are case-sensitive and might have wildcards. Update the client configurations if necessary and restart the MQTT clients.
+
+3. Review the VerneMQ broker configurations:
+
+ Verify the authentication and authorization plugins or settings in the VerneMQ broker (`/etc/vernemq/vernemq.conf` or `/etc/vernemq/vmq.acl` for access control). Make sure the settings are correctly configured to allow the clients to subscribe to the intended topics. Update the configurations if necessary and restart the VerneMQ broker.
+
+4. Monitor the unauthorized subscribe attempts using the Netdata dashboard or configuration file:
+
+ Continue monitoring the unauthorized subscribe attempts using the Netdata dashboard or by configuring the alert thresholds in the Netdata configuration file. This will help you track the issue and ensure that the problem has been resolved.
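Because step 2 depends on how wildcards and case sensitivity work, the sketch below shows the standard MQTT matching rules for `+` and `#` as a small shell function. It is a simplified illustration, not VerneMQ's ACL engine: it does not validate the filter, and it ignores the special handling of topics that start with `$`. The function name `topic_matches` is hypothetical.

```shell
# Return 0 if an MQTT topic filter matches a topic name, 1 otherwise.
# "+" matches exactly one level, "#" matches all remaining levels
# (including the parent level itself). Comparison is case-sensitive.
topic_matches() {
    filter=$1
    topic=$2
    while :; do
        f_level=${filter%%/*}   # current level of the filter
        t_level=${topic%%/*}    # current level of the topic
        case $f_level in
            '#') return 0 ;;
            '+') ;;
            *) [ "$f_level" = "$t_level" ] || return 1 ;;
        esac
        # Step to the next level, remembering whether one exists.
        case $filter in */*) filter=${filter#*/}; f_more=1 ;; *) f_more=0 ;; esac
        case $topic in */*) topic=${topic#*/}; t_more=1 ;; *) t_more=0 ;; esac
        if [ "$f_more" -eq 0 ] && [ "$t_more" -eq 0 ]; then return 0; fi
        if [ "$f_more" -eq 0 ]; then return 1; fi   # topic has extra levels
        if [ "$t_more" -eq 0 ]; then
            # Topic ended first: only a trailing "#" still matches.
            if [ "$filter" = '#' ]; then return 0; else return 1; fi
        fi
    done
}
```

For instance, `topic_matches 'sensors/+/temp' 'sensors/a/temp'` succeeds, while `topic_matches 'Sensors/a' 'sensors/a'` fails because of the capital letter. Checking a client's requested filter against its permitted patterns this way can quickly show why a subscription is rejected.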
+
+### Useful resources
+
+1. [VerneMQ documentation](https://vernemq.com/docs/)
+2. [MQTT v3.1.1 specification](https://docs.oasis-open.org/mqtt/mqtt/v3.1.1/os/mqtt-v3.1.1-os.html)
+3. [MQTT v5.0 specification](https://docs.oasis-open.org/mqtt/mqtt/v5.0/mqtt-v5.0.html)
+4. [Understanding MQTT topic permissions and wildcards](http://www.steves-internet-guide.com/understanding-mqtt-topics/)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_subscribe_error.md b/health/guides/vernemq/vernemq_mqtt_subscribe_error.md
new file mode 100644
index 000000000..f14d18d55
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_subscribe_error.md
@@ -0,0 +1,58 @@
+### Understand the alert
+
+This alert is related to `VerneMQ`, the open-source, distributed MQTT message broker. If you receive this alert, it means that the number of failed v3/v5 `SUBSCRIBE` operations has increased in the last minute.
+
+### What do v3 and v5 SUBSCRIBE operations mean?
+
+MQTT v3 and v5 are different versions of the MQTT protocol, used for the Internet of Things (IoT) devices and their communication. The `SUBSCRIBE` operation allows a client (device) to subscribe to a specific topic and receive messages published under that topic.
+
+### Troubleshoot the alert
+
+- Check the VerneMQ logs
+
+1. Identify the location of the VerneMQ logs. The default location is `/var/log/vernemq`. If you have changed the default location, you can find it in the `vernemq.conf` file by looking for `log.console.file` and `log.error.file`.
+
+ ```
+ grep log.console.file /etc/vernemq/vernemq.conf
+ grep log.error.file /etc/vernemq/vernemq.conf
+ ```
+
+2. Analyze the logs for any errors or issues related to the `SUBSCRIBE` operation:
+
+ ```
+ tail -f /path/to/vernemq/logs
+ ```
+
+- Check the system resources
+
+1. Check the available resources (RAM and CPU) on your system:
+
+ ```
+ top
+ ```
+
+2. If you find that the system resources are low, consider adding more resources or stopping unnecessary processes/applications.
+
+- Check the client-side logs
+
+1. Most MQTT clients (e.g., Mosquitto, Paho, MQTT.js) can produce logs that help you identify issues related to the `SUBSCRIBE` operation. Enable or raise the client's log level if nothing is being recorded.
+
+2. Analyze the client logs for errors in connecting, subscribing, or receiving messages from the MQTT broker.
+
+- Analyze the topics and subscriptions
+
+1. Verify if there are any invalid, restricted, or forbidden topics in your MQTT broker.
+
+2. Check the ACLs (Access Control Lists) and client authentication settings in your VerneMQ `vernemq.conf` file.
+
+ ```
+ grep -E '^(allow_anonymous|vmq_acl.acl_file|vmq_passwd.password_file)' /etc/vernemq/vernemq.conf
+ ```
+
+3. Ensure the `ACLs` and authentication configuration are correct and allow the clients to subscribe to the required topics.
+
+### Useful resources
+
+1. [VerneMQ Administration](https://vernemq.com/docs/administration/)
+2. [VerneMQ Configuration](https://vernemq.com/docs/configuration/)
+3. [VerneMQ Logging](https://vernemq.com/docs/guide/internals.html#logging)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_mqtt_unsubscribe_error.md b/health/guides/vernemq/vernemq_mqtt_unsubscribe_error.md
new file mode 100644
index 000000000..55feb0a17
--- /dev/null
+++ b/health/guides/vernemq/vernemq_mqtt_unsubscribe_error.md
@@ -0,0 +1,39 @@
+### Understand the alert
+
+This alert monitors the number of failed v3/v5 `UNSUBSCRIBE` operations in VerneMQ in the last minute. If you receive this alert, it means that there is a significant number of failed `UNSUBSCRIBE` operations, which may impact the MQTT messaging on your system.
+
+### What is VerneMQ?
+
+VerneMQ is a high-performance, distributed MQTT message broker. It provides scalable and reliable communication for Internet of Things (IoT) systems and applications.
+
+### What is an MQTT UNSUBSCRIBE operation?
+
+An `UNSUBSCRIBE` operation in MQTT protocol is a request sent by a client to the server to remove one or more topics from the subscription list. It allows clients to stop receiving messages for particular topics.
+
+### Troubleshoot the alert
+
+1. Check VerneMQ logs for any error messages or indications of issues with the `UNSUBSCRIBE` operation:
+
+ ```
+ sudo journalctl -u vernemq
+ ```
+
+ Alternatively, you may find the logs in `/var/log/vernemq/` directory, if using the default configuration:
+
+ ```
+ cat /var/log/vernemq/console.log
+ cat /var/log/vernemq/error.log
+ ```
+
+2. Review the VerneMQ configuration to ensure it is properly set up. The default configuration file is located at `/etc/vernemq/vernemq.conf`. Make sure that the settings are correct, especially those related to the MQTT protocol version and the supported QoS levels.
+
+3. Monitor the VerneMQ metrics using the `vmq-admin metrics show` command. This will provide you with an overview of the broker's performance and help you identify any abnormal metrics that could be related to the failed `UNSUBSCRIBE` operations:
+
+ ```
+ sudo vmq-admin metrics show
+ ```
+
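+ Pay attention to the `mqtt_unsubscribe_error` metric, which indicates the number of failed `UNSUBSCRIBE` operations. The exact name can differ between VerneMQ versions, so search the output for `unsubscribe` if it is missing.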
+
+4. Check the MQTT clients that are sending the `UNSUBSCRIBE` requests. It is possible that the client itself is misconfigured or has some faulty logic in its communication with the MQTT broker. Review the client's logs and configuration to identify any issues.
+
diff --git a/health/guides/vernemq/vernemq_netsplits.md b/health/guides/vernemq/vernemq_netsplits.md
new file mode 100644
index 000000000..15d4d4498
--- /dev/null
+++ b/health/guides/vernemq/vernemq_netsplits.md
@@ -0,0 +1,44 @@
+### Understand the alert
+
+This alert indicates that your VerneMQ cluster has experienced a netsplit (split-brain) situation within the last minute. This can lead to inconsistencies in the cluster, and you need to troubleshoot the problem to maintain proper cluster operation.
+
+### What is a netsplit?
+
+In distributed systems, a netsplit occurs when a cluster of nodes loses connectivity to one or more nodes due to a network failure, leaving the cluster to operate in a degraded state. In the context of VerneMQ, a netsplit can lead to inconsistencies in the subscription data and retained messages.
+
+### Troubleshoot the alert
+
+- Confirm the alert issue
+
+ Review the VerneMQ logs to check for any signs of network partitioning or netsplits.
+
+- Check connectivity between nodes
+
+ Ensure that the network connectivity between your cluster nodes is restored. You can use tools like `ping` and `traceroute` to verify network connectivity.
+
+- Inspect node status
+
+ Use the `vmq-admin cluster show` command to inspect the current status of the nodes in the VerneMQ cluster, and check for any disconnected nodes:
+
+ ```
+ vmq-admin cluster show
+ ```
+
+- Reestablish connections and heal partitions
+
+ If a node is disconnected, reconnect it using the `vmq-admin cluster join` command:
+
+ ```
+ vmq-admin cluster join discovery-node=IP_ADDRESS_OF_ANOTHER_NODE
+ ```
+
+ As soon as the partition is healed, and connectivity is reestablished, the VerneMQ nodes will replicate the latest changes made to the subscription data.
+
+- Ensure node connectivity remains active
+
+ Monitor the cluster and network to maintain consistent connectivity between the nodes. Set up monitoring tools and consider using an auto-healing or auto-scaling framework to help maintain node connectivity.
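As part of ongoing monitoring, the node status check can be reduced to a list of nodes that need attention. This is a sketch that assumes `vmq-admin cluster show` prints its usual two-column `|Node|Running|` table; the helper name `nodes_not_running` is hypothetical.

```shell
# Read `vmq-admin cluster show` output on stdin and print every node whose
# "Running" column is not "true". Assumes a "|Node|Running|" table layout.
nodes_not_running() {
    awk -F'|' '
        NF >= 4 {                       # table rows only; border lines have no "|"
            node = $2
            running = $3
            gsub(/[[:space:]]/, "", node)
            gsub(/[[:space:]]/, "", running)
            if (node != "Node" && running != "true") print node
        }'
}
```

Used as `vmq-admin cluster show | nodes_not_running`, it prints nothing when every node is running, which makes it easy to call from a periodic check.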
+
+### Useful resources
+
+1. [VerneMQ Clustering Guide: Netsplits](https://docs.vernemq.com/v/master/vernemq-clustering/netsplits)
+2. [VerneMQ Documentation](https://docs.vernemq.com/)
diff --git a/health/guides/vernemq/vernemq_queue_message_drop.md b/health/guides/vernemq/vernemq_queue_message_drop.md
new file mode 100644
index 000000000..0b97c6b7a
--- /dev/null
+++ b/health/guides/vernemq/vernemq_queue_message_drop.md
@@ -0,0 +1,53 @@
+### Understand the alert
+
+This alert monitors the number of dropped messages in VerneMQ due to full message queues within the last minute. If you receive this alert, it means that message queues are full and VerneMQ is dropping messages. This can be a result of slow consumers, slow VerneMQ performance, or fast publishers.
+
+### Troubleshoot the alert
+
+1. Check the message queue length and performance metrics of VerneMQ
+
+ Inspect the broker's queue-related metrics, including the counters for dropped messages, by using the command:
+
+ ```
+ vmq-admin metrics show | grep queue
+ ```
+
+ You can also monitor the CPU utilization and memory usage of the VerneMQ process by using the `top` command:
+
+ ```
+ top
+ ```
+
+2. Identify slow consumers, slow VerneMQ, or fast publishers
+
+ Analyze the message flow and performance data to determine if the issue is caused by slow consumers, slow VerneMQ performance, or fast publishers.
+
+ - Slow Consumers: If you identify slow consumers, consider optimizing their processing capabilities or scaling them to handle more load.
+ - Slow VerneMQ: If VerneMQ itself is slow, consider optimizing its configuration, increasing resources, or scaling the nodes in the cluster.
+ - Fast Publishers: If fast publishers are causing the issue, consider rate-limiting them or breaking their input into smaller chunks.
+
+3. Increase the queue length or adjust max_online_messages
+
+ If increasing the capacity of your infrastructure is not a viable solution, consider increasing the queue length or adjusting the `max_online_messages` value in VerneMQ. This can help mitigate the issue of dropped messages due to full queues.
+
+ Update the VerneMQ configuration file (`vernemq.conf`) to set the desired `max_online_messages` value:
+
+ ```
+ max_online_messages=<your_desired_value>
+ ```
+
+ Then, restart VerneMQ to apply the changes:
+
+ ```
+ sudo service vernemq restart
+ ```
+
+4. Monitor the situation
+
+ Continue to monitor the message queue length and VerneMQ performance metrics after making changes, to ensure that the issue is resolved or mitigated.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [Understanding and Monitoring VerneMQ Metrics](https://docs.vernemq.com/monitoring/introduction)
+3. [VerneMQ Configuration Guide](https://docs.vernemq.com/configuration/introduction)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_queue_message_expired.md b/health/guides/vernemq/vernemq_queue_message_expired.md
new file mode 100644
index 000000000..bd0533402
--- /dev/null
+++ b/health/guides/vernemq/vernemq_queue_message_expired.md
@@ -0,0 +1,53 @@
+### Understand the alert
+
+This alert is related to VerneMQ, a scalable and open-source MQTT broker. The `vernemq_queue_message_expired` alert indicates that there is a high number of expired messages that could not be delivered in the last minute.
+
+### What does message expiration mean?
+
+In MQTT, messages are kept in queues until they are delivered to their respective subscribers. Sometimes, messages might have a specific lifespan given by the Time to Live (TTL) attribute, and if they are not delivered within this time, they expire.
+
+Expired messages are removed from the queue and are not delivered to subscribers. This usually means that clients are unable to process the incoming messages fast enough, putting the VerneMQ system under stress.
+
+### Troubleshoot the alert
+
+1. **Check VerneMQ status**: Use the `vmq-admin` tool to check the status of your VerneMQ cluster:
+
+ ```
+ sudo vmq-admin cluster show
+ ```
+
+ Analyze the output to make sure that the cluster is up and running without issues.
+
+2. **Check the message rate and throughput**: You can use the `vmq-admin metrics show` command to display key metrics related to your VerneMQ cluster:
+
+ ```
+ sudo vmq-admin metrics show
+ ```
+
+ Analyze the output and identify any sudden increase in the message rate or unusual rate of message expiration.
+
+3. **Identify slow or malfunctioning clients**: VerneMQ provides a command to list all clients connected to the cluster. You can use the following command to identify slow or malfunctioning clients:
+
+ ```
+ sudo vmq-admin session show
+ ```
+
+ Check the output for clients that are offline, that have many queued messages, or that are not receiving messages properly.
+
+4. **Optimize client connections**: Increasing the message TTL or decreasing the message rate can help decrease the number of expired messages. Adjust the client settings accordingly, ensuring they match the application requirements.
+
+5. **Ensure proper resource allocation**: Check whether the VerneMQ broker has enough resources by monitoring CPU, memory, and disk usage using tools like `top`, `vmstat`, or `iotop`.
+
+6. **Check VerneMQ logs**: VerneMQ logs can provide valuable insight into the underlying issue. Check the logs for any relevant error messages or warnings:
+
+ ```
+ sudo tail -f /var/log/vernemq/console.log
+ sudo tail -f /var/log/vernemq/error.log
+ ```
+
+7. **Monitor Netdata charts**: Monitor Netdata's VerneMQ dashboard to gain more insight into the behavior of your MQTT broker over time. Look for spikes in the number of expired messages, slow message delivery, or increasing message queues.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)
+2. [How to Monitor VerneMQ MQTT broker with Netdata](https://learn.netdata.cloud/guides/monitor/vernemq.html)
diff --git a/health/guides/vernemq/vernemq_queue_message_unhandled.md b/health/guides/vernemq/vernemq_queue_message_unhandled.md
new file mode 100644
index 000000000..e2b5c5034
--- /dev/null
+++ b/health/guides/vernemq/vernemq_queue_message_unhandled.md
@@ -0,0 +1,41 @@
+### Understand the alert
+
+This alert is raised when the number of unhandled messages in the last minute, monitored by the Netdata Agent, is too high. It indicates that many messages were not delivered due to connections with `clean_session=true` in a VerneMQ messaging system.
+
+### What does clean_session=true mean?
+
+In MQTT, `clean_session=true` means that the client does not want the broker to keep any session state beyond the duration of its connection. When the session is terminated, all subscriptions and queued messages are deleted. The broker does not store messages while the client is disconnected, so nothing missed is delivered when the client reconnects.
+
+### What are VerneMQ unhandled messages?
+
+Unhandled messages are messages that cannot be delivered to subscribers due to connection issues, protocol limitations, or session configurations. These messages are often related to clients' settings for `clean_session=true`, which means they don't store any session state on the broker.
+
+### Troubleshoot the alert
+
+- Identify clients causing unhandled messages
+
+ One way to find the clients causing unhandled messages is by analyzing the VerneMQ log files. Look for warning or error messages related to undelivered messages or clean sessions. The log files are typically located in `/var/log/vernemq/`.
+
+- Check clients' clean_session settings
+
+ Review your MQTT clients' configurations to verify if they have `clean_session=true`. Consider changing the setting to `clean_session=false` if you want the broker to store session state and send missed messages upon reconnection.
+
+- Monitor VerneMQ statistics
+
+ Use the following command to see an overview of the VerneMQ statistics:
+
+ ```
+ vmq-admin metrics show
+ ```
+
+ Look for metrics related to dropped or unhandled messages, such as `counter.queue_message_unhandled`.
+
+- Examine your system resources
+
+ High unhandled message rates can also be a result of insufficient system resources. Check your system resources (CPU, memory, disk usage) and consider upgrading if necessary.
+
+### Useful resources
+
+1. [VerneMQ - An MQTT Broker](https://vernemq.com/)
+2. [VerneMQ Documentation: Monitoring & Metrics](https://docs.vernemq.com/monitoring/)
+3. [Understanding MQTT Clean Sessions, Queuing, Retained Messages and QoS](https://www.hivemq.com/blog/mqtt-essentials-part-7-persistent-session-queuing-messages/)
\ No newline at end of file
diff --git a/health/guides/vernemq/vernemq_socket_errors.md b/health/guides/vernemq/vernemq_socket_errors.md
new file mode 100644
index 000000000..0be28eb6c
--- /dev/null
+++ b/health/guides/vernemq/vernemq_socket_errors.md
@@ -0,0 +1,33 @@
+### Understand the alert
+
+This alert is related to the VerneMQ MQTT broker, and it triggers when there is a high number of socket errors in the last minute. Socket errors can occur due to various reasons, such as network connectivity issues or resource contention on the system running the VerneMQ broker.
+
+### What are socket errors?
+
+Socket errors are issues related to network communication between the VerneMQ broker and its clients. They usually occur when there are problems establishing or maintaining a stable network connection between the server and clients. Examples of socket errors include connection timeouts, connection resets, unreachable hosts, and other network-related problems.
+
+### Troubleshoot the alert
+
+1. Check the VerneMQ logs for more information:
+
+ VerneMQ logs can give you a better understanding of the cause of the socket errors. You can find the logs at `/var/log/vernemq/console.log` or `/var/log/vernemq/error.log`. Look for any errors or warning messages that might be related to the socket errors.
+
+2. Monitor the system's resources:
+
+ Use the `top`, `vmstat`, `iostat`, or `netstat` commands to monitor your system's resource usage, such as CPU, RAM, disk I/O, and network activity. Check if there are any resource bottlenecks or excessive usage that might be causing the socket errors.
+
+3. Check network connectivity:
+
+ Verify that there are no issues with the network connectivity between the VerneMQ broker and its clients. Use tools such as `ping`, `traceroute`, or `mtr` to check the connectivity and latency of the network.
+
+4. Make sure the VerneMQ broker is running:
+
+ Ensure that the VerneMQ broker process is running and listening for connections. You can use the `ps` command to check if the `vernemq` process is running, and the `netstat` command to verify that it's listening on the expected ports.
+
+5. Inspect client configurations and logs:
+
+ It's possible that the root cause of the socket errors is related to the MQTT clients. Check their configurations and logs for any signs of issues or misconfigurations that could be causing socket errors when connecting to the VerneMQ broker.
+
+### Useful resources
+
+1. [VerneMQ Documentation](https://vernemq.com/docs/)