author: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-03-09 13:19:48 +0000
committer: Daniel Baumann <daniel.baumann@progress-linux.org> 2024-03-09 13:20:02 +0000
commit: 58daab21cd043e1dc37024a7f99b396788372918
tree: 96771e43bb69f7c1c2b0b4f7374cb74d7866d0cb /health/guides/scaleio
parent: Releasing debian version 1.43.2-1.
Merging upstream version 1.44.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'health/guides/scaleio')
-rw-r--r-- health/guides/scaleio/scaleio_sdc_mdm_connection_state.md | 43
-rw-r--r-- health/guides/scaleio/scaleio_storage_pool_capacity_utilization.md | 34
2 files changed, 77 insertions, 0 deletions
diff --git a/health/guides/scaleio/scaleio_sdc_mdm_connection_state.md b/health/guides/scaleio/scaleio_sdc_mdm_connection_state.md
new file mode 100644
index 000000000..1e09b978c
--- /dev/null
+++ b/health/guides/scaleio/scaleio_sdc_mdm_connection_state.md
@@ -0,0 +1,43 @@
+### Understand the alert
+
+The `scaleio_sdc_mdm_connection_state` alert indicates that a ScaleIO Data Client (SDC) has lost its connection to the ScaleIO MetaData Manager (MDM). While the SDC is disconnected, you may see degraded performance or data unavailability in your storage infrastructure.
+
+### Troubleshoot the alert
+
+1. Check the connectivity between SDC and MDM nodes.
+
+Verify that the SDC and MDM nodes can reach each other by running `ping` or `traceroute` from the SDC node to the MDM node and vice versa. Network problems such as high latency or packet loss can cause the SDC to disconnect from the MDM.
+
+2. Examine log files.
+
+Review the SDC and MDM log files to identify any error messages or warnings that can indicate the reason for the disconnection. Common log file locations are:
+
+ - SDC logs: `/opt/emc/scaleio/sdc/logs/sdc.log`
+ - MDM logs: `/opt/emc/scaleio/mdm/logs/mdm.log`
+
+3. Check the status of ScaleIO services.
+
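+Verify that the ScaleIO services are running on both the SDC and MDM nodes. Unit names vary by ScaleIO version and packaging, so confirm the names on your hosts (for example, with `systemctl list-units`) and adjust the following commands if needed: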
+
+ - SDC service status: `sudo systemctl status scaleio-sdc`
+ - MDM service status: `sudo systemctl status scaleio-mdm`
+
+If a service is not running, start it and check the connection state again.
+
+4. Reconnect SDC to MDM.
+
+If the issue persists after you have verified network connectivity and service status, try reconnecting the SDC to the MDM manually. Run the following command on the SDC node, checking the exact syntax against the CLI reference for your ScaleIO version:
+
+ ```
+ sudo scli --reconnect_sdc --mdm_ip <MDM_IP_ADDRESS>
+ ```
+
+Replace `<MDM_IP_ADDRESS>` with the IP address of your MDM node.
+
+5. Contact support.
+
+If the disconnection issue persists after trying the above steps, consider contacting technical support for assistance.
+
+### Useful resources
+
+1. [ScaleIO Troubleshooting](https://www.dell.com/support/home/en-us/product-support/product/scaleio)
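
Steps 1 and 3 of the guide above (reachability and service status) could be combined into a small pre-check script. The following is only a sketch under stated assumptions: it assumes a Linux host with iputils `ping` and `systemctl`, it reuses the `scaleio-sdc` unit name from the guide (which may differ on your installation), and the function names are hypothetical helpers, not part of ScaleIO or Netdata.

```python
from __future__ import annotations

import subprocess


def build_ping_command(host: str, count: int = 3, timeout_s: int = 2) -> list[str]:
    """Build an iputils-style ping command (-c: packet count, -W: reply timeout in seconds)."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def build_service_command(unit: str) -> list[str]:
    """Build a systemctl query that exits with status 0 only when the unit is active."""
    return ["systemctl", "is-active", "--quiet", unit]


def command_succeeds(command: list[str]) -> bool:
    """Run a command quietly and report whether it exited with status 0."""
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except OSError:
        # The binary is missing or cannot be executed on this host.
        return False
    return result.returncode == 0


def diagnose(mdm_host: str, sdc_unit: str = "scaleio-sdc", run=command_succeeds) -> list[str]:
    """Return human-readable findings; an empty list means both checks passed.

    `sdc_unit` is an assumption taken from the guide; pass the unit name
    that your installation actually uses.
    """
    findings = []
    if not run(build_ping_command(mdm_host)):
        findings.append(f"MDM {mdm_host} did not answer ping; check the network path")
    if not run(build_service_command(sdc_unit)):
        findings.append(f"service {sdc_unit} is not active; start it and re-check")
    return findings
```

On an SDC node, calling `diagnose("<MDM_IP_ADDRESS>")` returns a list of problems to look into before attempting a manual reconnect. The `run` parameter exists so the logic can be exercised without touching the network.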
diff --git a/health/guides/scaleio/scaleio_storage_pool_capacity_utilization.md b/health/guides/scaleio/scaleio_storage_pool_capacity_utilization.md
new file mode 100644
index 000000000..0f8a723b8
--- /dev/null
+++ b/health/guides/scaleio/scaleio_storage_pool_capacity_utilization.md
@@ -0,0 +1,34 @@
+### Understand the alert
+
+The `scaleio_storage_pool_capacity_utilization` alert monitors storage pool capacity utilization in ScaleIO, a software-defined storage solution. If you receive this alert, utilization of a storage pool is high, which can degrade performance or cause the pool to run out of space.
+
+### What does high storage pool capacity utilization mean?
+
+High storage pool capacity utilization means that a large percentage of the capacity allocated to the ScaleIO storage pool is in use. Warning alerts are triggered in the 80-90% utilization range and critical alerts in the 90-98% range. High utilization can degrade system performance and, because little free space remains, may prevent new data from being stored.
+
+### Troubleshoot the alert
+
+1. **Verify the storage pool capacity utilization**
+
+ Check the Netdata dashboard or use the Netdata API to verify the storage pool capacity utilization. Take note of the storage pools with high utilization.
+
+2. **Investigate storage usage**
+
+ Inspect the storage usage in your environment, and determine which data or applications are consuming the most space. You can use tools like `du`, `df`, and `ncdu` to analyze disk usage.
+
+3. **Delete or move unnecessary files**
+
+ If you find unnecessary files or backup copies that occupy large amounts of space, consider deleting them or moving them to other storage devices to free up space in the storage pool.
+
+4. **Optimize storage provisioning**
+
+ Evaluate the storage provisioning for your applications, and ensure that appropriate storage space is allocated based on the actual needs. Adjust storage allocations if needed.
+
+5. **Consider expanding the storage pool**
+
+ If high utilization is expected given your application and data storage needs, consider expanding the storage pool by adding new devices or by increasing the storage space allocated on the existing devices in the pool.
+
+6. **Monitor storage pool capacity utilization trends**
+
+ Keep track of the storage pool capacity utilization trends and be proactive in addressing potential storage capacity issues in the future.
+
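
The threshold ranges and the API check described in the guide above can be illustrated with a short sketch. It rests on an assumption: the 80-90% and 90-98% ranges are read here as hysteresis bands, where an alert is raised above the upper bound and cleared only once utilization falls to the lower bound or below. The guide itself does not spell this out, so treat it as one plausible interpretation. `next_status` and `netdata_data_url` are hypothetical helper names, and the chart ID is a placeholder that you would replace with the one shown on your dashboard.

```python
from urllib.parse import urlencode

# Ranges quoted in the guide; the hysteresis reading is an assumption.
WARNING_RANGE = (80.0, 90.0)   # (clear at or below, raise above)
CRITICAL_RANGE = (90.0, 98.0)  # (clear at or below, raise above)


def next_status(utilization_pct: float, current: str = "clear") -> str:
    """Return 'clear', 'warning' or 'critical' for a utilization percentage.

    An alert that is already raised stays raised until utilization drops
    to the lower bound of its range, which avoids flapping near a threshold.
    """
    crit_threshold = CRITICAL_RANGE[0] if current == "critical" else CRITICAL_RANGE[1]
    if utilization_pct > crit_threshold:
        return "critical"
    already_raised = current in ("warning", "critical")
    warn_threshold = WARNING_RANGE[0] if already_raised else WARNING_RANGE[1]
    if utilization_pct > warn_threshold:
        return "warning"
    return "clear"


def netdata_data_url(base_url: str, chart_id: str, seconds: int = 60) -> str:
    """Build a Netdata /api/v1/data URL that averages the last `seconds` into one point."""
    query = urlencode(
        {"chart": chart_id, "after": -seconds, "points": 1, "format": "json"}
    )
    return f"{base_url.rstrip('/')}/api/v1/data?{query}"
```

For example, `next_status(92.0)` yields `"warning"`, and a pool that then drops to 85% stays in `"warning"` until it reaches 80% or lower. The URL from `netdata_data_url("http://localhost:19999", "<CHART_ID>")` can be fetched with any HTTP client to read the current value.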