diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:23 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-07-24 09:54:44 +0000 |
commit | 836b47cb7e99a977c5a23b059ca1d0b5065d310e (patch) | |
tree | 1604da8f482d02effa033c94a84be42bc0c848c3 /docs/observability-centralization-points/best-practices.md | |
parent | Releasing debian version 1.44.3-2. (diff) | |
download | netdata-836b47cb7e99a977c5a23b059ca1d0b5065d310e.tar.xz netdata-836b47cb7e99a977c5a23b059ca1d0b5065d310e.zip |
Merging upstream version 1.46.3.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/observability-centralization-points/best-practices.md')
-rw-r--r-- | docs/observability-centralization-points/best-practices.md | 39 |
1 files changed, 39 insertions, 0 deletions
diff --git a/docs/observability-centralization-points/best-practices.md b/docs/observability-centralization-points/best-practices.md new file mode 100644 index 000000000..49bd3d6c3 --- /dev/null +++ b/docs/observability-centralization-points/best-practices.md @@ -0,0 +1,39 @@ +# Best Practices for Observability Centralization Points + +When planning the deployment of Observability Centralization Points, the following factors need consideration: + +1. **Volume of Monitored Systems**: The number of systems being monitored dictates the scaling and number of centralization points required. Larger infrastructures may necessitate multiple centralization points to manage the volume of data effectively and maintain performance. + +2. **Cost of Data Transfer**: Particularly in multi-cloud or hybrid environments, the location of centralization points can significantly impact egress bandwidth costs. Strategically placing centralization points in each data center or cloud region can minimize these costs by reducing the need for cross-network data transfer. + +3. **Usability without Netdata Cloud**: When not using Netdata Cloud, observability with Netdata is simpler when there are fewer centralization points, making it easier to remember where observability is and how to access it. + +4. When Netdata Cloud is used, infrastructure level views are provided independently of the centralization points, so it is preferable to centralize as required for security (e.g. internet access), cost control (e.g. egress bandwidth, dedicated resources) and operational efficiency (regions, services or teams isolation). + +## Cost Optimization + +Netdata has been designed for observability cost optimization. For optimal cost we recommend using Netdata Cloud and multiple independent observability centralization points: + +- **Scale out**: add more, smaller centralization points to distribute the load. This strategy provides the least resource consumption per unit of workload, maintaining optimal performance and resource efficiency across your observability infrastructure. + +- **Use existing infrastructure resources**: use spare capacities before allocating dedicated resources for observability. This approach minimizes additional costs and promotes an economically sustainable observability framework. + +- **Unified or separate centralization for logs and metrics**: Netdata allows centralizing metrics and logs together or separately. Consider factors such as access frequency, data retention policies, and compliance requirements to enhance performance and reduce costs. + +- **Decentralized configuration management**: each Netdata centralization point can have its own unique configuration for retention and alerts. This enables 1) finer control on infrastructure costs and 2) localized control for separate services or teams. + +## Pros and Cons + +Compared to other observability solutions, the design of Netdata offers: + +- **Enhanced Scalability and Flexibility**: Netdata's support for multiple independent observability centralization points allows for a more scalable and flexible architecture. This feature is particularly advantageous in distributed and complex environments, enabling tailored observability strategies that can vary by region, service, or team requirements. + +- **Resilience and Fault Tolerance**: The ability to deploy multiple centralization points also contributes to greater system resilience and fault tolerance. Replication is a native feature of Netdata centralization points, so in the event of a failure at one centralization point, others can continue to function, ensuring continuous observability. + +- **Optimized Cost and Performance**: By distributing the load across multiple centralization points, Netdata can optimize both performance and cost. This distribution allows for the efficient use of resources and help mitigate the bottlenecks associated with a single centralization point. + +- **Simplicity**: Netdata agents (Children and Parents) require minimal configuration and maintenance, usually less than the configuration and maintenance required for the agents and exporters of other monitoring solutions. This provides an observability pipeline that has less moving parts and is easier to manage and maintain. + +- **Always On-Prem**: Netdata centralization points are always on-prem. Even when Netdata Cloud is used, Netdata agents and parents are queried to provide the data required for the dashboards. + +- **Bottom-Up Observability**: Netdata is designed to monitor systems, containers and applications bottom-up, aiming to provide the maximum resolution, visibility, depth and insights possible. Its ability to segment the infrastructure into multiple independent observability centralization points with customized retention, machine learning and alerts on each of them, while providing unified infrastructure level dashboards at Netdata Cloud, provides a flexible environment that can be tailored per service or team, while still being one unified infrastructure. |