diff options
author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2023-05-08 16:27:08 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2023-05-08 16:27:08 +0000 |
commit | 81581f9719bc56f01d5aa08952671d65fda9867a (patch) | |
tree | 0f5c6b6138bf169c23c9d24b1fc0a3521385cb18 /docs/guides/troubleshoot | |
parent | Releasing debian version 1.38.1-1. (diff) | |
download | netdata-81581f9719bc56f01d5aa08952671d65fda9867a.tar.xz netdata-81581f9719bc56f01d5aa08952671d65fda9867a.zip |
Merging upstream version 1.39.0.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'docs/guides/troubleshoot')
-rw-r--r-- | docs/guides/troubleshoot/monitor-debug-applications-ebpf.md | 28 | ||||
-rw-r--r-- | docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md | 40 |
2 files changed, 24 insertions, 44 deletions
diff --git a/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md b/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md index c79a038cc..856985ec5 100644 --- a/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md +++ b/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md @@ -1,8 +1,11 @@ <!-- title: "Monitor, troubleshoot, and debug applications with eBPF metrics" +sidebar_label: "Monitor, troubleshoot, and debug applications with eBPF metrics" description: "Use Netdata's built-in eBPF metrics collector to monitor, troubleshoot, and debug your custom application using low-level kernel feedback." image: /img/seo/guides/troubleshoot/monitor-debug-applications-ebpf.png custom_edit_url: https://github.com/netdata/netdata/edit/master/docs/guides/troubleshoot/monitor-debug-applications-ebpf.md +learn_status: "Published" +learn_rel_path: "Operations" --> # Monitor, troubleshoot, and debug applications with eBPF metrics @@ -83,7 +86,7 @@ to show other charts that will help you debug and troubleshoot how it interacts ## Configure the eBPF collector to monitor errors -The eBPF collector has [two possible modes](/collectors/ebpf.plugin#ebpf-load-mode): `entry` and `return`. The default +The eBPF collector has [two possible modes](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md#ebpf-load-mode): `entry` and `return`. The default is `entry`, and only monitors calls to kernel functions, but the `return` also monitors and charts _whether these calls return in error_. @@ -236,35 +239,16 @@ same application on multiple systems and want to correlate how it performs on ea findings with someone else on your team. If you don't already have a Netdata Cloud account, go [sign in](https://app.netdata.cloud) and get started for free. -Read the [get started with Cloud guide](https://github.com/netdata/netdata/blob/master/docs/cloud/get-started.mdx) for a walkthrough of -connecting nodes to and other fundamentals. +You can also read how to [monitor your infrastructure with Netdata Cloud](https://github.com/netdata/netdata/blob/master/docs/quickstart/infrastructure.md) to understand the key features that it has to offer. Once you've added one or more nodes to a Space in Netdata Cloud, you can see aggregated eBPF metrics in the [Overview dashboard](https://github.com/netdata/netdata/blob/master/docs/visualize/overview-infrastructure.md) under the same **Applications** or **eBPF** sections that you -find on the local Agent dashboard. Or, [create new dashboards](https://github.com/netdata/netdata/blob/master/docs/visualize/create-dashboards.md) using eBPF metrics +find on the local Agent dashboard. Or, [create new dashboards](https://github.com/netdata/netdata/blob/master/docs/cloud/visualize/dashboards.md) using eBPF metrics from any number of distributed nodes to see how your application interacts with multiple Linux kernels on multiple Linux systems. Now that you can see eBPF metrics in Netdata Cloud, you can [invite your team](https://github.com/netdata/netdata/blob/master/docs/cloud/manage/invite-your-team.md) and share your findings with others. -## What's next? - -Debugging and troubleshooting an application takes a special combination of practice, experience, and sheer luck. With -Netdata's eBPF metrics to back you up, you can rest assured that you see every minute detail of how your application -interacts with the Linux kernel. - -If you're still trying to wrap your head around what we offer, be sure to read up on our accompanying documentation and -other resources on eBPF monitoring with Netdata: - -- [eBPF collector](https://github.com/netdata/netdata/blob/master/collectors/ebpf.plugin/README.md) -- [eBPF's integration with `apps.plugin`](https://github.com/netdata/netdata/blob/master/collectors/apps.plugin/README.md#integration-with-ebpf) -- [Linux eBPF monitoring with Netdata](https://www.netdata.cloud/blog/linux-ebpf-monitoring-with-netdata/) - -The scenarios described above are just the beginning when it comes to troubleshooting with eBPF metrics. We're excited -to explore others and see what our community dreams up. If you have other use cases, whether simulated or real-world, -we'd love to hear them: [info@netdata.cloud](mailto:info@netdata.cloud). - -Happy troubleshooting! diff --git a/docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md b/docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md index 138182e01..a0e8973f7 100644 --- a/docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md +++ b/docs/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md @@ -1,11 +1,7 @@ -<!-- -title: "Troubleshoot Agent-Cloud connectivity issues" -description: "A simple guide to troubleshoot occurrences where the Agent is showing as offline after claiming." -custom_edit_url: https://github.com/netdata/netdata/edit/master/guides/troubleshoot/troubleshooting-agent-with-cloud-connection.md ---> - # Troubleshoot Agent-Cloud connectivity issues +Learn how to troubleshoot the Netdata Agent showing as offline after claiming, so you can connect the Agent to Netdata Cloud. + When you are claiming a node, you might not be able to immediately see it online in Netdata Cloud. This could be due to an error in the claiming process or a temporary outage of some services. @@ -13,9 +9,13 @@ We identified some scenarios that might cause this delay and possible actions yo The most common explanation for the delay usually falls into one of the following three categories: -- [The claiming process of the kickstart script was unsuccessful](#the-claiming-process-of-the-kickstart-script-was-unsuccessful) -- [Claiming on an older, deprecated version of the Agent](#claiming-on-an-older-deprecated-version-of-the-agent) -- [Network issues while connecting to the Cloud](#network-issues-while-connecting-to-the-cloud) +- [Troubleshoot Agent-Cloud connectivity issues](#troubleshoot-agent-cloud-connectivity-issues) + - [The claiming process of the kickstart script was unsuccessful](#the-claiming-process-of-the-kickstart-script-was-unsuccessful) + - [The kickstart script auto-claimed the Agent but there was no error message displayed](#the-kickstart-script-auto-claimed-the-agent-but-there-was-no-error-message-displayed) + - [Claiming on an older, deprecated version of the Agent](#claiming-on-an-older-deprecated-version-of-the-agent) + - [Network issues while connecting to the Cloud](#network-issues-while-connecting-to-the-cloud) + - [Verify that your IP is whitelisted from Netdata Cloud](#verify-that-your-ip-is-whitelisted-from-netdata-cloud) + - [Make sure that your node has internet connectivity and can resolve network domains](#make-sure-that-your-node-has-internet-connectivity-and-can-resolve-network-domains) ## The claiming process of the kickstart script was unsuccessful @@ -48,16 +48,14 @@ and you must do it manually, using the following steps: 3. Retry the kickstart claiming process. -:::note - -In some cases a simple restart of the Agent can fix the issue. -Read more about [Starting, Stopping and Restarting the Agent](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md). - -::: +> ### Note +> +> In some cases a simple restart of the Agent can fix the issue. +> Read more about [Starting, Stopping and Restarting the Agent](https://github.com/netdata/netdata/blob/master/docs/configure/start-stop-restart.md). ## Claiming on an older, deprecated version of the Agent -Make sure that you are using the latest version of Netdata if you are using the [Claiming script](https://learn.netdata.cloud/docs/agent/claim#claiming-script). +Make sure that you are using the latest version of Netdata if you are using the [Claiming script](https://github.com/netdata/netdata/blob/master/claim/README.md#claiming-script). With the introduction of our new architecture, Agents running versions lower than `v1.32.0` can face claiming problems, so we recommend you [update the Netdata Agent](https://github.com/netdata/netdata/blob/master/packaging/installer/UPDATE.md) to the latest stable version. @@ -109,9 +107,7 @@ To verify this: main-ingress-545609a41fcaf5d6.elb.us-east-1.amazonaws.com has address 44.196.50.41 ``` - :::info - - There will be cases in which the firewall restricts network access. In those cases, you need to whitelist `api.netdata.cloud` and `mqtt.netdata.cloud` domains to be able to see your nodes in Netdata Cloud. - If you can't whitelist domains in your firewall, you can whitelist the IPs that the above command will produce, but keep in mind that they can change without any notice. - - ::: + > ### Info + > + > There will be cases in which the firewall restricts network access. In those cases, you need to whitelist `api.netdata.cloud` and `mqtt.netdata.cloud` domains to be able to see your nodes in Netdata Cloud. + > If you can't whitelist domains in your firewall, you can whitelist the IPs that the above command will produce, but keep in mind that they can change without any notice. |