# Registry

Netdata provides distributed monitoring.

Traditional monitoring solutions centralize all the data to provide unified dashboards across all servers. Before
Netdata, this was the standard practice. However, it has a few issues:

1. due to the resources required, the number of metrics collected is limited.
2. for the same reason, the data collection frequency is low: at best once every 10 or 15 seconds,
    at worst every 5 or 10 minutes.
3. the central monitoring solution needs dedicated resources, thus becoming "another bottleneck" in the whole
    ecosystem. It also requires maintenance, administration, etc.
4. most centralized monitoring solutions are usually only good for presenting _statistics of past performance_ (i.e.
    cannot be used for real-time performance troubleshooting).

Netdata follows a different approach:

1. data collection happens per second
2. thousands of metrics per server are collected
3. data do not leave the server where they are collected
4. Netdata servers do not talk to each other
5. your browser connects all the Netdata servers

Using Netdata, your monitoring infrastructure is embedded on each server, significantly limiting the need for additional
resources. Netdata is blazingly fast, very resource efficient, and uses spare resources that already exist on each
server. This allows **scaling out** the monitoring infrastructure.

However, the Netdata approach introduces a few new issues that need to be addressed, one being **the list of Netdata
servers we have installed**, i.e. the URLs our Netdata servers are listening on.

To solve this, Netdata utilizes a **central registry**. This registry, together with certain browser features, allows
Netdata to provide unified cross-server dashboards. For example, when you jump from server to server using the node
menu, several session settings (like the currently viewed charts, the current zoom and pan operations on the charts,
etc.) are propagated to the new server, so that the new dashboard will come with exactly the same view.

## What data does the registry store?

The registry keeps track of 4 entities:

1. **machines**: i.e. the Netdata installations (a random GUID generated by each Netdata the first time it starts; we
    call this **machine_guid**)

    For each Netdata installation (each `machine_guid`) the registry keeps track of the different URLs it has been accessed through.

2. **persons**: i.e. the web browsers accessing the Netdata installations (a random GUID generated by the registry the
    first time it sees a new web browser; we call this **person_guid**)

    For each person, the registry keeps track of the Netdata installations it has accessed and their URLs.

3. **URLs** of Netdata installations (as seen by the web browsers)

    For each URL, the registry keeps the URL and nothing more. Each URL is linked to _persons_ and _machines_. The only
    way to find a URL is to know its **machine_guid** or a **person_guid** it is linked to.

4. **accounts**: i.e. the information used to sign-in via one of the available sign-in methods. Depending on the method, this may include an email, or an email and a profile picture or avatar.

For _persons_/_accounts_ and _machines_, the registry keeps links to _URLs_, each link with 2 timestamps (first time
seen, last time seen) and a counter (number of times it has been seen). _machines_, _persons_ and timestamps are stored
in the Netdata registry regardless of whether you sign in or not.
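
To make the relationships above concrete, here is a minimal sketch of the bookkeeping in Python. It is an illustration only, not the registry's actual implementation: the class names, the `access` method and the dictionary layout are all assumptions made for this example.

```python
from dataclasses import dataclass, field
import time


@dataclass
class UrlLink:
    """A link from a person or a machine to a URL, with usage statistics."""
    url: str
    first_seen: float
    last_seen: float
    count: int = 1


@dataclass
class Registry:
    # person_guid -> {(machine_guid, url) -> UrlLink}  (hypothetical layout)
    persons: dict = field(default_factory=dict)
    # machine_guid -> {url -> UrlLink}                 (hypothetical layout)
    machines: dict = field(default_factory=dict)

    def access(self, person_guid, machine_guid, url, now=None):
        """Record that a browser (person) reached a machine through a URL."""
        now = time.time() if now is None else now
        for links, key in (
            (self.persons.setdefault(person_guid, {}), (machine_guid, url)),
            (self.machines.setdefault(machine_guid, {}), url),
        ):
            link = links.get(key)
            if link is None:
                # First time this link is seen: both timestamps start equal.
                links[key] = UrlLink(url, first_seen=now, last_seen=now)
            else:
                # Seen before: refresh the last-seen time and bump the counter.
                link.last_seen = now
                link.count += 1
```

Calling `access` twice with the same arguments leaves one link per side, with `first_seen` unchanged, `last_seen` updated and `count` equal to 2.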

## Who talks to the registry?

Your web browser **only**! If sending this information is against your policies, you
can [run your own registry](#run-your-own-registry).

Your Netdata servers do not talk to the registry. This is a UML diagram of its operation:

![registry](https://cloud.githubusercontent.com/assets/2662304/19448565/11a70632-94ab-11e6-9d80-f410b4acb797.png)

## Which is the default registry?

`https://registry.my-netdata.io`, which is currently served by `https://london.my-netdata.io`. This registry accepts
both HTTP and HTTPS requests, but HTTPS is the default.

### Can this registry handle the global load of Netdata installations?

Yes. The registry can handle 50,000 to 100,000 requests **per second per core** (depending on the type of CPU, the
computer's memory bandwidth, etc). The lower figure, 50,000, is on a J1900 (Celeron, 2 GHz).

We believe it can handle the load.

## Run your own registry

**Every Netdata can be a registry**. Just pick one and configure it.

**To turn any Netdata into a registry**, edit `/etc/netdata/netdata.conf` and set:

```text
[registry]
    enabled = yes
    registry to announce = http://your.registry:19999
```

Restart your Netdata to activate it.

Then, you need to tell **all your other Netdata servers to advertise your registry**, instead of the default. To do
this, on each of your Netdata servers, edit `/etc/netdata/netdata.conf` and set:

```text
[registry]
    enabled = no
    registry to announce = http://your.registry:19999
```

Note that we have not enabled the registry on the other servers. Only one Netdata (the registry) needs
`[registry].enabled = yes`.

That's it. You now have your own registry.

You may also want to give your servers different names under the node menu (e.g. to have them sorted or grouped). You
can change a server's registry name by setting on each Netdata server:

```text
[registry]
    registry hostname = Group1 - Master DB
```

So this server will appear in the node menu as `Group1 - Master DB`. The max name length is 50 characters.

### Limiting access to the registry

Netdata v1.9+ supports limiting access to the registry from given IPs, like this:

```text
[registry]
    allow from = *
```

`allow from` settings are [Netdata simple patterns](/src/libnetdata/simple_pattern/README.md): string matches that use `*`
as wildcard (any number of times) and a `!` prefix for a negative match. So: `allow from = !10.1.2.3 10.*` will allow
all IPs in `10.*` except `10.1.2.3`. The order is important: left to right, the first positive or negative match is
used.
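
The matching rule described above can be sketched in a few lines of Python. This is an illustrative reimplementation under the stated rules (space-separated patterns, `*` wildcard, `!` prefix, first match wins), not Netdata's own code; the function name `simple_pattern_allows` is made up for this example.

```python
import re


def simple_pattern_allows(patterns: str, value: str) -> bool:
    """Return True if `value` is allowed by a space-separated pattern list.

    Patterns are tried left to right and the first one that matches decides.
    A leading `!` makes the pattern negative. No match means not allowed.
    """
    for token in patterns.split():
        negative = token.startswith("!")
        glob = token[1:] if negative else token
        # Translate `*` to "any run of characters"; everything else is literal.
        regex = ".*".join(re.escape(part) for part in glob.split("*"))
        if re.fullmatch(regex, value):
            return not negative
    return False
```

With `"!10.1.2.3 10.*"`, the address `10.5.5.5` is allowed and `10.1.2.3` is refused. Swapping the order to `"10.* !10.1.2.3"` allows `10.1.2.3`, because the positive pattern matches first.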

Keep in mind that connections to Netdata API ports are filtered by `[web].allow connections from`. So, IPs allowed by
`[registry].allow from` should also be allowed by `[web].allow connections from`.

The patterns can match either the IP address or the FQDN of the connecting host. To check the FQDN of a connection
without exposing the Netdata agent to DNS spoofing, a reverse DNS record must be set up for the connecting host. At
connection time, the reverse DNS of the peer IP address is resolved, and a forward DNS resolution is made to validate
the IP address against the name pattern.
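
The reverse-then-forward check can be sketched with Python's standard `socket` module. This is only an illustration of the idea, not how the agent implements it; the function name `verified_hostname` and its injectable `reverse`/`forward` parameters are assumptions made so the logic can be tried without network access.

```python
import socket


def verified_hostname(peer_ip, reverse=socket.gethostbyaddr,
                      forward=socket.getaddrinfo):
    """Return the peer's hostname only if forward DNS confirms it.

    `reverse` and `forward` default to the standard resolver calls. They are
    parameters only so the logic can be exercised with fake resolvers.
    """
    try:
        name = reverse(peer_ip)[0]   # reverse lookup: IP -> name
        infos = forward(name, None)  # forward lookup: name -> addresses
    except OSError:
        # Covers socket.herror and socket.gaierror: treat as unverified.
        return None
    addresses = {info[4][0] for info in infos}
    # Trust the name only if it resolves back to the connecting IP.
    return name if peer_ip in addresses else None
```

A name returned by this function can then be tested against the `allow from` patterns. A spoofed reverse record fails the check, because the forward lookup of the claimed name does not return the peer's IP.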

Please note that this process can be expensive on a machine that is serving many connections. The behaviour of the
pattern matching can be controlled with the following setting:

```text
[registry]
    allow by dns = heuristic
```

The settings are:

- `yes` allows the pattern to match DNS names.
- `no` disables DNS matching for the patterns (they only match IP addresses).
- `heuristic` will estimate if the patterns should match FQDNs by the presence or absence of `:`s or alpha-characters.
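
As a rough illustration of what such a heuristic could look like, the sketch below guesses whether a single pattern targets hostnames. The exact rule Netdata applies is not spelled out here, so the rule in this example (a `:` suggests an IPv6 address, otherwise any letter suggests a hostname) and the function name `pattern_may_match_dns` are assumptions.

```python
def pattern_may_match_dns(pattern: str) -> bool:
    """Guess whether a pattern is meant for hostnames rather than IP addresses.

    Assumed rule: a `:` suggests an IPv6 address, so no DNS matching.
    Otherwise any letter suggests a hostname. Patterns made only of digits,
    dots and wildcards are treated as IPv4.
    """
    body = pattern.lstrip("!")  # a negation prefix does not affect the guess
    if ":" in body:
        return False
    return any(ch.isalpha() for ch in body)
```

Under this assumed rule, `10.*` and `fe80::*` are treated as IP patterns, while `*.example.com` and `!localhost` would trigger DNS matching.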

### Where is the registry database stored?

`/var/lib/netdata/registry/*.db`

There can be up to 2 files:

- `registry-log.db`, the transaction log

    all incoming requests that affect the registry are saved in this file in real-time.

- `registry.db`, the database
  
    every `[registry].registry save db every new entries` entries in `registry-log.db`, Netdata will save its database to `registry.db` and empty `registry-log.db`.

Both files are machine-readable text files.
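
The interplay between the two files follows a common "append-only log plus periodic snapshot" pattern, sketched below in Python. The real files use Netdata's own text format; the JSON encoding, the `LoggedStore` class and the `save_every` parameter (standing in for `registry save db every new entries`) are assumptions made for this example.

```python
import json
import os


class LoggedStore:
    """Tiny key/value store with a transaction log and periodic snapshots."""

    def __init__(self, directory, save_every=3):
        self.db_path = os.path.join(directory, "registry.db")
        self.log_path = os.path.join(directory, "registry-log.db")
        self.save_every = save_every  # log entries between snapshots
        self.data = {}
        self.pending = 0

    def record(self, key, value):
        # Append the change to the log right away, one entry per line.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps([key, value]) + "\n")
        self.data[key] = value
        self.pending += 1
        if self.pending >= self.save_every:
            self.save()

    def save(self):
        # Write the full database, then empty the transaction log.
        with open(self.db_path, "w", encoding="utf-8") as db:
            json.dump(self.data, db)
        open(self.log_path, "w", encoding="utf-8").close()
        self.pending = 0
```

With `save_every=2`, the first `record` call only grows the log. The second call writes the whole database and truncates the log.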

### How can I disable the SameSite and Secure cookies?

Beginning with `v1.30.0`, when the Netdata Agent's web server processes a request, it delivers the `SameSite=none`
and `Secure` cookies. If you have problems accessing the local Agent dashboard or Netdata Cloud, disable these
cookies by [editing `netdata.conf`](/docs/netdata-agent/configuration/README.md#edit-a-configuration-file-using-edit-config):

```text
[registry]
    enable cookies SameSite and Secure = no
```

## The future

The registry opens a whole world of new possibilities for Netdata. See what we have in mind here:
<https://github.com/netdata/netdata/issues/416>

## Troubleshooting the registry

The registry URL should be set to the URL of a Netdata dashboard. This server has to have `[registry].enabled = yes`.
So, accessing the registry URL directly with your web browser should present the dashboard of the Netdata operating the
registry.

To use the registry, your web browser needs to support **third party cookies**, since the cookies are set by the
registry while you are browsing the dashboard of another Netdata server. The first time the registry sees a new web
browser, it tries to figure out whether the browser has cookies enabled. It does this by setting a cookie and
redirecting the browser back to itself, hoping that it will receive the cookie. If it does not receive the cookie, the
registry will keep redirecting your web browser back to itself, which after a few redirects will fail with an error like
this:

```text
ERROR 409: Cannot ACCESS netdata registry: https://registry.my-netdata.io responded with: {"status":"redirect","registry":"https://registry.my-netdata.io"}
```

This error is printed in your web browser's console (press F12 in your browser to see it).
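
The cookie check and the resulting redirect loop can be simulated with a small Python sketch. It models only the behaviour described above and makes no real HTTP requests; the function names, the cookie name `person_guid` and the redirect limit are hypothetical.

```python
def registry_hello(request_cookies):
    """Hypothetical registry endpoint: answers 'ok' only if the cookie came back."""
    if "person_guid" in request_cookies:
        return {"status": "ok"}, {}
    # No cookie received: set one and ask the browser to come back.
    return {"status": "redirect"}, {"person_guid": "hypothetical-guid"}


def browser_access(accepts_third_party_cookies, max_redirects=3):
    """Follow redirects until the registry accepts us or we give up."""
    jar = {}
    for _ in range(max_redirects + 1):
        body, set_cookies = registry_hello(jar)
        if body["status"] == "ok":
            return "ok"
        if accepts_third_party_cookies:
            # A browser that blocks third party cookies never stores this.
            jar.update(set_cookies)
    return "error: registry kept redirecting"
```

A browser that stores the cookie succeeds on its second request. One that blocks third party cookies is redirected on every attempt and ends with the error.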