summaryrefslogtreecommitdiffstats
path: root/doc/dev/cephadm/scalability-notes.rst
diff options
context:
space:
mode:
Diffstat (limited to 'doc/dev/cephadm/scalability-notes.rst')
-rw-r--r--doc/dev/cephadm/scalability-notes.rst95
1 files changed, 95 insertions, 0 deletions
diff --git a/doc/dev/cephadm/scalability-notes.rst b/doc/dev/cephadm/scalability-notes.rst
new file mode 100644
index 000000000..157153cb3
--- /dev/null
+++ b/doc/dev/cephadm/scalability-notes.rst
@@ -0,0 +1,95 @@
+#############################################
+ Notes and Thoughts on Cephadm's scalability
+#############################################
+
+*********************
+ About this document
+*********************
+
+This document does NOT define a specific proposal or some future work.
+Instead it merely lists a few thoughts that MIGHT be relevant for future
+cephadm enhacements.
+
+*******
+ Intro
+*******
+
+Current situation:
+
+Cephadm manages all registered hosts. This means that it periodically
+scrapes data from each host to identify changes on the host like:
+
+- disk added/removed
+- daemon added/removed
+- host network/firewall etc has changed
+
+Currently, cephadm scrapes each host (up to 10 in parallel) every 6
+minutes, unless a refresh is forced manually.
+
+Refreshes for disks (ceph-volume), daemons (podman/docker), etc, happen
+in sequence.
+
+With the cephadm exporter, we have now reduced the time to scan hosts
+considerably, but the question remains:
+
+Is the cephadm-exporter sufficient to solve all future scalability
+issues?
+
+***********************************************
+ Considerations of cephadm-exporter's REST API
+***********************************************
+
+The cephadm-exporter uses HTTP to serve an endpoint to the hosts
+metadata. We MIGHT encounter some issues with this approach, which need
+to be mitigated at some point.
+
+- With the cephadm-exporter we use SSH and HTTP to connect to each
+ host. Having two distinct transport layers feels odd, and we might
+ want to consider reducing it to only a single protocol.
+
+- The current approach of delivering ``bin/cephadm`` to the host doesn't
+ allow the use of external dependencies. This means that we're stuck
+ with the built-in HTTP server lib, which isn't great for providing a
+ good developer experience. ``bin/cephadm`` needs to be packaged and
+ distributed (one way or the other) for us to make use of a better
+ http server library.
+
+************************
+ MON's config-key store
+************************
+
+After the ``mgr/cephadm`` queried metadata from each host, cephadm stores
+the data within the mon's k-v store.
+
+If each host would be allowed to write their own metadata to the store,
+``mgr/cephadm`` would no longer be required to gather the data.
+
+Some questions arise:
+
+- ``mgr/cephadm`` now needs to query data from the config-key store,
+ instead of relying on cached data.
+
+- cephadm knows three different types of data: (1) Data that is
+ critical and needs to be stored in the config-key store. (2) Data
+ that can be kept in memory only. (3) Data that can be stored in
+ RADOS pool. How can we apply this idea to those different types of
+ data.
+
+*******************************
+ Increase the worker pool size
+*******************************
+
+``mgr/cephadm`` is currently able to scrape 10 nodes at the same time.
+
+The scrape of a individual host takes the same amount of time persists.
+We'd just reduce the overall execution time.
+
+At best we can reach O(hosts) + O(daemons).
+
+*************************
+ Backwards compatibility
+*************************
+
+Any changes need to be backwards compatible or completely isolated from
+any existing functionality. There are running cephadm clusters out there
+that require an upgrade path.