#############################################
 Notes and Thoughts on Cephadm's scalability
#############################################

*********************
 About this document
*********************

This document does NOT define a specific proposal or some future work.
Instead it merely lists a few thoughts that MIGHT be relevant for future
cephadm enhancements.

*******
 Intro
*******

Current situation:

Cephadm manages all registered hosts. This means that it periodically
scrapes data from each host to identify changes on the host, such as:

- disk added/removed
- daemon added/removed
- host network/firewall etc. has changed

Currently, cephadm scrapes each host (up to 10 in parallel) every 6
minutes, unless a refresh is forced manually.

Refreshes for disks (ceph-volume), daemons (podman/docker), etc. happen
in sequence.

With the cephadm-exporter, we have now reduced the time to scan hosts
considerably, but the question remains:

Is the cephadm-exporter sufficient to solve all future scalability
issues?

***********************************************
 Considerations of cephadm-exporter's REST API
***********************************************

The cephadm-exporter uses HTTP to serve an endpoint for the host's
metadata. We MIGHT encounter some issues with this approach, which need
to be mitigated at some point.

- With the cephadm-exporter we use both SSH and HTTP to connect to each
  host. Having two distinct transport layers feels odd, and we might
  want to consider reducing this to a single protocol.

- The current approach of delivering ``bin/cephadm`` to the host doesn't
  allow the use of external dependencies. This means that we're stuck
  with the built-in HTTP server lib, which isn't great for providing a
  good developer experience (see the first sketch at the end of this
  document). ``bin/cephadm`` needs to be packaged and distributed (one
  way or the other) for us to make use of a better HTTP server library.

************************
 MON's config-key store
************************

After ``mgr/cephadm`` has queried the metadata from each host, cephadm
stores the data within the mon's k-v store.

If each host were allowed to write its own metadata to the store,
``mgr/cephadm`` would no longer be required to gather the data (see the
second sketch at the end of this document).

Some questions arise:

- ``mgr/cephadm`` would then need to query data from the config-key
  store, instead of relying on cached data.

- cephadm knows three different types of data: (1) data that is
  critical and needs to be stored in the config-key store, (2) data
  that can be kept in memory only, and (3) data that can be stored in
  a RADOS pool. How can we apply this idea to those different types of
  data?

*******************************
 Increase the worker pool size
*******************************

``mgr/cephadm`` is currently able to scrape 10 nodes at the same time.

Increasing the pool size would not change how long the scrape of an
individual host takes; we'd just reduce the overall execution time (see
the third sketch at the end of this document).

At best we can reach O(hosts) + O(daemons).

*************************
 Backwards compatibility
*************************

Any changes need to be backwards compatible or completely isolated from
any existing functionality. There are running cephadm clusters out there
that require an upgrade path.
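***********************
 Illustrative sketches
***********************

The sketches below do NOT describe actual cephadm code; they merely
make the thoughts above concrete.

First, a minimal metadata endpoint built on nothing but the Python
standard library, mirroring the constraint that ``bin/cephadm`` cannot
pull in external dependencies. The endpoint path and the payload shape
are made up for illustration and do not reflect the cephadm-exporter's
actual wire format:

.. code-block:: python

   import json
   from http.server import BaseHTTPRequestHandler, HTTPServer


   class MetadataHandler(BaseHTTPRequestHandler):
       def do_GET(self):
           # "/v1/metadata" is a hypothetical path, not the real
           # cephadm-exporter endpoint.
           if self.path != "/v1/metadata":
               self.send_error(404)
               return
           body = json.dumps({"daemons": [], "disks": []}).encode()
           self.send_response(200)
           self.send_header("Content-Type", "application/json")
           self.send_header("Content-Length", str(len(body)))
           self.end_headers()
           self.wfile.write(body)


   if __name__ == "__main__":
       # The real exporter also handles TLS and authentication; both
       # are omitted here to keep the sketch minimal.
       HTTPServer(("0.0.0.0", 9443), MetadataHandler).serve_forever()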
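Second, a sketch of the config-key idea: a host-side agent pushing its
own metadata into the mon's k-v store via ``ceph config-key set``. The
key layout and the assumption that each host holds credentials with the
required mon caps are both hypothetical:

.. code-block:: python

   import json
   import socket
   import subprocess


   def push_host_metadata(metadata: dict) -> None:
       # "mgr/cephadm/host.<hostname>" mimics the per-host keys that
       # mgr/cephadm writes today; letting hosts write them directly
       # is exactly the open question discussed above.
       key = "mgr/cephadm/host.%s" % socket.gethostname()
       subprocess.run(
           ["ceph", "config-key", "set", key, json.dumps(metadata)],
           check=True,
       )


   push_host_metadata({"daemons": [], "disks": [], "networks": {}})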
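Third, a sketch of a wider worker pool. ``scrape_host()`` is a
hypothetical stand-in for cephadm's real per-host refresh. Note that
the per-host scrape time is unchanged; a larger pool only shrinks the
wall-clock time of a full refresh round:

.. code-block:: python

   from concurrent.futures import ThreadPoolExecutor


   def scrape_host(host: str) -> dict:
       # Placeholder for the ceph-volume / podman / network scrapes
       # that cephadm runs in sequence on each host.
       return {"host": host}


   def scrape_all(hosts: list, pool_size: int = 10) -> list:
       # pool_size=10 matches the current behaviour described above;
       # raising it trades mgr-side resources for shorter rounds.
       with ThreadPoolExecutor(max_workers=pool_size) as pool:
           return list(pool.map(scrape_host, hosts))


   results = scrape_all(["node%03d" % i for i in range(100)], pool_size=50)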