From 483eb2f56657e8e7f419ab1a4fab8dce9ade8609 Mon Sep 17 00:00:00 2001 From: Daniel Baumann Date: Sat, 27 Apr 2024 20:24:20 +0200 Subject: Adding upstream version 14.2.21. Signed-off-by: Daniel Baumann --- doc/rados/operations/upmap.rst | 85 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 85 insertions(+) create mode 100644 doc/rados/operations/upmap.rst (limited to 'doc/rados/operations/upmap.rst') diff --git a/doc/rados/operations/upmap.rst b/doc/rados/operations/upmap.rst new file mode 100644 index 00000000..8bad0d95 --- /dev/null +++ b/doc/rados/operations/upmap.rst @@ -0,0 +1,85 @@ +Using the pg-upmap +================== + +Starting in Luminous v12.2.z there is a new *pg-upmap* exception table +in the OSDMap that allows the cluster to explicitly map specific PGs to +specific OSDs. This allows the cluster to fine-tune the data +distribution to, in most cases, perfectly distributed PGs across OSDs. + +The key caveat to this new mechanism is that it requires that all +clients understand the new *pg-upmap* structure in the OSDMap. + +Enabling +-------- + +To allow use of the feature, you must tell the cluster that it only +needs to support luminous (and newer) clients with:: + + ceph osd set-require-min-compat-client luminous + +This command will fail if any pre-luminous clients or daemons are +connected to the monitors. You can see what client versions are in +use with:: + + ceph features + +Balancer module +----------------- + +The new `balancer` module for ceph-mgr will automatically balance +the number of PGs per OSD. See ``Balancer`` + + +Offline optimization +-------------------- + +Upmap entries are updated with an offline optimizer built into ``osdmaptool``. + +#. Grab the latest copy of your osdmap:: + + ceph osd getmap -o om + +#. Run the optimizer:: + + osdmaptool om --upmap out.txt [--upmap-pool ] + [--upmap-max ] [--upmap-deviation ] + [--upmap-active] + + It is highly recommended that optimization be done for each pool + individually, or for sets of similarly-utilized pools. You can + specify the ``--upmap-pool`` option multiple times. "Similar pools" + means pools that are mapped to the same devices and store the same + kind of data (e.g., RBD image pools, yes; RGW index pool and RGW + data pool, no). + + The ``max-optimizations`` value is the maximum number of upmap entries to + identify in the run. The default is `10` like the ceph-mgr balancer module, + but you should use a larger number if you are doing offline optimization. + If it cannot find any additional changes to make it will stop early + (i.e., when the pool distribution is perfect). + + The ``max-deviation`` value defaults to `5`. If an OSD PG count + varies from the computed target number by less than or equal + to this amount it will be considered perfect. + + The ``--upmap-active`` option simulates the behavior of the active + balancer in upmap mode. It keeps cycling until the OSDs are balanced + and reports how many rounds and how long each round is taking. The + elapsed time for rounds indicates the CPU load ceph-mgr will be + consuming when it tries to compute the next optimization plan. + +#. Apply the changes:: + + source out.txt + + The proposed changes are written to the output file ``out.txt`` in + the example above. These are normal ceph CLI commands that can be + run to apply the changes to the cluster. + + +The above steps can be repeated as many times as necessary to achieve +a perfect distribution of PGs for each set of pools. + +You can see some (gory) details about what the tool is doing by +passing ``--debug-osd 10`` and even more with ``--debug-crush 10`` +to ``osdmaptool``. -- cgit v1.2.3