author     Daniel Baumann <daniel.baumann@progress-linux.org>   2024-04-21 11:54:28 +0000
committer  Daniel Baumann <daniel.baumann@progress-linux.org>   2024-04-21 11:54:28 +0000
commit     e6918187568dbd01842d8d1d2c808ce16a894239 (patch)
tree       64f88b554b444a49f656b6c656111a145cbbaa28 /PendingReleaseNotes
parent     Initial commit. (diff)
download   ceph-e6918187568dbd01842d8d1d2c808ce16a894239.tar.xz
           ceph-e6918187568dbd01842d8d1d2c808ce16a894239.zip
Adding upstream version 18.2.2.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'PendingReleaseNotes')
-rw-r--r--   PendingReleaseNotes   293
1 file changed, 293 insertions(+), 0 deletions(-)
diff --git a/PendingReleaseNotes b/PendingReleaseNotes
new file mode 100644
index 000000000..03520c97b
--- /dev/null
+++ b/PendingReleaseNotes
@@ -0,0 +1,293 @@

>=19.0.0

* RGW: S3 multipart uploads using Server-Side Encryption now replicate
  correctly in multi-site. Previously, the replicas of such objects were
  corrupted on decryption. A new tool, ``radosgw-admin bucket resync encrypted
  multipart``, can be used to identify these original multipart uploads. The
  ``LastModified`` timestamp of any identified object is incremented by 1ns to
  cause peer zones to replicate it again. For multi-site deployments that make
  any use of Server-Side Encryption, we recommend running this command against
  every bucket in every zone after all zones have upgraded.
* CEPHFS: The MDS now evicts clients which are not advancing their request
  tids, because such clients cause a large buildup of session metadata,
  resulting in the MDS going read-only due to the RADOS operation exceeding
  the size threshold. The `mds_session_metadata_threshold` config option
  controls the maximum size to which the (encoded) session metadata can grow.
* RGW: New tools have been added to radosgw-admin for identifying and
  correcting issues with versioned bucket indexes. Historical bugs with the
  versioned bucket index transaction workflow made it possible for the index
  to accumulate extraneous "book-keeping" olh entries and plain placeholder
  entries. In some specific scenarios where clients made concurrent requests
  referencing the same object key, it was likely that a large number of extra
  index entries would accumulate. When a significant number of these entries
  are present in a single bucket index shard, they can cause high bucket
  listing latencies and lifecycle processing failures. To check whether a
  versioned bucket has unnecessary olh entries, users can now run
  ``radosgw-admin bucket check olh``. If the ``--fix`` flag is used, the extra
  entries will be safely removed. Distinct from the issue described thus far,
  it is also possible that some versioned buckets maintain extra unlinked
  objects that are not listable from the S3/Swift APIs. These extra objects
  are typically the result of PUT requests that exited abnormally in the
  middle of a bucket index transaction, so the client would not have received
  a successful response. Bugs in prior releases made these unlinked objects
  easy to reproduce with any PUT request made against a bucket that was
  actively resharding. Besides the extra space that these hidden, unlinked
  objects consume, the failure mode that produced them can, in certain
  scenarios, also leave the object associated with the key in a state that
  clients observe as inconsistent. To check whether a versioned bucket has
  unlinked entries, users can now run ``radosgw-admin bucket check unlinked``.
  If the ``--fix`` flag is used, the unlinked objects will be safely removed.
  Finally, a third issue made it possible for versioned bucket index stats to
  be accounted inaccurately. The tooling for recalculating versioned bucket
  stats also had a bug and was not previously capable of fixing these
  inaccuracies. This release resolves those issues, and users can now expect
  the existing ``radosgw-admin bucket check`` command to produce correct
  results. We recommend that users with versioned buckets, especially buckets
  that existed on prior releases, use these new tools to check whether their
  buckets are affected and to clean them up accordingly; an example invocation
  is sketched below.
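
  A minimal sketch of the per-bucket check/repair workflow described above,
  assuming the usual ``--bucket`` argument of radosgw-admin (run the check-only
  form first and add ``--fix`` only once the report looks reasonable)::

    # check only: report extraneous olh / unlinked entries for one bucket
    radosgw-admin bucket check olh --bucket=<bucket>
    radosgw-admin bucket check unlinked --bucket=<bucket>

    # clean up the reported entries
    radosgw-admin bucket check olh --bucket=<bucket> --fix
    radosgw-admin bucket check unlinked --bucket=<bucket> --fix
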
* mgr/snap-schedule: For clusters with multiple CephFS file systems, all the
  snap-schedule commands now expect the '--fs' argument.

>=18.0.0

* The RGW policy parser now rejects unknown principals by default. If you are
  mirroring policies between RGW and AWS, you may wish to set
  "rgw policy reject invalid principals" to "false". This affects only newly
  set policies, not policies that are already in place.
* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
  The default value of `rgw_ops_log_rados` is now false, and
  `rgw_ops_log_file_path` defaults to
  "/var/log/ceph/ops-log-$cluster-$name.log".
* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
  Please note that this is not an officially supported feature.
* RGW's pubsub interface now returns boolean fields using bool. Before this
  change, `/topics/<topic-name>` returned "stored_secret" and "persistent" as
  the strings "true" or "false", with quotes around them. After this change,
  these fields are returned without quotes so they can be decoded as boolean
  values in JSON. The same applies to the `is_truncated` field returned by
  `/subscriptions/<sub-name>`.
* RGW's response to the `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST
  API now returns `HasStoredSecret` and `Persistent` as booleans in the JSON
  string encoded in `Attributes/EndPoint`.
* All boolean fields previously rendered as strings by the `radosgw-admin`
  command when the JSON format is used are now rendered as booleans. If your
  scripts/tools rely on this behavior, please update them accordingly. The
  impacted field names are:
  * absolute
  * add
  * admin
  * appendable
  * bucket_key_enabled
  * delete_marker
  * exists
  * has_bucket_info
  * high_precision_time
  * index
  * is_master
  * is_prefix
  * is_truncated
  * linked
  * log_meta
  * log_op
  * pending_removal
  * read_only
  * retain_head_object
  * rule_exist
  * start_with_full_sync
  * sync_from_all
  * syncstopped
  * system
  * truncated
  * user_stats_sync
* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
  configurable. This has the same defaults as debug_rgw, but can now be
  controlled independently. A sketch of adjusting the RGW options mentioned in
  the items above via ``ceph config set`` follows below.
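
  A minimal sketch of adjusting these RGW settings with ``ceph config set``
  (the ``client.rgw`` target and the example values are illustrative
  assumptions; only the option names come from the notes above)::

    # accept unknown principals again when mirroring policies with AWS
    ceph config set client.rgw rgw_policy_reject_invalid_principals false

    # keep writing the ops log to RADOS instead of the new file backend
    ceph config set client.rgw rgw_ops_log_rados true

    # raise the verbosity of the new beast access-log subsystem only
    ceph config set client.rgw debug_rgw_access 20
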
* RBD: The semantics of the compare-and-write C++ API
  (`Image::compare_and_write` and `Image::aio_compare_and_write` methods) now
  match those of the C API. Both the compare and the write steps operate only
  on `len` bytes even if the respective buffers are larger. The previous
  behavior of comparing up to the size of the compare buffer was prone to
  subtle breakage upon straddling a stripe unit boundary.
* RBD: The compare-and-write operation is no longer limited to 512-byte
  sectors. Assuming proper alignment, it now allows operating on stripe units
  (4M by default).
* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
  on both compare and write buffers. This complements the existing
  `rbd_aio_readv` and `rbd_aio_writev` methods.
* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard
  'AT_STATX_DONT_SYNC' macro instead. The 'AT_NO_ATTR_SYNC' macro will be
  removed in the future.
* Trimming of PGLog dups is now controlled by the size instead of the version.
  This fixes the PGLog inflation issue that occurred when the online (in-OSD)
  trimming got jammed after a PG split operation. Also, a new offline
  mechanism has been added: `ceph-objectstore-tool` gained a
  `trim-pg-log-dups` op that targets situations where an OSD is unable to boot
  because of those inflated dups. In that case, the "You can be hit by THE
  DUPS BUG" warning will be visible in the OSD logs.
  Relevant tracker: https://tracker.ceph.com/issues/53729
* RBD: The `rbd device unmap` command gained a `--namespace` option. Support
  for namespaces was added to RBD in Nautilus 14.2.0, and it has been possible
  to map and unmap images in namespaces using the `image-spec` syntax since
  then, but the corresponding option, available in most other commands, was
  missing.
* RGW: Compression is now supported for objects uploaded with Server-Side
  Encryption. When both are enabled, compression is applied before encryption.
  Earlier releases of multisite do not replicate such objects correctly, so
  all zones must upgrade to Reef before enabling the `compress-encrypted`
  zonegroup feature: see
  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
  security considerations.
* RGW: The "pubsub" functionality for storing bucket notifications inside Ceph
  has been removed, and the "pubsub" zone should no longer be used. The REST
  operations and the radosgw-admin commands for manipulating subscriptions, as
  well as for fetching and acking notifications, have been removed as well.
  If the endpoint to which notifications are sent may be down or disconnected,
  it is recommended to use persistent notifications to guarantee delivery. If
  the system that consumes the notifications needs to pull them (instead of
  having them pushed to it), an external message bus (e.g. RabbitMQ, Kafka)
  should be used for that purpose.
* RGW: The serialized format of notifications and topics has changed, so
  new/updated topics will be unreadable by old RGWs. We recommend completing
  the RGW upgrades before creating or modifying any notification topics.
* RBD: Trailing newline in passphrase files (the `<passphrase-file>` argument
  of the `rbd encryption format` command and the
  `--encryption-passphrase-file` option of other commands) is no longer
  stripped.
* RBD: Support for layered client-side encryption has been added. Cloned
  images can now each be encrypted with their own encryption format and
  passphrase, potentially different from those of the parent image. The
  efficient copy-on-write semantics intrinsic to unformatted (regular) cloned
  images are retained.
* CEPHFS: Renamed the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and moved it from mds.yaml.in to
  mds-client.yaml.in, because this option has only ever been used by the MDS
  client.
* The `perf dump` and `perf schema` commands are deprecated in favor of the
  new `counter dump` and `counter schema` commands. These new commands add
  support for labeled perf counters and also emit the existing unlabeled perf
  counters. Some unlabeled perf counters became labeled in this release, with
  more to follow in future releases; such converted perf counters are no
  longer emitted by the `perf dump` and `perf schema` commands.
* The `ceph mgr dump` command now outputs the `last_failure_osd_epoch` and
  `active_clients` fields at the top level. Previously, these fields were
  output under the `always_on_modules` field.
* The `ceph mgr dump` command now displays the name of the mgr module that
  registered a RADOS client in the `name` field added to elements of the
  `active_clients` array. Previously, only the address of a module's RADOS
  client was shown in the `active_clients` array. An example of inspecting the
  new output is sketched below.
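
  A minimal sketch of inspecting the relocated/extended `ceph mgr dump` fields
  and of the new counter commands (the `jq` filter and the `osd.0` admin-socket
  target are illustrative assumptions; `ceph daemon` must be run on the host
  that serves the daemon)::

    # the fields now live at the top level of the dump
    ceph mgr dump -f json | \
      jq '{last_failure_osd_epoch, active_clients: [.active_clients[].name]}'

    # labeled perf counters are emitted only by the new commands
    ceph daemon osd.0 counter dump
    ceph daemon osd.0 counter schema
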
* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
  emitted only by the new `counter dump` and `counter schema` commands. As
  part of the conversion, many also got renamed to better disambiguate
  journal-based and snapshot-based mirroring.
* RBD: The list-watchers C++ API (`Image::list_watchers`) now clears the
  passed `std::list` before potentially appending to it, aligning with the
  semantics of the corresponding C API (`rbd_watchers_list`).
* The rados python binding is now able to process (opt-in) omap keys as bytes
  objects. This enables interacting with RADOS omap keys that are not
  decodable as UTF-8 strings.
* Telemetry: Users who are opted in to telemetry can also opt in to
  participating in a leaderboard in the telemetry public dashboards
  (https://telemetry-public.ceph.com/). Users can now also add a description
  of the cluster to appear publicly in the leaderboard.
  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
  See a sample report with `ceph telemetry preview`.
  Opt in to telemetry with `ceph telemetry on`.
  Opt in to the leaderboard with
  `ceph config set mgr mgr/telemetry/leaderboard true`.
  Add a leaderboard description with:
  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
* CEPHFS: After recovering a Ceph File System by following the disaster
  recovery procedure, the recovered files under the `lost+found` directory can
  now be deleted.
* core: cache-tiering is now deprecated.
* mClock Scheduler: The mClock scheduler (the default scheduler in Quincy) has
  undergone significant usability and design improvements to address the slow
  backfill issue. Some important changes are listed below; a profile-selection
  sketch follows the list.
  * The 'balanced' profile is set as the default mClock profile because it
    represents a compromise between prioritizing client IO and recovery IO.
    Users can then choose either the 'high_client_ops' profile to prioritize
    client IO or the 'high_recovery_ops' profile to prioritize recovery IO.
  * QoS parameters like reservation and limit are now specified in terms of a
    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
  * The cost parameters (osd_mclock_cost_per_io_usec_* and
    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an
    operation is now determined using the random IOPS and maximum sequential
    bandwidth capability of the OSD's underlying device.
  * Degraded object recovery is given higher priority than misplaced object
    recovery because degraded objects present a data safety issue not present
    with objects that are merely misplaced. Therefore, backfill operations
    with the 'balanced' and 'high_client_ops' mClock profiles may progress
    more slowly than they did with the 'WeightedPriorityQueue' (WPQ)
    scheduler.
  * The QoS allocations in all the mClock profiles are optimized based on the
    above fixes and enhancements.
  * For more detailed information see:
    https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/
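
  A minimal sketch of switching between the mClock profiles described above
  (the `osd.0` target in the per-daemon query is an illustrative assumption)::

    # check the currently active profile
    ceph config get osd osd_mclock_profile

    # prioritize recovery/backfill IO cluster-wide, then revert to the default
    ceph config set osd osd_mclock_profile high_recovery_ops
    ceph config set osd osd_mclock_profile balanced

    # inspect what a single OSD reports for the profile setting
    ceph config show osd.0 osd_mclock_profile
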
* mgr/snap_schedule: The snap-schedule mgr module now retains one fewer
  snapshot than the number specified by the config tunable
  `mds_max_snaps_per_dir`, so that a new snapshot can be created and retained
  during the next schedule run.

>=17.2.1

* The "BlueStore zero block detection" feature (first introduced to Quincy in
  https://github.com/ceph/ceph/pull/43337) has been turned off by default and
  is controlled by a new global configuration option,
  `bluestore_zero_block_detection`. This feature, intended for large-scale
  synthetic testing, does not interact well with some RBD and CephFS features.
  Any side effects experienced in previous Quincy versions will no longer
  occur, provided that the configuration remains set to false.
  Relevant tracker: https://tracker.ceph.com/issues/55521

* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
  version, Kubernetes version, node metrics, etc.
  See a sample report with `ceph telemetry preview`.
  Opt in with `ceph telemetry on`.

  For more details, see:

  https://docs.ceph.com/en/latest/mgr/telemetry/

* OSD: The issue of high CPU utilization during recovery/backfill operations
  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.

>=15.2.17

* OSD: Octopus modified the SnapMapper key format from

    <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>

  to

    <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>

  When this change was introduced, 94ebe0e also introduced a conversion with
  a crucial bug which essentially destroyed legacy keys by mapping them to

    <MAPPING_PREFIX><poolid>_<snapid>_

  without the object-unique suffix. The conversion is fixed in this release.
  Relevant tracker: https://tracker.ceph.com/issues/56147

* Cephadm may now be configured to carry out CephFS MDS upgrades without
  reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1
  to avoid having two active MDS daemons modifying on-disk structures with new
  versions, communicating cross-version-incompatible messages, or hitting
  other potential incompatibilities. This could be disruptive for large-scale
  CephFS deployments because the cluster cannot easily reduce active MDS
  daemons to 1.
  NOTE: A staggered upgrade of the mons/mgrs may be necessary to take
  advantage of this feature; refer to this link for how to perform it (a brief
  sketch follows below):
  https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
  Relevant tracker: https://tracker.ceph.com/issues/55715
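
  A minimal sketch of a staggered upgrade that brings the mgrs/mons to the new
  release first (the container image tag and the daemon ordering are
  illustrative assumptions; see the staggered-upgrade documentation linked
  above)::

    # upgrade only the mgr and mon daemons first
    ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2 \
      --daemon-types mgr,mon

    # once that completes, upgrade the remaining daemon types
    ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2
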
* Introduced a new file system flag `refuse_client_session` that can be set
  using the `fs set` command. This flag allows blocking any incoming session
  request from clients. This can be useful during some recovery situations
  where it is desirable to bring the MDS up but have no client workload; a
  toggle sketch follows below.
  Relevant tracker: https://tracker.ceph.com/issues/57090
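
  A minimal sketch of toggling the flag during recovery (the file system name
  `cephfs` is an illustrative assumption)::

    # block all new client sessions while the MDS is being brought up
    ceph fs set cephfs refuse_client_session true

    # allow clients to connect again once recovery is complete
    ceph fs set cephfs refuse_client_session false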