author    Daniel Baumann <daniel.baumann@progress-linux.org>  2024-04-21 11:54:28 +0000
committer Daniel Baumann <daniel.baumann@progress-linux.org>  2024-04-21 11:54:28 +0000
commit    e6918187568dbd01842d8d1d2c808ce16a894239
tree      64f88b554b444a49f656b6c656111a145cbbaa28
parent    Initial commit.
Adding upstream version 18.2.2.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'PendingReleaseNotes')
-rw-r--r--  PendingReleaseNotes  293
1 file changed, 293 insertions(+), 0 deletions(-)
diff --git a/PendingReleaseNotes b/PendingReleaseNotes
new file mode 100644
index 000000000..03520c97b
--- /dev/null
+++ b/PendingReleaseNotes
@@ -0,0 +1,293 @@
+>=19.0.0
+
+* RGW: S3 multipart uploads using Server-Side Encryption now replicate correctly in
+ multi-site. Previously, the replicas of such objects were corrupted on decryption.
+ A new tool, ``radosgw-admin bucket resync encrypted multipart``, can be used to
+ identify these original multipart uploads. The ``LastModified`` timestamp of any
+ identified object is incremented by 1ns to cause peer zones to replicate it again.
+  For multi-site deployments that make any use of Server-Side Encryption, we
+  recommend running this command against every bucket in every zone after all
+  zones have upgraded.
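+  As a minimal sketch (the per-bucket invocation and ``--bucket`` flag are
+  assumptions, not taken from the tool's documented usage), the resync could be
+  driven over all buckets in a zone like this:
+
+    # enumerate buckets and resync encrypted multipart uploads in each one
+    for b in $(radosgw-admin bucket list | jq -r '.[]'); do
+        radosgw-admin bucket resync encrypted multipart --bucket="$b"
+    done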
+* CEPHFS: The MDS now evicts clients which are not advancing their request tids,
+  as these cause a large buildup of session metadata, resulting in the MDS going
+  read-only due to the RADOS operation exceeding the size threshold. The
+  `mds_session_metadata_threshold` config option controls the maximum size that
+  the (encoded) session metadata can grow to.
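+  For example, a hedged sketch of raising that threshold (the value is purely
+  illustrative, not a recommendation):
+
+    # allow the encoded session metadata to grow larger before eviction kicks in
+    ceph config set mds mds_session_metadata_threshold 134217728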
+* RGW: New tools have been added to radosgw-admin for identifying and
+ correcting issues with versioned bucket indexes. Historical bugs with the
+ versioned bucket index transaction workflow made it possible for the index
+ to accumulate extraneous "book-keeping" olh entries and plain placeholder
+ entries. In some specific scenarios where clients made concurrent requests
+ referencing the same object key, it was likely that a lot of extra index
+ entries would accumulate. When a significant number of these entries are
+ present in a single bucket index shard, they can cause high bucket listing
+ latencies and lifecycle processing failures. To check whether a versioned
+ bucket has unnecessary olh entries, users can now run ``radosgw-admin
+ bucket check olh``. If the ``--fix`` flag is used, the extra entries will
+  be safely removed. In a distinct issue from the one described thus far, it is
+  also possible that some versioned buckets are maintaining extra unlinked
+  objects that are not listable from the S3/Swift APIs. These extra objects
+ are typically a result of PUT requests that exited abnormally, in the middle
+ of a bucket index transaction - so the client would not have received a
+ successful response. Bugs in prior releases made these unlinked objects easy
+ to reproduce with any PUT request that was made on a bucket that was actively
+  resharding. Besides the extra space that these hidden, unlinked objects
+  consume, there can be another side effect in certain scenarios: because of
+  the nature of the failure mode that produced them, a client of an affected
+  bucket may find the object associated with the key in an inconsistent
+  state. To check whether a versioned bucket has unlinked
+ entries, users can now run ``radosgw-admin bucket check unlinked``. If the
+ ``--fix`` flag is used, the unlinked objects will be safely removed. Finally,
+ a third issue made it possible for versioned bucket index stats to be
+ accounted inaccurately. The tooling for recalculating versioned bucket stats
+ also had a bug, and was not previously capable of fixing these inaccuracies.
+ This release resolves those issues and users can now expect that the existing
+ ``radosgw-admin bucket check`` command will produce correct results. We
+ recommend that users with versioned buckets, especially those that existed
+ on prior releases, use these new tools to check whether their buckets are
+ affected and to clean them up accordingly.
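+  A minimal sketch of that check/fix workflow for a single bucket (the bucket
+  name is a placeholder):
+
+    # report problems without modifying the bucket index
+    radosgw-admin bucket check olh --bucket=mybucket
+    radosgw-admin bucket check unlinked --bucket=mybucket
+
+    # remove extra olh entries and unlinked objects, then recheck the stats
+    radosgw-admin bucket check olh --bucket=mybucket --fix
+    radosgw-admin bucket check unlinked --bucket=mybucket --fix
+    radosgw-admin bucket check --bucket=mybucket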
+* mgr/snap-schedule: For clusters with multiple CephFS file systems, all the
+ snap-schedule commands now expect the '--fs' argument.
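+  For example (file system and path names are placeholders):
+
+    ceph fs snap-schedule add / 1h --fs=cephfs
+    ceph fs snap-schedule list / --fs=cephfs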
+
+>=18.0.0
+
+* The RGW policy parser now rejects unknown principals by default. If you are
+ mirroring policies between RGW and AWS, you may wish to set
+ "rgw policy reject invalid principals" to "false". This affects only newly set
+ policies, not policies that are already in place.
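+  A hedged example of restoring the old behavior, assuming the usual mapping of
+  the option name to `rgw_policy_reject_invalid_principals` and a `client.rgw`
+  config section:
+
+    ceph config set client.rgw rgw_policy_reject_invalid_principals false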
+* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
+ The default value of `rgw_ops_log_rados` is now false, and `rgw_ops_log_file_path`
+ defaults to "/var/log/ceph/ops-log-$cluster-$name.log".
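+  To switch back to the RADOS backend, or to relocate the log file, something
+  along these lines should work (the `client.rgw` section name is an assumption):
+
+    ceph config set client.rgw rgw_ops_log_rados true
+    ceph config set client.rgw rgw_ops_log_file_path '/var/log/ceph/ops-log-$cluster-$name.log'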
+* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
+ Please note that this is not an officially supported feature.
+* RGW's pubsub interface now returns boolean fields using bool. Before this change,
+  `/topics/<topic-name>` returned "stored_secret" and "persistent" as the quoted
+  strings "true" or "false". After this change, these fields are returned without
+  quotes so they can be decoded as boolean values in JSON.
+ The same applies to the `is_truncated` field returned by `/subscriptions/<sub-name>`.
+* RGW's response of `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST API now
+ returns `HasStoredSecret` and `Persistent` as boolean in the JSON string
+ encoded in `Attributes/EndPoint`.
+* All boolean fields previously rendered as strings by the `rgw-admin` command when
+  the JSON format is used are now rendered as booleans. If your scripts/tools
+  rely on this behavior, please update them accordingly (see the example after
+  this list). The impacted field names are:
+ * absolute
+ * add
+ * admin
+ * appendable
+ * bucket_key_enabled
+ * delete_marker
+ * exists
+ * has_bucket_info
+ * high_precision_time
+ * index
+ * is_master
+ * is_prefix
+ * is_truncated
+ * linked
+ * log_meta
+ * log_op
+ * pending_removal
+ * read_only
+ * retain_head_object
+ * rule_exist
+ * start_with_full_sync
+ * sync_from_all
+ * syncstopped
+ * system
+ * truncated
+ * user_stats_sync
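+  For example, a script that string-compared one of these fields would change
+  roughly as follows (the subcommand is a placeholder):
+
+    # before: boolean fields were rendered as quoted strings
+    radosgw-admin <subcommand> --format=json | jq -e '.is_truncated == "true"'
+    # after: they are rendered as JSON booleans
+    radosgw-admin <subcommand> --format=json | jq -e '.is_truncated == true'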
+* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
+ configurable. This has the same defaults as debug_rgw, but can now be controlled
+ independently.
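+  For example, to raise the access log verbosity independently of the general
+  RGW debug level (section and level are illustrative):
+
+    ceph config set client.rgw debug_rgw_access 20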
+* RBD: The semantics of the compare-and-write C++ API (`Image::compare_and_write`
+  and `Image::aio_compare_and_write` methods) now match those of the C API. Both
+ compare and write steps operate only on `len` bytes even if the respective
+ buffers are larger. The previous behavior of comparing up to the size of
+ the compare buffer was prone to subtle breakage upon straddling a stripe
+ unit boundary.
+* RBD: compare-and-write operation is no longer limited to 512-byte sectors.
+ Assuming proper alignment, it now allows operating on stripe units (4M by
+ default).
+* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
+  on both compare and write buffers. This complements the existing `rbd_aio_readv`
+  and `rbd_aio_writev` methods.
+* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard
+  'AT_STATX_DONT_SYNC' macro instead. The 'AT_NO_ATTR_SYNC' macro will be removed
+  in the future.
+* Trimming of PGLog dups is now controlled by the size instead of the version.
+ This fixes the PGLog inflation issue that was happening when the on-line
+  (in OSD) trimming got jammed after a PG split operation. Also, a new off-line
+  mechanism has been added: `ceph-objectstore-tool` gained a `trim-pg-log-dups` op
+  that targets situations where an OSD is unable to boot due to those inflated dups.
+  If that is the case, the "You can be hit by THE DUPS BUG" warning will be
+  visible in the OSD logs.
+ Relevant tracker: https://tracker.ceph.com/issues/53729
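+  A hedged sketch of the off-line trim on a stopped OSD (the data path and pgid
+  are placeholders):
+
+    # run against a stopped OSD whose log shows the DUPS BUG warning
+    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 \
+        --pgid 1.0 --op trim-pg-log-dups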
+* RBD: The `rbd device unmap` command gained a `--namespace` option. Support for
+  namespaces was added to RBD in Nautilus 14.2.0, and it has been possible to
+  map and unmap images in namespaces using the `image-spec` syntax since then,
+  but the corresponding option was missing from this command even though it is
+  available in most other commands.
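+  For example (pool, namespace, and image names are placeholders; the option
+  form is assumed to mirror other rbd commands):
+
+    # image-spec syntax, available since Nautilus
+    rbd device unmap mypool/mynamespace/myimage
+    # equivalent form using the new option
+    rbd device unmap --pool mypool --namespace mynamespace --image myimage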
+* RGW: Compression is now supported for objects uploaded with Server-Side Encryption.
+ When both are enabled, compression is applied before encryption. Earlier releases
+ of multisite do not replicate such objects correctly, so all zones must upgrade to
+ Reef before enabling the `compress-encrypted` zonegroup feature: see
+ https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
+ security considerations.
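+  Once every zone is on Reef, the feature can be enabled roughly as follows (the
+  zonegroup name is a placeholder):
+
+    radosgw-admin zonegroup modify --rgw-zonegroup=default --enable-feature=compress-encrypted
+    radosgw-admin period update --commit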
+* RGW: the "pubsub" functionality for storing bucket notifications inside Ceph
+ is removed. Together with it, the "pubsub" zone should not be used anymore.
+ The REST operations, as well as radosgw-admin commands for manipulating
+ subscriptions, as well as fetching and acking the notifications are removed
+ as well.
+ In case that the endpoint to which the notifications are sent maybe down or
+ disconnected, it is recommended to use persistent notifications to guarantee
+ the delivery of the notifications. In case the system that consumes the
+ notifications needs to pull them (instead of the notifications be pushed
+ to it), an external message bus (e.g. rabbitmq, Kafka) should be used for
+ that purpose.
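+  As a hedged sketch, a persistent topic can be created through the
+  SNS-compatible API, for example with the AWS CLI (the endpoint URL, topic
+  name, and broker URI are placeholders):
+
+    aws --endpoint-url http://rgw.example.com:8000 sns create-topic --name mytopic \
+        --attributes='{"push-endpoint": "amqp://broker.example.com:5672", "persistent": "true"}'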
+* RGW: The serialized format of notification and topics has changed, so that
+ new/updated topics will be unreadable by old RGWs. We recommend completing
+ the RGW upgrades before creating or modifying any notification topics.
+* RBD: Trailing newline in passphrase files (`<passphrase-file>` argument in
+ `rbd encryption format` command and `--encryption-passphrase-file` option
+ in other commands) is no longer stripped.
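+  When creating passphrase files, make sure they contain exactly the intended
+  bytes, e.g. (pool/image names are placeholders):
+
+    printf '%s' "my passphrase" > passphrase.bin
+    rbd encryption format mypool/myimage luks2 passphrase.bin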
+* RBD: Support for layered client-side encryption is added. Cloned images
+ can now be encrypted each with its own encryption format and passphrase,
+ potentially different from that of the parent image. The efficient
+ copy-on-write semantics intrinsic to unformatted (regular) cloned images
+ are retained.
+* CEPHFS: Rename the `mds_max_retries_on_remount_failure` option to
+  `client_max_retries_on_remount_failure` and move it from mds.yaml.in to
+  mds-client.yaml.in, because this option has only ever been used by the MDS
+  client.
+* The `perf dump` and `perf schema` commands are deprecated in favor of new
+ `counter dump` and `counter schema` commands. These new commands add support
+ for labeled perf counters and also emit existing unlabeled perf counters. Some
+ unlabeled perf counters became labeled in this release, with more to follow in
+ future releases; such converted perf counters are no longer emitted by the
+ `perf dump` and `perf schema` commands.
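+  For example, a daemon's labeled and unlabeled counters can be inspected
+  through its admin socket (the daemon name is a placeholder):
+
+    ceph daemon osd.0 counter schema
+    ceph daemon osd.0 counter dump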
+* `ceph mgr dump` command now outputs `last_failure_osd_epoch` and
+ `active_clients` fields at the top level. Previously, these fields were
+ output under `always_on_modules` field.
+* `ceph mgr dump` command now displays the name of the mgr module that
+ registered a RADOS client in the `name` field added to elements of the
+ `active_clients` array. Previously, only the address of a module's RADOS
+ client was shown in the `active_clients` array.
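+  For example, the new top-level fields can be inspected with (the jq filter is
+  illustrative):
+
+    ceph mgr dump | jq '.last_failure_osd_epoch, [.active_clients[].name]'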
+* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
+ emitted only by the new `counter dump` and `counter schema` commands. As part
+ of the conversion, many also got renamed to better disambiguate journal-based
+ and snapshot-based mirroring.
+* RBD: list-watchers C++ API (`Image::list_watchers`) now clears the passed
+ `std::list` before potentially appending to it, aligning with the semantics
+ of the corresponding C API (`rbd_watchers_list`).
+* The rados python binding is now able to process (opt-in) omap keys as bytes
+  objects. This enables interacting with RADOS omap keys that are not decodable as
+  UTF-8 strings.
+* Telemetry: Users who are opted-in to telemetry can also opt-in to
+ participating in a leaderboard in the telemetry public
+ dashboards (https://telemetry-public.ceph.com/). Users can now also add a
+ description of the cluster to publicly appear in the leaderboard.
+ For more details, see:
+ https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
+ See a sample report with `ceph telemetry preview`.
+ Opt-in to telemetry with `ceph telemetry on`.
+ Opt-in to the leaderboard with
+ `ceph config set mgr mgr/telemetry/leaderboard true`.
+ Add leaderboard description with:
+  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
+* CEPHFS: After recovering a Ceph File System following the disaster recovery
+  procedure, the recovered files under the `lost+found` directory can now be deleted.
+* core: cache-tiering is now deprecated.
+* mClock Scheduler: The mClock scheduler (default scheduler in Quincy) has
+ undergone significant usability and design improvements to address the slow
+ backfill issue. Some important changes are:
+  * The 'balanced' profile is set as the default mClock profile because it
+    represents a compromise between prioritizing client IO and recovery IO. Users
+    can then choose either the 'high_client_ops' profile to prioritize client IO
+    or the 'high_recovery_ops' profile to prioritize recovery IO.
+ * QoS parameters like reservation and limit are now specified in terms of a
+ fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
+ * The cost parameters (osd_mclock_cost_per_io_usec_* and
+ osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an operation
+ is now determined using the random IOPS and maximum sequential bandwidth
+ capability of the OSD's underlying device.
+ * Degraded object recovery is given higher priority when compared to misplaced
+ object recovery because degraded objects present a data safety issue not
+ present with objects that are merely misplaced. Therefore, backfilling
+ operations with the 'balanced' and 'high_client_ops' mClock profiles may
+ progress slower than what was seen with the 'WeightedPriorityQueue' (WPQ)
+ scheduler.
+ * The QoS allocations in all the mClock profiles are optimized based on the above
+ fixes and enhancements.
+ * For more detailed information see:
+ https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/
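+  For example (OSD id and values are illustrative; device capacity is measured
+  automatically):
+
+    # prioritize client IO on a single OSD
+    ceph config set osd.0 osd_mclock_profile high_client_ops
+    # or use the custom profile and express QoS as a fraction of IOPS capacity
+    ceph config set osd.0 osd_mclock_profile custom
+    ceph config set osd.0 osd_mclock_scheduler_client_res 0.4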
+* mgr/snap_schedule: The snap-schedule mgr module now retains one snapshot less
+  than the number specified by the config tunable `mds_max_snaps_per_dir`, so
+  that a new snapshot can be created and retained during the next schedule
+  run.
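+  For example, with `mds_max_snaps_per_dir` at its default of 100, at most 99
+  scheduled snapshots are retained per directory; the limit can be adjusted if
+  needed:
+
+    ceph config set mds mds_max_snaps_per_dir 150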
+
+>=17.2.1
+
+* The "BlueStore zero block detection" feature (first introduced to Quincy in
+https://github.com/ceph/ceph/pull/43337) has been turned off by default with a
+new global configuration called `bluestore_zero_block_detection`. This feature,
+intended for large-scale synthetic testing, does not interact well with some RBD
+and CephFS features. Any side effects experienced in previous Quincy versions
+would no longer occur, provided that the configuration remains set to false.
+Relevant tracker: https://tracker.ceph.com/issues/55521
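+For synthetic testing the feature can still be enabled explicitly, e.g.:
+
+  ceph config set osd bluestore_zero_block_detection true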
+
+* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
+ version, Kubernetes version, node metrics, etc.
+ See a sample report with `ceph telemetry preview`.
+ Opt-in with `ceph telemetry on`.
+
+ For more details, see:
+
+ https://docs.ceph.com/en/latest/mgr/telemetry/
+
+* OSD: The issue of high CPU utilization during recovery/backfill operations
+ has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.
+
+>=15.2.17
+
+* OSD: Octopus modified the SnapMapper key format from
+ <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>
+ to
+ <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>
+  When this change was introduced, commit 94ebe0e also introduced a conversion
+  with a crucial bug which essentially destroyed legacy keys by mapping them
+ to
+ <MAPPING_PREFIX><poolid>_<snapid>_
+ without the object-unique suffix. The conversion is fixed in this release.
+ Relevant tracker: https://tracker.ceph.com/issues/56147
+
+* Cephadm may now be configured to carry out CephFS MDS upgrades without
+reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1 to
+avoid having two active MDS modifying on-disk structures with new versions,
+communicating cross-version-incompatible messages, or other potential
+incompatibilities. This could be disruptive for large-scale CephFS deployments
+because the cluster cannot easily reduce active MDS daemons to 1.
+NOTE: Staggered upgrade of the mons/mgrs may be necessary to take advantage
+of this feature; refer to this link on how to perform it:
+https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
+Relevant tracker: https://tracker.ceph.com/issues/55715
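+A hedged example of such a staggered upgrade, upgrading the mgr and mon daemons
+first (the image tag is a placeholder):
+  ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.6 --daemon-types mgr
+  ceph orch upgrade start --image quay.io/ceph/ceph:v17.2.6 --daemon-types mon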
+
+* Introduced a new file system flag `refuse_client_session` that can be set using the
+`fs set` command. This flag allows blocking any incoming session
+request from clients. This can be useful during some recovery situations
+where it's desirable to bring the MDS up but have no client workload.
+Relevant tracker: https://tracker.ceph.com/issues/57090
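+A minimal example (the file system name is a placeholder):
+  ceph fs set cephfs refuse_client_session true
+  # allow client sessions again once recovery is done
+  ceph fs set cephfs refuse_client_session false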