author     Daniel Baumann <daniel.baumann@progress-linux.org>   2024-04-21 11:54:28 +0000
committer  Daniel Baumann <daniel.baumann@progress-linux.org>   2024-04-21 11:54:28 +0000
commit     e6918187568dbd01842d8d1d2c808ce16a894239 (patch)
tree       64f88b554b444a49f656b6c656111a145cbbaa28 /PendingReleaseNotes
parent     Initial commit. (diff)
download   ceph-e6918187568dbd01842d8d1d2c808ce16a894239.tar.xz
           ceph-e6918187568dbd01842d8d1d2c808ce16a894239.zip
Adding upstream version 18.2.2.
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'PendingReleaseNotes')
-rw-r--r--   PendingReleaseNotes   293
1 file changed, 293 insertions(+), 0 deletions(-)
diff --git a/PendingReleaseNotes b/PendingReleaseNotes
new file mode 100644
index 000000000..03520c97b
--- /dev/null
+++ b/PendingReleaseNotes
@@ -0,0 +1,293 @@

>=19.0.0

* RGW: S3 multipart uploads using Server-Side Encryption now replicate
  correctly in multi-site. Previously, the replicas of such objects were
  corrupted on decryption. A new tool, ``radosgw-admin bucket resync encrypted
  multipart``, can be used to identify these original multipart uploads. The
  ``LastModified`` timestamp of any identified object is incremented by 1ns to
  cause peer zones to replicate it again. For multi-site deployments that make
  any use of Server-Side Encryption, we recommend running this command against
  every bucket in every zone after all zones have upgraded.
* CEPHFS: The MDS now evicts clients which are not advancing their request
  tids, because such clients cause a large buildup of session metadata,
  resulting in the MDS going read-only due to the RADOS operation exceeding
  the size threshold. The `mds_session_metadata_threshold` config option
  controls the maximum size to which the (encoded) session metadata can grow.
* RGW: New tools have been added to radosgw-admin for identifying and
  correcting issues with versioned bucket indexes. Historical bugs with the
  versioned bucket index transaction workflow made it possible for the index
  to accumulate extraneous "book-keeping" olh entries and plain placeholder
  entries. In some specific scenarios where clients made concurrent requests
  referencing the same object key, it was likely that a large number of extra
  index entries would accumulate. When a significant number of these entries
  are present in a single bucket index shard, they can cause high bucket
  listing latencies and lifecycle processing failures. To check whether a
  versioned bucket has unnecessary olh entries, users can now run
  ``radosgw-admin bucket check olh``. If the ``--fix`` flag is used, the extra
  entries will be safely removed. Distinct from the issue described thus far,
  it is also possible that some versioned buckets maintain extra unlinked
  objects that are not listable from the S3/Swift APIs. These extra objects
  are typically the result of PUT requests that exited abnormally in the
  middle of a bucket index transaction, so the client would not have received
  a successful response. Bugs in prior releases made these unlinked objects
  easy to reproduce with any PUT request made against a bucket that was
  actively resharding. Besides the extra space that these hidden, unlinked
  objects consume, the failure mode that produced them can, in certain
  scenarios, also leave the object associated with the key in a state that
  clients observe as inconsistent. To check whether a versioned bucket has
  unlinked entries, users can now run ``radosgw-admin bucket check unlinked``.
  If the ``--fix`` flag is used, the unlinked objects will be safely removed.
  Finally, a third issue made it possible for versioned bucket index stats to
  be accounted inaccurately. The tooling for recalculating versioned bucket
  stats also had a bug and was not previously capable of fixing these
  inaccuracies. This release resolves those issues, and users can now expect
  the existing ``radosgw-admin bucket check`` command to produce correct
  results. We recommend that users with versioned buckets, especially buckets
  that existed on prior releases, use these new tools to check whether their
  buckets are affected and to clean them up accordingly; an example invocation
  is sketched below.
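
  A minimal sketch of the per-bucket check/repair workflow described above,
  assuming the usual ``--bucket`` argument of radosgw-admin (run the check-only
  form first and add ``--fix`` only once the report looks reasonable)::

    # check only: report extraneous olh / unlinked entries for one bucket
    radosgw-admin bucket check olh --bucket=<bucket>
    radosgw-admin bucket check unlinked --bucket=<bucket>

    # clean up the reported entries
    radosgw-admin bucket check olh --bucket=<bucket> --fix
    radosgw-admin bucket check unlinked --bucket=<bucket> --fix
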
* mgr/snap-schedule: For clusters with multiple CephFS file systems, all the
  snap-schedule commands now expect the '--fs' argument.

>=18.0.0

* The RGW policy parser now rejects unknown principals by default. If you are
  mirroring policies between RGW and AWS, you may wish to set
  "rgw policy reject invalid principals" to "false". This affects only newly
  set policies, not policies that are already in place.
* RGW's default backend for `rgw_enable_ops_log` changed from RADOS to file.
  The default value of `rgw_ops_log_rados` is now false, and
  `rgw_ops_log_file_path` defaults to
  "/var/log/ceph/ops-log-$cluster-$name.log".
* The SPDK backend for BlueStore is now able to connect to an NVMeoF target.
  Please note that this is not an officially supported feature.
* RGW's pubsub interface now returns boolean fields using bool. Before this
  change, `/topics/<topic-name>` returned "stored_secret" and "persistent" as
  the strings "true" or "false", with quotes around them. After this change,
  these fields are returned without quotes so they can be decoded as boolean
  values in JSON. The same applies to the `is_truncated` field returned by
  `/subscriptions/<sub-name>`.
* RGW's response to the `Action=GetTopicAttributes&TopicArn=<topic-arn>` REST
  API now returns `HasStoredSecret` and `Persistent` as booleans in the JSON
  string encoded in `Attributes/EndPoint`.
* All boolean fields previously rendered as strings by the `radosgw-admin`
  command when the JSON format is used are now rendered as booleans. If your
  scripts/tools rely on this behavior, please update them accordingly. The
  impacted field names are:
  * absolute
  * add
  * admin
  * appendable
  * bucket_key_enabled
  * delete_marker
  * exists
  * has_bucket_info
  * high_precision_time
  * index
  * is_master
  * is_prefix
  * is_truncated
  * linked
  * log_meta
  * log_op
  * pending_removal
  * read_only
  * retain_head_object
  * rule_exist
  * start_with_full_sync
  * sync_from_all
  * syncstopped
  * system
  * truncated
  * user_stats_sync
* RGW: The beast frontend's HTTP access log line uses a new debug_rgw_access
  configurable. This has the same defaults as debug_rgw, but can now be
  controlled independently. A sketch of adjusting the RGW options mentioned in
  the items above via ``ceph config set`` follows below.
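
  A minimal sketch of adjusting these RGW settings with ``ceph config set``
  (the ``client.rgw`` target and the example values are illustrative
  assumptions; only the option names come from the notes above)::

    # accept unknown principals again when mirroring policies with AWS
    ceph config set client.rgw rgw_policy_reject_invalid_principals false

    # keep writing the ops log to RADOS instead of the new file backend
    ceph config set client.rgw rgw_ops_log_rados true

    # raise the verbosity of the new beast access-log subsystem only
    ceph config set client.rgw debug_rgw_access 20
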
* RBD: The semantics of the compare-and-write C++ API
  (`Image::compare_and_write` and `Image::aio_compare_and_write` methods) now
  match those of the C API. Both the compare and the write steps operate only
  on `len` bytes even if the respective buffers are larger. The previous
  behavior of comparing up to the size of the compare buffer was prone to
  subtle breakage upon straddling a stripe unit boundary.
* RBD: The compare-and-write operation is no longer limited to 512-byte
  sectors. Assuming proper alignment, it now allows operating on stripe units
  (4M by default).
* RBD: New `rbd_aio_compare_and_writev` API method to support scatter/gather
  on both compare and write buffers. This complements the existing
  `rbd_aio_readv` and `rbd_aio_writev` methods.
* The 'AT_NO_ATTR_SYNC' macro is deprecated; please use the standard
  'AT_STATX_DONT_SYNC' macro instead. The 'AT_NO_ATTR_SYNC' macro will be
  removed in the future.
* Trimming of PGLog dups is now controlled by the size instead of the version.
  This fixes the PGLog inflation issue that occurred when the online (in-OSD)
  trimming got jammed after a PG split operation. Also, a new offline
  mechanism has been added: `ceph-objectstore-tool` gained a
  `trim-pg-log-dups` op that targets situations where an OSD is unable to boot
  because of those inflated dups. In that case, the "You can be hit by THE
  DUPS BUG" warning will be visible in the OSD logs.
  Relevant tracker: https://tracker.ceph.com/issues/53729
* RBD: The `rbd device unmap` command gained a `--namespace` option. Support
  for namespaces was added to RBD in Nautilus 14.2.0, and it has been possible
  to map and unmap images in namespaces using the `image-spec` syntax since
  then, but the corresponding option, available in most other commands, was
  missing.
* RGW: Compression is now supported for objects uploaded with Server-Side
  Encryption. When both are enabled, compression is applied before encryption.
  Earlier releases of multisite do not replicate such objects correctly, so
  all zones must upgrade to Reef before enabling the `compress-encrypted`
  zonegroup feature: see
  https://docs.ceph.com/en/reef/radosgw/multisite/#zone-features and note the
  security considerations.
* RGW: The "pubsub" functionality for storing bucket notifications inside Ceph
  has been removed, and the "pubsub" zone should no longer be used. The REST
  operations and the radosgw-admin commands for manipulating subscriptions, as
  well as for fetching and acking notifications, have been removed as well.
  If the endpoint to which notifications are sent may be down or disconnected,
  it is recommended to use persistent notifications to guarantee delivery. If
  the system that consumes the notifications needs to pull them (instead of
  having them pushed to it), an external message bus (e.g. RabbitMQ, Kafka)
  should be used for that purpose.
* RGW: The serialized format of notifications and topics has changed, so
  new/updated topics will be unreadable by old RGWs. We recommend completing
  the RGW upgrades before creating or modifying any notification topics.
* RBD: Trailing newline in passphrase files (the `<passphrase-file>` argument
  of the `rbd encryption format` command and the
  `--encryption-passphrase-file` option of other commands) is no longer
  stripped.
* RBD: Support for layered client-side encryption has been added. Cloned
  images can now each be encrypted with their own encryption format and
  passphrase, potentially different from those of the parent image. The
  efficient copy-on-write semantics intrinsic to unformatted (regular) cloned
  images are retained.
* CEPHFS: Renamed the `mds_max_retries_on_remount_failure` option to
  `client_max_retries_on_remount_failure` and moved it from mds.yaml.in to
  mds-client.yaml.in, because this option has only ever been used by the MDS
  client.
* The `perf dump` and `perf schema` commands are deprecated in favor of the
  new `counter dump` and `counter schema` commands. These new commands add
  support for labeled perf counters and also emit the existing unlabeled perf
  counters. Some unlabeled perf counters became labeled in this release, with
  more to follow in future releases; such converted perf counters are no
  longer emitted by the `perf dump` and `perf schema` commands.
* The `ceph mgr dump` command now outputs the `last_failure_osd_epoch` and
  `active_clients` fields at the top level. Previously, these fields were
  output under the `always_on_modules` field.
* The `ceph mgr dump` command now displays the name of the mgr module that
  registered a RADOS client in the `name` field added to elements of the
  `active_clients` array. Previously, only the address of a module's RADOS
  client was shown in the `active_clients` array. An example of inspecting the
  new output is sketched below.
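
  A minimal sketch of inspecting the relocated/extended `ceph mgr dump` fields
  and of the new counter commands (the `jq` filter and the `osd.0` admin-socket
  target are illustrative assumptions; `ceph daemon` must be run on the host
  that serves the daemon)::

    # the fields now live at the top level of the dump
    ceph mgr dump -f json | \
      jq '{last_failure_osd_epoch, active_clients: [.active_clients[].name]}'

    # labeled perf counters are emitted only by the new commands
    ceph daemon osd.0 counter dump
    ceph daemon osd.0 counter schema
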
* RBD: All rbd-mirror daemon perf counters became labeled and as such are now
  emitted only by the new `counter dump` and `counter schema` commands. As
  part of the conversion, many also got renamed to better disambiguate
  journal-based and snapshot-based mirroring.
* RBD: The list-watchers C++ API (`Image::list_watchers`) now clears the
  passed `std::list` before potentially appending to it, aligning with the
  semantics of the corresponding C API (`rbd_watchers_list`).
* The rados python binding is now able to process (opt-in) omap keys as bytes
  objects. This enables interacting with RADOS omap keys that are not
  decodable as UTF-8 strings.
* Telemetry: Users who are opted in to telemetry can also opt in to
  participating in a leaderboard in the telemetry public dashboards
  (https://telemetry-public.ceph.com/). Users can now also add a description
  of the cluster to appear publicly in the leaderboard.
  For more details, see:
  https://docs.ceph.com/en/latest/mgr/telemetry/#leaderboard
  See a sample report with `ceph telemetry preview`.
  Opt in to telemetry with `ceph telemetry on`.
  Opt in to the leaderboard with
  `ceph config set mgr mgr/telemetry/leaderboard true`.
  Add a leaderboard description with:
  `ceph config set mgr mgr/telemetry/leaderboard_description 'Cluster description'`.
* CEPHFS: After recovering a Ceph File System by following the disaster
  recovery procedure, the recovered files under the `lost+found` directory can
  now be deleted.
* core: cache-tiering is now deprecated.
* mClock Scheduler: The mClock scheduler (the default scheduler in Quincy) has
  undergone significant usability and design improvements to address the slow
  backfill issue. Some important changes are listed below; a profile-selection
  sketch follows the list.
  * The 'balanced' profile is set as the default mClock profile because it
    represents a compromise between prioritizing client IO and recovery IO.
    Users can then choose either the 'high_client_ops' profile to prioritize
    client IO or the 'high_recovery_ops' profile to prioritize recovery IO.
  * QoS parameters like reservation and limit are now specified in terms of a
    fraction (range: 0.0 to 1.0) of the OSD's IOPS capacity.
  * The cost parameters (osd_mclock_cost_per_io_usec_* and
    osd_mclock_cost_per_byte_usec_*) have been removed. The cost of an
    operation is now determined using the random IOPS and maximum sequential
    bandwidth capability of the OSD's underlying device.
  * Degraded object recovery is given higher priority than misplaced object
    recovery because degraded objects present a data safety issue not present
    with objects that are merely misplaced. Therefore, backfill operations
    with the 'balanced' and 'high_client_ops' mClock profiles may progress
    more slowly than they did with the 'WeightedPriorityQueue' (WPQ)
    scheduler.
  * The QoS allocations in all the mClock profiles are optimized based on the
    above fixes and enhancements.
  * For more detailed information see:
    https://docs.ceph.com/en/reef/rados/configuration/mclock-config-ref/
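
  A minimal sketch of switching between the mClock profiles described above
  (the `osd.0` target in the per-daemon query is an illustrative assumption)::

    # check the currently active profile
    ceph config get osd osd_mclock_profile

    # prioritize recovery/backfill IO cluster-wide, then revert to the default
    ceph config set osd osd_mclock_profile high_recovery_ops
    ceph config set osd osd_mclock_profile balanced

    # inspect what a single OSD reports for the profile setting
    ceph config show osd.0 osd_mclock_profile
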
* mgr/snap_schedule: The snap-schedule mgr module now retains one fewer
  snapshot than the number specified by the config tunable
  `mds_max_snaps_per_dir`, so that a new snapshot can be created and retained
  during the next schedule run.

>=17.2.1

* The "BlueStore zero block detection" feature (first introduced to Quincy in
  https://github.com/ceph/ceph/pull/43337) has been turned off by default and
  is controlled by a new global configuration option,
  `bluestore_zero_block_detection`. This feature, intended for large-scale
  synthetic testing, does not interact well with some RBD and CephFS features.
  Any side effects experienced in previous Quincy versions will no longer
  occur, provided that the configuration remains set to false.
  Relevant tracker: https://tracker.ceph.com/issues/55521

* telemetry: Added new Rook metrics to the 'basic' channel to report Rook's
  version, Kubernetes version, node metrics, etc.
  See a sample report with `ceph telemetry preview`.
  Opt in with `ceph telemetry on`.

  For more details, see:

  https://docs.ceph.com/en/latest/mgr/telemetry/

* OSD: The issue of high CPU utilization during recovery/backfill operations
  has been fixed. For more details, see: https://tracker.ceph.com/issues/56530.

>=15.2.17

* OSD: Octopus modified the SnapMapper key format from

    <LEGACY_MAPPING_PREFIX><snapid>_<shardid>_<hobject_t::to_str()>

  to

    <MAPPING_PREFIX><pool>_<snapid>_<shardid>_<hobject_t::to_str()>

  When this change was introduced, 94ebe0e also introduced a conversion with
  a crucial bug which essentially destroyed legacy keys by mapping them to

    <MAPPING_PREFIX><poolid>_<snapid>_

  without the object-unique suffix. The conversion is fixed in this release.
  Relevant tracker: https://tracker.ceph.com/issues/56147

* Cephadm may now be configured to carry out CephFS MDS upgrades without
  reducing ``max_mds`` to 1. Previously, Cephadm would reduce ``max_mds`` to 1
  to avoid having two active MDS daemons modifying on-disk structures with new
  versions, communicating cross-version-incompatible messages, or hitting
  other potential incompatibilities. This could be disruptive for large-scale
  CephFS deployments because the cluster cannot easily reduce active MDS
  daemons to 1.
  NOTE: A staggered upgrade of the mons/mgrs may be necessary to take
  advantage of this feature; refer to this link for how to perform it (a brief
  sketch follows below):
  https://docs.ceph.com/en/quincy/cephadm/upgrade/#staggered-upgrade
  Relevant tracker: https://tracker.ceph.com/issues/55715
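
  A minimal sketch of a staggered upgrade that brings the mgrs/mons to the new
  release first (the container image tag and the daemon ordering are
  illustrative assumptions; see the staggered-upgrade documentation linked
  above)::

    # upgrade only the mgr and mon daemons first
    ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2 \
      --daemon-types mgr,mon

    # once that completes, upgrade the remaining daemon types
    ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2
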
* Introduced a new file system flag `refuse_client_session` that can be set
  using the `fs set` command. This flag allows blocking any incoming session
  request from clients. This can be useful during some recovery situations
  where it is desirable to bring the MDS up but have no client workload; a
  toggle sketch follows below.
  Relevant tracker: https://tracker.ceph.com/issues/57090
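
  A minimal sketch of toggling the flag during recovery (the file system name
  `cephfs` is an illustrative assumption)::

    # block all new client sessions while the MDS is being brought up
    ceph fs set cephfs refuse_client_session true

    # allow clients to connect again once recovery is complete
    ceph fs set cephfs refuse_client_session false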