diff options
Diffstat (limited to '')
-rw-r--r-- | PendingReleaseNotes | 475 |
1 files changed, 475 insertions, 0 deletions
diff --git a/PendingReleaseNotes b/PendingReleaseNotes new file mode 100644 index 000000000..ea3a21e5d --- /dev/null +++ b/PendingReleaseNotes @@ -0,0 +1,475 @@ +>=17.0.0 + +* `ceph-mgr-modules-core` debian package does not recommend `ceph-mgr-rook` + anymore. As the latter depends on `python3-numpy` which cannot be imported in + different Python sub-interpreters multi-times if the version of + `python3-numpy` is older than 1.19. Since `apt-get` installs the `Recommends` + packages by default, `ceph-mgr-rook` was always installed along with + `ceph-mgr` debian package as an indirect dependency. If your workflow depends + on this behavior, you might want to install `ceph-mgr-rook` separately. + +* A new library is available, libcephsqlite. It provides a SQLite Virtual File + System (VFS) on top of RADOS. The database and journals are striped over + RADOS across multiple objects for virtually unlimited scaling and throughput + only limited by the SQLite client. Applications using SQLite may change to + the Ceph VFS with minimal changes, usually just by specifying the alternate + VFS. We expect the library to be most impactful and useful for applications + that were storing state in RADOS omap, especially without striping which + limits scalability. + +* MDS upgrades no longer require stopping all standby MDS daemons before + upgrading the sole active MDS for a file system. + +* CephFS: Failure to replay the journal by a standby-replay daemon will now + cause the rank to be marked damaged. + +* RGW: It is possible to specify ssl options and ciphers for beast frontend now. + The default ssl options setting is "no_sslv2:no_sslv3:no_tlsv1:no_tlsv1_1". + If you want to return back the old behavior add 'ssl_options=' (empty) to + ``rgw frontends`` configuration. + +* fs: A file system can be created with a specific ID ("fscid"). This is useful + in certain recovery scenarios, e.g., monitor database lost and rebuilt, and + the restored file system is expected to have the same ID as before. + +>=16.2.11 +-------- + +* Cephfs: The 'AT_NO_ATTR_SYNC' macro is deprecated, please use the standard + 'AT_STATX_DONT_SYNC' macro. The 'AT_NO_ATTR_SYNC' macro will be removed in + the future. +* Trimming of PGLog dups is now controlled by the size instead of the version. + This fixes the PGLog inflation issue that was happening when the on-line + (in OSD) trimming got jammed after a PG split operation. Also, a new off-line + mechanism has been added: `ceph-objectstore-tool` got `trim-pg-log-dups` op + that targets situations where OSD is unable to boot due to those inflated dups. + If that is the case, in OSD logs the "You can be hit by THE DUPS BUG" warning + will be visible. + Relevant tracker: https://tracker.ceph.com/issues/53729 +* RBD: `rbd device unmap` command gained `--namespace` option. Support for + namespaces was added to RBD in Nautilus 14.2.0 and it has been possible to + map and unmap images in namespaces using the `image-spec` syntax since then + but the corresponding option available in most other commands was missing. + +>=16.2.8 +-------- + +* RGW: The behavior for Multipart Upload was modified so that only + CompleteMultipartUpload notification is sent at the end of the multipart upload. + The POST notification at the beginning of the upload, and PUT notifications that + were sent on each part are not sent anymore. + +* MON/MGR: Pools can now be created with `--bulk` flag. Any pools created with `bulk` + will use a profile of the `pg_autoscaler` that provides more performance from the start. + However, any pools created without the `--bulk` flag will remain using it's old behavior + by default. For more details, see: + + https://docs.ceph.com/en/latest/rados/operations/placement-groups/ + +* MGR: The pg_autoscaler can now be turned `on` and `off` globally + with the `noautoscale` flag. By default this flag is unset and + the default pg_autoscale mode remains the same. + For more details, see: + + https://docs.ceph.com/en/latest/rados/operations/placement-groups/ + +* A health warning will now be reported if the ``require-osd-release`` flag is not + set to the appropriate release after a cluster upgrade. + +>=16.2.7 +-------- +* Critical bug in OMAP format upgrade is fixed. This could cause data corruption + (improperly formatted OMAP keys) after pre-Pacific cluster upgrade if + bluestore-quick-fix-on-mount parameter is set to true or ceph-bluestore-tool's + quick-fix/repair commands are invoked. + Relevant tracker: https://tracker.ceph.com/issues/53062 + +* MGR: The pg_autoscaler will use the 'scale-up' profile as the default profile. + 16.2.6 changed the default profile to 'scale-down' but we ran into issues + with the device_health_metrics pool consuming too many PGs, which is not ideal + for performance. So we will continue to use the 'scale-up' profile by default, + until we implement a limit on the number of PGs default pools should consume, + in combination with the 'scale-down' profile. + +>=16.2.6 +-------- + +* MGR: The pg_autoscaler has a new default 'scale-down' profile which provides more + performance from the start for new pools (for newly created clusters). + Existing clusters will retain the old behavior, now called the 'scale-up' profile. + For more details, see: + + https://docs.ceph.com/en/latest/rados/operations/placement-groups/ +* Cephadm: ``osd_memory_target_autotune`` will be enabled by default which will set + ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.7`` of total RAM. This will be + unsuitable for hyperconverged infrastructures. For hyperconverged Ceph, please refer + to the documentation or set ``mgr/cephadm/autotune_memory_target_ratio`` to ``0.2``. + +>=16.0.0 +-------- + +* RGW: S3 bucket notification events now contain an `eTag` key instead of `etag`, + and eventName values no longer carry the `s3:` prefix, fixing deviations from + the message format observed on AWS. +* `ceph-mgr-modules-core` debian package does not recommend `ceph-mgr-rook` + anymore. As the latter depends on `python3-numpy` which cannot be imported in + different Python sub-interpreters multi-times if the version of + `python3-numpy` is older than 1.19. Since `apt-get` installs the `Recommends` + packages by default, `ceph-mgr-rook` was always installed along with + `ceph-mgr` debian package as an indirect dependency. If your workflow depends + on this behavior, you might want to install `ceph-mgr-rook` separately. + +* mgr/nfs: ``nfs`` module is moved out of volumes plugin. Prior using the + ``ceph nfs`` commands, ``nfs`` mgr module must be enabled. + +* volumes/nfs: The ``cephfs`` cluster type has been removed from the + ``nfs cluster create`` subcommand. Clusters deployed by cephadm can + support an NFS export of both ``rgw`` and ``cephfs`` from a single + NFS cluster instance. + +* The ``nfs cluster update`` command has been removed. You can modify + the placement of an existing NFS service (and/or its associated + ingress service) using ``orch ls --export`` and ``orch apply -i + ...``. + +* The ``orch apply nfs`` command no longer requires a pool or + namespace argument. We strongly encourage users to use the defaults + so that the ``nfs cluster ls`` and related commands will work + properly. + +* The ``nfs cluster delete`` and ``nfs export delete`` commands are + deprecated and will be removed in a future release. Please use + ``nfs cluster rm`` and ``nfs export rm`` instead. + +* The ``nfs export create`` CLI arguments have changed, with the + *fsname* or *bucket-name* argument position moving to the right of + *the *cluster-id* and *pseudo-path*. Consider transitioning to + *using named arguments instead of positional arguments (e.g., ``ceph + *nfs export create cephfs --cluster-id mycluster --pseudo-path /foo + *--fsname myfs`` instead of ``ceph nfs export create cephfs + *mycluster /foo myfs`` to ensure correct behavior with any + *version. + +* mgr-pg_autoscaler: Autoscaler will now start out by scaling each + pool to have a full complements of pgs from the start and will only + decrease it when other pools need more pgs due to increased usage. + This improves out of the box performance of Ceph by allowing more PGs + to be created for a given pool. + +* CephFS: Disabling allow_standby_replay on a file system will also stop all + standby-replay daemons for that file system. + +* New bluestore_rocksdb_options_annex config parameter. Complements + bluestore_rocksdb_options and allows setting rocksdb options without repeating + the existing defaults. +* The cephfs addes two new CDentry tags, 'I' --> 'i' and 'L' --> 'l', and + on-RADOS metadata is no longer backwards compatible after upgraded to Pacific + or a later release. + +* $pid expansion in config paths like `admin_socket` will now properly expand + to the daemon pid for commands like `ceph-mds` or `ceph-osd`. Previously only + `ceph-fuse`/`rbd-nbd` expanded `$pid` with the actual daemon pid. + +* The allowable options for some "radosgw-admin" commands have been changed. + + * "mdlog-list", "datalog-list", "sync-error-list" no longer accepts + start and end dates, but does accept a single optional start marker. + * "mdlog-trim", "datalog-trim", "sync-error-trim" only accept a + single marker giving the end of the trimmed range. + * Similarly the date ranges and marker ranges have been removed on + the RESTful DATALog and MDLog list and trim operations. + +* ceph-volume: The ``lvm batch` subcommand received a major rewrite. This closed + a number of bugs and improves usability in terms of size specification and + calculation, as well as idempotency behaviour and disk replacement process. + Please refer to https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/ for + more detailed information. + +* Configuration variables for permitted scrub times have changed. The legal + values for ``osd_scrub_begin_hour`` and ``osd_scrub_end_hour`` are 0 - 23. + The use of 24 is now illegal. Specifying ``0`` for both values causes every + hour to be allowed. The legal vaues for ``osd_scrub_begin_week_day`` and + ``osd_scrub_end_week_day`` are 0 - 6. The use of 7 is now illegal. + Specifying ``0`` for both values causes every day of the week to be allowed. + +* Multiple file systems in a single Ceph cluster is now stable. New Ceph clusters + enable support for multiple file systems by default. Existing clusters + must still set the "enable_multiple" flag on the fs. Please see the CephFS + documentation for more information. + +* volume/nfs: Recently "ganesha-" prefix from cluster id and nfs-ganesha common + config object was removed, to ensure consistent namespace across different + orchestrator backends. Please delete any existing nfs-ganesha clusters prior + to upgrading and redeploy new clusters after upgrading to Pacific. + +* A new health check, DAEMON_OLD_VERSION, will warn if different versions of Ceph are running + on daemons. It will generate a health error if multiple versions are detected. + This condition must exist for over mon_warn_older_version_delay (set to 1 week by default) in order for the + health condition to be triggered. This allows most upgrades to proceed + without falsely seeing the warning. If upgrade is paused for an extended + time period, health mute can be used like this + "ceph health mute DAEMON_OLD_VERSION --sticky". In this case after + upgrade has finished use "ceph health unmute DAEMON_OLD_VERSION". + +* MGR: progress module can now be turned on/off, using the commands: + ``ceph progress on`` and ``ceph progress off``. +* An AWS-compliant API: "GetTopicAttributes" was added to replace the existing "GetTopic" API. The new API + should be used to fetch information about topics used for bucket notifications. + +* librbd: The shared, read-only parent cache's config option ``immutable_object_cache_watermark`` now has been updated + to property reflect the upper cache utilization before space is reclaimed. The default ``immutable_object_cache_watermark`` + now is ``0.9``. If the capacity reaches 90% the daemon will delete cold cache. +* The ceph_volume_client.py library used for manipulating legacy "volumes" in + CephFS is removed. All remaining users should use the "fs volume" interface + exposed by the ceph-mgr: + https://docs.ceph.com/en/latest/cephfs/fs-volumes/ + +* An AWS-compliant API: "GetTopicAttributes" was added to replace the existing + "GetTopic" API. The new API should be used to fetch information about topics + used for bucket notifications. + +* librbd: The shared, read-only parent cache's config option + ``immutable_object_cache_watermark`` has now been updated to properly reflect + the upper cache utilization before space is reclaimed. The default + ``immutable_object_cache_watermark`` is now ``0.9``. If the capacity reaches + 90% the daemon will delete cold cache. + +* OSD: the option ``osd_fast_shutdown_notify_mon`` has been introduced to allow + the OSD to notify the monitor it is shutting down even if ``osd_fast_shutdown`` + is enabled. This helps with the monitor logs on larger clusters, that may get + many 'osd.X reported immediately failed by osd.Y' messages, and confuse tools. +* rgw/kms/vault: the transit logic has been revamped to better use + the transit engine in vault. To take advantage of this new + functionality configuration changes are required. See the current + documentation (radosgw/vault) for more details. + +* Scubs are more aggressive in trying to find more simultaneous possible PGs within osd_max_scrubs limitation. + It is possible that increasing osd_scrub_sleep may be necessary to maintain client responsiveness. +* OSD: the option ``osd_fast_shutdown_notify_mon`` has been introduced to allow + the OSD to notify the monitor it is shutting down even if ``osd_fast_shutdown`` + is enabled. This helps with the monitor logs on larger clusters, that may get + many 'osd.X reported immediately failed by osd.Y' messages, and confuse tools. + +* The mclock scheduler has been refined. A set of built-in profiles are now available that + provide QoS between the internal and external clients of Ceph. To enable the mclock + scheduler, set the config option "osd_op_queue" to "mclock_scheduler". The + "high_client_ops" profile is enabled by default, and allocates more OSD bandwidth to + external client operations than to internal client operations (such as background recovery + and scrubs). Other built-in profiles include "high_recovery_ops" and "balanced". These + built-in profiles optimize the QoS provided to clients of mclock scheduler. + +* Version 2 of the cephx authentication protocol (``CEPHX_V2`` feature bit) is + now required by default. It was introduced in 2018, adding replay attack + protection for authorizers and making msgr v1 message signatures stronger + (CVE-2018-1128 and CVE-2018-1129). Support is present in Jewel 10.2.11, + Luminous 12.2.6, Mimic 13.2.1, Nautilus 14.2.0 and later; upstream kernels + 4.9.150, 4.14.86, 4.19 and later; various distribution kernels, in particular + CentOS 7.6 and later. To enable older clients, set ``cephx_require_version`` + and ``cephx_service_require_version`` config options to 1. + +* rgw: The Civetweb frontend is now deprecated and will be removed in Quincy. + +>=15.0.0 +-------- + +* MON: The cluster log now logs health detail every ``mon_health_to_clog_interval``, + which has been changed from 1hr to 10min. Logging of health detail will be + skipped if there is no change in health summary since last known. + +* The ``ceph df`` command now lists the number of pgs in each pool. + +* Monitors now have config option ``mon_allow_pool_size_one``, which is disabled + by default. However, if enabled, user now have to pass the + ``--yes-i-really-mean-it`` flag to ``osd pool set size 1``, if they are really + sure of configuring pool size 1. + +* librbd now inherits the stripe unit and count from its parent image upon creation. + This can be overridden by specifying different stripe settings during clone creation. + +* The balancer is now on by default in upmap mode. Since upmap mode requires + ``require_min_compat_client`` luminous, new clusters will only support luminous + and newer clients by default. Existing clusters can enable upmap support by running + ``ceph osd set-require-min-compat-client luminous``. It is still possible to turn + the balancer off using the ``ceph balancer off`` command. In earlier versions, + the balancer was included in the ``always_on_modules`` list, but needed to be + turned on explicitly using the ``ceph balancer on`` command. + +* MGR: the "cloud" mode of the diskprediction module is not supported anymore + and the ``ceph-mgr-diskprediction-cloud`` manager module has been removed. This + is because the external cloud service run by ProphetStor is no longer accessible + and there is no immediate replacement for it at this time. The "local" prediction + mode will continue to be supported. + +* Cephadm: There were a lot of small usability improvements and bug fixes: + + * Grafana when deployed by Cephadm now binds to all network interfaces. + * ``cephadm check-host`` now prints all detected problems at once. + * Cephadm now calls ``ceph dashboard set-grafana-api-ssl-verify false`` + when generating an SSL certificate for Grafana. + * The Alertmanager is now correctly pointed to the Ceph Dashboard + * ``cephadm adopt`` now supports adopting an Alertmanager + * ``ceph orch ps`` now supports filtering by service name + * ``ceph orch host ls`` now marks hosts as offline, if they are not + accessible. + +* Cephadm can now deploy NFS Ganesha services. For example, to deploy NFS with + a service id of mynfs, that will use the RADOS pool nfs-ganesha and namespace + nfs-ns:: + + ceph orch apply nfs mynfs nfs-ganesha nfs-ns + +* Cephadm: ``ceph orch ls --export`` now returns all service specifications in + yaml representation that is consumable by ``ceph orch apply``. In addition, + the commands ``orch ps`` and ``orch ls`` now support ``--format yaml`` and + ``--format json-pretty``. + +* CephFS: Automatic static subtree partitioning policies may now be configured + using the new distributed and random ephemeral pinning extended attributes on + directories. See the documentation for more information: + https://docs.ceph.com/docs/master/cephfs/multimds/ + +* Cephadm: ``ceph orch apply osd`` supports a ``--preview`` flag that prints a preview of + the OSD specification before deploying OSDs. This makes it possible to + verify that the specification is correct, before applying it. + +* RGW: The ``radosgw-admin`` sub-commands dealing with orphans -- + ``radosgw-admin orphans find``, ``radosgw-admin orphans finish``, and + ``radosgw-admin orphans list-jobs`` -- have been deprecated. They have + not been actively maintained and they store intermediate results on + the cluster, which could fill a nearly-full cluster. They have been + replaced by a tool, currently considered experimental, + ``rgw-orphan-list``. + +* RBD: The name of the rbd pool object that is used to store + rbd trash purge schedule is changed from "rbd_trash_trash_purge_schedule" + to "rbd_trash_purge_schedule". Users that have already started using + ``rbd trash purge schedule`` functionality and have per pool or namespace + schedules configured should copy "rbd_trash_trash_purge_schedule" + object to "rbd_trash_purge_schedule" before the upgrade and remove + "rbd_trash_purge_schedule" using the following commands in every RBD + pool and namespace where a trash purge schedule was previously + configured:: + + rados -p <pool-name> [-N namespace] cp rbd_trash_trash_purge_schedule rbd_trash_purge_schedule + rados -p <pool-name> [-N namespace] rm rbd_trash_trash_purge_schedule + + or use any other convenient way to restore the schedule after the + upgrade. + +* librbd: The shared, read-only parent cache has been moved to a separate librbd + plugin. If the parent cache was previously in-use, you must also instruct + librbd to load the plugin by adding the following to your configuration:: + + rbd_plugins = parent_cache + +* Monitors now have a config option ``mon_osd_warn_num_repaired``, 10 by default. + If any OSD has repaired more than this many I/O errors in stored data a + ``OSD_TOO_MANY_REPAIRS`` health warning is generated. + +* Introduce commands that manipulate required client features of a file system:: + + ceph fs required_client_features <fs name> add <feature> + ceph fs required_client_features <fs name> rm <feature> + ceph fs feature ls + +* OSD: A new configuration option ``osd_compact_on_start`` has been added which triggers + an OSD compaction on start. Setting this option to ``true`` and restarting an OSD + will result in an offline compaction of the OSD prior to booting. + +* OSD: the option named ``bdev_nvme_retry_count`` has been removed. Because + in SPDK v20.07, there is no easy access to bdev_nvme options, and this + option is hardly used, so it was removed. + +* Now when noscrub and/or nodeep-scrub flags are set globally or per pool, + scheduled scrubs of the type disabled will be aborted. All user initiated + scrubs are NOT interrupted. + +* Alpine build related script, documentation and test have been removed since + the most updated APKBUILD script of Ceph is already included by Alpine Linux's + aports repository. + +* fs: Names of new FSs, volumes, subvolumes and subvolume groups can only + contain alphanumeric and ``-``, ``_`` and ``.`` characters. Some commands + or CephX credentials may not work with old FSs with non-conformant names. + +* It is now possible to specify the initial monitor to contact for Ceph tools + and daemons using the ``mon_host_override`` config option or + ``--mon-host-override <ip>`` command-line switch. This generally should only + be used for debugging and only affects initial communication with Ceph's + monitor cluster. + +* `blacklist` has been replaced with `blocklist` throughout. The following commands have changed: + + - ``ceph osd blacklist ...`` are now ``ceph osd blocklist ...`` + - ``ceph <tell|daemon> osd.<NNN> dump_blacklist`` is now ``ceph <tell|daemon> osd.<NNN> dump_blocklist`` + +* The following config options have changed: + + - ``mon osd blacklist default expire`` is now ``mon osd blocklist default expire`` + - ``mon mds blacklist interval`` is now ``mon mds blocklist interval`` + - ``mon mgr blacklist interval`` is now ''mon mgr blocklist interval`` + - ``rbd blacklist on break lock`` is now ``rbd blocklist on break lock`` + - ``rbd blacklist expire seconds`` is now ``rbd blocklist expire seconds`` + - ``mds session blacklist on timeout`` is now ``mds session blocklist on timeout`` + - ``mds session blacklist on evict`` is now ``mds session blocklist on evict`` + +* CephFS: Compatibility code for old on-disk format of snapshot has been removed. + Current on-disk format of snapshot was introduced by Mimic release. If there + are any snapshots created by Ceph release older than Mimic. Before upgrading, + either delete them all or scrub the whole filesystem: + + ceph daemon <mds of rank 0> scrub_path / force recursive repair + ceph daemon <mds of rank 0> scrub_path '~mdsdir' force recursive repair + +* CephFS: Scrub is supported in multiple active mds setup. MDS rank 0 handles + scrub commands, and forward scrub to other mds if necessary. + +* The following librados API calls have changed: + + - ``rados_blacklist_add`` is now ``rados_blocklist_add``; the former will issue a deprecation warning and be removed in a future release. + - ``rados.blacklist_add`` is now ``rados.blocklist_add`` in the C++ API. + +* The JSON output for the following commands now shows ``blocklist`` instead of ``blacklist``: + + - ``ceph osd dump`` + - ``ceph <tell|daemon> osd.<N> dump_blocklist`` + +* caps: MON and MDS caps can now be used to restrict client's ability to view + and operate on specific Ceph file systems. The FS can be specificed using + ``fsname`` in caps. This also affects subcommand ``fs authorize``, the caps + produce by it will be specific to the FS name passed in its arguments. + +* fs: root_squash flag can be set in MDS caps. It disallows file system + operations that need write access for clients with uid=0 or gid=0. This + feature should prevent accidents such as an inadvertent `sudo rm -rf /<path>`. + +* fs: "fs authorize" now sets MON cap to "allow <perm> fsname=<fsname>" + instead of setting it to "allow r" all the time. + +* ``ceph pg #.# list_unfound`` output has been enhanced to provide + might_have_unfound information which indicates which OSDs may + contain the unfound objects. + +* The ``ceph orch apply rgw`` syntax and behavior have changed. RGW + services can now be arbitrarily named (it is no longer forced to be + `realm.zone`). The ``--rgw-realm=...`` and ``--rgw-zone=...`` + arguments are now optional, which means that if they are omitted, a + vanilla single-cluster RGW will be deployed. When the realm and + zone are provided, the user is now responsible for setting up the + multisite configuration beforehand--cephadm no longer attempts to + create missing realms or zones. + +* The cephadm NFS support has been simplified to no longer allow the + pool and namespace where configuration is stored to be customized. + As a result, the ``ceph orch apply nfs`` command no longer has + ``--pool`` or ``--namespace`` arguments. + + Existing cephadm NFS deployments (from earlier version of Pacific or + from Octopus) will be automatically migrated when the cluster is + upgraded. Note that the NFS ganesha daemons will be redeployed and + it is possible that their IPs will change. + +* RGW now requires a secure connection to the monitor by default + (``auth_client_required=cephx`` and ``ms_mon_client_mode=secure``). + If you have cephx authentication disabled on your cluster, you may + need to adjust these settings for RGW. |