author | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-21 11:54:28 +0000 |
---|---|---|
committer | Daniel Baumann <daniel.baumann@progress-linux.org> | 2024-04-21 11:54:28 +0000 |
commit | e6918187568dbd01842d8d1d2c808ce16a894239 (patch) | |
tree | 64f88b554b444a49f656b6c656111a145cbbaa28 /src/spdk/doc/blobfs.md | |
parent | Initial commit. (diff) | |
Adding upstream version 18.2.2. (upstream/18.2.2)
Signed-off-by: Daniel Baumann <daniel.baumann@progress-linux.org>
Diffstat (limited to 'src/spdk/doc/blobfs.md')
-rw-r--r-- | src/spdk/doc/blobfs.md | 93 |
1 files changed, 93 insertions, 0 deletions
diff --git a/src/spdk/doc/blobfs.md b/src/spdk/doc/blobfs.md
new file mode 100644
index 000000000..221abed4d
--- /dev/null
+++ b/src/spdk/doc/blobfs.md
@@ -0,0 +1,93 @@
+# BlobFS (Blobstore Filesystem) {#blobfs}
+
+# BlobFS Getting Started Guide {#blobfs_getting_started}
+
+# RocksDB Integration {#blobfs_rocksdb}
+
+Clone and build the SPDK repository as per https://github.com/spdk/spdk
+
+~~~{.sh}
+git clone https://github.com/spdk/spdk.git
+cd spdk
+./configure
+make
+~~~
+
+Clone the RocksDB repository from the SPDK GitHub fork into a separate directory.
+Make sure you check out the `spdk-v5.14.3` branch.
+
+~~~{.sh}
+cd ..
+git clone -b spdk-v5.14.3 https://github.com/spdk/rocksdb.git
+~~~
+
+Build RocksDB. Only the `db_bench` benchmarking tool is integrated with BlobFS.
+
+~~~{.sh}
+cd rocksdb
+make db_bench SPDK_DIR=path/to/spdk
+~~~
+
+Or you can also add `DEBUG_LEVEL=0` for a release build (need to turn on `USE_RTTI`).
+
+~~~{.sh}
+export USE_RTTI=1 && make db_bench DEBUG_LEVEL=0 SPDK_DIR=path/to/spdk
+~~~
+
+Create an NVMe section in the configuration file using SPDK's `gen_nvme.sh` script.
+
+~~~{.sh}
+scripts/gen_nvme.sh > /usr/local/etc/spdk/rocksdb.conf
+~~~
+
+Verify the configuration file has specified the correct NVMe SSD.
+If there are any NVMe SSDs you do not wish to use for RocksDB/SPDK testing, remove them from the configuration file.
+
+Make sure you have at least 5GB of memory allocated for huge pages.
+By default, the SPDK `setup.sh` script only allocates 2GB.
+The following will allocate 5GB of huge page memory (in addition to binding the NVMe devices to uio/vfio).
+
+~~~{.sh}
+HUGEMEM=5120 scripts/setup.sh
+~~~
+
+Create an empty SPDK blobfs for testing.
+
+~~~{.sh}
+test/blobfs/mkfs/mkfs /usr/local/etc/spdk/rocksdb.conf Nvme0n1
+~~~
+
+At this point, RocksDB is ready for testing with SPDK. Three `db_bench` parameters are used to configure SPDK:
+
+1. `spdk` - Defines the name of the SPDK configuration file. If omitted, RocksDB will use the default PosixEnv implementation
+   instead of SpdkEnv. (Required)
+2. `spdk_bdev` - Defines the name of the SPDK block device which contains the BlobFS to be used for testing. (Required)
+3. `spdk_cache_size` - Defines the amount of userspace cache memory used by SPDK. Specified in terms of megabytes (MB).
+   Default is 4096 (4GB). (Optional)
+
+SPDK has a set of scripts which will run `db_bench` against a variety of workloads and capture performance and profiling
+data. The primary script is `test/blobfs/rocksdb/rocksdb.sh`.
+
+# FUSE
+
+BlobFS provides a FUSE plug-in to mount an SPDK BlobFS as a kernel filesystem for inspection or debug purposes.
+The FUSE plug-in requires fuse3 and will be built automatically when fuse3 is detected on the system.
+
+~~~{.sh}
+test/blobfs/fuse/fuse /usr/local/etc/spdk/rocksdb.conf Nvme0n1 /mnt/fuse
+~~~
+
+Note that the FUSE plug-in has some limitations - see the list below.
+
+# Limitations
+
+* BlobFS has primarily been tested with RocksDB so far, so any use cases different from how RocksDB uses a filesystem
+  may run into issues. BlobFS will be tested in a broader range of use cases after this initial release.
+* Only a synchronous API is currently supported. An asynchronous API has been developed but not thoroughly tested
+  yet so is not part of the public interface yet. This will be added in a future release.
+* File renames are not atomic. This will be fixed in a future release.
+* BlobFS currently supports only a flat namespace for files with no directory support. Filenames are currently stored
+  as xattrs in each blob. This means that filename lookup is an O(n) operation. An SPDK btree implementation is
+  underway which will be the underpinning for BlobFS directory support in a future release.
+* Writes to a file must always append to the end of the file. Support for writes to any location within the file
+  will be added in a future release.
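
Reviewer note: the added document names the three `db_bench` parameters (`spdk`, `spdk_bdev`, `spdk_cache_size`) but does not show an invocation. The dry-run sketch below only prints a candidate command line and executes nothing. It assumes the parameters are passed as `--name=value` flags, which the patch does not state. The `./db_bench` path, the `build_db_bench_cmd` helper, and the `SPDK_*` variable names are hypothetical. The config path, bdev name, and cache default come from the document.

```shell
#!/bin/sh
# Dry-run sketch: assemble a db_bench command line from the three SPDK
# parameters the document describes. The command is printed, never executed.
# Assumption (not confirmed by the patch): parameters use --name=value form.

SPDK_CONF="/usr/local/etc/spdk/rocksdb.conf"  # config file written by gen_nvme.sh
SPDK_BDEV="Nvme0n1"                           # bdev that holds the BlobFS
SPDK_CACHE_MB="4096"                          # cache size in MB (documented default)

build_db_bench_cmd() {
    # $1: path to the db_bench binary (hypothetical location)
    printf '%s --spdk=%s --spdk_bdev=%s --spdk_cache_size=%s\n' \
        "$1" "$SPDK_CONF" "$SPDK_BDEV" "$SPDK_CACHE_MB"
}

build_db_bench_cmd ./db_bench
```

Printing the command first makes it easy to check that the config file and bdev name match what was passed to `mkfs` before anything touches the device.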