summaryrefslogtreecommitdiffstats
path: root/Documentation/filesystems/caching/fscache.rst
diff options
context:
space:
mode:
Diffstat (limited to 'Documentation/filesystems/caching/fscache.rst')
-rw-r--r--Documentation/filesystems/caching/fscache.rst565
1 files changed, 565 insertions, 0 deletions
diff --git a/Documentation/filesystems/caching/fscache.rst b/Documentation/filesystems/caching/fscache.rst
new file mode 100644
index 000000000..70de86922
--- /dev/null
+++ b/Documentation/filesystems/caching/fscache.rst
@@ -0,0 +1,565 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+General Filesystem Caching
+==========================
+
+Overview
+========
+
+This facility is a general purpose cache for network filesystems, though it
+could be used for caching other things such as ISO9660 filesystems too.
+
+FS-Cache mediates between cache backends (such as CacheFS) and network
+filesystems::
+
+ +---------+
+ | | +--------------+
+ | NFS |--+ | |
+ | | | +-->| CacheFS |
+ +---------+ | +----------+ | | /dev/hda5 |
+ | | | | +--------------+
+ +---------+ +-->| | |
+ | | | |--+
+ | AFS |----->| FS-Cache |
+ | | | |--+
+ +---------+ +-->| | |
+ | | | | +--------------+
+ +---------+ | +----------+ | | |
+ | | | +-->| CacheFiles |
+ | ISOFS |--+ | /var/cache |
+ | | +--------------+
+ +---------+
+
+Or to look at it another way, FS-Cache is a module that provides a caching
+facility to a network filesystem such that the cache is transparent to the
+user::
+
+ +---------+
+ | |
+ | Server |
+ | |
+ +---------+
+ | NETWORK
+ ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ |
+ | +----------+
+ V | |
+ +---------+ | |
+ | | | |
+ | NFS |----->| FS-Cache |
+ | | | |--+
+ +---------+ | | | +--------------+ +--------------+
+ | | | | | | | |
+ V +----------+ +-->| CacheFiles |-->| Ext3 |
+ +---------+ | /var/cache | | /dev/sda6 |
+ | | +--------------+ +--------------+
+ | VFS | ^ ^
+ | | | |
+ +---------+ +--------------+ |
+ | KERNEL SPACE | |
+ ~~~~~|~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~|~~~~~~|~~~~
+ | USER SPACE | |
+ V | |
+ +---------+ +--------------+
+ | | | |
+ | Process | | cachefilesd |
+ | | | |
+ +---------+ +--------------+
+
+
+FS-Cache does not follow the idea of completely loading every netfs file
+opened in its entirety into a cache before permitting it to be accessed and
+then serving the pages out of that cache rather than the netfs inode because:
+
+ (1) It must be practical to operate without a cache.
+
+ (2) The size of any accessible file must not be limited to the size of the
+ cache.
+
+ (3) The combined size of all opened files (this includes mapped libraries)
+ must not be limited to the size of the cache.
+
+ (4) The user should not be forced to download an entire file just to do a
+ one-off access of a small portion of it (such as might be done with the
+ "file" program).
+
+It instead serves the cache out in PAGE_SIZE chunks as and when requested by
+the netfs('s) using it.
+
+
+FS-Cache provides the following facilities:
+
+ (1) More than one cache can be used at once. Caches can be selected
+ explicitly by use of tags.
+
+ (2) Caches can be added / removed at any time.
+
+ (3) The netfs is provided with an interface that allows either party to
+ withdraw caching facilities from a file (required for (2)).
+
+ (4) The interface to the netfs returns as few errors as possible, preferring
+ rather to let the netfs remain oblivious.
+
+ (5) Cookies are used to represent indices, files and other objects to the
+ netfs. The simplest cookie is just a NULL pointer - indicating nothing
+ cached there.
+
+ (6) The netfs is allowed to propose - dynamically - any index hierarchy it
+ desires, though it must be aware that the index search function is
+ recursive, stack space is limited, and indices can only be children of
+ indices.
+
+ (7) Data I/O is done direct to and from the netfs's pages. The netfs
+ indicates that page A is at index B of the data-file represented by cookie
+ C, and that it should be read or written. The cache backend may or may
+ not start I/O on that page, but if it does, a netfs callback will be
+ invoked to indicate completion. The I/O may be either synchronous or
+ asynchronous.
+
+ (8) Cookies can be "retired" upon release. At this point FS-Cache will mark
+ them as obsolete and the index hierarchy rooted at that point will get
+ recycled.
+
+ (9) The netfs provides a "match" function for index searches. In addition to
+ saying whether a match was made or not, this can also specify that an
+ entry should be updated or deleted.
+
+(10) As much as possible is done asynchronously.
+
+
+FS-Cache maintains a virtual indexing tree in which all indices, files, objects
+and pages are kept. Bits of this tree may actually reside in one or more
+caches::
+
+ FSDEF
+ |
+ +------------------------------------+
+ | |
+ NFS AFS
+ | |
+ +--------------------------+ +-----------+
+ | | | |
+ homedir mirror afs.org redhat.com
+ | | |
+ +------------+ +---------------+ +----------+
+ | | | | | |
+ 00001 00002 00007 00125 vol00001 vol00002
+ | | | | |
+ +---+---+ +-----+ +---+ +------+------+ +-----+----+
+ | | | | | | | | | | | | |
+ PG0 PG1 PG2 PG0 XATTR PG0 PG1 DIRENT DIRENT DIRENT R/W R/O Bak
+ | |
+ PG0 +-------+
+ | |
+ 00001 00003
+ |
+ +---+---+
+ | | |
+ PG0 PG1 PG2
+
+In the example above, you can see two netfs's being backed: NFS and AFS. These
+have different index hierarchies:
+
+ * The NFS primary index contains per-server indices. Each server index is
+ indexed by NFS file handles to get data file objects. Each data file
+ objects can have an array of pages, but may also have further child
+ objects, such as extended attributes and directory entries. Extended
+ attribute objects themselves have page-array contents.
+
+ * The AFS primary index contains per-cell indices. Each cell index contains
+ per-logical-volume indices. Each of volume index contains up to three
+ indices for the read-write, read-only and backup mirrors of those volumes.
+ Each of these contains vnode data file objects, each of which contains an
+ array of pages.
+
+The very top index is the FS-Cache master index in which individual netfs's
+have entries.
+
+Any index object may reside in more than one cache, provided it only has index
+children. Any index with non-index object children will be assumed to only
+reside in one cache.
+
+
+The netfs API to FS-Cache can be found in:
+
+ Documentation/filesystems/caching/netfs-api.rst
+
+The cache backend API to FS-Cache can be found in:
+
+ Documentation/filesystems/caching/backend-api.rst
+
+A description of the internal representations and object state machine can be
+found in:
+
+ Documentation/filesystems/caching/object.rst
+
+
+Statistical Information
+=======================
+
+If FS-Cache is compiled with the following options enabled::
+
+ CONFIG_FSCACHE_STATS=y
+ CONFIG_FSCACHE_HISTOGRAM=y
+
+then it will gather certain statistics and display them through a number of
+proc files.
+
+/proc/fs/fscache/stats
+----------------------
+
+ This shows counts of a number of events that can happen in FS-Cache:
+
++--------------+-------+-------------------------------------------------------+
+|CLASS |EVENT |MEANING |
++==============+=======+=======================================================+
+|Cookies |idx=N |Number of index cookies allocated |
++ +-------+-------------------------------------------------------+
+| |dat=N |Number of data storage cookies allocated |
++ +-------+-------------------------------------------------------+
+| |spc=N |Number of special cookies allocated |
++--------------+-------+-------------------------------------------------------+
+|Objects |alc=N |Number of objects allocated |
++ +-------+-------------------------------------------------------+
+| |nal=N |Number of object allocation failures |
++ +-------+-------------------------------------------------------+
+| |avl=N |Number of objects that reached the available state |
++ +-------+-------------------------------------------------------+
+| |ded=N |Number of objects that reached the dead state |
++--------------+-------+-------------------------------------------------------+
+|ChkAux |non=N |Number of objects that didn't have a coherency check |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of objects that passed a coherency check |
++ +-------+-------------------------------------------------------+
+| |upd=N |Number of objects that needed a coherency data update |
++ +-------+-------------------------------------------------------+
+| |obs=N |Number of objects that were declared obsolete |
++--------------+-------+-------------------------------------------------------+
+|Pages |mrk=N |Number of pages marked as being cached |
+| |unc=N |Number of uncache page requests seen |
++--------------+-------+-------------------------------------------------------+
+|Acquire |n=N |Number of acquire cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of acq reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |noc=N |Number of acq reqs rejected due to no cache available |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of acq reqs succeeded |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of acq reqs rejected due to error |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of acq reqs failed on ENOMEM |
++--------------+-------+-------------------------------------------------------+
+|Lookups |n=N |Number of lookup calls made on cache backends |
++ +-------+-------------------------------------------------------+
+| |neg=N |Number of negative lookups made |
++ +-------+-------------------------------------------------------+
+| |pos=N |Number of positive lookups made |
++ +-------+-------------------------------------------------------+
+| |crt=N |Number of objects created by lookup |
++ +-------+-------------------------------------------------------+
+| |tmo=N |Number of lookups timed out and requeued |
++--------------+-------+-------------------------------------------------------+
+|Updates |n=N |Number of update cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of upd reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of upd reqs granted CPU time |
++--------------+-------+-------------------------------------------------------+
+|Relinqs |n=N |Number of relinquish cookie requests seen |
++ +-------+-------------------------------------------------------+
+| |nul=N |Number of rlq reqs given a NULL parent |
++ +-------+-------------------------------------------------------+
+| |wcr=N |Number of rlq reqs waited on completion of creation |
++--------------+-------+-------------------------------------------------------+
+|AttrChg |n=N |Number of attribute changed requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of attr changed requests queued |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of attr changed rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of attr changed failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of attr changed ops given CPU time |
++--------------+-------+-------------------------------------------------------+
+|Allocs |n=N |Number of allocation requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful alloc reqs |
++ +-------+-------------------------------------------------------+
+| |wt=N |Number of alloc reqs that waited on lookup completion |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of alloc reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |int=N |Number of alloc reqs aborted -ERESTARTSYS |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of alloc reqs submitted |
++ +-------+-------------------------------------------------------+
+| |owt=N |Number of alloc reqs waited for CPU time |
++ +-------+-------------------------------------------------------+
+| |abt=N |Number of alloc reqs aborted due to object death |
++--------------+-------+-------------------------------------------------------+
+|Retrvls |n=N |Number of retrieval (read) requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful retr reqs |
++ +-------+-------------------------------------------------------+
+| |wt=N |Number of retr reqs that waited on lookup completion |
++ +-------+-------------------------------------------------------+
+| |nod=N |Number of retr reqs returned -ENODATA |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of retr reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |int=N |Number of retr reqs aborted -ERESTARTSYS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of retr reqs failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of retr reqs submitted |
++ +-------+-------------------------------------------------------+
+| |owt=N |Number of retr reqs waited for CPU time |
++ +-------+-------------------------------------------------------+
+| |abt=N |Number of retr reqs aborted due to object death |
++--------------+-------+-------------------------------------------------------+
+|Stores |n=N |Number of storage (write) requests seen |
++ +-------+-------------------------------------------------------+
+| |ok=N |Number of successful store reqs |
++ +-------+-------------------------------------------------------+
+| |agn=N |Number of store reqs on a page already pending storage |
++ +-------+-------------------------------------------------------+
+| |nbf=N |Number of store reqs rejected -ENOBUFS |
++ +-------+-------------------------------------------------------+
+| |oom=N |Number of store reqs failed -ENOMEM |
++ +-------+-------------------------------------------------------+
+| |ops=N |Number of store reqs submitted |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of store reqs granted CPU time |
++ +-------+-------------------------------------------------------+
+| |pgs=N |Number of pages given store req processing time |
++ +-------+-------------------------------------------------------+
+| |rxd=N |Number of store reqs deleted from tracking tree |
++ +-------+-------------------------------------------------------+
+| |olm=N |Number of store reqs over store limit |
++--------------+-------+-------------------------------------------------------+
+|VmScan |nos=N |Number of release reqs against pages with no |
+| | |pending store |
++ +-------+-------------------------------------------------------+
+| |gon=N |Number of release reqs against pages stored by |
+| | |time lock granted |
++ +-------+-------------------------------------------------------+
+| |bsy=N |Number of release reqs ignored due to in-progress store|
++ +-------+-------------------------------------------------------+
+| |can=N |Number of page stores cancelled due to release req |
++--------------+-------+-------------------------------------------------------+
+|Ops |pend=N |Number of times async ops added to pending queues |
++ +-------+-------------------------------------------------------+
+| |run=N |Number of times async ops given CPU time |
++ +-------+-------------------------------------------------------+
+| |enq=N |Number of times async ops queued for processing |
++ +-------+-------------------------------------------------------+
+| |can=N |Number of async ops cancelled |
++ +-------+-------------------------------------------------------+
+| |rej=N |Number of async ops rejected due to object |
+| | |lookup/create failure |
++ +-------+-------------------------------------------------------+
+| |ini=N |Number of async ops initialised |
++ +-------+-------------------------------------------------------+
+| |dfr=N |Number of async ops queued for deferred release |
++ +-------+-------------------------------------------------------+
+| |rel=N |Number of async ops released |
+| | |(should equal ini=N when idle) |
++ +-------+-------------------------------------------------------+
+| |gc=N |Number of deferred-release async ops garbage collected |
++--------------+-------+-------------------------------------------------------+
+|CacheOp |alo=N |Number of in-progress alloc_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |luo=N |Number of in-progress lookup_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |luc=N |Number of in-progress lookup_complete() cache ops |
++ +-------+-------------------------------------------------------+
+| |gro=N |Number of in-progress grab_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |upo=N |Number of in-progress update_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |dro=N |Number of in-progress drop_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |pto=N |Number of in-progress put_object() cache ops |
++ +-------+-------------------------------------------------------+
+| |syn=N |Number of in-progress sync_cache() cache ops |
++ +-------+-------------------------------------------------------+
+| |atc=N |Number of in-progress attr_changed() cache ops |
++ +-------+-------------------------------------------------------+
+| |rap=N |Number of in-progress read_or_alloc_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |ras=N |Number of in-progress read_or_alloc_pages() cache ops |
++ +-------+-------------------------------------------------------+
+| |alp=N |Number of in-progress allocate_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |als=N |Number of in-progress allocate_pages() cache ops |
++ +-------+-------------------------------------------------------+
+| |wrp=N |Number of in-progress write_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |ucp=N |Number of in-progress uncache_page() cache ops |
++ +-------+-------------------------------------------------------+
+| |dsp=N |Number of in-progress dissociate_pages() cache ops |
++--------------+-------+-------------------------------------------------------+
+|CacheEv |nsp=N |Number of object lookups/creations rejected due to |
+| | |lack of space |
++ +-------+-------------------------------------------------------+
+| |stl=N |Number of stale objects deleted |
++ +-------+-------------------------------------------------------+
+| |rtr=N |Number of objects retired when relinquished |
++ +-------+-------------------------------------------------------+
+| |cul=N |Number of objects culled |
++--------------+-------+-------------------------------------------------------+
+
+
+
+/proc/fs/fscache/histogram
+--------------------------
+
+ ::
+
+ cat /proc/fs/fscache/histogram
+ JIFS SECS OBJ INST OP RUNS OBJ RUNS RETRV DLY RETRIEVLS
+ ===== ===== ========= ========= ========= ========= =========
+
+ This shows the breakdown of the number of times each amount of time
+ between 0 jiffies and HZ-1 jiffies a variety of tasks took to run. The
+ columns are as follows:
+
+ ========= =======================================================
+ COLUMN TIME MEASUREMENT
+ ========= =======================================================
+ OBJ INST Length of time to instantiate an object
+ OP RUNS Length of time a call to process an operation took
+ OBJ RUNS Length of time a call to process an object event took
+ RETRV DLY Time between an requesting a read and lookup completing
+ RETRIEVLS Time between beginning and end of a retrieval
+ ========= =======================================================
+
+ Each row shows the number of events that took a particular range of times.
+ Each step is 1 jiffy in size. The JIFS column indicates the particular
+ jiffy range covered, and the SECS field the equivalent number of seconds.
+
+
+
+Object List
+===========
+
+If CONFIG_FSCACHE_OBJECT_LIST is enabled, the FS-Cache facility will maintain a
+list of all the objects currently allocated and allow them to be viewed
+through::
+
+ /proc/fs/fscache/objects
+
+This will look something like::
+
+ [root@andromeda ~]# head /proc/fs/fscache/objects
+ OBJECT PARENT STAT CHLDN OPS OOP IPR EX READS EM EV F S | NETFS_COOKIE_DEF TY FL NETFS_DATA OBJECT_KEY, AUX_DATA
+ ======== ======== ==== ===== === === === == ===== == == = = | ================ == == ================ ================
+ 17e4b 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88001dd82820 010006017edcf8bbc93b43298fdfbe71e50b57b13a172c0117f38472, e567634700000000000000000000000063f2404a000000000000000000000000c9030000000000000000000063f2404a
+ 1693a 2 ACTV 0 0 0 0 0 0 7b 4 0 0 | NFS.fh DT 0 ffff88002db23380 010006017edcf8bbc93b43298fdfbe71e50b57b1e0162c01a2df0ea6, 420ebc4a000000000000000000000000420ebc4a0000000000000000000000000e1801000000000000000000420ebc4a
+
+where the first set of columns before the '|' describe the object:
+
+ ======= ===============================================================
+ COLUMN DESCRIPTION
+ ======= ===============================================================
+ OBJECT Object debugging ID (appears as OBJ%x in some debug messages)
+ PARENT Debugging ID of parent object
+ STAT Object state
+ CHLDN Number of child objects of this object
+ OPS Number of outstanding operations on this object
+ OOP Number of outstanding child object management operations
+ IPR
+ EX Number of outstanding exclusive operations
+ READS Number of outstanding read operations
+ EM Object's event mask
+ EV Events raised on this object
+ F Object flags
+ S Object work item busy state mask (1:pending 2:running)
+ ======= ===============================================================
+
+and the second set of columns describe the object's cookie, if present:
+
+ ================ ======================================================
+ COLUMN DESCRIPTION
+ ================ ======================================================
+ NETFS_COOKIE_DEF Name of netfs cookie definition
+ TY Cookie type (IX - index, DT - data, hex - special)
+ FL Cookie flags
+ NETFS_DATA Netfs private data stored in the cookie
+ OBJECT_KEY Object key } 1 column, with separating comma
+ AUX_DATA Object aux data } presence may be configured
+ ================ ======================================================
+
+The data shown may be filtered by attaching the a key to an appropriate keyring
+before viewing the file. Something like::
+
+ keyctl add user fscache:objlist <restrictions> @s
+
+where <restrictions> are a selection of the following letters:
+
+ == =========================================================
+ K Show hexdump of object key (don't show if not given)
+ A Show hexdump of object aux data (don't show if not given)
+ == =========================================================
+
+and the following paired letters:
+
+ == =========================================================
+ C Show objects that have a cookie
+ c Show objects that don't have a cookie
+ B Show objects that are busy
+ b Show objects that aren't busy
+ W Show objects that have pending writes
+ w Show objects that don't have pending writes
+ R Show objects that have outstanding reads
+ r Show objects that don't have outstanding reads
+ S Show objects that have work queued
+ s Show objects that don't have work queued
+ == =========================================================
+
+If neither side of a letter pair is given, then both are implied. For example:
+
+ keyctl add user fscache:objlist KB @s
+
+shows objects that are busy, and lists their object keys, but does not dump
+their auxiliary data. It also implies "CcWwRrSs", but as 'B' is given, 'b' is
+not implied.
+
+By default all objects and all fields will be shown.
+
+
+Debugging
+=========
+
+If CONFIG_FSCACHE_DEBUG is enabled, the FS-Cache facility can have runtime
+debugging enabled by adjusting the value in::
+
+ /sys/module/fscache/parameters/debug
+
+This is a bitmask of debugging streams to enable:
+
+ ======= ======= =============================== =======================
+ BIT VALUE STREAM POINT
+ ======= ======= =============================== =======================
+ 0 1 Cache management Function entry trace
+ 1 2 Function exit trace
+ 2 4 General
+ 3 8 Cookie management Function entry trace
+ 4 16 Function exit trace
+ 5 32 General
+ 6 64 Page handling Function entry trace
+ 7 128 Function exit trace
+ 8 256 General
+ 9 512 Operation management Function entry trace
+ 10 1024 Function exit trace
+ 11 2048 General
+ ======= ======= =============================== =======================
+
+The appropriate set of values should be OR'd together and the result written to
+the control file. For example::
+
+ echo $((1|8|64)) >/sys/module/fscache/parameters/debug
+
+will turn on all function entry debugging.