========================
FILESYSTEM LOCAL CACHING
========================

========
CONTENTS
========

 (*) Introduction.

 (*) Setting up a cache.

 (*) Using the cache with NFS.

 (*) Setting cache cull limits.

 (*) Monitoring.

 (*) Relocating the cache.

 (*) Further information.


============
INTRODUCTION
============

Linux now supports local caching of certain filesystems (currently only NFS and
the in-kernel AFS filesystems). This permits remote data to be cached on local
disk, thus potentially speeding up future accesses to that data by avoiding the
need to go to the network and fetch it again.

This facility (known as FS-Cache) is designed to be as transparent as possible
to a user of the system. Applications should just be able to use NFS files as
normal, without any knowledge of there being a cache.

The administrator has to set up the cache in the first place, tell the system
to use it and then mark the NFS mount points they want caching, but the user
need not see any of that.

The facility can be conceptualised by the following diagram:

        +--------+           +--------+       +--------+     +--------+
        |        |    /\     |        |       |        |     |        |
        |  NFS   |--- \ ---->|  NFS   |------>|  Page  |---->|  User  |
        | Server |     \/    | Client |   ^   | Cache  |     |  App   |
        |        |  Network  |        |   |   | (RAM)  |     |        |
        +--------+           +--------+   |   +--------+     +--------+
                                          |        |
                                          |  +-----+
                                          V  |
                                     +--------+     +--------+     +---------+
                                     |        |     |        |     |         |
                                     |  FS-   |<--->| Cache  |<--->|  /var/  |
                                     | Cache  |     | Files  |     |  fscache|
                                     |        |     |        |     |         |
                                     +--------+     +--------+     +---------+

When a user application reads data, data flows left to right along the top row.
When a local cache is available, the NFS client copies any data it doesn't have
a local copy of into the cache, if there's space, so that the second and
subsequent times it tries to read that data, it retrieves it from the cache
instead.

FS-Cache is an intermediary between the network filesystems (such as NFS) and
the actual cache backends (such as CacheFiles) that do the real work. If there
aren't any caches available, FS-Cache will smooth over the fact, with as little
extra latency as possible.

CacheFiles is the only cache backend currently available. It uses files in a
directory nominated by the administrator to store the data given to it. The
contents of the cache are persistent over reboots.


==================
SETTING UP A CACHE
==================

Setting up a cache should be straightforward. The configuration for the
in-filesystem cache backend (CacheFiles) is placed in /etc/cachefilesd.conf.
There is a manual page available to cover the options in detail, but they are
summarised here. The cachefilesd package will need to be installed to use
the cache.

The administrator first needs to decide which directory they want to place the
cache in (typically /var/cache/fscache) and specify that to the system:

        [/etc/cachefilesd.conf]
        dir /var/cache/fscache

The cache will be stored in the filesystem that hosts that directory. For
something like a laptop, you'll probably want to select a directory on the
root filesystem here, but for a main desktop machine you might want to mount a
disk partition specifically for the cache.

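To see which filesystem will end up hosting the cache, the "dir" line can be
read back out of the configuration file and handed to df. This is only a
sketch: the function names are made up for this example, and it assumes a
single "dir" line whose path contains no whitespace.

```shell
# Print the path given on the first "dir" line of a cachefilesd
# configuration file (default: /etc/cachefilesd.conf).
# Comment lines are skipped because their first field is not "dir".
cache_dir_from_conf() {
    awk '$1 == "dir" { print $2; exit }' "${1:-/etc/cachefilesd.conf}"
}

# Show the filesystem that hosts the configured cache directory.
cache_host_fs() {
    dir=$(cache_dir_from_conf "$1")
    [ -n "$dir" ] && df -P "$dir"
}
```

Running "cache_host_fs" with no argument reports the device and mount point
backing the directory named in /etc/cachefilesd.conf.
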
The filesystem must support user-defined extended attributes as these are used
by CacheFiles to store coherency maintenance information. User-defined
extended attributes can be enabled on an Ext3 filesystem by doing the
following:

        tune2fs -o user_xattr /dev/hdxN

or by mounting the filesystem like this:

        mount /dev/hda6 /var/cache/fscache/ -o user_xattr

All other requirements should be met by using a RHEL5+ or FC6+ kernel and using
Ext3 (ReiserFS and XFS will also meet the requirements). See the "Further
information" section for more information.


The CacheFiles backend works by using up free space on the disk, caching remote
data in it. See the section on "Setting cache cull limits" for configuring how
much free space it maintains. This is, however, optional as defaults are set.


Once the configuration file is in place, just start up the cachefilesd service
by running this as root:

        systemctl start cachefilesd.service

And the cache is ready to go. This can be made to happen automatically on boot
by running this as root:

        systemctl enable cachefilesd.service


========================
USING THE CACHE WITH NFS
========================

NFS will not use the cache unless explicitly told to do so. This is done by
attaching an extra option to an NFS mount ("-o fsc"), for instance:

        mount fred:/ /fred -o fsc

All the accesses to files under /fred will then be put through the cache,
provided they aren't opened for direct I/O or opened for writing (see below).

NFS supports caching for versions 2, 3 and 4, though each version uses a
different branch of the cache.

NFS keys the contents of the cache on the server and the NFS file handle,
meaning that hard linked files share the cache correctly.


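One way to confirm that a mount actually picked up the option is to look for
"fsc" in the mount table. The sketch below assumes that the kernel lists "fsc"
among the mount options in /proc/mounts for such mounts, which may not hold on
every kernel; nfs_fsc_mounts is a name made up for this example.

```shell
# List the mount points of NFS mounts whose option list contains "fsc".
# Reads a table in /proc/mounts format (default: /proc/mounts):
#   device  mountpoint  fstype  options  dump  pass
nfs_fsc_mounts() {
    awk '$3 ~ /^nfs4?$/ {
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] == "fsc") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

After the mount command above, "nfs_fsc_mounts" would be expected to print
/fred if the assumption holds.
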
CACHE LIMITATIONS WITH NFS
--------------------------

If a file is opened for direct-I/O, the cache will be bypassed because the I/O
must be direct to the server.

If the file is opened for writing, the NFS version 2 and 3 protocols don't
provide sufficient coherency management information for the client to be able
to detect a write from another client that overlapped with one that it did.

So if a file is opened for direct-I/O or for writing, the copy of the data
cached on disk will be retired and that file will cease being cached until it
is no longer being used by that client.


=========================
SETTING CACHE CULL LIMITS
=========================

The CacheFiles backend works by using up free space on the disk, caching remote
data in it. This could, potentially, consume the entirety of the free space,
which, if it were also your root partition, would be bad. To control this,
CacheFiles tries to maintain a certain amount of free space, and will shrink
the cache to compensate if whatever else is on the disk grows.

This can be controlled by three settings:

        [/etc/cachefilesd.conf]
        brun 20%
        bcull 10%
        bstop 5%

These are specified as percentages of the total disk space. When the amount of
available free space drops below the "bcull" or "bstop" limits, the cache
management daemon will start reducing the amount of data in the cache, and when
the available free space rises above the "brun" limit, the culling will cease.
This provides hysteresis. Note that the following must hold true:

        0 <= bstop < bcull < brun < 100


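The hysteresis can be illustrated with a small model of the decision. This is
not the daemon's code, just a sketch of the rule as described above; the
function name and the "culling"/"idle" labels are made up for this example, and
percentages are given as whole numbers without the "%" sign.

```shell
# Usage: cull_state FREE RUN CULL PREV
#   FREE  current free space, as a percentage
#   RUN   the "brun" limit, CULL the "bcull" limit
#   PREV  previous state, either "culling" or "idle"
# Prints the new state.
cull_state() {
    free=$1 run=$2 cull=$3 prev=$4
    if [ "$free" -lt "$cull" ]; then
        echo culling        # free space dropped below the cull limit
    elif [ "$free" -gt "$run" ]; then
        echo idle           # free space rose above the run limit
    else
        echo "$prev"        # between the two limits: keep the old state
    fi
}
```

With the example limits (brun 20%, bcull 10%), 15% free space leaves an idle
cache idle and a culling cache culling, which is the point of having two
separate thresholds.
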
Similarly, some filesystems have limited numbers of files that they can
actually support (Ext3 for instance falls into this category). If the data
being pulled from the server is in lots of small files, then this can quickly
use up all the files available to the cache without using up all the data
space. To counter this problem, the cache tries to maintain a minimum
percentage of free files, just as it does for available free space. This can
also be configured:

        [/etc/cachefilesd.conf]
        frun 20%
        fcull 10%
        fstop 5%

And this must hold true:

        0 <= fstop < fcull < frun < 100


The defaults are 7% (run), 5% (cull) and 1% (stop) for both groups of settings.

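Before editing the limits it can be handy to check a proposed triple against
the ordering rule. The helper below is a sketch with a made-up name; it takes
whole-number percentages without the "%" sign and applies equally to the "b"
and "f" groups.

```shell
# Usage: check_limits RUN CULL STOP
# Prints "ok" and returns 0 if 0 <= STOP < CULL < RUN < 100,
# otherwise prints "invalid" and returns 1.
check_limits() {
    run=$1 cull=$2 stop=$3
    if [ "$stop" -ge 0 ] && [ "$stop" -lt "$cull" ] &&
       [ "$cull" -lt "$run" ] && [ "$run" -lt 100 ]; then
        echo ok
    else
        echo invalid
        return 1
    fi
}
```

For instance, "check_limits 7 5 1" accepts the defaults, while
"check_limits 10 10 5" is rejected because the cull limit is not below the run
limit.
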
When the bstop or fstop limit is reached, no more data will be added to the
cache until culling has raised the appropriate parameter back above that limit.


==========
MONITORING
==========

The state of NFS filesystem caching can be monitored to a certain extent by the
data exposed through files in /proc/sys/fs/nfs/:

 (*) nfs_fscache_to_pages

     The number of pages of data NFS has added to the cache.

 (*) nfs_fscache_from_pages

     The number of pages of data NFS has retrieved from the cache.

 (*) nfs_fscache_uncache_page

     The number of active page bindings that NFS has removed from the
     cache. (Note that just because a page binding has been released, it
     does not mean the page has been removed from the cache, just that NFS
     is no longer using that particular bit of the cache at the moment.)

 (*) nfs_fscache_from_error

     The last error incurred when reading page(s) from the cache.

 (*) nfs_fscache_to_error

     The last error incurred when writing a page to the cache.

Note that these sysctl parameters are only temporary and will be integrated
into the NFS per-mount statistics sometime in the future.


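All of these files can be dumped in one go with a short loop. This is a sketch
that assumes the running kernel still exposes them under /proc/sys/fs/nfs/;
the function name is made up, and the optional directory argument exists only
so that the function can be tried against a copy of the files.

```shell
# Print each nfs_fscache_* file in DIR as "name value".
# Usage: show_fscache_counters [DIR]    (default: /proc/sys/fs/nfs)
show_fscache_counters() {
    dir=${1:-/proc/sys/fs/nfs}
    for f in "$dir"/nfs_fscache_*; do
        [ -r "$f" ] || continue     # nothing matched, or file unreadable
        printf '%s %s\n' "${f##*/}" "$(cat "$f")"
    done
}
```

If none of the files exist, the function simply prints nothing.
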
Furthermore, the caching state of individual mountpoints can be examined through
other /proc files. For instance:

        [root@andromeda ~]# cat /proc/fs/nfsfs/servers
        NV SERVER   PORT USE HOSTNAME
        v4 ac101209  801   1 home0
        [root@andromeda ~]# cat /proc/fs/nfsfs/volumes
        NV SERVER   PORT DEV     FSID    FSC
        v4 ac101209  801 0:16    9:2     no
        v4 ac101209  801 0:17    9:3     yes

The "FSC" column says "yes" when the system has been asked to cache a
particular NFS share/volume/export, and "no" when it hasn't.


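The volumes listing can be filtered down to just the cached volumes. The sketch
below assumes the six whitespace-separated columns shown in the sample output
above, with FSC last; cached_volumes is a name made up for this example.

```shell
# Print "DEV FSID" for every volume whose FSC column reads "yes".
# Usage: cached_volumes [FILE]    (default: /proc/fs/nfsfs/volumes)
cached_volumes() {
    # NR > 1 skips the header line; $4 is DEV, $5 is FSID, $6 is FSC.
    awk 'NR > 1 && $6 == "yes" { print $4, $5 }' \
        "${1:-/proc/fs/nfsfs/volumes}"
}
```

Run against the sample output above, this would print only the 0:17 volume.
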
====================
RELOCATING THE CACHE
====================

By default, the cache is located in /var/cache/fscache, but this may be
undesirable. Unless SELinux is being used in enforcing mode, relocating the
cache is simply a matter of changing the "dir" line in /etc/cachefilesd.conf.

However, if SELinux is being used in enforcing mode, then it's not that
simple. The security policy that governs access to the cache must be changed.
For more information, see:

        move-cache.txt


===================
FURTHER INFORMATION
===================

On the subject of the CacheFiles facility and configuring it:

        /usr/share/doc/cachefilesd/README
        /usr/share/man/man5/cachefilesd.conf.5.gz
        /usr/share/man/man8/cachefilesd.8.gz

For general information, including the design constraints and capabilities,
see:

        /usr/share/doc/kernel-doc-2.6.17/Documentation/filesystems/caching/fscache.txt