Tune and Tweak NFS for Top Performance
As promised in our previous NFS article, we will now explore mount options in a bit more detail. We will also talk about differences between NFS implementations among various UNIX flavors, and the wonderful capability automatic mounting provides.
More options than you can shake a mouse at
Previously, we discussed the differences between hard and soft mounts, but skipped over everything else. NFS has many configurable options that can aid or hinder file transfer speeds.
The main advantage of running NFS over TCP is that it works much better over marginal or congested networks because of its built-in flow-control capabilities. If for some strange reason you're using NFS across the Internet (hopefully through some sort of tunnel, at least) then TCP will be necessary. On a 100 Mb/s or even Gigabit LAN, the extra TCP overhead may not be worth it. Not to mention the annoyance of having to remount things if the server doesn't respond in a timely manner.
That being said, there really isn't a discernible difference between NFS over UDP and TCP on a non-congested, functioning network; it is actually quite hard to measure one. So, to get TCP, just add tcp to the NFS mount options.
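For instance, a client-side /etc/fstab entry requesting TCP might look like this (the server name "server" and the mount point are placeholders):

```
# /etc/fstab -- mount server:/usr over NFS using TCP
server:/usr  /mnt/usr  nfs  tcp,hard,intr  0 0
```

The same options work on the command line: mount -o tcp server:/usr /mnt/usr.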
NFS also allows us to control the size of the data chunks we send back and forth to the server. There is much debate in this area, but there are a few observations we can make. Solaris uses a 32K block size for transfers. Linux can too, with a 2.4 or later kernel. Most older Linux NFS implementations default to 4K (4096 bytes), but allow this to be changed via the rsize= and wsize= options. Setting both to 8192 should work on all Linux distributions, and if you're talking to a server that uses a large block size, performance should increase.
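For instance, the corresponding mount options in /etc/fstab would be (hostname and paths are placeholders):

```
# /etc/fstab -- request 8K blocks for NFS reads and writes
server:/home  /home  nfs  rsize=8192,wsize=8192,hard,intr  0 0
```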
Getting NFS to perform optimally can be quite a task if the clients and servers aren't all running the same version of the same operating system. Linux's nfsstat command is very useful for debugging and gathering statistics about NFS transactions and network performance. For example, when using an NFS block size larger than your network card's MTU, IP fragmentation will occur. IP fragmentation is even more common on complex networks where the data can take multiple paths. Reassembling fragments takes CPU cycles and slower clients or servers may suffer. This only happens when using UDP, and is yet another reason TCP is superior despite its overhead. Using nfsstat will reveal such bottlenecks and lead you in the correct direction to fix them.
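A rough calculation illustrates the fragmentation problem. Assuming a standard 1500-byte Ethernet MTU, each UDP datagram can carry 1472 bytes of payload (1500 minus the 20-byte IP header and 8-byte UDP header), so an 8K NFS read must be reassembled from several IP fragments:

```shell
# Estimate IP fragments per NFS read over UDP on plain Ethernet.
# 1472 = 1500-byte MTU minus 20-byte IP header and 8-byte UDP header.
rsize=8192
payload=1472
frags=$(( (rsize + payload - 1) / payload ))   # ceiling division
echo "$frags fragments per ${rsize}-byte read"
```

Losing any one of those fragments forces a retransmission of the whole datagram. Running nfsstat -c on the client (or nfsstat -s on the server) shows retransmission and timeout counters that climb when fragments are being dropped or reassembled slowly.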
The final option worth discussing is sync, which tells the server not to operate in asynchronous mode. Using async, the default in many implementations, tells the server to queue write operations and lie to the client about their completion. This results in better performance, but if the server crashes before the delayed writes happen, data is lost.
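On a Linux server, this is controlled per export in /etc/exports; a minimal sketch (the client address is illustrative):

```
# /etc/exports -- sync forces the server to commit writes before replying
/usr  192.168.0.1(rw,sync)
```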
What about something besides Linux?
There are a few differences in NFS between different Unix variants, mostly in the format of configuration files. Remember, we chose Linux because it was quite similar to other Unix versions.
FreeBSD, for example, is very similar to Linux. The exports file has the same basic principle, but options are specified a bit differently. In Linux, we'd say 192.168.0.1(ro) to export something read-only to that IP address, whereas in BSD we would say -ro 192.168.0.1. This is really a semantic difference, not a philosophical one. Reading the man page or the examples in the exports file's comments should be sufficient.
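Concretely, the same read-only export of /usr in each system's exports file (the address is illustrative):

```
# Linux /etc/exports
/usr  192.168.0.1(ro)

# FreeBSD /etc/exports
/usr  -ro  192.168.0.1
```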
Solaris, however, imposes a completely different way of thinking. The equivalent to "exports" in Solaris is the /etc/dfs/dfstab file. It's different from other Unix variants, but not incomprehensible. To share a directory in Solaris, the entry looks something like:
share -F nfs -o rw=192.168.0.1 /usr
That's right: It's basically a script. You can run the same thing on the command line, and the directory will be shared. A major difference in specifying hosts haunts new Solaris administrators: -o rw=host1:host2,ro=host3 shares the directory read-write to hosts 1 and 2, and read-only to host 3.
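Put together in /etc/dfs/dfstab, that looks like this (hostnames and the exported path are placeholders):

```
# read-write for host1 and host2, read-only for host3
share -F nfs -o rw=host1:host2,ro=host3 /export/home
```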
Solaris has quite a few differences, which should be investigated further if clients or servers will be on this operating system. The manpages share(1M), shareall(1M), mountd(1M) and nfsd(1M) are actually quite helpful, surprising as that may be.
Mounting on demand
The automounter will allow a remote file system to be mounted on demand, so that the client doesn't need to keep hundreds of NFS mounts open all the time. This feature adds to NFS's scalability, making it possible to have thousands of remote file systems accessible from a client.
In some Linux distributions, an automounter may already be running. When you insert a CD, the directory /cdrom magically contains the contents of the CD. This is basically how autofs works.
To set it up, automount maps need to be created. A map is basically a listing of directories (the mount points) and the server:directory to mount on each. A common setup has /etc/auto.master referencing files or NIS maps that list the directories to mount. NIS is a wonderfully complex beast that can synchronize users and centralize things like automounter maps, and is way beyond the scope of this NFS article.
Back to NFS, let's view a common setup. In auto.master we have:
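A single line suffices; a sketch, assuming the submap file lives at /etc/auto.import:

```
# /etc/auto.master -- consult auto.import for anything under /import
/import  /etc/auto.import
```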
This tells autofs to look at auto.import when something needs to be accessed in the file system's /import. In the auto.import file, we'd have:
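For example, one line per mount-point key (the server hostname is a placeholder):

```
# /etc/auto.import -- mount server:/usr on /import/test, read-write
test  -rw  server:/usr
```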
So, when autofs realizes that /usr from the server is not mounted in /import/test, it will simply go ahead and mount it for us. Running ls /import will not cause it to be mounted, which leads to some confusion at first; running cd /import/test, however, will. When sharing all users' home directories to all clients, automounting becomes very important, so that clients don't keep too many NFS mounts open unnecessarily. Autofs is highly configurable, but this is the basic way it operates.