Using the New Linux ReiserFS Filesystem

By Stew Benedict | Dec 21, 2000
http://www.enterprisenetworkingplanet.com/netos/article.php/625421/Using-the-New-Linux-ReiserFS-Filesystem.htm

For most of Linux's history, the second extended filesystem (ext2) has been the standard, although the Linux kernel can be configured to read and write many other filesystem types. Recently, work has begun on several new filesystems that attempt to improve on ext2's performance. Ext3 and ReiserFS are two of these projects. Ext3 adds journaling to ext2, whereas ReiserFS is an entirely new filesystem with journaling built in. This article focuses on the strengths and weaknesses of ReiserFS and on how to set it up, should you decide to give it a try.

Journaling

If you're not familiar with database systems, journaling means that each transaction is written to a journal, or log. It is then possible to replay this journal in the event of a catastrophe and recover the lost transactions. Periodically, the journal is flushed, after it is certain the transactions have taken place.
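
The idea can be shown with a toy example. This is only a sketch of the general log-then-apply pattern, not how ReiserFS implements its journal; the file names, directory layout, and function names are all made up for illustration.

```shell
#!/bin/sh
# Toy write-ahead journal: log each change first, apply it second.
work=$(mktemp -d)                 # scratch area (hypothetical layout)
journal="$work/journal"
data="$work/data"
mkdir "$data"
: > "$journal"

log_write()   { printf '%s %s\n' "$1" "$2" >> "$journal"; }  # record the intent
apply_write() { printf '%s\n' "$2" > "$data/$1"; }           # do the real write

# Replay: redo every logged write, then flush (truncate) the journal.
replay() {
    while read -r name content; do
        apply_write "$name" "$content"
    done < "$journal"
    : > "$journal"
}

log_write a.txt first;  apply_write a.txt first   # a completed transaction
log_write b.txt second                            # "crash" before it is applied

replay                                            # recovery after the crash
recovered=$(cat "$data/b.txt")
echo "b.txt after replay: $recovered"
rm -rf "$work"                                    # clean up the scratch area
```

Replaying a write that had already been applied (a.txt here) is harmless, because redoing it leaves the same contents in place.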

In addition, for performance reasons, the author of ReiserFS has chosen to store both the file names and the files in a database, rather than just the file names and locations. This arrangement allows better storage of small files. Rather than allocating a whole block for a file that is smaller than the block size, small files are combined for optimal usage of disk space.
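
To get a feel for how much space whole-block allocation can waste on small files, here is a rough sketch of the arithmetic. It assumes a 4096-byte block and three made-up file sizes; it says nothing about how ReiserFS actually lays out data on disk.

```shell
#!/bin/sh
# Hypothetical example: slack space when every file occupies whole blocks.
block_size=4096                    # assumed block size in bytes

# slack SIZE: bytes left unused in the last block of a file of SIZE bytes
slack() {
    remainder=$(( $1 % block_size ))
    if [ "$remainder" -eq 0 ]; then
        echo 0
    else
        echo $(( block_size - remainder ))
    fi
}

total_slack=0
for size in 100 1500 5000; do      # made-up file sizes in bytes
    s=$(slack "$size")
    echo "file of $size bytes wastes $s bytes"
    total_slack=$(( total_slack + s ))
done
echo "total slack: $total_slack bytes"
```

With these three files, more space goes to slack than to data, which is the waste that packing small files together is meant to avoid.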

Strengths and Weaknesses

ReiserFS has a number of strengths:

  • Generally higher performance for all file sizes.

  • Wastes less space. There is no static inode space allocation; small files are packed together.

  • Much higher performance for large directories, even compared to other balanced tree filesystems.

  • Uses B* balanced trees, whereas other balanced tree filesystems use obsolete B+ trees.

  • Partitions can be resized while in use.

  • Extremely fast recovery in the event of unplanned machine shutdown (loss of power). Where e2fsck can take many seconds or even minutes to check a filesystem, reiserfsck needs only a few seconds.

    Of course, it also has its weaknesses:

    • The software is still relatively new. You may want to hold off implementing in production systems, although results to date have been good.

    • No quota support (yet).

    • Dump does not work (yet).

    • Kernel-based NFS is not very stable yet.

    • Certain programs (qmail, for one) have issues with ReiserFS. Patches are available in many cases.

    • It is not possible to change live filesystems back and forth from ext2 to ReiserFS. To make the change, you need to back up your data, create the file system, and restore.

    • Occasionally, you may experience a stall condition (read starves). When large writes are scheduled all at once, reads can starve. A fix for this is in the works, and the later your ReiserFS patch, the better this situation is handled.

    Precautions

    When you use ReiserFS, you'll need to take a number of precautions:

    • If you're using Lilo, you will need to set the notail option on / (the root partition) if you use ReiserFS on that partition. The new version of Lilo claims notail isn't needed anymore, but it also requires a newer version of ReiserFS.

    • If you are using md (software RAID) to spread ReiserFS over multiple disks, turn off REISERFS_READ_LOCK. To do so, comment out
      #define REISERFS_READ_LOCK

      in linux/include/linux/reiserfs_fs.h.

    • If you want to run Windows 98 under Linux using VMware, add the following line to the configuration file (~/vmware/win98/win98.cfg):
      host.FSSupportLocking1 = 0x52654973

    Utilities

    Several utilities are included with ReiserFS kernel patches:

    • mkreiserfs: Creates a Linux ReiserFS file system on a device. This utility is equivalent to mke2fs.

    • reiserfsck: Performs a consistency check for the Linux ReiserFS file system. It is equivalent to e2fsck on ext2.

    • resize_reiserfs: Used for offline file system resizing. As far as I know, ext2 has no equivalent to this command.

    I am currently running ReiserFS on /home and /usr on a SuSE 7.0 Linux install, leaving the / partition as ext2. SuSE 7.0 offers ReiserFS as an install option if you pick the expert install. For testing purposes, I created equivalent partitions, /usr2 and /home2, using ext2. The following is the output from mount:

    larry:~ # mount
    /dev/hda2 on / type ext2 (rw)
    proc on /proc type proc (rw)
    /dev/hda1 on /boot type ext2 (rw)
    /dev/hda5 on /usr type reiserfs (rw)
    /dev/hda7 on /home type reiserfs (rw)
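
If you just want the ReiserFS mount points out of a listing like this, a short awk filter will do. The sketch below works on a copy of the sample output above rather than a live system; on a real machine you would pipe mount itself into the same awk command.

```shell
#!/bin/sh
# Sample text copied from the mount output above.
mount_output='/dev/hda2 on / type ext2 (rw)
proc on /proc type proc (rw)
/dev/hda1 on /boot type ext2 (rw)
/dev/hda5 on /usr type reiserfs (rw)
/dev/hda7 on /home type reiserfs (rw)'

# In each line, field 3 is the mount point and field 5 is the filesystem type.
reiser_mounts=$(printf '%s\n' "$mount_output" |
    awk '$5 == "reiserfs" { print $3 }')
echo "$reiser_mounts"
```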
    

    Setting Up ReiserFS

    To use ReiserFS, you have to patch the kernel source before you can include it in your options. You can download appropriate kernel patches from ftp://ftp.lugoj.org/pub/reiserfs/devlinux.com/pub/namesys/. There are patches for kernel versions 2.2.14 through 2.2.17, the 2.3 series, and the new 2.4.0 kernel.

    To apply the kernel patches, do the following:

    cd /usr/src
    zcat linux-2.2.16-reiserfs-3.5.24-patch.gz | patch -p0
    

    (I was patching a 2.2.16 kernel; yours may vary.)

    When configuring the kernel, answer "y" or "m" (module) to the ReiserFS support question. Then compile and install the kernel image and modules as usual. ReiserFS utilities are located in the patched kernel source tree in linux/fs/reiserfs/utils. They are built and installed in the usual manner:

    cd /usr/src/linux/fs/reiserfs/utils
    make
    make install  
    

    Boot with your newly built kernel, run mkreiserfs on a spare partition, and mount it. If you want to replace an existing ext2 partition with the new one, it would be wise to boot into single-user mode; then you can either copy the files from the existing partition to the new one or restore to the new partition from backup media. To copy, I generally use the following:

    mount /dev/hdx /mnt/temp                   # new ReiserFS partition
    cd /home                                   # old ext2 filesystem
    tar -cf - . | (cd /mnt/temp && tar -xf -)  # && skips the extract if cd fails
    

    These commands will copy everything from the existing partition to the new one. You can then umount both partitions, edit /etc/fstab appropriately, and remount your new /home. Here is a sample /etc/fstab entry for a ReiserFS partition:

    /dev/hda7       /home   reiserfs        defaults 1 2
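
The six whitespace-separated fields of that entry can be pulled apart as shown below. This is only a small sketch; the variable names are my own labels for the fields (device, mount point, filesystem type, mount options, dump flag, and fsck pass order).

```shell
#!/bin/sh
# The sample /etc/fstab entry from above.
fstab_line='/dev/hda7       /home   reiserfs        defaults 1 2'

# read splits the line on whitespace into the six fstab fields.
read -r device mount_point fs_type options dump_flag fsck_pass <<EOF
$fstab_line
EOF

echo "device:      $device"
echo "mount point: $mount_point"
echo "type:        $fs_type"
echo "options:     $options"
echo "dump flag:   $dump_flag"
echo "fsck pass:   $fsck_pass"
```

The third field is the one that changes when you move a partition from ext2 to ReiserFS.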

    ReiserFS has a few mount/re-mount options:

    • notail: Causes the filesystem to work faster, especially for small appends to small files. The cost is more wasted disk space.

    • replayonly: Forces the filesystem driver to replay the journal and exit. Mounting will be avoided. This option is used by fsck, for the most part.

    • resize: Lets you expand a ReiserFS partition online. To do this you would specify the mount command as follows:
      mount -o remount,resize=<new blockcount> <reiserfs mount point>

      Of course, you need additional free space on the drive or you may end up consuming an adjacent partition.
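
The resize option takes a block count rather than a size in megabytes, so a little arithmetic is needed first. The sketch below assumes a 4096-byte block size and a made-up target size of 2048 MB, and uses /home as the example mount point; check the values for your own filesystem. It only prints the command and does not run it.

```shell
#!/bin/sh
# Hypothetical values: adjust to your own partition.
block_size=4096        # assumed ReiserFS block size in bytes
new_size_mb=2048       # desired filesystem size in megabytes (made up)

# Work in kilobytes: size in KB divided by block size in KB.
new_blocks=$(( new_size_mb * 1024 / (block_size / 1024) ))

echo "mount -o remount,resize=$new_blocks /home"
```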

    Bonnie++ Performance Test

    Bonnie++ is a program to test hard drives and file systems for performance. As the man pages for the program mention, there are many different types of filesystem operations. Bonnie++ tests some of them and for each test gives a result of the amount of work done per second and the percentage of CPU time this took. For performance results, higher numbers are better; for CPU usage, lower numbers are better. Bonnie++ can be downloaded from http://www.coker.com.au/bonnie++/.

    Here are the results of the test on my machine. I was the only user on the system. X was only running in login mode, and the normal system daemons were running. Here are the hardware specs:

    • Processor: Pentium P166MMX

    • RAM: 80MB

    • Hard Drive: FUJITSU MPB3064ATU, 6187MB w/0kB Cache

    I ran Bonnie++ with the defaults, reading and writing first from a 200 MB file, and then creating/reading/deleting from 30 files:

    /home/stew (ReiserFS):
    
    Version 1.00d       ------Sequential Output------ --Sequential Input- --Random-
                        -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
    larry          200M  1305  95  4431  78  2231  70  1342  95  4560  84  13.6   1
                        ------Sequential Create------ --------Random Create--------
                        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                     30  1476  98 10719  99  2265 100  1497  99 10617  99  2079  99
    
    /home2/stew (ext2):
    
    Version 1.00d       ------Sequential Output------ --Sequential Input- --Random-
                        -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
    Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
    larry          200M  1365  94  5000  80  2296  75  1351  94  4561  82  13.9   1
                        ------Sequential Create------ --------Random Create--------
                        -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
                  files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                     30    71  99   246  99  1460  98    72  99   272 100   181  70
    

    In the case of the single 200 MB file, ReiserFS and ext2 were pretty similar. But for the small file manipulations, ReiserFS was much faster. Also note the difference between random and sequential deletes in ext2.
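
To put numbers on "much faster," this small sketch divides the ReiserFS file-operation rates by the ext2 rates. The figures are copied from the two tables above; the test labels are my own shorthand for the column headings.

```shell
#!/bin/sh
# Create/read/delete rates (files per second) from the two tables above.
# Columns: test name, ReiserFS rate, ext2 rate.
results='seq-create 1476 71
seq-read 10719 246
seq-delete 2265 1460
rand-create 1497 72
rand-read 10617 272
rand-delete 2079 181'

# Print how many times faster ReiserFS was in each test.
ratios=$(printf '%s\n' "$results" |
    awk '{ printf "%-12s %5.1fx\n", $1, $2 / $3 }')
echo "$ratios"
```

On these figures the advantage runs from about 1.6x for sequential deletes to more than 40x for sequential reads of the small files.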

    I also killed the power on the system a number of times without shutting down properly, and the ReiserFS partitions were checked and back online in less than a second when the system came back up. On the other hand, the ext2 partitions took 20 to 40 seconds to run e2fsck.

    Real-World Usage

    The following statements were listed as testimonials on the ReiserFS Web pages. SourceForge is a developer resource that sponsors open source projects; the site is run by VA Linux. Mozilla is the open source spin-off of the commercial Netscape browser:

    • "http://ftp.sourceforge.net/ has 850GB storage, half of which is reiserfs, half is ext2. Both filesystems have been running flawlessly for > 4 months of production (actually longer, but wasn't reiserfs before). That server pushes between 15Mbit and 50Mbit/sec, and pulls/syncs about 2-5Mbit/sec, 24x7."

    • "Reiserfs also powers the CVS tree filesystem for cvs-mirror.mozilla.org (also tokyojoe.sourceforge.net), which is the one and only anonymous CVS checkout point for mozilla. That server has run flawlessly under very heavy load since birth."

    Conclusion

    I hope this overview gives you some insight as to the pros and cons of using ReiserFS. My experience so far has been good, and I may adopt it the next time I upgrade my server. In the meantime, I'll continue running it on this test system, and perhaps load up a database and give it a bit more of a workout.

    Stew Benedict is a systems administrator for an automotive manufacturer in Cleveland, Ohio. He also is a freelance consultant, and runs AYS Enterprises, which specializes in printed circuit design, Microsoft Access solutions for the Windows platforms, and utilizing Linux as a low-cost alternative to commercial operating systems and software. He has been using and promoting Linux since about 1994. When not basking in the glow of a CRT, Stew enjoys time with his wife, daughter, and two dogs at his future (not too much longer!) retirement home overlooking Norris Lake in the foothills of the Smoky Mountains in Tennessee.