Linux File Services: Good Things Arrive in Fours
Major improvements to Samba and to Linux's mainstay file system are positioning Linux for big changes in 2007.
Neither the ext4 filesystem nor Samba 4 is ready for prime time yet, but both are chock-full of promise and potential, so let's take a look at what they aim to deliver.
Ext4: Spawn of Ext2/3
Linux supports all manner of file systems: ext2, ext3, JFS, XFS, ReiserFS, NTFS, FAT12/16/32 and many more. Look in your /boot/config-[kernel version] file under "File systems" to see what's enabled in your kernel.
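As a rough sketch of that lookup, the Python below scans a kernel config for options ending in `_FS` and reports whether each is built in (`y`) or a module (`m`). The `/boot/config-<release>` path is the one mentioned above; the `_FS` suffix filter and the function name are simplifying assumptions for illustration, and the filter will miss any filesystem option that is named differently.

```python
import platform
import re
from pathlib import Path

# Matches lines such as CONFIG_EXT3_FS=y or CONFIG_XFS_FS=m.
# Assumption: filesystem driver options end in _FS; not every option follows this.
FS_OPTION = re.compile(r"^CONFIG_(\w+)_FS=([ym])$")


def enabled_filesystems(config_text):
    """Return {name: 'y' or 'm'} for filesystem options found in config_text."""
    found = {}
    for line in config_text.splitlines():
        match = FS_OPTION.match(line.strip())
        if match:
            found[match.group(1).lower()] = match.group(2)
    return found


if __name__ == "__main__":
    config_path = Path("/boot") / f"config-{platform.release()}"
    if config_path.exists():
        options = enabled_filesystems(config_path.read_text())
        for name, state in sorted(options.items()):
            print(f"{name}: {'built in' if state == 'y' else 'module'}")
    else:
        print(f"{config_path} not found; some distributions keep the config elsewhere")
```

Sub-options such as `CONFIG_EXT3_FS_XATTR` and commented-out lines such as `# CONFIG_JFS_FS is not set` are skipped on purpose, so the result lists only the filesystem drivers themselves.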
The many file systems Linux supports have all kinds of different purposes. Some are network file systems, some are compressed read-only file systems, and some are native to other operating systems like OS/2, NetWare and Windows. The first five in the list above (ext2, ext3, JFS, XFS and ReiserFS) are probably the ones you're most familiar with, as these are the general-purpose read/write file systems Linux users use every day. (I wish ZFS would come to Linux. Now that's a filesystem to get excited about.) With all of these filesystems to choose from, why do we even need ext4?
In a word: size. Ext2/3 is a 32-bit filesystem. With a 4k block size, this gives us a (theoretical) upper limit of 8 tebibytes for a single volume, and a maximum file size of 2 tebibytes, depending on various factors. For a lot of folks this will always be more than enough. But for others ext2/3 is downright quaint.
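The arithmetic behind the volume limit is short enough to check in a few lines of Python. This is a minimal sketch, not a description of the ext3 code: the function name is made up for illustration, and reading the 8 tebibyte figure as 31 usable bits (one bit lost to signed arithmetic) is an assumption. A full 32 bits of block numbers would give 16 tebibytes.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def max_volume_bytes(block_number_bits, block_size):
    """Largest volume addressable with the given block-number width and block size."""
    return (2 ** block_number_bits) * block_size


BLOCK_SIZE = 4096  # the 4k block size used in the text

# Assumption: 31 usable bits, as if block numbers were signed 32-bit values.
print(max_volume_bytes(31, BLOCK_SIZE) // TIB, "TiB")  # 8 TiB
# A full unsigned 32-bit block number.
print(max_volume_bytes(32, BLOCK_SIZE) // TIB, "TiB")  # 16 TiB
```

The 2 tebibyte file-size limit comes from a separate constraint and is not modelled here.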
So why invent another filesystem? Can't users who need support for giant files and filesystems use JFS or XFS, both of which are 64-bit filesystems that can handle very large (pebibytes and exbibytes) files and volumes? Yes, they can, but as much as I adore filesystem flamewars, I'm going to skirt the question of the pros and cons of these two file systems, and stick to discussing the reasons for Ext4. The ext codebase is native to Linux, and ext3 is rock-solid and reliable. Ext3 has a reputation for being "stodgy", which usually translates to "slow." Ext4 may lose the stodgy label, according to first benchmarks of the file system.
The ext4 developers have wrestled with a number of difficult decisions: add ext4 features to ext3, or leave ext3 alone? Preserve both forward and backward compatibility, or not? Forget about the ext filesystem entirely, and build a brand-new advanced filesystem from scratch?
The current game plan is to leave ext3 alone. It's stable, and it's the most widely used Linux filesystem, so mucking with it seems too risky. New features are going into ext4 instead.
Forward and backward compatibility are harder to preserve than they were between ext2 and ext3. Ext3 is just ext2 with a metadata and data journal added, and the journal can easily be removed again. Ext4 is more than ext2 with a journal; it has several important new features. It's a 48-bit filesystem, in contrast to ext3's mere 32 bits, which bumps the theoretical maximum volume size up to 1024 pebibytes, or one exbibyte. Like ext3, it offers both metadata and data journaling. Unlike ext3, it supports extents. Users of JFS and XFS know about extents: an extent is a contiguous chunk of storage set aside for a file. This reduces fragmentation and improves performance, because the filesystem tracks each file as a few contiguous runs rather than as the usual long collection of fixed-size blocks.
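To make the idea of an extent concrete, here is a toy Python sketch that collapses a file's list of block numbers into (start, length) runs. It illustrates the concept only; the function name and the tuple layout are invented for this example and say nothing about how ext4 stores extents on disk.

```python
def blocks_to_extents(block_numbers):
    """Collapse an ordered list of block numbers into (start, length) extents."""
    extents = []
    for block in block_numbers:
        if extents and block == extents[-1][0] + extents[-1][1]:
            # Block directly follows the current run: grow that extent.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Gap on disk: start a new extent.
            extents.append((block, 1))
    return extents


# A file stored in ten blocks: one run of eight, then a run of two.
file_blocks = [100, 101, 102, 103, 104, 105, 106, 107, 300, 301]
print(blocks_to_extents(file_blocks))  # [(100, 8), (300, 2)]
```

Ten block pointers shrink to two entries here; the less fragmented the file, the bigger the saving.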
At present an ext3 volume cannot be upgraded in place to ext4; if you want to convert one, you'll have to rebuild the filesystem. Ext4 is still beta, but you can start testing it by following the instructions in Andrew Morton's announcement.
Samba 4: AD Replacement
Samba 4 has the lofty ambition of functioning as a Windows Active Directory replacement. Windows on the desktop is going to be with us for a long time, but as good Linux geeks we know that using Linux on the back end saves both money and headaches. AD is central to most Windows networks, so a free/open source alternative makes a lot of sense, especially one that works well, supports Linux/Unix clients and doesn't cost an arm and a leg.
Samba 4 is a nearly complete rewrite, with hardly any code left over from older versions. The ancestral Samba is over ten years old, and the Samba team has learned a considerable amount in that time. Samba 4 promises to be better in almost every way: more features, more manageable, and more streamlined.
Samba 3 introduced Active Directory integration by letting a Samba server join a domain as a full Active Directory member. The next step is for Samba to become an Active Directory replacement by providing network authentication and domain member management, in addition to file and print services. As always, this is an uphill battle because Windows clients depend on a number of ever-changing, closed, non-standard, undocumented Windows login protocols, like the NTLMv2 authentication scheme. Even where Windows uses standard protocols like Kerberos, they are borked by the addition of closed, proprietary extensions, such as the infamous PAC (Privilege Attribute Certificate). A PAC carries authorization data inside the Kerberos ticket, alongside proof of identity.
Samba 4 aims to enable Kerberos logins from Windows clients. It provides PAC functionality with its own KDC (Key Distribution Center) that behaves in a way that Windows clients expect. Kerberos is light on system and network resources, and flexible, allowing the use of different login technologies such as smart cards.
Older versions of Samba require installing several additional pieces of software: OpenLDAP, Kerberos, OpenSSL and an HTTP server. Samba 4 integrates its own LDAP, Kerberos, and HTTP/HTTPS servers, which means installation and setup are easier, and the Samba team handles patching and bugfixes for these components.
Samba 4 comes with a slick AJAX-based Web administration interface that is miles above any existing graphical interface. This feature alone should incite dances of joy at many sites.
Another positive change for Samba could be Jeremy Allison's move from Novell to Google, which could mean Samba will get significant new resources. As Mr. Allison said in an interview:
"What we *really* need is more and good quality code. This (IMHO) is what Google understands (it's all about the code)..."
The Samba team is also working hard on improving Samba's VFS (Virtual File System) layer and on scalability, to make both small and large deployments easier. All told, 2007 looks like a great year to be a Samba admin.
- Table of bytes and bibytes, and why you should care which is which
- Comparison of file systems
- Time for ext4?
- Ext4 discussion on Kerneltrap
- The Gag is Off: Samba's Allison Talks Turkey on Microsoft-Novell Deal
- Samba 4 - Active Directory, by Andrew Bartlett. Excellent 78-page PDF on the challenges faced by the Samba team in dealing with weirdo Windows protocols
- Samba 4 download