Thumping on Thumpers: Sun's Missing the Boat
Opinion: Sun's Thumper and Thor provide outstanding performance and value, but if the company thinks it's going to pick off NetApp customers, it needs to provide some polish.
Sun Microsystems’ x4500 and x4540 storage systems, code names Thumper and Thor, are truly revolutionary. These 48-disk storage devices leverage Solaris 10 and ZFS, providing stunning performance somewhere around the $1K per TB price point. While these systems are great, and potentially much better than NetApp storage devices, Sun will not be able to take much market share from NetApp.
The Thumpers, as I will call both models, have certainly changed the way my IT shop thinks about storage. Large Hitachi arrays with FC disks, at a price point an order of magnitude higher than a Thumper, used to be the only storage trusted to perform under moderately high usage. Time and again the Thumper’s 48 SATA disks have proven much faster, thanks to ZFS.
As wonderful as it is to have high volumes of storage performing at levels we never thought possible, we are also keenly aware of its limitations. It may be tempting to replace high-end storage with better performing and much cheaper storage, but there are a few problems with that idea.
Ignoring, for a moment, that you must understand Solaris and Samba to even start using this in production, let’s take a look at the hardware itself.
A Thumper is currently a dual quad-core AMD Opteron server with plenty of RAM. The Thumper is certainly a wonderful server, with tons of throughput provided by quad gigabit Ethernet ports and many PCI-E busses. If you want to run a database server that requires many TBs of storage, this is the perfect box.
When you start talking about sharing CIFS, NFS, or iSCSI, the picture gets bleak compared to a NetApp. Sharing iSCSI from the Thumper is amazing, and you can easily max out a gigabit link. Performance is not the problem; performance is what you get in exchange for running at risk: the Thumper is just a single server with no redundant “controllers.” If something happens to the operating system or hardware, all your iSCSI shares go down. These boxes are extremely reliable, and ZFS is extremely stable, but you are certainly not guaranteed uptime the way you are with traditional storage devices. Software updates, of course, require taking the entire box down instead of one controller at a time.
No Management Tools
Now, I do love Solaris, so the Thumpers fit right in and aren’t a huge pain to manage. Unfortunately, I simply cannot imagine anyone but a Solaris administrator running a Thumper. You must understand how to get NFS, Samba, or iSCSI working in Solaris if you plan to use those features.
In the NetApp world, devices do not even require a storage administrator to manage. Web-based management tools allow the casual user to understand and configure every aspect of the storage system. This opens the devices up to a much wider audience and provides a layer of abstraction that guarantees features are configured the way NetApp intended. You can configure a Thumper any way you like, which is nice, but it leads to customers doing crazy things. Performance characteristics will vary depending on how each customer has configured their Thumper, which also leads to supportability issues.
Sun’s idea of a management console is a Web-based interface that covers only half of a feature set, meaning you still have to log in via SSH and configure things the rest of the way. I am happy doing so, but Sun could dramatically cut support costs, ensure all customers have the same experience, and open the Thumper to a much wider audience if it created a unified management interface for all NAS-like features. Historically, Sun starts to go down this path, but abandons it shortly thereafter.
As I have alluded to a few times, I am using a Thumper to share iSCSI volumes. Each virtual machine in a Linux cluster gets its own LUN, and everything works amazingly well. Well, it works, but it takes a lot of manual configuration on both the Thumper and the VM servers. In an ideal world, running an iSNS server would allow me to cut down on management tasks dramatically. Sun does provide an iSNS server with OpenSolaris now, but setting it up is a pretty time-consuming task. If you want to get a nice visual representation of your domains, you must jump through some crazy hoops to get the GUI running as well (you have to install Solaris 10 packages off the CD onto your OpenSolaris server - ick).
ZFS replication is in development, so we will just talk as if it exists, but sadly it will require manual scripting to be useful. Replicating ZFS volumes to a second Thumper could be a way to overcome the lack of redundancy in a Thumper server. Heck, buying two Thumpers and only being able to use half the space isn’t even that bad of a proposition. Unfortunately, you would spend an extremely large amount of time configuring replication and failover. While this sounds fun if you have extra time on your hands, it makes much more business sense to buy a NetApp and be done with it. Replication has existed in nearly every storage product on the market for years, and if Sun wants to play in this arena it will need to make replication extremely easy to use.
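To give a feel for the kind of manual scripting involved, here is a minimal sketch. It assumes the scripting would be built on the `zfs send` and `zfs receive` commands; the article does not say which mechanism it has in mind. The host name `thumper2`, the file system `tank/vm01`, and the snapshot names are hypothetical. The function only builds the command line and leaves scheduling, error handling, and failover, which is where the real time goes, to the administrator.

```python
import shlex
from typing import Optional


def replication_pipeline(filesystem: str,
                         prev_snap: Optional[str],
                         new_snap: str,
                         standby_host: str) -> str:
    """Build a shell pipeline that copies a snapshot to a standby host.

    With prev_snap set, only the changes between the two snapshots are
    sent (zfs send -i); with prev_snap=None, the full snapshot is sent.
    """
    send = ["zfs", "send"]
    if prev_snap is not None:
        send += ["-i", f"{filesystem}@{prev_snap}"]
    send.append(f"{filesystem}@{new_snap}")

    # -F rolls the target back to its latest snapshot before receiving.
    receive = ["ssh", standby_host, "zfs", "receive", "-F", filesystem]

    # Quote every argument so odd names cannot break the shell command.
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (send, receive)
    )


if __name__ == "__main__":
    # Dry run: print the command instead of executing it.
    print(replication_pipeline("tank/vm01", "rep-1", "rep-2", "thumper2"))
```

Multiply this by every volume, add bookkeeping for which snapshot each side last agreed on, and the appeal of a product with replication built in becomes obvious.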
And finally, but not exhaustively, I would like to mention ZFS snapshots. Snapshots require a sysadmin to write scripts that run the snapshot commands at the desired interval. It is a manual process, but not too difficult to accomplish, so it may seem a trivial complaint. After snapshotting, you must also devise a method by which your users (in the case of CIFS and NFS) can access the snapshots. There is no good way to visualize your snapshots and how much space they are using, nor to adjust the frequency or aging policies without editing the script you made to create the file system snapshots. Once you have a couple thousand ZFS volumes, if you take daily snapshots and retain them for a week, you now have 2,000 times 7, or 14,000 mounted ZFS file systems. Just listing them with ‘zfs list’ takes forever, let alone wrapping your head around your file system layout. This is one situation where a GUI would be welcome, even by the grumpiest of terminal-based old-time sysadmins.
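The snapshot script described above might look something like the following minimal sketch. The `daily-YYYY-MM-DD` naming scheme, the seven-day retention default, and the file system name `tank/home` are all assumptions made for illustration. The function only plans the `zfs snapshot` and `zfs destroy` commands; a real script would feed it the output of `zfs list -t snapshot` and then execute the result from cron.

```python
from datetime import date, timedelta
from typing import List, Tuple


def snapshot_name(filesystem: str, day: date) -> str:
    """Name a daily snapshot, e.g. tank/home@daily-2009-01-10."""
    return f"{filesystem}@daily-{day.isoformat()}"


def plan_rotation(filesystem: str,
                  existing: List[str],
                  today: date,
                  keep_days: int = 7) -> Tuple[List[str], List[List[str]]]:
    """Return today's snapshot command plus destroy commands for old ones.

    After the plan runs, at most keep_days daily snapshots remain,
    counting the one taken today.
    """
    prefix = f"{filesystem}@daily-"
    cutoff = today - timedelta(days=keep_days - 1)
    create = ["zfs", "snapshot", snapshot_name(filesystem, today)]

    destroy = []
    for snap in sorted(existing):
        if not snap.startswith(prefix):
            continue  # not one of ours; leave it alone
        try:
            taken = date.fromisoformat(snap[len(prefix):])
        except ValueError:
            continue  # name does not end in a date; skip it
        if taken < cutoff:
            destroy.append(["zfs", "destroy", snap])
    return create, destroy


if __name__ == "__main__":
    # Dry run against nine days of pretend snapshots.
    today = date(2009, 1, 10)
    existing = [snapshot_name("tank/home", today - timedelta(days=n))
                for n in range(1, 10)]
    create, destroy = plan_rotation("tank/home", existing, today)
    for cmd in [create] + destroy:
        print(" ".join(cmd))
    # The arithmetic from the text: 2,000 volumes kept for 7 days each.
    print(2000 * 7, "snapshots to keep track of")
```

Changing the frequency or the retention period means editing `keep_days` or the cron entry by hand, which is exactly the complaint.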
In reality, I have no idea if Sun wants to take market share from NetApp, or if it is just targeting existing server customers. But if making the wonderful Thumper ubiquitous is a goal, Sun has a lot of work ahead, and addressing the above issues isn’t anything it has done before. Apart from hiring me, Sun is unlikely to succeed in fully leveraging the Thumper line.
Charlie Schluting is the author of Network Ninja, a must-read for every network engineer.