Unix: Pretty Spry for "Dead."

By Charlie Schluting | Dec 29, 2006
http://www.enterprisenetworkingplanet.com/netos/article.php/3651351/Unix-Pretty-Spry-for-Dead.htm

Since SGI's bankruptcy and the upheaval at most of the other Unix vendors, many media outlets have been playing the "Unix is dead" card pretty heavily.

The common arguments for switching to Linux are replayed over and over, rarely taking into account some very important benefits that the Unixes have to offer. Given Sun's latest developments, and some not-so-new but overlooked features, we find it hard to believe that Linux will ever take over all of Unix's traditional roles.

Frequently, people like to use the fact that Linux is "everywhere" to argue that sysadmins are easier to find, and therefore easier on the personnel budget. Unfortunately, running Linux "for fun" does little to prepare a junior admin for a production environment beyond the most basic skills. So we won't weigh the man-hours, in terms of skill sets, required to run the various OSes. We'll assume all sysadmins are competent and either know, or can quickly learn, the differences between Solaris, the BSDs, and Linux.

Many arguments for or against a particular Unix operating system factor in the worry that "my current admin" may not know how to run the new server. That may slow a new platform deployment, but it isn't generally much of a concern. The real time sink comes from operating system bugs, missing OS capabilities, and gaps in sysadmin skill, not from an admin's unfamiliarity with the platform.

We are here to focus on platform stability, reliability, capability, and scalability: the things both operating systems and hardware need to do well in order to thrive.

Intel published a white paper titled "The ROI of Moving from UNIX to Linux," which opens by claiming that Linux "gets the job done less expensively." That argument no longer holds now that Solaris is free, and it was shaky to begin with: Solaris has long been cheaper than Red Hat Enterprise Linux (RHEL), for example.

The crux of the matter is deciding whether servers will be secure, manageable, and robust. Linux proponents say that large numbers of bugs are discovered in Linux because it's an open platform. Silly amateur mistakes are generally the cause of security vulnerabilities, and there's no arguing that Linux has tons of these holes. Solaris has them too, just not nearly as many. Mistakes happen and bugs will always exist, so any single measure claiming one platform is more secure than another isn't worth much, especially considering that most security holes are in third-party applications, not the kernel itself.

Both Linux and Solaris have proven to be stable, at least to some extent. Bugs unrelated to security exist in both operating systems. The Linux community certainly works hard to fix important bugs in a timely fashion, but why are they there in the first place? Solaris rarely, if ever, ships a bug that crashes the entire server.

With the power of community comes uncertainty. It's mind-boggling to run into software that is supposed to do something but instead crashes or silently fails to execute the desired action. We've all run into it, and it can be extremely frustrating. Does this happen more often in Linux, where contributors of unknown skill submit thousands of lines of code, or in Solaris, which has an actual quality-control department and professional kernel engineers? The answer should be fairly obvious.

Code trustworthiness translates directly to reliability. A quick anecdote: while testing NFSv4 for a recent article, we ran into a situation where the Linux kernel would go insane, eating tons of CPU for no reason. Essentially it was "enable NFSv4, get an infinitely increasing load average." The solution, of course, was to upgrade to a different kernel version. These types of bugs just aren't acceptable in an enterprise environment. How could we have found out what was happening? There's nothing even close to DTrace in Linux, so you just sit and stare, wondering what's going on. Even before DTrace, Sun's OS observability tools provided more detail than Linux's. Sun's engineers will work with anyone who has a support contract (still cheaper than RHEL's), especially if it's a serious issue that affects other users. That accountability just isn't there in the Linux world. We know, it's an old argument made by pointy-haired bosses, but it's true.
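To make the observability point concrete, here is a sketch of how that investigation could have gone on Solaris. These are generic DTrace one-liners of our own, not commands from the NFSv4 testing above, and the process name nfsd is an assumption for illustration:

    # Sample on-CPU activity at 997 Hz and count samples per process name;
    # press Ctrl-C to print the aggregation:
    dtrace -n 'profile-997 { @[execname] = count(); }'

    # Count system calls made by the (assumed) NFS server threads:
    dtrace -n 'syscall:::entry /execname == "nfsd"/ { @[probefunc] = count(); }'

A couple of commands like these usually turn "the load average keeps climbing" into a specific process or subsystem to blame, which is exactly the kind of answer Linux could not give us at the time.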

Linux has one thing going for it: corporate support. SUSE and RHEL are both much more stable than a stock Debian load. Yes, a stock install will run Apache just fine for years under moderate traffic, but intensive applications tend to fail frequently. Linux does take personnel time, not because people are busy figuring out how to run it, but because they are busy working around broken software.

Linux also has tons of freely available software, installable without a second thought. This isn't actually much of an argument for datacenter operations, where you'll run just a few specialized or custom-compiled applications, but people like to mention it. Interestingly enough, much of the same GNU and free software is available for Solaris from Blastwave, installable just as easily as with Debian's apt system. Free-software accessibility matters to desktop users, but not so much in the corporate world.
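As a rough illustration (the package name top is just an example, and we assume Blastwave's pkg-get client is already installed under /opt/csw), the two workflows look nearly identical:

    # Debian or any apt-based Linux:
    apt-get install top

    # Solaris with Blastwave's pkg-get client:
    /opt/csw/bin/pkg-get -i top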

Another important consideration is that software vendors will not support their products on just any platform. They are very specific about exact kernel versions on Linux and exact patch levels on Solaris. There is no free, open source substitute for everything, so companies will keep buying specialized software that must be certified on specific operating systems and specific versions. Vendors are that picky for a reason: they can only stand behind configurations they have actually tested.

Scalability, to be fair, has improved greatly in Linux. It used to crash when given more than four CPUs, but it copes much better now. Still, Solaris has been running reliably on machines with hundreds of processors for two decades.

There are plenty of other issues to consider as well. How easy is the OS to manage when you run 1,000 servers? How are updates and patches applied? Does your software remain compatible across OS upgrades? Does the OS support high-availability features? The list goes on.
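As a back-of-the-envelope sketch of what managing 1,000 servers actually means, here is the sort of loop many shops end up writing, assuming hostnames in a flat file and working ssh keys; the apt-get command is only an example of a per-host action:

    # Run updates on every host listed in hosts.txt (ssh keys assumed).
    # The </dev/null keeps ssh from consuming the loop's stdin.
    while read host; do
        echo "=== $host ==="
        ssh "$host" "apt-get -y upgrade" </dev/null
    done < hosts.txt

Ad hoc loops like this get the job done at ten hosts and become their own maintenance burden at a thousand, which is why patching and update tooling belongs on the evaluation checklist.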

Linux isn't cheaper, it takes more administrator time, it can't reliably run on big iron, and troubleshooting it involves a lot of guesswork. There are advantages to running Linux for some datacenter purposes, but the disadvantages tilt the scale so far that it flips over.

But while Solaris isn't in any danger, Linux definitely could someday replace Windows. We'll leave that for another day.