Sun's x86 Gear: Good Fit for the Enterprise?
Opinion: If you're like me, you were pleasantly surprised when Sun announced their AMD64 offerings last year. We all know that Sun hardware is well designed, tested, darn sexy, and of course reliable. Why should the discount stuff from Sun be any different?
Well, it is. That's not to say it's "bad" by any means; that couldn't be further from my actual sentiment. As a matter of fact, Sun's Opteron line beats other vendors hands down, which is a topic for a future column. But it certainly isn't equivalent to their SPARC gear. What I'm talking about here is their x2100 and x2200 servers, and to a certain extent the Galaxy line (yes, they reused the name): the x4100, x4200, and x4600 products.
Let's say you have a bunch of mail servers, for example, currently running on SPARC hardware, and you need more speed. To get decent performance, your options are something with an UltraSPARC IV+ processor (a SunFire v490 or better) or a screaming fast Opteron. Mid-range Sun server pricing starts in the mid five figures, while many Opteron servers like the x2100 and x2200 can be had for much less than $10K. The Opterons prove faster in most tests as well, so in terms of price-to-performance ratio the Opteron looks like the clear winner.
Well, hold on there, bucko. You didn't consider what the real differences are. To put it bluntly, and to paraphrase a Sun technical support engineer, "this is PC hardware in a server." That's not completely accurate, but it's too close for comfort. Sun puts some great hardware engineers on the task of making sure cooling and power are adequate and well designed. That alone can improve the life of a computer tremendously. The components are all quality and, again, well tested. But where the "PC hardware" mantra rings true is in the service level Sun offers.
Here's how it all played out when I personally had to deal with getting maintenance on an x2100 server. This server started to exhibit strange behavior. It was running Linux, and random programs began to crash (segmentation faults). This normally means that you either have really bad software, or that some hardware is failing. It could be the motherboard, RAM, or even the CPU, since the memory controller is on-chip with these Opterons. In this case multiple applications were having this issue, so it was very unlikely to be a software problem. Sadly we couldn't just blame Linux, so we tested the RAM before calling Sun. Everything checked out. It took days to explain this situation to Sun's first-line support for this server.
So finally we were bumped to "an engineer" of sorts, and she recommended that we try new memory to see if that helps. Their plan was to ship us new memory and have us install and test it. Requesting that someone come fix the broken hardware led nowhere, except to the comment about "this is PC hardware" and the stark realization that the customer is now the field technician. That's fine for the PC market, but when you have hundreds of servers to worry about and have paid for a hardware contract, swapping parts is what field technicians are for.
The point is that AMD64 support, with the same level of contract, is a completely different experience compared to SPARC hardware. If we had the same issue with a SPARC server, a tech would show up within 4 hours. He's called an Appeasement Officer (at least, that's what we call them), and his job is to either swap blatantly broken parts or soothe us until the real tech shows up. Even without a software contract, if hardware were proven to be unreliable on a SPARC server, they would fix it for us very quickly. There's none of this "let's have the customer swap parts" business.
This is really a comparison of Sun's stellar SPARC reliability and maintenance with that of their AMD64 line. It should be noted that Opteron and Xeon server experiences are very similar across all vendors.
So are the AMD64 systems reliable? Well yes, they are. Sun isn't a discount hardware-peddling outfit after all. We must realize that reliability is relative, however. Components of the Opteron and Xeon servers will fail more frequently than those of the SPARC servers, which are built for reliability. That's the reality.
The price versus performance metric is tough to ignore. Furthermore, a well-managed IT infrastructure should be able to handle a failure of a few servers without significant issue, so it may just be worthwhile to purchase as many cheap (but faster) servers as possible.
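To put rough numbers on that trade-off, here is a minimal sketch that compares a small pool of very reliable servers with a larger pool of cheaper ones. Everything in it is an assumption made for illustration: the availability figures are made up rather than measured, failures are treated as independent, and the function name is hypothetical.

```python
from math import comb


def prob_at_least_k_up(n: int, k: int, availability: float) -> float:
    """Probability that at least k of n servers are up at a given moment.

    Assumes every server has the same availability and that failures
    are independent (a simplification; shared power or network faults
    would break it).
    """
    return sum(
        comb(n, i) * availability**i * (1 - availability) ** (n - i)
        for i in range(k, n + 1)
    )


# Hypothetical scenario A: two high-end servers, both needed to carry the load.
few_reliable = prob_at_least_k_up(n=2, k=2, availability=0.999)

# Hypothetical scenario B: six cheaper, faster servers, any four of which
# can carry the load.
many_cheap = prob_at_least_k_up(n=6, k=4, availability=0.99)

print(f"two reliable servers, both required: {few_reliable:.6f}")
print(f"six cheap servers, four required:    {many_cheap:.6f}")
```

Under these made-up figures the larger pool comes out ahead even though each box fails more often, which is the argument for buying cheap and planning for failure. The sketch says nothing about how long a failed box stays down, and that is exactly where the support experience described above comes in.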
The absolute cheapest Sun AMD64 machine available is the x2100. I'd caution everyone to stay away from these, however. There's no service processor (a card that allows remote access to the hardware even when it's powered off), and there are no RAID capabilities. The x2200 does have a service processor, but still no RAID. Integrated RAID isn't available until you reach the x4100 level, which starts to get pricier but is still a great deal. These servers, excluding the x2100, also have remote KVM functionality built into the service processor. Kudos to Sun on this one; it's a truly innovative feature that reduces the need for those pricey KVMoIP switches.
Long-time Sun customers may still have a mental roadblock to overcome. The good news is that the days of Solaris x86 being completely unreliable are gone. Sun put many man-hours into making Solaris 10 function properly on both x86 and SPARC platforms. So whether you wish to run Solaris, Linux, or Windows, Sun is now poised to provide the hardware.
Be wary of replacing important servers with the lower-end AMD products, but at the same time feel free to rejoice. Sun has extremely fast servers available! (Xeon-based servers are coming soon.)