RAID: Faster and Cheaper with Linux

Welcome to today’s thrilling howto on implementing Linux software RAID
at no expense beyond the hard disks you choose to use, whether they are
inexpensive ordinary PATA (IDE) drives, expensive SCSI drives, or
newfangled serial ATA (SATA) drives.

RAID is no longer the exclusive province of expensive systems with
SCSI drives and controllers. In fact it hasn’t been since the 2.0 Linux
kernel, released in 1996, which was the first kernel release to support
software RAID.

What RAID Is For
A RAID array provides various functions, depending on how it is
configured: high speed, high reliability, or both. RAID 0, 1, and 5 are
probably the most commonly used.

RAID 0, or “striping,” writes data across two or more drives. RAID 0 is
very fast; data are split up in blocks and written across all the
drives in the array. It will noticeably speed up everyday work, and is
great for applications that generate large files, like image editing.
It is not fault-tolerant: a failure on one disk means all data in the
array are lost. That is no different from what happens when a single
standalone drive fails, so if it’s speed and more capacity you want, go
for it.
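
To make the striping idea concrete, here is a minimal Python sketch of how chunks could be dealt round-robin across the member disks. The function names and the tiny chunk size are made up for illustration; a real array works on block devices with much larger chunks, so treat this as a model of the layout only.

```python
from __future__ import annotations


def stripe_location(chunk_index: int, num_disks: int) -> tuple[int, int]:
    """Return (disk number, position on that disk) for a logical chunk."""
    if num_disks < 2:
        raise ValueError("striping needs at least two disks")
    return chunk_index % num_disks, chunk_index // num_disks


def stripe(data: bytes, num_disks: int, chunk_size: int) -> list[list[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: list[list[bytes]] = [[] for _ in range(num_disks)]
    for start in range(0, len(data), chunk_size):
        disk, _ = stripe_location(start // chunk_size, num_disks)
        disks[disk].append(data[start:start + chunk_size])
    return disks


if __name__ == "__main__":
    # Hypothetical example: 12 bytes, three disks, 2-byte chunks.
    for number, chunks in enumerate(stripe(b"ABCDEFGHIJKL", 3, 2)):
        print(f"disk {number}: {chunks}")
```

Every disk ends up holding only part of each large file, which is why reads and writes can proceed in parallel, and also why losing any one disk ruins the whole array.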

RAID 1, or “mirroring,” keeps an identical copy of your data on each of
two disks. Your storage space is limited
to the size of the smaller drive, if your two drives are not the same
size. If one drive fails, the other carries on, allowing you to
continue working until it is convenient to replace the disk. RAID 1 is
slower than striping, because all writes are done twice.

RAID 5 combines striping with parity checks, so you get speed and data
redundancy. You need a minimum of three disks. If a single disk is lost
your data are still intact. Losing two disks means losing everything.
Reads are very fast, while writes are a bit slower because the parity
checks must be calculated.
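
The parity trick is easier to believe once you see it work. The sketch below assumes the usual XOR parity and uses a hypothetical helper, xor_blocks; it ignores the way a real RAID 5 array rotates parity among the disks and simply shows that any single missing block can be rebuilt from the survivors.

```python
from __future__ import annotations

from functools import reduce
from operator import xor


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    if not blocks or any(len(b) != len(blocks[0]) for b in blocks):
        raise ValueError("need one or more blocks of equal length")
    return bytes(reduce(xor, column) for column in zip(*blocks))


if __name__ == "__main__":
    # Three data blocks, one per disk, plus a parity block on a fourth.
    data = [b"disk", b"one!", b"two."]
    parity = xor_blocks(data)

    # Pretend the second disk died: rebuild it from the others plus parity.
    survivors = [data[0], data[2], parity]
    rebuilt = xor_blocks(survivors)
    assert rebuilt == data[1]
    print("rebuilt block:", rebuilt)
```

Lose two blocks and there is no longer enough information to solve for either, which matches the "two disks means everything" rule above. The extra XOR pass on every write is also where the write penalty comes from.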

You may use disks of different sizes in all of these, though you’ll get
better performance with disks of the same capacity and geometry. Some
admins like to use different brands of hard disks on the theory that
different brands will have different flaws.
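
If you do mix disk sizes, a quick calculation shows how much space you would end up with. This is a rough sketch: usable_capacity is a made-up helper, it ignores metadata overhead, and it assumes that RAID 0 can use the full sum of the disks (as I understand Linux software RAID does) while RAID 1 and RAID 5 are limited by the smallest member.

```python
from __future__ import annotations


def usable_capacity(level: int, sizes_gb: list[float]) -> float:
    """Approximate usable space, in GB, for RAID 0, 1, or 5."""
    if level == 0:
        if len(sizes_gb) < 2:
            raise ValueError("RAID 0 needs at least two disks")
        # Assumption: the striping layer can use leftover space on larger disks.
        return sum(sizes_gb)
    if level == 1:
        if len(sizes_gb) < 2:
            raise ValueError("RAID 1 needs at least two disks")
        # Every disk holds a full copy, so the smallest disk sets the limit.
        return min(sizes_gb)
    if level == 5:
        if len(sizes_gb) < 3:
            raise ValueError("RAID 5 needs at least three disks")
        # One disk's worth of space goes to parity.
        return (len(sizes_gb) - 1) * min(sizes_gb)
    raise ValueError("only RAID levels 0, 1, and 5 are handled here")


if __name__ == "__main__":
    # Hypothetical disk sizes in GB.
    print("RAID 0:", usable_capacity(0, [80, 120]))
    print("RAID 1:", usable_capacity(1, [80, 120]))
    print("RAID 5:", usable_capacity(5, [80, 120, 120]))
```

The gap between the raw total and the usable figure for RAID 1 and RAID 5 is the price of redundancy, and it grows when the disks are badly mismatched.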

What RAID Is Not
It is not a substitute for a good backup regimen, backup power
supplies, surge protectors, and other sensible protections. Linux
software RAID is not a substitute for true hardware SCSI RAID in
high-demand mission-critical systems. But it is a dandy tool for
workstations and low- to medium-duty servers. PATA (or IDE) drives are
not hot-swappable, but you can set up an array with standby drives that
automatically take over in the event of a disk failure. If you don’t
want to use standby drives your downtime is limited only to the time it
takes to replace the drive, because the system is usable even while the
array is rebuilding itself.
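
Because the system keeps running with a failed disk, it pays to notice when an array has gone degraded. The Linux kernel reports array status in /proc/mdstat; the sketch below scans text in that style for arrays with fewer working members than they should have. The sample text is invented for illustration, the exact layout varies between kernel versions, and degraded_arrays is a hypothetical helper, so take the pattern matching as an assumption and not a specification.

```python
from __future__ import annotations

import re

# Matches a status field such as "[3/2] [U_U]": expected members / working members.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Made-up sample in the general style of /proc/mdstat.
SAMPLE = """\
Personalities : [raid1] [raid5]
md0 : active raid1 hdc1[1] hda1[0]
      78150656 blocks [2/2] [UU]
md1 : active raid5 hdg1[2] hde1[1](F) hda2[0]
      156301312 blocks level 5, 64k chunk, algorithm 2 [3/2] [U_U]
unused devices: <none>
"""


def degraded_arrays(mdstat_text: str) -> dict[str, tuple[int, int]]:
    """Map each degraded array name to (working members, expected members)."""
    degraded: dict[str, tuple[int, int]] = {}
    current = None
    for line in mdstat_text.splitlines():
        header = re.match(r"(md\d+) : ", line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, working = int(status.group(1)), int(status.group(2))
            if working < expected:
                degraded[current] = (working, expected)
            current = None
    return degraded


if __name__ == "__main__":
    for name, (working, expected) in degraded_arrays(SAMPLE).items():
        print(f"{name}: {working} of {expected} members working")
```

On a live system you would pass in the contents of /proc/mdstat instead of SAMPLE and run the check from cron or a monitoring script.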

Hardware RAID
Hardware RAID controllers come in a rather bewildering variety.
Mainboards come with built-in IDE RAID controllers, and PCI IDE RAID
controller cards can be had for as little as $25. Most of these are
like horrid Winmodems, in that they require Windows drivers to work and
have Windows-only management tools. I wouldn’t bother with IDE RAID
controllers — Linux software RAID outperforms them in every way, and
costs nothing.

A true hardware RAID controller operates independently of the host
operating system. You’ll find a lot of choices for SATA and SCSI
drives. SATA controllers cost from $150 to the sky’s the limit,
depending on how many drives they support, how much onboard memory
they have, and other refinements that take the processing load away
from the system CPU.

Good SCSI controllers start around $400 and have an even higher sky.
Both SATA and SCSI controllers should support hot-swapping, error
handling, caching, and fast data-transfer speeds. A good-quality
hardware controller is fast and reliable, but finding one is not
so easy. Many an experienced admin has lost sleep and hair over flaky
RAID hardware.

Something to keep in mind for the future: as SATA support in Linux
matures, and the technology itself improves, it should be a capable
SCSI replacement for all but the most demanding uses. (For more
information see the excellent pages posted by the maintainer of the
kernel SATA drivers, Jeff Garzik.)

Software RAID Advantages
Linux software RAID is more versatile than most hardware RAID
controllers. Hardware controllers see each drive as a single member of
the RAID array, and handle only one type of hard disk. Most hardware
controllers are picky about the brand and size of hard disk — you can’t
just slap in any old disks you want, but must carefully choose
compatible disks. And it’s not always documented what these are.

Linux RAID is a separate layer from Linux block devices, so any block
device can be a member of the array — a particular partition, any type
of hard drive, and you can even mix and match. Endless debates rage
over which offers superior performance, hardware or software RAID. The
answer is “it depends.” An old slow RAID controller won’t match the
performance of a modern system with a fast CPU and fast buses. The
number of drives on a cable, the types of drives and cabling, the speed
of the data bus: all of these affect performance, in addition to the
speed of the CPU and the demands placed on it.

One disadvantage of software RAID is that hot-swap support is limited
and not entirely reliable.

Converting An Existing System To RAID
First of all, your power supply must be capable of powering all the
drives you want to run on the system. Adding as many drives as you want
is easy and inexpensive. If you’re going to purchase new hard disks,
you might as well get SATA, because the cost is about the same as PATA.
SATA drives are faster and use less cabling, and will soon supplant
PATA drives. PCI controller cards for additional PATA and SATA disks
cost around $40, and will run two disks each. The built-in IDE channels
on mainboards can handle two disks each, but you should run only one
disk per channel. That gives you better performance and minimizes the risk
of a fault taking out both hard disks.

Next, install the raidtools2 and mdadm packages. If you want your RAID
array to be bootable, you’ll need RAID support built into the kernel.
Alternatively, you can build RAID support as a loadable module and use
an initrd file, which to me is more trouble than rebuilding a kernel.
Next week in Part 2 we’ll cover how to do all of this. You may get a
head start by consulting the links in Resources.

Resources
