Basics of RAID

Introduction


The word RAID sounds like it might describe something Marines conduct in Fallujah, or a can of what all roaches fear, but it is simply an acronym that stands for Redundant Array of Independent (or Inexpensive) Disks. Depending on who you talk to, the letter “I” can stand for either independent or inexpensive, but in my opinion independent is more appropriate, and far less subjective.


RAID generally allows data to be written to multiple hard disk drives so that a failure of any one drive in the array does not result in the loss of any data, which increases the system’s fault tolerance. I say RAID generally does this because, although several RAID configurations provide different approaches to redundancy, some RAID configurations are not redundant at all. Fault tolerance refers to a system’s ability to continue operating when presented with a hardware (or software) failure, as would be experienced when a hard drive fails in one of the redundant configurations of RAID.


The Hardware


The basic hardware required to run RAID includes a set of matched hard drives and a RAID controller.


RAID can be run on any type of hard drive, including SCSI, SATA, and ATA. The number of hard drives required depends on the particular RAID configuration chosen, as described later. I mention the need for matched hard drives, and although this is not absolutely necessary, it is recommended. Most arrays will only be able to use the capacity of the smallest drive on each member, so if a 250GB Hitachi drive is added to a RAID configuration with an 80GB Hitachi drive, that extra 170GB would probably go to waste. The only time that this doesn’t apply is in a configuration called JBOD (Just a Bunch Of Disks), which really “isn’t a RAID configuration” but just a convenient thing that a RAID controller can do; see “Basic RAID Configurations” below for more information. In addition to matching capacities, it is highly recommended that drives match in terms of speed and transfer rate, as the performance of the array would be restricted by the weakest drive used. One more area that should be considered while matching is the type of hard drive. RAID controllers are generally built for SCSI, SATA, or ATA exclusively, although some systems allow RAID arrays to be operated across controllers of different formats.


The RAID controller is where the data cables from the hard drives are connected, and it conducts all of the processing of the data, much like the typical drive connections found on a motherboard. RAID controllers are available as add-on cards, such as a Silicon Image PCI ATA RAID controller, or integrated into motherboards, such as the SATA RAID controller found on the Asus K8V SE Deluxe (http://www.geeks.com/details.asp?invtid=K8VSE-DELUXE). Motherboards that include RAID controllers can be operated without the use of RAID, but the integration is a nice feature to have if RAID is a consideration. Even for systems without onboard RAID, the relatively low cost of add-on cards makes this part of the upgrade fairly painless.


Another piece of hardware that is not required, but may prove useful in a RAID array, is a hot-swappable drive bay. It allows a failed hard drive to be removed from a live system by simply unlocking the bay and sliding the drive cage out of the case. A new drive can then be slid in and locked into place, and the system won’t skip a beat. This is typically seen on SCSI RAID arrays, but some IDE RAID cards will also allow this (such as this product manufactured by Promise Technology: http://www.promise.com/product/product_detail_eng.asp?productId=92&familyId=7).


The Software


RAID can be run on any modern operating system provided that the appropriate drivers are available from the RAID controller’s manufacturer. A computer with the operating system and all of the software already installed on one drive can easily be cloned to another single drive by using software like Norton Ghost. It is not as easy when going to RAID, however, as a user who wants to upgrade an existing system with a single bootable hard drive to RAID must start from the beginning. This means that the operating system and all software need to be re-installed from scratch, and all key data must be backed up so that it can be restored on the new RAID array.


If a RAID array is desired in a system for use as storage, but not as the location for the operating system, things get much easier. The existing hard drive can remain intact, and the necessary configuration can be made to add the RAID array without starting from scratch.


Basic RAID Configurations


There are about a dozen different types of RAID that I know of, and I will describe five of the more typical configurations, which are the ones usually offered on RAID controller cards.


RAID 0 is one of the configurations that does not provide redundancy, making it arguably not a true RAID array. Using at least two disks, RAID 0 writes data to the drives in an alternating fashion, referred to as striping. If you had 8 chunks of data and two drives, for example, chunks 1, 3, 5, and 7 would be written to the first drive, and chunks 2, 4, 6, and 8 would be written to the second drive, all in sequential order. This process of splitting the data across drives allows for a theoretical performance boost of up to double the speed of a single hard drive, but real world results will generally not be nearly that good. Since no single disk holds a complete copy of the data, the failure of any one drive in the array generally results in a complete loss of data. RAID 0 is good for people who need to access large files quickly, or who just demand high performance across the board (e.g. gaming systems). The capacity of a RAID 0 array is equal to the sum of the individual drives. So, if two 160GB Seagate drives were in a RAID 0 array, the total capacity would be 320GB.
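
The alternating write pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real controller is implemented; the function name stripe_chunks and the use of plain integers as "chunks" are my own assumptions.

```python
def stripe_chunks(chunks, num_drives=2):
    """Distribute chunks across drives in round-robin order (RAID 0 style).

    Hypothetical helper for illustration: each "drive" is just a list.
    """
    drives = [[] for _ in range(num_drives)]
    for index, chunk in enumerate(chunks):
        # Chunk 1 goes to drive 1, chunk 2 to drive 2, chunk 3 back to drive 1...
        drives[index % num_drives].append(chunk)
    return drives


layout = stripe_chunks([1, 2, 3, 4, 5, 6, 7, 8])
print(layout[0])  # [1, 3, 5, 7]
print(layout[1])  # [2, 4, 6, 8]
```

Notice that neither list on its own contains every chunk, which is why losing a single drive loses the whole array.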


RAID 1 is one of the most basic arrays that provides redundancy. Using at least two hard drives, all data is written to both drives in a method referred to as mirroring. The contents of the two drives are identical, so if one drive fails, the system can continue operating on the remaining good drive, making it an ideal choice for those who value their data. There is no performance increase like that of RAID 0, and in fact there may be a slight decrease compared to a single drive system, as the data is processed and written to both drives. The capacity of a RAID 1 array is equal to half the sum of the capacities of the individual drives. Using those same two 160GB Seagate drives from above in RAID 1 would result in a total capacity of 160GB.
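
Mirroring is even simpler to sketch than striping. The Python below is a toy model under my own assumptions (the names mirror_write and read_after_failure are hypothetical, and drives are plain lists), but it shows why a single failure costs nothing.

```python
def mirror_write(chunks, num_drives=2):
    """Write an identical copy of every chunk to each drive (RAID 1 style)."""
    return [list(chunks) for _ in range(num_drives)]


def read_after_failure(drives, failed_index):
    """Return the data from the first drive that is still working."""
    survivors = [d for i, d in enumerate(drives) if i != failed_index]
    if not survivors:
        raise RuntimeError("all mirrors have failed; data is lost")
    return survivors[0]


drives = mirror_write([1, 2, 3, 4, 5, 6, 7, 8])
print(read_after_failure(drives, failed_index=0))  # [1, 2, 3, 4, 5, 6, 7, 8]
```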


RAID 0+1, as the name may imply, is a combination of RAID 0 and RAID 1. You have the best of both worlds: the performance boost of RAID 0 and the redundancy of RAID 1. A minimum of four drives is required to implement RAID 0+1, where all data is written in both a mirrored and striped fashion to the four drives. Using the 8 chunks of data from the example above, the write pattern would be something like this: chunks 1, 3, 5, and 7 would be written to drives one and three, and chunks 2, 4, 6, and 8 would be written to drives two and four, again in a sequential manner. If one drive should fail, the system and data are still intact. The capacity of a RAID 0+1 array is equal to half the total capacity of the individual drives. So, using four of the 160GB Seagate drives results in a total capacity of 320GB when configured in RAID 0+1.
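
Here is a rough Python sketch of that four-drive write pattern. The function name raid01_layout and the dictionary keyed by drive number are assumptions made purely for illustration; real controllers differ in how they pair the drives.

```python
def raid01_layout(chunks):
    """Stripe chunks across two drives, then mirror each stripe member.

    Drives 1 and 2 form the striped set; drives 3 and 4 hold the mirror,
    matching the pattern described in the text. Hypothetical helper.
    """
    drives = {1: [], 2: [], 3: [], 4: []}
    for index, chunk in enumerate(chunks):
        if index % 2 == 0:
            targets = (1, 3)  # odd-numbered chunks
        else:
            targets = (2, 4)  # even-numbered chunks
        for drive in targets:
            drives[drive].append(chunk)
    return drives


layout = raid01_layout([1, 2, 3, 4, 5, 6, 7, 8])
print(layout[1], layout[3])  # [1, 3, 5, 7] [1, 3, 5, 7]
print(layout[2], layout[4])  # [2, 4, 6, 8] [2, 4, 6, 8]
```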


RAID 5 may be the most powerful RAID configuration for the typical user, with a minimum of three disks required. Data is striped across all drives in the array, and in addition, parity information is striped as well. This parity information is basically a check on the data being written, so even though all data is not being written to all the drives in the array, the parity information can be used to reconstruct a lost drive in case of failure. This is perhaps a bit difficult to describe, so let’s go back to the example of the 8 chunks of data, now being written to 3 drives in a RAID 5 array. Chunks one and two would be written to drives one and two respectively, with a corresponding parity chunk being written to drive three. Chunks three and four would then be written to drives one and three respectively, with the corresponding parity chunk being written to drive two. Chunks five and six would be written to drives two and three, with the corresponding parity chunk being written to drive one. Chunks seven and eight take us back to the beginning, with the data being written to drives one and two, and the parity chunk being written to drive three. It might not sound like it, but because the parity information is always written to the drive that does not contain those specific bits of information, the array can survive the failure of any one drive. The capacity of a RAID 5 array is equal to the sum of the capacities of all the drives used, minus one drive. So, using three of the 160GB Seagate drives, the total capacity is 320GB when configured in RAID 5.
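
Parity is easier to believe once you see it rebuild something. The Python sketch below follows the three-drive layout from the example and assumes the parity chunk is the byte-wise XOR of the data chunks in its stripe, which is the usual approach but is not spelled out above. All of the names (xor_bytes, raid5_write, rebuild_drive) are hypothetical, and it assumes equal-sized chunks that fill every stripe completely.

```python
from functools import reduce


def xor_bytes(blocks):
    """XOR several equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def raid5_write(chunks, num_drives=3):
    """Lay out chunks in stripes with a parity block that rotates per stripe."""
    data_per_stripe = num_drives - 1
    drives = [[] for _ in range(num_drives)]
    for stripe, start in enumerate(range(0, len(chunks), data_per_stripe)):
        data = chunks[start:start + data_per_stripe]
        # Parity starts on the last drive and moves back one drive per stripe.
        parity_drive = (num_drives - 1 - stripe) % num_drives
        remaining = iter(data)
        for drive in range(num_drives):
            if drive == parity_drive:
                drives[drive].append(xor_bytes(data))
            else:
                drives[drive].append(next(remaining))
    return drives


def rebuild_drive(drives, failed_index):
    """Recreate every block of a failed drive by XOR-ing the surviving drives."""
    survivors = [d for i, d in enumerate(drives) if i != failed_index]
    return [xor_bytes(blocks) for blocks in zip(*survivors)]


chunks = [bytes([n]) * 4 for n in range(1, 9)]  # eight 4-byte chunks
drives = raid5_write(chunks)
print(rebuild_drive(drives, failed_index=1) == drives[1])  # True
```

The same rebuild works whichever of the three drives is lost, because XOR-ing any two blocks of a stripe yields the third.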


JBOD is another non-redundant configuration, and it is not really a true RAID array. JBOD stands for Just a Bunch Of Disks (or Drives), and that is basically all that it is. RAID controllers that support JBOD allow users to ignore the RAID functions available and simply attach drives as they would to a standard drive controller. No redundancy, no performance boost, just additional connections for adding more drives to a system. A smart thing that JBOD does is that it can treat odd-sized drives as if they are a single volume (thus a 10GB drive and a 30GB drive would be seen as a single 40GB drive), so it is good to use if you have a bunch of odd-sized drives sitting around. Otherwise it is better to go with a RAID 0, 1, or 0+1 configuration to get the performance boost, redundancy, or both.
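
The capacity rules for the five configurations can be pulled together in one small Python function. Treat it as a back-of-the-envelope calculator only: the function name usable_capacity_gb and the configuration labels are my own, and it assumes (as noted in the hardware section) that every configuration other than JBOD can only use the capacity of the smallest drive on each member.

```python
def usable_capacity_gb(config, drive_sizes_gb):
    """Estimate usable capacity in GB for the configurations described above."""
    count = len(drive_sizes_gb)
    smallest = min(drive_sizes_gb)
    if config == "JBOD":
        return sum(drive_sizes_gb)       # drives are simply joined end to end
    if config == "RAID 0" and count >= 2:
        return smallest * count          # striping, no redundancy
    if config == "RAID 1" and count == 2:
        return smallest                  # one drive's worth, mirrored
    if config == "RAID 0+1" and count >= 4 and count % 2 == 0:
        return smallest * count // 2     # half is used for the mirror
    if config == "RAID 5" and count >= 3:
        return smallest * (count - 1)    # one drive's worth goes to parity
    raise ValueError(f"unsupported config or drive count: {config}, {count}")


print(usable_capacity_gb("RAID 0", [160, 160]))              # 320
print(usable_capacity_gb("RAID 1", [160, 160]))              # 160
print(usable_capacity_gb("RAID 0+1", [160, 160, 160, 160]))  # 320
print(usable_capacity_gb("RAID 5", [160, 160, 160]))         # 320
print(usable_capacity_gb("JBOD", [10, 30]))                  # 40
print(usable_capacity_gb("RAID 0", [250, 80]))               # 160
```

The last line is the mismatched 250GB and 80GB pair from earlier: only 80GB of each drive is used, so 170GB of the larger drive goes to waste.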


Final Words


Implementing RAID may sound daunting to those unfamiliar with the concept, but with some of the more basic configurations it is not much more involved than setting up a computer to use a standard drive controller, and the benefits of RAID over a single drive system far outweigh the extra consideration required during installation. Losing data once due to hard drive failure may be all that is required to convince anyone that RAID is right for them, but why wait until that happens?
