Thoughts on RAID Arrays – Rick Davis
So you’re trying to budget for that new Conroe upgrade. You envision yourself sitting back and enjoying the improved frame rates, snappy program responses and benchmarks that you’ll brag about in the forums. But did you consider your hard drive performance? You say you have plenty of room and that 350 gigabyte 7200 RPM jewel you bought a year ago is running great. What’s that? You only have one drive? Hmmm... Buddy, we need to talk.
Serious overclockers have always known this, but even performance minded users on a budget really need to consider a RAID (Redundant Array of Independent Disks) configuration.
Although hard drives are faster, bigger and have more cache memory than ever, it’s RAID that really breathes life into a modern PC. You might think RAID is something only used for corporate servers and you may have heard horror stories about configuring them to boot. Well rest assured – RAID is no longer just for servers and it’s no more difficult to implement today than it is to upgrade to that new processor. In fact, it can be one of the easiest hardware upgrades you can make if you get the right hardware to start with.
There are many RAID configurations out there but a first time RAID user should really stick with one of three types:
- RAID 0: The fastest mode and the most popular among speed junkies. But there is a downside I’ll cover below.
- RAID 1: The ‘safe’ RAID that offers a safety net in case one of your drives decides to retire early. It’s got a great purpose, but it often doesn’t offer a significant performance advantage over a single drive.
- RAID 5: Stripes data across the drives like RAID 0, but also stores parity information so the array survives the loss of any one drive. It requires a minimum of three drives but does offer a significant performance boost as well as redundancy.
But you say you don’t have the money to buy two or more drives. Well, maybe you really do. Consider this:
RAID 0 was born to solve some big problems. In days gone by, hard drives cost vastly more than they do today – capacity was just too expensive to waste. RAID 0 is basically two or more drives joined together by a controller so that the operating system sees them as a single volume.
For example: if you put two 250 gigabyte, 7200 RPM, 8 MB cache SATA drives that cost about $80 each (as of this writing) together as RAID 0, then Windows (or another OS) sees them as a single 500 gig drive. That same 500 gigs as a single drive with the same specs costs over $200! Now you can buy two hard drives at a steal and make Windows think and perform as if a half terabyte 16 MB cache 15,000 RPM monster were tethered to the mainboard.
Slick, eh? The real kicker is that the drives running as RAID 0 perform simultaneous read/writes, thus reducing the seek/access times by a huge amount. In fact, a pair of garden variety low-priced drives can easily outperform a single high speed, expensive drive without much of a sweat.
Sounds just too good to be true right?
There is a penalty – the Achilles Heel of a RAID 0 configuration is that if one of the drives fails or even hiccups, your whole RAID array is down with it. That’s because your data is “striped” or split evenly between the two drives. That’s where the performance really lies. Load that huge map in your game and under RAID 0 half the data is pulled from one drive, half of it simultaneously from the other. Think of it as SLI of the hard drives.
But if the data is missing from one drive, even a small amount, you’re probably down for the count. It’s simple math – if you use RAID 0, your odds of losing your data are multiplied by the number of drives in the array. The performance increases with every drive you include, though. Use three drives and you may dehydrate from drooling all over the keyboard because your jaw will drop at the performance. The rule of thumb is roughly a 33% increase in performance for each drive added, up to about 5 drives – then the system’s own limitations prevent much improvement beyond that.
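To put rough numbers on that risk: if each drive independently has a probability p of failing in a given period, a RAID 0 array of n drives survives only if every drive survives, so the chance of losing the array is 1 - (1 - p)^n. For small p that is close to n times p, which is where the "multiplied by the number of drives" rule comes from. The sketch below assumes a made-up 3% annual failure rate per drive purely for illustration; it is not a measured figure for any real drive.

```python
def raid0_failure_probability(p_drive: float, n_drives: int) -> float:
    """Chance that at least one of n independent drives fails,
    which is enough to take down a RAID 0 array."""
    if not 0.0 <= p_drive <= 1.0:
        raise ValueError("p_drive must be between 0 and 1")
    if n_drives < 1:
        raise ValueError("n_drives must be at least 1")
    return 1.0 - (1.0 - p_drive) ** n_drives


# Assumption for illustration only: 3% chance per drive per year.
ASSUMED_ANNUAL_FAILURE_RATE = 0.03

for n in (1, 2, 3, 5):
    risk = raid0_failure_probability(ASSUMED_ANNUAL_FAILURE_RATE, n)
    print(f"{n} drive(s): {risk:.2%} chance of losing the array in a year")
```

The exact result sits slightly below n times p, but for realistic failure rates the difference is small enough that the rule of thumb holds.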
Okay, so you want RAID but the conservative side of you says it’s too risky. Your data is just too important to lose. This is where RAID 5 comes in. Under this scenario, you have to start with at least three drives. The data is striped across the drives much like RAID 0, while one drive’s worth of space holds parity information that lets the array rebuild whatever was on a failed drive. It’s technically far more complicated than this, but you can let the RAID controller deal with the details.
Suffice it to say, you can buy three of those $80 drives and your wallet will barely miss the extra 20 or so bucks, because your performance will be smoking and your data safer than on a single drive! If one drive fails, you just replace the drive and you’re back in business. RAID 5 is growing in popularity and is a great solution, but it’s slightly more complicated to set up and usually requires three identical drives.
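For the curious, here is a minimal sketch of the idea behind RAID 5's redundancy. Real RAID 5 does not keep a literal mirror drive; it stores XOR parity and rotates it across all the drives, and the controller handles the bookkeeping. The toy example below uses short byte strings to stand in for one stripe on a three-drive array (two data blocks plus one parity block). The block contents and the function name are made up for illustration.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-sized blocks byte by byte."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same size")
    return bytes(x ^ y for x, y in zip(a, b))


# One stripe: two data blocks and the parity block computed from them.
block_a = b"GAME"   # lives on drive 1
block_b = b"MAPS"   # lives on drive 2
parity = xor_blocks(block_a, block_b)  # lives on drive 3

# Pretend drive 2 died: its block can be rebuilt from the survivors.
rebuilt_b = xor_blocks(block_a, parity)
print(rebuilt_b == block_b)
```

The same trick works whichever single drive is lost, which is why the array keeps running with one drive down but not with two.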
There is a really effective “poor-boy” trick – just configure a RAID 0 using two SATA drives. Then attach either another SATA or a plain old IDE drive of about the same capacity as the RAID array. Now you have a perfect spot to make a copy of the array for backup, or to store non-performance-sensitive files like pictures, file images etc.
I won’t go into detail about how to actually install the array. It differs just enough with every mainboard and chipset that it’s better to follow directions from the manufacturer, but one thing should be clear by now: you need a mainboard that supports SATA and RAID¹. Most popular boards do. You need at least two drives, and it’s best if you plan on installing your OS cleanly after the RAID is set up.
Dual core is the way to go with processors and dual graphics is mainstream now, so why not dual hard drives? Research the components in the forums, be sure you have enough wattage from the power supply to run multiple drives, and you just might be shocked at what a difference it can make.
¹ Or you can buy a RAID PCI card.
Overclockers.com carried the article “Don’t Forget RAID” by Rick Davis. I feel that the article sells its message hard, and that the ‘opposite view’ is needed for balance, so here goes:
The article uses fresh language. That’s great – it’s generally a great way to write for the web audience – but here it goes so far that newbies could get overwhelmed by enthusiasm, and perhaps act before thinking. Some examples: “it’s RAID that really breathes life into a modern PC”; “make Windows think and perform as if a half terabyte 16 MB cache 15,000 RPM monster were tethered to the mainboard”; and lastly “three drives and you may dehydrate from drooling all over the keyboard because your jaw will drop at the performance”.
So how about that performance; how much of a boost does RAID really get you? The short answer is “it depends”. The performance is largely determined by three things:
- The time to find the first byte of a file (disk access time)
- The speed at which the following bytes are read/written (sustained transfer rate, STR)
- Any ‘tricks’ to improve performance, such as caching or rearranging the order in which requests are served (such as Native Command Queuing, NCQ)
Now when you add RAID, you mostly change the sustained transfer rate. For example, two disks in a stripe will truly give you twice the performance when reading or writing a very large file. Four disks in a striped set, well that’s four times the read performance for large files – quite impressive. But with small files the ‘time to find the first byte (disk access time)’ completely dominates. Thus RAID will do nothing for performance in that situation. OK – it quickly gets more complicated than this, but for controllers without cache RAM and single-user scenarios, the above is pretty much correct.
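A back-of-the-envelope model makes the point concrete. The sketch below assumes the access time is paid once per file and that striping scales only the transfer part, ideally. The 13 ms access time and 60 MB/s sustained transfer rate are illustrative guesses for a 7,200 RPM drive, not measurements, and the function name is my own.

```python
def read_time_ms(file_mb: float, disks: int,
                 access_ms: float = 13.0,
                 str_mb_per_s: float = 60.0) -> float:
    """Estimated time to read one file from an ideal striped set.

    Access time is paid once; only the transfer part shrinks
    as disks are added to the stripe.
    """
    transfer_ms = file_mb / (str_mb_per_s * disks) * 1000.0
    return access_ms + transfer_ms


for size_mb in (0.064, 700.0):  # a 64 KB file and a CD-sized file
    single = read_time_ms(size_mb, disks=1)
    striped = read_time_ms(size_mb, disks=2)
    print(f"{size_mb:g} MB: 1 disk {single:.1f} ms, "
          f"2 disks {striped:.1f} ms, speedup {single / striped:.2f}x")
```

Under these assumptions the small file gains only a few percent from the second disk, while the large file gets close to the full 2x.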
The Storagereview.com forums and FAQ have covered this before. For starters this FAQ entry outlines the situation. (One thing to note: perhaps the FAQ comments about first person shooter games no longer hold true. FPS games today use larger bitmap textures and level/map files than older titles.)
The Tech Report has a good tutorial and test of RAID on semi-current Intel and NVIDIA chipsets, which I would also recommend reading HERE. The main point that I want to make here is that the real performance can be much lower than the theoretical performance, due to limitations in the RAID controller / driver.
Furthermore, using RAID may also require a certain understanding and ‘sysadmin knowledge’ from the user. Booting Windows from a RAID array requires a driver and a special install process. Maintenance can also require some new knowledge, such as monitoring the array for disk failure and recovering it before the whole array fails. This isn’t hard stuff, but it’s new for many people.
So is RAID a good idea for the typical home user? I’d say certainly not.
How about the PC enthusiast with solid technical knowledge? Well, maybe, perhaps even probably – but it really depends on your usage of the PC, your risk tolerance, your need for speed, and the quality of your chosen RAID controller.
So this ends my comments to the article. And while I’m on a roll, I’ll add some content of my own:
If you’re thinking about RAID, then I would suggest the following decision process / things to go over:
- Understand the differences between:
  - Complete hardware RAID coprocessing solution (‘hardware RAID’)
  - Motherboard with a dumb controller, a BIOS extension and an intelligent driver (most enthusiast motherboards, Intel and NVIDIA’s RAID solutions)
  - Pure software RAID solution (Microsoft Windows and Linux software RAID)
- Understand the RAID levels thoroughly. For most people, the sensible RAID choices are 0, 1, 5, 6 & 10. RAID 0, 1 and 5 are by far the most common, but not necessarily the best. Take a look at RAID 1 and 10 too. HERE is a simple guide (the RAID 10 graphic is a bit illogical).
- If you plan on using the RAID controller on your motherboard, then find out which chipset it uses and Google it extensively. Are some people in grief because that RAID implementation is buggy and incompatible? What do people say about its performance?
- Consider software RAID. Both Linux and Windows offer some support for RAID across partitions. This can be more flexible than controller-based RAID, and faster and more mature too. But generally you can’t have your boot partition (Windows partition) on a RAID array when using OS RAID (with Linux you can, but it’s a bit hairy).
- Consider a ‘real’ RAID controller, such as a PCI-X or PCI-Express add-in card. There are several SATA models available, typically costing about 300 USD. Tom’s Hardware, GamePC.com and xbitlabs.com have reviewed several. (Note two things: With multiple drives, these really need a PCI-X or PCI-Express connection to perform optimally. And when a disk is initialized on such a controller, the disk is formatted in a proprietary format – moving it to a new PC/controller will require a format of the drive.)
- Consider Intel’s “Matrix RAID” and other implementations that allow you to use different RAID levels on the same set of drives. Using just two drives, you could create a mirrored partition for Windows and your applications, and a striped partition for games and less important data. This could be appealing.
- Consider your risk tolerance. Would you be in grief if your hard disk system failed? Will you need to move the disk drives to a new PC without formatting them, and will your RAID solution support this? How many resources and how much time do you have if there are installation or compatibility problems?
- Lastly, consider buying the Western Digital Raptor WD1500 10,000 RPM disk instead. It’s fast, see HERE. For your needs, how would the performance and capacity of a single Raptor stack up against getting more 7,200 RPM drives?
It’s a long list, I know – but I feel that it’s necessary to think through these things if you want to make an informed decision about using RAID in your home PC.
For those thinking about RAID, I also feel that they should look more closely at RAID 1 and 10, and not just RAID 0 and 5. RAID 1 on a good controller gets you twice the sustained read rate and data redundancy.
Note that the Tech Report review showed the Intel and NVIDIA RAID 1 implementations not providing twice the read speed, illustrating just how many caveats there are in this. RAID 10 gets you both increased sustained transfer rates for reads and writes and data redundancy, but requires 4 disks.
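When comparing levels, it can help to tabulate what each one costs in capacity and what it buys in redundancy. The sketch below uses the textbook definitions of RAID 0, 1, 5, 6 and 10 with identical disks. It says nothing about performance, which (as noted above) depends heavily on the controller. The function name and the 250 GB disk size are just examples.

```python
def raid_summary(level: int, disks: int, disk_gb: float):
    """Return (usable capacity in GB, disk failures always survived)
    for an array of identical disks, using textbook RAID definitions."""
    if level == 0:      # striping, no redundancy
        min_disks, data_disks, tolerated = 2, disks, 0
    elif level == 1:    # every disk holds a full copy
        min_disks, data_disks, tolerated = 2, 1, disks - 1
    elif level == 5:    # one disk's worth of parity
        min_disks, data_disks, tolerated = 3, disks - 1, 1
    elif level == 6:    # two disks' worth of parity
        min_disks, data_disks, tolerated = 4, disks - 2, 2
    elif level == 10:   # stripe across mirrored pairs
        if disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        # Only one failure is guaranteed; more survive if they
        # happen to hit different mirrored pairs.
        min_disks, data_disks, tolerated = 4, disks // 2, 1
    else:
        raise ValueError(f"unsupported RAID level: {level}")
    if disks < min_disks:
        raise ValueError(f"RAID {level} needs at least {min_disks} disks")
    return data_disks * disk_gb, tolerated


for level in (0, 1, 5, 6, 10):
    capacity, tolerated = raid_summary(level, disks=4, disk_gb=250)
    print(f"RAID {level:>2} on 4 x 250 GB: {capacity} GB usable, "
          f"survives {tolerated} failure(s)")
```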
I’m currently not using RAID at home, but have used RAID 0 in the past. For my own needs, a single, near-silent 7,200 RPM drive or a single Raptor 1500 would be my choices for a new PC. Your mileage may vary… 🙂
Jesper Mortensen email – jesper [at] mortensen [dot] name
I read both articles about RAID (HERE and HERE) and I have to agree with most of what Jesper wrote, but I wanted to give some more stuff to think about.
I am a long time RAID user – I used RAID when 4200 rpm disks were still in fashion.
My first RAID array consisted of two Quantum (later Maxtor) 4.5GB Atlas 7200rpm SCSI drives on an Adaptec 2940UW; this set me back a whopping $1500 for 9GB, back in 1993! Over the years I have used many different RAID arrays, and they cost me a small wheelbarrow of money!
But it was well worth it then.
One thing I learned: Although RAID 0 (striping) definitely has its benefits, and in the “old days” was even really needed, it is not the holy grail anymore!
When the WD Raptor 74GB SATA came out, it was the fastest disk for home use and the time came to retire the SCSI drives, because modern desktop disks use different access algorithms than SCSI server disks and are therefore faster for desktop use.
I built a RAID 0 system with two of these babies and was very happy with it until I got a system crash 😉
Around the time my hard drive crashed, I was also doing some reading.
There were two sides battling over the pros and cons of which was better – RAID 0 or a single disk.
The argument was that RAID 0 just moves files faster but does nothing for access time, and if one disk is too late with data, the other has to wait.
So is it worth the extra risk and does it give real world benefits?
The truth is somewhere in the middle, I think, but yes – in “some” cases it helps.
It’s like buying a really nice, expensive sports car. Yes, it’s fast, but if you’re driving in town all day (as most of us normally do) it doesn’t help much, because there is just too much traffic around. But sometimes you get onto the German autobahn, where there are no restrictions, and then it’s all worth it!
And so I started thinking “Maybe it would be better to spread the files over two disks and have the best of both worlds.”
I started testing some different setups with a stopwatch and found out that in most cases, two single disks are faster than one RAID 0 array, if set up right, and the fastest setup for me now is:
Disk 1: Windows and the swapfile
Disk 2: Games
The underlying thinking of my setup is the same as with dual core CPUs.
It stops other data requests from interrupting the task at hand.
It doesn’t show much benefit in most benchmarks, because they don’t really look at load times, but it does make your system feel a lot faster.
The reason I think this setup works faster is that Windows is always doing something in the background and also, if the swap file is getting used, it doesn’t stop your game from being loaded, so it can do two things at the same time.
So instead of the head going back and forth on the same disk, your main program has full control over the second disk and doesn’t get interrupted all the time by other (Windows) things.
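A toy model shows why this can matter. Suppose Windows and the game each read a handful of nearby tracks, and on a single disk those requests arrive interleaved. The sketch below simply adds up how far the head has to travel. The track numbers are invented, and the model ignores caching, NCQ and the operating system's own request scheduling, so treat it as an illustration of the idea and not as a benchmark.

```python
def head_travel(tracks):
    """Total distance (in tracks) the head moves between
    consecutive requests served in the given order."""
    return sum(abs(b - a) for a, b in zip(tracks, tracks[1:]))


# Hypothetical track numbers: Windows files near the start of the
# disk, game files much further in.
windows_requests = [100, 110, 120, 130]
game_requests = [50_000, 50_010, 50_020, 50_030]

# One disk: the two request streams alternate, so the head keeps
# jumping between the two regions.
interleaved = [t for pair in zip(windows_requests, game_requests)
               for t in pair]
one_disk = head_travel(interleaved)

# Two disks: each head only serves its own stream.
two_disks = head_travel(windows_requests) + head_travel(game_requests)

print(f"one disk:  {one_disk} tracks of head travel")
print(f"two disks: {two_disks} tracks of head travel")
```

In this made-up case nearly all of the single-disk head movement comes from hopping between the two regions, which is exactly the back-and-forth that the two-disk setup avoids.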
I also tried using a third disk and used one of the older 10K SCSI disks for use as the swap file. Even though I could measure some difference, especially when I had a lot of programs open and started using the swapfile a lot, for me the overall benefit was not worth the extra noise and heat.
But if you are one of the lucky guys who have a setup with three or more disks, I would definitely go this way.
If speed is most important and money is no problem, then look at one of the SATA-II cards from Areca; they are also sold under the Tekram brand name. They are really the fastest SATA controllers on the market at the moment, and their 256MB of on-board cache also helps a lot, especially with lots of small files, as some games have.
So if you have plenty of money to burn and just want the best of the best, go for it, but note that these cards have either PCIe 8X or 64-bit PCI-X connectors.
The PCI-X version will also work fine in a standard PCI slot, and the PCIe version will work fine in a SLI 16X slot¹ or 1X / 4X slot² if it fits.
¹ If the BIOS supports it – check Areca’s FAQ on their website or send an email.
² If the back end of the slot isn’t open, as most aren’t, you can use a Dremel to cut it away (if you are real handy and don’t mind losing your warranty).
Both cards will be limited by the maximum speed of the bus you are using, though. If you really feel the need for speed, try one of the new ASUS “PCI-X work station AM2/S775 boards” – the PCI-X version will do a really fine job for you; you can have Windows on a Raptor and the game / program files that you need fast access to on the RAID channel.
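To see how the bus can become the bottleneck, compare the combined sustained transfer rate of the drives with the theoretical peak of the slot. The bus figures below are the usual published peak rates (a PCI-X card in a standard PCI slot falls back to plain PCI speed, and first-generation PCIe carries about 250 MB/s per lane in each direction). The 70 MB/s per drive is an assumption for illustration, and real-world throughput will be lower than these peaks.

```python
# Theoretical peak bandwidth in MB/s for each slot type.
BUS_PEAK_MB_PER_S = {
    "PCI 32-bit/33 MHz": 133,
    "PCI-X 64-bit/133 MHz": 1066,
    "PCIe 1.x x1": 250,
    "PCIe 1.x x4": 1000,
    "PCIe 1.x x8": 2000,
}


def array_ceiling(disks: int, str_per_disk: float, bus: str) -> float:
    """Best-case sequential throughput: the smaller of what the
    drives can deliver together and what the bus can carry."""
    return min(disks * str_per_disk, BUS_PEAK_MB_PER_S[bus])


ASSUMED_STR_MB_PER_S = 70  # hypothetical per-drive sustained rate

for bus in BUS_PEAK_MB_PER_S:
    ceiling = array_ceiling(4, ASSUMED_STR_MB_PER_S, bus)
    print(f"4 drives on {bus}: at most {ceiling} MB/s")
```

With these numbers, four drives are already held back by a plain PCI slot or a single PCIe lane, while PCI-X and the wider PCIe slots leave headroom.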
One thing I learned over the years is that RAID is not an easy subject; what works really well for one person doesn’t help another that much. I would advise anyone who wants to spend money on it to do a lot of reading first, and there is plenty of information on the net about it.
And one last but important tip:
The fastest hard drive setup, RAID or non-RAID, won’t help much if you don’t have enough main memory on your mobo. This should be the first thing to upgrade; 2 GB is the minimum, I would say, for smooth running and some multitasking.
My 2 cents.