RAID Cards - Lies, Damned Lies and Benchmarks

This started off as a comparison between the IWill Side RAID66 and the hacked version of the Promise competition.

However, it quickly dawned on me that the real story was not what was being measured but the measurements themselves. I’ll compare the cards in Part II, but the big lesson to be learned is not which card is better, but rather the danger of using a benchmark or two to determine what you buy.

Imposing Simplicity When There Isn’t Any

We like to keep things as simple as possible. We’d much rather consider just one or two factors rather than ten or twenty.

Unfortunately, this desire often leads us to oversimplify: we crunch everything down to a number, then assume that single number tells us everything we need to know. Some things just can’t be crunched down to a single number and still retain much, or even any, validity for a specific situation. This is one of them.

For example, you’ve probably had an IQ test at one time or another. Imagine you take a test for 5 different IQs – for example, one for math ability, one for verbal etc. Now Steve Numbers does tremendously in math and terribly in the rest. Ann Average does OK in all of them. When you clump the numbers together and compare the two, Steve rules.

If the only thing you go by is that benchmark, are you going to be happy with Steve if you want to hire him as an editor? No matter how much better Steve’s overall score was than Ann’s?

That’s just what you’re doing if you judge based on a benchmark like High End Disk Winmark. If you look at High End Winmark numbers for Win2000, find them much higher than those for Win98, and decide that this is the OS for you to load up big audio files for sound editing, you just hired Steve, who can’t write a coherent sentence, as your Chief Editor.

That’s the major problem with benchmarks as they are often used. Using just one or two numbers to judge an entire situation carries with it the assumption that if one thing is generally better, it will be better at everything. Sometimes that is true. In my testing and review of other benchmarks, it almost never was.

Clumping everything together is like clumping together those IQ tests. Sure, you get a single average number, but does it really help you find your Chief Editor?

Unspoken Assumptions

A benchmarker often takes a range of activities, and places relative values on those activities. Those may well not be the values you would place on those activities, based on what you do. The guy making up that IQ benchmark might give math a weight of 10%, and everything else 90%. You may well give your favorite thing a weight of 90%, and everything else 10%. If that’s the case, it’ll almost be pure accident if that benchmark points you the right way.
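As a rough sketch of how much the weights matter, here is the Steve/Ann example with only two sub-tests. The scores and both sets of weights are made up for illustration; they do not come from any real test or benchmark.

```python
# Hypothetical sub-test scores, invented for this illustration only.
scores = {
    "Steve": {"math": 160, "verbal": 80},
    "Ann": {"math": 105, "verbal": 105},
}


def weighted_score(subscores, weights):
    """Collapse several sub-test scores into one number using the given weights."""
    total_weight = sum(weights.values())
    return sum(subscores[test] * w for test, w in weights.items()) / total_weight


def ranking(weights):
    """Names ordered from best to worst under a particular weighting."""
    return sorted(
        scores,
        key=lambda name: weighted_score(scores[name], weights),
        reverse=True,
    )


# Assumed weights: the benchmarker clumps everything together equally,
# while someone hiring an editor cares mostly about verbal ability.
benchmark_weights = {"math": 0.5, "verbal": 0.5}
editor_weights = {"math": 0.1, "verbal": 0.9}

print("Benchmark ranking:", ranking(benchmark_weights))
print("Editor ranking:   ", ranking(editor_weights))
```

Same people, same sub-scores; only the weights changed, and so did the winner.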

For instance, the IDE RAID cards certainly move large blocks of data faster than single drives. No doubt about that. However, you then have to answer two questions before you can judge how important that fact is to you:

  1. Just how often do you move large blocks of data around?
  2. If you do, how much improvement can you expect when you move your large blocks of data around in your specific activities?

Benchmarks can’t answer the first question, and as we’ll see, they don’t answer the second one too well, either. For one thing, in some tests I found out that size matters: just because the equipment can handle the little ones doesn’t mean it’s as good at handling the big ones. 🙂

Random And Not-So-Random Variations

If a comparison between two items is within a couple percentage points, odds are the difference falls within the margin of error. You can’t say one is better than the other. As you’ll see, much of the Promise/IWill testing is like that, a 1-2% difference. The results I got from each card varied more than that; the difference is meaningless.
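One way to make that concrete is to compare the gap between two cards with the spread seen on a single, unchanged setup. The sketch below uses Business Disk Winmark figures that appear in the tables of this article; treating the three single-drive readings as a measure of run-to-run variation is an assumption made for illustration, not a statistical test.

```python
def pct_gap(a, b):
    """Gap between two scores as a percentage of the lower one."""
    low, high = sorted((a, b))
    return (high - low) / low * 100


# Business Disk Winmark readings reported in this article for the same single
# drive under Win98 (4030 was taken after the Disk Transfer/Access tests).
single_drive_readings = [4140, 4130, 4030]
run_to_run = pct_gap(min(single_drive_readings), max(single_drive_readings))

# IWill vs. Promise Little Stripe, Business Disk Winmark, Win98.
card_gap = pct_gap(4930, 4950)

# If the cards differ by less than one setup differs from itself,
# the comparison does not tell you which card is faster.
meaningful = card_gap > run_to_run

print(f"Same-setup spread: {run_to_run:.1f}%")
print(f"Card-to-card gap:  {card_gap:.1f}%")
print("Meaningful difference?", meaningful)
```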

Even when a difference does exceed the margin of error, that doesn’t necessarily mean that one piece of equipment with better general numbers is better for you for what you do. I saw reviews using different, poorer drives than my IBM 22GXPs, but sometimes they’d do better in a particular test anyway. Sometimes I would do better than IBM 34GXPs. Compare my results to those from a different brand of hard drives in roughly the same class, and I would do significantly better (often 10% or more, occasionally 20% or even 30%) in some tests, and they’d do significantly better in others. I even saw a couple of benchmarks from users where I did better than a couple of SCSI RAID setups with 10K rpm hard drives. (Not usually, though. :))

Even if you are lucky enough to find a benchmark that tests your particular application and shows that one piece of equipment is better than another, don’t assume that will be the case for what you do with the application. As you’ll see, just because one card generally does better than another doesn’t mean it will always do better.

Tampering With The Evidence

I’m not talking about deliberate tampering. Rather, I’m talking about unknowingly doing something that serves to rev things up.

ZDNet makes a series of benchmarks, and I used a number of them in testing. I used Winbench to test disk performance, and Winstone and Content Creation to test how much faster hard drive speed affected a system benchmark.

I got some big differences in some of the testing, and it took me a while to figure out why. What I found was the Viagra for Winbench. Both Winstone and Content Creation pumped up Disk Winmarks a lot, and that effect only went away if I doused it with the cold water of some other Winbench tests. If I didn’t douse, the effects lingered.

Test (single IBM 22GXP drive) | Winbench before Winstone | Winbench after Winstone | Winbench after Winstone and Disk Transfer/Access
Business Disk Winmark | 4140 | 5430 | 4030
High End Disk Winmark | 13800 | 17000 | 13900
AVS/Express | 9100 | 16500 | 9000
FrontPage 98 | 88900 | 98700 | 84400
Microstation | 13600 | 22100 | 13200
Photoshop | 6920 | 7370 | 7290
Premiere | 13700 | 15500 | 13400
Sound Forge | 27900 | 27500 | 27800
Visual C++ | 17000 | 17100 | 17000

Let’s look at a few of these more carefully:

Benchmark | Pre-Winstone | Post-Winstone | Difference
Business Disk Winmark | 4140 | 5430 | 31.2%
High End Disk Winmark | 13800 | 17000 | 23.2%
AVS/Express | 9100 | 16500 | 81.3%
Microstation | 13600 | 22100 | 62.5%

Quite a difference after you run a benchmark that claims to “restore your system after testing,” wouldn’t you say?
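For anyone who wants to repeat this on their own drives, the Difference figures are just the percentage change from the pre-Winstone score to the post-Winstone score. A minimal sketch using the four single-drive results quoted above (the function and variable names are only illustrative):

```python
def pct_change(before, after):
    """Percentage change going from the 'before' score to the 'after' score."""
    return (after - before) / before * 100


# (pre-Winstone, post-Winstone) Winbench scores for the single 22GXP drive.
winstone_effect = {
    "Business Disk Winmark": (4140, 5430),
    "High End Disk Winmark": (13800, 17000),
    "AVS/Express": (9100, 16500),
    "Microstation": (13600, 22100),
}

for name, (pre, post) in winstone_effect.items():
    print(f"{name}: {pct_change(pre, post):.1f}%")
```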

If you keep running Winbench, the effect erodes, but even after running it four straight times, you still have some scores significantly higher than pre-Winstone. The AVS/Express score remained well over 30% higher.

Even worse, look at this:

Benchmark | Pre-Winstone | Post-Winstone | Score after five Winbench runs
Photoshop | 6920 | 7370 | 7290

The improvement on this test never went away, even after repeated reruns.

Nor is the Winstone Effect just a single-drive or Win98 phenomenon; a somewhat smaller effect occurs when using RAID or Windows 2000.

The point to all this is not to say that other reviewers regularly screw up (this did not seem to have happened in the other RAID reviews I’ve seen) or to automatically assume that an unusually high score is due to this. It’s to point out that benchmarking isn’t as bulletproof as the benchmarkers would have you believe.

For that matter, after some of the strange things I’ve witnessed, I wouldn’t be surprised if some of you tried the same thing (please do and send me the results) on different brands of hard drives and didn’t find that, or found something even worse. There were so many other tests where drives you would have thought were similar showed much different results.

When I compared my results with those from other drives that were fairly close to the 22GXPs in performance (going by the benchmarks in Storage Review), I sometimes got much different results in all kinds of tests. To my amazement, I even saw completely different results in certain synthetic tests for MY OWN DRIVES.

Granted, I can see some possible reasons for somebody’s newer 34GXPs having 30% slower access times than my 22GXPs, or why my throughput scores were much more stable than many other results I saw out there. But I can’t really explain why someone using the same drives and RAID controller (and a slower processor/bus speed) started off much faster than I did, or why my throughput looked almost flatline but some of theirs looked like one continuous heart seizure.

It wouldn’t be so bad if I were the only one with wacky numbers (maybe I did something wrong), but some others got results like mine, too.

Then you have changes I can’t think of any reason for.

For instance, on Saturday, after seeing some outrageously high Winbench scores (higher than I was getting with RAID!) in a review of ATA66 PCI controllers, I decided to try both cards in that mode to see if running in non-RAID ATA66 mode really made a difference. I ran a quick Winbench on both, saw little difference, and thought that was that.

I was doing some web browsing, concentrating on reading some RAID-related posts in a forum, when I saw there were beta drivers and BIOS for the Promise card. I decided to go the extra yard, and loaded both.

Ran a quick Business Winmark: 40% improvement, even more than the Winstone Viagra. I said, “Uhh?” then graduated to four-letter thoughts.:)

Seeing an additional day of benchmarking before my eyes, I immediately ran the Winmarks again with the cold water tests; the numbers jumped right back.

Now unless the Winstone Effect survives FDISK and FORMAT, it wasn’t to blame; none of those bad boys had ever even been on the hard drive after formatting. Then what? Damned if I know.

The Fog Of War

Benchmarking usually takes place under tightly controlled circumstances, in order to be able to exactly replicate conditions for comparison. This is all well and good, but do you restart after every action you take? Of course not. You do a variety of things, very untidy from a benchmarking perspective, but nonetheless typical. I wondered just how well these systems would work if you added a tiny little bit of chaos to the system, and the results are interesting.

Part II will illustrate all this by comparing the two cards, using both my benchmarks, and referring to numbers generated for different setups elsewhere in other reviews.

Email Ed


Background

  1. RAID has long been in use in SCSI systems to provide increased disk throughput and/or automated data backup. These controllers were (and are) intended for high-end workstations or servers to perform certain specific tasks. Both the RAID controllers and the SCSI drives fueling them were (and are) very expensive.
  2. Last year, Promise introduced the Fasttrak and later the Fasttrak66 IDE RAID card. While nowhere near as elaborate as the SCSI setups (a limited number of drives, no memory cache on the controller), they were also nowhere near as expensive, and do remarkably well against the SCSI setups from a cost/benefit perspective.
  3. When the Fasttrak66 card came out, people commented how similar it looked to the Promise Ultra66 card. Some enterprising soul discovered that they were essentially the same thing outside of a resistor and solder or two, which was easily remedied. Soon, many enterprising souls were buying $25 Ultra66 cards and turning them into functional equivalents of the now $85 Fasttrak 66s.
  4. A couple months ago, IWill came up with its version of a RAID card, and now it looks like everyone and his brother is coming up with one, too.
  5. RAID doesn’t help everything, or even most things, when it comes to performance. It won’t make you boot faster (matter of fact, you’ll boot much slower, figure at least another thirty seconds for either card), it won’t make most applications or games load any faster, either (though in the case of games, it may help load new levels quicker). For performance, RAID is really good with really big single files. Video and audio files are the kinds of files that RAID will help you with.

Also see my initial IWill RAID review HERE for more discussion on the types of RAID and how it works.

Comments on installation

I don’t want to reinvent the wheel; there are plenty of reviews out there that will tell you what comes in the box and give general installation instructions. All I have to say is that both cards can be a little quirky to install and upgrade.

The IWill has a BIOS upgrade that demands that you identify one of five flash BIOSes (like you memorize these things when you glance at the card). What was even more exciting for me was to find out that my particular BIOS didn’t exactly match any of the options. I went with the closest one, and went 50-50; it failed for version 1.04 (but still worked) and succeeded in version 1.05. No matter what the version, it really liked calling my first hard drive UDMA2 rather than UDMA4.

The Promise card is pretty finicky as to whether an array it creates is bootable or not. In theory, you’re supposed to hit the space bar to designate an array as bootable (that’s in the manual, not on the screen), but it seemed to decide that when it felt like it.

The Promise card really is a jealous mistress though; it really doesn’t want to see any other hard drives around it can’t control. If you don’t have a BIOS setting like “boot first from SCSI,” I’d think long and hard about trying one of the other boot options, but if you do, just remember to use it, because it won’t work if you don’t.

If you have another SCSI controller, save yourself time and make sure either the Promise or IWill controller is in a PCI slot ahead of the SCSI controller. They want to go first.

If you have the Promise card, read the next section on what you may need to do to properly format your striped hard drive.

Maybe this was just my system, but initially, installing both cards paralyzed my ability to boot off a floppy. The floppy would start to boot, go for about five seconds and just die. In both cases, eventually it stopped doing that, but don’t ask me why. If it happens to you, you didn’t kill your computer.

Comments on features

The Promise card gives you some flexibility in implementing a RAID solution. The IWill is more a “one size fits all” type of card that makes the decision for you.

The Promise card lets you choose your stripe size; the IWill card does not. For a full discussion, with graphs, of the effect of stripe size on RAID performance and what you might expect from bigger stripes and clusters, go HERE.

If you went over there and looked, you noticed that cluster size was also discussed. No, the Promise card does not adjust the size of clusters. That means you either need a copy of Partition Magic handy, or when you format, you’ll have to force the cluster size to the one you want. The DOS command for that is format x:/Z:n, where x is the drive letter. What are you supposed to put in “n”? The number of 512-byte blocks you want each cluster to take up.

In English:

  1. If you want 4K clusters, you type in: format x:/Z:8
  2. If you want 8K clusters, you type in: format x:/Z:16
  3. If you want 16K clusters, you type in: format x:/Z:32
  4. If you want 32K clusters, you type in: format x:/Z:64
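The list above follows one rule: divide the cluster size in bytes by 512. A small sketch of that arithmetic is below; the function names are hypothetical, and it only builds the command text shown in this article, it does not run anything.

```python
BLOCK_BYTES = 512  # the /Z: value counts 512-byte blocks per cluster


def z_value(cluster_kb):
    """Number of 512-byte blocks in a cluster of the given size in kilobytes."""
    return cluster_kb * 1024 // BLOCK_BYTES


def format_command(drive_letter, cluster_kb):
    """Build the format command text for the requested cluster size."""
    return f"format {drive_letter}:/Z:{z_value(cluster_kb)}"


for kb in (4, 8, 16, 32):
    print(f"{kb}K clusters -> {format_command('x', kb)}")
```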

Of course, the bigger the cluster, the more space it wastes. If you don’t know why, you don’t need to; all you need to know is that your files may suddenly take up 15-25% more room if you go to a performance-maximizing 64K stripe/32K cluster setup. Whether that’s worth it for the extra few percentage points of speed you get is up to you.
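For those who do want to know why: disk space is handed out in whole clusters, so every file gets rounded up to the next cluster boundary, and the unused tail is wasted. The sketch below shows the rounding with a made-up handful of file sizes; the list is purely hypothetical and is not meant to reproduce the 15-25% figure, which depends on your own mix of files.

```python
def allocated_bytes(file_size, cluster_size):
    """Space a file occupies on disk when allocated in whole clusters."""
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size


def slack_percent(file_sizes, cluster_size):
    """Extra space consumed by cluster rounding, as a percentage of the real data."""
    used = sum(file_sizes)
    allocated = sum(allocated_bytes(size, cluster_size) for size in file_sizes)
    return (allocated - used) / used * 100


# Hypothetical file sizes in bytes, invented for illustration.
sample_files = [1_000, 5_000, 20_000, 70_000, 300_000]

for cluster_kb in (4, 32):
    waste = slack_percent(sample_files, cluster_kb * 1024)
    print(f"{cluster_kb}K clusters: {waste:.1f}% extra space")
```

Small files are hit hardest, since a 1,000-byte file still takes a full cluster no matter how big the cluster is.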

The Promise card also lets you stripe more than one drive on each IDE channel; the IWill doesn’t. While I did not try to put more than two hard drives in a RAID configuration, those who did reported diminishing gains from it. Those of you who need to do such a thing should be buying the Promise card anyway, but this is a minority of a minority. Most people reading this will only use two drives, and for them, this particular feature is effectively useless.

The IWill does let you plug in an external IDE drive, but there’s not exactly a ton of those around.

I didn’t test the backup (RAID 1) abilities of these cards, but those who did found both reliable. However, the more I think about this, the less useful this seems to the average member of our audience. Please correct me if I’m wrong, but most desktop data disasters don’t occur because drives flat-out die; they occur because the hard drive gets trashed for some reason. If mirroring copies everything that happens, the mirrored disk will be just as trashed as the original.

How I tested

The cards were tested in a computer, the relevant components of which were:

  • PIII 550 overclocked to 733MHz (thanks to Neil at Proton Computers)
  • AOpen AX63Pro Via Apollo Pro 133 board
  • 256Mb of Crucial Micron PC133 memory
  • 2 13.5Gb IBM 22GXP hard drives.

Four configurations were extensively tested:

  • Single drives
  • “Modified” Promise Ultra66 card using an 8Kb stripe and 4K clusters (“Promise Little Stripe”) and BIOS/drivers version 1.14
  • The same using a 64Kb stripe and 32Kb clusters (“Promise Big Stripe”) and BIOS/drivers version 1.14.
  • IWill SideRAID66 card (asked IWill for stripe data, but they didn’t get back to me) and 4K clusters, BIOS/drivers version 1.05

Since the AX63Pro’s ATA66 support is dubious at best, I spot-checked the single drives using the Promise and IWill cards as ATA66 devices, and saw only a tiny (1-2%) improvement over the scores on the board’s own IDE channels.

All tests were run under both Win98SE and Win2000 using FAT32. NTFS was spot-checked, but did not make any appreciable difference.

In single drive configuration, a 3Gb partition was used for all Win98 testing on IDE 1; a 5Gb partition was used for Win2000 testing on IDE 2.

For RAID configurations, the 3Gb Win98 partition came first, followed by the 5Gb Win2000 partition.

Tests used:

  • Winbench 1.1: Business and High End Disk Winmark, Disk Access Time, Disk CPU Utilization, Disk Transfer Rate
  • Winstone 1.3: Business Winstone and (for Win2000) High-End Winstone
  • Ed’s Recently Invented Photoshop Disk Test (see below for details)
  • Ed’s Even More Recently Invented Sound Forge Disk Test (also, see below)

I ran these tests until I was satisfied I had gotten the full range of variation. Since Disk Access, CPU Utilization and Transfer Rate were very consistent, I usually only ran them twice.

The other tests I ran at least three times each, with additional testing to explore the Winstone Effect.

Winbench, Winstone, and Content Creation: Win98

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Business Disk Winmark | 4130 | 4930 | 4950 | 5590
High End Disk Winmark | 13800 | 17300 | 17300 | 19100
AVS/Express | 9100 | 10100 | 10400 | 11300
FrontPage 98 | 88700 | 95500 | 99200 | 92800
Microstation | 13600 | 16000 | 15800 | 19200
Photoshop | 6920 | 9570 | 9340 | 10200
Premiere | 13600 | 17300 | 15700 | 20900
Sound Forge | 28400 | 36500 | 38900 | 40700
Visual C++ | 16900 | 23300 | 22500 | 23700
Business Winstone | 30.0 | 30.4 | 30.3 | 30.7
Content Creation | 31.1 | 31.0 | 31.2 | 31.9

Some observations:

  1. Any form of RAID is an improvement over a single drive, at least for the benchmark. (We will see that’s not necessarily so when you actually try to do something.)
  2. Big stripes and clusters do usually increase performance, but not always. Nor is the level of improvement consistent; Premiere and Microstation got much more of a boost than other apps.
  3. The Promise Little Stripe might have been at a slight disadvantage against the IWill because I suspect that the IWill has a bigger stripe size. Nonetheless, though the IWill usually came in second, the differences were usually insignificant. We will further test performance in Photoshop and Sound Forge later on.
  4. Faster hard drive throughput barely affects general system benchmarks.

The other Winbench numbers? Big Stripe had the fastest access times (about 9.8ms) and lowest CPU utilization (about 3.5%). Little Stripe had access times of about 10.5ms and CPU utilization of almost 6%. The IWill trailed both with access times of about 11ms, and CPU utilization of about 9.5%. Disk transfer rates were all about the same, generally a little less than 36Mb/sec.

The Winstone Effect

Do we get it when we go to RAID?

IWill SideRAID66

Benchmark | Pre-Winstone | Post-Winstone | Difference
Business Disk Winmark | 4940 | 5670 | 14.8%
High End Disk Winmark | 17300 | 19700 | 13.9%
AVS/Express | 10300 | 14200 | 37.9%
Microstation | 16100 | 19200 | 19.3%

Promise Little Stripe

Benchmark | Pre-Winstone | Post-Winstone | Difference
Business Disk Winmark | 4940 | 6260 | 26.7%
High End Disk Winmark | 17100 | 22000 | 28.7%
AVS/Express | 10200 | 20700 | 102.9%
Microstation | 15600 | 24900 | 59.6%

Promise Big Stripe

Benchmark | Pre-Winstone | Post-Winstone | Difference
Business Disk Winmark | 5570 | 6880 | 23.5%
High End Disk Winmark | 19300 | 20900 | 8.3%
AVS/Express | 11600 | 17600 | 51.7%
Microstation | 19200 | 23500 | 22.4%

Obviously, Benchmark Viagra works here, too, though much more with Promise Little Stripe than with the IWill or Big Stripe.

Is every test affected as much? No. Sometimes, they even go down. The numbers for the first few High End tests tend to go up, and the numbers for the last few go down, but that’s not consistent. Nothing in this testing is broadly consistent.

Sound Forge, for instance, goes down a little bit on a single drive, drops 20% on Promise Little Stripe, but goes up a bit with IWill and Big Stripe.

Ed’s Homebrewed RAID benchmarks

I wasn’t too thrilled with the results I was getting from the commercial benchmarks after finding things like the Winstone Effect and comparing my results to others, so I decided that if RAID was supposed to be so good at moving big files around, I’d better do something that would measure big files being moved around.

Ed’s Photoshop Test

The test is pretty simple. Use PS5bench 1.11 (go HERE for more information) to generate a random 10Mb image, then repeatedly open and save it, using Photoshop’s timing mechanism to see how long it takes. Take that 10Mb image, increase the image size to 40Mb, 60Mb and finally 160Mb, and do the same thing. Finally, after you’re done with the 160Mb image, reopen and save those 10, 40, 60 and 160Mb files, and see if the open/save times are any different than when you were repeatedly opening and closing them.

I wasn’t concerned about anything other than opening and saving files because if you are seriously using Photoshop, you have enough memory to hold the image in memory without having to wait for virtual memory. In any case, the 160Mb file was meant to push the hard drive into using VM. Repeatedly opening and closing files (with one noted exception) did not make the process go faster; however, if you opened up a file after opening and closing others, the load time was significantly greater.

10 Mb file (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 0.4 | 0.4 | 0.4 | 0.3
Opens | 0.4 | 0.5 | 0.5 | 0.5
Saves | 0.3 | 0.3 | 0.3 | 0.3
Opens after full cycle | 1.2 | 0.8 | 0.9 | 0.7
Saves after full cycle | 0.3 | 0.3 | 0.3 | 0.3

No real difference here in repeated opens and closes; if anything, the RAID is a little slower in opening. To be fair, the Photoshop timing tool only measures tenths of a second, and when the action only takes a split second, a speed increase would only show up in the hundredths of a second. However, the RAID arrays do bring back a file that hasn’t been opened in a while a bit faster than a single drive does.

40 Mb file (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 2.0 | 1.3 | 1.4 | 1.4
Opens | 1.8 | 1.8 | 1.8 | 1.8
Saves | 1.6 | 1.3 | 1.3 | 1.3
Opens after full cycle | 4.4 | 3.3 | 3.3 | 3.1
Saves after full cycle | 2.0 | 1.5 | 1.5 | 1.4

The RAID cards start to show some improvement, with Big Stripe doing a little better than the others.

60 Mb file (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 10.0 | 7.1 | 19.1 | 7.2
Opens | 6.3/2.8 | 5.6/2.8 | 5.3/2.8 | 5.0/2.8
Saves | 5.2 | 2.5 | 2.5 | 2.6
Opens after full cycle | 6.7 | 6.5 | 6.1 | 5.6
Saves after full cycle | 5.3 | 2.5 | 2.9 | 2.7

The “Opens” row needs some explanation. For some unknown reason, at 60Mb, the initial open took considerably longer than subsequent opens, so both times are shown (initial/subsequent). This is the only time this took place; the Open numbers are extremely consistent across all situations at the other sizes.

Otherwise, much the same pattern as at 40Mb, except the IWill starts to beat the Promise card, even in Big Stripe mode.

160 Mb file (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 57.1 | 54.0 | 46.3 | 38.4
Opens | 27.5 | 19.3 | 20.3 | 18.4
Saves | 51.6 | 31.8 | 35.7 | 31.0
Opens after full cycle | 29.5 | 17.4 | 20.7 | 18.6
Saves after full cycle | 49.9 | 30.6 | 33.5 | 29.9

This is the one that separates the men from the boys. The IWill card does very well, even against Big Stripe, and usually beats Little Stripe.

In this case, the Winbench numbers in a sense are predictive, since the IWill card fared pretty well in the Photoshop portion of the test, and Little Stripe struggled.

But note a couple other things:

  • It takes some awfully big files for the RAID cards to show significant superiority. If you’re a professional graphics artist, you do use files this big. If you’re not, you probably don’t.
  • As the files get bigger, the loading times and save times increase dramatically. Instead of seeing 35Mb a second transfers, or even the 10Mb predicted by the benchmark, you see loads at more like 5Mb, and saves at more like 3Mb. Even considering swaps in and out of VM, that’s quite a slowdown. Still a lot better than a single drive, but no miracles.
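Those effective rates come from dividing file size by elapsed time. The sketch below does that for the “Opens” and “Saves” rows of the 160Mb table. It assumes the file on disk really is about 160Mb and that all of the measured time is disk transfer; neither is guaranteed, since Photoshop may write a different amount and virtual memory swapping is mixed in.

```python
FILE_MB = 160  # nominal image size; the actual size on disk is an assumption

# (open seconds, save seconds) from the 160Mb Win98 table.
timings = {
    "Single Drive": (27.5, 51.6),
    "IWill SideRAID": (19.3, 31.8),
    "Promise Little Stripe": (20.3, 35.7),
    "Promise Big Stripe": (18.4, 31.0),
}


def mb_per_sec(size_mb, seconds):
    """Effective transfer rate: how much data moved per second of waiting."""
    return size_mb / seconds


for config, (open_s, save_s) in timings.items():
    print(
        f"{config}: open {mb_per_sec(FILE_MB, open_s):.1f}Mb/sec, "
        f"save {mb_per_sec(FILE_MB, save_s):.1f}Mb/sec"
    )
```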

Ed’s Sound Forge Test

You probably don’t tinker around with 150Mb pictures too often. However, even if you just tinker with sound editing, you can’t help but deal with big files. So I decided after the IWill review that I needed either an audio or video benchmark.

I decided to use Sound Forge as my other real-life benchmark because:

  • It was part of the Winbench tests
  • I didn’t have videos around, but I did have CDs with music tracks I could use to test.
  • It was easy to find a demo of the program (Sound Forge 4.5 XP). 🙂

This benchmark has two parts. I took three album tracks from a CD and converted them into WAV files of about 50, 125, and 210 Mb respectively. The first part of the test simply opens the files.

Since the Sound Forge demo doesn’t allow you to save files, I had to come up with a rough equivalent that the demo would allow me to do. Fortunately, the demo has an Undo feature, which is enabled. So what I did was process a Reverb cathedral effect for the longest track with Undo enabled, then did an Undo and timed how long that took to process.

I tried opening files repeatedly and found that Sound Forge did something that the other benchmarks and my Photoshop homebrew did not. Either Sound Forge doesn’t really get rid of files, or it retains some great pointers, because reopening a file usually took only a small fraction of the time it took to initially open, as little as 10%. Load something else in the meantime, though, and that goes away. It’s not something I’m going to include in these benchmarks, but if you ever hear of complete songs loading into a sound program in a couple seconds, that’s probably what’s happening.

Test (all times in seconds) | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
50Mb Open | 14.108 | 5.512 | 5.329 | 6.277
125Mb Open | 35.635 | 14.674 | 22.275 | 19.325
210Mb Open | 59.981 | 26.094 | 26.996 | 25.327
210Mb Undo | 124.6 | 64.1 | 71.8 | 64.78

RAID does very well here, though Winbench predicts comparative performance badly. The IWill card trails in Winbench but wins overall here. No 35 or 40Mb transfer speeds here, though. Not even transfers three times faster than in Photoshop, like the benchmark would have you believe.

Tomorrow, I’ll do the same thing over again for Win2000. Some more surprises.


Might as well get right to it:

Winbench: Win2000

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Business Disk Winmark | 6050 | 7830 | 8030 | 8380
High End Disk Winmark | 13800 | 19000 | 19100 | 19000
AVS/Express | 18300 | 19700 | 21600 | 19900
FrontPage 98 | 117000 | 134000 | 135000 | 113000
Microstation | 24200 | 30600 | 32200 | 32300
Photoshop | 6470 | 9390 | 9090 | 8900
Premiere | 11000 | 18400 | 17300 | 17900
Sound Forge | 11700 | 16200 | 16300 | 16800
Visual C++ | 14100 | 18000 | 18700 | 19300

Yes, the IWill card did much better here than it did in Win98; and the Promise Big Stripe did much worse. There was a lot more variation across the board than I found in Win98, also, but what you need to get from this, if you don’t know this already, is how these scores compare to what they were for Win98. Let’s take the IWill numbers in Win98 and compare them (the same trend holds true for the others) to its numbers for Win2000:

Benchmark | Win98 score | Win2000 score | Win2000 score compared to Win98 score
Business Disk Winmark | 4930 | 7830 | 59% Better
High End Disk Winmark | 17300 | 19000 | 10% Better
AVS/Express | 10100 | 19700 | 95% Better
FrontPage 98 | 95500 | 134000 | 40% Better
Microstation | 16000 | 30600 | 91% Better
Photoshop | 9570 | 9390 | 2% Worse
Premiere | 17300 | 18400 | 6% Better
Sound Forge | 36500 | 16200 | 56% Worse
Visual C++ | 23300 | 18000 | 23% Worse

Not too good if you’re using Sound Forge in Win2000. Do you see how relying on a grouping of benchmarks isn’t such a good idea if you primarily use just one of them?
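To see how a grouping can hide this, the sketch below averages the Win98-to-Win2000 change across the seven application tests for the IWill card. The plain arithmetic mean is an assumption chosen for illustration; it is not how Winbench actually combines its component tests into a Winmark.

```python
from statistics import mean

# IWill scores as (Win98, Win2000) from the comparison table.
iwill_scores = {
    "AVS/Express": (10100, 19700),
    "FrontPage 98": (95500, 134000),
    "Microstation": (16000, 30600),
    "Photoshop": (9570, 9390),
    "Premiere": (17300, 18400),
    "Sound Forge": (36500, 16200),
    "Visual C++": (23300, 18000),
}


def pct_change(old, new):
    """Percentage change from the old score to the new one."""
    return (new - old) / old * 100


changes = {name: pct_change(w98, w2k) for name, (w98, w2k) in iwill_scores.items()}
average_change = mean(changes.values())
worst_test = min(changes, key=changes.get)

print(f"Average change across tests: {average_change:+.0f}%")
print(f"Worst single test: {worst_test} ({changes[worst_test]:+.0f}%)")
```

The grouped number says Win2000 is comfortably ahead, while the one test a sound editor cares about falls by more than half.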

My little Homebrewed Sound Forge benchmark confirms that things do indeed move a lot more slowly in Win2000 than they do in Win98. However, Winbench indicates things are about the same in Photoshop, and my little Photoshop test doesn’t quite agree.

More on this later.

Winstone Effect Here, Too?

Yes, but not as much as with Win98. Since I’m beating you to death with charts, let’s just say that the usual suspects are still most susceptible, but the effect is 15-30% at worst, and usually about 5-10%.

Winstone and Content Creation Scores:

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Content Creation 2000 | 39.2 | 38.0 | 38.0 | 38.2
Business Winstone 99 | 42.4 | 43.5 | 43.4 | 43.6
High End 99 | 47.1 | 47.5 | 47.4 | 47.4
AVS/Express | 4.73 | 4.12 | 4.15 | 4.15
FrontPage 98 | 4.20 | 4.15 | 4.16 | 4.16
Microstation | 5.36 | 5.55 | 5.53 | 5.53
Photoshop | 7.12 | 6.89 | 6.87 | 6.96
Premiere | 3.96 | 4.01 | 4.01 | 4.03
Sound Forge | 3.66 | 4.25 | 4.13 | 4.13
Visual C++ | 5.41 | 5.50 | 5.55 | 5.54

Hmmmm. Content Creation scores higher without RAID? That seems odd, but all the other tests tell us not to expect more than a couple of percentage points improvement at best from RAID.

But look at the scores; only two get affected a lot. AVS/Express goes down a lot when RAID is around, Sound Forge goes up a lot. Both are programs that use very big files (AVS Video, SF audio). Yet they went in different directions.

You may say, “but Ed, the Winstone and Content Creation scores are much higher than in Win98.” This is true. From the High-End Winstone, though, you can see a few sets of numbers jump up more than the others, and more importantly, jump down too. At least we get a breakdown for High End; we don’t for the others.

Ed’s Photoshop Test

Same test as in Win98, but much different results.

10 Mb File (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 0.9 | 0.4 | 0.4 | 0.4
Opens | 0.5 | 0.4 | 0.4 | 0.4
Saves | 1.0 | 0.5 | 0.4 | 0.4
Opens after full cycle | 1.1 | 0.7 | 0.7 | 0.7
Saves after full cycle | 0.9 | 0.4 | 0.4 | 0.4

40 Mb File (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 3.4 | 1.8 | 1.8 | 1.8
Opens | 2.0 | 1.8 | 1.7 | 1.7
Saves | 3.5 | 1.8 | 1.9 | 1.8
Opens after full cycle | 3.8 | 2.7 | 2.7 | 2.4
Saves after full cycle | 4.1 | 1.8 | 1.8 | 1.9

60 Mb File (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 23.1 | 11.0 | 13.7 | 11.3
Opens | 3.1 | 3.0 | 2.6 | 2.5
Saves | 5.4 | 3.1 | 2.8 | 2.7
Opens after full cycle | 4.8 | 3.9 | 3.9 | 3.6
Saves after full cycle | 5.4 | 2.7 | 2.8 | 2.7

160 Mb File (all times in seconds)

Test | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
Initial Save | 87.3 | 44.1 | 49.5 | 45.0
Opens | 46.9 | 24.6 | 28.3 | 28.2
Saves | 95.8 | 46.8 | 55.9 | 49.7
Opens after full cycle | 52.7 | 27.4 | 30.6 | 30.9
Saves after full cycle | 93.1 | 45.9 | 56.0 | 47.3

Obviously, any version of RAID improves matters dramatically. The Winbench Photoshop numbers said that the IWill card was best, but that didn’t really prove to be the case until we got to 160Mb. But let’s compare the times a single drive took to accomplish these tasks in Win98 as opposed to Win2000:

Single Drive–Win98/2000 Comparison

10 Mb File

Test | Time in Win98 | Time in Win2000 | Difference
Initial Save | 0.4 | 0.9 | 125% More Time
Opens | 0.4 | 0.5 | 25% More Time
Saves | 0.3 | 1.0 | 233% More Time
Opens after full cycle | 1.2 | 1.1 | 8% Less Time
Saves after full cycle | 0.3 | 0.9 | 200% More Time

40 Mb File

Test | Time in Win98 | Time in Win2000 | Difference
Initial Save | 2.0 | 3.4 | 70% More Time
Opens | 1.8 | 2.0 | 11% More Time
Saves | 1.6 | 3.5 | 118% More Time
Opens after full cycle | 4.4 | 3.8 | 14% Less Time
Saves after full cycle | 2.0 | 4.1 | 105% More Time

60 Mb File

Test | Time in Win98 | Time in Win2000 | Difference
Initial Save | 10.0 | 23.1 | 131% More Time
Opens | 6.3/2.8 | 3.1 | 50% Less/10% More Time***
Saves | 5.2 | 5.4 | 4% More Time
Opens after full cycle | 6.7 | 4.8 | 28% Less Time
Saves after full cycle | 5.3 | 5.4 | 2% More Time

***See Part II for explanation of split times for Win98.

160 Mb File

Test | Time in Win98 | Time in Win2000 | Difference
Initial Save | 57.1 | 87.3 | 53% More Time
Opens | 27.5 | 46.9 | 70% More Time
Saves | 51.6 | 95.8 | 86% More Time
Opens after full cycle | 29.5 | 52.7 | 79% More Time
Saves after full cycle | 49.9 | 93.1 | 86% More Time

All RAID essentially does is get you back roughly to where you would have been with a single drive in Win98.
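A quick way to check that statement is to divide each RAID time under Win2000 by the single-drive time under Win98. The sketch below does so for the “Opens” and “Saves” rows of the two 160Mb tables; a ratio near 1.0 means RAID only got you back to where the single drive already was. The names are illustrative, and only these two rows are compared.

```python
# Single drive under Win98, 160Mb file: seconds for "Opens" and "Saves".
win98_single = {"Opens": 27.5, "Saves": 51.6}

# RAID configurations under Win2000, same file and same rows.
win2000_raid = {
    "IWill SideRAID": {"Opens": 24.6, "Saves": 46.8},
    "Promise Little Stripe": {"Opens": 28.3, "Saves": 55.9},
    "Promise Big Stripe": {"Opens": 28.2, "Saves": 49.7},
}


def time_ratio(raid_seconds, single_seconds):
    """Win2000 RAID time relative to Win98 single-drive time (1.0 = no gain)."""
    return raid_seconds / single_seconds


ratios = {
    (config, test): time_ratio(seconds, win98_single[test])
    for config, times in win2000_raid.items()
    for test, seconds in times.items()
}

for (config, test), ratio in ratios.items():
    print(f"{config}, {test}: {ratio:.2f}x the Win98 single-drive time")
```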

Ed’s Sound Forge Test

Test (all times in seconds) | Single Drive | IWill SideRAID | Promise Little Stripe | Promise Big Stripe
50Mb Open | 26.5 | 19.5 | 22.6 | 13.9
125Mb Open | 69.9 | 49.2 | 57.4 | 33.3
210Mb Open | 116.4 | 81.9 | 95.3 | 55.6
210Mb Undo | 194.4 | 125.4 | 148.3 | 145.5

Promise Big Stripe usually rules, but IWill pretty easily disposes of Little Stripe for second and wins the Undo. Again, though, this is really an OS story.

Single Drive–Win98/2000 Comparison

Test | Time in Win98 | Time in Win2000 | Difference
50Mb Open | 14.1 | 26.5 | 88% More Time
125Mb Open | 35.6 | 69.9 | 96% More Time
210Mb Open | 60.0 | 116.4 | 94% More Time
210Mb Undo | 124.6 | 194.4 | 56% More Time

Tomorrow, I’ll take a look at some of the other benchmarking done on these products, and see how differences in hard drives change the results.
