# More On Memory

With new platforms come at least the possibility of new rules. What was true before isn’t necessarily true now.

A couple days ago, we suggested one area where the old rules might not apply anymore: memory speed.

## Bandwidth Bottleneck

With single-channel memory on earlier PIV boards, the maximum FSB bandwidth of such a system at a given MHz was always going to be double that of the maximum memory bandwidth. That’s why there was never any question of a bandwidth bottleneck on such systems, and why you could run RAM faster than the FSB on such boards and have it do you some good.

All that changes with dual-channel.

Going to dual channel instantly doubles the maximum memory bandwidth and makes it equal to that of the FSB at a given MHz. Since memory has to share that bandwidth with other things, a dual-channel system becomes prone to bottlenecks (albeit at a much higher level than a single-channel solution).

I’ve used the example of cars on a highway before in comparing the PIV to the Athlon bus. Memory bandwidth in a single-channel PIV system is like trying to get four cars driving abreast on an eight-lane highway. Even if there’s another car around (other FSB activity), there’s plenty of room for it to ride abreast too, and even enough room to get five memory cars running abreast with no problem.

When you have dual-channel, though, memory bandwidth is like trying to get eight cars driving abreast on an eight-lane highway. If there are any other cars wanting their own lanes, too, at least one of the memory cars is going to have to trail a bit.

Essentially, this is the same problem Athlons face: an FSB bandwidth bottleneck. The answer to the bottleneck for dual channel PIV systems is the same as for the Athlon: Increase the FSB bandwidth. Broaden the highway.

The Canterwood/Springdale mobos really let you broaden the highway by allowing you to run the FSB at a speed above that of memory. The 5:4 and 3:2 ratios essentially widen the FSB highway to ten and twelve lanes. This is a good thing, not a bad one. 1:1 or 1:1.25 was a good thing for single-channel, but 1:1 is not good for dual-channel from a bandwidth perspective, period.
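The arithmetic behind the highway metaphor can be sketched out. This is a back-of-the-envelope sketch assuming a quad-pumped 64-bit FSB and double-pumped 64-bit DDR channels; the function names are my own, not anything from a real tool.

```python
# Back-of-the-envelope peak bandwidth figures, in MB/s.
# Assumptions: quad-pumped 64-bit (8-byte) FSB, double-pumped
# 64-bit DDR memory channels.

def fsb_bandwidth(fsb_mhz):
    """Peak FSB bandwidth in MB/s (4 transfers/clock x 8 bytes)."""
    return fsb_mhz * 4 * 8

def memory_bandwidth(mem_mhz, channels):
    """Peak DDR bandwidth in MB/s (2 transfers/clock x 8 bytes x channels)."""
    return mem_mhz * 2 * 8 * channels

# Single channel at 200MHz, 1:1 -- the FSB has double the headroom:
print(fsb_bandwidth(200), memory_bandwidth(200, channels=1))  # 6400 3200

# Dual channel at 1:1 -- memory alone can saturate the FSB:
print(fsb_bandwidth(200), memory_bandwidth(200, channels=2))  # 6400 6400

# A 5:4 ratio (250MHz FSB, 200MHz memory) restores some headroom:
print(fsb_bandwidth(250), memory_bandwidth(200, channels=2))  # 8000 6400
```

At 3:2 the gap widens further still, which is the "twelve-lane highway" of the metaphor.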

## What Does That Mean For An Overclocker?

What that means is that high FSB is a contributor to performance because it relieves a bottleneck. For a given memory and CPU speed, you should be a little better off getting that memory speed from a 5:4 ratio than a 1:1 ratio.

That’s the hypothesis. It needs to be proven or disproven, and here’s where I need a little help.

I’ve already found a good test to show the effect of memory speed and FSB: SuperPi. It’s easy to run, relatively memory-sensitive, and not affected by differing video cards or hard drives. You can also run it long enough that even minor differences in performance can be picked up.

I have a 2.4C. Since Intel CPUs are multiplier-locked, I can’t do the necessary tests with it alone; I would need a 3.0C, too.
I don’t have one, but I bet a few of you do.

So if. . . .

• You have a 3.0C CPU
• A Canterwood mobo (preferably an IC7/IC7-G, but any Canterwood should do) and
• 512MB of memory capable of 2-6-2-2 timings at 200MHz and 220MHz (preferably Corsair, but any high-end memory will do)

. . . I would really appreciate it if you could do the following:

1. Download and install version 1.1 of SuperPi if you haven’t already done so. You can get it here.
2. Set your machine first to default speed; 1:1 FSB:memory ratio, memory settings 2-6-2-2.
3. Set SuperPi to run to 32M places. It should take roughly 40 minutes.
4. Take a screen shot of the end result and send it to me, along with details of your mobo and memory. Also tell me whether or not I should post your name and/or email to give you credit for your work.
5. If your RAM can run at 220MHz at 2-6-2-2, overclock your 3.0C to 3.3 by setting your FSB to 220MHz. Again, 1:1 FSB:memory ratio, memory settings 2-6-2-2.
6. Repeat steps 3 and 4.
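As a sanity check on the clocks involved, here's the arithmetic behind step 5 (assuming the 3.0C's locked 15x multiplier):

```python
# The 3.0C runs a locked 15x multiplier, so CPU speed follows the FSB.
MULTIPLIER = 15

for fsb_mhz in (200, 220):
    cpu_ghz = MULTIPLIER * fsb_mhz / 1000
    print(f"{fsb_mhz}MHz FSB -> {cpu_ghz}GHz")  # 3.0GHz, then 3.3GHz
```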

If my hypothesis is correct, then when I do the same thing with the 2.4C running at 250MHz and 275MHz FSB respectively (the same 3.0GHz and 3.3GHz CPU speeds, but with a 5:4 ratio yielding the same 200MHz and 220MHz memory speeds), the 2.4C scores should be a little bit better. Well, a tiny bit is probably more like it.

Why just tiny? Well, that’s because . . . .

## Memory Doesn’t Matter Much

The biggest myth in this community is that memory matters much. Some people think it matters as much as or more than CPU speed.

It doesn’t. It never has and it never will. That’s an urban legend.

This doesn’t mean it doesn’t matter at all; it’s just that increases in memory speed do little for overall performance compared to increases in CPU speed.

As two very rough general rules of thumb, a 10% increase in CPU speed will get you about a 6% improvement in overall system performance. A 10% improvement in memory speed will get you about a 1% increase in overall system performance.
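Expressed as a toy model (the 0.6 and 0.1 scaling factors are just the rough rules of thumb above, assumed to be linear for small changes):

```python
# Rough rules of thumb from above, treated as linear scaling factors.
CPU_SCALING = 0.6   # ~6% system gain per 10% CPU speed increase
MEM_SCALING = 0.1   # ~1% system gain per 10% memory speed increase

def system_gain(cpu_pct=0.0, mem_pct=0.0):
    """Estimated overall system performance gain, in percent."""
    return cpu_pct * CPU_SCALING + mem_pct * MEM_SCALING

print(system_gain(cpu_pct=10))  # 6.0
print(system_gain(mem_pct=10))  # 1.0
```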

We’ll use SuperPi to demonstrate this.

First, here’s the results of running SuperPi to 32M places using a [email protected], 275MHz FSB, 183MHz memory speed, memory settings 2-5-3-2:

Time to run all the loops (the rest is just writing out a file with pi down to the 32 millionth digit): 37 minutes, 13 seconds, or 2233 seconds.

Next, here’s the results of running SuperPi to 32M places using a [email protected], 275MHz FSB, 220MHz memory speed, memory settings 2-5-3-2. Everything is absolutely the same except the memory speed, which is twenty percent faster.

What do you get for your 20%?

2168 seconds, a time saving of 65 seconds. Divide that 65 seconds by the base time of 2233 seconds, and you get a 2.9% performance increase.

This actually is an unlikely worst-case scenario. If you could do 275/220MHz at 2-5-3-2, you’d have no reason to run 275/183 at the same timings. You would only lower the memory speed to get faster timings, and this is what you get with a [email protected], 275MHz FSB, 183MHz memory speed, memory settings 2-6-2-2.

2196 seconds. The time savings from 220MHz? 28 seconds. Divide that by the new base of 2196, and you get a 1.3% performance difference.
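Recomputing those percentages from the raw times (the helper function is mine, not part of SuperPi):

```python
def gain(base_s, new_s):
    """Performance increase as a percentage of the base time."""
    return (base_s - new_s) / base_s * 100

# 183MHz vs 220MHz memory, both at 2-5-3-2 (the worst-case comparison):
print(round(gain(2233, 2168), 1))  # 2.9

# 183MHz at its tighter 2-6-2-2 timings vs 220MHz (the realistic one):
print(round(gain(2196, 2168), 1))  # 1.3
```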

Yes, 2.9% or 1.3% is an improvement. If you can run 220MHz just as easily, reliably and cheaply as you can 183MHz, you should run at the higher speed.

However, that’s not going to be the situation for most overclockers, especially since they’ll be looking at higher speeds than that.

The real life choice for people is “Should I risk frying my RAM or buy new RAM to get the memory speed up a little higher?” You ought to know beforehand what kind of performance increase is at stake here and decide for yourself whether or not it’s worth it.

## It’s Not Just RAM

Focusing on just memory scores is like playing Unreal Tournament and paying attention to only one enemy character in the game. How smart is that?

These are not memory systems. These are computer systems, with a whole number of different parts interacting with each other.

You should not let one component or one approach decide your end results. You should try multiple approaches, and see which works best for you.

In my particular case, I couldn’t get the machine to run stably at more than 280MHz using a 5:4 ratio. So I went to 3:2, and found I could max out at 295MHz.

Let’s see how well the machine fares at SuperPi at 295MHz with that piss-slow 196MHz memory:

Hmmm. 2043 seconds. That’s a saving of 125 seconds over the best 220MHz memory result. I increased the FSB (and CPU) speed a little over 7%, slowed down the RAM about 11%, and got a performance increase (125/2168) of almost 6%.

Just to be absolutely fair, I then ran SuperPi again using the exact same 2-5-3-2 timings I used at 220MHz, even though you wouldn’t in practice. Here’s the results:

2072 seconds. That’s a savings of 96 seconds, or a performance increase of 4.4% for my 7% FSB/CPU speed increase.
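The FSB-side numbers work out the same way (same hypothetical helper as before):

```python
def gain(base_s, new_s):
    """Performance increase as a percentage of the base time."""
    return (base_s - new_s) / base_s * 100

# 295/196 vs 275/220, each at its best timings:
print(round(gain(2168, 2043), 1))  # 5.8

# Same comparison with timings held at 2-5-3-2 on both:
print(round(gain(2168, 2072), 1))  # 4.4
```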

Let’s summarize. For my machine and SuperPi:

A 20% increase in memory speed gets us at most a 2.9% system performance increase. . .

. . . while . . .

only a 7% increase in FSB/CPU speed gets us at least a 4.4% system performance increase.

Do you see why maximizing the CPU speed should take priority in your own testing over memory speed?

## Unreal Tournament 2003

We ran the seven benchmarks provided with the Unreal Tournament 2003 Demo, and these are the raw scores from the benchmark.txt file. Not a million pretty graphs, but it should actually be easier for you to see the general pattern this way:

Order of Scores

Botmatch-Antalus
Botmatch-Anubis
Botmatch-Asbestos
Flyby-Antalus
Flyby-Asbestos

295/196/2-6-2-2
Score = 87.226723
Score = 111.723717
Score = 97.156708
Score = 69.712524
Score = 234.789459
Score = 329.395813
Score = 157.424255

275/220/2-5-3-2
Score = 81.309341
Score = 104.289055
Score = 91.122650
Score = 65.478745
Score = 220.874084
Score = 310.620514
Score = 148.306107

275/183/2-6-2-2
Score = 80.993004
Score = 103.163139
Score = 90.252373
Score = 64.459167
Score = 219.929459
Score = 307.762207
Score = 146.988373

275/183/2-5-3-2
Score = 79.080299
Score = 101.002571
Score = 88.453804
Score = 63.443298
Score = 217.785995
Score = 302.105011
Score = 145.217682

You see the same pattern we saw for SuperPi. While a faster memory speed at a certain FSB helps a bit (and less if you can get faster timings from the slower speed), a boost in FSB helps a lot more.
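To put numbers on that pattern, here are the relative gains for the first benchmark (Botmatch-Antalus) of each block, measured against the slowest configuration:

```python
# First (Botmatch-Antalus) score from each configuration above.
scores = {
    "295/196/2-6-2-2": 87.226723,
    "275/220/2-5-3-2": 81.309341,
    "275/183/2-6-2-2": 80.993004,
    "275/183/2-5-3-2": 79.080299,
}

base = scores["275/183/2-5-3-2"]
for config, fps in scores.items():
    # Gain relative to the slowest configuration, in percent.
    print(f"{config}: {(fps - base) / base * 100:+.1f}%")
```

The FSB bump to 295MHz is worth roughly +10% here; memory speed and timings each contribute only a couple of percent.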

## Winrar

Winrar is an application sensitive to memory, as the Opteron benchmarking a while back demonstrated.

I took the Windows XP files and compressed them into a RAR file. These were the time results (these are averages; the Winrar results were a bit erratic):

295/196/2-6-2-2: 5:04 (as in five minutes)
275/220/2-5-3-2: 5:17
275/183/2-6-2-2: 5:41
275/183/2-5-3-2: 5:51

While memory speed helped Winrar relatively more than in the other benchmarks, it still wasn’t enough to overcome a 20MHz FSB/240MHz CPU speed increase.
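Converting those times to seconds makes the gaps easier to compare (slowdowns measured against the 295/196 run):

```python
# WinRAR times from above, converted to seconds for comparison.
def to_seconds(mmss):
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

times = {
    "295/196/2-6-2-2": "5:04",
    "275/220/2-5-3-2": "5:17",
    "275/183/2-6-2-2": "5:41",
    "275/183/2-5-3-2": "5:51",
}

base = to_seconds(times["295/196/2-6-2-2"])
for config, t in times.items():
    secs = to_seconds(t)
    # Positive percentages are slower than the fastest run.
    print(f"{config}: {secs}s ({(secs - base) / base * 100:+.1f}% vs fastest)")
```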

## 3DMark 2001SE

Since we’re keeping track of memory, not which company has forgotten more about honesty, we will not use 3DMark2003 for this (and probably never will for anything).

Here are a few pretty pictures:

295/196/2-6-2-2:

275/220/2-5-3-2:

275/183/2-6-2-2:

275/183/2-5-3-2:

Again, the same general pattern.

## There Is No One Rule

Does that mean people should just be concerned about FSB/CPU speed and nothing else? No, that would be just as stupid as being concerned only about memory speed.

For me, going to 3:2 makes sense because it lets me increase the FSB substantially. Then again, my CPU is water-cooled. I have a somewhat better Northbridge fan than the standard one (and will get a better one still). My RAM isn’t exactly pick of the litter, either.

You may well have an air-cooled CPU and find that you can’t increase the FSB very much going to a 3:2 ratio. It’s possible your Northbridge may be the bottleneck. Your RAM may do tremendously at tight settings at speeds higher than mine.

Should some or all of these apply to you, it could very well be that going to 3:2 may not be better for you. If you can only increase your FSB 3MHz by going to 3:2, it almost certainly won’t be; you’ll probably only see a net gain if you can increase it by more than 6-7MHz.

The point is not to slavishly follow one rule or the other. It is to try both.

In the past, there was basically only one way to do this. The Canterwoods and Springdales now give you options you’ve never had before. That means new rules and more of them.