
Memory bandwidth tests... any real differences (PC4300 vs. PC7100)


graysky

Member
Joined
May 6, 2007
Memory bandwidth tests... any real differences (PC5300 vs. PC8888)

Common sense tells you that higher memory bandwidth should mean faster results, right? I set out to put this thought to the test by looking at just two different memory dividers on my o/c'ed Q6600 system. At a FSB of 333 MHz, the slowest and fastest dividers I could run are:

1:1 a.k.a. PC5300 (667 MHz)
3:5 a.k.a. PC8888 (1,111 MHz)

cpuz2ux1.gif


Just for reference, as they relate to DDR2 memory:
Code:
PC4300=533 MHz
PC5300=667 MHz
PC6400=800 MHz
PC7100=900 MHz
PC8000=1,000 MHz
PC8500=1,066 MHz
PC8888=1,111 MHz
PC10600=1,333 MHz

The highest divider is 1:2 aka PC10600 (1,333 MHz) and it just wasn't stable with my hardware @ 333 MHz.

All other BIOS settings were held constant:
FSB = 333.34 MHz and multiplier = 9.0 which gives an overall core rate of 3.0 GHz.
DRAM voltage was 2.25V and timings were 5-5-5-15-4-30-10-10-10-11.
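
A minimal sketch of the divider arithmetic behind the labels above, assuming the usual DDR2 conventions: two transfers per DRAM clock and a 64-bit (8-byte) channel, so the PC rating is roughly the data rate times 8. The function names are made up for illustration.

```python
FSB_MHZ = 333.34      # FSB clock from the BIOS settings above
CPU_MULTIPLIER = 9.0  # CPU multiplier from the BIOS settings above


def ddr2_rate(fsb_mhz, divider):
    """Return (DRAM clock in MHz, effective DDR2 rate in MT/s).

    divider is an (fsb_part, dram_part) tuple, e.g. (3, 5) for 3:5.
    """
    fsb_part, dram_part = divider
    dram_clock = fsb_mhz * dram_part / fsb_part
    return dram_clock, 2 * dram_clock  # DDR: two transfers per clock


def peak_bandwidth_mb_s(rate_mts):
    """Peak bandwidth of one 64-bit (8-byte) channel in MB/s."""
    return rate_mts * 8


print(f"CPU core: {FSB_MHZ * CPU_MULTIPLIER:.0f} MHz")
for divider in [(1, 1), (3, 5), (1, 2)]:
    clock, rate = ddr2_rate(FSB_MHZ, divider)
    print(f"{divider[0]}:{divider[1]} -> DRAM clock {clock:.0f} MHz, "
          f"DDR2-{rate:.0f}, ~{peak_bandwidth_mb_s(rate):.0f} MB/s peak")
```

The peak figure is a theoretical ceiling per channel, not something any of the tests below measure.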

You can think of memory bandwidth as the diameter (size) of your memory's pipe. Quite often, the pipe's diameter isn't the bottleneck for a modern Intel-based system; it is usually much larger than the information flow to/from the processor. Think of it this way: if you can only flush your toilet twice per minute, it doesn't matter if the drain pipe connecting your home to the sewer is 3 inches around, or 8 inches around, or 18 inches around. The rate-limiting step in removing water from your home is the toilet flushing/recycling and the pull of gravity, not the size of your drain line. The same is true for memory bandwidth.

After seeing the data I generated on a quad core @ 3.0 GHz, I concluded that this toilet analogy is pretty true: the higher memory bandwidth gave more or less no appreciable difference for real world applications. Shocked? I was.

Further, I should point out that in order for my system to run stable in PC8888 mode @ a FSB of 333, I had to boost my NB vcore two notches and raise my ICH to the max (both of which the BIOS colored red meaning "high risk.") The increased voltage means more heat production, and greater power consumption -- not worth it for small gains realized in my opinion. Anyway, the test details and results are below if you want to read on.

memak1.jpg


Relevant test hardware:

Motherboard: Asus P5B-Deluxe (BIOS 1215)
CPU: Intel C2Q - Q6600 (B3 revision)
Memory: Ballistix DDR2-1066 (PC2-8500)

"Real-World" Application Based Tests

I chose the following apps: lameenc, x264, winrar, and the trial version of Photoshop CS3. I ran these tests on a freshly installed Windows XP Pro SP2 machine.

Lame version 3.97 – Encoded the same test file (about 60 MB wav) with these commandline options:
Code:
lame -V 2 --vbr-new test.wav
(which is equivalent to the old --alt-preset fast standard) a total of 8 times and averaged the play/CPU data as the benchmark.

x264 version 0.55.663 – Ran a 2-pass encode on the same MPEG-2 (720x480 DVD source) file 5 times in total and averaged the results. Without getting into too much detail, the benchmark is 1,749 frames @ 23 fps. Based on these numbers, I reported the time it would take to encode 215,784 frames (which is your average 2.5 h of video @ 23 fps). Why did I do this? The differences over just 1,749 frames were too small to be meaningful.
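
The scaling described above can be sketched as follows. One assumption here: the "23 fps" is taken to be the NTSC film rate of 24000/1001 (about 23.976 fps), since that is the rate at which 2.5 h works out to exactly 215,784 frames. The sketch runs the scaling backwards, from the full-length times reported in the raw data (01:48:54 and 01:46:14) to the implied time on the 1,749-frame clip.

```python
FPS = 24000 / 1001           # assumption: NTSC film rate, ~23.976 fps
CLIP_FRAMES = 1749           # length of the benchmark clip
MOVIE_SECONDS = 2.5 * 3600   # 2.5 h of video

# Frame count of the full-length video at the assumed frame rate.
MOVIE_FRAMES = round(MOVIE_SECONDS * FPS)


def clip_seconds(full_length_seconds):
    """Scale a full-length encode time back down to the benchmark clip."""
    return full_length_seconds * CLIP_FRAMES / MOVIE_FRAMES


pc5300 = clip_seconds(1 * 3600 + 48 * 60 + 54)  # 01:48:54
pc8888 = clip_seconds(1 * 3600 + 46 * 60 + 14)  # 01:46:14
print(f"frames in 2.5 h: {MOVIE_FRAMES}")
print(f"implied clip time: {pc5300:.2f} s vs {pc8888:.2f} s "
      f"(gap {pc5300 - pc8888:.2f} s)")
```

On the clip itself the two dividers are only about a second apart, which is why the numbers were scaled up to a full-length encode.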

Shameless promotion --> you can read more about the x264 Benchmark at this URL which contains results for hundreds of systems. You can also download the benchmark and test your own machine.

RAR version 3.62 – rar.exe ran my standard backup batch file, which generated about 1.09 GB of rars (1,654 files in total). Here is the commandline used:
Code:
rar a -u -m0 -md2048 -v51200 -rv5 -msjpg;mp3;tif;avi;zip;rar;gpg;jpg  "E:\Backups\Backup.rar" @list.txt
where list.txt is a list of all the dirs I want it to back up. Benchmark results are an average of two runs timed with a stopwatch.

Trial of Photoshop CS3 – The batch function in PSCS3 was used to do three things to a total of twenty-nine 10.1 MP JPEG files:

1) bicubic resize 10.1 MP to 2.2 MP (3872x2592 --> 1800x1200) which is the perfect size for a 4x6 print @ 300 dpi.
2) unsharp mask filter (60 %, 0.8 px radius, threshold 12)
3) saved the resulting files as a quality 8 jpg.

Benchmark results are an average of two runs timed with a stopwatch.

"Synthetic" Application Based Tests

Just two of these were chosen to illustrate a point about theoretical gains vs. real world gains. Actually, I did SuperPI for the hell of it. WinRAR served to illustrate that point.

SuperPI / mod1.5 XS – The 16M test was run twice, and the average of the two is the benchmark.

WinRAR version 3.62 – If you hit alt-B in WinRAR, it'll run a synthetic benchmark. This was run twice (stopped after 100 MB) and is the average of two runs.

Raw Data - "Real-World" Apps
Lameenc play/cpu (average 8 runs) @ PC5300: 30.7935
Lameenc play/cpu (average 8 runs) @ PC8888: 30.8045
Result: PC8888 is 0.04 % faster

x264 time to encode 2.5 h DVD @ PC5300: 01:48:54
x264 time to encode 2.5 h DVD @ PC8888: 01:46:14
Result: PC8888 is 2.5 % faster

rar.exe back-up (average 2 runs) @ PC5300: 45 sec
rar.exe back-up (average 2 runs) @ PC8888: 44 sec
Result: PC8888 is 2.2 % faster

Photoshop CS3 Trial batch (average 2 runs) @ PC5300: 33 sec
Photoshop CS3 Trial batch (average 2 runs) @ PC8888: 33 sec
Result: PC8888 is 0.0 % faster

So stop right here and ask yourself if a 2-3 % gain is worth the higher voltage and heat.
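
For anyone who wants to recompute the percentages from the raw figures above, here is a small sketch. It assumes "faster" means the ratio of speeds: old time divided by new time for the timed tests, and new score divided by old score for Lame's play/CPU figure, where higher is better. With only two stopwatch-timed runs per setting, differences of a second or so should be read loosely.

```python
def hms_to_seconds(text):
    """Convert an 'HH:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in text.split(":"))
    return h * 3600 + m * 60 + s


def pct_faster_time(slow_s, fast_s):
    """Percent speed-up for a timed test (lower time is better)."""
    return (slow_s / fast_s - 1) * 100


def pct_faster_rate(low, high):
    """Percent speed-up for a rate-style score (higher is better)."""
    return (high / low - 1) * 100


results = {
    "Lame play/CPU": pct_faster_rate(30.7935, 30.8045),
    "x264 2.5 h encode": pct_faster_time(hms_to_seconds("01:48:54"),
                                         hms_to_seconds("01:46:14")),
    "rar.exe backup": pct_faster_time(45, 44),
    "Photoshop CS3 batch": pct_faster_time(33, 33),
}

for name, pct in results.items():
    print(f"{name}: PC8888 is {pct:.2f} % faster")
```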

Raw Data - "Synthetic" Apps

SuperPI/16M test (average 2 runs) @ PC5300: 8 m 8.546 s
SuperPI/16M test (average 2 runs) @ PC8888: 7 m 33.328 s
Result: PC8888 is 7.8 % faster

Winrar internal benchmark (average 2 runs) @ PC5300: 1,515 KB/s
Winrar internal benchmark (average 2 runs) @ PC8888: 2,079 KB/s
Result: PC8888 is 37.2 % faster

...but who uses their system exclusively for running internal and synthetic benchmarks? Recall that for my 1.09 GB backup, I only gained about 2 % doing "real work" by using the higher divider. Hard drives are notorious bottlenecks that serve to nullify any memory bandwidth increases. In this case the 37 % theoretical increase translated into only a 2 % "real world" increase, likely because of how fast the hard drive and rar can read/write the data. Again, this seems kinda wasteful to me.
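
One rough way to put a number on that gap is Amdahl's law. This is a model brought in for illustration, not something measured in these tests: assume a fraction p of the backup run scales with memory speed like the WinRAR internal benchmark does, and the rest (disk I/O and so on) does not scale at all. Given the two observed speed-ups, p can be solved for.

```python
def amdahl_speedup(p, s):
    """Overall speed-up when a fraction p of the run is sped up by factor s."""
    return 1.0 / ((1.0 - p) + p / s)


def bound_fraction(overall, s):
    """Invert Amdahl's law: fraction p that explains an overall speed-up."""
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / s)


synthetic = 2079 / 1515   # WinRAR internal benchmark, KB/s ratio
real_world = 45 / 44      # rar.exe backup, stopwatch seconds ratio

p = bound_fraction(real_world, synthetic)
print(f"synthetic speed-up:  {synthetic:.3f}x")
print(f"real-world speed-up: {real_world:.3f}x")
print(f"implied memory-bound share of the backup: {p:.1%}")
```

Under this model, only about 8 % of the backup's run time would be sensitive to memory speed. The stopwatch timings have one-second resolution, so treat that as an order-of-magnitude estimate at best.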

I will admit that there might be special cases where running at high memory dividers may produce more substantial gains: apps such as folding@home or seti@home, etc. may benefit from the higher memory bandwidth since they tend to make exclusive use of the system memory bandwidth and rely much less on the hard drive. I have no data to back this up, though. Also lacking in my experiments are any game data. I'd be interested in knowing if the higher bandwidth can be leveraged by game engines such as UT3, Crysis, etc. but I also didn't look at these here.

Finally, since I held everything else constant, I didn't look at the tighter timings in 1:1 mode that people can often use which may give additional gains. For example, I can get away with 3-3-3-9 @ 1:1 vs. the slower 5-5-5-15 @ 3:5 with this memory.

Anyway, I hope you found this useful and maybe this will inspire someone else to look at the gaps pointed out above (and the gaps I haven't thought of too!)
 
Thanks for sharing this excellent finding; I really appreciate the long hours of testing! :thup:
 
1:1 a.k.a. PC4300 (667 MHz)
3:5 a.k.a. PC7100 (1,111 MHz)

The highest divider is 1:2 aka PC8500 (1,333 MHz) and it just wasn't stable with my hardware @ 333 MHz.

All other BIOS settings were held constant:
FSB = 333.34 MHz and multiplier = 9.0 which gives an overall core rate of 3.0 GHz.
DRAM voltage was 2.25V and timings were 5-5-5-15-4-30-10-10-10-11.


Question: did you use the same timings for PC4300 and PC7100?
 
1:1 a.k.a. PC4300 (667 MHz)
3:5 a.k.a. PC7100 (1,111 MHz)


I think pc6400 with timings 4-4-4 should do better

Something to correct: because your CPU is overclocked, the memory speed changes as well.

Pc4300=533
pc5300=667
pc6400=800
Pc7100=900
Pc8000=1000
pc8500=1066
Pc8888=1111
Pc10600=1333
 
Odd... my BIOS reported it as PC4300 (667 MHz) and PC7100 (1,111 MHz).
 
When you finish your testing and analysis, you should consider cleaning it up and submitting it to the front page. At that point, you might also consider PM'ing deeppow for advice on analysis, etc.; he's a senior member with a very good track record for memory testing and main site publishing. -- Paul
 
Awesome amount of testing there! :beer: Thanks for posting the results.

Results are similar to what I've seen, see Figure 2 in link. Similar in that performance improvements aren't big.

What was the range on the two SuperPi results within your tests?

EDIT: By the way, the blue text (Results) on the black background is very difficult for me to see. Probably just me!
 
What do you mean by the range? Like how much did they differ? Also, good suggestion... blue now = yellow!
 
What do you mean by the range? Like how much did they differ? ..

Yes, how much they differ.

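With more than 2 samples, I would ask for the standard deviation. However, when you have only two samples, the standard deviation is just a fixed multiple of the range (range/√2 for the sample standard deviation), so the range tells you the same thing. :beer: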
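
A quick check of the two-sample case using Python's statistics module. Strictly, the standard deviation of two samples is a fixed multiple of the range rather than equal to it: range/√2 with the n-1 (sample) formula and range/2 with the population formula. The two run times below are hypothetical placeholders, not measured values from this thread.

```python
import math
import statistics


def spread(a, b):
    """Return (range, sample std dev, population std dev) of two values."""
    values = [a, b]
    return abs(a - b), statistics.stdev(values), statistics.pstdev(values)


# Hypothetical pair of run times in seconds, for illustration only.
rng, sample_sd, pop_sd = spread(487.9, 489.2)

print(f"range:             {rng:.3f} s")
print(f"sample std dev:    {sample_sd:.3f} s (range / sqrt(2) = {rng / math.sqrt(2):.3f})")
print(f"population std dev: {pop_sd:.3f} s (range / 2 = {rng / 2:.3f})")
```

Either way, with two runs the range carries all the spread information there is.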
 
I know you mentioned using tighter timings, but it would be interesting to see results where each FSB:DRAM ratio was run w/ optimal timings.

667 3-3-3-9 vs. 1111 5-5-5-15

If 667 wins hands down in all tests (or at least ties), it would help confirm my results and help people understand that the MHz of their RAM is only part of the equation. Timings are equally important, although neither makes a huge difference in real-world use.

I feel DDR2-800 4-4-4-12 and DDR2-1066 5-5-5-15 are going to give you very similar results, assuming the RAM is the same model, like Ballistix. Good D9 DDR2-800 will take your FSB to 500MHz and beyond. Nobody really needs more than that except for guys using exotic cooling solutions. For the average OCer, D9 DDR2-800 is the smart buy. Someone had some nice RAM w/ different ICs on here they were recommending...Patriots maybe w/ 4-4-3-5 timings? Basically just make sure to get some OCForum approved DDR2-800 w/ CAS4, and you'll be golden.
 
Are both PC5300 and PC8888 running 5-5-5-15-2T? If so, then the results are hardly comparable due to different latencies.
CL5 @ 333MHz is 15ns but only 9ns @ 555MHz, that's 67% slower CAS for PC5300...
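
The latency arithmetic can be sketched like this, assuming CAS latency in nanoseconds is simply the number of CL cycles divided by the DRAM clock (the real clock, not the doubled DDR rate). The clocks come from a 333.34 MHz FSB at 1:1 and 3:5; the function name is made up for illustration.

```python
def cas_latency_ns(cl_cycles, dram_clock_mhz):
    """CAS latency in nanoseconds: cycles divided by clock frequency."""
    return cl_cycles * 1000.0 / dram_clock_mhz


FSB_MHZ = 333.34
CLOCK_1_1 = FSB_MHZ            # 1:1 divider -> ~333 MHz DRAM clock
CLOCK_3_5 = FSB_MHZ * 5 / 3    # 3:5 divider -> ~556 MHz DRAM clock

cl5_slow = cas_latency_ns(5, CLOCK_1_1)
cl5_fast = cas_latency_ns(5, CLOCK_3_5)
cl3_slow = cas_latency_ns(3, CLOCK_1_1)

print(f"CL5 @ 1:1: {cl5_slow:.1f} ns")
print(f"CL5 @ 3:5: {cl5_fast:.1f} ns")
print(f"CL3 @ 1:1: {cl3_slow:.1f} ns")
print(f"CL5 @ 1:1 is {(cl5_slow / cl5_fast - 1) * 100:.0f} % longer than CL5 @ 3:5")
```

Note that CL3 at 1:1 works out to the same absolute CAS latency as CL5 at 3:5, which is the point of comparing each divider at its best timings.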
 
Just because the memory is running at a faster speed doesn't mean the speed can be utilized.

And w/ CL3 @ 333 vs CL5 @ 555...The 555 may be switching its transistors faster, but the system has to wait more ticks of the clock before it can get the information it has requested due to the higher timings. It's like racing a race car vs. a street car. If you give the street car a big enough head-start it will win.
 