
Memory bandwidth tests... any real differences (PC4300 vs. PC7100)

And w/ CL3 @ 333 vs CL5 @ 555... The 555 may be switching its transistors faster, but the system has to wait more ticks of the clock before it can get the information it has requested, due to the higher timings.
No, that's not how latencies work.
Clock cycle length @ 555MHz is 1.80 nanoseconds, therefore a 5 clock cycle period of latency is 9.00ns (5*1.8ns). But at 333MHz the clock cycle is of course longer (3.00ns). This means a 3T latency period @ 333MHz takes 9.00ns (3*3.00ns), the same as 5T @ 555MHz.

Nothing else matters in memory latency other than the absolute latency - time, that is, not clock cycles.
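To make that arithmetic explicit, here's a quick Python sketch using just the numbers from this thread:

[CODE]
# Absolute CAS latency in nanoseconds: what matters is time, not clock ticks.
def cas_latency_ns(clock_mhz, cas_cycles):
    clock_period_ns = 1000.0 / clock_mhz   # one clock tick in nanoseconds
    return cas_cycles * clock_period_ns

print(cas_latency_ns(555, 5))   # ~9.0 ns
print(cas_latency_ns(333, 3))   # ~9.0 ns -- same absolute latency
[/CODE]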
 
Well, the most obvious point that nobody seems to have pinned down is this:

"Memory bandwidth" is indeed the combination of speed, latency, interleaving, multi-channel operation and bits per transfer. We have some very smart people above telling you about why bandwidth is being hurt on the lower-speed memory due to non-matching amounts of latency; that's good info.

But "useable memory bandwidth" is all of the above, combined with the bus interconnecting the memory and the CPU. If you have 10GB/sec of memory bandwidth, but the bus connecting is only 8.5GB/sec, then you effectively have 8.5GB/sec of usable memory bandwidth.

In your example, you're over-driving the memory bus by way of a greater-than-one multiplier. If you had instead overclocked the entire FSB as well as the memory bus, you would have seen different results.

Of course, keeping the CPU at the same speed becomes another issue :)
 
Fair may not be the right term, but relevant sure is.

Keep your FSB:MEM ratio set to 1:1, and now jack with the frequencies. So for example:

276 x 11 = 3.036 GHz
337 x 9 = 3.033 GHz
433 x 7 = 3.031 GHz

Stuff like that :) I know they're not PERFECTLY the same speed, but a difference of +/- 30MHz should not drastically affect benchmark scores.
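If you want to sanity-check those combos, a two-line Python snippet does it (only the numbers listed above):

[CODE]
# CPU clock = FSB (MHz) x multiplier; these combos hold it ~constant while FSB varies at 1:1
for fsb, multi in [(276, 11), (337, 9), (433, 7)]:
    print(f"{fsb} x {multi} = {fsb * multi / 1000:.3f} GHz")
# 276 x 11 = 3.036 GHz, 337 x 9 = 3.033 GHz, 433 x 7 = 3.031 GHz
[/CODE]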
 
Folks, see post #12 above. There are several testing sequences within that link including, I believe, the ones you suggest.

What graysky has done is test a particular combo using a broad array of applications -- a very time consuming effort for some of the apps. While the individual results for the apps do vary, it is interesting to see the trend is typically the same in the case tested. Thanks graysky! :beer:

If you want to see a particular sequence tested then please get out your benchmarks, roll up your sleeves and show us the results. It's always interesting to see how different hardware affects the results. :cool:
 
I don't see anything in that review regarding the (very important) link between FSB and memory speed.

I see FSBs changing. I see memory speeds changing. I don't see whether or not the FSB and MEMORY speeds were changed in unison, nor the ratio in which they were changed...

I'm not saying that you didn't measure it, but your posted review didn't cover it. So, do you have more information?
 
I don't see anything in that review regarding the (very important) link between FSB and memory speed.
There is no link between FSB and memory speed. This is about the difference between low- and high-bandwidth memory and its effect on the system's performance.

I see FSBs changing. I see memory speeds changing. I don't see whether or not the FSB and MEMORY speeds were changed in unison, nor the ratio in which they were changed...

The FSB was the same; the only thing that changed was the divider. So the difference is between the lowest and highest divider on that system.
 
There is no link between FSB and memory speed. This is about the difference between low- and high-bandwidth memory and its effect on the system's performance.
*sigh*

Please direct your attention to my first post in this thread. Yes, there IS a link between FSB and memory speed, and the link is of the utmost importance.


The FSB was the same; the only thing that changed was the divider. So the difference is between the lowest and highest divider on that system.
If this is the case, then the review was utterly worthless, for the exact reason I posted above.
 
Ok, it seems some people are still confused by what I'm trying to convey... In the interest of open communication, I'm going to re-post my PM to deeppow (with some typographical corrections, hehe):

--------------------------------
Here lies the problem:

The memory interface on an Intel system is driven from the northbridge, not the CPU. I definitely know you're aware of this, as you've been doing this bit for a while now ;)

Ok, so now you have two busses: A bus between the CPU and northbridge, and a bus between MEM and northbridge. We commonly refer to the bus between CPU and northbridge as the front-side bus -- but you of course know this too :)

Now, obvious statement #1: if the CPU->NB bus is running at speed (X), and the MEM->NB bus is also at the same speed (X), then your CPU (and all the processes therein) can use the full memory bandwidth available -- minus any minor losses / overhead involved in the workings of the northbridge.

Here's obvious statement #2: if the CPU->NB bus is running at speed (X), and the MEM->NB bus is running at the lesser speed (X - 1), then again your CPU can use the full memory bandwidth available. Of course, minus the slight overhead from the NB...

Here's the statement I feel you're missing: if the CPU->NB bus is running at speed (X), and the MEM->NB bus is running at a greater speed (X + 1), now your CPU cannot use the full memory bandwidth available. The pipe between NB and CPU is simply not the same size, and as such the data cannot be transmitted / received at the same rate.

That third scenario is what happens when you use a memory ratio that is greater than one (3:4, 3:5, 4:5, 5:6). You are running the bus between NB->MEM faster than the CPU can interact, essentially giving you zero benefit. Of course, it's not PURELY zero benefit, as other things interact directly with the NB that need access -- think DMA transfers.

But DMA transfers are a very small slice of the pie, and the improvement by doing so is incredibly small -- which explains the findings in the article you wrote.

If CPU->NB bus is not at least as fast as the MEM->NB bus, then for the most part you're simply wasting power and increasing heat for no obvious gain.
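To put some rough numbers on that third scenario, here's a Python sketch. It assumes a quad-pumped 64-bit FSB and dual-channel double-data-rate memory, and ignores chipset overhead, so treat the figures as illustrative rather than exact:

[CODE]
# Peak-bandwidth sketch: raising the memory divider above 1:1 raises raw
# memory bandwidth, but the usable figure stays capped by the CPU->NB bus.
def fsb_bandwidth_gbs(fsb_mhz):
    return fsb_mhz * 4 * 8 / 1000          # quad-pumped, 8 bytes wide

def mem_bandwidth_gbs(mem_base_mhz):
    return mem_base_mhz * 2 * 16 / 1000    # DDR, dual channel (2 x 8 bytes)

fsb = 300
for ratio in (1.0, 1.25, 1.5):             # 1:1, 4:5 and 2:3 dividers
    mem = fsb * ratio
    usable = min(fsb_bandwidth_gbs(fsb), mem_bandwidth_gbs(mem))
    print(f"mem base {mem:.0f} MHz: raw {mem_bandwidth_gbs(mem):.1f} GB/s, "
          f"usable {usable:.1f} GB/s (FSB tops out at {fsb_bandwidth_gbs(fsb):.1f})")
[/CODE]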
 
Ok, it seems some people are still confused by what I'm trying to convey... [full post quoted above]

Very informative, cleared up a few things for me, thanks :D
 
No, that's not how latencies work.
Clock cycle length @ 555MHz is 1.80 nanoseconds, therefore a 5 clock cycle period of latency is 9.00ns (5*1.8ns). But at 333MHz the clock cycle is of course longer (3.00ns). This means a 3T latency period @ 333MHz takes 9.00ns (3*3.00ns), the same as 5T @ 555MHz.

Nothing else matters in memory latency other than the absolute latency - time, that is, not clock cycles.

That was the point I was trying to make. I guess I didn't word it correctly. One has to wait more clock cycles, but the clock is ticking faster, so it balances out.

Putting it into absolute times makes it easier to understand, though. Kudos!

:beer:
 
Ok, it seems some people are still confused .....

Thank you very much for your clarification for all! Greatly appreciate it. :beer:

Yes, I understand (and understood) your points. And in the spirit of clarification, let me add some discussion. Here is Figure 2 from the link so all can see.

[Figure 2 from the linked article -- image002.jpg]

If I look at the CPU speeds (all run with a multiplier of 8), the FSB speeds for the three curves are
2400 -> 300,
3000 -> 375, and
3500 -> 438.

Now if the FSB speed has an effect on performance as it relates to memory speed, then I would expect the characteristics of the curves (such as slope) to be different. While the curves do show some differences, which I would characterize as second order, the slopes are nearly the same as memory speed increases. Thus the FSB from CPU->NB, as reflected in the parametric cases shown, does not make a significant difference over the memory range tested. I'm assuming an FSB range of 300 to 438 is sufficient to test your point.
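For anyone following along, those FSB figures are just the CPU clock divided by the fixed x8 multiplier:

[CODE]
# FSB (MHz) = CPU clock (MHz) / multiplier (all three curves used x8)
for cpu_mhz in (2400, 3000, 3500):
    print(f"{cpu_mhz} / 8 = {cpu_mhz / 8:.0f} MHz FSB")   # 300, 375, 438
[/CODE]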
 
Well, it entirely depends on the dataset being memory-bandwidth bound, doesn't it?

If your 1-million digits of Pi can be computed within the confines of the L2 cache, then external memory bandwidth is not going to affect it much (if at all). And in fact, your graphs almost entirely prove that point -- memory bandwidth has little to do with 1M Pi calc.

Now, calculating Pi to 32M digits isn't going to fit in L2, and would be very dependent on memory bandwidth. That would be a test far more apt to be impacted by memory bandwidth performance.

Further, the 2.4GHz test at 300 FSB is what you'd expect from a "zero impact" memory bandwidth increase -- meaning, all of your memory tests at the 300 FSB / 2.4GHz speed started at 1:1 (600MHz effective, dual-channel DDR), so any memory speed over that initial 600MHz was effectively "worthless" outside of the minor enhancement to DMA transfers and northbridge losses.

Seeing that ALL the data follows the 2.4GHz line means that none of them were bandwidth limited.
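A quick way to reason about whether a benchmark can even stress memory bandwidth is to compare its working set to the L2 cache. The sizes below are made-up placeholders for illustration, not measured SuperPi numbers:

[CODE]
# If the working set fits in L2, the benchmark barely touches main memory,
# so its score tells you little about memory bandwidth.
def fits_in_l2(working_set_bytes, l2_bytes):
    return working_set_bytes <= l2_bytes

L2_SIZE   = 4 * 1024 * 1024        # e.g. a 4 MB L2 (illustrative)
SMALL_RUN = 512 * 1024             # hypothetical working set of a short Pi run
LARGE_RUN = 128 * 1024 * 1024      # hypothetical working set of a 32M-digit run

print(fits_in_l2(SMALL_RUN, L2_SIZE))   # True  -> mostly cache-bound
print(fits_in_l2(LARGE_RUN, L2_SIZE))   # False -> memory bandwidth matters
[/CODE]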
 
I'm bumping this thread from the grave, because I think people still forget that memory bandwidth beyond 1:1 really is essentially useless.

And attempting to benchmark anything to do with memory bandwidth by using Pi 1M test that fits entirely within L2 cache of the processor is similarly incorrect.
 
Great info here; it's bookmarked.

I was just popping in some 5-5-5-12 DDR2-800 (as an upgrade from my 4-4-4-12 DDR2-667), which would be running at a divider of 800 / (266 x 2) = 1.5, i.e. 1:1.5.

Since I haven't OC'ed the system yet, I understand that this will give me literally no performance gain, unless I run 1:1 on my DDR2-800 and tighten the timings.
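For anyone wanting to repeat that divider arithmetic, a tiny Python sketch (266 MHz FSB and DDR2-800, as above):

[CODE]
# Memory divider = DDR data rate / (FSB base clock x 2 for double data rate)
fsb_mhz  = 266
ddr2_mts = 800                            # DDR2-800 = 800 MT/s
divider  = ddr2_mts / (fsb_mhz * 2)       # ~1.5, i.e. a 1:1.5 setting
one_to_one = fsb_mhz * 2                  # at 1:1 the RAM runs at ~DDR2-533 (532 MT/s here)
print(round(divider, 2), one_to_one)      # 1.5 532
[/CODE]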
 
I might not be able to. I have no idea about the performance of these Crucial RAM sticks: 2x CM2X1024-6400. But in any case, I'll leave it at 1:1 and save some power and heat :sn:
 
I'm bumping this thread from the grave, because I think people still forget that memory bandwidth beyond 1:1 really is essentially useless.

And attempting to benchmark anything to do with memory bandwidth by using Pi 1M test that fits entirely within L2 cache of the processor is similarly incorrect.

Resurrection!!!!

What about a high FSB AND a 1:1 ratio? That's what I'm facing here with the E8400.

I'm running 8x514, but 9x457 is just as stable. However, I get much higher bandwidth readings with 8x514 (even if I use a 5:6 divider w/ 457 FSB!).
 
Resurrection!!!!

What about a high FSB AND a 1:1 ratio? That's what I'm facing here with the E8400.

I'm running 8x514, but 9x457 is just as stable. However, I get much higher bandwidth readings with 8x514 (even if I use a 5:6 divider w/ 457 FSB!).

Go w/ whatever gives you the best numbers IMO.
 