Check this out...

Rumrunner · Feb 2, 2005

Post your ram bandwidth benches here!!!

I decided to do a little experiment on how bandwidth is derived! As most of us know, the AMD 64 seems to be more responsive to the CPU multiplier than our past CPUs. I strongly believe that this is due to the HTT.

My simple experiment shows the difference between FSB-based MHz and CPU MULTIPLIER-based MHz.

The constant is the system in my signature, benchmarked with Sandra lite 2005.

The settings chosen are 2600 MHz {260 X 10} and 2610 MHz {290 X 9}.

I think you may be a little surprised by the results!

2600 mhz {260 X 10} http://img.photobucket.com/albums/v636/rrrr/260.png

2610 mhz {290 X 9} http://img.photobucket.com/albums/v636/rrrr/290.jpg

Across all tests, the 290 X 9 setup performed only 1.27% better than the 260 X 10 setup.

As for RAM bandwidth alone, the 290 X 9 {2610 MHz} test scored 5.28% higher than the 260 X 10 {2600 MHz} test.

I love my RAM, but I am not so sure that the memory bandwidth increase on this CPU @ 290 FSB {2.5-3-3-5} will be able to outperform {250 X 10} at timings of {2-2-2-5}, i.e. BH-5 or that OCZ VX stuff.
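One thing that can be worked out without a benchmark is the CAS latency of those two settings in nanoseconds rather than clock cycles. A minimal sketch, assuming the memory runs 1:1 with the 290 and 250 MHz clocks quoted above (the divider is not stated in the post); `cas_latency_ns` is just an illustrative helper name:

```python
def cas_latency_ns(cas_cycles: float, mem_clock_mhz: float) -> float:
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    # One cycle at f MHz lasts 1000 / f nanoseconds.
    return cas_cycles * 1000.0 / mem_clock_mhz


# Assumption: memory clock equals the reference clock (1:1 divider).
loose = cas_latency_ns(2.5, 290.0)  # 290 MHz at 2.5-3-3-5
tight = cas_latency_ns(2.0, 250.0)  # 250 MHz at 2-2-2-5

print(f"290 MHz, CL2.5: {loose:.2f} ns")
print(f"250 MHz, CL2:   {tight:.2f} ns")
```

Under that assumption the 2-2-2-5 setup has the lower absolute CAS latency (about 8.0 ns against about 8.6 ns). That says nothing by itself about bandwidth, which is why benchmarks are still needed.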

I would like to see someone with the OCZ VX or BH-5 run some Sandra lite benchmarks and contribute them! Uueeegh! :attn:

TimoneX · Feb 2, 2005

Interesting test. Why 290*9 and not 289*9, though? Sandra is probably not the best way to gauge the effects of latency. Everest, Aida32, SuperPi, or Sciencemark would likely have been preferable. I'd think for the majority of tasks [email protected] 1T would be > DDR500@2-2-2-5 1T, but obviously there'd be exceptions. Anyway, interesting results. Thanks for sharing your findings.

Rumrunner · Feb 2, 2005

I could have done 289 x 9 too, but I just figured that for only a .384 % increase in clock, I could have a nice round number. Also, I guess I didn't mention: I did not change the latencies at all, actually. They were a constant 2.5-3-3-5. All I changed was how the MHz was derived.
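For reference, the core clocks in play are just reference clock times multiplier. A quick sketch (the helper names are made up for illustration) that works out each setting and how far it sits from the 260 X 10 baseline:

```python
def cpu_mhz(htt_mhz: int, multiplier: int) -> int:
    """Core clock in MHz from reference clock and CPU multiplier."""
    return htt_mhz * multiplier


def pct_diff(new: float, base: float) -> float:
    """Percentage difference of `new` relative to `base`."""
    return (new - base) / base * 100.0


baseline = cpu_mhz(260, 10)  # 2600 MHz

for htt, mult in [(290, 9), (289, 9)]:
    clock = cpu_mhz(htt, mult)
    print(f"{htt} x {mult} = {clock} MHz "
          f"({pct_diff(clock, baseline):+.3f}% vs {baseline} MHz)")
```

So 290 x 9 lands roughly 0.38% above 2600 MHz, while 289 x 9 is within about 0.04% of it.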

TimoneX said:
Everest, Aida32, SuperPi, or Sciencemark would likely have been preferable

Care to explain?

Rob

TimoneX · Feb 2, 2005

Surely.

I do not believe that Sandra is a particularly accurate test when it comes to measuring performance variations with different memory latencies. The unbuffered results can be somewhat useful, but in general Sandra seems better equipped to illustrate differences in memory speed than in memory timings. Just my opinion, of course.

Rumrunner · Feb 2, 2005

Ok, so I went ahead and did a couple of benchmarks in Sciencemark...

The tests here show that 289 X 9 {2601 MHz} yields 3.41% more bandwidth than 260 X 10 {2600 MHz}.

260 X 10: http://img.photobucket.com/albums/v636/rrrr/sci26.jpg

289 X 9: http://img.photobucket.com/albums/v636/rrrr/sci.jpg
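One way to read these numbers is to set the reported bandwidth gain against the memory clock gain. A rough sketch, assuming the memory runs 1:1 with the reference clock (so 260, 289 and 290 MHz) and using only the percentages quoted in this thread; `scaling_ratio` is a hypothetical helper:

```python
def pct_gain(new: float, base: float) -> float:
    """Percentage gain of `new` over `base`."""
    return (new / base - 1.0) * 100.0


def scaling_ratio(mem_mhz: float, bandwidth_gain_pct: float,
                  base_mhz: float = 260.0) -> float:
    """Fraction of the memory clock gain that showed up as bandwidth."""
    return bandwidth_gain_pct / pct_gain(mem_mhz, base_mhz)


# (benchmark, memory clock in MHz, bandwidth gain in % as reported above)
reported = [("Sandra", 290.0, 5.28), ("Sciencemark", 289.0, 3.41)]

for name, mem_mhz, gain in reported:
    clock_gain = pct_gain(mem_mhz, 260.0)
    ratio = scaling_ratio(mem_mhz, gain)
    print(f"{name}: clock +{clock_gain:.2f}%, bandwidth +{gain:.2f}%, "
          f"ratio {ratio:.2f}")
```

On these figures, less than half of the memory clock increase showed up as extra bandwidth in either benchmark.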

I know you all are thinking, yeah, and so? lol. Well, I am truly surprised by these results, and I think that the AMD 64 multiplier is just a little underrated. Do you guys think that there might be some gains in this realm? It seems to me that RAM holds its timings better at a lower FSB, somewhat regardless of the CPU multiplier. If this is true, then I predict that a 240 X 11 {2640 MHz} setup with timings of 2-2-2 can see at least 9000 MB/s RAM bandwidth. I do not know, because I have yet to see any benchmarks.

So let's see some benchmarks!

Any input???

Gautam · Feb 3, 2005

Ok, I think it's about time we had a little lesson on how the Sandra Memory Bandwidth benchmark works. Yes, multipliers help bandwidth, but they are useless by themselves, which I'll explain.

Sandra tests the maximum raw bandwidth of the memory bus.

Bus bandwidth can be determined like this:

Bus frequency (memory in this case) * 8 bytes per transfer (64-bit bus) * 2 (double data rate) * 2 (dual channel, if applicable)

So normally, in the old days, you could just plug your memory speed into this formula and get your maximum theoretical bandwidth. Sandra would show a little lower (but not significantly) due to inefficiency. Tightening your latencies would help slightly, perhaps a few percentage points, but the raw speed is what determines your maximum theoretical bandwidth, which is in essence what Sandra tests. Higher latencies can only be held against you; tighter ones just increase the efficiency (which is next to a non-issue with the ultra-low latency that the A64 offers).
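The formula above can be sketched in a few lines. This assumes the 8 in the formula stands for the 8-byte (64-bit) width of one DDR channel, and that the memory clock is given in MHz so the result comes out in MB/s; `peak_bandwidth_mb_s` is an illustrative name, not anything Sandra exposes:

```python
def peak_bandwidth_mb_s(mem_clock_mhz: int, channels: int = 2) -> int:
    """Theoretical peak DDR bandwidth in MB/s (1 MB = 10**6 bytes)."""
    bus_bytes = 8      # one DDR channel is 64 bits wide
    transfers = 2      # double data rate: two transfers per clock
    return mem_clock_mhz * bus_bytes * transfers * channels


# Memory clocks mentioned in this thread, plus stock DDR400 for reference.
for mhz in (200, 240, 260, 290):
    print(f"{mhz} MHz dual channel: {peak_bandwidth_mb_s(mhz)} MB/s peak")
```

These are theoretical ceilings only; a benchmark will report something lower.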

Ok, so now what's with the higher multipliers and lower speeds giving higher bandwidth?

The reason that this happens is that the Athlon64 has an on-die memory controller, which uses CPU power. This is something that no architecture before it has seen. What this implies is that a certain amount of CPU power is required to drive the bus and provide the bandwidth.

That's right, CPU power. Which requires CPU speed. What's happening with the dual-channel A64s is that the CPU is bottlenecking the memory bus. In essence, it's not like higher CPU speed is actually boosting bandwidth; rather, slow CPU speeds are limiting it. Lower CPU speeds bottleneck the bandwidth to the point where raising the memory speed past a certain point just barely helps. When you increase the CPU multiplier, you increase the CPU speed, which partially removes this bottleneck and unleashes more bandwidth.

It's not the multiplier that's really doing anything. Really, it's just indirectly increasing the CPU speed, which is unleashing the bandwidth.

In your case, 2.6 GHz is bottlenecking your memory bus. Therefore, whether the memory is at 260 or 290 barely matters. In fact, even ramping it up to 300+ would probably do little to help the bandwidth at low CPU speeds.
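A toy model of that bottleneck, purely for illustration: take the bus peak from the formula earlier in this post and cap it with a CPU-side ceiling. The `bytes_per_cpu_cycle` value of 3.0 is a made-up constant chosen so the cap lands below the bus peak at 2.6 GHz. It is not a measured property of the A64, and a hard `min` is a simplification, since the benchmarks earlier in the thread still showed a few percent gain.

```python
def bus_peak_mb_s(mem_clock_mhz: float, channels: int = 2) -> float:
    """Theoretical DDR peak: clock * 8 bytes * 2 (DDR) * channels."""
    return mem_clock_mhz * 8 * 2 * channels


def effective_mb_s(mem_clock_mhz: float, cpu_mhz: float,
                   bytes_per_cpu_cycle: float = 3.0) -> float:
    """Bandwidth limited by whichever is lower: the bus or the CPU."""
    # Hypothetical ceiling: how much data the CPU can move per second.
    cpu_ceiling = cpu_mhz * bytes_per_cpu_cycle
    return float(min(bus_peak_mb_s(mem_clock_mhz), cpu_ceiling))


for mem, cpu in [(260, 2600), (290, 2610), (200, 2600)]:
    print(f"mem {mem} MHz, cpu {cpu} MHz: "
          f"bus peak {bus_peak_mb_s(mem):.0f}, "
          f"effective {effective_mb_s(mem, cpu):.0f} MB/s")
```

With these made-up numbers, both the 260 and 290 MHz cases are CPU-limited and end up within about 30 MB/s of each other, while the 200 MHz case is still limited by the bus.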

This isn't an opinion question. Sandra is just a test that's focused on raw memory bandwidth. For some people, that's useful; for others, it isn't. Just keep in mind that that's all it's testing.

This phenomenon of the memory bandwidth increasing with the CPU speed only occurs on the dual-channel equipped A64 systems. Not even 754 A64 systems see this, unless using ridiculously low CPU speeds, like <2 GHz, because they don't have enough memory bandwidth to strain the CPU.

Rumrunner · Feb 3, 2005

So the on-die controller, which supports dual channel, causes a bottleneck at high speeds because it causes the CPU to effectively slow down. Makes sense.

Also, the HTT has the ability to load the cpu very hard. Do you think that multiplier is worth more because of HTT?

Gautam · Feb 3, 2005

It's the CPU speed that's compromising memory bandwidth, not the other way around.

I don't get exactly what you're asking. Is a higher multi with lower HTT better than a lower multi with higher HTT?

Rumrunner · Feb 3, 2005

What I mean is: since the HTT can effectively load the FSB of the CPU so heavily, can a higher multiplier help pass the dense information through the maze of logic gates more efficiently...?

flapperhead · Feb 3, 2005

i ain't an amd guy but what i get from Gautam is that the memory controller (built into the cpu) is getting saturated (full) at the higher mult and lower fsb (or whatever amd calls it), so since the memory controller is dependent upon cpu speed, raising the fsb and memory after a certain point (given the same cpu speed) doesn't give any more bandwidth..

TombKeeper · Feb 3, 2005

Yeah, then why is everyone so obsessed with overclocking their 939 so much if it won't do jack for performance.....?? Doesn't make sense... (the people, not the above info)

Rumrunner · Feb 4, 2005

flapperhead said:
i ain't an amd guy but what i get from Gautam is that the memory controller (built into the cpu) is getting saturated (full) at the higher mult and lower fsb (or whatever amd calls it), so since the memory controller is dependent upon cpu speed, raising the fsb and memory after a certain point (given the same cpu speed) doesn't give any more bandwidth..

Not trying to speak for Gautam, but it seems that the front side of these CPUs is being loaded so hard that more CPU speed is needed to run the data through. Hence these dual-channel CPUs need more "multiplier-derived MHz" than other CPUs.

TombKeeper said:
Yeah, then why is everyone so obsessed with overclocking their 939 so much if it won't do jack for performance.....?? Doesn't make sense... (the people, not the above info)

Well, overclocking will always help.

This is a matter of running an extremely high fsb/low multiplier and how it's not as efficient as running a lower fsb/higher multiplier.

TimoneX · Feb 4, 2005

Interesting discussion. It seems to me there are certainly those around here who are overly obsessed with HT (FSB if you must) speeds at any cost. I personally like to find the point of diminishing returns, where you're forced to loosen memory timings substantially to continue increasing HT speeds. I find this to offer the best overall performance. For instance, with my particular setup the top performing choices are 294*9 3-4-4-8 1T or 265*10 2.5-3-3-8. Although Sandra and several other benchmarks show the config with higher HT speeds to offer more bandwidth, benchmarks that do raw number crunching, such as P95 & SuperPi, tell a different story and show the lower HT setup with slightly better timings to be superior.

Gautam · Feb 4, 2005

flapperhead said:
i ain't an amd guy but what i get from Gautam is that the memory controller (built into the cpu) is getting saturated (full) at the higher mult and lower fsb (or whatever amd calls it), so since the memory controller is dependent upon cpu speed, raising the fsb and memory after a certain point (given the same cpu speed) doesn't give any more bandwidth..

Flapper's 100% dead on.

This isn't that complicated, it's just that the proc can bottleneck the memory bandwidth.

The memory bus isn't inefficient per se, it's just being limited by the processor. I'd still wager that outside of Sandra, 290x9 would be significantly better than 260x10, given same timings.

Sandra is too focused to show us more than raw bandwidth, which != performance.

flapperhead · Feb 4, 2005

Gautam said:
Flapper's 100% dead on.

This isn't that complicated, it's just that the proc can bottleneck the memory bandwidth.

The memory bus isn't inefficient per se, it's just being limited by the processor. I'd still wager that outside of Sandra, 290x9 would be significantly better than 260x10, given same timings.

Sandra is too focused to show us more than raw bandwidth, which != performance.

HA! i guess all that partying i did in my youth didn't completely burn me out. lol... since i'm gonna get one of these monsters i guess i should start reading up on the amd stuff..
