Never underestimate the importance of fast CAS settings . . .
. . . when it comes to 3D performance. Been playing with a couple of NF2 boards the last three weeks, and I notice the AMD fixation on fsb. Yes, the only way to get higher bandwidth is through fsb: 250 fsb=3700+ bandwidth. But not so fast, that's not the whole story when it comes to the 3DMarks. Here is some data I got with Mobile Bartons:
236x10.5, 2595 mhz, 2.5-2-2-5, bandwidth: 3597/3369
3DMark2001=21490
3DMark2003=11295
3DMark2000=19191
3DMark2005=5412
220x11, 2530 mhz, 2-2-2-5, bandwidth: 3378/3205
3DMark2001=21569
3DMark2003=11412
3DMark2000=19336
3DMark2005=5466
Heck, the board at lower fsb and 65 mhz less (AMD guys know that is a HUGE amount to give away) gets higher 3DMark scores due to the faster timings of BH-5 at 2-2-2-5. And hell, the other board running 2.5-2-2-5 (using Ballistix ram) sure ain't slow.
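To put a number on the gap between the two Mobile Barton runs, here is a minimal Python sketch that computes the percentage gain of the 220 fsb / 2-2-2-5 run over the 236 fsb / 2.5-2-2-5 run. The scores are copied from the post above; the dictionary and function names are made up for illustration.

```python
# Scores copied from the two Mobile Barton runs posted above.
run_236_cas25 = {"3DMark2000": 19191, "3DMark2001": 21490,
                 "3DMark2003": 11295, "3DMark2005": 5412}
run_220_cas2 = {"3DMark2000": 19336, "3DMark2001": 21569,
                "3DMark2003": 11412, "3DMark2005": 5466}


def percent_gain(baseline, other):
    """Return the percentage gain of `other` over `baseline` per benchmark."""
    return {name: (other[name] - baseline[name]) / baseline[name] * 100.0
            for name in baseline}


gains = percent_gain(run_236_cas25, run_220_cas2)
for name in sorted(gains):
    print(f"{name}: {gains[name]:+.2f}%")
```

The gains work out to somewhere between about 0.4% and 1.0%, which the 220 fsb rig manages while giving away 65 mhz (about 2.5%) of CPU clock.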
And look at the benchmarks of running even higher fsb at 2-2-2-5 on a non-Barton chip (XP2100) where you have 1/2 the cache, and a CPU almost 100 mhz smaller:
231x10.5, 2440 mhz, 2-2-2-5, bandwidth: 3582/3341
3DMark2003=11896
3DMark2005=5445
(I've not posted the 3DMark2001 and 3DMark2000 scores as these are lower than the boards above due to the non-Barton chip)
Summary: 236 fsb, 2.5-2-2-5 gets beat soundly by 231 and even 220 fsb, 2-2-2-5. Moreover, it's not even a clock-for-clock comparison, as the 231/220 fsb rigs have smaller CPUs! I wouldn't be surprised if 220, 2-2-2-5 beats 250, 2.5-4-4-7 in the 'Marks, clock-for-clock, even if you could bench there (hah!).
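One way to see why the lower fsb can still win is to convert CAS latency from clock cycles to nanoseconds. The sketch below assumes the memory runs in sync with the fsb (1:1), which is not stated in the post, and it only looks at CAS, not the other timings. The function name is made up for illustration.

```python
def cas_latency_ns(cas_cycles, mem_clock_mhz):
    """Convert a CAS latency given in clock cycles to nanoseconds."""
    return cas_cycles / mem_clock_mhz * 1000.0


# (label, CAS in cycles, memory clock in MHz); assumes memory clock == fsb.
configs = [
    ("236 fsb, CAS 2.5", 2.5, 236),
    ("231 fsb, CAS 2", 2.0, 231),
    ("220 fsb, CAS 2", 2.0, 220),
    ("250 fsb, CAS 2.5", 2.5, 250),
]

for label, cas, mhz in configs:
    print(f"{label}: {cas_latency_ns(cas, mhz):.2f} ns")
```

Under that assumption, CAS 2 at 220 comes out shorter in real time than CAS 2.5 at 236 or even at 250, so the lower-clocked memory answers sooner even though its bandwidth figure is lower.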
There is no substitute for good ole BH-5 at 2-2-2-5, and you don't even have to run high fsb and do all the Voodoo and black magic on the NF7-S board. A decent NF7-S board will run 220 out-of-the-box, with OEM BIOS and no L12 mod, on an ordinary APIC-enabled WinXP.
All the results above are 2x512 MB ram with CPC Enabled. I'm not interested in 256 MB sticks or 2T, so I take what fsb I can get here. To run 3D stable >223 fsb I had to use an Epox 8RDA3+ Rev. 3.2 board. The BH-5 I'm using is 2x512 XMS3500 (which runs 231 fsb at just 2.816 actual on the Epox board!).
flapperhead
12-02-04, 11:55 AM
interesting, since u have access to many different boards, have u tried it with the 939/754 and 875 chipsets? if so did u see much difference??
Barton is known to like Lowest Cas Possible, it's not the same for all AMD chips :p
grimm003
12-02-04, 03:27 PM
That's some really useful information, props to you!
Quailane
12-02-04, 03:30 PM
Barton is known to like Lowest Cas Possible, it's not the same for all AMD chips :p
Well since it is the same on my duron, for what is in between (the thoroughbred) it must also surely be the case.
interesting, since u have access to many different boards, have u tried it with the 939/754 and 875 chipsets? if so did u see much difference??
With the P4 boards, there's not THAT much difference between 2-2-2-5 and 2.5-2-2-5, should you be so lucky to run even this. But the boost in scores is about the same. Mainly you want to get as close to -2-2- as possible, which is where the real speed is.
Also with P4s, faster CAS settings boost unbuffered bandwidth. I did not notice this much with AMD mainly because I, as well as everybody else, look at buffered. Mainly it's the higher the fsb, the higher the bandwidth, even with dog slow timings.
So if you are a gamer, I wouldn't be so fixated with high fsb. And if you are running high fsb but your timings aren't even close to 2.5-2-2-5, why bother. Get good BH-5 and run 220 or so at 2-2-2-5, which all the 8RDA+ Rev. 1.1s and NF7-S Rev. 2.0s I've tried can do, out-of-the-box with no mods whatsoever. Plus you can run CPC Enabled. Fortunately the multi is unlocked (on pre 0352 XPs and Mobile Bartons) so you can up that to get big CPUs. If anything, size counts a lot with AMDs when it comes to 3DMark scores.
Or as I mentioned, switch to a Rev. 3.2 8RDA3+; I've found it to be a marvelously unfussy board, better on the VCORE overclocking than the NF7-S despite being 2-phase, and much stabler in 3D at higher fsb. And ya only have to try one modded BIOS by Merlin, heh. Do that in 3 minutes and drop the pins in for the L12 mod and you are good to go. I think the BIOS has the L12 mod built-in but I always do the hard mod to make sure. I think most AMDers would be happy with 231, 2-2-2-5, 2x512s with CPC Enabled :p. Got good XMS3500 though.
As for the 939/754, I have the DFI 250 GB NF3 and K8N Neo2 Plat, but I need to get back to my P4 stuff for now. I imagine the same story there: faster CAS settings, better 3DMark scores, all at lower fsb (and 1T) compared to the Banzai runs. It's just that all you see is bandwidth screenies and not much in the way of 3DMark scores to compare. That's a whole lot harder to get ;).
flapperhead
12-02-04, 05:26 PM
Excellent information.. that info takes time to find out.. thnx clev..
It's no different with P4 systems: timings and PAT are everything. If you have 2-2-2-5 and PAT, 400MHz is plenty of clock rate on the ram.
I sent a PM to a member discussing just this topic not long ago. He urged me to post it, but I hadn't seen an appropriate opportunity. Here goes:
=========================================================
The important thing to understand with PC performance is that we are indeed building a system. So often overclockers get so consumed with the individual components in an ivory-tower analysis that they lose sight of the effectiveness of the system as a whole. And they are so reliant on the use of synthetic benchmarks (that they don't have the expertise to correlate with application performance) that they rarely realize their mistakes.
A secondarily important realization is that Intel really knows what they are doing. 875p running BH5 memory at 400MHz with PAT enabled is essentially perfect, limited only by the use of a discrete memory controller chip rather than the cpu-integrated approach.
The game plays out like this from here: BH5, BH5, BH5. Most of the miraculous performance-enhancing technology built into 875 is predicated on BH5 ram. 2-2-2-5 timings won't fly on anything else (except the late model BH5 substitute, TCCD), and PAT is often unstable without BH5. If you get PAT working with 2-2-2-5 timings, 400MHz proves to be plenty of clock speed on the memory.
The next key piece of information is that the 1:1 mode is fastest, the 5:4 mode performs very well, and the 3:2 mode starts to slip noticeably. No matter which mode we run, we want the (BH5) memory to run at 400-416MHz to correctly balance bandwidth and latency concerns.
This means 1:1 w/200fsb, 5:4 w/250fsb, or 3:2 w/300fsb. If we land south of these fsb's, we lose memory bandwidth (small consequence in application performance), and if we exceed those fsb's, we have to forgo essentially all hope of the 2-2-2-5 timings and PAT to allow the memory to operate at the required clock rate (application performance suffers noticeably).
Obviously achieving one of those fsb's is not possible with all cpus due to the locked multiplier. So that leaves the system design process as the culmination of the following three steps:
1) Get some BH5 (or TCCD if it will do 2-2-2-5 w/PAT)
2) Decide if 200, 250, or 300fsb is your bag
3) Obtain a cpu that will run at the fsb you select (or one of the other two if that's all you can get) and still produce adequate cpu clock rate.
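The three fsb/divider combinations above all land the memory on the same DDR rate, which can be checked with a few lines of Python. This is only a sketch of the arithmetic: it treats "fsb" as the base clock (200, 250, 300) and the 400MHz figure as the effective DDR rate, and the function name is made up for illustration.

```python
from fractions import Fraction


def ddr_rate_mhz(fsb_mhz, cpu_part, mem_part):
    """Effective DDR rate for a CPU:memory divider such as 5:4.

    The memory clock is fsb * mem_part / cpu_part, and DDR transfers
    data twice per clock, hence the factor of 2.
    """
    mem_clock = Fraction(fsb_mhz) * mem_part / cpu_part
    return mem_clock * 2


# (fsb in MHz, (cpu part, memory part) of the divider)
modes = [(200, (1, 1)), (250, (5, 4)), (300, (3, 2))]

for fsb, (cpu_part, mem_part) in modes:
    rate = ddr_rate_mhz(fsb, cpu_part, mem_part)
    print(f"{fsb} fsb at {cpu_part}:{mem_part} -> DDR{float(rate):.0f}")
```

Plugging in an in-between fsb (say 220 at 5:4) shows the memory falling below 400, and going past 250 at 5:4 pushes it above 400, which is the trade-off described above.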
The only other factor that plays a large role is Abit's GAT technology. It looks to me like GAT is an even more aggressive implementation of PAT, and both supersedes and obviates PAT. i865pe boards with GAT like the IS7 and AI7 suffer no loss in performance when compared to their i875 based cousins with official PAT as well. But it's hard on the ram (BH5, keep the clock rate down if you want it to work).
It's always fun to try to make the most of whatever you have, and often you find that you can accomplish a lot more than anyone would have thought. But for my #1 rig I have a very clear and developed notion of system design as the first priority, and hopefully only one. My intent is to produce leading performance while maintaining heat, noise, and cost characteristics inseparable from a modest, everyday PC. I use overclocked processors in the pursuit of this goal, but it is vital to understand that overclocking is a means to an end, not an end in itself. It's not always a more-is-better thing, especially if that more compromises your fsb, memory clock rate, and/or timings.
The game plays out like this from here: BH5, BH5, BH5. Most of the miraculous performance-enhancing technology built into 875 is predicated on BH5 ram. 2-2-2-5 timings won't fly on anything else (except the late model BH5 substitute, TCCD), and PAT is often unstable without BH5. If you get PAT working with 2-2-2-5 timings, 400MHz proves to be plenty of clock speed on the memory. . .
The only other factor that plays a large role is Abit's GAT technology. It looks to me like GAT is an even more aggressive implementation of PAT, and both supersedes and obviates PAT. i865pe boards with GAT like the IS7 and AI7 suffer no loss in performance when compared to their i875 based cousins with official PAT as well. But it's hard on the ram (BH5, keep the clock rate down if you want it to work).
I have suspected this as well, and have mentioned many times on various forums the superior unbuffered bandwidth of the Abit 865P board using GAT or the straps.
Do ya notice that on the Abit 865P board, the 667 strap only imposes full PAT with CAS 2.5 or lower? Or that on the 800 strap, you can achieve full PAT using Sr-Enh-A-D-D, but only at CAS 2 or lower? Can someone smell BH-5 here?
However I disagree with one statement: the BH-5 substitute now is (or was) Ballistix ram. It can run higher down low at 2-2-2-5 (up to 225), and even at 2.5-2-2-5, you can get full PAT on the Abit boards using the 667 strap and A-A-A-D-D to stabilize the ram. Though it won't run PAT at the 800 strap and GAT due to CAS 2.5, ya got BH-5 for that ;).
Unfortunately Ballistix seems to have given in to the high fsb TCCD hype and has taken away the fast timings down low to give more headroom up high, at blah 2.5-3-3-6 timings. Note their PC3200 is now rated at 2-3-2-5. I can confirm they are not as fast down low as the original stuff, but can hit 260, 2.5-3-3-6.
grimm003
12-02-04, 07:26 PM
Sorry for the newb question here, but what is PAT and GAT, and is it for pentiums only?
I have suspected this as well, and have mentioned many times on various forums the superior unbuffered bandwidth of the Abit 865P board using GAT or the straps.
Yes, Abit milks more out of the 865 chipset and DDR400 than anyone else. As you well know other boards are more compatible ram-wise, but I've tested nothing as fast as an AI7, clock tick per clock tick. Personally I have had mixed results QC-wise with Abit, and recommend Asus for other people. But for my personal rigs, the AI7 is choice #1 for its intelligent design, affordability, and unmatched memory performance.
Do ya notice that on the Abit 865P board, the 667 strap only imposes full PAT with CAS 2.5 or lower? Or that on the 800 strap, you can achieve full PAT using Sr-Enh-A-D-D, but only at CAS 2 or lower? Can someone smell BH-5 here?
I haven't fooled much with strap settings beyond the 800fsb one. I have seen others report results with the other settings that seem to defy logic, or at least confuse it. As well, I run only 2-2-2-5 for timings. My buffalo 3700 will do 2-2-2-5/400MHz/GAT SR in 5:4 mode on 2.8V, or 2-2-2-5/400MHz/GAT F1 in 1:1. I've tried other memory, but when it can't do the above the BH5 goes back in.
The Abit 865pe boards do implement significant and proprietary bios optimizations. To this day they confuse CPU-Z, which indicates 5:4 operation when they are in fact in 1:1.
However I disagree with one statement: the BH-5 substitute now is (or was) Ballistix ram. It can run higher down low at 2-2-2-5 (up to 225), and even at 2.5-2-2-5, you can get full PAT on the Abit boards using the 667 strap and A-A-A-D-D to stabilize the ram. Though it won't run PAT at the 800 strap and GAT due to CAS 2.5, ya got BH-5 for that ;).
However BH5-like Ballistix may be, I wasn't mentioning what was the replacement for BH-5, but rather what is. TCCD is undoubtedly good ram and is getting better, but I'm not giving up my BH-5 yet. And just as undoubtedly, those that don't have good BH-5 are going to be choosing between TCCD, TCCD, and TCCD when it comes time to buy ram.
mattspalace
12-02-04, 09:14 PM
Hey larva, how's that 2.8C treatin' ya lately? Still nice and stable??
Sorry for the newb question here, but what is PAT and GAT, and is it for pentiums only?
PAT is a hardware and bios performance optimization Intel created with the advent of the i875p chipset. It cuts realized memory subsystem latency by a significant amount. The gain in latency behavior essentially cancels out the increased latency inherent in a dual channel memory scheme, allowing the (excellent) latency characteristics of the single channel i845 series to be maintained.
GAT is Abit's proprietary and unauthorized PAT-like technology for the ever-so-similar i865pe chipset. In authorized designs i865pe has no PAT function, and as such under-performs i875p by a few percent. Abit's i865pe boards like the IS7 and AI7 perform at least on par with their i875p-based cousins.
As such PAT and GAT are only applicable topics where i865/875 chipsets are involved.
Hey larva, how's that 2.8C treatin' ya lately? Still nice and stable??
Yeah, it's a good chip. I tried a SL6WK 3.0c I picked up cheap and it would bench at 3.75GHz, but was not 100% in daily use. It did run a lot cooler than the M0 though. It looked good at 3.6GHz, but due to the compromised fsb/memory clock rate barely outperformed the 2.8@3.5GHz. I sold it to a friend and put the 2.8c back in until I can locate a 3.0c that is 100% at my preferred 250fsb.
mattspalace
12-02-04, 09:23 PM
I did the same, but the 3.0c I got seemed like it was starting to degrade..which you saw in one of my threads.
That 2.8c is a good chip. If you ever decide to sell her, PM me. :)
In authorized designs i865pe has no PAT function, and as such under-performs i875p by a few percent. Abit's i865pe boards like the IS7 and AI7 perform at least on par with their i875p-based cousins.
I posted my results on Asusboards a while back. The 865P boards can indeed outperform the 875P. Case in point is 300, 3:2, 2-2-2-5 on the IS7 on 800 strap at Sr-Enh-A-D-D gave just over 3200 unbuffered. The same setup on the IC7 gave 3130-3150 unbuffered. That's partial PAT for the IS7 and full PAT for the IC7. The IS7 had a higher 3DMark2001 score by 100-130 points.
BTW, you can get PAT on 865P boards up to 200 fsb. Over that you can't do anything on the Asus P4P800 except do the PAT hack. On the Abit boards you can play with the strap and GAT.
One comment on TCCD: some sticks can do 2-3-3-6. Mine can do that up to 236. So I can run 800 strap, Sr-Enh-A-D-D and get PAT as long as I run 2-3-3-6.
I believe on the 667 strap you have to run 2.5-2-2- at the slowest to get PAT. I believe I tried the TCCD at 2.5-3-3-6, 667 strap but PAT was off. I know on the Buffalo PC3200 at 3-2-3-5, nope, no PAT on the 667 strap.
I have also found that the NF2 boards really groove on BH-5. Particularly that 8RDA3+ with the Merlin BIOS. I can't run DDR462, 2-2-2-5 at 2.816 volts even on the P4C800-E.
flapperhead
12-03-04, 04:16 PM
i got a 512 stick of kvr bh6 on ebay, preliminary tests at 3.5 volts are extremely encouraging, especially since i only paid 70.00.
flapperhead
12-03-04, 04:23 PM
btw are there any conflicts with the p4c800 e deluxe and bh6?? it wont boot in my first dimm slot. now it doesnt want to boot at all.. HMMM.. i wonder if the 3.5 volts had anything to do with it... lol
Reefa_Madness
12-03-04, 06:08 PM
Is that smoke I smell???
adelphia83
12-03-04, 08:39 PM
This discussion was originally oriented toward the Socket A platform, and somehow migrated to the current P4 discussion. Does anyone have any concrete evidence that the same holds true w/ Athlon 64 based systems?
I understand that timings play an important role in overall system performance, and that tight timings often outweigh the benefits of increased memory bandwidth through increased frequency.
My testing seems to support quite the opposite, in that memory timings generally don't play as big a role in both Socket 754 and 939 based systems. CAS latency to be specific made little to no difference in 3DMark03 scores. I have several screenshots supporting this, that I will post as soon as they are hosted.
My results concluded that 2-2-2-5 @ 200mhz is slower than ~235mhz 2.5-3-3-x both using 1T command rate. It took 272mhz to achieve the same score using 2.5-4-4-x 1T. Using 2T was pointless, as both the score in Sandra and 3DMark fell well below that of 2-2-2-5 @ 200mhz at any frequency.
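For anyone wanting to line these three A64 settings up side by side, here is a rough Python sketch that prints theoretical peak bandwidth and CAS time for each. It assumes a single 64-bit DDR channel (as on Socket 754; a dual-channel 939 board would double the bandwidth figure) and it ignores every timing except CAS. The class and constant names are made up for illustration, and these are theoretical numbers, not measured Sandra results.

```python
from dataclasses import dataclass

# Assumption: one 64-bit (8-byte) DDR channel, as on Socket 754.
BYTES_PER_TRANSFER = 8


@dataclass
class MemConfig:
    label: str
    clock_mhz: float
    cas: float

    def peak_bandwidth_mb_s(self):
        # DDR performs two transfers per memory clock.
        return self.clock_mhz * 2 * BYTES_PER_TRANSFER

    def cas_ns(self):
        # CAS latency in cycles converted to nanoseconds.
        return self.cas / self.clock_mhz * 1000.0


configs = [
    MemConfig("2-2-2-5 @ 200", 200, 2.0),
    MemConfig("2.5-3-3-x @ 235", 235, 2.5),
    MemConfig("2.5-4-4-x @ 272", 272, 2.5),
]

for c in configs:
    print(f"{c.label}: {c.peak_bandwidth_mb_s():.0f} MB/s peak, "
          f"CAS = {c.cas_ns():.2f} ns")
```

On these assumptions the 235mhz setting gives up well under a nanosecond of CAS time against 2-2-2-5 @ 200mhz while gaining theoretical bandwidth, which would at least be consistent with the scores described above.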
If anyone else has results using the A64 platform, it'd be interesting to see both how they compare to mine, and to that of other platforms.