[Ret Sticky]Overclocking sndbx for A64 939 systems with Winchester, Opteron dual core

hitechjb1 said:
Actually the Fortron 350W is not for short burst testing. I figured out it should be OK before a full setup with more components.

The Fortron 350W had been used to test the system for more than one day continuously without a single BSOD, system hang, ..., with the Winchester between 2.7 - 2.9 GHz priming, the 6600 GT at rated 525/1050, as well as one HD and one OD, all drawing 12 V current.

From an engineering standpoint, the 12 V current rating of the Fortron 350W (12 mm fan) is marginal at that level (16 A rated).

I swapped it with an Antec True 550 which is rated 24 A on 12 V for now until the time I feel a 24-pin PSU with higher 12 V current is needed.

Let's move on to something other than just the PSU. :)

Ah, really? My mistake. Most impressive. And Winchesters do have a lower power draw than Clawhammers and Newcastles.

Ok, let's talk about how you went against your own advice to leave the 10x multi open; banking on hopes of both the motherboard and memory to hit 300MHz, and the processor hitting 2.7 GHz....odds wouldn't seem in your favor but yet you defied them anyways :D


Yeah come on folks. Hitech's threads are a great resource, but a long read, crapping it with a flame war on the side isn't helping anyone.
 
Excellent results and a very methodical review. I'm curious have you tried that particular processor in any other boards? It would be interesting to note the differences if any in results between an nF3 & nF4 board with the same processor.

Edit: You guys are really polluting this otherwise very informative thread with this PSU debate.
 
Well, I've just been replying.. I was just minding my own business before he got bent out of shape.. I will delete my posts if that is possible on this board.

Deception, you should do the same.
 
Gautam said:
Ah, really? My mistake. Most impressive. And Winchesters do have a lower power draw than Clawhammers and Newcastles.

Ok, let's talk about how you went against your own advice to leave the 10x multi open; banking on hopes of both the motherboard and memory to hit 300MHz, and the processor hitting 2.7 GHz....odds wouldn't seem in your favor but yet you defied them anyways :D


Yeah come on folks. Hitech's threads are a great resource, but a long read, crapping it with a flame war on the side isn't helping anyone.

Indeed, the Winchester runs much cooler than the 130 nm Barton and 130 nm NewCastle/ClawHammer clock for clock. The later-week Winchesters especially seem to achieve 2.6 - 2.8 GHz at ~ 1.55 V. As the test result showed, even at the 2.7 - 2.8 GHz level under load, a medium CFM fan (~ 40 CFM) can keep the CPU around 40 C. Automatic fan control by the motherboard further cuts the noise level for regular 24/7 usage. Even without using Cool'n'Quiet, the NF4 Winchester system is much quieter than systems based on 130 nm CPUs.

The debut of the DFI LP Nforce4 prompted me to get this system the first day the DFI board was available, not just for Winchester, but to build a testbed (long overdue) for the other 939 CPUs that follow.

On the early Nforce3 boards, due to limitations of the motherboard, and maybe also of the memory and its controller, 250 - 280 MHz HTT was about the top, so a 3200+ was more flexible and almost a must in order to achieve 2.5+ GHz on the CPU.

Recently, the MSI Neo2 (NF3 Ultra), Neo4 NF4 and the DFI LP NF4, teamed with TCCD modules, all seem to deliver 300 MHz HTT and a 280-300 MHz memory bus. Further, the DFI NF4 has more memory ratios for fine tuning if needed.

So a while back, I already revised the choice between 3000+ and 3200+ as:

hitechjb1 said:
939 Winchester 3000+ vs 3200+

One can run the memory bus frequency slower than the HTT with minimal impact on memory performance,
for example, assume bios only has 1:1, 5:6, 2:3 memory_HTT_ratio
For 3000+, max multiplier = 9, memory_divider available = 9, 11,
For 3200+, max multiplier = 10, memory_divider available = 9, 10, 11, 12, 15

So if the CPU clock frequency is 2500 MHz,
one would get memory at 277 or 227 MHz with a 3000+ (x9 max),
one would get memory at 277, 250, 227, 208, 167 MHz with a 3200+ (x10 max).


As can be seen, the 3200+ provides more flexible matching of memory frequency for given memory modules. In addition, one can also get a high CPU overclocking in case the motherboard and system cannot handle high HTT for whatever reason. Say, if HTT is stuck under 260 MHz, with a 3200+, one can still get 2.60 GHz with the x10 multiplier, but with a 3000+, the highest CPU overclock would be limited to 2.34 GHz.
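
The arithmetic in the note above can be checked with a short sketch. It assumes, as the note does, that the memory frequency is simply the CPU frequency divided by an integer memory divider. The function names and the divider lists as constants are mine for illustration, not taken from any tool. The note appears to truncate 277.8 MHz to 277, so rounding may differ by 1 MHz.

```python
def memory_frequencies(cpu_mhz, dividers):
    """Memory bus frequency (MHz) for each divider: memory = CPU / divider."""
    return {d: cpu_mhz / d for d in dividers}


def max_cpu_mhz(htt_limit_mhz, max_multiplier):
    """Highest CPU clock reachable when the HTT cannot exceed htt_limit_mhz."""
    return htt_limit_mhz * max_multiplier


# Divider lists exactly as given in the note above.
DIVIDERS_3000 = [9, 11]
DIVIDERS_3200 = [9, 10, 11, 12, 15]

if __name__ == "__main__":
    for name, dividers in (("3000+", DIVIDERS_3000), ("3200+", DIVIDERS_3200)):
        freqs = memory_frequencies(2500, dividers)
        print(name, ", ".join(f"CPU/{d} = {f:.1f} MHz" for d, f in freqs.items()))
    # HTT stuck at 260 MHz: compare the x9 and x10 parts.
    print("x9 :", max_cpu_mhz(260, 9), "MHz")
    print("x10:", max_cpu_mhz(260, 10), "MHz")
```

The point of the comparison is only that the x10 part offers more memory frequency choices at a given CPU clock, and a higher ceiling when the HTT is the limit.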

On the other hand, in terms of budget and price-performance, one can argue that a 3000+ is a better choice. Which CPU can potentially give the higher overclock is the luck of the draw, due to the random variation above the stock frequency specification.

If the motherboard and memory modules used can handle high HTT (to 300 - 330+ MHz such as DFI Nforce4) in combination with high memory bus frequency (such as TCCD based modules), and especially if the motherboard and bios can provide a wide range of memory_HTT_ratio (more than 1:1, 5:6, 2:3, 1:2 such as the DFI Nforce3 and Nforce4 boards), then a 3000+ would be almost as good as a 3200+ on air, as 2.7 - 3.0+ GHz would not be a barrier due to the x9 3000+ multiplier alone.

The above argument assumes CPU's are from similar week/stepping. In many cases, newer CPU (more recently dated) may be preferred, especially if supported by results and statistics, probably due to some process, yield improvements or some not-yet-known reasons.


In general, the 3200+ is preferred since lower motherboard HTT and memory frequency are sufficient to deliver the same CPU frequency, especially when it is not known a priori what the board and memory can achieve.

Since I assumed (not a casual assumption, but one backed by quite a bit of research) that the DFI LP Nforce4 and G. Skill 4400 had a good chance of delivering a 300+ MHz memory bus at 2.5-4-4-x, and the CPU target was 2.7 GHz, I picked the 3000+ instead of a 3200+, also for the cost saving as Venice may be around the corner. Anything above 300 MHz memory and 2700 MHz CPU is beyond my objective.
 
TimoneX said:
Excellent results and a very methodical review. I'm curious have you tried that particular processor in any other boards? It would be interesting to note the differences if any in results between an nF3 & nF4 board with the same processor.

Edit: You guys are really polluting this otherwise very informative thread with this PSU debate.

I don't have another NF3 and NF4 to test the CPU. But I have seen high CPU clock based on MSI Neo2 NF3 Ultra boards, ....

You are correct that the CPU, memory, motherboard and its subsystems (chipset, voltage regulator, system bus) can potentially affect one another. One part may not perform to its full potential in a particular system.

I am still testing out the new system. As I try to push the HTT above 324 MHz (2.92 GHz CPU), it seems to hit a wall, while the CPU seems to be quite stable under 2.92 GHz with ~1.55 V and under 40 C. Neither slowing down the memory frequency nor relaxing the memory timing seems to help. I am still trying to find the limiting factor, be it the CPU, memory, memory controller, motherboard subsystems, ....


BTW, how well do your 3200+ Winchester and Neo2 NF3 Ultra overclock? What week is the 3200+?
 
hitechjb1 said:
On the early Nforce3 boards, due to limitations of the motherboard, and maybe also of the memory and its controller, 250 - 280 MHz HTT was about the top, so a 3200+ was more flexible and almost a must in order to achieve 2.5+ GHz on the CPU.

Recently, the MSI Neo2 (NF3 Ultra), Neo4 NF4 and the DFI LP NF4, teamed with TCCD modules, all seem to deliver 300 MHz HTT and a 280-300 MHz memory bus. Further, the DFI NF4 has more memory ratios for fine tuning if needed.

So a while back, I already revised the choice between 3000+ and 3200+ as:

In general, the 3200+ is preferred since lower motherboard HTT and memory frequency are sufficient to deliver the same CPU frequency, especially when it is not known a priori what the board and memory can achieve.

Since I assumed (not a casual assumption, but one backed by quite a bit of research) that the DFI LP Nforce4 and G. Skill 4400 had a good chance of delivering a 300+ MHz memory bus at 2.5-4-4-x, and the CPU target was 2.7 GHz, I picked the 3000+ instead of a 3200+, also for the cost saving as Venice may be around the corner. Anything above 300 MHz memory and 2700 MHz CPU is beyond my objective.
Yep, just playing with you. I know you'd never make a short-sighted purchase. :p

However, I have to disagree with you slightly. The nForce4's do seem to hit 300+ quite effortlessly. The same isn't true for nForce3's. I've seen a couple of them cross 300, but for example, Glock19Owner is able to reach well in excess of 300 in single channel, and has a 2.8-2.9GHz capable processor, but is stopped at around 295 in DC, which appears to be the board's fault (MSI Neo2).

And now this wall that you're mentioning at 292MHz HTT also, at first glance, might possibly be board related.

So the board/chipset plays a limiting role in many people's setups, it appears.


Btw, have you tried a lower multiplier and/or more voltage yet? (Obviously neither is practical for daily usage)
 
The dual channel limitation may well be within the CPU itself or the interface between the CPU and the motherboard. For Nforce3 Ultra, the max is probably lower than the current Nforce4.

My current DFI NF4 Ultra is hitting a limit around 324 MHz for both HTT and memory (2.5-4-4-10 1T, very good memory), which is 2.92 GHz for the CPU, not 292 MHz HTT.

I am working on finding out why, be it the CPU, the CPU's memory controller, the CPU/motherboard interface, memory, .... Since lowering the memory frequency and relaxing the memory timing do not help, it does not seem to be memory related (at 324 MHz dual channel :), efficiency only around 81% though).


Conjecture:
NF3 Ultra tops out lower on memory bus and HTT, but higher efficiency, usually around 90+ %.
DFI NF4 can attain higher memory bus and HTT, but lower efficiency. Mine is only 81% at 300-320 MHz, e.g. 8225 MB/s raw bandwidth at 318 MHz. I have to check more data from others.
So there may be an intrinsic bottleneck somewhere limiting HTT and/or memory.
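
For reference, the 81% figure is consistent with a simple calculation, sketched below. It assumes efficiency means measured bandwidth divided by the theoretical peak of dual-channel DDR (two 64-bit channels, two transfers per clock, MB = 10^6 bytes). That definition and the function names are my assumptions, and the benchmark may define efficiency slightly differently.

```python
def peak_bandwidth_mb_s(memory_mhz, channels=2, bytes_per_transfer=8,
                        transfers_per_clock=2):
    """Theoretical peak bandwidth in MB/s for DDR memory.

    Assumes 64-bit (8-byte) channels and double data rate.
    """
    return memory_mhz * transfers_per_clock * bytes_per_transfer * channels


def efficiency(measured_mb_s, memory_mhz):
    """Measured bandwidth as a fraction of the theoretical peak."""
    return measured_mb_s / peak_bandwidth_mb_s(memory_mhz)


if __name__ == "__main__":
    peak = peak_bandwidth_mb_s(318)
    print(f"peak at 318 MHz: {peak} MB/s")
    print(f"efficiency at 8225 MB/s: {efficiency(8225, 318):.1%}")
```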
 
hitechjb1 said:
BTW, how well do your 3200+ Winchester and Neo2 NF3 Ultra overclock? What week is the 3200+?

Mine is a CBBHD 0444RPAW. Tops for this setup is 265*10, anything higher fails prime95 within an hour. Since some of the limitations of the neo2 are clearly bios & memory related I think it would be interesting to see if the same processor would perform equally or better on an nF4 board and a DFI sample specifically.

hitechjb1 said:
Conjecture:
NF3 Ultra tops out lower on memory bus and HTT, but higher efficiency, usually around 90+ %.
DFI NF4 can attain higher memory bus and HTT, but lower efficiency. Mine is only 81% at 300-320 MHz, e.g. 8225 MB/s raw bandwidth at 318 MHz, has to check more data from others.
So there may be an intrinsic bottleneck somewhere limiting HTT and/or memory.

Is it possible this efficiency loss is due to bank interleaving being disabled? I recall seeing a screenie of this option on the DFI bios. Have you confirmed whether it's working or not?
 
hitechjb1:

What are the stable versions of the same results?
Also, I request you test if TCCD chips like high voltages or not.


/*Will delete this post if necessary*/
 
Yes and along those lines I noted some posts at XTS where guys were saying the nF4 boards seem to be requiring more than 3v with TCCD modules, which I thought rather odd.
 
hitechjb1 said:
The dual channel limitation may well be within the CPU itself or the interface between the CPU and the motherboard. For Nforce3 Ultra, the max is probably lower than the current Nforce4.

My current DFI NF4 Ultra is hitting a limit around 324 MHz for both HTT and memory 10-4-4-2.5 1T (very good memory), 2.92 GHz for CPU, not 292 MHz HTT.

I am working on finding out why, be it CPU, CPU's memory controller, CPU/motherboard interface, memory, .... Since lower memory frequency and relaxing memory timing won't help, it seems to be not memory related (at 324 MHz dual channel :), efficiency only around 81% though).


Conjecture:
NF3 Ultra tops out lower on memory bus and HTT, but higher efficiency, usually around 90+ %.
DFI NF4 can attain higher memory bus and HTT, but lower efficiency. Mine is only 81% at 300-320 MHz, e.g. 8225 MB/s raw bandwidth at 318 MHz, has to check more data from others.
So there may be an intrinsic bottleneck somewhere limiting HTT and/or memory.
d'oh! I did mean 324MHz HTT. Neways question still applies. Have you tried a lower multi?

I *think* that the 81% efficiency is common for your speeds. The oddity with the 939 A64's is that the CPU speed actually bottlenecks the memory bandwidth, because the controller is loaded too heavily at low CPU speeds. Because of this, you will probably get similar bandwidth whether you run the memory at 324 or 292. If you were to reach 324MHz memory speed with the processor at 3.24 GHz, you would actually see more memory bandwidth, as the memory controller scales up with the CPU speed. It's an odd phenomenon that hasn't occurred in any type of architecture prior to the dual channel A64's: CPU speed actually determining memory bandwidth.
 
TimoneX said:
...

Is it possible this efficiency loss is due to bank interleaving being disabled? I recall seeing a screenie of this option on the DFI bios. Have you confirmed whether it's working or not?

The Bank Interleave in the bios is ENABLE by default.

I changed it to DISABLE and reran the Sandra memory bandwidth test, and the memory efficiency is slightly lower than with it set to ENABLE (80% vs 81%).

There are a whole bunch of memory timing parameters other than tCL-tRCD-tRP-tRAS, which I have set to AUTO. I have to look into them, but they are not crucial at this point.
 
Super Nade said:
hitechjb1:

What are the stable versions of the same results?
Also, I request you test if TCCD chips like high voltages or not.


/*Will delete this post if necessary*/


Regarding the TCCD memory, it is running well at 324 MHz 2.5-4-4-10 1T, 2.9 V. I tried raising the voltage to 3 V, and the sensitivity of frequency vs voltage showed no improvement. I would not raise the voltage further at this time.
 
Gautam said:
d'oh! I did mean 324MHz HTT. Neways question still applies. Have you tried a lower multi?

I *think* that the 81% efficiency is common for your speeds. The oddity with the 939 A64's is that the CPU speed actually bottlenecks the memory bandwidth, because the controller is loaded too heavily at low CPU speeds. Because of this, you will probably get similar bandwidth whether you run the memory at 324 or 292. If you were to reach 324MHz memory speed with the processor at 3.24 GHz, you would actually see more memory bandwidth, as the memory controller scales up with the CPU speed. It's an odd phenomenon that hasn't occurred in any type of architecture prior to the dual channel A64's: CPU speed actually determining memory bandwidth.


Tried the following:

HTT 324, CPU 2916 with x 9 multiplier, memory_efficiency = 81% (nominal)

HTT 300, CPU 2400 with x 8 multiplier, memory_efficiency = 73% (even worse)

HTT 250, CPU 2250 with x 9 multiplier, memory_efficiency = 80% (about same)

All with same timing 2.5-4-4-8 1T, bank_interleave ENABLE
memory_bus = HTT (1:1)
 
Yes, but that's bound to happen with low CPU speed.

Try 324x9 with the 183 (or 180) divider to get a CPU/10 ratio and then see what efficiency you get. I expect at least 85%, if not higher.

Edit: Let me rephrase. To maintain the same level of efficiency as memory speed rises, the CPU speed must increase along with it. So if you want a high efficiency at 324MHz memory speed, you will need a CPU speed of above 3.2 GHz in most cases.

Since you're running at "only" 2.9GHz, you will see much higher efficiency with the memory at 290MHz.

Most people run at around 250-270x10 on average, and see efficiency in the mid 80's. To go higher usually requires a CPU/11 divider from what I can gather.

Now because you were running a CPU/9 divider in both case 1 and case 3, the efficiencies were very similar. Case 2 used a CPU/8 divider, resulting in lower efficiency. Increase to CPU/10 and you should see higher efficiency, and about the same bandwidth level.
 
HTT = 318 MHz
CPU = 2862 MHz (with x9)

memory_HTT_ratio = 1:1, memory = CPU / 9 = 318 MHz, efficiency = 81%
memory_HTT_ratio = 9:10, memory = CPU / 10 = 286 MHz, efficiency = 85%
memory_HTT_ratio = 5:6, memory = CPU / 11 = 260 MHz, efficiency = 90%
etc

Possible explanation:

When the CPU is clocked faster (consuming faster), the memory controller is not fast enough to keep pace and provide enough data I/O with the L2 cache running in sync with the processor clock, hence resulting in more cache wait states relative to the memory controller and in turn lower efficiency (the actual bandwidth is still higher, but efficiency which is bandwidth per memory clock is reduced).
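
To put numbers on the parenthetical above, here is a rough sketch that turns the three measured efficiencies into estimated bandwidth. It assumes efficiency is measured bandwidth over the theoretical dual-channel DDR peak (2 transfers per clock, 8 bytes per channel, 2 channels). The function name is mine, and the resulting MB/s values are estimates derived from the percentages, not separate measurements.

```python
CPU_MHZ = 2862  # HTT 318 MHz with the x9 multiplier

# Memory divider -> efficiency, as reported in the table above.
REPORTED_EFFICIENCY = {9: 0.81, 10: 0.85, 11: 0.90}


def estimated_bandwidth(cpu_mhz, divider, eff):
    """Return (memory MHz, estimated MB/s) for one divider setting."""
    memory_mhz = cpu_mhz / divider
    peak_mb_s = memory_mhz * 2 * 8 * 2  # DDR x 8 bytes x dual channel
    return memory_mhz, eff * peak_mb_s


if __name__ == "__main__":
    for divider, eff in REPORTED_EFFICIENCY.items():
        mem, bw = estimated_bandwidth(CPU_MHZ, divider, eff)
        print(f"CPU/{divider}: memory {mem:.0f} MHz, "
              f"efficiency {eff:.0%}, about {bw:.0f} MB/s")
```

Under these assumptions the 1:1 setting still delivers the most bandwidth, even though it has the lowest efficiency.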

Will the revision E0 correct/improve this?
 
c627627 said:
Here's a question I always wanted to ask you and we can use this overclock as an example, in how many increments did you get to such a high overclock, I usually corrupt my Windows registry because of the 'too much too soon' impatient approach....

Notes on searching for "optimal" solution in an overclocking system

That is an interesting question. How to search for "optimal" solution in a system with many variables? In general, it can be very complicated, ....

Specific to A64 overclocking, I have been following what a Winchester can do, e.g. 2.5 GHz at ~1.4 V, 2.6 - 2.7 GHz at ~1.55 V, and above that it really depends. Further, I know what the G. Skill TCCD PC4400 can do, e.g. 280-300 MHz 2.5-3-3-7, and above 300 MHz it depends. With all these in mind,

I got to 2.5 GHz @ 1.4 V, HTT and memory to 278 MHz right away.

I then tried to see whether the memory and motherboard in combination could go to 300, 310, 320, ... MHz, pushing the CPU to 2700, 2790, 2880, ... MHz. In doing so, the Vcore was increased and the memory timing relaxed accordingly (observing the sensitivity of frequency vs voltage and memory timing).

I arrived at around 2.9 GHz with only a few steps with HTT and memory at 322 MHz. This can also be attributed to the stability of the motherboard, CPU and memory as a whole. Without a set of correct parts in combination, the search for high overclocking will usually lead to disappointment and lots of time spent.

Another important thing is to keep the non-essential parts and parameters under control to avoid uncertainty. E.g. the HT bus is not essential to CPU and memory overclocking, so it should be kept under its 1000 MHz specification by using an LDT_multiplier of 3 to avoid unnecessary issues. One can always push the HT bus frequency later, if needed.
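
As a quick check of the LDT_multiplier choice, the sketch below assumes the HT bus frequency is HTT times LDT_multiplier and that the bios offers integer multipliers 1 to 5. Both the candidate list and the function names are assumptions for illustration.

```python
HT_SPEC_MHZ = 1000  # HT bus specification mentioned above


def ht_bus_mhz(htt_mhz, ldt_multiplier):
    """HT bus frequency, assuming HT bus = HTT x LDT_multiplier."""
    return htt_mhz * ldt_multiplier


def max_ldt_multiplier(htt_mhz, candidates=(1, 2, 3, 4, 5), spec_mhz=HT_SPEC_MHZ):
    """Largest candidate multiplier that keeps the HT bus within spec."""
    within_spec = [m for m in candidates if ht_bus_mhz(htt_mhz, m) <= spec_mhz]
    return max(within_spec) if within_spec else None


if __name__ == "__main__":
    for htt in (200, 278, 322):
        m = max_ldt_multiplier(htt)
        print(f"HTT {htt} MHz: LDT x{m} -> HT bus {ht_bus_mhz(htt, m)} MHz")
```

At HTT 322 MHz a multiplier of 3 gives 966 MHz, while 4 would put the HT bus well past the specification.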

The more time consuming part is to try to get to the last 1% of the system or the last academic MHz for satisfaction (not much practical value).

In summary, since there are many variables, it really depends on experience, knowing what the components can do, observation and judgement, uncertainty avoidance, patience. This will help to optimize the system faster and to get to a better operating point.

For example, when one variable is changed slightly, one has to observe how the system behaves and then readjust that variable accordingly. This is sensitivity analysis, which deals with the change in system behavior in response to a (slight) change in system variables or parameters.
E.g.
change of frequency vs change of voltage, timing, or any setting
change of performance vs change of voltage, timing, or any setting
it may require component changes such as different memory modules

If the system behavior (frequency, benchmark performance) does not respond to a change in a variable, then do not push that variable further (a simple model).

Another technique is to use binary search instead of just a linear search, e.g. to search for best point between HTT = 250 to 270 MHz.
One can do a binary search over a range,
e.g. try the mid-point which is 260 MHz, if it fails then try the mid-point between 250 - 260 MHz which is 255 MHz, etc.
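
The binary search described above can be sketched as follows. This is only an illustration: `is_stable` is a hypothetical callback standing in for whatever stability test one runs at a given HTT (for example a priming run), and the sketch assumes the low end of the range is already known to be stable and that stability is monotonic, i.e. everything below a stable setting is also stable.

```python
def highest_stable_htt(low, high, is_stable, resolution=1):
    """Binary search for the highest stable HTT (MHz) in [low, high].

    Assumes `low` is stable and that any setting below a stable one
    is also stable. `is_stable(htt)` is a user-supplied test.
    """
    if is_stable(high):
        return high
    # Invariant from here on: `low` passes, `high` fails.
    while high - low > resolution:
        mid = (low + high) // 2
        if is_stable(mid):
            low = mid    # mid passed, search the upper half
        else:
            high = mid   # mid failed, search the lower half
    return low


if __name__ == "__main__":
    # Made-up stability limit of 257 MHz, for demonstration only.
    tried = []

    def fake_test(htt):
        tried.append(htt)
        return htt <= 257

    print("best:", highest_stable_htt(250, 270, fake_test))
    print("settings tried:", tried)
```

With a 250 to 270 MHz range this needs about five stability runs instead of up to twenty with a 1 MHz linear sweep, which matters when each run takes hours.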
 
To be very clear, what exactly does the E0 bring with it (and when)? Of these improvements, what is confirmed and what is suspected?

For example, off the top of my head - SSE3, improved memory controller, improved process, et cetera?

I ask this because your results are excellent for your application in the opinion of most people here I would think, and I am trying to think of what and how it might be easily improved. Pardon me if this is a poorly executed question, but perhaps it asks what I'm trying to ask.
 
Hello I.M.O.G.

• Improvements to the memory controller are considered the biggest update.

Also:

• Support for double sided DDR 400 DIMMs.
• Power savings that come with IBM strained silicon process.

Most importantly (for us)

• Extra layers of copper interconnects leaving us :drool: with solid expectations of higher overclocks.

P.S. As to when, latest is:
http://www.ocforums.com/showthread.php?t=364226

Before today it was :
c627627 said:
yah, confirmed no PR jump as of today.

Also, finally an answer re Venice:

The scenario from this September 17, 2004 thread about Winchesters will repeat itself with Venice:
http://www.ocforums.com/showthread.php?t=329781

In it, Winchester appeared in one or two shops, then the price skyrocketed and Winchesters remained scarce until weeks and weeks later.

So:

Venice will have full availability in Q3 2005 and limited availability (a store or two here & there) in Q2 2005...

so either place a pre-order from a reputable store like with Winchesters or buy on the day of release, do not wait or you'll likely pay more if you wish to get them before full Q3 2005 availability.
 