
A64 CPUs, chipsets, motherboards

August 2004 A64 AMD Publication # 30430 Revision: 3.37

AMD said:
Added AMD Sempron™ Processor and Mobile AMD Sempron™ Processor chapters.
Removed model 3100+ 62W OPN.
Added DTR Model 3700+ OPN.

Updated:
A64 CPU Models, OPN code, PR/frequency rating

Added 754 mobile DTR (1.5V)
3700+: AMA3700BEX5AR 1.5V (CG rev, F4Ah) <- ClawHammer, 1 MB L2, 2.4 GHz, x12
 
hitechjb1 said:
3000+: ADA3000AEP4AP 1.5V (C0 rev, F48h) <- ClawHammer, 512 KB L2, 2.0 GHz, x10 (512 KB L2 "disabled")
3200+: ADA3200AEP5AP 1.5V (C0 rev, F48h) <- ClawHammer, 1 MB L2, 2.0 GHz, x10
3400+: ADA3400AEP5AP 1.5V (C0 rev, F48h) <- ClawHammer, 1 MB L2, 2.2 GHz, x11
3000+: ADA3000AEP4AR 1.5V (CG rev, F4Ah) <- ClawHammer, 512 KB L2, 2.0 GHz, x10 (512 KB L2 "disabled")
3200+: ADA3200AEP5AR 1.5V (CG rev, F4Ah) <- ClawHammer, 1 MB L2, 2.0 GHz, x10
3400+: ADA3400AEP5AR 1.5V (CG rev, F4Ah) <- ClawHammer, 1 MB L2, 2.2 GHz, x11
3700+: ADA3700AEP5AR 1.5V (CG rev, F4Ah) <- ClawHammer, 1 MB L2, 2.4 GHz, x12

2800+: ADA2800AEP4AX 1.5V (CG rev, FC0h) <- NewCastle, 512 KB L2, 1.8 GHz, x9
3000+: ADA3000AEP4AX 1.5V (CG rev, FC0h) <- NewCastle, 512 KB L2, 2.0 GHz, x10
3200+: ADA3200AEP4AX 1.5V (CG rev, FC0h) <- NewCastle, 512 KB L2, 2.2 GHz, x11
3400+: ADA3400AEP4AX 1.5V (CG rev, FC0h) <- NewCastle, 512 KB L2, 2.4 GHz, x12

Before I order: am I understanding correctly that there are no 2.2 GHz "C0" revision 3200+ A64s? All stock 2.2 GHz 3200+ are "CG" to date (for desktops).

I just want to confirm.

Thanks
 
SunTzu69 said:
Before I order: am I understanding correctly that there are no 2.2 GHz "C0" revision 3200+ A64s? All stock 2.2 GHz 3200+ are "CG" to date (for desktops).

I just want to confirm.

Thanks

Currently, this seems to be the case.

3200+ 2.2 GHz is a NewCastle, which is by definition CG revision,
unless there are some hidden C0 ClawHammer with 1/2 L2 disabled to make a 3200+ 2.2 GHz (unlikely).

Side note:
3200+ 2.0 GHz can be C0 ClawHammer or CG ClawHammer.

But the safest bet is to make sure the part number is ADA3200AEP4AX for a 3200+ 2.2 GHz NewCastle (with max multiplier of 11).

Also IMO, the part number ADA3000AEP4AX for a 3000+ 2.0 GHz NewCastle (with max multiplier of 10) will do equally well, and save some money.
 
The August 2004 AMD A64 Publication # 30430 Revision: 3.37 also includes the following,
though these may not actually run in 64-bit mode (posted here anyway).

1. Mobile AMD Athlon™ XP-M Processor Family 15

New AY part definition, CG revision, socket 754
AHN2800BIX2AY 2800+ 1600 MHz 128 KB L2 1.4 V (F82h)
AHN3000BIX3AY 3000+ 1600 MHz 256 KB L2 1.4 V (F82h)

2. AMD Sempron™ Desktop Processor

AX part definition, CG revision, socket 754
SDA3100AIP3AX 3100+ 1800 MHz 256 KB L2 1.4 V (FC0h)

Interesting observation:
Same AX part definition as NewCastle, but with 256 KB L2 (1/2 L2 disabled ?) and 64-bit disabled

3. Mobile AMD Sempron™ Processor

New AY part definition, CG revision, socket 754
New LA part definition, CG revision, socket 754

SMN2600BIX2AY 2600+ 1600 MHz 128 KB L2 1.4 V (F82h)
SMN2800BIX3AY 2800+ 1600 MHz 256 KB L2 1.4 V (F82h)
SMN3000BIX2AY 3000+ 1800 MHz 128 KB L2 1.4 V (F82h)

SMS2600BOX2LA 2600+ 1600 MHz 128 KB L2 1.22 V (F82h)
SMS2800BOX3LA 2800+ 1600 MHz 256 KB L2 1.22 V (F82h)
 
How to identify the physical core of an A64

There are two ways to tell the physical core:

1. The part definition (the last two letters of the part number), such as AP, AR, AX, AW, AS, BI, ..., tells the physical core and revision.

2. Another way is to use CPUID, CPU-Z, GCPUID or the like to look at the CPUID EAX hex string (such as F48h, F4Ah, FC0h, FF0h, ...) coded in each CPU,
often shown as "family, model, stepping" in these utility programs.

It identifies the physical core regardless of whether part of the L2 cache, or the full 64-bit function, has been disabled.

AP = 00000F48h, ClawHammer 754, rev. SH7 C0, 130 nm SOI
AR = 00000F4Ah, ClawHammer 754, rev. SH7 CG, 130 nm SOI
AX = 00000FC0h, NewCastle 754, rev. DH7 CG, 130 nm SOI
AW = 00000FF0h, NewCastle 939, rev. DH7 CG, 130 nm SOI
AS = 00000F7Ah, SledgeHammer 939, rev. SH7 CG, 130 nm SOI
AK = 00000F58h, SledgeHammer 940, rev. SH7 C0, 130 nm SOI
AT = 00000F5Ah, SledgeHammer 940, rev. SH7 CG, 130 nm SOI
BI = 00010FF0h, Winchester 939, rev. DH8 D0, 90 nm SOI
BP = 00020FF0h, Venice 939, rev. DH8 E3, 90 nm SOI DSL
BN = 00020F71h, San Diego 939, rev. SH8 E4, 90 nm SOI DSL
BW = 00020FF2h, Venice 939, rev. DH8 E6, 90 nm SOI DSL
CD = 00020F72h, Toledo 939, rev. JH8 E6, 90 nm SOI DSL
BV = 00020FB1h, Manchester 939, rev. BH8 E4, 90 nm SOI DSL
CS = 00020F32h, Windsor AM2 940, rev. JH F2, 90 nm SOI DSL (2x1 MB L2)
CU = 00020FB2h, Windsor AM2 940, rev. BH F2, 90 nm SOI DSL (2x512 KB L2)
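
To make the two identification methods concrete, here is a minimal Python sketch. The lookup table copies a few rows from the list above; the helper names (`decode_cpuid_eax`, `core_from_opn`) are made up for illustration, and taking the last two characters of the part number as the part definition is an assumption based on the OPNs quoted in this thread. The bit layout is the standard CPUID function 1 EAX layout (stepping in bits 3:0, model in 7:4, family in 11:8, extended model in 19:16, extended family in 27:20).

```python
# Part definition -> (CPUID EAX value, core), copied from the list above.
PART_DEFINITIONS = {
    "AP": (0x00000F48, "ClawHammer 754, rev. SH7 C0"),
    "AR": (0x00000F4A, "ClawHammer 754, rev. SH7 CG"),
    "AX": (0x00000FC0, "NewCastle 754, rev. DH7 CG"),
    "AW": (0x00000FF0, "NewCastle 939, rev. DH7 CG"),
    "BI": (0x00010FF0, "Winchester 939, rev. DH8 D0"),
    "BP": (0x00020FF0, "Venice 939, rev. DH8 E3"),
}


def decode_cpuid_eax(eax):
    """Split a CPUID function 1 EAX value into its standard bit fields."""
    return {
        "stepping": eax & 0xF,
        "base_model": (eax >> 4) & 0xF,
        "base_family": (eax >> 8) & 0xF,
        "ext_model": (eax >> 16) & 0xF,
        "ext_family": (eax >> 20) & 0xFF,
    }


def core_from_opn(opn):
    """Look up (EAX, core) from the last two characters of a part number.

    Returns None when the part definition is not in the table.
    """
    return PART_DEFINITIONS.get(opn[-2:].upper())


# Method 1: read the part definition off the part number.
print(core_from_opn("ADA3200AEP4AX"))

# Method 2: decode the EAX string reported by CPU-Z or similar.
print(decode_cpuid_eax(0x00010FF0))
```

The table is deliberately partial; the remaining rows from the list can be added the same way.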


Some physical information about the various cores

ClawHammer 754 3200+/3400+/3700+:
rating = 2000/2200/2400 MHz, 1.5V, 57.4 A, 89 W
130 nm SOI, 105.9 million transistors, 193 mm^2, 1MB L2

ClawHammer 939 FX-53/4000+:
rating = 2400/2400 MHz, 1.5V, 57.4 A, 89 W
130 nm SOI, 105.9 million transistors, 193 mm^2, 1MB L2

NewCastle 754 2800+/3000+/3200+:
rating = 1800/2000/2200 MHz, 1.5V, 57.4 A, 89 W
130 nm SOI, 68.5 million transistors, 144 mm^2, 512 KB L2

NewCastle 939 3500+/3800+:
rating = 2200/2400 MHz, 1.5V, 57.4 A, 89 W
130 nm SOI, 68.5 million transistors, 144 mm^2, 512 KB L2

ClawHammer 939 FX-55:
rating = 2600 MHz, 1.5V, xxx A, 104 W
130 nm SOI (DSL), 105.9 million transistors, 193 mm^2, 1MB L2

Winchester 939 3000+/3200+/3500+
rating = 1800/2000/2200 MHz, 1.4V, 45.8 A, 67 W
90 nm SOI, 68.5 million transistors, 83 mm^2, 512 KB L2

Venice 939 3000+/3200+/3500+/3800+
rating = 1800/2000/2200/2400 MHz, 1.35/1.4V, 67 W (3000+/3200+/3500+), 89 W (3800+)
90 nm SOI (DSL), 68.5 (?) million transistors, 83 (?) mm^2, 512 KB L2

San Diego 939 4000+
rating = 2400 MHz, 1.35/1.4V, 89 W
90 nm SOI (DSL), 105.9 (?) million transistors, 115 (?) mm^2, 1 MB L2

Manchester X2 939 3800+/4200+/4600+
rating = 2000/2200/2400 MHz, 1.35/1.40V, 89 W
90 nm SOI (DSL), 154 million transistors, 147 mm^2, 2x512 KB L2

Toledo X2 939 4400+/4800+
rating = 2200/2400 MHz, 1.35/1.40V, 110 W
90 nm SOI (DSL), 233.2 million transistors, 199 mm^2, 2x1 MB L2

SanDiego Opteron 939 144/146/148/150/152
rating = 1800/2000/2200/2400/2600 MHz, 1.35/1.4V, 67/85/104 W
90 nm SOI (DSL), 105.9 (?) million transistors, 115 (?) mm^2, 1 MB L2

Toledo (Denmark) Opteron 939 165/170/175
rating = 1800/2000/2200 MHz, 1.35/1.40V, 110 W
90 nm SOI (DSL), 233.2 million transistors, 199 mm^2, 2x1 MB L2

Windsor AM2 940 X2 3800+/4200+/4600+/5000+
rating = 2000/2200/2400/2600 MHz, 1.30/1.35V, 65/89 W
90 nm SOI (DSL), 154 million transistors, 183 (?) mm^2, 2x512 KB L2

Windsor AM2 940 X2 4000+/4400+/4800+/FX-62
rating = 2000/2200/2400/2800 MHz, 1.30/1.35V, 65/125 W
90 nm SOI (DSL), 227 million transistors, 230 (?) mm^2, 2x1MB L2

Some PR ratings may not be listed for each core.


940, 754, 939, AM2 940 CPU models and specifications (post 5)
 
Wow and i mean wow very nice posts, just a bit nitpicky here but you posted your sig 40 some times ;) and a printable version would be very nice. Good work and keep it coming
 
Many of us want to know more about 90 nm silicon technology, SOI, strained silicon, .... Listed here are some links and information about them. Will add more over time.


Some links about latest silicon technology, Silicon on Insulator (SOI), Strained Silicon (SS)

In regular silicon, atoms are spaced a certain distance apart, determined by the silicon lattice.

In strained silicon, silicon is deposited onto a substrate (such as silicon germanium) whose atoms are spaced farther apart in the lattice than those in a regular silicon lattice. Since atoms tend to align with one another, the top silicon atoms are stretched, or strained, to align with the atoms in the stretched lattice underneath.

In a strained silicon lattice, electrons flow with less resistance and up to 70% faster, which in turn can lead to 35% faster chips without scaling down the size of transistors (numbers quoted from IBM).

Strained Silicon (SS) can be built on top of Silicon on Insulator (SOI); the two are not mutually exclusive. Intel, IBM, AMD, ... are building 90 nm chips using both SS and SOI in various ways. IBM calls it SSDOI (Strained Silicon Directly on Insulator).


Dec 13, 04, "AMD, IBM Announce Semiconductor Manufacturing Technology Breakthrough"
http://www.amd.com/us-en/Corporate/VirtualPressRoom/0,,51_104_543~91999,00.html
from above article said:
“The new strained silicon process, called Dual Stress Liner, enhances the performance of both types of semiconductor transistors, called n-channel and p-channel transistors, by stretching silicon atoms in one transistor and compressing them in the other. The dual stress liner technique works without the introduction of challenging, costly new production techniques, allowing for its rapid integration into volume manufacturing using standard tools and materials.”

http://www.infoworld.com/article/04/12/13/HNibmamdsilicon_1.html
above article from infoworld (12/13/04) said:
...
As it has become more difficult for chip companies to improve transistor performance by simply shrinking transistors, they have turned to alternative techniques to keep improving the performance of their products. Strained silicon is a technique in which a lattice pattern of silicon atoms is either stretched or compressed to improve the speed at which electrons flow through the silicon. Positive transistors run faster when they are compressed, and negative transistors run faster when they are stretched.

However, strained silicon also works in reverse to the detriment of the transistors. Compressing the silicon atoms reduces the performance of negative transistors, while stretching the silicon impedes the performance of positive transistors, said Nick Kepler, vice president of logic technology development with AMD.

In order to get optimal performance from each type of transistor, IBM and AMD created the compressive strain on the silicon wafer with a film of silicon nitride and then removed the film from just the negative transistors, said Lisa Su, vice president of technology development and alliances with IBM. The companies then created the tensile, or stretched, strain on the wafer and removed that layer from the positive transistors, allowing both types of strain to exist side-by-side on the companies' chips, she said.
...

The companies believe that by using their strained silicon techniques on both positive and negative transistors they can improve transistor speed by as much as 24 percent, the statement said.

The strained silicon technology will be integrated into AMD's Opteron and Athlon 64 processors and IBM's Power processors in the first half of 2005.

IBM is planning to use the dual stress liner technology on all of its 90nm products starting next year, Su said. AMD will introduce the technology selectively on both 90nm and 130nm products next year and has already done so with the Athlon FX-55 desktop processor that was introduced earlier this year, Kepler said.
...

Conventional strained silicon on insulator is "singly stressed" only. The dual stress liner applies both kinds of strain, "stretched" on the NFET and "compressed" on the PFET, to achieve further speed improvement.

Apparently, according to the above Infoworld article, the "dual stress liner" has already been used in the 130 nm FX-55, and is going to be adopted in other processors such as Opteron and other A64s in the first half of 2005.

IBM to Mix SOI with Strained Silicon
New Manufacturing Process [for AMD and NVIDIA] on the Way
http://www.xbitlabs.com/news/other/display/20030911140127.html

About SSDOI and HOT (Hybrid Orientation Technology)
http://www.compoundsemiconductor.net/articles/news/7/9/10/1


SOI paper - A simple and clearly written article about silicon chips, traditional silicon process (bulk) and SOI, and the advantages of SOI on speed, power and soft error rate.

Strained Silicon article from IBM
Strained Silicon article from Intel

Pictures and videos about Strained Silicon on SOI
http://domino.research.ibm.com/comm/bios.nsf/pages/cmos-perf.html


How does leakage current slow down future generations of chips (page 19)
 
I lost this post due to an editing mistake.
12/13/04: reassembled and rewrote the post. It is even better.
story: http://www.ocforums.com/showthread.php?t=348445

Low PR 90 nm 939 (Sept 2004)

90 nm 939 Winchester 3000+/3200+/3500+ 512 KB L2

These 90 nm 939 CPUs with the new D0 revision should run cooler and perform better than a 130 nm NewCastle, and in most cases even a 754 ClawHammer, at the same clock frequency.

For a new build, a 939 Winchester system should be a better choice than a 754 system with a NewCastle, or even a 754 ClawHammer. Pricewise, a 90 nm 939 system is also as cost effective as a contemporary 754 system (with the same PR). A 90 nm 939 3000+/3200+ Winchester should be a good choice for a cost effective, high performance, high bandwidth, overclocking A64 system with AGP or PCI-e. In short, the pros for a 939 system are
- better coverage of memory intensive applications such as scientific and video rendering programs
- future compatibility and uniformity: the option to upgrade only the CPU (including 90 nm revision E with SSE3, dual core) or only the motherboard down the road (not necessary to change both)

To summarize:
939_performance >= 754_performance (at same frequencies of CPU, memory, HT)
939_memory_intensive_performance >> 754_memory_intensive_performance (workstation, scientific programs, video processing, ...)
939_90nm >= 939_130nm (at same CPU frequency)
939_price_performance ~ 754_price_performance

Currently there are fewer 939 motherboards to choose from than 754 ones. New motherboards with chipsets supporting PCI-express (such as Nforce4 from Nvidia, K8T890 from VIA) will be available towards the end of 2004. Another possibility is to build a 90 nm 939 system (e.g. 3200+) on a 939 motherboard with AGP, and upgrade to PCI-e a few months later if needed.
Nforce4 chipsets with PCI-e, SLI features

The test results at this link show that a 90 nm Winchester (3200+) performed better than a 130 nm 939 (3500+), and in many cases better than a 130 nm 3400+ with its larger 1 MB L2, when clocked at the same frequency, by less than 1% to a few % over a range of benchmarks.
http://www.madshrimps.be/?action=getarticle&articID=230

Review of 90 nm 939 from Anandtech
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2242

Rumor is that a new revision E 90 nm 939 (Venice ?) with SSE3 support is coming (2005 ?), probably with strained silicon enhancement.
About Rev E and SSE3 instructions

Dual-core A64 in 90 nm is expected to debut in 2005.


939 Price performance system (added Sept 2004)

Options:
1. Need to build a system now and have an AGP video card: It is true that 754 has more choice of motherboards, but the MSI K8N Neo2 Platinum, though it may not be perfect in every aspect, is a viable choice with a 90 nm Winchester 3000+/3200+ for a current build (the 3200+ is preferred for its flexibility in setting memory and for allowing a lower HTT when overclocking the CPU, assuming both 3000+/3200+ are close in manufacturing date).

2. Using a PCI-e video card: If not reusing an existing AGP video card, it is better to wait for an NF4 PCI-e board (either SLI or Ultra) so the latest PCI-e video cards can be used down the road.

- A64 939 3000+ (x9) or 3200+ (x10) 90 nm 512 KB L2 Winchester
.... preferably week after 0444, e.g. 0448 CBBHD

- two choices for motherboards: AGP or PCI-e
.... AGP based motherboard: if staying with AGP video card, Nforce3 Ultra motherboard can be used, e.g. MSI K8N Neo2 Platinum
.... PCI-e based motherboard:
........ ASUS Nforce4 A8N SLI-deluxe
........ MSI Neo4 Diamond (Nforce4 SLI), MSI Neo4 Platinum (Nforce4 Ultra)
........ DFI LanParty UT Nforce4 SLI, DFI LanParty UT Nforce4 Ultra
........ PCI-e video card, e.g. Nvidia 6800 GT/6600 GT, or ATI X800/X700

- 2x512 MB DDR500+ dual channel or overclock equivalent,
e.g. modules w/ Samsung TCCD DRAM chips
G. Skill PC4400/4800, PQI Turbo PC3200, OCZ rev.2 Platinum, Corsair 4400C25
(from low 230 MHz with 2/2.5-2/3-2-x 1T, to 250 MHz with 2.5-3-3-x 1T, to 280/300 MHz 3-3/4-3-x 1T at 2.x V)

- SLK-948U or XP-90 or XP-120 (check for motherboard compatibility first)

e.g. 3200+ Winchester, x10, HTT >= 250 MHz, memory (1:1) >= 250 MHz 1T, CPU >= 2.5 GHz (to 3 GHz).
A 3200+ with x10 multiplier is more flexible than a 3000+ with x9 (x9 is doable) for setting up CPU (between 2.5 - 3 GHz), HTT and memory bus.

Excerpt from
Typical Overclocking Systems for 754, 939 (post 2)

Some overclocking scenarios for 939 Winchester

A64 Nforce4 939, Nforce3 754, 939 Motherboards (post 11)


939 Winchester 3000+ vs 3200+

One can run the memory bus frequency slower than the HTT with minimal impact on memory performance.
For example, assume the bios only offers memory_HTT_ratio settings of 1:1, 5:6, 2:3:
For 3000+, max multiplier = 9, memory_divider available = 9, 11
For 3200+, max multiplier = 10, memory_divider available = 9, 10, 11, 12, 15

So if the CPU clock frequency is 2500 MHz,
one would get memory at 277 or 227 MHz with a 3000+ (x9 max),
one would get memory at 277, 250, 227, 208, 167 MHz with a 3200+ (x10 max).


As can be seen, the 3200+ allows more flexible matching of memory frequency to the given memory modules. In addition, one can still get a high CPU overclock in case the motherboard and system cannot handle a high HTT for whatever reason. Say HTT is stuck under 260 MHz: with a 3200+, one can still get 2.60 GHz with the x10 multiplier, but with a 3000+, the highest CPU overclock would be limited to 2.34 GHz.
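
The arithmetic behind these numbers can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the divider lists are simply the ones quoted above, not derived from any bios.

```python
def memory_mhz(cpu_mhz, memory_divider):
    """Memory bus frequency = CPU frequency / CPU_memory_divider."""
    return cpu_mhz / memory_divider


def max_cpu_mhz(htt_mhz, max_multiplier):
    """Highest CPU frequency reachable when HTT is the limiting factor."""
    return htt_mhz * max_multiplier


# Dividers as listed above for a bios with 1:1, 5:6, 2:3 ratios.
DIVIDERS = {
    "3000+ (x9 max)": [9, 11],
    "3200+ (x10 max)": [9, 10, 11, 12, 15],
}

for model, dividers in DIVIDERS.items():
    options = [round(memory_mhz(2500, d), 1) for d in dividers]
    print(model, "memory options at 2500 MHz CPU:", options)

# HTT stuck at 260 MHz: compare the x10 and x9 ceilings.
print("3200+ ceiling:", max_cpu_mhz(260, 10), "MHz")
print("3000+ ceiling:", max_cpu_mhz(260, 9), "MHz")
```

The post quotes the memory frequencies as whole MHz, so the fractional parts printed here differ slightly from the figures in the text.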

On the other hand, in terms of budget and price-performance, one can argue that a 3000+ is a better choice. Which CPU can potentially overclock higher is the luck of the draw, since the headroom above the stock frequency specification varies randomly from chip to chip.

If the motherboard and memory modules used can handle a high HTT (to 300 - 330+ MHz, such as DFI Nforce4) in combination with a high memory bus frequency (such as TCCD based modules), and especially if the motherboard and bios provide a wide range of memory_HTT_ratio (more than 1:1, 5:6, 2:3, 1:2, such as the DFI Nforce3 and Nforce4 boards), then a 3000+ would be almost as good as a 3200+ on air, as the x9 multiplier of the 3000+ alone would not be a barrier to 2.7 - 3.0+ GHz.

The above argument assumes CPU's are from similar week/stepping. In many cases, newer CPU (more recently dated) may be preferred, especially if supported by results and statistics, probably due to some process, yield improvements or some not-yet-known reasons.

Overclocking setting for various bus frequencies (post 8)

Relationship between CPU_memory_divider and CPU_multiplier, memory_HTT_ratio
How to determine memory bus frequency
(post 60)

Memory bus frequency setting, SYNC/ASYNC mode


939 Winchester 3000+/3200+ vs 3500+

3000+ is rated at 1.8 GHz, 3200+ is rated at 2.0 GHz, 3500+ is rated at 2.2 GHz for the 90 nm 939 CPU's.

Statistically, a 3500+ should overclock slightly higher than a 3200+, because the frequency distribution of the 3500+ is centered higher than that of the 3200+, but the two overlap. Depending on the process maturity and yield, that difference can be pretty small, and it gets smaller as the chip process matures and the two distributions overlap to a greater extent. Also, the small statistical difference may not justify the higher price of the 3500+ in terms of MHz/$.

Further, since most of us only get one or very few such CPUs, randomness matters more than the statistical behavior of large samples, so there is no guarantee that the 3500+ one gets can be clocked higher than a 3200+/3000+ when they are overclocked a few hundred MHz above their rated clock frequency.
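
To put a rough number on "overlapping distributions", here is a small Python sketch. Everything in it is an assumption for illustration: the max overclock of each model is treated as an independent normal distribution, and the means and spreads are made-up figures, not measurements.

```python
import math


def prob_a_beats_b(mean_a, sd_a, mean_b, sd_b):
    """P(A > B) for independent normal variables A and B.

    A - B is normal with mean (mean_a - mean_b) and
    variance (sd_a^2 + sd_b^2), so P(A > B) is the normal CDF at z.
    """
    z = (mean_a - mean_b) / math.sqrt(sd_a ** 2 + sd_b ** 2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical numbers: the 3500+ population is centered 50 MHz higher,
# and both populations have a 100 MHz spread.
p = prob_a_beats_b(mean_a=2600, sd_a=100, mean_b=2550, sd_b=100)
print(f"Chance a single 3500+ beats a single 3200+: {p:.0%}")
```

With a small gap between the centers relative to the spread, the result stays not far above a coin flip, which is the point made above about buying a single chip.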

To make a decision, one has to settle on an objective.

1. If the objective is the highest frequency (even just 1 MHz) at rated frequency, regardless of price, then a 3500+, rated at 2200 MHz, beats a 3200+, rated at 2000 MHz, with 100% certainty (assuming neither CPU is defective, otherwise it can be returned). The 3500+ would be the choice.

2. But if the objective is the highest overclocking frequency, a few hundred MHz over rated frequency, then the 3500+ still has a slightly higher probability of winning than a 3200+. But that margin is small. There is no guarantee that a single 3500+ can be overclocked higher than a single 3200+ at the 2500-2600 MHz level, unless one is dealing with a large sample of 3500+ and 3200+.

3. For most people selecting a computer or component, a common metric is price/performance, or incremental price/performance.

Let's assume a 3500+ can be overclocked 100 MHz higher (may not be the case).

E.g. (prices may not be up-to-date, for illustration only)
- 3200+ $200 overclocked to 2500 MHz, $0.08 / MHz
- 3500+ $300 overclocked to 2600 MHz, $0.115 / MHz, which is 44% higher in price per MHz.

The incremental price/performance for an overclocked 3500+ would be $100/100 MHz or $1/ MHz !!! (very high)

In price-performance terms, it would be better to save that $100 for better memory or, for gaming, more importantly, a better video card.
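
The same comparison as a short Python sketch, using the illustrative prices and overclocks from the example above (the function names are made up for this post):

```python
def dollars_per_mhz(price, mhz):
    """Average cost of each overclocked MHz."""
    return price / mhz


def incremental_dollars_per_mhz(price_low, mhz_low, price_high, mhz_high):
    """Cost of each extra MHz gained by buying the more expensive CPU."""
    return (price_high - price_low) / (mhz_high - mhz_low)


cost_3200 = dollars_per_mhz(200, 2500)
cost_3500 = dollars_per_mhz(300, 2600)

print(f"3200+: ${cost_3200:.3f} / MHz")
print(f"3500+: ${cost_3500:.3f} / MHz")
print(f"3500+ costs {cost_3500 / cost_3200 - 1:.0%} more per MHz")
print(f"Incremental: ${incremental_dollars_per_mhz(200, 2500, 300, 2600):.2f} / MHz")
```

Plugging in current street prices and your own expected overclocks gives the comparison for any other pair of CPUs.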


939 vs 754 Performance (with same L2 size)

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2249&p=1
The above article also has a section comparing a 130 nm 939 3800+ and a 130 nm 754 3400+, both running at 2.4 GHz and both with 512 KB L2. It gave the 939 dual channel CPU an average performance advantage of a few % over the single channel CPU on different kinds of programs, including A/V encoding (4.4%), games (6.3%), video creation/photo editing (4.2%), 3D rendering (5.4%), multi-tasking content creation (3.2%), and business/general use (5.4%) applications. In the workstation tests, the 939 was ahead by as much as 17%.

Performance analysis of various A64 systems (including Barton, P4's) (post 7)


Related links:

IBM PowerPC 970FX power envelope and power management

Some links about latest silicon technology, Silicon on Insulator (SOI), Strained Silicon (SS)

Going from 130 nm to 90 nm, the dimensions of the transistors and metal wires in a chip are shrunk; this is called MOS scaling. MOS scaling is described in more detail here:
MOS scaling, voltage, power and leakage current
 
Nforce4 chipsets with PCI-e, SLI features

There are three versions of the Nforce4 chipsets from nVidia, namely the Nforce4, Nforce4 Ultra, Nforce4 SLI.

Nforce4, Nforce4 Ultra and Nforce4 SLI have separate specifications and are implemented as separate MCPs (media and communication processors), namely the nForce4 MCP, nForce4 Ultra MCP and nForce4 SLI MCP respectively. SLI stands for Scalable Link Interface. The corresponding chipsets are different, and hence so is the cost of the associated motherboards.

Nforce4 is the basic chipset. Nforce4 Ultra is for a single GPU solution, or dual GPU with reduced performance (~10% ? less than the full SLI version). Nforce4 SLI can configure the PCI-e lanes to support more than one GPU/video card.

The SLI version motherboards have 2 x16 PCI-e slots for dual video cards, each running at x8 bandwidth. Some Ultra version motherboards have dual PCI-e slots (1 x16, plus 1 x2/x4/x16) for dual video cards running at x8 and x2 bandwidth respectively.

Nforce4 (A02 revision ?)
PCI Express Support Yes
SATA Support 1.5 Gb/s
NVIDIA RAID Yes
NVIDIA Firewall Yes
NVIDIA Gigabit Ethernet Yes

Nforce4 Ultra (A03 revision ?)
PCI Express Support Yes
SATA Support 3 Gb/s
NVIDIA RAID Yes
NVIDIA ActiveArmor Yes
NVIDIA Gigabit Ethernet Yes

Nforce4 SLI (A03 revision ?)
NVIDIA SLI Support Yes
PCI Express Support Yes
SATA Support 3 Gb/s
NVIDIA RAID Yes
NVIDIA ActiveArmor Yes
NVIDIA Gigabit Ethernet Yes

Main new features (in addition to Nforce3):
- PCI-e support
- SLI option (single PCI-e x16 or dual PCI-e x8 slots for dual PCI-e video cards)
- SATA2 with data transfer rate up to 3 Gb/s (2x that of SATA1)
- RAID can be shared among SATA2 and IDE as a single RAID (Nforce3 250Gb has this for SATA1 and IDE)
...


Nforce4, Nforce4 Ultra, Nforce4 SLI from nVidia,
http://www.nvidia.com/page/nforce4_family.html

Breaking the SLI "Code"
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2322

SLI Bridge Available in Retail. Turn Your nForce4 Ultra into nForce4 SLI
http://www.xbitlabs.com/news/video/display/20050314182648.html

Dual Bridge System (DBS) from MSI:
NVIDIA nForce4 Ultra based K8N Neo4 Platinum mainboard running dual NVIDIA 6600GT MSI NX6600GT-VTD128 3D accelerators in SLI mode.
http://www.hexus.net/content/reviews/review.php?dXJsX3Jldmlld19JRD05NTc=
http://www.msi.com.tw/program/products/mainboard/mbd/pro_mbd_detail.php?UID=637

Dual Xpress Graphics (DXG) mode in DFI UT Nforce4 Ultra-D:
http://www.dfi.com.tw/Press/press_h...&TITLE_ID=4890&LINKED_URL=arch344.jsp&SITE=NA

Comparing performance of single and dual SLI video cards using ASUS A8N SLI-deluxe, 3800+, 6800 GT, OCZ 2x512 MB 3200 LL 2-2-2-5
http://www.tweaktown.com/document.php?dType=article&dId=740


A64 Nforce4 939 Motherboards

From Tom's Hardware: NVIDIA Rushes Into PCI Express With nForce4
http://www.tomshardware.com/motherboard/20041020/index.html

An article from Anandtech on the various Nforce4 chipsets, reference boards, with PCI-e, SLI, ... features for the A64.
It includes interesting comparisons of the 6600GT and 6800GT in single and SLI modes, ....
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2248
http://www.anandtech.com/video/showdoc.aspx?i=2097
 
FX-55 and 4000+ (Oct 2004)

The 4000+ is basically an FX-53 with the multiplier locked above x12. It has dual channel memory and 1 MB L2, and is rated at 2.4 GHz, in 130 nm SOI technology.

The FX-55 has dual channel memory and 1 MB L2, and is rated at 2.6 GHz, in strained silicon and 130 nm SOI technology.

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/31366.pdf

http://www.hardocp.com/article.html?art=Njc1

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2249&p=1
The article also has a section comparing a 939 3800+ and a 754 3400+, both running at 2.4 GHz and both with 512 KB L2. It gave the 939 dual channel CPU an average performance advantage of a few % over the single channel CPU on different kinds of programs, including A/V encoding (4.4%), games (6.3%), video creation/photo editing (4.2%), 3D rendering (5.4%), multi-tasking content creation (3.2%), and business/general use (5.4%) applications. In the workstation tests, the 939 was ahead by as much as 17%.

http://www.nforcershq.com/modules.php?name=News&file=article&sid=2072



940, 754, 939 CPU models and specifications (post 5)
 
AMD Publication 30430 revised Oct 2004:

Added specification for 90 nm 939 3000+, 3200+, 3500+.
Added specification for FX-55.
 
Recently, there has been a lot of discussion about what PSU rating is needed to power an A64 system, such as one based on a 939 (Winchester and follow-on), plus video card(s) in single/dual modes, ....

Presented here are some calculations and estimates of how much total power and current are needed for an overclocked Winchester, current video cards in single/dual modes, drives, .... Some recent PSUs are also listed.

It is taken from a post in A64 CPUs, chipsets, motherboards


PSU rating estimate for some 939 CPU and systems

The 12V current ratings of some existing PSUs are
Fortron 530 = 18 A
Antec TP 480 (old version) = 22 A
Antec TP 550 (old version) = 24 A

I think the above PSUs should provide enough current for overclocking a 939 NewCastle/Winchester with a 9700/9800 video card (at least just for testing the CPU), but may be tight for an entire system with lots of other components and the latest video cards. Read the entire post for details.

This discussion is based on the current rating aspect of the PSU; if other, unknown issues with certain PSUs prevent a high CPU overclock, they have to be looked into separately.

NewCastle 939 3500+/3800+:
rating = 2200/2400 MHz, 1.5V, 57.4 A, 89 W

Winchester 939 3000+/3200+/3500+:
rating = 1800/2000/2200 MHz, 1.4V, 45.8 A, 67 W

Venice 939 3000+/3200+/3500+:
rating = 1800/2000/2200 MHz, 1.35/1.40 V, 67W
Venice 939 3800+:
rating = 2400 MHz, 1.35/1.40 V, 89 W

San Diego 939 4000+:
rating = 2400 MHz, 1.35/1.40 V, 89 W

Using a 939 NewCastle 3500+ (rated 2200 MHz) overclocked to 2600 MHz, assuming voltage to 1.65 V
power ~ 127.3 W
current ~ 77.1 A

Using a 939 Winchester 3200+ (rated 2000 MHz) overclocked to 2600 MHz, assuming voltage to 1.55 V
power ~ 106.8 W
current ~ 68.9 A

Using a 939 Winchester 3000+ (rated 1800 MHz) overclocked to 2600 MHz, assuming voltage to 1.55 V
power ~ 118.6 W
current ~ 76.5 A
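
The three estimates above are consistent with scaling the rated power by (V_oc/V_rated)^2 x (f_oc/f_rated) and then dividing by the overclocked voltage to get current. That scaling rule is inferred from the numbers rather than spelled out in the post, so treat this Python sketch as an assumption-laden check of the arithmetic:

```python
def overclock_power(p_rated, v_rated, f_rated, v_oc, f_oc):
    """First-pass estimate: power scales with voltage squared and frequency."""
    return p_rated * (v_oc / v_rated) ** 2 * (f_oc / f_rated)


def overclock_current(p_rated, v_rated, f_rated, v_oc, f_oc):
    """Current drawn at the overclocked voltage (I = P / V)."""
    return overclock_power(p_rated, v_rated, f_rated, v_oc, f_oc) / v_oc


# (name, rated W, rated V, rated MHz, overclocked V, overclocked MHz)
CASES = [
    ("NewCastle 3500+", 89, 1.5, 2200, 1.65, 2600),
    ("Winchester 3200+", 67, 1.4, 2000, 1.55, 2600),
    ("Winchester 3000+", 67, 1.4, 1800, 1.55, 2600),
]

for name, p, v, f, v_oc, f_oc in CASES:
    watts = overclock_power(p, v, f, v_oc, f_oc)
    amps = overclock_current(p, v, f, v_oc, f_oc)
    print(f"{name}: ~{watts:.1f} W, ~{amps:.1f} A")
```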

These are approximations, since the actual breakdown of the power and current numbers for the various 3000+/3200+/3500+/3800+ is not known. Taking the active and leakage components of power and current into account, the estimate is further refined as follows.

Overclocking current and power refinement:

By adjusting for standby leakage and I/O current into the max power and current estimation:
Halt/Stop Grant at Min P-State
IDDC1 max = 9.3 A => I_standby = 9.3 A
I/O power = 2.9 W => I_I/O = 2.07 A

As total power = active power + standby power,
the rated current is broken down as
IDD_rated = 47.2 A
IDD_active = IDD_rated - I_standby = 47.2 - 9.3 = 37.9 A
I_I/O = 2.07 A
checking: (IDD_rated + I_I/O) x V_rated = P_rated
(47.2 + 2.07) x 1.4 = 69.0 W

I_overclock = I_standby + (I_I/O + IDD_active) (V_oc/V_rated) (f_oc/f_rated)

Using a 939 Winchester 3200+ (rated 2000 MHz) overclocked to 2600 MHz, assuming voltage to 1.55 V
current ~ 9.3 + (2.07 + 37.9) (1.55/1.4) (2600/2000) = 66.8 A
power ~ 66.8 * 1.55 = 103.5 W

Using a 939 Winchester 3000+ (rated 1800 MHz) overclocked to 2600 MHz, assuming voltage to 1.55 V
current ~ 9.3 + (2.07 + 37.9) (1.55/1.4) (2600/1800) = 73.2 A
power ~ 73.2 * 1.55 = 113.5 W

So when the standby current adjustment is taken into account, the overclocking power and current are reduced (by about 3-4%).
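
The refined estimate can be checked with a few lines of Python. This sketch uses the formula as it is applied in the worked numbers above (only the active and I/O currents are scaled, the standby current is added unscaled); the function name and default arguments are made up here, with the defaults taken from the Winchester figures in this post.

```python
def refined_overclock_current(v_oc, f_oc, f_rated,
                              i_standby=9.3, i_io=2.07, idd_active=37.9,
                              v_rated=1.4):
    """Standby current stays fixed; active + I/O current scale with V and f."""
    scaled = (i_io + idd_active) * (v_oc / v_rated) * (f_oc / f_rated)
    return i_standby + scaled


for name, f_rated in (("3200+", 2000), ("3000+", 1800)):
    amps = refined_overclock_current(v_oc=1.55, f_oc=2600, f_rated=f_rated)
    watts = amps * 1.55
    print(f"Winchester {name} at 2600 MHz, 1.55 V: ~{amps:.1f} A, ~{watts:.1f} W")
```

The power figures can differ from the ones above in the last digit, because the post multiplies the already rounded current by 1.55 V.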

Among these 939 CPUs, taking the 3200+ Winchester estimate of 67 A, 104 W, and assuming a regulator efficiency of 80% (i.e. a factor of 1/0.8 = 1.25), the current and power requirement on the 12 V rail should be
current requirement for CPU = 67 (1.55 / 12) 1.25 = 10.8 A (on 12 V rail)
power requirement for CPU = 104 (1.25) = 130 W
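
The conversion from CPU core current to 12 V rail current, as a small Python sketch (function names are made up; the 80% regulator efficiency is the assumption stated above):

```python
def rail_current(cpu_current, vcore, efficiency=0.8, rail_voltage=12.0):
    """Current drawn from the 12 V rail to deliver cpu_current at vcore."""
    return cpu_current * vcore / rail_voltage / efficiency


def rail_power(cpu_power, efficiency=0.8):
    """Power drawn from the PSU once regulator losses are included."""
    return cpu_power / efficiency


print(f"12 V current for CPU: ~{rail_current(67, 1.55):.1f} A")
print(f"PSU power for CPU:    ~{rail_power(104):.0f} W")
```

Changing `efficiency` shows how sensitive the 12 V requirement is to the quality of the motherboard's regulator.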

Suggestion: To test your CPU for overclocking with an existing older PSU (under ~20 A 12 V rating), disable non-essential HDs and optical drives, and possibly run the video card without overclocking; then these older PSUs should be OK.


For some Nforce2 and for Nforce3/Nforce4 boards, the CPU, video card (recent high power ones), drives, and fans are powered from 12 V. After the CPU, the video card is the second largest power consumer (9700/9800 50-70W (on 3.3V, 5V), X800 XT ~ 65-80W (on 5V, 12V), 6600GT ~50W (on 5V, 12V), 6800 GT/Ultra ~70-90W (on 5V, 12V)), followed by HDs and optical drives (~15-20 W each), then fans, ....

DDR memory is about 10 W per DIMM, relatively little.


Let's estimate the 12 V current and power requirement, which is currently the most important number for PSU sizing.

These are the major 12 V components:
- CPU (90 nm Winchester 3200+ overclocked to 2.6 GHz 1.55 V)
power ~ 130 W (after factoring in 80% regulation efficiency)
current ~ 10.8 A (after factoring in 80% regulation efficiency)
- video card
power ~ 50 - 90 W (PCI-e slot specification is 75 W, need extra connectors for those that exceed 75 W)
current ~ 2.08 - 5.25 A (not all of the power is drawn from 12V, assume 50-70% on 12V)
- HD, optical drive with R/W (use a single number of 20 W as estimate)
power ~ 15-20 W
current ~ 1.25-1.67 A
- fan
power ~ 2 - 8 W (2W for case fan, 8W for CPU fan)
current ~ 0.17 - 0.67 A


So for a system with 1 CPU, 1 video card (75 W), 4 HD, 2 optical drives, 5 fans, 2 dimms, ....
main_power = 130 + (75 + 4 * 20 + 2 * 20 + 8 + 4 * 2) = 130 + 211 = 341 W
current_12V(wc) = 10.8 + (4.38 + 4 * 1.67 + 2 * 1.67 + 0.67 + 4 * 0.17) = 10.8 + 15.75 = 26.55 A

Other components, like memory modules, motherboards, mouse/keyboard, USB, add-on cards, ... use less than 50 W (from non-12 V).
So total power ~ 341 + 50 = 391 W


So for a system with 1 CPU, 1 video card (90 W), 4 HD, 2 optical drives, 5 fans, 2 dimms, ....
main_power = 130 + (90 + 4 * 20 + 2 * 20 + 8 + 4 * 2) = 130 + 226 = 356 W
current_12V(wc) = 10.8 + (5.25 + 4 * 1.67 + 2 * 1.67 + 0.67 + 4 * 0.17) = 10.8 + 16.62 = 27.42 A

Adding power for the rest, total power ~ 356 + 50 = 406 W
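
The two system totals above can be tallied with a small Python sketch. The per-component figures are the worst-case estimates listed in this post; the dictionary keys and function name are made up for illustration.

```python
# Worst-case estimates from this post: (watts, amps on the 12 V rail)
COMPONENTS = {
    "cpu": (130, 10.8),        # overclocked Winchester, after 80% regulator efficiency
    "video_75w": (75, 4.38),   # 70% of the card's power assumed on 12 V
    "video_90w": (90, 5.25),
    "drive": (20, 1.67),       # HD or optical drive
    "cpu_fan": (8, 0.67),
    "case_fan": (2, 0.17),
}
OTHER_NON_12V_WATTS = 50       # memory, motherboard, USB, add-on cards, ...


def psu_budget(counts):
    """Return (main_power_W, current_12V_A, total_power_W) for a parts count."""
    watts = sum(COMPONENTS[name][0] * n for name, n in counts.items())
    amps = sum(COMPONENTS[name][1] * n for name, n in counts.items())
    return watts, amps, watts + OTHER_NON_12V_WATTS


# 1 CPU, 1 video card, 4 HD + 2 optical drives, 1 CPU fan + 4 case fans
for card in ("video_75w", "video_90w"):
    system = {"cpu": 1, card: 1, "drive": 6, "cpu_fan": 1, "case_fan": 4}
    main_w, amps_12v, total_w = psu_budget(system)
    print(f"{card}: {main_w} W main, {amps_12v:.2f} A on 12 V, {total_w} W total")
```

Changing the counts (for example two video cards, or fewer drives) gives the adjusted totals without redoing the sums by hand.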

A PSU with an effective current rating of 28 A on the 12 V rail and an effective power of 400 W is recommended for an Nforce3/4 system with an overclocked A64 (~ 1.55 V 2.6 GHz), a high power video card (e.g. X800/850 Pro/XT/XL, 6600/6800 GT/Ultra), 4 HD + 2 OD, ....

For running dual video card system, and system with more drives, or for deviation from the above list, add or subtract the power/current of the corresponding components accordingly.
- Add 75-90 W per recent high power video card such as ATI X800, Nvidia 6600/6800; add 5.25 A on 12V for a 90 W video card
- Add 15-20 W per HD, optical drive

A PSU with an effective current rating of 30 - 33 A on the 12 V rail and an effective power of 500 W is recommended for an Nforce4 system with an overclocked A64 (~ 1.55 V 2.6 GHz), dual high power video cards (e.g. X800/850 Pro/XT/XL, 6600/6800 GT/Ultra), 4 HD + 2 OD, ....


The above numbers will be refined for
- video card(s) under overclocking conditions
- average/worst case power draw in typical systems
- actual power measurement vs power estimation


Further, in addition to the 12V current rating, the quality of the PSU build is very important: operating temperature rating (to 50 C), cooling of the PSU components (PSU internal air flow and case cooling), ripple and load regulation, PSU efficiency (> 70%) and active PFC.


PSU

The current ratings of a few popular PSUs are shown below. Listing does not imply recommendation, and the order is not a preference.

- OCZ PowerStream OCZ-520ADJ, ATX12V V2.0, EPS12V, 33 A on 12 V, 1% reg., with 24-pin connector, ~$140
- OCZ PowerStream OCZ-600ADJ, ATX12V V2.01, EPS12V, 12V1 20A, 12V2 18A, 1% reg., with 24-pin connector, ~$205
http://www.ocztechnology.com/products/PSUSpec.pdf
- OCZ ModStream 450/520
http://www.ocztechnology.com/products/ModStreamSpecSheet.pdf

- Fortron Blue Storm AX500-A 500W, ATX12V V2.0, 12V1 15A, 12V2 15A, 5% reg., with 24-pin connector, ~$89
http://www.newegg.com/app/ViewProductDesc.asp?description=17-104-934&depa=0
http://www.cluboverclocker.com/reviews/power/fortron_source/ax500/

- Antec TP-II 480 has 36 A on 12 V (18A on 12V1, 18A on 12V2), 3% reg., with 24-pin, 2 PCI-e connector, ~$85
http://www.antec.com/us/productDetails.php?ProdID=22480
- Antec TP-II 550 has 38 A on 12 V (19A on 12V1, 19A on 12V2), 3% reg., with 24-pin, 2 PCI-e connector, ~$100
http://www.antec.com/us/productDetails.php?ProdID=22550
- Antec True Control-II 550 has 38 A on 12 V (19A on 12V1, 19A on 12V2), 3% reg., adjustable voltage rails and fan speed, with 24-pin, 2 PCI-e connector, ~$120
http://www.antec.com/us/productDetails.php?ProdID=24480

- PC Power & Cooling
http://www.pcpowercooling.com/products/power_supplies/highperformance/turbocools/510/index.htm

* Above prices based on newegg (01/16/05) before S/H for comparison only

ATX12V rev. 2.01, rev 2.2 specifications
http://www.formfactors.org/FFDetail.asp?FFID=1&CatID=2


More recent PSU reviews:
http://techreport.com/reviews/2004q4/psus/index.x?pg=1


Video card power measurement ref:
ATI video cards (X800, 9800, 9600)
http://www.xbitlabs.com/articles/video/display/ati-powercons.html
Nvidia video cards (6800, 5950, 5700)
http://www.xbitlabs.com/articles/video/display/ati-vs-nv-power.html
Nvidia video cards (6600)
http://www.xbitlabs.com/articles/video/display/geforce6600gt-oc.html


Estimation of PSU size for AXP, is included here for reference.
What size PSU do I need?


Relationship of clock, die temperature and voltage (update)
- What is the active power of a CPU at frequency f and voltage V
- How to estimate CPU static and active power
- Effect of die temperature on CPU clock frequency at a given Vcore
(page 13)

Effect of overclocking on CPU active power and current

What is the active power of a CPU at frequency f and voltage V

Discussion:
http://www.ocforums.com/showthread.php?t=358395
 
Some overclocking scenarios for 939 Winchester/Venice/San Diego

04/2005 add: The discussion in this post applies also to Venice and San Diego. For San Diego L2 cache size is 1 MB instead of 512 KB.

Memory modules using TCCD DRAM chips allow a wide range of memory bus frequency
- from low 230 MHz with 2/2.5-2/3-2-x 1T
- to 250 MHz with 2.5-3-3-x 1T
- to possibly 280+ MHz 3-3/4-3-x 1T at ~2.8 V.
I think 1T is key to good memory bandwidth performance; as a rule of thumb it gives about a 15% advantage over 2T.

Besides high-frequency types such as TCCD, low-latency types such as BH-5/UTT can be clocked to 250-260 MHz 2-2-2-x 1T at 3.3+ V.

Memory frequency and latency tradeoff
How much frequency increase is needed to break-even with low latency
Testing UTT and TCCD memory modules in Winchester and DFI NF4 Ultra-D setup
Some results about comparing memory frequency, memory timing, memory divider


Such a wide bus frequency range matches the 3000+ CPU overclocking range of 2400-2600 MHz, leaving room for tradeoffs under 1T. The 3200+ provides slightly more flexibility for matching CPU and memory frequencies, and thanks to its x10 multiplier it needs a lower HTT than the 3000+ to reach the same CPU frequency.

Memory timings are in this order, tCAS, tRCD, tRP, tRAS.

Set HT_multiplier (aka LDT multiplier) to x3 while testing the CPU, so that a high HTT (above 250 MHz) does not introduce instability by pushing HT above 1000 MHz (as would happen had x4 been used with HTT above 250 MHz). After finalizing the CPU overclock, x4 may be used if the actual HTT is under 250 MHz.
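
The HT check above amounts to one multiplication. A minimal Python sketch follows, assuming HT multiplier choices of x5 down to x1 (the post only discusses x3 and x4) and treating exactly 1000 MHz as acceptable; the names are hypothetical.

```python
HT_LIMIT_MHZ = 1000  # HT speed this post tries not to exceed


def ht_frequency(htt_mhz, ht_multiplier):
    """HT link frequency (MHz) is the HTT reference clock times the HT multiplier."""
    return htt_mhz * ht_multiplier


def max_safe_ht_multiplier(htt_mhz, choices=(5, 4, 3, 2, 1), limit=HT_LIMIT_MHZ):
    """Largest multiplier from `choices` that keeps HT at or below `limit`."""
    for multiplier in sorted(choices, reverse=True):
        if ht_frequency(htt_mhz, multiplier) <= limit:
            return multiplier
    return None  # no listed multiplier keeps HT within the limit


for htt in (240, 267, 289, 300):
    m = max_safe_ht_multiplier(htt)
    print(f"HTT {htt} MHz -> x{m} (HT = {ht_frequency(htt, m)} MHz)")
```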

3000+:
max CPU multiplier = x9

CPU multiplier = x9
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 267 - 278 - 289 - 300 MHz
memory_HTT_ratio = 1:1 (aka max memclock 200 MHz)
so memory_divider = 9
memory_bus_frequency ~ 267 - 278 - 289 - 300 MHz
memory would run at 2.5/3-3/4-3-x 1T ~2.8 V

CPU multiplier = x9
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 267 - 278 - 289 - 300 MHz
memory_HTT_ratio = 9:10 (aka max memclock 180 MHz)
so memory_divider = 10
memory_bus_frequency ~ 240 - 250 - 260 - 270 MHz
memory would run at 2.5-2/3-2/3-x 1T ~2.8 V

CPU multiplier = x9
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 267 - 278 - 289 - 300 MHz
memory_HTT_ratio = 5:6 (aka max memclock 166 MHz)
so memory_divider = 11
memory_bus_frequency ~ 218 - 227 - 236 - 245 MHz
memory would run at 2/2.5-2/3-2/3-x 1T ~2.8 V
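
Each scenario above follows the same two divisions: HTT = CPU frequency / CPU multiplier, and memory bus frequency = CPU frequency / memory divider. A small Python sketch (function name made up, dividers taken from the scenarios in this post) regenerates the rows:

```python
def scenario(cpu_targets_mhz, cpu_multiplier, memory_divider):
    """Return (cpu, htt, memory_bus) rows in MHz, rounded to whole MHz."""
    rows = []
    for cpu in cpu_targets_mhz:
        htt = cpu / cpu_multiplier       # reference clock needed for this CPU speed
        mem = cpu / memory_divider       # memory clock is derived from the CPU clock
        rows.append((cpu, round(htt), round(mem)))
    return rows


targets = (2400, 2500, 2600, 2700)
for divider in (9, 10, 11):              # 1:1, 9:10 and 5:6 with the x9 multiplier
    print(f"x9, divider {divider}:", scenario(targets, 9, divider))
```

The same function covers the x10 and x11 cases by passing the matching multiplier and divider.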


3200+:
In addition to using CPU multiplier x9 as in the 3000+, the x10 provides the following possibility.

max CPU multiplier = x10

CPU multiplier = x10
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 240 - 250 - 260 - 270 MHz
memory_HTT_ratio = 1:1 (aka max memclock 200 MHz)
so memory_divider = 10
memory_bus_frequency ~ 240 - 250 - 260 - 270 MHz
memory would run at 2.5/3-3/4-3-x 1T ~2.8 V

CPU multiplier = x10
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 240 - 250 - 260 - 270 MHz
memory_HTT_ratio = 9:10 (aka max memclock 180 MHz)
so memory_divider = 11
memory_bus_frequency ~ 218 - 227 - 236 - 245 MHz
memory would run at 2/2.5-2/3-2-x 1T ~2.8 V

CPU multiplier = x10
CPU to 2400 - 2500 - 2600 - 2700 MHz
HTT to 240 - 250 - 260 - 270 MHz
memory_HTT_ratio = 5:6 (aka max memclock 166 MHz)
so memory_divider = 12
memory_bus_frequency ~ 200 - 208 - 217 - 225 MHz
memory would run at 2/2.5-2/3-2-x 1T ~2.8 V


For extreme overclocking to 3000 MHz or more,

CPU multiplier = x10
CPU to 2800 - 2900 - 3000 - 3100 MHz
HTT to 280 - 290 - 300 - 310 MHz
memory_HTT_ratio = 5:6 (aka max memclock 166 MHz)
so memory_divider = 12
memory_bus_frequency ~ 233 - 242 - 250 - 258 MHz
memory would run at 2/2.5/3-2/3-2/3-x 1T ~2.8 V


3500+:
For a 3500+ Winchester, the max CPU multiplier is x11.

For CPU multiplier x11
CPU to 2400 - 2500 - 2600 - 2700 - 2800 - 2900 - 3000 - 3100 MHz
HTT to 218 - 227 - 236 - 245 - 255 - 264 - 273 - 282 MHz
memory_HTT_ratio = 1:1 (aka max memclock 200 MHz)
so memory_divider = 11
memory_bus_frequency ~ 218 - 227 - 236 - 245 - 255 - 264 - 273 - 282 MHz
from low 230 MHz with 2/2.5-2/3-2-x 1T
to 250 MHz with 2.5-3-3-x 1T
to possibly 280+ MHz 3-3/4-3-x 1T at ~2.8 V

For CPU multiplier x10
CPU to 2400 - 2500 - 2600 - 2700 - 2800 - 2900 - 3000 - 3100 MHz
HTT to 240 - 250 - 260 - 270 - 280 - 290 - 300 - 310 MHz
(refer to the 3200+ case above)


Some common memory dividers derived:

For 1:1, CPU multiplier x9, memory divider = 9 <- refer to the 3000+ case above
For 9:10, CPU multiplier x9, memory divider = 10 <- refer to the 3000+ case above
For 5:6, CPU multiplier x9, memory divider = 11 <- refer to the 3000+ case above

For 1:1, CPU multiplier x10, memory divider = 10 <- refer to the 3200+ case above
For 9:10, CPU multiplier x10, memory divider = 11 <- refer to the 3200+ case above
For 5:6, CPU multiplier x10, memory divider = 12 <- refer to the 3200+ case above

For 1:1, CPU multiplier x11, memory divider = 11 <- refer to the 3500+ case above
For 9:10, CPU multiplier x11, memory divider = 13 <- refer to the 3500+ case above
For 5:6, CPU multiplier x11, memory divider = 14 <- refer to the 3500+ case above (not recommended)


In case "lower clock" PC3200 modules are used:

Overclocking 3000+ Winchester with "lower clock" memory (PC3200/PC2700/PC2100)

CPU 2500 MHz
HTT = 278 MHz
CPU_multiplier = 9
max memclock = 100 MHz (aka 1:2 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 18 = 139 MHz
or
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 14 = 179 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 11 = 227 MHz
or
max memclock = 180 MHz (aka 9:10 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 10 = 250 MHz

CPU 2600 MHz
HTT = 289 MHz
CPU_multiplier = 9
max memclock = 100 MHz (aka 1:2 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 18 = 144 MHz
or
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 14 = 186 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 11 = 236 MHz
or
max memclock = 180 MHz (aka 9:10 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 10 = 260 MHz

CPU 2700 MHz
HTT = 300 MHz
CPU_multiplier = 9
max memclock = 100 MHz (aka 1:2 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 18 = 150 MHz
or
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 14 = 193 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 11 = 245 MHz
or
max memclock = 180 MHz (aka 9:10 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 10 = 270 MHz

Overclocking 3200+ Winchester with "lower clock" memory (PC3200/PC2700)

CPU 2500 MHz
HTT = 250 MHz
CPU_multiplier = 10
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 15 = 167 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2500 / 12 = 208 MHz
(If such memory speed does not work, try as in 3000+ using x9 multiplier.)

CPU 2600 MHz
HTT = 260 MHz
CPU_multiplier = 10
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 15 = 173 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2600 / 12 = 217 MHz
(If such memory speed does not work, try as in 3000+ using x9 multiplier.)

CPU 2700 MHz
HTT = 270 MHz
CPU_multiplier = 10
max memclock = 133 MHz (aka 2:3 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 15 = 180 MHz
or
max memclock = 166 MHz (aka 5:6 ratio) <---- set this in bios
memory_bus_frequency = 2700 / 12 = 225 MHz
(If such memory speed does not work, try as in 3000+ using x9 multiplier.)
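
Picking the "max memclock" setting for slower modules can be sketched as a lookup over the dividers listed in this post. The table below only contains divider values that appear above; `module_limit_mhz` (the highest frequency your modules run stably) and the function name are hypothetical.

```python
# max memclock setting (MHz) -> memory divider, per CPU multiplier, as listed in this post
DIVIDERS = {
    9:  {100: 18, 133: 14, 166: 11, 180: 10, 200: 9},
    10: {133: 15, 166: 12, 180: 11, 200: 10},
}


def best_memclock(cpu_mhz, cpu_multiplier, module_limit_mhz):
    """Pick the BIOS setting giving the fastest memory bus within the module limit.

    Returns (setting_mhz, memory_bus_mhz), or None if every listed setting
    would run the memory faster than module_limit_mhz.
    """
    best = None
    for setting, divider in DIVIDERS[cpu_multiplier].items():
        mem = cpu_mhz / divider
        if mem <= module_limit_mhz and (best is None or mem > best[1]):
            best = (setting, mem)
    return best


# Example: modules assumed stable up to 220 MHz, CPU at 2600 MHz
for multiplier in (9, 10):
    print(multiplier, best_memclock(2600, multiplier, 220))
```

In this example the x10 multiplier lands closer to the module limit (about 217 MHz via the 166 setting) than x9 does (about 186 MHz via the 133 setting).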



Overclocking setting for various bus frequencies (post 8)

Low PR 90 nm 939 Winchester (Sept 2004)

Memory bus, cache and memory bandwidth (for 940, 754, 939)
 
About ECC memory (for A64)

According to the AMD tech docs for 754, 940 and 939, all three can take either ECC or non-ECC memory modules.
These CPU's support "ECC checking with double-bit detect with single-bit correct".

Socket 940 is mainly targeted at mission-critical server applications, for which ECC memory modules are recommended.

Most registered memory modules are ECC, but can also be non-ECC. Socket 940 CPU's support only registered memory modules.

Most unbuffered memory modules are non-ECC and are supported by socket 754 and 939 CPU's. Unbuffered memory modules can also be ECC.

For details,
Different CPU and system platforms (754, 939, 940)


In general, ECC memory adds overhead to memory access for error checking and correction. After the data is read from the memory array, it has to pass through additional circuits (an "ECC decoding tree") that detect potential one- or two-bit errors and correct single-bit errors, hence the overhead or performance hit.

ECC used in dynamic random access memory (DRAM) and static random access memory (SRAM) is mainly for detecting and correcting errors that are infrequent and random in nature, such as those caused by alpha particle radiation, which may discharge and alter the state of a memory cell, or by system electrical noise.

Typical ECC detects 1- or 2-bit errors and corrects single-bit errors. It is structured as 1 additional bit (parity bit for error detection) per every 8 bits in the wordline direction of the memory arrays, each of which is a two-dimensional layout of memory cells with wordlines in one direction and bitlines in the orthogonal direction. So for an A64 or P4 64-bit single channel memory bus, the actual bus width with ECC is 72 bits, and for the 128-bit dual channel bus it is 144 bits.
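
The 72-bit figure can be checked with a short calculation. The post does not name the code used, so this sketch assumes the common Hamming-based SECDED scheme (single-error correct, double-error detect); the function name is made up.

```python
def secded_check_bits(data_bits):
    """Check bits for a Hamming code plus one overall parity bit (SECDED)."""
    r = 0
    # A Hamming code needs r check bits with 2**r >= data_bits + r + 1, so that
    # "no error" and every single-bit error position get a distinct syndrome.
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1  # the extra overall parity bit adds double-bit detection


channel_width = 64 + secded_check_bits(64)
print(channel_width, 2 * channel_width)  # prints "72 144"
```

Under that assumption, 64 data bits need 8 check bits, which matches the 1-bit-per-8 ratio above, and two such channels give the 144-bit dual channel width.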

ECC is not intended for detecting and correcting massive, consistent, block errors caused by functional timing failures, such as when memory chips are overclocked to a frequency above specification that prevents their proper operation.
 
hitechjb1 said:
PSU rating estimate for some 939 CPU and system

The 12V current rating of many existing PSU are
Fortron 530 = 18 A
.....
In the long run, with lots of harddrives, optical drives, fans, ..., and if video card also uses 12 V (e.g. PCI-e supports up to 75W), then a typical system would take an additional 10-15 A (exact numbers have to be calculated based on hardware count and video card overclocking),
e.g. 4 HD's, 1 DVD, 1 DVD-RW, 4 or more case fans, 1 CPU fan, high end overclocked video card (if using 12 V)

12 V current rating = 13.2 + (10 or 15) = 23.2 - 28.2 A
....

:eek: :bang head

thanks for this info...i guess i should (did) tack-on a new PSU to my 3500 (939) and Abit AV8 order...

since my "trusty" Fortron 530 now = "rusty" ;)

THANKS!!
 
About Rev E and SSE3 instructions

AMD Athlon 64 Revision E adds SSE3 Support
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2350

Rumor is a new revision E 90 nm 939 (Venice ?) with SSE3 support is coming (2005 ?).
http://anandtech.com/mb/showdoc.aspx?i=2264&p=3

This article reports on a presentation by AMD's chief Hammer architect (Kevin McGrath) at Stanford University:
http://techreport.com/onearticle.x/6363

from above article said:
...
The enhancements include power reductions (gained by using slow but less leaky transistors in non-critical paths) and speed improvements (by using fast but leaky transistors in critical paths). Also, the processor halt and stopclock states have been improved, reducing some unnecessary work previously conducted during these states, resulting in a savings of several hundred milliwatts. Like the Pentium 4, future Hammer chips will feature on-die thermal throttling to cool themselves down if certain temperature limits are reached.

Performance-wise, the big news is the addition of SSE3 instructions, which accelerate a number of different types of computation, including video encoding, scientific computing, and software graphics vertex shaders. (For more on SSE3, see our Prescott review.) Beyond SSE3, the updated Hammer core will convert the LEA instruction, under certain circumstances, into an ADD instruction, which has only a single cycle of latency. AMD's design mavens have also added additional write-combining buffers to the chip, so it can combine up to four streams of non-cacheable writes, up from two. Hammer's data prefetch has been improved, as well.

...

Rumor is there will be additional metal layers added for the E core. In general terms, extra metal layers are used to
- improve power distribution, i.e. less voltage drop for a given current or more current to more devices
- improve connectivity to package I/O, i.e. more package pins
- reduce signal RC delay, hence potentially higher clock frequency
- reduce clock skew, hence potentially higher clock frequency
- provide more flexible communication paths between functional units and multiple cores
- improve signal integrity
...
So the net is to enable more complex architecture and logic functions, multiple cores, higher socket pin count, higher clock, more devices, ....

About SSE3 Instructions:
http://www.intel.com/technology/itj/2004/volume08issue01/art01_microarchitecture/p06_sse.htm
 
I am trying to develop an overclocking calculator for A64 Winchesters. The problem is that some of the valid values (CPU freq, FSB freq, HT freq, memory freq) produced by some multiplier combinations are not workable in BIOS (such as those mentioned in the Strange Bios Overclocking Behaviour thread). I am trying to figure out why these very easily attainable values are unworkable, and I suspect BIOS bugs. I would appreciate it if someone could repeat the unworkable combinations (HTT=FSB: 209 and beyond, CPU multi: 9x, HT multi: 4x, memory 133: 2/3 divider) and post the results, or better yet explain what is going on in these very specific situations.
 